Beyond The Standard Model: A.N. Schellekens
Contents
1 Introduction
1.1 A Complete Theory?
1.2 Gravity and Cosmology
1.3 The Energy Balance of the Universe
1.4 Environmental Issues
2 Gauge Theories
2.1 Classical Electrodynamics
2.2 Gauge Invariance
2.3 Noether's Theorem
2.4 Covariant Derivatives
2.5 Non-Abelian Gauge Theories
2.6 Coupling to Fermions
2.7 Gauge Kinetic Terms
2.8 Feynman Rules
2.9 Other Gauge Groups
5 A First Look Beyond
5.1 The Left-handed Representation
5.1.1 Replacing Particles by Anti-Particles
5.1.2 The Standard Model in Left-handed Representation
5.1.3 Fermion Masses in the Left-handed Representation
5.1.4 Yukawa Couplings in the Left-handed Representation
5.1.5 Real Representations
5.1.6 Mirror Fermions
5.2 Neutrino Masses
5.2.1 Modifications of the Standard Model
5.2.2 Adding a Dimension 5 Operator
5.2.3 Neutrino-less Double-beta Decay
5.2.4 Adding Right-handed Neutrinos
5.2.5 The See-Saw Mechanism
5.2.6 Neutrino Oscillations
5.2.7 Neutrino Experiments
5.3 C, P and CP
5.4 Continuous Global Symmetries
5.5 Anomalies
5.5.1 Feynman Diagram Computation
5.5.2 Anomalous Local Symmetries
5.5.3 Anomalous Global Symmetries
5.5.4 Global Anomalies in Field-Theoretic Form
5.5.5 Global Anomalies in QCD $\times$ QED
5.5.6 The $\pi^0$ Decay Width
5.5.7 The Axial U(1) Symmetry
5.5.8 Baryon and Lepton Number Anomalies
5.5.9 Proton decay by Instantons and Sphalerons
5.5.10 Anomaly-free Global Symmetries
5.5.11 Mixed Gauge and Gravitational Anomalies
5.5.12 Other Anomalous Diagrams
5.5.13 Symplectic Anomalies
5.6 Axions
5.6.1 Phases in Quark Masses
5.6.2 The Peccei-Quinn Mechanism
5.6.3 General Axion Models
5.6.4 Axions in the Standard Model
5.6.5 The Mass of the Original QCD Axion
5.6.6 Invisible Axions
5.6.7 Two-photon coupling
5.6.8 Axion-electron coupling
5.6.9 Generic Axions
5.6.10 Multiple gauge group factors
6 Loop Corrections of the Standard Model
6.1 Divergences and Renormalization
6.1.1 Ultraviolet Divergences
6.1.2 Regularization
6.1.3 The Origin of Ultraviolet Divergences
6.1.4 Renormalization
6.1.5 Renormalizability
6.1.6 Dimensional Analysis
6.1.7 The Meaning of Renormalizability
6.2 Running Coupling Constants
6.2.1 Example: Scalar Field Theories
6.2.2 The Renormalization Group Equation
6.2.3 Summing Leading Logarithms
6.2.4 Asymptotic Freedom
6.2.5 Abelian gauge theories
6.2.6 Yukawa Couplings
6.2.7 The Higgs Self-coupling
8.10.1 B-L
8.10.2 The Proton Lifetime
8.10.3 Historical Remarks
8.11 The Higgs System
8.12 Magnetic Monopoles
8.13 Other GUTs
8.13.1 SO(10)
8.13.2 $E_6$
8.13.3 Flipped SU(5)
8.13.4 Still Larger Groups
8.14 Conclusions
8.15 References
9 Supersymmetry
9.1 The Supersymmetry Algebra
9.2 Multiplets
9.3 Constructing supersymmetric Lagrangians
9.4 The Supersymmetrized Standard Model
9.5 Additional Interactions
9.6 Continuous R-symmetries
9.7 R-Parity
9.8 Supersymmetry Breaking
9.9 Non-renormalization Theorems
9.10 Soft Supersymmetry Breaking
9.11 Spontaneous Supersymmetry Breaking
9.12 The Goldstino
9.13 Mass Sum Rules
9.14 The Minimal Supersymmetric Standard Model
9.15 The Higgs Potential
9.15.1 A Weak Symmetry Breaking Minimum
9.16 Higgs Masses
9.17 Corrections to the Higgs Masses
9.18 Neutralino Masses
9.19 Rare Processes
9.20 Direct Searches
9.21 Supersymmetric Unification
9.21.1 MSSM $\beta$-functions
9.21.2 MSSM versus SM Unification
9.21.3 Proton Decay
9.22 Conclusions
9.23 References
10 Supergravity
10.1 Local Supersymmetry
10.2 The Lagrangian
10.3 Spontaneous Symmetry Breaking
10.4 Hidden Sector Models
10.5 Conclusions
10.6 References
A Spinors
A.1 Spinors in SU(2)
A.2 The Lorentz Group
D Supersymmetry
D.1 Notation
D.2 The Wess-Zumino Model
D.3 Superfields
D.4 Translations in Superspace
D.5 Different Realizations
D.6 Action on superfields
D.7 Changes of Representation
D.8 Product Representations and Supersymmetry Invariants
D.9 Covariant Derivatives
D.10 Chiral Superfields
D.11 Vector Superfields
D.12 Invariant Actions
Preface
The first version of these notes was written up for lectures at the 1995 AIO-school
(a school for PhD students) on theoretical particle physics. Later they were adapted for
lectures at the Radboud University in Nijmegen, aimed at undergraduate students in their
fourth year. This means that no detailed knowledge of quantum field theory is assumed,
only some basic ideas like the intuitive notion of Feynman diagrams and their relation to
Lagrangians. Most of the current version was updated during the spring of 2015.
The purpose of the lectures is to explain the essence of current ideas about possible
physics beyond the Standard Model. Although such ideas often have a finite life-time,
there are many that have been around for a decade or more, and are likely to play an
important role in particle physics at least for another decade. The emphasis is on those
ideas that are likely to survive for a while, not only due to lack of data, but also because
of intrinsic importance.
Another purpose is to describe the Standard Model as a special point in the huge
space of quantum field theories, and explain which alternatives are possible.
Not much time is devoted to the huge number of models in the present
literature; only a limited set of standard ones is explained. In comparison with
other lecture notes, more attention is paid to Standard Model physics, and furthermore
most explanations are a bit more basic. A lot of background material is included in the
appendices.
The list of references is still extremely limited. Only the sources on which these notes
were based are listed. These may be consulted for a more complete set of references.
Conventions
The metric signature we use is $(1, -1, -1, -1)$. This means that for on-shell momenta
$p^2 \equiv p_\mu p^\mu = m^2$. The standard Dirac action is $i\bar\psi\gamma^\mu\partial_\mu\psi - m\bar\psi\psi$ and the standard
action for a massive real scalar is $\frac12\partial_\mu\phi\,\partial^\mu\phi - \frac12 m^2\phi^2$. Repeated indices are always to be
summed over, but in a few equations the sums are written out explicitly anyway. In most
cases raised or lowered indices have no special significance. The exceptions are space-time
indices, which are always raised and lowered with the metric $g_{\mu\nu}$, and SU(2) spinor
indices, which are raised or lowered with the $\epsilon$-tensor $\epsilon_{\alpha\beta}$. Except for a few pages discussing
supergravity, the metric is equal to the flat metric $\eta_{\mu\nu}$. Conventions regarding superspace
generally follow [2]. Covariant derivatives are of the form $\partial_\mu - ieA_\mu$ for positively charged
particles in electromagnetism (note that some texts use the opposite sign for the gauge
field term). The meaning of "+ c.c." is "add the complex conjugate". In an expression
involving operators this is to be interpreted as the Hermitian conjugate. The terms to be
conjugated are either indicated by brackets, or, if there are no brackets, "c.c." applies to all
terms. Several other conventions are stated in the appendices.
1 Introduction
Our field, considering its name "High Energy Physics", is perhaps best characterized
by the quest for the fundamental laws of physics. Now that we have, in principle, a very
satisfactory description of all natural phenomena occurring on this planet in terms of the
Standard Model, it is natural for us to ask what lies beyond that model.
the first appearance of new physics at several orders of magnitude below the energy scale
the LHC can currently reach. We may still find evidence for such new physics, and indeed
at this moment (early 2016) there exist some tantalizing results that put the Standard
Model under stress. None of these has reached the limit of five standard deviations that
we require for observations in particle physics. But if it happens, the current moment is
merely a window in time, whose existence is rather puzzling. There is no obvious reason
why there would be an energy gap between new physics and old physics.
The problems associated with (quantum) gravity are completely irrelevant for our
accelerator experiments until we reach energies as large as $M_{\rm Planck} = 1.2 \times 10^{19}$ GeV,
the Planck mass (the precise definition is $M_{\rm Planck} = \sqrt{\hbar c/G_N}$, where $G_N$ is Newton's
constant). At this scale we should expect the Standard Model to break down in any case.
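The number quoted for the Planck mass can be checked directly from its definition. The following is a minimal sketch, assuming rounded reference values for $\hbar$, $c$, $G_N$ and the GeV-to-joule conversion; the constant and function names are illustrative only.

```python
import math

# Rounded reference values, used here as assumed inputs.
HBAR = 1.054571817e-34           # reduced Planck constant, J s
C = 2.99792458e8                 # speed of light, m/s
G_N = 6.67430e-11                # Newton's constant, m^3 kg^-1 s^-2
JOULE_PER_GEV = 1.602176634e-10  # 1 GeV expressed in joules


def planck_mass_kg():
    """M_Planck = sqrt(hbar * c / G_N), in kilograms."""
    return math.sqrt(HBAR * C / G_N)


def planck_mass_gev():
    """The same mass expressed as a rest energy M c^2 in GeV."""
    return planck_mass_kg() * C**2 / JOULE_PER_GEV


print(f"M_Planck = {planck_mass_kg():.3e} kg")
print(f"M_Planck = {planck_mass_gev():.2e} GeV")
```

The result is about $1.2 \times 10^{19}$ GeV, some fifteen orders of magnitude above the TeV energies of current accelerators.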
The fourth dimension should not be confused with time, the fourth coordinate in Minkowski space. It
is simply used for a mathematical description of the surface.
one gets $p = w\rho c^2$, where $w$ is a parameter. For massive particles (dust or matter)
one has $w = 0$ and for massless particles (radiation) one gets $w = \frac13$. The Einstein
equations reduce to two separate equations, one determining the time evolution of matter
densities, and one equation that takes the form
$$H^2 = \left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{kc^2}{a^2} \qquad (1.4)$$
Note that this is dimensionally correct if we either make $k$ dimensionless, and give $a$ the
dimension of [length], or make $a$ dimensionless, and give $k$ the dimension of [length]$^{-2}$.
The ratio $\dot a/a$ on the left hand side is the rate of change of the scale of the universe, the
quantity that Hubble measured by plotting velocity (determined from Doppler shifts)
versus distance. It is called the Hubble constant, although it is not really constant. The
density $\rho$ is actually the sum of the densities $\rho_i$ of all contributing kinds of matter. It is
customary to rewrite this equation by dividing both sides by $H^2$, and defining a critical
density $\rho_c$ as
$$\rho_c = \frac{3H^2}{8\pi G} \qquad (1.5)$$
Just as $H$ this is of course not quite constant. Now we get
$$1 = \frac{\rho}{\rho_c} - \frac{kc^2}{a^2H^2} \qquad (1.6)$$
We define
$$\Omega_{\rm curv} = -\frac{kc^2}{a^2H^2} ; \qquad \Omega_i = \frac{\rho_i}{\rho_c} ; \qquad \Omega = \sum_i \Omega_i \qquad (1.7)$$
$$1 = \Omega_{\rm curv} + \Omega \qquad (1.8)$$
Clearly, if we could measure the curvature of the universe, and hence $\Omega_{\rm curv}$, we could use
this equation to measure the sum of all matter and radiation densities. This is like weighing the
entire universe. One can get information about curvature by considering the apparent size
of distant objects. For example, by comparing the apparent size of nearby and far away
galaxies one can get information about the curvature, but nowadays the most accurate
information comes from the fluctuations in the cosmic microwave background. The size of
these fluctuations can be computed, and serves as a standard measuring unit. Since this
comes from the most distant visible feature in the universe, it gives the best measurement
of the curvature. According to the latest Planck satellite data the universe is spatially flat
with a precision of about 0.5% ($\Omega_{\rm curv} = 0.000 \pm 0.005$). Since the FRW metric with a perfect
fluid description of matter is clearly just an approximation, it is implausible that the
universe is exactly spatially flat. Whether the deviation is positive or negative is obviously
of utmost interest for cosmology, but we may never know. Unlike at the LHC, we have only one
event to look at, our universe. This implies intrinsic statistical errors, which means that
there is a fundamental limit on the accuracy we can reach.
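To get a feeling for the numbers, the critical density of eqn. (1.5) can be evaluated for a present-day Hubble parameter. This is a minimal sketch: the value of 67.7 km/s/Mpc is an assumed input that does not come from these notes, and the function names are illustrative.

```python
import math

G_N = 6.67430e-11           # Newton's constant, m^3 kg^-1 s^-2
METRES_PER_MPC = 3.0857e22  # one megaparsec in metres
H0_KM_S_MPC = 67.7          # ASSUMED present-day Hubble parameter, km/s/Mpc
PROTON_MASS = 1.6726e-27    # kg


def hubble_si(h_km_s_mpc):
    """Convert a Hubble parameter from km/s/Mpc to 1/s."""
    return h_km_s_mpc * 1.0e3 / METRES_PER_MPC


def critical_density(h_km_s_mpc):
    """Eqn. (1.5): rho_c = 3 H^2 / (8 pi G), in kg/m^3."""
    hubble = hubble_si(h_km_s_mpc)
    return 3.0 * hubble**2 / (8.0 * math.pi * G_N)


def omega_total(omega_curv):
    """Eqn. (1.8) solved for the total density parameter Omega."""
    return 1.0 - omega_curv


rho_c = critical_density(H0_KM_S_MPC)
print(f"rho_c = {rho_c:.2e} kg/m^3")
print(f"      = {rho_c / PROTON_MASS:.1f} proton masses per cubic metre")
# A curvature term of at most 0.5% in magnitude pins Omega to within 0.5% of 1.
print(f"Omega between {omega_total(0.005):.3f} and {omega_total(-0.005):.3f}")
```

With these inputs the critical density comes out at a few proton masses per cubic metre, which illustrates how dilute a spatially flat universe is on average.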
However, the importance of this measurement for particle physics lies in the second
term in eqn. (1.8). It tells us that the sum of all contributions to $\Omega$ must be very close
to 1. A small part of this (about 4.9%) can be accounted for by baryonic (i.e. Standard
Model) matter. In the past, an important piece of information came from the deuterium
abundance in the universe. Deuterium is produced during big bang nucleosynthesis, the
production process being $p + n \to d + \gamma$. This process can also run in the opposite
direction: photons destroy deuterium. Therefore it is not surprising that the abundance
depends strongly on the baryon-to-photon ratio. Since we know the number density of
photons (most of them are from the CMB), and can make fairly accurate estimates of the ratio
of deuterium to hydrogen in the universe, this information can be used to determine the
total amount of baryonic matter. Nowadays the details of the CMB fluctuations also offer
important information about the amount of baryonic matter.
From various sources (such as galaxy rotation curves, clusters of galaxies, structure
formation, gravitational lensing and the CMB) we get information about the total fraction
of matter. This is about 30%, including baryonic matter. Therefore there is about 70%
of the total missing.
Above we have discussed two kinds of contributions (apart from $\Omega_{\rm curv}$) to $\Omega$: matter
and radiation. These contributions have a different "equation of state", which in this
context just means a different value for the parameter $w$ introduced above. From general
relativity one does not just get eqn. (1.4) but also an equation describing the time
evolution of densities
$$\dot\rho = -3(1+w)\,\rho\,\frac{\dot a}{a} \qquad (1.9)$$
which implies
$$\rho \propto a^{-3(1+w)} \qquad (1.10)$$
The two components we have discussed so far scale as follows with $a$: matter as $a^{-3}$ and
radiation as $a^{-4}$. This is intuitively clear. Matter densities scale according to volume,
but radiation has an additional dependence on scale because with increasing scale the
wavelength increases with $a$ and hence the energy of each photon decreases as $1/a$. For
massive particles the energy is bounded from below by their mass. There can be other
contributions to the energy density of the universe. A gas of strings has $w = -\frac13$ and
scales with $a^{-2}$, and a gas of membranes has $w = -\frac23$. But there is no evidence for
contributions of these latter two kinds.
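The scaling laws quoted above all follow from eqn. (1.10) by inserting the appropriate value of $w$. The short sketch below tabulates the exponent for each component; exact fractions are used to avoid rounding, and the dictionary and function names are illustrative only.

```python
from fractions import Fraction


def density_exponent(w):
    """Return n such that rho is proportional to a**n, using eqn. (1.10)."""
    return -3 * (1 + Fraction(w))


# Equation-of-state parameters as quoted in the text.
COMPONENTS = {
    "matter": Fraction(0),
    "radiation": Fraction(1, 3),
    "strings": Fraction(-1, 3),
    "membranes": Fraction(-2, 3),
}

for name, w in COMPONENTS.items():
    n = density_exponent(w)
    print(f"{name:10s} w = {str(w):>5s}   rho ~ a^{n}")
```

The membrane case, for which the text gives only $w$, comes out as $\rho \propto a^{-1}$.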
One contribution that we have not yet discussed in this section is a cosmological
constant. The cosmological constant $\Lambda$ is a parameter of classical general relativity that
is allowed by general coordinate invariance. It has dimension [length]$^{-2}$ and appears in
the Einstein equations as
$$R_{\mu\nu} - \frac12 g_{\mu\nu}R - \Lambda g_{\mu\nu} = 8\pi G_N T_{\mu\nu} . \qquad (1.11)$$
Without a good argument for its absence one should therefore consider it as a free
parameter that must be fitted to the data. It contributes to the equations of motion with
an equation of state $p = w\rho$, with $w = -1$. Hence it does not scale with $a$ at all! The
cosmological constant is an obvious candidate for providing the missing contribution to
$\Omega$, and indeed the data seem in agreement with an extra component with $w = -1$.
Unlike dark matter, where the Standard Model offers nothing, dark energy is provided
in abundance by the Standard Model. The parameter $\Lambda$ contributes to the equations of
motion in the same way as vacuum energy density $\rho_{\rm vac}$, which has an energy momentum
tensor $T_{\mu\nu} = \rho_{\rm vac}\, g_{\mu\nu}$. Vacuum energy is a constant contribution to any (quantum) field
theory Lagrangian. It receives contributions from classical effects (for example different
minima of a scalar potential) and from quantum corrections (e.g. zero-point energies of
oscillators). However, it plays no role in field theory as long as gravity is ignored. It can
simply be set to zero. Since vacuum energy and the parameter $\Lambda$ are indistinguishable it
is customary to identify $\rho_{\rm vac}$ and $\Lambda$. The precise relation is
$$\frac{\Lambda}{8\pi} = \frac{G_N \rho_{\rm vac}}{c^2} := \rho_\Lambda . \qquad (1.12)$$
This immediately relates the value of $\Lambda$ with all other length scales of physics, entering
in $\rho_\Lambda$.
Vacuum energy is a notoriously divergent quantity in quantum field theory. One may
think of it as the sum of the ground state energies of all the harmonic oscillators in the
mode expansion of all the fields. Alternatively, and equivalently, it may be described by
the contribution of loop diagrams without external lines, which one usually throws away in
QFT. The contribution of such a loop diagram is proportional to
$$\int d^4k\, \log(k^2 - m^2) \qquad (1.13)$$
To understand the logarithm note that an $n$-point graph with external momenta is
correctly obtained by differentiating $n$ times with respect to $m^2$, and hence a zero-point
amplitude corresponds to not differentiating at all. If we cut off the integration at some
scale $M$, we get a contribution proportional to $M^4$. Such a cut-off could be physically
inspired by some new physics, such as a discrete structure of space-time. But surely the
scale of such new physics must lie beyond the range of the LHC, because otherwise we should
have seen it already. This would suggest that $M > 1$ TeV. Not only quantum vacuum
energy contributes to $\Lambda$, but also classical vacuum energy like the shift in the potential
that occurs in the Higgs mechanism.
The value of $\Lambda$ is irrelevant in QFT, but it has important effects on the time evolution
of the universe and on its size. Another relation obtained from the Einstein equations
(derivable from the foregoing two equations) is
$$\frac{\ddot a}{a} = -\frac{4\pi G}{3}\,\rho\,(1 + 3w) \qquad (1.14)$$
From this equation we see that matter and radiation decelerate the expansion of the
universe ($\rho > 0$ and $w = 0$ or $\frac13$), while a cosmological constant with $\Lambda > 0$ accelerates
the expansion. Unlike matter densities, $\Lambda$ can have both signs, as we can already see from
the previous paragraph: the loop diagrams have opposite signs for bosons and fermions.
Hence for positive $\Lambda$ the universe undergoes accelerated expansion, and for negative $\Lambda$ it
collapses. The value of $\Lambda$ becomes relevant as soon as it dominates all other contributions.
But since all other contributions scale with negative powers of $a$, in a universe that starts
expanding this eventually happens. This implies that the simple observation that our
universe has existed for billions of years and has a size of billions of light years gives us
an experimental upper limit on $|\Lambda|$, and that we have known about this limit for a long
time already.
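The claim that eqn. (1.14) is derivable from eqns. (1.4) and (1.9) can be made explicit. The following worked derivation is a sketch that assumes a single component with constant $w$ and a constant curvature parameter $k$:

```latex
\begin{align*}
\dot a^2 &= \frac{8\pi G}{3}\,\rho\,a^2 - kc^2
  && \text{eqn. (1.4) multiplied by } a^2 \\
2\,\dot a\,\ddot a &= \frac{8\pi G}{3}\left(\dot\rho\,a^2 + 2\rho\,a\,\dot a\right)
  && \text{time derivative, } k \text{ constant} \\
&= \frac{8\pi G}{3}\,\rho\,a\,\dot a\,\bigl[-3(1+w) + 2\bigr]
  && \text{using } \dot\rho = -3(1+w)\,\rho\,\dot a/a \\
\frac{\ddot a}{a} &= -\frac{4\pi G}{3}\,\rho\,(1+3w)
  && \text{dividing by } 2\,a\,\dot a
\end{align*}
```

For $w = -1$ the right-hand side becomes $+\frac{8\pi G}{3}\rho$, which is positive for positive vacuum energy density: this is the accelerated expansion described above. The curvature term drops out because it is constant in time.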
It is entertaining to use Planck units to specify $\rho_\Lambda$. Then the natural value of $\rho_\Lambda$ is
about one Planck mass per Planck volume. The limit obtained from the size and life-time
of the universe described above is about $10^{-120}$ in Planck units. Contributions from
particle physics cut off at 1 TeV yield a value of about $10^{-60}$ in Planck units, far above
the observational upper limit. For this reason many people believed that if $\Lambda$ is so small,
it would actually vanish for a reason still to be discovered. But in 1998 it was discovered
that the universe is undergoing accelerated expansion. By now we know that the value of
$\Lambda$ needed to explain this is about the right amount to supply the missing contribution to $\Omega$.
Interestingly, the current discrepancy in the value of $\Omega$ of about 70% was already
known for decades, albeit less precisely. People did not know then that the universe is as
close to flatness as we know today. In Alan Guth's famous paper on inflation
[15] he assumes that $0.01 < \Omega < 10$. That seems hardly close to 1. However, if
one extrapolates backwards in time, the contribution of $\Omega_{\rm curv}$ relative to matter and
radiation approaches zero. Hence it would seem that $\Omega_{\rm curv}$ must be extremely close to
zero in the early universe. Indeed, $\Omega = 1$ means that the density is equal to the critical
density. The term "critical density" indicates that being above or below this value makes
a huge difference. Indeed this is correct. This value turns out to be a point of instability.
If one starts with $\Omega$ just above one, the universe starts expanding, but recollapses. If one
starts just below $\Omega = 1$ the universe expands very rapidly, and all matter gets diluted very
fast. To get a universe that still exists after 13.8 billion years and that has a substantial
matter density, one has to start with $\Omega$ very close to one. How close depends on how early
one starts. According to [15], if one starts at a temperature corresponding to 1 MeV, one
has to tune $\Omega$ to the value 1 with fifteen digit precision.
To explain this apparent fine-tuning, one may invent a mechanism that puts $\Omega_{\rm curv}$ very
close to zero in the early universe. Inflation is such a mechanism. Then one would expect
$\Omega$ to be very close to 1 today. This theoretical expectation did not agree with the known
matter contributions to $\Omega$, and it was also known that dark energy could fill in the gap.
Hence one could claim that inflation predicted a positive cosmological constant of roughly
the observed size. But still, it seems that nobody was courageous enough to predict that.
values.
We may almost have forgotten what a real problem looks like. But if we go back
to the middle of last century, when people were trying to understand nuclear physics,
the situation was very different. Nuclear physicists were so desperate that one of them
exclaimed: "Even a wrong theory would be tremendous progress."
We still have some real problems left, but the list is very short: what is the correct
theory of quantum gravity, and what are the constituents of dark matter? In the latter
case, an alternative possibility is that we have to modify gravity somehow, but no matter
how one looks at it, there is a discrepancy between the left-hand side and the right-hand
side of Einstein's equations. This is a real problem. On the other hand, dark energy
can be viewed as an environmental problem. We can describe it by simply choosing an
already existing parameter appropriately, but of course that does not imply that there is
no new physics that describes it. But anyone who tries to explain dark energy with new
physics will first have to argue away the old physics.
There is perhaps one other real problem: stability of the Higgs potential. With the
current values of the Higgs mass and the top quark mass (to which this issue is most
sensitive), we are two or three standard deviations beyond the boundary line of stability.
Beyond that line the quantum-corrected Higgs potential develops a second minimum, to
which our universe could tunnel. This does not mean that the entire universe tunnels
instantaneously, but that somewhere a tiny bubble of the new vacuum appears, which starts
expanding to cover the entire universe. One can compute the life-time of the universe
under these conditions, and with current data this is expected to be far more than 13.8
billion years. However there are several theoretical uncertainties, and furthermore one has
to worry not just about the current situation, but also about the history of the universe.
So this is potentially a real problem.
Finally, neutrino masses are a real problem for the classic Standard Model, which
was defined to have only left-handed neutrinos and no neutrino masses. Then, by defi-
nition, neutrino oscillations imply non-zero neutrino masses and hence new physics. But
in principle neutrino masses can easily be introduced in a manner analogous to quark
masses, which requires assuming the (still unproven) existence of right-handed neutrinos.
This is an alternative definition of the Standard Model we might have adopted. In that
case the actual mass of the neutrino and its smallness becomes another environmental
problem.
All the rest can be called environmental problems. This list includes:
• Horizon problem: Why is the early universe homogeneous, although there are
  many causally disconnected regions?
• Flatness problem: Why was the energy density in the early universe so close to the
  critical density?
• Baryons: Why are there only baryons and leptons, but essentially no anti-particles
  in the known universe?
• Dark energy: Why is it so small in comparison to natural scales?
• Dark energy versus dark matter versus baryonic matter: Why are their contributions
  to $\Omega$ today comparable in size? (the "why now" problem)
• The Hierarchy problem: Why is the Higgs mass so much smaller than the Planck
  mass?
• The Weak/Strong coincidence: Why is the QCD scale close to the weak scale? Or
  more precisely: why are light quark mass differences of the same order of magnitude
  as nuclear binding energies?
• Neutrino masses: Why are they so much smaller than charged lepton masses?
• Quark and lepton mixing angles: Why are quark mixing angles very small, while
  lepton mixing angles are not?
• Charge quantization: Why is the proton charge exactly equal to minus the electron
  charge?
understanding of their good and not-so-good features. This is the main focus of these
lectures.
2 Gauge Theories
In this section we present a brief introduction to non-abelian gauge theories, one of the
main ingredients of the Standard Model. This assumes some basic knowledge of classical
electrodynamics, which will be generalized from abelian symmetry groups (U (1), or just
phases) to non-abelian ones. Furthermore the notion of Euler-Lagrange equations for
classical fields is assumed, and basic canonical quantization of free field theories.
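$$\mathcal{L} = -\frac14 F_{\mu\nu}F^{\mu\nu} + J_\mu A^\mu , \qquad (2.1)$$
with
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu . \qquad (2.2)$$
To verify this statement we simply derive the Euler-Lagrange equations that follow from
this Lagrangian
$$\partial_\mu \frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\nu)} = \frac{\partial\mathcal{L}}{\partial A_\nu} . \qquad (2.3)$$
This yields
$$\partial_\mu F^{\mu\nu} = -J^\nu . \qquad (2.4)$$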
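The step from the Euler-Lagrange equation (2.3) to the field equation (2.4) can be spelled out as follows. This is a sketch assuming the Lagrangian (2.1) with the signs $-\frac14 F_{\mu\nu}F^{\mu\nu} + J_\mu A^\mu$; with a different sign convention for the source term the sign of the current in the result changes accordingly.

```latex
\begin{align*}
\frac{\partial F_{\alpha\beta}}{\partial(\partial_\mu A_\nu)}
  &= \delta^\mu_\alpha\,\delta^\nu_\beta - \delta^\mu_\beta\,\delta^\nu_\alpha \\
\frac{\partial\mathcal{L}}{\partial(\partial_\mu A_\nu)}
  &= -\frac12\,F^{\alpha\beta}\,
     \frac{\partial F_{\alpha\beta}}{\partial(\partial_\mu A_\nu)}
   = -\frac12\left(F^{\mu\nu} - F^{\nu\mu}\right) = -F^{\mu\nu} \\
\frac{\partial\mathcal{L}}{\partial A_\nu} &= J^\nu \\
\Rightarrow\quad \partial_\mu F^{\mu\nu} &= -J^\nu
\end{align*}
```

The factor $\frac12$ instead of $\frac14$ in the second line arises because $F_{\alpha\beta}F^{\alpha\beta}$ is quadratic in the field strength, and the last equality in that line uses the antisymmetry of $F^{\mu\nu}$.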
Now define electric and magnetic fields
These are two of the four Maxwell equations (the other two,
$$\vec\nabla \times \vec E + \partial_t \vec B = 0 , \qquad \vec\nabla \cdot \vec B = 0 , \qquad (2.7)$$
are trivially satisfied if we express the electric and magnetic fields in terms of a vector
potential $A_\mu$). Consistency of Eq. (2.4) clearly requires
$$\partial_\nu J^\nu = -\partial_\nu\partial_\mu F^{\mu\nu} = 0 , \qquad (2.8)$$
because of the antisymmetry of $F^{\mu\nu}$. This implies that $J^\mu$ must be a conserved current.
For such a current one can define a charge
$$Q = \int_V d^3x\, J^0 , \qquad (2.9)$$
where the integral is over some volume $V$. This charge is conserved if the flux of the
current $\vec J$ into the volume vanishes.
Integrating by parts, and making the assumption that all physical quantities fall off suf-
ficiently rapidly at spatial and temporal infinity, we get
$$\delta S_J = -\int d^4x\, \Lambda\, \partial_\mu J^\mu , \tag{2.12}$$
Gauge invariance (or current conservation) is our main guiding principle in constructing
an action coupling the electromagnetic field to other fields. Consider for example the
free fermion. It is not difficult to write down a Lorentz-invariant coupling:
$$\mathcal{L}_{\rm kin} = i\bar\psi\gamma^\mu\partial_\mu\psi - \tfrac{1}{4} F_{\mu\nu}F^{\mu\nu} . \tag{2.14}$$
Note that we have introduced two new variables here: the coupling constant $e$ and the
charge $q$. The latter quantity depends on the particle one considers; for example for the
electron $q = -1$ and for quarks $q = \frac{2}{3}$ or $q = -\frac{1}{3}$. The coupling constant determines the
strength of the interaction. This quantity is the same for all particles. It turns out that
the combination $\alpha = \frac{e^2}{4\pi}$ is small, $\alpha \approx 1/137.04$. This is the expansion parameter of QED,
and its smallness explains why perturbation theory is successful for this theory. Although
only the product $eq$ is observable, it is convenient to make this separation.
With this choice for the interaction, the current is
$$J^\mu = eq\,\bar\psi\gamma^\mu\psi . \tag{2.15}$$
Using the equations of motion (i.e. the Dirac equation) one may verify that this current
is indeed conserved, so that the theory is gauge invariant. But there is a nicer way of
seeing that. Notice that the fermion kinetic terms as well as the interaction are invariant
under the transformation
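$$\psi \to e^{ieq\Lambda}\psi ; \qquad \bar\psi \to e^{-ieq\Lambda}\bar\psi , \tag{2.16}$$
if $\Lambda$ is independent of $x$. Because of the derivative this is not true if $\Lambda$ does depend
on $x$. However, the complete Lagrangian $\mathcal{L}_{\rm kin} + \mathcal{L}_{\rm int}$ is invariant under the following
transformation
$$\psi \to e^{ieq\Lambda(x)}\psi ; \qquad \bar\psi \to e^{-ieq\Lambda(x)}\bar\psi ; \qquad A_\mu \to A_\mu + \partial_\mu\Lambda(x) . \tag{2.17}$$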
This is the gauge transformation, extended to act also on the fermions. This is sufficient
for our purposes: it shows that also in the presence of a coupling to fermions one degree
of freedom decouples from the Lagrangian, so that the photon has only two degrees of
freedom.
here that only first derivatives appear, but this can be generalized). Hence the variation
of the action must have the form
$$\delta S = \int d^4x\, \partial_\mu\Lambda(x)\, J^\mu[\mathrm{Fields}] \tag{2.18}$$
where $J^\mu[\mathrm{Fields}]$ is some expression in terms of the fields of the theory. The precise form
of $J^\mu$ depends on the action under consideration, and follows in a straightforward way
from the symmetry.
The equations of motion are derived by requiring that the field configuration is a stationary
point of the action, which means that terms linear in the variation, such as Eq. (2.18), must
vanish. Integrating by parts we then get
$$\int d^4x\, \Lambda(x)\, \partial_\mu J^\mu[\mathrm{Fields}] = 0 . \tag{2.19}$$
Since $\Lambda(x)$ is an arbitrary function, it follows that the Noether current $J^\mu[\mathrm{Fields}]$ is
conserved. It is an easy exercise to show that the symmetry (2.16) of the free fermion action
does indeed yield the current (2.15).
$$D_\mu = \partial_\mu - ieqA_\mu \tag{2.20}$$
$$\mathcal{L} = i\bar\psi\gamma^\mu D_\mu\psi \tag{2.22}$$
checking gauge invariance is essentially trivial. One can simply pull the phases through
$D_\mu$, even if they are $x$-dependent!
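The claim that an $x$-dependent phase can be pulled through the covariant derivative can be checked numerically. The Python sketch below is only an illustration and makes several assumptions that are not part of the text: it works with a single coordinate $x$, the configurations `psi`, `gauge_field` and `gauge_parameter` are arbitrary smooth functions invented for the test, the values of `E_CHARGE` and `Q` are arbitrary, and derivatives are approximated by central finite differences.

```python
import cmath
import math

E_CHARGE = 0.3   # coupling constant e (arbitrary illustrative value)
Q = -1.0         # charge q of the field (electron-like)
H = 1e-5         # step size for central finite differences


def derivative(f, x):
    """Central finite-difference approximation of df/dx."""
    return (f(x + H) - f(x - H)) / (2 * H)


def psi(x):
    """Hypothetical smooth complex field configuration."""
    return cmath.exp(-x * x) * (1.0 + 0.5j * x)


def gauge_field(x):
    """Hypothetical gauge potential A(x), one component only."""
    return math.sin(x)


def gauge_parameter(x):
    """Hypothetical x-dependent gauge parameter Lambda(x)."""
    return 0.7 * math.cos(2.0 * x)


def phase(x):
    """The local phase exp(i e q Lambda(x))."""
    return cmath.exp(1j * E_CHARGE * Q * gauge_parameter(x))


def psi_transformed(x):
    """psi -> exp(i e q Lambda) psi."""
    return phase(x) * psi(x)


def gauge_field_transformed(x):
    """A -> A + dLambda/dx."""
    return gauge_field(x) + derivative(gauge_parameter, x)


def covariant_derivative(field, potential, x):
    """D field = d(field)/dx - i e q A field."""
    return derivative(field, x) - 1j * E_CHARGE * Q * potential(x) * field(x)


def covariance_defect(x):
    """|D'psi' - phase * D psi|, zero up to discretisation error."""
    lhs = covariant_derivative(psi_transformed, gauge_field_transformed, x)
    rhs = phase(x) * covariant_derivative(psi, gauge_field, x)
    return abs(lhs - rhs)
```

Evaluating `covariance_defect` at a few points gives values at the level of the finite-difference error, whereas the ordinary derivative of `psi_transformed` differs from `phase(x) * derivative(psi, x)` by a term proportional to the derivative of the gauge parameter.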
Replacing normal derivatives by covariant ones is called minimal substitution, and the
resulting interaction terms minimal coupling. It is a general principle: an action can be
made gauge invariant by replacing all derivatives by covariant derivatives. For example
the coupling of a photon to a complex scalar is given by the Lagrangian
where $q$ is the charge of $\phi$. Note that $\phi$ must be a complex field since the gauge
transformation multiplies it by a phase. Note also that the field $\phi^*$ has opposite charge.
The Lagrangian of the vector bosons can also be written down in terms of covariant
derivatives. We have (for any $q \neq 0$)
$$-ieqF_{\mu\nu} = [D_\mu(q), D_\nu(q)] , \tag{2.24}$$
from which gauge invariance of the action follows trivially. Here $q$ has no special
significance, and any non-zero value can be used. This relation should be interpreted as a
relation for differential operators acting on some function $\psi(x)$. The space-time derivatives
in both covariant derivatives act on $\psi$, but in the final result the action of the derivatives
on $\psi$ cancels out.
Therefore we can expand it into a complete basis of two-by-two matrices. Any such
matrix can be written as $a + \vec b\cdot\vec\sigma$, where $a$ and $\vec b$ are four complex constants and $\sigma^i$ are the
Pauli matrices. In this case we want $A_\mu$ to be anti-hermitean (just as $\partial_\mu$) so the constants
must be purely imaginary. Furthermore we will set $a = 0$. This is not necessary, but
the constant component of $A_\mu$ corresponds to an abelian gauge field that belongs to the
overall phase in $U(2)$ in comparison with $SU(2)$. Since we are only considering $SU(2)$
here, only the components proportional to $\vec\sigma$ are interesting for us. Instead of the Pauli
matrices we will use the matrices
$$T^a = \tfrac{1}{2}\sigma^a . \tag{2.29}$$
This avoids several factors $\frac{1}{2}$ in formulas, and also prevents confusion with the Pauli
matrices used for spin. Then we write the gauge fields in the following way
$$A_\mu = -ig \sum_a A^a_\mu T^a , \tag{2.30}$$
where we have introduced a factor $-ig$ for future purposes. The component fields $A^a_\mu$ are
real. The factor $g$ will play the role of the coupling constant, just as $e$ in QED. Note that
there are three gauge fields, for $a = 1, 2, 3$.
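To see how $A_\mu$ should transform, it is instructive to consider infinitesimal transformations
$$U(\vec\Lambda) = 1 + i\vec\Lambda\cdot\vec T . \tag{2.31}$$
Expanding Eq. (2.28) to first order in $\vec\Lambda$ we get
$$A_\mu \to A_\mu - i\,\partial_\mu\vec\Lambda\cdot\vec T + i\left[\vec\Lambda\cdot\vec T , A_\mu\right] . \tag{2.32}$$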
In perturbation theory the first term gives rise to the fermion propagator, which in
comparison to the one of QED has an extra factor $\delta_{ij}$. The second term is a perturbation,
which yields the Feynman rule (the curly line represents a non-abelian gauge boson, see
below)
[Figure: fermion-fermion-gauge boson vertex]
$$ig\,\gamma^\mu\, T^a_{ij}$$
The fermion spinors $u$, $v$, $\bar u$, $\bar v$ now get extra indices $i, j, \ldots$ in addition to their spinor
indices. The matrices $T^a$ are multiplied together along a fermion line, starting at an
outgoing arrow and following the line against the arrow direction. If there is a closed
fermion loop, one obtains a trace of a product of matrices $T$. Combinatorically this works
exactly as for the $\gamma$ matrices.
2.7 Gauge Kinetic Terms
We can also write down a kinetic term for the gauge fields. First define
$$F_{\mu\nu} = [D_\mu, D_\nu] = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu] . \tag{2.35}$$
Just like $A_\mu$, the field strength tensor $F_{\mu\nu}$ is a two-by-two matrix, and it can be expanded
in terms of Pauli matrices as
$$F_{\mu\nu} = -ig \sum_a F^a_{\mu\nu} T^a , \tag{2.36}$$
Now we can express the components of $F$ in terms of those of $A$:
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu + g\,\epsilon^{abc} A^b_\mu A^c_\nu . \tag{2.37}$$
The reason for writing $F$ as in Eq. (2.35) is that it has a nice transformation law under
gauge transformations
$$F_{\mu\nu} \to U F_{\mu\nu} U^{-1} . \tag{2.38}$$
Note that in contrast to the field strength of QED, the field strength tensor of non-abelian
gauge theories is not gauge invariant. However, we can make a gauge invariant
combination,
$$\mathcal{L}_{\rm gauge} = \frac{1}{2g^2}\,\mathrm{Tr}\, F_{\mu\nu}F^{\mu\nu} , \tag{2.39}$$
where the trace is over the two-by-two matrices. Because of the cyclic property of the
trace this quantity is gauge invariant. It is also manifestly Lorentz invariant, and hence
it is a good candidate for the Lagrangian of the non-abelian gauge fields. If we write it
out in components we get
$$\mathcal{L}_{\rm gauge} = -\frac{1}{4}\sum_a F^a_{\mu\nu}F^{\mu\nu,a} . \tag{2.40}$$
2.8 Feynman Rules
Note that the linear terms in $F^a_{\mu\nu}$ are just like those for QED. If that was all there was
we just had three copies of QED, for $a = 1, 2, 3$. The quadratic terms in $F$ give rise to
cubic terms in the Lagrangian that are proportional to $g$, and quartic terms proportional
to $g^2$. These are interactions. Just as in QED, we use the bilinear terms in the action to
define a propagator, which in fact is identical to the one of QED except for a factor $\delta^{ab}$.
To distinguish non-abelian gauge bosons from photons we use another kind of line:
[Figure: curly line carrying momentum $k$]
$$-\frac{i}{k^2}\, g_{\mu\nu}\,\delta^{ab}$$
The cubic and quartic terms give rise to interactions, whose Feynman rules are
[Figure: three-boson vertex]
$$g\,\epsilon_{abc}\left[(q-p)_\mu\, g_{\nu\rho} + (k-q)_\nu\, g_{\rho\mu} + (p-k)_\rho\, g_{\mu\nu}\right]$$
[Figure: four-boson vertex]
$$-ig^2\left[\epsilon_{eab}\epsilon_{ecd}\,(g_{\mu\rho}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\rho}) + \epsilon_{eac}\epsilon_{edb}\,(g_{\mu\sigma}g_{\rho\nu} - g_{\mu\nu}g_{\rho\sigma}) + \epsilon_{ead}\epsilon_{ebc}\,(g_{\mu\nu}g_{\sigma\rho} - g_{\mu\rho}g_{\sigma\nu})\right]$$
Just like photons, non-abelian gauge bosons $A^a_\mu$ have two components (two for each value
of the index $a$ of course), and when they appear as external lines they are represented
by polarization tensors $\epsilon^a_\mu$.
2.9 Other Gauge Groups
All the foregoing can easily be generalized to other symmetries. Instead of $SU(2)$ we
may use other groups like $SU(N)$ or $SO(N)$. In general, one has instead of the Pauli
matrices some other set of hermitean matrices $T^a$. These matrices satisfy a generalized
set of commutation relations,
$$[T^a, T^b] = i f^{abc}\, T^c , \tag{2.41}$$
where $f^{abc}$ is a set of real numbers that are called the structure constants of the group.
They are fully anti-symmetric in all three indices. In addition to the commutation
relations, the only other property one needs to know about these matrices is their
normalization. Often one uses
$$\mathrm{Tr}\, T^a T^b = \tfrac{1}{2}\delta^{ab} , \tag{2.42}$$
which is indeed satisfied by the SU (2) matrices we used. In Eq. (2.39) this normalization
is implicitly assumed.
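As a quick consistency check, the commutation relations and the normalization can be verified explicitly for the $SU(2)$ generators $T^a = \frac{1}{2}\sigma^a$, for which the structure constants are $\epsilon^{abc}$. The following Python sketch is an illustration only; the helper names (`matmul`, `commutator`, `structure_constant`, and so on) are introduced here and do not appear in the text. The structure constants are extracted from the commutator using the normalization (2.42).

```python
# Pauli matrices as nested tuples of complex numbers
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)
# Generators T^a = sigma^a / 2
T = tuple(tuple(tuple(0.5 * z for z in row) for row in s) for s in SIGMA)


def matmul(a, b):
    """Product of two square matrices given as nested tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return tuple(tuple(x - y for x, y in zip(r1, r2)) for r1, r2 in zip(ab, ba))


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taking the values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def structure_constant(a, b, c):
    """f^{abc} from [T^a, T^b] = i f^{abd} T^d and Tr T^d T^c = delta^{dc} / 2."""
    return (-2j * trace(matmul(commutator(T[a], T[b]), T[c]))).real
```

Comparing `structure_constant(a, b, c)` with `levi_civita(a, b, c)` for all index values confirms that the structure constants of $SU(2)$ are $\epsilon^{abc}$ in this normalization.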
To write down Lagrangians, transformations and Feynman rules for another group,
simply make the following replacement everywhere
$$\epsilon^{abc} \to f^{abc} . \tag{2.43}$$
An interesting special case is the group $SU(3)$, with fermions in triplet representations.
There are eight traceless hermitean three-by-three matrices $T^a$. This yields QCD (quantum
chromodynamics). Corresponding to the eight matrices there are eight gauge bosons,
called gluons, while the fermions are called quarks. It is now completely straightforward
to write down the QCD Lagrangian.
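One explicit choice for the eight matrices is given by the Gell-Mann matrices divided by two. The Python sketch below constructs them and provides helpers to check that they are hermitean, traceless and normalized as in Eq. (2.42). It is an illustration under stated assumptions: the function names are invented here, and the ordering of the matrices (off-diagonal pairs first, diagonal ones last) is a choice of this sketch that differs from the conventional numbering.

```python
import math


def gell_mann_matrices():
    """Return the eight Gell-Mann matrices as 3x3 nested lists of complex numbers."""
    def zeros():
        return [[0j] * 3 for _ in range(3)]

    matrices = []
    # For each pair (i, j): one real symmetric and one imaginary antisymmetric matrix
    for i, j in ((0, 1), (0, 2), (1, 2)):
        symmetric, antisymmetric = zeros(), zeros()
        symmetric[i][j] = symmetric[j][i] = 1 + 0j
        antisymmetric[i][j], antisymmetric[j][i] = -1j, 1j
        matrices += [symmetric, antisymmetric]

    # Two diagonal matrices: diag(1, -1, 0) and diag(1, 1, -2) / sqrt(3)
    diag_a, diag_b = zeros(), zeros()
    diag_a[0][0], diag_a[1][1] = 1 + 0j, -1 + 0j
    s = 1 / math.sqrt(3)
    diag_b[0][0], diag_b[1][1], diag_b[2][2] = s + 0j, s + 0j, -2 * s + 0j
    return matrices + [diag_a, diag_b]


def trace_of_product(a, b):
    """Tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def is_hermitean(a):
    n = len(a)
    return all(abs(a[i][j] - a[j][i].conjugate()) < 1e-12
               for i in range(n) for j in range(n))


def is_traceless(a):
    return abs(sum(a[i][i] for i in range(len(a)))) < 1e-12


# Generators T^a = lambda^a / 2
GENERATORS = [[[z / 2 for z in row] for row in lam] for lam in gell_mann_matrices()]
```

Looping over `GENERATORS` with `is_hermitean`, `is_traceless` and `trace_of_product` reproduces the three properties quoted in the text for this particular basis.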
Note that the entire discussion of non-abelian gauge theories is completely analogous
to that of QED. This is in fact a special case, obtained by replacing
$$T^a \to q , \qquad g \to e , \qquad \epsilon^{abc} \to 0 . \tag{2.44}$$
where $\omega(\vec k) = \sqrt{\vec k^2 + m^2}$. If one computes the vacuum expectation value (v.e.v.) of such
a quantum field one finds zero, since the v.e.v. of any creation/annihilation operator
vanishes. But the v.e.v. of a field need not vanish. In general one can have $\phi = \phi_{\rm cl} + \phi_{\rm qu}$, with
all quantum fluctuations in the second term, and $\phi_{\rm cl} \neq 0$. In general, if one quantizes a
theory one considers the fluctuations of fields around minima of the classical action. These
fluctuations define a set of harmonic oscillators, to which the quantization procedure is
applied. For this to make sense, the change in energy must be quadratic (or higher
order) in terms of infinitesimal fluctuations. In particular, there should not be any linear
dependences. This implies that the classical field must be a solution to the equations of
motion, or in other words a stationary point of the action. Usually $\phi = 0$ is a solution to
the equations of motion, but in some cases there may be other solutions.
The classical value, $\phi_{\rm cl}$, serves as a new, non-trivial ground state of the theory. One
defines the vacuum in such a way that $\langle 0|\phi_{\rm qu}|0\rangle = 0$. The properties of the quantum
vacuum state, and in particular the symmetries it respects, are determined by those
of the classical background field $\phi_{\rm cl}$. The possible values of $\phi_{\rm cl}$ are restricted by the
symmetries the theory should have. In general, with $\phi_{\rm cl} \neq 0$ there will be fewer symmetries
than with $\phi_{\rm cl} = 0$. If some symmetry operation changes the classical vacuum, then this
is not going to be a symmetry of the theory expanded around that vacuum.
In any case we want our vacuum to be translation invariant and Lorentz-invariant.
This restricts $\phi_{\rm cl}$ to be a constant over all of space-time, and it restricts $\phi$ to be a scalar
field. If a vector field has a classical value, then it must point to some specific direction.
This breaks rotation invariance.
In addition to space-time symmetries, fields may also transform under internal
symmetries (by definition, this is anything other than Poincaré transformations). If a classical
background field transforms non-trivially under such symmetries, and if this is used to
define the classical vacuum of the theory, then the symmetry is broken. Historically,
this is called spontaneous breaking of a symmetry. This means that there is a
symmetry of the action that is not realized in the vacuum.
Now there are two possibilities one has to distinguish. The symmetry that is broken
may be a global or a local symmetry. The physics implication of these two cases is rather
different. Consider first global symmetries
as well as the classical Hamiltonian, has the same value for $\phi_{\rm cl}$ and $\phi_{\rm cl} + \delta_S\phi$. Then $\delta_S\phi$
is a fluctuation that does not cost any energy. Hence it must be a massless fluctuation.
The Goldstone theorem states that for any independent generator of a spontaneously
broken continuous symmetry there is a massless scalar in the spectrum. We will not
demonstrate this here in full generality, but show it in a concrete example.
The example is a complex scalar field theory with Lagrangian
$$\mathcal{L}_{\rm scalar} = \partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi . \tag{3.2}$$
We have given this field a mass, since otherwise it would be hard to detect the appearance
of massless modes. Note that there are two real fields, the real and imaginary parts of $\phi$,
with mass $m$.
Before continuing, a remark on the normalization of complex and real fields. The
normalization chosen above is the standard one for complex scalar fields. A real scalar
field with mass $m$ would have the Lagrangian
$$\mathcal{L}_{\rm real\ scalar} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2 . \tag{3.3}$$
In both cases the normalization is such that the propagator of these fields is $\frac{i}{k^2 - m^2}$.
The complex scalar theory defined in Eqn. (3.2) has a global $U(1)$ symmetry: if we
multiply $\phi$ with a constant phase, then the action does not change. The equations of
motion are
$$\partial_\mu\partial^\mu\phi = -m^2\phi \tag{3.4}$$
We are looking for constant solutions, $\partial_\mu\phi = 0$, and obviously this implies that $\phi = 0$. So
this is not very interesting.
We can make it more interesting by adding interactions
$$\mathcal{L}_{\rm scalar} = \partial_\mu\phi^*\,\partial^\mu\phi - m^2\phi^*\phi - \tfrac{1}{4}\lambda(\phi^*\phi)^2 . \tag{3.5}$$
This is often written as
$$\mathcal{L}_{\rm scalar} = \partial_\mu\phi^*\,\partial^\mu\phi - V(\phi) , \tag{3.6}$$
and $V(\phi)$ is called the scalar potential. The Hamiltonian derived from this action is
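$$H = \int d^3x \left[ |\partial_{\vec x}\phi|^2 + |\partial_t\phi|^2 + m^2\phi^*\phi + \tfrac{1}{4}\lambda(\phi^*\phi)^2 \right] . \tag{3.7}$$
For this to be bounded from below for large values of $\phi$ requires $\lambda > 0$ (where $\lambda$ is real).
Now the equations of motion are
$$\partial_\mu\partial^\mu\phi = -m^2\phi - \tfrac{1}{2}\lambda(\phi^*\phi)\phi \tag{3.8}$$
For both $m^2 > 0$ and $\lambda > 0$ this still has only one constant solution: $\phi = 0$. As explained
above, there is good reason why $\lambda$ should be positive, but not for $m^2$. This is just a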
parameter in a Lagrangian. The fact that we wrote it as a square is not a valid argument
for positivity. This was done so that in the previous case we could interpret m as a mass.
But we can forget about that here, and simply choose $m^2 < 0$. Then the condition
$$\partial_\mu\partial^\mu\phi = -m^2\phi - \tfrac{1}{2}\lambda(\phi^*\phi)\phi \tag{3.9}$$
has a non-trivial solution for constant $\phi$
$$\phi^*\phi = -\frac{2m^2}{\lambda} \equiv \tfrac{1}{2}v^2 \tag{3.10}$$
Note that there is an entire circle of vacua, because the solution does not depend on the
phase. We just choose one of them, for example $\phi$ real and positive. Then $\phi_{\rm cl} = \frac{1}{\sqrt{2}}v$,
with $v$ as defined above. The factor $\frac{1}{\sqrt{2}}$ seems awkward, and it might appear more natural
to define $\phi_{\rm cl} = v$, but this normalization is more convenient for future purposes.
To quantize this theory we expand the field around the vacuum: $\phi = \phi_{\rm cl} + \phi_{\rm qu}$. We
may do this as follows
$$\phi = \frac{1}{\sqrt{2}}(v + \eta + i\sigma) \tag{3.11}$$
Now $\eta$ and $\sigma$ are two real fluctuations, treated as new field variables. But there is a more
clever way of expanding around the classical vacuum:
$$\phi = \frac{1}{\sqrt{2}}(v + \eta)\,e^{i\xi} \tag{3.12}$$
Expanding to first order, we see that the two expansions are related: $\sigma = v\xi$. But if we
substitute Eqn. (3.12) into the Lagrangian, we see that $\xi$ disappears from the action,
except in the (canonically normalized) kinetic terms:
$$\tfrac{1}{2}\left|\partial_\mu\eta + i(v + \eta)\,\partial_\mu\xi\right|^2 . \tag{3.13}$$
Note that $\eta$ does not vanish from the action. If we expand the scalar potential we get
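$$V(\eta) = \tfrac{1}{2}m^2v^2 + \tfrac{1}{16}\lambda v^4 + \tfrac{1}{2}m^2\eta^2 + \tfrac{3}{8}\lambda v^2\eta^2 + \tfrac{1}{4}\lambda v\eta^3 + \tfrac{1}{16}\lambda\eta^4 \tag{3.14}$$
Here the terms linear in $\eta$, $(m^2 v + \tfrac{1}{4}\lambda v^3)\eta$, have been left out because they cancel
once Eqn. (3.10) is used. Substituting Eqn. (3.10) we get
$$V(\eta) = -\frac{m^4}{\lambda} - m^2\eta^2 + \tfrac{1}{4}\lambda v\eta^3 + \tfrac{1}{16}\lambda\eta^4 \tag{3.15}$$
There is a quadratic term in the scalar potential defining a positive mass $\sqrt{-2m^2}$ for $\eta$
(remember that $m^2 < 0$ and that there is a factor $\frac{1}{2}$ in the canonical definition of mass
terms for real scalars). But the field $\xi$ is massless. This is the Goldstone boson.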
Observe that the expansion Eqn. (3.12) is only valid if $v \neq 0$. Hence it cannot be
used to show that there is a massless mode if we expand around $\phi = 0$. We would not get
quadratic kinetic terms for $\xi$, but rather something like $(\eta\,\partial_\mu\xi)^2$. Indeed, the expansion
around $\phi = 0$ and $m^2 > 0$ just yields two massive modes with mass $m$.
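The expansion of the scalar potential around the non-trivial vacuum can be checked with exact rational arithmetic. The Python sketch below is an illustration with assumed parameter values ($m^2 = -1$, $\lambda = 4$, hence $v = 1$ from Eq. (3.10)); the function name `potential_coefficients` is introduced here. It expands $V = m^2\phi^*\phi + \frac{1}{4}\lambda(\phi^*\phi)^2$ with $\phi^*\phi = \frac{1}{2}(v + \eta)^2$ in powers of $\eta$. The phase $\xi$ drops out of $\phi^*\phi$ and therefore never enters the potential.

```python
from fractions import Fraction
from math import comb


def potential_coefficients(m2, lam, v):
    """Coefficients c[k] of eta**k in V = m2*|phi|^2 + (lam/4)*|phi|^4,
    where |phi|^2 = (v + eta)**2 / 2. Arguments must be integers."""
    # (m2/2) * (v + eta)^2 expanded with the binomial theorem
    quadratic = [Fraction(m2, 2) * comb(2, k) * v ** (2 - k) for k in range(3)]
    # (lam/16) * (v + eta)^4 expanded with the binomial theorem
    quartic = [Fraction(lam, 16) * comb(4, k) * v ** (4 - k) for k in range(5)]
    return [q + (quadratic[k] if k < 3 else 0) for k, q in enumerate(quartic)]


# Illustrative parameters: m^2 = -1, lambda = 4, so v^2 = -4 m^2 / lambda = 1
M2, LAM, V0 = -1, 4, 1
COEFFS = potential_coefficients(M2, LAM, V0)

# Mass squared of eta: twice the coefficient of eta^2 for a real scalar
ETA_MASS_SQUARED = 2 * COEFFS[2]
```

For these values the constant term equals $-m^4/\lambda$, the linear term vanishes, and the $\eta^2$ coefficient equals $-m^2$, in agreement with the quadratic term quoted above and with the mass squared $-2m^2$ for $\eta$.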
3.3 Higgs Mechanism for Abelian Gauge Symmetry
Now consider the same example with a local instead of a global symmetry. This implies
that $\phi$ is coupled to an abelian gauge field. The Lagrangian is
$$\mathcal{L}_{\rm scalar} = (D_\mu\phi)^* D^\mu\phi , \tag{3.16}$$
where $D_\mu = \partial_\mu - ieA_\mu$. The gauge symmetry of this Lagrangian is $\phi \to e^{ie\Lambda}\phi$. Let us
assume that $\phi$ has a v.e.v. equal to $v/\sqrt{2}$, where we take $v$ to be real. If we expand around
$\phi = v/\sqrt{2}$, the fluctuations will not have the gauge symmetry anymore, since $v$ is fixed
and does not transform. This is puzzling at first sight, because we had argued before
that the gauge symmetry was essential for having a massless photon with two physical
polarizations.
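To see what happens we rewrite the Lagrangian as before, choosing
$$\phi(x) = \frac{1}{\sqrt{2}}(v + \eta(x))\,e^{i\xi(x)} , \tag{3.17}$$
so that $\eta$ is the real fluctuation and $\xi$ the imaginary one. In the quantum theory the
quanta of $\eta$ and $\xi$ will yield the fluctuations, and they will have the usual expansion in
terms of oscillators, as in Eq. (3.1). Substituting (3.17) into the Lagrangian we get
$$\tfrac{1}{2}\left|\partial_\mu\eta + i(v + \eta)(\partial_\mu\xi - eA_\mu)\right|^2 .$$
Now we replace everywhere $A_\mu$ by
$$B_\mu = A_\mu - \frac{1}{e}\partial_\mu\xi \tag{3.18}$$
Now the Lagrangian becomes
$$\tfrac{1}{2}\left|\partial_\mu\eta - i(v + \eta)\,eB_\mu\right|^2 . \tag{3.19}$$
Expanding this yields
$$\tfrac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \tfrac{1}{2}e^2v^2B_\mu B^\mu + \tfrac{1}{2}e^2B_\mu B^\mu\,\eta(2v + \eta) . \tag{3.20}$$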
Now suppose that there are other terms in the Lagrangian in addition to (3.16). This
includes in particular the kinetic terms for the gauge bosons. All the additional terms
must be gauge invariant. The replacement $B_\mu = A_\mu - \frac{1}{e}\partial_\mu\xi$ can be realized on all other
terms as a gauge transformation, which may include $\xi$-dependent transformations of other
fields. This implies that the other terms in the Lagrangian remain unchanged, except that
$A_\mu$ is replaced everywhere by $B_\mu$.
To summarize, suppose we started with a Lagrangian
$$-\tfrac{1}{4}F_{\mu\nu}(A)F^{\mu\nu}(A) + \mathcal{L}_{\rm scalar} + \mathcal{L}_{\rm rest}(A) . \tag{3.21}$$
Then after shifting the vacuum and some changes of variables we end up with
$$-\tfrac{1}{4}F_{\mu\nu}(B)F^{\mu\nu}(B) + \tfrac{1}{2}\partial_\mu\eta\,\partial^\mu\eta + \tfrac{1}{2}e^2v^2B_\mu B^\mu + \tfrac{1}{2}e^2B_\mu B^\mu\,\eta(2v + \eta) + \mathcal{L}_{\rm rest}(B) . \tag{3.22}$$
Two observations can now be made:
The field $\phi$ had two real degrees of freedom, $\eta$ and $\xi$, but the latter has disappeared
completely.
This magic is called the Higgs mechanism, after one of its inventors. It allows us to
give a mass to the gauge boson, simultaneously breaking the gauge symmetry. The field $\xi$
is not really gone. As we have seen, a massive gauge boson has three degrees of freedom,
a massless one only two. When we made the transformation $B_\mu = A_\mu - \frac{1}{e}\partial_\mu\xi$ we
absorbed $\xi$ into the gauge field to provide the extra degree of freedom needed to make it
massive. One often says that $\xi$ was eaten by the gauge field.
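The algebra behind this change of variables can be checked pointwise. The Python sketch below is an illustration under stated assumptions: the four-vectors `D_ETA`, `D_XI`, `A_FIELD` and `D_LAMBDA` stand for the values of $\partial_\mu\eta$, $\partial_\mu\xi$, $A_\mu$ and $\partial_\mu\Lambda$ at a single space-time point and are arbitrary numbers chosen here, as are `ETA`, `V` and `E`; the metric signature $(+,-,-,-)$ is also an assumption of the sketch. It compares the kinetic term before the redefinition with the expanded form (3.20), and checks that $B_\mu$ does not change under a gauge transformation $A_\mu \to A_\mu + \partial_\mu\Lambda$, $\xi \to \xi + e\Lambda$.

```python
METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-)


def minkowski_norm2(vec):
    """Contract a complex four-vector with its conjugate: V_mu^* V^mu."""
    return sum(g * abs(c) ** 2 for g, c in zip(METRIC, vec))


def kinetic_original(d_eta, d_xi, a, eta, v, e):
    """(1/2)|d eta + i (v + eta)(d xi - e A)|^2, before the field redefinition."""
    vec = [de + 1j * (v + eta) * (dx - e * am)
           for de, dx, am in zip(d_eta, d_xi, a)]
    return 0.5 * minkowski_norm2(vec)


def shifted_field(a, d_xi, e):
    """B_mu = A_mu - (1/e) d_mu xi."""
    return [am - dx / e for am, dx in zip(a, d_xi)]


def kinetic_expanded(d_eta, b, eta, v, e):
    """Kinetic term, mass term and interaction term written in terms of B_mu."""
    d_eta2 = sum(g * x * x for g, x in zip(METRIC, d_eta))
    b2 = sum(g * x * x for g, x in zip(METRIC, b))
    return (0.5 * d_eta2 + 0.5 * e ** 2 * v ** 2 * b2
            + 0.5 * e ** 2 * b2 * eta * (2 * v + eta))


def gauge_transform(a, d_xi, d_lam, e):
    """A -> A + d Lambda, together with d xi -> d xi + e d Lambda."""
    new_a = [am + dl for am, dl in zip(a, d_lam)]
    new_d_xi = [dx + e * dl for dx, dl in zip(d_xi, d_lam)]
    return new_a, new_d_xi


# Arbitrary illustrative values at one space-time point
D_ETA = (0.3, -0.1, 0.2, 0.5)
D_XI = (0.05, 0.4, -0.3, 0.1)
A_FIELD = (1.2, -0.7, 0.3, 0.9)
D_LAMBDA = (0.8, 0.2, -0.5, 0.35)
ETA, V, E = 0.4, 2.0, 0.6

B_FIELD = shifted_field(A_FIELD, D_XI, E)
A_NEW, D_XI_NEW = gauge_transform(A_FIELD, D_XI, D_LAMBDA, E)
B_FIELD_NEW = shifted_field(A_NEW, D_XI_NEW, E)
```

With these inputs `kinetic_original` and `kinetic_expanded` agree up to rounding, and `B_FIELD_NEW` coincides with `B_FIELD`, which is the sense in which $\xi$ has been absorbed into the vector field.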
Massive vector bosons occur in the theory of weak interactions, the W and Z bosons.
You may wonder why we couldn't simply have added the mass term by hand. The reason is
that such a procedure makes the theory inconsistent. It explicitly breaks gauge invariance,
and gauge invariance is essential for consistency of theories with spin-1 particles. In the
procedure explained above gauge invariance is not manifest anymore in the shifted ground
state, but it is still present in a less obvious form.
The black dot indicates our choice of the ground state, but any choice on the bottom
of the Mexican hat would have been fine as well. By making a choice we break the
gauge symmetry, i.e. the phase rotations of $\phi$. We have also indicated the directions of
the small perturbations $\eta$ and $\xi$.
If one shifts the value of $\phi$ one finds that $\eta$ now gets a mass, and $\xi$ disappears, as
before. The mass of $\eta$, the Higgs boson, is a free parameter, and is in principle unrelated
to the mass of the vector boson, $ev$.
Observe that in order to find a non-trivial ground state we had to take $m^2 < 0$. If one
expands around the trivial ground state $\phi = 0$ this negative value of $m^2$ leads to trouble:
the theory now contains particles with imaginary mass. Such particles are also known as
tachyons because their velocity, given by the relativistic formula $(\frac{v}{c})^2 = \vec p^{\,2}/(\vec p^{\,2} + m^2)$,
can exceed the velocity of light. The presence of tachyons means that the theory with
this vacuum choice is sick. Classically this is related to the fact that a field configuration
on top of the hill inside the Mexican hat is unstable. The only correct vacuum is the
non-trivial one, and all fluctuations around it have positive or zero mass. If on the other
hand $m^2 > 0$ the only vacuum is $\phi = 0$. By continuously changing $m^2$ we can go from
the symmetric vacuum $\phi = 0$ to the vacuum with broken gauge symmetry. Historically,
this is what led to the name spontaneous symmetry breaking.
All the above can be generalized to non-abelian gauge theories. The main features
are the same. Some symmetries within a symmetry group are spontaneously broken, and
the corresponding vector bosons acquire a mass by each eating a scalar. The resulting
spectrum always contains (at least) one Higgs scalar, whose mass is a free parameter and
hence cannot be predicted.
$$\mathcal{L}_2 = i\bar\psi^i\gamma^\mu(\partial_\mu - ig_3A^I_\mu T^I)\psi^i , \tag{4.2}$$
where the implicit sum on $i$ is over the six quark flavors $u$, $d$, $c$, $s$, $b$ and $t$. These are
all the quarks we know, and there are reasons to believe that there are no more. The
hermitean matrix $T^I$ is the $SU(3)$ generator in the triplet representation. Color indices
of the quarks have been suppressed. The parameter $g_3$ is the QCD coupling constant.
Note that the Lagrangian $\mathcal{L}_1 + \mathcal{L}_2$ has an exact $U(6)$ symmetry $\psi^i \to U^{ij}\psi^j$. In fact it
has an even larger symmetry since each quark has both left- and right-handed components
which can be rotated completely independently. We can write the fermion Lagrangian as
$m_u = 2.2 \pm 0.6$ MeV
$m_d = 4.7 \pm 0.5$ MeV
$m_s = 96 \pm 8$ MeV
$m_c = 1.27 \pm 0.03$ GeV
$m_b = 4.1 \ldots 4.7$ GeV
$m_t = 173 \pm 1$ GeV
These masses are taken into account by adding the following terms to the Lagrangian
$$\mathcal{L}_3 = -\sum_i m_i\,\bar\psi^i\psi^i . \tag{4.4}$$
The fact that all masses are different implies a breaking from $U(6)$ to $U(1)^6$, but even if
all masses were equal these terms link the left- and right-handed fermions and do not
allow us to rotate them independently:
$$\mathcal{L}_3 = -\sum_i m_i\,(\bar\psi^i_L\psi^i_R + \bar\psi^i_R\psi^i_L) . \tag{4.5}$$
If all $m_i$ were the same this would break the symmetry to $U(6)_V$, the vector symmetry,
which acts by rotating $\psi_L$ and $\psi_R$ in the same way. The orthogonal combination,
rotating $\psi_L$ by $U$ and $\psi_R$ by $U^{-1}$, is called the axial symmetry. The combination of
these two effects (the existence of quark masses and their differences) leaves us with the
global symmetry $(U(1)^6)_V$, whose conserved charges are the six separate flavor quantum
numbers. They are conserved because QCD is flavor blind.
Using 2016 data, errors rounded off; note that there are two incompatible definitions of the b quark
mass, hence the large range
The third reason why $U(6)_L \times U(6)_R$ is broken (even if all quark masses were zero) is
the coupling of the quarks to electromagnetism. This coupling adds the following terms
to the Lagrangian
$$\mathcal{L}_4 = i\bar\psi^i\gamma^\mu(-ieqA_\mu)\psi^i , \tag{4.6}$$
where $q = \frac{2}{3}$ for the quarks $u$, $c$ and $t$ and $q = -\frac{1}{3}$ for $d$, $s$ and $b$. QED is not flavor blind,
but does not mix flavors, so that the flavor quantum numbers are still conserved.
Also the leptons $e$, $\mu$ and $\tau$, coupling with $q = -1$, need to be added now. The lepton
part of the QED Lagrangian has a global $U(3)_L \times U(3)_R$ symmetry, which is broken down
to $(U(1)^3)_V$ if we also add mass terms for the leptons. Their masses are known much
more precisely than those of the quarks:
$m_e = 0.510998928 \pm 0.000000011$ MeV
$m_\mu = 105.658357 \pm 0.000002$ MeV
$m_\tau = 1776.82 \pm 0.16$ MeV
The generators of the three $U(1)$'s that survive after masses are added are the three
separate lepton numbers. The remaining fermions that we know are three species of
neutrinos, but since they couple neither to QCD nor QED they will make their appearance
later.
This clearly has an exact $SU(2)_V$ symmetry that acts on the label $x$, which distinguishes
protons and neutrons. The $SU(2)_A$ axial symmetry, however, is not realized in the
spectrum. This symmetry rotates the left- and right-handed components of the baryon spinor
$\psi_x$ in opposite ways (if the left component is transformed with a $2 \times 2$ matrix $U$, the right
one transforms with $U^\dagger$), and this is not a symmetry of the Lagrangian because the mass
term couples the left and the right component. It is essential to know that $M \neq 0$, even if
$m_u$ and $m_d$ vanish, and this fact is known for example from the aforementioned theoretical
extrapolations.
Here we see an example of a symmetry of the Lagrangian that is not realized in
the spectrum. This phenomenon is known as spontaneous symmetry breaking. In
general, there are two requirements for some infinitesimal operator $T$ to be a symmetry
of a physical system: $T$ must commute with the Hamiltonian, $[H, T] = 0$, and $T$ must
annihilate the vacuum, $T|0\rangle = 0$ (if the infinitesimal symmetry operator annihilates the
vacuum, its global form will leave the vacuum invariant).
The fact that SU (2)A is not realized in the baryon spectrum is understood as a result
of a spontaneous symmetry breaking, which is dynamically generated by QCD. In other
words, the QCD vacuum is not invariant under the SU (2)A transformations. There is a
famous theorem, known as Goldstone's theorem, that applies to such a situation. The
theorem states that such a symmetry breaking results in massless scalars in the spectrum,
transforming like the derivative of the current of the broken symmetry. These particles
are the pions, which indeed are quite light in comparison to other hadrons, but which are
not completely massless. The reason is that because of the mass terms in the Lagrangian
(sometimes called the current quark masses) the SU (2)A symmetry was not exact to
begin with, and hence the Goldstone bosons are only approximately massless. Often such
particles are called pseudo-Goldstone bosons.
This is the fourth reason why the chiral symmetries are broken. In the process, QCD
gives the quarks an effective mass of the order of one-third of the proton or mass,
which is called the constituent mass since it can be viewed as the mass of quarks as
constituents of hadrons. The current mass is the relevant one in hard scattering, where
soft QCD effects can be ignored.
It would be natural to expect a fourth Goldstone particle because the axial symmetry
that is spontaneously broken is U (2) and not just SU (2). We will discuss later what
happens to the extra U (1)A symmetry. If N flavors are present the mechanism extends
straightforwardly from $SU(2)$ to $SU(N)$. In reality the masses of the other quarks cannot
be neglected, however, and hence this description becomes less useful.
Although this intuitive picture is appealing and leads to qualitatively and quantita-
tively satisfactory results, it has not been derived rigorously from QCD. However it is
supported by lattice calculations.
$$\theta\,\frac{g_3^2}{32\pi^2}\sum_{I=1}^{8} G^I_{\mu\nu}\tilde G^{\mu\nu,I} = \theta\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\, G_{\mu\nu}\tilde G^{\mu\nu} , \tag{4.8}$$
where $\tilde G_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}G^{\rho\sigma}$ and $G_{\mu\nu} \equiv G^I_{\mu\nu}T^I$. Here we used the relation
$$\mathrm{Tr}\, T^I T^J = \tfrac{1}{2}\delta^{IJ}$$
which also defines the normalization of the $SU(3)$ generators $T^I$. This is the standard
normalization, c.f. Eq. (2.42). This term is of the same order in fields and derivatives as
the gauge kinetic terms. Hence it has mass dimension 4. We will see later that terms of
higher order than 4 can be consistently dropped from the Lagrangian, because they have
a coupling constant with dimension $[\mathrm{mass}]^{-1}$. By assuming that the corresponding mass
scale is as large as we want, we can always make such terms arbitrarily small in a natural
way.
But since the $G\tilde G$ term has mass dimension 4, the parameter $\theta$ is dimensionless. It
turns out that $\theta$ is like an angle: all physics is periodic in $\theta$. The factor $g_3^2$ and the
normalization are chosen in such a way that the periodicity of $\theta$ is $2\pi$.
The term (4.8) explicitly violates parity P, but respects charge conjugation C, and
hence it also violates CP. To see why it violates parity note that the $\epsilon$-tensor transforms
under Lorentz transformations as
$$\epsilon_{\mu\nu\rho\sigma} \to \Lambda_\mu{}^{\mu'}\Lambda_\nu{}^{\nu'}\Lambda_\rho{}^{\rho'}\Lambda_\sigma{}^{\sigma'}\epsilon_{\mu'\nu'\rho'\sigma'} = \det(\Lambda)\,\epsilon_{\mu\nu\rho\sigma} .$$
This implies in particular that (4.8) is indeed Lorentz invariant. But the determinant is
negative for space inversion $\vec x \to -\vec x$ and also under time reversal. This is consistent
with CPT-invariance: if CP is violated, then T must be violated as well.
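$$\tfrac{1}{4}\,\mathrm{Tr}\, G_{\mu\nu}\tilde G^{\mu\nu} = \partial_\mu K^\mu , \tag{4.9}$$
where
$$K^\mu = \epsilon^{\mu\nu\rho\sigma}\,\mathrm{Tr}\left[A_\nu\partial_\rho A_\sigma + \tfrac{2}{3}g_3 A_\nu A_\rho A_\sigma\right] \tag{4.10}$$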
Normally one would drop such total derivative terms from the Lagrangian. However
one has to be careful with boundary terms. It turns out that in non-abelian gauge theories
there exist field configurations with finite (Euclidean) action for which the boundary
integral on $S^3$ at infinite radius does not vanish. These are called instantons. They are
characterized by an integral over all of Euclidean space that is always an integer
$$N_E = \frac{g_3^2}{16\pi^2}\int_E d^4x\,\mathrm{Tr}\, G_{\mu\nu}\tilde G^{\mu\nu} = n \in \mathbb{Z}$$
Note that this looks very much like the action (4.8), but the latter is of course defined in
Minkowski space. The fact that this Euclidean integral is quantized is the reason that
$\theta$ is periodic. This can intuitively be understood as follows (the following discussion
assumes a basic understanding of path integrals in quantum field theory). When going to
Euclidean space the integration measure $\int_M d^4x$ is changed to $-i\int_E d^4x$, where $M$ and $E$
denote Minkowski and Euclidean respectively. This turns the integrand into a negative
exponential, dominated by the classical paths, and with exponential suppression for paths
that deviate from them. But the $\theta$ term behaves a bit differently. Within the action, all
contractions involving a $g_{\mu\nu}$ are changed to Euclidean contractions involving $\delta_{\mu\nu}$. But
terms involving a Levi-Civita tensor change by a factor $i$, because there is just one
time component in every non-vanishing tensor. Hence schematically we get the following
integrand for the path integral
Since $N_E$ is always an integer, we see that the Euclidean path integral is indeed periodic
in $\theta$. The underlying physics requires a lot more discussion, but that is beyond the scope
of these lecture notes.
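The periodicity argument can be made concrete with a few lines of Python. The sketch below is an illustration that assumes, as the discussion above suggests, that $\theta$ enters the Euclidean integrand only through a phase factor $\exp(i\theta N_E)$; the function name `theta_phase` is introduced here. The phase is unchanged under $\theta \to \theta + 2\pi$ precisely when $N_E$ is an integer.

```python
import cmath
import math


def theta_phase(theta, winding_number):
    """Phase factor exp(i * theta * N_E) attached to a sector with given N_E."""
    return cmath.exp(1j * theta * winding_number)


def shift_defect(theta, winding_number):
    """|phase(theta + 2 pi) - phase(theta)|; vanishes for integer N_E."""
    shifted = theta_phase(theta + 2 * math.pi, winding_number)
    return abs(shifted - theta_phase(theta, winding_number))
```

For integer values of `winding_number` the defect is zero up to rounding, whereas a hypothetical half-integer value would give a defect of 2, so the quantization of $N_E$ is what makes $\theta$ an angle.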
Theories with values of $\theta$ that differ by multiples of $2\pi$ are related by gauge
transformations. These are not the local gauge transformations shown in eqn. (2.33), where the
local parameters $\Lambda^a$ are implicitly assumed to fall off rapidly towards infinity, but they
are gauge transformations that do contribute to the boundary terms (4.10). Because such
configurations exist and contribute to the path integral one cannot simply drop the total
derivatives in the action. In QED there are no such configurations, and the CP-violating
term may indeed be dropped; it has no observable consequences.
Observable Consequences. All terms in the Lagrangian that we have seen before
give rise to a vertex that can be used in perturbation theory. So it would be natural to
construct the vertex corresponding to (4.8). But one finds that there is no such vertex.
This is due to the fact that (4.8) is a total derivative. So we will never see the effect of
(4.8) in any Feynman diagram. But QCD is more than just Feynman diagrams. There
are contributions to physical processes that cannot be obtained by means of Feynman
diagrams. These are called non-perturbative effects.
In QCD the term (4.8) does indeed have observable consequences, as it contributes to
the electric dipole moment of the neutron, a CP-violating quantity. To see how electric
and magnetic moments transform under P and CP it is most convenient to assume CPT,
and consider T-invariance instead of CP invariance. This is because C changes quarks
into anti-quarks, and hence it is a bit cumbersome to derive CP transformations of the
neutron using C and P directly. Under parity, electric dipole moments flip. Magnetic
dipole moments transform as $\vec{r} \times \vec{p}$, i.e. like angular momenta, and hence they are invariant
under parity. Under time reversal, magnetic moments flip (because magnetic moments
are due to, or behave like, rotating charges, and under time reversal all rotations reverse
directions), whereas electric dipole moments remain unchanged. Hence under either P or
T the electric and magnetic dipole moment flip with respect to each other, and hence if
a particle has both moments, the final state of the transformation must be different from
its initial state. Since the neutron has a magnetic dipole moment, if it also had an electric
one, this would violate both P and CP.
The electric dipole moment of the neutron, $d_n$, has not been observed. The current
experimental limit is $|d_n| < 2.9 \times 10^{-26}\ e\,\mathrm{cm}$, which puts a bound on $\theta$: $\theta < 4 \times 10^{-10}$. The
electric dipole moment of the neutron is approximately given by
$$\theta\,\frac{e}{m_n}\,\frac{m_u m_d}{m_u + m_d}\,\frac{1}{\Lambda_{\rm QCD}} \qquad (4.11)$$
where $\Lambda_{\rm QCD}$ is the QCD scale, and $m_n$ the neutron mass. We have seen above that $\theta$
is like an angle, and hence its full parameter space is the interval $[0, 2\pi)$. It could have
taken any value in this interval, but nature has chosen it to be remarkably close to 0. To
appreciate the point, define a new parameter $x = \theta/2$, and suppose the value of $x$ were
experimentally determined to be $3.1415926536 \pm 10^{-10}$. Wouldn't you think that this is
remarkably close to $\pi$, and that this cannot be a coincidence? But $x = \pi$ is physically
equivalent to $x = 0$, and hence this is essentially what we observe.
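As a rough cross-check of the estimate (4.11), the sketch below converts the formula to units of $e\,$cm and reads off the implied bound on $\theta$. The quark masses, the value of $\Lambda_{\rm QCD}$ and the function name are illustrative assumptions, not values taken from these notes; only the experimental limit is the one quoted above.

```python
# Order-of-magnitude evaluation of Eq. (4.11). All inputs except the
# experimental limit are illustrative assumptions.
HBAR_C_MEV_CM = 1.97327e-11   # hbar*c in MeV*cm, converts 1/MeV to cm


def edm_per_unit_theta(m_u=2.2, m_d=4.7, m_n=939.6, lambda_qcd=250.0):
    """Return d_n / theta in units of e*cm (all masses and scales in MeV)."""
    reduced = m_u * m_d / (m_u + m_d)            # MeV
    inverse_mev = reduced / (m_n * lambda_qcd)   # 1/MeV
    return inverse_mev * HBAR_C_MEV_CM           # e*cm


d_per_theta = edm_per_unit_theta()
limit = 2.9e-26                                  # e*cm, limit quoted in the text
theta_max = limit / d_per_theta

print(f"d_n / theta ~ {d_per_theta:.2e} e cm")
print(f"theta       < {theta_max:.1e}")
```

With these inputs the bound comes out at a few times $10^{-10}$, the same order of magnitude as the value quoted above; the precise number depends on the assumed masses and scale.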
The Strong CP Problem. The fact that the angle $\theta$ is so close to zero seems to
demand an explanation. This problem is called the strong CP problem. A first idea could
be to simply declare that CP is a symmetry of the strong and electromagnetic interactions.
Indeed, since all other terms in the $SU(3) \times U(1)$ Lagrangian respect CP the term (4.8)
cannot be generated if it is set to zero. This is an important lesson, which will come back
several times in these lectures: terms can be consistently removed from a Lagrangian if
their removal leads to an enhanced symmetry. In this situation one says that the absence
of such a term is natural.
But note that the absence of P and CP violation is a property of the strong and
electromagnetic interactions, but not a general property of nature, since the weak interactions
do not respect these symmetries. Hence after switching on the weak interactions we do
have to worry about this term.
Indeed, in the presence of CP-violating Yukawa couplings the discussion is rather
different. It turns out that phase rotations of the quark masses, in order to make them
real, end up changing $\theta$. The experimental limit is in fact not on $\theta$ but on a parameter
$\bar\theta$, which is the difference between $\theta$ and an overall phase in the quark mass matrix. Only
this difference is observable. This will be discussed in more detail in section 5.6.
Even if one somehow manages to make $\bar\theta$ exactly zero in the Lagrangian, this still
does not mean that $d_n = 0$. Weak interactions still make contributions to $d_n$ of order
$10^{-31}\ e\,\mathrm{cm}$. That is about five orders of magnitude smaller than our current limits, but
it is essentially inevitable that such an effect exists in the Standard Model. New physics,
such as low energy supersymmetry, can make contributions as large as $10^{-25}\ e\,\mathrm{cm}$, and
hence current experiments are already constraining these options.
gauge bosons as $A^a_\mu$, $a = 1, \ldots, 3$, and $B_\mu$. The action for the gauge fields is the canonical
one, Eq. (2.40). The $SU(3)$ gauge group of QCD is not involved in the weak interactions.
$\psi_L^Q$, $\psi_R^U$, $\psi_R^D$, $\psi_L^L$, $\psi_R^E$ and $\psi_R^N$
respectively. Here $Q$ stands for quark, $L$ for lepton, $U$ for charge $\frac{2}{3}$ quarks, $D$ for charge
$-\frac{1}{3}$ quarks, $E$ for leptons of charge $-1$, and $N$ for neutrinos. Here we are using a bit of
foresight regarding the final interpretation of these representations. These fields couple
to the gauge fields as indicated by their representations, and we will need three copies of
each to get the three fermion families observed experimentally.
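The precise form of this coupling is as follows
$$\mathcal{L}_Q = i\sum_{\alpha=1}^{3}\bar\psi_L^{Q,\alpha}\,\gamma^\mu\left(\partial_\mu - ig_3A^I_\mu T^I - ig_2A^a_\mu T^a - ig_1B_\mu Y\right)\psi_L^{Q,\alpha} , \qquad (4.12)$$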
$$\mathcal{L} = (D_\mu\phi)^\dagger(D^\mu\phi) - \mu^2\phi^\dagger\phi - \tfrac{1}{4}\lambda(\phi^\dagger\phi)^2 , \qquad (4.13)$$
where $D_\mu = \partial_\mu - \frac{1}{2}ig_1B_\mu - ig_2(\frac{1}{2}\sigma^a)A^a_\mu$ is the covariant derivative. This scalar field
is called the Higgs field. Suppose that for some unknown reason the scalar mass $\mu^2$
is negative. This may seem strange, but at this point $\mu^2$ is just a parameter in the
Lagrangian. By writing it as a square we were incorrectly suggesting that it must be
positive, but actually $\mu^2$ may have any real value. One may ask whether the sign
ultimately requires further explanation, but that explanation is in any case beyond the
Standard Model, and we will not worry about that in this section. If $\mu^2 < 0$, the true
minimum of the potential is not $\phi = 0$, but some non-trivial value, which by $SU(2)$
rotations we can bring to the form
$$\langle\phi\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v\end{pmatrix} , \qquad (4.14)$$
and which we can make real by $U(1)$ transformations (the normalization is a convention).
The minimum of the potential is at $v = 2\sqrt{-\mu^2/\lambda}$.
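The location of the minimum can be checked numerically. The sketch below inserts the form (4.14) into the potential $\mu^2|\phi|^2 + \frac{1}{4}\lambda|\phi|^4$ of (4.13), so that $|\phi|^2 = v^2/2$, and compares a crude scan over $v$ with the closed formula. The numerical values of $\mu^2$ and $\lambda$ and the helper name are arbitrary choices made for illustration.

```python
import math


def potential(v, mu2, lam):
    """V = mu2*|phi|^2 + (lam/4)*|phi|^4 with |phi|^2 = v^2/2, cf. (4.14)."""
    phi2 = 0.5 * v * v
    return mu2 * phi2 + 0.25 * lam * phi2 * phi2


mu2, lam = -1.0, 0.5            # arbitrary illustrative values with mu2 < 0
v_min = 2.0 * math.sqrt(-mu2 / lam)

# Crude scan over v >= 0 to locate the minimum independently of the formula.
grid = [i * 1e-3 for i in range(0, 6001)]
v_scan = min(grid, key=lambda v: potential(v, mu2, lam))

print("formula:", v_min, " scan:", v_scan)
```

The scan agrees with the formula to within the grid spacing, and the potential at the minimum lies below its value at $v = 0$, as expected for $\mu^2 < 0$.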
The matrix $M^2$ is
$$M^2 = \frac{1}{4}v^2\begin{pmatrix} g_2^2 & 0 & 0 & 0\\ 0 & g_2^2 & 0 & 0\\ 0 & 0 & g_2^2 & -g_1g_2\\ 0 & 0 & -g_1g_2 & g_1^2\end{pmatrix} . \qquad (4.16)$$
(The minus sign of the off-diagonal terms is due to the fact that $\sigma^3$ acts on $\langle\phi\rangle$ via its
lower component.)
The mass matrix has off-diagonal terms, which means that the original vector bosons
$A^3_\mu$ and $B_\mu$ mix. To find the mass eigenstates we must diagonalize the matrix $M^2$. The
correct form of the bi-linear terms in the Lagrangian for a real vector field $X_\mu$ is
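After diagonalization we find that three of the four gauge bosons have acquired a mass,
namely
$$W^\pm_\mu = \frac{1}{\sqrt{2}}(A^1_\mu \pm iA^2_\mu) \qquad \text{mass } \tfrac{1}{2}g_2v$$
$$Z_\mu = \frac{1}{\sqrt{g_1^2+g_2^2}}(g_2A^3_\mu - g_1B_\mu) \qquad \text{mass } \tfrac{1}{2}\sqrt{g_1^2+g_2^2}\,v$$
$$A_\mu = \frac{1}{\sqrt{g_1^2+g_2^2}}(g_1A^3_\mu + g_2B_\mu) \qquad \text{mass } 0 \text{ (photon)}$$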
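We may now express the coupling of the fermions to the gauge fields $A^a_\mu$ and $B_\mu$ in
terms of the new fields $W^\pm_\mu$, $Z_\mu$ and $A_\mu$ (the coupling to the gluon is of course not affected).
For the field $\psi_L^Q$ this results in
$$\mathcal{L}_Q = i\sum_{\alpha=1}^{3}\bar\psi_L^{Q,\alpha}\,\gamma^\mu\Big(\partial_\mu - ig_3A^I_\mu T^I - \frac{1}{\sqrt{2}}ig_2(W^+_\mu T^- + W^-_\mu T^+) - i\frac{g_1g_2}{\sqrt{g_1^2+g_2^2}}A_\mu(T^3+Y) - i\frac{g_2^2T^3 - g_1^2Y}{\sqrt{g_1^2+g_2^2}}Z_\mu\Big)\psi_L^{Q,\alpha} ,$$
where $T^\pm = T^1 \pm iT^2$. The expressions for the other fermion fields are analogous.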
4.2.4 Electromagnetism
The photon is found to couple to the fermions through the operator
$$Q_{em} = T_3 + Y , \qquad (4.19)$$
where $Y$ denotes the $U(1)$ generator before symmetry breaking. The reason the photon
remains massless is that this generator annihilates the new vacuum:
$$Q_{em}\langle\phi\rangle = (T_3 + Y)\,\frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v\end{pmatrix} = 0 , \qquad (4.20)$$
and hence $Q_{em}$ generates an exact local symmetry of the theory. The electromagnetic coupling
constant is found to be
$$e \equiv \frac{g_1g_2}{\sqrt{g_1^2+g_2^2}} \qquad (4.21)$$
$(1, 1, 0)_R \to (1, 0)_R$,
and of course we get three copies of each field. We denote these fields as $\psi_L^U$, $\psi_L^D$, $\psi_L^E$
and $\psi_L^N$ and similarly for the right-handed components. Just as before for the
$SU(3) \times SU(2) \times U(1)$ representations, $U$ stands for the three quarks $u, c, t$ with charge $\frac{2}{3}$, $D$ for
the quarks $d, s, b$ with charge $-\frac{1}{3}$, $E$ for the leptons $e, \mu, \tau$ with charge $-1$ and $N$ for the
three neutrinos.
Until 1998 most data were consistent with massless, purely left-handed neutrinos. In
the zero mass limit the right-handed neutrino decouples completely from all other fields
in the Standard Model, and couples only to gravity. For this reason the existence of the
right-handed neutrino components has been a matter of speculation. Nowadays we know
that there must be mass differences between different neutrino species, and hence they
cannot all be massless. We still do not know for sure if right-handed neutrinos exist, but
it is the simplest possibility to explain the observed neutrino oscillations.
4.2.6 Parameters
An often used parameter is $\tan\theta_w = g_1/g_2$. The electromagnetic coupling constant $e$ is
then related to $g_1$ and $g_2$ as $e = g_2\sin\theta_w = g_1\cos\theta_w$. Experimental data are usually
quoted in terms of $\sin^2\theta_w$. The measured value is $0.23117 \pm 0.00016$. The measured $W$ and
$Z$ masses are $80.385 \pm 0.015$ GeV and $91.1876 \pm 0.0021$ GeV. Using this experimental information we
can compute the value of $v$, the Higgs v.e.v.:
$$v = M_W\sqrt{\frac{\sin^2\theta_w}{\pi\alpha(M_W)}} \approx 246\ \mathrm{GeV} \qquad (4.22)$$
Here $\alpha$ is the QED fine structure constant, but one should not use the low energy value
$\frac{1}{137}$, but the value at the mass of the $W$ (or $Z$) boson, which is about $\frac{1}{128}$ (more about
running coupling constants follows later).
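Formula (4.22) is easy to evaluate. The sketch below uses the measured values quoted above and $\alpha(M_W) \approx 1/128$, and also extracts the couplings from $e = g_2\sin\theta_w = g_1\cos\theta_w$. It assumes the normalization $\alpha = e^2/4\pi$, which is not stated explicitly in this section; the variable names are ours.

```python
import math

m_w = 80.385            # GeV, W mass quoted in the text
sin2_theta_w = 0.23117  # measured sin^2(theta_w) quoted in the text
alpha_mw = 1.0 / 128    # approximate fine structure constant at the W mass

# Eq. (4.22): v = M_W * sqrt(sin^2(theta_w) / (pi * alpha))
v = m_w * math.sqrt(sin2_theta_w / (math.pi * alpha_mw))

# Couplings implied by e = g2 sin(theta_w) = g1 cos(theta_w),
# assuming alpha = e^2 / (4 pi).
e = math.sqrt(4.0 * math.pi * alpha_mw)
g2 = e / math.sqrt(sin2_theta_w)
g1 = e / math.sqrt(1.0 - sin2_theta_w)

print(f"v  = {v:.1f} GeV")
print(f"g1 = {g1:.3f}, g2 = {g2:.3f}")
```

By construction the result satisfies $M_W = \frac{1}{2}g_2v$, so this is a consistency check of (4.22) against the $W$ mass formula rather than an independent measurement of $v$.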
$$\phi(x) = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ v + \eta(x)\end{pmatrix} \qquad (4.23)$$
The precise definition of "left-handed" is that the spin is oriented opposite to the direction of motion.
This definition is convention-independent; however in the literature the corresponding projections are
either $\frac{1}{2}(1 + \gamma_5)$ (our convention) or $\frac{1}{2}(1 - \gamma_5)$, and the definitions of $\gamma_5$ and the $\epsilon$ tensor may also
differ by signs. If the neutrino is exactly massless this relative orientation is Lorentz-invariant.
The purpose of the factor $\frac{1}{\sqrt{2}}$ is to make sure that the field $\eta$ has the correct kinetic terms
for a real scalar, namely $\frac{1}{2}\partial_\mu\eta\,\partial^\mu\eta$. The complex field $\phi$ has no factor $\frac{1}{2}$ in its kinetic terms
(see (4.13)). This is the last particle of the Standard Model that has been discovered,
the famous Higgs boson. It was searched for during several decades, and for a long time
it was also the only one that was missing. With its discovery the Standard Model is
complete. This does not mean that it is correct and that it is certain to survive future
experiments, but only that the list of definite particles still searched for is now empty.
The Higgs boson was discovered using the ATLAS and CMS detectors and the LHC
accelerator at CERN, and officially announced on 4 July 2012. It has a mass of about
125 GeV. In 2013, P. Higgs and F. Englert received the Nobel prize for their work from
1964 that first described the mechanism we now call the Higgs mechanism (the paper
of F. Englert was co-authored with R. Brout, who passed away before the particle was
discovered; other people who played an important role in the theoretical development of
the Higgs mechanism are Anderson, Kibble, Guralnik and Hagen.)
Note that the name "Higgs" is overused. We already introduced a Higgs field $\phi$, a
two component complex scalar. Now we found a real scalar $\eta$, which is called the Higgs
boson. It is closely related to $\phi$, but not the same. In other contexts, we will find other
scalar fields that acquire a vacuum expectation value, and which are also called Higgs
fields.
The third and fourth order terms in the kinetic action of the non-abelian gauge bosons
give rise to interactions. For example one gets a coupling of the vector fields $W^\pm_\mu$ to the
photon, confirming that the charge of these fields is indeed what is suggested by the upper
index. There are many other terms giving rise to couplings among the $W$, $Z$ and $\gamma$ fields
which we will not all present here.
where $\alpha$ and $\beta$ are family labels, and $g_U$, $g_D$ and $g_E$ are complex coupling matrices.
Here "c.c" stands for complex conjugate. If right-handed neutrinos exist, there may
be an additional term involving the neutrino fields. It contains the combination of fields
$\bar\psi_L^{L,\alpha}[C\phi^*]\psi_R^{N,\beta}$, and puts lepton and quark couplings more or less on equal footing. This is
appealing, but it is not clear whether it is also true, and in addition there are other terms
one can write down if one introduces right-handed neutrinos. Therefore we postpone the
discussion of neutrino masses to the next chapter.
Note that the total charge of each term must be zero. This obliges us to use $\phi$ in
the down-quark and charged-lepton terms and $\phi^*$ in the up-quark term. We also have to make sure that all terms are
$SU(2)$ singlets. This is easy for terms of the form $\bar\psi_L\phi$, which are singlets automatically
if we contract their $SU(2)$ doublet indices in the obvious way: $\bar\psi_L^i\phi_i$. This is because
$\bar\psi$ transforms as the complex conjugate of $\psi$, so it transforms according to the complex
conjugate representation. It is in general true that if $A_i$ belongs to a representation $R$
and $B^i$ to $R^*$, then $A_iB^i$ is an invariant. However, in $\mathcal{L}_Y$, $\bar\psi_L$ couples not only to the field
$\phi$ but also to $\phi^*$. The reason we can do this is that the doublet representation of $SU(2)$
is pseudo-real. In general this means that the representation matrices are not real (and
cannot be chosen real), but satisfy
$$(T^a)^* = -CT^aC^{-1} , \qquad (4.25)$$
for some orthogonal matrix $C$. It is easy to check that the $SU(2)$ doublet representation,
with representation matrices proportional to the Pauli matrices, satisfies this relation with
$C = i\sigma_2$, or $C_{ij} = \epsilon_{ij}$. In other words, the two $SU(N)$ representations $(N)$ and $(N^*)$
are not distinct in the special case $N = 2$; they are equivalent. Writing out the $SU(2)$
doublet indices that were suppressed above, the two couplings thus read $\bar\psi_L^i(\phi_j)\delta_{ij}$ and
$\bar\psi_L^i(\phi_j)^*\epsilon_{ij}$. [The matrix $C$ should not be confused with the charge conjugation matrix $C$
that acts on spinors.] The vacuum expectation value of $C\phi^*$ is
$$\langle C\phi^*\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix}v\\ 0\end{pmatrix} \qquad (4.26)$$
which is precisely what is needed to give a mass to the upper component of the doublet.
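The pseudo-reality relation and the vacuum value of $C\phi^*$ can both be verified with explicit $2 \times 2$ matrices. The sketch below takes $T^a = \sigma^a/2$ and $C = i\sigma_2$, checks $(T^a)^* = -CT^aC^{-1}$ for all three generators, and then applies $C$ to the complex conjugate of $\langle\phi\rangle = (0, v)/\sqrt{2}$. The helper names and the numerical value of $v$ are illustrative assumptions.

```python
# Pauli matrices as nested lists of (complex) numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
C = [[0, 1], [-1, 0]]        # C = i*sigma_2, i.e. C_ij = epsilon_ij
C_INV = [[0, -1], [1, 0]]    # inverse (equal to the transpose) of C


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def conj(a):
    """Element-wise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in a]


def close(a, b, tol=1e-12):
    """True if two 2x2 matrices agree entry by entry."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


# Check (T^a)^* = -C T^a C^{-1} for T^a = sigma^a / 2.
checks = []
for sigma in SIGMA:
    t = [[0.5 * x for x in row] for row in sigma]
    rhs = [[-x for x in row] for row in matmul(matmul(C, t), C_INV)]
    checks.append(close(conj(t), rhs))

# Vacuum value of C phi^* for <phi> = (0, v)/sqrt(2).
v = 246.0
phi = [0.0, v / 2 ** 0.5]
c_phi_star = [sum(C[i][j] * complex(phi[j]).conjugate() for j in range(2))
              for i in range(2)]
print(checks, c_phi_star)
```

Only the upper entry of $C\phi^*$ is non-zero, in agreement with (4.26).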
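$$\psi_{ch}^{\alpha,q} = U_{ch,q}^{\alpha\beta}\,\psi_{ch}^{\prime\,\beta,q} , \qquad (4.27)$$
where $ch = L, R$ denotes the chirality, and $q$ the particle type or charge, i.e. $q$ is either
$U$, $D$ or $E$. If we write the fermion bi-linears in terms of the new fields the mass matrices
transform to
$$m_D = \frac{1}{\sqrt{2}}\,U^\dagger_{L,D}\,g_D\,U_{R,D}\,v$$
$$m_U = \frac{1}{\sqrt{2}}\,U^\dagger_{L,U}\,g_U\,U_{R,U}\,v$$
$$m_E = \frac{1}{\sqrt{2}}\,U^\dagger_{L,E}\,g_E\,U_{R,E}\,v \qquad (4.28)$$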
In general the matrices g are complex and neither symmetric nor hermitean, but since
we can rotate both their indices independently, we can make sure that the mass matrices
mU , mD and mE are diagonal.
To see that this is possible note that any complex matrix $X$ can always be brought to
the form $X = UH$, where $U$ is unitary and $H$ is Hermitean. To bring $X$ to diagonal form
we can multiply it from the left and right with distinct unitary matrices. Hence we can
multiply $X$ from the left with $U^\dagger$, so that a Hermitean matrix $H$ remains. Now we can
multiply $H$ from the left with a suitable unitary matrix $S^\dagger$ and from the right with $S$, such that
$S$ diagonalizes $H$ in the standard way. The eigenvalues of $H$ are real, but not necessarily
positive, but we can multiply from the left (or right) with a diagonal matrix of signs to
make all eigenvalues positive.
Note that the matrices $U$ are not completely determined by the requirement that the
matrices $m$ be diagonal. We may multiply each relation in (4.28) from the right with a
diagonal unitary matrix $\mathrm{diag}(e^{i\gamma_1}, e^{i\gamma_2}, e^{i\gamma_3})$ and from the left with $\mathrm{diag}(e^{-i\gamma_1}, e^{-i\gamma_2}, e^{-i\gamma_3})$
without changing the masses in any way. Thus for each pair $U_L$, $U_R$ there are three
undetermined phases.
the case of the Standard Model the only symmetries are the gauge symmetries
$SU(3) \times SU(2) \times U(1)$. In general, it is the model builder who decides which symmetries a model
should have.
The most general Lagrangian respecting all symmetries may have a large number of
parameters, but often only a subset of those parameters can be measured. The remain-
ing parameters can be absorbed by redefining fields. The set of fields in a QFT can be
redefined by taking arbitrary (non-degenerate) linear combinations. To remove the arbi-
trariness as much as possible, one first brings the kinetic terms to their canonical form;
then one does the same with the mass terms (which can always be diagonalized) and if
any field redundancy is left one can use it to bring some interaction terms to a standard
form.
A general gauge invariant expression for the kinetic terms of a set of fermions $\psi_L^\alpha$ is
$$\mathcal{L} = i\sum_{\alpha\beta}H_{\alpha\beta}\,\bar\psi_L^\alpha\gamma^\mu D_\mu\psi_L^\beta \qquad (4.31)$$
and similarly for the right-handed fields. The sum is over the family labels. The gauge
symmetry allows an arbitrary matrix $H$ here (which can be different for each species
$U, D, \ldots$). Hermiticity of the Hamiltonian (reality of the energy) requires $H$ to be
Hermitean; positivity of energy requires it to be positive definite. Then $H$ can be written as
$H = A^\dagger A$, for some complex matrix $A$. If we now define new fields $\psi' = A\psi$ using matrix
multiplication in family space we get the kinetic terms in their canonical form
$$\mathcal{L} = i\sum_{\alpha}\bar\psi_L^{\prime\,\alpha}\gamma^\mu D_\mu\psi_L^{\prime\,\alpha} \qquad (4.32)$$
This is usually the starting point, but note that in principle there is a large set of free
parameters $H$, which can be transformed away. Of course we replace $A\psi$ by $\psi'$ in all other
terms in the Lagrangian, but in the Yukawa couplings this just redefines the matrices $g$.
Since the covariant derivatives behave as ordinary derivatives, after bringing the kinetic
terms in canonical form the gauge interactions are still diagonal in family space, as in
Eqn. (4.29).
After writing the kinetic terms in canonical form, there is still some redundancy left in
field space: we can apply to all fermions $\psi_L$ a unitary transformation in family space. This
can be done for all species and chiralities separately. We limit ourselves now to the quark
sector, since we are considering the CKM matrix. Before weak symmetry breaking, we
have at our disposal the unitary $3 \times 3$ matrices $U_{L,Q}$, $U_{R,U}$ and $U_{R,D}$. After weak symmetry
breaking we can transform the two components of the $SU(2)$ multiplet $Q$ separately, since
we do not have to respect the broken $SU(2)$ anymore. So then we have four matrices $U_{L,U}$,
$U_{L,D}$, $U_{R,U}$ and $U_{R,D}$.
We use this freedom first of all to bring the mass matrices produced by the Higgs
v.e.v. in diagonal form. Since mass matrices have the structure $\bar\psi_L\psi_R$ they are sensitive
to unitary rotations of the left-handed components with respect to the right-handed ones.
Note that we have enough freedom to make all masses positive. Since bringing the mass
matrices to diagonal form usually requires non-trivial transformations $U_{L,U}$ and $U_{L,D}$, this
implies that the $W$ vertex becomes off-diagonal in family space: we get a non-trivial CKM
matrix.
Once the masses are in their canonical form, we can determine how much freedom we
still have left. We observed above that $U_{L,U}$ and $U_{L,D}$ are not completely determined by
the mass diagonalization. We can multiply them each with a diagonal unitary matrix,
provided that we compensate for this in the corresponding $U_R$. But these are unobservable,
so that we are allowed to change $U_{CKM}$ by multiplying it from the left as well as the right
by two independent diagonal unitary matrices. In other words, the mass terms as well
as the kinetic terms are invariant if we change $\psi_L$ and $\psi_R$ by the same diagonal phase
matrix, and we can do that for the species $U$ and $D$ separately. If the mass eigenvalues are
all different, these are the only transformations that leave the mass matrices invariant.
Let us now count the number of parameters in $U_{CKM}$ for $N$ families. A unitary matrix
can be written as $e^{iT}$, where $T$ is a Hermitean matrix. There are $N^2$ independent
Hermitean $N \times N$ matrices, and hence such matrices are described by $N^2$ real parameters.
The $2N$ undetermined phases can be used to fix $2N$ of these parameters to any desired
value, so that we are left with $N^2 - 2N$ parameters. However, not all $2N$ phases can
be used. If we multiply $U_{CKM}$ from the left with a diagonal matrix $e^{i\gamma}\mathbf{1}$ and from the
right with the inverse of that matrix, they cancel each other. Hence we had only $2N - 1$,
and not $2N$ independent phases at our disposal, and the number of parameters is thus
$N^2 - 2N + 1 = (N - 1)^2$. This gives 0, 1 and 4 for 1, 2 and 3 families respectively.
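The counting above can be tabulated for any number of families. The sketch below reproduces $(N-1)^2$ and additionally splits the total into rotation angles and complex phases; that split uses the standard fact that a real orthogonal $N \times N$ matrix has $N(N-1)/2$ angles, which is an addition for illustration and is not derived in the text. The function name is ours.

```python
def ckm_parameters(n_families):
    """Return (total, angles, phases) for the CKM matrix with n_families.

    total  = N^2 - (2N - 1) = (N - 1)^2, as counted in the text.
    angles = N(N - 1)/2, the number of angles of a real orthogonal matrix.
    phases = total - angles, the remaining complex phases.
    """
    n = n_families
    total = n * n - (2 * n - 1)
    angles = n * (n - 1) // 2
    phases = total - angles
    return total, angles, phases


for n in (1, 2, 3, 4):
    print(n, ckm_parameters(n))
```

For three families this gives three angles and one phase, and for two families a single angle and no phase.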
As an example, consider the CKM matrix for two families. The most general CKM
matrix, a $2 \times 2$ unitary matrix, can be parametrized as
$$\begin{pmatrix} e^{i(\alpha-\gamma)}\cos\theta_c & e^{i(\alpha-\delta)}\sin\theta_c\\ -e^{i(\beta-\gamma)}\sin\theta_c & e^{i(\beta-\delta)}\cos\theta_c\end{pmatrix} , \qquad (4.33)$$
This depends on four and not five parameters, since it only depends on the differences of
$\alpha$, $\beta$, $\gamma$ and $\delta$. By using the freedom to multiply this on the left and the right by diagonal
phase matrices, we can bring it to the form
$$\begin{pmatrix}\cos\theta_c & \sin\theta_c\\ -\sin\theta_c & \cos\theta_c\end{pmatrix} , \qquad (4.34)$$
and hence there is only one physical parameter. The fact that $\alpha$, $\beta$, $\gamma$ and $\delta$ can be
transformed away by field redefinitions means that they can never be determined in any
physical process. Hence there is just one real parameter, not four. The parameter $\theta_c$
is called the Cabibbo angle. More precisely, it was called that when only two families
were known to exist; in fact even the $c$ quark had not been discovered yet when $\theta_c$
was introduced. For three families there are more complicated expressions for the CKM
matrix, usually involving three angles and a phase. The standard parametrization is
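$$U_{CKM} = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta}\\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13}\\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}\end{pmatrix} \qquad (4.35)$$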
where $s_{12} = \sin\theta_{12}$, $c_{12} = \cos\theta_{12}$, etc.
Experimentally the matrix $U_{CKM}$ is nearly equal to $\mathbf{1}$, but there are small off-diagonal
matrix elements; the largest of these is the old Cabibbo angle, $\theta_c \approx 13^\circ$. It determines how
strongly an up-quark couples to an s-quark (compared to its coupling to its own family
member, the down quark). This coupling is small: $U^{us}_{CKM} \approx \sin(\theta_c)$. Nowadays one defines
$\theta_c \equiv \theta_{12}$, and introduces similar angles for mixing between the second and the third
family ($\theta_{23} \approx 2.38^\circ$) and the first and the third ($\theta_{13} \approx 0.2^\circ$). The phase is $\delta = 1.2 \pm 0.08$.
The fact that the matrix is so close to 1 is not understood, although there are models
that produce this, together with the mass hierarchies, starting from some assumptions
about the Yukawa couplings. The structure of the matrix may contain important hints
regarding physics beyond the Standard Model.
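To get a feeling for the numbers, the sketch below builds the three-family CKM matrix in the standard parametrization (three angles and one phase) from the approximate values quoted above and checks numerically that it is unitary. The function names are ours, and the inputs are rounded values, so the resulting matrix elements are only indicative.

```python
import cmath
import math


def ckm(theta12, theta23, theta13, delta):
    """Standard parametrization of the CKM matrix; all arguments in radians."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    ep = cmath.exp(1j * delta)   # e^{i delta}; s13 / ep is s13 e^{-i delta}
    return [
        [c12 * c13, s12 * c13, s13 / ep],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]


def unitarity_defect(u):
    """Largest absolute entry of U U^dagger - 1."""
    n = len(u)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(u[i][k] * complex(u[j][k]).conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


# Approximate values quoted in the text: 13, 2.38 and 0.2 degrees, delta = 1.2.
U = ckm(math.radians(13.0), math.radians(2.38), math.radians(0.2), 1.2)
print("|U_us| =", abs(U[0][1]))
print("unitarity defect =", unitarity_defect(U))
```

The element $|U_{us}|$ comes out very close to $\sin\theta_c$, because $\cos\theta_{13}$ differs from 1 only at the level of $10^{-5}$.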
for a $U(1)$ gauge field this looks like $\partial_\mu - ieqA_\mu$, where $q$ is the charge of $\psi$ (for non-abelian
symmetries our conventions are such that $D_\mu = \partial_\mu - igT^aA^a_\mu$).
Here a definite choice is made among the particle and anti-particle charge. The
$SU(3) \times U(1)$ Lagrangian is always written in such a way that the charge corresponds
to what we call "particles", as opposed to anti-particles. Note that there is no such
asymmetry in the action of a complex scalar.
What we call particles is simply the species that we see most abundantly in our own
environment. We see protons and electrons, and very few anti-protons and positrons. It
is still an unsolved mystery how this asymmetry has arisen (the baryogenesis problem),
but there is no fundamental reason why we should prefer particles over anti-particles.
Of course we know that the same fermion action also describes the anti-particle. Hence
it should be no surprise that it can be rewritten in such a way that the role of particle
and anti-particle are interchanged. To do so we introduce new variables
$$\psi = C^{-1}(\gamma^0)^T(\psi^c)^* = C^\dagger(\bar\psi^c)^T$$
$$\bar\psi = -(\psi^c)^TC , \qquad (5.1)$$
where $C$ is the charge conjugation matrix introduced in appendix C, which is a unitary
matrix satisfying
$$C\gamma^\mu C^{-1} = -(\gamma^\mu)^T . \qquad (5.2)$$
The action of $C$ on $\gamma_5$ is: $C\gamma_5C^{-1} = (\gamma_5)^T$. The precise form of $C$ depends on the explicit
representation of the Dirac $\gamma$-matrices, but the only thing that matters is that such a
matrix $C$ exists in any representation. The relation for $\bar\psi$ is not independent, but follows
from the one for $\psi$. Note that this changes right-handed fields to left-handed ones:
$$\psi_R = P_R\psi = P_RC^{-1}(\gamma^0)^T(\psi^c)^* = C^{-1}(P_R)^T(\gamma^0)^T(\psi^c)^*$$
$$= C^{-1}(\gamma^0)^T(P_L)^T(\psi^c)^* = C^{-1}(\gamma^0)^T(P_L)^*(\psi^c)^* = C^{-1}(\gamma^0)^T(\psi_L^c)^* .$$
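Substituting this into the action $i\bar\psi_R\gamma^\mu D_\mu\psi_R$ yields a new action
$$i\bar\psi_R\gamma^\mu D_\mu\psi_R = -i(\psi_L^c)^TC\gamma^\mu D_\mu C^\dagger(\bar\psi_L^c)^T$$
$$= -i(\psi_L^c)^TC\gamma^\mu D_\mu C^{-1}(\bar\psi_L^c)^T$$
$$= i(\psi_L^c)^T(\gamma^\mu)^TD_\mu(\bar\psi_L^c)^T \qquad (5.3)$$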
This expression is a number, i.e. a $1 \times 1$ matrix. So it is equal to its own transpose, and
we may replace it by its transpose. Since the fermions anti-commute this requires some
care. The identity we are using is
$$\xi^TM\chi = \sum_{i,j}\xi_iM_{ij}\chi_j = -\sum_{i,j}\chi_jM_{ij}\xi_i = -\sum_{i,j}\chi_jM^T_{ji}\xi_i = -\chi^TM^T\xi , \qquad (5.4)$$
where $\xi$ and $\chi$ are mutually anti-commuting spinors. In our case they correspond to $\psi^c$
and $\bar\psi^c$, and the indices $i, j$ represent the complete set of indices $\psi$ has, e.g. spin (Dirac
indices), gauge and flavor indices. For the ordinary derivative term in the covariant
derivative this yields
$$i(\psi_L^c)^T(\gamma^\mu)^T\partial_\mu(\bar\psi_L^c)^T = -i\,(\partial_\mu\bar\psi_L^c)\,\gamma^\mu\psi_L^c \qquad (5.5)$$
We now move $\partial_\mu$ to $\psi_L^c$ by partial integration, i.e. we pretend that the Lagrangian
density is integrated over space-time. This gives a final minus sign. For the gauge boson
coupling part of the covariant derivative we get
where we made use of the fact that $T^a$ is hermitean. The final result is thus
This is the desired result since $-(T^a)^*$ is the generator of the complex conjugate
representation.
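$$\begin{array}{ll}
(3, 2, \frac{1}{6}) & \begin{pmatrix}u_L\\ d_L\end{pmatrix}\\
(3^*, 1, -\frac{2}{3}) & u^c_L\\
(3^*, 1, \frac{1}{3}) & d^c_L\\
(1, 2, -\frac{1}{2}) & \begin{pmatrix}\nu_L\\ e^-_L\end{pmatrix}\\
(1, 1, 1) & e^+_L\\
(1, 1, 0) & \nu^c_L
\end{array} \qquad (5.8)$$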
of $\psi_L$ one may always rotate $M$ to a Hermitean matrix, but before doing that it will come
out as a general complex matrix.
In this mass term, $\psi_L$ and $\psi_R$ are really distinct fields. To emphasize that, we can
give $\psi_L$ a different name, $\chi_L$. Then a typical (off-diagonal) mass term looks like this
$$-\bar\psi_RM\chi_L - \bar\chi_LM^\dagger\psi_R \qquad (5.9)$$
and after replacing $\psi_R$ by its left-handed anti-particle this takes the form
Note that the second term is the Hermitean conjugate of the first. All indices have been
suppressed here, but note that M and C are respectively matrices in family and in spinor
space. Mass terms clearly looked nicer in L-R notation, but we will see that the left-
handed representation has other advantages that make it worthwhile paying this price.
In the rest of these notes we will use both representations, depending on what is most
convenient. We will refer to the representation used in the previous chapter as the particle
representation, since all fermi fields are particles, and their conjugates anti-particles. This
is the most useful basis when masses are present, for example for physics at low energies.
The other one, the left-handed representation is more useful above the weak scale, where
the fermions are massless.
$$U_x \equiv U_{L,x} , \qquad V_x \equiv U_{R,x} , \qquad (5.11)$$
a real representation. Concretely, if $\psi_L$ has $N$ components and transforms in the following
way under the action of some symmetry
$$\psi_L \to U\psi_L \qquad (5.12)$$
then the fact that $\psi_L$ and $\chi_L$ belong to mutually complex conjugate representations means
that $\chi_L$ transforms as
$$\chi_L \to U^*\chi_L , \qquad (5.13)$$
where $U$ is a $N \times N$ unitary matrix. In the $2N$-dimensional space spanned by $\psi_L$ and $\chi_L$
this transformation takes the form
$$\begin{pmatrix}\psi_L\\ \chi_L\end{pmatrix} \to \begin{pmatrix}U & 0\\ 0 & U^*\end{pmatrix}\begin{pmatrix}\psi_L\\ \chi_L\end{pmatrix} \qquad (5.14)$$
This matrix can be made real by means of a unitary transformation (proof: first diagonalize
$U$. Then, in each $2 \times 2$ block of conjugate eigenvalues $\mathrm{diag}(e^{i\theta}, e^{-i\theta})$ one can transform
this matrix to a two-dimensional rotation.). After this basis transformation $(\psi_L, \chi_L)$
becomes a spinor $\Psi_L$ which transforms according to a $2N$-dimensional real representation
$$\Psi_L \to O\Psi_L ; \qquad O^TO = 1 \qquad (5.15)$$
If a field belongs to a real representation one can write down a mass term of the form
$$m\Psi_L^TC\Psi_L + \text{c.c} \qquad (5.16)$$
This is called a Majorana mass term. It is obviously invariant under (5.15).
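The key step of the proof, turning a block of conjugate eigenvalues $\mathrm{diag}(e^{i\theta}, e^{-i\theta})$ into a real two-dimensional rotation, can be made explicit. The sketch below uses one particular unitary change of basis, $S = \frac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ -i & i\end{pmatrix}$; this choice of $S$, the function name and the test angle are our own illustrative assumptions, and other choices work equally well.

```python
import cmath
import math


def rotate_to_real(theta):
    """Return S D S^dagger for D = diag(e^{i theta}, e^{-i theta})."""
    r = 1.0 / math.sqrt(2.0)
    s = [[r, r], [-1j * r, 1j * r]]                       # unitary change of basis
    d = [cmath.exp(1j * theta), cmath.exp(-1j * theta)]   # diagonal entries of D
    # (S D S^dagger)_{ij} = sum_k S_ik d_k conj(S_jk)
    return [[sum(s[i][k] * d[k] * complex(s[j][k]).conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]


theta = 0.7   # arbitrary illustrative angle
m = rotate_to_real(theta)
for row in m:
    print([round(x.real, 12) for x in row], max(abs(x.imag) for x in row))
```

The result is the real rotation matrix with entries $\cos\theta$ and $\pm\sin\theta$, with vanishing imaginary parts, which is the orthogonal form required in (5.15).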
Note that the representations of the Standard Model after symmetry breaking can also
be written entirely in terms of left-handed fields. For one family one gets then the
$SU(3) \times U(1)$ representations $(3, \frac{2}{3})$, $(3^*, -\frac{2}{3})$, $(3, -\frac{1}{3})$, $(3^*, \frac{1}{3})$ for the $u$ and $d$ quark, $(1, -1)$, $(1, 1)$
for the electron, and $(1, 0)$ for the neutrino. One can determine all possible mass terms
by finding all real subspaces. These are $(3, \frac{2}{3}) + (3^*, -\frac{2}{3})$, $(3, -\frac{1}{3}) + (3^*, \frac{1}{3})$, $(1, -1) + (1, 1)$
and $(1, 0)$ by itself. The latter one, the Majorana mass term for the neutrino does not
appear in the Standard Model. This term can be present without any enlargement of the
field content. However, it would break a global symmetry, namely lepton number, and
hence there are important constraints on such a term.
In general if one has a gauge theory with fermions written in left-handed
representation, one can write down a mass term for any subset of the fields that form a real
representation. The distinction between Majorana masses and Dirac masses is not very
big in this language. One can speak of Majorana masses if a field is in a real
representation that is irreducible, whereas one speaks of Dirac masses when a field is in a
representation that is irreducible as a real representation, but that consists of two
mutually complex conjugate components. An example of such a representation is that of a $u$
quark, $(3, \frac{2}{3}) + (3^*, -\frac{2}{3})$, which is a real representation because one can find a basis so that
the $SU(3) \times U(1)$ representation matrices are real, but which is reducible as a complex
representation.
Often we will call this the "broken Standard Model". It has a gauge group $SU(3)_{QCD} \times U(1)_{QED}$.
5.1.6 Mirror Fermions
Before symmetry breaking the Standard Model fermi fields (except the right-handed neu-
trino, which many people do not regard a Standard Model particle anyway) are in a fully
complex representation, so that no mass terms can be written down. This is presumably
no coincidence. It is quite possible that the fermions we see are only the low energy
remnants of a larger fermion representation, which contains some real parts. Since mass
terms for the real parts are not forbidden, they might indeed be generated, and then the
complex part is all that survives at low energies. Without further information it is of
course not possible to say anything about the masses of such particles. The most common
occurrence of this sort of real $SU(3) \times SU(2) \times U(1)$ matter in models is in the form
of mirror families. These are families of fermions whose representation is the complex
conjugate of those of the families we observe. Instead of living in a world with 3 families,
we might live in a world with N + 3 families and N mirror families, where N families have
paired with N mirrors to form massive particles.
as charged lepton masses, in the MeV or even GeV range. But this is not possible.
Obviously we know already since Pauli postulated it in 1930 that the neutrino emitted in
$\beta$-decay is extremely light (that is why it was called "neutrino"). Meanwhile we know this
much more accurately from precise measurements of Tritium decays. This imposes limits
of about 2 eV on the particular neutrino combination that is emitted in $\beta$-decay. Clearly,
given the very small mass differences, this essentially rules out large masses for all three
mass eigenstates, in any simple extension of the Standard Model with three neutrinos.
For more about mass limits see section 5.2.7.
Furthermore, there are limits from cosmology. If the sum of the three neutrino masses
exceeds 40 eV, neutrino matter would over-close the universe, which means that they
contribute too much to $\Omega$ as defined in eqn. (1.7). Note that neutrino mass limits
based on this argument must necessarily depend on estimates of neutrino abundances,
assumptions about neutrino stability and basic assumptions about cosmology, in contrast
to direct observations. There are other cosmological estimates based on the properties
of the Cosmic Microwave Background (CMB) and other astrophysical features. These
too depend on some additional assumptions, and give an upper limit for the sum of the
masses of less than an eV. The latest limit from the Planck satellite's observation of the
CMB is $\sum m_\nu < 0.23$ eV.
From all this information we know that neutrinos have masses, and that these masses
are smaller than those of the charged leptons by a factor of a million or more.
The first possibility puts the neutrinos on the same footing as the charged leptons and
the quarks: all Standard Model particles would have left- and right-handed components,
and the only thing strange is the smallness of the neutrino masses. Indeed, although it is
often said that neutrino mass is the first example of Beyond the Standard Model physics,
this is a matter of definition. Based on what we knew before the discovery of neutrino
oscillations, two versions of the Standard Model could have been chosen. The first is to
omit right-handed neutrinos and the Dirac mass term, so that neutrinos are massless.
This is how most people define the Standard Model. Neutrino masses are then Beyond
the Standard Model by definition. But an equally reasonable definition would have been
to allow right-handed components and Dirac masses, just as for all other particles, and
assume that the masses were too small to observe. With this definition, the observation
of neutrino masses through oscillations would just be the first observation of a finite
difference of Standard Model parameters, which were too small to be observed until
a few years ago. The only BSM aspect of this scenario is the existence of additional
degrees of freedom, the right-handed neutrinos.
However, the second possibility definitely deserves the label Beyond the Standard
Model, for several reasons: the Majorana mass term breaks lepton number, and adds an
additional mass parameter, which a priori is not related to the Higgs field.
Note that one can in principle choose between these two modifications for each family
separately; the first family neutrino may be given a pure Dirac mass and the second
a pure Majorana mass, etc. But we will soon see that most likely both options are
realized simultaneously: there would then be Dirac as well as Majorana mass terms in
the Lagrangian. Furthermore, in a scenario where families behave distinctly, it becomes
difficult, if not impossible, to obtain the observed neutrino mixing. Therefore we will from
now on assume that the neutrino mass generation mechanism is the same for all three
families.
This is called a Weinberg operator [32]. Note that C, the charge conjugation matrix in
$SU(2)_W$ space, is used here to couple the left-handed lepton doublet and the Higgs field
to an $SU(2)$ singlet. This combination has vanishing Y-charge, and it is a fermion. The
spinor space matrix C is used to couple the two fermionic combinations to a Lorentz singlet.
This combination has dimension 5, and therefore there will always be a coefficient $g/M$
multiplying this operator, where g is a dimensionless coupling constant and M a mass scale.
A theory containing such an operator is not renormalizable, which means concretely that it
does not make sense at scales larger than M. However, it is perfectly acceptable as an
effective theory below M. This means that we can use it as long as the typical energies in
a process are smaller than M. Indeed, it would be a good idea to make M very large, in order
to obtain naturally small neutrino masses: $m_\nu \approx g v^2/M$. This idea is realized more
naturally in the see-saw mechanism discussed below.
Footnote: Note that there is one matrix C to make a Lorentz invariant Majorana mass term in
spinor space, and a matrix C to couple the $SU(2)_{\rm weak}$ indices to a singlet.
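The size of the suppression scale can be made concrete with a short numerical sketch of the estimate $m_\nu \approx g v^2/M$. The value v = 246 GeV for the Higgs v.e.v. and the function names are assumptions made for this illustration; they are not taken from the text.

```python
# Order-of-magnitude estimate m_nu ~ g * v**2 / M for the dimension-5 operator.
GEV_TO_EV = 1.0e9       # 1 GeV expressed in eV
V_HIGGS_GEV = 246.0     # ASSUMED Higgs v.e.v. in GeV (not quoted in the text)


def weinberg_mass_ev(g, scale_gev):
    """Neutrino mass in eV for dimensionless coupling g and scale M in GeV."""
    return g * V_HIGGS_GEV**2 / scale_gev * GEV_TO_EV


def scale_for_mass_gev(g, mass_ev):
    """Inverse of the estimate: scale M in GeV that yields a given mass in eV."""
    return g * V_HIGGS_GEV**2 * GEV_TO_EV / mass_ev


if __name__ == "__main__":
    for scale in (1.0e3, 1.0e10, 1.0e15):
        mass = weinberg_mass_ev(1.0, scale)
        print(f"M = {scale:.0e} GeV  ->  m_nu ~ {mass:.2e} eV")
```

With g of order one, sub-eV masses require M far above the weak scale, which is the sense in which a very large M gives naturally small neutrino masses.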
Note that lepton number is necessarily broken, because Eq. (5.17) contains the lepton
doublet field twice, and not a field and its conjugate. One could try to avoid that by
assigning lepton number to the Higgs field, but then lepton number is broken as soon as the
Higgs gets a v.e.v.; lepton number disappears into the vacuum.
In the third case one sees two electrons but no neutrinos coming out, and hence a
violation of lepton number by two units. Such decays have been looked for, but not found
so far.
In fact, as we will see later, lepton number is not an exact symmetry of the Standard Model anyway.
However, the combination $B - L$ (baryon number minus lepton number) might be an exact global
symmetry. The Majorana mass term also violates $B - L$. The absence of such a Majorana mass term
would therefore be natural if we assumed that $B - L$ is an exact symmetry of nature.
5.2.4 Adding Right-handed Neutrinos
The first possibility listed in section 5.2.1 is less exotic. One just adds a right-handed
neutrino field (i.e. in the left-handed representation a left-handed anti-neutrino) and a
Dirac mass term. The extra field belongs to the $SU(3) \times SU(2) \times U(1)$ representation
(1, 1, 0), and the Dirac mass term can be generated by the Standard Model Higgs boson,
in exactly the same way as it generates the up and down quark masses. The lepton
sector then looks rather similar to the quark sector, and in particular it has its own
CKM matrix. This is not very natural, however, since one would expect the neutrino
masses to be roughly of the same order of magnitude as the other lepton masses and the
quark masses. Although the hierarchies among the quark and lepton masses are large and
not understood, a non-zero but small (< 2 eV for $\nu_e$) electron-neutrino mass makes this
hierarchy problem substantially worse.
$$M_{\rm diag} = U^T M U \tag{5.20}$$
where U is unitary, and $M_{\rm diag}$ is diagonal and real. The reason for writing
"diagonalized" with quotes is that this is not the standard diagonalization of complex
matrices. The standard way is to use $U^\dagger$ instead of $U^T$, because this is what
corresponds to a true basis transformation in a complex vector space. However, the proper
procedure in QFT is to first bring the kinetic term into canonical form, and then use any
remaining freedom to bring the mass terms into diagonal form. We are treating the field with
representation (1, 1, 0) here as a left-handed Weyl fermion, just like any other Standard
Model fermion. Its kinetic term is
$$i \bar{\nu}^c_L \gamma^\mu \partial_\mu \nu^c_L \tag{5.21}$$
This is invariant under unitary transformations of the field $\nu^c_L$. Applying this
transformation to the Majorana mass term gives us precisely the correct transformation (5.20)
to bring the mass matrix to real diagonal form.
Unlike all other direct quark and lepton mass terms, the mass term (5.18) is allowed
by $SU(3) \times SU(2) \times U(1)$, and its mass scale $M_m$ is not set by the Standard Model
Higgs mechanism. The parameter $M_m$ is unrelated to the Higgs mass parameter $\mu^2$, and may a
priori have any value. Note that the Majorana mass term $(\nu^c_L)^T C M_m \nu^c_L$ violates lepton
number, just like the Weinberg operator (5.17). On the other hand, it is not unreasonable
to assume that any term that is not explicitly forbidden by a gauge symmetry will indeed
appear, even if such a term violates a discrete symmetry. The discrete symmetries of
the Standard Model are merely a consequence of the fact that the Lagrangian terms of
dimension four and less just happen to respect B and L. There is no profound reason
why these symmetries should be sacred, unlike gauge symmetries, whose breaking renders
the theory inconsistent. Furthermore, gravity has little respect for discrete symmetries:
baryon and lepton number can disappear into a black hole, without leaving a trace.
According to this philosophy, a term $(\nu^c_L)^T C M_m \nu^c_L$ should exist with $M_m$
determined by some higher scale.
In addition to this Majorana mass term, we also have a Dirac mass term, which can
be written in terms of left-handed fields as indicated in eqn. (5.10)
$$(\nu^c_L)^T M C \nu_L + \text{h.c.} \tag{5.22}$$
Combining the Majorana mass term and the Dirac mass term, we get a mass matrix of
the following form
$$\frac{1}{2} \left( \nu^T, (\nu^c)^T \right)_L C
\begin{pmatrix} 0 & M_d \\ M_d^T & M_m \end{pmatrix}
\begin{pmatrix} \nu \\ \nu^c \end{pmatrix}_L + \text{h.c.} \tag{5.23}$$
It is assumed here that there are no direct Majorana contributions to the mass of $\nu_L$,
such as for example a Weinberg operator. That is why there is a 0 in the first row. But
adding a non-zero entry here would not change anything quantitatively. The off-diagonal
terms are simply the first term of (5.22), distributed symmetrically, with a factor
$\frac{1}{2}$ to get the correct normalization. Note that the complete mass matrix for three
families is a $6 \times 6$, symmetric, and complex matrix, which can be diagonalized by the
method explained above. This diagonalization will mix $\nu$ and $\nu^c$.
Consider first the simplest case, one family. We may assume that $M_m$ and $M_d$ are real,
because if they are not we may multiply the fields $\nu$ and $\nu^c$ with appropriate phase
factors to make them real. Then the matrix can be diagonalized by means of an orthogonal
matrix and it leads to mass eigenvalues $\frac{1}{2}\left(M_m \pm \sqrt{M_m^2 + 4M_d^2}\right)$.
If we make the approximation $M_d \ll M_m$, which is reasonable according to the arguments
given above, the eigenvalues are approximately $M_m$ and $-M_d^2/M_m$ (the sign is irrelevant).
Then we end up with one very massive neutrino ($\nu^c_L$ with a very small admixture of $\nu$)
and one very light one (essentially $\nu$). If we take $M_d \approx 1$ GeV, the value
$M_m = 10^{11}$ GeV leads to a naturally small neutrino mass of about $10^{-2}$ eV. This is
called the see-saw mechanism. In the limit $M_m \to \infty$, $\nu^c_L$ decouples from all
interactions except gravity, and one recovers the Standard Model.
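The one-family numbers quoted above can be checked with a few lines of code. This is a minimal sketch, assuming real $M_d$ and $M_m$ given in GeV; the function names are chosen here for illustration only.

```python
import math

GEV_TO_EV = 1.0e9  # 1 GeV expressed in eV


def seesaw_eigenvalues(m_d, m_m):
    """Exact eigenvalues (light, heavy) of the real matrix [[0, m_d], [m_d, m_m]]."""
    root = math.sqrt(m_m**2 + 4.0 * m_d**2)
    heavy = 0.5 * (m_m + root)
    # (m_m - root)/2 rewritten as -2 m_d^2/(m_m + root); this avoids the
    # catastrophic cancellation that occurs when m_d << m_m.
    light = -2.0 * m_d**2 / (m_m + root)
    return light, heavy


def seesaw_light_mass_approx(m_d, m_m):
    """Leading-order light eigenvalue -m_d^2/m_m, valid for m_d << m_m."""
    return -m_d**2 / m_m


if __name__ == "__main__":
    light, heavy = seesaw_eigenvalues(1.0, 1.0e11)
    print(f"light: {abs(light) * GEV_TO_EV:.3e} eV, heavy: {heavy:.3e} GeV")
```

The product of the two eigenvalues is always $-M_d^2$, so pushing the heavy state up pushes the light one down, which is where the name of the mechanism comes from.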
In the three family case one can solve the eigenvalue problem approximately in the
limit where the determinant of $M_m$ is much larger than that of $M_d$. One can then use the
following ansatz for the light eigenvectors
$$\begin{pmatrix} \vec{v} \\ -M_m^{-1} M_d^T \vec{v} \end{pmatrix} \tag{5.24}$$
Acting on this with the matrix (5.23) we get
$$\begin{pmatrix} -M_d M_m^{-1} (M_d)^T \vec{v} \\ 0 \end{pmatrix} \tag{5.25}$$
Hence the vector (5.24) is transformed into $-M_d M_m^{-1} (M_d)^T$ times itself, up to
corrections of order $(M_d/M_m)$. Then the three light neutrino mass eigenvalues are
approximately the eigenvalues of the $3 \times 3$ matrix $-M_d M_m^{-1} (M_d)^T$ (the overall
sign is again irrelevant). In addition there are three heavy neutrino
mass eigenvalues which are obtained by diagonalizing $M_m$.
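The block structure of this argument can be verified numerically. The sketch below assumes real matrices and, purely to keep the inverse trivial, a diagonal $M_m$; the helper names are hypothetical. It applies the matrix $\begin{pmatrix} 0 & M_d \\ M_d^T & M_m \end{pmatrix}$ to the vector $(\vec v, -M_m^{-1} M_d^T \vec v)$.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def light_ansatz(m_d, m_m_diag, v):
    """Lower block w = -M_m^{-1} M_d^T v, for M_m = diag(m_m_diag)."""
    md_t_v = mat_vec(transpose(m_d), v)
    return [-x / m for x, m in zip(md_t_v, m_m_diag)]


def act_with_mass_matrix(m_d, m_m_diag, v, w):
    """Apply [[0, M_d], [M_d^T, M_m]] to the block vector (v, w)."""
    upper = mat_vec(m_d, w)
    md_t_v = mat_vec(transpose(m_d), v)
    lower = [a + m * b for a, m, b in zip(md_t_v, m_m_diag, w)]
    return upper, lower


def effective_light_matrix(m_d, m_m_diag):
    """The 3x3 matrix -M_d M_m^{-1} M_d^T, for M_m = diag(m_m_diag)."""
    n = len(m_d)
    return [[-sum(m_d[i][k] * m_d[j][k] / m_m_diag[k] for k in range(n))
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    m_d = [[1.0, 0.2, 0.0], [0.1, 2.0, 0.3], [0.0, 0.5, 3.0]]   # GeV, illustrative
    m_m_diag = [1.0e11, 2.0e11, 4.0e11]                         # GeV, illustrative
    v = [1.0, -2.0, 0.5]
    upper, lower = act_with_mass_matrix(m_d, m_m_diag, v, light_ansatz(m_d, m_m_diag, v))
    print("upper block:", upper)
    print("lower block:", lower)
```

The lower block vanishes up to rounding, and the upper block equals the effective light matrix acting on $\vec v$, which is the content of (5.25).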
The PMNS matrix. Let us make this more precise. The coupling of the W boson is
given by an expression completely analogous to the one for quarks, Eq. (4.30):
L,i i ,
L UPMNS W L , (5.26)
where "P" refers to Pontecorvo, who first pointed out the possibility of neutrino
oscillations in 1957 [25] and "MNS" stands for Maki, Nakagawa and Sakata, who proposed this
matrix in 1962 [21] for two lepton flavors. In (5.26) N and L denote, as before, the set of
neutrinos resp. charged leptons. The labels $\alpha = e, \mu, \tau$ denote the charged lepton
mass eigenstates, and i = 1, 2, 3 denote the neutrino mass eigenstates, in no particular order.
One also writes (omitting the label PMNS for convenience)
$$|\nu_\alpha\rangle = U^*_{\alpha i} |\nu_i\rangle$$
$$|\nu_i\rangle = U_{\alpha i} |\nu_\alpha\rangle$$
If the neutrino masses were purely of Dirac type, this matrix would have the same number
of parameters as the CKM matrix, and could be parametrized in exactly the same way,
although with very different values for the parameters $\theta_{12}$, $\theta_{23}$,
$\theta_{13}$ and $\delta$. If there are also Majorana components in the neutrino masses,
there are two additional parameters (which can be transformed away if $M_m = 0$). They can be
chosen as follows
where $U(\theta_{12}, \theta_{23}, \theta_{13}, \delta)$ is a standard form as used for the CKM
matrix. The phases $\alpha_1$, $\alpha_2$ are CP violating (just as $\delta$), but they do not
contribute to neutrino oscillations. The standard parametrization of
$U(\theta_{12}, \theta_{23}, \theta_{13}, \delta)$ is
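$$\begin{pmatrix}
c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-i\delta} \\
-s_{12} c_{23} - c_{12} s_{23} s_{13} e^{i\delta} & c_{12} c_{23} - s_{12} s_{23} s_{13} e^{i\delta} & s_{23} c_{13} \\
s_{12} s_{23} - c_{12} c_{23} s_{13} e^{i\delta} & -c_{12} s_{23} - s_{12} c_{23} s_{13} e^{i\delta} & c_{23} c_{13}
\end{pmatrix} \tag{5.28}$$
where $s_{12} = \sin\theta_{12}$, $c_{12} = \cos\theta_{12}$, etc. Unlike the CKM matrix
elements, some of the PMNS matrix elements are large: $\theta_{12} \approx 33^\circ$,
$\theta_{23} \approx 45^\circ$, $\theta_{13} \approx 9^\circ$. The phase $\delta$ is
essentially unknown.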
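A quick way to guard against sign or index slips in the standard parametrization is to build the matrix numerically and check that it is unitary. The sketch below uses the approximate angles quoted above; the value of $\delta$ is an arbitrary placeholder, since the text states it is essentially unknown, and the function names are chosen for illustration.

```python
import cmath
import math


def pmns(theta12, theta23, theta13, delta):
    """Standard parametrization U(theta12, theta23, theta13, delta), no Majorana phases."""
    s12, c12 = math.sin(theta12), math.cos(theta12)
    s23, c23 = math.sin(theta23), math.cos(theta23)
    s13, c13 = math.sin(theta13), math.cos(theta13)
    e = cmath.exp(1j * delta)  # e^{+i delta}
    return [
        [complex(c12 * c13), complex(s12 * c13), s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, complex(s23 * c13)],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, complex(c23 * c13)],
    ]


def unitarity_defect(u):
    """Largest absolute entry of U U^dagger - 1."""
    n = len(u)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(u[i][k] * u[j][k].conjugate() for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


if __name__ == "__main__":
    u = pmns(math.radians(33), math.radians(45), math.radians(9), 1.0)  # delta = 1.0 is a placeholder
    for row in u:
        print("  ".join(f"{abs(x):.3f}" for x in row))
    print("unitarity defect:", unitarity_defect(u))
```

The printed moduli show the pattern referred to in the text: apart from the (1,3) entry, all elements are of comparable size, in contrast to the nearly diagonal CKM matrix.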
The reason that the PMNS matrix has two extra phases is a direct consequence of
the fact that the W-boson coupling is between a standard Dirac fermion on the one hand
(the charged lepton) and a Majorana particle on the other hand. Let us compare this
to the counting for the CKM matrix. The CKM matrix couples quarks to quarks. For
N families, it is an $N \times N$ unitary matrix, which can be multiplied from the left and
the right with diagonal phase matrices. These phase matrices are precisely the unitary
transformations that leave respectively the up and down quark masses invariant. Since
a Dirac mass term has the form $M \bar{\psi}_L \psi_R$ + h.c., one can multiply left-handed and
right-handed Dirac fermions with compensating phases, without affecting M. Only the phase
of the left-handed particle contributes to the CKM matrix, and hence this matrix can
be changed. Fixing that phase is like a gauge choice: we have to agree on it in order to
compare our results. There are N phases from the up-quark sector and another N from
the down-quark sector. Since the overall up-quark and down-quark phase commute with
the CKM matrix and can cancel each other, the net parameter reduction is by $2N - 1$,
so that we get $N^2 - 2N + 1 = (N - 1)^2$ parameters. But if the W boson couples a
Dirac fermion to a Majorana fermion, we do not get a diagonal phase factor from the
Majorana side because a Majorana mass term contains the same fermionic field twice,
and if we phase rotate this fermion this affects the mass (note that one could change the
fermionic field by a sign, but not by a phase). Hence the number of parameters is
$N^2 - N$, which for N = 3 gives 6. Note that if both Majorana and Dirac fermions are
contributing, as in the seesaw mechanism, the neutrino masses are always of Majorana
type. However, the two extra phases $\alpha_1$ and $\alpha_2$ cannot be observed unless one
considers processes sensitive to the difference between Majorana and Dirac masses. In
particular, they cannot be observed in neutrino oscillations [3, 7]. Indeed, even though we
have observed oscillations, we still do not know if there exists a Majorana mass term (and
hence a violation of lepton number).
Footnote: Interestingly, the CKM matrix for quarks appeared later: Cabibbo introduced his angle
for the two family case in 1963, whereas Kobayashi and Maskawa published their paper in 1973.
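The counting in this paragraph is easy to tabulate. The sketch below encodes the two formulas from the text, $(N-1)^2$ and $N^2 - N$; the split into angles and phases uses the standard fact that an $N \times N$ rotation has $N(N-1)/2$ angles, which is not stated in the text and is added here as an assumption of the illustration.

```python
def mixing_angles(n):
    """Number of rotation angles of an n x n mixing matrix: n(n-1)/2."""
    return n * (n - 1) // 2


def dirac_parameters(n):
    """Physical parameters when both sides are Dirac fermions: (n-1)^2."""
    return (n - 1) ** 2


def majorana_parameters(n):
    """Physical parameters when one side is a Majorana fermion: n^2 - n."""
    return n * n - n


def phases(total_parameters, n):
    """Parameters that are not angles, i.e. CP-violating phases."""
    return total_parameters - mixing_angles(n)


if __name__ == "__main__":
    for n in (2, 3):
        d, m = dirac_parameters(n), majorana_parameters(n)
        print(f"N={n}: Dirac {d} ({phases(d, n)} phases), "
              f"Majorana {m} ({phases(m, n)} phases)")
```

For N = 3 this gives one phase in the Dirac case and three in the Majorana case, i.e. $\delta$ plus the two extra phases; for N = 2 it gives zero and one, the single Majorana phase of the two-family case.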
Oscillations for two neutrino species. Although the subsequent discussion is easily
generalized to three neutrinos, for simplicity we consider only two, namely the one that
couples via a W boson to the electron and the one that couples to the muon. These
are what one usually calls the electron neutrino and the muon neutrino. In collisions
with other particles, a pure electron neutrino can only produce an electron, through the
interaction $\nu_e \to e^- + W^+$ occurring as part of a more complicated process. Hence if
we observe the electron in a detector, the neutrino is thereby identified as an electron
neutrino. Similarly, if a muon scatters with matter and is converted into a neutrino by
W exchange, this neutrino is by definition a muon neutrino. These are the interaction
eigenstates. However, for generic mass matrices we cannot expect these to coincide with
the mass eigenstates, and indeed it turns out that they do not. In fact we have
$$\begin{aligned}
|\nu_e\rangle &= \cos\theta\, |\nu_1\rangle + \sin\theta\, |\nu_2\rangle \\
|\nu_\mu\rangle &= -\sin\theta\, |\nu_1\rangle + \cos\theta\, |\nu_2\rangle ,
\end{aligned} \tag{5.29}$$
where $\nu_1$ and $\nu_2$ are the mass eigenstates. These mass eigenstates have the usual quantum
mechanical time evolution:
$$|\nu_i, t\rangle = e^{-iHt} |\nu_i, 0\rangle = e^{-it\sqrt{(p_i)^2 + (m_i)^2}} |\nu_i, 0\rangle \tag{5.30}$$
Since this time evolution is different for the two components, a pure electron neutrino will
not stay a pure electron neutrino as it evolves in time. If it is detected, one may find that
with some probability it has changed into a muon neutrino.
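After some time interval T the interaction eigenstate has evolved to a state
$$|\nu_e, T\rangle = e^{-iT\sqrt{p^2 + m_1^2}} \cos\theta\, |\nu_1\rangle + e^{-iT\sqrt{p^2 + m_2^2}} \sin\theta\, |\nu_2\rangle \tag{5.31}$$
We can now compute the overlap of this state with an interaction eigenstate. The square
of the amplitude is the probability for finding an electron neutrino in the final state
$$\begin{aligned}
P(\nu_e \to \nu_e) &= \left| e^{-iT\sqrt{p^2 + m_1^2}} (\cos\theta)^2 + e^{-iT\sqrt{p^2 + m_2^2}} (\sin\theta)^2 \right|^2 \\
&= 1 - \sin^2\left[ \frac{T}{2} \left( \sqrt{p^2 + m_2^2} - \sqrt{p^2 + m_1^2} \right) \right] [\sin(2\theta)]^2
\end{aligned} \tag{5.32}$$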
If we make the approximation that the neutrino mass is much smaller than its energy
(or momentum), we get $\sqrt{p^2 + m_2^2} - \sqrt{p^2 + m_1^2} \approx (m_2^2 - m_1^2)/2E$
(with E = p, up to corrections of order $m^2/E^2$). Finally we express the result not in terms
of the time of flight T of the neutrinos, but the distance L they travel. Since they are very
relativistic we get L = T (because c = 1). The final result is
$$P(\nu_e \to \nu_e) = 1 - \sin^2(\Delta m^2 L/4E)\, [\sin(2\theta)]^2 \tag{5.33}$$
with $\Delta m^2 = m_2^2 - m_1^2$. Note that the effect disappears if the neutrinos are
degenerate in mass, or if the mixing angle vanishes.
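To use the two-flavor formula with laboratory units one has to restore the factors of $\hbar$ and c. The sketch below does this explicitly; the function names and the choice of units (eV$^2$, km, GeV) are assumptions of this illustration, and the value of $\hbar c$ is the standard one, not taken from the text.

```python
import math

HBAR_C_EV_M = 1.973269804e-7  # hbar * c in eV * m


def oscillation_phase(dm2_ev2, length_km, energy_gev):
    """Dimensionless phase Delta m^2 L / (4 E) with hbar and c restored."""
    length_m = length_km * 1.0e3
    energy_ev = energy_gev * 1.0e9
    return dm2_ev2 * length_m / (4.0 * energy_ev * HBAR_C_EV_M)


def survival_probability(theta, dm2_ev2, length_km, energy_gev):
    """Two-flavor vacuum survival probability P(nu_e -> nu_e)."""
    phase = oscillation_phase(dm2_ev2, length_km, energy_gev)
    return 1.0 - math.sin(phase) ** 2 * math.sin(2.0 * theta) ** 2


if __name__ == "__main__":
    # The numerical prefactor often quoted as 1.27 in practical formulas.
    print("phase for 1 eV^2, 1 km, 1 GeV:", oscillation_phase(1.0, 1.0, 1.0))
    theta = math.radians(33)
    for length in (1.0, 50.0, 180.0):
        p = survival_probability(theta, 7.5e-5, length, 0.004)
        print(f"L = {length:6.1f} km: P = {p:.3f}")
```

The two limiting cases mentioned in the text are immediate: for `theta = 0` or `dm2_ev2 = 0` the function returns exactly 1.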
Oscillations for three neutrino species. The three-family formula can be worked
out along the same lines, and after a bit of work one obtains
$$\begin{aligned}
P(\nu_\alpha \to \nu_\beta) = \delta_{\alpha\beta}
&- 4 \sum_{i>j} \mathrm{Re}\left( U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j} \right) \sin^2\left( \frac{\Delta m^2_{ij} L}{4E} \right) \\
&\pm 2 \sum_{i>j} \mathrm{Im}\left( U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j} \right) \sin\left( \frac{\Delta m^2_{ij} L}{2E} \right)
\end{aligned}$$
where the upper sign is for neutrinos and the lower one for anti-neutrinos. One may verify
that for two species we re-obtain Eqns. (5.33) and (5.34). In that case U is just
$$\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
which is real, and hence the last term vanishes. In fact, in the two-family case there is an
extra parameter, the Majorana phase. In the three family case there are two such phases,
$\alpha_1$ and $\alpha_2$ in Eqn. (5.27). But such phases cancel out in the neutrino oscillation
formula. Note that because of the last term we can in principle measure the sign of
$\Delta m^2_{ij}$, unless the prefactor vanishes.
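The three-family formula can be cross-checked against a direct evaluation of the amplitude. The sketch below assumes the conventions $|\nu_\alpha\rangle = U^*_{\alpha i}|\nu_i\rangle$ and $\Delta m^2_{ij} = m_i^2 - m_j^2$, works in natural units with `l_over_e` standing for L/E in eV$^{-2}$, and uses hypothetical function names; the mixing-matrix helper is the usual standard parametrization without Majorana phases.

```python
import cmath
import math


def mixing_matrix(t12, t23, t13, delta):
    """3x3 unitary matrix in the standard parametrization (no Majorana phases)."""
    s12, c12 = math.sin(t12), math.cos(t12)
    s23, c23 = math.sin(t23), math.cos(t23)
    s13, c13 = math.sin(t13), math.cos(t13)
    e = cmath.exp(1j * delta)
    return [
        [complex(c12 * c13), complex(s12 * c13), s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, complex(s23 * c13)],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, complex(c23 * c13)],
    ]


def probability_from_amplitude(u, masses_sq, l_over_e, alpha, beta):
    """|sum_i U*_{alpha i} U_{beta i} exp(-i m_i^2 L / 2E)|^2 for neutrinos."""
    amplitude = sum(
        u[alpha][i].conjugate() * u[beta][i] * cmath.exp(-0.5j * masses_sq[i] * l_over_e)
        for i in range(len(masses_sq))
    )
    return abs(amplitude) ** 2


def probability_from_formula(u, masses_sq, l_over_e, alpha, beta, antineutrino=False):
    """Closed-form expression with the Re and Im sums over i > j."""
    sign = -1.0 if antineutrino else 1.0
    total = 1.0 if alpha == beta else 0.0
    for i in range(len(masses_sq)):
        for j in range(i):  # pairs with i > j
            quartic = (u[alpha][i].conjugate() * u[beta][i]
                       * u[alpha][j] * u[beta][j].conjugate())
            dm2 = masses_sq[i] - masses_sq[j]
            total -= 4.0 * quartic.real * math.sin(dm2 * l_over_e / 4.0) ** 2
            total += sign * 2.0 * quartic.imag * math.sin(dm2 * l_over_e / 2.0)
    return total


if __name__ == "__main__":
    u = mixing_matrix(0.58, 0.79, 0.15, 1.2)   # illustrative values
    masses_sq = [0.0, 7.5e-5, 2.5e-3]          # eV^2, illustrative values
    for beta, name in enumerate(("e", "mu", "tau")):
        p = probability_from_formula(u, masses_sq, 3000.0, 0, beta)
        print(f"P(nu_e -> nu_{name}) = {p:.4f}")
```

Multiplying the columns of U by extra phases leaves every product $U^*_{\alpha i} U_{\beta i}$ unchanged, which is why Majorana phases drop out of both functions.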
Direct measurements
In addition to this we have information from various astrophysical and cosmological
sources (already briefly mentioned in section 5.2), such as the mass density of the universe,
the effect of neutrino masses on Big Bang nucleosynthesis and on the cosmic microwave
background and the travel time of neutrinos produced in supernova explosions. This is an
exciting field with many opportunities for new results, but we will focus here on the three
classes listed above, that are not affected by cosmological and astrophysical assumptions.
The first class of measurements amounts to checking energy and momentum conservation
in interactions where a neutrino has been produced. For the electron neutrino the
standard experiment is tritium $\beta$-decay. The tritium nucleus decays to helium-3, an
electron and an anti-neutrino:
$$^3_1\mathrm{H} \to {}^3_2\mathrm{He} + e^- + \bar{\nu}_e + 18.6\ \mathrm{keV} .$$
If the latter has a mass, less energy is available for the electron. Hence the experiments
try to determine the maximum energy of the decay electrons. So far no indication for
non-vanishing mass was found, which implies an upper limit of about 2 eV for the mass of the
electron neutrino. The masses of the other two neutrino combinations can be determined
from energy-momentum conservation in accelerator experiments. The current limits are
190 keV and 18 MeV for $\nu_\mu$ and $\nu_\tau$ respectively [8]. These are maxima on the missing
mass in experiments, just as the $\beta$-decay limits. These experiments precede the observation
of neutrino oscillations in 1998 by many years. Based on what we have learned meanwhile
about neutrino oscillations it seems clear that the actual neutrino masses are much, much
smaller than any of these limits.
Neutrino-less double beta decay is sensitive to the Majorana mass. This may either
be a pure Majorana mass, or the Majorana component in the more complicated situation
where also Dirac masses are present. If lepton number is violated (which is always the
case if one introduces a Majorana mass, for the left- or for the right-handed neutrino),
neutrino-less double beta decay is allowed, and may be observable. Several experiments
are looking for it, but so far without undisputed results.
Neutrino oscillation experiments are sensitive to differences of mass-squares. A positive
result proves that at least one neutrino must have non-zero mass, but unfortunately
this does not tell us anything about the masses themselves. The oscillation experiments
fall into several classes. First of all one can distinguish appearance and disappearance
experiments. The first checks oscillation from species a to a different species b, and the
second checks whether the total flux of species a is preserved. The experiments can
also be subdivided according to the origin of the neutrinos: solar, reactor, accelerator,
atmospheric or cosmic sources (e.g. supernovae).
Solar neutrino experiments. Solar neutrino experiments measure the number of electron
neutrinos observed on earth that are produced in nuclear reactions in the sun. Initially
these experiments were merely intended for finding solar neutrinos. They did indeed
find them, but already since the 1960s these experiments reported a shortage, finding only
about one third of what was expected. The expectations depend on solar models, which
were for many years seen as the main culprit of the shortage, but over the years the
solar models became so robust that this became unlikely. Most of these experiments
look at so-called charged-current interactions involving a W boson. The reaction is
$\nu_e + n \to e^- + p$, where a neutron in a nucleus is converted into a proton. The difficulty
is finding a few of these converted nuclei (e.g. Germanium) within a huge quantity of
detector material (e.g. Gallium). This reaction is only sensitive to electron neutrinos since
there is not enough energy available to produce muons or taus. The Sudbury Neutrino
Observatory (SNO) was able to look in addition at neutral current interactions (involving
the Z boson). In these interactions the final state lepton is also a neutrino, and
interactions of all three neutrino species are observable. This experiment found, in addition
to the factor three deficit in charged current interactions, precisely the expected solar
neutrino flux in neutral current interactions. This is very strong evidence that on their way
from the sun to earth a substantial fraction of neutrinos have oscillated to other species.
This was announced in 2001, and produced the decisive clue in the decades-old solar neutrino
puzzle. In 2002 Raymond Davis received the Nobel prize for his pioneering work on
detecting solar neutrinos. His detector used about 600 tonnes of Chlorine, which in a rare
neutrino interaction gets converted to Argon. They captured about 2000 neutrinos over
a period of thirty years!
It turns out that solar neutrino oscillations are only partly due to oscillations in
vacuum. For high energy neutrinos (energies of about 5-20 MeV) there is a second oscillation
effect due to oscillations in matter, i.e. the sun. This is called the
Mikheyev-Smirnov-Wolfenstein (MSW) effect. The formulas we gave above are for oscillations in
vacuum, and are not valid for oscillations in matter. Taking this effect into account one gets
the survival probability of about 1/3 observed in the early solar neutrino experiments. It
turns out that the MSW effect is sensitive to the sign of the mass difference. The data
are consistent with a mass difference $\Delta m^2 \approx 7.5 \times 10^{-5}$ eV$^2$ and an angle
$\theta_{12}$ of about $33^\circ$. The MSW effect is less important for low energy (a few MeV)
neutrinos, and in this case we can directly compute the neutrino oscillations using the
$P(\nu_e \to \nu_e)$ formula given above. Plugging in the typical energy of the neutrinos and the
distance to the sun one finds that we are in a region where the factor
$\sin^2(\Delta m^2_{12} L/4E)$ is fluctuating rapidly. Hence this factor averages out to
$\frac{1}{2}$. Now we use the value for $\theta_{12}$ and we get a survival probability of about
60%, which is indeed what is observed for low energy neutrinos. Of course historically
$\theta_{12}$ and $\Delta m^2_{12}$ were output, not input.
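The averaged survival probability quoted here follows from replacing the oscillating factor by 1/2. A minimal sketch of that arithmetic, with function names chosen for illustration, compares the closed form with a brute-force average over one period of the oscillating factor:

```python
import math


def averaged_survival(theta):
    """Survival probability with sin^2(phase) replaced by its mean value 1/2."""
    return 1.0 - 0.5 * math.sin(2.0 * theta) ** 2


def numerically_averaged_survival(theta, samples=1000):
    """Average of 1 - sin^2(phase) sin^2(2 theta) over equally spaced phases in [0, pi)."""
    total = 0.0
    for k in range(samples):
        phase = math.pi * k / samples
        total += 1.0 - math.sin(phase) ** 2 * math.sin(2.0 * theta) ** 2
    return total / samples


if __name__ == "__main__":
    theta12 = math.radians(33)
    print(f"closed form : {averaged_survival(theta12):.3f}")
    print(f"numerical   : {numerically_averaged_survival(theta12):.3f}")
```

For $\theta_{12} = 33^\circ$ both give a value a little below 0.6, in line with the survival probability of about 60% mentioned in the text.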
Reactor experiments. Reactor experiments look for neutrinos from nuclear reactors.
In 2005 the first experiment of this kind (KamLAND) reported evidence for oscillations.
Previous experiments sensitive to smaller values of L showed no effect. The Daya Bay
reactor experiment in China was the first to determine that $\theta_{13} \neq 0$ in a significant way.
Hierarchy ambiguities. There is in the current data still an ambiguity in the ordering
of the three mass eigenstates. As we have seen above, using the MSW effect one could
determine the sign of the mass difference of the two mass eigenstates involved in solar
oscillations. We do not have matter oscillations at our disposal for atmospheric neutrinos
to determine the sign of the other mass difference. Hence we are left with an ambiguity.
In the future, one may be able to use the fact that the full three-family oscillation formula
has a term that is sensitive to the sign, but this is not yet possible. Hence we have two
possible mass hierarchies. Either the two mass states whose mass difference agrees with
solar oscillations are the lightest (normal hierarchy), or they are the heaviest (inverted
hierarchy). The labelling convention is to number them 1, 2 and 3 in increasing order of
mass in the normal hierarchy, and 3, 1, 2 in the inverted hierarchy. These are the labels
used in the PMNS matrix above for the columns; the rows are labelled $e$, $\mu$, $\tau$. With
this convention, the mixing angles are the same for both hierarchies. In particular, $\nu_e$,
$\nu_\mu$ and $\nu_\tau$ have the same decomposition in terms of mass eigenstates for the normal
and the inverted hierarchy.
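The two orderings can be made concrete by computing the three masses from the mass-squared differences and a trial value for the lightest mass. In the sketch below the solar splitting is the value quoted in the text, while the larger ("atmospheric") splitting of $2.4 \times 10^{-3}$ eV$^2$ is an assumed illustrative number that does not appear in this section; attaching it to the pair (2, 3) in the normal case and (3, 1) in the inverted case is likewise a simplifying assumption.

```python
import math

DM2_SOLAR = 7.5e-5        # eV^2, value quoted in the text
DM2_ATMOSPHERIC = 2.4e-3  # eV^2, ASSUMED illustrative value (not from the text)


def masses_normal(lightest):
    """Masses in eV for the normal hierarchy: m1 < m2 < m3."""
    m1 = lightest
    m2 = math.sqrt(m1**2 + DM2_SOLAR)
    m3 = math.sqrt(m2**2 + DM2_ATMOSPHERIC)
    return {1: m1, 2: m2, 3: m3}


def masses_inverted(lightest):
    """Masses in eV for the inverted hierarchy: m3 < m1 < m2."""
    m3 = lightest
    m1 = math.sqrt(m3**2 + DM2_ATMOSPHERIC)
    m2 = math.sqrt(m1**2 + DM2_SOLAR)
    return {1: m1, 2: m2, 3: m3}


if __name__ == "__main__":
    for name, spectrum in (("normal", masses_normal(0.0)), ("inverted", masses_inverted(0.0))):
        total = sum(spectrum.values())
        print(f"{name:8s}: m1={spectrum[1]:.4f} m2={spectrum[2]:.4f} "
              f"m3={spectrum[3]:.4f} eV, sum={total:.4f} eV")
```

In both cases states 1 and 2 are the pair separated by the solar splitting, as the labelling convention requires; under these assumptions the minimal sum of masses is larger for the inverted ordering, and both lie below the cosmological bound of 0.23 eV mentioned earlier.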
term in combination with its Hermitean conjugate. A typical pair of such terms will have
the form
LY = g L R + g R L (5.35)
Under parity the first term transforms to
g R L , (5.36)
The matter consists of three copies each of the five $SU(3) \times SU(2) \times U(1)$ fermion
representations, plus a complex Higgs. The kinetic terms have thus a $U(3)^5 \times U(1)$ global
symmetry. The Yukawa couplings break this symmetry. Any off-diagonal U(3) transformations
are destroyed because the eigenvalues of the three Yukawa coupling matrices are
all different.
To see if any U(1) transformations are preserved one can try to diagonalize these matrices.
As we already know, this cannot be done in the quark sector: one may diagonalize $g_U$
using unitary matrices $U_U$ and $V_U$, but to diagonalize $g_D$ we would need matrices $U_D$
and $V_D$, with $U_U \neq U_D$. Note that $U_U$ and $U_D$ both act on components of
the left-handed quark doublet, and since they are different one cannot simultaneously
diagonalize all Yukawa couplings. The matrices $U_U$, $V_U$, $U_D$ and $V_D$ are usable after
Higgs symmetry breaking, because then it is meaningful to act on the separate components of
the weak doublet. The fact that $U_U \neq U_D$ leads to a non-trivial CKM matrix, so we know
experimentally that these matrices are indeed different. Therefore if we transform any
quark field by a phase, and we want the quark Yukawa couplings to be invariant, we must
transform all quarks by the same phase, and anti-quarks by the opposite phase. This
surviving U(1) symmetry is Baryon number (B), and it is normalized in such a way that
all quarks have $B = \frac{1}{3}$.
We have previously identified four mechanisms for breaking the $U(6) \times U(6)$ chiral
symmetries. They are broken to $U(1)^6$ (the separate flavor numbers) by QCD and QED.
The weak interactions, and in particular the fact that the CKM matrix is non-trivial,
break this global symmetry to just a single U(1), thus adding a fifth origin of
$U(6) \times U(6)$ breaking.
In the lepton sector the situation is more or less the same as in the quark sector. If we
start with left-handed lepton doublets, plus right-handed charged leptons and neutrinos,
then the quark sector and the lepton sector both have a $U(6) \times U(6)$ symmetry in the limit
of zero fermion masses and if electroweak interactions are switched off. QCD chiral symmetry
breaking only affects the quarks, but if we treat the quark and lepton sector otherwise
equally, we also end up with just a single conserved quantity, lepton number. Since
neutrinos oscillate into each other, we know that separate electron, muon and tau lepton
numbers are not conserved. Apart from the absence of chiral symmetry breaking, the
other novelty in the lepton sector is the possibility of introducing Majorana masses. This
would break lepton number completely.
Finally we may transform the Higgs field by a phase. This is automatically a symmetry
of the Higgs potential, but it is a symmetry of the Yukawa couplings only if the quarks and
leptons transform with compensating phases. If one solves the conditions for invariance
of the Yukawa couplings, one finds only one solution, namely the gauged $U(1)_Y$ symmetry
of the Standard Model gauge group. So the single Higgs field of the Standard Model does
not introduce new global symmetries.
5.5 Anomalies
All symmetries we discussed so far were good symmetries classically, but quantum
corrections break some of them. The Feynman diagrams responsible for this breaking are
fermion triangles with external (axial) vector currents (in D space-time dimensions
anomalies originate from fermion polygons with $\frac{1}{2} D + 1$ sides; chiral anomalies exist
only if D is even). The problem occurs only if the fermion trace contains the matrix
$\gamma_5$. In purely vector-like theories, where all couplings to the vector bosons are only
via the Dirac matrix $\gamma^\mu$, the problem does not occur. But as soon as there is a coupling
via $\gamma^\mu \gamma_5$ some classical symmetries must be broken in the quantum theory. Such
couplings typically arise if vector bosons only couple to left- or right-handed fermions, as
in the weak interactions.
The triangle diagrams contribute to the amplitude
$\epsilon^a_\mu(k)\, \epsilon^b_\nu(p)\, \epsilon^c_\rho(q)\, V^{\mu\nu\rho}(k, p, q)$, where
$\epsilon^a_\mu(k)$ is a polarization vector of a vector boson. Here all three polarization
vectors could be different, i.e. they may belong to different vector bosons. If the classical
symmetry is respected, then the amplitude must vanish if we replace any of the polarization
vectors by the momentum of the vector boson. This follows from the momentum space version
of current conservation, $\partial_\mu J^\mu = 0$. Hence the Green's function
$V^{\mu\nu\rho}(k, p, q)$ should satisfy
$$k_\mu V^{\mu\nu\rho}(k, p, q) = 0 , \tag{5.38}$$
where k is one of the external momenta. An analogous relation should hold for the other
two external momenta if the symmetry is to hold quantum mechanically.
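The relevant triangle diagrams are:
[Figure: two triangle diagrams with external lines (k, a), (p, b) and (q, c). In the first
the internal momenta are l + p, l and l - q; in the second, with (p, b) and (q, c)
interchanged, they are l + q, l and l - p.]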
This turns into an integral over a trace of the fermions, with three propagators and
three vertices coupling to vector bosons, which can be either $\gamma^\mu$ or
$\gamma^\mu \gamma_5$ (and similarly for $\nu$, $\rho$). It turns out that if the fermionic trace
contains a $\gamma_5$, no regularization of the diagram preserves the classical symmetry. One
can impose conservation of two of the three currents, for example the ones coupling to the
vertices labeled $\nu$ and $\rho$, but then one gets for the third current
$$i k_\mu V^{\mu\nu\rho}(k, p, q) = \frac{1}{2\pi^2}\, \epsilon^{\nu\rho\alpha\beta} p_\alpha q_\beta . \tag{5.39}$$
This result holds for a single fermion with axial vector couplings $i \gamma^\mu \gamma_5$ to the
external currents. The anomaly can be shifted to any of the three vertices, but cannot be
removed. If none of the currents in the diagram is gauged there is no problem, since then the
diagram will never contribute to any Green's function. The same is true if only one vertex
is a gauge current. If two or three external lines are gauge bosons there are important
consequences, however.
The $i\epsilon$ is needed for making a correct Wick rotation later on, but for the moment we
will just drop it to keep the notation simple. We use vertices of the form introduced in
section 2.6, and in particular we allow for a non-abelian generator $T^a$ at every vertex. We
will set m = 0, but later we will need the Dirac propagator with non-vanishing mass.
Then the expression to be computed is
$$i V^{\mu\nu\rho}_{abc}(p, q) = - \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr} \left[ (i \gamma^\mu \gamma_5 T_a)\, \frac{i (\not{l} + \not{p})}{(l + p)^2}\, (i \gamma^\nu T_b)\, \frac{i \not{l}}{l^2}\, (i \gamma^\rho T_c)\, \frac{i (\not{l} - \not{q})}{(l - q)^2} \right]$$
To this we have to add the same expression with $(p, b, \nu)$ simultaneously interchanged with
$(q, c, \rho)$. Note the overall minus sign due to the fact that we have a fermion loop. The
trace is over the gamma matrices as well as the gauge generators. Collecting all factors
and separating the traces we get
$$i V^{\mu\nu\rho}_{abc}(p, q) = \int \frac{d^4 l}{(2\pi)^4}\, \mathrm{Tr} \left[ \gamma^\mu \gamma_5\, \frac{\not{l} + \not{p}}{(l + p)^2}\, \gamma^\nu\, \frac{\not{l}}{l^2}\, \gamma^\rho\, \frac{\not{l} - \not{q}}{(l - q)^2} \right] \mathrm{Tr} \left[ T_a T_b T_c \right]$$
introduce a momentum cutoff. But this is not even well-defined, because it depends on
how we define the loop momentum in the first place; note that we can shift l by some fixed
amount. Generally one prefers regularization methods that can be applied directly to the
Lagrangian, rather than manipulating individual diagrams. With such a prescription at
least there is a relation between the ways different diagrams are regularized. A popular
method in gauge theories is dimensional regularization. One simply treats the number of
space-time dimensions as a variable, and sets it equal to 4 in the end. With proper care,
this can be done in a continuous way. But the presence of a $\gamma_5$ in the trace makes
"proper care" very tricky. This matrix is proportional to the product of $\gamma^0$,
$\gamma^1$, $\gamma^2$ and $\gamma^3$, and this is a definition that does not extend smoothly to
other dimensions.
For this reason another method is often used, called Pauli-Villars regularization. One
introduces a new particle with the same spin as the fermion going around in the loop,
but with opposite statistics. This particle is given a mass $M$, and at the end of the
calculation $M$ is taken to infinity. This means that we go back to the original Lagrangian
in that limit, because particles with infinite mass can be ignored (they decouple). The
idea is that by having opposite statistics the auxiliary particle makes exactly the same
contribution as the fermion loop, but with opposite sign. Hence for $M = 0$ it cancels
the entire diagram, and for nonzero $M$ at least it cancels the divergence. Of course the
auxiliary particle violates the spin-statistics theorem, but in the infinite mass limit it is
not really there, so this should not matter. If we include the auxiliary particle loops the
result for the two diagrams now becomes (Reg. stands for Regularized)
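$$iV^{abc}_{\mu\nu\rho,\mathrm{Reg.}}(p,q) = \int \frac{d^4l}{(2\pi)^4}\left[I_0(l,p,q) - I_M(l,p,q)\right]\mathrm{Tr}\left[T^aT^bT^c\right] + \int \frac{d^4l}{(2\pi)^4}\left[I_0(l,q,p) - I_M(l,q,p)\right]\mathrm{Tr}\left[T^aT^cT^b\right]$$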
where I0 is the integrand shown above, and IM is the same one with a mass M in all
fermion propagators.
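The auxiliary particle mass $M$ potentially has implications for the problem we are
considering. This is because conservation laws for currents are as follows
$$\partial^\mu(\bar\psi\gamma_\mu\psi) = 0$$
$$\partial^\mu(\bar\psi\gamma_\mu\gamma_5\psi) = 2iM\,(\bar\psi\gamma_5\psi)$$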
Therefore the current of the auxiliary field is not conserved. This will be the reason we find
an anomaly. At this point one may raise the question if perhaps another regularization
method can be found that preserves the symmetry explicitly. But this is not possible.
The most convincing way of seeing that is by analyzing the problem in terms of path
integral quantization, but that is beyond the scope of these lectures. The answer is that
no matter how one approaches the problem, one always ends up with the same anomaly.
It turns out that after regularization the linear divergence of the integral cancels, but
in the triangle diagram without $\gamma_5$ there is still a logarithmic divergence that contributes
to the renormalization of the three-point coupling. But this is not what we are interested
in. We are interested in the contraction of the vertex $V_{\mu\nu\rho}$ with $k^\mu = p^\mu + q^\mu$. Note
that because of current conservation for $M = 0$ the contraction of the terms with $I_0$ with
$k^\mu$ yields exactly zero, so the entire contribution will come from the $I_M$ terms. However,
without these terms the integral is not defined, so one cannot prove anything by sending
M to infinity prematurely.
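To work out the contraction with $(p + q)^\mu$ we use the manifest identity
$$(\slashed{p}+\slashed{q})\gamma_5 = -\gamma_5(\slashed{l}+\slashed{p}-M) - (\slashed{l}-\slashed{q}-M)\gamma_5 - 2M\gamma_5 \tag{5.41}$$
The factors $(\slashed{l}+\slashed{p}-M)$ and $(\slashed{l}-\slashed{q}-M)$ combine nicely with the propagators, e.g.
$$(\slashed{l}+\slashed{p}-M)\,\frac{\slashed{l}+\slashed{p}+M}{(l+p)^2-M^2} = 1 \tag{5.42}$$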
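The cancellation in Eq. (5.42) rests on the Clifford-algebra fact that a slashed momentum squares to $k^2$. The following is a minimal numerical sketch of that step, assuming the Dirac representation of the gamma matrices and the metric signature (+,-,-,-). The helper names and the sample momenta are illustrative choices, not taken from the text. It also checks one possible way of splitting $(\slashed{p}+\slashed{q})\gamma_5$ into pieces that sit next to the propagators they cancel; the precise form used by the author may differ.

```python
# Sketch: check (kslash - M)(kslash + M) = (k^2 - M^2) * 1 with explicit 4x4 matrices.
# Assumptions: Dirac representation, metric (+,-,-,-); all names are illustrative.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(a, b, cb=1):
    """Return a + cb * b."""
    return [[a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def scale(c, a):
    return [[c * a[i][j] for j in range(4)] for i in range(4)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))

ONE = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]
G1 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]]
G2 = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]]
G3 = [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]]
G5 = scale(1j, matmul(matmul(G0, G1), matmul(G2, G3)))  # gamma_5 = i g0 g1 g2 g3

def slash(k):
    """k-slash = k^0 g^0 - k^1 g^1 - k^2 g^2 - k^3 g^3."""
    out = scale(k[0], G0)
    for comp, g in zip(k[1:], (G1, G2, G3)):
        out = add(out, g, -comp)
    return out

def dot(k):
    """Minkowski square k^2."""
    return k[0] ** 2 - k[1] ** 2 - k[2] ** 2 - k[3] ** 2

# Arbitrary sample momenta and regulator mass.
l, p, q, M = (3, 1, 2, -1), (1, 0, 2, 1), (2, 1, 0, 1), 2
lp = tuple(a + b for a, b in zip(l, p))   # l + p
lq = tuple(a - b for a, b in zip(l, q))   # l - q
pq = tuple(a + b for a, b in zip(p, q))   # p + q

# Propagator cancellation: numerator factor times (slash + M) gives the denominator.
lhs = matmul(add(slash(lp), ONE, -M), add(slash(lp), ONE, M))
propagator_ok = close(lhs, scale(dot(lp) - M ** 2, ONE))

# Decomposition of (pslash + qslash) gamma_5 into three terms.
left = matmul(slash(pq), G5)
right = add(add(scale(-1, matmul(G5, add(slash(lp), ONE, -M))),
                matmul(add(slash(lq), ONE, -M), G5), -1),
            G5, -2 * M)
decomposition_ok = close(left, right)

print(propagator_ok, decomposition_ok)
```

Both checks are purely algebraic, so they hold for any choice of momenta and mass; the integers above only keep the arithmetic exact.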
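We use this both in the terms with $M \neq 0$ as in the ones with $M = 0$. Let us first deal
with the first two terms in Eqn. (5.41). The discussion for these two terms is identical.
With one propagator cancelled, we are left with a trace over two propagators and two
$\gamma$ matrices from the vertices. We use the identity
$$\mathrm{Tr}\,\gamma_5\gamma_\mu\gamma_\nu\gamma_\rho\gamma_\sigma = 4i\,\epsilon_{\mu\nu\rho\sigma} \tag{5.43}$$
Furthermore, the trace of $\gamma_5$ with fewer than four $\gamma$ matrices vanishes. Then what is left
is only
$$\int \frac{d^4l}{(2\pi)^4}\left[\frac{4i\,\epsilon_{\nu\rho\alpha\beta}\,l^\alpha q^\beta}{l^2\,(l-q)^2} - \frac{4i\,\epsilon_{\nu\rho\alpha\beta}\,l^\alpha q^\beta}{[l^2-M^2]\,[(l-q)^2-M^2]}\right] \tag{5.44}$$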
This is a convergent integral. It must yield something of the form $\epsilon_{\nu\rho\alpha\beta}X^\alpha q^\beta$, where $X^\alpha$
is a four-vector that results from the integral. But such a four-vector must point in some
direction in four-space, that must be some linear combination of vectors appearing in the
integrand. The only such vector is $q$, and hence $X$ must be proportional to $q$, and then the
whole expression vanishes. Note that for this conclusion it is important that the second
term makes the integral finite. Without that second term, one might also think that the
first term must necessarily be proportional to $q^\alpha$. But this conclusion would be wrong.
Note that we could shift the integration variable from $l$ to $l + t$, where $t$ is an arbitrary
vector. But if we do that, the integral will be proportional to a linear combination of
$q^\alpha$ and $t^\alpha$. The term proportional to $t^\alpha$ does not vanish and is in fact logarithmically
divergent, so clearly the conclusion that the integral can only be proportional to $q^\alpha$ makes
no sense. By contrast, if we make a shift of integration variable in the full expression Eqn
(5.44) it has no effect, because it is merely a change of variables in a finite integral.
Having eliminated the contributions from the first two terms in Eqn (5.41) we now
have only the last one to deal with. This yields the following $\gamma$-matrix trace
$$-2M\,\mathrm{Tr}\left[\gamma_5\,(\slashed{l}+\slashed{p}+M)\,\gamma_\nu\,(\slashed{l}+M)\,\gamma_\rho\,(\slashed{l}-\slashed{q}+M)\right] \tag{5.45}$$
A trace of $\gamma_5$ with five $\gamma$ matrices always vanishes (at least two of the $\gamma_\mu$ have the
same index, they can be anti-commuted to be next to each other, where they square
to the identity. Then we have only three gamma matrices left). Hence we only get a
contribution from the terms with four $\gamma$ matrices, which yields $8iM^2\,\epsilon_{\nu\rho\alpha\beta}\,p^\alpha q^\beta$.
This trace does not depend on $l$, so all that is left is a scalar integral involving the
three propagator denominators
$$S = \int \frac{d^4l}{(2\pi)^4}\,\frac{1}{[(l+p)^2-M^2]\,[l^2-M^2]\,[(l-q)^2-M^2]} \tag{5.46}$$
To make this well-defined we go to Euclidean space using a Wick rotation. Note that
momenta $l^\mu l_\mu = l_0^2 - \vec{l}^{\,2}$ are transformed to $-l^2$ in Euclidean space, if we replace $l_0$ by $il_4$.
To keep track of the proper integration contours we first re-introduce the $i\epsilon$ terms in the
propagators. We are interested in the limiting behavior of the expression for $M \to \infty$. In
that limit the dependence on $p$ and $q$ can be ignored. After going to polar coordinates in
four-dimensional Euclidean space we get
$$S = -i\int \frac{d\Omega_3}{(2\pi)^4}\int_0^\infty dl\,\frac{l^3}{(l^2+M^2)^3} = \frac{-i}{8\pi^2M^2}\int_0^\infty dx\,\frac{x^3}{(x^2+1)^3} = \frac{-i}{32\pi^2M^2} \tag{5.47}$$
where the $-i$ comes from the Wick rotation and the three signs from the propagator
denominators, and the $d\Omega_3$ integration is over the polar angles; this integral yields the
surface area of a unit 3-sphere and is equal to $2\pi^2$. In the second step the integration
variable was changed as $l = xM$. The indefinite integral is
$$\int dx\,\frac{x^3}{(x^2+1)^3} = -\frac{2x^2+1}{4(x^2+1)^2} \tag{5.48}$$
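As a quick sanity check of the radial integral, the sketch below differentiates the antiderivative numerically, evaluates the definite integral from 0 to infinity, and combines it with the angular factor $2\pi^2/(2\pi)^4$. The function names and the sample point are illustrative; only the magnitude $1/(32\pi^2)$ of the coefficient of $1/M^2$ is checked here, not the overall phase.

```python
import math

def antiderivative(x):
    """Candidate antiderivative -(2x^2 + 1) / (4 (x^2 + 1)^2)."""
    return -(2 * x**2 + 1) / (4 * (x**2 + 1) ** 2)

def integrand(x):
    return x**3 / (x**2 + 1) ** 3

# (1) The derivative of the antiderivative should reproduce the integrand.
h = 1e-5
x0 = 0.7  # arbitrary sample point
numeric_derivative = (antiderivative(x0 + h) - antiderivative(x0 - h)) / (2 * h)

# (2) Definite integral from 0 to infinity: F(inf) - F(0), with F(inf) = 0.
definite = 0.0 - antiderivative(0.0)

# (3) Angular factor: area of the unit 3-sphere over (2 pi)^4, times the radial integral.
prefactor = (2 * math.pi**2) / (2 * math.pi) ** 4 * definite

print(numeric_derivative, integrand(x0), definite, prefactor)
```

The definite integral comes out as 1/4, which is the step that turns the prefactor $1/(8\pi^2 M^2)$ into $1/(32\pi^2 M^2)$.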
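Hence we find
$$\begin{aligned}
ik^\mu V^{abc}_{\mu\nu\rho} &= -(p^\mu + q^\mu)\int \frac{d^4l}{(2\pi)^4}\,I_M(l,p,q)\,\mathrm{Tr}\,T^aT^bT^c + (p,\nu,b) \leftrightarrow (q,\rho,c) \\
&= -(8iM^2\,\epsilon_{\nu\rho\alpha\beta}\,p^\alpha q^\beta)\,S\;\mathrm{Tr}\,T^aT^bT^c + (p,\nu,b) \leftrightarrow (q,\rho,c) \\
&= -\frac{1}{4\pi^2}\,\epsilon_{\nu\rho\alpha\beta}\,p^\alpha q^\beta\;\mathrm{Tr}\,T^aT^bT^c + (p,\nu,b) \leftrightarrow (q,\rho,c) \\
&= -\frac{1}{4\pi^2}\,\epsilon_{\nu\rho\alpha\beta}\,p^\alpha q^\beta\;\mathrm{Tr}\,T^a\{T^b,T^c\}
\end{aligned}$$
If we set $T^a$, $T^b$ and $T^c$ equal to the identity matrix this yields Eqn. (5.39).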
The currents we consider are of the form $i\gamma_\mu P T^a$, where $P$ is a linear combination
of the identity matrix and $\gamma_5$. The trace over the Dirac indices splits thus into two
terms, one without any $\gamma_5$ matrices, and one with a single $\gamma_5$. As indicated in the figure,
there are two diagrams contributing to the amplitude under consideration. It is not hard
to see that for the diagrams without a $\gamma_5$ the trace over the group representations is
proportional to $\mathrm{Tr}\,[T^a, T^b]T^c \propto f^{abc}$, whereas, as we have seen above, for the trace with
a $\gamma_5$ the trace is proportional to $\mathrm{Tr}\,\{T^a, T^b\}T^c$, which due to the cyclic properties of the
trace is completely symmetric in $a$, $b$ and $c$. The terms proportional to $f^{abc}$ contain
infinities, which fortunately can be subtracted since the Lagrangian contains terms of this
form as well. The symmetric terms are finite, but they do not satisfy the Ward identity
Eq. (5.39) in all three indices simultaneously.
If we split all fermions in left and right-handed ones, their contribution to the anomaly
will be with opposite sign if they are in the same representation. It is more convenient
to assume that all fermions are left-handed. Then the complete group theory factor in
the anomaly is proportional to $\mathrm{Tr}\,\{T^a, T^b\}T^c$, where the trace is over the complete set of
fermions. In the following all fermions are assumed to be left-handed. Writing all fermions
in terms of left-handed components is another way of seeing that all anomalies cancel if
there are only vector couplings: their left and right components are converted into two
left-handed components with opposite charges. Therefore QED is safe. Furthermore,
QCD is safe as well, because triplets (with representation matrix $T^a$) and anti-triplets
(with representation matrix $-(T^a)^*$) have opposite contributions to the anomaly:
$$\mathrm{Tr}\,\{-(T^a)^*, -(T^b)^*\}\,\bigl(-(T^c)^*\bigr) = -\bigl(\mathrm{Tr}\,\{T^a, T^b\}T^c\bigr)^* = -\mathrm{Tr}\,\{T^a, T^b\}T^c$$
Let us first consider the situation that all three generators $T^a$, $T^b$ and $T^c$ are generators
of the same simple factor $G$ of the gauge group. The Lie algebra trace equals (see
Appendix B)
$$\mathrm{Tr}\,\{T^a, T^b\}T^c = 2\,\mathrm{Str}\,T^aT^bT^c = 2I_3(R)\,d^{abc}, \tag{5.49}$$
where $d^{abc}$ is a real tensor that is symmetric in three adjoint indices, and Str stands
for the symmetrized trace, defined in Appendix B. In general this is a trace over a
reducible representation, in other words a sum over the traces contributed by each fermion
in the problem. If a fermion is in a non-trivial representation of some other group $G'$,
the dimension of that representation should be taken into account as a multiplicity. The
$G$-anomalies may cancel for two reasons: either $I_3(R)$ vanishes or the symmetric tensor
$d^{abc}$ does not exist for the group $G$. The vanishing of $I_3(R)$ can be a consequence of
a non-trivial cancellation among several fermions, or it could happen that each fermion
separately contributes zero. Note that in particular any real representation contributes
zero, since the right-hand side of Eq. (5.49) is real, and on the left-hand side one may
use that in a suitable basis $T^a = -(T^a)^* = -(T^a)^T$. The same is true for pseudo-real
representations, satisfying $-(T^a)^* = CT^aC^\dagger$ for some unitary matrix $C$. Thus in
particular the singlet and adjoint representations do not contribute to the anomaly.
If $G$ has no symmetric tensor in three adjoint indices, there are no $G$-anomalies at
all, for any fermion representation. This is automatically true if all $G$-representations are
real or pseudo-real. This is the case for the gauge groups SU(2), Sp(N), all exceptional
groups except $E_6$ and all SO(N) groups except SO(4n + 2), $n \in \mathbb{Z}$. Most of the groups
with complex representations do indeed have a non-vanishing tensor $d^{abc}$. This is true
for all SU(N) groups with $N \geq 3$, SO(2) and SO(6). The remaining groups, $E_6$ and SO(4n + 2) for
$n \geq 2$, have complex representations, but are nevertheless anomaly-free (i.e. $d^{abc} = 0$).
Finally U(1) groups have non-trivial anomalies, which are equal to the third power of the
charge for each fermion (with the appropriate multiplicity as explained above).
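The statements about SU(2), SU(N) and conjugate representations can be illustrated numerically. The sketch below evaluates $\mathrm{Tr}\,\{T^a,T^b\}T^c$ for the fundamental representations of SU(2) and SU(3), and for the SU(3) anti-triplet. It assumes the common normalization $T^a = \sigma^a/2$ and $T^a = \lambda^a/2$ (Pauli and Gell-Mann matrices), which the text does not fix, and all function names are illustrative.

```python
# Sketch: Tr({T^a, T^b} T^c) for SU(2) doublet, SU(3) triplet and SU(3) anti-triplet.
# Assumption: T^a = sigma^a / 2 and T^a = lambda^a / 2; anti-triplet uses -(T^a)^*.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def anomaly_trace(ta, tb, tc):
    """Tr({ta, tb} tc) = Tr(ta tb tc) + Tr(tb ta tc)."""
    return trace(matmul(matmul(ta, tb), tc)) + trace(matmul(matmul(tb, ta), tc))

def half(m):
    return [[x / 2 for x in row] for row in m]

def minus_conjugate(m):
    return [[-complex(x).conjugate() for x in row] for row in m]

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
S3 = 3 ** -0.5
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]

def max_abs_anomaly(gens):
    """Largest |Tr({T^a, T^b} T^c)| over all index triples."""
    return max(abs(anomaly_trace(a, b, c)) for a in gens for b in gens for c in gens)

su2 = [half(m) for m in PAULI]
su3 = [half(m) for m in GELL_MANN]
su3_bar = [minus_conjugate(t) for t in su3]

su2_max = max_abs_anomaly(su2)   # vanishes: SU(2) has no symmetric d-tensor
su3_max = max_abs_anomaly(su3)   # non-zero: SU(3) has a d-tensor

# Triplet and anti-triplet contribute with opposite sign for every index triple.
opposite = all(
    abs(anomaly_trace(a, b, c) + anomaly_trace(ab, bb, cb)) < 1e-12
    for a, ab in zip(su3, su3_bar)
    for b, bb in zip(su3, su3_bar)
    for c, cb in zip(su3, su3_bar)
)

print(su2_max, su3_max, opposite)
```

With this normalization the largest SU(3) entry is $1/(2\sqrt{3})$, coming from index combinations such as (1,1,8) and (8,8,8).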
The anomaly coefficients $I_3(R)$ are integers (provided $d^{abc}$ is normalized in a reasonable
way) which can have either sign. They can be looked up in tables (see e.g. [28]). If
the group is not anomaly-free these coefficients are usually non-zero for any complex
irreducible representation, with very few exceptions.
If $T^a$ and $T^b$ belong to the same factor $G_1$ of the gauge group, and $T^c$ to a different
one, $G_2$, then $\mathrm{Tr}\,\{T^a, T^b\}T^c = 2\,\mathrm{Tr}\,T^aT^b\;\mathrm{Tr}\,T^c$.
This relation holds for each irreducible representation $(R_1, R_2)$ of $G_1 \times G_2$, and one
sums over all irreducible components of the complete fermion representation at the end.
Since $\mathrm{Tr}\,T^c = 0$ for simple Lie algebras, there can only be an anomaly if $T^c$ is a U(1)
generator. If the full left-handed fermion representation is $\sum_i (R_i, q_i)$ the anomaly in the
U(1) current is thus proportional to $\sum_i I_2(R_i)\,q_i$.
Finally, if all three group generators belong to different gauge groups, there is only a
contribution if all three are U(1) generators, not embedded in a simple algebra.
To illustrate all this, let us see how it works for the Standard Model. The pure SU (3)
anomalies cancel because each family contains 2 triplets and 2 anti-triplets, and complex
conjugate representations contribute with opposite signs. Cancellation of the pure SU (2)
anomalies is trivial, since SU (2) is anomaly-free. The cancellation of the pure U (1)
anomalies is more interesting:
$$3\cdot 2\cdot\left(\tfrac{1}{6}\right)^3 + 3\left(-\tfrac{2}{3}\right)^3 + 3\left(\tfrac{1}{3}\right)^3 + 2\left(-\tfrac{1}{2}\right)^3 + 1 = 0 \tag{5.50}$$
Note that the multiplicities due to the dimensions of SU(3) and SU(2) representations
must (of course) be taken into account. The cancellation of the mixed SU(3), U(1)
and SU(2), U(1) anomalies is also non-trivial. It is a simple exercise to check that these
anomalies do indeed cancel.
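The exercise can be done with exact rational arithmetic. The sketch below lists one family of left-handed Weyl fermions with the hypercharges that appear in Eq. (5.50) and evaluates the pure U(1) anomaly and the two mixed anomalies. The field labels and the tuple layout are illustrative choices; the mixed anomalies are computed up to an overall normalization, using the fact that $I_2$ is the same for triplets and anti-triplets and for all doublets.

```python
from fractions import Fraction as F

# One family of left-handed Weyl fermions:
# (label, SU(3) dimension, SU(2) dimension, U(1) charge).
FAMILY = [
    ("Q",   3, 2, F(1, 6)),    # quark doublet
    ("u^c", 3, 1, F(-2, 3)),   # left-handed anti-up
    ("d^c", 3, 1, F(1, 3)),    # left-handed anti-down
    ("L",   1, 2, F(-1, 2)),   # lepton doublet
    ("e^c", 1, 1, F(1)),       # left-handed positron
]

# Pure U(1): sum of charge cubed, weighted by the SU(3) x SU(2) multiplicity.
u1_cubed = sum(d3 * d2 * y**3 for _, d3, d2, y in FAMILY)

# Mixed SU(3)-SU(3)-U(1): sum of charges over colored fields, SU(2) dimension as multiplicity.
su3_su3_u1 = sum(d2 * y for _, d3, d2, y in FAMILY if d3 == 3)

# Mixed SU(2)-SU(2)-U(1): sum of charges over doublets, SU(3) dimension as multiplicity.
su2_su2_u1 = sum(d3 * y for _, d3, d2, y in FAMILY if d2 == 2)

print(u1_cubed, su3_su3_u1, su2_su2_u1)
```

All three sums vanish exactly, and removing any single field from the list spoils at least one of them.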
$U(N_2) \ldots U(N_k)$, where $k$ is the number of distinct representations. All fermions are here
assumed to be left-handed. If one does not do that, one would arrive at a smaller group,
since one would overlook transformations between L and R fermions. For convenience
we have assumed all fermions to be Weyl fermions in complex representations; if there are
also Majorana fermions in real representations one will get orthogonal symmetries among
them.
A natural question to ask now is if these global symmetries are preserved in the
quantum theory. It turns out that they are in general affected by anomalies due to the
same diagrams we have already computed. To see that think of global symmetry currents
as vertices $c^\mu\,\bar\psi\gamma_\mu P T\psi$ added to the Lagrangian. Here $P$ is some combination of the
identity and $\gamma_5$ and $T$ is some symmetry generator. The coefficients $c^\mu$ may be thought
of as coupling constants. These terms in the Lagrangian then generate two-point vertices
with two fermionic external lines, and combining these vertices with gauge boson-fermion
three point couplings one can obtain triangle diagrams.
If one of the currents in the anomaly triangle represents a global symmetry, and
the other two are local, we are forced to preserve the local symmetries (to maintain
consistency) and choose the regularization of the diagram in such a way that the entire
anomaly is in the conservation of the current of the global symmetry. Group theoretically
these anomalies work exactly as the ones discussed above, but the interpretation is quite
different. Anomalous global symmetries are acceptable, and in fact totally unavoidable.
The only consequence is that a global symmetry of the classical action turns out not to be
a symmetry quantum mechanically. Another way of looking at this is to ask: would it be possible to
consistently gauge the global symmetry? If the answer is negative because of anomalies,
then the global symmetry is not a symmetry of the quantum theory.
Hence triangle diagrams involving two generators of $G_{\rm gauge}$ and one of $G_{\rm global}$ will break
part of the global symmetries. Since non-abelian generators are traceless, only U(1)'s can
be broken in this way. In principle each non-abelian factor in $G_{\rm gauge}$ is responsible for
one anomaly. Furthermore, if there are $m$ U(1) factors in $G_{\rm gauge}$, they yield an additional
$\tfrac12 m(m+1)$ in principle independent anomalies, since a triangle diagram can have two
different U(1) gauge generators. Hence in general one may expect $n - m + \tfrac12 m(m+1) =
n + \tfrac12 m(m-1)$ global U(1)'s to be broken by anomalies. In practice there may be fewer,
since the set of anomalous U (1)s need not be independent. If this does not exhaust
the set of available U (1) symmetries, the remaining ones may be linearly combined into
non-anomalous symmetries.
Even though a global current may be anomalous, the classical global symmetry means
that at every vertex the charge is conserved. Hence an anomalous global symmetry is not
broken to arbitrary order in perturbation theory since one can simply follow the charges
through the diagram. However, the effects of the anomaly do appear non-perturbatively.
5.5.4 Global Anomalies in Field-Theoretic Form
The anomaly can be represented by a local counter-term involving the gauge fields
$$\partial_\mu J^\mu = \frac{g^2}{8\pi^2}\,\mathrm{Tr}\,F_{\mu\nu}\tilde F^{\mu\nu}, \tag{5.51}$$
where $\tilde F^{\mu\nu} = \tfrac12\epsilon^{\mu\nu\rho\sigma}F_{\rho\sigma}$. The fields $F_{\mu\nu} \equiv F^a_{\mu\nu}T^a$ are of course in the representation of the
fermions in the loop. The left-hand side of this divergence reproduces precisely Eq. (5.39)
when written in momentum space. Since the right-hand side is itself the divergence of a
current $K^\mu$ (see Eq. (4.9)) one can define a new current $J'^\mu = J^\mu - \frac{g^2}{4\pi^2}K^\mu$ that is conserved.
However, this does not change the fact that $J^\mu$ is not conserved, and furthermore $K^\mu$ is
not gauge invariant: it is invariant under small gauge transformations, but not under
certain large ones that cannot be continuously deformed to zero.
The two vector currents, baryon number and electromagnetism, are conserved, because
neither QCD nor QED has couplings with a $\gamma_5$. Note that if we insert the current $J_{3,A}$ into
a triangle diagram with two gluons, the contributions of the two terms cancel, because the
u and d quark have the same couplings to the gluon. But their couplings to the photon
are different, so the diagram with the current $J_{3,A}$ and two photons is anomalous. Note
that the divergence of $J_A$ does get anomalous contributions with two gluons. This is
why we choose these linear combinations. The effect of anomalies due to QCD is much
stronger than those of QED, so we look at a combination that is only affected by QED
anomalies.
5.5.6 The $\pi^0$ Decay Width
The symmetry corresponding to $J_{3,A}$ is part of the axial SU(2) symmetries that are spontaneously
broken by QCD. This spontaneous breaking produces three pions as Goldstone
bosons. In the limit of vanishing quark masses and QED coupling all three pions are
massless. If the QED coupling does not vanish, only $J_{3,A}$ still corresponds to an exact symmetry,
so one would expect the corresponding Goldstone boson, $\pi^0$, to be exactly massless,
while $\pi^\pm$ are slightly heavier due to electromagnetism. In the real world the quarks have
a mass, lifting the pion masses to about 135 MeV, with $\pi^0$ slightly lighter than $\pi^\pm$.
The most interesting effect of the anomaly is not on the masses, but on the decay
of the $\pi^0$. If this symmetry were exact, it would forbid the decay $\pi^0 \to \gamma\gamma$, which is
observed experimentally. This is a consequence of the Goldstone theorem. The pion field
has the same matrix element with the two photon state as the divergence $\partial_\mu J^\mu_{3,A}$ of the
axial vector current, since the pion is the Goldstone boson of the axial symmetry. If the
current is conserved the matrix element vanishes. If one includes the quark masses that
break the chiral symmetry one gets a non-zero prediction for the decay width for $\pi^0 \to \gamma\gamma$
that is however much too small. The correct answer is that $\partial_\mu J^\mu_{3,A}$ is not zero, but equal
to an anomaly term involving the photon field, generated by a triangle diagram with an
external axial vector current and two photons. Now the decay rate can be computed using
the anomaly, whose normalization is known. The result is
$$\Gamma(\pi^0 \to \gamma\gamma) = \frac{\alpha^2\,m_\pi^3\,N_c^2}{576\,\pi^3 f_\pi^2} = 7.73\ \mathrm{eV}, \tag{5.52}$$
where $f_\pi \approx 92.4$ MeV is the pion decay constant (about 130 MeV in the alternative
normalization that differs by a factor $\sqrt{2}$) and $N_c$ is the number of colors. The pion
decay constant can be measured from the decay width of the charged pions to leptons.
Hence the anomaly gives a parameter-free prediction of the $\pi^0 \to \gamma\gamma$ decay width. The
agreement with the observed decay rate, $7.8 \pm 0.2$ eV, is very good, which may be viewed
as direct experimental evidence for the anomaly. Not only that, but the decay width
is sensitive to the properties of the quarks in the loop. Originally, these computations
were done with protons and neutrons instead of quarks. This gives the wrong answer. In
QCD, the amplitude is obviously proportional to the number of quark colors, so that
the width is proportional to $N_c^2$. Historically, this is one of the first ways it was discovered
that there have to be three distinct species of u and d quarks.
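The number in Eq. (5.52) is easy to reproduce. The sketch below evaluates $\alpha^2 m_\pi^3 N_c^2/(576\,\pi^3 f_\pi^2)$, assuming the normalization $f_\pi \approx 92.4$ MeV (that is, 130 MeV divided by $\sqrt{2}$), which is the value that reproduces the quoted 7.73 eV. The input values for $\alpha$ and $m_{\pi^0}$ and the function name are assumptions made for this illustration.

```python
import math

def pi0_width_ev(f_pi_mev, n_colors=3, m_pi_mev=134.977, alpha=1 / 137.036):
    """Gamma(pi0 -> gamma gamma) = alpha^2 m^3 Nc^2 / (576 pi^3 f^2), returned in eV."""
    width_mev = alpha**2 * m_pi_mev**3 * n_colors**2 / (576 * math.pi**3 * f_pi_mev**2)
    return width_mev * 1e6  # MeV -> eV

width_three_colors = pi0_width_ev(92.4)
width_one_color = pi0_width_ev(92.4, n_colors=1)

print(round(width_three_colors, 2), round(width_one_color, 2))
```

With a single color the width drops by a factor of nine, far below the measured value, which is the sense in which the decay counts quark colors.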
not gauge invariant. This allows non-perturbative instanton effects to break the symmetry
explicitly, and remove any argument for the existence of a massless Goldstone boson.
All other anomalies can now be removed by subtracting a suitable anomalous one. For
example by making a linear combination of baryon number and lepton number we end
up with the anomaly free combination $B - L$. This is an exact global symmetry of the
Standard Model if there are no Majorana neutrino masses.
5.5.13 Symplectic Anomalies
There is yet another kind of anomaly [34]. In some theories there are global gauge
transformations (gauge transformations that cannot be connected to the identity in a
continuous way) that change the sign of the path integral. This sign flip is always due to
a fermion determinant changing sign. The most likely conclusion is that such theories are
ill-defined, and hence not acceptable as a theory. The conditions for absence of such global
anomalies are known. They are related to the fourth homotopy group of the gauge group,
and this homotopy group is non-trivial only for SU (2) and symplectic groups Sp(N ).
Symplectic gauge groups are not encountered often in the literature, so only SU (2) is
really of interest to us. Since it occurs in the Standard Model we have to worry about
non-trivial global anomalies. These anomalies are absent if the number of Weyl fermions
in half-integral spin representations is even. An even number of fermions leads to an even
number of sign changes, so that the anomaly cancels. The Standard Model respects this
condition, and it does so within each family: there are four SU (2) doublets per family.
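The counting behind this condition is short enough to write out. The sketch below counts SU(2) doublets in one Standard Model family, with the number of colors as multiplicity, and tests whether the total is even. The field list and helper names are illustrative.

```python
# (label, number of colors, is the field an SU(2) doublet?)
FAMILY = [
    ("Q",   3, True),    # quark doublet: three doublets, one per color
    ("u^c", 3, False),
    ("d^c", 3, False),
    ("L",   1, True),    # lepton doublet
    ("e^c", 1, False),
]

def count_doublets(fields, families=1):
    """Number of Weyl-fermion SU(2) doublets, counting color as multiplicity."""
    return families * sum(colors for _, colors, is_doublet in fields if is_doublet)

def free_of_global_su2_anomaly(fields, families=1):
    """The sign flips cancel when the number of doublets is even."""
    return count_doublets(fields, families) % 2 == 0

doublets_per_family = count_doublets(FAMILY)
leptons_only = [f for f in FAMILY if f[1] == 1]

print(doublets_per_family,
      free_of_global_su2_anomaly(FAMILY),
      free_of_global_su2_anomaly(leptons_only))
```

A family without quarks would contain a single doublet and fail the condition, so the cancellation relies on the three colors.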
5.6 Axions
Let us now return to the QCD $\theta$-parameter discussed in section 4.1.2. We have already
seen that it should be almost zero, and that within QCD alone it can simply be set equal
to zero by imposing CP. However, since CP is not a symmetry of nature, this cannot
really be justified. Furthermore, even if we put it equal to zero, non-vanishing corrections
to $\theta$ are to be expected.
In fact, there is an effect which is not even small. To see why, we have to examine more
carefully how we obtained the diagonal quark masses. In the Standard Model the only
possible sources of CP violation are the CKM matrix for quarks and the PMNS matrix
for leptons (see sections 4.3.3 and 5.2.6). Since $\theta$ is a strong interaction parameter the
CKM matrix is most directly relevant. There is a CP-violating parameter in the CKM
matrix if the number of families is three or larger. CP violation was observed by
Cronin and Fitch in 1964 in the $K^0$-$\bar K^0$ system, and more recently it has also been
found in $B$-$\bar B$ systems. Hence we know that the CP-violating parameter is non-zero. For
this to work the Yukawa coupling matrices $g_U$ and $g_D$ cannot be real (if they are real the
Lagrangian is manifestly CP invariant). Hence one expects the quark masses produced by
the Higgs mechanism to be complex numbers. In section 4.3.3 we have made symmetry
transformations to make the masses real, but the existence of anomalies in some symmetries
forces us to verify if all those transformations were legitimate.
vev. The Lagrangian, including the $\theta$-term is
$$\mathcal{L} = -\tfrac14 G^a_{\mu\nu}G^{\mu\nu,a} + \theta\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu} + i\bar\psi\slashed{D}\psi + m\,\bar\psi_L\psi_R + m^*\,\bar\psi_R\psi_L \tag{5.53}$$
Note that the last two terms are each other's conjugate, and hence the Lagrangian is real,
even if $m$ is complex.
Let us write the mass as $m = |m|e^{i\alpha}$. In classical field theory one can make the mass
real by means of the transformation
$$\psi_L \to e^{i\alpha/2}\,\psi_L, \qquad \psi_R \to e^{-i\alpha/2}\,\psi_R \tag{5.54}$$
Note that there are other phase choices that achieve this, because simultaneous phase
rotations of $\psi_L$ and $\psi_R$ have no effect at all. But whatever we choose, it is clear that we
will have to transform $\psi_L$ and $\psi_R$ with different phases to make $m$ real.
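The effect of these phase rotations on the mass parameter can be checked with plain complex arithmetic. The sketch below assumes the mass term has the form $m\,\bar\psi_L\psi_R$ plus its conjugate, so that rotating $\psi_L$ by a phase contributes the conjugate phase. The function name and the sample numbers are illustrative.

```python
import cmath

def rotate_mass(m, phase_left, phase_right):
    """Mass parameter after psi_L -> exp(i*phase_left) psi_L, psi_R -> exp(i*phase_right) psi_R.

    Assumption: the mass term is m * conj(psi_L) * psi_R, so it picks up
    exp(-i*phase_left) * exp(+i*phase_right).
    """
    return m * cmath.exp(-1j * phase_left) * cmath.exp(1j * phase_right)

m = 3.0 * cmath.exp(0.8j)   # |m| = 3, phase 0.8 (arbitrary sample values)
alpha = cmath.phase(m)

# Opposite phases on L and R: the mass becomes real.
axial = rotate_mass(m, alpha / 2, -alpha / 2)

# Equal phases on L and R: nothing happens.
vector = rotate_mass(m, 0.3, 0.3)

# Another choice that also works: add the same extra phase to both.
other = rotate_mass(m, alpha / 2 + 0.3, -alpha / 2 + 0.3)

print(axial, vector, other)
```

Only the difference of the two phases matters for the mass, which is why a relative (axial) rotation is unavoidable.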
Of course this only works if the rest of the Lagrangian is invariant under this transformation.
The fermion kinetic terms can be written as
$$i\bar\psi\slashed{D}\psi = i\bar\psi_L\slashed{D}\psi_L + i\bar\psi_R\slashed{D}\psi_R$$
and are manifestly invariant. The gauge kinetic terms do not even depend on $\psi$. So
clearly the aforementioned phase transformation is a symmetry of the classical action. It
is called an axial symmetry. As usual, there is a charge that generates the symmetry
transformation, and the charge is related to a current, the axial vector current
$$J_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi$$
$$J^\mu = \bar\psi\gamma^\mu\psi$$
Complex masses are also used in the discussion of unstable particles. Then the real part is the mass
and the imaginary part the decay width. But this has nothing to do with the present case. We have just
a single quark that has nothing it can decay to. Hence its mass must be real.
Chiral Anomalies. But the axial symmetry is broken in the quantum theory, because
there are one loop diagrams (the triangle diagrams computed in section 5.5.1) that do
not satisfy axial current conservation. The way this symmetry is violated is given by
Eq. (5.51). There it was written for an arbitrary current coupling to a triangle with two
gauge bosons, without specifying where the $\gamma_5$'s are in the triangle. In this case, we have
a triangle with two gluons, which do not have $\gamma_5$ couplings, and hence the $\gamma_5$ can only
come from the current itself. The relevant expressions are
$$\partial_\mu J_5^\mu = \frac{g_3^2}{8\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}$$
$$\partial_\mu J^\mu = 0$$
In sections 2.2 and 2.3 we have seen how variations corresponding to currents $J^\mu$ affect
the action. Let us adapt that discussion to the present case. Consider an x-dependent
variation
$$\psi \to e^{i\alpha(x)\gamma_5/2}\,\psi$$
Now the action is not invariant, because the kinetic terms of the fermions are not. We find
$$\bar\theta = \theta + \alpha$$
Hence we can never observe $\theta$ and $\alpha$ separately, only the linear combination $\bar\theta$.
At first sight this makes the problem worse. The mass terms seem a priori unrelated to
the value of $\theta$. So if we thought that we could solve the problem by finding an argument
why $\theta = 0$, we just learned that we also need an argument why the masses are real. This
might still look possible in this simple example, but as already stated above in the real
world the masses are obtained from Yukawa couplings, that must be complex matrices in
order to get CP-violation in the CKM matrix.
A massless up quark? But although this seems to make the problem worse, it also
offers a first glimpse at possible ways out. One possibility is that $m = 0$. If the mass is
zero, we can multiply it with an arbitrary phase. This phase then just shifts $\theta$, and we
can shift it to zero without encountering any change in the quark mass. It is sufficient to
have just one such massless quark, because there is just one parameter to shift. Note
that the electric dipole moment of the neutron, Eq. (4.11) vanishes if one of the light
quark masses is zero (this formula was derived under the assumption that all other quarks
are heavy, otherwise it would have been proportional to all quark masses).
But is there a massless quark in the real world? The lightest quark is the up quark and
its mass is $m_u = 2.2^{+0.6}_{-0.4}$ MeV [8]. This is more than five standard deviations away from
zero. Nothing about QCD would change qualitatively if $m_u = 0$, but it just does not seem
to be true. Furthermore, if indeed $m_u$ were to vanish this just leads to a problem that at
first sight is as puzzling as $\theta = 0$: why would just one of the quark masses vanish exactly?
Of course it is also possible that $m_u$ is not exactly zero, but just small. It should then be
small enough that the electric dipole moment of the neutron is below the current limit,
with $\theta$ of order 1. This requires $m_u$ of order $10^{-9}$ MeV. This is not only statistically very
unlikely in view of the aforementioned experimental results, but it also looks theoretically
very implausible (although that has not stopped people from pursuing this option).
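$$\mathcal{L} = -\tfrac14 G^a_{\mu\nu}G^{\mu\nu,a} + \theta\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu} + i\bar\psi\slashed{D}\psi + g\,\phi\,\bar\psi_L\psi_R + g^*\phi^*\,\bar\psi_R\psi_L \tag{5.56}$$
Here $g$ is a complex coupling constant and $\phi$ a complex field. In order to discuss the
vacuum expectation value of $\phi$ we first need the potential. We choose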
Now let us assume that $\mu^2 < 0$ so that $\phi$ gets a non-trivial vacuum expectation value. The
bottom of the Mexican hat is at a value $se^{i\delta}$ for an arbitrary phase $\delta$, but we just make
a convenient choice. Note that $g$ is already complex, so we gain nothing by allowing yet
another phase from the vacuum expectation value. So we set $\langle\phi\rangle = s$, with $s$ real. The
value of $s$ is of course determined by $\mu^2$ and $\lambda$. Now we expand $\phi$ around the vacuum.
One possible parametrization would be
$$\phi = \frac{1}{\sqrt{2}}\,(s + \sigma + ia)$$
where $a$ and $\sigma$ are real fields. But it is more convenient to expand in the following way
$$\phi = \frac{1}{\sqrt{2}}\,(s + \sigma)\,e^{ia/s} \tag{5.58}$$
which is the same to first order in the fields, but this expansion makes the higher order
terms come out in a nicer way: most of them disappear. This is possible because the
action has a global continuous symmetry called a Peccei-Quinn symmetry. This is a shift
symmetry of $a$ due to a combined phase symmetry of $\phi$, $\psi_L$ and $\psi_R$. Its action on the
fields is characterized by Peccei-Quinn charges. In this case these charges are 1, $\tfrac12$ and $-\tfrac12$
for $\phi$, $\psi_L$ and $\psi_R$ respectively.
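Two claims in this paragraph lend themselves to a quick numerical check: that the linear and exponential parametrizations agree to first order in the fields, and that the Peccei-Quinn charges leave the Yukawa coupling neutral. The sketch below assumes an overall factor $1/\sqrt{2}$ in both parametrizations and a coupling of the form $\phi\,\bar\psi_L\psi_R$. The symbol `sigma` for the radial field and all function names are illustrative.

```python
import cmath
import math

def phi_linear(s, sigma, a):
    """Linear parametrization (s + sigma + i a) / sqrt(2)."""
    return (s + sigma + 1j * a) / math.sqrt(2)

def phi_exponential(s, sigma, a):
    """Exponential parametrization (s + sigma) exp(i a / s) / sqrt(2)."""
    return (s + sigma) * cmath.exp(1j * a / s) / math.sqrt(2)

def difference(eps, s=1.0):
    """|phi_linear - phi_exponential| when both fluctuations have size eps."""
    return abs(phi_linear(s, eps, eps) - phi_exponential(s, eps, eps))

# If the mismatch starts at second order, shrinking the fields by 10
# shrinks the difference by about 100.
ratio = difference(1e-2) / difference(1e-3)

# Peccei-Quinn charges: phi * conj(psi_L) * psi_R must carry total charge zero.
q_phi, q_left, q_right = 1.0, 0.5, -0.5
yukawa_charge = q_phi - q_left + q_right

print(ratio, yukawa_charge)
```

The ratio close to 100 confirms that the two expansions differ only at quadratic order, which is why either can be used to read off the particle content.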
Note the similarity with the expansion we made in the discussion of the Higgs mechanism,
Eq. (3.17). It was used there to show that the phase degree of freedom disappears
from the action. There is also a very essential difference: in the Higgs mechanism the
phase degree of freedom becomes the longitudinal component of a massive gauge boson.
Here that is not the case; indeed, there is no gauge boson coupling to $\phi$. But the
parametrization of $\phi$ is useful for the same reason. One immediate advantage is that the
potential $V(\phi)$ is manifestly independent of the field $a$. We can also make the field $a$
disappear in the coupling to fermions. This requires a field transformation of the
fermions
$$\psi_L \to e^{ia(x)/2s}\,\psi_L, \qquad \psi_R \to e^{-ia(x)/2s}\,\psi_R \tag{5.59}$$
But we cannot make $a(x)$ disappear from the action completely, for two reasons. First
of all $a(x)$ depends on $x$, and hence if we substitute our parametrization in the kinetic
terms we will get terms proportional to $\partial_\mu a$. In the Higgs mechanism we make a gauge
transformation (3.18) to remove such terms, but we do not have such transformations
at our disposal here. The second reason is that the transformation (5.59) is anomalous.
Hence it cannot be turned into a gauge transformation anyway. The result of the anomaly
is that the transformation generates an additional term in the action
$$\Delta S = \int d^4x\,\frac{a(x)}{s}\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}, \tag{5.60}$$
Note that this is a dimension-5 operator: $a(x)$ is a boson field, and has dimension 1, and
$G\tilde G$ has dimension 4. This is why the coupling constant in this term is proportional to $s^{-1}$.
We should not be surprised to get dimension 5 operators, because we made a non-linear
(exponential) field transformation. If one does that with any other bosonic field one gets
an infinity of operators with dimension higher than 4.
Taking everything together we see now that $G\tilde G$ appears with a factor
$$\theta + \alpha + \frac{a(x)}{s} = \bar\theta + \frac{a(x)}{s}$$
This means that we have turned $\bar\theta$ into a dynamical variable. Rather than just a parameter
in the Lagrangian, $\bar\theta$ has become a field $a(x)$, and the value we observe is the vacuum
expectation of that field. So if we could think of some dynamics that could fix the vacuum
expectation value to a definite value, then we have determined the observed value of $\bar\theta$
dynamically. The field $a(x)$ is called the axion (the origin of the name will be explained
below).
Multiple Quarks. The example discussed above is unrealistic in several ways. First
of all we considered only one quark. This is easy to fix. We may generalize (5.56) to $N$
fermions
$$\sum_{i=1}^{N} i\bar\psi_i\slashed{D}\psi_i + \sum_{i,j=1}^{N} g_{ij}\,\phi\,\bar\psi_{Li}\psi_{Rj} + \sum_{i,j=1}^{N} g^\dagger_{ij}\,\phi^*\,\bar\psi_{Ri}\psi_{Lj} \tag{5.61}$$
When $\phi$ acquires a vev $s$, this gives rise to mass matrices $M_{ij} = sg_{ij}$. The first thing to
do is to diagonalize these mass matrices $sg_{ij}$ using $SU(N)_L \times SU(N)_R$ transformations.
Furthermore we can use diagonal $SU(N)$ transformations to make sure that all the eigenvalues
have a common phase $e^{i\alpha}$. Since $SU(N)$ transformations have no overall phases,
these transformations are not anomalous.
In the final step we remove the common phase, and then we encounter the anomaly,
as discussed above. The total phase removed in this way is given by the determinant of the
matrix $M_{ij}$: it equals $\arg\det M$. Then the Peccei-Quinn symmetry acts with charge $\tfrac12$ on all
$N$ $\psi_{Li}$ and with charge $-\tfrac12$ on all $N$ $\psi_{Ri}$, and as before with charge 1 on $\phi$. The triangle
diagram now has $N$ quarks contributing, so the anomaly will be $N$ times as large. Hence
the contribution of $a(x)$ to the action now becomes
$$\Delta S = \int d^4x\,N\,\frac{a(x)}{s}\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu} \equiv \int d^4x\,\frac{a(x)}{f_a}\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}, \tag{5.62}$$
where $f_a = s/N$.
Axion Effective Action. In order to discuss axion physics without having to worry
about specific models one uses the following effective action
$$\mathcal{L}_a = \tfrac12\,\partial_\mu a\,\partial^\mu a + \frac{a}{f_a}\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}, \tag{5.63}$$
This is the action for a free, massless boson with a non-renormalizable (dimension five)
coupling to the gauge bosons. No matter which axion model one considers, one always
ends up with an action of this form. The interaction term is generated by the anomaly
of the axial current, from a triangle diagram with two gluons. In addition to these terms
involving the axion field $a$ there are the other terms involving $G\tilde G$, already mentioned
above
$$\mathcal{L}_\theta = (\theta + \arg\det M)\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}, \tag{5.64}$$
From this effective action one can see immediately how shifting the axion field by a
constant can change the value of $\theta$ in the strong CP term.
The QCD-generated Axion Potential. Up to now it may have seemed that the
potential of the field $a$ is completely flat. It appears in the action only in the form of
derivatives $\partial_\mu a$, plus the coupling to $G\tilde G$. But also in this coupling the dependence on $a$ is
through $\partial_\mu a$, because $G\tilde G$ is a total derivative, and we can move the derivative to $a(x)$ by
partial integration. Hence classically the theory is invariant under shifts $a(x) \to a(x) + c$,
for any real $c$. But this is not going to be a symmetry of the full quantum theory, because
we also know that shifting the value of $a(x)$ changes $\theta$. For different values of $\theta$ we will
measure a different value of quantities like the electric dipole moment $d_n$ of the neutron,
so we really have different physics. And if the physics is different, the vacuum energy must
be different as well. Hence somehow QCD creates a non-trivial potential $V(a)$ on top of
the flat background we started with. This must be a non-perturbative effect, because
perturbatively the shift symmetry is exact.
We also know that the non-perturbative physics is periodic in $\theta$ with periodicity $2\pi$,
so clearly the potential must be periodic as well. Furthermore
$\langle\bar\theta\rangle = \langle \frac{a(x)}{s} + \theta\rangle = 0$ is a
special point. In that point $d_n = 0$, and for small fluctuations around that point $d_n \propto \langle\bar\theta\rangle$.
Since we would not expect the vacuum energy to depend on the sign of $\langle\bar\theta\rangle$, we expect that
$\langle\bar\theta\rangle = 0$ is either a local minimum or a local maximum of the vacuum energy. More detailed
arguments are needed to show that it is indeed a minimum [29]. The expansion of the
potential around the minimum gives rise to a mass for the axion.
A real computation of the axion mass requires non-perturbative QCD physics but
there is already something we can say simply because the potential is a periodic function
of the dimensionless combination $a/f_a$. The simplest possibility is then
$$V(a) = F\left[1 - \cos\left(a/f_a + \theta\right)\right], \tag{5.65}$$
and this is indeed what one gets from computations. For dimensional reasons, there must
be a pre-factor $F$ with dimension [mass]$^4$. This factor depends on the QCD scale and
the quark masses. Using current algebra techniques (not discussed in these lecture notes)
one can show that it is in fact equal to $(m_\pi f_\pi)^2$ times dimensionless ratios of quark masses.
The latter ratios vanish in the limit where one quark mass goes to zero, because as we
have seen in that limit QCD becomes invariant under shifts of the axion field.
If $V(a)$ is a function of $a/f_a$, even if it is not exactly a cosine, it follows that if we
expand it around its minimum the first term is proportional to $(a/f_a)^2$. It follows that a
rough approximation of the axion mass in terms of $f_a$ is
$$m_a \approx m_\pi\,\frac{f_\pi}{f_a}.$$
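As a rough numerical illustration of this estimate, the following sketch takes a cosine potential with prefactor $F = (m_\pi f_\pi)^2$ (dropping the dimensionless quark mass ratios), computes the curvature at the minimum by a finite difference, and compares the result with $m_\pi f_\pi/f_a$. The pion parameters and the sample values of $f_a$ are illustrative assumptions, not values fixed by the argument above.

```python
import math

M_PI = 135.0  # pion mass in MeV (approximate, assumed input)
F_PI = 130.0  # pion decay constant in MeV (assumed normalization)
BIG_F = (M_PI * F_PI) ** 2  # prefactor F of the potential, quark mass ratios dropped

def potential(a, f_a, theta=0.0):
    """Cosine potential V(a) = F [1 - cos(a/f_a + theta)], in MeV^4."""
    return BIG_F * (1.0 - math.cos(a / f_a + theta))

def mass_from_curvature(f_a):
    """Axion mass sqrt(V''(0)) from a central finite difference (theta = 0)."""
    h = 1e-3 * f_a  # step small compared to the period 2*pi*f_a
    curvature = (potential(h, f_a) - 2.0 * potential(0.0, f_a)
                 + potential(-h, f_a)) / h**2
    return math.sqrt(curvature)

def mass_estimate(f_a):
    """Rough estimate m_a ~ m_pi * f_pi / f_a, in MeV."""
    return M_PI * F_PI / f_a

for f_a_gev in (1e2, 1e9, 1e12):  # sample axion scales in GeV (assumptions)
    f_a = 1e3 * f_a_gev  # convert GeV to MeV
    print(f"f_a = {f_a_gev:.0e} GeV: from curvature {mass_from_curvature(f_a):.4e} MeV, "
          f"estimate {mass_estimate(f_a):.4e} MeV")
```

The two numbers agree because the quadratic term of the cosine is $\tfrac12 (F/f_a^2)\,a^2$; any other periodic function of $a/f_a$ would give the same scaling with a different coefficient of order one.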
Problems with global symmetries. Since obviously flatness of the potential without
non-perturbative QCD effects is essential, we need to rethink the potential (5.57). This
looks like the most general potential of a complex field, but of course we do not really
know if we should regard $\sigma$ as a complex field, or two real fields $\sigma_1$ and $\sigma_2$, with $\sigma = \sigma_1 + i\sigma_2$.
In the latter case, more general potentials are possible. Indeed, even the mass term could
take the form $m_1^2\sigma_1^2 + m_2^2\sigma_2^2$. This would ruin the entire argument. The only way to
justify the potential (5.57) is to insist on the phase symmetry $\sigma \to e^{i\alpha}\sigma$. But here we run
into a potential contradiction with folk theorems in theories of gravity. It is generally
believed that a theory of quantum gravity does not allow continuous global symmetries.
The argument goes like this: a continuous global symmetry gives rise to exactly conserved
charges. But if you throw such a charge into a black hole it is gone, and hence apparently
not conserved. If the symmetry is local, i.e. a gauge symmetry, then each charge comes
with an electric field that stretches out to infinity, and provides a permanent record of
what went into the black hole.
The way out is that $\sigma \to e^{i\alpha}\sigma$ is not really a global symmetry. When combined with
the action on fermions, it has an anomaly. But if one thinks in terms of general scalar
fields in a theory of quantum gravity, this implies that there must be a somewhat mys-
terious feedback. Somehow the scalar potential knows that a certain global symmetry
is allowed, because the phase symmetry must be realized on fermions to keep Yukawa
couplings invariant, and the action on fermions is anomalous. This is generally consid-
ered to be the weakest point of the Peccei-Quinn mechanism. We can simply postulate
a potential (5.57), but is this really consistent with a fundamental theory of quantum
gravity? If quantum gravity abhors all continuous global symmetries, does it make sense
to postulate such a potential at all? These are questions we cannot address without a
concrete theory of quantum gravity.
To get rid of the phases in the mass matrix we may use these extra phase transformations,
but now we have to be careful with anomalies. We denote the four phases as ei , ei , ei
and ei respectively. The phase transformations have the following effect on the quark
masses
mU ei mU ei
mU ei mU ei
The implications are the same as before. It is hard to reconcile vanishing $\bar\theta$ with a complex
CKM matrix. It is not impossible, and in particular it is possible that the CKM matrix
is complex and that $\det M$ is nevertheless real, but it is not clear how to arrange that
in a fundamental way. There exist ideas in the literature exploring this route, but the
resulting models look rather contrived.
want to solve the strong CP problem in this way (and in fact in any other known way)
we have to extend the Standard Model.
Axions in Extensions of the Standard Model. One very simple solution is suggested
by the previous discussion. In section 5.6.1 we saw that a single quark coupling to
a complex singlet Higgs can do the job. This cannot be one of the known quarks, because
we already know experimentally that they must get their mass from a doublet Higgs. But
we can postulate a new quark with left- and right-handed components that couple to
$SU(3)$ in the usual way, and that gets its mass from a new scalar Higgs field $\sigma$,
exactly as in (5.56). The PQ charges of $\sigma$, $\psi_L$ and $\psi_R$ are respectively $1$, $\tfrac12$, $-\tfrac12$, just as
in the example discussed above. The new quark must be heavy enough to have escaped
observation so far, but this by itself is not a big challenge, because it gets its mass from
a different Higgs boson than the known quarks.
But adding extra weak-singlet quarks may look a bit awkward. An example that works
without adding extra quarks is the two-Higgs model. Instead of the single Higgs of the
Standard Model one introduces two Higgses, one that couples to the down quarks (and
the leptons), and one that couples to the up quarks. That this should work is already
clear from the previous section, because now we can assign different PQ-charges to $\phi_d$
and $\phi_u$, whereas with a single Higgs field the charges of $\phi$ and $C\phi^*$ must be opposite.
Considering only the quark sector, the Yukawa couplings are thus
$$\mathcal{L}_Y = -g_U^{\alpha\beta}\,\bar\psi_L^{Q,\alpha}\,\phi_u\,\psi_R^{U,\beta} - g_D^{\alpha\beta}\,\bar\psi_L^{Q,\alpha}\,\phi_d\,\psi_R^{D,\beta} + \text{c.c.}, \tag{5.67}$$
where $\phi_u$ takes over the role of $C\phi^*$. Hence $\phi_d$ is in the usual Higgs boson representation
$(1, 2, \tfrac12)$ whereas $\phi_u$ is in the representation $(1, 2, -\tfrac12)$. The new element is that now we
can rotate the phases of the up and down mass matrices independently, by phase rotations
of $\phi_d$ and $\phi_u$, whereas previously we could only rotate the Higgs field $\phi$. In the Standard
Model the down quarks couple to $\phi$ and the up quarks to $C\phi^*$; therefore any phase rotation
of $\phi$ cancels in $\arg\det M$. Hence in the Standard Model $\arg\det M$ is fully determined by
the Yukawa couplings, but in the Peccei-Quinn model it is not.
To define the Peccei-Quinn symmetry of this theory we can choose charges 1 for both
$\phi_d$ and $\phi_u$, charges $\tfrac12$ for all components $\psi_L$ and charges $-\tfrac12$ for all components $\psi_R$. This
rotates left and right components of the quarks with opposite phases, and hence it is
anomalous and can rotate the $\theta$-angle away. For this to be a symmetry of the entire
Lagrangian, it must be a symmetry of the Higgs potential. Terms like $\phi_d\phi_u$, $(\phi_d\phi_u)^2$ or
$\phi_i^\dagger\phi_i\,\phi_d\phi_u$ (which are allowed by $SU(2)\times U(1)$) must be absent, since they are not invariant
under this symmetry. Let us assume that the Higgs potential has that property. This can
be imposed by requiring that the Peccei-Quinn symmetry is an exact global symmetry of
the classical Lagrangian. This means that it is preserved by all vertices, and hence it will
be preserved by all loop diagrams, so if the unwanted terms in the potential are absent
at tree level, they will not be generated by loop diagrams.
This includes the anomalous triangle diagram, because this has no external line attached to the top of
the triangle, and hence cannot contribute to any perturbative amplitude computation.
The Axion in the Two-Higgs Model. Let us see how the axion appears in the two-Higgs
model introduced above. Weak symmetry breaking in this two-Higgs model occurs
analogously to the one-Higgs model with fields $\phi$ and $C\phi^*$. However, since these are now
two unrelated fields, the absolute values of their vevs are now unrelated as well:
$$\langle\phi_d\rangle = \frac{1}{\sqrt2}\begin{pmatrix}0\\ v_d\end{pmatrix};\qquad \langle\phi_u\rangle = \frac{1}{\sqrt2}\begin{pmatrix}v_u\\ 0\end{pmatrix}. \tag{5.68}$$
In combination with the aforementioned chiral phase transformations of the fermions
the theory has a global symmetry which acts non-trivially on the vacuum. Hence it is
spontaneously broken, and one gets a massless Goldstone boson corresponding to the
symmetry, the axion. It is easy to see which linear combination of the variations of $\phi_u$
and $\phi_d$ around the vacuum (5.68) is the axion field. One expands around the vacuum
as
$$\phi_d = \frac{1}{\sqrt2}\begin{pmatrix}\chi_d + i\rho_d\\ v_d + \eta_d + i\xi_d\end{pmatrix};\qquad \phi_u = \frac{1}{\sqrt2}\begin{pmatrix}v_u + \eta_u + i\xi_u\\ \chi_u + i\rho_u\end{pmatrix}. \tag{5.69}$$
One linear combination of the phase fluctuations $\xi_u$ and $\xi_d$ is eaten by the $Z$-boson,
namely $(v_u\xi_u - v_d\xi_d)$. However, because of the Peccei-Quinn symmetry the combination
$a = (v_d\xi_u + v_u\xi_d)$ remains massless. This is the axion field. Note that the two complex
doublet fields $\phi_u$ and $\phi_d$ have eight real components. Three of these are eaten by the $Z$
and $W$ fields, and hence five physical fields are left. They consist of two neutral massive
scalars (one of which should correspond to the observed Higgs boson), a massive charged
scalar, and a massless scalar, the axion.
However, as in the example we discussed earlier, it is more convenient to make a
non-linear expansion similar to (5.58):
$$\phi_d = \frac{1}{\sqrt2}\begin{pmatrix}\chi_d + i\rho_d\\ v_d + \eta_d - i\xi_d\end{pmatrix}e^{ia(x)/v_a};\qquad \phi_u = \frac{1}{\sqrt2}\begin{pmatrix}v_u + \eta_u - i\xi_u\\ \chi_u + i\rho_u\end{pmatrix}e^{ia(x)/v_a}. \tag{5.70}$$
The rest of the discussion then goes as before. We can remove the dependence of the
Lagrangian on $a$ apart from derivatives and a coupling to $F\tilde F$. Of course the value of $v_a$
is related to $v_u$ and $v_d$. One can determine this relation by expanding the kinetic terms
of $\phi_u$ and $\phi_d$ and requiring that the resulting kinetic terms for the axion field have the
canonical form $\tfrac12\,\partial_\mu a\,\partial^\mu a$. It turns out that $v_a = \sqrt{v_u^2 + v_d^2}$. Then $v_a$ is related to $f_a$ by a
numerical factor that depends on the fermions in the anomaly triangle, because ultimately
the coupling to QCD must be brought to the canonical form (5.62).
$\bar u_R u_L$, $\bar d_R d_L$, $\bar s_R s_L$ as well as the phases of $\phi_d$ and $\phi_u$. In the absence of instantons
(non-perturbative QCD-contributions), quark masses and $W$, $Z$-bosons all these five particles
are massless Goldstone bosons. Due to instantons one combination, the $\eta'$, is not a
Goldstone boson in any reasonable approximation (see section 5.5.7); one combination,
to first approximation the relative phase of $\phi_d$ and $\phi_u$, is eaten by the $Z$; one combination
becomes the $\pi^0$ and another one the $\eta$; and finally the fifth linear combination, essentially
the common phase of $\phi_d$ and $\phi_u$, is the axion. For its mass Weinberg finds
$$m_A = \frac{N_f\, m_\pi f_\pi}{2\sqrt{m_u + m_d}}\left[\frac{m_u m_d m_s}{m_u m_d + m_u m_s + m_d m_s}\right]^{1/2}\frac{2^{1/4}\,G_F^{1/2}}{\sin 2\alpha}. \tag{5.71}$$
Here $N_f$ is the number of flavors (meanwhile known to be six), $G_F$ the Fermi constant
($G_F = \tfrac18\sqrt2\, g_2^2/M_W^2$), $f_\pi$ the pion decay constant measurable in pion decay, and $\alpha$ parametrizes
the ratio of the v.e.v.s of the two scalars $\phi_d$ and $\phi_u$ in Eqn. (5.68): $\tan\alpha = v_d/v_u$. Apart
from this, all parameters in the formula are known, and in particular there is no unknown
QCD instanton-generated matrix element appearing. All QCD effects are encapsulated
in the pion mass and the pion decay constant. This is possible because the pion is a
pseudo-scalar pseudo-Goldstone boson, just like the axion. Hence measured parameters of
the pion and its properties can be used in the computation of the axion mass.
Numerically one finds for the mass of the axion (for $N_f = 6$)
$$m_A \approx \frac{140\ \text{keV}}{\sin 2\alpha}. \tag{5.72}$$
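To get a feeling for the size of the various factors one can evaluate Weinberg's expression numerically. The sketch below assumes the formula in the form $m_A = \frac{N_f m_\pi f_\pi}{2\sqrt{m_u+m_d}}\big[\frac{m_u m_d m_s}{m_u m_d + m_u m_s + m_d m_s}\big]^{1/2}\frac{2^{1/4}G_F^{1/2}}{\sin 2\alpha}$, with illustrative light quark masses and two different normalizations of the pion decay constant. All input numbers are assumptions made for this example, and the result shifts with them, so only the order of magnitude should be compared with the value quoted above.

```python
import math

G_F = 1.1664e-11  # Fermi constant in MeV^-2 (i.e. 1.1664e-5 GeV^-2)

def weinberg_axion_mass(n_f, m_pi, f_pi, m_u, m_d, m_s, sin_2alpha=1.0):
    """Evaluate the axion mass formula; all masses in MeV, result in MeV."""
    quark_ratio = (m_u * m_d * m_s) / (m_u * m_d + m_u * m_s + m_d * m_s)
    prefactor = n_f * m_pi * f_pi / (2.0 * math.sqrt(m_u + m_d))
    weak_factor = 2.0 ** 0.25 * math.sqrt(G_F) / sin_2alpha
    return prefactor * math.sqrt(quark_ratio) * weak_factor

# Illustrative inputs (assumptions): pion mass and light quark masses in MeV.
inputs = dict(n_f=6, m_pi=135.0, m_u=4.2, m_d=7.5, m_s=150.0)

# Two normalizations of the pion decay constant (assumed here): 130 MeV and 190 MeV.
for f_pi in (130.0, 190.0):
    m_a_kev = 1e3 * weinberg_axion_mass(f_pi=f_pi, **inputs)
    print(f"f_pi = {f_pi:5.1f} MeV  ->  m_A ~ {m_a_kev:6.1f} keV / sin(2 alpha)")

# Sanity check: the mass vanishes if one light quark is massless.
print("massless up quark:", weinberg_axion_mass(6, 135.0, 130.0, 0.0, 7.5, 150.0))
```

The last line reflects the fact that the quark mass ratio in square brackets goes to zero as soon as any one of the three light quark masses does.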
Note that the axion mass is proportional to $m_\pi$ so that it vanishes in the chiral limit, and
to the masses of the three light quarks. The latter dependence is a consequence of the
fact that if one of the quarks is massless the theory becomes independent of $\theta$ (and hence
of $a$) as discussed before. Then the potential is flat in the axion direction, and hence the
axion is massless. One cannot take the other quark masses to zero in this formula because
the calculation was done in the limit where their masses are much larger than the QCD
scale.
This axion is not stable and would decay into two photons with a lifetime of about
$10^{-2}$ seconds. Experimentally such a particle has not been seen. This was the situation in
1978. But as we will see in a moment, that is not quite the end of the story. This is why
we called this the "original QCD axion" in the title of this subsection.
the vacuum expectation value of $\sigma$, which determines the mass of one or more quarks. In
the two-Higgs model, $f_a$ is directly related to the Higgs vev. If one makes the assumption
that $f_a$ is related to the weak scale, $f_a$ could be about 100 GeV, while $f_\pi$ is about 130
MeV. This agrees with Weinberg's formula, because in that case $f_a \approx v \approx (G_F)^{-1/2}$. This
would make the axion about a factor 1000 lighter than the pion. This gives a mass of
100 keV, in agreement with the more precise calculation of Weinberg.
However, several authors [19, 27, 35, 6] realized a few years later that the axion scale
does not have to be related to the weak scale at all. Indeed, the scalar $\sigma$ introduced above
is not the Standard Model Higgs field, as we have seen. If $\sigma$ is just an additional scalar
field, its vacuum expectation value can be increased so that the axion mass and couplings
go to zero. In this way one can hope to make the axion "invisible".
arises from an anomaly diagram, analogous to the $\pi^0$ coupling to two photons (see next
subsection). The hope is to use strong electromagnetic fields to make the axion collide
with a photon, and emit another photon.
$$\mathcal{L}_{a\gamma\gamma} = \frac{a}{f_a'}\,\frac{e^2}{16\pi^2}\,F_{\mu\nu}\tilde F^{\mu\nu}, \tag{5.73}$$
where $F_{\mu\nu}$ is the electromagnetic field strength tensor. In specific models such a term is
generated by anomaly triangles with two photons, coupling to the axial current times the
Peccei-Quinn charge of the fermion. If there are no fermions with electric charge coupling
to the axion, this could in principle vanish. However, first of all the axion must couple
to quarks (either the ones we know or additional, heavy ones) and the standard rules of
Standard Model charge quantization make it very difficult to make an electrically neutral
quark. Secondly, the axion mixes with mesons, for example the $\pi^0$, and hence it will have
a two-photon coupling via this mixing. In an effective theory there is therefore no good
argument to set this coupling to zero, but it could happen to be small. This two-photon
coupling is important in attempts to observe axions. The vast majority of axion search
experiments rely on it.
There is indeed a large number of such experiments underway. Most use the fact
that via the two photon coupling, an axion in a strong magnetic field can convert to a
photon or vice-versa (think of the magnetic field as one of the two photons). An intriguing
example is the shining through a wall phenomenon. One aims photons at a wall with
strong magnetic fields on both sides of that wall. One magnetic field may convert the
photon to an axion; this axion can pass through the wall because it barely interacts with
matter; and on the other side of the wall the axion is reconverted into a photon, giving
the impression that the photon passed through the wall. This process depends on two
unlikely events (two photon-axion conversions) but certain regions of the parameter space
of fa and the axion-photon coupling can be explored. Other types of experiments are
helioscopes, looking for axions coming from the sun, and haloscopes, looking for axion
dark matter in the halo of our galaxy.
suppose
$$\psi_L \to e^{iQ_L a(x)/2s}\,\psi_L, \qquad \psi_R \to e^{iQ_R a(x)/2s}\,\psi_R.$$
Then we find
$$i\bar\psi_L\gamma^\mu\partial_\mu\psi_L + i\bar\psi_R\gamma^\mu\partial_\mu\psi_R - \frac{1}{2s}\,\partial_\mu a(x)\left[Q_L\,\bar\psi_L\gamma^\mu\psi_L + Q_R\,\bar\psi_R\gamma^\mu\psi_R\right].$$
Conservation of the vector current implies that the vector part of the second term vanishes
upon partial integration. But for a massive particle the axial vector current is not conserved:
$$\partial_\mu\left(\bar\psi\gamma^\mu\gamma_5\psi\right) = 2m\,\bar\psi\gamma_5\psi.$$
Note that this is simply due to the mass, and not the chiral anomaly. We want to
apply this result to the electron, which does not couple to QCD. One can do the same
computation for quarks, but then there will be two terms, one proportional to the quark
mass, plus the coupling to $G\tilde G$ generated by the anomaly.
Hence the final result for the electron-axion coupling is
$$\frac{1}{2s}\,m_e\,(Q_L - Q_R)\,a(x)\,\bar\psi\gamma_5\psi.$$
Note that this depends on the Peccei-Quinn charges of the electron, which could be zero.
If they are zero, the fact that the axion mixes with mesons does not change the result,
because mesons do not have a direct coupling to electrons. Hence it is not guaranteed
that axions couple to leptons.
If there is more than one axion, their action and coupling to QCD will take the form
$$\mathcal{L}_a = \sum_i\left[\tfrac12\,\partial_\mu a_i\,\partial^\mu a_i + \frac{a_i}{f_{a_i}}\,\frac{g_3^2}{16\pi^2}\,\mathrm{Tr}\,G_{\mu\nu}\tilde G^{\mu\nu}\right]. \tag{5.74}$$
In principle each axion might have a coupling to QCD, and this coupling depends on the
way the axion interacts with colored particles. However, in this situation we may define
the QCD axion as
$$\frac{a}{f_a} = \sum_i \frac{a_i}{f_{a_i}}, \tag{5.75}$$
and we choose for all other axions a basis that is orthogonal to this. Then only the QCD
axion gets a mass from QCD effects, and all others remain massless.
The other axions must however somehow get a mass in another way, or we would end
up with an ungauged continuous symmetry. So one may ask what happens if we assume
that all axions have an additional explicit mass term, $\tfrac12 m_i^2 a_i^2$. In principle this does not
have to be diagonal, but we may assume that the coupling to QCD we wrote above is in
terms of eigenstates $a_i$ of the explicit masses. The QCD generated axion potential plus
the mass terms is in this case (we drop the axion label $a$ on $f_{a_i}$)
$$V(a_i) = F\left[1 - \cos\left(\sum_i \frac{a_i}{f_i} + \theta\right)\right] + \sum_i \tfrac12 m_i^2 a_i^2. \tag{5.76}$$
Here $F$ is the parameter introduced in (5.65); its value is roughly $m_\pi^2 f_\pi^2$. The equations
of motion determining the minimum of the combined potential are
$$\frac{F}{f_j}\sin\left(\sum_i \frac{a_i}{f_i} + \theta\right) + m_j^2 a_j = 0.$$
Multiplying with $f_j$ and subtracting the equations from each other we find
$$f_j m_j^2 a_j = f_k m_k^2 a_k = -F\sin\left(\sum_i \frac{a_i}{f_i} + \theta\right).$$
The combination $f_j m_j^2 a_j$ is therefore the same for every $j$; call it $x$, and define
$R = \sum_i 1/(f_i^2 m_i^2)$, so that $\sum_i a_i/f_i = xR$.
Note that $F$ and $x$ have dimension 4 and $R$ has dimension $-4$. Then the equation reads
$$x = -F\sin\left(xR + \theta\right).$$
We are only going to solve the problem if the argument of the sine is small. So let us
assume that it is, and check afterwards if this assumption is correct. If the argument is
small, we can expand the sine, and we find
$$x = -\frac{F\theta}{1 + FR}.$$
Hence the argument of the sine is
$$xR + \theta = \frac{\theta}{1 + FR}.$$
This is of course also the value of the argument of the sine at the minimum, or in other
words the physical $\theta$ angle we would observe. We must assume that $\theta$ is of order 1, so
the condition for the argument of the sine being small is that $FR \gg 1$. Assume the
axion with minimal value of $m_i^2 f_i^2$ has label $k$. Then $F \gg m_k^2 f_k^2$. Now $F$ is a QCD
parameter approximately equal to $m_\pi^2 f_\pi^2$, so we see that the condition for the validity of
the approximation is that there is one axion label $k$ so that for that axion $m_\pi^2 f_\pi^2 \gg m_k^2 f_k^2$.
Furthermore the observed $\theta$-angle is
$$\theta_{\rm phys} = \frac{\theta}{1 + FR} \approx \theta\,\frac{m_k^2 f_k^2}{m_\pi^2 f_\pi^2}.$$
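The small-angle solution can be compared with a numerical solution of the full minimum condition. Writing $u = xR + \theta$ for the argument of the sine, the equation $x = -F\sin(xR + \theta)$ becomes $u + FR\,\sin u = \theta$, which depends only on the dimensionless product $FR$. The sketch below solves this by bisection and compares with $\theta/(1 + FR)$; the value $\theta = 1$ and the sample values of $FR$ are arbitrary choices made for illustration.

```python
import math

def sine_argument_exact(theta, fr, iterations=200):
    """Solve u + FR*sin(u) = theta for u = x*R + theta by bisection on [0, theta].

    For 0 < theta <= pi/2 the left-hand side is monotonic on the bracket,
    so the root found there is unique.
    """
    lo, hi = 0.0, theta
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + fr * math.sin(mid) < theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sine_argument_small_angle(theta, fr):
    """Small-angle result theta / (1 + FR)."""
    return theta / (1.0 + fr)

theta = 1.0  # bare angle of order one (assumed)
for fr in (1.0, 1e2, 1e4, 1e10):  # sample values of the dimensionless product F*R
    exact = sine_argument_exact(theta, fr)
    approx = sine_argument_small_angle(theta, fr)
    print(f"FR = {fr:.0e}: exact {exact:.6e}, small-angle {approx:.6e}")
```

For $FR$ of order one the two numbers differ visibly, while for $FR \gg 1$ they coincide to high accuracy, which is the self-consistency check asked for in the text.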
The approximation is valid if there is just one lightest axion separated by a substantial
gap from the next-to-lightest one. If there are more light ones the value of $\theta_{\rm phys}$ becomes
smaller; for example for $M$ degenerate lightest axions with the same values of both $m_i$
and $f_i$ the result is reduced by a factor $M$. Observe that $\theta_{\rm phys}$ approaches zero if $m_k$ goes
to zero. The reason for this is clear: in that limit we obtain an axion with an exact shift
symmetry. The dependence on $f_i$ is also clear: for $f_i \to \infty$ the axion decouples from
QCD, so then even a very light axion becomes useless. If we make the axion scale very
large, we know that the QCD generated mass of a single axion is $M_{a_i} \approx m_\pi f_\pi/f_{a_i}$. The
condition for small $\theta_{\rm phys}$ reads then
$$m_k \ll M_{a_k},$$
which must hold for at least one axion. Since $\theta_{\rm phys} \approx \theta\, m_k^2/M_{a_k}^2$, to get the required tuning of $\theta_{\rm phys}$ to a value smaller
than $10^{-10}$ we need $m_k^2 < 10^{-10} M_{a_k}^2$, i.e. $m_k < 10^{-5} M_{a_k}$. In words, there must be at least one axion whose
explicit mass squared is ten orders of magnitude smaller than its QCD-generated mass squared. The
existence of axions with intermediate masses is irrelevant.
Finally we compute the masses of the axions. The mass matrix is given by the second
derivative matrix at the minimum. The result is
$$M_{ij}^2 = \frac{F}{f_i f_j}\cos(\theta_{\rm phys}) + m_i^2\delta_{ij} \approx C\,m_\pi^2\,\frac{f_\pi^2}{f_i f_j} + m_i^2\delta_{ij},$$
where $C$ is a factor of order 1, proportional to the quark masses. To simplify the notation,
define
$$M_i = \frac{\sqrt{F}}{f_i}.$$
The first term is a matrix of the form $M_i M_j$, and has only one non-zero eigenvalue, with
eigenvector $M_i$, and eigenvalue proportional to $\vec M^2$. All vectors orthogonal to $\vec M$ have
eigenvalue 0. Hence without the mass terms $m_i$ there is one axion with mass squared $\sum_i M_i^2$,
and all the others have zero mass. For $m_i$ large, we can ignore the first term. Hence
heavy axions just keep their explicit mass $m_i$, and do not contribute to the Peccei-Quinn
mechanism.
Let us analyse this explicitly for 2 axions. The QCD-generated mass contribution plus
the explicit mass term together generate a mass matrix of the following form
$$\begin{pmatrix} M_1^2 + m_1^2 & M_1 M_2\\ M_1 M_2 & M_2^2 + m_2^2\end{pmatrix}.$$
The eigenvalues are
$$\tfrac12\left[M_1^2 + M_2^2 + m_1^2 + m_2^2 \pm \sqrt{(M_1^2 + M_2^2)^2 + (m_1^2 - m_2^2)^2 + 2(M_1^2 - M_2^2)(m_1^2 - m_2^2)}\,\right].$$
There are two cases of special interest: if one of the $m_i$ is large with respect to the
QCD-generated axion masses $M_i$, and if both $m_i$ are small.
Case 1: one heavy axion. We see that if $m_2$ is much larger than all other masses the
dominant term in the argument of the square root is $m_2^4$, and hence we can expand it as
follows
$$m_2^2\sqrt{1 - 2\frac{m_1^2}{m_2^2} - 2\frac{M_1^2 - M_2^2}{m_2^2}} \approx m_2^2 - m_1^2 - M_1^2 + M_2^2.$$
Hence the eigenvalues are approximately $m_2^2$ and $M_1^2 + m_1^2$. The latter is the QCD axion mass squared. The PQ
mechanism will only work if $m_1^2 < 10^{-10} M_1^2$, as follows from the expression for $\theta_{\rm phys}$. The value of $M_2$ is irrelevant.
It is determined by the axion coupling of $a_2$, but $a_2$ decouples and does not participate
in the PQ-mechanism.
Case 2: two light axions. If both axions are light, the dominant term in the square
root is $(M_1^2 + M_2^2)^2$, and we get the approximation
$$(M_1^2 + M_2^2)\sqrt{1 + 2(m_1^2 - m_2^2)\frac{M_1^2 - M_2^2}{(M_1^2 + M_2^2)^2}} \approx M_1^2 + M_2^2 + (m_1^2 - m_2^2)\frac{M_1^2 - M_2^2}{M_1^2 + M_2^2}.$$
Then one eigenvalue is $M_1^2 + M_2^2$ and the other is
$$\frac{m_1^2 M_2^2 + m_2^2 M_1^2}{M_1^2 + M_2^2} = \frac{m_1^2 f_1^2 + m_2^2 f_2^2}{f_1^2 + f_2^2}.$$
Note that now the large eigenvalue is the one of the QCD axion, and it has the expected
mass squared, $M_1^2 + M_2^2$. The light particle has a mass much less than the axion mass. Note that
we need $m_i \ll M_j$ only to make the square root expansion valid. It is sufficient that only
one of the two axions has a mass $\ll M_i$ for the PQ-mechanism to work, as we have seen
above.
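Both limiting cases can be checked against the exact eigenvalues of the two-axion mass matrix $\begin{pmatrix} M_1^2 + m_1^2 & M_1 M_2\\ M_1 M_2 & M_2^2 + m_2^2\end{pmatrix}$. The following sketch does this for one arbitrarily chosen set of numbers per case (in arbitrary units, assumed only for illustration). The light eigenvalue is obtained from the determinant divided by the heavy eigenvalue, which avoids the numerical cancellation in the difference of two nearly equal terms.

```python
import math

def two_axion_eigenvalues(M1, M2, m1, m2):
    """Exact eigenvalues (light, heavy) of the 2x2 mass-squared matrix
    [[M1^2 + m1^2, M1*M2], [M1*M2, M2^2 + m2^2]]."""
    trace = M1**2 + M2**2 + m1**2 + m2**2
    root = math.sqrt((M1**2 + M2**2) ** 2 + (m1**2 - m2**2) ** 2
                     + 2.0 * (M1**2 - M2**2) * (m1**2 - m2**2))
    heavy = 0.5 * (trace + root)
    # det = light * heavy; the M1^2 M2^2 terms cancel analytically.
    det = M1**2 * m2**2 + m1**2 * M2**2 + m1**2 * m2**2
    return det / heavy, heavy

# Case 1: one heavy axion (m2 much larger than everything else).
light, heavy = two_axion_eigenvalues(M1=1.0, M2=2.0, m1=0.01, m2=1000.0)
case1 = (light, 1.0**2 + 0.01**2, heavy, 1000.0**2)

# Case 2: two light axions (m1, m2 much smaller than M1, M2).
M1, M2, m1, m2 = 1.0, 2.0, 1e-3, 2e-3
light, heavy = two_axion_eigenvalues(M1, M2, m1, m2)
light_approx = (m1**2 * M2**2 + m2**2 * M1**2) / (M1**2 + M2**2)
case2 = (light, light_approx, heavy, M1**2 + M2**2)

print("case 1 (light, M1^2+m1^2, heavy, m2^2):", case1)
print("case 2 (light, approx, heavy, M1^2+M2^2):", case2)
```

In case 1 the exact heavy eigenvalue is closer to $m_2^2 + M_2^2$ than to $m_2^2$, but the difference is negligible when $m_2 \gg M_2$.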
5.6.10 Multiple gauge group factors
One may wonder why we only consider the $SU(3)$ factor here. The discussion of axions
started with the desire to solve the strong CP problem, which concerns the $G\tilde G$ term in
QCD. But we could have introduced a similar term in the weak interactions and also a
term $F\tilde F$ in the $Y$-factor of the standard model. So what happens to the corresponding
$\theta$-parameters?
The answer is different in these two cases. In a $U(1)$ theory there are no non-perturbative
effects, and $F\tilde F$ is a total derivative of a gauge invariant current. This
can easily be checked explicitly in Eq. (4.10). In $SU(2)$ the weak $\theta$ term is potentially
physical, but it can be rotated to zero using a global symmetry. This symmetry is baryon
number (or lepton number). Baryon number has an anomaly with respect to $SU(2)$, and
hence a baryon number phase rotation changes $\theta_{\rm weak}$. Furthermore, apart from the $SU(2)$
anomaly, baryon number is an exact symmetry of the Standard Model. It is instructive
to compare this with the analogous phase rotation of $\theta_{\rm strong}$. Here the global transformation
is the axial rotation $\psi \to \exp(i\alpha\gamma_5)\psi$. But this symmetry is broken not just by the
anomaly, but also explicitly by the Yukawa coupling terms. This led to a link between the
rotation of $\theta_{\rm strong}$ and the phases in the quark masses. If we bring the latter in the physically
preferred form (real masses) we cannot simultaneously bring $\theta_{\rm strong}$ in the preferred
form (zero).
It may well be that baryon number is broken not just by the anomaly with the weak
interactions, but also explicitly. This happens in Grand Unified Theories (discussed in
chapter 8) and many people expect gravity to break baryon number as well. But for now
we have not seen protons decay. The anomaly-generated decay is too weak to observe
anyway, but decay by other mechanisms may be observable in the future. Once these have
been observed, there will be additional Standard Model parameters we can measure, and
some of those will be modified by baryon number rotations. If there is a natural canonical
basis for these proton decay parameters, then we can in principle measure $\theta_{\rm weak}$ with
respect to that basis. But it will probably be simpler to define $\theta_{\rm weak} = 0$, and define the
proton decay parameters with respect to that choice. But before we can even discuss that,
we need to find evidence of proton decay, and then measure CP-violating phases in such
processes. This is not going to happen anytime soon.
A related problem is that of the QCD axion coupling to other gauge groups. These
could be $SU(2)_{\rm weak}$ or $U(1)_Y$. We have already seen such couplings. They give rise to the
two-photon coupling of the axion, and also to couplings of the axion to $W^+W^-$ or to $ZZ$. But will these
couplings affect the PQ-mechanism or the axion mass? The answer is no: precisely for the
reasons discussed above, there is no $U(1)$ generated contribution to the axion potential,
and there is no $SU(2)_{\rm weak}$ generated one either (except perhaps a tiny contribution if
baryon number is broken by new physics).
But there might exist additional non-abelian interactions that we have not seen yet
or cannot see at all. There can be an entire strong interaction sector acting only on
dark matter. There may exist non-perturbative effects that have no corresponding gauge
group. Such phenomena have been found in string theory, and go by the name "exotic
instantons". Indeed, such contributions may be needed to give mass to the large number
of axions these theories sometimes produce, since exactly massless axions point to an
inconsistency (see the discussion of global symmetries above). If there are multiple
non-perturbative contributions to the axion potential, it will look like this
$$V(a_i) = \sum_\alpha F_\alpha\left[1 - \cos\left(\sum_i \frac{a_i}{f_{\alpha i}} + \theta_\alpha\right)\right].$$
Let $N$ be the number of axions and $M$ the number of non-perturbative contributions, and
denote the sine of the argument of the $\alpha$-th cosine by $S_\alpha$. The equations of motion are
then $\sum_\alpha (F_\alpha/f_{\alpha i})\,S_\alpha = 0$, one for each axion $a_i$.
If $N \geq M$ this generically implies $S_\alpha = 0$. Then all sines vanish, and all their arguments
must be zero. This is true if the matrix $F_\alpha/f_{\alpha i}$ is non-degenerate. For example, if there
is a single axion and a single group, it still will not reduce $\theta$ to zero if it does not couple to
$G\tilde G$ (i.e. $f_{11} \to \infty$).
If we ignore degeneracies, roughly the following will happen. Let us assume all $f_{\alpha i}$
are of the same order of magnitude (but not all equal), so that all the scale dependence
comes from the $F_\alpha$. This is the case if the $f_{\alpha i}$ are generated by some fundamental theory
at some high scale $M_X$ (for example the GUT scale or the Planck scale), and if the $F_\alpha$ are
generated by strong interaction dynamics. Strong interaction dynamics has scales of order
$\exp(-1/g^2)M_X$, where $g$ is a dimensionless number of order 1. An example of a strong
dynamics scale is the QCD scale. We expect on the basis of the dependence on $g$ that
the scales $F_\alpha$ can be distributed roughly logarithmically over a large range (i.e. roughly
the same number of distinct $F_\alpha$ values per decade of energy scale). Let us assume for
concreteness that $F_\alpha$ takes values $F_n = e^{-xn}M_X$, $n = 1, \ldots, M$. Furthermore we define
$f_{\alpha i} = M_X/\lambda_{\alpha i}$. If the dimensionless number $x$ is sufficiently large, $F_{n+1}$ can be ignored
in comparison to $F_n$.
So start with $F_1$. We get $N$ equations for $S_1$, which are all of the form
$$\lambda_{1i}S_1 = -\lambda_{2i}\,e^{-x}S_2 - \lambda_{3i}\,e^{-2x}S_3 - \ldots$$
Without further information, we should conclude that all sines $S_2, S_3, \ldots$ are of order 1.
This clearly implies that $S_1$ is of order $e^{-x}$. But that would be true if there is just one
axion. With $N$ axions, we can make linear combinations of the equations, and eliminate
$S_2$, $S_3$ etc, all the way to $S_N$. Hence if there are $N$ axions, the first non-vanishing term
on the right-hand side is
$$\lambda_{(N+1)i}\,e^{-Nx}S_{N+1}$$
and we conclude that $S_1$ is of order $e^{-Nx}$.
Having solved the equations for $S_1$ we now turn to $S_2$. The equations for $S_2$ read
$$\lambda_{2i}S_2 = -\lambda_{1i}\,e^{x}S_1 - \lambda_{3i}\,e^{-x}S_3 - \ldots$$
By linear combinations we can eliminate the $S_1$ term and $N-1$ of the subsequent terms,
or we can eliminate $N$ of the subsequent terms and leave the $S_1$ term. In either case, the
conclusion is the same: $S_2$ is of order $e^{-(N-1)x}$. We can continue doing this, and conclude
that $S_k$ is of order $e^{-(N-k+1)x}$. Hence the last $S_k$ that is still reduced somewhat is $S_N$,
but $S_{N+1}$ is of order 1.
For the QCD axion this implies that the sine is reduced by a factor $e^{-mx}$, where $m$ is
the number of light axions. We need $e^{-mx} \approx 10^{-10}$. This can be achieved with one axion,
and a next scale ten orders of magnitude below the QCD scale, or with ten axions, with
strong interaction mass scales differing by a factor 10; of course there are many other
possibilities. In either case the lowest value of $F_\alpha$ is about $10^{-10}\Lambda_{\rm QCD}$. Anything less
than that is of course also fine. An important point is that there can be any number of
mass scales in between this lowest relevant scale and the QCD scale, as long as there are
sufficiently many light axions available. This is the same conclusion we reached above
when we introduced explicit axion masses.
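The trade-off between the number of light axions and the spacing of the scales is simple arithmetic, which the following sketch makes explicit. It assumes, as in the estimate above, that the sine for the QCD axion is suppressed by $e^{-mx}$ with $m$ light axions and a fixed ratio $e^{x}$ between successive scales; the function name and the sample values of $m$ are choices made for this illustration.

```python
import math

def required_scale_ratio(num_light_axions, suppression=1e-10):
    """Ratio e^x between successive scales such that e^(-m*x) equals `suppression`."""
    x = -math.log(suppression) / num_light_axions
    return math.exp(x)

for m in (1, 2, 5, 10):
    ratio = required_scale_ratio(m)
    print(f"{m:2d} light axion(s): successive scales must differ by a factor {ratio:.3g}")
```

With one axion the next scale has to lie ten orders of magnitude down, while with ten axions a factor of ten between successive scales is enough.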
This is arguably the most plausible realization of the PQ mechanism. One needs a
theory producing a large number of axions, and a large number of interaction scales, so
that all of these axions acquire a dynamical mass. If these scales are distributed on the
entire energy scale (for example between the Hubble scale of $10^{-42}$ GeV and the Planck
scale of $10^{19}$ GeV) there will be a substantial number of light axions below the QCD scale.
That is all that is needed. This is known as the "axiverse" [1], and it may be realized in
string theory.
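[Figure: one-loop diagram with two vertices, carrying $N$ and $L$ external lines respectively, connected by two internal lines with momenta $k$ and $k+q$.]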
To be rather general we have left the number of external lines as a free parameter, and
we have used an $N+2$-point vertex with coupling constant $\lambda_{N+2}$ and an $L+2$-point vertex
with coupling constant $\lambda_{L+2}$. The interaction Lagrangians are thus $\frac{1}{(N+2)!}\lambda_{N+2}\phi^{N+2}$ and
the same with $L$ instead of $N$. The loop integral is
$$(i\lambda_{N+2})(i\lambda_{L+2})\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2 - m^2}\,\frac{i}{(k+q)^2 - m^2}.$$
6.1.2 Regularization
One can make the ultraviolet divergence explicit by cutting off the integral. Instead of
integrating over all of momentum space, one integrates over a finite sphere of radius $\Lambda$,
so that $k^2 < \Lambda^2$. After introducing the cutoff the integral is finite, but now it depends on
the cutoff,
$$V \propto \lambda_{N+2}\lambda_{L+2}\log\left(\frac{\Lambda^2}{q^2}\right),$$
and we cannot take the cutoff to infinity.
The process of making the integral finite is called regularization. There are other ways
of achieving this, and since it has no obvious physical meaning, all physical quantities one
finally obtains should be independent of the regularization procedure. But first we have
to get rid of the divergences.
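The logarithmic cutoff dependence can be made concrete in a simplified setting. The sketch below assumes zero external momentum, $q = 0$, and a Wick rotation to Euclidean space, so that the loop integral reduces to $\int_{|k|<\Lambda} \frac{d^4k}{(2\pi)^4}\frac{1}{(k^2+m^2)^2}$; coupling constants and overall factors of $i$ are dropped. With these assumptions the mass $m$, rather than $q$, sets the scale inside the logarithm. The code compares a direct numerical radial integration with the closed form and shows that each factor of 10 in the cutoff adds the same constant.

```python
import math

def euclidean_bubble(cutoff, mass):
    """Closed form (1/16 pi^2) [ln(1 + L^2/m^2) - L^2/(L^2 + m^2)] of the
    Wick-rotated integral of 1/(k^2 + m^2)^2 over the 4-ball |k| < cutoff."""
    r = (cutoff / mass) ** 2
    return (math.log1p(r) - r / (1.0 + r)) / (16.0 * math.pi ** 2)

def euclidean_bubble_numeric(cutoff, mass, steps=20000):
    """Midpoint rule in w = ln(k/m), using d^4k/(2 pi)^4 = k^3 dk / (8 pi^2)."""
    w_min, w_max = -20.0, math.log(cutoff / mass)  # lower end is negligibly small
    h = (w_max - w_min) / steps
    total = 0.0
    for n in range(steps):
        k = mass * math.exp(w_min + (n + 0.5) * h)
        total += k**4 / (k * k + mass * mass) ** 2 * h  # k^3 dk = k^4 dw
    return total / (8.0 * math.pi ** 2)

mass = 1.0  # arbitrary units
for cutoff in (1e1, 1e2, 1e3, 1e4):
    print(f"cutoff/m = {cutoff:.0e}: closed form {euclidean_bubble(cutoff, mass):.6f}, "
          f"numeric {euclidean_bubble_numeric(cutoff, mass):.6f}")

step = euclidean_bubble(1e4, mass) - euclidean_bubble(1e3, mass)
print("increase per decade of cutoff:", step, "vs ln(100)/(16 pi^2) =",
      math.log(100.0) / (16.0 * math.pi ** 2))
```

The integral never converges as the cutoff grows; it increases by a fixed amount per decade, which is what a logarithmic divergence means.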
Since we are in Minkowski space this requires a bit more discussion, since it is not obvious what a
sphere is. In fact all these manipulations are always done after one has analytically continued the
integrand to Euclidean space using a Wick rotation.
6.1.3 The Origin of Ultraviolet Divergences
What is the reason for the infinity? Note that when we integrate over all of momentum
space we are doing something that is physically ridiculous. Large momentum corresponds
to large energies, and to short distances. Experimentally we have been able to explore
nature up to several hundred GeV, and without doing further experiments we cannot
pretend to know what happens at larger energies or shorter distances. Suppose that at
shorter distances space-time has a crystalline structure. Then the inverse of the cell size
would provide a maximum momentum, since wavelengths smaller than the cell size make
no sense. In this situation the momentum cut off introduced above would have a physical
meaning.
One may also envisage changes to the vertices that are small at low energies, but cut
off the integral at large energies. For example, suppose the Feynman rule for a vertex was
not $\lambda_{L+2}$ but something like $\lambda_{L+2}\,[\Lambda^2/(P^2+\Lambda^2)]$, where $P$ is the sum of the incoming
momenta and $\Lambda$ a large mass (larger than 1 TeV, say). A low energy observer would
experimentally detect the existence of the $\lambda_{L+2}\phi^{L+2}$ vertex by scattering two particles,
and measuring the probability that $L$ such scalars come out. At low energies $P^2 \ll \Lambda^2$,
and the correction factor is almost one. If $\Lambda$ is large enough, it would be impossible to
observe it. However, if we insert the same vertex in a loop diagram we integrate over all
momenta, and we are sensitive to any such factor. Factors of this kind do indeed occur,
for example if our particle were not elementary, but were in fact a bound state of two other
particles. Then the interaction vertices are corrected by form-factors. If the binding
scale is sufficiently high, a low energy observer cannot resolve the sub-structure, and for
all practical purposes sees the particle as elementary.
In other words, if we claim that Feynman diagrams are divergent for large momenta, we
are simply making a completely unfounded extrapolation of known physics to extremely
short distances. But that leaves us with the question of what to do about these integrals.
6.1.4 Renormalization
Let us ask the question from the perspective of an experimentalist. Clearly the loop
diagram contributes to processes with $N + L$ external lines. Suppose our theory has an
additional vertex $\frac{1}{(N+L)!}\lambda_{N+L}\phi^{N+L}$. Suppose we do a scattering experiment to measure
this vertex, for example 2 $\phi$ particles to $N+L-2$ such particles. The amplitude, expanded
to one-loop level, now has schematically the following contributions:
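[Figure: schematic sum of the tree-level $(N+L)$-point vertex and the one-loop diagram built from two lower-point vertices, plus higher-order terms.]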
This is just intended schematically, and in particular we did not draw all diagrams
here; there are others with one or both incoming lines attached to the other vertex. An
experimentalist can only measure the sum of these diagrams. The sum gives an expression
like
$$\lambda_{L+N} + C\lambda_{N+2}\lambda_{L+2}\log\left(\frac{\Lambda^2}{q^2}\right) + \ldots\,,$$
where $C$ is some numerical coefficient and $q$ is some combination of the external momenta.
The explicit form of both follows from the details of the computation, but is not relevant
for our purpose. The dots indicate terms that are finite for $\Lambda \to \infty$ plus contributions of
higher order diagrams.
The coupling constant $\lambda_{L+N}$ is a physical parameter of the theory that is not predicted
by the theory itself, but must be measured. To measure it we must specify a physical
process. In the present case, that physical process could be $\phi$-$\phi$ scattering to $N + L - 2$
$\phi$-particles with precisely specified external momenta. Let us call the value of $q$ for those
fixed momenta $q_0$. Then the physical value of the coupling constant is related in the
following way to the parameters in the Lagrangian:
$$\lambda^{\rm physical}_{L+N} = \lambda_{L+N} + C\lambda_{N+2}\lambda_{L+2}\log\left(\frac{\Lambda^2}{q_0^2}\right) + \ldots\,.$$
Footnote: In fact only the infinite sum of all diagrams is a measurable quantity. Here we work to second order
in the coupling constants $\lambda_K$, which are assumed to be small. This may seem strange since the loop
correction diverges as $\Lambda \to \infty$. But note that for any finite choice of $\Lambda$ we can make the coupling
constants small enough so that the next order can be ignored. After computing a physical cross section
for small coupling, we continue the coupling to its physical value.
This process of absorbing short distance singularities into physical quantities is called
renormalization. The quantity $\lambda^{\rm physical}_{L+N}$ is usually called the renormalized coupling
constant, and the quantities that appeared in the Lagrangian are called bare coupling
constants. They cannot be measured.
The crucial point is now the following. We can only give one definition of $\lambda^{\rm physical}_{L+N}$,
but of course this coupling constant appears in many different processes. Whenever $\lambda_{L+N}$
(the bare coupling) appears, we replace it by $\lambda^{\rm physical}_{L+N}$, using Eq. (6.1.4). If all goes well,
this should remove all $\log\Lambda$ terms at the next order. For this to work, it should be true
that $\lambda_{L+N}$ always receives at the next order exactly the same loop corrections. To some
extent one can see that intuitively, but actually proving it is quite hard.
The foregoing can be summarized by the following prescription:
1. Calculate some process to a given loop order in perturbation theory.
2. Introduce a prescription to cut off all the divergent integrals (regularization).
3. For each physical parameter, choose one specific physical process to define and
measure it.
4. Then use this definition in all other processes to substitute the bare parameters by
the renormalized ones.
If all goes well, one now obtains, for each process one computes, a perturbative expansion
in terms of physical, renormalized parameters, and all dependence on the regulator scale
has disappeared.
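The prescription above can be illustrated numerically. The following is a minimal sketch, assuming the schematic one-loop form $\lambda_{L+N} + C\lambda_{N+2}\lambda_{L+2}\log(\Lambda^2/q^2)$ used earlier and dropping all higher-order terms; the value of `C`, the couplings and the momenta are arbitrary illustrative numbers, not results of any computation.

```python
import math

# Illustrative (hypothetical) inputs: coefficient C and the two lower-point couplings.
C = 0.01
LAM_N2, LAM_L2 = 0.3, 0.2

def amplitude(bare, cutoff, q):
    """Schematic one-loop amplitude: bare coupling plus a cutoff-dependent log."""
    return bare + C * LAM_N2 * LAM_L2 * math.log(cutoff**2 / q**2)

def bare_from_physical(physical, cutoff, q0):
    """Invert the definition of the physical coupling measured at q = q0."""
    return physical - C * LAM_N2 * LAM_L2 * math.log(cutoff**2 / q0**2)

def prediction(physical, q0, q, cutoff):
    """Amplitude at momentum q, expressed through the coupling measured at q0."""
    return amplitude(bare_from_physical(physical, cutoff, q0), cutoff, q)

# Same measured coupling, two very different cutoffs: the predictions agree.
p1 = prediction(physical=0.5, q0=100.0, q=300.0, cutoff=1e4)
p2 = prediction(physical=0.5, q0=100.0, q=300.0, cutoff=1e16)
print(p1, p2)
```

In this toy version the cutoff drops out exactly, because only a single logarithm is kept; in a real calculation the cancellation holds order by order in the couplings.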
Note that it does not matter whether the momentum integrals are actually infinite or
are cut off by some unknown short distance physics. All the unknown physics is absorbed
in the renormalized parameters. These parameters depend on unknown physics and are
therefore not determined theoretically.
However, in general the number of parameters one needs in this procedure is infinite.
We can only absorb a $\log\Lambda$ in a physical parameter if that parameter actually exists. For
a scalar theory the procedure outlined above will generate a vertex with N + L lines if
there exists a vertex with N + 2 and one with L + 2 lines. Suppose N = L = 3, i.e.
we consider two five point vertices. Then N + L = 6, and to absorb the corresponding
divergence we need a six-point vertex. Combining that with a five-point vertex gives a
seven-point vertex, and clearly this never stops. Then the theory has an infinite number of
parameters. To determine it completely one needs to do an infinite number of experiments.
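The counting argument can be sketched in a few lines. This is only an illustration, assuming the rule stated above (an $(N+2)$-point and an $(L+2)$-point vertex generate an $(N+L)$-point divergence, i.e. orders $a$ and $b$ give $a+b-4$); the function name and the truncation parameter `max_order` are hypothetical.

```python
def generated_vertices(seed, max_order=12):
    """Close a set of vertex orders under (a, b) -> a + b - 4.

    A one-loop diagram joining an a-point and a b-point vertex by two
    propagators has a + b - 4 external lines, so its divergence needs a
    vertex of that order. Order 2 (a mass term) is not counted as a vertex,
    and the search is truncated at max_order.
    """
    orders = set(seed)
    while True:
        new = {a + b - 4 for a in orders for b in orders}
        new = {n for n in new if 3 <= n <= max_order} - orders
        if not new:
            return sorted(orders)
        orders |= new

print(generated_vertices({3, 4}))  # closes on itself
print(generated_vertices({5}))     # keeps growing until the truncation
```

Starting from three- and four-point vertices nothing new is generated, whereas a single five-point vertex fills every order up to the truncation.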
6.1.5 Renormalizability
A theory is called renormalizable if all divergences can be absorbed into a finite number
of parameters. This is a very strong restriction, but it makes the theory enormously
more powerful. After the determination of a handful of physical parameters, one can
make detailed predictions of all physical cross sections and decay rates! In our scalar
theory example this allows only two vertices, $\phi^3$ and $\phi^4$. If there is just one scalar, the
only parameters are the couplings $\lambda_3$ and $\lambda_4$ and the mass of the scalar. The mass is
treated in a quite similar way: it also must be determined experimentally, and it is also
renormalized.
Footnote: The physical quantities meant here are all parameters in the Lagrangian, i.e. masses and coupling
constants. In addition to physical quantities, some singularities are absorbed in the normalization of the
fields (wave function renormalization), which is not a physical quantity. For simplicity, we assume here
that all divergences are absorbed in a single coupling constant.
Other examples of renormalizable theories are QED and QCD. Both have just one
parameter, the coupling constants e and g respectively (if one ignores the fermion masses).
The coupling constant of QED can be defined by means of the electron-photon coupling
at zero photon momentum. For the QCD coupling constant the equivalent procedure
cannot work, because we do not have free quarks and gluons, and furthermore because
QCD perturbation theory does not work at zero gluon momentum. So one necessarily
has to define g rather more indirectly, and at a non-vanishing momentum scale.
$\mathcal{L} = \frac{1}{2}[\partial_\mu\phi\,\partial^\mu\phi]$
tells us that $\phi$ has dimension 1. Similarly, fermion kinetic terms tell us that fermion fields
have dimension $\frac{3}{2}$, and gauge kinetic terms require that gauge fields have dimension 1.
Now consider interactions. Since the dimensions of the fields are fixed, dimensional
analysis now fixes the dimensions of the coupling constants. Take for example $\lambda_N\phi^N$.
Obviously the dimension of $\lambda_N$ is $4 - N$. If $N > 4$ the coupling constant has a negative
dimension. This turns out to be the origin of non-renormalizability. Feynman diagrams
with combinations of such coupling constants can have coefficients with arbitrarily negative
dimensions, whose divergences correspond to terms with an arbitrarily large number
of fields.
Footnote: Note that we could have allowed for a coefficient in front of the kinetic term, which could have its own
dimension. However, we can always absorb such a coefficient by redefining $\phi$. Any other term in the
Lagrangian will always have a coefficient.
A necessary condition for renormalizability is absence of negative dimensional coupling
constants. This leaves only very few possibilities, namely
$\phi^3$ (with a coupling of dimension 1)
$\phi^4$
$\bar\psi\psi\phi$ or $\bar\psi\gamma_5\psi\phi$
$\partial A\,A^2$ or $A^4$
$\bar\psi\gamma^\mu\psi A_\mu$ or $\bar\psi\gamma^\mu\gamma_5\psi A_\mu$
$A^2\phi^2$ or $A_\mu\phi\,\partial^\mu\phi$
where $\phi$, $\psi$ and $A$ denote generic scalars, fermions and spin-1 fields. In some cases details
of the index structure are suppressed. In all cases except $\phi^3$ the coupling constant is
dimensionless. In addition to these interactions also mass terms are allowed.
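The dimension counting behind this list is easy to automate. The sketch below assumes only the field dimensions quoted above (1 for scalars and spin-1 fields, 3/2 for fermions, 1 per derivative); the function name and its keyword arguments are illustrative. The four-fermion term at the end is not in the list and is included as an example of a coupling with negative dimension.

```python
from fractions import Fraction

# Mass dimensions of fields in four dimensions, as fixed by the kinetic terms.
FIELD_DIM = {"scalar": Fraction(1), "fermion": Fraction(3, 2), "vector": Fraction(1)}

def coupling_dimension(scalars=0, fermions=0, vectors=0, derivatives=0):
    """Mass dimension of the coupling multiplying a given interaction term."""
    operator_dim = (scalars * FIELD_DIM["scalar"]
                    + fermions * FIELD_DIM["fermion"]
                    + vectors * FIELD_DIM["vector"]
                    + derivatives)
    return 4 - operator_dim

print(coupling_dimension(scalars=3))                 # phi^3
print(coupling_dimension(scalars=4))                 # phi^4
print(coupling_dimension(scalars=1, fermions=2))     # Yukawa coupling
print(coupling_dimension(vectors=3, derivatives=1))  # cubic spin-1 vertex
print(coupling_dimension(fermions=4))                # four-fermion term
```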
For spin-1 fields more severe constraints apply, which will not be explained here. Their
interactions must not only have the structure listed above, but they must have the precise
form we saw in the chapter on gauge theories. This is due to the requirement of gauge
invariance. Mass terms for spin-1 bosons are only allowed if they are due to the Higgs
mechanism.
This is only schematic, and important details such as the left-handed nature of the currents are
suppressed.
In our description of nature both renormalizable and non-renormalizable theories play
a role. For example the Standard Model of weak, electromagnetic and strong interactions
is renormalizable, but the theory of pion-nucleon interactions is not. In the former case
that means that we can predict scattering amplitudes of quarks and leptons with in
principle unlimited accuracy in terms of only a few (about 27) parameters that must
be determined experimentally. In the latter case we may be able to describe low-energy
pion-nucleon interactions, but if we attempt to go to higher energies more and more
parameters are required and finally the description becomes completely inadequate. At
sufficiently short distances we have to take the quark substructure of pions and nucleons
into account, and we cannot pretend that they are fundamental fields.
At some time in the future we may find ourselves in the same situation with the
Standard Model, but only experiment can tell us if and when that happens.
Here only the renormalization of the coupling constants is considered; for simplicity, wave-function and
mass renormalizations are ignored.
6.2.1 Example: Scalar Field Theories
Let us consider the simplest interacting scalar field theory, with Lagrangian
$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2 - \frac{1}{24}\lambda\phi^4\,. \tag{6.2}$$
This Lagrangian leads to the Feynman rules shown below. Note that the normalization
of the $\phi^4$ vertex is chosen in such a way that the vertex has a factor 1.
[Figure: Feynman rules; the propagator for a line with momentum $p$ is $\frac{i}{p^2-m^2}$.]
The first loop correction to the vertex comes from the diagram shown below; here $q = p_1 + p_2 = p_3 + p_4$
and all momenta flow from left to right.
[Figure: one-loop diagram with external momenta $p_1$, $p_2$ on the left, $p_3$, $p_4$ on the right, and loop momenta $k$ and $q-k$.]
Figure 3: One of the three scalar one loop graphs.
Actually, there are two more diagrams, distinguished only by connecting $p_1$ and $p_3$ (and $p_2$, $p_4$)
to the same vertex and by connecting $p_1$ and $p_4$ (and $p_2$, $p_3$) to the same vertex. The
computation of these other two diagrams is completely analogous to the one shown here, and the
net result will just be a factor of three in the logarithm. The Feynman integral is
$$\frac{1}{2}(-i\lambda)^2\int\frac{d^4k}{(2\pi)^4}\,\frac{i}{k^2-m^2}\,\frac{i}{(k-q)^2-m^2} \tag{6.3}$$
The factor $\frac{1}{2}$ is a symmetry factor, needed because there are two identical propagators
between the two vertices. To evaluate the integral we use Feynman's trick
$$\frac{1}{AB} = \int_0^1 dx\,\frac{1}{(xA+(1-x)B)^2}\,, \tag{6.4}$$
where $A$ and $B$ are the two propagator denominators. We make a change of variable in
the integration, defining $l = k - xq$. Then we get
$$\frac{1}{2}\lambda^2\,\frac{1}{(2\pi)^4}\int_0^1 dx\int d^4l\,\frac{1}{(l^2+x(1-x)q^2-m^2)^2} \tag{6.5}$$
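Both steps can be checked numerically. The sketch below is only an illustration: it verifies Feynman's trick with a simple midpoint rule, and the completion of the square for ordinary numbers standing in for the momenta (for four-vectors the same algebra holds with products replaced by Minkowski inner products). All function names and test values are arbitrary.

```python
def feynman_parameter_integral(a, b, steps=20000):
    """Midpoint-rule estimate of the integral of 1/(x*a + (1-x)*b)^2 over x in [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 1.0 / (x * a + (1.0 - x) * b) ** 2
    return total * h

def combined_denominator(k, q, m, x):
    """x*A + (1-x)*B with A = (k-q)^2 - m^2 and B = k^2 - m^2."""
    return x * ((k - q) ** 2 - m**2) + (1.0 - x) * (k**2 - m**2)

def shifted_denominator(k, q, m, x):
    """The same quantity written in terms of l = k - x*q."""
    l = k - x * q
    return l**2 + x * (1.0 - x) * q**2 - m**2

a, b = 2.0, 5.0  # arbitrary positive stand-ins for the two denominators
print(feynman_parameter_integral(a, b), 1.0 / (a * b))
print(combined_denominator(1.3, 0.7, 0.4, 0.3), shifted_denominator(1.3, 0.7, 0.4, 0.3))
```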
Now we make a Wick rotation to Euclidean space, by defining $l_0 = il_4$. Doing this properly
requires a definition of the location of the poles of the propagators in the complex plane,
the $i\epsilon$-prescription, which we will not discuss here in detail. One defines the propagators
as
$$\lim_{\epsilon\to 0}\frac{1}{p^2-m^2+i\epsilon} \tag{6.6}$$
Then the integration contour is rotated to the imaginary axis in such a way that the poles
are avoided. The net result is that the integral now becomes
$$\frac{1}{2}\lambda^2\,\frac{i}{(2\pi)^4}\int_0^1 dx\int dl_4\int d^3l\,\frac{1}{(-l_4^2-\vec{l}^{\,2}+x(1-x)q^2-m^2)^2} \tag{6.7}$$
We define a variable $Q^2 = -q^2$ in order to avoid branch cuts that exist for $q^2 > 0$. The
integral over $l$ can be written in terms of four-dimensional Euclidean polar angles
$$\int dl_4\int d^3l = \int d^4l = \int_0^\infty l^3\,dl\int d\Omega_3 = 2\pi^2\int_0^\infty l^3\,dl \tag{6.8}$$
Clearly, this integral is divergent. One can deal with this by introducing a cut-off $\Lambda$. By
cut-off we mean the highest allowed loop-momentum. If we believe that physics remains
unchanged up to arbitrarily high momenta (i.e. arbitrarily short distances), $\Lambda$ would be
infinite, but this is clearly a preposterous assumption: we cannot possibly know this. If
there are fundamental changes in the theory at short distances (as is the case in string
theory or if space-time is discrete, just to mention a few possibilities) the momentum
integral may be finite, and then $\Lambda$ is simply a very large momentum scale. At this point
we treat $\Lambda$ as if it were just an additional parameter, representing our ignorance about
physics at very short distances. This looks worrisome. We are just computing a single
one-loop graph, and immediately we encounter a new parameter. This will happen in many
other graphs, and hence if we go on to arbitrary loop order, we would encounter an infinite
number of parameters, which have to be determined experimentally in order to use them.
Fortunately, it turns out that these parameters are not all separately observable. Only
certain combinations can be observed, and in the Standard Model these combinations
correspond precisely to the parameters in the Lagrangian.
After replacing $\infty$ by $\Lambda$ we get
$$\frac{1}{2}\,\frac{i\lambda^2}{16\pi^2}\int_0^1 dx\left[\log\left(\frac{\Lambda^2+x(1-x)Q^2+m^2}{x(1-x)Q^2+m^2}\right)-1\right] \tag{6.10}$$
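The radial integral behind this expression can be checked directly. The sketch below assumes the Euclidean integrand $l^3/(l^2+\Delta)^2$ with $\Delta = x(1-x)Q^2+m^2$, whose closed form is $\frac{1}{2}\left[\log\frac{\Lambda^2+\Delta}{\Delta} - \frac{\Lambda^2}{\Lambda^2+\Delta}\right]$; the last term tends to 1 for a large cutoff, which is presumably the origin of the $-1$ in Eq. (6.10). Function names and numerical values are illustrative.

```python
import math

def radial_integral_numeric(cutoff, delta, steps=200000):
    """Midpoint-rule estimate of the integral of l^3/(l^2 + delta)^2 for l in [0, cutoff]."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        l = (i + 0.5) * h
        total += l**3 / (l**2 + delta) ** 2
    return total * h

def radial_integral_exact(cutoff, delta):
    """Closed form obtained with the substitution u = l^2."""
    c2 = cutoff**2
    return 0.5 * (math.log((c2 + delta) / delta) - c2 / (c2 + delta))

def radial_integral_large_cutoff(cutoff, delta):
    """The same expression with cutoff^2/(cutoff^2 + delta) replaced by 1."""
    return 0.5 * (math.log((cutoff**2 + delta) / delta) - 1.0)

cutoff, delta = 50.0, 1.3  # arbitrary test values
print(radial_integral_numeric(cutoff, delta))
print(radial_integral_exact(cutoff, delta))
print(radial_integral_large_cutoff(cutoff, delta))
```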
Now we consider the limit $Q^2 \ll \Lambda^2$ and $Q^2 \gg m^2$. The former limit is simply the
assumption that the point where momenta are cut off is well beyond the energy scale of
physical interest. If that is not the case, then surely we must know further details about
how they are cut off. The other limit assumes that we are considering energy scales much
larger than the particle masses. This is a good assumption in LHC physics, but of course
this is not always true. If a mass is larger than $Q$, we may ignore the $Q^2$ dependence
in the argument of the logarithm. We will return to that case later. Consider now the
limiting case and add the result of this diagram to the tree amplitude. Then we get
$$-i\left[\lambda - \frac{\lambda^2}{32\pi^2}\log\left(\frac{\Lambda^2}{Q^2}\right) + \ldots\right] \tag{6.11}$$
where the dots represent finite terms. The other two diagrams give the same $\Lambda$ dependence,
but a different dependence on external momenta. We can define $Q_s^2 = -(p_1+p_2)^2$,
$Q_t^2 = -(p_1-p_3)^2$ and $Q_u^2 = -(p_1-p_4)^2$, and then the sum can be written as
$$-i\left[\lambda - \frac{3\lambda^2}{32\pi^2}\log\left(\frac{\Lambda^2}{Q_0^2}\right) - \frac{\lambda^2}{32\pi^2}\left[\log\left(\frac{Q_0^2}{Q_s^2}\right)+\log\left(\frac{Q_0^2}{Q_t^2}\right)+\log\left(\frac{Q_0^2}{Q_u^2}\right)\right] + \ldots\right] \tag{6.12}$$
Here $Q_0$ is some common energy scale, chosen in such a way that the momentum dependent
finite terms in the square brackets are small, so that they can be considered as part
of the finite terms. This example illustrates how expressions like Eq. (6.1) come out of
actual loop computations. Note that the coefficient $b_0 = 3/16\pi^2$ in this case.
which at tree level is equal to the coupling constant). Since no experiment can directly
measure the coefficients in the Lagrangian this is the only thing we can do. It follows
immediately that $V(Q)$ cannot be a constant. At best we can choose a reference scale
$\mu = Q$ to define and measure it, and then calculate its value at any other scale. At present
the most commonly used reference scale for the Standard Model couplings is $M_W$. Note
that the coupling constant increases with increasing $Q$ if $b_0$ is positive.
One of the consequences of renormalizability is that the same redefinition removes the
infinities associated with the coupling constant in all diagrams. This implies that in the
finite result the same logarithmic corrections $b_0\log(\mu/Q)$ will always appear with any
coupling constant, albeit with process dependent quantities $Q$.
If we measure $g(\mu)$ in one process we can now make predictions for all others, but
what should we take for $\mu$? The best choice would seem to be the one that minimizes the
logarithmic corrections, i.e. $\mu = Q$. If we take $\mu$ very different from $Q$ the convergence of
the loop expansion becomes very bad, since at each order in $g$ one encounters the large
logarithmic correction $\log(\mu/Q)$ to the same power. By setting $\mu = Q$ we are effectively
summing up these large logarithms. Consequently each process now has its own coupling
constant $g(Q)$, and the coupling constant is not a constant anymore, but a function of
the scale. This is called the running coupling constant.
Technically this is done by means of the renormalization group equation. We will show
here how this equation is derived in the present, slightly simplified context. Consider the
measurable quantity $V(Q)$ introduced in Eq. (6.1), and substitute for $g_{\rm bare}$ the physical
coupling constant, Eq. (6.13):
$$V(Q) = g_{\rm phys}(\mu) - g^n_{\rm phys}(\mu)\,b_0\log\left(\frac{\mu}{Q}\right) + \text{higher order} \tag{6.14}$$
Now it seems that the physical quantity $V(Q)$ depends on $\mu$, the energy scale at which we
have decided to define and measure the coupling constant. But this is just a convention,
on which no physical quantity should depend. Hence it must be true that
$$\mu\frac{d}{d\mu}V(Q) = 0 \tag{6.15}$$
This leads immediately to the equation
$$\mu\frac{d}{d\mu}g_{\rm phys}(\mu) - b_0\,g^n_{\rm phys}(\mu) = 0 \tag{6.16}$$
Here we have ignored the derivative of $g^n_{\rm phys}(\mu)$ because this is of higher order in the
coupling constant. If we define $\beta(g) = b_0g^n$ (plus terms of higher order), we may write
this as
$$\mu\frac{d}{d\mu}g_{\rm phys}(\mu) = \beta(g_{\rm phys}(\mu)) \tag{6.17}$$
On the other hand, if we view $V(Q)$ as a function of $g_{\rm phys}$, $\mu$ and $Q$ (with an explicit
dependence on $\mu$ through the logarithm and an implicit dependence via $g_{\rm phys}$) we may
write the derivative Eq. (6.15) in terms of partial derivatives as
$$0 = \mu\frac{d}{d\mu}V(Q) = \left[\mu\frac{\partial}{\partial\mu} + \mu\frac{dg_{\rm phys}(\mu)}{d\mu}\frac{\partial}{\partial g_{\rm phys}}\right]V(g_{\rm phys},\mu,Q)\,, \tag{6.18}$$
or in terms of the function $\beta(g)$ we just introduced
$$\left[\mu\frac{\partial}{\partial\mu} + \beta(g_{\rm phys})\frac{\partial}{\partial g_{\rm phys}}\right]V(g_{\rm phys},\mu,Q) = 0\,. \tag{6.19}$$
This derivation can in fact be done to any order in $g$, and for any Green's function. The
function $\beta(g)$ (called the $\beta$-function) now becomes a polynomial in $g$ rather than just
the single term we found in the one-loop case. The general answer for a Green's function
$G$ is (omitting again for simplicity the effects of masses and external lines, and renaming
$g_{\rm phys}$ simply $g$)
$$\left[\mu\frac{\partial}{\partial\mu} + \beta(g)\frac{\partial}{\partial g}\right]G(g,\mu,Q) = 0\,, \tag{6.20}$$
where $\beta(g)$ is the $\beta$-function and $g$ denotes the physical (renormalized) coupling constant
at the scale $\mu$. The statement that this holds to arbitrary order in $g$ should not be
misinterpreted. Of course both $\beta(g)$ and $G$ have an expansion in powers of $g$ with coefficients
we do not know, except for the lowest orders. However, if we just introduce parameters
for these coefficients, then Eq. (6.20) holds to any order. It is simply a consequence of the
requirement that physics should not depend on an arbitrary choice of reference scale $\mu$.
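Now we use the relation
$$\beta(g(t)) = \frac{\partial g(t)}{\partial g}\,\beta(g) \tag{6.23}$$
to show that Eq. (6.20) is indeed satisfied. To show that Eq. (6.23) holds, define
$$F(t) = \beta(g(t)) - \frac{\partial g(t)}{\partial g}\,\beta(g)\,.$$
$$g^{n-1}(t) = \frac{g^{n-1}}{1-(n-1)\,b_0\,t\,g^{n-1}} \tag{6.25}$$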
If we expand this solution to order $g^n$ we get precisely the one-loop contribution discussed
above. However, even if we take for $\beta(g)$ just the one-loop expression $b_0g^n$ we see that
$g$ contains an infinite number of terms. These correspond to the so-called leading log
contributions to higher loop diagrams. Higher terms in $\beta(g)$ correspond to next-to-leading
logs, which are down by one or more powers of $\log(Q/\mu)$. This solution is valid
only if $g$ is small, since otherwise it is certainly not correct to ignore the higher order
terms in the $\beta$ function. If we extrapolate to higher energies ($t = \log(Q/\mu) \to \infty$) we
observe that $g^{n-1}$ becomes smaller and smaller if $b_0 < 0$. However, if $b_0 > 0$ the coupling
constant increases until it becomes formally infinite for $t = 1/((n-1)\,b_0\,g^{n-1})$ (we assume
that $g > 0$). This is called a Landau pole. Here perturbation theory breaks down, and
hence one cannot conclude exactly what happens to the theory. Theories with $b_0 < 0$,
which are well-behaved at higher energies, are called asymptotically free. This is a very
desirable property since it makes it plausible that no new dynamics will appear at higher
energy; in other words, if we understand the theory at low energies, we can be quite
confident that it harbors no surprises when extrapolated to arbitrarily large energies. In
practice, however, we still have to worry about interactions with other theories, most
notably gravity, disturbing our extrapolations.
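The one-loop solution and its Landau pole can be explored numerically. The following sketch implements Eq. (6.25) for $g > 0$ and checks by finite differences that it obeys $dg/dt = b_0g^n$; the function names and the input values are illustrative assumptions, not taken from the text.

```python
import math

def running_coupling(g, b0, n, t):
    """One-loop running coupling g(t) from Eq. (6.25), with g = g(0) > 0 and t = log(Q/mu)."""
    denom = 1.0 - (n - 1) * b0 * t * g ** (n - 1)
    if denom <= 0.0:
        raise ValueError("t is at or beyond the Landau pole")
    return (g ** (n - 1) / denom) ** (1.0 / (n - 1))

def landau_pole(g, b0, n):
    """Value of t at which the one-loop coupling diverges (infinite if b0 <= 0)."""
    if b0 <= 0.0:
        return math.inf
    return 1.0 / ((n - 1) * b0 * g ** (n - 1))

def beta_one_loop(g, b0, n):
    """One-loop beta function b0 * g^n."""
    return b0 * g**n

# Finite-difference check that g(t) obeys dg/dt = beta(g(t)).
g0, b0, n, t, h = 0.5, 0.02, 3, 10.0, 1e-5
slope = (running_coupling(g0, b0, n, t + h) - running_coupling(g0, b0, n, t - h)) / (2 * h)
print(slope, beta_one_loop(running_coupling(g0, b0, n, t), b0, n))
print(landau_pole(g0, b0, n))               # pole for positive b0
print(running_coupling(g0, -b0, n, 50.0))   # asymptotically free case: smaller than g0
```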
6.2.4 Asymptotic Freedom
To see which theories are asymptotically free we list here the values of $b_0$ for some popular
theories. For non-abelian gauge theories coupled to Weyl fermions:
$$b_0 = -\frac{1}{96\pi^2}\left[11\,I_2(A) - 2\,I_2(R_f) - \frac{1}{2}I_2(R_s)\right], \tag{6.26}$$
where $I_2(R)$ is defined as
$${\rm Tr}_R\,T^aT^b = \frac{1}{2}I_2(R)\,\delta^{ab}\,, \tag{6.27}$$
for any representation $R$. The representations occurring in Eq. (6.26) are the adjoint
$A$, the representation of the Weyl fermions $R_f$ (left- and right-handed fermions give the
same contribution) and of the scalars $R_s$. The scalars are assumed to be real; if a scalar
is complex, as it must be if it transforms in a complex representation, one gets an extra
factor of 2. The term depending on $A$ is due to the gauge bosons and the Faddeev-Popov
ghost required by gauge fixing. Note that $R_f$ and $R_s$ may be reducible representations;
the index $I_2$ is then simply the sum of the indices of each component. This formula is only
correct for one choice of normalization of the generators, namely so that $I_2(A) = C_2(A)$,
where $C_2$ denotes the quadratic Casimir eigenvalue. This is the normalization adopted
throughout these lectures (see appendix B).
For the group $SU(N)$ one has $I_2(R) = 1$ in the fundamental representation, and
$C_2(A) = 2N$. It follows that $SU(3)$ is asymptotically free if the number of Weyl fermions
in the fundamental representation or its conjugate is less than 33. In the Standard Model
there are 6 flavors of Dirac fermions, each with two Weyl components, so that QCD is
asymptotically free. In $SU(2)$ one can accommodate 22 Weyl doublets, or 21 and one
Higgs. The Standard Model has four per family, and hence a total of 12, so that $SU(2)$
is asymptotically free as well. The running coupling constant of these theories is given by
Eq. (6.25) with $n = 3$.
The leading terms (i.e. the one loop contribution) of the $\beta$-functions in the Standard
Model are then
$$\beta_3 = -\frac{42}{96\pi^2}\,g_3^3 + \ldots \tag{6.28}$$
$$\beta_2 = -\frac{19}{96\pi^2}\,g_2^3 + \ldots \tag{6.29}$$
These are the $\beta$-functions at high energies, where all $SU(3)\times SU(2)\times U(1)$ multiplets are
participating. Since the heaviest Standard Model particle is the top quark, this means
that these functions are valid at energies above the top quark mass. In general, if one
considers some energy scale $Q$, one should include in the $\beta$-function all particles with
masses $m < Q$. The second expression contains the contribution of the Higgs doublet in
the representation $(1, 2, \frac{1}{2})$. Note that the Higgs field is complex, so that the $I_2(R_s)$ term
in Eq. (6.26) must be multiplied by 2. Hence the Higgs field contributes $1/96\pi^2$, and
the total contribution is $-44 + 2\cdot 12 + 1 = -19$; the factor 12 in the second term is
due to four Weyl doublets in each of the three families. For QCD the exact count is
$(-11\cdot 2\cdot 3 + 2\cdot 12) = -42$.
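The bookkeeping in this subsection can be reproduced with a short script. It is a sketch of the counting in Eq. (6.26), restricted to $SU(N)$ with matter in the fundamental representation ($I_2 = 1$ per multiplet, $I_2(A) = 2N$); the function names and argument names are illustrative.

```python
from fractions import Fraction

def b0_numerator(i2_adjoint, i2_weyl_fermions, i2_real_scalars):
    """96*pi^2 * b0 for a non-abelian gauge group, following Eq. (6.26)."""
    return -(11 * i2_adjoint - 2 * i2_weyl_fermions - Fraction(1, 2) * i2_real_scalars)

def su_n(n_colors, weyl_fundamentals, complex_scalar_fundamentals=0):
    """SU(N) with matter in the fundamental: I2(fund) = 1, I2(adjoint) = 2N."""
    return b0_numerator(
        i2_adjoint=2 * n_colors,
        i2_weyl_fermions=weyl_fundamentals,
        # a complex scalar counts as two real ones
        i2_real_scalars=2 * complex_scalar_fundamentals,
    )

print(su_n(3, weyl_fundamentals=12))                                 # QCD with 6 Dirac flavors
print(su_n(2, weyl_fundamentals=12, complex_scalar_fundamentals=1))  # SU(2) with the Higgs
print(su_n(3, weyl_fundamentals=33))                                 # borderline case for SU(3)
print(su_n(2, weyl_fundamentals=22))                                 # borderline case for SU(2)
```

A negative result means the theory is asymptotically free at one loop; zero marks the borderline quoted in the text.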
6.2.5 Abelian gauge theories
Now consider abelian gauge theories. There is no gauge boson and ghost contribution,
and the matter contribution can be obtained directly from the non-abelian case, using
$I_2(R) = 2\,{\rm Tr}\,T^2 = 2\sum q^2$. Now $b_0$ is always positive, and the coefficient is
$$b_0 = \frac{1}{96\pi^2}\left[4\sum q_f^2 + \sum q_s^2\right], \tag{6.30}$$
where $q_f$ and $q_s$ are the Weyl fermion and real scalar charges. For QED coupled to a
single Dirac electron we thus find $b_0 = 1/12\pi^2$.
The fine structure constant $\alpha$ increases from its low energy value of 1/137.04 to a
value of about 1/128 at the weak scale $Q_W$. Beyond that we should evolve the coupling
constants $g_i$ of $SU(3)\times SU(2)\times U(1)$ rather than the QED coupling constant. The value
of $96\pi^2b_0$ for $U(1)_Y$ per family is
$$4\left[6\left(\tfrac{1}{6}\right)^2 + 3\left(\tfrac{1}{3}\right)^2 + 3\left(\tfrac{2}{3}\right)^2 + 2\left(\tfrac{1}{2}\right)^2 + 1\right] = \frac{40}{3} \tag{6.31}$$
The Higgs scalar contributes $\sum q_s^2 = 4\times(\frac{1}{2})^2 = 1$ (with a factor two for the dimension
and another one for the complexity). This gives the following result for the $g_1$ $\beta$-function:
$$\beta_1 = \frac{41}{96\pi^2}\,g_1^3 + \ldots \tag{6.32}$$
If we match it with the QED coupling at the weak scale, $Q_W \approx 100$ GeV, we get a
boundary condition $g_1 = e/\cos(\theta_w)$ at that scale. This yields
$$g_1(0) = \frac{\sqrt{4\pi\alpha(0)}}{\cos(\theta_w)} = 0.357 \tag{6.33}$$
Here we are using $Q_W$ as the reference scale $\mu$, so that $t = \log(Q/Q_W)$, and $\alpha(0) = 1/128$.
Now we can extrapolate $g_1$ to higher energies $Q$. It increases, until $g_1(t)$ reaches its
Landau pole at
$$t = \frac{1}{2b_0\,g_1(0)^2} = \frac{48\pi^2}{41\,g_1(0)^2} = 90.7 \tag{6.34}$$
i.e. $Q = e^{90.7}Q_W \approx 10^{39}Q_W \approx 10^{41}$ GeV. This is far beyond the Planck mass of $10^{19}$
GeV, and hence we are likely to encounter more serious difficulties before having to worry
about it.
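The numbers in this subsection can be reproduced as follows. This is a rough sketch: the hypercharge table is read off from Eq. (6.31), $\alpha = 1/128$ is taken from the text, and the value $\sin^2\theta_w = 0.231$ is an assumption that the text does not state. With these inputs the pole position comes out close to, but not exactly at, the quoted 90.7.

```python
import math
from fractions import Fraction

# (multiplicity, hypercharge) of the Weyl fermions in one family, as in Eq. (6.31).
FAMILY = [(6, Fraction(1, 6)), (3, Fraction(1, 3)), (3, Fraction(2, 3)),
          (2, Fraction(1, 2)), (1, Fraction(1))]

fermion_part = 4 * sum(mult * q**2 for mult, q in FAMILY)  # per family
higgs_part = 4 * Fraction(1, 2) ** 2                        # complex doublet = 4 real scalars
coefficient = 3 * fermion_part + higgs_part                 # 96 pi^2 b0 for three families

ALPHA_WEAK_SCALE = 1 / 128  # from the text
SIN2_THETA_W = 0.231        # assumed value of sin^2(theta_w); not given in the text

g1 = math.sqrt(4 * math.pi * ALPHA_WEAK_SCALE / (1 - SIN2_THETA_W))
b0 = float(coefficient) / (96 * math.pi**2)
t_pole = 1 / (2 * b0 * g1**2)

print(fermion_part, coefficient)        # 40/3 and 41
print(round(g1, 3), round(t_pole, 1))
print(round(t_pole / math.log(10), 1))  # number of decades above the weak scale
```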
Here there is an interesting competition between the first term and the other three, of
which the QCD contribution is the dominant one due to its larger coupling constant and
numerical factor.
If $y$ is small, the negative terms dominate, and the Yukawa couplings evolve to smaller
values. If we ignore the running of $g_3$ the equation for $y$ has the form $\frac{dy}{dt} = -Cy$, and the
solution is a negative exponential. In this case there are no problems.
If $y$ is large, the first term dominates and the coupling grows, and will become infinitely
large. The border between these two cases is
$$y^2 = \frac{64}{9}\pi\alpha_s \tag{6.36}$$
This can be converted to a quark or lepton mass of about 270 GeV. If $y$ has exactly this
value at $M_W$, the coupling constant initially does not move at all, but since $\alpha_s$ decreases
the first term will eventually win, and $y$ will increase. If one requires that $M$ does not
become infinite before the Planck scale, one gets an upper mass limit for quarks and
leptons of about 200 GeV. The exact evolution of $y$, including the effect of running of the
gauge coupling constants, is shown in fig. 4.
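The conversion from the border value of $y$ to a mass can be sketched as follows. Two inputs are assumptions: the relation $m \approx 174\,y$ GeV, inferred from the caption of Fig. 4 ($y = 0.5$ corresponding to 87 GeV), and a value of $\alpha_s$ near the weak scale in the range 0.10 to 0.12. The function names are illustrative.

```python
import math

V_OVER_SQRT2 = 174.0  # GeV; m = 174 * y, as implied by the caption of Fig. 4

def border_yukawa(alpha_s):
    """Yukawa coupling at which the one-loop running initially stalls, Eq. (6.36)."""
    return math.sqrt(64.0 * math.pi * alpha_s / 9.0)

def border_mass(alpha_s):
    """Fermion mass in GeV corresponding to the border value of the Yukawa coupling."""
    return V_OVER_SQRT2 * border_yukawa(alpha_s)

for alpha_s in (0.10, 0.11, 0.12):  # assumed range for alpha_s near the weak scale
    print(alpha_s, round(border_mass(alpha_s)))
```

For these inputs the mass lands in the neighborhood of the 270 GeV quoted above; the exact value depends on the choice of $\alpha_s$.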
The $\beta$ function Eq. (6.35) is said to have an infrared fixed point. If we start at some
high energy scale with a value $y_0$ and we allow the coupling constant to evolve to lower
energies, it must end up at the fixed point value Eq. (6.36) (this assumes that higher order
corrections to the $\beta$-function may be ignored). For the fermion masses this evolution to
arbitrarily small energy scales is not really relevant though. They are determined by the
value reached by $y$ at the weak symmetry breaking scale.
In this situation choosing almost any value for $y$ at, for example, the Planck scale
yields a value of about 200 GeV for the fermion mass (if we blindly apply the one-loop
$\beta$ function even when the couplings are large), almost independent of the input value
$y(M_{\rm Planck})$. Only for very small values of this parameter is a significant reduction of the
mass found.
Interestingly, the top quark mass is about 175 GeV, just below the bound. This means
that all Yukawa couplings decrease at shorter distances, and hence they do not cause any
problems.
[Figure: plot of the Yukawa coupling versus energy scale, with the Planck scale marked.]
Figure 4: Running of the Yukawa coupling $y$ for initial values $y = 0.5 + 0.1n$ (corresponding to a
top-quark mass of $87 + 17.5n$ GeV). The horizontal axis is the energy in GeV, in powers of 10. The thick
green curve represents the observed top quark mass, the red curve is for $m_t = 208$ GeV, and the blue
one for $m_t = 226$ GeV. These are respectively the last one without a Landau pole below the Planck scale
(in our discrete set) and the first one with a Landau pole below (actually almost exactly at) the Planck
scale. The cutoff behavior above $10^{40}$ GeV is caused by the Landau pole of $g_1$.
The Triviality Bound. The Higgs self-coupling $\lambda$ is also not asymptotically free: $b_0 =
3/2\pi^2$ and $n = 2$. We can express the value of the Higgs self-coupling in terms of the
Higgs boson mass ($\approx 125$ GeV), the mass of the $W$ boson and coupling constants, using
the relations $v = \sqrt{-\mu^2/\lambda}$, $M_W = \frac{1}{2}g_2v$ and $M_H = \sqrt{-2\mu^2}$ (here $\mu$ is the Higgs mass
assume that $\lambda$ takes the value (6.37) at the scale $M_W$. The scale $Q$ at which the coupling
formally becomes infinite is then given by
$$Q = M_W\,e^{\frac{8\pi^2}{3\lambda(M_W)}} \approx M_W\,e^{133.6\left(\frac{M_W}{M_H}\right)^2}\,. \tag{6.39}$$
Beyond $Q$ the Standard Model with a fundamental Higgs stops making sense. If
$Q > M_{\rm Planck}$ this problem is hidden behind the Planck scale. This is true if $M_H < 2M_W =
164$ GeV. Of course we are using a very poor approximation, since we ignored all
contributions of other particles to the $\beta$ function, higher loops, and also because we trust the
renormalization group equations all the way to the pole, which is certainly not correct.
Nevertheless, it gives us a feeling that for some reasonable values of $M_H$ the Standard
Model can be extrapolated all the way to $M_{\rm Planck}$, but that for large values one will
encounter the pole before $M_{\rm Planck}$.
If we increase $M_H$ the scale $Q$ will decrease, and at some point they meet. It is easy
to solve Eq. (6.39) with $Q = M_H$, and one finds $M_H \approx 8M_W \approx 650$ GeV. It does not
make sense to increase $M_H$ beyond this point, because then the mass of the scalar is larger
than the scale up to which the theory makes sense.
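The crossing point can be found numerically from the approximate form of Eq. (6.39). The sketch below uses only the numerical coefficient 133.6 quoted there; the value $M_W = 80.4$ GeV and the bisection bracket are assumptions, and the function names are illustrative.

```python
import math

M_W = 80.4  # GeV, assumed value of the W mass

def landau_scale(m_higgs, m_w=M_W, coefficient=133.6):
    """Scale of the one-loop Landau pole from the numerical form of Eq. (6.39)."""
    return m_w * math.exp(coefficient * (m_w / m_higgs) ** 2)

def crossing_mass(m_w=M_W, coefficient=133.6, lo=200.0, hi=2000.0):
    """Bisection for the Higgs mass at which the Landau scale equals the mass itself."""
    def excess(m):
        # positive while the pole still lies above the Higgs mass
        return math.log(landau_scale(m, m_w, coefficient)) - math.log(m)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m_cross = crossing_mass()
print(m_cross, m_cross / M_W)          # close to 8 * M_W
print(math.log10(landau_scale(125.0)))  # decimal exponent of the pole scale for a 125 GeV Higgs
```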
All this was based on extrapolation of perturbation theory beyond its limits. It can
be made more precise by putting the theory on a lattice to deal correctly with the
non-perturbative physics. This confirms in a more rigorous way that there is an upper bound of
about 700 GeV for the Higgs mass. For values of $M_H$ below that bound, the theory should
be viewed as an effective theory, valid only up to $Q$. Sometimes this is also formulated
in the following way: if we really want to make sense of the theory for arbitrarily large
scales, we are forced to set the coupling constant $\lambda$ to 0. Then the theory is trivial: it
is a free theory that is certainly valid for arbitrary scales, but not very interesting. The
upper bound on $M_H$ is usually referred to as the triviality bound.
The Stability Bound. The expression for the $\phi^4$ $\beta$-function given above ignored all
other interactions. It is instructive to consider the complete $\beta$-function at one loop order:
$$\beta(\lambda) = \frac{1}{16\pi^2}\left[6\lambda^2 - 24y^4 + 12\lambda y^2 - \lambda(9g_2^2 + 3g_1^2) + \frac{9}{2}g_2^4 + 3g_2^2g_1^2 + \frac{3}{2}g_1^4\right] + \ldots \tag{6.40}$$
Here $y$ can be any quark or lepton Yukawa coupling (leptons contribute with a relative
factor $\frac{1}{3}$, since the quark contribution is enhanced by a color factor). In fact, each
occurrence of $y$ is an implicit sum over all quarks and leptons, but of course to a very good
approximation only the top quark contributes.
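The competition between the terms in Eq. (6.40) is easy to see numerically. The following sketch evaluates the one-loop expression for a few values of $\lambda$; the inputs $y \approx 1$, $g_2 \approx 0.65$ and $g_1 \approx 0.357$ are rough weak-scale values assumed here for illustration only.

```python
import math

def beta_lambda(lam, y, g2, g1):
    """One-loop beta function of the Higgs self-coupling, Eq. (6.40)."""
    bracket = (6 * lam**2 - 24 * y**4 + 12 * lam * y**2
               - lam * (9 * g2**2 + 3 * g1**2)
               + 4.5 * g2**4 + 3 * g2**2 * g1**2 + 1.5 * g1**4)
    return bracket / (16 * math.pi**2)

# Rough weak-scale inputs; these numbers are assumptions for illustration.
Y_TOP, G2, G1 = 1.0, 0.65, 0.357

for lam in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(lam, beta_lambda(lam, Y_TOP, G2, G1))
```

For small $\lambda$ the negative Yukawa term wins and the coupling decreases; for large $\lambda$ the first term takes over and the coupling grows.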
If $\lambda$ is small it is not the first term that dominates (as assumed earlier), but the second
one. Then $\lambda$ will decrease rather than increase, and one has to worry that it does not go
through zero, since negative values of $\lambda$ would correspond to an unstable Higgs potential.
Requiring that this should not happen puts a lower bound on $\lambda$ and hence on the Higgs
mass. A detailed two-loop analysis of the coupled equations [12] (using the known top
quark mass of about 175 GeV) gives a lower limit on the Standard Model Higgs mass
of about 150 GeV. If we combine it with the upper bound coming from the requirement
that $\lambda$ remains finite below $M_{\rm Planck}$ we are left with a very small window between 150 and
160 GeV. Of course both bounds are different if we add extra particles to the Standard
Model. But if we don't want to do that, and the Higgs is not found within this window,
we can be pretty sure that the Standard Model must lose its validity in some way before
the Planck mass is reached.
[Figure: plot of the Higgs coupling versus energy scale, with the Planck scale marked.]
Figure 5: Running of the Higgs coupling constant using the one-loop $\beta$-function. The lines correspond
to Higgs masses of $100 + 5n$ GeV, $n = 0, \ldots, 19$. The green line corresponds to the observed Higgs mass
of 125 GeV. The first line that does not cross zero is the one for a Higgs mass of 145 GeV, the first one
with a Landau pole below the Planck scale is the one for a 170 GeV Higgs.
Note that even though $\lambda$ may initially decrease with increasing energy scale, the
Yukawa coupling decreases as well, and its contribution will eventually be smaller than
the first term. Then at higher scales the value of $\lambda$ starts to increase again, and hence
the triviality problem is not solved by including the Yukawa coupling. Here we are using
the fact that the top quark mass is still below the bound of 200 GeV mentioned in the
previous section.
This presentation is actually a bit too naive. One should really use the full effective
potential instead of the tree level potential. This has the effect of replacing $\lambda\phi^4$ by $\lambda(\phi)\phi^4$,
where $\lambda(\phi)$ is the running coupling constant evaluated at the scale $\phi$. Since we have just
argued that for large scales $\lambda$ becomes positive again, it is clear that the potential is not
really unbounded from below: for sufficiently large $\phi$ the potential will eventually become positive.
However, it is also clear that the potential develops a second minimum (in addition to the
one that breaks SU(3) × SU(2) × U(1)) for a value of $\phi$ near the zero of $\lambda$. A problem
arises then if that minimum is the global minimum of the potential, since one would then
expect the Standard Model vacuum to be unstable (this is often called a false vacuum),
and to decay to the true vacuum. In the true vacuum SU(3) × SU(2) × U(1) would also
break to SU(3) × U(1), but with a much larger Higgs v.e.v. and hence much larger W and
Z masses. Note that the top Yukawa coupling only enters the $\beta$ function for scales larger
than the top quark mass; below that mass the top decouples. Hence if $\lambda$ goes negative,
this can only occur for scales much larger than $m_{\rm top}$, and hence inevitably the resulting
W mass will be much larger than $m_{\rm top}$.
The lower limit of 150 GeV quoted above is based on an analysis of the effective
potential, although it turns out that simply requiring that $\lambda$ remains positive leads to
essentially the same result. A recent two-loop analysis (see the second paper in [26])
yields a slightly smaller number, 140 GeV.
There is a further remark to be made here. We should probably not worry about
the absolute stability of our vacuum, but rather about its lifetime. It could decay via
tunneling or thermal fluctuations, or even as a result of high energy collisions in particle
accelerators. All we need to require is that it lives longer than the age of the universe.
This inevitably lowers the bound somewhat, but not by more than a few GeV [26].
This has to do with the differences in mass renormalization between fermions and scalars
(which set the weak scale). For fermions one has
\[ \delta m \propto g^2\, m \log(\Lambda/m) , \tag{7.1} \]
The mass renormalization is formally infinite in both cases, if we make the cut-off $\Lambda$
arbitrarily large. But the fermion mass correction has two positive features: it diverges
only logarithmically, and it is proportional to the mass of the particle. Although $\log(\Lambda)$
is infinite, $\log(M)$ is a number of order 1 for any reasonable choice of $M$, such as the
Planck scale. On the other hand, if we substitute the Planck mass for $\Lambda$ in the scalar
mass correction, the correction is 17 orders of magnitude larger than the physical mass.
In other words, there is no protection mechanism in the Standard Model to keep the
scalar mass small.
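The two estimates can be checked with a few lines of arithmetic. The sketch below assumes a Planck mass of about $1.22 \times 10^{19}$ GeV, a representative weak-scale mass of 100 GeV, and a scalar mass correction that grows linearly with the cut-off (a quadratically divergent mass squared); all three are assumptions made for illustration.

```python
import math

M_PLANCK_GEV = 1.22e19     # assumed value of the Planck mass
M_ELECTRON_GEV = 0.511e-3  # electron mass
M_WEAK_GEV = 100.0         # assumed representative weak-scale (scalar) mass

# Fermion case: the correction is proportional to m * log(cutoff / m).
log_factor = math.log(M_PLANCK_GEV / M_ELECTRON_GEV)

# Scalar case: a correction of the order of the cut-off itself,
# compared with the physical mass, expressed in orders of magnitude.
orders_of_magnitude = math.log10(M_PLANCK_GEV / M_WEAK_GEV)

print(f"log(M_Planck / m_e)       = {log_factor:.1f}")
print(f"log10(M_Planck / M_weak)  = {orders_of_magnitude:.1f}")
```

The logarithm stays a modest number even for a Planck-scale cut-off, whereas the scalar correction overshoots the physical mass by roughly 17 orders of magnitude.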
This is not necessarily a fundamental problem. In both cases one can absorb the
infinities into the bare mass and obtain any desired value for the physical mass. This
requires huge readjustments at every loop order for the scalar mass, but one could retort
that perturbation theory is a typically human activity, and that nature does not work
order by order in perturbation theory. If there is some good reason why the physical
parameter should be small, then it is not obvious that a protection mechanism is needed
to keep it small in perturbation theory.
In other words, there are really two problems: why is a parameter small, and why
does it remain small. To appreciate this, note that the scalar mass suffers from both
problems, but that the electron mass is protected. We do not know why it is small,
but at least all corrections are proportional to the mass itself. Another way to view the
difference between these two cases is to check if one gains any new symmetries if the small
parameter is put to zero. In the case of the electron that is true: one obtains a U (1) chiral
symmetry, which forbids any perturbative contributions to the electron mass. In terms of
Feynman diagrams, one can view eL and eR as completely independent fermions, whose
lines can be followed through each diagram. If there is no $e_L e_R$ vertex in the theory, it
can never be generated. Note that this argument is strictly perturbative. For example
the same reasoning could be used for quarks, but we know already that non-perturbative
QCD effects break the chiral symmetry spontaneously.
On the other hand, one does not gain a symmetry if one puts a scalar mass to zero
(naively one would expect to gain a scaling symmetry (x) l(l1 x), but this symmetry
is explicitly broken by the mass scale introduced in any regularization procedure).
The maximal number of protons in a compact object is given roughly by $(M_{\rm Planck}/m_{\rm proton})^3$, and is
about $10^{57}$ in our universe, the number of protons in a star. Indeed, stars have a broad range of values
for brightness, but their masses are within one or two orders of magnitude equal to $10^{57}$ proton masses.
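The order of magnitude quoted here is easy to reproduce. The following lines assume $M_{\rm Planck} \approx 1.22 \times 10^{19}$ GeV and $m_{\rm proton} \approx 0.938$ GeV; the variable names are illustrative.

```python
M_PLANCK_GEV = 1.22e19   # assumed value of the Planck mass
M_PROTON_GEV = 0.938     # proton mass

# Rough maximal number of protons in a compact object.
max_protons = (M_PLANCK_GEV / M_PROTON_GEV) ** 3
print(f"(M_Planck / m_proton)^3 = {max_protons:.1e}")
```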
question depends on the options that exist fundamentally, and how they are distributed
and selected. This cannot be answered if all we know is the Standard Model. We need
some fundamental theory, that presumably must include gravity.
This just serves as an additional warning that some of the Standard Model problems
we are trying to solve may not have any conventional solution at all. But not all small
parameter problems are potentially anthropic. For example, there is no such argument
for the $\theta$ parameter of QCD.
cosmological constant
\[ \Lambda_c = \frac{8\pi G_N}{c^2} V_0 \tag{7.3} \]
If $V_0 > 0$ the solution to the matter-free Einstein equations is not Minkowski space, but
de Sitter space; for a negative value one gets anti-de Sitter space. For a long time there was a
strongly held belief that its value would be exactly zero, but recent observations point to
a non-zero and positive value of the order of $10^{-84}$ GeV$^2$, which implies that the expansion
of our universe is accelerating.
Perhaps the natural scale for the cosmological constant would be the Planck scale, in which case
the expected value is about $10^{38}$ GeV$^2$. But even if a Planck-scale contribution can be avoided, the shift
in the Higgs potential due to weak interaction symmetry breaking is about $10^{-33} M_H^2$,
where $M_H$ is the Higgs mass. Clearly this is also much larger than the observed value, and
represents a fine-tuning problem that is much worse than that of the weak interaction
scale in comparison to the Planck scale.
It is common practice to ignore the cosmological constant problem when one tries to
solve the other fine-tuning problems. One hopes that a full understanding of gravity will
lead to an understanding of why the cosmological constant is so small. This may be correct, but
it is also possible that all fine-tuning problems are related and have a common solution.
If that is true, we would be wasting our time by trying to understand the smallness
of $\theta$ and $M_W/M_{\rm Planck}$ while ignoring the smallness of the cosmological constant.
8 Grand Unification
In this chapter we discuss the idea of embedding the Standard Model in a larger gauge
group. One of the motivations for doing that is the convergence of the coupling constants
at high energies. We begin by examining this more closely.
So we have a plot with three lines, one of which can be scaled by an arbitrary factor.
Clearly there is always a factor so that the three lines go through one point, unless two
of them are parallel. Indeed, the observed unification at $10^{15}$ GeV does not occur for
the coupling constants $g_1$, $g_2$ and $g_3$ introduced in chapter 4 but for $1.291\,g_1$, $g_2$ and $g_3$.
The excitement caused by this discovery had two reasons: first that the scale, $10^{15}$ GeV, was
reasonable, and second that $1.291 \approx \sqrt{5/3}$, a number that can be explained by group
theory as we will see in a moment.
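The geometric statement that one free rescaling is always enough to make three straight lines meet can be made concrete. The sketch below treats each line as $y(t) = a + bt$, finds the crossing point of the second and third line, and computes the factor by which the first must be multiplied to pass through that point. The function name and the numerical inputs are hypothetical and chosen only for illustration; they are not the measured couplings.

```python
import math
from fractions import Fraction as F

def rescale_first_line(line1, line2, line3):
    """Each line is (a, b) for y(t) = a + b*t.
    Return (k, t, y): factor k for line1, and the common crossing point (t, y)."""
    (a1, b1), (a2, b2), (a3, b3) = line1, line2, line3
    if b2 == b3:
        raise ValueError("lines 2 and 3 are parallel, so they never cross")
    t_cross = (a3 - a2) / (b2 - b3)
    y_cross = a2 + b2 * t_cross
    y1 = a1 + b1 * t_cross
    if y1 == 0:
        raise ValueError("line 1 vanishes at the crossing point and cannot be rescaled onto it")
    return y_cross / y1, t_cross, y_cross

# Hypothetical intercepts and slopes, purely illustrative.
LINE1, LINE2, LINE3 = (F(60), F(-1, 2)), (F(30), F(1, 4)), (F(10), F(1))
k, t_cross, y_cross = rescale_first_line(LINE1, LINE2, LINE3)
print("rescaling factor:", k, " crossing at t =", t_cross, " y =", y_cross)

# The group-theoretical number the observed factor is compared with.
print("sqrt(5/3) =", round(math.sqrt(5 / 3), 3))
```

The non-trivial content of unification is therefore not that a factor exists, but that the required factor matches a value fixed by group theory.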
Figure 6: Unification of coupling constants (vertical axis: inverse coupling). The dashed lines are explained in section 8.1.1
Any such statement is based on assumptions about the physics beyond the weak scale.
Since any particle in SU(3) × SU(2) × U(1) representations alters the $\beta$-functions, one
is assuming that there are no (or very few) unobserved particles between 100 and $10^{15}$
GeV, except for SU(3) × SU(2) × U(1) singlets. Any unknown massive particle changes
the slope of one or more of the lines. Since it only has effects for scales larger than its
own mass, the result would be a kink in the straight lines in the figure. Note that any
additional matter affects all three lines by bending them in the same direction (namely
downward, with increasing energy), since matter contributions to $b_0$ always have the same
sign. We will see in a moment that it is not quite true that no matter is allowed in the
desert between 100 and $10^{15}$ GeV, since there is a natural mechanism for bending all
lines in exactly the right way so that they continue to merge, as shown by the dashed line
in Fig. 6.
The fact that two coupling constants are equal at a certain scale need not have physical
implications. They may just cross each other and continue. But one is tempted to
conclude that it has a deeper meaning, namely that the three groups of the Standard
Model somehow are combined into one unified theory.
The normalization guarantees that the propagator for each field $A^a_\mu$ is normalized in the
standard way, namely as $-i\delta_{ab}\,g_{\mu\nu}/k^2$ (Feynman gauge). This is important here since
only if the fields are normalized properly can one read off the coupling constant from the
Lagrangian. The gauge coupling to all fields is governed by the covariant derivative
\[ D_\mu = \partial_\mu - i g A^a_\mu T^a , \tag{8.2} \]
where the generators are in the representation of the field they act on.
Now suppose by some mechanism one introduces a gauge boson mass matrix
Now we express the other terms in the action in terms of the new fields B. We note first
of all that the quadratic terms in the kinetic action are not affected since S is orthogonal
(we worry about the gauge self-couplings later). For the covariant derivatives we find
Suppose now that some of the fields B remain massless, i.e. that M has some zero eigen-
values. Massless vector fields must necessarily be gauge bosons, and hence whatever
mechanism we use to generate the mass matrix M , it must be such that the massless
gauge bosons couple to a closed set of Lie algebra generators. From the form of the
covariant derivative we read off these generators
\[ U^{\hat a} = S_{\hat a b}\, T^b . \tag{8.7} \]
Here the hat on the label $a$ indicates those labels in the set for which $B^{\hat a}$ is massless.
In order to define a coupling constant, we have to fix not only a normalization of
the gauge fields (as we have already done), but also for the generators. This works
differently in abelian and in non-abelian theories. The canonical normalization for
non-abelian gauge theories is given in appendix B. For SU(N) groups, it is ${\rm Tr}\, T^a T^b = \frac12 \delta^{ab}$
in the N-dimensional representation (the vector representation). In general, one has
\[ {\rm Tr}_R\, T^a T^b = I_2(R)\, \delta^{ab} , \tag{8.8} \]
where the subscript R indicates the representation of G under consideration, and I2
is the second index, defined in appendix B. These are integers, whose normalization
has been fixed a priori for all non-abelian Lie algebras. For abelian groups there is no
intrinsic normalization. However, the electromagnetic coupling constant has a definite
value because we have fixed the charge of the proton to be 1, thus fixing the overall
normalization of the U (1)em generator.
In the present case, orthogonality of S implies
\[ {\rm Tr}_R\, U^{\hat a} U^{\hat b} = I_2^G(R)\, \delta^{\hat a \hat b} . \tag{8.9} \]
Here we have indicated that the second index of G appears on the right-hand side. The
generators $U$ form a sub-algebra H of G. For the time being, we will assume that H is a
simple non-abelian Lie algebra. In terms of H-representations, the representation space
of R decomposes (in general) into a direct sum (the notation $\sum^{\oplus}_k$ indicates a direct sum
over components labelled by $k$)
\[ R \to \sum^{\oplus}_k r_k , \tag{8.10} \]
In addition to the normalization one should also make sure that the structure constants $f^{abc}$ have some
standard form. Once the normalization is fixed, this can always be achieved by choosing the orthogonal
matrix S appropriately.
Here I(G/H) is called the embedding index for the embedding of H in G. Note that the normalization
factor must be the same for all representations, so that the representation dependence must
cancel in the ratio. It is then easy to see that I(G/H) must be an integer for any algebra
G with a representation of index 1 (such as SU (N )), since one can use that representation
to define I(G/H). In fact it can be shown that I(G/H) is always an integer for simple
non-abelian groups G and H.
Let us now return to the covariant derivative. When we express it in terms of the
canonically normalized fields $B$ and generators $U$, we can now read off the correctly
normalized coupling constant, which is the coefficient of these quantities:
\[ g_H = \frac{1}{\sqrt{I(G/H)}}\, g \tag{8.13} \]
The discussion of the gauge-boson self-couplings is essentially the same, with the repre-
sentation R equal to the adjoint representation. This decomposes into several irreducible
components, one of which is the adjoint representation of H. To get the correctly
normalized structure constants, we need the same normalization factor, which is absorbed
in the coupling constant. The original gauge kinetic terms now split into gauge kinetic
terms for the massless gauge bosons $B^{\hat a}$, plus minimal couplings of these bosons to the
massive B-bosons.
If there is more than one simple factor in H, one defines a separate embedding index
for each of them. For the coupling constant of the ith factor one finds then
\[ g_i = \frac{1}{\sqrt{I(G/H_i)}}\, g \tag{8.14} \]
In other words, at the unification scale (at which the symmetry breaks), the quantities
$\sqrt{I(G/H_i)}\, g_i$ are equal to each other and to $g$.
In this situation the logarithm of the unification scale is determined by solving a linear
equation of the form
\[ \frac{1}{I_i}\left[ \frac{1}{g_i^2} + b_0^i\, t \right] = \frac{1}{I_j}\left[ \frac{1}{g_j^2} + b_0^j\, t \right] , \tag{8.15} \]
where $i$ and $j$ label two factors of the gauge group, and $I_i$ is an abbreviation of $I(G/H_i)$.
It is interesting to examine the effect on convergence of coupling constants due to extra
matter representations. Suppose two coupling constants converge at some large scale $t_0$
when they are naively extrapolated from some low scale. This means that Eq. (8.15) is
satisfied for $t = t_0$. We use the word naively, because inevitably such a statement implies
an assumption about the presence of matter between the low scale and $t_0$. Suppose we
modify the evolution by adding matter in a representation $R$ of the low energy gauge
group $H_i \times H_j$. This changes $b_0^i$ to $b_0^i - sI_2^i(R)$, where $s$ is a spin-dependent factor, and
similarly for $j$. Here $I_2^i$ denotes of course the second index of the group $H_i$. At the scale $t_0$
the groups $H_i$ and $H_j$ are embedded in $G$. Suppose now that the $H_i \times H_j$ representation $R$
forms exactly a representation of $G$. Then the modification of each side of Eq. (8.15) can
be written as $\frac{I_2^i(R)}{I_i}\, t = I_2^G(R)\, t$ for $i$ as well as $j$. The result is thus independent of $i$, and
hence both sides of the equation change in exactly the same way. The solution $t_0$ remains
thus unchanged (the value of the coupling constant at unification does change, and is
in fact increased). This is illustrated by the dashed lines in fig. 6. Here the extra matter
is assumed to have a mass somewhere in the desert between $M_W$ and $M_{\rm GUT}$. This result is
completely independent of the precise decomposition of $R$ with respect to the subgroup,
but does assume that all its components get roughly the same mass. Even if unification
takes place, it need not always happen that an $H_i \times H_j$ representation $R$ is exactly a $G$
representation. It might happen that to complete it to a $G$ representation additional
particles with a mass near the unification scale are required. In that case $t_0$ does change,
and furthermore simultaneous unification for three or more coupling constants may be
affected. In any case, the lesson is that the fact that couplings unify, or the scale at which
this happens, is less sensitive to intermediate matter than one might have expected.
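The insensitivity of the unification point to complete multiplets can be verified on a toy example. The sketch below assumes the unification condition in the form $\frac{1}{I_i}[1/g_i^2 + b_0^i t] = \frac{1}{I_j}[1/g_j^2 + b_0^j t]$ and a matter contribution $b_0^i \to b_0^i - sI_2^i(R)$ with $I_2^i(R) = I_i\, I_2^G(R)$ for a complete $G$ multiplet. For simplicity the extra matter is taken to contribute over the whole range of $t$. All numerical values and the function names are hypothetical.

```python
from fractions import Fraction as F

def unification_t(I_i, inv_g2_i, b_i, I_j, inv_g2_j, b_j):
    """Solve (1/I_i)(1/g_i^2 + b_i t) = (1/I_j)(1/g_j^2 + b_j t) for t."""
    slope_difference = b_i / I_i - b_j / I_j
    if slope_difference == 0:
        raise ValueError("the two lines are parallel")
    return (inv_g2_j / I_j - inv_g2_i / I_i) / slope_difference

def unified_inverse_coupling(I, inv_g2, b, t):
    """Value of 1/g^2 of the unified group at the scale t."""
    return (inv_g2 + b * t) / I

# Hypothetical embedding indices, low-scale inverse couplings and beta coefficients.
I_I, I_J = F(1), F(2)
INV_G2_I, INV_G2_J = F(40), F(30)
B_I, B_J = F(-2), F(1)

t_before = unification_t(I_I, INV_G2_I, B_I, I_J, INV_G2_J, B_J)

# Extra matter forming a complete G multiplet: I_2^i(R) = I_i * I_2^G(R).
S, I2_G = F(1, 3), F(3)
b_i_new = B_I - S * I_I * I2_G
b_j_new = B_J - S * I_J * I2_G
t_after = unification_t(I_I, INV_G2_I, b_i_new, I_J, INV_G2_J, b_j_new)

inv_g2_before = unified_inverse_coupling(I_I, INV_G2_I, B_I, t_before)
inv_g2_after = unified_inverse_coupling(I_I, INV_G2_I, b_i_new, t_after)
print("t0 before/after:", t_before, t_after)
print("1/g^2 at unification before/after:", inv_g2_before, inv_g2_after)
```

In this toy example the crossing point stays put while the inverse coupling at unification drops, i.e. the unified coupling itself grows.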
define $t$ by writing this eigenvalue as $e^{2\pi i t/3}$. Note that $t$ is opposite for complex
conjugate representations, and 0 for real ones, including the adjoint. Confinement allows only
particles with total triality equal to zero in the spectrum, and then the observed charge
quantization follows.
A similar relation holds for the known SU(3) × SU(2) × U(1) representations, namely
where $s$ is SU(2) duality (equal to 1 for spinor representations and to 0 for vectors).
Because the electromagnetic charge is $Q_{\rm em} = T_3 + Y$ this leads automatically to the
SU(3) × U(1) relation of the previous paragraph.
Mathematically this means that the Standard Model gauge group we have observed
so far is not SU(3) × SU(2) × U(1), but S(U(3) × U(2)). The fundamental representation
of this group consists of matrices of the form
\[ U = \begin{pmatrix} U_3 & 0 \\ 0 & U_2 \end{pmatrix} , \tag{8.17} \]
where $U_i$ is an element of U(i), with the condition $\det U = 1$. The latter is precisely the
charge quantization condition. The Lie algebra of this group is exactly the same as that
of SU(3) × SU(2) × U(1), but the groups are globally different. A comparable situation
occurs between the groups SO(3) and SU(2): they have the same Lie algebra, but the
latter has spinor representations, and the former does not. It is precisely the absence
of certain representations from the spectrum that leads us to conclude that the group is
S(U(3) × U(2)), and not SU(3) × SU(2) × U(1).
To see how charge quantization arises, note that elements of S(U(3) × U(2)) can be
parametrized as
\[ g(\hat U_3, \hat U_2, \theta) = \begin{pmatrix} \hat U_3\, e^{-i\theta/3} & 0 \\ 0 & \hat U_2\, e^{i\theta/2} \end{pmatrix} , \tag{8.18} \]
where the hatted matrices are elements of the groups SU(3) and SU(2) respectively.
This includes the group element $g(z, y, 2\pi)$, where $z = {\rm diag}(e^{2\pi i/3}, e^{2\pi i/3}, e^{2\pi i/3})$ and
$y = {\rm diag}(-1, -1)$. Note that $g(z, y, 2\pi) = 1$. This element can be reached by multiplying
a series of group elements that are close to the
identity. Since products are preserved in any representation, by definition of the latter, this
element must equal the identity in any representation. Consider then a representation that
is trivial in SU(3) (a color singlet) and in SU(2). For the Lie algebra of SU(3) × SU(2) × U(1),
this could be the representation $R = (1, 1, q)$, with representation matrices
$g_R(\hat U_3, \hat U_2, \theta) = e^{iq\theta}$, for any real value of $q$. But in the group S(U(3) × U(2)) most of these values of $q$
do not give rise to valid representations, because the element $g_R(z, y, 2\pi) = e^{2\pi i q}$ must be
equal to the identity. Hence the charge $q$ must be an integer. This shows that we get
integral charges for SU(3) × SU(2) singlets, and hence we see that $q$ is normalized in the
right way to be interpreted as the Standard Model charge $Y$. Otherwise we would have
had to introduce a normalization factor at this point.
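The argument can be checked numerically. The sketch below assumes the parametrization $g = {\rm diag}(\hat U_3 e^{-i\theta/3}, \hat U_2 e^{i\theta/2})$, with $z = e^{2\pi i/3}\,\mathbf{1}_3$ and $y = -\mathbf{1}_2$ (the opposite sign convention for $\theta$, with $z$ conjugated, works equally well). It confirms that $g(z, y, 2\pi)$ is the identity and that a singlet representation $e^{iq\theta}$ respects this only for integer $q$. The helper names are hypothetical.

```python
import cmath
import math

def g_diagonal(z_phase, y_sign, theta):
    """Diagonal entries of g(z, y, theta) for z = z_phase * 1_3 and y = y_sign * 1_2
    (sign convention for theta is an assumption, see lead-in)."""
    upper = [z_phase * cmath.exp(-1j * theta / 3)] * 3
    lower = [y_sign * cmath.exp(1j * theta / 2)] * 2
    return upper + lower

def singlet_is_valid(q, tol=1e-12):
    """A singlet (1, 1, q) must represent g(z, y, 2*pi) by 1, i.e. exp(2*pi*i*q) = 1."""
    return abs(cmath.exp(2j * math.pi * q) - 1) < tol

z_phase = cmath.exp(2j * math.pi / 3)   # z has determinant z_phase**3 = 1
entries = g_diagonal(z_phase, -1, 2 * math.pi)
is_identity = all(abs(entry - 1) < 1e-12 for entry in entries)

print("g(z, y, 2*pi) is the identity:", is_identity)
for q in (1, -2, 0.5, 1 / 3):
    print(f"q = {q:+.3f} allowed: {singlet_is_valid(q)}")
```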
Having normalized the charge correctly with respect to the Standard Model
conventions, we may now consider other representations. For a general representation $g_R(\hat U_3, \hat U_2, \theta)$
the element $g(z, y, 2\pi)$ is given by
Since this must be one, we see that all representations of S(U(3) × U(2)) indeed satisfy
the observed quantization condition Eq. (8.16).
Of course all this depends on the fermion representations observed so far, and we do
not know whether this is a fundamental property of nature, a coincidence, or something
else. Note that the requirement of anomaly cancellation does not really give us much
choice for the charge Y , at least not within one family of 15 fermions. However, this
already changes if we add an SU(3) × SU(2) singlet of arbitrary charge. This gives us
16 fermions, the same as a Standard Model family with a right-handed neutrino. Even
if we require that the Higgs boson can give a mass to all fermions (which implies a
relation between the Y charges of the singlets and the doublets) there is a solution to
anomaly cancellation with arbitrary real charge. We can also add a massive fermion in the
representation $(1, 1, q)_L + (1, 1, -q)_L$, where $q$ is completely arbitrary, and in particular
could be fractional.
turns out that none of the Standard Model particles can be mapped into each other by
this symmetry. This implies (at least) adding a boson for every Standard Model fermion
and a fermion for each boson, always in the same SU(3) × SU(2) × U(1) representations,
and it also implies doubling the Higgs sector. If we assume that all this extra matter has
a mass of around 1 TeV, it makes the three lines bend in precisely the right way. So far
LHC has not revealed any evidence for supersymmetric particles of masses near or below
1 TeV. Nevertheless, the concept of SU (5) unification is still important enough to have a
closer look.
The idea of SU(5) gauge unification is extremely simple. One builds a gauge theory
with an SU(5) gauge group, and then one breaks this symmetry group spontaneously to
its SU(3) × SU(2) × U(1) subgroup. The spontaneous breaking is achieved by a new
Higgs-like field that must be added to the theory. This Higgs field is assumed to get a
vev of about $10^{15}$ GeV, the energy scale at which the three gauge coupling lines cross in
fig. 6. This is called the GUT scale. This gives a mass to all the gauge bosons that are
in SU(5), but not in SU(3) × SU(2) × U(1). Below the GUT scale these extra gauge
bosons no longer contribute to the running of the gauge couplings, so that as the couplings
run to lower energies they each go their own way, resulting in fig. 6. In this way the three Standard
Model gauge groups are unified, and so are the gauge couplings.
The smallest simple group in which one can embed the Standard Model group is SU (5).
The gauge action is just the canonical one, with a coupling constant g5 . The fermions
are minimally coupled in a way that depends only on their SU (5) representations. This
representation must be anomaly free, and we will need three copies to get three families. It
must also be complex, since otherwise we would expect it to be massive, and furthermore
the theory would be invariant under C and P, while the Standard Model is not. It must have
at least 15 or 16 Weyl fermions per family, and preferably not more. This is just a rough
guide towards the right answer; ultimately we must find the correct SU(3) × SU(2) × U(1)
representations by working out the breaking of SU(5).
Note that we will use the left-handed representation for all the fermions. This allows
us to transform them freely into each other. One cannot make internal rotations among
fermions with different handedness.
where $U_3$ and $U_2$ are unitary $3 \times 3$ and $2 \times 2$ matrices satisfying the relation $\det U_3 \det U_2 = 1$.
This is precisely the group S(U(3) × U(2)) identified in section 8.2 as the global group
of the Standard Model. If we write $U_3 = e^{i\alpha} \hat U_3$ and $U_2 = e^{i\beta} \hat U_2$ where $\hat U_3$ and $\hat U_2$ have
determinant 1, then we have identified the SU(3) and SU(2) subgroups. The phases must
satisfy $3\alpha + 2\beta = 0 \bmod 2\pi$. This leaves one independent phase, corresponding to the
U(1).
\[ 10 \to (3^*, 1, -\tfrac{2}{3} q) + (1, 1, q) + (3, 2, \tfrac{1}{6} q) , \tag{8.24} \]
\[ 15 \to (6, 1, -\tfrac{2}{3} q) + (1, 3, q) + (3, 2, \tfrac{1}{6} q) , \tag{8.25} \]
Here we have allowed for an arbitrary real factor q since the normalization of U (1)
charges is not fixed by the algebra. The SU (3) and SU (2) generators can simply be taken
as a subset of the SU (5) generators.
8.4.2 Normalization of Generators.
From the point of view of SU (5) there is a natural normalization for the U (1) generator.
We choose the canonical normalization for the vector representation of SU (N ), so that
${\rm Tr}\, T^a T^b = \frac12 \delta^{ab}$. It is important that this trace is proportional to $\delta^{ab}$, since this was
implicitly assumed in writing the gauge kinetic terms. Note that this normalization is
indeed the one we used previously in SU(2) to derive the relation between $T_3$, $Y$ and the
electric charge, and in the computation of the $\beta$ function.
If we make sure that the SU(3) × SU(2) × U(1) generators all have the same
normalization, we can choose a basis for the 24 SU(5) generators consisting of 12 SU(3) ×
SU(2) × U(1) generators (numbered 1 . . . 12) and 12 remaining ones. Then
\[ \sum_{a=1}^{24} A^a_\mu T^a = \sum_{a=1}^{12} A^a_\mu T^a + \text{rest} . \tag{8.26} \]
(The terms denoted "rest" will be discussed later.) The properly normalized generators
appear in the Lagrangian in combination with the unified coupling constant $g_5$. If we
want to view our U(1) generator directly as a properly normalized generator, we should
choose $T_Y = \sqrt{3/5}\; {\rm diag}(-\frac13, -\frac13, -\frac13, \frac12, \frac12)$, which satisfies ${\rm Tr}\, T_Y^2 = \frac12$; in other words, the
factor $q$ introduced above equals $\sqrt{3/5}$.
If we now compare the SU(5) minimal couplings with those of the Standard Model,
we get immediately the relations $g_2 = g_3 = g_5$, $g_1 = \sqrt{3/5}\, g_5$. These are precisely the
relations required for coupling constant unification (according to the pre-LEP data at
least). From now on we will absorb the factor $\sqrt{3/5}$ in the definition of the coupling
constant, so that the content of the representation 5 is $(3, 1, -\frac13) + (1, 2, \frac12)$, i.e. we set $q = 1$
from here on. The last entry is now precisely the Y-charge as defined previously.
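The normalization factor can be checked with exact rational arithmetic. The sketch below assumes hypercharge entries $(-\frac13, -\frac13, -\frac13, \frac12, \frac12)$ on the diagonal of the 5 and the canonical normalization ${\rm Tr}\, T^2 = \frac12$; it works with the square of the factor so that no square roots are needed.

```python
from fractions import Fraction as F

# Hypercharges of the five components of the 5: a colour triplet and a weak doublet.
Y_ENTRIES = [F(-1, 3)] * 3 + [F(1, 2)] * 2

trace_y = sum(Y_ENTRIES)                      # an SU(5) generator must be traceless
trace_y_squared = sum(y * y for y in Y_ENTRIES)

# Require Tr (q*Y)^2 = 1/2 and solve for q^2.
q_squared = F(1, 2) / trace_y_squared

print("Tr Y     =", trace_y)
print("Tr Y^2   =", trace_y_squared)
print("q^2      =", q_squared)
print("Tr T_Y^2 =", q_squared * trace_y_squared)
```

The result $q^2 = 3/5$ is the origin of the factor $\sqrt{5/3} \approx 1.291$ relating $g_1$ to the unified coupling.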
8.5.2 Matter in the Five-Dimensional Representation.
The representations contained in the 5 do not match any Standard Model particle, but
the complex conjugates do. Hence we choose the anomaly-free representation $5^* + 10$ (we
could just as easily have conjugated the embedding of SU(3) × SU(2) × U(1) in the 5,
but that is not the standard convention). Now the $5^*$ precisely contains particles with
the quantum numbers of $d^c$, $e^-$ and $\nu$, i.e. the representation
\[ (3^*, 1, \tfrac13) + (1, 2, -\tfrac12) \tag{8.27} \]
Here and in the following all fermions are left-handed unless explicit subscripts R are
shown.
[There is one subtlety here. In a normal SU(2) doublet the upper component has an
electric charge that is higher (by one unit) than that of the lower, because $Q_{\rm em} = T_3 + Y$.
This is true for the doublet $(e^+, \bar\nu)$ in the 5 but not for the doublet $(e^-, \nu)$ in the $5^*$. The
reason is simple: the doublet in the $5^*$ transforms in the complex conjugate representation
$2^*$, and not in the 2. These representations are equivalent, but the equivalence relation
involves the invariant tensor $\epsilon_{ij}$, which turns the doublet upside down.]
Now we construct the 10 by taking the anti-symmetric product of two 5s. This field is
most easily represented by a $5 \times 5$ matrix, whose elements $i, j$ have the quantum numbers
of the tensor product of the $i$th component times the $j$th component of the 5. Here $e^+ d_i$
yields the SU(3) × U(1) representation of $u_i$ and $\epsilon_{ijk} d_i d_j$ that of $u^c_k$. The result is
\[ 10 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & u^c_3 & -u^c_2 & -u_1 & -d_1 \\ -u^c_3 & 0 & u^c_1 & -u_2 & -d_2 \\ u^c_2 & -u^c_1 & 0 & -u_3 & -d_3 \\ u_1 & u_2 & u_3 & 0 & -e^+ \\ d_1 & d_2 & d_3 & e^+ & 0 \end{pmatrix} \tag{8.30} \]
The factor $\frac{1}{\sqrt{2}}$ is added to ensure that the kinetic terms have the proper normalization
(note that every field appears twice in the 10).
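The particle content of the 10 can be cross-checked by adding electric charges of pairs of components of the 5. The sketch below assumes the 5 contains three colour components with charge $-\frac13$ (labelled `d1`, `d2`, `d3`), a charge $+1$ component (`e+`) and a neutral one (`nubar`); the labels are only illustrative.

```python
from collections import Counter
from fractions import Fraction as F
from itertools import combinations

# Components of the 5 with their electric charges (assumed labelling, see lead-in).
FIVE = [("d1", F(-1, 3)), ("d2", F(-1, 3)), ("d3", F(-1, 3)), ("e+", F(1)), ("nubar", F(0))]

# The anti-symmetric product keeps each unordered pair of distinct components once.
pair_charges = {(a, b): qa + qb for (a, qa), (b, qb) in combinations(FIVE, 2)}
charge_counts = Counter(pair_charges.values())

for charge, multiplicity in sorted(charge_counts.items()):
    print(f"charge {str(charge):>4}: {multiplicity} state(s)")
```

Three states of charge $-\frac23$, three of charge $\frac23$, three of charge $-\frac13$ and one of charge $1$ match $u^c$, $u$, $d$ and $e^+$ respectively, ten states in total.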
Higgs is used here as a generic name for a mechanism that breaks gauge symmetries.
The Higgs boson found at LHC is a remnant of one particular Higgs mechanism, the
one that breaks SU (3) SU (2) U (1) to SU (3) U (1). There may be several more
such mechanisms operative in nature. They must work at a higher energy scale, since
otherwise we would presumably have detected them already. This implies in particular
that the vacuum expectation value of these new Higgs mechanisms must be larger than
the one of the Standard Model, about 246 GeV.
The most important property of the new Higgs scalar field we are looking for is its
coupling to the gauge bosons. As for all fields, this is completely determined by its gauge
group representation. Which representation should we use? This is almost a science in
itself. Many papers have been written about the question which representation of a group
G and which potential breaks G to a certain subgroup H. These papers usually assume
the Higgs potential to be quartic, so that the theory is renormalizable. Since we do not
trust the renormalizability of the Higgs system that much anyway, this requirement should
perhaps not be taken too seriously. Indeed, it is quite reasonable to expect couplings of
the form $\Lambda^{-2}\phi^6$, where $\Lambda$ is the scale where the coupling constant blows up. This would
not be allowed in renormalizable theories because it means that we cannot make sense
of the theory for momenta larger than $\Lambda$, but this we cannot do for the scalar theory
anyway. If $\Lambda$ is close to the Higgs mass the theory is strongly coupled, and such higher
order terms in $\phi$ may be relevant for the determination of the minimum.
There is however one criterion that is important: the vacuum expectation value of
$\phi$ is invariant under the unbroken gauge group, by definition of the latter. Hence the
decomposition of the representation of $\phi$ with respect to the subgroup H must contain a
singlet.
Searching again through the representations of SU (5) we find that the smallest rep-
resentation containing a singlet is the adjoint, 24. Its full decomposition is
This can be derived easily by computing the tensor product of a 5 and a $5^*$ and subtracting
a singlet. This is also the decomposition for the gauge bosons, and we recognize the first
three representations as those of the SU (3), SU (2) and U (1) gauge bosons.
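The counting behind this decomposition can be sketched as follows. The code assumes the content $5 = (3, 1, -\frac13) + (1, 2, \frac12)$ and forms all products of a component of the 5 with a component of the $5^*$, adding up dimensions and hypercharges. It only tracks dimensions and $Y$, not the full SU(3) × SU(2) structure, so the splitting of $3 \otimes 3^*$ into $8 + 1$ and of $2 \otimes 2^*$ into $3 + 1$ is taken as known.

```python
from collections import Counter
from fractions import Fraction as F

# (label, dimension, hypercharge Y) of the two components of the 5 (assumed content).
FIVE = [("3", 3, F(-1, 3)), ("2", 2, F(1, 2))]

states_by_block = Counter()
states_by_hypercharge = Counter()
for label_a, dim_a, y_a in FIVE:
    for label_b, dim_b, y_b in FIVE:
        hypercharge = y_a - y_b          # the conjugate component carries -Y
        states_by_block[(label_a, label_b + "*", hypercharge)] += dim_a * dim_b
        states_by_hypercharge[hypercharge] += dim_a * dim_b

total_states = sum(states_by_block.values())
for block, count in states_by_block.items():
    print(block, count)
print("total:", total_states, "-> adjoint has", total_states - 1, "states")
```

The 13 states with $Y = 0$ split into $8 + 3 + 1 + 1$, one singlet of which is removed; the remaining 12 states carry $Y = \mp\frac56$, which with $Q_{\rm em} = T_3 + Y$ gives electric charges of magnitude $\frac13$ and $\frac43$.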
The rest is straightforward. We couple this Higgs scalar to the gauge bosons in the
usual way. We cannot couple it to the fermions, because one cannot build a singlet
out of $5^*$, 10 and 24. The scalar gets a vacuum expectation value that breaks SU(5) to
SU(3) × SU(2) × U(1) and that gives a mass to the 12 unwanted gauge bosons. They
eat 12 of the 24 Higgs components, and the other 12 become massive scalars. These massive Higgs bosons are
of little interest since they do not couple to the fermions.
which we can decompose into two SU(3) × U(1) components. These components are
massive vector bosons usually called X and Y. They are color triplets and have charges
$\frac43$ and $\frac13$ respectively. Their coupling to fermions follows straightforwardly from the
minimal couplings in the SU(5) Lagrangian. They appear in these couplings as
\[ X^1_{\mu,i}\, T^1(i, 4) + X^2_{\mu,i}\, T^2(i, 4) + Y^1_{\mu,i}\, T^1(i, 5) + Y^2_{\mu,i}\, T^2(i, 5) , \tag{8.32} \]
where
\[ T^1(i, j)_{kl} = \tfrac12 (\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}) , \qquad T^2(i, j)_{kl} = -\tfrac12 i (\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}) . \]
These matrices are thus like $\frac12\sigma_1$ and $\frac12\sigma_2$. Just like one does for the W-bosons, we now
go to the charge eigenstates $X^{\pm} = \frac{1}{\sqrt{2}}(X^1 \mp iX^2)$ and analogously for Y (the upper index
refers only to the sign of the charge).
The full set of SU (5) gauge bosons can in fact be represented as a matrix G = Aa T a ,
where T a is a matrix in the representation 5 and only the group structure is indicated;
all space-time indices are suppressed. The group structure of the minimal coupling to the
field is then ()T (GT ), because GT = G is the matrix representing G in the
5 . The representation 10 is the anti-symmetric tensor product of two 5s. If we label
the field mn (m, n = 1, . . . 5), then the group structure of the couplings to is
mn [Gmk nl + Gnl mk ] kl ,
(8.33)
LX = g5
X [e dc e+ uc u] + c.c
+ d
2
g
5 Y [ dc
LY = 2
u e+ uc d] + c.c (8.34)
For simplicity we have suppressed color indices. They are contracted as follows for the
X-boson couplings: Xi dci , X i di , ijk Xi ucj uk and analogously for the Y boson couplings.
As expected these couplings violate both baryon number and lepton number. Diagrams
for processes leading to proton decay are easy to construct, for example
These diagrams contribute to the process $p \to e^+ m$, where $m$ is a meson, which could
be for example a or a . Note that we are not sure whether in the first diagram d
really is transformed to e+ or to another charged lepton. This we can only determine
after diagonalizing the mass matrices, and we will do that in a moment. If in fact the
lepton is a then the process is forbidden by energy conservation. But there are other
processes in which the lepton is a neutrino, which are allowed irrespective of the neutrino
species.
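The correct way to compute the couplings of the $X$ and $Y$ bosons is to take
into account the matrices $U$ and $V$ that were introduced in Eqs. (4.27) and (5.11) to
diagonalize the mass matrices. Then the couplings in Eq. (8.34) are replaced by matrices
in flavor space, and instead of Eq. (8.34) we get
$$\mathcal{L}_X = \frac{g_5}{\sqrt{2}}\,X_\mu\Big[\bar E\gamma^\mu [U_E^\dagger V_D] D^c + \bar D\gamma^\mu [U_D^\dagger V_E] E^+ + \bar U^c\gamma^\mu [V_U^\dagger U_U] U\Big]$$
$$\mathcal{L}_Y = \frac{g_5}{\sqrt{2}}\,Y_\mu\Big[\bar N\gamma^\mu [U_N^\dagger V_D] D^c - \bar U\gamma^\mu [U_U^\dagger V_E] E^+ - \bar U^c\gamma^\mu [V_U^\dagger U_D] D\Big] \tag{8.35}$$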
This makes many degrees of freedom of the previously unobservable rotation matrices $U$
and $V$ observable. Note that the matrix $U$ rotates left-handed quarks or leptons, whereas
$V$ rotates anti-quarks and anti-leptons. The couplings of the $Z$ and $W$-bosons involved
rotation matrices of the form $U^\dagger(x)U(y)$, where $x$ and $y$ are identical for the $Z$ bosons.
Here even for couplings involving only one quark species (the third term), no GIM-like
cancellation is possible. Note also the appearance of the neutrino mixing matrix $U_N$.
Previously it appeared in the CKM matrix for the $\nu e$ coupling, $U^\dagger(N)U(E)$. In the
absence of neutrino masses it can be set equal to $U_E$, which defines $\nu_e$ as the neutrino to
which $e$ couples. For more about the implications of these interactions for the stability
of the proton see section 8.10.
$$5^* \times 5^* = 10^* + 15^*$$
$$5^* \times 10 = 5 + 45$$
$$10 \times 10 = 5^* + 45^* + 50^*$$
Hence the candidates are 5, 10, 15, 45 and 50 (note that we can use scalars and their
conjugates to build Yukawa couplings). The decompositions of these fields with respect
to $SU(3)\times SU(2)\times U(1)$ are
$$5 \to (1, 2, \tfrac{1}{2}) + (3, 1, -\tfrac{1}{3})$$
$$10 \to (1, 1, 1) + (3^*, 1, -\tfrac{2}{3}) + (3, 2, \tfrac{1}{6})$$
$$15 \to (1, 3, 1) + (3, 2, \tfrac{1}{6}) + (6, 1, -\tfrac{2}{3})$$
$$45 \to (1, 2, \tfrac{1}{2}) + (3, 1, -\tfrac{1}{3}) + (3, 3, -\tfrac{1}{3}) + (3^*, 1, \tfrac{4}{3}) + (3^*, 2, -\tfrac{7}{6}) + (6^*, 1, -\tfrac{1}{3}) + (8, 2, \tfrac{1}{2})$$
$$50 \to (1, 1, -2) + (3, 1, -\tfrac{1}{3}) + (3^*, 2, -\tfrac{7}{6}) + (6^*, 3, -\tfrac{1}{3}) + (6, 1, \tfrac{4}{3}) + (8, 2, \tfrac{1}{2})\ .$$
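The decompositions above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the hypercharge normalization in which the Higgs doublet has $Y = \frac{1}{2}$; the table `DECOMPOSITIONS` and the helper `summarize` are illustrative names, not part of the text. For every multiplet the dimensions must add up to the dimension of the $SU(5)$ representation, and the hypercharge generator must be traceless.

```python
from fractions import Fraction as F

# Each entry: (dim of SU(3) rep, dim of SU(2) rep, hypercharge Y).
# Conjugate representations have the same dimension, so stars are not tracked here.
DECOMPOSITIONS = {
    5:  [(1, 2, F(1, 2)), (3, 1, F(-1, 3))],
    10: [(1, 1, F(1)), (3, 1, F(-2, 3)), (3, 2, F(1, 6))],
    15: [(1, 3, F(1)), (3, 2, F(1, 6)), (6, 1, F(-2, 3))],
    45: [(1, 2, F(1, 2)), (3, 1, F(-1, 3)), (3, 3, F(-1, 3)), (3, 1, F(4, 3)),
         (3, 2, F(-7, 6)), (6, 1, F(-1, 3)), (8, 2, F(1, 2))],
    50: [(1, 1, F(-2)), (3, 1, F(-1, 3)), (3, 2, F(-7, 6)), (6, 3, F(-1, 3)),
         (6, 1, F(4, 3)), (8, 2, F(1, 2))],
}


def summarize(parts):
    """Return (total dimension, trace of Y) for one SU(5) multiplet."""
    dim = sum(c * w for c, w, _ in parts)
    trace_y = sum(c * w * y for c, w, y in parts)
    return dim, trace_y


SUMMARY = {name: summarize(parts) for name, parts in DECOMPOSITIONS.items()}

for name, (dim, trace_y) in SUMMARY.items():
    print(f"{name}: dimension {dim}, Tr Y = {trace_y}")
```

A vanishing trace is only a necessary condition, but it is a quick way to catch a wrong sign or a misprinted hypercharge in such a table.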
To break $SU(2)\times U(1)$ without breaking $SU(3)$ we need a representation $(1, R, q)$ where
$R$ and $q$ are both non-trivial. The 5, the 45 and the 15 meet that requirement, but in the
latter case the candidate Higgs scalar is a triplet of $SU(2)$, and not a doublet. In addition
the 15 can only couple $5^*$ to itself. The fields in the $5^*$ are, in the usual Standard Model
notation, $d_R$ and $(\nu, e^-)_L$, and mass terms between any pair of these fields are undesirable
except for a possible Majorana mass for the neutrino. If that is the only mass we can get,
it means that the 15 is not a useful representation (not by itself, at least).
We will only discuss scalars $H$ in the 5 in some detail. The couplings to the combination
$5^* + 10$ are (here $i, j, k, l, m, n$ are $SU(5)$ indices, and $\alpha$ and $\beta$ are family indices)
$$g_1^{\alpha\beta}\,\psi^\alpha_i(5^*)\,C\,\psi^\beta_{kl}(10)\,H^*_m\,\delta_{ik}\delta_{lm} + \text{c.c.} \tag{8.36}$$
Note that the 10 is an anti-symmetric tensor product of two 5s, so we can represent it
as a field with two vector indices, satisfying $\psi_{ij} = -\psi_{ji}$. Since the indices $i, k$ (and $l, m$)
belong to conjugate representations, they can be contracted by a Kronecker $\delta$. For the
other coupling we need the invariant tensor $\epsilon_{ijklm}$ of $SU(5)$:
$$g_2^{\alpha\beta}\,\psi^\alpha_{ij}(10)\,C\,\psi^\beta_{kl}(10)\,H_m\,\epsilon_{ijklm} + \text{c.c.} \tag{8.37}$$
The fermion bi-linear is symmetric under the exchange of the two 10s (the sign change
coming from interchanging the two fermions is canceled since $C = -C^T$), and hence $g_2$
must be symmetric in $\alpha$ and $\beta$. To include right-handed neutrinos we have to add three singlet
representations of $SU(5)$. These can get a Majorana mass, and in combination with the
fermions in the representation $5^*$ and the Higgs they can get a Dirac mass. The coupling
to the Higgs boson is
$$g_{\rm neutrino}^{\alpha\beta}\,\psi^\alpha_i(5^*)\,C\,\psi^\beta(1)\,H_m\,\delta_{im} + \text{c.c.}\ , \tag{8.38}$$
where $\psi(1)$ is the $SU(5)$ singlet field.
Let us assume that the field H acquires a vacuum expectation value just like it does
in the Standard Model. This issue will require further discussion, since in principle the
field H could choose an arbitrary direction within SU (5). A completely random direction
would break color, but there are Higgs potentials for which that does not happen. If color
is not broken, the Higgs v.e.v. will choose a direction within $SU(2)\times U(1)$. This direction
is in principle arbitrary, but we have already fixed it by assigning particles to the elements
of the $5^*$ and the 10. This is standard practice, but conceptually not very elegant (in
the discussion of the Standard Model we did not follow this practice). Hence we choose
$\langle H_i\rangle = \frac{1}{\sqrt{2}}\,v\,\delta_{i5}$. Then the two Yukawa interactions yield the following fermion bi-linears
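$$\frac{1}{\sqrt{2}}\,g_1^{\alpha\beta}\,\psi^\alpha_i(5^*)\,C\,\psi^\beta_{i5}(10)\,v + \text{c.c.} \tag{8.39}$$
and
$$\frac{1}{\sqrt{2}}\sum_{i,j,k,l=1}^{4} g_2^{\alpha\beta}\,\psi^\alpha_{ij}(10)\,C\,\psi^\beta_{kl}(10)\,\epsilon_{ijkl}\,v + \text{c.c.}\ . \tag{8.40}$$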
$$-\bar D_R M_D D_L - \bar E_R M_E^\dagger E_L + \text{c.c.}\ , \tag{8.42}$$
where
$$M = M_D = M_E = \frac{v}{2}\,g_1\ . \tag{8.43}$$
The Hermitean conjugate on $M_E$ is due to the fact that the second term in Eq. (8.41) is
not of the form $\psi^c M \psi$, where $\psi$ is a particle and $\psi^c$ the antiparticle spinor. Hence the
lepton mass terms in Eq. (8.42) are obtained from the terms labeled c.c. in Eq. (8.41)
(c.f. Eq. (5.10)). This finds its origin in the fact that the $5^*$ contains the anti down quark
and the electron.
The second Yukawa coupling is just slightly more difficult to analyze. Note that
because of the $\epsilon$ tensor only the first four components of the 10, the $u$-quarks, contribute.
There are $4! = 24$ terms, divided over three colors, so that for each color the multiplicity
is 8. Together with the normalization factor of the 10 and an overall sign we get then
$$-\bar U M_U U\ , \tag{8.44}$$
where
$$M_U = 2\sqrt{2}\,g_2 v\ , \tag{8.45}$$
which is a symmetric matrix.
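The counting in this paragraph can be reproduced by brute force. The sketch below enumerates the index assignments of $\epsilon_{ijkl}$ in Eq. (8.40) and sorts them by the color of the $u$-quark, which is the partner of the index 4. The numerical coefficient assumes a normalization factor $1/\sqrt{2}$ for each of the two fields in the 10 in addition to the $1/\sqrt{2}$ from the Higgs v.e.v.; that normalization, like the function name, is an assumption made for illustration.

```python
from itertools import permutations


def count_terms_per_colour():
    """Count the non-vanishing terms of epsilon_{ijkl}, i..l in 1..4, per colour."""
    counts = {1: 0, 2: 0, 3: 0}
    total = 0
    for i, j, k, l in permutations((1, 2, 3, 4)):
        total += 1
        # The index 4 sits in exactly one of the two pairs; its partner is the
        # colour index of the u-quark, the other pair carries the anti-u-quark.
        pair = (i, j) if 4 in (i, j) else (k, l)
        colour = pair[0] if pair[1] == 4 else pair[1]
        counts[colour] += 1
    return total, counts


TOTAL, COUNTS = count_terms_per_colour()

# Multiplicity 8, times 1/sqrt(2) from <H>, times an assumed 1/sqrt(2) per 10.
COEFFICIENT = 8 * (2 ** -0.5) ** 3

print(TOTAL, COUNTS, COEFFICIENT)
```

With these assumptions the coefficient multiplying $g_2 v$ comes out as $2\sqrt{2}$.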
We find thus a relation among the mass matrices for the leptons and the down quarks,
whereas the up quarks have their own independent mass matrix. The mass relation implies
in particular that the eigenvalues are the same, and that the diagonalization matrices are
the same, so that the aforementioned problem of deciding which lepton belongs to which
family does not occur: we have to order them according to increasing mass. Then $SU(5)$
with the set of Higgs bosons chosen here implies the following mass relations
$$m_d = m_e$$
$$m_s = m_\mu$$
$$m_b = m_\tau$$
At first sight that does not look like a great success, but we have to remember that these
relations hold at $M_{\rm GUT}$. Just like the coupling constants we have to extrapolate them to
lower energies. Comparison with experimental data is not straightforward, since we do
not measure the quark masses directly, and since in addition the required extrapolation
for the $d$ and $s$ quarks is to mass scales that are much too low for perturbation theory
to be reliable. With the $\tau$-mass as input,
the predicted value for $m_b$ is somewhere between 5 and 7 GeV (depending on various
assumptions), to be compared with the mass of the lowest $b\bar b$ bound state, the $\Upsilon$, 9.46
GeV. For the other relations it is safer to compare the ratios $m_d/m_s$ and $m_e/m_\mu$, under
the assumption that at least some of the unknown QCD effects cancel. The agreement is
nevertheless not good, the discrepancy being almost a factor 10. It is noteworthy that in
GUTs originating from string theory (in particular from heterotic strings) the relations
between the gauge couplings are preserved, but that the bad predictions for the fermion
masses do not hold.
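The size of the discrepancy in the ratios can be illustrated with rough numbers. The values below are approximate present-day inputs (lepton pole masses and light-quark running masses at about 2 GeV) and are assumptions of this sketch, not numbers taken from the text; only the order of magnitude matters.

```python
# Approximate masses in MeV; assumed illustrative inputs.
M_E, M_MU = 0.511, 105.66      # electron, muon
M_D, M_S = 4.7, 93.0           # down, strange (running masses near 2 GeV)

LEPTON_RATIO = M_E / M_MU      # what the SU(5) relations predict for m_d / m_s
QUARK_RATIO = M_D / M_S        # what the light-quark masses give
DISCREPANCY = QUARK_RATIO / LEPTON_RATIO

print(f"m_e/m_mu = {LEPTON_RATIO:.4f}, m_d/m_s = {QUARK_RATIO:.4f}, "
      f"ratio of ratios = {DISCREPANCY:.1f}")
```

The ratio of ratios comes out of order 10, in line with the statement above.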
Now let us consider briefly the effects of the other candidate Higgs boson, the 45. Now
the mass relations are $M_E = -3M_D$, leading in particular to $m_\tau = 3m_b$, which is certainly
not an improvement. For the $u$ quarks the result is much worse. The coupling to the 45
yields an anti-symmetric mass matrix. This is a bad feature, because the eigenvalues of
such a matrix come in pairs with opposite sign. The signs do not matter for fermion mass
terms $m\,\bar\psi_R\psi_L$, because we may flip the sign of $\psi_R$ without altering the kinetic terms. But
then we get two masses that are equal, and a third one that is necessarily zero. Clearly
$m_u = 0$, $m_c = m_t$ does not fit the quark masses very well.
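The statement about the spectrum of an anti-symmetric $3\times 3$ mass matrix is easy to verify. The sketch below builds such a matrix from three arbitrary complex entries $a, b, c$ and computes the three invariants of $M^\dagger M$. For squared masses $(0, s^2, s^2)$ with $s^2 = |a|^2 + |b|^2 + |c|^2$ one expects trace $2s^2$, second invariant $s^4$ and determinant 0. The function name and the sample entries are illustrative.

```python
def gram_invariants(a, b, c):
    """Invariants of H = M^dagger M for the antisymmetric 3x3 matrix built from a, b, c."""
    m = [[0j, a, b],
         [-a, 0j, c],
         [-b, -c, 0j]]
    h = [[sum(m[k][i].conjugate() * m[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    trace = sum(h[i][i] for i in range(3)).real
    trace_sq = sum(h[i][j] * h[j][i] for i in range(3) for j in range(3)).real
    second = (trace ** 2 - trace_sq) / 2
    det = (h[0][0] * (h[1][1] * h[2][2] - h[1][2] * h[2][1])
           - h[0][1] * (h[1][0] * h[2][2] - h[1][2] * h[2][0])
           + h[0][2] * (h[1][0] * h[2][1] - h[1][1] * h[2][0])).real
    return trace, second, det


A, B, C = 1 + 2j, 0.5j, -3 + 0j          # arbitrary sample entries
S2 = abs(A) ** 2 + abs(B) ** 2 + abs(C) ** 2
TRACE, SECOND, DET = gram_invariants(A, B, C)
print(TRACE, SECOND, DET, S2)
```

The invariants fix the eigenvalues of $M^\dagger M$ to be $0$, $s^2$ and $s^2$: one massless state and two degenerate ones, whatever the entries.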
If we choose a combination of one 5 and one 45 (or more of each), the mass matrix
$M_U$ becomes an arbitrary complex matrix, and the mass matrices for the leptons and the
down quarks read $M_E = M(5) - 3M(45)$, $M_D = M(5) + M(45)$, which implies that
they are two independent matrices. This means that we have lost all predictive power.
In this case all mixing angles between the quarks and leptons are in principle non-trivial.
If only the 5 is used, the lepton and down quark masses can be diagonalized by the
same matrices. This is most easily seen by considering Eq. (8.41)
$$M_E^{\rm (diag)} = U_E^T M V_E$$
$$M_D^{\rm (diag)} = V_D^T M U_D\ ,$$
so that we can choose
$$U_E = V_D \quad\text{and}\quad V_E = U_D\ . \tag{8.46}$$
As was explained before when we discussed the CKM matrix, the matrices $U$ and $V$ are
not determined uniquely by the requirement that the mass matrices be diagonal. We can
replace $V$ by $VP^*$ and $U$ by $UP$, where $P$ is diagonal and unitary. This matrix $P$ can
be chosen differently for $D$ and $E$. Such phase rotations of the matrices $U_U$ and $U_D$ were
used in the weak interactions to bring the CKM matrix to a definite form. This implies
that the phases for the $U$ sector and the $D$ sector are already fixed, but the ones in the
$E$ sector are free. It is furthermore not hard to show that the symmetry of $M_U$ implies that
$V_U^\dagger U_U$ is diagonal (though in general not 1). The remaining two couplings, those of the $Y$
boson to $ue$ and to $ud$, have non-trivial mixing.
+ or an anti-neutrino, and in addition there can be any number of lepton anti-lepton
pairs. The main decay modes, if one disregards family mixing, are $e^+\pi^0$, $e^+\rho^0$, $e^+\omega$, $e^+\eta$,
$\nu^c\pi^+$, $\nu^c\rho^+$, $\mu^+K^0$, etc.
of the Higgs boson. It turned out that all these effects go in the same direction, and reduce
$M_X$ by a factor 100, and hence the proton lifetime by a factor $10^8$! This still only uses a
rather primitive treatment of the thresholds, namely a discontinuous change of the slope
as in fig (6).
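The factor $10^8$ follows from the scaling of the lifetime with the fourth power of $M_X$, as for any decay mediated by a heavy vector boson through an effective four-fermi interaction. A minimal sketch, with a hypothetical helper name and the power law as the only input:

```python
def lifetime_ratio(mx_ratio: float) -> float:
    """Change in the proton lifetime when M_X is rescaled, assuming tau ~ M_X**4."""
    return mx_ratio ** 4


REDUCTION = lifetime_ratio(1.0 / 100.0)   # M_X reduced by a factor 100
print(f"lifetime changes by a factor {REDUCTION:.1e}")
```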
A second important class of corrections are the $SU(3)\times SU(2)\times U(1)$ loop effects
on the effective four-fermi interaction, due to gauge boson exchanges between the four
external legs. These can enhance the decay by factors of about 5 for gluon exchange and
2 for $W$, $Z$ and photon exchange.
Another technical difficulty is the correct treatment of the proton structure. Various
models for hadrons have been used, such as the bag model.
Not surprisingly, the final answer is subject to a large amount of uncertainty, and is
about $10^{31\pm 2}$ years. This range of values is however by now ruled out by experiment.
All of this is based on the minimal $SU(5)$ model. The simplest way of bringing the
predictions for proton decay back into agreement with the experimental lower bound is
to increase $M_X$. Within the $SU(5)$ model that can be done by adding extra matter to
the desert. We have seen at the beginning of the chapter that simply adding a full $SU(5)$
multiplet is not going to change $M_X$. It will only increase $g_5$ at $M_X$, thus making the
decay width larger instead of smaller. The only way to increase $M_X$ without giving up
$SU(5)$ altogether is to add broken $SU(5)$ multiplets. The arguments in section 8.1 assume
that all particles in a multiplet contribute to the coupling constant evolution. If some
are heavier than others, they will decouple, and then it is possible to change $M_X$. The
chiral fermions forming a family form an unbroken $SU(5)$ multiplet, and hence to first
approximation their presence does not influence $M_X$ (if one looks more carefully the mass
splittings introduced by the weak interactions do have some effect on the evolution below
$M_W$). The standard model Higgs does have an effect, since one must assume that its triplet
component is heavy (see below). Hence convergence and the value of $M_X$ are sensitive to
the number of Higgs scalars. Another set of fields that has an important influence turns
out to be the gauginos in supersymmetric theories.
$$\Phi \to U\,\Phi\,U^\dagger \tag{8.48}$$
We can use the gauge freedom $U$ to diagonalize $\Phi$. Since it is traceless, this leaves 4 parameters.
For arbitrary choices of these parameters, it would break $SU(5)$ to $U(1)^4$.
However, it turns out that this cannot happen. A single adjoint Higgs can break the
group $G = SU(N)$ only to a so-called maximal subgroup, which is a subgroup $H \subset G$
such that there is no intermediate group $H'$ with $H \subset H' \subset G$, $H' \neq H$ and $H' \neq G$.
With multiple adjoint Higgs fields one can realize chains of symmetry breaking to smaller
groups. To understand why this is true requires a more detailed study of Higgs potentials.
We will just take it here as a fact. This implies that the vacuum expectation value of
$\Phi$ can either be
$$\langle\Phi\rangle = \mathrm{diag}\,(v, v, v, v, -4v) \tag{8.49}$$
or
$$\langle\Phi\rangle = \mathrm{diag}\,(v, v, v, -\tfrac{3}{2}v, -\tfrac{3}{2}v)\ , \tag{8.50}$$
where $v$ depends on the parameters in the potential. The first v.e.v. breaks $SU(5)$ to
$SU(4)\times U(1)$, the second to $SU(3)\times SU(2)\times U(1)$. If we allow $\Phi$ to have more distinct
eigenvalues we would get a subgroup of these groups. Note that there is an important
difference with the Standard Model Higgs mechanism: not all possible Higgs vevs are
gauge equivalent.
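The unbroken subgroup can be read off from the degeneracies of the eigenvalues of the diagonal v.e.v.: a block of $n$ equal eigenvalues leaves a $U(n)$ unbroken, and the overall tracelessness removes one $U(1)$. The sketch below counts the unbroken generators this way; the function name is illustrative, and the eigenvalues are given in units of $v$.

```python
from collections import Counter
from fractions import Fraction as F


def unbroken_generators(eigenvalues):
    """Number of SU(N) generators commuting with a traceless diagonal adjoint v.e.v."""
    if sum(eigenvalues) != 0:
        raise ValueError("an adjoint v.e.v. must be traceless")
    multiplicities = Counter(eigenvalues).values()
    return sum(n * n for n in multiplicities) - 1


SU4_U1 = unbroken_generators([1, 1, 1, 1, -4])                   # Eq. (8.49)
SU3_SU2_U1 = unbroken_generators([1, 1, 1, F(-3, 2), F(-3, 2)])  # Eq. (8.50)
GENERIC = unbroken_generators([1, 2, 3, 4, -10])                 # all distinct

print(SU4_U1, SU3_SU2_U1, GENERIC)
```

The counts 16, 12 and 4 match the dimensions of $SU(4)\times U(1)$, $SU(3)\times SU(2)\times U(1)$ and $U(1)^4$; the remaining $24 - 12 = 12$ generators in the second case are the broken $X$ and $Y$ directions.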
The obvious problem with the combined 24 and 5 Higgs is the hierarchy problem:
Why does one of them get a vacuum expectation value so much smaller than the other?
But there is a second problem. The Higgs is a 5 of SU (5), and in addition to the
Standard Model Higgs boson this representation contains a color triplet scalar. This
particle couples to quarks and leptons via the Yukawa couplings, and it is not hard to see
that it can mediate proton decay. Therefore its mass must be of the order of MX . On
the other hand its partner, the SU (2) doublet, must get a mass of the order of the weak
scale.
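This is all possible, but in a very unsatisfactory way. To examine it more closely we
consider the complete Higgs potential for $H$ and $\Phi$. If we go to quartic order and impose
(for simplicity) the discrete symmetry $\Phi \to -\Phi$, the most general Higgs potential is
$$V(\Phi, H) = -(\mu_5)^2 H^\dagger H + \tfrac{1}{4}\lambda (H^\dagger H)^2 - \tfrac{1}{2}\mu^2\,\mathrm{Tr}\,\Phi^2 + \tfrac{1}{4}a(\mathrm{Tr}\,\Phi^2)^2 + \tfrac{1}{2}b\,\mathrm{Tr}\,\Phi^4 + \alpha H^\dagger H\,\mathrm{Tr}\,\Phi^2 + \beta H^\dagger\Phi^2 H\ .$$
For suitable parameter choices, a minimum of this potential is
$$\langle\Phi\rangle = \mathrm{diag}\,\big(v, v, v, (-\tfrac{3}{2} - \tfrac{1}{2}\epsilon)v, (-\tfrac{3}{2} + \tfrac{1}{2}\epsilon)v\big);\qquad \langle H\rangle = \tfrac{1}{\sqrt{2}}(0, 0, 0, 0, v_0)^T\ . \tag{8.51}$$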
The $\epsilon$ terms are induced by $SU(2)_w$ breaking, and are slightly worrisome. The 24 decomposes
as in Eq. (8.31). To break $SU(5)$ to $SU(3)\times SU(2)\times U(1)$ only the singlet
component should get a v.e.v., as is the case in Eq. (8.50). The $\epsilon$ terms indicate that also
the $SU(2)$ triplet component gets a v.e.v. (all other components have non-trivial color,
and the vacuum we consider here does not break color). This is undesirable, since they
would contribute to the $\rho$-parameter. However, this problem at least takes care of itself,
since it turns out that $\epsilon \propto \frac{v_0^2}{v^2}$. Since we clearly want $v_0 \ll v$ we see that $\epsilon v \ll v_0$, and
hence the 24 gives a negligible contribution to $SU(2)_w$ breaking in comparison to the 5.
For any sensible choice of the parameters in the potential, the mixed $H$-$\Phi$ terms will
induce a mass-term for $H$, or rather for the $SU(3)\times SU(2)\times U(1)$-components of $H$. The
mass squared of the doublet component of $H$ is, ignoring $\epsilon$, equal to
$-\mu_5^2 + \frac{15}{2}\alpha v^2 + \frac{9}{4}\beta v^2$. Here
$v$ is of order $M_X$ and $\alpha$ and $\beta$ can be expected to be of order 1, while the sum must be
of order $M_W^2$. This is only possible if the two last terms almost cancel each other. There
is no symmetry that can achieve this, and thus it requires a fine-tuning of $\alpha/\beta$ with a
precision of about 25 digits. Once this has been achieved we do not have to worry much
about the color triplet Higgs. Its mass is given by another combination of $\alpha$ and $\beta$, and
the natural value of its mass is roughly $M_X$. So the problem with the $SU(5)$ Higgs sector
is in the end just the naturalness problem of the Standard Model. The only difference is
that it shows up in a more concrete way.
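The coefficients of the mixed terms and the required degree of tuning can be estimated directly. The sketch below assumes mixed terms of the form $\alpha H^\dagger H\,\mathrm{Tr}\,\Phi^2 + \beta H^\dagger\Phi^2 H$ and the v.e.v. $\langle\Phi\rangle = v\,\mathrm{diag}(1,1,1,-\frac{3}{2},-\frac{3}{2})$ with $\epsilon$ neglected; the numerical values of $M_W$ and $M_X$ are illustrative assumptions.

```python
import math
from fractions import Fraction as F

# <Phi> in units of v, with the small epsilon terms neglected.
VEV = [F(1), F(1), F(1), F(-3, 2), F(-3, 2)]

TR_PHI2 = sum(x * x for x in VEV)      # multiplies alpha for all components of H
DOUBLET_COEFF = VEV[4] ** 2            # (Phi^2)_55: multiplies beta for the doublet
TRIPLET_COEFF = VEV[0] ** 2            # (Phi^2)_11: multiplies beta for the colour triplet

M_W_GEV = 80.0                         # weak scale
M_X_GEV = 1.0e15                       # assumed unification scale

# The doublet mass squared must be of order M_W^2 while each term is of order M_X^2.
TUNING_DIGITS = -math.log10((M_W_GEV / M_X_GEV) ** 2)

print(TR_PHI2, DOUBLET_COEFF, TRIPLET_COEFF, round(TUNING_DIGITS, 1))
```

Because the doublet and the triplet see different entries of $\Phi^2$, a cancellation tuned for the doublet does not occur for the triplet, whose mass therefore stays at its natural value of order $M_X$.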
$$\partial_\mu F^{\mu\nu} = J^\nu$$
$$\partial_\mu \tilde F^{\mu\nu} = 0\ .$$
Although the magnetic field is present only inside the solenoid, the field $\vec A$ is non-zero
outside, and can be detected by means of a charged particle. In particular we may carry
the charge once around the solenoid and bring it back to the same point, so that it forms a
closed loop. Note that the vector potential points in the tangential direction along circles
around the solenoid, and falls off as $1/r$, where $r$ is the distance to the solenoid. Using
Stokes' theorem we can then convert the loop integral of $\vec A$ to a surface integral of $\vec B$:
$$\oint d\vec s\cdot\vec A = \int dS\,\vec n\cdot\vec B = \Phi\ , \tag{8.53}$$
where $\vec n$ is the normal vector of the surface and $\Phi$ the flux through the surface. Note that
the left-hand side has the same value for any circle around the solenoid: the circumference
of the circle increases with $r$, but the vector potential decreases with $r$.
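The radius independence of the loop integral can be checked numerically. The sketch below assumes an idealized, infinitely thin solenoid along the $z$-axis, for which the vector potential outside is tangential with magnitude $\Phi/2\pi r$, and integrates it along circles of very different radii. The function name, the discretization and the value of the flux are illustrative choices.

```python
import math


def loop_integral(radius, flux=1.0, steps=1000):
    """Line integral of A along a circle of given radius around a thin solenoid."""
    total = 0.0
    dtheta = 2.0 * math.pi / steps
    for n in range(steps):
        theta = (n + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        r2 = x * x + y * y
        # Tangential vector potential of magnitude flux / (2 pi r).
        ax = -flux * y / (2.0 * math.pi * r2)
        ay = flux * x / (2.0 * math.pi * r2)
        # Line element along the circle.
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        total += ax * dx + ay * dy
    return total


SMALL_LOOP = loop_integral(0.1)
LARGE_LOOP = loop_integral(50.0)
print(SMALL_LOOP, LARGE_LOOP)
```

Both loops return the enclosed flux, as Eq. (8.53) requires.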
This seems to lead to the conclusion that an infinitesimal solenoid can always be
detected by means of finite size loops of charged particles. To express it in a more
physical way, one may do interference experiments with particles going from point 1 to
point 2 via different paths. If the loop formed by two such paths encloses the solenoid
the interference pattern will change.
Note however that the change of the wave function is only by a phase. Hence the
solenoid would still be unobservable if the phase factor equals 1, or
$$\Phi = n\,\frac{2\pi\hbar}{e}\ , \tag{8.54}$$
where $n$ is an integer. In the case under consideration $\Phi$ is the flux through the solenoid.
Since there is no net magnetic flux escaping from any infinitesimal sphere around $x = 0$,
$\Phi$ must be equal to minus the magnetic monopole flux that appears to emerge at
$x = 0$. This in its turn is proportional to the magnetic monopole charge that one would
define if indeed the end of the solenoid were a monopole. By analogy with the equation
$\int_S dS\,\vec n\cdot\vec E = 4\pi e$ for the total electric flux from an electric point charge $e$, we define
$$\int_{S'} dS\,\vec n\cdot\vec B = 4\pi g\ . \tag{8.55}$$
Here $S$ denotes a sphere at infinity, and the prime indicates that we omit the contribution
of the infinitesimal solenoid, which precisely cancels the monopole flux. Substituting this
into Eq. (8.54) we get the famous Dirac quantization condition for magnetic charges
$$eg = \tfrac{1}{2}\,n\hbar\ . \tag{8.56}$$
This result is a necessary condition for the existence of magnetic monopoles in a theory. It
implies that if the theory contains particles with electric charge $e$, then if any monopoles
exist they must have magnetic charges that are a multiple of $\hbar/2e$.
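To get a feeling for the size of the minimal magnetic charge, one can insert numbers. The sketch below works in units with $\hbar = c = 1$ and uses the Gaussian-type normalization implied by the flux equation $\int dS\,\vec n\cdot\vec E = 4\pi e$, so that $e^2 = \alpha$; the value of $\alpha$ and the function name are assumptions of this example.

```python
import math

ALPHA = 1.0 / 137.036            # fine structure constant (assumed input)
E_CHARGE = math.sqrt(ALPHA)      # e in units with hbar = c = 1 and e^2 = alpha


def dirac_charge(n=1, e=E_CHARGE):
    """Magnetic charge g solving e g = n / 2 (hbar = 1)."""
    return n / (2.0 * e)


G_MIN = dirac_charge()
MAGNETIC_ALPHA = G_MIN ** 2      # magnetic analogue of the fine structure constant
print(f"g_min = {G_MIN:.3f}, g_min^2 = {MAGNETIC_ALPHA:.2f}")
```

The magnetic coupling $g^2 = 1/4\alpha \approx 34$ is far from small: a weakly coupled electric theory implies strongly coupled monopoles.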
This result can be made more precise by showing that any field configuration with an
asymptotic behavior such that the magnetic field has a monopole component can only
be obtained from a vector potential that is not regular everywhere on the sphere. The
singularities form a string as a function of the radius of the sphere, but if (and only if) the
Dirac quantization condition is satisfied this singularity has no observable consequences.
Note that the Dirac quantization condition was derived by using a charge e particle,
where e is the electron or proton charge. If (unconfined) particles of charge e/m exist in
nature (where m is some integer), one could use them instead of an electron to detect Dirac
strings, and hence the minimum magnetic charge would be a factor m larger. Conversely,
if a magnetic monopole were found whose magnetic charge is precisely $1/2e$ (from now
on $\hbar = 1$), we would know that all electric charges have to be a multiple of the electron
charge.
In pure electrodynamics one cannot reasonably expect magnetic monopoles to exist,
for two (not unrelated) reasons. First of all the Dirac field configuration just described
not only has a string singularity, but also a singularity in the field strength at r = 0.
Hence the magnetic field energy, which is part of the mass of the object, is infinite. One
cannot really resolve this singularity without discovering that one is looking at the end
of a solenoid and not at a monopole. With electric charges there is no such problem.
Secondly, nothing in the theory forbids us a priori to add particles of arbitrary charge:
there is no charge quantization mechanism. Hence one would expect the minimal value
of g to be infinite.
It was realized by 't Hooft and Polyakov that these problems could be overcome if
the electromagnetic gauge group was embedded in a non-abelian group. The canonical
example is $U(1) \subset SO(3)$, with $U(1)_{\rm em} = T_3$. In this case there is a fundamental reason
for charge quantization, since the representations of $SO(3)$ only allow integer eigenvalues
for $T_3$.
To see how the singularity problem is solved we have to consider how $SO(3)$ is broken.
One uses a Higgs $\phi$ in the triplet representation (the adjoint representation), which develops
a v.e.v. which can be rotated to the form $\langle\phi\rangle = (0, 0, v)$. The surviving gauge group
is $SO(2)$. However, the direction of the Higgs in group space is not relevant. We could
choose any direction we want, and even choose different directions in different space-time
points. The trick is now to make $\langle\phi\rangle$ point in the radial direction $\hat r$ for large $r$. Locally
each asymptotic observer measures the same physical phenomena as with a fixed vacuum
direction, but globally the configuration is different. In fact, no continuous transformation
will bring it back to a fixed direction, and one says that the configuration is topologically
non-trivial. Note that we are making an identification between two a priori unrelated
groups, the $SO(3)$ gauge group and the $SO(3)$ rotation group.
This can be done asymptotically, but one cannot continue the Higgs field to $r = 0$
without encountering a singularity. This can be solved by choosing the Higgs field as
$$\langle\phi^a\rangle = v\,\hat r^a f(r)\ , \tag{8.57}$$
where $f(0) = 0$ and $f(\infty) = 1$. In other words, the Higgs v.e.v. goes to zero at $r = 0$.
We are not allowed to set $\phi = 0$ over all of space-time, since that would cost an infinite
amount of energy, but we can do it in a finite space-time region at finite energy cost.
If one substitutes the above ansatz for the Higgs field into the equations of motion,
one finds that for large $r$,
$$\partial_i\langle\phi^a\rangle = \frac{v}{r}\left(\delta_{ai} - \hat r_i\hat r_a\right) \tag{8.58}$$
hence $\int d^3x\,(\partial_i\phi^a)^2$ diverges. Here $i$ is a space index, and $a$ an $SO(3)$ vector index. These
are thus indices of isomorphic representations. To get a configuration of finite energy, one
must make use of the coupling to the electromagnetic field, which modifies $\partial_i\phi$ to $D_i\phi$.
To get $D_i\phi$ to fall off sufficiently rapidly, we need (upper and lower indices are used here
merely as a notational convenience, and have no special significance)
$$A^b_i(\vec r) = \epsilon_{ibj}\,\frac{\hat r_j}{er}\,\left[1 - K(r)\right]\ . \tag{8.59}$$
To get the proper asymptotic behavior, we need $K(\infty) = 0$; to avoid a singularity at $r = 0$
we need $K(0) = 1$. Consider first the large $r$ behavior. We would like to demonstrate
that the integrand of $\int d^3x\,(D_i\phi^a)^2$ falls off sufficiently fast at large $r$, unlike that of $\int d^3x\,(\partial_i\phi^a)^2$. Consider
$$D_i\phi^a = \partial_i\phi^a - ieA^b_i\,T^b\phi^a \tag{8.60}$$
Here we used the fact that $T^b$ are representation matrices in the adjoint representation,
hence $T^b_{ac} = i\epsilon_{bac}$. Substituting the asymptotic vacuum expectation value for $\phi^c$, $v\hat r^c$,
and performing the implicit sum over $b$ we find in the large $r$ limit
$$-ieA^b_i\,T^b\langle\phi^a\rangle = -\frac{v}{r}\left(\delta_{ai} - \hat r_i\hat r_a\right) \tag{8.62}$$
This precisely cancels the derivative contribution, Eq. (8.58). If we take into account the
$r$ dependence in $f(r)$ and $K(r)$ one gets small corrections that fall off sufficiently rapidly
to keep the space integral finite. By making $f(r)$ go to zero at the origin, and $K(r)$ go to
1, one can ensure that the integral near the origin is finite as well. One may substitute
Eqs. (8.57) and (8.59) into the equations of motion, and obtain a set of coupled differential
equations for the functions $f$ and $K$. In general these can only be solved numerically, but in some
cases an analytic solution exists.
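The asymptotic cancellation between Eq. (8.58) and Eq. (8.62) can be verified numerically at an arbitrary point. The sketch below sets $f = 1$ and $K = 0$ (the large-$r$ values) and uses the adjoint matrices $T^b_{ac} = i\epsilon_{bac}$, so that the gauge term becomes $e\,\epsilon_{bac}A^b_i\phi^c$. The sample point, the values of $v$ and $e$ and the function names are arbitrary illustrative choices.

```python
import math


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def covariant_derivative(x, v=1.0, e=0.3):
    """D_i phi^a for phi^a = v rhat^a and A^b_i = eps_{ibj} rhat_j / (e r)."""
    r = math.sqrt(sum(c * c for c in x))
    rhat = [c / r for c in x]
    phi = [v * c for c in rhat]
    # Ordinary derivative of v * rhat^a, Eq. (8.58): d_phi[i][a].
    d_phi = [[v / r * ((i == a) - rhat[i] * rhat[a]) for a in range(3)]
             for i in range(3)]
    # Gauge field with K = 0, Eq. (8.59): gauge_field[b][i].
    gauge_field = [[sum(eps(i, b, j) * rhat[j] for j in range(3)) / (e * r)
                    for i in range(3)] for b in range(3)]
    # -i e A^b_i (T^b phi)^a with T^b_ac = i eps_{bac} equals e eps_{bac} A^b_i phi^c.
    gauge_term = [[e * sum(eps(b, a, c) * gauge_field[b][i] * phi[c]
                           for b in range(3) for c in range(3))
                   for a in range(3)] for i in range(3)]
    return [[d_phi[i][a] + gauge_term[i][a] for a in range(3)] for i in range(3)]


D_PHI = covariant_derivative([1.0, 2.0, -0.5])
MAX_COMPONENT = max(abs(D_PHI[i][a]) for i in range(3) for a in range(3))
print(MAX_COMPONENT)
```

All nine components of $D_i\phi^a$ vanish up to rounding errors, independently of the value chosen for $e$.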
We have now obtained a non-trivial solution to the equations of motion with finite
energy: $\int d^3x\,\mathcal{H}$ is finite, where $\mathcal{H}$ is the Hamiltonian density, defined in the usual way
$$\mathcal{H} = (D_i\phi^a)^2 + V(\phi) \tag{8.64}$$
If we set the energy of the vacuum (corresponding to vanishing $\phi^a$ and $A^b_i$) to zero, then
the field configuration we have obtained has a non-vanishing energy density localized
around a point in space (namely $\vec r = 0$) with a finite total energy. Note that this breaks
translation invariance. There is nothing wrong with that, it can simply be interpreted as
an object localized at $\vec r = 0$. One may of course find a completely analogous solution
to the equations of motion localized at other points, and one can find time-dependent
solutions where these objects are moving as free particles. By studying their kinematics,
one observes that they behave as particles, with a mass given by $\int d^3x\,\mathcal{H}$. Such solutions
can be found in many classical field theories, and are generically called "solitons". So
what we have found here is that the $SO(3)$ gauge-Higgs system has a soliton solution. In
the quantum theory these give rise to new particles, in addition to the usual ones created
from the vacuum by the quantum fields.
It is easy to check that the field strength $F^a_{\mu\nu}$ derived from the non-abelian field
configuration Eq. (8.59) has asymptotically non-vanishing components only in the direction
of the unbroken $U(1)_{\rm em}$. Furthermore this field configuration looks asymptotically like
the one of a magnetic monopole with magnetic charge $1/e$,
$$B^i = \frac{\hat r^i}{er^2}\ . \tag{8.65}$$
This implies in particular that for large $r$ the integral $\int d^3x\,\vec B^2$ is finite.
To any distant observer this object would look like a magnetic monopole. Note that
the vector potential Eq. (8.59) has no string singularity. It was avoided by making use of
the embedding in SO(3), which allowed us to make the vacuum point radially. By making
a (singular) gauge transformation we can make the vacuum point in one direction only,
but then inevitably a string singularity is introduced for the gauge field.
The monopole strength is twice the minimal Dirac value 1/2e. The reason is that
we may add fields in spinor representations of SO(3) (so that the global gauge group
becomes SU (2) instead of SO(3)) whose charges are half-integer in comparison to the
SO(3) charges. These half-integral charges would be in conflict with a monopole of charge
1/2e. Since the spinor representations are not involved in the classical field configuration,
it is clear that the classical solutions in the SU (2) theory are the same as in SO(3). Hence
the SO(3) solutions already anticipate the possibility of half-integer charges.
The energy density of the object is localized around $r = 0$, and falls off exponentially
for $r \to \infty$. This exponential fall-off can be used to define the size $R$ of the object:
$\mathcal{H} \propto e^{-r/R}$. The size is set by the only scale in the problem, the Higgs vev $v$. By
computing the three-dimensional space integral of the energy density one obtains the
energy of the field configuration, or the mass of the object. The result can be written as
$$M_{\rm mon.} = \frac{4\pi}{e^2}\,M_W\,\xi(\lambda, e^2)\ , \tag{8.66}$$
where $M_W$ is the mass of the massive vector bosons that are the result of the spontaneous
symmetry breaking $SU(2) \to U(1)$. Hence $M_W \propto ev$. The function $\xi$ depends on the
gauge coupling $e$ (in this case the $SU(2)$ gauge coupling and the canonically normalized
$U(1)$ gauge coupling are identical) and the Higgs quartic self-coupling $\lambda$. For simplicity
we assume here a simple Higgs potential $V(\phi)$ with a quartic term $\lambda(\phi^a\phi^a)^2$. For $\lambda = 0$ we
get $\xi = 1$, and in this limit (the Prasad-Sommerfield or Bogomol'nyi limit) the equations
of motion can be solved analytically. The function $\xi$ increases monotonically with $\lambda$, but
reaches a finite value ($\approx 1.7867$) for $\lambda \to \infty$.
This kind of monopole solution also exists in $SU(5)$ grand unified theories, since
they also have the property that electric charges are automatically quantized. One
can construct spherically symmetric solutions within suitable $SU(2)$ subgroups of $SU(5)$,
which must include $U(1)_{\rm em}$. There are three spherically symmetric solutions with magnetic
charges $1/2e$, $1/e$ and $3/2e$, i.e. once, twice and three times the Dirac charge. Their
classical masses are
$$M_q = \frac{3}{8}\,\frac{M_X}{\alpha}\,q\,\xi(\lambda_i, g_5, q)\ , \tag{8.67}$$
where $q$ is the magnetic charge in units of $1/2e$, and $\alpha$ the fine structure constant $\frac{e^2}{4\pi}$. The
factor 3/8 is due to the conversion from $g_5$ to $e$. The function $\xi$ is equal to 1 in the limit
$\lambda_i = 0$, where $\lambda_i$ is the (set of) quartic Higgs couplings. The mass increases monotonically
with all $\lambda_i$'s, and reaches a finite limit when all $\lambda_i$'s go to infinity. For any value of the
coupling constants the decay of the higher charge monopoles into minimal charge ones is
energetically allowed.
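An order-of-magnitude estimate of the classical monopole mass follows directly from the formula above. The sketch below sets the correction function to 1 (the limit of vanishing quartic couplings) and uses an assumed unification scale of $10^{15}$ GeV and the low-energy value of $\alpha$; both inputs, like the function name, are illustrative assumptions, so only the order of magnitude is meaningful.

```python
ALPHA = 1.0 / 137.036        # low-energy fine structure constant (assumed input)
M_X_GEV = 1.0e15             # illustrative unification scale (assumed input)


def su5_monopole_mass(q, m_x=M_X_GEV, alpha=ALPHA, correction=1.0):
    """Classical mass (3/8) (M_X / alpha) q, times a correction of order one."""
    return 3.0 / 8.0 * m_x / alpha * q * correction


M_MINIMAL = su5_monopole_mass(1)
print(f"minimal-charge monopole: about {M_MINIMAL:.1e} GeV")
```

The result, a few times $10^{16}$ GeV, is far above any mass scale accessible to accelerators.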
It may seem incorrect that we obtain a minimal Dirac charge monopole even though
the theory contains quarks with charges that are multiples of $\frac{1}{3}$. The fact that the quarks
are confined should not matter, since QCD never entered the discussion so far. The
resolution of this paradox is that the minimal and double charge monopole have long-
range color fields. These color fields produce an Aharonov-Bohm phase for a particle with
color charge, and when added to the electromagnetic phase this is indeed not observable,
as required by Dirac.
The triple charge monopole satisfies the Dirac quantization condition without any
need for long-range color fields even with respect to quarks, and indeed it does not have
such color fields. Interesting questions suggest themselves regarding the fate of long range
color fields in view of confinement, but we will not pursue this discussion any further here.
It goes without saying that the experimental observation of a magnetic monopole
would be an extremely important and exciting event. The minimal charge magnetic
monopole is a stable particle. If it is light one could pair-produce it in accelerators,
but GUT-monopoles, which have a mass quite close to $M_{\rm Planck}$, will never be produced
that way. Our only hope is then that some were formed during the early stages of our
universe. The first estimate of monopole abundances led to results that were far above any
reasonable limit. For example, a good limit (the "Parker bound") is obtained from the
observed presence of galactic magnetic fields in space. Magnetic monopoles would short-circuit
such fields, and since that does not happen one may deduce a limit of about $10^{-15}$
monopoles per cm$^2$ per second on the magnetic monopole flux, if one assumes that the
monopoles are distributed homogeneously. Early cosmological models produced monopole
abundances far above this (and other) limits.
In inflationary cosmological models the abundance is drastically reduced. In fact
inflation washes out any topological structure, thus reducing the number of monopoles to
about 1 per universe. If this is indeed true monopoles will never be seen.
Independent of cosmological limits, it is still interesting to look for monopoles on earth.
Inflation might be wrong, and many cosmological bounds might not apply if there were a
local enhancement of monopoles. The sensitivity of present experiments is still less than
the Parker bound.
Monopoles have been searched for using superconducting current loops (SQUIDs).
The passage of a monopole through such a loop increases the current by a definite, quan-
tized amount, which should be an easily recognizable signal. A second, though less direct
signal for monopoles might be catalysis of proton decay. GUT-monopoles are expected
to have a very large cross section for turning protons into leptons and mesons, violating
baryon number. This is possible because monopoles carry inside their core classical X and
Y vector bosons, and because the lowest quark and lepton partial waves can penetrate
all the way to the core without encountering any barrier. The precise magnitude of the
cross section is hard to calculate and somewhat controversial, however. In any case, no
evidence for the existence of magnetic monopoles has been found so far.
8.13.1 SO(10)
The most attractive possibility is $SU(5) \subset SO(10)$. The main advantage of this embedding
is that one Standard Model family can be fit within a single irreducible representation,
the spinor, which has dimension 16. This decomposes into $SU(5)$ in the following way
$$16 \to 5^* + 10 + 1 \tag{8.68}$$
We see that in addition to a Standard Model family we get a singlet. This has the quantum
numbers of a right-handed neutrino, so that in these models it would be natural for the
neutrinos to have a Dirac mass.
Another advantage is that SO(10) does not have a rank three invariant tensor, so that
all its representations are automatically anomaly-free. In SU (5) there is still a cancellation
between the 5 and the 10 which is not understood in a fundamental way. Furthermore
the 16 is a complex representation, so that no mass terms are allowed before the SO(10)
symmetry is broken.
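The cancellation that SU(5) leaves unexplained can be made concrete at the level of the Standard Model charges. The following Python sketch sums the anomaly coefficients that involve hypercharge for one family, written as left-handed Weyl fermions with the normalization $Q = T_3 + Y$. The field labels and the helper `anomalies` are illustrative choices, not notation from these notes.

```python
from fractions import Fraction as F

# One Standard Model family as left-handed Weyl fermions:
# (label, SU(3) dimension, SU(2) dimension, hypercharge Y).
# Anti-triplets are entered with dimension 3 as well.
FAMILY = [
    ("Q",   3, 2, F(1, 6)),
    ("u^c", 3, 1, F(-2, 3)),
    ("d^c", 3, 1, F(1, 3)),
    ("L",   1, 2, F(-1, 2)),
    ("e^c", 1, 1, F(1)),
]

def anomalies(fields):
    """Return the four anomaly sums that involve U(1)_Y."""
    u1_cubed = sum(d3 * d2 * y**3 for _, d3, d2, y in fields)
    gravitational = sum(d3 * d2 * y for _, d3, d2, y in fields)
    # SU(2)^2 U(1): only doublets contribute, once per colour
    su2_su2_u1 = sum(d3 * y for _, d3, d2, y in fields if d2 == 2)
    # SU(3)^2 U(1): only (anti-)triplets contribute, once per SU(2) component
    su3_su3_u1 = sum(d2 * y for _, d3, d2, y in fields if d3 == 3)
    return u1_cubed, gravitational, su2_su2_u1, su3_su3_u1

print(anomalies(FAMILY))
```

All four sums vanish for the content of $5^* + 10$, and adding the SO(10) singlet (with $Y = 0$) does not change them.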
In addition to SU(5), SO(10) contains a U(1) which turns out to be $B - L$. This
was already an exact symmetry in the Standard Model and its SU(5) extension, and it
can thus be gauged, even without SO(10) unification. The gauge boson of $B - L$ must
acquire a mass well above the weak scale, since no light vector boson has been observed.
Note that the coupling of this extra gauge boson is related by unification to the Standard
Model couplings, so it cannot be extremely small.
In SO(10) there are additional heavy gauge bosons, connecting the $5^*$, 10 and 1 to
each other. The proton decay width and the branching ratios will thus be different.
The breaking of SO(10) to the Standard Model can proceed in many ways. Simply
checking the maximal sub-algebras of SO(10) leads to the following two main breaking
chains
The first step in these two chains is a breaking to a maximal subgroup. The groups
$SU(5) \times U(1)$ and $SU(4) \times SU(2) \times SU(2)$ are the only two acceptable maximal subgroups
of SO(10). All others either do not contain the Standard Model, or break the 16 to a
real representation, or both. In principle every step requires its own Higgs mechanism,
although it is sometimes possible to perform two steps at once with a single Higgs. This
leads in general to a rather complicated Higgs Lagrangian, and one or more additional
intermediate scales, which one can consider as independent input variables in addition
to $M_X$ and $M_W$ in SU(5). Needless to say, the discussion of the possible minima of the
potential becomes extremely complicated in these models. We will not discuss that issue
here.
The second breaking of SO(10) leads to a unification model considered first by Pati
and Salam, before the SU (5) model was found. They already predicted the possibility of
proton decay in these models. In the various breaking steps, a Standard Model family
emerges in the following way
$$
\begin{aligned}
16 &\to (4,2,1) + (4^*,1,2) \\
&\to (3,2,1,\tfrac16) + (1,2,1,-\tfrac12) + (3^*,1,2,-\tfrac16) + (1,1,2,\tfrac12) \\
&\to (3,2,0,\tfrac16) + (1,2,0,-\tfrac12) + (3^*,1,\tfrac12,-\tfrac16) + (3^*,1,-\tfrac12,-\tfrac16) \\
&\qquad + (1,1,\tfrac12,\tfrac12) + (1,1,-\tfrac12,\tfrac12) \\
&\to (3,2,\tfrac16) + (1,2,-\tfrac12) + (3^*,1,\tfrac13) + (3^*,1,-\tfrac23) + (1,1,1) + (1,1,0)
\end{aligned}
\qquad (8.71)
$$
Here $Q_Y = Q_1 + Q_2$ (see Eq. (8.70) for the definition of these two charges). The first
$SU(4) \times SU(2) \times SU(2)$ representation thus yields the left-handed quarks and leptons
(left-handed particles), while the second one yields the right-handed ones (left-handed
anti-particles). In the first two stages the model has a left-right symmetry. It is not
invariant under parity or charge conjugation: both would map $(4, 2, 1)_L$ to $(4^*, 2, 1)_L$
(after transforming back to left-handed fields), which is a representation that does not
occur. However one can define a new exact symmetry by combining P or C with an
interchange of the gauge bosons of the two SU(2) groups (provided they have the same
coupling constant).
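The last step of Eq. (8.71) is simple arithmetic and can be checked mechanically. The sketch below takes the $SU(3) \times SU(2) \times U(1) \times U(1)$ content of the third line, with the charges $Q_1$ and $Q_2$ as we read them (signs included), and forms $Q_Y = Q_1 + Q_2$. The tuple layout and function names are assumptions made for this illustration only.

```python
from fractions import Fraction as F

# Third line of Eq. (8.71): (SU(3) rep, SU(2) dimension, Q1, Q2).
# "3*" marks the anti-triplet.
STAGE3 = [
    ("3",  2, F(0),     F(1, 6)),
    ("1",  2, F(0),     F(-1, 2)),
    ("3*", 1, F(1, 2),  F(-1, 6)),
    ("3*", 1, F(-1, 2), F(-1, 6)),
    ("1",  1, F(1, 2),  F(1, 2)),
    ("1",  1, F(-1, 2), F(1, 2)),
]

def to_standard_model(reps):
    """Replace the two U(1) charges by the hypercharge Q_Y = Q1 + Q2."""
    return [(colour, weak, q1 + q2) for colour, weak, q1, q2 in reps]

def dimension(rep):
    """Number of Weyl fermions in an SU(3) x SU(2) representation."""
    colour, weak = rep[0], rep[1]
    return (3 if colour.startswith("3") else 1) * weak

for rep in to_standard_model(STAGE3):
    print(rep[0], rep[1], rep[2])
```

The result is the last line of Eq. (8.71), and the dimensions add up to 16, as they should for the spinor of SO(10).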
This is true for the kinetic terms and minimal couplings; Yukawa couplings and the
Higgs potential might not respect such a symmetry. If indeed there is such left-right
symmetry it must be spontaneously broken. We end up with the usual W bosons
coupling to left-handed fermions, plus two similar but more massive bosons coupling to
right-handed ones. At still higher mass scales there are bosons transforming quarks into
leptons, due to the embedding of $SU(3) \times U(1)$ in SU(4), and at still higher energies one
encounters bosons coupling particles to anti-particles.
8.13.2 E6
One can go one step further and embed SO(10) in $E_6$. This group is also anomaly free
while having complex representations. The simplest one is the 27. It decomposes to
$$27 \to 16 + 10 + 1 \qquad (8.72)$$
The first term represents one family, while the second and the third are real, and thus
have a chance to become massive well above the weak scale. This does not look especially
attractive, and nothing is gained by extending SO(10) to $E_6$, but nature need not follow
that kind of logic. The group $E_6$ contains $SO(10) \times U(1)$, and the extra U(1) bosons
must acquire a mass.
Just as above, the breaking of $E_6$ does not have to go via $SO(10) \times U(1)$, but one
could also consider other maximal subgroups. The most popular one is $SU(3)^3$ (the other
viable candidate is $SU(2) \times SU(6)$, but this does not seem to have been studied much).
One of those SU(3)s becomes the color group, whereas the other two contain $SU(2)_L$
and $SU(2)_R$, which are respectively the SU(2) group of the Standard Model, and its
counterpart for the right-handed fermions discussed above.
The main attraction of $SU(3)^3$ is the global $S_3$ permutation symmetry which one can
impose. If one does, the coupling constants of the three groups are equal without a need
for full unification into $E_6$. They will remain equal to all orders in perturbation theory.
However, this symmetry must be broken spontaneously to get the Standard Model.
$(e^-, \nu, u^c)_L$ and $10 = (\nu^c, d^c, u, d)_L$, and one adds an SU(5) singlet $e^+_L$. Note that there
is one extra particle per family, a right-handed neutrino. As far as the $SU(3) \times SU(2)$
representation is concerned this is possible, since the difference between the flipped particles
is just the electric charge. However, there is now only one way to get the correct electric
charge, and that is to add an extra U(1) factor. The Standard Model $U(1)_Y$ is then a
linear combination of the U(1) subgroup of SU(5) and the extra factor. Let us denote
the charges respectively as $Q_F$, $Y$ and $Q_5$, where $Q_5$ is the U(1) embedded in SU(5).
Thus $Q_5$ is exactly $Y$ in standard SU(5), and we will normalize it in the same way. It is
then easy to check that the combination $Y = -\tfrac15 Q_5 + Q_F$ gives the correct answer, if we
assign $Q_F$ charges $-\tfrac35$, $\tfrac15$ and 1 to the five, ten and the singlet respectively. Note that $Q_F$
is traceless. In fact, it turns out that this $SU(5) \times U(1)$ model is a subgroup of SO(10),
and that the charge $Q_F$ is $B - L - \tfrac45 Q_5$ (equivalently $\tfrac15 (B - L + 4Y)$ in terms of the
Standard Model hypercharge $Y$).
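The charge assignments of the flipped model are easy to verify explicitly. The following Python sketch lists the fields of one family with their $Q_5$, $Y$ and $B - L$ values (all as left-handed fields, so anti-particles carry the opposite $B - L$), and checks the hypercharge formula, the tracelessness of $Q_F$, and its relation to $B - L$. The table layout and variable names are illustrative assumptions; the signs are the ones required to reproduce the Standard Model hypercharges.

```python
from fractions import Fraction as F

# Flipped assignment: (field, SU(5) multiplet, multiplicity, Q5, Y, B-L).
# Q5 is the U(1) inside SU(5), normalized like Y in standard SU(5).
FIELDS = [
    ("u^c",  "5*", 3, F(1, 3),  F(-2, 3), F(-1, 3)),
    ("L",    "5*", 2, F(-1, 2), F(-1, 2), F(-1)),
    ("Q",    "10", 6, F(1, 6),  F(1, 6),  F(1, 3)),
    ("d^c",  "10", 3, F(-2, 3), F(1, 3),  F(-1, 3)),
    ("nu^c", "10", 1, F(1),     F(0),     F(1)),
    ("e^c",  "1",  1, F(0),     F(1),     F(1)),
]

# Charge under the extra U(1) factor for each SU(5) multiplet.
QF = {"5*": F(-3, 5), "10": F(1, 5), "1": F(1)}

def hypercharge_ok():
    """Check Y = -Q5/5 + QF for every field."""
    return all(y == -q5 / 5 + QF[m] for _, m, _, q5, y, _ in FIELDS)

def qf_trace():
    """Sum of QF over all 16 Weyl fermions of a family."""
    return sum(n * QF[m] for _, m, n, _, _, _ in FIELDS)

def qf_from_b_minus_l_ok():
    """Check QF = (B-L) - 4/5 Q5 = ((B-L) + 4Y)/5 for every field."""
    return all(
        QF[m] == bl - F(4, 5) * q5 == (bl + 4 * y) / 5
        for _, m, _, q5, y, bl in FIELDS
    )

print(hypercharge_ok(), qf_trace(), qf_from_b_minus_l_ok())
```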
This model seems to have few advantages and many disadvantages. Even the nice
property of automatic charge quantization is lost, since there is an extra U (1) factor. The
main reason why this model was considered is that one can break it to $SU(3) \times SU(2) \times U(1)$
with a Higgs in the 10 of SU(5). This was seen as an advantage for such a model in
the context of superstring theories, since in most string theories one cannot get a Higgs
in the 24 of SU (5), but only in the 1, 5 or 10.
8.14 Conclusions
The idea of grand unification is a priori very attractive. This idea can in a natural way
explain the following features of the Standard Model:
+ Coupling constant convergence.
+ Charge quantization.
Family repetition.
The first of these problems is actually made more serious due to the explicit introduc-
tion of a large scale into the problem, which does not decouple naturally from the weak
scale.
The minimal GUT model, based on SU(5), does not agree with experiment: precise
LEP measurements have shown that the three running coupling constants do not
meet exactly in one point, and the expected proton decay has not been found. All other
models have extra parameters, and are much harder to rule out.
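The statement that the couplings do not meet in one point can be illustrated with a one-loop estimate. The sketch below runs the three inverse couplings with the one-loop Standard Model coefficients $b_i = (\tfrac{41}{10}, -\tfrac{19}{6}, -7)$ and computes the scale at which each pair crosses. The input values at $M_Z$ are rounded, approximate numbers assumed for this illustration (with $\alpha_1$ in SU(5) normalization), and threshold and higher-loop effects are ignored.

```python
import math

M_Z = 91.19  # GeV

# Approximate inverse couplings at M_Z (assumed, rounded inputs;
# alpha_1 is in SU(5) normalization).
ALPHA_INV_MZ = {1: 59.0, 2: 29.6, 3: 8.5}

# One-loop Standard Model beta-function coefficients.
B = {1: 41 / 10, 2: -19 / 6, 3: -7.0}

def alpha_inv(i, mu):
    """One-loop running: 1/alpha_i(mu) = 1/alpha_i(M_Z) - b_i/(2 pi) ln(mu/M_Z)."""
    return ALPHA_INV_MZ[i] - B[i] / (2 * math.pi) * math.log(mu / M_Z)

def crossing_scale(i, j):
    """Scale in GeV at which couplings i and j are equal at one loop."""
    t = 2 * math.pi * (ALPHA_INV_MZ[i] - ALPHA_INV_MZ[j]) / (B[i] - B[j])
    return M_Z * math.exp(t)

for pair in [(1, 2), (1, 3), (2, 3)]:
    print(pair, f"{crossing_scale(*pair):.2e} GeV")
```

With these inputs the three pairwise crossings are spread over several orders of magnitude in energy, which is the one-loop version of the mismatch described above.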
However, the idea of grand unification is far from dead. Two remarkable facts remain:
that the coupling constants converge approximately, and that one family fits exactly in
two representations of SU (5), and with an extra right-handed neutrino in a single
representation of SO(10). These two observations will undoubtedly continue to play an
important role in the future.
8.15 References
Most of this section was based on the extensive review by P. Langacker [20]. A useful
review of group theoretical results for unification is [28].
9 Supersymmetry
Supersymmetry is a symmetry relating bosons to fermions. There is no doubt that it
plays an important role in theoretical particle physics already. It has been used for
proving index theorems, deriving positive energy theorems and lower bounds on soliton
masses, constructing consistent fermionic strings and many other purposes. All of these
are technical applications, however. The question is: could it be a symmetry of nature?
At first sight it seems that the answer must be negative. Among the particles in
the Standard Model, there is at most one boson-fermion pair with the same mass (the
photon and one of the neutrinos), and only one pair that belongs to the same
$SU(3) \times SU(2) \times U(1)$ representation (a lepton doublet and the complex conjugate of the Higgs).
So if supersymmetry is a symmetry of nature it must be badly broken. This is not a
problem in itself, since we know from the Standard Model that badly broken symmetries
can nevertheless play a crucial role in our understanding, but the difference is that at
least we have always known several complete $SU(2)_w$ multiplets. It is this difference that
makes phenomenological supersymmetry a much more speculative subject. There simply
is not the slightest piece of direct evidence in its favor. Many people hoped that after
the first LHC run that finished in March 2013, some of the missing superpartners would
finally have emerged, after several decades of expectations. But this has not happened.
There are several a priori motivations for attempting to supersymmetrize the Standard
Model. The first and most primitive one falls under the category "why not". It is argued
that supersymmetry is a very beautiful idea, and that it would be a pity if nature chose
to ignore it. No further comments are needed here.
The second motivation is that supersymmetry is known to improve the divergent
behavior of perturbation theory. For example, N = 4, D = 4 Yang-Mills theory (four
supersymmetries in four dimensions) was shown to be finite to all orders in perturbation
theory! This is a very remarkable result, but not a motivation to make $SU(3) \times SU(2) \times U(1)$
supersymmetric. In a finite theory without a scale the $\beta$-functions vanish and
the coupling constants do not run, whereas we observe that they do run. After N =
4, D = 4 Yang-Mills was shown to be finite, it was hoped that one could also find a finite
supersymmetric theory of gravity. So far the maximally supersymmetric theory, N = 8
supergravity, has not been demonstrated to be finite, nor has the contrary been shown
convincingly. The hope of a finite theory of gravity has however probably been realized
by superstring theory. The spectrum of this theory is supersymmetric, and if indeed
superstring theory is the only way to make sense of perturbative gravity, one could view
this as an argument in favor of a supersymmetric spectrum. One should add immediately
that finiteness is useful in practice only if it survives supersymmetry breaking. If it does,
then it should not make any difference if supersymmetry is broken far below the Planck
scale, or just slightly below. So even if this is an argument in favor of supersymmetry, it
does not imply low-energy supersymmetry.
The primary motivation for believing that supersymmetry might have a role to play
in particle physics is the hierarchy problem. One way to formulate that problem is that
one cannot have scalars that are naturally massless without having supersymmetry. In
comparison to large scales such as the GUT scale or the Planck scale all Standard Model
particles are essentially massless. One may call that natural if there is an exact symmetry
in the zero mass limit. Of course the particle of interest here is the scalar particle dis-
covered in 2012, the first particle that might be a fundamental scalar: the Higgs boson.
This particle is not exactly massless: before symmetry breaking the Higgs scalar has a
mass$^2$ $-\mu^2 < 0$ and after symmetry breaking a physical scalar appears in the spectrum
with mass $\sqrt{2\mu^2}$. But, as explained in sec. 7, the value of $\mu^2$ is extremely small in
comparison to the GUT or Planck scale, so that to first approximation the Higgs scalar
is a massless scalar.
The only massless (or nearly massless) particles that one can have in a sensible field
theory have spin 0, $\tfrac12$, 1, $\tfrac32$ or 2. There are good arguments for that in field theory, and
string theory respects that rule as well. For each of these particles except spin 0, there
is a natural symmetry that can protect them against large mass corrections. Particles of
spin 1 are protected by gauge invariance. In order to make them massive, one has to find
an additional degree of freedom to go from two to three polarizations. A Higgs scalar can
provide that degree of freedom. But without such a scalar, there is no possibility for a
spin-1 particle to acquire a mass. The same argument holds for spin-2: a graviton has two
polarizations, but a massive spin-2 particle has five. For fields of spin 2, the protection
symmetry is general relativity, and the spin 2 field must be the graviton. For spin-$\tfrac32$ (if
such particles are ever observed) supergravity acts as the protection mechanism.
Footnote (on the supersymmetric spectrum of superstring theory): This is true by definition:
superstrings are supersymmetric string theories. There also exist non-supersymmetric string
theories. They are finite at one-loop order, but beyond that it is difficult to make sense
of them.
Spin-$\tfrac12$ particles are protected by chiral symmetries. If a fermion mass is set to zero, a
new symmetry emerges: one can now rotate the left- and right-handed components of the fermion
independently. Such symmetries are respected in perturbation theory, and hence no mass
term will be generated if it was not already there. In this sense massless spin-$\tfrac12$ particles
are protected. Unlike all previous symmetries the chiral symmetry does not have to be
local, and the spin-$\tfrac12$ particles are not in any sense gauge particles.
There are two known protection mechanisms for massless scalars: they could either be
Goldstone bosons of some broken global symmetry, or they could be protected by (global
or local) supersymmetry. It is difficult to regard the Standard Model Higgs boson as a
Goldstone boson, although ideas in that direction have been explored. One problem is
that a Goldstone boson would have derivative couplings with all fields, a property not
shared by the Higgs field of the Standard Model.
The supersymmetric protection mechanism is easy to understand: supersymmetry
pairs the scalar with a fermion, whose mass is protected by chiral symmetry. Since
supersymmetry requires the boson and fermion mass to be equal, the boson mass is now
protected as well. It is thus natural to wonder if perhaps supersymmetry can be used to
solve the hierarchy problem.
Note that supersymmetry has nothing to say about the weak interaction scale itself.
If supersymmetry is unbroken the Higgs mass, which in the Standard Model is related to
the weak scale, is an arbitrary parameter, as we will see; if supersymmetry is broken the
weak scale is determined by the supersymmetry breaking scale, and then one can start
arguing about the origin of that scale. Here technicolor appears to have the advantage. In
that case the scale is determined by a gauge coupling constant becoming strong, and we
know from the example of QCD that it is quite natural for this to happen at scales much
below the Planck or unification scale. The most popular mechanism for supersymmetry
breaking, gaugino condensation, also involves dynamical symmetry breaking, so that in
such models the scale would be determined as well.
Footnote (on massless fermions): There is also a second possibility for massless fermions, namely that they are Goldstinos of broken
global supersymmetry. Attempts have been made to regard Standard Model fermions as Goldstinos, but
without much success.
Footnote (on the Higgs field): By "Higgs field" we mean here the complex scalar field in the unbroken
Standard Model. One should not confuse this with the fact that three of the four real Higgs scalars
become Goldstone bosons after $SU(2) \times U(1)$ breaking, which are eaten by the W and the Z. There is
in any case also a fourth, physical Higgs boson, which cannot be a (pseudo) Goldstone boson.
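algebra in a theory with a free boson and a free fermion).
$$
\begin{aligned}
[Q_\alpha, P_\mu] &= 0 \\
\{Q_\alpha, Q_\beta\} &= 0 \\
\{\bar Q_{\dot\alpha}, \bar Q_{\dot\beta}\} &= 0 \\
\{Q_\alpha, \bar Q_{\dot\beta}\} &= 2 \sigma^\mu_{\alpha\dot\beta} P_\mu .
\end{aligned}
$$
If the vacuum is supersymmetric, $Q_\alpha |0\rangle = 0$, then trivially $\langle 0| H |0\rangle = 0$, i.e. the ground state has
zero energy. The converse is also true. If $Q_\alpha |0\rangle \neq 0$ then $\langle 0| H |0\rangle$ can be written as
$\tfrac12 \sum_\alpha |Q_\alpha |0\rangle|^2 > 0$.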
The fact that the energy of the vacuum is zero is a first indication of cancellation
between fermions and bosons. In a non-supersymmetric bosonic field theory the
zero-point energies of the bosonic oscillators are positive and add up to infinity (which is then set
to zero), whereas fermions give a negative contribution.
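The cancellation can be seen in the simplest possible setting: one bosonic and one fermionic oscillator mode. The following sketch assumes that both modes have the same frequency (as supersymmetry would require) and uses units with $\hbar = 1$; the function names are illustrative. The bosonic zero-point energy $+\tfrac12\omega$ is cancelled by the fermionic one, $-\tfrac12\omega$, so the ground state has zero energy and all excited levels come in boson-fermion pairs.

```python
from fractions import Fraction as F

def energy(n_boson, n_fermion, omega=F(1)):
    """Energy of one bosonic plus one fermionic oscillator mode (hbar = 1)."""
    if n_fermion not in (0, 1):
        raise ValueError("a fermionic mode can only be empty or singly occupied")
    bosonic = omega * (n_boson + F(1, 2))      # zero-point energy +omega/2
    fermionic = omega * (n_fermion - F(1, 2))  # zero-point energy -omega/2
    return bosonic + fermionic

def spectrum(max_level):
    """Map each energy to the list of (n_boson, n_fermion) states that have it."""
    levels = {}
    for n_b in range(max_level + 1):
        for n_f in (0, 1):
            levels.setdefault(energy(n_b, n_f), []).append((n_b, n_f))
    return levels

for e, states in sorted(spectrum(3).items()):
    print(e, states)
```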
9.2 Multiplets
Since the supercharge transforms bosons into fermions and vice-versa, it is clear that it
organizes the field content of the theory into super-multiplets, which are representations
of the supersymmetry algebra. In the simplest case, N = 1, there are only two relevant
multiplets, called the chiral multiplet and the vector multiplet. The former consists of a
complex scalar and a complex Weyl fermion, the latter contains a real vector boson and
a Majorana fermion. The fields in each multiplet must transform according to the same
representation of any gauge symmetry. The members of a vector multiplet must thus
both belong to the adjoint representation of a gauge group, of which the vector boson is
the gauge boson.
Chiral supermultiplets can be left-handed or right-handed, because a scalar in a rep-
resentation R may be paired either with a left-handed Weyl fermion in the representation
R , or a right-handed Weyl fermion in the representation R. The Hermitean conjugate of
a left-handed chiral multiplet contains a scalar and a right-handed Weyl fermion, both in
the representation $R^*$. This is thus a right-handed chiral multiplet. Gravity requires an
additional multiplet, containing a spin 2 and a spin-$\tfrac32$ particle.
There also exist extended supersymmetries with more than one supercharge. Their
representations are larger and can contain more different spins. If one requires that the
highest spin that occurs is 2, the maximal number of supersymmetries is 8.
In extended supersymmetry theories every multiplet contains only real fermion repre-
sentations. This does not look like a very promising starting point if one wants to obtain
the Standard Model, which has complex matter representations. The simplest example of
an extended supersymmetry is N = 2 supersymmetry. In this theory, matter belongs to
hyper-multiplets which can be decomposed into a chiral and an anti-chiral multiplet of
N = 1 supersymmetry. Hence a hypermultiplet consists of a right-handed Weyl fermion,
a left-handed Weyl fermion and two complex scalars, all in the same representation. This
means that for every right-handed fermion there is automatically a left-handed one: the
theory is not chiral. This is not a good starting point for phenomenology (although this
has not stopped all attempts in that direction).
On-shell (i.e. when the equations of motion are imposed) a Weyl-fermion and a Ma-
jorana fermion both have two degrees of freedom. A complex scalar and a real, on-shell
vector boson also have two degrees of freedom. Hence each multiplet does indeed contain
an equal number of bosonic and fermionic degrees of freedom.
To write down an action we need off-shell fields. The equations of motion follow from
the action, but are not imposed on it. Off-shell a complex scalar still has two degrees of
freedom, but a Weyl and a Majorana fermion have four, just as a vector boson. To realize
supersymmetry off-shell additional fields have to be introduced, which can be removed
from the action by their equations of motion, since they do not have kinetic terms. These
are called auxiliary fields. For the scalar multiplet we need one complex bosonic auxiliary
field to get the correct counting. For a vector multiplet one might expect to need none,
but there is a complication since the reduction of the number of degrees of freedom for a
vector boson involves not only the field equations, but also gauge invariance. In fact, the
full set of auxiliary fields for the vector multiplet contains several bosons and fermions.
There are two kinds of superfields.
Chiral superfields: By definition these depend only on $x$ and $\theta$, but not on $\bar\theta$.
In a common convention their expansion in $\theta$ is
$$\Phi(x, \theta) = \phi(x) + \sqrt{2}\,\theta\psi(x) + \theta^2 F(x)$$
The component $\phi(x)$ is a complex scalar field, and $\psi(x)$ is a Weyl spinor. We choose it
left-handed by convention. The strange factor $\sqrt{2}$ is also a convention. The component
$F(x)$ is unphysical: it does not lead to propagating degrees of freedom. It is called an
auxiliary field. Its role is to make sure that the superfield contains the same number of
bosonic and fermionic components both on-shell and off-shell.
Here "off-shell" refers to the count of the field components. A Weyl spinor is a complex
two-component field. It must be complex, because SO(3, 1) Lorentz rotations acting on
spinors are complex. Hence a Weyl spinor has four off-shell components. The complex
scalar fields $\phi(x)$ and $F(x)$ have two components each, so that the boson/fermion counting
works out: 2 + 2 = 4.
On the other hand "on-shell" refers to the counting of physical, propagating degrees
of freedom. A Weyl fermion has two propagating degrees of freedom. This is because the
word "on-shell" means that the Dirac equation is imposed as a constraint on the field.
This can be seen explicitly in the Dirac propagator which contains a factor $\not{k} + m$ (see
e.g. (5.40)). On-shell, if $k^2 = m^2$, this has two eigenvalues zero, reducing the number of
propagating components by a factor 2. This reduction works in the same way for a Weyl
spinor. In both cases, one can start with a complex, four component Dirac spinor, which
has eight degrees of freedom. The Dirac equation reduces this to four physical ones, and
in addition Weyl spinors satisfy a chirality constraint of the form $(1 \pm \gamma_5)\psi = 0$ (with the
sign fixed by the chirality and the conventions), which gives another reduction by
a factor 2. A complex scalar has two physical degrees of freedom. As already stated, $F(x)$
is entirely unphysical, and hence the on-shell boson/fermion count also works: 2 + 0 = 2.
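The bookkeeping for the chiral multiplet can be summarized in a small table. The Python sketch below simply tabulates the real degrees of freedom quoted above and adds them up off-shell and on-shell; the dictionary layout and the helper `count` are illustrative, not part of the formalism.

```python
# Real degrees of freedom of the components of a chiral superfield.
# Values: (statistics, off-shell count, on-shell count).
CHIRAL_MULTIPLET = {
    "phi (complex scalar)": ("boson", 2, 2),
    "psi (Weyl spinor)": ("fermion", 4, 2),
    "F (auxiliary field)": ("boson", 2, 0),
}

def count(multiplet, shell):
    """Return (bosonic, fermionic) real degrees of freedom; shell is 'off' or 'on'."""
    index = {"off": 1, "on": 2}[shell]
    totals = {"boson": 0, "fermion": 0}
    for entry in multiplet.values():
        totals[entry[0]] += entry[index]
    return totals["boson"], totals["fermion"]

print("off-shell:", count(CHIRAL_MULTIPLET, "off"))
print("on-shell: ", count(CHIRAL_MULTIPLET, "on"))
```

Dropping the entry for $F$ leaves the on-shell count balanced but spoils the off-shell one, which is the reason the auxiliary field is introduced.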
The story for vector fields is similar, but more complicated. A full expansion gives
many terms, but using gauge invariances several can be put to zero. The result is
$$V(x, \theta, \bar\theta) = -\theta\sigma^\mu\bar\theta\, V_\mu + i\theta^2\bar\theta\bar\lambda - i\bar\theta^2\theta\lambda + \tfrac12 \theta^2\bar\theta^2 D + \ldots , \qquad (9.5)$$
Here $V_\mu$ represents a real vector field, $\lambda$ a Majorana fermion, and $D$ is an auxiliary field.
The on-shell count is as follows: two d.o.f. for $V_\mu$, two d.o.f. for $\lambda$ and zero for $D$. The
off-shell count involves gauge invariance and to check it one must include the omitted
terms.
The rules for writing down supersymmetric Lagrangians using superfields are as follows.
There are two kinds of terms:
F-terms: products of superfields that depend only on $x$ and $\theta$
D-terms: products of superfields that depend on $x$, $\theta$ and $\bar\theta$, and are Hermitean
The rule is now to expand these products of fields in terms of $\theta$ and $\bar\theta$, and keep only the
terms of highest order in the anti-commuting variables, i.e. proportional to $\theta_1\theta_2$ for F-terms
and proportional to $\theta_1\theta_2\bar\theta_1\bar\theta_2$ for D-terms. The coefficient functions of these highest
powers of $\theta$ can be shown to be supersymmetric. The resulting Lagrangian will in general
depend on the auxiliary fields $F$ and $D$. However, these fields always appear without
derivatives. Hence their equations of motion are non-dynamical. They simply state that
the variation of the Lagrangian with respect to $F$ and $D$ must vanish. This yields a simple
algebraic constraint, that can be solved to eliminate $F$ and $D$.
The F-terms are the easiest ones to deal with. They should not have any dependence
on $\bar\theta$, and hence they can only be polynomials in terms of the chiral superfields. Just as in
non-supersymmetric QFT, any term in the polynomial that respects all the symmetries
is allowed. However, to get a renormalizable theory one may allow only terms of at most
order three in the chiral superfields. This polynomial is called the superpotential. To
derive it for the supersymmetrized Standard Model all we have to do is take all the
left-handed Weyl spinors and assign a superfield to each, and then build the most general
superpotential terms that are invariant under $SU(3) \times SU(2) \times U(1)$.
The D-terms are a bit more difficult to discuss. To construct these out of chiral
superfields, one would like to consider the Hermitean conjugate of a chiral superfield, and
build something like $\Phi^\dagger\Phi$. But $\Phi^\dagger\Phi$ does not transform correctly under supersymmetry. It
is neither a chiral superfield nor a vector superfield. To solve this one first has to apply a
transformation to $\Phi$. Fortunately, D-terms play a rather simple role in the construction.
They give rise to the kinetic terms of the fermions and the bosons, which we could easily
have written down anyway.
$$\mathcal{L} = \Phi^\dagger\Phi\,|_D = \partial_\mu\phi^*\partial^\mu\phi + i\bar\psi\bar\sigma^\mu\partial_\mu\psi + F^*F \qquad (9.6)$$
The only term of interest here is the dependence on the auxiliary field.
One can couple these kinetic terms to gauge fields by considering
$$\mathcal{L} = \Phi^\dagger e^{2gV}\Phi\,|_D \qquad (9.7)$$
The first two terms are not unexpected: derivatives become covariant derivatives. There
are two additional terms: a scalar-fermion-gaugino coupling and a coupling of two scalars
to the auxiliary field.
There is one other kind of D-term that is clearly possible, and that is a term linear in $V$.
This is invariant under supersymmetry, but only invariant under gauge transformations
if one takes a trace. This can only be non-zero for abelian gauge theories. This kind of
term does not play a major role in the following. Terms of quadratic and higher order in
$V$ have too many $\theta$'s to yield anything.
So far the formalism was fairly elegant, but this cannot be said about the gauge kinetic
terms. These are actually F-terms, but these are constructed in a rather baroque way. We
refer to the appendix for details, but gauge invariance fixes most of the structure anyway:
$$\mathcal{L}_{\rm gauge} = -\tfrac14 (F^a_{\mu\nu})^2 - \tfrac12 \bar\lambda^a \gamma^\mu D_\mu \lambda^a + \tfrac12 (D^a)^2 , \qquad (9.9)$$
Not surprisingly, there are kinetic terms for the gaugino, and also not surprisingly they
involve a covariant derivative. The only noteworthy term is the quadratic one involving
the auxiliary fields.
This is all we need to write down a supersymmetric extension of the Standard Model.
respectively, plus the Standard Model singlet $\bar N$ for the left-handed anti-neutrino (or
equivalently the right-handed neutrino), if desired. The bars are added to remind ourselves
of the fact that these fields represent anti-particles; the right-handed chiral multiplets
that are the conjugates of these fields are denoted as Q , U etc. All these fields carry, in
addition to their $SU(3) \times SU(2) \times U(1)$ indices, a flavor index with three distinct values.
The resulting particles are called squarks (scalar quarks) and sleptons. Note that
there will be a squark for every left-handed and another one for every right-handed field
(in the particle representation). Hence for example the up quark has two scalar partners,
often denoted $\tilde u_L$ and $\tilde u_R$; of course since they are scalars the chirality index only refers
to the fermion they belong to.
The kinetic terms of these fields require no further discussion, but the Yukawa couplings
are more interesting. In the non-supersymmetric Standard Model we needed the
Higgs scalar $\phi$ as well as the conjugate $\phi^C$ to give mass to all quarks. Suppose we
introduce a left-handed chiral superfield $H_d$ in the representation $(1, 2, -\tfrac12)$. Thus $H_d$
transforms exactly like $L$, and the scalar component of $H_d$ transforms exactly like the
complex conjugate Higgs field $\phi^C$. Using this field we can write down the following
Yukawa couplings:
$$g_D Q H_d \bar D + g_E L H_d \bar E \qquad (9.10)$$
Here all indices have been suppressed, but they are exactly as in Eq. (4.24). The two
terms given in Eq. (9.10) yield the complex conjugates of the last two terms in Eq. (4.24),
when one considers only the terms involving Standard Model particles. Of course both
fermions in the resulting Yukawa coupling will be left-handed, and one has to convert one
of them to right-handed notation to get Eq. (4.24) (up to an irrelevant overall phase).
The structure of Eq. (9.10) is dictated by gauge invariance, and in particular the SU(2)
indices must be contracted as $Q^a H_d^b \epsilon_{ab}$.
Now we would like to write down the equivalent of the first term in Eq. (4.24), and
we would also like to introduce neutrino Yukawa couplings. The obvious guess is $Q H_d^\dagger \bar U$,
but $H_d^\dagger$ is a right-handed superfield, and there exists no supersymmetric coupling to the
two left-handed superfields $Q$ and $\bar U$. This forces us to introduce a new field $H_u$ which
transforms like $(1, 2, \tfrac12)$. Then the missing Yukawa couplings are
$$g_U Q H_u \bar U + g_N L H_u \bar N \qquad (9.11)$$
Here again all indices are contracted in the obvious way, flavor indices as in Eq. (4.24),
and all others as dictated by gauge invariance.
There is another reason why we are anyway forced to introduce an additional Higgs
doublet. The supermultiplet $H_d$ contains a left-handed fermion in the representation
$(1, 2, -\tfrac12)$. This field contributes to the mixed $SU(2)^2$-$U(1)$ and $U(1)^3$ anomalies of the Standard
Model, and hence we have to introduce additional matter to cancel these anomalies. The
simplest solution is to add a left-handed chiral superfield in the representation $(1, 2, \tfrac12)$.
This gives also another reason why it is not a good idea to identify the fields $H_d$ with
one of the flavors of the lepton doublets $L_i$: to get masses for the up quarks we would in
any case need a chiral superfield in the representation $(1, 2, \tfrac12)$, and to cancel the anomalies
introduced by this field we need to add a $(1, 2, -\tfrac12)$ superfield. So the fields $H_d$ and $H_u$
are needed in any case.
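The anomaly argument for the second Higgs doublet is a two-line computation. The sketch below assumes that the fermion in $H_d$ is a colour-singlet SU(2) doublet with hypercharge $-\tfrac12$ and the one in $H_u$ a doublet with hypercharge $+\tfrac12$, and evaluates the two anomaly sums mentioned above. The function name and the normalization of the sums are choices made for this illustration.

```python
from fractions import Fraction as F

def higgsino_anomalies(hypercharges):
    """Anomaly sums contributed by colour-singlet SU(2)-doublet Weyl fermions."""
    su2_su2_u1 = sum(hypercharges)                  # one unit of Y per doublet
    u1_cubed = sum(2 * y**3 for y in hypercharges)  # two components per doublet
    return su2_su2_u1, u1_cubed

ONLY_HD = [F(-1, 2)]
HD_AND_HU = [F(-1, 2), F(1, 2)]

print("H_d alone:  ", higgsino_anomalies(ONLY_HD))
print("H_d and H_u:", higgsino_anomalies(HD_AND_HU))
```

A single higgsino doublet leaves both sums non-zero, while the pair cancels, since the two fermions together form a real representation.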
Footnote: We have chosen $H_d$ and $H_u$ to transform in the same SU(2) representation and not in complex conjugate
representations, as some others do. The difference is merely an $\epsilon$ tensor.
in the superpotential. If one works out the scalar potential one finds that this simply
gives rise to a mass term
$$|\mu|^2 (|h_d|^2 + |h_u|^2) \qquad (9.13)$$
for the Higgs scalars $h_d$ and $h_u$ in the superfields, as well as a contribution to the higgsino mass
matrix. Note that these Higgs scalar masses are free parameters, just as in the Standard
Model, but that unlike the Standard Model Higgs mass term $|\mu|^2$ can only be positive.
Hence there is no possibility for spontaneous $SU(2) \times U(1)$ breaking with the present
form of the potential. This gives us no reason for concern: all this is true only as long as
supersymmetry is not broken, but we know that it has to be broken. There are no other
renormalizable superpotential contributions involving only $H_d$ and $H_u$. Note that we do
not get any quartic scalar potential terms from the superpotential.
The extra term, Eq. (9.12), is not good news. Unlike the Yukawa coupling constants,
$\mu$ has the dimension of a mass. If our ambition is only to build a theory with naturally
protected hierarchies, $\mu$ poses no problem: as we will see the coefficients of the superpotential
are not renormalized, and hence we can give $\mu$ any value we like in a natural
way. But unless $\mu$ is of order $M_{\rm Planck}$ its existence introduces a $\mu/M_{\rm Planck}$ hierarchy problem
(here instead of $M_{\rm Planck}$ one can substitute any other large scale that occurs in the
theory). Of course $\mu$ cannot be of order $M_{\rm Planck}$, because it will contribute to the Higgs
mass parameter after supersymmetry breaking, and hence its natural value is of order the
weak scale (or smaller).
In addition one can add the following four terms:
where again the index structure is dictated by gauge invariance. Each term would appear
with a coupling tensor with as many flavor indices as there are fields.
These terms are undesirable, since they manifestly violate either baryon or lepton
number. They do not appear in the Standard Model although they would be allowed by
$SU(3) \times SU(2) \times U(1)$ group theory. The reason is that in the Standard Model Lorentz
invariance forbids them: one cannot couple three fermions to a singlet, or a fermion to
a scalar. This is a clear disadvantage of the supersymmetric extension of the Standard
Model.
Note that the first three terms are simply the Yukawa couplings of the field $H_d$, with
$H_d$ replaced by $L$. This gives us yet another reason why one should not identify $H_d$ with
$L$, because in that case such undesirable couplings are certainly inevitable.
A contribution to proton decay due to these terms is shown below. Here the dashed
line indicates a scalar component of a superfield and the solid line a fermion component.
The diagram corresponds to the decay $p \to e^+ + \ldots$ (the terms denoted by $\ldots$ are hadrons
that are needed for energy-momentum conservation, and that would be created when the
proton breaks up).
[Figure: proton decay diagram; line labels $u$, $d$, $\tilde d$, $u$, $e^+$.]
The arrow convention is such that it shows the flow of color charges. In any vertex
there must be the same number of incoming and outgoing arrows, except for vertices that make
use of the $\epsilon$-tensor coupling; they have three incoming (or outgoing) arrows. By adding
a free $d$-quark line, this particular process can be interpreted as $p = (uud) \to (e^+)(d d^c)$,
and the $d d^c$ becomes a neutral pion. One may suppress this decay either by making the
coupling constant extremely small or the mass of the scalar component of the $d$-quark
(usually called the $d$-squark) extremely large. From GUTs we know that with couplings of
order 1 the mass of this squark would have to be of order $10^{15}$ GeV. This would imply an
extremely large supersymmetry breaking, and it is hard to see how supersymmetry could
in that case still have something to do with the breaking of weak interaction symmetries.
The limits on the three terms that only violate lepton number are less severe (the best
limit is about .01, for the first family ULD coupling), but nevertheless these couplings are
usually set to zero. They are zero by definition in the minimal supersymmetric Standard
Model (MSSM).
In general one cannot simply set allowed couplings to zero. This is possible only if there
is a symmetry protecting them. This symmetry will then be preserved by all quantum
corrections, so that the undesirable terms will not be generated if we omit them from the
Lagrangian. So we should try to find symmetries of the Standard Model Lagrangian that
are not symmetries of the undesirable terms. There are in fact many global symmetries
that are broken by these unwanted terms. The most obvious choice is $B$ or $L$ or $B-L$
(where $B$ and $L$ are assigned to the entire supermultiplet, i.e. a squark has $B = \frac13$). But
$B$ and $L$ are not good candidates for a fundamental symmetry of nature, because these
symmetries have an anomaly with respect to $SU(2)_{\rm Weak}$. The combination $B-L$ does
not have that problem, but this global symmetry would forbid a mass term for right-handed
neutrinos (just as $L$ by itself) and hence would inhibit the see-saw mechanism.
But perhaps the see-saw mechanism is not realized in nature. Then one has to accept
unnaturally small neutrino Dirac masses, which is a high price to pay for solving another
naturalness problem, the hierarchy problem. Furthermore, the idea that $B-L$ could be
an exact global symmetry is not likely to be correct, because it is believed that gravity
does not allow exact global symmetries. But if $B-L$ is a local symmetry, we should see
an extra abelian gauge boson.
There is another possibility. The Standard Model has five $SU(3)\times SU(2)\times U(1)$
matter representations and a single Higgs boson. The MSSM has one extra representation,
namely $H_u$. Therefore one may expect an additional $U(1)$ global symmetry $X$. Indeed
there is, if the terms (9.12) and (9.14) are absent. The corresponding $U(1)_X$ charges are
$1$ for $H_d$ and $H_u$, and $-\frac12$ for all quark and lepton left-handed chiral superfields. All extra
terms discussed in this section break $X$, or in other words if we impose it all of these
terms are forbidden. Note that in the Standard Model $H_d$ and $H_u$ correspond to $C\phi^*$ and
$\phi$, so that this charge assignment is not possible.
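$\theta \to e^{i\alpha}\theta$ . (9.15)
By definition, $\theta$ has R-charge 1. One may furthermore assign R-charges to all superfields.
The components of a superfield then transform according to the number of factors of
$\theta$ by which they are accompanied. For a left-handed chiral superfield with R-charge $r$,
the decomposition
$\Phi(x,\theta) = \phi(x) + \sqrt{2}\,\theta\psi(x) + \theta^2 F(x)$ , (9.16)
implies that $\phi$ has charge $r$, $\psi$ charge $r-1$ and $F$ charge $r-2$.
For a vector superfield
$V(x,\theta,\bar\theta) = -\theta\sigma^\mu\bar\theta\,V_\mu + i\theta^2\,\bar\theta\bar\lambda - i\bar\theta^2\,\theta\lambda + \frac12\theta^2\bar\theta^2 D + \ldots$ , (9.17)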
Any other assignment of R-charges differs from the previous one by some global
symmetry. For example, one may choose a different R-charge assignment $R'$ which allows
(9.12) but not (9.14) by choosing $R' = R + X$ (with $X$ as defined in the previous
subsection). Then all quark and lepton superfields have $R'$ charge $\frac12$ and $H_d$ and $H_u$ have $R' = 1$.
It is also possible to find a linear combination of $R$, $X$, $B$ and $L$ that is completely free of
anomalies with respect to non-abelian groups and even with respect to gravity, as well as
a combination that allows all of the terms in (9.12) and (9.14), namely $R + X - \frac12(B-L)$.
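The charge bookkeeping in this paragraph can be checked mechanically. The sketch below is only illustrative and rests on assumptions that are not spelled out in the text above: a superpotential term is taken to be allowed when its total R-charge equals 2, the term (9.12) is taken to be the $H_d H_u$ term, and the four terms of (9.14) are assumed to have the field content LLE, LQD, LHu and UDD. The $B-L$ values are the usual ones for left-handed chiral superfields.

```python
from fractions import Fraction as F

# Assumed B-L charges of the left-handed chiral superfields.
B_MINUS_L = {"Q": F(1, 3), "U": F(-1, 3), "D": F(-1, 3),
             "L": F(-1), "E": F(1), "Hd": F(0), "Hu": F(0)}

# R' = R + X as quoted in the text: 1/2 for matter, 1 for both Higgs superfields.
R_PRIME = {"Q": F(1, 2), "U": F(1, 2), "D": F(1, 2),
           "L": F(1, 2), "E": F(1, 2), "Hd": F(1), "Hu": F(1)}

# The combination R + X - (B-L)/2 mentioned at the end of the paragraph.
R_COMBINED = {f: R_PRIME[f] - F(1, 2) * B_MINUS_L[f] for f in R_PRIME}

# Field content of the terms (assumed, see the lead-in).
YUKAWA_AND_MU = {"QUHu": ["Q", "U", "Hu"], "QDHd": ["Q", "D", "Hd"],
                 "LEHd": ["L", "E", "Hd"], "HdHu": ["Hd", "Hu"]}
UNWANTED = {"LLE": ["L", "L", "E"], "LQD": ["L", "Q", "D"],
            "LHu": ["L", "Hu"], "UDD": ["U", "D", "D"]}


def allowed(charges, fields):
    """A superpotential term is R-invariant if its total R-charge is 2."""
    return sum(charges[f] for f in fields) == 2


def allowed_terms(charges):
    """Names of all terms (wanted and unwanted) permitted by the assignment."""
    terms = {**YUKAWA_AND_MU, **UNWANTED}
    return sorted(name for name, fields in terms.items() if allowed(charges, fields))
```

Calling `allowed_terms(R_PRIME)` and `allowed_terms(R_COMBINED)` lists which terms survive under each assignment.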
The problem with continuous R symmetries is that the gauginos are in a complex
representation of any R-symmetry and hence cannot become massive as long as the R-
symmetry is exact. If we break the R-symmetry spontaneously we get a Goldstone boson.
This boson is massless if the symmetry is anomaly free, and has a very small mass (like an
axion) if the broken symmetry has an SU (3) or SU (2) anomaly. This is phenomenologi-
cally unacceptable unless one can make the axion extremely weakly coupled, i.e. invisible.
Since there is no way of breaking the R-symmetry in the supersymmetric Lagrangian (not
even if we include the unwanted $B$ and $L$ violating terms (9.14)), we will have to worry
about this problem later.
9.7 R-Parity
In most work on the phenomenology of supersymmetry the unwanted $B$ and $L$ violating
terms are removed by imposing a discrete symmetry called R-parity. This symmetry is
$R = (-1)^{3(B-L)+2S}$ ,
where $S$ denotes the spin. Since $S$ is always conserved modulo integers, R-parity is
conserved if $B-L$ is conserved. The only reason for introducing the spin-dependent sign
is the following convenient characterization: all Standard Model particles have R-parity
$+$, while all their superpartners have R-parity $-$. Consequently, in interactions involving
ordinary matter (the only interactions we can cause to happen using accelerators),
superpartners are produced in pairs. Furthermore, a superpartner can only decay into another
superpartner, and hence there must exist a lightest superpartner (often called LSP) that
is absolutely stable. This is one of the most important handles we have experimentally
on supersymmetry.
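As a small illustration of the characterization above, the following sketch evaluates the standard R-parity formula $(-1)^{3(B-L)+2S}$ for a few representative states. The particle table is an assumption made for this example (one representative per type, with the usual $B$, $L$ and spin values); it is not taken from the text.

```python
from fractions import Fraction as F


def r_parity(baryon, lepton, spin):
    """Return (-1)**(3(B-L)+2S); the exponent must be an integer."""
    exponent = 3 * (F(baryon) - F(lepton)) + 2 * F(spin)
    if exponent.denominator != 1:
        raise ValueError("3(B-L)+2S is not an integer")
    return 1 if exponent % 2 == 0 else -1


# (B, L, spin) for representative states; illustrative choices only.
PARTICLES = {
    "quark": (F(1, 3), 0, F(1, 2)),
    "lepton": (0, 1, F(1, 2)),
    "gauge boson": (0, 0, 1),
    "Higgs scalar": (0, 0, 0),
}
SUPERPARTNERS = {
    "squark": (F(1, 3), 0, 0),
    "slepton": (0, 1, 0),
    "gaugino": (0, 0, F(1, 2)),
    "higgsino": (0, 0, F(1, 2)),
}
```

With these inputs every entry of `PARTICLES` comes out even and every entry of `SUPERPARTNERS` odd, which is the statement that superpartners can only be produced in pairs.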
Note that the terms (9.14) are forbidden by R-parity (indeed, they break $B-L$),
whereas (9.12) is not. A gaugino mass-term is also allowed by R-parity. Therefore most
phenomenology assumes that the continuous R-symmetry is in some way broken to R-
parity. This is in particular part of the MSSM definition.
In fact, the situation is not as bad as it seems since the particles we have seen so far are
precisely those whose masses are forbidden by unbroken $SU(3)\times SU(2)\times U(1)$. Indeed,
the gauginos are in real representations, and can thus have a Majorana mass, the scalars
are allowed to have a mass no matter what their representation is, and the Higgsinos
from $H_d$ and $H_u$ can combine with each other to form a massive Dirac fermion. On the
other hand, the masses of the gauge bosons are protected by gauge invariance, and those
of the quarks and leptons by chirality. The only Standard Model particle whose mass
is not protected by $SU(3)\times SU(2)\times U(1)$ is the Higgs boson, the last Standard Model
particle that has been discovered. The fact that its mass is not protected is precisely the
hierarchy problem. An optimistic point of view about supersymmetry is that we may just
be crossing the borderline between protected and unprotected particles.
In view of this the natural course to follow is to break supersymmetry first at some
scale MS , so that all non-Standard Model particles acquire masses, and so that the Higgs
scalar mass is still protected by supersymmetry cancellations, and then break the weak
interaction gauge symmetries at the lower scale MW .
To break supersymmetry we have three options:
1. Explicit breaking
2. Spontaneous breaking of global supersymmetry
3. Spontaneous breaking of local supersymmetry
Explicit breaking means that supersymmetry is not an exact symmetry of nature, but
just a coincidental property of the low-energy spectrum (where low means a few TeV,
i.e. low with respect to the next higher energy scale, for example the Planck scale). In
other words, perhaps nature is not fundamentally supersymmetric, but for some reason
the part of the spectrum that lies well below the Planck scale consists of equal numbers
of bosons and fermions for any gauge group representation. Remarkably, it is possible to
break supersymmetry explicitly without losing the good properties it has with respect
to scalar masses. However, for a fundamental theory of nature this does not look very
attractive, since we would never understand why these coincidences are occurring.
Spontaneous breaking of global supersymmetry has many problems, most obviously
the appearance of a massless Goldstone fermion related to the broken symmetry, the
Goldstino. In addition, it is well known that if we want to couple a supersymmetric
theory to gravity (which undoubtedly we will have to do), the global supersymmetry must
become local.
Thus if we reject option 1, and wish to see supersymmetry as a fundamental symmetry
of nature, we are inevitably led to local supersymmetry, also known as supergravity. In
supergravity the Goldstino is eaten by the gravitino, a massless spin-$\frac32$ field. If
supersymmetry breaks this particle must become massive, and its number of degrees of freedom
must increase from 2 to 4. The extra two degrees of freedom are provided by the
Goldstino, just as the Higgs scalar contributes the extra degree of freedom needed to make
a vector boson massive. This already solves the most obvious problem of spontaneously
broken global supersymmetry.
The MSSM makes no statement about the kind of symmetry breaking. In all three
cases, one assumes that low-energy physics is described in terms of the supersymmetric
Lagrangian plus so-called soft supersymmetry breaking terms, which do not affect the
cancellations due to supersymmetry. In the last two cases these soft breaking terms are
generated by the spontaneous symmetry breaking, whereas in the first case they are put
in by hand.
This yields simply the D-term of the superfield V , and is gauge invariant only if V is
a vector superfield of an abelian gauge symmetry. It receives quadratically divergent
corrections proportional to TrQ at one loop (and at one loop only), and hence there is no
problem if TrQ = 0 (which was also the condition for absence of gravitational anomalies).
Apart from this problem (which is easy to circumvent) all corrections are logarithmic.
9.10 Soft Supersymmetry Breaking
Remarkably, the absence of quadratic divergences can be maintained even if certain terms
are added to the Lagrangian that break supersymmetry explicitly. The allowed terms are
$m_{ij}\,\phi_i\phi_j^*$ ; $\alpha_{ij}\,\phi_i\phi_j$ + c.c ; $\beta_{ijk}\,\phi_i\phi_j\phi_k$ + c.c ; $\mu(\lambda\lambda + \bar\lambda\bar\lambda)$ , (9.21)
where $\lambda$ is a gaugino and $\phi_i$ a scalar field from one of the chiral multiplets; $m$, $\alpha$, $\beta$ and $\mu$
are arbitrary parameters. The most interesting terms that are not allowed are mass terms
for the fermions in chiral multiplets, Yukawa couplings of such fermions to Higgs bosons,
and fourth order scalar interactions.
Note that the second and third terms have precisely the structure of terms in the
superpotential, when the scalar field $\phi$ is replaced by a superfield $\Phi$. The conditions
for invariance under global and local symmetries that commute with supersymmetry are
identical for these terms. However, they appear directly in the potential, whereas the
similar-looking superpotential terms lead to totally different terms in the potential. The
last soft breaking term does not respect continuous R-symmetries, since the gaugino
transforms non-trivially under such a symmetry. Most of the terms of the second and
third type will generically also violate R-symmetries.
If other terms are added to the action this leads in general to quadratic divergences,
so that everything one hoped to get from supersymmetry is lost. There are exceptions
however. The analysis leading to an enumeration of soft breaking terms assumes arbitrary
supersymmetric theories. In a specific theory the expected disasters may not occur, and
indeed there are examples of that. However, it does not seem that such potentially
dangerous soft breaking terms are actually generated in spontaneous symmetry breaking.
standard example of such a superpotential is $\lambda_1\Phi_1(\Phi_3^2 - M^2) + \lambda_2\Phi_2\Phi_3$, where $\Phi_i$ are
superfields, and $\lambda_i$ and $M$ parameters.
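To see why a superpotential of this kind breaks supersymmetry, one can look at its F-terms numerically. The sketch below assumes the superpotential has the form $W = \lambda_1\Phi_1(\Phi_3^2 - M^2) + \lambda_2\Phi_2\Phi_3$ with real parameters (the default values are arbitrary choices for illustration) and takes the scalar potential to be $V = \sum_i |\partial W/\partial\phi_i|^2$. The conditions $F_1 = 0$ and $F_2 = 0$ require $\phi_3^2 = M^2$ and $\phi_3 = 0$ at the same time, so the potential cannot reach zero.

```python
def f_terms(phi1, phi2, phi3, lam1=1.0, lam2=1.0, mass=1.0):
    """Derivatives of W with respect to the three scalar fields."""
    f1 = lam1 * (phi3 ** 2 - mass ** 2)
    f2 = lam2 * phi3
    f3 = 2 * lam1 * phi1 * phi3 + lam2 * phi2
    return f1, f2, f3


def potential(phi1, phi2, phi3, **params):
    """Scalar potential V = sum_i |F_i|^2."""
    return sum(abs(f) ** 2 for f in f_terms(phi1, phi2, phi3, **params))


def scan_minimum(points=81, span=2.0, **params):
    """Scan phi3 over a grid in the complex plane.

    F3 can always be set to zero by a suitable phi2 (here phi1 = phi2 = 0),
    so only phi3 matters for the lower bound on V.
    """
    grid = [-span + 2 * span * i / (points - 1) for i in range(points)]
    return min(potential(0.0, 0.0, complex(x, y), **params)
               for x in grid for y in grid)
```

For the default parameters the scanned minimum stays strictly positive, close to the analytic value $\lambda_2^2M^2 - \lambda_2^4/(4\lambda_1^2)$ that follows from minimizing along the real $\phi_3$ axis.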
Fayet-Iliopoulos breaking occurs when it is not possible to have a simultaneous
solution to the equation $D^a = 0$ for all $a$. In the absence of a $\xi$-term (and in particular
for non-abelian fields), the condition $D^a = 0$ becomes (see Appendix D)
$D^a = -g\,\phi_i^* T^a_{ij}\phi_j = 0$ , (9.24)
and can always be satisfied by setting $\phi_i = 0$. [We are assuming here that the conditions
$F_i = 0$ are trivially satisfied. In principle all conditions on $D$ and $F$ have to be considered,
and this could still force us to have a non-trivial v.e.v. for $\phi_i$. This might lead to a breaking
of supersymmetry if the right-hand side of Eq. (9.24) is non-zero, and in any case leads
to a breaking of gauge symmetry, since $\phi$ manifestly transforms non-trivially under gauge
transformations.] In the presence of the $\xi$-term one gets the condition
$D = -\xi - g'\,\phi_i^* Q_i\phi_i = 0$ . (9.25)
If the product $g'Q_i > 0$ for all $i$ (taking $\xi > 0$) this has no solution, and supersymmetry is broken. The
minimum of the potential is at $\phi_i = 0$ (unbroken gauge symmetries), $D = -\xi$. If on the
other hand there is a possibility for cancellation among the terms on the right-hand side
of Eq. (9.25) the minimum breaks gauge symmetry, but not supersymmetry ($D = 0$).
Note that the condition $g'Q_i > 0$ implies that all superfields coupling to the $U(1)$
symmetry under consideration must have charges with the same sign, which makes it
impossible to cancel the $Q^3$ anomalies. A possible way out is to build a superpotential
in such a way that all fields with a certain sign of the charge are forced to have vanishing
v.e.v.s by the $F_i = 0$ conditions, so that they cannot contribute to Eq. (9.25), but
this is highly contrived. Fayet-Iliopoulos symmetry breaking is thus a priori not a very
attractive option. [Fayet-Iliopoulos symmetry breaking has however found an interesting
application in four-dimensional string theory, where there is a new mechanism to cancel
$U(1)$ anomalies.]
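The two cases discussed above can be made concrete with a minimal numerical sketch. It assumes a single $U(1)$ with D-term $D = -\xi - g'\sum_i Q_i|\phi_i|^2$ and potential $V = \frac12 D^2$, with all F-terms set to zero; the parameter values used in the usage note are arbitrary.

```python
import math


def d_term(xi, coupling, charges, fields):
    """D = -xi - g' * sum_i Q_i |phi_i|^2 for a single U(1)."""
    return -xi - coupling * sum(q * abs(phi) ** 2
                                for q, phi in zip(charges, fields))


def potential(xi, coupling, charges, fields):
    """Pure D-term potential V = D^2 / 2 (F-terms assumed to vanish)."""
    return 0.5 * d_term(xi, coupling, charges, fields) ** 2


def susy_vacuum_field(xi, coupling, negative_charge):
    """|phi| of a field with g'Q < 0 that cancels xi and gives D = 0."""
    return math.sqrt(xi / (coupling * abs(negative_charge)))
```

With charges `[1, 2]` every non-zero field value raises the potential above its value $\xi^2/2$ at the origin, so supersymmetry is broken. With charges `[1, -1]` the second field can be given the value returned by `susy_vacuum_field`, which makes $D$ vanish and breaks the gauge symmetry instead.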
In both cases only the auxiliary field can get a v.e.v, as we already know. The fermionic
state created by $Q$ out of the vacuum is then
$\sum_i \langle F_i\rangle\,\psi_i + \frac{1}{\sqrt2}\sum_a \langle D^a\rangle\,\lambda^a$ , (9.28)
assuming all fermi fields are orthonormal, i.e. the fermion propagators have residue $\delta_{ij}$ in
the space of all fermions.
Just as for Goldstone bosons the Goldstino is a fluctuation around the vacuum in the
direction of an exact symmetry. Hence it is a massless particle.
This relation holds at tree level, and can easily be derived from the action. This sum rule
plays an essential role: it guarantees the absence of quadratic divergences in the
one-loop effective potential. It can be shown that those divergences are proportional to the
right-hand side of Eq. (9.30).
The sum rule Eq. (9.30) holds in fact for the gauge and matter supermultiplets sepa-
rately. This is bad news, especially if one hopes to break supersymmetry first and then,
at a lower scale, break the weak interaction symmetries. Then the quarks, leptons and
gauge bosons should remain massless after supersymmetry breaking, but the sum-rule can
then only be satisfied with massless squarks, sleptons and gauginos. Even if we evaluate
the sum rule including the masses of the quarks, leptons and gauge bosons after weak
interaction symmetry breaking the results are disastrous. For example, the sum of the
squares of all 12 gaugino masses is predicted to be equal to $\frac32(M_W^2 + M_Z^2)$, so that the
lightest of them cannot be heavier than about 20 GeV. However, the current lower limit on
the gluino mass from the Tevatron is about 135 GeV, so that the gluinos by themselves
already violate the sum rule. The application of the sum rule to the matter sector is
somewhat more difficult, since both the Higgs mass and the Higgsino mass are unknown,
but for any reasonable guess for these masses the results are equally bad.
There are several possibilities to escape from these sum rules.
1. Non-standard matter
One may add extra gauge fields which acquire a mass when supersymmetry breaks,
so that there are extra contributions to the vector terms in the sum rule. Something
similar has to be done in the matter sector. This is arbitrary, difficult to arrange
and unattractive. To appreciate the difficulty note that the scalars giving mass to
the extra gauge bosons must at the same time give more mass to the gluinos than
to the extra gauginos.
2. Fayet-Iliopoulos breaking
In this case the mass sum rule is modified to
$\sum_S (-1)^{2S}(2S+1)M_S^2 = -2g'\langle D\rangle\,{\rm Tr}\,Q$ (9.30)
This is useful only if one has a $U(1)$ gauge group with a generator that is not
traceless. This is thus in any case not the $U(1)$ factor of the Standard Model, so
that we need to add extra gauge fields. As we have already seen, it is hard to
avoid ${\rm Tr}\,Q^3$ anomalies, and manifestly impossible to avoid gravitational anomalies
proportional to ${\rm Tr}\,Q$. In addition, the Fayet-Iliopoulos mechanism by itself requires
a $\xi$ term; this in combination with ${\rm Tr}\,Q \neq 0$ leads to quadratic divergences, which
is what we wanted to avoid in the first place by means of supersymmetry. Thus this
does not look like an attractive option either.
4. Supergravity
The sum rule was derived for global supersymmetry. If one considers instead local
supersymmetry, there is a correction proportional to the gravitino mass. The result
(in the absence of Fayet-Iliopoulos breaking) is
$\sum_S (-1)^{2S}(2S+1)M_S^2 = 2(N-1)\,m_{3/2}^2$ (9.31)
Here $N$ is the number of chiral superfields. As before, this formula is valid only at
tree level. This is usually considered the most attractive way out of the sum rule
problem.
and $\tilde q_R$ (or $\tilde l_L$, $\tilde l_R$). Here $L$ and $R$ indicate the chirality of the quark or lepton that is the
supersymmetric partner of these scalars. Furthermore there are two Higgs superfields $H_d$
and $H_u$, each containing a complex Higgs scalar, $h_d$ and $h_u$, in the representations $(1, 2, -\frac12)$
and $(1, 2, \frac12)$ respectively, and a left-handed Higgsino in the same representation.
The Lagrangian consists of the supersymmetric kinetic terms with minimal gauge
coupling, plus a superpotential for the chiral superfields. This superpotential contains
the three standard Yukawa coupling terms plus the scalar mass term $\mu H_d H_u$. There is an
exact R-parity forbidding the other possible terms (9.14).
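The soft supersymmetry breaking terms are
$\mathcal{L}_{\rm soft} = -\sum_{i,j}(m^2)_{ij}\,(\phi_i)^*\phi_j - \frac12\sum_a M_a\,\bar\lambda_a\lambda_a$
$\qquad + m^2_{ud}\,h_d h_u + g_U A_U\,\tilde Q\tilde U h_u + g_D A_D\,\tilde Q\tilde D h_d + g_E A_E\,\tilde L\tilde E h_d + {\rm c.c}$ (9.32)
Here $\phi_i$ denotes the scalar component of the superfield $\Phi_i = (H_d, H_u, Q, U, D, L, E)$; instead
of $H_i$ we usually write $h_i$, and the squark and slepton fields are often denoted as $\tilde u_L$, $\tilde u_R$,
etc. (note that $\tilde u_L$ is the upper component of the $SU(2)$ doublet $\tilde Q$, and that $\tilde u_R = \tilde U^*$).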
The parameters denoted here as $(m^2)_{ij}$ are in fact matrices in all degeneracy spaces
of Standard Model representations (the square is just intended to indicate that these
parameters have the dimension of a mass-squared). This means that they are $3\times3$
Hermitean matrices in family space for each of the Standard Model multiplets $Q$, $U$, $D$, $L$
and $E$. This would also allow a soft breaking term of the form $\tilde L h_u$ between the slepton
doublet and a Higgs (having the same structure as the $LH_u$ superpotential term), but we
will assume that R-parity remains unbroken, so that such a term does not appear.
The parameter $m^2_{ud}$ is in principle a complex number, which can be chosen real and
positive by absorbing a phase in $h_d$ (or $h_u$).
The parameters $g_U$, $g_D$ and $g_E$ are the Standard Model Yukawa coupling matrices,
which are modified by matrices $A_U$, $A_D$ and $A_E$, which have the dimension of a mass.
We have ignored all neutrino contributions in the soft breaking terms, because we do
not know exactly how many singlet neutrinos $N$ there are. If there are three, one can
add an extra term $g_N A_N\,\tilde L\tilde N h_u$ completely analogous to the up-quark couplings. In
addition there could be a supersymmetric Majorana mass matrix for the superfields $N$,
plus some extra soft breaking terms for the scalars in $N$.
The additional parameters are then counted as follows: five $3\times3$ Hermitean matrices
for the soft scalar masses of the squarks and sleptons, plus two masses for the two Higgses,
giving a total of 47; three Majorana masses for the gauginos, plus three unrestricted $3\times3$
matrices $A_x$ with 54 parameters, plus a real parameter $m^2_{ud}$. The total number of soft
parameters is then 105, ignoring any neutrino contributions.
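The counting in the preceding paragraph is simple arithmetic and can be reproduced as follows. The only inputs are that an $n\times n$ Hermitean matrix has $n^2$ real parameters and an unrestricted complex $n\times n$ matrix has $2n^2$; the function and variable names are chosen for this sketch.

```python
def hermitian_parameters(n):
    """Real parameters of an n x n Hermitean matrix."""
    return n * n


def complex_matrix_parameters(n):
    """Real parameters of an unrestricted complex n x n matrix."""
    return 2 * n * n


FAMILIES = 3

# Soft scalar masses: one Hermitean matrix each for Q, U, D, L, E,
# plus one mass each for the two Higgs scalars.
scalar_masses = 5 * hermitian_parameters(FAMILIES) + 2

# One Majorana mass per gauge group factor SU(3), SU(2), U(1).
gaugino_masses = 3

# Trilinear matrices A_U, A_D, A_E.
trilinears = 3 * complex_matrix_parameters(FAMILIES)

# The Higgs mixing parameter m_ud^2, made real by a phase rotation.
higgs_mixing = 1

total_soft = scalar_masses + gaugino_masses + trilinears + higgs_mixing
```

The intermediate values reproduce the numbers quoted in the text (47 and 54), and `total_soft` is the quoted total.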
In principle all (or most) of these parameters are determined by the supersymmetry
breaking mechanism, and for example in supergravity models one usually finds that they
are determined by a much smaller number of input parameters. Nevertheless, if one really
wants to compare the MSSM as defined so far to the data in a supersymmetry-breaking-
independent way, one should keep all these parameters.
This is a fairly hopeless task, and what one usually does is make some additional
unification assumptions. One assumes relations among these parameters at some high
scale U, and then one uses renormalization group evolution to derive the low-energy
parameters. These assumptions may include gauge coupling unification à la $SU(5)$,
universal gaugino masses ($M_a = m_{1/2}$, for all $a$), universal scalar masses ($(m^2)_{ij} = m_0^2\,\delta_{ij}$),
and universal tri-linear couplings ($A_x = m_0 A\,\mathbf{1}$, for all $x$). If one makes all these
assumptions, the set of parameters of the soft terms is reduced to $g_U$, $g_D$, $g_E$, $m_{1/2}$, $m_0^2$, $m^2_{ud}$ and
$A$. The latter four, plus the parameter $\mu$, are then the parameters which are added to
the Standard Model by supersymmetry. Note that the Standard Model parameters in the
Higgs potential, $\mu^2$ and $\lambda$, are not present in the MSSM; the parameter $\mu$ in the
superpotential term $\mu H_u H_d$ should not be confused with the one in the Standard Model potential
term $\mu^2\phi^\dagger\phi$. The complete set of parameters of the MSSM consists of the unified gauge
coupling $g$ plus $g_U$, $g_D$, $g_E$, $m_{1/2}$, $m_0^2$, $m^2_{ud}$, $A$ and $\mu$. Note that neither neutrino masses nor
strong CP violation have been taken into account.
For all the foregoing assumptions one can give more or less convincing arguments, of
two types: either they hold in a certain class of models, or violating them would in general
have undesirable phenomenological consequences (some of these will be discussed later).
The equality of the gaugino and scalar masses is not as unreasonable as it may seem at
first sight, if we imagine that supersymmetry breaking is an effect involving (super)gravity
interactions. With respect to gravity all matter is on equal footing, and hence it would
not be a total surprise if all chiral multiplets and all vector multiplets, regardless of
their gauge properties, experience the same supersymmetry breaking. Since gravity is
sensitive to differences in spin, it is also not unreasonable that gaugino masses and scalar
masses come out different. The relations among the tri-linear couplings are less easy to
understand from this point of view.
In addition one sometimes assumes the $SU(5)$-inspired relation $g_D = g_E$ or one
introduces a parameter $B$ so that $m^2_{ud} = m_0 B\mu$, which replaces $m^2_{ud}$. The dimensionless
parameters $A$ and $B$ play a similar role in the sense that both appear as factors
of terms that also appear in the superpotential, but that now appear in the potential as
soft breaking terms. Note that the term $\mu H_d H_u$ in the superpotential leads to a term
$\mu^2(h_d^2 + h_u^2)$ in the potential. The corresponding soft breaking term is $B\mu\,h_d h_u$, and appears
directly in the potential. Sometimes $B$ is eliminated as a free parameter by imposing the
relation $B = A - 1$, an assumption inspired by a simple supergravity model, which we
will discuss later.
With these five parameters, the MSSM really has some predictive power, but unfor-
tunately it can never be ruled out completely convincingly with these restrictions.
Note that there are contributions from the superpotential, as one might expect, but also
from the D-terms that yield the kinetic terms of the scalars. This potential can now be
computed as follows. We will need the solution for the auxiliary fields $D$ given in Eq.
(D.73): $D^a = -g\,\phi_i^* T^a_{ij}\phi_j$. Furthermore we need the analogous solutions for $F$:
$F_d = -\mu^* h_u^*$ ; $F_u = -\mu^* h_d^*$ (9.34)
Since $H_d$ and $H_u$ are both in the doublet representation of $SU(2)$ one has, using the
correct normalization for the generators $T^i = \frac12\sigma^i$:
$D^i = -\frac12 g_2\,h_d^\dagger\sigma^i h_d - \frac12 g_2\,h_u^\dagger\sigma^i h_u$ , (9.35)
and since they have opposite charges $\mp\frac12$ the $Y$-charge D-terms contribute:
$D = \frac12 g_1\,h_d^\dagger h_d - \frac12 g_1\,h_u^\dagger h_u$ . (9.36)
Note that the quartic terms are determined entirely in terms of gauge couplings, and that
there is no free four-scalar coupling constant associated with them. This potential has
manifestly positive quadratic terms, and hence there is no possibility to break
$SU(2)\times U(1)$. But we still have to add the soft supersymmetry breaking terms. Including them,
one gets
where $\mu_d^2 = |\mu|^2 + m^2_{h_d}$ and $\mu_u^2 = |\mu|^2 + m^2_{h_u}$. The importance of supersymmetry breaking is
that now these parameters can be negative. Note that $SU(2)$ indices are suppressed here.
In order to get an $SU(2)$-invariant, the explicit form of $h_d h_u$ must be $h_d h_u = \epsilon_{ij}h_d^i h_u^j$.
It should be emphasized that the positivity of $|\mu|^2$ is independent of perturbative
corrections. We will see later that the parameters $\mu_d^2$ and $\mu_u^2$ may be positive at some
scale, and then evolve to negative values at some lower scale. This would not be possible
if supersymmetry were unbroken. Then the superpotential, from which Eq. (9.39) is
derived, is not renormalized, and hence the form of (9.39) cannot change.
However, this is not the most general Higgs potential one can write down with two
scalar fields $h_d$ and $h_u$. The most general one has the same set of quadratic terms, but
has the following quartic terms
$\lambda_1(h_d^\dagger h_d)^2 + \lambda_2(h_u^\dagger h_u)^2 + \lambda_3\,h_d^\dagger h_d\,h_u^\dagger h_u + \lambda_4|h_d^\dagger h_u|^2$
$\qquad + \left[\lambda_5(h_d h_u)^2 + \lambda_6\,h_d^\dagger h_d\,(h_d h_u) + \lambda_7\,h_u^\dagger h_u\,(h_d h_u) + {\rm c.c}\right]$ , (9.41)
As usual $SU(2)$ invariant contractions are not explicitly indicated. The Higgs potential
in the MSSM satisfies the additional constraints $\lambda_5 = \lambda_6 = \lambda_7 = 0$, and $\lambda_1 = \lambda_2 = -\frac12\lambda_3$.
There is no symmetry one can impose to enforce such a relation. For example, the
interchange $h_d \leftrightarrow Ch_u^*$ would explain one of these relations, but because of the Yukawa
couplings this cannot be a symmetry of the MSSM. For the same reason one cannot impose
a symmetry $h_i \to -h_i$ ($i = u, d$) to get rid of the last two terms. These constraints are
in fact due to supersymmetry. For example, the term with coefficient $\lambda_5$ does not appear
because it does not come from the supersymmetric part of the action, nor is it a soft term.
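The relations among the quartic couplings can be checked numerically from the D-terms alone. The sketch below assumes the D-term potential $V_D = \frac12\sum_i (D^i)^2 + \frac12 D^2$ with $D^i$ and $D$ as in Eqs. (9.35) and (9.36); the explicit values $\lambda_1 = \lambda_2 = (g_1^2 + g_2^2)/8$, $\lambda_3 = -(g_1^2 + g_2^2)/4$ and $\lambda_4 = g_2^2/2$ are derived within this sketch from those two equations rather than quoted from the text.

```python
# Pauli matrices as nested tuples.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def inner(a, b):
    """Hermitean inner product a^dagger b of two doublets."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def bilinear(a, matrix, b):
    """a^dagger M b for doublets a, b and a 2x2 matrix M."""
    return sum(a[i].conjugate() * matrix[i][j] * b[j]
               for i in range(2) for j in range(2))


def d_term_potential(hd, hu, g1, g2):
    """V_D = (1/2) sum_i (D^i)^2 + (1/2) D^2 with D-terms as in (9.35), (9.36)."""
    d_su2 = [-0.5 * g2 * (bilinear(hd, s, hd) + bilinear(hu, s, hu)).real
             for s in SIGMA]
    d_y = 0.5 * g1 * (inner(hd, hd) - inner(hu, hu)).real
    return 0.5 * sum(d * d for d in d_su2) + 0.5 * d_y ** 2


def mssm_quartics(g1, g2):
    """Quartic couplings (lambda_1..lambda_4) implied by the D-terms."""
    total = g1 ** 2 + g2 ** 2
    return total / 8, total / 8, -total / 4, g2 ** 2 / 2


def quartic_form(hd, hu, g1, g2):
    """The first four terms of (9.41) with the couplings from mssm_quartics."""
    lam1, lam2, lam3, lam4 = mssm_quartics(g1, g2)
    nd, nu = inner(hd, hd).real, inner(hu, hu).real
    return (lam1 * nd ** 2 + lam2 * nu ** 2 + lam3 * nd * nu
            + lam4 * abs(inner(hd, hu)) ** 2)
```

Evaluating `d_term_potential` and `quartic_form` on arbitrary complex doublets gives the same number, which shows that no $\lambda_5$, $\lambda_6$ or $\lambda_7$ terms are generated and that $\lambda_4 > 0$.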
This has several consequences. First of all the special form of the potential ensures
that the Higgses hd and hu align correctly. A potential danger of a two-Higgs potential
with two Higgses that have to get a non-trivial vacuum expectation value (as is the case
here) is that the two Higgses choose an arbitrary direction with respect to each other.
Then $SU(2)\times U(1)$ does not break to $U(1)_{\rm em}$ but to nothing at all, and the photon gets
a mass. Even an extremely small misalignment would clearly be fatal. Let us assume
that the mass parameters in the potential are such that the two Higgses do indeed get a
non-trivial vacuum expectation value. Using $SU(2)\times U(1)$ rotations we may bring
$\langle h_d\rangle$ to the form
$\frac{1}{\sqrt2}\begin{pmatrix} v_d \\ 0 \end{pmatrix}$ . (9.42)
Note that $h_d$ has $U(1)_Y$ charge $Y = -\frac12$, so that with this choice the vacuum has charge
$Q_{\rm em} = T_3 + Y = 0$. The correct alignment of $h_u$ ($Y = \frac12$) is then
$\langle h_u\rangle = \frac{1}{\sqrt2}\begin{pmatrix} 0 \\ v_u \end{pmatrix}$ . (9.43)
However, let us assume that $h_u$ is misaligned by an arbitrary $U(2)$ rotation. This can be
parametrized by choosing
$\langle h_u\rangle = \frac{1}{\sqrt2}\begin{pmatrix} v_u e^{i\phi}\sin\theta \\ v_u e^{i\chi}\cos\theta \end{pmatrix}$ . (9.44)
The terms in the potential that depend on the orientation are
$h_d^\dagger h_u = \frac12 v_d v_u\,e^{i\phi}\sin\theta$
$h_d h_u = \epsilon_{ij}h_d^i h_u^j = \frac12 v_d v_u\,e^{i\chi}\cos\theta$
Substituting this into the Higgs potential (9.40), but with the more general quartic
interactions shown in (9.41), we get
If $\lambda_5 = \lambda_6 = \lambda_7 = 0$ and $\lambda_4 > 0$, as is the case in Eq. (9.40), the quartic terms are
minimized for $\sin\theta = 0$, and the quadratic ones for $\cos\theta = \cos\chi = 1$ (if $m^2_{ud}v_dv_u > 0$)
or $\cos\theta = -\cos\chi = 1$ (if $m^2_{ud}v_dv_u < 0$). No matter how we choose the signs, the
solutions are always $\theta = 0$ mod $\pi$ and $\chi = 0$ mod $\pi$, so that $h_d$ and $h_u$ are indeed aligned
properly. For the general potential the minimization is more complicated. For example, if
we change the sign of $\lambda_4$ and keep $\lambda_5 = \lambda_6 = \lambda_7 = 0$, the minimum occurs for a non-trivial
value of $\theta$ due to competition between the quadratic and quartic terms. In general there
are regions in parameter space where the true minimum respects $U(1)_{\rm em}$, and hence the
alignment is not something unnatural even for the full potential. Fortunately the extra
constraints due to supersymmetry put us precisely in a region of parameter space where
the alignment is automatic.
and hence we see that there is a positivity condition, to ensure that the potential is
bounded from below in the limit $|h_d| \to \infty$:
$\mu_d^2 + \mu_u^2 \geq 2|m^2_{ud}|$ (9.46)
The condition for the occurrence of symmetry breaking is that the mass matrix of hd and
hu has a negative eigenvalue. The existence of a single negative eigenvalue is equivalent
to the requirement that the determinant be negative:
Of course this is not relevant if both eigenvalues are negative, but usually one is interested
in a situation where the determinant is positive at high energies, and changes sign when
evolved to lower energies.
If we make the unification assumption $m_u^2 = m_d^2$ (universal soft scalar masses), it
follows that at the unification scale U, $\mu_d^2 = \mu_u^2$. Then conditions (9.46) and (9.47)
cannot both be satisfied, but only just: choosing $m^2_{ud} = \mu_d^2 = \mu_u^2$ saturates both inequalities. The potential
is flat along the lines $h_u = e^{i\phi}Ch_d^*$, and there is no symmetry breaking, but all vacua
along these lines are exactly degenerate.
Now what happens if we evolve these parameters to lower mass scales? Note that
the Higgs potential looks symmetric in hd and hu , but the Yukawa couplings are not.
Most importantly, hu couples to the top quark, and hd does not. Since the top quark is
very heavy, it has a large Yukawa coupling, and this coupling turns out to dominate the
evolution. Some of the contributing diagrams are
[Figure: one-loop diagrams with external $h_u$ lines.]
In the supersymmetric limit they would exactly cancel, so that $m_u^2$ and $m_d^2$ are
renormalized only by wave function renormalizations. Both Higgs masses are equal to $|\mu|^2$ in
the supersymmetric limit, since they both come from the superpotential term $\mu H_d H_u$.
Radiative corrections may change the value of $\mu$, but not the form of the superpotential,
nor the resulting equality $m_u = m_d$. But once supersymmetry is broken the scalar in the
loop gets a mass, while the fermion remains massless. This suppresses the positive scalar
contribution with respect to the negative fermion contribution. Hence the net effect of
this contribution is to drive the scalar mass of the external lines to lower values. It is then
possible that even if condition (9.47) is not satisfied at the higher scale, it is satisfied at
a lower one.
There is a competing effect due to non-cancellation of the gauge boson and gaugino
contributions. In this case a fermionic contribution, namely that of the gaugino, is sup-
pressed, and hence in this case the effect is precisely opposite. If the Yukawa coupling is
sufficiently large the first effect will be larger than the second, and the mass will indeed
decrease with decreasing energy.
But there is a further complication. The lines in the diagrams shown above can be
interchanged to get corrections to the masses of $\tilde t_L$ due to $\tilde t_R$ and $h_u$, and corrections
to $\tilde t_R$ due to $\tilde t_L$ and $h_u$. Hence there is an effect not only for $m^2_{h_u}$, but also for $m_{\tilde t_L}$ and
$m_{\tilde t_R}$. Now we would like the scalar $h_u$ to get a v.e.v. and certainly not the stop squarks,
since this would break color. There is an intuitive way to see which effect will win,
namely by considering the fermion loop (to which all diagrams are proportional in the
supersymmetric limit). In the correction to $m^2_{h_u}$ there is a color triplet loop (formed by $t_L$
and $t_R$), which gives a factor 3; in the correction to $m_{\tilde t_R}$ there is an $SU(2)$ doublet loop
(the higgsino and $t_L$), which gives a factor 2, and finally the correction to $m_{\tilde t_L}$ involves
a color and $SU(2)$ singlet loop (the higgsino and $t_R$), so that there is no enhancement.
Hence $h_u$ receives the largest contribution, and if there is any mass$^2$ that changes sign it
will be that of $h_u$.
The net result of a quantitative calculation is the following set of renormalization
group equations (for $t = \log Q$, with $Q$ the energy scale)
$$\frac{dm_i^2}{dt} = \frac{1}{8\pi^2}\left[ -\sum_{a=1,2,3} c_a(i)\, g_a^2 M_a^2 + c_i\, g_t^2 \left( m_{t_L}^2 + m_{t_R}^2 + m_2^2 + A_t^2 \right) \right] \tag{9.48}$$
where only the top quark and its superpartners are taken into account. Here $M_a$ and
$A_t$ are parameters appearing in Eq. (9.32), $g_t$ is the top Yukawa coupling, and $c_a(i)$
and $c_i$ are sets of numerical coefficients. The most important point is that $c_i = 3$ for
$h_u$, $c_i = 2$ for $t_R$ and $c_i = 1$ for $t_L$. If the second term dominates, and the masses
of all scalars are equal at some scale, then all masses will decrease with decreasing $t$. But
due to the factor $c_i = 3$ the mass of $h_u$ will go through zero before any of the others.
When that happens (actually already earlier, namely when condition (9.47) is satisfied)
$SU(2) \times U(1)$ breaks, and many particles acquire a mass and decouple from the
renormalization group equations. Hence below this scale these equations show a different
behavior, and in particular it is possible that none of the other masses goes through zero.
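The qualitative argument can be illustrated with a deliberately crude numerical sketch. The
Python snippet below integrates only the Yukawa term of Eq. (9.48) for the three masses. It
assumes (these simplifications are not made in the text) that the gauge and gaugino term is
dropped, that $A_t = 0$, that $g_t$ does not run, and that all three squared masses start out
equal at $t = 0$. The function names and the value $g_t = 1$ are illustrative choices only.

```python
import math

# Coefficients c_i of the Yukawa term in Eq. (9.48), as quoted in the text.
C = {"h_u": 3.0, "t_R": 2.0, "t_L": 1.0}


def yukawa_rhs(m2, g_t):
    """dm_i^2/dt keeping only the Yukawa term of Eq. (9.48).

    Toy assumptions (not from the text): the gauge/gaugino term is dropped,
    A_t = 0, and g_t is treated as a constant.
    """
    k = g_t ** 2 / (8.0 * math.pi ** 2)
    total = m2["h_u"] + m2["t_R"] + m2["t_L"]
    return {name: C[name] * k * total for name in C}


def run_down(m0_sq, g_t, t_end, steps=2000):
    """Integrate from t = 0 to t_end (<= 0) with a fixed-step RK4 scheme."""
    m2 = {name: m0_sq for name in C}
    h = t_end / steps

    def shifted(base, slope, factor):
        return {n: base[n] + factor * slope[n] for n in base}

    for _ in range(steps):
        k1 = yukawa_rhs(m2, g_t)
        k2 = yukawa_rhs(shifted(m2, k1, 0.5 * h), g_t)
        k3 = yukawa_rhs(shifted(m2, k2, 0.5 * h), g_t)
        k4 = yukawa_rhs(shifted(m2, k3, h), g_t)
        m2 = {n: m2[n] + h * (k1[n] + 2 * k2[n] + 2 * k3[n] + k4[n]) / 6.0
              for n in m2}
    return m2


def closed_form(m0_sq, g_t, t):
    """Exact solution of the same toy system, used as a cross-check."""
    k = g_t ** 2 / (8.0 * math.pi ** 2)
    growth = math.exp(6.0 * k * t)  # the sum of the three m^2 scales like this
    return {n: m0_sq * (1.0 + 0.5 * C[n] * (growth - 1.0)) for n in C}


if __name__ == "__main__":
    # Squared masses in units of the common starting value, for decreasing t.
    for t in (0.0, -5.0, -10.0, -15.0, -20.0):
        m2 = run_down(1.0, 1.0, t)
        print(t, {n: round(v, 4) for n, v in m2.items()})
```

In this toy system the sum of the three squared masses obeys a closed equation, which gives
$m_i^2(t) = m_0^2\,[1 + \tfrac12 c_i (e^{6 g_t^2 t/8\pi^2} - 1)]$. Only the mass with $c_i = 3$
changes sign as $t$ decreases; the one with $c_i = 2$ approaches zero from above and the one
with $c_i = 1$ stays positive. The full coupled equations are of course more involved.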
All of this is hand-waving, and a detailed study of the full set of coupled equations
is required to show that indeed this mechanism works. This is quite complicated, even
under the drastic simplifications of the unification conditions on the masses. A more
detailed analysis does appear to show that indeed regions in parameter space exist where
this mechanism could work.
being $h^0$). The field $A^0$ is precisely the axion discussed in chapter 5. The parameter
$m_{ud}$ breaks the PQ-symmetry discussed there, and hence the axion mass will be proportional
to it. In a one-Higgs system there is of course no particle like $A^0$, and only one rescaling,
corresponding to the Standard Model Higgs boson.
The Higgs potential is almost symmetric under the CP-symmetry $h_i \to h_i^*$. The only
possible violation could be the term proportional to $m_{ud}^2$, if that parameter is not real
(in general it could be any complex number, although the notation might suggest otherwise).
However, we can always make $m_{ud}^2$ real and positive by a relative phase rotation of $h_d$
and $h_u$. We could just as well have called this symmetry C, and assigned positive parity to
both $h_d$ and $h_u$, since parity is manifestly a symmetry of the Higgs action (provided both
Higgses are assigned the same parity). However, both C and P are badly broken when we
couple the Higgses to the fermions, whereas CP is a symmetry to a quite good approximation.
It follows then that $A^0$ is CP-odd and $h^0$ and $H^0$ are CP-even. Since the Higgs potential
with real $m_{ud}^2$ is CP-invariant the mass matrix will not mix CP-odd and CP-even states,
and furthermore any radiatively induced mixing is proportional to the CP-violating terms
in the full action, and hence probably quite small (unless there are large CP-violating
terms that do not manifest themselves in our present experiments).
From the condition that the Higgs v.e.v.s are a local minimum of the action one derives
rather easily
$$\mu_d^2 + \mu_u^2 = m_{ud}^2\, \frac{v_d^2 + v_u^2}{v_d v_u} \tag{9.49}$$
To see this, act with the differential operator $(v_d \partial_{v_u} + v_u \partial_{v_d})$ on
the potential $V$ shown in Eq. (9.40), with the vacuum expectation values substituted. In a
local minimum, $(v_d \partial_{v_u} + v_u \partial_{v_d})V$ must be zero. The potential has the
form (note that $h_d^\dagger h_u = 0$ because of the vacuum alignment):
$$V(v_d, v_u) = \mu_d^2 v_d^2 + \mu_u^2 v_u^2 - 2 m_{ud}^2 v_d v_u + \tfrac18 (g_1^2 + g_2^2)(v_d^2 - v_u^2)^2$$
The differential operator annihilates the quartic terms, and requiring that it vanishes on
the quadratic terms yields Eq. (9.49).
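For completeness, the two steps of this argument can be written out explicitly. The block
below is a sketch that assumes the potential has the form $V = \mu_d^2 v_d^2 + \mu_u^2 v_u^2
- 2 m_{ud}^2 v_d v_u + \tfrac18 (g_1^2 + g_2^2)(v_d^2 - v_u^2)^2$; the shorthand $\mathcal{D}$
for the differential operator is introduced here only for convenience.

```latex
\begin{align*}
\mathcal{D} &\equiv v_d \frac{\partial}{\partial v_u} + v_u \frac{\partial}{\partial v_d}, \\
\mathcal{D}\,(v_d^2 - v_u^2)^2
  &= 2\,(v_d^2 - v_u^2)\,\bigl(-2 v_d v_u + 2 v_u v_d\bigr) = 0, \\
\mathcal{D}\,\bigl(\mu_d^2 v_d^2 + \mu_u^2 v_u^2 - 2 m_{ud}^2 v_d v_u\bigr)
  &= 2 v_d v_u\,(\mu_d^2 + \mu_u^2) - 2 m_{ud}^2\,(v_d^2 + v_u^2).
\end{align*}
```

Setting the last line to zero and dividing by $2 v_d v_u$ reproduces Eq. (9.49).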
If $m_{ud}^2$ has been chosen positive one may assume without loss of generality that both
$v_d$ and $v_u$ are positive. It is customary to define
$$\tan\beta \equiv \frac{v_u}{v_d}. \tag{9.50}$$
Furthermore one has
$$M_W^2 = \tfrac14 g_2^2 (v_d^2 + v_u^2), \tag{9.51}$$
which fixes the value of $(v_d^2 + v_u^2)$ to the usual value $(246\ \mathrm{GeV})^2$.
The quartic terms in the potential are completely independent of the field $A^0$. Its
mass is thus independent of $g_1$ and $g_2$, and one finds
$$m_{A^0}^2 = \frac{m_{ud}^2}{\cos\beta\,\sin\beta} = m_{ud}^2\, \frac{v_d^2 + v_u^2}{v_d v_u}. \tag{9.52}$$
Note that this is not true for the full Higgs potential (9.41), since one cannot simultaneously remove
the phases in 3 , 6 and 7 . Indeed, the two-Higgs model has been proposed as a model for CP-violation.
This relation is often used to replace the parameter $m_{ud}^2$ by the directly measurable
quantity $m_{A^0}^2$.
The two CP-even states mix with each other. Their mass matrix does depend on the
quartic terms in the Higgs potential, but only on the terms proportional to
$g_1^2 + g_2^2 = 4M_Z^2/(v_d^2 + v_u^2)$. The mass matrix can in fact be expressed completely in
terms of $M_Z$, $m_{A^0}$ and $\beta$. The eigenvalues are
$$m^2_{H^0,h^0} = \tfrac12 \left[ m_{A^0}^2 + M_Z^2 \pm \sqrt{(m_{A^0}^2 + M_Z^2)^2 - 4 m_{A^0}^2 M_Z^2 \cos^2 2\beta} \right] \tag{9.53}$$
From this relation we find immediately that the lightest particle, $h^0$, has a mass that is
less than that of the Z-boson!
This prediction is a consequence of the fact that the quartic terms in the tree level
potential are completely determined by the gauge couplings, whereas in the Standard
Model there is a free parameter $\lambda$.
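The tree-level bound is easy to check numerically. The following Python sketch evaluates
Eq. (9.53), together with the charged Higgs mass relation $m_{H^\pm}^2 = M_W^2 + m_{A^0}^2$.
The numerical inputs $M_Z = 91.19$ GeV and $M_W = 80.4$ GeV and the function names are
assumptions made for this illustration, and radiative corrections are ignored.

```python
import math

M_Z = 91.19  # GeV; assumed input value
M_W = 80.4   # GeV; assumed input value


def cp_even_masses(m_A, tan_beta, m_Z=M_Z):
    """Tree-level (m_h0, m_H0) in GeV from Eq. (9.53)."""
    cos2b = math.cos(2.0 * math.atan(tan_beta))
    s = m_A ** 2 + m_Z ** 2
    disc = s ** 2 - 4.0 * m_A ** 2 * m_Z ** 2 * cos2b ** 2
    # max() guards against tiny negative values from floating-point rounding.
    root = math.sqrt(max(0.0, disc))
    m_h = math.sqrt(max(0.0, 0.5 * (s - root)))
    m_H = math.sqrt(0.5 * (s + root))
    return m_h, m_H


def charged_higgs_mass(m_A, m_W=M_W):
    """Tree-level charged Higgs mass in GeV: m^2 = M_W^2 + m_A0^2."""
    return math.sqrt(m_W ** 2 + m_A ** 2)


if __name__ == "__main__":
    for m_A in (100.0, 300.0, 1000.0):
        for tan_beta in (2.0, 10.0, 50.0):
            m_h, m_H = cp_even_masses(m_A, tan_beta)
            print(f"m_A={m_A:7.1f}  tan(beta)={tan_beta:5.1f}  "
                  f"m_h={m_h:7.2f}  m_H={m_H:8.2f}  "
                  f"m_H+-={charged_higgs_mass(m_A):8.2f}")
```

Since the product of the two eigenvalues is $m_{A^0}^2 M_Z^2 \cos^2 2\beta$ and the larger one
is at least $\max(m_{A^0}^2, M_Z^2)$, the light mass satisfies
$m_{h^0} \le \min(m_{A^0}, M_Z)\,|\cos 2\beta|$ at tree level.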
The remaining four degrees of freedom of the Higgs system are charged. Two of them
are absorbed by $W^\pm$, whereas the remaining two form a charge conjugate pair $H^\pm$ whose
squared masses are easily found to be equal to $M_W^2 + m_{A^0}^2$.
where $m_0$ is the universal squark mass. Since the top quark mass is large, these corrections
can be considerable, and for most of the parameter space they push $m_{h^0}$ above $M_Z$. Just
to give an example: for $m_t = 175$ GeV and $m_0 = 100$ GeV one finds that the absolute
maximum for the light Higgs mass is about 95 GeV. If we choose $m_0$ equal to 1 TeV,
as it might well be, the maximum increases to 130 GeV; for $m_0 = 5$ TeV (considered a
very high value) one gets 150 GeV. Here the maximum value is obtained by maximizing
with respect to all the other parameters on which $m_{h^0}$ depends, in particular $m_{A^0}$ and
$\tan\beta$. These maxima were discussed for about two decades, until a Higgs scalar was
finally discovered in 2012, with a mass of about 126 GeV, close to the upper limit for
m0 = 1 TeV. However, the complete story is far more complicated, and in addition
one has to take into account that no evidence for supersymmetry has been found. The
consensus is that the 126 GeV Higgs mass puts supersymmetry under stress, but does
not rule it out. Indeed, the fact that the supersymmetry breaking scale is higher than
expected reduces the tension.
where $M_1$ is the gaugino mass in the U(1) factor of $SU(3) \times SU(2) \times U(1)$, and
$M_2$ the gaugino mass in the SU(2) sector.
Perhaps the most noteworthy feature of this matrix is that its determinant is proportional
to $\mu$, so that there is a zero eigenvalue if $\mu = 0$.
This is the correction to one of the diagonal entries of the $2 \times 2$ mass matrix, namely the $h_u$-$h_u$ entry;
since $h_d$ does not couple to the top quark, the other entries receive negligible corrections.
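[Figure: box diagrams for $K^0$-$\bar K^0$ mixing, with external $d$, $\bar s$ and $s$, $\bar d$ quark lines, two internal $W$ lines and internal $u, c, t$ quarks.]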
These diagrams cancel if all intermediate quarks u, c and t are degenerate in mass.
Since they are not, there is a non-vanishing mixing matrix element between the strong
interaction eigenstates. In general this amplitude violates CP, but in the excellent
approximation that CP is conserved the new mass eigenstates are the CP-eigenstates
$K_L^0 = \frac{1}{\sqrt2}(K^0 + \bar K^0)$ and $K_S^0 = \frac{i}{\sqrt2}(K^0 - \bar K^0)$. The
subscripts stand for Long and Short, referring to the rather different lifetimes of these
particles, which are due to the fact that the final states allowed by CP are different.
In any supersymmetrized version of the Standard Model there are additional diagrams.
The following ones are sensitive to the up squark masses
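[Figure: box diagrams with external $d$, $\bar s$ and $s$, $\bar d$ quark lines and internal up-type squarks $\tilde u, \tilde c, \tilde t$.]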
be 3 TeV or more, which seem rather large in comparison to the weak scale. This is an
argument in favor of the unification assumption that all squarks masses are equal at the
unification scale. Then renormalization group corrections will still generate differences,
but it is possible that those differences are small enough.
A similar limit on the down squarks is obtained from the same diagrams with the
Winos replaced by gluinos.
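[Figure: box diagrams with external $d$, $\bar s$ and $s$, $\bar d$ quark lines and internal down-type squarks $\tilde s, \tilde d, \tilde b$.]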
Here again the result depends on unknown mixing angles. If the down quarks and down
squarks are diagonalized by the same matrix, the gluino-quark-squark coupling is flavor
diagonal, and the diagram vanishes. Note that precisely this assumption was used above
for arguing that the Wino-quark-squark coupling might be identical to the CKM matrix.
But it is not plausible that an exact cancellation occurs for the Wino diagrams discussed
above, since there is no reason why the down quark and up squark mass matrices should be
related. The bound for the down squarks is more stringent since the relevant coupling is
$\alpha_s$ instead of $\alpha_w$, but unfortunately rather model-dependent due to the unknown angles.
Other constraints come from the flavor changing neutral current processes e,
KL and others. A diagram contributing to e is
Note that it is not true that strangeness changing neutral currents only annoy tech-
nicolor model builders. The main difference is that in supersymmetric models one has
considerably more freedom, especially as long as one has not settled on a supersymmetry
breaking mechanism.
into quarks and gluons plus the LSP, possibly in several steps. The details of their decays
are model dependent, but the fact that among the decay products there is an LSP follows
from R-parity. If it is neutral, the LSP cannot be seen directly, but it carries transverse
momentum. The signal to look for is thus a multi-jet event with a large amount of
missing transverse momentum that cannot be attributed to neutrinos. Such signals have
been looked for at Fermilab, and since nothing was seen one obtains a limit, which is
about 100 GeV for squarks and 200 GeV for gluinos. These limits are based on several
assumptions, but I will not elaborate on that.
Charged sleptons are much harder to see in hadron colliders due to the small produc-
tion cross section. However, the limits from Z decay mentioned above apply, and indeed
the present limits are about 45 GeV. Unfortunately these limits are also dependent on
some assumptions. The same limits apply to charged Wino-Higgsino mixtures.
which shows that Hi play the same role as before. Then there are interactions among
Higgs bosons
1 Hd Hu + 2 3 , (9.58)
as well as mass terms for the Higgs bosons
M 2 + Hd Hu , (9.59)
and finally there are undesirable terms that lead to direct B-L violation
Yi Hd , Xi Yi Yk , Y1 Hd (9.60)
These are omitted exactly as before. Even though SU (5) unification also leads to proton
decay, the extra terms due to supersymmetry would give a proton decay rate that is
certainly much too large.
The unification of coupling constants works differently because there is additional mat-
ter: squarks, sleptons, gauginos and higgsinos and an additional Higgs. In the discussion
of SU(5) unification we have seen that matter in complete SU(5) multiplets does not
affect the fact that coupling constants unify, nor the mass scale at which they unify. Only
the unified coupling constant is affected, and becomes larger. The squarks and sleptons
are in complete SU(5) multiplets, but the gauginos and the Higgses are not. For the gauginos
this is a direct consequence of SU(5) breaking. It should be noted that SU(5) breaks at
a scale that is supposed to be above the supersymmetry breaking scale. Hence we may
assume that supersymmetry breaking may be ignored at the SU(5) breaking scale. Then
the components in the 24 that are supersymmetric partners of the X and Y vector bosons
get a mass of order the unification scale, whereas the other gauginos (the partners of the
$SU(3) \times SU(2) \times U(1)$ gauge bosons) remain massless. The fact that the fields $H_i$
and $\bar H_i$ contribute as incomplete SU(5) multiplets (i.e. that their triplet components get
a mass of order the unification scale) is on the other hand not natural. Nevertheless,
we will have to assume that this happens, because if these triplets were light they would
generate proton decay at much too large rates.
so that we get
$$b_0(SU(3)) = -\frac{18}{96\pi^2} \tag{9.61}$$
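For the weak interactions the computation is
\begin{align*}
b_0(SU(2)) = \frac{1}{96\pi^2}\Big[ & -11 \times 2 \times 2 && \text{(gauge bosons + ghosts)} \\
 & + 3 \times 12 && \text{(12 weak doublet supermultiplets)} \\
 & + 2 \times 4 && \text{(gauginos: Majorana fermions with } I_2 = 4\text{)} \\
 & + 3 \times 2 \Big] && \text{(Higgses: 2 supermultiplets)}
\end{align*}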
leading to a positive (not asymptotically free) result
$$b_0(SU(2)) = \frac{6}{96\pi^2} \tag{9.62}$$
Finally, for the abelian factor of the Standard Model the non-supersymmetric result was
$40 + 1$ (fermions + Higgs), and this now becomes $\tfrac32 \times 40 + 3 \times 2 \times 1 = 66$:
$$b_0(U(1)) = \frac{66}{96\pi^2} \tag{9.63}$$
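The bookkeeping behind these three coefficients can be reproduced in a few lines. The Python
sketch below tallies the numerators in units of $1/(96\pi^2)$, using the weights that appear in
the SU(2) computation above. The SU(3) input (12 color-triplet supermultiplets, no Higgs
triplets, and $I_2(\text{adjoint}) = 2N$ in the normalization where SU(2) gives 4) is an
assumption made here, since that tally is not spelled out in this section; the function names
are likewise illustrative.

```python
# All numbers are in units of 1/(96 pi^2), following the tally in the text.
GAUGE_PER_N = -11 * 2      # gauge bosons + ghosts of SU(N): -11 * 2 * N
CHIRAL_FUNDAMENTAL = 3     # one chiral supermultiplet in the fundamental
GAUGINO_FACTOR = 2         # Majorana gauginos: 2 * I_2(adjoint)


def b0_su_n(n, n_matter, n_higgs):
    """Numerator of b_0 for SU(N).

    Assumption: I_2(adjoint) = 2N, which reproduces I_2 = 4 for SU(2).
    """
    adjoint_index = 2 * n
    return (GAUGE_PER_N * n
            + CHIRAL_FUNDAMENTAL * n_matter
            + GAUGINO_FACTOR * adjoint_index
            + CHIRAL_FUNDAMENTAL * n_higgs)


def b0_u1():
    """Numerator of b_0 for U(1): 3/2 * 40 (matter) + 3 * 2 * 1 (two Higgs multiplets)."""
    return 1.5 * 40 + 3 * 2 * 1


b0_su2 = b0_su_n(2, n_matter=12, n_higgs=2)   # doublets: 12 matter + 2 Higgs
b0_su3 = b0_su_n(3, n_matter=12, n_higgs=0)   # assumed: 12 color triplets

if __name__ == "__main__":
    print("SU(3):", b0_su3, " SU(2):", b0_su2, " U(1):", b0_u1())
```

The sign pattern (negative for SU(3), positive for SU(2) and U(1)) is what determines which
couplings grow and which shrink with energy.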
9.21.3 Proton Decay
In non-supersymmetric SU(5) all diagrams leading to proton decay must involve the
bosons X and Y or the triplet component of the 5 Higgs. In any case a boson is exchanged,
whose propagator contributes $1/M_U^2$ to the amplitude. In terms of an effective Lagrangian,
any term contributing to proton decay must thus have $M_U^{-2}$ in front of it, i.e. it must be
an operator of dimension six (indeed, the relevant operators are four-fermi interactions). In
supersymmetry one can also exchange fermions with mass $M_U$ to get B-violation, namely
the partners of the aforementioned bosons. The amplitude for these processes is only
suppressed by one power of $M_U$, and the corresponding operators are of dimension 5.
Examples can easily be constructed by taking a diagram of non-supersymmetric SU(5)
and replacing two external fermions, as well as the exchanged vector boson, by their
superpartners. Such a dimension 5 operator is built out of two fermions and two scalars.
In such a process, two ordinary quarks and/or leptons are transformed into two squarks
and/or sleptons. These are much too heavy to form a valid decay product for the proton.
Hence a second step is required to get rid of the supersymmetric particles, this time
involving the exchange of a gaugino or a higgsino. Then the complete decay process is of
higher order in the coupling constant and suppressed by powers of masses of Susy-partners.
One can analyze systematically which dimension 5 operators are possible. If we require
$B-L$ or R-parity to forbid the undesirable dimension 4 operators discussed earlier, only
two combinations of superfields are possible, namely the F-terms $QQQL$ and $UUDE$,
where we use $SU(3) \times SU(2) \times U(1)$ superfield notation. In the second expression the
color anti-symmetry enforces a flavor anti-symmetry for the two u fields. Hence if one of
them is a u-quark, the other is necessarily charm or top, into which the proton cannot
decay. Hence this operator can be ignored. The first operator involves the doublet fields
$Q$. One cannot take these all within the first family, since the combinations uud or ddu
cannot be made anti-symmetric in the color labels (here u and d denote the upper and
lower components of the superfield $Q$, i.e. they are superfields each containing a quark
and a squark). However, the combination uds is allowed. The conclusion is in general
that proton decay through dimension 5 operators must involve particles from at least two
families. The most important decay mode would then be $p \to K^+ \bar\nu$ (which conserves
$B-L$), instead of the processes $p \to \pi^+ \bar\nu_e$ or $p \to \pi^0 e^+$ expected to be
important in the non-supersymmetric case.
9.22 Conclusions
In comparison to GUTs and technicolor the case for supersymmetry is a priori very weak.
The only problem it promises to solve is the stabilization of large mass hierarchies. Unlike
technicolor, it does not explain why there is such a hierarchy of scales. It requires a lot of
courage to conclude so much on the basis of so little information. It requires even more
courage to state that the minimal form of this idea should be the correct model to test,
although doing anything else is essentially impossible.
Nevertheless, whether one likes it or not, nothing has been found so far that rules
out the MSSM, unlike minimal SU(5) or technicolor. The observed coupling constant
convergence (skeptical as one may and should be about it) may even be viewed as a
first positive hint.
Upon closer examination, the model has some interesting features that were not put in,
but come out anyway: for example the fact that all unobserved superpartners have
$SU(3) \times SU(2) \times U(1)$-allowed masses, and the fact that the mass of one of the Higgs
bosons runs to zero faster than all other scalar masses.
Apart from the technical hierarchy problem, supersymmetry as such solves none of
the remaining Standard Model problems: family structure, family replication, quark and
lepton mass hierarchies etc. all have to be put in by hand. One ends up with a theory with
considerably more parameters than the Standard Model, although the situation improves
if one combines supersymmetry with the idea of Grand Unification.
Is the MSSM falsifiable? Unfortunately the superpartner masses can be pushed to large
values without any real harm. The scale that determines these masses also determines
MW , but the Higgs potential has enough freedom to get the correct value of MW even
with very large superpartner masses. Although this is unnatural, unfortunately it is
impossible to obtain an upper bound from such a principle. Hence it will not be possible
to rule out supersymmetry by not finding, for example, sleptons, although it may be
possible to diminish the number of believers.
It is clear that the idea of low-energy supersymmetry will still be with us for many
years. If it turns out to be realized in nature this would be an incredible theoretical
achievement, given the tiny amount of experimental evidence on which the case is presently
based.
9.23 References
Most of the results presented here were based on the Physics Reports by H. Nilles [22]
and lecture notes by H. Haber [16]. The superfield formalism is explained in the book by
Bagger and Wess [2]. Other sources are [36], [37], [23], [9], [18] and [17], and references
cited in these papers.
10 Supergravity
As we have seen before, spontaneously broken global supersymmetry does not easily yield
an acceptable theory. The supertrace formula for the squared masses does not allow us to
get reasonable multiplet splittings, except perhaps if one uses Fayet-Iliopoulos breaking,
which however is unappealing for other reasons. In addition one gets in the spectrum a
massless fermion, the Goldstino, which has not been seen. Furthermore it is clear that
ultimately we would like to couple supersymmetry to gravity. One would expect that
in order to keep exact supersymmetry the graviton has to belong to a supermultiplet
itself. Indeed, supersymmetry requires the existence of $N$ superpartners of spin $\frac32$ called
gravitinos if there are $N$ supersymmetries. Particles of spin larger than $\frac12$ can only exist
in an interacting field theory if they are gauge particles of some symmetry. For spin 1 this
symmetry is local invariance with respect to some gauge group, for spin 2 it is general
coordinate invariance, and for spin $\frac32$ the gauge symmetry turns out to be supergravity,
or, what is the same, local supersymmetry. Thus the combination of supersymmetry and
gravity inevitably leads us to supergravity.
There is an alternative. We could simply view supersymmetry as a coincidence without
deeper meaning. The world would then be described by a supersymmetric theory with
soft explicit supersymmetry breaking terms. At low energies this theory would not look
supersymmetric at all, like the world we observe, and at higher energies it would look more
supersymmetric, but never exactly supersymmetric. Such a theory would still have all
the miraculous cancellations of quadratic (and some logarithmic) divergences we expect
from supersymmetry. Once we couple it to gravity those properties might be lost, but
coupling a theory to gravity leads to problems anyway. However, if this would turn out
to be the solution nature has chosen it would be extremely disappointing.
Most people believe that if supersymmetry has something to do with the Standard
Model Higgs mechanism, it must be a local symmetry, i.e. supergravity. The phenomenol-
ogy of supergravity is still in its infancy. For global supersymmetry there is at least a
minimal standard model, although perhaps the restrictions imposed on the parameter
space may not convince everyone. Some of the problems of the MSSM are hoped or ex-
pected to be solved when supergravity is added, but at the moment there only exists a
rather large collection of interesting ideas, each with obvious shortcomings.
The right-hand side is a space-time dependent translation (x) P , in other words a
general coordinate transformation. Once we have general coordinate transformations it
is inevitable that we have to couple the theory to gravity.
The minimal particle content of such a theory is a graviton and its supersymmetric
partner the gravitino. Since the graviton has spin 2, it is not a surprise that the gravitino
has spin $\frac32$. Since the graviton has two physical components (helicity $\pm 2$), the
gravitino must have two as well. This is consistent with its mass being zero; a massive
spin-$\frac32$ particle must necessarily have all four spin states
$-\frac32, -\frac12, \frac12, \frac32$, since we can go to its rest frame and apply an SO(3)
rotation to it.
Since we do not observe massless gravitinos the gravitino must acquire a mass. It can
do so by the super-analog of the Higgs mechanism: it absorbs two degrees of freedom by
eating the Goldstino appearing when supersymmetry breaks. In this way we remove the
Goldstino from the spectrum.
em
= 1
2
m
= 1
( + 21 mn mn )
1 D
Here $e^m_\mu$ is a vierbein (also called tetrad), which has one space-time index $\mu$ and a
local Lorentz tangent space index $m$, $\omega_\mu^{mn}$ is the spin connection and $\kappa$ is
the gravitational coupling constant, related to the Planck mass by
$$\kappa = \frac{\sqrt{8\pi}}{M_{\rm Planck}}. \tag{10.2}$$
The vierbein is related to the metric by
$$g_{\mu\nu} = e^m_\mu e^n_\nu \eta_{mn}, \tag{10.3}$$
with $\eta_{mn}$ the flat tangent-space (Minkowski) metric. It can be used to replace space-time
indices by local tangent space indices. For example, $\gamma_\mu = e^m_\mu \gamma_m$. Furthermore
$\sigma^{mn} \equiv \frac14 [\gamma^m, \gamma^n]$ is an SO(3,1) generator. The covariant
derivative is thus very similar to a gauge covariant derivative, with $\omega_\mu^{mn}$
interpreted as an SO(3,1) gauge potential. However, the analogy is not perfect.
One difference is that, unlike a gauge field $A_\mu$, the spin connection is not an independent
physical degree of freedom (indeed, that would violate supersymmetry), but can be
eliminated in terms of the vierbein. Finally, $g$ is the absolute value of the determinant of
the metric.
Now we must couple the graviton and the gravitino to chiral multiplets and vector
multiplets. Since there is no need anymore for the theory to be renormalizable, we begin
by dropping that requirement. The most general supersymmetric action with at most two
derivatives is
$$\mathcal{L} = \int d^4\theta\, \Phi(\bar S e^{2gV}, S) + \mathrm{Re} \int d^2\theta\, f_{ab}(S) W^a W^b + \int d^2\theta\, g(S) \tag{10.4}$$
Here $S$ denotes the full set of chiral superfields, $V$ the set of vector superfields, and
$W_\alpha = (\bar D \bar D) e^{-gV} D_\alpha e^{gV}$. The action depends on three functions of the
chiral superfields which are usually called $f$, $\Phi$ and $g$. [This notation is somewhat
unfortunate in view of previous definitions, but we will respect the traditional notation
here. One should not confuse the function $\Phi$ with a chiral superfield or a scalar field,
nor confuse the function $g$ with a gauge coupling.] These functions have the following
properties:
• The function $f(z)$ is holomorphic in $z$ (i.e. $\partial f(z)/\partial z^* = 0$), and
  transforms under gauge transformations as the symmetric product of two adjoint
  representations. In a renormalizable theory the only possibility is
  $f_{ab} \propto \delta_{ab}$. The proportionality constant may in fact be complex. Then the
  real part multiplies the gauge kinetic terms $F_{\mu\nu}^2$, and the imaginary part appears
  in front of the topological term $F\tilde F$.
• The function $\Phi(z, z^*)$ must be real. In a renormalizable theory it must be proportional
  to $z z^*$.
• The function $g(z)$ is a holomorphic function of $z$, and is nothing but the
  superpotential. In a renormalizable theory it must be a polynomial in $z$ of degree three
  (or less).
There is a deceptively simple way of coupling this globally supersymmetric
action to supergravity. Instead of Eq. (10.4) one writes
Z
L = d4 E (Se2gV , S) + Re R1 fab (S)Wa Wb + g(S) (10.5)
Here E is the superspace determinant and R the chiral curvature scalar. All these
fields are fields in curved superspace. We will not explain this further here.
The Lagrangian Eq. (10.5) can be written out in components. It turns out that the
result depends only on two independent functions instead of the three one has in the
global case: the functions $\Phi$ and $g$ only appear in the combination
G(z, z) = 3 log() log(|g|2 ) . (10.6)
This function is called the Kähler potential. It turns out that the scalars in supergravity
can be viewed as coordinates of a complex manifold with some special properties, which
is called a Kähler manifold. The metric on such a manifold can be expressed in terms of
the Kähler potential
$$G^i_j \equiv \frac{\partial^2 G}{\partial z_i\, \partial z^{*j}} \tag{10.7}$$
Here and in the following derivatives with respect to $z_i$ are denoted by an upper index
$i$, and with respect to $z^{*i}$ by a lower index $i$. The indices $i$ label all different
scalars that may appear, as well as their indices in any of the gauge group representations.
The Lagrangian can be split into four kinds of terms: bosonic kinetic terms, fermionic
kinetic terms, the scalar potential, and all remaining terms without derivatives. The
bosonic kinetic terms are
h i
LBkin = e 12 R + Gij D zi D z j 14 ( Re fab )F
a
F ,b 14 ( Im fab )F
a
F ,b (10.8)
Here and in the following $D_\mu$ denotes a derivative that is covariant with respect to the
gauge group as well as gravity, and $e$ is the determinant of the vierbein (i.e.
$e = \sqrt{|g|}$, where $g$ is the determinant of the space-time metric). This part of the
Lagrangian is exactly as one could have expected.
The scalar potential has the form
i aj
LBpot V = eG 3 + Gk (G1 )kl Gl 21 g 2 Re fab (G Ti zj )(Gk Tkbl zl ) . (10.9)
Here $g$ is the gauge coupling (which cannot be confused anymore with the superpotential,
since the latter has been absorbed in $G$). If the gauge group is semi-simple the second term
becomes a sum over all factors, each of which may have a different coupling constant and
a different function $f_{ab}$. The coupling constant is only normalized in the standard way if
the kinetic terms in the vacuum one considers are properly normalized:
$f_{ab}(\langle z \rangle) = \delta_{ab}$. By $(G^{-1})^k_l$ we mean the inverse Kähler metric, and
not the double derivative of the function $G^{-1}$. This metric must be regular and have an
inverse for the theory to make sense. Finally, $T_i^{aj}$ is a generator of the gauge group in
the (in general reducible) representation of the scalars.
The terms involving fermions are much more complicated, and we will not present
them here. These results are valid if $g \neq 0$ and in the absence of a Fayet-Iliopoulos term.
and
1
Da = i Re fab (gGi Tibj zj + 12 ifbc
i
i c 12 ifibc i c ) 12 a (Gi i ) (10.11)
Note that in addition to purely bosonic terms there are also fermionic ones. These terms
disappear in the limit $M_{\rm Planck} \to \infty$ (the dependence on $M_{\rm Planck}$ has not
been given explicitly here, but can be inferred from the dimensions). In this limit all
non-renormalizable terms must vanish as well.
Supersymmetry can now be broken by a vacuum expectation value of a fermion bi-
linear, or by a purely bosonic v.e.v. We first consider the purely bosonic terms. The
fermionic terms in the action, which we did not write down, contain among many others
the terms
eG/2 [,L ,R ,L L 23 R L ] , (10.12)
where
1
= eG/2 [eG/2 Gi i + Da a ] (10.13)
2
If one of the coefficients on the right-hand side has a non-zero v.e.v, there are bi-linears
i or a in the action, and hence we see that the fields i and/or a mix with the
gravitino. It can be shown that then can be removed from the action by a shift of the
gravitino field
31 ehGi/2 23 . (10.14)
In the global limit reduces to the Goldstino field, up to normalization.
Furthermore, if $\langle e^{-G/2} \rangle \neq 0$ we see from (10.12) that the gravitino gets a mass:
$$m_{3/2} = \kappa^{-1} e^{-G/2}, \tag{10.15}$$
where on the right-hand side a factor $\kappa^{-1} = M_{\rm Planck}/\sqrt{8\pi}$ was inserted for
dimensional reasons. Note that this mass vanishes if $\langle g \rangle = 0$. In that case $G$ is
not well-defined, but
1
eG/2 |g|/ 3 = 0.
It is in fact not quite correct to interpret the gravitino mass in this way, nor is it
correct to conclude that $\langle e^{-G/2} \rangle \neq 0$ implies that supersymmetry is broken.
Indeed, supersymmetry is broken if and only if the F or D term has a vacuum expectation value.
To see this more clearly, consider the scalar potential. Usually an additional assumption
is made, namely that the Kähler metric and the function $f_{ab}$ are proportional to the
unit matrix:
GiJ = ij ; fab = ab (10.16)
This is called minimal coupling of Yang-Mills and matter to supergravity. Under this
assumption, the scalar potential can be written as
$$V = -3e^{-G} + |F_i|^2 + \tfrac12 D_a^2 \tag{10.17}$$
where in $F_i$ and $D_a$ we only take the bosonic terms into account. Just as in the global case
we define the supersymmetry breaking scale in terms of the vacuum expectation value of
the auxiliary fields:
$$M^2_{\rm Susy} = \sqrt{|F_i|^2 + \tfrac12 D_a^2}. \tag{10.18}$$
This is the square root of the shift of the potential due to supersymmetry breaking.
The scalar potential in supergravity shows a new feature in comparison with the one
of global supersymmetry, namely the extra term $-3e^{-G}$. Its presence implies that the
potential is not positive definite. Furthermore, even in a supersymmetric point
($\langle F_i \rangle = \langle D_a \rangle = 0$) the potential does not necessarily vanish.
In non-supersymmetric theories, the value of the potential can be changed by a con-
stant. Hence the cosmological constant can be tuned to zero, although we have no insight
in the reason for this fine-tuning. In supersymmetric theories one cannot add an arbitrary
constant to the potential. In a globally supersymmetric theory the cosmological constant
is zero before supersymmetry breaking, but is definitely non-zero and positive after su-
persymmetry breaking. In the absence of gravity we could however ignore this problem.
Fortunately, locally supersymmetric theories are better in this respect. One can again tune
the cosmological constant to zero, not by adding an arbitrary constant, but by requiring a
suitable value for the constant $-3e^{-\langle G \rangle}$. Also in this case there is no fundamental insight
in the mechanism that might impose such a fine-tuning, nor is the value of c protected
against corrections due to further shifts in the potential, for example in weak interaction
symmetry breaking.
If we do not do such a fine-tuning, we end up in de Sitter or anti-de Sitter space,
and we cannot even interpret the masses we get in the conventional way. For example,
if $e^{-G} \neq 0$ but all auxiliary fields have zero v.e.v.s, supersymmetry is unbroken, but
the gravitino has a mass. Since we are in anti-de Sitter space our usual notions about
masslessness are no longer valid, and these two facts are not in contradiction. Clearly it
would not make much sense to compute a mass spectrum from supergravity if the v.e.v.
of the potential is not tuned to zero.
This then gives us immediately an expression relating the gravitino mass to $M_{\rm Susy}$ and
$\kappa$. Requiring $V = 0$ in Eq. (10.17), and using Eqs. (10.18) and (10.15) we get
$$m_{3/2} = \frac{\kappa M^2_{\rm Susy}}{\sqrt3} = \sqrt{\frac{8\pi}{3}}\, \frac{M^2_{\rm Susy}}{M_{\rm Planck}}. \tag{10.19}$$
One may also compute the tree level mass matrices for the remaining fermions, the
scalars and the spin-1 fields. Minimal coupling implies that G has the form
With these choices the scalar and Yang-Mills kinetic terms have their canonical form. Under
this assumption one can derive the mass sum rule Eq. (9.31). This rule is valid at tree level
for minimal coupling and if $\langle D \rangle = 0$. If one also includes D-type breaking, it is
generalized to
$$\sum_S (-1)^{2S} (2S+1) M_S^2 = (N-1)\left[ 2m_{3/2}^2 - \kappa^2 \langle D_a^2 \rangle \right] - 2 g_a \langle D_a \rangle\, \mathrm{Tr}\, T^a \tag{10.21}$$
This mass sum rule does not give all the information that is available. In fact one can
express all tree-level mass matrices completely in terms of $F_i$ and $D_a$. It is somewhat
disturbing that in these expressions a non-zero gaugino mass requires the corresponding
$D_a$ term to have a non-zero vacuum expectation value. However, D-type breaking remains
undesirable. One possible way out of this is to use non-minimal couplings.
If we omit the D-terms, we see that the chiral multiplet splittings induced by supersymmetry
breaking are of order $m_{3/2}$. In particular a contribution of this order of magnitude
may be expected for the masses of the Higgs scalars, which in their turn determine the
weak scale. The relation between these two scales is in fact a rather complicated function
of all MSSM parameters, and there exist regions in parameter space where $m_{3/2}$ is much
larger than $M_Z$. However, this is considered unnatural, and current prejudice says that
we should have $m_{3/2}$ not too much higher than $M_Z$. Substituting $m_{3/2} \approx 100$ GeV in Eq.
(10.19) gives $M_{\rm Susy} \approx 10^{10}$ GeV.
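As a rough numerical check of this estimate, the relation $m_{3/2} = \sqrt{8\pi/3}\,M^2_{\rm Susy}/M_{\rm Planck}$ of Eq. (10.19) can be inverted for $M_{\rm Susy}$. The sketch below assumes $M_{\rm Planck} \approx 1.22 \times 10^{19}$ GeV; that number and the function name are choices made for this illustration, not values quoted in the text.

```python
import math

M_PLANCK_GEV = 1.22e19  # assumed value of the Planck mass in GeV


def susy_breaking_scale(m_gravitino_gev, m_planck_gev=M_PLANCK_GEV):
    """Invert m_3/2 = sqrt(8 pi / 3) * M_Susy^2 / M_Planck for M_Susy (GeV)."""
    prefactor = math.sqrt(8.0 * math.pi / 3.0)
    return math.sqrt(m_gravitino_gev * m_planck_gev / prefactor)


if __name__ == "__main__":
    # A gravitino mass near the weak scale gives a breaking scale of order 10^10 GeV.
    print(f"M_Susy ~ {susy_breaking_scale(100.0):.1e} GeV")
```

Because $M_{\rm Susy}$ depends only on the square root of $m_{3/2}$, changing the gravitino mass by a factor of ten moves the breaking scale by about a factor of three.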
Up to now we have only considered non-vanishing v.e.v.s for bosonic fields. It is also
imaginable that fermion bi-linears get a v.e.v. that breaks supersymmetry. Consider for
example Eq. (10.11). The second term involves a gaugino bi-linear. If $f_{ab,k} \neq 0$ (which
means that the couplings are not minimal) this may yield a contribution to $\langle F_i\rangle$, which
is proportional to $\langle\lambda\lambda\rangle$. This is called gaugino condensation. Since $\langle\lambda\lambda\rangle$ has dimension
three, we conclude that, after tuning the resulting effective potential to zero, we will get
supersymmetry breaking with an associated scale
$$M^2_{\rm Susy} \approx \frac{\langle\lambda\lambda\rangle}{M_{\rm Planck}} \qquad (10.22)$$
Here we define $M^4_{\rm Susy}$ as the shift in the potential due to the symmetry breakdown. Of
course the foregoing calculations are not valid in this case, but clearly $M^2_{\rm Susy}$ will be
proportional again to $\langle F_i\rangle$, and for dimensional reasons there must then be a factor $1/M_{\rm Planck}$.
For the gravitino mass one may then expect a formula like
$$m_{3/2} \approx \frac{\langle\lambda\lambda\rangle}{M^2_{\rm Planck}} \qquad (10.23)$$
This yields $\langle\lambda\lambda\rangle \approx (10^{13}\ {\rm GeV})^3$ if $m_{3/2} \approx 100$ GeV. It goes without saying that the mass-squared
sum rule is not valid in this case, although one may expect a similar formula to hold.
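The size of the condensate implied by Eq. (10.23) can be estimated in the same way. This is only an order-of-magnitude sketch: it assumes the unknown proportionality constant is one and takes $M_{\rm Planck} \approx 1.22 \times 10^{19}$ GeV, neither of which is fixed by the text.

```python
M_PLANCK_GEV = 1.22e19  # assumed value of the Planck mass in GeV


def condensate_scale(m_gravitino_gev, m_planck_gev=M_PLANCK_GEV):
    """Return <lambda lambda>^(1/3) in GeV, taking m_3/2 = <lambda lambda> / M_Planck^2."""
    condensate = m_gravitino_gev * m_planck_gev ** 2  # dimension three, in GeV^3
    return condensate ** (1.0 / 3.0)


if __name__ == "__main__":
    # The cube root of the condensate comes out at a few times 10^13 GeV.
    print(f"<lambda lambda>^(1/3) ~ {condensate_scale(100.0):.1e} GeV")
```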
The attractive point about gaugino condensation is that it is now possible that $M_{\rm Susy}$
is generated dynamically, just as in Technicolor models. Suppose we add to the Standard
Model an extra gauge group $G$, whose coupling merges with the Standard Model couplings
at $M_{\rm GUT}$ (or perhaps $M_{\rm Planck}$). Then the value of the coupling at that scale is fixed, and we
can use renormalization group evolution to compute at which scale it becomes large. Just
as the fact that the $SU(3)_{\rm color}$ coupling becomes large triggers chiral symmetry breaking
via quark condensates, it seems plausible that when the coupling constant of $G$ becomes
strong it forms condensates of the fermions it couples to. Since supersymmetry is still
unbroken, those fermions include the gauginos of $G$, and possibly nothing else. Just as
there is no hierarchy problem for $\Lambda_{\rm QCD}/M_{\rm Planck}$, there would be no hierarchy problem
for $M_{\rm Susy}/M_{\rm Planck}$ either. This large ratio would be explained in terms of an exponential
$e^{-1/g^2}$, where $g$ is a coupling constant which is small, but of order 1. The unattractive
point of this mechanism is that very little is known about whether and how exactly it works.
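The exponential sensitivity can be made concrete with one-loop running. The sketch below is an illustration under stated assumptions, not the mechanism of any specific model: it uses the one-loop formula $1/g^2(\mu) = 1/g^2(M) + (b_0/8\pi^2)\ln(\mu/M)$, takes $b_0 = 3N$ as for a pure $SU(N)$ super-Yang-Mills hidden sector, and assumes $M_{\rm GUT} \approx 2 \times 10^{16}$ GeV with a unified coupling $\alpha \approx 1/25$.

```python
import math


def strong_coupling_scale(m_high_gev, g_squared, b0):
    """Scale at which the one-loop coupling diverges.

    From 1/g^2(mu) = 1/g^2(M) + (b0 / (8 pi^2)) * ln(mu / M) with b0 > 0
    (asymptotic freedom): Lambda = M * exp(-8 pi^2 / (b0 * g^2(M))).
    """
    if b0 <= 0:
        raise ValueError("b0 must be positive for an asymptotically free group")
    return m_high_gev * math.exp(-8.0 * math.pi ** 2 / (b0 * g_squared))


if __name__ == "__main__":
    m_gut = 2.0e16              # assumed unification scale in GeV
    g2 = 4.0 * math.pi / 25.0   # assumed unified coupling, alpha ~ 1/25
    for n_colors in (3, 4, 5):
        b0 = 3 * n_colors       # assumed: pure SU(N) super-Yang-Mills
        scale = strong_coupling_scale(m_gut, g2, b0)
        print(f"SU({n_colors}): Lambda ~ {scale:.2e} GeV")
```

Modest changes in the group or in the coupling at the high scale shift the condensation scale by many orders of magnitude, which is the sense in which the hierarchy is generated by an exponential.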
10.4 Hidden Sector Models
It is clear that we cannot tolerate vacuum expectation values of order $M_{\rm Susy}$ for quantities
that carry non-trivial $SU(3) \times SU(2) \times U(1)$ quantum numbers. One usually assumes
that supersymmetry breaking takes place in a sector of the theory that has trivial Stan-
dard Model representations, and couples to the visible world only via (super)gravitational
interactions. This is called the hidden sector.
A frequently used toy model to describe hidden sector symmetry breaking is the
Polonyi-model. This model has just one chiral superfield in the hidden sector, whose
scalar component we will call, as before, z. The couplings are assumed to be minimal.
Let us first derive some useful results for minimally coupled theories with an arbitrary
number of scalars. We will ignore D-terms in the following, in other words we assume
that they do not get v.e.v.s.
Since $G(z_i, z^{*j}) = -\sum_i z_i z^{*i} - \log|g(z_i)|^2$ we find
$$G^i = \partial_{z_i} G = -z^{*i} - \frac{g^i}{g} \qquad (10.24)$$
Hence
$$F^i = e^{\frac{1}{2} z_i z^{*i}}\,|g|\left(z^{*i} + \frac{g^i}{g}\right), \qquad (10.25)$$
and
$$V = e^{z_i z^{*i}}\left(|z^{*i} g + g^i|^2 - 3|g|^2\right). \qquad (10.26)$$
In all these expressions $\kappa$ is set to 1. It can be restored easily using the fact that $z_i$ has
dimension 1, $g$ has dimension 3, and $\kappa$ has dimension $-1$.
In the Polonyi model one chooses $g = m^2(z + \beta)$. The value of $F$ is then $e^{zz^*/2}\, m^2\,(1 + z^*(z + \beta))$.
If $\beta < 2$ this quantity is positive for any value of $z$, and hence supergravity
must be broken. The potential is equal to
$$V = m^4 e^{zz^*}\left(|1 + z^*(z + \beta)|^2 - 3|z + \beta|^2\right)$$
this destroys in most cases the possibility of having a minimum at V = 0. Clearly this is
no more than a toy model.
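The fine-tuning of the Polonyi potential can be checked numerically. The sketch below evaluates Eq. (10.26) for $g = m^2(z + \beta)$ with real $z$ and $\kappa = 1$. The values $\beta = 2 - \sqrt{3}$ and $z_0 = \sqrt{3} - 1$ are the tuning commonly quoted for this model; they are supplied here as an assumption, since they do not appear in the surrounding text.

```python
import math


def polonyi_potential(z, beta, m=1.0):
    """V(z) of Eq. (10.26) for g = m^2 (z + beta), real z and kappa = 1."""
    g = m ** 2 * (z + beta)
    dg = m ** 2  # derivative of g with respect to z
    return math.exp(z * z) * ((z * g + dg) ** 2 - 3.0 * g ** 2)


if __name__ == "__main__":
    beta = 2.0 - math.sqrt(3.0)  # assumed tuning of beta
    z0 = math.sqrt(3.0) - 1.0    # assumed location of the minimum
    print("V(z0) =", polonyi_potential(z0, beta))
    grid = [-3.0 + 0.01 * i for i in range(601)]
    print("min V on grid =", min(polonyi_potential(z, beta) for z in grid))
```

For this choice the potential vanishes at $z_0$ and is non-negative elsewhere along the real axis, so the cosmological constant is tuned to zero while $F \neq 0$. For other values of $\beta$ the value of the potential at its minimum is generally non-zero.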
More generally, the hidden sector can be coupled to the observable sector simply by
writing the complete superpotential as a sum of two terms,
$$g(z_i, y_m) = h(z_i) + k(y_m),$$
where $z_i$ are hidden sector scalars and $y_m$ observable sector scalars. The scalar potential
is
$$V = e^{\kappa^2(|z_i|^2 + |y_m|^2)}\left(|h_i + \kappa^2 z^{*i} g|^2 + |k_m + \kappa^2 y^{*m} g|^2 - 3\kappa^2|g|^2\right), \qquad (10.31)$$
where we have restored the dependence on $\kappa$. We will assume that the hidden sector fields
get v.e.v.s of order $M_{\rm Planck} \approx \kappa^{-1}$:
$$\langle z_i\rangle = \kappa^{-1} b_i \qquad (10.32)$$
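The gravitino mass is given by
$$m_{3/2} = \kappa^{-1}\langle e^{-G/2}\rangle = \kappa^2 e^{\frac{1}{2}\kappa^2|\langle z_i\rangle|^2}\langle h\rangle, \qquad (10.33)$$
so that
$$\langle h\rangle = \kappa^{-2}\, m_{3/2}\, e^{-\frac{1}{2}|b_i|^2} \qquad (10.34)$$
Finally we need to parametrize the expectation value of the derivative of $h$. In the Polonyi
model we had $\langle h'\rangle = m^2 \propto \kappa^{-1} m_{3/2}$. Inspired by this result we postulate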
The observable sector variables, $y$, $k$ and $k_i$ have characteristic scales much below the
Planck scale. Their vacuum expectation values vanish, or are completely negligible in
comparison to $M_{\rm Planck}$. To get the effective potential for the observable sector we take
the limit $\kappa \to 0$, after substituting the v.e.v.s of the hidden sector fields. There are some
poles in $\kappa$, but only in terms that do not depend on observable sector fields. Those poles
have to be canceled, which can be done by requiring that $|a_i + b_i|^2 = 3$. This is simply
the requirement that the cosmological constant should vanish, and as usual this has to
be arranged by hand. When the condition $|a_i + b_i|^2 = 3$ is satisfied, one finds that all
constant terms cancel. The remaining terms are
$$V_{\rm obs} = |k_m|^2 + m^2_{3/2}|y_m|^2 + m_{3/2}\left[y_m k_m + (A - 3)k + {\rm c.c.}\right] \qquad (10.36)$$
and
$$A = \sum_i b_i (a_i + b_i). \qquad (10.38)$$
In this calculation the scale m was put in by hand, and the cosmological constant was
fine-tuned to zero by hand. Furthermore we did not worry about finding the minimum
of the potential, as we did in the foregoing example. Of course it is assumed that $\langle z_i\rangle$
minimizes the potential.
This result looks like the potential of a supersymmetric theory with soft breaking
terms. The first term is the usual scalar potential. The second is a scalar mass term,
one of the allowed soft supersymmetry breaking terms. The last terms are other soft
correction terms.
Note that all scalars receive the same mass-squared $m^2_{3/2}$. Equality of the masses is one of the
assumptions of the MSSM. The reason it happens here is the choice of minimal
kinetic terms in the observable sector. It is therefore not an inevitable consequence of
supergravity.
The superpotential $k$ will in general be a sum $\sum_n k^n$, where $k^n$ contains all terms with
$n$ fields. Then
$$\sum_m y_m \partial_m k^n = n k^n, \qquad (10.39)$$
The soft supersymmetry breaking terms in the potential are thus given by the terms in the
superpotential with a factor $m_{3/2}(A - 3 + n)$ multiplying the $n$th order terms. Cubic terms
get a factor $m_{3/2}A$, quadratic ones a factor $m_{3/2}(A - 1)$. The value of $A$ depends on the
details of the hidden sector.
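The counting behind these factors can be sketched in a few lines. The helper names below are hypothetical, and the superpotential is represented by a single monomial with chosen powers; the code checks the homogeneity relation of Eq. (10.39) and then evaluates the factor $m_{3/2}(A - 3 + n)$.

```python
def monomial(powers, coeff, y):
    """Value of coeff * prod_m y_m^{p_m}."""
    value = coeff
    for p, y_m in zip(powers, y):
        value *= y_m ** p
    return value


def y_dot_grad(powers, coeff, y):
    """Compute sum_m y_m * d/dy_m of the monomial, term by term."""
    total = 0.0
    for m, p in enumerate(powers):
        if p == 0:
            continue
        lowered = list(powers)
        lowered[m] -= 1  # differentiating with respect to y_m lowers its power by one
        total += y[m] * p * monomial(lowered, coeff, y)
    return total


def soft_term_factor(n_fields, a_param, m_gravitino):
    """Coefficient m_3/2 * (A - 3 + n) multiplying the n-field terms of k."""
    return m_gravitino * (a_param - 3 + n_fields)


if __name__ == "__main__":
    powers, coeff, y = (2, 1, 0), 0.7, (1.3, -0.4, 2.0)  # a cubic term, n = 3
    print(y_dot_grad(powers, coeff, y), 3 * monomial(powers, coeff, y))
    print(soft_term_factor(3, 1.5, 100.0), soft_term_factor(2, 1.5, 100.0))
```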
The cubic terms contain combinations of two squarks and a Higgs with the structure
of a Yukawa coupling. Such terms violate continuous R-symmetry, but not R-parity.
This is exactly what we assumed earlier. From the point of view of the observable sector
R-symmetry looks as if it is explicitly broken. The same is true for supersymmetry.
Both symmetries are however broken spontaneously in the hidden sector. One may thus
expect a Goldstone boson of broken R-symmetry. Since this boson is built out of hidden
sector fields, its couplings to the observable sector are very weak, of gravitational strength.
Furthermore it may happen that the R-symmetry was not exact, but has an anomaly
with respect to some gauge group, in either the hidden or the observable sector. In that
case the Goldstone boson is actually a pseudo-Goldstone boson, and it becomes massive.
If it gets its mass from observable sector instanton effects it will be extremely light, like
an invisible axion. If it gets its mass from a hidden sector gauge group it could well
be very heavy (like a scaled-up $\eta'$), but since it is also very weakly coupled to us it is
completely irrelevant. Note that if we add a gauge-group in the hidden sector that only
couples to a gaugino, then R-symmetry automatically has an uncancelable anomaly with
respect to this gauge group. Whether these alternatives can actually be realized is very
model-dependent, and we merely mention them here as logical possibilities.
Now that R-symmetry is broken there is no obstruction to gaugino masses. In the
particular kind of F-type breaking considered here they are not generated at tree level.
However, the gauginos couple to quark-squark loops, which generate a gaugino mass
when supersymmetry is broken. The size of the mass is dominated by the top quark
contribution,
$$m_a = \alpha_a\,\frac{m_t^2}{m_{3/2}}\,C, \qquad (10.41)$$
where $C$ is a numerical factor. This result holds in the limiting case $m_{3/2} \gg m_t$. If the
gravitino mass is much smaller than $m_t$ the result is proportional to $m_{3/2}$, since of course
it must vanish when $m_{3/2} = 0$. In any case this value is much too small in comparison
with present experimental limits on the gluino mass. As already mentioned, non-minimal
couplings $f_{ab}(z)$ provide a possible way out. Gaugino masses are presumably also
generated in supergravity breaking through gaugino condensation, since the Lagrangian
contains quartic gaugino interactions. However, as we have already seen, gaugino con-
densation also requires non-minimal couplings $f_{ab}$.
The universality of the gaugino masses assumed in the MSSM does not look natural
from this point of view. However, if there is unification of the coupling constants there
would be only one gaugino mass above the unification scale. The evolution of the separate
gaugino masses starts then at $M_{\rm GUT}$.
Since the masses are running as a function of the scale, it is not quite clear at which
scale we should impose the unification condition for the scalars. Many authors assume
this to be the gauge unification scale. This is certainly true for those scalars that come
from the same multiplet of the unified gauge group, but not for different multiplets. It
seems more reasonable to assume that the boundary condition that all scalar masses are
equal should be imposed at the Planck scale.
10.5 Conclusions
Once one has accepted supersymmetry as a symmetry of nature, supergravity is nearly
inevitable. It is required in order to couple a supersymmetric theory to gravity, and also
to avoid disastrously large contributions to the cosmological constant, inevitable in spon-
taneously broken global supersymmetry. Even in supergravity models the cosmological
constant problem still requires a solution, but at least the existence of a solution is not a
priori ruled out.
Supergravity also provides the only sensible way of spontaneously breaking supersymmetry,
the super-Higgs mechanism. This eliminates first of all the undesirable
massless Goldstino, but also produces an indispensable contribution to the mass sum rule
for broken supermultiplets.
Finally, supergravity models offer partial justification for the unification assumptions
generally made in the MSSM, although the case is far from being convincing.
10.6 References
The references used here include some of the papers listed at the end of the supersymmetry
section, plus reviews by P. van Nieuwenhuizen [30] and S. Ferrara [11].
A Spinors
In this appendix we review some properties of spinors. The action for fermions is derived
using as much as possible only group properties of the Lorentz group.
The spinor representations are not real, but pseudo-real. This means in particular
that a matrix $C$ must exist so that $(\sigma^a)^* = C\sigma^a C$. A matrix $C$ with that property is
$C = i\sigma^2$; thus $C_{\alpha\beta} = \epsilon_{\alpha\beta}$, with $\epsilon_{12} = 1$. Note that $C = C^* = -C^T = -C^\dagger$.
Important invariant tensors of SU (2) are C and the Pauli matrices. Suppose and
are spinors and V i a vector. Then one can construct for example the following invariants
C V~ C~
V~ ~
= (A.4)
=
= ,
so that = . This implies in particular that the following relation holds numeri-
cally
= , (A.5)
so that, for example, 12 = 1. Note that is also correctly obtained from by lowering
indices:
= (A.6)
Then for example the first invariant can also be written as
C =
=
= .
= i i (A.7)
A relation holds numerically if the left- and right hand side yield the same answer for the same index
values, regardless of the position of the indices. If the indices have different positions, the objects
transform differently as tensors under rotations, so they are not identical, they are just numerically the
same.
Here i are three infinitesimal SO(3) parameters, and i is numerically equal to
i
.
For the quantity with upper indices we find then
= i i
= i i
= i i,
We use the notation for tensors and for matrices. The position of the indices of indicate a
transformation property; the position of indices of has no special meaning.
where $M^a(d_i)$ is the $SU(2)$ representation matrix in the representation with dimension
$d_i$. For example, $M^a(1) = 0$ and $M^a(2) = \sigma^a$ (as before we omit the usual normalization
factor $\frac{1}{2}$ here).
The most common SO(3, 1) representations are (1, 1) for scalars, (2, 1) for left-handed
spinors, (1, 2) for right-handed ones, and (2, 2) for vectors. The representation matri-
ces (Ra , S b ), (a, b = 1, 2, 3), for these four representations are respectively (Ra , S b ) =
(0, 0), ( a , 0), (0, ( b ) ) and ( a , ( b ) ). Here we have denoted the full matrices Eqn.
(A.9) for simplicity by just specifying the pair of SU (2) matrices (M a (d1 ), M b (d2 )) out of
which they are built. Note that in the second SU (2) factor we use the complex conjugate
representation. This is just a convention. Four-vector indices are denoted here by letters
$\mu, \nu, \ldots$, left-handed spinors by $\alpha, \beta, \ldots$ and right-handed ones by $\dot\alpha, \dot\beta, \ldots$.
The tensor product of $(2, 1)$ and $(1, 2)$ contains $(2, 2)$, and therefore there must exist
an invariant tensor $\sigma^{\mu}_{\alpha\dot\beta}$. It is customary to express that tensor in terms of invariant tensors
of the SO(3) subgroup of SO(3, 1) corresponding to space rotations. This subgroup is
precisely the diagonal subgroup of the two SU (2)s, and has generators T a = (Ra + S a ).
Under this subgroup the two kinds of spinors are transforming as
= i i (A.10)
and
= i i (A.11)
= (1, ~ ) . (A.12)
This notation indicates that the space components i are numerically equal to the Pauli
matrices. As tensors these are indeed invariant. This follows from Eq. (A.9), plus the fact
that the dotted lower indices transform under the rotation subgroup as undotted upper
indices. The relative normalization between the space and time components does not
follow from these SO(3)-based arguments, but can be derived by requiring that
transforms as a vector V . This fixes the relative factor up to a sign, which is a convention.
Here we are following the standard conventions. It might be preferable to lower and raise systematically
all dotted indices on all quantities, so that they transform in the same way under rotations.
From the properties of the Pauli matrices one easily derives, in a metric-independent
notation,
, = 2 00 , = 2 00 (A.13)
Note that these relations fix the relative normalization between the space-like and time-
like components of .
An important difference between SO(4) and SO(3, 1) is the behavior of the spinors
under complex conjugation. Starting with Eqn. (A.9) we can compute the six SO(4)
and SO(3, 1) representation matrices on these two-dimensional spaces. In both cases the
representations (2, 1) and (1, 2) are obtained from R ~ = ~ , S
~ = 0 or R ~ = 0, S ~ = (~ )
respectively (note that we use, as before, the convention that the second SU (2) transforms
with the complex conjugate representation). Then we take the combinations T a = Ra +S a
and T a = Ra S a to get the generators of SO(4), and T a = Ra + S a and iT a = i(Ra S a )
to get the generators of SO(3, 1).
In SO(4) the 6 representation matrices are R ~ +S~ = ~ and R ~ S ~ = ~ for the repre-
~ ~ ~ ~
sentation (2, 1) and R + S = ~ and R S = ~ for (1, 2). If we take the conjugate of
the set of matrices the (R ~ + S,
~ R~ S)
~ = (~ , ~ ) we get (~ , (~ ) ), which is equivalent
to the original. Hence the spinor representations of SO(4) are self-conjugate (and in fact
pseudo-real). This is summarized below. The symbol denotes equivalence in SU (2),
i.e. C a C = ( a ) .
= ( ) (A.14)
In other words, we introduce a new symbol whose components are numerically equal to
those of , but we give a dotted index to indicate that it transforms as the SO(3, 1)
spinor of opposite chirality (by definition, the chirality is $+1$ for the representation
$(2, 1)$ and $-1$ for $(1, 2)$). Now we just have to be careful about the position of the index:
upper or lower. This can be deduced from the transformation of the left- and right-hand
side under SU (2) rotations. We have
= i i, (A.15)
Then the complex conjugate spinor transforms as (note the usual minus sign for infinites-
imal transformations of complex conjugates)
( ) = i ( i, ) ( ) (A.16)
= i i, (A.17)
and we have seen earlier that i = (i ) . The same relation holds when we replace
all upper indices by lower ones and vice-versa.
Using the conjugate spinor we can write down a Lorentz invariant kinetic term
i 0 (A.18)
in both metrics.
We can also write down a mass term, but only by combining with itself; (2, 1)
couples with itself to a singlet, but not with $(1, 2)$. Such a mass term is of the form
m . Explicitly this is proportional to , and this vanishes if is a commuting
object. Up to now it was, since we have only introduced it as a vector in a two-dimensional
spinor space. In physics spin-$\frac{1}{2}$ particles should however be anti-commuting, which can
be achieved either by making the spinor Grassmann-valued, or by making it an anti-commuting
operator in a Hilbert space. In either case the role of complex conjugation is replaced
by hermitean conjugation (i.e. is a positive definite quantity, analogous to for
complex numbers). Hence we define = ( ) . Then the mass term is (including the
necessary hermitean conjugate term and a normalization for later purposes)
21 m( + ) . (A.19)
Note that the fact that $R$ and $R^*$ can be coupled to a singlet holds for unitary representations, as a
consequence of $U^\dagger U = 1$; the $SO(3, 1)$ spinor representations (and all other finite dimensional
representations) are however not unitary.
i2 = i2 (i i ), where i is the gauge index and the spinor indices are suppressed. We
can view this as a spinor in a 2dim (R) dimensional real representation. Hence one can
describe this system exactly as above, with the same kinetic terms and a Majorana mass
term.
This is not the usual description, however. If we write out the full kinetic and Majorana
mass term we get, suppressing all spinor indices
i 0 i i + i 0 i i m( i i + c.c) (A.21)
i 0 = i 0 ( )T
= i 0 , , (A.22)
where in the first step anti-commutativity was used and in the second step integration by
parts. Furthermore we introduced the new tensor
, ( )T = (C ( ) C) , (A.23)
i , (A.25)
with
= , = , , (A.26)
and
0 0
= (A.27)
0 0
so that
0 0 1
0 = = . (A.28)
1 0
We define
= 0 (A.29)
Note that all expressions involving Dirac matrices are metric dependent, whereas all
results involving s are written here in metric-independent form.
The mass term can be written as
m (A.30)
{ , } = 2 , (A.31)
and other representations exist. In the present representation (and in fact in any commonly
used representation with this choice of metric) they are hermitean ($\mu = 1, 2, 3$) or
anti-hermitean ($\mu = 0$, the time direction); $\gamma_4$ is hermitean.
Even if a spinor is in a real representation of all symmetry groups it is customary
to introduce a Dirac spinor
= , (A.32)
and to write down the standard action. Only in this case one has to include an extra
factor $\frac{1}{2}$, since otherwise one would get the kinetic terms Eq. (A.18) twice. The correct
action for a Majorana fermion is thus
i
2
21 m (A.33)
= T C , (A.34)
The -matrices derived here are in a representation of the Clifford algebra which is
not the most common one. It is called the Weyl representation. For example, in [5] a
different representation is used.
A Dirac spinor can be projected onto its two components using the matrix $\gamma_5$ defined
as $\gamma_5 = i\gamma_0\gamma_1\gamma_2\gamma_3$. It is hermitean, its square is 1, and it anti-commutes with all $\gamma_\mu$, $\mu = 0, \ldots, 4$.
In the explicit representation given above one has
0 1 1 0
4 = , 5 = (A.36)
1 0 0 1
They satisfy PR PR = PR ; PL PL = PL and PL PR = 0. The left and right-handed compo-
nents of a field are defined as L = PL ; R = PR . Due to the projections L and
R are effectively two-component spinors, called Weyl spinors. These are precisely the
spinors and introduced above.
Note that $\bar\psi_L = \bar\psi P_R$. The flip in chirality occurs because one has to commute $P_L$ through
$\gamma_4$. Hence the non-vanishing bi-linears are $i\bar\psi_L\gamma^\mu\psi_L$, $\bar\psi_R\psi_L$, and terms with all $L$s and $R$s
interchanged. Thus the vector current (to which gauge bosons couple) preserves chirality,
but the mass term does not. Another combination that does not preserve chirality is
L [ , ]R , to which the magnetic moment is proportional.
For an arbitrary Dirac spinor one defines
= (c )T C , (A.38)
where c is the charge conjugate spinor. A Majorana spinor is thus defined by = c .
In the absence of a mass term there is not really any difference between Majorana and
Weyl spinors. We may write
i i i
= L L + R R , (A.39)
2 2 2
and then substitute R = ((c )L )T in the second term. Using the Majorana property
plus a little algebra (which is done explicitly in chapter 5) one finds that the second term
is now transformed into the first one. Hence for Majorana fermions
i
= iL L = iR R . (A.40)
2
The difference between Majorana and Weyl fermions becomes essential if one assigns
them to representations of local or global symmetries, and writes down mass terms. For
Majorana fermions the representations must be real, and masses are allowed, while for
Weyl fermions the representation can be complex, but then one cannot write down an
invariant mass term.
Majorana and Weyl spinors both have two on shell degrees of freedom. In other words
both the Majorana condition $\bar\psi = \psi^T C$ and the Weyl condition $\psi = P_L\psi$ (or $\psi = P_R\psi$)
reduce the number of degrees of freedom of a Dirac spinor from four to two, but one cannot
reduce the number of degrees of freedom further. This is due to the fact that the SO(3, 1)
spinor representation is complex, and hence requires always two real degrees of freedom.
In dimensions other than 4 this can be different. For example if $D = 10$ modulo 8 the
$SO(D - 1, 1)$ spinor representations are real, and one can impose simultaneously Weyl
and Majorana conditions.
B Lie Algebras
Here we collect some formulas and conventions for Lie-algebras. This is not a review of
group theory, but rather an encyclopedic dictionary of some relevant facts with few
explanations.
B.1 Classification of Lie Algebras
The algebra. We will mainly use compact groups (see below). Their Lie algebras can
be characterized by a set of $\dim(A)$ hermitean generators $T^a$, $a = 1, \ldots, \dim(A)$, where $A$
stands for adjoint. Provided a suitable basis choice is made, the generators satisfy the
following algebra
$$[T^a, T^b] = i f^{abc}\, T^c, \qquad (B.1)$$
with structure constants $f^{abc}$ that are real and completely anti-symmetric.
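As a small concrete instance, the commutation relations can be verified for $SU(2)$, where the generators may be taken as half the Pauli matrices and the structure constants are the Levi-Civita symbol. The sketch below is an illustrative check only; the function names are hypothetical and the matrices are stored as plain nested lists.

```python
# Generators T^a = sigma^a / 2 of SU(2), stored as nested lists of complex numbers.
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def epsilon(a, b, c):
    """Levi-Civita symbol for indices taking values 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


def satisfies_algebra(generators, structure_constants, tol=1e-12):
    """Check [T^a, T^b] = i f^{abc} T^c entry by entry."""
    n, dim = len(generators), len(generators[0])
    for a in range(n):
        for b in range(n):
            lhs = commutator(generators[a], generators[b])
            for i in range(dim):
                for j in range(dim):
                    rhs = sum(1j * structure_constants(a, b, c) * generators[c][i][j]
                              for c in range(n))
                    if abs(lhs[i][j] - rhs) > tol:
                        return False
    return True


if __name__ == "__main__":
    print(satisfies_algebra(T, epsilon))
```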
Exponentiation. Locally, near the identity, the corresponding Lie group can be obtained
by exponentiation
$$g(\alpha) = e^{i\alpha^a T^a}. \qquad (B.2)$$
The global properties of the group, involving elements not close to 1, are not fully
described by the Lie-algebra alone, but will not be discussed here. The space formed by
all the group elements is called the group manifold.
Real forms. A Lie-algebra is a vector space of dimension $\dim(A)$ with an additional
operation, the commutator. An arbitrary element of the vector space has the form $\sum_a \alpha^a T^a$.
In applications to physics $\alpha^a$ is either a real or a complex number. If the coefficients $\alpha^a$
are all real and the generators Hermitean, the group manifold is a compact space. For a
given compact group there is a unique complex Lie-algebra, which is obtained simply by
allowing all coefficients $\alpha^a$ to be complex. Within the complex algebra there are several
real sub-algebras, called real forms. The generators of such a sub-algebra can be chosen
so that Eq. (B.1) is satisfied with all structure constants real, but with generators that are
not necessarily Hermitean. One can always obtain the real forms from the compact real
form (which has hermitean generators) by choosing a basis so that the generators split
into two sets, $\mathcal{H}$ and $\mathcal{K}$, so that $[\mathcal{H}, \mathcal{H}] \subset \mathcal{H}$ and $[\mathcal{K}, \mathcal{K}] \subset \mathcal{H}$. Then one may consistently
replace all generators $K \in \mathcal{K}$ by $iK$ without affecting the reality of the coefficients $f^{abc}$.
The most common case in physics are the real forms $SO(n, m)$ of the compact real form
$SO(n + m)$. Most of the following results hold for the compact real form of the algebra,
unless an explicit statement about non-compact forms is made.
The classical Lie groups. The group $SU(N)$ is the group of unitary $N \times N$ matrices
with determinant 1; $SO(N)$ is the group of real orthogonal matrices with determinant
1, and $Sp(2r)$ the group of real $2r \times 2r$ matrices $S$ that satisfy $S^T M S = M$, where
$M$ is a matrix which is block-diagonal in terms of $2 \times 2$ blocks of the form
$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.
Mathematicians (and some physicists) write $Sp(r)$ instead of $Sp(2r)$.
The Cartan sub-algebra. This is the maximal set of commuting generators of the
simple algebra. All such sets can be shown to be equivalent. The dimension of this space
is called the rank (denoted r) of the algebra.
$$[H_i, E_{\vec\alpha}] = \alpha_i E_{\vec\alpha}. \qquad (B.3)$$
The eigenvalues with respect to the Cartan sub-algebra are vectors in a space of dimension
$r$. We label the remaining generators by their eigenvalues $\vec\alpha$. These eigenvalues are called
the roots of the algebra.
Positive roots. A positive root is a root whose first component $\alpha_1$ is positive in some
fixed basis. This basis must be chosen so that $\alpha_1 \neq 0$ for all roots.
Simple roots. Simple roots are positive roots that cannot be written as positive linear
combinations of other positive roots. There are precisely r of them. They form a basis of
the vector space of all the roots. The set of simple roots of a given algebra is unique up to
O(r) rotations. In particular it does not depend on the choice of the Cartan sub-algebra
or the basis choice in root space. This set is thus completely specified by their relative
lengths and mutual inner products. The inner product used here, denoted $\vec\alpha \cdot \vec\beta$, is the
straightforward Euclidean one.
where $\vec\alpha_i$ is a simple root. This matrix is unique for a given algebra, up to permutations
of the simple roots. One of the non-trivial results of Cartan's classification of the simple
Lie algebras is that all elements of $A$ are integers. The diagonal elements are all equal to
2 by construction; the off-diagonal ones are equal to 0, $-1$, $-2$ or $-3$.
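These integrality properties are easy to check for rank-two examples. The sketch below assumes the common definition $A_{ij} = 2\,\vec\alpha_i \cdot \vec\alpha_j / (\vec\alpha_j \cdot \vec\alpha_j)$, chosen here because it agrees with the arrow rule for Dynkin diagrams in these notes; the explicit simple roots are one convenient choice of basis (long roots with length-squared 2), not taken from the text.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cartan_matrix(simple_roots):
    """A_ij = 2 (alpha_i . alpha_j) / (alpha_j . alpha_j), rounded to integers."""
    return [[round(2 * dot(a_i, a_j) / dot(a_j, a_j)) for a_j in simple_roots]
            for a_i in simple_roots]


# Assumed explicit simple roots, with the long roots normalized to length-squared 2.
SIMPLE_ROOTS = {
    "A2": [(math.sqrt(2), 0.0), (-1 / math.sqrt(2), math.sqrt(1.5))],
    "B2": [(1.0, -1.0), (0.0, 1.0)],
    "G2": [(math.sqrt(2), 0.0), (-1 / math.sqrt(2), 1 / math.sqrt(6))],
}

if __name__ == "__main__":
    for name, roots in SIMPLE_ROOTS.items():
        print(name, cartan_matrix(roots))
```

With this convention the larger off-diagonal entry sits in the row of the longer root, so $|A_{ij}| > |A_{ji}|$ means that root $i$ is longer than root $j$.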
Dynkin diagrams. These are a graphical representation of the Cartan matrix. Each root is
represented by a dot. The dots are connected by $n$ lines, where $n$ is the maximum of $|A_{ij}|$
and $|A_{ji}|$. If $|A_{ij}| > |A_{ji}|$ an arrow from root $i$ to root $j$ is added to the line. The simple
algebras are divided into 7 classes, labeled $A$ to $G$, with Dynkin diagrams as shown below.
[Figure: Dynkin diagrams of $A_r$, $B_r$, $C_r$, $D_r$, $G_2$, $F_4$, $E_6$, $E_7$ and $E_8$, with the simple roots numbered $1, \ldots, r$.]
Long and short roots. If a line from $i$ to $j$ has an arrow, $A_{ij} \neq A_{ji}$ and hence the
lengths of roots $i$ and $j$ are not the same. An arrow always points from a root to another
root with smaller length. Lines without arrows connect roots of equal length. There is at
most one line with an arrow per diagram, and therefore there are at most two different
lengths. This is not only true for the simple roots, but for all roots. One frequently used
convention is to give all the long roots length-squared equal to two. Then the short roots
have length-squared 1 if they are connected to the long ones by a double line, and length-squared
$\frac{2}{3}$ if they are connected by a triple line. Often the short roots are labeled by closed dots,
and the long ones by open dots, although this is strictly speaking superfluous. If all roots
have the same length the algebra is called simply laced. This is true for types $A$, $D$ and $E$.
Realizations. The compact Lie-algebras corresponding to types $A$ to $D$ are realized by
the algebras SU (n), SO(n) and Sp(n). The correspondence is as follows
Ar : SU (r + 1)
Br : SO(2r + 1)
Cr : Sp(2r)
Dr : SO(2r)
There is no such simple characterization for the algebras of types E, F and G, the excep-
tional algebras.
B.2 Representations.
A set of unitary $N \times N$ matrices satisfying the algebra (B.1) is said to form a (unitary
matrix) representation of dimension N .
Weights. In any representation the matrices representing the Cartan sub-algebra generators
$H_i$ can be diagonalized simultaneously. The space on which the representation
acts decomposes in this way into eigenspaces with a set of eigenvalues $\vec\lambda$, i.e. $H_i v_{\vec\lambda} = \lambda_i v_{\vec\lambda}$.
The $\vec\lambda$'s, which are vectors in the vector space spanned by the roots, are called weights.
The vector space is usually called weight space.
Weight space versus representation space. We are now working in two quite different
vector spaces: the $r$-dimensional weight space, and the $N$-dimensional space on
which the representation matrices act. The former is a real space, the latter in general a
complex space. Often the vectors in the latter space are referred to as states, a termi-
nology borrowed from quantum mechanics. Although this may be somewhat misleading
in applications to classical physics, it has the advantage of avoiding confusion between
the two spaces.
Weight multiplicities. In the basis in which all Cartan sub-algebra generators are
simultaneously diagonal each state in a representation is characterized by some weight
vector $\vec\lambda$. However, this does not characterize states completely, since several states can
have the same weight. The number of states in a representation $R$ that have weight $\vec\lambda$ is
called the multiplicity of $\vec\lambda$ in $R$.
Coroots. Coroots are defined as $\hat\alpha = \frac{2\vec\alpha}{\vec\alpha \cdot \vec\alpha}$.
Dynkin labels. For any vector $\vec\lambda$ in weight space we can define Dynkin labels $l_i$ as
$l_i = \vec\lambda \cdot \hat\alpha_i = 2\,\frac{\vec\lambda \cdot \vec\alpha_i}{\vec\alpha_i \cdot \vec\alpha_i}$. Since the simple (co)roots form a complete basis, these Dynkin labels
are nothing but the components of a weight written with respect to a different basis. The
advantage of this basis is that it can be shown that for any unitary representation of the
algebra the Dynkin labels are integers.
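The definition $l_i = 2\,\vec\lambda \cdot \vec\alpha_i / (\vec\alpha_i \cdot \vec\alpha_i)$ can be illustrated with the three weights of the fundamental representation of $SU(3)$. The explicit simple roots and highest weight below are one assumed choice of basis (roots of length-squared 2), introduced only for this example.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def dynkin_labels(weight, simple_roots):
    """l_i = 2 (lambda . alpha_i) / (alpha_i . alpha_i), rounded to integers."""
    return [round(2 * dot(weight, a) / dot(a, a)) for a in simple_roots]


# Assumed basis for A2 = SU(3): simple roots of length-squared 2.
ALPHA_1 = (math.sqrt(2), 0.0)
ALPHA_2 = (-1 / math.sqrt(2), math.sqrt(1.5))
HIGHEST = (1 / math.sqrt(2), 1 / math.sqrt(6))  # highest weight of the triplet


def subtract(u, v):
    return tuple(x - y for x, y in zip(u, v))


# The other two weights are reached by subtracting simple roots.
TRIPLET_WEIGHTS = [
    HIGHEST,
    subtract(HIGHEST, ALPHA_1),
    subtract(subtract(HIGHEST, ALPHA_1), ALPHA_2),
]

if __name__ == "__main__":
    for w in TRIPLET_WEIGHTS:
        print(dynkin_labels(w, [ALPHA_1, ALPHA_2]))
```

All three weights have integer labels, and the highest weight is the one whose labels are all non-negative.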
Special Representations.
Fundamental representations
The representations with Dynkin labels $(0, 0, \ldots, 0, 1, 0, \ldots, 0)$ are called the fundamental
representations.
Vector representations
The $N \times N$ matrices that were used above to define the classical Lie groups form
the vector representation of those groups; the expansion of these matrices around
the identity yields the vector representation of the corresponding Lie-algebra.
Fundamental spinor representations
They are defined only for SO(N ). If N is odd, they have Dynkin label (0, 0, . . . , 0, 1).
If N is even there are two fundamental spinor representations with Dynkin labels
(0, 0, 0, . . . , 1, 0) and (0, 0, 0, . . . , 0, 1).
Tensor Products. If $V_{i_1}$ transforms according to some representation $R_1$ and $W_{i_2}$
according to some representation $R_2$, then obviously the set of products $V_{i_1} W_{i_2}$ forms a
representation as well. This is called the tensor product representation $R_1 \otimes R_2$; it has
dimension $\dim R_1 \dim R_2$. This representation is usually not irreducible. It can thus be
decomposed into irreducible representations:
$$R_1 \otimes R_2 = \sum_j N_{12j} R_j, \qquad (B.8)$$
where $\vec{\rho}$ is called the Weyl vector. It has Dynkin labels (1, 1, 1, . . . , 1, 1).
The Casimir eigenvalue. The operator $T^a T^a$ is called the (quadratic) Casimir operator.
It commutes with all generators, and is thus constant on an irreducible representation.
The eigenvalue for a representation with highest weight $\vec{\Lambda}$ is proportional to the
number
$C(\Lambda) = (\vec{\Lambda} + 2\vec{\rho}) \cdot \vec{\Lambda}$ (B.10)
For the adjoint representation this yields C(A) = 2g, where g is the dual Coxeter number.
It is equal to the following numbers for the simple algebras
For SO(3) we use the same normalization as for SU (2). The correctly normalized generators
of the SU (2) vector representation (which is the SO(3) spinor representation) are
$\frac{1}{2}\sigma_i$, where $\sigma_i$ are the Pauli matrices.
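As an illustration of Eq. (B.10), the following sketch evaluates $(\vec{\Lambda} + 2\vec{\rho}) \cdot \vec{\Lambda}$ for SU(N) directly from Dynkin labels. It rests on two assumptions that are not spelled out above: the roots are normalized to length squared 2, so that the inner product of two weights in the Dynkin basis is given by the inverse Cartan matrix, and for SU(N) that inverse has entries min(i, j) − ij/N. The function names are hypothetical.

```python
from fractions import Fraction


def inverse_cartan_su(n):
    """Inverse Cartan matrix of SU(n): entries min(i, j) - i*j/n for i, j = 1..n-1."""
    rank = n - 1
    return [
        [Fraction(min(i, j)) - Fraction(i * j, n) for j in range(1, rank + 1)]
        for i in range(1, rank + 1)
    ]


def casimir_number(labels, n):
    """Evaluate (Lambda + 2 rho) . Lambda for an SU(n) highest weight in Dynkin labels.

    The Weyl vector rho has Dynkin labels (1, ..., 1), so Lambda + 2 rho is
    obtained by adding 2 to every label.
    """
    if len(labels) != n - 1:
        raise ValueError("SU(n) needs n - 1 Dynkin labels")
    form = inverse_cartan_su(n)
    shifted = [label + 2 for label in labels]
    return sum(
        shifted[i] * form[i][j] * labels[j]
        for i in range(n - 1)
        for j in range(n - 1)
    )


# Adjoint of SU(5) has Dynkin labels (1, 0, 0, 1); the dual Coxeter number is 5.
adjoint_su5 = casimir_number([1, 0, 0, 1], 5)
# Vector representation of SU(5): Dynkin labels (1, 0, 0, 0).
vector_su5 = casimir_number([1, 0, 0, 0], 5)

print(adjoint_su5, vector_su5)
```

For the adjoint representation the result reproduces C(A) = 2g with g = N, which is a useful check on the normalization.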
Tensors. A tensor $V_{i_1,\ldots,i_m}$ transforms by definition as
Relation to tensor products. For every term in Eq. (B.8) there are $N_{12j}$ distinct
invariant tensors. The invariant tensors $\delta_{ij}$ and $T^a_{ij}$ correspond to the first two terms in
the tensor product $R \otimes R^* = 1 + A + \ldots$.
Rank two invariant tensors. The existence of an invariant tensor with two indices
implies that the two corresponding representations $R_1$ and $R_2$ contain the identity in their
tensor product. For every irreducible representation $R_1$ there is only one representation
$R_2$ with that property. If $R_2$ is not equivalent to $R_1$ it is the complex conjugate of $R_1$, and
the invariant tensor is $\delta_{i_1,i_2}$ as discussed above (provided one chooses complex conjugate
bases). Otherwise $R_1$ is either real or pseudo-real. If $R_1$ is real $\delta_{ij}$ is an invariant tensor;
if it is pseudo-real there exists an invariant tensor $C_{ij} = -C_{ji}$. The invariance implies then
that the representation matrix U is conjugated by C: $U^* = C^{-1} U C$.
Symmetric invariant adjoint tensors. For each simple algebra of rank r all fully
symmetric invariant tensors with adjoint indices can be expressed in terms of r basic
tensors. The ranks (number of indices) of these tensors are as follows

Algebra   Invariant tensor ranks
$A_r$     2, 3, 4, . . . , r, r + 1
$B_r$     2, 4, 6, . . . , 2r
$C_r$     2, 4, 6, . . . , 2r
$D_r$     2, 4, 6, . . . , 2r − 2; r
$G_2$     2, 6
$F_4$     2, 6, 8, 12
$E_6$     2, 5, 6, 8, 9, 12
$E_7$     2, 6, 8, 10, 12, 14, 18
$E_8$     2, 8, 12, 14, 18, 20, 24, 30

The tensor of rank 2 is always $d^{ab} = \delta^{ab}$. For SU (2) this is the only such tensor.
Consequently whenever a symmetric tensor appears with four adjoint indices, it must be
proportional to $\delta^{ab}\delta^{cd} + \delta^{ac}\delta^{bd} + \delta^{ad}\delta^{bc}$.
where d is one of the basic invariant tensors, and the dots represent combinations of lower
order tensors. If there is no basic tensor of rank n, the index vanishes. These indices are
defined provided one has fixed a normalization for d; this can be done by fixing it for one
representation with non-zero index. The second and third indices $I_2$ and $I_3$ are defined as
$\mathrm{Tr}\, T^a T^b = \frac{1}{2} I_2(R)\, \delta^{ab}$
$\mathrm{Str}\, T^a T^b T^c = I_3(R)\, d^{abc}$ ,
For the algebras SU (N ) a convenient normalization of $d^{abc}$ is such that $I_3 = 1$ for the
fundamental representation.
where n is the rank of one of the basic invariant tensors. Because d is an invariant tensor,
these operators commute with all the generators in any given representation. For the
quadratic Casimir operator one has
$C_2(R) = \frac{\dim A}{\dim R}\, I_2(R)$ . (B.16)
B.4 Representations of SU (N )
The irreducible unitary representations of SU (N ) can be characterized in a very convenient
way using Young tableaux. They are specified by a sequence of N − 1 integers
$(q_1, \ldots, q_{N-1})$, with $q_i \geq q_{i+1}$. This way of labelling representations can be derived in a
straightforward way from the Dynkin labelling. The sequences of integers are graphically
represented by a diagram consisting of boxes forming an upside-down staircase, as in the
following example
[Figure: Young tableau with rows of lengths $q_1$, $q_2$ and $q_3$.]
The dimension of a representation can be computed as follows. Take two copies of the
picture. In the first one we write N in the boxes on the diagonal and N ± j in all other
boxes, where j is the number of positions right or up from the diagonal (for N + j) or left
or down from the diagonal (for N − j); equivalently, the box in row i and column k gets
the number N + k − i. In the second copy of the figure we put in every box the hook
length, which is the total number of boxes in the hook formed by the box we are
considering together with the boxes to the right of it and below it. The dimension is the
product of the numbers in the first figure divided by the product in the second figure. In
the example these two figures are as shown here
[Figure: two copies of the example tableau, one filled with the factors N, N ± j and one filled with the hook lengths.]
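The counting rule just described can be written as a short function. This is only a sketch of the standard hook-length computation: the tableau is given as a list of row lengths, and the function name `su_n_dimension` and its argument conventions are chosen here for illustration.

```python
def su_n_dimension(rows, n):
    """Dimension of the SU(n) representation with the given Young tableau.

    rows: row lengths (q_1, q_2, ...) with q_i >= q_{i+1}; empty rows are ignored.
    Each box in row i, column k (both counted from 0) contributes the factor
    n + k - i to the numerator and its hook length to the denominator.
    """
    rows = [q for q in rows if q > 0]
    if any(a < b for a, b in zip(rows, rows[1:])):
        raise ValueError("row lengths must be non-increasing")
    # Column lengths: number of rows that reach column k.
    columns = [sum(1 for q in rows if q > k) for k in range(rows[0])] if rows else []

    numerator = 1
    denominator = 1
    for i, q in enumerate(rows):
        for k in range(q):
            numerator *= n + k - i
            arm = q - k - 1           # boxes to the right in the same row
            leg = columns[k] - i - 1  # boxes below in the same column
            denominator *= arm + leg + 1
    return numerator // denominator


# A few SU(5) and SU(3) examples.
print(su_n_dimension([1], 5))           # vector
print(su_n_dimension([1, 1], 5))        # rank-2 anti-symmetric tensor
print(su_n_dimension([2, 1, 1, 1], 5))  # adjoint
print(su_n_dimension([2, 1], 3))        # adjoint of SU(3)
```

The integer division is exact because the hook-length quotient is always an integer.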
representations. For example, consider a particle that transforms according to the
rank-2 anti-symmetric tensor representation. We can obtain that representation as a tensor
product of two vector representations. The latter are just mathematical tools, and have
no physical significance. It does not mean that the particle can somehow be physically
decomposed in terms of fundamental particles in the vector representation. Indeed, if
that were an option, we would have to worry what happens to the symmetric combination.
B.5 Subalgebras
Often in physics symmetries are only approximate, and hold only in special limits. Away
from that limit only a sub-algebra remains as a symmetry. In the Standard Model this
occurs when the Higgs mechanism breaks $SU(3) \times SU(2) \times U(1)_Y$ to $SU(3) \times U(1)_{QED}$.
In the high energy limit the symmetry is exact (or unbroken), whereas at low energy the
subgroup is the relevant symmetry. Beyond the Standard Model this kind of situation
may occur once again, with the Standard Model gauge group realized as a subalgebra of
a larger algebra. The most popular option is SU (5).
A subalgebra H is a set of generators written as a linear combination of the generators
of an algebra G, such that their commutation relations close. One also says that H is
embedded in G, and this is usually denoted as $H \subset G$. There is an analogous notion of
groups and subgroups.
Particles and fields always belong to representations of symmetry groups. If the low
energy symmetry group is a subgroup H of a larger group G, all particle representations
of G decompose into particle representations of H. Suppose we have a particle or field $\phi_i$,
where i is an index on which G acts via a matrix representation. Then an infinitesimal
transformation acts as follows
$\phi_i \to \phi_i + i \sum_{a,j} \epsilon^a T^a_{ij} \phi_j$ , (B.17)
where $P_{ap}$ is a set of real numbers. The fact that R is irreducible means that any component
of $\phi$ can be transformed into any other component by the action of G. If we consider
a subalgebra that is not necessarily true anymore. In general we should expect that the
space of fields $\phi_i$ splits into irreducible blocks. Each block consists of fields that are linear
combinations of the $\phi_i$. The fields within a block can be transformed by H transformations
into each other, but not into other blocks. The original field splits into a set of fields,
each forming a representation of H. The sum of the dimensions of these H representations
is equal to the dimension of R. We write this as
$R \to r_1 + \ldots + r_N$ (B.19)
Some authors use the notation $\oplus$ instead of +.
To deal with subalgebras we have to know how the representations of G decompose
into representations of H. These decompositions are called the branching rules of the
representations with respect to the subalgebra embedding. If one knows the branching
rules for a sufficiently non-trivial representation of G, this provides enough information for
computing the branching rules of all other representations. The restriction to sufficiently
non-trivial representations is necessary to exclude the trivial case, the branching rule
$1 \to 1$, where 1 denotes the trivial representation with representation matrices $T^a = 0$.
Obviously this trivial branching rule contains no useful information; in general there may
also exist non-trivial representations that do not contain enough information. The
precise mathematical terminology for "sufficiently non-trivial" is "faithful". We will not try
to give precise definitions here, because we will focus on SU (N ) Lie algebras, and in that
case the N -dimensional vector representation is faithful.
This implies that if we know the branching rule for the vector representation, then
the decomposition of all other representations can be derived. There are several ways
of doing that. The most obvious one is to explicitly block-diagonalize the representation
matrices of H, starting with those of G. A simpler method is to write the representation of
interest in terms of Young tableaux and construct the representation as an appropriately
symmetrized tensor product of the fundamental representation. One can also use sum
rules for various traces of products of generators. The simplest example is the trace
over the identity, i.e. the dimension, which must match for the left- and right-hand side of
(B.19). One may also look up the result in tables, e.g. [28, 10], or use computer programs;
for some examples see the reference list of [10].
where $U_3$ and $U_2$ are unitary 3 × 3 and 2 × 2 matrices satisfying the relation $\det U_3 \det U_2 = 1$.
If we write $U_3 = e^{i\phi_3} \hat{U}_3$ and $U_2 = e^{i\phi_2} \hat{U}_2$ where $\hat{U}_3$ and $\hat{U}_2$ have determinant 1, then
we have identified the SU (3) and SU (2) subgroups. The phases must satisfy $3\phi_3 + 2\phi_2 = 0 \bmod 2\pi$.
This leaves one independent phase, corresponding to the U (1).
In the following we denote SU (5) representations by their dimension in bold face, and
the complex conjugate representation of an SU (5) representation by an asterisk. Below
we will derive the decomposition of the representations of most interest, namely the 5,
the 10 and the 24, the adjoint representation.
Decomposition of the Vector Representation. A vector $V^A$, A = 1, . . . , 5 may be
split into three components $V^a$, a = 1, 2, 3 and two components $V^i$, i = 4, 5. Under SU (5)
transformations V transforms as
$V^A \to \sum_{B=1}^{5} U^{AB} V^B = \sum_{a=1}^{3} U^{Aa} V^a + \sum_{i=4}^{5} U^{Ai} V^i$ (B.21)
To determine how this transforms under the SU (3) color group, we take only $U^{ab} \equiv U_3^{ab} \neq 0$.
The full matrix U must have determinant one, so we take $U^{ij} \equiv U_2^{ij} = \delta^{ij}$,
$U^{ai} = U^{ia} = 0$, and $\det U_3 = 1$. Then we find that
$V^a \to \sum_{b=1}^{3} U_3^{ab} V^b$ ; $V^i \to V^i$ (B.22)
Hence these components are respectively a vector and a singlet under SU (3). Similarly
for the SU (2) part of the subgroup we use $U_3^{ab} = \delta^{ab}$, $\det U_2 = 1$, and $U^{ai} = U^{ia} = 0$. Now
the two components are respectively a singlet and a doublet. Finally, the U (1) sub-group
acts via the diagonal SU (5) matrix
Now we know how the group SU (3) × SU (2) × U (1) acts on the five components of the
vector. By expanding these group elements around the identity element we obtain the
action of the Lie algebra generators. It follows that the representation 5 decomposes as
follows into representations of the SU (3) × SU (2) × U (1) subgroup
Here we have allowed for an arbitrary real factor q since the normalization of U (1) charges
is not fixed by the algebra. The SU (3) and SU (2) generators can simply be taken as a
subset of the SU (5) generators.
and we see that the two terms are separately invariant. It can be shown that the remaining
components are irreducible. This implies the tensor product rule
$N \otimes N^* = \mathrm{Adjoint} + 1$ , (B.27)
which agrees with the fact that the dimension of the adjoint representation in SU (N ) is
$N^2 - 1$. Now we can work out the decomposition of the adjoint representation by tensoring
the decomposed vectors
$[(3, 1, -\frac{1}{3} q) + (1, 2, \frac{1}{2} q)] \otimes [(3^*, 1, \frac{1}{3} q) + (1, 2, -\frac{1}{2} q)]$ (B.28)
We work this out term by term, using the tensor product rule (B.27) in SU (3) and SU (2).
Note that this produces two singlets, but we will have to remove one at the end, to account
for the trace in (B.27). Hence we get
$24 \to (8, 1, 0) + (1, 3, 0) + (1, 1, 0) + (3, 2, -\frac{5}{6} q) + (3^*, 2, \frac{5}{6} q)$ . (B.29)
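The bookkeeping behind the decomposition of the 24 can be reproduced mechanically. The sketch below is illustrative only: it assumes q = 1 and a sign convention in which the SU(2) doublet in the 5 has U(1) charge +1/2 and the SU(3) triplet has charge −1/3 (the relative sign is fixed by tracelessness of the U(1) generator), and it hard-codes only the few SU(3) and SU(2) tensor products needed here. All names are hypothetical.

```python
from fractions import Fraction

# Dimensions of the SU(3) and SU(2) irreps that occur, labelled by strings.
DIM_SU3 = {"1": 1, "3": 3, "3*": 3, "8": 8}
DIM_SU2 = {"1": 1, "2": 2, "3": 3}

# Only the tensor products needed here; 3 x 3* = 8 + 1 and 2 x 2 = 3 + 1 are
# the rule N x N* = Adjoint + 1 (the SU(2) doublet is equivalent to its conjugate).
SU3_PRODUCT = {("3", "3*"): ["8", "1"], ("3", "1"): ["3"],
               ("1", "3*"): ["3*"], ("1", "1"): ["1"]}
SU2_PRODUCT = {("2", "2"): ["3", "1"], ("2", "1"): ["2"],
               ("1", "2"): ["2"], ("1", "1"): ["1"]}

# Assumed charges (q = 1): triplet -1/3, doublet +1/2.
FIVE = [("3", "1", Fraction(-1, 3)), ("1", "2", Fraction(1, 2))]
FIVE_BAR = [("3*", "1", Fraction(1, 3)), ("1", "2", Fraction(-1, 2))]


def tensor(left, right):
    """Tensor two decomposed representations; U(1) charges simply add."""
    result = []
    for c1, w1, y1 in left:
        for c2, w2, y2 in right:
            for c in SU3_PRODUCT[(c1, c2)]:
                for w in SU2_PRODUCT[(w1, w2)]:
                    result.append((c, w, y1 + y2))
    return result


adjoint = tensor(FIVE, FIVE_BAR)
adjoint.remove(("1", "1", Fraction(0)))  # drop one singlet: the trace in N x N*

dimension = sum(DIM_SU3[c] * DIM_SU2[w] for c, w, _ in adjoint)
charge_trace = sum(DIM_SU3[c] * DIM_SU2[w] * y for c, w, y in adjoint)

for term in adjoint:
    print(term)
print(dimension, charge_trace)
```

The two sums at the end are the simplest sum rules mentioned earlier: the dimensions must add up to 24, and the trace of the U(1) charge over the representation must vanish.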
Note that the sum is over all C and D, but we can restrict it to the range C < D using
the anti-symmetry
$T^{AB} \to \sum_{C,D=1;\, C < D}^{5} (U^{AC} U^{BD} - U^{AD} U^{BC}) T^{CD}$ (B.32)
Now we split the indices as before. Then we get
$T^{AB} \to \sum_{c,d=1;\, c < d}^{3} (U^{Ac} U^{Bd} - U^{Ad} U^{Bc}) T^{cd}$
$\quad + \sum_{c=1}^{3} \sum_{i=4}^{5} (U^{Ac} U^{Bi} - U^{Ai} U^{Bc}) T^{ci}$
$\quad + \sum_{i,j=4;\, i < j}^{5} (U^{Ai} U^{Bj} - U^{Aj} U^{Bi}) T^{ij}$
So there are three components that transform into themselves under SU (3) × SU (2) × U (1),
and into each other under the full SU (5). These components are $T^{ab}$, $T^{ci}$ and $T^{ij}$. So let
us see how the subgroup SU (3) × SU (2) × U (1) acts on these components. The easiest
one is the U (1). The matrices U all take the diagonal form $U_Y$ shown in Eqn (B.23). We
see that the components $T^{ab}$ acquire the product of two of the phases of the components $V^a$,
corresponding to a U (1) charge $-\frac{2}{3} q$. With the convention q = 1 we
decided to use above this implies that $T^{ab}$ has charge $-\frac{2}{3}$. Similarly, $T^{ij}$ has charge
$\frac{1}{2} + \frac{1}{2} = 1$, and $T^{ai}$ has charge $-\frac{1}{3} + \frac{1}{2} = \frac{1}{6}$.
Now consider the SU (2) subgroup, choosing again $U_3^{ab} = \delta^{ab}$, $\det U_2 = 1$, and
$U^{ai} = U^{ia} = 0$. We see that $T^{ab}$ transforms into itself with the matrix
$\delta^{ac}\delta^{bd} - \delta^{ad}\delta^{bc} = \delta^{ac}\delta^{bd}$
because the second term vanishes if a < b and c < d. The combination $T^{aj}$ transforms to
$T^{aj} \to \sum_{c=1}^{3} \sum_{i=4}^{5} (U^{ac} U^{ji} - U^{ai} U^{jc}) T^{ci} = \sum_{i=4}^{5} U^{ji} T^{ai}$ (B.33)
Here the only choice of indices that is possible is k = 4, l = 5, i = 4, j = 5, and the factor
is $U^{44} U^{55} - U^{45} U^{54} = \det U_2 = 1$, hence $T^{kl}$ is a singlet under SU (2).
Finally consider SU (3), i.e. $U^{ij} \equiv U_2^{ij} = \delta^{ij}$, $U^{ai} = U^{ia} = 0$, and $\det U_3 = 1$. It is
easy to see that $T^{aj}$ transforms with an SU (3) matrix $U_3^{ab}$ and is therefore a vector, and
that $T^{ij}$ transforms into itself, and is therefore an SU (3) singlet. For $T^{ab}$ we find
$T^{ab} \to \sum_{c,d=1;\, c < d}^{3} (U^{ac} U^{bd} - U^{ad} U^{bc}) T^{cd}$ (B.35)
We see now that the transformation of the components $S^f$ is
$S^f \to V^{fg} S^g$ (B.37)
with
$V^{fg} = \epsilon^{fab} U^{ac} U^{bd} \epsilon^{gcd}$ (B.38)
To determine the matrix $V^{fg}$ we multiply it with $U^{eg}$ and sum over g. Then we get
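The outcome of this manipulation can be checked numerically. The sketch below assumes the normalization $V^{fg} = \frac{1}{2} \epsilon^{fab} \epsilon^{gcd} U^{ac} U^{bd}$ with unrestricted sums over a, b, c, d (the factor 1/2 is an assumption made here to compensate for the double counting) and verifies, for one arbitrarily chosen SU(3) matrix, that V equals the complex conjugate of U, so that the anti-symmetric components transform in the conjugate representation. The construction of the test matrix and all names are illustrative.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def plane_rotation(i, j, theta, phi):
    """Unitary 3x3 matrix with unit determinant acting in the (i, j) plane."""
    u = [[1.0 + 0j if r == c else 0j for c in range(3)] for r in range(3)]
    u[i][i] = u[j][j] = complex(math.cos(theta))
    u[i][j] = math.sin(theta) * cmath.exp(1j * phi)
    u[j][i] = -math.sin(theta) * cmath.exp(-1j * phi)
    return u


def levi_civita(i, j, k):
    """Totally anti-symmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


# An arbitrary SU(3) element: three plane rotations and a traceless phase matrix.
phases = [0.9, -0.2, -0.7]  # sums to zero, so the determinant stays 1
diagonal = [[cmath.exp(1j * phases[r]) if r == c else 0j for c in range(3)]
            for r in range(3)]
U = matmul(
    matmul(plane_rotation(0, 1, 0.3, 0.7), plane_rotation(1, 2, 1.1, -0.4)),
    matmul(plane_rotation(0, 2, 0.5, 2.0), diagonal),
)

# V^{fg} = (1/2) eps^{fab} eps^{gcd} U^{ac} U^{bd}
V = [[0.5 * sum(levi_civita(f, a, b) * levi_civita(g, c, d) * U[a][c] * U[b][d]
                for a in range(3) for b in range(3)
                for c in range(3) for d in range(3))
      for g in range(3)] for f in range(3)]

deviation = max(abs(V[f][g] - U[f][g].conjugate())
                for f in range(3) for g in range(3))
print(deviation)
```

The check works because V is the cofactor matrix of U, which for a unitary matrix of determinant one coincides with its complex conjugate.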
C.1 Scalars
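Real massive scalars have a free action
$\mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2$ (C.1)
For complex scalars one has
$\mathcal{L} = \partial_\mu \phi^* \partial^\mu \phi - m^2 \phi^* \phi$ (C.2)
The normalizations are such that the latter expression reduces to two copies of the former
if we define $\phi_1 = \frac{1}{\sqrt{2}} (\phi + \phi^*)$ and $\phi_2 = \frac{-i}{\sqrt{2}} (\phi - \phi^*)$.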
Scalars that transform in a complex or pseudo-real representation of any global or
local symmetry must be complex, since the transformations cannot maintain their reality.
Scalars in real representations may be real or complex, but in the latter case one may
always decompose them into two real scalars.
C.2 Fermions
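The Lagrangian for a massive fermion is
$\mathcal{L} = i \bar{\psi} \gamma^\mu \partial_\mu \psi - m \bar{\psi} \psi$ (C.3)
$\{\gamma^\mu, \gamma^\nu\} = 2 \eta^{\mu\nu}$ (C.4)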
We can (and will) choose them in such a way that $\gamma^0$ is Hermitean and the other three
are anti-Hermitean. The conjugate spinor is defined as $\bar{\psi} = \psi^\dagger \gamma^0$. The matrix $\gamma_5$ is
defined by
$\gamma_5 \equiv \gamma^5 = i \gamma_0 \gamma_1 \gamma_2 \gamma_3$ , (C.5)
and is Hermitean. In any representation of the $\gamma$-matrices there is a unitary matrix C so
that
$\gamma_\mu^T = -C \gamma_\mu C^{-1}$ (C.6)
Left- or right-handed Weyl spinors are defined by means of the projection operators
$P_L = \frac{1}{2} (1 + \gamma_5)$ ; $P_R = \frac{1}{2} (1 - \gamma_5)$ . (C.7)
They satisfy $P_R P_R = P_R$; $P_L P_L = P_L$ and $P_L P_R = 0$. The left and right-handed
components of a field $\psi$ are defined as $\psi_L = P_L \psi$; $\psi_R = P_R \psi$. Due to the projections $\psi_L$ and
$\psi_R$ are effectively two-component spinors, called Weyl spinors. A Dirac spinor has four
complex degrees of freedom, a Weyl-spinor only two.
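$[M^{\mu\nu}, M^{\rho\sigma}] = i (g^{\nu\rho} M^{\mu\sigma} - g^{\mu\rho} M^{\nu\sigma} + g^{\mu\sigma} M^{\nu\rho} - g^{\nu\sigma} M^{\mu\rho})$ . (C.8)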
The operator that represents internal angular momentum on fermions must have the same
commutation relations, and this identifies it uniquely as
$\Sigma^{\mu\nu} = \frac{i}{4} [\gamma^\mu, \gamma^\nu]$ , (C.9)
$S^i = \frac{i}{4} \epsilon^{ijk} \gamma^j \gamma^k$ (C.10)
To make sure that the signs are correct one can check $[S^i, S^j] = i \epsilon^{ijk} S^k$. [This is valid
in the (+ − − −) metric. In the (− + + +) metric we have $[x^\mu, p^\nu] = +i g^{\mu\nu}$, and hence
the left-hand side of Eq. (C.8) changes sign. However in both metrics one usually chooses
$\{\gamma^\mu, \gamma^\nu\} = +2 g^{\mu\nu}$. Therefore the definitions of $\Sigma^{\mu\nu}$ and $S^i$ change sign.]
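The sign check suggested here is easy to automate. The sketch below is an illustration under stated assumptions: it uses the Dirac representation of the $\gamma$ matrices and the (+ − − −) metric, builds $\Sigma^{\mu\nu} = \frac{i}{4} [\gamma^\mu, \gamma^\nu]$ and the spin operators $S^1 = \Sigma^{23}$, $S^2 = \Sigma^{31}$, $S^3 = \Sigma^{12}$, and tests both the angular momentum algebra and a Lorentz algebra of the form $[M^{\mu\nu}, M^{\rho\sigma}] = i (g^{\nu\rho} M^{\mu\sigma} - g^{\mu\rho} M^{\nu\sigma} + g^{\mu\sigma} M^{\nu\rho} - g^{\nu\sigma} M^{\mu\rho})$. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Dirac representation (an assumed choice): GAMMA[mu] is gamma^mu.
GAMMA = [block(ID2, ZERO2, ZERO2, scale(-1, ID2))] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI
]
METRIC = [1, -1, -1, -1]  # diagonal of g^{mu nu}


def g(m, n):
    return METRIC[m] if m == n else 0


SIGMA = [[scale(0.25j, commutator(GAMMA[m], GAMMA[n])) for n in range(4)]
         for m in range(4)]
SPIN = [SIGMA[2][3], SIGMA[3][1], SIGMA[1][2]]

# [S^i, S^j] = i eps^{ijk} S^k for the three cyclic index choices.
spin_algebra_ok = all(
    close(commutator(SPIN[i], SPIN[j]), scale(1j, SPIN[k]))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
)


def lorentz_rhs(m, n, r, s):
    terms = [(g(n, r), SIGMA[m][s]), (-g(m, r), SIGMA[n][s]),
             (g(m, s), SIGMA[n][r]), (-g(n, s), SIGMA[m][r])]
    total = scale(0, SIGMA[0][1])
    for coefficient, matrix in terms:
        total = add(total, scale(1j * coefficient, matrix))
    return total


lorentz_algebra_ok = all(
    close(commutator(SIGMA[m][n], SIGMA[r][s]), lorentz_rhs(m, n, r, s))
    for m in range(4) for n in range(4) for r in range(4) for s in range(4)
)
print(spin_algebra_ok, lorentz_algebra_ok)
```

Because the algebra is basis independent, the same check should succeed in any other representation of the $\gamma$ matrices with the same anti-commutation relations.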
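The helicity of a fermion is the eigenvalue of the operator $\vec{S} \cdot \frac{\vec{p}}{|\vec{p}|}$; the chirality is the
eigenvalue of $\gamma_5$. To relate the two (for massless particles) we use the Dirac equation
$\gamma_\mu p^\mu \psi = 0$. Using $p^\mu = (E, \vec{p})$ and choosing $\vec{p}$ along the x-axis, $\vec{p} = (p, 0, 0)$ we get
$(E \gamma_0 + p \gamma_1) \psi = 0$, or, after raising some indices, $p \gamma^1 \psi = E \gamma^0 \psi$. Now consider the action
of the helicity operator: $\vec{S} \cdot \vec{p}\, \psi = \frac{i}{2} p \gamma^2 \gamma^3 \psi$. A fermion with chirality + satisfies
$\gamma_5 \psi = i \gamma_0 \gamma_1 \gamma_2 \gamma_3 \psi = \psi \;\Rightarrow\; \gamma^2 \gamma^3 \psi = i \gamma^0 \gamma^1 \psi$ (C.11)
Hence $\vec{S} \cdot \vec{p}\, \psi = -\frac{1}{2} p \gamma^0 \gamma^1 \psi$, which with the help of the Dirac equation becomes
$-\frac{1}{2} p \gamma^0 \gamma^1 \psi = -\frac{1}{2} E (\gamma^0)^2 \psi = -\frac{1}{2} |p| \psi$ . (C.12)
This means that this fermion has its spin opposite to its momentum and is, by
definition, left-handed. We see that with our definition of $\gamma_5$ this corresponds to positive
chirality, or $P_L = \frac{1}{2} (1 + \gamma_5)$.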
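The relation between chirality and helicity can also be verified with explicit matrices. The following sketch is illustrative and makes its own choices: the Dirac representation of the $\gamma$ matrices, the (+ − − −) metric, $\gamma_5 = i \gamma_0 \gamma_1 \gamma_2 \gamma_3$ with lower indices, a massless momentum with E = p = 1 along the x-axis, and an arbitrary seed spinor. It constructs a spinor of chirality +1 that solves the massless Dirac equation and checks that its helicity is −1/2.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def vectors_close(u, v, tol=1e-12):
    return all(abs(x - y) < tol for x, y in zip(u, v))


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ID4 = block(ID2, ZERO2, ZERO2, ID2)

# Dirac representation (an assumed choice): GAMMA[mu] is gamma^mu.
GAMMA = [block(ID2, ZERO2, ZERO2, scale(-1, ID2))] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI
]
# Lower indices in the (+---) metric: gamma_0 = gamma^0, gamma_i = -gamma^i.
GAMMA_LOWER = [GAMMA[0]] + [scale(-1, GAMMA[i]) for i in (1, 2, 3)]

product = GAMMA_LOWER[0]
for m in (1, 2, 3):
    product = matmul(product, GAMMA_LOWER[m])
GAMMA5 = scale(1j, product)
P_LEFT = scale(0.5, add(ID4, GAMMA5))

# Massless momentum along x with E = p = 1: gamma_mu p^mu = gamma^0 - gamma^1.
P_SLASH = add(GAMMA[0], scale(-1, GAMMA[1]))

# Any spinor of the form P_L p-slash chi has chirality +1 and solves the
# massless Dirac equation, because p-slash squared vanishes.
chi = [1, 0, 0, 0]
psi = matvec(P_LEFT, matvec(P_SLASH, chi))

# Helicity operator for momentum along x: S^1 = (i/2) gamma^2 gamma^3.
SPIN_X = scale(0.5j, matmul(GAMMA[2], GAMMA[3]))

norm = sum(abs(x) ** 2 for x in psi)
solves_dirac = vectors_close(matvec(P_SLASH, psi), [0, 0, 0, 0])
chirality_plus = vectors_close(matvec(GAMMA5, psi), psi)
helicity_minus = vectors_close(matvec(SPIN_X, psi), [-0.5 * x for x in psi])
print(norm, solves_dirac, chirality_plus, helicity_minus)
```

The seed spinor `chi` is arbitrary; any choice for which `psi` does not vanish leads to the same conclusion.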
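$\bar{\psi} = (\psi^c)^T C$ , (C.13)
$\mathcal{L} = \frac{i}{2} \bar{\psi} \gamma^\mu \partial_\mu \psi - \frac{1}{2} m \bar{\psi} \psi$ (C.14)
This form obscures the fact that $\bar{\psi}$ and $\psi$ are not to be treated as independent variables,
as they are for complex fermions. Therefore it is better to express $\bar{\psi}$ in terms of $\psi$ using
the Majorana condition $\bar{\psi} = (\psi^c)^T C$. Then we get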
In the absence of a mass term there is not really any difference between Majorana and
Weyl spinors. We may write
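$\frac{i}{2} \bar{\psi} \gamma^\mu \partial_\mu \psi = \frac{i}{2} (\bar{\psi}_L \gamma^\mu \partial_\mu \psi_L + \bar{\psi}_R \gamma^\mu \partial_\mu \psi_R)$ , (C.16)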
and then substitute R = (( c )L )T in the second term. Using the Majorana property
plus a little algebra one finds that the second term is now transformed into the first one.
Hence for Majorana fermions
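$\frac{i}{2} \bar{\psi} \gamma^\mu \partial_\mu \psi = i \bar{\psi}_L \gamma^\mu \partial_\mu \psi_L = i \bar{\psi}_R \gamma^\mu \partial_\mu \psi_R$ . (C.17)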
The difference between Majorana and Weyl fermions becomes essential if one assigns
them to representations of local or global symmetries, and writes down mass terms. For
Majorana fermions the representations must be real, and masses are allowed, while for
Weyl fermions the representation can be complex, but then one cannot write down an
invariant mass term.
$[T^a, T^b] = i f^{abc} T^c$ (C.18)
Often we use Lie-algebra-valued fields instead of components. They are defined as follows
$A_\mu \equiv -i g A_\mu^a T^a$ ; $F_{\mu\nu} \equiv -i g F_{\mu\nu}^a T^a$ , (C.22)
$D_\mu = \partial_\mu + A_\mu$ (C.23)
The generators are in the representation of the fields on which they act.
motion are invariant under space inversion, or, in other words, one cannot distinguish the
time evolution of fields from its mirror image
Note that one should consider the Lagrangian (or the Hamiltonian), and not the
Lagrangian density. The latter is in general not invariant, but may change from $\mathcal{L}(\vec{x})$ to
$\mathcal{L}(-\vec{x})$ (this denotes the full $\vec{x}$-dependence, explicit or implicit via the fields). If this is the
only change, the space-integral of $\mathcal{L}(\vec{x})$ is of course invariant, and so is the Hamiltonian.
Classically the transformation we need to consider is therefore a replacement of all
fields by their mirror image
This is true in the simplest case, but the transformation may be more complicated. For
fields with several components due to spin (or perhaps even external degrees of freedom)
one may allow in addition to this also a transformation of these spin components, dictated
by the requirement of invariance. So in this more general situation we can consider
where $P^{ij}$ is some matrix. Obviously the square of P should be 1, since two space
inversions equal the identity (for fermion fields $P^2$ may in fact be −1). Even if there is
just one component P can be non-trivial, namely a sign, the intrinsic parity of the field.
Note that only the fields are transformed. In fact, in a local field theory there is
no explicit dependence on x, so there is nothing else to transform. However, let us,
for the sake of the argument, consider for a moment an example where there is explicit
dependence on x: $\mathcal{L}(\phi(x), x) = \partial_\mu \phi(x) \partial^\mu \phi(x) + n_\mu x^\mu \phi^2(x)$. This theory is not parity
invariant. For example, if $\phi_0(x)$ satisfies the equations of motion, then $\phi_0(-x)$ does not.
Suppose, however, we take for parity transformation:
Then the sign would disappear when we integrate $\mathcal{L}$ to get the action, and perform a
change of integration variables $x \to -x$. This would lead to the incorrect conclusion that
this theory is parity invariant. In other words: replacing x by −x is only a field relabeling,
and not a parity transformation.
Although explicit x-dependence of L never occurs, derivatives do appear, and their
transformation may be a source of confusion. Again we need to keep in mind that we
are interchanging dynamical variables labeled by $\vec{x}$. In one dimension, consider $\partial_x \phi(x)$.
Does this transform to $\partial_x \phi(-x)$ or $-\partial_x \phi(-x)$? To get the correct answer consider the
infinitesimal form of the derivative
$\lim_{\epsilon \to 0} \frac{\phi(x + \epsilon) - \phi(x)}{\epsilon}$ (C.27)
[Footnote: This is slightly misleading language since a mirror reflects only one axis; an inversion is a reflection
combined with a rotation, which we assume to be a symmetry anyway.]
The symmetry transformation changes $\phi(y)$ to $\phi(-y)$ for all y, and hence the derivative
changes to
$\lim_{\epsilon \to 0} \frac{\phi(-x - \epsilon) - \phi(-x)}{\epsilon} = -\partial_x \phi(-x)$ . (C.28)
This is in fact just another manifestation of the fact that we transform the fields, and not
$\vec{x}$.
In quantum field theory parity is represented by an operator P which acts on the Fock
space (the multi-particle Hilbert space). If parity is a symmetry, then P commutes with
the Hamiltonian, and the time evolution of states $|b\rangle$ and their mirror images $\mathcal{P} |b\rangle$
are related:
$\langle a| \mathcal{P}^\dagger e^{iHt} \mathcal{P} |b\rangle = \langle a| e^{iHt} |b\rangle$ . (C.29)
The Hamiltonian is directly related to the Lagrange density, and since we usually work
with the latter, we wish to check the invariance for L rather than H.
Thus we should consider $\mathcal{P}^\dagger \mathcal{L} \mathcal{P}$. The operators act only on the factors in the
Lagrangian density that are themselves operators, i.e. the fields, and not on coupling
constants, derivatives, group generators, gamma matrices or whatever else might appear in a
Lagrangian. The parity operators change every field $\phi$ according to the rule $\phi \to \mathcal{P}^\dagger \phi \mathcal{P}$.
The result of this operation must correspond to the classical transformation, i.e.
$\mathcal{P}^\dagger \phi^i(x) \mathcal{P} = P^{ij} \phi^j(x_P)$ (C.30)
For scalar fields the 1 × 1 matrix P is either 1 or −1. In the latter case they are called
pseudo-scalars. This is manifestly a symmetry of the scalar action (C.1). Note that it is
necessary (and allowed) to replace $\partial_\mu = \frac{\partial}{\partial x^\mu}$ by $\frac{\partial}{\partial x_P^\mu} \equiv \partial_\mu^P$, since this derivative appears
only contracted with another derivative.
The parity transformation for fermions involves a non-trivial matrix P . The action of
a Dirac fermion is transformed to
$\mathcal{P}^\dagger \mathcal{L} \mathcal{P} = i \bar{\psi}(x_P) \gamma^0 P^\dagger \gamma^0 \gamma^\mu \partial_\mu P \psi(x_P) - m \bar{\psi}(x_P) \gamma^0 P^\dagger \gamma^0 P \psi(x_P)$ (C.31)
We want to change $\partial_\mu$ to $\partial_\mu^P$ and get rid of the matrices P . For the kinetic terms we get
the requirement
$P^\dagger \gamma^0 \gamma^\mu P = \gamma^0 \gamma^{\mu,P}$ , (C.32)
and for the mass terms
$P^\dagger \gamma^0 P = \gamma^0$ . (C.33)
The first condition for $\mu = 0$ requires P to be unitary. Then the second condition,
substituted into the first one yields
$P^\dagger \gamma^\mu P = \gamma^{\mu,P}$ , (C.34)
i.e. all three space components of $\gamma^\mu$ should change sign. The matrix that achieves this
is unique up to a phase: $P = i \gamma^0$ (the factor i is not essential here, see however [5] for a
justification). Hence
$\mathcal{P} \psi(x) \mathcal{P}^{-1} = i \gamma^0 \psi(x_P)$ , (C.35)
[Footnote: To prove uniqueness one can use the fact that the only unitary matrix that commutes with all $\gamma^\mu$ is
the unit matrix times a phase.]
where $\mathcal{P}$ is the operator acting on the Hilbert space. From this we can derive the action
on chiral fermions,
$\mathcal{P} \psi_{L,R}(x) \mathcal{P}^{-1} = i \gamma^0 \psi_{R,L}(x_P)$ . (C.36)
Obviously any matrix P that changes the sign of three of the four Dirac matrices must
also change the sign of $\gamma_5$.
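The properties of the parity matrix can be confirmed with explicit matrices. The sketch below assumes the Dirac representation of the $\gamma$ matrices and takes $\gamma_5$ proportional to the product of the four $\gamma$ matrices (the overall sign does not matter for this check); the helper names are illustrative. It verifies that $P = i \gamma^0$ is unitary, leaves $\gamma^0$ unchanged, and flips the sign of the three space components and of $\gamma_5$.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Hermitean conjugate of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ID4 = block(ID2, ZERO2, ZERO2, ID2)

# Dirac representation (an assumed choice): GAMMA[mu] is gamma^mu.
GAMMA = [block(ID2, ZERO2, ZERO2, scale(-1, ID2))] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI
]
product = GAMMA[0]
for m in (1, 2, 3):
    product = matmul(product, GAMMA[m])
GAMMA5 = scale(1j, product)  # sign convention irrelevant for this check

P = scale(1j, GAMMA[0])


def conjugate_by_p(matrix):
    """Return P^dagger matrix P."""
    return matmul(matmul(dagger(P), matrix), P)


p_is_unitary = close(matmul(dagger(P), P), ID4)
time_component_kept = close(conjugate_by_p(GAMMA[0]), GAMMA[0])
space_components_flipped = all(
    close(conjugate_by_p(GAMMA[i]), scale(-1, GAMMA[i])) for i in (1, 2, 3)
)
gamma5_flipped = close(conjugate_by_p(GAMMA5), scale(-1, GAMMA5))
print(p_is_unitary, time_component_kept, space_components_flipped, gamma5_flipped)
```

Since $\gamma_5$ changes sign, the projectors $P_L$ and $P_R$ are interchanged, which is the matrix-level statement behind the exchange of left- and right-handed fields under parity.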
To summarize, the role of the parity transformation matrix $i \gamma^0$ is to ensure that the
vector $\bar{\psi} \gamma^i \psi$ transforms under parity like a vector. Parity reversal is not a symmetry of
the kinetic terms of the left-handed fields alone.
For couplings of gauge fields not to destroy parity (if it is a symmetry without the
coupling to gauge fields), the transformation of $\partial_\mu$ must be the same as that of $D_\mu$. Hence
the space components of $A_\mu$ must change sign, while the time component does not; in
other words, $A_\mu$ must transform like a vector. A quantity transforming with an extra −
sign is called a pseudo-vector.
The fermion bi-linears transform as follows: $\bar{\psi} \psi$ is a scalar, $\bar{\psi} \gamma_5 \psi$ a pseudo-scalar,
$\bar{\psi} \gamma^\mu \psi$ a vector and $\bar{\psi} \gamma^\mu \gamma_5 \psi$ a pseudo-vector.
i C i C = CRij j , (C.38)
where CR is a unitary matrix that depends on the representation R of the scalar with
respect to the gauge group under consideration. Conjugating twice we find
so that $C_R C_R^* = 1$, or $C_R = C_R^T$. From the kinetic terms alone we get no further constraints
on this matrix, but if we consider coupling to gauge bosons the issue of invariance becomes
non-trivial.
Derivatives $\partial_\mu \phi$ transform exactly like $\phi$, and hence covariant derivatives $D_\mu \phi$ must also
transform like $\phi$. We have
D C D i C = C D CCR . (C.40)
C D C = CR D CR (C.41)
The ordinary derivative part of $D_\mu$ transforms trivially, but the operator $\mathcal{C}$ does act
non-trivially on the field $A_\mu$. From Eq. (C.41) we get
where $T_R^a$ denotes a generator in the representation R. Since $A_\mu^a$ is a real field, it transforms
with some real matrix $C_A$ (where A stands for adjoint), as follows
$\mathcal{C} A_\mu^a \mathcal{C}^\dagger = C_A^{ab} A_\mu^b$ (C.43)
We can have scalars in many different representations, but the transformation $C_A$ acts
only on one set of fields $A_\mu$, and hence $C_A$ must be independent of R. Such a set of
matrices always exists, as we will show below explicitly for SU (N ) (charge conjugation
is in fact nothing but a "space-inversion" in the root space of the Lie algebra under
consideration). In the special case of a Hermitean U (1) generator $Q = Q^*$ the equation
reads $\mathcal{C} A_\mu \mathcal{C}^\dagger = -A_\mu$, which implies that a vector boson has charge parity (C-parity)
−1.
The matrices $C_R$ are not unique, but change under a basis transformation of the
representation R. Suppose $T_R' = S_R^\dagger T_R S_R$, for some unitary $S_R$. Here $T_R$ is a generator
in the representation R, and obviously so is $T_R'$. Then (in this derivation we omit the
subscript R on C, S and T )
any desired phase, e.g. $C_R = 1$. In other words, C-parity is not defined for charged fields.
For a real, one-component scalar $C_R$ must be real, and then $C_R C_R^* = 1$ has two solutions,
namely $C_R = \pm 1$. The basis transformation $S_R$ must be real as well for real fields, so that
we cannot change $C_R$.
Similar remarks apply to scalars in representations of non-abelian groups. Since $C_R$ is
a unitary, symmetric matrix, we can define the square root of $C_R$, which is also a unitary
symmetric matrix. If we now set $S = C_R^{1/2}$ we find that the transformed $C_R$ equals 1. Hence for complex
fields we can always choose $C_R = 1$, provided we make a suitable choice of the generators.
This may not be possible for real fields: note that S is in general a complex matrix, even
if $C_R$ is real. Then transforming by S may make the generators $i T^a$ complex, whereas for
real fields we need them to be real.
For example for SU (N ) all representations can be obtained as tensor products of
the fundamental representation, with real projections. One usually chooses a basis so
that the N − 1 Cartan sub-algebra generators are real (and diagonal), while half of the
remaining $N^2 - N$ generators are real, and the other half purely imaginary. If this
choice is made in the fundamental representation, the reality properties are the same
for all representations (in other words the same generators $T_R^a$ are always either real or
imaginary, if the generators are constructed using the tensor method). Then the matrix
$C_A^{ab}$ is diagonal and equal to −1 for the real generators, and 1 for the imaginary ones, so
that in any representation R
$T_R^a C_A^{ab} = -(T_R^b)^*$ . (C.46)
Hence we have explicitly satisfied (C.44) with $C_R = 1$. Note however that the tensor
procedure does not produce a real basis for real representations. Indeed, (C.46) is
obviously not valid for a real basis if $C_A$ is non-trivial. If we transform $T_R^a$ to a real basis we
get in terms of the real generators the transformation (C.44), with a non-trivial matrix
$C_R = S S^T$ generated by transforming to a real basis using (C.45). This shows how (C.44)
can indeed be satisfied for any SU (N ) representation.
For fermions the action of charge conjugation is slightly more complicated because of
the fact that they are in a spinor representation of the space-time symmetry group. For
Weyl spinors these representations are complex in four dimensions, and hence transform
into an inequivalent Weyl spinor, describing a particle with opposite helicity. Dirac spinors
transform into themselves since they contain two Weyl spinors of opposite parity, but there
is still a non-trivial matrix in the transformation. We will consider first fermions that are
in a trivial representation of any gauge group. Hence the matrix C ij only acts on spin
indices.
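The transformation rule is
$\mathcal{C} \psi \mathcal{C}^\dagger = C_F \psi^*$ , (C.47)
where $C_F$ is some matrix to be determined. Consider now some current $\bar{\psi} \Gamma \psi$, where $\Gamma$ is
some product of $\gamma$ matrices. This current transforms to
$\mathcal{C} \bar{\psi} \Gamma \psi\, \mathcal{C}^\dagger = \psi^T (C_F)^\dagger (\gamma^0) \Gamma C_F \psi^*$
$\quad = -\psi^\dagger (C_F)^T \Gamma^T (\gamma^0)^T (C_F)^* \psi$
$\quad = -\bar{\psi} \gamma^0 (C_F)^T \Gamma^T (\gamma^0)^T (C_F)^* \psi$
In the second step we took the transpose of the entire quantity (which is a number) and
introduced a − sign because in the process we anti-commute two fermions.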
There are two cases of interest, namely $\Gamma = \gamma^\mu$, and $\Gamma = 1$, corresponding respectively
to the kinetic terms and the mass terms. Invariance of the kinetic terms leads to the
requirement
$\gamma^0 (C_F)^T (\gamma^\mu)^T (\gamma^0)^T (C_F)^* = \gamma^\mu$ (C.48)
and invariance of the mass term to
[Note that in the first condition there is an extra − sign due to the fact that the derivative
has to be partially integrated so that it acts on $\psi$ rather than $\bar{\psi}$.]
As in the discussion of parity the $\mu = 0$ component of the first condition implies that
$C_F$ must be unitary. It is not hard to show that the unique solution (up to a phase) for
$C_F$ is then
$C_F = C^{-1} (\gamma^0)^T$ , (C.50)
where C is the matrix introduced earlier in this appendix, satisfying
$C \gamma^\mu C^{-1} = -(\gamma^\mu)^T$ , (C.51)
with $C^{-1} = C^\dagger$ and $C = -C^T$. The precise form of the matrix depends on the
representation one chooses for the $\gamma$ matrices, but it can be shown that in four dimensions it
always satisfies $C = -C^T$.
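These statements can be tested in a concrete representation. The sketch below assumes the Dirac representation of the $\gamma$ matrices, in which a common choice is $C = i \gamma^2 \gamma^0$; this explicit form is an assumption made for the illustration and is not taken from the text above. It checks unitarity, anti-symmetry, the relation $C \gamma^\mu C^{-1} = -(\gamma^\mu)^T$, and the two invariance conditions for $C_F = C^{-1} (\gamma^0)^T$ in the form $\gamma^0 C_F^T (\gamma^\mu)^T (\gamma^0)^T C_F^* = \gamma^\mu$ and $-\gamma^0 C_F^T (\gamma^0)^T C_F^* = 1$.

```python
def matmul(*factors):
    """Product of one or more square matrices stored as nested lists."""
    result = factors[0]
    for b in factors[1:]:
        n = len(result)
        result = [[sum(result[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                  for i in range(n)]
    return result


def scale(c, a):
    return [[c * x for x in row] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def conj(a):
    return [[x.conjugate() for x in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ID2 = [[1, 0], [0, 1]]
ZERO2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ID4 = block(ID2, ZERO2, ZERO2, ID2)

# Dirac representation (an assumed choice): GAMMA[mu] is gamma^mu.
GAMMA = [block(ID2, ZERO2, ZERO2, scale(-1, ID2))] + [
    block(ZERO2, s, scale(-1, s), ZERO2) for s in PAULI
]

C = scale(1j, matmul(GAMMA[2], GAMMA[0]))  # assumed explicit form of C
C_INV = transpose(conj(C))                 # equals C^{-1} if C is unitary

c_is_unitary = close(matmul(C, C_INV), ID4)
c_is_antisymmetric = close(transpose(C), scale(-1, C))
conjugation_ok = all(
    close(matmul(C, GAMMA[m], C_INV), scale(-1, transpose(GAMMA[m])))
    for m in range(4)
)

C_F = matmul(C_INV, transpose(GAMMA[0]))
kinetic_ok = all(
    close(matmul(GAMMA[0], transpose(C_F), transpose(GAMMA[m]),
                 transpose(GAMMA[0]), conj(C_F)), GAMMA[m])
    for m in range(4)
)
mass_ok = close(
    scale(-1, matmul(GAMMA[0], transpose(C_F), transpose(GAMMA[0]), conj(C_F))),
    ID4,
)
print(c_is_unitary, c_is_antisymmetric, conjugation_ok, kinetic_ok, mass_ok)
```

In another representation of the $\gamma$ matrices the explicit form of C changes, but the same five checks should still hold for the corresponding matrix.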
Let us now consider other choices for $\Gamma$. We define
$C \Gamma C^{-1} = \eta_\Gamma \Gamma^T$ , (C.52)
$C \gamma_5 C^{-1} = i \gamma_0^T \gamma_1^T \gamma_2^T \gamma_3^T$ , (C.54)
which equals $\gamma_5^T$ after one re-orders the four factors (which does not produce a sign flip).
Hence $\eta_{\gamma_5} = 1$. Then $\eta_{\gamma^\mu \gamma_5} = 1$ as well. Hence the axial vector $\bar{\psi} \gamma^\mu \gamma_5 \psi$ transforms into
itself, and hence the vector current and the axial vector current transform with opposite
sign under charge conjugation. Therefore the chiral action $i \bar{\psi}_L \gamma^\mu \partial_\mu \psi_L$ is not invariant,
but transforms into the same expression with L replaced by R.
[Footnote: The attentive reader may be confused by the apparent contradiction between the following operations:
when proving Hermiticity of the Lagrangian we have $(\psi^\dagger \gamma^0 \psi)^\dagger = (\psi^\dagger \gamma^0 \psi)$ (no sign change), whereas
here we have $(\psi^T (\gamma^0)^T \psi^*)^T = -\psi^\dagger \gamma^0 \psi$ (sign change). This difference is due to the fact that the † in
the first expression is a hermitean conjugate in Hilbert space, which takes $(AB)^\dagger$ to $B^\dagger A^\dagger$ irrespective of
their commutation relations, whereas the T in the second expression acts only on the spinor labels.]
Apart from this helicity flip, the coupling of fermions to gauge bosons is invariant
under C. The gauge current $\bar{\psi} \gamma^\mu T^a \psi \equiv j^{\mu,a}$ transforms to $-\bar{\psi} \gamma^\mu (T^a)^* \psi$, where we used the
fact that the generators are hermitean. Hence the action for a minimally coupled Dirac
fermion transforms to
$i \mathcal{C} \bar{\psi} \gamma^\mu D_\mu \psi\, \mathcal{C}^\dagger = i \bar{\psi} \gamma^\mu (\partial_\mu + i g (T^a)^* \mathcal{C} A_\mu^a \mathcal{C}^\dagger) \psi$ . (C.55)
The rest of the discussion is identical to the earlier discussion of the transformation of
covariant derivatives for scalar fields: in addition to the matrix $C_F$ acting on the spinor
indices, we need a second, representation dependent matrix $C_R$, satisfying (C.44). If we
define
$\mathcal{C} \psi \mathcal{C}^\dagger = C_F C_R \psi^*$ , (C.56)
then the transformed interaction term becomes
Now we complex conjugate (C.44), and make use of the fact that $C_A$ is real (since it
transforms the real fields $A_\mu$). Hence
$(T^a)^* C_A^{ab} = -C_R^* T^b C_R^T$ (C.58)
where $\mathcal{T}$ is an anti-unitary operator and T a matrix in the internal (spin) and/or external
space of degrees of freedom of the field. Here $x_T$ is the four-vector $(-x^0, x^i)$.
The theory is invariant under time reversal if (and only if)
and for a time evolution of time-reversed states we find the following. If we start with an
amplitude
A(t) = ha| eiH(t)t |bi , (C.62)
then for the time reversed states the time evolution amplitude is
ha| T eiH(t)t T |bi = ha| eiH(t)t |bi = A(t) (C.63)
From the Hamiltonian point of view space and time reversal are not treated symmet-
rically, since time plays a special role. However, from the point of view of the action the
concept of symmetry is the same in both cases: the action is invariant if L satisfies Eq.
(C.60).
Because T is anti-unitary, it is not true anymore that only the fields are transformed.
All other objects in the Lagrangian density are also changed, namely complex conjugated.
For free Dirac fermions we get
T (x)T = TF (xT ) , (C.64)
where TF is a matrix in spinor space. The Dirac Lagrangian transforms as follows
T [i (x) m(x)]T = i (xT )T ( 0 ) ( ) T (xT )
m (xT )T ( 0 ) T (xT ) .
Again we consider first the time component of the first term. This term should change
sign, which it does precisely if T is unitary. For the space components and the mass we
find then
iT ( 0 ) ( i ) T = i 0 ( i ) (C.65)
and
T ( 0 ) T = 0 (C.66)
Substituting the last equation into the first gives
iT ( i ) T = i( i ) (C.67)
The solution is, up to a phase
$T = C\gamma_5$ . (C.68)
Note that, unlike parity and charge conjugation, time reversal is a symmetry for Weyl
fermions. The physical reason is that time reversal changes the direction of both spin
and momentum and hence the helicity is conserved. Parity flips the momentum, but not
the spin (spin transforms like orbital angular momentum, $\vec r \times \vec p$), whereas for half-integer
spin particles charge conjugation flips spin but not momentum (technically this happens
because spinors are in complex representations of SO(3, 1)).
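The helicity argument above can be illustrated with a small sketch. It treats spin and momentum as classical three-vectors and applies the transformation rules quoted in the text; the function names are illustrative and not part of the notes.

```python
# Toy check of how helicity behaves under P, C and T, using classical
# vectors for spin and momentum. The transformation rules are the ones
# quoted in the text; all function names are illustrative assumptions.

def helicity_sign(spin, momentum):
    """Sign of spin . momentum: +1 (right-handed) or -1 (left-handed)."""
    dot = sum(s * p for s, p in zip(spin, momentum))
    return 1 if dot > 0 else -1

def parity(spin, momentum):
    # Parity flips the momentum but not the spin.
    return spin, tuple(-p for p in momentum)

def charge_conjugation(spin, momentum):
    # For half-integer spin the text states that C flips spin, not momentum.
    return tuple(-s for s in spin), momentum

def time_reversal(spin, momentum):
    # Time reversal flips both spin and momentum.
    return tuple(-s for s in spin), tuple(-p for p in momentum)

spin, momentum = (0.0, 0.0, 0.5), (0.0, 0.0, 2.0)
h_start = helicity_sign(spin, momentum)
h_parity = helicity_sign(*parity(spin, momentum))
h_charge = helicity_sign(*charge_conjugation(spin, momentum))
h_time = helicity_sign(*time_reversal(spin, momentum))
print(h_start, h_parity, h_charge, h_time)
```

Only the time-reversed configuration has the same helicity as the starting one, in line with the statement that time reversal alone maps a Weyl fermion to itself.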
D Supersymmetry
D.1 Notation
We will use here the (dotted) index notation for spinors introduced in appendix A. Implicit
contractions of indices are as follows
; (D.1)
The following relations hold for anti-commuting spinors
= ; = (D.2)
= (D.3)
= (D.4)
() = = (D.5)
( ) = . (D.6)
To derive the last two, note that for operators (AB) = B A . Hence () = ( ) ( ) =
( )( ) = . For the last one, ( ) = ( ) ; then use Hermiticity of and
the fact that, numerically, = and = . We will use a metric-independent notation,
i.e. we will write all formulas in such a way that they are correct for two metrics, $(+,-,-,-)$
as well as $(-,+,+,+)$. This is done by explicit factors $\eta_{00}$ and $\sigma^0$.
Lfermion = i 0 (D.9)
Here is a spinor,
which is assumed to anti-commute with all other spinors in the problem.
The factor $\sqrt{2}$ is the standard convention used in the literature. The conjugate of the
scalar transforms as
= 2() = 2 (D.11)
As a result of this transformation, the scalar Lagrangian transforms as follows
Lboson = 2 00 + (D.12)
These terms have to be canceled by the variation of the fermionic terms. An educated
guess for the fermion transformation is ( is a real parameter to be determined later)
= i( ) = i (D.13)
Hence for the conjugate field we get
= ( ) = i( ) ( ) = i( ) ( ) = i (D.14)
Note that under this Hermitean conjugation must be treated as a set of numbers, and
is not an operator. Substituting this into the fermion Lagrangian we get
Lfermion = 0 ( )
= 0 ( ) ,
up to total derivatives, which are irrelevant. Because of the symmetric appearance of the
derivatives we may replace [ ] by
1
2
[ + ] = 00 , (D.15)
a relation that can easily be checked explicitly. A similar relation holds with bars inter-
changed and dots on the spinor indices. Integrating once more by parts, we find then that
the two variations cancel each other if
= 2 0 (D.16)
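The symmetrised product of sigma matrices used above "can easily be checked explicitly"; the following sketch does so numerically. It assumes one particular convention that the text does not fix: $\sigma^\mu = (1, \sigma^i)$, $\bar\sigma^\mu = (1, -\sigma^i)$ and the metric $(+,-,-,-)$, for which $\eta_{00} = 1$ and the symmetrised product should equal $\eta^{\mu\nu}$ times the unit matrix.

```python
# Numerical check of (1/2)[sigma^mu sigmabar^nu + sigma^nu sigmabar^mu].
# Assumed convention (not fixed by the text): sigma^mu = (1, sigma^i),
# sigmabar^mu = (1, -sigma^i), metric eta = diag(+1, -1, -1, -1).

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(2) for j in range(2))

def eta(mu, nu):
    if mu != nu:
        return 0
    return 1 if mu == 0 else -1

IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
SIGMA = [IDENTITY] + PAULI
SIGMA_BAR = [IDENTITY] + [scale(-1, s) for s in PAULI]

def symmetrised(mu, nu):
    first = matmul(SIGMA[mu], SIGMA_BAR[nu])
    second = matmul(SIGMA[nu], SIGMA_BAR[mu])
    return scale(0.5, add(first, second))

relation_holds = all(
    close(symmetrised(mu, nu), scale(eta(mu, nu), IDENTITY))
    for mu in range(4) for nu in range(4)
)
print(relation_holds)
```

With the other sign choices for the metric and $\sigma^0$ the individual factors change, which is what the explicit factors $\eta_{00}$ and $\sigma^0$ in the text keep track of.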
One may introduce an operator $Q$ that generates the transformation in the quantum
theory. Since the result has terms proportional to $\xi$ and $\bar\xi$ we actually use the combination
$\xi Q + \bar Q\bar\xi$. One can derive this operator as the charge of the Noether current of
supersymmetry. It is in general some bi-linear expression in terms of the quantum fields.
Here we will simply define it by its transformation properties, namely
$(\xi Q + \bar Q\bar\xi)\, X = \delta_\xi X$ , (D.17)
where $X$ denotes any field. Since $Q$ has a spinor index, it is natural to take it to be
anti-commuting, which indeed it turns out to be. Then $\xi Q + \bar Q\bar\xi$ is a bosonic operator.
As usual with generators of a symmetry, it is interesting to study their commutator.
Consider
1 Q + Q1 , 2 Q + Q2 . (D.18)
To see what the result is, it is easiest to make it act on the generic field X.
1 Q + Q1 , 2 Q + Q2 X = (1 2 2 1 )X (D.19)
1 2 2 1 = 2i 0 (2 1 1 2 ) (D.20)
If indeed this holds also for other choices of X (as we will check in a moment), then we
have
1 Q + Q1 , 2 Q + Q2 = 2i 0 (2 1 1 2 ) (D.21)
Now we expand both sides in and compare the terms. We see then that {Q , Q } =
{Q , Q } = 0 and furthermore,
Q , Q = 2i 0 (D.22)
For the operators Q this means that the corresponding commutator must yield the
operator that generates translations on the fields, i.e. the momentum operator. Indeed,
if one does the explicit computation one gets
Q , Q = 2 0 00 P = 2 0 00 , P . (D.23)
To see that the sign is correct, note the following. Independent of the metric we have
$p^\mu = (E, \vec p)$, $q^\mu = (t, \vec x)$, and $[p^i, x^j] = -i\delta^{ij}$. Therefore $[p^\mu, q^\nu] = i\eta_{00}\eta^{\mu\nu}$. Hence $[p_\mu, q^\nu] =
i\eta_{00}\delta_\mu^\nu$. This implies that $p_\mu = i\eta_{00}\partial_\mu$, so that the relation between Eqs. (D.22) and (D.23)
is correct.
If only the time-like components P 0 = H contribute the right-hand side is
2 0 00 0, H = 2 0 0 H = H (D.24)
This shows that any dependence on the metric and the choice of $\sigma^0$ nicely cancels. The
overall sign is the right one: the anti-commutator has non-negative expectation values
between states, consistent with positivity of the spectrum of $H$.
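The positivity statement rests on a general fact: for any operator $Q$, the expectation value of $QQ^\dagger + Q^\dagger Q$ in a state $v$ equals $|Q^\dagger v|^2 + |Qv|^2$. The sketch below checks this for random $2\times 2$ complex matrices, which merely stand in for $Q$; they are an illustrative assumption, not the actual supercharges.

```python
# <v|(Q Q^dagger + Q^dagger Q)|v> = |Q^dagger v|^2 + |Q v|^2 >= 0.
# Random 2x2 complex matrices stand in for Q (illustrative only).
import random

def dagger(m):
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def anticommutator_expectation(q, v):
    qd = dagger(q)
    q_qd_v = matvec(q, matvec(qd, v))
    qd_q_v = matvec(qd, matvec(q, v))
    return (inner(v, q_qd_v) + inner(v, qd_q_v)).real

def norm_form(q, v):
    qv = matvec(q, v)
    qdv = matvec(dagger(q), v)
    return inner(qv, qv).real + inner(qdv, qdv).real

rng = random.Random(0)

def random_complex():
    return complex(rng.uniform(-1, 1), rng.uniform(-1, 1))

samples = []
for _ in range(100):
    q = [[random_complex() for _ in range(2)] for _ in range(2)]
    v = [random_complex() for _ in range(2)]
    samples.append((anticommutator_expectation(q, v), norm_form(q, v)))

all_nonnegative = all(lhs >= -1e-12 for lhs, _ in samples)
forms_agree = all(abs(lhs - rhs) < 1e-9 for lhs, rhs in samples)
print(all_nonnegative, forms_agree)
```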
Now we must still consider the commutator of two supersymmetries on the fermion
field. The result is
1 2 = i 2 0 ( 2 ) 1
= 2i 0 ( 2 ) 1 (D.25)
This does not have exactly the right form, but we may use the following Fierz identity
() = () () , (D.26)
which can be proved by writing out both sides of the identity. Applying it to Eq. (D.25)
yields
1 2 = 2i 0 ((1 ) ( 2 ) + ( 2 )1 )
= 2i 0 ((1 ) 2 + 1 ( 2 )) (D.27)
The second term has precisely the right form, but the first one does not. However,
it vanishes if we assume that $\psi$ satisfies the equation of motion (the Dirac equation)
$\bar\sigma^\mu\partial_\mu\psi = 0$. This implies that the super algebra Eq. (D.23) only holds on-shell, i.e. for
fields that satisfy the equation of motion.
This is an annoying feature in a quantum field theory where off-shell degrees of freedom
do play a role in virtual processes. Of course it is not a fundamental problem, since we
can simply compute scattering amplitudes ignoring the supersymmetry, but it becomes
then very difficult to show that supersymmetry is preserved in such calculations.
The problem can be circumvented by introducing an auxiliary field F . It is not a
dynamical degree of freedom, as is clear from its action:
Laux = F F (D.28)
F = ; F = (D.29)
Then
Laux = F + F (D.30)
These terms are canceled if we transform the fermions as
= i 2 0 ( ) + F (D.31)
= i 2 0 ( ) + F (D.32)
with = i 0 . The second condition and have to satisfy is that the extra terms in
the commutator Eq. (D.27) cancel. This leads to the condition = 2i 0 . Combining
the two conditions we find = 2. The phase of is not determined, 0 and this is
not
surprising since it can be absorbed in F . We will choose = i 2 , so that = 2.
The last thing to check is that the commutator acting on F gives the same answer as on
and . This is true without further conditions. The commutator on F produces a term
involving , but this term has the form 2 1 (1 2). This is proportional
to 1 2 (1 2) = 0.
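The transformations we have obtained are thus
$\delta\phi = \sqrt{2}\,\xi\psi$
$\delta\psi = i\sqrt{2}\,\sigma^0(\sigma^\mu\bar\xi)\partial_\mu\phi + \sqrt{2}\,\xi F$
$\delta F = i\sqrt{2}\,\sigma^0\,\bar\xi\bar\sigma^\mu\partial_\mu\psi$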
The physical reason why we need auxiliary fields is that the off-shell count of degrees
of freedom between bosons and fermions does not match. Off-shell a complex scalar has
one complex degree of freedom, and a Weyl spinor has two complex degrees of freedom.
The equations of motion for a scalar do not modify the number of degrees of freedom. In
momentum space, they only impose the constraint $k^2 = 0$ (if the scalar is massless). For
fermions they impose the same constraint, but also the stronger constraint $k_\mu\bar\sigma^\mu\psi = 0$.
This is a matrix condition that only half the components can satisfy. The other half is
eliminated on-shell. Hence on-shell a Weyl spinor has one complex degree of freedom, the
same as the scalar. This is what makes the existence of on-shell supersymmetry possible.
To realize it off-shell we need to introduce the missing bosonic degrees of freedom in
the form of the complex auxiliary field $F$.
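The statement that the fermionic equation of motion is a matrix condition satisfied by only half the components can be checked numerically. The sketch below builds the $2\times 2$ matrix $E\cdot 1 - \vec k\cdot\vec\sigma$ (this sign convention is an assumption made for illustration) and computes its rank for a lightlike and for a non-lightlike momentum.

```python
# Rank of the 2x2 matrix M = E*1 - k.sigma acting on a two-component
# spinor. On the light cone det M = E^2 - |k|^2 = 0, so M has rank one
# and removes exactly one of the two components. The sign convention
# for M is an assumption made for illustration.
import math

def weyl_matrix(energy, kx, ky, kz):
    return [
        [energy - kz, -(kx - 1j * ky)],
        [-(kx + 1j * ky), energy + kz],
    ]

def rank_2x2(m, tol=1e-12):
    if all(abs(entry) < tol for row in m for entry in row):
        return 0
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return 1 if abs(det) < tol else 2

def energy_for(kx, ky, kz, mass=0.0):
    return math.sqrt(kx**2 + ky**2 + kz**2 + mass**2)

kx, ky, kz = 1.0, 2.0, 2.0
massless_rank = rank_2x2(weyl_matrix(energy_for(kx, ky, kz), kx, ky, kz))
offshell_rank = rank_2x2(
    weyl_matrix(energy_for(kx, ky, kz, mass=4.0), kx, ky, kz)
)
print(massless_rank, offshell_rank)
```

For $k^2 = 0$ the rank is one, leaving a single complex component; away from the light cone the matrix is invertible and the condition has no non-trivial solution.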
D.3 Superfields
The model studied in the previous section has free fields only. Now we have to find out
how to write down supersymmetric interactions, and we also need to consider fields of
higher spin, to accommodate gauge bosons (and also gravitons).
The construction of Lagrangians for theories with N = 1 supersymmetry is most
conveniently done using superfields. One introduces anti-commuting parameters $\theta_\alpha$ which
transform according to the SO(3, 1) representation (2,1). Their Hermitean conjugates
transform then as (1,2), and are denoted as $\bar\theta_{\dot\alpha}$. These parameters anti-commute with
each other for any choice of indices. They also anti-commute with any other fermionic
field or operator.
The supersymmetry algebra can now be written as,
[Q, Q] = 2( 0 00 ) P . (D.33)
From the commutator of the operators $Q$ we can derive a product rule, using the Baker-
Campbell-Hausdorff formula
$e^A e^B = e^{A+B+\frac{1}{2}[A,B]+\ldots}$ , (D.35)
which in this case is exact to this order, since all higher order commutators vanish. We
find
G(y, , )G(x, + , + )
, ) = G(x + y + i i, (D.36)
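The claim that the Baker-Campbell-Hausdorff series terminates when $[A,B]$ commutes with both $A$ and $B$ can be tested in a simple matrix model. The sketch below uses strictly upper triangular $3\times 3$ matrices, chosen here only as a convenient stand-in with the same commutator structure; they are not the supersymmetry generators themselves.

```python
# Exactness of exp(A) exp(B) = exp(A + B + [A,B]/2) when [A,B] commutes
# with A and B. Strictly upper triangular 3x3 matrices serve as a toy
# model (an assumption for illustration); exact rational arithmetic.
from fractions import Fraction

N = 3

def zeros():
    return [[Fraction(0) for _ in range(N)] for _ in range(N)]

def identity():
    m = zeros()
    for i in range(N):
        m[i][i] = Fraction(1)
    return m

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def add(*matrices):
    return [[sum(m[i][j] for m in matrices) for j in range(N)]
            for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

def commutator(a, b):
    return add(matmul(a, b), scale(Fraction(-1), matmul(b, a)))

def exp_nilpotent(m):
    # For strictly upper triangular 3x3 matrices m^3 = 0, so the
    # exponential series stops after the quadratic term.
    return add(identity(), m, scale(Fraction(1, 2), matmul(m, m)))

A = zeros()
A[0][1] = Fraction(2)
B = zeros()
B[1][2] = Fraction(3)
C = commutator(A, B)

higher_commutators_vanish = (
    commutator(A, C) == zeros() and commutator(B, C) == zeros()
)
lhs = matmul(exp_nilpotent(A), exp_nilpotent(B))
rhs = exp_nilpotent(add(A, B, scale(Fraction(1, 2), C)))
bch_exact = lhs == rhs
print(higher_commutators_vanish, bch_exact)
```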
This defines an action on the coordinates of superspace,
x x + y + i i
+
+
We now introduce differential operators that yield the same action on the coordinates.
These differential operators are denoted as Qdiff and Qdiff . These operators are required
to reproduce the and terms in the transformation of the super-coordinates:
(x, , ) = Qdiff (x, , )
, )
(x, , ) = Qdiff (x,
It is easy to check that the following operators do the job
Qdiff
= + i
Qdiff = i
Here $\partial_\mu = \frac{\partial}{\partial x^\mu}$. The reason for using a lower index on $\partial$ on the left and an upper one for
$x$ on the right is that in this way $\partial_\mu x^\nu = \delta_\mu^\nu$ is a proper Lorentz invariant tensor while
$\delta_{\mu\nu}$ is not (the only Lorentz invariant tensor with two lower indices is $\eta_{\mu\nu}$). The indices
on the other partial derivatives follow a similar logic:
= ; = , (D.37)
where the following sign changes should be noted
= ; = . (D.38)
These are simply a consequence of raising and lowering indices with the $\epsilon$ tensor, cf. Eq.
(A.7):
= = = = (D.39)
To understand the sign of the first term of Q diff
note that the derivatives and
anti-commute with all other fermionic objects. Hence (ignoring the second term) we have
Qdiff = Qdiff = = = = (D.40)
Hence this operator acts correctly as a shift operator on both and .
The commutator of two of these differential operators yields
[Qdiff , Qdiff ] = 2i , (D.41)
which differs by a sign from the corresponding quantum operators, Eq. (D.22). For
an explanation of this sign see [2]. Here we simply note that for the commutator of
the quantum operators the right-hand side yields the Hamiltonian, and the sign is
fixed by the requirement of positivity of that Hamiltonian. Here we simply get a time
derivative, and the fact that it appears with the opposite sign compared to our (too
naive) expectations is not a problem.
D.5 Different Realizations
There are other differential operator realizations of the super-algebra, which act (as we
will see) on a slightly modified superspace. The following three sets are used
S: Qdiff
= + i
Qdiff = i
L: Qdiff
=
Q = 2i
diff
R: Qdiff
= + 2i
Qdiff =
The representations are denoted by S for Symmetric, L for Left-handed and R for
Right-handed. From here on we will only use the differential operators, and since no
confusion is possible we drop the superscript diff.
QS L (xi , , ) = QL L (x , , ) (D.43)
and the same for Q. Since the operators Q and Q act in any case the same on the explicit
-dependence (the second and third argument), all that remains to be shown is
QS L (xi ) = QL L (x ) (D.44)
This is straightforward.
D.8 Product Representations and Supersymmetry Invariants
The product of two superfields in the same representation is again a superfield in that
representation. This means that the transformations of the product under supersymmetry
are given by the same formulas as for a single superfield, i.e. the formulas of the previous
paragraph.
Since there are only two anti-commuting variables $\theta_1$ and $\theta_2$ the expansion of a superfield
in $\theta$, $\bar\theta$ is finite and stops at the maximal order, i.e. with a term $\theta^2\bar\theta^2$. The variations
of the component fields (i.e. the coefficients of the various $\theta$ combinations) can be read off
by expanding $\Phi$ and $\delta\Phi$ in $\theta$, $\bar\theta$. Consider the highest component in $\Phi$, i.e. the coefficient
of $\theta^2\bar\theta^2$. The corresponding term in $\delta\Phi$ is definitely not generated by the derivatives $\partial_\alpha$ or
$\bar\partial_{\dot\alpha}$, because to produce $\theta^2\bar\theta^2$ they would have to act on $\theta^3\bar\theta^2$ or $\theta^2\bar\theta^3$, neither of which can
occur in $\Phi$. A term proportional to $\theta^2\bar\theta^2$ in $\delta\Phi$ can therefore only arise from the action
of the terms involving $\partial_\mu$. This means that the highest component in $\Phi$
transforms into a total derivative. Hence, when integrated over space-time, the highest
component of $\Phi$ is invariant under supersymmetry. This is the principle which is always
used to build supersymmetric actions.
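The finiteness of the expansion follows from the anti-commutativity of the variables alone, which a minimal Grassmann-algebra sketch makes explicit. The representation below (a dictionary from sorted index tuples to coefficients) and the labelling of the four generators as indices 0 to 3 for $\theta_1, \theta_2, \bar\theta_1, \bar\theta_2$ are assumptions made for illustration.

```python
# Minimal Grassmann algebra on four generators, labelled 0..3 for
# theta_1, theta_2, thetabar_1, thetabar_2 (labelling is an assumption).
from itertools import combinations

def multiply(x, y):
    """Product of two elements.

    An element is a dict mapping a sorted tuple of generator indices
    (a monomial) to its numerical coefficient.
    """
    result = {}
    for mono_x, coeff_x in x.items():
        for mono_y, coeff_y in y.items():
            if set(mono_x) & set(mono_y):
                continue  # a repeated generator gives zero
            merged = list(mono_x + mono_y)
            sign = 1
            # Bubble sort; every swap of neighbours costs a minus sign.
            for i in range(len(merged)):
                for j in range(len(merged) - 1 - i):
                    if merged[j] > merged[j + 1]:
                        merged[j], merged[j + 1] = merged[j + 1], merged[j]
                        sign = -sign
            key = tuple(merged)
            result[key] = result.get(key, 0) + sign * coeff_x * coeff_y
    return {k: c for k, c in result.items() if c != 0}

def generator(index):
    return {(index,): 1}

GENERATORS = (0, 1, 2, 3)
monomials = [m for r in range(len(GENERATORS) + 1)
             for m in combinations(GENERATORS, r)]
top = {GENERATORS: 1}  # the theta^2 thetabar^2 monomial

squares_vanish = all(multiply(generator(i), generator(i)) == {}
                     for i in GENERATORS)
top_is_maximal = all(multiply(generator(i), top) == {}
                     for i in GENERATORS)
print(len(monomials), squares_vanish, top_is_maximal)
```

There are 16 independent monomials, counting each spinor component separately, and multiplying the top monomial by any further generator gives zero, so no term of order $\theta^3\bar\theta^2$ or $\theta^2\bar\theta^3$ exists.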
These two principles give the superspace method its power: products of superfields are
again superfields, and the highest component in the $\theta$, $\bar\theta$ expansion is a supersymmetry
invariant. A general superfield still has a large number of component fields (nine, to be
precise). In the previous section we have seen an example of a set of three fields ($\phi$, $\psi$
and $F$) that formed a closed representation of supersymmetry. Hence there should exist
ways to restrict the number of component fields. To do this we need yet another set of
differential operators.
S: D = i
D = + i
L: D = 2i
D =
R: D =
D = + 2i
D.10 Chiral Superfields
There are only a few representations of the super algebra that we need to consider. At
first sight the fields and the invariant actions do not look very natural, but a very large
amount of work is quite simply summarized by these rules.
Fields $\Phi(x, \theta, \bar\theta)$ satisfying $\bar D\Phi = 0$ are called left-handed chiral superfields (also scalar
superfields). The reason that this is an interesting restriction is that $\bar D$ anti-commutes
with the supersymmetry transformation. Therefore the property $\bar D\Phi = 0$ is preserved by
supersymmetry. This implies that chiral superfields form all by themselves representations
of supersymmetry; without the restriction $\bar D\Phi = 0$ the superfield has more components
than necessary. The restriction in the number of components is most clearly seen in
the left-handed representation, since $\bar D$ is simplest in that representation. Then the
requirement is simply that $\Phi$ should not depend on $\bar\theta$. Hence its expansion in terms of
$\theta$ can go at most to second order:
$\Phi_L(x, \theta) = \phi(x) + \sqrt{2}\,\theta\psi(x) + \theta^2 F(x)$ , (D.45)
there are many terms in the expansion in . However, life can be simplified considerably
by a gauge choice called the Wess-Zumino gauge. The complete expression is
V (x, , ) = V + i2 i2 + 21 2 2 D + . . . , (D.47)
where the dots represent additional terms, which are absent in Wess-Zumino gauge.
where $L_F$ satisfies the conditions for a left-handed chiral superfield and $L_D$ those of a
vector superfield. One usually writes $d^4\theta$ instead of $d^2\theta\, d^2\bar\theta$. We define the normalization
so that $\int d^4\theta\, \theta^2\bar\theta^2 = 1$.
The terms L are built out of elementary superfields describing single particles. The
only terms surviving the integration are those corresponding to F and D auxiliary fields,
hence the notation. Often one writes
$\int d^2\theta\, X \equiv [X]_F$ , $\int d^4\theta\, X \equiv [X]_D$ . (D.50)
The reason that the resulting Lagrangian is invariant under supersymmetry transformations
is that the F and D terms in any superfield (whether composite or elementary)
transform into a total derivative. Hence the Lagrangian transforms into a total derivative
as well, and the action is invariant.
Consider first the scalar superfields. The kinetic terms come from terms in $L_D$. As
observed above, $\Phi^\dagger$ is a right-handed superfield in the right-handed representation. If one
multiplies two superfields in different representations, the supersymmetry has no meaningful
action on the product. One way to deal with this is to write both in the symmetric
representation; then $\Phi_S^\dagger\Phi_S$ transforms as a vector superfield under the S-representation.
To go to the S-representation we have to shift the argument $x$:
S (x, , ) = L (x i, , ) (D.51)
Now we expand S S in a Taylor series in i, and we keep only terms of order 2 2 .
Alternatively we may work entirely in the left-handed representation, but then we have
to shift the argument of L by twice as much. It is easy to see that the result is the same.
The result is Z
d4 L (x + 2i, )L (x, , ) (D.52)
consider first the scalar terms. Define a = 2i . The expansion yields, for the scalar
component
(xa) = (x) + a (x) + 12 a a (x) + . . . (D.53)
The higher order terms vanish in this case, because they have too many $\theta$s. The contribution
of second order in $\theta$ and $\bar\theta$ to the action is thus
2
(a a )(x) (x)
1
(D.54)
= 12 00 2 2 (D.55)
( ) = 14 2 2 (D.58)
Lfermion = i = i (D.59)
Finally, the quadratic terms for the auxiliary field come out immediately as
$L_{\rm aux} = F^* F$ (D.60)
This completes the discussion of the kinetic terms. We see that they have precisely the
form we started with in the previous section.
All other terms in the scalar superfield Lagrangian are F-terms. Any polynomial built
out of left-handed chiral superfields is manifestly a left-handed chiral superfield as well. It
is a bit more difficult to see (but true) that this is the only way to build chiral superfields.
It turns out that to get a renormalizable theory one can allow terms of at most third
order in the superfields. For a single superfield the most general polynomial is thus
$L_F = \frac{1}{2} m\Phi^2 + \frac{1}{3}\lambda\Phi^3 \equiv W(\Phi)$ (D.61)
This is called the superpotential. It is straightforward to expand it to second order in $\theta$.
The result is
Z Z
4 2 1 2 1 3
L = d + d ( 2 m + 3 ) + c.c
= + i + F F + m F 12 2 + F 2 2 + c.c .
The field F appears without kinetic terms and can thus be eliminated using the equations
of motion (hence the name auxiliary field). Clearly
F = m ( )2 (D.62)
Substituting this back into the action we get
L = + i 21 m( 2 + 2 ) 2 2 |m + 2 |2 (D.63)
The last term is $-V_F$, where $V_F$ is the contribution to the scalar potential due to F terms.
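The elimination of the auxiliary field can be verified numerically for a cubic superpotential. In the sketch below the name `lam` for the cubic coupling and the numerical values are assumptions chosen for illustration; the F-dependent part of the Lagrangian is taken to be $F^*F + (F\,W'(\phi) + \text{c.c.})$, with $W'$ the derivative of the superpotential evaluated on the scalar component.

```python
# Eliminating the auxiliary field for W = (m/2) Phi^2 + (lam/3) Phi^3.
# 'lam' is a hypothetical name for the cubic coupling; the numbers are
# arbitrary test values.

def w_prime(phi, m, lam):
    """Derivative of the superpotential evaluated on the scalar phi."""
    return m * phi + lam * phi**2

def auxiliary_terms(f, phi, m, lam):
    """F^* F + (F W'(phi) + c.c.), the F-dependent part of the Lagrangian."""
    coupling = f * w_prime(phi, m, lam)
    return (f.conjugate() * f + coupling + coupling.conjugate()).real

m, lam = 1.5, 0.7
phi = 0.3 + 0.4j

# Solution of the F equation of motion: F = -(W')^*.
f_solution = -w_prime(phi, m, lam).conjugate()
on_shell_value = auxiliary_terms(f_solution, phi, m, lam)
v_f = abs(w_prime(phi, m, lam)) ** 2

# Shifting F away from the solution changes the result by |shift|^2 only,
# which confirms that f_solution is the stationary point.
shift = 0.2 - 0.1j
shifted_value = auxiliary_terms(f_solution + shift, phi, m, lam)
print(on_shell_value, -v_f, shifted_value - on_shell_value)
```

On the solution the auxiliary terms evaluate to $-|m\phi + \lambda\phi^2|^2$, so the F-term contribution to the scalar potential is the non-negative quantity $|W'(\phi)|^2$.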
This is very easily generalized to situations with more than one superfield, and more
general superpotentials $W(\Phi_i)$. For each term in the polynomial we only need to find the
$\theta^2$ terms. If we consider a term $\Phi_1 \ldots \Phi_k$ we get two kinds of $\theta^2$ terms: one kind consists
of $F_i$ and factors $\phi_{i'}$ for all terms with $i' \neq i$, and the other kind comes from $\sqrt{2}\,\theta\psi_i\,\sqrt{2}\,\theta\psi_j$
times factors $\phi_k$ for all $k \neq i$, $k \neq j$. Hence, including the $F^*F$ terms we get (note that
$(\theta\psi_i)(\theta\psi_j) = -\frac{1}{2}(\psi_i\psi_j)\theta^2$)
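$\sum_i F_i^* F_i + \left( -\frac{1}{2}\sum_{i,j} \frac{\partial^2 W(\phi)}{\partial\phi_i\,\partial\phi_j}\,\psi_i\psi_j + \sum_i F_i\,\frac{\partial W(\phi)}{\partial\phi_i} + \text{c.c.} \right)$ (D.64)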
Note in particular that this potential is positive definite, a consequence of supersymmetry.
The Lagrangian for vector fields is more complicated to derive in superfield formalism.
If one has a non-abelian gauge group there will be an adjoint multiplet of vector superfields
$V^a$. To write down the coupling to a chiral superfield one contracts them with the
generators $T^a$ of the gauge group in the representation of $\Phi$. The minimal coupling to the
chiral superfield is then the D term in $\Phi^\dagger e^{2gV}\Phi$. Expanding this in components yields
|D |2 i D + 2ig[ ] + F F + g D . (D.68)
The explicit indices have been suppressed, but are uniquely determined by gauge invari-
ance. For example the third term is explicitly ii a Tija j + c.c, and the last one, involving
the auxiliary field is
$g\phi_i^* T^a_{ij} D^a \phi_j$ . (D.69)
To write down the gauge kinetic terms one introduces a chiral superfield with a spinor
index
W = DDegV D egV (D.70)
The supersymmetric and gauge-invariant Lagrangian is
1
Lgauge = [W W ]F + c.c = 14 (F
a 2
) 21 D + 12 (Da )2 , (D.71)
32g 2
where is a four component Majorana spinor built out of the spinors and in the
superfield V :
= . (D.72)
Of course has both a Dirac index and an adjoint gauge index, and D is the gauge
covariant derivative in the adjoint representation.
In the absence of any matter multiplets, the auxiliary field must vanish; in the presence
of matter it satisfies the field equation
$D^a = -g\phi_i^* T^a_{ij}\phi_j$ , (D.73)
because of the D-term contribution (D.69). Substituting this back into the action we get
another contribution to the scalar potential, this time associated with the D-terms of the
scalars:
$V_D = \frac{1}{2} D^a D^a$ , (D.74)
with $D^a$ given by Eq. (D.73).
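As a concrete illustration of the D-term potential, consider a single scalar doublet of SU(2) with generators $T^a = \sigma^a/2$. This choice of gauge group and representation is an assumption made for the example, not something fixed by the text. In that case the sum over $a$ can be done in closed form, giving $V_D = \frac{g^2}{8}(\phi^\dagger\phi)^2$, which the sketch below checks numerically.

```python
# D-term potential for one SU(2) doublet (an assumed example):
# T^a = sigma^a / 2, D^a = -g phi^dagger T^a phi, V_D = (1/2) sum_a D^a D^a.

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def d_term(a, phi, g):
    """D^a = -g phi^dagger (sigma^a / 2) phi, a real number."""
    bilinear = sum(
        phi[i].conjugate() * PAULI[a][i][j] * phi[j]
        for i in range(2) for j in range(2)
    )
    return -g * bilinear.real / 2

def d_potential(phi, g):
    return 0.5 * sum(d_term(a, phi, g) ** 2 for a in range(3))

phi = [0.6 + 0.2j, -0.3 + 0.5j]  # arbitrary test values
g = 0.65
norm_squared = sum(abs(component) ** 2 for component in phi)
v_d = d_potential(phi, g)
closed_form = g**2 / 8 * norm_squared**2
print(v_d, closed_form)
```

The potential is a sum of squares and therefore non-negative for any field value, whatever the sign convention chosen for $D^a$.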
There is one additional term that can appear in the action, namely
Z
d2 d2 2V (D.75)
which is gauge invariant if and only if V is a U (1) gauge field. The only effect this term
has is to add to the Lagrangian a term D, where D is the auxiliary field in V . This
changes the equations of motion for D, and instead of Eq. (D.73) we get
D = g 0 i Qi i , (D.76)
where $g'$ is the U(1) coupling constant and $Q_i$ the charge of the scalar $\phi_i$. The action,
expressed in terms of the auxiliary fields is still given by Eq. (D.74), where the implicit
sum now includes the U(1) factor.
Acknowledgements
I would like to thank all the students who have contributed to these notes by informing
me about typos and errors. Special thanks to Rob Verheyen, Melissa van Beekveld, Leon
Groenewegen, Gillian Lustermans, John van de Wetering, Stan Jacobs, Marrit Schutten
and Chris Ripken.
References
[1] A. Arvanitaki, S. Dimopoulos, S. Dubovsky, N. Kaloper, and J. March-Russell. String
Axiverse. Phys.Rev., D81:123530, 2010.
[2] J. Bagger and J. Wess. Supersymmetry and Supergravity. Princeton University Press,
1992.
[3] S. M. Bilenky, J. Hosek, and S. Petcov. On Oscillations of Neutrinos with Dirac and
Majorana Masses. Phys.Lett., B94:495, 1980.
[6] M. Dine, W. Fischler, and M. Srednicki. A Simple Solution to the Strong CP Problem
with a Harmless Axion. Phys.Lett., B104:199, 1981.
[8] C. Patrignani et al. (Particle Data Group). Review of Particle Physics. Chin. Phys., C40:100001, 2016.
[10] R. Feger and T. W. Kephart. LieART - A Mathematica Application for Lie Algebras
and Representation Theory. Comput. Phys. Commun., 192:166-195, 2015.
[11] S. Ferrara. Tensor Calculus and the breaking of Local Supersymmetry. 1986.
[12] C. Ford, D. Jones, P. Stephenson, and M. Einhorn. The Effective potential and the
renormalization group. Nucl.Phys., B395:17-34, 1993.
[13] H. Georgi and S. Glashow. Unity of All Elementary Particle Forces. Phys.Rev.Lett.,
32:438-441, 1974.
[15] A. H. Guth. The Inflationary Universe: A Possible Solution to the Horizon and
Flatness Problems. Phys.Rev., D23:347-356, 1981.
[17] H. E. Haber and G. L. Kane. The Search for Supersymmetry: Probing Physics
Beyond the Standard Model. Phys.Rept., 117:75-263, 1985.
[20] P. Langacker. Grand Unified Theories and Proton Decay. Phys.Rept., 72:185, 1981.
[21] Z. Maki, M. Nakagawa, and S. Sakata. Remarks on the unified model of elementary
particles. Prog. Theor. Phys., 28:870-880, 1962.
[24] R. Oerter. The theory of almost everything: The standard model, the unsung triumph
of modern physics. 2006.
[25] B. Pontecorvo. Inverse beta processes and nonconservation of lepton charge. Sov.
Phys. JETP, 7:172-173, 1958. [Zh. Eksp. Teor. Fiz., 34:247, 1957].
[26] M. Sher. Electroweak Higgs Potentials and Vacuum Stability. Phys.Rept., 179:273-418,
1989. (Addendum: Phys.Lett. B317 (1993) 159-163; ibid. B331 (1994) 448.)
[28] R. Slansky. Group Theory for Unified Model Building. Phys.Rept., 79:1-128, 1981.
[29] C. Vafa and E. Witten. Parity Conservation in QCD. Phys. Rev. Lett., 53:535, 1984.
[32] S. Weinberg. Baryon and Lepton Nonconserving Processes. Phys. Rev. Lett., 43:1566-1570,
1979.
[36] F. Zwirner. The quest for low-energy supersymmetry and the role of high-energy e+
e- colliders. 1991.