Relativistic Quantum Mechanics

arXiv:physics/0504062v17 [physics.gen-ph] 16 Feb 2015
RELATIVISTIC QUANTUM DYNAMICS

Eugene V. Stefanovich
2014
Draft, 3rd edition
RELATIVISTIC QUANTUM DYNAMICS:

A Non-Traditional Perspective on Space, Time,
Particles, Fields, and Action-at-a-Distance
Eugene V. Stefanovich
Mountain View, California
Copyright © 2004 - 2014 Eugene V. Stefanovich
e-mail: eugene stef [email protected]
web address: http://www.arxiv.org/abs/physics/0504062
To Regina
Abstract
This book is an attempt to build a consistent relativistic quantum theory of
interacting particles. In the first part of the book, “Quantum
electrodynamics,” we follow a rather traditional approach to particle physics.
Our discussion proceeds systematically from the principle of relativity and
the postulates of quantum measurements to renormalization in quantum
electrodynamics. In the second part of the book, “Quantum theory of
particles,” this traditional approach is reexamined. We find that the formulas
of special relativity should be modified to take into account particle
interactions. We also suggest reinterpreting quantum field theory in the
language of physical “dressed” particles. This formulation eliminates the need
for renormalization and opens up a new way of studying dynamical and bound
state properties of quantum interacting systems. The developed theory is
applied to realistic physical objects and processes, including the energy
spectrum of the hydrogen atom, the decay law of moving unstable particles, and
the electric field of relativistic electron beams. These results force us to
take a fresh look at some core issues of modern particle theories, in
particular, the Minkowski space-time unification, the role of quantum fields
and renormalization, as well as the alleged impossibility of
action-at-a-distance. A new perspective on these issues is suggested. It can
help to solve the old problem of theoretical physics: a consistent unification
of relativity and quantum mechanics.
Contents
PREFACE xix
INTRODUCTION xxix
I QUANTUM ELECTRODYNAMICS 1
1 QUANTUM MECHANICS 3
1.1 Why do we need quantum mechanics? . . . . . . . . . . . . . 4
1.1.1 Corpuscular theory of light . . . . . . . . . . . . . . . . 5
1.1.2 Wave theory of light . . . . . . . . . . . . . . . . . . . 8
1.1.3 Low intensity light and other experiments . . . . . . . 9
1.2 Physical foundations of quantum mechanics . . . . . . . . . . 11
1.2.1 Single-hole experiment . . . . . . . . . . . . . . . . . . 12
1.2.2 Ensembles and measurements in quantum mechanics . 13
1.3 Lattice of propositions . . . . . . . . . . . . . . . . . . . . . . 15
1.3.1 Propositions and states . . . . . . . . . . . . . . . . . . 17
1.3.2 Partial ordering . . . . . . . . . . . . . . . . . . . . . . 20
1.3.3 Meet . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3.4 Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.3.5 Orthocomplement . . . . . . . . . . . . . . . . . . . . . 22
1.3.6 Atomic propositions . . . . . . . . . . . . . . . . . . . 26
1.4 Classical logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.4.1 Truth tables and distributive law . . . . . . . . . . . . 27
1.4.2 Atomic propositions in classical logic . . . . . . . . . . 30
1.4.3 Atoms and pure classical states . . . . . . . . . . . . . 32
1.4.4 Phase space of classical mechanics . . . . . . . . . . . . 34
1.4.5 Classical probability measures . . . . . . . . . . . . . . 34
1.5 Quantum logic . . . . . . . . . . . . . . . . . . . . . . . . . . 36

1.5.1 Compatibility of propositions . . . . . . . . . . . . . . 36
1.5.2 Logic of quantum mechanics . . . . . . . . . . . . . . . 39
1.5.3 Example: 3-dimensional Hilbert space . . . . . . . . . 41
1.5.4 Piron’s theorem . . . . . . . . . . . . . . . . . . . . . . 43
1.5.5 Should we abandon classical logic? . . . . . . . . . . . 44
1.6 Quantum observables and states . . . . . . . . . . . . . . . . . 45
1.6.1 Observables . . . . . . . . . . . . . . . . . . . . . . . . 45
1.6.2 States . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
1.6.3 Commuting and compatible observables . . . . . . . . 50
1.6.4 Expectation values . . . . . . . . . . . . . . . . . . . . 51
1.6.5 Basic rules of classical and quantum mechanics . . . . 52
1.7 Interpretations of quantum mechanics . . . . . . . . . . . . . . 53
1.7.1 Quantum unpredictability in microscopic systems . . . 53
1.7.2 Hidden variables . . . . . . . . . . . . . . . . . . . . . 55
1.7.3 Measurement problem . . . . . . . . . . . . . . . . . . 56
1.7.4 Agnostic interpretation of quantum mechanics . . . . . 58
2 POINCARÉ GROUP 61
2.1 Inertial observers . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.1.1 Principle of relativity . . . . . . . . . . . . . . . . . . . 62
2.1.2 Inertial transformations . . . . . . . . . . . . . . . . . 64
2.2 Galilei group . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.2.1 Multiplication law of the Galilei group . . . . . . . . . 66
2.2.2 Lie algebra of the Galilei group . . . . . . . . . . . . . 67
2.2.3 Transformations of generators under rotations . . . . . 70
2.2.4 Space inversions . . . . . . . . . . . . . . . . . . . . . . 73
2.3 Poincaré group . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.3.1 Lie algebra of the Poincaré group . . . . . . . . . . . . 75
2.3.2 Transformations of translation generators under boosts 80
3 QUANTUM MECHANICS AND RELATIVITY 83

3.1 Inertial transformations in quantum mechanics . . . . . . . . . 83
3.1.1 Wigner’s theorem . . . . . . . . . . . . . . . . . . . . . 84
3.1.2 Inertial transformations of states . . . . . . . . . . . . 87
3.1.3 Heisenberg and Schrödinger pictures . . . . . . . . . . 88
3.2 Unitary representations of the Poincaré group . . . . . . . . . 89
3.2.1 Projective representations of groups . . . . . . . . . . . 90
3.2.2 Elimination of central charges in the Poincaré algebra . 91

3.2.3 Single-valued and double-valued representations . . . . 99
3.2.4 Fundamental statement of relativistic quantum theory 100
4 OPERATORS OF OBSERVABLES 103

4.1 Basic observables . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.1.1 Energy, momentum, and angular momentum . . . . . . 104
4.1.2 Operator of velocity . . . . . . . . . . . . . . . . . . . 106
4.2 Casimir operators . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.2.1 4-vectors . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.2 Operator of mass . . . . . . . . . . . . . . . . . . . . . 108
4.2.3 Pauli-Lubanski 4-vector . . . . . . . . . . . . . . . . . 109
4.3 Operators of spin and position . . . . . . . . . . . . . . . . . . 111
4.3.1 Physical requirements . . . . . . . . . . . . . . . . . . 111
4.3.2 Spin operator . . . . . . . . . . . . . . . . . . . . . . . 113
4.3.3 Position operator . . . . . . . . . . . . . . . . . . . . . 115
4.3.4 Alternative set of basic operators . . . . . . . . . . . . 119
4.3.5 Canonical form and “power” of operators . . . . . . . . 120
4.3.6 Uniqueness of the spin operator . . . . . . . . . . . . . 124
4.3.7 Uniqueness of the position operator . . . . . . . . . . . 125
4.3.8 Boost transformations of the position operator . . . . . 127
5 SINGLE PARTICLES 131

5.1 Massive particles . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.1 Irreducible representations of the Poincaré group . . . 133
5.1.2 Momentum-spin basis . . . . . . . . . . . . . . . . . . 136
5.1.3 Action of Poincaré transformations . . . . . . . . . . . 139
5.2 Momentum and position representations . . . . . . . . . . . . 143
5.2.1 Spectral decomposition of the identity operator . . . . 143
5.2.2 Wave function in the momentum representation . . . . 147
5.2.3 Position representation . . . . . . . . . . . . . . . . . . 149
5.2.4 Inertial transformations of observables and states . . . 152
5.3 Massless particles . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.3.1 Spectra of momentum, energy, and velocity . . . . . . . 157
5.3.2 Representations of the little group . . . . . . . . . . . . 158
5.3.3 Massless representations of the Poincaré group . . . . . 161
5.3.4 Doppler effect and aberration . . . . . . . . . . . . . . 164
6 INTERACTION 169
6.1 Hilbert space of a many-particle system . . . . . . . . . . . . . 170
6.1.1 Tensor product theorem . . . . . . . . . . . . . . . . . 170
6.1.2 Particle observables in a multiparticle system . . . . . 172
6.1.3 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.2 Relativistic Hamiltonian dynamics . . . . . . . . . . . . . . . . 175
6.2.1 Non-interacting representation of the Poincaré group . 176
6.2.2 Dirac’s forms of dynamics . . . . . . . . . . . . . . . . 177
6.2.3 Total observables in a multiparticle system . . . . . . . 179
6.3 Instant form of dynamics . . . . . . . . . . . . . . . . . . . . . 180
6.3.1 General instant form interaction . . . . . . . . . . . . . 180
6.3.2 Bakamjian-Thomas construction . . . . . . . . . . . . . 181
6.3.3 Non-Bakamjian-Thomas instant forms of dynamics . . 183
6.3.4 Cluster separability . . . . . . . . . . . . . . . . . . . . 186
6.3.5 Non-separability of the Bakamjian-Thomas dynamics . 189
6.3.6 Cluster separable 3-particle interaction . . . . . . . . . 190
6.4 Bound states and time evolution . . . . . . . . . . . . . . . . . 195
6.4.1 Mass and energy spectra . . . . . . . . . . . . . . . . . 195
6.4.2 Doppler effect revisited . . . . . . . . . . . . . . . . . . 197
6.4.3 Time evolution . . . . . . . . . . . . . . . . . . . . . . 199
6.5 Classical Hamiltonian dynamics . . . . . . . . . . . . . . . . . 202
6.5.1 Quasiclassical states . . . . . . . . . . . . . . . . . . . 203
6.5.2 Heisenberg uncertainty relation . . . . . . . . . . . . . 204
6.5.3 Spreading of quasiclassical wave packets . . . . . . . . 205
6.5.4 Phase space . . . . . . . . . . . . . . . . . . . . . . . . 206
6.5.5 Poisson brackets . . . . . . . . . . . . . . . . . . . . . 208
6.5.6 Time evolution of wave packets . . . . . . . . . . . . . 212
7 SCATTERING 217
7.1 Scattering operators . . . . . . . . . . . . . . . . . . . . . . . 218
7.1.1 S-operator . . . . . . . . . . . . . . . . . . . . . . . . . 218
7.1.2 S-operator in perturbation theory . . . . . . . . . . . . 221
7.1.3 Adiabatic switching of interaction . . . . . . . . . . . . 225
7.1.4 T-matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.1.5 S-matrix and bound states . . . . . . . . . . . . . . . . 229
7.2 Scattering equivalence . . . . . . . . . . . . . . . . . . . . . . 230
7.2.1 Scattering equivalence of Hamiltonians . . . . . . . . . 230
7.2.2 Bakamjian’s construction of the point form dynamics . 232
7.2.3 Scattering equivalence of forms of dynamics . . . . . . 234
8 FOCK SPACE 239

8.1 Annihilation and creation operators . . . . . . . . . . . . . . . 240
8.1.1 Sectors with fixed numbers of particles . . . . . . . . . 240
8.1.3 Creation and annihilation operators. Fermions . . . . . 243
8.1.4 Anticommutators of particle operators . . . . . . . . . 245
8.1.5 Creation and annihilation operators. Photons . . . . . 246
8.1.6 Particle number operators . . . . . . . . . . . . . . . . 247
8.1.7 Continuous spectrum of momentum . . . . . . . . . . . 248
8.1.8 Generators of the non-interacting representation . . . . 250
8.1.9 Poincaré transformations of particle operators . . . . . 252
8.2 Interaction potentials . . . . . . . . . . . . . . . . . . . . . . . 253
8.2.1 Conservation laws . . . . . . . . . . . . . . . . . . . . . 254
8.2.2 Normal ordering . . . . . . . . . . . . . . . . . . . . . 256
8.2.3 General form of interaction operators . . . . . . . . . . 257
8.2.4 Five types of regular potentials . . . . . . . . . . . . . 259
8.2.5 Products and commutators of potentials . . . . . . . . 263
8.2.6 More about t-integrals . . . . . . . . . . . . . . . . . . 266
8.2.7 Solution of one commutator equation . . . . . . . . . . 268
8.2.8 Two-particle potentials . . . . . . . . . . . . . . . . . . 269
8.2.9 Cluster separability in the Fock space . . . . . . . . . . 272
8.3 A toy model theory . . . . . . . . . . . . . . . . . . . . . . . . 275
8.3.1 Fock space and Hamiltonian . . . . . . . . . . . . . . . 275
8.3.2 Drawing a diagram in the toy model . . . . . . . . . . 277
8.3.3 Reading a diagram in the toy model . . . . . . . . . . 280
8.3.4 Electron-electron scattering . . . . . . . . . . . . . . . 281
8.3.5 Effective potential . . . . . . . . . . . . . . . . . . . . 283
8.4 Diagrams in a general theory . . . . . . . . . . . . . . . . . . . 284
8.4.1 Properties of products and commutators . . . . . . . . 284
8.4.2 Cluster separability of the S-operator . . . . . . . . . . 290
8.4.3 Divergence of loop integrals . . . . . . . . . . . . . . . 292
9 QUANTUM ELECTRODYNAMICS 297

9.1 Interaction in QED . . . . . . . . . . . . . . . . . . . . . . . . 298
9.1.1 Construction of simple quantum field theories . . . . . 299
9.1.2 Interaction operators in QED . . . . . . . . . . . . . . 302
9.2 S-operator in QED . . . . . . . . . . . . . . . . . . . . . . . . 304

9.2.1 S-operator in the second order . . . . . . . . . . . . . 304
9.2.2 Lorentz invariance of the S-operator . . . . . . . . . . 310
9.2.3 S_2 in Feynman-Dyson perturbation theory . . . . . . . 312
9.2.4 Feynman diagrams . . . . . . . . . . . . . . . . . . . . 316
9.2.5 Compton scattering . . . . . . . . . . . . . . . . . . . . 321
9.2.6 Virtual particles? . . . . . . . . . . . . . . . . . . . . . 322
10 RENORMALIZATION 325
10.1 Renormalization conditions . . . . . . . . . . . . . . . . . . . . 328
10.1.1 Regularization . . . . . . . . . . . . . . . . . . . . . . . 328
10.1.2 No-self-scattering renormalization condition . . . . . . 328
10.1.3 Charge renormalization condition . . . . . . . . . . . . 331
10.1.4 Renormalization in Feynman-Dyson theory . . . . . . . 332
10.2 Counterterms . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
10.2.1 Electron self-scattering . . . . . . . . . . . . . . . . . . 334
10.2.2 Electron self-energy counterterm . . . . . . . . . . . . . 336
10.2.3 Photon self-scattering . . . . . . . . . . . . . . . . . . . 339
10.2.4 Photon self-energy counterterm . . . . . . . . . . . . . 341
10.2.5 Charge renormalization . . . . . . . . . . . . . . . . . . 343
10.2.6 Vertex renormalization . . . . . . . . . . . . . . . . . . 344
10.3 Renormalized S-matrix . . . . . . . . . . . . . . . . . . . . . . 346
10.3.1 Vacuum polarization diagrams . . . . . . . . . . . . . . 347
10.3.2 Vertex diagrams . . . . . . . . . . . . . . . . . . . . . . 347
10.3.3 Ladder diagram . . . . . . . . . . . . . . . . . . . . . . 351
10.3.4 Cross-ladder diagram . . . . . . . . . . . . . . . . . . . 355
10.3.5 Renormalizability . . . . . . . . . . . . . . . . . . . . . 358
10.4 Troubles with renormalized QED . . . . . . . . . . . . . . . . 359
10.4.1 Renormalization in QED revisited . . . . . . . . . . . . 360
10.4.2 Time evolution in QED . . . . . . . . . . . . . . . . . 362
10.4.3 Unphys and renorm operators in QED . . . . . . . . . 364
II QUANTUM THEORY OF PARTICLES 367

11 DRESSED PARTICLE APPROACH 371
11.1 Dressing transformation . . . . . . . . . . . . . . . . . . . . . 372
11.1.1 On the origins of QED interaction . . . . . . . . . . . . 373
11.1.2 No-self-interaction condition . . . . . . . . . . . . . . . 374

11.1.3 Main idea of the dressed particle approach . . . . . . . 377
11.1.4 Unitary dressing transformation . . . . . . . . . . . . . 378
11.1.5 Dressing in the first perturbation order . . . . . . . . . 380
11.1.6 Dressing in the second perturbation order . . . . . . . 381
11.1.7 Dressing in arbitrary order . . . . . . . . . . . . . . . . 384
11.1.8 Infinite momentum cutoff limit . . . . . . . . . . . . . 385
11.1.9 Poincaré invariance of the dressed particle approach . . 387
11.2 Dressed interactions between particles . . . . . . . . . . . . . . 387
11.2.1 General properties of dressed potentials . . . . . . . . . 387
11.2.2 Energy spectrum of the dressed theory . . . . . . . . . 392
11.2.3 Comparison with other dressed particle approaches . . 393
12 COULOMB POTENTIAL AND BEYOND 397

12.1 Darwin-Breit Hamiltonian . . . . . . . . . . . . . . . . . . . . 398
12.1.1 Electron-proton potential in the momentum space . . . 398
12.1.2 Position representation . . . . . . . . . . . . . . . . . . 401
12.2 Hydrogen atom . . . . . . . . . . . . . . . . . . . . . . . . . . 403
12.2.1 Non-relativistic Schrödinger equation . . . . . . . . . . 404
12.2.2 Relativistic energy corrections (orbital) . . . . . . . . . 406
12.2.3 Relativistic energy corrections (spin-orbital) . . . . . . 409
13 DECAYS 413
13.1 Unstable system at rest . . . . . . . . . . . . . . . . . . . . . . 414
13.1.1 Quantum mechanics of particle decays . . . . . . . . . 414
13.1.3 Normalized eigenvectors of momentum . . . . . . . . . 419
13.1.4 Interacting representation of the Poincaré group . . . . 420
13.1.5 Decay law . . . . . . . . . . . . . . . . . . . . . . . . . 424
13.2 Breit-Wigner formula . . . . . . . . . . . . . . . . . . . . . . . 425
13.2.1 Schrödinger equation . . . . . . . . . . . . . . . . . . . 425
13.2.2 Finding function µ(m) . . . . . . . . . . . . . . . . . . 430
13.2.3 Exponential decay law . . . . . . . . . . . . . . . . . . 436
13.2.4 Wave function of decay products . . . . . . . . . . . . 438
13.3 Decay law for moving particles . . . . . . . . . . . . . . . . . . 441
13.3.1 General formula for the decay law . . . . . . . . . . . . 441
13.3.2 Decays of states with definite momentum . . . . . . . . 443
13.3.3 Decay law in the moving reference frame . . . . . . . . 445
13.3.4 Decays of states with definite velocity . . . . . . . . . . 446

13.4 “Time dilation” in decays . . . . . . . . . . . . . . . . . . . . 447
13.4.1 Numerical results . . . . . . . . . . . . . . . . . . . . . 447
13.4.2 Decays caused by boosts . . . . . . . . . . . . . . . . . 450
13.4.3 Particle decays in different forms of dynamics . . . . . 452
14 RQD IN HIGHER ORDERS 455

14.1 Spontaneous radiative transitions . . . . . . . . . . . . . . . . 456
14.1.1 Bremsstrahlung scattering amplitude . . . . . . . . . . 457
14.1.2 3rd order perturbation Hamiltonian . . . . . . . . . . . 461
14.1.3 Instability of excited atomic states . . . . . . . . . . . 463
14.1.4 Transition rate . . . . . . . . . . . . . . . . . . . . . . 464
14.1.5 Energy correction due to level instability . . . . . . . . 467
14.2 Radiative corrections . . . . . . . . . . . . . . . . . . . . . . . 471
14.2.1 Product term in (14.2) . . . . . . . . . . . . . . . . . . 472
14.2.2 Radiative corrections to the Coulomb potential . . . . 474
14.2.3 Lamb shift . . . . . . . . . . . . . . . . . . . . . . . . . 475
14.2.4 Electron’s anomalous magnetic moment . . . . . . . . 478
15 CLASSICAL ELECTRODYNAMICS 481

15.1 Hamiltonian formulation . . . . . . . . . . . . . . . . . . . . . 482
15.1.1 Darwin-Breit Hamiltonian . . . . . . . . . . . . . . . . 483
15.1.2 Two charges . . . . . . . . . . . . . . . . . . . . . . . . 484
15.1.3 Definition of force . . . . . . . . . . . . . . . . . . . . . 486
15.1.4 Wire with current . . . . . . . . . . . . . . . . . . . . . 488
15.1.5 Charge and current loop . . . . . . . . . . . . . . . . . 491
15.1.6 Charge and spin’s magnetic moment . . . . . . . . . . 494
15.1.7 Two types of magnets . . . . . . . . . . . . . . . . . . 495
15.1.8 Longitudinal forces in conductors . . . . . . . . . . . . 498
15.2 Experiments and paradoxes . . . . . . . . . . . . . . . . . . . 499
15.2.1 Conservation laws in Maxwell’s theory . . . . . . . . . 499
15.2.2 Conservation laws in RQD . . . . . . . . . . . . . . . . 501
15.2.3 Trouton-Noble “paradox” . . . . . . . . . . . . . . . . 502
15.3 Electromagnetic induction . . . . . . . . . . . . . . . . . . . . 504
15.3.1 Moving magnets . . . . . . . . . . . . . . . . . . . . . 504
15.3.2 Homopolar induction: non-conservative forces . . . . . 506
15.3.3 Homopolar induction: conservative forces . . . . . . . . 508
15.4 Aharonov-Bohm effect . . . . . . . . . . . . . . . . . . . . . . 511
15.4.1 Infinitely long solenoids or magnets . . . . . . . . . . . 512

15.4.2 Aharonov-Bohm experiment . . . . . . . . . . . . . . . 513
15.4.3 Toroidal magnet and moving charge . . . . . . . . . . . 516
15.5 Fast moving charges and radiation . . . . . . . . . . . . . . . . 522
15.5.1 Fast moving charge in RQD . . . . . . . . . . . . . . . 522
15.5.2 Fast moving charge in Maxwell’s electrodynamics . . . 527
15.5.3 Kislev-Vaidman “paradox” . . . . . . . . . . . . . . . . 528
15.5.4 Accelerated charges . . . . . . . . . . . . . . . . . . . . 531
15.5.5 Electromagnetic fields vs. photons . . . . . . . . . . . . 532
16 EXPERIMENTAL SUPPORT FOR RQD 535

16.1 Relativistic electron bunches . . . . . . . . . . . . . . . . . . . 536
16.1.1 Experiment at Frascati . . . . . . . . . . . . . . . . . . 537
16.1.2 Proposal for modified experiment . . . . . . . . . . . . 538
16.2 Radiation and bound fields . . . . . . . . . . . . . . . . . . . . 540
16.2.1 Near field studies . . . . . . . . . . . . . . . . . . . . . 540
16.2.2 Microwave horn antennas . . . . . . . . . . . . . . . . 541
16.2.3 Frustrated total internal reflection . . . . . . . . . . . . 542
17 PARTICLES AND RELATIVITY 545

17.1 Localizability of particles . . . . . . . . . . . . . . . . . . . . . 546
17.1.1 Measurements of position . . . . . . . . . . . . . . . . 547
17.1.2 Localized states in a moving reference frame . . . . . . 548
17.1.3 Spreading of well-localized states . . . . . . . . . . . . 549
17.1.4 Superluminal spreading and causality . . . . . . . . . . 551
17.2 Inertial transformations in multiparticle systems . . . . . . . . 554
17.2.1 Events and observables . . . . . . . . . . . . . . . . . . 554
17.2.2 Non-interacting particles . . . . . . . . . . . . . . . . . 557
17.2.3 Lorentz transformations for non-interacting particles . 558
17.2.4 Interacting particles . . . . . . . . . . . . . . . . . . . 560
17.2.5 Time translations in interacting systems . . . . . . . . 560
17.2.6 Boost transformations in interacting systems . . . . . . 562
17.2.7 Spatial translations and rotations . . . . . . . . . . . . 563
17.2.8 Physical inequivalence of forms of dynamics . . . . . . 566
17.2.9 “No interaction” theorem . . . . . . . . . . . . . . . . 567
17.3 Comparison with special relativity . . . . . . . . . . . . . . . . 573
17.3.1 On “derivations” of Lorentz transformations . . . . . . 573
17.3.2 On experimental tests of special relativity . . . . . . . 575
17.3.3 Poincaré invariance vs. manifest covariance . . . . . . . 577

17.3.4 Is there an observable of time? . . . . . . . . . . . . . . 578
17.3.5 Is geometry 4-dimensional? . . . . . . . . . . . . . . . 580
17.3.6 “Dynamical” relativity . . . . . . . . . . . . . . . . . . 582
17.3.7 Does action-at-a-distance violate causality? . . . . . . . 582
17.4 Are quantum fields necessary? . . . . . . . . . . . . . . . . . 587
17.4.1 Dressing transformation in a nutshell . . . . . . . . . . 587
17.4.2 What was the reason for having quantum fields? . . . 590
17.4.3 Quantum fields and space-time . . . . . . . . . . . . . 592
18 CONCLUSIONS 595
III MATHEMATICAL APPENDICES 599

A Groups and vector spaces 601
A.1 Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
A.2 Vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
B Delta function and useful integrals 607
C Some lemmas for orthocomplemented lattices. 611
D Rotation group 613

D.1 Basics of the 3D space . . . . . . . . . . . . . . . . . . . . . . 613
D.2 Scalars and vectors . . . . . . . . . . . . . . . . . . . . . . . . 615
D.3 Orthogonal matrices . . . . . . . . . . . . . . . . . . . . . . . 615
D.4 Invariant tensors . . . . . . . . . . . . . . . . . . . . . . . . . 618
D.5 Vector parameterization of rotations . . . . . . . . . . . . . . 621
D.6 Group properties of rotations . . . . . . . . . . . . . . . . . . 624
D.7 Generators of rotations . . . . . . . . . . . . . . . . . . . . . . 626
E Lie groups and Lie algebras 629

E.1 Lie groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
E.2 Lie algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
F Hilbert space 637

F.1 Inner product . . . . . . . . . . . . . . . . . . . . . . . . . . . 637
F.2 Orthonormal bases . . . . . . . . . . . . . . . . . . . . . . . . 638
F.3 Bra and ket vectors . . . . . . . . . . . . . . . . . . . . . . . . 639

F.4 Tensor product of Hilbert spaces . . . . . . . . . . . . . . . . . 641
F.5 Linear operators . . . . . . . . . . . . . . . . . . . . . . . . . . 641
F.6 Matrices and operators . . . . . . . . . . . . . . . . . . . . . . 643
F.7 Functions of operators . . . . . . . . . . . . . . . . . . . . . . 646
F.8 Linear operators in different orthonormal bases . . . . . . . . 650
F.9 Diagonalization of Hermitian and unitary matrices . . . . . . . 653
G Subspaces and projection operators 657

G.1 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657
G.2 Commuting operators . . . . . . . . . . . . . . . . . . . . . . . 659
H Representations of groups 667

H.1 Unitary representations of groups . . . . . . . . . . . . . . . . 667
H.2 Stone’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 669
H.3 Heisenberg Lie algebra . . . . . . . . . . . . . . . . . . . . . . 670
H.4 Double-valued representations of the rotation group . . . . . . 671
H.5 Unitary irreducible representations of the rotation group . . . 673
H.6 Pauli matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
I Special relativity 677

I.1 4-vector representation of the Lorentz group . . . . . . . . . . 677
I.2 Lorentz transformations for time and position . . . . . . . . . 682
I.3 Minkowski space-time and manifest covariance . . . . . . . . . 684
I.4 Decay of moving particles in special relativity . . . . . . . . . 685
I.5 Ban on superluminal signaling . . . . . . . . . . . . . . . . . . 686
J Quantum fields for fermions 689

J.1 Dirac’s gamma matrices . . . . . . . . . . . . . . . . . . . . . 689
J.2 Bispinor representation of the Lorentz group . . . . . . . . . . 690
J.3 Construction of the Dirac field . . . . . . . . . . . . . . . . . . 693
J.4 Properties of factors u and v . . . . . . . . . . . . . . . . . . . 695
J.5 Explicit formulas for u and v . . . . . . . . . . . . . . . . . . . 698
J.6 Convenient notation . . . . . . . . . . . . . . . . . . . . . . . 701
J.7 Transformation laws . . . . . . . . . . . . . . . . . . . . . . . 702
J.8 Functions Uµ and Wµ . . . . . . . . . . . . . . . . . . . . . . . 705
J.9 (v/c)^2 approximation . . . . . . . . . . . . . . . . . . . . . . . 705
J.10 Anticommutation relations . . . . . . . . . . . . . . . . . . . . 708
J.11 Dirac equation . . . . . . . . . . . . . . . . . . . . . . . . . . 709

J.12 Fermion propagator . . . . . . . . . . . . . . . . . . . . . . . . 712
K Quantum field for photons 715

K.1 Construction of the photon’s quantum field . . . . . . . . . . . 715
K.2 Explicit formula for eµ(p, τ) . . . . . . . . . . . . . . . . . . . 716
K.3 Useful commutator . . . . . . . . . . . . . . . . . . . . . . . . 718
K.4 Equal time commutator of photon fields . . . . . . . . . . . . 720
K.5 Photon propagator . . . . . . . . . . . . . . . . . . . . . . . . 720
K.6 Poincaré transformations of the photon field . . . . . . . . . . 722
L QED interaction in terms of particle operators 727

L.1 Current density . . . . . . . . . . . . . . . . . . . . . . . . . 727
L.2 First-order interaction in QED . . . . . . . . . . . . . . . . . 731
L.3 Second-order interaction in QED . . . . . . . . . . . . . . . . 731
M Loop integrals in QED 747

M.1 4-dimensional delta function . . . . . . . . . . . . . . . . . . . 747
M.2 Feynman’s trick . . . . . . . . . . . . . . . . . . . . . . . . . . 747
M.3 Some basic 4D integrals . . . . . . . . . . . . . . . . . . . . . 749
M.4 Electron self-energy integral . . . . . . . . . . . . . . . . . . . 753
M.5 Integral for the vertex renormalization . . . . . . . . . . . . . 757
M.6 Integral for the ladder diagram . . . . . . . . . . . . . . . . . 766
M.7 Coulomb scattering in 2nd order . . . . . . . . . . . . . . . . . 774
N Relativistic invariance of RQD 777

N.1 Relativistic invariance of simple QFT . . . . . . . . . . . . . . 777
N.2 Relativistic invariance of QED . . . . . . . . . . . . . . . . . . 779
N.3 Relativistic invariance of classical electrodynamics . . . . . . . 786
O Dimensionality checks 791

PREFACE
Looking back at theoretical physics of the 20th century, we see two
monumental achievements that radically changed the way we understand space,
time, and matter – the special theory of relativity and quantum mechanics.
These theories extended our comprehension to those parts of the natural
world that are not normally accessible to human senses and experience.
Special relativistic descriptions encompassed observers and objects moving
with extremely high speeds and high energies. Quantum mechanics was
essential for understanding properties of matter at the microscopic scale:
nuclei, atoms, molecules, etc. In the 21st century the challenge remains in
the unification of these two theories, i.e., in the theoretical description
of energetic elementary particles and their interactions.
It is commonly accepted that the most promising candidate for such
a unification is the local quantum field theory (QFT). Indeed, this theory
has achieved astonishing accuracy in calculations of certain physical
observables, such as scattering cross-sections and energy spectra. In some
instances, the discrepancies between experiments and predictions of quantum
electrodynamics (QED) are less than 0.000000001%. It is difficult to find
such accuracy anywhere else in science! However, in spite of its success,
quantum field theory cannot be regarded as the ultimate unification of
relativity and quantum mechanics. Too many fundamental questions remain
unanswered and too many serious problems are left unsolved.
It is fair to say that everyone who has tried to learn QFT has been struck
by its detachment from physically intuitive ideas and by its enormous
complexity. A successful physical theory is expected to have, as much as
possible, real-life counterparts for its mathematical constructs. This is
often not the case in QFT, where such physically transparent concepts of
quantum mechanics as the Hilbert space, wave functions, particle observables,
and the Hamiltonian were replaced (though not completely discarded) by the
more formal and obscure notions of quantum fields, ghosts, propagators, and
Lagrangians. It was even declared that the concept of a particle is not
fundamental anymore and must be abandoned in favor of the field description
of nature:
In its mature form, the idea of quantum field theory is that quantum
fields are the basic ingredients of the universe and particles
are just bundles of energy and momentum of the fields.
S. Weinberg [Wei97]
The most notorious failure of QFT is the problem of ultraviolet
divergences: to obtain sensible results from QFT calculations one must drop
certain infinite terms. Although rules for doing such tricks are well
established, they cannot be considered a part of a mathematically sound
theory. As Dirac remarked:
This is just not sensible mathematics. Sensible mathematics involves
neglecting a quantity when it turns out to be small – not
neglecting it because it is infinitely large and you do not want it!
P. A. M. Dirac
In modern QFT the problem of ultraviolet infinities is not solved, it is “swept

under the rug.” Even if the infinities in scattering amplitudes are “renor-
malized”, one ends up with an ill-defined Hamiltonian, which is not suitable
for describing the time evolution of states. The prevailing opinion is that
ultraviolet divergences are related to our lack of understanding of physics at
short distances. It is often argued that QFT is a low energy (effective) ap-
proximation to some yet unknown truly fundamental theory, and that in this
final theory the small distance or high energy (ultraviolet) mischiefs will be
tamed somehow. There are various guesses about what this ultimate theory
may be. Some think that future theory will reveal a non-trivial, probably
discrete, or non-commutative structure of space at distances comparable to
the Planck scale of $10^{-33}$ cm. Others hope that paradoxes will go away if we
replace point-like particles by tiny extended objects, like strings.
Many researchers agree that the most fundamental obstacle on the way
forward is the deep contradiction between quantum theory and Einstein’s
relativity theory (both special and general). In a more general sense, the
basic question is “what is space and time?” The answers given by Einstein’s
theory of relativity and by quantum mechanics are quite different. In special
relativity, position and time are treated on an equal footing, both of them
being coordinates in the 4-dimensional Minkowski space-time. However, in
quantum mechanics position and time play very different roles. Position (like
any other physical observable) is described by a Hermitian
operator, whereas time is a numerical parameter, which cannot be cast into
the operator form without contradictions.
In our book we would like to take a fresh look at these issues. Two basic
postulates of our approach are completely non-controversial. They are the
principle of relativity (= the equivalence of all inertial frames of reference)
and the laws of quantum mechanics. From the mathematical perspective,
the former postulate is embodied in the notion of the Poincaré group and
the latter postulate leads to the algebra of operators in the Hilbert space.
When combined, these two statements inevitably imply the idea of unitary
representations of the Poincaré group in the Hilbert space as the major math-
ematical tool for the description of any isolated physical system. One of our
goals is to demonstrate that observable physics fits nicely into this math-
ematical framework. We will also see that traditional theories sometimes
deviate from these postulates, which often leads to unphysical conclusions
and paradoxes. Our goal is to find, analyze, and correct these deviations.
Although the ideas presented here are rather general in nature, most
calculations will be performed for systems of charged particles and photons and
electromagnetic forces acting between them. Traditionally, these systems
were described by quantum electrodynamics (QED). However, our approach
will lead us to a different theory, which we call relativistic quantum dynamics
or RQD. Our approach is exactly equivalent to the renormalized QED as
far as properties related to the S-matrix (scattering cross-sections, lifetimes,
energies of bound states, etc.) are concerned. However, different results are
expected for the time evolution and boost transformations in interacting sys-
tems.
RQD differs from the traditional approach in two important aspects: the
recognition of the dynamical character of boosts and the primary role of
particles rather than fields.
The dynamical character of boosts. Lorentz transformations for
space-time coordinates of events are the most fundamental relationships in
Einstein’s special relativity. These formulas are usually derived for simple
events associated either with light beams or with free (non-interacting) par-
ticles. Nevertheless, special relativity tacitly assumes that these Lorentz
formulas can be extended to all events with interacting particles regardless
of the interaction strength. We will show that this assumption is actually
wrong. We will derive boost transformations of particle observables by using
Wigner’s theory of unitary representations of the Poincaré group [Wig39]
and Dirac’s approach to relativistic interactions [Dir49]. It will then follow
that boost transformations should be interaction-dependent. The usual universal
Lorentz transformations of special relativity are thus only approximations.
The Minkowski 4-dimensional space-time is an approximate concept as well.
Particles rather than fields. Presently accepted quantum field theories
(e.g., the renormalized QED) have serious difficulties in describing the
time evolution of even the simplest systems, such as the vacuum and single-particle
states. Direct application of the QED time evolution operator to these states
leads to spontaneous creation of extra (virtual) particles, which have not been
observed in experiments. The problem is that bare particles of QED have
a rather remote relationship to physically observed electrons, positrons, etc.,
while the rules connecting bare and physical particles are not well established.
We solve this problem by using the “dressed particle” formalism, which is the
cornerstone of our RQD approach. The “dressed” Hamiltonian of RQD is ob-
tained by applying a unitary dressing transformation to the traditional QED
Hamiltonian. This transformation does not change the S-operator of QED,
therefore the perfect agreement with experimental data is preserved. The
RQD Hamiltonian describes electromagnetic phenomena in terms of directly
interacting physical particles (electrons, photons, etc.) without reference to
spurious bare and virtual particles. Quantum fields play only an auxiliary
role. In addition to accurate scattering amplitudes, our approach allows us
to obtain the time evolution of interacting particles and offers a rigorous way
to find both energies and wave functions of bound states. All calculations
with the RQD Hamiltonian can be done by using standard recipes of quan-
tum mechanics without encountering embarrassing ultraviolet divergences
and without the need for artificial cutoffs, regularization, renormalization,
etc.
Of course, the idea of particles with action-at-a-distance forces is not
new. The original Newtonian theory of gravity had exactly this form, and
(quasi-)particle approaches are often used in modern theories. However, the
consensus opinion is that such approaches can be only approximate, in par-
ticular, because instantaneous interactions are believed to violate important
principles of relativistic invariance and causality. Textbooks try to convince
us that these important principles can be reconciled with quantum postulates
only in a theory based on local (quantum) fields with retarded interactions.
In this book we are going to challenge this consensus and demonstrate that
the particle picture and action-at-a-distance do not contradict relativity and
causality.
Our central message can be summarized in a few sentences:
The physical world is composed of point-like particles.
They obey laws of quantum mechanics and interact with
each other via instantaneous action-at-a-distance po-
tentials, which depend on distances between the par-
ticles and on their momenta. These potentials may
lead to the creation and annihilation of particles as
well. This picture is in full agreement with princi-
ples of relativity and causality. In order to establish
this agreement one should recognize that boost transfor-
mations of particle observables depend on interactions
acting in the system. Thus special-relativistic formu-
las for Lorentz transformations are approximate. Ex-
act relativistic theories of interacting particles should
be formulated without reference to the unphysical 4D
Minkowski space-time.
This book is divided into three parts. Part I: QUANTUM ELECTRODYNAMICS
comprises ten chapters, 1 - 10. In this part we avoid
controversial issues and stick to traditionally accepted views on relativistic
quantum theories, such as QFT. We specify our basic assumptions, notation,
and terminology while trying to follow a logical path starting from basic
postulates of relativity and probability and culminating in calculation of the
renormalized S-matrix in QED. The purpose of Part I is to set the stage for
introducing our non-traditional particle-based approach in the second part
of the book.
In chapter 1 Quantum mechanics the basic laws of quantum mechanics
are derived from simple axioms of measurements (quantum logic).
In chapter 2 Poincaré group we introduce the Poincaré group as a
set of transformations that relate different (but equivalent) inertial reference
frames.
Chapter 3 Quantum mechanics and relativity unifies the two above
pieces and establishes unitary representations of the Poincaré group in the
Hilbert space of states as the most general mathematical description of any
isolated physical system.
In chapter 4 Operators of observables we find the correspondence
between known physical observables (such as mass, energy, momentum, spin,
position, etc.) and concrete Hermitian operators in the Hilbert space.
Chapter 5 Single particles is devoted to Wigner’s theory of irreducible
representations of the Poincaré group. It provides a complete description of
basic properties and dynamics of isolated stable elementary particles.
In chapter 6 Interaction we discuss relativistically invariant interactions
in multi-particle systems.
Chapter 7 Scattering is devoted to quantum-mechanical description of
particle collisions.
In chapter 8 Fock space we consider the general class of systems in
which particles can be created and annihilated and their numbers are not
conserved.
In chapter 9 Quantum electrodynamics we apply all the above ideas
to systems of charged particles and photons in the formalism of QED.
Chapter 10 Renormalization concludes this first “traditional” part of
the book. This chapter discusses the appearance of ultraviolet divergences
in QED and explains their elimination by means of counterterms added to
the Hamiltonian.
Part II of the book QUANTUM THEORY OF PARTICLES (chap-
ters 11 - 18) examines the new particle-based RQD approach, its connection
to the traditional theory from part I, and its advantages. Our goal is to dis-
pel the common prejudice against using particle interpretation in relativistic
quantum theories. We show that the view of the world as consisting of point
particles interacting via instantaneous direct potentials is not contradictory
and is capable of explaining physical phenomena just as well as, or even better than,
the mainstream field-based view.
This “non-traditional” part of the book begins with chapter 11 Dressed
particle approach, which provides a deeper analysis of renormalization and
the bare particle picture in quantum field theories. The main ideas of our
particle-based approach are formulated here, and QED is rewritten in
terms of creation and annihilation operators of physical particles, rather than
bare quantum fields.
In chapter 12 Coulomb potential and beyond we derive the dressed
interaction between charged particles and use it to calculate the spectrum of
the hydrogen atom.
Chapter 13 Decays deals with a rigorous description of unstable quantum
systems. The special focus is on decays of fast-moving particles. Here we show
that Einstein’s usual time dilation formula is not an accurate description
of such phenomena. In principle, it should be possible to observe deviations
from this formula in experiments, but, unfortunately, the required precision
cannot be reached with the currently available technology.
The mathematics of particle decays is applied to radiative transitions in
the hydrogen atom in chapter 14 RQD in higher orders. In this chapter
we also discuss infrared divergences (and their cancelation) in high pertur-
bation orders. In particular, we calculate the electron’s anomalous magnetic
moment and the Lamb shifts of atomic energy levels.
In chapter 15 Classical electrodynamics we show that classical electro-
dynamics can be reformulated as a Hamiltonian theory of charged particles
with action-at-a-distance forces. These forces depend not only on the dis-
tance between the charges, but also on their velocities and spins. In this
formulation, electromagnetic fields and potentials are not present at all and
Maxwell’s equations do not play any role. This allows us to resolve a number
of theoretical paradoxes and, at the same time, remain in agreement with
experimental data. Even the famous Aharonov-Bohm experiment gets its ex-
planation as an effect of inter-particle interactions on the phases of quantum
wave packets – i.e., without any involvement of electromagnetic potentials
and non-trivial space topology.
We conclude our discussion of electromagnetic phenomena with chapter
16 Experimental support for RQD, where we briefly describe several
experiments supporting our idea about the instantaneous propagation of
Coulomb and magnetic interactions.
In chapter 17 Particles and relativity we discuss real and imaginary
paradoxes usually associated with the particle interpretation of QFT. In par-
ticular, we discuss the superluminal spreading of localized wave packets and
the Currie-Jordan-Sudarshan “no interaction” theorem. We show that su-
perluminality and action-at-a-distance can coexist with causality if the rela-
tivistic invariance of interactions is properly understood.
The final small chapter 18 Conclusions summarizes major results and
conclusions of this work and briefly mentions possible directions for future
investigations.
Some useful mathematical facts and more technical derivations are collected
in Part III: MATHEMATICAL APPENDICES.
Remarkably, the development of the new particle-based RQD approach
did not require introduction of radically new physical ideas. Actually, all
key ingredients of this study were formulated a long time ago, but for some
reason they have not attracted the attention they deserve. For example, the
fact that either translations or rotations or boosts must have dynamical de-
pendence on interactions was first established in Dirac’s work [Dir49]. These
ideas were further developed in “direct interaction” theories by Bakamjian
and Thomas [BT53], Foldy [Fol61], Sokolov [Sok75, SS78], Coester and Poly-
zou [CP82] and many others. The primary role of particles in formulation
of quantum field theories was emphasized in an excellent book by Weinberg
[Wei95]. The “dressed particle” approach was advocated by Greenberg and
Schweber [GS58] (footnote 2). First indications that this approach can solve the problem
of ultraviolet divergences in QFT are contained in papers by Ruijgrok
[Rui59], Shirokov and Vişinesku [Shi72, VS74]. The formulation of RQD
presented in this book just combined all these good ideas into one compre-
hensive approach, which, we believe, is a step toward a consistent unification
of quantum mechanics and the principle of relativity.
In this book we are using the Heaviside-Lorentz system of units (footnote 3), in which
the Coulomb law has the form $V = q_1 q_2/(4\pi r)$ and the proton charge has
the value of $e = 2\sqrt{\pi} \times 4.803 \times 10^{-10}$ statcoulomb. The speed of light is
$c = 2.998 \times 10^{10}$ cm/s; the Planck constant is $\hbar = 1.054 \times 10^{-27}$ erg·s, so the
fine structure constant is $\alpha \equiv e^2/(4\pi\hbar c) \approx 1/137$.
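As a quick numerical check of the constants quoted above, the following sketch evaluates the fine structure constant from the rounded values given in the text. The variable and function names are our own illustrative choices, and because the inputs are rounded to four digits the result can only be expected to land close to 1/137, not to reproduce the precisely measured value.

```python
import math

# Rounded constants as quoted in the text (CGS-based Heaviside-Lorentz units).
E_GAUSSIAN = 4.803e-10  # proton charge in Gaussian units, statcoulomb
C = 2.998e10            # speed of light, cm/s
HBAR = 1.054e-27        # Planck constant (h-bar), erg*s

# The Heaviside-Lorentz charge is the Gaussian charge times 2*sqrt(pi).
e_hl = 2.0 * math.sqrt(math.pi) * E_GAUSSIAN


def fine_structure_constant(e, hbar, c):
    """Return alpha = e^2 / (4 pi hbar c), with e in Heaviside-Lorentz units."""
    return e**2 / (4.0 * math.pi * hbar * c)


alpha = fine_structure_constant(e_hl, HBAR, C)
inverse_alpha = 1.0 / alpha
print(f"alpha = {alpha:.6f}, 1/alpha = {inverse_alpha:.2f}")
```

The factor 4π in the denominator cancels the factor (2√π)² in the charge, so the same number would be obtained from the Gaussian expression e²/(ħc).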
The new material contained in this book was partially covered in six
articles [Ste01, Ste96, Ste02, Ste05, Ste06, Ste08].
I would like to express my gratitude to Peter Enders, Theo Ruijgrok and
Boris Zapol for reading this book and making valuable critical comments
and suggestions. I also would like to thank Harvey R. Brown, Rainer Grobe,
William Klink, Vladimir Korda, Chris Oakley, Federico Piazza, Wayne Poly-
zou, Alexander Shebeko and Mikhail Shirokov for enlightening conversa-
tions as well as Bilge, Bernard Chaverondier, Wolfgang Engelhardt, Juan R.
González-Álvarez, Bill Hobba, Igor Khavkine, Mike Mowbray, Arnold Neu-
maier and Dan Solomon for online discussions and fresh ideas that allowed
me to improve the quality of this manuscript over the years. These acknowledgements
do not imply any direct or indirect endorsements of my work by
these distinguished researchers. All errors and misconceptions contained in
this book are mine and only mine.
Footnote 2: A very similar unitary transformation technique was developed even earlier by Fröhlich
[Frö52, Frö61] in the theory of electron-phonon interactions in solids.
Footnote 3: See Appendix in [Jac99].
INTRODUCTION
It is wrong to think that the task of physics is to find out how
nature is. Physics concerns what we can say about nature...
Niels Bohr
In this Introduction, we will try to specify more exactly what the
purpose of theoretical physics is and what fundamental concepts and
relationships are studied by this branch of science. Some of the definitions
and statements made here may look self-evident or even trivial. Nevertheless,
it seems important to spell out these definitions explicitly, in order to avoid
misunderstandings in later parts of the book.
We obtain all information about the physical world from measurements,
and the fundamental goal of theoretical physics is to describe and predict
the results of these measurements. The act of measurement requires at least
three objects (see Fig. 1): a preparation device, a physical system, and a
measuring apparatus. The preparation device prepares the physical system
in a certain state. The state of the system has some attributes or properties.
If an attribute or property can be assigned a numerical value, it will be called
an observable F. The observables are measured by bringing the system into
contact with the measuring apparatus. The result of the measurement is a
numerical value of the observable, which is a real number f. We assume that
every measurement of F yields some value f, so that there are no misfirings
of the measuring apparatus.
This was just a brief list of relevant notions. Let us now look at all these
ingredients in more detail.
Physical system. Loosely speaking, the physical system is any object
that can trigger a response (measurement) in the measuring apparatus. As
the physical system is the most basic concept in physics, it is difficult to give a
more precise definition. An intuitive understanding will be sufficient for our
purposes. For example, an electron, a hydrogen atom, a book, and a planet are
all physical systems.
[Figure 1: Schematic representation of the preparation/measurement process. Labels in the figure: preparation device, physical system (state), measuring apparatus, clock (t), value f(t) of observable F.]
Physical systems can be either elementary (also called particles) or com-
pound, i.e., consisting of two or more particles.
In this book we will limit our discussion to isolated systems, which do
not interact with the rest of the world or with any external potential (footnote 4). By
doing so, we exclude some interesting physical systems and effects, like atoms
in external electric and magnetic fields. However, this does not limit the
generality of our treatment. Indeed, one can always combine the atom and
the field-creating device into one unified system that can be studied within
the “isolated system” approach.
States. Any physical system may exist in a variety of different states:
a book can be on your desk or in the library; it can be open or closed; it
can be at rest or fly at high speed. The distinction between different
systems and different states of the same system is sometimes far from obvious.
For example, a separated pair of particles (electron + proton) does
not look like the hydrogen atom. So, one may conclude that these are two
different systems. However, in reality these are two different states of the
same compound system.
Footnote 4: Of course, the interaction with the measuring apparatus must be allowed, because this
interaction is the only way to get objective information about the system. However, we
reject the idea that the process of measurement should have a dynamical description in
the theory. See subsection 1.7.3.
Preparation and measuring devices. Generally, preparation and
measuring devices can be rather sophisticated, e.g., accelerators, bubble
chambers, etc. It would be hopeless to include in our theoretical frame-
work a detailed description of their design and how they interact with the
physical system. Instead, we will use an idealized representation of both the
preparation and measurement acts. In particular, we will assume that the
measuring apparatus is a black box whose job is to produce just one real
number - the value of some observable - upon the act of measurement.
It is important to note that generally the measuring device can measure
only one observable. We will not assume that it is possible to measure several
observables at once with the same device. For example, a particle’s position
and momentum cannot be obtained in one measurement.
We will also see that one preparation/measurement act is not sufficient
for a full characterization of the studied physical system. Our prepara-
tion/measurement setup should be able to process multiple copies of the
same system prepared in exactly the same conditions (footnote 5). A striking property
of nature is that in such repetitive measurements we are not guaranteed to
obtain exactly the same results. We will see that in many cases results of
measurements are subject to a random scatter. So, theoretical descriptions
of states can be only probabilistic. This idea is the starting point of quantum
mechanics.
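The role of the ensemble can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the two-valued observable, the probability `p_up = 0.3`, and the function names are all hypothetical, and a pseudo-random number generator merely stands in for the intrinsic randomness of measurement results. Each call to `prepare_and_measure` models one preparation/measurement act that returns a single real number; only the statistics over many identically prepared copies characterize the state.

```python
import random
from collections import Counter


def prepare_and_measure(rng, p_up=0.3):
    """One preparation/measurement act: returns a single real number.

    The state is (hypothetically) characterized by the probability p_up
    of obtaining the value +1; otherwise the value -1 is obtained.
    """
    return 1.0 if rng.random() < p_up else -1.0


def run_ensemble(n_copies, p_up=0.3, seed=0):
    """Measure the same observable on n_copies identically prepared systems."""
    rng = random.Random(seed)
    return [prepare_and_measure(rng, p_up) for _ in range(n_copies)]


results = run_ensemble(10_000)
counts = Counter(results)
frequency_up = counts[1.0] / len(results)
print(f"relative frequency of +1: {frequency_up:.3f}")
```

A single measurement yields either +1 or -1 and says almost nothing about `p_up`; the relative frequency in a large ensemble approaches it, which is the sense in which the description of a state is probabilistic.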
Observables. Theoretical physics is inclined to study the simplest physical
systems and their most fundamental observable properties, such as mass,
velocity, spin, etc. We will assume exact measurability of any observable. Of
course, this claim is an idealization. Clearly, there are precision limits for
all real measuring apparatuses. However, we will suppose that with certain
efforts one can always make more and more accurate measurements, so the
precision is, in principle, unlimited (footnote 6).
Footnote 5: This is also called an ensemble.
Footnote 6: For example, it is impossible in practice to measure the location of the electron inside
the hydrogen atom. Nevertheless, we will assume that this can be done in our idealized
(footnote 6 is continued below)
Some observables can take a value anywhere on the real axis R. The
Cartesian components of position $R_x$, $R_y$, and $R_z$ are good examples of such
(unlimited range, continuous) observables. However, there are also observables
for which this is not true and the allowed values form only a subset
of the real axis. Such a subset is called the spectrum of the observable.
For example, it is known (see Chapter 5) that each component of a particle’s
velocity cannot exceed the speed of light c, so the spectrum of the velocity
components $V_x$, $V_y$, and $V_z$ is $[-c, c]$. Both position and velocity have
continuous spectra. However, there are many observables having a discrete
spectrum. For example, the number of particles in the system (which is also
a valid observable) can only take integer values 0, 1, 2, ... Later we will also
meet observables whose spectrum is a combination of discrete and continuous
parts, e.g., the energy spectrum of the hydrogen atom.
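The three kinds of spectra mentioned above can be written down as simple membership tests. This is a minimal sketch, not part of the formalism developed later: the function names are hypothetical, and the numerical value of c is the rounded one used in this book's system of units.

```python
import math

C = 2.998e10  # speed of light in cm/s (rounded value)


def in_position_spectrum(x):
    """Position component: any finite real number is an allowed value."""
    return math.isfinite(x)


def in_velocity_spectrum(v, c=C):
    """Velocity component: continuous spectrum restricted to [-c, c]."""
    return -c <= v <= c


def in_particle_number_spectrum(n):
    """Number of particles: discrete spectrum 0, 1, 2, ..."""
    return n >= 0 and float(n).is_integer()


print(in_position_spectrum(-1.0e30))     # unlimited range
print(in_velocity_spectrum(1.1 * C))     # outside [-c, c]
print(in_particle_number_spectrum(2.5))  # not an integer
```

An observable with a mixed spectrum would combine a test of the discrete kind with a test of the continuous kind.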
Clearly the measured values of observables must depend on the kind of
the system being measured and on its state. The measurement of any true
observable must involve some kind of interaction or contact between the ob-
served system and the measuring apparatus. We emphasize this fact because
there are numerical quantities in physics which are not associated with any
physical system and therefore they are not called observables. For example,
the number of space dimensions (3) is not an observable, because we do not
regard space as an example of a physical system.
Time and clocks. Another important physical quantity that does not
belong to the class of observables is time. We cannot say that time is a
property of a physical system, because a “measurement” of time (looking
at positions of the clock’s arms) does not involve any interaction with the
physical system. One can “measure” time even in the absence of any physical
system in our laboratory. To do that, one just needs to have a clock, which
is a necessary part of any experimental setup and not a physical system
by itself (footnote 7). The clock assigns a time label (a numerical parameter) to each
measurement of true observables, and this label does not depend on the
state of the observed system. The unique place of the clock and time in the
measurement process is indicated in Fig. 1.
Footnote 6 (continued): theory. Then each individual measurement of the electron’s position would yield a certain
result $\mathbf{r}$. However, as will be discussed in chapter 1, results of repetitive measurements in
the ensemble are generally non-reproducible and random. So, in quantum mechanics the
electron’s state should be describable by a probability distribution $|\psi(\mathbf{r})|^2$.
Footnote 7: Of course, one can decide to consider the laboratory clock as a physical system and
perform physical measurements on it. For example, one can investigate the quantum
uncertainty in the clock’s arm position. However, then this particular clock is no longer
suitable for “measuring” time. Some other device must be used for time-keeping purposes.
Observers. We will call an observer O a collection of measuring apparatuses
(plus a specific device called a clock), which are designed to measure all possible
observables. A laboratory is a full experimental setup, i.e., a preparation device
plus an observer O with all his measuring devices.
In this book we consider only inertial observers (= inertial frames of ref-
erence) or inertial laboratories. These are observers that move uniformly
without acceleration and rotation, i.e., observers whose velocity and orientation
of axes do not change with time. The importance of choosing inertial
observers will become clear in section 2.1 where we will see that measure-
ments performed by these observers obey the principle of relativity.
The minimal set of measuring devices associated with an observer includes
a yardstick for measuring distances, a clock for registering time, a fixed point
of origin and three mutually perpendicular axes erected from this point. In
addition to measuring properties of physical systems, our observers can also
see their fellow observers. With the measuring kit described above, each observer
O can characterize another observer O′ by ten parameters $\{\vec{\phi}, \mathbf{v}, \mathbf{r}, t\}$.
These parameters include i) the time shift $t$ between O and O′; ii) the position
vector $\mathbf{r}$ that connects the origin of O with the origin of O′; iii) the
rotation angle (footnote 8) $\vec{\phi}$ that relates orientations of axes in O′ to orientations of
axes in O; and iv) the velocity $\mathbf{v}$ of O′ relative to O.
It is convenient to introduce the notion of inertial transformations of
observers and laboratories. Transformations of this kind include
• rotations,
• space translations,
• time translations,
• changes of velocity or boosts.
There are three independent rotations (around x, y, and z axes), three independent
translations, and three independent boosts. So, together with time
translations, that makes 10 basic types of inertial transformations. More general
inertial transformations can be made by performing two (or more) basic
transformations in succession. We will postulate that for any pair of inertial
observers O and O′ one can always find an inertial transformation g, such
that O′ = gO. Conversely, application of any inertial transformation g to
any inertial observer O leads to a different valid inertial observer O′ = gO.
In chapter 2 we will make an important observation that transformations g
form a group.
Footnote 8: The vector parameterization of rotations is discussed in Appendix D.5.
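The counting of basic transformations and the idea of performing two transformations in succession can be illustrated by a small sketch. It rests on several simplifying assumptions of our own: boosts are omitted entirely (their composition rules are the subject of chapter 2), rotations are restricted to the z axis, and a transformation is taken to act on the coordinates of an event as "rotate, then shift". The class `Transformation` and its method names are hypothetical.

```python
import math
from dataclasses import dataclass

# Independent basic inertial transformations of each type.
BASIC_TYPES = {"rotations": 3, "space translations": 3,
               "time translations": 1, "boosts": 3}


@dataclass(frozen=True)
class Transformation:
    """Rotation by `angle` about the z axis, then shifts in space and time."""
    angle: float = 0.0
    shift: tuple = (0.0, 0.0, 0.0)
    dt: float = 0.0

    def rotate(self, vec):
        c, s = math.cos(self.angle), math.sin(self.angle)
        x, y, z = vec
        return (c * x - s * y, s * x + c * y, z)

    def apply(self, point, time):
        """Act on the coordinates (point, time) of an event."""
        rx, ry, rz = self.rotate(point)
        ax, ay, az = self.shift
        return (rx + ax, ry + ay, rz + az), time + self.dt

    def after(self, first):
        """Composite transformation: apply `first`, then `self`."""
        rx, ry, rz = self.rotate(first.shift)
        ax, ay, az = self.shift
        return Transformation(self.angle + first.angle,
                              (rx + ax, ry + ay, rz + az),
                              self.dt + first.dt)

    def inverse(self):
        """Transformation that undoes `self`."""
        bx, by, bz = Transformation(-self.angle).rotate(self.shift)
        return Transformation(-self.angle, (-bx, -by, -bz), -self.dt)


g1 = Transformation(angle=0.3, shift=(1.0, 0.0, 2.0), dt=5.0)
g2 = Transformation(angle=-1.1, shift=(0.0, 4.0, -1.0), dt=-2.0)
event = ((0.5, -0.2, 3.0), 10.0)

step_by_step = g2.apply(*g1.apply(*event))   # g1 first, then g2
in_one_go = g2.after(g1).apply(*event)       # single composite transformation
identity = g1.inverse().after(g1)
print(sum(BASIC_TYPES.values()), step_by_step, in_one_go)
```

Within this restricted family, two successive transformations are again a transformation of the same kind and every transformation has an inverse, which is the closure behavior expected of a group.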
An important comment should be made about the definition of “observer”
used in this book. Usually, an observer is understood as a person (or a measuring
apparatus) that exists and performs measurements for an infinitely long
time. For example, it is common to discuss the time evolution of a physical
system “from the point of view” of this or that observer. However, this collo-
quial definition does not fit our purposes. The problem with this definition is
that it singles out time translations as being different from space translations,
rotations, and boosts. In this approach time translations become associated
with the observer herself rather than being treated equally with other iner-
tial transformations between observers. The central idea of our approach to
relativity is to treat all ten types of inertial transformations on equal footing.
Therefore, we will use a definition of observer that is slightly different from
the one described above. In our definition observers are “short-living.” They
exist and perform measurements in a short time interval and they can see
only a snapshot of the world around them. So, individual observers cannot
“see” the time evolution of a physical system. In our approach the time evo-
lution is described as a succession of measurements performed by a series of
instantaneous observers related to each other by time translations. Then the
colloquial “observer” is actually a continuous sequence of our “short-living”
observers displaced in time with respect to each other.
One of the most important tasks of physics is to establish the relationship
between measurements performed by two different observers on the same
physical system. These relationships will be referred to as inertial transformations
of observables. In particular, if values of some observables measured
by O are known, and the inertial transformation connecting O with O′ is
known as well, then we should be able to figure out the values of those observables
from the point of view of O′. Probably the most important and
challenging task of this kind is the description of dynamics or time evolution.
In this case, observers O′ and O are connected by a time translation.
The goals of physics. The above discussion can be summarized by
indicating five essential goals of theoretical physics:
• provide a classification of physical systems;
• for each physical system give a list of observables and their spectra;
• for each physical system give a list of possible states;
• for each state of the system describe the results of measurements of
relevant observables;
• given one observer’s description of the system’s state find out how other
observers see the same state.
Part I
QUANTUM
ELECTRODYNAMICS
Chapter 1
QUANTUM MECHANICS
The nature of light is a subject of no material importance to the
concerns of life or to the practice of the arts, but it is in many
other respects extremely interesting.
Thomas Young
In this chapter we are going to discuss the most basic inter-relationships
between preparation devices, physical systems, and measuring apparatuses
(see Fig. 1). In particular, we will ask what kind of information about the
physical system can be obtained by the observer and how this information
depends on the state of the system?
Until the end of the 19th century these questions were answered by clas-
sical mechanics which, basically, said that in each state the physical system
has a number of observables (e.g., position, momentum, mass, etc.) whose
values can be measured simultaneously, accurately, and reproducibly. These
deterministic views were held to be indisputable and self-evident not only in
classical mechanics, but throughout classical physics.
Dissatisfaction with the classical theory started to grow at the end of
the 19th century when this theory was found inapplicable to microscopic
phenomena, such as the radiation spectrum of heated bodies, discrete spec-
tra of atoms, and the photo-electric effect. Solutions for these and many
other problems were found in quantum mechanics whose creation involved
joint efforts and passionate debates of such outstanding scientists as Bohr,
Born, de Broglie, Dirac, Einstein, Fermi, Fock, Heisenberg, Pauli, Planck,
Schrödinger, Wigner, and many others. The picture of the physical world
that emerged from these efforts was weird, paradoxical, and completely different
from the familiar classical picture. However, despite this apparent weirdness,
predictions of quantum mechanics are being tested countless times every day
in physical and chemical laboratories around the world and not a single time
were these “weird” predictions found wrong. This makes quantum mechanics
the most successful and accurate physical theory of all time.
There are dozens of good textbooks, which explain the laws and rules
of quantum mechanics and how they can be used to perform calculations in
each specific case. These laws and rules are not controversial and the reader
of this book is supposed to be familiar with them. However, the deeper
meaning and interpretation of the quantum formalism is still a subject of
fierce debate. Why does nature obey the rules of quantum mechanics?
Why are there wave functions satisfying the linear superposition principle?
Is it possible to change the rules (e.g., introduce some non-linearity) without
finding ourselves in contradiction with experiments? People have been asking these
questions more frequently in recent years as the search for quantum gravity
has intensified, and one fashionable idea has been that one should modify the rules
of quantum mechanics in order to reconcile them with general relativity.
In this chapter we will present a lesser-known viewpoint on the theoretical
origins of quantum laws. This approach says that the true laws of logic
applicable to physical measurements are different from the classical laws of Aristotle
and Boole. The familiar classical logic should be replaced by the so-called
quantum logic. We will argue that the formalism of quantum mechanics
(including vectors and Hermitian operators in the Hilbert space) follows almost
inevitably from simple properties of measurements and quantum-logical
relationships between them. These properties and relationships are so basic
that it seems impossible to modify them and thus to change quantum laws
without destroying their internal consistency and their consistency with
observations. In section 1.7 we will also add some thoughts to the never-ending
philosophical debate about interpretations of quantum mechanics.
1.1 Why do we need quantum mechanics?

The inadequacy of classical concepts is best seen by analyzing the debate
between the corpuscular and wave theories of light. Let us demonstrate the
essence of this centuries-old debate using the example of a thought experiment
with a pinhole camera.

Figure 1.1: The image in the camera obscura with a pinhole aperture is
created by straight light rays: the image at the point A′ on the photographic
plate is created only by light rays that are emitted from the point A and pass
straight through the hole.
1.1.1 Corpuscular theory of light

You have probably seen or heard about a simple device called the camera obscura or
pinhole camera. You can easily make this device yourself: Take a light-tight
box, put a photographic plate inside the box and make a small hole in the
center of the side opposite to the photographic plate (see Fig. 1.1). The light
passing through the hole into the box creates a sharp inverted image on the
photographic plate. You will get an even sharper image by decreasing the size of
the hole, though the image will become dimmer, of course. This behavior of
light was well known for centuries (a drawing of the camera obscura is present
in Leonardo da Vinci’s papers). One of the earliest scientific explanations of
this and other properties of light (reflection, refraction, etc.) was suggested
by Newton. In modern language, his corpuscular theory would explain the
formation of the image like this:
Corpuscular theory: Light is a flow of tiny particles (photons)
propagating along straight classical trajectories (light rays). Each
particle in the ray carries a certain amount of energy, which gets
released upon impact in a very small volume corresponding to
one grain of the photographic emulsion and produces a small dot
image. When the intensity of the source is high, there are so many
particles that we cannot distinguish their individual dots. All
these dots merge into one continuous image, and the intensity of
the image is proportional to the number of particles hitting the
photographic plate during the exposure time.

Figure 1.2: (a) Image in the pinhole camera with a very small aperture; (b)
the density of the image along the line AB.

Let us continue our experiment with the pinhole camera and decrease the
size of the hole even further. The corpuscular theory would insist that the
smaller size of the hole must result in a sharper image. However, this is not
what experiment shows! For a very small hole the image on the photographic
plate will be blurred. If we further decrease the size of the hole, the detailed
picture will completely disappear and the image will look like one large diffuse
spot (see Fig. 1.2), independent of the form and shape of the light source
outside the camera. It appears as if light rays scatter in all directions when
they pass through a small aperture or near a small object. This effect of the
light spreading is called diffraction, and it was discovered by Grimaldi in the
middle of the 17th century.
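The competition between the straight-ray sharpening and the diffraction blurring can be illustrated numerically. The following is only a rough sketch under assumptions that are not part of the text above: a distant point source (so the straight-ray spot has the diameter of the hole), the standard Airy-disk estimate 2.44 λL/d for the diffraction spot, an arbitrary way of combining the two contributions in quadrature, and illustrative values for the wavelength and the depth L of the box.

```python
import math

def blur_diameter(hole_diameter, wavelength=0.55e-6, depth=0.1):
    """Rough blur-spot diameter (m) on the plate for a pinhole of given diameter (m).

    Assumptions (illustrative, not from the text): distant point source,
    Airy-disk diffraction estimate, contributions combined in quadrature.
    """
    geometric = hole_diameter                              # straight-ray spot
    diffraction = 2.44 * wavelength * depth / hole_diameter  # Airy disk diameter
    return math.hypot(geometric, diffraction)

# The blur first shrinks with the hole, then grows again as diffraction takes over.
for d in (2e-3, 1e-3, 5e-4, 2e-4, 1e-4, 5e-5):
    print(f"hole {d * 1e3:6.3f} mm -> blur {blur_diameter(d) * 1e3:6.3f} mm")
```

With these assumed numbers the smallest spot is obtained for a hole of a few tenths of a millimeter; making the hole smaller than that only widens the image, in agreement with the qualitative behavior described above.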
Diffraction is rather difficult to reconcile with the corpuscular theory. For
example, we can try to save this theory by assuming that light rays deviate
from their straight paths due to some interaction with the box material
surrounding the hole. However, this is not a satisfactory explanation, because
one can easily establish by experiment that the shape of the diffraction
picture is completely independent of the type of material used to make the
walls of the pinhole camera. The most striking evidence of the fallacy of
the naïve corpuscular theory is the effect of light interference discovered by
Young in 1802 [You04]. To observe the interference, we can slightly modify
our pinhole camera by making two small holes close to each other, so that
their diffraction spots on the photographic plate overlap. We already know
that when we leave the left hole open and close the right hole we get a diffuse
spot L (see Fig. 1.3(a)). When we leave open the right hole and close the
left hole we get another spot R. Let us try to predict what kind of image we
will get if both holes are opened.

Figure 1.3: (a) The density of the image in a two-hole camera according
to naïve corpuscular theory is a superposition of images created by the left
(L) and right (R) holes; (b) Actual interference picture: In some places the
density of the image is higher than L+R (constructive interference); in other
places the density is lower than L+R (destructive interference).

Following the corpuscular theory and simple logic we might conclude
that particles reaching the photographic plate are of two sorts: those that passed
through the left hole and those that passed through the right hole. When the two
holes are opened at the same time, the density of the “left hole” particles
should add to the density of the “right hole” particles and the resulting
image should be a superposition L+R of the two images (full line in Fig.
1.3(a)). Right? Wrong! This seemingly reasonable explanation disagrees
with experiment. The actual image has new features (brighter and darker
regions) called the interference picture (full line in Fig. 1.3(b)).
Can the corpuscular theory explain this strange interference pattern? We
could assume, for example, that some kind of interaction between light cor-
puscles is responsible for the interference, so that passages of different parti-
cles through left and right holes are not independent events, and the law of
addition of probabilities does not hold for them. However, this idea must be
rejected because, as we will see later, the interference picture persists even
if photons are released one-by-one, so that they cannot interact with each
other.
1.1.2 Wave theory of light

The inability to explain such basic effects of light propagation as diffraction
and interference was a major embarrassment for the Newtonian corpuscular
theory. These effects as well as all other properties of light known before
the quantum era (reflection, refraction, polarization, etc.) were brilliantly
explained by the wave theory of light advanced by Grimaldi, Huygens, Young,
Fresnel, and others. The wave theory gradually replaced Newtonian
corpuscles in the course of the 19th century. The idea of light as a wave found its
strongest support from Maxwell’s electromagnetic theory which unified
optics with electric and magnetic phenomena. Maxwell explained that the light
wave is actually an oscillating field of electric E(x, t) and magnetic B(x, t)
vectors – a sinusoidal wave propagating with the speed of light. According to
Maxwell’s theory, the energy of the wave, and consequently the intensity
of light I, is proportional to the square of the amplitude of the field vector
oscillations, e.g., $I \propto \mathbf{E}^2$. Then formation of the photographic image can be
explained as follows:
Wave theory: Light is a continuous wave or field propagat-
ing in space in an undulatory fashion. When the light wave
meets molecules of the photo-emulsion, the charged parts of the
molecules start to oscillate under the influence of the light’s elec-
tric and magnetic field vectors. The portions of the photographic
plate with higher field amplitudes have more violent molecular
oscillations and higher image densities.
This provides a natural explanation for both diffraction and interference:
Diffraction simply means that light waves can deviate from straight paths and
go around corners, just like sound waves do.¹ To explain the interference, we
just need to note that when two portions of the wave pass through different
holes and meet on the photographic plate, their electric vectors add up.
However, intensities of the waves are not additive:
$I \propto (\mathbf{E}_1 + \mathbf{E}_2)^2 = \mathbf{E}_1^2 + 2\mathbf{E}_1 \cdot \mathbf{E}_2 + \mathbf{E}_2^2 \neq \mathbf{E}_1^2 + \mathbf{E}_2^2 \propto I_1 + I_2$.
It follows from simple geometric considerations
that in the two-hole configuration there are places on the photographic plate
where the two waves always come in phase ($\mathbf{E}_1(t) \uparrow\uparrow \mathbf{E}_2(t)$ and $\mathbf{E}_1 \cdot \mathbf{E}_2 > 0$,
which means constructive interference) and there are other places where the
two waves always come with opposite phases ($\mathbf{E}_1(t) \uparrow\downarrow \mathbf{E}_2(t)$ and $\mathbf{E}_1 \cdot \mathbf{E}_2 < 0$,
i.e., destructive interference).
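The non-additivity of intensities is easy to check numerically. The sketch below is a simplified illustration, not a calculation from the text: it assumes two scalar waves of unit amplitude represented by complex phasors, a small-angle path difference between the holes, and arbitrary illustrative values for the wavelength, the hole separation, and the distance to the plate.

```python
import math

def two_hole_intensity(x, wavelength=0.5e-6, separation=50e-6, distance=0.1):
    """Return (I1 + I2, I_total) at position x (m) on the plate.

    Assumptions (illustrative): unit-amplitude scalar waves written as
    complex phasors; path difference ~ separation * x / distance.
    """
    phase = 2 * math.pi * separation * x / (wavelength * distance)
    e1 = complex(1.0, 0.0)                           # wave from the left hole
    e2 = complex(math.cos(phase), math.sin(phase))   # wave from the right hole
    sum_of_intensities = abs(e1) ** 2 + abs(e2) ** 2  # the naive "L+R" answer
    total_intensity = abs(e1 + e2) ** 2               # fields add first, then square
    return sum_of_intensities, total_intensity

# Scan half a millimeter of the plate: the total oscillates around L+R.
for x_mm in (0.0, 0.125, 0.25, 0.375, 0.5):
    naive, actual = two_hole_intensity(x_mm * 1e-3)
    print(f"x = {x_mm:5.3f} mm   L+R = {naive:4.2f}   actual = {actual:4.2f}")
```

With these assumed parameters the fringe period is 1 mm, so the center of the plate is a bright fringe (total intensity twice L+R) and the point half a millimeter away is a dark fringe (total intensity close to zero), while the sum of the individual intensities stays constant.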
1.1.3 Low intensity light and other experiments

In 19th-century physics, the wave-particle debate was decided in favor
of the wave theory. However, further experimental evidence showed that
the victory was declared prematurely. To see what goes wrong with the
wave theory, let us continue our thought experiment with the interference
picture created by two holes and gradually tune down the intensity of the
light source. At first, nothing interesting will happen: we will see that the
density of the image predictably decreases. However, after some point we
will recognize that the photographic image is not uniform and continuous
as before. It consists of small blackened dots as if some grains of photo-
emulsion were exposed to light and others were not. This observation is
very difficult to reconcile with the wave theory. How can a continuous wave
produce this dotty image? However, this is exactly what the corpuscular
theory would predict. Apparently the dots are created by particles hitting
the photographic plate one-at-a-time.
A number of other effects were discovered at the end of the 19th century
and in the beginning of the 20th century, which could not be explained by
the wave theory of light. One of them was the photo-electric effect: It was
observed that when light is shined on a piece of metal, electrons can
escape from the metal into the vacuum. This observation was not surprising
by itself. However, it was rather puzzling how the number of emitted electrons
depended on the frequency and the intensity of the incident light. It was
found that only light waves with frequencies above some threshold $\omega_0$ were
capable of knocking out electrons from the metal. Radiation with frequency
below $\omega_0$ could not produce the electron emission even if the light intensity
was high. According to the wave theory, one could assume that the electrons
are knocked out of the metal due to some kind of force exerted on them by
electromagnetic vectors E, B in the wave. A larger light intensity (= larger
wave amplitude = higher values of E and B) naturally means a larger force
and a larger chance of the electron emission. Then why could the low-frequency
but high-intensity light not do the job?

¹ Wavelengths corresponding to the visible light are between 0.4 micron for the violet
light and 0.7 micron for the red light. So for large obstacles or holes, the deviations from
the straight path are very small and the corpuscular theory of light works reasonably well.

In 1905 Einstein explained the photo-electric effect by bringing back
Newtonian corpuscles in the form of “light quanta” later called photons. He
described the light as “consisting of finite number of energy quanta which
are localized at points in space, which move without dividing and which can
only be produced and absorbed as complete units” [AP65]. According to
Einstein’s explanation, each photon carries the energy of $\hbar\omega$, where $\omega$ is the
frequency² of the light wave, and $\hbar$ is the (reduced) Planck constant. Each photon has
a chance to collide with and pass its energy to just one electron in the metal.
Only high energy photons (those corresponding to high frequency light) are
capable of passing enough energy to the electron to overcome a certain energy
barrier³ $E_b$ between the metal and the vacuum. Low-frequency light has
photons with low energy $\hbar\omega < E_b$. Then, no matter what the amplitude
(= the number of photons) of such light is, its photons are just too “weak”
to kick the electrons with sufficient energy.⁴ In Compton’s experiment
(1923) the interaction of light with electrons could be studied in much
greater detail. And indeed, this interaction more resembled a collision of two
particles rather than shaking of the electron by a periodic electromagnetic
wave.
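The threshold behavior can be made concrete with a few lines of arithmetic. In the sketch below the barrier value of 2.3 eV is a hypothetical number chosen only for illustration (it is not taken from the text), and the two wavelengths are the ends of the visible range; the physical constants are the standard SI values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J*s
C = 299_792_458.0        # speed of light, m/s
EV = 1.602176634e-19     # joules per electronvolt

def photon_energy_ev(wavelength_m):
    """Photon energy hbar * omega in eV, with omega = 2 * pi * c / wavelength."""
    omega = 2 * math.pi * C / wavelength_m
    return HBAR * omega / EV

def can_eject(wavelength_m, barrier_ev):
    """True if one photon carries enough energy to overcome the barrier E_b."""
    return photon_energy_ev(wavelength_m) > barrier_ev

BARRIER_EV = 2.3  # hypothetical E_b, for illustration only

for name, wavelength in (("violet", 0.4e-6), ("red", 0.7e-6)):
    energy = photon_energy_ev(wavelength)
    print(f"{name}: {energy:.2f} eV per photon, emission possible: "
          f"{can_eject(wavelength, BARRIER_EV)}")
```

Note that the light intensity does not enter `can_eject` at all: in this single-photon picture, increasing the number of photons does not help if each of them is below the barrier.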
These observations clearly lead us to the conclusion that light is a flow of
particles after all. But what about the interference? We already agreed that
the corpuscular theory had no logical explanation of this effect.
For example, in an interference experiment conducted by Taylor in 1909
[Tay09], the intensity of light was so low that no more than one photon was
present at any time instant, thus eliminating any possibility of the
photon-photon interaction and its effect on the interference picture. Another
“explanation” that the photon somehow splits, passes through both holes, and
then rejoins again at the point of collision with the photographic plate does
not withstand criticism either: One photon never creates two dots on the
photographic plate, so it is unlikely that the photon can split during propagation.
Finally, can we assume that the particle passing through the right hole somehow
“knows” whether the left hole is open or closed and adjusts its trajectory
accordingly? Of course, there is some effect on the photon near the left hole
depending on whether the right hole is open or not. However, by all estimates
this effect is negligibly small.

² $\omega$ is the so-called circular frequency (measured in radians per second) which is related
to the usual frequency $\nu$ (measured in oscillations per second) by the formula $\omega = 2\pi\nu$.
³ The barrier’s energy is roughly proportional to the threshold frequency: $E_b \approx \hbar\omega_0$.
⁴ Actually, the low-frequency light may lead to the electron emission when two or more
low-energy photons collide simultaneously with the same electron, but such events have
very low probability and become observable only at very high light intensities.

So, young quantum theory had the almost impossible task of reconciling two
apparently contradictory classes of experiments with light: Some experiments
(diffraction, interference) were easily explained with the wave theory, while
the corpuscular theory had serious difficulties. Other experiments (photo-
electric effect, Compton scattering) could not be explained from the wave
properties and clearly showed that light consists of particles. Just adding to
the confusion, de Broglie in 1924 advanced a hypothesis that such particle-
wave duality is not specific to photons. He proposed that all particles of
matter – like electrons – have wave-like properties. This “crazy” idea was
soon confirmed by Davisson and Germer who observed the diffraction and
interference of electron beams in 1927.
Certainly, in the first quarter of the 20th century, physics faced the great-
est challenge in its history. This is how Heisenberg described the situation:
I remember discussions with Bohr which went through many hours

till very late at night and ended almost in despair; and when at
the end of the discussion I went alone for a walk in the neighbor-
ing park I repeated to myself again and again the question: Can
nature possibly be as absurd as it seemed to us in those atomic
experiments? W. Heisenberg [Hei58]
1.2 Physical foundations of quantum mechanics
In the rest of this chapter we will introduce the basic formalism of quantum
theory. This theory rejects the duplicitous claim of particle-wave duality. It
insists that matter and light are made of point-like particles (like photons
and electrons) whose propagation is governed by non-classical rules of
quantum mechanics. In particular, these rules are responsible for the wave-like
behavior of quantum particles, such as in the double-hole experiment. We
will explain this experiment from the quantum-mechanical point of view in
subsection 6.5.6.
In this section we will try to explain the main difference between classical
and quantum views of the world. To understand quantum mechanics, we
must accept that certain concepts, which were taken for granted in classical
physics, cannot be applied to micro-objects like photons and electrons. To
see what is different, we should revisit basic ideas about what a physical
system is, how its states are prepared, and how its observables are measured.
1.2.1 Single-hole experiment

The best way to understand the main idea of quantum mechanics is to
analyze the single-hole experiment discussed in the preceding section. We have
established that in the low-intensity regime, when the source emits individual
photons one-by-one, the image on the screen consists of separate dots.
We now accept this fact as sufficient evidence that light is made of small
countable localizable particles, called photons.
However, the behavior of these particles is quite different from the one
expected in classical physics. Classical physics is based on one tacit axiom,
which we formulate here as an Assertion:⁵
Assertion 1.1 (determinism) If we prepare a physical system repeatedly

in the same state and measure the same observable, then each time we will
get the same measurement result.
This seemingly obvious Assertion is violated in the single-hole experiment.
Indeed, suppose that the light source is monochromatic, so that all photons
reaching the hole have the same momentum and energy. At the moment of
passing through the hole the photons have rather well-defined x, y, and
z-components of position too. This guarantees that at this moment all photons
are prepared in (almost) the same state, as required by Assertion 1.1.
We can make this state even better defined by reducing the size of the
aperture. Then according to the Assertion, each photon should produce the
same measurement result, i.e., each photon should land at exactly the same
point on the photographic plate. However, this is not what happens in reality!
The dots made by photons are scattered all over the photographic plate.
Moreover, the smaller the aperture, the wider the distribution of the
dots. Results of measurements are not reproducible even though preparation
conditions are tightly controlled!

⁵ In this book we will distinguish Postulates, Statements, and Assertions. Postulates
form a foundation of our theory. In most cases they follow undoubtedly from experiments.
Statements follow logically from Postulates and we believe them to be true. Assertions are
commonly accepted in the literature, but they do not have a place in the RQD approach
developed in this book.

Remarkably, it is not possible to find an “ordinary” explanation of this
extraordinary fact. For example, one can assume that different photons pass-
ing through the hole are not exactly in the same conditions. What if they
interact with atoms surrounding the hole and for each passing photon the
configuration of the nearby atoms is different? This explanation does not
seem plausible, because one can repeat the single-hole experiment with dif-
ferent materials (paper, steel, etc.) without any visible difference. Moreover,
the same diffraction picture is observed if other particles (electrons, neutrons,
C$_{60}$ molecules, etc.) are used instead of photons. It appears that there are
only two parameters, which determine the shape of the diffraction spot – the
size of the aperture and the particle’s momentum. So, the explanation of
this shape must be rather general and should not depend on the nature of
particles and on the material surrounding the hole. Then we must accept a
rather striking conclusion: all these carefully prepared particles behave ran-
domly. It is impossible to predict which point on the screen will be hit by
the next released particle.
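The random, one-dot-at-a-time build-up of the diffraction spot can be imitated by a small Monte Carlo sketch. Everything quantitative here is an assumption introduced for illustration and not derived in this section: the landing angles are drawn from the textbook Fraunhofer single-slit distribution, the wavelength is taken from the de Broglie relation λ = 2πħ/p, and the function names and parameter values are hypothetical. The point is only that the two inputs are the aperture size and the particle’s momentum, and that each individual outcome is unpredictable.

```python
import math
import random
import statistics

HBAR = 1.054571817e-34  # reduced Planck constant, J*s

def sample_landing_angles(aperture, momentum, n, seed=0):
    """Draw n random deflection angles (rad) behind a slit of width `aperture` (m).

    Assumes the Fraunhofer pattern I(theta) ~ sinc^2(pi * a * sin(theta) / lambda)
    with the de Broglie wavelength lambda = 2 * pi * hbar / p.
    """
    rng = random.Random(seed)
    wavelength = 2 * math.pi * HBAR / momentum
    angles = []
    while len(angles) < n:
        theta = rng.uniform(-math.pi / 2, math.pi / 2)
        beta = math.pi * aperture * math.sin(theta) / wavelength
        weight = 1.0 if beta == 0 else (math.sin(beta) / beta) ** 2
        if rng.random() < weight:          # rejection sampling (weight <= 1)
            angles.append(theta)
    return angles

def median_abs(angles):
    """Half-width of the angular range that contains half of the dots."""
    return statistics.median(abs(a) for a in angles)

# Momentum of a photon with a wavelength of 0.5 micron (illustrative choice).
PHOTON_MOMENTUM = 2 * math.pi * HBAR / 0.5e-6

for aperture in (5e-6, 2e-6, 1e-6):
    dots = sample_landing_angles(aperture, PHOTON_MOMENTUM, 2000)
    print(f"aperture {aperture * 1e6:3.0f} micron: half of the dots within "
          f"+/- {median_abs(dots):.3f} rad")
```

Changing the seed changes where each individual dot lands, but the overall width of the spot depends only on the aperture and the momentum, and it grows as the aperture shrinks.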
1.2.2 Ensembles and measurements in quantum mechanics
The main lesson of the single-hole experiment is that classical Assertion 1.1
is not true. Even if the system is prepared each time in the same state, we are
not going to get reproducible results in repeated measurements. Why does
this happen? The honest answer is that nobody knows. This is one of the
greatest mysteries of nature. Quantum theory does not even attempt to explain
the physical origin of randomness in microsystems. This theory assumes the
randomness as given⁶ and simply tries to formulate its mathematical
description. In order to move forward, we should go beyond simply acknowledging
randomness in microscopic events and introduce more precise statements and
new definitions.
We will call an experiment the preparation of an ensemble (= a set of
identical copies of the system prepared in the same conditions) and the performance
of measurements of the same observable in each member of the ensemble.⁷
Suppose that we prepared an ensemble of N identical copies of the system
and measured the same observable N times. As we have established above, we
cannot say a priori that all these measurements will produce the same results.
However, it seems reasonable to assume the existence of ensembles in which
measurements of one observable can be repeated with the same reproducible
result an infinite number of times. Indeed, there is no point in talking about a
physical quantity if there are no ensembles in which this quantity can be
measured with certainty. So, we begin our construction of the mathematical
formalism of quantum mechanics by introducing the following Postulate:
Postulate 1.2 (partial determinism) For any observable F and any value
f from its spectrum, one can prepare an ensemble in such a state that mea-
surements of this observable are reproducible, i.e., always yield the same value
f.
Note that in classical mechanics this Postulate follows immediately from
Assertion 1.1. The Postulate itself allows us to talk only about the repro-
ducibility of just one observable in the ensemble, while Assertion 1.1 referred
to all observables. So, our quantum Postulate 1.2 is a milder requirement
than the classical Assertion 1.1.
Note also that Postulate 1.2 permits the existence of certain groups of
(compatible) observables whose measurements can be reproducible in a given
ensemble. For example, we will see in chapter 4 that the three components
of a particle’s momentum $(p_x, p_y, p_z)$ are compatible observables. The same is
true for the three components of position $(r_x, r_y, r_z)$. Thus, quantum
mechanics says that with certain efforts we can prepare an ensemble of particles
in such a state that measurements of all 3 components of position yield the
same result each time, but then results for the components of momentum
would be randomly scattered. We can also prepare (another) ensemble in a
state with certain momentum; then the position will be all over the place.
However, we cannot prepare an ensemble in which the uncertainties of both
position $\Delta r$ and momentum $\Delta p$ are zero.⁸

⁶ In this book we claim that this randomness is absolute and fundamental; that it cannot
be explained and does not even require an explanation. In section 1.7 we will briefly discuss
other approaches to this deep question.
⁷ It is worth noting here that in this book we are not considering repeated measurements
performed on the same copy of the physical system. We will work under the assumption that
after the measurement has been performed, the copy of the system is discarded. Each
measurement requires a fresh copy of the physical system. This means, in particular,
that we are not interested in the state to which the system may have “collapsed” after
the measurement. The description of repetitive quantum measurements is an interesting
subject, but it is beyond the scope of this book.
1.3 Lattice of propositions

Having described the fundamental Postulate 1.2 of quantum mechanics in
the preceding section, we now need to turn it into a working mathematical
formalism. This is the goal of the present section and the next two sections. As
we mentioned in the beginning of this chapter, our explanation of quantum
mechanics will be based on the controversial but very powerful idea that
the microworld is governed by a non-classical logic.
When we say that measurements of observables in the quantum world can
be irreproducible we mean that this irreproducibility or randomness is of
basic, fundamental nature. It is different from the randomness often observed
in the classical world,⁹ which is related merely to our inability to prepare
well-controlled ensembles, incomplete knowledge of initial conditions and
inability to solve dynamical equations even if the initial conditions are known.
So, classical randomness is of technical rather than fundamental nature. On
the other hand, the ever-present quantum randomness cannot be reduced
by a more thorough control of preparation conditions, lowering the
temperature, etc. This means that quantum mechanics is bound to be a statistical
theory based on postulates of probability. However, this does not mean that
quantum mechanics is a subset of classical statistical mechanics. Actually,
the statistical theory underlying quantum mechanics is of a different, more
general non-classical kind.
We know that classical (Boolean) logic is at the core of classical
probability theory. The latter theory assigns probabilities (real numbers between 0
and 1) to logical propositions and tells us how these numbers are affected by
logical operations (AND, OR, NOT, etc.) with propositions. Quite similarly,
quantum probability theory is based on logic, but this time the logic is not
Boolean. In quantum mechanics we are dealing with quantum logic whose
postulates differ slightly from the Boolean ones. So, at the most
fundamental level, quantum mechanics is built on two simple ideas: the prevalence
of randomness in nature and the non-classical logical relationships between
experimental propositions.

⁸ See discussion of the Heisenberg uncertainty relation in subsection 6.5.2.
⁹ e.g., when we roll a die

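The classical (Boolean) side of this statement can be illustrated with the rolling die mentioned above. The sketch below is a minimal illustration under our own assumptions: a fair six-sided die, propositions represented as sets of outcomes, and the logical operations AND, OR, NOT represented by set intersection, union, and complement. The particular propositions are hypothetical examples.

```python
from fractions import Fraction

OUTCOMES = frozenset(range(1, 7))  # faces of a fair die (illustrative sample space)

def probability(proposition):
    """Classical probability of a proposition given as a set of outcomes."""
    return Fraction(len(proposition & OUTCOMES), len(OUTCOMES))

even = frozenset({2, 4, 6})                # "the result is even"
greater_than_three = frozenset({4, 5, 6})  # "the result is greater than 3"
at_most_two = frozenset({1, 2})            # "the result is at most 2"

p_and = probability(even & greater_than_three)  # AND = intersection
p_or = probability(even | greater_than_three)   # OR = union
p_not = probability(OUTCOMES - even)            # NOT = complement

# Boolean additivity: P(A or B) = P(A) + P(B) - P(A and B)
assert p_or == probability(even) + probability(greater_than_three) - p_and

# Distributive law of Boolean logic: A and (B or C) = (A and B) or (A and C)
lhs = even & (greater_than_three | at_most_two)
rhs = (even & greater_than_three) | (even & at_most_two)
assert lhs == rhs

print(p_and, p_or, p_not)
```

Both checks hold automatically for sets of outcomes; they are exactly the kind of classical logical relationships whose validity for quantum propositions is examined in the rest of this chapter.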
The initial idea that the fundamental difference between classical and
quantum mechanics lies in their different logical structures belongs to Birkhoff
and von Neumann. They suggested replacing the classical logic of Aristotle
and Boole with quantum logic. The formalism presented in this section
summarizes their seminal work [BvN36] as well as further research most no-
tably by Mackey [Mac63] and Piron [Pir64, Pir76]. Even for those who do not
accept the necessity of such radical change in our views on logic, the study
of quantum logic may provide a desirable bridge between intuitive concepts
of classical mechanics and abstract formalism of quantum theory.
In introductory quantum physics classes (especially in the United

States), students are informed ex cathedra that the state of a
physical system is represented by a complex-valued wave function
ψ, that observables correspond to self-adjoint operators, that the
temporal evolution of the system is governed by a Schrödinger
equation and so on. Students are expected to accept all this uncrit-
ically, as their professors probably did before them. Any question
of why is dismissed with an appeal to authority and an injunc-
tion to wait and see how well it all works. Those students whose
curiosity precludes blind compliance with the gospel according to
Dirac and von Neumann are told that they have no feeling for
physics and that they would be better off studying mathematics or
philosophy. A happy alternative to teaching by dogma is provided
by basic quantum logic, which furnishes a sound and intellectually
satisfying background for the introduction of the standard notions
of elementary quantum mechanics. D. J. Foulis [Fou]
The main purpose of our sections 1.3 - 1.5 is to demonstrate that pos-
tulates of quantum mechanics are robust and inevitable consequences of the
laws of probability and simple properties of measurements. Basic axioms of
quantum logic which are common for both classical and quantum mechanics
are presented in section 1.3. The close connection between the classical logic
and the phase space formalism of classical mechanics is discussed in section
1.4. In section 1.5 we will note a remarkable fact that the only difference be-
tween classical and quantum logics (and, thus, between classical and quantum
physics in general) is in a single obscure distributivity postulate. This classi-
cal postulate must be replaced by the orthomodularity postulate of quantum
theory. In section 1.6 we will demonstrate how postulates of quantum logic
lead us (via Piron’s theorem) to the standard formalism of quantum mechan-
ics with Hilbert spaces, Hermitian operators, wave functions, etc. In section
1.7 we will briefly mention some interpretations of quantum mechanics and
related philosophical issues.
1.3.1 Propositions and states

There are different types of observables in physics: position, momentum,
spin, mass, etc. As we discussed in the Introduction, we are not interested in
the design of the apparatus measuring each particular observable F. We can
imagine such an apparatus as a black box that produces a real number f ∈ R
(the measured value of the observable) each time it interacts with the physical
system. So, the act of measurement can be simply described by the words “the
value of observable F was found to be f.” However, for our purposes a
slightly different description of measurements will be more convenient.
With each subset¹⁰ X of the real line R we can associate a proposition
x of the type “the value of observable F is inside the subset X.” If the
measured value f has been found inside the subset X, then we say that the
proposition x is true. Otherwise we say that the proposition x is false. At
first sight it seems that we have not gained much by this reformulation.
However, the true advantage is that propositions introduced above can be
regarded as key elements of mathematical logic and probability theory. The
powerful machinery of these theories will be found very helpful in our
analysis of quantum measurements. It is also useful to note that the “yes-no”
propositions can also be regarded as special types of observables whose
spectrum consists of just two points: 1 (if the proposition is true; the answer is
“yes”) and 0 (if the proposition is false; the answer is “no”).
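The reformulation of a measurement as a yes-no proposition can be written down in a few lines. This is only a sketch: the class name `Proposition`, the representation of the subset X as a union of closed intervals, and the sample values are our own illustrative choices.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Proposition:
    """Yes-no proposition "the value of observable F is inside the subset X".

    X is represented (as an illustrative choice) by a tuple of closed
    intervals (lo, hi) of the real line.
    """
    intervals: tuple

    def measure(self, f):
        """Return 1 ("yes", true) or 0 ("no", false) for a measured value f."""
        return int(any(lo <= f <= hi for lo, hi in self.intervals))

# X is the union of two disjoint intervals [0, 1] and [2, 3].
x = Proposition(intervals=((0.0, 1.0), (2.0, 3.0)))

for f in (0.5, 1.5, 2.5):
    print(f"measured f = {f}: proposition value {x.measure(f)}")
```

The method `measure` returns only the two values 1 and 0, so the proposition itself behaves as an observable with a two-point spectrum.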
Yes-no propositions are not necessarily related to single observables. As
we will see later, there are sets of (compatible) observables $F_1, F_2, \ldots, F_n$
that are simultaneously measurable with arbitrary precision. For such sets
of observables one can define propositions corresponding to subsets in the
(n-dimensional) common spectrum of these observables. For example, the
proposition “particle’s position is inside volume V” is a statement about
simultaneous measurements of three observables - the x, y, and z components
of the position vector. Experimentally, this proposition can be realized using
a Geiger counter occupying the volume V. The counter clicks (the proposition
is true) if the particle passes through the counter’s chamber and does not
click (the proposition is false) if the particle is outside of V.

¹⁰ Subsets X are not necessarily limited to contiguous intervals in R. All results remain
valid for more complex subsets of R, such as unions of any number of disjoint intervals.

In what follows we will denote by L the set of all propositions11 about the
physical system. The set of all possible states of the system will be denoted
by Φ. Our goal in this chapter is to study the mathematical relationships
between elements x ∈ L and φ ∈ Φ in these two sets.
The above discussion referred to a single measurement performed on one
copy of the physical system. Let us now prepare multiple copies (an ensem-
ble) of the system and perform measurements of the same proposition in all
copies. As we discussed earlier, there is no guarantee that the results of all
these measurements will be the same. So, for some members in the ensemble
the proposition x will be found ’true’, while for other members it will be
’false’, even if every effort is made to ensure that the state of the prepared
system is exactly the same in all cases.
11
L is also called the propositional system.
Using results of measurements as described above, we can introduce a
function (φ|x) called the probability measure which assigns to each state φ
and to each proposition x the probability of x being true in the state φ. The
value of this function (a real number between 0 and 1) is obtained by the
following recipe:
(i) prepare a copy of the system in the state φ;
(ii) determine whether x is true or false;
(iii) repeat steps (i) and (ii) N times, then
(φ|x) = \lim_{N \to \infty} \frac{M}{N}
where M is the number of times when the proposition x was found to be
true.
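The recipe (i)-(iii) can be imitated numerically. The following Python sketch is only an illustration: the function names and the toy state (a random number generator whose values of F are uniformly distributed on [0, 1)) are assumptions made for this example and are not part of the formalism.

```python
import random

def prepare_and_measure(rng):
    """Steps (i)-(ii) for a toy state: return one measured value f of F.

    Assumption made for this illustration only: in the chosen state the
    values of F are uniformly distributed over the interval [0, 1).
    """
    return rng.random()

def estimate_measure(in_subset, n_trials, seed=0):
    """Step (iii): approximate (phi|x) by the frequency M / N."""
    rng = random.Random(seed)
    m = sum(1 for _ in range(n_trials) if in_subset(prepare_and_measure(rng)))
    return m / n_trials

def x(f):
    """Proposition x: 'the value of F is inside X = [0.2, 0.5)'."""
    return 0.2 <= f < 0.5

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, estimate_measure(x, n))
```

For this toy state the frequencies M/N should settle near 0.3, the length of the interval X, as N grows. A finite N gives only an approximation to the limit in step (iii).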
Two states φ and ψ of the same system are said to be equal (φ = ψ) if
for any proposition x we have
(φ|x) = (ψ|x)
Indeed, there is no physical difference between these two states as all exper-
iments yield the same results (=probabilities). For the same reason we will
say that two propositions x and y are identical (x = y) if for all states φ of
the system
(φ|x) = (φ|y) (1.1)
It follows from the above discussion that the probability measure (φ|x) con-
sidered as a function on the set of all states Φ is a unique representative of
proposition x (i.e., different propositions have different probability measures
on Φ). So, we can gain some insight into the properties of different proposi-
tions by studying properties of the corresponding probability measures.
There are propositions which are always true independent of the state
of the system. For example, the proposition “the value of the observable F1
is somewhere on the real line” is always true.12 For any other observable
F2 , the proposition “the value of the observable F2 is somewhere on the real
line” is also true for all states. Therefore, according to (1.1), we will say that
these two propositions are identical. So, we can define a unique maximal
proposition I ∈ L which is always true. Inversely, the proposition “the value
of observable is not on the real line” is always false and will be called the
minimal proposition. There is just one minimal proposition ∅ in the set L,
and for each state φ we can write
(φ|I) = 1 (1.2)
(φ|∅) = 0 (1.3)
12
Measurements of observables always yield some value, since we agreed in Introduction
that an ideal measuring apparatus never misfires.
1.3.2 Partial ordering

Suppose that we found two propositions x and y, such that their measures
satisfy (φ|x) ≤ (φ|y) for all states φ. Then we will say that proposition x is
less than or equal to proposition y and denote this relation by x ≤ y. The
meaning of this relation is obvious when x and y are propositions about the
same observable and X and Y are their corresponding subsets in R. Then
x ≤ y if the subset X is inside the subset Y , i.e., X ⊆ Y . In this case, the
relation x ≤ y is associated with logical implication, i.e., if x is true then y is
definitely true as well; x IMPLIES y; or “IF x THEN y”. If x ≤ y and x ≠ y
we will say that x is less than y and denote this relationship by x < y.
The implication relation ≤ has three obvious properties.
Lemma 1.3 (reflectivity) Any proposition implies itself: x ≤ x.
Proof. This follows from the fact that for any φ it is true that (φ|x) ≤ (φ|x).
Lemma 1.4 (symmetry) If two propositions imply each other, then they
are equal: If x ≤ y and y ≤ x then x = y.
Proof. If two propositions x and y are less than or equal to each other, then
(φ|x) ≤ (φ|y) and (φ|y) ≤ (φ|x) for each state φ, which implies (φ|x) = (φ|y)
and, according to (1.1), x = y.
Lemma 1.5 (transitivity) If x ≤ y and y ≤ z, then x ≤ z.
Proof. If x ≤ y and y ≤ z, then (φ|x) ≤ (φ|y) ≤ (φ|z) for every state φ.
Consequently, (φ|x) ≤ (φ|z) for each state φ and x ≤ z.
Properties 1.3, 1.4, and 1.5 tell us that ≤ is a partial ordering relation.
It is ordering because it tells which proposition is “smaller” and which is
“larger.” The ordering is partial, because it doesn’t apply to all pairs of
propositions.13 Thus, the set L of all propositions is a partially ordered set.
From equations (1.2) and (1.3) we also conclude that
Postulate 1.6 (definition of I) x ≤ I for any x ∈ L.
Postulate 1.7 (definition of ∅) ∅ ≤ x for any x ∈ L.

13
There could be propositions x and y, such that for some states (φ|x) > (φ|y), while
for other states (φ|x) < (φ|y).
1.3.3 Meet
For two propositions x and y, suppose that we found a third proposition z
such that
z ≤ x (1.4)
z ≤ y (1.5)
There could be more than one proposition satisfying these properties. We

will assume that we can always find one maximal proposition z in this set.
This maximal proposition will be called a meet of x and y and denoted by
x ∧ y.
The existence of a unique meet is obvious in the case when x and y are
propositions about the same observable, such that they correspond to two
subsets of the real line R: X and Y , respectively. Then the meet z = x ∧ y
is a proposition corresponding to the intersection of these two subsets Z =
X ∩ Y .14 In this one-dimensional case the operation meet can be identified
with the logical operation AND: the proposition x ∧ y is true only when both
x AND y are true.
The above definition of meet can be formalized in the form of two postu-
lates
Postulate 1.8 (definition of ∧) x ∧ y ≤ x and x ∧ y ≤ y for all x and y.
Postulate 1.9 (definition of ∧) If z ≤ x and z ≤ y then z ≤ x ∧ y.
It seems reasonable to assume that the order in which meet operations are
performed is not relevant
Postulate 1.10 (commutativity of ∧) x ∧ y = y ∧ x.
Postulate 1.11 (associativity of ∧) (x ∧ y) ∧ z = x ∧ (y ∧ z).

14
If X and Y do not intersect, then z = ∅.
1.3.4 Join
Similar to our discussion of meet, we can assume that for any two propositions
x and y there always exists a unique join x ∨ y, such that both x and y are
less than or equal to x ∨ y, and x ∨ y is the minimal proposition with such a
property.
In the case of propositions about the same observable, the join of x and
y is a proposition z = x ∨ y whose associated subset Z of the real line is a
union of the subsets corresponding to x and y: Z = X ∪ Y . The proposition
z is true when either x OR y is true. So, the join can be identified with the
logical operation OR.15
The formal version of the above definition of join is
Postulate 1.12 (definition of ∨ ) x ≤ x ∨ y and y ≤ x ∨ y.
Postulate 1.13 (definition of ∨ ) If x ≤ z and y ≤ z then x ∨ y ≤ z.
Similar to Postulates 1.10 and 1.11 we insist that the order of join oper-
ations is irrelevant
Postulate 1.14 (commutativity of ∨) x ∨ y = y ∨ x.
Postulate 1.15 (associativity of ∨) (x ∨ y) ∨ z = x ∨ (y ∨ z).
The properties of propositions listed so far (partial ordering, meet, and

join) mean that the set of propositions L is what mathematicians call a
complete lattice.
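For propositions about a single observable, the lattice postulates can be checked exhaustively on a small example. The sketch below assumes a hypothetical observable F with a three-point spectrum and represents each proposition by its subset of that spectrum; the names SPECTRUM, leq, meet and join are illustrative choices, not notation from the text.

```python
from itertools import combinations

SPECTRUM = frozenset({1, 2, 3})   # hypothetical finite spectrum of F

def all_propositions(spectrum):
    """All subsets X of the spectrum, one proposition per subset."""
    items = sorted(spectrum)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

L = all_propositions(SPECTRUM)

def leq(x, y):
    return x <= y        # x implies y when X is a subset of Y

def meet(x, y):
    return x & y         # intersection, logical AND

def join(x, y):
    return x | y         # union, logical OR

def check_lattice_postulates(props):
    """Brute-force check of Postulates 1.8 - 1.15 on all pairs and triples."""
    for x in props:
        for y in props:
            assert leq(meet(x, y), x) and leq(meet(x, y), y)       # 1.8
            assert leq(x, join(x, y)) and leq(y, join(x, y))       # 1.12
            assert meet(x, y) == meet(y, x)                        # 1.10
            assert join(x, y) == join(y, x)                        # 1.14
            for z in props:
                if leq(z, x) and leq(z, y):
                    assert leq(z, meet(x, y))                      # 1.9
                if leq(x, z) and leq(y, z):
                    assert leq(join(x, y), z)                      # 1.13
                assert meet(meet(x, y), z) == meet(x, meet(y, z))  # 1.11
                assert join(join(x, y), z) == join(x, join(y, z))  # 1.15
    return True

if __name__ == "__main__":
    print(len(L), check_lattice_postulates(L))
```

This only confirms the postulates in the single-observable case, where meet and join reduce to intersection and union. It says nothing about propositions that refer to different observables.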
1.3.5 Orthocomplement
There is one more operation on the set of propositions that we need to con-
sider. This operation is called orthocomplement. It has the meaning of the
logical negation (operation NOT). For any proposition x its orthocomple-
ment is denoted by x⊥ . In the case of a single observable, if proposition x
corresponds to the subset X of the real line, then its orthocomplement x⊥
corresponds to the relative complement of X with respect to R (denoted by
R \ X). When the value of observable F is found inside X, i.e., the value of
x is 1, we immediately know that the value of x⊥ is zero. Inversely, if x is
false then x⊥ is necessarily true.
15
It is important to note that from x ∨ y being true it does not necessarily follow that
either x or y is definitely true, as was the case in classical logic.
More formally, the orthocomplement x⊥ is defined as a proposition whose
probability measure for each state φ is
(φ|x⊥ ) = 1 − (φ|x) (1.6)
Lemma 1.16 (non-contradiction) x ∧ x⊥ = ∅.
Proof. Let us prove this Lemma in the case when x is a proposition about
one observable F . Suppose that x ∧ x⊥ = y ≠ ∅, then, according to Postulate
1.2, there exists a state φ such that (φ|y) = 1 and, from Postulate 1.8 we
obtain
y ≤ x
y ≤ x⊥
1 = (φ|y) ≤ (φ|x)
1 = (φ|y) ≤ (φ|x⊥ )
It then follows that (φ|x) = 1 and (φ|x⊥ ) = 1, which means that any mea-
surement of the observable F in the state φ will result in a value inside both
X and R \ X simultaneously, which is impossible. This contradiction should
convince us that x ∧ x⊥ = ∅.
Lemma 1.17 (double negation) (x⊥ )⊥ = x.
Proof. From equation (1.6) we can write for any state φ
(φ|(x⊥ )⊥ ) = 1 − (φ|x⊥ ) = 1 − (1 − (φ|x)) = (φ|x)
Lemma 1.18 (contraposition) If x ≤ y then y⊥ ≤ x⊥ .
Proof. If x ≤ y then (φ|x) ≤ (φ|y) and (1 − (φ|x)) ≥ (1 − (φ|y)) for all
states φ. But according to our definition (1.6), the two sides of the latter
inequality are probability measures for propositions x⊥ and y⊥ , respectively,
which proves the Lemma.
Propositions x and y are said to be disjoint if x ≤ y ⊥ or, equivalently,
y ≤ x⊥ .
When x and y are disjoint propositions about the same observable, their
corresponding subsets do not intersect: X ∩ Y = ∅. For such mutually
exclusive propositions the probability of either x OR y being true (i.e., the
probability corresponding to the proposition x∨y) is the sum of probabilities
for x and y. It seems natural to generalize this property to all pairs of disjoint
propositions in L:
Postulate 1.19 (probabilities for mutually exclusive propositions) If
x and y are disjoint, then for any state φ
(φ|x ∨ y) = (φ|x) + (φ|y)
The following Lemma establishes that the join of any proposition x with its
orthocomplement x⊥ is always the trivial proposition.
Lemma 1.20 (non-contradiction) x ∨ x⊥ = I.
Proof. From Lemmas 1.3 and 1.17 it follows that x ≤ x = (x⊥ )⊥ , so that
propositions x and x⊥ are disjoint. Then, by Postulate 1.19, for any state φ
we obtain
(φ|x ∨ x⊥ ) = (φ|x) + (φ|x⊥ ) = (φ|x) + (1 − (φ|x)) = 1
which proves the Lemma.
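The role of equation (1.6) and Postulate 1.19 can be illustrated on a single observable with a finite spectrum. In the sketch below the state phi is modeled, purely as an assumption for this example, by a table of probabilities for the three possible values of F; exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction

SPECTRUM = frozenset({1, 2, 3})   # hypothetical spectrum of F
I = SPECTRUM                      # maximal proposition
EMPTY = frozenset()               # minimal proposition

# Toy state: probabilities of finding each value of F (they sum to 1).
phi = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def measure(state, x):
    """(phi|x): probability that the value of F lies in the subset x."""
    return sum((state[f] for f in x), Fraction(0))

def ortho(x):
    """Orthocomplement: relative complement of X in the spectrum."""
    return SPECTRUM - x

def disjoint(x, y):
    """x and y are disjoint when x <= y-perp."""
    return x <= ortho(y)

x = frozenset({1, 2})
y = frozenset({3})

if __name__ == "__main__":
    print(measure(phi, ortho(x)), 1 - measure(phi, x))      # equation (1.6)
    print(disjoint(x, y), measure(phi, x | y))              # Postulate 1.19
    print((x & ortho(x)) == EMPTY, (x | ortho(x)) == I)     # Lemmas 1.16, 1.20
```

In this single-observable model the additivity of Postulate 1.19 is automatic. The content of the Postulate is that the same rule is required for all disjoint pairs in L.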

Adding the orthocomplement to the properties of the propositional sys-
tem (complete lattice) L, we conclude that L is an orthocomplemented lattice.
Axioms of orthocomplemented lattices are collected in the upper part of Ta-
ble 1.1 for easy reference.
Table 1.1: Axioms of quantum logic

Property               Postulate/Lemma   Statement
Axioms of orthocomplemented lattices
Reflectivity           1.3               x ≤ x
Symmetry               1.4               (x ≤ y) & (y ≤ x) ⇒ x = y
Transitivity           1.5               (x ≤ y) & (y ≤ z) ⇒ x ≤ z
Definition of I        1.6               x ≤ I
Definition of ∅        1.7               ∅ ≤ x
Definition of ∧        1.8               x ∧ y ≤ x
Definition of ∧        1.9               (z ≤ x) & (z ≤ y) ⇒ z ≤ (x ∧ y)
Definition of ∨        1.12              x ≤ x ∨ y
Definition of ∨        1.13              (x ≤ z) & (y ≤ z) ⇒ (x ∨ y) ≤ z
Commutativity          1.10              x ∧ y = y ∧ x
Commutativity          1.14              x ∨ y = y ∨ x
Associativity          1.11              (x ∧ y) ∧ z = x ∧ (y ∧ z)
Associativity          1.15              (x ∨ y) ∨ z = x ∨ (y ∨ z)
Non-contradiction      1.16              x ∧ x⊥ = ∅
Non-contradiction      1.20              x ∨ x⊥ = I
Double negation        1.17              (x⊥)⊥ = x
Contraposition         1.18              x ≤ y ⇒ y⊥ ≤ x⊥
Sum of probabilities   1.19              (φ|x ∨ y) = (φ|x) + (φ|y) if x ≤ y⊥
Atomicity              1.21              existence of elementary propositions
Additional assertions of classical logic
Distributivity         1.25              x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
                       1.26              x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)
Additional postulate of quantum logic
Orthomodularity        1.36              x ≤ y ⇒ x ↔ y
1.3.6 Atomic propositions

One says that proposition y covers proposition x if the following two state-
ments are true:
1) x < y
2) If x ≤ z ≤ y, then either z = x or z = y
In simple words, this definition means that x implies y and there are no
propositions “intermediate” between x and y.
If x is a proposition about a single observable corresponding to the interval
X ⊆ R, then the interval corresponding to the covering proposition y can be
obtained by adding just one extra point to the interval X.
A proposition covering ∅ is called an atomic proposition or simply an
atom. So, atoms are smallest non-vanishing propositions. They unambigu-
ously specify properties of the system in the most exact way.
We will say that the atom p is contained in the proposition x if p ≤ x.
The existence of atoms is not a necessary property of mathematical lattices.
Both atomic and non-atomic lattices can be studied. However, on physical
grounds we will postulate that the lattice of propositions is atomic
Postulate 1.21 (atomicity) The propositional system L is an atomic lattice.
This means that
1. If x ≠ ∅, then there exists at least one atom p contained in x.
2. Each proposition x is a join of all atoms contained in it:
x = \bigvee_{p \le x} p
3. If p is an atom and p ∧ x = ∅, then p ∨ x covers x.
There are three simple Lemmas that follow directly from this Postulate.
Lemma 1.22 If p is an atom and x is any proposition then either p ∧ x = ∅
or p ∧ x = p.
Proof. We know that ∅ ≤ p ∧ x ≤ p and that p covers ∅. Then, according
to the definition of covering, either p ∧ x = ∅ or p ∧ x = p.
Lemma 1.23 x ≤ y if and only if all atoms contained in x are contained in
y as well.
Proof. If x ≤ y then for each atom p contained in x we have p ≤ x ≤ y and
p ≤ y by the transitivity property 1.5. To prove the inverse statement we
notice that if we assume that all atoms in x are also contained in y then by
Postulate 1.21(2)
y = \bigvee_{p \le y} p = \Big(\bigvee_{p \le x} p\Big) \vee \Big(\bigvee_{p \not\le x} p\Big) = x \vee \Big(\bigvee_{p \not\le x} p\Big) \ge x
Lemma 1.24 The meet x ∧ y of two propositions x and y is a join of atoms
contained in both x and y.
Proof. If p is an atom contained in both x and y (p ≤ x and p ≤ y), then
p ≤ x ∧ y. Conversely, if p ≤ x ∧ y, then p ≤ x and p ≤ y by Lemma C.1.
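The definition of covering and Lemmas 1.22 - 1.24 can be tested by brute force in the simplest atomic lattice, the lattice of all subsets of a finite set. This is a minimal sketch under that assumption; the three-element set S and the helper names covers and atoms_in are hypothetical.

```python
from itertools import combinations

S = (1, 2, 3)   # hypothetical finite set; propositions are its subsets
L = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]
EMPTY = frozenset()

def covers(y, x):
    """True if y covers x: x < y and no proposition lies strictly between."""
    if not x < y:
        return False
    return all(z == x or z == y for z in L if x <= z <= y)

# Atoms are the propositions that cover the minimal proposition.
ATOMS = [p for p in L if covers(p, EMPTY)]

def atoms_in(x):
    """Set of atoms contained in the proposition x."""
    return {p for p in ATOMS if p <= x}

def check_atom_lemmas():
    for x in L:
        assert x == frozenset().union(*atoms_in(x))              # Postulate 1.21(2)
        for p in ATOMS:
            assert (p & x) in (EMPTY, p)                         # Lemma 1.22
            if (p & x) == EMPTY:
                assert covers(p | x, x)                          # Postulate 1.21(3)
        for y in L:
            assert (x <= y) == (atoms_in(x) <= atoms_in(y))      # Lemma 1.23
            assert atoms_in(x & y) == atoms_in(x) & atoms_in(y)  # Lemma 1.24
    return True

if __name__ == "__main__":
    print(ATOMS, check_atom_lemmas())
```

Here the atoms come out as the single-point subsets, in line with the remark that a covering proposition is obtained by adding one extra point.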
1.4 Classical logic

1.4.1 Truth tables and distributive law
It is no surprise that the theory constructed above is similar to classical
logic. Indeed, if we make the substitutions ’less than or equal to’ → IF...THEN...;
join → OR; meet → AND; and so on, as shown in Table 1.2, then properties
described in Postulates and Lemmas 1.3 - 1.24 exactly match axioms of
classical Boolean logic. For example, the transitivity property in Lemma 1.5
allows us to make syllogisms, like the one analyzed by Aristotle
If all humans are mortal,

and all Greeks are humans,
then all Greeks are mortal.
Lemma 1.16 tells us that a proposition and its negation cannot be true at the
same time. Lemma 1.20 is the famous tertium non datur law of logic: either
a proposition or its negation is true with no third possibility.
Table 1.2: Four operations and two special elements of lattice theory and
logic
Name in lattice theory   Name in logic   Meaning in classical logic   Symbol
less or equal            implication     IF x THEN y                  x ≤ y
meet                     conjunction     x AND y                      x ∧ y
join                     disjunction     x OR y                       x ∨ y
orthocomplement          negation        NOT x                        x⊥
maximal element          tautology       always true                  I
minimal element          absurdity       always false                 ∅
Note, however, that properties 1.3 - 1.24 are not sufficient to build a
complete theory of mathematical logic: Boolean logic has two additional
axioms, which have not been mentioned yet. They are called distributive
laws
Assertion 1.25 (distributive law) x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z).
Assertion 1.26 (distributive law) x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).
These laws, unlike axioms of orthocomplemented lattices, cannot be justified

by using our previous approach that relied on general probability measures
(φ|x). This is the reason why we call them Assertions. In the next section we
will see that these Assertions are not valid in the quantum case. However,
they can be proven if we use the fundamental Assertion 1.1 of classical mechanics,
which says that in classical pure states all measurements yield the
same results, i.e., they are reproducible. Then for a given classical pure state φ each
proposition x is either always true or always false and the probability measure
can have only two values: (φ|x) = 1 or (φ|x) = 0. Such classical probabil-
ity measure is called the truth function. In the double-valued (true-false)
Boolean logic, the job of performing logical operations with propositions is
greatly simplified by analyzing their truth functions. For example, to show
the equality of two propositions it is sufficient to demonstrate that the values
of their truth functions are the same for all classical pure states.
Let us consider an example. Given two propositions x and y and an
arbitrary state φ, there are at most four possible values for the pair of their
truth functions (φ|x) and (φ|y): (1,1), (1,0), (0,1), and (0,0). To analyze
these possibilities it is convenient to put the values of the truth functions in
a truth table. Table 1.3 is the truth table for propositions x, y, x ∧ y, x ∨ y,
x⊥ and y⊥ .16 The first row in Table 1.3 refers to all classical pure states in
which both propositions x and y are false, i.e., (φ|x) = (φ|y) = 0. The other
entries in this row tell us that for such states (φ|x ∧ y) = (φ|x ∨ y) = 0 and
(φ|x⊥ ) = (φ|y⊥ ) = 1. The second row refers to states in which x is false and
y is true, etc.
Table 1.3: Truth table for basic logical operations

x   y   x∧y   x∨y   x⊥   y⊥
0   0   0     0     1    1
0   1   0     1     1    0
1   0   0     1     0    1
1   1   1     1     0    0

Table 1.4: Demonstration of the distributive law using truth table

x   y   z   y∧z   x∨(y∧z)   x∨y   x∨z   (x∨y)∧(x∨z)
0   0   0   0     0         0     0     0
0   0   1   0     0         0     1     0
0   1   0   0     0         1     0     0
1   0   0   0     1         1     1     1
0   1   1   1     1         1     1     1
1   0   1   0     1         1     1     1
1   1   0   0     1         1     1     1
1   1   1   1     1         1     1     1
16
Here we assumed that all these propositions are non-empty.
The truth table for three arbitrary non-empty propositions x, y, and
z is shown in Table 1.4. This table has 2^3 = 8 rows that correspond to
groups of classical states having different values of their truth functions on
the propositions x, y and z. In all these cases truth functions in columns 5
and 8 are identical, which means that
x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
and that the classical distributive law (Assertion 1.25) is valid. Assertion
1.26 can be derived in a similar manner.
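The truth-table argument is easy to mechanize. The sketch below enumerates all 2^3 assignments of truth values and checks both distributive laws; modeling AND and OR on the values 0 and 1 by min and max is an implementation choice made for this example, and the function names are hypothetical.

```python
from itertools import product

def AND(a, b):
    return min(a, b)   # truth function of the meet

def OR(a, b):
    return max(a, b)   # truth function of the join

def distributive_laws_hold():
    """Check Assertions 1.25 and 1.26 for every row of the truth table."""
    for x, y, z in product((0, 1), repeat=3):
        law_125 = OR(x, AND(y, z)) == AND(OR(x, y), OR(x, z))
        law_126 = AND(x, OR(y, z)) == OR(AND(x, y), AND(x, z))
        if not (law_125 and law_126):
            return False
    return True

def table_1_4():
    """Rows with the same eight columns as Table 1.4."""
    return [(x, y, z, AND(y, z), OR(x, AND(y, z)),
             OR(x, y), OR(x, z), AND(OR(x, y), OR(x, z)))
            for x, y, z in product((0, 1), repeat=3)]

if __name__ == "__main__":
    for row in table_1_4():
        print(*row)
    print(distributive_laws_hold())
```

The rows are produced in binary counting order, which differs from the row order of Table 1.4, but the set of rows is the same.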
Thus we have shown that in the deterministic world of classical mechanics
governed by the Assertion 1.1, the set of propositions L is an orthocomple-
mented atomic lattice where the distributive laws 1.25 and 1.26 hold true.
Such a lattice will be called a classical propositional system or, shortly, classical
logic. A study of classical logic and its relationship to classical mechanics
is the topic of the present section.
1.4.2 Atomic propositions in classical logic

Our next step is to demonstrate that classical logic provides the entire math-
ematical framework of classical mechanics, i.e., the description of observables
and states in the phase space.17 First, we prove four Lemmas.
Lemma 1.27 In classical logic, if x < y, then there exists an atom p such
that p ∧ x = ∅ and p ≤ y.
Proof. Clearly, y ∧ x⊥ ≠ ∅, because otherwise we would have
y = y ∧ I = y ∧ (x ∨ x⊥ ) = (y ∧ x) ∨ (y ∧ x⊥ ) = (y ∧ x) ∨ ∅ = y ∧ x ≤ x
and, by Lemma 1.4, x = y in contradiction with the condition of the present
Lemma. Since y ∧ x⊥ is non-empty, by Postulate 1.21(1) there exists an
atom p such that p ≤ y ∧ x⊥ . It then follows that p ≤ y and p ≤ x⊥ , and by
Lemma C.3 p ∧ x ≤ x⊥ ∧ x = ∅.
17
An example of the phase space for a single classical particle will be presented in
subsection 6.5.4.
Lemma 1.28 In classical logic, the orthocomplement x⊥ of a proposition x
(where x ≠ I) is a join of all atoms not contained in x.
Proof. First, it is clear that there should exist at least one atom p that
is not contained in x. If it were not true, then we would have x = I in
contradiction to the condition of the Lemma. Let us now prove that the
atom p is contained in x⊥ . Indeed, using the distributive law 1.26 we can
write
p = p ∧ I = p ∧ (x ∨ x⊥ ) = (p ∧ x) ∨ (p ∧ x⊥ )
According to Lemma 1.22 we now have four possibilities:
1. p ∧ x = ∅ and p ∧ x⊥ = ∅; then p ≤ x ∧ x⊥ = ∅, which is impossible;
2. p ∧ x = p and p ∧ x⊥ = p; then p = p ∧ p = (p ∧ x) ∧ (p ∧ x⊥ ) =
p ∧ (x ∧ x⊥ ) = p ∧ ∅ = ∅, which is impossible;
3. p ∧ x = p and p ∧ x⊥ = ∅; from Postulate 1.8 it follows that p ≤ x,
which contradicts our assumption and should be dismissed;
4. p ∧ x = ∅ and p ∧ x⊥ = p; from this we have p ≤ x⊥ , i.e., p is contained
in x⊥ .
This shows that all atoms not contained in x are contained in x⊥ . Further,
Lemmas 1.16 and 1.24 imply that all atoms contained in x⊥ are not contained
in x. The statement of the Lemma then follows from Postulate 1.21(2).
Lemma 1.29 In classical logic, two different atoms p and q are always dis-
joint: q ≤ p⊥ .
Proof. By Lemma 1.28, p⊥ is a join of all atoms different from p, including
q, thus q ≤ p⊥ .
Lemma 1.30 In classical logic, the join x ∨ y of two propositions x and y
is a join of atoms contained in either x or y.
Proof. If p ≤ x or p ≤ y then p ≤ x ∨ y. Conversely, suppose that p ≤ x ∨ y
and p ∧ x = ∅, p ∧ y = ∅, then
p = p ∧ (x ∨ y) = (p ∧ x) ∨ (p ∧ y) = ∅ ∨ ∅ = ∅
which is absurd.
Now we are ready to prove the important fact that in classical mechanics
(or in classical logic) propositions can be interpreted as subsets of a set S,
which is called the phase space.
Theorem 1.31 For any classical logic L, there exists a set S and an isomorphism
f (x) between elements x of L and subsets of the set S such that
x ≤ y ⇔ f (x) ⊆ f (y)    (1.7)
f (x ∧ y) = f (x) ∩ f (y)    (1.8)
f (x ∨ y) = f (x) ∪ f (y)    (1.9)
f (x⊥ ) = S \ f (x)    (1.10)
where ⊆, ∩, ∪ and \ are usual set-theoretical operations of inclusion, intersection,
union and relative complement.
Proof. The statement of the theorem follows immediately if we choose

S to be the set of all atoms. Then property (1.7) follows from Lemma
1.23, equation (1.8) follows from Lemma 1.24. Lemmas 1.30 and 1.28 imply
equations (1.9) and (1.10), respectively.
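The construction used in the proof (S is the set of all atoms, and f(x) is the set of atoms contained in x) can be tried on a concrete distributive orthocomplemented lattice that is not given as a family of subsets. The example chosen here is an assumption made for illustration and does not come from the text: the divisors of 30, ordered by divisibility, with the greatest common divisor as meet, the least common multiple as join, and 30/x as orthocomplement.

```python
from math import gcd

N = 30                                            # square-free, so 30/x is a complement
L = [d for d in range(1, N + 1) if N % d == 0]    # 1, 2, 3, 5, 6, 10, 15, 30

def leq(x, y):
    return y % x == 0            # x <= y means "x divides y"

def meet(x, y):
    return gcd(x, y)

def join(x, y):
    return x * y // gcd(x, y)    # least common multiple

def ortho(x):
    return N // x

# Atoms cover the minimal element 1: their only divisors are 1 and themselves.
ATOMS = [p for p in L if p != 1 and all(z in (1, p) for z in L if leq(z, p))]
S = frozenset(ATOMS)

def f(x):
    """Map a lattice element to the set of atoms contained in it."""
    return frozenset(p for p in ATOMS if leq(p, x))

def check_theorem_1_31():
    for x in L:
        assert f(ortho(x)) == S - f(x)                 # (1.10)
        for y in L:
            assert leq(x, y) == (f(x) <= f(y))         # (1.7)
            assert f(meet(x, y)) == f(x) & f(y)        # (1.8)
            assert f(join(x, y)) == f(x) | f(y)        # (1.9)
    # f is one-to-one onto the subsets of S
    return len({f(x) for x in L}) == 2 ** len(S)

if __name__ == "__main__":
    print(ATOMS, check_theorem_1_31())
```

In this toy lattice the atoms are the primes 2, 3 and 5, and f sends each divisor to its set of prime factors.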
1.4.3 Atoms and pure classical states

Lemma 1.32 In classical logic, if p is an atom and φ is a pure state such
that (φ|p) = 1,18 then for any other atom q ≠ p we have (φ|q) = 0.
Proof. According to Lemma 1.29, q ≤ p⊥ and due to equation (1.6)
(φ|q) ≤ (φ|p⊥ ) = 1 − (φ|p) = 0
18
Such a state always exists due to Postulate 1.2.
Lemma 1.33 In classical logic, if p is an atom and φ and ψ are two pure
states such that (φ|p) = (ψ|p) = 1, then φ = ψ.
Proof. There are propositions of two kinds: those containing the atom p
and those not containing the atom p. For any proposition x containing the
atom p we denote by q the atoms contained in x and obtain using Postulate
1.21(2), Lemma 1.29, Postulate 1.19 and Lemma 1.32
(φ|x) = (φ| \bigvee_{q \le x} q) = \sum_{q \le x} (φ|q) = (φ|p) = 1
The same equation holds for the state ψ. Similarly we can show that for any
proposition y not containing the atom p
(φ|y) = (ψ|y) = 0
Since probability measures of φ and ψ are the same for all propositions, these
two states are equal.
Theorem 1.34 In classical logic, there is an isomorphism between atoms p
and pure states φp such that
(φp |p) = 1    (1.11)
Proof. From Postulate 1.2 we know that for each atom p there is a state φp
in which equation (1.11) is valid. From Lemma 1.33 this state is unique. To
prove the reverse statement we just need to show that for each pure state φp
there is a unique atom p such that (φp |p) = 1. Suppose that for each atom p
we have (φp |p) = 0. Then, taking into account that I is a join of all atoms,
that all atoms are mutually disjoint and using (1.2) and Postulate 1.19, we
obtain
1 = (φp |I) = (φp | \bigvee_{p \le I} p) = \sum_{p \le I} (φp |p) = 0
which is absurd. Therefore, for each state φp one can always find at least
one atom p such that equation (1.11) is valid. Finally, we need to show that
if p and q are two such atoms, then p = q. This follows from the fact that
for each pure classical state the probability measures (or the truth functions)
corresponding to propositions p and q are exactly the same. For the state φp
the truth function is equal to 1, for all other pure states the truth function
is equal to 0.
1.4.4 Phase space of classical mechanics

Now we are fully equipped to discuss the phase space representation in clas-
sical mechanics. Suppose that the physical system under consideration has
observables A, B, C, . . . with corresponding spectra SA , SB , SC , ... Accord-
ing to Theorem 1.34, for each atom p of the propositional system we can
find its corresponding pure state φp . All observables A, B, C, . . . have defi-
nite values in this state.19 Therefore the state φp is characterized by a set
of real numbers Ap , Bp , Cp , . . . - the values of observables. Let us suppose
that the full set of observables {A, B, C, . . .} contains a minimal subset of
observables {X, Y, Z, . . .} whose values {Xp , Yp , Zp , . . .} uniquely enumerate
all pure states φp and therefore all atoms p. So, there is a one-to-one corre-
spondence between groups of numbers {Xp , Yp , Zp , . . .} and atoms p. Then
the set of all atoms can be identified with the direct product20 of spectra of
this minimal set of observables S = SX × SY × SZ × . . .. This direct product
is called the phase space of the system. The values {Xs , Ys , Zs , . . .} of the
independent observables {X, Y, Z, . . .} in each point s ∈ S provide the phase
space with “coordinates.” Other (dependent) observables A, B, C, . . . can
be represented as real functions A(s), B(s), C(s), . . . on S or as functions of
independent observables {X, Y, Z, . . .}.
In this representation, propositions can be viewed as subsets of the phase
space. Another way is to consider propositions as special cases of observables
(= real functions on S): The function corresponding to the proposition about
the subset T of the phase space is the characteristic function of this subset
ξ(s) = \begin{cases} 1, & \text{if } s \in T \\ 0, & \text{if } s \notin T \end{cases}    (1.12)
Atomic propositions correspond to single-point subsets of the phase space S.
In subsection 6.5.4 we will consider one massive particle and build an explicit
and realistic example of the phase space for this physical system.
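A small discrete model may help to fix these ideas. In the sketch below the spectra S_X and S_P, the dependent observable energy(s), and the threshold defining the subset T are all hypothetical choices made for illustration; real phase spaces are usually continuous.

```python
from itertools import product

S_X = (0, 1, 2)      # hypothetical spectrum of an independent observable X
S_P = (-1, 0, 1)     # hypothetical spectrum of an independent observable P

# Phase space as the direct product of the two spectra: all ordered pairs.
S = list(product(S_X, S_P))

def energy(s):
    """A dependent observable, represented as a real function on S."""
    x, p = s
    return p ** 2 / 2 + x

def characteristic(T):
    """Characteristic function (1.12) of a subset T of the phase space."""
    T = set(T)
    return lambda s: 1 if s in T else 0

# Proposition "the value of energy does not exceed 1" as a subset of S.
T = [s for s in S if energy(s) <= 1]
xi = characteristic(T)

if __name__ == "__main__":
    for s in S:
        print(s, energy(s), xi(s))
```

Each point s plays the role of an atom, and the proposition about the dependent observable is represented by the subset T and, equivalently, by the two-valued function xi.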
19
See Assertion 1.1
20
Direct product A × B of two sets A and B is a set of all ordered pairs (x, y), where
x ∈ A and y ∈ B.
1.4.5 Classical probability measures

Probability measures have a simple interpretation in the classical phase space.
Each state φ (not necessarily a pure state) defines probabilities (φ|p) for all
atoms p (= all points s in the phase space). Each proposition x is a join of
disjoint atoms contained in x.
x = \bigvee_{q \le x} q
Then, by Postulate 1.21, Lemma 1.29, and Postulate 1.19 the probability of
the proposition x being true in the state φ is
(φ|x) = (φ| \bigvee_{q \le x} q) = \sum_{q \le x} (φ|q)    (1.13)
So, the value of the probability measure for all propositions x is uniquely
determined by its values on atoms. In many important cases, the phase
space is continuous, and instead of considering probabilities (φ|q) at points
in the phase space (= atoms) it is convenient to consider probability densities
which are functions Φ(s) on the phase space such that
1) Φ(s) ≥ 0;
2) \int_S Φ(s) ds = 1.
Then the value of the probability measure (φ|x) is obtained by the integral
(φ|x) = \int_X Φ(s) ds
over the subset X corresponding to the proposition x.

For a pure classical state φ, the probability density is represented by the
delta function Φ(s) = δ(s − s0 ) localized at one point s0 in the phase space.
For such states, the value of the probability measure in each proposition x
can be either 0 or 1: (φ|x) = 0 if the point s0 does not belong to the subset
X corresponding to the proposition x and (φ|x) = 1 otherwise21
(φ|x) = \int_X δ(s − s0 ) ds = \begin{cases} 1, & \text{if } s_0 \in X \\ 0, & \text{if } s_0 \notin X \end{cases}
21
States whose probability densities are nonzero at more than one point in the phase
space (i.e., the probability density is different from the delta function) are called classical
mixed states. We will not discuss them in this book.
This shows that for pure classical states the probability measure degenerates
into a two-valued truth function. This is in agreement with our discussion in
subsection 1.4.1.
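A discrete analog of these formulas can be written in a few lines. The sketch assumes a hypothetical phase space of four points and represents a state by the probabilities it assigns to atoms, so that equation (1.13) becomes a finite sum and the delta function becomes a single point of weight 1.

```python
from fractions import Fraction

S = ["s0", "s1", "s2", "s3"]     # hypothetical four-point phase space

def measure(state, X):
    """Equation (1.13): sum of the probabilities of the atoms contained in X."""
    return sum((state.get(s, Fraction(0)) for s in X), Fraction(0))

# Pure state localized at the point s1 (discrete analog of the delta function).
pure = {"s1": Fraction(1)}

# A state spread evenly over all points, shown only for comparison.
spread = {s: Fraction(1, 4) for s in S}

X = {"s1", "s2"}                 # subset corresponding to a proposition x

if __name__ == "__main__":
    print(measure(pure, X), measure(pure, {"s0"}))    # truth function: 1 or 0
    print(measure(spread, X))                         # a value strictly between 0 and 1
```

For the pure state the measure takes only the values 0 and 1 on every subset, which is the truth-function behavior described above.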
As we discussed earlier, in classical pure states all observables have well
defined values. So, classical mechanics is a fully deterministic theory in which
one can, in principle, obtain full information about the system at any given
time and, knowing the rules of dynamics, predict exactly its development in
the future. This belief was best expressed by P.–S. Laplace:
An intelligence that would know at a certain moment all the forces
existing in nature and the situations of the bodies that compose
nature and if it would be powerful enough to analyze all these
data, would be able to grasp in one formula the movements of the
biggest bodies of the Universe as well as of the lightest atom.
1.5 Quantum logic

The above discussion of classical logic and phase spaces relied heavily on
the determinism (Assertion 1.1) of classical mechanics and on the validity of
distributive laws (Assertions 1.25 and 1.26). In quantum mechanics we are
not allowed to use these Assertions. This is how we are going to proceed
in this section when building the formalism of quantum logic. By design,
this theory is more general than the familiar classical logic and it includes
the latter as a particular (limiting) case. As we will see in the rest of this
chapter, quantum logic is a foundation of the entire mathematical formalism
of quantum theory.
1.5.1 Compatibility of propositions

Propositions x and y are said to be compatible (denoted x ↔ y) if
x = (x ∧ y) ∨ (x ∧ y ⊥ ) (1.14)
y = (x ∧ y) ∨ (x⊥ ∧ y) (1.15)
The notion of compatibility is of great importance for quantum theory. In
subsection 1.6.3 we will see that two propositions can be measured simultaneously
if and only if they are compatible.
Theorem 1.35 In an orthocomplemented lattice all propositions are compatible
if and only if the lattice is distributive.
Proof. If the lattice is distributive then for any two propositions x and y
(x ∧ y) ∨ (x ∧ y ⊥ ) = x ∧ (y ∨ y ⊥ ) = x ∧ I = x
and, changing places of x and y
(x ∧ y) ∨ (x⊥ ∧ y) = y
These formulas coincide with our definitions of compatibility (1.14) and
(1.15), which proves the direct statement of the theorem.
The proof of the inverse statement (compatibility → distributivity) is
more lengthy. We assume that all propositions in our lattice are compatible
with each other and choose three arbitrary propositions x, y, and z. Now we
are going to prove that the distributive laws22
(x ∧ z) ∨ (y ∧ z) = (x ∨ y) ∧ z (1.16)
(x ∨ z) ∧ (y ∨ z) = (x ∧ y) ∨ z (1.17)
are valid. First we prove that the following 7 propositions (some of them
may be empty) are mutually disjoint (see Fig. 1.4)
q1 = x∧y∧z
q2 = x⊥ ∧ y ∧ z
q3 = x ∧ y⊥ ∧ z
q4 = x ∧ y ∧ z⊥
q5 = x ∧ y⊥ ∧ z⊥
q6 = x⊥ ∧ y ∧ z ⊥
q7 = x⊥ ∧ y ⊥ ∧ z
Figure 1.4: To the proof of Theorem 1.35. (Diagram of the propositions x, y, and z
and their parts q1 , . . . , q7 .)
For example, to show that propositions q3 and q5 are disjoint we notice that
q3 ≤ z and q5 ≤ z ⊥ (by Postulate 1.8). Then by Lemma 1.18 z ≤ q5⊥ and
q3 ≤ z ≤ q5⊥ . Therefore by Lemma 1.5 q3 ≤ q5⊥ .
Since by our assumption both x ∧ z and x ∧ z ⊥ are compatible with y, we
obtain
x ∧ z = (x ∧ z ∧ y) ∨ (x ∧ z ∧ y ⊥ ) = q1 ∨ q3
x ∧ z ⊥ = (x ∧ z ⊥ ∧ y) ∨ (x ∧ z ⊥ ∧ y ⊥ ) = q4 ∨ q5
x = (x ∧ z) ∨ (x ∧ z ⊥ ) = q1 ∨ q3 ∨ q4 ∨ q5
Similarly we show
y ∧ z = q1 ∨ q2
y = q1 ∨ q2 ∨ q4 ∨ q6
z = q1 ∨ q2 ∨ q3 ∨ q7
Then denoting Q = q1 ∨ q2 ∨ q3 we obtain
(x ∧ z) ∨ (y ∧ z) = (q1 ∨ q3 ) ∨ (q1 ∨ q2 ) = q1 ∨ q2 ∨ q3 = Q (1.18)
From Postulate 1.9 and y ∨ x = Q ∨ q4 ∨ q5 ∨ q6 it follows that

22
Assertions 1.25 and 1.26
Q ≤ (Q ∨ q7 ) ∧ (Q ∨ q4 ∨ q5 ∨ q6 ) = (x ∨ y) ∧ z (1.19)
On the other hand, from q4 ∨ q5 ∨ q6 ≤ q7⊥ , Lemma C.3 and the definition of
compatibility it follows that
(x ∨ y) ∧ z = (Q ∨ q4 ∨ q5 ∨ q6 ) ∧ (Q ∨ q7 ) ≤ (Q ∨ q7⊥ ) ∧ (Q ∨ q7 ) = Q(1.20)
Therefore, applying the symmetry Lemma 1.4 to equations (1.19) and (1.20),
we obtain
(x ∨ y) ∧ z = Q (1.21)
Comparing equations (1.18) and (1.21) we see that the distributive law (1.16)
is valid. The other distributive law (1.17) is obtained from equation (1.16)
by duality (see Appendix C).
This theorem tells us that in classical mechanics all propositions are compat-
ible. The presence of incompatible propositions is a characteristic feature of
quantum theories.
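The direct statement of Theorem 1.35 can be checked by brute force on a small classical example. The following Python sketch is only an illustration and not part of the formal development: it assumes a hypothetical phase space of three points, takes all its subsets as propositions, and uses set intersection, union, and complement as meet, join, and orthocomplement. The names UNIVERSE, compatible, and distributive are ours.

```python
from itertools import combinations

# Hypothetical 3-point "phase space"; classical propositions are its subsets.
UNIVERSE = frozenset({"s1", "s2", "s3"})


def all_subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def orthocomplement(x):
    """Classical NOT: the complement relative to UNIVERSE."""
    return UNIVERSE - x


def compatible(x, y):
    """Definitions (1.14)-(1.15) with meet = intersection, join = union."""
    first = x == ((x & y) | (x & orthocomplement(y)))
    second = y == ((x & y) | (orthocomplement(x) & y))
    return first and second


def distributive(x, y, z):
    """Distributive laws (1.16) and (1.17) for one triple of propositions."""
    law_16 = ((x & z) | (y & z)) == ((x | y) & z)
    law_17 = ((x | z) & (y | z)) == ((x & y) | z)
    return law_16 and law_17


PROPOSITIONS = all_subsets(UNIVERSE)
ALL_COMPATIBLE = all(compatible(x, y)
                     for x in PROPOSITIONS for y in PROPOSITIONS)
ALL_DISTRIBUTIVE = all(distributive(x, y, z)
                       for x in PROPOSITIONS
                       for y in PROPOSITIONS
                       for z in PROPOSITIONS)
print(ALL_COMPATIBLE, ALL_DISTRIBUTIVE)
```

Both flags come out true for this Boolean lattice, as the theorem requires. The check says nothing about the converse direction, which needs the argument given in the proof.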
1.5.2 Logic of quantum mechanics

In quantum mechanics we are not allowed to use classical Assertion 1.1 and
we must abandon the distributive laws. However, in order to get a non-trivial
theory we need some substitutes for these two properties. This additional
postulate should be specific enough to yield sensible physics and general
enough to be non-empty and to include the distributive law as a particu-
lar case. The latter requirement is justified by our desire to have classical
mechanics as a particular case of the more general quantum mechanics.
To find such a generalization we suggest the following arguments. From
Theorem 1.35 we know that the compatibility of all propositions is a char-
acteristic property of classical Boolean lattices. We also mentioned that this
property is equivalent to simultaneous measurability of all propositions. We
know that in quantum mechanics not all propositions are simultaneously
measurable, so not all of them can be mutually compatible. This suggests that
we may try to find a generalization of classical theory by limiting the set of
propositions that are mutually compatible. More specifically, we will postulate
that two propositions are definitely compatible if one implies the other
and leave it to mathematics to tell us about the compatibility of other pairs.
Postulate 1.36 (orthomodularity) Propositions about physical systems obey
the orthomodular law: If a implies b, then these two propositions are compatible
a ≤ b ⇒ a ↔ b. (1.22)
Orthocomplemented lattices with additional orthomodular Postulate 1.36 are
called orthomodular lattices.
Is there any deeper physical justification for the above postulate? As
far as I know, there is none. The only justification is that the orthomodu-
larity postulate really works, i.e., it results in the well-known mathematical
structure of quantum mechanics, which has been thoroughly tested in ex-
periments. In principle, one can try to introduce a different postulate to
replace the classical distributivity relationships. If the resulting set of postu-
lates turned out to be self-consistent, then one would obtain a non-classical
theory that is also different from quantum mechanics. To the author’s best
knowledge, this line of reasoning has not been fruitful. So, in this book
we will stick to orthomodular lattices and to traditional laws of quantum
mechanics that follow from them.
Before proceeding further, we need to introduce important notions of the
irreducibility of lattices and their rank. The center of a lattice is the set of
elements compatible with all others. Obviously ∅ and I are in the center.
A propositional system in which there are only two elements in the center
(∅ and I) is called irreducible. Otherwise it is called reducible. Any
Boolean lattice having more than two elements (∅ and I are present in any
lattice, of course) is reducible and its center coincides with the entire lattice.
Orthomodular atomic irreducible lattices are called quantum propositional
systems or quantum logics. The rank of a propositional system is defined as
the maximum number of mutually disjoint atoms. For example, the rank of
the classical propositional system of one massive spinless particle described
in subsection 6.5.4 is the “number of points in the phase space R6 .”
The most fundamental conclusion from our discussion in this section is
the following
Statement 1.37 (quantum logic) Experimental propositions form a quantum
propositional system (= orthomodular atomic irreducible lattice).
In principle, it should be possible to perform all constructions and calculations
in quantum theory by using the formalism of orthomodular lattices
based on the postulates just described. Such an approach would have certain
advantages because all its components have clear physical meaning: experi-
mental propositions x are realizable in laboratories, and probabilities (φ|x)
can be directly measured in experiments. However, this approach meets
tremendous difficulties mainly because lattices are rather exotic mathemati-
cal objects, and we lack intuition when dealing with lattice operations.
We saw that in classical mechanics the happy alternative to the obscure
lattice theory is provided by Theorem 1.31 which proves the isomorphism be-
tween the language of classical logic and the physically transparent language
of phase spaces. Is there a similar equivalence theorem in the quantum case?
To answer this question, we may notice that there is a striking similarity be-
tween algebras of projections on closed subspaces in a complex Hilbert space
H (see Appendices F and G) and quantum propositional systems discussed
above. In particular, if operations between projections (or subspaces) in the
Hilbert space are translated to lattice operations according to Table 1.5,23
then all axioms of quantum logic can be directly verified. For example, the
validity of the Postulate 1.36 follows from Lemmas G.4 and G.5. Atoms can
be identified with one-dimensional subspaces or rays in H. The irreducibility
follows from Lemma G.6.
Table 1.5: Translation of terms, symbols, and operations used for subspaces
in the Hilbert space, projections on these subspaces, and propositions in
quantum logic.
Subspaces                 Projections               Propositions
X ⊆ Y                     P_X P_Y = P_Y P_X = P_X   x ≤ y
X ∩ Y                     P_{X∩Y}                   x ∧ y
Sp(X, Y)                  P_{Sp(X,Y)}               x ∨ y
X′                        1 − P_X                   x⊥
X and Y are compatible    [P_X, P_Y] = 0            x ↔ y
X ⊥ Y                     P_X P_Y = P_Y P_X = 0     x ≤ y⊥
0                         0                         ∅
H                         1                         I
ray x                     |x⟩⟨x|                    x is an atom
23 We denote Sp(A, B) the linear span of two subspaces A and B in the Hilbert space.
(See Appendix A.2.) A ∩ B denotes the intersection of these subspaces. A′ is the orthogonal
complement of A.
1.5.3 Example: 3-dimensional Hilbert space

One can verify directly that distributive laws 1.25 and 1.26 are generally not
valid for subspaces in the Hilbert space H. Consider, for example, the system
of basis vectors and subspaces in a 3-dimensional Hilbert space H shown in
Fig. 1.5. The triples of vectors (a1, a2, a3) and (a1, b2, b3) form two orthogonal
sets. They correspond to 1-dimensional subspaces X1, X2, X3, Y2, Y3. In addition,
two 2-dimensional subspaces Z and Z1 can be formed as Z = Sp(X1, X2)
and Z1 = Sp(X2, X3) = Sp(Y2, X3). These subspaces satisfy obvious relationships
Sp(X2, X3) = Sp(Y2, X3) = Sp(X2, Y2) = Z1,  Sp(X1, X2) = Z,  X3 ∩ Y2 = 0
X2 ∩ Y2 = 0,  X1′ = Z1,  Z ∩ Z1 = X2
Then one can find a triple of subspaces for which the distributive laws in
Assertions 1.25 and 1.26 are not satisfied
Sp(Y2, (X3 ∩ X2)) = Sp(Y2, 0) = Y2 ≠ Z1 = Z1 ∩ Z1 = Sp(Y2, X3) ∩ Sp(Y2, X2)
Y2 ∩ Sp(X3, X2) = Y2 ∩ Z1 = Y2 ≠ 0 = Sp(0, 0) = Sp((Y2 ∩ X3), (Y2 ∩ X2))
This means that the logic represented by subspaces in the Hilbert space
is different from the classical Boolean logic. However, the orthomodularity
postulate is valid there. For example, X1 ⊆ Z so the condition in (1.22) is
satisfied and these two subspaces are compatible according to (1.14) - (1.15)
Sp((Z ∩ X1 ), (Z ∩ X1′ )) = Sp(X1 , (Z ∩ Z1 )) = Sp(X1 , X2 ) = Z

Sp((Z ∩ X1 ), (Z ′ ∩ X1 )) = Sp(X1 , (X3 ∩ X1 )) = Sp(X1 , 0) = X1
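According to Table 1.5, compatibility of two subspaces is equivalent to commutativity of their projections, so the statements of this example can be probed with 3 × 3 matrices. The sketch below is a hedged illustration: it assumes that a1, a2, a3 are the standard basis vectors and that b2 points along (0, 1, 1), which is one possible choice consistent with Sp(Y2, X3) = Z1. The helper names are ours, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction


def ray_projection(v):
    """Matrix of the projection |v><v| / <v|v> onto the ray of a real vector v."""
    norm2 = sum(Fraction(c) * c for c in v)
    return [[Fraction(a) * b / norm2 for b in v] for a in v]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(a, b):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def commute(a, b):
    """True if [a, b] = 0."""
    return matmul(a, b) == matmul(b, a)


P_X1 = ray_projection([1, 0, 0])
P_X2 = ray_projection([0, 1, 0])
P_Y2 = ray_projection([0, 1, 1])   # assumed direction of b2
P_Z = matadd(P_X1, P_X2)           # Z = Sp(X1, X2); projections on orthogonal rays add

# First row of Table 1.5: X1 is contained in Z.
X1_INSIDE_Z = matmul(P_X1, P_Z) == P_X1 and matmul(P_Z, P_X1) == P_X1
# Orthomodular law (1.22): inclusion implies compatibility.
X1_COMPATIBLE_WITH_Z = commute(P_X1, P_Z)
# X2 and Y2 are neither orthogonal nor nested; their projections do not commute.
X2_COMPATIBLE_WITH_Y2 = commute(P_X2, P_Y2)
print(X1_INSIDE_Z, X1_COMPATIBLE_WITH_Z, X2_COMPATIBLE_WITH_Y2)
```

With these assumptions the pair (X1, Z) is compatible while the pair (X2, Y2) is not, which is the kind of incompatibility that cannot occur in a Boolean lattice.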
Figure 1.5: Subspaces in a 3-dimensional Hilbert space H.
1.5.4 Piron’s theorem

Thus we have established that the set of closed subspaces (or projections
on these subspaces) in any complex Hilbert space H is a representation of
some quantum propositional system. The next question is: can we find a
Hilbert space representation for each quantum propositional system? The
positive answer to this question is given by Piron’s theorem
[Pir76, Pir64]
Theorem 1.38 (Piron) Any irreducible quantum propositional system (=
orthomodular atomic irreducible lattice) L of rank 4 or higher is isomorphic
to the lattice of closed subspaces in a Hilbert space H such that the
correspondences shown in Table 1.5 are true.
The proof of this theorem is beyond the scope of our book. Two further
remarks can be made regarding this theorem’s statement. First, all proposi-
tional systems of interest to physics have infinite (even uncountable) rank, so
the condition “rank ≥ 4” is not a significant restriction. Second, the original
Piron’s theorem does not specify the nature of scalars in the Hilbert space.
This theorem leaves the freedom of choosing any division ring with involutive
antiautomorphism as the set of scalars in H. We can greatly reduce this freedom
if we remember the important role played by real numbers in physics.24
Therefore, it makes physical sense to consider only those rings which include
R as a subring. In 1877 Frobenius proved that there are only three
such rings. They are real numbers R, complex numbers C, and quaternions
H. Although there is a vast literature on real and, especially, quaternionic
quantum mechanics [Stu60, Jau71], the relevance of these theories to physics
remains uncertain. Therefore, we will stick with complex numbers from now
on.
24 e.g., values of observables are always in R
Piron’s theorem forms the foundation of the mathematical formalism of
quantum physics. In particular, it allows us to express the important notions
of observables and states in the new language of Hilbert spaces. In this lan-
guage pure quantum states are described by unit vectors in the Hilbert space.
Observables are described by Hermitian operators in the same Hilbert space.
These correspondences will be explained in the next section. Orthomodular
lattices of quantum logic, phase spaces of classical mechanics, and Hilbert
spaces of quantum mechanics are just different languages for describing rela-
tionships between states, observables, and their measured values. Table 1.6
can be helpful for translation between these three languages.
1.5.5 Should we abandon classical logic?

In this section we have reached a seemingly paradoxical conclusion that one
cannot use classical logic as well as classical probability theory for reason-
ing about quantum systems. How could that be? Classical logic is the
foundation of the whole mathematics and the scientific method in general.
All mathematical theorems are being proved with the logic of Aristotle and
Boole.25 Even theorems of quantum mechanics are being proved using this
logic. However, quantum mechanics insists that propositions about physical
systems satisfy laws of the non-Boolean quantum logic. The logic of classical
distributive lattices is just an approximation. Isn’t there a contradiction?
Not really.
It is still permissible to use classical logic in quantum mechanical proofs
because, thanks to Piron’s theorem, we have replaced real life objects,
such as experimental propositions and probability measures, with abstract
and artificial notions of Hilbert spaces, state vectors, and Hermitian operators.
These abstractions are the price we pay for the privilege to keep using
simple classical logic. In principle, it should be possible to formulate the
entire quantum theory using the language of propositions, quantum logic, and
probability measures. However, such an approach has not been developed
yet.
25 Note, however, attempts [III13] to develop so-called “quantum mathematics” that is
based on quantum logic.
Table 1.6: Glossary of terms used in general quantum logic, in classical phase
space, and in the Hilbert space of quantum mechanics.
Nature                          Quantum logic                    Phase space                     Hilbert space
Statement                       proposition                      subset                          closed subspace
Unambiguous statement           atom                             point                           ray
AND                             meet                             intersection                    intersection
OR                              join                             union                           linear span
NOT                             orthocomplement                  relative complement             orthogonal complement
IF...THEN                       implication                      inclusion of subsets            inclusion of subspaces
Observable                      proposition-valued measure on R  real function                   Hermitian operator
Jointly measurable observables  compatible propositions          all observables are compatible  commuting operators
Mutually exclusive statements   disjoint propositions            non-intersecting subsets        orthogonal subspaces
Pure state                      probability measure              delta function                  ray
Mixed state                     probability measure              probability function            density operator
1.6 Quantum observables and states

1.6.1 Observables
Each observable F naturally defines a mapping (called a proposition-valued
measure) from the set of intervals of the real line R to propositions F_E in
L. These propositions can be described in words: F_E = “the value of the
observable F is inside the interval E of the real line R.” We already discussed
properties of propositions about one observable. They can be summarized
as follows:
• The proposition corresponding to the intersection of intervals E1 and
E2 is the meet of propositions corresponding to these intervals
$$F_{E_1 \cap E_2} = F_{E_1} \wedge F_{E_2} \tag{1.23}$$
• The proposition corresponding to the union of intervals E1 and E2 is
the join of propositions corresponding to these intervals
$$F_{E_1 \cup E_2} = F_{E_1} \vee F_{E_2} \tag{1.24}$$
• The proposition corresponding to the complement of interval E is the
orthocomplement of the proposition corresponding to E
$$F_{\mathbb{R} \setminus E} = F_E^{\perp} \tag{1.25}$$
• The minimal proposition corresponds to the empty subset of the real
line
$$F_{\emptyset} = \emptyset \tag{1.26}$$
• The maximal proposition corresponds to the real line itself.
$$F_{\mathbb{R}} = I \tag{1.27}$$
Intervals E of the real line form a Boolean (distributive) lattice with respect
to set theoretical operations ⊆, ∩, ∪, and \. Due to the isomorphism
(1.23) - (1.27), the corresponding one-observable propositions F_E also form a
Boolean lattice, which is a sublattice of our full propositional system. Therefore,
according to Theorem 1.35, all propositions about the same observable
are compatible. Due to the isomorphism “propositions”↔“subspaces,” we
can use the same notation F_E for subspaces (projections) in H corresponding
to intervals E. Then, according to Lemma G.5, all projections F_E, referring
to one observable F, commute with each other.
Each point f in the spectrum of observable F is called an eigenvalue of
this observable. The subspace F_f ⊂ H corresponding to the eigenvalue f is
called an eigensubspace, and the projection P_f onto this subspace is called a
spectral projection. Each vector in the eigensubspace is called an eigenvector.26
Consider two distinct eigenvalues f and g of observable F . The corre-
sponding intervals (=points) of the real line are disjoint. Then propositions
F_f and F_g are disjoint too, and the corresponding (eigen)subspaces are orthogonal.
The linear span of subspaces F_f, where f runs through the entire spectrum
of F, is the full Hilbert space H. Therefore, spectral projections of any
observable form a decomposition of unity.27 So, according to the discussion in
Appendix G.2, we can associate a Hermitian operator
$$F = \sum_f f P_f \tag{1.28}$$
with each observable F . In what follows we will often use terms “observable”
and “Hermitian operator” as synonyms.
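As a small illustration of the spectral decomposition (1.28), the following Python sketch builds an operator from its eigenvalues and spectral projections. The observable used here is a hypothetical two-dimensional example of our own choosing, with eigenvalues +1 and −1 and (unnormalized) eigenvectors (1, 1) and (1, −1). The helper names are not taken from the text.

```python
from fractions import Fraction


def ray_projection(v):
    """Matrix of the projection |v><v| / <v|v> onto the ray of a real vector v."""
    norm2 = sum(Fraction(c) * c for c in v)
    return [[Fraction(a) * b / norm2 for b in v] for a in v]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def weighted_sum(terms, n):
    """Return the n x n matrix sum of f * P over (f, P) pairs."""
    total = [[Fraction(0)] * n for _ in range(n)]
    for f, p in terms:
        total = [[t + f * q for t, q in zip(rt, rp)]
                 for rt, rp in zip(total, p)]
    return total


# Assumed spectrum: eigenvalue -> spectral projection P_f.
SPECTRUM = {1: ray_projection([1, 1]), -1: ray_projection([1, -1])}

# Projections for distinct eigenvalues are orthogonal ...
ORTHOGONAL = matmul(SPECTRUM[1], SPECTRUM[-1]) == [[0, 0], [0, 0]]
# ... and together they form a decomposition of unity.
UNITY = weighted_sum([(1, p) for p in SPECTRUM.values()], 2)
# Equation (1.28): F = sum over f of f * P_f.
F = weighted_sum(list(SPECTRUM.items()), 2)
print(ORTHOGONAL, UNITY == [[1, 0], [0, 1]], F == [[0, 1], [1, 0]])
```

In this example the reconstructed operator is the symmetric matrix with zeros on the diagonal and ones off the diagonal, so the same observable can be specified either by its matrix or by its list of eigenvalues and spectral projections.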
1.6.2 States
As we discussed in subsection 1.3.1, each state φ of the system defines a
probability measure (φ|x) on propositions in quantum logic L. According
to the isomorphism “propositions ↔ subspaces”,28 the state φ also defines
a probability measure (φ|X) on subspaces X in the Hilbert space H. This
probability measure is a function from subspaces to the interval [0, 1] ⊆ R
whose properties follow directly from equations (1.2), (1.3), and Postulate
1.19
• The probability corresponding to the whole Hilbert space is 1 in all
states
(φ|H) = 1 (1.29)
• The probability corresponding to the empty subspace is 0 in all states
(φ|0) = 0 (1.30)
• The probability corresponding to the direct sum of orthogonal subspaces29
is the sum of probabilities for each subspace
(φ|X ⊕ Y) = (φ|X) + (φ|Y), if X ⊥ Y (1.31)
26 In the next subsection we will see that each vector in the Hilbert space defines a unique
pure quantum state. States defined by the above eigenvectors will be called eigenstates (of
the given observable F). Apparently, the observable F has definite values (=eigenvalues)
in its eigenstates. This means that eigenstates are examples of states whose existence was
guaranteed by Postulate 1.2.
27 see Appendix G.1
28 See Table 1.6.
The following important theorem provides a classification of all such probability
measures (= all states of the physical system).
Theorem 1.39 (Gleason [Gle57]) If (φ|X) is a probability measure on
closed subspaces in the Hilbert space H with properties (1.29) - (1.31), then
there exists a non-negative Hermitian operator ρ in H such that
Tr(ρ) = 1 (1.32)
and for any subspace X with projection P_X the value of the probability measure
is
(φ|X) = Tr(P_X ρ) (1.33)
Just a few comments about the terminology and notation used here: First,
a Hermitian operator is called non-negative if all its eigenvalues are greater
than or equal to zero. Second, the operator ρ is usually called the density
operator or the density matrix. Third, Tr denotes the trace30 of the matrix of
the operator ρ.
The proof of Gleason’s theorem is far from trivial, and we refer the interested
reader to the original works [Gle57, RB99]. Here we will focus on the physical
interpretation of this result. First, we may notice that, according to the
spectral theorem F.8, the operator ρ can always be written as
$$\rho = \sum_i \rho_i |e_i\rangle \langle e_i| \tag{1.34}$$
where the vectors |e_i⟩ form an orthonormal basis in H. Then Gleason’s theorem
means that
$$\rho_i \geq 0 \tag{1.35}$$
$$\sum_i \rho_i = 1 \tag{1.36}$$
$$0 \leq \rho_i \leq 1 \tag{1.37}$$
29 Note that according to Table 1.6 orthogonal subspaces correspond to disjoint propositions.
For definition of the direct sum of subspaces see Appendix G.1.
30 Trace is, basically, the sum of all diagonal elements of a matrix. See Appendix F.7.
Among all states satisfying equations (1.35) - (1.37) there are simple states
for which just one coefficient ρ_i is non-zero. Then, from (1.36) it follows
that ρ_i = 1, ρ_j = 0 if j ≠ i, and the density operator degenerates to a
projection |e_i⟩⟨e_i| onto a one-dimensional subspace.31 Such states will be
called pure quantum states. It is also common to describe a pure state by a
unit vector from its ray. Any unit vector from this ray represents the same
state, i.e., in the vector representation of states there is a freedom of choosing
a unimodular phase factor of the state vector. In what follows we will often
use the terms “pure quantum state” and “state vector” as synonyms.
Mixed quantum states are expressed as weighted sums of pure states whose
coefficients ρ_i in equation (1.34) reflect the probabilities with which the pure
states enter in the mixture. Therefore, in quantum mechanics there are
uncertainties of two types. The first type is the uncertainty present in mixed
states. This uncertainty is already familiar to us from classical (statistical)
physics. This uncertainty results from our insufficient control of preparation
conditions (like when a bullet is fired from a shaky rifle). The second type
of uncertainty is present even in pure quantum states. It does not have a
counterpart in classical physics, and it cannot be avoided by tightening the
preparation conditions. This uncertainty is a reflection of the mysterious
unpredictability of microscopic phenomena.
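Gleason’s formula (1.33) and the properties (1.29) - (1.31) can be tried out on a small example. The sketch below is an illustration under our own assumptions: a two-dimensional Hilbert space and a hypothetical mixed state with weights 3/4 and 1/4 in the basis (1, 0), (0, 1). The function names are ours.

```python
from fractions import Fraction


def ray_projection(v):
    """Matrix of the projection |v><v| / <v|v> onto the ray of a real vector v."""
    norm2 = sum(Fraction(c) * c for c in v)
    return [[Fraction(a) * b / norm2 for b in v] for a in v]


def trace_of_product(a, b):
    """Tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


# Assumed density operator: rho = 3/4 |e1><e1| + 1/4 |e2><e2|, cf. (1.34).
RHO = [[Fraction(3, 4), Fraction(0)],
       [Fraction(0), Fraction(1, 4)]]
IDENTITY = [[1, 0], [0, 1]]        # projection on the whole space H

P_E1 = ray_projection([1, 0])
P_DIAG = ray_projection([1, 1])    # ray along (1, 1)
P_ANTI = ray_projection([1, -1])   # orthogonal ray along (1, -1)

PROB_E1 = trace_of_product(P_E1, RHO)      # weight of |e1> in the mixture
PROB_DIAG = trace_of_product(P_DIAG, RHO)
PROB_ANTI = trace_of_product(P_ANTI, RHO)
PROB_TOTAL = trace_of_product(IDENTITY, RHO)
print(PROB_E1, PROB_DIAG, PROB_ANTI, PROB_TOTAL)
```

The two orthogonal rays span the whole space, so their probabilities add up to the probability of H, in agreement with (1.29) and (1.31).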
We will not discuss mixed quantum states in this book. So, we will
deal only with uncertainties of the fundamental quantum type. Thus, when
speaking about a quantum state φ, we will always assume that there exists
a corresponding state vector |φ⟩. As discussed above, this vector is not the
unique representative of the state. Any vector e^{iα}|φ⟩ that differs from |φ⟩ by
a unimodular phase factor e^{iα} (where α ∈ R), is also a valid representative
of the state φ.
31 One-dimensional subspaces are also called rays.
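The freedom of choosing the phase factor can be made concrete with a few lines of Python. In this sketch the state vector, the phase α = 0.7, and the ray along (1, 1)/√2 are arbitrary values chosen by us for illustration. The probability is computed as the squared modulus of the inner product, which for a pure state is what Gleason’s formula (1.33) reduces to.

```python
import cmath
import math


def inner(u, v):
    """Inner product <u|v> for vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def probability(ray_vector, state):
    """|<e|phi>|^2 for a unit vector e of the ray and a unit state vector phi."""
    return abs(inner(ray_vector, state)) ** 2


PHI = [0.6, 0.8j]                        # hypothetical unit state vector
PHASE = cmath.exp(1j * 0.7)              # unimodular factor with alpha = 0.7
PHI_SHIFTED = [PHASE * c for c in PHI]   # another representative of the same state
E = [1 / math.sqrt(2), 1 / math.sqrt(2)]

print(probability(E, PHI), probability(E, PHI_SHIFTED))
```

Both representatives give the same probability (up to floating-point rounding), which is why the two vectors describe one and the same state.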
1.6.3 Commuting and compatible observables

In subsection 1.5.1 we defined the notion of compatible propositions. In
Lemma G.5 we showed that the compatibility of propositions is equivalent
to the commutativity of corresponding projections. The importance of these
definitions for physics comes from the fact that for a pair of compatible
propositions (=projections=subspaces) there are states in which both these
propositions are certain, i.e., simultaneously measurable. A similar state-
ment can be made for two compatible (=commuting) Hermitian operators of
observables. According to Theorem G.9, such two operators have a common
basis of eigenvectors (=eigenstates). In these eigenstates both observables
have definite (eigen)values.
We will assume that for any physical system there always exists a minimal
and complete set of mutually compatible (= commuting) observables
F, G, H, . . ..32 Then, we should be able to build an orthonormal basis of
common eigenvectors |e_i⟩ such that each basis vector is uniquely labeled by
eigenvalues f_i, g_i, h_i, . . . of operators F, G, H, . . ., i.e., if |e_i⟩ and |e_j⟩ are two
eigenvectors then there is at least one different number in the two sets of
eigenvalues {f_i, g_i, h_i, . . .} and {f_j, g_j, h_j, . . .}.
Each state vector |φ⟩ can be represented as a linear combination of these
basis vectors
$$|\phi\rangle = \sum_i \phi_i |e_i\rangle \tag{1.38}$$
where in the bra-ket notation33
$$\phi_i = \langle e_i|\phi\rangle \tag{1.39}$$
The set of coefficients φ_i can be viewed as a function φ(f, g, h, . . .) on the
common spectrum of observables F, G, H, . . .. In this form, the coefficients
φ_i are referred to as the wave function of the state |φ⟩ in the representation
defined by observables F, G, H, . . .. When the spectrum of operators
F, G, H, . . . is continuous, the index i is, actually, a continuous variable.34
32 A set F, G, H, . . . is called minimal if no observable from the set can be expressed as a
function of other observables from the same set. A set is complete if no more observables
can be added to it. For example, a single massive particle has a mutually commuting set
of observables (P_x, P_y, P_z, S², S_z), where P and S are the momentum and spin operators,
respectively. See section 5.1. Any function of observables from the minimal commuting
set also commutes with F, G, H, . . ., and with any other such function.
33 see Appendix F.3
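Equations (1.38) and (1.39) can be illustrated numerically in a finite-dimensional case. The sketch below assumes a hypothetical two-dimensional Hilbert space, an orthonormal eigenbasis of our own choosing, and an arbitrary unit state vector. It computes the wave function as the list of coefficients and then rebuilds the state vector from them.

```python
import math


def inner(u, v):
    """Inner product <u|v> for vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


S = 1 / math.sqrt(2)
BASIS = [[S, S], [S, -S]]   # assumed orthonormal basis of common eigenvectors |e_i>
PHI = [0.6, 0.8j]           # hypothetical unit state vector

# Equation (1.39): phi_i = <e_i|phi>.
WAVE_FUNCTION = [inner(e, PHI) for e in BASIS]

# Equation (1.38): |phi> = sum_i phi_i |e_i>, component by component.
RECONSTRUCTED = [sum(c * e[k] for c, e in zip(WAVE_FUNCTION, BASIS))
                 for k in range(len(PHI))]

# The squared moduli of the coefficients add up to 1 for a unit vector.
NORM = sum(abs(c) ** 2 for c in WAVE_FUNCTION)
print(WAVE_FUNCTION, RECONSTRUCTED, NORM)
```

The reconstructed components agree with the original ones up to rounding, so the wave function carries the same information as the state vector, only expressed in the chosen representation.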
1.6.4 Expectation values

Equation (1.28) defines a spectral decomposition for each observable F, where
index f runs over all distinct eigenvalues of F. Then for each pure state |φ⟩
we can find the probability of measuring a value f of the observable F in
this state by using formula35
$$\rho_f = \sum_{i=1}^{m} |\langle e_i^f|\phi\rangle|^2 \tag{1.40}$$
where |e_i^f⟩ are basis vectors in the range of the projection
$$P_f \equiv \sum_{i=1}^{m} |e_i^f\rangle \langle e_i^f| \tag{1.41}$$
and m is the dimension of the corresponding subspace. Sometimes we also
need to know the weighted average of values f. This is called the expectation
value of the observable F in the state |φ⟩ and denoted ⟨F⟩
$$\langle F \rangle \equiv \sum_f \rho_f f$$
Substituting here equation (1.40) we obtain
$$\langle F \rangle = \sum_{j=1}^{n} |\langle e_j|\phi\rangle|^2 f_j \equiv \sum_{j=1}^{n} |\phi_j|^2 f_j$$
where the summation is carried out over the entire basis |e_j⟩ of eigenvectors of
the operator F with eigenvalues f_j. By using decompositions (1.38), (1.28),
and (1.41) we obtain a more compact formula for the expectation value ⟨F⟩
$$\begin{aligned}
\langle\phi|F|\phi\rangle &= \Big(\sum_i \phi_i^* \langle e_i|\Big) \Big(\sum_j |e_j\rangle f_j \langle e_j|\Big) \Big(\sum_k \phi_k |e_k\rangle\Big) \\
&= \sum_{ijk} \phi_i^* f_j \phi_k \langle e_i|e_j\rangle \langle e_j|e_k\rangle = \sum_{ijk} \phi_i^* f_j \phi_k \delta_{ij} \delta_{jk} = \sum_j |\phi_j|^2 f_j \\
&= \langle F \rangle
\end{aligned} \tag{1.42}$$
34 Wave functions in the momentum and position representations for a single particle
will be discussed in section 5.2.
35 This is simply the value of the probability measure (φ|P_f) (see subsection 1.3.1)
corresponding to the spectral projection P_f. One can also see that this formula is equivalent
to Gleason’s expression (1.33).
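The equality (1.42) between the weighted average of eigenvalues and the matrix element of the operator can be verified on a toy example. The following sketch uses an assumed 2 × 2 observable with eigenvalues +1 and −1 and an arbitrary real unit state vector. All names and numerical values are ours and serve only as an illustration.

```python
import math


def inner(u, v):
    """Inner product <u|v> for vectors given as lists of numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def apply(matrix, v):
    """Action of a matrix (nested lists) on a vector."""
    return [sum(m * c for m, c in zip(row, v)) for row in matrix]


S = 1 / math.sqrt(2)
F = [[0, 1], [1, 0]]                          # assumed Hermitian operator
EIGENPAIRS = [(1, [S, S]), (-1, [S, -S])]     # its eigenvalues and unit eigenvectors
PHI = [0.6, 0.8]                              # hypothetical unit state vector

# Equation (1.40) for non-degenerate eigenvalues: rho_f = |<e_f|phi>|^2.
PROBABILITIES = {f: abs(inner(e, PHI)) ** 2 for f, e in EIGENPAIRS}

# Weighted average of eigenvalues, <F> = sum_f rho_f f.
MEAN_FROM_SPECTRUM = sum(p * f for f, p in PROBABILITIES.items())

# Compact formula (1.42): <F> = <phi|F|phi>.
MEAN_FROM_OPERATOR = inner(PHI, apply(F, PHI)).real
print(PROBABILITIES, MEAN_FROM_SPECTRUM, MEAN_FROM_OPERATOR)
```

Both ways of computing the expectation value agree up to floating-point rounding, and the probabilities of the two eigenvalues add up to 1.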
1.6.5 Basic rules of classical and quantum mechanics

Results obtained in this chapter can be summarized as follows. If our system
is prepared in a pure state φ and we want to calculate the probability ρ for
measuring the value of the observable F inside the interval E ⊆ R, then we
need to perform the following steps:
In classical mechanics:
1. Define the phase space S of the physical system;
2. Find a real function f : S → R corresponding to the observable F;
3. Find the subset U of S corresponding to the subset E of the spectrum
of the observable F (U is the set of all points s ∈ S such that f(s) ∈ E);
4. Find the point s_φ ∈ S representing the pure classical state φ;
5. The probability ρ is equal to 1 if s_φ ∈ U and ρ = 0 otherwise.
In quantum mechanics:
1. Define the Hilbert space H of the physical system;
2. Find the Hermitian operator F in H corresponding to the observable;
3. Find the eigenvalues and eigenvectors of the operator F;
4. Find a spectral projection P_E corresponding to the subset E of the
spectrum of the operator F;
5. Find the unit vector |φ⟩ (defined up to an arbitrary unimodular factor)
representing the state of the system;
6. Use formula ρ = ⟨φ|P_E|φ⟩.
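The two lists of steps above can be sketched side by side in Python. This is a minimal illustration under our own assumptions: the classical system is a unit-mass, unit-frequency oscillator with the energy as the observable, the quantum observable is given directly by its non-degenerate eigenvalues and unit eigenvectors, and the interval E is taken as a closed interval [low, high]. None of these specific choices comes from the text.

```python
import math


def classical_probability(observable, state_point, interval):
    """Classical steps 2-5: rho = 1 if f(s_phi) lies in E, otherwise 0."""
    low, high = interval
    return 1 if low <= observable(state_point) <= high else 0


def quantum_probability(eigenpairs, state, interval):
    """Quantum steps 3-6: rho = <phi|P_E|phi>, computed as the sum of
    |<e|phi>|^2 over (eigenvalue, unit eigenvector) pairs with eigenvalue in E."""
    low, high = interval
    total = 0.0
    for eigenvalue, vector in eigenpairs:
        if low <= eigenvalue <= high:
            amplitude = sum(complex(a).conjugate() * b
                            for a, b in zip(vector, state))
            total += abs(amplitude) ** 2
    return total


def energy(point):
    """Hypothetical classical observable on the phase space of points (q, p)."""
    q, p = point
    return 0.5 * (p * p + q * q)


# Classical case: the state is the phase-space point (1, 1) with energy 1.
CLASSICAL_INSIDE = classical_probability(energy, (1.0, 1.0), (0.5, 1.5))
CLASSICAL_OUTSIDE = classical_probability(energy, (1.0, 1.0), (2.0, 3.0))

# Quantum case: assumed eigenpairs and a hypothetical unit state vector.
S = 1 / math.sqrt(2)
EIGENPAIRS = [(1, [S, S]), (-1, [S, -S])]
QUANTUM_RHO = quantum_probability(EIGENPAIRS, [0.6, 0.8], (0.0, 2.0))
print(CLASSICAL_INSIDE, CLASSICAL_OUTSIDE, QUANTUM_RHO)
```

The classical rule can only return 0 or 1, while the quantum rule generally returns a number strictly between 0 and 1 even for a pure state.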
At this point, there seems to be no connection between the classical and
quantum rules. However, we will see in section 6.5 that in the macroscopic
world with massive objects and poor resolution of instruments, the classical
rules emerge as a decent approximation to the quantum ones.
1.7 Interpretations of quantum mechanics

In sections 1.3 - 1.6 of this chapter we focused on the mathematical formalism
of quantum mechanics. Now it is time to discuss the physical meaning and
interpretation of these formal rules.
1.7.1 Quantum unpredictability in microscopic systems

Experiments with quantum microsystems have revealed one simple and yet
mysterious fact: if we prepare N absolutely identical physical systems in the
same conditions and measure the same observable in each of them, we may
find N different results.
Let us illustrate this experimental finding by a few examples. We know
from experience that each photon passing through the hole in the camera
obscura will hit the photographic plate at some point.
However, each newly released photon will land at a different point.
Quantum mechanics allows us to calculate the probability density for these
points, but apart from that, the behavior of each individual photon appears
to be completely random. Quantum mechanics does not even attempt to
predict where each individual photon will hit the target.
Another example of such an apparently random behavior is the decay of
unstable nuclei. The nucleus of ²³²Th has a half-life of about 14 billion years.
This means that in any sample containing thorium, approximately half of all
²³²Th nuclei will decay after 14 billion years. In principle, quantum mechanics
can calculate the probability of the nuclear decay as a function of time by
solving the corresponding Schrödinger equation.36 However, quantum mechanics
cannot even approximately guess when any given nucleus will decay.
It could happen today, or it could happen 100 billion years from now.
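To put numbers on this example, the sketch below reads the quoted 14 billion years as a half-life (the time after which about half of the nuclei have decayed) and assumes the usual exponential decay law, which is our assumption here and is not derived in this section. The function name is ours.

```python
import math

HALF_LIFE_YEARS = 14e9   # value quoted in the text for thorium-232


def survival_probability(t_years, half_life=HALF_LIFE_YEARS):
    """Probability that a given nucleus is still undecayed after t_years,
    assuming the exponential decay law with the given half-life."""
    return 2.0 ** (-t_years / half_life)


AFTER_ONE_HALF_LIFE = survival_probability(14e9)
AFTER_100_BILLION_YEARS = survival_probability(100e9)
print(AFTER_ONE_HALF_LIFE, AFTER_100_BILLION_YEARS)
```

Under this assumption the chance of surviving 100 billion years is small but not zero, which is all that the probabilistic description can say about an individual nucleus.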
Although such unpredictability is certainly a hallmark of microscopic
systems, it would be wrong to think that it does not affect our macroscopic
world. Quite often the effect of random microscopic processes can be amplified
to produce a sizable equally random macroscopic effect. One famous
example of the amplification of quantum uncertainties is the thought exper-
iment with the “Schrödinger cat” [Sch35].
So, our world (even at the macroscopic scale) is full of truly random events
whose exact description and prediction is beyond capabilities of modern sci-
ence. Nobody knows why physical systems have this random unpredictable
behavior. Quantum mechanics simply accepts this fact and does not attempt
to explain it. Quantum mechanics does not describe what actually happens;
it describes the full range of possibilities of what might have happened and
the probability of each possible outcome. Each time nature chooses just
one possibility from this range, while obeying the probabilities predicted by
quantum mechanics. QM cannot say anything about which particular choice
will be made by nature. These choices are completely random and beyond
explanation by modern science. This observation is a bit disturbing and
embarrassing. Indeed, we have real physically measurable effects (the actual
choices made by nature) over which we have no control and whose outcome
we have no power to predict. These are facts without an explanation, effects without
a cause. It seems that microscopic particles obey some mysterious random
force. Then it is appropriate to ask what is the reason for such stochastic
behavior of micro-systems? Is it truly random or does it just seem to be random?
If quantum mechanics cannot explain this random behavior, maybe there is
a deeper theory that can?
36
though our current knowledge of nuclear forces is insufficient to make a reliable cal-
culation of that sort for thorium.
1.7.2 Hidden variables

One school of thought attributes the apparently random behavior of mi-
crosystems to some yet unknown “hidden” variables, which are currently
beyond our observation and control. According to these views, put some-
what simplistically, each photon in camera obscura has a guiding mechanism
which directs it to a certain predetermined spot on the photographic plate.
Each unstable nucleus has some internal “alarm clock” ticking inside. The
nucleus decays when the alarm goes off. The behavior of quantum systems
just appears to be random to us because so far we don’t have a clue about
these “guiding mechanisms” and “alarm clocks.”
According to the “hidden variables” theory, quantum mechanics is not the
final word, and future theory will be able to fully describe the properties of
individual systems and predict events without relying on chance. There are
two problems with this point of view. First, so far nobody has been able to build
a convincing theory of hidden variables and to predict (even approximately)
outcomes of quantum measurements beyond calculated probabilities. The
second reason to reject the “hidden variables” argument is more formal.
The “hidden variables” theory says that the randomness of micro-systems
does not have any special quantum-mechanical origin. It is the same classical
pseudo-randomness as seen in the usual coin-tossing or die-rolling. The
“hidden variables” theory implies that rules of classical mechanics apply to
micro-systems just as well as to macro-systems. As we saw in subsection 1.4.1,
these rules are based on the classical Assertion 1.1 of determinism. Quan-
tum mechanics simply discards this unprovable Assertion and replaces it
with a weaker Postulate 1.2. So, quantum mechanics with its probabilities is
a more general mathematical framework, while classical mechanics with its
determinism can be represented as a particular case of this framework. As
Mittelstaedt put it [Mit02]
...classical mechanics is loaded with metaphysical hypotheses which

clearly exceed our everyday experience. Since quantum mechanics
is based on strongly relaxed hypotheses of this kind, classical me-
chanics is less intuitive and less plausible than quantum mechan-
ics. Hence classical mechanics, its language and its logic cannot
be the basis of an adequate interpretation of quantum mechanics.
P. Mittelstaedt
1.7.3 Measurement problem

If we now accept the probabilistic quantum view of reality, we must address
some deep paradoxes. One disturbing “paradox” is that in quantum me-
chanics the wave function of a physical system evolves in time smoothly and
unitarily according to the Schrödinger equation (5.50) up until the instant of
measurement, at which point the wave function experiences an unpredictable
and abrupt collapse.
The seemingly puzzling part is that it is not clear how the wave function
“knows” when it can undergo the continuous evolution and when it should
“collapse.” Where is the boundary between the measuring device and the
quantum system? For example, it is customary to say that the photon is
the quantum system while the photographic plate is the measuring device.
However, we can adopt a different view and include the photographic plate
together with the photon in our quantum system. Then, we should, in prin-
ciple, describe both the photon and the photographic plate by a joint wave
function. When does this wave function collapse? Where is the measuring
apparatus in this case? The human eye? Does it mean that while we
are not looking, the entire system (photon + photographic plate) remains
in a superposition state? Following this logic we may easily reach a seemingly
absurd conclusion that the ultimate measuring device is the human brain,
and all events remain potentialities until they are registered by a mind. For
many physicists these contradictions signify some troubling incompleteness
of quantum theory, its inability to describe the world “as it is.”
In order to avoid the controversial wave function collapse, several so-called
“interpretations” of quantum mechanics were proposed. In the de
Broglie–Bohm “pilot wave” interpretation it is postulated that the electron
propagating in the double-slit setup is actually a classical point particle,
whose movement is “guided” by a separate material “wave” that obeys the
Schrödinger equation. In the “many worlds” interpretation it is assumed that
at the instant of measurement (when several outcomes are possible, according
to quantum mechanics) the world splits into several (or even an infinite number
of) copies, so all outcomes are realized at once. We see only a single outcome
because we live in just one copy of the world and lack the “bird's-eye view” of the
many worlds reality.
Interpretations of quantum mechanics attempt to suggest some kinds of
physical mechanisms of the quantum system’s behavior and the measurement
process. However, how can we be sure that these mechanisms are
correct? The only method of verification available in physics is experiment,
but suggested mechanisms are related to things happening in the physical
system while it is not observed. So, it is impossible to design experiments
that would (dis)prove interpretations of quantum mechanics. Being inaccessible
to experimental verification these interpretations should belong to
philosophy rather than physics.
Actually, the “collapse” or “measurement” paradoxes are not as serious
as they look. In the author’s view, their appearance is related simply to our
unrealistic expectations regarding the explanatory power of physical theory.
Intuitively we wish to have a physical theory that encompasses all physical
reality: the physical system, the measuring apparatus, the observer, and the
entire universe. However, this goal is perhaps too ambitious and misleading.
Recall that the goal of a physical theory declared in the Introduction is to provide
a formalism that allows us to predict results of experiments.37 In physics we
do not want and do not need to describe the whole world “as it is.” We
should be entirely satisfied if our theory allows us to calculate the outcome
of any conceivable measurement, which is a more modest task.
Thus it is not a surprise that certain aspects of reality are beyond the
reach of quantum theory. In this sense, quantum mechanics can be regarded
as an “incomplete” theory. However, this author believes that this incompleteness
is not a problem, but a reflection of the fundamental unavoidable un-
predictability of nature. Here the following quote from Einstein seems ap-
propriate:
I now imagine a quantum theoretician who may even admit that
the quantum-theoretical description refers to ensembles of systems
and not to individual systems, but who, nevertheless, clings to the
idea that the type of description of the statistical quantum theory
will, in its essential features, be retained in the future. He may
argue as follows: True, I admit that the quantum-theoretical description
is an incomplete description of the individual system. I
even admit that a complete theoretical description is, in principle,
thinkable. But I consider it proven that the search for such a complete
description would be aimless. For the lawfulness of nature
is thus constructed that the laws can be completely and suitably
formulated within the framework of our incomplete description.
To this I can only reply as follows: Your point of view - taken as
theoretical possibility - is incontestable. A. Einstein [Ein49]
37 More precisely, the theory should be able to calculate probabilities of measurements.
The quantum mechanical distinction between the observed system and
the measuring apparatus is not as problematic as often claimed. This distinction
is naturally present in every experiment. If properly asked, the
experimentalist will always tell you which part of his setup is the observed
system and which part is the measuring apparatus.38 Therefore there is nothing
wrong in applying different descriptions to these two parts. In quantum
theory the state of the physical system is described as a vector in the Hilbert
space, and the measuring apparatus is described as a Hermitian operator
in the same Hilbert space. The measuring apparatus is not considered to
be a dynamical object. This means that there is no point in describing the
act of measurement as “interaction” between the physical system and the
measuring apparatus by means of some dynamical theory.39
The “collapse” of the wave function is not a dynamical process, it is just
a part of a mathematical formalism that allows us to fulfill the true task of
any physical theory - to predict outcomes of experiments. So there is no
contradiction or paradox between the unitary time evolution of wave
functions and the abrupt “collapse” at the time of measurement.
1.7.4 Agnostic interpretation of quantum mechanics

The things we said above can be summarized in the following statements:
1. Quantum mechanics does not pretend to provide a description of the
entire universe. It only applies to the description of specific experiments
in which the physical system and the measuring apparatus are clearly
separated.
2. Quantum mechanics does not provide a mechanism of what goes on
while the physical system is not observed or while it is measured. Quantum
mechanics is just a mathematical recipe for calculating probabilities
of experimental outcomes. The ingredients used in this recipe
(Hilbert space, state superpositions, wave functions, Hermitian operators,
etc.) have no direct relationship to things observable in nature.
They are just mathematical symbols.
3. We cannot measure all observables at once. A realistic experiment
measures only one observable, or, in the best case, a few mutually
compatible observables.
4. Nature is inherently probabilistic. There exists a certain level of random
“noise” that leads to unpredictability of the results of measurements.
5. Logical propositions about measurements do not obey the set of classical
Boolean axioms. The distributive law of logic is not valid and
should be replaced by the orthomodular law.
38 For instance, in the above example of the double-slit experiment the photon is the
physical system and the photographic plate is the measuring apparatus. The one-photon
Hilbert space should be used for the quantum-mechanical analysis of this experiment. If
we like, we can consider the “photon + photographic plate” as our physical system, but
this would mean that we have changed completely the experimental setup, the measuring
apparatus, and the range of meaningful questions that can be asked and answered about
the system. The new setup should be described quantum-mechanically in a different
Hilbert space with different state vectors and different operators of observables.
39 as it was done, e.g., in von Neumann’s measurement theory.
The most important philosophical lesson taught to us by quantum mechanics
is the unwillingness to speculate about things that are not observable.
Einstein was very displeased with this point of view. He wrote:
I think that a particle must have a separate reality independent of
the measurements. That is an electron has spin, location and so
forth even when it is not being measured. I like to think that the
moon is there even if I am not looking at it. A. Einstein
Actually, quantum mechanics does not make any claims about things that
are not measured. So, we will also prefer to remain agnostic about non-
observable features and refuse to use them as a basis for building our theory.
Chapter 2
POINCARÉ GROUP
There are more things in Heaven and on earth, dear Horatio,
than are dreamed of in your philosophy.
Hamlet
In the preceding chapter we have learned that each physical system can
be described mathematically by a Hilbert space. Rays in this space are in
one-to-one correspondence with (pure) states of the system. Observables are
described by Hermitian operators. This vague description is not sufficient for
a working theory. We are still lacking a precise classification of possible physical
systems; we still do not know which operators correspond to usual observables
like position, momentum, mass, energy, spin, etc. and how these operators
are related to each other; we still cannot tell how states and observables
evolve in time. Our theory is not complete.
It appears that many missing pieces mentioned above are supplied by the
principle of relativity - which is one of the most powerful ideas in physics.
This principle has a very general character. It works independently of which
physical system, state or observable is considered. Basically, this principle
says that there is no preferred inertial reference frame (or observer or lab-
oratory). All frames are equivalent if they are at rest or move uniformly
without rotation or acceleration. Moreover, the principle of relativity rec-
ognizes certain (group) properties of inertial transformations between these
frames or observers. Our primary goal here is to establish that the group of
transformations between inertial observers is the celebrated Poincaré group.

In the rest of this book we will have many opportunities to appreciate the
fundamental importance of this idea for relativistic physics.
One can notice that the principle of relativity discussed here is the same
as the first postulate of Einstein’s special relativity. In this book we will
not need Einstein’s second postulate, which claims the independence of the
speed of light from the velocity of the source or observer. Actually, we will
find out that by combining the first postulate, the Poincaré group idea, and
laws of quantum mechanics we can obtain a complete working formalism of
relativistic quantum theory. This will be done in chapter 3. We will also
see in chapter 5 that the second postulate is redundant, because the speed
of massless photons appears to be invariant anyway. Another distinctive
feature of our relativistic approach is that we never assume the existence of
the 4-dimensional Minkowski manifold, which unifies space and time. Time
and position play very different roles in our theory. The significance of this
idea will be discussed in chapter 17 in the second part of this book.
2.1 Inertial observers

2.1.1 Principle of relativity
As was said in the Introduction, in this book we consider only inertial
laboratories. What is so special about them? The answer is that one can
apply the powerful principle of relativity to such laboratories. The essence of
this principle was best explained by Galileo more than 370 years ago [Gal01]:
Shut yourself up with some friend in the main cabin below
decks on some large ship and have with you there some flies, butterflies
and other small flying animals. Have a large bowl of water
with some fish in it; hang up a bottle that empties drop by drop
into a wide vessel beneath it. With the ship standing still, observe
carefully how the little animals fly with equal speed to all sides of
the cabin. The fish swim indifferently in all directions; the drops
fall into the vessel beneath; and, in throwing something to your
friend, you need to throw it no more strongly in one direction
than another, the distances being equal; jumping with your feet
together, you pass equal spaces in every direction. When you have
observed all of these things carefully (though there is no doubt that
when the ship is standing still everything must happen this way),
have the ship proceed with any speed you like, so long as the mo-
tion is uniform and not fluctuating this way and that. You will
discover not the least change in all the effects named, nor could
you tell from any of them whether the ship was moving or stand-
ing still. In jumping, you will pass on the floor the same spaces
as before, nor will you make larger jumps toward the stern than
towards the prow even though the ship is moving quite rapidly,
despite the fact that during the time that you are in the air the
floor under you will be going in a direction opposite to your jump.
In throwing something to your companion, you will need no more
force to get it to him whether he is in the direction of the bow or
the stern, with yourself situated opposite. The droplets will fall as
before into the vessel beneath without dropping towards the stern,
although while the drops are in the air the ship runs many spans.
The fish in the water will swim towards the front of their bowl
with no more effort than toward the back and will go with equal
ease to bait placed anywhere around the edges of the bowl. Finally
the butterflies and flies will continue their flights indifferently to-
ward every side, nor will it ever happen that they are concentrated
toward the stern, as if tired out from keeping up with the course
of the ship, from which they will have been separated during long
intervals by keeping themselves in the air.
These observations can be translated into a single statement that no inertial
laboratory can be distinguished from the laboratory at rest by performing
experiments confined to that laboratory. Any experiment performed
in one laboratory will yield exactly the same result as an identical experiment
in any other laboratory. The results will be the same, independent of
how far apart the laboratories are and what their relative orientations
and velocities are. Moreover, we may repeat the same experiment at any time,
tomorrow, or many years later, and the results will still be the same. This allows us
to formulate one of the most important and deep postulates in physics:
Postulate 2.1 (the principle of relativity) In all inertial laboratories,
the laws of nature are the same: they do not change with time, they do not
depend on the position and orientation of the laboratory in space and on its
velocity. The laws of physics are invariant against inertial transformations
of laboratories.
2.1.2 Inertial transformations

Our next goal is to study inertial transformations between laboratories in
more detail. To do this we do not need to consider physical systems at all. It
is sufficient to think about a world inhabited only by laboratories. The only
thing these laboratories can do is to measure parameters1 $\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}$ of their
fellow laboratories. It appears that even in this oversimplified world we can
learn quite a few useful things about properties of inertial laboratories and
their relationships to each other.
Let us first introduce a convenient labeling of inertial observers and inertial
transformations. We choose an arbitrary frame $O$ as our reference
observer, then other examples of observers are
(i) an observer $\{\vec{0}; \mathbf{0}; \mathbf{0}; t_1\}O$ displaced in time by the amount $t_1$;
(ii) an observer $\{\vec{0}; \mathbf{0}; \mathbf{r}_1; 0\}O$ shifted in space by the vector $\mathbf{r}_1$;
(iii) an observer $\{\vec{0}; \mathbf{v}_1; \mathbf{0}; 0\}O$ moving with velocity $\mathbf{v}_1$;
(iv) an observer $\{\vec{\phi}_1; \mathbf{0}; \mathbf{0}; 0\}O$ rotated by the vector $\vec{\phi}_1$.2
Suppose now that we have three different inertial observers $O$, $O'$, and $O''$.
There is an inertial transformation $\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}$ which connects $O$ and $O'$

$$O' = \{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O \qquad (2.1)$$

where parameters $\vec{\phi}_1$, $\mathbf{v}_1$, $\mathbf{r}_1$, and $t_1$ are measured by the ruler and clock
belonging to the reference frame $O$ with respect to its basis vectors. Similarly,
there is an inertial transformation that connects $O'$ and $O''$

$$O'' = \{\vec{\phi}_2; \mathbf{v}_2; \mathbf{r}_2; t_2\}O' \qquad (2.2)$$

where parameters $\vec{\phi}_2$, $\mathbf{v}_2$, $\mathbf{r}_2$, and $t_2$ are defined with respect to the basis
vectors, ruler and clock of the observer $O'$. Finally, there is a transformation
that connects $O$ and $O''$

$$O'' = \{\vec{\phi}_3; \mathbf{v}_3; \mathbf{r}_3; t_3\}O \qquad (2.3)$$

with all transformation parameters referring to $O$. We can represent the
transformation (2.3) as a composition or product of transformations (2.1)
and (2.2)

$$\{\vec{\phi}_3; \mathbf{v}_3; \mathbf{r}_3; t_3\} = \{\vec{\phi}_2; \mathbf{v}_2; \mathbf{r}_2; t_2\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\} \qquad (2.4)$$

1 These parameters were explained on page xxxiii of the Introduction.
2 The parameterization of rotations by 3-vectors is discussed in Appendix D.5.
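Apparently, this product has the property of associativity.3 Also, there exists
a trivial (identity) transformation $\{\vec{0}; \mathbf{0}; \mathbf{0}; 0\}$ that leaves all observers
unchanged, and for each inertial transformation $\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}$ there is an inverse
transformation $\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}^{-1}$ such that their product is the identity transformation

$$\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}^{-1} = \{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}^{-1}\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\} = \{\vec{0}; \mathbf{0}; \mathbf{0}; 0\} \qquad (2.5)$$
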
In other words, the set of inertial transformations forms a group (see Ap-
pendix A.1). Moreover, since these transformations smoothly depend on
their parameters, this is a Lie group (see Appendix E.1). The main goal of
the present chapter is to study the properties of this group in some detail. In
particular, we will need explicit formulas for the composition and inversion
laws.
First we notice that a general inertial transformation $\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}$ can be
represented as a product of basic transformations (i) - (iv). As these basic
transformations generally do not commute, we must agree on the canonical
order in this product. For our purposes the following choice is convenient

$$\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}O = \{\vec{\phi}; \mathbf{0}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{v}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{r}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{0}; t\}O \qquad (2.6)$$

This means that in order to obtain observer $O' = \{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}O$ we first shift
observer $O$ in time by the amount $t$,4 then shift the time-translated observer
by the vector $\mathbf{r}$, then give it velocity $\mathbf{v}$, and finally rotate the obtained observer
by the angle $\vec{\phi}$.
3 see equation (A.1)
4 Recall that observers considered in this book are instantaneous, so the time flow is
regarded as time translation of observers.
2.2 Galilei group

In this section we begin our study of the group of inertial transformations
by considering a non-relativistic world in which observers move with low
speeds. This is a relatively easy task, because in these derivations we can
use our everyday experience and “common sense.” The relativistic group of
transformations will be approached in section 2.3 as a formal generalization
of the Galilei group derived here.
2.2.1 Multiplication law of the Galilei group

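Let us first consider four examples of products (2.4) in which $\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}$
is a general inertial transformation and $\{\vec{\phi}_2; \mathbf{v}_2; \mathbf{r}_2; t_2\}$ is one of the basic
transformations from the list (i) - (iv). Applying a time translation to a
general reference frame $\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O$ will change its time label and change
its position in space according to equation

$$\{\vec{0}; \mathbf{0}; \mathbf{0}; t_2\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O = \{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1 + \mathbf{v}_1 t_2; t_1 + t_2\}O \qquad (2.7)$$

Space translations affect the position

$$\{\vec{0}; \mathbf{0}; \mathbf{r}_2; 0\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O = \{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1 + \mathbf{r}_2; t_1\}O \qquad (2.8)$$

Boosts change the velocity

$$\{\vec{0}; \mathbf{v}_2; \mathbf{0}; 0\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O = \{\vec{\phi}_1; \mathbf{v}_1 + \mathbf{v}_2; \mathbf{r}_1; t_1\}O \qquad (2.9)$$

Rotations affect all vector parameters5

$$\{\vec{\phi}_2; \mathbf{0}; \mathbf{0}; 0\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\}O = \{\vec{\Phi}(R_{\vec{\phi}_2} R_{\vec{\phi}_1}); R_{\vec{\phi}_2}\mathbf{v}_1; R_{\vec{\phi}_2}\mathbf{r}_1; t_1\}O \qquad (2.10)$$
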
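Now we can calculate the product of two general inertial transformations in
(2.4) by using (2.6) - (2.10)6

$$\begin{aligned}
&\{\vec{\phi}_2; \mathbf{v}_2; \mathbf{r}_2; t_2\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\} \\
&= \{\vec{\phi}_2; \mathbf{0}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{v}_2; \mathbf{0}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{r}_2; 0\}\{\vec{0}; \mathbf{0}; \mathbf{0}; t_2\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1; t_1\} \\
&= \{\vec{\phi}_2; \mathbf{0}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{v}_2; \mathbf{0}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{r}_2; 0\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1 + \mathbf{v}_1 t_2; t_1 + t_2\} \\
&= \{\vec{\phi}_2; \mathbf{0}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{v}_2; \mathbf{0}; 0\}\{\vec{\phi}_1; \mathbf{v}_1; \mathbf{r}_1 + \mathbf{v}_1 t_2 + \mathbf{r}_2; t_1 + t_2\} \\
&= \{\vec{\phi}_2; \mathbf{0}; \mathbf{0}; 0\}\{\vec{\phi}_1; \mathbf{v}_1 + \mathbf{v}_2; \mathbf{r}_1 + \mathbf{v}_1 t_2 + \mathbf{r}_2; t_1 + t_2\} \\
&= \{\vec{\Phi}(R_{\vec{\phi}_2} R_{\vec{\phi}_1}); R_{\vec{\phi}_2}(\mathbf{v}_1 + \mathbf{v}_2); R_{\vec{\phi}_2}(\mathbf{r}_1 + \mathbf{v}_1 t_2 + \mathbf{r}_2); t_1 + t_2\} \qquad (2.11)
\end{aligned}$$

5 For definition of 3 × 3 rotation matrices $R_{\vec{\phi}}$ and function $\vec{\Phi}$ see Appendix D.5.
6 Note that sometimes the product of Galilei transformations is written in other forms.
See, for example, section 3.2 in [Bal98], where the assumed canonical order of factors was
different from our formula (2.6).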
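By direct substitution to equation (2.5) it is easy to check that the inverse
of a general inertial transformation $\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}$ is

$$\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\}^{-1} = \{-\vec{\phi}; -\mathbf{v}; -\mathbf{r} + \mathbf{v}t; -t\} \qquad (2.12)$$
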
Equations (2.11) and (2.12) are multiplication and inversion laws which fully
determine the structure of the Lie group of inertial transformations in non-
relativistic physics. This group is called the Galilei group.
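The multiplication and inversion laws can be tried out numerically. The sketch below is only an illustration under a simplifying assumption: it is restricted to transformations without rotation ($\vec{\phi} = \vec{0}$), so that all rotation matrices in (2.11) and (2.12) reduce to the unit matrix. The names `Galilei0`, `compose`, `inverse` and the sample parameters are hypothetical and do not come from the text.

```python
from dataclasses import dataclass


def add(a, b):
    """Component-wise sum of two 3-vectors stored as tuples."""
    return tuple(x + y for x, y in zip(a, b))


def scale(a, s):
    """Multiply a 3-vector by the number s."""
    return tuple(x * s for x in a)


@dataclass(frozen=True)
class Galilei0:
    """Rotation-free inertial transformation {0; v; r; t} (hypothetical name)."""
    v: tuple
    r: tuple
    t: int


def compose(g2, g1):
    """Product g2 g1 following (2.11) with both rotation matrices set to 1."""
    return Galilei0(
        v=add(g1.v, g2.v),
        r=add(add(g1.r, scale(g1.v, g2.t)), g2.r),
        t=g1.t + g2.t,
    )


def inverse(g):
    """Inverse following (2.12) with zero rotation: {0; -v; -r + v t; -t}."""
    return Galilei0(
        v=scale(g.v, -1),
        r=add(scale(g.r, -1), scale(g.v, g.t)),
        t=-g.t,
    )


IDENTITY = Galilei0((0, 0, 0), (0, 0, 0), 0)

# Arbitrary integer sample parameters (assumed, chosen for exact arithmetic).
g1 = Galilei0(v=(1, 2, 3), r=(4, 5, 6), t=7)
g2 = Galilei0(v=(-2, 0, 1), r=(0, 3, -1), t=2)
g3 = Galilei0(v=(5, -1, 0), r=(2, 2, 2), t=-4)

print(compose(g1, inverse(g1)) == IDENTITY)   # right inverse, eq. (2.5)
print(compose(inverse(g1), g1) == IDENTITY)   # left inverse, eq. (2.5)
print(compose(g3, compose(g2, g1)) == compose(compose(g3, g2), g1))
```

The three comparisons test the two halves of (2.5) and associativity for the sample elements. Integer parameters are used so that the equalities are exact rather than subject to floating-point rounding.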
2.2.2 Lie algebra of the Galilei group

In physical applications the Lie algebra of the group of inertial transformations
plays an even greater role than the group itself. According to our discussion
in Appendix E, we can obtain the basis $(H, \vec{P}, \vec{K}, \vec{J})$ in the Lie algebra
of generators of the Galilei group by taking derivatives with respect to parameters
of one-parameter subgroups. For example, the generator of time
translations is

$$H = \lim_{t \to 0} \frac{d}{dt}\{\vec{0}; \mathbf{0}; \mathbf{0}; t\}$$

For generators of space translations and boosts along the x-axis we obtain

$$P_x = \lim_{x \to 0} \frac{d}{dx}\{\vec{0}; \mathbf{0}; x, 0, 0; 0\}$$

$$K_x = \lim_{v \to 0} \frac{d}{dv}\{\vec{0}; v, 0, 0; \mathbf{0}; 0\}$$

The generator of rotations around the x-axis is

$$J_x = \lim_{\phi \to 0} \frac{d}{d\phi}\{\phi, 0, 0; \mathbf{0}; \mathbf{0}; 0\}$$

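Similar formulas are valid for y- and z-components. According to (E.1) we
can also express finite transformations as exponents of generators

$$\{\vec{0}; \mathbf{0}; \mathbf{0}; t\} = e^{Ht} \approx 1 + Ht \qquad (2.13)$$

$$\{\vec{0}; \mathbf{0}; \mathbf{r}; 0\} = e^{\vec{P}\mathbf{r}} \approx 1 + \vec{P}\mathbf{r} \qquad (2.14)$$

$$\{\vec{0}; \mathbf{v}; \mathbf{0}; 0\} = e^{\vec{K}\mathbf{v}} \approx 1 + \vec{K}\mathbf{v} \qquad (2.15)$$

$$\{\vec{\phi}; \mathbf{0}; \mathbf{0}; 0\} = e^{\vec{J}\vec{\phi}} \approx 1 + \vec{J}\vec{\phi}$$

Then each group element can be represented in its canonical form (2.6) as
the following function of parameters

$$\{\vec{\phi}; \mathbf{v}; \mathbf{r}; t\} \equiv \{\vec{\phi}; \mathbf{0}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{v}; \mathbf{0}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{r}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{0}; t\} = e^{\vec{J}\vec{\phi}} e^{\vec{K}\mathbf{v}} e^{\vec{P}\mathbf{r}} e^{Ht} \qquad (2.16)$$
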
Let us now find the commutation relations between generators, i.e., the
structure constants of the Galilei Lie algebra. Consider, for example, translations
in time and space. From equation (2.11) we have

$$\{\vec{0}; \mathbf{0}; \mathbf{0}; t\}\{\vec{0}; \mathbf{0}; x, 0, 0; 0\} = \{\vec{0}; \mathbf{0}; x, 0, 0; 0\}\{\vec{0}; \mathbf{0}; \mathbf{0}; t\}$$

This implies

$$e^{Ht} e^{P_x x} = e^{P_x x} e^{Ht}$$

$$1 = e^{P_x x} e^{Ht} e^{-P_x x} e^{-Ht}$$

Using equations (2.13) and (2.14) for the exponents we can write to the first
order in $x$ and to the first order in $t$

$$\begin{aligned}
1 &\approx (1 + P_x x)(1 + Ht)(1 - P_x x)(1 - Ht) \\
&\approx 1 + P_x H xt - P_x H xt - H P_x xt + P_x H xt \\
&= 1 - H P_x xt + P_x H xt
\end{aligned}$$

hence

$$[P_x, H] \equiv P_x H - H P_x = 0$$

So, generators of space and time translations have vanishing Lie bracket.
Similarly we obtain Lie brackets

$$[H, P_i] = [P_i, P_j] = [K_i, K_j] = [K_i, P_j] = 0$$

for any $i, j = x, y, z$ (or $i, j = 1, 2, 3$). The composition of a time translation
and a boost is more interesting since they do not commute. We calculate
from equation (2.11)

$$\begin{aligned}
e^{K_x v} e^{Ht} e^{-K_x v} &= \{\vec{0}; v, 0, 0; \mathbf{0}; 0\}\{\vec{0}; \mathbf{0}; \mathbf{0}; t\}\{\vec{0}; -v, 0, 0; \mathbf{0}; 0\} \\
&= \{\vec{0}; v, 0, 0; \mathbf{0}; 0\}\{\vec{0}; -v, 0, 0; -vt, 0, 0; t\} \\
&= \{\vec{0}; 0, 0, 0; -vt, 0, 0; t\} \\
&= e^{Ht} e^{-P_x vt}
\end{aligned}$$

Therefore, using equations (2.13), (2.15), and (E.13) we obtain

$$Ht + [K_x, H]vt = Ht - P_x vt$$

$$[K_x, H] = -P_x$$

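The group-element side of this calculation can be reproduced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: it keeps only the x-components and drops rotations, so each transformation is a triple `(v, x, t)` multiplied according to (2.11) with unit rotation matrices. The helper names `boost`, `shift` and `delay` and the sample values of `v` and `t` are hypothetical.

```python
def compose(g2, g1):
    """Product g2 g1 of rotation-free transformations (v, x, t) along the x-axis,
    following (2.11) with unit rotation matrices."""
    v1, x1, t1 = g1
    v2, x2, t2 = g2
    return (v1 + v2, x1 + v1 * t2 + x2, t1 + t2)


def boost(v):
    """Boost {0; v,0,0; 0; 0}."""
    return (v, 0, 0)


def shift(x):
    """Space translation {0; 0; x,0,0; 0}."""
    return (0, x, 0)


def delay(t):
    """Time translation {0; 0; 0; t}."""
    return (0, 0, t)


v, t = 3, 5  # arbitrary sample values (assumed)

# Left-hand side: boost, time translation, inverse boost.
lhs = compose(boost(v), compose(delay(t), boost(-v)))
# Right-hand side: time translation combined with a space shift by -v t.
rhs = compose(shift(-v * t), delay(t))

print(lhs, rhs)
print(compose(delay(t), shift(7)) == compose(shift(7), delay(t)))
```

Both triples describe a time translation by `t` accompanied by a space shift by `-v * t`, which is the finite-transformation counterpart of $[K_x, H] = -P_x$. The last line repeats the observation that space and time translations commute.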
Proceeding in a similar fashion for other pairs of transformations we can
obtain the full set of commutation relations for the Lie algebra of the Galilei
group.

$$[J_i, P_j] = \sum_{k=1}^{3} \epsilon_{ijk} P_k \qquad (2.17)$$

$$[J_i, J_j] = \sum_{k=1}^{3} \epsilon_{ijk} J_k \qquad (2.18)$$

$$[J_i, K_j] = \sum_{k=1}^{3} \epsilon_{ijk} K_k \qquad (2.19)$$

$$[J_i, H] = 0 \qquad (2.20)$$

$$[P_i, P_j] = [P_i, H] = 0 \qquad (2.21)$$

$$[K_i, K_j] = 0 \qquad (2.22)$$

$$[K_i, P_j] = 0 \qquad (2.23)$$

$$[K_i, H] = -P_i \qquad (2.24)$$

From these Lie brackets one can identify several important sub-algebras of
the Galilei Lie algebra and, therefore, subgroups of the Galilei group. In
particular, there is an Abelian subgroup of space and time translations (with
generators $\vec{P}$ and $H$, respectively), a subgroup of rotations (with generators
$\vec{J}$), and an Abelian subgroup of boosts (with generators $\vec{K}$).
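The brackets (2.17) - (2.24) can be checked on an explicit set of matrices. The sketch below uses one possible 5 × 5 matrix realization, which is our assumption and is not taken from the text: rows and columns are labeled by (x, y, z, t, constant), the rotation generators have entries $-\epsilon_{iab}$, and the sign of `H` is chosen so that (2.24) comes out as written. All function names are hypothetical.

```python
N = 5  # row/column labels: 0, 1, 2 = x, y, z; 3 = t; 4 = constant entry


def zeros():
    return [[0] * N for _ in range(N)]


def unit(i, j, value=1):
    """Matrix with a single non-zero entry `value` at row i, column j."""
    m = zeros()
    m[i][j] = value
    return m


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(N)] for i in range(N)]


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def eps_sum(i, j, gens):
    """Matrix sum_k eps_ijk * gens[k]."""
    out = zeros()
    for k in range(3):
        for a in range(N):
            for b in range(N):
                out[a][b] += eps(i, j, k) * gens[k][a][b]
    return out


def rotation_generator(i):
    """Rotation generator acting on the x, y, z block, entries -eps_{iab}."""
    m = zeros()
    for a in range(3):
        for b in range(3):
            m[a][b] = -eps(i, a, b)
    return m


# Assumed realization of the ten generators.
J = [rotation_generator(i) for i in range(3)]
P = [unit(i, 4) for i in range(3)]
K = [unit(i, 3) for i in range(3)]
H = unit(3, 4, -1)  # sign chosen so that [K_i, H] = -P_i


def galilei_brackets_hold():
    """Return True if the matrices above satisfy (2.17) - (2.24)."""
    zero = zeros()
    for i in range(3):
        for j in range(3):
            if comm(J[i], P[j]) != eps_sum(i, j, P):   # (2.17)
                return False
            if comm(J[i], J[j]) != eps_sum(i, j, J):   # (2.18)
                return False
            if comm(J[i], K[j]) != eps_sum(i, j, K):   # (2.19)
                return False
            if comm(P[i], P[j]) != zero:               # (2.21)
                return False
            if comm(K[i], K[j]) != zero:               # (2.22)
                return False
            if comm(K[i], P[j]) != zero:               # (2.23)
                return False
        if comm(J[i], H) != zero:                      # (2.20)
            return False
        if comm(P[i], H) != zero:                      # (2.21)
            return False
        if comm(K[i], H) != unit(i, 4, -1):            # (2.24)
            return False
    return True


print(galilei_brackets_hold())
```

Integer matrices keep every comparison exact. The check concerns only this particular realization; it illustrates the structure constants and is not a substitute for the derivation from the group multiplication law.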
2.2.3 Transformations of generators under rotations

Consider two reference frames $O$ and $O'$ connected to each other by the group
element $g$:

$$O' = gO$$

Suppose that observer $O$ performs an (active) inertial transformation with
the group element $h$ (e.g., $h$ is a translation along the x-axis). We want to
find a transformation $h'$ which is related to the observer $O'$ in the same way
as $h$ is related to $O$ (i.e., $h'$ is the translation along the $x'$-axis belonging to
the observer $O'$). As seen from the example in Fig. 2.1, the transformation
$h'$ of the object $A$ can be obtained by first going from $O'$ to $O$, performing
translation $h$ there, and then returning back to the reference frame $O'$

$$h' = g h g^{-1}$$

Figure 2.1: Connection between similar transformations $h$ and $h'$ in different
reference frames. $g = \exp(J_z \phi)$ is a rotation around the z-axis that is
perpendicular to the page.

Similarly, if $A$ is a generator of an inertial transformation in the reference
frame $O$, then

$$A' = g A g^{-1} \qquad (2.25)$$

is “the same” generator in the reference frame $O' = gO$.

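Let us consider the effect of rotation around the z-axis on generators of
the Galilei group. We can write

$$A'_x \equiv A_x(\phi) = e^{J_z \phi} A_x e^{-J_z \phi}$$

$$A'_y \equiv A_y(\phi) = e^{J_z \phi} A_y e^{-J_z \phi}$$

$$A'_z \equiv A_z(\phi) = e^{J_z \phi} A_z e^{-J_z \phi}$$

where $\vec{A}$ is any of the generators $\vec{P}$, $\vec{J}$ or $\vec{K}$. From Lie brackets (2.17) -
(2.19) we obtain

$$\frac{\partial}{\partial \phi} A_x(\phi) = e^{J_z \phi}(J_z A_x - A_x J_z)e^{-J_z \phi} = e^{J_z \phi} A_y e^{-J_z \phi} = A_y(\phi) \qquad (2.26)$$

$$\frac{\partial}{\partial \phi} A_y(\phi) = e^{J_z \phi}(J_z A_y - A_y J_z)e^{-J_z \phi} = -e^{J_z \phi} A_x e^{-J_z \phi} = -A_x(\phi)$$

$$\frac{\partial}{\partial \phi} A_z(\phi) = e^{J_z \phi}(J_z A_z - A_z J_z)e^{-J_z \phi} = 0 \qquad (2.27)$$

Taking a derivative of equation (2.26) with respect to $\phi$ we obtain a second order
differential equation

$$\frac{\partial^2}{\partial \phi^2} A_x(\phi) = \frac{\partial}{\partial \phi} A_y(\phi) = -A_x(\phi)$$

with the general solution

$$A_x(\phi) = B \cos \phi + D \sin \phi$$

where $B$ and $D$ are arbitrary functions of generators. From the initial conditions
we obtain

$$B = A_x(0) = A_x$$

$$D = \left. \frac{d}{d\phi} A_x(\phi) \right|_{\phi = 0} = A_y$$

so that finally

$$A_x(\phi) = A_x \cos \phi + A_y \sin \phi \qquad (2.28)$$

Similar calculations show that

$$A_y(\phi) = -A_x \sin \phi + A_y \cos \phi \qquad (2.29)$$

$$A_z(\phi) = A_z \qquad (2.30)$$
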
Comparing (2.28) - (2.30) with equation (D.12), we see that

$$A'_i = e^{J_z \phi} A_i e^{-J_z \phi} = \sum_{j=1}^{3} (R_z)_{ij} A_j \qquad (2.31)$$

where $R_z$ is the rotation matrix. As shown in equation (D.21), we can also
find the result of application of a general rotation $\{\vec{\phi}; \mathbf{0}; \mathbf{0}; 0\}$ to generators
$\vec{A}$

$$\begin{aligned}
\vec{A}' &= e^{\vec{J}\vec{\phi}} \vec{A} e^{-\vec{J}\vec{\phi}} \\
&= \vec{A} \cos \phi + \frac{\vec{\phi}}{\phi}\left(\vec{A} \cdot \frac{\vec{\phi}}{\phi}\right)(1 - \cos \phi) - \left[\vec{A} \times \frac{\vec{\phi}}{\phi}\right] \sin \phi \\
&= R_{\vec{\phi}} \vec{A}
\end{aligned}$$

This means that $\vec{P}$, $\vec{J}$, and $\vec{K}$ are 3-vectors.7 The Lie bracket (2.20) obviously
means that $H$ is a 3-scalar.

Figure 2.2: Transformation of generators under space inversion.
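The transformation laws (2.28) and (2.29) can be verified numerically on a concrete set of matrices. In the sketch below we take $\vec{A} = \vec{J}$ and realize the rotation generators as 3 × 3 matrices with entries $-\epsilon_{iab}$; this choice, the truncated Taylor series used for the exponential, the sample angle, and all function names are assumptions made for illustration only.

```python
import math


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def generator(i):
    """3x3 matrix J_i with entries -eps_{iab} (an assumed realization)."""
    return [[-eps(i, a, b) for b in range(3)] for a in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two 3x3 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]


def expm(m, terms=40):
    """Matrix exponential by a truncated Taylor series (fine for small matrices)."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, m)]
        result = combine(1.0, result, 1.0, term)
    return result


def rotate_generator(a, phi):
    """Return exp(J_z phi) a exp(-J_z phi)."""
    jz = generator(2)
    plus = expm([[phi * x for x in row] for row in jz])
    minus = expm([[-phi * x for x in row] for row in jz])
    return matmul(matmul(plus, a), minus)


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


phi = 0.7  # arbitrary sample angle (assumed)
jx, jy = generator(0), generator(1)

# Deviation from (2.28): A_x(phi) = A_x cos(phi) + A_y sin(phi)
err_x = max_abs_diff(rotate_generator(jx, phi),
                     combine(math.cos(phi), jx, math.sin(phi), jy))
# Deviation from (2.29): A_y(phi) = -A_x sin(phi) + A_y cos(phi)
err_y = max_abs_diff(rotate_generator(jy, phi),
                     combine(-math.sin(phi), jx, math.cos(phi), jy))

print(err_x, err_y)
```

Both deviations should be at the level of floating-point rounding. The same check can be repeated for other angles, or for the generators $\vec{P}$ and $\vec{K}$ in any matrix realization that satisfies (2.17) and (2.19).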
2.2.4 Space inversions

We will not consider physical consequences of discrete transformations (inversion
and time reversal) in this book. It is physically impossible to prepare
an exact mirror image or a time-reversed image of a laboratory, so the relativity
postulate has nothing to say about such transformations. Indeed, it
has been proven by experiment that these discrete symmetries are not exact.
Nevertheless, we will find it useful to know how generators behave with
respect to space inversions. Suppose we have a classical system $S$ and its inversion
image $S'$ (see Fig. 2.2) with respect to the origin 0. The question is:
how will the image $S'$ transform if we apply a certain inertial transformation
to $S$?
7 see Appendix D.2
Apparently, if we shift $S$ by vector $\mathbf{x}$, then $S'$ will be shifted by $-\mathbf{x}$.
This can be interpreted as the change of sign of the generator of translation
$\vec{P}$ under inversion. The same with boost: the inverted image $S'$ acquires
velocity $-\mathbf{v}$ if the original was boosted by $\mathbf{v}$. So, inversion changes the sign
of the boost generator as well

$$\vec{K} \to -\vec{K} \qquad (2.32)$$

Vectors, such as $\vec{P}$ and $\vec{K}$, changing their sign after inversion are called true
vectors. However, the generator of rotation $\vec{J}$ is not a true vector. Indeed,
if we rotate $S$ by angle $\vec{\phi}$, then the image $S'$ is also rotated by the
same angle (see Fig. 2.2). So, $\vec{J}$ does not change the sign after inversion.
Such vectors are called pseudovectors. Similarly we can introduce the
notions of true scalars/pseudoscalars and true tensors/pseudotensors. It is
conventional to define their properties in a way opposite to those of true
vectors/pseudovectors. In particular, true scalars and true tensors (of rank
2) do not change their signs after inversion. For example, $H$ is a true scalar.
Pseudoscalars and rank-2 pseudotensors do change their signs after inversion.
2.3 Poincaré group

It appears that the Galilei group described above is valid only for observers
moving with low speeds. In the general case a different multiplication law
should be used, and the group of inertial transformations is, in fact, the
Poincaré group (also known as the inhomogeneous Lorentz group). This is
a very important lesson following from the theory of relativity developed at
the beginning of the 20th century by Einstein and Poincaré.
Derivation of the relativistic group of inertial transformations is a diffi-
cult task, because we lack the experience of dealing with fast-moving objects
in our everyday life. So, we will use more formal mathematical arguments
instead. In this section we will find that there is almost a unique way to
obtain the Lie algebra of the Poincaré group by generalizing the commuta-
tion relations of the Galilei Lie algebra (2.17) - (2.24), so that they remain
compatible with some simple physical requirements.
2.3.1 Lie algebra of the Poincaré group

We can be confident about the validity of Galilei Lie brackets involving gen-
erators of space-time translations and rotations, because properties of these
transformations have been verified in everyday life and in physical experi-
ments over a wide range of involved parameters (distances, times, and an-
gles). The situation with respect to boosts is quite different. Normally, we do
not experience high speeds in our life and we lack any physical intuition that
was so helpful in deriving the Galilei Lie algebra. Therefore the arguments
that led us to the Lie brackets (2.22) - (2.24) involving boost generators
may not be exact, and these formulas may be just approximations that can
be tolerated only for low-speed observers. So, we will base our derivation of
the relativistic group of inertial transformations on the following ideas.
(I) Just as in the non-relativistic world, the set of inertial transformations
should remain a 10-parameter Lie group. However, Lie brackets in
the exact (Poincaré) Lie algebra are expected to be different from the
Galilei Lie brackets (2.17) - (2.24).
(II) The Galilei group does a good job in describing the low-speed transformations,
and the speed of light $c$ is a natural measure of speed.
Therefore we may guess that the correct Lie brackets should include $c$
as a parameter, and they must tend to the Galilei Lie brackets in the
limit $c \to \infty$.8
(III) We will assume that only Lie brackets involving boosts may be subject
to revision.
(IV) We will further assume that relativistic generators of boosts $\vec{K}$ still
form components of a true vector, so equations (2.19) and (2.32) remain
valid.
8 Note that here we do not assume that $c$ is a limiting speed or that the speed of light is
invariant. These facts will come out as a result of application of our approach to massive
and massless particles in chapter 5.
Summarizing requirements (I) - (IV), we can write the following relativistic

generalizations for the Lie brackets (2.22) - (2.24)
[Ki , Pj ] = Uij (2.33)

[Ki , Kj ] = Tij
[Ki , H] = −Pi + Vi (2.34)
where $T_{ij}$, $U_{ij}$, and $V_i$ are some yet unknown linear combinations of
generators. The coefficients of these linear combinations must be selected in
such a way that all Lie algebra properties9 are preserved. Let us try to satisfy
these conditions step by step.
First note that the Lie bracket $[K_i, P_j]$ is a 3-tensor. Indeed, using
equation (2.31) we obtain the tensor transformation law (D.15)
$$e^{\vec{J}\vec{\phi}} [K_i, P_j] e^{-\vec{J}\vec{\phi}} = \left[ \sum_{k=1}^{3} R_{ik}(\vec{\phi}) K_k, \sum_{l=1}^{3} R_{jl}(\vec{\phi}) P_l \right] = \sum_{k,l=1}^{3} R_{ik}(\vec{\phi}) R_{jl}(\vec{\phi}) [K_k, P_l]$$
Since both $\vec{K}$ and $\vec{P}$ change their signs upon inversion, this is a true tensor.
Therefore $U_{ij}$ must be a true tensor as well. This tensor should be constructed
as a linear function of generators among which we have a true scalar H, a
pseudovector $\vec{J}$, and two true vectors $\vec{P}$ and $\vec{K}$. According to our discussion
in Appendix D.4, the only way to make a true tensor from these ingredients
is by using formulas in the first and third rows in table D.1. Therefore, the
most general expression for the Lie bracket (2.33) is
$$[K_i, P_j] = -\beta H \delta_{ij} + \gamma \sum_{k=1}^{3} \epsilon_{ijk} J_k$$
where β and γ are yet unspecified real constants.

Similar arguments suggest that $T_{ij}$ is also a true tensor. Due to the
relationship
$$[K_i, K_j] = -[K_j, K_i]$$
this tensor must be antisymmetric with respect to indices i and j. This
excludes the term proportional to $\delta_{ij}$, hence
$$[K_i, K_j] = \alpha \sum_{k=1}^{3} \epsilon_{ijk} J_k,$$
where α is, again, a yet undefined constant.
9
in particular, the Jacobi identity (E.10)

The quantity Vi in equation (2.34) must be a true vector, so, the most
general form of the Lie bracket (2.34) is
[Ki , H] = −(1 + σ)Pi + κKi .
So, we have reduced the task of generalization of Galilei Lie brackets to

finding just five real parameters α, β, γ, κ, and σ. To proceed further, let us
first use the following Jacobi identity
0 = [Px , [Kx , H]] + [Kx , [H, Px ]] + [H, [Px , Kx ]]

= κ[Px , Kx ]
= βκH
which implies
βκ = 0 (2.35)
Similarly,
0 = [Kx , [Ky , Py ]] + [Ky , [Py , Kx ]] + [Py , [Kx , Ky ]]

= −β[Kx , H] − γ[Ky , Jz ] + α[Py , Jz ]
= β(1 + σ)Px − βκKx − γKx + αPx
= (α + β + βσ)Px − (βκ + γ)Kx
= (α + β + βσ)Px − γKx
implies
α = −β(1 + σ) (2.36)
γ = 0 (2.37)
The system of equations (2.35) - (2.36) has two possible solutions (in both
cases σ remains undefined)
(i) If β ≠ 0, then α = −β(1 + σ) and κ = 0.
(ii) If β = 0, then α = 0 and κ is arbitrary.
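The role of the Jacobi identity in fixing these parameters can also be checked numerically. The following is a minimal sketch, not part of the derivation itself: the basis ordering, the function names, and the sample parameter values are our own illustrative assumptions. It builds the structure constants of the generalized brackets for given α, β, γ, κ, σ and reports the largest violation of the Jacobi identity over all triples of basis generators.

```python
from itertools import product

# Illustrative basis ordering (an assumption): H, Px, Py, Pz, Jx, Jy, Jz, Kx, Ky, Kz
H, P, J, K, DIM = 0, 1, 4, 7, 10


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def structure_table(alpha, beta, gamma, kappa, sigma):
    """table[a][b] holds the coordinates of the bracket [t_a, t_b]."""
    table = [[[0.0] * DIM for _ in range(DIM)] for _ in range(DIM)]

    def assign(a, b, coeffs):
        # Store [t_a, t_b] together with [t_b, t_a] = -[t_a, t_b].
        for c, value in coeffs.items():
            table[a][b][c] = value
            table[b][a][c] = -value

    for i in range(3):
        assign(K + i, H, {P + i: -(1.0 + sigma), K + i: kappa})
        assign(K + i, P + i, {H: -beta})
        for j in range(3):
            if i == j:
                continue
            k = 3 - i - j
            e = eps(i, j, k)
            assign(J + i, P + j, {P + k: e})
            assign(J + i, J + j, {J + k: e})
            assign(J + i, K + j, {K + k: e})
            assign(K + i, K + j, {J + k: alpha * e})
            assign(K + i, P + j, {J + k: gamma * e})
    return table


def bracket(table, x, y):
    """Bilinear extension of the basis brackets to arbitrary algebra elements."""
    out = [0.0] * DIM
    for a, b in product(range(DIM), repeat=2):
        if x[a] != 0.0 and y[b] != 0.0:
            for c in range(DIM):
                out[c] += x[a] * y[b] * table[a][b][c]
    return out


def jacobi_violation(table):
    """Largest component of [x,[y,z]] + [y,[z,x]] + [z,[x,y]] over basis triples."""
    basis = [[float(a == b) for b in range(DIM)] for a in range(DIM)]
    worst = 0.0
    for x, y, z in product(basis, repeat=3):
        cyclic = (bracket(table, x, bracket(table, y, z)),
                  bracket(table, y, bracket(table, z, x)),
                  bracket(table, z, bracket(table, x, y)))
        worst = max(worst, max(abs(sum(parts)) for parts in zip(*cyclic)))
    return worst


c = 2.0  # arbitrary sample value of the speed parameter
poincare = structure_table(-1 / c**2, 1 / c**2, 0.0, 0.0, 0.0)
galilei = structure_table(0.0, 0.0, 0.0, 0.0, 0.0)
with_gamma = structure_table(-1 / c**2, 1 / c**2, 0.3, 0.0, 0.0)
with_kappa = structure_table(-1 / c**2, 1 / c**2, 0.0, 0.5, 0.0)
print(jacobi_violation(poincare), jacobi_violation(galilei))
print(jacobi_violation(with_gamma), jacobi_violation(with_kappa))
```

In this sketch the first two violations vanish, while a nonzero γ, or a nonzero κ combined with β ≠ 0, breaks the identity, in line with equations (2.35) and (2.37).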
From the condition (II) we know that parameters α, β, σ, κ must depend on
c and tend to zero as c → ∞
$$\lim_{c\to\infty} \kappa = \lim_{c\to\infty} \sigma = \lim_{c\to\infty} \alpha = \lim_{c\to\infty} \beta = 0 \tag{2.38}$$
Additional insight into the values of these parameters may be obtained by

examining their dimensions. To keep the arguments of exponents in (2.16)
dimensionless we must assume the following dimensions (denoted by angle
brackets) of the generators
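$$\langle H \rangle = \langle \mathrm{time} \rangle^{-1}$$
$$\langle P \rangle = \langle \mathrm{distance} \rangle^{-1}$$
$$\langle K \rangle = \langle \mathrm{speed} \rangle^{-1}$$
$$\langle J \rangle = \langle \mathrm{angle} \rangle^{-1} = \mathrm{dimensionless}$$
It then follows that
$$\langle \alpha \rangle = \frac{\langle K \rangle^2}{\langle J \rangle} = \langle \mathrm{speed} \rangle^{-2}$$
$$\langle \beta \rangle = \frac{\langle K \rangle \langle P \rangle}{\langle H \rangle} = \langle \mathrm{speed} \rangle^{-2}$$
$$\langle \kappa \rangle = \langle H \rangle = \langle \mathrm{time} \rangle^{-1}$$
$$\langle \sigma \rangle = \mathrm{dimensionless}$$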
and we can satisfy condition (2.38) only by setting κ = σ = 0 (i.e., the
choice (i) above) and assuming $\beta = -\alpha \propto c^{-2}$. This approach does not
specify the coefficient of proportionality between β (and −α) and $c^{-2}$. To be
in agreement with experimental data we must choose this coefficient equal
to 1.
$$\beta = -\alpha = \frac{1}{c^2}$$
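The dimensional bookkeeping behind this conclusion can be sketched in a few lines. The encoding of a dimension as a pair (power of length, power of time) and the helper names below are our own illustrative assumptions.

```python
# A dimension is stored as (power of length, power of time); this encoding is an assumption.
def mul(a, b):
    return (a[0] + b[0], a[1] + b[1])


def div(a, b):
    return (a[0] - b[0], a[1] - b[1])


DIMENSIONLESS = (0, 0)
H_DIM = (0, -1)   # <H> = time^-1
P_DIM = (-1, 0)   # <P> = distance^-1
K_DIM = (-1, 1)   # <K> = speed^-1
J_DIM = DIMENSIONLESS

alpha_dim = div(mul(K_DIM, K_DIM), J_DIM)   # from [K, K] = alpha * J
beta_dim = div(mul(K_DIM, P_DIM), H_DIM)    # from [K, P] = -beta * H
kappa_dim = H_DIM                           # from the term kappa * K in [K, H]
sigma_dim = DIMENSIONLESS                   # from the term sigma * P in [K, H]


def power_of_speed(dim):
    """Return n if dim equals speed^n, otherwise None."""
    n = dim[0]
    return n if dim == (n, -n) else None


for name, dim in [("alpha", alpha_dim), ("beta", beta_dim),
                  ("kappa", kappa_dim), ("sigma", sigma_dim)]:
    print(name, power_of_speed(dim))
```

Only α and β come out as negative powers of speed, so only they can be built from c alone and vanish as c → ∞. The parameter κ is not a power of speed at all, and σ corresponds to the zeroth power, which cannot tend to zero unless it is zero.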
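Then the resulting Lie brackets are
$$[J_i, P_j] = \sum_{k=1}^{3} \epsilon_{ijk} P_k \tag{2.39}$$
$$[J_i, J_j] = \sum_{k=1}^{3} \epsilon_{ijk} J_k \tag{2.40}$$
$$[J_i, K_j] = \sum_{k=1}^{3} \epsilon_{ijk} K_k \tag{2.41}$$
$$[J_i, H] = 0 \tag{2.42}$$
$$[P_i, P_j] = [P_i, H] = 0 \tag{2.43}$$
$$[K_i, K_j] = -\frac{1}{c^2} \sum_{k=1}^{3} \epsilon_{ijk} J_k \tag{2.44}$$
$$[K_i, P_j] = -\frac{1}{c^2} H \delta_{ij} \tag{2.45}$$
$$[K_i, H] = -P_i \tag{2.46}$$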
This set of Lie brackets is called the Poincaré Lie algebra and it differs from
the Galilei algebra (2.17) - (2.24) only by small terms on the right hand sides
of Lie brackets (2.44) and (2.45). The general element of the corresponding
Poincaré group has the form10
$$e^{\vec{J}\vec{\phi}} e^{\vec{K} c \vec{\theta}} e^{\vec{P}\vec{x}} e^{Ht} \tag{2.47}$$
10
Note that here we adhere to the conventional order of basic transformations adopted
in (2.6); from right to left: time translation → space translation → boost → rotation.
In equation (2.47) we denoted the parameter of boost by $c\vec{\theta}$, where $\theta = |\vec{\theta}|$
is a dimensionless quantity called rapidity. Its relationship to the velocity of
boost $\vec{v}$ is
$$\vec{v}(\vec{\theta}) = \frac{\vec{\theta}}{\theta} c \tanh\theta$$
$$\cosh\theta = (1 - v^2/c^2)^{-1/2}$$
The reason for introducing this new quantity is that rapidities of successive
boosts in the same direction are additive, while velocities are not.11
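The additivity of rapidities can be illustrated with a short numerical sketch. The function names and the sample speeds 0.6c and 0.7c are our own choices; the composition is carried out purely by adding rapidities and converting back with v = c tanh θ.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def rapidity(v):
    """Rapidity of a boost with speed v, from v = c tanh(theta)."""
    return math.atanh(v / C)


def velocity(theta):
    """Speed of a boost with rapidity theta."""
    return C * math.tanh(theta)


def compose_collinear(v1, v2):
    """Speed of two successive boosts along one axis, obtained by adding rapidities."""
    return velocity(rapidity(v1) + rapidity(v2))


v1, v2 = 0.6 * C, 0.7 * C
v12 = compose_collinear(v1, v2)
print(v12 / C)                      # stays below 1, unlike (v1 + v2) / C = 1.3
print(math.cosh(rapidity(v1)))      # equals (1 - v1^2/c^2)^(-1/2)
```

The composed speed coincides with (v1 + v2)/(1 + v1 v2/c²), which is simply the addition theorem for the hyperbolic tangent.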
In spite of their simplicity, equations (2.39) - (2.46) are among the most
important equations in physics, and they have such an abundance of exper-
imental confirmations that one cannot doubt their validity. We therefore
accept that the Poincaré group is the true mathematical expression of rela-
tionships between different inertial laboratories.
Postulate 2.2 (the Poincaré group) Transformations between inertial lab-

oratories form the Poincaré group.
Even a brief comparison of the Poincaré (2.39) - (2.46) and Galilei (2.17)
- (2.24) Lie brackets reveals a number of important new features in the
relativistic theory. For example, due to the Lie bracket (2.44), boosts no
longer form a subgroup. However, boosts together with rotations do form
a 6-dimensional subgroup of the Poincaré group which is called the Lorentz
group.
2.3.2 Transformations of translation generators under boosts
Poincaré Lie brackets allow us to derive transformation properties of
generators $\vec{P}$ and H with respect to boosts. Using equation (2.25) and Lie brackets
(2.45) - (2.46) we find that if $P_x$ and H are generators in the reference frame
at rest O, then their counterparts $P_x(\theta)$ and $H(\theta)$ in the reference frame O′
moving along the x-axis are
$$H(\theta) = e^{K_x c\theta} H e^{-K_x c\theta}$$
$$P_x(\theta) = e^{K_x c\theta} P_x e^{-K_x c\theta}$$
11
see equation (4.6)
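Taking derivatives of these equations with respect to the parameter θ
$$\frac{\partial}{\partial\theta} H(\theta) = c e^{K_x c\theta} (K_x H - H K_x) e^{-K_x c\theta} = -c e^{K_x c\theta} P_x e^{-K_x c\theta} = -c P_x(\theta)$$
$$\frac{\partial}{\partial\theta} P_x(\theta) = c e^{K_x c\theta} (K_x P_x - P_x K_x) e^{-K_x c\theta} = -\frac{1}{c} e^{K_x c\theta} H e^{-K_x c\theta} = -\frac{1}{c} H(\theta) \tag{2.48}$$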
and taking a derivative of equation (2.48) again, we obtain a differential
equation
$$\frac{\partial^2}{\partial\theta^2} P_x(\theta) = -\frac{1}{c} \frac{\partial}{\partial\theta} H(\theta) = P_x(\theta)$$
with the general solution
Px (θ) = A cosh θ + B sinh θ
From the initial conditions we obtain
$$A = P_x(0) = P_x$$
$$B = \left. \frac{\partial}{\partial\theta} P_x(\theta) \right|_{\theta=0} = -\frac{1}{c} H$$
and finally
$$P_x(\theta) = P_x \cosh\theta - \frac{H}{c} \sinh\theta$$
Similar calculation shows that
H(θ) = H cosh θ − cPx sinh θ (2.49)

Py (θ) = Py
Pz (θ) = Pz
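As a cross-check of this derivation, one can integrate the pair of first-order equations for H(θ) and Px(θ) numerically and compare with the closed-form solution. The sketch below treats H and Px as ordinary numbers, which is an assumption made only for illustration. The function names, the Runge-Kutta integrator, and the sample values are likewise ours.

```python
import math


def boost_flow(h, px, theta, c=1.0, steps=1000):
    """Integrate dH/dtheta = -c*Px, dPx/dtheta = -H/c with the classical RK4 scheme."""
    def rhs(h_, p_):
        return -c * p_, -h_ / c

    d = theta / steps
    for _ in range(steps):
        k1 = rhs(h, px)
        k2 = rhs(h + 0.5 * d * k1[0], px + 0.5 * d * k1[1])
        k3 = rhs(h + 0.5 * d * k2[0], px + 0.5 * d * k2[1])
        k4 = rhs(h + d * k3[0], px + d * k3[1])
        h += d * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        px += d * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return h, px


def boost_closed_form(h, px, theta, c=1.0):
    """Closed-form solution: equation (2.49) and the formula for Px(theta)."""
    return (h * math.cosh(theta) - c * px * math.sinh(theta),
            px * math.cosh(theta) - h / c * math.sinh(theta))


h0, p0, theta0, c0 = 5.0, 3.0, 0.8, 2.0     # arbitrary sample values
numeric = boost_flow(h0, p0, theta0, c0)
exact = boost_closed_form(h0, p0, theta0, c0)
print(numeric, exact)
print(exact[0] ** 2 - c0 ** 2 * exact[1] ** 2, h0 ** 2 - c0 ** 2 * p0 ** 2)
```

The last line shows that the combination H² − c²Px² is left unchanged by the boost in this numerical example.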
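Similar to our discussion of rotations in subsection D.5, we can find the
transformation of $\vec{P}$ and H corresponding to a general boost vector $\vec{\theta}$ in
the coordinate-independent form. First we decompose $\vec{P}$ into sum of two
vectors $\vec{P} = \vec{P}_{\parallel} + \vec{P}_{\perp}$. The vector $\vec{P}_{\parallel} = \left(\vec{P} \cdot \frac{\vec{\theta}}{\theta}\right) \frac{\vec{\theta}}{\theta}$ is parallel to the direction
of the boost, while vector $\vec{P}_{\perp} = \vec{P} - \vec{P}_{\parallel}$ is perpendicular to that direction.
The perpendicular part $\vec{P}_{\perp}$ remains unchanged under the boost, while $\vec{P}_{\parallel}$
transforms according to $\exp(\vec{K} c \vec{\theta}) \vec{P}_{\parallel} \exp(-\vec{K} c \vec{\theta}) = \vec{P}_{\parallel} \cosh\theta - c^{-1} H \sinh\theta \, \frac{\vec{\theta}}{\theta}$.
Therefore
$$\vec{P}' = e^{\vec{K} c \vec{\theta}} \vec{P} e^{-\vec{K} c \vec{\theta}} = \vec{P} + \frac{\vec{\theta}}{\theta} \left[ \left( \vec{P} \cdot \frac{\vec{\theta}}{\theta} \right) (\cosh\theta - 1) - \frac{1}{c} H \sinh\theta \right] \tag{2.50}$$
$$H' = e^{\vec{K} c \vec{\theta}} H e^{-\vec{K} c \vec{\theta}} = H \cosh\theta - c \left( \vec{P} \cdot \frac{\vec{\theta}}{\theta} \right) \sinh\theta \tag{2.51}$$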
It is clear from (2.50) and (2.51) that boosts perform linear transformations
of components $c\vec{P}$ and H. These transformations can be represented in a
matrix form if four generators $(H, c\vec{P})$ are arranged in a column 4-vector
$$\begin{bmatrix} H' \\ cP'_x \\ cP'_y \\ cP'_z \end{bmatrix} = B(\vec{\theta}) \begin{bmatrix} H \\ cP_x \\ cP_y \\ cP_z \end{bmatrix}.$$
Explicit form of the matrix $B(\vec{\theta})$ can be found in equation (I.8).
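Formulas (2.50) and (2.51) translate directly into a few lines of code. The sketch below applies them to numerical stand-ins for H and the components of P. The function name `boost` and the sample values are our own assumptions, and the matrix B itself is not constructed here.

```python
import math


def boost(h, p, theta_vec, c=1.0):
    """Apply equations (2.50)-(2.51) to numerical stand-ins for H and P."""
    theta = math.sqrt(sum(t * t for t in theta_vec))
    if theta == 0.0:
        return h, tuple(p)
    n = [t / theta for t in theta_vec]                      # unit vector along the boost
    p_par = sum(pi * ni for pi, ni in zip(p, n))            # projection of P on that direction
    shift = p_par * (math.cosh(theta) - 1.0) - h * math.sinh(theta) / c
    p_new = tuple(pi + ni * shift for pi, ni in zip(p, n))  # equation (2.50)
    h_new = h * math.cosh(theta) - c * p_par * math.sinh(theta)  # equation (2.51)
    return h_new, p_new


def invariant(h, p, c):
    """The combination H^2 - c^2 P^2."""
    return h * h - c * c * sum(pi * pi for pi in p)


h0, p0, c0 = 5.0, (3.0, 1.0, -2.0), 2.0      # arbitrary sample values
theta = (0.3, -0.4, 1.2)
h1, p1 = boost(h0, p0, theta, c0)
h2, p2 = boost(h1, p1, tuple(-t for t in theta), c0)
print(invariant(h0, p0, c0), invariant(h1, p1, c0))
print(h2, p2)   # the opposite boost restores the original values
```

For a boost along the x-axis the function reduces to equation (2.49), and in every direction it leaves H² − c²P² unchanged, as expected for a linear transformation of the 4-vector (H, cP).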

Chapter 3
QUANTUM MECHANICS
AND RELATIVITY
I am ashamed to tell you to how many figures I carried these

computations, having no other business at the time.
Isaac Newton
The two preceding chapters discussed the ideas of quantum mechanics and
relativity separately. Now is the time to unify them in one theory. The major
contribution to such a unification was made by Wigner, who formulated and
proved the famous Wigner’s theorem and developed the theory of unitary
representations of the Poincaré group in Hilbert spaces. This theory is the
mathematical foundation of the entire relativistic quantum approach
presented in this book. Its discussion will occupy the present chapter as well as
the two following chapters 4 and 5.
3.1 Inertial transformations in quantum mechanics
The relativity Postulate 2.1 tells us that any inertial laboratory L is physi-
cally equivalent to any other laboratory L′ = gL obtained from L by applying
an inertial transformation g. This means that for identically arranged ex-
periments in these two laboratories the corresponding probability measures
(φ|X) are the same. As shown in Fig. 1, laboratories are composed of two
major parts: the preparation device P and the observer O. The inertial
transformation g of the laboratory results in changes of both of these parts. The
change of the preparation device can be interpreted as a change of the state
of the system. We can formally denote this change by φ → gφ. The change
of the observer (or measuring apparatus) can be viewed as a change of the
experimental proposition X → gX. Then, the mathematical expression of
the relativity principle is that for any g, φ, and X
(gφ|gX) = (φ|X) (3.1)
In the rest of this chapter (and in chapters 4 – 6) we will develop a mathe-

matical formalism for representing transformations gφ and gX in the Hilbert
space. This is the formalism of unitary representations of the Poincaré group,
which is a cornerstone of any relativistic approach in quantum physics.
3.1.1 Wigner’s theorem

Let us first focus on inertial transformations of propositions X → gX.1 The
experimental propositions attributed to the observer O form a propositional
lattice L(H) which is realized as a set of closed subspaces in the Hilbert
space H. Observer O ′ = gO also represents her propositions as subspaces in
the same Hilbert space H. As these two observers are equivalent, we may
expect that their propositional systems have exactly the same mathematical
structures, i.e., they are isomorphic. This means that there exists a one-to-
one mapping
Kg : L(H) → L(H)
that connects propositions of the observer O with propositions of the observer

O ′, such that all lattice relations between propositions remain unchanged. In
particular, we will require that Kg transforms atoms to atoms; Kg maps mini-
mal and maximal propositions of O to the minimal and maximal propositions
of O ′, respectively
1
We will turn to transformations of states φ → gφ in the next subsection.
Kg (I) = I (3.2)
Kg (∅) = ∅ (3.3)
and for any X, Y ∈ L(H)
Kg (X ∨ Y ) = Kg (X) ∨ Kg (Y ) (3.4)
Kg (X ∧ Y ) = Kg (X) ∧ Kg (Y ) (3.5)
Kg (X ⊥ ) = Kg (X)⊥ (3.6)
As discussed in subsection 1.5.2, working with propositions is rather in-

convenient. It would be better to translate conditions (3.2) - (3.6) into the
language of vectors in the Hilbert space. In other words, we would like to find
a vector-to-vector transformation kg : H → H which generates the subspace-
to-subspace transformation Kg . More precisely, we demand that for each
subspace X, if Kg (X) = Y , then the generator kg maps all vectors in X into
vectors in Y , so that Sp(kg (x)) = Y , where x runs through all vectors in X.
The problem with finding generators $k_g$ is that there are just too many
of them. For example, if a ray p goes to the ray $K_g(p)$, then the generator $k_g$
must map each vector $|x\rangle \in p$ somewhere inside $K_g(p)$, but the exact value
of $k_g|x\rangle$ remains undetermined. Actually, we can multiply each image vector
$k_g|x\rangle$ by an arbitrary nonzero factor $\eta(|x\rangle)$ and still have a valid generator.
Factors $\eta(|x\rangle)$ can be chosen independently for each $|x\rangle \in H$. This freedom
is very inconvenient from the mathematical point of view.
This problem was solved by the celebrated Wigner’s theorem [Wig31],
which states that we can always select factors $\eta(|x\rangle)$ in such a way that the
vector-to-vector mapping $\eta(|x\rangle)k_g$ becomes either unitary (linear) or
antiunitary (antilinear).2
Theorem 3.1 (Wigner) For any isomorphic mapping Kg of a proposi-

tional lattice L(H) onto itself, one can find either unitary or antiunitary
transformation kg of vectors in the Hilbert space H, which generates Kg .
This transformation is defined uniquely up to a unimodular factor. For a
given Kg only one of these two possibilities (unitary or antiunitary) is real-
ized.
2
See Appendix F.7 for definitions of antilinear and antiunitary operators.
In this formulation, Wigner’s theorem has been proven in ref. [Uhl63] (see
also [AD78a]). The significance of this theorem comes from the fact that
there is a powerful mathematical apparatus for working with unitary and
antiunitary transformations, so that their properties (and, thus, properties
of subspace transformations Kg ) can be studied in great detail by familiar
techniques of linear algebra in Hilbert spaces.
From our study of inertial transformations in chapter 2, we know that
there is always a continuous path from the identity transformation
$e = \{\vec{0}, \vec{0}, \vec{0}, 0\}$ to any other element $g = \{\vec{\phi}, \vec{v}, \vec{r}, t\}$ in the Poincaré group. It
is convenient to represent the identity transformation e by the identity
operator which is, of course, unitary. It also seems reasonable to demand that the
mappings g → Kg and g → kg are continuous, so, the representative kg can-
not suddenly switch from unitary to antiunitary along the path connecting
e with g. Then we can reject antiunitary transformations as representatives
of Kg .3
Although Wigner’s theorem reduces the freedom of choosing generators,
it does not eliminate this freedom completely: Two unitary transformations
kg and βkg (where β is any unimodular constant) generate the same subspace
mapping. Therefore, for each Kg there is a set of generating unitary trans-
formations Ug differing from each other by a multiplicative constant. Such a
set is called a ray of transformations [Ug ].
Results of this subsection can be summarized as follows: each inertial
transformation g of the observer can be represented by a unitary operator
$U_g$ in H defined up to an arbitrary unimodular factor: ket vectors are
transformed according to $|x\rangle \to U_g|x\rangle$, and bra vectors are transformed as
$\langle x| \to \langle x|U_g^{-1}$. If $X = \sum_i |e_i\rangle\langle e_i|$ 4 is a projection (proposition) associated
with the observer O, then observer O′ = gO represents the same proposition
by the projection
$$X' = \sum_i U_g |e_i\rangle\langle e_i| U_g^{-1} = U_g X U_g^{-1}$$
Similarly, if $F = \sum_i f_i |e_i\rangle\langle e_i|$ is an operator of observable associated with
the observer O then
$$F' = \sum_i f_i U_g |e_i\rangle\langle e_i| U_g^{-1} = U_g F U_g^{-1} \tag{3.7}$$
is the operator of the same observable from the point of view of the observer
O′ = gO.
3
The antiunitary operators may still represent discrete transformations, e.g., time
inversion, but we agreed not to discuss such transformations in this book, because they
do not correspond to exact symmetries.
4
Here $|e_i\rangle$ is an orthonormal basis in the subspace X.
3.1.2 Inertial transformations of states

In the preceding subsection we analyzed the effect of an inertial transforma-
tion g on observers, measuring apparatuses, propositions, and observables.
Now we are going to examine the effect of g on preparation devices and states.
We will try to answer the following question: if $|\Psi\rangle$ is a vector describing a
pure state prepared by the preparation device P, then which state vector $|\Psi'\rangle$
describes the state prepared by the transformed preparation device P′ = gP?
To find the connection between $|\Psi\rangle$ and $|\Psi'\rangle$ we will use the relativity
principle. According to equation (3.1), for every observable F , its expectation
value (1.42) should not change after inertial transformation of the entire
laboratory (= both the preparation device and the observer). In the bra-ket
notation, this condition can be written as
$$\langle\Psi|F|\Psi\rangle = \langle\Psi'|F'|\Psi'\rangle = \langle\Psi'|U_g F U_g^{-1}|\Psi'\rangle \tag{3.8}$$
This equation should be valid for any choice of observable F. Let us choose
$F = |\Psi\rangle\langle\Psi|$, i.e., the projection onto the ray containing vector $|\Psi\rangle$. Then
equation (3.8) takes the form
$$\langle\Psi|\Psi\rangle\langle\Psi|\Psi\rangle = \langle\Psi'|U_g|\Psi\rangle\langle\Psi|U_g^{-1}|\Psi'\rangle = \langle\Psi'|U_g|\Psi\rangle\langle\Psi'|U_g|\Psi\rangle^* = |\langle\Psi'|U_g|\Psi\rangle|^2$$
The left hand side of this equation is equal to 1. So, for each $|\Psi\rangle$, the
transformed vector $|\Psi'\rangle$ is such that
$$|\langle\Psi'|U_g|\Psi\rangle|^2 = 1$$
Since both $U_g|\Psi\rangle$ and $|\Psi'\rangle$ are unit vectors, we must have
$$|\Psi'\rangle = \sigma(g) U_g |\Psi\rangle$$
where σ(g) is a unimodular factor. Operator $U_g$ is defined up to a
unimodular factor,5 therefore, we can absorb the factor σ(g) into the uncertainty of
$U_g$ and finally write the action of the inertial transformation g on states
$$|\Psi\rangle \to |\Psi'\rangle = U_g |\Psi\rangle \tag{3.9}$$
Then, taking into account the transformation law for observables (3.7) we can
check that, in agreement with the relativity principle (3.8), the expectation
values remain the same in all laboratories
$$\langle F' \rangle = \langle\Psi'|F'|\Psi'\rangle = (\langle\Psi|U_g^{-1})(U_g F U_g^{-1})(U_g|\Psi\rangle) = \langle\Psi|F|\Psi\rangle = \langle F \rangle \tag{3.10}$$
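The invariance (3.10), together with the irrelevance of the unimodular factor in $U_g$, can be illustrated on a two-dimensional Hilbert space. This is only a sketch: the particular unitary matrix, the observable F, the state vector, and all helper names are our own assumptions chosen for the example.

```python
import cmath
import math


def apply(op, vec):
    """Action of a 2x2 matrix on a 2-component vector."""
    return [op[r][0] * vec[0] + op[r][1] * vec[1] for r in range(2)]


def matmul(a, b):
    return [[sum(a[r][m] * b[m][s] for m in range(2)) for s in range(2)] for r in range(2)]


def dagger(a):
    """Hermitian conjugate; for a unitary matrix this is its inverse."""
    return [[complex(a[s][r]).conjugate() for s in range(2)] for r in range(2)]


def expectation(op, vec):
    image = apply(op, vec)
    return sum(complex(v).conjugate() * w for v, w in zip(vec, image))


def unitary(angle, phase=0.0):
    """Hypothetical stand-in for U_g: a real rotation times a unimodular factor."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    z = cmath.exp(1j * phase)
    return [[z * c, -z * s], [z * s, z * c]]


F = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -3.0]]   # a Hermitian observable
psi = [0.6, 0.8j]                              # a unit state vector


def transformed_expectation(u):
    f_prime = matmul(matmul(u, F), dagger(u))  # F' = U F U^{-1}, equation (3.7)
    psi_prime = apply(u, psi)                  # |Psi'> = U |Psi>, equation (3.9)
    return expectation(f_prime, psi_prime)


original = expectation(F, psi)
same_ray = [transformed_expectation(unitary(1.1, phase)) for phase in (0.0, 0.7, 2.0)]
print(original, same_ray)
```

All three representatives of the same ray give the same expectation value, equal to the one obtained in the original laboratory.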
3.1.3 Heisenberg and Schrödinger pictures

The conservation of expectation values (3.10) is valid only in the case when
inertial transformation g is applied to the laboratory as a whole. What would
happen if only the observer or only the preparation device is transformed?
Let us first consider inertial transformations of observers. If we change
the observer without changing the preparation device (=state) then operators
of observables change according to (3.7) while the state vector remains the
same |Ψi. As expected, this transformation changes results of experiments.
For example, the expectation values of observable F are generally different
for different observers O and O ′ = gO
$$\langle F' \rangle = \langle\Psi|(U_g F U_g^{-1})|\Psi\rangle \neq \langle\Psi|F|\Psi\rangle = \langle F \rangle \tag{3.11}$$
On the other hand, if the inertial transformation is applied to the preparation
device and the state of the system changes according to equation (3.9), then
the results of measurements are also affected
$$\langle F'' \rangle = (\langle\Psi|U_g^{-1}) F (U_g|\Psi\rangle) \neq \langle\Psi|F|\Psi\rangle = \langle F \rangle \tag{3.12}$$
5
see subsection 3.1.1

Formulas (3.11) and (3.12) play a prominent role because many problems
in physics can be formulated as questions about descriptions of systems af-
fected by inertial transformations. An important example is dynamics, i.e.,
the time evolution of the system. In this case one considers time translation
elements of the Poincaré group $g = \{\vec{0}; \vec{0}; \vec{0}; t\}$. Then equations (3.11)
and (3.12) provide two equivalent descriptions of dynamics. Equation (3.11)
describes dynamics in the Heisenberg picture. In this picture the state vec-
tor of the system remains fixed while operators of observables change with
time. Equation (3.12) provides an alternative description of dynamics in the
Schrödinger picture. In this description, operators of observables are time-
independent, while the state vector of the system depends on time. These
two pictures are equivalent because according to (3.1) a shift of the observer
by g (forward time translation) is equivalent to the shift of the preparation
device by g −1 (backward time translation).
The notions of Schrödinger and Heisenberg pictures can be applied not
only to time translations. They can be generalized to other types of iner-
tial transformations; i.e., the transformation g above can stand for space
translations, rotations, boosts, or any combination of them.
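The relationship between the two pictures can be made concrete with a small numerical sketch. Here a 2 × 2 rotation matrix plays the role of $U_g$ and the Pauli matrix σx plays the role of the observable F. These choices, as well as the helper names, are illustrative assumptions.

```python
import math


def apply(op, vec):
    return [op[r][0] * vec[0] + op[r][1] * vec[1] for r in range(2)]


def matmul(a, b):
    return [[sum(a[r][m] * b[m][s] for m in range(2)) for s in range(2)] for r in range(2)]


def dagger(a):
    return [[complex(a[s][r]).conjugate() for s in range(2)] for r in range(2)]


def expectation(op, vec):
    image = apply(op, vec)
    return sum(complex(v).conjugate() * w for v, w in zip(vec, image))


ANGLE = 1.0
U = [[math.cos(ANGLE / 2), -math.sin(ANGLE / 2)],
     [math.sin(ANGLE / 2), math.cos(ANGLE / 2)]]   # hypothetical stand-in for U_g
F = [[0.0, 1.0], [1.0, 0.0]]                        # observable (Pauli matrix sigma_x)
psi = [1.0, 0.0]                                    # state vector


def heisenberg(u):
    """Transform the observer only, equation (3.11): <Psi| U F U^{-1} |Psi>."""
    return expectation(matmul(matmul(u, F), dagger(u)), psi)


def schroedinger(u):
    """Transform the preparation device only, equation (3.12): <Psi| U^{-1} F U |Psi>."""
    return expectation(F, apply(u, psi))


print(expectation(F, psi))       # untransformed laboratory
print(heisenberg(U))             # observer shifted by g
print(schroedinger(U))           # preparation device shifted by g
print(schroedinger(dagger(U)))   # preparation device shifted by the inverse of g
```

In this example shifting the observer by g and shifting the preparation device by g give different results, while shifting the preparation device by the inverse transformation reproduces the Heisenberg-picture value.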
3.2 Unitary representations of the Poincaré group
In the preceding section we discussed the representation of a single inertial
transformation g by an isomorphism Kg of the lattice of propositions and by
a ray of unitary operators [Ug ], which act on states and/or observables in the
Hilbert space. We know from chapter 2 that inertial transformations form the
Poincaré group. Then subspace mappings Kg1 , Kg2 , Kg3 , . . . corresponding to
different group elements g1 , g2, g3 , . . . cannot be arbitrary. They must satisfy
conditions
$$K_{g_2} K_{g_1} = K_{g_2 g_1} \tag{3.13}$$
$$K_{g^{-1}} = K_g^{-1} \tag{3.14}$$
$$K_{g_3}(K_{g_2} K_{g_1}) = K_{g_3} K_{g_2 g_1} = K_{g_3 (g_2 g_1)} = K_{(g_3 g_2) g_1} = (K_{g_3} K_{g_2}) K_{g_1} \tag{3.15}$$
which reflect group properties of inertial transformations g. Our goal in this

section is to find out which conditions are imposed by (3.13) - (3.15) on the
set of unitary representatives Ug of the Poincaré group.
3.2.1 Projective representations of groups

For each group element g let us choose an arbitrary unitary representative
Ug in the ray [Ug ]. For example, let us choose the representatives (also called
generators) Ug1 ∈ [Ug1 ], Ug2 ∈ [Ug2 ], and Ug2 g1 ∈ [Ug2 g1 ]. The product Ug2 Ug1
should generate the mapping Kg2 g1 , therefore it can differ from our chosen
representative Ug2 g1 by at most a unimodular constant α(g2 , g1). So, we can
write for any two transformations g1 and g2
Ug2 Ug1 = α(g2 , g1 )Ug2 g1 (3.16)

The factors α have three properties. First, they are unimodular.
|α(g2, g1 )| = 1 (3.17)
Second, from the property (A.2) of the unit element we have for any g
Ug Ue = α(g, e)Ug = Ug (3.18)

Ue Ug = α(e, g)Ug = Ug (3.19)
which implies
α(g, e) = α(e, g) = 1 (3.20)

Third, the associative law (3.15) implies
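$$U_{g_3}(\alpha(g_2, g_1) U_{g_2 g_1}) = (\alpha(g_3, g_2) U_{g_3 g_2}) U_{g_1}$$
$$\alpha(g_2, g_1)\alpha(g_3, g_2 g_1) U_{g_3 g_2 g_1} = \alpha(g_3 g_2, g_1)\alpha(g_3, g_2) U_{g_3 g_2 g_1}$$
$$\alpha(g_2, g_1)\alpha(g_3, g_2 g_1) = \alpha(g_3 g_2, g_1)\alpha(g_3, g_2) \tag{3.21}$$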
The mapping Ug from group elements to unitary operators in H is called a
projective representation of the group if it satisfies equations (3.16), (3.17),
(3.20), and (3.21).
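A simple illustration of conditions (3.16), (3.17), (3.20), and (3.21) is provided by rotations about a fixed axis acting on a spin-1/2 system. This example is our own and is not taken from the preceding text. Group elements are labeled by angles in [0, 2π), the chosen representative is the diagonal matrix diag(e^{−iφ/2}, e^{iφ/2}), and the sample angles are arbitrary.

```python
import cmath
import math
from itertools import product

TWO_PI = 2 * math.pi


def compose(g2, g1):
    """Group law for rotations about a fixed axis: angles add modulo 2*pi."""
    return (g2 + g1) % TWO_PI


def representative(g):
    """Diagonal entries of the chosen unitary representative U_g."""
    return (cmath.exp(-0.5j * g), cmath.exp(0.5j * g))


def alpha(g2, g1):
    """Factor in U_{g2} U_{g1} = alpha(g2, g1) U_{g2 g1}, read off the first entry."""
    u2, u1, u21 = representative(g2), representative(g1), representative(compose(g2, g1))
    return u2[0] * u1[0] / u21[0]


angles = [0.0, 1.0, 2.5, 4.0, 5.5]   # sample group elements; 0.0 is the unit element e
for g3, g2, g1 in product(angles, repeat=3):
    left = alpha(g2, g1) * alpha(g3, compose(g2, g1))
    right = alpha(compose(g3, g2), g1) * alpha(g3, g2)
    assert abs(left - right) < 1e-12            # condition (3.21)
    assert abs(abs(alpha(g2, g1)) - 1) < 1e-12  # condition (3.17)
    assert abs(alpha(g1, 0.0) - 1) < 1e-12      # condition (3.20)

print(alpha(1.0, 2.5), alpha(4.0, 4.0))
```

With this choice of representatives the factors α take only the values +1 and −1. The value −1 appears whenever the sum of two angles exceeds 2π.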
3.2.2 Elimination of central charges in the Poincaré algebra
In principle, we could keep the arbitrarily chosen unitary representatives of
the subspace transformations Ug1 , Ug2 , . . ., as discussed above, and work with
the projective representation of the Poincaré group thus obtained, but this would
result in a rather complicated mathematical formalism. The theory would
be significantly simpler if we could judiciously choose the representatives6 in
such a way that the factors α(g2 , g1 ) in (3.16) are simplified or eliminated
altogether. Then we would have a much simpler linear unitary group repre-
sentation (see Appendix H) instead of the projective group representation.
In this subsection we are going to demonstrate that in any projective repre-
sentation of the Poincaré group such elimination of factors α(g2 , g1 ) is indeed
possible [CJS63].
The proof of the last statement is significantly simplified if conditions
(3.17), (3.20), and (3.21) are expressed in a Lie algebra notation. In the
vicinity of the unit element of the group we can use vectors $\vec{\zeta}$ from the
Poincaré Lie algebra to identify other group elements (see equation (E.1)),
i.e.
$$g = e^{\vec{\zeta}} = \exp\left( \sum_{a=1}^{10} \zeta^a \vec{t}_a \right)$$
where $t_a$ is the basis of the Poincaré Lie algebra $(H, \vec{P}, \vec{K}, \vec{J})$ from subsection
2.3.1. Then we can write unitary representatives $U_g$ of inertial
transformations g in the form7
$$U_{\vec{\zeta}} = \exp\left( -\frac{i}{\hbar} \sum_{a=1}^{10} \zeta^a F_a \right) \tag{3.22}$$
where ħ is a real constant which will be left unspecified at this point,8 and
$F_a$ are ten Hermitian operators in the Hilbert space H called the generators
of the unitary projective representation. Then we can write equation (3.16)
in the form
$$U_{\vec{\zeta}} U_{\vec{\xi}} = \alpha(\vec{\zeta}, \vec{\xi}) U_{\vec{\zeta}\vec{\xi}} \tag{3.23}$$
Since α is unimodular we can set $\alpha(\vec{\zeta}, \vec{\xi}) = \exp[i\kappa(\vec{\zeta}, \vec{\xi})]$, where $\kappa(\vec{\zeta}, \vec{\xi})$ is a
real function. Conditions (3.20) and (3.21) then will be rewritten in terms
of κ
$$\kappa(\vec{\zeta}, \vec{0}) = \kappa(\vec{0}, \vec{\zeta}) = 0 \tag{3.24}$$
$$\kappa(\vec{\xi}, \vec{\zeta}) + \kappa(\vec{\chi}, \vec{\xi}\vec{\zeta}) = \kappa(\vec{\chi}\vec{\xi}, \vec{\zeta}) + \kappa(\vec{\chi}, \vec{\xi}) \tag{3.25}$$
6
i.e., multiply our previously chosen unitary operators $U_g$ by some unimodular factors
$U_g \to \beta(g) U_g$
7
Here we have used Stone’s theorem H.2. The nature of one-parameter subgroups
featuring in the theorem is rather obvious. These are subgroups of similar transformations,
i.e., a subgroup of space translations, a subgroup of rotations about the z-axis, etc.
8
We will identify ħ with the Planck constant in subsection 4.1.1
Note that we can write the lowest order term in the Taylor series for κ near
the group identity element in the form9
$$\kappa(\vec{\zeta}, \vec{\xi}) = \sum_{a,b=1}^{10} h_{ab} \zeta^a \xi^b \tag{3.26}$$
The constant term, the terms linear in $\zeta^a$ and $\xi^b$, as well as the terms
proportional to $\zeta^a \zeta^b$ and $\xi^a \xi^b$ are absent on the right hand side of (3.26) as a
consequence of the condition (3.24).
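Using the same arguments as during our derivation of equation (E.6), we
can expand all terms in (3.23) around $\vec{\zeta} = \vec{\xi} = \vec{0}$
$$\left( 1 - \frac{i}{\hbar} \sum_{a=1}^{10} \xi^a F_a - \frac{1}{2\hbar^2} \sum_{b,c=1}^{10} \xi^b \xi^c F_{bc} + \ldots \right) \left( 1 - \frac{i}{\hbar} \sum_{a=1}^{10} \zeta^a F_a - \frac{1}{2\hbar^2} \sum_{b,c=1}^{10} \zeta^b \zeta^c F_{bc} + \ldots \right)$$
$$= \left( 1 + i \sum_{a,b=1}^{10} h_{ab} \zeta^a \xi^b + \ldots \right) \left[ 1 - \frac{i}{\hbar} \sum_{a=1}^{10} \left( \zeta^a + \xi^a + \sum_{b,c=1}^{10} f^a_{bc} \xi^b \zeta^c + \ldots \right) F_a - \frac{1}{2\hbar^2} \sum_{a,b=1}^{10} (\zeta^a + \xi^a + \ldots)(\zeta^b + \xi^b + \ldots) F_{ab} + \ldots \right]$$
Equating the coefficients multiplying products $\xi^a \zeta^b$ on both sides, we obtain
$$-\frac{1}{2\hbar^2} (F_{ab} + F_{ba}) = -\frac{1}{\hbar^2} F_a F_b - i h_{ba} + \frac{i}{\hbar} \sum_{c=1}^{10} f^c_{ab} F_c$$
9
see also the third term on the right hand side of equation (E.4)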
The left hand side of this equation is symmetric with respect to interchange
of indices a ↔ b. The same should be true for the right hand side. From this
condition we obtain commutators of generators F
$$F_a F_b - F_b F_a = i\hbar \sum_{c=1}^{10} C^c_{ab} F_c + E_{ab} \tag{3.27}$$
where $C^c_{ab} \equiv f^c_{ab} - f^c_{ba}$ are familiar structure constants of the Poincaré Lie
algebra (2.39) - (2.46) and $E_{ab} = i\hbar^2 (h_{ab} - h_{ba})$ are imaginary constants,
which depend on our original choice of representatives $U_g$ in rays $[U_g]$.10
These constants are called central charges. Our main task in this subsection
is to prove that representatives Ug can be chosen in such a way that Eab = 0,
i.e., the central charges get eliminated.
First we consider the original (arbitrary) set of representatives Ug . In
accordance with our notation in section 2.3 we will use symbols
(H̃, P̃, J̃, K̃) (3.28)

to denote ten generators Fa of an arbitrary projective representation Ug .11
These generators correspond to time translation, space translation,
rotations, and boosts, respectively. Then using the structure constants $C^a_{bc}$ of the
Poincaré Lie algebra from equations (2.39) - (2.46) we obtain the full list of
commutators (3.27).
$$[\tilde{J}_i, \tilde{P}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} \tilde{P}_k + E^{(1)}_{ij} \tag{3.29}$$
$$[\tilde{J}_i, \tilde{J}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{J}_k + iE^{(2)}_k) \tag{3.30}$$
$$[\tilde{J}_i, \tilde{K}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} \tilde{K}_k + E^{(3)}_{ij} \tag{3.31}$$
$$[\tilde{P}_i, \tilde{P}_j] = E^{(4)}_{ij} \tag{3.32}$$
$$[\tilde{J}_i, \tilde{H}] = E^{(5)}_i \tag{3.33}$$
$$[\tilde{P}_i, \tilde{H}] = E^{(6)}_i \tag{3.34}$$
$$[\tilde{K}_i, \tilde{K}_j] = -i\frac{\hbar}{c^2} \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{J}_k + iE^{(7)}_k) \tag{3.35}$$
$$[\tilde{K}_i, \tilde{P}_j] = -i\frac{\hbar}{c^2} \tilde{H} \delta_{ij} + E^{(8)}_{ij} \tag{3.36}$$
$$[\tilde{K}_i, \tilde{H}] = -i\hbar \tilde{P}_i + E^{(9)}_i \tag{3.37}$$
10
To be exact, we must write $E_{ab}$ on the right hand side of equation (3.27) multiplied
by the identity operator I. However, we will omit the symbol I here for brevity.
11
Note that generators $(H, \vec{P}, \vec{K}, \vec{J})$ in subsection 2.3.1 were abstract quantities
that could be interpreted as “derivatives of group transformations”, while generators
(H̃, P̃, J̃, K̃) here are Hermitian operators in the Hilbert space of states of our physical
system.
Here we arranged $E_{ab}$ into nine sets of central charges $E^{(1)} \ldots E^{(9)}$. In
equations (3.30) and (3.35) we took into account that their left hand sides are
antisymmetric tensors. So, the central charges must form antisymmetric
tensors as well, and, according to Table D.1, they can be represented as
$-\hbar \sum_{k=1}^{3} \epsilon_{ijk} E^{(2)}_k$ and $\hbar c^{-2} \sum_{k=1}^{3} \epsilon_{ijk} E^{(7)}_k$, respectively, where $E^{(2)}_k$ and $E^{(7)}_k$
are 3-vectors.
Next we will use the requirement that commutators (3.29) - (3.37) must
satisfy the Jacobi identity.12 This will allow us to make some simplifications.
For example, using13 $\tilde{P}_3 = -\frac{i}{\hbar} [\tilde{J}_1, \tilde{P}_2] + \frac{i}{\hbar} E^{(1)}_{12}$ and the fact that all constants
E commute with generators of the group, we obtain
$$[\tilde{P}_3, \tilde{P}_1] = -\frac{i}{\hbar} [([\tilde{J}_1, \tilde{P}_2] - E^{(1)}_{12}), \tilde{P}_1] = -\frac{i}{\hbar} [[\tilde{J}_1, \tilde{P}_2], \tilde{P}_1]$$
$$= -\frac{i}{\hbar} [[\tilde{P}_1, \tilde{P}_2], \tilde{J}_1] - \frac{i}{\hbar} [[\tilde{J}_1, \tilde{P}_1], \tilde{P}_2] = -\frac{i}{\hbar} [E^{(4)}_{12}, \tilde{J}_1] - \frac{i}{\hbar} [E^{(1)}_{11}, \tilde{P}_2] = 0$$
so $E^{(4)}_{31} = 0$. Similarly, we can show that $E^{(4)}_{ij} = E^{(5)}_i = E^{(6)}_i = 0$ for all values
of indices i, j = 1, 2, 3.
12
equation (E.10), which is equivalent to the associativity condition (3.15) or (3.25)
13
Here we used equation (3.29).
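Using the Jacobi identity we further obtain
$$i\hbar [\tilde{J}_3, \tilde{P}_3] = [[\tilde{J}_1, \tilde{J}_2], \tilde{P}_3] = [[\tilde{P}_3, \tilde{J}_2], \tilde{J}_1] + [[\tilde{J}_1, \tilde{P}_3], \tilde{J}_2] = i\hbar [\tilde{J}_1, \tilde{P}_1] + i\hbar [\tilde{J}_2, \tilde{P}_2] \tag{3.38}$$
and, similarly,
$$i\hbar [\tilde{J}_1, \tilde{P}_1] = i\hbar [\tilde{J}_2, \tilde{P}_2] + i\hbar [\tilde{J}_3, \tilde{P}_3] \tag{3.39}$$
By adding equations (3.38) and (3.39) we see that
$$[\tilde{J}_2, \tilde{P}_2] = 0 \tag{3.40}$$
Similarly, we obtain $[\tilde{J}_1, \tilde{P}_1] = [\tilde{J}_3, \tilde{P}_3] = 0$, which means that
$$E^{(1)}_{ii} = 0 \tag{3.41}$$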
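Using the Jacobi identity again, we obtain
$$i\hbar [\tilde{J}_2, \tilde{P}_3] = [[\tilde{J}_3, \tilde{J}_1], \tilde{P}_3] = [[\tilde{P}_3, \tilde{J}_1], \tilde{J}_3] + [[\tilde{J}_3, \tilde{P}_3], \tilde{J}_1] = -i\hbar [\tilde{J}_3, \tilde{P}_2]$$
This antisymmetry property is also true in the general case (for any i, j =
1, 2, 3; i ≠ j)
$$[\tilde{J}_i, \tilde{P}_j] = -[\tilde{J}_j, \tilde{P}_i] \tag{3.42}$$
Putting together (3.40) and (3.42) we see that tensor $[\tilde{J}_i, \tilde{P}_j]$ is antisymmetric.
This implies that we can introduce a vector $E^{(1)}_k$ such that
$$E^{(1)}_{ij} = -\hbar \sum_{k=1}^{3} \epsilon_{ijk} E^{(1)}_k$$
$$[\tilde{J}_i, \tilde{P}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{P}_k + iE^{(1)}_k) \tag{3.43}$$
Similarly, we can show that $E^{(3)}_{ii} = 0$ and
$$[\tilde{J}_i, \tilde{K}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{K}_k + iE^{(3)}_k)$$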
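Taking into account the above results, commutation relations (3.29) -
(3.37) now take the form
$$[\tilde{J}_i, \tilde{P}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{P}_k + iE^{(1)}_k) \tag{3.44}$$
$$[\tilde{J}_i, \tilde{J}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{J}_k + iE^{(2)}_k) \tag{3.45}$$
$$[\tilde{J}_i, \tilde{K}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{K}_k + iE^{(3)}_k) \tag{3.46}$$
$$[\tilde{P}_i, \tilde{P}_j] = [\tilde{J}_i, \tilde{H}] = [\tilde{P}_i, \tilde{H}] = 0 \tag{3.47}$$
$$[\tilde{K}_i, \tilde{K}_j] = -i\frac{\hbar}{c^2} \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{J}_k + iE^{(7)}_k) \tag{3.48}$$
$$[\tilde{K}_i, \tilde{P}_j] = -i\frac{\hbar}{c^2} \tilde{H} \delta_{ij} + E^{(8)}_{ij} \tag{3.49}$$
$$[\tilde{K}_i, \tilde{H}] = -i\hbar \tilde{P}_i + E^{(9)}_i \tag{3.50}$$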
where E on the right hand sides are certain imaginary constants. The next
step in elimination of the central charges E is to use the freedom of
choosing unimodular factors β(g) in front of operators of the representation $U_g$:
Two unitary operators $U_{\vec{\zeta}}$ and $\beta(\vec{\zeta}) U_{\vec{\zeta}}$ differing by a unimodular factor $\beta(\vec{\zeta})$
generate the same subspace transformation $K_{\vec{\zeta}}$. Correspondingly, the choice
of generators $F_a$ has some degree of arbitrariness as well. Since $\beta(\vec{\zeta})$ are
unimodular, we can write
$$\beta(\vec{\zeta}) = \exp(i\gamma(\vec{\zeta})) \approx 1 + i \sum_{a=1}^{10} R_a \zeta^a$$
Therefore, in the first order, the presence of factors $\beta(\vec{\zeta})$ results in adding
some real constants $R_a$ to generators $F_a$. We would like to show that by
adding such constants we can make all central charges equal to zero.
Let us now add constants R to the generators $\tilde{P}_j$, $\tilde{J}_j$, and $\tilde{K}_j$ and denote
the redefined generators as
$$P_j = \tilde{P}_j + R^{(1)}_j$$
$$J_j = \tilde{J}_j + R^{(2)}_j$$
$$K_j = \tilde{K}_j + R^{(3)}_j$$
Then commutator (3.45) takes the form
$$[J_i, J_j] = [\tilde{J}_i + R^{(2)}_i, \tilde{J}_j + R^{(2)}_j] = [\tilde{J}_i, \tilde{J}_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} (\tilde{J}_k + iE^{(2)}_k)$$
So, if we choose $R^{(2)}_k = iE^{(2)}_k$, then
$$[J_i, J_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} J_k$$
and central charges are eliminated from this commutator.
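The mechanism of this redefinition can be seen in a small numerical sketch based on spin-1/2 matrices. The choice of 2 × 2 matrices, the units with ħ = 1, the sample constants `R`, and the helper names are all illustrative assumptions. Generators shifted by real constants obey commutators of the form (3.45) with $iE^{(2)}_k = R_k$, and adding the constants back removes the central charges.

```python
HBAR = 1.0  # units with hbar = 1, an illustrative choice


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def commutator(a, b):
    """Commutator of two 2x2 matrices stored as nested lists."""
    ab = [[sum(a[r][m] * b[m][s] for m in range(2)) for s in range(2)] for r in range(2)]
    ba = [[sum(b[r][m] * a[m][s] for m in range(2)) for s in range(2)] for r in range(2)]
    return [[ab[r][s] - ba[r][s] for s in range(2)] for r in range(2)]


def shifted(matrix, constant):
    """Return matrix + constant * identity."""
    return [[matrix[r][s] + (constant if r == s else 0.0) for s in range(2)] for r in range(2)]


def max_deviation(i, j, gens, charges):
    """Largest entry of [G_i, G_j] - i*hbar*sum_k eps_ijk (G_k + charges[k])."""
    lhs = commutator(gens[i], gens[j])
    worst = 0.0
    for r in range(2):
        for s in range(2):
            rhs = sum(1j * HBAR * eps(i, j, k) * shifted(gens[k], charges[k])[r][s]
                      for k in range(3))
            worst = max(worst, abs(lhs[r][s] - rhs))
    return worst


PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
J = [[[0.5 * HBAR * x for x in row] for row in s] for s in PAULI]  # no central charges
R = [0.3, -1.2, 0.7]                                  # hypothetical constants R_k = i E_k
J_tilde = [shifted(J[k], -R[k]) for k in range(3)]    # generators with central charges

print(max_deviation(0, 1, J_tilde, R))          # relation (3.45) holds
print(max_deviation(0, 1, J_tilde, [0, 0, 0]))  # central charge is visible
print(max_deviation(0, 1, J, [0, 0, 0]))        # removed after J = J_tilde + R
```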

Similarly, central charges can be eliminated from commutators
$$[J_i, P_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} P_k$$
$$[J_i, K_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} K_k \tag{3.51}$$
by choosing $R^{(1)}_k = iE^{(1)}_k$ and $R^{(3)}_k = iE^{(3)}_k$. From equation (3.51) we then
obtain
$$[K_1, K_2] = -\frac{i}{\hbar} [[J_2, K_3], K_2] = -\frac{i}{\hbar} [[J_2, K_2], K_3] - \frac{i}{\hbar} [[K_2, K_3], J_2]$$
$$= -\frac{i}{\hbar} \left[ -\frac{i\hbar}{c^2} (J_1 + iE^{(7)}_1), J_2 \right] = -\frac{i\hbar}{c^2} J_3$$
so, our choice of the constants $R^{(1)}_k$, $R^{(2)}_k$, and $R^{(3)}_k$ eliminates the central
charges $E^{(7)}_i$.
From equation (3.51) we also obtain
$$[K_3, \tilde{H}] = -\frac{i}{\hbar} [[J_1, K_2], \tilde{H}] = -\frac{i}{\hbar} [[\tilde{H}, K_2], J_1] - \frac{i}{\hbar} [[J_1, \tilde{H}], K_2] = -[J_1, P_2] = -i\hbar P_3$$
which implies that the central charge $E^{(9)}$ is canceled as well. Finally
$$[K_1, P_2] = -\frac{i}{\hbar} [[J_2, K_3], P_2] = -\frac{i}{\hbar} [[J_2, P_2], K_3] + \frac{i}{\hbar} [[K_3, P_2], J_2] = 0$$
$$[K_1, P_1] = -\frac{i}{\hbar} [[J_2, K_3], P_1] = -\frac{i}{\hbar} [[J_2, P_1], K_3] + \frac{i}{\hbar} [[K_3, P_1], J_2] = [K_3, P_3]$$
It then follows that $E^{(8)}_{ij} = 0$ if i ≠ j and we can introduce a real scalar $E^{(8)}$
such that
$$E^{(8)}_{11} = E^{(8)}_{22} = E^{(8)}_{33} \equiv -\frac{i\hbar}{c^2} E^{(8)}$$
$$[K_i, P_j] = -\frac{i\hbar}{c^2} \delta_{ij} (\tilde{H} + E^{(8)})$$
Finally, by redefining the generator of time translations $H = \tilde{H} + E^{(8)}$ we
eliminate all central charges from commutation relations of the Poincaré Lie
algebra
\[
[J_i, P_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} P_k \tag{3.52}
\]
\[
[J_i, J_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} J_k \tag{3.53}
\]
\[
[J_i, K_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} K_k \tag{3.54}
\]
\[
[P_i, P_j] = [J_i, H] = [P_i, H] = 0 \tag{3.55}
\]
\[
[K_i, K_j] = -\frac{i\hbar}{c^2} \sum_{k=1}^{3} \epsilon_{ijk} J_k \tag{3.56}
\]
\[
[K_i, P_j] = -\frac{i\hbar}{c^2} H \delta_{ij} \tag{3.57}
\]
\[
[K_i, H] = -i\hbar P_i \tag{3.58}
\]
Thus Hermitian operators H, P, J, and K provide a representation of the

Poincaré Lie algebra, and the redefined unitary operators β(g)Ug form a
unique unitary representation of the Poincaré group that corresponds to the
given projective representation Ug in the vicinity of the group identity. We
have proven that projective representations of the Poincaré group are equiv-
alent to certain unitary representations, which are much easier objects for
study (see Appendix H).
Commutators (3.52) - (3.58) are probably the most important equations
of relativistic quantum theory. In the rest of this book we will have many
opportunities to appreciate a deep physical content of these formulas.
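The structure of commutators (3.52) - (3.58) can be checked numerically on a small matrix realization. The sketch below is only an illustration under stated assumptions: it uses the familiar 5×5 affine matrices acting on columns (ct, x, y, z, 1), rescaled to the normalization used above. This finite-dimensional realization is not unitary and its matrices for P, K, and H are not Hermitian, so it is not one of the Hilbert space representations discussed in the text; it merely reproduces the same Lie algebra. The numerical values of hbar and c, and all function names, are arbitrary choices made for this example.

```python
import itertools

import numpy as np

hbar, c = 1.0, 2.0  # arbitrary illustrative units (assumption), not physical values


def unit(a, b, n=5):
    """Matrix unit E_ab: 1 in row a, column b, zeros elsewhere."""
    m = np.zeros((n, n), dtype=complex)
    m[a, b] = 1.0
    return m


def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# Real generators acting on columns (ct, x, y, z, 1): index 0 is time,
# indices 1..3 are space, and the last column carries translations.
rot = [sum(-eps(i, a, b) * unit(a + 1, b + 1) for a in range(3) for b in range(3))
       for i in range(3)]
boost = [unit(0, i + 1) + unit(i + 1, 0) for i in range(3)]
shift = [unit(i + 1, 4) for i in range(3)]
tshift = c * unit(0, 4)

# Rescaled generators in the normalization of (3.52) - (3.58).
J = [1j * hbar * g for g in rot]
K = [1j * hbar / c * g for g in boost]
P = [1j * hbar * g for g in shift]
H = -1j * hbar * tshift


def eps_sum(i, j, vec):
    """Return sum over k of eps_ijk * vec[k]."""
    return sum(eps(i, j, k) * vec[k] for k in range(3))


def max_residual():
    """Largest matrix element of (left side - right side) over all relations."""
    res = []
    for i, j in itertools.product(range(3), repeat=2):
        res.append(comm(J[i], P[j]) - 1j * hbar * eps_sum(i, j, P))         # (3.52)
        res.append(comm(J[i], J[j]) - 1j * hbar * eps_sum(i, j, J))         # (3.53)
        res.append(comm(J[i], K[j]) - 1j * hbar * eps_sum(i, j, K))         # (3.54)
        res.append(comm(P[i], P[j]))                                        # (3.55)
        res.append(comm(K[i], K[j]) + 1j * hbar / c**2 * eps_sum(i, j, J))  # (3.56)
        res.append(comm(K[i], P[j]) + 1j * hbar / c**2 * H * (i == j))      # (3.57)
    for i in range(3):
        res.append(comm(J[i], H))                                           # (3.55)
        res.append(comm(P[i], H))                                           # (3.55)
        res.append(comm(K[i], H) + 1j * hbar * P[i])                        # (3.58)
    return max(np.abs(r).max() for r in res)


print(max_residual())
```

A residual at the level of rounding error indicates that all ten matrices obey the Poincaré commutators in the stated normalization.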
3.2.3 Single-valued and double-valued representations

In the preceding subsection we eliminated the phase factors α(g2, g1 ) from
equation (3.16) by resorting to Lie algebra arguments. However, these ar-
guments work only in the vicinity of the group’s unit element. There is a
possibility that non-trivial phase factors may reappear in the multiplication
law (3.16) when the group manifold has a non-trivial topology and group
elements are considered which are far from the unit element.
In Appendix H.4 we established that this possibility is realized in the case
of the rotation group. This means that for quantum-mechanical applications
we need to consider both single-valued and double-valued representations of
this group. Since the rotation group is a subgroup of the Poincaré group, the
same conclusion is relevant for the Poincaré group: both single-valued and
double-valued unitary representations should be considered.14 In chapter 5
we will see that these two cases correspond to integer-spin and half-integer-
spin systems, respectively.
14
Equivalently, one can choose to consider all single-valued representations of the uni-
versal covering group of the Poincaré group.
3.2.4 Fundamental statement of relativistic quantum theory
The most important result of this chapter is the connection between relativity
and quantum mechanics summarized in the following statement (see, e.g.,
[Wei95])
Statement 3.2 (Unitary representations of the Poincaré group) In a
relativistic quantum description of a physical system, inertial transforma-
tions are represented by unitary operators which furnish a unitary (single- or
double-valued) representation of the Poincaré group in the Hilbert space of
the system.
It is important to note that this statement is completely general. The Hilbert
space of any isolated physical system (no matter how complex) must carry a
unitary representation of the Poincaré group. Construction of Hilbert spaces
and Poincaré group representations in them is the major part of theoretical
description of physical systems. The rest of this book is primarily devoted
to performing these difficult tasks.
Basic inertial transformations from the Poincaré group are represented in
the Hilbert space by unitary operators: $e^{-\frac{i}{\hbar}\mathbf{P}\mathbf{r}}$ for spatial translations, $e^{-\frac{i}{\hbar}\mathbf{J}\vec{\phi}}$
for rotations, $e^{-\frac{ic}{\hbar}\mathbf{K}\vec{\theta}}$ for boosts, and $e^{\frac{i}{\hbar}Ht}$ for time translations.15 A general
inertial transformation $g = \{\vec{\phi}, \mathbf{v}(\vec{\theta}), \mathbf{r}, t\}$ is represented by the unitary
operator16
\[
U_g = e^{-\frac{i}{\hbar}\mathbf{J}\vec{\phi}} e^{-\frac{ic}{\hbar}\mathbf{K}\vec{\theta}} e^{-\frac{i}{\hbar}\mathbf{P}\mathbf{r}} e^{\frac{i}{\hbar}Ht} \tag{3.59}
\]
We will frequently use notation
\[
U_g \equiv U(\vec{\phi}, \vec{\theta}; \mathbf{r}, t) \equiv U(\Lambda; \mathbf{r}, t) \tag{3.60}
\]
where $\Lambda$ is a Lorentz transformation of inertial frames that combines boost
$\vec{\theta}$ and rotation $\vec{\phi}$. Then, in the Schrödinger picture17 state vectors transform
between different inertial reference frames according to18
\[
|\Psi'\rangle = U_g |\Psi\rangle \tag{3.61}
\]
15
The exponential form of the unitary group representatives follows from equation (3.22).
16
compare with equation (2.16)
17
18
We will see in subsection 5.2.4 that this is active transformation of states. In most
physical applications one is interested in passive transformations of states (i.e., how the
In the Heisenberg picture inertial transformations of observables have the
form
\[
F' = U_g F U_g^{-1} \tag{3.62}
\]
For example, the equation describing the time evolution of the observable F
in the Heisenberg picture19
\[
F(t) = e^{\frac{i}{\hbar}Ht} F e^{-\frac{i}{\hbar}Ht} \tag{3.63}
\]
\[
= F + \frac{i}{\hbar}[H, F]t - \frac{1}{2\hbar^2}[H, [H, F]]t^2 + \ldots \tag{3.64}
\]
can be also written in a differential form
\[
\frac{dF(t)}{dt} = \frac{i}{\hbar}[H, F]
\]
which is the familiar Heisenberg equation.
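The expansion (3.64) and the differential form above can be illustrated on a two-level toy model. The sketch below is an assumption-laden example rather than anything taken from the text: the Hamiltonian H = σ_z/2, the observable F = σ_x, the value hbar = 1, and all function names are chosen here purely for illustration.

```python
import numpy as np

hbar = 1.0  # illustrative units (assumption)

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

H = 0.5 * sigma_z  # toy Hamiltonian (assumption)
F = sigma_x        # toy observable (assumption)


def comm(a, b):
    """Commutator [a, b]."""
    return a @ b - b @ a


def evolve_exact(F, H, t):
    """F(t) = exp(iHt/hbar) F exp(-iHt/hbar), eq. (3.63), for Hermitian H."""
    energies, vecs = np.linalg.eigh(H)
    U = vecs @ np.diag(np.exp(1j * energies * t / hbar)) @ vecs.conj().T
    return U @ F @ U.conj().T


def evolve_series(F, H, t):
    """Right hand side of (3.64), truncated after the t**2 term."""
    first = 1j / hbar * comm(H, F) * t
    second = -1.0 / (2 * hbar**2) * comm(H, comm(H, F)) * t**2
    return F + first + second


t = 0.01
print(np.abs(evolve_exact(F, H, t) - evolve_series(F, H, t)).max())
```

For small t the difference between the exact expression and the truncated series is of order t³, as expected from the first omitted term.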
Note also that analogous “Heisenberg equations” can be written for transformations
of observables with respect to space translations, rotations, and
boosts
\[
\begin{aligned}
\frac{dF(\mathbf{r})}{d\mathbf{r}} &= -\frac{i}{\hbar}[\mathbf{P}, F] \\
\frac{dF(\vec{\phi})}{d\vec{\phi}} &= -\frac{i}{\hbar}[\mathbf{J}, F] \\
\frac{dF(\vec{\theta})}{d\vec{\theta}} &= -\frac{ic}{\hbar}[\mathbf{K}, F]
\end{aligned} \tag{3.65}
\]
same state is seen by two different observers), which are given by the inverse operator
$U_g^{-1}$.
19
see equation (E.13)
We already mentioned that transformations of observables with respect

to inertial transformations of observers cover many interesting problems in
physics (the time evolution, boost transformations, etc.). From the above
formulas we see that solution of these problems requires the knowledge of
commutators between observables F and generators (H, P, J and K) of the
relevant Poincaré group representation. In the next chapter we will discuss
definitions of various observables, their connections to Poincaré generators,
and their commutation relations.
Chapter 4
OPERATORS OF OBSERVABLES
Throwing pebbles into the water, look at the ripples they form on
the surface, otherwise, such occupation becomes an idle pastime.
Kozma Prutkov
In chapters 1 and 3 we established that in quantum theory any physical sys-

tem is described by a complex Hilbert space H, pure states are represented
by rays in H, observables are represented by Hermitian operators in H, and
there is a unitary representation Ug of the Poincaré group in H which de-
termines how state vectors and operators of observables change when the
preparation device or the measuring apparatus undergoes an inertial transformation.
Our next goal is to clarify the structure of the set of observables.
In particular, we wish to find which operators correspond to such familiar
observables as velocity, momentum, energy, mass, position, etc., what their
spectra are, and how these operators are related to each other. We will also
find out how these observables change under inertial transformations from
the Poincaré group. This implies that we will use the Heisenberg picture
everywhere in this chapter.
We should stress that physical systems considered in this chapter are
completely arbitrary: they can be either elementary particles or compound
systems of many elementary particles or even systems (such as unstable par-
ticles) in which the number of particles is not precisely defined. The only
significant requirement is that our system must be isolated, i.e., its interac-
tion with the rest of the universe can be neglected.
In this chapter we will focus on observables whose operators can be ex-
pressed as functions of generators (P, J, K, H) of the Poincaré group rep-
resentation Ug . In chapter 8, we will meet other observables, such as the
number of particles. They cannot be expressed through ten generators of the
Poincaré group.
4.1 Basic observables

4.1.1 Energy, momentum, and angular momentum
The generators of the Poincaré group representation in the Hilbert space of
any system are Hermitian operators H, P, J, and K, and we might suspect
that they are related to certain observables pertinent to this system. What
are these observables? In order to get a hint, let us now postulate that the
constant $\hbar$ introduced in equation (3.22) is the reduced Planck constant
\[
\hbar = 1.055 \cdot 10^{-34}\ \frac{\mathrm{kg} \cdot \mathrm{m}^2}{\mathrm{s}} \tag{4.1}
\]
whose dimension can be also expressed as <ħ> = <mass><speed><distance>.
Then the dimensions of generators can be found from the condition
that the arguments of exponents in (3.59) must be dimensionless
• <H> = <ħ>/<time> = <mass><speed>²;
• <P> = <ħ>/<distance> = <mass><speed>;
• <J> = <ħ> = <mass><speed><distance>;
• <K> = <ħ>/<speed> = <mass><distance>.
Based on these dimensions we can guess that we are dealing with observables
of energy (or Hamiltonian) H, momentum P, and angular momentum J
of the system.1 We will call them basic observables. Operators H, P, and J
1
There is no common observable directly associated with the boost generator K, but
we will see later that K is intimately related to system’s position and spin.
generate transformations of the system as a whole, so we will assume that

these are observables for the entire system, i.e., the total energy, the total
momentum, and the total angular momentum. Of course, these dimension-
ality considerations are not a proof. The justification of these choices will
become more clear later, after studying properties of operators and relations
between them.
Using this interpretation and commutators in the Poincaré Lie algebra
(3.52) - (3.58), we immediately obtain commutation relations between op-
erators of observables. Then we know which pairs of observables can be
simultaneously measured.2 For example, we see from (3.55) that energy
is simultaneously measurable with the momentum and angular momentum.
From (3.53) it is clear that different components of the angular momentum
cannot be measured simultaneously. These facts are well-known in non-
relativistic quantum mechanics. Now we have them as direct consequences
of the principle of relativity and the Poincaré group structure.
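From commutators (3.52) - (3.58) we can also find formulas for transformations
of operators H, P, J, and K from one inertial frame to another. For
example, each vector observable F = P, J or K transforms under rotations
as3
\[
\mathbf{F}(\vec{\phi}) = e^{-\frac{i}{\hbar}\mathbf{J}\vec{\phi}}\, \mathbf{F}\, e^{\frac{i}{\hbar}\mathbf{J}\vec{\phi}} = \mathbf{F}\cos\phi + \frac{\vec{\phi}}{\phi}\left(\mathbf{F} \cdot \frac{\vec{\phi}}{\phi}\right)(1 - \cos\phi) - \left[\mathbf{F} \times \frac{\vec{\phi}}{\phi}\right]\sin\phi \tag{4.2}
\]
The boost transformation law for generators of translations is4
\[
\mathbf{P}(\vec{\theta}) = e^{-\frac{ic}{\hbar}\mathbf{K}\vec{\theta}}\, \mathbf{P}\, e^{\frac{ic}{\hbar}\mathbf{K}\vec{\theta}} = \mathbf{P} + \frac{\vec{\theta}}{\theta}\left(\mathbf{P} \cdot \frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) - \frac{\vec{\theta}}{c\theta} H \sinh\theta \tag{4.3}
\]
\[
H(\vec{\theta}) = e^{-\frac{ic}{\hbar}\mathbf{K}\vec{\theta}}\, H\, e^{\frac{ic}{\hbar}\mathbf{K}\vec{\theta}} = H\cosh\theta - c\left(\mathbf{P} \cdot \frac{\vec{\theta}}{\theta}\right)\sinh\theta \tag{4.4}
\]
2
i.e., they have a common basis of eigenvectors, as explained in subsection 1.6.2 and
Appendix G.2
3
see equation (D.21)
4
see equations (2.50) and (2.51)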
It also follows from (3.55) that energy H, momentum P, and angular mo-
mentum J do not depend on time, i.e., they are conserved observables.
4.1.2 Operator of velocity

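The operator of velocity is defined as5 (see, e.g., [AW75, Jor77])
\[
\mathbf{V} \equiv \frac{\mathbf{P}c^2}{H} \tag{4.5}
\]
Denoting V(θ) the velocity measured in the frame of reference moving with
the speed v = c tanh θ along the x-axis, we obtain
\[
\begin{aligned}
V_x(\theta) &= e^{-\frac{ic}{\hbar}K_x\theta}\, \frac{P_x c^2}{H}\, e^{\frac{ic}{\hbar}K_x\theta} = \frac{c^2 P_x \cosh\theta - cH\sinh\theta}{H\cosh\theta - cP_x\sinh\theta} \\
&= \frac{c^2 P_x H^{-1} - c\tanh\theta}{1 - cP_x H^{-1}\tanh\theta} = \frac{V_x - v}{1 - V_x v/c^2}
\end{aligned} \tag{4.6}
\]
\[
V_y(\theta) = \frac{V_y}{\left(1 - \frac{V_x}{c}\tanh\theta\right)\cosh\theta} = \frac{V_y\sqrt{1 - v^2/c^2}}{1 - V_x v/c^2} \tag{4.7}
\]
\[
V_z(\theta) = \frac{V_z}{\left(1 - \frac{V_x}{c}\tanh\theta\right)\cosh\theta} = \frac{V_z\sqrt{1 - v^2/c^2}}{1 - V_x v/c^2} \tag{4.8}
\]
These formulas coincide with the usual relativistic law of addition of velocities.
In the limit c → ∞ they reduce to the familiar non-relativistic form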
Vx (v) = Vx − v
Vy (v) = Vy
Vz (v) = Vz
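The transformation law (4.6) - (4.8) and its non-relativistic limit can be tried out on ordinary numbers, i.e., on eigenvalues of the commuting operators involved. The following is a minimal sketch; the function name and the sample values are assumptions made for this illustration only.

```python
import math


def boost_velocity(V, v, c):
    """Velocity components seen from a frame moving with speed v along x,
    following eqs. (4.6) - (4.8) applied to numerical values."""
    Vx, Vy, Vz = V
    denom = 1.0 - Vx * v / c**2
    root = math.sqrt(1.0 - v**2 / c**2)
    return ((Vx - v) / denom, Vy * root / denom, Vz * root / denom)


c = 1.0
# A signal moving with speed c along y keeps speed c in the moving frame.
Vx, Vy, Vz = boost_velocity((0.0, c, 0.0), 0.6 * c, c)
print(math.sqrt(Vx**2 + Vy**2 + Vz**2))

# For very large c the law approaches the non-relativistic subtraction of velocities.
print(boost_velocity((3.0, 4.0, 0.0), 1.0, 1.0e9))
```

The first call shows that the speed c is the same in both frames; the second shows the approach to the non-relativistic formulas when c is much larger than all velocities involved.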
4.2 Casimir operators

Observables H, P, V, and J depend on the observer, so they do not repre-
sent intrinsic fundamental properties of the system. For example, if a system
5
The ratio of operators is well-defined here because P and H commute with each other.
has momentum p ∈ R3 in one frame of reference, then, according to (4.3),

there are other (moving) frames of reference in which momentum takes dif-
ferent values. The measured momentum depends on both the state of the
system and the reference frame in which the observation is made. Are there
observables which reflect some intrinsic observer-independent properties of
the system? If there are such observables, then their operators (they are
called Casimir operators) must commute with all generators of the Poincaré
group. It can be shown that the Poincaré group has only two independent
Casimir operators [FN94]. Any other Casimir operator of the Poincaré group
is a function of these two. So, there are two invariant physical properties of
any physical system. One such property is mass, which is a measure of the
matter content in the system. The corresponding Casimir operator will be
considered in subsection 4.2.2. Another invariant property is related to the
speed of rotation of the system around its own axis or spin.6 The Casimir
operator corresponding to this invariant property will be found in subsection
4.2.3.
4.2.1 4-vectors
Before addressing Casimir operators, let us introduce some useful definitions.
We will call a quadruple of operators $(A_0, A_x, A_y, A_z)$ a 4-vector7 if
$(A_x, A_y, A_z)$ is a 3-vector, $A_0$ is a 3-scalar, and their commutators with the
boost generators are
\[
[K_i, A_j] = -\frac{i\hbar}{c} A_0 \delta_{ij} \quad (i, j = x, y, z) \tag{4.9}
\]
\[
[\mathbf{K}, A_0] = -\frac{i\hbar}{c} \mathbf{A} \tag{4.10}
\]
Then, it is easy to show that the 4-square $\tilde{A}^2 \equiv A_x^2 + A_y^2 + A_z^2 - A_0^2$ of the
4-vector $\tilde{A}$ is a 4-scalar, i.e., it commutes with both rotations and boosts.
For example
\[
\begin{aligned}
[K_x, \tilde{A}^2] &= [K_x, A_x^2 + A_y^2 + A_z^2 - A_0^2] \\
&= -\frac{i\hbar}{c}(A_x A_0 + A_0 A_x - A_0 A_x - A_x A_0) = 0
\end{aligned}
\]
6
The invariance of the absolute value of spin is evident for macroscopic freely spinning
objects. Indeed, no matter how we translate, rotate or boost the frame of reference we
cannot stop or reverse the spinning motion of the observed system.
7
see also Appendix I.1
Therefore, in order to find the Casimir operators of the Poincaré group we

should be looking for two functions of the Poincaré generators, which are
4-vectors and, in addition, commute with H and P. Then 4-squares of these
4-vectors are guaranteed to commute with all Poincaré generators.
4.2.2 Operator of mass

It follows from (4.2) - (4.4) that four operators (H, cP) satisfy all conditions
specified in subsection 4.2.1 for 4-vectors. Moreover, they commute with each
other. These operators are called the energy-momentum 4-vector. Then we
can construct the first Casimir invariant – called the mass operator – as the
4-square of this 4-vector
\[
M = +\frac{1}{c^2}\sqrt{H^2 - P^2c^2} \tag{4.11}
\]
The operator of mass must be Hermitian, therefore we demand that for any
physical system H 2 − P 2 c2 ≥ 0, i.e., that the spectrum of operator H 2 − P 2 c2
does not contain negative values. Honoring the fact that masses of all known
physical systems are non-negative we choose the positive value of the square
root in (4.11). Then the relationship between energy, momentum, and mass
takes the form
\[
H = +\sqrt{P^2c^2 + M^2c^4} \tag{4.12}
\]
In the non-relativistic limit (c → ∞) we obtain from equation (4.12)
\[
H \approx Mc^2 + \frac{P^2}{2M}
\]
which is the sum of the famous Einstein’s rest mass energy E = Mc2 and
the usual kinetic energy term P 2 /(2M).
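Both statements of this subsection, the invariance of the mass (4.11) under the boost transformations (4.3) - (4.4) and the non-relativistic limit of (4.12), are easy to probe numerically on eigenvalues of the commuting operators H and P. The sketch below is only an illustration; function names and sample values are assumptions introduced here.

```python
import math


def boost_energy_momentum(H, P, theta, c):
    """Boost along x with rapidity theta, eqs. (4.3) - (4.4) applied to eigenvalues."""
    Px, Py, Pz = P
    H_new = H * math.cosh(theta) - c * Px * math.sinh(theta)
    Px_new = Px * math.cosh(theta) - H / c * math.sinh(theta)
    return H_new, (Px_new, Py, Pz)


def mass(H, P, c):
    """Mass eigenvalue from eq. (4.11)."""
    P2 = sum(p * p for p in P)
    return math.sqrt(H * H - P2 * c * c) / c**2


def energy(M, P, c):
    """Energy eigenvalue from eq. (4.12)."""
    P2 = sum(p * p for p in P)
    return math.sqrt(P2 * c * c + M * M * c**4)


c, M, P = 3.0, 2.0, (1.0, -0.5, 0.25)
H = energy(M, P, c)
H_boosted, P_boosted = boost_energy_momentum(H, P, 0.7, c)
print(mass(H, P, c), mass(H_boosted, P_boosted, c))

# Non-relativistic regime: c much larger than P/M.
c_big, M1, P1 = 1000.0, 1.0, (0.3, 0.0, 0.0)
print(energy(M1, P1, c_big) - M1 * c_big**2, 0.3**2 / (2 * M1))
```

The two printed masses agree, while energy and momentum individually change under the boost; the last line compares the exact kinetic energy with P²/(2M).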
4.2.3 Pauli-Lubanski 4-vector

The second 4-vector commuting with H and P is the Pauli-Lubanski operator
whose components are defined as8
\[
W^0 = (\mathbf{P} \cdot \mathbf{J}) \tag{4.13}
\]
\[
\mathbf{W} = \frac{1}{c} H\mathbf{J} - c[\mathbf{P} \times \mathbf{K}] \tag{4.14}
\]
Let us check that all required 4-vector properties are, indeed, satisfied for
$(W^0, \mathbf{W})$. We can immediately observe that
\[
[\mathbf{J}, W^0] = 0
\]
so $W^0$ is a scalar. Moreover, $W^0$ changes its sign after changing the sign of
P so it is a pseudoscalar. W is a pseudovector, because it does not change
its sign after changing the signs of K and P and
\[
[J_i, W_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} W_k
\]
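Let us now check the commutators with boost generators
\[
\begin{aligned}
[K_x, W^0] &= [K_x, P_x J_x + P_y J_y + P_z J_z] \\
&= -i\hbar\left(\frac{HJ_x}{c^2} - P_y K_z + P_z K_y\right) = -\frac{i\hbar}{c} W_x
\end{aligned} \tag{4.15}
\]
\[
\begin{aligned}
[K_x, W_x] &= \left[K_x, \frac{HJ_x}{c} - cP_y K_z + cP_z K_y\right] \\
&= \frac{i\hbar}{c}(-P_x J_x - P_y J_y - P_z J_z) = -\frac{i\hbar}{c} W^0
\end{aligned} \tag{4.16}
\]
\[
\begin{aligned}
[K_x, W_y] &= \left[K_x, \frac{HJ_y}{c} - cP_z K_x + cP_x K_z\right] \\
&= \frac{i\hbar}{c}(HK_z - P_x J_y - HK_z + P_x J_y) = 0
\end{aligned} \tag{4.17}
\]
\[
[K_x, W_z] = 0 \tag{4.18}
\]
Putting equations (4.15) - (4.18) together we obtain the characteristic
4-vector relations (4.9) - (4.10)
\[
[\mathbf{K}, W^0] = -\frac{i\hbar}{c} \mathbf{W} \tag{4.19}
\]
\[
[K_i, W_j] = -\frac{i\hbar}{c} \delta_{ij} W^0 \tag{4.20}
\]
8
These definitions involve products of Hermitian commuting operators, therefore operators
$W^0$ and W are guaranteed to be Hermitian.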
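Next we need to verify that commutators with generators of translations
(H, P) are all zero. First, for $W^0$ we obtain
\[
[W^0, H] = [\mathbf{P} \cdot \mathbf{J}, H] = 0
\]
\[
\begin{aligned}
[W^0, P_x] &= [J_x P_x + J_y P_y + J_z P_z, P_x] = P_y[J_y, P_x] + P_z[J_z, P_x] \\
&= -i\hbar P_y P_z + i\hbar P_z P_y = 0
\end{aligned}
\]
For the vector part W we obtain
\[
[\mathbf{W}, H] = -c[[\mathbf{P} \times \mathbf{K}], H] = -c[[\mathbf{P}, H] \times \mathbf{K}] - c[\mathbf{P} \times [\mathbf{K}, H]] = 0
\]
\[
[W_x, P_x] = \frac{1}{c}[HJ_x, P_x] - c[[\mathbf{P} \times \mathbf{K}]_x, P_x] = -c[P_y K_z - P_z K_y, P_x] = 0
\]
\[
\begin{aligned}
[W_x, P_y] &= \frac{1}{c}[HJ_x, P_y] - c[[\mathbf{P} \times \mathbf{K}]_x, P_y] = \frac{i\hbar}{c} HP_z - c[P_y K_z - P_z K_y, P_y] \\
&= \frac{i\hbar}{c} HP_z - \frac{i\hbar}{c} HP_z = 0
\end{aligned}
\]
This completes the proof that the 4-square of the Pauli-Lubanski 4-vector
\[
\Sigma^2 = \mathbf{W}^2 - W_0^2
\]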
is a Casimir operator. Although operators $(W^0, \mathbf{W})$ do not have direct physical
interpretation, we will find them very useful in the next section for deriving
the operators of position R and spin S. For these calculations we will
need commutators between components of the Pauli-Lubanski 4-vector. For
example,
\[
[W_x, W_y] = \left[W_x, \frac{HJ_y}{c} + cP_x K_z - cP_z K_x\right] = i\hbar\left(\frac{HW_z}{c} - W^0 P_z\right)
\]
\[
\begin{aligned}
[W_0, W_x] &= \left[W_0, \frac{HJ_x}{c} - cP_y K_z + cP_z K_y\right] = -i\hbar P_y W_z + i\hbar P_z W_y \\
&= -i\hbar[\mathbf{P} \times \mathbf{W}]_x
\end{aligned}
\]
The above equations are easily generalized for all components
\[
[W_i, W_j] = \frac{i\hbar}{c} \sum_{k=1}^{3} \epsilon_{ijk}(HW_k - cW^0 P_k) \tag{4.21}
\]
\[
[W_0, W_j] = -i\hbar[\mathbf{P} \times \mathbf{W}]_j \tag{4.22}
\]
4.3 Operators of spin and position

Now we are ready to tackle the problem of finding expressions for spin and
position as functions of the Poincaré group generators [Pry48, NW49, Ber65,
Jor80].
4.3.1 Physical requirements

We will be looking for the total spin operator S and the center-of-mass posi-
tion operator R which have the following natural properties:
(I) Owing to the similarity between spin and angular momentum,9 we
demand that S is a pseudovector (just like J)
\[
[J_i, S_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} S_k
\]
(II) and that components of S satisfy the same commutation relations as
components of J (3.53)
\[
[S_i, S_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} S_k \tag{4.23}
\]
(III) We also demand that spin can be measured simultaneously with momentum
\[
[\mathbf{P}, \mathbf{S}] = 0
\]
(IV) and with position
\[
[\mathbf{R}, \mathbf{S}] = 0 \tag{4.24}
\]
(V) From the physical meaning of R it follows that space translations of
the observer simply shift the values of position.
\[
\begin{aligned}
e^{-\frac{i}{\hbar}P_x a} R_x e^{\frac{i}{\hbar}P_x a} &= R_x - a \\
e^{-\frac{i}{\hbar}P_x a} R_y e^{\frac{i}{\hbar}P_x a} &= R_y \\
e^{-\frac{i}{\hbar}P_x a} R_z e^{\frac{i}{\hbar}P_x a} &= R_z
\end{aligned}
\]
This implies the following commutation relations10
\[
[R_i, P_j] = i\hbar\delta_{ij} \tag{4.25}
\]
(VI) Finally, we will assume that position is a true vector
\[
[J_i, R_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} R_k \tag{4.26}
\]
9
It is often stated that spin is a purely quantum-mechanical observable which does not
have a classical counterpart. We do not share this point of view. From classical mechanics
we know that the total angular momentum of a body is a sum of two parts. The first part
is the angular momentum resulting from the linear movement of the body as a whole with
respect to the observer. The second part is related to the rotation of the body around its
own axis, or spin. The only significant difference between classical and quantum intrinsic
angular momenta (spins) is that the latter has a discrete spectrum, while the former is
continuous. In addition, components of the quantum spin operator do not commute with
each other.
10
Note that in most textbooks this commutator is simply postulated. In our approach
it is derived from the definition of P as the generator of space translations.
4.3.2 Spin operator

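Now we would like to make the following guess about the form of the spin
operator11
\[
\mathbf{S} = \frac{\mathbf{W}}{Mc} - \frac{W_0 \mathbf{P}}{M(Mc^2 + H)} \tag{4.27}
\]
\[
= \frac{H\mathbf{J}}{Mc^2} - \frac{[\mathbf{P} \times \mathbf{K}]}{M} - \frac{\mathbf{P}(\mathbf{P} \cdot \mathbf{J})}{(H + Mc^2)M} \tag{4.28}
\]
which is a pseudovector commuting with P as required by the above conditions
(I) and (III). Next we are going to verify that condition (II) is also
valid for this operator. To calculate the commutators (4.23) between spin
components we denote
\[
F \equiv -\frac{1}{M(Mc^2 + H)} \tag{4.29}
\]
use commutators (4.21) and (4.22), the equality
\[
(\mathbf{P} \cdot \mathbf{W}) = \frac{1}{c} H(\mathbf{P} \cdot \mathbf{J}) = \frac{1}{c} HW_0 \tag{4.30}
\]
and equation (D.17). Then
\[
\begin{aligned}
[S_x, S_y] &= \left[FW_0 P_x + \frac{W_x}{Mc}, FW_0 P_y + \frac{W_y}{Mc}\right] \\
&= i\hbar\left(-\frac{FP_x[\mathbf{P} \times \mathbf{W}]_y}{Mc} + \frac{FP_y[\mathbf{P} \times \mathbf{W}]_x}{Mc} + \frac{HW_z - cW_0 P_z}{M^2c^3}\right) \\
&= i\hbar\left(-\frac{F[\mathbf{P} \times [\mathbf{P} \times \mathbf{W}]]_z}{Mc} + \frac{HW_z - cW_0 P_z}{M^2c^3}\right) \\
&= i\hbar\left(-\frac{F(P_z(\mathbf{P} \cdot \mathbf{W}) - W_z P^2)}{Mc} + \frac{HW_z - cW_0 P_z}{M^2c^3}\right) \\
&= i\hbar\left(-\frac{F(P_z HW^0 c^{-1} - W_z P^2)}{Mc} + \frac{HW_z - cW_0 P_z}{M^2c^3}\right) \\
&= i\hbar W_z\left(\frac{P^2 F}{Mc} + \frac{H}{M^2c^3}\right) + i\hbar P_z W^0\left(-\frac{HF}{Mc^2} - \frac{1}{M^2c^2}\right)
\end{aligned}
\]
For the expressions in parentheses we obtain
\[
\begin{aligned}
\frac{P^2 F}{Mc} + \frac{H}{M^2c^3} &= -\frac{P^2}{M^2c(Mc^2 + H)} + \frac{H}{M^2c^3} = \frac{H(Mc^2 + H) - P^2c^2}{M^2c^3(Mc^2 + H)} \\
&= \frac{H(Mc^2 + H) - (Mc^2 + H)(H - Mc^2)}{M^2c^3(Mc^2 + H)} = \frac{1}{Mc}
\end{aligned}
\]
\[
\begin{aligned}
-\frac{HF}{Mc^2} - \frac{1}{M^2c^2} &= \frac{H}{M^2c^2(Mc^2 + H)} - \frac{1}{M^2c^2} \\
&= \frac{H - (Mc^2 + H)}{M^2c^2(Mc^2 + H)} = -\frac{1}{M(Mc^2 + H)} = F
\end{aligned}
\]
Thus, property (4.23) follows
\[
[S_x, S_y] = i\hbar\left(\frac{W_z}{Mc} + FW^0 P_z\right) = i\hbar S_z
\]
11
Note that operator S has the mass operator M in the denominator, so expressions
(4.27) and (4.28) have mathematical sense only for systems with a strictly positive mass
spectrum.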
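Let us now prove that spin squared $\mathbf{S}^2$ is a function of $M^2$ and $\Sigma^2$, i.e., a
Casimir operator
\[
\begin{aligned}
\mathbf{S}^2 &= \left(\frac{\mathbf{W}}{Mc} + W_0 \mathbf{P} F\right)^2 = \frac{\mathbf{W}^2}{M^2c^2} + \frac{2W_0 F(\mathbf{P} \cdot \mathbf{W})}{Mc} + W_0^2 P^2 F^2 \\
&= \frac{\mathbf{W}^2}{M^2c^2} + W_0^2 F\left(\frac{2H}{Mc^2} + P^2 F\right) = \frac{\mathbf{W}^2}{M^2c^2} + W_0^2 F\, \frac{2H(Mc^2 + H) - P^2c^2}{Mc^2(Mc^2 + H)} \\
&= \frac{\mathbf{W}^2}{M^2c^2} - W_0^2\, \frac{H^2 + 2HMc^2 + M^2c^4}{M^2c^2(Mc^2 + H)^2} = \frac{\mathbf{W}^2 - W_0^2}{M^2c^2} \\
&= \frac{\Sigma^2}{M^2c^2}
\end{aligned}
\]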
So far we guessed one particular form of the spin operator and verified that
the required properties are satisfied. In subsection 4.3.6 we will demonstrate
that this is the unique expression satisfying all conditions from subsection
4.3.1.
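Sometimes it is convenient to use the operator of spin’s projection on
momentum $(\mathbf{S} \cdot \mathbf{P})/P$ that is called helicity. This operator is related to the
0-th component of the Pauli-Lubanski 4-vector
\[
(\mathbf{P} \cdot \mathbf{S}) = \frac{(\mathbf{P} \cdot \mathbf{J})H}{Mc^2} - \frac{P^2(\mathbf{P} \cdot \mathbf{J})(H - Mc^2)}{P^2 Mc^2} = (\mathbf{P} \cdot \mathbf{J}) = W_0 \tag{4.31}
\]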
4.3.3 Position operator

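Now we are going to switch to the derivation of the position operator. Here
we will follow a route similar to that for S: we will first guess the form of the
operator R and then in subsection 4.3.7 we will prove that this is the unique
expression satisfying all requirements from subsection 4.3.1. Our guess for
R is the Newton-Wigner position operator12 [Pry48, NW49, Ber65, Jor80,
Can65]
\[
\mathbf{R} = -\frac{c^2}{2}(H^{-1}\mathbf{K} + \mathbf{K}H^{-1}) - \frac{c^2[\mathbf{P} \times \mathbf{S}]}{H(Mc^2 + H)} \tag{4.32}
\]
\[
= -\frac{c^2}{H}\mathbf{K} - \frac{i\hbar c^2 \mathbf{P}}{2H^2} - \frac{c[\mathbf{P} \times \mathbf{W}]}{MH(Mc^2 + H)} \tag{4.33}
\]
which is a true vector having properties (V) and (VI), e.g.,
\[
[R_x, P_x] = -\frac{c^2}{2}[(H^{-1}K_x + K_x H^{-1}), P_x] = \frac{i\hbar}{2}(H^{-1}H + HH^{-1}) = i\hbar
\]
\[
[R_x, P_y] = -\frac{c^2}{2}[(H^{-1}K_x + K_x H^{-1}), P_y] = 0
\]
Let us now calculate13
\[
\begin{aligned}
\mathbf{J} - [\mathbf{R} \times \mathbf{P}] &= \mathbf{J} + \frac{c^2}{H}[\mathbf{K} \times \mathbf{P}] + \frac{c^2[[\mathbf{P} \times \mathbf{S}] \times \mathbf{P}]}{H(Mc^2 + H)} \\
&= \mathbf{J} + \frac{c^2}{H}[\mathbf{K} \times \mathbf{P}] - \frac{c^2(\mathbf{P}(\mathbf{P} \cdot \mathbf{S}) - \mathbf{S}P^2)}{H(Mc^2 + H)} \\
&= \mathbf{J} + \frac{c^2}{H}[\mathbf{K} \times \mathbf{P}] - \frac{c^2\mathbf{P}(\mathbf{P} \cdot \mathbf{S}) - \mathbf{S}(H - Mc^2)(H + Mc^2)}{H(Mc^2 + H)} \\
&= \mathbf{J} + \frac{c^2}{H}[\mathbf{K} \times \mathbf{P}] + \mathbf{S} - \frac{c^2\mathbf{P}(\mathbf{P} \cdot \mathbf{S})}{H(Mc^2 + H)} - \frac{Mc^2}{H}\mathbf{S} \\
&= \mathbf{J} + \frac{c^2}{H}[\mathbf{K} \times \mathbf{P}] + \mathbf{S} - \frac{c^2\mathbf{P}(\mathbf{P} \cdot \mathbf{S})}{H(Mc^2 + H)} - \mathbf{J} + \frac{c^2\mathbf{P}(\mathbf{P} \cdot \mathbf{J})}{H(Mc^2 + H)} + \frac{c^2}{H}[\mathbf{P} \times \mathbf{K}] \\
&= \mathbf{S}
\end{aligned}
\]
Therefore, just as in classical physics, the total angular momentum is a sum
of two parts: the orbital angular momentum $[\mathbf{R} \times \mathbf{P}]$ and the intrinsic angular
momentum or spin S
\[
\mathbf{J} = [\mathbf{R} \times \mathbf{P}] + \mathbf{S}
\]
Next we can check that condition (IV) is satisfied as well, e.g.,
\[
\begin{aligned}
[S_x, R_y] &= [J_x - [\mathbf{R} \times \mathbf{P}]_x, R_y] = i\hbar R_z - [R_y P_z - R_z P_y, R_y] \\
&= i\hbar R_z - i\hbar R_z = 0
\end{aligned}
\]
12
Similarly to the operator of spin, the Newton-Wigner position operator is defined only
for systems whose mass spectrum is strictly positive.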
Theorem 4.1 All components of the position operator commute with each
other: $[R_i, R_j] = 0$.
Proof. First, we calculate the commutator $[HR_x, HR_y]$ which is related to
$[R_x, R_y]$ via formula14
\[
\begin{aligned}
[HR_x, HR_y] &= [HR_x, H]R_y + H[HR_x, R_y] \\
&= H[R_x, H]R_y + H[H, R_y]R_x + H^2[R_x, R_y] \\
&= i\hbar c^2(P_x R_y - P_y R_x) + H^2[R_x, R_y] \\
&= i\hbar c^2[\mathbf{P} \times \mathbf{R}]_z + H^2[R_x, R_y] \\
&= -i\hbar c^2 J_z + i\hbar c^2 S_z + H^2[R_x, R_y]
\end{aligned} \tag{4.34}
\]
Using formula (4.33) for the position operator, we obtain
\[
[HR_x, HR_y] = \left[-c^2 K_x - \frac{i\hbar c^2 P_x}{2H} + cF[\mathbf{P} \times \mathbf{W}]_x,\; -c^2 K_y - \frac{i\hbar c^2 P_y}{2H} + cF[\mathbf{P} \times \mathbf{W}]_y\right]
\]
13
Note that $[K_x P_y - K_y P_x, H] = -i\hbar(P_x P_y - P_y P_x) = 0$, therefore $[\mathbf{K} \times \mathbf{P}]$ commutes
with H, and operator $H^{-1}[\mathbf{K} \times \mathbf{P}]$ is Hermitian.
14
here we used (E.11)
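Non-zero contributions to this commutator are
\[
[-c^2 K_x, -c^2 K_y] = c^4[K_x, K_y] = -i\hbar c^2 J_z \tag{4.35}
\]
\[
\left[-\frac{i\hbar c^2 P_x}{2H}, -c^2 K_y\right] = -\frac{i\hbar c^4}{2}\left[K_y, \frac{P_x}{H}\right] = \frac{\hbar^2 c^4 P_y P_x}{2H^2} \tag{4.36}
\]
\[
\left[-c^2 K_x, -\frac{i\hbar c^2 P_y}{2H}\right] = -\frac{\hbar^2 c^4 P_y P_x}{2H^2} \tag{4.37}
\]
\[
\begin{aligned}
[-c^2 K_x, cF[\mathbf{P} \times \mathbf{W}]_y] &= \frac{c^3}{M}\left[K_x, \frac{P_z W_x - P_x W_z}{H + Mc^2}\right] \\
&= \frac{c^3}{M}\left(-\frac{P_z W_x - P_x W_z}{(H + Mc^2)^2}[K_x, H] + \frac{P_z[K_x, W_x]}{H + Mc^2} - \frac{[K_x, P_x]W_z}{H + Mc^2}\right) \\
&= i\hbar c^3(MF^2(P_z W_x - P_x W_z)P_x + FP_z W_0 c^{-1} - FHW_z c^{-2})
\end{aligned}
\]
\[
\begin{aligned}
[cF[\mathbf{P} \times \mathbf{W}]_x, -c^2 K_y] &= -c^3(-MF^2(P_y W_z - P_z W_y)[K_y, H] + FP_z[K_y, W_y] - F[K_y, P_y]W_z) \\
&= i\hbar c^3(-MF^2(P_y W_z - P_z W_y)P_y + FP_z W_0 c^{-1} - FHW_z c^{-2})
\end{aligned}
\]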
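Adding together two last results and using (4.30) we obtain
\[
\begin{aligned}
&[-c^2 K_x, cF[\mathbf{P} \times \mathbf{W}]_y] + [cF[\mathbf{P} \times \mathbf{W}]_x, -c^2 K_y] \\
&= i\hbar c^3(MF^2[\mathbf{P} \times [\mathbf{P} \times \mathbf{W}]]_z + 2FP_z W_0 c^{-1} - 2FHW_z c^{-2}) \\
&= i\hbar c^3(MF^2(P_z(\mathbf{P} \cdot \mathbf{W}) - W_z P^2) + 2FP_z W_0 c^{-1} - 2FHW_z c^{-2}) \\
&= i\hbar c^3(MF^2(P_z HW_0 c^{-1} - W_z P^2) + 2FP_z W_0 c^{-1} - 2FHW_z c^{-2}) \\
&= i\hbar c^2 MF^2 P_z W_0(H - 2(H + Mc^2)) + i\hbar cMF^2 W_z(-(H - Mc^2)(H + Mc^2) + 2H(H + Mc^2)) \\
&= i\hbar c^3 P_z W_0(Fc^{-1} - M^2F^2c) + \frac{i\hbar cW_z}{M}
\end{aligned} \tag{4.38}
\]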
One more commutator is
\[
\begin{aligned}
[cF[\mathbf{P} \times \mathbf{W}]_x, cF[\mathbf{P} \times \mathbf{W}]_y] &= c^2F^2[P_y W_z - P_z W_y, P_z W_x - P_x W_z] \\
&= c^2F^2(P_z P_y[W_z, W_x] - P_z^2[W_y, W_x] + P_x P_z[W_y, W_z]) \\
&= i\hbar cF^2(P_z P_y(HW_y - cW_0 P_y) + P_z^2(HW_z - cW_0 P_z) + P_x P_z(HW_x - cW_0 P_x)) \\
&= i\hbar cF^2(-W_0 cP_z(P_x^2 + P_y^2 + P_z^2) + HP_z(P_x W_x + P_y W_y + P_z W_z)) \\
&= i\hbar c^2F^2(-W_0 P_z P^2 + H^2 P_z W_0/c^2) \\
&= i\hbar F^2 W_0(-P_z(H^2 - M^2c^4) + H^2 P_z) \\
&= \frac{i\hbar c^4 W_0 P_z}{(H + Mc^2)^2}
\end{aligned} \tag{4.39}
\]
Now we collect all terms (4.35) - (4.39) and finally calculate
\[
\begin{aligned}
[HR_x, HR_y] &= -i\hbar c^2 J_z + i\hbar c^3 P_z W_0(Fc^{-1} - M^2F^2c) + \frac{i\hbar cW_z}{M} + i\hbar c^4 M^2F^2 W_0 P_z \\
&= -i\hbar c^2 J_z + i\hbar c^2\left(-\frac{P_z W_0}{M(H + Mc^2)} + \frac{W_z}{Mc}\right) = -i\hbar c^2 J_z + i\hbar c^2 S_z
\end{aligned}
\]
Comparing this with equation (4.34) we obtain
\[
H^2[R_x, R_y] = 0
\]
Operator $H^2 = M^2c^4 + P^2c^2$ has no zero eigenvalues, because we have assumed
that M is strictly positive. Thus we get the desired result
\[
[R_x, R_y] = 0
\]
4.3.4 Alternative set of basic operators

So far, our plan was to construct operators of observables from 10 basic
generators {P, J, K, H}. However, this set of operators is sometimes difficult
to use in calculations due to rather complicated commutation relations in
the Poincaré Lie algebra (3.52) - (3.58). For systems with a strictly positive
spectrum of the mass operator, we may find it more convenient to use an
alternative set of basic operators {P, R, S, M} whose commutation relations
are much simpler
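\[
[\mathbf{P}, M] = [\mathbf{R}, M] = [\mathbf{S}, M] = [R_i, R_j] = [P_i, P_j] = 0 \tag{4.40}
\]
\[
[R_i, P_j] = i\hbar\delta_{ij}
\]
\[
[\mathbf{P}, \mathbf{S}] = [\mathbf{R}, \mathbf{S}] = 0
\]
\[
[S_i, S_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} S_k \tag{4.41}
\]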
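Summarizing our previous results, we can express operators in this set through
generators of the Poincaré group15
\[
\mathbf{R} = -\frac{c^2}{2}(H^{-1}\mathbf{K} + \mathbf{K}H^{-1}) - \frac{c[\mathbf{P} \times \mathbf{W}]}{MH(Mc^2 + H)} \tag{4.42}
\]
\[
\mathbf{S} = \mathbf{J} - [\mathbf{R} \times \mathbf{P}] \tag{4.43}
\]
\[
M = +\frac{1}{c^2}\sqrt{H^2 - P^2c^2} \tag{4.44}
\]
Conversely, we can express generators of the Poincaré group {P, K, J, H}
through operators {P, R, S, M}. For the energy and angular momentum we
obtain
\[
H = +\sqrt{M^2c^4 + P^2c^2} \tag{4.45}
\]
\[
\mathbf{J} = [\mathbf{R} \times \mathbf{P}] + \mathbf{S} \tag{4.46}
\]
and the expression for the boost operator is
\[
\begin{aligned}
&-\frac{1}{2c^2}(\mathbf{R}H + H\mathbf{R}) - \frac{[\mathbf{P} \times \mathbf{S}]}{Mc^2 + H} \\
&= -\frac{1}{2}\left(-\frac{1}{2}(H^{-1}\mathbf{K}H + \mathbf{K}) - \frac{[\mathbf{P} \times \mathbf{S}]}{Mc^2 + H}\right) - \frac{1}{2}\left(-\frac{1}{2}(\mathbf{K} + H\mathbf{K}H^{-1}) - \frac{[\mathbf{P} \times \mathbf{S}]}{Mc^2 + H}\right) - \frac{[\mathbf{P} \times \mathbf{S}]}{Mc^2 + H} \\
&= \frac{1}{4}(H^{-1}\mathbf{K}H + \mathbf{K} + \mathbf{K} + H\mathbf{K}H^{-1}) \\
&= \mathbf{K} - \frac{i\hbar}{4}(H^{-1}\mathbf{P} - \mathbf{P}H^{-1}) \\
&= \mathbf{K}
\end{aligned} \tag{4.47}
\]
15
Operator P is the same in both sets.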
These two sets provide equivalent descriptions of Poincaré invariant theories.

Any function of operators from the set {P, J, K, H} can be expressed as a
function of operators from the set {P, R, S, M}, and vice versa. We will use
this property in subsections 4.3.5, 6.3.2, and 7.2.2.
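As an illustration of how generators rebuilt from the set {P, R, S, M} reproduce the Poincaré commutators, one can take a concrete realization and test (3.56) - (3.58) symbolically. The sketch below rests on assumptions that are not established in this section: it considers a spinless system (S = 0) in the momentum representation, where P acts as multiplication by p, R acts as iħ ∂/∂p, and M is a positive number m. The boost is then built from (4.47) and the angular momentum from (4.46). SymPy is used for the algebra, and all names are chosen here for illustration.

```python
import sympy as sp

px, py, pz = sp.symbols("p_x p_y p_z", real=True)
m, c, hbar = sp.symbols("m c hbar", positive=True)
p = (px, py, pz)
f = sp.Function("f")(px, py, pz)  # arbitrary momentum-space test function

omega = sp.sqrt(m**2 * c**4 + c**2 * (px**2 + py**2 + pz**2))  # eq. (4.45)


def P(i, g):
    """Momentum component: multiplication by p_i."""
    return p[i] * g


def H(g):
    """Energy: multiplication by omega."""
    return omega * g


def R(i, g):
    """Position component, assumed to act as i*hbar*d/dp_i."""
    return sp.I * hbar * sp.diff(g, p[i])


def K(i, g):
    """Boost from (4.47) with S = 0: K = -(RH + HR) / (2 c^2)."""
    return -(R(i, H(g)) + H(R(i, g))) / (2 * c**2)


def J(i, g):
    """Angular momentum from (4.46) with S = 0: J = [R x P]."""
    j, k = (i + 1) % 3, (i + 2) % 3
    return R(j, P(k, g)) - R(k, P(j, g))


def residuals():
    """Left side minus right side of (3.57), (3.58) and the xy case of (3.56)."""
    out = []
    for i in range(3):
        for j in range(3):
            delta = 1 if i == j else 0
            out.append(K(i, P(j, f)) - P(j, K(i, f)) + sp.I * hbar / c**2 * delta * H(f))
        out.append(K(i, H(f)) - H(K(i, f)) + sp.I * hbar * P(i, f))
    out.append(K(0, K(1, f)) - K(1, K(0, f)) + sp.I * hbar / c**2 * J(2, f))
    return [sp.simplify(sp.expand(r)) for r in out]


print(residuals())
```

If every residual simplifies to zero, the operators constructed from P, R, and M in this particular realization obey the boost commutators of the Poincaré Lie algebra.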
4.3.5 Canonical form and “power” of operators

In this subsection, we would like to mention some mathematical facts which
will be helpful in further work. When performing calculations with functions
of Poincaré generators, we meet a problem that the same operator can be
expressed in many equivalent functional forms. For example, according to
(3.58) Kx H and HKx − i~Px are two forms of the same operator. To solve
this non-uniqueness problem, we will agree to write operator factors always
in the canonical form, i.e., from left to right in the following order:16
\[
C(P_x, P_y, P_z, H),\ J_x,\ J_y,\ J_z,\ K_x,\ K_y,\ K_z \tag{4.48}
\]
16
Since $H$, $P_x$, $P_y$, and $P_z$ commute with each other, the part of the operator depending
on these factors can be written as an ordinary function of commuting arguments
$C(P_x, P_y, P_z, H)$, whose order is irrelevant.
Consider, for example, the non-canonical product $K_y P_y J_x$. To bring it to the
canonical form, we first move factor $P_y$ to the leftmost position using (3.57)
\[
K_y P_y J_x = P_y K_y J_x + [K_y, P_y]J_x = P_y K_y J_x - \frac{i\hbar}{c^2} HJ_x
\]
The second term on the right hand side is already in the canonical form, but
the first term is not. We need to switch factors $J_x$ and $K_y$ there:
\[
\begin{aligned}
K_y P_y J_x &= P_y J_x K_y + P_y[K_y, J_x] - \frac{i\hbar}{c^2} HJ_x \\
&= P_y J_x K_y - i\hbar P_y K_z - \frac{i\hbar}{c^2} HJ_x
\end{aligned} \tag{4.49}
\]
Now all terms in (4.49) are in the canonical form.
The procedure for bringing a general operator to the canonical form is
not more difficult than in the above example. If we call the original operator
the primary term, then this procedure can be formalized as the following
sequence of steps: First we transform the primary term itself to the canonical
form. We do that by switching the order of pairs of neighboring factors if
they occur in the “wrong” order. Let us call them the “left factor” L and
the “right factor” R. If R happens to commute with L, then such a change
has no other effect. If R does not commute with L, then the result of the
switch is LR → RL + [L, R]. This means that apart from switching L ↔ R
we must also add another secondary term to the original expression. The
secondary term is obtained from the primary term by replacing the product
LR with the commutator [L, R].17 At the end of the first step we have all
factors in the primary term in the canonical order. If during this process
all commutators [L, R] were zero, then we are done. If there were nonzero
commutators, then we have a number of additional secondary terms. In the
general case, these terms are not yet in the canonical form, and the above
procedure should be repeated for them resulting in tertiary, etc. terms until
all terms are in the canonical order.
17
The second and third terms on the right hand side of (4.49) are secondary.
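The reordering procedure described above can be sketched in a few lines of code. The version below is a simplified illustration under stated assumptions, not a general implementation: operators are restricted to polynomials in the ten generators (so the arbitrary functions C(Px, Py, Pz, H) are replaced by monomials in Px, Py, Pz, H), each term is stored as a tuple of generator names with a numerical coefficient, and hbar and c are given arbitrary numerical values. All names are chosen for this example.

```python
hbar, c = 1.0, 2.0  # illustrative values (assumption)

AXES = "xyz"
# Canonical order (4.48): translation generators first, then J, then K.
ORDER = ["Px", "Py", "Pz", "H", "Jx", "Jy", "Jz", "Kx", "Ky", "Kz"]
RANK = {g: n for n, g in enumerate(ORDER)}


def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def commutator(a, b):
    """[a, b] of two generators as {generator: coefficient}, from (3.52) - (3.58)."""
    ka, kb = a[0], b[0]
    i = AXES.index(a[1]) if len(a) > 1 else None
    j = AXES.index(b[1]) if len(b) > 1 else None
    if ka == "J" and kb in ("P", "J", "K"):
        return {kb + AXES[k]: 1j * hbar * eps(i, j, k) for k in range(3) if eps(i, j, k)}
    if ka == "K" and kb == "K":
        return {"J" + AXES[k]: -1j * hbar / c**2 * eps(i, j, k)
                for k in range(3) if eps(i, j, k)}
    if ka == "K" and kb == "P":
        return {"H": -1j * hbar / c**2} if i == j else {}
    if ka == "K" and kb == "H":
        return {"P" + a[1]: -1j * hbar}
    if (kb == "J" and ka in ("P", "K")) or (kb == "K" and ka in ("P", "H")):
        return {g: -k for g, k in commutator(b, a).items()}
    return {}


def normal_order(poly):
    """Bring {word: coefficient} to the canonical form using LR -> RL + [L, R]."""
    result = {}
    stack = list(poly.items())
    while stack:
        word, coeff = stack.pop()
        for n in range(len(word) - 1):
            L, R = word[n], word[n + 1]
            if RANK[L] > RANK[R]:
                # Switch the neighbors and add the secondary term with [L, R].
                stack.append((word[:n] + (R, L) + word[n + 2:], coeff))
                for g, k in commutator(L, R).items():
                    stack.append((word[:n] + (g,) + word[n + 2:], coeff * k))
                break
        else:
            result[word] = result.get(word, 0) + coeff
    return {w: k for w, k in result.items() if abs(k) > 1e-12}


def power(word):
    """Number of factors J and/or K in a term."""
    return sum(1 for g in word if g[0] in "JK")


for word, coeff in normal_order({("Ky", "Py", "Jx"): 1}).items():
    print(word, coeff, power(word))
```

Applied to the product Ky Py Jx, the routine returns one primary term of power 2 and two secondary terms of power 1, which can be compared with (4.49).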
Then, for each operator there is a unique representation as a sum of terms
in the canonical form
\[
F = C^{00} + \sum_{i=1}^{3} C_i^{10} J_i + \sum_{i=1}^{3} C_i^{01} K_i + \sum_{i,j=1}^{3} C_{ij}^{11} J_i K_j + \sum_{i,j=1;\, i \leq j}^{3} C_{ij}^{02} K_i K_j + \ldots \tag{4.50}
\]
where $C^{\alpha\beta} = C^{\alpha\beta}(P_x, P_y, P_z, H)$ are functions of translation generators.

We will also find useful the notion of power of terms in (4.50). We
will denote pow(A) the number of factors J and/or K in the term A. For
example, the first term on the right hand side of (4.50) has power 0. The
second and third terms have power 1, etc. The power of a general operator
(4.50) is defined as the maximum power among terms in F . For operators
considered earlier in this chapter, we have
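\[
\mathrm{pow}(H) = \mathrm{pow}(\mathbf{P}) = \mathrm{pow}(\mathbf{V}) = 0
\]
\[
\mathrm{pow}(W^0) = \mathrm{pow}(\mathbf{W}) = \mathrm{pow}(\mathbf{S}) = \mathrm{pow}(\mathbf{R}) = 1
\]
Lemma 4.2 If L and R are operators from the list (4.48) and $[L, R] \neq 0$,
then
\[
\mathrm{pow}([L, R]) = \mathrm{pow}(L) + \mathrm{pow}(R) - 1 \tag{4.51}
\]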
Proof. The commutator [L, R] is non-zero in two cases.

1. pow(L) = 1 and pow(R) = 0 (or, equivalently, pow(L) = 0 and
pow(R) = 1). From commutation relations (3.52), (3.55), (3.57), and (3.58),
it follows that non-vanishing commutators between Lorentz generators and
translation generators are functions of translation generators, i.e., have zero
power. The same is true for commutators between Lorentz generators and
arbitrary functions of translation generators C(Px , Py , Pz , H).
2. If pow(L) = 1 and pow(R) = 1, then pow([L, R]) = 1 follows directly
from commutators (3.53), (3.54) and (3.56).
One can easily see that (4.51) holds for more complex operators as well.
For example, if C and D are two functions of Px , Py , Pz , H, then using (E.11)
and [C, D] = 0 we obtain
[CJx , DJy ] = [CJx , D]Jy + D[CJx , Jy ] = C[Jx , D]Jy + DC[Jx , Jy ] + DJx [C, Jy ]
The power of the right hand side is 1, in agreement with (4.51). Similarly,
one can see that pow([CL, DR]) = 1 if L and R are any non-commuting
components of J or K. This proves formula (4.51) for all operators L and R
having power 0 or 1. Let us now try to extend this result to general operators.
The primary term for the product of two terms AB has exactly the same
number of Lorentz generators as the original operator, i.e., pow(A)+pow(B).
Lemma 4.3 For two terms A and B, either secondary term in the product
AB is zero or its power is equal to pow(A) + pow(B) -1.
Proof. Each secondary term results from replacing a product of two generators
LR in the primary term with their commutator [L, R]. According to
Lemma 4.2, if [L, R] ≠ 0 such a replacement decreases the power of the term
by one unit.
The powers of tertiary and higher order terms are less than the power of
secondary terms. Therefore, for any product AB its power is determined by
the primary term only
pow(AB) = pow(BA) = pow(A) + pow(B)
This implies
18
Theorem 4.4 For two non-commuting terms A and B
pow([A, B]) = pow(A) + pow(B) − 1

18
This theorem was used by Berg in ref. [Ber65].
Proof. In the commutator AB − BA, the primary term of AB cancels
out the primary term of BA. If [A, B] ≠ 0, then the secondary terms do not
cancel. Therefore, there is at least one non-zero secondary term whose power
is pow(A) + pow(B) − 1 according to Lemma 4.3.
Having at our disposal basic operators P, R, S, and M we can form a

number of Hermitian scalars, vectors, and tensors, which are classified in
table 4.1 according to their true/pseudo character and power:
Table 4.1: Scalar, vector, and tensor functions of basic operators
(all sums run over k = 1, 2, 3)

             | power 0        | power 1                           | power 2
True scalar  | P^2; M         | P·R + R·P                         | R^2; S^2
Pseudoscalar |                | P·S                               | R·S
True vector  | P              | R; [P × S]                        | [R × S]
Pseudovector |                | S; [P × R]                        |
True tensor  | P_i P_j        | Σ_k ε_ijk S_k; P_i R_j + R_j P_i  | S_i S_j + S_j S_i; R_i R_j
Pseudotensor | Σ_k ε_ijk P_k  | Σ_k ε_ijk R_k; P_i S_j            | R_i S_j
4.3.6 Uniqueness of the spin operator

Let us now prove that (4.27) is the unique spin operator satisfying conditions
(I) - (IV) from subsection 4.3.1. Suppose that there is another spin operator
S′ satisfying the same conditions. Denoting the power of the spin components
by p = pow(S′_x) = pow(S′_y) = pow(S′_z) we obtain from (4.23) and Theorem
4.4
\[ \mathrm{pow}([S'_x, S'_y]) = \mathrm{pow}(S'_z) \]
\[ 2p - 1 = p \]
Therefore, the components of S′ must have power 1. The most general form
of a pseudovector operator having power 1 can be deduced from Table 4.1
S′ = b(M, P 2 )S + f (M, P 2)[P × R] + e(M, P 2 )(S · P)P

where b, f , and e are arbitrary real functions.19 From condition (III) we

obtain f (M, P 2) = 0. Comparing commutator20
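\[
\begin{aligned}
[S'_x, S'_y] &= [bS_x + e(\mathbf{S}\cdot\mathbf{P})P_x,\; bS_y + e(\mathbf{S}\cdot\mathbf{P})P_y] \\
&= b^2[S_x, S_y] - i\hbar eb P_x[\mathbf{S}\times\mathbf{P}]_y + i\hbar eb P_y[\mathbf{S}\times\mathbf{P}]_x \\
&= i\hbar b^2 S_z - i\hbar eb(\mathbf{P}\times[\mathbf{S}\times\mathbf{P}])_z \\
&= i\hbar\left(b^2 S_z - ebP^2 S_z + eb(\mathbf{S}\cdot\mathbf{P})P_z\right)
\end{aligned}
\]
with the requirement (II)
\[ [S'_x, S'_y] = i\hbar S'_z = i\hbar\left(bS_z + e(\mathbf{S}\cdot\mathbf{P})P_z\right) \]
we obtain the system of equations
\[ b^2 - ebP^2 = b, \qquad eb = e \]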
whose non-trivial solution is b = 1 and e = 0. Therefore, the spin operator
is unique S′ = S.
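The case analysis behind this solution can be spelled out as a short worked step. It uses only the two equations of the system above and treats $P^2$ as an arbitrary positive value; the label "trivial" for the vanishing operator is our reading of the text.

```latex
\begin{align*}
eb = e \;&\Rightarrow\; e(b - 1) = 0 \;\Rightarrow\; e = 0 \ \text{ or } \ b = 1, \\
e = 0: \quad & b^2 = b \;\Rightarrow\; b = 0 \ (\text{trivial, } \mathbf{S}' = 0) \ \text{ or } \ b = 1, \\
b = 1: \quad & 1 - eP^2 = 1 \;\Rightarrow\; e = 0 .
\end{align*}
```

Both branches therefore lead to the same non-trivial pair $b = 1$, $e = 0$.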
4.3.7 Uniqueness of the position operator

Assume that in addition to the Newton-Wigner position operator R there
is another position operator R′ satisfying all properties (IV) - (VI). Then
it follows from condition (V) that R′ has power 1. The most general true
vector with this property is
R′ = a(P 2 , M)R + d(P 2 , M)[S × P] + g(P 2, M)P

where a, d, and g are arbitrary real functions. From condition (IV) it follows,
for example, that
0 = [Rx′ , Sy ] = d(P 2, M)[Sy Pz − Sz Py , Sy ] = i~d(P 2 , M)Py Sx

19
These functions depend on scalars P 2 and M in order to satisfy condition (I).
20
Here we used equation [S, (S · P)] = i~[S × P].
which implies that d(P 2 , M) = 0. From (V) we obtain
i~ = [Rx′ , Px ] = a(P 2 , M)[Rx , Px ] = i~a(P 2 , M)
and a(P 2 , M) = 1. Therefore the most general form of the position operator
is
R′ = R + g(P 2, M)P (4.52)
In Theorem 17.1 we will consider boost transformations for times and posi-
tions of events in non-interacting systems of particles. If the term g(P 2, M)P
in (4.52) were non-zero, we would not get an agreement with Lorentz trans-
formations known from Einstein’s special relativity.21 Therefore, we will
assume that the factor g(P 2, M) vanishes and R′ = R. So, from now on, we
will use the Newton-Wigner operator R as the representative of the position
observable.
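It follows from commutator (4.25) that
\[ [R_x, P_x^n] = i\hbar n P_x^{n-1} \qquad (4.53) \]
so for any function $f(P_x)$
\[ [R_x, f(P_x)] = i\hbar \frac{\partial f(P_x)}{\partial P_x} \qquad (4.54) \]
For example,
\[
[\mathbf{R}, H] = [\mathbf{R}, +\sqrt{P^2c^2 + M^2c^4}]
= i\hbar \frac{\partial \sqrt{P^2c^2 + M^2c^4}}{\partial \mathbf{P}}
= \frac{i\hbar \mathbf{P}c^2}{\sqrt{P^2c^2 + M^2c^4}} = i\hbar\frac{\mathbf{P}c^2}{H} = i\hbar\mathbf{V}
\]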
where V is the velocity operator (4.5). Therefore, as expected, for an observer
shifted in time by the amount t, the position of the physical system appears
shifted by Vt:
21
see (17.5) - (17.8) and Appendix I.2

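\[
\begin{aligned}
\mathbf{R}(t) &= \exp\left(\frac{i}{\hbar}Ht\right)\mathbf{R}\exp\left(-\frac{i}{\hbar}Ht\right) \qquad (4.55) \\
&= \mathbf{R} + \frac{i}{\hbar}[H, \mathbf{R}]t = \mathbf{R} + \mathbf{V}t \qquad (4.56)
\end{aligned}
\]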
Thus the center of mass R of any isolated physical system moves with constant
velocity V, as expected. This result is independent of the internal
composition of the system and of interactions between its different parts.
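The relation V = Pc²/H and the uniform motion (4.56) can be illustrated with classical numbers. The following is a minimal sketch, not part of the original derivation: it replaces the operators by ordinary real numbers, works in one dimension, and uses units with c = 1 by default. All function names are hypothetical.

```python
import math

C = 1.0  # speed of light; natural units are an assumption of this sketch


def energy(p, m, c=C):
    """Free energy H = sqrt(p^2 c^2 + m^2 c^4) for a momentum value p."""
    return math.sqrt(p * p * c * c + m * m * c ** 4)


def velocity(p, m, c=C):
    """Velocity v = p c^2 / H, the classical counterpart of the operator V."""
    return p * c * c / energy(p, m, c)


def numerical_dH_dp(p, m, h=1e-6, c=C):
    """Central finite difference of H with respect to p, mirroring (4.54)."""
    return (energy(p + h, m, c) - energy(p - h, m, c)) / (2.0 * h)


def position(r0, p, m, t, c=C):
    """Uniform motion r(t) = r0 + v t, the classical analog of (4.56)."""
    return r0 + velocity(p, m, c) * t


if __name__ == "__main__":
    p, m = 0.75, 1.0
    print("v       =", velocity(p, m))
    print("dH/dp   =", numerical_dH_dp(p, m))
    print("r(t=10) =", position(2.0, p, m, 10.0))
```

The finite-difference derivative of the energy agrees with pc²/H, which is the classical content of the commutator [R, H] = iħV, and the velocity stays below c for any finite momentum.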
4.3.8 Boost transformations of the position operator

Let us now find how the vector of position (4.32) transforms with respect to
boosts, i.e., we are looking for the connection between position observables
in two inertial reference frames moving with respect to each other. For
simplicity, we consider a massive system without spin, so that the center-of-mass
position in the reference frame at rest O can be written as
\[ \mathbf{R} = -\frac{c^2}{2}\left(\mathbf{K}H^{-1} + H^{-1}\mathbf{K}\right) \qquad (4.57) \]
First, we need to determine boost transformations of the boost operator
itself. For example, the transformation of the component K_y with respect to
the boost along the x-axis is obtained by using equations (E.13), (3.54), and
(3.56)
\[
\begin{aligned}
K_y(\theta) &= e^{-\frac{ic}{\hbar}K_x\theta} K_y e^{\frac{ic}{\hbar}K_x\theta} \\
&= K_y - \frac{ic\theta}{\hbar}[K_x, K_y] - \frac{c^2\theta^2}{2!\hbar^2}[K_x, [K_x, K_y]] + \frac{ic^3\theta^3}{3!\hbar^3}[K_x, [K_x, [K_x, K_y]]] + \ldots \\
&= K_y - \frac{\theta}{c}J_z + \frac{\theta^2}{2!}K_y - \frac{\theta^3}{3!c}J_z + \ldots = K_y\cosh\theta - \frac{1}{c}J_z\sinh\theta
\end{aligned}
\]
Then the y-component of position in the reference frame O ′ moving along
the x-axis is22
22
Here we used (4.4).
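\[
\begin{aligned}
R_y(\theta) &= e^{-\frac{ic}{\hbar}K_x\theta} R_y e^{\frac{ic}{\hbar}K_x\theta}
= -\frac{c^2}{2} e^{-\frac{ic}{\hbar}K_x\theta}\left(K_y H^{-1} + H^{-1}K_y\right)e^{\frac{ic}{\hbar}K_x\theta} \\
&= -\frac{c^2}{2}\left(K_y\cosh\theta - \frac{J_z}{c}\sinh\theta\right)\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1} \\
&\quad - \frac{c^2}{2}\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1}\left(K_y\cosh\theta - \frac{J_z}{c}\sinh\theta\right) \qquad (4.58)
\end{aligned}
\]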
2 c
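Similarly, for the x- and z-components
\[
\begin{aligned}
R_x(\theta) &= -\frac{c^2}{2}K_x\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1}
- \frac{c^2}{2}\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1}K_x \qquad (4.59) \\
R_z(\theta) &= -\frac{c^2}{2}\left(K_z\cosh\theta + \frac{J_y}{c}\sinh\theta\right)\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1} \\
&\quad - \frac{c^2}{2}\left(H\cosh\theta - cP_x\sinh\theta\right)^{-1}\left(K_z\cosh\theta + \frac{J_y}{c}\sinh\theta\right) \qquad (4.60)
\end{aligned}
\]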
These transformations do not resemble usual Lorentz formulas from special

relativity.23 This is not surprising, because the Newton-Wigner position op-
erator does not constitute a 3-vector component of any 4-vector quantity.24
Furthermore, we can find the time dependence of the position operator in
the moving reference frame O ′. We use label t′ to indicate the time measured
in the reference frame O ′ by its own clock and notice that the time translation
generator H ′ in O ′ is different from that in O
\[ H' = e^{-\frac{i}{\hbar}K_x c\theta} H e^{\frac{i}{\hbar}K_x c\theta} \qquad (4.61) \]
Then we obtain
23
See also ref. [MM97] and equations (17.22) - (17.24), which are classical (~ → 0) limits
of (4.58) - (4.60).
24
In our formalism, there is no “time operator” which could serve as a 4th component
of such a 4-vector. The difference between special relativity and our approach to space
and time will be discussed in chapter 17.
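\[
\begin{aligned}
\mathbf{R}(\theta, t') &= e^{\frac{i}{\hbar}H't'}\mathbf{R}(\theta)e^{-\frac{i}{\hbar}H't'}
= e^{\frac{i}{\hbar}H't'} e^{-\frac{ic}{\hbar}K_x\theta}\mathbf{R}e^{\frac{ic}{\hbar}K_x\theta} e^{-\frac{i}{\hbar}H't'} \\
&= \left(e^{-\frac{ic}{\hbar}K_x\theta} e^{\frac{i}{\hbar}Ht'} e^{\frac{ic}{\hbar}K_x\theta}\right) e^{-\frac{ic}{\hbar}K_x\theta}\mathbf{R}e^{\frac{ic}{\hbar}K_x\theta} \left(e^{-\frac{ic}{\hbar}K_x\theta} e^{-\frac{i}{\hbar}Ht'} e^{\frac{ic}{\hbar}K_x\theta}\right) \\
&= e^{-\frac{ic}{\hbar}K_x\theta} e^{\frac{i}{\hbar}Ht'}\mathbf{R}e^{-\frac{i}{\hbar}Ht'} e^{\frac{ic}{\hbar}K_x\theta}
= e^{-\frac{ic}{\hbar}K_x\theta}(\mathbf{R} + \mathbf{V}t')e^{\frac{ic}{\hbar}K_x\theta} \\
&= \mathbf{R}(\theta) + \mathbf{V}(\theta)t' \qquad (4.62)
\end{aligned}
\]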
where velocity V(θ) in the reference frame O ′ is given by equations (4.6) -

(4.8). As expected, this time evolution in the moving frame can be obtained
by a boost transformation from the trajectory (4.56) in the rest frame.
Chapter 5
SINGLE PARTICLES
The electron is as inexhaustible as the atom...

V. I. Lenin
Our discussion in the preceding chapter could be universally applied to any

isolated physical system, be it an electron or the Solar System. We have
not specified how the system was put together, and we considered only total
observables pertinent to the system as a whole. The results we obtained are
not surprising: the total energy, momentum, and angular momentum of our
system are conserved, and the center of mass is moving with a constant speed
along a straight line (4.56). Although the time evolutions of these total ob-
servables are rather uneventful, the internal structure of complex (compound)
physical systems may undergo dramatic changes due to collisions, reactions,
decays, etc. The description of such transformations is the most interesting
and challenging part of physics. To address such problems, we need to un-
derstand how complex physical systems are put together. The central idea
of this book is that all material objects are composed of elementary particles
i.e., localizable and countable systems lacking internal structure.1 In this
chapter we will study these most fundamental ingredients of nature.
In subsection 3.2.4 we have established that the Hilbert space of any
physical system carries a unitary representation of the Poincaré group. Any
1
This is in contrast to the wide-spread belief that the fundamental ingredients of nature
are continuous fields. See discussion in section 17.4.
such representation can be decomposed into a direct sum of irreducible rep-

resentations.2 Elementary particles are defined as physical systems for which
this sum has only one summand. Therefore, by definition, the Hilbert space
of a stable elementary particle carries an irreducible unitary representation
of the Poincaré group. So, in a sense, elementary particles have the simplest
non-decomposable spaces of states. The classification of irreducible
representations of the Poincaré group and their Hilbert spaces was given by
Wigner [Wig39]. From Schur’s first Lemma (Lemma H.1) we know that in
any irreducible unitary representation of the Poincaré group, the two Casimir
operators M and S² act as multiplication by a constant. So, all different
irreducible representations and, therefore, all elementary particles, can be
classified according to the values of these two constants - the mass and the
spin squared. Of course, there are many other parameters describing elemen-
tary particles, such as charge, magnetic moment, strangeness, etc. But all of
them are related to the manner in which particles participate in interactions.
In the world where all interactions are “turned off,” particles have just two
intrinsic properties – mass and spin.
There are only six known stable elementary particles for which the clas-
sification by mass and spin applies (see Table 5.1). Some reservations should
be made about this statement. First, for each particle in the table (except
photons) there is a corresponding antiparticle having the same mass and spin
but opposite values of the electric, baryon, and lepton charges.3 So, if we also
count antiparticles, there are eleven different stable particle species. Second,
there are many more particles, like muons, pions, neutrons, etc., which are
usually called elementary but all of them are unstable and eventually decay
into particles shown in Table 5.1. This does not mean that unstable parti-
cles are “made of” stable particles or that they are less elementary. Simply,
the stable particles shown in the table have the lowest masses and there are
no lighter species to which they could decay without violating conservation
laws. Third, we do not list in Table 5.1 quarks, gluons, gravitons, and other
particles predicted theoretically, but never directly observed in experiment.
Fourth, strictly speaking, the photon is not a true elementary particle as
it is not described by an irreducible representation of the Poincaré group.
We will see in subsection 5.3.3 that the photon is described by a reducible
representation of the Poincaré group which is a direct sum of two irreducible
2
See Appendix H.1.
3
see subsection 8.2.1 for conservation laws associated with the charges
representations with helicities +ħ and −ħ. Fifth, neutrinos are not truly
stable elementary particles. According to recent experiments, the three flavors
of neutrinos oscillate into one another over time. Finally, it may
be true that protons are not elementary particles as well. They are usually
regarded as being composed of quarks. This leaves us with just two truly
stable, elementary, and directly observable particles, which are the electron
and the positron.
Table 5.1: Properties of stable elementary particles

Particle          | Mass           | Spin/helicity
Electron          | 0.511 MeV/c^2  | ħ/2
Proton            | 938.3 MeV/c^2  | ħ/2
Electron neutrino | < 1 eV/c^2     | ħ/2
Muon neutrino     | < 1 eV/c^2     | ħ/2
Tau neutrino      | < 1 eV/c^2     | ħ/2
Photon            | 0              | ±ħ
In the following we will denote by m the eigenvalue of the mass operator in

the Hilbert space of an elementary particle and consider separately two cases:
massive particles (m > 0) and massless particles (m = 0).4
5.1 Massive particles

5.1.1 Irreducible representations of the Poincaré group
The Hilbert space H of a massive elementary particle carries a unitary
irreducible representation Ug of the Poincaré group characterized by a single
positive eigenvalue m of the mass operator M. As discussed in subsection
4.3.3, the position operator R is well-defined in this case. Components of
the position and momentum operators satisfy commutation relations of the
6-dimensional Heisenberg Lie algebra 5
[Pi , Pj ] = [Ri , Rj ] = 0
4
Wigner’s classification also permits irreducible representations with negative and imag-
inary values of m, but there is no evidence that such particles exist in nature. We skip
their discussion in this book.
5
see equations (3.55), (4.25) and Theorem 4.1
[R_i, P_j] = iħδ_ij
Then, according to the Stone-von Neumann theorem H.3, each component

Px , Py , Pz , Rx , Ry , Rz has continuous spectrum occupying entire real axis R.
The components of the momentum operator Px , Py , Pz commute with each
other. So, the spectrum of the vector operator P is the 3-dimensional linear
space R3 .6 Thus, there exists a decomposition of unity associated with the
spectrum of P and the Hilbert space H can be represented as a direct sum
of corresponding eigensubspaces Hp of P
H = ⊕p∈R3 Hp
This implies that the 1-particle Hilbert space H is infinite-dimensional. It

can be said that the number of mutually orthogonal basis vectors in this
space is no less than the “number of distinct points in the infinite 3D space
R3 ”, i.e., uncountable.
Let us first focus on the subspace H0 with zero momentum. This subspace
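is invariant with respect to rotations, because for any vector $|\mathbf{0}\rangle$ from this
subspace the result of rotation $e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{0}\rangle$ belongs to $\mathcal{H}_0$ (see footnote 7)
\[
\begin{aligned}
\mathbf{P}e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{0}\rangle
&= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}} e^{\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}\mathbf{P}e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{0}\rangle \\
&= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}\left(\left(\mathbf{P}\cdot\frac{\vec{\phi}}{\phi}\right)\frac{\vec{\phi}}{\phi}(1 - \cos\phi) + \mathbf{P}\cos\phi - \left[\mathbf{P}\times\frac{\vec{\phi}}{\phi}\right]\sin\phi\right)|\mathbf{0}\rangle \\
&= 0 \qquad (5.1)
\end{aligned}
\]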
This means that representation of the rotation subgroup defined in the full
Hilbert space H induces a unitary representation Vg of this subgroup in H0 .
The generators of rotations are, of course, represented by the angular
momentum vector J in H. However, in the subspace H0 , they can be equiv-
alently represented by the vector of spin S, because
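\[ S_z|\mathbf{0}\rangle = J_z|\mathbf{0}\rangle - [\mathbf{R}\times\mathbf{P}]_z|\mathbf{0}\rangle = J_z|\mathbf{0}\rangle - (R_xP_y - R_yP_x)|\mathbf{0}\rangle = J_z|\mathbf{0}\rangle \]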

6
The same is true for the spectrum of the position operator R.
7
Here we used equation (4.2).
We will show later that the representation of the full Poincaré group
is irreducible if and only if the representation Vg of the rotation group in
H0 is irreducible. So, we will be interested only in such irreducible rep-
resentations Vg . The classification of unitary irreducible representations of
the rotation group (single- and double-valued)8 depends on one integer or
half-integer parameter s that we will identify with particle’s spin. The triv-
ial one-dimensional representation is characterized by spin zero (s = 0) and
corresponds to a spinless particle. The two-dimensional representation corre-
sponds to particles with spin one-half (s = 1/2). The 3-dimensional represen-
tation corresponds to particles with spin one (s = 1), etc. Correspondingly,
the dimension of the zero-momentum subspace H0 will be 1,2,3, . . ..
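The counting of dimensions 2s + 1 can be made concrete by building the spin matrices explicitly. The sketch below is an illustration, not the book's construction: it uses the standard ladder-operator matrix elements, sets ħ = 1, and orders the basis as σ = s, s − 1, ..., −s. These choices and all helper names are assumptions of the example.

```python
import math
from fractions import Fraction


def spin_matrices(s):
    """Return (Sx, Sy, Sz) for spin s (hbar = 1) as nested lists of complex."""
    s = Fraction(s)
    dim = int(2 * s) + 1
    sigmas = [s - k for k in range(dim)]  # sigma = s, s-1, ..., -s
    Sz = [[complex(float(sigmas[i])) if i == j else 0j for j in range(dim)]
          for i in range(dim)]
    # Raising operator: <sigma+1| S+ |sigma> = sqrt(s(s+1) - sigma(sigma+1))
    Sp = [[0j] * dim for _ in range(dim)]
    for j in range(1, dim):
        sig = sigmas[j]
        Sp[j - 1][j] = complex(math.sqrt(float(s * (s + 1) - sig * (sig + 1))))
    Sm = [[Sp[j][i].conjugate() for j in range(dim)] for i in range(dim)]
    Sx = [[(Sp[i][j] + Sm[i][j]) / 2 for j in range(dim)] for i in range(dim)]
    Sy = [[(Sp[i][j] - Sm[i][j]) / 2j for j in range(dim)] for i in range(dim)]
    return Sx, Sy, Sz


def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]


def spin_squared(Sx, Sy, Sz):
    """Casimir matrix S^2 = Sx^2 + Sy^2 + Sz^2."""
    n = len(Sz)
    X, Y, Z = matmul(Sx, Sx), matmul(Sy, Sy), matmul(Sz, Sz)
    return [[X[i][j] + Y[i][j] + Z[i][j] for j in range(n)] for i in range(n)]


def max_abs_diff(A, B):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    for s in (0, 0.5, 1, 1.5):
        Sx, Sy, Sz = spin_matrices(s)
        print("s =", s, "dimension =", len(Sz))
```

For each s the matrices have dimension 2s + 1, satisfy [S_x, S_y] = iS_z, and give S² = s(s + 1) times the identity, matching the eigenvalue equations listed for the basis |0, σ⟩.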
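It is customary to choose a basis of eigenvectors of S_z in H_0 and denote
these vectors by $|\mathbf{0}, \sigma\rangle$, i.e.,
\[
\begin{aligned}
\mathbf{P}|\mathbf{0}, \sigma\rangle &= 0 \\
H|\mathbf{0}, \sigma\rangle &= mc^2|\mathbf{0}, \sigma\rangle \\
M|\mathbf{0}, \sigma\rangle &= m|\mathbf{0}, \sigma\rangle \\
S^2|\mathbf{0}, \sigma\rangle &= \hbar^2 s(s+1)|\mathbf{0}, \sigma\rangle \\
S_z|\mathbf{0}, \sigma\rangle &= \hbar\sigma|\mathbf{0}, \sigma\rangle
\end{aligned}
\]
where σ = −s, −s + 1, . . . , s − 1, s. The action of a rotation on these basis
vectors is
\[
e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{0}, \sigma\rangle
= e^{-\frac{i}{\hbar}\mathbf{S}\cdot\vec{\phi}}|\mathbf{0}, \sigma\rangle
= \sum_{\sigma'=-s}^{s} D^s_{\sigma'\sigma}(\vec{\phi})|\mathbf{0}, \sigma'\rangle \qquad (5.2)
\]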
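where $D^s$ are (2s + 1) × (2s + 1) matrices of the representation V_g. This
definition implies that9
\[
\begin{aligned}
\sum_{\sigma'=-s}^{s} D^s_{\sigma'\sigma}(\vec{\phi}_1\vec{\phi}_2)|\mathbf{0}, \sigma'\rangle
&= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}_1} e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}_2}|\mathbf{0}, \sigma\rangle
= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}_1}\sum_{\sigma''=-s}^{s} D^s_{\sigma''\sigma}(\vec{\phi}_2)|\mathbf{0}, \sigma''\rangle \\
&= \sum_{\sigma'=-s}^{s}\left(\sum_{\sigma''=-s}^{s} D^s_{\sigma'\sigma''}(\vec{\phi}_1)D^s_{\sigma''\sigma}(\vec{\phi}_2)\right)|\mathbf{0}, \sigma'\rangle
\end{aligned}
\]
and
\[
D^s_{\sigma'\sigma}(\vec{\phi}_1\vec{\phi}_2) = \sum_{\sigma''=-s}^{s} D^s_{\sigma'\sigma''}(\vec{\phi}_1)D^s_{\sigma''\sigma}(\vec{\phi}_2)
\]
which means that matrices $D^s$ furnish a representation of the rotation group.
8
see Appendix H.5
9
Here $\vec{\phi}_1\vec{\phi}_2$ denotes the composition of two rotations parameterized by vectors $\vec{\phi}_1$ and
$\vec{\phi}_2$, respectively.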
5.1.2 Momentum-spin basis

In the preceding subsection we constructed basis vectors $|\mathbf{0}, \sigma\rangle$ in the subspace
H_0. We also need basis vectors $|\mathbf{p}, \sigma\rangle$ in other subspaces H_p with p ≠ 0.
We will build the basis $|\mathbf{p}, \sigma\rangle$ by propagating state vectors $|\mathbf{0}, \sigma\rangle$ to other points
in the 3D momentum space using pure boost transformations.10 The unique
pure boost, which transforms momentum 0 to p, will be denoted by λ_p.11
The corresponding unitary operator in the Hilbert space is12
\[ U(\lambda_{\mathbf{p}}; 0, 0) \equiv e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}} \qquad (5.3) \]
where
\[ \vec{\theta}_{\mathbf{p}} = \frac{\mathbf{p}}{p}\sinh^{-1}\frac{p}{mc} \qquad (5.4) \]
Therefore we can write

10
Of course, this choice is rather arbitrary. A different choice of transformations con-
necting momenta 0 and p (e.g., boosts coupled with rotations) would result in a different
but equivalent basis set. However, once the basis set has been fixed, all formulas should
be written with respect to it.
11
see Fig. 5.1
12
for this notation see (3.60)
Figure 5.1: Construction of the momentum-spin basis for a spin one-half
particle. Spin eigenvectors (with eigenvalues σ = −1/2, 1/2) at zero momentum
are propagated to non-zero momenta p and p′ by using pure boosts λ_p and
λ_p′, respectively. As discussed in subsection 5.1.3, there is a unique pure
boost L which connects momenta p and p′.
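\[ |\mathbf{p}, \sigma\rangle = N(\mathbf{p})U(\lambda_{\mathbf{p}}; 0, 0)|\mathbf{0}, \sigma\rangle = N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle \qquad (5.5) \]
where N(p) is a normalization factor. The explicit expression for N(p) will
be given a bit later in equation (5.27).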
To verify that vector (5.5) is indeed an eigenvector of the momentum
operator with eigenvalue p we use equation (4.3)
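\[
\begin{aligned}
\mathbf{P}|\mathbf{p}, \sigma\rangle
&= N(\mathbf{p})\mathbf{P}e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle
= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}} e^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\mathbf{P}e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\frac{\vec{\theta}_{\mathbf{p}}}{c\theta_{\mathbf{p}}}H\sinh\theta_{\mathbf{p}}|\mathbf{0}, \sigma\rangle
= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\frac{\vec{\theta}_{\mathbf{p}}}{\theta_{\mathbf{p}}}mc\sinh\theta_{\mathbf{p}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})\mathbf{p}e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle = \mathbf{p}|\mathbf{p}, \sigma\rangle \qquad (5.6)
\end{aligned}
\]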
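The last step of (5.6) relies on definition (5.4): the rapidity is chosen so that mc sinh θ_p reproduces the momentum p. A minimal numerical sketch of this map and its inverse follows; momenta are plain 3-tuples of floats, c = 1 by default, and the function names are hypothetical.

```python
import math


def rapidity_vector(p, m, c=1.0):
    """Rapidity theta_p = (p/|p|) asinh(|p| / (m c)) from equation (5.4)."""
    norm = math.sqrt(sum(x * x for x in p))
    if norm == 0.0:
        return (0.0, 0.0, 0.0)
    theta = math.asinh(norm / (m * c))
    return tuple(theta * x / norm for x in p)


def momentum_from_rapidity(theta_vec, m, c=1.0):
    """Inverse map used in (5.6): p = (theta/|theta|) m c sinh|theta|."""
    theta = math.sqrt(sum(x * x for x in theta_vec))
    if theta == 0.0:
        return (0.0, 0.0, 0.0)
    return tuple(m * c * math.sinh(theta) * x / theta for x in theta_vec)


def energy_from_rapidity(theta_vec, m, c=1.0):
    """Energy of the boosted state: m c^2 cosh|theta|."""
    theta = math.sqrt(sum(x * x for x in theta_vec))
    return m * c * c * math.cosh(theta)


if __name__ == "__main__":
    p, m = (0.3, -0.4, 1.2), 2.0
    theta_p = rapidity_vector(p, m)
    print(momentum_from_rapidity(theta_p, m))
    print(energy_from_rapidity(theta_p, m))
```

Round-tripping a momentum through the rapidity returns the same vector, and mc² cosh θ_p coincides with the one-particle energy √(m²c⁴ + p²c²).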
Let us now find the action of the spin component Sz on the basis vectors
|p, σi13
13
Here we use (4.27) and take into account that W0 |0, σi = P|0, σi = 0 and H|0, σi =
M c2 |0, σi. We also use boost transformations (4.3) and (4.4) of the energy-momentum 4-
vector (H, cP) and similar formulas for the Pauli-Lubanski 4-vector (W0 , W). For brevity,
we denote θz the z-component of the vector ~θp and θ its absolute value.
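\[
\begin{aligned}
S_z|\mathbf{p}, \sigma\rangle
&= N(\mathbf{p})S_z e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}} e^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\left(\frac{W_z}{Mc} - \frac{W_0P_z}{M(Mc^2 + H)}\right)e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\Bigg(\frac{W_z + \frac{\theta_z}{\theta}\left[\left(\mathbf{W}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) - W_0\sinh\theta\right]}{Mc} \\
&\qquad - \frac{\left(W_0\cosh\theta - \left(\mathbf{W}\cdot\frac{\vec{\theta}}{\theta}\right)\sinh\theta\right)\left(P_z + \frac{\theta_z}{\theta}\left[\left(\mathbf{P}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) - \frac{1}{c}H\sinh\theta\right]\right)}{M\left(Mc^2 + H\cosh\theta - c\left(\mathbf{P}\cdot\frac{\vec{\theta}}{\theta}\right)\sinh\theta\right)}\Bigg)|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\left(\frac{W_z + \frac{\theta_z}{\theta}\left(\mathbf{W}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1)}{Mc} - \frac{\left(\mathbf{W}\cdot\frac{\vec{\theta}}{\theta}\right)\sinh\theta\left(\frac{\theta_z}{\theta}Mc\sinh\theta\right)}{M(Mc^2 + Mc^2\cosh\theta)}\right)|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\left(\frac{W_z}{Mc} + \frac{\theta_z}{\theta}\left(\mathbf{W}\cdot\frac{\vec{\theta}}{\theta}\right)\left(\frac{\cosh\theta - 1}{Mc} - \frac{\sinh^2\theta}{Mc(1 + \cosh\theta)}\right)\right)|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\frac{W_z}{Mc}|\mathbf{0}, \sigma\rangle
= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}S_z|\mathbf{0}, \sigma\rangle
= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}\hbar\sigma|\mathbf{0}, \sigma\rangle \\
&= \hbar\sigma|\mathbf{p}, \sigma\rangle
\end{aligned}
\]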
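So, $|\mathbf{p}, \sigma\rangle$ are eigenvectors of the momentum, energy, and z-component of
spin14
\[
\begin{aligned}
\mathbf{P}|\mathbf{p}, \sigma\rangle &= \mathbf{p}|\mathbf{p}, \sigma\rangle \\
H|\mathbf{p}, \sigma\rangle &= \omega_{\mathbf{p}}|\mathbf{p}, \sigma\rangle \\
M|\mathbf{p}, \sigma\rangle &= m|\mathbf{p}, \sigma\rangle \\
S^2|\mathbf{p}, \sigma\rangle &= \hbar^2 s(s+1)|\mathbf{p}, \sigma\rangle \\
S_z|\mathbf{p}, \sigma\rangle &= \hbar\sigma|\mathbf{p}, \sigma\rangle
\end{aligned}
\]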
14
Note that eigenvectors of the spin operator S were obtained here by applying pure
boosts to vectors at p = 0. A different set of transformations connecting bases in points 0
and p (see footnote on page 136) would result in a different momentum-spin basis |p, σi
and in a different spin operator S (see [KP91]). Does this contradict our statement about
the uniqueness of the spin operator in subsection 4.3.6? Not really. The point is that the
alternative spin operator S′ (and the corresponding alternative position operator R′ ) will
not be expressed as a function of basic generators of the Poincaré group. This condition
was important for our proof of the uniqueness of S (and R) in section 4.3.
Figure 5.2: Mass hyperboloid in the energy-momentum space for massive
particles and the zero-mass cone for m = 0.
[Axes: energy H versus momentum P. The points (ω_p, p) and (ω_p′, p′) on the m > 0 hyperboloid are connected by a boost.]
where we denoted
\[ \omega_{\mathbf{p}} \equiv \sqrt{m^2c^4 + p^2c^2} \qquad (5.7) \]
the one-particle energy.
The common spectrum of the energy-momentum eigenvalues (ωp , p) can
be conveniently represented as points on the mass hyperboloid in the 4-
dimensional energy-momentum space (see Fig. 5.2). For massive particles,
the spectrum of the velocity operator V = Pc2 /H is the interior of a 3-
dimensional sphere |v| < c in the 4D energy-momentum space. This spec-
trum does not include the surface of the sphere, therefore massive particles
cannot reach the speed of light.15
5.1.3 Action of Poincaré transformations

We can now define the action of transformations from the Poincaré group
on the basis vectors |p, σi constructed above.16 Translations act by simple
15
In quantum mechanics, the speed of propagation of particles is not a well-defined
concept. The value of particle’s speed is definite in states having certain momentum.
However, such states are described by infinitely extended plane waves (5.40) and one
cannot speak about particle propagation in such states. So, strictly speaking, the speed
of a particle cannot be obtained by measuring its positions at two different time instants
and dividing the traveled distance by the time interval. This is a consequence of the
non-commutativity of the operators of position and velocity.
16
We are working in the Schrödinger picture here.
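multiplication
\[ e^{-\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{a}}|\mathbf{p}, \sigma\rangle = e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{a}}|\mathbf{p}, \sigma\rangle \qquad (5.8) \]
\[ e^{\frac{i}{\hbar}Ht}|\mathbf{p}, \sigma\rangle = e^{\frac{i}{\hbar}\omega_{\mathbf{p}}t}|\mathbf{p}, \sigma\rangle \qquad (5.9) \]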
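Let us now apply rotation $e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}$ to the vector $|\mathbf{p}, \sigma\rangle$ and use equations (5.2)
and (D.8)
\[
\begin{aligned}
e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{p}, \sigma\rangle
&= N(\mathbf{p})e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}} e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}} e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}_{\mathbf{p}}} e^{\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}} e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}(R^{-1}_{\vec{\phi}}\mathbf{K})\cdot\vec{\theta}_{\mathbf{p}}}\sum_{\sigma'=-s}^{s} D_{\sigma'\sigma}(\vec{\phi})|\mathbf{0}, \sigma'\rangle \\
&= N(\mathbf{p})e^{-\frac{ic}{\hbar}\mathbf{K}\cdot R_{\vec{\phi}}\vec{\theta}_{\mathbf{p}}}\sum_{\sigma'=-s}^{s} D_{\sigma'\sigma}(\vec{\phi})|\mathbf{0}, \sigma'\rangle \\
&= \sum_{\sigma'=-s}^{s} D_{\sigma'\sigma}(\vec{\phi})|R_{\vec{\phi}}\mathbf{p}, \sigma'\rangle \qquad (5.10)
\end{aligned}
\]
This means that both momentum and spin of the particle are rotated by the
angle $\vec{\phi}$, as expected.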
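Applying a boost $L \equiv e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}$ to the vector $|\mathbf{p}, \sigma\rangle$ and using (5.5) we
obtain
\[ L|\mathbf{p}, \sigma\rangle = N(\mathbf{p})LU(\lambda_{\mathbf{p}}; 0, 0)|\mathbf{0}, \sigma\rangle \qquad (5.11) \]
The product of two boosts on the right hand side of equation (5.11) is a
transformation from the Lorentz group, so it can be represented in the form
(boost)×(rotation) = BQ
\[ LU(\lambda_{\mathbf{p}}; 0, 0) = B(\mathbf{p}, \vec{\theta})Q(\mathbf{p}, \vec{\theta}) \qquad (5.12) \]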
Here B and Q are yet undefined quantum-mechanical operators, and now we
are going to learn a bit more about them. Multiplying both sides of equation
(5.12) by $B^{-1}(\mathbf{p}, \vec{\theta})$, we obtain
\[ B^{-1}(\mathbf{p}, \vec{\theta})LU(\lambda_{\mathbf{p}}; 0, 0) = Q(\mathbf{p}, \vec{\theta}) \qquad (5.13) \]
Since operator Q on the right hand side is a representative of a rotation, it

keeps invariant the subspace with zero momentum H0 . Therefore, the se-
quence of boosts on the left hand side of equation (5.13) when acting on a
vector with zero momentum |0, σi returns this vector back to the zero mo-
mentum subspace. This fact is clearly seen in Fig. 5.1: The zero momentum
vector is mapped to a vector with momentum p by the boost λp . Subse-
quent application of L transforms this vector to another eigenstate of the
momentum operator with eigenvalue p′ . This eigenvalue can be found easily
by application of formula (4.3)17
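\[
\begin{aligned}
\mathbf{P}|\mathbf{p}'\rangle &= \mathbf{P}L|\mathbf{p}\rangle = \mathbf{P}e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}|\mathbf{p}\rangle
= e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}} e^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}\mathbf{P}e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}|\mathbf{p}\rangle \\
&= e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}\left(\mathbf{P} + \frac{\vec{\theta}}{\theta}\left[\left(\mathbf{P}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) + \frac{H}{c}\sinh\theta\right]\right)|\mathbf{p}\rangle \\
&= \left(\mathbf{p} + \frac{\vec{\theta}}{\theta}\left[\left(\mathbf{p}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) + \frac{\omega_{\mathbf{p}}}{c}\sinh\theta\right]\right)|\mathbf{p}'\rangle
\end{aligned}
\]
So we conclude that
\[
\mathbf{p}' = \mathbf{p} + \frac{\vec{\theta}}{\theta}\left[\left(\mathbf{p}\cdot\frac{\vec{\theta}}{\theta}\right)(\cosh\theta - 1) + \frac{\omega_{\mathbf{p}}}{c}\sinh\theta\right] \equiv \Lambda\mathbf{p} \qquad (5.14)
\]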
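Formula (5.14) is easy to exercise numerically. The sketch below implements it for classical momentum 3-tuples and compares the resulting energy with the companion transformation ω′ = ω_p cosh θ + c (p·θ/θ) sinh θ, which is the standard energy counterpart of (5.14) and is stated here as an assumption of the example rather than a quoted equation. Function names are hypothetical and c = 1 by default.

```python
import math


def omega(p, m, c=1.0):
    """One-particle energy (5.7): sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt(m * m * c ** 4 + c * c * sum(x * x for x in p))


def boost_momentum(p, theta_vec, m, c=1.0):
    """Momentum p' = Lambda p from equation (5.14) for rapidity theta_vec."""
    theta = math.sqrt(sum(x * x for x in theta_vec))
    if theta == 0.0:
        return tuple(p)
    n = [x / theta for x in theta_vec]               # unit vector theta/|theta|
    p_dot_n = sum(a * b for a, b in zip(p, n))
    shift = (p_dot_n * (math.cosh(theta) - 1.0)
             + omega(p, m, c) / c * math.sinh(theta))
    return tuple(a + b * shift for a, b in zip(p, n))


if __name__ == "__main__":
    m = 1.0
    p = (0.2, -0.5, 0.3)
    theta_vec = (0.4, 0.1, -0.7)
    p_new = boost_momentum(p, theta_vec, m)
    print("boosted momentum:", p_new)
    print("boosted energy:  ", omega(p_new, m))
```

Boosting the zero momentum by θ gives mc sinh θ along the boost direction, as in (5.6), and boosting by θ followed by −θ returns the original momentum.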
It then follows that $B^{-1}(\mathbf{p}, \vec{\theta}) = \lambda^{-1}_{\mathbf{p}'}$ is the boost returning p′ back to the
zero-momentum vector. For the rotation on the right hand side of equation
(5.13) we will be using a special symbol
\[
U(R_{\vec{\phi}_W(\mathbf{p},\Lambda)}; 0, 0) \equiv Q(\mathbf{p}, \vec{\theta}) = U^{-1}(\lambda_{\Lambda\mathbf{p}}; 0, 0)LU(\lambda_{\mathbf{p}}; 0, 0) \qquad (5.15)
\]
where $\vec{\phi}_W(\mathbf{p}, \Lambda)$ is called the Wigner angle18
\[
\vec{\phi}_W(\mathbf{p}, \Lambda) = \vec{\Phi}(\lambda^{-1}_{\Lambda\mathbf{p}}\Lambda\lambda_{\mathbf{p}}) \qquad (5.16)
\]
17
For simplicity, in this derivation we omit spin indices.
18
Here function $\vec{\Phi}$ assigns a unique rotation angle to the given rotation matrix, as
explained in Appendix D.5.
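Explicit formulas for this angle can be found, e.g., in ref. [Rit61]. Then,
substituting (5.15) in (5.12), we obtain
\[
\begin{aligned}
e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}|\mathbf{p}, \sigma\rangle
&= N(\mathbf{p})LU(\lambda_{\mathbf{p}}; 0, 0)|\mathbf{0}, \sigma\rangle
= N(\mathbf{p})U(\lambda_{\Lambda\mathbf{p}}; 0, 0)U(R_{\vec{\phi}_W(\mathbf{p},\Lambda)}; 0, 0)|\mathbf{0}, \sigma\rangle \\
&= N(\mathbf{p})U(\lambda_{\Lambda\mathbf{p}}; 0, 0)\sum_{\sigma'=-s}^{s} D^s_{\sigma'\sigma}(\vec{\phi}_W(\mathbf{p}, \Lambda))|\mathbf{0}, \sigma'\rangle \\
&= \frac{N(\mathbf{p})}{N(\Lambda\mathbf{p})}\sum_{\sigma'=-s}^{s} D^s_{\sigma'\sigma}(\vec{\phi}_W(\mathbf{p}, \Lambda))|\Lambda\mathbf{p}, \sigma'\rangle \qquad (5.17)
\end{aligned}
\]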
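The statement that λ⁻¹_{Λp} Λ λ_p is a pure rotation can be checked with ordinary 4×4 Lorentz matrices. The sketch below is an illustration under stated assumptions: it represents boosts by their standard matrices acting on column vectors (E/c, p_x, p_y, p_z), with a sign convention chosen so that a boost with rapidity θ takes the rest momentum to mc sinh θ along θ, consistent with (5.14). It does not use the book's operator formalism, and all names are hypothetical.

```python
import math


def boost_matrix(theta_vec):
    """4x4 pure boost with rapidity vector theta_vec on (E/c, px, py, pz)."""
    theta = math.sqrt(sum(x * x for x in theta_vec))
    B = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    if theta == 0.0:
        return B
    n = [x / theta for x in theta_vec]
    ch, sh = math.cosh(theta), math.sinh(theta)
    B[0][0] = ch
    for i in range(3):
        B[0][i + 1] = B[i + 1][0] = n[i] * sh
        for j in range(3):
            B[i + 1][j + 1] += (ch - 1.0) * n[i] * n[j]
    return B


def matmul(A, B):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rapidity_vector(p, m, c=1.0):
    """Rapidity of the pure boost lambda_p, equation (5.4)."""
    norm = math.sqrt(sum(x * x for x in p))
    if norm == 0.0:
        return [0.0, 0.0, 0.0]
    theta = math.asinh(norm / (m * c))
    return [theta * x / norm for x in p]


def wigner_rotation(p, theta_vec, m, c=1.0):
    """Matrix of lambda_{Lambda p}^{-1} Lambda lambda_p, as in (5.15)-(5.16)."""
    omega_p = math.sqrt(m * m * c ** 4 + c * c * sum(x * x for x in p))
    L = boost_matrix(theta_vec)
    four_p = [omega_p / c] + list(p)
    four_q = [sum(L[i][k] * four_p[k] for k in range(4)) for i in range(4)]
    q = four_q[1:]                                   # boosted momentum Lambda p
    lam_p = boost_matrix(rapidity_vector(p, m, c))
    lam_q_inv = boost_matrix([-x for x in rapidity_vector(q, m, c)])
    return matmul(lam_q_inv, matmul(L, lam_p))


if __name__ == "__main__":
    W = wigner_rotation((0.5, 0.0, 0.0), (0.0, 0.7, 0.0), 1.0)
    for row in W:
        print(["%+.6f" % x for x in row])
```

For non-collinear p and θ the result has a trivial time row and column and an orthogonal but non-trivial 3×3 block; for collinear boosts it reduces to the identity, so the Wigner angle vanishes.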
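Equations (5.10) and (5.17) show that rotations and boosts are accompanied
by turning the spin vector in each subspace $\mathcal{H}_{\mathbf{p}}$ by rotation matrices
$D^s$. If the representation of the rotation group $D^s$ were reducible, then each
subspace $\mathcal{H}_{\mathbf{p}}$ would be represented as a direct sum of irreducible components
$\mathcal{H}^k_{\mathbf{p}}$
\[ \mathcal{H}_{\mathbf{p}} = \oplus_k \mathcal{H}^k_{\mathbf{p}} \]
and each subspace
\[ \mathcal{H}^k = \oplus_{\mathbf{p}\in\mathbb{R}^3}\mathcal{H}^k_{\mathbf{p}} \]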
would be reducible with respect to the entire Poincaré group. Therefore, in
order to construct an irreducible representation of the Poincaré group in H,
the representation D s must be an irreducible unitary representation of the
rotation group, as was mentioned already in subsection 5.1.1.
In this book we will be interested in describing interactions between elec-
trons and protons, which are massive particles with spin 1/2. Then the
relevant representation D s of the rotation group is the 2-dimensional repre-
sentation from Appendix H.5.
Let us now recap the above construction of unitary irreducible repre-
sentations of the Poincaré group for massive particles.19 First we chose a
19
This construction is known as the induced representation method [Mac00].
standard momentum vector p = 0 and found a little group, which was a

subgroup of the Lorentz group leaving this vector invariant. The little group
turned out to be the rotation group in our case. Then we found that if the
subspace H0 corresponding to the standard vector carries an irreducible rep-
resentation of the little group, then the entire Hilbert space is guaranteed to
carry an irreducible representation of the Poincaré group. In this represen-
tation, translations are represented by multiplication (5.8) - (5.9), rotations
and boosts are represented by formulas (5.10) and (5.17), respectively. It
can be shown that a different choice of the standard vector (i.e., p ≠ 0) in
the spectrum of momentum would result in a representation of the Poincaré
group isomorphic to the one found above.
5.2 Momentum and position representations

So far we discussed the action of inertial transformations on common eigen-
vectors |p, σi of the operators P and Sz . All other vectors in the Hilbert space
H can be represented as linear combinations of these basis vectors, i.e., they
can be represented as wave functions ψ(p, σ) in the momentum-spin repre-
sentation. Similarly one can construct the position space basis |r, σi from
common eigenvectors of the (commuting) Newton-Wigner position operator
and operator Sz . Then arbitrary states in H can be represented in this basis
by their position-spin wave functions ψ(r, σ). In this section we will consider
wave function representations of states in greater detail. For simplicity, we
will omit the spin label and consider only spinless particles. It is remarkable
that formulas for the momentum-space and position-space wave functions
appear very similar to those in non-relativistic quantum mechanics.
5.2.1 Spectral decomposition of the identity operator

Two basis vectors with different momenta $|\mathbf{p}\rangle$ and $|\mathbf{p}'\rangle$ are eigenvectors of the
Hermitian operator P with different eigenvalues, so they must be orthogonal
\[ \langle\mathbf{p}|\mathbf{p}'\rangle = 0 \quad \text{if } \mathbf{p} \neq \mathbf{p}' \]
If the spectrum of momentum values p were discrete we could simply normalize
the basis vectors to unity, $\langle\mathbf{p}|\mathbf{p}\rangle = 1$. However, this normalization becomes
problematic in the continuous momentum space. It will be more convenient
to use non-normalizable eigenvectors of momentum. We will call such eigenvectors
$|\mathbf{p}\rangle$ improper states and use them to write arbitrary “proper” normalizable
state vectors $|\Psi\rangle$ as integrals
\[ |\Psi\rangle = \int d\mathbf{p}\,\psi(\mathbf{p})|\mathbf{p}\rangle \qquad (5.18) \]
where ψ(p) is called the wave function in the momentum representation. It
is convenient to demand, in analogy with (1.39), that normalizable wave
functions ψ(q) are given by the inner product
\[ \psi(\mathbf{q}) = \langle\mathbf{q}|\Psi\rangle = \int d\mathbf{p}\,\psi(\mathbf{p})\langle\mathbf{q}|\mathbf{p}\rangle \]
This implies that the inner product of two basis vectors is given by Dirac’s
delta function (see Appendix B)20
\[ \langle\mathbf{q}|\mathbf{p}\rangle = \delta(\mathbf{q} - \mathbf{p}) \qquad (5.19) \]
Then in analogy with equation (F.23) we can define the decomposition of the
identity operator
\[ I = \int d\mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}| \qquad (5.20) \]
Its action on any normalized state vector $|\Psi\rangle$ is trivial, as expected
\[ I|\Psi\rangle = \int d\mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}|\Psi\rangle = \int d\mathbf{p}\,|\mathbf{p}\rangle\psi(\mathbf{p}) = |\Psi\rangle \]
The identity operator, of course, must be invariant with respect to Poincaré

transformations, i.e., we anticipate that
I = U(Λ; r, t)IU −1 (Λ; r, t)

20
so that the norm of such improper vectors is, actually, infinite hp|pi = ∞
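The invariance of I with respect to translations follows directly from
equations (5.8) and (5.9). The invariance with respect to rotations can be proven
as follows
\[
\begin{aligned}
I' &= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}Ie^{\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}
= e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}\left(\int d\mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}|\right)e^{\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}
= \int d\mathbf{p}\,|R_{\vec{\phi}}\mathbf{p}\rangle\langle R_{\vec{\phi}}\mathbf{p}| \\
&= \int d\mathbf{q}\,\det\left|\frac{d\mathbf{p}}{d\mathbf{q}}\right||\mathbf{q}\rangle\langle\mathbf{q}| = \int d\mathbf{q}\,|\mathbf{q}\rangle\langle\mathbf{q}| = I
\end{aligned}
\]
where we used (5.10) and the fact that $\det|d\mathbf{p}/d\mathbf{q}| = \det(R_{-\vec{\phi}}) = 1$ is the
Jacobian of the transformation from variables p to $\mathbf{q} = R_{\vec{\phi}}\mathbf{p}$.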
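Let us consider more closely the invariance of I with respect to boosts.
Using equation (5.17) we obtain
\[
\begin{aligned}
I' &= e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}Ie^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}
= e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}\left(\int d\mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}|\right)e^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}} \\
&= \int d\mathbf{p}\left|\frac{N(\mathbf{p})}{N(\Lambda\mathbf{p})}\right|^2|\Lambda\mathbf{p}\rangle\langle\Lambda\mathbf{p}| \\
&= \int d\mathbf{q}\,\det\left|\frac{d\Lambda^{-1}\mathbf{q}}{d\mathbf{q}}\right|\left|\frac{N(\Lambda^{-1}\mathbf{q})}{N(\mathbf{q})}\right|^2|\mathbf{q}\rangle\langle\mathbf{q}| \qquad (5.21)
\end{aligned}
\]
where N(q) is the normalization factor introduced in (5.5) and $\det|d\Lambda^{-1}\mathbf{q}/d\mathbf{q}|$
is the Jacobian of the transformation from variables p to q = Λp. This
Jacobian should not depend on the direction of the boost $\vec{\theta}$, and we can choose
this direction along the z-axis to simplify calculations. Then from (5.14) we
obtain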
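\[
\begin{aligned}
\Lambda^{-1}q_x &= q_x \qquad (5.22) \\
\Lambda^{-1}q_y &= q_y \qquad (5.23) \\
\Lambda^{-1}q_z &= q_z\cosh\theta - \frac{1}{c}\sqrt{m^2c^4 + q^2c^2}\,\sinh\theta \qquad (5.24) \\
\omega_{\Lambda^{-1}\mathbf{q}} &= \sqrt{m^2c^4 + c^2q_x^2 + c^2q_y^2 + c^2\left(q_z\cosh\theta - \frac{1}{c}\sqrt{m^2c^4 + q^2c^2}\,\sinh\theta\right)^2} \\
&= \omega_{\mathbf{q}}\cosh\theta - cq_z\sinh\theta
\end{aligned}
\]
and21
\[
\begin{aligned}
\det\left|\frac{d\Lambda^{-1}\mathbf{q}}{d\mathbf{q}}\right|
&= \det\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
-\frac{cq_x\sinh\theta}{\sqrt{m^2c^4 + q^2c^2}} & -\frac{cq_y\sinh\theta}{\sqrt{m^2c^4 + q^2c^2}} & \cosh\theta - \frac{cq_z\sinh\theta}{\sqrt{m^2c^4 + q^2c^2}}
\end{pmatrix} \\
&= \cosh\theta - \frac{cq_z\sinh\theta}{\sqrt{m^2c^4 + q^2c^2}} = \frac{\omega_{\Lambda^{-1}\mathbf{q}}}{\omega_{\mathbf{q}}} \qquad (5.26)
\end{aligned}
\]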
Inserting this result in equation (5.21) we obtain

N(Λ−1 p) 2
Z
′ ωΛ−1 p
I = dp |pihp|

ωp N(p)
Thus, to ensure the invariance of I, we should define our normalization factor as22
$$N(\mathbf{p}) = \sqrt{\frac{mc^2}{\omega_{\mathbf{p}}}} \qquad (5.27)$$
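The Jacobian (5.26) is easy to check numerically. The following is a minimal sketch, not part of the derivation: it assumes units with m = c = 1, and the helper names `omega`, `inverse_boost_z` and `jacobian_det` are ours. It differentiates the map (5.22) - (5.24) by finite differences and compares the determinant with the ratio of energies on the right hand side of (5.26).

```python
import numpy as np

M, C = 1.0, 1.0  # assumption: units with m = c = 1


def omega(q):
    """One-particle energy sqrt(m^2 c^4 + q^2 c^2)."""
    return np.sqrt(M**2 * C**4 + C**2 * np.dot(q, q))


def inverse_boost_z(q, theta):
    """Momentum Lambda^{-1} q for a boost along z, equations (5.22)-(5.24)."""
    qx, qy, qz = q
    return np.array([qx, qy, qz * np.cosh(theta) - omega(q) * np.sinh(theta) / C])


def jacobian_det(q, theta, h=1e-6):
    """Central finite-difference estimate of det(d Lambda^{-1} q / dq)."""
    q = np.asarray(q, dtype=float)
    jac = np.empty((3, 3))
    for j in range(3):
        step = np.zeros(3)
        step[j] = h
        jac[:, j] = (inverse_boost_z(q + step, theta)
                     - inverse_boost_z(q - step, theta)) / (2 * h)
    return np.linalg.det(jac)


q = np.array([0.3, -1.2, 0.7])   # arbitrary test momentum
theta = 0.8                      # arbitrary rapidity

lhs = jacobian_det(q, theta)                        # left hand side of (5.26)
rhs = omega(inverse_boost_z(q, theta)) / omega(q)   # right hand side of (5.26)
energy_direct = omega(inverse_boost_z(q, theta))
energy_formula = omega(q) * np.cosh(theta) - C * q[2] * np.sinh(theta)
print(lhs, rhs)
```

The two printed numbers should agree to the accuracy of the finite-difference step, which is the statement that dq/ω_q is a Lorentz invariant measure.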
Putting together our results from equations (5.8) - (5.10), (5.17), and
(5.27), we can find the action of an arbitrary Poincaré group element on
basis vectors $|\mathbf{p}, \sigma\rangle$. Bearing in mind that in a general Poincaré
transformation23 (Λ; r; t) we agreed24 first to perform translations (r, t) and then
21
In particular, this means that inside 3D momentum integrals we are allowed to use
the equality
$$\frac{d\mathbf{q}}{\omega_{\mathbf{q}}} = \frac{d(\Lambda\mathbf{q})}{\omega_{\Lambda\mathbf{q}}} \qquad (5.25)$$
for any element Λ of the Lorentz group, i.e., that dq/ωq is a “Lorentz invariant measure.”
We will use this property quite often in our calculations.
22
We could also multiply this expression for N(p) by an arbitrary unimodular factor,
but this would not have any effect, because state vectors and their wave functions are
defined up to a unimodular factor anyway.
23
Λ is a product of a rotation and a boost, as in equation (I.12).
24
see equation (3.59)
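boosts/rotations Λ, we can find how this group element acts on a one-particle state25
$$\begin{aligned}
U(\Lambda; \mathbf{r}, t)|\mathbf{p}, \sigma\rangle &= U(\Lambda; \mathbf{0}, 0)\, e^{-\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{r}}\, e^{\frac{i}{\hbar}Ht}\, |\mathbf{p}, \sigma\rangle \\
&= \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\; e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r} + \frac{i}{\hbar}\omega_{\mathbf{p}}t} \sum_{\sigma'=-s}^{s} D_{\sigma'\sigma}(\vec{\phi}_W(\mathbf{p}, \Lambda))\, |\Lambda\mathbf{p}, \sigma'\rangle
\end{aligned} \qquad (5.28)$$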
5.2.2 Wave function in the momentum representation

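The inner product of two normalized vectors $|\Psi\rangle = \int d\mathbf{p}\,\psi(\mathbf{p})|\mathbf{p}\rangle$ and $|\Phi\rangle = \int d\mathbf{p}\,\phi(\mathbf{p})|\mathbf{p}\rangle$ can be written in terms of their wave functions
$$\begin{aligned}
\langle\Psi|\Phi\rangle &= \int d\mathbf{p}\, d\mathbf{p}'\, \psi^*(\mathbf{p})\phi(\mathbf{p}')\langle\mathbf{p}|\mathbf{p}'\rangle \\
&= \int d\mathbf{p}\, d\mathbf{p}'\, \psi^*(\mathbf{p})\phi(\mathbf{p}')\delta(\mathbf{p} - \mathbf{p}') \\
&= \int d\mathbf{p}\, \psi^*(\mathbf{p})\phi(\mathbf{p})
\end{aligned} \qquad (5.29)$$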
So, for a state vector $|\Psi\rangle$ with unit normalization, its wave function ψ(p) must satisfy the condition
$$1 = \langle\Psi|\Psi\rangle = \int d\mathbf{p}\, |\psi(\mathbf{p})|^2$$
This wave function has a direct probabilistic interpretation, e.g., if Ω is a region in the momentum space, then the integral $\int_\Omega d\mathbf{p}\, |\psi(\mathbf{p})|^2$ gives the probability of finding the particle's momentum inside this region.
Poincaré transformations of the state vector |Ψi can be viewed as transfor-
mations of the corresponding momentum-space wave function. For example,
using equation (5.28) we obtain26
25
Here we restore spin indices and use active transformations of states, as explained in
subsection 5.2.4.
26
Strictly speaking, operators always act on state vectors. When we apply operators to
wave functions, as in (5.30), we will place a caret above the operator symbol.
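$$\begin{aligned}
e^{-\frac{ic}{\hbar}\hat{\mathbf{K}}\cdot\vec{\theta}}\, \psi(\mathbf{p}) &\equiv \langle\mathbf{p}|\, e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}\, |\Psi\rangle = \sqrt{\frac{\omega_{\Lambda^{-1}\mathbf{p}}}{\omega_{\mathbf{p}}}}\, \langle\Lambda^{-1}\mathbf{p}|\Psi\rangle \\
&= \sqrt{\frac{\omega_{\Lambda^{-1}\mathbf{p}}}{\omega_{\mathbf{p}}}}\, \psi(\Lambda^{-1}\mathbf{p})
\end{aligned} \qquad (5.30)$$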
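Then the boost invariance of the inner product (5.29) can be easily proven using property (5.25)
$$\begin{aligned}
\langle\Phi'|\Psi'\rangle &= \int d\mathbf{p}\, \frac{\omega_{\Lambda^{-1}\mathbf{p}}}{\omega_{\mathbf{p}}}\, \phi^*(\Lambda^{-1}\mathbf{p})\psi(\Lambda^{-1}\mathbf{p}) \\
&= \int \frac{d(\Lambda^{-1}\mathbf{p})}{\omega_{\Lambda^{-1}\mathbf{p}}}\, \phi^*(\Lambda^{-1}\mathbf{p})\psi(\Lambda^{-1}\mathbf{p})\, \omega_{\Lambda^{-1}\mathbf{p}} \\
&= \int d\mathbf{p}\, \phi^*(\mathbf{p})\psi(\mathbf{p}) = \langle\Phi|\Psi\rangle.
\end{aligned}$$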
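The action of Poincaré generators and the Newton-Wigner position operator on momentum-space wave functions of a massive spinless particle can be derived from formula (5.28)
$$\hat{P}_x\psi(\mathbf{p}) = i\hbar \lim_{a\to 0}\frac{d}{da}\, e^{-\frac{i}{\hbar}\hat{P}_x a}\psi(\mathbf{p}) = p_x\psi(\mathbf{p}) \qquad (5.31)$$
$$\hat{H}\psi(\mathbf{p}) = -i\hbar \lim_{t\to 0}\frac{d}{dt}\, e^{\frac{i}{\hbar}\hat{H}t}\psi(\mathbf{p}) = \omega_{\mathbf{p}}\psi(\mathbf{p}) \qquad (5.32)$$
$$\begin{aligned}
\hat{K}_x\psi(\mathbf{p}) &= \frac{i\hbar}{c}\lim_{\theta\to 0}\frac{d}{d\theta}\, e^{-\frac{ic}{\hbar}\hat{K}_x\theta}\psi(\mathbf{p}) \\
&= \frac{i\hbar}{c}\lim_{\theta\to 0}\frac{d}{d\theta}\, \sqrt{\frac{\sqrt{m^2c^4 + p^2c^2}\cosh\theta - cp_x\sinh\theta}{\sqrt{m^2c^4 + p^2c^2}}} \times
\psi\!\left( p_x\cosh\theta - \frac{1}{c}\sqrt{m^2c^4 + p^2c^2}\,\sinh\theta,\; p_y,\; p_z \right) \\
&= i\hbar\left( -\frac{\omega_{\mathbf{p}}}{c^2}\frac{d}{dp_x} - \frac{p_x}{2\omega_{\mathbf{p}}} \right)\psi(\mathbf{p})
\end{aligned} \qquad (5.33)$$
$$\begin{aligned}
\hat{R}_x\psi(\mathbf{p}) &= -\frac{c^2}{2}\left( \hat{H}^{-1}\hat{K}_x + \hat{K}_x\hat{H}^{-1} \right)\psi(\mathbf{p}) \\
&= -\frac{i\hbar}{2}\left( -\omega_{\mathbf{p}}^{-1}\omega_{\mathbf{p}}\frac{d}{dp_x} - \omega_{\mathbf{p}}\frac{d}{dp_x}\omega_{\mathbf{p}}^{-1} - \frac{p_xc^2}{\omega_{\mathbf{p}}^2} \right)\psi(\mathbf{p}) \\
&= i\hbar\frac{d}{dp_x}\psi(\mathbf{p})
\end{aligned} \qquad (5.34)$$
$$\hat{J}_x\psi(\mathbf{p}) = (\hat{R}_y\hat{P}_z - \hat{R}_z\hat{P}_y)\psi(\mathbf{p}) = i\hbar\left( p_z\frac{d}{dp_y} - p_y\frac{d}{dp_z} \right)\psi(\mathbf{p}) \qquad (5.35)$$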
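According to equation (5.34), the exponential of the Newton-Wigner position operator $e^{\frac{i}{\hbar}\mathbf{R}\cdot\mathbf{b}}$ acts as a translation operator in the momentum space:
$$e^{\frac{i}{\hbar}\hat{\mathbf{R}}\cdot\mathbf{b}}\, \psi(\mathbf{p}) = \psi(\mathbf{p} - \mathbf{b})$$
This suggests the following useful representation for momentum eigenvectors
$$|\mathbf{p}\rangle = e^{\frac{i}{\hbar}\mathbf{R}\cdot\mathbf{p}}\, |\mathbf{0}\rangle \qquad (5.36)$$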
5.2.3 Position representation

In the preceding subsection we considered particle’s wave functions in the
momentum representation, i.e., with respect to common eigenvectors of three
commuting components of momentum Px , Py , and Pz . Three components of
the position operator Rx , Ry , and Rz also commute with each other,27 and
their common eigenvectors |ri also form a basis in the Hilbert space H of
one massive spinless particle. In this section we will describe particle’s wave
functions with respect to this basis set, i.e., in the position representation.
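First we can expand eigenvectors $|\mathbf{r}\rangle$ in the momentum basis
$$|\mathbf{r}\rangle = \int d\mathbf{p}\, \psi_{\mathbf{r}}(\mathbf{p})|\mathbf{p}\rangle \qquad (5.37)$$
The momentum-space eigenfunctions are
$$\psi_{\mathbf{r}}(\mathbf{p}) = \langle\mathbf{p}|\mathbf{r}\rangle = (2\pi\hbar)^{-3/2}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} \qquad (5.38)$$
as can be verified by substituting (5.34) and (5.38) into the eigenvalue equation
$$\begin{aligned}
\hat{\mathbf{R}}\psi_{\mathbf{r}}(\mathbf{p}) &= (2\pi\hbar)^{-3/2}\, \hat{\mathbf{R}}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} = i\hbar(2\pi\hbar)^{-3/2}\, \frac{d}{d\mathbf{p}}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} \\
&= \mathbf{r}(2\pi\hbar)^{-3/2}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} = \mathbf{r}\psi_{\mathbf{r}}(\mathbf{p})
\end{aligned}$$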
27
see Theorem 4.1
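As operator R is Hermitian, its eigenvectors with different eigenvalues r and r′ must be orthogonal. Indeed, using equation (B.1) we establish the delta-function inner product
$$\begin{aligned}
\langle\mathbf{r}'|\mathbf{r}\rangle &= (2\pi\hbar)^{-3}\int d\mathbf{p}\, d\mathbf{p}'\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r} + \frac{i}{\hbar}\mathbf{p}'\cdot\mathbf{r}'}\, \langle\mathbf{p}'|\mathbf{p}\rangle \\
&= (2\pi\hbar)^{-3}\int d\mathbf{p}\, d\mathbf{p}'\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r} + \frac{i}{\hbar}\mathbf{p}'\cdot\mathbf{r}'}\, \delta(\mathbf{p} - \mathbf{p}') \\
&= (2\pi\hbar)^{-3}\int d\mathbf{p}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')} = \delta(\mathbf{r} - \mathbf{r}')
\end{aligned} \qquad (5.39)$$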
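which means that $|\mathbf{r}\rangle$ are improper states just as $|\mathbf{p}\rangle$ are. Similarly to (5.18), a normalized state vector $|\Psi\rangle$ can be represented as an integral over the position space
$$|\Psi\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})|\mathbf{r}\rangle$$
where $\psi(\mathbf{r}) = \langle\mathbf{r}|\Psi\rangle$ is the wave function of the state $|\Psi\rangle$ in the position representation. The absolute square $|\psi(\mathbf{r})|^2$ of the wave function is the probability density for the particle's position. The inner product of two vectors $|\Psi\rangle$ and $|\Phi\rangle$ can be expressed through their position-space wave functions
$$\begin{aligned}
\langle\Phi|\Psi\rangle &= \int d\mathbf{r}\, d\mathbf{r}'\, \phi^*(\mathbf{r})\psi(\mathbf{r}')\langle\mathbf{r}|\mathbf{r}'\rangle \\
&= \int d\mathbf{r}\, d\mathbf{r}'\, \phi^*(\mathbf{r})\psi(\mathbf{r}')\delta(\mathbf{r} - \mathbf{r}') = \int d\mathbf{r}\, \phi^*(\mathbf{r})\psi(\mathbf{r})
\end{aligned}$$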
Using equations (5.37) and (5.38) we find that the position-space wave function of a momentum eigenvector is the usual plane wave
$$\psi_{\mathbf{p}}(\mathbf{r}) = \langle\mathbf{r}|\mathbf{p}\rangle = (2\pi\hbar)^{-3/2}\, e^{\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} \qquad (5.40)$$
As expected, eigenvectors of the position operator in its own representation are given by delta-functions (5.39).28
28
Note that position eigenfunctions introduced in ref. [NW49] do not satisfy this important requirement.
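From (5.20) we can also obtain a position-space representation of the identity operator
$$\begin{aligned}
\int d\mathbf{r}\, |\mathbf{r}\rangle\langle\mathbf{r}| &= (2\pi\hbar)^{-3}\int d\mathbf{r}\, d\mathbf{p}\, d\mathbf{p}'\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}}\, |\mathbf{p}\rangle\langle\mathbf{p}'|\, e^{\frac{i}{\hbar}\mathbf{p}'\cdot\mathbf{r}} \\
&= \int d\mathbf{p}\, d\mathbf{p}'\, |\mathbf{p}\rangle\langle\mathbf{p}'|\, \delta(\mathbf{p} - \mathbf{p}') = \int d\mathbf{p}\, |\mathbf{p}\rangle\langle\mathbf{p}| = I
\end{aligned}$$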
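Similar to momentum-space formulas (5.31) - (5.35) we can represent generators of the Poincaré group by their actions on position-space wave functions. For example, it follows from (5.8), (5.37), and (5.38) that
$$e^{-\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{a}}\int d\mathbf{r}\, \psi(\mathbf{r})|\mathbf{r}\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})|\mathbf{r} + \mathbf{a}\rangle = \int d\mathbf{r}\, \psi(\mathbf{r} - \mathbf{a})|\mathbf{r}\rangle$$
Therefore we can apply operators directly to wave functions
$$e^{-\frac{i}{\hbar}\hat{\mathbf{P}}\cdot\mathbf{a}}\, \psi(\mathbf{r}) = \psi(\mathbf{r} - \mathbf{a})$$
$$\hat{\mathbf{P}}\psi(\mathbf{r}) = i\hbar\lim_{\mathbf{a}\to 0}\frac{d}{d\mathbf{a}}\, e^{-\frac{i}{\hbar}\hat{\mathbf{P}}\cdot\mathbf{a}}\psi(\mathbf{r}) = i\hbar\lim_{\mathbf{a}\to 0}\frac{d}{d\mathbf{a}}\psi(\mathbf{r} - \mathbf{a}) = -i\hbar\frac{d}{d\mathbf{r}}\psi(\mathbf{r}) \qquad (5.41)$$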
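Other operators in the position representation have the following forms29
$$\hat{H}\psi(\mathbf{r}) = \sqrt{m^2c^4 - \hbar^2c^2\frac{d^2}{d\mathbf{r}^2}}\; \psi(\mathbf{r})$$
$$\hat{J}_x\psi(\mathbf{r}) = -i\hbar\left( y\frac{d}{dz} - z\frac{d}{dy} \right)\psi(\mathbf{r})$$
$$\hat{K}_x\psi(\mathbf{r}) = \frac{1}{2}\left( \sqrt{m^2c^4 - \hbar^2c^2\frac{d^2}{d\mathbf{r}^2}}\; x + x\sqrt{m^2c^4 - \hbar^2c^2\frac{d^2}{d\mathbf{r}^2}} \right)\psi(\mathbf{r})$$
$$\hat{\mathbf{R}}\psi(\mathbf{r}) = \mathbf{r}\psi(\mathbf{r})$$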
29
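Here we used a formal notation for the Laplacian operator
$$\frac{d^2}{d\mathbf{r}^2} \equiv \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$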
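Switching between the position-space and momentum-space wave functions of the same state is achieved by Fourier transformation formulas. To see that, assume that the state $|\Psi\rangle$ has a position-space wave function ψ(r). Then using (5.37) and (5.38) we obtain
$$\begin{aligned}
|\Psi\rangle &= \int d\mathbf{r}\, \psi(\mathbf{r})|\mathbf{r}\rangle = (2\pi\hbar)^{-3/2}\int d\mathbf{r}\, \psi(\mathbf{r})\int d\mathbf{p}\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}}\, |\mathbf{p}\rangle \\
&= (2\pi\hbar)^{-3/2}\int d\mathbf{p}\left( \int d\mathbf{r}\, \psi(\mathbf{r})\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} \right)|\mathbf{p}\rangle
\end{aligned}$$
and the corresponding momentum-space wave function is
$$\psi(\mathbf{p}) = (2\pi\hbar)^{-3/2}\int d\mathbf{r}\, \psi(\mathbf{r})\, e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}} \qquad (5.42)$$
Inversely, if the momentum-space wave function is ψ(p), then the position-space wave function is
$$\psi(\mathbf{r}) = (2\pi\hbar)^{-3/2}\int d\mathbf{p}\, e^{\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}}\, \psi(\mathbf{p}) \qquad (5.43)$$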
5.2.4 Inertial transformations of observables and states

Here we would like to discuss how observables and states change under inertial transformations of observers. We already touched on this issue in a few places in the book, but it would be useful to summarize the definitions and to clarify the physical meaning of transformations. What do we mean exactly when expressing observables and states in the reference frame O′ (primed) through observables and states in the reference frame O (unprimed)?
$$F' = U_g F U_g^{-1} \qquad (5.44)$$
$$|\Psi'\rangle = U_g|\Psi\rangle \qquad (5.45)$$
where30
30
see equation (3.59)
Figure 5.3: Rods for measuring the x-component of position in the reference frame O and in the frame O′ displaced by the distance a.
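$$U_g = U_g(\Lambda; \mathbf{a}, t) = e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}\, e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}}\, e^{-\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{a}}\, e^{\frac{i}{\hbar}Ht}$$
is the unitary representative of an inertial transformation g in the Hilbert space of the system.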
Let us start with transformations of observables (5.44). For definiteness
we will assume that observable F is the x-component of position (F = Rx ).
This means that F is a mathematical representation of a measuring rod at
rest in the reference frame O, as shown in Fig. 5.3. The zero pointer on
this rod coincides with the origin of the coordinate system O. If we further
assume that Poincaré group element g is a translation by the distance a along
the x-axis
$$U_g = U_g(1; a, 0, 0, 0) = e^{-\frac{i}{\hbar}P_x a}$$
then the transformed observable
$$R_x' = e^{-\frac{i}{\hbar}P_x a}\, R_x\, e^{\frac{i}{\hbar}P_x a} \qquad (5.46)$$
is the operator that describes measurements of position in the reference frame O′ with respect to axes and measuring rods in this frame. This measuring rod X′ is shifted by the distance a with respect to the rod X. The zero pointer on X′ coincides with the origin of the coordinate system O′. Of course, position measurements performed by X and X′ on the same particle yield different results and this difference is reflected in the difference of operators $R_x \neq R_x'$.
For example, if the particle sits in the origin of the reference frame O, then a
measurement with the rod X yields position value x = 0, but a measurement
with the rod X ′ yields x′ = −a. For this reason we say that observables Rx
and $R_x'$ are related by
$$R_x' = R_x - a$$
Of course, the same relationship is obtained by a formal application of (5.46)
$$R_x' = R_x - \frac{i}{\hbar}[P_x, R_x]a = R_x - a$$
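The position operator $R_x$ can be represented also through its spectral decomposition
$$R_x = \int_{-\infty}^{\infty} dx\, x\, |x\rangle\langle x|$$
where $|x\rangle$ are eigenvectors (states) with positions x. Then equation (5.46) can be rewritten as
$$\begin{aligned}
R_x' &= e^{-\frac{i}{\hbar}P_x a}\left( \int_{-\infty}^{\infty} dx\, x\, |x\rangle\langle x| \right) e^{\frac{i}{\hbar}P_x a} \\
&= \int_{-\infty}^{\infty} dx\, x\, |x + a\rangle\langle x + a| = \int_{-\infty}^{\infty} dx\, (x - a)|x\rangle\langle x| = R_x - a
\end{aligned}$$
From this we see that the action of $U_g$ on state vectors31
$$e^{-\frac{i}{\hbar}P_x a}|x\rangle = |x + a\rangle$$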
31
Here we switch to the Schrödinger representation, in which operators of observables
are assumed to be fixed.
should be interpreted as an active shift of the states, i.e., translation of the states by the distance a in this case.32 For example, operator $e^{-\frac{i}{\hbar}P_x a}$ moves a particle localized in the origin of the frame O (x = 0) to the origin of the frame O′ (x = a). However, in many cases of practical interest we are not interested in active transformations applied to states. More often we are interested in knowing how the state of the system looks from the point of view of an inertially transformed observer, i.e., we are interested in passive transformations of states. Apparently such passive transformations should be represented by inverse operators $U_g^{-1}$
$$|\Psi'\rangle = U_g^{-1}|\Psi\rangle \qquad (5.47)$$
This means, in particular, that if the vector $|\Psi\rangle = |x\rangle$ describes a state of the particle located at point x from the point of view of the observer O (measured by the rod X), then the same state is described by the vector
$$|\Psi'\rangle = U_g^{-1}|\Psi\rangle = e^{\frac{i}{\hbar}P_x a}|x\rangle = |x - a\rangle \qquad (5.48)$$
from the point of view of the observer O′ (the position is measured by the rod X′). A particle localized in the origin of the frame O is described by the vector $|0\rangle$ in that frame. From the point of view of observer O′ this same particle is described by the vector $|{-a}\rangle$.
We can also apply inertial transformations to wave functions instead of state vectors. For example, the state vector
$$|\Psi\rangle = \int d\mathbf{r}\, \psi(\mathbf{r})|\mathbf{r}\rangle$$
has wave function ψ(x, y, z) in the position representation. When we shift the observer by the distance a in the positive direction along the x-axis, we should apply a passive transformation (5.48) to the state vector
$$|\Psi'\rangle = e^{\frac{i}{\hbar}P_x a}|\Psi\rangle = \int d\mathbf{r}\, \psi(x, y, z)|x - a, y, z\rangle = \int d\mathbf{r}\, \psi(x + a, y, z)|x, y, z\rangle$$
32
One can also interpret this as a result of the inertial transformation g being applied
to the state preparation device.
This means that the passive transformation of the wave function has the form
$$e^{\frac{i}{\hbar}\hat{P}_x a}\, \psi(x, y, z) = \psi(x + a, y, z)$$
The above considerations find their most important applications in the case when the inertial transformation $U_g$ is a time translation. As we established in (4.56), the position operator in the time-translated reference frame O′′ takes the form
$$R_x'' = e^{\frac{i}{\hbar}Ht}\, R_x\, e^{-\frac{i}{\hbar}Ht} = R_x + V_x t$$
If we want to find how the state vector $|\Psi\rangle$ looks from the time-translated reference frame O′′, we should apply the passive transformation (5.47)
$$|\Psi''\rangle = e^{-\frac{i}{\hbar}Ht}|\Psi\rangle \qquad (5.49)$$
It is common to consider a continuous sequence of time shifts parameterized by the value of time t and to speak about time evolution of the state vector $|\Psi(t)\rangle$. Then equation (5.49) can be regarded as a solution of the time-dependent Schrödinger equation
$$i\hbar\frac{d}{dt}|\Psi(t)\rangle = H|\Psi(t)\rangle \qquad (5.50)$$
In actual calculations it is more convenient to deal with numerical functions (wave functions in a particular basis) rather than with abstract state vectors. To get this type of description, equation (5.50) should be multiplied on the left by certain basis bra-vectors. For example, if we multiply (5.50) by position eigenvectors $\langle\mathbf{r}|$, we will obtain the Schrödinger equation in the position representation
$$i\hbar\frac{d}{dt}\langle\mathbf{r}|\Psi(t)\rangle = \langle\mathbf{r}|H|\Psi(t)\rangle$$
$$i\hbar\frac{d}{dt}\Psi(\mathbf{r}, t) = \hat{H}\Psi(\mathbf{r}, t) \qquad (5.51)$$
where $\Psi(\mathbf{r}, t) \equiv \langle\mathbf{r}|\Psi(t)\rangle$ is a wave function in the position representation and the action of the Hamiltonian on this wave function is denoted by $\hat{H}\Psi(\mathbf{r}, t) \equiv \langle\mathbf{r}|H|\Psi(t)\rangle$.
5.3 Massless particles

5.3.1 Spectra of momentum, energy, and velocity
In the case of massless particles (m = 0), such as photons, the method
used in section 5.1 to construct irreducible unitary representations of the
Poincaré group does not work. Indeed, for massless particles the position
operator (4.33) cannot be defined. Therefore we cannot apply the Stone-von
Neumann theorem H.3 to figure out the spectrum of the operator P. To find
the spectrum of P for a single massless particle we will use another argument.
Let us choose a state of the massless particle with some nonzero momen-
tum p.33 There are two kinds of inertial transformations that can affect this
momentum value: rotations and boosts. Any vector p′ obtained from p by
rotations and boosts is also in the spectrum of P.34 So, we can use these
transformations to explore the spectrum of the momentum operator. Rota-
tions generally change the direction of the momentum vector, but preserve
its length p, so all rotation images of p form a surface of a sphere with its
center in the origin 0 and the radius of p. Boosts along the momentum vector
p do not change the direction of this vector, but do change its length. To decrease the length of the momentum vector we can use a boost vector $\vec{\theta}$ which points in the direction opposite to p, i.e., $\vec{\theta}/\theta = -\mathbf{p}/p$. Then, using formula (5.14) and equality35
$$\omega_{\mathbf{p}} = cp \qquad (5.52)$$
we can write
$$\mathbf{p}' = \Lambda\mathbf{p} = \mathbf{p} + \frac{\mathbf{p}}{p}\left[ p(\cosh\theta - 1) - p\sinh\theta \right] = \mathbf{p}\left[ \cosh\theta - \sinh\theta \right] = \mathbf{p}e^{-\theta} \qquad (5.53)$$
so the transformed momentum reaches zero only in the limit θ → ∞. This means that the point p = 0 cannot be reached from p by rotations and boosts. So, this point does not belong to the spectrum of the momentum of
33
We assume that such a value exists in the spectrum of the momentum operator P.
34
The proof of this statement is the same as in equations (5.1) and (5.6).
35
This equality follows from (5.7) if m = 0.
any massless particle.36 Then we see that for massless particles the mass
hyperboloid (5.7) degenerates to a cone (5.52) with the point p = 0 deleted
(see Fig. 5.2). Therefore, the spectrum of velocity $\mathbf{V} = \mathbf{P}c^2/H$ is the surface
of a sphere |v| = c. This means that massless particles can move only with
the speed of light in any reference frame. This is the famous second postulate
of Einstein's special theory of relativity.
Statement 5.1 (invariance of the speed of light) The speed of massless
particles (e.g., photons) is equal to c independent of the velocity of the source
and/or observer.
5.3.2 Representations of the little group

Next we need to construct unitary irreducible representations of the Poincaré
group for massless particles. To do that we can slightly modify the method
of induced representations used for massive particles in section 5.1.
We already established that vector p = (0, 0, 0) does not belong to the
momentum spectrum of a massless particle. So, unlike in the massive case,
we cannot choose this vector as the standard vector for the construction of
induced massless representations. We also mentioned that the choice of the
standard vector is arbitrary, and representations built on different standard
vectors are unitarily equivalent. Therefore, in the massless case we will choose
a different standard momentum
k = (0, 0, 1) (5.54)
The next step is to find the little group corresponding to the vector k, i.e.,
the subgroup of Lorentz transformations, which leave this vector invariant.
The energy-momentum 4-vector corresponding to the standard vector (5.54)
is (ck, ck) = (c, 0, 0, c). Therefore, in the 4D notation from Appendix I.1, the
matrices S of little group elements must satisfy the equation
$$S\begin{pmatrix} c \\ 0 \\ 0 \\ c \end{pmatrix} = \begin{pmatrix} c \\ 0 \\ 0 \\ c \end{pmatrix}$$
36
The physical meaning of this result is clear because there are no photons with zero
momentum and energy.
Since the little group is a subgroup of the Lorentz group, condition (I.3) must be fulfilled as well
$$S^T g S = g$$
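One can verify that the most general matrix S with these properties has the form [Wei64a]
$$S(X_1, X_2, \theta) = \begin{pmatrix}
1 + \frac{1}{2}(X_1^2 + X_2^2) & X_1 & X_2 & -\frac{1}{2}(X_1^2 + X_2^2) \\
X_1\cos\theta + X_2\sin\theta & \cos\theta & \sin\theta & -X_1\cos\theta - X_2\sin\theta \\
-X_1\sin\theta + X_2\cos\theta & -\sin\theta & \cos\theta & X_1\sin\theta - X_2\cos\theta \\
\frac{1}{2}(X_1^2 + X_2^2) & X_1 & X_2 & 1 - \frac{1}{2}(X_1^2 + X_2^2)
\end{pmatrix} \qquad (5.55)$$
which depends on three independent real parameters $X_1$, $X_2$, and θ.37 The three generators of these transformations are obtained by differentiation
$$T_1 = \lim_{X_1, X_2, \theta\to 0}\frac{\partial}{\partial X_1}S(X_1, X_2, \theta) = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix} = J_x - cK_y$$
$$T_2 = \lim_{X_1, X_2, \theta\to 0}\frac{\partial}{\partial X_2}S(X_1, X_2, \theta) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{pmatrix} = J_y + cK_x$$
$$R = \lim_{X_1, X_2, \theta\to 0}\frac{\partial}{\partial\theta}S(X_1, X_2, \theta) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = J_z$$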
37
So, just as in the massive case, the massless little group is a 3-dimensional subgroup
of the Lorentz group. However, this subgroup is different from the rotation group of the
massive case.
where $\mathbf{J}$ and $\mathbf{K}$ are Lorentz group generators (I.13) and (I.14). The commutators are easily calculated
[T1 , T2 ] = 0
[R, T2 ] = −T1
[R, T1 ] = T2
These are commutation relations of the Lie algebra for the group of “trans-
lations” (T1 and T2 ) and rotations (R) in a 2D plane.
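The defining properties of the matrix (5.55) are straightforward to check on a computer. The sketch below is an illustration under our own assumptions: units with c = 1, the metric taken as g = diag(1, −1, −1, −1), and helper names of our choosing. It verifies that S leaves the 4-vector (c, 0, 0, c) invariant, that S is a Lorentz transformation, and that the three generators form a closed algebra with commuting “translations”. Since the sign of the mixed commutators depends on the conventions adopted for the generators, only closure up to a sign is tested.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])  # assumption: metric signature (+, -, -, -)


def little_group(x1, x2, theta):
    """Matrix S(X1, X2, theta) of equation (5.55)."""
    a = 0.5 * (x1**2 + x2**2)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([
        [1 + a,             x1, x2, -a],
        [x1 * c + x2 * s,   c,  s,  -x1 * c - x2 * s],
        [-x1 * s + x2 * c, -s,  c,   x1 * s - x2 * c],
        [a,                 x1, x2,  1 - a],
    ])


def generator(index, h=1e-6):
    """Central-difference derivative of S at the identity with respect to parameter `index`."""
    d = np.zeros(3)
    d[index] = h
    return (little_group(*d) - little_group(*(-d))) / (2 * h)


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


S = little_group(0.4, -1.1, 0.9)      # arbitrary parameter values
k4 = np.array([1.0, 0.0, 0.0, 1.0])   # the 4-vector (c, 0, 0, c) with c = 1

T1, T2, R = generator(0), generator(1), generator(2)

print(np.allclose(S @ k4, k4))        # S leaves the standard 4-vector invariant
print(np.allclose(S.T @ G @ S, G))    # S is a Lorentz transformation
print(np.allclose(comm(T1, T2), 0))   # "translations" commute
```

The same functions can be used to confirm that the numerically computed generators coincide with the three matrices written above.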
The next step is to find the full set of unitary irreducible representations of the little group constructed above. We will do that by following the “induced representation” prescription outlined at the end of subsection 5.1.3. First we introduce three Hermitian operators $\vec{\Pi} = (\Pi_1, \Pi_2)$ and $\Theta \equiv J_z$, which provide a representation of the Lie algebra generators T and R, respectively.38 So, little group “translations” and rotations are represented in the subspace $H_{\mathbf{k}}$ 39 by unitary operators $e^{-\frac{i}{\hbar}\Pi_1 x}$, $e^{-\frac{i}{\hbar}\Pi_2 y}$ and $e^{-\frac{i}{\hbar}\Theta\phi}$.
Next we should clarify the structure of the Hilbert subspace $H_{\mathbf{k}}$, keeping in mind that this subspace should carry an irreducible representation of the little group. Suppose that the subspace $H_{\mathbf{k}}$ contains a state vector $|\vec{\pi}\rangle$ which is an eigenvector of $\vec{\Pi}$ with a nonzero “momentum” $\vec{\pi} \neq 0$
$$\vec{\Pi}|\vec{\pi}\rangle = (\pi_1, \pi_2)|\vec{\pi}\rangle$$
Then the rotated vector
$$e^{-\frac{i}{\hbar}\Theta\phi}|\pi_1, \pi_2\rangle = |\pi_1\cos\phi + \pi_2\sin\phi,\; -\pi_1\sin\phi + \pi_2\cos\phi\rangle \qquad (5.56)$$
also belongs to the subspace $H_{\mathbf{k}}$. Vectors (5.56) form a circle $\pi_1^2 + \pi_2^2 = \mathrm{const}$ in the 2D “momentum” plane. The linear span of these vectors forms an infinite-dimensional Hilbert space. This means that $H_{\mathbf{k}}$ is infinite-dimensional. If we used this representation of the little group to build the unitary irreducible representation of the full Poincaré group, we would obtain massless
38
One can notice a formal analogy of operators $\vec{\Pi}$ and Θ with 2-dimensional “momentum” and “angular momentum”, respectively.
39
This is the eigensubspace of the particle momentum operator P, corresponding to the
“standard” eigenvector k.
particles having an infinite number of internal (spin) degrees of freedom, or “continuous spin.” Such particles have not been observed in nature, so we will not discuss this possibility further. The only case having relevance to physics is the “zero-radius” circle $\vec{\pi} = 0$. The corresponding vector spans a one-dimensional irreducible subspace $H_{\mathbf{k}}$, where “translations” are represented trivially
$$e^{-\frac{i}{\hbar}\vec{\Pi}\cdot\mathbf{r}}|\vec{\pi} = 0\rangle = |\vec{\pi} = 0\rangle \qquad (5.57)$$
and rotations around the z-axis are represented by unimodular factors
$$e^{-\frac{i}{\hbar}\Theta\phi}|\vec{\pi} = 0\rangle \equiv e^{-\frac{i}{\hbar}J_z\phi}|\vec{\pi} = 0\rangle = e^{i\tau\phi}|\vec{\pi} = 0\rangle \qquad (5.58)$$
The allowed values of the parameter τ can be obtained from the fact that the representation must be either single- or double-valued.40 Therefore, the rotation through the angle φ = 2π can be represented by either 1 or −1, and τ must be either an integer or a half-integer number
$$\tau = \ldots, -1, -1/2, 0, 1/2, 1, \ldots \qquad (5.59)$$
We will refer to the parameter τ as helicity.41 This parameter distinguishes different massless unitary irreducible representations of the Poincaré group, i.e., different types of elementary massless particles.
5.3.3 Massless representations of the Poincaré group

In the preceding subsection we have built a unitary representation of the
little group (which is a subgroup of the Poincaré group) in the 1-dimensional
subspace $H_{\mathbf{k}}$ of the standard momentum k = (0, 0, 1). In this subsection our
goal is to build irreducible unitary representations of the full Poincaré group
in the entire Hilbert space
$$H = \oplus_{\mathbf{p}} H_{\mathbf{p}} \qquad (5.60)$$
of a massless particle with helicity τ.

40
see Statement 3.2
41
Note that ~τ is eigenvalue of the helicity operator (J · P)/P .
First we would like to build a basis in the Hilbert space H. To do that we choose an arbitrary basis vector $|\mathbf{k}, \tau\rangle$ in the subspace $H_{\mathbf{k}}$.42 Similarly to what we did in the massive case, we are going to propagate this basis vector to other values of momentum p by using transformations from the Lorentz group. So, we need to define elements $\lambda_{\mathbf{p}}$ of the Lorentz group, which connect the standard momentum k with all other momenta p
$$\lambda_{\mathbf{p}}\mathbf{k} = \mathbf{p}$$
Just as in the massive case, the choice of the set of transformations λp is not
unique. However, one can show that representations of the Poincaré group
constructed with differently chosen λp are unitarily equivalent. So, we are
free to choose any set λp , which makes our calculations more convenient.
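Our decision is to define $\lambda_{\mathbf{p}}$ as a Lorentz boost along the z-axis
$$B_{\mathbf{p}} = \begin{pmatrix} \cosh\theta & 0 & 0 & \sinh\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh\theta & 0 & 0 & \cosh\theta \end{pmatrix} \qquad (5.61)$$
followed by a rotation $R_{\mathbf{p}}$ which brings direction k = (0, 0, 1) to $\mathbf{p}/p$
$$\lambda_{\mathbf{p}} = R_{\mathbf{p}}B_{\mathbf{p}} \qquad (5.62)$$
(see Fig. 5.4). The rapidity of the boost θ = log(p) is such that the length of $B_{\mathbf{p}}\mathbf{k}$ is equal to p.43 The absolute value of the rotation angle in $R_{\mathbf{p}}$ is determined by
$$\cos\phi = \mathbf{k}\cdot\frac{\mathbf{p}}{p} = p_z/p \qquad (5.63)$$
and the direction of $\vec{\phi}$ is
$$\frac{\vec{\phi}}{\phi} = \mathbf{k}\times\frac{\mathbf{p}}{p}\Big/\sin\phi = (p_x^2 + p_y^2)^{-1/2}(-p_y, p_x, 0)$$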
42
Here τ is a fixed half-integer number (5.59) that specifies the helicity of our particle.
43
see formula (5.53)
Figure 5.4: Each point p (except p = 0) in the momentum space of a massless particle can be reached from the standard vector k = (0, 0, 1) by applying a boost $B_{\mathbf{p}}$ along the z-axis followed by a rotation $R_{\mathbf{p}}$.
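The construction (5.61) - (5.63) can be sketched in a few lines of code. This is only an illustration under our own assumptions: units with c = 1, 4-vectors ordered as (energy, p_x, p_y, p_z), the rotation built with the Rodrigues formula, and function names that are ours. The sketch checks that $\lambda_{\mathbf{p}}$ maps the standard energy-momentum 4-vector (1, 0, 0, 1) to (|p|, p).

```python
import numpy as np


def boost_z(theta):
    """4x4 boost matrix along the z-axis, equation (5.61)."""
    b = np.eye(4)
    b[0, 0] = b[3, 3] = np.cosh(theta)
    b[0, 3] = b[3, 0] = np.sinh(theta)
    return b


def rotation_k_to(p):
    """4x4 rotation taking k = (0, 0, 1) to the direction p/|p| (Rodrigues formula)."""
    n = p / np.linalg.norm(p)
    k = np.array([0.0, 0.0, 1.0])
    axis = np.cross(k, n)          # proportional to (-p_y, p_x, 0)
    s = np.linalg.norm(axis)       # sin(phi)
    c = np.dot(k, n)               # cos(phi) = p_z / p, equation (5.63)
    r3 = np.eye(3)
    if s > 1e-12:
        u = axis / s
        kx = np.array([[0.0, -u[2], u[1]],
                       [u[2], 0.0, -u[0]],
                       [-u[1], u[0], 0.0]])
        r3 = np.eye(3) + s * kx + (1 - c) * (kx @ kx)
    elif c < 0:
        # p antiparallel to k: the axis is not unique, rotate by pi about x
        r3 = np.diag([1.0, -1.0, -1.0])
    r = np.eye(4)
    r[1:, 1:] = r3
    return r


def lambda_p(p):
    """Lorentz transformation (5.62): boost with rapidity log|p|, then rotation."""
    p = np.asarray(p, dtype=float)
    return rotation_k_to(p) @ boost_z(np.log(np.linalg.norm(p)))


p = np.array([0.6, -2.0, 1.5])           # arbitrary nonzero momentum
k4 = np.array([1.0, 0.0, 0.0, 1.0])      # standard 4-vector for k = (0, 0, 1), c = 1
p4 = lambda_p(p) @ k4
expected = np.concatenate(([np.linalg.norm(p)], p))
print(p4, expected)
```

The choice of the rotation axis for p antiparallel to k is arbitrary; as noted in the text, representations built with different choices of $\lambda_{\mathbf{p}}$ are unitarily equivalent.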
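The full basis $|\mathbf{p}, \tau\rangle$ in H is now obtained by propagating the basis vector $|\mathbf{k}, \tau\rangle$ to other fixed momentum subspaces $H_{\mathbf{p}}$ 44
$$|\mathbf{p}, \tau\rangle = \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\lambda_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle \qquad (5.64)$$
The next step is to consider how general elements of the Poincaré group act on the basis vectors (5.64). First we apply a general transformation from the Lorentz subgroup $U(\Lambda; \mathbf{0}, 0)$ to an arbitrary basis vector $|\mathbf{p}, \tau\rangle$ (the factor $1/\sqrt{|\mathbf{p}|}$ comes from the definition (5.64))
$$\begin{aligned}
U(\Lambda; \mathbf{0}, 0)|\mathbf{p}, \tau\rangle &= \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\Lambda; \mathbf{0}, 0)U(\lambda_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle \\
&= \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\lambda_{\Lambda\mathbf{p}}; \mathbf{0}, 0)U(\lambda_{\Lambda\mathbf{p}}^{-1}; \mathbf{0}, 0)U(\Lambda; \mathbf{0}, 0)U(\lambda_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle \\
&= \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\lambda_{\Lambda\mathbf{p}}; \mathbf{0}, 0)U(\lambda_{\Lambda\mathbf{p}}^{-1}\Lambda\lambda_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle \\
&= \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\lambda_{\Lambda\mathbf{p}}; \mathbf{0}, 0)U(B_{\Lambda\mathbf{p}}^{-1}R_{\Lambda\mathbf{p}}^{-1}\Lambda R_{\mathbf{p}}B_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle
\end{aligned}$$
The product of Lorentz group transformations $\lambda_{\Lambda\mathbf{p}}^{-1}\Lambda\lambda_{\mathbf{p}} = B_{\Lambda\mathbf{p}}^{-1}R_{\Lambda\mathbf{p}}^{-1}\Lambda R_{\mathbf{p}}B_{\mathbf{p}}$ on the right hand side brings vector k back to k (see Fig. 5.4), therefore this product is an element of the little group. The “translation” part of this element is irrelevant for us due to equation (5.57). The relevant angle of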
44
compare with equations (5.5) and (5.27)
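rotation around the z-axis is called the Wigner angle $\phi_W(\mathbf{p}, \Lambda)$.45 According to equation (5.58), this rotation acts as multiplication by a unimodular factor
$$U(\lambda_{\Lambda\mathbf{p}}^{-1}\Lambda\lambda_{\mathbf{p}}; \mathbf{0}, 0)|\mathbf{k}, \tau\rangle = e^{i\tau\phi_W(\mathbf{p}, \Lambda)}|\mathbf{k}, \tau\rangle$$
Thus, taking into account (5.64) we can write for an arbitrary Lorentz transformation Λ
$$U(\Lambda; \mathbf{0}, 0)|\mathbf{p}, \tau\rangle = \frac{1}{\sqrt{|\mathbf{p}|}}\, U(\lambda_{\Lambda\mathbf{p}}; \mathbf{0}, 0)\, e^{i\tau\phi_W(\mathbf{p}, \Lambda)}|\mathbf{k}, \tau\rangle = \frac{\sqrt{|\Lambda\mathbf{p}|}}{\sqrt{|\mathbf{p}|}}\, e^{i\tau\phi_W(\mathbf{p}, \Lambda)}|\Lambda\mathbf{p}, \tau\rangle$$
For a general Poincaré group element we finally obtain a transformation that is similar to the massive case result (5.28)
$$U(\Lambda; \mathbf{r}, t)|\mathbf{p}, \tau\rangle = \sqrt{\frac{|\Lambda\mathbf{p}|}{|\mathbf{p}|}}\; e^{-\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r} + \frac{ic}{\hbar}|\mathbf{p}|t}\, e^{i\tau\phi_W(\mathbf{p}, \Lambda)}|\Lambda\mathbf{p}, \tau\rangle \qquad (5.65)$$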
As was mentioned in the beginning of this chapter, photons are described by a reducible representation of the Poincaré group which is a direct sum of two irreducible representations with helicities τ = 1 and τ = −1. In the classical language these two irreducible components correspond to the left and right circularly polarized light.
5.3.4 Doppler effect and aberration

To illustrate results obtained in this section, here we are going to derive well-known formulas for the Doppler effect and the aberration of light. These formulas connect energies and propagation directions of photons viewed from reference frames in relative motion.
We denote H(0) the photon's energy and P(0) its momentum in the reference frame O at rest. We also denote H(θ) and P(θ) the photon's energy and momentum in the reference frame O′ moving with velocity $\mathbf{v} = c\frac{\vec{\theta}}{\theta}\tanh\theta$. Then using (4.4) we obtain the usual Doppler effect formula46
45
Explicit expressions for the Wigner rotation angle can be found in [Rit61, CR03].
46
The frequency of light is proportional to the photon’s energy (H = ~ω), so our formula
(5.66) applies to frequencies as well.
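$$\begin{aligned}
H(\theta) &= H(0)\cosh\theta - c\mathbf{P}(0)\cdot\frac{\vec{\theta}}{\theta}\sinh\theta \\
&= H(0)\cosh\theta\left( 1 - \frac{cP(0)}{H(0)}\frac{\mathbf{P}(0)}{P(0)}\cdot\frac{\vec{\theta}}{\theta}\tanh\theta \right) \\
&= H(0)\cosh\theta\left( 1 - \frac{v}{c}\cos\phi_0 \right)
\end{aligned} \qquad (5.66)$$
where we denoted $\phi_0$ the angle between the direction of photon's propagation (seen in the reference frame O) and the direction of movement of the reference frame O′ with respect to O
$$\cos\phi_0 \equiv \frac{\mathbf{P}(0)}{P(0)}\cdot\frac{\vec{\theta}}{\theta} \qquad (5.67)$$
Sometimes the Doppler effect formula is written in another form where the angle φ between the photon momentum and the reference frame velocity is measured from the point of view of O′
$$\cos\phi \equiv \frac{\mathbf{P}(\theta)}{P(\theta)}\cdot\frac{\vec{\theta}}{\theta} \qquad (5.68)$$
From (4.4) we can write
$$\begin{aligned}
H(0) &= H(\theta)\cosh\theta + c\mathbf{P}(\theta)\cdot\frac{\vec{\theta}}{\theta}\sinh\theta \\
&= H(\theta)\cosh\theta\left( 1 + \frac{cP(\theta)}{H(\theta)}\frac{\mathbf{P}(\theta)}{P(\theta)}\cdot\frac{\vec{\theta}}{\theta}\tanh\theta \right) \\
&= H(\theta)\cosh\theta\left( 1 + \frac{v}{c}\cos\phi \right)
\end{aligned}$$
Therefore
$$H(\theta) = \frac{H(0)}{\cosh\theta\left( 1 + \frac{v}{c}\cos\phi \right)} \qquad (5.69)$$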
Figure 5.5: To the discussion of the Doppler effect: (a) observer at rest O and moving observer O′ measure light from the same source (e.g., a star) S; (b) one observer O measures light from two sources S and S′ that move with respect to each other.
The difference between angles $\phi_0$ and φ, i.e., the dependence of the direction of light propagation on the observer, is known as the aberration effect. In order to see the same star in the sky observers O and O′ must point their telescopes in different directions. These directions make angles $\phi_0$ and φ, respectively, with the direction $\vec{\theta}/\theta$ of the relative velocity of O and O′. The relationship between these angles can be found by taking the scalar product of both sides of (4.3) with $\vec{\theta}/\theta$ and taking into account equations (5.66) - (5.68) and cP(θ) = H(θ)
$$\begin{aligned}
\cos\phi &= \frac{P(0)}{P(\theta)}\left( \cosh\theta\cos\phi_0 - \sinh\theta \right) = \frac{\cosh\theta\cos\phi_0 - \sinh\theta}{\cosh\theta\left( 1 - \frac{v}{c}\cos\phi_0 \right)} \\
&= \frac{\cos\phi_0 - v/c}{1 - \frac{v}{c}\cos\phi_0}
\end{aligned}$$
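The Doppler and aberration formulas can be cross-checked numerically by transforming a photon's energy and momentum directly. The following sketch is an illustration under our own assumptions: units with c = 1, the boost taken along the z-axis, and the standard Lorentz transformation of energy and momentum used in place of equations (4.3) and (4.4), which we assume have this form. The function and variable names are ours.

```python
import numpy as np

C = 1.0  # assumption: units with c = 1


def boost_energy_momentum(h, p, theta_vec):
    """Energy and momentum seen from a frame moving with velocity c*tanh(theta)
    along theta_vec (standard Lorentz transformation, assumed to match (4.3)-(4.4))."""
    theta = np.linalg.norm(theta_vec)
    n = theta_vec / theta
    pn = np.dot(p, n)
    h_new = h * np.cosh(theta) - C * pn * np.sinh(theta)
    p_new = p + n * (pn * (np.cosh(theta) - 1.0) - h * np.sinh(theta) / C)
    return h_new, p_new


theta_vec = np.array([0.0, 0.0, 0.6])     # rapidity vector of the frame O'
theta = np.linalg.norm(theta_vec)
n = theta_vec / theta
v = C * np.tanh(theta)

phi0 = 1.1                                # photon direction in the frame O
P0 = 2.0 * np.array([np.sin(phi0), 0.0, np.cos(phi0)])
H0 = C * np.linalg.norm(P0)               # massless particle: H = cP

H1, P1 = boost_energy_momentum(H0, P0, theta_vec)
cos_phi = np.dot(P1, n) / np.linalg.norm(P1)   # definition (5.68)

doppler_566 = H0 * np.cosh(theta) * (1 - v / C * np.cos(phi0))
doppler_569 = H0 / (np.cosh(theta) * (1 + v / C * cos_phi))
aberration = (np.cos(phi0) - v / C) / (1 - v / C * np.cos(phi0))
print(H1, doppler_566, doppler_569)
print(cos_phi, aberration)
```

Both forms of the Doppler formula, (5.66) and (5.69), should reproduce the directly transformed energy, and the transformed propagation direction should satisfy the aberration relation.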
Our derivations above referred to the case when there was one source of
light and two observers moving with respect to each other (see Fig. 5.5(a)).
However, this setup is not characteristic for most astronomical observations
of the Doppler effect. In these observations one typically has one observer
and two sources of light (stars) that move with respect to each other with
velocity v (see Fig. 5.5(b)). The aim of observations is to measure the
energy (frequency) difference of photons emitted by the two stars. Let us
assume that the distance between the stars S and S ′ is much smaller than
the distance from the stars to the Earth. Photons emitted simultaneously
by S and S′ move with the same speed c and arrive at Earth at the same
time. Two stars are seen by O in the same region of the sky independent
of the velocity v. We also assume that sources S and S′ are identical, i.e.,
they emit photons of the same energy in their respective reference frames.
Furthermore, we assume that the energy h(0) of photons arriving from the
source S to the observer O is known. Our goal is to find the energy (denoted
by h(θ)) of photons emitted by S ′ from the point of view of O. In order
to do that, we introduce an imaginary observer O ′ whose velocity v with
respect to O is the same as velocity of S ′ with respect to S and apply the
principle of relativity. According to this principle, the energy of photons from
S ′ registered by O ′ is the same as the energy of photons from S registered
by O, i.e., h(0). Now, in order to find the energy of photons from S ′ seen by
O we can apply formula (5.69) with the opposite sign of velocity
$$h(\theta) = \frac{h(0)}{\cosh\theta\left( 1 - \frac{v}{c}\cos\phi \right)} \qquad (5.70)$$
where φ is the angle between velocity v of the star S ′ and the direction of
light arriving from stars S and S ′ from the point of view of O.
Description of the Doppler effect from a different point of view will be
found in subsection 6.4.2.
Chapter 6
INTERACTION
I myself, a professional mathematician, on re-reading my own

work find it strains my mental powers to recall to mind from the
figures the meanings of the demonstrations, meanings which I my-
self originally put into the figures and the text from my mind. But
when I attempt to remedy the obscurity of the material by putting
in extra words, I see myself falling into the opposite fault of be-
coming chatty in something mathematical.
Johannes Kepler
In the preceding chapter we were concerned with isolated elementary particles moving freely in space. Starting from this chapter we will focus on compound systems consisting of two or more particles. In addition we will allow a redistribution of energy and momentum between different parts of the system, in other words we will allow interactions. In this chapter, our analysis will be limited to cases in which creation and/or destruction of particles is not allowed, and only a few massive spinless particles are present. Starting from chapter 8 we are going to lift these restrictions and consider interacting systems in full generality.
6.1 Hilbert space of a many-particle system

In this section we will construct the Hilbert space of a compound system. In
quantum mechanics textbooks it is tacitly assumed that this space should be
built as a tensor product of Hilbert spaces of the components. Here we will
show how this statement can be proven from postulates of quantum logic.1
For simplicity, we will start with the simplest case of a two-particle system.
6.1.1 Tensor product theorem

Let L1 , L2 , and L be quantum propositional systems of particles 1, 2, and the
compound system 1+2, respectively. It seems reasonable to assume that each
proposition about subsystem 1 (or 2) is still valid in the combined system.
So, propositions in L1 and L2 should be represented also as propositions in
L. Let us formulate this idea as a new postulate
Postulate 6.1 (properties of compound systems) If L1 and L2 are quan-

tum propositional systems describing two physical systems 1 and 2, and L is
the quantum propositional system describing the compound system 1+2, then
there exist two mappings
f1 : L1 → L
f2 : L2 → L
which satisfy the following conditions:

(I) The mappings f1 and f2 preserve all logical relationships between
propositions, so that
f1 (∅L1 ) = ∅L
f1 (IL1 ) = IL
and for any two propositions x, y ∈ L1
x ≤ y ⇔ f1 (x) ≤ f1 (y)
f1 (x ∧ y) = f1 (x) ∧ f1 (y)
f1 (x ∨ y) = f1 (x) ∨ f1 (y)
f1 (x⊥ ) = (f1 (x))⊥
The same properties are valid for the mapping f2 : L2 → L.
1 See sections 1.3 - 1.5.

(II) The results of measurements on two subsystems are independent.
This means that in the compound system all propositions about subsystem
1 are compatible with all propositions about subsystem 2:
f1 (x1 ) ↔ f2 (x2 )
where x1 ∈ L1 , x2 ∈ L2 .
(III) If we have full information about subsystems 1 and 2, then we have
full information about the combined system. Therefore, if x1 ∈ L1 and x2 ∈
L2 are atoms then the meet of their images f1 (x1 ) ∧ f2 (x2 ) is also an atomic
proposition in L.
The following theorem [Mat75, AD78b] allows us to translate the above
properties of the compound system from the language of quantum logic to
the more convenient language of Hilbert spaces.
Theorem 6.2 (Matolcsi) Suppose that H1 , H2 , and H are three complex
Hilbert spaces corresponding to the propositional lattices L1 , L2 , and L,
respectively. Suppose also that f1 and f2 are two mappings satisfying all
conditions from Postulate 6.1. Then the Hilbert space H of the compound system
is one of the four tensor products2 H = H1 ⊗ H2 , or H = H1∗ ⊗ H2 ,
or H = H1 ⊗ H2∗ , or H = H1∗ ⊗ H2∗ .
The proof of this theorem is beyond the scope of our book.

So we have four ways to couple two one-particle Hilbert spaces into one
two-particle Hilbert space. Quantum mechanics uses only the first possibility
H = H1 ⊗ H2 .3 This means that if particle 1 is in a state |1⟩ ∈ H1 and particle
2 is in a state |2⟩ ∈ H2 , then the state of the compound system is described
by the vector |1⟩ ⊗ |2⟩ ∈ H1 ⊗ H2 .
2 For definition of the tensor product of two Hilbert spaces see Appendix F.4. The star
denotes a dual Hilbert space as described in Appendix F.3.
6.1.2 Particle observables in a multiparticle system

The mappings f1 and f2 from Postulate 6.1 map propositions (projections)
from Hilbert spaces of individual particles H1 and H2 into the Hilbert space
H = H1 ⊗ H2 of the compound system. Therefore, they also map particle
observables from H1 and H2 to H. For example, consider a 1-particle
observable $G^{(1)}$ that is represented in the Hilbert space H1 by the Hermitian
operator with a spectral decomposition (1.28)
$$G^{(1)} = \sum_g g P_g^{(1)}$$
Then the mapping f1 transforms $G^{(1)}$ into a Hermitian operator $f_1(G^{(1)})$ in
the Hilbert space H of the compound system
$$f_1(G^{(1)}) = \sum_g g f_1(P_g^{(1)})$$
which has the same spectrum g as $G^{(1)}$. Thus we conclude that all observables
of individual particles, e.g., P1 , R1 in H1 and P2 , R2 in H2 have well-defined
meanings in the Hilbert space H of the combined system.
In what follows we will use small letters to denote observables of individual
particles in H.4 For example, the position and momentum of the particle 1
in the two-particle system will be denoted as r1 and p1 . The operator of
energy of the particle 1 will be written as $h_1 = \sqrt{m_1^2 c^4 + p_1^2 c^2}$, etc. Similarly,
observables of the particle 2 in H will be denoted as p2 , r2 , and h2 . According
to Postulate 6.1(II), spectral projections of observables of different particles
commute with each other in H. Therefore, observables of different particles
commute with each other as well.
3 It is not yet clear what is the physical meaning of the other three possibilities.
4 We will keep using capital letters for the total observables (H, P, J, and K) of the
compound system.
Just as in the single-particle case discussed in chapter 5, two-particle
states can also be described by wave functions. From the properties of the
tensor product of Hilbert spaces it can be derived that if ψ1 (r1 ) is the wave
function of particle 1 and ψ2 (r2 ) is the wave function of particle 2, then the
wave function of the compound system is simply a product
ψ(r1 , r2 ) = ψ1 (r1 )ψ2 (r2 ) (6.1)
In this case, both particles 1 and 2 and the compound system are in pure
quantum states. However, the most general pure 2-particle state in H1 ⊗
H2 is described by a general (normalizable) function of two vector variables
ψ(r1 , r2 ) which is not necessarily expressed in the product form (6.1). In this
case, individual subsystems are in mixed states: the results of measurements
performed on the particle 1 are correlated with the results of measurements
performed on the particle 2, even though the particles do not interact with
each other. The existence of such entangled states is a distinctive feature of
quantum mechanics, which is not present in the classical world.
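The difference between product and entangled states can be checked on a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the continuous variables r1 and r2 are replaced by two-valued indices, the two-particle state is a table of amplitudes c[i][j], and the helper names (reduced_density_matrix, purity, product_state) are hypothetical. The quantity Tr(ρ²) of the reduced density matrix of particle 1 equals 1 when particle 1 is in a pure state and is smaller than 1 when it is in a mixed state.

```python
import math

def reduced_density_matrix(c):
    """rho1[i][k] = sum_j c[i][j] * conj(c[k][j]) for two-particle amplitudes c[i][j]."""
    n, m = len(c), len(c[0])
    return [[sum(c[i][j] * complex(c[k][j]).conjugate() for j in range(m))
             for k in range(n)] for i in range(n)]

def purity(rho):
    """Tr(rho^2); it equals 1 only if the subsystem is in a pure state."""
    n = len(rho)
    return sum((rho[i][k] * rho[k][i]).real for i in range(n) for k in range(n))

def product_state(psi1, psi2):
    """Amplitudes of the product form (6.1): c[i][j] = psi1[i] * psi2[j]."""
    return [[a * b for b in psi2] for a in psi1]

s = 1 / math.sqrt(2)
product = product_state([s, s], [1, 0])   # each particle has its own state vector
entangled = [[s, 0], [0, s]]              # cannot be written in the form (6.1)

print("product state:  ", purity(reduced_density_matrix(product)))
print("entangled state:", purity(reduced_density_matrix(entangled)))
```

For the product state the purity of particle 1 is 1 up to rounding, while for the entangled state it drops to 1/2, even though the two-particle state itself is pure and no interaction has been introduced.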
6.1.3 Statistics
The above construction of the two-particle Hilbert space H = H1 ⊗H2 is valid
when particles 1 and 2 belong to different species. If particles 1 and 2 are
identical, then there are vectors in H1 ⊗ H2 that do not correspond to
physically realizable states, and the Hilbert space of two-particle states is
smaller than H1 ⊗ H2 . Indeed, if two particles 1 and 2 are identical, then no
measurable quantity will change if these particles are interchanged. Therefore,
after permutation of two particles, the wave function may at most acquire
an insignificant unimodular phase factor β
ψ(r2 , σ2 ; r1 , σ1 ) = βψ(r1 , σ1 ; r2 , σ2 ) (6.2)
If we swap the particles again then the original wave function must be
restored
ψ(r1 , σ1 ; r2 , σ2 ) = βψ(r2 , σ2 ; r1 , σ1 ) = β² ψ(r1 , σ1 ; r2 , σ2 )
Therefore β² = 1, which implies that the factor β for any physical state
ψ(r1 , σ1 ; r2 , σ2 ) in H can be either 1 or −1. If a vector in H does not have
this property, then this vector does not correspond to a physically realizable
state. Thus the Hilbert space of physical states of two identical particles is
only a subspace in H.
Is it possible that in a system of two identical particles one state φ1 (r1 , σ1 ; r2 , σ2 )
has factor β equal to 1
φ1 (r1 , σ1 ; r2 , σ2 ) = φ1 (r2 , σ2 ; r1 , σ1 ) (6.3)
and another state φ2 (r1 , σ1 ; r2 , σ2 ) has factor β equal to −1?
φ2 (r1 , σ1 ; r2 , σ2 ) = −φ2 (r2 , σ2 ; r1 , σ1 ) (6.4)
If equations (6.3) and (6.4) were true, then a linear combination of the
states φ1 and φ2 with non-zero coefficients a and b
ψ(r1 , σ1 ; r2 , σ2 ) = aφ1 (r1 , σ1 ; r2 , σ2 ) + bφ2 (r1 , σ1 ; r2 , σ2 )
would not transform like (6.2) after permutation
ψ(r2 , σ2 ; r1 , σ1 ) = aφ1 (r2 , σ2 ; r1 , σ1 ) + bφ2 (r2 , σ2 ; r1 , σ1 )
= aφ1 (r1 , σ1 ; r2 , σ2 ) − bφ2 (r1 , σ1 ; r2 , σ2 )
≠ ±ψ(r1 , σ1 ; r2 , σ2 )
It then follows that the factor β must be the same for all states in the Hilbert
space H of the system of two identical particles. This result implies that all
particles in nature are divided into two categories: bosons and fermions.
For bosons β = 1 and two-particle wave functions are symmetric with
respect to permutations. Wave functions of two bosons form a linear subspace
H1 ⊗sym H2 ⊆ H1 ⊗ H2 . This means, in particular, that two identical bosons
may occupy the same quantum state, i.e., the wave function ψ(r1 , σ1 )ψ(r2 , σ2 )
belongs to the bosonic subspace H1 ⊗sym H2 .
For fermions, β = −1 and two-particle wave functions are antisymmetric
with respect to permutations of particle variables. The Hilbert space of two
identical fermions is the subspace of antisymmetric functions H1 ⊗asym H2 ⊆
H1 ⊗ H2 . This means, in particular, that two identical fermions may not
occupy the same quantum state (this is called the Pauli exclusion principle),
i.e., the wave function ψ(r1 , σ1 )ψ(r2 , σ2 ) does not belong to the antisymmetric
fermionic subspace H1 ⊗asym H2 .
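The bosonic and fermionic subspaces can be made concrete in a small discrete model. The following sketch is an illustration only: one-particle states are assumed to live on three discrete "modes" instead of the continuous variables (r, σ), and the helpers swap, project, product, and norm2 are hypothetical names. The projection on the β = −1 subspace annihilates the product of two identical one-particle states, which is the Pauli exclusion principle in this toy setting.

```python
def swap(psi):
    """Exchange the two particles: (swap psi)[i][j] = psi[j][i]."""
    n = len(psi)
    return [[psi[j][i] for j in range(n)] for i in range(n)]

def project(psi, beta):
    """Project onto the beta = +1 (bosonic) or beta = -1 (fermionic) subspace."""
    p = swap(psi)
    n = len(psi)
    return [[0.5 * (psi[i][j] + beta * p[i][j]) for j in range(n)] for i in range(n)]

def product(f, g):
    """Two-particle amplitudes of the product state f(1) g(2)."""
    return [[a * b for b in g] for a in f]

def norm2(psi):
    """Squared norm of a two-particle state."""
    return sum(abs(x) ** 2 for row in psi for x in row)

f = [1, 0, 0]            # one-particle state in mode 0
g = [0, 1, 0]            # one-particle state in mode 1

same = product(f, f)     # both particles in the same state
diff = product(f, g)     # particles in different states

print("bosons, same state:       ", norm2(project(same, +1)))
print("fermions, same state:     ", norm2(project(same, -1)))
print("fermions, different states:", norm2(project(diff, -1)))
```

The fermionic projection of the "same state" vector has zero norm, while two fermions in different one-particle states survive the projection as an antisymmetric combination.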
A remarkable spin-statistics theorem has been proven in the framework
of quantum field theory. This theorem establishes (in full agreement with
experiment) that the symmetry of two-particle wave functions is related to
their spin: all particles with integer spin (e.g., photons) are bosons and
all particles with half-integer spin (e.g., neutrinos, electrons, protons) are
fermions.
All results of this section can be immediately generalized to the case of
an n-particle system, where n > 2. For example, the Hilbert space of n identical
bosons is the symmetrized tensor product H = H1 ⊗sym H2 ⊗sym . . . ⊗sym Hn ,
and the Hilbert space of n identical fermions is the antisymmetrized tensor
product H = H1 ⊗asym H2 ⊗asym . . . ⊗asym Hn .
6.2 Relativistic Hamiltonian dynamics

To complete our description of the 2-particle system initiated in the preceding
section we need to specify a unitary representation $U_g$ of the Poincaré group
in the Hilbert space H = H1 ⊗ H2 .5 We already know from chapter 4
that generators of this representation (and some functions of generators)
will define total observables of the compound system. From subsection 6.1.2
we also know how to define observables of individual particles in H. If we
assume that total observables in H may be expressed as functions of particle
observables p1 , r1 , p2 , and r2 , then the construction of $U_g$ is equivalent to
finding 10 Hermitian operator functions
H(p1 , r1 , p2 , r2 ) (6.5)
P(p1 , r1 , p2 , r2 ) (6.6)
J(p1 , r1 , p2 , r2 ) (6.7)
K(p1 , r1 , p2 , r2 ) (6.8)
which satisfy commutation relations of the Poincaré Lie algebra (3.52) -
(3.58). Even in the simplest two-particle case this problem does not have a
unique solution, and additional physical principles should be applied to make
sure that generators (6.5) - (6.8) are selected in agreement with observations.
For a general multiparticle system, the construction of the representation $U_g$
is the most difficult and the most important part of relativistic quantum
theories. A large portion of the rest of this book is devoted to the analysis of
different ways to construct the representation $U_g$ . It is important to understand
that once this step is completed, we get everything we need for a full
theoretical description of the physical system and for comparison of calculations
with experimental data.
5 For simplicity we will assume that particles 1 and 2 are massive, spinless, and not
identical.
6.2.1 Non-interacting representation of the Poincaré group

There are infinitely many ways to define the representation $U_g$ of the Poincaré
group in the Hilbert space H = H1 ⊗ H2 . Let us start our analysis from one
legitimate choice which has a transparent physical meaning. We know from
chapter 5 that one-particle Hilbert spaces H1 and H2 carry irreducible unitary
representations $U_g^1$ and $U_g^2$ of the Poincaré group. Functions f1 and f2 defined
in subsection 6.1.1 allow us to map these representations to the Hilbert space
H of the compound system, i.e., to have two representations of the Poincaré
group $f_1(U_g^1)$ and $f_2(U_g^2)$ in H.6 We can then define a new representation
$U_g^0$ of the Poincaré group in H by making a (tensor product) composition of
$f_1(U_g^1)$ and $f_2(U_g^2)$. More specifically, for any vector |1⟩ ⊗ |2⟩ ∈ H we define
$$U_g^0 (|1\rangle \otimes |2\rangle) = f_1(U_g^1)|1\rangle \otimes f_2(U_g^2)|2\rangle$$ (6.9)
and the action of $U_g^0$ on other vectors in H follows by linearity.
Representation (6.9) is called the tensor product of unitary representations $U_g^1$ and
$U_g^2$ and is denoted by $U_g^0 = U_g^1 \otimes U_g^2$. Generators of this representation are
expressed as sums of one-particle generators
H0 = h1 + h2 (6.10)
P0 = p1 + p2 (6.11)
J0 = j1 + j2 (6.12)
K0 = k1 + k2 (6.13)
The Poincaré commutation relations for generators (6.10) - (6.13) follow
immediately from the facts that one-particle generators corresponding to
particles 1 and 2 satisfy Poincaré commutation relations separately and that
operators of different particles commute with each other.
6 These representations are no longer irreducible, of course. For example, $f_1(U_g^1)$ is a
direct sum of (infinitely) many irreducible representations isomorphic to $U_g^1$.
With definitions (6.10) - (6.13), inertial transformations of particle
observables with respect to the representation $U_g^0$ are easy to find. For example,
positions of particles 1 and 2 change with time as
$$\mathbf{r}_1(t) = e^{\frac{i}{\hbar} H_0 t} \mathbf{r}_1 e^{-\frac{i}{\hbar} H_0 t} = e^{\frac{i}{\hbar} h_1 t} \mathbf{r}_1 e^{-\frac{i}{\hbar} h_1 t} = \mathbf{r}_1 + \mathbf{v}_1 t$$
$$\mathbf{r}_2(t) = \mathbf{r}_2 + \mathbf{v}_2 t$$
Comparing this with equation (4.56), we conclude that all observables of
particles 1 and 2 transform independently from each other as if these particles
were alone. So, the representation (6.10) - (6.13) corresponds to the absence
of interaction and is called the non-interacting representation of the Poincaré
group.
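The steps behind the time dependence of r1 can be spelled out as follows. This is a sketch under two assumptions that are not restated in this section: the one-particle commutator has the form [r_{1i}, p_{1j}] = iħδ_{ij}, and the one-particle velocity is defined as v1 = p1 c²/h1. Since h2 commutes with both r1 and h1, only h1 survives in the exponents, and the series of nested commutators terminates after the first term.

```latex
\begin{align*}
[\mathbf{r}_1, h_1] &= i\hbar \frac{\partial h_1}{\partial \mathbf{p}_1}
  = i\hbar \frac{\mathbf{p}_1 c^2}{h_1} \equiv i\hbar \mathbf{v}_1,
  \qquad [\mathbf{v}_1, h_1] = 0, \\
\mathbf{r}_1(t) &= e^{\frac{i}{\hbar} h_1 t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar} h_1 t}
  = \mathbf{r}_1 + \frac{it}{\hbar}[h_1, \mathbf{r}_1]
  + \frac{1}{2!}\left(\frac{it}{\hbar}\right)^2 [h_1, [h_1, \mathbf{r}_1]] + \ldots \\
  &= \mathbf{r}_1 + \frac{it}{\hbar}(-i\hbar \mathbf{v}_1) + 0
  = \mathbf{r}_1 + \mathbf{v}_1 t .
\end{align*}
```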
6.2.2 Dirac’s forms of dynamics

Obviously, the simple choice of generators (6.10) - (6.13) is not realistic,
because particles in nature do interact with each other. Therefore, to describe
interactions in multi-particle systems one should choose an interacting
representation $U_g$ of the Poincaré group in H which is different from $U_g^0$. First we
write the generators (H, P, J, K) of the desired representation $U_g$ in the most
general form where all generators are different from their non-interacting
counterparts by the presence of interaction terms denoted by V, U, Y, Z:7
H = H0 + V (r1 , p1 , r2 , p2 ) (6.14)
P = P0 + U(r1 , p1 , r2 , p2 ) (6.15)
J = J0 + Y(r1 , p1 , r2 , p2 ) (6.16)
K = K0 + Z(r1 , p1 , r2 , p2 ) (6.17)
7 Our approach to the description of interactions based on equations (6.14) - (6.17)
and their generalizations for multiparticle systems is called the relativistic Hamiltonian
dynamics [KP91]. For completeness, we should mention that there are a number of other
(non-Hamiltonian) methods for describing interactions. Overviews of these methods and
further references can be found in [Kei94, DW65, Pol85]. We will not discuss the
non-Hamiltonian approaches in this book.
It may happen that some interaction operators on the right hand sides of
equations (6.14) - (6.17) are zero. Then these generators and corresponding
finite transformations coincide with generators and transformations of the
non-interacting representation $U_g^0$. Such generators and transformations will
be referred to as kinematical. Generators which contain interaction terms
are called dynamical.
Table 6.1: Comparison of three relativistic forms of dynamics

Instant form    Point form    Front form
Kinematical generators
P0x             K0x           P0x
P0y             K0y           P0y
P0z             K0z           (H0 + P0z)/√2
J0x             J0x           (K0x + J0y)/√2
J0y             J0y           (K0y − J0x)/√2
J0z             J0z           J0z
-               -             K0z
Dynamical generators
H               H             (H − Pz)/√2
Kx              Px            (Kx − Jy)/√2
Ky              Py            (Ky + Jx)/√2
Kz              Pz            -
The description of interaction by equations (6.14) - (6.17) generalizes
traditional classical non-relativistic Hamiltonian dynamics in which the only
dynamical generator is the Hamiltonian H. To make sure that our theory
reduces to the familiar non-relativistic approach in the limit c → ∞, we will
also assume that time translations are generated by a dynamical Hamiltonian
H = H0 + V with a non-vanishing interaction V . The choice of other
generators is restricted by the observation that kinematical transformations
should form a subgroup of the Poincaré group, so that kinematical generators
should form a subalgebra of the Poincaré Lie algebra.8 The set (P, J, K) does
not form a subalgebra, because the commutator of Ki and Pi is proportional to
the Hamiltonian H. This explains why in the relativistic case we cannot
introduce interaction in the Hamiltonian alone. We must add interaction
terms to some of the other generators P, J, or K in order to be consistent
with relativity. We will say that interacting representations having different
kinematical subgroups belong to different forms of dynamics. In his famous
paper [Dir49], Dirac provided a classification of forms of dynamics based
on this principle. Table 6.1 lists the three of Dirac's forms of dynamics most
frequently discussed in the literature. In the case of the instant form dynamics,
the kinematical subgroup is the subgroup of spatial translations and
rotations. In the case of the point form dynamics the kinematical subgroup is
the Lorentz subgroup [Tho52]. In both these cases the kinematical subgroup
has dimension 6. The front form dynamics has the largest number (7) of
kinematical generators.
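The subalgebra property of the kinematical sets in Table 6.1 can be verified mechanically. The sketch below is a hedged illustration: it assumes that the Poincaré commutators have the form [Ji, Pj] = iħ ε_ijk Pk, [Ji, Jj] = iħ ε_ijk Jk, [Ji, Kj] = iħ ε_ijk Kk, [Ki, Kj] = −(iħ/c²) ε_ijk Jk, [Ki, Pj] = −(iħ/c²) H δij, [Ki, H] = −iħ Pi (all other commutators vanishing), works in units iħ = c = 1, and drops the 1/√2 normalization of the front form generators because it does not affect linear spans. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Basis order: H, Px, Py, Pz, Jx, Jy, Jz, Kx, Ky, Kz (units with i*hbar = c = 1).
H, P, J, K = 0, (1, 2, 3), (4, 5, 6), (7, 8, 9)
DIM = 10

def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def unit(n, coeff=1):
    v = [0] * DIM
    v[n] = coeff
    return v

def comb(*terms):
    """Linear combination built from (coefficient, basis index) pairs."""
    v = [0] * DIM
    for c, n in terms:
        v[n] += c
    return v

# TABLE[m][n] holds the coefficients of [basis_m, basis_n] / (i*hbar).
TABLE = [[[0] * DIM for _ in range(DIM)] for _ in range(DIM)]

def set_bracket(m, n, vec):
    TABLE[m][n] = vec
    TABLE[n][m] = [-x for x in vec]

for i in range(3):
    set_bracket(K[i], H, unit(P[i], -1))                # [K_i, H] = -P_i
    set_bracket(K[i], P[i], unit(H, -1))                # [K_i, P_i] = -H
    for j in range(3):
        for k in range(3):
            e = eps(i, j, k)
            if e:
                set_bracket(J[i], P[j], unit(P[k], e))  # [J_i, P_j] = eps_ijk P_k
                set_bracket(J[i], K[j], unit(K[k], e))  # [J_i, K_j] = eps_ijk K_k
                TABLE[J[i]][J[j]] = unit(J[k], e)       # [J_i, J_j] = eps_ijk J_k
                TABLE[K[i]][K[j]] = unit(J[k], -e)      # [K_i, K_j] = -eps_ijk J_k

def bracket(a, b):
    """Commutator of two linear combinations of the basis generators."""
    out = [Fraction(0)] * DIM
    for m, am in enumerate(a):
        for n, bn in enumerate(b):
            if am and bn:
                for k, c in enumerate(TABLE[m][n]):
                    out[k] += am * bn * c
    return out

def rank(vectors):
    """Rank of a list of coefficient vectors (exact Gaussian elimination)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(DIM):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def is_closed(gens):
    """True if every commutator of two generators stays in their linear span."""
    r = rank(gens)
    return all(rank(gens + [bracket(a, b)]) == r for a, b in combinations(gens, 2))

instant = [unit(n) for n in P + J]
point = [unit(n) for n in K + J]
front = [unit(P[0]), unit(P[1]), comb((1, H), (1, P[2])),
         comb((1, K[0]), (1, J[1])), comb((1, K[1]), (-1, J[0])),
         unit(J[2]), unit(K[2])]
pjk = [unit(n) for n in P + J + K]

for name, gens in [("instant", instant), ("point", point),
                   ("front", front), ("(P, J, K)", pjk)]:
    print(name, "dimension", rank(gens), "closed:", is_closed(gens))
```

Under these assumptions the three kinematical sets of Table 6.1 close under commutation, with dimensions 6, 6, and 7, while the nine generators (P, J, K) do not, since the commutator of Kx and Px produces H.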
6.2.3 Total observables in a multiparticle system

Once the interacting representation of the Poincaré group and its generators
(H, P, J, K) are defined, we immediately have expressions for total observ-
ables of the physical system considered as a whole. These are the total energy
H, the total momentum P, and the total angular momentum J. Other total
observables of the system (the total mass M, spin S, center-of-mass position
R, etc.) can be obtained as functions of these generators by formulas derived
in chapter 4.
Note also that inertial transformations of the total observables (H, P, J, K)
coincide with those presented in chapter 4. This is because total observables
coincide with the Poincaré group generators, and this fact is independent
of the interaction present in the system. For example, the total energy H
and the total momentum P form a 4-vector whose boost transformations are
always given by equations (4.3) - (4.4). Boost transformations of the
center-of-mass position R are those derived in subsection 4.3.8. Time translations
result in a uniform movement of the center-of-mass with constant velocity
along a straight line (4.56). Thus we conclude that inertial transformations
of total observables are completely independent of the form of dynamics and
of the details of interactions acting within the multiparticle system.
8 Indeed, if two generators A and B do not contain interaction terms, then their
commutator [A, B] should be interaction-free as well.
6.3 Instant form of dynamics

In sections 17.2 and 13.4 we will see that the instant form of dynamics agrees
with observations better than other forms. So, in this book we will prefer to
use instant form interactions to describe realistic systems.
6.3.1 General instant form interaction

In the instant form we can rewrite equations (6.14) - (6.17) as
H = H0 + V (6.18)
P = P0 (6.19)
J = J0 (6.20)
K = K0 + Z (6.21)
As we discussed in subsection 6.2.3, the observables H, P, J, and K are total
observables that correspond to the compound system as a whole. The total
momentum P (6.11) and the total angular momentum J (6.12) are simply
vector sums of the corresponding operators for individual particles. The
total energy H and the boost operator K are written as sums of one-particle
operators plus interaction terms. The interaction term V in H is usually
called the potential energy operator. Similarly, we will call Z the potential
boost. It is important to note that in an instant-form relativistic interacting
system the potential boost operator cannot vanish. We will see later that the
non-vanishing “boost interaction” has a profound effect on transformations
of observables between moving reference frames. The potential boost Z will
play a crucial role in our non-traditional approach to relativity in the second
part of this book.
Other total observables (e.g., the total mass M, spin S, center-of-mass
position R, and its velocity V, etc.) are defined as functions of generators
(6.18) - (6.21) by formulas from chapter 4. For interacting systems, these
Hermitian operators are generally interaction-dependent as well.
According to the principle of relativity, ten operators (6.18) - (6.21) must
obey Poincaré commutation relations (3.52) - (3.58). This requirement leads
to the following equivalent relationships
$$[\mathbf{J}_0, V] = [\mathbf{P}_0, V] = 0$$ (6.22)
$$[Z_i, P_{0j}] = -\frac{i\hbar \delta_{ij}}{c^2} V$$ (6.23)
$$[J_{0i}, Z_j] = i\hbar \sum_{k=1}^{3} \epsilon_{ijk} Z_k$$ (6.24)
$$[K_{0i}, Z_j] + [Z_i, K_{0j}] + [Z_i, Z_j] = 0$$ (6.25)
$$[\mathbf{Z}, H_0] + [\mathbf{K}_0, V] + [\mathbf{Z}, V] = 0$$ (6.26)
So, the task of constructing a Poincaré invariant theory of interacting
particles has been reduced to finding a non-trivial solution for the set of equations
(6.22) - (6.26) with respect to V and Z. These equations are necessary and
sufficient conditions for the relativistic invariance of our theory.
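As an illustration of how such conditions arise, two of them can be derived in a few lines. The sketch assumes that the boost commutators of the Poincaré Lie algebra have the form [K_i, P_j] = −(iħ/c²) H δ_ij and [K_i, H] = −iħ P_i, and that the non-interacting generators satisfy the same relations on their own; the remaining conditions follow in the same manner from the other commutators.

```latex
\begin{align*}
[K_i, P_j] &= [K_{0i} + Z_i, P_{0j}]
  = -\frac{i\hbar}{c^2} H_0 \delta_{ij} + [Z_i, P_{0j}]
  \;\overset{!}{=}\; -\frac{i\hbar}{c^2}(H_0 + V)\delta_{ij}
  \;\Longrightarrow\; [Z_i, P_{0j}] = -\frac{i\hbar \delta_{ij}}{c^2} V, \\
[K_i, H] &= [K_{0i} + Z_i, H_0 + V]
  = -i\hbar P_{0i} + [Z_i, H_0] + [K_{0i}, V] + [Z_i, V]
  \;\overset{!}{=}\; -i\hbar P_{0i}
  \;\Longrightarrow\; [Z_i, H_0] + [K_{0i}, V] + [Z_i, V] = 0 .
\end{align*}
```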
6.3.2 Bakamjian-Thomas construction

The set of equations (6.22) - (6.26) is rather complicated. The first non-
trivial solution of these equations for multiparticle systems was found by
Bakamjian and Thomas [BT53]. The idea of their approach was as follows.
Instead of working with 10 generators (P, J, K, H), it is convenient to use
an alternative set of operators {P, R, S, M} introduced in subsection 4.3.4.
Denote by {P0 , R0 , S0 , M0 } and {P0 , R, S, M} the sets of operators obtained
by using formulas (4.42) - (4.44) from the non-interacting (P0 , J0 , K0 , H0 )
and interacting (P0 , J0 , K, H) generators, respectively. In a general instant
form dynamics all three operators R, S, and M may contain interaction
terms. However, Bakamjian and Thomas decided to look for a simpler
solution in which the position operator remains kinematical R = R0 . It then
immediately follows that the spin operator is kinematical as well
S = J − [R × P] = J0 − [R0 × P0 ] = S0
Then the interaction term N is present in the mass operator only:
M = M0 + N
From commutators (6.22), the interaction N must satisfy
[P0 , N] = [R0 , N] = [J0 , N] = 0 (6.27)
So, we have reduced our task of solving (6.22) - (6.26) to a simpler problem
of finding one operator N satisfying conditions (6.27). Indeed, by knowing N
and non-interacting operators M0 , P0 , R0 , S0 , we can restore the interacting
generators using formulas (4.45) - (4.47)
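$$\mathbf{P} = \mathbf{P}_0$$ (6.28)
$$H = +\sqrt{M^2 c^4 + P_0^2 c^2}$$ (6.29)
$$\mathbf{K} = -\frac{1}{2c^2}(\mathbf{R}_0 H + H \mathbf{R}_0) - \frac{[\mathbf{P}_0 \times \mathbf{S}_0]}{M c^2 + H}$$ (6.30)
$$\mathbf{J} = \mathbf{J}_0 = [\mathbf{R}_0 \times \mathbf{P}_0] + \mathbf{S}_0$$ (6.31)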
Now let us turn to the construction of N in the case of two massive
spinless particles. Suppose that we found two vector operators $\vec{\pi}$ and $\vec{\rho}$ such
that they form a 6-dimensional Heisenberg Lie algebra
$$[\pi_i, \rho_j] = i\hbar\delta_{ij}$$ (6.32)
$$[\pi_i, \pi_j] = [\rho_i, \rho_j] = 0$$ (6.33)
commuting with the center-of-mass position R0 and the total momentum P0 .
$$[\vec{\pi}, \mathbf{P}_0] = [\vec{\pi}, \mathbf{R}_0] = [\vec{\rho}, \mathbf{P}_0] = [\vec{\rho}, \mathbf{R}_0] = 0$$ (6.34)
Suppose also that these relative operators have the following non-relativistic
(c → ∞) limits
$$\vec{\pi} \to \mathbf{p}_1 - \mathbf{p}_2$$
$$\vec{\rho} \to \mathbf{r}_1 - \mathbf{r}_2$$
Then observables $\vec{\pi}$ and $\vec{\rho}$ can be interpreted as relative momentum and
relative position in the two-particle system, respectively. Moreover, any operator
in the Hilbert space H can be expressed either as a function of (p1 , r1 , p2 , r2 )
or as a function of (P0 , R0 , $\vec{\pi}$, $\vec{\rho}$). An interaction operator N satisfying
conditions [N, P0 ] = [N, R0 ] = 0 can be expressed as a function of $\vec{\pi}$ and $\vec{\rho}$
only. To satisfy the last condition [N, J0 ] = 0 we will simply require N to be
an arbitrary function of rotationally invariant combinations of the 2-particle
relative observables
$$N = N(\pi^2, \rho^2, (\vec{\pi} \cdot \vec{\rho}))$$ (6.35)
In this ansatz, the problem of building a relativistically invariant interaction
has been reduced to finding operators of relative positions $\vec{\rho}$ and momenta $\vec{\pi}$
satisfying equations (6.32) - (6.34). This problem has been solved in a number
of works [BT53, BF62, Osb68, FS64]. We will not need explicit formulas for
the operators of relative observables, so we will not reproduce them here.
For systems of n massive spinless particles (n > 2) similar arguments
apply, but instead of one pair of relative operators $\vec{\pi}$ and $\vec{\rho}$ we will have n − 1
pairs,
$$\vec{\pi}_r, \vec{\rho}_r, \quad r = 1, 2, \ldots, n - 1$$ (6.36)
These operators should form a 6(n − 1)-dimensional Heisenberg algebra
commuting with P0 and R0 . Explicit expressions for $\vec{\pi}_r$ and $\vec{\rho}_r$ were constructed,
e.g., in ref. [Cha64]. As soon as these expressions are found, we can build
a Bakamjian-Thomas interaction in an n-particle system by defining the
interaction N as a function of rotationally invariant combinations of relative
operators (6.36)
$$N = N(\pi_1^2, \rho_1^2, (\vec{\pi}_1 \cdot \vec{\rho}_1), \pi_2^2, \rho_2^2, (\vec{\pi}_2 \cdot \vec{\rho}_2), (\vec{\pi}_1 \cdot \vec{\rho}_2), (\vec{\pi}_2 \cdot \vec{\rho}_1), \ldots)$$ (6.37)
6.3.3 Non-Bakamjian-Thomas instant forms of dynamics

In the Bakamjian-Thomas construction, it was assumed that R = R0 , but
this limitation is rather artificial and we will see later that realistic particle
interactions do not satisfy this condition. Any non-Bakamjian-Thomas variant
of the instant form dynamics has position operator R different from the
non-interacting Newton-Wigner position R0 . Let us now establish a connection
between such a general instant form interaction and the Bakamjian-Thomas
form. We are going to demonstrate that corresponding representations of the
Poincaré group are related by a unitary transformation.
Suppose that operators
(P0 , J0 , K, H) (6.38)
define a Bakamjian-Thomas dynamics. Let us now choose a unitary
operator W commuting with P0 and J0 ,9 and apply this transformation to
generators (6.38).
J0 = W J0 W⁻¹ (6.39)
P0 = W P0 W⁻¹ (6.40)
K′ = W K W⁻¹ (6.41)
H′ = W H W⁻¹ (6.42)
Since unitary transformations preserve commutators
W [A, B] W⁻¹ = [W A W⁻¹ , W B W⁻¹ ]
the transformed generators (6.39) - (6.42) satisfy commutation relations of
the Poincaré Lie algebra in the instant form. However, generally, the new
mass operator $M' = c^{-2}\sqrt{(H')^2 - P_0^2 c^2}$ does not commute with R0 , so (6.39)
- (6.42) are not necessarily in the Bakamjian-Thomas form.
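The statement that unitary transformations preserve commutators is easy to check numerically. The following sketch is only a finite-dimensional illustration: the operators are assumed to be 2 × 2 matrices (Pauli matrices for A and B, and an arbitrarily chosen unitary matrix for W), and the helper names are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def transform(w, a):
    """W A W^{-1}; for a unitary W the inverse equals the Hermitian conjugate."""
    return matmul(matmul(w, a), dagger(w))

def max_diff(a, b):
    """Largest absolute difference between matrix elements."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[0, 1], [1, 0]]          # sigma_x
B = [[0, -1j], [1j, 0]]       # sigma_y
theta, alpha = 0.7, 0.3       # arbitrary parameters of the unitary matrix
W = [[cmath.exp(1j * alpha) * math.cos(theta), -math.sin(theta)],
     [math.sin(theta), cmath.exp(-1j * alpha) * math.cos(theta)]]

lhs = transform(W, commutator(A, B))                   # W [A, B] W^{-1}
rhs = commutator(transform(W, A), transform(W, B))     # [W A W^{-1}, W B W^{-1}]
print("largest deviation:", max_diff(lhs, rhs))
```

The printed deviation is at the level of floating-point rounding, as expected from the identity W A B W⁻¹ = (W A W⁻¹)(W B W⁻¹).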
Thus we have a way to build a non-Bakamjian-Thomas instant form
representation (P0 , J0 , K′ , H′ ) if a Bakamjian-Thomas representation (P0 , J0 , K, H)
is given. However, this construction does not answer the question whether all
instant form interactions can be connected to the Bakamjian-Thomas dynamics
by a unitary transformation. The answer to this question is “yes”: for any
instant form interaction10 {P0 , R′ , S′ , M′ } one can find a unitary operator
W which transforms it to the Bakamjian-Thomas form [CP82]
W⁻¹ {P0 , R′ , S′ , M′ } W = {P0 , R0 , S0 , M} (6.43)
9 In the case of two massive spinless particles such an operator must be a function of
rotationally invariant combinations of vectors P0 , $\vec{\pi}$, and $\vec{\rho}$.
10 Here it is convenient to use “alternative sets” of basic operators introduced in
subsection 4.3.4.
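To see that, let us consider the simplest two-particle case. Operator
$$\mathbf{T} \equiv \mathbf{R}' - \mathbf{R}_0$$
commutes with P0 . Therefore, it can be written as a function of P0 and
relative operators $\vec{\pi}$ and $\vec{\rho}$: $\mathbf{T}(\mathbf{P}_0, \vec{\pi}, \vec{\rho})$. Then one can show that the unitary
operator11
$$W = e^{\frac{i}{\hbar}\mathcal{W}}$$
$$\mathcal{W} = \int_0^{\mathbf{P}_0} \mathbf{T}(\mathbf{P}, \vec{\pi}, \vec{\rho})\, d\mathbf{P}$$ (6.44)
performs the desired transformation (6.43). Indeed
$$W^{-1} \mathbf{P}_0 W = \mathbf{P}_0$$
$$W^{-1} \mathbf{J}_0 W = \mathbf{J}_0$$
because $\mathcal{W}$ is a scalar, which explicitly commutes with P0 . Operator $\mathcal{W}$ has
the following commutators with the center-of-mass position
$$[\mathcal{W}, \mathbf{R}_0] = -\left[\mathbf{R}_0, \int_0^{\mathbf{P}_0} \mathbf{T}(\mathbf{P}, \vec{\pi}, \vec{\rho})\, d\mathbf{P}\right] = -i\hbar \frac{\partial}{\partial \mathbf{P}_0} \int_0^{\mathbf{P}_0} \mathbf{T}(\mathbf{P}, \vec{\pi}, \vec{\rho})\, d\mathbf{P} = -i\hbar \mathbf{T}(\mathbf{P}_0, \vec{\pi}, \vec{\rho}) = -i\hbar(\mathbf{R}' - \mathbf{R}_0)$$
$$[\mathcal{W}, [\mathcal{W}, \mathbf{R}_0]] = 0$$
Therefore
$$W \mathbf{R}_0 W^{-1} = e^{\frac{i}{\hbar}\mathcal{W}} \mathbf{R}_0 e^{-\frac{i}{\hbar}\mathcal{W}} = \mathbf{R}_0 + \frac{i}{\hbar}[\mathcal{W}, \mathbf{R}_0] - \frac{1}{2!\hbar^2}[\mathcal{W}, [\mathcal{W}, \mathbf{R}_0]] + \ldots = \mathbf{R}_0 + (\mathbf{R}' - \mathbf{R}_0) = \mathbf{R}'$$
$$W^{-1} \mathbf{R}' W = \mathbf{R}_0$$
$$W^{-1} \mathbf{S}' W = W^{-1} (\mathbf{J}_0 - \mathbf{R}' \times \mathbf{P}_0) W = \mathbf{J}_0 - \mathbf{R}_0 \times \mathbf{P}_0 = \mathbf{S}_0$$
Finally we can apply transformation W to the mass operator M′ and obtain
operator
$$M = W^{-1} M' W$$
which commutes with both R0 and P0 . This demonstrates that operators
on the right hand side of (6.43) describe a Bakamjian-Thomas instant form
of dynamics.
11 The integral in (6.44) can be treated formally as an integral of an ordinary function
(rather than operator) along the segment [0, P0 ] in the 3D space of variable P0 with
arguments $\vec{\pi}$ and $\vec{\rho}$ being fixed.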
6.3.4 Cluster separability

As we saw above, the requirement of Poincaré invariance imposes rather
loose conditions on interaction. Relativistic invariance can be satisfied in
many different ways. However, there is another physical requirement which
limits the admissible form of interaction. We know from experiment that all
interactions between particles vanish when particles are separated by large
distances.12 So, if in a 2-particle system we remove particle 2 to infinity by
using the space translation operator $e^{\frac{i}{\hbar} \mathbf{p}_2 \mathbf{a}}$, then interaction (6.35) must tend
to zero
$$\lim_{a \to \infty} e^{-\frac{i}{\hbar} \mathbf{p}_2 \mathbf{a}} N(\pi^2, \rho^2, (\vec{\pi} \cdot \vec{\rho})) e^{\frac{i}{\hbar} \mathbf{p}_2 \mathbf{a}} = 0$$ (6.45)
This condition is not difficult to satisfy in the two-particle case. However, in
the relativistic multi-particle case the mathematical form of this condition
becomes rather complicated. This is because now there is more than one
way to separate particles into mutually non-interacting groups. The form of
the n-particle interaction (6.37) must ensure that each spatially separated
m-particle group (m < n) behaves as if it were alone. This, in particular,
implies that we cannot independently choose interactions in systems with
different numbers of particles. The interaction in the n-particle sector of the
theory must be consistent with interactions in all m-particle sectors, where
m < n.
Interactions satisfying these conditions are called cluster separable. We
will postulate that all interactions in nature have the property of separability.
12 We are not considering here a hypothetical potential between quarks, which supposedly
grows as a linear function of the distance and results in the confinement of quarks inside
hadrons.
Postulate 6.3 (cluster separability of interactions) All interactions
are cluster separable. This means that for any division of an n-particle system
(n ≥ 2) into two spatially separated groups (or clusters) of l and m particles
(l + m = n)
1. the interaction separates too, i.e., the clusters move independently of each
other;
2. the interaction in each cluster is the same as in separate l-particle and
m-particle systems, respectively.
An example of a non-separable interaction can be built in the 4-particle
case. The interaction Hamiltonian
$$V = \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2||\mathbf{r}_3 - \mathbf{r}_4|}$$ (6.46)
has the property that no matter how far two pairs of particles (1+2 and 3+4)
are from each other, the relative distance between 3 and 4 affects the force
acting between particles 1 and 2. Such infinite-range interactions are not
present in nature.
In the non-relativistic case the cluster separability is achieved without
much effort. For example, the non-relativistic Coulomb potential energy in
the system of two charged particles is13
$$V_{12} = \frac{1}{|\vec{\rho}|} \equiv \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}$$ (6.47)
which clearly satisfies condition (6.45). In the system of three charged
particles 1, 2, and 3, the potential energy can be written as a simple sum of
two-particle terms
$$V = V_{12} + V_{13} + V_{23} = \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} + \frac{1}{|\mathbf{r}_1 - \mathbf{r}_3|} + \frac{1}{|\mathbf{r}_2 - \mathbf{r}_3|}$$ (6.48)
The spatial separation between particle 3 and the cluster of particles 1+2 can
be increased by applying a large space translation to the particle 3. In
agreement with Postulate 6.3, such a translation effectively cancels interaction
between particles in clusters 3 and 1+2, i.e.
$$\lim_{a \to \infty} e^{\frac{i}{\hbar} \mathbf{p}_3 \mathbf{a}} (V_{12} + V_{13} + V_{23}) e^{-\frac{i}{\hbar} \mathbf{p}_3 \mathbf{a}} = \lim_{a \to \infty} \left( \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} + \frac{1}{|\mathbf{r}_1 - \mathbf{r}_3 + \mathbf{a}|} + \frac{1}{|\mathbf{r}_2 - \mathbf{r}_3 + \mathbf{a}|} \right) = \frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}$$
This is the same potential (6.47) as in an isolated 2-particle system.
Therefore, both conditions (1) and (2) are satisfied and interaction (6.48) is cluster
separable. As we will see below, in the relativistic case construction of a
general cluster-separable multi-particle interaction is a more difficult task.
13 Here we are interested just in the general functional form of interaction, so we are not
concerned with putting correct factors in front of the potentials.
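The contrast between the separable sum (6.48) and the non-separable product (6.46) can be seen in a short numerical sketch. It is a classical illustration only: positions are treated as ordinary 3D points rather than operators, the translation is taken along the x axis, constant factors are dropped as in the text, and the function names are hypothetical.

```python
import math

def shift(r, a):
    """Translate a position by the distance a along the x axis."""
    return (r[0] + a, r[1], r[2])

def v12(r1, r2):
    """Two-particle potential (6.47), constant factors dropped."""
    return 1 / math.dist(r1, r2)

def coulomb3(r1, r2, r3):
    """Sum of two-particle terms, equation (6.48)."""
    return v12(r1, r2) + v12(r1, r3) + v12(r2, r3)

def nonseparable4(r1, r2, r3, r4):
    """Product form of equation (6.46)."""
    return 1 / (math.dist(r1, r2) * math.dist(r3, r4))

r1, r2, r3 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)

# Removing particle 3: the 3-particle potential approaches the isolated V12.
for a in (1e1, 1e3, 1e6):
    print(a, coulomb3(r1, r2, shift(r3, a)) - v12(r1, r2))

# Removing the pair 3+4 as a whole: the potential (6.46) does not change at all
# and still depends on the distance between particles 3 and 4.
near, far = (0.0, 2.5, 0.0), (0.0, 4.0, 0.0)
a = 1e6
print(nonseparable4(r1, r2, shift(r3, a), shift(near, a)))
print(nonseparable4(r1, r2, shift(r3, a), shift(far, a)))
```

In the first loop the difference from V12 decreases roughly as 2/a, while the last two lines show that the value of (6.46) felt by the pair 1+2 is controlled by the internal geometry of the pair 3+4 at any separation.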
Let us now make some definitions which will be useful in discussions
of cluster separability. A smooth m-particle potential $V^{(m)}$ is defined as
an operator that depends on variables of m particles and tends to zero if any
particle or a group of particles is removed to infinity.14 For example, the
potential (6.47) is smooth while (6.46) is not. Generally, a cluster separable
interaction in an n-particle system can be written as a sum
$$V = \sum_{\{2\}} V^{(2)} + \sum_{\{3\}} V^{(3)} + \ldots + V^{(n)}$$ (6.49)
where $\sum_{\{2\}} V^{(2)}$ is a sum of smooth 2-particle potentials over all pairs of
particles; $\sum_{\{3\}} V^{(3)}$ is a sum of smooth 3-particle potentials over all triples of
particles, etc. The example in equation (6.48) is a sum of smooth 2-particle
potentials.
14 In section 8.4 we will explain why we call such potentials smooth.
6.3.5 Non-separability of the Bakamjian-Thomas dynamics

We expect that the property of cluster separability (Postulate 6.3) must be
valid for both potential energy and potential boosts in realistic interacting
systems. For example, in the relativistic case of 3 massive spinless particles
with interacting generators
H = H0 + V (p1 , r1 ; p2 , r2 ; p3 , r3 )
K = K0 + Z(p1 , r1 ; p2 , r2 ; p3 , r3 )
the cluster separability requires, in particular, that
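$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}\, V(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2; \mathbf{p}_3, \mathbf{r}_3)\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = V_{12}(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2) \tag{6.50}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}\, \mathbf{Z}(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2; \mathbf{p}_3, \mathbf{r}_3)\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = \mathbf{Z}_{12}(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2) \tag{6.51}
$$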
where V12 and Z12 are interaction operators for the 2-particle system.
Let us see if these principles can be satisfied by Bakamjian-Thomas in-
teractions. In this case the potential energy is
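$$
\begin{aligned}
V &= H - H_0 \\
&= \sqrt{(\mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3)^2 c^2 + (M_0 + N(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2; \mathbf{p}_3, \mathbf{r}_3))^2 c^4}
 - \sqrt{(\mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3)^2 c^2 + M_0^2 c^4}
\end{aligned}
$$

By removing particle 3 to infinity we obtain

$$
\begin{aligned}
&\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}\, V(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2; \mathbf{p}_3, \mathbf{r}_3)\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \\
&\quad = \sqrt{(\mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3)^2 c^2 + (M_0 + N(\mathbf{p}_1, \mathbf{r}_1; \mathbf{p}_2, \mathbf{r}_2; \mathbf{p}_3, \infty))^2 c^4}
 - \sqrt{(\mathbf{p}_1 + \mathbf{p}_2 + \mathbf{p}_3)^2 c^2 + M_0^2 c^4}
\end{aligned} \tag{6.52}
$$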
According to (6.50) we should require that the right hand side of equation
(6.52) depends only on variables pertinent to particles 1 and 2. Then we
must set
N(p1 , r1 ; p2 , r2 ; p3 , ∞) = 0
which also means that
V (p1 , r1 ; p2 , r2 ; p3 , ∞) = V12 (p1 , r1 ; p2 , r2 ) = 0
and interaction in the 2-particle sector 1+2 vanishes. Similarly, we can show
that interaction V tends to zero when either particle 1 or particle 2 is removed
to infinity. Therefore, V is a smooth 3-particle potential, and there is no
interaction in any 2-particle subsystem: the interaction turns on only if there
are three or more particles close to each other. This is clearly unphysical.
So, we conclude that the Bakamjian-Thomas construction cannot describe
a non-trivial cluster-separable interaction in many-particle systems (see also
[Mut78]).
6.3.6 Cluster separable 3-particle interaction

The problem of constructing relativistic cluster separable many-particle in-
teractions can be solved by allowing non-Bakamjian-Thomas instant form in-
teractions. Our goal here is to construct the interacting Hamiltonian H and
boost K operators in the Hilbert space H = H1 ⊗ H2 ⊗ H3 of a 3-particle sys-
tem so that interaction satisfies the separability Postulate 6.3, i.e., it reduces
to a non-trivial 2-particle interaction when one of the particles is removed to
infinity. In this construction we follow ref. [CP82].
Let us assume that 2-particle potentials Vij and Zij, i, j = 1, 2, 3, resulting
from removing particle k ≠ i, j to infinity are known. They depend on
variables of the i-th and j-th particles only. For example, when particle 3 is
removed to infinity, the interacting operators take the form15

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} H e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = H_0 + V_{12} \equiv H_{12} \tag{6.53}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \mathbf{K} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = \mathbf{K}_0 + \mathbf{Z}_{12} \equiv \mathbf{K}_{12} \tag{6.54}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} M e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = \frac{1}{c^2}\sqrt{H_{12}^2 - P_0^2 c^2} \equiv M_{12} \tag{6.55}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \mathbf{R} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = -\frac{c^2}{2}\left(\mathbf{K}_{12} H_{12}^{-1} + H_{12}^{-1} \mathbf{K}_{12}\right) - \frac{c[\mathbf{P}_0 \times \mathbf{W}_{12}]}{M_{12} H_{12} (M_{12} c^2 + H_{12})} \equiv \mathbf{R}_{12} \tag{6.56}
$$

where operators H12, K12, M12, and R12 (energy, boost, mass, and center-
of-mass position, respectively) will be considered as given. Now we want to
combine the two-particle potentials Vij and Zij together in a cluster-separable
3-particle interaction in analogy with (6.48). It appears that we cannot
form the interactions V and Z in the 3-particle system simply as a sum of
2-particle potentials. One can verify that such a definition would violate
Poincaré commutators. Therefore

$$
V \neq V_{12} + V_{23} + V_{13}
$$

$$
\mathbf{Z} \neq \mathbf{Z}_{12} + \mathbf{Z}_{23} + \mathbf{Z}_{13}
$$

and the relativistic “addition of interactions” should be more complicated.

15 Here we used (4.32) and took into account that [P0 × S12] = [P0 × W12]/(M12 c).
Similar equations result from the removal of particles 1 or 2 to infinity. They are obtained
from (6.53) - (6.56) by permutation of indices (1,2,3).

When particles 1 and 2 are split apart, operators V12 and Z12 must tend
to zero, therefore
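$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_1\mathbf{a}} M_{12} e^{-\frac{i}{\hbar}\mathbf{p}_1\mathbf{a}} = M_0 \tag{6.57}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_2\mathbf{a}} M_{12} e^{-\frac{i}{\hbar}\mathbf{p}_2\mathbf{a}} = M_0 \tag{6.58}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} M_{12} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = M_{12} \tag{6.59}
$$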
The Hamiltonian H12 and boost K12 define an instant form representation
U12 of the Poincaré group in the 3-particle Hilbert space H. The correspond-
ing position operator (6.56) is generally different from the non-interacting
Newton-Wigner position operator

$$
\mathbf{R}_0 = -\frac{c^2}{2}\left(\mathbf{K}_0 H_0^{-1} + H_0^{-1} \mathbf{K}_0\right) - \frac{c[\mathbf{P}_0 \times \mathbf{W}_0]}{M_0 H_0 (M_0 c^2 + H_0)} \tag{6.60}
$$

which is characteristic for the Bakamjian-Thomas form of dynamics. However,
we can unitarily transform the representation U12, so that it acquires a
Bakamjian-Thomas form with operators $\mathbf{R}_0$, $\overline{H}_{12}$, $\overline{\mathbf{K}}_{12}$, $\overline{M}_{12}$.16 Let us denote
such a unitary transformation operator by B12. We can repeat the same
steps for two other pairs of particles 1+3 and 2+3 and write in the general
case i, j = 1, 2, 3; i ≠ j

$$
\begin{aligned}
B_{ij} \mathbf{R}_{ij} B_{ij}^{-1} &= \mathbf{R}_0 \\
B_{ij} H_{ij} B_{ij}^{-1} &= \overline{H}_{ij} \\
B_{ij} \mathbf{K}_{ij} B_{ij}^{-1} &= \overline{\mathbf{K}}_{ij} \\
B_{ij} M_{ij} B_{ij}^{-1} &= \overline{M}_{ij}
\end{aligned}
$$
Operators Bij = {B12 , B13 , B23 } commute with P0 and J0 . Since representa-
tion Uij becomes non-interacting when the distance between particles i and
j tends to infinity, we can write
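$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B_{13} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = 1 \tag{6.61}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B_{23} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = 1 \tag{6.62}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B_{12} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = B_{12} \tag{6.63}
$$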
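The transformed Hamiltonians $\overline{H}_{ij}$ and boosts $\overline{\mathbf{K}}_{ij}$ define Bakamjian-Thomas
representations and their mass operators $\overline{M}_{ij}$ now commute with $\mathbf{R}_0$. So, we
can add $\overline{M}_{ij}$ together to build a new mass operator

$$
\begin{aligned}
\overline{M} &= \overline{M}_{12} + \overline{M}_{13} + \overline{M}_{23} - 2M_0 \\
&= B_{12} M_{12} B_{12}^{-1} + B_{13} M_{13} B_{13}^{-1} + B_{23} M_{23} B_{23}^{-1} - 2M_0
\end{aligned}
$$

which also commutes with $\mathbf{R}_0$. Using this mass operator, we can build an-
other Bakamjian-Thomas representation with generators

$$
\overline{H} = \sqrt{P_0^2 c^2 + \overline{M}^2 c^4} \tag{6.64}
$$

$$
\overline{\mathbf{K}} = -\frac{1}{2c^2}\left(\mathbf{R}_0 \overline{H} + \overline{H} \mathbf{R}_0\right) - \frac{c[\mathbf{P}_0 \times \mathbf{W}_0]}{\overline{M}\, \overline{H} (\overline{M} c^2 + \overline{H})} \tag{6.65}
$$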
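This representation has interactions between all particles, however, it does
not satisfy the cluster property yet. For example, by removing particle 3 to
infinity we do not obtain the interaction M12 characteristic for the subsystem
of two particles 1 and 2. Instead, we obtain a unitary transform of M12 17

$$
\begin{aligned}
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}\, \overline{M}\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}
&= \lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \left(B_{12} M_{12} B_{12}^{-1} + B_{13} M_{13} B_{13}^{-1} + B_{23} M_{23} B_{23}^{-1} - 2M_0\right) e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \\
&= B_{12} M_{12} B_{12}^{-1} - 2M_0 + \lim_{\mathbf{a}\to\infty} \left(e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} M_{13} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} + e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} M_{23} e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}\right) \\
&= B_{12} M_{12} B_{12}^{-1} - 2M_0 + 2M_0 = B_{12} M_{12} B_{12}^{-1}
\end{aligned} \tag{6.66}
$$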
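To fix this deficiency, let us perform a unitary transformation of the repre-
sentation (6.64) - (6.65) with operator B 18

$$
H = B^{-1} \overline{H} B \tag{6.67}
$$

$$
\mathbf{K} = B^{-1} \overline{\mathbf{K}} B \tag{6.68}
$$

$$
M = B^{-1} \overline{M} B \tag{6.69}
$$

We choose the transformation B from the requirement that it must cancel
factors $B_{ij}$ and $B_{ij}^{-1}$ in equations like (6.66) as particle k is removed to infinity.
In other words, B can be any unitary operator, which has the following limits

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} = B_{12} \tag{6.70}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_2\mathbf{a}} B e^{-\frac{i}{\hbar}\mathbf{p}_2\mathbf{a}} = B_{13} \tag{6.71}
$$

$$
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_1\mathbf{a}} B e^{-\frac{i}{\hbar}\mathbf{p}_1\mathbf{a}} = B_{23} \tag{6.72}
$$

17 Here we used (6.57) - (6.59) and (6.61) - (6.63).

18 which must commute with P0 and J0, of course, to preserve the instant form of
interaction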
One can check that one possible choice of B is
B = exp(ln B12 + ln B13 + ln B23 )
Indeed, using equations (6.61) - (6.63) we obtain
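$$
\begin{aligned}
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}
&= \lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \exp(\ln B_{12} + \ln B_{13} + \ln B_{23})\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \\
&= \exp(\ln B_{12}) = B_{12}
\end{aligned}
$$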
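Then, it is easy to show that the interacting representation of the Poincaré
group generated by operators (6.67) and (6.68) satisfies cluster separability
properties (6.53) - (6.56). For example,

$$
\begin{aligned}
\lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} H e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}}
&= \lim_{\mathbf{a}\to\infty} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B^{-1} \overline{H} B e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \\
&= \lim_{\mathbf{a}\to\infty} B_{12}^{-1} e^{\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} \sqrt{P_0^2 c^2 + \overline{M}^2 c^4}\, e^{-\frac{i}{\hbar}\mathbf{p}_3\mathbf{a}} B_{12} \\
&= B_{12}^{-1} \sqrt{P_0^2 c^2 + (B_{12} M_{12} B_{12}^{-1})^2 c^4}\, B_{12} \\
&= \sqrt{P_0^2 c^2 + M_{12}^2 c^4} = H_{12}
\end{aligned}
$$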
Generally, operator B does not commute with the Newton-Wigner position

operator (6.60). Therefore, the mass operator (6.69) also does not com-
mute with R0 , and the representation generated by operators (P0 , J0 , K, H)
does not belong to the Bakamjian-Thomas form. This is consistent with our
conclusion in subsection 6.3.5 that Bakamjian-Thomas dynamics cannot be
made cluster-separable.
Obviously, the above method of constructing relativistic cluster-separable
interactions is very cumbersome. Moreover, its applicability is limited to
interactions that conserve the number of particles. In chapters 9 and 11
we will consider a more general approach that seems to be more relevant to
interactions occurring in nature. This construction will be based on the idea
of quantum fields.
6.4 Bound states and time evolution

We already mentioned that the knowledge of the Poincaré group represen-
tation Ug in the Hilbert space H of a multiparticle system is sufficient for
getting any desired physical information about the system. In this section,
we would like to make this statement more concrete by examining two types
of information, which can be compared with experiment: the mass and en-
ergy spectra of the system and the time evolution of its observables. In the
next section we will discuss scattering experiments, which are currently the
most informative way of studying microscopic systems.
6.4.1 Mass and energy spectra

The mass operator of a non-interacting 2-particle system is
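$$
\begin{aligned}
M_0 &= +\frac{1}{c^2}\sqrt{H_0^2 - P_0^2 c^2} = +\frac{1}{c^2}\sqrt{(h_1 + h_2)^2 - (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2} \\
&= +\frac{1}{c^2}\sqrt{\left(\sqrt{m_1^2 c^4 + p_1^2 c^2} + \sqrt{m_2^2 c^4 + p_2^2 c^2}\right)^2 - (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2}
\end{aligned} \tag{6.73}
$$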
As particles’ momenta can have any value in the 3D momentum space, the
eigenvalues m of the mass operator have continuous spectrum in the range
m1 + m2 ≤ m < ∞ (6.74)
where the minimum value of mass m1 + m2 is obtained from (6.73) when

both particles are at rest p1 = p2 = 0. It then follows that the common
spectrum of mutually commuting operators P0 and

$$
H_0 = +\sqrt{M_0^2 c^4 + P_0^2 c^2}
$$

is the union of mass hyperboloids19 in the 4-dimensional momentum-energy
space. This spectrum is shown by the hatched region in Fig. 6.1(a).
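
The lower bound in (6.74) can be illustrated with a few sample momenta. This is a minimal sketch, assuming natural units with c = 1 and hypothetical masses; the function names `energy` and `invariant_mass` are illustrative only. It evaluates the eigenvalue of M0 from (6.73) for particles at rest, for particles moving with a common velocity, and for a back-to-back configuration.

```python
import math

C = 1.0  # speed of light in natural units (an assumption of this sketch)

def energy(m, p):
    """Free-particle energy h = sqrt(m^2 c^4 + p^2 c^2) for a 3-momentum p."""
    p2 = sum(x * x for x in p)
    return math.sqrt(m**2 * C**4 + p2 * C**2)

def invariant_mass(m1, p1, m2, p2):
    """Eigenvalue of the mass operator M0 from equation (6.73)."""
    h = energy(m1, p1) + energy(m2, p2)
    ptot2 = sum((x + y) ** 2 for x, y in zip(p1, p2))
    return math.sqrt(h * h - ptot2 * C**2) / C**2

m1, m2 = 1.0, 2.5  # hypothetical masses
at_rest = invariant_mass(m1, (0, 0, 0), m2, (0, 0, 0))
# Equal velocities: p1/m1 = p2/m2, so there is no relative motion.
comoving = invariant_mass(m1, (0.4, 0, 0), m2, (1.0, 0, 0))
back_to_back = invariant_mass(m1, (3.0, 0, 0), m2, (-3.0, 0, 0))
print(at_rest, comoving, back_to_back)
```

The first two configurations give the threshold value m1 + m2, while any relative motion of the two particles raises the mass into the continuum.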
In the presence of interaction, the eigenvalues $\mu_n$ of the mass operator
M = M0 + N can be found by solving the stationary Schrödinger equation

$$
M|\Psi_n\rangle = \mu_n|\Psi_n\rangle \tag{6.75}
$$

19 with masses in the interval (6.74)

Figure 6.1: Typical momentum-energy spectrum of (a) non-interacting and
(b) interacting two-particle system. In both panels the horizontal axis is
$P_x c$, the vertical axis is H, and the value $(m_1 + m_2)c^2$ is marked on the
vertical axis.
It is well-known that in the presence of sufficiently weak interaction N, the
spectrum of M will not be perturbed much. For example, for a weak attrac-
tive N, new discrete eigenvalues in the mass spectrum may split off below the
threshold m1 + m2. The eigenvectors of the interacting mass operator with
eigenvalues $\mu_n < m_1 + m_2$ are called bound states. The mass eigenvalues $\mu_n$
are highly degenerate. For example, if $|\Psi_n\rangle$ is an eigenvector corresponding
to $\mu_n$, then for any Poincaré group element g the vector $U_g|\Psi_n\rangle$ is also an
eigenvector with the same mass eigenvalue.20 To remove this degeneracy (at
least partially) one can consider operators P0 and H, which commute with
each other and with M, so that they define a basis of common eigenvectors

$$
\begin{aligned}
M|\Psi_{\mathbf{p},n}\rangle &= \mu_n|\Psi_{\mathbf{p},n}\rangle \\
\mathbf{P}_0|\Psi_{\mathbf{p},n}\rangle &= \mathbf{p}|\Psi_{\mathbf{p},n}\rangle \\
H|\Psi_{\mathbf{p},n}\rangle &= \sqrt{M^2 c^4 + P_0^2 c^2}\,|\Psi_{\mathbf{p},n}\rangle = \sqrt{\mu_n^2 c^4 + p^2 c^2}\,|\Psi_{\mathbf{p},n}\rangle
\end{aligned}
$$

Then sets of common eigenvalues of P0 and H with fixed $\mu_n < m_1 + m_2$ form
hyperboloids

$$
h_n = \sqrt{\mu_n^2 c^4 + p^2 c^2}
$$

which are shown in Fig. 6.1(b) below the continuous part of the common
spectrum of P0 and H. An example of a bound system whose mass spec-
trum has both continuous and discrete parts, the hydrogen atom, will be
considered in greater detail in section 12.2.

20 This means that eigensubspaces with fixed mass $\mu_n$ are invariant with respect to
Poincaré group actions.
6.4.2 Doppler effect revisited

In our discussion of the Doppler effect in subsection 5.3.4 we were interested
in the energy of free photons measured by moving observers or emitted by
moving sources. To this end we applied a boost transformation (5.66) to the
energy E of a free massless photon. It is instructive to look at this problem
from another point of view. Photons are usually emitted by compound mas-
sive physical systems (atoms, molecules, nuclei, etc.) in transitions between
two discrete energy levels E2 and E1 , so that the photon’s energy is found
simply from the energy conservation law21
E = E2 − E1
When the source is moving with respect to the observer (or observer is
moving with respect to the source), the energies of levels 1 and 2 experience
inertial transformations given by formula (4.4). Therefore, to check our the-
ory for consistency, we would like to prove that the Doppler shift calculated
with this formula is the same as that obtained in subsection 5.3.4.
21 The transition energy E is actually not well-defined, because the excited state 2 is
not a stationary state. (See section 13.1.) Therefore our discussion in this subsection is
valid only approximately for long-living states 2, for which the uncertainty of energy can
be neglected.

Figure 6.2: Energy level diagram for a bound system with the ground state
of mass m1 and the excited state of mass m2 (axes: $P_x c$ horizontal, E
vertical). If the system is at rest, its excited state is represented by point A.
Note that the energy of emitted photons (arrows) is less than $(m_2 - m_1)c^2$.
A moving excited state with momentum p2 is represented by point B. The
energies and momenta k of emitted photons depend on the angle between k
and p2.

Suppose that the compound system has two bound states characterized
by mass eigenvalues m1 and m2 > m1 (see Fig. 6.2). Suppose also that
initially the system is in the excited state with mass m2, total momentum
p2, and energy $E_2 = \sqrt{m_2^2 c^4 + p_2^2 c^2}$. In the final state we have the same
system with a lower mass m1, different total momentum p1, and energy
$E_1 = \sqrt{m_1^2 c^4 + p_1^2 c^2}$. In addition, there is a photon with momentum k and
energy ck. From the momentum and energy conservation laws we can write

$$
\begin{aligned}
\mathbf{p}_2 &= \mathbf{p}_1 + \mathbf{k} \\
E_2 &= E_1 + ck \\
\sqrt{m_2^2 c^4 + p_2^2 c^2} &= \sqrt{m_1^2 c^4 + p_1^2 c^2} + ck \\
&= \sqrt{m_1^2 c^4 + (\mathbf{p}_2 - \mathbf{k})^2 c^2} + ck
\end{aligned}
$$
Taking squares of both sides of the last equality, we obtain

$$
k\sqrt{m_1^2 c^2 + (\mathbf{p}_2 - \mathbf{k})^2} = \frac{1}{2}\mu^2 c^2 + p_2 k \cos\varphi - k^2
$$

where $\mu^2 \equiv m_2^2 - m_1^2$ and φ is the angle between vectors p2 and k.22 Taking
squares of both sides again we obtain a quadratic equation

$$
k^2 (m_2^2 c^2 + p_2^2 - p_2^2 \cos^2\varphi) - k \mu^2 c^2 p_2 \cos\varphi - \frac{1}{4}\mu^4 c^4 = 0
$$

with the solution23

$$
k = \frac{\mu^2 c^2}{2 m_2^2 c^2 + 2 p_2^2 \sin^2\varphi} \left( p_2 \cos\varphi + \sqrt{m_2^2 c^2 + p_2^2} \right)
$$

Introducing the rapidity θ of the initial state, we obtain $p_2 = m_2 c \sinh\theta$,
$\sqrt{m_2^2 c^2 + p_2^2} = m_2 c \cosh\theta$ and

$$
k = \frac{\mu^2 c (\sinh\theta \cos\varphi + \cosh\theta)}{2 m_2 (\cosh^2\theta - \sinh^2\theta \cos^2\varphi)} = \frac{\mu^2 c}{2 m_2 \cosh\theta \left(1 - \frac{v}{c}\cos\varphi\right)}
$$

This formula gives the energy of the photon emitted by a system moving
with the speed v = c tanh θ

$$
E(\theta, \varphi) \equiv ck = \frac{E(0)}{\cosh\theta \left(1 - \frac{v}{c}\cos\varphi\right)}
$$

where

$$
E(0) = \frac{\mu^2 c^2}{2 m_2}
$$

is the energy of the photon emitted by a source at rest. This agrees with our
earlier result (5.70).

22 Note also that vector k points from the light emitting system to the observer, so the
angle φ can be interpreted as the angle between the velocity of the source and the line of
sight, which is equivalent to the definition of φ in subsection 5.3.4.
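
As a consistency check, the closed-form photon momentum can be substituted back into the conservation laws. The sketch below assumes natural units (c = 1) and hypothetical level masses; the function names `photon_momentum` and `energy_balance` are illustrative. It evaluates E2 − E1 − ck for several rapidities and emission angles, with |p1| fixed by momentum conservation.

```python
import math

C = 1.0  # natural units, an assumption of this sketch

def photon_momentum(m1, m2, theta, phi):
    """Photon momentum k from the closed-form solution with rapidity theta."""
    mu2 = m2**2 - m1**2
    v_over_c = math.tanh(theta)
    return mu2 * C / (2 * m2 * math.cosh(theta) * (1 - v_over_c * math.cos(phi)))

def energy_balance(m1, m2, theta, phi):
    """E2 - E1 - ck, which should vanish if energy is conserved."""
    p2 = m2 * C * math.sinh(theta)
    k = photon_momentum(m1, m2, theta, phi)
    # |p1|^2 = |p2 - k|^2 from momentum conservation.
    p1_sq = p2**2 - 2 * p2 * k * math.cos(phi) + k**2
    e2 = math.sqrt(m2**2 * C**4 + p2**2 * C**2)
    e1 = math.sqrt(m1**2 * C**4 + p1_sq * C**2)
    return e2 - e1 - C * k

m1, m2 = 1.0, 1.2   # hypothetical level masses
worst = max(abs(energy_balance(m1, m2, th, ph))
            for th in (0.0, 0.3, 1.5)
            for ph in (0.0, 1.0, math.pi / 2, math.pi))
print(worst)
```

The largest deviation stays at the level of floating-point rounding, which is what one expects if the positive root of the quadratic equation is the physical one.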
6.4.3 Time evolution

In addition to stationary energy spectra discussed above, we are often inter-
ested in the time evolution of a compound system. This includes reactions,
scattering, decays, etc. As we discussed in subsection 5.2.4, in quantum the-
ory the time evolution of states from (earlier) time t′ to (later) time t is
described by the time evolution operator

$$
U(t \leftarrow t') = e^{-\frac{i}{\hbar}H(t - t')} \tag{6.76}
$$

This operator has the following useful properties

$$
U(t \leftarrow t') = e^{-\frac{i}{\hbar}H(t - t_1)} e^{-\frac{i}{\hbar}H(t_1 - t')} = U(t \leftarrow t_1) U(t_1 \leftarrow t') \tag{6.77}
$$

$$
U(t \leftarrow t') = U^{-1}(t' \leftarrow t) \tag{6.78}
$$

for any t, t′, t1.

23 Only positive sign of the square root leads to a physical solution with positive k
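In the Schrödinger picture, the time evolution of a state vector is given
by (5.49)

$$
|\Psi(t)\rangle = U(t \leftarrow t')|\Psi(t')\rangle = e^{-\frac{i}{\hbar}H(t - t')}|\Psi(t')\rangle \tag{6.79}
$$

$|\Psi(t)\rangle$ is also a solution of the time dependent Schrödinger equation

$$
\begin{aligned}
i\hbar\frac{d}{dt}|\Psi(t)\rangle &= i\hbar\frac{d}{dt} e^{-\frac{i}{\hbar}H(t - t')}|\Psi(t')\rangle = H e^{-\frac{i}{\hbar}H(t - t')}|\Psi(t')\rangle \\
&= H|\Psi(t)\rangle
\end{aligned} \tag{6.80}
$$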
In spite of the simple appearance of formula (6.79), the evaluation of the
exponential of the Hamilton operator is an extremely difficult task. In rare
cases when all eigenvalues $E_n$ and eigenvectors $|\Psi\rangle_n$ of the Hamiltonian are known

$$
H|\Psi\rangle_n = E_n|\Psi\rangle_n
$$

the initial state can be represented as a sum (and/or integral) of basis eigen-
vectors

$$
|\Psi(0)\rangle = \sum_n C_n|\Psi\rangle_n
$$

and the time evolution can be calculated as

$$
|\Psi(t)\rangle = e^{-\frac{i}{\hbar}Ht}|\Psi(0)\rangle = e^{-\frac{i}{\hbar}Ht}\sum_n C_n|\Psi\rangle_n = \sum_n C_n e^{-\frac{i}{\hbar}E_n t}|\Psi\rangle_n \tag{6.81}
$$
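
Formula (6.81) can be tried out on a system small enough to diagonalize by hand. The following is a minimal sketch, assuming natural units (ħ = 1) and a hypothetical two-level Hamiltonian H = [[0, g], [g, 0]] that does not appear in the text; its eigenvalues ±g and eigenvectors (1, ±1)/√2 are entered explicitly, and the function name `evolve` is illustrative.

```python
import cmath
import math

HBAR = 1.0  # natural units, an assumption of this sketch

def evolve(coeffs, energies, basis, t):
    """Apply equation (6.81): sum_n C_n exp(-i E_n t / hbar) |Psi>_n."""
    dim = len(basis[0])
    state = [0j] * dim
    for c_n, e_n, vec in zip(coeffs, energies, basis):
        phase = cmath.exp(-1j * e_n * t / HBAR)
        for j in range(dim):
            state[j] += c_n * phase * vec[j]
    return state

# Hypothetical two-level Hamiltonian H = [[0, g], [g, 0]] with known spectrum.
g = 0.5
s = 1 / math.sqrt(2)
energies = [g, -g]
basis = [(s, s), (s, -s)]          # normalized eigenvectors of H
coeffs = [s, s]                    # expansion of the initial state (1, 0)

t = 2.0
psi = evolve(coeffs, energies, basis, t)
norm = sum(abs(a) ** 2 for a in psi)
print(psi, norm)
```

For this model the two amplitudes reduce to cos(gt/ħ) and −i sin(gt/ħ), and the norm of the state is preserved, as it should be for unitary time evolution.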
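There is another useful formula for the state vector’s time evolution in a
theory with Hamiltonian H = H0 + V. Denoting

$$
V(t) = e^{\frac{i}{\hbar}H_0(t - t_0)}\, V\, e^{-\frac{i}{\hbar}H_0(t - t_0)}
$$

it is easy to verify that the time-dependent state vector24

$$
|\Psi(t)\rangle = e^{-\frac{i}{\hbar}H_0(t - t_0)} \left( 1 - \frac{i}{\hbar}\int_{t_0}^{t} V(t')dt' - \frac{1}{\hbar^2}\int_{t_0}^{t} V(t')dt' \int_{t_0}^{t'} V(t'')dt'' + \ldots \right) |\Psi(t_0)\rangle \tag{6.82}
$$

satisfies the Schrödinger equation (6.80) with the additional condition that
at t = t0 the solution coincides with the given initial state $|\Psi(t_0)\rangle$. Indeed

$$
\begin{aligned}
i\hbar\frac{d}{dt}|\Psi(t)\rangle
&= i\hbar\frac{d}{dt} e^{-\frac{i}{\hbar}H_0(t - t_0)} \left( 1 - \frac{i}{\hbar}\int_{t_0}^{t} V(t')dt' - \frac{1}{\hbar^2}\int_{t_0}^{t} V(t')dt' \int_{t_0}^{t'} V(t'')dt'' + \ldots \right) |\Psi(t_0)\rangle \\
&= H_0 e^{-\frac{i}{\hbar}H_0(t - t_0)} \left( 1 - \frac{i}{\hbar}\int_{t_0}^{t} V(t')dt' - \frac{1}{\hbar^2}\int_{t_0}^{t} V(t')dt' \int_{t_0}^{t'} V(t'')dt'' + \ldots \right) |\Psi(t_0)\rangle \\
&\quad + e^{-\frac{i}{\hbar}H_0(t - t_0)} V(t) \left( 1 - \frac{i}{\hbar}\int_{t_0}^{t} V(t'')dt'' + \ldots \right) |\Psi(t_0)\rangle \\
&= (H_0 + V)|\Psi(t)\rangle
\end{aligned}
$$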
Perturbative formula (6.82) will be found useful in our discussion of scattering
in subsection 7.1.2.
Unfortunately, the above methods for calculating the time evolution of
quantum systems have very limited practical value: The full spectrum of
eigenvalues and eigenvectors of the interacting Hamiltonian H 25 can be found
only for very simple models. The convergence of the perturbative expansion
(6.82) is usually rather poor. So, calculations of the time evolution in quan-
tum mechanics are rather challenging. There are, however, two areas where
we can make further progress in solving this problem. First, in most circum-
stances, quantum effects are too small to be observable. So, it is important
to understand how solutions of the time dependent Schrödinger equation cor-
respond to classical trajectories of particles that we see in everyday life. The
classical limit of quantum mechanics will be tackled in section 6.5. Second,
there is an important class of scattering experiments, which do not require a
detailed description of the time evolution of quantum states. The powerful
formalism of scattering theory will be discussed in chapter 7.

24 Note that time integration variables satisfy inequalities t ≥ t′ ≥ t′′ ≥ . . . ≥ t0.
6.5 Classical Hamiltonian dynamics
There are many studies devoted to the so-called problem of quantization.

This means that given a classical theory26 one is trying to develop a corre-
sponding quantum analog. However, as the world is fundamentally quantum,
and its classical description is just a rough approximation, this line of research
is not well justified. In our opinion, it seems more logical to go in the op-
posite direction: to build an (approximate) classical theory starting from its
(exact) quantum analog.
In section 1.5.2 we have established that distributive (classical) proposi-
tional systems are particular cases of orthomodular (quantum) propositional
systems. Therefore, we may expect that quantum mechanics includes classi-
cal mechanics as a particular case. However, it is not obvious how exactly the
phase space of classical mechanics is related to the quantum Hilbert space.
We would like to analyze this relationship in the present section. For sim-
plicity, we will use as an example a system of spinless particles with non-zero
masses mi > 0. For classical treatment of massless particles, e.g., photons,
see subsection 15.5.5.
25 which are required for formula (6.81)

26 e.g., classical mechanics or classical field theory
6.5.1 Quasiclassical states

In the macroscopic world we do not meet localized eigenvectors $|\mathbf{r}\rangle$ of the
position operator. According to equation (5.38), such states have infinite
uncertainty of momentum which is rather unusual. Similarly, we do not
meet states with sharply defined momentum. Such states are delocalized
over large distances (5.40). The reason why such states are not commonly
seen27 is not well understood yet. The most plausible hypothesis is that
eigenstates of the position or eigenstates of the momentum are susceptible
to small perturbations (e.g., due to temperature or external radiation) and
rapidly transform to more robust wave packets or quasiclassical states in
which both position and momentum have good, but not perfect localization.
So, when discussing the classical limit of quantum mechanics, we will
not consider general states allowed by quantum mechanics. We will limit
our attention only to the class of particle states $|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle$ that we will call
quasiclassical. Wave functions of these states are assumed to be well-localized
in both position and momentum representations around the points r0 and p0,
respectively. Without loss of generality such wave functions in the position
representation can be written as

$$
\psi_{\mathbf{r}_0,\mathbf{p}_0}(\mathbf{r}) \equiv \langle\mathbf{r}|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle = \eta(\mathbf{r} - \mathbf{r}_0)\, e^{\frac{i}{\hbar}\phi}\, e^{\frac{i}{\hbar}\mathbf{p}_0(\mathbf{r} - \mathbf{r}_0)} \tag{6.83}
$$

where η(r − r0) is a real smooth (non-oscillating) function with a maximum
near the expectation value of position r0 and φ is a real phase.28 The last
factor in (6.83) ensures that the expectation value of momentum is p0.29 As
we will see later, in order to discuss the classical limit of quantum mechanics
the exact choice of the function η(r − r0) is not important. For example, it
is convenient to choose it in the form of a Gaussian

$$
\psi_{\mathbf{r}_0,\mathbf{p}_0}(\mathbf{r}) = N e^{-(\mathbf{r} - \mathbf{r}_0)^2/d^2}\, e^{\frac{i}{\hbar}\mathbf{p}_0\mathbf{r}} \tag{6.84}
$$

where φ = 0, parameter d controls the degree of localization, and N is a
coefficient required for the proper normalization

$$
\int d\mathbf{r}\, |\psi_{\mathbf{r}_0,\mathbf{p}_0}(\mathbf{r})|^2 = 1
$$

The exact magnitude of this coefficient is not important for our discussion,
so we will not calculate it here.

27 Spatially delocalized states of particles play a role in such low-temperature effects as
superconductivity and superfluidity.

28 such that $e^{\frac{i}{\hbar}\phi}$ is a unimodular phase factor: $|e^{\frac{i}{\hbar}\phi}| = 1$. The introduction of this factor
seems redundant here, because any wave function is defined up to a multiplier, anyway.
However, we will find the factor $e^{\frac{i}{\hbar}\phi}$ important in our discussions of the interference effect
in subsection 6.5.6 and in section 15.4.

29 compare with the form (5.40) of momentum eigenfunctions in the position space
6.5.2 Heisenberg uncertainty relation

Wave functions like (6.84) cannot possess both sharp position and sharp
momentum at the same time. They are always characterized by a non-
vanishing uncertainty of position ∆r > 0 and a non-vanishing uncertainty of
momentum ∆p > 0. These uncertainties are roughly inversely proportional
to each other. To see the nature of this inverse proportionality, we assume,
for simplicity, that the particle is at rest in the origin, i.e., r0 = p0 = 0. Then
the position-space wave function is

$$
\psi_{0,0}(\mathbf{r}) = N e^{-r^2/d^2} \tag{6.85}
$$

and its counterpart in the momentum space is30

$$
\begin{aligned}
\psi_{0,0}(\mathbf{p}) &= (2\pi\hbar)^{-3/2} N \int d\mathbf{r}\, e^{-r^2/d^2} e^{-\frac{i}{\hbar}\mathbf{p}\mathbf{r}} \\
&= (2\hbar)^{-3/2} N d^3 e^{-p^2 d^2/(4\hbar^2)}
\end{aligned} \tag{6.86}
$$

The product of the uncertainties of the momentum-space ($\Delta p \approx 2\hbar/d$) and
position-space ($\Delta r \approx d$) wave functions is independent of the parameter d

$$
\Delta r \Delta p \approx 2\hbar \tag{6.87}
$$

This is an example of the Heisenberg uncertainty relation, which tells us
that for all quantum states the above uncertainties must satisfy the famous
inequality

$$
\Delta r \Delta p \geq \hbar/2 \tag{6.88}
$$

30 Here we used equations (5.42) and (B.13).
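
The estimate (6.87) uses rough widths of the two Gaussians. If the uncertainties are instead defined as root-mean-square deviations of the probability densities (an assumption of this sketch, not a definition given in the text), the one-dimensional analogs of (6.85) and (6.86) can be integrated numerically. The code below works in natural units (ħ = 1); the names `rms_width` and `uncertainty_product` are illustrative.

```python
import math

HBAR = 1.0  # natural units, an assumption of this sketch

def rms_width(density, half_range, steps=20000):
    """Root-mean-square width of a symmetric 1D density (midpoint rule)."""
    h = 2 * half_range / steps
    norm = second = 0.0
    for i in range(steps):
        x = -half_range + (i + 0.5) * h
        w = density(x)
        norm += w
        second += x * x * w
    return math.sqrt(second / norm)

def uncertainty_product(d):
    """Product of rms widths for the 1D analog of (6.85) and (6.86)."""
    # |psi(x)|^2 is proportional to exp(-2 x^2 / d^2).
    sigma_r = rms_width(lambda x: math.exp(-2 * x * x / d**2), 10 * d)
    # |psi(p)|^2 is proportional to exp(-p^2 d^2 / (2 hbar^2)).
    sigma_p = rms_width(lambda p: math.exp(-p * p * d**2 / (2 * HBAR**2)),
                        10 * HBAR / d)
    return sigma_r * sigma_p

products = [uncertainty_product(d) for d in (0.1, 1.0, 7.0)]
print(products)
```

With this definition the product comes out as ħ/2 for every d, so the Gaussian packet sits exactly at the lower bound of (6.88); the factor of 4 relative to (6.87) reflects only the different convention for the widths.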
6.5.3 Spreading of quasiclassical wave packets

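Suppose that at time t = 0 the particle was prepared in the state with well-
localized wave function (6.85), i.e., the uncertainty of position ∆r ≈ d is
small. The corresponding time-dependent wave function in the momentum
representation is

$$
\begin{aligned}
\psi(\mathbf{p}, t) &= e^{-\frac{i}{\hbar}\hat{H}t} \psi_{0,0}(\mathbf{p}, 0) \\
&= \frac{N d^3}{(2\hbar)^{3/2}}\, e^{-p^2 d^2/(4\hbar^2)}\, e^{-\frac{it}{\hbar}\sqrt{m^2 c^4 + p^2 c^2}}
\end{aligned}
$$

whose position representation counterpart is31

$$
\begin{aligned}
\psi(\mathbf{r}, t) &= \frac{N d^3}{(4\pi\hbar^2)^{3/2}} \int d\mathbf{p}\, e^{-p^2 d^2/(4\hbar^2)}\, e^{\frac{i}{\hbar}\mathbf{p}\mathbf{r}}\, e^{-\frac{it}{\hbar}\sqrt{m^2 c^4 + p^2 c^2}} \\
&\approx \frac{N d^3}{(4\pi\hbar^2)^{3/2}}\, e^{-\frac{i}{\hbar}mc^2 t} \int d\mathbf{p}\, \exp\left( -p^2\left( \frac{d^2}{4\hbar^2} + \frac{it}{2\hbar m} \right) + \frac{i}{\hbar}\mathbf{p}\mathbf{r} \right) \\
&= N \left( \frac{d^2 m}{d^2 m + 2i\hbar t} \right)^{3/2} e^{-\frac{i}{\hbar}mc^2 t} \exp\left( -\frac{m r^2}{d^2 m + 2i\hbar t} \right)
\end{aligned}
$$

and the probability density is

$$
\begin{aligned}
\rho(\mathbf{r}, t) &= |\psi(\mathbf{r}, t)|^2 \\
&= |N|^2 \left( \frac{d^4 m^2}{d^4 m^2 + 4\hbar^2 t^2} \right)^{3/2} \exp\left( -\frac{2 r^2 d^2 m^2}{d^4 m^2 + 4\hbar^2 t^2} \right)
\end{aligned}
$$

The size of the wave packet at large times t → ∞ is easily found as

$$
\Delta r(t) \approx \sqrt{\frac{d^4 m^2 + 4\hbar^2 t^2}{d^2 m^2}} \approx \frac{2\hbar t}{dm}
$$

So, the position-space wave packet is spreading out, and the speed of spread-
ing $v_s$ is directly proportional to the uncertainty of velocity in the initially
prepared state32

$$
v_s \approx \frac{2\hbar}{dm} \approx \frac{\Delta p}{m} \tag{6.89}
$$

One can verify that at large times this speed does not depend on the shape
of the initial wave packet. The important parameters are the size d of this
wave packet and the particle’s mass m.

31 Due to the factor $e^{-p^2 d^2/(4\hbar^2)}$, only small values of momentum contribute to the in-
tegral, so we can use the non-relativistic approximation $\sqrt{m^2 c^4 + p^2 c^2} \approx mc^2 + \frac{p^2}{2m}$ and
equation (B.13).

32 Here we used equality (6.87).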
A simple estimate demonstrates that for macroscopic objects this spreading
phenomenon can be safely neglected. For example, for a particle of mass
m = 1 mg and the initial position uncertainty of d = 1 micron, the time
needed for the wave function to spread to 1 cm is more than $10^{11}$ years.
Therefore, for quasiclassical states of macroscopic particles with sufficiently
high masses, their positions and momenta are well defined at all times and
their time evolution can be described by a classical trajectory pretty well.
So, in these conditions one can safely replace quantum mechanics with its
classical counterpart.
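
The order of magnitude quoted above follows directly from (6.89). A minimal sketch of the arithmetic is given below; the SI value of ħ, the approximate length of a year, and the electron mass used for contrast are standard constants supplied here rather than taken from the text, and the function names are illustrative.

```python
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
SECONDS_PER_YEAR = 3.156e7  # approximate length of a year in seconds

def spreading_speed(mass_kg, d_m):
    """Asymptotic spreading speed v_s = 2*hbar/(d*m) from equation (6.89)."""
    return 2 * HBAR / (d_m * mass_kg)

def spreading_time_years(mass_kg, d_m, target_m):
    """Time for the packet size 2*hbar*t/(d*m) to reach target_m, in years."""
    return target_m / spreading_speed(mass_kg, d_m) / SECONDS_PER_YEAR

# Values quoted in the text: m = 1 mg, d = 1 micron, final size 1 cm.
years = spreading_time_years(1e-6, 1e-6, 1e-2)
# For contrast, a free electron with the same initial localization.
electron_seconds = 1e-2 / spreading_speed(9.109e-31, 1e-6)
print(years, electron_seconds)
```

For the 1 mg particle the result is of the order of 10^12 years, consistent with the bound stated in the text, whereas an electron prepared with the same localization spreads to 1 cm in a small fraction of a second.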
6.5.4 Phase space

Let us now see how rules of classical Hamiltonian mechanics follow from the
quantum Schrödinger equation.
In subsection 6.5.1 we have established the general form (6.83) of qua-
siclassical wave packets. In most circumstances the resolution of measuring
instruments is poor, i.e., much poorer than the quantum of action ħ [KvB06].
Then the shape of the envelope function η(r − r0) cannot be discerned. All
quantum states (6.83) with different shapes of the function η(r − r0) can
now be treated as the same classical state. So, each classical state is fully
characterized by two parameters: the average position of the packet r0 and
the average momentum p0. These states are approximate eigenstates of both
position and momentum operators simultaneously:

$$
\mathbf{R}|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \approx \mathbf{r}_0|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \tag{6.90}
$$

$$
\mathbf{P}|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \approx \mathbf{p}_0|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \tag{6.91}
$$
All such equivalent states can be represented by one point (r0 , p0 ) in a 6-
dimensional manifold R6 with coordinates rx , ry , rz , px , py , pz . This is the one-
particle classical phase space that was discussed from a logico-probabilistic
point of view in subsection 1.4.4.
We can continue this line of reasoning and translate other quantum no-
tions to the classical language as well. For example, we know that any 1-
particle quantum observable F can be expressed as a function of the particle
position R, momentum P, and mass M.33 The eigenvalue of M is just a
constant. Therefore, in the classical phase space picture, all observables (the
energy, angular momentum, velocity, etc.) are represented as real functions
f (p, r) on the phase space.
For example, consider a logical proposition F . As we have established in

chapter 1, logical propositions form a special class of observables (or functions
on the phase space), whose spectrum consists of only two points 0 and 1.
Thus, the phase space function f (p, r) that corresponds to the proposition
F , defines a subset of the phase space – the set of points where f (p, r) = 1.
Let us consider two examples of propositions/subsets in R6 . The propo-

sition R = “position of the particle is exactly r0 ” is represented in the phase
space by a 3-dimensional hyperplane with fixed position r = r0 and arbitrary
momentum p. The proposition P = “momentum of the particle is exactly
p0 ” is represented by another 3-dimensional hyperplane in which the value of
momentum is fixed, while position is arbitrary. The meet of these two propo-
sitions is represented by the intersection of the two hyperplanes s = R ∩ P
which is a point s = (r0 , p0 ) in the phase space and an atom in the clas-
sical propositional system. In the classical case such an intersection always
exists. Thus, there exist states in which both position and momentum are
measurable simultaneously with absolute certainty. However, this is not true
in the quantum case. As we saw in subsection 6.5.2, quantum propositions
about position R and momentum P can have a non-empty meet only if they
are associated with uncertainties (intervals) ∆r and ∆p, which satisfy the
Heisenberg uncertainty relation (6.88).
Similar to the one-particle case, we can introduce a 6N-dimensional phase

space for any system of N particles. This phase space is a classical replace-
ment for the quantum-mechanical N-particle Hilbert space, as we discussed
in subsection 1.4.4.
33
See subsection 4.3.4. Recall that in this section we are talking only about spinless
particles. So, we set S = 0.
6.5.5 Poisson brackets

It follows from (6.90) and (6.91) that quasiclassical states $|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle$ are ap-
proximate eigenstates of any classical observable

$$
f(\mathbf{R}, \mathbf{P})|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \approx f(\mathbf{r}_0, \mathbf{p}_0)|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle \tag{6.92}
$$

The expectation value of observable f(R, P) in the quasiclassical state $|\Psi_{\mathbf{r}_0,\mathbf{p}_0}\rangle$
is just the value of the corresponding function f(r0, p0)

$$
\langle f(\mathbf{R}, \mathbf{P})\rangle = f(\mathbf{r}_0, \mathbf{p}_0)
$$

and the expectation value of a product of two such observables is equal to
the product of expectation values

$$
\langle f(\mathbf{R}, \mathbf{P}) g(\mathbf{R}, \mathbf{P})\rangle = f(\mathbf{r}_0, \mathbf{p}_0) g(\mathbf{r}_0, \mathbf{p}_0) = \langle f(\mathbf{R}, \mathbf{P})\rangle \langle g(\mathbf{R}, \mathbf{P})\rangle \tag{6.93}
$$

According to (3.52) - (3.58), commutators of observables are proportional
to ħ, so in the classical limit ħ → 0 all operators of observables commute
with each other.34 There are two important roles played by commutators in
quantum mechanics. First, the commutator of two observables determines
whether these observables can be measured simultaneously, i.e., whether
there exist states in which both observables have well-defined values. Van-
ishing commutators of classical observables imply that all such observables
can be measured simultaneously. Second, commutators of observables with
generators of the Poincaré group determine how these observables transform
from one reference frame to another. One example of such a transformation
is the time translation in (3.64). However, the zero classical limit of these
commutators as ħ → 0 does not mean that t-dependent terms on the right
hand side of equation (3.64) become zero, and that the time evolution stops
in this limit. The right hand side of (3.64) does not vanish even in the clas-
sical limit, because the commutators in n-th order terms are multiplied by
large factors $(-i/\hbar)^n$. In the limit ħ → 0 we obtain

$$
F(t) = F - [H, F]_P\, t + \frac{1}{2}[H, [H, F]_P]_P\, t^2 + \ldots \tag{6.94}
$$

where

$$
[f, g]_P \equiv \lim_{\hbar\to 0} \frac{-i}{\hbar}[f(\mathbf{R}, \mathbf{P}), g(\mathbf{R}, \mathbf{P})] \tag{6.95}
$$

is called the Poisson bracket. So, even though commutators of observables
are effectively zero in classical mechanics, we can still use non-vanishing
Poisson brackets when calculating the action of inertial transformations on
observables.

34 This is also clear from (6.93) as $\langle f(\mathbf{R}, \mathbf{P}) g(\mathbf{R}, \mathbf{P})\rangle = \langle g(\mathbf{R}, \mathbf{P}) f(\mathbf{R}, \mathbf{P})\rangle$.
Now we are going to derive a useful explicit formula for the Poisson
bracket (6.95). The exact commutator of two quantum mechanical oper-
ators f(R, P) and g(R, P) can be written generally as a series in powers of
ħ

$$
[f, g] = i\hbar k_1 + i\hbar^2 k_2 + i\hbar^3 k_3 + \ldots
$$

where $k_i$ are Hermitian operators. From equation (6.95) it is clear that the
Poisson bracket is equal to the coefficient of the dominant term of the first
order in ħ

$$
[f, g]_P = k_1
$$
As a consequence, the classical Poisson bracket [f, g]P is much easier to calcu-
late than the full quantum commutator [f, g]. The following theorem demon-
strates that calculation of the Poisson bracket can be reduced to simple dif-
ferentiation.
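Theorem 6.4 If f(R, P) and g(R, P) are two observables of a massive spin-
less particle, then35

$$
[f(\mathbf{R}, \mathbf{P}), g(\mathbf{R}, \mathbf{P})]_P = \frac{\partial f}{\partial \mathbf{R}} \cdot \frac{\partial g}{\partial \mathbf{P}} - \frac{\partial f}{\partial \mathbf{P}} \cdot \frac{\partial g}{\partial \mathbf{R}} \tag{6.96}
$$

35 Equation (6.96) is the definition of the Poisson bracket usually presented in classical
mechanics textbooks without proper justification. Here we are deriving this formula from
quantum-mechanical commutators.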
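Proof. Consider for simplicity the one-dimensional case (the 3D proof is
similar) in which the desired result (6.96) becomes
\[ \lim_{\hbar \to 0} \frac{-i}{\hbar} [f(R, P), g(R, P)] = \frac{\partial f}{\partial R} \cdot \frac{\partial g}{\partial P} - \frac{\partial f}{\partial P} \cdot \frac{\partial g}{\partial R} \tag{6.97} \]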
First, functions f(R, P) and g(R, P) can be represented by their Taylor
expansions around the origin (r = 0, p = 0) in the phase space, e.g.,
\[ f(R, P) = C_{00} + C_{10} R + C_{01} P + C_{11} RP + C_{20} R^2 + C_{02} P^2 + C_{21} R^2 P + \ldots \]
\[ g(R, P) = D_{00} + D_{10} R + D_{01} P + D_{11} RP + D_{20} R^2 + D_{02} P^2 + D_{21} R^2 P + \ldots \]
where C_ij and D_ij are numerical coefficients, and we agreed to write factors
R to the left of factors P. Then it is sufficient to prove formula (6.97) for
f and g being monomials of the form R^n P^m. In particular, we would like to
prove that
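\[ \begin{aligned}
[R^n P^m, R^q P^s]_P &= \frac{\partial (R^n P^m)}{\partial R} \frac{\partial (R^q P^s)}{\partial P} - \frac{\partial (R^n P^m)}{\partial P} \frac{\partial (R^q P^s)}{\partial R} \\
&= ns R^{n+q-1} P^{m+s-1} - mq R^{n+q-1} P^{m+s-1} \\
&= (ns - mq) R^{n+q-1} P^{m+s-1}
\end{aligned} \tag{6.98} \]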
for all non-negative integers n, m, q, s. This result definitely holds if
f and g are linear in R and P, i.e., when n, m, q, s are either 0 or 1. For
example, in the case n = 1, m = 0, q = 0, s = 1 formula (6.98) yields
\[ [R, P]_P = 1 \]
which agrees with definition (6.95) and with quantum result (4.25).
To prove (6.98) for higher powers we will use mathematical induction.
Suppose that we established the validity of (6.98) for a set of powers n, m, q, s
as well as for any set of lower powers n′, m′, q′, s′, where n′ ≤ n, m′ ≤ m,
q′ ≤ q, s′ ≤ s. The proof by induction now requires us to establish the
validity of the following equations
\[ \begin{aligned}
[R^n P^m, R^{q+1} P^s]_P &= (ns - mq - m) R^{n+q} P^{m+s-1} \\
[R^n P^m, R^q P^{s+1}]_P &= (ns - mq + n) R^{n+q-1} P^{m+s} \\
[R^{n+1} P^m, R^q P^s]_P &= (ns - mq + s) R^{n+q} P^{m+s-1} \\
[R^n P^{m+1}, R^q P^s]_P &= (ns - mq - q) R^{n+q-1} P^{m+s}
\end{aligned} \]
Let us prove only the first equation. Three others are proved similarly. Using
equations (4.53), (6.98), and (E.11) we, indeed, obtain
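\[ \begin{aligned}
[R^n P^m, R^{q+1} P^s]_P &= -\lim_{\hbar \to 0} \frac{i}{\hbar} [R^n P^m, R^{q+1} P^s] \\
&= -\lim_{\hbar \to 0} \frac{i}{\hbar} [R^n P^m, R] R^q P^s - \lim_{\hbar \to 0} \frac{i}{\hbar} R [R^n P^m, R^q P^s] \\
&= [R^n P^m, R]_P R^q P^s + R [R^n P^m, R^q P^s]_P \\
&= -m R^{n+q} P^{m+s-1} + (ns - mq) R^{n+q} P^{m+s-1} \\
&= (ns - mq - m) R^{n+q} P^{m+s-1}
\end{aligned} \]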
Therefore, by induction, equation (6.97) holds for all values of n, m, q, s ≥ 0
and for all smooth functions f(R, P) and g(R, P).
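The monomial formula (6.98) can also be checked mechanically. The following is a minimal sketch, assuming a hypothetical representation of classical polynomials in R and P as dictionaries {(n, m): coefficient}; the helper names are illustrative and do not come from the text. It evaluates the right hand side of (6.97) by differentiation and compares it with (6.98) for small powers.

```python
# Classical polynomials in R and P stored as {(n, m): coeff} for coeff * R**n * P**m.
# This representation and all helper names are assumptions made for illustration.

def monomial(n, m, coeff=1):
    """Return the polynomial coeff * R**n * P**m."""
    return {(n, m): coeff}

def add(a, b, sign=1):
    """Return a + sign * b, dropping terms with zero coefficient."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + sign * c
    return {k: c for k, c in out.items() if c != 0}

def mul(a, b):
    """Product of two polynomials (R and P commute in the classical limit)."""
    out = {}
    for (n, m), c in a.items():
        for (q, s), d in b.items():
            key = (n + q, m + s)
            out[key] = out.get(key, 0) + c * d
    return {k: c for k, c in out.items() if c != 0}

def d_dR(a):
    """Partial derivative with respect to R."""
    return {(n - 1, m): n * c for (n, m), c in a.items() if n > 0}

def d_dP(a):
    """Partial derivative with respect to P."""
    return {(n, m - 1): m * c for (n, m), c in a.items() if m > 0}

def poisson(f, g):
    """Right hand side of (6.97): df/dR * dg/dP - df/dP * dg/dR."""
    return add(mul(d_dR(f), d_dP(g)), mul(d_dP(f), d_dR(g)), sign=-1)

def expected(n, m, q, s):
    """Right hand side of (6.98): (ns - mq) R^(n+q-1) P^(m+s-1)."""
    c = n * s - m * q
    return {(n + q - 1, m + s - 1): c} if c != 0 else {}

checks = [poisson(monomial(n, m), monomial(q, s)) == expected(n, m, q, s)
          for n in range(4) for m in range(4)
          for q in range(4) for s in range(4)]
print(all(checks))
print(poisson(monomial(1, 0), monomial(0, 1)))  # the bracket [R, P]_P
```

This only tests the differentiation formula on monomials; it does not replace the induction argument, which connects that formula to the ħ → 0 limit of the quantum commutator.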
Since Poisson brackets are obtained from commutators (6.95), all prop-
erties of commutators from Appendix E.2 remain valid for Poisson brackets
as well.
For a concrete example, let us apply the above formalism of Poisson
brackets to the time evolution. We can use formulas (6.94) and (6.96) in the
case when F is either position or momentum and obtain36
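\[ \frac{d\mathbf{P}(t)}{dt} = -[H(\mathbf{R}, \mathbf{P}), \mathbf{P}]_P = -\frac{\partial H(\mathbf{R}, \mathbf{P})}{\partial \mathbf{R}} \tag{6.99} \]
\[ \frac{d\mathbf{R}(t)}{dt} = -[H(\mathbf{R}, \mathbf{P}), \mathbf{R}]_P = \frac{\partial H(\mathbf{R}, \mathbf{P})}{\partial \mathbf{P}} \tag{6.100} \]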
where one recognizes the classical Hamilton’s equations of motion.
36
Here we used equation (4.54) and a similar formula [P_x, f(R_x)] = −iħ ∂f(R_x)/∂R_x.
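Equations (6.99) - (6.100) can be integrated numerically. The sketch below is only an illustration under stated assumptions: a one-dimensional harmonic oscillator H = P²/(2m) + kR²/2 with hypothetical values m = k = 1, and a leapfrog (kick-drift-kick) scheme chosen here because it suits separable Hamiltonians. Neither the model nor the scheme comes from the text.

```python
import math

def integrate_hamilton(dH_dR, dH_dP, r, p, dt, steps):
    """Leapfrog integration of dP/dt = -dH/dR and dR/dt = dH/dP.

    Assumes a separable Hamiltonian H = T(P) + V(R), so that dH/dR depends
    only on R and dH/dP depends only on P.
    """
    for _ in range(steps):
        p -= 0.5 * dt * dH_dR(r)   # half kick, equation (6.99)
        r += dt * dH_dP(p)         # drift, equation (6.100)
        p -= 0.5 * dt * dH_dR(r)   # second half kick
    return r, p

# Hypothetical parameters of the toy model.
m, k = 1.0, 1.0
dt = 1e-3
steps = int(round(2 * math.pi / dt))   # roughly one period for m = k = 1
t_end = steps * dt

r1, p1 = integrate_hamilton(lambda r: k * r, lambda p: p / m,
                            r=1.0, p=0.0, dt=dt, steps=steps)

# Exact solution for the initial condition r = 1, p = 0 (with m = k = 1).
print(r1 - math.cos(t_end), p1 + math.sin(t_end))
```

The two printed differences measure the discretization error of the scheme against the exact trajectory R(t) = cos t, P(t) = −sin t.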
6.5.6 Time evolution of wave packets

Our result (6.99) - (6.100) means, in particular, that trajectories of centers
of quasiclassical wave packets are exactly the same as those predicted by
classical Hamiltonian mechanics. Here we would like to demonstrate how this
conclusion follows from approximate solutions of the Schrödinger equation.
Earlier in this section we have established that in many cases the spread-
ing of a quasiclassical wave packet can be ignored and that the center of
the packet moves along a well-defined trajectory (r0 (t), p0 (t)). This means
that the shape of the wave packet is described by the function η(r − r0 (t)),
which remains localized around the path r = r0 (t). This also means that
the exponential r-dependent factor in (6.83) is approximately exp((i/ħ) p0(t) r).
For generality, we also need to assume that the phase φ(t) is time-dependent
too. Then the time-dependent quasiclassical wave packet is described by the
following ansatz

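\[ \Psi(\mathbf{r}, t) = \eta(\mathbf{r} - \mathbf{r}_0(t)) \exp\left( \frac{i}{\hbar} A(t) \right) \tag{6.101} \]
\[ A(t) \equiv \mathbf{p}_0(t) \cdot (\mathbf{r} - \mathbf{r}_0(t)) + \phi(t) \tag{6.102} \]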
where r0 (t), p0 (t), φ(t) are yet undetermined numerical functions. In order
to find these functions we insert (6.101) - (6.102) in the Schrödinger equation
(5.51)
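\[ i\hbar \frac{\partial \Psi(\mathbf{r}, t)}{\partial t} + \frac{\hbar^2}{2m} \frac{\partial^2 \Psi(\mathbf{r}, t)}{\partial \mathbf{r}^2} - V(\mathbf{r}) \Psi(\mathbf{r}, t) = 0 \tag{6.103} \]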
which is valid for the position-space wave function Ψ(r, t) of a single particle
moving in an external potential V (r).37 Omitting for brevity time arguments
and denoting time derivatives by dots we can write

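\[ i\hbar \frac{\partial \Psi(\mathbf{r})}{\partial t} = \left[ -i\hbar \frac{\partial \eta}{\partial \mathbf{r}} \cdot \dot{\mathbf{r}}_0 - (\dot{\mathbf{p}}_0 \cdot \mathbf{r}) \eta + (\dot{\mathbf{p}}_0 \cdot \mathbf{r}_0) \eta + (\mathbf{p}_0 \cdot \dot{\mathbf{r}}_0) \eta - \dot{\phi} \eta \right] \exp\left( \frac{i}{\hbar} A \right) \]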
37
Here we make several assumptions and approximations to simplify our calculations.
First, we consider a particle moving in a fixed potential. This is not an isolated system,
which is a subject of most discussions in the book. Nevertheless, this is still a good
approximation in the case when the object creating the potential V (r) is heavy, so that it
can be considered fixed. Second, the position-dependent potential V (r) does not depend
on the particle’s momentum. Third, we are working in the non-relativistic approximation.

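\[ \begin{aligned}
\frac{\hbar^2}{2m} \frac{\partial^2 \Psi(\mathbf{r})}{\partial \mathbf{r}^2} &= \frac{\hbar^2}{2m} \frac{\partial}{\partial \mathbf{r}} \left[ \frac{\partial \eta}{\partial \mathbf{r}} \exp\left( \frac{i}{\hbar} A \right) + \frac{i}{\hbar} \mathbf{p}_0 \eta \exp\left( \frac{i}{\hbar} A \right) \right] \\
&= \frac{\hbar^2}{2m} \left[ \frac{\partial^2 \eta}{\partial \mathbf{r}^2} + \frac{2i}{\hbar} \frac{\partial \eta}{\partial \mathbf{r}} \cdot \mathbf{p}_0 - \frac{p_0^2 \eta}{\hbar^2} \right] \exp\left( \frac{i}{\hbar} A \right)
\end{aligned} \]
\[ -V(\mathbf{r}) \Psi(\mathbf{r}) \approx \left[ -V(\mathbf{r}_0) \eta - \left. \frac{\partial V(\mathbf{r})}{\partial \mathbf{r}} \right|_{\mathbf{r} = \mathbf{r}_0} \cdot (\mathbf{r} - \mathbf{r}_0) \eta \right] \exp\left( \frac{i}{\hbar} A \right) \]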
There are three kinds of terms on the left hand side of (6.103): those
proportional to ħ⁰, ħ¹, and ħ². They must vanish independently. The ħ²-dependent
terms are too small; they are beyond the accuracy of the quasiclassical
approximation and can be ignored. The terms that are first order in ħ result in
equation ṙ0 = p0/m, which is the usual relationship between velocity and
momentum for momentum-independent potentials.38 The ħ⁰ terms lead to equation
\[ 0 = -(\dot{\mathbf{p}}_0 \cdot \mathbf{r}) + (\dot{\mathbf{p}}_0 \cdot \mathbf{r}_0) + (\mathbf{p}_0 \cdot \dot{\mathbf{r}}_0) - \dot{\phi} - \frac{p_0^2}{2m} - V(\mathbf{r}_0) - \left. \frac{\partial V(\mathbf{r})}{\partial \mathbf{r}} \right|_{\mathbf{r} = \mathbf{r}_0} \cdot (\mathbf{r} - \mathbf{r}_0) \tag{6.104} \]
The function ṗ0(t) can be determined from the first Hamilton’s equation
(6.99)
\[ \dot{\mathbf{p}}_0 = -\left. \frac{\partial V(\mathbf{r})}{\partial \mathbf{r}} \right|_{\mathbf{r} = \mathbf{r}_0} \]
So, we can rewrite (6.104) as an equation for the last undetermined (phase)
function φ(t)
\[ \frac{\partial \phi}{\partial t} = \frac{p_0^2(t)}{2m} - V(\mathbf{r}_0(t)) \]
The solution of this equation for a particle propagating in the time interval
[t0 , t] is given by the so-called action integral 39
\[ \phi(t) = \phi(t_0) + \int_{t_0}^{t} dt' \left[ \frac{p_0^2(t')}{2m} - V(\mathbf{r}_0(t')) \right] \tag{6.105} \]
38
This is also the 2nd Hamilton’s equation (6.100).
39
Note that the integrand has the form (“kinetic energy” - “potential energy”), which
is known in classical mechanics as the Lagrangian.
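To make the action integral (6.105) concrete, here is a minimal numerical sketch. It assumes a hypothetical one-dimensional particle in a uniform force field V(r) = mgr (a choice made only for illustration, not taken from the text), for which the classical trajectory is known in closed form, and evaluates the integral with the trapezoidal rule.

```python
def action_phase(r0, p0, V, m, t0, t1, steps=10000):
    """Trapezoidal estimate of the integral in (6.105).

    r0 and p0 are callables giving the classical trajectory, V is the potential.
    Returns phi(t1) - phi(t0).
    """
    dt = (t1 - t0) / steps

    def lagrangian(t):
        # Integrand of (6.105): kinetic energy minus potential energy.
        return p0(t) ** 2 / (2 * m) - V(r0(t))

    total = 0.5 * (lagrangian(t0) + lagrangian(t1))
    for k in range(1, steps):
        total += lagrangian(t0 + k * dt)
    return total * dt

# Hypothetical parameters: mass, force constant, initial velocity, duration.
m, g, v, T = 1.0, 9.8, 3.0, 0.5

# Trajectory solving Hamilton's equations for V(r) = m*g*r with r0(0) = 0.
r0 = lambda t: v * t - 0.5 * g * t * t
p0 = lambda t: m * (v - g * t)
V = lambda r: m * g * r

numeric = action_phase(r0, p0, V, m, 0.0, T)
# Closed-form value of the same integral for this trajectory.
analytic = m * (v * v * T / 2 - v * g * T * T + g * g * T ** 3 / 3)
print(numeric, analytic)
```

The phase of the wave packet is then φ(T) = φ(0) plus the printed value, entering the wave function (6.101) through the factor exp((i/ħ)φ).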
Figure 6.3: Interference picture in the two-slit experiment. Two dotted bell-
shaped curves on the right show the image density profiles when one of the
two slits is closed. The thick full line is the interference pattern when both
slits are opened. Compare with Fig. 1.3(b).
From the above discussion we conclude that the center of a quasiclassical
wave packet moves along a trajectory determined by classical Hamilton’s
equations of motion (6.99) - (6.100). In addition, there is a genuine quan-
tum effect: the change of the overall phase of the wave packet according to
equation (6.105).
This phase change explains the double-slit (or double-hole) interference
effect discussed in section 1.1. Suppose that a monochromatic source emits
electrons,40 which pass through two slits and form an image on the scintil-
lating screen, as shown in Fig. 6.3. The electron wave packets can reach
the point C on the screen by two alternative ways: either through slit A or
through slit B. Both kinds of packets contribute to the wave function at
point C. Their complex phase factors exp((i/ħ)φ(t)) should be added together
when calculating the probability amplitude for finding an electron at point
C. In this particular case, the calculation of phase factors is especially
simple, because there is no external potential (V(r) = 0). The momentum (and
speed) of each wave packet remains constant (p0²(t) = const), so that the
action integral (6.105) is proportional to the distance traveled by the wave
packet from the slit to the screen. This means that the character of
interference (constructive or destructive phase shift) at point C is fully
determined by the difference between two traveling distances AC and BC.
40
The explanation of the photon interference is similar.
Other experimental manifestations of the phase formula (6.105) will be
discussed in section 15.4.
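The dependence of the interference on the path difference AC − BC can be sketched numerically. In the example below the geometry (slits at heights ±d/2, screen at distance L) and the parameter k, the phase accumulated per unit distance, are assumptions introduced for illustration; the text only states that the phase is proportional to the distance traveled.

```python
import cmath
import math

def intensity(y, slit_sep, screen_dist, k):
    """Relative intensity at screen height y from two equal-amplitude paths.

    k is a hypothetical phase per unit distance; slit A sits at +slit_sep/2
    and slit B at -slit_sep/2, both at distance screen_dist from the screen.
    """
    ac = math.hypot(screen_dist, y - slit_sep / 2)   # distance AC
    bc = math.hypot(screen_dist, y + slit_sep / 2)   # distance BC
    amplitude = cmath.exp(1j * k * ac) + cmath.exp(1j * k * bc)
    return abs(amplitude) ** 2

# Scan a few points on the screen (all numbers are illustrative only).
for y in [0.0, 2.0, 4.0, 6.0, 8.0]:
    print(y, round(intensity(y, slit_sep=1.0, screen_dist=100.0, k=50.0), 3))
```

Adding the two phase factors gives an intensity 2 + 2cos(k(AC − BC)), so only the difference of the two distances matters, as stated above.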
Chapter 7
SCATTERING
Physics is becoming so unbelievably complex that it is taking longer

and longer to train a physicist. It is taking so long, in fact, to
train a physicist to the place where he understands the nature of
physical problems that he is already too old to solve them.
Eugene P. Wigner
As we discussed at the end of section 6.4, it is very difficult to solve
the time-dependent Schrödinger equation (6.80) even for the simplest models.
However, nature gives us a lucky break: there is a very important class of
experiments for which the description of dynamics by equation (6.80) is not
needed; this description is just too detailed. We are talking about scattering
experiments here. They are performed by preparing free particles,1 bringing
them into collision and studying the properties of free particles or stable
bound states leaving the region of the collision. In these experiments, often
it is not possible to observe the time evolution during interaction: particle
reactions occur almost instantaneously, and we can only register the reactants
and products moving freely before and after the collision. In such situations
the theory is not required to describe the actual evolution of particles during
the short time interval of collision. It is sufficient to provide a mapping of free
states before the collision onto free states after the collision. This mapping
is provided by the S-operator, which we are going to discuss in this chapter.
1
or their bound states, like hydrogen atoms or deuterons
7.1 Scattering operators

7.1.1 S-operator
Let us consider a scattering experiment in which free states of reactants are
prepared at time t = −∞. The collision occurs during a short time interval
[η′, η] around time zero.2 The free states of the products are registered at time
t = ∞, so that inequalities −∞ ≪ η′ < 0 < η ≪ ∞ hold. Here we assume
that the two colliding particles do not form bound states either before or
after the collision. Therefore, at asymptotic times the exact evolution is well
approximated by non-interacting time evolution operators U0(η′ ← −∞) and
U0(∞ ← η), respectively.3 Then we can write the time evolution operator
from the infinite past to the infinite future4
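\[ \begin{aligned}
U(\infty \leftarrow -\infty) &\approx U_0(\infty \leftarrow \eta) U(\eta \leftarrow \eta') U_0(\eta' \leftarrow -\infty) \\
&= U_0(\infty \leftarrow \eta) U_0(\eta \leftarrow 0) [U_0(0 \leftarrow \eta) U(\eta \leftarrow \eta') U_0(\eta' \leftarrow 0)] U_0(0 \leftarrow \eta') U_0(\eta' \leftarrow -\infty) \\
&= U_0(\infty \leftarrow 0) S_{\eta,\eta'} U_0(0 \leftarrow -\infty)
\end{aligned} \tag{7.1} \]
where
\[ S_{\eta,\eta'} \equiv U_0(0 \leftarrow \eta) U(\eta \leftarrow \eta') U_0(\eta' \leftarrow 0) \tag{7.2} \]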
Equation (7.1) means that a simplified description of the time evolution in
scattering events is possible in which the evolution is free at all times except
for a sudden change at t = 0 described by the unitary operator Sη,η′.
Approximation (7.1) becomes more accurate if we increase the time interval [η′, η] during
which the exact time evolution is taken into account, i.e., η′ → −∞, η → ∞.5
2
The short interaction time can be guaranteed if three conditions are met: First, the
interaction between particles is short-range or, more generally, cluster separable. Second,
states of particles are describable by localized wave packets, such as those in subsection
6.5.1. Third, particles’ velocities (or momenta) are sufficiently high.
3
Here we denoted U0(t ← t′) ≡ exp(−(i/ħ)H0(t − t′)) the time evolution operator
associated with the non-interacting Hamiltonian H0.
4
Here we used properties (6.77) and (6.78).
5
provided that the right hand side of (7.2) converges in these limits. The issue of
convergence is discussed in subsection 7.1.3.
[Figure: state of the system (vertical axis) versus time t (horizontal axis), with points A, B, C, D and evolutions U0, U, S]
Figure 7.1: A schematic representation of the scattering process.
Therefore, the exact formula for the time evolution from −∞ to ∞ can be
written as
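\[ U(\infty \leftarrow -\infty) = U_0(\infty \leftarrow 0) S U_0(0 \leftarrow -\infty) \tag{7.3} \]
where the S-operator (or scattering operator) is defined by formula
\[ \begin{aligned}
S &= \lim_{\eta' \to -\infty, \eta \to \infty} S_{\eta,\eta'} = \lim_{\eta' \to -\infty, \eta \to \infty} U_0(0 \leftarrow \eta) U(\eta \leftarrow \eta') U_0(\eta' \leftarrow 0) \\
&= \lim_{\eta' \to -\infty, \eta \to \infty} e^{\frac{i}{\hbar} H_0 \eta} e^{-\frac{i}{\hbar} H (\eta - \eta')} e^{-\frac{i}{\hbar} H_0 \eta'} \\
&= \lim_{\eta \to \infty} S(\eta)
\end{aligned} \tag{7.4} \]
\[ S(\eta) \equiv \lim_{\eta' \to -\infty} e^{\frac{i}{\hbar} H_0 \eta} e^{-\frac{i}{\hbar} H (\eta - \eta')} e^{-\frac{i}{\hbar} H_0 \eta'} \tag{7.5} \]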
A better understanding of how scattering theory describes time evolution

can be obtained from Fig. 7.1. In this figure we plot the state of the scattering
system (represented abstractly as a point on the vertical axis) as a function
of time (the horizontal axis). The exact evolution of the state is governed by
the full time evolution operator U and is shown by the thick line A → D. In
asymptotic regions (when t is large negative or large positive) the interaction
between parts of the scattering system is weak. In these regions exact time
evolutions can be well approximated by free time evolutions. These free
“trajectories” are governed by the operator U0 and shown in the figure by
two thin straight lines with arrows: one for large positive times C → D and
another for large negative times A → B. The thick line (the exact interacting
time evolution) asymptotically approaches thin lines (free time evolutions)
in the remote past (around A) and in the remote future (around D). The
past and future free evolutions can be extrapolated to time t = 0, and there
is a gap B − C between these extrapolated states. The S-operator (which
connects states B and C as shown by the dashed arrow) is designed to bridge
this gap. This operator provides a mapping between free states extrapolated
to time t = 0. Thus, in scattering theory the exact time evolution A → D is
approximated by three steps: the system first evolves freely until time t = 0,
i.e., from A to B. Then there is a sudden jump B → C represented by the S-
operator. Finally, the time evolution is free again C → D. As seen from the
figure, this description of the scattering process is perfectly exact, as long as
we are interested only in the mapping from asymptotic states in the remote
past A to asymptotic states in the remote future D. However, it is also
clear that scattering theory does not provide a good description of the time
evolution in the interacting region around t = 0. In the scattering operator S
the information about particle interactions enters integrated over the infinite
time interval t ∈ (−∞, ∞). In order to describe the time evolution in the
interaction region (t ≈ 0) the S-matrix approach is not suitable. The full
interacting time evolution operator U is needed for this purpose.
In applications we are mostly interested in matrix elements of the S-
operator
\[ S_{i \to f} = \langle f | S | i \rangle \tag{7.6} \]
where |i⟩ is a state of non-interacting initial particles, and |f⟩ is a state of
non-interacting final particles. Such matrix elements are called the S-matrix.
Formulas relating the S-matrix to observable quantities, such as scattering
cross-sections, can be found in any textbook on scattering theory.
An important property of the S-operator is its “Poincaré invariance,”
i.e., zero commutators with generators of the non-interacting representation
of the Poincaré group [Wei95, Kaz71]
[S, H0 ] = [S, P0 ] = [S, J0 ] = [S, K0 ] = 0 (7.7)
The vanishing commutator [S, H0 ] = 0 implies that in (7.3) one can change
places of U0 and S, so that the interacting time evolution operator can be
written as the full free time evolution operator times the S-operator
U(∞ ← −∞) = SU0 (∞ ← −∞) = U0 (∞ ← −∞)S (7.8)
7.1.2 S-operator in perturbation theory

There are various techniques available for calculations of the S-operator.
Currently, the perturbation theory is the most powerful and effective one. To
derive the perturbation expansion for the S-operator, first note that operator
S(t) in (7.5) satisfies equation
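\[ \begin{aligned}
\frac{d}{dt} S(t) &= \frac{d}{dt} \lim_{t' \to -\infty} e^{\frac{i}{\hbar} H_0 t} e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \\
&= \lim_{t' \to -\infty} \left[ e^{\frac{i}{\hbar} H_0 t} \left( \frac{i}{\hbar} H_0 \right) e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} + e^{\frac{i}{\hbar} H_0 t} \left( -\frac{i}{\hbar} H \right) e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \right] \\
&= -\frac{i}{\hbar} \lim_{t' \to -\infty} e^{\frac{i}{\hbar} H_0 t} (H - H_0) e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \\
&= -\frac{i}{\hbar} \lim_{t' \to -\infty} e^{\frac{i}{\hbar} H_0 t} V e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \\
&= -\frac{i}{\hbar} \lim_{t' \to -\infty} e^{\frac{i}{\hbar} H_0 t} V e^{-\frac{i}{\hbar} H_0 t} e^{\frac{i}{\hbar} H_0 t} e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \\
&= -\frac{i}{\hbar} \lim_{t' \to -\infty} V(t) e^{\frac{i}{\hbar} H_0 t} e^{-\frac{i}{\hbar} H (t - t')} e^{-\frac{i}{\hbar} H_0 t'} \\
&= -\frac{i}{\hbar} V(t) S(t)
\end{aligned} \tag{7.9} \]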
where we denoted6
\[ V(t) = e^{\frac{i}{\hbar} H_0 t} V e^{-\frac{i}{\hbar} H_0 t} \tag{7.10} \]
6
Note that the t-dependence of V (t) does not mean that we are considering time-
One can directly check that solution of equation (7.9) with the natural
initial condition S(−∞) = 1 is given by the “old-fashioned” perturbation
expansion
\[ S(t) = 1 - \frac{i}{\hbar} \int_{-\infty}^{t} V(t')\, dt' - \frac{1}{\hbar^2} \int_{-\infty}^{t} V(t')\, dt' \int_{-\infty}^{t'} V(t'')\, dt'' + \ldots \]
Therefore, the S-operator can be calculated by putting t = +∞ as the upper
limit of t-integrals
\[ S = 1 - \frac{i}{\hbar} \int_{-\infty}^{+\infty} V(t)\, dt - \frac{1}{\hbar^2} \int_{-\infty}^{+\infty} V(t)\, dt \int_{-\infty}^{t} V(t')\, dt' + \ldots \tag{7.11} \]
This formula can be also derived from equation (6.82) in the case when the
initial time t0 = −∞ is in the remote past and the final time t = +∞ is in
the distant future
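\[ \begin{aligned}
|\Psi(+\infty)\rangle = \lim_{t \to +\infty} e^{-\frac{i}{\hbar} H_0 (t - t_0)} \Big[ 1 &- \frac{i}{\hbar} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} H_0 (t' - t_0)} V e^{-\frac{i}{\hbar} H_0 (t' - t_0)}\, dt' \\
&- \frac{1}{\hbar^2} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} H_0 (t' - t_0)} V e^{-\frac{i}{\hbar} H_0 (t' - t_0)}\, dt' \int_{-\infty}^{t'} e^{\frac{i}{\hbar} H_0 (t'' - t_0)} V e^{-\frac{i}{\hbar} H_0 (t'' - t_0)}\, dt'' + \ldots \Big] |\Psi(-\infty)\rangle
\end{aligned} \]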
Next we shift integration variables t′ − t0 → t′ and t′′ − t0 → t′′ , so that7
dependent interactions. The argument t has very little to do with actual time dependence
of operators in the Heisenberg representation, which must be generated by the full inter-
acting Hamiltonian H and not by the free Hamiltonian H0 as in equation (7.10). In such
cases we will use the term “t-dependence” instead of “time dependence”.
7
Note that the trick of “adiabatic switching” described in the next subsection (and
tacitly assumed to be working here) allows us to keep unchanged the infinite limits (−∞
and ∞) of integrals.
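\[ \begin{aligned}
|\Psi(+\infty)\rangle = \lim_{t \to +\infty} e^{-\frac{i}{\hbar} H_0 (t - t_0)} \Big[ 1 &- \frac{i}{\hbar} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} H_0 t'} V e^{-\frac{i}{\hbar} H_0 t'}\, dt' \\
&- \frac{1}{\hbar^2} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} H_0 t'} V e^{-\frac{i}{\hbar} H_0 t'}\, dt' \int_{-\infty}^{t'} e^{\frac{i}{\hbar} H_0 t''} V e^{-\frac{i}{\hbar} H_0 t''}\, dt'' + \ldots \Big] |\Psi(-\infty)\rangle
\end{aligned} \]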
Comparing this formula with representation (7.8) of the time evolution op-
erator we conclude that the S-factor is the same as (7.11).
We will avoid discussion of (non-trivial) convergence properties of the
series on the right hand side of equation (7.11). Throughout this book we
will tacitly assume that all relevant perturbation series do converge.
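As a quick consistency check (a sketch added for illustration, valid to the order shown and assuming that term-by-term differentiation is allowed), one can differentiate the series for S(t) that leads to (7.11) and confirm that it reproduces equation (7.9):

```latex
\[ \begin{aligned}
\frac{d}{dt} S(t)
  &= \frac{d}{dt} \left[ 1 - \frac{i}{\hbar} \int_{-\infty}^{t} V(t')\, dt'
     - \frac{1}{\hbar^2} \int_{-\infty}^{t} V(t')\, dt' \int_{-\infty}^{t'} V(t'')\, dt'' + \ldots \right] \\
  &= -\frac{i}{\hbar} V(t) - \frac{1}{\hbar^2} V(t) \int_{-\infty}^{t} V(t'')\, dt'' + \ldots \\
  &= -\frac{i}{\hbar} V(t) \left[ 1 - \frac{i}{\hbar} \int_{-\infty}^{t} V(t'')\, dt'' + \ldots \right]
   = -\frac{i}{\hbar} V(t) S(t)
\end{aligned} \]
```

Each differentiation removes the outermost integral and leaves V(t) standing to the left of the series of one order lower, which is why the ordering of operators in the integrands matters.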
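We will often use the following convenient shorthand notation for t-integrals,
in which an underline denotes integration up to time t and an underbrace
denotes integration over all times
\[ \underline{Y(t)} \equiv -\frac{i}{\hbar} \int_{-\infty}^{t} Y(t')\, dt' \tag{7.12} \]
\[ \underbrace{Y(t)} \equiv -\frac{i}{\hbar} \int_{-\infty}^{+\infty} Y(t')\, dt' = \underline{Y(\infty)} \tag{7.13} \]
In this notation the perturbation expansion of the S-operator (7.11) can be
written compactly as
\[ S = 1 + \underbrace{\Sigma(t)} \tag{7.14} \]
\[ \Sigma(t) = V(t) + V(t) \underline{V(t')} + V(t) \underline{V(t') \underline{V(t'')}} + V(t) \underline{V(t') \underline{V(t'') \underline{V(t''')}}} + \ldots \tag{7.15} \]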
Formula (7.11) is not the only way to write the perturbation expansion for
the S-operator and, perhaps, not the most convenient one. In most books on
quantum field theory the covariant Feynman–Dyson perturbation expansion
[Wei95] is used, which involves a time ordering of operators in the integrands8
8
When applied to a product of several t-dependent bosonic operators, the time ordering
symbol T changes the order of operators in such a way that the t label increases from right
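\[ \begin{aligned}
S = 1 &- \frac{i}{\hbar} \int_{-\infty}^{+\infty} dt_1 V(t_1) - \frac{1}{2! \hbar^2} \int_{-\infty}^{+\infty} dt_1 dt_2\, T[V(t_1) V(t_2)] \\
&+ \frac{i}{3! \hbar^3} \int_{-\infty}^{+\infty} dt_1 dt_2 dt_3\, T[V(t_1) V(t_2) V(t_3)] \\
&+ \frac{1}{4! \hbar^4} \int_{-\infty}^{+\infty} dt_1 dt_2 dt_3 dt_4\, T[V(t_1) V(t_2) V(t_3) V(t_4)] + \ldots
\end{aligned} \tag{7.17} \]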
For our purposes in chapter 11 we found more useful yet another equivalent
perturbative expression suggested by Magnus [Mag54, PL66, BCOR09]
\[ S = \exp(\underbrace{F(t)}) \tag{7.18} \]
where the Hermitian operator F(t) will be referred to as the scattering phase
operator. It is represented as a series of multiple commutators with t-integrals
1 1
F (t) = V (t) − [V (t′ ), V (t)] + [V (t′′ ), [V (t′ ), V (t)]]
2 6
1 1
+ [[V (t′′ ), V (t′ )], V (t)] − [V (t′′′ ), [[V (t′′ ), V (t′ )], V (t)]]
6 12
1
− [[V (t′′′ ), [V (t′′ ), V (t′ )]], V (t)]
12
1
− [[V (t′′′ ), V (t′′ )], [V (t′ ), V (t)]] + . . . (7.19)
12
One important advantage of this representation is that expression (7.18) for
the S-operator is manifestly unitary in each perturbation order. The three
to left, e.g.

\[ T[A(t_1) B(t_2)] = \begin{cases} A(t_1) B(t_2), & \text{if } t_1 > t_2 \\ B(t_2) A(t_1), & \text{if } t_1 < t_2 \end{cases} \tag{7.16} \]
For the time ordered product of fermionic (anticommuting) operators see equation (J.87).
perturbative expansions (old-fashioned, Feynman–Dyson, and Magnus) are

equivalent in the sense that they converge to the same result if all perturba-
tion orders are added up to infinity. However, in each fixed order n the three
types of terms can be different.9
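The equivalence of the three expansions can be illustrated in the second order. The following sketch uses the shorthand A(t) ≡ −(i/ħ)V(t), introduced here only for brevity, and takes the second-order Magnus term in its standard explicit form, which is assumed to coincide with the single-commutator term of (7.19).

```latex
\[ \begin{aligned}
\text{old-fashioned:}\quad S_2 &= \int_{-\infty}^{+\infty} dt \int_{-\infty}^{t} dt'\, A(t) A(t') \\
\text{Feynman--Dyson:}\quad S_2 &= \frac{1}{2} \int\!\!\int dt_1\, dt_2\, T[A(t_1) A(t_2)]
   = \frac{1}{2} \left( \int_{t_1 > t_2} A(t_1) A(t_2) + \int_{t_2 > t_1} A(t_2) A(t_1) \right)
   = \int_{t > t'} A(t) A(t') \\
\text{Magnus:}\quad S_2 &= F_2 + \frac{1}{2} F_1^2, \qquad
   F_1 = \int_{-\infty}^{+\infty} A(t)\, dt, \qquad
   F_2 = \frac{1}{2} \int_{t > t'} [A(t), A(t')] \\
  &= \frac{1}{2} \int_{t > t'} \big( A(t) A(t') - A(t') A(t) \big)
   + \frac{1}{2} \int_{t > t'} \big( A(t) A(t') + A(t') A(t) \big)
   = \int_{t > t'} A(t) A(t')
\end{aligned} \]
```

All three second-order contributions to S therefore coincide. The difference is only in how they are grouped: in the Magnus form the second-order part of the exponent, F_2, is a pure commutator, so truncating the exponent keeps S unitary.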
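We will often drop t-arguments in operator expressions. Then formulas
for the S-operator (7.14) and (7.18) simplify
\[ S = \exp(\underbrace{F}) = 1 + \underbrace{\Sigma} \tag{7.20} \]
\[ F = V - \frac{1}{2} [\underline{V}, V] + \ldots \tag{7.21} \]
\[ \Sigma = V + V \underline{V} + V \underline{V \underline{V}} + \ldots \tag{7.22} \]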
7.1.3 Adiabatic switching of interaction

In formulas for scattering operators (7.15) and (7.19) we meet t-integrals
of V(t) of the form (7.12). A straightforward calculation of such integrals
gives a rather discouraging result. Let us introduce a complete basis |n⟩ of
eigenvectors of the free Hamiltonian
\[ H_0 |n\rangle = E_n |n\rangle \tag{7.23} \]
\[ \sum_n |n\rangle \langle n| = 1 \tag{7.24} \]
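and calculate matrix elements of the t-integral of V(t) in this basis
\[ \begin{aligned}
\langle n | \underline{V(t)} | m \rangle &\equiv -\frac{i}{\hbar} \int_{-\infty}^{t} \langle n | e^{\frac{i}{\hbar} H_0 t'} V e^{-\frac{i}{\hbar} H_0 t'} | m \rangle\, dt' = -\frac{i}{\hbar} V_{nm} \int_{-\infty}^{t} e^{\frac{i}{\hbar} (E_n - E_m) t'}\, dt' \\
&= -V_{nm} \left( \frac{e^{\frac{i}{\hbar} (E_n - E_m) t}}{E_n - E_m} - \frac{e^{\frac{i}{\hbar} (E_n - E_m)(-\infty)}}{E_n - E_m} \right)
\end{aligned} \tag{7.25} \]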
What shall we do with the meaningless term containing (−∞) on the right
hand side?
9
the difference being of the order n + 1 or higher
This term can be made harmless if we take into account an important

fact that the S-operator cannot be applied to all states in the Hilbert space.
It can be applied only to scattering states |Ψ⟩ in which free particles are
far from each other in asymptotic limits t → ±∞. Then the time evolu-
tion of these states coincides with the free evolution in the distant past and
distant future,10 and the full time evolution in the infinite time interval is
given exactly by formula (7.8). Certainly, the above assumptions cannot be
applied to all states in the Hilbert space. For example, the time evolution
of bound states of the interacting Hamiltonian H does not resemble the
free evolution at any time. It appears that if we exclude such bound states
from consideration and limit our application of the S-operator and t-integrals
(7.25) only to scattering states consisting of one-particle wave packets with
good localization in both position and momentum spaces, then no ambiguity
arises.
For scattering states the interaction operator is effectively zero in asymp-
totic regimes, so we can write
\[ \lim_{t \to \pm\infty} V e^{-\frac{i}{\hbar} H_0 t} |\Psi\rangle = 0 \]
\[ \lim_{t \to \pm\infty} V(t) |\Psi\rangle = 0 \tag{7.26} \]
One approach to the exact treatment of scattering is to explicitly consider

only wave packets described above.11 Then the cluster separability of V will
ensure the correct asymptotic behavior of the colliding wave packets and the
validity of equation (7.26). However, such an approach is rather complicated,
and we would like to stay away from working with wave packets.
There is another way to achieve the same goal by using a trick called the
adiabatic switching of interaction. The trick is to add the property (7.26)
to the interaction operator “by hand.” To do that we multiply V (t) by a
numerical function of t which slowly grows from the value of zero at t = −∞
to the value of one at t ≈ 0 (turning the interaction “on”) and then slowly
decreases back to zero at t = ∞ (turning the interaction “off”). For example,
it is convenient to choose
\[ V(t) = e^{\frac{i}{\hbar} H_0 t} V e^{-\frac{i}{\hbar} H_0 t} e^{-\epsilon |t|} \tag{7.27} \]
10
Of course, interaction V must be cluster separable to ensure that.
11
see, e.g., [GW64]
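If the parameter ε is small and positive, such a modification would not affect
the movement of wave packets and the S-matrix. At the end of calculations
we will take the limit ε → +0. Then the t-integral (7.25) takes the form
\[ \begin{aligned}
\langle n | \underline{V(t)} | m \rangle &\approx -V_{nm} \left( \frac{e^{\frac{i}{\hbar} (E_n - E_m) t - \epsilon |t|}}{E_n - E_m} - \frac{e^{\frac{i}{\hbar} (E_n - E_m)(-\infty) - \epsilon (\infty)}}{E_n - E_m} \right) \\
&\longrightarrow -V_{nm} \frac{e^{\frac{i}{\hbar} (E_n - E_m) t}}{E_n - E_m}
\end{aligned} \tag{7.28} \]
so the embarrassing expression e^{i∞} does not appear.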
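The effect of the damping factor can be seen numerically. The sketch below is an illustration under stated assumptions: it works with the scalar integral of exp(iωs − ε|s|) over s from −∞ to t, where ω stands for (En − Em)/ħ, restricts attention to t ≤ 0, and replaces −∞ by a large hypothetical cutoff. For t ≤ 0 the integral has the closed form exp((iω + ε)t)/(iω + ε), which tends to exp(iωt)/(iω) as ε → +0.

```python
import cmath

def damped_integral(omega, eps, t, cutoff=-400.0, steps=200000):
    """Trapezoidal estimate of the integral of exp(i*omega*s - eps*|s|) ds
    from the (hypothetical) finite cutoff up to t."""
    h = (t - cutoff) / steps

    def f(s):
        return cmath.exp(1j * omega * s - eps * abs(s))

    total = 0.5 * (f(cutoff) + f(t))
    for k in range(1, steps):
        total += f(cutoff + k * h)
    return total * h

def closed_form(omega, eps, t):
    """Exact value of the same integral from -infinity to t, valid for t <= 0."""
    return cmath.exp((1j * omega + eps) * t) / (1j * omega + eps)

omega, t = 2.0, -1.0          # illustrative values, with t <= 0
numeric = damped_integral(omega, 0.05, t)
closed = closed_form(omega, 0.05, t)
limit = cmath.exp(1j * omega * t) / (1j * omega)   # the eps -> +0 limit

print(abs(numeric - closed))
for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(eps, abs(closed_form(omega, eps, t) - limit))
```

Multiplying the limiting value by −(i/ħ)Vnm reproduces the last line of (7.28): the contribution of the lower limit is suppressed by the damping factor instead of oscillating indefinitely.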
7.1.4 T -matrix
In this subsection we will introduce the idea of the T -matrix, which is often
useful in scattering calculations.12 Let us calculate matrix elements of the
S-operator (7.11) in the basis of eigenvectors of the free Hamiltonian (7.23)
- (7.24)13
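\[ \begin{aligned}
\langle n | S | m \rangle &= \delta_{nm} - \frac{i}{\hbar} \int_{-\infty}^{\infty} \langle n | e^{\frac{i}{\hbar} H_0 t'} V e^{-\frac{i}{\hbar} H_0 t'} | m \rangle\, dt' \\
&\quad - \frac{1}{\hbar^2} \int_{-\infty}^{\infty} \langle n | e^{\frac{i}{\hbar} H_0 t'} V e^{-\frac{i}{\hbar} H_0 t'} | k \rangle\, dt' \int_{-\infty}^{t'} \langle k | e^{\frac{i}{\hbar} H_0 t''} V e^{-\frac{i}{\hbar} H_0 t''} | m \rangle\, dt'' + \ldots \\
&= \delta_{nm} - \frac{i}{\hbar} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} (E_n - E_m) t'} V_{nm}\, dt' \\
&\quad - \frac{1}{\hbar^2} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} (E_n - E_k) t'} V_{nk}\, dt' \int_{-\infty}^{t'} e^{\frac{i}{\hbar} (E_k - E_m) t''} V_{km}\, dt'' + \ldots \\
&= \delta_{nm} - 2\pi i \delta(E_n - E_m) V_{nm} - \frac{i}{\hbar} \int_{-\infty}^{\infty} e^{\frac{i}{\hbar} (E_n - E_k) t'}\, dt'\, V_{nk} \frac{e^{\frac{i}{\hbar} (E_k - E_m) t'}}{E_m - E_k} V_{km} + \ldots \\
&= \delta_{nm} - 2\pi i \delta(E_n - E_m) V_{nm} - 2\pi i \delta(E_n - E_m) V_{nk} \frac{1}{E_m - E_k} V_{km} + \ldots \\
&= \delta_{nm} - 2\pi i \delta(E_n - E_m) V_{nk} \left[ \delta_{km} + \frac{1}{E_m - E_k} V_{km} + \frac{1}{E_m - E_k} V_{kl} \frac{1}{E_m - E_l} V_{lm} + \ldots \right]
\end{aligned} \tag{7.29} \]
12
I am indebted to Cao Bin for numerous discussions which resulted in writing this
subsection.
13
Summation over indices k and l is implied. Formula (7.28) is used for t-integrals.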
The first term is a unit matrix that describes free propagation. The matrix
in the second term is called the T -matrix (or transition matrix ).

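\[ \begin{aligned}
T_{nm} &\equiv V_{nk} \left[ \delta_{km} + \frac{1}{E_m - E_k} V_{km} + \frac{1}{E_m - E_k} V_{kl} \frac{1}{E_m - E_l} V_{lm} + \ldots \right] \\
&= \langle n | V \left[ 1 + \frac{1}{E_m - H_0} V + \frac{1}{E_m - H_0} V \frac{1}{E_m - H_0} V + \ldots \right] | m \rangle
\end{aligned} \]
The series in the parentheses can be summed up using the standard formula
(1 − x)^{−1} = 1 + x + x² + . . .
\begin{align}
T_{nm} &= \langle n | V \frac{1}{1 - (E_m - H_0)^{-1} V} | m \rangle \tag{7.30} \\
&= \langle n | V (E_m - H_0)(E_m - H_0 - V)^{-1} | m \rangle \nonumber \\
&= \langle n | V (E_m - H_0)(E_m - H)^{-1} | m \rangle \tag{7.31}
\end{align}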
The beauty of this result is that it provides a non-perturbative closed-form

expression for the S-operator and scattering amplitudes. Expression (7.30)
has been used in numerical scattering calculations [RMM74]. See also a
related method of “inverse matrix” in [BJK69, KMS90, DS10].
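The summation of the geometric series behind (7.30) can be illustrated with a small matrix model. The sketch below is only an assumption-laden toy: a hypothetical two-level H0, a weak coupling V, and an energy E chosen off the spectrum of H0 so that (E − H0)⁻¹ exists without any limiting prescription. It compares partial sums of V[1 + G0V + (G0V)² + ...], with G0 = (E − H0)⁻¹, against the closed form V(1 − G0V)⁻¹.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Sum of two square matrices given as nested lists."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def inverse_2x2(a):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical toy model (all numbers are illustrative).
E = 3.0                       # energy parameter, not an eigenvalue of H0
free_levels = [1.0, 2.0]      # eigenvalues of H0 in its own eigenbasis
V = [[0.1, 0.2], [0.2, -0.1]]

# G0 = (E - H0)^(-1) is diagonal in the eigenbasis of H0.
G0 = [[1.0 / (E - free_levels[0]), 0.0],
      [0.0, 1.0 / (E - free_levels[1])]]
G0V = matmul(G0, V)

# Partial sums of the series 1 + G0 V + (G0 V)^2 + ...
series, term = IDENTITY, IDENTITY
for _ in range(60):
    term = matmul(term, G0V)
    series = matadd(series, term)
T_series = matmul(V, series)

# Closed form corresponding to (7.30): T = V (1 - G0 V)^(-1).
one_minus = [[IDENTITY[i][j] - G0V[i][j] for j in range(2)] for i in range(2)]
T_closed = matmul(V, inverse_2x2(one_minus))

max_diff = max(abs(T_series[i][j] - T_closed[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The series converges here because the coupling is weak compared with the energy denominators; for stronger couplings the closed form remains defined even when the perturbation series diverges.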
According to (7.29) and (7.31), the S-matrix can be represented as a
function of energy (E = Em = En )
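\begin{align}
S_{nm}(E) &= \delta_{nm} - 2\pi i \delta(E_n - E_m) T_{nm} \nonumber \\
&= \delta_{nm} - 2\pi i \delta(E_n - E_m) \langle n | V (E - H_0)(E - H)^{-1} | m \rangle \tag{7.32}
\end{align}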
Let us analyze the structure of this expression in more detail. As we discussed

in subsection 7.1.3, the S-matrix is defined only for states that behave asymp-
totically as free states. For such states, the energy E is not lower than the
energy of separated reactants E0 . In the center-of-mass reference frame this
threshold is the sum of rest energies of all N particles participating in the
scattering event
\[ E_0 = \sum_{a=1}^{N} m_a c^2 \]
Therefore
E ∈ [E0 , ∞) (7.33)
Let us pick a value E in this interval. Then Tnm in (7.32) can be calculated
as a matrix of an energy-dependent T -operator T (E)
\[ T_{nm} = \langle n | V (E - H_0)(E - H)^{-1} | m \rangle = \langle n | T(E) | m \rangle \]
The product δ(En − E)Tnm can be interpreted as a matrix, which is zero
everywhere except the diagonal sub-block corresponding to the eigenvalue E
of H0. If we denote PE the projection on this eigensubspace, then
δ(En − E)T(E) = PE T(E) PE and the full S-operator (for all values of E) can be
written in a basis-independent form
\[ S = 1 - 2\pi i \sum_E P_E V (E - H_0)(E - H)^{-1} P_E \tag{7.34} \]
E
The corresponding S-matrix has a block-diagonal form, i.e., the matrix ele-
ment Snm is non-zero only if the indices n and m satisfy condition Em = En .
This implies that the S-operator commutes with the free Hamiltonian H0 .14
Exact formula (7.34) is not very useful in practical scattering calculations.
However, it helps to derive some interesting results, such as the connection
between poles of the S-matrix and energies of bound states discussed in the
next subsection.
14
see also equation (7.7)
7.1.5 S-matrix and bound states

There are good reasons to believe that the S-operator is an analytic function
of energy E. So, it is interesting to find where the poles of this function are
located. If operator V is non-singular, then poles of S coincide with those
values of E at which operator (E − H0)(E − H)^{−1} in (7.34) is singular. In other
words, these are values E for which the denominator has zero eigenvalue.
Thus poles Eα can be found as solutions of the eigenvalue equation
\[ H |\Psi_\alpha\rangle = E_\alpha |\Psi_\alpha\rangle \]
Obviously, this is equivalent to the stationary Schrödinger equation for bound

states. This means that there exists a correspondence between poles of the S-
operator (or T -operator) and bound state energies Eα of the full Hamiltonian
H. Earlier we found that operators S(E) and T (E) are defined only for ener-
gies E in the interval (7.33). Energies of bound states are always lower than
E0 , i.e., they are outside the domain of validity of operators S(E) and T (E).
Therefore, the above correspondence (poles of the S-operator)↔(energies of
bound states) can be established only in terms of analytic continuation of
the S-operator from its natural energy range E ≥ E0 to energy values below
E0 .
It is important to stress that the above possibility to find energies of
bound states Eα from the S-matrix does not mean that state vectors of
bound states |Ψα⟩ can be found as well. All bound states are eigenstates of
just one (infinite) eigenvalue of the T -matrix. Therefore, the maximum we
can do is to find the entire subspace spanning all bound state vectors. This
ambiguity is closely related to the scattering equivalence of Hamiltonians
discussed in the next section.
7.2 Scattering equivalence

Results from the preceding section allow us to conclude that even full knowl-
edge of the S-operator does not permit us to obtain the unique corresponding
Hamiltonian H. In other words, many different Hamiltonians may have iden-
tical scattering properties. In this section, we will discuss in more detail this
one-to-many relationship between S-operators and Hamiltonians.
7.2.1 Scattering equivalence of Hamiltonians

The S-operator and the Hamiltonian provide two different ways to describe
dynamics. The Hamiltonian completely describes the time evolution for all
time intervals, large or small. On the other hand, the S-operator represents
time evolution in the “integrated” form, i.e., knowing the state of the system
in the remote past |Ψ(−∞)⟩, the free Hamiltonian H0, and the scattering
operator S, we can find the evolved state in the distant future15
\[ |\Psi(\infty)\rangle = U(\infty \leftarrow -\infty) |\Psi(-\infty)\rangle = U_0(\infty \leftarrow -\infty) S |\Psi(-\infty)\rangle \]
Calculations of the S-operator are much easier than those of the detailed
time evolution and yet they fully satisfy the needs of current experiments
in high energy physics. In particular, the knowledge of the S-operator is
sufficient to calculate scattering cross-sections as well as energies and life-
times of stable and metastable bound states.16 This situation created an
impression that a comprehensive theory can be constructed which uses the
S-operator as the fundamental quantity rather than the Hamiltonian and
wave functions. However, in order to describe the detailed time evolution of
all states and wavefunctions of bound states, the knowledge of the S-operator
is not enough: the full interacting Hamiltonian H is needed. Therefore the
S-operator description is not complete, and such a theory is applicable only
to a limited class of experiments.
Knowing the full interacting Hamiltonian H, we can calculate the S-
operator by formulas (7.14), (7.17), or (7.18). However, the inverse state-
ment is not true: the same S-operator can be obtained from many different
Hamiltonians. Suppose that two Hamiltonians H and H′ are related to each
other by a unitary transformation e^{iΦ}
\[ H' = e^{i\Phi} H e^{-i\Phi} \]
Then they yield the same scattering (and Hamiltonians H and H′ are called
scattering equivalent) as long as condition
\[ \lim_{t \to \pm\infty} e^{\frac{i}{\hbar} H_0 t} \Phi e^{-\frac{i}{\hbar} H_0 t} = 0 \tag{7.35} \]
15
see equation (7.8)
16
The two latter quantities are represented by positions of poles of the S-operator on
the complex energy plane, as discussed in the preceding subsection.
is satisfied.17 Indeed, in the limit η → +∞, η ′ → −∞ we obtain from (7.5)

[Eks60]
i i ′ ′ i ′
S′ = lim e ~ H0 η e− ~ H (η−η ) e− ~ H0 η
η′ →−∞,η→∞
i
i ′
i ′
= ′ lim e ~ H0 η eiΦ e− ~ H(η−η ) e−iΦ e− ~ H0 η
η →−∞,η→∞
i i
i i ′ i ′
i ′ i ′

= ′ lim e ~ H0 η eiΦ e− ~ H0 η e ~ H0 η e− ~ H(η−η ) e− ~ H0 η e ~ H0 η e−iΦ e− ~ H0 η
η →−∞,η→∞
i i ′ i ′
= lim e ~ H0 η e− ~ H(η−η ) e− ~ H0 η
η′ →−∞,η→∞
= S (7.36)
Note that due to Lemma F.11, the energy spectra of two scattering equiv-
alent Hamiltonians H and H ′ are identical. However, their eigenvectors
are different, and corresponding descriptions of dynamics (e.g., via equa-
tion (6.81)) are different too. Therefore scattering-equivalent theories may
not be physically equivalent.
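The statement about identical spectra and different eigenvectors can be illustrated with a small numerical sketch. It is only a finite-dimensional toy: the matrices `H` and `Phi` below are random Hermitian matrices chosen for illustration (they are not operators from this book), and condition (7.35) has no counterpart in finite dimensions, so the sketch says nothing about scattering itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hermitian(n):
    """Return a random n x n Hermitian matrix (toy stand-in for an operator)."""
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (a + a.conj().T) / 2

def exp_i(phi):
    """Return the unitary matrix exp(i*phi) for a Hermitian matrix phi."""
    w, v = np.linalg.eigh(phi)
    return (v * np.exp(1j * w)) @ v.conj().T

n = 6
H = random_hermitian(n)        # toy "Hamiltonian" H
Phi = random_hermitian(n)      # toy generator of the unitary transformation
U = exp_i(Phi)
H_prime = U @ H @ U.conj().T   # H' = exp(i Phi) H exp(-i Phi)

E, vecs = np.linalg.eigh(H)
E_prime, vecs_prime = np.linalg.eigh(H_prime)

# The spectra coincide, but the lowest eigenvectors generally do not.
spectra_match = np.allclose(E, E_prime)
ground_overlap = abs(np.vdot(vecs[:, 0], vecs_prime[:, 0]))
print(spectra_match, ground_overlap)
```

The overlap printed at the end is generally smaller than 1, which is the toy analog of the remark that the two descriptions of dynamics differ even though the energy spectra agree.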
7.2.2 Bakamjian’s construction of the point form dynamics
So far we have been working within the instant form of Dirac’s relativistic
dynamics. It appears, however, that the above conclusions about scattering-
equivalent Hamiltonians can be made more general, in the sense that even
two different forms of dynamics (e.g., the instant form and the point form)
can have the same S-operators. This will be discussed in subsection 7.2.3.
To prepare for this discussion, here we will construct a particular version
of the point form dynamics using Bakamjian’s prescription [Bak61].
This method bears some resemblance to the Bakamjian-Thomas approach
described in subsection 6.3.2.
We start from non-interacting operators of mass M0, linear momentum
P0, angular momentum J0, position R0, and spin S0 = J0 − [R0 × P0]. Next
we introduce two new operators
$$ \mathbf{Q}_0 \equiv \frac{\mathbf{P}_0}{M_0 c^2}, \qquad \mathbf{X}_0 \equiv M_0 c^2 \mathbf{R}_0 $$
which satisfy canonical commutation relations
$$ [X_{0i}, Q_{0j}] = i\hbar\delta_{ij}. $$
¹⁷ A rather general class of operators Φ that satisfy condition (7.35) will be found in Theorem 11.2 from subsection 11.1.4.

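Then we can express generators of the non-interacting representation of
the Poincaré group in the following form
$$ \begin{aligned}
\mathbf{P}_0 &= M_0 c^2 \mathbf{Q}_0 \\
\mathbf{J}_0 &= [\mathbf{X}_0 \times \mathbf{Q}_0] + \mathbf{S}_0 \\
\mathbf{K}_0 &= -\frac{1}{2}\left( \sqrt{1 + c^2 Q_0^2}\, \mathbf{X}_0 + \mathbf{X}_0 \sqrt{1 + c^2 Q_0^2} \right) - \frac{[\mathbf{Q}_0 \times \mathbf{S}_0]}{1 + \sqrt{1 + c^2 Q_0^2}} \\
H_0 &= M_0 c^2 \sqrt{1 + c^2 Q_0^2}.
\end{aligned} $$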
A point form interaction can now be introduced by modifying the mass
operator M0 → M provided that the following conditions are satisfied¹⁸
$$ [M, \mathbf{Q}_0] = [M, \mathbf{X}_0] = [M, \mathbf{S}_0] = 0 $$
These conditions, in particular, guarantee that M is Lorentz invariant
$$ [M, \mathbf{K}_0] = [M, \mathbf{J}_0] = 0. $$
This modification of the mass operator introduces interaction in generators
of the translation subgroup
$$ \begin{aligned}
\mathbf{P} &= M c^2 \mathbf{Q}_0 \\
H &= M c^2 \sqrt{1 + c^2 Q_0^2}
\end{aligned} \tag{7.37} $$
while Lorentz subgroup generators K0 and J0 remain interaction-free.
¹⁸ Just as in subsection 6.3.2, these conditions can be satisfied by defining M = M0 + L, where L is a rotationally-invariant function of relative position and momentum operators commuting with Q0 and X0.
7.2.3 Scattering equivalence of forms of dynamics

The S-matrix equivalence of Hamiltonians established in subsection 7.2.1
remains valid even if the transformation e^{iΦ} changes the relativistic form of
dynamics [Sok75, SS78]. Here we would like to demonstrate this equivalence
using the example of Dirac’s point and instant forms of dynamics [Sok75]. We
will use definitions and notation from subsections 7.2.2 and 7.1.3.
First we assume that a Bakamjian point form representation of the
Poincaré group is given, which is built on operators
$$ M \neq M_0, \qquad \mathbf{P} = \mathbf{Q}_0 M c^2, \qquad \mathbf{J} = \mathbf{J}_0, \qquad \mathbf{R} = \frac{\mathbf{X}_0}{M c^2} $$
Then we introduce the unitary operator
$$ \Theta = \zeta_0 \zeta^{-1} $$
where
$$ \begin{aligned}
\zeta_0 &= \exp(-i \ln(M_0 c^2) Z_0) \\
\zeta &= \exp(-i \ln(M c^2) Z_0) \\
Z_0 &= \frac{1}{2\hbar} (\mathbf{Q}_0 \cdot \mathbf{X}_0 + \mathbf{X}_0 \cdot \mathbf{Q}_0)
\end{aligned} $$
Our goal here is to demonstrate that the set of operators ΘMΘ^{-1}, ΘPΘ^{-1},
ΘJΘ^{-1}, and ΘRΘ^{-1} generates a representation of the Poincaré group in the
Bakamjian-Thomas instant form of dynamics. Moreover, the S-operators
computed with the point-form Hamiltonian H = √(M²c⁴ + P²c²) and the
instant form Hamiltonian H′ = ΘHΘ^{-1} are the same.
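Let us denote
$$ \mathbf{Q}_0(b) = e^{ibZ_0}\, \mathbf{Q}_0\, e^{-ibZ_0}, \qquad b \in \mathbb{R} $$
From the commutator
$$ [Z_0, \mathbf{Q}_0] = i\mathbf{Q}_0 $$
it follows that
$$ \frac{d}{db} \mathbf{Q}_0(b) = i[Z_0, \mathbf{Q}_0(b)] = -\mathbf{Q}_0(b), \qquad \mathbf{Q}_0(b) = e^{-b} \mathbf{Q}_0 $$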
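This formula remains valid even if b is a Hermitian operator commuting with
both Q0 and X0. For example, if b = ln(M0c²), then
$$ e^{i\ln(M_0c^2)Z_0}\, \mathbf{Q}_0\, e^{-i\ln(M_0c^2)Z_0} = e^{-\ln(M_0c^2)} \mathbf{Q}_0 = M_0^{-1} c^{-2} \mathbf{Q}_0 $$
Similarly, one can prove
$$ \begin{aligned}
e^{i\ln(Mc^2)Z_0}\, \mathbf{Q}_0\, e^{-i\ln(Mc^2)Z_0} &= M^{-1} c^{-2} \mathbf{Q}_0 \\
e^{i\ln(M_0c^2)Z_0}\, \mathbf{X}_0\, e^{-i\ln(M_0c^2)Z_0} &= M_0 c^2 \mathbf{X}_0 \\
e^{i\ln(Mc^2)Z_0}\, \mathbf{X}_0\, e^{-i\ln(Mc^2)Z_0} &= M c^2 \mathbf{X}_0
\end{aligned} $$
which imply
$$ \begin{aligned}
\Theta \mathbf{P} \Theta^{-1} &= e^{-i\ln(M_0c^2)Z_0}\, e^{i\ln(Mc^2)Z_0}\, \mathbf{Q}_0 M c^2\, e^{-i\ln(Mc^2)Z_0}\, e^{i\ln(M_0c^2)Z_0} \\
&= e^{-i\ln(M_0c^2)Z_0}\, \mathbf{Q}_0\, e^{i\ln(M_0c^2)Z_0} = \mathbf{Q}_0 M_0 c^2 = \mathbf{P}_0 \\
\Theta \mathbf{J}_0 \Theta^{-1} &= \mathbf{J}_0 \\
\Theta \mathbf{R} \Theta^{-1} &= e^{-i\ln(M_0c^2)Z_0}\, e^{i\ln(Mc^2)Z_0}\, \mathbf{X}_0 M^{-1} c^{-2}\, e^{-i\ln(Mc^2)Z_0}\, e^{i\ln(M_0c^2)Z_0} \\
&= e^{-i\ln(M_0c^2)Z_0}\, \mathbf{X}_0\, e^{i\ln(M_0c^2)Z_0} = \mathbf{X}_0 M_0^{-1} c^{-2} = \mathbf{R}_0
\end{aligned} $$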
From these formulas it is clear that the transformed dynamics corresponds

to the Bakamjian-Thomas instant form.
Let us now demonstrate that the scattering operator S computed with
the point form Hamiltonian H (7.37) is the same as S′ computed with the
instant form Hamiltonian H′ = ΘHΘ^{-1}. Note that we can write equation
(7.4) as
$$ S = \Omega^{+}(H, H_0)\, \Omega^{-}(H, H_0) $$
where operators
$$ \Omega^{\pm}(H, H_0) \equiv \lim_{t \to \pm\infty} e^{\frac{i}{\hbar} H_0 t}\, e^{-\frac{i}{\hbar} H t} $$
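are called Møller wave operators. Now we can use the Birman-Kato invariance
principle [Dol76] which states that Ω±(H, H0) = Ω±(f(H), f(H0)) where f is
any smooth function with positive derivative. Using the following connection
between the point form mass operator M and the instant form mass operator
M′
$$ M = \zeta^{-1} M \zeta = \zeta^{-1} \Theta^{-1} M' \Theta \zeta = \zeta^{-1} \zeta \zeta_0^{-1} M' \zeta_0 \zeta^{-1} \zeta = \zeta_0^{-1} M' \zeta_0 $$
we obtain
$$ \begin{aligned}
\Omega^{\pm}(H, H_0) &\equiv \Omega^{\pm}\left( M c^2 \sqrt{1 + c^2 Q_0^2},\ M_0 c^2 \sqrt{1 + c^2 Q_0^2} \right) \\
&= \Omega^{\pm}(M c^2, M_0 c^2) = \Omega^{\pm}(\zeta_0^{-1} M' \zeta_0 c^2, M_0 c^2) \\
&= \zeta_0^{-1}\, \Omega^{\pm}(M' c^2, M_0 c^2)\, \zeta_0 \\
&= \zeta_0^{-1}\, \Omega^{\pm}\left( \sqrt{(M')^2 c^4 + P_0^2 c^2},\ \sqrt{M_0^2 c^4 + P_0^2 c^2} \right) \zeta_0 \\
&= \zeta_0^{-1}\, \Omega^{\pm}(H', H_0)\, \zeta_0
\end{aligned} $$
$$ \begin{aligned}
S' &= \Omega^{+}(H', H_0)\, \Omega^{-}(H', H_0) \\
&= \zeta_0\, \Omega^{+}(H, H_0)\, \zeta_0^{-1} \zeta_0\, \Omega^{-}(H, H_0)\, \zeta_0^{-1} \\
&= \zeta_0\, \Omega^{+}(H, H_0)\, \Omega^{-}(H, H_0)\, \zeta_0^{-1} = \zeta_0 S \zeta_0^{-1}
\end{aligned} $$
But S commutes with free generators (7.7) and hence commutes with ζ0,
which implies that S′ = S and the transformation Θ conserves the S-matrix.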
In addition to the scattering equivalence of the instant and point forms
proved above, Sokolov and Shatnii [Sok75, SS78] established the mutual scat-
tering equivalence of all three major forms of dynamics - the instant, point,
and front forms. Then, it seems reasonable to assume that the same S-
operator can be obtained in any form of dynamics.
The scattering equivalence of Hamiltonians and forms of dynamics is of great
help in practical calculations. If we are interested only in scattering
properties, energies, and lifetimes of bound states, then we can choose the
most convenient Hamiltonian and the most convenient form of dynamics.
However, as we have mentioned already, the scattering equivalence of
Hamiltonians and forms of dynamics does not mean their complete physical
equivalence. We will see in subsection 17.2.7 that the instant form of
dynamics should be preferred in those cases when desired physical
properties¹⁹ cannot be described by the S-operator, but require knowledge of
the full interacting Hamiltonian.
¹⁹ E.g., the detailed time evolution.
Chapter 8
FOCK SPACE
This subject has been thoroughly worked out and is now under-
stood. A thesis on this topic, even a correct one, will not get you
a job.
R.F. Streater
In chapter 6 we discussed interacting quantum theories in the Hilbert

space with a fixed particle content. These theories were fundamentally in-
complete, because they could not describe many physical processes that can
change particle types and/or numbers. Familiar examples of such processes
include the emission and absorption of light (photons) by atoms and nu-
clei, decays, neutrino oscillations, etc. The persistence of particle creation
and destruction processes at high energies follows from Einstein’s famous
formula E = mc². This formula, in particular, implies that if a system of
particles has sufficient energy E of their relative motion, then this energy
can be converted to the mass m of newly created particles. Generally, there
is no limit on how many particles can be created in collisions, so the Hilbert
space of any realistic quantum mechanical system should include states with
arbitrary numbers (from zero to infinity) of particles of all types. Such a
Hilbert space is called the Fock space.
Our primary goal in this chapter (and for the most part in the rest of
this book) is to understand electromagnetic interactions between five particle
species: electrons e⁻, positrons e⁺, protons p⁺, antiprotons p⁻, and
photons γ within, allegedly, the most successful physical theory – quantum
electrodynamics (QED).
8.1 Annihilation and creation operators

In this section we are going to build the Fock space H of QED and introduce
creation and annihilation operators, which provide a very convenient notation
for working with operators in H.
8.1.1 Sectors with fixed numbers of particles

The numbers of particles of any type are readily measured in experiments, so
we can introduce 5 new observables in our theory: the numbers of electrons
(Nel ), positrons (Npo ), protons (Npr ), antiprotons (Nan ), and photons (Nph ).
According to general rules of quantum mechanics, these observables must
be represented by five Hermitian operators in the Hilbert (Fock) space H.
Apparently, the allowed values (the spectrum) for the number of particles of
each type are non-negative integers (0, 1, 2, ...). We assume that these
observables can be measured simultaneously, therefore the corresponding
operators commute with each other and have a common basis of eigenvectors.
So, the Fock space H separates into a direct sum of corresponding orthogonal
eigensubspaces or sectors H(i, j, k, l, m) with i electrons, j positrons,
k protons, l antiprotons, and m photons
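$$ \mathcal{H} = \bigoplus_{i,j,k,l,m=0}^{\infty} \mathcal{H}(i, j, k, l, m) \tag{8.1} $$
where
$$ \begin{aligned}
N_{el}\, \mathcal{H}(i, j, k, l, m) &= i\, \mathcal{H}(i, j, k, l, m) \\
N_{po}\, \mathcal{H}(i, j, k, l, m) &= j\, \mathcal{H}(i, j, k, l, m) \\
N_{pr}\, \mathcal{H}(i, j, k, l, m) &= k\, \mathcal{H}(i, j, k, l, m) \\
N_{an}\, \mathcal{H}(i, j, k, l, m) &= l\, \mathcal{H}(i, j, k, l, m) \\
N_{ph}\, \mathcal{H}(i, j, k, l, m) &= m\, \mathcal{H}(i, j, k, l, m)
\end{aligned} $$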
The one-dimensional subspace with no particles H(0, 0, 0, 0, 0) is called
the vacuum subspace. The vacuum vector |0⟩ is then defined as a vector in
this subspace, up to an insignificant phase factor. The one-particle sectors
are built using prescriptions from chapter 5. The subspaces H(1, 0, 0, 0, 0)
and H(0, 1, 0, 0, 0) correspond to one electron and one positron, respectively.
They are subspaces of unitary irreducible representations of the Poincaré
group characterized by the mass m = 0.511 MeV/c² and spin 1/2 (see Table
5.1). The subspaces H(0, 0, 1, 0, 0) and H(0, 0, 0, 1, 0) correspond to one
proton and one antiproton, respectively. They have mass M = 938.3 MeV/c²
and spin 1/2. The subspace H(0, 0, 0, 0, 1) corresponds to one photon. It is
characterized by zero mass and it is a direct sum of two irreducible subspaces
with helicities 1 and −1.¹
Sectors with two or more particles are constructed as (anti)symmetrized
tensor products of one-particle sectors.² For example, if we denote by Hel the
one-electron Hilbert space and by Hph the one-photon Hilbert space, then sectors
having only electrons and photons can be written as
H(0, 0, 0, 0, 0) = |0⟩ (8.2)

H(1, 0, 0, 0, 0) = Hel (8.3)
H(0, 0, 0, 0, 1) = Hph (8.4)
H(1, 0, 0, 0, 1) = Hel ⊗ Hph (8.5)
H(2, 0, 0, 0, 0) = Hel ⊗asym Hel (8.6)
H(0, 0, 0, 0, 2) = Hph ⊗sym Hph (8.7)
H(1, 0, 0, 0, 2) = Hel ⊗ (Hph ⊗sym Hph ) (8.8)
H(2, 0, 0, 0, 1) = Hph ⊗ (Hel ⊗asym Hel ) (8.9)
H(2, 0, 0, 0, 2) = (Hph ⊗sym Hph ) ⊗ (Hel ⊗asym Hel ) (8.10)
...
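The effect of (anti)symmetrization on the size of each sector can be made concrete with a short counting sketch. It assumes, purely for illustration, that the one-electron and one-photon spaces are truncated to a finite number of basis states (`d_el` and `d_ph` are hypothetical truncation sizes, not quantities from this book; the true one-particle spaces are infinite-dimensional).

```python
from math import comb

def sector_dimension(d_el, d_ph, n_el, n_ph):
    """Dimension of the sector with n_el electrons and n_ph photons.

    d_el, d_ph: assumed (finite) numbers of one-electron and one-photon
    basis states. Electrons use the antisymmetrized tensor product,
    photons use the symmetrized one.
    """
    fermion_part = comb(d_el, n_el)              # no two electrons share a state
    boson_part = comb(d_ph + n_ph - 1, n_ph)     # photons may share a state
    return fermion_part * boson_part

d_el, d_ph = 4, 3
for n_el, n_ph in [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0, 2), (2, 2)]:
    print((n_el, n_ph), sector_dimension(d_el, d_ph, n_el, n_ph))
```

With these truncations the two-electron sector has 6 basis vectors instead of the 16 of the plain tensor product, while the two-photon sector has 6 instead of 9.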
In each sector of the Fock space we can define observables of individual
particles, e.g., position, momentum, spin, etc., as described in subsection
6.1.2.
For example, in each (massive) 1-particle subspace of the Fock space there
is a Newton-Wigner operator that describes position measurements on this
particle. In 2-particle sectors we can define two different position operators,
one for each of the two particles. In addition, we can also define the
“center-of-mass” position operator for the 2-particle system in the usual way.
Similar position operators exist in each N-particle sector.
¹
² See section 6.1. Note that electrons and protons are fermions, while photons are bosons.
Then, in each sector we can select a basis of common eigenvectors of a
full set of mutually commuting one-particle observables. A general state |Ψ⟩
in the Fock space may have components in all sectors.³ Thus the number of
particles in |Ψ⟩ may not be well defined.
For future discussions it will be convenient to use the basis in which
momenta and z-components of the spin σ of massive particles (or helicity τ
of massless particles) are diagonal. For example, basis vectors in the
two-electron sector Hel ⊗asym Hel are denoted by |p1 σ1; p2 σ2⟩. This allows us
to define in each sector multi-particle wave functions in the momentum-spin
representation.

8.1.2 Non-interacting representation of the Poincaré group
The above construction provides us with the Hilbert (Fock) space H where
multiparticle states and observables of our theory reside and where a conve-
nient orthonormal basis set is defined. To complete the formalism we need to
build a realistic interacting representation of the Poincaré group in H. Let
us first fulfill an easier task and construct the non-interacting representation
U_g^0 of the Poincaré group in the Fock space H.
From subsection 6.2.1, we already know how to build a non-interacting
representation of the Poincaré group in each individual sector of H. This
can be done by making tensor products (with proper (anti)symmetrization)
of single-particle irreducible representations U_g^{el}, U_g^{ph}, etc. Then the
non-interacting representation of the Poincaré group in the entire Fock space
can be constructed as a direct sum of such sector representations. In agreement
with the sector decomposition (8.2) - (8.10) we can write
$$ U_g^0 = 1 \oplus U_g^{el} \oplus U_g^{ph} \oplus (U_g^{el} \otimes U_g^{ph}) \oplus (U_g^{el} \otimes_{asym} U_g^{el}) \oplus \ldots \tag{8.11} $$
Generators of this representation will be denoted as (H0, P0, J0, K0). In
each sector these generators are simply sums of one-particle generators.⁴ As
usual, we assume that operators H0, P0, and J0 describe the total energy,
linear momentum, and angular momentum of the non-interacting system,
respectively.
³ Superselection rules forbid linear combinations of states with, e.g., different charges. We will not discuss such rules here.
Here we immediately face a serious problem. For example, according to
(8.11), the free Hamiltonian can be represented as a direct sum of sector
components
$$ H_0 = |0\rangle \oplus H_0(1, 0, 0, 0, 0) \oplus H_0(0, 0, 0, 0, 1) \oplus H_0(1, 0, 0, 0, 1) \oplus \ldots $$
It is tempting to use notation from section 6.2 and express each sector
Hamiltonian using observables of individual particles there: p1, p2, etc. For
example, in the sector H(1, 0, 0, 0, 0), the free Hamiltonian is
$$ H_0(1, 0, 0, 0, 0) = \sqrt{m^2 c^4 + p^2 c^2} \tag{8.12} $$
while in the sector H(2, 0, 0, 0, 2) the Hamiltonian is⁵
$$ H_0(2, 0, 0, 0, 2) = p_1 c + p_2 c + \sqrt{m^2 c^4 + p_3^2 c^2} + \sqrt{m^2 c^4 + p_4^2 c^2} \tag{8.13} $$
Clearly, this notation is very cumbersome because it does not provide a

unique expression for the operator H0 in the entire Fock space. Moreover, it is
not clear at all how one can use one-particle observables to express operators
changing the number of particles, i.e., moving state vectors across sector
boundaries. We need to find a better and simpler way to write operators in
the Fock space. This task is accomplished by introduction of annihilation
and creation operators in the rest of this section.
8.1.3 Creation and annihilation operators. Fermions

First, it is instructive to consider the case of the discrete spectrum of
momentum. This can be achieved by using the standard trick of putting the system
in a box or applying periodic boundary conditions. Then eigenvalues of the
momentum operator form a discrete 3D lattice p_i, and the usual continuous
momentum spectrum can be obtained as a limit when the size of the box
tends to infinity.
⁴ For example, in each 2-particle sector equations (6.10) - (6.13) are valid.
⁵ Two photons are denoted by indices 1 and 2 and two electrons are denoted by indices 3 and 4.
Let us examine the case of electrons. We define the (linear) creation
operator a†_{p,σ} for the electron with momentum p and spin projection σ by its
action on basis vectors with n electrons
$$ |\mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2; \ldots; \mathbf{p}_n, \sigma_n\rangle \tag{8.14} $$
We need to distinguish two cases. The first case is when the one-particle state
(p, σ) created by a†_{p,σ} is among the states listed in (8.14), for example
(p, σ) = (p_i, σ_i). Since electrons are fermions and two fermions cannot occupy
the same state due to the Pauli exclusion principle, this action leads to a zero
result, i.e.
$$ a^{\dagger}_{\mathbf{p},\sigma} |\mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2; \ldots; \mathbf{p}_n, \sigma_n\rangle = 0 \tag{8.15} $$
The second case is when the created one-particle state (p, σ) is not among
the states listed in (8.14). Then the creation operator a†_{p,σ} just adds one
electron in the state (p, σ) to the beginning of the list of particles
$$ a^{\dagger}_{\mathbf{p},\sigma} |\mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2; \ldots; \mathbf{p}_n, \sigma_n\rangle = |\mathbf{p}, \sigma; \mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2; \ldots; \mathbf{p}_n, \sigma_n\rangle \tag{8.16} $$
Operator a†_{p,σ} has transformed a state with n electrons to a state with n + 1
electrons. Applying multiple creation operators to the vacuum state |0⟩ we
can construct all basis vectors in the Fock space. For example,
$$ a^{\dagger}_{\mathbf{p}_1,\sigma_1} a^{\dagger}_{\mathbf{p}_2,\sigma_2} |0\rangle = |\mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2\rangle $$
is a basis vector in the 2-electron sector.
We define the electron annihilation operator a_{p,σ} as an operator adjoint
to the creation operator a†_{p,σ}. It can be proven [Wei95] that the action of
a_{p,σ} on the n-electron state (8.14) is the following: If the electron state with
parameters (p, σ) was already occupied, e.g. (p, σ) = (p_i, σ_i), then this state
is “annihilated,” and the number of particles in the system is reduced by one
$$ a_{\mathbf{p},\sigma} |\mathbf{p}_1, \sigma_1; \ldots; \mathbf{p}_{i-1}, \sigma_{i-1}; \mathbf{p}_i, \sigma_i; \mathbf{p}_{i+1}, \sigma_{i+1}; \ldots; \mathbf{p}_n, \sigma_n\rangle = (-1)^{P} |\mathbf{p}_1, \sigma_1; \ldots; \mathbf{p}_{i-1}, \sigma_{i-1}; \mathbf{p}_{i+1}, \sigma_{i+1}; \ldots; \mathbf{p}_n, \sigma_n\rangle \tag{8.17} $$
where P is the number of transpositions of neighboring particles required to
bring particle i to the first place in the list. If the state (p, σ) is not present
in the list, i.e., (p, σ) ≠ (p_i, σ_i) for each i, then
$$ a_{\mathbf{p},\sigma} |\mathbf{p}_1, \sigma_1; \mathbf{p}_2, \sigma_2; \ldots; \mathbf{p}_n, \sigma_n\rangle = 0 \tag{8.18} $$
Annihilation operators always yield zero when acting on the vacuum state
$$ a_{\mathbf{p},\sigma} |0\rangle = 0 $$
The above formulas fully define the action of creation and annihilation
operators on basis vectors in purely electronic sectors. These rules are easily
generalized to all states: they do not change if other particles are present and
they can be extended to linear combinations of the basis vectors by linearity.
Creation and annihilation operators for other fermions – positrons, protons
and antiprotons – are constructed similarly.
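The rules (8.15) - (8.18) translate almost literally into a few lines of code. The sketch below is only an illustration for a discrete set of one-electron labels: the representation of a basis vector as a pair `(coefficient, tuple_of_labels)` and the function names `create` and `annihilate` are choices made here, not notation from this book.

```python
# A basis vector is stored as (coefficient, ordered tuple of one-particle labels).
# A zero coefficient represents the zero vector.
VACUUM = (1, ())
ZERO = (0, ())

def create(label, state):
    """Apply the creation operator for `label`, following (8.15) and (8.16)."""
    coeff, occupied = state
    if coeff == 0 or label in occupied:
        return ZERO                        # Pauli exclusion principle, (8.15)
    return (coeff, (label,) + occupied)    # add to the beginning of the list, (8.16)

def annihilate(label, state):
    """Apply the annihilation operator for `label`, following (8.17) and (8.18)."""
    coeff, occupied = state
    if coeff == 0 or label not in occupied:
        return ZERO                        # state not present, (8.18)
    i = occupied.index(label)              # transpositions needed to reach the front
    return (coeff * (-1) ** i, occupied[:i] + occupied[i + 1:])

e1 = ("p1", +0.5)   # hypothetical labels (momentum name, spin projection)
e2 = ("p2", -0.5)

two_electrons = create(e1, create(e2, VACUUM))
print(two_electrons)                    # basis vector |p1; p2>
print(annihilate(e2, two_electrons))    # sign factor (-1)^1 from (8.17)
print(create(e1, two_electrons))        # zero vector
```

Extending the sketch to linear combinations only requires storing a list of such pairs and applying the operators term by term.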
For brevity we will refer to creation and annihilation operators collectively
as particle operators. This will distinguish them from operators of
momentum, position, energy, etc. of individual particles which will be called
particle observables. Let us emphasize that creation and annihilation opera-
tors are not intended to directly describe any real physical process and they
do not correspond to physical observables. They are just formal mathemati-
cal objects that simplify our notation for other operators having more direct
physical meaning. We will see how operators of observables are built from
particle operators later in this book, e.g., in subsection 8.1.8.
8.1.4 Anticommutators of particle operators

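In practical calculations one often uses anticommutators of fermion operators.
First consider the case of unequal particle states (p, σ) ≠ (p′, σ′)
$$ \{a_{\mathbf{p}',\sigma'}, a^{\dagger}_{\mathbf{p},\sigma}\} \equiv a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'} + a_{\mathbf{p}',\sigma'} a^{\dagger}_{\mathbf{p},\sigma} $$
Then, acting on a state |p″, σ″⟩, which is different from both |p, σ⟩ and
|p′, σ′⟩, we obtain
$$ (a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'} + a_{\mathbf{p}',\sigma'} a^{\dagger}_{\mathbf{p},\sigma}) |\mathbf{p}'', \sigma''\rangle = a_{\mathbf{p}',\sigma'} |\mathbf{p}, \sigma; \mathbf{p}'', \sigma''\rangle = 0 $$
Similarly, we obtain
$$ \begin{aligned}
(a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'} + a_{\mathbf{p}',\sigma'} a^{\dagger}_{\mathbf{p},\sigma}) |\mathbf{p}, \sigma\rangle &= 0 \\
(a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'} + a_{\mathbf{p}',\sigma'} a^{\dagger}_{\mathbf{p},\sigma}) |\mathbf{p}', \sigma'\rangle &= a^{\dagger}_{\mathbf{p},\sigma} |0\rangle + a_{\mathbf{p}',\sigma'} |\mathbf{p}, \sigma; \mathbf{p}', \sigma'\rangle \\
&= |\mathbf{p}, \sigma\rangle - |\mathbf{p}, \sigma\rangle = 0
\end{aligned} $$
One can easily demonstrate that the result is still zero when acting on zero-,
two-, three-, etc. particle states as well as on their linear combinations. So,
we conclude that in the entire Fock space
$$ \{a_{\mathbf{p}',\sigma'}, a^{\dagger}_{\mathbf{p},\sigma}\} = 0, \quad \text{if } (\mathbf{p}, \sigma) \neq (\mathbf{p}', \sigma') $$
Similarly, in the case (p, σ) = (p′, σ′) we obtain
$$ \{a^{\dagger}_{\mathbf{p},\sigma}, a_{\mathbf{p},\sigma}\} = 1 $$
Therefore for all values of p, p′, σ, and σ′ we have
$$ \{a^{\dagger}_{\mathbf{p},\sigma}, a_{\mathbf{p}',\sigma'}\} = \delta_{\mathbf{p},\mathbf{p}'} \delta_{\sigma,\sigma'} \tag{8.19} $$
Using similar arguments one can show that
$$ \{a^{\dagger}_{\mathbf{p},\sigma}, a^{\dagger}_{\mathbf{p}',\sigma'}\} = \{a_{\mathbf{p},\sigma}, a_{\mathbf{p}',\sigma'}\} = 0 $$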
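These anticommutation relations can be checked numerically on a small example. The sketch below assumes a toy system with three discrete one-electron states and represents the annihilation operators by explicit matrices built with the Jordan-Wigner construction. This matrix representation is a standard device chosen here for illustration; its sign convention need not coincide with the ordering convention of (8.16) and (8.17), but the anticommutators do not depend on that choice.

```python
import numpy as np

def fermion_annihilators(n_modes):
    """Jordan-Wigner matrices for annihilation operators of n_modes fermion states.

    Each mode has basis (empty, occupied). Mode k gets a chain of parity
    factors on all preceding modes, which produces the fermionic signs.
    """
    parity = np.diag([1.0, -1.0])                 # +1 on empty, -1 on occupied
    lower = np.array([[0.0, 1.0], [0.0, 0.0]])    # maps occupied -> empty
    identity = np.eye(2)
    ops = []
    for k in range(n_modes):
        factors = [parity] * k + [lower] + [identity] * (n_modes - k - 1)
        op = factors[0]
        for factor in factors[1:]:
            op = np.kron(op, factor)
        ops.append(op)
    return ops

def anticommutator(x, y):
    return x @ y + y @ x

n_modes = 3
a = fermion_annihilators(n_modes)
dim = 2 ** n_modes

# {a_j, a_k^dagger} = delta_jk and {a_j, a_k} = 0 for all pairs of modes.
relations_hold = all(
    np.allclose(anticommutator(a[j], a[k].T), np.eye(dim) * (j == k))
    and np.allclose(anticommutator(a[j], a[k]), 0.0)
    for j in range(n_modes)
    for k in range(n_modes)
)
print(relations_hold)
```

Because the matrices are real, the transpose `a[k].T` plays the role of the Hermitian conjugate.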
8.1.5 Creation and annihilation operators. Photons

For photons, which are bosons, the properties of creation and annihilation
operators are slightly different from those characteristic of fermion
operators described above. Two or more bosons may coexist in the same state.
Therefore, we define the action of the photon creation operator c†_{p,τ} on a
many-photon state as
$$ c^{\dagger}_{\mathbf{p},\tau} |\mathbf{p}_1, \tau_1; \mathbf{p}_2, \tau_2; \ldots; \mathbf{p}_n, \tau_n\rangle = |\mathbf{p}, \tau; \mathbf{p}_1, \tau_1; \mathbf{p}_2, \tau_2; \ldots; \mathbf{p}_n, \tau_n\rangle $$
independent of whether or not the state (p, τ) already existed. Just as in
the case of fermions, boson annihilation operators c_{p,τ} are adjoint to boson
creation operators. The photon annihilation operator c_{p,τ} destroys a
multi-photon state completely
$$ c_{\mathbf{p},\tau} |\mathbf{p}_1, \tau_1; \mathbf{p}_2, \tau_2; \ldots; \mathbf{p}_n, \tau_n\rangle = 0 $$
if the annihilated 1-photon state (p, τ) was not present there. If the photon
(p, τ) was present in the n-photon state, then the annihilation operator c_{p,τ}
simply destroys this component, thus resulting in an (n − 1)-photon state
$$ c_{\mathbf{p}_i,\tau_i} |\mathbf{p}_1, \tau_1; \ldots; \mathbf{p}_{i-1}, \tau_{i-1}; \mathbf{p}_i, \tau_i; \mathbf{p}_{i+1}, \tau_{i+1}; \ldots; \mathbf{p}_n, \tau_n\rangle = |\mathbf{p}_1, \tau_1; \ldots; \mathbf{p}_{i-1}, \tau_{i-1}; \mathbf{p}_{i+1}, \tau_{i+1}; \ldots; \mathbf{p}_n, \tau_n\rangle $$
The above formulas can be immediately extended to states where, in addition
to photons, other particles are also present. The actions of creation
and annihilation operators on linear combinations of basis vectors are
obtained by linearity. Then similar to subsection 8.1.4, we obtain the following
commutation relations for photon creation and annihilation operators
$$ \begin{aligned}
[c_{\mathbf{p},\tau}, c^{\dagger}_{\mathbf{p}',\tau'}] &= \delta_{\mathbf{p},\mathbf{p}'} \delta_{\tau,\tau'} \\
[c_{\mathbf{p},\tau}, c_{\mathbf{p}',\tau'}] &= [c^{\dagger}_{\mathbf{p},\tau}, c^{\dagger}_{\mathbf{p}',\tau'}] = 0
\end{aligned} $$
8.1.6 Particle number operators

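With the help of particle creation and annihilation operators we can now
build explicit expressions for various observables in the Fock space. Consider,
for example, the product of two photon operators
$$ N_{\mathbf{p},\tau} = c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} \tag{8.20} $$
Acting on a state with two photons with quantum numbers (p, τ) this operator
yields
$$ \begin{aligned}
N_{\mathbf{p},\tau} |\mathbf{p}, \tau; \mathbf{p}, \tau\rangle &= N_{\mathbf{p},\tau}\, c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} |0\rangle = c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} |0\rangle \\
&= c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} |0\rangle + c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} |0\rangle = c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} |0\rangle + 2 c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} |0\rangle \\
&= 2 |\mathbf{p}, \tau; \mathbf{p}, \tau\rangle
\end{aligned} $$
while acting on the state |p, τ; p′, τ′⟩ we obtain
$$ \begin{aligned}
N_{\mathbf{p},\tau} |\mathbf{p}, \tau; \mathbf{p}', \tau'\rangle &= N_{\mathbf{p},\tau}\, c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} |0\rangle = c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} |0\rangle \\
&= c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} |0\rangle + c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} |0\rangle = c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} c_{\mathbf{p},\tau} |0\rangle + c^{\dagger}_{\mathbf{p},\tau} c^{\dagger}_{\mathbf{p}',\tau'} |0\rangle \\
&= |\mathbf{p}, \tau; \mathbf{p}', \tau'\rangle
\end{aligned} $$
These examples should convince us that operator N_{p,τ} works as a counter of
the number of photons with quantum numbers (p, τ).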
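The two counting examples can be reproduced with explicit matrices. The sketch below assumes a toy model with one or two discrete photon states, each truncated to at most `n_max` photons (a hypothetical cutoff needed only to keep the matrices finite). The matrices use the usual √n normalization of occupation-number states, which is a convention adopted here; the vectors c†c†|0⟩ used in the check are unnormalized, as in the text.

```python
import numpy as np

def boson_annihilator(n_max):
    """Annihilation matrix for one boson state truncated at n_max quanta."""
    return np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)

n_max = 6
c = boson_annihilator(n_max)
c_dag = c.T
number = c_dag @ c                       # N = c^dagger c, as in (8.20)

vacuum = np.zeros(n_max + 1)
vacuum[0] = 1.0

# Two photons in the same state: N counts 2.
two_same = c_dag @ c_dag @ vacuum
counts_two = np.allclose(number @ two_same, 2 * two_same)

# Two different photon states (p, tau) and (p', tau'): N_{p,tau} counts 1.
identity = np.eye(n_max + 1)
c1 = np.kron(c, identity)                # acts on the first photon state
c2 = np.kron(identity, c)                # acts on the second photon state
vacuum2 = np.kron(vacuum, vacuum)
two_different = c1.T @ c2.T @ vacuum2
counts_one = np.allclose((c1.T @ c1) @ two_different, two_different)

# [c, c^dagger] = 1 holds away from the artificial cutoff.
commutator = c @ c_dag - c_dag @ c
commutator_ok = np.allclose(commutator[:n_max, :n_max], np.eye(n_max))
print(counts_two, counts_one, commutator_ok)
```

The last diagonal entry of the commutator differs from 1 only because of the truncation; it moves away as `n_max` grows.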
8.1.7 Continuous spectrum of momentum

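Properties of creation and annihilation operators presented in preceding
subsections were derived for the case of discrete spectrum of momentum. In
reality the spectrum of momentum is continuous, and the above results should
be modified by taking the “large box” limit. We can guess that in this limit
equation (8.19) transforms to
$$ \{a_{\mathbf{p}',\sigma'}, a^{\dagger}_{\mathbf{p},\sigma}\} = \delta_{\sigma,\sigma'}\, \delta(\mathbf{p} - \mathbf{p}') \tag{8.21} $$
The following chain of formulas
$$ \begin{aligned}
\delta_{\sigma,\sigma'}\, \delta(\mathbf{p} - \mathbf{p}') &= \langle \mathbf{p}, \sigma | \mathbf{p}', \sigma' \rangle = \langle 0 | a_{\mathbf{p},\sigma} a^{\dagger}_{\mathbf{p}',\sigma'} | 0 \rangle = -\langle 0 | a^{\dagger}_{\mathbf{p}',\sigma'} a_{\mathbf{p},\sigma} | 0 \rangle + \delta_{\sigma,\sigma'}\, \delta(\mathbf{p} - \mathbf{p}') \\
&= \delta_{\sigma,\sigma'}\, \delta(\mathbf{p} - \mathbf{p}')
\end{aligned} $$
confirms that our choice (8.21) is consistent with the normalization of
momentum eigenvectors (5.19).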
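The same arguments can now be applied to positrons (operators b_{p,σ}
and b†_{p,σ}), protons (d_{p,σ} and d†_{p,σ}), antiprotons (f_{p,σ} and f†_{p,σ}) and photons
(c_{p,τ} and c†_{p,τ}). So, finally, we obtain the full set of anticommutation and
commutation relations pertinent to QED
$$ \{a_{\mathbf{p},\sigma}, a^{\dagger}_{\mathbf{p}',\sigma'}\} = \{b_{\mathbf{p},\sigma}, b^{\dagger}_{\mathbf{p}',\sigma'}\} = \{d_{\mathbf{p},\sigma}, d^{\dagger}_{\mathbf{p}',\sigma'}\} = \{f_{\mathbf{p},\sigma}, f^{\dagger}_{\mathbf{p}',\sigma'}\} = \delta(\mathbf{p} - \mathbf{p}')\, \delta_{\sigma\sigma'} \tag{8.22} $$
$$ \begin{aligned}
\{a_{\mathbf{p},\sigma}, a_{\mathbf{p}',\sigma'}\} &= \{b_{\mathbf{p},\sigma}, b_{\mathbf{p}',\sigma'}\} = \{d_{\mathbf{p},\sigma}, d_{\mathbf{p}',\sigma'}\} = \{f_{\mathbf{p},\sigma}, f_{\mathbf{p}',\sigma'}\} \\
&= \{a^{\dagger}_{\mathbf{p},\sigma}, a^{\dagger}_{\mathbf{p}',\sigma'}\} = \{b^{\dagger}_{\mathbf{p},\sigma}, b^{\dagger}_{\mathbf{p}',\sigma'}\} = \{d^{\dagger}_{\mathbf{p},\sigma}, d^{\dagger}_{\mathbf{p}',\sigma'}\} \\
&= \{f^{\dagger}_{\mathbf{p},\sigma}, f^{\dagger}_{\mathbf{p}',\sigma'}\} = 0
\end{aligned} \tag{8.23} $$
$$ [c_{\mathbf{p},\tau}, c^{\dagger}_{\mathbf{p}',\tau'}] = \delta(\mathbf{p} - \mathbf{p}')\, \delta_{\tau\tau'} \tag{8.24} $$
$$ [c^{\dagger}_{\mathbf{p},\tau}, c^{\dagger}_{\mathbf{p}',\tau'}] = [c_{\mathbf{p},\tau}, c_{\mathbf{p}',\tau'}] = 0 \tag{8.25} $$
Commutators of operators related to different particles are always zero.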

In the continuous momentum limit, the analog of the particle counter
operator (8.20)
$$ \rho_{\mathbf{p},\tau} = c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} \tag{8.26} $$
can be interpreted as the density of photons with helicity τ at momentum p.
By summing over photon polarizations and integrating density (8.26) over
entire momentum space we can define an operator for the total number of
photons in the system
$$ N_{ph} = \sum_{\tau} \int d\mathbf{p}\, c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau} $$
We can also write down similar operator expressions for the numbers of other
particles. For example
$$ N_{el} = \sum_{\sigma} \int d\mathbf{p}\, a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p},\sigma} \tag{8.27} $$
is the electron number operator. Then we conclude that operator
$$ N = N_{el} + N_{po} + N_{pr} + N_{an} + N_{ph} \tag{8.28} $$
corresponds to the total number of all particles in the system.
8.1.8 Generators of the non-interacting representation

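Now we can fully appreciate the benefits of introducing annihilation and
creation operators. The expression for the non-interacting Hamiltonian H0 can
be simply obtained from the particle number operator (8.28) by multiplying
the integrands (particle densities in the momentum space) by energies of free
particles
$$ \begin{aligned}
H_0 &= \int d\mathbf{p}\, \omega_{\mathbf{p}} \sum_{\sigma = \pm 1/2} [a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p},\sigma} + b^{\dagger}_{\mathbf{p},\sigma} b_{\mathbf{p},\sigma}] + \int d\mathbf{p}\, \Omega_{\mathbf{p}} \sum_{\sigma = \pm 1/2} [d^{\dagger}_{\mathbf{p},\sigma} d_{\mathbf{p},\sigma} + f^{\dagger}_{\mathbf{p},\sigma} f_{\mathbf{p},\sigma}] \\
&\quad + c \int d\mathbf{p}\, p \sum_{\tau = \pm 1} c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau}
\end{aligned} \tag{8.29} $$
where ω_p = √(m²c⁴ + p²c²) denotes the energy of free electrons and
positrons, Ω_p = √(M²c⁴ + p²c²) is the energy of free protons and
antiprotons, and cp is the energy of free photons. One can easily verify that H0 in
(8.29) acts on states in the sector H(1, 0, 0, 0, 0) just as equation (8.12) and
it acts on states in the sector H(2, 0, 0, 0, 2) exactly as equation (8.13). So,
we have obtained a single expression which works equally well in all sectors
of the Fock space. Similar arguments demonstrate that operator
$$ \begin{aligned}
\mathbf{P}_0 &= \int d\mathbf{p}\, \mathbf{p} \sum_{\sigma = \pm 1/2} [a^{\dagger}_{\mathbf{p},\sigma} a_{\mathbf{p},\sigma} + b^{\dagger}_{\mathbf{p},\sigma} b_{\mathbf{p},\sigma}] + \int d\mathbf{p}\, \mathbf{p} \sum_{\sigma = \pm 1/2} [d^{\dagger}_{\mathbf{p},\sigma} d_{\mathbf{p},\sigma} + f^{\dagger}_{\mathbf{p},\sigma} f_{\mathbf{p},\sigma}] \\
&\quad + \int d\mathbf{p}\, \mathbf{p} \sum_{\tau = \pm 1} c^{\dagger}_{\mathbf{p},\tau} c_{\mathbf{p},\tau}
\end{aligned} \tag{8.30} $$
can be regarded as the total momentum operator in QED.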
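On a basis vector with definite momenta, H0 and P0 simply add up one-particle energies and momenta, whatever the sector. The sketch below illustrates this bookkeeping. It assumes units with c = 1 (energies in MeV, momenta in MeV/c), uses the masses quoted in subsection 8.1.1, and represents a basis vector by a plain list of `(species, momentum)` pairs; the function names and this representation are choices made here for illustration.

```python
from math import sqrt

# Masses in MeV/c^2 as quoted in subsection 8.1.1; photons are massless.
MASS = {
    "electron": 0.511,
    "positron": 0.511,
    "proton": 938.3,
    "antiproton": 938.3,
    "photon": 0.0,
}

def free_energy(particles):
    """Eigenvalue of H0 on a basis vector listed as [(species, (px, py, pz)), ...]."""
    total = 0.0
    for species, (px, py, pz) in particles:
        p_squared = px * px + py * py + pz * pz
        total += sqrt(MASS[species] ** 2 + p_squared)   # c = 1
    return total

def total_momentum(particles):
    """Eigenvalue of P0 on the same basis vector."""
    return tuple(sum(p[i] for _, p in particles) for i in range(3))

# One electron at rest: sector H(1,0,0,0,0), compare with (8.12).
print(free_energy([("electron", (0.0, 0.0, 0.0))]))

# Two photons and two electrons: sector H(2,0,0,0,2), compare with (8.13).
state = [
    ("photon", (3.0, 4.0, 0.0)),
    ("photon", (0.0, 0.0, 1.0)),
    ("electron", (0.0, 0.0, 0.0)),
    ("electron", (0.0, 0.0, 0.0)),
]
print(free_energy(state), total_momentum(state))
```

The same two functions work for any particle content, which mirrors the point that (8.29) and (8.30) are single expressions valid in all sectors.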

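Expressions for the generators J0 and K0 are more complicated as they
involve derivatives of particle operators. Let us illustrate their derivation on
an example of a massive spinless particle. Consider the action of a space
rotation e^{−(i/ħ)J0z φ} on the one-particle state |p⟩⁶
$$ e^{-\frac{i}{\hbar} J_{0z} \phi} |\mathbf{p}\rangle = |p_x \cos\phi + p_y \sin\phi,\ p_y \cos\phi - p_x \sin\phi,\ p_z\rangle $$
This action can be represented as annihilation of the state |p⟩ = |px, py, pz⟩
followed by creation of the state |px cos φ + py sin φ, py cos φ − px sin φ, pz⟩,
i.e., if α†_p and α_p are, respectively, creation and annihilation operators for
the particle, then
$$ \begin{aligned}
e^{-\frac{i}{\hbar} J_{0z} \phi} |p_x, p_y, p_z\rangle &= \alpha^{\dagger}_{p_x \cos\phi + p_y \sin\phi,\ p_y \cos\phi - p_x \sin\phi,\ p_z}\, \alpha_{p_x, p_y, p_z} |p_x, p_y, p_z\rangle \\
&= \alpha^{\dagger}_{R_z(\phi)\mathbf{p}}\, \alpha_{\mathbf{p}} |p_x, p_y, p_z\rangle
\end{aligned} $$
Therefore, for an arbitrary 1-particle state, the operator of finite rotation takes
the form
$$ e^{-\frac{i}{\hbar} J_{0z} \phi} = \int d\mathbf{p}\, \alpha^{\dagger}_{R_z(\phi)\mathbf{p}}\, \alpha_{\mathbf{p}} \tag{8.31} $$
It is easy to show that the same form is valid everywhere on the Fock space.
An explicit expression for the generator J0z can be obtained now by taking
a derivative of (8.31) with respect to φ
$$ \begin{aligned}
J_{0z} &= i\hbar \lim_{\phi \to 0} \frac{d}{d\phi} e^{-\frac{i}{\hbar} J_{0z} \phi} = i\hbar \lim_{\phi \to 0} \frac{d}{d\phi} \int d\mathbf{p}\, \alpha^{\dagger}_{R_z(\phi)\mathbf{p}}\, \alpha_{\mathbf{p}} \\
&= i\hbar \int d\mathbf{p} \left( p_y \frac{\partial \alpha^{\dagger}_{\mathbf{p}}}{\partial p_x} - p_x \frac{\partial \alpha^{\dagger}_{\mathbf{p}}}{\partial p_y} \right) \alpha_{\mathbf{p}}
\end{aligned} \tag{8.32} $$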
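The action of a boost along the z-axis is obtained from (5.28)
$$ e^{-\frac{ic}{\hbar} K_{0z} \theta} |\mathbf{p}\rangle = \sqrt{\frac{\omega_{\mathbf{p}} \cosh\theta + c p_z \sinh\theta}{\omega_{\mathbf{p}}}}\ |p_x,\ p_y,\ p_z \cosh\theta + c^{-1} \omega_{\mathbf{p}} \sinh\theta\rangle \tag{8.33} $$
This transformation can be represented as annihilation of the state |p⟩ =
|px, py, pz⟩ followed by creation of the state (8.33)
$$ e^{-\frac{ic}{\hbar} K_{0z} \theta} |\mathbf{p}\rangle = \sqrt{\frac{\omega_{\mathbf{p}} \cosh\theta + c p_z \sinh\theta}{\omega_{\mathbf{p}}}}\ \alpha^{\dagger}_{p_x,\ p_y,\ p_z \cosh\theta + c^{-1} \omega_{\mathbf{p}} \sinh\theta}\, \alpha_{p_x, p_y, p_z} |p_x, p_y, p_z\rangle $$
Therefore, for an arbitrary 1-particle state in the Fock space, the operator of a
finite boost takes the form
$$ e^{-\frac{ic}{\hbar} K_{0z} \theta} = \int d\mathbf{p}\, \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\ \alpha^{\dagger}_{\Lambda\mathbf{p}}\, \alpha_{\mathbf{p}} \tag{8.34} $$
An explicit expression for the generator K0z can be now obtained by taking
a derivative of (8.34) with respect to θ
$$ \begin{aligned}
K_{0z} &= \frac{i\hbar}{c} \lim_{\theta \to 0} \frac{d}{d\theta} e^{-\frac{ic}{\hbar} K_{0z} \theta} \\
&= \frac{i\hbar}{c} \lim_{\theta \to 0} \frac{d}{d\theta} \int d\mathbf{p}\, \sqrt{\frac{\omega_{\mathbf{p}} \cosh\theta + c p_z \sinh\theta}{\omega_{\mathbf{p}}}}\ \alpha^{\dagger}_{p_x,\ p_y,\ p_z \cosh\theta + c^{-1} \omega_{\mathbf{p}} \sinh\theta}\, \alpha_{\mathbf{p}} \\
&= i\hbar \int d\mathbf{p} \left( \frac{p_z}{2\omega_{\mathbf{p}}}\, \alpha^{\dagger}_{\mathbf{p}} \alpha_{\mathbf{p}} + \frac{\omega_{\mathbf{p}}}{c^2} \frac{\partial \alpha^{\dagger}_{\mathbf{p}}}{\partial p_z}\, \alpha_{\mathbf{p}} \right)
\end{aligned} \tag{8.35} $$
⁶ See equation (5.28).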
Similar derivations can be done for other components of J0 and K0 .
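The two coefficients appearing in (8.35) come from differentiating the normalization factor and the boosted momentum in (8.33) at θ = 0. A short numerical check of these derivatives, and of the fact that the normalization factor equals √(ω_Λp/ω_p), is sketched below. The numerical values of the mass and momentum are arbitrary test inputs, and the finite-difference helper is an illustrative device, not part of the derivation.

```python
from math import sqrt, cosh, sinh

def boosted_pz(pz, omega, theta, c):
    """z-component of the boosted momentum in (8.33)."""
    return pz * cosh(theta) + (omega / c) * sinh(theta)

def norm_factor(pz, omega, theta, c):
    """Normalization factor in (8.33)."""
    return sqrt((omega * cosh(theta) + c * pz * sinh(theta)) / omega)

def derivative_at_zero(f, h=1e-6):
    """Central finite-difference estimate of f'(0)."""
    return (f(h) - f(-h)) / (2 * h)

# Arbitrary test values (any consistent system of units).
m, c = 1.0, 1.0
px, py, pz = 0.3, -0.2, 0.7
omega = sqrt(m**2 * c**4 + (px**2 + py**2 + pz**2) * c**2)

d_norm = derivative_at_zero(lambda th: norm_factor(pz, omega, th, c))
d_pz = derivative_at_zero(lambda th: boosted_pz(pz, omega, th, c))

print(d_norm, c * pz / (2 * omega))   # (i*hbar/c) * d_norm gives the pz/(2*omega) term
print(d_pz, omega / c)                # (i*hbar/c) * d_pz gives the omega/c^2 term

# Consistency of (8.33) with (8.34): the boosted energy is omega*cosh + c*pz*sinh.
theta = 0.8
pz_new = boosted_pz(pz, omega, theta, c)
omega_new = sqrt(m**2 * c**4 + (px**2 + py**2 + pz_new**2) * c**2)
print(omega_new, omega * cosh(theta) + c * pz * sinh(theta))
```

Multiplying both derivatives by the prefactor iħ/c reproduces the coefficients p_z/(2ω_p) and ω_p/c² in (8.35).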
8.1.9 Poincaré transformations of particle operators

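From transformations (5.28) of 1-particle state vectors with respect to the
non-interacting representation
$$ U_0(\Lambda; \mathbf{r}, t) \equiv e^{-\frac{i}{\hbar} \mathbf{J}_0 \cdot \vec{\phi}}\, e^{-\frac{ic}{\hbar} \mathbf{K}_0 \cdot \vec{\theta}}\, e^{-\frac{i}{\hbar} \mathbf{P}_0 \cdot \mathbf{r}}\, e^{\frac{i}{\hbar} H_0 t} $$
in the Fock space, we can find corresponding Poincaré transformations of
creation-annihilation operators. For electron creation operators we obtain⁷
$$ \begin{aligned}
U_0(\Lambda; \mathbf{r}, t)\, a^{\dagger}_{\mathbf{p},\sigma}\, U_0^{-1}(\Lambda; \mathbf{r}, t) |0\rangle &= U_0(\Lambda; \mathbf{r}, t)\, a^{\dagger}_{\mathbf{p},\sigma} |0\rangle = U_0(\Lambda; \mathbf{r}, t) |\mathbf{p}, \sigma\rangle \\
&= \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\ e^{-\frac{i}{\hbar} \mathbf{p} \cdot \mathbf{r} + \frac{i}{\hbar} \omega_{\mathbf{p}} t} \sum_{\sigma'} D^{1/2}_{\sigma'\sigma}(\vec{\phi}_W(\mathbf{p}, \Lambda))\, |\Lambda\mathbf{p}, \sigma'\rangle
\end{aligned} $$
Therefore⁸
$$ \begin{aligned}
U_0(\Lambda; \mathbf{r}, t)\, a^{\dagger}_{\mathbf{p},\sigma}\, U_0^{-1}(\Lambda; \mathbf{r}, t) &= \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\ e^{-\frac{i}{\hbar} \mathbf{p} \cdot \mathbf{r} + \frac{i}{\hbar} \omega_{\mathbf{p}} t} \sum_{\sigma'} D^{1/2}_{\sigma'\sigma}(\vec{\phi}_W(\mathbf{p}, \Lambda))\, a^{\dagger}_{\Lambda\mathbf{p},\sigma'} \\
&= \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\ e^{-\frac{i}{\hbar} \mathbf{p} \cdot \mathbf{r} + \frac{i}{\hbar} \omega_{\mathbf{p}} t} \sum_{\sigma'} (D^{1/2})^{*}_{\sigma\sigma'}(-\vec{\phi}_W(\mathbf{p}, \Lambda))\, a^{\dagger}_{\Lambda\mathbf{p},\sigma'}
\end{aligned} \tag{8.36} $$
Similarly, we obtain the transformation law for annihilation operators
$$ U_0(\Lambda; \mathbf{r}, t)\, a_{\mathbf{p},\sigma}\, U_0^{-1}(\Lambda; \mathbf{r}, t) = \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\ e^{\frac{i}{\hbar} \mathbf{p} \cdot \mathbf{r} - \frac{i}{\hbar} \omega_{\mathbf{p}} t} \sum_{\sigma'} D^{1/2}_{\sigma\sigma'}(-\vec{\phi}_W(\mathbf{p}, \Lambda))\, a_{\Lambda\mathbf{p},\sigma'} \tag{8.37} $$
Transformation laws for photon operators are obtained from equation
(5.65)
$$ U_0(\Lambda; \mathbf{r}, t)\, c^{\dagger}_{\mathbf{p},\tau}\, U_0^{-1}(\Lambda; \mathbf{r}, t) = \sqrt{\frac{|\Lambda\mathbf{p}|}{p}}\ e^{-\frac{i}{\hbar} (\mathbf{p} \cdot \mathbf{r}) + \frac{ic}{\hbar} p t}\, e^{i\tau\phi_W(\mathbf{p}, \Lambda)}\, c^{\dagger}_{\Lambda\mathbf{p},\tau} \tag{8.38} $$
$$ U_0(\Lambda; \mathbf{r}, t)\, c_{\mathbf{p},\tau}\, U_0^{-1}(\Lambda; \mathbf{r}, t) = \sqrt{\frac{|\Lambda\mathbf{p}|}{p}}\ e^{\frac{i}{\hbar} (\mathbf{p} \cdot \mathbf{r}) - \frac{ic}{\hbar} p t}\, e^{-i\tau\phi_W(\mathbf{p}, \Lambda)}\, c_{\Lambda\mathbf{p},\tau} \tag{8.39} $$
⁷ Here we took into account that the vacuum vector is invariant with respect to U0.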
8.2 Interaction potentials

Our primary goal in the rest of this first part of the book is to learn how to
calculate the S-operator in QED, which is the quantity most readily
comparable with experiment. Equations in subsection 7.1.2 tell us that in order
to do that we need to know the non-interacting part H0 and the interacting
part V of the full Hamiltonian
$$ H = H_0 + V $$
⁸ Here ∗ and † denote complex conjugation and Hermitian conjugation, respectively. We also use the property $D^{T}(\vec{\phi}) = (D^{\dagger}(\vec{\phi}))^{*} = (D^{-1}(\vec{\phi}))^{*} = D^{*}(-\vec{\phi})$ which is valid for the unitary representation $D(\vec{\phi})$ of the rotation group.
The non-interacting Hamiltonian H0 has been constructed in equation (8.29).
The interaction energy V (and the corresponding interaction boost Z) in
QED will be explicitly written only in section 9.1. Until then, we are going
to study rather general properties of interactions and S-operators in the Fock
space. We will try to use some physical principles to narrow down the allowed
form of the operator V .
Note that in our approach we assume that interaction does not have any
effect on the structure of the Fock space. All properties of this space defined
in the non-interacting case remain valid in the presence of interaction: the
inner product, the orthogonality of n-particle sectors, the existence of particle
number operators, etc. In this respect our theory is different from axiomatic
or constructive quantum field theories, in which the Hilbert space has a non-
Fock structure, which depends on interactions.
8.2.1 Conservation laws

From experiment we know that the interaction V between charged particles obeys
several important restrictions called conservation laws. An observable
F is called conserved if it remains unchanged in the course of time evolution
$$ F(t) \equiv e^{\frac{i}{\hbar} H t}\, F(0)\, e^{-\frac{i}{\hbar} H t} = F(0) $$
It then follows that conserved observables commute with the Hamiltonian
[F, H] = [F, H0 + V] = 0, which imposes some restrictions on the interaction
operator V. For example, the conservation of the total momentum and the
total angular momentum implies that
$$ [V, \mathbf{P}_0] = 0 \tag{8.40} $$
$$ [V, \mathbf{J}_0] = 0 \tag{8.41} $$
These commutators are automatically satisfied in the instant form of dynam-
ics (6.22) adopted in our study. It is also well-established that all interactions
conserve the lepton number (the number of electrons minus the number of
positrons, in our case). Therefore, H = H0 + V must commute with the
lepton number operator
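
$$L = N_{el} - N_{po} = \sum_{\sigma}\int d\mathbf{p}\,\left(a^\dagger_{\mathbf{p},\sigma}a_{\mathbf{p},\sigma} - b^\dagger_{\mathbf{p},\sigma}b_{\mathbf{p},\sigma}\right) \tag{8.42}$$
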
Since H0 already commutes with L, we obtain
[V, L] = 0 (8.43)
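Moreover, all known interactions conserve the baryon number (the number
of protons minus the number of antiprotons, in our case). So, V must also
commute with the baryon number operator

$$B = N_{pr} - N_{an} = \sum_{\sigma}\int d\mathbf{p}\,\left(d^\dagger_{\mathbf{p},\sigma}d_{\mathbf{p},\sigma} - f^\dagger_{\mathbf{p},\sigma}f_{\mathbf{p},\sigma}\right) \tag{8.44}$$

$$[V, B] = 0 \tag{8.45}$$
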
Taking into account that electrons have charge −e, protons have charge e and
antiparticles have charges opposite to those of particles, we can introduce the
electric charge operator
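
$$Q = e(B - L) = e\sum_{\sigma}\int d\mathbf{p}\,\left(b^\dagger_{\mathbf{p},\sigma}b_{\mathbf{p},\sigma} - a^\dagger_{\mathbf{p},\sigma}a_{\mathbf{p},\sigma} + d^\dagger_{\mathbf{p},\sigma}d_{\mathbf{p},\sigma} - f^\dagger_{\mathbf{p},\sigma}f_{\mathbf{p},\sigma}\right) \tag{8.46}$$
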
and obtain the charge conservation law
[H, Q] = [V, Q] = e[V, B − L] = 0 (8.47)
from equations (8.43) and (8.45).

As we saw above, both operators H0 and V in QED commute with P0 ,
J0 , L, B, and Q. Then from subsection 7.1.2 it follows that scattering opera-
tors F , Σ, and S also commute with P0 , J0 , L, B, and Q, which means that
the corresponding observables (total momentum, total angular momentum,
lepton number, baryon number, and electric charge, respectively) are conserved
in scattering events. Although the numbers of particles of individual
species (e.g., electrons or protons) may not be conserved separately, the above
conservation laws require that charged particles be created or annihilated
only together with their antiparticles, i.e., in pairs. Creation of pairs is
suppressed in low-energy reactions, as such processes require an additional energy
of $2m_{el}c^2 = 2 \times 0.51$ MeV $= 1.02$ MeV for an electron-positron pair and
$2m_{pr}c^2 \approx 1876.5$ MeV for a proton-antiproton pair. These high-energy
processes can be safely neglected in classical electrodynamics. However, even in
this low-energy theory one cannot disregard the emission of photons. Since
photons have zero mass, the energetic threshold for the photon emission is
zero. Moreover, photons have zero electric charge, lepton, and baryon num-
bers, so there are no conservation laws that would restrict their creation and
annihilation. Photons can be created and destroyed in any quantities.
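
The bookkeeping behind (8.42) - (8.47) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the dictionary `QUANTUM_NUMBERS` and the helper names `charge`, `net_change`, and `conserves_all` are hypothetical, and an operator monomial is represented simply by the lists of species it creates and annihilates.

```python
# Lepton number L and baryon number B carried by one particle of each
# species relevant for QED, as read off from (8.42) and (8.44).
QUANTUM_NUMBERS = {
    "electron":   {"L": 1,  "B": 0},
    "positron":   {"L": -1, "B": 0},
    "proton":     {"L": 0,  "B": 1},
    "antiproton": {"L": 0,  "B": -1},
    "photon":     {"L": 0,  "B": 0},
}


def charge(species, e=1):
    """Electric charge of one particle, using Q = e(B - L) from (8.46)."""
    qn = QUANTUM_NUMBERS[species]
    return e * (qn["B"] - qn["L"])


def net_change(created, annihilated):
    """Change of (L, B, Q/e) caused by a monomial that creates the species
    listed in `created` and annihilates those listed in `annihilated`."""
    def total(key, names):
        return sum(QUANTUM_NUMBERS[name][key] for name in names)

    d_lepton = total("L", created) - total("L", annihilated)
    d_baryon = total("B", created) - total("B", annihilated)
    return d_lepton, d_baryon, d_baryon - d_lepton


def conserves_all(created, annihilated):
    """True if the monomial changes neither L, nor B, nor Q."""
    return net_change(created, annihilated) == (0, 0, 0)
```

For example, `conserves_all(["electron", "positron", "photon"], [])` is true (a pair is created together with a photon), while `conserves_all(["electron"], ["positron"])` is false, because replacing a positron by an electron changes both the lepton number and the charge.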
8.2.2 Normal ordering

In the next subsection we are going to express operators in the Fock space
as polynomials in particle creation and annihilation operators. But first we
need to overcome one notational problem related to the non-commutativity
of particle operators: two different polynomials may, actually, represent the
same operator. To have a unique polynomial representative for each operator,
we will agree always to write products of operators in the normal order, i.e.,
creation operators to the left from annihilation operators. Among creation
(annihilation) operators we will enforce a certain order based on particle
species: We will write particle operators in the order proton - antiproton -
electron - positron - photon from left to right. With these rules and with
(anti)commutation relations (8.22) - (8.25) one can always convert a product
of particle operators to the normally ordered form. This is illustrated by the
following example
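
$$
\begin{aligned}
a_{\mathbf{p}',\sigma'} c_{\mathbf{q}',\tau'} a^\dagger_{\mathbf{p},\sigma} c^\dagger_{\mathbf{q},\tau}
&= a_{\mathbf{p}',\sigma'} a^\dagger_{\mathbf{p},\sigma} c_{\mathbf{q}',\tau'} c^\dagger_{\mathbf{q},\tau} \\
&= \left(-a^\dagger_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'} + \delta(\mathbf{p}-\mathbf{p}')\delta_{\sigma,\sigma'}\right)\left(c^\dagger_{\mathbf{q},\tau} c_{\mathbf{q}',\tau'} + \delta(\mathbf{q}-\mathbf{q}')\delta_{\tau,\tau'}\right) \\
&= -a^\dagger_{\mathbf{p},\sigma} c^\dagger_{\mathbf{q},\tau} a_{\mathbf{p}',\sigma'} c_{\mathbf{q}',\tau'} - a^\dagger_{\mathbf{p},\sigma} a_{\mathbf{p}',\sigma'}\delta(\mathbf{q}-\mathbf{q}')\delta_{\tau,\tau'} \\
&\quad + c^\dagger_{\mathbf{q},\tau} c_{\mathbf{q}',\tau'}\delta(\mathbf{p}-\mathbf{p}')\delta_{\sigma,\sigma'} + \delta(\mathbf{p}-\mathbf{p}')\delta_{\sigma,\sigma'}\delta(\mathbf{q}-\mathbf{q}')\delta_{\tau,\tau'}
\end{aligned}
$$
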
where the right hand side is in the normal order.
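
A normal-ordering identity of this kind can be checked numerically in a reduced setting. The sketch below is an assumption-laden illustration, not part of the formalism: it keeps a single electron mode and a single photon mode, so that all delta functions are replaced by 1, and it uses the electron anticommutator a a† = 1 − a†a together with the photon commutator c c† = 1 + c†c. In this setting the product a c a† c† should equal −a†c†ac − a†a + c†c + 1. All function names are hypothetical.

```python
import math

# A basis state is a pair (n_f, n_b): occupation of one electron mode
# (0 or 1) and of one photon mode (0, 1, 2, ...). A general state is a
# dict mapping basis states to amplitudes.


def a_dag(state):
    """Electron creation operator; gives zero on an occupied mode."""
    return {(1, nb): amp for (nf, nb), amp in state.items() if nf == 0}


def a(state):
    """Electron annihilation operator."""
    return {(0, nb): amp for (nf, nb), amp in state.items() if nf == 1}


def c_dag(state):
    """Photon creation operator."""
    return {(nf, nb + 1): amp * math.sqrt(nb + 1)
            for (nf, nb), amp in state.items()}


def c(state):
    """Photon annihilation operator."""
    return {(nf, nb - 1): amp * math.sqrt(nb)
            for (nf, nb), amp in state.items() if nb > 0}


def apply(ops, state):
    """Apply a product of operators; the rightmost operator acts first."""
    for op in reversed(ops):
        state = op(state)
    return state


def add(*terms):
    """Linear combination of states given as (coefficient, state) pairs."""
    out = {}
    for coeff, st in terms:
        for key, amp in st.items():
            out[key] = out.get(key, 0.0) + coeff * amp
    return out


def lhs(state):
    """The product a c a† c† before normal ordering."""
    return apply([a, c, a_dag, c_dag], state)


def rhs(state):
    """The normally ordered form -a†c†ac - a†a + c†c + 1."""
    return add(
        (-1.0, apply([a_dag, c_dag, a, c], state)),
        (-1.0, apply([a_dag, a], state)),
        (+1.0, apply([c_dag, c], state)),
        (+1.0, state),
    )


def same(s1, s2, tol=1e-12):
    """Compare two states, treating missing entries as zero."""
    keys = set(s1) | set(s2)
    return all(abs(s1.get(k, 0.0) - s2.get(k, 0.0)) < tol for k in keys)
```

Calling `same(lhs(s), rhs(s))` for basis states such as `s = {(0, 3): 1.0}` or `s = {(1, 2): 1.0}` compares the two sides state by state.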
8.2.3 General form of interaction operators

A well-known theorem (see [Wei95], p. 175) states that in the Fock space
any operator V satisfying conservation laws (8.40) - (8.41)
[V, P0 ] = [V, J0 ] = 0 (8.48)
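can be written as a polynomial in particle creation and annihilation operators9

$$V = \sum_{N=0}^{\infty}\sum_{M=0}^{\infty} V_{NM} \tag{8.49}$$

$$
\begin{aligned}
V_{NM} = \sum_{\{\eta,\eta'\}}\int & d\mathbf{q}'_1 \ldots d\mathbf{q}'_N\, d\mathbf{q}_1 \ldots d\mathbf{q}_M\, D_{NM}(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M) \\
& \times \delta\left(\sum_{i=1}^{N}\mathbf{q}'_i - \sum_{j=1}^{M}\mathbf{q}_j\right) \alpha^\dagger_{\mathbf{q}'_1,\eta'_1} \ldots \alpha^\dagger_{\mathbf{q}'_N,\eta'_N}\, \alpha_{\mathbf{q}_1,\eta_1} \ldots \alpha_{\mathbf{q}_M,\eta_M}
\end{aligned} \tag{8.50}
$$
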
where the summation is carried over all spin/helicity indices η, η ′ of creation

and annihilation operators, and the integration is carried over all particle
momenta. Individual terms VN M in the expansion (8.49) of the interaction
Hamiltonian will be called potentials. Each potential is a normally ordered
product of N creation operators α† and M annihilation operators α. The
pair of integers (N, M) will be referred to as the index of the potential VN M .
A potential is called bosonic if it has an even number of fermion particle
operators Nf + Mf . Conservation laws (8.43), (8.45) and (8.47)
[V, L] = [V, B] = [V, Q] = 0 (8.51)
imply that all potentials in QED must be bosonic.

9
Here symbols α† and α refer to generic creation and annihilation operators without
specifying the type of the particle. Although this form does not involve derivatives of
particle operators, it still can be used to represent operators like (8.32) and (8.35) if
derivatives are approximated by finite differences.
$D_{NM}$ is a numerical coefficient function which depends on momenta and
spin projections (or helicities) of all created and annihilated particles. In
order to satisfy [V, J0 ] = 0, the function $D_{NM}$ must be rotationally invariant.
Translational invariance of (8.50) is guaranteed by the momentum delta
function

$$\delta\left(\sum_{i=1}^{N}\mathbf{q}'_i - \sum_{j=1}^{M}\mathbf{q}_j\right)$$

which expresses the conservation of momentum: the sum of momenta of
annihilated particles is equal to the sum of momenta of created particles.
The interaction Hamiltonian V enters in formulas (7.15), (7.17), and
(7.19) for the S-operator in the t-dependent form

$$V(t) = e^{\frac{i}{\hbar}H_0 t}\, V\, e^{-\frac{i}{\hbar}H_0 t} \tag{8.52}$$

Operators with t-dependence determined by the free Hamiltonian H0 , as in
equation (8.52), and satisfying conservation laws (8.48), (8.51) will be called
regular. Such operators will play an important role in our calculations of
the S-operator below. In what follows, when we write a regular operator V
without its t-argument, this means that either this operator is t-independent,
i.e., it commutes with H0 , or that we take its value at t = 0.
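One final notational remark. If a potential $V_{NM}$ has coefficient function
$D_{NM}$, we introduce the notation $V_{NM} \circ \zeta$ for the operator whose coefficient
function $D'_{NM}$ is the product of $D_{NM}$ and a numerical function $\zeta$ of the same
arguments

$$
\begin{aligned}
& D'_{NM}(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M) \\
&\quad = D_{NM}(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M)\, \zeta(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M)
\end{aligned}
$$
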
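Then, substituting (8.50) in (8.52) and using (8.36) - (8.39), the t-dependent
form of any regular potential $V_{NM}(t)$ can be written as

$$V_{NM}(t) = e^{\frac{i}{\hbar}H_0 t}\, V_{NM}\, e^{-\frac{i}{\hbar}H_0 t} = V_{NM} \circ e^{\frac{i}{\hbar}E_{NM}t}$$

where

$$E_{NM}(\mathbf{q}'_1, \ldots, \mathbf{q}'_N, \mathbf{q}_1, \ldots, \mathbf{q}_M) \equiv \sum_{i=1}^{N}\omega_{\mathbf{q}'_i} - \sum_{j=1}^{M}\omega_{\mathbf{q}_j} \tag{8.53}$$
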
is the difference of energies of particles created and destroyed by $V_{NM}$, which
is called the energy function of the term $V_{NM}$. We can also extend this
notation to a general sum of potentials $V_{NM}$

$$V(t) = e^{\frac{i}{\hbar}H_0 t}\, V\, e^{-\frac{i}{\hbar}H_0 t} = V \circ e^{\frac{i}{\hbar}E_V t}$$

which means that for each potential $V_{NM}(t)$ in the sum V (t), the argument
of the t-exponent contains the corresponding energy function $E_{NM}$. In this
notation we can conveniently write

$$\frac{d}{dt}V(t) = V(t) \circ \left(\frac{i}{\hbar}E_V\right)$$

$$\underbrace{V(t)} \equiv -\frac{i}{\hbar}\int_{-\infty}^{\infty} V(t)\,dt = -2\pi i\, V \circ \delta(E_V) \tag{8.54}$$

Equation (8.54) means that each term in $\underbrace{V(t)}$ is non-zero only on the
hypersurface of solutions of the equation

$$E_{NM}(\mathbf{q}'_1, \ldots, \mathbf{q}'_N, \mathbf{q}_1, \ldots, \mathbf{q}_M) = 0 \tag{8.55}$$

(if such solutions exist). This hypersurface in the 3(N + M)-dimensional
momentum space is called the energy shell of the potential V . We will also
say that $\underbrace{V(t)}$ in equation (8.54) is zero outside the energy shell of V . Note
that the scattering operator (7.14), $S = 1 + \underbrace{\Sigma(t)}$, is different from 1 only on the
energy shell, i.e., where the energy conservation condition (8.55) is satisfied.
8.2.4 Five types of regular potentials

Here we introduce a classification of regular potentials (8.50) by
dividing them into five groups depending on their index (N, M). We will call
these types of operators renorm, oscillation, decay, phys, and unphys.10 The
rationale for introducing this classification and nomenclature will become
clear in chapters 10 and 11, where we will examine renormalization and the
“dressed particle” approach in quantum field theory.

Figure 8.1: Positions of different operator types in the “index space” (N, M).
N and M are numbers of creation and annihilation operators, respectively.
R = “renorm,” O = “oscillation.”

Renorm potentials have either index (0,0) (such an operator is simply a
numerical constant) or index (1,1), in which case both created and annihilated
particles are required to have the same mass.11 In QED the most general form
of a renorm potential obeying conservation laws is the sum of a numerical
constant C and (1,1) terms corresponding to each particle type12

$$R = a^\dagger a + b^\dagger b + d^\dagger d + f^\dagger f + c^\dagger c + C \tag{8.56}$$

10
The correlation between a potential’s index (N, M ) and its type is shown in Fig. 8.1.
There is no established terminology for the types of potentials. In the literature, our phys
operators are sometimes called good; unphys operators may be called bad or virtual.
11
Actually, in most cases the same particle type is created and annihilated, so renorm
operators have the form α† α.
12
Here we write just the operator structure of R, omitting all numerical factors, indices,
integration and summation signs. Note also that terms like a† b or d† f are forbidden by
the charge conservation law (8.47).
Renorm potentials are characterized by the property that the energy function
(8.53) is identically zero. So, renorm potentials always have energy shell
where they do not vanish. Renorm potentials commute with H0 , therefore
regular renorm operators13 do not depend on t. The free Hamiltonian (8.29)
and the total momentum (8.30) are examples of renorm operators, i.e., sums
of renorm potentials.
Oscillation potentials have index (1, 1). In contrast to renorm
potentials with index (1,1), oscillation potentials destroy and create different
particle species having different masses. For this reason, the energy function
(8.53) of an oscillation potential never vanishes, so there is no energy
shell. In QED there can be no oscillation potentials, because they would
violate either lepton number or baryon number conservation law. However,
there are particles in nature, such as kaons and neutrinos, for which oscil-
lation interactions play a significant role. These interactions are responsible
for time-dependent oscillations between different particle species [GL].
Decay potentials satisfy two conditions:
1. they have indices (1, N) or (N, 1) with N ≥ 2;
2. they have a non-empty energy shell on which the coefficient function
does not vanish.
These potentials describe decay processes 1 → N,14 in which one particle
decays into N decay products, so that the energy conservation condition is
satisfied. There are no decay terms in the QED Hamiltonian and in the cor-
responding S-matrix: decays of electrons, protons, or photons would violate
conservation laws.15 Nevertheless, particle decays play important roles in
other areas of high energy physics, and they will be considered in chapter 13.
13
whose t-dependence is determined by (8.52)
14
as well as reverse processes N → 1
15
Exceptions to this rule are given by operators describing the decay of a photon into an odd
number of photons, e.g., $c^\dagger_{\mathbf{k}_1,\tau_1} c^\dagger_{\mathbf{k}_2,\tau_2} c^\dagger_{\mathbf{k}_3,\tau_3} c_{\mathbf{k}_1+\mathbf{k}_2+\mathbf{k}_3,\tau_4}$. This potential obeys all conservation
laws if the momenta of the involved photons are collinear and $k_1 + k_2 + k_3 - |\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3| = 0$.
However, it was shown in [FM96] that such contributions to the S-operator are zero on
the energy shell, so photon decays are forbidden in QED.
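Phys potentials have at least two creation operators and at least two
destruction operators (index (N, M) with N ≥ 2 and M ≥ 2). For phys
potentials the energy shell always exists. For example, in the case of the phys
potential $d^\dagger_{\mathbf{p}+\mathbf{k},\rho} f^\dagger_{\mathbf{q}-\mathbf{k},\sigma} a_{\mathbf{p},\tau} b_{\mathbf{q},\eta}$ the energy shell is determined by the solutions
of the equation $\Omega_{\mathbf{p}+\mathbf{k}} + \Omega_{\mathbf{q}-\mathbf{k}} = \omega_{\mathbf{p}} + \omega_{\mathbf{q}}$, whose solution set is not empty.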
Table 8.1: Types of potentials in the Fock space.

Potential     Index (N, M)              Energy shell exists?   Example
Renorm        (0, 0), (1, 1)            yes                    a†_p a_p
Oscillation   (1, 1)                    no                     forbidden in QED
Unphys        (0, N ≥ 1), (N ≥ 1, 0)    no                     a†_p b†_{−p−k} c†_k
Unphys        (1, N ≥ 2), (N ≥ 2, 1)    no                     a†_p a_{p−k} c_k
Decay         (1, N ≥ 2), (N ≥ 2, 1)    yes                    forbidden in QED
Phys          (N ≥ 2, M ≥ 2)            yes                    d†_{q+k} a†_{p−k} d_q a_p

All regular operators not mentioned above belong to the class of
Unphys potentials. They come in two subclasses with the following indices:
1. (0, N) or (N, 0), where N ≥ 1. Obviously, there is no energy shell in
this case.
2. (1, M) or (M, 1), where M ≥ 2. This is the same condition as for
decay potentials; however, in contrast to decay potentials, for unphys
potentials it is required that either the energy shell does not exist or
the coefficient function vanishes on the energy shell.
An example of an unphys potential satisfying condition 2 is

$$a^\dagger_{\mathbf{p}-\mathbf{k},\sigma}\, c^\dagger_{\mathbf{k},\tau}\, a_{\mathbf{p},\rho} \tag{8.57}$$

Its energy shell equation is $\omega_{\mathbf{p}-\mathbf{k}} + ck = \omega_{\mathbf{p}}$, whose only solution is k = 0.
However, the zero vector is excluded from the photon momentum spectrum,16
so the energy shell of the potential (8.57) is empty. This means that a
free electron cannot decay into an electron+photon pair without violating
energy-momentum conservation laws.
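
The emptiness of this energy shell can be probed numerically. The following Python sketch evaluates the energy function (8.53) of the potential (8.57) for randomly chosen momenta; it assumes the free-electron energy ω_p = sqrt(m²c⁴ + p²c²), units with m = c = 1, and hypothetical helper names. It is a spot check on sampled points, not a proof.

```python
import math
import random


def omega(p, m=1.0, c=1.0):
    """Free electron energy sqrt(m^2 c^4 + p^2 c^2) for a 3-vector p."""
    p2 = sum(x * x for x in p)
    return math.sqrt(m ** 2 * c ** 4 + p2 * c ** 2)


def energy_function(p, k, m=1.0, c=1.0):
    """Energy function of a†_{p-k} c†_k a_p: energy of the created electron
    and photon minus energy of the annihilated electron."""
    p_minus_k = tuple(pi - ki for pi, ki in zip(p, k))
    k_abs = math.sqrt(sum(x * x for x in k))
    return omega(p_minus_k, m, c) + c * k_abs - omega(p, m, c)


def min_energy_function(samples=2000, seed=0):
    """Smallest value of the energy function over random momenta p and k."""
    rng = random.Random(seed)
    values = []
    for _ in range(samples):
        p = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        k = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        values.append(energy_function(p, k))
    return min(values)
```

In this sketch the energy function vanishes only at k = 0 and stays positive at every sampled non-zero photon momentum, in line with the statement that the energy shell of (8.57) is empty.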
16
Properties of potentials discussed above are summarized in Table 8.1.

These five types of potentials exhaust all possibilities; therefore any regular
operator V must have a unique decomposition

$$V = V^{ren} + V^{unp} + V^{dec} + V^{ph} + V^{osc}$$

As mentioned above, oscillation and decay contributions are absent from the
QED interaction. So, everywhere in this book17 we will assume that the most
general interaction operator is a sum of renorm, unphys, and phys potentials

$$V = V^{ren} + V^{unp} + V^{ph}$$

Now we need to learn how to perform various operations with these three
classes of potentials, i.e., how to evaluate products, commutators, and t-
integrals required for calculations of scattering operators in (7.14) - (7.19).
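
The classification of this subsection can be condensed into a small decision function. This is a minimal sketch with hypothetical names: the index (N, M) alone does not fix the type for (1, 1) and for (1, N ≥ 2) or (N ≥ 2, 1), so the two extra pieces of information used in the text (equal masses; a non-empty energy shell on which the coefficient function does not vanish) are passed in as boolean flags.

```python
def classify(n_created, n_annihilated, same_mass=True, shell_support=False):
    """Type of a regular potential with index (N, M), following Table 8.1.

    same_mass:     for index (1, 1), True if the created and annihilated
                   particles have equal masses.
    shell_support: for index (1, N >= 2) or (N >= 2, 1), True if the energy
                   shell is non-empty and the coefficient function does not
                   vanish on it.
    """
    n, m = n_created, n_annihilated
    if n < 0 or m < 0:
        raise ValueError("numbers of operators must be non-negative")
    if (n, m) == (0, 0):
        return "renorm"
    if (n, m) == (1, 1):
        return "renorm" if same_mass else "oscillation"
    if n == 0 or m == 0:
        return "unphys"
    if n == 1 or m == 1:
        return "decay" if shell_support else "unphys"
    return "phys"
```

For instance, the potential (8.57) has index (2, 1) and an empty energy shell, so `classify(2, 1)` returns `"unphys"`, while a proton-electron interaction with index (2, 2) gives `"phys"`.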
8.2.5 Products and commutators of potentials

Lemma 8.1 The product of two (or any number of ) regular operators is
regular.
Proof. If operators A(t) and B(t) are regular, then

$$A(t) = e^{\frac{i}{\hbar}H_0 t} A\, e^{-\frac{i}{\hbar}H_0 t}$$

$$B(t) = e^{\frac{i}{\hbar}H_0 t} B\, e^{-\frac{i}{\hbar}H_0 t}$$

and their product C(t) = A(t)B(t) has the t-dependence

$$C(t) = e^{\frac{i}{\hbar}H_0 t} A\, e^{-\frac{i}{\hbar}H_0 t}\, e^{\frac{i}{\hbar}H_0 t} B\, e^{-\frac{i}{\hbar}H_0 t} = e^{\frac{i}{\hbar}H_0 t} AB\, e^{-\frac{i}{\hbar}H_0 t}$$

characteristic for regular operators. The conservation laws (8.48), (8.51) are
valid for the product AB if they are valid for A and B separately. Therefore
C(t) is regular.
17
except chapter 13 where we discuss decays
Lemma 8.2 A Hermitian operator A is phys if and only if it yields zero
when acting on the vacuum |0⟩ and one-particle states |1⟩ ≡ α† |0⟩18

$$A|0\rangle = 0 \tag{8.58}$$

$$A|1\rangle = A\alpha^\dagger|0\rangle = 0 \tag{8.59}$$

Proof. Normally ordered phys operators have at least two annihilation operators
on the right, so equations (8.58) and (8.59) are satisfied. Let us now prove
the converse statement. Renorm operators cannot satisfy (8.58) and (8.59)
because they conserve the number of particles. Unphys operators (1, N) can
satisfy equations (8.58) and (8.59), e.g.,

$$\alpha_1^\dagger\alpha_2\alpha_3|0\rangle = 0$$

$$\alpha_1^\dagger\alpha_2\alpha_3|1\rangle = 0.$$

However, for Hermiticity, such operators should always be present in pairs
with (N, 1) operators α2† α3† α1 . Then there exists at least one one-particle
state |1⟩ for which equation (8.59) is not valid, e.g.,

$$\alpha_3^\dagger\alpha_2^\dagger\alpha_1|1\rangle = \alpha_3^\dagger\alpha_2^\dagger|0\rangle \neq 0$$

A similar argument is valid for unphys operators having index (0, N).
Therefore, the only remaining possibility for A is to be phys.
Lemma 8.3 The product and the commutator of any two phys operators A and B
are phys.
Proof. By Lemma 8.2, if A and B are phys, then

$$A|0\rangle = B|0\rangle = A|1\rangle = B|1\rangle = 0.$$

Then the same conditions are true for the Hermitian combinations i(AB −
BA) and AB + BA. Therefore, the commutator [A, B] and the anticommutator
AB + BA are phys, and

$$AB = \frac{1}{2}(AB + BA) + \frac{1}{2}[A, B]$$

is phys as well.
18
Here α denotes any one of the five particle operators (a, b, d, f, c) relevant for QED.
Momentum and spin labels are omitted for simplicity.
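Lemma 8.4 If R is a renorm operator and [A, R] ≠ 0, then the operator [A, R]
has the same type (i.e., renorm, phys, or unphys) as A.
Proof. The general form of a renorm operator is given in equation (8.56).
Let us consider just one term in that sum

$$R = \int d\mathbf{p}\, f(\mathbf{p})\, \alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}}$$

We calculate the commutator [A, R] = AR − RA by moving the factor R
in the term AR step-by-step to the leftmost position. If the product α†p αp
(from R) changes places with a particle operator (from A) different from α†
or α, then nothing happens. If the product α† α changes places with a creation
operator α†q (from A) then, as discussed in subsection 8.2.2, a secondary term
should be added which, instead of α†q , contains the commutator19

$$
\begin{aligned}
& \alpha^\dagger_{\mathbf{q}}\int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}} - \int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}}\alpha^\dagger_{\mathbf{q}} \\
&\quad = \pm\int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha^\dagger_{\mathbf{q}}\alpha_{\mathbf{p}} - \int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}}\alpha^\dagger_{\mathbf{q}} \\
&\quad = \int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}}\alpha^\dagger_{\mathbf{q}} - \int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\delta(\mathbf{p}-\mathbf{q}) - \int d\mathbf{p}\, f(\mathbf{p})\alpha^\dagger_{\mathbf{p}}\alpha_{\mathbf{p}}\alpha^\dagger_{\mathbf{q}} \\
&\quad = -f(\mathbf{q})\alpha^\dagger_{\mathbf{q}}
\end{aligned}
$$

This commutator is proportional to α†q , so the secondary term has the same
operator structure as the primary term and it is already in the normal order,
so no tertiary terms need to be created. If the product α† α changes places
with an annihilation operator αq , then the commutator f (q)αq is proportional
to the annihilation operator. If there are many α† and α operators in A
having non-vanishing commutators with R, then each one of them results in
one additional term whose type remains the same as in the original operator
A.
19
The upper sign is for bosons and the lower sign is for fermions.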
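
The elementary commutators used in this proof can be checked with small matrices. The sketch below is an illustration under explicit assumptions: a single mode replaces the momentum integral, so R = f α†α with a number f; a 2 × 2 matrix stands for a fermion mode and a truncated matrix of larger dimension for a boson mode. It verifies that the commutator of α† with R equals −f α† and that the commutator of α with R equals f α. All helper names are hypothetical.

```python
import math


def creation_matrix(dim):
    """Matrix of the creation operator in the basis |0>, ..., |dim-1>.
    dim = 2 describes a fermion mode; larger dim is a truncated boson mode."""
    mat = [[0.0] * dim for _ in range(dim)]
    for n in range(dim - 1):
        mat[n + 1][n] = math.sqrt(n + 1)
    return mat


def transpose(mat):
    return [list(row) for row in zip(*mat)]


def matmul(left, right):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*right)]
            for row in left]


def commutator(left, right):
    lr, rl = matmul(left, right), matmul(right, left)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(lr, rl)]


def scale(factor, mat):
    return [[factor * x for x in row] for row in mat]


def close(m1, m2, tol=1e-12):
    return all(abs(x - y) < tol
               for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def check_lemma_step(dim, f):
    """Check [alpha†, R] = -f alpha† and [alpha, R] = f alpha for
    the one-mode renorm operator R = f alpha† alpha."""
    a_dag = creation_matrix(dim)
    a = transpose(a_dag)
    renorm = scale(f, matmul(a_dag, a))
    ok_creation = close(commutator(a_dag, renorm), scale(-f, a_dag))
    ok_annihilation = close(commutator(a, renorm), scale(f, a))
    return ok_creation and ok_annihilation
```

Both `check_lemma_step(2, 0.7)` (fermion mode) and `check_lemma_step(6, 1.3)` (truncated boson mode) return true: in either case the commutator is proportional to the original creation or annihilation operator, which is the property the proof relies on.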
Lemma 8.5 A commutator [P, U] of a Hermitian phys operator P and a
Hermitian unphys operator U can be either phys or unphys, but not renorm.
Proof. Acting by [P, U] on a one-particle state |1⟩, we obtain

$$[P, U]|1\rangle = (PU - UP)|1\rangle = PU|1\rangle$$

If U is Hermitian then the state U|1⟩ has at least two particles (see the proof of
Lemma 8.2) and the same is true for the state P U|1⟩. Therefore, [P, U]
creates several particles when acting on a one-particle state, which would be
impossible if [P, U] were renorm.
Finally, there are no limitations on the type of commutator of two unphys
operators [U, U ′ ]. It can be a superposition of phys, unphys, and renorm
terms.
The above results are summarized in Table 8.2.

Table 8.2: Operations with regular operators in the Fock space. (Notation:
P = phys, U = unphys, R = renorm, NR = non-regular.)

Type of operator A   [A, P]   [A, U]   [A, R]   dA/dt   $\underline{A}$   $\underbrace{A}$
P                    P        P+U      P        P       P                 P
U                    P+U      P+U+R    U        U       U                 0
R                    P        U        R        0       NR                ∞

8.2.6 More about t-integrals
Lemma 8.6 A t-derivative of a regular operator A(t) is regular and has zero
renorm part.
Proof. The derivative of a regular operator has t-dependence that is
characteristic for regular operators:

$$\frac{d}{dt}A(t) = \frac{d}{dt}\, e^{\frac{i}{\hbar}H_0 t} A\, e^{-\frac{i}{\hbar}H_0 t} = \frac{i}{\hbar}\, e^{\frac{i}{\hbar}H_0 t}[H_0, A]\, e^{-\frac{i}{\hbar}H_0 t} = \frac{i}{\hbar}[H_0, A(t)] \tag{8.60}$$

In addition, it is easy to check that the derivative obeys all conservation laws
(8.48), (8.51). Therefore, it is regular.
Suppose that $\frac{d}{dt}A(t)$ has a non-zero renorm part R. Then R is t-independent
and originates from a derivative of the term Rt + S in A(t), where S is
t-independent. Since A(t) is regular, its renorm part must be t-independent,
therefore R = 0.
From formula (7.28) we conclude that t-integrals of regular phys and
unphys operators are regular20

$$\underline{V(t)} = V(t) \circ \frac{-1}{E_V} \tag{8.61}$$

However, this property does not hold for t-integrals of renorm operators.
These operators are t-independent, therefore

$$\underline{V^{ren}} = \lim_{\epsilon\to+0} V^{ren} \circ \frac{ie^{-\epsilon t}}{\epsilon\hbar} = \lim_{\epsilon\to+0}\left(V^{ren} \circ \frac{i}{\epsilon\hbar} - V^{ren} \circ \frac{it}{\hbar} + \ldots\right) \tag{8.62}$$

$$\underbrace{V^{ren}} = \infty \tag{8.63}$$

Thus, renorm operators are different from others in the sense that their
t-integrals (8.62) are infinite and non-regular. Infinite-limit t-integrals (8.63)
are infinite, unless $V^{ren} = 0$.21
Since for any unphys operator $V^{unp}$ either the energy shell does not exist
or the coefficient function is zero on the energy shell, we conclude from
equation (8.54) that

$$\underbrace{V^{unp}} = 0 \tag{8.64}$$

From equations (7.15) and (7.19) it is then clear that unphys terms in F and
Σ do not make contributions to the S-operator. Results obtained so far in
this subsection are presented in the last three columns of Table 8.2.
20
Here we assume the adiabatic switching of interaction (7.27) and use integral notation
(7.12) - (7.13).
21
This fact does not limit the applicability of our theory, because, as we will see in
subsection 10.1.2, properly renormalized expressions for scattering operators should not
contain renorm terms and pathological expressions like (8.62) - (8.63).
8.2.7 Solution of one commutator equation

In section 11.1 we will find it necessary to solve an equation of the type

$$i[H_0, A] = V \tag{8.65}$$

where H0 is the free Hamiltonian, V is a given regular Hermitian operator in
the Fock space, and A is the desired solution (an unknown Hermitian operator).
What can we say about the solution of this commutator equation? Let
us first multiply both sides of (8.65) by the usual t-exponents $e^{\frac{i}{\hbar}H_0 t} \ldots e^{-\frac{i}{\hbar}H_0 t}$

$$i[H_0, A(t)] = V(t)$$

Using (8.60) this can be rewritten as

$$\hbar\frac{d}{dt}A(t) = V(t) \tag{8.66}$$

Note that it follows from Lemma 8.6 that the operator on the right hand
side of (8.65) cannot contain renorm terms. Indeed, according to (8.66), the
left hand side of (8.65) is a t-derivative, which cannot be renorm. Luckily,
for our purposes in this book we will never meet equations of the above type
with renorm right hand sides. Therefore, here we will assume $V^{ren} = 0$.
Next we assume that the usual “adiabatic switching” (7.27) works, so
that V (−∞) = 0 and the same property is valid for the solution A(t)

$$A(-\infty) = 0 \tag{8.67}$$

Then equation (8.66) with the initial condition (8.67) has a simple solution

$$A(t) = \frac{1}{\hbar}\int_{-\infty}^{t} V(t')\,dt' = i\,\underline{V(t)} \tag{8.68}$$

In order to get a t-independent solution of our original equation (8.65), we
can simply set t = 0 and obtain

$$A \equiv A(0) = i\,\underline{V} \equiv -iV \circ \frac{1}{E_V} \tag{8.69}$$

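
A finite-dimensional analogue makes the structure of this solution easy to test. In the Python sketch below (hypothetical names; an illustration, not the Fock-space calculation) H0 is a diagonal matrix with distinct eigenvalues, V is a Hermitian matrix with zero diagonal (the analogue of a vanishing renorm part), and the energy function of the matrix element (i, j) is the difference of the final and initial energies. The candidate solution divides each matrix element of −iV by this difference.

```python
def solve_commutator_equation(energies, potential):
    """Return A such that i[H0, A] = V for H0 = diag(energies).

    Assumes distinct energies and a zero diagonal of V; each off-diagonal
    element is A_ij = -i V_ij / (E_i - E_j)."""
    n = len(energies)
    solution = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                gap = energies[i] - energies[j]
                solution[i][j] = -1j * potential[i][j] / gap
    return solution


def i_commutator_with_h0(energies, operator):
    """Matrix of i[H0, A] for diagonal H0: element (i, j) is
    i (E_i - E_j) A_ij."""
    n = len(energies)
    return [[1j * (energies[i] - energies[j]) * operator[i][j]
             for j in range(n)] for i in range(n)]


def is_hermitian(mat, tol=1e-12):
    n = len(mat)
    return all(abs(mat[i][j] - mat[j][i].conjugate()) < tol
               for i in range(n) for j in range(n))
```

With, say, `energies = [0.0, 1.0, 2.5]` and a Hermitian off-diagonal `potential`, the matrix returned by `solve_commutator_equation` is Hermitian and `i_commutator_with_h0` applied to it reproduces the potential, mirroring (8.65) and (8.69).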
8.2.8 Two-particle potentials

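Our next goal is to express n-particle potentials (n ≥ 2)22 using the formalism
of annihilation and creation operators. These potentials conserve the number
and types of particles, so they must have equal numbers of creation and
annihilation operators (N = M, N ≥ 2, M ≥ 2). Therefore, they must be of
the phys type.
As an example, let us consider a two-particle subspace H(1, 0, 1, 0, 0) of
the Fock space. This subspace describes states of the system consisting of
one electron and one proton. A general phys operator leaving this subspace
invariant must have N = 2, M = 2 and, according to equation (8.50), it can
be written as23

$$
\begin{aligned}
\hat{V} &= \int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'\,d\mathbf{q}'\, D_{22}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}')\,\delta(\mathbf{p} + \mathbf{q} - \mathbf{p}' - \mathbf{q}')\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{q}'} \\
&= \int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'\, D_{22}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{p} + \mathbf{q} - \mathbf{p}')\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{p}+\mathbf{q}-\mathbf{p}'} \\
&= \int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k})\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}-\mathbf{k}} a_{\mathbf{q}+\mathbf{k}}
\end{aligned} \tag{8.70}
$$

where we denoted by k = p − p′ the “transferred momentum” and

$$V(\mathbf{p}, \mathbf{q}, \mathbf{k}) \equiv D_{22}(\mathbf{p}, \mathbf{q}, \mathbf{p} - \mathbf{k}, \mathbf{q} + \mathbf{k})$$

22
studied in subsection 6.3.4
23
In this subsection we use variables p and q to denote momenta of the proton and the
electron, respectively. We also omit spin indices for brevity.
Acting by this operator on an arbitrary state |Ψ⟩ of the two-particle system

$$|\Psi\rangle = \int d\mathbf{p}''\,d\mathbf{q}''\, \Psi(\mathbf{p}'', \mathbf{q}'')\, d^\dagger_{\mathbf{p}''} a^\dagger_{\mathbf{q}''}|0\rangle \tag{8.71}$$

we obtain

$$
\begin{aligned}
\hat{V}|\Psi\rangle &= \int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k})\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}-\mathbf{k}} a_{\mathbf{q}+\mathbf{k}} \int d\mathbf{p}''\,d\mathbf{q}''\, \Psi(\mathbf{p}'', \mathbf{q}'')\, d^\dagger_{\mathbf{p}''} a^\dagger_{\mathbf{q}''}|0\rangle \\
&= \int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k}) \int d\mathbf{p}''\,d\mathbf{q}''\, \Psi(\mathbf{p}'', \mathbf{q}'')\,\delta(\mathbf{p} - \mathbf{k} - \mathbf{p}'')\,\delta(\mathbf{q} + \mathbf{k} - \mathbf{q}'')\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}}|0\rangle \\
&= \int d\mathbf{p}\,d\mathbf{q} \left(\int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k})\,\Psi(\mathbf{p} - \mathbf{k}, \mathbf{q} + \mathbf{k})\right) d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}}|0\rangle
\end{aligned} \tag{8.72}
$$

Comparing this with (8.71) we see that the momentum-space wave function
Ψ(p, q) has been transformed by the action of V̂ to the new wave function

$$\Psi'(\mathbf{p}, \mathbf{q}) \equiv \hat{V}\Psi(\mathbf{p}, \mathbf{q}) = \int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k})\,\Psi(\mathbf{p} - \mathbf{k}, \mathbf{q} + \mathbf{k})$$

This is the most general linear transformation of a two-particle wave function
which conserves the total momentum.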
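
The momentum-conserving character of this transformation can be seen in a discretized toy version. The Python sketch below is an assumption-based illustration: momenta are integers on a one-dimensional lattice, the integral over k is replaced by a finite sum, and the names `apply_potential` and `coefficient` are hypothetical.

```python
def apply_potential(coefficient, psi, k_values):
    """Discrete 1D analogue of psi'(p, q) = sum_k V(p, q, k) psi(p - k, q + k).

    psi is a dict {(p, q): amplitude} on an integer momentum lattice and
    coefficient(p, q, k) plays the role of V(p, q, k)."""
    out = {}
    for (p0, q0), amp in psi.items():
        for k in k_values:
            # psi(p - k, q + k) equals psi(p0, q0) when p = p0 + k, q = q0 - k
            p, q = p0 + k, q0 - k
            out[(p, q)] = out.get((p, q), 0.0) + coefficient(p, q, k) * amp
    return out


def coefficient(p, q, k):
    """An arbitrary smooth choice that depends only on the transferred momentum."""
    return 1.0 / (1.0 + k * k)
```

If every component of `psi` has the same total momentum p + q, then so does every component of `apply_potential(coefficient, psi, range(-2, 3))`: the transferred momentum k is taken from one particle and given to the other.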
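For comparison with traditionally used inter-particle potentials, it is more
convenient to have an expression for the operator V̂ in the position space. We can
write24

$$
\begin{aligned}
\Psi'(\mathbf{x}, \mathbf{y}) \equiv \hat{V}\Psi(\mathbf{x}, \mathbf{y})
&= \frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, d\mathbf{p}\,d\mathbf{q}\, \Psi'(\mathbf{p}, \mathbf{q}) \\
&= \frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, d\mathbf{p}\,d\mathbf{q} \int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}, \mathbf{k})\,\Psi(\mathbf{p} - \mathbf{k}, \mathbf{q} + \mathbf{k}) \\
&= \frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}(\mathbf{p}+\mathbf{k})\mathbf{x} + \frac{i}{\hbar}(\mathbf{q}-\mathbf{k})\mathbf{y}}\, d\mathbf{p}\,d\mathbf{q} \int d\mathbf{k}\, V(\mathbf{p} + \mathbf{k}, \mathbf{q} - \mathbf{k}, \mathbf{k})\,\Psi(\mathbf{p}, \mathbf{q}) \\
&= \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{x}-\mathbf{y})}\, V(\mathbf{p} + \mathbf{k}, \mathbf{q} - \mathbf{k}, \mathbf{k}) \left[\frac{1}{(2\pi\hbar)^3}\int d\mathbf{p}\,d\mathbf{q}\, e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, \Psi(\mathbf{p}, \mathbf{q})\right]
\end{aligned} \tag{8.73}
$$

where the expression in square brackets is recognized as the original
position-space wave function

$$\Psi(\mathbf{x}, \mathbf{y}) = \frac{1}{(2\pi\hbar)^3}\int d\mathbf{p}\,d\mathbf{q}\, e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, \Psi(\mathbf{p}, \mathbf{q}) \tag{8.74}$$

and the rest is an operator acting on this wave function. This operator
acquires an especially simple form if we assume that V (p, q, k) does not depend
on p and q

$$V(\mathbf{p}, \mathbf{q}, \mathbf{k}) = v(\mathbf{k})$$

Then

$$\hat{V}\Psi(\mathbf{x}, \mathbf{y}) = \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{x}-\mathbf{y})}\, v(\mathbf{k})\,\Psi(\mathbf{x}, \mathbf{y}) = w(\mathbf{x} - \mathbf{y})\,\Psi(\mathbf{x}, \mathbf{y}) \tag{8.75}$$

where

$$w(\mathbf{r}) = \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}\mathbf{r}}\, v(\mathbf{k})$$

is the Fourier transform of v(k). We see that interaction (8.70) acts as
multiplication by the function w(r) in the position space. So, it is a usual
position-dependent potential. Note that the requirement of the total
momentum conservation implies automatically that this potential depends on
the relative position r ≡ x − y.
24
Here x and y are positions of the proton and the electron, respectively; and we use
(5.43) to change from the momentum representation to the position representation.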
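As an example consider an interaction operator of the form (8.70)

$$\hat{V} = \frac{q_1 q_2}{(2\pi)^3\hbar}\int \frac{d\mathbf{p}\,d\mathbf{q}\,d\mathbf{k}}{k^2}\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}-\mathbf{k}} a_{\mathbf{q}+\mathbf{k}} \tag{8.76}$$

where constants q1 and q2 can be interpreted as charges of the two particles
and $v(\mathbf{k}) = q_1 q_2/(8\pi^3\hbar k^2)$. Then the position-space interaction is the usual
Coulomb potential25

$$w(\mathbf{r}) = \frac{q_1 q_2}{(2\pi)^3\hbar}\int \frac{d\mathbf{k}}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\mathbf{r}} = \frac{q_1 q_2}{4\pi r} \tag{8.77}$$
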
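
The Fourier integral behind the Coulomb result can be sketched as follows. This is a standard evaluation in spherical coordinates, written out here as an illustration; it assumes that the polar axis is chosen along r and that the radial integral is understood as the Dirichlet integral, whose value is π/2.

```latex
\begin{aligned}
\int \frac{d\mathbf{k}}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\mathbf{r}}
&= 2\pi \int_0^\infty dk \int_{-1}^{1} d(\cos\theta)\, e^{\frac{i}{\hbar}kr\cos\theta}
 = \frac{4\pi\hbar}{r} \int_0^\infty \frac{\sin(kr/\hbar)}{k}\, dk
 = \frac{4\pi\hbar}{r}\cdot\frac{\pi}{2}
 = \frac{2\pi^2\hbar}{r} \\
w(\mathbf{r}) &= \frac{q_1 q_2}{(2\pi)^3\hbar}\cdot\frac{2\pi^2\hbar}{r} = \frac{q_1 q_2}{4\pi r}
\end{aligned}
```

The factor k² of the volume element cancels the 1/k² of the integrand, which is why only the oscillating factor survives in the radial integral.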
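Let us now consider the general case (8.73). Without loss of generality
we can represent the function V (p + k, q − k, k) as a series26

$$V(\mathbf{p} + \mathbf{k}, \mathbf{q} - \mathbf{k}, \mathbf{k}) = \sum_j \chi_j(\mathbf{p}, \mathbf{q})\, v_j(\mathbf{k})$$

Then we obtain

$$
\begin{aligned}
\hat{V}\Psi(\mathbf{x}, \mathbf{y})
&= \sum_j \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{x}-\mathbf{y})}\, v_j(\mathbf{k})\, \frac{1}{(2\pi\hbar)^3}\int d\mathbf{p}\,d\mathbf{q}\, \chi_j(\mathbf{p}, \mathbf{q})\, e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, \Psi(\mathbf{p}, \mathbf{q}) \\
&= \sum_j w_j(\mathbf{x} - \mathbf{y})\, \chi_j(\hat{\mathbf{p}}, \hat{\mathbf{q}})\, \frac{1}{(2\pi\hbar)^3}\int d\mathbf{p}\,d\mathbf{q}\, e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x} + \frac{i}{\hbar}\mathbf{q}\mathbf{y}}\, \Psi(\mathbf{p}, \mathbf{q}) \\
&= \sum_j w_j(\mathbf{x} - \mathbf{y})\, \chi_j(\hat{\mathbf{p}}, \hat{\mathbf{q}})\, \Psi(\mathbf{x}, \mathbf{y})
\end{aligned} \tag{8.78}
$$

where $\hat{\mathbf{p}} = -i\hbar(d/d\mathbf{x})$ and $\hat{\mathbf{q}} = -i\hbar(d/d\mathbf{y})$ are differential operators, i.e.,
position-space representations (5.41) of the momentum operators of the two
particles. Expression (8.78) then demonstrates that the interaction d† a† da can
always be represented as a general 2-particle potential depending on the
distance between particles and their momenta. We will use equation (8.78)
in our derivation of 2-particle RQD potentials in subsection 12.1.2.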
8.2.9 Cluster separability in the Fock space

We know that a cluster separable interaction potential can be constructed as a
sum of smooth potentials (6.49) depending on particle observables (positions,
momenta, and spins). However, this notation is very inconvenient to use in
the Fock space, because such sums have rather different forms in different
Fock sectors. For example, the Coulomb interaction has the form (6.47)
in the 2-particle sector and the form (6.48) in the 3-particle sector. It
would be preferable to have a unique formula, which remains valid in
all N-particle sectors. Fortunately, it is easy to satisfy cluster separability
within our standard notation (8.49) - (8.50). We just need to make sure that
factors $D_{NM}$ are smooth functions of particle momenta.27
25
see equation (B.7)
26
For example, a series of this form can be obtained by writing a Taylor expansion with
respect to the variable k with χj being the coefficients depending on p and q.
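Let us verify this statement on a simple example. We are going to find
out how the 2-particle potential (8.70)28 acts in the 3-particle (one proton
and two electrons) sector of the Fock space H(2, 0, 1, 0, 0), where state vectors
have the form

$$|\Psi\rangle = \int d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2\, \psi(\mathbf{p}, \mathbf{q}_1, \mathbf{q}_2)\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle \tag{8.79}$$

Applying operator (8.70) to this state vector we obtain

$$\hat{V}|\Psi\rangle = \int d\mathbf{p}'\,d\mathbf{q}'\,d\mathbf{k} \int d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2\, V(\mathbf{p}', \mathbf{q}', \mathbf{k})\,\psi(\mathbf{p}, \mathbf{q}_1, \mathbf{q}_2)\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} d_{\mathbf{p}'-\mathbf{k}} a_{\mathbf{q}'+\mathbf{k}}\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle \tag{8.80}$$

The product of particle operators acting on the vacuum state can be normally
ordered as

$$
\begin{aligned}
& d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} d_{\mathbf{p}'-\mathbf{k}} a_{\mathbf{q}'+\mathbf{k}}\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle \\
&\quad = -d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2} d_{\mathbf{p}'-\mathbf{k}} a_{\mathbf{q}'+\mathbf{k}}|0\rangle + \delta(\mathbf{p}' - \mathbf{k} - \mathbf{p})\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} a_{\mathbf{q}'+\mathbf{k}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle \\
&\qquad - \delta(\mathbf{q}_1 - \mathbf{q}' - \mathbf{k})\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_2} d_{\mathbf{p}'-\mathbf{k}}|0\rangle + \delta(\mathbf{q}_2 - \mathbf{q}' - \mathbf{k})\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} d_{\mathbf{p}'-\mathbf{k}}|0\rangle \\
&\quad = \delta(\mathbf{p}' - \mathbf{k} - \mathbf{p})\,\delta(\mathbf{q}_1 - \mathbf{q}' - \mathbf{k})\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} a^\dagger_{\mathbf{q}_2}|0\rangle - \delta(\mathbf{q}_2 - \mathbf{q}' - \mathbf{k})\,\delta(\mathbf{p}' - \mathbf{k} - \mathbf{p})\, d^\dagger_{\mathbf{p}'} a^\dagger_{\mathbf{q}'} a^\dagger_{\mathbf{q}_1}|0\rangle
\end{aligned}
$$

27
see section 4 in [Wei95].
28
As a concrete example, it is instructive to choose the Coulomb interaction (8.76) in
these calculations.
Inserting this result in (8.80) we obtain

$$
\begin{aligned}
\hat{V}|\Psi\rangle
&= \int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2\, V(\mathbf{k} + \mathbf{p}, \mathbf{q}_1 - \mathbf{k}, \mathbf{k})\,\psi(\mathbf{p}, \mathbf{q}_1, \mathbf{q}_2)\, d^\dagger_{\mathbf{k}+\mathbf{p}} a^\dagger_{\mathbf{q}_1-\mathbf{k}} a^\dagger_{\mathbf{q}_2}|0\rangle \\
&\quad - \int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2\, V(\mathbf{k} + \mathbf{p}, \mathbf{q}_2 - \mathbf{k}, \mathbf{k})\,\psi(\mathbf{p}, \mathbf{q}_1, \mathbf{q}_2)\, d^\dagger_{\mathbf{k}+\mathbf{p}} a^\dagger_{\mathbf{q}_2-\mathbf{k}} a^\dagger_{\mathbf{q}_1}|0\rangle \\
&= \int d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2 \left(\int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_1, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1 + \mathbf{k}, \mathbf{q}_2) + \int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_2, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1, \mathbf{q}_2 + \mathbf{k})\right) d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle
\end{aligned} \tag{8.81}
$$

Comparing this with equation (8.72) we see that, as expected from the
condition of cluster separability, the two-particle interaction in the three-particle
sector separates into two terms. One term acts on the pair of variables (p, q1 ).
The other term acts on variables (p, q2 ).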
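Removing the electron 2 to infinity is equivalent to multiplying the
momentum-space wave function ψ(p, q1 , q2 ) by $\exp(\frac{i}{\hbar}\mathbf{q}_2\mathbf{a})$ where a → ∞. The action of
V̂ on such a wave function29 is

$$\lim_{a\to\infty}\left[\int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_1, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1 + \mathbf{k}, \mathbf{q}_2)\, e^{\frac{i}{\hbar}\mathbf{q}_2\mathbf{a}} + \int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_2, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1, \mathbf{q}_2 + \mathbf{k})\, e^{\frac{i}{\hbar}(\mathbf{q}_2+\mathbf{k})\mathbf{a}}\right]$$

In the limit a → ∞ the exponent in the integrand of the second term is a
rapidly oscillating function of k. If the coefficient function V (p, q, k) is a
smooth function of k, then the integral over k is zero due to the
Riemann-Lebesgue lemma B.1. Therefore, only the interaction proton - electron(1)
does not vanish

$$
\begin{aligned}
\lim_{a\to\infty} \hat{V} e^{\frac{i}{\hbar}\hat{\mathbf{q}}_2\mathbf{a}}|\Psi\rangle
&= \lim_{a\to\infty}\int d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{q}_2 \left(\int d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_1, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1 + \mathbf{k}, \mathbf{q}_2)\right) e^{\frac{i}{\hbar}\mathbf{q}_2\mathbf{a}}\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1} a^\dagger_{\mathbf{q}_2}|0\rangle \\
&= \lim_{a\to\infty}\int d\mathbf{q}_2\, e^{\frac{i}{\hbar}\mathbf{q}_2\mathbf{a}}\, a^\dagger_{\mathbf{q}_2} \int d\mathbf{p}\,d\mathbf{q}_1\,d\mathbf{k}\, V(\mathbf{p}, \mathbf{q}_1, \mathbf{k})\,\psi(\mathbf{p} - \mathbf{k}, \mathbf{q}_1 + \mathbf{k}, \mathbf{q}_2)\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}_1}|0\rangle
\end{aligned}
$$

29
i.e., the term in parentheses in (8.81)
Comparing this with equation (8.72), we see that the spatial translation of
the electron 2 results in a state in which the remote electron 2 coexists with
the interacting subsystem “proton + electron 1”. This demonstrates that V̂
is a cluster separable potential.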
For general potentials (8.49) - (8.50) with smooth coefficient functions the
above arguments can be repeated: If some particles are removed to infinity
such potentials automatically separate into sums of smooth terms, as required
by cluster separability. Therefore,
Statement 8.7 (cluster separability) The cluster separability of the gen-
eral interaction (8.49) is guaranteed if coefficient functions DN M of all in-
teraction potentials VN M are smooth functions of momenta.
The power of this statement is that when expressing interaction potentials
through particle operators in the momentum representation (as in (8.50)) we
have a very simple criterion of cluster separability: the coefficient functions
must be smooth, i.e., they should not contain singular factors, like delta
functions.30 This demonstrates the great advantage of writing interactions
in terms of particle (creation and annihilation) operators (8.50) instead of
particle (position and momentum) observables, as in section 6.3.31
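
The mechanism behind this criterion, the Riemann-Lebesgue lemma, can be illustrated numerically. The sketch below is a one-dimensional toy with hypothetical names: a Gaussian stands in for a smooth coefficient function, and only the cosine part of exp(ika) is integrated, since the sine part vanishes for an even function. For this choice the integral is known in closed form, sqrt(π) exp(−a²/4), which decays rapidly as the displacement a grows.

```python
import math


def oscillatory_integral(func, a, k_max=10.0, steps=20000):
    """Midpoint-rule estimate of the integral of func(k) * cos(k * a)
    over the interval [-k_max, k_max]."""
    h = 2.0 * k_max / steps
    total = 0.0
    for n in range(steps):
        k = -k_max + (n + 0.5) * h
        total += func(k) * math.cos(k * a)
    return total * h


def gaussian(k):
    """A smooth, rapidly decaying stand-in for a coefficient function."""
    return math.exp(-k * k)
```

Evaluating `oscillatory_integral(gaussian, a)` for a = 0, 2, 6 gives a sequence that shrinks toward zero. A coefficient function containing a delta function in k would instead produce a contribution of constant magnitude for all a, which is why singular factors spoil cluster separability.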
8.3 A toy model theory

Before considering real QED interactions in the next chapter, in this section
we are going to perform a warm-up exercise. We will introduce a simple yet
quite realistic model theory with a variable number of particles in the Fock
space. In this theory, the perturbation expansion of the S-operator can be
evaluated with minimal effort, in particular, with the help of a convenient
diagram technique.
8.3.1 Fock space and Hamiltonian

30 This is the reason why cluster separable potentials were called smooth in
subsection 6.3.4.
31 Recall that in subsection 6.3.6 it was a very non-trivial matter to ensure
the cluster separability for interaction potentials written in terms of
particle observables even in a simplest 3-particle system.
The toy model introduced here is a rough approximation to QED. No
particle-antiparticle pair creation is allowed. So, for simplicity we will
ignore all other types of particles except electrons and photons. The relevant
part of the Fock space is a direct sum of electron-photon sectors like those
described in formulas (8.2) - (8.10). We will also assume that interaction
does not affect the electron spin and photon polarization degrees of freedom,
so the corresponding labels will be omitted. Then relevant (anti)commutation
relations of particle operators can be taken from (8.22) - (8.25)
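$$\{a_{\mathbf{p}}, a^{\dagger}_{\mathbf{p}'}\} = \delta(\mathbf{p} - \mathbf{p}') \tag{8.82}$$
$$[c_{\mathbf{p}}, c^{\dagger}_{\mathbf{p}'}] = \delta(\mathbf{p} - \mathbf{p}') \tag{8.83}$$
$$\{a_{\mathbf{p}}, a_{\mathbf{p}'}\} = \{a^{\dagger}_{\mathbf{p}}, a^{\dagger}_{\mathbf{p}'}\} = 0 \tag{8.84}$$
$$[c_{\mathbf{p}}, c_{\mathbf{p}'}] = [c^{\dagger}_{\mathbf{p}}, c^{\dagger}_{\mathbf{p}'}] = 0 \tag{8.85}$$
$$[a^{\dagger}_{\mathbf{p}}, c^{\dagger}_{\mathbf{p}'}] = [a^{\dagger}_{\mathbf{p}}, c_{\mathbf{p}'}] = [a_{\mathbf{p}}, c^{\dagger}_{\mathbf{p}'}] = [a_{\mathbf{p}}, c_{\mathbf{p}'}] = 0$$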
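The full Hamiltonian
$$H = H_0 + V_1 \tag{8.86}$$
as usual, is the sum of the free Hamiltonian
$$H_0 = \int d\mathbf{p}\, \omega_{\mathbf{p}}\, a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}} + c \int d\mathbf{k}\, k\, c^{\dagger}_{\mathbf{k}} c_{\mathbf{k}}$$
and interaction, which we choose in the following unphys form
$$V_1 = \frac{ec\hbar}{(2\pi\hbar)^{3/2}} \int \frac{d\mathbf{p}\, d\mathbf{k}}{\sqrt{ck}}\, a^{\dagger}_{\mathbf{p}} c^{\dagger}_{\mathbf{k}} a_{\mathbf{p}+\mathbf{k}} + \frac{ec\hbar}{(2\pi\hbar)^{3/2}} \int \frac{d\mathbf{p}\, d\mathbf{k}}{\sqrt{ck}}\, a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}} c_{\mathbf{k}} \tag{8.87}$$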
The coupling constant e is equal to the absolute value of the electron’s charge.
Here and in what follows the perturbation order of an operator (= the power
of the coupling constant e in the operator) is shown by the subscript. For
example, the free Hamiltonian H0 does not depend on e, so it is of zero
perturbation order; the perturbation order of V1 is one, etc.
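The above theory satisfies conservation laws
$$[H, \mathbf{P}_0] = [H, \mathbf{J}_0] = [H, Q] = 0$$
where operators P0 , J0 , and Q refer to the total momentum, total angular
momentum operator and total electric charge, respectively
$$\mathbf{P}_0 = \int d\mathbf{p}\, \mathbf{p}\, (a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}} + c^{\dagger}_{\mathbf{p}} c_{\mathbf{p}})$$
$$Q = -e \int d\mathbf{p}\, a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}}$$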
The number of electrons is conserved, due to the conservation of charge, but

the number of photons can vary without limitations. So, this theory can
describe important processes of the photon emission and absorption.
However, our toy model has a major drawback: This model is not Poincaré
invariant. This means that we cannot construct an interacting boost oper-
ator K such that the Poincaré commutation relations with H, P0 , and J0
are satisfied. In this section we will tolerate the lack of invariance, but in
chapter 9 we will show how the Poincaré invariance can be satisfied in a more
comprehensive theory (QED) which includes both particles and antiparticles.
8.3.2 Drawing a diagram in the toy model

In this subsection we would like to introduce a diagram technique which
would greatly facilitate perturbative calculations of scattering operators (7.20)
- (7.22). Let us graphically represent each term in the interaction potential
(8.87) as a vertex (see Fig. 8.2). Each particle operator in V1 is represented
as an oriented line or arrow. The line corresponding to an annihilation oper-
ator enters the vertex and the line corresponding to a creation operator leaves
the vertex. Electron lines are shown by full arcs and photon lines are shown
by dashed arrows on the diagram. Each line is marked with the momentum
label of the corresponding particle operator. Free ends of the electron lines
are attached to the vertical electron “order bar” on the left hand side of the
diagram. The order of these external lines (from bottom to top of the order
bar) corresponds to the order of electron particle operators in the potential
(from right to left). An additional numerical factor is indicated in the upper
left corner of the diagram.
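Figure 8.2: Diagram representation of the interaction operator V1 . Panels (a)
and (b) show the first and second terms in (8.87), respectively.
Figure 8.3: Diagram representation of the t-integral V1 (t). Panels (a) and (b)
show the first and second terms in (8.88), respectively.
The t-integral
$$\underline{V_1} = -\frac{ec\hbar}{(2\pi\hbar)^{3/2}} \int \frac{d\mathbf{p}\, d\mathbf{k}}{\sqrt{ck}}\, \frac{a^{\dagger}_{\mathbf{p}} c^{\dagger}_{\mathbf{k}} a_{\mathbf{p}+\mathbf{k}}}{\omega_{\mathbf{p}} + ck - \omega_{\mathbf{p}+\mathbf{k}}} - \frac{ec\hbar}{(2\pi\hbar)^{3/2}} \int \frac{d\mathbf{p}\, d\mathbf{k}}{\sqrt{ck}}\, \frac{a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}} c_{\mathbf{k}}}{\omega_{\mathbf{p}} - ck - \omega_{\mathbf{p}-\mathbf{k}}} \tag{8.88}$$
differs from V1 only by the factor $-E_{V_1}^{-1}$ (see equation (8.61)), which is
represented in the diagram 8.3 by drawing a box that crosses all external lines.
A line entering (leaving) the box contributes its energy with the negative
(positive) sign to the energy function $E_{V_1}$.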
In order to calculate the Σ-operator (7.22) in the 2nd perturbation order
we need products $V_1 \underline{V_1}$. The diagram corresponding to the product of two
diagrams AB is obtained by simply placing the diagram B below the diagram
A and attaching external electron lines of both diagrams to the same order
bar. For example, the diagram for the product of the second term in (8.87)
(Fig. 8.2(b)) and the first term in (8.88) (Fig. 8.3(a))
$$V_1 \underline{V_1} \propto (a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}} c_{\mathbf{k}})(a^{\dagger}_{\mathbf{q}} c^{\dagger}_{\mathbf{k}'} a_{\mathbf{q}+\mathbf{k}'}) + \ldots \tag{8.89}$$
is shown in Fig. 8.4(a).32 This product should be further converted to the
normal form, i.e., all creation operators should be moved to the left. On the
diagram, the movement of creation operators from right to left is represented
by the movement of free outward pointing arrows upward, so that at the end
of this process all outgoing lines are positioned above incoming lines. Due to
anticommutation relations (8.82) and (8.84), each exchange of positions of
electron particle operators (full lines in the diagram) changes the total sign
of the expression. Each permutation of annihilation and creation operators
(incoming and outgoing lines, respectively) of similar particles creates an
additional expression, which corresponds to delta functions on the right hand
sides of (8.82) and (8.83). We represent these additional terms by diagrams
in which the swapped lines are joined together forming internal lines that
directly connect two vertices.
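Applying these rules to (8.89), we first move the photon operators to
rightmost positions, move the operator $a^{\dagger}_{\mathbf{q}}$ to the leftmost position, and add
another term due to the anticommutator $\{a_{\mathbf{p}-\mathbf{k}}, a^{\dagger}_{\mathbf{q}}\} = \delta(\mathbf{q} - \mathbf{p} + \mathbf{k})$.
$$V_1 \underline{V_1} \propto a^{\dagger}_{\mathbf{q}} a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}} a_{\mathbf{q}+\mathbf{k}'} c_{\mathbf{k}} c^{\dagger}_{\mathbf{k}'} + \delta(\mathbf{q} - \mathbf{p} + \mathbf{k})\, a^{\dagger}_{\mathbf{p}} a_{\mathbf{q}+\mathbf{k}'} c_{\mathbf{k}} c^{\dagger}_{\mathbf{k}'}$$
$$= a^{\dagger}_{\mathbf{q}} a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}} a_{\mathbf{q}+\mathbf{k}'} c_{\mathbf{k}} c^{\dagger}_{\mathbf{k}'} + a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}-\mathbf{k}+\mathbf{k}'} c_{\mathbf{k}} c^{\dagger}_{\mathbf{k}'} + \ldots \tag{8.90}$$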
Expression (8.90) is represented by two diagrams 8.4(b) and 8.4(c). In
the diagram 8.4(b) the electron line marked q has been moved to the top of
the electron order bar. In the diagram 8.4(c) the product δ(q − p + k) and
the integration by q are represented by joining or pairing the incoming elec-
tron line carrying momentum p − k with the outgoing electron line carrying
momentum q. This produces an internal electron line carrying momentum
p − k between two vertices.
In the expression (8.90), electron operators are in the normal order, how-
ever, photon operators are not there yet. The next step is to change the
order of photon operators
32
By convention, we will place free ends of photon external lines on the right hand side
of the diagram. The order of these free ends (from top to bottom of the diagram) will
correspond to the order of photon particle operators in the expression (from left to right).
For example, in Fig. 8.4(a) the incoming photon line is above the outgoing photon line,
which corresponds to the order cc† of photon operators in (8.89).
Figure 8.4: The normal product of operators in Fig. 8.2(b) and 8.3(a), panels
(a) - (g).
V1 V1 ∝ a†q a†p c†k′ ap−k aq+k′ ck + a†q a†p ap−k aq+k′ δ(k′ − k)
+a†p c†k′ ap−k+k′ ck + a†p ap−k+k′ δ(k′ − k) + . . .
= a†q a†p c†k′ ap−k aq+k′ ck + a†q a†p ap−k aq+k
+a†p c†k′ ap−k+k′ ck + a†p ap + . . . (8.91)
The normal ordering of photon operators in 8.4(b) yields diagrams 8.4(d) and
8.4(e) according to equation (8.83). Diagrams 8.4(f) and 8.4(g) are obtained
from 8.4(c) in a similar way.
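The normal ordering steps leading from (8.89) to (8.91) can be mimicked by a
small program. The following is only a sketch under our own assumptions:
operators are stored as tagged tuples, momentum labels are plain strings, and
the function name normal_order is hypothetical. Creation operators are not
sorted among themselves, so the signs of the terms with the order a†(p) a†(q)
are opposite to those written in (8.91) with the order a†(q) a†(p).

```python
# Sketch only: kind "a" = electron (fermion), kind "c" = photon (boson).
# An operator is a tuple (kind, is_creation, momentum_label).

def normal_order(ops, sign=1, deltas=()):
    """Return a list of (sign, deltas, ops) terms in which all creation
    operators stand to the left of all annihilation operators."""
    for i in range(len(ops) - 1):
        left, right = ops[i], ops[i + 1]
        left_kind, left_creates, left_label = left
        right_kind, right_creates, right_label = right
        if left_creates or not right_creates:
            continue  # this adjacent pair is already in the normal order
        swapped = ops[:i] + [right, left] + ops[i + 2:]
        if left_kind != right_kind:
            # electron and photon operators commute with each other
            return normal_order(swapped, sign, deltas)
        # same species: x y† = (+/-) y† x + delta, cf. (8.82) and (8.83)
        swap_sign = -1 if left_kind == "a" else 1
        paired = ops[:i] + ops[i + 2:]
        pairing = ((left_label, right_label),)
        return (normal_order(swapped, sign * swap_sign, deltas)
                + normal_order(paired, sign, deltas + pairing))
    return [(sign, deltas, ops)]


# The product (8.89): (a†_p a_{p-k} c_k)(a†_q c†_{k'} a_{q+k'})
product = [
    ("a", True, "p"), ("a", False, "p-k"), ("c", False, "k"),
    ("a", True, "q"), ("c", True, "k'"), ("a", False, "q+k'"),
]

terms = normal_order(product)
for sign, deltas, ops in terms:
    names = [kind + ("†" if creates else "") + "(" + label + ")"
             for kind, creates, label in ops]
    print("+" if sign > 0 else "-", " ".join(names), "| pairings:", list(deltas))
```

The four resulting terms carry 0, 1, 1, and 2 pairings, in correspondence with
the diagrams 8.4(d), (e), (f), and (g).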
8.3.3 Reading a diagram in the toy model

With the above diagram rules and some practice, one can perform calcula-
tions of scattering operators (7.21) and (7.22) much easier than in the usual
algebraic way. During these diagram manipulations we, actually, do not need
to keep track of the momentum labels of lines. The algebraic expression of
the result can be easily restored from an unlabeled diagram by following
these steps:
(I) Assign a distinct momentum label to each external line, except one,
whose momentum is obtained from the (momentum conservation) con-
dition that the sum of all incoming external momenta minus the sum
of all outgoing external momenta is zero.
(II) Assign momentum labels to internal lines so that the momentum con-
servation law is satisfied at each vertex: The sum of momenta of lines
entering the vertex is equal to the sum of momenta of outgoing lines.
If there are loops, one needs to introduce new independent loop
momenta.33
(III) Read external lines from top to bottom of the diagram and write corre-
sponding particle operators from left to right.34 Do it first for electron
lines and then for photon lines.
(IV) For each box, write a factor $(E_f - E_i)^{-1}$, where $E_f$ is the sum of
energies of particles going out of the box and $E_i$ is the sum of energies
of particles coming into the box.
(V) For each vertex introduce a factor $\frac{ec\hbar}{\sqrt{(2\pi\hbar)^3 ck}}$, where k is the momentum
of the photon line attached to the vertex.
(VI) Integrate the obtained expression by all independent external momenta
and loop momenta.
8.3.4 Electron-electron scattering

Let us now try to extract some physical information from the above theory.
We will calculate low order terms in the perturbation expansion (7.22) for
the Σ-operator
$$\Sigma_1 = V_1 \tag{8.92}$$
$$\Sigma_2 = (V_1 \underline{V_1})^{unp} + (V_1 \underline{V_1})^{ph} + (V_1 \underline{V_1})^{ren} \tag{8.93}$$
To obtain corresponding contributions to the S-operator we need to take
t-integrals
33
see diagram 8.4(g) in which k is the loop momentum.
34
Incoming lines correspond to annihilation operators; outgoing lines correspond to cre-
ation operators.
$$S = 1 + \underbrace{\Sigma_1} + \underbrace{\Sigma_2} + \ldots$$
Note that the right hand side of (8.92) and the first term on the right hand
side of (8.93) are unphys, so, due to equation (8.64), they do not contribute
to the S-operator. For now, we also ignore the contribution of the renorm
term in (8.93).35 Then we obtain in the 2nd perturbation order
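$$S_2 = \underbrace{(V_1 \underline{V_1})^{ph}} + \ldots \tag{8.94}$$
Operator $V_1 \underline{V_1}$ has several terms corresponding to different scattering
processes. Some of them were calculated in subsection 8.3.2. For example, terms
of the type a† c† ac (see Figs. 8.4(c) and (f)) annihilate an electron and a
photon in the initial state and recreate them (with different momenta) in the
final state. So, these terms describe the electron-photon (Compton)
scattering. Let us consider in more detail the electron-electron scattering term
a† a† aa described by the diagram in Fig. 8.4(e). According to the rules (I) -
(VI), this diagram can be written algebraically as
$$\mathrm{Fig.\,8.4(e)} = \frac{\hbar^2 e^2 c}{(2\pi\hbar)^3} \int \frac{d\mathbf{p}\, d\mathbf{q}\, d\mathbf{k}}{k(ck + \omega_{\mathbf{p}-\mathbf{k}} - \omega_{\mathbf{p}})}\, a^{\dagger}_{\mathbf{p}-\mathbf{k}} a^{\dagger}_{\mathbf{q}+\mathbf{k}} a_{\mathbf{p}} a_{\mathbf{q}}$$
The t-integral of this expression is
$$\underbrace{\mathrm{Fig.\,8.4(e)}} = -\frac{2\pi i e^2 \hbar^2 c}{(2\pi\hbar)^3} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{k}\, \frac{\delta(\omega_{\mathbf{p}-\mathbf{k}} + \omega_{\mathbf{q}+\mathbf{k}} - \omega_{\mathbf{q}} - \omega_{\mathbf{p}})}{k(ck + \omega_{\mathbf{p}-\mathbf{k}} - \omega_{\mathbf{p}})}\, a^{\dagger}_{\mathbf{p}-\mathbf{k}} a^{\dagger}_{\mathbf{q}+\mathbf{k}} a_{\mathbf{p}} a_{\mathbf{q}} \tag{8.95}$$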
The delta function in (8.95) expresses the conservation of energy in the
scattering process. We will also say that expression (8.95) is non-zero only on
the energy shell, which is defined as a solution of the equation
$$\omega_{\mathbf{p}-\mathbf{k}} + \omega_{\mathbf{q}+\mathbf{k}} = \omega_{\mathbf{q}} + \omega_{\mathbf{p}}$$
35 In fact, in a more consistent theory the renorm term (V1 V1 )ren should have
been canceled by a renormalization counterterm, as will be discussed in
chapter 10.
In the non-relativistic approximation (p, q ≪ mc)
$$\omega_{\mathbf{p}} \equiv \sqrt{p^2 c^2 + m^2 c^4} \approx mc^2 + \frac{p^2}{2m} \tag{8.96}$$
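Then in the limit of small momentum transfer (k ≪ mc) the denominator in
(8.95) can be approximated as
$$k(ck + \omega_{\mathbf{p}-\mathbf{k}} - \omega_{\mathbf{p}}) \approx k\left(ck + mc^2 + \frac{(\mathbf{p} - \mathbf{k})^2}{2m} - mc^2 - \frac{p^2}{2m}\right) \approx ck^2 \tag{8.97}$$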
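The quality of the approximation (8.97) can be checked numerically. The
following sketch is an illustration only: the units m = c = 1, the sample
momenta, and the function names are our assumptions, not part of the theory.

```python
import math

m, c = 1.0, 1.0  # assumed units with m = c = 1


def omega(p):
    """Relativistic energy sqrt(p^2 c^2 + m^2 c^4) of a 3-momentum p."""
    p2 = sum(x * x for x in p)
    return math.sqrt(p2 * c**2 + m**2 * c**4)


def exact_denominator(p, k):
    """Left hand side of (8.97): k (ck + omega_{p-k} - omega_p)."""
    k_abs = math.sqrt(sum(x * x for x in k))
    p_minus_k = [a - b for a, b in zip(p, k)]
    return k_abs * (c * k_abs + omega(p_minus_k) - omega(p))


def approx_denominator(k):
    """Right hand side of (8.97): c k^2."""
    return c * sum(x * x for x in k)


p = (0.01, 0.0, 0.0)       # p << mc
k = (0.0006, 0.0008, 0.0)  # k << mc
ratio = exact_denominator(p, k) / approx_denominator(k)
print(ratio)
```

For these sample values the ratio differs from 1 by less than one percent; the
deviation shrinks further as p/(mc) and k/(mc) are made smaller.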
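Substituting this result in (8.95), we obtain the second order contribution to
the S-operator
$$S_2[a^{\dagger} a^{\dagger} a a] \approx -\frac{ie^2}{4\pi^2 \hbar} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{k}\, \frac{\delta(\omega_{\mathbf{p}-\mathbf{k}} + \omega_{\mathbf{q}+\mathbf{k}} - \omega_{\mathbf{q}} - \omega_{\mathbf{p}})}{k^2}\, a^{\dagger}_{\mathbf{p}-\mathbf{k}} a^{\dagger}_{\mathbf{q}+\mathbf{k}} a_{\mathbf{q}} a_{\mathbf{p}} \tag{8.98}$$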
8.3.5 Effective potential

We have discussed in subsection 7.2.1 that the same scattering matrix can
correspond to different total Hamiltonians with different interaction poten-
tials. Here we would like to demonstrate this idea by constructing the fol-
lowing 2nd order effective interaction between our model electrons
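$$V_2^{eff} = \frac{e^2}{(2\pi)^3 \hbar} \int \frac{d\mathbf{p}\, d\mathbf{q}\, d\mathbf{k}}{k^2}\, a^{\dagger}_{\mathbf{p}-\mathbf{k}} a^{\dagger}_{\mathbf{q}+\mathbf{k}} a_{\mathbf{q}} a_{\mathbf{p}} \tag{8.99}$$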
With this interaction the 2nd order amplitude for electron-electron scattering
is obtained by the usual formula (7.20), keeping only the first term in F
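$$S_2^{eff} = \underbrace{V_2^{eff}} = -\frac{ie^2}{4\pi^2 \hbar} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{k}\, \frac{\delta(\omega_{\mathbf{p}-\mathbf{k}} + \omega_{\mathbf{q}+\mathbf{k}} - \omega_{\mathbf{q}} - \omega_{\mathbf{p}})}{k^2}\, a^{\dagger}_{\mathbf{p}-\mathbf{k}} a^{\dagger}_{\mathbf{q}+\mathbf{k}} a_{\mathbf{q}} a_{\mathbf{p}}$$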
This is the same result as (8.98), in spite of the fact that the new effective
Hamiltonian H0 + V2ef f is completely different from the original Hamilto-
nian (8.86). In particular, it is important to note that interaction (8.99)
is phys, while (8.87) is unphys. The replacement of an unphys interaction
with a scattering-equivalent effective phys potential is the central idea of the
“dressed particle” approach to quantum field theory that will be introduced
in chapter 11.
From equations (8.75) and (B.7) it follows that interaction (8.99) corre-
sponds to the ordinary position-space Coulomb potential36
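$$w(\mathbf{r}) = \frac{e^2}{(2\pi)^3 \hbar} \int d\mathbf{k}\, \frac{e^{\frac{i}{\hbar} \mathbf{k} \cdot \mathbf{r}}}{k^2} = \frac{e^2}{4\pi r} \tag{8.100}$$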
So, our toy model is quite realistic.
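The Fourier integral in (8.100) can be verified numerically. After the angular
integration it reduces to 4π(ħ/r) times the radial integral of sin(x)/x over
x = kr/ħ from 0 to infinity. The sketch below evaluates this radial integral
with Simpson's rule. The damping factor exp(-εx), the cutoff x_max, and all
function names are our own assumptions, introduced only to make the
conditionally convergent integral tractable; with the damping, the exact
radial integral is arctan(1/ε), which tends to π/2 as ε → 0.

```python
import math


def damped_dirichlet(eps, x_max, n):
    """Simpson estimate of the integral of sin(x)/x * exp(-eps*x) on [0, x_max].
    The number of subintervals n must be even."""
    def f(x):
        return math.exp(-eps * x) * (math.sin(x) / x if x else 1.0)

    h = x_max / n
    total = f(0.0) + f(x_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


def w_numeric(r, e=1.0, hbar=1.0, eps=0.01):
    """Regularized version of the middle expression in (8.100)."""
    radial = damped_dirichlet(eps, x_max=2000.0, n=200_000)
    prefactor = e**2 / ((2 * math.pi) ** 3 * hbar)
    return prefactor * 4 * math.pi * (hbar / r) * radial


def w_coulomb(r, e=1.0):
    """Right hand side of (8.100)."""
    return e**2 / (4 * math.pi * r)


r = 2.0
print(w_numeric(r), w_coulomb(r))
```

With ε = 0.01 the two numbers agree to better than one percent, and the
residual difference is controlled by the regulator ε rather than by the
integration step.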
8.4 Diagrams in a general theory

8.4.1 Properties of products and commutators
The diagrammatic approach developed for the toy model above can be easily
extended to interactions in the general form (8.49): Each potential VN M with
N creation operators and M annihilation operators can be represented by a
vertex with N outgoing and M incoming lines. In calculations of scattering
operators (7.21) and (7.22) we meet products of such potentials.37
$$Y = V^{(1)} V^{(2)} \ldots V^{(V)} \tag{8.101}$$
As explained in subsection 8.2.2, we should bring these products to the
normal order. The normal ordering transforms (8.101) into a sum of terms $y^{(j)}$
$$Y = \sum_j y^{(j)} \tag{8.102}$$
each of which can be described by a diagram with V vertices.
36 see also equation (8.77)
37 V is the number of potentials in the product.

Each potential $V^{(i)}$ in the product (8.101) is of the standard form (8.50)
and has $N^{(i)}$ creation operators, $M^{(i)}$ annihilation operators, and
$N^{(i)} + M^{(i)}$ momentum integrals. Then each term $y^{(j)}$ in the expansion
(8.102) has
$$N = \sum_{i=1}^{V} (N^{(i)} + M^{(i)}) \tag{8.103}$$
integrals and independent momentum integration variables. This term also

has a product of V delta functions, which express the conservation of the total
momentum in each of the factors V (i) . In the process of normal ordering of the
product (8.101), a certain number of pairs of external lines in the factors V (i)
have to be joined together to make internal lines and to introduce additional
delta-functions. Let us denote by I the number of such delta functions in
y (j) . Then the total number of delta functions in y (j) is
Nδ = V + I (8.104)
and the number of external lines is
E = N − 2I (8.105)
The terms y (j) in the normally ordered product (8.102) can be either discon-
nected or connected. In the latter case there is a continuous sequence (path)
of internal lines connecting any two vertices. In the former case such a path
does not exist, and the diagram splits into several separated pieces.
For illustration consider a product of two potentials $V^{(1)} = V_1$ and
$V^{(2)} = \underline{V_1}$ from our example (8.89). The expansion of this product into a
sum of normally ordered terms is shown diagrammatically in Figures 8.4(d) - (g)
$$V^{(1)} V^{(2)} = \sum_j y^{(j)} \tag{8.106}$$
There is only one disconnected term 8.4(d) in the sum on the right hand side.
Let us denote this term $y^{(0)} \equiv (V^{(1)} V^{(2)})_{disc}$. This is the term in which the
factors from the original product are simply rearranged and no pairings are
introduced. All other terms $y^{(1)}, y^{(2)}, \ldots$ on the right hand side of (8.106)
are connected, because they have at least one pairing. These pairings are
represented in diagrams 8.4(e) - (g) by one or more internal lines connecting
vertices $V^{(1)}$ and $V^{(2)}$.
Lemma 8.8 The disconnected part of a product of two connected bosonic

operators38 does not depend on the order of the product
(V (1) V (2) )disc = (V (2) V (1) )disc (8.107)
Proof. Operators V (1) V (2) and V (2) V (1) differ only by the order of particle
operators. So, after all particle operators are brought to the normal order in
(V (1) V (2) )disc and (V (2) V (1) )disc , they may differ, at most, by a sign. So, our
goal is to show that this sign is plus (+). Any reordering of boson particle
operators does not affect the sign of an expression, so for our proof we do
not need to pay attention to creation and annihilation operators of bosons
in the two factors. Let us now focus only on fermion particle operators
in V (1) and V (2) . For simplicity, we will assume that only electron and/or
positron particle operators are present in V (1) and V (2) . The inclusion of
the proton and antiproton operators will not change anything in this proof,
except its length. For the two factors $V^{(i)}$ (where i = 1, 2) let us denote
$N_e^{(i)}$ the numbers of electron creation operators, $N_p^{(i)}$ the numbers of
positron creation operators, $M_e^{(i)}$ the numbers of electron annihilation
operators, and $M_p^{(i)}$ the numbers of positron annihilation operators. Taking
into account that $V^{(i)}$ are assumed to be normally ordered, we may formally
write
$$V^{(1)} \propto [N_e^{(1)}][N_p^{(1)}][M_e^{(1)}][M_p^{(1)}]$$
$$V^{(2)} \propto [N_e^{(2)}][N_p^{(2)}][M_e^{(2)}][M_p^{(2)}]$$
where the bracket $[N_e^{(1)}]$ denotes the product of $N_e^{(1)}$ electron creation
operators from the term $V^{(1)}$, the bracket $[N_p^{(1)}]$ denotes the product of
$N_p^{(1)}$ positron creation operators from the term $V^{(1)}$, etc. Then
38
As discussed in subsection 8.2.3, all potentials considered in this book are bosonic.
$$V^{(1)} V^{(2)} \propto [N_e^{(1)}][N_p^{(1)}][M_e^{(1)}][M_p^{(1)}][N_e^{(2)}][N_p^{(2)}][M_e^{(2)}][M_p^{(2)}] \tag{8.108}$$
$$V^{(2)} V^{(1)} \propto [N_e^{(2)}][N_p^{(2)}][M_e^{(2)}][M_p^{(2)}][N_e^{(1)}][N_p^{(1)}][M_e^{(1)}][M_p^{(1)}] \tag{8.109}$$
Let us now bring particle operators on the right hand side of (8.109) to the
same order as on the right hand side of (8.108). First we move $N_e^{(1)}$ electron
creation operators to the leftmost position in the product. This involves
$N_e^{(1)} M_e^{(2)}$ permutations with electron annihilation operators from the factor
$V^{(2)}$ and $N_e^{(1)} N_e^{(2)}$ permutations with electron creation operators from the
factor $V^{(2)}$. Each of these permutations changes the sign of the disconnected
term, so the acquired factor is $(-1)^{N_e^{(1)}(N_e^{(2)} + M_e^{(2)})}$.
Next we need to move the $[N_p^{(1)}]$ factor to the second position from the
left. The factor acquired after this move is $(-1)^{N_p^{(1)}(N_p^{(2)} + M_p^{(2)})}$. Then we move
the factors $[M_e^{(1)}]$ and $[M_p^{(1)}]$ to the third and fourth places in the product,
respectively. Finally, the total factor acquired by the expression $(V^{(2)} V^{(1)})_{disc}$
after all its terms are rearranged in the same order as in $(V^{(1)} V^{(2)})_{disc}$ is
$$f = (-1)^{K_e^{(1)} K_e^{(2)} + K_p^{(1)} K_p^{(2)}} \tag{8.110}$$
where we denoted
$$K_e^{(i)} \equiv N_e^{(i)} + M_e^{(i)}$$
$$K_p^{(i)} \equiv N_p^{(i)} + M_p^{(i)}$$
the total (= creation + annihilation) numbers of electron and positron
operators, respectively, in the factor $V^{(i)}$. Next, let us show that the power of
(-1) in (8.110) is even, so that f = 1. Indeed, consider the case when $K_e^{(1)}$ is
even and $K_e^{(2)}$ is odd. Then the product $K_e^{(1)} K_e^{(2)}$ is even. From the bosonic
character of $V^{(1)}$ and $V^{(2)}$ it follows that $K_e^{(1)} + K_p^{(1)}$ and $K_e^{(2)} + K_p^{(2)}$ are
even numbers. Therefore $K_p^{(1)}$ is even and $K_p^{(2)}$ is odd, so that the product
$K_p^{(1)} K_p^{(2)}$ is even and the total power of (-1) in (8.110) is even.
The same result is obtained for any other assumption about the even/odd
character of $K_e^{(1)}$ and $K_e^{(2)}$. This proves (8.107).
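The parity argument at the end of this proof can also be checked by brute
force. The sketch below enumerates small operator counts and evaluates the
factor (8.110) for every pair of bosonic factors; the function names and the
bound max_ops are our assumptions, chosen only for illustration.

```python
from itertools import product


def sign_factor(ke1, kp1, ke2, kp2):
    """Factor f of (8.110) for given total numbers of electron (ke) and
    positron (kp) operators in the factors V(1) and V(2)."""
    return (-1) ** (ke1 * ke2 + kp1 * kp2)


def check_lemma(max_ops=6):
    """Return True if f = 1 for all bosonic factors with up to max_ops
    operators of each kind."""
    for ke1, kp1, ke2, kp2 in product(range(max_ops + 1), repeat=4):
        # a bosonic factor contains an even total number of fermion operators
        bosonic = (ke1 + kp1) % 2 == 0 and (ke2 + kp2) % 2 == 0
        if bosonic and sign_factor(ke1, kp1, ke2, kp2) != 1:
            return False
    return True


print(check_lemma())
# For comparison, two fermionic factors (one electron operator each):
print(sign_factor(1, 0, 1, 0))
```

The second printed value illustrates why the bosonic character of the factors
is essential for (8.107).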
Theorem 8.9 A multiple commutator of bosonic potentials is connected.
Proof. Let us first consider a single commutator of two potentials V (1) and
V (2) .
V (1) V (2) − V (2) V (1) (8.111)
According to Lemma 8.8, the disconnected terms (V (1) V (2) )disc and (V (2) V (1) )disc
in the commutator (8.111) are canceled. All other terms in the commutator
are connected. This proves the theorem for a single commutator (8.111).
Since this commutator is also bosonic, repeating the above arguments by
induction, we conclude that all multiple commutators of bosonic operators
are connected.
Lemma 8.10 In a connected diagram the number of independent loops is39
L= I −V +1 (8.112)
Proof. If there are V vertices, they can be connected together without

making loops by V − 1 internal lines. Each additional internal line will make
one independent loop. Therefore, the total number of independent loops is
I − (V − 1).
An example of a connected diagram is shown in Fig. 8.5. This diagram
has V = 4 vertices, E = 5 external lines, I = 7 internal lines, and L = 4
independent loops. This diagram describes a nine-fold momentum integral. Five
integration momentum variables correspond to external lines in the diagram:
These are two incoming momenta p1 , p2 , and three outgoing momenta p3 ,
p4 and p5 . These five integrals/variables are a part of the general expression
for the potential (8.50). Four additional integrals are performed by loop
momenta p6 , p7 , p8 , and p9 . These integrals can be absorbed in the definition
of the coefficient function
$$D_{3,2}(\mathbf{p}_3, \mathbf{p}_4, \mathbf{p}_5; \mathbf{p}_1, \mathbf{p}_2) = \int d\mathbf{p}_6\, d\mathbf{p}_7\, d\mathbf{p}_8\, d\mathbf{p}_9\, D_A(\mathbf{p}_6, \mathbf{p}_7, \mathbf{p}_8, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_6 - \mathbf{p}_7 - \mathbf{p}_8; \mathbf{p}_1, \mathbf{p}_2)$$
$$\times D_B(\mathbf{p}_9, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_7 - \mathbf{p}_8 - \mathbf{p}_9; \mathbf{p}_6, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_6 - \mathbf{p}_7 - \mathbf{p}_8)$$
$$\times D_C(\mathbf{p}_5, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_5 - \mathbf{p}_8 - \mathbf{p}_9; \mathbf{p}_7, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_7 - \mathbf{p}_8 - \mathbf{p}_9)$$
$$\times D_D(\mathbf{p}_3, \mathbf{p}_4; \mathbf{p}_8, \mathbf{p}_9, \mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_5 - \mathbf{p}_8 - \mathbf{p}_9)\, \frac{1}{E_A (E_C + E_D)} \tag{8.113}$$
where EA , EC , and ED are energy functions of the corresponding vertices
and DA , DB , DC , DD are coefficient functions at the vertices.
39 Here V is the number of vertices and I is the number of internal lines in
the diagram.
Figure 8.5: A diagram representing one term in the 4th order product of a
hypothetical theory, with vertices labeled A, B, C, D from bottom to top. Here
we do not draw the order bars as in subsection 8.3.2. However, we draw all
outgoing lines on the top of the diagram and all incoming lines at the bottom
to indicate that the diagram is normally ordered. Note that all internal lines
are oriented upwards because all paired operators (i.e., those operators whose
order should be changed by the normal ordering procedure) in the product
(8.101) always occur in the order αα† .
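The counting rules (8.103) - (8.105) and (8.112) are easy to automate. The
sketch below is an illustration under our own assumptions: the function names
are hypothetical, and the list of internal lines of Fig. 8.5 is read off from
the arguments of the coefficient functions in (8.113). For diagrams that may
be disconnected we use the standard graph-theoretical generalization
L = I − V + C, where C is the number of connected pieces; this generalization
is not taken from the text above and reduces to (8.112) when C = 1.

```python
def count_components(num_vertices, lines):
    """Number of connected pieces of a diagram (union-find over vertices)."""
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in lines:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(num_vertices)})


def diagram_counts(operators_per_vertex, internal_lines):
    """Counts for a diagram: operators_per_vertex[i] = N(i) + M(i)."""
    V = len(operators_per_vertex)
    I = len(internal_lines)
    N = sum(operators_per_vertex)            # (8.103)
    E = N - 2 * I                            # (8.105)
    C = count_components(V, internal_lines)
    L = I - V + C                            # (8.112) when C = 1
    n_delta = V + I                          # (8.104)
    remaining_integrals = N - E - L
    return {"V": V, "I": I, "N": N, "E": E, "L": L, "C": C,
            "remaining_deltas": n_delta - remaining_integrals}


# Fig. 8.5 with vertices A, B, C, D numbered 0, 1, 2, 3
A, B, C_, D = 0, 1, 2, 3
fig_8_5 = diagram_counts(
    operators_per_vertex=[6, 4, 4, 5],
    internal_lines=[(A, B), (A, B), (A, C_), (A, D), (B, D), (B, C_), (C_, D)],
)
print(fig_8_5)

# Fig. 8.4(d): two three-operator vertices and no internal lines
fig_8_4_d = diagram_counts(operators_per_vertex=[3, 3], internal_lines=[])
print(fig_8_4_d)
```

For Fig. 8.5 the sketch reproduces E = 5 and L = 4 and leaves a single delta
function, while for the disconnected diagram 8.4(d) two delta functions
remain, one per disconnected piece.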
8.4.2 Cluster separability of the S-operator

In agreement with Postulate 6.3 (general cluster separability) and Statement
8.7 (cluster separability of smooth potentials), interactions considered in this
book are cluster-separable. In this subsection we are going to prove that the
S-operator calculated with such interactions is always cluster-separable too.
Physically, this means that if a multiparticle scattering system is separated
into two (or more) distant subsystems, then the result of scattering in each
subsystem will not depend on what is going on in other subsystem(s).
From perturbation formulas for the S-operator in subsection 7.1.2 we
know that, generally, S is a sum of products of interaction potentials like
(8.101). Mathematically, the cluster separability of interactions means that
coefficient functions of interaction potentials V (i) in the product (8.101) are
smooth. If we could show that the product (8.101) itself is a sum of smooth
operators, then the desired cluster separability of the S-operator would follow
directly from Statement 8.7. The question about the smoothness of (8.101)
is not trivial, because bringing such a product to the normal order involves
permutations of particle operators that produce singular delta functions.
The following theorem establishes an important connection between the
smoothness of terms on the right hand side of (8.102) and the connectivity
of corresponding diagrams.
Theorem 8.11 Each term y (j) in the expansion (8.102) of the product of
smooth potentials is smooth if and only if it is represented by a connected
diagram.
Proof. Let us first assume that y (j) is represented by a connected diagram.

We will establish the smoothness of the term y (j) by proving that it can be
represented in the general form (8.50) in which the integrand contains only
one delta function required by the momentum conservation condition and the
coefficient function DN M is smooth.40 From equation (8.103), the original
number of integrals in y (j) is N . Integrals corresponding to E external lines
are parts of the general form (8.50), and integrals corresponding to L loops
can be absorbed into the definition of the coefficient function of y (j). The
number of remaining integrals is then obtained from (8.105) and (8.112)
N′ = N − E − L = I + V − 1 (8.114)
This is just enough integrals to cancel all momentum delta functions (8.104)
except one: Nδ − N′ = (V + I) − (I + V − 1) = 1. This proves that the term
$y^{(j)}$ is smooth.
Inversely, suppose that the term y (j) is represented by a disconnected
diagram with V vertices and I internal lines. Then the number of indepen-
dent loops L is greater than the value I − V + 1 characteristic for connected
diagrams. Then the number of integrations N′ in equation (8.114) is less
than I + V − 1 and the number of delta functions remaining in the integrand
Nδ − N′ is greater than 1. This means that the term y (j) is represented by
expression (8.50) whose coefficient function is singular, therefore the corre-
sponding operator is not smooth.
Theorem 8.11 establishes that smooth operators are represented by con-

nected diagrams and vice versa. In what follows, we will use the terms smooth
and connected as synonyms, when applied to operators.
Putting together Theorems 8.9 and 8.11 we immediately obtain the fol-
lowing important
Theorem 8.12 All terms in a normally ordered multiple commutator of

smooth bosonic potentials are smooth.
This theorem allows us to apply the property of cluster separability to

the S-operator. Let us write the S-operator in the form (7.20)
40
This means that all other delta functions can be integrated out. Note also that
singularities present in y (j) due to energy denominators resulting from t-integrals (8.61)
can be made harmless by employing the “adiabatic switching” trick from subsection 7.1.3.
This trick essentially results in adding small imaginary contributions to each denominator,
which remove the singularities.
$$S = e^{\underbrace{F}} \tag{8.115}$$
where F is a series of multiple commutators (7.21) of smooth bosonic
potentials in V . According to Theorem 8.12, operators $F$ and $\underbrace{F}$ are also
smooth. Then, according to Statement 8.7, operator $\underbrace{F}$ is cluster separable,
and if all particles are divided into two spatially separated groups 1 and
2, the argument of the exponent in (8.115) takes the form of a sum
$$\underbrace{F} \to \underbrace{F^{(1)}} + \underbrace{F^{(2)}}$$
where $\underbrace{F^{(1)}}$ acts only on variables in the group 1, and $\underbrace{F^{(2)}}$ acts only on
variables in the group 2. So, these two operators commute with each other,
and the S-operator separates into the product of two independent factors
$$S \to \exp(\underbrace{F^{(1)}} + \underbrace{F^{(2)}}) = \exp(\underbrace{F^{(1)}}) \exp(\underbrace{F^{(2)}}) = S^{(1)} S^{(2)}$$
This relationship expresses the cluster separability of the S-operator and the
S-matrix: The total scattering amplitude for spatially separated events is
given by the product of individual amplitudes.
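The factorization of the exponent used above relies only on the fact that
operators acting on different groups of variables commute. This can be
illustrated with finite matrices. In the sketch below the 2 × 2 matrices F1
and F2 are arbitrary stand-ins for the operators acting in groups 1 and 2;
they, the helper functions, and the truncated Taylor series for the
exponential are our assumptions and have no physical meaning.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def expm(a, terms=40):
    """Matrix exponential by a truncated Taylor series (small matrices only)."""
    n = len(a)
    result, term = identity(n), identity(n)
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in matmul(term, a)]
        result = add(result, term)
    return result


F1 = [[0.0, 0.3], [-0.3, 0.0]]   # acts only on group 1
F2 = [[0.1, 0.2], [0.2, -0.1]]   # acts only on group 2
one = identity(2)

# F(1) + F(2) on the combined system, then its exponent
total = add(kron(F1, one), kron(one, F2))
lhs = expm(total)
# product of the two independent factors S(1) S(2)
rhs = kron(expm(F1), expm(F2))

max_diff = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
print(max_diff)
```

The printed difference is at the level of rounding errors, in agreement with
the separation of the S-operator into two independent factors.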
8.4.3 Divergence of loop integrals

In the preceding subsection we showed that S-operator terms described by
connected diagrams are smooth. However, such terms involve loop integrals,
and generally there is no guarantee that these integrals converge. This
problem is evident in our toy model: the loop integral by k in diagram 8.4(g)
is divergent
$$(V_1 \underline{V_1})^{ren} = -\frac{e^2 \hbar^2 c}{(2\pi\hbar)^3} \int d\mathbf{p}\, d\mathbf{k}\, \frac{a^{\dagger}_{\mathbf{p}} a_{\mathbf{p}}}{(\omega_{\mathbf{p}-\mathbf{k}} - \omega_{\mathbf{p}} + ck) k} \tag{8.116}$$
Substituting this result to the right hand side of (8.93) we see that the
S-operator in the second order $\underbrace{\Sigma_2}$ is infinite, which makes it meaningless and
unacceptable.
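The divergence of the loop integral in (8.116) can be seen numerically by
cutting the k-integration off at a finite value and watching the result grow.
The sketch below does this for p = 0 in units with m = c = 1; the cutoff, the
units, the trapezoidal rule, and the function name are our assumptions made
for illustration only.

```python
import math


def loop_integral(cutoff, steps=200_000):
    """Integral over k in (8.116) at p = 0, restricted to |k| < cutoff.
    The angular integration gives the factor 4*pi, and the radial integrand
    is k^2 / ((omega_k - omega_0 + k) k) with omega_k = sqrt(k^2 + 1)."""
    def f(k):
        return k / (math.sqrt(k * k + 1.0) - 1.0 + k) if k else 1.0

    h = cutoff / steps
    total = 0.5 * (f(0.0) + f(cutoff))
    for i in range(1, steps):
        total += f(i * h)
    return 4 * math.pi * total * h


small = loop_integral(1000.0)
large = loop_integral(2000.0)
print(small, large, large / small)
```

Doubling the cutoff approximately doubles the value of the integral, i.e., the
integral grows linearly with the cutoff and has no finite limit.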
The appearance of divergences in perturbative formulas for the S-operator

is a commonplace in quantum field theories. So, we need to understand
this phenomenon better. In this subsection, we will formulate a sufficient
condition under which loop integrals are convergent. We will find this result
useful in our discussion of the renormalization of QED in chapter 10 and in
our construction of a divergence-free theory in section 11.1.
Let us consider, for example, the diagram in Fig. 8.5. There are three
different reasons why loop integrals may diverge there:
(I) The coefficient functions DA , DB , . . . of interaction vertices in (8.113)

may contain singularities. One example is interaction (8.87), which is
singular at k = 0. Such singularities are usually related to the vanishing
photon mass. They correspond to the so-called infrared divergences of
loop integrals. We will discuss them in greater detail in chapters 10
and 14.
(II) There can be also singularities due to zeroes in energy denominators

EA and EC +ED . The energy denominators may be rendered finite and
harmless if we use the adiabatic switching prescription from subsection
7.1.3.41
(III) The coefficient functions DA , DB , . . . may not decay fast enough at

large values of loop momenta, so that the integrals may be divergent due
to the infinite integration range. These ultraviolet divergences present
more serious problems, which we are going to discuss here in some
detail.
In particular, we would like to prove the following
Theorem 8.13 If coefficient functions of potentials decay sufficiently rapidly

(e.g., exponentially) when arguments move away from the energy shell, then
all loop integrals converge.
Idea of the proof. Equation (8.113) is an integral in a 12-dimensional

space of 4 loop momenta p6 , p7 , p8 , and p9 . Let us denote this space Ξ.
Consider for example the dependence of the integrand in (8.113) on the loop
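momentum p9 as p9 → ∞ and all other momenta fixed. Note that we
have chosen integration variables in Fig. 8.5 in such a way that each loop
momentum is present only in the internal lines forming the corresponding
loop, e.g., momentum p9 is confined to the loop BDCB, and the energy
function EA of the vertex A does not depend on p9 . Such a selection of
integration variables can be done for any arbitrary diagram. Taking into
account that at large values of momentum ωp ≈ cp, we obtain in the limit
p9 → ∞
$$\begin{aligned}
E_A &\to \mathrm{const}, \\
E_B &= \omega_{\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_7 - \mathbf{p}_8 - \mathbf{p}_9} + \omega_{\mathbf{p}_9} - \omega_{\mathbf{p}_6} - \omega_{\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_6 - \mathbf{p}_7 - \mathbf{p}_8} \approx 2cp_9 \to \infty, \\
E_C &= \omega_{\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_5 - \mathbf{p}_8 - \mathbf{p}_9} + \omega_{\mathbf{p}_5} - \omega_{\mathbf{p}_7} - \omega_{\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_7 - \mathbf{p}_8 - \mathbf{p}_9} \to \mathrm{const}, \\
E_D &= \omega_{\mathbf{p}_3} + \omega_{\mathbf{p}_4} - \omega_{\mathbf{p}_8} - \omega_{\mathbf{p}_9} - \omega_{\mathbf{p}_1 + \mathbf{p}_2 - \mathbf{p}_5 - \mathbf{p}_8 - \mathbf{p}_9} \approx -2cp_9 \to -\infty.
\end{aligned}$$
41 see also footnote on page 291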
So, in this limit, according to the condition of the theorem, the coefficient
functions at vertices B and D tend to zero rapidly, e.g., exponentially. So,
the loop integral on p9 converges. In order to prove the convergence of
all four loop integrals, we need to make sure that the same rapid decay is
characteristic for all directions in the space Ξ. Here are some arguments that
this is, indeed, true.
The above analysis is applicable to all loop variables: Any loop has a
bottom vertex (vertex B in our example), a top vertex (vertex D in our
example) and possibly a number of intermediate vertices (vertex C in our
example). As the loop momentum goes to infinity, energy functions of the top
and bottom vertices tend to infinity, i.e., move away from the energy shell.
This ensures a fast (e.g., exponential) decay of the corresponding coefficient
function.
Now we can take an arbitrary direction to infinity in the space Ξ. Along
this direction, there is at least one loop momentum which goes to infinity.
Then there is at least one energy function (EA , EB , EC , or ED ) which grows
linearly, while others stay constant (in the worst case). Therefore, according
to the condition of the theorem, the integrand decreases rapidly (e.g., expo-
nentially) along this direction. Thus the integrand rapidly tends to zero in
all directions in Ξ, and integral (8.113) converges.
In chapter 10 we will see that in realistic theories, like QED, the asymp-
totic decay of the coefficient functions of potentials at large momenta is not
fast enough, so Theorem 8.13 is not applicable, and loop integrals usually
diverge. A detailed discussion of such divergences in quantum field theory
and their elimination will be presented in chapter 10, in section 11.1, and in
section 14.2.
Chapter 9
QUANTUM ELECTRODYNAMICS
If it turned out that some physical system could not be described
by a quantum field theory, it would be a sensation; if it turned
out that the system did not obey the rules of quantum mechanics
and relativity, it would be a cataclysm.
Steven Weinberg

So far we have developed a general formalism of quantum theory in the
Fock space. We emphasized that any such theory must obey, at least, three
important requirements:
• the theory must be relativistically invariant, in the instant form of
dynamics;
• the interaction must be cluster separable;
• the theory must allow for processes involving creation and annihilation
of particles.
We have considered a few model examples, but they were purely academic
and not directly relevant to real systems observed in nature. The reason for
such inadequacy was that our models failed to satisfy all three requirements
mentioned above simultaneously.
For example, in subsection 6.3.6 we constructed an interacting model that
explicitly satisfied the requirement of relativistic invariance. We also
managed to ensure that the model is cluster separable in the 3-particle sector.
In principle, by following this approach one can build cluster-separable in-
teractions in all n-particle sectors. There is even a possibility for describing
systems with variable number of particles [Pol03]. However, the resulting
formalism is very cumbersome, and it can be applied only to model systems.
In section 8.3 we considered another toy model of interacting particles,
which was based on the formalism of creation and annihilation operators.
The great advantage of this formalism was that the cluster separability con-
dition could be conveniently expressed in terms of smoothness of interaction
potentials.1 The processes of particle creation and annihilation were easily
described as well. However, the difficult part was to ensure the relativistic
invariance. In our toy model we have not even tried to make the theory
relativistic.
Fortunately, there is a class of theories, which allow one to satisfy all
three conditions listed above. What is even more important, these theories
are directly applicable to realistic physical systems and allow one to achieve
an impressive agreement with experiments. These are quantum field theo-
ries (QFT). A particular version of QFT for describing interactions between
electrically charged particles and photons is called quantum electrodynamics
(QED). This is the topic of our discussion in the present chapter.
In section 9.1 we will write down interaction terms V (potential energy)
and Z (potential boost) in QED. The relativistic invariance of this approach
will be proven in Appendix N.2. In section 9.2 S-matrix elements will be
calculated in the lowest non-trivial order of perturbation theory.
9.1 Interaction in QED

Our goal in this section is to build a realistic interacting representation
U(Λ, a) of the Poincaré group in the Fock space (8.1). In this book we do not
claim to derive QED interactions from first principles. We simply borrow
from the traditional approach the form of four interacting Poincaré
generators H and K in terms of quantum fields for electrons/positrons ψα (x̃),
protons/antiprotons2 Ψα (x̃), and photons Aµ (x̃). Definitions of quantum
fields are given in Appendices J and K.
1
see Statement 8.7
At this point we do not offer any physical interpretation of quantum
fields. For us they are just abstract multicomponent functions from the 4-
dimensional Minkowski space-time M to operators in the Fock space. In our
approach, the only role of quantum fields is to provide convenient “build-
ing blocks” for the construction of Poincaré invariant interactions V and Z.
This attitude was inspired by a non-traditional way of looking at quantum
fields presented in Weinberg’s book [Wei95]. Also, we are not identifying
coordinates x and t in M with positions and times of events measured in
real experiments. The space-time M will be understood as an abstract 4-
dimensional manifold with a pseudo-Euclidean metric. In section 17.4 we will
discuss in more detail the meaning of quantum fields and their arguments
x̃ ≡ (x, t).
9.1.1 Construction of simple quantum field theories

Before approaching QED we will do a warm-up exercise and build a class
of simpler quantum field theories, which will allow us to demonstrate many
important features characteristic of all QFT models. In these simple theories,
relativistic interactions are constructed in three steps [Wei95, Wei64b]:
Step 1. For each particle type3 participating in the theory we construct a
quantum field, which is a multicomponent operator-valued function4 φi (x, t)
defined on an abstract Minkowski space-time M5 and satisfying the following
conditions:
(I) Operator φi (x, t) contains only terms linear in creation or annihilation
operators of the particle and its antiparticle.
(II) Quantum fields are supposed to have simple transformation laws
\[
U_0(\Lambda;\tilde{a})\,\phi_i(\tilde{x})\,U_0^{-1}(\Lambda;\tilde{a}) = \sum_j D_{ij}(\Lambda^{-1})\,\phi_j(\Lambda(\tilde{x}+\tilde{a})) \tag{9.1}
\]
with respect to the non-interacting representation6 U0 (Λ; ã) of the Poincaré
group in the Fock space, where Λ is a boost/rotation, ã is a space-time
translation and Dij is a finite-dimensional representation7 of the
Lorentz group.
2
Throughout this book, protons and antiprotons are treated as simple point charges.
Their internal structures are disregarded as well as their participation in strong nuclear
interactions.
3
A particle and its antiparticle are assumed to belong to the same particle type.
4
This means that for each value of its arguments (x, t) and index i, the symbol φ
denotes an operator acting in the Fock space.
5
see Appendix I.1
(III) Quantum fields turn to zero at x-infinity, i.e.
\[
\lim_{|\mathbf{x}|\to\infty}\phi_i(\mathbf{x},t) = 0 \tag{9.2}
\]
(IV) Fermionic fields (i.e., fields for particles with half-integer spin) φi (x̃)
and φj (x̃′ ) are required to anticommute if (x̃ − x̃′ ) is a space-like
4-vector, or equivalently
\[
\{\phi_i(\mathbf{x},t),\phi_j(\mathbf{y},t)\} = 0 \quad \text{if } \mathbf{x}\neq\mathbf{y} \tag{9.3}
\]
Fermionic quantum fields for electrons-positrons ψα (x̃) and
protons-antiprotons Ψα (x̃) are constructed and analyzed in Appendix J.
(V) Bosonic fields (i.e., fields for particles with integer spin or helicity) at
points x̃ and x̃′ are required to commute if (x̃ − x̃′ ) is a space-like
4-vector, or equivalently
\[
[\phi_i(\mathbf{x},t),\phi_j(\mathbf{y},t)] = 0 \quad \text{if } \mathbf{x}\neq\mathbf{y} \tag{9.4}
\]
Bosonic quantum field for photons Aµ (x̃) is discussed in Appendix K.

6
7
The representation Dij is definitely non-unitary, because the Lorentz group is non-
compact and it is known that non-compact groups cannot have finite-dimensional unitary
representations.
Step 2. Having at our disposal quantum fields φi (x̃), ψj (x̃), χk (x̃), . . . for all
particles we can build the potential energy density
\[
V(\tilde{x}) \equiv V(\mathbf{x},t) = \sum_n V_n(\mathbf{x},t) \tag{9.5}
\]
in the form of a polynomial where each term is a product of fields at the
same (x, t) point
\[
V_n(\mathbf{x},t) = \sum_{i,j,k,\ldots} G^n_{ijk\ldots}\,\phi_i(\mathbf{x},t)\,\psi_j(\mathbf{x},t)\,\chi_k(\mathbf{x},t)\ldots \tag{9.6}
\]
and coefficients Gnijk... are such that V (x̃)
(I) is a bosonic8 Hermitian operator function on the space-time M;
(II) transforms as a scalar with respect to the non-interacting representa-

tion of the Poincaré group:
U0 (Λ; ã)V (x̃)U0−1 (Λ; ã) = V (Λx̃ + Λã) (9.7)
From properties (9.3) - (9.4) and the bosonic character of V (x̃) it is easy
to prove that V (x̃) commutes with itself at space-like separations, e.g.,
\[
[V(\mathbf{x},t),V(\mathbf{y},t)] = 0 \quad \text{if } \mathbf{x}\neq\mathbf{y} \tag{9.8}
\]
Step 3. Instant-form interactions in the Hamiltonian and boost operator are
obtained by integrating the potential energy density (9.5) on x and setting
t = 0
\[
H = H_0 + V = H_0 + \int d\mathbf{x}\,V(\mathbf{x},0) \tag{9.9}
\]
\[
\mathbf{K} = \mathbf{K}_0 + \mathbf{Z} = \mathbf{K}_0 + \frac{1}{c^2}\int d\mathbf{x}\,\mathbf{x}\,V(\mathbf{x},0) \tag{9.10}
\]
Non-interacting generators H0 , P0 , J0 , and K0 can be found in (8.29),
(8.30), (8.32), and (8.35), respectively. With these definitions, the
commutation relations of the Poincaré Lie algebra are proved in Appendix N.1.
8
i.e., there is an even number of fermionic fields in each product in (9.6)
Coefficient functions of potentials in (9.9) are smooth, so the cluster separa-
bility is established by reference to Statement 8.7. The terms that change
the number of particles can be easily accommodated within the above 3-step
approach. Then all three conditions listed in the beginning of this chapter
are readily satisfied. This explains why quantum field theories are so useful
for describing realistic physical systems.
9.1.2 Interaction operators in QED

Unfortunately, formulas (9.9) and (9.10) work only for the simplest QFT
models. More interesting cases, such as QED, require some modifications of this
scheme. In particular, the presence of the additional term Ωµ (x̃, Λ) in the
transformation law of the photon field (K.23) does not allow us to define the
boost interaction in QED by the simple formula (9.10). Let us now postulate
QED interaction operators without proof.
The total Hamiltonian of QED has the usual form
H = H0 + V (9.11)
where the non-interacting Hamiltonian H0 is that from equation (8.29) and the
interaction is composed of two terms9
\[
V = V_1 + V_2 \tag{9.12}
\]
The first order interaction is a pseudoscalar product of two 4-component
quantities. One of them is the 4-vector fermion current operator j̃(x̃) defined
in Appendix L.1. The other is the photon quantum field Ã(x̃)10
\[
V_1 = \frac{1}{c}\int d\mathbf{x}\,\tilde{j}(\mathbf{x},0)\cdot\tilde{A}(\mathbf{x},0)
\equiv \frac{1}{c}\int d\mathbf{x}\,j_\mu(\mathbf{x},0)A^\mu(\mathbf{x},0)
= \frac{1}{c}\int d\mathbf{x}\,\mathbf{j}(\mathbf{x},0)\cdot\mathbf{A}(\mathbf{x},0) \tag{9.13}
\]
9
Here and in what follows we denote the power of the coupling constant e (the
perturbation order of an operator) by a subscript, i.e., H0 is zero order, V1 is first order, V2 is
second order, etc.
10
Here we mark the photon quantum field Ã by tilde as if it were a 4-vector. However,
as shown in Appendix K.6, the components of Ã do not transform by 4-vector rules. The
last equality in (9.13) follows from equation (K.8).
c
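The second order interaction is
\[
V_2 = \frac{1}{2c^2}\int d\mathbf{x}\,d\mathbf{y}\,\frac{j_0(\mathbf{x},0)\,j_0(\mathbf{y},0)}{8\pi|\mathbf{x}-\mathbf{y}|} \tag{9.14}
\]
Interaction in the boost operator
\[
\mathbf{K} = \mathbf{K}_0 + \mathbf{Z} \tag{9.15}
\]
is defined as
\[
\mathbf{Z} = \frac{1}{c^3}\int d\mathbf{x}\,\mathbf{x}\,\mathbf{j}(\mathbf{x},0)\cdot\mathbf{A}(\mathbf{x},0)
+ \frac{1}{2c^4}\int d\mathbf{x}\,d\mathbf{y}\,\frac{\mathbf{x}\,j_0(\mathbf{x},0)\,j_0(\mathbf{y},0)}{8\pi|\mathbf{x}-\mathbf{y}|}
+ \frac{1}{c^3}\int d\mathbf{x}\,j_0(\mathbf{x},0)\,\mathbf{C}(\mathbf{x},0) \tag{9.16}
\]
where components of the operator function C(x, t) are given by equation
(N.7).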
The above operators of energy H and boost K are those usually written
in the Coulomb gauge version of QED [Wei95, Wei64b]. In Appendix N.2 we
prove that this theory is Poincaré invariant. For this proof it is convenient
to represent interaction operators of QED in terms of quantum fields, as
above. However, for some calculations in this book it will be more useful to
express interaction V through particle creation and annihilation operators,
as in chapter 8. To do that, we just need to insert field expansions (J.26)
and (K.2) in equations (9.13) and (9.14). The resulting expressions are rather
long and cumbersome, so this derivation has been moved to Appendix L.
9.2 S-operator in QED

Having at our disposal all 10 generators of the Poincaré group representation
in the Fock space, in principle, we should be able to calculate all physical
quantities related to systems of charged particles and photons. However, this
statement is overly optimistic. In chapters 10 and 11 we will see that the
theory outlined above has serious problems and internal contradictions. In
fact, this theory allows one to calculate only the simplest physical properties,
and only in low perturbation orders. An example of such a calculation will be
given in this section: here we will calculate the S-operator for proton-electron
scattering in the 2nd perturbation order.
9.2.1 S-operator in the second order

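We are interested in S-operator terms of the type d† a† da. It will be convenient
to start this calculation by expanding the phase operator (7.21) in powers
of the coupling constant
\[
\begin{aligned}
F &= F_1 + F_2 + \ldots\\
F_1 &= V_1\\
F_2 &= V_2 - \frac{1}{2}[\underline{V_1}, V_1]\\
&\ldots
\end{aligned} \tag{9.17}
\]
Taking into account that operator V1 is unphys, so that
$\underbrace{F_1} = \underbrace{V_1} = 0$,11
we obtain the following perturbation expansion
\[
\begin{aligned}
S &= e^{\underbrace{F}}
= 1 + \underbrace{F} + \frac{1}{2!}\underbrace{F}\,\underbrace{F} + \ldots\\
&= 1 + \underbrace{F_1} + \underbrace{F_2} + \frac{1}{2!}\underbrace{F_1}\,\underbrace{F_1} + \frac{1}{2!}\underbrace{F_2}\,\underbrace{F_1} + \frac{1}{2!}\underbrace{F_1}\,\underbrace{F_2} + \underbrace{F_3} + \ldots\\
&= 1 + \underbrace{F_2} + \underbrace{F_3} \ldots\\
&= 1 + \underbrace{V_2} - \frac{1}{2}\underbrace{[\underline{V_1}, V_1]} + \underbrace{F_3} \ldots
\end{aligned} \tag{9.18}
\]
11
see equation (8.64)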
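Let us first evaluate expression $-\frac{1}{2}[\underline{V_1}, V_1]$ in (9.18). Only the first four
terms in equation (L.8) for V1 are relevant for this calculation:12
\[
\begin{aligned}
V_1 = &-\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{A}^\dagger_\alpha(\mathbf{p}+\mathbf{k})A_\beta(\mathbf{p})C_{\alpha\beta}(\mathbf{k})\\
&-\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{A}^\dagger_\alpha(\mathbf{p}-\mathbf{k})A_\beta(\mathbf{p})C^\dagger_{\alpha\beta}(\mathbf{k})\\
&+\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{D}^\dagger_\alpha(\mathbf{p}+\mathbf{k})D_\beta(\mathbf{p})C_{\alpha\beta}(\mathbf{k})\\
&+\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})D_\beta(\mathbf{p})C^\dagger_{\alpha\beta}(\mathbf{k}) + \ldots
\end{aligned}
\]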
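According to (8.61), the corresponding terms in $\underline{V_1}$ are
\[
\begin{aligned}
\underline{V_1} = &\ \frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{A}^\dagger_\alpha(\mathbf{p}+\mathbf{k})A_\beta(\mathbf{p})C_{\alpha\beta}(\mathbf{k})\frac{1}{\omega_{\mathbf{p}+\mathbf{k}}-\omega_{\mathbf{p}}-ck}\\
&+\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{A}^\dagger_\alpha(\mathbf{p}-\mathbf{k})A_\beta(\mathbf{p})C^\dagger_{\alpha\beta}(\mathbf{k})\frac{1}{\omega_{\mathbf{p}-\mathbf{k}}-\omega_{\mathbf{p}}+ck}\\
&-\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{D}^\dagger_\alpha(\mathbf{p}+\mathbf{k})D_\beta(\mathbf{p})C_{\alpha\beta}(\mathbf{k})\frac{1}{\Omega_{\mathbf{p}+\mathbf{k}}-\Omega_{\mathbf{p}}-ck}\\
&-\frac{e}{(2\pi\hbar)^{3/2}}\int d\mathbf{k}\,d\mathbf{p}\,\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})D_\beta(\mathbf{p})C^\dagger_{\alpha\beta}(\mathbf{k})\frac{1}{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}+ck}
+ \ldots
\end{aligned} \tag{9.19}
\]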
In order to obtain terms of the type $\overline{D}^\dagger \overline{A}^\dagger D A$ in the expression $[\underline{V_1}, V_1]$,
we need to consider four commutators: the 1st term in $\underline{V_1}$ commuting with
the 4th term in $V_1$, the 2nd term in $\underline{V_1}$ commuting with the 3rd term in $V_1$,
the 3rd term in $\underline{V_1}$ commuting with the 2nd term in $V_1$, and the 4th term in
$\underline{V_1}$ commuting with the 1st term in $V_1$. Using commutator (K.12) we then
obtain
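\[
\begin{aligned}
-\frac{1}{2}[\underline{V_1}, V_1]
= &-\frac{e^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{k}'d\mathbf{p}'\,
\overline{A}^\dagger_\alpha(\mathbf{p}+\mathbf{k})A_\beta(\mathbf{p})\overline{D}^\dagger_\gamma(\mathbf{p}'-\mathbf{k}')D_\delta(\mathbf{p}')
\frac{[C_{\alpha\beta}(\mathbf{k}), C^\dagger_{\gamma\delta}(\mathbf{k}')]}{\omega_{\mathbf{p}+\mathbf{k}}-\omega_{\mathbf{p}}-ck}\\
&-\frac{e^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{k}'d\mathbf{p}'\,
\overline{A}^\dagger_\alpha(\mathbf{p}-\mathbf{k})A_\beta(\mathbf{p})\overline{D}^\dagger_\gamma(\mathbf{p}'+\mathbf{k}')D_\delta(\mathbf{p}')
\frac{[C^\dagger_{\alpha\beta}(\mathbf{k}), C_{\gamma\delta}(\mathbf{k}')]}{\omega_{\mathbf{p}-\mathbf{k}}-\omega_{\mathbf{p}}+ck}\\
&-\frac{e^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{k}'d\mathbf{p}'\,
\overline{D}^\dagger_\alpha(\mathbf{p}+\mathbf{k})D_\beta(\mathbf{p})\overline{A}^\dagger_\gamma(\mathbf{p}'-\mathbf{k}')A_\delta(\mathbf{p}')
\frac{[C_{\alpha\beta}(\mathbf{k}), C^\dagger_{\gamma\delta}(\mathbf{k}')]}{\Omega_{\mathbf{p}+\mathbf{k}}-\Omega_{\mathbf{p}}-ck}\\
&-\frac{e^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{k}'d\mathbf{p}'\,
\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})D_\beta(\mathbf{p})\overline{A}^\dagger_\gamma(\mathbf{p}'+\mathbf{k}')A_\delta(\mathbf{p}')
\frac{[C^\dagger_{\alpha\beta}(\mathbf{k}), C_{\gamma\delta}(\mathbf{k}')]}{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}+ck} + \ldots\\
= &\ \frac{e^2\hbar^2c}{4(2\pi\hbar)^3}\int\frac{d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}}{k}\,\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}h_{\mu\nu}(\mathbf{k})\times\\
&\Bigl(-\overline{D}^\dagger_\gamma(\mathbf{p}-\mathbf{k})D_\delta(\mathbf{p})\overline{A}^\dagger_\alpha(\mathbf{q}+\mathbf{k})A_\beta(\mathbf{q})\frac{1}{\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}}-ck}\\
&\ +\overline{D}^\dagger_\gamma(\mathbf{p}+\mathbf{k})D_\delta(\mathbf{p})\overline{A}^\dagger_\alpha(\mathbf{q}-\mathbf{k})A_\beta(\mathbf{q})\frac{1}{\omega_{\mathbf{q}-\mathbf{k}}-\omega_{\mathbf{q}}+ck}\\
&\ -\overline{D}^\dagger_\alpha(\mathbf{p}+\mathbf{k})D_\beta(\mathbf{p})\overline{A}^\dagger_\gamma(\mathbf{q}-\mathbf{k})A_\delta(\mathbf{q})\frac{1}{\Omega_{\mathbf{p}+\mathbf{k}}-\Omega_{\mathbf{p}}-ck}\\
&\ +\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})D_\beta(\mathbf{p})\overline{A}^\dagger_\gamma(\mathbf{q}+\mathbf{k})A_\delta(\mathbf{q})\frac{1}{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}+ck}\Bigr) + \ldots\\
= &\ \frac{e^2\hbar^2c}{4(2\pi\hbar)^3}\int\frac{d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}}{k}\,\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}h_{\mu\nu}(\mathbf{k})\,
\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})D_\beta(\mathbf{p})\overline{A}^\dagger_\gamma(\mathbf{q}+\mathbf{k})A_\delta(\mathbf{q})\times\\
&\Bigl(-\frac{1}{\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}}-ck}+\frac{1}{\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}}+ck}
-\frac{1}{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}-ck}+\frac{1}{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}+ck}\Bigr) + \ldots\\
= &-\frac{e^2\hbar^2c^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\frac{h_{\mu\nu}(\mathbf{k})}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}\,
\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})\overline{A}^\dagger_\gamma(\mathbf{q}+\mathbf{k})D_\beta(\mathbf{p})A_\delta(\mathbf{q})\\
&-\frac{e^2\hbar^2c^2}{2(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\frac{h_{\mu\nu}(\mathbf{k})}{(\tilde{P}-\tilde{K}\div\tilde{P})^2}\,
\overline{D}^\dagger_\alpha(\mathbf{p}-\mathbf{k})\overline{A}^\dagger_\gamma(\mathbf{q}+\mathbf{k})D_\beta(\mathbf{p})A_\delta(\mathbf{q})
\end{aligned} \tag{9.20}
\]
12
Operators A, C, D are defined in (J.49) - (J.56) and (K.11).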
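where we denoted13
\[
(\tilde{p}\div\tilde{q})^2 \equiv (\omega_{\mathbf{p}}-\omega_{\mathbf{q}})^2 - c^2(\mathbf{p}-\mathbf{q})^2 \tag{9.21}
\]
\[
(\tilde{P}\div\tilde{Q})^2 \equiv (\Omega_{\mathbf{p}}-\Omega_{\mathbf{q}})^2 - c^2(\mathbf{p}-\mathbf{q})^2 \tag{9.22}
\]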
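The passage from pairs of energy denominators in (9.19) to the 4-squares (9.21) - (9.22) rests on an elementary identity: with a denoting an energy difference such as ω(q+k) − ω(q), one has −1/(a − ck) + 1/(a + ck) = −2ck/(a² − c²k²). The short sketch below checks this identity in exact rational arithmetic; the function names and the sample values are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction

def pole_pair(a, ck):
    """Sum of two energy denominators: -1/(a - ck) + 1/(a + ck)."""
    return -1 / (a - ck) + 1 / (a + ck)

def four_square(a, ck):
    """4-square a^2 - (ck)^2, where a is an energy difference and ck = c|k|."""
    return a * a - ck * ck

if __name__ == "__main__":
    a, ck = Fraction(3, 7), Fraction(5, 2)   # arbitrary sample values
    lhs = pole_pair(a, ck)
    rhs = -2 * ck / four_square(a, ck)
    print(lhs, rhs, lhs == rhs)
```

The factor 2ck produced by this identity is what turns the prefactor with 1/k and a single power of c into the prefactor with c² in the last equality.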
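Next take into account that we need to know our S-operator only in the
vicinity of the energy shell where
\[
\begin{aligned}
\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}} &= \omega_{\mathbf{q}}-\omega_{\mathbf{q}+\mathbf{k}}\\
(\tilde{P}-\tilde{K}\div\tilde{P})^2 &= (\tilde{q}+\tilde{k}\div\tilde{q})^2
\end{aligned} \tag{9.23}
\]
Also use notation (J.62) - (J.63) in which
\[
\begin{aligned}
\overline{A}^\dagger(\mathbf{q}+\mathbf{k})\gamma^\nu A(\mathbf{q}) &= \frac{mc^2}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}}}\sum_{\sigma\sigma'}U^\nu(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\,a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}a_{\mathbf{q},\sigma'}\\
\overline{D}^\dagger(\mathbf{p}-\mathbf{k})\gamma^\mu D(\mathbf{p}) &= \frac{Mc^2}{\sqrt{\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\sum_{\tau\tau'}W^\mu(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')\,d^\dagger_{\mathbf{p}-\mathbf{k},\tau}d_{\mathbf{p},\tau'}
\end{aligned}
\]
and equation (K.14). Then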
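\[
\begin{aligned}
-\frac{1}{2}[\underline{V_1}, V_1]
= &-\frac{e^2\hbar^2c^2}{(2\pi\hbar)^3}\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,
\frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\times\\
&\frac{h_{\mu\nu}(\mathbf{k})\,U^\nu(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\,W^\mu(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}\,
d^\dagger_{\mathbf{p}-\mathbf{k},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}d_{\mathbf{p},\tau'}a_{\mathbf{q},\sigma'}\\
= &-\frac{e^2\hbar^2c^2}{(2\pi\hbar)^3}\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,
\frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\times\\
&\Bigl[\frac{\mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}
-\frac{(\mathbf{k}\cdot\mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma'))(\mathbf{k}\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau'))}{k^2(\tilde{q}+\tilde{k}\div\tilde{q})^2}\Bigr]
d^\dagger_{\mathbf{p}-\mathbf{k},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}d_{\mathbf{p},\tau'}a_{\mathbf{q},\sigma'}
\end{aligned}
\]
13
Recall that in Appendix I.1 we agreed to denote 4-vectors by the tilde. Then $(\tilde{p}\div\tilde{q})^2$
is a 4-square of the difference between 4-vectors p̃ and q̃, i.e., $(\tilde{p}\div\tilde{q})^2 = (\tilde{p}-\tilde{q})_\mu(\tilde{p}-\tilde{q})^\mu$.
Thus, for example, $(\tilde{q}+\tilde{k}\div\tilde{q})^2 = (\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}})^2 - c^2k^2$.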
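Combining this expression with the term $\overline{D}^\dagger \overline{A}^\dagger D A$ in V2 ,14 we see that
operator F2 in (9.17) takes the form
\[
\begin{aligned}
F_2 = &\ \frac{e^2\hbar^2c^2}{(2\pi\hbar)^3}\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,
\frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\times\\
&\Bigl[-\frac{\mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}
+\frac{(\mathbf{k}\cdot\mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma'))(\mathbf{k}\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau'))}{k^2(\tilde{q}+\tilde{k}\div\tilde{q})^2}\\
&\ -\frac{U^0(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\,W^0(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{c^2k^2}\Bigr]
d^\dagger_{\mathbf{p}-\mathbf{k},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}d_{\mathbf{p},\tau'}a_{\mathbf{q},\sigma'}
\end{aligned} \tag{9.24}
\]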
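From (J.84) we further obtain
\[
\begin{aligned}
(\mathbf{k}\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')) &= -\frac{\Omega_{\mathbf{p}-\mathbf{k}}-\Omega_{\mathbf{p}}}{c}\,W^0(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')
= \frac{\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}}}{c}\,W^0(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')\\
(\mathbf{k}\cdot\mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')) &= \frac{\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}}}{c}\,U^0(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')
\end{aligned}
\]
and
\[
\begin{aligned}
\frac{(\mathbf{k}\cdot\mathbf{U})(\mathbf{k}\cdot\mathbf{W})}{k^2(\tilde{q}+\tilde{k}\div\tilde{q})^2} - \frac{U^0W^0}{c^2k^2}
&= \frac{(\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}})^2\,U^0W^0}{c^2k^2(\tilde{q}+\tilde{k}\div\tilde{q})^2}
- \frac{[(\omega_{\mathbf{q}+\mathbf{k}}-\omega_{\mathbf{q}})^2 - c^2k^2]\,U^0W^0}{c^2k^2(\tilde{q}+\tilde{k}\div\tilde{q})^2}\\
&= \frac{U^0W^0}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}
\end{aligned}
\]
14
the third term in equation (L.11)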
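so that our final expression for the F -operator is
\[
\begin{aligned}
F_2 = &\ \frac{e^2\hbar^2c^2}{(2\pi\hbar)^3}\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,
\frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\times\\
&\frac{U^0(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')W^0(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau') - \mathbf{U}(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}\,
d^\dagger_{\mathbf{p}-\mathbf{k},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}d_{\mathbf{p},\tau'}a_{\mathbf{q},\sigma'}\\
= &\ \frac{e^2\hbar^2c^2}{(2\pi\hbar)^3}\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{k}\,d\mathbf{p}\,d\mathbf{q}\,
\frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}+\mathbf{k}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}-\mathbf{k}}\,\Omega_{\mathbf{p}}}}\,
\frac{U_\mu(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\,W^\mu(\mathbf{p}-\mathbf{k},\tau;\mathbf{p},\tau')}{(\tilde{q}+\tilde{k}\div\tilde{q})^2}\,
d^\dagger_{\mathbf{p}-\mathbf{k},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}d_{\mathbf{p},\tau'}a_{\mathbf{q},\sigma'}
\end{aligned} \tag{9.25}
\]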
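The cancellation of non-covariant terms that leads to (9.25) can be verified with a few lines of exact arithmetic. In the sketch below, delta stands for the energy difference ω(q+k) − ω(q), ck for c|k|, and u0w0 for the product U⁰W⁰; these variable and function names are hypothetical, and the on-shell relations (k·U) = (delta/c)U⁰ and (k·W) = (delta/c)W⁰ quoted from (J.84) are taken as given.

```python
from fractions import Fraction

def noncovariant_sum(delta, ck, u0w0):
    """Longitudinal term minus instantaneous Coulomb term.

    Uses (k.U)(k.W) = (delta/c)^2 U0 W0, so that
    (k.U)(k.W) / (k^2 X) = delta^2 U0 W0 / ((ck)^2 X), with X = delta^2 - (ck)^2.
    """
    x = delta**2 - ck**2
    longitudinal = delta**2 * u0w0 / (ck**2 * x)
    coulomb = u0w0 / ck**2
    return longitudinal - coulomb

def covariant_term(delta, ck, u0w0):
    """Expected result U0 W0 / X, the time-like part of U_mu W^mu / X."""
    return u0w0 / (delta**2 - ck**2)

if __name__ == "__main__":
    delta, ck, u0w0 = Fraction(2, 5), Fraction(7, 3), Fraction(11, 13)
    print(noncovariant_sum(delta, ck, u0w0) == covariant_term(delta, ck, u0w0))
```

Combined with the −U·W term, this yields U⁰W⁰ − U·W, which is the 4-product appearing in the second line of (9.25).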
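Now we insert this result in formula (9.18) for the S-operator. According to
(8.54), in order to perform the integration on t from −∞ to ∞, we just need
to multiply the coefficient function by the factor $-2\pi i\,\delta(E(\mathbf{p},\mathbf{q},\mathbf{k}))$, where
\[
E(\mathbf{p},\mathbf{q},\mathbf{k}) = \omega_{\mathbf{q}+\mathbf{k}} - \omega_{\mathbf{q}} + \Omega_{\mathbf{p}-\mathbf{k}} - \Omega_{\mathbf{p}}
\]
is the energy function. This makes the S-operator non-trivial only on the
energy shell E(p, q, k) = 0. Finally, we can represent the 2nd order scattering
operator in the general form (8.50)
\[
S_2[d^\dagger a^\dagger da] = \sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'d\mathbf{q}'\,
s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')\,
d^\dagger_{\mathbf{p},\tau}a^\dagger_{\mathbf{q},\sigma}d_{\mathbf{p}',\tau'}a_{\mathbf{q}',\sigma'} \tag{9.26}
\]
with the coefficient function
\[
s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')
= -\frac{ie^2c^2}{4\pi^2\hbar}\,\frac{mMc^4}{\sqrt{\omega_{\mathbf{q}}\,\omega_{\mathbf{q}'}\,\Omega_{\mathbf{p}}\,\Omega_{\mathbf{p}'}}}\,
\frac{\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}')\,U^\mu(\mathbf{q},\sigma;\mathbf{q}',\sigma')\,W_\mu(\mathbf{p},\tau;\mathbf{p}',\tau')}{(\tilde{q}-\tilde{q}')^2} \tag{9.27}
\]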
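where the 4-dimensional delta function15
\[
\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}') \equiv \delta(\mathbf{p}+\mathbf{q}-\mathbf{p}'-\mathbf{q}')\,\delta(\Omega_{\mathbf{p}}+\omega_{\mathbf{q}}-\Omega_{\mathbf{p}'}-\omega_{\mathbf{q}'}) \tag{9.28}
\]
guarantees the conservation of both momentum and energy in the collision
process.
Note that s2 (p, q, p′ , q′ ; τ, σ, τ ′ , σ ′ ) in (9.27) is indeed a matrix element of
the S-operator between two 2-particle states
\[
\begin{aligned}
&\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\,S_2[d^\dagger a^\dagger da]\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= \sum_{\pi,\rho,\pi',\rho'}\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\int d\mathbf{s}\,d\mathbf{t}\,d\mathbf{s}'d\mathbf{t}'\,
d^\dagger_{\mathbf{s},\pi}a^\dagger_{\mathbf{t},\rho}\,s_2(\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}';\pi,\rho,\pi',\rho')\,
d_{\mathbf{s}',\pi'}a_{\mathbf{t}',\rho'}\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= \sum_{\pi,\rho,\pi',\rho'}\int d\mathbf{s}\,d\mathbf{t}\,d\mathbf{s}'d\mathbf{t}'\,s_2(\mathbf{s},\mathbf{t},\mathbf{s}',\mathbf{t}';\pi,\rho,\pi',\rho')\,
\delta(\mathbf{s}-\mathbf{p})\delta_{\pi,\tau}\,\delta(\mathbf{t}-\mathbf{q})\delta_{\rho,\sigma}\,\delta(\mathbf{s}'-\mathbf{p}')\delta_{\pi',\tau'}\,\delta(\mathbf{t}'-\mathbf{q}')\delta_{\rho',\sigma'}\\
&= s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')
\end{aligned} \tag{9.29}
\]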
9.2.2 Lorentz invariance of the S-operator

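In this subsection we would like to verify the Lorentz invariance of the
S-operator calculated above. More specifically, in accordance with (7.7) we
would like to check that
\[
e^{-\frac{ic}{\hbar}\mathbf{K}_0\cdot\vec{\theta}}\,S_2[d^\dagger a^\dagger da]\,e^{\frac{ic}{\hbar}\mathbf{K}_0\cdot\vec{\theta}} = S_2[d^\dagger a^\dagger da]
\]
On the left hand side of this equality we apply boosts to particle operators
using formulas (8.36) - (8.37)
\[
\begin{aligned}
&\sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'d\mathbf{q}'\,s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')\,
e^{\frac{ic}{\hbar}\mathbf{K}_0\cdot\vec{\theta}}\,d^\dagger_{\mathbf{p},\tau}a^\dagger_{\mathbf{q},\sigma}d_{\mathbf{p}',\tau'}a_{\mathbf{q}',\sigma'}\,e^{-\frac{ic}{\hbar}\mathbf{K}_0\cdot\vec{\theta}}\\
&= \sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'d\mathbf{q}'\,s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')\times\\
&\quad\frac{\sqrt{\Omega_{\Lambda\mathbf{p}}\Omega_{\mathbf{p}}}}{\Omega_{\mathbf{p}}}\sum_\rho (D^{1/2})^*_{\tau\rho}(-\vec{\phi}_W(\mathbf{p},\Lambda))\,d^\dagger_{\Lambda\mathbf{p},\rho}\;
\frac{\sqrt{\omega_{\lambda\mathbf{q}}\omega_{\mathbf{q}}}}{\omega_{\mathbf{q}}}\sum_\pi (D^{1/2})^*_{\sigma\pi}(-\vec{\phi}_W(\mathbf{q},\lambda))\,a^\dagger_{\lambda\mathbf{q},\pi}\times\\
&\quad\frac{\sqrt{\Omega_{\Lambda\mathbf{p}'}\Omega_{\mathbf{p}'}}}{\Omega_{\mathbf{p}'}}\sum_\eta D^{1/2}_{\tau'\eta}(-\vec{\phi}_W(\mathbf{p}',\Lambda))\,d_{\Lambda\mathbf{p}',\eta}\;
\frac{\sqrt{\omega_{\lambda\mathbf{q}'}\omega_{\mathbf{q}'}}}{\omega_{\mathbf{q}'}}\sum_\chi D^{1/2}_{\sigma'\chi}(-\vec{\phi}_W(\mathbf{q}',\lambda))\,a_{\lambda\mathbf{q}',\chi}
\end{aligned}
\]
where $\omega_{\mathbf{q}} = \sqrt{m^2c^4 + q^2c^2}$, λq is the boost transformation of the electron's
momentum,16 and $\vec{\phi}_W(\mathbf{q},\lambda)$ is the corresponding Wigner angle (5.16). The
analogous proton-related quantities $\Omega_{\mathbf{q}}$, Λq, and $\vec{\phi}_W(\mathbf{q},\Lambda)$ are obtained by
replacing the electron mass m with the proton mass M. Changing summation
and integration variables and denoting R(p, λ)τ and R(p, Λ)σ the results of
Wigner rotations on the electron and proton spin components, respectively,
we obtain17
\[
\begin{aligned}
&= \sum_{\sigma,\tau,\sigma',\tau'}\int d(\Lambda^{-1}\mathbf{p})\,d(\lambda^{-1}\mathbf{q})\,d(\Lambda^{-1}\mathbf{p}')\,d(\lambda^{-1}\mathbf{q}')\,
\frac{\sqrt{\Omega_{\mathbf{p}}\Omega_{\Lambda^{-1}\mathbf{p}}\,\omega_{\mathbf{q}}\omega_{\lambda^{-1}\mathbf{q}}\,\Omega_{\mathbf{p}'}\Omega_{\Lambda^{-1}\mathbf{p}'}\,\omega_{\mathbf{q}'}\omega_{\lambda^{-1}\mathbf{q}'}}}{\Omega_{\Lambda^{-1}\mathbf{p}}\,\omega_{\lambda^{-1}\mathbf{q}}\,\Omega_{\Lambda^{-1}\mathbf{p}'}\,\omega_{\lambda^{-1}\mathbf{q}'}}\times\\
&\quad s_2(\Lambda^{-1}\mathbf{p},\lambda^{-1}\mathbf{q},\Lambda^{-1}\mathbf{p}',\lambda^{-1}\mathbf{q}';R^{-1}(\mathbf{p},\Lambda)\tau,R^{-1}(\mathbf{q},\lambda)\sigma,R(\mathbf{p}',\Lambda)\tau',R(\mathbf{q}',\lambda)\sigma')\,
d^\dagger_{\mathbf{p},\tau}a^\dagger_{\mathbf{q},\sigma}d_{\mathbf{p}',\tau'}a_{\mathbf{q}',\sigma'}\\
&= \sum_{\sigma,\tau,\sigma',\tau'}\int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{p}'d\mathbf{q}'\,
\sqrt{\frac{\Omega_{\Lambda^{-1}\mathbf{p}}\,\omega_{\lambda^{-1}\mathbf{q}}\,\Omega_{\Lambda^{-1}\mathbf{p}'}\,\omega_{\lambda^{-1}\mathbf{q}'}}{\Omega_{\mathbf{p}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}'}\,\omega_{\mathbf{q}'}}}\times\\
&\quad s_2(\Lambda^{-1}\mathbf{p},\lambda^{-1}\mathbf{q},\Lambda^{-1}\mathbf{p}',\lambda^{-1}\mathbf{q}';R^{-1}(\mathbf{p},\Lambda)\tau,R^{-1}(\mathbf{q},\lambda)\sigma,R(\mathbf{p}',\Lambda)\tau',R(\mathbf{q}',\lambda)\sigma')\,
d^\dagger_{\mathbf{p},\tau}a^\dagger_{\mathbf{q},\sigma}d_{\mathbf{p}',\tau'}a_{\mathbf{q}',\sigma'}
\end{aligned}
\]
15
see Appendix M.1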
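Comparing this result with (9.26) we conclude that the Lorentz invariance
condition will be satisfied if the coefficient function can be written in the
form
\[
s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma') \equiv \frac{Mmc^4}{\sqrt{\Omega_{\mathbf{p}}\,\omega_{\mathbf{q}}\,\Omega_{\mathbf{p}'}\,\omega_{\mathbf{q}'}}}\,S_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')
\]
and the new function S2 satisfies a simpler invariance condition
\[
S_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')
= S_2(\Lambda^{-1}\mathbf{p},\lambda^{-1}\mathbf{q},\Lambda^{-1}\mathbf{p}',\lambda^{-1}\mathbf{q}';R^{-1}(\mathbf{p},\Lambda)\tau,R^{-1}(\mathbf{q},\lambda)\sigma,R(\mathbf{p}',\Lambda)\tau',R(\mathbf{q}',\lambda)\sigma') \tag{9.30}
\]
In our case (9.27)
\[
S_2 = -\frac{ie^2c^2}{4\pi^2\hbar}\,\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}')\,\frac{U^\mu(\mathbf{q},\sigma;\mathbf{q}',\sigma')\,W_\mu(\mathbf{p},\tau;\mathbf{p}',\tau')}{(\tilde{q}-\tilde{q}')^2}
\]
16
see formula (5.14)
17
We also used equation (5.25) here.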
As shown in Appendix J.8, the quantities U µ and W µ transform as 4-vectors
under the change of arguments indicated on the right hand side of (9.30). So
the 4-product U µ Wµ stays invariant. The 4-square (q̃ − q̃ ′ )2 is invariant as
well. This proves Lorentz invariance of the 2nd order contribution (9.27) to
the S-operator.
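The invariance of the 4-square (q̃ − q̃′)² used in this argument is easy to confirm numerically. The sketch below boosts two on-shell electron 4-momenta along the x axis and compares the 4-square before and after. It assumes the document's convention in which the 4-square is (ΔE)² − c²(Δp)², and it uses a plain rapidity parametrization of the boost; the function names, the units m = c = 1, and the sample momenta are illustrative assumptions.

```python
import math

def energy(p, m=1.0, c=1.0):
    """On-shell energy sqrt(m^2 c^4 + |p|^2 c^2) for a 3-momentum p."""
    return math.sqrt(m**2 * c**4 + c**2 * sum(x * x for x in p))

def boost_x(e, p, theta, c=1.0):
    """Boost (E, p) along x with rapidity theta; returns the new (E, p)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    e_new = e * ch + c * p[0] * sh
    px_new = p[0] * ch + (e / c) * sh
    return e_new, (px_new, p[1], p[2])

def four_square(e1, p1, e2, p2, c=1.0):
    """(E1 - E2)^2 - c^2 |p1 - p2|^2, the 4-square of the difference."""
    dp2 = sum((a - b) ** 2 for a, b in zip(p1, p2))
    return (e1 - e2) ** 2 - c**2 * dp2

if __name__ == "__main__":
    q, q_prime = (0.3, -0.2, 0.5), (-0.1, 0.4, 0.2)   # arbitrary momenta
    eq, eqp = energy(q), energy(q_prime)
    before = four_square(eq, q, eqp, q_prime)
    bq = boost_x(eq, q, 0.8)
    bqp = boost_x(eqp, q_prime, 0.8)
    after = four_square(bq[0], bq[1], bqp[0], bqp[1])
    print(before, after)
```

The same boost also keeps each momentum on its mass shell, which is why the transformed arguments Λ⁻¹p, λ⁻¹q in (9.30) remain legitimate particle momenta.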
9.2.3 S2 in Feynman-Dyson perturbation theory

The relativistic invariance of the S2 operator looks almost accidental in our
approach. Indeed, we have used the interacting Hamiltonian V1 + V2 , which
did not have simple transformation properties with respect to boosts. We
have also used a non-covariant form (K.14) of the matrix hµν (k) and saw a
lucky cancelation of non-covariant terms. However, as discussed in section
8.5 of [Wei95], this cancelation is actually not accidental. In fact, it is ex-
pected to occur for all processes in all perturbation orders, so that S-matrix
elements become explicitly Lorentz invariant. This observation opens up a
possibility to perform S-matrix calculations much more easily than it has
been done above, while maintaining the manifest covariance at each calcula-
tion step. Such a possibility is realized in the Feynman-Dyson perturbation
theory, which is the method of choice for S-matrix calculations in QFT.
The prescription used in the Feynman-Dyson approach has three differences
with respect to our subsection 9.2.1 [Wei95, Wei64a].
1. Drop the 2nd order interaction V2 in (9.12), so that the full interaction
operator is given simply by V1 in (9.13)18
\[
V_1 = \frac{1}{c}\int d\mathbf{x}\,j_\mu(\mathbf{x},0)A^\mu(\mathbf{x},0)
= \int d\mathbf{x}\,\bigl(-e\overline{\psi}(\mathbf{x},0)\gamma^\mu\psi(\mathbf{x},0) + e\overline{\Psi}(\mathbf{x},0)\gamma^\mu\Psi(\mathbf{x},0)\bigr)A_\mu(\mathbf{x},0) \tag{9.31}
\]
2. In calculations involving photon fields19 use the covariant expression
(gµν ) for the matrix hµν (k) instead of our formula (K.14).
3. Calculate the S-operator with the help of the Feynman-Dyson
perturbation formula (7.17).
We will omit a proof that this approach works in all orders. Let us simply
show how it applies to our example. Here we will repeat calculation of the
2nd order S-matrix element S2 in the Feynman-Dyson formulation. We use
formulas (7.17), (9.29), and (L.1)
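\[
\begin{aligned}
&s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')
= \langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\,S_2\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= -\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\Bigl(\frac{1}{2!\hbar^2}\int_{-\infty}^{+\infty}dt_1dt_2\,T[V_1(t_1)V_1(t_2)]\Bigr)d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= -\frac{1}{2!\hbar^2c^2}\int d^4x_1d^4x_2\,\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\,
T[(J^\mu(\tilde{x}_1)A_\mu(\tilde{x}_1) + \mathcal{J}^\mu(\tilde{x}_1)A_\mu(\tilde{x}_1))
(J^\nu(\tilde{x}_2)A_\nu(\tilde{x}_2) + \mathcal{J}^\nu(\tilde{x}_2)A_\nu(\tilde{x}_2))]\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= -\frac{1}{2!\hbar^2c^2}\int d^4x_1d^4x_2\,\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\,
\bigl(T[J^\mu(\tilde{x}_1)A_\mu(\tilde{x}_1)\mathcal{J}^\nu(\tilde{x}_2)A_\nu(\tilde{x}_2)]
+ T[\mathcal{J}^\mu(\tilde{x}_1)A_\mu(\tilde{x}_1)J^\nu(\tilde{x}_2)A_\nu(\tilde{x}_2)]\bigr)\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle\\
&= \frac{e^2}{\hbar^2}\int d^4x_1d^4x_2\,
\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}\,T[\overline{\psi}(\tilde{x}_1)\gamma^\mu\psi(\tilde{x}_1)A_\mu(\tilde{x}_1)\overline{\Psi}(\tilde{x}_2)\gamma^\nu\Psi(\tilde{x}_2)A_\nu(\tilde{x}_2)]\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle
\end{aligned} \tag{9.32}
\]
18
So, we will call V1 the Feynman-Dyson interaction operator to distinguish it from the
Hamiltonian interaction operator V1 + V2 .
19
commutators (K.12) and photon propagators (K.16)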
If the operator sandwiched between vacuum vectors ⟨0| . . . |0⟩ is converted
to the normal order, then none of its terms will contribute to the matrix
element, except the c-number term. In order to provide such a c-number term,
the operator under the T -symbol should have the structure d† a† da. From
expressions (J.26) and (J.29) for quantum fields ψ and Ψ we conclude that
operator d† (with a corresponding numerical factor) may come only from the
factor Ψ, operator a† comes from ψ and operators d and a come from factors
Ψ and ψ, respectively. In the process of bringing the full operator to the
normal order, creation (annihilation) operators inside the T -symbol change
places with corresponding annihilation (creation) operators outside this sym-
bol. This leaves expressions like (momentum delta function) × (Kronecker
delta symbol of spin labels) × (numerical factor). The delta function and
the Kronecker delta disappear after integration (summation) and only the
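numerical factor is left. For example, the electron creation operator from the
factor $\overline{\psi}_\alpha$ being coupled with the annihilation operator aq,σ results in the
numerical factor
\[
\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{q}}}}\,\exp\Bigl(\frac{i}{\hbar}\tilde{q}\cdot\tilde{x}_i\Bigr)\,\overline{u}_\alpha(\mathbf{q},\sigma) \tag{9.33}
\]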
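After these routine manipulations the coefficient function takes the form20
\[
\begin{aligned}
&s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')\\
&\approx \int d^4x_1d^4x_2\,\frac{e^2}{\hbar^2}\,\frac{mMc^4}{(2\pi\hbar)^6\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}}}\,
\exp\Bigl(\frac{i}{\hbar}\tilde{q}\cdot\tilde{x}_1\Bigr)\exp\Bigl(-\frac{i}{\hbar}\tilde{q}'\cdot\tilde{x}_1\Bigr)\exp\Bigl(\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}_2\Bigr)\exp\Bigl(-\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}_2\Bigr)\times\\
&\quad\overline{u}(\mathbf{q},\sigma)\gamma_\mu u(\mathbf{q}',\sigma')\,\overline{w}(\mathbf{p},\tau)\gamma_\nu w(\mathbf{p}',\tau')\,\langle 0|T[A^\mu(\tilde{x}_1)A^\nu(\tilde{x}_2)]|0\rangle\\
&= \frac{1}{2\pi i}\int d^4x_1d^4x_2d^4s\,\frac{e^2c^2mMc^4}{(2\pi\hbar)^9\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}}}\,
\exp\Bigl(\frac{i}{\hbar}(\tilde{q}-\tilde{q}'-\tilde{s})\cdot\tilde{x}_1\Bigr)\exp\Bigl(\frac{i}{\hbar}(\tilde{p}-\tilde{p}'+\tilde{s})\cdot\tilde{x}_2\Bigr)\times\\
&\quad\overline{u}_a(\mathbf{q},\sigma)\gamma_\mu^{ab}u_b(\mathbf{q}',\sigma')\,\frac{g^{\mu\nu}}{\tilde{s}^2}\,\overline{w}_c(\mathbf{p},\tau)\gamma_\nu^{cd}w_d(\mathbf{p}',\tau')\\
&= -\frac{ie^2c^2mMc^4}{4\pi^2\hbar\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}}}\,\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}')\,
\overline{u}_a(\mathbf{q},\sigma)\gamma_\mu^{ab}u_b(\mathbf{q}',\sigma')\,\frac{g^{\mu\nu}}{(\tilde{q}-\tilde{q}')^2}\,\overline{w}_c(\mathbf{p},\tau)\gamma_\nu^{cd}w_d(\mathbf{p}',\tau')
\end{aligned} \tag{9.34}
\]
\[
= -\frac{ie^2c^2mMc^4}{4\pi^2\hbar\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}}}\,
\frac{\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}')\,U^\mu(\mathbf{q},\sigma;\mathbf{q}',\sigma')\,W_\mu(\mathbf{p},\tau;\mathbf{p}',\tau')}{(\tilde{q}-\tilde{q}')^2} \tag{9.35}
\]
which, as expected, is exactly the same as in the non-covariant approach
(9.27).
20
where the matrix element ⟨0|T [Aµ (x̃1 )Aν (x̃2 )]|0⟩ (the photon propagator) was taken
from (K.17)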
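Using results from Appendix J.9 and assuming that the proton is very
heavy (M → ∞), we obtain this coefficient function in the (v/c)² approximation21
\[
\begin{aligned}
&s_2(\mathbf{p},\mathbf{q},\mathbf{p}',\mathbf{q}';\tau,\sigma,\tau',\sigma')\\
&\approx \frac{ie^2\delta_{\tau,\tau'}}{(2\pi)^2\hbar}\,\frac{mc^2}{\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}}}\,\frac{1}{k^2}\,U^0(\mathbf{q}+\mathbf{k},\sigma;\mathbf{q},\sigma')\\
&\approx \frac{ie^2\delta_{\tau,\tau'}}{(2\pi)^2\hbar}\,\frac{1}{k^2}\Bigl(1-\frac{q^2}{2m^2c^2}-\frac{\mathbf{q}\cdot\mathbf{k}}{2m^2c^2}-\frac{k^2}{4m^2c^2}\Bigr)
\chi^\dagger_\sigma\Bigl(1+\frac{(2\mathbf{q}+\mathbf{k})^2}{8m^2c^2}+\frac{i\vec{\sigma}_{el}\cdot[\mathbf{k}\times\mathbf{q}]}{4m^2c^2}\Bigr)\chi_{\sigma'}\\
&\approx \frac{ie^2\delta_{\tau,\tau'}\delta_{\sigma,\sigma'}}{(2\pi)^2\hbar}\,\frac{1}{k^2}\Bigl(1-\frac{q^2}{2m^2c^2}-\frac{\mathbf{q}\cdot\mathbf{k}}{2m^2c^2}-\frac{k^2}{4m^2c^2}+\frac{q^2}{2m^2c^2}+\frac{\mathbf{q}\cdot\mathbf{k}}{2m^2c^2}+\frac{k^2}{8m^2c^2}\Bigr)
+\frac{ie^2\delta_{\tau,\tau'}}{(2\pi)^2\hbar}\,\frac{1}{k^2}\,\chi^\dagger_\sigma\frac{i\vec{\sigma}_{el}\cdot[\mathbf{k}\times\mathbf{q}]}{4m^2c^2}\chi_{\sigma'}\\
&= \frac{ie^2\delta_{\tau,\tau'}\delta_{\sigma,\sigma'}}{(2\pi)^2\hbar}\Bigl(\frac{1}{k^2}-\frac{1}{8m^2c^2}\Bigr)
-\frac{\alpha\,\delta_{\tau,\tau'}}{4\pi m^2c}\,\chi^\dagger_\sigma\frac{\vec{\sigma}_{el}\cdot[\mathbf{k}\times\mathbf{q}]}{k^2}\chi_{\sigma'}
\end{aligned}
\]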
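The two ingredients of this (v/c)² expansion can be checked numerically. The sketch below is restricted to one-dimensional momenta with q′ = q + k and uses large c to make the neglected 1/c⁴ corrections small; the units, the sample values, and the function names are assumptions made only for illustration. It compares the exact normalization factor mc²/√(ω_q ω_q′) with its expansion, and then confirms that multiplying by the spin-independent part of the second bracket leaves 1 − k²/(8m²c²).

```python
import math

def omega(p, m, c):
    """Relativistic energy sqrt(m^2 c^4 + p^2 c^2) for a 1D momentum p."""
    return math.sqrt(m**2 * c**4 + p**2 * c**2)

def exact_factor(q, k, m, c):
    """Normalization factor m c^2 / sqrt(omega_q * omega_{q+k})."""
    return m * c**2 / math.sqrt(omega(q, m, c) * omega(q + k, m, c))

def expanded_factor(q, k, m, c):
    """Expansion 1 - q^2/(2m^2c^2) - qk/(2m^2c^2) - k^2/(4m^2c^2)."""
    mc2 = (m * c) ** 2
    return 1 - q * q / (2 * mc2) - q * k / (2 * mc2) - k * k / (4 * mc2)

def spinless_bracket(q, k, m, c):
    """Spin-independent part 1 + (2q + k)^2 / (8 m^2 c^2) of the U^0 expansion."""
    return 1 + (2 * q + k) ** 2 / (8 * (m * c) ** 2)

if __name__ == "__main__":
    q, k, m, c = 0.7, 0.4, 1.0, 100.0   # arbitrary sample values
    print(exact_factor(q, k, m, c) - expanded_factor(q, k, m, c))
    product = expanded_factor(q, k, m, c) * spinless_bracket(q, k, m, c)
    print(product - (1 - k * k / (8 * (m * c) ** 2)))
```

Both printed differences are of order 1/c⁴, consistent with the terms that the approximation discards.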
In the extreme non-relativistic approximation, we can ignore terms with c in
denominators and obtain
\[
S_2[d^\dagger a^\dagger da] = \frac{ie^2}{(2\pi)^2\hbar}\sum_{\sigma\tau}\int d\mathbf{p}\,d\mathbf{q}\,d\mathbf{k}\,
\frac{\delta(E(\mathbf{p},\mathbf{q},\mathbf{k}))}{k^2}\,d^\dagger_{\mathbf{p}-\mathbf{k},\tau}d_{\mathbf{p},\tau}a^\dagger_{\mathbf{q}+\mathbf{k},\sigma}a_{\mathbf{q},\sigma} \tag{9.36}
\]
Figure 9.1: Feynman diagram for the electron-proton scattering in the 2nd
perturbation order.
21
We omit the energy delta function and introduce the vector of transferred momentum
k = q′ − q = p − p′ .
This is consistent with our toy model (8.98). The difference in sign is related
to the fact that equation (8.98) describes scattering of two electrons having
the same charge and, therefore, repelling each other, while equation (9.36)
refers to a Coulomb-type attractive electron-proton potential.22
9.2.4 Feynman diagrams

Expression (9.34) for the scattering amplitude can be conveniently
represented by the Feynman diagram shown in Fig. 9.1.23 Here we would like to
formulate general rules for drawing and interpreting Feynman diagrams in
QED. They are called Feynman rules.
The initial state of the electron with momentum q and spin component
σ is represented by the factor24
\[
\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{q}}}}\,u_a(\mathbf{q},\sigma) \tag{9.37}
\]
in (9.34) and by a thin incoming arrow in the diagram 9.1. Similarly, the
final state of the electron is represented by the factor
\[
\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{q}'}}}\,u_b(\mathbf{q}',\sigma') \tag{9.38}
\]
in (9.34) and by a thin outgoing arrow in the diagram 9.1. The incoming and
outgoing proton lines are represented by thick arrows in the diagram and by
factors
\[
\sqrt{\frac{Mc^2}{(2\pi\hbar)^3\Omega_{\mathbf{p}}}}\,w_c(\mathbf{p},\tau), \qquad
\sqrt{\frac{Mc^2}{(2\pi\hbar)^3\Omega_{\mathbf{p}'}}}\,w_d(\mathbf{p}',\tau') \tag{9.39}
\]
respectively. Similarly, incoming and outgoing external photon lines25 with
momentum p and helicity ρ correspond to factors
\[
\frac{\hbar\sqrt{c}}{\sqrt{2p(2\pi\hbar)^3}}\,e_\mu(\mathbf{p},\rho), \qquad
\frac{\hbar\sqrt{c}}{\sqrt{2p(2\pi\hbar)^3}}\,e^*_\mu(\mathbf{p},\rho) \tag{9.40}
\]
respectively.
An internal photon line carrying 4-momentum k̃ corresponds to the
photon propagator in (9.34)26
\[
\frac{\hbar^2c^2g^{\mu\nu}}{2\pi i(2\pi\hbar)^3\tilde{k}^2} \tag{9.41}
\]
22
23
Note that in spite of some similarities, Feynman diagrams are fundamentally different
from diagrams considered in sections 8.3 and 8.4. Feynman diagrams describe expressions
obtained in the Feynman-Dyson perturbation theory, while diagrams from sections 8.3 and
8.4 are for the use in the “old-fashioned” (non-covariant) perturbation theory. In fact, one
Feynman-Dyson diagram is equivalent to a sum of many “old-fashioned” diagrams. This
explains the convenience and popularity of the canonical perturbation theory.
24
This is (9.33) with the exponential factor dropped. Exponential factors coming from
different sources will be collected and analyzed together later.
25
See graph 9.2, where external photon lines are shown by dashed arrows.
26
Compare with (K.17). Here we omit the integration sign and the exponential factor,
which will be tackled later.
It is shown by the dashed line in the diagram 9.1. An internal electron line27
connects two vertices and corresponds to the electron propagator28
\[
\frac{(\not{k} + mc^2)_{bc}}{2\pi i(2\pi\hbar)^3(\tilde{k}^2 - m^2c^4)} \tag{9.42}
\]
Similarly, an internal proton line is associated with the factor
\[
\frac{(\not{k} + Mc^2)_{bc}}{2\pi i(2\pi\hbar)^3(\tilde{k}^2 - M^2c^4)} \tag{9.43}
\]
The number of vertices (V) in the graph is the same as the order of
perturbation theory (=2 in the diagram 9.1). Each vertex is associated with
the factor
\[
\frac{i(2\pi\hbar)^4e\gamma^\mu_{ab}}{\hbar}
\]
This factor has three summation indices. Two bispinor indices a and b are
coupled to indices of fermion factors (9.37) - (9.39) or (9.42) - (9.43). Thus,
in the diagram, each vertex is connected to two fermion lines (either external
or internal). The 4-vector index µ is coupled to indices of either free pho-
ton factor (9.40) or photon propagator (9.41). Correspondingly, each vertex
connects to one photon line (either external or internal). So, we conclude
that each contribution to the QED S-matrix can be represented simply by
drawing a connected29 diagram, whose edges and vertices respect the above
connectivity rules.
Let us now discuss exponential factors and integrations. Each interaction
vertex is associated with a 4-integral $\int d^4x$. Each incoming external particle
line (attached to the vertex with integration variable x̃′ ) is associated with
exponential factor $\exp(\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}')$. Each outgoing external particle line (attached
to the vertex with integration variable x̃) is associated with exponential factor
$\exp(-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x})$. Each internal line carrying 4-momentum p̃ and connecting
vertices marked by x̃ and x̃′ provides exponential factor $\exp(\frac{i}{\hbar}\tilde{p}\cdot(\tilde{x}-\tilde{x}'))$.
So, the full exponential factor that depends on x̃ is $\exp(\frac{i}{\hbar}(\tilde{p}_1+\tilde{p}_2+\tilde{p}_3)\cdot\tilde{x})$,
where p̃i are 4-momenta (with appropriate signs) of the three lines attached
to the vertex. Integrating on d⁴x we obtain the 4-momentum δ-function
$(2\pi\hbar)^4\delta^4(\tilde{p}_1+\tilde{p}_2+\tilde{p}_3)$, which expresses “conservation” of the 4-momentum30
at the interaction vertex. This “conservation” rule helps us to assign
4-momentum labels to all lines: The external lines correspond to real observable
particles, so their energy-momenta are considered given (p0 , p) and (p′0 , p′ )
and they are always “on the mass shell,” i.e., they satisfy conditions
\[
p_0 = \omega_{\mathbf{p}} = \sqrt{m^2c^4 + p^2c^2} \tag{9.44}
\]
\[
p'_0 = \omega_{\mathbf{p}'} = \sqrt{m^2c^4 + (p')^2c^2} \tag{9.45}
\]
27
For brevity, we call them “internal electron lines.” However, in fact, the corresponding
propagator has contributions from both electron and positron operators.
28
Compare with (J.89).
29
As discussed in subsection 8.4.2, we should not consider disconnected diagrams,
because they correspond to non-interesting spatially separated scattering events.
We can arbitrarily assign directions of the momentum flow through internal lines. For each independent loop we should also introduce an additional 4-momentum, which, as we will see, is a dummy integration variable. Then, following the above “conservation” rule, we can assign a unique 4-momentum label to each line in the diagram.
In addition to $d^4x$ integrations discussed above we also have 4-momentum integrals associated with each external line and each internal line (propagator). As we discussed in subsection 8.4.1, these integrations are sufficient to kill all 4-momentum delta functions, leaving just one delta function expressing the conservation of the overall momentum and energy in the scattering process. The number of integrals left is equal to the number of independent loops in the diagram.
To summarize, we have the following rules for writing a V-order matrix element of the scattering operator
1. Draw a Feynman diagram with V vertices, I internal lines, and L = I − V + 1 independent loops. Each vertex in the diagram should be connected to two electron (or proton) lines (either external or internal) and to one photon line (either external or internal). External incoming (outgoing) lines correspond to initial (final) configuration of particles in the considered scattering event. Momenta and spins of particles in these asymptotic states are assumed to be given.
2. Assign arbitrary 4-momentum labels to L internal loop lines.
3. Following the 4-momentum “conservation” rule at each vertex, assign 4-momentum labels to all remaining internal lines.
4. The integrand is now obtained by putting together numerical factors corresponding to all lines and vertices in the diagram as shown in Table 9.1.
5. Integrate the obtained expression on all loop 4-momenta.
6. Multiply by 1/V! (which comes from formula (7.17)) and by an appropriate symmetry factor.31
7. Multiply by a 4D delta function that expresses the conservation of the total energy-momentum in the scattering process.

Table 9.1: The correspondence between elements in a Feynman diagram and factors in the corresponding scattering amplitude.

Diagram element | Factor | Physical interpretation
incoming electron line | $\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}}}}\,\overline{u}_a(\mathbf{p},\sigma)$ | electron in the state $\lvert\mathbf{p},\sigma\rangle$ at $t=-\infty$
outgoing electron line | $\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}}}}\,u_a(\mathbf{p},\sigma)$ | electron in the state $\lvert\mathbf{p},\sigma\rangle$ at $t=+\infty$
incoming photon line | $\frac{\hbar\sqrt{c}}{\sqrt{(2\pi\hbar)^3 2p}}\,e^*_\mu(\mathbf{p},\tau)$ | photon in the state $\lvert\mathbf{p},\tau\rangle$ at $t=-\infty$
outgoing photon line | $\frac{\hbar\sqrt{c}}{\sqrt{(2\pi\hbar)^3 2p}}\,e_\mu(\mathbf{p},\tau)$ | photon in the state $\lvert\mathbf{p},\tau\rangle$ at $t=+\infty$
internal electron line carrying 4-momentum $p^\mu$ | $\frac{(\not p+mc^2)_{ab}}{(2\pi i)(2\pi\hbar)^3(\tilde{p}^2-m^2c^4)}$ | no interpretation
internal photon line carrying 4-momentum $p^\mu$ | $\frac{\hbar^2c^2g_{\mu\nu}}{(2\pi i)(2\pi\hbar)^3\tilde{p}^2}$ | no interpretation
interaction vertex | $\frac{i(2\pi\hbar)^4e}{\hbar}\gamma^\mu_{ab}$ | no interpretation

Figure 9.2: Feynman diagrams for electron-photon (Compton) scattering: (a) and (b) - 2nd order terms, (c) and (d) - selected 4th order terms (see section 10.2).

30
We put the word “conservation” in quotes, because the 4-momenta of “virtual particles” involved here are just integration variables. They cannot be measured and are, therefore, unphysical. See next subsection.
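The counting in rule 1 can be checked with a few lines of code. The following is a minimal sketch, not part of the formalism itself: the function names are hypothetical, and the only inputs are the relation L = I − V + 1 and the fact that each QED vertex joins three line ends.

```python
def independent_loops(vertices: int, internal_lines: int) -> int:
    """Number of independent loops L = I - V + 1 of a connected diagram."""
    if vertices < 1 or internal_lines < vertices - 1:
        raise ValueError("a connected diagram needs at least V - 1 internal lines")
    return internal_lines - vertices + 1


def external_lines(vertices: int, internal_lines: int) -> int:
    """Each QED vertex joins three line ends; an internal line uses two of them."""
    return 3 * vertices - 2 * internal_lines


# Tree-level Compton diagram, Fig. 9.2(a): two vertices, one internal electron line.
print(independent_loops(2, 1), external_lines(2, 1))

# One-loop electron -> electron diagram: two vertices,
# one internal electron line and one internal photon line.
print(independent_loops(2, 2), external_lines(2, 2))
```

The tree diagram leaves no loop 4-momentum to integrate in rule 5, while the second diagram leaves exactly one.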
9.2.5 Compton scattering

As an example of the above rules, consider the Compton scattering elec-
tron+photon → electron+photon. The two 2nd order diagrams describing
this process are shown in Figs. 9.2(a)-(b). According to our rules, the corre-
sponding scattering amplitude is
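$$
\begin{aligned}
&\langle 0|a_{\mathbf{p},\sigma}c_{\mathbf{q},\tau}\,S_2[a^\dagger c^\dagger ac]\,a^\dagger_{\mathbf{p}',\sigma'}c^\dagger_{\mathbf{q}',\tau'}|0\rangle \\
&= \left[\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}}}}\,\overline{u}_a(\mathbf{p},\sigma)\right]
\left[\frac{i(2\pi\hbar)^4e\gamma^\mu_{ab}}{\hbar}\,\frac{\hbar\sqrt{c}\,e^*_\mu(\mathbf{q},\tau)}{\sqrt{(2\pi\hbar)^3 2q}}\right]
\frac{(\not p+\not q+mc^2)_{bc}}{(2\pi i)(2\pi\hbar)^3((\tilde{p}+\tilde{q})^2-m^2c^4)} \\
&\quad\times\left[\frac{i(2\pi\hbar)^4e\gamma^\nu_{cd}}{\hbar}\,\frac{\hbar\sqrt{c}\,e_\nu(\mathbf{q}',\tau')}{\sqrt{(2\pi\hbar)^3 2q'}}\right]
\left[\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}'}}}\,u_d(\mathbf{p}',\sigma')\right]\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}') \\
&+ \left[\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}}}}\,\overline{u}_a(\mathbf{p},\sigma)\right]
\left[\frac{i(2\pi\hbar)^4e\gamma^\nu_{ab}}{\hbar}\,\frac{\hbar\sqrt{c}\,e_\nu(\mathbf{q}',\tau')}{\sqrt{(2\pi\hbar)^3 2q'}}\right]
\frac{(\not p-\not q'+mc^2)_{bc}}{(2\pi i)(2\pi\hbar)^3((\tilde{p}-\tilde{q}')^2-m^2c^4)} \\
&\quad\times\left[\frac{i(2\pi\hbar)^4e\gamma^\mu_{cd}}{\hbar}\,\frac{\hbar\sqrt{c}\,e^*_\mu(\mathbf{q},\tau)}{\sqrt{(2\pi\hbar)^3 2q}}\right]
\left[\sqrt{\frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}'}}}\,u_d(\mathbf{p}',\sigma')\right]\delta^4(\tilde{p}+\tilde{q}-\tilde{p}'-\tilde{q}')
\end{aligned}
$$

31
For example, in our calculation (9.32) the symmetry factor of 2 is needed due to the appearance of two identical expressions under the t-ordering sign. This symmetry factor canceled the 1/2! multiplier exactly. In all S-matrix calculations in this book such cancelation of the symmetry factor and the 1/V! multiplier also occurs.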
From this expression one can obtain the cross section for the elastic electron-
photon scattering. We are not going to reproduce this standard calculation,32
but only note that in the limit of low photon energy one gets exactly the
Thomson cross section formula familiar from classical electrodynamics. Since
the Thomson formula was well tested experimentally, this suggests that all
higher order contributions to our low-energy result should vanish. We will
find this observation useful in our discussion of the charge renormalization
condition in the next chapter.
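For orientation, the size of the Thomson cross section mentioned above can be evaluated numerically. This is only a sketch under stated assumptions: it uses the standard classical expression σ_T = (8π/3) r_e² with r_e = e²/(4πε₀mc²) in SI units and CODATA 2018 constants, neither of which appears in the text above, and the function names are hypothetical.

```python
import math

# CODATA 2018 values in SI units (an assumption of this sketch;
# the formulas in this chapter are written in a different unit system).
E_CHARGE = 1.602176634e-19      # elementary charge, C
EPSILON_0 = 8.8541878128e-12    # vacuum permittivity, F/m
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
C_LIGHT = 299792458.0           # speed of light, m/s


def classical_electron_radius() -> float:
    """r_e = e^2 / (4 pi eps0 m c^2), in meters."""
    return E_CHARGE**2 / (4.0 * math.pi * EPSILON_0 * M_ELECTRON * C_LIGHT**2)


def thomson_cross_section() -> float:
    """sigma_T = (8 pi / 3) r_e^2, in square meters."""
    return 8.0 * math.pi / 3.0 * classical_electron_radius() ** 2


print(f"r_e     = {classical_electron_radius():.4e} m")
print(f"sigma_T = {thomson_cross_section():.4e} m^2")
```

The result is of the order of 10⁻²⁹ m², which is the value that the low-energy limit of the 2nd order Compton amplitude is expected to reproduce.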
9.2.6 Virtual particles?

As we mentioned above, external lines of Feynman diagrams represent asymp-
totic states of real particles. The momentum and spin labels attached to these
lines directly correspond to the values of observables that can be measured
in these states.
In some QFT textbooks one can also read about an interpretation according to which internal lines in Feynman diagrams correspond to so-called “virtual particles.” Thus, internal photon lines correspond to virtual photons and internal electron (positron) lines correspond to virtual electrons (positrons). Clearly, energy-momentum labels attached to internal lines do not satisfy the basic energy-momentum relationships (9.44) - (9.45) characteristic for real particles.33 This is usually described as virtual particles being “out of the mass shell.” In spite of their strange properties, virtual particles are often regarded as real objects and their “exchange” is claimed to be the origin of interactions. For example, diagram 9.1 is interpreted as a depiction of a process in which a virtual photon is “exchanged” between electron and proton, thus leading to their attraction.
32
See, for example, section 5.5 in [PS95b] and section 8.7 in [Wei95].
33
In the example shown in diagram 9.1, the “momentum” of the virtual photon is $\mathbf{q}-\mathbf{q}'$ and its “energy” is $\omega_{\mathbf{q}}-\omega_{\mathbf{q}'}$. It is easy to show that in a general case the usual massless relationship $c|\mathbf{q}-\mathbf{q}'| = \omega_{\mathbf{q}}-\omega_{\mathbf{q}'}$ is not satisfied.
These interpretations are not supported by evidence. They are rather
misleading. In fact, Feynman diagrams are not depictions of particle tra-
jectories or real events. Lines and vertices in Feynman diagrams are simply
graphical representations of certain factors in integrals in scattering ampli-
tudes. Quantum theory does not provide any mechanistic description of
interactions (like “exchange” of virtual particles). The only reliable informa-
tion about interactions is contained in the interaction Hamiltonian, which
does not suggest that some invisible virtual particles are emitted, absorbed,
and exchanged during particle collisions.
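The statement that labels on internal lines are “out of the mass shell” is easy to illustrate numerically. The sketch below uses hypothetical function names and illustrative units with m = c = 1 (an assumption made only for this example). It evaluates the combination (ω_q − ω_q′)² − c²|q − q′|², which would have to vanish for a real massless photon.

```python
import math


def on_shell_energy(p, m, c=1.0):
    """omega_p = sqrt(m^2 c^4 + p^2 c^2) for a 3-momentum p = (px, py, pz)."""
    p2 = sum(x * x for x in p)
    return math.sqrt(m**2 * c**4 + p2 * c**2)


def transfer_invariant(q, q_prime, m, c=1.0):
    """(omega_q - omega_q')^2 - c^2 |q - q'|^2 for the label of the internal photon line."""
    d_energy = on_shell_energy(q, m, c) - on_shell_energy(q_prime, m, c)
    d_p2 = sum((a - b) ** 2 for a, b in zip(q, q_prime))
    return d_energy**2 - c**2 * d_p2


# Electron momentum before and after the collision (illustrative values).
q = (1.0, 0.0, 0.0)
q_prime = (0.0, 1.0, 0.0)
print(transfer_invariant(q, q_prime, m=1.0))
```

In this example the electron energy is unchanged while its momentum is rotated, so the printed value is negative rather than zero: the “energy” and “momentum” attached to the internal line cannot belong to a real photon.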
Chapter 10
RENORMALIZATION
There is no great thing that would not be surmounted by a still

greater thing. There is no thing so small that no smaller thing
could fit into it.
Kozma Prutkov
In the preceding chapter we calculated 2nd order contributions to the S-operator. These were the 2nd and 3rd terms in the perturbation expansion (9.18). It can be demonstrated that the obtained result (9.27) is in fairly good agreement with experiments on electron-proton scattering. Similarly, one can obtain rather accurate 2nd order results for other scattering events, such as the Compton scattering or the electron-positron annihilation. Can we then expect to get even better agreement by evaluating higher order terms in (9.27)? Sadly, the answer to this question is no. As we are going to see in this chapter, many high-order terms in (9.27) are not just inaccurate - they are infinite!
To get a better understanding of this remarkable failure, in Fig. 10.1 we show all 2nd and 4th order Feynman diagrams that are relevant to the electron-proton scattering.1 Generally, all diagrams can be divided into two broad groups: tree diagrams and loop diagrams. In our example there is only one tree diagram 10.1(a). This is the same diagram as 9.1 evaluated in the preceding chapter. In general, all amplitudes whose Feynman diagrams are tree-like (i.e., do not contain loops) come out fairly accurate.
1
Diagrams 10.1(h-k) are obtained from renormalization counterterms that will be discussed in section 10.2.
Serious troubles are associated with loop diagrams, such as 10.1(b-g) in
the 4th perturbation order.2 As we saw in section 8.4, the appearance of loops
is inevitable in high order calculations, and such loops lead to potentially
divergent integrals. There are two types of divergences associated with loops
in QED. First, it can be shown3 that loop integrals in diagrams 10.1(b-c) and
(e-g) diverge due to singularity at zero loop momentum. These are infrared
divergences [Wei95], whose cancelation will be discussed in Chapter 14.2.
Another problem is the divergence of loop integrals 10.1(b-e) at high loop
momenta.4 These are also known as ultraviolet divergences. The way to fix
this problem is provided by the renormalization theory developed by Tomon-
aga, Schwinger, and Feynman in the late 1940’s. Basically, this approach says
that the QED interaction operator (9.13) - (9.14)
V = V1 + V2 (10.1)
is not complete. It must be corrected by adding certain counterterms. The
counterterms are formally infinite operators. However, if they are carefully
selected, then one can cancel the infinities occurring in loop integrals, so
that only some residual finite contributions (radiative corrections) remain in
each perturbation order. These small radiative corrections are exactly what is
needed to obtain scattering cross sections, energies of bound states, and some
other properties in remarkable agreement with experiments. The procedure
of adding counterterms to the interaction operator is called renormalization.5
2
Here we show only diagrams in which loops are associated with electron lines. For a
complete treatment, one should also take into account loops similar to 10.1(b), (c), (g),
but associated with proton lines. However, it can be shown that their contributions to
scattering amplitudes are much smaller due to the inequality m ≪ M. So, we will omit
them in our analysis.
3
see Appendix M
4
5
We should note, however, that the brief story of renormalization presented above is
different from what can be found in most textbooks. The usual explanation of renormalization
involves discussion of the difference between bare and physical particles. While the
former have infinite masses and charges, the latter acquire observable masses and charges
due to formation of virtual clouds. See section 10.4. In our approach we deal exclusively
with physical particles. We formulate the renormalization program as a modification of
the full Hamiltonian.
Figure 10.1: Feynman diagrams (a) - (k) for the electron-proton scattering up to the 4th perturbation order. Thick full lines - protons, thin full lines - electrons, dotted lines - photons. A small circle in (h) and crosses in (i) - (k) denote counterterms that will be discussed in section 10.2.
10.1 Renormalization conditions

In this section we will be interested in rather general physical principles that
underlie the renormalization approach. We will summarize these principles
in the form of two renormalization conditions that are called the no-self-
scattering condition and the charge renormalization condition.
10.1.1 Regularization
As we mentioned above, loop integrals in QED are ultraviolet-divergent
and/or infrared-divergent and it is difficult to do calculations with infinite
quantities. To make things easier, it is convenient to perform regularization.
The idea is to modify the theory by hand in such a way that all loop integrals
are forced to be finite, so that intermediate manipulations do not involve in-
finities. The simplest regularization approach adopted in Appendix M is to
introduce momentum cutoffs in all loop integrals. The cutoffs depend on two parameters having the dimensionality of mass: the ultraviolet cutoff Λ limits integrals at high loop momenta and the infrared cutoff λ controls integrations near low-momentum singularities. Of course, the theory with such truncated integrals cannot be exact. To obtain final results, at the end of calculations the ultraviolet cutoff momentum should be set to infinity Λ → ∞ and the infrared cutoff should be set to zero λ → 0.6 If counterterms in the Hamiltonian are chosen correctly, then in these limits the S-matrix elements should remain finite, accurate, and cutoff-independent.
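The logic of regularization can be illustrated by a toy example. The sketch below is not a QED loop integral: it uses the elementary integral of k/(k² + m²) over 0 ≤ k ≤ Λ, chosen here only because it diverges logarithmically, and a hypothetical subtraction term that plays the role of a counterterm. All names are assumptions of this illustration.

```python
import math


def toy_loop_integral(cutoff: float, mass: float) -> float:
    """Closed form of the integral of k / (k^2 + mass^2) over 0 <= k <= cutoff."""
    return 0.5 * math.log(1.0 + (cutoff / mass) ** 2)


def subtracted(cutoff: float, mass: float, ref_mass: float) -> float:
    """Toy 'renormalized' quantity: the same integral minus a cutoff-dependent subtraction."""
    return toy_loop_integral(cutoff, mass) - toy_loop_integral(cutoff, ref_mass)


for cutoff in (1e2, 1e4, 1e6):
    # The regularized integral grows like ln(cutoff/mass),
    # while the subtracted combination approaches ln(ref_mass/mass).
    print(cutoff, toy_loop_integral(cutoff, 1.0), subtracted(cutoff, 1.0, 2.0))
```

At every finite cutoff both terms are finite, so intermediate manipulations are well defined; only the subtracted combination has a limit as the cutoff is removed.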
10.1.2 No-self-scattering renormalization condition

First we would like to note that the divergence of loop integrals is not the
biggest problem that we face in QED. Even if all loop integrals were conver-
gent, the S-operator might still contain nasty infinities. Let us now consider
in more detail how these infinities appear and what we can do about them.
Recall that QED interaction operator (10.1) has unphys, phys, and renorm terms. So the corresponding scattering phase operator $F$ (see equation (9.18)) must contain terms of the same types. Then we can write the most general expression for the S-operator as

$$S = \exp(\underbrace{F}) = \exp(\underbrace{F^{ph}} + \underbrace{F^{unp}} + \underbrace{F^{ren}}) = \exp(\underbrace{F^{ph}} + \underbrace{F^{ren}}) \tag{10.2}$$

6
In this chapter we will keep λ non-zero. The limit λ → 0 and associated infrared divergences will be discussed in chapter 14.2.
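where we noticed that unphys terms in $F$ do not contribute to the S-operator due to equation (8.64). Let us now apply the scattering operator (10.2) to a one-electron state $a^\dagger_{\mathbf{p},\sigma}|0\rangle$. It follows from Lemma 8.2 that phys operators yield zero when acting on one-particle states. Renorm operators do not change the number of particles. Therefore, we can write

$$
\begin{aligned}
S a^\dagger_{\mathbf{p},\sigma}|0\rangle
&= \exp(\underbrace{F^{ph}} + \underbrace{F^{ren}})\,a^\dagger_{\mathbf{p},\sigma}|0\rangle \\
&= \left(1 + \underbrace{F^{ph}} + \underbrace{F^{ren}} + \frac{1}{2!}(\underbrace{F^{ph}} + \underbrace{F^{ren}})^2 + \ldots\right)a^\dagger_{\mathbf{p},\sigma}|0\rangle \\
&= \left(1 + \underbrace{F^{ph}} + \underbrace{F^{ren}} + \frac{1}{2!}(\underbrace{F^{ph}})^2 + \frac{1}{2!}\underbrace{F^{ph}}\,\underbrace{F^{ren}} + \frac{1}{2!}\underbrace{F^{ren}}\,\underbrace{F^{ph}} + \frac{1}{2!}(\underbrace{F^{ren}})^2 + \ldots\right)a^\dagger_{\mathbf{p},\sigma}|0\rangle \\
&= \left(1 + \underbrace{F^{ren}} + \frac{1}{2!}(\underbrace{F^{ren}})^2 + \ldots\right)a^\dagger_{\mathbf{p},\sigma}|0\rangle \\
&= \exp(\underbrace{F^{ren}})\,a^\dagger_{\mathbf{p},\sigma}|0\rangle
\end{aligned}
\tag{10.3}
$$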
A similar derivation can be performed for a one-photon state $c^\dagger_{\mathbf{p},\tau}|0\rangle$ and the vacuum vector

$$S c^\dagger_{\mathbf{p},\tau}|0\rangle = \exp(\underbrace{F^{ren}})\,c^\dagger_{\mathbf{p},\tau}|0\rangle \tag{10.4}$$

$$S|0\rangle = \exp(\underbrace{F^{ren}})|0\rangle \tag{10.5}$$

So, the “scattering” in these states depends only on the renorm part of $F$. We know from (8.63) that the t-integral $\underbrace{F^{ren}}$ is infinite, even if $F^{ren}$ is itself finite. Therefore, if $F^{ren} \neq 0$ then the S-operator multiplies 0- and 1-particle states by infinite phase factors. This divergence is unacceptable.
Intuitively, we expect single particle states and the vacuum to evolve
freely during the entire time interval from t = −∞ to t = +∞. This means
that there can be no non-trivial scattering in such states. This also means
that the S-operator must be equal to the identity operator when acting on
such states. But this condition is not satisfied in the QED theory presented
so far.
We have two options to deal with this problem. One option is to claim (as
advised in many textbooks) that |0i is not the true (physical) vacuum state
of the system and that a†p,σ |0i, c†p,τ |0i are not true (physical) one-particle
states. These are examples of the so-called “bare” states.8 The real physical
0-particle and 1-particle states should be obtained as linear combinations of
the bare states for which the self-scattering is absent. Then scattering theory
of such physical particles would not have divergences and paradoxes.
In this book we will adopt a different9 point of view. We will maintain
that |0i, a†p,σ |0i, c†p,τ |0i, . . . are true representatives of real physical 0-particle
and 1-particle states. Then our explanation for the divergent results (10.3) -
(10.5) is that we have started to develop our theory from a wrong interaction
operator (10.1). We insist that this operator must be modified or renormal-
ized, so that the theory is forced to be finite. In particular, we will demand
that the new renormalized interaction satisfies the condition
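$$F^{ren} = 0 \tag{10.6}$$

This implies that operator $\underbrace{F}$ is purely phys

$$\underbrace{F} = \underbrace{F^{ph}}$$

If this condition is satisfied then (10.3) - (10.5) take physically acceptable forms

$$S a^\dagger_{\mathbf{p},\sigma}|0\rangle = a^\dagger_{\mathbf{p},\sigma}|0\rangle$$

$$S c^\dagger_{\mathbf{p},\tau}|0\rangle = c^\dagger_{\mathbf{p},\tau}|0\rangle$$

$$S|0\rangle = |0\rangle$$

Taking into account the perturbation expansion $S = 1 + S_2 + S_3 + \ldots$, we can also write in each perturbation order of our theory

$$S_n a^\dagger_{\mathbf{p},\sigma}|0\rangle = 0 \tag{10.7}$$

$$S_n c^\dagger_{\mathbf{p},\tau}|0\rangle = 0 \tag{10.8}$$

$$S_n|0\rangle = 0 \tag{10.9}$$

where $n = 2, 3, \ldots$. And for elements of the S-matrix we have

$$\langle 0|S_n|0\rangle = 0 \tag{10.10}$$

$$\langle 0|a_{\mathbf{p},\sigma}S_n a^\dagger_{\mathbf{p}',\sigma'}|0\rangle = 0 \tag{10.11}$$

$$\langle 0|c_{\mathbf{p},\tau}S_n c^\dagger_{\mathbf{p}',\tau'}|0\rangle = 0 \tag{10.12}$$

8
Some say that the bare electron is surrounded by a cloud of virtual photons and particle-antiparticle pairs.
9
but, in some respect, equivalent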
The above conditions can be summarized as the following
Statement 10.1 (no-self-scattering renormalization condition) There

should be no (self-)scattering in the vacuum and one-particle states.10
The physical interpretation is obvious: scattering is expected to occur only

when there are at least two particles which interact with each other. One
particle has nothing to collide with, so it cannot experience scattering. Sim-
ilarly, no scattering can happen in the no-particle vacuum state.
10.1.3 Charge renormalization condition

The no-self-scattering renormalization condition sets strict requirements (10.10)
- (10.12) on matrix elements of the S-operator between 0-particle and 1-
particle states. However, this condition alone is not sufficient to guarantee
cancelation of ultraviolet divergences in scattering calculations. On physical
grounds we can derive another necessary renormalization condition.
Recall that the 2nd order electron-proton scattering amplitude (9.34) has a singularity $\propto e^2/\tilde{k}^2$ at zero transferred momentum $\tilde{k} = \tilde{q}' - \tilde{q} = 0$. As shown in subsection 8.3.5, in the position space this singularity gets Fourier-transformed into the long-range Coulomb potential $-e^2/(4\pi r)$. From classical physics and experiment we also know that the Coulomb potential is a very accurate description of the interaction of charges at large distances and low energies. Similarly, in subsection 9.2.5 we have established that the 2nd perturbation order describes the low-energy electron-photon scattering very accurately. So, we should not expect any high-order corrections to this result. We are now raising these observations to the level of a fundamental physical principle
10
Note that our no-self-scattering condition is actually equivalent to the more traditional “mass renormalization” condition. For example, in section 10.2 we will see that our 2nd order renormalization counterterms are exactly the same as electron and photon self-energy counterterms in textbook QED.
Postulate 10.2 (charge renormalization condition) Charge-charge and

charge-photon elastic scattering cross sections at large distances and low en-
ergies are described exactly by the 2nd order term S2 in the S-operator. All
higher order contributions to these results should vanish.
When applied to charge-charge scattering, the charge renormalization condition implies that in orders higher than 2nd, scattering amplitudes should not be singular at low values of the transfer momentum $\tilde{k}$. Suppose that this is not true, and that the 4th order electron-proton scattering amplitude has a singularity $\propto e^4/\tilde{k}^2$. Then the long-range electron-proton potential would obtain an unacceptable form

$$V(r) \approx -\frac{e^2 + Ce^4}{4\pi r}$$

with a non-zero constant $C$, but from experiments we know that $C = 0$ to a high degree of precision.
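The link between a 1/k² singularity in momentum space and a 1/(4πr) potential in position space can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the 3D Fourier integral of 1/k² with the normalization d³k/(2π)³, reduces it to the radial integral of sin(kr)/k, and inserts a damping factor exp(−εk) (introduced here only to make the numerical integral converge). Function names and parameter values are hypothetical.

```python
import math


def radial_integral(r: float, eps: float, k_max: float = 400.0, steps: int = 400_000) -> float:
    """Trapezoid estimate of the integral of sin(k r)/k * exp(-eps k) over 0 <= k <= k_max."""
    h = k_max / steps

    def integrand(k: float) -> float:
        return math.sin(k * r) / k * math.exp(-eps * k)

    # The integrand tends to r as k -> 0, so the k = 0 endpoint contributes r/2.
    total = 0.5 * r + 0.5 * integrand(k_max)
    for n in range(1, steps):
        total += integrand(n * h)
    return total * h


def damped_potential(r: float, eps: float) -> float:
    """Fourier transform of exp(-eps k)/k^2 with measure d^3k/(2 pi)^3, evaluated at distance r."""
    return radial_integral(r, eps) / (2.0 * math.pi**2 * r)


r, eps = 1.0, 0.05
print(damped_potential(r, eps), 1.0 / (4.0 * math.pi * r))
```

With damping the radial integral equals arctan(r/ε), which tends to π/2 as ε → 0, so the two printed numbers approach each other and the transform approaches 1/(4πr). A residue proportional to e⁴ at k̃ = 0 would therefore simply rescale the Coulomb law, which is the situation excluded by Postulate 10.2.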
10.1.4 Renormalization in Feynman-Dyson theory

In the next section we will see that the no-self-scattering and charge renormalization conditions are not satisfied in QED with interaction Hamiltonian (10.1). This can be seen already from the fact that interaction operator $V_1$ (9.13) contains unphys terms like $a^\dagger c^\dagger a + a^\dagger ac$. Commutators of two unphys terms may contain renorm parts.11 So, there is nothing to prevent the appearance of renorm terms in the scattering phase operator (7.21)

$$F = V_1 - \frac{1}{2}[V_1, V_1] + \ldots$$

in violation of the condition (10.6). We will see that interaction (10.1) violates the charge renormalization condition as well. These two problems are very serious even though they are not directly related to divergences in loop diagrams. The presence of such divergences is just another argument that QED interaction $V_1 + V_2$ must be modified.
11
see Table 8.2
The idea of the renormalization approach is to switch from the interaction Hamiltonian $V_1 + V_2$ to another interaction $V^c$ by addition of renormalization counterterms, which will be denoted collectively by $Q$

$$V^c = V_1 + V_2 + Q \tag{10.13}$$

The form of $Q$ must be chosen such that the no-self-scattering and charge renormalization conditions are satisfied. In particular, the new scattering phase operator

$$F^c = V^c - \frac{1}{2}[V^c, V^c] + \ldots \tag{10.14}$$

should not contain renorm terms. Moreover, high-order contributions $F^c_n$ ($n > 2$) to the scattering of charged particles should be nonsingular at $\tilde{k} = 0$.
Unfortunately, the program outlined above is difficult to implement. The reason is that scattering calculations with the QED Hamiltonian (10.1) are rather cumbersome. As we discussed in subsection 9.2.3, it is much easier to use the Feynman-Dyson approach, in which the 2nd order interaction operator $V_2$ is omitted, and the momentum space photon propagator is taken simply as $\propto g_{\mu\nu}/\tilde{p}^2$. This is the standard way to calculate the S-matrix in QED, and we will adopt this approach in the present chapter. The general idea of renormalization remains the same. We are looking for certain counterterms $Q_{FD}$ that can be added to the original interaction operator

$$V_1(t) = -e\int d\mathbf{x}\,\overline{\psi}(\tilde{x})\gamma_\mu\psi(\tilde{x})A^\mu(\tilde{x}) \tag{10.15}$$

to obtain renormalized interaction

$$V^c_{FD} = V_1 + Q_{FD} \tag{10.16}$$

with which the Feynman-Dyson perturbation expansion (7.17)

$$S^c = 1 - \frac{i}{\hbar}\int_{-\infty}^{+\infty}dt_1\,V^c_{FD}(t_1) - \frac{1}{2!\hbar^2}\int_{-\infty}^{+\infty}dt_1dt_2\,T[V^c_{FD}(t_1)V^c_{FD}(t_2)] + \ldots \tag{10.17}$$

becomes finite, and our two renormalization conditions are satisfied.

Figure 10.2: Feynman diagrams (a) and (b) for the scattering electron→electron in the 2nd perturbation order.
10.2 Counterterms
Next we would like to see how the program outlined above works in practice. In this section we are going to derive explicit expressions for counterterms $Q_{FD}$ in the 2nd and 3rd order.
10.2.1 Electron self-scattering

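Let us first discuss the electron→electron scattering and see how exactly the condition (10.11) is violated.
There are only two connected diagrams that contribute to the electron→electron scattering in the 2nd order. They are shown in Fig. 10.2. First we will investigate the effect of 10.2(a). Applying Feynman rules from Table 9.1 to this diagram, we obtain

$$
\begin{aligned}
&\langle 0|a_{\mathbf{p},\sigma}S^{FD}_{2a}a^\dagger_{\mathbf{p}',\sigma'}|0\rangle \\
&= -\frac{me^2c^4\delta^4(\tilde{p}-\tilde{p}')}{(2\pi i)^2(2\pi\hbar)\omega_{\mathbf{p}}}\,\overline{u}_a(\mathbf{p},\sigma)\int d^4k\,\gamma^{ab}_\mu\,\frac{(\not p-\not k+mc^2)_{bc}}{(\tilde{p}-\tilde{k})^2-m^2c^4}\cdot\frac{g^{\mu\nu}}{\tilde{k}^2}\,\gamma^{cd}_\nu\,u_d(\mathbf{p}',\sigma')
\end{aligned}
\tag{10.18}
$$

$$
= \frac{me^2c^4\delta^4(\tilde{p}-\tilde{p}')}{(2\pi)^2(2\pi\hbar)\omega_{\mathbf{p}}}\,\overline{u}_a(\mathbf{p},\sigma)\left[C_0\delta_{ab} + C_1(\not p-mc^2)_{ab} + R_{ab}(\not p)\right]u_b(\mathbf{p},\sigma')
\tag{10.19}
$$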
where (divergent) constants $C_0$ and $C_1$ are calculated in (M.30) and (M.31), respectively.12

$$C_0 = \frac{3\pi^2mc^2}{2ic^3}\left(4\ln\frac{\Lambda}{m} + 1\right)$$

$$C_1 = -\frac{2\pi^2}{ic^3}\left(\ln\frac{\Lambda}{m} + 2\ln\frac{\lambda}{m} + \frac{9}{4}\right)$$

The finite quantity $R(\not p)$ includes terms quadratic, cubic, and higher order in $\not p - mc^2$

$$R(\not p) = C_2(\not p-mc^2)^2 + C_3(\not p-mc^2)^3 + \ldots \tag{10.20}$$

By stripping away factors corresponding to external electron lines and the delta function in (10.19), we obtain the contribution from the loop itself and from two attached vertices

$$D_{loop}(\not p) = \hbar^2e^2c^2\left[C_0 + C_1(\not p-mc^2) + R(\not p)\right] \tag{10.21}$$

This non-vanishing result contradicts the no-self-scattering condition (10.11). Moreover, this expression is clearly infinite in the limit Λ → ∞. So, we are dealing with an ultraviolet divergence here.
Now let us consider an arbitrary Feynman diagram, which can contain electron-photon loops in external and/or internal lines. If the loop shown in Figure 10.2(a) is inserted in an external electron line, then the 4-momentum $\tilde{p}$ is on the mass shell,13 and only the constant term in (10.21) survives14

$$D_{loop}(\not p = mc^2) = \hbar^2e^2c^2C_0 = 3ie^2\pi^2mc\left(2\ln\frac{\Lambda}{m} + \frac{1}{2}\right) \tag{10.22}$$

For loops in internal electron lines, the 4-momentum $\tilde{p}$ is not necessarily on the mass shell, so the full factor (10.21) should be taken into account.
12
Quantities $C_0$ and $R$ have the dimension of $\langle m\rangle\langle c^{-1}\rangle$ and $\langle C_1\rangle = \langle c^{-3}\rangle$. Therefore, the dimension of the scattering amplitude (10.19) is $\langle p^{-3}\rangle$, as expected from (O.1).
13
This is true for loops in diagrams 10.1(b-c) and also for the diagram 10.2(a).
14
Here we formally write the mass shell condition ($\tilde{p}^2 = m^2c^4$) as $\not p = mc^2$ due to (J.23).
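Next consider the other electron self-scattering diagram 10.2(b).15

$$
\begin{aligned}
&\langle 0|a_{\mathbf{p},\sigma}S^{FD}_{2b}a^\dagger_{\mathbf{p}',\sigma'}|0\rangle \\
&= \frac{mc^2}{(2\pi\hbar)^3\omega_{\mathbf{p}}}\,\overline{u}_a(\mathbf{p},\sigma)\gamma^\mu_{ab}u_b(\mathbf{p}',\sigma')\left(-\frac{(2\pi\hbar)^8e^2}{\hbar^2}\right)\frac{\hbar^2c^2g_{\mu\nu}\delta^4(\tilde{p}-\tilde{p}')}{(2\pi i)(2\pi\hbar)^3(\tilde{p}-\tilde{p}')^2} \\
&\quad\times\frac{1}{(2\pi i)(2\pi\hbar)^3}\int d^4k\,\frac{(\not k+mc^2)_{cd}\gamma^\nu_{cd}}{\tilde{k}^2-m^2c^4}
\end{aligned}
$$

The integral on $\tilde{k}$ vanishes due to (J.7) and (J.8)

$$\int d^4k\,\frac{Tr(\gamma^\nu\gamma^\rho k_\rho + \gamma^\nu mc^2)}{\tilde{k}^2-m^2c^4} = \int d^4k\,\frac{4k^\nu}{\tilde{k}^2-m^2c^4} = 0$$

so, diagrams like 10.2(b) make no contributions in both external and internal electron lines.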
10.2.2 Electron self-energy counterterm

From subsection 10.1.4 we know that the above divergences (10.21) and (10.22) should be canceled by a 2nd order counterterm. Of course, that counterterm cannot be chosen arbitrarily. It becomes a part of the interaction operator, so it should obey the conditions formulated for such operators in subsection 9.1.1. In particular, our addition of the counterterm should not affect the relativistic invariance of the theory, so the condition (9.7) is essential. Taking these considerations into account, let us choose the following electron self-energy counterterm

$$Q^{FD}_{2el}(t) = \delta m_2\int d\mathbf{x}\,\overline{\psi}(\tilde{x})\psi(\tilde{x}) + (Z_2-1)_2\int d\mathbf{x}\,\overline{\psi}(\tilde{x})(-i\hbar c\gamma_\mu\partial^\mu + mc^2)\psi(\tilde{x}) \tag{10.23}$$

where the 4-gradient $\partial^\mu$ is defined as

$$\partial^\mu \equiv \left(-\frac{1}{c}\frac{\partial}{\partial t}, \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) \tag{10.24}$$

and parameters $\delta m_2$ and $(Z_2-1)_2$ must be adjusted to satisfy renormalization conditions.16 The 2nd order contribution to the electron→electron scattering amplitude resulting from interaction (10.23) is17
15
The divergent denominator $1/(\tilde{p}-\tilde{p}')^2$ can be regularized by prescription (M.23).
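$$
\begin{aligned}
&\langle 0|a_{\mathbf{p},\sigma}S^{count}_2a^\dagger_{\mathbf{p}',\sigma'}|0\rangle \\
&= -\frac{i\delta m_2}{\hbar}\langle 0|a_{\mathbf{p},\sigma}\int d^4x\,\overline{\psi}(\tilde{x})\psi(\tilde{x})\,a^\dagger_{\mathbf{p}',\sigma'}|0\rangle \\
&\quad - \frac{i(Z_2-1)_2}{\hbar}\langle 0|a_{\mathbf{p},\sigma}\int d^4x\,\overline{\psi}(\tilde{x})(-i\hbar c\gamma_\mu\partial^\mu + mc^2)\psi(\tilde{x})\,a^\dagger_{\mathbf{p}',\sigma'}|0\rangle \\
&= -\frac{i\delta m_2}{\hbar}\int d^4x\,\frac{1}{(2\pi\hbar)^3}\frac{mc^2}{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{p}'}}}\,e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}e^{-\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}}\,\overline{u}_a(\mathbf{p},\sigma)u_a(\mathbf{p}',\sigma') \\
&\quad - \frac{i(Z_2-1)_2}{\hbar}\int d^4x\,\frac{1}{(2\pi\hbar)^3}\frac{mc^2}{\sqrt{\omega_{\mathbf{p}}\omega_{\mathbf{p}'}}}\,e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}e^{-\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}}\,\overline{u}_a(\mathbf{p},\sigma)(-\not p+mc^2)_{ab}u_b(\mathbf{p}',\sigma') \\
&= -\frac{2\pi i(\delta m_2)mc^2\delta^4(\tilde{p}-\tilde{p}')}{\omega_{\mathbf{p}}}\,\overline{u}_a(\mathbf{p},\sigma)u_a(\mathbf{p}',\sigma') \\
&\quad - \frac{2\pi i(Z_2-1)_2mc^2\delta^4(\tilde{p}-\tilde{p}')}{\omega_{\mathbf{p}}}\,\overline{u}_a(\mathbf{p},\sigma)(-\not p+mc^2)_{ab}u_b(\mathbf{p}',\sigma')
\end{aligned}
\tag{10.25}
$$

Dropping factors corresponding to external electron lines and the delta function, we obtain the pure counterterm contribution

$$D_{count}(\not p) = -\frac{i(2\pi\hbar)^4}{\hbar}\delta m_2 + \frac{i(2\pi\hbar)^4}{\hbar}(Z_2-1)_2(\not p-mc^2) \tag{10.26}$$

16
$\delta m_2$ has the dimension of energy and $(Z_2-1)_2$ is dimensionless. Both of them are second-order quantities, as indicated by the subscripts. Here the numerical factors $\delta m_2$ and $(Z_2-1)_2$ are regarded as unknown constants whose values need to be adjusted to cancel out divergent terms in (10.21). Eventually, we will see that these factors coincide with renormalization parameters - the mass shift and the electron wave function renormalization factor - in usual approaches. Note that in our philosophy of renormalization we are not following the usual logic. In particular, we are not “shifting” the electron’s mass and do not multiply the electron field by a factor, as in [Wei95].
17
This formula is obtained by inserting $Q^{FD}_{2el}(t)$ instead of $V^c_{FD}(t)$ in the second term on the right hand side of (10.17).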
The interaction potential (10.23) will be represented by a new interaction vertex, which will be denoted by a cross placed on electron lines, as in figs. 10.1(i-j). Such vertices can be placed on electron lines in Feynman diagrams of arbitrary topological shape thus increasing the order of the diagram by 2. If the counterterm vertex is placed on an external electron line, then the 4-momentum $\tilde{p}$ is on the mass shell where the 2nd term in (10.26) vanishes. So, in this case

$$D_{count}(\not p = mc^2) = -\frac{i(2\pi\hbar)^4}{\hbar}\delta m_2$$

This contribution should be added to the loop term (10.22). Thus we conclude that loops in external electron lines will be canceled exactly18 if we choose19

$$\delta m_2 = -\frac{ie^2c^2C_0}{(2\pi)^4\hbar} = -\frac{3mc^3e^2}{16\pi^2\hbar}\left(\frac{1}{2} + 2\ln\frac{\Lambda}{m}\right) \tag{10.27}$$

In other words, when doing calculations in the renormalized theory the self-energy loops in external electron lines and contributions from the counterterm (10.23) can be simply ignored. In our study this means that diagrams 10.1(b-c) and (i-j) can be omitted.
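The on-shell cancelation between the loop term and the counterterm is a matter of simple algebra, which can be verified numerically. The sketch below is only a consistency check of the first equality in (10.27). It uses illustrative units with ħ = c = m = 1 and an arbitrary value for e² (both are assumptions of this example), takes the expression for C₀ as quoted in the text, and uses hypothetical function names.

```python
import math

# Illustrative units and values (assumptions of this sketch).
HBAR = 1.0
C = 1.0
MASS = 1.0
E2 = 0.0917  # stands for e^2; any non-zero value works for this check


def c0(cutoff: float) -> complex:
    """Divergent constant C0 as quoted in the text, for a given ultraviolet cutoff."""
    return 3 * math.pi**2 * MASS * C**2 / (2j * C**3) * (4 * math.log(cutoff / MASS) + 1)


def loop_on_shell(cutoff: float) -> complex:
    """Loop factor on the mass shell: hbar^2 e^2 c^2 C0."""
    return HBAR**2 * E2 * C**2 * c0(cutoff)


def delta_m2(cutoff: float) -> complex:
    """Counterterm parameter: -i e^2 c^2 C0 / ((2 pi)^4 hbar)."""
    return -1j * E2 * C**2 * c0(cutoff) / ((2 * math.pi) ** 4 * HBAR)


def counterterm_on_shell(delta_m: complex) -> complex:
    """Counterterm factor on the mass shell: -i (2 pi hbar)^4 delta_m / hbar."""
    return -1j * (2 * math.pi * HBAR) ** 4 / HBAR * delta_m


for cutoff in (1e1, 1e3, 1e6):
    total = loop_on_shell(cutoff) + counterterm_on_shell(delta_m2(cutoff))
    print(cutoff, delta_m2(cutoff), abs(total))
```

For every cutoff the sum of the two factors vanishes within rounding, while the counterterm parameter itself is real, negative, and grows logarithmically with the cutoff.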
Electron-photon loops and “cross” vertices can also appear in internal
electron lines whose 4-momentum p̃ is not necessarily on the mass shell. Then
the loop contribution (10.21) has a non-vanishing divergent term proportional
to C1 . In order to cancel this divergence, it is sufficient to choose the other
renormalization factor in (10.23)20
18
In the language of traditional mass renormalization theory this cancelation comes from
the requirement that the renormalized electron propagator has a pole at /p = mc2 , where
m is the physical mass of the electron.
19
This expression for the “mass shift” can be compared with formula for δm in (21)
[Fey49], with expression right after equation (8.42) in [BD64] and with second equation
on page 523 in [Sch61].
20
Compare this result with (8.43) in [BD64] and with equation (94b) in section 15 in
[Sch61]. Note that the finiteness requirement does not specify this factor uniquely. One
can replace C1 with C1 + δ, where δ is any finite constant, and still have a finite result.
Usually, the correct choice δ = 0 is justified by the requirement that the residue of the
renormalized electron propagator is equal to 1. In our approach this requirement is covered

ie2 c2 C1 e2 Λ λ 9
(Z2 − 1)2 = = − ln + 2 ln + (10.28)
(2π)4 ~ 8π 2 ~c m m 4
Then infinite contributions from the loop and the counterterm cancel each
other, and only a finite and harmless R-correction remains
Dloop (p /) = ~2 e2 c2 R(p
/) + Dcount (p /)
It is responsible for so-called electron self-energy radiative corrections to

scattering amplitudes. Such corrections do not play any role in processes
discussed in this book, so we will not discuss them any further.
10.2.3 Photon self-scattering

The amplitude of scattering photon→photon in the second perturbation or-
der is obtained from the diagram 10.321
h0|cp,τ S2F D c†p′ ,τ ′ |0i

ce2
= − δ 4 (p̃′ − p̃)e∗µ (p, τ ) ×
(2π~)2p(2πi)2
hZ (k/ + mc2 )ac µ (p / − /k + mc2 )bd ν i
d4 k γab γcd eν (p, τ ′ )
2
k̃ − m c 2 4 2
(p̃ − k̃) − m c 2 4
2
ce
= δ 4 (p̃′ − p̃)e∗µ (p, τ )(p̃2 g µν − pµ pν )Π(p̃2 )eν (p, τ ′ )
(2π~)2p(2π)2
(10.29)
where Π(p̃2 ) is a divergent function. It is convenient to write Π(p̃2 ) as a sum

of its (infinite) value Π(0) on the photon’s mass shell (p̃2 = 0) plus a finite
remainder η(p̃2 ):
by the charge renormalization condition 10.2. For example, if δ ≠ 0 then the 4th order
diagrams 9.2(c) and (d) would not cancel each other at low energies. This would result in a
(finite) 4th order correction to the low-energy charge-photon scattering, which is inconsistent
with the classical Thomson formula. Compare with discussion of equation (10.39).
21
We omit calculation of the integral in square brackets. This calculation can be found,
e.g., in section 11.2 of [Wei95], in section 7.5 of [PS95b] and in section 8.2 of [BD64].
Figure 10.3: Feynman diagram for the scattering photon→photon in the 2nd
perturbation order.
\[
\Pi(\tilde{p}^2) = \Pi(0) + \eta(\tilde{p}^2)
\]
Function η(p̃²) can be represented by integral (11.2.22) in [Wei95], which
takes the following form at low values of energy-momentum p̃
\[
\eta(\tilde{p}^2) = -\frac{(2\pi)^4}{2\pi^2 i c^3} \int_0^1 dx\, x(1-x) \ln\left(1 + \frac{\tilde{p}^2 x(1-x)}{m^2c^4}\right) \approx \frac{i(2\pi)^4 \tilde{p}^2}{60\pi^2 m^2 c^7} \tag{10.30}
\]
In particular, this function vanishes on the photon’s mass shell
\[
\eta(0) = 0 \tag{10.31}
\]
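The low-energy form quoted in (10.30) can be checked numerically. The sketch below is only an illustration: the function names and the dimensionless variable a = p̃²/(m²c⁴) are introduced here and do not appear in the text. It compares a Simpson-rule estimate of ∫₀¹ x(1−x) ln(1 + a·x(1−x)) dx with its leading small-a term a/30, which is the origin of the factor 1/60 in (10.30).

```python
import math

def eta_integral(a, n=2000):
    """Simpson-rule estimate of I(a) = int_0^1 x(1-x) ln(1 + a x(1-x)) dx.

    Here a stands for p~^2/(m^2 c^4); this notation is an assumption of
    the sketch, not the book's.
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of subintervals
    h = 1.0 / n

    def f(x):
        w = x * (1.0 - x)
        return w * math.log1p(a * w)

    total = f(0.0) + f(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3.0

def eta_leading(a):
    """Leading small-a term: a * int_0^1 x^2 (1-x)^2 dx = a/30."""
    return a / 30.0

if __name__ == "__main__":
    for a in (1e-1, 1e-2, 1e-3):
        print(a, eta_integral(a), eta_leading(a))
```

The integrand vanishes identically at a = 0, in line with (10.31), and for small a the difference between the two functions is dominated by the next term of the expansion, −a²/280.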
The factor in (10.29) associated only with the loop and two attached vertices
(no contributions from external lines) is
\[
D_{loop}(\tilde{p}) = e^2 \Pi(\tilde{p}^2)(\tilde{p}^2 g^{\mu\nu} - p^\mu p^\nu) \tag{10.32}
\]

In equation (10.29), the 4-momentum p̃ is on the mass shell, therefore the
loop contribution (10.32) vanishes there22 and the no-self-scattering renor-
malization condition (10.12) is satisfied without extra effort. The same can
be said for loops in external photon legs of any diagram: these loops can
be simply ignored. However, we cannot ignore loop contributions in internal
photon lines.23 In such cases the 4-momentum p̃ is not necessarily on the
22
The second term in parentheses in (10.32) is not contributing due to the property
p̃ · e(p, τ ) = 0 proved in (K.10).
23
See, for example, the graph 10.1(d).
mass shell, and factor (10.32) is divergent.
10.2.4 Photon self-energy counterterm

Similar to the electron self-energy renormalization described above, we are
going to cancel this infinity by adding to the interaction operator of QED a
new counterterm24
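\[
Q^{FD}_{2ph}(t) = -\frac{(Z_3 - 1)_2}{4} \int dx\, F^{\mu\nu}(\tilde{x}) F_{\mu\nu}(\tilde{x}) \tag{10.33}
\]
where we denoted
\[
\begin{aligned}
F_{\mu\nu} &\equiv \partial_\mu A_\nu - \partial_\nu A_\mu \\
F^{\mu\nu} F_{\mu\nu} &= (\partial^\mu A^\nu - \partial^\nu A^\mu)(\partial_\mu A_\nu - \partial_\nu A_\mu) \\
&= \partial^\mu A^\nu \partial_\mu A_\nu - \partial^\mu A^\nu \partial_\nu A_\mu - \partial^\nu A^\mu \partial_\mu A_\nu + \partial^\nu A^\mu \partial_\nu A_\mu \\
&= 2\partial^\mu A^\nu \partial_\mu A_\nu - 2\partial^\mu A^\nu \partial_\nu A_\mu
\end{aligned}
\]
and $(Z_3 - 1)_2$ is an as yet unspecified 2nd order renormalization factor. Let us
now evaluate the effect of the counterterm (10.33) on the photon→photon
scattering amplitude. From definitions (K.2) and (10.24) it follows25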
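\[
\begin{aligned}
\partial_\nu A_\mu(x,t) &= \frac{i\sqrt{c}}{(2\pi\hbar)^{3/2}} \int \frac{dq\, q_\nu}{\sqrt{2q}\, c} \sum_\tau \left[ -e^{-\frac{i}{\hbar}\tilde{q}\cdot\tilde{x}} e_\mu(q,\tau) c_{q,\tau} + e^{\frac{i}{\hbar}\tilde{q}\cdot\tilde{x}} e^*_\mu(q,\tau) c^\dagger_{q,\tau} \right] \\
\langle 0| c_{p,\tau}\, \partial_\nu A_\mu(x,t) &\to \langle 0| \frac{i}{(2\pi\hbar)^{3/2}\sqrt{2pc}}\, e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\, p_\nu\, e^*_\mu(p,\tau) \\
\partial_\nu A_\mu(x,t)\, c^\dagger_{p',\tau'} |0\rangle &\to -\frac{i}{(2\pi\hbar)^{3/2}\sqrt{2p'c}}\, e^{-\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}}\, p'_\nu\, e_\mu(p',\tau') |0\rangle
\end{aligned}
\]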
Then the S-matrix contribution from (10.33) has a form similar to (10.29)
24
From the dimension (O.2) of the photon quantum field, it is easy to show that this
operator has the required dimension of energy if (Z3 − 1)2 is dimensionless. Moreover, this
operator explicitly satisfies the Poincaré invariance condition (9.7).
25
Here symbol → denotes the part that is relevant for calculation of the matrix element
(10.34).
h0|cp,τ S2count c†p′ ,τ ′ |0i

Z
i(Z3 − 1)2
= − h0|cp,τ d4 x ∂ λ Aκ (x̃)∂λ Aκ (x̃) − ∂ λ Aκ (x̃)∂κ Aλ (x̃) c†p′ ,τ ′ |0i
2~
Z
i(Z3 − 1)2 1 i ′
= − d4 x 3
√ ′ e ~ (p̃−p̃ )·x̃ pλ p′λ e∗κ (p, τ )eκ (p′ , τ )
2~c (2π~) 4pp
Z
i(Z3 − 1)2 1 i ′
+ d4 x 3
√ ′ e ~ (p̃−p̃ )·x pλ p′κ e∗κ (p, τ )eλ (p′ , τ )
4~c (2π~) 4pp
i(Z3 − 1)2 πδ (p̃ − p̃′ ) λ ∗κ
4
= − [p pλ e (p, τ )eκ (p, τ ) − pλ pκ e∗κ (p, τ )eλ (p, τ )]
2pc
i(Z3 − 1)2 πδ 4 (p̃ − p̃′ ) ∗
= − eµ (p, τ )[p̃2 g µν − pµ pν ]eν (p, τ ) (10.34)
2pc
In the Feynman diagram notation the counterterm (10.33) gives rise to a new
type of vertex,26 which corresponds to the factor27
\[
D_{count}(\tilde{p}) = -\frac{8i(Z_3 - 1)_2 \pi^4 \hbar}{c^2} (\tilde{p}^2 g^{\mu\nu} - p^\mu p^\nu) \tag{10.35}
\]
The constant $(Z_3 - 1)_2$ should be chosen such that expression (10.35) exactly
cancels the loop factor (10.32) when p̃² = 0, i.e., for external photon lines
\[
(Z_3 - 1)_2 = -\frac{i e^2 c^2 \Pi(0)}{8\pi^4 \hbar} \tag{10.36}
\]
Then for internal photon lines28 the sum of the loop (10.32) and the counterterm
(10.35) is finite
\[
D_{loop}(\tilde{p}) + D_{count}(\tilde{p}) = e^2 \eta(\tilde{p}^2)(\tilde{p}^2 g^{\mu\nu} - p^\mu p^\nu) \tag{10.37}
\]

This means that an electron-positron loop and a photon-line cross taken
together29 result in a finite correction to the scattering amplitude. This is
the so-called vacuum polarization radiative correction.
26
denoted by a cross placed on photon lines, as in Fig. 10.1(k)
27
This factor is obtained from (10.34) by stripping off the 4-momentum delta function
as well as factors $\hbar c^{1/2} (2\pi\hbar)^{-3/2} (2p)^{-1/2} e_\nu(p,\tau)$ and $\hbar c^{1/2} (2\pi\hbar)^{-3/2} (2p)^{-1/2} e^*_\mu(p,\tau)$
associated with external photon lines.
28
i.e., when p̃² ≠ 0
29
e.g., the sum of diagrams 10.1(d) and 10.1(k)
10.2.5 Charge renormalization

If our only goal is to make the perturbation theory expansion finite, then the
choice of the renormalization parameter (10.36) is not unique. Indeed, we
could add to Π(0) in (10.36) an arbitrary finite number δ, so that
\[
(Z_3 - 1)_2 = -\frac{i e^2 c^2 (\Pi(0) + \delta)}{8\pi^4 \hbar}
\]
and (10.37) would remain finite
\[
D_{loop}(\tilde{p}) + D_{count}(\tilde{p}) = e^2 (\eta(\tilde{p}^2) - \delta)(\tilde{p}^2 g^{\mu\nu} - p^\mu p^\nu) \tag{10.38}
\]

Why don’t we do that? The answer is that such an addition would be
inconsistent with the charge renormalization condition in Postulate 10.2.
To see that, let us evaluate the contribution to the electron-proton scat-
tering from diagrams 10.1(d) and (k). Using Feynman’s rules and (10.38) we
obtain30
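\[
\begin{aligned}
&\langle 0| a_{q,\sigma} d_{p,\tau} S_4^{(d)+(k)} d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle \\
&= -\frac{e^2\, mMc^4\, \hbar^4 c^4}{(2\pi i)^2 (2\pi\hbar)^4 \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}}\, \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p}) \times \\
&\quad \frac{e^2 (\eta(\tilde{k}^2) - \delta)}{\hbar^2}\, U^\mu(q,\sigma;q',\sigma')\, \frac{g_{\mu\nu}}{\tilde{k}^2} (\tilde{k}^2 g^{\nu\lambda} - k^\nu k^\lambda) \frac{g_{\lambda\kappa}}{\tilde{k}^2}\, W^\kappa(p,\tau;p',\tau') \\
&= \frac{e^4\, mMc^4\, c^4}{\hbar^2 (2\pi)^6 \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}}\, \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p}) \times
\frac{\eta(\tilde{k}^2) - \delta}{\tilde{k}^2}\, U_\mu(q,\sigma;q',\sigma')\, W^\mu(p,\tau;p',\tau') \\
&\approx \frac{e^4 c^4}{(2\pi)^6 \hbar^2}\, \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p})\, \frac{\eta(\tilde{k}^2) - \delta}{\tilde{k}^2}\, \delta_{\sigma,\sigma'} \delta_{\tau,\tau'}
\end{aligned}
\tag{10.39}
\]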
Taking into account equation (10.31), we conclude that if δ ≠ 0, then this
matrix element has a singularity ∝ −δ/k̃ 2 at small values of k̃. This singu-
larity leads to a 4th order correction to the long-range scattering of charged
30
Here we denoted k̃ ≡ q̃ ′ − q̃ = p̃ − p̃′ , used non-relativistic approximations from
Appendix J.9 and formulas (J.84) - (J.85): U µ gµν k ν k λ gλκ W κ = U µ kµ kκ W κ = 0.
particles and, therefore, violates the charge renormalization condition 10.2.

The only way to satisfy this condition is to set δ = 0.
10.2.6 Vertex renormalization

Our renormalization task is not completed yet. One more type of counterterm
is required in order to make QED calculations finite and accurate.
Let us evaluate diagram 10.1(e) using Feynman rules
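\[
\begin{aligned}
&\langle 0| a_{q,\sigma} d_{p,\tau} S_4^{(e)} d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle \\
&\approx \frac{e^4 c^4\, mMc^4}{(2\pi i)^4 (2\pi\hbar)^2 \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}}\, \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p})\, \bar{u}(q,\sigma) \times \\
&\quad \left[ \int d^4h\, \gamma_\mu \frac{-\slashed{h} + \slashed{q} + mc^2}{(\tilde{h} - \tilde{q})^2 - m^2c^4}\, \gamma_\kappa\, \frac{-\slashed{h} + \slashed{q}' + mc^2}{(\tilde{h} - \tilde{q}')^2 - m^2c^4}\, \gamma^\mu\, \frac{1}{\tilde{h}^2} \right] \times \\
&\quad u(q',\sigma')\, \frac{1}{(\tilde{q}' - \tilde{q})^2}\, W^\kappa(p,\tau;p',\tau')
\end{aligned}
\tag{10.40}
\]
The integral in square brackets $I^\kappa(\tilde{q},\tilde{q}')$ is calculated in (M.46)
\[
\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= \frac{\pi^2 \gamma^\kappa}{ic^3} \left[ \frac{8\theta}{\tan(2\theta)} \ln\frac{\lambda}{m} + \frac{8}{\tan(2\theta)} \int_0^\theta \alpha \tan\alpha\, d\alpha + \frac{1}{2} + 6\theta\cot\theta + 2\ln\frac{\Lambda}{m} \right] \\
&\quad - \frac{2\pi^2 \theta (q + q')^\kappa}{imc^5 \sin(2\theta)}
\end{aligned}
\tag{10.41}
\]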
To apply the charge renormalization condition, we should consider this
expression at small values of the transferred momentum (i.e., when q̃ ≈ q̃ ′ ,
θ ≈ 0). In this case we can use equation (J.4)
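\[
\begin{aligned}
0 &= \bar{u}(q,\sigma)\left(\gamma^\mu(\slashed{q} - mc^2) + (\slashed{q} - mc^2)\gamma^\mu\right)u(q,\sigma') \\
&= \bar{u}(q,\sigma)(\gamma^\mu \gamma^\nu q_\nu + \gamma^\nu q_\nu \gamma^\mu - 2\gamma^\mu mc^2)u(q,\sigma') \\
&= \bar{u}(q,\sigma)(2g^{\mu\nu} q_\nu - 2\gamma^\mu mc^2)u(q,\sigma') \\
&= 2\bar{u}(q,\sigma)(q^\mu - \gamma^\mu mc^2)u(q,\sigma')
\end{aligned}
\]
to see that when sandwiched between $\bar{u}$ and $u$, the 4-vector $q^\mu$ can be replaced
by $\gamma^\mu mc^2$. Therefore, in (10.41) we will set $q^\kappa \approx (q')^\kappa \approx \gamma^\kappa mc^2$ and obtain31
\[
\lim_{\tilde{k}\to 0,\, \tilde{q}\to 0} I^\kappa(\tilde{q},\tilde{q}') = \frac{F\pi^2 \gamma^\kappa}{ic^3}
\]
where we introduced an ultraviolet-divergent and infrared-divergent constant32
\[
F \equiv 4\ln\frac{\lambda}{m} + 2\ln\frac{\Lambda}{m} + \frac{9}{2}
\]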
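Then at small values of k̃ the scattering amplitude (10.40)
\[
\langle 0| a_{q,\sigma} d_{p,\tau} S_4^{(e)} d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle
= -\frac{i e^4 c^4 F \pi^2\, mMc^4\, \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p})}{c^3 (2\pi i)^4 (2\pi\hbar)^2 \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}\, (\tilde{q}' - \tilde{q})^2}\, U_\kappa(q,\sigma;q',\sigma')\, W^\kappa(p,\tau;p',\tau')
\tag{10.42}
\]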
has a singularity ∝ F/k̃ 2 . This means that in disagreement with our Postu-
late 10.2, the 4th perturbation order makes a non-trivial contribution to the
long-distance electron-proton scattering. Even more disturbing is the fact
that this effect is infinite in the limit Λ → ∞.
This unacceptable situation can be fixed by adding one more (vertex )
renormalization counterterm to the QED interaction
\[
Q^{FD}_3(t) = -e(Z_1 - 1)_2 \int dx\, \bar{\psi}(\tilde{x}) \gamma_\mu \psi(\tilde{x}) A^\mu(\tilde{x}) \tag{10.43}
\]
In Feynman diagrams we will denote the corresponding three-leg vertex by a

circle, as shown in diagram 10.1(h). The renormalization constant (Z1 − 1)2
is of the 2nd perturbation order, so the order of the counterterm (10.43) is 3.
31
In our notation (M.40), (M.41) $\theta \equiv \sin^{-1}\left(\sqrt{\tilde{k}^2}/(2mc^2)\right)$.
32
Compare with equation (23) in [Fey49].
It has the same form as the basic QED interaction (10.15), so the diagram
10.1(h) is easily evaluated33
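\[
\langle 0| a_{q,\sigma} d_{p,\tau} S_4^{(h)} d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle
= \frac{i e^2 c^2 (Z_1 - 1)_2\, mMc^4\, \delta^4(\tilde{q} + \tilde{p} - \tilde{q}' - \tilde{p}')}{4\pi^2 \hbar \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}\, (\tilde{q}' - \tilde{q})^2}\, U_\kappa(q,\sigma;q',\sigma')\, W^\kappa(p,\tau;p',\tau')
\]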
Our requirement to cancel the infinite/singular term (10.42) tells us that we

need to choose our renormalization constant as34
\[
(Z_1 - 1)_2 = \frac{e^2 F}{16\pi^2 c\hbar} = -(Z_2 - 1)_2
\]
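The equality between the vertex factor and minus the electron self-energy factor can be confirmed by direct substitution. The following sketch is illustrative only: the function names and the argument `g`, which stands for the dimensionless combination e²/(ħc), are introduced here, and the expression used for (Z₂ − 1)₂ is the one read off from (10.28).

```python
import math

def F_const(Lam, lam, m):
    """Divergent constant F = 4 ln(lambda/m) + 2 ln(Lambda/m) + 9/2."""
    return 4.0 * math.log(lam / m) + 2.0 * math.log(Lam / m) + 4.5

def z1_minus_1(g, Lam, lam, m):
    """(Z1 - 1)_2 = e^2 F / (16 pi^2 c hbar), with g = e^2/(hbar c)."""
    return g * F_const(Lam, lam, m) / (16.0 * math.pi ** 2)

def z2_minus_1(g, Lam, lam, m):
    """(Z2 - 1)_2 as read from (10.28), with g = e^2/(hbar c)."""
    bracket = math.log(Lam / m) + 2.0 * math.log(lam / m) + 2.25
    return -g / (8.0 * math.pi ** 2) * bracket

if __name__ == "__main__":
    g = 4.0 * math.pi / 137.0  # e^2/(hbar c) = 4 pi alpha
    # Sample ultraviolet cutoff, photon mass regulator and electron mass
    # (arbitrary units, chosen only for the check).
    print(z1_minus_1(g, 1e6, 1e-6, 1.0), -z2_minus_1(g, 1e6, 1e-6, 1.0))
```

The two printed numbers coincide for any choice of Λ, λ and m, because F is exactly twice the bracket appearing in (10.28).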
After adding all three renormalization counterterms (10.23), (10.33), and
(10.43) the full Feynman-Dyson interaction operator (10.16) takes the form
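\[
\begin{aligned}
V^c_{FD}(t) &= V_1(t) + Q^{FD}_{2el}(t) + Q^{FD}_{2ph}(t) + Q^{FD}_3(t) + \ldots \\
&= -e \int dx\, \bar{\psi}(\tilde{x}) \gamma_\mu \psi(\tilde{x}) A^\mu(\tilde{x}) + e \int dx\, \bar{\Psi}(\tilde{x}) \gamma_\mu \Psi(\tilde{x}) A^\mu(\tilde{x}) \\
&\quad + \delta m_2 \int dx\, \bar{\psi}(\tilde{x}) \psi(\tilde{x}) + (Z_2 - 1)_2 \int dx\, \bar{\psi}(\tilde{x}) (-i\hbar c \gamma_\mu \partial^\mu + mc^2) \psi(\tilde{x}) \\
&\quad - \frac{(Z_3 - 1)_2}{4} \int dx\, F^{\mu\nu}(\tilde{x}) F_{\mu\nu}(\tilde{x}) - e(Z_1 - 1)_2 \int dx\, \bar{\psi}(\tilde{x}) \gamma_\mu \psi(\tilde{x}) A^\mu(\tilde{x}) + \ldots
\end{aligned}
\tag{10.44}
\]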
10.3 Renormalized S-matrix

Equation (10.44) is the renormalized QED interaction operator that is ac-
curate up to the 3rd perturbation order. Our claim was that inserting this
interaction in the usual formula (10.17) for the S-operator we can obtain
33
34
The equality with the electron-photon loop renormalization factor (Z2 − 1)2 in (10.28)
is not accidental. It is explained in section 8.6 in [BD64].
ultraviolet-finite and accurate scattering amplitudes. Let us now support

this claim with explicit calculation of all 4th order diagrams in Fig. 10.1. As
we already know, diagrams 10.1(b), (c), (i), (j) cancel out exactly. In this
section we are going to calculate six other diagrams that we arrange in four
coefficient functions s4 on the right hand side of

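\[
\langle 0| a_{q,\sigma} d_{p,\tau} S_4^c d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle = \left( s_4^{(d)+(k)} + s_4^{(e)+(h)} + s_4^{(f)} + s_4^{(g)} \right) \delta^4(\tilde{q} + \tilde{p} - \tilde{q}' - \tilde{p}')
\tag{10.45}
\]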
For our purposes it will be sufficient to work in the limit of low momenta of
particles35 and small transferred momentum k̃.
10.3.1 Vacuum polarization diagrams

Inserting (10.30) in (10.39) and setting δ = 0 we find that in our approxi-
mation the S-matrix elements described by diagrams 10.1(d) and (k) do not
depend on particle momenta and spins36
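\[
s_4^{(d)+(k)} \approx \frac{i e^4}{\hbar^2 (2\pi)^2\, 60\pi^2 m^2 c^3}\, \delta_{\sigma,\sigma'} \delta_{\tau,\tau'} = \frac{i\alpha^2}{15\pi^2 m^2 c}\, \delta_{\sigma,\sigma'} \delta_{\tau,\tau'} \tag{10.46}
\]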
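The chain of prefactors leading to (10.46) is easy to get wrong by a power of 2π or c, so a quick numerical cross-check may be useful. The sketch below is an illustration under stated assumptions: the function names are invented here, the common factor i and the Kronecker deltas are dropped, and the sample values of e, ħ, m, c are arbitrary positive numbers rather than physical constants.

```python
import math

def coeff_from_1039_and_1030(e, hbar, m, c):
    """Prefactor of (10.39) times |eta(k^2)/k^2| taken from (10.30)."""
    prefactor = e ** 4 * c ** 4 / ((2 * math.pi) ** 6 * hbar ** 2)
    eta_over_k2 = (2 * math.pi) ** 4 / (60 * math.pi ** 2 * m ** 2 * c ** 7)
    return prefactor * eta_over_k2

def coeff_in_charge_units(e, hbar, m, c):
    """First form in (10.46): e^4 / (hbar^2 (2 pi)^2 60 pi^2 m^2 c^3)."""
    return e ** 4 / (hbar ** 2 * (2 * math.pi) ** 2 * 60 * math.pi ** 2 * m ** 2 * c ** 3)

def coeff_in_alpha(e, hbar, m, c):
    """Second form in (10.46): alpha^2 / (15 pi^2 m^2 c), alpha = e^2/(4 pi hbar c)."""
    alpha = e ** 2 / (4 * math.pi * hbar * c)
    return alpha ** 2 / (15 * math.pi ** 2 * m ** 2 * c)

if __name__ == "__main__":
    args = (0.3, 1.7, 2.1, 3.0)  # arbitrary test values for e, hbar, m, c
    print(coeff_from_1039_and_1030(*args),
          coeff_in_charge_units(*args),
          coeff_in_alpha(*args))
```

All three expressions agree, which confirms that (10.46) follows from (10.39) with δ = 0 and the low-energy form (10.30).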
10.3.2 Vertex diagrams

The full electron vertex contribution in Figs. 10.1(e) and (h) is given by equa-
tion (10.40), where the square bracket should be replaced by the ultraviolet-
finite expression
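\[
\begin{aligned}
&I^\kappa(\tilde{q},\tilde{q}') - \frac{F\pi^2}{ic^3}\gamma^\kappa \\
&= \frac{\pi^2 \gamma^\kappa}{ic^3} \left[ \frac{8\theta}{\tan(2\theta)} \ln\frac{\lambda}{m} + \frac{8}{\tan(2\theta)} \int_0^\theta \alpha\tan\alpha\, d\alpha + \frac{1}{2} + 6\theta\cot\theta + 2\ln\frac{\Lambda}{m} \right] \\
&\quad - \frac{2\pi^2 \theta (\tilde{q} + \tilde{q}')^\kappa}{imc^5 \sin(2\theta)} - \frac{\pi^2 \gamma^\kappa}{ic^3} \left( 4\ln\frac{\lambda}{m} + 2\ln\frac{\Lambda}{m} + \frac{9}{2} \right) \\
&= \frac{\pi^2 \gamma^\kappa}{ic^3} \left[ \left( \frac{8\theta}{\tan(2\theta)} - 4 \right) \ln\frac{\lambda}{m} + \frac{8}{\tan(2\theta)} \int_0^\theta \alpha\tan\alpha\, d\alpha - 4 + 6\theta\cot\theta \right] \\
&\quad - \frac{2\pi^2 \theta (\tilde{q} + \tilde{q}')^\kappa}{imc^5 \sin(2\theta)}
\end{aligned}
\]
35
see Appendix J.9
36
Here $\alpha \equiv e^2/(4\pi\hbar c) \approx 1/137$ is the fine structure constant.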
In the limit of small momentum transfer (θ ≈ 0) this formula simplifies
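\[
\begin{aligned}
\theta &\approx \frac{\sqrt{\tilde{k}^2}}{2mc^2} \\
\frac{8\theta}{\tan(2\theta)} &\approx \frac{4|\tilde{k}|}{mc^2 \tan\left(\frac{|\tilde{k}|}{mc^2}\right)} \approx \frac{4|\tilde{k}|}{mc^2 \left( \frac{|\tilde{k}|}{mc^2} + \frac{|\tilde{k}|^3}{3m^3c^6} \right)} \approx 4 - \frac{4\tilde{k}^2}{3m^2c^4} \\
\frac{8}{\tan(2\theta)} \int_0^\theta \alpha\tan\alpha\, d\alpha &\approx \frac{4}{\theta} \int_0^\theta \alpha^2\, d\alpha = \frac{4}{\theta} \cdot \frac{\theta^3}{3} \approx \frac{\tilde{k}^2}{3m^2c^4} \\
6\theta\cot\theta &\approx 6\theta \left( \frac{1}{\theta} - \frac{\theta}{3} \right) = 6 - 2\theta^2 \approx 6 - \frac{\tilde{k}^2}{2m^2c^4} \\
\frac{2\theta}{\sin(2\theta)} &\approx \frac{2\theta}{2\theta - (4/3)\theta^3} \approx 1 + \frac{2\theta^2}{3} \approx 1 + \frac{\tilde{k}^2}{6m^2c^4} \\
I^\kappa(\tilde{q},\tilde{q}') - \frac{F\pi^2}{ic^3}\gamma^\kappa &\approx \frac{2\pi^2 \gamma^\kappa}{ic^3} \left( 1 - \frac{\tilde{k}^2}{12m^2c^4} \right) - \frac{\pi^2 (\tilde{q} + \tilde{q}')^\kappa}{imc^5} \left( 1 + \frac{\tilde{k}^2}{6m^2c^4} \right) - \frac{\pi^2 \gamma^\kappa}{ic^3} \frac{4\tilde{k}^2}{3m^2c^4} \ln\frac{\lambda}{m}
\end{aligned}
\]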
so that diagrams 10.1(e) + (h) become
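\[
\begin{aligned}
s_4^{(e)+(h)} &\approx \frac{-ic^3 \alpha^2\, Mmc^4}{4\pi^2 \tilde{k}^2 \sqrt{\omega_q \omega_{q'} \Omega_p \Omega_{p'}}}\, W^\kappa(p,\tau;p',\tau') \times \\
&\quad \bar{u}(q,\sigma) \left[ 2\gamma^\kappa \left( 1 - \frac{\tilde{k}^2}{12m^2c^4} \right) - \frac{(\tilde{q} + \tilde{q}')^\kappa}{mc^2} \left( 1 + \frac{\tilde{k}^2}{6m^2c^4} \right) - \frac{4\gamma^\kappa \tilde{k}^2}{3m^2c^4} \ln\frac{\lambda}{m} \right] u(q',\sigma')
\end{aligned}
\tag{10.47}
\]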
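The small-θ expansions used on the way to (10.47) can be verified numerically. The sketch below is an illustration only; the helper names are introduced here, and the expansions are written in terms of θ itself (recall θ ≈ |k̃|/(2mc²), so that θ² = k̃²/(4m²c⁴)).

```python
import math

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(lo + j * h)
    return total * h / 3.0

def exact_terms(theta):
    """The four theta-dependent pieces of the finite vertex expression."""
    integral = simpson(lambda a: a * math.tan(a), 0.0, theta)
    return (
        8 * theta / math.tan(2 * theta),
        8 / math.tan(2 * theta) * integral,
        6 * theta / math.tan(theta),
        2 * theta / math.sin(2 * theta),
    )

def leading_terms(theta):
    """Leading small-theta forms quoted in the text, in terms of theta."""
    t2 = theta * theta
    return (4 - 16 * t2 / 3, 4 * t2 / 3, 6 - 2 * t2, 1 + 2 * t2 / 3)

if __name__ == "__main__":
    theta = 0.01
    for exact, approx in zip(exact_terms(theta), leading_terms(theta)):
        print(f"{exact:.10f}  {approx:.10f}")
```

Each pair differs only at order θ⁴, which is beyond the accuracy kept in (10.47).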
This expression can be divided into two parts
\[
s_4^{(e)+(h)} = s_4^{(e)+(h)AMM} + s_4^{(e)+(h)div}
\]
where $s_4^{(e)+(h)AMM}$ remains finite in the infrared limit λ → 0 (see footnote 37) and $s_4^{(e)+(h)div}$
contains the infrared-divergent logarithm ln(λ/m). Let us now introduce the
vector of transferred momentum k = q′ − q = p − p′, apply the limit M → ∞
and the (v/c)² approximation38 to the infrared-finite part of the amplitude
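\[
\begin{aligned}
s_4^{(e)+(h)AMM} &= \frac{ic\alpha^2\, Mmc^4}{4\pi^2 k^2 \sqrt{\Omega_{p-k} \Omega_p \omega_{q+k} \omega_q}} \times \\
&\quad \bar{u}(q+k,\sigma) \left[ 2\gamma^\kappa + \frac{\gamma^\kappa k^2}{6m^2c^2} - \frac{(q+k)^\kappa + q^\kappa}{mc^2} \left( 1 - \frac{k^2}{6m^2c^2} \right) \right] u(q,\sigma') \times \\
&\quad W^\kappa(p-k,\tau;p,\tau') \qquad\qquad (10.48) \\
&\approx \frac{ic\alpha^2 \delta_{\tau,\tau'}\, Mmc^4}{4\pi^2 k^2 \sqrt{\Omega_{p-k} \Omega_p \omega_{q+k} \omega_q}} \times \\
&\quad \left[ 2\left( 1 + \frac{k^2}{12m^2c^2} \right) U^0(q+k,\sigma;q,\sigma') - \left( 1 - \frac{k^2}{6m^2c^2} \right) \frac{\omega_{q+k} + \omega_q}{mc^2}\, \bar{u}(q+k,\sigma) u(q,\sigma') \right]
\end{aligned}
\]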
We use formulas from Appendices J.5, J.9, and (H.8) to further simplify both
parts of this expression

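\[
\begin{aligned}
&\frac{2ic\alpha^2 \delta_{\tau,\tau'}}{4\pi^2 k^2} \cdot \frac{Mmc^4}{\sqrt{\Omega_{p-k} \omega_{q+k} \Omega_p \omega_q}} \left( 1 + \frac{k^2}{12m^2c^2} \right) U^0(q+k,\epsilon;q,\epsilon') \\
&\approx \frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2 k^2} \left( 1 - \frac{q^2}{2m^2c^2} - \frac{\mathbf{q}\mathbf{k}}{2m^2c^2} - \frac{k^2}{4m^2c^2} \right) \left( 1 + \frac{k^2}{12m^2c^2} \right) \times \\
&\quad \chi^\dagger_\sigma \left( 1 + \frac{(2\mathbf{q} + \mathbf{k})^2 + 2i\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}]}{8m^2c^2} \right) \chi_{\sigma'} \\
&\approx \frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2 k^2}\, \chi^\dagger_\sigma \left( 1 - \frac{q^2}{2m^2c^2} - \frac{\mathbf{q}\mathbf{k}}{2m^2c^2} - \frac{k^2}{4m^2c^2} + \frac{k^2}{12m^2c^2} + \frac{q^2}{2m^2c^2} + \frac{\mathbf{q}\mathbf{k}}{2m^2c^2} + \frac{k^2}{8m^2c^2} + i\frac{\vec{\sigma}_{el} [\mathbf{k} \times \mathbf{q}]}{4m^2c^2} \right) \chi_{\sigma'} \\
&\approx \frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2}\, \chi^\dagger_\sigma \left( \frac{1}{k^2} - \frac{1}{24m^2c^2} + i\frac{\vec{\sigma}_{el} [\mathbf{k} \times \mathbf{q}]}{4m^2c^2 k^2} \right) \chi_{\sigma'}
\end{aligned}
\tag{10.49}
\]
37
In chapter 14.2 we will see that this expression is related to the electron’s anomalous
magnetic moment (AMM).
38
see Appendix J.9. For example, in this limit we can replace $W^0 \approx \delta_{\tau,\tau'}$ and $\mathbf{W} \approx 0$.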

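\[
\begin{aligned}
&-\frac{ic\alpha^2 \delta_{\tau,\tau'}}{4\pi^2 k^2} \cdot \frac{Mmc^4}{\sqrt{\Omega_{p-k} \omega_{q+k} \Omega_p \omega_q}} \cdot \left( 1 - \frac{k^2}{6m^2c^2} \right) \frac{\omega_{q+k} + \omega_q}{mc^2}\, \bar{u}(q+k,\sigma) u(q,\sigma') \\
&= -\frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2 k^2} \left( 1 - \frac{q^2}{2m^2c^2} - \frac{\mathbf{q}\mathbf{k}}{2m^2c^2} - \frac{k^2}{4m^2c^2} - \frac{k^2}{6m^2c^2} + \frac{q^2 + k^2 + 2\mathbf{q}\mathbf{k}}{4m^2c^2} + \frac{q^2}{4m^2c^2} \right) \times \\
&\quad \chi^\dagger_\sigma \left[ \sqrt{\omega_{q+k} + mc^2},\ \sqrt{\omega_{q+k} - mc^2}\, \frac{\mathbf{q} + \mathbf{k}}{|\mathbf{q} + \mathbf{k}|} \cdot \vec{\sigma}_{el} \right]
\begin{bmatrix} \sqrt{\omega_q + mc^2} \\ -\sqrt{\omega_q - mc^2}\, \frac{\mathbf{q}}{q} \cdot \vec{\sigma}_{el} \end{bmatrix} \frac{\chi_{\sigma'}}{2mc^2} \\
&= -\frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2 k^2} \left( 1 - \frac{k^2}{6m^2c^2} \right) \chi^\dagger_\sigma \left[ \frac{\sqrt{\omega_{q+k} + mc^2} \sqrt{\omega_q + mc^2}}{2mc^2} \right. \\
&\quad \left. - \frac{\sqrt{\omega_{q+k} - mc^2} \sqrt{\omega_q - mc^2}}{2mc^2} \left( \frac{\mathbf{q} + \mathbf{k}}{|\mathbf{q} + \mathbf{k}|} \cdot \vec{\sigma}_{el} \right) \left( \frac{\mathbf{q}}{q} \cdot \vec{\sigma}_{el} \right) \right] \chi_{\sigma'} \\
&\approx -\frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2 k^2} \left( 1 - \frac{k^2}{6m^2c^2} \right) \times \\
&\quad \chi^\dagger_\sigma \left[ \left( 1 + \frac{(\mathbf{q} + \mathbf{k})^2}{8m^2c^2} \right) \left( 1 + \frac{q^2}{8m^2c^2} \right) - \frac{|\mathbf{q} + \mathbf{k}|\, q}{4m^2c^2} \left( \frac{\mathbf{q} + \mathbf{k}}{|\mathbf{q} + \mathbf{k}|} \cdot \vec{\sigma}_{el} \right) \left( \frac{\mathbf{q}}{q} \cdot \vec{\sigma}_{el} \right) \right] \chi_{\sigma'} \\
&= \frac{ic\alpha^2 \delta_{\tau,\tau'}}{2\pi^2}\, \chi^\dagger_\sigma \left( -\frac{1}{k^2} + \frac{1}{24m^2c^2} + \frac{i\vec{\sigma}_{el} [\mathbf{k} \times \mathbf{q}]}{4m^2c^2 k^2} \right) \chi_{\sigma'}
\end{aligned}
\tag{10.50}
\]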
Putting (10.49) and (10.50) together, we obtain39
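\[
s_4^{(e)+(h)AMM} = -\frac{\alpha^2 \delta_{\tau,\tau'}}{4\pi^2 m^2 c}\, \chi^\dagger_\sigma\, \frac{(\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}])}{k^2}\, \chi_{\sigma'} \tag{10.51}
\]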
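The cancellation behind (10.51) can be traced with exact rational arithmetic. In the sketch below (an illustration; the dictionary keys and function names are introduced here) each of the two partial results (10.49) and (10.50) is stored as three coefficients multiplying the common factor icα²δ_{τ,τ′}/(2π²k²): the constant term, the coefficient of k²/(m²c²), and the coefficient of iσ_el·[k × q]/(m²c²). Terms in q² and qk are omitted because they cancel within each part separately.

```python
from fractions import Fraction as Fr

def part_1049():
    """Coefficients of the first part, read from the line preceding (10.49)."""
    k2 = Fr(-1, 4) + Fr(1, 12) + Fr(1, 8)
    return {"coulomb": Fr(1), "k2": k2, "spin": Fr(1, 4)}

def part_1050():
    """Coefficients of the second part.

    To first order: -(1 - k^2/6) * (1 + k^2/8 - i sigma.[k x q]/4),
    where the spin term comes from
    (sigma.(q+k))(sigma.q) = (q+k).q + i sigma.[k x q].
    """
    k2 = Fr(1, 6) - Fr(1, 8)
    return {"coulomb": Fr(-1), "k2": k2, "spin": Fr(1, 4)}

def total():
    """Sum of the two parts, coefficient by coefficient."""
    a, b = part_1049(), part_1050()
    return {key: a[key] + b[key] for key in a}

if __name__ == "__main__":
    print(part_1049(), part_1050(), total())
```

Only the spin coefficient 1/2 survives in the sum; multiplying it by icα²/(2π²k²) and by i/(m²c²) reproduces the prefactor −α²/(4π²m²c k²) of (10.51).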
For the λ-dependent part of (10.47) we use the non-relativistic approximation

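\[
\begin{aligned}
s_4^{(e)+(h)div} &= \frac{i\alpha^2}{3\pi^2 m^2 c} \cdot \frac{Mmc^4}{\sqrt{\Omega_{p-k} \omega_{q+k} \Omega_p \omega_q}}\, \ln\left(\frac{\lambda}{m}\right) (\tilde{U} \cdot \tilde{W}) \\
&\approx \frac{i\alpha^2}{3\pi^2 m^2 c} \ln\left(\frac{\lambda}{m}\right) \delta_{\sigma,\sigma'} \delta_{\tau,\tau'}
\end{aligned}
\tag{10.52}
\]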
39
As expected from the charge renormalization condition, Coulomb-like terms ∝ 1/k 2
have canceled out.
10.3.3 Ladder diagram

Let us investigate the contribution in S4c corresponding to the ladder diagram
10.1(f). According to Feynman rules, it is given by the integral40
(f ) e4 (2π~)16 −1 −~4 c4 Mmc4

s4 = p
~4 (2π)2 (2π~)6 (2π)2 (2π~)6 (2π~)6 ωq ωq′ Ωp Ωp′
Z
/ − /h + mc2 )γν u(q′ ) w(p)γµ (p
u(q)γµ (q / + /h + Mc2 )γν w(p′ )
d4 h ·
(q̃ − h̃)2 − m2 c4 (p̃ + h̃)2 − M 2 c4
1
[h̃ − λ c ][(h̃ + k̃)2 − λ2 c4 ]
2 2 4
We use Dirac equations (J.83) - (J.82) for functions u(q) and w(p) and
the anticommutator relationship (J.4) for gamma matrices to rewrite the
numerator
[u(q)γµ (q / − /h + mc2 )γν u(q′ )] · [w(p)γµ (p

/ + /h + Mc2 )γν w(p′ )]
= [u(q)γµ (q / + mc2 )γν u(q′ ) − u(q)γµ/hγν u(q′ )] ×
[w(p)γµ (p / + Mc2 )γν w(p′ ) + w(p)γµ/hγν w(p′ )]
= [2u(q)q µ γν u(q′ ) − u(q)γµ/hγν u(q′ )][2w(p)pµ γν w(p′) + w(p)γµ/hγν w(p′ )]
= 4(q̃ · p̃)u(q)γν u(q′ )w(p)γν w(p′) + 2u(q)γν u(q′ )w(p)q /γα γν w(p′ )hα
− 2u(q)p /γα γν u(q′ )w(p)γν w(p′ )hα − u(q)γµ γα γν u(q′ )w(p)γµ γβ γν w(p′ )hα hβ
In the denominators we use q̃ 2 = m2 c4 and p̃2 = M 2 c4 to write
(q̃ − h̃)2 − m2 c4 = h̃2 − 2(q̃ · h̃)

(p̃ + h̃)2 − M 2 c4 = h̃2 + 2(p̃ · h̃)
Using non-relativistic approximation (J.70) we then obtain
(f ) e4 c4
s4 ≈ ×
(2π)4 (2π~)2
40
Here we dropped spin indices, as in our approximation the spin dependence will be
lost anyway.
[4(q̃ · p̃)u(q)γν u(q′ )w(p)γν w(p′ )b(p, q, k)

+ 2u(q)γν u(q′ )w(p)q /γα γν w(p′ )bα (p, q, k)
− 2u(q)p /γα γν u(q′ )w(p)γν w(p′ )bα (p, q, k)
− u(q)γµ γα γν u(q′ )w(p)γµ γβ γν w(p′)bαβ (p, q, k)] (10.53)
where
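\[
b(p,q,k) = \int \frac{d^4h}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 + 2(\tilde{p} \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]} \tag{10.54}
\]
\[
b^\alpha(p,q,k) = \int \frac{d^4h\, h^\alpha}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 + 2(\tilde{p} \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]}
\]
\[
b^{\alpha\beta}(p,q,k) = \int \frac{d^4h\, h^\alpha h^\beta}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 + 2(\tilde{p} \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]}
\]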
In our calculations we are interested only in leading infrared-divergent

terms. They come from those regions in the 4D space of the integration vari-
able h̃, where integrand’s denominators vanish in the limit λ → 0. These are
regions h̃ ≈ 0 and h̃ ≈ −k̃. Using these approximations in the numerators,
we get
bα (p, q, k) ≈ −k α b(p, q, k)
bαβ (p, q, k) ≈ k α k β b(p, q, k)
Now substitute these results in (10.53) and use definitions (J.62) - (J.63) of
functions U µ and W µ
(f ) e4 c4
s4 = b(p, q, k) ×
(2π)4 (2π~)2
[4(q̃ · p̃)(u(q)γν u(q′ )w(p)γν w(p′ ) − 2u(q)γν u(q′ )w(p)q /γν w(p′ )
/k
+ 2u(q)p /k/γν u(q′ )w(p)γν w(p′) − u(q)γµ/kγν u(q′ )w(p)γµ/kγν w(p′ )]
e4 c4
= b(p, q, k)[4(q̃ · p̃)(Ũ · W̃ ) − 2Uν w(p)q/γν w(p′ )
/k
(2π)4 (2π~)2
+ 2u(q)p /k/γν u(q′ )Wν − u(q)γµ/kγν u(q′ )w(p)γµ/kγν w(p′ )]
Next we need to simplify separate pieces of this expression
w(p)q/γν w(p′ ) = w(p)q

/k /γν w(p′ ) − w(p)q
/p /p/′ γν w(p′ )
= w(p)q/γν w(p′) − w(p)q
/p /(2(p′ )ν − mc2 γν )w(p′ )
= w(p)(−p / + 2(q̃ · p̃))γν w(p′ ) − 2(p′ )ν (q̃ · W̃ ) + mc2 w(p)q
/q /γν w(p′ )
= w(p)(−mc2/q + 2(q̃ · p̃))γν w(p′ ) − 2(p′ )ν (q̃ · W̃ ) + mc2 w(p)q/γν w(p′ )
= −mc2 w(p)q/γν w(p′) + 2(q̃ · p̃)Wν − 2(p′ )ν (q̃ · W̃ )
+ /γν w(p′ ) = 2(q̃ · p̃)Wν − 2(p′ )ν (q̃ · W̃ )
mc2 w(p)q (10.55)
u(q)p /γν u(q′ ) = u(q)p

/k /′ − /)γ
/(q q ν u(q′ )
= u(q)p /′ γν u(q′ ) − u(q)p
/q /γν u(q′ )
/q
= u(q)p /(−γν/q′ + 2(q ′ )ν )u(q′ ) − u(q)(−q / + 2(q̃ · p̃))γν u(q′ )
/p
= −u(q)p /γν/q′ u(q′ ) + 2u(q)p /(q ′ )ν u(q′ ) + u(q)q/γν u(q′ )
/p
− 2(q̃ · p̃)Uν
= −mc2 u(q)p /γν u(q′ ) + 2(q ′ )ν (p̃ · Ũ ) + mc2 u(q)p /γν u(q′ ) − 2(q̃ · p̃)Uν
= 2(q ′ )ν (p̃ · Ũ ) − 2(q̃ · p̃)Uν
u(q)γµ/kγν u(q′ )
= u(q)γµ/q′ γν u(q′ ) − u(q)γµ/γ
q ν u(q′ )
= u(q)γµ (−γν/q′ + 2(q ′ )ν )u(q′ ) − u(q)(−q /γµ + 2q µ )γν u(q′ )
= −u(q)γµ γν/q′ u(q′ ) + 2u(q)γµ (q ′ )ν u(q′ ) + u(q)q
/γµ γν u(q′)
−2u(q)q µ γν u(q′ )
= −mc2 u(q)γµ γν u(q′ ) + 2(q ′)ν Uµ + mc2 u(q)γµ γν u(q′ ) − 2q µ Uν
= 2(q ′ )ν Uµ − 2q µ Uν (10.56)
w(p)γµ/kγν w(p′ ) = −2(p′ )ν Wµ + 2pµ Wν (10.57)
Then we use equalities (M.47), (M.48) and non-relativistic approximations

(q̃ ′ · p̃′ ) ≈ Mmc4 , (Ũ · W̃ ) ≈ 1 to write
(f ) e4 c4
s4 = b(p, q, k) ×
(2π)4 (2π~)2
[4(q̃ · p̃)(Ũ · W̃ ) − 2Uν (2(q̃ · p̃)Wν − 2(p′ )ν (q̃ · W̃ ))
+ 2(2(q ′ )ν (p · U) − 2(q̃ · p̃)Uν )Wν − (2(q ′ )ν Uµ − 2q µ Uν )(−2(p′ )ν Wµ + 2pµ Wν )]
4e4 c4
= b(p, q, k) ×
(2π)4 (2π~)2
[(q̃ · p̃)(Ũ · W̃ ) − (q̃ · p̃)(Ũ · W̃ ) + (p̃′ · Ũ )(q̃ · W̃ ) + (q̃ ′ · W̃ )(p̃ · Ũ ) − (q̃ · p̃)(Ũ · W̃ )
+ (q̃ ′ · p̃′ )(Ũ · W̃ ) − (q̃ ′ · W̃ )(p̃ · Ũ) − (q̃ · W̃ )(p̃′ · Ũ) + (q̃ · p̃)(Ũ · W̃ )]
4e4 c4
= b(p, q, k)(q̃ ′ · p̃′ )(Ũ · W̃ )
(2π)4 (2π~)2
4e4 Mmc8
≈ b(p, q, k) (10.58)
(2π)4 (2π~)2
Function b(p, q, k) is evaluated in (M.52)
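\[
b(p,q,k) = \frac{\pi^2}{ic^3 \tilde{k}^2} \ln\left( \frac{\tilde{k}^2}{\lambda^2c^4} \right) \int_0^1 \frac{dy}{(\tilde{p} + \tilde{q})^2 y^2 - 2\tilde{p}(\tilde{p} + \tilde{q})y + \tilde{p}^2} \tag{10.59}
\]
This integral is of the table form
\[
\int \frac{dy}{ay^2 + by + c} = \frac{2}{\sqrt{4ac - b^2}} \tan^{-1} \frac{2ay + b}{\sqrt{4ac - b^2}} + const
\]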
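The table integral can be checked against direct numerical quadrature. The sketch below is an illustration only: it uses real coefficients with 4ac − b² > 0, whereas in the text the discriminant is negative and the square root is imaginary, so this confirms the form of the antiderivative rather than the analytic continuation. All names and sample coefficients are introduced here.

```python
import math

def antiderivative(y, a, b, c):
    """Table antiderivative of 1/(a y^2 + b y + c); assumes 4ac - b^2 > 0."""
    d = math.sqrt(4.0 * a * c - b * b)
    return 2.0 / d * math.atan((2.0 * a * y + b) / d)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(lo + j * h)
    return total * h / 3.0

def check(a, b, c):
    """Return (closed form, numerical value) of the integral over [0, 1]."""
    closed = antiderivative(1.0, a, b, c) - antiderivative(0.0, a, b, c)
    numeric = simpson(lambda y: 1.0 / (a * y * y + b * y + c), 0.0, 1.0)
    return closed, numeric

if __name__ == "__main__":
    print(check(2.0, 1.0, 3.0))  # sample coefficients with 4ac - b^2 = 23
```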
So, we get
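\[
\begin{aligned}
&\int_0^1 \frac{dy}{(\tilde{p} + \tilde{q})^2 y^2 - 2\tilde{p}(\tilde{p} + \tilde{q})y + \tilde{p}^2} \\
&= 2B^{-1} \tan^{-1} \left. \frac{2(\tilde{p} + \tilde{q})^2 y - 2\tilde{p}(\tilde{p} + \tilde{q})}{B} \right|_{y=0}^{y=1} \\
&= 2B^{-1} \left[ \tan^{-1} \frac{2\tilde{q}^2 + 2(\tilde{p} \cdot \tilde{q})}{B} + \tan^{-1} \frac{2\tilde{p}^2 + 2(\tilde{p} \cdot \tilde{q})}{B} \right] \\
&\approx \frac{1}{iMc^3 q} \left[ \tan^{-1} \frac{mc}{iq} + \tan^{-1} \frac{Mc}{iq} \right]
\end{aligned}
\tag{10.60}
\]
where we used the inequality M ≫ m and denoted
\[
\begin{aligned}
B &\equiv \sqrt{4(\tilde{p} + \tilde{q})^2 \tilde{p}^2 - 4(\tilde{p}^2 + (\tilde{p} \cdot \tilde{q}))^2} = 2\sqrt{\tilde{p}^2 \tilde{q}^2 - (\tilde{p} \cdot \tilde{q})^2} = 2\sqrt{M^2m^2c^8 - (\tilde{p} \cdot \tilde{q})^2} \\
&\approx 2\sqrt{M^2m^2c^8 - \left[ \left( Mc^2 + \frac{p^2}{2M} \right) \left( mc^2 + \frac{q^2}{2m} \right) - c^2(\mathbf{p}\mathbf{q}) \right]^2} \\
&\approx 2\sqrt{-M^2c^6q^2} = 2iMc^3 q
\end{aligned}
\]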
Putting results (10.58) - (10.60) together and using k̃ 2 ≈ −c2 k 2 , we finally

obtain

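\[
s_4^{(f)} \approx \frac{\alpha^2 mc^2}{\pi^2 q k^2} \left[ \tan^{-1} \frac{mc}{iq} + \tan^{-1} \frac{Mc}{iq} \right] \ln\left( -\frac{k^2}{\lambda^2c^2} \right)
\tag{10.61}
\]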
We will not elaborate this result further as we expect some cancelations with
the crossed ladder diagram evaluated in the next subsection.
10.3.4 Cross-ladder diagram

Similar to the above ladder diagram we calculate the cross-ladder diagram
shown in Fig. 10.1(g)
Z
(g) e4 c4 u(q)γµ (q/ − /h + mc2 )γν u(q′ )
s4 ≈ d4 h ×
(2π)4 (2π~)2 (q̃ − h̃)2 − m2 c4 + iσ
/′ − /h + Mc2 )γµ w(p′ )
w(p)γν (p 1
′ 2
(p̃ − h̃) − M c 2 4 [h̃ − λ c ][(h̃ + k̃)2 − λ2 c4 ]
2 2 4
In the numerator we use (J.4), (J.83) - (J.82) to write
/ − /h + mc2 )γν u(q′ )] · [w(p)γν (p

[u(q)γµ (q /′ − /h + Mc2 )γµ w(p′ )]
/ + mc2 )γν u(q′ ) − u(q)γµ/hγν u(q′ )]
= [u(q)γµ (q
× [w(p)γµ (p/′ + Mc2 )γν w(p′ ) − w(p)γµ/hγν w(p′ )]
= [2u(q)q µ γν u(q′ ) − u(q)γµ/hγν u(q′ )]
× [2w(p)(p′ )ν γµ w(p′ ) − w(p)γµ/hγν w(p′ )]

= /′ u(q′ )w(p)q
4u(q)p /w(p′ ) − 2u(q)γν u(q′ )w(p)q
/γα γν w(p′ )hα
− 2u(q)γµ γα/p′ u(q′)w(p)γµ w(p′ )hα
+ u(q)γµ γα γν u(q′ )w(p)γµ γβ γν w(p′ )hα hβ
and
(g) e4 c4
s4 = ×
(2π)4 (2π~)2
/′ u(q′ )w(p)q
[4u(q)p /w(p′ )b(−p′ , q, k)
− 2u(q)γν u(q′ )w(p)q /γα γν w(p′ )bα (−p′ , q, k)
− 2u(q)γµ γα/p′ u(q′ )w(p)γµ w(p′ )bα (−p′ , q, k)
+ u(q)γµ γα γν u(q′ )w(p)γµ γβ γν w(p′ )bαβ (−p′ , q, k)] (10.62)
Here we notice that integral
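\[
b(-p',q,k) \equiv \int \frac{d^4h}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 - 2(\tilde{p}' \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]}
\]
can be obtained from (10.54) by replacing p̃ → −p̃′. Using the same assumptions
as in the preceding subsection, two other integrals can be expressed in
terms of b(−p′, q, k)
\[
\begin{aligned}
b^\alpha(-p',q,k) &\equiv \int \frac{d^4h\, h^\alpha}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 - 2(\tilde{p}' \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]} \\
&\approx -k^\alpha b(-p',q,k) \\
b^{\alpha\beta}(-p',q,k) &\equiv \int \frac{d^4h\, h^\alpha h^\beta}{[\tilde{h}^2 - 2(\tilde{q} \cdot \tilde{h})][\tilde{h}^2 - 2(\tilde{p}' \cdot \tilde{h})][\tilde{h}^2 - \lambda^2c^4][(\tilde{h} + \tilde{k})^2 - \lambda^2c^4]} \\
&\approx k^\alpha k^\beta b(-p',q,k)
\end{aligned}
\]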
Then in (10.62) we can use (10.55), (10.56), (10.57), and
′ ′
u(q)γµ/kp/u(q /′ − /)p
) = u(q)γµ (q q /′ u(q′ )
= u(q)γµ/q′/p′ u(q′ ) − u(q)γµ/p
q/′ u(q′)
= −u(q)γµ/p′/q′ u(q′) + 2u(q)γµ (q ′ p′ )u(q′ ) + u(q)q

/γµ/p′ u(q′)
−2u(q)q µ/p′ u(q′ ) = 2u(q)γµ (q ′ · p′ )u(q′ ) − 2u(q)q µ/p′ u(q′ )
= 2(q̃ ′ · p̃′ )Uµ − 2(p̃′ · Ũ )q µ
to obtain
(g) e4 c4
s4 = b(−p′ , q, k) ×
(2π)4 (2π~)2
[4(p̃′ · Ũ)(q̃ · W̃ ) + 2Uν (2(q̃ · p̃)Wν − 2(p′ )ν (q · W ))
+ 2(2(q̃ ′ · p̃′ )Uµ − 2(p̃′ · Ũ )q µ )Wµ + (2(q ′)ν Uµ − 2q µ Uν )(−2(p′ )ν Wµ + 2pµ Wν )]
4e4 c4
= b(−p′ , q, k) ×
(2π)4 (2π~)2
[(p̃′ · Ũ )(q̃ · W̃ ) + (q̃ · p̃)(Ũ · W̃ ) − (p̃′ · Ũ)(q̃ · W̃ ) + (q̃ ′ · p̃′ )(Ũ · W̃ )
− (p̃′ · Ũ)(q̃ · W̃ ) − (q̃ ′ · p̃′ )(Ũ · W̃ ) + (q̃ · W̃ )(p̃′ · Ũ ) + (q̃ ′ · W̃ )(p̃ · Ũ)
− (q̃ · p̃)(Ũ · W̃ )]
4e4 c4
= 4 2
b(−p′ , q, k)(q̃ ′ · W̃ )(p̃ · Ũ )
(2π) (2π~)
4e4 Mmc8
≈ b(−p′ , q, k)
(2π)4 (2π~)2
For the integral
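\[
b(-p',q,k) = \frac{\pi^2}{ic^3 \tilde{k}^2} \ln\left( \frac{\tilde{k}^2}{\lambda^2c^4} \right) \int_0^1 \frac{dy}{(-\tilde{p}' + \tilde{q})^2 y^2 + 2\tilde{p}'(-\tilde{p}' + \tilde{q})y + \tilde{p}^2}
\]
we use the same method as in (10.60). This time in our non-relativistic
approximation (p̃′ · q̃) ≈ Mmc⁴
\[
B' \equiv \sqrt{4(\tilde{q} - \tilde{p}')^2 (\tilde{p}')^2 - 4((\tilde{p}')^2 - (\tilde{p}' \cdot \tilde{q}))^2} \approx B = 2iMc^3 q
\]
\[
\begin{aligned}
&\int_0^1 \frac{dy}{(-\tilde{p}' + \tilde{q})^2 y^2 + 2\tilde{p}'(-\tilde{p}' + \tilde{q})y + \tilde{p}^2} \\
&\approx 2B^{-1} \tan^{-1} \left. \frac{2(-\tilde{p}' + \tilde{q})^2 y + 2\tilde{p}'(-\tilde{p}' + \tilde{q})}{B} \right|_{y=0}^{y=1} \\
&= 2B^{-1} \left[ \tan^{-1} \frac{2\tilde{q}^2 - 2(\tilde{p}' \cdot \tilde{q})}{B} + \tan^{-1} \frac{2\tilde{p}^2 - 2(\tilde{p}' \cdot \tilde{q})}{B} \right] \\
&\approx \frac{1}{iMc^3 q} \left[ -\tan^{-1} \frac{mc}{iq} + \tan^{-1} \frac{Mc}{iq} \right]
\end{aligned}
\]
and
\[
s_4^{(g)} \approx \frac{\alpha^2 mc^2}{\pi^2 q k^2} \left[ -\tan^{-1} \frac{mc}{iq} + \tan^{-1} \frac{Mc}{iq} \right] \ln\left( -\frac{k^2}{\lambda^2c^2} \right)
\]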
Adding this result to (10.61) and using approximation tan−1 (Mc/(iq)) ≈

−π/2 we obtain the joint contribution of the ladder and crossed ladder dia-
grams

\[
s_4^{(f)+(g)} \approx -\frac{\alpha^2 mc^2}{\pi q k^2} \ln\left( -\frac{k^2}{\lambda^2c^2} \right) \tag{10.63}
\]
10.3.5 Renormalizability
Combining results (10.46), (10.51), (10.52), (10.63) we get the following Λ-
independent 4th order amplitude for the electron-proton scattering
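\[
\begin{aligned}
&\langle 0| a_{q,\sigma} d_{p,\tau} S_4^c d^\dagger_{p',\tau'} a^\dagger_{q',\sigma'} |0\rangle \approx \delta^4(\tilde{q} - \tilde{q}' - \tilde{p}' + \tilde{p})\, \delta_{\tau\tau'} \times \\
&\left[ \frac{i\alpha^2}{15\pi^2 m^2 c} \delta_{\sigma\sigma'} + \frac{i\alpha^2}{3\pi^2 m^2 c} \ln\left(\frac{\lambda}{m}\right) \delta_{\sigma\sigma'} - \frac{mc^2 \alpha^2}{\pi q k^2} \ln\left( -\frac{k^2}{\lambda^2c^2} \right) \delta_{\sigma\sigma'} - \frac{\alpha^2 \chi^\dagger_\sigma (\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}]) \chi_{\sigma'}}{4\pi^2 m^2 c k^2} \right]
\end{aligned}
\tag{10.64}
\]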
As expected, this result is finite and does not depend on the cutoff param-
eter Λ. In other words, this result is ultraviolet-safe, so that we fulfilled
the promise of the renormalization approach. Unfortunately, the amplitude
(10.64) still contains unpleasant infrared-divergent logarithms. Their phys-
ical origin is related to the vanishing photon mass. Any collision involving
charged particles41 is inevitably accompanied by the emission of a large (even

infinite) number of low-energy (soft) photons. In most cases these soft pho-
tons escape experimental detection, but in a rigorous theoretical treatment
one must take them all into account in order to obtain scattering cross-
sections in good agreement with experiments. This would require rather
involved non-perturbative calculations [Wei95, PS95b] that are beyond the
scope of this book. The cancelation of infrared divergences in calculations of
the hydrogen energy spectrum (the Lamb shift) will be discussed in chapter
14.2.
So, we conclude that our renormalization approach has achieved its goal:
ultraviolet divergences in loop integrals have been canceled, and accurate
description of scattering is within reach. Can we get even better accuracy
by extending renormalization to higher perturbation orders? Yes, but then
we would need to add higher-order ultraviolet-divergent counterterms to our
interaction (10.44), so that the no-self-scattering and charge renormalization
conditions are enforced in each perturbation order. It is remarkable that
all these higher-order counterterms will have exactly the same functional
form as those already discussed. In other words, the complete infinite-order
interaction operator of the renormalized QED will have the same form as
our low-order expression (10.44). Only the values of exact renormalization
constants δm, Z2 − 1, Z3 − 1, and Z1 − 1 will get more complicated forms
than our 2nd order expressions δm2 , (Z2 − 1)2 , (Z3 − 1)2 , (Z1 − 1)2 . This fact
is referred to as the renormalizability of QED.
We have applied renormalization only to the potential energy operator V
in QED. In order to have a relativistic theory one also needs to find appropriate
counterterms for the potential boost operator Z, so that the “renormal-
ized boost” satisfies appropriate Poincaré commutation relations with the
“renormalized energy.” As far as I know, there were no attempts to extend
the renormalization theory to boosts. Nevertheless, we will assume that such
a construction is possible, and that renormalized QED is a fully relativistic
theory.
10.4 Troubles with renormalized QED

Great successes of the renormalized QED are well known. In this section we
are going to focus on its weak points. The most obvious problem of QED is
41
in particular, the e− − p+ collision considered here
related to extremely weird properties of its fundamental ingredients - bare

particles. The masses and charges of bare electrons and protons are infinite,
and the Hamiltonian of QED is formally infinite as well. More precisely,
coefficient functions of interaction terms in $H^c_{FD} = H_0 + V^c_{FD}$ written in the
bare-particle representation (10.44) diverge as the ultraviolet cutoff momentum
Λc is sent to infinity. From results obtained in this chapter, we know
that this operator can be used successfully in S-matrix calculations, because
all divergences cancel out. However, its use for bound state or time evolution
studies seems problematic.
10.4.1 Renormalization in QED revisited

Let us now review the material of this chapter and recall the logic which led
us from the original Feynman-Dyson interaction Hamiltonian $V_1$ in (10.15)
to the interaction with counterterms $V^c_{FD}$ in (10.44).42
One distinctive feature of the operator $V_1$ is its unphys (U) type. In
order to obtain the scattering phase operator F one needs to calculate multiple
commutators of $V_1$ as in (7.21). It is clear from Table 8.2 that these
commutators will give rise to renorm terms43 in each perturbation order of
F. However, according to equation (10.6) and Statement 10.1 (the no-self-scattering
renormalization condition), there should be absolutely no renorm
terms in the operator F of any sensible theory. So, we have a contradiction.
The renormalization approach presented in this chapter suggested the following
resolution of this paradox: change the interaction operator from $V_1$ to
$V^c_{FD}$ by adding infinite counterterms (10.44). Two conditions were used to
select the counterterms. The first (no-self-scattering renormalization) condition
required cancelation of all renorm terms in the scattering phase operator
$F^c$ calculated with $V^c_{FD}$. This requirement was satisfied by making sure
there was a certain balance of unphys and renorm (counter)terms in $V^c_{FD}$,
so that all renorm terms in $F^c$ canceled out. The second (charge renormalization)
condition demanded a consistency with classical electrodynamics in
the low energy regime. These two conditions were sufficient to specify all
counterterms in $V^c_{FD}$. Somewhat miraculously, the S-matrix obtained with
the thus modified Hamiltonian $H^c_{FD} = H_0 + V^c_{FD}$ agreed with experiment at all
energies and for all scattering processes.
42
Everything said about Feynman-Dyson interaction $V^c_{FD}$ in this subsection applies also
to the conventional renormalized interaction operator $V^c$ in (10.13).
43
in commutators [U, U ′ ]
One frequently encounters interpretations of the renormalization approach
which say that infinities in the Hamiltonian $H^c_{FD}$ (10.44) have a real physical
meaning. The usual interpretation is that renormalization (= the addition of
counterterms) is equivalent to redefinition of parameters (masses and charges
of particles) in the Lagrangian. In fact, these parameters become infinite af-
ter the renormalization. Thus, it is declared that particle operators specified
in section 8.1 refer not to real (or physical) particles, but to so-called bare
particles. One can hear also that bare electrons and protons really have in-
finite masses and infinite charges.44 The fact that such particles were never
observed in nature is then explained as follows: Bare particles are not eigenstates
of the total Hamiltonian $H^c_{FD}$. The “physical” electrons and protons
observed in experiments are complex linear combinations of multiparticle
bare states. These linear combinations are eigenstates of the total Hamil-
tonian, and they do have correct (finite) measurable masses m and M and
charges ±e. This situation is often described as bare particles being sur-
rounded by “clouds” of virtual particles, thus forming physical or dressed
particles. The virtual cloud modifies the mass of the bare particle by an
infinite amount, so that the resulting mass is exactly the one measured in ex-
periments. The cloud also “shields” the (infinite) charge of the bare particle,
so that the effective charge becomes ±e [Hua13].
Even if we accept this weird description of physical reality, it is clear
that the renormalization program did not solve the problem of ultraviolet
divergences in quantum field theory. The divergences were removed from
the S-operator, but they reappeared in the Hamiltonian $H^c_{FD}$ in the form
of infinite counterterms. This introduction of counterterms just shifted the
problem of infinities from one place to another. Inconsistencies of the renor-
malization approach concerned many prominent scientists, such as Dirac and
Landau. For example, Rohrlich wrote
Thus, present quantum electrodynamics is one of the strangest
achievements of the human mind. No theory has been confirmed
by experiment to higher precision; and no theory has been plagued
by greater mathematical difficulties which have withstood repeated
attempts at their elimination. There can be no doubt that the
present agreement with experiments is not fortuitous. Nevertheless,
the renormalization procedure can only be regarded as a temporary
crutch which holds up the present framework. It should be
noted that, even if the renormalization constants were not infinite,
the theory would still be unsatisfactory, as long as the unphysical
concept of “bare particle” plays a dominant role. F. Rohrlich
[Roh]
44
It is also common to hypothesize that these bare parameters may be actually very large
rather than infinite. The idea is that the “granularity” of space-time or other yet unknown
Planck-scale effect sets a natural momentum cutoff. This “effective field theory” approach
assumes that QED is just a low energy approximation to some unknown divergence-free
truly fundamental theory operating at the Planck scale. Speculations of this kind are not
needed for the dressed particle approach that will be developed in the second part of this
book. The dressed particle Hamiltonian and the corresponding S-operator remain finite,
even for infinite cutoff momentum.
In our interpretation of renormalization we do not distinguish bare and
physical particles. We claim that $|0\rangle$, $a^\dagger_p|0\rangle$ and $c^\dagger_p|0\rangle$ represent real physical
0-particle and 1-particle states with finite masses and charges. The only effect
of renormalization is to add certain counterterms to the original QED
interaction $V_1$, as shown in (10.44). The counterterms are formally divergent,
thus rendering the QED Hamiltonian unusable for most quantum-mechanical
calculations. In particular, time evolution studies become virtually impossi-
ble in this approach, as we will see in the next subsection.
10.4.2 Time evolution in QED

Let us forget for a moment that interaction (counter)terms in $H^c$ are infinite
and apply the time evolution operator $U^c(t \leftarrow 0) = \exp(-\frac{i}{\hbar} H^c t)$ to the
vacuum (no-particle) state. Expressing $V^c$ in terms of creation and annihilation
operators of particles we obtain45
$$\begin{aligned}
|0(t)\rangle &= e^{-\frac{i}{\hbar} H^c t}|0\rangle = \left(1 - \frac{it}{\hbar}(H_0 + V^c) + \ldots\right)|0\rangle \\
&\propto |0\rangle + t a^\dagger b^\dagger c^\dagger|0\rangle + t d^\dagger f^\dagger c^\dagger|0\rangle + \ldots \\
&\propto |0\rangle + t|abc\rangle + t|dfc\rangle + \ldots && (10.65)
\end{aligned}$$
We see that various multiparticle states ($|abc\rangle$, $|dfc\rangle$, etc.) are created from
the vacuum during time evolution. The physical vacuum in QED is not just
an empty state without particles. It is more like a boiling “soup” of bare
particles, antiparticles, and photons.
45
Here we are concerned only with the presence of $a^\dagger b^\dagger c^\dagger$ and $d^\dagger f^\dagger c^\dagger$ interaction terms
in (L.8). All other terms are omitted. We also omit factors $i$, $\hbar$ and coefficient functions
which are not relevant in this context.
Similarly, the time evolution of bare one-electron states is accompanied
by appearing and disappearing virtual particles. Such behaviors have not
been seen in experiments. Obviously, if a theory cannot get right the time
evolutions of simplest zero-particle and one-particle states, there is no hope
of predicting the time evolution in more complex multiparticle states.
The reason for these unphysical time evolutions is the presence of unphys
(e.g., $a^\dagger b^\dagger c^\dagger + d^\dagger f^\dagger c^\dagger$) and renorm ($a^\dagger a + b^\dagger b + d^\dagger d + f^\dagger f$) terms in
the interaction operator $V^c$ of the renormalized QED. How is it possible that
such an unrealistic Hamiltonian leads to exceptionally accurate experimental
predictions?46
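The contrast between unphys and phys interactions can be illustrated with a toy model. The sketch below is not QED: it assumes a single bosonic mode with $H_0 = \omega a^\dagger a$, and the interaction forms, the values of `omega` and `g`, and the Fock-space truncation `dim` are all illustrative assumptions. The term $g(a^\dagger a^\dagger + aa)$ stands in for an unphys interaction, and $g\, a^\dagger a^\dagger a a$ for a phys one. The code estimates the probability that the vacuum and the one-particle state are found unchanged after time $t$ (with $\hbar = 1$).

```python
import math

def apply_h(state, omega, g, kind):
    """Apply H = omega*a†a + V to a state given as Fock amplitudes state[n].

    kind="unphys": V = g*(a†a† + aa), which creates pairs out of the vacuum.
    kind="phys":   V = g*a†a†aa, which gives zero on |0> and |1>.
    """
    dim = len(state)
    out = [0j] * dim
    for n in range(dim):
        out[n] += omega * n * state[n]
        if kind == "phys":
            out[n] += g * n * (n - 1) * state[n]
        else:
            if n >= 2:       # a†a† raises |n-2> to |n>
                out[n] += g * math.sqrt(n * (n - 1)) * state[n - 2]
            if n + 2 < dim:  # aa lowers |n+2> to |n>
                out[n] += g * math.sqrt((n + 1) * (n + 2)) * state[n + 2]
    return out

def evolve(state, t, omega, g, kind, steps=100, terms=20):
    """Approximate exp(-i*H*t)|state> by short Taylor-series steps."""
    dt = t / steps
    for _ in range(steps):
        term = list(state)
        new = list(state)
        for k in range(1, terms + 1):
            h_term = apply_h(term, omega, g, kind)
            term = [(-1j * dt / k) * x for x in h_term]
            new = [a + b for a, b in zip(new, term)]
        state = new
    return state

def survival_probability(n0, t, kind, omega=1.0, g=0.3, dim=40):
    """|<n0| exp(-iHt) |n0>|^2 in a Fock space truncated to `dim` levels."""
    start = [0j] * dim
    start[n0] = 1 + 0j
    final = evolve(start, t, omega, g, kind)
    return abs(final[n0]) ** 2

for kind in ("unphys", "phys"):
    p0 = survival_probability(0, 1.0, kind)
    p1 = survival_probability(1, 1.0, kind)
    print(f"{kind:7s} vacuum: {p0:.4f}  one-particle: {p1:.4f}")
```

In this toy model the phys interaction leaves both probabilities equal to one, while the unphys interaction drains the vacuum and the one-particle state into multiparticle states, which is the behavior described in the text.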
The important point is that unphys and renorm interaction terms in $H^c$
are absolutely harmless when the time evolution in the infinite time range
(from $-\infty$ to $\infty$) is considered. As we saw in equation (7.8), such time
evolution is represented exactly by the product of the non-interacting time
evolution operator and the S-operator
$$U^c(\infty \leftarrow -\infty) = S^c U_0(\infty \leftarrow -\infty) = U_0(\infty \leftarrow -\infty) S^c$$
The factor $U_0$ in this product leaves invariant no-particle and one-particle
states. The factor $S^c$ has the same property due to the cancellation of unphys
and renorm terms in $F^c$, as discussed in subsection 10.1.2. So, in spite of
ill-defined operators $H^c$ and $\exp(-\frac{i}{\hbar} H^c t)$, the renormalized QED is perfectly
capable of describing scattering.
Luckily for QED, current experiments with elementary particles are not
designed to measure time-dependent dynamics in the interaction region.
These experiments are, basically, limited to measurements of scattering cross-
sections as well as energies and lifetimes of bound states, i.e., properties
encoded in the S-matrix. In collision experiments, interaction processes oc-
cur almost instantaneously, so their detailed time evolution is beyond reach.
Measured scattering cross sections do reflect this time evolution, but only
in an averaged, integrated form. In bound states (atoms, nuclei, etc.), the
interaction is present all the time, but the time evolution is trivial: stationary
wave functions acquire simple time-dependent phase factors $\exp(-\frac{i}{\hbar} E_n t)$.
Experimental measurements of such bound states are usually limited to their
energies $E_n$, while accessing their wave functions is virtually impossible. As
we know from subsection 7.1.5, $E_n$ can be obtained as positions of poles of
the S-matrix $S(E)$ on the complex energy plane. Thus, for description of
most experiments it is sufficient to know the S-operator; the knowledge of
the Hamiltonian is not required.47
46
Note that in the traditional renormalized QED S-matrix elements are calculated on
bare particle states. This appears to be in contradiction with the absence of well-defined
time evolution of such states and with the general understanding that bare particles are
not physical.
So, in the present experimental situation, the lack of a well-defined Hamil-
tonian and the inability of renormalized QFT to describe time evolution can
be tolerated. But there is no doubt that time-dependent processes in high
energy physics will be eventually accessible to more advanced experimental
techniques. See, for example, recent measurements of the time-dependent dy-
namics of atomic wave functions with attosecond resolution [DLM12]. More-
over, the time evolution is clearly observable in everyday “macroscopic” life.
So, a consistent and comprehensive subatomic theory must describe time-
dependent phenomena. The renormalized QED cannot do that, so the scope
of this theory is limited.
10.4.3 Unphys and renorm operators in QED

In the preceding subsection we saw that the presence of unphys and renorm
interaction operators in $V^c$ was responsible for unphysical time evolution of
bare particles. It is not difficult to see that the presence of such questionable
interaction terms is inevitable in any local quantum field theory where the
interaction Hamiltonian is constructed as a polynomial of quantum fields [Shi07].
We saw in (J.26) and (K.2) that quantum fields of both massive and massless
particles always have the form of a sum (creation operator + annihilation
operator)
$$\psi \propto \alpha^\dagger + \alpha$$
47
It is also important to note that even if we had complete information about the
S-matrix from our experiments, it would not allow us to reconstruct the Hamiltonian in a
unique way. According to subsection 7.2.1, there is an infinite number of (unitarily equivalent)
Hamiltonians producing the same given S-matrix.
Therefore, if we constructed interaction as a product (or polynomial) of
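fields,48 we would necessarily have unphys and renorm terms there. For
example, converting a product of four fields to the normally ordered form
$$\begin{aligned}
V \propto \psi^4 &= (\alpha^\dagger + \alpha)(\alpha^\dagger + \alpha)(\alpha^\dagger + \alpha)(\alpha^\dagger + \alpha) \\
&= \alpha^\dagger\alpha^\dagger\alpha^\dagger\alpha^\dagger + \alpha^\dagger\alpha^\dagger\alpha^\dagger\alpha + \alpha^\dagger\alpha\alpha\alpha + \alpha\alpha\alpha\alpha + \alpha^\dagger\alpha^\dagger + \alpha\alpha && (10.66) \\
&\quad + \alpha^\dagger\alpha + C && (10.67) \\
&\quad + \alpha^\dagger\alpha^\dagger\alpha\alpha && (10.68)
\end{aligned}$$
we obtain unphys terms (10.66) together with renorm terms (10.67) and
one phys term (10.68). The presence of unphys terms is an indication that
bare states created by operators $\alpha^\dagger$ cannot be sensibly associated with true
physical particles.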
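The normal ordering of the four-field product can be checked mechanically. The following sketch assumes a single bosonic mode with $[a, a^\dagger] = 1$, so momentum labels and coefficient functions are dropped; the helper names `multiply` and `classify` are hypothetical. Unlike the schematic expressions (10.66) - (10.68), it keeps the numerical coefficients, and it sorts the resulting terms into the three classes used in the text.

```python
from collections import defaultdict
from math import comb, factorial

def multiply(p, q):
    """Multiply two normally ordered single-mode polynomials.

    A polynomial is a dict {(m, n): coeff} standing for coeff * (a†)^m a^n.
    Uses a^n (a†)^r = sum_k k! C(n,k) C(r,k) (a†)^(r-k) a^(n-k),
    which follows from [a, a†] = 1.
    """
    out = defaultdict(int)
    for (m, n), c1 in p.items():
        for (r, s), c2 in q.items():
            for k in range(min(n, r) + 1):
                weight = factorial(k) * comb(n, k) * comb(r, k)
                out[(m + r - k, n + s - k)] += c1 * c2 * weight
    return dict(out)

def classify(m, n):
    """Label (a†)^m a^n following the pattern of (10.66) - (10.68)."""
    if m >= 2 and n >= 2:
        return "phys"
    if m == n:          # a†a and the constant C
        return "renorm"
    return "unphys"

field = {(1, 0): 1, (0, 1): 1}   # psi ∝ a† + a
product = {(0, 0): 1}
for _ in range(4):
    product = multiply(product, field)

by_type = defaultdict(list)
for (m, n), coeff in sorted(product.items(), reverse=True):
    by_type[classify(m, n)].append(((m, n), coeff))
for label in ("unphys", "renorm", "phys"):
    print(label, by_type[label])
```

Only the term with two creation and two annihilation operators is classified as phys, and the six unphys operator structures of (10.66) all appear.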
This analysis suggests that any quantum field theory is destined to suffer
from renormalization difficulties. Do we have any alternative? We are go-
ing to show that with certain modifications, quantum field theories can be
salvaged. It is possible to construct a satisfactory relativistic quantum ap-
proach where the Hamiltonian is well-defined, renormalization problems are
absent, and time evolution can be described in a natural way. This approach
will be discussed in the second part of this book.
48
as prescribed by general QFT rules from subsection 9.1.1.
Part II
QUANTUM THEORY OF PARTICLES
In the first part of this book we presented a fairly traditional view on
relativistic quantum field theory. This well-established approach had great
successes in many important areas of high energy physics, in particular, in
the description of scattering events. However, it also had a few troubling
spots. The first one is the problem of ultraviolet divergences. The idea of
self-interacting bare particles with infinite masses and charges seems com-
pletely unphysical. Moreover, QFT is not suitable for the description of time
evolution of particle observables and wave functions. In this second part of
the book, we suggest how to solve these problems by abandoning the idea of
quantum fields as basic ingredients of nature and returning to the old (go-
ing back to Newton) concept of particles interacting via direct forces. This
reformulation of QFT will be achieved by applying the “dressed particle”
approach first developed by Greenberg and Schweber [GS58].
Instantaneous forces acting between dressed particles imply the real possi-
bility of sending superluminal signals. Then we find ourselves in contradiction
with special relativity, where faster-than-light signaling is strictly forbidden
(see Appendix I.5). This paradox forces us to take a second look at derivations
of basic results in special relativity, such as Lorentz transformations for
space and time coordinates of events. We find that previous theories missed
one important point. Specifically, they ignored the fact that in interact-
ing systems generators of boost transformations are interaction-dependent.
A proper recognition of this fact will allow us to reconcile instantaneous
action-at-a-distance with the principle of causality in all reference frames
and to build a consistent relativistic theory of interacting quantum particles.
Chapter 11
DRESSED PARTICLE APPROACH
The first principle is that you must not fool yourself – and you
are the easiest person to fool.
Richard Feynman
In this chapter we will continue our discussion of quantum electrodynamics
- the theory of interacting charged particles (electrons, protons, etc.)
and photons. We are going to demonstrate that the formalism of QED can
be significantly improved by removing ultraviolet-divergent terms from the
Hamiltonian and abandoning the ideas of non-observable virtual and bare
particles. In particular, in sections 11.1 - 11.2 we will find a finite “dressed”
particle Hamiltonian $H^d$ which, in addition to accurate scattering operators,
also provides a good description of the time evolution and bound states. We
will call this approach the relativistic quantum dynamics (RQD). The word
“dynamics” is used here because, unlike the traditional quantum field theory
concerned with calculations of time-independent S-matrices, RQD empha-
sizes the dynamical, i.e., time-dependent, nature of interacting processes.
11.1 Dressing transformation

In section 10.2 we established the presence of unphys and renorm terms (as
well as their divergence) in the Hamiltonian $H^c = H_0 + V^c$ of the renormalized
QED. The viewpoint adopted in this book is that such terms are
not acceptable. Any realistic interaction operator must be finite and purely
physical. Therefore, we conclude that the Tomonaga-Schwinger-Feynman
renormalization program was just a first step in the process of elimination of
infinities from quantum field theory. In this section we are going to propose
how to make a second step in this direction: remove infinite contributions
from the Hamiltonian $H^c$ and solve the paradox of ultraviolet divergences in
QED.
In subsection 11.1.1 we will see that there are no compelling reasons1
for using traditional Hamiltonians of QED. So, we should not be afraid of
trying other Hamiltonians $H^d = H_0 + V^d$ if we can show that they reproduce
existing experimental data. Our starting idea is that QED interaction $V^c$ is
not good, and we are going to use a completely different interaction operator
$V^d$ for our version of quantum electrodynamics.
Our solution for H d will be based on the dressed particle approach which
has a long history. Initial ideas about “persistent interactions” in QFT were
expressed by van Hove [Hov55, Hov56]. The first clear formulation of the dressed
particle concept and its applications to model quantum field theories are contained
in a brilliant paper by Greenberg and Schweber [GS58]. This formalism
was further applied to various quantum field models including the scalar-field
model [Wal70], the Lee model [EKU62, Fiv70, DG73, DG75, Are72],
and the Ruijgrok-Van Hove model [Rui59, opu59]. The way to construct
the dressed particle Hamiltonian as a perturbation series in a general QFT
was suggested by Faddeev [Fad63] (see also [Tan59, Sat66, FS73]).
Shirokov with coworkers [Shi72, VS74, Shi93, Shi94, SS01] further developed
these ideas and, in particular, demonstrated how ultraviolet divergences can
be removed from the dressed Hamiltonian (see also [KSO97, KS04, KCS07]).
An interesting and somewhat related approach to particle interactions
in QFT was recently developed by Weber and co-authors [Web05, Web99,
WL02, WL05].
In this chapter we will present a general theory of the unitary dressing
transformation. In chapter 12 we are going to use this theory to construct
the dressed Hamiltonian $H^d$ in the 2nd perturbation order. Unfortunately,
in higher orders computations become very difficult. This challenge will be
addressed in chapter 14 where we will introduce a trick allowing us to go
beyond 2nd order results.
1
except, possibly, historical
11.1.1 On the origins of QED interaction

In subsection 9.1.2 we simply postulated the QED interaction operator (9.12)
- (9.14) or its renormalized form (10.44). What are the physical origins of
these expressions? Are there deeper fundamental principles that demand
this particular form of electromagnetic interactions? The standard textbook
answer is that the true reason for interactions between charged particles and
photons is the principle of local gauge invariance. It is usually postulated
that the Lagrangian of electromagnetic theory must be invariant with respect
to certain simultaneous “gauge” transformations of the fermion fields
$\psi(x, t)$, $\Psi(x, t)$, and the photon field $A_\mu(x, t)$. Then it appears that the free
field Lagrangian does not satisfy this requirement and that the local gauge
invariance can be ensured only after addition of “minimal” interaction terms
there. This idea is explained in all modern textbooks on field theory, so we
will not dwell on it here. It is sufficient to say that the same principle of
local gauge invariance has been used to derive interaction Lagrangians for
both electro-weak theory and quantum chromodynamics.
In spite of its wide theoretical use, the physical meaning of the gauge
invariance remains obscure. For example, the original idea of gauge freedom
comes from Maxwell’s electrodynamics. However, in chapter 15 we will see
that this theory can be replaced by a direct interaction approach in which
electromagnetic fields, potentials, and gauges do not play any role at all.
Moreover, the physical meaning of quantum fields themselves is not clear,
as will be discussed in section 17.4. For these reasons, in our book we do
not accept the usual claim about the fundamental physical importance of
fields and gauges. We maintain that the local gauge invariance should be
considered only as a heuristic principle, whose remarkable effectiveness still
awaits its proper explanation.
Perhaps, a more convincing justification of interactions (9.12) - (9.14)
was given by Weinberg in section 8.1 of his book [Wei95]. He assumed
that interaction must be a polynomial function of field components. Then
he advanced an experiment-based argument that this polynomial must be
linear in $A_\mu(x, t)$. Another requirement is the invariance of the interaction
polynomial with respect to the non-interacting representation of the Poincaré
group.2 If $A_\mu$ transformed as a 4-vector with respect to the Lorentz group,
then the latter condition could be satisfied if one chooses interaction in the
form $\propto A_\mu J_\mu$, where $J_\mu$ is any 4-vector composed of fermion fields. However,
Lorentz transformations of $A_\mu$ are different from the 4-vector law by the
presence of an additional term.3 This difficulty can be overcome if for $J_\mu$ one
chooses a conserved fermionic 4-vector. Then $A_\mu J_\mu$ is a Lorentz scalar despite
the non-4-vector character of $A_\mu$. The simplest choice for $J_\mu$ is the fermion
current density (L.1). This line of argument leads one to the electromagnetic
interaction operator (9.12) - (9.14) that is consistent with usual gauge-based
derivations.
However, neither gauge-based nor Weinberg’s arguments appear very con-
vincing, especially if one takes into account the need for adding divergent
renormalization counterterms to the resulting interaction Hamiltonian. The
apparent success of the renormalization program seems mysterious and acci-
dental. It is puzzling how infinite counterterms (almost) cancel divergences
in the S-matrix expansion and how the tiny residual radiative corrections
come out in perfect agreement with measurements.4
So, we just have to accept that the present formulation of QED lacks
solid theoretical foundation. Quantum fields and gauges seem to be heuristic
devices, and the whole construction is supported by agreement with experi-
ments more than by reliance on well-tested physical principles. Bearing this
weakness in mind, in this second part of our book we will attempt to re-
formulate the QED formalism. Instead of fields and gauges, our approach
will be based on the ideas of point particles and instantaneous interaction
potentials.
11.1.2 No-self-interaction condition

In section 10.4 we saw that the presence of unphys and renorm interactions
is the immediate cause of many QFT problems, such as the need for renor-
malization and the absence of a well-defined time evolution. The simplest
way to avoid these problems is to demand that the true (dressed) interaction
Hamiltonian $V^d$ does not contain unphys and renorm terms. So, we are now
going to postulate that our desired interaction operator $V^d$ has phys terms5
$$V^d = \alpha^\dagger\alpha^\dagger\alpha\alpha + \alpha^\dagger\alpha^\dagger\alpha\alpha\alpha + \alpha^\dagger\alpha^\dagger\alpha^\dagger\alpha\alpha + \ldots \qquad (11.1)$$
2
see condition (II) in step 2. on page 301
3
see equation (K.23)
4
see chapter 14.2
According to Table 8.2 in subsection 8.2.5, commutators of phys terms can be
only phys. Therefore, when the scattering operator $F^d$ is calculated from $V^d$
via equation (7.21) only phys terms can appear there in each perturbation
order. Then both $V^d$ and $F^d$ yield zero when acting on zero-particle and
one-particle states
$$V^d|0\rangle = V^d\alpha^\dagger|0\rangle = 0$$
$$F^d|0\rangle = F^d\alpha^\dagger|0\rangle = 0$$
as required by the no-self-scattering renormalization condition 10.1. Moreover,
time evolutions of the vacuum and one-particle states are not different
from their free time evolutions
$$\begin{aligned}
|0(t)\rangle &= e^{-\frac{i}{\hbar} H^d t}|0\rangle = \left(1 - \frac{it}{\hbar}(H_0 + (V^d)^{ph}) + \ldots\right)|0\rangle \\
&= \left(1 - \frac{it}{\hbar} H_0 + \ldots\right)|0\rangle = e^{-\frac{i}{\hbar} H_0 t}|0\rangle \\
|\alpha(t)\rangle &= e^{-\frac{i}{\hbar} H^d t}\alpha^\dagger|0\rangle = \left(1 - \frac{it}{\hbar}(H_0 + (V^d)^{ph}) + \ldots\right)\alpha^\dagger|0\rangle \\
&= \left(1 - \frac{it}{\hbar} H_0 + \ldots\right)\alpha^\dagger|0\rangle = e^{-\frac{i}{\hbar} H_0 t}|\alpha\rangle
\end{aligned}$$
as they should be. Physically this means that, in addition to forbidding
self-scattering in zero-particle and one-particle states,6 our ansatz (11.1) also
forbids any self-interaction in these states. So, our search for a better QED
interaction will be based on the following
Postulate 11.1 (stability of vacuum and one-particle states) There is
no (self-)interaction in the vacuum and one-particle states, i.e., the time
evolution of these states is not affected by interaction and is governed by the
non-interacting Hamiltonian $H_0$. Mathematically, this means that the type
of the interaction Hamiltonian $V^d$ is phys.
5
Recall that decay and oscillation operators are not present in QED. So, we will not
consider them here.
6
i.e., satisfying Statement 10.1
Summarizing discussions from various parts of this book, we can put together
a list of conditions that should be satisfied by any realistic interaction:
(A) Poincaré invariance (Statement 3.2);
(B) instant form of dynamics (Postulate 17.2);
(C) cluster separability (Postulate 6.3);
(D) no self-interactions = phys character of $V^d$ (Postulate 11.1);
(E) finiteness of coefficient functions of interaction potentials;
(F) coefficient functions should rapidly tend to zero at large values of momenta.7
As we saw in subsection 10.4.3, requirement (D) practically excludes all
usual field-theoretical Hamiltonians. The question is whether there are
non-trivial “good” interactions that have all the properties (A) - (F). The
answer is “yes.”
One set of examples of allowed interacting theories is provided by “direct
interaction” models.8 Two-particle models of this kind were first constructed
by Bakamjian and Thomas [BT53]. Sokolov [Sok75, SS78], Coester and Poly-
zou [CP82] showed how this approach can be extended to cover multi-particle
systems. There are recent attempts [Pol03] to extend this formalism to in-
clude description of systems with variable number of particles. In spite of
these achievements, the “direct interaction” approach is currently applicable
only to model systems. One of the reasons is that conditions for satisfying
the cluster separability are very cumbersome. This mathematical complexity
was evident even in the simplest 3-particle case discussed in subsection 6.3.6.
7
According to Theorem 8.13, this condition guarantees convergence of all loop integrals
involving vertices $V^d$ and, therefore, the finiteness of the corresponding scattering operator
$S^d$.
8
Some of them were discussed in section 6.3.
In the “direct interaction” approach, interactions are expressed as functions
of (relative) particle observables, e.g., relative distances and momenta.
However, it appears more convenient to write interactions as polynomials in
particle creation and annihilation operators (8.50). We saw in Statement 8.7
that in this case the cluster separability condition (C) is trivially satisfied
if coefficient functions have smooth dependence on particle momenta. The
no-self-interaction condition (D) simply means that all interaction terms are
phys. The instant form condition (B) means that generators of space translations
$\mathbf{P} = \mathbf{P}_0$ and rotations $\mathbf{J} = \mathbf{J}_0$ are interaction-free and that interaction $V$
commutes with $\mathbf{P}_0$ and $\mathbf{J}_0$. The most difficult part is to ensure the relativistic
invariance (condition (A)), i.e., commutation relations of the Poincaré group.
One way to solve this problem is to fix the operator structure of interaction
terms and then try to find the momentum dependence of coefficient functions
by solving a set of differential equations resulting from Poincaré commuta-
tors (6.22) - (6.26) [Kaz71, Kit66, Kit68, Kit70, Kit72b, Kit72a, Kit73, SF11].
Kita and Kazes demonstrated that there is an infinite number of solutions
for these equations and provided some non-trivial examples. This means
that the above conditions (A) - (F) are not restrictive enough. Ideally, we
would like to formulate additional physical principles that would single out
the unique theory of interacting particles that agrees with all experimental
observations. Unfortunately, these additional principles are not known at
this moment.
11.1.3 Main idea of the dressed particle approach

The Kita-Kazes approach is difficult to apply to realistic particle interac-
tions, so, currently, it cannot compete with QFT. Thus, it might be more
promising to abandon the idea to build relativistic interactions from scratch
and, instead, try to modify traditional field theories to make them consistent
with our requirements (D), (E), and (F). One idea how to make this possible
is to note that the S-matrix of the usual renormalized QED agrees with ex-
periments very well. So, we may add the following requirement to the above
list (A) - (F):
(G) the scattering operator $S^d$ in our “dressed” theory is exactly the same
(in each perturbation order) as the operator $S^c$ in renormalized QED.
Previously we have denoted the desired phys interaction operator by $V^d$.
Then, condition (G) means that the dressed Hamiltonian $H^d = H_0 + V^d$
is scattering equivalent to the renormalized QED Hamiltonian $H^c = H_0 + V^c$
from which the accurate scattering operator $S^c$ is usually calculated.
According to our discussion in subsection 7.2.1, this means that $H^d$ and $H^c$
are related by a unitary transformation
$$\begin{aligned}
H^d = H_0 + V^d &= e^{i\Phi} H^c e^{-i\Phi} && (11.2) \\
&= e^{i\Phi}(H_0 + V^c)e^{-i\Phi} \\
&= (H_0 + V^c) + i[\Phi, (H_0 + V^c)] - \frac{1}{2!}[\Phi, [\Phi, (H_0 + V^c)]] + \ldots && (11.3)
\end{aligned}$$
where Hermitian operator $\Phi$ satisfies condition (7.35). Transformation $e^{i\Phi}$
will be called the unitary dressing transformation.
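The commutator expansion (11.3) can be sanity-checked numerically on small matrices. In the sketch below the 2×2 Hermitian matrices `H` and `PHI` are arbitrary stand-ins (assumptions for illustration, not operators from QED), and the matrix exponential is computed by a plain Taylor series. The function `series_error` compares the exact transformed matrix with the expansion truncated after the double commutator, so the error should scale roughly as the third power of the size of $\Phi$.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b, scale=1):
    """Return a + scale * b, element by element."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scaled(a, s):
    return [[s * x for x in row] for row in a]

def commutator(a, b):
    return add(matmul(a, b), matmul(b, a), scale=-1)

def expm(a, terms=40):
    """Matrix exponential by its Taylor series (adequate for small matrices)."""
    n = len(a)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = scaled(matmul(term, a), 1 / k)
        result = add(result, term)
    return result

def max_abs(a):
    return max(abs(x) for row in a for x in row)

# Hypothetical Hermitian matrices standing in for H^c and the generator Phi.
H = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -0.3]]
PHI = [[0.2, 0.1 + 0.3j], [0.1 - 0.3j, -0.4]]

def series_error(eps):
    """Error of expansion (11.3) truncated after the double commutator."""
    phi = scaled(PHI, eps)
    exact = matmul(matmul(expm(scaled(phi, 1j)), H), expm(scaled(phi, -1j)))
    c1 = commutator(phi, H)
    c2 = commutator(phi, c1)
    approx = add(add(H, scaled(c1, 1j)), scaled(c2, -0.5))
    return max_abs(add(exact, approx, scale=-1))

for eps in (0.4, 0.2, 0.1):
    print(eps, series_error(eps))
```

Halving `eps` should reduce the printed error by a factor close to eight, as expected when the first neglected term is the triple commutator.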
11.1.4 Unitary dressing transformation

Now our goal is to find a unitary transformation $e^{i\Phi}$, which ensures that the
dressed particle Hamiltonian $H^d$ satisfies all properties (A) - (G). In this
study we will need the following useful results
Theorem 11.2 (transformations preserving the S-operator) A unitary
transformation of the Hamiltonian
$$H' = e^{i\Phi} H e^{-i\Phi}$$
preserves the S-operator if the Hermitian operator $\Phi$ has the form (8.49)
- (8.50) where all terms $\Phi_{NM}$ are either phys or unphys and have smooth
coefficient functions.
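Idea of the proof. Assume that operator $\Phi$ has the standard form (8.49)
- (8.50)
$$\begin{aligned}
\Phi &= \sum_{N=0}^{\infty} \sum_{M=0}^{\infty} \Phi_{NM} \\
\Phi_{NM} &= \sum_{\{\eta,\eta'\}} \int d\mathbf{q}'_1 \ldots d\mathbf{q}'_N \, d\mathbf{q}_1 \ldots d\mathbf{q}_M \, D_{NM}(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M) \times \\
&\quad \delta\left(\sum_{i=1}^{N} \mathbf{q}'_i - \sum_{j=1}^{M} \mathbf{q}_j\right) \alpha^\dagger_{\mathbf{q}'_1,\eta'_1} \ldots \alpha^\dagger_{\mathbf{q}'_N,\eta'_N} \alpha_{\mathbf{q}_1,\eta_1} \ldots \alpha_{\mathbf{q}_M,\eta_M}
\end{aligned}$$
Then the left hand side of the scattering equivalence condition (7.35) for each
term $\Phi_{NM}$ is
$$\begin{aligned}
&\lim_{t \to \pm\infty} e^{\frac{i}{\hbar} H_0 t} \Phi_{NM} e^{-\frac{i}{\hbar} H_0 t} \\
&= \lim_{t \to \pm\infty} \sum_{\{\eta,\eta'\}} \int d\mathbf{q}'_1 \ldots d\mathbf{q}'_N \, d\mathbf{q}_1 \ldots d\mathbf{q}_M \, D_{NM}(\mathbf{q}'_1\eta'_1; \ldots; \mathbf{q}'_N\eta'_N; \mathbf{q}_1\eta_1; \ldots; \mathbf{q}_M\eta_M) \times \\
&\quad \delta\left(\sum_{i=1}^{N} \mathbf{q}'_i - \sum_{j=1}^{M} \mathbf{q}_j\right) e^{\frac{i}{\hbar} E_{NM} t} \alpha^\dagger_{\mathbf{q}'_1,\eta'_1} \ldots \alpha^\dagger_{\mathbf{q}'_N,\eta'_N} \alpha_{\mathbf{q}_1,\eta_1} \ldots \alpha_{\mathbf{q}_M,\eta_M} && (11.4)
\end{aligned}$$
where $E_{NM}$ is the energy function of this term. In the limits $t \to \pm\infty$
momentum integrals tend to zero by Riemann-Lebesgue lemma B.1, because
the coefficient function $D_{NM}$ is smooth, while the factor $e^{\frac{i}{\hbar} E_{NM} t}$ oscillates
rapidly in the momentum space. Therefore, according to (7.36), Hamiltonians
$H$ and $H'$ are scattering-equivalent. This theorem does not apply to
renorm operators $\Phi_{NM}$, because for them the energy functions $E_{NM}$ are
identically zero, products $e^{\frac{i}{\hbar} H_0 t} \Phi_{NM} e^{-\frac{i}{\hbar} H_0 t}$ do not depend on $t$, and the scattering
equivalence condition (7.35) is violated.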
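The role of the Riemann-Lebesgue lemma in this argument can be illustrated with a one-dimensional integral. The sketch below makes several simplifying assumptions that are not part of the proof: a single momentum variable $q$, a Gaussian coefficient function $D(q) = e^{-q^2}$, the energy function $E(q) = 1 + q^2$ for a term without an energy shell, $E(q) = 0$ for a renorm-like term, and $\hbar = 1$. All function names are hypothetical.

```python
import cmath
import math

def oscillating_integral(t, energy, points=20001, q_max=8.0):
    """Trapezoid estimate of the integral of D(q) * exp(i*E(q)*t) over q.

    D(q) = exp(-q^2) is a hypothetical smooth coefficient function and
    `energy` is the energy function E(q) of the operator term.
    """
    step = 2 * q_max / (points - 1)
    total = 0j
    for k in range(points):
        q = -q_max + k * step
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * math.exp(-q * q) * cmath.exp(1j * energy(q) * t)
    return total * step

def unphys_like(q):
    """E(q) = 1 + q^2 never vanishes: there is no energy shell."""
    return 1.0 + q * q

def renorm_like(q):
    """E(q) = 0 everywhere, as for renorm operators."""
    return 0.0

for t in (0.0, 10.0, 100.0):
    print(t,
          abs(oscillating_integral(t, unphys_like)),
          abs(oscillating_integral(t, renorm_like)))
```

For this choice the first integral can be done in closed form, with magnitude $\sqrt{\pi}\,(1 + t^2)^{-1/4}$, so it decays as $t$ grows, while the renorm-like integral stays at $\sqrt{\pi}$ for all $t$.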
Lemma 11.3 Potential $\underline{B}$9 is smooth10 if $B$ is either unphys with arbitrary
smooth coefficient function or phys with a smooth coefficient function, which
is identically zero on the energy shell.
Proof. The only possible source of singularity in $\underline{B}$ is the energy
denominator $E_B^{-1}$, which is singular on the energy shell. However, for operators
satisfying conditions of this Lemma, either the energy shell does not exist,
or the coefficient function vanishes there. In both cases $\underline{B}$ is not singular on
the energy shell.
We will assume that all relevant operators can be written as expansions
in powers of the coupling constant, and that all series converge
$$\begin{aligned}
H^c &= H_0 + V_1^c + V_2^c + \ldots && (11.5) \\
H^d &= H_0 + V_1^d + V_2^d + \ldots && (11.6) \\
\Phi &= \Phi_1 + \Phi_2 + \ldots && (11.7)
\end{aligned}$$
As usual, the subscript denotes the power of $e$ (= the perturbation order).
9
For definition of an underlined operator symbol see (7.12) and (8.61).
10
i.e., it has a smooth coefficient function
Next, following the plan outlined in subsection 10.1.1, we introduce regularization
cutoffs $\lambda$ and $\Lambda$, which ensure that interactions and counterterms
$V_i^c$ in the Hamiltonian of QED are non-singular and finite in all perturbation
orders. Moreover, with these cutoffs all loop integrals involved in calculations
of products and commutators of $V_i^c$ become convergent. In this section we
are going to prove that in this regulated theory the operator $\Phi$ can be chosen
so that conditions (A) - (G) are satisfied in all perturbation orders. Of
course, to get accurate results, at the end of calculations the regularization
cutoffs $\lambda$ and $\Lambda$ should be lifted. Only those quantities that remain finite
in the limits $\lambda \to 0$ and $\Lambda \to \infty$ may have physical meaning. The latter
limit will be considered in subsection 11.1.8. For the infrared limit $\lambda \to 0$
see section 14.2.
Using expansions (11.5) - (11.7) in (11.3) and collecting together terms
of equal order we obtain an infinite set of equations
$$\begin{aligned}
V_1^d &= V_1^c + i[\Phi_1, H_0] && (11.8) \\
V_2^d &= V_2^c + i[\Phi_2, H_0] + i[\Phi_1, V_1^c] - \frac{1}{2!}[\Phi_1, [\Phi_1, H_0]] && (11.9) \\
V_3^d &= V_3^c + i[\Phi_3, H_0] + i[\Phi_2, V_1^c] + i[\Phi_1, V_2^c] \\
&\quad - \frac{1}{2!}[\Phi_2, [\Phi_1, H_0]] - \frac{1}{2!}[\Phi_1, [\Phi_2, H_0]] - \frac{1}{2!}[\Phi_1, [\Phi_1, V_1^c]] \\
&\quad - \frac{i}{3!}[\Phi_1, [\Phi_1, [\Phi_1, H_0]]] && (11.10) \\
&\ldots
\end{aligned}$$
Now we need to solve these equations order-by-order. This means that we
need to choose appropriate operators $\Phi_i = \Phi_i^{ph} + \Phi_i^{unp} + \Phi_i^{ren}$, so that
interaction terms $V_i^d$ on left hand sides satisfy above conditions (B) - (G).11 Let
us start with equation (11.8).
11.1.5 Dressing in the first perturbation order

In renormalized QED the 1st order interaction operator $V_1^c = V_1$ is unphys.12
According to condition (D), this term should be canceled exactly.
11
We will discuss condition (A) separately in subsection 11.1.9.
12
see equation (L.8)
This can
be achieved if we choose13 $\Phi_1^{ph} = \Phi_1^{ren} = 0$ and use (8.69) to solve the
commutator equation
$$i[\Phi_1^{unp}, H_0] = -V_1$$
$$\Phi_1^{unp} = i\underline{V_1} \qquad (11.11)$$
for the unphys part of $\Phi_1$. This choice ensures that not only the unphys part
of $V_1^d$ is zero, but the entire first order dressed interaction potential vanishes
$V_1^d = 0$, so that conditions (B) - (F) are trivially satisfied in this order. The
coefficient function of $V_1$ is non-singular. By Lemma 11.3 this implies that
$\Phi_1^{unp}$ in equation (11.11) is smooth. By Theorem 11.2, the presence of this
term in the dressing transformation $e^{i\Phi}$ does not affect the S-operator in
agreement with our condition (G). So, we managed to satisfy all necessary
conditions in the first perturbation order.
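A finite-dimensional analog may help to see how the commutator equation $i[\Phi, H_0] = -V$ is solved and why energy denominators appear. The sketch below assumes a diagonal matrix $H_0$ with distinct eigenvalues and a Hermitian matrix $V$ without diagonal elements; the three-level values and the function names are hypothetical. In this basis $[\Phi, H_0]_{mn} = \Phi_{mn}(E_n - E_m)$, so the solution is $\Phi_{mn} = iV_{mn}/(E_n - E_m)$, which fails only where $E_m = E_n$ and $V_{mn} \neq 0$.

```python
def solve_dressing_generator(energies, v):
    """Solve i[Phi, H0] = -V for Phi in the eigenbasis of H0 = diag(energies).

    Matrix elements: Phi[m][n] = i * V[m][n] / (E_n - E_m). The division
    fails on the "energy shell" E_m = E_n unless V[m][n] vanishes there.
    """
    size = len(energies)
    phi = [[0j] * size for _ in range(size)]
    for m in range(size):
        for n in range(size):
            if v[m][n] == 0:
                continue
            gap = energies[n] - energies[m]
            if gap == 0:
                raise ZeroDivisionError("V has a nonzero on-shell matrix element")
            phi[m][n] = 1j * v[m][n] / gap
    return phi

def commutator_with_h0(phi, energies):
    """[Phi, H0] for diagonal H0: element (m, n) is Phi[m][n] * (E_n - E_m)."""
    size = len(energies)
    return [[phi[m][n] * (energies[n] - energies[m]) for n in range(size)]
            for m in range(size)]

# Hypothetical three-level example: V is Hermitian with no diagonal part.
ENERGIES = [0.0, 1.0, 2.5]
V1 = [[0, 0.3, 0.1j],
      [0.3, 0, 0.2],
      [-0.1j, 0.2, 0]]

PHI1 = solve_dressing_generator(ENERGIES, V1)
RESIDUAL = [[1j * c + V1[m][n] for n, c in enumerate(row)]
            for m, row in enumerate(commutator_with_h0(PHI1, ENERGIES))]
print(max(abs(x) for row in RESIDUAL for x in row))  # size of i[Phi, H0] + V
```

The resulting matrix is Hermitian, and the residual of the commutator equation vanishes up to rounding. A nonzero matrix element of $V$ between degenerate levels would make the construction fail, which is the finite-dimensional counterpart of the energy-shell singularity addressed by Lemma 11.3.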
11.1.6 Dressing in the second perturbation order

Now we can substitute the operator $\Phi_1$ found above into equation (11.9) and
obtain the expression for the 2nd order dressed potential
$$\begin{aligned}
V_2^d &= V_2^c + i[\Phi_2, H_0] - [\underline{V_1}, V_1] + \frac{1}{2!}[\underline{V_1}, V_1] \\
&= V_2^c + i[\Phi_2, H_0] - \frac{1}{2}[\underline{V_1}, V_1] && (11.12)
\end{aligned}$$
It is convenient to write separately unphys, phys, and renorm parts of this
equation and take into account that $[\Phi_2^{ren}, H_0] = 0$
$$\begin{aligned}
(V_2^d)^{unp} &= (V_2^c)^{unp} + i[\Phi_2^{unp}, H_0] - \frac{1}{2}[\underline{V_1}, V_1]^{unp} && (11.13) \\
(V_2^d)^{ph} &= (V_2^c)^{ph} + i[\Phi_2^{ph}, H_0] - \frac{1}{2}[\underline{V_1}, V_1]^{ph} && (11.14) \\
(V_2^d)^{ren} &= (V_2^c)^{ren} - \frac{1}{2}[\underline{V_1}, V_1]^{ren} && (11.15)
\end{aligned}$$
13
More generally, we can also choose $\Phi_1^{ph}$ to be any phys operator whose coefficient
function vanishes on the energy shell. See next subsection.
All components on the right hand sides of (11.13) - (11.15) (except commutators
of $\Phi_2^{unp}$ and $\Phi_2^{ph}$) are, basically, known to us already: In QED,
$(V_2^c)^{ph}$ is the same as $V_2^{ph}$ in (L.11); operator $(V_2^c)^{ren}$ is coming from electron
and photon self-energy counterterms discussed in subsections 10.2.2 and
10.2.4, respectively; $(V_2^c)^{unp}$ takes contributions from $V_2^{unp}$ in (L.12) as well
as from self-energy counterterms; and calculation of commutators $[\underline{V_1}, V_1]^{unp}$
and $[\underline{V_1}, V_1]^{ren}$ should follow the same steps as calculation of $[\underline{V_1}, V_1]^{unp}$ in
subsection 9.2.1. Now our goal is to choose operators $\Phi_2^{ph}$ and $\Phi_2^{unp}$, so that
dressed interactions on the left hand sides of (11.13) - (11.15) satisfy our
conditions (B) - (G).
From the condition (D) it follows that $(V_2^d)^{unp}$ must vanish. To achieve
that, we can choose
$$\Phi_2^{unp} = i\underline{V_2^{unp}} - \frac{i}{2}\underline{[\underline{V_1}, V_1]^{unp}} \qquad (11.16)$$
Operators $V_1$ and $V_2^{unp}$ are smooth. Then, by Lemma 11.3, the operator $\underline{V_1}$
is also smooth and by Lemma 8.12 the commutator $[\underline{V_1}, V_1]^{unp}$ is smooth as
well. Using Lemma 11.3 again, we see that operator $\Phi_2^{unp}$ is smooth, and
by Theorem 11.2 its presence in the transformation $e^{i\Phi}$ does not affect the
S-operator. This is exactly what we need.
Let us now turn to equation (11.14) for the phys part of the dressed
particle interaction $V_2^d$. What are the conditions for selecting $\Phi_2^{ph}$? For
example, we cannot simply choose $\Phi_2^{ph} = 0$, because in this case the dressed
particle interaction would acquire the form
$$(V_2^d)^{ph} = V_2^{ph} - \frac{1}{2}[\underline{V_1}, V_1]^{ph}$$
and there is absolutely no guarantee that the coefficient function of $(V_2^d)^{ph}$
rapidly tends to zero at large values of particle momenta (condition (F)). In
order to have this guarantee, we are going to choose $\Phi_2^{ph}$ such that the right
hand side of (11.14) rapidly tends to zero when momenta are far from the
energy shell. In addition, we will require that $\Phi_2^{ph}$ is non-singular.14 Both15
conditions can be satisfied by choosing
14
This is needed to obey the charge renormalization postulate from subsection 10.1.3.
15
Note that this part of our dressing transformation closely resembles the “similarity
renormalization” procedure suggested by Glazek and Wilson [GW93, Gla97].
$$\Phi_2^{ph} = \underline{\left(iV_2^{ph} - \frac{i}{2}[\underline{V_1}, V_1]^{ph}\right) \circ (1 - \zeta_2)} \tag{11.17}$$
where $\zeta_2$ is a real function,16 such that
(I) $\zeta_2$ is equal to 1 on the energy shell;
(II) $\zeta_2$ depends on rotationally invariant combinations of momenta (to make
sure that $V_2^d$ commutes with $\mathbf{P}_0$ and $\mathbf{J}_0$);
(III) $\zeta_2$ is smooth;
(IV) $\zeta_2$ rapidly tends to zero when the arguments move away from the energy
shell.17
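Properties (I) and (IV) can be illustrated numerically for the Gaussian-type factor mentioned in the footnote, $\zeta_2 = e^{-\alpha E^2}$. The sketch below is only an illustration: the function name `zeta`, the value $\alpha = 1$, and the dimensionless sample energies are assumptions made here, not choices prescribed by the theory.

```python
import math


def zeta(energy, alpha=1.0):
    """Convergence factor exp(-alpha * E**2).

    `energy` is the value of the energy function E (zero on the energy
    shell); units and the value of alpha are arbitrary (assumption).
    """
    return math.exp(-alpha * energy ** 2)


# Property (I): the factor equals 1 on the energy shell (E = 0).
on_shell_value = zeta(0.0)

# Property (IV): the factor decays rapidly away from the energy shell.
samples = [(e, zeta(e)) for e in (0.0, 0.5, 1.0, 2.0, 4.0)]

for e, z in samples:
    # 1 - zeta is the weight with which the off-shell part enters Phi_2^ph.
    print(f"E = {e:4.1f}   zeta = {z:.3e}   1 - zeta = {1.0 - z:.3e}")
```

Any other smooth, rotationally invariant function with the same on-shell value and a sufficiently fast decay would serve equally well, which is the freedom discussed later in this chapter.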
With the choice (11.17) we obtain
$$(V_2^d)^{ph} = \left(V_2^{ph} - \frac{1}{2}[\underline{V_1}, V_1]^{ph}\right) \circ \zeta_2 \tag{11.18}$$
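The step from (11.14) and (11.17) to (11.18) can be spelled out as follows. This is a sketch under stated assumptions: the t-integral of an operator $X$ is written as $\underline{X}$, it is assumed to satisfy $[\underline{X}, H_0] = X$ (the property that lets a t-integral cancel an interaction term in the commutator with $H_0$), and $(V_2^c)^{ph} = V_2^{ph}$ is used. The shorthand $X$ is introduced here only for illustration.

```latex
\begin{aligned}
X &\equiv V_2^{ph} - \tfrac{1}{2}[\underline{V_1}, V_1]^{ph}, \qquad
\Phi_2^{ph} = i\,\underline{X \circ (1 - \zeta_2)}, \\
i[\Phi_2^{ph}, H_0] &= i \cdot i \; X \circ (1 - \zeta_2) = -X \circ (1 - \zeta_2), \\
(V_2^d)^{ph} &= X + i[\Phi_2^{ph}, H_0] = X - X \circ (1 - \zeta_2) = X \circ \zeta_2 .
\end{aligned}
```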
so that $(V_2^d)^{ph}$ rapidly tends to zero when momenta of particles move away
from the energy shell, in agreement with condition (F). Moreover, property
(I) guarantees that the expression under the t-integral in (11.17) vanishes on the
energy shell. Therefore, this t-integral18 is non-singular and, according to
Theorem 11.2, $\Phi_2^{ph}$ does not modify the S-operator, i.e., condition (G) is
satisfied.
In (11.18) $\underline{V_1}$ and $V_1$ are smooth operators, so, according to Theorem 8.12,
their commutator is also smooth. Operator $V_2^{ph}$ and function $\zeta_2$ are smooth
as well. So, due to Statement 8.7, we conclude that the second-order dressed
interaction $(V_2^d)^{ph}$ is separable in accordance with our requirement (C).
16
The arguments of $\zeta_2$ (particle momenta and spin projections) should be the same as
arguments of coefficient functions in $V_2^{ph}$ and $[\underline{V_1}, V_1]^{ph}$. The “small circle” notation was
defined in subsection 8.2.3.
17
For example, we can choose $\zeta_2 = e^{-\alpha E^2}$ where $\alpha$ is a positive constant and $E$ is the
energy function of the operator on the right hand side of (11.17). Actually, it may happen
that loop integrals involving $V_2^d$ converge even without involvement of convergence factors
$\zeta_2$. For example, in subsection 14.2.1 we will see that in QED the loop integral in the
product $V_2^d V_2^d$ converges even if $\zeta_2 = 1$ everywhere.
18
calculated by formula (8.61)
Finally, we need to choose $\Phi_2^{ren}$. Note that this operator is not present in
the system of equations (11.13) - (11.15), so it should be derived from other
considerations. Apparently, the only sensible choice is
$$\Phi_2^{ren} = 0 \tag{11.19}$$
because, according to Theorem 11.2, a non-zero renorm part of $\Phi_2$ would
destroy the scattering equivalence (condition (G)). With these choices we can
satisfy all our conditions. Let us analyze the only non-obvious condition (D). How can
we be sure that our choice (11.19) satisfies the condition (D), in particular,
that $(V_2^d)^{ren} = 0$? We already know that (11.19) guarantees the scattering
equivalence of the dressed theory. This means that the S-operator obtained
with the transformed interaction $V_2^d$ agrees with the S-operator $S^c$
up to the second perturbation order (condition (G)). In particular, $F_2^d = F_2^c$.
This would be impossible if $V_2^d$ contained a non-zero renorm term. Indeed,
$(V_2^d)^{ren} \neq 0$ would imply that operator $F_2^d$ and, therefore, $F_2^c$ have non-zero
renorm terms in disagreement with equation (10.6). Thus we must conclude
that $(V_2^d)^{ren} = 0$, and that the two terms on the right hand side of (11.15) cancel
each other. This cancelation can be verified by a direct calculation as well
[KS04].
11.1.7 Dressing in arbitrary order

For any higher perturbation order i > 2, the selection of Φi and proofs of the
conditions (B) - (G) are similar to those described above for the 2nd order.
The defining equation for $V_i^d$ can be written in a general form19
$$V_i^d = V_i^c + i[\Phi_i, H_0] + \Xi_i \tag{11.20}$$
where $\Xi_i$ is a sum of multiple commutators involving $V_j^c$ from lower orders
($1 \leq j < i$) and their t-integrals (= “underlines”). This equation is solved
by
19
compare with (11.12)
$$
\begin{aligned}
\Phi_i^{ren} &= 0, \\
\Phi_i^{unp} &= i\underline{\Xi_i^{unp}} + i\underline{(V_i^c)^{unp}}, \\
\Phi_i^{ph} &= i\underline{\left(\Xi_i^{ph} + (V_i^c)^{ph}\right) \circ (1 - \zeta_i)}
\end{aligned}
\tag{11.21}
$$
where functions $\zeta_i$ have properties (I) - (IV) from the preceding subsection.
Similar to the 2nd order discussed above, one then demonstrates that $\Phi_i$ is
smooth, so that condition (G) is satisfied in the i-th order, and that
$(V_i^d)^{ren} = (V_i^d)^{unp} = 0$.
Solving equations (11.20) order-by-order we obtain the dressed particle
Hamiltonian
$$H^d = e^{i\Phi} H^c e^{-i\Phi} = H_0 + V_2^d + V_3^d + V_4^d + \ldots \tag{11.22}$$
which has all required properties (B) - (G), as promised.
11.1.8 Infinite momentum cutoff limit

So far our calculations of operators $\Phi$ and $V^d$ were performed under the
assumption of a finite momentum cutoff Λc. This permitted us to avoid
ultraviolet divergences in our formulas. In the complete and final theory we
must, obviously, take the limit $\Lambda \to \infty$. Our approach can be viable only if
we can prove that all physically relevant dressed operators remain finite in
this limit.20
It seems rather obvious that conditions (B) - (D) and (F) are independent
of the momentum cutoff Λc. Therefore, they also remain valid in the limit
$\Lambda \to \infty$. Let us now demonstrate that condition (E) is satisfied in this limit
as well. To do that, we note that, on the one hand, the traditional QED gives us
a perturbation series for the S-operator
$$S^c = 1 + S_2^c + S_3^c + S_4^c + \ldots = 1 + \underbrace{\Sigma_2^c} + \underbrace{\Sigma_3^c} + \underbrace{\Sigma_4^c} + \ldots \tag{11.23}$$
20
Note that operator $\Phi$ providing the link (11.22) between Hamiltonians $H^c$ and $H^d$
does not correspond to any observable property, so it is OK if $\Phi$ does not converge in the
large cutoff limit.
On the other hand, in the dressed particle approach with Hamiltonian (11.22),
the S-operator can be written using formulas (7.20) and (7.22)
$$S^d = 1 + \underbrace{V_2^d} + \underbrace{V_3^d} + \underbrace{V_4^d} + \underbrace{V_2^d \underline{V_2^d}} + \ldots$$
According to our condition (G), these two operators should be equal order-
by-order. Thus we obtain the following set of relations between $V_i^d$ and $S_i^c$
on the energy shell21
$$\underbrace{V_2^d} = S_2^c = \underbrace{\Sigma_2^c} \tag{11.24}$$
$$\underbrace{V_3^d} = S_3^c = \underbrace{\Sigma_3^c} \tag{11.25}$$
$$\underbrace{V_4^d} = S_4^c - \underbrace{V_2^d \underline{V_2^d}} = \underbrace{\Sigma_4^c} - \underbrace{V_2^d \underline{V_2^d}} \tag{11.26}$$
$$\underbrace{V_i^d} = S_i^c + \underbrace{Y_i}, \quad i > 4 \tag{11.27}$$
where $Y_i$ stands for a sum of certain products of $V_j^d$ from lower orders
($2 \leq j \leq i - 2$) with t-integrations (“underlines”). The relations (11.24) - (11.27)
are independent of the cutoff $\Lambda$, so they remain valid when $\Lambda \to \infty$. In this
limit operators $S_i^c$ and $\Sigma_i^c$ are finite and assumed to be known on the energy
shell from the standard renormalized QED theory. This immediately implies
that $V_2^d$ and $V_3^d$ are finite on the energy shell and, due to our condition (F),
they are finite for all momenta even outside the energy shell.
Can we say that operator $V_4^d$ is finite too? The part $\underbrace{\Sigma_4^c}$ is definitely
finite, but how can we be sure that the term
$$-\underbrace{V_2^d \underline{V_2^d}} \tag{11.28}$$
is finite on the energy shell? This is where the yet undefined factor $\zeta_2$
comes into play. According to our discussion in subsection 11.1.7, this factor
can be chosen to decay sufficiently rapidly at large values of arguments
(momenta). Then all loop integrals present in the product (11.28) are
guaranteed to converge.22 Consequently, operator (11.28) is finite on the energy
shell, and $V_4^d$ in (11.26) is also finite on the energy shell and everywhere else.
These arguments can be repeated in all higher orders, thus proving that the
dressed particle Hamiltonian $H^d$ is free of ultraviolet divergences.
21
Recall that the S-operator is defined only on the energy shell. Moreover, the “underbrace”
symbol was defined in (8.54) as $\underbrace{V} = 2\pi i V \circ \delta(E_V)$. So, $\underbrace{V}$ is non-zero only on
the energy shell of the operator $V$.
11.1.9 Poincaré invariance of the dressed particle approach

The next question is whether our theory with the transformed Hamiltonian
$H^d$ is Poincaré invariant (condition (A)). In other words, does there exist
a boost operator $\mathbf{K}_d$ such that the set of generators $\{\mathbf{P}_0, \mathbf{J}_0, \mathbf{K}_d, H^d\}$
satisfies the Poincaré commutators? With the dressing operator exp(iΦ) constructed
above, this problem has a simple solution. If we define $\mathbf{K}_d = e^{i\Phi} \mathbf{K}_c e^{-i\Phi}$, then
we can obtain a full set of dressed generators via unitary transformation of
the old generators23
$$\{\mathbf{P}_0, \mathbf{J}_0, \mathbf{K}_d, H^d\} = e^{i\Phi} \{\mathbf{P}_0, \mathbf{J}_0, \mathbf{K}_c, H^c\} e^{-i\Phi}$$
The dressing transformation $e^{i\Phi}$ is unitary and, therefore, preserves
commutators. Since the old operators obey the Poincaré commutators, the same is true
for the new generators. This proves that the transformed theory is Poincaré
invariant and belongs to the instant form of dynamics [SS98].
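The key step of this argument, that a unitary transformation preserves commutation relations, can be checked numerically on a toy example. The sketch below is only an illustration under explicit assumptions: it uses the spin-1/2 generators $S_i = \sigma_i/2$ (with $\hbar = 1$) as a small stand-in for the Poincaré generators, and an arbitrarily chosen 2x2 unitary matrix `U` in place of the dressing operator. None of the names or numbers come from the theory itself.

```python
import math


def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


# Spin-1/2 generators S = sigma / 2 (hbar = 1); they obey [Sx, Sy] = i Sz.
SX = [[0j, 0.5 + 0j], [0.5 + 0j, 0j]]
SY = [[0j, -0.5j], [0.5j, 0j]]
SZ = [[0.5 + 0j, 0j], [0j, -0.5 + 0j]]

# Arbitrary unitary U = exp(i*theta*(n.sigma)) = cos(theta) + i sin(theta) (n.sigma)
THETA = 0.7
NX, NY, NZ = 0.6, 0.0, 0.8  # unit vector n (illustrative choice)
N_SIGMA = [[complex(NZ), complex(NX, -NY)], [complex(NX, NY), complex(-NZ)]]
IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]
U = [[math.cos(THETA) * IDENTITY[i][j] + 1j * math.sin(THETA) * N_SIGMA[i][j]
      for j in range(2)] for i in range(2)]


def transform(a):
    """Unitary similarity transformation U a U^dagger."""
    return matmul(matmul(U, a), dagger(U))


unitarity_error = max_abs_diff(matmul(U, dagger(U)), IDENTITY)

# The transformed generators must satisfy the same relation [Sx', Sy'] = i Sz'.
lhs = commutator(transform(SX), transform(SY))
rhs = [[1j * x for x in row] for row in transform(SZ)]
commutator_error = max_abs_diff(lhs, rhs)

print("unitarity error :", unitarity_error)
print("commutator error:", commutator_error)
```

Both printed errors should be at the level of floating-point rounding. The same algebra, $U[A, B]U^{-1} = [UAU^{-1}, UBU^{-1}]$, is what carries the Poincaré commutators from the old generators to the dressed ones.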
11.2 Dressed interactions between particles

11.2.1 General properties of dressed potentials
One may notice that even after conditions (I) - (IV) from subsection 11.1.6 are
satisfied for functions $\zeta_i$, there is a great deal of ambiguity in choosing their
behavior outside the energy shell. Therefore, the dressing transformation $e^{i\Phi}$
is not unique, and there is an infinite set of dressed particle Hamiltonians
that satisfy our requirements (A) - (G). Which dressed Hamiltonian should
we choose? Before trying to answer this question, we can notice that all
Hamiltonians satisfying conditions (A) - (G) have some important common
properties, which will be described here.
22
see Theorem 8.13
23
Note that operator exp(iΦ) commutes with $\mathbf{P}_0$ and $\mathbf{J}_0$ by construction.
Note that electromagnetic interactions are rather weak. In most situations
the expectation value of the interaction potential energy is much
less than the electron’s rest energy ($mc^2$).24 To describe such situations it is
sufficient to know the coefficient functions of the interaction only near the
energy shell, where we can use condition (I) and set approximately $\zeta_i \approx 1$ for
each perturbation order i. This observation immediately allows us to obtain
a good approximation for the second-order interaction from equation (11.18)
by setting $\zeta_2 \approx 1$ there
$$V_2^d \approx V_2^{ph} - \frac{1}{2}[\underline{V_1}, V_1]^{ph} \tag{11.29}$$
Operator $V_2^{ph}$ can be taken from formula (L.11), and calculations involved
in $[\underline{V_1}, V_1]^{ph}$ have been explained in subsection 9.2.1. So, obtaining the full
operator $V_2^d$ is not that difficult.
In higher perturbation orders commutator formulas (11.20) become rather
complicated and the method of unitary dressing transformation becomes im-
practical. Fortunately, there is an equivalent, but a much simpler alternative
approach: One can fit the desired dressed interaction operators Vid directly
to the renormalized S-operator (or, more precisely, to its components Σci
in each perturbation order) of traditional QED, as described in subsection
11.1.8.
For example, in the 2nd order we obtain from (11.24) and our assumption
$\zeta_i \approx 1$
$$V_2^d \approx (\Sigma_2^c)^{ph}$$
which is consistent with (11.29). Let us now briefly describe the structure
of this operator. Some examples of potentials present in $V_2^d$ are shown in
Table 11.1. We can classify them into two groups: elastic potentials and
inelastic potentials. Elastic potentials do not change the particle content of
the system: they have equal numbers of annihilation and creation operators
of the same particle types. As shown in subsection 8.2.8, elastic potentials
correspond to particle interactions familiar from ordinary quantum mechanics
and classical physics. Inelastic potentials change the number and/or types of
particles. Among inelastic 2nd order potentials in RQD there are potentials
for pair creation, pair annihilation, and pair conversion.
24
For example, the ratio of the hydrogen’s binding energy (13.6 eV) and the electron’s
rest energy (511 keV) is only about $2.7 \times 10^{-5}$.
Similarly to the 2nd order discussed above, the third-order interaction $V_3^d$
can be unambiguously obtained near the energy shell by setting $\zeta_3 \approx 1$ in
(11.25)
$$V_3^d \approx (\Sigma_3^c)^{ph} \tag{11.30}$$
All 3rd order potentials are inelastic. Two of them are shown in Table
11.1: The term $d^\dagger a^\dagger c^\dagger da$ (bremsstrahlung) describes creation of a photon
in a proton-electron collision.25 In the language of classical electrodynamics,
this can be interpreted as emission of radiation due to acceleration of charged
particles and is also related to the radiation reaction force. The Hermitian-
conjugated term $d^\dagger a^\dagger dac$ describes absorption of a photon by a colliding pair
of charged particles.
The situation is less certain for the 4th and higher order dressed particle
interactions. Near the energy shell we can again set $\zeta_4 \approx 1$ in equation
(11.26)
$$V_4^d \approx (\Sigma_4^c)^{ph} - (V_2^d \underline{V_2^d})^{ph} \tag{11.31}$$
The operator $V_4^d$ obtained by this formula is a sum of various interaction
potentials (some of them are shown in Table 11.1; see also section 14.2)
$$V_4^d = d^\dagger a^\dagger da + a^\dagger a^\dagger a^\dagger b^\dagger aa + \ldots \tag{11.32}$$
The contribution $(\Sigma_4^c)^{ph}$ in equation (11.31) is well-defined near the energy
shell, because we assume exact knowledge of the S-operator of renormalized
QED in all perturbation orders. However, there is less clarity about the
contribution $(V_2^d \underline{V_2^d})^{ph}$. As explained in section 8.4, the diagram for the product
$(V_2^d \underline{V_2^d})^{ph}$ should be constructed from diagrams $V_2^d$ and $\underline{V_2^d}$ by coupling some
of their external lines, thus transforming them into internal lines and loops.
25
A more detailed discussion of this effect can be found in section 14.1.
Table 11.1: Examples of interaction potentials in RQD. Bold numbers in
the third column indicate perturbation orders in which explicit interaction
operators can be unambiguously obtained near the energy shell as described
in the text.

Operator       Physical meaning                              Perturbation orders
Elastic potentials
a†a†aa         e− − e− potential                             2, 4, 6, . . .
d†a†da         e− − p+ potential                             2, 4, 6, . . .
a†c†ac         e− − γ potential (Compton scattering)         2, 4, 6, . . .
a†a†a†aaa      e− − e− − e− potential                        4, 6, . . .
Inelastic potentials
a†b†cc         e− − e+ pair creation                         2, 4, 6, . . .
c†c†ab         e− − e+ annihilation                          2, 4, 6, . . .
d†f†ab         conversion of e− − e+ pair to p− − p+ pair    2, 4, 6, . . .
d†a†c†da       e− − p+ bremsstrahlung                        3, 5, . . .
d†a†dac        photon absorption in e− − p+ collision        3, 5, . . .
a†a†a†b†aa     pair creation in e− − e− collision            4, 6, . . .
Loop integration momenta are not limited, so the product $(V_2^d \underline{V_2^d})^{ph}$ generally
depends on the behavior of the factors everywhere in the momentum space,
even outside their energy shells. So, (11.31) depends on our global choice
of $\zeta_2$ outside the energy shell. The function $\zeta_2$ satisfies conditions (I) - (IV),
but still there is a great freedom in choosing the off-shell behavior of $\zeta_2$.
This freedom is reflected in the uncertainty of $V_4^d$ even on the energy shell.
Therefore, we have two possibilities depending on the operator structure of
the 4th order potential we are interested in.
First, there are potentials contained only in the term $(\Sigma_4^c)^{ph}$ in (11.31)
and not present in the product $(V_2^d \underline{V_2^d})^{ph}$. For example, this product does
not contain the operator $a^\dagger a^\dagger a^\dagger b^\dagger aa$ responsible for the creation of an electron-
positron pair in two-electron collisions. For such potentials, their 4th order
expression near the energy shell can be explicitly obtained from formula
(11.31).26
Second, there are potentials in $V_4^d$ whose contributions come from both
terms on the right hand side of (11.31). For such potentials, the second term
on the right hand side of this equation is dependent on the particular choice of
function $\zeta_2$ and, therefore, remains uncertain. One example is the 4th order
contribution to the electron-proton interaction $d^\dagger a^\dagger da$, which is responsible
for the famous Lamb shift. See subsection 14.2.3.
26
This certainty is stressed by the bold 4 in the last row of Table 11.1.
The uncertainty of high order interactions $V_i^d$ is perfectly understandable:
It simply reflects the one-to-many correspondence between the S-operator
and Hamiltonians. It means that there is a broad class of finite phys
interactions $\{V^d\}$ all of which can be used for S-matrix calculations without
encountering divergent integrals. Then which member of the class $\{V^d\}$ is
the unique correct interaction Hamiltonian $V^d$? As we are not aware of any
theoretical condition that would determine the off-energy-shell behavior of
functions $\zeta_i$, this question should be deferred to experiments. There seems
to be no other way but to fit functions $\zeta_i$ to experimental measurements.
Such experiments are bound to be rather challenging because they should
go beyond the usual information contained in the S-operator (scattering
cross-sections, energies and lifetimes of bound states, etc.) and should be capable
of measuring radiative corrections to wave functions and time evolution of
observables in the region of interaction. Modern experiments do not have
sufficient resolution to meet this challenge.
To summarize the above discussion, let us now mention a few important
differences between the original QED interaction V c and the dressed parti-
cle potential energy V d . First, we see that in interaction operators Vid of
higher perturbation orders there are more and more terms with increasing
complexity. In contrast to QED Hamiltonians H and H c , there seems to be
no way to write H d in a closed form. However, to the credit of RQD, all
these high order terms directly reflect real interactions and processes observ-
able in nature. Unfortunately, the above construction of the dressed particle
Hamiltonian does not allow us to obtain full information about V d : The
off-energy-shell behavior of potentials is fairly arbitrary and the on-energy-
shell behavior27 can be determined theoretically only for lowest order terms.
Uncovering the dressed interaction potentials in the entire momentum space
would require additional information that can be made available only from
sophisticated experiments.
The idea of defining “effective” particle interactions, which reproduce
scattering amplitudes obtained from quantum field theory and satisfy
equations like (11.24) - (11.27), has a long history. Approaches based on this idea
can be found in a number of works [Hol04, PS98a, PS98b, GR80, GRI89,
FS88]. The important difference of our dressed approach is somewhat
philosophical: In contrast to previous works, we do not consider quantum fields
as fundamental physical entities, and we do not regard effective potentials as
mere approximations to the “rigorous” field based description. For us
particles and their direct dressed interactions $V^d$ are the ultimate ingredients of
nature.
27
which is the most relevant for comparison with experiments
Figure 11.1: Typical momentum-energy spectrum of (a) non-interacting and
(b) interacting dressed particle theory. Axes in both panels: energy E versus Pc.
11.2.2 Energy spectrum of the dressed theory

Properties of interactions between dressed particles discussed in the preceding
subsection allow us to analyze some general features of the energy spectrum of
our theory. In Fig. 11.1(a) we show the energy spectrum of a non-interacting
theory with one (massive) particle type.28 The 0-particle state (vacuum)
has vanishing energy and momentum. The 1-particle state has energy
$E = \sqrt{m^2c^4 + P^2c^2}$. Energy-momenta of 2-particle states form a dense (hatched)
region limited from below by the hyperboloid $E = \sqrt{(2mc^2)^2 + P^2c^2}$.
Energy-momenta of 3-particle states form a (double-hatched) region limited from
below by the hyperboloid $E = \sqrt{(3mc^2)^2 + P^2c^2}$, etc.
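The lower boundaries of the n-particle continua described above can be tabulated directly from the formula $E = \sqrt{(nmc^2)^2 + P^2c^2}$. The following sketch is illustrative only: the function name `threshold_energy`, the use of the electron rest energy 0.511 MeV as the mass, and the sample values of Pc are assumptions made here.

```python
import math


def threshold_energy(n_particles, mass_c2, momentum_c):
    """Lower boundary of the n-particle continuum in a non-interacting theory.

    Returns sqrt((n * m c^2)^2 + (P c)^2); all quantities are in the same
    energy units (MeV in the example below).
    """
    return math.sqrt((n_particles * mass_c2) ** 2 + momentum_c ** 2)


MASS_C2 = 0.511  # MeV; electron rest energy, used only as an example

for momentum_c in (0.0, 0.5, 1.0):
    # n = 1 is the single-particle hyperboloid; n = 2, 3 bound the continua.
    row = [threshold_energy(n, MASS_C2, momentum_c) for n in (1, 2, 3)]
    print(f"Pc = {momentum_c:3.1f} MeV:",
          ", ".join(f"n={n}: {e:.4f} MeV" for n, e in zip((1, 2, 3), row)))
```

At every total momentum the boundaries are ordered by particle number, so the 1-particle hyperboloid always lies below the 2-particle continuum. This gap is where hyperboloids of bound states can appear once an attractive interaction is switched on.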
We know that dressed interaction does not affect 0-particle and 1-particle
states, so the corresponding energies remain exactly the same as in the
non-interacting case.29 Dressed interaction does perturb states with two or more
particles. In particular, if inter-particle potentials in $V^d$ are relatively weak
and attractive, one can expect formation of hyperboloids of bound states, as
shown in Fig. 11.1(b). In the next section we will illustrate the description
of bound states in RQD using the hydrogen atom as an example.
28
compare with Fig. 6.1(a)
Traditional renormalized quantum field theories also make similar state-
ments about the energy-momentum spectrum of multiparticle states.30 How-
ever, in field theories these statements are not obvious. They cannot be
deduced directly from the renormalized Hamiltonian H c . The main reason
is that the interacting part of H c cannot be regarded as weak, because it
includes divergent renormalization counterterms.
11.2.3 Comparison with other dressed particle approaches

In this subsection, we would like to discuss another point of view on the
dressing transformation. This point of view is philosophically different but
mathematically equivalent to ours. It is exemplified by the works of Shirokov
and coauthors [Shi93, Shi94, SS01]. In contrast to our approach, in which the
dressing transformation $e^{i\Phi}$ was applied to the field-theoretical Hamiltonian
$H^c$ of QED while (bare) particle creation and annihilation operators were not
affected, Shirokov et al. kept $H^c$ intact, but applied the (inverse) dressing
transformation $e^{-i\Phi}$ to creation and annihilation operators of particles
$$\alpha_d^\dagger = e^{-i\Phi} \alpha^\dagger e^{i\Phi}$$
$$\alpha_d = e^{-i\Phi} \alpha e^{i\Phi}$$
to the vacuum state
$$|0\rangle_d = e^{-i\Phi} |0\rangle$$
and to particle observables. Physically, this means that instead of bare
particles (created and annihilated by $\alpha^\dagger$ and $\alpha$, respectively) the theory is
formulated in terms of fully dressed particles (created and annihilated by
operators $\alpha_d^\dagger$ and $\alpha_d$, respectively), i.e., particles together with their virtual
clouds. Within this approach the Hamiltonian $H^c$ must be expressed as a
function of the new particle operators $H^c = F(\alpha_d^\dagger, \alpha_d)$. Apparently, the same
function $F$ expresses the Hamiltonian $H^d$ of our approach through original
(bare) particle operators: $H^d = F(\alpha^\dagger, \alpha)$. Indeed, from equation (11.2) we
can write
$$
\begin{aligned}
H^c &= e^{-i\Phi} H^d e^{i\Phi} = e^{-i\Phi} F(\alpha^\dagger, \alpha) e^{i\Phi} = F(e^{-i\Phi} \alpha^\dagger e^{i\Phi}, e^{-i\Phi} \alpha e^{i\Phi}) \\
&= F(\alpha_d^\dagger, \alpha_d)
\end{aligned}
$$
29
See Fig. 11.1(b) and compare with Fig. 6.1(b).
30
See, for example, Fig. 17.4 in [Sch61], Fig. 16.1 in [BD65], and Fig. 7.1 in [PS95b].
So, mathematically, these two approaches are equivalent. Let us
demonstrate this equivalence on a simple example. Suppose we want to calculate
a trajectory (= the time dependence of the expectation value of the position
operator) of the electron in a 2-particle system (electron + proton). In our
approach the initial state of the system has two particles
$$|\Psi\rangle = a^\dagger d^\dagger |0\rangle$$
and the expectation value of the electron’s position is given by the formula
$$
\begin{aligned}
\mathbf{r}(t) &= \langle\Psi|\mathbf{R}(t)|\Psi\rangle \\
&= \langle 0| da\, e^{\frac{i}{\hbar}H^d t} \mathbf{R}\, e^{-\frac{i}{\hbar}H^d t} a^\dagger d^\dagger |0\rangle
\end{aligned}
\tag{11.33}
$$
where $\mathbf{R}$ is the position operator for the bare electron, and the time evolution
is governed by the dressed Hamiltonian $H^d$. However, we can rewrite this
expression in the following form characteristic of Shirokov’s approach
$$
\begin{aligned}
\mathbf{r}(t) &= \langle 0| da\, (e^{i\Phi} e^{\frac{i}{\hbar}H^c t} e^{-i\Phi}) \mathbf{R}\, (e^{i\Phi} e^{-\frac{i}{\hbar}H^c t} e^{-i\Phi}) a^\dagger d^\dagger |0\rangle \\
&= \langle 0| e^{i\Phi} (e^{-i\Phi} da\, e^{i\Phi}) e^{\frac{i}{\hbar}H^c t} (e^{-i\Phi} \mathbf{R} e^{i\Phi}) e^{-\frac{i}{\hbar}H^c t} (e^{-i\Phi} a^\dagger d^\dagger e^{i\Phi}) e^{-i\Phi} |0\rangle \\
&= {}_d\langle 0| d_d a_d\, e^{\frac{i}{\hbar}H^c t} \mathbf{R}_d\, e^{-\frac{i}{\hbar}H^c t} a_d^\dagger d_d^\dagger |0\rangle_d
\end{aligned}
\tag{11.34}
$$
where the time evolution is generated by the original Hamiltonian $H^c$, while
“dressed” definitions are used for the vacuum state, particle operators, and
the position observable
$$|0\rangle_d = e^{-i\Phi} |0\rangle$$
$$\{a_d, d_d, a_d^\dagger, d_d^\dagger\} = e^{-i\Phi} \{a, d, a^\dagger, d^\dagger\} e^{i\Phi}$$
$$\mathbf{R}_d = e^{-i\Phi} \mathbf{R} e^{i\Phi}$$
In spite of different formalisms, physical predictions of both theories, e.g.,
expectation values of observables (11.33) and (11.34), are exactly the same.
Chapter 12
COULOMB POTENTIAL
AND BEYOND
This work contains many things which are new and interesting.
Unfortunately, everything that is new is not interesting, and ev-
erything which is interesting, is not new.
Lev D. Landau
In the preceding chapter we have derived basic formulas of the dressed

particle approach. Our next goal is to demonstrate that this method is
really useful in concrete calculations. This task will occupy us throughout
the next three chapters. Their plan is as follows: In the present chapter
we will derive the dressed electron-proton interaction potential in the 2nd
perturbation order. Then we will use it to calculate the energy spectrum
of the hydrogen atom. The next step is to extend this theory to higher
perturbation orders. In the 3rd order the hydrogen spectrum is disturbed
by spontaneous photon emission from excited energy levels. To analyze this
effect we will introduce a relativistic quantum theory of unstable systems in
chapter 13. In chapter 14 we are going to use this theory as well as 4th order
radiative corrections to the electron-proton potential and derive two classic
QED results - the anomalous magnetic moment of the electron and the Lamb
shift in hydrogen atoms.
In the preceding chapter we obtained formulas (11.29) and (11.31) for the
dressed particle Hamiltonian $H^d$ in a rather abstract form. In this chapter
we would like to demonstrate how this Hamiltonian can be cast into a form
suitable for calculations, i.e., expressed through creation and annihilation
operators of electrons, protons, photons, etc. Here we will focus on pair in-
teractions between electrons and protons in the lowest (second) order of the
perturbation theory. In section 12.1 we will use the (v/c)2 approximation
to obtain what is commonly known as the Darwin-Breit potential between
charged particles with spin 1/2. The major part of this interaction is the
usual Coulomb potential. In addition, there are relativistic corrections re-
sponsible for magnetic, contact, spin-orbit, spin-spin, and other interactions
which are routinely used in relativistic calculations of atomic and molecu-
lar systems. This derivation demonstrates how formulas familiar from ordi-
nary quantum mechanics and classical electrodynamics follow naturally from
RQD. In section 12.2 we will solve the stationary Schrödinger equation for the
hydrogen atom and obtain its energy spectrum with relativistic corrections.
12.1 Darwin-Breit Hamiltonian

12.1.1 Electron-proton potential in the momentum space
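Note that the second-order dressed interaction near the energy shell (11.29)
is given by the same formula as $F_2$ in (9.17). So, for the electron-proton
potential we can simply reuse our result (9.24)1
$$V_2^d[d^\dagger a^\dagger da] = V_{2A}^d + V_{2B}^d + V_{2C}^d \tag{12.1}$$
$$
\begin{aligned}
V_{2A}^d = &-\frac{e^2\hbar^2}{(2\pi\hbar)^3} \int d\mathbf{k}\, d\mathbf{q}\, d\mathbf{p}\, \frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}} \omega_{\mathbf{q}+\mathbf{k}} \Omega_{\mathbf{p}} \Omega_{\mathbf{p}-\mathbf{k}}}} \frac{1}{k^2} \times \\
&\sum_{\epsilon\epsilon'\pi\pi'} W^0(\mathbf{p}-\mathbf{k}, \pi; \mathbf{p}, \pi')\, U^0(\mathbf{q}+\mathbf{k}, \epsilon; \mathbf{q}, \epsilon') \times \\
&d^\dagger_{\mathbf{p}-\mathbf{k},\pi}\, a^\dagger_{\mathbf{q}+\mathbf{k},\epsilon}\, d_{\mathbf{p},\pi'}\, a_{\mathbf{q},\epsilon'}
\end{aligned}
\tag{12.2}
$$
$$
\begin{aligned}
V_{2B}^d = &-\frac{e^2\hbar^2c^2}{(2\pi\hbar)^3} \int d\mathbf{k}\, d\mathbf{q}\, d\mathbf{p}\, \frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}} \omega_{\mathbf{q}+\mathbf{k}} \Omega_{\mathbf{p}} \Omega_{\mathbf{p}-\mathbf{k}}}} \frac{1}{(\tilde{q} + \tilde{k} \div \tilde{q})^2} \times \\
&\sum_{\epsilon\epsilon'\pi\pi'} \mathbf{W}(\mathbf{p}-\mathbf{k}, \pi; \mathbf{p}, \pi') \cdot \mathbf{U}(\mathbf{q}+\mathbf{k}, \epsilon; \mathbf{q}, \epsilon') \times \\
&d^\dagger_{\mathbf{p}-\mathbf{k},\pi}\, a^\dagger_{\mathbf{q}+\mathbf{k},\epsilon}\, d_{\mathbf{p},\pi'}\, a_{\mathbf{q},\epsilon'}
\end{aligned}
\tag{12.3}
$$
$$
\begin{aligned}
V_{2C}^d = &\frac{e^2\hbar^2c^2}{(2\pi\hbar)^3} \int d\mathbf{k}\, d\mathbf{q}\, d\mathbf{p}\, \frac{Mmc^4}{\sqrt{\omega_{\mathbf{q}} \omega_{\mathbf{q}+\mathbf{k}} \Omega_{\mathbf{p}} \Omega_{\mathbf{p}-\mathbf{k}}}} \frac{1}{(\tilde{q} + \tilde{k} \div \tilde{q})^2 k^2} \times \\
&\sum_{\epsilon\epsilon'\pi\pi'} (\mathbf{k} \cdot \mathbf{W}(\mathbf{p}-\mathbf{k}, \pi; \mathbf{p}, \pi'))(\mathbf{k} \cdot \mathbf{U}(\mathbf{q}+\mathbf{k}, \epsilon; \mathbf{q}, \epsilon')) \times \\
&d^\dagger_{\mathbf{p}-\mathbf{k},\pi}\, a^\dagger_{\mathbf{q}+\mathbf{k},\epsilon}\, d_{\mathbf{p},\pi'}\, a_{\mathbf{q},\epsilon'}
\end{aligned}
\tag{12.4}
$$
In these formulas we integrate over the electron ($\mathbf{q}$), proton ($\mathbf{p}$) and
transferred ($\mathbf{k}$) momenta and sum over spin projections of the “incoming” and
“outgoing” particles, $\epsilon'$, $\pi'$ and $\epsilon$, $\pi$, respectively.
1
Note that we could also choose a different expression (9.25) for $F_2$, thus obtaining
a different formula for $V_2^d$. The two operators (9.24) and (9.25) coincide on the energy
shell, so they result in identical 2nd order S-matrices and energy spectra. However, their
corresponding dressed Hamiltonians are different, as they have different off-energy-shell
behaviors. As shown in chapter 15, our decision to use (9.24) leads to a good agreement
with Maxwell’s electrodynamics. However, it is not clear how much this macroscopic
theory would be affected by a different choice of $V_2^d$ outside the energy shell. This question
requires a deeper investigation.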
Operator (12.1) has non-trivial action in all sectors of the Fock space
which contain at least one electron and one proton. However, for simplicity,
we will limit our attention to the “1 proton + 1 electron” subspace $\mathcal{H}_{pe}$ in
the Fock space. If $\Psi(\mathbf{p}, \pi; \mathbf{q}, \epsilon)$ is the wave function of a two-particle state in
$\mathcal{H}_{pe}$, then the interaction Hamiltonian (12.1) will transform it to2
$$
\begin{aligned}
\Psi'(\mathbf{p}, \pi; \mathbf{q}, \epsilon) &= V_2^d[d^\dagger a^\dagger da]\, \Psi(\mathbf{p}, \pi; \mathbf{q}, \epsilon) \\
&= \sum_{\pi'\epsilon'} \int d\mathbf{k}\, v_2(\mathbf{p}, \mathbf{q}, \mathbf{k}; \pi, \epsilon, \pi', \epsilon')\, \Psi(\mathbf{p}-\mathbf{k}, \pi'; \mathbf{q}+\mathbf{k}, \epsilon')
\end{aligned}
\tag{12.5}
$$
where $v_2$ is the coefficient function of the interaction operator $V_2^d[d^\dagger a^\dagger da]$.
We are going to write our formulas with the accuracy of $(v/c)^2$. So, we use
(J.64) - (J.69) to obtain the coefficient function in (12.5) as a sum of three
terms
$$v_2 = v_{2A} + v_{2B} + v_{2C} \tag{12.6}$$
where3
2
See subsection 8.2.8.
3
We used properties of Pauli matrices from Appendix H.6 and formulas from Appendix
J.9. Our calculations in this section can be compared with §83 in ref. [BLP01] and with
[Ito65].
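$$
\begin{aligned}
v_{2A} = -\frac{e^2\hbar^2}{(2\pi\hbar)^3} \chi_\epsilon^{(el)\dagger} \chi_\pi^{(pr)\dagger} \Big( &\frac{1}{k^2} - \frac{1}{8M^2c^2} - \frac{1}{8m^2c^2} \\
&- i\frac{\vec{\sigma}_{pr} \cdot [\mathbf{k} \times \mathbf{p}]}{4M^2c^2k^2} + i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}]}{4m^2c^2k^2} \Big) \chi_{\epsilon'}^{(el)} \chi_{\pi'}^{(pr)}
\end{aligned}
$$
$$
\begin{aligned}
v_{2B} = \frac{e^2\hbar^2}{(2\pi\hbar)^3} \chi_\epsilon^{(el)\dagger} \chi_\pi^{(pr)\dagger} \Big( &\frac{\mathbf{p}\mathbf{q}}{Mmc^2k^2} - \frac{\mathbf{k}\mathbf{q}}{2Mmc^2k^2} + \frac{\mathbf{p}\mathbf{k}}{2Mmc^2k^2} - \frac{1}{4Mmc^2} \\
&- \frac{i[\vec{\sigma}_{pr} \times \mathbf{k}] \cdot \mathbf{q}}{2Mmc^2k^2} + \frac{i\mathbf{p} \cdot [\vec{\sigma}_{el} \times \mathbf{k}]}{2Mmc^2k^2} \\
&+ \frac{(\vec{\sigma}_{el} \cdot \vec{\sigma}_{pr})}{4Mmc^2} - \frac{(\vec{\sigma}_{pr} \cdot \mathbf{k})(\vec{\sigma}_{el} \cdot \mathbf{k})}{4Mmc^2k^2} \Big) \chi_{\epsilon'}^{(el)} \chi_{\pi'}^{(pr)}
\end{aligned}
$$
$$
\begin{aligned}
v_{2C} &= -\frac{e^2\hbar^2}{(2\pi\hbar)^3} \frac{1}{4Mmc^2k^4} \chi_\epsilon^{(el)\dagger} \chi_\pi^{(pr)\dagger} (2\mathbf{p}\mathbf{k} - k^2)(2\mathbf{q}\mathbf{k} + k^2) \chi_{\epsilon'}^{(el)} \chi_{\pi'}^{(pr)} \\
&= -\frac{e^2\hbar^2}{(2\pi\hbar)^3} \chi_\epsilon^{(el)\dagger} \chi_\pi^{(pr)\dagger} \Big( \frac{(\mathbf{p}\mathbf{k})(\mathbf{q}\mathbf{k})}{Mmc^2k^4} - \frac{\mathbf{q}\mathbf{k}}{2Mmc^2k^2} + \frac{\mathbf{p}\mathbf{k}}{2Mmc^2k^2} - \frac{1}{4Mmc^2} \Big) \chi_{\epsilon'}^{(el)} \chi_{\pi'}^{(pr)}
\end{aligned}
$$
Putting these three terms together we finally rewrite (12.6) in the form
$$
\begin{aligned}
v_2(\mathbf{p}, \mathbf{q}, \mathbf{k}; \pi, \epsilon, \pi', \epsilon') = \frac{e^2\hbar^2}{(2\pi\hbar)^3} \chi_\epsilon^{(el)\dagger} \chi_\pi^{(pr)\dagger} \Big( &-\frac{1}{k^2} + \frac{1}{8M^2c^2} + \frac{1}{8m^2c^2} + \frac{\mathbf{p}\mathbf{q}}{Mmc^2k^2} - \frac{(\mathbf{p}\mathbf{k})(\mathbf{q}\mathbf{k})}{Mmc^2k^4} \\
&+ i\frac{\vec{\sigma}_{pr} \cdot [\mathbf{k} \times \mathbf{p}]}{4M^2c^2k^2} - i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}]}{4m^2c^2k^2} - i\frac{\vec{\sigma}_{pr} \cdot [\mathbf{k} \times \mathbf{q}]}{2Mmc^2k^2} + i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{p}]}{2Mmc^2k^2} \\
&+ \frac{(\vec{\sigma}_{el} \cdot \vec{\sigma}_{pr})}{4Mmc^2} - \frac{(\vec{\sigma}_{pr} \cdot \mathbf{k})(\vec{\sigma}_{el} \cdot \mathbf{k})}{4Mmc^2k^2} \Big) \chi_{\epsilon'}^{(el)} \chi_{\pi'}^{(pr)}
\end{aligned}
\tag{12.7}
$$
In a reasonable approximation one can assume that the proton is infinitely
heavy ($M \gg m$) and skip terms with $M$ in denominators. Then
$$
v_2(\mathbf{p}, \mathbf{q}, \mathbf{k}; \pi, \epsilon, \pi', \epsilon') \approx \frac{e^2\hbar^2}{(2\pi\hbar)^3} \delta_{\pi,\pi'} \chi_\epsilon^{(el)\dagger} \Big( -\frac{1}{k^2} + \frac{1}{8m^2c^2} - i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \mathbf{q}]}{4m^2c^2k^2} \Big) \chi_{\epsilon'}^{(el)}
\tag{12.8}
$$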
12.1.2 Position representation

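The physical meaning of interaction (12.7) is more transparent in the
position representation, which is derived by replacing variables $\mathbf{p}$ and $\mathbf{q}$ with
differential operators $\hat{\mathbf{p}} = -i\hbar(d/d\mathbf{x})$ and $\hat{\mathbf{q}} = -i\hbar(d/d\mathbf{y})$ and taking the
Fourier transform4
$$
\begin{aligned}
&\hat{V}_2^d[d^\dagger a^\dagger da]\, \Psi(\mathbf{x}, \pi; \mathbf{y}, \epsilon) \\
&= \sum_{\pi',\epsilon'} \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{x}-\mathbf{y})} v_2(\hat{\mathbf{p}}, \hat{\mathbf{q}}, \mathbf{k}; \pi, \epsilon, \pi', \epsilon')\, \Psi(\mathbf{x}, \pi'; \mathbf{y}, \epsilon') \\
&= \frac{e^2\hbar^2}{(2\pi\hbar)^3} \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{x}-\mathbf{y})} \Big( -\frac{1}{k^2} + \frac{1}{8M^2c^2} + \frac{1}{8m^2c^2} + \frac{\hat{\mathbf{p}}\hat{\mathbf{q}}}{Mmc^2k^2} - \frac{(\hat{\mathbf{p}}\mathbf{k})(\hat{\mathbf{q}}\mathbf{k})}{Mmc^2k^4} \\
&\quad + i\frac{\vec{\sigma}_{pr} \cdot [\mathbf{k} \times \hat{\mathbf{p}}]}{4M^2c^2k^2} - i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \hat{\mathbf{q}}]}{4m^2c^2k^2} - i\frac{\vec{\sigma}_{pr} \cdot [\mathbf{k} \times \hat{\mathbf{q}}]}{2Mmc^2k^2} + i\frac{\vec{\sigma}_{el} \cdot [\mathbf{k} \times \hat{\mathbf{p}}]}{2Mmc^2k^2} \\
&\quad + \frac{(\vec{\sigma}_{el} \cdot \vec{\sigma}_{pr})}{4Mmc^2} - \frac{(\vec{\sigma}_{pr} \cdot \mathbf{k})(\vec{\sigma}_{el} \cdot \mathbf{k})}{4Mmc^2k^2} \Big) \Psi(\mathbf{x}, \pi; \mathbf{y}, \epsilon)
\end{aligned}
$$
Using integral formulas (B.7) - (B.11) we obtain the following position-space
representation of this potential (where $\mathbf{r} \equiv \mathbf{x} - \mathbf{y}$)5
$$
\begin{aligned}
\hat{V}_2^d[d^\dagger a^\dagger da] = &-\frac{e^2}{4\pi r} + \frac{e^2\hbar^2}{8c^2}\Big( \frac{1}{M^2} + \frac{1}{m^2} \Big) \delta(\mathbf{r}) + \frac{e^2}{8\pi Mmc^2 r}\Big[ \hat{\mathbf{p}} \cdot \hat{\mathbf{q}} + \frac{(\mathbf{r} \cdot \hat{\mathbf{q}})(\mathbf{r} \cdot \hat{\mathbf{p}})}{r^2} \Big] \\
&- \frac{e^2\hbar [\mathbf{r} \times \hat{\mathbf{p}}] \cdot \vec{\sigma}_{pr}}{16\pi M^2c^2r^3} + \frac{e^2\hbar [\mathbf{r} \times \hat{\mathbf{q}}] \cdot \vec{\sigma}_{el}}{16\pi m^2c^2r^3} + \frac{e^2\hbar [\mathbf{r} \times \hat{\mathbf{q}}] \cdot \vec{\sigma}_{pr}}{8\pi Mmc^2r^3} - \frac{e^2\hbar [\mathbf{r} \times \hat{\mathbf{p}}] \cdot \vec{\sigma}_{el}}{8\pi Mmc^2r^3} \\
&+ \frac{e^2\hbar^2}{4Mmc^2}\Big( -\frac{\vec{\sigma}_{pr} \cdot \vec{\sigma}_{el}}{4\pi r^3} + 3\frac{(\vec{\sigma}_{pr} \cdot \mathbf{r})(\vec{\sigma}_{el} \cdot \mathbf{r})}{4\pi r^5} + \frac{2}{3}(\vec{\sigma}_{pr} \cdot \vec{\sigma}_{el})\delta(\mathbf{r}) \Big)
\end{aligned}
\tag{12.9}
$$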
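As a check of the leading term, the Coulomb potential in (12.9) follows from the $-1/k^2$ term of the momentum-space coefficient function. The sketch below uses the substitution $\mathbf{k} = \hbar\boldsymbol{\kappa}$ and the standard integral $\int d\boldsymbol{\kappa}\, e^{i\boldsymbol{\kappa}\cdot\mathbf{r}}/\kappa^2 = 2\pi^2/r$. The exact form in which this integral appears among formulas (B.7) - (B.11) is not reproduced here, so the notation $\boldsymbol{\kappa}$ is an assumption made for illustration.

```latex
\begin{aligned}
-\frac{e^2\hbar^2}{(2\pi\hbar)^3} \int d\mathbf{k}\, \frac{e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}}{k^2}
&= -\frac{e^2\hbar^2}{(2\pi\hbar)^3}\, \hbar \int d\boldsymbol{\kappa}\, \frac{e^{i\boldsymbol{\kappa}\cdot\mathbf{r}}}{\kappa^2}
 = -\frac{e^2\hbar^3}{8\pi^3\hbar^3} \cdot \frac{2\pi^2}{r}
 = -\frac{e^2}{4\pi r} .
\end{aligned}
```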
4
see subsection 8.2.8; $\mathbf{x}$ is the proton’s position and $\mathbf{y}$ is the electron’s position
5
Some of these interaction terms are non-Hermitian due to the non-commutativity of
operators $\mathbf{r}$ and $\hat{\mathbf{p}}$, $\hat{\mathbf{q}}$. This minor problem can be solved by symmetrizing non-commutative
products, e.g., by replacing $AB \to (AB + BA)/2$.
With the accuracy of $(v/c)^2$ the free Hamiltonian $H_0$ can be written as
$$
\begin{aligned}
\hat{H}_0 &= \sqrt{M^2c^4 + \hat{p}^2c^2} + \sqrt{m^2c^4 + \hat{q}^2c^2} \\
&= Mc^2 + mc^2 + \frac{\hat{p}^2}{2M} + \frac{\hat{q}^2}{2m} - \frac{\hat{p}^4}{8M^3c^2} - \frac{\hat{q}^4}{8m^3c^2} + \ldots
\end{aligned}
$$
where the rest energies of particles $Mc^2$ and $mc^2$ are simply constants, which
can be eliminated by a proper choice of zero on the energy scale. Note
also that Pauli matrices are proportional to particle spin operators (H.5):
$\hat{\mathbf{S}}_{el} = \frac{\hbar}{2}\vec{\sigma}_{el}$, $\hat{\mathbf{S}}_{pr} = \frac{\hbar}{2}\vec{\sigma}_{pr}$. So, finally, the QED Hamiltonian responsible for
the electron-proton interaction in the 2nd order is obtained in the form of the
Darwin-Breit potential
$$
\begin{aligned}
\hat{H}^d &= \hat{H}_0 + \hat{V}_2^d(\hat{\mathbf{p}}, \hat{\mathbf{q}}, \mathbf{r}, \hat{\mathbf{S}}_{el}, \hat{\mathbf{S}}_{pr}) + \ldots \\
&= \frac{\hat{p}^2}{2M} + \frac{\hat{q}^2}{2m} + \hat{V}_{Coulomb} + \hat{V}_{orbit} + \hat{V}_{spin-orbit} + \hat{V}_{spin-spin} + \ldots
\end{aligned}
\tag{12.10}
$$
This form is similar to the familiar non-relativistic Hamiltonian in which
p̂²/(2M) + q̂²/(2m) is treated as the kinetic energy operator and V̂Coulomb is
the usual Coulomb interaction between two charged particles
\[
\hat{V}_{Coulomb} = -\frac{e^2}{4\pi r} \tag{12.11}
\]
This is the only interaction term which survives in the non-relativistic limit
c → ∞. V̂orbit is a spin-independent relativistic correction to the Coulomb
interaction
\[
\begin{aligned}
\hat{V}_{orbit} ={}& -\frac{\hat{p}^4}{8M^3c^2} - \frac{\hat{q}^4}{8m^3c^2}
  + \frac{e^2\hbar^2}{8c^2}\Big(\frac{1}{M^2}+\frac{1}{m^2}\Big)\delta(\mathbf{r}) \\
& + \frac{e^2}{8\pi Mmc^2 r}\Big[\hat{\mathbf{p}}\cdot\hat{\mathbf{q}}
  + \frac{(\mathbf{r}\cdot\hat{\mathbf{q}})(\mathbf{r}\cdot\hat{\mathbf{p}})}{r^2}\Big]
\end{aligned}
\tag{12.12}
\]
The first two terms do not depend on relative variables, so they can be
regarded as relativistic corrections to energies of single particles. The contact
interaction (proportional to ħ²δ(r)) can be neglected in the classical limit
ħ → 0. Keeping the (v/c)² accuracy and substituting p̂/M → v̂pr and
q̂/m → v̂el, the remaining terms can be rewritten in a more familiar form of
the Darwin potential [Bre68]
\[
\hat{V}_{Darwin} = \frac{e^2}{8\pi c^2 r}\Big[\hat{\mathbf{v}}_{el}\cdot\hat{\mathbf{v}}_{pr}
  + \frac{(\mathbf{r}\cdot\hat{\mathbf{v}}_{pr})(\mathbf{r}\cdot\hat{\mathbf{v}}_{el})}{r^2}\Big]
\tag{12.13}
\]
which describes velocity-dependent (magnetic) interaction between charged
particles.
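Two other terms V̂spin−orbit and V̂spin−spin in (12.10) depend on particle
spins
\[
\hat{V}_{spin-orbit} = -\frac{e^2[\mathbf{r}\times\hat{\mathbf{p}}]\cdot\hat{\mathbf{S}}_{pr}}{8\pi M^2c^2r^3}
  + \frac{e^2[\mathbf{r}\times\hat{\mathbf{q}}]\cdot\hat{\mathbf{S}}_{el}}{8\pi m^2c^2r^3}
  + \frac{e^2[\mathbf{r}\times\hat{\mathbf{q}}]\cdot\hat{\mathbf{S}}_{pr}}{4\pi Mmc^2r^3}
  - \frac{e^2[\mathbf{r}\times\hat{\mathbf{p}}]\cdot\hat{\mathbf{S}}_{el}}{4\pi Mmc^2r^3}
\tag{12.14}
\]
\[
\hat{V}_{spin-spin} = \frac{e^2}{Mmc^2}\Big[-\frac{\hat{\mathbf{S}}_{pr}\cdot\hat{\mathbf{S}}_{el}}{4\pi r^3}
  + 3\frac{(\hat{\mathbf{S}}_{pr}\cdot\mathbf{r})(\hat{\mathbf{S}}_{el}\cdot\mathbf{r})}{4\pi r^5}
  + \frac{2}{3}(\hat{\mathbf{S}}_{pr}\cdot\hat{\mathbf{S}}_{el})\delta(\mathbf{r})\Big]
\tag{12.15}
\]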
Since our dressing transformation preserved commutation relations of the
Poincaré Lie algebra,6 we can be confident that the Darwin-Breit Hamilto-
nian is relativistically invariant, at least up to the order (v/c)2 . In Appendix
N.3 we additionally verify this important fact by a direct calculation.7
The Darwin-Breit Hamiltonian was successfully applied to various elec-
tromagnetic problems, such as the fine structure of atomic spectra [BLP01,
Bre68], superconductivity, and properties of plasma [Ess07, Ess95, Ess96,
Ess99, EF13]. In chapter 15 we will see that in the classical limit this Hamil-
tonian reproduces correctly all major results of classical electrodynamics. In
chapter 14.2 we will calculate small radiative corrections to the Darwin-Breit
potential.
12.2 Hydrogen atom

Having derived the electron-proton interaction potential, now we can study
the bound state of these two particles - the hydrogen atom. We are interested
in energies and wave functions of its stationary states. In the dressed particle
approach, this task is accomplished simply by diagonalizing the dressed par-
ticle Hamiltonian in the “electron+proton” sector Hpe of the Fock space. In
other words, the stationary Schrödinger equation needs to be solved. In this
section, we will study this solution with the Hamiltonian (12.10) including
interaction terms up to the 2nd perturbation order. Higher order corrections
will be considered in chapter 14.2.
6
7
see also [CV68, CO70, KF74]
12.2.1 Non-relativistic Schrödinger equation

We can use the fact that Hamiltonian (12.10) commutes with the operator of
total momentum P = p + q. Therefore this Hamiltonian leaves invariant the
zero-total-momentum subspace of Hpe. Working in this subspace we can set
Q̂ ≡ −q̂ = p̂ in equations (12.12) and (12.14) and consider Q̂ as the operator of
differentiation with respect to r
\[
\hat{\mathbf{Q}} = i\hbar\frac{\partial}{\partial\mathbf{r}}
\]
If we make these substitutions in (12.10), then the energies ε and wave
functions Ψε(r, π, ϵ) of stationary states of the hydrogen atom at rest can be
found as solutions of the stationary Schrödinger equation
\[
\hat{H}^d\Big(i\hbar\frac{\partial}{\partial\mathbf{r}},\mathbf{r},\mathbf{S}_{el},\mathbf{S}_{pr}\Big)
  \Psi_\varepsilon(\mathbf{r},\pi,\epsilon) = \varepsilon\Psi_\varepsilon(\mathbf{r},\pi,\epsilon)
\tag{12.16}
\]
An analytical solution of equation (12.16) is not possible. Realistically, one can
first solve equation (12.16) leaving just the Coulomb interaction term (12.11)
there and rewriting the first two terms in (12.10) as
\[
\frac{\hat{p}^2}{2M} + \frac{\hat{q}^2}{2m} = \frac{(m+M)\hat{Q}^2}{2mM} = \frac{\hat{Q}^2}{2\mu}
\]
where µ = mM/(m + M) ≈ m is the reduced mass. In this approximation
equation (12.16) takes the form
\[
\Big(\frac{\hat{Q}^2}{2\mu} - \frac{e^2}{4\pi r}\Big)\Psi_\varepsilon(\mathbf{r},\pi,\epsilon)
  = \varepsilon\Psi_\varepsilon(\mathbf{r},\pi,\epsilon)
\tag{12.17}
\]
It does not depend on spin variables π, ϵ, so the solution can be written as a
product of orbital and spin parts
\[
\Psi_\varepsilon(\mathbf{r},\pi,\epsilon) = \psi_\varepsilon(\mathbf{r})\chi(\pi,\epsilon)
\]
The energy ε is independent of the spin part χ(π, ϵ), which can be chosen
as an arbitrary set of four complex numbers satisfying the normalization
condition
\[
|\chi(1/2,1/2)|^2 + |\chi(-1/2,1/2)|^2 + |\chi(1/2,-1/2)|^2 + |\chi(-1/2,-1/2)|^2 = 1
\]
The orbital parts and their energy eigenvalues satisfy the differential equation
\[
\Big(-\frac{\hbar^2}{2\mu}\frac{\partial^2}{\partial\mathbf{r}^2} - \frac{e^2}{4\pi r}\Big)\psi_\varepsilon(\mathbf{r})
  = \varepsilon\psi_\varepsilon(\mathbf{r})
\tag{12.18}
\]
or in spherical coordinates
\[
\Big[-\frac{\hbar^2}{2\mu}\Big(\frac{1}{r^2}\frac{\partial}{\partial r}r^2\frac{\partial}{\partial r}
  + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\sin\theta\frac{\partial}{\partial\theta}
  + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\Big) - \frac{e^2}{4\pi r}\Big]\psi_\varepsilon(r,\theta,\phi)
  = \varepsilon\psi_\varepsilon(r,\theta,\phi)
\]
This is the familiar non-relativistic problem with a well-known analytical
solution, which can be found in any textbook on quantum mechanics, e.g.,
[Bal98, LL77]. Eigenstates will be labeled by the principal (n), orbital (l),
and magnetic (m) quantum numbers. Energy eigenvalues are degenerate
with respect to l and m
\[
\varepsilon(n,l,m) = -\frac{\mu c^2\alpha^2}{2n^2} \tag{12.19}
\]
A few low energy solutions are shown in table 12.1, where a0 ≡ 4πħ²/(µe²) ≈
ħ/(αmc) denotes the Bohr radius, and α ≡ e²/(4πħc) ≈ 1/137 is the fine
structure constant.
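As a quick numerical illustration of (12.19), the following minimal sketch evaluates the first two Bohr levels. The rest energies and the value of α are approximate reference values assumed here for illustration; they are not quoted in the text, and the function names are hypothetical.

```python
# Rough numerical check of the Bohr energies (12.19).
# All constants are approximate reference values (assumptions of this sketch).
ELECTRON_REST_ENERGY_EV = 510998.95    # m c^2 in eV
PROTON_REST_ENERGY_EV = 938272088.0    # M c^2 in eV
ALPHA = 1 / 137.035999                 # fine structure constant


def reduced_rest_energy_ev(m_c2=ELECTRON_REST_ENERGY_EV, M_c2=PROTON_REST_ENERGY_EV):
    """mu c^2 with mu = mM/(m + M); slightly smaller than m c^2."""
    return m_c2 * M_c2 / (m_c2 + M_c2)


def bohr_energy_ev(n):
    """epsilon(n) = -mu c^2 alpha^2 / (2 n^2), in eV; independent of l and m."""
    return -reduced_rest_energy_ev() * ALPHA**2 / (2 * n**2)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, round(bohr_energy_ev(n), 3))
```

With these inputs the n = 1 and n = 2 levels come out close to the −13.6 eV and −3.4 eV quoted in Table 12.1.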
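For further calculations we will need expectation values for inverse powers
of r in different eigenstates. For example
\[
\begin{aligned}
\langle r^{-1}\rangle(2S) &\equiv \int d\mathbf{r}\,\psi_{2S}^*(\mathbf{r})\frac{1}{r}\psi_{2S}(\mathbf{r})
  = \frac{1}{8a_0^3}\int_0^\infty dr\, r\Big(2-\frac{r}{a_0}\Big)^2 e^{-r/a_0} \\
&= \frac{1}{8a_0^3}\Big[4\int_0^\infty dr\, r e^{-r/a_0}
  - \frac{4}{a_0}\int_0^\infty dr\, r^2 e^{-r/a_0}
  + \frac{1}{a_0^2}\int_0^\infty dr\, r^3 e^{-r/a_0}\Big] \\
&= \frac{1}{8a_0^3}\big(4a_0^2 - 8a_0^2 + 6a_0^2\big) = \frac{1}{4a_0}
\end{aligned}
\]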
Table 12.1: Normalized low energy solutions for non-relativistic hydrogen atom

State (n, l, m)   Wave function ψε(r, θ, φ)                           Energy ε
1S (1, 0, 0)      [1/√(πa0³)] e^(−r/a0)                               −µc²α²/2 = −13.6 eV
2S (2, 0, 0)      [1/(4√(2πa0³))] (2 − r/a0) e^(−r/(2a0))             −µc²α²/8 = −3.4 eV
2P (2, 1, 0)      [r/(4a0√(2πa0³))] e^(−r/(2a0)) cos θ                −µc²α²/8 = −3.4 eV
2P (2, 1, −1)     [r/(8a0√(πa0³))] e^(−r/(2a0)) sin θ e^(−iφ)         −µc²α²/8 = −3.4 eV
2P (2, 1, 1)      [r/(8a0√(πa0³))] e^(−r/(2a0)) sin θ e^(iφ)          −µc²α²/8 = −3.4 eV
These results are shown in Table 12.2 along with probability densities at the
origin |ψ(0)|2 .
Table 12.2: Properties of low energy solutions for non-relativistic hydrogen atom

State   |ψ(0)|²       ⟨r⁻¹⟩      ⟨r⁻²⟩        ⟨r⁻³⟩
1S      1/(πa0³)      1/a0       2/a0²
2S      1/(8πa0³)     1/(4a0)    1/(4a0²)
2P      0             1/(4a0)    1/(12a0²)    1/(24a0³)
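The entries of Table 12.2 can be cross-checked with exact rational arithmetic. The sketch below assumes that each radial probability density (angles already integrated out, x = r/a0) is a polynomial times a decaying exponential, read off from Table 12.1, and uses the elementary integral ∫₀^∞ xⁿ e^(−cx) dx = n!/c^(n+1). The names RADIAL, exp_moment and inverse_power are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from math import factorial


def exp_moment(n, c):
    """Integral of x**n * exp(-c*x) over [0, inf) for integer n >= 0: n!/c**(n+1)."""
    return Fraction(factorial(n)) / Fraction(c) ** (n + 1)


# Radial probability densities P(x), x = r/a0, written as
# ({power: coefficient}, decay constant c) with P(x) = sum(coef * x**power) * exp(-c*x).
RADIAL = {
    "1S": ({2: Fraction(4)}, 2),                                           # 4 x^2 e^{-2x}
    "2S": ({2: Fraction(1, 2), 3: Fraction(-1, 2), 4: Fraction(1, 8)}, 1),  # x^2 (2-x)^2 e^{-x} / 8
    "2P": ({4: Fraction(1, 24)}, 1),                                       # x^4 e^{-x} / 24
}


def inverse_power(state, k):
    """<r^-k> in units of a0^-k, or None when the integral diverges at r = 0."""
    coeffs, c = RADIAL[state]
    if min(coeffs) - k < 0:
        return None
    return sum(coef * exp_moment(power - k, c) for power, coef in coeffs.items())


if __name__ == "__main__":
    for state in RADIAL:
        print(state, [inverse_power(state, k) for k in (0, 1, 2, 3)])
```

The k = 0 column is the normalization integral and equals 1 for every state; the divergent ⟨r⁻³⟩ integrals for the S states correspond to the empty cells of the table.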
12.2.2 Relativistic energy corrections (orbital)

In the preceding subsection we obtained energies ε and wave functions Ψε for
a simple model of the hydrogen atom in which the electron-proton interaction
is approximated by the Coulomb potential −e²/(4πr). We can consider these
results as a zero-order approximation for the full exact solution. Then other
interaction terms in (12.10) can be treated as a perturbation Vpert ≡ Vorbit +
Vspin−orbit + Vspin−spin. In the first approximation, this perturbation does not
affect the wave functions but shifts the energies [Bal98]. The resulting energy
correction for the state |Ψε⟩ is given by the matrix element
\[
\Delta\varepsilon = \langle\Psi_\varepsilon|V_{pert}|\Psi_\varepsilon\rangle \tag{12.20}
\]

[Figure 12.1: energy level diagram with panels (a), (b), (c), (d); vertical axis: energy E; levels 1S, 2S, 2P in panel (a) and 1S1/2, 2S1/2, 2P1/2, 2P3/2 in panels (b)-(d)]
Figure 12.1: Low energy states of the hydrogen atom: (a) the non-relativistic
approximation (with the Coulomb potential (12.11)); (b) with the fine struc-
ture (due to the orbit (12.12) and spin-orbit (12.14) corrections); (c) with
Lamb shifts (due to the 4th and higher order radiative corrections); (d) with
the hyperfine structure (due to the spin-spin corrections (12.15)). Not to
scale.
Then perturbations Vorbit and Vspin−orbit are responsible for the fine structure
of the hydrogen atom and Vspin−spin is responsible for the hyperfine structure
(see Fig. 12.1). More details can be found in §84 of ref. [BLP01].
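Let us first calculate energy level corrections due to the perturbation
Vorbit. We will take into account that M ≫ m, thus ignoring terms
proportional to inverse powers of M in (12.12)8 and assuming that µ = m. The
energy correction due to the second term in (12.12) is
\[
\Delta\varepsilon_{relat} = -\frac{1}{8m^3c^2}\int d\mathbf{r}\,\psi^*\hat{Q}^4\psi \tag{12.21}
\]
If ψε is an eigenfunction of H with eigenvalue ε, then9
\[
\begin{aligned}
\hat{Q}^4\psi_\varepsilon &= \hat{Q}^2\hat{Q}^2\psi_\varepsilon
  = 2m\hat{Q}^2\Big(\varepsilon + \frac{e^2}{4\pi r}\Big)\psi_\varepsilon \\
&= -2m\hbar^2\frac{\partial}{\partial\mathbf{r}}\Big(\varepsilon\frac{\partial\psi_\varepsilon}{\partial\mathbf{r}}
  + \frac{\partial}{\partial\mathbf{r}}\frac{e^2}{4\pi r}\psi_\varepsilon\Big) \\
&= -2m\hbar^2\Big(\varepsilon\frac{\partial^2\psi_\varepsilon}{\partial\mathbf{r}^2}
  + \frac{e^2}{4\pi}\psi_\varepsilon\frac{\partial^2}{\partial\mathbf{r}^2}\frac{1}{r}
  + \frac{e^2}{2\pi}\Big(\frac{\partial}{\partial\mathbf{r}}\frac{1}{r}\Big)\cdot\frac{\partial\psi_\varepsilon}{\partial\mathbf{r}}
  + \frac{e^2}{4\pi r}\frac{\partial^2\psi_\varepsilon}{\partial\mathbf{r}^2}\Big) \\
&= -2m\hbar^2\Big[-\frac{2m}{\hbar^2}\Big(\varepsilon + \frac{e^2}{4\pi r}\Big)^2\psi_\varepsilon
  - e^2\delta(\mathbf{r})\psi_\varepsilon
  + \frac{e^2}{2\pi}\Big(\frac{\partial}{\partial\mathbf{r}}\frac{1}{r}\Big)\cdot\frac{\partial\psi_\varepsilon}{\partial\mathbf{r}}\Big]
\end{aligned}
\]
8 For example, we see that the last term in (12.12) - which is Darwin's magnetic
electron-proton potential - is negligibly small in this approximation.
9 Here we used (12.18) and (B.2).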
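Using the expression for the gradient in spherical coordinates10
\[
\frac{\partial f(r,\theta,\phi)}{\partial\mathbf{r}} = \hat{r}\frac{\partial f}{\partial r}
  + \frac{\hat{\theta}}{r}\frac{\partial f}{\partial\theta}
  + \frac{\hat{\phi}}{r\sin\theta}\frac{\partial f}{\partial\phi}
\]
and inserting the resulting expression for Q̂⁴ψε in (12.21) we obtain
\[
\Delta\varepsilon_{relat} = \frac{\hbar^2}{4m^2c^2}\int d\mathbf{r}\,\psi_\varepsilon^*
  \Big[-\frac{2m}{\hbar^2}\Big(\varepsilon + \frac{e^2}{4\pi r}\Big)^2\psi_\varepsilon
  - e^2\delta(\mathbf{r})\psi_\varepsilon
  - \frac{e^2}{2\pi r^2}\frac{\partial\psi_\varepsilon}{\partial r}\Big]
\]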
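The last term in square brackets can be evaluated as11
\[
\begin{aligned}
-\frac{e^2}{2\pi}\int d\mathbf{r}\,\psi_\varepsilon^*\frac{1}{r^2}\frac{\partial\psi_\varepsilon}{\partial r}
&= -\frac{e^2}{2\pi}\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\int_0^\infty dr\,
  \psi_\varepsilon^*\frac{\partial\psi_\varepsilon}{\partial r} \\
&= -\frac{e^2}{4\pi}\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\int_0^\infty dr\,
  \frac{\partial|\psi_\varepsilon|^2}{\partial r}
  = \frac{e^2}{4\pi}\int_0^{2\pi}d\phi\int_0^{\pi}\sin\theta\,d\theta\,|\psi_\varepsilon(0)|^2 \\
&= e^2|\psi_\varepsilon(0)|^2
\end{aligned}
\tag{12.22}
\]
The second term in square brackets is
\[
-e^2\int d\mathbf{r}\,\psi_\varepsilon^*\delta(\mathbf{r})\psi_\varepsilon = -e^2|\psi_\varepsilon(0)|^2
\]
so it cancels with (12.22) and
\[
\begin{aligned}
\Delta\varepsilon_{relat} &= -\frac{1}{2mc^2}\int d\mathbf{r}\,\psi_\varepsilon^*
  \Big(\varepsilon^2 + \frac{e^2\varepsilon}{2\pi r} + \frac{e^4}{16\pi^2r^2}\Big)\psi_\varepsilon \\
&= -\frac{1}{2mc^2}\Big(\varepsilon^2 + \frac{e^2\varepsilon}{2\pi}\langle r^{-1}\rangle
  + \frac{e^4}{16\pi^2}\langle r^{-2}\rangle\Big)
\end{aligned}
\]
10 Here r̂ ≡ r/r, θ̂, φ̂ are unit vectors directed along directions of growth of the
corresponding coordinates.
11 We take into account that ψε(r, θ, φ) → 0 as r → ∞.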
The energy correction due to the third term in (12.12) is
\[
\Delta\varepsilon_{contact} = \frac{e^2\hbar^2}{8m^2c^2}\int d\mathbf{r}\,\delta(\mathbf{r})|\psi_\varepsilon(\mathbf{r})|^2
  = \frac{e^2\hbar^2}{8m^2c^2}|\psi_\varepsilon(0)|^2
\tag{12.23}
\]
Using data from Tables 12.1 and 12.2 we obtain orbital energy corrections
for individual states ψnlm as shown in the 2nd and 3rd rows of Table 12.3.
Table 12.3: 2nd order perturbative relativistic energy corrections to low-lying
states of the hydrogen atom.

                                               1S1/2        2S1/2          2P1/2          2P3/2
non-relativistic energy (12.19)                −mc²α²/2     −mc²α²/8       −mc²α²/8       −mc²α²/8
Energy corrections:
relativistic (12.21)  −⟨Q⁴⟩/(8m³c²)            −5mc²α⁴/8    −13mc²α⁴/128   −7mc²α⁴/384    −7mc²α⁴/384
contact (12.23)  [e²ħ²/(8m²c²)]⟨δ(r)⟩          mc²α⁴/2      mc²α⁴/16       0              0
spin-orbit (12.24)  [e²/(8πm²c²)]⟨L·Sel/r³⟩    0            0              −mc²α⁴/48      mc²α⁴/96
Total correction                               −mc²α⁴/8     −5mc²α⁴/128    −5mc²α⁴/128    −mc²α⁴/128
12.2.3 Relativistic energy corrections (spin-orbital)

Let us now consider the effect of the spin-orbit interaction (12.14)
\[
\hat{V}_{spin-orbit} \approx \frac{e^2[\mathbf{r}\times\hat{\mathbf{q}}]\cdot\hat{\mathbf{S}}_{el}}{8\pi m^2c^2r^3}
  = \frac{e^2\,\mathbf{L}\cdot\hat{\mathbf{S}}_{el}}{8\pi m^2c^2r^3}
\tag{12.24}
\]
where L = [r × Q̂] is the orbital angular momentum of the atom. This
interaction does not act on states with l = 0, which are eigenvectors of the
orbital angular momentum operator L² with eigenvalue 0. So, we need to
consider only 2P-states, where l = 1.
In total, there are 6 different substates in 2P: those with different
combinations of the orbital projection m = −1, 0, 1 and the spin projection
sz = −1/2, 1/2. In these substates the total
angular momentum12 J = L + Sel can be either jħ = (1 − (1/2))ħ = ħ/2
or jħ = (1 + (1/2))ħ = 3ħ/2. So, the 6 substates separate into two groups.
One group of two states corresponds to j = 1/2. It is denoted by 2P1/2. The
other group of four states corresponds to j = 3/2 and is denoted by 2P3/2.
The non-perturbed Hamiltonian
\[
H_{e-p} = \frac{Q^2}{2m} - \frac{e^2}{4\pi r} \tag{12.25}
\]
commutes with the orbital angular momentum operator L, with the electron
spin operator Sel and with the total angular momentum operator J. So, all
six substates are degenerate with respect to (12.25).
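On the other hand, the total Hamiltonian (12.10) commutes only with
the total angular momentum J and it does not commute with L and Sel
separately. Therefore, the total energies of the two groups 2P1/2 and 2P3/2
may be different. Let us demonstrate the effect of perturbation (12.24) on
the state 2P1/2. We use formula
\[
J^2 = (\mathbf{L} + \mathbf{S}_{el})^2 = L^2 + S_{el}^2 + 2(\mathbf{L}\cdot\mathbf{S}_{el})
\]
Then
\[
\begin{aligned}
J^2\psi_{2P_{1/2}} &= \hbar^2 j(j+1)\psi_{2P_{1/2}} = \tfrac{3}{4}\hbar^2\psi_{2P_{1/2}} \\
L^2\psi_{2P_{1/2}} &= \hbar^2 l(l+1)\psi_{2P_{1/2}} = 2\hbar^2\psi_{2P_{1/2}} \\
S_{el}^2\psi_{2P_{1/2}} &= \hbar^2 s(s+1)\psi_{2P_{1/2}} = \tfrac{3}{4}\hbar^2\psi_{2P_{1/2}} \\
(\mathbf{L}\cdot\mathbf{S}_{el})\psi_{2P_{1/2}} &= \frac{J^2 - L^2 - S_{el}^2}{2}\psi_{2P_{1/2}}
  = \frac{(3/4 - 2 - 3/4)\hbar^2}{2}\psi_{2P_{1/2}} = -\hbar^2\psi_{2P_{1/2}} \\
\Delta\varepsilon_{spin-orbit}(2P_{1/2}) &= -\frac{e^2\hbar^2}{8\pi m^2c^2}\langle r^{-3}\rangle
  = -\frac{mc^2\alpha^4}{48}
\end{aligned}
\]
A similar calculation with j = 3/2, for which (L · Sel)ψ = (15/4 − 2 − 3/4)(ħ²/2)ψ
= (ħ²/2)ψ, gives us the spin-orbit correction to the energy of 2P3/2
\[
\Delta\varepsilon_{spin-orbit}(2P_{3/2}) = \frac{e^2\hbar^2}{16\pi m^2c^2}\langle r^{-3}\rangle
  = \frac{mc^2\alpha^4}{96}
\]
12 Here we ignore the proton's spin Spr whose contribution to the energy can be ignored
in our approximation.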
One can see from Table 12.3 that the total 2nd order energy corrections to
states 2S 1/2 and 2P 1/2 are the same. So, these two states remain degenerate
in our approximation
ε(2S 1/2 ) − ε(2P 1/2) = 0 (12.26)
In chapter 14 we will take into account higher perturbation orders of the

dressed theory and find a small energy gap between the 2S 1/2 and 2P 1/2
levels, which is known as the Lamb shift.
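The arithmetic behind the corrections discussed in this section can be reproduced with exact fractions. The sketch below works in units of mc²α⁴ and assumes µ = m, a0 = ħ/(αmc) and e² = 4πħcα, so that the relativistic, contact and spin-orbit terms reduce to rational combinations of the dimensionless averages from Table 12.2. The dictionary STATES and both function names are hypothetical; the closed-form expression used as a cross-check is the standard order-α⁴ fine-structure formula, which is quoted here from general knowledge rather than from this text.

```python
from fractions import Fraction as F

# name: (n, l, j, <a0/r>, <a0^2/r^2>, <a0^3/r^3> or None, pi*a0^3*|psi(0)|^2)
STATES = {
    "1S1/2": (1, 0, F(1, 2), F(1), F(2), None, F(1)),
    "2S1/2": (2, 0, F(1, 2), F(1, 4), F(1, 4), None, F(1, 8)),
    "2P1/2": (2, 1, F(1, 2), F(1, 4), F(1, 12), F(1, 24), F(0)),
    "2P3/2": (2, 1, F(3, 2), F(1, 4), F(1, 12), F(1, 24), F(0)),
}


def corrections(name):
    """Return (relativistic, contact, spin-orbit, total) in units of m c^2 alpha^4."""
    n, l, j, r1, r2, r3, psi0 = STATES[name]
    s = F(1, 2)
    # -(1/(2mc^2)) (eps^2 + e^2 eps <1/r>/(2 pi) + e^4 <1/r^2>/(16 pi^2)), eps = -mc^2 alpha^2/(2n^2)
    relat = -F(1, 2) * (F(1, 4 * n**4) - r1 / n**2 + r2)
    # (e^2 hbar^2/(8 m^2 c^2)) |psi(0)|^2
    contact = psi0 / 2
    if l == 0:
        spin_orbit = F(0)
    else:
        # eigenvalue of L.S in units of hbar^2, from J^2 = L^2 + S^2 + 2 L.S
        l_dot_s = (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2
        spin_orbit = l_dot_s * r3 / 2
    return relat, contact, spin_orbit, relat + contact + spin_orbit


def fine_structure(n, j):
    """Standard order-alpha^4 fine-structure shift, in units of m c^2 alpha^4."""
    return -F(1, 2 * n**4) * (F(n) / (j + F(1, 2)) - F(3, 4))


if __name__ == "__main__":
    for name in STATES:
        print(name, [str(x) for x in corrections(name)])
```

In this sketch the totals for 2S1/2 and 2P1/2 coincide (−5/128), which is the degeneracy expressed by (12.26), and every total agrees with the closed-form expression.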
The description of the hydrogen atom presented here is by no means new
or original. Similar calculations can be found in many textbooks on quan-
tum mechanics. However, these textbook calculations always assume the
Coulomb potential −e2 /(4πr) between the proton and the electron as given,
and this assumption is never properly justified. Usually, the introduction of
the Coulomb potential is explained by a reference to classical electrodynam-
ics, but we do not want to base our fundamental relativistic quantum theory
on such a shaky foundation as classical Maxwell’s theory. Unfortunately, the
interaction between the two charges cannot be extracted from the basic
formalism of QED either. It is impossible to recognize traces of the Coulomb
potential among components13 of the QED field-based interaction operator
(10.44).
The true value of the dressed particle approach presented here is that
it shows a clear and rigorous path from the abstract and divergent QED
Hamiltonian to the intuitive and physically transparent picture of particles
interacting via familiar action-at-a-distance forces.
13
some of them even divergent
Chapter 13
DECAYS
Many things are incomprehensible to us not because our compre-

hension is weak, but because those things are not within the frames
of our comprehension.
Kozma Prutkov
Our formulation of quantum theory in the Fock space with undetermined

particle numbers gives us an opportunity to describe not just inter-particle
interactions, but also processes of particle creation and absorption. The
simplest example of such a process is the decay of an unstable particle. This
is the topic of the present chapter.
Unstable particles are interesting for several reasons. First, an unsta-
ble particle is a rare example of a quantum interacting system whose time
evolution can be observed relatively easily. This time evolution is especially
simple, because in many cases it can be described by just one parameter -
the non-decay probability ω(t). Second, a rigorous description of the decay
is possible in a small portion of the Fock space that contains only states
of the particle and its decay products, so a rather accurate solution of this
time-dependent problem can be obtained in a closed form.
In sections 13.1 - 13.2 we will discuss the decay law of an unstable system
at rest. Most of this material is rather traditional. Less familiar ideas will
be presented in the last two sections of this chapter. In section 13.3 we will
be interested in the decay law observed from a moving reference frame. In
section 13.4 we are going to show that the famous Einstein’s time dilation
formula is not exactly applicable to such decays.
13.1 Unstable system at rest

In this section we will pursue two goals. The first goal is to provide a pre-
liminary material for our discussion of decays of moving particles in sections
13.3 and 13.4. The second goal is to derive a beautiful result, due to Breit
and Wigner, which explains why the time dependence of particle decays is
(almost) always exponential.
13.1.1 Quantum mechanics of particle decays

The decay of unstable particles is described mathematically by the non-decay
probability which has the following definition. Suppose that we have a piece
of radioactive material with N unstable nuclei prepared simultaneously at
time t = 0 and denote by Nu(t) the number of nuclei that remain undecayed at
time t > 0. So, at each time point the piece of radioactive material can be
characterized by the ratio Nu(t)/N.
In the spirit of quantum mechanics, we will treat N unstable nuclei or
particles as an ensemble of identically prepared systems and consider the
ratio Nu(t)/N as a property of a single particle – the probability of finding
this particle in the undecayed state. Then the non-decay probability ω(t) is
defined as a large N limit
\[
\omega(t) = \lim_{N\to\infty} N_u(t)/N \tag{13.1}
\]
Function ω(t) will be called the decay law of the particle.

Let us now turn to the description of an isolated unstable system from
the point of view of quantum theory. We will consider a model theory with
three particles a, b, and c, so that particle a is massive and unstable. In
order to simplify calculations and avoid being distracted by issues that are
not relevant to the problem at hand we assume that particle a is spinless and
has only one decay channel. The decay products b and c are assumed to be
stable, and their masses satisfy the inequality
ma > mb + mc (13.2)
which makes the decay a → b+c energetically possible. Decays of elementary

particles are forbidden in quantum electrodynamics, because, as we discussed
in subsection 8.2.4, there are no decay type interactions1 in the Hamiltonian
of QED. So, our analysis in this chapter is more relevant to decays governed
by weak nuclear interactions. However, even in QED decays may occur
in compound systems, e.g., in the hydrogen atom. In section 14.1 we will
consider a specific example in which a and b are stationary states of the atom
(0 < mb < ma) and c is the photon (mc = 0). Most results from the present
chapter will be applicable there without modifications.
Observations performed on the unstable system may result in only two
basic outcomes. One can find either a non-decayed particle a intact or its
decay products b + c. Thus it is appropriate to describe states of this system
in just two sectors of the Fock space2
H = Ha ⊕ Hbc (13.3)
where Ha is the subspace of states of the unstable particle a and Hbc ≡
Hb ⊗ Hc is the orthogonal subspace of the decay products.
Now we can introduce a Hermitian operator T that corresponds to the
experimental proposition “particle a exists.” The operator T can be fully de-
fined by its eigensubspaces and eigenvalues. When a measurement performed
on the unstable system finds it in a state corresponding to the particle a, then
the value of T is 1. When the decay products b + c are observed, the value
of T is 0. Apparently, T is the projection operator on the subspace Ha . For
each normalized state vector |Φ⟩ ∈ H, the probability of finding the unstable
particle a is given by the expectation value of this projection
\[
\omega = \langle\Phi|T|\Phi\rangle \tag{13.4}
\]
Alternatively, one can say that ω is a square of the norm of the projection
T|Φ⟩
\[
\omega = \langle\Phi|TT|\Phi\rangle = \|T|\Phi\rangle\|^2 \tag{13.5}
\]
where we used property T² = T from Theorem G.1.
1 like (13.7)
2 In principle, a rigorous description of systems involving these three types of particles
must be formulated in the full Fock space where particle numbers Na, Nb, and Nc are
allowed to take any values from zero to infinity. However, in most cases the interaction
between decay products b and c in the final state can be ignored. Creation of additional
particles due to this interaction can be ignored too. So, limiting our attention only to the
Fock subspace (13.3) is a reasonable approximation.
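The equivalence of (13.4) and (13.5) is easy to see in a finite-dimensional toy model. The sketch below assumes, purely for illustration, a 4-dimensional space in which the first component spans Ha and the remaining three span Hbc; the function names are hypothetical.

```python
def project_onto_Ha(state):
    """Apply T: keep the H_a component (index 0), zero out the H_bc components."""
    return [state[0]] + [0j] * (len(state) - 1)


def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def nondecay_probability(state):
    """omega = <Phi|T|Phi> for a normalized state."""
    return inner(state, project_onto_Ha(state)).real


# Normalized toy state: |0.6|^2 + |0.8i|^2 = 1
phi = [0.6 + 0j, 0.8j, 0j, 0j]

if __name__ == "__main__":
    t_phi = project_onto_Ha(phi)
    print(nondecay_probability(phi), inner(t_phi, t_phi).real)
```

Both printed numbers are the squared modulus of the Ha component, because applying the projection twice changes nothing.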

Any vector |Ψ⟩ ∈ Ha describes a state in which the unstable particle a
is found with 100% certainty. We will assume that the unstable system was
prepared in such a state |Ψ(0)⟩ at time t = 0
\[
\omega(0) = \langle\Psi(0)|T|\Psi(0)\rangle = 1 \tag{13.6}
\]
To study the time evolution (= decay) of this state, we need to know the
full Hamiltonian H = H0 + V in the Hilbert space H. The non-interacting
part H0 can be easily constructed by usual rules from subsection 8.1.8.
For the interaction part we choose the simplest operator of the decay type
(8.50) that can be responsible for the process a → b + c3
\[
V = \int d\mathbf{p}\,d\mathbf{q}\,\Big(G(\mathbf{p},\mathbf{q})\,a^{\dagger}_{\mathbf{p}+\mathbf{q}}b_{\mathbf{p}}c_{\mathbf{q}}
  + G^*(\mathbf{p},\mathbf{q})\,b^{\dagger}_{\mathbf{p}}c^{\dagger}_{\mathbf{q}}a_{\mathbf{p}+\mathbf{q}}\Big)
\tag{13.7}
\]
As expected, the interaction operator (13.7) leaves invariant the sector H =
Ha ⊕ Hbc of the total Fock space.
Then the time evolution of the initial state |Ψ(0)⟩ is given by formula
(5.49)4
\[
|\Psi(t)\rangle = e^{-\frac{i}{\hbar}Ht}|\Psi(0)\rangle \tag{13.8}
\]
and the decay law is
\[
\omega(t) = \langle\Psi(0)|e^{\frac{i}{\hbar}Ht}\,T\,e^{-\frac{i}{\hbar}Ht}|\Psi(0)\rangle \tag{13.9}
\]
3
Note that in order to have a Hermitian Hamiltonian we need to include in the inter-
action both the term b† c† a responsible for the decay and the term a† bc responsible for the
inverse process b + c → a. Due to the relation (13.2), these two terms have non-empty
energy shells, so, according to our classification in subsection 8.2.4, they belong to the
“decay” type.
4
In this chapter we are working in the Schrödinger picture.
[Figure 13.1: diagram with axes Ha and Hbc showing the state vectors |Ψ(0)⟩ and |Ψ(t)⟩; labels 1 and ω(t)]
Figure 13.1: Time evolution of the state vector of an unstable system.
It is clear that our Hamiltonian H does not commute with the projection
operator T
\[
[H, T] \neq 0 \tag{13.10}
\]
Therefore, the subspace Ha of states of the particle a is not invariant with

respect to time translations, so that the decay law (13.9) is a nontrivial
function of time.
Now we can suggest a schematic visual representation of the decay process
in the Hilbert space. In Fig. 13.1 we show the full Hilbert space Ha ⊕ Hbc as
a sum of two orthogonal subspaces Ha and Hbc. We assume that the initial
normalized state vector |Ψ(0)⟩ at time t = 0 lies entirely in the subspace
Ha, so that the non-decay probability ω(0) is equal to 1 as in equation
(13.6). At later times t > 0 the state vector |Ψ(t)⟩ = e^(−iHt/ħ)|Ψ(0)⟩ develops a
component5 lying in the subspace of decay products Hbc. As we will see later,
in our model the state vector gradually moves from the subspace of unstable
particle Ha to the subspace of decay products Hbc, so that the non-decay
probability ω(t) decreases with time monotonically, while the probability
(1 − ω(t)) of finding the decay products grows. Moreover, we are going to see
that under a very broad range of conditions, the decay law takes the universal
exponential form ω(t) ≈ e^(−Γt), where the decay rate Γ is controlled by the
strength of the decay interaction V.
5
shown by a broken-line arrow in the figure
Before calculating the decay law (13.9) we will need to do some preparatory
work first. In subsections 13.1.2 - 13.1.4 we are going to construct two
useful bases. One is the basis |p⟩ of eigenvectors of the total momentum
operator P0 in Ha. Another one is the basis |p, m⟩ of common eigenvectors
of P0 and the interacting mass operator M = √(H² − P0²c²)/c² in H.

13.1.2 Non-interacting representation of the Poincaré group

Let us first consider a simple case when the interaction responsible for the
decay is “turned off,” e.g., by setting G(p, q) = 0 in (13.7). This means that
dynamics of the system is governed by the non-interacting representation of the
Poincaré group Ug0 in H.6 This representation is constructed in accordance
with the structure of the Hilbert space (13.3) as
\[
U_g^0 \equiv U_g^a \oplus (U_g^b \otimes U_g^c) \tag{13.11}
\]
where Uga, Ugb, and Ugc are unitary irreducible representations of the Poincaré
group corresponding to the particles a, b, and c, respectively. Generators of
this representation will be denoted by P0, J0, H0, and K0. According to
(13.2), the operator of the non-interacting mass
\[
M_0 = +\frac{1}{c^2}\sqrt{H_0^2 - P_0^2c^2}
\]
has a continuous spectrum in the interval [mb + mc , ∞) and a discrete point
ma embedded in this interval.
From definition (13.11) it is clear that the subspaces Ha and Hbc are
separately invariant with respect to Ug0 . Moreover, the projection operator
T commutes with non-interacting generators
[T, P0 ] = [T, J0 ] = [T, K0 ] = [T, H0 ] = 0 (13.12)
Exactly as we did in subsection 5.1.2, we can use the non-interacting
representation Ug0 to build a basis |p⟩ of eigenvectors of the momentum operator
P0 in the subspace Ha. Then any state |Ψ⟩ ∈ Ha can be represented by a
linear combination of these basis vectors
\[
|\Psi\rangle = \int d\mathbf{p}\,\psi(\mathbf{p})|\mathbf{p}\rangle \tag{13.13}
\]
and the projection operator T can be written as (compare with equation (5.20))
\[
T = \int d\mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}| \tag{13.14}
\]
13.1.3 Normalized eigenvectors of momentum

Improper basis vectors |p⟩ are convenient for writing arbitrary states |Ψ⟩ ∈
Ha as linear combinations (13.13). However, vectors |p⟩ themselves are not
good representatives of quantum states, because they are not normalized.7
For example, the momentum space “wave function” of the basis vector |q⟩ is
a delta function
\[
\psi_{\mathbf{q}}(\mathbf{p}) = \langle\mathbf{p}|\mathbf{q}\rangle = \delta(\mathbf{p}-\mathbf{q}) \tag{13.15}
\]
and the corresponding “probability” of finding the particle is infinite
\[
\int d\mathbf{p}\,|\psi_{\mathbf{q}}(\mathbf{p})|^2 = \int d\mathbf{p}\,|\delta(\mathbf{p}-\mathbf{q})|^2 = \infty
\]
Therefore, improper states like |q⟩ cannot be used in formula (13.4) for
calculating the decay law of states with definite (or almost definite) momentum p0.
For such calculations we should use other (proper) state vectors |p0) whose
normalized momentum-space wave functions are sharply localized near p0.
In order to satisfy the normalization condition
\[
\int d\mathbf{p}\,|\psi(\mathbf{p})|^2 = 1
\]
the wave function of |p0) may be formally represented as a square root of
the Dirac delta function8
\[
\psi_{\mathbf{p}_0}(\mathbf{p}) = \sqrt{\delta(\mathbf{p}-\mathbf{p}_0)} \tag{13.16}
\]
13.1.4 Interacting representation of the Poincaré group

In order to study dynamics of the unstable system we need to define an
interacting unitary representation of the Poincaré group in the Hilbert space
H. This representation will allow us to relate results of measurements in
different reference frames. In this section we will take the point of view of
the observer at rest. We will discuss particle decay from the point of view of
a moving observer in sections 13.3 and 13.4.
Let us now “turn on” the interaction (13.7) responsible for the decay and
discuss the interacting representation Ug of the Poincaré group in H with
generators P, J, K, and H. As usual, we prefer to work in the Dirac’s
instant form of dynamics. Then the generators of space translations and
rotations are interaction-free
P = P0
J = J0
while generators of time translations (the Hamiltonian H) and boosts contain
interaction-dependent terms
H = H0 + V
K = K0 + Z
We will further assume that the interacting representation Ug belongs to the
Bakamjian-Thomas form of dynamics9 in which the interacting operator of
mass M commutes with the Newton-Wigner position operator (4.32)
\[
M \equiv c^{-2}\sqrt{H^2 - \mathbf{P}_0^2c^2}
\]
\[
[\mathbf{R}_0, M] = 0 \tag{13.17}
\]
8 Another way to achieve the same goal would be to keep the delta-function
representation (13.15) of definite-momentum states, but use (formally vanishing)
normalization factors, like N = (∫ dp|ψ(p)|²)^(−1/2). Perhaps such manipulations with
infinitely large and infinitely small numbers can be justified within non-standard
analysis [Fri94].
9 For the Bakamjian-Thomas theory see subsection 6.3.2. The possibilities for the decay
interaction to be in other forms of dynamics will be discussed in subsection 13.4.3.
Our next goal is to define the basis of common eigenvectors of commuting
operators P0 and M in H.10 These eigenvectors must satisfy conditions
\[
\mathbf{P}_0|\mathbf{p},m\rangle = \mathbf{p}|\mathbf{p},m\rangle \tag{13.18}
\]
\[
M|\mathbf{p},m\rangle = m|\mathbf{p},m\rangle \tag{13.19}
\]
They are also eigenvectors of the interacting Hamiltonian H = √(M²c⁴ + P0²c²)
\[
H|\mathbf{p},m\rangle = \omega_{\mathbf{p}}|\mathbf{p},m\rangle
\]
where ωp ≡ √(m²c⁴ + c²p²). In the zero-momentum eigensubspace of the
momentum operator P0 we can introduce a basis |0, m⟩ of eigenvectors of
the interacting mass M
\[
\mathbf{P}_0|\mathbf{0},m\rangle = 0, \qquad M|\mathbf{0},m\rangle = m|\mathbf{0},m\rangle
\]
Then the basis |p, m⟩ in the entire Hilbert space H can be built by formula11
\[
|\mathbf{p},m\rangle = \sqrt{\frac{mc^2}{\omega_{\mathbf{p}}}}\; e^{-\frac{ic}{\hbar}\mathbf{K}\vec{\theta}}\,|\mathbf{0},m\rangle
\]
where vector θ is related to momentum by formula p = mc θ θ⁻¹ sinh θ (θ⁻¹ being
the inverse length of the vector θ). These
improper eigenvectors are normalized to delta functions
\[
\langle\mathbf{q},m|\mathbf{p},m'\rangle = \delta(\mathbf{q}-\mathbf{p})\,\delta(m-m') \tag{13.20}
\]
The actions of inertial transformations on these states are found by the same
method as in section 5.1. In particular, for boosts along the x-axis and for
time translations we obtain12
\[
e^{-\frac{ic}{\hbar}K_x\theta}|\mathbf{p},m\rangle = \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\;|\Lambda\mathbf{p},m\rangle \tag{13.21}
\]
\[
e^{\frac{i}{\hbar}Ht}|\mathbf{p},m\rangle = e^{\frac{i}{\hbar}\omega_{\mathbf{p}}t}|\mathbf{p},m\rangle \tag{13.22}
\]
\[
\Lambda\mathbf{p} = \Big(p_x\cosh\theta + \frac{\omega_{\mathbf{p}}}{c}\sinh\theta,\; p_y,\; p_z\Big) \tag{13.23}
\]
10 In addition to these two operators, whose eigenvalues are used for labeling eigenvectors
|p, m⟩, there are other independent operators in the mutually commuting set containing
P0 and M. These are, for example, the operators of the square of the total angular
momentum J0² and the projection of J0 on the z-axis J0z. Therefore a unique
characterization of any basis vector requires specification of all corresponding quantum
numbers as |p, m, j², jz, . . .⟩. However these additional quantum numbers are not relevant
for our discussion, and we omit them.
11 Compare with (5.5) and (5.27). Here K is the interacting boost operator whose explicit
form will not be required in our derivation.
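The momentum transformation (13.23) can be checked numerically. The following sketch is only an illustration in units with c = 1 by default; the helper names energy and boost_x are hypothetical. It verifies that the boosted energy equals ωp cosh θ + c px sinh θ (so the pair (ωp, cp) transforms as a 4-vector with the same mass m) and that two successive boosts along x add their rapidities.

```python
import math


def energy(p, m, c=1.0):
    """omega_p = sqrt(m^2 c^4 + c^2 p^2) for a momentum 3-tuple p."""
    return math.sqrt(m**2 * c**4 + c**2 * sum(x * x for x in p))


def boost_x(p, m, theta, c=1.0):
    """Lambda p from (13.23): boost with rapidity theta along the x-axis."""
    px, py, pz = p
    return (px * math.cosh(theta) + energy(p, m, c) / c * math.sinh(theta), py, pz)


if __name__ == "__main__":
    m, p, theta = 1.5, (0.3, -0.2, 0.7), 0.8
    boosted = boost_x(p, m, theta)
    print(boosted, energy(boosted, m))
    # normalization factor appearing in (13.21)
    print(math.sqrt(energy(boosted, m) / energy(p, m)))
```

For a particle initially at rest the sketch returns px = mc sinh θ, in agreement with the relation between θ and p stated after the formula for |p, m⟩.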
Next we notice that due to equations (4.25) and (13.17) vectors e^(iR0·p/ħ)|0, m⟩
also satisfy eigenvector equations (13.18) - (13.19), so they must be
proportional to the basis vectors |p, m⟩
\[
e^{\frac{i}{\hbar}\mathbf{R}_0\cdot\mathbf{p}}|\mathbf{0},m\rangle = \gamma(\mathbf{p},m)|\mathbf{p},m\rangle \tag{13.24}
\]
where γ(p, m) is a unimodular factor. Unlike in (5.36), we cannot conclude
that γ(p, m) = 1. However, if the interaction is not pathological we can
assume that the factor γ(p, m) is smooth, i.e., without rapid oscillations.
Obviously, vector |0⟩ from the basis (5.36) can be expressed as a linear
combination of zero-momentum basis vectors |0, m⟩, so we can write13
\[
|\mathbf{0}\rangle = \int_{m_b+m_c}^{\infty} dm\,\mu(m)|\mathbf{0},m\rangle \tag{13.25}
\]
where µ(m) is a yet unknown function, which depends on the choice of the
interaction Hamiltonian V and satisfies equations
\[
\mu(m) = \langle\mathbf{0},m|\mathbf{0}\rangle \tag{13.26}
\]
\[
\int_{m_b+m_c}^{\infty} dm\,|\mu(m)|^2 = 1
\]
13 Here we assume that the interaction responsible for the decay does not change the
spectrum of mass. In particular, we will neglect the possibility of existence of bound states of
particles b and c, i.e., discrete eigenvalues of M below mb + mc. Then the spectrum of M
(similar to the spectrum of M0) is continuous in the interval [mb + mc, ∞), and integration
in (13.25) should be performed from mb + mc to infinity. Note that this assumption does
not hold in the example considered in subsection 13.2.2, where one discrete eigenvalue of
the mass operator M appears below the threshold mb + mc, as shown in Figs. 13.2 and
13.3.
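The physical meaning of µ(m) is the probability amplitude for finding the
value m of the interacting mass M in the initial unstable state |0⟩ ∈ Ha.
It will be referred to as the mass distribution of the unstable particle. We
now use equations (5.36) and (13.25) to expand vectors |p⟩ ∈ Ha in the basis
|p, m⟩
\[
\begin{aligned}
|\mathbf{p}\rangle &= e^{\frac{i}{\hbar}\mathbf{R}_0\cdot\mathbf{p}}|\mathbf{0}\rangle
  = e^{\frac{i}{\hbar}\mathbf{R}_0\cdot\mathbf{p}}\int_{m_b+m_c}^{\infty} dm\,\mu(m)|\mathbf{0},m\rangle \\
&= \int_{m_b+m_c}^{\infty} dm\,\mu(m)\gamma(\mathbf{p},m)|\mathbf{p},m\rangle
\end{aligned}
\tag{13.27}
\]
Then any state vector from the subspace Ha can be written as
\[
|\Psi\rangle = \int d\mathbf{p}\,\psi(\mathbf{p})|\mathbf{p}\rangle \tag{13.28}
\]
\[
\phantom{|\Psi\rangle} = \int d\mathbf{p}\int_{m_b+m_c}^{\infty} dm\,\mu(m)\gamma(\mathbf{p},m)\psi(\mathbf{p})|\mathbf{p},m\rangle \tag{13.29}
\]
From (13.20) we also obtain a useful formula
\[
\begin{aligned}
\langle\mathbf{q}|\mathbf{p},m\rangle &= \int_{m_b+m_c}^{\infty} dm'\,\mu^*(m')\gamma^*(\mathbf{q},m')\langle\mathbf{q},m'|\mathbf{p},m\rangle \\
&= \gamma^*(\mathbf{p},m)\mu^*(m)\delta(\mathbf{q}-\mathbf{p})
\end{aligned}
\tag{13.30}
\]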
13.1.5 Decay law

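We are now fully equipped to find the time evolution of the state vector
(13.28) prepared within the subspace Ha at time t = 0. We apply equations
(13.8), (13.22), and (13.27)
\[
\begin{aligned}
|\Psi(t)\rangle &= \int d\mathbf{p}\,\psi(\mathbf{p})\,e^{-\frac{i}{\hbar}Ht}|\mathbf{p}\rangle \\
&= \int d\mathbf{p}\,\psi(\mathbf{p})\int_{m_b+m_c}^{\infty} dm\,\mu(m)\gamma(\mathbf{p},m)\,e^{-\frac{i}{\hbar}Ht}|\mathbf{p},m\rangle \\
&= \int d\mathbf{p}\,\psi(\mathbf{p})\int_{m_b+m_c}^{\infty} dm\,\mu(m)\gamma(\mathbf{p},m)\,e^{-\frac{i}{\hbar}\omega_{\mathbf{p}}t}|\mathbf{p},m\rangle
\end{aligned}
\]
The inner product of this vector with |q⟩ is found by using (13.30)
\[
\begin{aligned}
\langle\mathbf{q}|\Psi(t)\rangle
&= \int d\mathbf{p}\,\psi(\mathbf{p})\int_{m_b+m_c}^{\infty} dm\,\mu(m)\gamma(\mathbf{p},m)\,e^{-\frac{i}{\hbar}\omega_{\mathbf{p}}t}\langle\mathbf{q}|\mathbf{p},m\rangle \\
&= \int d\mathbf{p}\,\psi(\mathbf{p})\int_{m_b+m_c}^{\infty} dm\,|\mu(m)|^2\gamma(\mathbf{p},m)\gamma^*(\mathbf{p},m)\,e^{-\frac{i}{\hbar}\omega_{\mathbf{p}}t}\delta(\mathbf{q}-\mathbf{p}) \\
&= \psi(\mathbf{q})\int_{m_b+m_c}^{\infty} dm\,|\mu(m)|^2 e^{-\frac{i}{\hbar}\omega_{\mathbf{q}}t}
\end{aligned}
\tag{13.31}
\]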
The decay law is then obtained by substituting (13.14) in equation (13.9)

and using (13.31)
Z Z
ω(t) = dqhΨ(t)|qihq|Ψ(t)i = dq|hq|Ψ(t)i |2
∞ 2
Z Z

2

2 − ~i ωq t
= dq|ψ(q)| dm|µ(m)| e (13.32)

mb +mc
13.2. BREIT-WIGNER FORMULA 425
This formula is valid for the decay law of any state $|\Psi\rangle \in H_a$. In the particular
case of the normalized state |0) whose wave function ψ(q) is well-localized
in the momentum space near zero momentum, we can set approximately
$$ \psi(\mathbf{q}) \approx \sqrt{\delta(\mathbf{q})} $$
$$ \int d\mathbf{q}\, |\psi(\mathbf{q})|^2 \approx \int d\mathbf{q}\, \delta(\mathbf{q}) = 1 $$
and14
$$ \omega_{|\mathbf{0})}(t) \approx \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} m c^2 t} \right|^2 \tag{13.33} $$
This result demonstrates that the decay law is fully determined by the mass
distribution |µ(m)|2. In the next section we will consider an exactly solv-
able decay model in which the mass distribution and the decay law can be
calculated explicitly.
13.2 Breit-Wigner formula

13.2.1 Schrödinger equation
In this section we are discussing decay of a particle at rest. Therefore, it
is sufficient to consider the subspace H0 ⊆ H of states having zero total
momentum. The subspace H0 can be further decomposed into the direct
sum
H0 = Ha0 ⊕ H(bc)0
where15
Ha0 = H0 ∩ Ha
H(bc)0 = H0 ∩ (Hb ⊗ Hc )
14
compare, for example, with equation (3.8) in [FGR78]
15
Recall that symbol ∩ denotes intersection of two subspaces in the Hilbert space.
Ha0 is, of course, the one-dimensional subspace spanned by the zero-momentum
vector $|\mathbf{0}\rangle$ of the particle a. In the subspace H(bc)0 of decay products the total
momentum is zero: P = pb + pc = 0. Then the 2-particle basis states $|\vec{\rho}\rangle$ there
can be labeled by eigenvalues of the relative momentum operator
$$ \vec{\rho} = \mathbf{p}_b = -\mathbf{p}_c $$
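and each state $|\Psi\rangle$ in H0 can be written as an expansion in the above basis
($|\mathbf{0}\rangle$, $|\vec{\rho}\rangle$)
$$ |\Psi\rangle = \xi |\mathbf{0}\rangle + \int d\vec{\rho}\, \zeta(\vec{\rho}) |\vec{\rho}\rangle $$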
The coefficients of this expansion can be represented as an infinite column
vector
$$ |\Psi\rangle = \begin{pmatrix} \xi \\ \zeta(\vec{\rho}_1) \\ \zeta(\vec{\rho}_2) \\ \zeta(\vec{\rho}_3) \\ \vdots \end{pmatrix} $$
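whose first component is a complex number $\xi \equiv \langle \mathbf{0} | \Psi\rangle$. All other components
are values of the complex function $\zeta(\vec{\rho})$ at different momenta $\vec{\rho}$.^16 For brevity,
we will use the following notation
$$ |\Psi\rangle = \begin{pmatrix} \xi \\ \zeta(\vec{\rho}) \end{pmatrix} \tag{13.34} $$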
The vector $|\Psi\rangle$ should be normalized, hence its wave function satisfies the
normalization condition
$$ |\xi|^2 + \int d\vec{\rho}\, |\zeta(\vec{\rho})|^2 = 1 \tag{13.35} $$
16
These are projections of $|\Psi\rangle$ on the relative momentum eigenvectors $|\vec{\rho}\rangle$. Of course,
the spectrum of $\vec{\rho}$ is continuous and, strictly speaking, cannot be represented by a set
of discrete values $\vec{\rho}_i$. However, we can justify our approximation by the usual trick of
placing the system in a finite box (then the momentum spectrum becomes discrete) and
then taking the limit, in which the size of the box goes to infinity.
The probability of finding the unstable particle in the state $|\Psi\rangle$ is
$$ \omega = |\xi|^2 $$
and in the initial state
$$ |\mathbf{0}\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{13.36} $$
the unstable particle is found with 100% probability.

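We can now find representations of various operators in the basis ($|\mathbf{0}\rangle$, $|\vec{\rho}\rangle$).
The matrix of the free Hamiltonian is diagonal
$$ H_0 = \begin{pmatrix}
m_a c^2 & 0 & 0 & 0 & \ldots \\
0 & \eta_{\vec{\rho}_1} c^2 & 0 & 0 & \ldots \\
0 & 0 & \eta_{\vec{\rho}_2} c^2 & 0 & \ldots \\
0 & 0 & 0 & \eta_{\vec{\rho}_3} c^2 & \ldots \\
0 & 0 & 0 & 0 & \ldots
\end{pmatrix}
\equiv \begin{pmatrix} m_a c^2 & 0 \\ 0 & \eta_{\vec{\rho}} c^2 \end{pmatrix} $$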
where
$$ \eta_{\vec{\rho}} = \frac{1}{c^2} \left( \sqrt{m_b^2 c^4 + c^2 \rho^2} + \sqrt{m_c^2 c^4 + c^2 \rho^2} \right) \tag{13.37} $$
is the mass of the two-particle (b + c) system expressed as a function of the

relative momentum. In the subspace H0 interaction operator (13.7) takes
the form
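$$ \begin{aligned}
V &= \int d\vec{\rho} \left( G(\vec{\rho}, -\vec{\rho})\, a_0^{\dagger} b_{\vec{\rho}} c_{-\vec{\rho}} + G^*(\vec{\rho}, -\vec{\rho})\, b_{\vec{\rho}}^{\dagger} c_{-\vec{\rho}}^{\dagger} a_0 \right) \\
&\equiv \int d\vec{\rho} \left( g(\vec{\rho})\, a_0^{\dagger} b_{\vec{\rho}} c_{-\vec{\rho}} + g^*(\vec{\rho})\, b_{\vec{\rho}}^{\dagger} c_{-\vec{\rho}}^{\dagger} a_0 \right)
\end{aligned} $$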
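Its matrix representation is^17
$$ V = \begin{pmatrix}
0 & g(\vec{\rho}_1) & g(\vec{\rho}_2) & g(\vec{\rho}_3) & \ldots \\
g^*(\vec{\rho}_1) & 0 & 0 & 0 & \ldots \\
g^*(\vec{\rho}_2) & 0 & 0 & 0 & \ldots \\
g^*(\vec{\rho}_3) & 0 & 0 & 0 & \ldots \\
\ldots & 0 & 0 & 0 & \ldots
\end{pmatrix}
\equiv \begin{pmatrix} 0 & \int d\mathbf{q}\, g(\mathbf{q}) \ldots \\ g^*(\vec{\rho}) & 0 \end{pmatrix} $$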
where $g(\vec{\rho})$ is the matrix element of the interaction operator between states
$|\mathbf{0}\rangle$ and $|\vec{\rho}\rangle$
$$ g(\vec{\rho}) = \langle \mathbf{0} | V | \vec{\rho}\rangle \tag{13.38} $$
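Then the action of the full Hamiltonian H = H0 + V on vectors (13.34) is
$$ \begin{aligned}
H \begin{pmatrix} \xi \\ \zeta(\vec{\rho}) \end{pmatrix}
&= \begin{pmatrix}
m_a c^2 & g(\vec{\rho}_1) & g(\vec{\rho}_2) & g(\vec{\rho}_3) & \ldots \\
g^*(\vec{\rho}_1) & \eta_{\vec{\rho}_1} c^2 & 0 & 0 & \ldots \\
g^*(\vec{\rho}_2) & 0 & \eta_{\vec{\rho}_2} c^2 & 0 & \ldots \\
g^*(\vec{\rho}_3) & 0 & 0 & \eta_{\vec{\rho}_3} c^2 & \ldots \\
\ldots & 0 & 0 & 0 & \ldots
\end{pmatrix}
\begin{pmatrix} \xi \\ \zeta(\vec{\rho}_1) \\ \zeta(\vec{\rho}_2) \\ \zeta(\vec{\rho}_3) \\ \ldots \end{pmatrix} \\
&= \begin{pmatrix}
m_a c^2 \xi + \int d\mathbf{q}\, g(\mathbf{q}) \zeta(\mathbf{q}) \\
g^*(\vec{\rho}_1) \xi + \eta_{\vec{\rho}_1} \zeta(\vec{\rho}_1) c^2 \\
g^*(\vec{\rho}_2) \xi + \eta_{\vec{\rho}_2} \zeta(\vec{\rho}_2) c^2 \\
g^*(\vec{\rho}_3) \xi + \eta_{\vec{\rho}_3} \zeta(\vec{\rho}_3) c^2 \\
\ldots
\end{pmatrix}
\equiv \begin{pmatrix}
m_a c^2 \xi + \int d\mathbf{q}\, g(\mathbf{q}) \zeta(\mathbf{q}) \\
g^*(\vec{\rho}) \xi + \eta_{\vec{\rho}} \zeta(\vec{\rho}) c^2
\end{pmatrix}
\end{aligned} $$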
The next step is to find eigenvalues (which we denote $mc^2$) and eigenvectors
$$ |\mathbf{0}, m\rangle \equiv \begin{pmatrix} \mu^*(m) \\ \zeta_m(\vec{\rho}) \end{pmatrix} \tag{13.39} $$
17
Here symbol $\int d\mathbf{q}\, g(\mathbf{q}) \ldots$ denotes a linear operator, which produces a number
$\int d\mathbf{q}\, g(\mathbf{q}) \zeta(\mathbf{q})$ when acting on an arbitrary test function ζ(q). The function $g(\vec{\rho})$
coincides with G(p, q) on the zero-momentum subspace: $g(\vec{\rho}) \equiv G(\vec{\rho}, -\vec{\rho})$.
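of the Hamiltonian H.^18 This task is equivalent to the solution of the following
system of linear equations:
$$ m_a c^2 \mu^*(m) + \int d\mathbf{q}\, g(\mathbf{q}) \zeta_m(\mathbf{q}) = m c^2 \mu^*(m) \tag{13.40} $$
$$ g^*(\vec{\rho}) \mu^*(m) + \eta_{\vec{\rho}} c^2 \zeta_m(\vec{\rho}) = m c^2 \zeta_m(\vec{\rho}) \tag{13.41} $$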
From equation (13.41) we obtain
$$ \zeta_m(\vec{\rho}) = \frac{g^*(\vec{\rho}) \mu^*(m)}{m c^2 - \eta_{\vec{\rho}} c^2} \tag{13.42} $$
Substituting this result into (13.40), we get a non-linear equation determining
the spectrum of eigenvalues m
$$ m - m_a = \frac{1}{c^4} \int d\mathbf{q}\, \frac{|g(\mathbf{q})|^2}{m - \eta_{\mathbf{q}}} \tag{13.43} $$
To comply with the law of conservation of the angular momentum,19 the func-
tion |g(q)| must depend only on the absolute value q ≡ |q| of its argument.
Therefore, we can rewrite equation (13.43) in the form
$$ m - m_a = F(m) \tag{13.44} $$
where
$$ F(m) \equiv \int_0^{\infty} dq\, \frac{G(q)}{m - \eta_q} \tag{13.45} $$
$$ G(q) \equiv \frac{4\pi q^2}{c^4} |g(q)|^2 \tag{13.46} $$
From the normalization condition (13.35)
18
Note that the function µ(m) in (13.39) is the same as in (13.26). So, in order to
calculate the decay law (13.33), all we need to know is |µ(m)|2 .
19
for a spinless particle a
$$ |\mu(m)|^2 + \int d\mathbf{q}\, |\zeta_m(\mathbf{q})|^2 = 1 \tag{13.47} $$
and equation (13.42) we finally obtain a formula for the mass distribution
$$ |\mu(m)|^2 \left( 1 + \int_0^{\infty} dq\, \frac{G(q)}{(m - \eta_q)^2} \right) = 1 $$
$$ |\mu(m)|^2 = \frac{1}{1 - F'(m)} \tag{13.48} $$
where F ′ (m) is the derivative of F (m). So, in order to calculate the decay
law (13.33) we just need to know the derivative of F (m) at points m of the
spectrum of the interacting mass operator.20 The following subsection will
detail such a calculation.
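The step from (13.47) to (13.48) is compact, so it may help to spell it out. The following is a short worked derivation that uses only (13.42), (13.45), and (13.46), together with the spherical symmetry of |g(q)| assumed above; nothing new is introduced.

```latex
\int d\mathbf{q}\, |\zeta_m(\mathbf{q})|^2
  = |\mu(m)|^2 \int d\mathbf{q}\, \frac{|g(\mathbf{q})|^2}{c^4 (m - \eta_{\mathbf{q}})^2}
  = |\mu(m)|^2 \int_0^{\infty} dq\, \frac{4\pi q^2 |g(q)|^2}{c^4 (m - \eta_q)^2}
  = |\mu(m)|^2 \int_0^{\infty} dq\, \frac{G(q)}{(m - \eta_q)^2},
\qquad
F'(m) = \frac{d}{dm} \int_0^{\infty} dq\, \frac{G(q)}{m - \eta_q}
      = - \int_0^{\infty} dq\, \frac{G(q)}{(m - \eta_q)^2}.
```

Inserting the first line into (13.47) gives $|\mu(m)|^2 (1 - F'(m)) = 1$, which is (13.48).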
13.2.2 Finding function µ(m)

Function ηρ~ in (13.37) expresses dependence of the total mass of the two
decay products on their relative momentum. This function has minimum
value η0 = mb + mc at ρ~ = 0 and grows to infinity with increasing ρ. Then
the solution of equation (13.44) for values of m in the interval [−∞, mb + mc ]
is rather straightforward. In this region the denominator in the integrand
of (13.45) does not vanish, and F (m) is a well-defined continuous function
which tends to zero at m = −∞ and decreases monotonically as m grows. A
graphical solution of equation (13.44) in the interval [−∞, mb + mc ] can be
obtained as an intersection of the line m − ma and the graph of the function
F (m) (point M0 in Fig. 13.2). The corresponding value m = M0 < mb +mc is
a discrete eigenvalue of the interacting mass operator and the corresponding
eigenstate is a superposition of the unstable particle a and its decay products
b + c.
Finding the spectrum of the interacting mass in the region [mb + mc , ∞]
is trickier due to a singularity in the integrand of (13.45). Let us first
discuss our approach qualitatively, using graphical representation in Fig. 13.3
and the discrete approximation for the spectrum of ρ~. Then equation (13.44)
takes the form
20
These points are solutions of equation (13.43).
Figure 13.2: A graphical solution of equation (13.44) for m < mb + mc .
The thick dashed line indicates the continuous part of the spectrum of the
non-interacting mass operator. The thick full line shows function F (m) at
m < mb + mc . Axes: F (m) versus m, with the points M0 , mb + mc , and ma
marked on the m axis.
Figure 13.3: Spectra of the free (open circles, H0 ) and interacting (full
circles, H = H0 + V ) Hamiltonians. Axes: F (m) versus m, with the free masses
m1 , ..., m6 , the interacting masses M0 , ..., M5 , and ma marked on the m axis.
$$ m - m_a = \frac{1}{c^2} \sum_{i=1}^{\infty} \frac{|g(\vec{\rho}_i)|^2}{m - m_i} = F(m) \tag{13.49} $$
where mi ≡ ηρ~i are eigenvalues of the non-interacting mass of the 2-particle

system b + c and the lowest eigenvalue is m1 = mb + mc . The function
on the right hand side of equation (13.49) is a superposition of functions
c−2 |g(~ρi )|2 (m − mi )−1 for all values of i = 1, 2, 3, . . .. These functions have
singularities at points mi . Positions of these singularities are shown as open
circles and dashed vertical lines in Fig. 13.3. The overall shape of the function
F (m) in this approximation is shown by the thick full line. According to
equation (13.49), the spectrum of the interacting mass operator can be found
at points where the line m−ma intersects with the graph F (m). These points
Mi are shown by full circles in Fig. 13.3. So, the derivatives required in
equation (13.48) are graphically represented as slopes of the function F (m)
at points M1 , M2 , M3 , . . .. The difficulty is that in the continuous limit the
distances between points mi tend to zero, function F (m) wildly oscillates,
and its derivative tends to infinity everywhere.
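The discrete picture of equation (13.49) and Fig. 13.3 can be tried out numerically. The sketch below is only an illustration under stated assumptions: the level spacing, the number of levels, and the constant coupling weights `w_i` (standing for the coefficients c^-2 |g(ρ_i)|^2) are hypothetical numbers, not taken from the text, and the helper names are made up for this example. It finds one interacting mass between each pair of neighboring free masses (plus one below m1 and one above the last level) and evaluates |µ|^2 from (13.48) at each of them.

```python
# Discrete model of equation (13.49): m - ma = sum_i w_i / (m - m_i),
# where w_i stands for c^-2 |g(rho_i)|^2. All numbers are illustrative.

def make_model(ma=1000.0, m1=900.0, spacing=5.0, levels=40, weight=4.0):
    """Return (ma, free two-particle masses m_i, coupling weights w_i)."""
    masses = [m1 + spacing * i for i in range(levels)]
    weights = [weight] * levels
    return ma, masses, weights

def h(m, ma, masses, weights):
    """h(m) = m - ma - F(m); its zeros are the interacting masses M_i."""
    return m - ma - sum(w / (m - mi) for mi, w in zip(masses, weights))

def find_zero(lo, hi, ma, masses, weights, steps=200):
    """Bisection; h grows monotonically between two neighboring poles."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if h(mid, ma, masses, weights) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def interacting_spectrum(ma, masses, weights):
    """One zero below m_1, one between each pair of poles, one above the last."""
    reach = sum(weights) + 1.0              # safe distance for the outer brackets
    eps = 1e-9 * (masses[1] - masses[0])    # keep brackets off the poles
    edges = [min(ma, masses[0]) - reach] + masses + [max(ma, masses[-1]) + reach]
    return [find_zero(left + eps, right - eps, ma, masses, weights)
            for left, right in zip(edges[:-1], edges[1:])]

def mu_squared(m, masses, weights):
    """Equation (13.48) with F'(m) = -sum_i w_i / (m - m_i)^2."""
    return 1.0 / (1.0 + sum(w / (m - mi) ** 2 for mi, w in zip(masses, weights)))

MODEL = make_model()
ROOTS = interacting_spectrum(*MODEL)
PROBS = [mu_squared(M, MODEL[1], MODEL[2]) for M in ROOTS]

print("number of interacting levels:", len(ROOTS))
print("lowest level M0:", ROOTS[0])
print("sum of |mu|^2 over all levels:", sum(PROBS))
```

In a finite model of this kind the values |µ(M_i)|^2 are the squared overlaps of the state |0⟩ with a complete orthonormal set of eigenvectors, so their sum should come out equal to 1 up to rounding. The individual values shrink as the levels are packed more densely, which is the behavior discussed in the text for the continuous limit.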
To overcome this difficulty we will use the following idea [Liv47]. Let us
first change the integration variable in (13.45)
$$ z = \eta_{\rho} $$
so that inverse function
$$ \rho = \eta^{-1}(z) $$
expresses the relative momentum ρ as a function of the total mass z of the
decay products. Then denoting
$$ \Gamma(z) \equiv 2\pi \frac{d\eta^{-1}(z)}{dz} G(\eta^{-1}(z)) \tag{13.50} $$
we obtain
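$$ \begin{aligned}
F(m) &= \int_{m_b+m_c}^{\infty} dz\, \frac{\Gamma(z)}{2\pi(m - z)}
= \int_{m_b+m_c}^{m-\Delta} dz\, \frac{\Gamma(z)}{2\pi(m - z)}
+ \int_{m-\Delta}^{m+\Delta} dz\, \frac{\Gamma(z)}{2\pi(m - z)} \\
&\quad + \int_{m+\Delta}^{\infty} dz\, \frac{\Gamma(z)}{2\pi(m - z)}
\end{aligned} \tag{13.51} $$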
Here we split the integration interval [mb + mc , +∞) into three segments.
When ∆ → 0, the first and third terms on the right hand side of (13.51) give
the principal value integral (denoted by $P\int$)
$$ \int_{m_b+m_c}^{m-\Delta} dz\, \frac{\Gamma(z)}{2\pi(m - z)} + \int_{m+\Delta}^{\infty} dz\, \frac{\Gamma(z)}{2\pi(m - z)}
\longrightarrow P\!\!\int_{m_b+m_c}^{\infty} dz\, \frac{\Gamma(z)}{2\pi(m - z)} \equiv \mathcal{P}(m) \tag{13.52} $$
Let us now look more closely at the second integral on the right hand side
of (13.51). If ∆ is sufficiently small,21 then function Γ(z) may be considered
constant Γ(z) = Γ(m). Moreover, in our discrete approximation, the density
of points mj in the interval [m − ∆, m + ∆] is almost constant. So, this
interval can be divided into 2N small equal segments
$$ m_j = m_0 + j \frac{\Delta}{N} $$
where m0 = m, integer j runs from −N to N, and the integral is represented
by a partial sum
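$$ \int_{m-\Delta}^{m+\Delta} dz\, \frac{\Gamma(z)}{2\pi(m - z)}
\approx \frac{\Gamma(m_0)}{2\pi} \int_{m-\Delta}^{m+\Delta} dz\, \frac{1}{m - z}
\approx \frac{\Gamma(m_0)}{2\pi} \sum_{j=-N}^{N} \frac{\Delta/N}{m - m_0 - j\frac{\Delta}{N}} \tag{13.53} $$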
21
but still much larger than the distance between adjacent points mj and mj+1
Next we assume that N → ∞ and index j runs from −∞ to ∞. Then the

right hand side of equation (13.53) defines an analytical function with poles
at points
$$ m_j = m_0 + j \frac{\Delta}{N} \tag{13.54} $$
and with residues Γ(m0 )∆/(2πN). As any analytical function is uniquely
determined by the positions of its poles and the values of its residues, we
conclude that integral (13.53) has the following representation
$$ \int_{m-\Delta}^{m+\Delta} dz\, \frac{\Gamma(z)}{2\pi(m - z)} \approx \frac{\Gamma(m_0)}{2} \cot\left( \frac{\pi N}{\Delta} (m - m_0) \right) \tag{13.55} $$
Indeed, the cot function on the right hand side of (13.55) also has poles at
points (13.54). The residues of this function are exactly as required too.22
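The replacement of the sum (13.53) by the cotangent (13.55) can also be checked numerically. In terms of the dimensionless variable x = N(m − m0)/∆ the statement reduces to the classical expansion of π cot(πx) as a symmetric sum of simple poles 1/(x − j). The sketch below compares the two sides; the function names and the sample points are chosen for this illustration only.

```python
import math

def pole_sum(x, terms):
    """Symmetric partial sum of 1/(x - j) over j = -terms, ..., terms."""
    total = 1.0 / x
    for j in range(1, terms + 1):
        total += 1.0 / (x - j) + 1.0 / (x + j)
    return total

def cot_form(x):
    """pi * cot(pi * x), the closed form behind equation (13.55)."""
    return math.pi / math.tan(math.pi * x)

# x = N (m - m0) / Delta measures the distance from m0 in units of the level spacing
for x in (0.3, -0.45, 2.7):
    print(x, pole_sum(x, 200000), cot_form(x))
```

Multiplying both columns by Γ(m0)/(2π) reproduces the right hand sides of (13.53) and (13.55), respectively. The agreement improves slowly (roughly as the inverse number of terms), which reflects the fact that the sum converges only when the terms with +j and −j are added in pairs.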
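Now we can put equations (13.52) and (13.55) together and write
$$ F(m) = \mathcal{P}(m) + \frac{\Gamma(m)}{2} \cot\frac{\pi N m}{\Delta} $$
Then, using
$$ \left( \cot(ax) \right)' = -a \left( 1 + \cot^2(ax) \right) $$
and ignoring derivatives of smooth functions P(m) and Γ(m) we obtain
$$ F'(m) = -\frac{\pi \Gamma(m) N}{2\Delta} \left( 1 + \cot^2(\pi N \Delta^{-1} m) \right) \tag{13.56} $$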
22
For example, near the point m0 (where j = 0) the right hand side of (13.55) can be
approximated as
$$ \frac{\Gamma(m_0)}{2} \cot\left( \frac{\pi N}{\Delta} (m - m_0) \right) \approx \frac{\Gamma(m_0)\Delta/N}{2\pi(m - m_0)} $$
which agrees with (13.53).

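For formula (13.48) we need values of F ′ (m) at the discrete set of solutions
of the equation
$$ F(m) = m - m_a $$
At these points we can write
$$ m - m_a = \mathcal{P}(m) + \frac{\Gamma(m)}{2} \cot(\pi N \Delta^{-1} m) $$
$$ \cot(\pi N \Delta^{-1} m) = \frac{2(m - m_a - \mathcal{P}(m))}{\Gamma(m)} $$
$$ \cot^2(\pi N \Delta^{-1} m) = \frac{4(m - m_a - \mathcal{P}(m))^2}{\Gamma^2(m)} $$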
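Substituting this into (13.56) and (13.48) we obtain the desired result
$$ F'(m) = -\pi \frac{\Gamma(m) N}{2\Delta} \left( 1 + \frac{4(m - m_a - \mathcal{P}(m))^2}{\Gamma^2(m)} \right) $$
$$ |\mu(m)|^2 = \frac{1}{1 + \dfrac{\pi \Gamma(m) N}{2\Delta} \left( 1 + \dfrac{4(m_a + \mathcal{P}(m) - m)^2}{\Gamma^2(m)} \right)} \tag{13.57} $$
$$ \approx \frac{\Gamma(m)\Delta/(2\pi N)}{\Gamma^2(m)/4 + (m_a + \mathcal{P}(m) - m)^2} \tag{13.58} $$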
where we neglected the unity in the denominator of (13.57) as compared to
the large factor ∝ N∆−1 . Formula (13.58) gives the probability for finding
particle a at each point of the discrete spectrum M1 , M2 , M3 , . . .. This proba-
bility naturally tends to zero as the density of points N∆−1 tends to infinity.
However, when approaching the continuous spectrum in the limit N → ∞ we
do not need the probability at each spectrum point. We, actually, need the
probability density which can be obtained by multiplying the right hand side
of equation (13.58) by the number of points per unit interval N∆−1 . Then
the mass distribution for the unstable particle takes the famous Breit-Wigner
form
$$ |\mu(m)|^2 = \frac{\Gamma(m)/(2\pi)}{\Gamma^2(m)/4 + (m_a + \mathcal{P}(m) - m)^2} \tag{13.59} $$
Figure 13.4: Mass distribution of a typical unstable particle. Axes: |µ(m)|2
versus m, with mb + mc and mA marked on the m axis.
This resonance mass distribution describes an unstable particle with the ex-
pectation value of mass23 mA = ma + P(mA ) and the width of ∆m ≈ Γ(mA )
(see Fig. 13.4).
For unstable systems whose decays are slow enough to be observed in
time-resolved experiments, the resonance shown in Fig. 13.4 is very narrow,
so that instead of functions Γ(m) and P(m) we can use their values (con-
stants) at m = mA : Γ ≡ Γ(mA ) and P ≡ P(mA ). Moreover, we will assume
that the instability of the particle a does not have a large effect on its mass,
i.e., that P ≪ ma and mA ≈ ma .24 We also neglect a small contribution
from the isolated point M0 of the mass spectrum discussed in the beginning
of this subsection. Then
$$ |\mu(m)|^2 \approx \frac{\Gamma/(2\pi)}{\Gamma^2/4 + (m_a - m)^2} \tag{13.60} $$
13.2.3 Exponential decay law

To complete our discussion of the unstable system at rest we are now going
to calculate its decay law and the time-dependent wave function. For the
initial state vector at t = 0 we choose state (13.36) of the particle a. Its time
dependence is described by the time evolution operator
23
the center of the resonance
24
This assumption is not trivial, as will be seen from subsections 14.1.5 and 14.2.3.
$$ |\mathbf{0}, t\rangle = e^{-\frac{i}{\hbar}Ht} |\mathbf{0}\rangle $$
To evaluate this expression it is convenient to represent |0i as an expansion

(13.25) in the basis of eigenvectors of the Hamiltonian H. Then, using (13.39)
and (13.42) we obtain
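$$ \begin{aligned}
e^{-\frac{i}{\hbar}Ht} |\mathbf{0}\rangle
&= \int_{m_b+m_c}^{\infty} dm\, \mu(m) e^{-\frac{i}{\hbar}Ht} |\mathbf{0}, m\rangle
= \int_{m_b+m_c}^{\infty} dm\, \mu(m) e^{-\frac{i}{\hbar} m c^2 t} |\mathbf{0}, m\rangle \\
&= \int_{m_b+m_c}^{\infty} dm\, e^{-\frac{i}{\hbar} m c^2 t} |\mu(m)|^2
\begin{pmatrix} 1 \\ g^*(\vec{\rho})/(m c^2 - \eta_{\vec{\rho}} c^2) \end{pmatrix}
\equiv \begin{pmatrix} I(t) \\ J(\vec{\rho}, t) \end{pmatrix}
\end{aligned} \tag{13.61} $$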
The first integral I(t) determines the decay law for the particle at rest. Sub-
stituting (13.60) in the integrand, we obtain
$$ \omega(t) = |I(t)|^2 \approx \frac{1}{4\pi^2} \left| \int_{m_b+m_c}^{\infty} dm\, \frac{\Gamma e^{-\frac{i}{\hbar} m c^2 t}}{\Gamma^2/4 + (m_a - m)^2} \right|^2 \tag{13.62} $$
For most unstable systems
Γ ≪ ma − (mb + mc ) (13.63)
so the integrand is well localized around the value m ≈ ma , and we can
introduce further approximation by setting the lower integration limit in
(13.62) to −∞. Then the decay law obtains a familiar exponential form
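$$ \omega(t) \approx \frac{1}{4\pi^2} \left| 2\pi e^{-\frac{i}{\hbar} m_a c^2 t} \exp\left( -\frac{\Gamma c^2 t}{2\hbar} \right) \right|^2
= \exp\left( -\frac{\Gamma c^2 t}{\hbar} \right) = \exp\left( -\frac{t}{\tau_0} \right) \tag{13.64} $$
where
$$ \tau_0 = \frac{\hbar}{\Gamma c^2} \tag{13.65} $$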
is the lifetime of the unstable particle. The nondecay probability decreases
from 1 to 1/e during the lifetime interval.
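The passage from (13.62) to (13.64) can be checked by direct numerical integration. The sketch below is an illustration, not part of the derivation: it uses the dimensionless variables x = (m − ma)/Γ and χ = t/τ0, extends the lower limit to −∞ as in the text, and drops the overall phase factor, which does not affect |I(t)|^2. The cutoff and step of the trapezoidal rule are ad hoc choices, and the 1 MeV width passed to `lifetime_seconds` is an arbitrary sample value used only to show the unit conversion in (13.65).

```python
import math

HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV * s

def survival_amplitude(chi, cutoff=500.0, step=0.01):
    """|I| from (13.62) with the Breit-Wigner form (13.60), lower limit -infinity.

    With x = (m - ma)/Gamma the integrand is exp(-i x chi) / (2 pi (1/4 + x^2)).
    Its sine part is odd and cancels, so only the cosine part is integrated,
    by the trapezoidal rule on [0, cutoff], and then doubled.
    """
    n = int(round(cutoff / step))
    total = 0.0
    for k in range(n + 1):
        x = k * step
        term = math.cos(x * chi) / (0.25 + x * x)
        total += 0.5 * term if k in (0, n) else term
    return 2.0 * total * step / (2.0 * math.pi)

def lifetime_seconds(width_mev):
    """Equation (13.65) for a width Gamma * c^2 given in MeV."""
    return HBAR_MEV_S / width_mev

for chi in (0.5, 1.0, 2.0, 4.0):
    print(chi, survival_amplitude(chi) ** 2, math.exp(-chi))

print("lifetime for a 1 MeV width:", lifetime_seconds(1.0), "s")
```

For χ of order one the two printed columns should agree closely. Near χ = 0 the finite cutoff removes a small part of the Lorentzian tail, so the numerical value falls slightly below 1 there.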
Using formulas (13.38), (13.46) and (13.50) we can also see that the decay
rate Γ25
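$$ \frac{1}{\tau_0} = \frac{\Gamma c^2}{\hbar}
= \frac{2\pi c^2}{\hbar} \left. \frac{d\eta^{-1}(z)}{dz} \right|_{z=m_a} G(\eta^{-1}(m_a))
= \frac{8\pi^2 \rho^2}{c^2 \hbar} \left. \frac{d\eta^{-1}(z)}{dz} \right|_{z=m_a} |\langle \mathbf{0} | V | \vec{\rho}\rangle|^2 \tag{13.66} $$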
is proportional to the square of the matrix element of the perturbation V between
the initial and final states of the system. It is also proportional to the “kinematical”
factor $d\eta^{-1}(z)/dz|_{z=m_a}$, which is fully determined by the three involved
masses ma , mb , and mc .^26
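The kinematical quantities entering (13.66) are easy to evaluate once the three masses are fixed. The sketch below is an illustration in units with c = 1. The closed form used for ρ = η^-1(ma) is the standard two-body decay momentum, which is not written out in the text, and the split of the decay products into mb = 500 and mc = 400 is a hypothetical choice (the text fixes only the sum mb + mc in its later numerical example).

```python
import math

def eta(rho, mb, mc):
    """Equation (13.37) in units with c = 1."""
    return math.sqrt(mb * mb + rho * rho) + math.sqrt(mc * mc + rho * rho)

def decay_momentum(ma, mb, mc):
    """rho = eta^{-1}(ma): the relative momentum at which eta equals ma."""
    s = (ma * ma - (mb + mc) ** 2) * (ma * ma - (mb - mc) ** 2)
    return math.sqrt(s) / (2.0 * ma)

def kinematical_factor(ma, mb, mc):
    """d eta^{-1}(z)/dz at z = ma, which equals 1/(vb + vc) when c = 1."""
    rho = decay_momentum(ma, mb, mc)
    vb = rho / math.sqrt(mb * mb + rho * rho)
    vc = rho / math.sqrt(mc * mc + rho * rho)
    return 1.0 / (vb + vc)

MA, MB, MC = 1000.0, 500.0, 400.0   # illustrative masses, MeV/c^2
RHO0 = decay_momentum(MA, MB, MC)
print("rho0 =", RHO0)
print("eta(rho0) =", eta(RHO0, MB, MC))
print("d eta^-1/dz at ma =", kinematical_factor(MA, MB, MC))
```

The derivative of the inverse function is obtained here as the reciprocal of dη/dρ, and dη/dρ is the sum of the two particle speeds. The same sum vb + vc reappears in subsection 13.2.4 as the rate at which the decay products separate.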
The importance of formulas (13.60), (13.64), and (13.66) is that they were
derived from very general assumptions. We have not used the perturbation
theory. Actually, the only significant approximation was the weakness of the
interaction responsible for the decay, i.e., the narrow width Γ of the resonance
(13.63). This condition is satisfied for all known decays.27 Therefore, the
exponential decay law is expected to be universally valid. This prediction is
confirmed by experiment: so far no deviations from the exponential decay
law (13.64) were observed.
13.2.4 Wave function of decay products

The second integral $J(\vec{\rho}, t)$ in (13.61) describes the wave function of the decay
products b and c. We note that the function |µ(m)|2 has poles at ma − iΓ/2 and
ma + iΓ/2. Then
25
Actually, parameter Γ has the dimension of mass, so the true decay rate is $1/\tau_0 = \Gamma c^2/\hbar$
(Hz). The momentum ρ in (13.66) should be calculated as $\rho = \eta^{-1}(m_a)$.
26
See equation (13.37).
27
Approximation (13.63) may not be accurate for particles (or resonances) decaying due
to strong nuclear forces. However, their lifetime is very short, $\tau_0 \approx 10^{-23}$ s, so the time
dependence of their decays cannot be observed experimentally.
Figure 13.5: Complex plane integration contour for integral (13.67). Axes:
Re(m) and Im(m), with the points 0, ηρ , ma + iΓ/2, and ma − iΓ/2 marked.
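$$ J(\vec{\rho}, t) = \frac{\Gamma g^*(\vec{\rho})}{2\pi c^2} \int_{-\infty}^{\infty} \frac{e^{-\frac{i}{\hbar} m c^2 t}\, dm}{(m - \eta_{\vec{\rho}})(m - m_a - i\Gamma/2)(m - m_a + i\Gamma/2)} \tag{13.67} $$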
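The integration contour should be closed as shown in Fig. 13.5, because then
the integral along the large semi-circle in the lower half-plane can be ignored,
as the factor $e^{-\frac{i}{\hbar} m c^2 t}$ tends to zero there. So we obtain
$$ J(\vec{\rho}, t) = \frac{g^*(\vec{\rho})}{c^2 (m_a - \eta_{\vec{\rho}} - i\Gamma/2)} \left( e^{-\frac{i}{\hbar}(m_a - i\Gamma/2) c^2 t} - \frac{i\Gamma e^{-\frac{i}{\hbar} \eta_{\vec{\rho}} c^2 t}}{2(m_a - \eta_{\vec{\rho}}) + i\Gamma} \right) \tag{13.68} $$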
To be consistent with the initial condition (13.36), our solution must satisfy
$J(\vec{\rho}, 0) = 0$. This is, indeed, true as within our approximations we can set
$\eta_{\vec{\rho}} = m_a$ in the second term in the parentheses, so that the whole expression
vanishes at t = 0.
The first term in the parentheses is significant only at short times compa-
rable with the particle’s lifetime τ0 . In the limit t → ∞ only the second term
contributes, and we can write the wave function in the position representation
(5.43)
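$$ J(\mathbf{r}, t) \approx -\frac{i\Gamma}{2c^2 (2\pi\hbar)^{3/2}} \int d\vec{\rho}\, e^{\frac{i}{\hbar} \vec{\rho}\,\mathbf{r}} g^*(\vec{\rho}) \frac{e^{-\frac{i}{\hbar} \eta_{\vec{\rho}} c^2 t}}{(m_a - \eta_{\vec{\rho}})^2 + \Gamma^2/4} \tag{13.69} $$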
where r ≡ rb − rc is the relative position of the two decay products, which is

an observable conjugate to $\vec{\rho}$. In our model, the interaction strength $|g(\vec{\rho})|$
is spherically symmetric. We will also assume that function $g(\vec{\rho})$ is real,
$g^*(\vec{\rho}) = |g(\rho)|$. The largest contribution to the integral (13.69) comes from
a thin spherical layer with momenta $|\vec{\rho}| \approx \rho_0$, such that $\eta_{\rho_0} \approx m_a$. We can
assume that within this layer the modulus of the interaction function stays
nearly constant, $|g(\rho)| \approx |g(\rho_0)|$. Similarly, we can approximate equation
(13.37) as
$$ \begin{aligned}
\eta_{\vec{\rho}} &\approx \eta_{\rho_0} + \frac{1}{c^2} \left( \frac{c^2 \rho_0}{\sqrt{m_b^2 c^4 + c^2 \rho_0^2}} + \frac{c^2 \rho_0}{\sqrt{m_c^2 c^4 + c^2 \rho_0^2}} \right) (\rho - \rho_0) \\
&= \eta_{\rho_0} + \frac{1}{c^2} (v_b + v_c)(\rho - \rho_0)
\end{aligned} $$
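where vb and vc are the average speeds with which the two decay products leave
the region of their creation. With these approximations we rewrite equation
(13.69)
$$ J(\mathbf{r}, t) \approx \frac{C}{(2\pi\hbar)^3} \int d\vec{\rho}\, e^{\frac{i}{\hbar} \vec{\rho}\,\mathbf{r}} e^{-\frac{i}{\hbar}(v_b + v_c)\rho t} \tag{13.70} $$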
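where C is a constant whose value is not important to us here. Evaluating
this integral in spherical coordinates we obtain
$$ \begin{aligned}
J(\mathbf{r}, t) &\approx \frac{2\pi C}{(2\pi\hbar)^3} \int_0^{\pi} \sin\theta\, d\theta \int_0^{\infty} \rho^2 d\rho\, e^{\frac{i}{\hbar} \rho r \cos\theta} e^{-\frac{i}{\hbar}(v_b + v_c)\rho t} \\
&\approx \frac{2\pi\hbar C}{i r (2\pi\hbar)^3} \int_{-\infty}^{\infty} \rho\, d\rho \left( e^{\frac{i}{\hbar} \rho r} - e^{-\frac{i}{\hbar} \rho r} \right) e^{-\frac{i}{\hbar}(v_b + v_c)\rho t} \\
&\propto \frac{1}{r} \left[ \delta(r - (v_b + v_c)t) + \delta(r + (v_b + v_c)t) \right]
\end{aligned} $$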
The second delta function in square brackets can be ignored, because its
argument never turns to zero as r, vb , vc , t are positive quantities and the
limit t → +∞ is taken. Then
$$ J(\mathbf{r}, t) \propto \frac{1}{r} \delta(r - (v_b + v_c)t) $$
which means that after all approximations adopted here the wave function
of the decay products has the form of a spherical shell expanding around
the decay point with a constant speed. The separation between two decay
products changes as
r = (vb + vc )t (13.71)
This indicates that particles b and c are not interacting in the asymptotic
regime: they move apart with constant velocities, as expected.
13.3 Decay law for moving particles

Equation (13.32) is the decay law ω(0, t) observed from the reference frame O
at rest. In the present section we will derive a formula for the decay law ω(θ, t)
in a moving frame O ′. Particular cases of this formula relevant to unstable
particles with sharply defined momenta or velocities will be considered in
subsections 13.3.2 and 13.3.3, respectively.
13.3.1 General formula for the decay law

Suppose that observer O describes the initial state (at t = 0) by the state
vector |Ψi. Then moving observer O ′ describes the same state28 by the vector
$$ |\Psi(\theta, 0)\rangle = e^{\frac{ic}{\hbar} K_x \theta} |\Psi\rangle $$
The time dependence of this state is
$$ |\Psi(\theta, t')\rangle = e^{-\frac{i}{\hbar} H t'} e^{\frac{ic}{\hbar} K_x \theta} |\Psi\rangle \tag{13.72} $$
According to the general formula (13.5), the decay law from the point of view
of O ′ is
28
at t′ = t = 0, where t′ is time measured by the observer’s O′ clock
$$ \omega(\theta, t') = \langle \Psi(\theta, t') | T | \Psi(\theta, t')\rangle \tag{13.73} $$
$$ = \| T |\Psi(\theta, t')\rangle \|^2 \tag{13.74} $$
Let us use the basis set decomposition (13.29) of the state vector |Ψi.
Then, applying equations (13.72), (13.21), and (13.22) we obtain
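$$ \begin{aligned}
|\Psi(\theta, t')\rangle &= \int d\mathbf{p}\, \psi(\mathbf{p}) e^{-\frac{i}{\hbar} H t'} e^{\frac{ic}{\hbar} K_x \theta} |\mathbf{p}\rangle \\
&= \int d\mathbf{p}\, \psi(\mathbf{p}) \int_{m_b+m_c}^{\infty} dm\, \mu(m) \gamma(\mathbf{p}, m) e^{-\frac{i}{\hbar} H t'} e^{\frac{ic}{\hbar} K_x \theta} |\mathbf{p}, m\rangle \\
&= \int d\mathbf{p}\, \psi(\mathbf{p}) \int_{m_b+m_c}^{\infty} dm\, \mu(m) \gamma(\mathbf{p}, m) e^{-\frac{i}{\hbar} \omega_{\Lambda\mathbf{p}} t'} \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\, |\Lambda\mathbf{p}, m\rangle
\end{aligned} $$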
mb +mc
The inner product of this vector with |qi can be found with the help of
(13.30) and new integration variables r = Λp
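$$ \begin{aligned}
\langle \mathbf{q} | \Psi(\theta, t')\rangle
&= \int d\mathbf{p}\, \psi(\mathbf{p}) \int_{m_b+m_c}^{\infty} dm\, \mu(m) \gamma(\mathbf{p}, m) e^{-\frac{i}{\hbar} \omega_{\Lambda\mathbf{p}} t'} \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\, \langle \mathbf{q} | \Lambda\mathbf{p}, m\rangle \\
&= \int d\mathbf{p}\, \psi(\mathbf{p}) \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 \gamma(\mathbf{p}, m) \gamma^*(\Lambda\mathbf{p}, m) e^{-\frac{i}{\hbar} \omega_{\Lambda\mathbf{p}} t'} \sqrt{\frac{\omega_{\Lambda\mathbf{p}}}{\omega_{\mathbf{p}}}}\, \delta(\mathbf{q} - \Lambda\mathbf{p}) \\
&= \int_{m_b+m_c}^{\infty} dm \int d\mathbf{r}\, \frac{\omega_{\Lambda^{-1}\mathbf{r}}}{\omega_{\mathbf{r}}} \sqrt{\frac{\omega_{\mathbf{r}}}{\omega_{\Lambda^{-1}\mathbf{r}}}}\, \psi(\Lambda^{-1}\mathbf{r}) \gamma(\Lambda^{-1}\mathbf{r}, m) \gamma^*(\mathbf{r}, m) |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{r}} t'} \delta(\mathbf{q} - \mathbf{r}) \\
&= \int_{m_b+m_c}^{\infty} dm\, \sqrt{\frac{\omega_{\Lambda^{-1}\mathbf{q}}}{\omega_{\mathbf{q}}}}\, \psi(\Lambda^{-1}\mathbf{q}) \gamma(\Lambda^{-1}\mathbf{q}, m) \gamma^*(\mathbf{q}, m) |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{q}} t'}
\end{aligned} $$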
The non-decay probability in the reference frame O ′ for all values of θ and t′
is then found by substituting (13.14) in equation (13.73)
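$$ \begin{aligned}
\omega(\theta, t') &= \int d\mathbf{q}\, \langle \Psi(\theta, t') | \mathbf{q}\rangle \langle \mathbf{q} | \Psi(\theta, t')\rangle = \int d\mathbf{q}\, |\langle \mathbf{q} | \Psi(\theta, t')\rangle|^2 \\
&= \int d\mathbf{q} \left| \int_{m_b+m_c}^{\infty} dm\, \sqrt{\frac{\omega_{\Lambda^{-1}\mathbf{q}}}{\omega_{\mathbf{q}}}}\, \psi(\Lambda^{-1}\mathbf{q}) \gamma(\Lambda^{-1}\mathbf{q}, m) \gamma^*(\mathbf{q}, m) |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{q}} t'} \right|^2
\end{aligned} \tag{13.75} $$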
This general formula is not very convenient for calculations. So, in the fol-
lowing subsections we will consider some specific situations in which (13.75)
can be simplified and fully evaluated.
13.3.2 Decays of states with definite momentum

In the reference frame at rest (θ = 0), formula (13.75) coincides exactly with
our earlier result (13.32)
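$$ \omega(0, t) = \int d\mathbf{q}\, |\psi(\mathbf{q})|^2 \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{q}} t} \right|^2 \tag{13.76} $$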
In section 13.1 we applied this formula to calculate the decay law of a particle
with zero momentum. Here we will consider the case when the unstable par-
ticle has a non-zero momentum p, i.e., the state is described by a normalized
vector |p) whose wave function is (13.16)
$$ \psi(\mathbf{q}) = \sqrt{\delta(\mathbf{q} - \mathbf{p})} \tag{13.77} $$
From equation (13.76) the decay law for such a state is
$$ \omega_{|\mathbf{p})}(0, t) = \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{p}} t} \right|^2 \tag{13.78} $$
In a number of works [Ste96, Kha97, Shi04, Urb14] it was noticed that this
result disagrees with Einstein’s time dilation formula (I.25). Indeed, if one
interprets the state |p) as a state of unstable particle moving with definite
speed
$$ v = \frac{c^2 p}{\sqrt{m_a^2 c^4 + p^2 c^2}} = c \tanh\theta $$
then the decay law (13.78) cannot be connected with the decay law of the
particle at rest (13.33) by Einstein’s formula (I.25)29
$$ \omega_{|\mathbf{p})}(0, t) \neq \omega_{|\mathbf{0})}(0, t/\cosh\theta) \tag{13.79} $$

This observation prompted authors of [Ste96, Kha97, Shi04] to question the
applicability of special relativity to particle decays. However, at a closer
inspection it appears that this result by itself does not challenge the special-
relativistic time dilation (I.25) directly. Formula (13.79) is comparing decay
laws of two different momentum eigenstates |0) and |p) viewed from the
same reference frame. This is not exactly the same as (I.25) which compares
observations made on the same particle from two frames of reference moving
with respect to each other. If from the point of view of observer O the
particle is described by the state vector |0) which has zero momentum and
zero velocity, then from the point of view of O ′ this particle is described by
the state
$$ e^{\frac{ic}{\hbar} \mathbf{K}\vec{\theta}} |\mathbf{0}) \tag{13.80} $$
which is not an eigenstate of the momentum operator P0 . So, strictly speak-
ing, formula (13.78) is not applicable to this state. However, it is not difficult
to see that (13.80) is an eigenstate of the velocity operator [Shi06]. Indeed,
taking into account that Vx |0) = 0 and equations (4.3) - (4.4), we obtain
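$$ \begin{aligned}
V_x e^{\frac{ic}{\hbar} K_x \theta} |\mathbf{0}) &= e^{\frac{ic}{\hbar} K_x \theta} e^{-\frac{ic}{\hbar} K_x \theta} V_x e^{\frac{ic}{\hbar} K_x \theta} |\mathbf{0})
= e^{\frac{ic}{\hbar} K_x \theta} \frac{V_x - c\tanh\theta}{1 - \frac{V_x \tanh\theta}{c}} |\mathbf{0}) \\
&\approx -c \tanh\theta\, e^{\frac{ic}{\hbar} K_x \theta} |\mathbf{0})
\end{aligned} \tag{13.81} $$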
29
In subsection 13.4.1 we will illustrate this inequality with numerical calculations.
Thus, a fair comparison with the time dilation formula (I.25) requires consid-
eration of unstable states having definite values of velocity for both observers.
This will be done in subsection 13.3.4.
13.3.3 Decay law in the moving reference frame

Before addressing the decay law seen from a moving frame, let us first introduce
a few realistic approximations and simplify our general formula (13.75)
somewhat. First, we may notice that in all realistic cases the initial state
|Ψi ∈ Ha is not an exact eigenstate of the total momentum operator: the
wave function of the unstable particle is never localized at one point in the
momentum space (as was assumed, for example, in (13.77)) but has a spread
(or uncertainty) of momentum |∆p| and, correspondingly, an uncertainty of
position |∆r| ≈ ~/|∆p|. Second, the state |Ψi ∈ Ha is not an eigenstate
of the mass operator M. The initial state |Ψi is characterized by the un-
certainty of mass Γ (see Fig. 13.4) that is related to the particle’s lifetime
(τ0 ) by formula (13.65). It is important to note that in all cases of practical
interest the mentioned uncertainties are related by inequalities
|∆p| ≫ Γc (13.82)
|∆r| ≪ cτ0 (13.83)
In particular, the latter inequality means that the uncertainty of position
is much less than the distance passed by light during the lifetime of the
particle. For example, in the case of muon τ0 ≈ 2.2 · 10−6 s and, according
to (13.83), the spread of the wave function in the position space must be
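much less than 600 m, which is a reasonable assumption. Therefore, we can
safely assume that the factor |µ(m)|2 in (13.75) has a very sharp peak near
the value m = ma .^30 Then we can move the value of the smooth function
$\sqrt{\omega_{\Lambda^{-1}\mathbf{q}}/\omega_{\mathbf{q}}}\, \psi(\Lambda^{-1}\mathbf{q}) \gamma(\Lambda^{-1}\mathbf{q}, m) \gamma^*(\mathbf{q}, m)$ at m = ma outside the integral on m
$$ \begin{aligned}
\omega(\theta, t') &\approx \int d\mathbf{q} \left| \sqrt{\frac{\Omega_{\Lambda^{-1}\mathbf{q}}}{\Omega_{\mathbf{q}}}}\, \psi(\Lambda^{-1}\mathbf{q}) \gamma(\Lambda^{-1}\mathbf{q}, m_a) \gamma^*(\mathbf{q}, m_a) \right|^2 \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{q}} t'} \right|^2 \\
&= \int d\mathbf{q}\, \frac{\Omega_{L^{-1}\mathbf{q}}}{\Omega_{\mathbf{q}}} |\psi(L^{-1}\mathbf{q})|^2 \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{q}} t'} \right|^2 \\
&= \int d\mathbf{p}\, |\psi(\mathbf{p})|^2 \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{L\mathbf{p}} t'} \right|^2
\end{aligned} \tag{13.84} $$
Here $\Omega_{\mathbf{p}} = \sqrt{m_a^2 c^4 + p^2 c^2}$, Λp is given by equation (13.23) and
$L\mathbf{p} = (p_x \cosh\theta + \frac{\Omega_{\mathbf{p}}}{c} \sinh\theta,\; p_y,\; p_z)$.
30
see discussion after equation (13.24)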
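The momentum map Lp used in (13.84) can be illustrated with a few lines of code. The sketch below is an illustration in units with c = 1 by default; the function names and the sample momentum are made up for the example. It checks two elementary consequences of the definition: a particle at rest acquires momentum ma c sinh θ (and hence speed c tanh θ), and the energy transforms as Ω cosh θ + c px sinh θ.

```python
import math

def energy(p, ma, c=1.0):
    """Omega_p = sqrt(ma^2 c^4 + p^2 c^2) for a momentum 3-vector p."""
    px, py, pz = p
    return math.sqrt(ma ** 2 * c ** 4 + (px * px + py * py + pz * pz) * c ** 2)

def boost_momentum(p, theta, ma, c=1.0):
    """Lp = (px cosh(theta) + (Omega_p / c) sinh(theta), py, pz)."""
    px, py, pz = p
    return (px * math.cosh(theta) + energy(p, ma, c) / c * math.sinh(theta), py, pz)

MA = 1000.0                                   # MeV/c^2, as in the later example
for theta in (0.2, 1.4):
    p_boosted = boost_momentum((0.0, 0.0, 0.0), theta, MA)
    speed = p_boosted[0] / energy(p_boosted, MA)   # v/c = c p / Omega
    print(theta, p_boosted[0], speed, math.tanh(theta))
```

Setting p = 0 in the phase of (13.84) therefore leads to the energy sqrt(m^2 c^4 + ma^2 c^4 sinh^2 θ) for each mass component m, which is the expression that appears in the decay law of the next subsection.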
13.3.4 Decays of states with definite velocity

Next we consider an initial state which has zero velocity from the point of
view of observer O. The wave function of this state is localized near zero
momentum p = 0. So, we can set in equation (13.84)31
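$$ |\psi(\mathbf{p})|^2 \approx \delta(\mathbf{p}) \tag{13.85} $$
and obtain the decay law seen by the moving observer
$$ \omega_{|\mathbf{0})}(\theta, t') \approx \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{it'}{\hbar} \sqrt{m^2 c^4 + m_a^2 c^4 \sinh^2\theta}} \right|^2 \tag{13.86} $$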
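If we approximately identify $m_a c \sinh\theta$ with the momentum p of the particle
a from the point of view of the moving observer O′,^32 then
$$ \omega_{|\mathbf{0})}(\theta, t') \approx \left| \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 e^{-\frac{i}{\hbar} \omega_{\mathbf{p}} t'} \right|^2 \tag{13.87} $$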
31
As we mentioned in the preceding subsection, in reality this state is not exactly an
eigenstate of momentum (velocity) |0). However, its wave function is still much better
localized in the p-space than the slowly varying second factor under the integral in (13.84),
so, approximation (13.85) is justified.
32
From the point of view of this observer, particle’s velocity is c tanh θ.
So, in this approximation the decay law (13.87) in the frame of reference
O′ moving with the speed c tanh θ takes the same form as the decay law
(13.78) of a particle moving with momentum ma c sinh θ with respect to the
stationary observer O.^33 So, the violation of Einstein's time dilation
formula mentioned in subsection 13.3.2 is a real effect that warrants a more
in-depth study. In the next section we will evaluate (13.87) numerically.
13.4 “Time dilation” in decays

In this section we will present a specific example, in which predictions of our
RQD approach deviate from special relativity. In particular, we will demonstrate
the approximate character of Einstein's “time dilation” formula
(I.25) for decays of fast moving particles.
13.4.1 Numerical results

In this subsection we will calculate the difference between the accurate quan-
tum mechanical result (13.87)34 and the special-relativistic formula (I.25)

$$ \omega^{SR}_{|\mathbf{0})}(\theta, t) = \omega_{|\mathbf{0})}\left( 0, \frac{t}{\cosh\theta} \right) \tag{13.88} $$
In this calculation we assume that the mass distribution |µ(m)|2 of the unstable
particle has the Breit-Wigner form^35
$$ |\mu(m)|^2 = \begin{cases}
\dfrac{\alpha\Gamma/2\pi}{\Gamma^2/4 + (m - m_a)^2}, & \text{if } m \geq m_b + m_c \\
0, & \text{if } m < m_b + m_c
\end{cases} \tag{13.89} $$
where parameter α is a factor that ensures the normalization to unity
$$ \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 = 1 $$
33
Note the disagreement between our result and conclusions of ref. [Shi06].
34
For similar calculations of the decay law of fast moving particles see [Urb14].
35
see equation (13.60) and Fig. 13.4
The following parameters of this distribution have been chosen: The mass
of the unstable particle was ma = 1000 MeV/c2 , the total mass of the decay
products was mb + mc = 900 MeV/c2 , and the width of the mass distribution
was Γ= 20 MeV/c2 . These values do not correspond to any real particle, but
they are typical for strongly decaying baryon resonances.
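For the parameters just listed, the normalization factor α in (13.89) can be obtained in closed form, because the integral of a Lorentzian is an arctangent. The sketch below is an illustration with made-up function names. It also evaluates the same integral truncated at an upper limit of 300000 MeV/c^2; the square of that truncated value is numerically close to the constant 0.9375349 that appears as a prefactor in the Mathematica program of this subsection, which suggests (as an inference, not a statement from the text) how that constant was obtained.

```python
import math

def breit_wigner_weight(ma, threshold, gamma, upper=math.inf):
    """Integral of (Gamma/2pi) / (Gamma^2/4 + (m - ma)^2) from threshold to upper."""
    hi = math.pi / 2 if math.isinf(upper) else math.atan(2.0 * (upper - ma) / gamma)
    lo = math.atan(2.0 * (threshold - ma) / gamma)
    return (hi - lo) / math.pi

WEIGHT = breit_wigner_weight(1000.0, 900.0, 20.0)            # untruncated integral
ALPHA = 1.0 / WEIGHT                                          # factor alpha in (13.89)
TRUNCATED = breit_wigner_weight(1000.0, 900.0, 20.0, 300000.0)

print("weight above threshold:", WEIGHT)
print("alpha:", ALPHA)
print("square of the truncated weight:", TRUNCATED ** 2)
```

About 3% of the Lorentzian lies below the threshold mb + mc for these parameters, so α exceeds 1 by roughly the same amount.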
It is convenient to measure time in units of the lifetime τ0 cosh θ. Denoting
χ ≡ t/(τ0 cosh θ), we find that special-relativistic decay laws (13.88) for any
rapidity θ are given by the same universal function ω SR (χ). This function
was evaluated for values of χ in the interval from 0 to 6 with the step of
0.1. Calculations were performed by direct numerical integration of equation
(13.78) using the Mathematica program shown below
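(* Decay law (13.78) with the mass distribution (13.89); masses in MeV/c^2 *)
gamma = 20;   (* width of the mass distribution *)
mass = 1000;  (* mass ma of the unstable particle *)
theta = 0.0;  (* rapidity *)
(* t is the time in units of tau0 Cosh[theta]; the factor 1/0.9375349 *)
(* normalizes the non-decay probability to 1 at t = 0 *)
Do[Print[(1/0.9375349) Abs[NIntegrate[gamma/(2 Pi)/(gamma^2/4 + (x - mass)^2)
  Exp[I t Sqrt[x^2 + mass^2 Sinh[theta]^2] Cosh[theta]/gamma],
  {x, 900, 1010, 1100, 300000}, MinRecursion -> 3,
  MaxRecursion -> 16, PrecisionGoal -> 8, WorkingPrecision -> 18]]^2],
  {t, 0.0, 6.0, 0.1}]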
As expected, function $\omega^{SR}(\chi)$ (shown by the thick solid line in Fig. 13.6) is
very close to the exponential $e^{-\chi}$.
Next we used equation (13.78)36 to calculate decay laws ω|0) (θ, χ) of mov-
ing particles. These calculations were done by the same Mathematica code,
only instead of θ = 0 we used three values of the rapidity parameter θ
(=theta), namely θ = 0.2, 1.4, and 10.0. These rapidities corresponded
to moving frame velocities of 0.197c, 0.885c, and 0.999999995c, respectively.
These calculations revealed important differences between the accurate quantum
mechanical result (13.78) and the special-relativistic approximation (13.88).
These differences $\omega_{|\mathbf{0})}(\theta, \chi) - \omega^{SR}(\chi)$ are plotted as thin lines in Fig. 13.6.
Deviations from Einstein's time dilation formula are as high as 0.3% in
the example considered here. Is it possible to observe these deviations in
experiments with moving unstable particles?
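The calculation described above can also be sketched in plain Python. This is not the author's program: it is an independent illustration that uses a composite Simpson rule on two hand-picked panels, an upper cutoff of 20000 MeV/c^2, and the closed-form normalization of the Lorentzian instead of a numerical one. All of these are ad hoc choices made for this sketch, and they are adequate only for χ of order one and moderate rapidities (the case θ = 10 would need a finer grid at large masses).

```python
import cmath
import math

MA, THRESHOLD, GAMMA = 1000.0, 900.0, 20.0   # MeV/c^2, the values quoted in the text

def integrand(m, chi, theta):
    """Breit-Wigner weight times the phase factor of (13.87).

    chi = t / (tau0 cosh(theta)), so the phase is
    chi * cosh(theta) * sqrt(m^2 + ma^2 sinh^2(theta)) / Gamma.
    """
    weight = GAMMA / (2.0 * math.pi) / (GAMMA ** 2 / 4.0 + (m - MA) ** 2)
    omega = math.sqrt(m * m + (MA * math.sinh(theta)) ** 2)
    return weight * cmath.exp(-1j * chi * math.cosh(theta) * omega / GAMMA)

def simpson(f, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def decay_law(chi, theta, upper=20000.0):
    """Non-decay probability for rapidity theta at scaled time chi."""
    f = lambda m: integrand(m, chi, theta)
    amplitude = simpson(f, THRESHOLD, 1200.0, 15000) + simpson(f, 1200.0, upper, 37600)
    norm = 0.5 + math.atan(2.0 * (MA - THRESHOLD) / GAMMA) / math.pi
    return abs(amplitude / norm) ** 2

if __name__ == "__main__":
    for chi in (1.0, 2.0, 3.0):
        w_rest = decay_law(chi, 0.0)      # special-relativistic reference
        w_moving = decay_law(chi, 1.4)
        print(chi, w_rest, math.exp(-chi), w_moving - w_rest)
```

The last printed column is the quantity plotted as thin lines in Fig. 13.6 for θ = 1.4. The second and third columns allow a comparison of the rest-frame decay law with the pure exponential.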
36
or, what is essentially the same, equation (13.87)
Figure 13.6: Corrections to Einstein's “time dilation” formula (I.25)
for the decay law of unstable particle moving with the speed v = c tanh θ.
Parameter χ is time measured in units of τ0 cosh θ. Plotted curves:
ω|0) (θ, χ) − ωSR (χ) for θ = 0.2, 1.4, and 10, and ωSR (χ), versus χ from 0 to 6.
The lifetime of the particle a considered in our example ($\tau_0 = \hbar/(\Gamma c^2) \approx 3 \times 10^{-23}$
s) is too short to be observed experimentally. Unstable baryon resonances
are identified experimentally by the resonance behavior of the scattering
cross-section as a function of the collision energy, rather than by direct mea-
surements of the decay law. So, calculated corrections to the Einstein’s time
dilation law have only illustrative value. However, from these data we can
estimate the magnitude of corrections for particles whose time-dependent
decay laws can be measured in a laboratory, e.g., for muons. Taking into ac-
count that the magnitude of corrections is roughly proportional to the ratio
Γ/ma [Ste96, Shi04] and that in our example Γ/ma = 0.02, we can expect
that for muons (Γ ≈ 2 × 10−9 eV /c2 , ma ≈ 105MeV /c2 , Γ/ma ≈ 0.02 × 10−15 )
the maximum magnitude of the correction should be about 2 × 10−18 , which
is much smaller than the precision of modern experiments.37
Our results indicate that all physical processes38 viewed from a moving
reference frame do not go exactly cosh θ slower, as special relativity would
37
Most accurate experiments with decaying muons confirm Einstein's time dilation
formula with the precision of only $10^{-3}$ [BBC+77, Far92].
38
not just particle decays considered here
predict. The exact slowdown pattern depends on the physical makeup of

the process and on interactions responsible for it. For more discussions on
how our RQD approach compares with special relativity and experiments see
chapter 17.
13.4.2 Decays caused by boosts

Recall that in subsection 6.2.2 we discussed two classes of inertial transforma-
tions of observers - kinematical and dynamical. According to our Postulate
17.2, we are working in the instant form of Dirac’s dynamics, so space trans-
lations and rotations are kinematical, while time translations and boosts are
dynamical. Kinematical transformations impose only trivial changes on the
external appearance of the object and do not influence its internal state.
The description of kinematical space translations and rotations is a purely
geometrical exercise which does not require intricate knowledge of interac-
tions in the physical system. This conclusion is supported by observations
of unstable particles: For two observers in different places or with different
orientations, the non-decay probability of the particle has exactly the same
value.
On the other hand, dynamical transformations depend on interaction and
directly affect the internal structure of the observed system.39 The dynamical
effect of time translations on the unstable particle is obvious - the particle de-
cays with time. But what about boost transformations? Does the non-decay
probability depend on the observer’s velocity? Special relativity answers:
“No, there is no such dependence.”40 And this answer is often believed to
be self-evident in discussions of relativistic effects. For example, Polishchuk
writes41
Any event that is “seen” in one inertial system is “seen” in all

others. For example if observer in one system “sees” an explosion
on a rocket then so do all other observers. R. Polishchuk [Pol01]
Applying this statement to decaying particles, we would expect that the non-
decay probability does not depend on the observer’s velocity. In particular,
this would mean that at time t = 0 we should have
39
40
see Appendix I.4
41
This assertion does not hold in our RQD theory, as illustrated in Fig. 17.3.
ω(θ, 0) = 1 (13.90)
for all θ. Here we are going to prove that this expectation is incorrect.
Suppose that special-relativistic equation (13.90) is valid, i.e., for any
$|\Psi\rangle \in \mathcal{H}_a$ and any θ > 0, boost transformations of the observer keep the
initial state vector within the unstable particle's subspace

$$ e^{\frac{ic}{\hbar}K_x\theta}|\Psi\rangle \in \mathcal{H}_a $$

Then the subspace $\mathcal{H}_a$ is invariant under the action of boosts $e^{\frac{ic}{\hbar}K_x\theta}$, which means
that operator $K_x$ commutes with the projection T on the subspace $\mathcal{H}_a$. Then
from Poincaré commutator (3.57) and $[T, P_{0x}] = 0$ it follows by the Jacobi
identity that

$$ [T, H] = \frac{ic^2}{\hbar}[T, [K_x, P_{0x}]] = \frac{ic^2}{\hbar}[K_x, [T, P_{0x}]] - \frac{ic^2}{\hbar}[P_{0x}, [T, K_x]] = 0 $$

which contradicts the fundamental property (13.10) of unstable systems.
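The algebraic step in this argument can be checked on a toy model. The sketch below is not a representation of the Poincaré algebra: `T`, `K` and `P` are hypothetical finite matrices, chosen only so that K and P commute with the projection T. It verifies the Jacobi rearrangement used above and the resulting vanishing of the commutator of T with [K, P], which plays the role of H up to a constant factor.

```python
import numpy as np


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


rng = np.random.default_rng(0)


def block_diag_random(n1, n2):
    """Random Hermitian matrix that is block diagonal, hence commutes with T."""
    m = np.zeros((n1 + n2, n1 + n2), dtype=complex)
    for start, size in ((0, n1), (n1, n2)):
        a = rng.normal(size=(size, size)) + 1j * rng.normal(size=(size, size))
        m[start:start + size, start:start + size] = a + a.conj().T
    return m


n1, n2 = 2, 3
T = np.diag([1.0] * n1 + [0.0] * n2).astype(complex)  # projection on the toy "H_a"
K = block_diag_random(n1, n2)  # toy stand-in for K_x, commutes with T
P = block_diag_random(n1, n2)  # toy stand-in for P_0x, commutes with T

lhs = comm(T, comm(K, P))                                # [T, [K, P]]
jacobi_rhs = comm(K, comm(T, P)) - comm(P, comm(T, K))   # rearranged by Jacobi
print(np.allclose(lhs, jacobi_rhs), np.allclose(lhs, 0.0))
```

The first check holds for any three matrices; the second holds only because both K and P were built to leave the subspace invariant, which is exactly the assumption that the text shows to be untenable for an unstable particle.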
This contradiction implies that, in fact, the state $e^{\frac{ic}{\hbar}K_x\theta}|\Psi\rangle$ does not
correspond to the particle a with 100% probability. This state must contain
contributions from decay products even at t = 0

$$ e^{\frac{ic}{\hbar}K_x\theta}|\Psi\rangle \notin \mathcal{H}_a \tag{13.91} $$

$$ \omega(\theta, 0) < 1, \quad \text{for } \theta \neq 0 \tag{13.92} $$
This is the “decay caused by boost,” which means that special-relativistic
equations (I.25) and (13.90) are not accurate and that boosts of the observer
have a non-trivial effect on the internal state of the observed unstable system.
The presence of “decays caused by boosts” means that particle composi-
tion of systems involving unstable states is not a relativistic invariant. For
example, one should be careful when making assertions like this one:
Flavor is the quantum number that distinguishes the different
types of quarks and leptons. It is a Lorentz invariant quantity.
For example, an electron is seen as an electron by any observer,
never as a muon. C. Giunti and M. Laveder [GL]
Although this statement about the electron is correct (because the electron
is a stable particle), it is not true about the muon. An unstable muon can
be seen as a single particle by the observer at rest and, at the same time, it
will be perceived as a group of three decay products (an electron, a neutrino
νµ , and an antineutrino ν̃e ) by a moving observer.
In spite of its fundamental importance, the effect of boosts on the non-
decay probability is very small. For example, our rather accurate approxi-
mation (13.84) failed to “catch” this effect. Indeed, for t = 0 this formula
predicted
$$ \omega(\theta, 0) = \left| \int d\mathbf{p}\, |\psi(\mathbf{p})|^2 \int_{m_b+m_c}^{\infty} dm\, |\mu(m)|^2 \right|^2 = 1 $$
instead of the expected ω(θ, 0) < 1.
13.4.3 Particle decays in different forms of dynamics

Throughout this section we assumed that interaction responsible for the de-
cay belongs to the Bakamjian-Thomas instant form of dynamics. However, as
we saw in subsection 6.3.5, the Bakamjian-Thomas form does not allow sep-
arable interactions, so, most likely, this is not the form preferred by nature.
Therefore, it would be interesting to calculate decay laws in non-Bakamjian-
Thomas instant forms of dynamics as well. Although no such calculations
have been done yet, one can say with certainty that there is no form of
interaction in which the special-relativistic time dilation formula (13.88) is
exactly valid. This follows from the fact that in any instant form of dynamics
boost operators contain interaction terms, so the “decays caused by boosts”
- which contradict equation (13.88) - are always present.
What if the interaction responsible for the decay has a non-instant form?
Is it possible that there is a form of dynamics in which Einstein’s time dilation
formula (13.88) is exactly true? Our answer to this question is No. Let us
consider, for example, the point form of dynamics.42 In this case the subspace
Ha of the unstable particle is invariant with respect to boosts, [K0x , T ] = 0,
so there can be no boost-induced decays (13.91). However, we obtain a rather
42
surprising relationship between decay laws of the same particle viewed from
the moving reference frame ω(θ, t) and from the frame at rest ω(0, t)43
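$$
\begin{aligned}
\omega(\theta, t) &= \langle 0| e^{-\frac{ic}{\hbar}K_{0x}\theta} e^{\frac{i}{\hbar}Ht}\, T\, e^{-\frac{i}{\hbar}Ht} e^{\frac{ic}{\hbar}K_{0x}\theta} |0\rangle \\
&= \langle 0| e^{-\frac{ic}{\hbar}K_{0x}\theta} e^{\frac{i}{\hbar}Ht} e^{\frac{ic}{\hbar}K_{0x}\theta} \left( e^{-\frac{ic}{\hbar}K_{0x}\theta}\, T\, e^{\frac{ic}{\hbar}K_{0x}\theta} \right) e^{-\frac{ic}{\hbar}K_{0x}\theta} e^{-\frac{i}{\hbar}Ht} e^{\frac{ic}{\hbar}K_{0x}\theta} |0\rangle \\
&= \langle 0| e^{-\frac{ic}{\hbar}K_{0x}\theta} e^{\frac{i}{\hbar}Ht} e^{\frac{ic}{\hbar}K_{0x}\theta}\, T\, e^{-\frac{ic}{\hbar}K_{0x}\theta} e^{-\frac{i}{\hbar}Ht} e^{\frac{ic}{\hbar}K_{0x}\theta} |0\rangle \\
&= \langle 0| e^{\frac{it}{\hbar}(H\cosh\theta + cP_x\sinh\theta)}\, T\, e^{-\frac{it}{\hbar}(H\cosh\theta + cP_x\sinh\theta)} |0\rangle \\
&= \langle 0| e^{\frac{it}{\hbar}H\cosh\theta}\, T\, e^{-\frac{it}{\hbar}H\cosh\theta} |0\rangle \\
&= \omega(0, t\cosh\theta)
\end{aligned}
$$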
where the last equality follows from comparison with the decay law at rest
(13.9). This means that the decay rate in the moving frame is cosh θ times
faster than in the rest frame. This is in direct contradiction with experiments.
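To make the contrast concrete, the following sketch compares the point-form relation ω(θ, t) = ω(0, t cosh θ) with the special-relativistic expectation ω(0, t / cosh θ). The exponential rest-frame decay law and the function names are assumptions made purely for illustration; the text does not rely on an exponential law.

```python
import math


def omega_rest(t, tau0=1.0):
    """Assumed exponential decay law in the rest frame (illustration only)."""
    return math.exp(-t / tau0)


def omega_point_form(theta, t, tau0=1.0):
    """Point-form result derived above: the argument is stretched by cosh(theta)."""
    return omega_rest(t * math.cosh(theta), tau0)


def omega_special_relativity(theta, t, tau0=1.0):
    """Einstein's time dilation: the argument is compressed by cosh(theta)."""
    return omega_rest(t / math.cosh(theta), tau0)


theta, t = 1.0, 0.5
print(omega_point_form(theta, t), omega_rest(t), omega_special_relativity(theta, t))
```

For any θ > 0 and t > 0 the point-form non-decay probability falls below the rest-frame value, while the time-dilated one lies above it, which is the sense in which the point form contradicts experiment.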
The point form of dynamics is not acceptable for the description of de-
cays for yet another reason. Due to the interaction-dependence of the total
momentum operator (17.37), one should expect decays induced by space
translations
$$ e^{\frac{i}{\hbar}P_x a}|\Psi\rangle \notin \mathcal{H}_a, \quad \text{for } a \neq 0 \tag{13.93} $$
Translation-induced and/or rotation-induced decays are expected in all forms

of dynamics (except the instant form). This contradicts our experience,
which suggests that the composition of an unstable particle is not affected
by these kinematical transformations. Therefore only the instant form of dy-
namics is appropriate for the description of particle decays. This conclusion
supports our Postulate 17.2.
43
In this derivation we assumed that the state of the particle at rest $|\Psi\rangle$ is an eigenvector
of the interacting momentum operator, $\mathbf{P}|\Psi\rangle = 0$.
Chapter 14
RQD IN HIGHER ORDERS
There must be no barriers for freedom of inquiry. There is no

place for dogma in science. The scientist is free and must be
free to ask any question, to doubt any assertion, to seek for any
evidence, to correct any errors.
J. Robert Oppenheimer
In section 11.1 we derived the 2nd order dressed particle Hamiltonian

H0 + V2d by applying a unitary dressing transformation to the field-based
Hamiltonian of QED. However, this approach is hardly applicable to higher
perturbation orders because dressing requires separate processing of terms
of different types (unphys, renorm, phys). To achieve that, instead of using
the compact field representation, one needs to keep all terms expressed
through creation and annihilation operators as in (8.49) - (8.50). In this
representation even the original QED Hamiltonian takes the rather
cumbersome form shown in Appendix L. The task of the dressing transformation
is further complicated by the necessity to calculate multiple commutators of
these long expressions.
Fortunately, there is a much simpler approach, which leads to the same
desired result for the dressed Hamiltonian $H^d$. This approach is based on
formulas from subsection 11.1.8. Suppose, for example, that we want to find
the 3rd and 4th order contributions V3d and V4d to the electron-proton dressed
interaction. To do that we can use equations (11.30) - (11.31) and write
V3d ≈ (Σc3 )ph (14.1)

V4d ≈ (Σc4 )ph − V2d V2d (14.2)
All terms on the right hand sides can be found relatively easily. All warnings
from subsection 11.2.1 regarding the non-uniqueness of the off-energy-shell
behavior of the potentials remain valid here. Indeed, according to (11.23),
Σ-operators are directly related to the scattering matrix $S^c$

$$ \underbrace{(\Sigma^c_3)^{ph}} = S^c_3 \tag{14.3} $$

$$ \underbrace{(\Sigma^c_4)^{ph}} = S^c_4 \tag{14.4} $$
whose calculation is the primary goal of the QED formalism, as described in
all textbooks. In particular, the 4th order scattering matrix S4c has been cal-
culated in (10.64). The 3rd order contribution S3c will be found in subsection
14.1.1.
We will use formula (14.1) to calculate the 3rd order dressed interac-
tion V3d in section 14.1. This interaction potential will help us to calculate
lifetimes and energy shifts of unstable levels in the hydrogen atom. In sec-
tion 14.2 we will use (14.2) to derive the 4th order electron-proton interac-
tion V4d , whose experimental manifestations include the electron’s anomalous
magnetic moment and the Lamb shift.
14.1 Spontaneous radiative transitions

In chapter 13 we discussed rather general properties of the decay process.
In particular, we did not specify the exact type of the unstable system and
the form of the decay interaction operator V . So, we left our formula for
the decay rate in an unprocessed form (13.66). In this section we would like
to fill this gap and perform a complete calculation of the decay rate for a
realistic system – an excited state of the hydrogen atom.
The simplest interaction responsible for the spontaneous photon emis-
sion from hydrogen excited states has the structure d† a† c† da. In the dressed
particle Hamiltonian such terms first appear in the 3rd perturbation order.1
1
see Table 11.1
So, our plan in this section is as follows: In subsection 14.1.1 we are go-
ing to find the 3rd order contribution to the S-operator having the desired
structure d† a† c† da. Then in subsection 14.1.2 we will use the correspondence
(14.1), (14.3) between the S-operator and the dressed particle Hamiltonian
in order to obtain $V_3^d[d^\dagger a^\dagger c^\dagger da]$ near the energy shell. Next, in subsections
14.1.3-14.1.5, we will use the approach developed in chapter 13 to calculate the
radiative transition rate between two states of the hydrogen atom and the
associated energy shifts.
14.1.1 Bremsstrahlung scattering amplitude

To find the (bremsstrahlung) d† a† c† da part of the scattering operator in the
3rd order, we use the Feynman-Dyson perturbation theory with interaction
operator V1 from (9.31). Then
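$$
\begin{aligned}
S_3 &= \frac{i}{3!\hbar^3} \int_{-\infty}^{+\infty} dt_1 dt_2 dt_3\, T[V_1(t_1)V_1(t_2)V_1(t_3)] \\
&= \frac{i}{3!\hbar^3} \int d^4x_1 d^4x_2 d^4x_3\, T[V_1(\tilde{x}_1)V_1(\tilde{x}_2)V_1(\tilde{x}_3)] \qquad (14.5) \\
&= \frac{i}{3!\hbar^3} \int d^4x_1 d^4x_2 d^4x_3 \times \\
&\quad T[(J_\mu(\tilde{x}_1)A_\mu(\tilde{x}_1) + \mathcal{J}_\mu(\tilde{x}_1)A_\mu(\tilde{x}_1)) \times \\
&\quad (J_\nu(\tilde{x}_2)A_\nu(\tilde{x}_2) + \mathcal{J}_\nu(\tilde{x}_2)A_\nu(\tilde{x}_2)) \times \\
&\quad (J_\lambda(\tilde{x}_3)A_\lambda(\tilde{x}_3) + \mathcal{J}_\lambda(\tilde{x}_3)A_\lambda(\tilde{x}_3))] \qquad (14.6)
\end{aligned}
$$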
where $J_\mu$ and $\mathcal{J}_\mu$ are electron-positron and proton-antiproton current
operators, respectively, as defined in Appendix L.1. Expanding the three
parentheses we get 8 terms under the integral sign. The term of the type $JJJ$ cannot
contribute to the electron-proton bremsstrahlung, because it lacks the
proton component. Similarly the term $\mathcal{J}\mathcal{J}\mathcal{J}$ does not contribute and should be
omitted as well. Let us first consider the three terms $JJ\mathcal{J} + J\mathcal{J}J + \mathcal{J}JJ$.
As the order of factors under the time-ordering sign is irrelevant, these three
terms are equal. So, the corresponding contribution to the coefficient
function of the S-operator is2
2
Summation is performed on repeating indices µ, ν, λ = 0, 1, 2, 3.
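$$
\begin{aligned}
& S_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \\
&= \frac{i}{2\hbar^3c^3} \int d^4x_1 d^4x_2 d^4x_3 \times \\
&\quad \langle 0| a_{\mathbf{q},\tau} d_{\mathbf{p},\sigma}\, T[J_\mu(\tilde{x}_1)A_\mu(\tilde{x}_1) J_\nu(\tilde{x}_2)A_\nu(\tilde{x}_2) \mathcal{J}_\lambda(\tilde{x}_3)A_\lambda(\tilde{x}_3)]\, d^\dagger_{\mathbf{p}',\sigma'} a^\dagger_{\mathbf{q}',\tau'} c^\dagger_{\mathbf{s},\kappa} |0\rangle \\
&= \frac{ie^3}{2\hbar^3} \int d^4x_1 d^4x_2 d^4x_3 \times \\
&\quad \langle 0| a_{\mathbf{q},\tau} d_{\mathbf{p},\sigma}\, T[(\bar\psi(\tilde{x}_1)\gamma_\mu\psi(\tilde{x}_1)A_\mu(\tilde{x}_1)) (\bar\psi(\tilde{x}_2)\gamma_\nu\psi(\tilde{x}_2)A_\nu(\tilde{x}_2)) \times \\
&\quad (\bar\Psi(\tilde{x}_3)\gamma_\lambda\Psi(\tilde{x}_3)A_\lambda(\tilde{x}_3))]\, d^\dagger_{\mathbf{p}',\sigma'} a^\dagger_{\mathbf{q}',\tau'} c^\dagger_{\mathbf{s},\kappa} |0\rangle
\end{aligned}
$$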
This function can be evaluated by drawing two Feynman diagrams shown in

Fig. 14.1 and processing them according to Feynman rules from subsection
9.2.4.
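$$
\begin{aligned}
& S_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \\
&= -\frac{ie^3c^{3/2}}{4\pi^2} \frac{mMc^4}{\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}}} \frac{c}{(2\pi\hbar)^{3/2}\sqrt{2s}} \times \\
&\quad \delta^4(-\tilde{s} + \tilde{q} + \tilde{p} - \tilde{p}' - \tilde{q}') \frac{1}{(\tilde{p} - \tilde{p}')^2} \times \\
&\quad \bar{u}_a(\mathbf{q}, \tau) \Big[ \not{e}_{ab}(\mathbf{s}, \kappa) \frac{(\not{q} - \not{s} + mc^2)_{bc}}{(\tilde{q} - \tilde{s})^2 - m^2c^4} \not{W}_{cd}(\mathbf{p}, \sigma; \mathbf{p}', \sigma') \\
&\quad + \not{W}_{ab}(\mathbf{p}, \sigma; \mathbf{p}', \sigma') \frac{(\not{s} + \not{q}' + mc^2)_{bc}}{(\tilde{s} + \tilde{q}')^2 - m^2c^4} \not{e}_{cd}(\mathbf{s}, \kappa) \Big] u_d(\mathbf{q}', \tau')
\end{aligned}
$$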
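[Figure 14.1: two diagrams, (a) and (b), with external lines labeled p,σ; q,τ; p′,σ′; q′,τ′; s,κ and vertices labeled γµ, γν, γλ.]
Figure 14.1: 3rd order Feynman diagrams for the photon emission in
electron-proton collisions.

Now let us assume that the electron and the proton are non-relativistic
and simplify the above expression. According to approximations derived in
Appendix J.9 and using

$$
\begin{aligned}
\tilde{q}^2 &= (\tilde{q}')^2 = m^2c^4 \\
\tilde{s}^2 &= 0 \\
(\tilde{p} - \tilde{p}')^2 &\approx -c^2(\mathbf{p} - \mathbf{p}')^2 \\
(\tilde{s} + \tilde{q}')^2 - m^2c^4 &= 2\tilde{s} \cdot \tilde{q}' \\
(\tilde{q} - \tilde{s})^2 - m^2c^4 &= -2\tilde{s} \cdot \tilde{q} \\
W^0(\mathbf{p}, \sigma, \mathbf{p}', \sigma') &\approx \delta_{\sigma,\sigma'} \\
\mathbf{W}(\mathbf{p}, \sigma, \mathbf{p}', \sigma') &\approx 0 \\
\sqrt{\omega_{\mathbf{q}}\omega_{\mathbf{q}'}\Omega_{\mathbf{p}}\Omega_{\mathbf{p}'}} &\approx Mmc^4
\end{aligned}
$$

we obtain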
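$$
\begin{aligned}
& S_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \\
&\approx ie^3c^{5/2} \frac{1}{4\pi^2} \frac{1}{(2\pi\hbar)^{3/2}\sqrt{2s}} \frac{\delta^4(-\tilde{s} + \tilde{q} + \tilde{p} - \tilde{p}' - \tilde{q}')\delta_{\sigma\sigma'}}{c^2(\mathbf{p} - \mathbf{p}')^2} \times \\
&\quad \bar{u}_a(\mathbf{q}, \tau) \Big[ -\not{e}_{ab}(\mathbf{s}, \kappa) \frac{(\not{q} - \not{s} + mc^2)_{bc}}{2\tilde{q} \cdot \tilde{s}} (\gamma_0)_{cd} \\
&\quad + (\gamma_0)_{ab} \frac{(\not{s} + \not{q}' + mc^2)_{bc}}{2\tilde{q}' \cdot \tilde{s}} \not{e}_{cd}(\mathbf{s}, \kappa) \Big] u_d(\mathbf{q}', \tau')
\end{aligned}
$$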
Next we assume that the energy and momentum of the emitted photon
is much less than energies and momenta of charged particles. We also use
Dirac equations (J.83), (J.82) and the non-relativistic approximation (J.71)
to write3
3
Here we used the tilde to write ẽ in order to stress the 4-component nature of this
quantity despite the fact that it does not transform as a 4-vector.
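$$
\begin{aligned}
& \bar{u}(\mathbf{q}, \tau) \not{e} (\not{q} - \not{s} + mc^2) \gamma_0 u(\mathbf{q}', \tau') \\
&\approx \bar{u}(\mathbf{q}, \tau) e_\mu\gamma^\mu (q_\nu\gamma^\nu + mc^2) \gamma_0 u(\mathbf{q}', \tau') \\
&= \bar{u}(\mathbf{q}, \tau) ((-q_\nu\gamma^\nu + mc^2) e_\mu\gamma^\mu + 2g^{\mu\nu}q_\nu e_\mu) \gamma_0 u(\mathbf{q}', \tau') \\
&= \bar{u}(\mathbf{q}, \tau) ((-\not{q} + mc^2) e_\mu\gamma^\mu + 2\tilde{q} \cdot \tilde{e}) \gamma_0 u(\mathbf{q}', \tau') \\
&= 2\bar{u}(\mathbf{q}, \tau) (\tilde{q} \cdot \tilde{e}) \gamma_0 u(\mathbf{q}', \tau') = 2U^0(\mathbf{q}, \tau; \mathbf{q}', \tau') (\tilde{q} \cdot \tilde{e}) \\
&\approx 2\delta_{\tau,\tau'} (\tilde{q} \cdot \tilde{e})
\end{aligned}
$$

$$
\bar{u}(\mathbf{q}, \tau) \gamma_0 (\not{s} + \not{q}' + mc^2) \not{e}\, u(\mathbf{q}', \tau') = 2U^0(\mathbf{q}, \tau; \mathbf{q}', \tau') (\tilde{q}' \cdot \tilde{e}) \approx 2\delta_{\tau,\tau'} (\tilde{q}' \cdot \tilde{e})
$$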
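The reordering of slashed quantities in the chain above rests on the anticommutation relation of the gamma matrices. A minimal numerical sketch of that single step follows. It assumes the standard Dirac representation and the metric signature (+, -, -, -), and it ignores the factors of c that the book absorbs into its 4-vectors; the helper names `slash` and `dot4` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# gamma^0 and gamma^i in the Dirac representation (assumed convention)
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in PAULI
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(v):
    """Contract a 4-vector with upper components v^mu with gamma matrices: v_mu gamma^mu."""
    lowered = METRIC @ v
    return sum(lowered[mu] * GAMMA[mu] for mu in range(4))


def dot4(a, b):
    """Minkowski scalar product a . b."""
    return float(a @ METRIC @ b)


rng = np.random.default_rng(1)
e_vec = rng.normal(size=4)
q_vec = rng.normal(size=4)

# e-slash q-slash = -q-slash e-slash + 2 (q . e)
lhs = slash(e_vec) @ slash(q_vec)
rhs = -slash(q_vec) @ slash(e_vec) + 2.0 * dot4(q_vec, e_vec) * np.eye(4)
print(np.allclose(lhs, rhs))
```

Once the slashed momentum has been moved next to the spinor, the Dirac equation removes the term with (−q̸ + mc²), leaving only the scalar product, as in the text.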
Further approximations yield
s̃ · q̃ ′ = csωq′ − c2 (s · q′ ) ≈ mc3 s
s̃ · q̃ ≈ mc3 s
q̃ · ẽ = −c(q · e)
q̃ ′ · ẽ = −c(q′ · e)
Therefore4
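$$
\begin{aligned}
& S_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \\
&\approx \frac{ie^3\sqrt{c}\,\delta^4(-\tilde{s} + \tilde{q} + \tilde{p} - \tilde{p}' - \tilde{q}')}{4\pi^2(2\pi\hbar)^{3/2}\sqrt{2s}} \cdot \frac{\delta_{\sigma\sigma'}\delta_{\tau,\tau'}}{(\mathbf{p} - \mathbf{p}')^2} \left( \frac{\tilde{q}' \cdot \tilde{e}}{\tilde{q}' \cdot \tilde{s}} - \frac{\tilde{q} \cdot \tilde{e}}{\tilde{q} \cdot \tilde{s}} \right) \qquad (14.7) \\
&\approx -\frac{ie^3\delta^4(\tilde{q} + \tilde{p} - \tilde{p}' - \tilde{q}')}{4\pi^2 m(2\pi\hbar)^{3/2}\sqrt{2(cs)^3}} \cdot \frac{\delta_{\sigma\sigma'}\delta_{\tau,\tau'}\, (\mathbf{q}' - \mathbf{q}) \cdot \mathbf{e}(\mathbf{s}, \kappa)}{(\mathbf{q}' - \mathbf{q})^2} \qquad (14.8)
\end{aligned}
$$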
This is our final expression for the terms $JJ\mathcal{J} + J\mathcal{J}J + \mathcal{J}JJ$ in the scattering operator
(14.6). The contribution from the $J\mathcal{J}\mathcal{J} + \mathcal{J}J\mathcal{J} + \mathcal{J}\mathcal{J}J$ terms can be obtained
simply by replacing the electron's mass m in (14.8) by the proton's mass M.
So, this contribution is much smaller and will be neglected.
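The bookkeeping behind the expansion of the three parentheses in (14.6) can be sketched as a simple enumeration. The labels "e" (electron-positron current) and "p" (proton-antiproton current) and all variable names are hypothetical shorthand introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# "e" = electron-positron current, "p" = proton-antiproton current.
terms = ["".join(t) for t in product("ep", repeat=3)]

excluded = [t for t in terms if t.count("p") in (0, 3)]  # no mixed e-p content
one_proton = [t for t in terms if t.count("p") == 1]     # kept: electron radiates
two_protons = [t for t in terms if t.count("p") == 2]    # suppressed: m -> M in (14.8)

# The three equal one-proton terms turn the 1/3! prefactor into 1/2.
combined_weight = Fraction(len(one_proton), 6)

print(len(terms), excluded, one_proton, two_protons, combined_weight)
```

The count reproduces the structure used in the text: 8 terms in total, 2 discarded outright, 3 equal terms that change the prefactor from 1/3! to 1/2, and 3 terms that are neglected because of the large proton mass.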
4
Our result in (14.7) can be compared with equations (7.57) - (7.58) in [BD64].
14.1.2 3rd order perturbation Hamiltonian

From results in the preceding subsection we can find the 3rd order contri-
bution V3d [d† a† c† da] to the dressed particle interaction Hamiltonian. The
relationship between the dressed Hamiltonian and the scattering operator in
the 3rd order is given by equations (14.1) and (14.3). So5
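$$
V_3^d = \sum_{\sigma\tau\sigma'\tau'\kappa} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\; V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa)\, a^\dagger_{\mathbf{q}',\tau'} d^\dagger_{\mathbf{p}',\sigma'} c^\dagger_{\mathbf{s},\kappa} a_{\mathbf{q},\tau} d_{\mathbf{p},\sigma} \tag{14.9}
$$

whose coefficient function is6

$$
V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \approx \frac{e^3\delta_{\sigma\sigma'}\delta_{\tau,\tau'}}{8\pi^3 m(2\pi\hbar)^{3/2}\sqrt{2(cs)^3}}\, \delta(\mathbf{q} + \mathbf{p} - \mathbf{p}' - \mathbf{q}')\, \frac{(\mathbf{q}' - \mathbf{q}) \cdot \mathbf{e}(\mathbf{s}, \kappa)}{(\mathbf{q}' - \mathbf{q})^2} \tag{14.10}
$$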
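The action of the operator (14.9) on a two particle (electron+proton) initial
state

$$ |\Psi_i\rangle \equiv \sum_{\lambda\nu} \int d\mathbf{p}''\, d\mathbf{q}''\, \Psi(\mathbf{p}'', \mathbf{q}''; \lambda, \nu)\, a^\dagger_{\mathbf{q}'',\nu} d^\dagger_{\mathbf{p}'',\lambda} |0\rangle $$

is

$$
\begin{aligned}
V_3^d|\Psi_i\rangle &= \sum_{\sigma\tau\sigma'\tau'\kappa} \sum_{\lambda\nu} \int d\mathbf{p}''\, d\mathbf{q}'' \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\; V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \times \\
&\quad \Psi(\mathbf{p}'', \mathbf{q}''; \lambda, \nu)\, a^\dagger_{\mathbf{q}',\tau'} d^\dagger_{\mathbf{p}',\sigma'} c^\dagger_{\mathbf{s},\kappa} a_{\mathbf{q},\tau} d_{\mathbf{p},\sigma} a^\dagger_{\mathbf{q}'',\nu} d^\dagger_{\mathbf{p}'',\lambda} |0\rangle \\
&= \sum_{\sigma\tau\sigma'\tau'\kappa} \sum_{\lambda\nu} \int d\mathbf{p}''\, d\mathbf{q}'' \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\; V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \times \\
&\quad \Psi(\mathbf{p}'', \mathbf{q}''; \lambda, \nu)\, a^\dagger_{\mathbf{q}',\tau'} d^\dagger_{\mathbf{p}',\sigma'} c^\dagger_{\mathbf{s},\kappa}\, \delta(\mathbf{q} - \mathbf{q}'')\delta_{\tau\nu}\delta(\mathbf{p} - \mathbf{p}'')\delta_{\sigma\lambda} |0\rangle \\
&= \sum_{\sigma\tau\sigma'\tau'\kappa} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\; V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa) \times \\
&\quad \Psi(\mathbf{p}, \mathbf{q}; \sigma, \tau)\, a^\dagger_{\mathbf{q}',\tau'} d^\dagger_{\mathbf{p}',\sigma'} c^\dagger_{\mathbf{s},\kappa} |0\rangle \\
&= \sum_{\sigma'\tau'\kappa} \int d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s} \left( \sum_{\sigma\tau} \int d\mathbf{p}\, d\mathbf{q}\; V_3^{JJ\mathcal{J}}(\mathbf{p}, \mathbf{q}, \mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma, \tau, \sigma', \tau', \kappa)\, \Psi(\mathbf{p}, \mathbf{q}; \sigma, \tau) \right) a^\dagger_{\mathbf{q}',\tau'} d^\dagger_{\mathbf{p}',\sigma'} c^\dagger_{\mathbf{s},\kappa} |0\rangle
\end{aligned}
$$

5
For brevity we drop the label $[d^\dagger a^\dagger c^\dagger da]$ from the $V_3^d$ operator symbol.
6
This formula is obtained simply by dividing scattering amplitude (14.8) by the factor
$(-2\pi i)$ and omitting the energy delta function $\delta(\omega_{\mathbf{q}} + \Omega_{\mathbf{p}} - \omega_{\mathbf{q}'} - \Omega_{\mathbf{p}'})$.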
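where the expression in big parentheses is the transformed wave function of the
3-particle system (electron+proton+photon). Using (14.10), this wave function
can be written as

$$
\begin{aligned}
\Psi'(\mathbf{p}', \mathbf{q}', \mathbf{s}; \sigma'\tau'\kappa) &= \sum_{\sigma\tau} \int d\mathbf{p}\, d\mathbf{q}\, \frac{e^3\delta(\mathbf{q} + \mathbf{p} - \mathbf{p}' - \mathbf{q}')\delta_{\sigma\sigma'}\delta_{\tau,\tau'}}{8\pi^3 m(2\pi\hbar)^{3/2}\sqrt{2(cs)^3}}\, \frac{(\mathbf{q}' - \mathbf{q}) \cdot \mathbf{e}(\mathbf{s}, \kappa)}{(\mathbf{q}' - \mathbf{q})^2}\, \Psi(\mathbf{p}, \mathbf{q}; \sigma, \tau) \\
&= \frac{e^3}{8\pi^3 m(2\pi\hbar)^{3/2}\sqrt{2(cs)^3}} \int d\mathbf{k}\, \frac{\mathbf{k} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{k^2}\, \Psi(\mathbf{p}' + \mathbf{k}, \mathbf{q}' - \mathbf{k}; \sigma', \tau')
\end{aligned}
$$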
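By taking a Fourier transform and using (8.74), (B.8) we can switch to
the position representation for fermions7

$$
\begin{aligned}
\Psi'(\mathbf{x}, \mathbf{y}, \mathbf{s}; \sigma'\tau'\kappa) &= \frac{e^3}{8\pi^3 m(2\pi\hbar)^{9/2}\sqrt{2(cs)^3}} \int d\mathbf{p}'\, d\mathbf{q}'\, e^{\frac{i}{\hbar}\mathbf{p}'\mathbf{x} + \frac{i}{\hbar}\mathbf{q}'\mathbf{y}} \int d\mathbf{k}\, \frac{\mathbf{k} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{k^2}\, \Psi(\mathbf{p}' + \mathbf{k}, \mathbf{q}' - \mathbf{k}; \sigma', \tau') \\
&= \frac{e^3}{8\pi^3 m(2\pi\hbar)^{9/2}\sqrt{2(cs)^3}} \int d\mathbf{p}'\, d\mathbf{q}'\, e^{\frac{i}{\hbar}(\mathbf{p}' - \mathbf{k})\mathbf{x} + \frac{i}{\hbar}(\mathbf{q}' + \mathbf{k})\mathbf{y}} \int d\mathbf{k}\, \frac{\mathbf{k} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{k^2}\, \Psi(\mathbf{p}', \mathbf{q}'; \sigma', \tau') \\
&= \frac{\hbar^{3/2}e^3}{(2\pi\hbar)^3 m(2\pi)^{3/2}\sqrt{2(cs)^3}} \int d\mathbf{k}\, e^{\frac{i}{\hbar}\mathbf{k}(\mathbf{y} - \mathbf{x})}\, \frac{\mathbf{k} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{k^2} \times \frac{1}{(2\pi\hbar)^3} \int d\mathbf{p}'\, d\mathbf{q}'\, e^{\frac{i}{\hbar}\mathbf{p}'\mathbf{x} + \frac{i}{\hbar}\mathbf{q}'\mathbf{y}}\, \Psi(\mathbf{p}', \mathbf{q}'; \sigma', \tau') \\
&= \frac{e^3\hbar^{1/2}}{m(2\pi)^{3/2}\sqrt{2(cs)^3}}\, \frac{i(\mathbf{y} - \mathbf{x}) \cdot \mathbf{e}(\mathbf{s}, \kappa)}{4\pi|\mathbf{y} - \mathbf{x}|^3}\, \Psi(\mathbf{x}, \mathbf{y}; \sigma', \tau')
\end{aligned}
$$

7
x and y are position vectors of the proton and the electron, respectively.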
This means that the 3rd order position-space bremsstrahlung potential
between two charged particles is

$$ V_3^d(\mathbf{r}, \mathbf{s}, \kappa) = \frac{i\hbar^{1/2}e^3\, \mathbf{r} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{4\pi m\sqrt{2(2\pi cs)^3}\, r^3} \tag{14.11} $$
In contrast to 2nd order interactions discussed in chapter 12, this potential

does not conserve the number of particles. It is responsible for the emission
of photons with momentum s and helicity κ by an electron moving in the
field of a heavy proton.8 We can expect that the radiation emission rate
should be proportional to the square of the matrix element of this operator
between appropriate initial and final states. One can notice that potential
(14.11) is proportional to the electron's acceleration in the Coulomb field9

$$ \mathbf{a} \approx \frac{e^2\mathbf{r}}{4\pi mr^3} $$

Thus we conclude that the total radiated power should depend on the square
of the electron's acceleration $a^2$. This is in agreement with the well-known
Larmor formula of classical electrodynamics. Thus the 3rd order bremsstrahlung
interaction $V_3^d$ has direct relevance to the "radiation reaction" effect [McD00,
Par02, Par05].
14.1.3 Instability of excited atomic states

The (bremsstrahlung) perturbation V3 derived in the preceding subsection is
also responsible for radiative transitions between energy levels in atoms and
other bound systems. As an example, let us consider two stationary states
8
To maintain the Hermiticity of the full Hamiltonian it must contain also a term, which
is a Hermitian conjugate of V3d . Apparently, this term is responsible for the absorption of
photons by the interacting system electron+proton.
9
see equation (15.10)
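$2P^{1/2}$ and $1S^{1/2}$ of the hydrogen atom. The corresponding state vectors
will be denoted $|\Psi_i\rangle$ and $|\Psi_f\rangle$, respectively. They are eigenvectors of the
two-particle electron-proton Hamiltonian10

$$ H_{e-p} = \frac{p_e^2}{2m} + \frac{p_p^2}{2M} - \frac{e^2}{4\pi r} \tag{14.12} $$

with eigenvalues $E_i$ and $E_f$

$$
\begin{aligned}
H_{e-p}|\Psi_i\rangle &= E_i|\Psi_i\rangle \\
H_{e-p}|\Psi_f\rangle &= E_f|\Psi_f\rangle \\
E_i &> E_f
\end{aligned}
$$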
If we add interaction potential V3d + (V3d )† to the Hamiltonian He−p then

the state |Ψi i is no longer stationary. The two-particle subspace Hpe is
not invariant with respect to this potential. Operator V3d has a non-zero
matrix element between the stationary state 2P 1/2 of the hydrogen atom
and the state 1S 1/2 + γ which contains the ground state of the atom and
one emitted photon γ. Thus, an atom initially prepared in the high-energy
state |Ψi i decays over time into two decay products: the atom in the state
|Ψf i plus a photon. This is exactly the situation discussed in the preceding
chapter: particles a, b, c from section 13.1 are analogous to our states |Ψi i,
|Ψf i, and γ, respectively. Thus, according to arguments in subsection 13.2.2,
the presence of the perturbation V3d must result in energy shifts of Ei and
$E_f$ and in broadening of the level $2P^{1/2}$. Note that the level broadening does
not apply to the lowest-energy ground state $1S^{1/2}$. This state cannot decay
spontaneously, simply because there are no lower-energy states to which
it can decay. Thus only the ground state $|1S^{1/2}\rangle$ is a true sharp-energy
stationary state of the RQD Hamiltonian $H_{e-p} + V_3^d + (V_3^d)^\dagger$.
14.1.4 Transition rate

From formula (13.66) we can obtain the probability density for the radia-
tive transition between two stationary atomic states |Ψf i and |Ψi i with the
emission of one photon
10
For simplicity, here we ignore relativistic corrections. pe and pp are the electron’s and
proton’s momentum operators, respectively, and r ≡ re − rp .
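$$ \Gamma(\mathbf{s}, \kappa) = \frac{8\pi^2s^2}{c^4}\, |\langle\Psi_i|V_3^d(\mathbf{r}, \mathbf{s}, \kappa)|\Psi_f\rangle|^2 \left. \frac{d\eta^{-1}(z)}{dz} \right|_{z=m_a} \tag{14.13} $$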
Let us now simplify this expression a bit. Using equality

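$$ [\mathbf{p}_e, H_{e-p}] = \left[ \mathbf{p}_e, -\frac{e^2}{4\pi r} \right] = -i\hbar e^2 \frac{\mathbf{r}}{4\pi r^3} \tag{14.14} $$

and denoting by $E \equiv E_i - E_f = cs$ the energy of the emitted photon we obtain
for the matrix element

$$
\begin{aligned}
\langle\Psi_i|V_3^d(\mathbf{r}, \mathbf{s}, \kappa)|\Psi_f\rangle &= \frac{i\hbar^{1/2}e^3}{m\sqrt{2(2\pi cs)^3}} \left\langle \Psi_i \left| \frac{\mathbf{r} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{4\pi r^3} \right| \Psi_f \right\rangle \\
&= \frac{e}{m\sqrt{2\hbar(2\pi E)^3}} \langle\Psi_i| H_{e-p}(\mathbf{p}_e \cdot \mathbf{e}) - (\mathbf{p}_e \cdot \mathbf{e})H_{e-p} |\Psi_f\rangle \\
&= \frac{e}{m\sqrt{2\hbar(2\pi E)^3}}\, E\, \langle\Psi_i|(\mathbf{p}_e \cdot \mathbf{e})|\Psi_f\rangle
\end{aligned}
$$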
Next we use
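$$ -\frac{im}{\hbar}[\mathbf{r}, H_{e-p}] = -\frac{im}{\hbar}[\mathbf{r}_e, H_{e-p}] + \frac{im}{\hbar}[\mathbf{r}_p, H_{e-p}] \approx \mathbf{p}_e - \frac{m}{M}\mathbf{p}_p \approx \mathbf{p}_e $$

to obtain

$$
\begin{aligned}
\langle\Psi_i|V_3^d(\mathbf{r}, \mathbf{s}, \kappa)|\Psi_f\rangle &= -\frac{ie}{\sqrt{2(2\pi\hbar E)^3}}\, E\, \langle\Psi_i|(\mathbf{r} \cdot \mathbf{e})H_{e-p} - H_{e-p}(\mathbf{r} \cdot \mathbf{e})|\Psi_f\rangle \\
&= \frac{ie}{\sqrt{2(2\pi\hbar E)^3}}\, E^2\, \langle\Psi_i|(\mathbf{r} \cdot \mathbf{e})|\Psi_f\rangle \\
&= \frac{ie\sqrt{E}}{\sqrt{2(2\pi\hbar)^3}}\, \langle\Psi_i|(\mathbf{r} \cdot \mathbf{e})|\Psi_f\rangle \qquad (14.15)
\end{aligned}
$$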
In order to find function $\eta^{-1}(z)$ in (14.13) we turn to the definition (13.37).
In the case of atomic radiative transitions considered here, one of the decay
products (the photon) is massless ($m_c = 0$), and its energy is much smaller
than rest energies of the atomic states11 $cs \ll m_ac^2 \approx m_bc^2$

$$
\begin{aligned}
z &= \eta(s) = \frac{1}{c^2}\left( \sqrt{m_b^2c^4 + c^2s^2} + cs \right) \approx \frac{1}{c^2}(m_bc^2 + cs) \\
s &= \eta^{-1}(z) \approx (z - m_b)c \\
\left. \frac{d\eta^{-1}(z)}{dz} \right|_{z=m_a} &\approx c
\end{aligned}
$$
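The quality of the linearized inverse can be gauged with a short numerical check. The numbers below are rough, illustrative values for hydrogen (rest energy of the atom and energy of the 2P to 1S photon, both in eV); they and the choice of units with c = 1 are assumptions made for this sketch, not inputs taken from the text.

```python
import math

C = 1.0          # units with c = 1 (assumption for the sketch)
M_B = 938.8e6    # rough rest energy of the hydrogen ground state, eV (illustrative)
PHOTON = 10.2    # rough 2P -> 1S photon energy, eV (illustrative)


def eta(s, m_b=M_B, c=C):
    """Total mass of the decay products: z = (sqrt(m_b^2 c^4 + c^2 s^2) + c s) / c^2."""
    return (math.sqrt(m_b**2 * c**4 + c**2 * s**2) + c * s) / c**2


def eta_inverse_approx(z, m_b=M_B, c=C):
    """Approximate inverse used in the text: s = (z - m_b) c."""
    return (z - m_b) * c


z = eta(PHOTON)
s_back = eta_inverse_approx(z)
relative_error = abs(s_back - PHOTON) / PHOTON
print(z, s_back, relative_error)
```

The neglected recoil term is of relative order cs/(2 m_b c²), which for these values is far below any precision relevant to the transition rate, so treating the derivative as the constant c is harmless.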
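Putting these results in (14.13) we obtain

$$
\begin{aligned}
\Gamma(\mathbf{s}, \kappa) &= \frac{8\pi^2E^2}{c^5} \left( \frac{e\sqrt{E}}{\sqrt{2(2\pi\hbar)^3}} \right)^2 |\langle\Psi_i|(\mathbf{r} \cdot \mathbf{e})|\Psi_f\rangle|^2 \\
&= \frac{E^3}{2\pi\hbar^3c^5}\, |\langle\Psi_i|(\mathbf{d} \cdot \mathbf{e}(\mathbf{s}, \kappa))|\Psi_f\rangle|^2
\end{aligned}
$$

where $\mathbf{d} \equiv -e\mathbf{r}$ is the atom's dipole moment operator.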
The full transition probability12 should be obtained by summing (14.13)
over two photon polarizations κ = ±1 and integrating over all possible di-
rections s/s of the emitted photon13
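$$
\begin{aligned}
\frac{1}{\tau_0} &= \frac{c^2}{\hbar} \sum_{\kappa=\pm1} \int d\Omega\, \Gamma(\mathbf{s}, \kappa) = \frac{E^3}{2\pi\hbar^4c^3} \sum_{\kappa=\pm1} \int d\Omega\, |\langle\Psi_i|(\mathbf{d} \cdot \mathbf{e}(\mathbf{s}, \kappa))|\Psi_f\rangle|^2 \\
&= \frac{E^3}{2\pi\hbar^4c^3} \sum_{\kappa=\pm1} \int d\Omega\, |(\mathbf{d}_{if} \cdot \mathbf{e}(\mathbf{s}, \kappa))|^2
\end{aligned}
$$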
For simplicity, we assume that vector dif is directed along the z-axis. Then,
with the help of (K.9) we obtain
11
In the case of the hydrogen atom we can assume, for example, a = 2P 1/2 and b =
1/2
1S .
12
This probability is related to the brightness of the corresponding spectral line. See
also section
R 19.5 in [Bal98] and section 45 in [BLP01].
13
Here dΩ denotes the integral over orientations of s, and dif ≡ hΨi |d|Ψf i is the
matrix element of the dipole moment operator calculated on eigenfunctions of atomic
states |Ψi i and |Ψf i.
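$$ \sum_{\kappa=\pm1} |(\mathbf{d}_{if} \cdot \mathbf{e}(\mathbf{s}, \kappa))|^2 = \left| \frac{d_{if}(-s_x + is_y)}{\sqrt{2}s} \right|^2 + \left| \frac{d_{if}(-s_x - is_y)}{\sqrt{2}s} \right|^2 = \frac{|d_{if}|^2(s_x^2 + s_y^2)}{s^2} $$

and the total transition rate (measured in Hz) is given by the well-known
formula

$$ \frac{1}{\tau_0} = \int_0^\pi \sin\theta\, d\theta \int_0^{2\pi} d\varphi\, \frac{|d_{if}|^2E^3(s_x^2 + s_y^2)}{2\pi\hbar^4c^3s^2} = \frac{|d_{if}|^2E^3}{\hbar^4c^3} \int_0^\pi \sin^3\theta\, d\theta = \frac{4|d_{if}|^2E^3}{3\hbar^4c^3} $$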
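The polarization sum and the angular integral that produce the factor 4/3 can be checked numerically. The sketch below uses two real (linear) polarization vectors perpendicular to the photon direction instead of the helicity vectors of (K.9); this substitution is an assumption of the sketch and relies on the fact that the sum over two orthonormal transverse polarizations does not depend on the basis. The dipole matrix element is taken as a unit vector along z.

```python
import numpy as np


def transverse_basis(s_hat):
    """Two orthonormal real vectors perpendicular to the unit vector s_hat."""
    helper = np.array([1.0, 0.0, 0.0]) if abs(s_hat[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(s_hat, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(s_hat, e1)
    return e1, e2


def polarization_sum(d, s_hat):
    """Sum of |d . e|^2 over the two transverse polarizations."""
    e1, e2 = transverse_basis(s_hat)
    return np.dot(d, e1) ** 2 + np.dot(d, e2) ** 2


d_if = np.array([0.0, 0.0, 1.0])  # unit dipole along z
n_theta, n_phi = 400, 16
d_theta, d_phi = np.pi / n_theta, 2.0 * np.pi / n_phi

# Midpoint rule over the full solid angle.
total = 0.0
for i in range(n_theta):
    theta = (i + 0.5) * d_theta
    for j in range(n_phi):
        phi = (j + 0.5) * d_phi
        s_hat = np.array([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)])
        total += polarization_sum(d_if, s_hat) * np.sin(theta) * d_theta * d_phi

rate_factor = total / (2.0 * np.pi)  # multiplies |d_if|^2 E^3 / (hbar^4 c^3)
print(total, rate_factor)
```

The solid-angle integral comes out close to 8π/3, so after division by the 2π in the prefactor the numerical coefficient approaches 4/3, in agreement with the closed-form result.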
14.1.5 Energy correction due to level instability

Let us now assume that the photon has a small mass λ. This can be modeled
by adding a “rest energy” term λ2 c4 to the (squared) photon’s energy c2 s2 .
Then interaction (14.11) can be rewritten as
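$$ V_3^d(\mathbf{r}, \mathbf{s}, \kappa) = -\frac{i\hbar^{1/2}e^3\, \mathbf{r} \cdot \mathbf{e}(\mathbf{s}, \kappa)}{4\pi m\sqrt{2(2\pi)^3}\, (\lambda^2c^4 + c^2s^2)^{3/4}\, r^3} \tag{14.16} $$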
Consider state |ni of the hydrogen atom with energy En . Perturbation

(14.16) changes the energy of this state by the amount ∆En , which can
be calculated by the perturbation theory formula14
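$$
\begin{aligned}
\Delta E_n &= \int d\mathbf{s} \sum_l \sum_\kappa \frac{\langle n|V_3^d|l; \mathbf{s}, \kappa\rangle \langle l; \mathbf{s}, \kappa|V_3^d|n\rangle}{E_n - E_l - cs} \\
&= \frac{\hbar e^6}{2m^2(2\pi)^3} \sum_l \sum_{\kappa=\pm1} \int d\mathbf{s} \left\langle n \left| \frac{(\mathbf{r} \cdot \mathbf{e}(\mathbf{s}, \kappa))}{4\pi r^3} \right| l; \mathbf{s}, \kappa \right\rangle \left\langle l; \mathbf{s}, \kappa \left| \frac{(\mathbf{r} \cdot \mathbf{e}(\mathbf{s}, \kappa))}{4\pi r^3} \right| n \right\rangle \times \\
&\quad \frac{1}{(\lambda^2c^4 + c^2s^2)^{3/2}(E_n - E_l - cs)}
\end{aligned}
$$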
14
See equation (10.70) in [Bal98]. This energy shift can be compared with the mass shift
P(mA ) in formula (13.59).
where |l; s, κi ≡ |li|s, κi is a basis state, which has the atom in a stationary
state |li15 and a free photon in the state |s, κi with momentum s and helicity
κ. From equality (K.13)
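$$ \sum_{\kappa=\pm1} e_i(\mathbf{s}, \kappa)e_j(\mathbf{s}, \kappa) = \delta_{ij} - \frac{s_is_j}{s^2} $$

we obtain

$$ \Delta E_n = \frac{\hbar e^6}{2m^2(2\pi)^3} \sum_{lij} \int d\mathbf{s} \left( \delta_{ij} - \frac{s_is_j}{s^2} \right) \left\langle n \left| \frac{r_i}{4\pi r^3} \right| l \right\rangle \left\langle l \left| \frac{r_j}{4\pi r^3} \right| n \right\rangle \times \frac{1}{(\lambda^2c^4 + c^2s^2)^{3/2}(E_n - E_l - cs)} $$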
Let us now set Enl ≡ En − El and
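$$ I_{nl} \equiv \int_0^\infty \frac{s^2\, ds}{(\lambda^2c^4 + c^2s^2)^{3/2}(E_{nl} - cs)} $$

Then

$$
\begin{aligned}
\int d\mathbf{s}\, \frac{s_is_j}{s^2(\lambda^2c^4 + c^2s^2)^{3/2}(E_{nl} - cs)} &= \delta_{ij} \int d\mathbf{s}\, \frac{s_z^2}{s^2(\lambda^2c^4 + c^2s^2)^{3/2}(E_{nl} - cs)} \\
&= 2\pi\delta_{ij} \int_0^\pi \sin\theta\, d\theta \int_0^\infty ds\, \frac{s^2\cos^2\theta}{(\lambda^2c^4 + c^2s^2)^{3/2}(E_{nl} - cs)} \\
&= 2\pi\delta_{ij} \int_{-1}^{1} dt \int_0^\infty ds\, \frac{s^2t^2}{(\lambda^2c^4 + c^2s^2)^{3/2}(E_{nl} - cs)} = \frac{4\pi\delta_{ij}}{3} I_{nl}
\end{aligned}
$$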
and we can write

15
|li is an eigenstate of the non-perturbed 2nd order Hamiltonian He−p (14.12) with
eigenvalue El .
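$$
\begin{aligned}
\Delta E_n &\approx \frac{\hbar e^6}{2m^2(2\pi)^3} \sum_{lij} \left\langle n \left| \frac{r_i}{4\pi r^3} \right| l \right\rangle \left\langle l \left| \frac{r_j}{4\pi r^3} \right| n \right\rangle \left( 4\pi\delta_{ij}I_{nl} - \frac{4\pi\delta_{ij}}{3}I_{nl} \right) \\
&= \frac{4\pi\hbar e^6}{3m^2(2\pi)^3} \sum_{li} \left\langle n \left| \frac{r_i}{4\pi r^3} \right| l \right\rangle \left\langle l \left| \frac{r_i}{4\pi r^3} \right| n \right\rangle I_{nl}
\end{aligned}
$$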
Integral Inl is calculated as follows
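$$
\begin{aligned}
I_{nl} &= \frac{1}{c^3} \left[ \frac{\lambda^2c^4 - E_{nl}cs}{(\lambda^2c^4 + E_{nl}^2)\sqrt{\lambda^2c^4 + c^2s^2}} + \frac{E_{nl}^2}{(\lambda^2c^4 + E_{nl}^2)^{3/2}} \times \right. \\
&\quad \left. \ln\left( \frac{\sqrt{\lambda^2c^4 + E_{nl}^2}\sqrt{\lambda^2c^4 + c^2s^2} + \lambda^2c^4 + E_{nl}cs}{E_{nl} - cs} \right) \right]_{s=0}^{s=\infty} \\
&\approx \frac{1}{c^3} \left[ -\frac{1}{E_{nl}} + \frac{1}{|E_{nl}|} \ln\left( -(|E_{nl}| + \lambda^2c^4|E_{nl}|^{-1}/2) - E_{nl} \right) - \frac{1}{|E_{nl}|} \ln\left( \frac{|E_{nl}|\lambda c^2}{E_{nl}} \right) \right] \\
&\approx \frac{1}{c^3|E_{nl}|} \ln\left( \frac{-[|E_{nl}| + \lambda^2c^4|E_{nl}|^{-1}/2 + E_{nl}]E_{nl}}{|E_{nl}|\lambda c^2} \right)
\end{aligned}
$$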
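If $E_{nl} > 0$

$$ I_{nl} = \frac{1}{c^3E_{nl}} \ln\left( \frac{-2E_{nl}}{\lambda c^2} \right) $$

If $E_{nl} < 0$

$$ I_{nl} = -\frac{1}{c^3E_{nl}} \ln\left( \frac{[\lambda^2c^4E_{nl}^{-1}/2]E_{nl}}{E_{nl}\lambda c^2} \right) = \frac{1}{c^3E_{nl}} \ln\left( \frac{2E_{nl}}{\lambda c^2} \right) $$

So, taking into account that $\ln(-1) \ll \ln(2|E_{nl}|/(\lambda c^2))$, we can write for all
values of $E_{nl}$

$$ I_{nl} \approx \frac{1}{c^3E_{nl}} \ln\left( \frac{2|E_{nl}|}{\lambda c^2} \right) $$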
Then

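$$ \Delta E_n \approx \frac{4\pi\hbar e^6}{3m^2(2\pi)^3c^3} \sum_{li} \left| \left\langle n \left| \frac{r_i}{4\pi r^3} \right| l \right\rangle \right|^2 \frac{1}{E_{nl}} \ln\left( \frac{2|E_{nl}|}{\lambda c^2} \right) $$

Next we use equation (14.14)

$$
\begin{aligned}
\frac{r_i}{4\pi r^3} &= -\frac{1}{i\hbar e^2}[p_i, H_{e-p}] \\
\left| \left\langle n \left| \frac{r_i}{4\pi r^3} \right| l \right\rangle \right|^2 &= -\frac{1}{\hbar^2e^4} \langle n|(p_iH_{e-p} - H_{e-p}p_i)|l\rangle \langle l|(p_iH_{e-p} - H_{e-p}p_i)|n\rangle \\
&= \frac{E_{nl}^2}{\hbar^2e^4} \langle n|p_i|l\rangle \langle l|p_i|n\rangle
\end{aligned}
$$

to obtain

$$ \Delta E_n \approx -\frac{e^2}{6m^2\pi^2\hbar c^3} \sum_{li} |\langle n|p_i|l\rangle|^2 E_{nl} \ln\left( \frac{\lambda c^2}{2|E_{nl}|} \right) $$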
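Our next step is to use the so-called Bethe logarithm $\bar{E}_n$ defined for s-states
as16

$$ \sum_{li} |\langle n|p_i|l\rangle|^2 E_{nl} \ln\left( \frac{\lambda c^2}{2|E_{nl}|} \right) \equiv \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \sum_{li} |\langle n|p_i|l\rangle|^2 E_{nl} $$

Then17

$$
\begin{aligned}
\Delta E_n &\approx -\frac{2\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \sum_l |\langle n|p_i|l\rangle|^2 E_{nl} \\
&= -\frac{\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \sum_l \Big( \langle n|(H_{e-p}p_i - p_iH_{e-p})|l\rangle \langle l|p_i|n\rangle - \langle n|p_i|l\rangle \langle l|(H_{e-p}p_i - p_iH_{e-p})|n\rangle \Big) \\
&= -\frac{\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \Big( \langle n|(H_{e-p}p_i - p_iH_{e-p})p_i|n\rangle - \langle n|p_i(H_{e-p}p_i - p_iH_{e-p})|n\rangle \Big) \\
&= -\frac{\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \langle n|([H_{e-p}, p_i]p_i - p_i[H_{e-p}, p_i])|n\rangle \\
&= -\frac{\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \langle n|[p_i, [p_i, H_{e-p}]]|n\rangle \\
&= -\frac{\hbar^2\alpha}{3\pi m^2c^2} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \left\langle n \left| \frac{\partial^2}{\partial r^2} \frac{e^2}{4\pi r} \right| n \right\rangle \\
&= \frac{4\hbar^3\alpha^2}{3m^2c} \ln\left( \frac{\lambda c^2}{2\bar{E}_n} \right) \langle n|\delta(\mathbf{r})|n\rangle \qquad (14.17)
\end{aligned}
$$

16
See, e.g., formulas (8.87) in [BD64] and (14.3.51) in [Wei95].
17
here we used equation (B.2)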
This energy correction affects only spherically-symmetric S-states of the
atom. From formula (14.17) and the well-known result18 $2\bar{E}_{2S^{1/2}} = 16.64mc^2\alpha^2$
we can calculate the effect of spontaneous emission on the hydrogen $2S^{1/2}$
state's energy

$$ \Delta E^{se}_{2S^{1/2}} = \frac{4\hbar^3\alpha^2}{3m^2c} \ln\left( \frac{\lambda}{16.64m\alpha^2} \right) |\psi_{2S^{1/2}}(0)|^2 = \frac{mc^2\alpha^5}{6\pi} \ln\left( \frac{\lambda}{16.64m\alpha^2} \right) \tag{14.18} $$
It is rather shocking that in the limit of zero photon mass (λ → 0) this

energy shift becomes infinite. This is an example of infrared divergence that
will be discussed in greater detail in the next section.
14.2 Radiative corrections

In the preceding section we got a disturbing result: the instability of atomic
levels with respect to spontaneous photon emission has resulted in divergent
energy shifts (14.18) in the 3rd perturbation order. This problem can be
solved by taking into account even higher perturbation orders. In this section
we are going to derive 4th order radiative corrections to the electron-proton
interaction potential by using formula (14.2). In particular, we will see that
this potential contains an infrared divergence, which cancels the infinite level
18
See section 14.3 in [Wei95].
shift (14.18) and leads to a finite energy spectrum of the hydrogen atom,
which compares favorably with experiment. Here, within our dressed particle
approach, we are going to reproduce such classic results of renormalized QED
as the electron’s anomalous magnetic moment and the Lamb shifts.
14.2.1 Product term in (14.2)

In this subsection we would like to calculate the 4th order electron-proton
interaction potential V4d via formula (14.2). The first term on the right hand
side of this formula can be obtained from (10.64) by dropping the energy
delta function and dividing by −2πi
δ 4 (q̃ − q̃ ′ − p̃′ + p̃)δτ τ ′

(Σc4 )ph ≈ ×
−2πi
h α2 α2 λ imc2 α2 k2
− δσσ′ + − 3 2 ln δσσ′ − 2 2 ln − 2 2 δσσ′
30π 3 m2 c 6π m c m 2π qk λ c
2 †
iα χσ (~σel · [k × q])χσ′ i
+ (14.19)
8π 3 m2 ck 2
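The only yet unknown part on the right hand side of (14.2) is the product
term $-V_2^d\underline{V_2^d}$. Here $V_2^d$ is the usual non-relativistic Coulomb potential19

$$ V_2^d(t) = -\frac{e^2\hbar^2}{(2\pi\hbar)^3} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, \frac{\delta(\mathbf{p} + \mathbf{q} - \mathbf{p}' - \mathbf{q}')\, e^{\frac{i}{\hbar}(\Omega_{\mathbf{p}} + \omega_{\mathbf{q}} - \Omega_{\mathbf{p}'} - \omega_{\mathbf{q}'})t}}{(\mathbf{q} - \mathbf{q}')^2 + \lambda^2c^2}\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{q}'} $$

and $\underline{V_2^d}$ is obtained with the help of (8.61)

$$ \underline{V_2^d(t)} = \frac{e^2\hbar^2}{(2\pi\hbar)^3} \int d\mathbf{t}\, d\mathbf{s}\, d\mathbf{t}'\, d\mathbf{s}'\, \frac{\delta(\mathbf{t} + \mathbf{s} - \mathbf{t}' - \mathbf{s}')\, e^{\frac{i}{\hbar}(\Omega_{\mathbf{t}} + \omega_{\mathbf{s}} - \Omega_{\mathbf{t}'} - \omega_{\mathbf{s}'})t}}{[(\mathbf{s} - \mathbf{s}')^2 + \lambda^2c^2][\omega_{\mathbf{s}} - \omega_{\mathbf{s}'} + \Omega_{\mathbf{t}} - \Omega_{\mathbf{t}'}]}\, d^\dagger_{\mathbf{t}} a^\dagger_{\mathbf{s}} d_{\mathbf{t}'} a_{\mathbf{s}'} $$

19
This is the leading term in the momentum representation of the Darwin-Breit
interaction (12.7) with a screening provided by the photon mass λ. Here we are interested
only in the dominant infrared-divergent term in the product $V_2^d\underline{V_2^d}$, so we will work in
the non-relativistic approximation from Appendix J.9 and thus assume that momenta of
interacting particles are much less than mc. Accordingly, we will omit all terms
containing positive powers of p̃, q̃ and/or k̃. In this approximation the product $V_2^d\underline{V_2^d}$ becomes
spin-independent. So, we disregard spins and drop spin labels of particle states. The same
approximations will be employed for other infrared-divergent terms in this chapter.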
Using
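$$
\begin{aligned}
& d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{q}'} d^\dagger_{\mathbf{t}} a^\dagger_{\mathbf{s}} d_{\mathbf{t}'} a_{\mathbf{s}'} \\
&= (d^\dagger_{\mathbf{p}} d_{\mathbf{p}'} d^\dagger_{\mathbf{t}} d_{\mathbf{t}'})(a^\dagger_{\mathbf{q}} a_{\mathbf{q}'} a^\dagger_{\mathbf{s}} a_{\mathbf{s}'}) \\
&= (-d^\dagger_{\mathbf{p}} d^\dagger_{\mathbf{t}} d_{\mathbf{p}'} d_{\mathbf{t}'} + d^\dagger_{\mathbf{p}} d_{\mathbf{t}'} \delta(\mathbf{t} - \mathbf{p}'))(-a^\dagger_{\mathbf{q}} a^\dagger_{\mathbf{s}} a_{\mathbf{q}'} a_{\mathbf{s}'} + a^\dagger_{\mathbf{q}} a_{\mathbf{s}'} \delta(\mathbf{s} - \mathbf{q}')) \\
&= d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{t}'} a_{\mathbf{s}'} \delta(\mathbf{t} - \mathbf{p}')\delta(\mathbf{s} - \mathbf{q}') + \ldots \qquad (14.20)
\end{aligned}
$$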
we obtain20
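$$
\begin{aligned}
& -V_2^d(t)\underline{V_2^d(t)} \\
&= \frac{e^4\hbar^4}{(2\pi\hbar)^6} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\, d\mathbf{t}\, d\mathbf{s}'\, d\mathbf{t}'\; d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{q}'} d^\dagger_{\mathbf{t}} a^\dagger_{\mathbf{s}} d_{\mathbf{t}'} a_{\mathbf{s}'} \times \\
&\quad \frac{\delta(\mathbf{p} + \mathbf{q} - \mathbf{p}' - \mathbf{q}')\delta(\mathbf{s} + \mathbf{t} - \mathbf{s}' - \mathbf{t}')}{[(\mathbf{q} - \mathbf{q}')^2 + \lambda^2c^2][\omega_{\mathbf{s}} - \omega_{\mathbf{s}'} + \Omega_{\mathbf{t}} - \Omega_{\mathbf{t}'}][(\mathbf{s} - \mathbf{s}')^2 + \lambda^2c^2]} \qquad (14.21) \\
&= \frac{e^4\hbar^4}{(2\pi\hbar)^6} \int d\mathbf{p}\, d\mathbf{q}\, d\mathbf{p}'\, d\mathbf{q}'\, d\mathbf{s}\, d\mathbf{t}\, d\mathbf{s}'\, d\mathbf{t}'\; d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{t}'} a_{\mathbf{s}'} \times \\
&\quad \frac{\delta(\mathbf{p} + \mathbf{q} - \mathbf{p}' - \mathbf{q}')\delta(\mathbf{s} + \mathbf{t} - \mathbf{s}' - \mathbf{t}')\delta(\mathbf{q}' - \mathbf{s})\delta(\mathbf{p}' - \mathbf{t})}{[(\mathbf{q} - \mathbf{q}')^2 + \lambda^2c^2][\omega_{\mathbf{s}} - \omega_{\mathbf{s}'} + \Omega_{\mathbf{t}} - \Omega_{\mathbf{t}'}][(\mathbf{s} - \mathbf{s}')^2 + \lambda^2c^2]} \\
&= \frac{e^4\hbar^4}{(2\pi\hbar)^6} \int d\mathbf{q}\, d\mathbf{p}\, d\mathbf{p}'\, d\mathbf{q}'\, \delta(\mathbf{q} + \mathbf{p} - \mathbf{p}' - \mathbf{q}')\, d^\dagger_{\mathbf{p}} a^\dagger_{\mathbf{q}} d_{\mathbf{p}'} a_{\mathbf{q}'} \times \\
&\quad \int \frac{d\mathbf{s}}{[(\mathbf{q} - \mathbf{s})^2 + \lambda^2c^2][\omega_{\mathbf{s}} - \omega_{\mathbf{q}'} + \Omega_{\mathbf{q}+\mathbf{p}-\mathbf{s}} - \Omega_{\mathbf{p}'}][(\mathbf{s} - \mathbf{q}')^2 + \lambda^2c^2]}
\end{aligned}
$$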
Next we introduce non-relativistic approximations ωq ≈ mc2 + q 2 /(2m) and

Ωp ≈ Ωp′ ≈ Mc2 . We also choose the center-of-mass frame, where the heavy
20
For definition of the underbrace sign see (8.54). As we have mentioned in subsection
11.2.1, operators V2d and V2d are well-defined only on the energy shell. However, momentum
integrations in (14.21) include regions outside the energy shell, where the integrands are not
known precisely. Nevertheless, this uncertainty does not seem to be significant, because
here we are interested only in the leading infrared-divergent contribution, which comes
from integration in a small region near the k ≡ q′ − q = 0 singularity located on the
energy shell.
proton remains motionless, and the electron scattering is elastic: q = q ′ .21

Z
e4 ~4 ds 1
6 2 2 2 ′ 2 2 2
·
(2π~) [(q − s) + λ c ][(s − q ) + λ c ] ωs − ωq′
4 4 Z
2e ~ m ds
≈
(2π~) 6 [(q − s) + λ c ][s − q 2 − iµ][(q′ − s)2 + λ2 c2 ]
2 2 2 2
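This integral is calculated in equation (M.53) in Appendix M.7. So, the
coefficient function of the operator $-V_2^d(t)\underbrace{V_2^d(t)}$ is

$$\frac{\alpha^2 mc^2}{(-2\pi i)\pi qk^2}\ln\frac{k^2}{\lambda^2c^2} \tag{14.22}$$

Adding this term to (14.19), we see that it cancels the third term in square
brackets there, i.e., the contribution from ladder and crossed ladder diagrams.22
Then for the left hand side of (14.2) we obtain

$$\langle 0|a_{\mathbf{q},\sigma}d_{\mathbf{p},\tau}V_4^d\,d^\dagger_{\mathbf{p}',\tau'}a^\dagger_{\mathbf{q}',\sigma'}|0\rangle
\approx \frac{\delta_{\tau\tau'}}{-2\pi i}\left[\frac{i\alpha^2}{15\pi^2m^2c}\delta_{\sigma\sigma'} - \frac{\alpha^2\chi^\dagger_\sigma(\vec\sigma_{el}\cdot[\mathbf{k}\times\mathbf{q}])\chi_{\sigma'}}{4\pi^2m^2ck^2} + \frac{i\alpha^2}{3\pi^2m^2c}\ln\frac{\lambda}{m}\,\delta_{\sigma\sigma'}\right] \tag{14.23}$$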
14.2.2 Radiative corrections to the Coulomb potential

Equation (14.23) gives us the 4th order interaction V4d only on the energy
shell. Outside the energy shell we can adopt the usual assumption about
the near constancy of the coefficient function.23 Then the momentum-space
coefficient function of the operator V4d has three components
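$$v_4^d(\mathbf{p},\mathbf{q},\mathbf{k};\tau,\sigma,\tau',\sigma')
= -\frac{\alpha^2\delta_{\sigma,\sigma'}\delta_{\tau,\tau'}}{30\pi^3m^2c}
- \frac{\alpha}{\pi}\,\frac{e^2\hbar^2}{(2\pi\hbar)^3}\,\chi^\dagger_\sigma\frac{i\vec\sigma_{el}\cdot[\mathbf{k}\times\mathbf{q}]}{4m^2c^2k^2}\chi_{\sigma'}\,\delta_{\tau,\tau'}
- \frac{\alpha^2\delta_{\sigma,\sigma'}\delta_{\tau,\tau'}}{6\pi^3m^2c}\ln\frac{\lambda}{m}$$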
21
Here µ is a small positive constant, which should be taken to 0 at the end of calcula-
tions.
22
This observation justifies the omission of contributions (14.22) and (10.63) in most
textbook calculations of the Lamb shift.
23
See condition ζi ≈ 1 in subsections 11.1.6 - 11.1.7 and discussion in subsection 11.2.1.
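The corresponding position-space potential is obtained using the Fourier
transform with respect to the transferred momentum $\mathbf{k}$, formulas from Appendix B,
and $\mathbf{S}_{el} = \hbar\vec\sigma_{el}/2$.

$$V_4^d(\mathbf{q},\mathbf{r},\mathbf{S}_{el}) = -\frac{8\hbar^3\alpha^2}{30m^2c}\delta(\mathbf{r}) + \frac{e^2[\mathbf{r}\times\mathbf{q}]\cdot\mathbf{S}_{el}}{8\pi m^2c^2r^3}\,\frac{\alpha}{\pi} - \frac{4\hbar^3\alpha^2}{3m^2c}\ln\frac{\lambda}{m}\,\delta(\mathbf{r}) \tag{14.24}$$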
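In a theory of electron-proton interaction valid to the 4th perturbation order
this expression must be added to the 2nd order interaction operator (12.9),
thus leading to the following complete potential that depends on the position,
momentum, and spin of the electron

$$V_{2+4}^d(\mathbf{q},\mathbf{r},\mathbf{S}_{el}) = -\frac{e^2}{4\pi r} + \frac{e^2[\mathbf{r}\times\mathbf{q}]\cdot\mathbf{S}_{el}}{8\pi m^2c^2r^3}\left(1+\frac{\alpha}{\pi}\right) + \frac{e^2\hbar^2}{8c^2m^2}\left(1-\frac{8\alpha}{15\pi}\right)\delta(\mathbf{r}) - \frac{4\hbar^3\alpha^2}{3m^2c}\ln\frac{\lambda}{m}\,\delta(\mathbf{r}) \tag{14.25}$$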
The 1st term is the usual Coulomb potential. The 2nd term describes the
spin-orbit interaction of the electron’s spin with its own momentum. The
3rd and 4th terms are contact potentials. The latter one is rather troubling:
it diverges in the limit of zero photon mass λ → 0. From the point of view
of classical electrodynamics,24 this is not a reason for concern, because such
short-range potentials do not affect macroscopic dynamics of classical point-
like charges. However, this potential has infinite effect on eigenstates and
eigenvalues of quantum compound systems, e.g., the hydrogen atom. How
are we going to solve this problem?
14.2.3 Lamb shift

As we saw in (12.26), in the 2nd order theory hydrogen levels $2S_{1/2}$ and $2P_{1/2}$
have exactly the same energies. However, in 1947 Lamb and Retherford found
a small gap between the two levels, which is now known as the Lamb shift.
The modern experimental value for the Lamb shift is [WHSK+ 95]

$$\varepsilon_{2S_{1/2}} - \varepsilon_{2P_{1/2}} = 4.37\times10^{-6}\ \text{eV} = 2\pi\hbar\times1057.8\ \text{MHz} \tag{14.26}$$

24
See chapter 15.
The presence of the Lamb shift was completely unexpected from the point
of view of the quantum theory available at the time. Attempts to explain
this effect played an important role in the development of quantum elec-
trodynamics. Successful calculation of the shift value (14.26) was a major
triumph of QED. Here we are going to calculate the Lamb shift within our
dressed particle approach.
The effects of the 4th order potentials (14.24) on energies of the 2S 1/2
and 2P 1/2 hydrogen states can be calculated using perturbation theory, as in
section 12.2. The results are collected in Table 14.1.
The short-range contact potential in the 1st term in (14.24) shifts only
the s-state, whose wave function is non-zero at the origin

$$\Delta\varepsilon^{contact}_{2S_{1/2}} = -\frac{8\hbar^3\alpha^2}{30m^2c}|\psi_{2S}(0)|^2 = -\frac{mc^2\alpha^5}{30\pi}$$
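Table 14.1: 4th order perturbative energy corrections to the $2S_{1/2}$ and $2P_{1/2}$
energy levels of the hydrogen atom. The potentials are the corresponding terms of (14.24).

| Contribution | Potential | $\Delta\varepsilon_{2S_{1/2}}$ | $\Delta\varepsilon_{2P_{1/2}}$ |
|---|---|---|---|
| contact (14.24) | $-\frac{8\hbar^3\alpha^2}{30m^2c}\delta(\mathbf{r})$ | $-\frac{mc^2\alpha^5}{30\pi}$ | 0 |
| vertex (14.24) | $-\frac{4\hbar^3\alpha^2}{3m^2c}\ln\frac{\lambda}{m}\,\delta(\mathbf{r})$ | $-\frac{mc^2\alpha^5}{6\pi}\ln\frac{\lambda}{m}$ | 0 |
| emission (14.18) | | $\frac{mc^2\alpha^5}{6\pi}\ln\frac{\lambda}{16.64\alpha^2m}$ | 0 |
| spin-orbit (14.24) | $\frac{e^2\alpha[\mathbf{r}\times\mathbf{q}]\cdot\mathbf{S}_{el}}{8\pi^2m^2c^2r^3}$ | 0 | $-\frac{mc^2\alpha^5}{48\pi}$ |
| Total correction | | $\frac{mc^2\alpha^5}{6\pi}\left(-\frac{1}{5}-\ln(16.64\alpha^2)\right)$ | $-\frac{mc^2\alpha^5}{48\pi}$ |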
As we mentioned in the preceding subsection, the energy correction due to
the last term in (14.24)

$$\Delta\varepsilon^{vertex}_{2S_{1/2}} = -\frac{4\hbar^3\alpha^2}{3m^2c}\ln\frac{\lambda}{m}\,|\psi_{2S}(0)|^2 = -\frac{mc^2\alpha^5}{6\pi}\ln\frac{\lambda}{m} \tag{14.27}$$

diverges in the limit λ → 0, which seems to be unphysical. Luckily, there
is another interaction that cancels this divergence. This is the 3rd order
bremsstrahlung potential25 (14.11), which induces the energy shift (14.18)

$$\Delta\varepsilon^{emission}_{2S_{1/2}} = \frac{mc^2\alpha^5}{6\pi}\ln\frac{\lambda}{16.64\alpha^2m}$$
So, the total energy correction becomes finite. The only interaction affecting
the 2P 1/2 level is the 2nd term in (14.24). The corresponding energy shift
can be evaluated by the method from subsection 12.2.3.
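Finally, within our approximations, the full 4th order contribution to the
Lamb shift

$$\begin{aligned}
\varepsilon_{2S_{1/2}} - \varepsilon_{2P_{1/2}}
&= \Delta\varepsilon^{contact}_{2S_{1/2}} + \Delta\varepsilon^{emission}_{2S_{1/2}} + \Delta\varepsilon^{vertex}_{2S_{1/2}} - \Delta\varepsilon^{spin-orbit}_{2P_{1/2}} \\
&= \frac{mc^2\alpha^5}{6\pi}\left(-\frac{3}{40} - \ln(16.64\alpha^2)\right) = 3.91\times10^{-6}\ \text{eV} \\
&= 2\pi\hbar\times945\ \text{MHz}
\end{aligned}$$

is in good agreement with the experimental value (14.26). Relative positions
of lowest energy levels of the hydrogen atom are shown in Fig. 12.1.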
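The arithmetic behind this estimate can be retraced with a short script. This is only an illustrative sketch: the numerical values of α, mc² and 2πħ are assumed inputs that are not quoted in the text, and the names `lamb_shift_ev`, `shift_ev` and `shift_mhz` are introduced here for illustration. The script adds the contact, vertex and emission shifts of the 2S level, subtracts the spin-orbit shift of the 2P level, and converts the result to a frequency.

```python
import math

# Assumed input constants (not quoted in the text above).
ALPHA = 1 / 137.035999       # fine-structure constant
MC2_EV = 510998.95           # electron rest energy m c^2, in eV
H_EV_S = 4.135667696e-15     # Planck constant 2*pi*hbar, in eV*s


def lamb_shift_ev(alpha=ALPHA, mc2_ev=MC2_EV, k0=16.64):
    """4th order 2S_1/2 - 2P_1/2 splitting assembled from the four shifts."""
    unit = mc2_ev * alpha**5 / (6 * math.pi)
    contact_2s = -unit / 5                     # -m c^2 alpha^5 / (30 pi)
    # Vertex (14.27) plus emission: the ln(lambda/m) pieces cancel exactly,
    # leaving -unit * ln(k0 * alpha^2).
    vertex_plus_emission_2s = -unit * math.log(k0 * alpha**2)
    spin_orbit_2p = -unit / 8                  # -m c^2 alpha^5 / (48 pi)
    return contact_2s + vertex_plus_emission_2s - spin_orbit_2p


shift_ev = lamb_shift_ev()
shift_mhz = shift_ev / H_EV_S / 1e6            # energy -> frequency in MHz
measured_ev = 1057.8e6 * H_EV_S                # experimental value (14.26)

print(f"theory:     {shift_ev:.3e} eV = {shift_mhz:.0f} MHz")
print(f"experiment: {measured_ev:.3e} eV = 1057.8 MHz")
```

With these assumed constants the computed splitting comes out within about one percent of the quoted 3.91 × 10⁻⁶ eV; the small residual depends on the precision of the inputs.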
Let us stress once again that in RQD we do not assume the existence of
virtual particles or non-trivial vacuum. Therefore, we explain the high-order
effects entirely in terms of small corrections to inter-particle potentials with-
out any reference to “virtual particle exchanges”, “vacuum polarization”, and
other field-theoretical terminology. In the literature one can find a number of
similar “effective” particle approaches [Hol04, PS98a, PS98b, GR80, GRI89,
FS88], which use inter-particle potentials with radiative corrections.
25
This interaction couples the “electron+proton” subspace of the Fock space with the
“electron+proton+photon” subspace. So, strictly speaking, this is not a true electron-
proton potential.
14.2.4 Electron’s anomalous magnetic moment

As we discussed in chapter 10, renormalization has no effect on the electron’s
charge, because this is forbidden by the charge renormalization condition
postulate 10.2. However, there is another electron’s property - the magnetic
moment - which is not restricted by any postulate. The effect of renormaliza-
tion on the electron’s magnetic moment was first calculated by Schwinger in
1948. This was a major triumph of the renormalized QED. In this subsection
we are going to reproduce this result within our dressed particle approach.
The electron’s magnetic moment manifests itself by the electron’s dy-
namics in external “magnetic fields”.26 In our electron-proton system, in
principle, the proton can play the role of a source of such a “magnetic field”.
Unfortunately, so far we assumed that the proton’s mass is infinite and that
this particle is motionless. Thus, in our approximation we have lost the effect
of the proton’s magnetic field. In order to have a model of electron-magnet
interaction, let us consider the effect of a finite proton mass M < ∞. In
the 2nd perturbation order the relevant potential was obtained in equation
(12.14)
$$V_{s-o} = -\frac{e^2[\mathbf{r}\times\mathbf{p}]\cdot\mathbf{S}_{el}}{4\pi Mmc^2r^3} \tag{14.28}$$

where $\mathbf{p}$ is the proton's momentum. It is customary to define the electron's
magnetic moment and its interaction with a moving charge e by formulas27

$$\vec\mu_{el} \equiv -\frac{ge\mathbf{S}_{el}}{2mc}$$

$$V_{s-o} = \frac{e[\mathbf{r}\times\mathbf{p}]\cdot\vec\mu_{el}}{4\pi Mcr^3}$$

where g is the so-called gyromagnetic ratio or simply the g-factor. Thus,
comparing (14.28) with these two definitions, we conclude that in the 2nd perturbation
order g = 2. In higher orders we expect some corrections to this value. Here
we will be interested in the 4th order correction $\beta_4$, such that

$$g = 2(1+\beta_4) \tag{14.29}$$

26
Experimental manifestations of particle magnetic moments will be discussed in more
detail in chapter 15.
27
see equation (11.100) in [Jac99]
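Recall that interaction (14.28) has resulted from the momentum-space
coefficient function in (12.7)

$$v_2^{s-o}(\mathbf{p},\mathbf{q},\mathbf{k};\pi,\epsilon,\pi',\epsilon') = \frac{ie^2\hbar^2\delta_{\pi,\pi'}}{(2\pi\hbar)^3}\,\chi^{(el)\dagger}_{\epsilon}\frac{\vec\sigma_{el}\cdot[\mathbf{k}\times\mathbf{p}]}{2Mmc^2k^2}\chi^{(el)}_{\epsilon'} \tag{14.30}$$

Now our plan is to find 4th order terms with a similar structure. One can
easily see that the only relevant S-matrix contribution is (10.48). Retaining
only the leading first term in the square bracket there, we obtain

$$s_4^{s-o}(\mathbf{p},\mathbf{q},\mathbf{k};\pi,\epsilon,\pi',\epsilon') \approx -\frac{ic\alpha^2}{2\pi^2k^2}\,\mathbf{U}(\mathbf{q}+\mathbf{k},\epsilon;\mathbf{q},\epsilon')\cdot\mathbf{W}(\mathbf{p}-\mathbf{k},\pi;\mathbf{p},\pi')$$

To obtain the coefficient function of the corresponding potential we just need
to use (J.68), (J.69), and multiply the result by the usual factor 1/(−2πi)

$$\begin{aligned}
v_4^{s-o} &= \frac{ic\alpha^2\delta_{\pi,\pi'}}{(2\pi i)2\pi^2k^2}\,\chi^{(el)\dagger}_{\epsilon}\frac{i[\vec\sigma_{el}\times\mathbf{k}]}{2mc}\cdot\frac{2\mathbf{p}-\mathbf{k}}{2Mc}\chi^{(el)}_{\epsilon'} \\
&= \frac{ie^2\hbar^2\delta_{\pi,\pi'}}{(2\pi\hbar)^3}\,\frac{\alpha}{2\pi}\,\chi^{(el)\dagger}_{\epsilon}\frac{\vec\sigma_{el}\cdot[\mathbf{k}\times\mathbf{p}]}{2Mmc^2k^2}\chi^{(el)}_{\epsilon'}
\end{aligned}$$

This 4th order matrix element differs from the similar 2nd order matrix
element (14.30) only by the factor α/(2π), which is the desired value of $\beta_4$
in (14.29). So, in our approximation, the g-factor is

$$g = 2\left(1+\frac{\alpha}{2\pi}\right) \approx 2.0023 \tag{14.31}$$

which is the standard 4th order QED result.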
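As a quick numerical illustration of (14.29) and (14.31), the following sketch evaluates the g-factor. The value of the fine-structure constant is an assumed input, and the function name `g_factor_4th_order` is hypothetical.

```python
import math

ALPHA = 1 / 137.035999  # fine-structure constant (assumed value)


def g_factor_4th_order(alpha=ALPHA):
    """g = 2 (1 + beta_4) with beta_4 = alpha / (2 pi), as in (14.29) and (14.31)."""
    beta_4 = alpha / (2 * math.pi)
    return 2 * (1 + beta_4)


g = g_factor_4th_order()
print(f"g = {g:.4f}")
```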
Chapter 15
CLASSICAL
ELECTRODYNAMICS
All of physics is either impossible or trivial. It is impossible until

you understand it and then it becomes trivial.
Ernest Rutherford
So far in this book we were concerned with quantum mechanical descrip-

tion of electromagnetic phenomena. We focused on scattering, decays, and
bound states of systems of charges and achieved a good agreement with ex-
periments in the 4th order of perturbation theory. We can expect even better
accuracy by extending our dressed particle approach to higher perturbation
orders. We are not going to do that in our book. Instead, we will consider
another broad class of electromagnetic phenomena, namely the dynamics of
macroscopic charges in the classical limit.
Classical theory of electromagnetic phenomena was formulated a century-
and-a-half ago. It was based on a set of equations, which were designed by
Maxwell as a theoretical generalization for a large number of experiments,
in particular those performed by Faraday. By and large, this theory enjoyed
a good agreement with experiment. However, in section 15.2 and in chapter
16 we will meet a number of paradoxes and experiments that cannot be
explained within Maxwell's theory.
In the present chapter we will apply the direct interaction RQD approach
to classical electrodynamics. Our goal is to show that this is a plausible

alternative to the standard theory based on Maxwell’s equations.
We would like to emphasise that our classical electromagnetic theory
is obtained by a straightforward application of the ħ → 0 limit to RQD
equations. There is no such direct connection between traditional
QED and Maxwell's theory.
15.1 Hamiltonian formulation

The central idea of Maxwell’s electrodynamics is that charged particles in-
teract with each other indirectly via electric and magnetic fields and that
electromagnetic radiation is an electromagnetic field varying in time and
space. In this book we are challenging these universally accepted points of
view. Our main concern is that such primary ingredients of the classical
theory as Maxwell fields, Liénard-Wiechert potentials, and the Lorentz force
law cannot be expressed in the language of Poincaré group generators, e.g.,
the Hamiltonian. In our opinion, this is the universal language in which all
physical theories ought to be formulated. In particular, the Poincaré invariant
Hamiltonian dynamics is the most natural way to ensure conservation
laws and correct transformations of observables between different reference
frames. In Maxwell’s approach the validity of these important requirements
is not obvious at all.
We argue that all results of conventional classical electromagnetic theory
can be equally well (or even better) explained from the viewpoint of Hamil-
tonian dynamics of charged particles with direct interactions, where “fields”
are not involved at all. In our approach light is described as a flow of mass-
less particles – photons, rather than the so-called transverse “electromagnetic
wave”.
In section 12 we already derived the Darwin-Breit Hamiltonian (12.10) for
charged particles as an approximation to the full-fledged RQD. This Hamil-
tonian was obtained in the 2nd order perturbation theory within the (v/c)2
approximation. Our goal now is to demonstrate that this Hamiltonian can
be used successfully even in classical (non-quantum) approximation. Then
it provides a reasonably accurate description of electromagnetic processes
in which acceleration of charged particles is low, so that one can neglect
the emission of electromagnetic radiation (photons). The Darwin-Breit ap-
proach adopted here is fundamentally different from the generally accepted
Maxwell’s theory. In the Darwin-Breit approach charged particles interact

via instantaneous potentials; there are no electromagnetic fields and no spe-
cific “field energy” associated with them. In spite of these differences, we
will see that in many cases it is very difficult to distinguish these two ap-
proaches experimentally, as both of them lead to very similar predictions.
We will also find situations1 in which the traditional Maxwell’s theory leads
to contradictions and paradoxes. These paradoxes find their resolution in
the Darwin-Breit electrodynamics.
In this chapter, we will be working in the classical approximation: ig-
noring all quantum effects,2 not paying attention to the order of dynamical
variables in their products and using Poisson brackets [. . . , . . .]P instead of
quantum commutators (−i/~)[. . . , . . .]. We will also represent all quantities
as series in powers of v/c and leave only terms whose order is not higher than
(v/c)2 .
15.1.1 Darwin-Breit Hamiltonian

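The Darwin-Breit Hamiltonian $H = H_0 + V$ for a system of two charges $q_1$
and $q_2$ consists of the free part

$$\begin{aligned}
H_0 &= h_1 + h_2 \\
&= \sqrt{m_1^2c^4 + p_1^2c^2} + \sqrt{m_2^2c^4 + p_2^2c^2} \\
&\approx m_1c^2 + m_2c^2 + \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2}
\end{aligned} \tag{15.1}$$

and the potential energy V (12.11) - (12.15)3

$$\begin{aligned}
V &\approx \frac{q_1q_2}{4\pi r} - \frac{q_1q_2}{8\pi m_1m_2c^2r}\left[(\mathbf{p}_1\cdot\mathbf{p}_2) + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right] \\
&\quad - \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_1}{8\pi m_1^2c^2r^3} + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_2}{8\pi m_2^2c^2r^3} + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_1}{4\pi m_1m_2c^2r^3} \\
&\quad - \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_2}{4\pi m_1m_2c^2r^3} + \frac{q_1q_2(\mathbf{s}_1\cdot\mathbf{s}_2)}{4\pi m_1m_2c^2r^3} - \frac{3q_1q_2(\mathbf{s}_1\cdot\mathbf{r})(\mathbf{s}_2\cdot\mathbf{r})}{4\pi m_1m_2c^2r^5}
\end{aligned} \tag{15.2}$$

1
See section 15.2 and chapter 16.1.
2
The only exception is our discussion of the Aharonov-Bohm effect in section 15.4.
3
We denote $\mathbf{r} \equiv \mathbf{r}_1 - \mathbf{r}_2$ throughout this chapter. In the case of electron-proton system,
the charges are $q_1 = -e$ and $q_2 = +e$. Contact terms proportional to $\delta(\mathbf{r})$ are not relevant
for classical mechanics and are omitted. We also omitted 3rd and 4th order corrections to
this Hamiltonian that were derived in sections 14.1 and 14.2. In Appendix N.3 we verified
that with a properly chosen boost generator $\mathbf{K} = \mathbf{K}_0 + \mathbf{Z}$ the Hamiltonian $H = H_0 + V$
satisfies all Poincaré Lie algebra relationships within the $(v/c)^2$ approximation.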
In order to use this Hamiltonian in practical calculations, we introduce
a few adjustments. First, we omit the rest energies of the two particles, because
they have no effect on dynamics. Second, we notice that particle spins $\mathbf{s}_i$
are not easily measurable in classical experiments. It is more convenient to
replace them with magnetic moments $\vec\mu_i$, which are known to be proportional
to spins. This dependence includes anomalous contributions, i.e., those not
described by the classical formula $\vec\mu_i = q_i\mathbf{s}_i/(m_ic)$. The electron's anomalous
magnetic moment has been discussed in subsection 14.2.4. We will not dwell
on this issue here and simply postulate that the full Hamiltonian for two
charged spinning classical particles takes the form

$$\begin{aligned}
H &= \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2} + \frac{q_1q_2}{4\pi r} \\
&\quad - \frac{q_1q_2}{8\pi m_1m_2c^2r}\left[\mathbf{p}_1\cdot\mathbf{p}_2 + \frac{(\mathbf{r}\cdot\mathbf{p}_2)(\mathbf{r}\cdot\mathbf{p}_1)}{r^2}\right] \\
&\quad - \frac{q_1[\mathbf{r}\times\mathbf{p}_1]\cdot\vec\mu_2}{4\pi m_1cr^3} + \frac{q_1[\mathbf{r}\times\mathbf{p}_2]\cdot\vec\mu_2}{8\pi m_2cr^3} - \frac{q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\vec\mu_1}{8\pi m_1cr^3} \\
&\quad + \frac{q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\vec\mu_1}{4\pi m_2cr^3} + \frac{(\vec\mu_1\cdot\vec\mu_2)}{4\pi r^3} - \frac{3(\vec\mu_1\cdot\mathbf{r})(\vec\mu_2\cdot\mathbf{r})}{4\pi r^5}
\end{aligned} \tag{15.3}$$
15.1.2 Two charges

Let us consider a system of two spinless charged particles. The full Hamil-
tonian of this system (which is called the Darwin Hamiltonian) is obtained
by dropping spin-dependent terms in the Darwin-Breit Hamiltonian (15.3)
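$$\begin{aligned}
H &= \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2} + \frac{q_1q_2}{4\pi r} \\
&\quad - \frac{q_1q_2}{8\pi m_1m_2c^2r}\left[(\mathbf{p}_1\cdot\mathbf{p}_2) + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right]
\end{aligned} \tag{15.4}$$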
This Hamiltonian fully determines the dynamics in the system via Hamilton’s
equations of motion (6.99) - (6.100) and Poisson brackets (6.96). The time
derivative of the first particle’s momentum can be obtained from the first
Hamilton’s equation
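$$\begin{aligned}
\frac{d\mathbf{p}_1}{dt} &= [\mathbf{p}_1, H]_P = -\frac{\partial H}{\partial\mathbf{r}_1} \\
&= \frac{q_1q_2\mathbf{r}}{4\pi r^3} - \frac{q_1q_2(\mathbf{p}_1\cdot\mathbf{p}_2)\mathbf{r}}{8\pi m_1m_2c^2r^3} + \frac{q_1q_2\mathbf{p}_1(\mathbf{p}_2\cdot\mathbf{r})}{8\pi m_1m_2c^2r^3} + \frac{q_1q_2(\mathbf{p}_1\cdot\mathbf{r})\mathbf{p}_2}{8\pi m_1m_2c^2r^3} \\
&\quad - \frac{3q_1q_2(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})\mathbf{r}}{8\pi m_1m_2c^2r^5}
\end{aligned} \tag{15.5}$$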
Since Hamiltonian (15.4) is symmetric with respect to permutations of the
two particles, we can obtain the time derivative of the second particle’s mo-
mentum by replacing indices 1 ↔ 2 in (15.5)
$$\frac{d\mathbf{p}_2}{dt} = -\frac{d\mathbf{p}_1}{dt} \tag{15.6}$$
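The closed form (15.5) and the balance (15.6) can be checked numerically against the Darwin Hamiltonian (15.4). The sketch below is only an illustration: the function names, the sample positions and momenta, and the units (with c = 3 chosen arbitrarily so that the (v/c)² terms are not negligible) are assumptions made here. It compares (15.5) with a central-difference estimate of −∂H/∂r₁ and also evaluates −∂H/∂r₂.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def darwin_h(r1, r2, p1, p2, m1, m2, q1, q2, c):
    """Darwin Hamiltonian (15.4) for two spinless charges."""
    r = [a - b for a, b in zip(r1, r2)]
    rn = math.sqrt(dot(r, r))
    kinetic = (dot(p1, p1) / (2 * m1) + dot(p2, p2) / (2 * m2)
               - dot(p1, p1) ** 2 / (8 * m1**3 * c**2)
               - dot(p2, p2) ** 2 / (8 * m2**3 * c**2))
    coulomb = q1 * q2 / (4 * math.pi * rn)
    darwin = -q1 * q2 / (8 * math.pi * m1 * m2 * c**2 * rn) * (
        dot(p1, p2) + dot(p1, r) * dot(p2, r) / rn**2)
    return kinetic + coulomb + darwin


def dp1_dt_closed(r1, r2, p1, p2, m1, m2, q1, q2, c):
    """Right hand side of (15.5)."""
    r = [a - b for a, b in zip(r1, r2)]
    rn = math.sqrt(dot(r, r))
    k = q1 * q2 / (8 * math.pi * m1 * m2 * c**2)
    return [q1 * q2 * r[i] / (4 * math.pi * rn**3)
            - k * dot(p1, p2) * r[i] / rn**3
            + k * p1[i] * dot(p2, r) / rn**3
            + k * dot(p1, r) * p2[i] / rn**3
            - 3 * k * dot(p1, r) * dot(p2, r) * r[i] / rn**5
            for i in range(3)]


def minus_grad(which, args, h=1e-5):
    """Central-difference estimate of -dH/dr1 (which=0) or -dH/dr2 (which=1)."""
    rest = args[2:]
    out = []
    for i in range(3):
        hi = [list(args[0]), list(args[1])]
        lo = [list(args[0]), list(args[1])]
        hi[which][i] += h
        lo[which][i] -= h
        diff = darwin_h(hi[0], hi[1], *rest) - darwin_h(lo[0], lo[1], *rest)
        out.append(-diff / (2 * h))
    return out


# Arbitrary sample state: r1, r2, p1, p2, m1, m2, q1, q2, c (assumed values).
ARGS = ([1.0, 0.5, -0.3], [-0.2, 0.1, 0.4], [0.3, -0.1, 0.2],
        [-0.25, 0.15, 0.05], 1.0, 2.0, 1.0, -1.0, 3.0)
closed = dp1_dt_closed(*ARGS)
numeric_1 = minus_grad(0, ARGS)   # dp1/dt from Hamilton's equation
numeric_2 = minus_grad(1, ARGS)   # dp2/dt from Hamilton's equation
print(closed, numeric_1, numeric_2)
```

Because H depends on positions only through r = r₁ − r₂, the two numerical gradients are opposite to each other, which is the content of (15.6).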
Velocities of particles 1 and 2 are obtained from the second Hamilton’s equa-
tion4
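$$\begin{aligned}
\mathbf{v}_1 \equiv \frac{d\mathbf{r}_1}{dt} &= [\mathbf{r}_1, H]_P = \frac{\partial H}{\partial\mathbf{p}_1} \\
&= \frac{\mathbf{p}_1}{m_1} - \frac{p_1^2\mathbf{p}_1}{2m_1^3c^2} - \frac{q_1q_2\mathbf{p}_2}{8\pi m_1m_2c^2r} - \frac{q_1q_2(\mathbf{p}_2\cdot\mathbf{r})\mathbf{r}}{8\pi m_1m_2c^2r^3}
\end{aligned} \tag{15.7}$$

$$\mathbf{v}_2 \equiv \frac{d\mathbf{r}_2}{dt} = \frac{\mathbf{p}_2}{m_2} - \frac{p_2^2\mathbf{p}_2}{2m_2^3c^2} - \frac{q_1q_2\mathbf{p}_1}{8\pi m_1m_2c^2r} - \frac{q_1q_2(\mathbf{p}_1\cdot\mathbf{r})\mathbf{r}}{8\pi m_1m_2c^2r^3} \tag{15.8}$$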
From these results we can calculate second time derivatives of particle posi-
tions (=accelerations)5
4
This relationship between velocity and momentum is interaction-dependent because
interaction energy in (15.4) is momentum-dependent.
5
In this derivation we omitted terms proportional to $q_1^2q_2^2$ due to their smallness (formally
they belong to the 4th perturbation order). Also keeping the accuracy of $(v/c)^2$ we
can set $\dot{\mathbf{r}} = \frac{d\mathbf{r}_1}{dt} - \frac{d\mathbf{r}_2}{dt} \equiv \mathbf{v}_1 - \mathbf{v}_2 \approx \frac{\mathbf{p}_1}{m_1} - \frac{\mathbf{p}_2}{m_2}$ in those terms that already have the factor
$(1/c)^2$.
d 2 r1 ṗ1 p21 ṗ1 2(p1 · ṗ1 )p1 q1 q2 p2 (r · ṙ)

2
= − 3 2
− 3 2
+
dt m1 2m1 c 2m1 c 8πm1 m2 c2 r 3
q1 q2 (p2 · ṙ)r 3q1 q2 (p2 · r)r(r · ṙ) q1 q2 (p2 · r)ṙ
− + −
8πm1 m2 c2 r 3 8πm1 m2 c2 r 5 8πm1 m2 c2 r 3
q1 q2 r q1 q2 (p1 · p2 )r q1 q2 p1 (p2 · r) q1 q2 (p1 · r)p2
≈ − + +
4πm1 r 3 8πm21 m2 c2 r 3 8πm21 m2 c2 r 3 8πm21 m2 c2 r 3
3q1 q2 (p1 · r)(p2 · r)r p21 q1 q2 r 2q1 q2 (p1 · r)p1 q1 q2 p2 (r · p1 )
− 2
− 2 2
− +
8πm1 m2 c r 2 5 2m1 c 4πm1 r 3 8πm31 c2 r 3 8πm21 m2 c2 r 3
q1 q2 p2 (r · p2 ) q1 q2 (p2 · p1 )r q1 q2 (p2 · p2 )r 3q1 q2 (p2 · r)r(r · p1 )
− − + +
8πm1 m22 c2 r 3 8πm21 m2 c2 r 3 8πm1 m22 c2 r 3 8πm21 m2 c2 r 5
3q1 q2 (p2 · r)r(r · p2 ) q1 q2 (p2 · r)p1 q1 q2 (p2 · r)p2
− − +
8πm1 m22 c2 r 5 8πm21 m2 c2 r 3 8πm1 m22 c2 r 3
q1 q2 r q1 q2 (v1 − v2 )2 r q1 q2 v12 r q1 q2 (v1 · r)(v2 − v1 )
= 3
+ 2 3
− 2 3
+
4πm1 r 8πm1 c r 4πm1 c r 4πm1 c2 r 3
3q1 q2 (v2 · r)2 r
− (15.9)
8πm1 c2 r 5
d 2 r2 q1 q2 r q1 q2 (v1 − v2 )2 r q1 q2 v22 r q1 q2 (v2 · r)(v1 − v2 )
2
≈ − 3
− 2 3
+ 2 3
−
dt 4πm2 r 8πm2 c r 4πm2 c r 4πm2 c2 r 3
2
3q1 q2 (v1 · r) r
+ (15.10)
8πm2 c2 r 5
15.1.3 Definition of force

There are two definitions of force commonly used in classical mechanics.
In one definition the force acting on a particle is identified with the time
derivative of that particle’s momentum
$$\mathbf{f}_i \equiv \frac{d\mathbf{p}_i}{dt} \tag{15.11}$$

In another definition [CV68] the force is a product of the particle's rest mass
and its acceleration6

$$\mathbf{f}_i \equiv m_i\frac{d^2\mathbf{r}_i}{dt^2} \tag{15.12}$$

6
This is equivalent to the second Newton's law of motion.
These two definitions are identical only for not-so-interesting potentials that
do not depend on momenta (or velocities) of particles. In the Darwin-Breit
electrodynamics we are dealing with momentum-dependent potentials, so we
need to decide which definition of force we are going to use.
The usual definition (15.11) has the advantage that the third Newton’s
law of motion (the law of action and reaction) in a two-body system has a
simple formulation7
f1 = −f2 (15.13)
This is a trivial consequence of the law of conservation of the total momen-
tum. It follows immediately from the vanishing Poisson bracket [P0 , H]P = 0
in the instant form of relativistic dynamics8
$$\frac{d\mathbf{p}_2}{dt} = [\mathbf{p}_2, H]_P = [\mathbf{P}_0 - \mathbf{p}_1, H]_P = -[\mathbf{p}_1, H]_P = -\frac{d\mathbf{p}_1}{dt} \tag{15.14}$$
Contrary to the usual practice, in this book we will use an alternative
definition of force (15.12). Although this definition does not imply the balance
of forces (15.13),9 it is preferable for several reasons. First, definition (15.12)
is consistent with the standard notion that equilibrium (or zero acceleration
d2 r/dt2 = 0) is achieved when the force vanishes.10 Second, definition (15.11)
is less convenient because it is rather difficult to measure momenta of particles
and their time derivatives in experiments. It is much easier to measure
velocities and accelerations of particles, e.g., by time-of-flight techniques. For
example, by measuring current in a wire we actually measure the amount of
charge passing through the cross-section of the wire in a unit of time. This
quantity is directly related to the velocity of electrons, while it has no direct
connection to electrons’ momenta.
7
see (15.6)
8
Note that in the traditional Maxwell’s theory the proof of the validity of the third
Newton’s law is rather non-trivial. This “proof” requires introduction of such dubious
notions as “hidden momentum” and/or momentum of the electromagnetic fields [Kel42,
PN45, SJ67, Jef99a].
9
This can be seen from comparing (15.9) and (15.10): $\mathbf{f}_1 \neq -\mathbf{f}_2$.
10
This is the first Newton’s law of motion.
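The imbalance between the forces defined by (15.12) can be made concrete with a small numerical sketch based on the closed forms (15.9) and (15.10). The function name `forces` and the sample values (units with c = 1, unit charges, one particle moving at 0.1c perpendicular to the separation) are assumptions chosen for illustration only.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forces(r, v1, v2, q1, q2, c):
    """Return (m1*a1, m2*a2) from (15.9) and (15.10), with r = r1 - r2."""
    rn = math.sqrt(dot(r, r))
    k = q1 * q2 / (4 * math.pi)
    dv = [a - b for a, b in zip(v1, v2)]
    f1 = [k * (r[i] / rn**3
               + dot(dv, dv) * r[i] / (2 * c**2 * rn**3)
               - dot(v1, v1) * r[i] / (c**2 * rn**3)
               + dot(v1, r) * (v2[i] - v1[i]) / (c**2 * rn**3)
               - 3 * dot(v2, r) ** 2 * r[i] / (2 * c**2 * rn**5))
          for i in range(3)]
    f2 = [k * (-r[i] / rn**3
               - dot(dv, dv) * r[i] / (2 * c**2 * rn**3)
               + dot(v2, v2) * r[i] / (c**2 * rn**3)
               - dot(v2, r) * (v1[i] - v2[i]) / (c**2 * rn**3)
               + 3 * dot(v1, r) ** 2 * r[i] / (2 * c**2 * rn**5))
          for i in range(3)]
    return f1, f2


R_VEC = [1.0, 0.0, 0.0]

# Both charges at rest: only the Coulomb terms survive and the forces balance.
f1_static, f2_static = forces(R_VEC, [0.0] * 3, [0.0] * 3, 1.0, 1.0, 1.0)
static_sum = [a + b for a, b in zip(f1_static, f2_static)]

# Charge 1 moving perpendicular to r: the velocity-dependent terms do not cancel.
f1_moving, f2_moving = forces(R_VEC, [0.0, 0.1, 0.0], [0.0] * 3, 1.0, 1.0, 1.0)
moving_sum = [a + b for a, b in zip(f1_moving, f2_moving)]
print(static_sum, moving_sum)
```

In the second configuration the sum f₁ + f₂ is of order (v/c)² times the Coulomb force, while the momentum derivatives (15.11) still cancel exactly by (15.14).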
15.1.4 Wire with current

Experimentally, it is very difficult to isolate two charged particles and mea-
sure their trajectories with the precision sufficient to verify theoretical predic-
tions (15.9) - (15.10). In many cases it is more convenient to study behavior
of electrons whose movement is confined inside wires made of conducting
materials. In this subsection we will consider forces acting between electrons
in wires and outside charges.
Let us consider the force exerted by a metal wire on a test charge q1
located at point r1 outside the wire and moving with velocity v1 .11 There
are two kinds of charges in the wire: fixed positive ions of the lattice and
mobile negatively charged electrons. In most cases the total charge of the
ions compensates exactly the total charge of the electrons, so that the wire
is electrically neutral. We assume that the spins (or magnetic moments $\vec\mu_2$)
of ions and electrons in the wire are oriented randomly. Therefore, all
$\vec\mu_2$-dependent terms in (15.3) vanish after averaging over angles. If the wire
moves as a whole with velocity w, then w-dependent Darwin interactions12
of the charge 1 with electrons and ions in the wire cancel each other. So,
the force acting on the charge 1 does not depend on the wire’s velocity, and
we can assume that the wire remains stationary and that only electrons in
the wire are moving with velocity v2 . Electrons in the wire participate in
two kinds of movements: thermal and drift movements. The velocities of
the thermal movement are rather high, but their orientations are distributed
randomly. The drift velocity is directed along the applied voltage, and its
magnitude is very small (≈mm/sec).
Let us first see the effect of thermally agitated electrons13 on the external
charge 1. In this case we can omit terms that do not depend on v2 in
equation (15.9), because these terms are canceled by forces from positively
charged lattice ions. We can also neglect terms having linear dependence
on v2 , because they average out to zero due to the isotropy of the thermal
movement. So, the force acting on the charge 1 due to the thermal chaotic
movement of electrons 2 is proportional to $v_2^2$

$$\mathbf{f}_1 = \frac{q_1q_2v_2^2\mathbf{r}}{8\pi c^2r^3} - \frac{3q_1q_2(\mathbf{v}_2\cdot\mathbf{r})^2\mathbf{r}}{8\pi c^2r^5} \tag{15.15}$$

11
Here we ignore the magnetic moment of the test particle: $\vec\mu_1 = 0$.
12
the second line in equation (15.3)
13
They are marked by the index 2 in this derivation.

Figure 15.1: Interaction of a charge 1 at (0, 0, R) with a piece of conductor
placed in the origin. It is assumed that conductor's electrons have thermal
velocities with absolute value $v_2$ and random orientations.
Now consider a small piece of conductor located in the origin and the charge
1 at the point (0, 0, R) on the z-axis (see Fig. 15.1). Our goal is to show that
the total force14 acting on the charge 1 is zero. To prove this fact it is sufficient
to show that expression (15.15) yields zero when averaged over directions of
v2 with the absolute value v2 kept fixed. The x- and y-components of this
average are zero by symmetry, and the z-component is given by the integral
on the surface of a sphere of radius v2 15
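$$\begin{aligned}
I_z &= \frac{q_1q_2}{8\pi c^2}\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\varphi\left[\frac{v_2^2}{R^2} - \frac{3v_2^2\cos^2\theta}{R^2}\right] \\
&= \frac{q_1q_2}{8\pi c^2}\left[\frac{4\pi v_2^2}{R^2} + \frac{6\pi v_2^2}{R^2}\int_1^{-1}t^2\,dt\right] = 0
\end{aligned}$$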
By similar arguments one can show that the reciprocal force exerted by the
charge on the conductor without current vanishes as well. So, we conclude
that the thermal movement of electrons can be ignored in conductor-charge
calculations.
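The vanishing of this angular average is easy to confirm numerically. The sketch below is an illustration only (the function name `thermal_average_bracket` and the midpoint-rule discretization are choices made here); it evaluates the dimensionless integral of sin θ (1 − 3 cos² θ) over the sphere of velocity directions, which multiplies the common prefactor of $I_z$.

```python
import math


def thermal_average_bracket(n_theta=2000):
    """Midpoint-rule estimate of the integral of sin(theta) * (1 - 3 cos^2(theta))
    over theta in [0, pi] and phi in [0, 2 pi)."""
    h = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h
        total += math.sin(theta) * (1 - 3 * math.cos(theta) ** 2) * h
    # The integrand does not depend on phi, so the phi integral gives 2 pi.
    return 2 * math.pi * total


bracket = thermal_average_bracket()
print(f"angular integral = {bracket:.2e}")
```

The two contributions, 4π from the first term and −4π from the second, cancel up to the discretization error of the quadrature.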
Let us now consider the charge 1 and an infinite straight wire with a non-zero
drift velocity of electrons $\mathbf{v}_2$ in the geometry shown in Fig. 15.2. The
linear density of conduction electrons in the wire is denoted by $\rho_2$. First we
would like to calculate the force acting on the charge 1 from a small portion
$dr_{2z}$ of the wire. Then we use formula (15.9), keeping only terms dependent
on $\mathbf{v}_2$

$$\begin{aligned}
d\mathbf{f}_1 &= q_1\rho_2\,dr_{2z}\left[-\frac{(\mathbf{v}_1\cdot\mathbf{v}_2)\mathbf{r}}{4\pi c^2r^3} + \frac{v_2^2\mathbf{r}}{8\pi c^2r^3} + \frac{(\mathbf{v}_1\cdot\mathbf{r})\mathbf{v}_2}{4\pi c^2r^3} - \frac{3(\mathbf{v}_2\cdot\mathbf{r})^2\mathbf{r}}{8\pi c^2r^5}\right] \\
&= q_1\rho_2\,dr_{2z}\left[\frac{[\mathbf{v}_1\times[\mathbf{v}_2\times\mathbf{r}]]}{4\pi c^2r^3} + \frac{v_2^2\mathbf{r}}{8\pi c^2r^3} - \frac{3(\mathbf{v}_2\cdot\mathbf{r})^2\mathbf{r}}{8\pi c^2r^5}\right]
\end{aligned} \tag{15.16}$$

14
or the average of forces (15.15) over different values of $v_2$
15
Here we use spherical coordinates with angles ϕ ∈ [0, 2π) and θ ∈ [0, π], so that
$(\mathbf{v}_2\cdot\mathbf{r}) = v_2R\cos\theta$.

Figure 15.2: Interaction of a charge at (R, 0, 0) with an infinite straight
vertical wire with current.
The full force is obtained by integrating (15.16) on the full length of the
wire. Let us first show that the integral of the 2nd and 3rd term vanishes.
The y- and z-components of this integral are zero due to symmetry. For the
x-component we obtain
$$I_x = \frac{q_1\rho_2v_2^2}{8\pi c^2}\int_{-\infty}^{\infty}dr_{2z}\left[\frac{R}{(R^2+r_{2z}^2)^{3/2}} - \frac{3r_{2z}^2R}{(R^2+r_{2z}^2)^{5/2}}\right] = 0$$
This result means, in particular, that a neutral superconducting (=zero resistance)
wire with current does not create $v_2^2$-dependent electrostatic potential
in the surrounding space. In other words, a straight wire with current
does not act on charges at rest. The observation of such a potential was
erroneously reported in [EKL76]. Subsequent more accurate measurements
[LEK92, SSS+ 02] did not confirm that report.

Figure 15.3: Interaction between a current loop and a charge. The charge $q_1$
is located at a general point in space $\mathbf{r}_1 = (r_{1x}, r_{1y}, r_{1z})$ and has an arbitrary
momentum $\mathbf{p}_1 = (p_{1x}, p_{1y}, p_{1z})$.

So, the full force acting on the charge 1 is obtained by integration of the
first term in (15.16) on the length of the wire

$$\mathbf{F}_1 = \frac{q_1\rho_2}{4\pi c^2}\int_{-\infty}^{\infty}dr_{2z}\,\frac{[\mathbf{v}_1\times[\mathbf{v}_2\times\mathbf{r}]]}{r^3} \tag{15.17}$$
In this expression one easily recognizes the Biot-Savart force law of the tra-
ditional Maxwell’s theory. This means that all results of Maxwell’s theory
referring to magnetic properties of wires with currents remain valid in our
approach.
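Both wire integrals can be checked with a few lines of numerical quadrature. The following sketch is illustrative only: the geometry (wire along the z-axis, charge at (R, 0, 0), so that r = (R, 0, −r₂z)), the sample values R = 2 and v₂ = 0.5, the helper names, and the substitution r₂z = R tan t used to map the infinite wire onto a finite interval are all assumptions made here. It evaluates the v₂²-dependent terms of (15.16) and the kernel of (15.17) with the prefactor q₁ρ₂/(4πc²) set to 1.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def integrate_along_wire(f, R, n=20000):
    """Integrate the vector function f(z) over z in (-inf, inf),
    using z = R*tan(t) and the midpoint rule in t."""
    h = math.pi / n
    total = [0.0, 0.0, 0.0]
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * h
        z = R * math.tan(t)
        weight = R / math.cos(t) ** 2 * h      # dz = R dt / cos^2(t)
        value = f(z)
        for j in range(3):
            total[j] += value[j] * weight
    return total


R, V2 = 2.0, 0.5            # assumed sample distance and drift speed
V2_VEC = (0.0, 0.0, V2)     # drift velocity along the wire


def v2_squared_terms(z):
    """v2^2 r / r^3 - 3 (v2.r)^2 r / r^5 for the wire element at height z."""
    r = (R, 0.0, -z)
    rn = math.hypot(R, z)
    v2r = -V2 * z
    return [V2**2 * comp / rn**3 - 3 * v2r**2 * comp / rn**5 for comp in r]


def biot_savart_kernel(z):
    """[v2 x r] / r^3 for the wire element at height z."""
    r = (R, 0.0, -z)
    rn = math.hypot(R, z)
    return [comp / rn**3 for comp in cross(V2_VEC, r)]


quadratic = integrate_along_wire(v2_squared_terms, R)   # expected to vanish
field_like = integrate_along_wire(biot_savart_kernel, R)
force = cross((0.0, 0.0, 1.0), field_like)              # v1 parallel to the wire
print(quadratic, field_like, force)
```

The vector `field_like` comes out as (0, 2v₂/R, 0), which reproduces the familiar 1/R fall-off of the magnetic field of a straight current, and the force on a charge moving parallel to the wire points along the x-axis.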
15.1.5 Charge and current loop

Let us use the Darwin Hamiltonian (15.4) to calculate the interaction en-
ergy between a neutral circular current-carrying wire of a small radius a
and a point charge in the geometry shown in Fig. 15.3. As we saw in the
preceding subsection, the movement of the wire as a whole does not have
any effect on its interaction with the charge. So, we will assume that the
current loop is fixed in the origin. We need to take into account only the
velocity-dependent interaction between the charge 1 and negative charges of
conduction electrons having linear density ρ2 and drift velocity v2 ≈ p2 /m2 ,
whose tangential component is u, as shown in Fig. 15.3. Then the potential
energy of interaction between the charge 1 and the loop element dl is given
by the Darwin’s formula

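$$V_{dl2-q1} \approx -\frac{q_1\rho_2\,dl}{8\pi m_1c^2}\left[\frac{(\mathbf{p}_1\cdot\mathbf{v}_2)}{r} + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{v}_2\cdot\mathbf{r})}{r^3}\right]$$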
In the coordinate system shown in Fig. 15.3 the line element in the loop
is dl = adθ and v2 = (−u sin θ, u cos θ, 0). In the limit a → 0 we can
approximate
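$$\frac{1}{r} \equiv \frac{1}{|\mathbf{r}_1-\mathbf{r}_2|} \approx \frac{1}{r_1} + \frac{a(r_{1x}\cos\theta + r_{1y}\sin\theta)}{r_1^3} \tag{15.18}$$

$$\frac{1}{r^3} \equiv \frac{1}{|\mathbf{r}_1-\mathbf{r}_2|^3} \approx \frac{1}{r_1^3} + \frac{3a(r_{1x}\cos\theta + r_{1y}\sin\theta)}{r_1^5} \tag{15.19}$$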
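The full interaction between the charge and the loop is obtained by integrating
$V_{dl2-q1}$ on θ from 0 to 2π and neglecting small terms proportional to
$a^3$

$$\begin{aligned}
V_{loop2-q1}
&\approx -\frac{aq_1\rho_2}{8\pi m_1c^2}\int_0^{2\pi}d\theta\,\Big[(-up_{1x}\sin\theta + up_{1y}\cos\theta)\left(\frac{1}{r_1} + \frac{a(r_{1x}\cos\theta + r_{1y}\sin\theta)}{r_1^3}\right) \\
&\qquad + (-ur_{1x}\sin\theta + ur_{1y}\cos\theta)\big((\mathbf{p}_1\cdot\mathbf{r}_1) - p_{1x}a\cos\theta - p_{1y}a\sin\theta\big) \\
&\qquad\quad\times\left(\frac{1}{r_1^3} + \frac{3a(r_{1x}\cos\theta + r_{1y}\sin\theta)}{r_1^5}\right)\Big] \\
&\approx -\frac{a^2uq_1\rho_2[\mathbf{r}_1\times\mathbf{p}_1]_z}{4m_1c^2r_1^3}
\end{aligned} \tag{15.20}$$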
Taking into account the usual definition of the loop's magnetic moment16 as
a vector $\vec\mu_2$ whose length is $\mu_2 = \pi a^2\rho_2u/c$ and whose direction is orthogonal
to the plane of the loop, we can generalize (15.20) for arbitrary position and
orientation of the loop

$$V_{loop2-q1} \approx -\frac{q_1[\vec\mu_2\times\mathbf{r}]\cdot\mathbf{p}_1}{4\pi m_1cr^3} \tag{15.21}$$

16
see equation (5.42) in [Jac99]
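The small-loop limit (15.20) can be compared with a direct numerical integration of Darwin's formula around the loop. The sketch below is an illustration under stated assumptions: the common prefactor q₁ρ₂/(8πm₁c²) is set to 1, the sample values of r₁, p₁, a and u are arbitrary, and the function names are introduced here.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def loop_interaction_exact(r1, p1, a, u, n=20000):
    """-a * integral over theta of [(p1.v2)/r + (p1.r)(v2.r)/r^3],
    with r = r1 - r2 and r2 running over a loop of radius a in the xy-plane.
    The prefactor q1*rho2/(8*pi*m1*c^2) is set to 1."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        r2 = (a * math.cos(theta), a * math.sin(theta), 0.0)
        v2 = (-u * math.sin(theta), u * math.cos(theta), 0.0)
        r = [x - y for x, y in zip(r1, r2)]
        rn = math.sqrt(dot(r, r))
        total += (dot(p1, v2) / rn + dot(p1, r) * dot(v2, r) / rn**3) * h
    return -a * total


def loop_interaction_small_a(r1, p1, a, u):
    """Leading term (15.20) in the same units: -2 pi a^2 u [r1 x p1]_z / r1^3."""
    lz = r1[0] * p1[1] - r1[1] * p1[0]
    return -2 * math.pi * a**2 * u * lz / math.sqrt(dot(r1, r1)) ** 3


R1 = (0.7, -0.4, 0.5)       # assumed position of the charge
P1 = (0.2, 0.3, -0.1)       # assumed momentum of the charge
A, U = 1e-3, 1.0            # small loop radius and tangential drift speed

exact = loop_interaction_exact(R1, P1, A, U)
approx = loop_interaction_small_a(R1, P1, A, U)
print(exact, approx)
```

For a loop radius much smaller than the distance to the charge the two numbers agree up to a relative correction of order (a/r₁)², consistent with the neglect of higher powers of a in (15.20).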
So, the full Hamiltonian for the system of charge 1 and current loop 2 is
$$H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2} - \frac{q_1[\vec\mu_2\times\mathbf{r}]\cdot\mathbf{p}_1}{4\pi m_1cr^3} \tag{15.22}$$
Now we can use this Hamiltonian to obtain the dynamics in the “loop+charge”
system. The time derivative of the particle’s momentum can be obtained
from the Hamilton’s equation of motion (6.99)
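$$\frac{d\mathbf{p}_1}{dt} = -\frac{\partial H}{\partial\mathbf{r}_1} = \frac{q_1[\mathbf{p}_1\times\vec\mu_2]}{4\pi m_1cr^3} - \frac{3q_1([\mathbf{p}_1\times\vec\mu_2]\cdot\mathbf{r})\mathbf{r}}{4\pi m_1cr^5}$$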
The time derivative of the loop’s momentum follows from the momentum
conservation law (15.14)
dp2 dp1
= −
dt dt
The velocity of the charge 1 is obtained from the 2nd Hamilton’s equation
of motion
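$$\mathbf{v}_1 = \frac{\partial H}{\partial\mathbf{p}_1} = \frac{\mathbf{p}_1}{m_1} - \frac{p_1^2\mathbf{p}_1}{2m_1^3c^2} - \frac{q_1[\vec\mu_2\times\mathbf{r}]}{4\pi m_1cr^3} \tag{15.23}$$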
Acceleration of this particle is obtained as a time derivative of (15.23)17

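$$\begin{aligned}
\mathbf{a}_1 &\equiv \frac{d\mathbf{v}_1}{dt} \\
&\approx \frac{\dot{\mathbf{p}}_1}{m_1} - \frac{q_1[\vec\mu_2\times\dot{\mathbf{r}}]}{4\pi m_1cr^3} + \frac{3q_1[\vec\mu_2\times\mathbf{r}](\mathbf{r}\cdot\dot{\mathbf{r}})}{4\pi m_1cr^5} \\
&= \frac{q_1[\mathbf{p}_1\times\vec\mu_2]}{2\pi m_1^2cr^3} - \frac{3q_1([\vec\mu_2\times\mathbf{r}]\cdot\mathbf{p}_1)\mathbf{r}}{4\pi m_1^2cr^5} + \frac{3q_1[\vec\mu_2\times\mathbf{r}](\mathbf{r}\cdot\mathbf{p}_1)}{4\pi m_1^2cr^5} \\
&\quad + \frac{q_1[\vec\mu_2\times\mathbf{p}_2]}{4\pi m_1m_2cr^3} - \frac{3q_1[\vec\mu_2\times\mathbf{r}](\mathbf{r}\cdot\mathbf{p}_2)}{4\pi m_1m_2cr^5} \\
&= \frac{q_1[\mathbf{p}_1\times\vec\mu_2]}{2\pi m_1^2cr^3} - \frac{3q_1[\mathbf{p}_1\times[\mathbf{r}\times[\vec\mu_2\times\mathbf{r}]]]}{4\pi m_1^2cr^5} - \left(\frac{d}{dt}\right)_2\frac{q_1[\vec\mu_2\times\mathbf{r}]}{4\pi m_1cr^3} \\
&= -\frac{q_1[\mathbf{p}_1\times\vec\mu_2]}{4\pi m_1^2cr^3} + \frac{3q_1[\mathbf{p}_1\times\mathbf{r}](\vec\mu_2\cdot\mathbf{r})}{4\pi m_1^2cr^5} - \left(\frac{d}{dt}\right)_2\frac{q_1[\vec\mu_2\times\mathbf{r}]}{4\pi m_1cr^3}
\end{aligned} \tag{15.24}$$

The notation $\left(\frac{d}{dt}\right)_2$ means the time derivative (of $\mathbf{r}$) when only particle 2 (the
loop) is allowed to move. For example

$$\left(\frac{d}{dt}\right)_2\mathbf{r} = -\mathbf{v}_2 \approx -\frac{\mathbf{p}_2}{m_2}$$

17
Here we noticed that $\dot{\mathbf{p}}_1 \propto (v/c)^2$, therefore the time derivative of the second term on
the right hand side of (15.23) is $\propto (v/c)^3$, so it can be ignored. We also neglected the time
derivative of the magnetic moment $\dot{\vec\mu}_2 = [\vec\mu_2, H]_P$, because, due to (17.14), the Poisson
bracket $[\mu_{2i}, \mu_{2j}]_P = q_2/(m_2c)\sum_k\epsilon_{ijk}\mu_{2k}$ has an extra factor of c in the denominator,
which means that terms proportional to $\dot{\vec\mu}_2$ are much smaller than other terms in (15.24).
Vector identities (D.17) and (D.18) were used in the derivation of (15.24).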
15.1.6 Charge and spin’s magnetic moment

Let us now consider the system of a spinless charged particle 1 and a spin’s
magnetic moment 2. The relevant Hamiltonian is obtained from (15.3) by
assuming $\vec{\mu}_1 = 0$ and q2 = 0 and dropping the corresponding terms^18

$$ H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3 c^2} - \frac{p_2^4}{8m_2^3 c^2} + \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{p}_2}{8\pi m_2 c r^3} - \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{p}_1}{4\pi m_1 c r^3} \tag{15.25} $$

^18 Note that if the spin’s magnetic moment is not moving (p2 = 0) then the interaction
energy of “charge + moment” (the last term in (15.25)) is exactly the same as the interaction
energy of “charge + current loop” (15.22). For a moving spin the interaction energy
has an additional term (the next-to-last term in (15.25)) which is absent in (15.22).

As usual, we employ Hamilton’s equations of motion to calculate the time
derivative of the momentum, the velocity, and the acceleration
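$$
\begin{aligned}
\frac{d\mathbf{p}_1}{dt} &= [\mathbf{p}_1, H]_P = -\frac{\partial H}{\partial \mathbf{r}_1} \\
&= \frac{q_1 [\mathbf{p}_1 \times \vec{\mu}_2]}{4\pi m_1 c r^3} - \frac{3 q_1 ([\mathbf{p}_1 \times \vec{\mu}_2] \cdot \mathbf{r})\mathbf{r}}{4\pi m_1 c r^5} - \frac{q_1 [\mathbf{p}_2 \times \vec{\mu}_2]}{8\pi m_2 c r^3} + \frac{3 q_1 ([\mathbf{p}_2 \times \vec{\mu}_2] \cdot \mathbf{r})\mathbf{r}}{8\pi m_2 c r^5}
\end{aligned}
$$

$$ \frac{d\mathbf{p}_2}{dt} = -\frac{d\mathbf{p}_1}{dt} $$

$$ \mathbf{v}_1 \equiv \frac{d\mathbf{r}_1}{dt} = [\mathbf{r}_1, H]_P = \frac{\partial H}{\partial \mathbf{p}_1} = \frac{\mathbf{p}_1}{m_1} - \frac{p_1^2 \mathbf{p}_1}{2 m_1^3 c^2} - \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}]}{4\pi m_1 c r^3} $$

$$
\begin{aligned}
\mathbf{a}_1 &\equiv \frac{d\mathbf{v}_1}{dt} \\
&\approx \frac{\dot{\mathbf{p}}_1}{m_1} - \frac{q_1 [\vec{\mu}_2 \times \dot{\mathbf{r}}]}{4\pi m_1 c r^3} + \frac{3 q_1 [\vec{\mu}_2 \times \mathbf{r}](\mathbf{r} \cdot \dot{\mathbf{r}})}{4\pi m_1 c r^5} \\
&= \frac{q_1 [\mathbf{p}_1 \times \vec{\mu}_2]}{2\pi m_1^2 c r^3} - \frac{3 q_1 ([\mathbf{p}_1 \times \vec{\mu}_2] \cdot \mathbf{r})\mathbf{r}}{4\pi m_1^2 c r^5} + \frac{3 q_1 [\vec{\mu}_2 \times \mathbf{r}](\mathbf{r} \cdot \mathbf{p}_1)}{4\pi m_1^2 c r^5} \\
&\quad - \frac{3 q_1 [\mathbf{v}_2 \times \vec{\mu}_2]}{8\pi m_1 c r^3} + \frac{3 q_1 ([\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{v}_2)\mathbf{r}}{8\pi m_1 c r^5} - \frac{3 q_1 [\vec{\mu}_2 \times \mathbf{r}](\mathbf{r} \cdot \mathbf{v}_2)}{4\pi m_1 c r^5} \\
&= \frac{q_1 [\mathbf{p}_1 \times \vec{\mu}_2]}{2\pi m_1^2 c r^3} - \frac{3 q_1 [\mathbf{p}_1 \times [\mathbf{r} \times [\vec{\mu}_2 \times \mathbf{r}]]]}{4\pi m_1^2 c r^5} \\
&\quad - \frac{q_1 [\mathbf{v}_2 \times \vec{\mu}_2]}{8\pi m_1 c r^3} + \frac{3 q_1 ([\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{v}_2)\mathbf{r}}{8\pi m_1 c r^5} + \frac{q_1 [\vec{\mu}_2 \times \mathbf{v}_2]}{4\pi m_1 c r^3} - \frac{3 q_1 [\vec{\mu}_2 \times \mathbf{r}](\mathbf{r} \cdot \mathbf{v}_2)}{4\pi m_1 c r^5} \\
&= -\frac{q_1 [\mathbf{p}_1 \times \vec{\mu}_2]}{4\pi m_1^2 c r^3} + \frac{3 q_1 [\mathbf{p}_1 \times \mathbf{r}](\vec{\mu}_2 \cdot \mathbf{r})}{4\pi m_1^2 c r^5} - \left(\frac{d}{dt}\right)_2 \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}]}{4\pi m_1 c r^3} \\
&\quad - \frac{d}{d\mathbf{r}_1} \frac{q_1 ([\mathbf{v}_2 \times \vec{\mu}_2] \cdot \mathbf{r})}{8\pi m_1 c r^3}
\end{aligned} \tag{15.26}
$$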
This means that acceleration of the charge 1 in the field of the spin’s magnetic
moment is basically the same as in the field of a current loop (15.24). The
only difference is the presence of an additional gradient term - the last term on
the right hand side of (15.26). This difference will be discussed in subsections
15.3.1 - 15.3.3 in greater detail.
15.1.7 Two types of magnets

Let us now consider the system “moving charge 1 + magnetic moment 2
at rest.” As we discussed above, the magnetic moment can be produced
either by a spinning particle or by a small current loop. The Hamiltonian^19
is obtained either from (15.22) or from (15.25) by setting p2 = 0

$$ H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3 c^2} - \frac{p_2^4}{8m_2^3 c^2} - \frac{q_1 [\mathbf{r} \times \mathbf{p}_1] \cdot \vec{\mu}_2}{4\pi m_1 c r^3} \tag{15.27} $$
The force acting on the charge 1 is given by formula (15.24)
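$$
\begin{aligned}
\mathbf{f}_1 = m_1 \mathbf{a}_1 &= -\frac{q_1 [\mathbf{p}_1 \times \vec{\mu}_2]}{4\pi m_1 c r^3} + \frac{3 q_1 [\mathbf{p}_1 \times \mathbf{r}](\vec{\mu}_2 \cdot \mathbf{r})}{4\pi m_1 c r^5} \\
&\approx \frac{q_1}{c} [\mathbf{v}_1 \times \mathbf{b}_1]
\end{aligned} \tag{15.28}
$$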
which is the standard definition of the magnetic part of the Lorentz force if
another standard expression
$$ \mathbf{b}_1 = -\frac{\vec{\mu}_2}{4\pi r^3} + \frac{3 (\vec{\mu}_2 \cdot \mathbf{r})\mathbf{r}}{4\pi r^5} \tag{15.29} $$

is used for the “magnetic field” of the magnetic moment $\vec{\mu}_2$ at point r1.^20
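The equivalence of the two lines of (15.28) is easy to confirm numerically. The sketch below is an illustration only: it evaluates the force once directly in terms of p1, the magnetic moment, and r, and once as (q1/c)[v1 × b1] with b1 from (15.29). It assumes the lowest-order relation v1 = p1/m1 (which is why the text writes ≈), and all numerical values and function names are arbitrary choices made here.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def b_field(mu, r):
    """'Magnetic field' (15.29) of a point magnetic moment mu at separation r."""
    rn = math.sqrt(dot(r, r))
    return [-mu[i]/(4*math.pi*rn**3) + 3*dot(mu, r)*r[i]/(4*math.pi*rn**5)
            for i in range(3)]

def force_direct(q1, m1, c, p1, mu, r):
    """First line of (15.28), written through p1, mu and r."""
    rn = math.sqrt(dot(r, r))
    a = cross(p1, mu)
    b = cross(p1, r)
    k = q1/(4*math.pi*m1*c)
    return [-k*a[i]/rn**3 + 3*k*b[i]*dot(mu, r)/rn**5 for i in range(3)]

def force_lorentz(q1, m1, c, p1, mu, r):
    """Second line of (15.28), assuming v1 = p1/m1 to lowest order."""
    v1 = [x/m1 for x in p1]
    vb = cross(v1, b_field(mu, r))
    return [q1*x/c for x in vb]

# Arbitrary illustrative values (assumed)
q1, m1, c = 1.0, 2.0, 10.0
p1 = [0.3, -0.1, 0.2]
mu = [0.0, 0.0, 1.5]
r = [1.0, 2.0, -0.5]

fd = force_direct(q1, m1, c, p1, mu, r)
fl = force_lorentz(q1, m1, c, p1, mu, r)
print("direct :", fd)
print("Lorentz:", fl)
```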
There are, however, important differences between our formulas and the
standard approach. First, in the usual Lorentz force equation^21 the force
is identified with the time derivative of momentum (15.11). In our case,
the force is “mass times acceleration.” Second, in our approach, there are
no fields (electric or magnetic) having independent existence at each space
point. There are only direct inter-particle forces. This is why we put “mag-
netic field” in quotes.
For comparison with experiment it is not sufficient to discuss point mag-
netic moments. We need to apply the above results to macroscopic magnets
as well. It is important to mention that there are two origins of magnetization
in materials. The first origin is due to the orbital motion of electrons. The
second one is due to spin magnetic moments of electrons.^22 In permanent
magnets both components play roles. The relative strength of the “orbital”
and “spin” magnetizations varies among different types of magnetic materials.
However, in most cases the dominant contribution is due to electron
spins [RF69]. The full magnetization can be described by summing up total
magnetic moments over all atoms in the body, and the full “magnetic field” of
the macroscopic magnet is obtained by adding up contributions like (15.29).

^19 Which is the same in both cases.
^20 See equation (5.56) in [Jac99].
^21 See, e.g., equation (11.124) in [Jac99].
^22 The contribution from nuclear spins is much weaker.

Figure 15.4: A thin solenoid can be represented as a stack of small current
loops. The magnetization vector $\vec{\mu}_2$ is directed along the solenoid’s axis.
The above discussion referred to permanent bar magnets. However, there
is an alternative way to produce “magnetic field” by means of electromag-
nets – solenoids with current. In solenoids only the orbital component of
magnetization (due to electrons moving in wires) is present. For example,
a straight thin solenoid can be represented as a collection of small current
loops stacked on top of each other (see Fig. 15.4). The “magnetic field” of
such a stack can be obtained by integrating (15.29) along the length of the
stack.
This result can also be generalized to macroscopic solenoids with non-vanishing
cross-sections. It is easy to see that each current-carrying coil
in such a solenoid can be represented as a superposition of infinitely small
loops (see Fig. 15.5). Then a macroscopic thick cylindrical solenoid can be
represented as a set of parallel thin solenoids joined together.
Figure 15.5: A wire coil (black thick line) with current I can be represented as
a superposition of infinitesimally small wire loops C1 , C2, C3 , . . . (grey lines)
with the same current I. All (imaginary) inside currents cancel each other,
so that only the (real) peripheral current remains.
15.1.8 Longitudinal forces in conductors

According to classical electrodynamics, the magnetic force (15.28) is always
perpendicular to the particle’s velocity. Consequently, there can be no mag-
netic force between two electrons moving inside the same straight thin wire
with a steady current. Indeed, if we substitute v1 = v2 ≡ v and r ∥ v in
the standard Biot-Savart force law (15.30) - (15.31), we obtain f1 = f2 = 0.
However, this result does not hold in our approach. Similar substitutions in
our formulas (15.9) - (15.10) yield^24

$$ \mathbf{f}_1 = -\frac{q^2 v^2 \mathbf{r}}{4\pi c^2 r^3} - \frac{3 q^2 (\mathbf{v} \cdot \mathbf{r})^2 \mathbf{r}}{8\pi c^2 r^5} = -\frac{5 q^2 v^2 \mathbf{r}}{8\pi c^2 r^3} $$

$$ \mathbf{f}_2 = \frac{5 q^2 v^2 \mathbf{r}}{8\pi c^2 r^3} $$

which indicates the presence of a (longitudinal) attractive force parallel to
the electrons’ velocity vectors. As discussed in [Ess07, Ess96, Ess95], this
magnetic attraction of conduction electrons may contribute to superconductivity
at low temperatures.

^24 Here we ignore the Coulomb force components, which are shielded in metal conductors.
q denotes the electron’s charge.
It is interesting to note that the issue of longitudinal interactions in
conductors has been discussed ever since Ampère suggested his charge interaction law
in the early 19th century.^25 However, in contrast to our predicted attraction,
Ampère’s formula predicted longitudinal repulsion between two electrons
in the same wire. Numerous experiments attempting to detect such a repulsion
did not yield conclusive results. A recent study [GJR01] declared a
confirmation of Ampère’s repulsion. However, this conclusion was challenged
in [CCTS13]. So, experimentally, the presence of longitudinal forces in
conductors and their signs (i.e., attractive or repulsive) remain an unsettled
issue.
15.2 Experiments and paradoxes

In this section we will discuss a number of real or thought electromagnetic
experiments, whose description in classical Maxwell’s electrodynamics is in-
adequate or paradoxical. We will also consider these experiments from the
point of view of the RQD direct interaction approach developed in the preced-
ing section. Our goal is to demonstrate that in all cases the RQD description
is more logical and consistent.
15.2.1 Conservation laws in Maxwell’s theory

One important class of difficulties characteristic of Maxwell’s electrodynamics
is related to the apparent non-conservation of total observables (energy,
momentum, angular momentum) in systems of interacting charges. Indeed,
in Maxwell’s equations there is no built-in guarantee that total observables
are conserved and that the total energy and momentum form a 4-vector
quantity. Various electromagnetic paradoxes were formulated on the basis of
Maxwell’s equations and the Lorentz force law [But69, Roh60, Fur69, APV88,
Com96, Com00, Kho04, Kho05, Hni04, SG03, Teu96, Jac04, McD06, KY07a,
KY07b, KY08, TY13, Man12, BRBG09, CB09]. The suggested “solutions” of
these paradoxes involved such ad hoc constructions as “hidden momentum,”
the energy and momentum of “electromagnetic fields,” “Poincaré stresses,”
alternative non-Lorentz force laws, etc.
^25 For a good review see [Joh96].

The simplest example of a paradox in Maxwell’s theory refers to an isolated
system of two charges 1 and 2, which are free to move without influence
of external forces. By applying the standard Biot-Savart force law

$$ \mathbf{f}_1 = \frac{q_1 q_2 [\mathbf{v}_1 \times [\mathbf{v}_2 \times \mathbf{r}]]}{4\pi c^2 r^3} \tag{15.30} $$

$$ \mathbf{f}_2 = -\frac{q_1 q_2 [\mathbf{v}_2 \times [\mathbf{v}_1 \times \mathbf{r}]]}{4\pi c^2 r^3} \tag{15.31} $$

and the traditional force definition f = dp/dt it is easy to see that
Newton’s third law (f1 = −f2) is not satisfied for most geometries [How44].
As we discussed in subsection 15.1.3, this means that the total momentum
of particles $\mathbf{P}^p$ is not conserved. The usual explanation [Kel42, PN45] of
this paradox is that the two charges alone do not constitute a closed physical
system. In order to restore the momentum conservation one needs to take into
account the momentum contained in the electromagnetic field surrounding
the charges.
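A concrete geometry makes the violation of Newton’s third law in (15.30) - (15.31) explicit. The following sketch is only an illustration: the charges, velocities, and separation are arbitrary values chosen here (two perpendicular velocities, with r along v1), and the function name is hypothetical.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def biot_savart_forces(q1, q2, c, v1, v2, r):
    """Forces (15.30) and (15.31) on charges 1 and 2; r = r1 - r2."""
    rn = math.sqrt(dot(r, r))
    k = q1*q2/(4*math.pi*c**2*rn**3)
    f1 = [k*x for x in cross(v1, cross(v2, r))]
    f2 = [-k*x for x in cross(v2, cross(v1, r))]
    return f1, f2

# Assumed illustrative geometry: v1 along x, v2 along y, r along x
f1, f2 = biot_savart_forces(1.0, 1.0, 10.0,
                            [1.0, 0.0, 0.0],
                            [0.0, 1.0, 0.0],
                            [1.0, 0.0, 0.0])
total = [a + b for a, b in zip(f1, f2)]

print("f1      =", f1)
print("f2      =", f2)
print("f1 + f2 =", total)  # non-zero: the particle momentum alone is not conserved
```

In this geometry charge 2 feels no force at all (v1 is parallel to r), while charge 1 is pushed sideways, so f1 + f2 does not vanish.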
According to Maxwell’s theory, electric and magnetic fields E(r), B(r)
have momentum and energy given by integrals over entire space

$$ \mathbf{P}^f = \frac{1}{4\pi c} \int d\mathbf{r} \, [\mathbf{E}(\mathbf{r}) \times \mathbf{B}(\mathbf{r})] \tag{15.32} $$

$$ H^f = \frac{1}{8\pi} \int d\mathbf{r} \, (E^2(\mathbf{r}) + B^2(\mathbf{r})) \tag{15.33} $$

So, the idea of the standard explanation is that the total momentum of
“particles + fields” ($\mathbf{P}^p + \mathbf{P}^f$) is conserved in all circumstances.
From the point of view of RQD, it is understandable when Maxwell’s
theory associates momentum and energy with transverse time-varying elec-
tromagnetic fields in free space. As we will discuss in subsection 15.5.5,
these fields can be accepted as rough models of electromagnetic radiation.
For free propagating fields, equations (15.32) and (15.33) are supposed to be
equivalent to the sums of momenta and energies of photons, respectively.
However, Maxwell’s theory goes even further and claims that bound^26
electromagnetic fields surrounding charges or magnets also have non-zero
momentum and energy. If this were true then one could easily imagine stationary
systems (e.g., a charged magnet) where nothing is moving and where
fields E, B would possess a non-zero momentum.^27 However, this “electromagnetic
field energy” idea does not seem attractive for a couple of reasons.
First, the “electromagnetic energy” integral (15.33) for the electric field
$\mathbf{E} = q\mathbf{r}/(4\pi r^3)$ associated with a stationary point charge (e.g., an electron) is
infinite.^28 To avoid this difficulty, various “classical models” of the electron
were suggested, the simplest of which is a charged sphere of a small but finite
radius. However, these models led to other problems. One of them is the
famous “4/3 paradox”: It can be shown that the momentum of the electromagnetic
field associated with a finite-radius electron does not form a 4-vector
quantity together with its electromagnetic energy [Roh60, But69, Com97].
This violation of relativistic invariance can be “fixed” if one introduces an
extra factor of 4/3 in the formula for the field momentum. To justify this
extra factor the ad hoc idea of Poincaré stresses is sometimes introduced.^29

^26 Stationary, non-radiating.
15.2.2 Conservation laws in RQD

On the other hand, if one adopts the RQD “no-fields” approach, then total
observables of particle systems will be conserved, and their correct relativis-
tic transformation laws will hold exactly without any ad hoc assumptions.
No “electromagnetic field” contributions to these quantities need to be taken
into account. Indeed, in relativistic Hamiltonian dynamics (which is the basis
of our RQD approach to electrodynamics) the conservation laws and trans-
formation properties of observables are direct consequences of the Poincaré
group structure. The Poisson bracket of any observable F with the Hamil-
tonian H determines the time evolution of this observable
$$ \frac{dF(t)}{dt} = [F, H]_P $$
Then the conservation of observables H, P, and J follows automatically from
their vanishing Poisson brackets with H.^30 Similarly, boost transformations
of F can be obtained as solutions of (3.65)

$$ \frac{dF(\vec{\theta})}{d\vec{\theta}} = -c[F, \mathbf{K}]_P $$

In the case of total momentum-energy (P, H), the commutators (3.57) -
(3.58) hold independently of the strength of the interaction. Then the 4-vector
transformation formulas (4.3) - (4.4) follow. So, in RQD the conservation of
total observables of isolated systems and their correct transformation laws
are embedded in the formalism at the most fundamental level and can
never be violated. In particular, the total momentum P = p1 + p2 of the
two-charge system described at the beginning of subsection 15.2.1 is conserved
automatically.

^27 See, e.g., subsection 15.4.3. The angular momentum of static electromagnetic fields
in Maxwell’s theory was discussed in [Rom66].
^28 See also [Fra07] for discussion of other difficulties related to the idea of energy and
momentum contained in the electromagnetic field. An interesting critical review of Maxwell’s
electrodynamics and Minkowski space-time picture can be found in section 1 of [GZL04].
^29 See sections 16.4 - 16.6 in [Jac99].
^30 For example, the explicit conservation of the total momentum P in our theory
guarantees the resolution of paradoxes 6 and 7 in [Kho].
15.2.3 Trouton-Noble “paradox”

To be more specific, let us now discuss the conservation of the total angular
momentum in the two competing theories.
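In RQD the total angular momentum J of any isolated system of interacting
particles is conserved. In other words, there can be no torque^31 in any
isolated system of charges. This follows directly from the following Poisson
bracket in the Poincaré Lie algebra

$$ \frac{d\mathbf{J}}{dt} = [\mathbf{J}, H]_P = 0 $$

This result should hold in any inertial frame of reference. For example, in a
moving frame the relevant dynamical variables are^32

$$ \mathbf{J}(\vec{\theta}) = e^{\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} \, \mathbf{J} \, e^{-\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} $$

$$ H(\vec{\theta}) = e^{\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} \, H \, e^{-\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} $$

and the equation of motion for the total angular momentum is^33

$$
\begin{aligned}
\frac{d\mathbf{J}(\vec{\theta})}{dt'} &= [\mathbf{J}(\vec{\theta}), H(\vec{\theta})]_P = \left[ e^{\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} \mathbf{J} e^{-\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}}, \; e^{\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} H e^{-\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} \right]_P \\
&= e^{\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} [\mathbf{J}, H]_P \, e^{-\frac{ic}{\hbar} \mathbf{K} \cdot \vec{\theta}} = 0
\end{aligned}
$$

^31 The torque is defined here as the time derivative of the total angular momentum of
the system.
^32 In quantum notation.
^33 Here t′ is time measured by the moving observer. Note that this result is valid only if
K is the full interaction-dependent boost (N.27) - (N.28).

Figure 15.6: The Trouton-Noble “paradox”: two charges moving with the
same velocity v. The forces f1 and f2 produce a non-zero torque.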
Maxwell’s classical electrodynamics cannot make such a clear statement
about the conservation of the total angular momentum and the absence of
torque in all frames. This failure is at the center of the “Trouton-Noble
paradox” which has haunted Maxwell’s theory for more than a century [TN04,
PN45, Fur69, SG03, Jac04, But69, Teu96, Jef99b].
Imagine two charges moving with the same velocity vector v, which makes
an angle^34 with the vector r = r1 − r2 connecting positions of the charges
(see Fig. 15.6). A calculation using the standard Biot-Savart force formulas
(15.30) - (15.31) predicts that there should be a non-zero torque, which tries
to turn vector r until it is perpendicular to the direction of motion v [Sar47].
This result is paradoxical for two reasons. First, as we said earlier, one should
expect zero torque from the conservation of the total angular momentum.
Second, there is no torque in the reference frame that moves together with the
charges,^35 so the presence of the torque in the reference frame at rest violates
the principle of relativity. Numerous attempts to explain this paradox within
Maxwell’s theory [PN45, Fur69, SG03, Jac04, But69, Teu96, Jef99b] do not
look convincing.

^34 The angle between r and v should be different from 0 and 90°. Note that in the original
Trouton-Noble experiment [TN04], two charged capacitor plates were used instead of point
charges, but this difference has no significant effect on our theoretical analysis.
^35 In this reference frame velocities of both charges are zero. So, only the Coulomb force
remains, which is directed along the vector r, thus causing no torque.

Figure 15.7: The electromagnetic induction. Current in the wire loop L can
be induced by (a) a moving solenoid with current; (b) a moving permanent
magnet.
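The Trouton-Noble torque implied by the Biot-Savart formulas (15.30) - (15.31) can be evaluated explicitly. The sketch below is a hedged illustration: it places the two charges at ±r/2 (an arbitrary choice of origin made here), uses arbitrary numerical values, and compares the torque r1 × f1 + r2 × f2 with the closed form k (v · r)[r × v], k = q1 q2/(4πc²r³), which follows from expanding the double cross product. The Coulomb forces are omitted because they act along r and produce no torque.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def biot_savart_forces(q1, q2, c, v1, v2, r):
    """Forces (15.30) and (15.31); r = r1 - r2."""
    rn = math.sqrt(dot(r, r))
    k = q1*q2/(4*math.pi*c**2*rn**3)
    f1 = [k*x for x in cross(v1, cross(v2, r))]
    f2 = [-k*x for x in cross(v2, cross(v1, r))]
    return f1, f2

def torque(q1, q2, c, v, r):
    """Total torque r1 x f1 + r2 x f2 for charges at +r/2 and -r/2 (assumed origin)."""
    f1, f2 = biot_savart_forces(q1, q2, c, v, v, r)
    r1 = [0.5*x for x in r]
    r2 = [-0.5*x for x in r]
    return [a + b for a, b in zip(cross(r1, f1), cross(r2, f2))]

def torque_closed_form(q1, q2, c, v, r):
    """k (v . r) [r x v], obtained from v x (v x r) = v (v . r) - r v^2."""
    rn = math.sqrt(dot(r, r))
    k = q1*q2/(4*math.pi*c**2*rn**3)
    return [k*dot(v, r)*x for x in cross(r, v)]

v = [1.0, 0.0, 0.0]
t45 = torque(1.0, 1.0, 10.0, v, [1.0, 1.0, 0.0])  # r at 45 degrees to v
t0 = torque(1.0, 1.0, 10.0, v, [1.0, 0.0, 0.0])   # r parallel to v
t90 = torque(1.0, 1.0, 10.0, v, [0.0, 1.0, 0.0])  # r perpendicular to v

print("torque at 45 deg:", t45)
print("torque at  0 deg:", t0)
print("torque at 90 deg:", t90)
```

The torque is non-zero at 45° and vanishes when r is parallel or perpendicular to v, in line with the restriction on the angle stated in the text.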
15.3 Electromagnetic induction

From equation (15.28) it is clear that if both the charge 1 and the magnet 2
are at rest, then the force between them vanishes. Classical theory describes
this situation as “a magnet at rest does not create an electric field”. One of
Faraday’s greatest discoveries was the realization that a varying magnetic
field does produce an electric field, i.e., it acts on stationary charges. This
phenomenon is called electromagnetic induction. Magnetic field variation
can result either from a changing magnetic moment $\vec{\mu}_2$ or from a change of its
position in space r2. In this section we will consider the latter source of
electromagnetic induction and some of its experimental manifestations.
15.3.1 Moving magnets

In this subsection we are going to consider the force acting on a charge at
rest (p1 = 0) from a moving magnet $\vec{\mu}_2$. In the traditional Maxwell’s theory,
a moving bar magnet creates qualitatively the same fields and forces as a
moving solenoid. However, this is not so in our approach. If the magnetic
moment $\vec{\mu}_2$ is created by a particle with spin, then the force on the charge 1
at rest is given by (15.26)^36

$$ \mathbf{f}_1^{spin} = -\frac{d}{d\mathbf{r}_1} \frac{q_1 ([\mathbf{v}_2 \times \vec{\mu}_2] \cdot \mathbf{r})}{8\pi c r^3} - \left(\frac{d}{dt}\right)_2 \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}]}{4\pi c r^3} \tag{15.34} $$

If the magnetic moment is created by a small current loop, then we should
use (15.24)

$$ \mathbf{f}_1^{orb} = -\left(\frac{d}{dt}\right)_2 \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}]}{4\pi c r^3} \tag{15.35} $$

In other words, the force produced by a moving spin has two components,
the first of which is conservative and the second is non-conservative^37

$$ \mathbf{f}_1^{spin} = \mathbf{f}_1^{cons} + \mathbf{f}_1^{non-cons} \tag{15.36} $$

The force produced by the current loop has only a non-conservative component

$$ \mathbf{f}_1^{orb} = \mathbf{f}_1^{non-cons} \tag{15.37} $$

Let us first focus on the non-conservative force component $\mathbf{f}_1^{non-cons}$, which
is common for both spin and orbital magnetic moments. We will return to
the conservative force component in subsection 15.3.3.
For macroscopic magnets the infinitesimal quantities considered thus far
should be integrated over the magnet’s volume V, e.g., the full non-conservative
force exerted by a macroscopic magnet on the charge at rest q1 is

$$ \mathbf{F}_1^{non-cons} = -\left(\frac{d}{dt}\right)_2 \int_V d\mathbf{r}_2 \, \frac{q_1 [\vec{\mu}_2 \times \mathbf{r}]}{4\pi c r^3} \tag{15.38} $$

This means that the magnet,^38 moving near a wire loop L, induces a current
in the loop as shown in Fig. 15.7(a) and (b).

^36 Recall that $(d/dt)_2$ denotes the time derivative when r1 is kept fixed.
^37 The force is defined as conservative if it can be represented as a gradient of a scalar
function (an example is given by the first term on the right hand side of (15.34)). Otherwise
the force is called non-conservative. The integral of a conservative force vector along any
closed loop is zero. Therefore, conservative forces on electrons cannot be detected by
measuring a current in a closed circuit.
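Let us now show that this prediction agrees quantitatively with Maxwell’s
electrodynamics. We denote by symbol e1 the force with which a microscopic
magnetic moment acts on a unit charge^39

$$ \mathbf{e}_1 \equiv \mathbf{f}_1^{non-cons} / q_1 $$

If we take the curl of this quantity, we obtain

$$
\begin{aligned}
\frac{\partial}{\partial \mathbf{r}_1} \times \mathbf{e}_1 &= -\frac{1}{4\pi c} \left(\frac{d}{dt}\right)_2 \frac{\partial}{\partial \mathbf{r}_1} \times \frac{[\vec{\mu}_2 \times \mathbf{r}]}{r^3} \\
&= -\frac{1}{4\pi c} \left(\frac{d}{dt}\right)_2 \left( -\frac{\vec{\mu}_2}{r^3} + \frac{3 (\vec{\mu}_2 \cdot \mathbf{r})\mathbf{r}}{r^5} \right) \\
&= -\frac{1}{c} \left(\frac{d}{dt}\right)_2 \mathbf{b}_1
\end{aligned} \tag{15.39}
$$

where b1 is the “magnetic field” (15.29) of the magnetic moment. After
integrating both sides of equation (15.39) over the magnet’s volume we obtain
exactly Maxwell’s equation

$$ \frac{\partial}{\partial \mathbf{r}_1} \times \mathbf{E}_1 = -\frac{1}{c} \left(\frac{d}{dt}\right)_2 \mathbf{B}_1 $$

which expresses Faraday’s law of induction.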
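The key step in (15.39) is the vector-calculus identity for the curl of [µ2 × r]/r³ away from the origin. A minimal numerical sketch of that single step is given below; it uses central finite differences, and the test point, the magnetic moment, the step size h, and all function names are assumptions made for illustration only.

```python
import math

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

MU = [0.4, -0.3, 1.0]  # assumed magnetic moment, arbitrary units

def mu_cross_r_over_r3(r):
    """The vector function [mu x r] / r^3 appearing in (15.39)."""
    rn = math.sqrt(dot(r, r))
    return [x/rn**3 for x in cross(MU, r)]

def curl(field, r, h=1e-5):
    """Central-difference curl of a vector field at point r."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (field(rp)[i] - field(rm)[i])/(2*h)
    return [d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)]

def dipole_expression(r):
    """-mu/r^3 + 3 (mu . r) r / r^5, i.e. 4*pi times b1 of (15.29)."""
    rn = math.sqrt(dot(r, r))
    return [-MU[i]/rn**3 + 3*dot(MU, r)*r[i]/rn**5 for i in range(3)]

point = [1.0, -0.7, 0.4]
numeric = curl(mu_cross_r_over_r3, point)
analytic = dipole_expression(point)
print("numeric curl :", numeric)
print("dipole form  :", analytic)
```

Since r = r1 − r2, differentiating with respect to r1 at fixed r2 is the same as differentiating with respect to r, which is what the sketch does.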

It is important to stress that the origin of electromagnetic induction pro-
posed in our work is fundamentally different from that adopted in Maxwell’s
theory. The traditional explanation is that electromagnetic induction re-
sults from inter-dependence of time-varying electric and magnetic fields. In
our approach, the electromagnetic induction is the consequence of velocity-
dependent interactions between magnetic dipoles and charges.
15.3.2 Homopolar induction: non-conservative forces

One interesting application of the electromagnetic induction law is the
homopolar generator shown in Fig. 15.8. This device consists of a conducting
disk C and a cylindrical magnet M. Both the conducting disk and the magnet
are rigidly attached to their own shafts, and both can independently
rotate about their common vertical axis. The magnetization vector $\vec{\mu}_2$ of
each small volume element of the magnet is directed along the axis, so the
total magnetic moment is time-independent for both stationary and rotating
magnets. The shaft AB is conducting. Points A and C are connected
to sliding contacts (shown by arrows), and the circuit is closed through the
galvanometer.

^38 Either a permanent magnet or a solenoid with current.
^39 In Maxwell’s electrodynamics this is the definition of the electric field e1.

Figure 15.8: Homopolar generator. (a) the conducting disk C rotates; (b)
the magnet M rotates.
There are two modes of operation of this device. In the first mode (see Fig.
15.8(a)) the magnet is stationary while the conducting disk rotates about its
axis. The galvanometer detects a current in the circuit. This has a simple
explanation: The force acting on electrons in the metal can be obtained by
integrating formula (15.28)^40 over the magnet’s volume V

$$ \mathbf{F}_1(\mathbf{r}_1, \mathbf{v}_1) = \int_V \frac{q_1}{c} [\mathbf{v}_1 \times \mathbf{b}_1(\mathbf{r}_1)] \, d\mathbf{r}_2 \tag{15.40} $$

The full electromotive force in the circuit is obtained by integrating expression
(15.40) over the variable r1 along the closed contour A → B → C →
galvanometer → A. The velocity v1 is non-zero only on the segment B → C,^41
where the force F1 is directed radially. The integral is non-zero and the
galvanometer must show a non-vanishing current in agreement with experiments.

^40 As we saw in subsection 15.3.1, this formula is applicable to both “orbital” and “spin”
magnets at rest.
In the second operation mode (see Fig. 15.8(b)) the disk C is fixed,
and the magnet rotates. It was established by careful experiments [Gup63,
The62] that there is no current in this case. If both the magnet and the
disk rotate, then the current is the same as in the “first mode,” i.e., with
a fixed magnet. This means that rotation of the magnet has no effect on
the produced current. This experimental result looks somewhat surprising,
because from the principle of relativity one could expect that the physical
outcome (the current) should depend only on the relative movement of the
magnet and the disk. However, this conclusion is incorrect, because the
principle of relativity is applicable only to inertial movements. It cannot be
applied to rotational movements without contradictions.
Let us now analyze the rotating magnet case shown in Fig. 15.8(b) from
the point of view of the Darwin-Breit electromagnetic theory. We need to
know the integral of the force acting on electrons along the closed circuit
A → B → C → galvanometer → A. The conservative portion of the force
f1cons does not contribute to this integral. Since here we have a cylindrical
magnet rotating about its axis of magnetization, the volume integral in the
expression (15.38) for the non-conservative force is time-independent, and
the total non-conservative force acting on electrons is zero. This agrees with
the observed absence of the current.
15.3.3 Homopolar induction: conservative forces

So far in our discussion of homopolar induction we considered only
non-conservative forces, mainly because they can be detected rather easily by
measuring induced currents in closed circuits. In the beginning of the 20th
century Barnett and Kennard performed experiments [Bar12, Ken17]^42 with
the specific purpose of detecting the conservative part of forces from moving
magnets. Barnett’s experimental setup (shown schematically in Fig. 15.9)
resembled the homopolar generator discussed above. Its main parts were a
cylindrical solenoid S with current and two conducting cylinders C1 and C2
placed inside the solenoid. All three cylinders shared the same rotation axis
z. Conductors C1 and C2 formed a cylindrical capacitor. Initially they were
connected by a conducting wire W. Note that in contrast to the homopolar
generator experiment, where a current in a closed circuit was measured, the
system C1 − W − C2 in Barnett’s setup did not form a closed circuit.
So, the capacitor would obtain a non-zero charge even if the force acting on
electrons in the wire W was conservative.

^41 Velocities of electrons in the rotating conductor are shown by small arrows in Fig.
15.8(a).
^42 See also [Kho03].

Figure 15.9: Barnett’s experiment.
Similar to the homopolar generator discussed above, this apparatus could
operate in two different modes. In the first mode the cylindrical capacitor
spun about its axis. Due to the presence of the magnetic field inside S a
current ran through the wire W , and the capacitor C1 − C2 became charged.
Then the wire was disconnected, the capacitor’s rotation was stopped, and the
capacitor’s charge was measured. As expected, the measured charge was consistent
with the standard Lorentz force formula (15.40).
In the second operation mode, the capacitor was fixed while the solenoid
rotated about its axis. No charge on the capacitor was registered in this
case. This result is consistent with our theory, because, just as in the case
of homopolar generator, the non-conservative force (15.38) vanishes due to
the cylindrical symmetry of the setup, and the conservative force is absent
in the case of a moving solenoid (15.37). So, the null result of the Barnett’s
experiment confirms our earlier conclusion that moving solenoids with current
do not exert conservative forces on nearby charges.
A different result is expected in the case of a rotating permanent magnet.
In this case the conservative force component is non-zero. It can be obtained
by integrating the first term on the right hand side of (15.34) over the volume
V of the magnet

$$ \mathbf{F}_1^{cons} = -\frac{d}{d\mathbf{r}_1} \int_V d\mathbf{r}_2 \, \frac{q_1 ([\mathbf{v}_2 \times \vec{\mu}_2] \cdot \mathbf{r})}{8\pi c r^3} \tag{15.41} $$

So, a rotating cylindrical permanent magnet should induce a non-zero charge
in a stationary capacitor. A relevant experiment was performed in 1913 by
Wilson and Wilson [WW13]. This experiment was repeated in 2001
with improved accuracy [HBH+01]. For theoretical discussion of the
Wilson-Wilson experiment from the point of view of Maxwell’s electrodynamics see
[McDb, PS95a].

Figure 15.10: Schematic of the Wilson-Wilson experiment.
Schematic representation of the Wilson-Wilson experiment is shown in
Fig. 15.10. A hollow cylinder M made of magnetic dielectric (non-conducting
material) was placed in a constant magnetic field B parallel to the axis z.
The inner and outer surfaces of the cylinder (S1 and S2 , respectively) were
covered by metal, and the electrostatic potential between the two surfaces
was measured. When the cylinder was at rest, no potential was recorded,
as expected. However, when the cylinder was rotated a non-zero potential
difference was observed. This potential is a result of electric dipoles d created
in the bulk of the magnet. There are two physical mechanisms for the ap-
pearance of these dipoles. First, molecules of the dielectric material moving
in the magnetic field B get polarized (the Lorentz forces act in opposite di-
rections on positive and negative charges in the molecules). Second, moving
induced magnetic moments $\vec{\mu}$ create “electric fields” similar to the “field” of
an electric dipole. Indeed, if we compare expression (15.41) with the force
exerted on the charge q1 by an electric dipole d

$$ \mathbf{f}_1^{dipole} = -\frac{\partial}{\partial \mathbf{r}_1} \frac{q_1 (\mathbf{d} \cdot \mathbf{r})}{4\pi r^3} $$

we see that a magnetic moment $\vec{\mu}$ moving with velocity v acquires an electric
dipole of the magnitude^43

$$ \mathbf{d} = \frac{[\mathbf{v} \times \vec{\mu}]}{2c} \tag{15.42} $$
Thus, the Wilson-Wilson experiment clearly demonstrated that both kinds of
dipole moments (those due to dielectric polarization and those due to moving
magnetic moments) are present in the rotating magnet. This confirms
qualitatively our conclusion about the presence of conservative forces (15.41) near
moving permanent magnets. A quantitative description of this experiment
would require calculation of the polarization and magnetization of bodies
moving in an external magnetic field. This is beyond the scope of the theory
developed here.
15.4 Aharonov-Bohm effect

The central idea of our approach to classical electrodynamics is the rejection
of electric and magnetic fields. This also means that we reject the notion
of electromagnetic potentials $A_\mu(\mathbf{x}, t)$. In Maxwell’s electrodynamics these
potentials are assumed to be non-observable. However, there exists a class of
experiments, which allegedly proves the reality of electromagnetic potentials.
The oldest and the most famous representative in this class is the Aharonov-
Bohm effect. In its traditional interpretation a claim is made that this effect
is a manifestation of non-vanishing electromagnetic potentials in regions of
space where both electric and magnetic fields are zero. If this interpretation
were true, then our particle-only theory would be in trouble. Our goal in
this section is to show that there is no reason for concern. We are going to
demonstrate that the Aharonov-Bohm effect can be easily explained in terms
of particles interacting via Darwin-Breit potentials. This explanation relies
also on quantum properties of particles, in particular, on how the interaction
potential affects phases of quasiclassical wave packets, as described in
subsection 6.5.6.

^43 The presence of the dipole electric field near a moving magnetic moment is predicted in
the traditional special-relativistic theory as well [EL08, Ros93]; however, this prediction

$$ \mathbf{d}^{SR} = [\mathbf{v} \times \vec{\mu}]/c $$

is twice as large as our result.
15.4.1 Infinitely long solenoids or magnets

It is not difficult to show that the “magnetic field” outside an infinitely long
thin solenoid vanishes. Assuming that the solenoid is oriented along the
z-axis with x = y = 0^44 and that the observation point is at r1 = (x1, y1, 0),
we obtain^45

$$ \mathbf{B}_{long}(\mathbf{r}_1) = \int_{-\infty}^{\infty} dz \left[ -\frac{(0, 0, \mu_2)}{4\pi (x_1^2 + y_1^2 + z^2)^{3/2}} - \frac{3 \mu_2 z (x_1, y_1, -z)}{4\pi (x_1^2 + y_1^2 + z^2)^{5/2}} \right] = 0 \tag{15.43} $$

A solenoid with arbitrary cross-section can be represented as a bunch of
parallel thin solenoids.^46 If the observation point r1 is outside the solenoid’s
volume, then equation (15.43) holds for each thin segment, and the total
“magnetic field” at point r1 also vanishes. The same analysis applies to
infinitely long bar magnets of arbitrary cross-section. Thus we conclude that
the force acting on a moving charge outside an infinitely long magnet (either
permanent magnet or solenoid with current) is zero. This conclusion agrees
with calculations based on Maxwell’s equations. See, for example, Problem
5.2(a) in [Jac99].
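The vanishing of the integral (15.43) can also be confirmed by direct numerical quadrature. The sketch below is illustrative only: it maps the infinite z-range onto a finite interval with the substitution z = R tan t (R being the distance from the solenoid axis) and applies a simple midpoint rule. The substitution, the number of steps, the function name, and the sample point are choices made here, not part of the original derivation.

```python
import math

def b_long(mu2, x1, y1, n=20000):
    """Midpoint-rule estimate of the integral (15.43) at r1 = (x1, y1, 0).

    mu2 is the magnetization per unit length; z = R*tan(t) maps
    (-inf, inf) onto (-pi/2, pi/2).
    """
    R = math.hypot(x1, y1)
    total = [0.0, 0.0, 0.0]
    dt = math.pi/n
    for i in range(n):
        t = -math.pi/2 + (i + 0.5)*dt
        z = R*math.tan(t)
        jac = R/math.cos(t)**2               # dz/dt
        rho2 = x1*x1 + y1*y1 + z*z
        a = -mu2/(4*math.pi*rho2**1.5)       # coefficient of (0, 0, 1)
        b = -3*mu2*z/(4*math.pi*rho2**2.5)   # coefficient of (x1, y1, -z)
        total[0] += b*x1*jac*dt
        total[1] += b*y1*jac*dt
        total[2] += (a - b*z)*jac*dt
    return total

field = b_long(mu2=1.5, x1=0.8, y1=-1.1)
print("B_long components:", field)  # all three should be numerically close to zero
```

The x and y components cancel because their integrands are odd in z, while the z component cancels between the two terms of the integrand.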
However, the vanishing force does not mean that the potential energy of
the charge-solenoid interaction is zero as well. In the case of a thin solenoid,
the potential energy can be found by integrating the last term in equation
(15.27) along the length of the solenoid and noticing that the mixed product
$([\vec{\mu}_2 \times \mathbf{v}_1] \cdot \mathbf{r}_1)$ is independent of $z$. Denoting by
$r \equiv (x_1^2 + y_1^2)^{1/2}$ the particle-solenoid distance, we obtain
$$
V_{long} = \int_{-\infty}^{\infty} dz\, \frac{q_1([\vec{\mu}_2 \times \mathbf{v}_1] \cdot \mathbf{r}_1)}{4\pi c(x_1^2 + y_1^2 + z^2)^{3/2}} = \frac{q_1([\vec{\mu}_2 \times \mathbf{v}_1] \cdot \mathbf{r}_1)}{2\pi c r^2} \tag{15.44}
$$
44
i.e., the solenoid’s points have coordinates $\mathbf{r}_2 = (0, 0, z)$
45
Here we integrate equation (15.29) over the (infinite) length of the solenoid. This time
$\mu_2$ should be understood as magnetization per unit length of the solenoid.
46
see Fig. 15.5
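For completeness, the elementary z-integration behind (15.44) can be written out explicitly. This worked step is our own addition; it uses only the antiderivative shown and the definition $r^2 = x_1^2 + y_1^2$.

```latex
\int_{-\infty}^{\infty}\frac{dz}{(r^2+z^2)^{3/2}}
  = \left[\frac{z}{r^2\sqrt{r^2+z^2}}\right]_{-\infty}^{\infty}
  = \frac{1}{r^2}-\left(-\frac{1}{r^2}\right)
  = \frac{2}{r^2},
\qquad\text{so that}\qquad
\frac{1}{4\pi c}\cdot\frac{2}{r^2}=\frac{1}{2\pi c\,r^2}.
```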
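The acceleration of the moving charge is found, as usual, by applying
Hamilton’s equations of motion
$$
\frac{d\mathbf{p}_1}{dt} = [\mathbf{p}_1, H]_P = -\frac{\partial V_{long}}{\partial \mathbf{r}_1} = \frac{q_1[\mathbf{p}_1 \times \vec{\mu}_2]}{2\pi m_1 c r^2} + \frac{q_1([\vec{\mu}_2 \times \mathbf{p}_1] \cdot \mathbf{r}_1)\mathbf{r}_1}{\pi m_1 c r^4}
$$
$$
\frac{d\mathbf{r}_1}{dt} = [\mathbf{r}_1, H]_P = \frac{\partial H}{\partial \mathbf{p}_1} = \frac{\mathbf{p}_1}{m_1} - \frac{p_1^2 \mathbf{p}_1}{2m_1^3 c^2} + \frac{q_1[\mathbf{r}_1 \times \vec{\mu}_2]}{2\pi m_1 c r^2}
$$
$$
\begin{aligned}
\frac{d^2\mathbf{r}_1}{dt^2} &\approx \frac{\dot{\mathbf{p}}_1}{m_1} + \frac{q_1[\mathbf{p}_1 \times \vec{\mu}_2]r_1^2}{2\pi m_1^2 c r^4} - \frac{q_1[\mathbf{r}_1 \times \vec{\mu}_2](\mathbf{r}_1 \cdot \mathbf{p}_1)}{\pi m_1^2 c r^4} \\
&= \frac{q_1}{\pi m_1^2 c r^4}\left([\mathbf{p}_1 \times \vec{\mu}_2]r_1^2 - ([\mathbf{p}_1 \times \vec{\mu}_2] \cdot \mathbf{r}_1)\mathbf{r}_1 - [\mathbf{r}_1 \times \vec{\mu}_2](\mathbf{r}_1 \cdot \mathbf{p}_1)\right) \\
&= \frac{q_1}{\pi m_1^2 c r^4}\left(-[\mathbf{r}_1 \times [\mathbf{r}_1 \times [\mathbf{p}_1 \times \vec{\mu}_2]]] - [\mathbf{r}_1 \times \vec{\mu}_2](\mathbf{r}_1 \cdot \mathbf{p}_1)\right) \\
&= \frac{q_1}{\pi m_1^2 c r^4}\left(-[\mathbf{r}_1 \times \mathbf{p}_1](\mathbf{r}_1 \cdot \vec{\mu}_2) + [\mathbf{r}_1 \times \vec{\mu}_2](\mathbf{r}_1 \cdot \mathbf{p}_1) - [\mathbf{r}_1 \times \vec{\mu}_2](\mathbf{r}_1 \cdot \mathbf{p}_1)\right) \\
&= 0
\end{aligned} \tag{15.45}
$$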
where we took into account that (r1 · ~µ2 ) = 0. This agrees with the van-
ishing “magnetic field” found earlier and presents a curious example of a
non-vanishing potential, which does not produce any force on charges. Ex-
perimental manifestations of such potentials will be discussed in this section.
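The vector algebra leading to (15.45) is easy to get wrong by a sign, so a numerical spot check may be reassuring. The sketch below is our own illustration (the helper names `cross`, `dot` and `acceleration_bracket` are hypothetical); it evaluates the bracket in the second line of (15.45) for random vectors satisfying $(\mathbf{r}_1 \cdot \vec{\mu}_2) = 0$.

```python
import random

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def acceleration_bracket(r1, p1, mu2):
    """[p1 x mu2] r1^2 - ([p1 x mu2] . r1) r1 - [r1 x mu2] (r1 . p1)."""
    pxm = cross(p1, mu2)
    rxm = cross(r1, mu2)
    r_sq = dot(r1, r1)
    return tuple(pxm[i] * r_sq - dot(pxm, r1) * r1[i] - rxm[i] * dot(r1, p1)
                 for i in range(3))

random.seed(0)
worst = 0.0
for _ in range(1000):
    r1 = (random.uniform(-2, 2), random.uniform(-2, 2), 0.0)   # charge in the plane z = 0
    p1 = tuple(random.uniform(-2, 2) for _ in range(3))        # arbitrary momentum
    mu2 = (0.0, 0.0, random.uniform(-2, 2))                    # magnetization along z
    worst = max(worst, max(abs(c) for c in acceleration_bracket(r1, p1, mu2)))
print(worst)   # zero up to rounding errors
```

If $\mathbf{r}_1$ is given a component along $\vec{\mu}_2$, the bracket no longer vanishes, which shows that the condition $(\mathbf{r}_1 \cdot \vec{\mu}_2) = 0$ is essential.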
15.4.2 Aharonov-Bohm experiment

In the preceding subsection we concluded that charges do not experience
any force (acceleration) when they move in the vicinity of a straight infinite
magnetized solenoid or a permanent bar magnet. However, the absence of
force does not mean that charges do not “feel” the presence of the magnet.
In spite of zero magnetic (and electric) “field”, infinite solenoids/rods
have a non-zero effect on particle wave functions and their interference.
This effect was first predicted by Aharonov and Bohm [AB59] and later
confirmed in experiments [Cha60, TOM+86, OMK+86]. Experimentally it
was found that the interference of the two wave packets at point B depends
on the magnetization of the solenoid/rod, in spite of zero force acting
on the electrons. The explanation proposed by Aharonov and Bohm was
based on electromagnetic potentials in the multiply-connected topology of
space induced by the presence of the solenoid. There exist attempts to explain
the Aharonov-Bohm effect as a result of classical electromagnetic force
that creates a “time lag” between wave packets moving on different sides of
the solenoid [RGR91, Boy06, Boy05, Boy07a, Boy07b]. However, this approach
seems to be in contradiction with recent measurements, which failed
to detect such a “time lag” [CBB07]. Several other non-conventional explanations
of the Aharonov-Bohm effect were also suggested in the literature
[SC92, Wes98, Pin04, HN08]. Here we suggest a different explanation [Ste08].

Figure 15.11: The Aharonov-Bohm experiment. The vertical infinite thin
magnetized rod with linear magnetization density µ2 is shown by grey arrows.

Let us consider the idealized version of the Aharonov-Bohm experiment
shown in Fig. 15.11: An infinite solenoid or ferromagnetic fiber with a negligible
cross-section and linear magnetization density µ2 is erected vertically
at the origin. The electron wave packet is split into two parts (e.g., by using
a double-slit) at point A. The subpackets travel on both sides of the
solenoid/bar with constant velocity v1 , and the distance of the closest ap-
proach is R. The subpackets rejoin at point B, where the interference is mea-
sured. (The two trajectories AA1 B1 B and AA2 B2 B are denoted by dashed
lines.) The distance AB is sufficiently large, so that the two paths can be
regarded as parallel to the y-axis everywhere
r1 (t) = (±R, v1y t, 0) (15.46)
To estimate the solenoid’s effect on the interference, we need to turn to
the quasiclassical representation of particle dynamics from subsection 6.5.6.
We have established there that the center of the wave packet is moving in
accordance with Heisenberg’s equations of motion. In our case, no force
is acting on the electrons, so their trajectories (15.46) are independent of
magnetization. We also established in 6.5.6 that the overall phase factor of
the wave packet changes in time as $\exp\left(\frac{i}{\hbar}\phi(t)\right)$, where the action integral $\phi(t)$
is given by
$$
\phi(t) \equiv \int_{t_0}^{t} \left[ \frac{m_1 v_{1y}^2(t')}{2} - V_{long}(t') \right] dt' \tag{15.47}
$$
and Vlong (t) is the time dependence of the potential (15.44) experienced by
the electron. In the Aharonov-Bohm experiment the electron’s wave packet
separates into two subpackets that travel along different paths AA1 B1 B and
AA2 B2 B. Therefore, the phase factors accumulated by the two subpackets
are generally different, and the interference of the “left” and “right” wave
packets at point B will depend on this phase difference
$$
\Delta\phi = \frac{1}{\hbar}(\phi_{left} - \phi_{right})
$$
Let us now calculate the relative phase difference in the geometry of Fig.
15.11. The kinetic energy term in (15.47) does not contribute, because velocity
remains constant and equal for both paths due to (15.45). However, the
potential energy of the charge 1 is different for the two paths. For all points
on the “right” path the numerator of the expression (15.44) is $-q_1\mu_2 v_{1y}R$
and for the “left” path the numerator is $q_1\mu_2 v_{1y}R$. Then the total phase
difference
$$
\Delta\phi = \frac{1}{\hbar}\int_{-\infty}^{\infty} \frac{q_1\mu_2 R v_{1y}}{\pi c(R^2 + v_{1y}^2 t^2)}\, dt = \frac{e\mu_2}{\hbar c} \tag{15.48}
$$
does not depend on the electron’s velocity or on the value of R. This
phase difference is proportional to the solenoid’s magnetization µ2. So, all
essential properties of the Aharonov-Bohm effect are fully reproduced within
our approach.47
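That the integral in (15.48) depends neither on R nor on the velocity can be illustrated numerically. The following is a rough sketch under our own assumptions: the helper `integrate_all_time` (midpoint rule after the substitution t = tan α) and the function name `ab_phase` are not from the text, and the result is expressed in units of $q_1\mu_2/(\hbar c)$.

```python
import math

def integrate_all_time(f, n=4000):
    """Integrate f(t) over -inf < t < inf using t = tan(alpha) and the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = -math.pi / 2 + (i + 0.5) * h
        t = math.tan(alpha)
        total += f(t) * (1.0 + t * t) * h      # dt = (1 + t^2) d(alpha)
    return total

def ab_phase(R, v1y):
    """Phase difference (15.48) in units of q1*mu2/(hbar*c), for v1y > 0."""
    return integrate_all_time(
        lambda t: R * v1y / (math.pi * (R * R + v1y * v1y * t * t)))

# Different impact parameters and velocities give the same dimensionless phase.
for R, v in [(1.0, 1.0), (0.2, 3.0), (5.0, 0.5)]:
    print(R, v, ab_phase(R, v))
```

Each call should print a value equal to 1 within the quadrature error, in line with the closed-form result $e\mu_2/(\hbar c)$.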
It is interesting that the presence of the phase difference is not specific to
line magnets of infinite length. This effect was also seen in experiments with
short magnetized nanowires [MIB03]. This observation presents a challenge
for the traditional explanation, which must apply one logic (electromagnetic
potential in the space with “multiply-connected topology”) for infinitely long
magnets and another logic (the presence of the magnetic field) for finite bar
magnets. Our description of the Aharonov-Bohm effect is more economical,
as it applies the same logic regardless of whether the magnet is infinite
or finite.48 In both cases there is a difference between the action integrals for
electron paths passing the line magnet on the right and on the left.
15.4.3 Toroidal magnet and moving charge

The system consisting of a toroidal magnet and a moving charge is interesting
for two reasons. First, toroidal permanent magnets were used in Tonomura’s
experiments [TOM+ 86, OMK+ 86], which are regarded as the best evidence
for the Aharonov-Bohm effect. Second, classical Maxwell’s electrodynamics
has serious trouble explaining how the total momentum is conserved
in this system. This is known as “Cullwick’s paradox” [Cul52, AHR04,
McDa].
In an attempt to explain this paradox, let us apply Maxwell’s theory to a
charge moving along the symmetry axis through the center of a magnetized
torus (see Fig. 15.12). As we will see below, there is no “magnetic field”
outside the toroidal magnet, so the force acting on the charge is zero. However,
the moving charge creates its own “magnetic field” which does act on
the torus with a non-zero force. So, Newton’s third law is apparently violated.
According to McDonald [McDa], the balance of forces can be restored
if one takes into account the hypothetical “momentum of the electromagnetic
field.”49 However, this is not the whole story yet. The field momentum turns
out to be non-zero even in the case when both the magnet and the charge
are at rest. This leads to the absurd conclusion that the linear momentum of
the system does not vanish even if nothing moves. The problem is allegedly
fixed by assuming the existence of the “hidden momentum” in the magnet.
However, this explanation does not seem satisfactory, and here we would like
to suggest a different version of events.

Figure 15.12: Toroidal magnet and moving charge. “path 1” passes through
the center of the torus; “path 2” is outside the torus.

47
Our result was derived for thin ferromagnetic rods and solenoids; however, the same
arguments apply to infinite cylindrical rods and solenoids of any cross-section.
48
To find the potential energy in the case of a finite linear magnet one should simply
use finite integral limits in (15.44).
First we need to derive the Hamiltonian describing dynamics of the system
“charge 1 + toroidal magnet 2.” We introduce a Cartesian coordinate
system shown in Fig. 15.12. Assume that the torus has radius a and linear
magnetization density µ2 and that particle 1 moves straight through the center
of the torus with momentum $\mathbf{p}_1 = (0, p_{1y}, 0)$ (path 1 in Fig. 15.12). Then
we can use symmetry arguments to disregard x and z components of forces
and write (footnote 50)
$$
\begin{aligned}
\mathbf{r}_2 &= (a\cos\theta, 0, a\sin\theta) \\
\mathbf{r} &= \mathbf{r}_1 - \mathbf{r}_2 = (-a\cos\theta, r_{1y}, -a\sin\theta) \\
\vec{\mu}_2 &= (-\mu_2\sin\theta, 0, \mu_2\cos\theta) \\
[\vec{\mu}_2 \times \mathbf{r}]_y &= a\mu_2\sin^2\theta + a\mu_2\cos^2\theta = a\mu_2 \\
[\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{p}_1 &= a\mu_2 p_{1y} \\
[\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{P}_2 &= a\mu_2 M_2 V_{2y}
\end{aligned}
$$
Then the potential energy of interaction between the charge and the magnet
is obtained by integrating the potential energy in (15.25) over the length of
the torus (footnote 51)
$$
\begin{aligned}
V &= \int_0^{2\pi} d\theta \left[ -\frac{q_1 a^2\mu_2 p_{1y}}{4\pi m_1 c\,(a^2\cos^2\theta + (r_{1y} - R_{2y})^2 + a^2\sin^2\theta)^{3/2}} + \frac{q_1 a^2\mu_2 V_{2y}}{8\pi c\,(a^2\cos^2\theta + (r_{1y} - R_{2y})^2 + a^2\sin^2\theta)^{3/2}} \right] \\
&= -\frac{q_1 a^2\mu_2 p_{1y}}{2m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}} + \frac{q_1 a^2\mu_2 P_{2y}}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}}
\end{aligned} \tag{15.49}
$$
and the full Hamiltonian can be written as
$$
H = \frac{p_{1y}^2}{2m_1} + \frac{P_{2y}^2}{2M_2} - \frac{p_{1y}^4}{8m_1^3c^2} - \frac{P_{2y}^4}{8M_2^3c^2} - \frac{q_1 a^2\mu_2 p_{1y}}{2m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}} + \frac{q_1 a^2\mu_2 P_{2y}}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}}
$$
We will now switch to the quasiclassical approximation in which the light
particle 1 (presumably, an electron) is described by a localized wave packet,
whose center moves according to the laws of classical mechanics. The first
Hamilton’s equation of motion leads to the following results
$$
\begin{aligned}
\frac{dp_{1y}}{dt} &= -\frac{\partial V}{\partial r_{1y}} = -\frac{3q_1 a^2\mu_2 p_{1y}(r_{1y} - R_{2y})}{2m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}} + \frac{3q_1 a^2\mu_2 P_{2y}(r_{1y} - R_{2y})}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}} \\
\frac{dP_{2y}}{dt} &= -\frac{dp_{1y}}{dt}
\end{aligned}
$$
49
Note that in McDonald’s treatment the force is identified with the time derivative
of momentum, while in our approach the force is defined as (mass)×(acceleration).
50
We do not assume that the magnet is stationary. It can move along the y-axis. The
y-component of its velocity is denoted $V_{2y} \approx P_{2y}/M_2$, where $M_2$ is the full mass of the
magnet and $\mathbf{P}_2 = (0, P_{2y}, 0)$ is the magnet’s momentum.
51
$\mathbf{R}_2$ is the center of mass of the toroid. Here we assume that we are dealing with
a permanent toroidal magnet. For a toroidal solenoid one should integrate the potential
energy expression (15.21). Then the second term on the right hand side of (15.49) would
be absent.
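So, unlike in Maxwell’s theory, the rate of change of the 1st particle’s momentum
is non-zero and Newton’s third law is satisfied without involvement
of the “electromagnetic field momentum.” Acceleration of the charge 1 is
calculated as follows
$$
\begin{aligned}
\frac{dr_{1y}}{dt} &= \frac{p_{1y}}{m_1} - \frac{p_{1y}^3}{2m_1^3c^2} + \frac{\partial V}{\partial p_{1y}} = \frac{p_{1y}}{m_1} - \frac{p_{1y}^3}{2m_1^3c^2} - \frac{q_1 a^2\mu_2}{2m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}} \\
\frac{d^2r_{1y}}{dt^2} &= \frac{\dot{p}_{1y}}{m_1} + \frac{3q_1 a^2\mu_2 (r_{1y} - R_{2y})(v_{1y} - V_{2y})}{2m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}} \\
&\approx -\frac{3q_1 a^2\mu_2 V_{2y}(r_{1y} - R_{2y})}{4m_1 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}}
\end{aligned}
$$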
When the magnet is at rest (V2y = 0) this expression vanishes, so there is no
force (acceleration) on the particle 1, as expected.52 The force (acceleration)
acting on the magnet is found by the following steps
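$$
\begin{aligned}
\frac{dR_{2y}}{dt} &= \frac{P_{2y}}{M_2} - \frac{P_{2y}^3}{2M_2^3c^2} + \frac{\partial V}{\partial P_{2y}} = \frac{P_{2y}}{M_2} - \frac{P_{2y}^3}{2M_2^3c^2} + \frac{q_1 a^2\mu_2}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{3/2}} \\
\frac{d^2R_{2y}}{dt^2} &= \frac{\dot{P}_{2y}}{M_2} - \frac{3q_1 a^2\mu_2 (r_{1y} - R_{2y})(v_{1y} - V_{2y})}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}} \\
&\approx \frac{3q_1 a^2\mu_2 v_{1y}(r_{1y} - R_{2y})}{4M_2 c\,(a^2 + (r_{1y} - R_{2y})^2)^{5/2}}
\end{aligned}
$$
52
This result holds also for a toroidal solenoid.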
So, the magnet’s acceleration does not vanish even if V2y = 0. This is an
example of the situation described in subsection 15.1.3: the forces are not
balanced despite exact conservation of the total momentum.
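The bookkeeping of the last few formulas can be followed in a short numerical sketch. It is only an illustration of the leading-order expressions quoted above: the function name `toroid_accelerations`, the abbreviation `delta` for $r_{1y} - R_{2y}$ and all default parameter values (arbitrary units) are our own assumptions.

```python
def toroid_accelerations(p1y, P2y, delta, m1=1.0, M2=50.0,
                         q1=1.0, mu2=1.0, a=1.0, c=100.0):
    """Leading-order momentum rates and accelerations for charge 1 and magnet 2.

    delta stands for r1y - R2y; velocities are approximated as
    v1y = p1y/m1 and V2y = P2y/M2, as in the text.
    """
    A = q1 * a * a * mu2
    D = a * a + delta * delta
    v1y, V2y = p1y / m1, P2y / M2
    # dp1y/dt = -dV/dr1y, with V taken from (15.49)
    dp1 = (-3 * A * p1y * delta / (2 * m1 * c * D ** 2.5)
           + 3 * A * P2y * delta / (4 * M2 * c * D ** 2.5))
    dP2 = -dp1                                   # total momentum is conserved
    acc1 = dp1 / m1 + 3 * A * delta * (v1y - V2y) / (2 * m1 * c * D ** 2.5)
    acc2 = dP2 / M2 - 3 * A * delta * (v1y - V2y) / (4 * M2 * c * D ** 2.5)
    return dp1, dP2, acc1, acc2

# Magnet at rest: the charge is not accelerated, but the magnet is.
dp1, dP2, acc1, acc2 = toroid_accelerations(p1y=2.0, P2y=0.0, delta=0.7)
print(dp1 + dP2, acc1, acc2)
```

With the magnet at rest the sketch returns a vanishing acceleration of the charge together with a non-zero acceleration of the magnet, while the two momentum rates cancel exactly.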
To complete consideration of the quasiclassical wave packet passing through
the center of the stationary torus we need to calculate the action integral
(6.105). Essentially, we are going to integrate the potential (15.49) over time
$$
\phi_0 = -\int_{-\infty}^{\infty} V(t)\, dt = \int_{-\infty}^{\infty} dt\, \frac{q_1 a^2\mu_2 v_{1y}}{2c\,(a^2 + v_{1y}^2 t^2)^{3/2}} = \frac{q_1\mu_2}{c} \tag{15.50}
$$
Here we set $R_{2y} = P_{2y} = 0$, $p_{1y} \approx m_1 v_{1y}$, $r_{1y} = v_{1y}t$.

Now let us consider a charge whose trajectory passes outside the stationary
torus (path 2 in Fig. 15.12). The force acting on the charge vanishes,
so we can assume that the wave packet travels with constant velocity along
a straight line
$$
\mathbf{r}(t) \approx \mathbf{r}_1(t) = (R, v_{1y}t, 0) \tag{15.51}
$$
To calculate the action integral we repeat our earlier derivation of the potential
energy (15.49), this time taking into account x- and z-components of
vectors. We will assume that the torus is small, so that at all times $r \gg a$,
$\mathbf{r}_1 \approx \mathbf{r}$ and
$$
\begin{aligned}
[\vec{\mu}_2 \times \mathbf{r}] &= (-\mu_2 r_{1y}\cos\theta,\; \mu_2 r_{1z}\sin\theta + \mu_2 r_{1x}\cos\theta - \mu_2 a,\; -\mu_2 r_{1y}\sin\theta) \\
[\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{p}_1 &= \mu_2(-p_{1x}r_{1y}\cos\theta + p_{1y}r_{1z}\sin\theta + p_{1y}r_{1x}\cos\theta - p_{1y}a - p_{1z}r_{1y}\sin\theta) \\
&= -\frac{1}{a}(\mathbf{N}_2 \cdot \mathbf{p}_1) + \mu_2[\mathbf{r}_1 \times \mathbf{p}_1]_z\cos\theta - \mu_2[\mathbf{r}_1 \times \mathbf{p}_1]_x\sin\theta
\end{aligned}
$$
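Here we characterized magnetic properties of the toroidal magnet by the
vector $\mathbf{N}_2 = (0, \mu_2 a^2, 0)$ which is perpendicular to the plane of the torus and
whose length is $\mu_2 a^2$. Then using approximation (15.19), setting $\mathbf{p}_2 = 0$ and
integrating the potential energy in (15.25) over the length of the torus, we
obtain
$$
\begin{aligned}
V &= -\int_0^{2\pi} d\theta\, \frac{q_1 a[\vec{\mu}_2 \times \mathbf{r}] \cdot \mathbf{p}_1}{4\pi m_1 c r^3} \\
&\approx -\frac{q_1}{4\pi m_1 c}\int_0^{2\pi} d\theta \left(-(\mathbf{N}_2 \cdot \mathbf{p}_1) + \mu_2 a[\mathbf{r} \times \mathbf{p}_1]_z\cos\theta - \mu_2 a[\mathbf{r} \times \mathbf{p}_1]_x\sin\theta\right) \left(\frac{1}{r^3} + \frac{3a(r_x\cos\theta + r_z\sin\theta)}{r^5}\right) \\
&= \frac{q_1(\mathbf{N}_2 \cdot \mathbf{p}_1)}{2m_1 c r^3} - \frac{3q_1\mu_2 a^2}{4\pi m_1 c r^5}\int_0^{2\pi} d\theta \left([\mathbf{r} \times \mathbf{p}_1]_z r_x\cos^2\theta - [\mathbf{r} \times \mathbf{p}_1]_x r_z\sin^2\theta\right) \\
&= \frac{q_1(\mathbf{N}_2 \cdot \mathbf{p}_1)}{2m_1 c r^3} - \frac{3q_1\mu_2 a^2}{4m_1 c r^5}\left([\mathbf{r} \times \mathbf{p}_1]_z r_x - [\mathbf{r} \times \mathbf{p}_1]_x r_z\right) \\
&= \frac{q_1(\mathbf{N}_2 \cdot \mathbf{p}_1)}{2m_1 c r^3} + \frac{3q_1}{4m_1 c r^5}\left([[\mathbf{p}_1 \times \mathbf{r}] \times \mathbf{r}] \cdot \mathbf{N}_2\right) \\
&= \frac{q_1(\mathbf{N}_2 \cdot \mathbf{p}_1)}{2m_1 c r^3} - \frac{3q_1}{4m_1 c r^5}\left((\mathbf{p}_1 \cdot \mathbf{N}_2)r^2 - (\mathbf{r} \cdot \mathbf{N}_2)(\mathbf{p}_1 \cdot \mathbf{r})\right) \\
&= -\frac{q_1(\mathbf{N}_2 \cdot \mathbf{p}_1)}{4m_1 c r^3} + \frac{3q_1(\mathbf{r} \cdot \mathbf{N}_2)(\mathbf{p}_1 \cdot \mathbf{r})}{4m_1 c r^5}
\end{aligned} \tag{15.52}
$$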
The time dependence of this potential energy is obtained by substitution of
(15.51) in (15.52). As expected, the corresponding action integral vanishes
$$
\phi_R = -\int_{-\infty}^{\infty} V(t)\, dt = -\int_{-\infty}^{\infty} dt \left[ -\frac{q_1 N_2 v_{1y}}{4c\,(R^2 + v_{1y}^2 t^2)^{3/2}} + \frac{3q_1 N_2 v_{1y}^3 t^2}{4c\,(R^2 + v_{1y}^2 t^2)^{5/2}} \right] = 0
$$
Comparing this result with (15.50) we see that the phase difference for the
two paths (inside and outside the torus) is
$$
\Delta\phi = \frac{1}{\hbar}(\phi_0 - \phi_R) = \frac{e\mu_2}{\hbar c}
$$
This is the same result as in the case of an infinite linear solenoid (15.48). Note
that this phase shift does not depend on the radius of the magnet a or on the
charge’s velocity $v_{1y}$. This is in full agreement with Tonomura’s experiments
[TOM+86, OMK+86].
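Both time integrals entering this phase difference can be checked by direct quadrature. The sketch below is our own illustration, assuming the convention that the action integral is minus the time integral of the potential; the names `integrate_all_time`, `phi_inside` and `phi_outside` are hypothetical, and the results are expressed in units of $q_1\mu_2/c$ and $q_1 N_2/c$, respectively.

```python
import math

def integrate_all_time(f, n=20000):
    """Integrate f(t) over -inf < t < inf using t = tan(alpha) and the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        alpha = -math.pi / 2 + (i + 0.5) * h
        t = math.tan(alpha)
        total += f(t) * (1.0 + t * t) * h      # dt = (1 + t^2) d(alpha)
    return total

def phi_inside(a, v1y):
    """Action integral for path 1 (through the torus) in units of q1*mu2/c."""
    return integrate_all_time(
        lambda t: a * a * v1y / (2 * (a * a + v1y * v1y * t * t) ** 1.5))

def phi_outside(R, v1y):
    """Action integral for path 2 (outside the torus) in units of q1*N2/c."""
    def minus_potential(t):
        s = R * R + v1y * v1y * t * t
        return v1y / (4 * s ** 1.5) - 3 * v1y ** 3 * t * t / (4 * s ** 2.5)
    return integrate_all_time(minus_potential)

print(phi_inside(1.0, 1.0), phi_inside(0.5, 2.0))    # both close to 1
print(phi_outside(2.0, 1.0), phi_outside(0.5, 3.0))  # both close to 0
```

The first line of output illustrates that the phase accumulated on the inner path is independent of the torus radius and of the velocity; the second shows the cancellation of the two terms of (15.52) along the outer path.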
15.5 Fast moving charges and radiation

In subsection 15.1.2 we calculated forces (15.9) - (15.10) acting between two
charges in relative motion. These formulas were approximate as they included
only terms of order $(v/c)^2$ and lower. So, they could not be applied to
situations in which charges move with high velocities comparable to the speed
of light. Moreover, we tacitly assumed that accelerations of our charges were
low, so that the 3rd order interaction (14.11) could be ignored. By doing
so, we neglected the possibility of photon emission by the interacting
charges.
In this section we will try to fill these gaps and discuss (albeit only qual-
itatively) RQD effects associated with high velocities and accelerations of
charges. We will compare these effects with those predicted by the standard
Maxwell’s theory. In the next chapter we will see how these differences can
be observed in experiments.
15.5.1 Fast moving charge in RQD

Let us now derive the RQD inter-particle potential beyond the $(v/c)^2$ approximation.
We will be interested in a specific setup in which charge q1 is
moving with a high constant velocity v1 ≈ c and momentum p1 ≫ m1 c along
the z-axis, while the charge 2 is resting at the distance y from the beam line,
as shown in fig. 15.13. For simplicity, we will choose our axes in such a way
that the point z1 = 0 on the beam line corresponds to the closest approach
between the two charges. Likewise, t = 0 is the time when particle 1 passes
through this point. We will assume that charge 2 is very small (q2 ≪ q1 ),
so that its presence has no visible effect on the straight-line movement of q1 .
Furthermore, the mass m2 is taken to be infinitely large, so that in the course
of our thought experiment this particle does not move (v2 = 0). Our goal is
to calculate the force $\mathbf{f}_2$ experienced by the test charge $q_2$. More precisely,
we are interested in the ratio
$$
\mathbf{e} \equiv \mathbf{f}_2/q_2 \tag{15.53}
$$
which in Maxwell’s electrodynamics goes by the name “electric field”.

Figure 15.13: For calculation of RQD “fields” in (15.57) - (15.59) and
Liénard-Wiechert fields in (15.61). Full circles mark positions of particles
1 and 2 at time t = 0. The open circle marks position of the particle 1 at an
earlier time t′.

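Let us start from evaluating the interaction energy between charges 1
and 2 beyond the $(v_1/c)^2$ approximation. Near the energy shell we can use
formula (9.25) for the interaction operator (footnote 53)
$$
\begin{aligned}
V_{2d} &\approx -\frac{q_1q_2\hbar^2c^2}{(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}_2\,d\mathbf{p}_1\, \frac{m_1c^2}{\sqrt{\omega_{\mathbf{p}_1}\omega_{\mathbf{p}_1+\mathbf{k}}}}\, \frac{W_\mu(\mathbf{p}_2 - \mathbf{k}; \mathbf{p}_2)\,U^\mu(\mathbf{p}_1 + \mathbf{k}; \mathbf{p}_1)}{(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})^2 - c^2k^2}\; d^\dagger_{\mathbf{p}_2-\mathbf{k}}\, a^\dagger_{\mathbf{p}_1+\mathbf{k}}\, d_{\mathbf{p}_2}\, a_{\mathbf{p}_1} \\
&\approx -\frac{q_1q_2\hbar^2c^2}{(2\pi\hbar)^3}\int d\mathbf{k}\,d\mathbf{p}_2\,d\mathbf{p}_1\, \frac{m_1c^2}{\sqrt{\omega_{\mathbf{p}_1}\omega_{\mathbf{p}_1+\mathbf{k}}}}\, \frac{U^0(\mathbf{p}_1 + \mathbf{k}; \mathbf{p}_1)}{(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})^2 - c^2k^2}\; d^\dagger_{\mathbf{p}_2-\mathbf{k}}\, a^\dagger_{\mathbf{p}_1+\mathbf{k}}\, d_{\mathbf{p}_2}\, a_{\mathbf{p}_1} \\
&\equiv \int d\mathbf{k}\,d\mathbf{p}_2\,d\mathbf{p}_1\, w_{2d}(\mathbf{p}_1 + \mathbf{k}, \mathbf{p}_1, \mathbf{k})\; d^\dagger_{\mathbf{p}_2-\mathbf{k}}\, a^\dagger_{\mathbf{p}_1+\mathbf{k}}\, d_{\mathbf{p}_2}\, a_{\mathbf{p}_1}
\end{aligned} \tag{15.54}
$$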
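53
We ignored spins of the two particles, took the limit $m_2 \to \infty$ and used formula
(11.24), which tells us that on the energy shell $V_{2d} = \Sigma_2^c = F_2^c$.

According to subsection 8.2.8, the position-space representation of this potential
can be obtained by Fourier-transforming the coefficient function
$w_{2d}(\mathbf{p}_1 + \mathbf{k}, \mathbf{p}_1, \mathbf{k})$ in (15.54). Since we are interested only in the long-range component
of our interaction, the relevant integration range is around $|\mathbf{k}| = 0$. So, we
will assume $k \ll p_1$ and $\omega_{\mathbf{p}} \approx cp$. Then from (J.67) and (H.8) we obtain
$$
\frac{m_1c^2}{\sqrt{\omega_{\mathbf{p}_1}\omega_{\mathbf{p}_1+\mathbf{k}}}} \approx \frac{m_1c}{p_1}
$$
$$
U^0(\mathbf{p}_1 + \mathbf{k}; \mathbf{p}_1) = \left[\sqrt{\omega_{\mathbf{p}_1+\mathbf{k}} + m_1c^2}\,\sqrt{\omega_{\mathbf{p}_1} + m_1c^2} + \sqrt{\omega_{\mathbf{p}_1+\mathbf{k}} - m_1c^2}\,\sqrt{\omega_{\mathbf{p}_1} - m_1c^2}\;\frac{(\mathbf{p}_1 + \mathbf{k}) \cdot \mathbf{p}_1}{|\mathbf{p}_1 + \mathbf{k}|\,p_1}\right]\frac{1}{2m_1c^2} \approx \frac{p_1}{m_1c}
$$
$$
w_{2d}(\mathbf{p}_1 + \mathbf{k}, \mathbf{p}_1, \mathbf{k}) \approx -\frac{q_1q_2\hbar^2c^2}{(2\pi\hbar)^3} \cdot \frac{1}{(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})^2 - c^2k_x^2 - c^2k_y^2 - c^2k_z^2} \tag{15.55}
$$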
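The non-negative expression $\Omega(k_x, k_y, k_z) \equiv (\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})^2$ in the denominator
is a function, which vanishes at $k_x = k_y = k_z = 0$ and has zero first
derivatives there (footnote 54)
$$
\begin{aligned}
\left.\frac{\partial\Omega}{\partial k_y}\right|_{\mathbf{k}=0} &= \left.-2(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})\frac{c^2k_y}{\omega_{\mathbf{p}_1+\mathbf{k}}}\right|_{\mathbf{k}=0} = 0 \\
\left.\frac{\partial\Omega}{\partial k_z}\right|_{\mathbf{k}=0} &= \left.-2(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})\frac{c^2(p_{1z} + k_z)}{\omega_{\mathbf{p}_1+\mathbf{k}}}\right|_{\mathbf{k}=0} = 0
\end{aligned}
$$
For second derivatives we obtain
$$
\begin{aligned}
\left.\frac{\partial^2\Omega}{\partial k_z^2}\right|_{\mathbf{k}=0} &= \left.-2\frac{c^2(p_{1z} + k_z)}{\omega_{\mathbf{p}_1+\mathbf{k}}}\frac{\partial}{\partial k_z}(\omega_{\mathbf{p}_1} - \omega_{\mathbf{p}_1+\mathbf{k}})\right|_{\mathbf{k}=0} \\
&= \left.2\,\frac{c^2(p_{1z} + k_z)}{\omega_{\mathbf{p}_1+\mathbf{k}}}\,\frac{c^2(p_{1z} + k_z)}{\omega_{\mathbf{p}_1+\mathbf{k}}}\right|_{\mathbf{k}=0} = \frac{2c^4p_{1z}^2}{\omega_{\mathbf{p}_1}^2} = 2v_{1z}^2 \\
\left.\frac{\partial^2\Omega}{\partial k_y^2}\right|_{\mathbf{k}=0} &= \left.\frac{\partial^2\Omega}{\partial k_y\partial k_z}\right|_{\mathbf{k}=0} = 0
\end{aligned}
$$
54
We took into account that $p_{1x} = p_{1y} = 0$.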
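Then the Taylor expansion around $\mathbf{k} = 0$ yields $\Omega(k_x, k_y, k_z) \approx k_z^2v_1^2$. Substituting
this expression in (15.55), we obtain
$$
w_{2d}(\mathbf{p}_1 + \mathbf{k}, \mathbf{p}_1, \mathbf{k}) \approx \frac{q_1q_2\hbar^2}{(2\pi\hbar)^3} \cdot \frac{1}{k_x^2 + k_y^2 + k_z^2(1 - v_1^2/c^2)}
$$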
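The quality of the quadratic approximation $\Omega \approx k_z^2 v_1^2$ can be probed numerically. This is a minimal sketch with our own assumptions: units with $m_1 = c = 1$, a sample momentum $p_{1z} = 10$, and the hypothetical function names `omega` and `big_omega`.

```python
import math

def omega(px, py, pz, m=1.0, c=1.0):
    """Relativistic one-particle energy sqrt(m^2 c^4 + c^2 p^2)."""
    return math.sqrt(m * m * c ** 4 + c * c * (px * px + py * py + pz * pz))

def big_omega(k, p1z, m=1.0, c=1.0):
    """Omega(k) = (omega_{p1} - omega_{p1+k})^2 for p1 = (0, 0, p1z)."""
    kx, ky, kz = k
    return (omega(0.0, 0.0, p1z, m, c) - omega(kx, ky, p1z + kz, m, c)) ** 2

p1z = 10.0
v1 = p1z / omega(0.0, 0.0, p1z)        # velocity c^2 p / omega in units with c = 1

k = (1e-3, 2e-3, 3e-3)
print(big_omega(k, p1z), (v1 * k[2]) ** 2)             # nearly equal

k_transverse = (3e-3, 0.0, 0.0)
print(big_omega(k_transverse, p1z), (v1 * 3e-3) ** 2)  # first number is much smaller
```

For a purely transverse $\mathbf{k}$ the function $\Omega$ is of fourth order in $k$, which is why only $k_z$ appears in the quadratic approximation.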
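and the position-space potential is
$$
w_{2d}(\mathbf{p}_1, \mathbf{r}) = \int d\mathbf{k}\, w_{2d}(\mathbf{p}_1 + \mathbf{k}, \mathbf{p}_1, \mathbf{k})\, e^{\frac{i}{\hbar}\mathbf{k}\mathbf{r}} = \frac{q_1q_2\hbar^2}{(2\pi\hbar)^3}\int d\mathbf{k}\, \frac{e^{\frac{i}{\hbar}\mathbf{k}\mathbf{r}}}{k_x^2 + k_y^2 + k_z^2/\gamma^2} = \frac{q_1q_2\gamma}{4\pi\sqrt{x^2 + y^2 + \gamma^2z^2}} \tag{15.56}
$$
where we defined $\gamma \equiv 1/\sqrt{1 - v_1^2/c^2} \gg 1$ and $\mathbf{r} \equiv \mathbf{r}_2 - \mathbf{r}_1$.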
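The last equality in (15.56) follows from a rescaling of the integration variable. The steps below are our own sketch; they assume the standard Fourier transform of the Coulomb potential, $\frac{\hbar^2}{(2\pi\hbar)^3}\int d\boldsymbol{\kappa}\, e^{\frac{i}{\hbar}\boldsymbol{\kappa}\mathbf{r}}/\kappa^2 = 1/(4\pi r)$, and introduce the auxiliary symbols $\boldsymbol{\kappa}$ and $\mathbf{r}'$.

```latex
\begin{aligned}
w_{2d}(\mathbf{p}_1,\mathbf{r})
 &= \frac{q_1 q_2\hbar^2}{(2\pi\hbar)^3}\int d\mathbf{k}\,
    \frac{e^{\frac{i}{\hbar}(k_x x+k_y y+k_z z)}}{k_x^2+k_y^2+k_z^2/\gamma^2},
 \qquad k_z=\gamma\kappa_z,\quad dk_z=\gamma\,d\kappa_z \\
 &= \gamma\,\frac{q_1 q_2\hbar^2}{(2\pi\hbar)^3}\int d\boldsymbol{\kappa}\,
    \frac{e^{\frac{i}{\hbar}\boldsymbol{\kappa}\mathbf{r}'}}{\kappa^2},
 \qquad \mathbf{r}'\equiv(x,\,y,\,\gamma z) \\
 &= \frac{q_1 q_2\gamma}{4\pi|\mathbf{r}'|}
  = \frac{q_1 q_2\gamma}{4\pi\sqrt{x^2+y^2+\gamma^2 z^2}} .
\end{aligned}
```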
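Formula (15.56) is the potential energy of interaction between charges 1
and 2. Within our approximations, the full Hamiltonian can be written as
$$
H \approx m_2c^2 + cp_1 + \frac{q_1q_2\gamma}{4\pi\sqrt{x^2 + y^2 + \gamma^2z^2}}
$$
The force acting on the particle 2 is
$$
\mathbf{f}_2 \equiv m_2\frac{d^2\mathbf{r}_2}{dt^2} = \frac{d\mathbf{p}_2}{dt} = -\frac{\partial H}{\partial\mathbf{r}_2}
$$
and the “electric field” (15.53) at t = 0 can be obtained as the gradient of
the potential (15.56)
$$
e_x^{(\gamma)}(x, y, z) = -\frac{1}{q_2} \cdot \frac{\partial w_{2d}}{\partial x_2} = \frac{q_1\gamma x}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.57}
$$
$$
e_y^{(\gamma)}(x, y, z) = -\frac{1}{q_2} \cdot \frac{\partial w_{2d}}{\partial y_2} = \frac{q_1\gamma y}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.58}
$$
$$
e_z^{(\gamma)}(x, y, z) = -\frac{1}{q_2} \cdot \frac{\partial w_{2d}}{\partial z_2} = \frac{q_1\gamma^3 z}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.59}
$$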
It is interesting to compare this field with the one produced by a charge at
rest (γ = 1)
$$
\mathbf{e}^{(\gamma=1)}(x, y, z) = \frac{q_1\mathbf{r}}{4\pi(x^2 + y^2 + z^2)^{3/2}} \tag{15.60}
$$

Figure 15.14: Schematic “electric field” profiles along the z-direction (with coordinates
x = 0, y > 0 fixed) for a charge moving with velocity $v = c\sqrt{\gamma^2 - 1}/\gamma$
along the z-axis. The profiles are taken at time t = 0 when the charge is located
at the origin (see Fig. 15.13). Broken line - charge at rest (γ = 1); thick
line - RQD “electric field” e for a moving charge (γ = 2); thin full line - field
E for the moving charge (γ = 2) in Maxwell’s theory: (a) transversal field
components $e_y$ and $E_y$ coincide for all γ; (b) longitudinal field components
$|e_z| > |E_z|$.

For x = 0 and fixed value y > 0 we plotted $e_y$- and $e_z$-components of (15.60)
as functions of z in Fig. 15.14. They are shown by broken lines. Field
components for the moving charge (footnote 55) are shown there by thick full lines. The
effect of the charge’s velocity is twofold: First, the field profile gets squeezed
towards the charge’s position ($z_1 = 0$). Second, the peak magnitude of the
field increases. This means that the electric field configuration around a
fast-moving charge is concentrated in a narrow disk perpendicular to the
direction of motion. This disk moves together with the charge as if it were
rigidly attached to the charge’s instantaneous position.
55
See equations (15.58) and (15.59), where γ = 2 was chosen as an example.
15.5.2 Fast moving charge in Maxwell’s electrodynamics

In the preceding subsection we used our RQD formalism to find the “electric
field” generated by a fast-moving charge. Let us now see how the same
problem is solved in classical Maxwell’s electrodynamics.
The standard derivation56 involves the concept of retarded Liénard-Wiechert
fields. The idea is that electric (and magnetic) fields are not rigidly attached
to the moving charge. They radially spread around the charge with the speed
equal to the speed of light. So, the total field around the charge is not de-
termined by the charge’s instantaneous position. It is rather a function of
previous locations of the particle. The formula for the Liénard-Wiechert field
produced by the uniformly moving charge 1 at the point 2 at time t = 0 is57
$$
\mathbf{E}(\mathbf{r}_2) = \frac{q_1}{4\pi} \cdot \frac{\mathbf{r}(t') - r(t')\mathbf{v}_1/c}{\gamma^2[r(t') - (\mathbf{r}(t') \cdot \mathbf{v}_1)/c]^3} \tag{15.61}
$$
Various components in this formula are shown in Fig. 15.13. In particular,

r(t′ ) = r2 − r1 (t′ ) is the vector connecting the two charges at an earlier
time t′ = −r(t′ )/c. Now, let us find the time t′ and the charge’s position
r1 (t′ ) = (0, 0, z1 (t′ )) at which the Liénard-Wiechert field was “emitted”, such
that it reached the test particle 2 at time t = 0. As the field propagates with
the speed of light, we can write
$$
-ct' = \sqrt{y^2 + z_1^2(t')} \tag{15.62}
$$
On the other hand, in the time interval [t′ , 0] particle 1 has traveled along
the z-axis from z = z1 (t′ ) to z = 0. This condition yields
v1 t′ = z1 (t′ ) (15.63)
Solving the system of equations (15.62) - (15.63) we obtain
$$
t' = -\frac{y}{\sqrt{c^2 - v_1^2}} = -\frac{y\gamma}{c} \tag{15.64}
$$
$$
\mathbf{r}(t') \equiv \mathbf{r}_2 - \mathbf{r}_1(t') = \left(0, y, y\gamma\frac{v_1}{c}\right)
$$
$$
r(t') = y\sqrt{1 + \gamma^2\frac{v_1^2}{c^2}} = y\gamma \tag{15.65}
$$
56
see [IPL99, Car00] and section 14.1 in [Jac99]
57
See equation (14.14) in [Jac99]. The Liénard-Wiechert field is denoted by the capital
E in order to distinguish it from the “electric field” e (15.57) - (15.59) predicted in our
theory.
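As a sanity check, the retarded quantities (15.64) - (15.65) can be inserted directly into the Liénard-Wiechert formula (15.61) for the test point $\mathbf{r}_2 = (0, y, 0)$. The following is a minimal sketch; the function name and the sample values y = 2, v1 = 0.8 (in units with c = 1) are our own assumptions.

```python
import math

def lienard_wiechert_at_closest_approach(y, v1, q1=1.0, c=1.0):
    """Evaluate (15.61) at r2 = (0, y, 0), t = 0 for a charge moving along z."""
    gamma = 1.0 / math.sqrt(1.0 - (v1 / c) ** 2)
    t_ret = -y * gamma / c                 # retarded time (15.64)
    z1 = v1 * t_ret                        # retarded position of the charge (15.63)
    r_vec = (0.0, y, -z1)                  # r2 - r1(t')
    r_len = math.sqrt(y * y + z1 * z1)     # should equal y * gamma, cf. (15.65)
    v_vec = (0.0, 0.0, v1)
    r_dot_v = r_vec[2] * v1
    denom = 4 * math.pi * gamma ** 2 * (r_len - r_dot_v / c) ** 3
    field = tuple(q1 * (r_vec[i] - r_len * v_vec[i] / c) / denom for i in range(3))
    return gamma, t_ret, r_len, field

gamma, t_ret, r_len, field = lienard_wiechert_at_closest_approach(y=2.0, v1=0.8)
print(r_len, -t_ret)                                # both equal y * gamma
print(field, gamma / (4 * math.pi * 2.0 ** 2))      # E_y matches q1*gamma/(4*pi*y^2)
```

At this point the field is purely transverse, with magnitude $q_1\gamma/(4\pi y^2)$, even though it was “emitted” when the charge was at the retarded position.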
Using these results in the Liénard-Wiechert formula (15.61) we find electric
field components at time t = 0 (footnote 58)
$$
E_x(x, y, z) = \frac{q_1\gamma x}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.66}
$$
$$
E_y(x, y, z) = \frac{q_1\gamma y}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.67}
$$
$$
E_z(x, y, z) = \frac{q_1\gamma z}{4\pi(x^2 + y^2 + \gamma^2z^2)^{3/2}} \tag{15.68}
$$
Field components $E_x$ and $E_y$ perpendicular to the direction of motion are
exactly the same as our results (15.57) - (15.58). But the parallel component
(15.68) is $\gamma^2$ times smaller than (15.59). This component is shown by the thin
full line in Fig. 15.14(b).
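The comparison between the two sets of field components is easy to reproduce numerically. The sketch below simply codes formulas (15.56) - (15.59) and (15.66) - (15.68); the function names, the unit choice $q_1 = q_2 = 1$ and the sample point are our own assumptions.

```python
import math

def potential(x, y, z, gamma, q1=1.0, q2=1.0):
    """Position-space potential (15.56)."""
    return q1 * q2 * gamma / (4.0 * math.pi * math.sqrt(x * x + y * y + gamma ** 2 * z * z))

def rqd_field(x, y, z, gamma, q1=1.0):
    """RQD "electric field" components (15.57) - (15.59)."""
    s = 4.0 * math.pi * (x * x + y * y + gamma ** 2 * z * z) ** 1.5
    return (q1 * gamma * x / s, q1 * gamma * y / s, q1 * gamma ** 3 * z / s)

def maxwell_field(x, y, z, gamma, q1=1.0):
    """Lienard-Wiechert field components (15.66) - (15.68)."""
    s = 4.0 * math.pi * (x * x + y * y + gamma ** 2 * z * z) ** 1.5
    return (q1 * gamma * x / s, q1 * gamma * y / s, q1 * gamma * z / s)

gamma = 2.0
e = rqd_field(0.0, 1.0, 0.3, gamma)
E = maxwell_field(0.0, 1.0, 0.3, gamma)
print(e[1] / E[1], e[2] / E[2])    # transverse ratio 1, longitudinal ratio gamma^2
```

A finite-difference derivative of `potential` with respect to z reproduces the $\gamma^3$ factor of (15.59), which is the origin of the $\gamma^2$ discrepancy with (15.68).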
15.5.3 Kislev-Vaidman “paradox”

RQD predicts that electric and magnetic forces between charged particles
propagate instantaneously. In chapter 16 we will discuss a number of experiments
indicating that this is a more accurate representation of electromagnetic
interactions than the traditional retarded Liénard-Wiechert potentials. In
section 17.3 we will see that action-at-a-distance can be consistent with the
principle of causality.
58
For a detailed derivation see references quoted in [dSFP+ 12], e.g., section 14.1 in
[Jac99]. Note that this is the same result as the one obtained by Lorentz-transforming
field components (15.60) to the moving frame (see section 11.10 in [Jac99]). However,
the method of Lorentz transformations is questionable, because, as we explained in sec-
tion 17.3, standard special-relativistic formulas for such transformations are valid only in
the absence of interactions and cannot be used for transforming forces between particles
(=electric fields).
Figure 15.15: Movements of two charged particles (full bold lines) in the
Kislev-Vaidman paradox plotted on the t − x plane. The time on the horizontal
axis is multiplied by c, so that photon trajectories (dashed arrows)
are at 45° angles.

Additional support for these ideas is provided by the remarkable paradox

[KV02] associated with the assumption of retarded interactions in standard
Maxwell’s electrodynamics.59 Consider two particles 1 and 2 both having
the unit charge. Let us assume that their electromagnetic interaction is
transmitted by retarded potentials and that the movement of both particles
is confined to the x-axis. Let us now force the two particles to move along
certain prescribed paths plotted in Fig. 15.15 by full thick lines. Initially
(at times t < 0) both particles are kept at rest with the distance L between
them. The Coulomb interaction energy is 1/(4πL). At time t = 0 we apply
an external force which displaces particle 1 by the distance d < L toward
particle 2. The work performed by this force will be denoted $W_1$
$$
4\pi W_1 = \frac{1}{L - d} - \frac{1}{L}
$$
59
A number of related paradoxes were discussed also in [Eng05, Kho04, Kho05, Kho06].
Then we wait (footnote 60) until time $t_2$ and move both particles simultaneously by the
distance d/2 away from each other. If we make this move rapidly during a
short time interval $(t_3 - t_2) < (L - d)/c$, then the retarded field of particle
2 in the vicinity of particle 1 remains unperturbed as if particle 2 had
not been moved at all. The same is true for the field of particle 1 in the
vicinity of particle 2. Therefore the work performed by such a move is
$$
4\pi W_2 = 2\left(\frac{1}{L - d/2} - \frac{1}{L - d}\right)
$$
The total work performed in these two steps is nonzero
$$
\begin{aligned}
4\pi(W_1 + W_2) &= \frac{1}{L - d} - \frac{1}{L} + \frac{2}{L - d/2} - \frac{2}{L - d} \\
&= \frac{1}{L(1 - d/L)} - \frac{1}{L} + \frac{2}{L(1 - d/(2L))} - \frac{2}{L(1 - d/L)} \\
&\approx \frac{1}{L}\left(1 + \frac{d}{L} + \frac{d^2}{L^2}\right) - \frac{1}{L} + \frac{2}{L}\left(1 + \frac{d}{2L} + \frac{d^2}{4L^2}\right) - \frac{2}{L}\left(1 + \frac{d}{L} + \frac{d^2}{L^2}\right) \\
&= \frac{1}{L} + \frac{d}{L^2} + \frac{d^2}{L^3} - \frac{1}{L} + \frac{2}{L} + \frac{d}{L^2} + \frac{d^2}{2L^3} - \frac{2}{L} - \frac{2d}{L^2} - \frac{2d^2}{L^3} \\
&= -\frac{d^2}{2L^3}
\end{aligned} \tag{15.69}
$$
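The arithmetic of (15.69) can be verified with exact rational numbers. The sketch below is our own illustration: the function names are hypothetical, the work is expressed in units where the common factor 4π is absorbed, and `instantaneous_work` assumes an unretarded Coulomb potential, so that the second step merely returns the charges from separation L − d to L.

```python
from fractions import Fraction

def retarded_work(L, d):
    """(4*pi*W1, 4*pi*W2) for the two steps, assuming retarded fields as in the text."""
    L, d = Fraction(L), Fraction(d)
    w1 = 1 / (L - d) - 1 / L
    w2 = 2 * (1 / (L - d / 2) - 1 / (L - d))
    return w1, w2

def instantaneous_work(L, d):
    """(4*pi*W1, 4*pi*W2) when the Coulomb potential follows the charges instantly."""
    L, d = Fraction(L), Fraction(d)
    w1 = 1 / (L - d) - 1 / L
    w2 = 1 / L - 1 / (L - d)
    return w1, w2

w1, w2 = retarded_work(100, 1)
print(w1 + w2, float(w1 + w2), -1.0 / (2 * 100 ** 3))   # exact sum vs. leading term
print(sum(instantaneous_work(100, 1)))                   # exactly zero
```

The exact retarded sum equals $-d^2/(L(L-d)(2L-d))$, whose leading term for $d \ll L$ is the $-d^2/(2L^3)$ of (15.69); without retardation the two contributions cancel identically.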
This means that after the shifts are completed we find both charges in the
same configuration as before (at rest and separated by the distance L); however,
we gained some amount of energy (15.69). Of course, the balance of
energy (15.69) is not complete. It does not include the energy of photons
emitted by accelerated charges.61 However, one could, in principle, recapture
this emitted energy by surrounding the pair of particles by appropriate
photon absorbers and redirect the captured energy to perform the work of
moving the charges again. Then, it would become possible to build a perpetuum
mobile machine in which the two steps described above are repeated
indefinitely and each time some amount of energy (15.69) is gained.
60
The displacement of the charge 1 and its acceleration results in emission of electromagnetic
radiation (indicated by dashed arrows in Fig. 15.15). So, we would need to wait
for a sufficiently long time until the emitted photons have propagated far enough, so that they
do not have any effect on our two-charge system anymore.
61
According to Larmor’s formula, the energy of emitted photons is proportional to the
square of acceleration of the charges. See subsection 14.1.2.
The following explanation of this paradox was suggested by Kislev and
Vaidman [KV02]: They claim that there is another energy term missed in the
above analysis, which is related to the interference of electromagnetic waves
emitted by the two particles62 and which restores the energy balance. This
explanation does not look plausible, because there is actually no interaction
energy associated with the interference of light waves: The interference re-
sults in a redistribution of the wave’s amplitude (formation of minima and
maxima) and the wave’s local energy in space, while the total energy of the
wave remains unchanged [Gau03]. In other words, there is no interaction
between photons.63
The correct explanation of the Kislev-Vaidman “paradox” is provided by
the Darwin-Breit instantaneous action-at-a-distance theory. In the absence
of retardation of the Coulomb potential, it is easy to show that W1 + W2 = 0,
and the total work performed by moving the charges is equal to the energy
of the emitted radiation.
Another reason to doubt the validity of retarded Liénard-Wiechert po-
tentials is the paradoxical prediction [Cor86, Gri86, Ste13] that an isolated
electric dipole can move with a constant acceleration in the absence of any
external force. This paradox does not appear when charges interact instan-
taneously, as in RQD.
15.5.4 Accelerated charges

As we saw above, RQD “fields” depend on the source charge’s velocity, but
not on its acceleration. However, this does not mean that accelerations do not
play any role in electromagnetic interactions. As we have shown in subsection
14.1.2, an accelerated charge emits photons. One result of this effect is that
a part of the charge’s kinetic energy gets lost to radiation. This is called the
radiation reaction or radiation braking.
The other result is the appearance of an additional indirect interaction
between charges: An accelerated charge emits a large number of photons,
which spread around with the speed of light. Some photons reach the other
(receiving) charge and interact with it either by absorption64 or by the Compton
scattering,65 thus causing the receiving charge to accelerate. This indirect
transmission of interaction occurs with the speed of light, and is responsible
for TV, radio, and cell-phone signal propagation through air.
62
For example, in Fig. 15.15 electromagnetic waves emitted by the two charges meet at
point a, and the interference of the waves proceeds from that time on.
63
QED predicts a very weak photon-photon interaction in the 4th perturbation order,
however it is negligibly small in the situation considered here.
15.5.5 Electromagnetic fields vs. photons

So far in this book we presented many arguments suggesting that electrody-
namics (both quantum and classical) does not need electromagnetic poten-
tials (scalar and vector) and/or electromagnetic fields (E and B). Instan-
taneous inter-particle forces and the laws of quantum mechanics are quite
sufficient for the description of all electromagnetic phenomena.
The same conclusion remains true with respect to properties of the free
electromagnetic radiation. In RQD we claim that electromagnetic radiation
is simply a flow of a large number of point-like massless particles – photons.
The wave properties of light are manifestations of the quantum nature of
individual particles, and the Huygens-Maxwell wave theory of light is, in fact,
an attempt to approximate quantum wave functions of billions of photons
by two surrogate functions E(x, t) and B(x, t) [Fie04, dlT05, Car05]. As we
saw in section 1.1, ordinary quantum theory of photons (particles) can also
explain the diffraction and interference phenomena without involving the
classical waves E and B. Moreover, our particle-based explanation works
much better than the classical wave model in the limit of low-intensity light
and for the photo-electric effect.66 So, the representation of electromagnetic
radiation as a flow of discrete countable quantum particles better agrees
with experiment and is more general than the field (or wave) theory of light
[Fie04, Fie06b, dlT04a, dlT04b, dlT05]. This is an invitation to reconsider
the status of Maxwell’s electrodynamics:
Finally, the remark may be made, as previously pointed out by

Feynman [Fey85] and other authors adopting a similar approach
[LLB90], that the so called ‘classical wave theory of light’ de-
veloped in the early part of the 19th century by Young, Fresnel
64
if the receiving charge is accelerating, then it can absorb photons in a process that is
reverse to the bremsstrahlung emission.
and others is QM as it applies to photons interacting with mat-

ter. Similarly, Maxwell’s theory of CEM [=Classical ElectroMag-
netism] is most economically regarded as simply the limit of QM
when the number of photons involved in a physical measurement
becomes very large. [...] Thus experiments performed by physicists
during the last century and even earlier, were QM experiments,
now interpreted via the wavefunctions of QM, but then in terms of
‘light waves’. [...] The essential and mysterious aspects of QM, as
embodied in the wavefunction (superposition, interference) were
already well known, in full mathematical detail, almost a hundred
years earlier! J. H. Field [Fie04]
Chapter 16
EXPERIMENTAL SUPPORT
FOR RQD
Let a hundred flowers bloom, let a hundred schools of thought

contend.
Mao Zedong
Our discussion in subsection 15.5.4 suggests that there are two distinct
kinds of forces between charged particles: One is the direct bound Coulomb
or magnetic force, which is dominant when the charges are at rest or move
with low accelerations. In RQD these force fields are rigidly and permanently
attached to the source charges and react immediately to any perturbation
of the charges’ trajectories. This is equivalent to saying that the speed of
propagation of these interactions is infinite. On the other hand, in Maxwell
electrodynamics the attachment of the bound fields to the charge is not rigid:
The electric and magnetic forces are described by Liénard-Wiechert fields,
which propagate with the speed of light.
The second type of force is the indirect radiation interaction. In RQD this
interaction is transmitted by photons traveling with the speed of light. In
Maxwell’s theory, the radiation field is represented by the familiar transverse
electromagnetic wave, in which mutually perpendicular E and B vectors
oscillate and travel in space with the light’s speed.
Thus, in RQD the total force field produced by a group of moving charges
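can be written symbolically as a superposition of instantaneous and retarded
components1

$\mathbf{e} = \mathbf{e}^{\mathrm{inst}}_{\mathrm{bound}} + \mathbf{e}^{\mathrm{ret}}_{\mathrm{radiation}}$ (16.1)

The Liénard-Wiechert electric field of Maxwell’s theory is fully retarded

$\mathbf{E} = \mathbf{E}^{\mathrm{ret}}_{\mathrm{bound}} + \mathbf{E}^{\mathrm{ret}}_{\mathrm{radiation}}$ (16.2)

In this book we will not derive the explicit form of the
$\mathbf{e}^{\mathrm{ret}}_{\mathrm{radiation}}$ component.2 We will simply assume that
$\mathbf{e}^{\mathrm{ret}}_{\mathrm{radiation}} = \mathbf{E}^{\mathrm{ret}}_{\mathrm{radiation}}$ and focus on the more
interesting difference between bound fields $\mathbf{e}^{\mathrm{inst}}_{\mathrm{bound}}$ and
$\mathbf{E}^{\mathrm{ret}}_{\mathrm{bound}}$.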
The infinite propagation speed of the bound electromagnetic fields is, per-
haps, the most controversial prediction of our RQD approach. In this chapter
we will be interested in experimental techniques that can be used to verify
this prediction. As we have mentioned already, the bound fields are easily
observable in static or quasistatic situations when accelerations of charges
are low. However, if we want to measure field velocities, then we need to dis-
rupt these static configurations, thus introducing charge accelerations and,
inevitably, the emission of radiation fields. So, the experimental challenge
is to somehow minimize the effect of the radiation fields, so that dynamical
properties of the bound fields can be studied in their pure form. Several ex-
perimental approaches that can meet this challenge will be described below.
For other relevant experiments see review articles [Rec09, WFF+ 10].
16.1 Relativistic electron bunches

In subsections 15.5.1 and 15.5.2 we have calculated bound electric fields pro-
duced by a fast moving charge in RQD (15.57) - (15.59) and in the Maxwell-
Liénard-Wiechert theory (15.66) - (15.68), respectively. In both cases the
1
For brevity here we consider only the “electric” part of the field, omitting “magnetic”
interactions that act only on moving charges. For similar ideas about electromagnetic
interactions being composed of both instantaneous and retarded parts see [CSR96, Fie97,
Fie06a, Kho06, KMSR07a].
2
As explained in subsection 15.5.4, its action on a test charge is a combination of the
photon absorption and the Compton scattering.
fields have a form of a thin “pancake” perpendicular to the charge’s velocity.

However, there is one crucial difference in the field dynamics (time evolution)
predicted by these two theories. Remarkably, this difference is so significant
that it can be seen even in unsophisticated experimental setups. Let us now
discuss this experimental opportunity in more detail.
Fast-moving charges are routinely available as electron beams in accelerators.
For example, a bunch of 500 MeV electrons has the factor γ as large
as $10^3$. Then Liénard-Wiechert equations (15.64) - (15.65) suggest a peculiar
electric field dynamics when such an electron bunch leaves the accelerator’s
pipe and enters the open space. Immediately after the bunch’s emergence
there is no electric field around it. The disk-shaped field builds up gradu-
ally,3 starting from low values of y and extending to larger y as the time
progresses. Thus the electric field grows “older” as the observation point
moves away from the beam line, i.e., as y increases. For example, the peak
electric field at the distance of y = 50 cm from the beam’s axis becomes fully
formed only after the bunch has traveled γy = 500 m away from the pipe
exit point.
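The numbers quoted above are easy to reproduce. The following Python sketch is only an illustration under stated assumptions: it takes the quoted 500 MeV to be the total electron energy, uses the approximate electron rest energy of 0.511 MeV, and adopts the formation-length estimate γy from the text. The function names are ours and are not part of any established code.

```python
ELECTRON_REST_ENERGY_MEV = 0.511  # approximate electron rest energy (assumed value)


def lorentz_factor(total_energy_mev):
    """Return gamma = E / (m c^2), assuming the quoted energy is the total energy."""
    return total_energy_mev / ELECTRON_REST_ENERGY_MEV


def formation_length_m(gamma, y_m):
    """Return the estimate gamma * y for the distance the bunch must travel
    before the Lienard-Wiechert field is fully formed at transverse distance y."""
    return gamma * y_m


gamma = lorentz_factor(500.0)
print(f"gamma for 500 MeV electrons: {gamma:.0f}")  # close to 10^3
print(f"formation length at y = 0.50 m: {formation_length_m(1.0e3, 0.50):.0f} m")
```

With the rounded value γ = 10^3 used in the text, the second line gives the 500 m quoted above.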
Unlike in the Liénard-Wiechert theory, in our RQD approach there is
no “field recovery” dynamics as described above. The electric field (15.57)
- (15.59) of the bunch emerging from the accelerator is fully formed in the
entire space without delay. Here we will have a chance to compare these two
competing predictions with the experiment performed recently by the group
of Prof. Pizzella at the Frascati National Laboratory in Italy [dSFP+ 12].
16.1.1 Experiment at Frascati

The Frascati experiment used 500 MeV electron bunches ($\gamma \approx 10^3$) with
$(0.5 - 5.0) \times 10^8$ electrons per pulse. The transverse y-component of the electric
field was measured at distances y = 3, 5, 10, 20, 40, and 55 cm from the beam
line. This experiment confirmed theoretical results (15.58) or (15.67) within
experimental errors.
As we discussed above, the traditional result (15.67) relied on the re-
quirement that the electron bunch was moving uniformly long before the
measurement was done. For example, for the validity of (15.67) at y = 55
cm and t = 0,4 the earlier trajectory of the bunch should be linear, uniform,
3
Sometimes such a buildup is described as a “field recovery” of the “semi-bare electron”
[NPSP10, NPS12].
4
Here we use the same notation and assumptions as in subsections 15.5.1 - 15.5.2. In
and unconstrained for at least (55 cm) · γ ≈ 550 m from the point of observa-
tion. However, this essential condition was badly violated in the experiment!
In particular, a fully formed electric field was detected at the distance of only
92 cm from the beam pipe exit flange.
Thus the Pizzella group’s measurements were totally inconsistent with
the gradual build-up of the bunch’s electric field predicted by the classical
Maxwell-Liénard-Wiechert theory. However, they were in full agreement with
our RQD explanation, which maintains that the field is rigidly attached to
the instantaneous position of the moving charge.
16.1.2 Proposal for modified experiment

Let us suggest two simple modifications of the Frascati experimental setup,
which may provide even more spectacular validation of RQD. First, one can
change orientation of the electric field sensors so as to measure the longitudinal
z-component of the field and compare it with our prediction (15.58). As
we mentioned earlier, this quantity is expected to be $\gamma^2 \approx 10^6$ times greater
than the Liénard-Wiechert’s prediction (15.68).
Second, we propose to check whether the field at large transverse distances
y follows instantaneous positions of the moving charge, as predicted by RQD
in subsection 15.5.1. In order to verify this prediction, the experimentalists
can stop the electron bunch abruptly by placing a lead brick (beam dump) on
the beam’s path, and investigate the time evolution of the electric field after
such an interruption of the beam. An example of the proposed setup is shown
in Fig. 16.1.5 The electric field sensor6 is placed at (x = 0, y = 55 cm, z = 0).
The lead brick is positioned at (x = 0, y = 0, z = −30 cm), so that the beam
stoppage occurs at time t1 = −1 ns, i.e., 1 ns earlier than the field maximum
is supposed to reach the sensor. Next consider response of the sensor in the
two theories discussed above.
In the traditional Liénard-Wiechert approach, the electric field of the
beam will continue its motion with velocity (0, 0, v1) even after the electron
beam has been interrupted. So, the sensor will register the onset of the field
pulse at time t = 0, as if the lead brick was not there.7 The beam’s collision
particular, t = 0 is the time when the sensor (=charge 2) registers the maximum ey field
strength.
5
Coordinate axes and beam properties are the same as in Fig. 15.13.
6
which was earlier modeled by the fixed charge 2
7
It is also important to note that in the Liénard-Wiechert theory the amplitude of the
Figure 16.1: Field configurations at t = 0, i.e., after the electron bunch was
stopped by the beam dump (BD): (a) Maxwell’s theory in which the disk-
shaped electric field E of the beam has reached the sensor (S); (b) RQD in
which the runaway disk-shaped electric field is absent (a much weaker sta-
tionary field surrounds the stopped electron bunch), and the photons emitted
from the collision point have not reached the sensor yet.
with the lead brick will also result in formation of a burst of electromagnetic
radiation $\mathbf{E}^{\mathrm{ret}}_{\mathrm{radiation}}$, which will propagate radially with the speed of light,
as shown in the figure. The distance between the sensor and the collision
point is $R = \sqrt{y^2 + z^2} = \sqrt{(55\ \mathrm{cm})^2 + (30\ \mathrm{cm})^2} \approx 63$ cm. Therefore, the
electromagnetic pulse will reach the sensor at time
$t_2 = t_1 + R/c = -1\ \mathrm{ns} + (63\ \mathrm{cm})/c \approx 1$ ns, i.e., 1 ns later than the signal onset.
In our RQD approach, the field configuration does not depend on the
previous history of the beam. So, after the electron bunch is stopped, its
field suddenly transforms into a spherically symmetric shape (15.60), char-
acteristic for a charge at rest. At the same time, the field strength reduces
by the factor of γ ≈ 1000, i.e., it weakens below detector’s sensitivity. So, in
contrast to the traditional theory described above, we are not expecting to
see any sensor response at t = 0. Formation of the bremsstrahlung photon
pulse $\mathbf{e}^{\mathrm{ret}}_{\mathrm{radiation}}$ upon the beam-brick collision will proceed as described above,
and this signal will reach the sensor at time t2 ≈ 1 ns.
In short, the two theories predict quite different timings of initial sensor
t = 0 signal must be the same independent of the presence or absence of the beam dump.
This prediction was not confirmed by the Frascati experiment, which showed a significant
reduction of the signal in the presence of the beam dump. See Fig. 15 in [dSFP+ 12].
responses: In Maxwell’s theory, the signal onset will occur at t = 0, while in

our approach the first signal will reach the sensor at t ≈ 1 ns. The expected
time difference of about 1 ns should be easily detectable by the available
experimental equipment.
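The timing argument of this subsection can be checked with a few lines of Python. This is a sketch under the assumptions of the proposed setup: the bunch moves along the z-axis at essentially the speed of light and would pass z = 0 at t = 0, the beam dump sits at z = −30 cm, and the sensor is at y = 55 cm. The function and variable names are illustrative only.

```python
import math

C_CM_PER_NS = 29.9792458  # speed of light in cm per nanosecond


def beam_dump_timing(y_cm=55.0, z_dump_cm=-30.0):
    """Return (t1, R, t2) for the proposed beam-dump setup.

    t1: time (ns) at which the bunch hits the dump, assuming v ~ c and that
        the undisturbed bunch would pass z = 0 at t = 0
    R:  distance (cm) from the collision point to the sensor
    t2: arrival time (ns) of the radiation pulse at the sensor
    """
    t1 = z_dump_cm / C_CM_PER_NS
    R = math.hypot(y_cm, z_dump_cm)
    t2 = t1 + R / C_CM_PER_NS
    return t1, R, t2


t1, R, t2 = beam_dump_timing()
print(f"t1 = {t1:.2f} ns, R = {R:.1f} cm, t2 = {t2:.2f} ns")
# Maxwell's theory: signal onset at t = 0; RQD: first signal at t2.
print(f"predicted difference in signal onset: {t2:.2f} ns")
```

The difference between the two predicted onsets comes out at about 1 ns, which is the figure used in the text.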
16.2 Radiation and bound fields

Experimental studies of the electromagnetic field propagation normally in-
volve two antennas: the emitter and the receiver. The signal in the receiver
is a combined effect of both the bound and radiation fields produced by the
emitter. Separation of these two effects is a challenging experimental task,
but it is still possible due to their different physical properties.
According to RQD, the radiation signal must be proportional to the number
of photons reaching the receiver. In the simplest spherically symmetric
case, this number drops with the distance as
$e^{\mathrm{ret}}_{\mathrm{radiation}} \propto r^{-2}$. A different
distance dependence can be expected for the bound field component
$e^{\mathrm{inst}}_{\mathrm{bound}}$.
Regarded as a source of the bound field, a simple antenna can be approximated
by a time-varying electric dipole. Then at large distances we should
expect the electric field to behave like
$e^{\mathrm{inst}}_{\mathrm{bound}} \propto r^{-3}$. This suggests that at
large distances the bound field effect is overwhelmed by the radiation effect.
So, in order to detect superluminal bound fields one should either look very
close to the emitter, e.g., in its near-field zone, or try to reshape the photon
flow, so that certain regions of space are free from radiation, and the weaker
bound fields can be detected there.
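The competition between the two distance laws can be illustrated numerically. In the sketch below the amplitudes a and b are hypothetical placeholders (the text does not specify them); only the power laws r^-2 and r^-3 are taken from the discussion above. The ratio of the bound signal to the radiation signal then falls off as 1/r, and the two contributions are equal at r = b/a.

```python
def radiation_signal(r, a=1.0):
    """Radiation contribution, proportional to r^-2 (a is a hypothetical amplitude)."""
    return a / r**2


def bound_signal(r, b=1.0):
    """Bound-field contribution, proportional to r^-3 (b is a hypothetical amplitude)."""
    return b / r**3


def bound_to_radiation_ratio(r, a=1.0, b=1.0):
    """Ratio of the two contributions; it equals (b / a) / r."""
    return bound_signal(r, b) / radiation_signal(r, a)


for r in (0.1, 1.0, 10.0):
    print(f"r = {r:5.1f}  bound/radiation = {bound_to_radiation_ratio(r):.2f}")
```

In these arbitrary units the bound field dominates for r < b/a and is progressively buried under the radiation signal at larger r, which is why the near-field zone is the natural place to look for it.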
A few experiments, which succeeded in measuring the bound field velocity,
will be discussed in this section.
16.2.1 Near field studies

A remarkable study of the electromagnetic field propagation was performed
by Kholmetskii and coworkers [KMSR+ 07b, KMSR07a, MKSR11]. They
used the classical Hertz’s setup with two antennas. The emitter (E) an-
tenna produced a short pulse of electromagnetic radiation in the surrounding
space. The receiver (R) antenna was placed at a variable distance r from E,
and time-resolved voltage waveforms were recorded at R. The authors have
demonstrated that at large separations (the distance r was up to 3 m) the
signal in R was dominated by the radiation field with the distance dependence
∝ $r^{-2}$. In the near-field zone (r < 50 cm) the signals from both bound
and radiation fields were mixed with the prevalence of the bound (∝ $r^{-3}$)
field. Assuming that the radiation field propagated with the constant speed
c within the entire range of r values, the authors were able to subtract the
radiation field contribution from the total signal for all r, thus obtaining
pure bound field waveforms. Analyzing these waveforms, they estimated the
propagation speed of the bound component. In the near-field zone this value
appeared much higher than the speed of light (or even infinite).8 This result
fully agreed with the RQD idea about the instantaneous propagation of
bound fields.
Figure 16.2: Sketch of the experiment with microwave horn antennas and its
interpretation. Broken lines indicate the flux of photons (v = c) concentrated
along the emitter’s (E) axis. Half-circles represent bound electric and magnetic
fields that propagate instantaneously (v = ∞).
16.2.2 Microwave horn antennas

Perhaps the first convincing experimental observation of the superluminal
character of bound electromagnetic fields was performed by Giakos and Ishii
in 1991 [GI91b, GI91a]. They studied the propagation of microwave pulses
between two horn antennas arranged as shown in Fig. 16.2. In the first run of
this experiment the emitter (E) and the receiver (R1) antennas were placed
face to face and separated by a distance of r = 71.5 cm. The signal crossed
this air gap in 2.378 ns, which corresponded to the propagation velocity of
8
These issues have been discussed also in [Wal00, Bud10, Bud09].
$3.01 \times 10^8$ m/s, i.e., the speed of light, as expected. In the second run the
receiver (R2) was shifted away from the emitter’s axis (with or without tilting
towards the emitter). For shift distances as large as d = 34 cm the transit
time remained nearly constant despite substantial increase of the E − R2
separation. Thus, signal propagation velocities were observed as high as
$3.32 \times 10^8$ m/s, i.e., 10% higher than the speed of light.
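The apparent velocities quoted above follow from simple geometry. The sketch below is an illustration under two assumptions that are ours, not statements about the actual experiment: the receiver R2 is displaced perpendicular to the emitter's axis by d while keeping the same axial separation r, and the transit time stays exactly at its on-axis value.

```python
import math


def apparent_speed(distance_m, transit_time_s):
    """Apparent signal speed: straight-line distance divided by transit time."""
    return distance_m / transit_time_s


R_AXIAL_M = 0.715        # on-axis separation E - R1 quoted in the text
TRANSIT_TIME_S = 2.378e-9  # on-axis transit time quoted in the text
D_SHIFT_M = 0.34         # largest transverse shift quoted in the text

on_axis = apparent_speed(R_AXIAL_M, TRANSIT_TIME_S)

# Assumption: perpendicular shift, unchanged transit time.
off_axis_distance = math.hypot(R_AXIAL_M, D_SHIFT_M)
off_axis = apparent_speed(off_axis_distance, TRANSIT_TIME_S)

print(f"on-axis speed:  {on_axis:.3e} m/s")
print(f"off-axis speed: {off_axis:.3e} m/s ({off_axis / on_axis - 1:.1%} higher)")
```

Under these assumptions the off-axis value comes out roughly 10% above the on-axis one, in line with the velocities reported above.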
These results were later confirmed in a set of experiments performed by
Ranfagni and coworkers [RFPM93, RM96, MRR00]. Wide ranges of mi-
crowave frequencies and E − R separations were explored.
These observations are fully consistent with the following RQD narrative:
The emitting horn antenna generates both bound and radiation electromag-
netic fields. Due to the specific E antenna horn shape, the beam of photons
(=the radiation field) is concentrated near the antenna axis. On the other
hand, the bound field is more diffuse and short-range. When the receiver
is placed on the emitter’s axis, the signal is dominated by microwave pho-
tons, and the apparent signal velocity is close to c. When the receiver is
displaced away from the axis, the photon contribution decreases, and a more
prominent role is played by the bound instantaneous field. Thus the effective
signal propagation speed tends to increase. The radiation field still dominates
the signal, so the effective speed exceeds c by only a few percent. If
it were possible to shut off the radiation field completely, we would see an
infinite propagation speed.
Such an elimination of the radiation component is, actually, possible in
another kind of experiment that we would like to discuss in the next subsec-
tion.
16.2.3 Frustrated total internal reflection

Consider a beam of light directed from the glass side onto the interface between
glass (G1) and air (A) (see Fig. 16.3(a)). The total internal reflection occurs
when the incidence angle θ is greater than the critical angle. In such cases,
all light is reflected at the interface, and no radiation leaks into the air. The
radiation field is blocked by the interface completely.
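For orientation, the threshold angle of total internal reflection follows from Snell's law: it is the critical angle θ_c = arcsin(n_rare / n_dense). The sketch below evaluates it for a glass-air interface; the refractive index 1.5 is a typical value that we assume for illustration, since the text does not specify the glass.

```python
import math


def critical_angle_deg(n_dense, n_rare=1.0):
    """Critical angle (degrees) for light going from the denser medium
    (index n_dense) into the rarer one (index n_rare), from Snell's law."""
    if n_dense <= n_rare:
        raise ValueError("total internal reflection requires n_dense > n_rare")
    return math.degrees(math.asin(n_rare / n_dense))


# Assumed index of refraction for glass; air is taken as 1.0.
print(f"critical angle for n = 1.5: {critical_angle_deg(1.5):.1f} degrees")
```

For this assumed glass the threshold is a little under 42 degrees, so a beam hitting the interface at 45 degrees is totally reflected.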
However, a rather surprising effect occurs if another piece of glass (G2)
is placed close to the interface (see Fig. 16.3(b)). In this case, some part of
the initial light beam does cross the air gap and penetrates the piece of glass
G2. At the same time the intensity of the reflected light decreases. The total
internal reflection becomes frustrated, and the phenomenon described above
Figure 16.3: A beam of light impinging on the glass-air interface. (a) If
the incidence angle θ is greater than the critical angle, then all light is
reflected at the surface. The region of evanescent light is shown by vertical
dashed lines. (b) If a second piece of glass is placed near the interface, then
“evanescent light” is converted to the “normal light” propagating into the
second piece of glass (G2).
is called the frustrated total internal reflection (FTIR).

In recent experiments [EN93, CZJW00, BD97] the speed of light’s prop-
agation across the air gap (A) was investigated, and there were strong indi-
cations that this speed may be superluminal.9
The usual explanation of this effect is that the air gap serves as a bar-
rier for the photon propagation, and the gap crossing is an example of the
well-known quantum tunneling effect, which is widely believed to be super-
luminal.10 We suggest another (perhaps, not an alternative but a comple-
9
There are discrepancies in data interpretations by different groups (see, for example,
[RGC01, RMGC01]) and the question remains open whether or not the speed of the signal
crossing the gap may exceed c.
10
This is known as the Hartman effect [Har62]. For a recent observation of superluminal
electron tunneling see [EPC+ 08].
mentary) explanation: Being excited by the light wave, the charged particles
(electrons and nuclei) at the interface G1 - A oscillate. These oscillations
give rise to variable dipole moments at the interface. The dipoles gener-
ate bound electric and magnetic fields in the gap,11 which follow the dipole
dynamics instantaneously. The amplitude of the bound fields decreases ex-
ponentially with the distance from the G1 - A interface. So, when the size
of the gap is sufficiently small, these fields affect charged particles on the
other side of the gap forcing them to oscillate and to emit photons. These
newly created photons propagate inside the piece of glass G2 in the form of a
“normal” light beam. In our interpretation, the evanescent wave in the gap
is equivalent to instantaneous Coulomb and magnetic forces acting between
oscillating charges on the two interfaces. Although no real photons cross the
air gap A, the apparent travel time of the light pulse does not depend on
the size of the gap [HN00, AN13], i.e., the signal transmission across the gap
occurs superluminally.
11
they are also called evanescent waves
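The exponential decay of the fields in the gap can be quantified with the textbook expression for the evanescent field in classical optics, where the amplitude falls off as exp(−κz) with κ = (2π/λ) sqrt(n_dense² sin²θ − n_rare²). The sketch below uses this standard formula as an assumption (the text only states that the decay is exponential), and the values n = 1.5 and θ = 45° are chosen purely for illustration.

```python
import math


def evanescent_decay_length(wavelength, n_dense, theta_deg, n_rare=1.0):
    """Return 1/kappa, the distance over which the evanescent field amplitude
    drops by a factor of e, in the same units as the vacuum wavelength."""
    s = (n_dense * math.sin(math.radians(theta_deg))) ** 2 - n_rare ** 2
    if s <= 0:
        raise ValueError("incidence angle is not beyond the critical angle")
    kappa = 2.0 * math.pi / wavelength * math.sqrt(s)
    return 1.0 / kappa


def relative_amplitude(distance, decay_length):
    """Field amplitude at a given distance from the interface, relative to
    its value at the interface (multiple reflections are ignored)."""
    return math.exp(-distance / decay_length)


# Illustrative values: wavelength taken as the unit of length.
length = evanescent_decay_length(1.0, n_dense=1.5, theta_deg=45.0)
print(f"decay length: {length:.2f} wavelengths")
for gap in (0.25, 1.0, 3.0):
    print(f"gap = {gap:4.2f} wavelengths, relative amplitude = "
          f"{relative_amplitude(gap, length):.4f}")
```

With these numbers the decay length is a bit under half a wavelength, which illustrates why the gap must be comparable to or smaller than the wavelength for the second piece of glass to pick up a noticeable signal.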
Chapter 17
PARTICLES AND
RELATIVITY
How often have I said to you that when you have eliminated the
impossible, whatever remains, however improbable, must be the
truth?
Sherlock Holmes
In the preceding chapters of this book we constructed a dressed parti-

cle version of quantum electrodynamics which we called relativistic quantum
dynamics or RQD. One important property of RQD was that this theory
reproduces exactly the S-matrix of the standard renormalized quantum elec-
trodynamics. Therefore, RQD can describe existing experiments (e.g., scat-
tering cross-sections, bound state energies, and lifetimes) just as well as QED.
However, RQD is fundamentally different from QED. The main ingredients of
RQD are particles (not fields) that interact with each other via instantaneous
potentials. The usual attitude toward such a theory is that it cannot be math-
ematically and physically consistent [Str04, HC01, Wal01, Wil99, Hob12].
In this chapter we are going to delve into more philosophical aspects of
RQD. In particular, we will try to reassure the reader that this particle-based
theory with instantaneous interactions still obeys the principles of relativity
and causality. We will also argue that our theory provides a more accu-
rate representation of relativistic phenomena than the traditional Minkowski
space-time model. One type of objections against particle-based theories

is related to the alleged incompatibility between the existence of localized
particle states and principles of relativity and causality. We will analyze
these objections in section 17.1 and demonstrate that there is no reason for
concern: the Newton-Wigner position operator and sharply localized particle
states do not contradict any fundamental physical principle. In particular, we
will analyze the well-known paradox of superluminal spreading of localized
wave packets.
In section 17.2 we will define the notion of a localized physical event and
attempt to derive transformations of space-time coordinates of such events
between different reference frames. We will notice that spatial translations
and rotations induce kinematical transformations of observables, but transla-
tions in time are always dynamical (i.e., they depend on interactions). Then
boost transformations must be dynamical as well. This implies, first, that
interactions are governed by the instant form of dynamics and, second, that the
connections between space and time coordinates of events in different moving
reference frames are generally different from the Lorentz transformations of
special relativity.
In section 17.3 we will conclude that the Minkowski space-time picture is
not an accurate representation of the principle of relativity. We will also
dispel another misconception about the alleged incompatibility between in-
stantaneous action-at-a-distance and causality. We will see that in some
cases superluminal effects may not violate causality and may be physically
acceptable.
Section 17.4 is devoted to more philosophical speculations on the role of
quantum fields and their interpretation.
17.1 Localizability of particles

In section 4.3 we found that in relativistic quantum theory particle positions
are described by the Newton-Wigner operator. However, this idea is often
regarded as controversial. There are at least three arguments that are usually
cited to “explain” why there can be no position operator and localized states
in relativistic quantum theory, in particular, in QFT:
• Single particle localization is impossible, because it requires an unlimited
amount of energy (due to Heisenberg’s uncertainty relation)
and leads to creation of extra particles [BLP01]:
In quantum field theory, where the particle propagators do

not allow acausal effects, it is impossible to define a posi-
tion operator, whose measurement will leave the particle in a
sharply defined spot, even though the interaction between the
fields is local. The argument is always that, to localize the
electric charge on a particle with an accuracy better than the
Compton wavelength of the electron, so much energy should
be put in, that electron-positron pairs would be formed. This
would make the concept of position meaningless. Th. W.
Ruijgrok [Rui98]
• Newton-Wigner particle localization is relative, i.e., different moving

observers may disagree on whether the particle is localized or not.
• Perfectly localized wave packets spread out with superluminal speeds,
which contradicts the principle of causality [Heg98]:
The ’elementary particles’ of particle physics are generally

understood as pointlike objects, which would seem to imply
the existence of position operators for such particles. How-
ever, if we add the requirement that such operators are co-
variant (so that, for instance, a particle localized at the ori-
gin in one Lorentz frame remains so localized in another),
or the requirement that the wave-functions of the particles do
not spread out faster than light, then it can be shown that
no such position operator exists. (See Halvorson and Clifton
(2001) [HC01] and references therein, for details.) D. Wal-
lace [Wal01]
In the present section we are going to show that relativistic localized states of
particles have a well-defined and non-controversial meaning in spite of these
arguments.
17.1.1 Measurements of position

Let us first consider the idea that precise measurements of position disturb
the number of particles in the system.
It is true that due to Heisenberg’s uncertainty relation (6.88), sharply
localized 1-particle states do not have well-defined momentum and energy.
For a sufficiently localized state, the energy uncertainty can be made greater
than the energy required to create a particle-antiparticle pair. However, large
uncertainty in energy does not immediately imply any uncertainty in the
number of particles, and sharp localization does not necessarily require pair
creation. The number of particles in a localized state would be uncertain
if the particle number operator did not commute with position operators
of particles. However, this is not true. One can easily demonstrate that
Newton-Wigner particle position operators do commute with particle number
operators. This follows directly from the definition of particle observables in
the Fock space.1 By their construction, all 1-particle observables (position,
momentum, spin, etc.) commute with projections on n-particle sectors in the
Fock space. Therefore these 1-particle observables commute with particle
number operators. So, one can measure position of any particle without
disturbing the number of particles in the system. This conclusion is valid
for both non-interacting and interacting particle systems, because the Fock
space structure and definitions of one-particle observables do not depend on
interaction.
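The algebraic core of this argument can be illustrated with a finite-dimensional toy model. The sketch below is not the Fock space of the theory: it is a hypothetical 5-dimensional space with one vacuum state, two 1-particle states, and two 2-particle states, and the matrices are arbitrary examples of ours. It shows that an operator acting separately inside each n-particle sector commutes with the particle number operator, while an operator connecting different sectors does not.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    """Return the commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def is_zero(m):
    """True if every matrix element vanishes."""
    return all(x == 0 for row in m for x in row)


# Toy basis: vacuum, two 1-particle states, two 2-particle states.
particle_counts = [0, 1, 1, 2, 2]
number_op = [[particle_counts[i] if i == j else 0 for j in range(5)]
             for i in range(5)]

# Hypothetical observable that acts separately inside each n-particle sector.
sector_preserving = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 2, -1, 0, 0],
    [0, 0, 0, 3, 1],
    [0, 0, 0, 1, 0],
]

# Hypothetical operator that also connects the 1- and 2-particle sectors.
sector_mixing = [row[:] for row in sector_preserving]
sector_mixing[1][3] = sector_mixing[3][1] = 1

print(is_zero(commutator(sector_preserving, number_op)))  # True
print(is_zero(commutator(sector_mixing, number_op)))      # False
```

The first result mirrors the statement in the text: observables that commute with the projections on n-particle sectors necessarily commute with the particle number operator.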
17.1.2 Localized states in a moving reference frame

In this subsection we will discuss the second objection against the use of
localized states in relativistic quantum theories, i.e., the non-invariance of
the particle localization.
The position-space wave function of a single massive spinless particle in
a state sharply localized in the origin is2
ψ(r) = δ(r) (17.1)

The corresponding momentum space wave function is (5.38)
$\psi(\mathbf{p}) = (2\pi\hbar)^{-3/2}$ (17.2)

Let us now find the wave function of this state from the point of view of a
moving observer O ′ . By applying a boost transformation to (17.2)3
1
2
This is a non-normalizable state that we called “improper” in section 5.2. Similar
arguments apply to normalized localized wave functions, like $\sqrt{\delta(\mathbf{r})}$.
3
see equation (5.30)
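$e^{-\frac{ic}{\hbar}\hat{K}_x\theta}\,\psi(\mathbf{p}) = (2\pi\hbar)^{-3/2}\sqrt{\frac{\omega_{\mathbf{p}}\cosh\theta - cp_x\sinh\theta}{\omega_{\mathbf{p}}}}$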
and transforming back to the position representation via (5.43) we obtain
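$e^{-\frac{ic}{\hbar}\hat{K}_x\theta}\,\psi(\mathbf{r}) = (2\pi\hbar)^{-3}\int d\mathbf{p}\,\sqrt{\cosh\theta - (cp_x/\omega_{\mathbf{p}})\sinh\theta}\;e^{\frac{i}{\hbar}\mathbf{p}\mathbf{r}}$ (17.3)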
We are not going to calculate this integral explicitly, but one property of the
function (17.3) must be clear: for non-zero θ this function is non-vanishing
for all values of r.4 Therefore, the moving observer O ′ would not agree with
O that the particle is localized. Observer O ′ can find the particle anywhere
in space. This means that the notion of localization is relative: a state
which looks localized to the observer O does not look localized to the moving
observer O ′ .
The non-invariant nature of localization is a property not familiar in clas-
sical physics. Although this property has not been observed in experiments
yet, it does not contradict any postulates of relativistic quantum theory and
does not constitute a sufficient reason to reject the notion of localizability.
17.1.3 Spreading of well-localized states

Here we are going to discuss the widespread opinion that superluminal
spreading of particle wave functions violates the principle of causality [HC01,
Wal01, Mal96, Heg98, BY94].
In the preceding subsection we found how a localized state (17.1) looks
from the point of view of a moving observer. Now, let us find the appearance
of this state from the point of view of an observer displaced in time. Again, we
first make a detour to the momentum space (17.2), apply the time translation
operator
$\psi(\mathbf{p}, t) = e^{-\frac{i}{\hbar}\hat{H}t}\,\psi(\mathbf{p}, 0) = (2\pi\hbar)^{-3/2}\,e^{-\frac{it}{\hbar}\sqrt{m^2c^4 + p^2c^2}}$
4
This property follows from the non-analyticity of the square root in the integrand
[Str04].
Figure 17.1: Spreading of the probability distribution of a localized wave

function. Full line: at time t = 0; dashed line: at time t > 0 (the distance
between points A and B is greater than ct).
and then use equation (5.43) to find the position-space wave function at
non-zero t
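$\psi(\mathbf{r}, t) = (2\pi\hbar)^{-3/2}\int d\mathbf{p}\,\psi(\mathbf{p}, t)\,e^{\frac{i}{\hbar}\mathbf{p}\mathbf{r}} = (2\pi\hbar)^{-3}\int d\mathbf{p}\,e^{-\frac{it}{\hbar}\sqrt{m^2c^4 + p^2c^2}}\,e^{\frac{i}{\hbar}\mathbf{p}\mathbf{r}}$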
This integral can be calculated analytically [Rui]. However, for us the most
important result is that the wave function is non-zero at distances larger
than ct from the initial point A (r > ct), i.e., outside the “light cone”.5
The corresponding probability density |ψ(r, t)|2 is shown schematically by
the dashed line in Fig. 17.1. Although the probability density outside the
light cone is very small, there is still a non-zero chance that the particle
propagates faster than the speed of light.
Note that superluminal propagation of the particle’s wave function in the
position space does not mean that particle’s speed is greater than c. As we
have established in subsection 5.1.2, for a free massive particle, eigenvalues
of the quantum-mechanical operator of speed are less than c. So, the possi-
bility of wave functions propagating faster than c is a purely quantum effect
associated with the non-commutativity of operators R and V.
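The bound on the particle's speed can be made explicit. Assuming the usual free-particle relation between velocity, momentum, and energy (which we take to be the content of subsection 5.1.2), the speed in a state with momentum p is

```latex
\mathbf{v} = \frac{c^2 \mathbf{p}}{\omega_{\mathbf{p}}}, \qquad
\omega_{\mathbf{p}} = \sqrt{m^2c^4 + p^2c^2}, \qquad
|\mathbf{v}| = \frac{c^2 p}{\sqrt{m^2c^4 + p^2c^2}}
             = \frac{c}{\sqrt{1 + m^2c^2/p^2}} < c \quad (m > 0)
```

so the speed approaches c only in the limit of infinite momentum, whatever the shape of the wave packet in the position space.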
5
This fact can be justified by the same analyticity argument as in footnote on page
549. See also [WWS+ 12] and section 2.1 in [PS95b].
Figure 17.2: Space-time diagram demonstrating the alleged causality para-
dox associated with superluminal spreading of wave functions. Observers
O and O ′ have coordinate systems with space-time axes (x, ct) and (x′ , ct′ ),
respectively. Observers O and O ′ send superluminal signals to each other by
opening boxes with localized quantum particles. See text for more details.
17.1.4 Superluminal spreading and causality

The superluminal spreading of localized wave packets described in the preceding
subsection holds under very general assumptions in relativistic quantum
theory [Heg98]. It is usually regarded as a sign of serious trouble
[HC01, Wal01, Mal96, BY94, Zub00], because the superluminal propagation
of any signal is strictly forbidden in special relativity.6 This contradiction is
often claimed to be the major obstacle for the particle interpretation of
relativistic quantum theories. Since particle interpretation is the major aspect
of our approach, we definitely need to resolve this controversy. This is what
we are going to do in this subsection.
Let us first describe the reason why the superluminal spreading of wave
functions is claimed to be unacceptable in the traditional approach. One idea
is that this phenomenon can be used to build a device which would violate
the principle of causality, as discussed in Appendix I.5. In that discussion we
have not specified the mechanism by which the instantaneous signals were
sent between observers O and O ′ . Let us now assume that these signals are
transmitted by spreading quantum wave packets. More specifically, suppose
6
see Appendix I.5
that the signaling device used by the observer O is simply a small impene-
trable box containing massive spinless quantum particles. Before time t = 0
(point A in Fig. 17.2) the box is tightly closed, so that wave functions of the
particles are well-localized inside it. The walls of the closed box at t < 0 are
shown by two thick vertical parallel lines on the space-time diagram 17.2.
At time t = 0 observer O sends a signal to the moving observer O ′ by open-
ing the box. The wave function of spreading particles at t > 0 is shown
schematically in Fig. 17.2 by thin dashed lines parallel to the x-axis. Due
to the superluminal spreading of the wave function, there is, indeed, a non-
zero probability of finding particles at the location of the moving observer
O ′ (point B) immediately after the box was opened.
The observer O ′ has a similar closed box with particles. Upon receiving
the signal from O (point B) she opens her box. It is clear that the wave
packet of her particles ψ ′ (r′ , t′ ) spreads instantaneously in her own reference
frame. The question is how this spreading will be perceived by the stationary
observer O? The traditional answer is that the wave function ψ(r, t) from the
point of view of O should be obtained by applying Lorentz transformations
(I.16) - (I.19) to the arguments of ψ ′
$$
\psi(x, y, z, t) = \psi'(x \cosh\theta + ct \sinh\theta,\; y,\; z,\; t \cosh\theta + (x/c) \sinh\theta) = \psi'(\Lambda \tilde{x}) \qquad (17.4)
$$
This wave function is shown schematically in Fig. 17.2 by inclined parallel
thin dashed lines. Then we see that there is a non-zero probability of finding
particles emitted by O′ at point C. This means that the response signal sent
by O′ arrives at O earlier than the initial signal O → O′ was sent (at point
A). This is clearly a violation of causality.
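The timing behind this alleged paradox can be illustrated with a small numerical sketch. It assumes a simplified geometry that is not spelled out in the text: observer O sits at x = 0, event B lies at t = 0, x = x_b > 0, observer O′ moves in the positive x direction with speed v, and the reply spreads along a line of constant t′, as formula (17.4) would suggest. The function name `reply_arrival_time` and the units with c = 1 are illustrative choices only.

```python
import math

def reply_arrival_time(x_b, v, c=1.0):
    """Time in frame O at which a reply that is instantaneous in frame O'
    reaches the worldline x = 0 of observer O.

    Assumption (hypothetical setup): the reply is emitted at event
    B = (t = 0, x = x_b) and propagates along a line of constant t',
    where t' = t cosh(theta) - (x/c) sinh(theta).
    """
    theta = math.atanh(v / c)                      # rapidity of O'
    t_prime_b = -(x_b / c) * math.sinh(theta)      # t' coordinate of event B
    # On the worldline x = 0 we have t' = t cosh(theta); equating the two
    # values of t' gives the arrival time in frame O.
    return t_prime_b / math.cosh(theta)

if __name__ == "__main__":
    # The initial signal was sent at t = 0; a negative result means that the
    # reply would arrive before the initial signal was sent.
    print(reply_arrival_time(x_b=10.0, v=0.5))
```

Under these assumptions the arrival time equals −x_b v/c², which is negative for any v > 0. This is the acausal behavior attributed to formula (17.4) in the traditional argument.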
Actually, the time evolution of the wave packet (17.4) looks totally absurd
from the point of view of O. The particles do not look as if they were emitted
from B at all. In fact, the wave function approaches observer O (point C) from
the opposite side (from the side of negative x) and moves in the positive x
direction. So, one cannot even talk about the “signal” being sent from O′ to
O!
What is wrong with this picture? The traditional answer is that this weird
behavior is the consequence of the superluminal propagation of the wave
function. The usual conclusion is that sanity can be restored by forbidding
such superluminal effects. However, this would go against the entire theory
developed in this book. Could there be a different answer?
We would like to suggest the following explanation: Apparently, the crucial
step in the above derivation is the use of the wave function transformation
law (17.4). However, there are serious reasons to doubt that this
formula is applicable even approximately. First, here we are dealing with
a system (particles confined in a box) where interactions play a significant
role. In such a system the boost operator is interaction-dependent, and boost
transformations of wave functions should depend on the details of interaction
potentials.7 Therefore, it is obvious that the wave function transformation
cannot be described by the universal interaction-independent formula (17.4).
Moreover, our system is not isolated. It is described by a time-dependent
Hamiltonian (the box is opened at some point in time). This makes Poincaré
group arguments inapplicable and further complicates the analysis of boost
transformations. Even if we assume that interaction-dependence of boost
transformations can be neglected in some approximation, there is still no
justification for equation (17.4). This formula cannot be used to transform
even wave functions of free particles. For example, this formula contradicts
the transformation law (17.3) derived earlier for localized particle states. So,
there is absolutely no evidence that the wave function of particles emitted
by observer O′ will behave as shown in Fig. 17.2 from the point of view of
O. In particular, there is no evidence that the signal sent by O′ arrives at O
at point C in violation of the causality law.
It is plausible that using the correct boost transformation law one would
obtain that the wave function of particles released by O ′ at point B propa-
gates superluminally in the reference frame O as well.8 This conclusion can
be supported by the following argument. From the point of view of observer
O, the particle emitted at point B (t = 0) is in a localized state with definite
position. Such states do not have any definite velocity (or momentum), so
their free9 time evolution is determined only by the value of the position
characteristic for the initial state. Therefore particles emitted by boxes at
rest (e.g., the box A) and by moving boxes (e.g., the box B) are described
by essentially the same time-dependent wave functions at t > 0, the only
difference being a relative shift along the x-axis. Then instead of the acausal
7
This dependence will be discussed in the next section in greater detail.
8
this means that thin dashed lines around point B in Fig. 17.2 should be drawn parallel
to the axis x
9
At times t > 0 the box B remains opened, and the particle’s evolution is described by
the non-interacting Hamiltonian H0 .
response signal B → C 10 one would have a signal B → A, which, in spite of
being instantaneous, does not violate the principle of causality.
17.2 Inertial transformations in multiparticle systems
One of the goals of physics declared in the Introduction11 is finding
transformations of observables between different inertial reference frames. In
chapter 4 and in subsection 6.2.3 we discussed inertial transformations of total
observables in a multiparticle system and we found that these transformations
have universal forms, which do not depend on the system’s composition
and interactions acting there. In this section we will be interested in
establishing inertial transformations for observables of individual particles within
an interacting multiparticle system. Our goal is to compare these predictions
of RQD with Lorentz transformations for time and position of events
in special relativity
$$
\begin{aligned}
t' &= t \cosh\theta - (x/c) \sinh\theta && (17.5) \\
x' &= x \cosh\theta - ct \sinh\theta && (17.6) \\
y' &= y && (17.7) \\
z' &= z && (17.8)
\end{aligned}
$$
Here we will reach a surprising conclusion that formulas of special relativity
may not be accurate.
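For reference, the Lorentz formulas (17.5) - (17.8) can be written as a short routine. This is only a sketch: the function names `lorentz_boost` and `interval` and the sample event are our own illustrative choices. The routine checks two standard properties, namely the invariance of the interval (ct)² − x² − y² − z² and the fact that the worldline x = vt with v = c tanh θ is mapped to x′ = 0.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_boost(t, x, y, z, theta, c=C):
    """Apply (17.5)-(17.8): boost along the x-axis with rapidity theta."""
    t_prime = t * math.cosh(theta) - (x / c) * math.sinh(theta)
    x_prime = x * math.cosh(theta) - c * t * math.sinh(theta)
    return t_prime, x_prime, y, z

def interval(t, x, y, z, c=C):
    """Space-time interval (ct)^2 - x^2 - y^2 - z^2 of an event."""
    return (c * t) ** 2 - x ** 2 - y ** 2 - z ** 2

theta = math.atanh(0.6)                # rapidity for v = 0.6 c
event = (1.0e-6, 150.0, 2.0, -3.0)     # sample event (t in s, x, y, z in m)
boosted = lorentz_boost(*event, theta)

if __name__ == "__main__":
    print("boosted event:", boosted)
    print("interval before and after:", interval(*event), interval(*boosted))
```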
17.2.1 Events and observables

One of the most fundamental concepts in physics is the concept of an event.
Generally, an event is some physical process or phenomenon occurring in a small
volume of space in a short interval of time. So, each event can be
characterized by four numbers: its time t and its position r = (x, y, z). These
numbers are referred to as space-time coordinates (t, r) of the event. For the
event to be observable, there should be some material particles present at
10
as predicted incorrectly in the traditional approach based on equation (17.4)
11
see page xxxv
time t at the point r. The simplest example of an event is an intersection of
trajectories (a collision) of two particles. We will define t as the reading of
the clock belonging to the observer witnessing the particles’ collision and r
as the (expectation) value of the positions of particles present in the event’s
volume.
In this section we would like to derive the relationship between an event’s
space-time coordinates (t, r) measured in the reference frame at rest O and
space-time coordinates (t′, r′) measured in the moving reference frame O′.
Since we just identified the event’s position with the expectation value of
particles’ position operators, finding boost transformations r → r′ is just an
exercise in straightforward application of the general rule for transformations
of operators of observables between different reference frames.12 By
following this rule we should be able to derive analogs of Lorentz
transformations (17.5) - (17.8) without artificial assumptions from Appendix I, so we
should be able to tell whether Lorentz transformation formulas are exact or
approximate. This is the plan of our presentation in this section.
For simplicity, here we will consider a system of two massive spinless
particles described in the Hilbert space H = H1 ⊗ H2 , where one-particle
observables (position, momentum, velocity, angular momentum, spin, en-
ergy,...) are denoted by lowercase letters:
r1 , p1 , v1 , j1 , s1 , h1 , . . . (17.9)
r2 , p2 , v2 , j2 , s2 , h2 , . . . (17.10)
Transformations of these observables between reference frames O and O ′
should be found by the general rule outlined in subsection 3.2.4. Suppose
that observers O and O ′ are related by an inertial transformation, which is
generated by the Hermitian operator F and parameter b. If g is an observable
(a Hermitian operator) of one particle in the reference frame O, and g(b) is
the same observable in the reference frame O ′ then we use equations (3.62)
and (E.13) to obtain
$$
g(b) = e^{-\frac{i}{\hbar}Fb}\, g\, e^{\frac{i}{\hbar}Fb} = g - \frac{ib}{\hbar}[F, g] - \frac{b^2}{2!\hbar^2}[F, [F, g]] + \ldots \qquad (17.11)
$$
Application of this formula to event’s position is not straightforward, because
particle localization does not have absolute meaning in quantum mechanics.
12
see subsections 4.3.8 and 5.2.4
If observer O registers a localized event (or localized particles constituting
this event), then other observers may disagree that the event is localized
or that it has occurred at all. Examples of such behavior are common
in quantum mechanics. Some of them were discussed in subsections 6.5.3,
17.1.2, and 17.1.3. Thus we are going to apply boost transformations only
to expectation values of positions. In other words, in the rest of this chapter
we will work in the classical limit, where the spreading wave packet can be
ignored, and particle trajectories can be unambiguously defined. Then we will
interpret (17.9) and (17.10) as numerical (expectation) values of observables
in quasiclassical states, and instead of quantum operator equation (17.11)
with commutators we will use its classical analog involving Poisson brackets
(6.95)
$$
g(b) \approx g + b[F, g]_P + \frac{b^2}{2!}[F, [F, g]_P]_P + \ldots
$$
In order to perform calculations with this formula one needs two major things.
First, one needs to know expressions for Poincaré generators F in terms
of one-particle observables (17.9) - (17.10). This is equivalent to having a
full dynamical description of the system. Such a description can be easily
obtained in the case of a non-interacting particle system. However, for in-
teracting particles this is a rather non-trivial problem that can be solved
only approximately. Second, one needs to know Poisson brackets between all
one-particle observables (17.9) - (17.10). This is an easier task, which has
been accomplished already in chapters 4, 5, and in section 6.1. In particular,
we found there that observables of different particles always have vanishing
Poisson brackets. The Poisson brackets for observables referring to the same
particle are (i, j, k = 1, 2, 3)
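$$
\begin{aligned}
[r_i, r_j]_P &= [p_i, p_j]_P = [r_i, s_j]_P = [p_i, s_j]_P = 0 && (17.12) \\
[r_i, p_j]_P &= \delta_{ij} && (17.13) \\
[s_i, s_j]_P &= \sum_{k=1}^{3} \epsilon_{ijk} s_k && (17.14) \\
[\mathbf{p}, h]_P &= [\mathbf{s}, h]_P = 0 && (17.15) \\
[\mathbf{r}, h]_P &= \frac{\mathbf{p}c^2}{h} && (17.16)
\end{aligned}
$$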
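Some of these brackets are easy to check numerically. The sketch below is restricted to one spatial dimension and a spinless particle, and it assumes the standard canonical definition [f, g]_P = (∂f/∂r)(∂g/∂p) − (∂f/∂p)(∂g/∂r), which is the one compatible with (17.13). The helper `poisson_bracket`, the finite-difference step, and the units m = c = 1 are illustrative assumptions.

```python
import math

def poisson_bracket(f, g, r, p, eps=1e-6):
    """[f, g]_P = df/dr * dg/dp - df/dp * dg/dr in one dimension,
    with derivatives estimated by central finite differences."""
    df_dr = (f(r + eps, p) - f(r - eps, p)) / (2 * eps)
    df_dp = (f(r, p + eps) - f(r, p - eps)) / (2 * eps)
    dg_dr = (g(r + eps, p) - g(r - eps, p)) / (2 * eps)
    dg_dp = (g(r, p + eps) - g(r, p - eps)) / (2 * eps)
    return df_dr * dg_dp - df_dp * dg_dr

M, C = 1.0, 1.0  # mass and speed of light in illustrative units

def position(r, p):
    return r

def momentum(r, p):
    return p

def energy(r, p):
    """Free one-particle energy h = sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt(M**2 * C**4 + p**2 * C**2)

r0, p0 = 0.3, 0.75  # sample phase-space point

if __name__ == "__main__":
    print("[r, p]_P =", poisson_bracket(position, momentum, r0, p0))  # compare with (17.13)
    print("[p, h]_P =", poisson_bracket(momentum, energy, r0, p0))    # compare with (17.15)
    print("[r, h]_P =", poisson_bracket(position, energy, r0, p0))    # compare with (17.16)
    print("p c^2/h  =", p0 * C**2 / energy(r0, p0))
```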
17.2.2 Non-interacting particles

First we assume that the two particles 1 and 2 are non-interacting, so that
generators of inertial transformations in the Hilbert space H are
H0 = h1 + h2 (17.17)
P0 = p1 + p2 (17.18)
J0 = j1 + j2 (17.19)
K0 = k1 + k2 (17.20)
The trajectory of the particle 1 in the reference frame O is obtained from the
usual formula (4.56)
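$$
\begin{aligned}
\mathbf{r}_1(t) &= e^{\frac{i}{\hbar}H_0 t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar}H_0 t} = e^{\frac{i}{\hbar}(h_1+h_2)t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar}(h_1+h_2)t} = e^{\frac{i}{\hbar}h_1 t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar}h_1 t} \\
&\approx \mathbf{r}_1 + t[h_1, \mathbf{r}_1]_P + \frac{t^2}{2!}[h_1, [h_1, \mathbf{r}_1]_P]_P + \ldots = \mathbf{r}_1 + \mathbf{v}_1 t \qquad (17.21)
\end{aligned}
$$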
Applying boost transformations to (17.21) and taking into account (4.6) -
(4.8), (4.58) - (4.60), and (4.62) we find the trajectory of this particle in the
reference frame O ′ moving with the speed v = c tanh θ along the x-axis13
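$$
\begin{aligned}
r_{1x}(\theta, t') &= \beta\left(\frac{r_{1x}}{\cosh\theta} + (v_{1x} - v)t'\right) && (17.22) \\
r_{1y}(\theta, t') &= \beta\left(r_{1y} + \frac{j_{1z}v}{h_1} + \frac{v_{1y}t'}{\cosh\theta}\right) \\
&= r_{1y} + \beta\left(\frac{r_{1x}v_{1y}v}{c^2} + \frac{v_{1y}t'}{\cosh\theta}\right) && (17.23) \\
r_{1z}(\theta, t') &= \beta\left(r_{1z} - \frac{j_{1y}v}{h_1} + \frac{v_{1z}t'}{\cosh\theta}\right) \\
&= r_{1z} + \beta\left(\frac{r_{1x}v_{1z}v}{c^2} + \frac{v_{1z}t'}{\cosh\theta}\right) && (17.24)
\end{aligned}
$$
where we denoted $\beta \equiv (1 - v_{1x}vc^{-2})^{-1}$. Similar formulas are valid for the
particle 2.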
13
If we set t′ = 0 then these formulas coincide with (23) - (24) in ref. [MM97]. By
setting also v1 = 0 we obtain the usual Lorentz length contraction formulas r1x (θ, 0) =
r1x /(cosh θ), r1y (θ, 0) = r1y , r1z (θ, 0) = r1z . Compare with equation (I.20).
The important feature of these formulas is that inertial transformations
for particle observables are completely independent of the presence of other
particles in the system, e.g. formulas for r1 (θ, t′ ) do not depend on
observables of the particle 2. This is hardly surprising, since the two particles were
assumed to be non-interacting.
17.2.3 Lorentz transformations for non-interacting particles
Now, let us consider a localized event associated with the intersection of
particle trajectories. Suppose that from the point of view of the observer O
this event has space-time coordinates (t, r). This means that
x ≡ r1x (t) = r2x (t)

y ≡ r1y (t) = r2y (t)
z ≡ r1z (t) = r2z (t)
Apparently, these two trajectories intersect from the point of view of the
moving observer O ′ as well. So O ′ also sees the event. Now, the question is:
what are the space-time coordinates of the event seen by O ′ ? The answer to
this question is given by the following theorem.
Theorem 17.1 (Lorentz transformations for time and position) For
events defined as intersections of trajectories of non-interacting particles, the
Lorentz transformations for time and position (17.5) - (17.8) are exactly
valid.
Proof. Let us first prove that Lorentz formulas (17.5) - (17.8) are correct
transformations for the trajectory of the particle 1 between reference frames
O and O ′ . For simplicity, we will consider only the case in which this particle
is moving along the x-axis: r1y (t) = r1z (t) = v1y = v1z = 0. (More general
situations can be analyzed similarly.) Then we can ignore the y- and z-
coordinates in our proof. So, we need to prove that14
14
Here the left hand side is the Newton-Wigner position of the particle 1 seen from the
reference frame O′ at time t′ . This is formula (17.22). The right hand side is Lorentz-
transformed position as in equation (17.6).
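$$
\begin{aligned}
r_{1x}(\theta, t') &= r_{1x}(0, t) \cosh\theta - ct \sinh\theta \\
&= (r_{1x} + v_{1x}t) \cosh\theta - ct \sinh\theta \qquad (17.25)
\end{aligned}
$$
where
$$
t' = t \cosh\theta - \frac{r_{1x}(t)}{c} \sinh\theta \qquad (17.26)
$$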
To do that, we calculate the difference between the right hand sides of equa-
tions (17.22) and (17.25) with t′ taken from (17.26) and using v = c tanh θ
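$$
\begin{aligned}
&\frac{\beta r_{1x}}{\cosh\theta} + (v_{1x} - v)\beta\left(t\cosh\theta - c^{-1}(r_{1x} + v_{1x}t)\sinh\theta\right) - (r_{1x} + v_{1x}t)\cosh\theta + ct\sinh\theta \\
&= \frac{\beta}{\cosh\theta}\Big[ r_{1x} + v_{1x}t\cosh^2\theta - vt\cosh^2\theta - (v_{1x}r_{1x}/c)\sinh\theta\cosh\theta \\
&\qquad + (vr_{1x}/c)\sinh\theta\cosh\theta - (v_{1x}^2/c)t\sinh\theta\cosh\theta + (vv_{1x}/c)t\sinh\theta\cosh\theta \\
&\qquad - r_{1x}\cosh^2\theta + r_{1x}(v_{1x}v/c^2)\cosh^2\theta - v_{1x}t\cosh^2\theta + (v_{1x}^2v/c^2)t\cosh^2\theta + ct\sinh\theta\cosh\theta \\
&\qquad - (v_{1x}v/c)t\sinh\theta\cosh\theta \Big] \\
&= \frac{\beta}{\cosh\theta}\Big[ r_{1x} - vt\cosh^2\theta - (v_{1x}r_{1x}/c)\sinh\theta\cosh\theta \\
&\qquad + (vr_{1x}/c)\sinh\theta\cosh\theta - (v_{1x}^2/c)t\sinh\theta\cosh\theta \\
&\qquad - r_{1x}\cosh^2\theta + r_{1x}(v_{1x}v/c^2)\cosh^2\theta + (v_{1x}^2v/c^2)t\cosh^2\theta + ct\sinh\theta\cosh\theta \Big] \\
&= \frac{\beta}{\cosh\theta}\Big[ r_{1x} - ct\sinh\theta\cosh\theta - (v_{1x}r_{1x}/c)\sinh\theta\cosh\theta \\
&\qquad + r_{1x}\sinh^2\theta - (v_{1x}^2/c)t\sinh\theta\cosh\theta \\
&\qquad - r_{1x}\cosh^2\theta + (r_{1x}v_{1x}/c)\sinh\theta\cosh\theta + (v_{1x}^2/c)t\sinh\theta\cosh\theta + ct\sinh\theta\cosh\theta \Big] \\
&= 0
\end{aligned}
$$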
This proves equation (17.25): the boost-transformed trajectory (17.22) of the
particle 1 is consistent with Lorentz formulas (17.5) and (17.6). The same is true
for the particle 2. This implies that times and positions of intersections of the
two trajectories also undergo Lorentz transformations (17.5) - (17.8) when
the reference frame is boosted.
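The algebra of this proof can also be spot-checked numerically. The sketch below evaluates the trajectory (17.22) and the Lorentz-transformed position (17.25) at the time t′ given by (17.26), for a particle moving along the x-axis. The function names, the sample values, and the units with c = 1 are illustrative assumptions.

```python
import math

def boosted_position_rqd(r1x, v1x, theta, t_prime, c=1.0):
    """Equation (17.22): x-position of a free particle seen from O' at time t'."""
    v = c * math.tanh(theta)
    beta = 1.0 / (1.0 - v1x * v / c**2)
    return beta * (r1x / math.cosh(theta) + (v1x - v) * t_prime)

def boosted_position_lorentz(r1x, v1x, theta, t, c=1.0):
    """Equations (17.25)-(17.26): Lorentz transformation of the event
    (t, r1x + v1x t) lying on the particle's trajectory in frame O."""
    x = r1x + v1x * t
    t_prime = t * math.cosh(theta) - (x / c) * math.sinh(theta)
    x_prime = x * math.cosh(theta) - c * t * math.sinh(theta)
    return t_prime, x_prime

r1x, v1x, theta = 2.0, 0.3, 0.8  # sample initial position, velocity, rapidity

if __name__ == "__main__":
    for t in (-1.5, 0.0, 0.7, 4.0):
        t_prime, x_lorentz = boosted_position_lorentz(r1x, v1x, theta, t)
        x_rqd = boosted_position_rqd(r1x, v1x, theta, t_prime)
        # The difference should vanish up to floating-point rounding.
        print(t, x_lorentz - x_rqd)
```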
17.2.4 Interacting particles

This time we will assume that the two-particle system is interacting. This
means that the unitary representation Ug of the Poincaré group in H is
different from the non-interacting representation Ug0 with generators (17.17)
- (17.20). Generally, we can write generators of Ug as15
H = h1 + h2 + V (r1 , p1 , r2 , p2 ) (17.27)
P = p1 + p2 + U(r1 , p1 , r2 , p2 ) (17.28)
J = j1 + j2 + Y(r1 , p1 , r2 , p2 ) (17.29)
K = k1 + k2 + Z(r1 , p1 , r2 , p2 ) (17.30)
where V, U, Y, and Z are interaction operators that are functions of
one-particle observables. One goal of this section is to find out more about the
interaction terms V, U, Y, and Z, e.g., to see if some of these terms can be
set to zero. In other words, we would like to understand if one can find
observational evidence about the relativistic form of dynamics in nature.
17.2.5 Time translations in interacting systems

The most obvious effect of interaction is modification of the time evolution of
the system as compared to the non-interacting time evolution. We estimate
the strength of interaction between particles by how much their trajectories
deviate from the uniform straight-line movement (17.21). Therefore in any
realistic form of dynamics, the Hamiltonian - the generator of time transla-
tions - should contain a non-vanishing interaction V , and we can discard as
unphysical any form of dynamics in which V = 0. Then the time evolution
of the position of particle 1 is
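$$
\begin{aligned}
\mathbf{r}_1(t) &= e^{\frac{i}{\hbar}Ht}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar}Ht} = e^{\frac{i}{\hbar}(h_1+h_2+V)t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar}(h_1+h_2+V)t} \\
&= \mathbf{r}_1 + t[h_1 + V, \mathbf{r}_1]_P + \frac{t^2}{2}[(h_1 + h_2 + V), [h_1 + V, \mathbf{r}_1]_P]_P + \ldots \\
&= \mathbf{r}_1 + \mathbf{v}_1 t + t[V, \mathbf{r}_1]_P + \frac{t^2}{2}[V, \mathbf{v}_1]_P + \frac{t^2}{2}[(h_1 + h_2), [V, \mathbf{r}_1]_P]_P \\
&\quad + \frac{t^2}{2}[V, [V, \mathbf{r}_1]_P]_P + \ldots \qquad (17.31)
\end{aligned}
$$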
15
see equations (6.14) - (6.17)
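In the simplest case when interaction V commutes with particle positions and
in the non-relativistic approximation v1 ≈ p1 /m1 , this formula simplifies to
$$
\mathbf{r}_1(t) \approx \mathbf{r}_1 + \mathbf{v}_1 t - \frac{t^2}{2m_1}\frac{\partial V}{\partial \mathbf{r}_1} + \ldots = \mathbf{r}_1 + \mathbf{v}_1 t + \frac{\mathbf{f}_1 t^2}{2m_1} + \ldots = \mathbf{r}_1 + \mathbf{v}_1 t + \frac{\mathbf{a}_1 t^2}{2} + \ldots
$$
where we denoted
$$
\mathbf{f}_1(\mathbf{r}_1, \mathbf{p}_1, \mathbf{r}_2, \mathbf{p}_2) \equiv -\frac{\partial V(\mathbf{r}_1, \mathbf{p}_1, \mathbf{r}_2, \mathbf{p}_2)}{\partial \mathbf{r}_1}
$$
the force with which particle 2 acts on the particle 1. The vector a1 ≡ f1 /m1
can be interpreted as acceleration of the particle 1 in agreement with
Newton’s second law of mechanics. The trajectory r1 (t) of the particle 1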
depends in a non-trivial way on the trajectory r2 (t) of the particle 2 and
vice versa. Curved trajectories of particles 1 and 2 are definitely observable
in macroscopic experiments. However, this interacting time evolution, by
itself, cannot tell us which form of relativistic dynamics is responsible for the
interaction. Other types of inertial transformations should be examined in
order to make this determination.
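The non-relativistic second-order formula r1 + v1 t + f1 t²/(2m1) can be compared with a direct numerical integration of the two-particle motion. The sketch below is only an illustration: it works in one dimension and uses a hypothetical position-dependent potential V = k(r1 − r2)²/2 that is not taken from the text. The integrator (velocity Verlet), the function names, and all parameter values are our own choices.

```python
def force_on_1(r1, r2, k=1.0):
    """f1 = -dV/dr1 for the illustrative potential V = k (r1 - r2)^2 / 2."""
    return -k * (r1 - r2)

def integrate(r1, v1, r2, v2, m1, m2, t, steps=10_000):
    """Velocity-Verlet integration of the two-particle Newtonian motion.
    Returns the position of particle 1 at time t."""
    dt = t / steps
    f = force_on_1(r1, r2)
    for _ in range(steps):
        v1 += 0.5 * dt * f / m1
        v2 -= 0.5 * dt * f / m2      # force on particle 2 is -f1
        r1 += dt * v1
        r2 += dt * v2
        f = force_on_1(r1, r2)
        v1 += 0.5 * dt * f / m1
        v2 -= 0.5 * dt * f / m2
    return r1

def second_order(r1, v1, r2, m1, t):
    """Truncated expansion r1 + v1 t + f1 t^2 / (2 m1) with f1 taken at t = 0."""
    return r1 + v1 * t + force_on_1(r1, r2) * t**2 / (2 * m1)

if __name__ == "__main__":
    for t in (0.01, 0.1, 1.0):
        exact = integrate(0.0, 1.0, 1.0, 0.0, 1.0, 2.0, t)
        approx = second_order(0.0, 1.0, 1.0, 1.0, t)
        # The discrepancy grows with t, as expected for a truncated series.
        print(t, exact, approx, exact - approx)
```

For small t the two results agree up to terms of order t³, while for larger t the dependence of r1(t) on the motion of particle 2 becomes visible.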
As an example, in this section we will explain which experimental mea-
surements should be performed to tell apart two popular forms of dynamics:
the instant form
H = h1 + h2 + V (17.32)
P = p1 + p2 (17.33)
J = j1 + j2 (17.34)
K = k1 + k2 + Z (17.35)
and the point form
H = h1 + h2 + V (17.36)
P = p1 + p2 + U (17.37)
J = j1 + j2 (17.38)
K = k1 + k2 (17.39)
17.2.6 Boost transformations in interacting systems

Similar to the above analysis of time translations, we can examine boost
transformations. For interactions in the point form (17.36) - (17.39), the po-
tential boost Z is zero, so boost transformations of the position and velocity
are the same as in the non-interacting case16
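$$
\begin{aligned}
r_{1x}(\theta) &= e^{-\frac{ic}{\hbar}K_{0x}\theta}\, r_{1x}\, e^{\frac{ic}{\hbar}K_{0x}\theta} = e^{-\frac{ic}{\hbar}k_{1x}\theta}\, r_{1x}\, e^{\frac{ic}{\hbar}k_{1x}\theta} \approx r_{1x} - c\theta[k_{1x}, r_{1x}]_P + \ldots \\
&= \frac{r_{1x}}{\cosh\theta\,(1 - v_{1x}vc^{-2})} \qquad (17.40) \\
v_{1x}(\theta) &= e^{-\frac{ic}{\hbar}K_{0x}\theta}\, v_{1x}\, e^{\frac{ic}{\hbar}K_{0x}\theta} = e^{-\frac{ic}{\hbar}k_{1x}\theta}\, v_{1x}\, e^{\frac{ic}{\hbar}k_{1x}\theta} \approx v_{1x} - c\theta[k_{1x}, v_{1x}]_P + \ldots \\
&= \frac{v_{1x} - v}{1 - v_{1x}vc^{-2}} \qquad (17.41)
\end{aligned}
$$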
On the other hand, in the instant form, generators of boosts (17.35) are
dynamical, and transformation formulas are different. For example, the boost
transformation of position is
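$$
\begin{aligned}
r_{1x}(\theta) &= e^{-\frac{ic}{\hbar}K_x\theta}\, r_{1x}\, e^{\frac{ic}{\hbar}K_x\theta} \\
&= e^{-\frac{ic}{\hbar}(K_{0x}+Z_x)\theta}\, r_{1x}\, e^{\frac{ic}{\hbar}(K_{0x}+Z_x)\theta} \\
&\approx r_{1x} - c\theta[k_{1x}, r_{1x}]_P - c\theta[Z_x, r_{1x}]_P + \ldots \\
&= \frac{r_{1x}}{\cosh\theta\,(1 - v_{1x}vc^{-2})} - c\theta[Z_x, r_{1x}]_P + \ldots \qquad (17.42)
\end{aligned}
$$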
The first term on the right hand side is the same interaction-independent
term as in (17.40). This term is responsible for the well-known relativistic
effect of length contraction (I.20). The second term in (17.42) is a correction
due to interaction with the particle 2. This correction depends on observables
of both particles 1 and 2, and it makes boost transformations of positions
dependent in a non-trivial way on the state of the system and on interactions
acting there. So, in the instant form of dynamics, there is a strong analogy
between time translations and boosts of particle observables. They are both
interaction-dependent, i.e., dynamical.
16
For simplicity, we consider only x-components here. For the general case, see (4.6) -
(4.8) and (17.22) - (17.24). As usual, v ≡ c tanh θ.
In order to observe the dynamical effect of boosts described above, one
would need to use measuring devices moving with very high speeds
comparable to the speed of light. This presents enormous technical difficulties.
So, boost transformations of particle positions have not been directly
observed with an accuracy sufficient to detect the kinematical relativistic effect
(17.40), let alone the deviation [Zx , r1x ]P due to interactions.
Similarly, we can consider boost transformations of velocity in the instant
form of dynamics
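$$
\begin{aligned}
v_{1x}(\theta) &= e^{-\frac{ic}{\hbar}K_x\theta}\, v_{1x}\, e^{\frac{ic}{\hbar}K_x\theta} = e^{-\frac{ic}{\hbar}(K_{0x}+Z_x)\theta}\, v_{1x}\, e^{\frac{ic}{\hbar}(K_{0x}+Z_x)\theta} \\
&= v_{1x} - c\theta[k_{1x}, v_{1x}]_P - c\theta[Z_x, v_{1x}]_P + \ldots = \frac{v_{1x} - v}{1 - v_{1x}vc^{-2}} - c\theta[Z_x, v_{1x}]_P + \ldots \\
&= (v_{1x} - v) + \frac{v_{1x}v(v_{1x} - v)}{c^2} - c\theta[Z_x, v_{1x}]_P + \ldots
\end{aligned}
$$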
The terms on the right hand side have clear physical meaning: The first
term v1x − v is the usual non-relativistic change of velocity in the moving
reference frame. This is the most obvious effect of boosts that is visible in our
everyday life. The second term is a relativistic correction that is valid for both
interacting and non-interacting particles. This correction is a contribution of
the order c−2 to the relativistic law of addition of velocities (4.6). Currently,
there is abundant experimental evidence for the validity of this law.17 The
third term is a correction due to the interaction between particles 1 and 2.
This effect has not been seen experimentally, because it is very difficult to
perform accurate measurements of observables of interacting particles from
fast moving reference frames, as we mentioned above.
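The two interaction-independent terms can be illustrated numerically. The sketch below evaluates the kinematical law (v1x − v)/(1 − v1x v/c²), compares it with the composition of rapidities, and checks that the first two terms of the expansion reproduce it up to higher orders in c⁻². The interaction term with Zx is not modeled here, and the function names and the units with c = 1 are illustrative assumptions.

```python
import math

def boosted_velocity(v1x, theta, c=1.0):
    """Kinematical part of the boost: (v1x - v) / (1 - v1x v / c^2),
    where v = c tanh(theta)."""
    v = c * math.tanh(theta)
    return (v1x - v) / (1.0 - v1x * v / c**2)

def boosted_velocity_expanded(v1x, theta, c=1.0):
    """Non-relativistic term plus the correction of order c^-2:
    (v1x - v) + v1x v (v1x - v) / c^2."""
    v = c * math.tanh(theta)
    return (v1x - v) + v1x * v * (v1x - v) / c**2

if __name__ == "__main__":
    # The exact law is equivalent to subtracting rapidities.
    print(boosted_velocity(0.5, 0.3), math.tanh(math.atanh(0.5) - 0.3))
    # For small velocities the two-term expansion is very accurate.
    print(boosted_velocity(0.02, 0.01) - boosted_velocity_expanded(0.02, 0.01))
```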
In conclusion, detailed measurements of boost transformations of particle
observables are very difficult, and with the present level of experimental
precision they cannot help us to decide which form of dynamics is active
in any given physical system. Let us now turn to space translations and
rotations.
17.2.7 Spatial translations and rotations

In both instant and point forms of dynamics, rotations are interaction-
independent, so the term Y in the generator of rotations (17.29) is zero,
17
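and rotation transformations of particle positions (and other observables)
are exactly the same as in the non-interacting case, e.g.,18
$$
\mathbf{r}_1(\vec{\phi}) = e^{-\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}}\, \mathbf{r}_1\, e^{\frac{i}{\hbar}\mathbf{J}\cdot\vec{\phi}} = e^{-\frac{i}{\hbar}\mathbf{j}_1\cdot\vec{\phi}}\, \mathbf{r}_1\, e^{\frac{i}{\hbar}\mathbf{j}_1\cdot\vec{\phi}} = R_{\vec{\phi}}\, \mathbf{r}_1 \qquad (17.43)
$$
This is in full agreement with experimental observations.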

In the instant form of dynamics, space translations are interaction-independent
as well
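$$
\mathbf{r}_1(\mathbf{a}) = e^{-\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{a}}\, \mathbf{r}_1\, e^{\frac{i}{\hbar}\mathbf{P}\cdot\mathbf{a}} = e^{-\frac{i}{\hbar}(\mathbf{p}_1+\mathbf{p}_2)\cdot\mathbf{a}}\, \mathbf{r}_1\, e^{\frac{i}{\hbar}(\mathbf{p}_1+\mathbf{p}_2)\cdot\mathbf{a}} = e^{-\frac{i}{\hbar}\mathbf{p}_1\cdot\mathbf{a}}\, \mathbf{r}_1\, e^{\frac{i}{\hbar}\mathbf{p}_1\cdot\mathbf{a}} = \mathbf{r}_1 - \mathbf{a}
$$
Again this result is supported by experimental observations and by our common
experience in various physical systems and in a wide range of values of
the transformation parameter a.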
However, the point-form generator of space translations (17.37) does de-
pend on interaction. Thus translations of the observer have a non-trivial
effect on measured positions of interacting particles. For example, the ac-
tion of a translation along the x-axis on the x-component of position of the
particle 1 is
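$$
\begin{aligned}
r_{1x}(a) &= e^{-\frac{i}{\hbar}P_x a}\, r_{1x}\, e^{\frac{i}{\hbar}P_x a} = e^{-\frac{i}{\hbar}(p_{1x}+p_{2x}+U_x)a}\, r_{1x}\, e^{\frac{i}{\hbar}(p_{1x}+p_{2x}+U_x)a} \\
&\approx r_{1x} - a[(p_{1x} + U_x), r_{1x}]_P + \ldots \\
&= r_{1x} - a - a[U_x, r_{1x}]_P + \ldots \qquad (17.44)
\end{aligned}
$$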
where the last term on the right hand side is the interaction correction. Such
a correction has not been seen in experiments in spite of the fact that there
is no difficulty in arranging observations from reference frames displaced by
large distances a. So, there is a good reason to believe that interaction
dependence (17.44) has not been seen because it is non-existent.
18
Rotation matrix $R_{\vec{\phi}}$ has been defined in (D.22).
Thus we conclude that the effect of space translations and rotations must
be independent of interactions in the system. This means that these
transformations are kinematical as in the instant form19
P = P0 (17.45)
J = J0 (17.46)
Therefore available experimental data imply that
Postulate 17.2 (instant form of dynamics) The unitary representation
of the Poincaré group acting in the Hilbert space of any interacting physical
system belongs to the instant form of dynamics.
Throughout this book we assumed (without much discussion) that interactions
belong to the instant form. Now we see that this was the correct
choice.
Our arguments in this section used the assumption that one can observe
particle trajectories while interaction takes place. In order to make such
observations, the range of interaction should be larger than the spatial reso-
lution of instruments. This condition is certainly true for particles interacting
via long-range electromagnetic forces. Their non-trivial interacting dynam-
ics (17.31) is directly observed. See chapters 15 and 16. Therefore, for such
systems one should use the instant form of relativistic dynamics with inter-
acting boost operators (N.28). In chapter 13, from the analysis of particle
decays we showed that Postulate 17.2 must be valid also for short-range weak
nuclear forces. In the case of systems governed by short-range strong nuclear
forces, neither interacting trajectories nor time-dependent decay laws can be
observed.20 Thus, the form of dynamics governing strong nuclear interactions
remains an open issue.
19
It follows immediately from (17.45) that boosts ought to be dynamical. Indeed, sup-
pose that boosts are kinematical, i.e., K = K0 . Then from commutator (3.57) we obtain
$H = c^2 [K_x, P_x]_P = c^2 [K_{0x}, P_{0x}]_P = H_0$
which means that V = 0, and the system is non-interacting in disagreement with our
initial assumption.
20
The presence of interaction becomes evident only through scattering effects or through
formation of bound states, which are insensitive to the form of dynamics, as shown in
subsection 7.2.3.
17.2.8 Physical inequivalence of forms of dynamics

Postulate 17.2 contradicts a widely shared belief that different forms of dy-
namics are physically equivalent. In the literature one can find examples of
calculations performed in the instant, point, and front forms. The common
assumption is that one can freely choose the form of dynamics which is more
convenient. Where does this idea come from? There are two sources. The
first source is the fact21 that different forms of dynamics are scattering equiv-
alent. The second source is the questionable assumption that all physically
relevant information can be obtained from the S-matrix:
If one adopts the point of view, first expressed by Heisenberg, that

all experimental information about the physical world is ultimately
deduced from scattering experiments and reduces to knowledge of
certain elements of the scattering matrix (or the analogous classi-
cal quantity), then different dynamical theories which lead to the
same S-matrix must be regarded as physically equivalent. S. N.
Sokolov and A. N. Shatnii [SS78]
We already discussed in chapter 7 that having exact knowledge of the on-shell

S-matrix one can easily calculate scattering cross-sections. Moreover, the
energy levels and lifetimes of bound states are encoded in positions of poles
of the S-matrix on the complex energy plane. It is true that in modern high
energy physics experiments it is very difficult to measure anything beyond
these data. This is the reason why scattering-theoretical methods play such
an important role in particle physics. It is also true that in order to describe
these data, one can choose any convenient form of dynamics and a wide range
of scattering-equivalent expressions for the Hamiltonian.
However, it is definitely not true that the S-matrix provides a complete
description of everything that can be observed. For example, the time evo-
lution and other inertial transformations of particle observables discussed
earlier in this section, cannot be described within the S-matrix formalism. A
theoretical description of these phenomena requires exact knowledge of gen-
erators of the Poincaré group. Two scattering-equivalent forms of dynam-
ics may yield very different transformations of states with respect to space
translations, rotations and/or boosts. These differences can be measured in
21
explained in subsection 7.2.3
experiments: For example, the interaction-independence of spatial translations
rules out the point form of dynamics, as we established in subsection
17.2.7.
17.2.9 “No interaction” theorem

The fact that boost generators are interaction-dependent has very impor-
tant implications for relativistic effects in interacting systems. For example,
consider a system of two interacting particles. The arguments used to prove
Theorem 17.1 are no longer valid in this case. Boost transformations of par-
ticle positions (17.42) contain interaction-dependent terms. This means that
Lorentz transformations (17.5) - (17.8) are no longer applicable to trajecto-
ries of individual particles and associated events.
The contradiction between the usually assumed “invariant world lines”
and relativistic interactions was noticed first by Thomas [Tho52]. Currie,
Jordan, and Sudarshan analyzed this problem in greater detail [CJS63] and
proved their famous theorem
Theorem 17.3 (Currie, Jordan, and Sudarshan) In a two-particle system,22
trajectories of particles obey Lorentz transformation formulas (17.5) -
(17.8) if and only if the particles do not interact23 with each other.
Proof. We have demonstrated in Theorem 17.1 that trajectories of
non-interacting particles do transform by Lorentz formulas. So, we only need to
prove the reverse statement: Any physical system whose trajectories transform
by Lorentz formulas, is interaction-free.
In our proof we will need to study inertial transformations of particle
observables (position r and momentum p), with respect to time translations
and boosts. In particular, given observables r(0, t) and p(0, t) at time t in
the reference frame O, we would like to find observables r(θ, t′ ) and p(θ, t′ )
in the moving reference frame O ′ , where time t′ is measured by its own clock.
As before, we will assume that O ′ is moving relative to O with velocity
v = c tanh θ along the x-axis.
22 This theorem can be proven for many-particle systems as well. We limit ourselves to
two particles in order to simplify the proof.
23 In our version of the proof we actually demonstrate the impossibility of cluster-
separable interactions. For a more general proof see original paper [CJS63].
Our plan is similar to the proof of Theorem 17.1. We will compare formu-
las for r(θ, t′ ) and p(θ, t′ ) obtained by two methods. In the first method, we
will use Lorentz transformations of special relativity. In the second method,
we will apply interacting unitary operators of time translation and boost to
r and p. Our goal is to show that these two methods give different results.
It is sufficient to demonstrate that differences occur already in terms linear
with respect to t′ and θ. So, we will work in this approximation.
Let us apply the first method (i.e., traditional Lorentz formulas). From
equations (17.5) - (17.8) and (4.3) we obtain the following transformations
for the position and momentum of the particle 1 (formulas for the particle 2
are similar)
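\[ r_{1x}(\theta, t') \approx r_{1x}(0, t) - ct\theta \tag{17.47} \]
\[ r_{1y}(\theta, t') = r_{1y}(0, t) \tag{17.48} \]
\[ r_{1z}(\theta, t') = r_{1z}(0, t) \tag{17.49} \]
\[ p_{1x}(\theta, t') \approx p_{1x}(0, t) - \frac{1}{c} h_1(0, t)\theta \tag{17.50} \]
\[ p_{1y}(\theta, t') = p_{1y}(0, t) \tag{17.51} \]
\[ p_{1z}(\theta, t') = p_{1z}(0, t) \tag{17.52} \]
\[ t' \approx t - \frac{\theta}{c} r_{1x}(0, t) \tag{17.53} \]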
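As a quick numerical sanity check of the linearized formulas (17.47), (17.50) and (17.53), the following Python sketch compares them with an exact boost. It assumes that (17.5) - (17.8) and (4.3) have the standard hyperbolic form of a boost along the x-axis with rapidity θ. The function names and the sample values (an electron-like mass, x = 1 m, t = 1 ns) are ours and serve only as an illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def boost_exact(x, t, px, h, theta):
    """Exact boost along x with rapidity theta, v = c*tanh(theta) (assumed standard form)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return (x * ch - C * t * sh,      # x'
            t * ch - (x / C) * sh,    # t'
            px * ch - (h / C) * sh)   # p_x'


def boost_linear(x, t, px, h, theta):
    """Linearized forms (17.47), (17.53) and (17.50)."""
    return (x - C * t * theta,
            t - (theta / C) * x,
            px - (h / C) * theta)


def relative_errors(theta, x=1.0, t=1e-9, m=9.109e-31, px=1e-22):
    """Relative deviation of the linearized values from the exact ones."""
    h = math.sqrt(m**2 * C**4 + px**2 * C**2)  # free-particle energy
    exact = boost_exact(x, t, px, h, theta)
    approx = boost_linear(x, t, px, h, theta)
    return [abs(e - a) / abs(e) for e, a in zip(exact, approx)]


if __name__ == "__main__":
    for theta in (1e-2, 1e-3):
        print(theta, relative_errors(theta))
```

For these sample values the deviations shrink roughly as θ², which is what one expects when only terms of second and higher order in θ have been dropped.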
We can rewrite equation (17.47) without affecting the first order accuracy
level in t′ and θ

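\[
\begin{aligned}
r_{1x}(\theta, t') &= r_{1x}\left(0, t' + \frac{r_{1x}(0, t)}{c}\theta\right) - \left(t' + \frac{r_{1x}(0, t)}{c}\theta\right) c\theta \\
&\approx r_{1x}(0, t') + \frac{dr_{1x}(0, t')}{c\,dt'} r_{1x}(0, t)\theta - c\theta t' \\
&\approx r_{1x}(0, t') + \frac{dr_{1x}(0, t')}{c\,dt'} r_{1x}(0, t')\theta - c\theta t'
\end{aligned}
\tag{17.54}
\]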
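Formula (17.54) can be tested on the simplest trajectory, a free particle with x(t) = x0 + vt. The sketch below is our own illustration, not part of the original proof: it boosts this trajectory exactly, assuming the standard hyperbolic Lorentz formulas, and compares the result with the right-hand side of (17.54). Units with c = 1 and the sample values of x0, v and t′ are assumptions made for readability.

```python
import math

C = 1.0  # units with c = 1 (an assumption made to keep the numbers readable)


def exact_boosted_position(x0, v, theta, t_prime):
    """Exact Lorentz image of the free trajectory x(t) = x0 + v*t at time t'."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    # Invert t' = t*cosh(theta) - (x(t)/c)*sinh(theta) for the time t in frame O.
    t = (t_prime + x0 * sh / C) / (ch - v * sh / C)
    x = x0 + v * t
    return x * ch - C * t * sh


def first_order_position(x0, v, theta, t_prime):
    """Right-hand side of (17.54) for the same trajectory."""
    x = x0 + v * t_prime  # r_1x(0, t')
    return x + (v / C) * x * theta - C * theta * t_prime


def error(theta, x0=1.0, v=0.3, t_prime=0.5):
    """Absolute difference between the exact and the first-order positions."""
    return abs(exact_boosted_position(x0, v, theta, t_prime)
               - first_order_position(x0, v, theta, t_prime))


if __name__ == "__main__":
    print(error(1e-3), error(5e-4))
```

Halving θ reduces the discrepancy by a factor close to four, so the two expressions differ only in terms of second order in θ, as (17.54) claims.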
Next we use the second method (i.e., the direct application of interacting
time translations and boosts)
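\[
\begin{aligned}
r_{1x}(\theta, t') &= e^{-\frac{ic}{\hbar}K_x\theta} e^{\frac{i}{\hbar}Ht'} e^{\frac{ic}{\hbar}K_x\theta} e^{-\frac{ic}{\hbar}K_x\theta} r_{1x}(0, 0) e^{\frac{ic}{\hbar}K_x\theta} e^{-\frac{ic}{\hbar}K_x\theta} e^{-\frac{i}{\hbar}Ht'} e^{\frac{ic}{\hbar}K_x\theta} \\
&= e^{\frac{i}{\hbar}Ht'\cosh\theta} e^{-\frac{ic}{\hbar}P_x t'\sinh\theta} e^{-\frac{ic}{\hbar}K_x\theta} r_{1x}(0, 0) e^{\frac{ic}{\hbar}K_x\theta} e^{\frac{ic}{\hbar}P_x t'\sinh\theta} e^{-\frac{i}{\hbar}Ht'\cosh\theta} \\
&\approx e^{\frac{i}{\hbar}Ht'} e^{-\frac{ic}{\hbar}P_x t'\theta} \left(r_{1x}(0, 0) - c\theta[r_{1x}(0, 0), K_x]_P\right) e^{\frac{ic}{\hbar}P_x t'\theta} e^{-\frac{i}{\hbar}Ht'} \\
&\approx e^{\frac{i}{\hbar}Ht'} \left(r_{1x}(0, 0) - c\theta t' - c\theta[r_{1x}(0, 0), K_x]_P\right) e^{-\frac{i}{\hbar}Ht'} \\
&= r_{1x}(0, t') - c\theta[r_{1x}(0, t'), K_x(t')]_P - c\theta t'
\end{aligned}
\tag{17.55}
\]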

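\[
\begin{aligned}
p_{1x}(\theta, t') &= e^{\frac{i}{\hbar}Ht'\cosh\theta} e^{-\frac{ic}{\hbar}P_x t'\sinh\theta} e^{-\frac{ic}{\hbar}K_x\theta} p_{1x}(0, 0) e^{\frac{ic}{\hbar}K_x\theta} e^{\frac{ic}{\hbar}P_x t'\sinh\theta} e^{-\frac{i}{\hbar}Ht'\cosh\theta} \\
&\approx e^{\frac{i}{\hbar}Ht'} e^{-\frac{ic}{\hbar}P_x t'\theta} \left(p_{1x}(0, 0) - c\theta[p_{1x}(0, 0), K_x]_P\right) e^{\frac{ic}{\hbar}P_x t'\theta} e^{-\frac{i}{\hbar}Ht'} \\
&\approx e^{\frac{i}{\hbar}Ht'} \left(p_{1x}(0, 0) - c\theta[p_{1x}(0, 0), K_x]_P\right) e^{-\frac{i}{\hbar}Ht'} \\
&= p_{1x}(0, t') - c\theta[p_{1x}(0, t'), K_x(t')]_P
\end{aligned}
\tag{17.56}
\]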
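If Lorentz transformations were exact, then results of both methods would
be identical and comparing (17.54) and (17.55) we would obtain
\[ \frac{1}{c} \frac{dr_{1x}(0, t)}{dt} r_{1x}(0, t)\theta = -c\theta[r_{1x}(0, t), K_x(t)]_P \]
or, using \( \frac{dr_{1x}}{dt} = [r_{1x}, H]_P = \frac{\partial H}{\partial p_{1x}} \) and \( [r_{1x}, K_x]_P = \frac{\partial K_x}{\partial p_{1x}} \),
\[ c^2 \frac{\partial K_x}{\partial p_{1x}} = -r_{1x} \frac{\partial H}{\partial p_{1x}} \]
Similar arguments lead us to the general case (i, j = 1, 2, 3)
\[ c^2 \frac{\partial K_j}{\partial p_{1i}} = -r_{1j} \frac{\partial H}{\partial p_{1i}} \tag{17.57} \]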
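Condition (17.57) should hold identically in the non-interacting case, in agreement with Theorem 17.1. The Python sketch below checks this numerically in one spatial dimension. It assumes the classical spinless forms H0 = h1 + h2 and K0 = −(h1 r1 + h2 r2)/c² for the non-interacting generators; the masses, the sample phase-space point, and the units with c = 1 are hypothetical choices of ours.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for readability)


def h(p, m):
    """Free-particle energy h = sqrt(m^2 c^4 + p^2 c^2)."""
    return math.sqrt(m**2 * C**4 + p**2 * C**2)


def H0(p1, r1, p2, r2, m1=1.0, m2=2.0):
    """Non-interacting Hamiltonian h1 + h2 (independent of positions)."""
    return h(p1, m1) + h(p2, m2)


def K0(p1, r1, p2, r2, m1=1.0, m2=2.0):
    """Assumed classical non-interacting boost generator (x-component)."""
    return -(h(p1, m1) * r1 + h(p2, m2) * r2) / C**2


def d_dp1(f, p1, r1, p2, r2, eps=1e-5):
    """Central-difference derivative of f with respect to p1."""
    return (f(p1 + eps, r1, p2, r2) - f(p1 - eps, r1, p2, r2)) / (2 * eps)


def residual_17_57(p1, r1, p2, r2):
    """c^2 dK/dp1 + r1 dH/dp1, which (17.57) requires to vanish."""
    return C**2 * d_dp1(K0, p1, r1, p2, r2) + r1 * d_dp1(H0, p1, r1, p2, r2)


if __name__ == "__main__":
    print(residual_17_57(0.3, 1.5, -0.7, -2.0))
```

The residual vanishes up to rounding error, so the free generators pass the test. The remainder of the proof shows that adding an interaction to H and K makes conditions of this type impossible to satisfy.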
Comparing equations (17.56) and (17.50) we get

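\[
\begin{aligned}
p_{1x}(0, t) - \frac{h_1(0, t)\theta}{c} &= p_{1x}\left(0, t - \frac{r_{1x}(0, t)}{c}\theta\right) - c\theta[p_{1x}(0, t'), K_x(t')]_P \\
&\approx p_{1x}(0, t) - \frac{r_{1x}(0, t)\theta}{c} \frac{\partial p_{1x}(0, t)}{\partial t} - c\theta[p_{1x}(0, t'), K_x(t')]_P \\
&\approx p_{1x}(0, t) - \frac{r_{1x}(0, t)\theta}{c} [p_{1x}(0, t), H]_P - c\theta[p_{1x}(0, t), K_x(t)]_P
\end{aligned}
\]
and
\[ c^2 [p_{1x}, K_x]_P = -r_{1x}[p_{1x}, H]_P + h_1 \]
In the general case (i, j = 1, 2, 3) we have
\[ c^2 \frac{\partial K_j}{\partial r_{1i}} = -r_{1j} \frac{\partial H}{\partial r_{1i}} + \delta_{ij} h_1 \tag{17.58} \]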
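Putting equations (17.57) - (17.58) together, we conclude that if tra-
jectories of interacting particles transform by Lorentz formulas, then the
following equations must be valid
\[ c^2 \frac{\partial K_k}{\partial \mathbf{p}_1} = -r_{1k} \frac{\partial H}{\partial \mathbf{p}_1} \tag{17.59} \]
\[ c^2 \frac{\partial K_k}{\partial \mathbf{p}_2} = -r_{2k} \frac{\partial H}{\partial \mathbf{p}_2} \tag{17.60} \]
\[ c^2 \frac{\partial K_k}{\partial r_{1i}} = -r_{1k} \frac{\partial H}{\partial r_{1i}} + \delta_{ik} h_1 \tag{17.61} \]
\[ c^2 \frac{\partial K_k}{\partial r_{2i}} = -r_{2k} \frac{\partial H}{\partial r_{2i}} + \delta_{ik} h_2 \tag{17.62} \]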
Our next goal is to show that these equations lead to a contradiction. Taking
derivatives of (17.59) by p2 and (17.60) by p1 and subtracting them we obtain
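\[ \frac{\partial^2 H}{\partial \mathbf{p}_2 \partial \mathbf{p}_1} = 0 \]
In a similar way we get24
\[ \frac{\partial^2 H}{\partial \mathbf{r}_2 \partial \mathbf{r}_1} = \frac{\partial^2 H}{\partial \mathbf{r}_2 \partial \mathbf{p}_1} = \frac{\partial^2 H}{\partial \mathbf{p}_2 \partial \mathbf{r}_1} = 0 \]
24 Here we used \( \partial h_2/\partial \mathbf{r}_1 = \partial \sqrt{m_2^2 c^4 + p_2^2 c^2}/\partial \mathbf{r}_1 = 0 \) and \( \partial h_2/\partial \mathbf{p}_1 = \partial h_1/\partial \mathbf{r}_2 = \partial h_1/\partial \mathbf{p}_2 = 0 \).
The only non-zero cross-derivatives are
\[ \frac{\partial^2 H}{\partial \mathbf{p}_1 \partial \mathbf{r}_1} \neq 0 \]
\[ \frac{\partial^2 H}{\partial \mathbf{p}_2 \partial \mathbf{r}_2} \neq 0 \]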
Therefore, only pairs of arguments (p1 , r1 ) and (p2 , r2 ) are allowed to be
together in H, and we can represent the full Hamiltonian in the form
H = H1 (p1 , r1 ) + H2 (p2 , r2 )
This result means that the force acting on the particle 1 does not depend on
the state (position and momentum) of the particle 2.
\[ \mathbf{f}_1 = \frac{\partial \mathbf{p}_1}{\partial t} = [\mathbf{p}_1, H]_P = [\mathbf{p}_1, H_1(\mathbf{p}_1, \mathbf{r}_1)]_P \]
and vice versa. Therefore, both particles move independently, i.e., there
is no interaction. This already proves the statement of the Theorem. A
stronger result can be obtained if we disregard the possibility of non-cluster
separable (or long-range) interactions. From the Poisson bracket with the
total momentum we obtain
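\[ 0 = [\mathbf{P}, H]_P = [\mathbf{p}_1 + \mathbf{p}_2, H]_P = -\frac{\partial H_1(\mathbf{p}_1, \mathbf{r}_1)}{\partial \mathbf{r}_1} - \frac{\partial H_2(\mathbf{p}_2, \mathbf{r}_2)}{\partial \mathbf{r}_2} \tag{17.63} \]
Since two terms on the right hand side of (17.63) depend on different vari-
ables, we must have
\[ \frac{\partial H_1(\mathbf{p}_1, \mathbf{r}_1)}{\partial \mathbf{r}_1} = -\frac{\partial H_2(\mathbf{p}_2, \mathbf{r}_2)}{\partial \mathbf{r}_2} = \mathbf{C} \]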
where C is a constant vector. Then the Hamiltonian can be written in the
form
H = H1 (p1 ) + H2 (p2 ) + C · (r1 − r2 )

To ensure the cluster separability (=short range) of the interaction we must

set C = 0. Then the resulting form of the Hamiltonian H = H1 (p1 ) + H2(p2 )
implies that the force acting on the particle 1 vanishes
\[ \mathbf{f}_1 = \frac{\partial \mathbf{p}_1}{\partial t} = [\mathbf{p}_1, H]_P = [\mathbf{p}_1, H_1(\mathbf{p}_1)]_P = 0 \]
The same is true for the force acting on the particle 2.
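The key step of the proof, the vanishing of mixed derivatives of H with respect to variables of different particles, can be illustrated numerically. The following Python sketch is our own one-dimensional illustration. It estimates ∂²H/∂r1∂r2 by central differences for a separable Hamiltonian and for a Hamiltonian with a Coulomb-like potential q/|r1 − r2|. The potential, the masses, the coupling q, and the units with c = 1 are all assumptions chosen for the example.

```python
import math


def h(p, m, c=1.0):
    """Free-particle energy sqrt(m^2 c^4 + p^2 c^2), in units with c = 1."""
    return math.sqrt(m**2 * c**4 + p**2 * c**2)


def H_free(p1, r1, p2, r2):
    """Separable Hamiltonian H1(p1, r1) + H2(p2, r2); here simply h1 + h2."""
    return h(p1, 1.0) + h(p2, 2.0)


def H_coulomb(p1, r1, p2, r2, q=0.1):
    """Illustrative interacting Hamiltonian with a potential q/|r1 - r2|."""
    return H_free(p1, r1, p2, r2) + q / abs(r1 - r2)


def mixed_r1_r2(H, p1, r1, p2, r2, eps=1e-3):
    """Central-difference estimate of d^2 H / (dr1 dr2)."""
    return (H(p1, r1 + eps, p2, r2 + eps) - H(p1, r1 + eps, p2, r2 - eps)
            - H(p1, r1 - eps, p2, r2 + eps) + H(p1, r1 - eps, p2, r2 - eps)) / (4 * eps**2)


if __name__ == "__main__":
    point = (0.3, 1.0, -0.7, -1.0)  # (p1, r1, p2, r2)
    print(mixed_r1_r2(H_free, *point))     # separable case
    print(mixed_r1_r2(H_coulomb, *point))  # interacting case, close to -2q/|r1 - r2|^3
```

The separable Hamiltonian gives a zero mixed derivative, while the Coulomb-like one does not. By the theorem, the second Hamiltonian is therefore incompatible with Lorentz transformations of the particle trajectories.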
The above theorem shows us that if particles have Lorentz-invariant
“worldlines,” then they are not interacting. In special relativity, Lorentz
transformations are assumed to be exactly and universally valid (see As-
sertion I.1). Then the theorem leads to the conclusion that inter-particle
interactions are simply impossible. This explains why this theorem is com-
monly referred to as “the no-interaction theorem.” Of course, it is absurd to
think that there are no interactions in nature. So, in current literature there
are two interpretations of this result. One interpretation is that the Hamil-
tonian dynamics cannot properly describe interactions. Then a variety of
non-Hamiltonian versions of dynamics were suggested [Kei94, DW65, Pol85].
Another view is that variables r and p do not describe real observables of par-
ticle positions and momenta, or even that the notion of particles themselves
becomes irrelevant in quantum field theory. Quite often the Currie-Jordan-
Sudarshan theorem is taken as evidence that a particle-based descrip-
tion of nature is not adequate, and that one should seek a field-based approach
[Boy08].
However, we reject both these explanations. Non-Hamiltonian versions of
particle dynamics contradict fundamental postulates of relativistic quantum
theory, which were formulated and analyzed throughout this book. We also
would like to stick to the idea that the physical world is described by particles
with well-defined positions, momenta, spins, etc. We will also assume that
these physical particles interact via instantaneous potentials obtained in the
dressed particle approach to QFT. So, for us the only way out of the paradox
is to admit that Lorentz transformations of special relativity are not appli-
cable to observables of interacting particles. Then from our point of view,
it is more appropriate to call Theorem 17.3 the “no-Lorentz-transformation
theorem.” In contrast to the special-relativistic Assertion I.1, boost transfor-
mations of observables of individual particles should depend on the observed
system, its state and on interactions acting in the system. So, we conclude
that boost transformations are dynamical.
17.3 Comparison with special relativity

In this section we would like to discuss the physical significance of our con-
clusion about the dynamical character of boosts and how this contradicts
Einstein’s special relativity. In subsection 17.3.1 we will analyze existing
proofs of Lorentz transformations and show that these proofs do not ap-
ply to transformations of observables of interacting particles. In subsection
17.3.2 we are going to discuss experimental verifications of special relativ-
ity. We will see that in most cases these experiments are dealing with free
particles, for which our theory makes the same predictions as the standard
special relativity. When genuine interacting systems are observed (such as
decaying particles), the measurements are not accurate enough to register
our predicted deviations from Einstein’s theory. So, it is not surprising that
this theory has withstood all experimental tests so far. In subsections 17.3.3
- 17.3.6 we will suggest that such fundamental assertions of relativistic theo-
ries as the manifest covariance and the 4-dimensional Minkowski space-time
continuum should be rejected.
17.3.1 On “derivations” of Lorentz transformations

Einstein based his special relativity [Ein05] on two postulates. One of them
was the principle of relativity. The other was the independence of the speed
of light from the velocity of the source and/or observer. Both these statements
remain true in our theory as well (see our Postulate 2.1 and Statement 5.1).
Then Einstein discussed a series of thought experiments with measuring rods,
clocks, and light rays, which demonstrated the relativity of simultaneity, the
length contraction of moving rods, and the slowing-down of moving clocks.
These observations were formalized in Lorentz formulas (17.5) - (17.8), which
supposedly connected times and positions of a localized event in different
moving reference frames. As we demonstrated in Theorem 17.1, our approach
leads to exactly the same transformation laws for events associated with
non-interacting particles. So far our approach and special relativity are in
complete agreement.
Note that although Einstein’s relativity postulate had universal appli-
cability to all kinds of events and processes, his “invariance of the speed of
light” postulate is only relevant to freely propagating light pulses. So, strictly
speaking, all conclusions made in [Ein05] can be applied only to space and
time coordinates of events (such as intersections of light pulses) related in
some way to the propagation of light. Nevertheless, in his work Einstein

tacitly assumed25 that the same conclusions should be extended to all other
events independent of their physical nature and of the interactions involved.26
There is a large number of publications [Sch84, Fie97, LK75, LL76, Sar82,
Pol01], which claim that Lorentz transformation formulas (17.5) - (17.8) can
be derived even without using Einstein’s second postulate. However,
these works do not look conclusive. There are two common features in these
derivations that we find troublesome. First, they often assume an abstract
(i.e., independent of real physical processes and interactions) nature of events
occurring in space-time points (t, x, y, z). Second, they often postulate the
isotropy and homogeneity of space around these points [Sch84, Fie97, LK75,
LL76]. The main problem with these approaches is that in physics we should
be interested in transformations of observables of real interacting particles,
rather than abstract space-time points. One cannot simply assume
that transformations of these observables are completely independent of what
occurs in the space surrounding the particle and of how this particle
interacts with the rest of the physical system.
One can reasonably assume that all directions in space are exactly equiv-
alent for a single isolated particle [Sar82], but this is not at all obvious when
the particle participates in interactions with other particles. Suppose that
we have two interacting particles 1 and 2 at some distance from each other.
Suppose that we want to derive boost transformations for observables of the
particle 1. Clearly, for this particle different directions in space are not equiv-
alent: For example, the direction pointing to the particle 2 is different from
other directions. So, the assumption of spatial isotropy cannot be applied in
this case.
Sometimes the following argument is presented in order to justify the
applicability of (17.5) - (17.8) even for interacting systems. Suppose that we
have two events A and B having the same coordinates (r, t) in the frame at
rest. Suppose also that event A is related to light pulses (therefore, Lorentz
formulas are exactly applicable to it), but event B is associated with some
interacting system. If space-time coordinates of A and B transformed by
different formulas, then we would have a seemingly intolerable situation in
which events A and B coincide in the frame at rest, but they occur at different
space-time points if observed from a moving frame. Therefore, the argument
goes, all events, independent of their physical nature, must transform by the
same universal formulas (17.5) - (17.8). Though seemingly reasonable, the
above argument is not convincing. There is absolutely no experimental or
theoretical support for the above “coincidence postulate” (i.e., that events,
overlapping in one frame, overlap in all other frames as well).
25 and this assumption has been repeated in all relativity textbooks ever since
26 see Assertion I.1
Thus we conclude that there are no compelling theoretical reasons to
believe in the universal validity of Lorentz transformations (17.5) - (17.8).
Note that special relativity first postulates these transformations and then
tries to formulate dynamical (interacting) theories, which conform with this
assumption. Our approach to relativistic dynamics is fundamentally differ-
ent, in fact, opposite. We start with formulation of relativistic (=Poincaré
invariant) interacting theory. Then we derive boost transformations of par-
ticle observables using standard formulas of quantum theory and see27 that
they are different from universal Lorentz formulas (17.5) - (17.8). Correct
boost transformations depend on the state of the observed multiparticle sys-
tem and on interactions acting there.
We also see28 that “geometric” universality of boosts contradicts the (well-
established) dynamical character of time translations. A theory, in which
time translations are dynamical while space translations, rotations, and boosts
are kinematical, cannot be invariant with respect to the Poincaré group. So,
ironically, the assumptions of kinematical boosts, universal Lorentz trans-
formations, and “invariant worldlines” are in conflict with the principle of
relativity. This contradiction is the main reason for the “no interaction”
Theorem 17.3.
17.3.2 On experimental tests of special relativity

Supporters of special relativity usually invoke an argument that predictions
of this theory were confirmed by experiments with astonishing precision.
This is, indeed, true. However, on closer inspection it appears that existing
experiments cannot distinguish between special relativity and the approach
presented in this book. In some cases, this is because the two theories really
agree. In other cases, the disagreement is so small that the required precision
is out of reach for modern technology.
27 in section 17.2
28 in subsection 17.2.7
From the preceding discussion it should be clear that Lorentz formulas
of special relativity are exactly applicable to observables of non-interacting
particles and to total observables of any physical system, whether interacting
or not. It appears that almost all experimental tests of special relativity
are concerned with these kinds of measurements: they either look at non-
interacting (free) particles or at total observables in a compound system.
Below we briefly discuss several major classes of such experiments [NFRS78,
Mac86, Sch, Rob00].
One class of experiments is related to measurements of the frequency
(energy) of light and its dependence on the movement of the source and/or
observer.29 These Doppler effect experiments [IS38, IS41, KPR85, HMS79]
can be formulated either as measurements of the photon’s energy dependence
on the velocity of the source (or observer) or as velocity dependence of the en-
ergy level separation in the source. These two interpretations were discussed
in subsections 5.3.4 and 6.4.2, respectively. In the former interpretation, one
is measuring the energy (or frequency) of a free particle - the photon. In
the latter interpretation, measurements of the total energy (differences) in
an interacting system are performed. In both these formulations, predictions
of our theory exactly coincide with special relativity.
Another class of experiments is concerned with measuring the speed of
light and confirming its independence from the movement of the source and/or
observer. This class includes interference experiments of Michelson-Morley
and Kennedy-Thorndike as well as direct speed measurements [AFKW64].
These experiments are performed with free photons or light rays, so, again,
our theory and special relativity make exactly the same predictions for such
non-interacting systems. The same is true for tests of relativistic kinematics,
which include relationships between velocities, momenta, and energies of
free massive particles as well as changes of these parameters after particle
collisions or decays.
An exceptional type of experiment where one can, at least in principle,
observe the differences between our theory and special relativity is the decay
of fast moving unstable particles. In this case we are dealing with a physi-
cal system in which the interaction acts during a long time interval (of the
order of the particle’s lifetime), and there is a clearly observable time-dependent
process (the decay) which is controlled by the strength of the interaction.
It was established in chapter 13 that experiments measuring decays of fast
moving particles are not accurate enough to see the small difference between
predictions of special relativity and RQD.
29 See subsections 5.3.4 and 6.4.2.
17.3.3 Poincaré invariance vs. manifest covariance

From our above discussion it should be clear that there are two rather differ-
ent approaches to constructing relativistic theories. One is the traditional ap-
proach pioneered by Einstein and Minkowski and used in theoretical physics
ever since. This approach accepts without proof the validity of Assertion I.1
(the universality of Lorentz transformations) and its various consequences,
like Assertions I.3 (no superluminal signaling) and I.2 (manifest covariance).
It also assumes the existence of space-time, its 4-dimensional geometry, and
universal tensor transformations of space-time coordinates of events. The
distinguishing feature of this manifestly covariant approach is that boost
transformations of observables are assumed to be interaction-independent
and universal.
In this book we take a different viewpoint on relativity. We would like
to call it a Poincaré invariant approach. It is built on two fundamental
Postulates: the principle of relativity (Postulate 2.1) and the laws of quantum
mechanics from sections 1.5 and 1.6. From these Postulates we found that
Statement 17.4 (Poincaré invariance) Descriptions of the system in dif-

ferent inertial reference frames are related by transformations which furnish
a unitary representation of the Poincaré group.
For example, if F is the operator of an observable in the reference frame at
rest, then in the moving frame the same observable is represented by the
transformed operator
\[ F' = e^{-\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}} F e^{\frac{ic}{\hbar}\mathbf{K}\cdot\vec{\theta}} \]
Most textbooks in relativistic quantum theory tacitly assume that the

Poincaré invariance and the manifest covariance do not contradict each other;
in fact, they are often assumed to be equivalent. However, it is important to
realize that there is no convincing proof of such an equivalence. For example,
Foldy wrote
To begin our discussion of relativistic covariance, we would like

first to make clear that we are not in the least concerned with
appropriate tensor or spinor equations, or with “manifest covari-

ance” or with any other mathematical apparatus which is intended
to exploit the space-time symmetry of relativity, useful as such
may be. We are instead concerned with the group of inhomoge-
neous Lorentz transformations as expressing the inter-relationship
of physical phenomena as viewed by different equivalent observers
in un-accelerated reference frames. That this group has its basis in
the symmetry properties of an underlying space-time continuum
is interesting, important, but not directly relevant to the consid-
erations we have in mind. L. Foldy [Fol61]
This issue was also discussed by H. Bacry who came to a similar conclusion
The Minkowski manifest covariance cannot be present in quantum

theory but we want to preserve the Poincaré covariance. H. Bacry
[Bac89]
17.3.4 Is there an observable of time?

Special relativity and the manifestly covariant approach to relativistic physics
adopt a “geometrical” viewpoint on Lorentz transformations.30 In these the-
ories, time and position are unified as components of one 4-vector, and they
are treated on equal footing. Such a unification implies that there should be
a certain similarity between space and time coordinates. However, in quantum
mechanics31 there is a significant physical difference between space and time.
Space coordinates x, y, z are attributes (observables) of a material physical
system – a collection of particles. In the formalism of quantum mechanics
these coordinates are represented by (expectation) values of the position op-
erator R. There are position operators for each particle in the system as well
as the center-of-mass position operator. On the other hand, time is not an
observable in quantum mechanics.
30 see Appendix I
31 and in our everyday experience
In order to better understand this difference between r and t, recall our
definitions of measurements, clocks, and observables from the Introduction. The
measurements yield values of observables, such as positions, momenta, en-
ergies, etc., and these values depend on the nature of the measured physical
system and on the state in which the system finds itself. “Time” is not in
the list of physical observables. Time is just a numerical label attached to
each measurement according to the reading of the clock at the instant of
the measurement. The clock is separate from the observed physical system.
Clock readings do not depend on the kind of system being measured and on
its state. We can record time even if we do not measure anything, even if
there is no physical system to observe in our laboratory.
The clock is the necessary component of any reference frame or observer,
but this component is different from any measuring apparatus. In order
to “measure” time the observer needs to look at hour and minute hands
or at the digital display of her laboratory clock. In practical applications
clocks are macroscopic classical systems, such that there are no quantum
uncertainties in the hands’ positions, or these uncertainties are reduced to a
minimum. Of course, there is a certain logical controversy here. We know
that all systems (including clocks) obey the laws of quantum mechanics.
When we look at the clock’s hands we basically measure their positions, while
assuming that their velocities are exactly zero. This situation is explicitly
forbidden by Heisenberg’s uncertainty principle. So, there should be some
uncertainty associated with the measurement of the clock hand’s position,
which implies an uncertainty associated with “measurements of time”. Does
it mean that some “quantum nature” of time should be taken into account?
The answer is no. Only those systems which produce well-defined countable
periodic “ticks” without any (or with negligible) quantum uncertainty are
suitable as good clocks. So, if our laboratory clock exhibits some annoying
quantum fuzziness, then this is simply a bad clock that should be replaced
by a more accurate and stable one. Similarly, in order to measure positions
one needs to have heavy macroscopic sticks as rulers whose length is not
subject to quantum fluctuations. The existence of such ideal clocks and
rulers is questionable from the formal theoretical point of view. But there is
no doubt that distances and time intervals can be measured with very high
precision in practice. So, for theoretical purposes, it is reasonable to assume
the availability of ideal clocks and rulers, whose performance is not affected
by quantum effects.
The clock and the observed physical system are two separate objects,
and time “measurements” do not involve any interaction between the physi-
cal system and the measuring apparatus. Therefore, in quantum mechanics
there can be no “operator of time”, such that t is the expectation value
(or eigenvalue) of this operator. All attempts to introduce a time operator in
quantum mechanics have been unsuccessful.
There were numerous attempts to introduce the “time of arrival” observ-

able (and a corresponding Hermitian operator) in quantum mechanics, see,
e.g., [ORU98, Gal05, WX06, GRT96] and references cited therein. For ex-
ample, one can mark a certain space point (X, Y, Z) and ask “at what time
the particle arrives at this point?” Observations can yield a specific value for
this time T , and this value depends on the particle’s state. Of course, these
are important attributes of an observable. However, they are not sufficient to
justify the introduction of the “time of arrival” observable. According to our
definitions from the Introduction, any observable is an attribute of the system
that can be measured by all observers. The time of arrival is a different kind
of attribute. For those observers whose time label32 is different from T the
particle is not at the point (X, Y, Z), so the time of arrival value is com-
pletely undefined. So, one cannot associate the time of arrival with any true
observable. It is more correct to say that the “time of arrival” is a time label
of a particular inertial observer (or observers) for whom the measured value
of the particle’s position coincides with the pre-determined point (X, Y, Z).
An alternative proposal to introduce the time operator was presented in
[Nik08]. The author suggested defining the action of the time operator T̂
on particle wave functions ψ(r, t) as
T̂ ψ(r, t) = tψ(r, t)
According to our postulate 1.2, the (assumed) existence of such an observable
implies the existence of states (eigenstates of T̂ ) in which time acquires a definite
fixed value t0 . The wave function of the particle in such a state is zero at all
times except in a (small neighborhood of the) time t0 . Physically this means that
the particle was created spontaneously out of nothing, existed for a short
time interval around t0 and then disappeared. Such states violate all kinds
of conservation laws, and they are clearly unphysical.
32 Recall that in our definition (see Introduction) observers are “instantaneous”: each
observer is characterized by a definite time label.
17.3.5 Is geometry 4-dimensional?

Our position in this book is that there is no “symmetry” between space and
time coordinates. So, there is no need for a 4-dimensional “background”
continuum of special relativity. All we care about (in both experiment and
in theory) are particle observables (e.g., positions and momenta) and how
they transform with respect to inertial transformations (e.g., time transla-
tions and boosts) of observers. Particle observables are given by Hermitian
operators in the Hilbert space of the system. Inertial transformations enter
the theory through the unitary representation of the Poincaré group in the
same Hilbert space. Once these ingredients are known, one can calculate
the effect of any transformation on any observable. To do that, there is no
need to make assumptions about the “symmetry” between space and time
coordinates and to introduce a 4-dimensional spacetime geometry. The clear
evidence for non-universal, non-geometrical and interaction-dependent char-
acter of boost transformations was obtained in section 17.2. So, we suggest
that 4D Minkowski space-time should not be used in physical theories at all.
Likewise, there is no basis for the special-relativistic classification of phys-
ical quantities into 4-scalars, 4-vectors, 4-tensors, etc. In reality, boost trans-
formations of observables in interacting systems may have quite complicated
interaction-dependent forms that deviate from universal linear Lorentz for-
mulas.
Historical and philosophical discussion of the idea that relativistic effects
(such as length contraction and time dilation) result from the dynamical
behavior of individual physical systems rather than from kinematical prop-
erties of the universal “space-time continuum” can be found in the book
[Bro05]. In our work we go further and claim that the difference between
“dynamical” and “kinematical” approaches is not just a philosophical one. It
has real observable consequences. We have shown in section 17.2 that boost
transformations are interaction-dependent and that they cannot be reduced
to simple universal Lorentz formulas or “pseudo-rotations” in the Minkowski
space-time. Then the ideas of the universal pseudo-Euclidean space-time
continuum and of the manifest covariance of physical laws33 can be accepted
only as approximations. Additional physical arguments against the notion
of the Minkowski space-time can be found in [Bac04].
We cannot deny that the Minkowski space-time idea turned out to be very
fruitful in the formalism of quantum field theory. However, in section 17.4
we will take a viewpoint that quantum fields are just formal mathematical
objects and that the 4-dimensional manifold on which the fields are defined
has nothing to do with real physical space and time.
33 see Assertions I.1 and I.2
17.3.6 “Dynamical” relativity

From the dynamical character of boosts advocated in this book one can pre-
dict some curious effects which, nevertheless, do not contradict any experi-
mental observations. For example, our approach implies that two measuring
rods made from different materials (e.g., wood and tungsten) may contract
in slightly different ways when viewed from a moving frame of reference. An-
other consequence is that Einstein’s time dilation formula (I.21) may not be
accurate. See section 13.3.
The most significant difference between our approach and special rela-
tivity concerns the effect of boosts on interacting systems. Let us see how
an isolated system is seen by time-translated and boosted observers. In our
cartoon 17.3 we placed images of the same system on the plane t − θ, where
t is the time parameter of the observer and θ is its rapidity. Our approach
and special relativity agree about the effect of time translations: As the time
parameter increases, the system may undergo some dramatic changes (e.g.,
an explosion) caused by internal forces acting in the system. These changes
result from the presence of interaction V in the Hamiltonian (the generator
of time translations H = H0 + V ) that describes the system.
The fundamental disagreement is about the effects of boosts. From the
point of view of special relativity, the boosted observer can see only simple
kinematical changes in the system. They include the change of the system’s
velocity and relativistic contraction. These effects also take place in our
approach. However, in addition to them, we expect non-trivial changes,
which result from the presence of the interaction Z in the generator of boost
transformations K = K0 + Z. For example, it is quite possible that for
sufficiently high boost parameters θ the system may look completely different,
e.g., exploded (the image in the upper left corner of Fig. 17.3). For this
reason our approach can be characterized as dynamical relativity in contrast
to kinematical relativity of Einstein’s special theory.
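The split between kinematical and dynamical effects of a boost can be made explicit at first order in θ. The LaTeX sketch below is our own restatement of the pattern of formulas (17.55) - (17.56) at t′ = 0, applied to a generic observable F for a boost along the x-axis. The symbols K_{0x} and Z_x denote the x-components of K0 and Z and are introduced here only for illustration.

```latex
\begin{align*}
F(\theta) &\approx F - c\theta\,[F, K_x]_P \\
          &= F \underbrace{{}- c\theta\,[F, K_{0x}]_P}_{\text{kinematical part}}
               \underbrace{{}- c\theta\,[F, Z_x]_P}_{\text{dynamical part}}
\end{align*}
```

The first bracket reproduces the universal Lorentz formulas for free particles (Theorem 17.1). The second bracket vanishes only if Z = 0 or if F has a vanishing Poisson bracket with Z, for example for total observables of the system.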
17.3.7 Does action-at-a-distance violate causality?

We saw in chapters 12 and 15 that RQD describes interactions between
particles in terms of instantaneous potentials. However, textbooks teach us
that interactions cannot propagate faster than light:
In non-relativistic quantum mechanics, it is straightforward to
construct Hamiltonians which describe particles interacting via
long-range forces (for a simple example, consider two charged par-
ticles interacting via a Coulomb force). However, the concept of a
long-range interaction prima facie requires some sort of preferred
reference frame, which seems to cast doubt upon the possibility
of constructing such an interaction in a relativistically covariant
way. D. Wallace [Wal01]

Figure 17.3: Non-trivial dynamics of an isolated interacting system as a
function of time (t) and boost parameter (θ). Three images on the t-axis
illustrate the usual time sequence of events associated with a piece of explo-
sive and an attached burning fuse. Three images on the θ axis show how
the unexploded device is perceived by a moving observer. If the observer’s
velocity (v = c tanh θ) is low, then the observer sees a moving (unexploded)
device whose length is contracted along the velocity’s direction. This trivial
(kinematical) change is predicted also by special relativity. At higher speeds
the moving observer may notice more significant (dynamical) changes, e.g.,
the device may be seen as exploded. Such non-trivial changes result from
the interaction-dependence of boost generators predicted by RQD.
The traditional viewpoint is that interactions between particles ought to be

retarded, i.e., they should propagate with the speed of light. The usual ar-
gument in favor of this hypothesis is the observation that faster-than-light
interactions violate the special relativistic ban on superluminal signals.³⁴ If
one accepts the validity of this ban, then logically there is no other choice,
but to accept also a field-based approach, rather than the picture of directly
interacting particles advocated in this book. Indeed, interactions are al-
ways accompanied by redistribution of the momentum and energy between
particles. If we assume that interactions are retarded, then the transferred
momentum-energy must exist in some form while en route from one par-
ticle to another. This implies existence of some interaction carriers and
corresponding degrees of freedom not directly related to particle observables.
These degrees of freedom are usually associated with fields, e.g., the electro-
magnetic field of Maxwell’s theory. In other words
...the interaction is a result of energy momentum exchanges be-

tween the particles through the field, which propagates energy and
momentum and can transfer them to the particles by contact. F.
Strocchi [Str04]
The field concept came to dominate physics starting with the work
of Faraday in the mid-nineteenth century. Its conceptual advantage
over the earlier Newtonian program of physics, to formulate
the fundamental laws in terms of forces among atomic particles,
emerges when we take into account the circumstance, unknown to
Newton (or, for that matter, Faraday) but fundamental in special
relativity, that influences travel no farther than a finite limiting
speed. For then the force on a given particle at a given time
cannot be deduced from the positions of other particles at that
time, but must be deduced in a complicated way from their previous
positions. Faraday’s intuition that the fundamental laws
of electromagnetism could be expressed most simply in terms of
fields filling space and time was of course brilliantly vindicated in
Maxwell’s mathematical theory. F. Wilczek [Wil99]

³⁴ See Appendix I.5.

In this subsection we will argue against this traditional logic. Our point
is that if the dynamical character of boosts is properly taken into account,
then instantaneous action-at-a-distance does not contradict the principle of
causality in any reference frame.
Let us now consider two particles interacting via instantaneous action-
at-a-distance potentials. By definition, the momentum is being exchanged
between the two particles. In RQD there are no interaction carriers or inter-
mediate fields or extra degrees of freedom where the transferred momentum
could be stored. Therefore, when particle 1 loses some part of its momen-
tum, particle 2 instantaneously acquires the same amount of momentum.
Otherwise the momentum conservation law would be violated. So, in RQD
the instantaneous character of interactions is not an approximation, but a
necessity.
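The bookkeeping behind this argument can be illustrated with a deliberately simplified model. The sketch below is an assumption-laden toy, not the dressed particle Hamiltonian of RQD: it is classical, non-relativistic and one-dimensional, with a hypothetical repulsive potential V = g/|x2 − x1|. Because the two forces are evaluated from the positions at the same instant and are equal and opposite, the total momentum stays constant at every step while the individual momenta change.

```python
def pair_force(x1: float, x2: float, g: float = 1.0) -> float:
    """Force on particle 2 for the toy potential V = g / |x2 - x1|."""
    d = x2 - x1
    return g * d / abs(d) ** 3


def simulate(steps: int = 2000, dt: float = 1e-3):
    """Velocity-Verlet integration; returns the list of (p1, p2) after each step."""
    m1, m2 = 1.0, 2.0          # hypothetical masses
    x1, x2 = -1.0, 1.0         # initial positions
    p1, p2 = 0.5, -0.3         # initial momenta, total momentum 0.2
    history = []
    for _ in range(steps):
        f2 = pair_force(x1, x2)    # force on particle 2 at this instant
        f1 = -f2                   # equal and opposite force on particle 1
        p1 += 0.5 * dt * f1
        p2 += 0.5 * dt * f2
        x1 += dt * p1 / m1
        x2 += dt * p2 / m2
        f2 = pair_force(x1, x2)    # forces re-evaluated at the new positions
        f1 = -f2
        p1 += 0.5 * dt * f1
        p2 += 0.5 * dt * f2
        history.append((p1, p2))
    return history


final_p1, final_p2 = simulate()[-1]
print(f"p1={final_p1:.6f}  p2={final_p2:.6f}  total={final_p1 + final_p2:.6f}")
```

If the force on particle 2 were instead computed from a delayed position of particle 1, the two momentum increments would no longer cancel, and the difference would have to be stored somewhere else; that is the role usually given to the field.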
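The dressed particle Hamiltonian in the 2-particle sector of the Fock
space is a function of positions and momenta of the two particles,
$H^d = H^d(\mathbf{r}_1, \mathbf{p}_1, \mathbf{r}_2, \mathbf{p}_2)$.
Particle trajectories can be obtained from equation (17.31)
\[
\begin{aligned}
\mathbf{r}_1(t) &= e^{\frac{i}{\hbar} H^d t}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar} H^d t} \\
\mathbf{p}_1(t) &= e^{\frac{i}{\hbar} H^d t}\, \mathbf{p}_1\, e^{-\frac{i}{\hbar} H^d t} \\
\mathbf{r}_2(t) &= e^{\frac{i}{\hbar} H^d t}\, \mathbf{r}_2\, e^{-\frac{i}{\hbar} H^d t} \\
\mathbf{p}_2(t) &= e^{\frac{i}{\hbar} H^d t}\, \mathbf{p}_2\, e^{-\frac{i}{\hbar} H^d t}
\end{aligned}
\]
and the force³⁵ acting on the particle 2
\[
\begin{aligned}
\mathbf{f}_2(t) &= \frac{d}{dt} \mathbf{p}_2(t) = -\frac{i}{\hbar} [\mathbf{p}_2(t), H^d] \\
&\equiv \mathbf{f}_2(\mathbf{r}_1(t), \mathbf{p}_1(t); \mathbf{r}_2(t), \mathbf{p}_2(t))
\end{aligned}
\tag{17.64}
\]
depends on positions and momenta of both particles at the same time instant
t. Thus, interaction propagates instantaneously in the reference frame O.

³⁵ The same conclusions can be reached with the alternative definition of force
$\mathbf{f} = m\, d^2\mathbf{r}/dt^2$, as in subsection 15.1.3.
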
The impossibility of superluminal signals is usually “proven” by applying
Lorentz transformations to space-time coordinates of two causally related
events and claiming that there exists a moving frame in which the temporal
order of these events is reversed.³⁶ However, we know from subsection 17.2.9
that for systems with interactions Lorentz transformations are no longer
exact. So, the special-relativistic ban on superluminal propagation of
interactions may not be valid either.
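Now consider the above two-particle system from the point of view of a
moving reference frame O′. Trajectories of particles 1 and 2 in this frame
are³⁷
\[
\begin{aligned}
\mathbf{r}_1(\theta, t') &= e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, e^{\frac{i}{\hbar} H^d t'}\, \mathbf{r}_1\, e^{-\frac{i}{\hbar} H^d t'}\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
\mathbf{p}_1(\theta, t') &= e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, e^{\frac{i}{\hbar} H^d t'}\, \mathbf{p}_1\, e^{-\frac{i}{\hbar} H^d t'}\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
\mathbf{r}_2(\theta, t') &= e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, e^{\frac{i}{\hbar} H^d t'}\, \mathbf{r}_2\, e^{-\frac{i}{\hbar} H^d t'}\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
\mathbf{p}_2(\theta, t') &= e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, e^{\frac{i}{\hbar} H^d t'}\, \mathbf{p}_2\, e^{-\frac{i}{\hbar} H^d t'}\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}
\end{aligned}
\]
The Hamiltonian in the reference frame O′ is
\[
H^d(\theta) = e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, H^d\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}
\tag{17.65}
\]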
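therefore the force acting on the particle 2 in this frame
\[
\begin{aligned}
\mathbf{f}_2(\theta, t') &= \frac{d}{dt'} \mathbf{p}_2(\theta, t') = -\frac{i}{\hbar} [\mathbf{p}_2(\theta, t'), H^d(\theta)] \\
&= -\frac{i}{\hbar} \left[ e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} e^{\frac{i}{\hbar} H^d t'} \mathbf{p}_2 e^{-\frac{i}{\hbar} H^d t'} e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}},\; e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} H^d e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \right] \\
&= -\frac{i}{\hbar}\, e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \left[ e^{\frac{i}{\hbar} H^d t'} \mathbf{p}_2 e^{-\frac{i}{\hbar} H^d t'},\; H^d \right] e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
&= -\frac{i}{\hbar}\, e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, [\mathbf{p}_2(0, t'), H^d]\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}
 = e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, \mathbf{f}_2(0, t')\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
&= e^{-\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}}\, \mathbf{f}_2(\mathbf{r}_1(0, t'), \mathbf{p}_1(0, t'); \mathbf{r}_2(0, t'), \mathbf{p}_2(0, t'))\, e^{\frac{ic}{\hbar} \mathbf{K}^d \cdot \vec{\theta}} \\
&= \mathbf{f}_2(\mathbf{r}_1(\theta, t'), \mathbf{p}_1(\theta, t'); \mathbf{r}_2(\theta, t'), \mathbf{p}_2(\theta, t'))
\end{aligned}
\tag{17.66}
\]
is a function of positions and momenta of both particles at the same time
instant t′. Moreover, in agreement with the principle of relativity, this function
f2 has exactly the same form as in the reference frame at rest (17.64).
Therefore, for the moving observer O′ the interaction propagates instantaneously
as for the observer at rest O.

³⁶ See Appendix I.5.
³⁷ Here t′ is time measured by the clock of observer O′; θ is the rapidity of this observer.
$\mathbf{K}^d$ is the dressed boost operator introduced in subsection 11.1.9.
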
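The step in (17.66) that pulls the boost exponentials out of the commutator relies on the identity [U A U⁻¹, U B U⁻¹] = U [A, B] U⁻¹, valid for any unitary U. A minimal numerical check is sketched below; the 2×2 matrices P, H and U are arbitrary stand-ins chosen for illustration (they are not the actual operators p2, H^d or the boost exponential), and all helper names are hypothetical.

```python
import cmath
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def conjugate_by(u, a):
    """Similarity transformation u a u^dagger (u is assumed unitary)."""
    return matmul(matmul(u, a), dagger(u))


def max_abs_difference(a, b):
    """Largest element-wise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def toy_unitary(phi, alpha):
    """A two-parameter family of 2x2 unitary matrices."""
    c, s = math.cos(phi), math.sin(phi)
    return [[cmath.exp(1j * alpha) * c, -s], [s, cmath.exp(-1j * alpha) * c]]


P = [[0, 1], [1, 0]]            # stand-in for the momentum operator
H = [[1, 0.5], [0.5, -1]]       # stand-in for the Hamiltonian
U = toy_unitary(0.7, 0.3)       # stand-in for the boost exponential

lhs = commutator(conjugate_by(U, P), conjugate_by(U, H))
rhs = conjugate_by(U, commutator(P, H))
print("max deviation:", max_abs_difference(lhs, rhs))
```

The same identity is what makes the functional form of the force identical in both frames: every operator in the expression is transformed by the same U.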
If an inter-particle potential is used to transmit information between two
events A (the cause) and B (the effect), then in the reference frame at rest
these two events are simultaneous. Then, due to the interaction dependence
of $\mathbf{K}^d$, these two events remain simultaneous in all frames, so instantaneous
potentials do not contradict causality.
The above arguments remain valid for any system of N particles inter-
acting via Poincaré invariant action-at-a-distance potentials.
17.4 Are quantum fields necessary?

The general idea of RQD is that particles are the most fundamental ingredi-
ents of nature and that everything we know in physics can be explained as
manifestations of quantum behavior of particles interacting with each other
at a distance. If this idea is correct, then the notion of fields becomes
redundant. On the other hand, it is also true that (quantum) fields are at the
center of all modern relativistic quantum theories, and we actually started
our formulation of RQD from the quantum field version of QED in section
9.1. This surely looks like a contradiction. Then we are pressed to answer the
following question: what is the role of quantum fields in relativistic quantum
theory?
17.4.1 Dressing transformation in a nutshell

Before discussing the meaning of quantum fields, let us review the process
by which we arrived at the finite dressed particle Hamiltonian
$H^d = H_0 + V^d$ in sections 12.1 and 14.2. We started with the QED Hamiltonian
$H = H_0 + V$ in subsection 9.1.2 (the upper left box in Fig. 17.4) and
demonstrated some of its good properties, such as the Poincaré invariance and the
cluster separability. However, when we used this Hamiltonian to calculate
the S-operator beyond the lowest non-vanishing perturbation order (arrow
(1) in Fig. 17.4) we obtained meaningless infinite results. The solution to this
problem was given by renormalization theory in chapter 10 (arrow (2)):
infinite counterterms were added to the Hamiltonian $H$, and a new Hamiltonian
$H^c = H_0 + V^c$ was obtained. Although the Hamiltonian $H^c$ was infinite, these
infinities canceled in the process of calculation of the S-operator (dashed
arrow (3)), and very accurate values for observable scattering cross-sections
and energies of bound states were obtained (arrow (4)). As a result of the
renormalization procedure, the divergences were “swept under the rug,” and
this rug was the Hamiltonian $H^c$. This Hamiltonian was not satisfactory:
First, in the limit of infinite cutoff, matrix elements of $H^c$ on bare particle
states were infinite. Second, the Hamiltonian $H^c$ contained unphysical terms
like $a^\dagger b^\dagger c^\dagger$ and $a^\dagger c^\dagger a$, which implied that in the course of time evolution the
(bare) vacuum state and (bare) one-electron states rapidly dissociated into
complex linear combinations of multiparticle states.³⁸ Therefore, $H^c$ could
not be used to describe dynamics of interacting particles. To solve this
problem, we applied a unitary dressing transformation to the Hamiltonian $H^c$
(arrow (5)) and obtained a new “dressed particle” Hamiltonian
\[
H^d = e^{i\Phi} H^c e^{-i\Phi}
\tag{17.67}
\]
We managed to select the unitary transformation $e^{i\Phi}$ so that all infinities
from $H^c$ were canceled out.³⁹ In addition, the Poincaré invariance and cluster
separability of the theory remained intact, and the S-operator computed with
the dressed particle Hamiltonian $H^d$ was exactly the same as the accurate
S-operator of the renormalized QED (arrow (6)).
The Hamiltonian $H^d$ of RQD has a number of advantages over the
Hamiltonian $H^c$ of QED. Unlike “trilinear” interactions in $H^c$, all terms in $H^d$ have
very clear and direct physical meaning and correspond to real observable
physical processes (see Table 11.1). Both Hamiltonians $H^c$ and $H^d$ can be
used to calculate scattering amplitudes and energies of bound states.
However, only with $H^d$ can one do that without regularization, renormalization,
and other tricks. Only $H^d$ can describe the time evolution in a simple and
straightforward way (arrow (7)). It is also important that our “quantum
theory of dressed particles” (which is based on the Hamiltonian $H^d$) is
conceptually much simpler than the “quantum theory of fields” (which is based
on the Hamiltonian $H^c$). RQD is similar to ordinary quantum mechanics:
states are described by normalized wave functions, the time evolution
and scattering amplitudes are governed by a finite well-defined Hamiltonian,
the stationary states and their energies can be found by diagonalizing this
Hamiltonian. The only significant difference between RQD and conventional
quantum mechanics is that in RQD the number of particles is not conserved:
particle creation and annihilation can be adequately described.

³⁸ Although the divergences in the Hamiltonian $H^c$ can be avoided by the “similarity
renormalization” approach [GW93, Gla97, Wal98], the problem of unphysical time
evolution (i.e., the instability of bare particles) persists in all current formulations of QED that
do not use dressing.
³⁹ One can say that our approach has swept the divergences under another rug. This
time the rug is the phase Φ of the transformation operator $e^{i\Phi}$. This operator has no
physical meaning, so there is no harm in choosing it infinite.

[Figure 17.4 (flowchart, image not reproduced). Boxes: Hamiltonian $H = H_0 + V$ (finite); Hamiltonian with counterterms $H^c = H_0 + V^c$ (infinite); dressed particle Hamiltonian $H^d = e^{i\Phi} H^c e^{-i\Phi}$ (finite). Arrows: (1) $S = S(V)$, leading to an infinite (wrong) S-operator; (2) renormalization; (3) $S = S(V^c)$ and (6) $S = S(V^d)$, both leading to the finite, accurate S-operator $S^c$; (4) observable scattering properties; (5) dressing transformation; (7) dynamics.]
Figure 17.4: The logic of construction of the dressed particle Hamiltonian
$H^d = H_0 + V^d$. $S(V)$ is the perturbation formula (7.20) that allows one to
calculate the S-operator from the known interaction Hamiltonian $V$.

The above derivation of the dressed particle Hamiltonian $H^d$ involved a
sequence of dubious steps: “canonical gauge field quantization → renormal-
ization → dressing”. Are these steps inevitable ingredients of a realistic phys-
ical theory? Is nature meant to be that complicated? Our answer to these
questions is “no.” Apparently, the “first principles” used in constructions of
traditional relativistic quantum field theories (local fields, gauge invariance,
etc.) are not fundamental.⁴⁰ Otherwise, we would not need such a painful
procedure, involving infinities and their cancelations, to derive a satisfactory
dressed particle Hamiltonian. We believe that it should be possible to build
a fully consistent relativistic quantum theory without ever invoking quantum
fields. Unfortunately, this goal has not been achieved yet, and we must rely
on quantum fields and on the messy renormalization and dressing procedures
to arrive at an acceptable theory of physical particles.
17.4.2 What was the reason for having quantum fields?

In a nutshell, the traditional idea of quantum fields is that particles that
we observe in experiments – photons, electrons, protons, etc. – are not
the fundamental ingredients of nature. Allegedly, the most fundamental
ingredients are fields. For each kind of particle, there exists a corresponding
field – a continuous all-penetrating “substance” that extends all over the
universe. Dyson called it “a single fluid which fills the whole of space-time”
[Dys51]. The fields are present even in situations when there are no particles,
i.e., in the vacuum. The fields cannot be measured or observed by themselves.
We can only see their excitations in the form of small bundles of energy and
momentum that we recognize as particles. Photons are excitations of the
photon field; electrons and positrons are two kinds of excitations of the Dirac
electron-positron field, etc.

In this book we adopted a different attitude toward quantum fields. Our
viewpoint is that quantum fields are not the fundamental ingredients of
nature. They are formal mathematical objects (linear combinations of
particle creation and annihilation operators) which happen to be rather
helpful in constructing relativistic quantum theories of interacting particles.
However, it is not necessary to assign any physical significance to quantum
fields themselves.⁴¹
If (as usually suggested) fields are important ingredients of physical re-
ality, then we should be able to measure them. However, the things that
are measured in physical experiments are intimately related to particles and
their properties, not to fields. For example, we can measure (expectation
values of) positions, momenta, velocities, angular momenta, and energies of
particles as functions of time (= trajectories). In interacting systems of par-
ticles one can probe the energies of bound states and their wave functions. A
wealth of information can be obtained by studying the connections between
values of particle observables before and after their collisions (the S-matrix).
All these measurements have a transparent and natural description in the
language of particles and operators of their observables.
On the other hand, field properties (field values at points, their space and
time derivatives, etc.) are not directly observable. Fermion quantum fields
are not Hermitian operators, so that, even formally, they cannot correspond
to quantum mechanical observables. Even for the electric and magnetic fields
of classical electrodynamics their direct measurability is very questionable.
When we say that we have “measured the electric field” at a certain point
in space, we have actually placed a test charge at that point and measured
the force exerted on this charge by surrounding charges. Nobody has ever
measured electric and magnetic fields themselves.
⁴¹ It should be noted that in non-relativistic (e.g., condensed matter) physics, quantum
fields may have perfectly valid physical meaning. However, in these cases the field
description is approximate and works only in the low-energy long-distance limit. For example, the
quantum field description of crystal vibrations is applicable when the wavelength is much
greater than the inter-atomic distance. The excitations of the crystal elastic field give rise
to (pseudo-)particles called phonons. The concept of renormalization also makes perfect
sense in these systems. For example, the polaron (a conduction band electron interacting
with lattice vibrations) has renormalized mass that is different from the effective mass of
the “free” conduction band electron in a “frozen” lattice. In this book we are discussing
only fundamental relativistic quantum fields for which the above relationships between
quantum fields and underlying small-scale physics do not apply.
17.4.3 Quantum fields and space-time

The formal character of quantum fields is clear also from the fact that their
arguments t and x have no relationship to measurable times and positions.
The variable t is a parameter, which we used in (8.52) to describe the
“t-dependence” of regular operators generated by the non-interacting
Hamiltonian $H_0$. As we explained in subsection 7.1.2, this t-dependence has no
relationship to the observable time dependence of physical quantities, but is
rather introduced as an aid to calculations. The three variables x are just coordinates
in an abstract Minkowski space-time, and they should not be confused with
physical positions of particles.
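Arguments x of the fields should not be regarded as eigenvalues of the
Newton-Wigner position operator. This can be seen from the simplest
example of the scalar field taken at time t = 0 ⁴²
\[
\begin{aligned}
\psi(\mathbf{x}, 0) &= \psi^+(\mathbf{x}, 0) + \psi^-(\mathbf{x}, 0) \\
&= \int \frac{d\mathbf{p}}{\sqrt{2(2\pi\hbar)\omega_{\mathbf{p}}}}\, e^{\frac{i}{\hbar}(\mathbf{p} \cdot \mathbf{x})}\, \alpha_{\mathbf{p}}
 + \int \frac{d\mathbf{p}}{\sqrt{2(2\pi\hbar)\omega_{\mathbf{p}}}}\, e^{-\frac{i}{\hbar}(\mathbf{p} \cdot \mathbf{x})}\, \alpha^\dagger_{\mathbf{p}}
\end{aligned}
\]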
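The annihilation part $\psi^+(\mathbf{x}, 0)$ of this expression cannot be regarded as an
operator annihilating a particle at the space point x, and operator $\psi^-(\mathbf{x}, 0)$
does not create the particle at point x. The correct expressions for operators
creating and annihilating one electron with Newton-Wigner position x and
spin projection σ can be obtained using formulas from subsection 5.2.3 ⁴³
\[
\alpha_{\mathbf{x}} = \frac{1}{(2\pi\hbar)^{3/2}} \int d\mathbf{p}\, e^{\frac{i}{\hbar}(\mathbf{p} \cdot \mathbf{x})}\, \alpha_{\mathbf{p}}
\tag{17.68}
\]
\[
\alpha^\dagger_{\mathbf{x}} = \frac{1}{(2\pi\hbar)^{3/2}} \int d\mathbf{p}\, e^{-\frac{i}{\hbar}(\mathbf{p} \cdot \mathbf{x})}\, \alpha^\dagger_{\mathbf{p}}
\tag{17.69}
\]
Likewise, the product $\psi^-(\mathbf{x}, 0)\psi^+(\mathbf{x}, 0)$ cannot be interpreted as the spatial
density of particles, but the product $\alpha^\dagger_{\mathbf{x}} \alpha_{\mathbf{x}}$ has exactly this interpretation
[WWS+ 12].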
As can be seen from formulas for scattering operators in subsection 7.1.2
and from equations (9.9) - (9.10), the parameters t and x are just integration
variables, and they are not present in the final expression for the fundamental
measurable quantity calculated in QFT - the S-matrix.

⁴² See section 5.2 in [Wei95].
⁴³ Note the absence of the denominator $\sqrt{\omega_{\mathbf{p}}}$ under the integral.

We certainly agree with the following two quotes:
Every physicist would easily convince himself that all quantum cal-
culations are made in the energy-momentum space and that the
Minkowski xµ are just dummy variables without physical mean-
ing (although almost all textbooks insist on the fact that these
variables are not related with position, they use them to express
locality of interactions!) H. Bacry [Bac89]
It is important to note that the x and t that appear in the quan-

tized field A(x, t) are not quantum-mechanical variables but just
parameters on which the field operator depends. In particular, x
and t should not be regarded as the space-time coordinates of the
photon. J. Sakurai [Sak67]
So, we arrive at the conclusion that quantum fields ψ(x, t) are simply
formal linear combinations of particle creation and annihilation operators.
Their arguments t and x are some dummy variables, which are not related
to temporal and spatial properties of the physical system. Quantum fields
should not be regarded as “generalized” or “second quantized” versions of
wave functions. Their role is more technical than fundamental: They pro-
vide convenient “building blocks” for the construction of Poincaré invariant
operators of potential energy V (9.13) - (9.14) and potential boost Z (9.16)
in the Fock space. That’s all there is to quantum fields.
It seems appropriate to end this section with the following quote from
Mermin
But what is the ontological status of those quantum fields that

quantum field theory describes? Does reality consist of a four-
dimensional spacetime at every point of which there is a collection
of operators on an infinite-dimensional Hilbert space? ... But I
hope you will agree that you are not a continuous field of operators
on an infinite-dimensional Hilbert space. Nor, for that matter, is
the page you are reading or the chair you are sitting in. Quantum
fields are useful mathematical tools. They enable us to calculate
things. N. D. Mermin [Mer09]
Chapter 18
CONCLUSIONS
Don’t worry about people stealing your ideas. If your ideas are
any good, you’ll have to ram them down people’s throats.
Howard Aiken
In this book we presented a new relativistic quantum theory of interac-

tions. Our approach is based on two claims that disagree with traditional
textbook theories:
1. The primary constituents of matter are particles. These particles (elec-

trons, protons, photons, etc.) obey the rules of quantum mechanics and
interact with each other via position- and velocity-dependent instan-
taneous potentials. Potentials that change the number of particles are
allowed as well.
2. The dynamical character of boosts. Perception of the system by a moving
observer is different from that predicted by Einstein’s special relativity.
In addition to universal special-relativistic effects, such as length
contraction and time dilation, we predict other phenomena whose ex-
act nature and magnitude depend on the composition of the observed
system and on interactions acting there.
Our first claim about the primary role of particles contradicts the funda-
mental assumptions of such field-based approaches as quantum field theory
and Maxwell’s electrodynamics. We agree that quantum fields are useful

mathematical constructs for building invariant interaction operators and cal-
culating scattering amplitudes. However, for solving more general problems
that include the time evolution and bound state properties, one is advised
to switch to the dressed particle representation, which, incidentally, solves
the problem of ultraviolet divergences. In the classical limit, the Hamilto-
nian theory of particles interacting via instantaneous potentials is a viable
alternative to the traditional Maxwell’s electrodynamics.
In the majority of experimental situations, predictions of our theory are
either the same as in old approaches or the differences are too small to be
measurable by modern techniques. So, experimental confirmation of RQD
is rather challenging. The most compelling experimental evidence in favor
of RQD is the observation of superluminal propagation of electromagnetic forces
discussed in chapter 16.
The most common argument against instantaneous interactions uses the
special-relativistic ban on superluminal signal propagation. We explain this
apparent contradiction by invoking our second claim that boost transforma-
tions are dynamical or interaction-dependent. This interaction-dependence
of boosts follows naturally from the well-understood invariance of physical
laws with respect to the Poincaré group. It is well-known that space
translations and rotations of observers are purely kinematical and independent of
interactions. On the other hand, it is also well-known that time translations
induce highly non-trivial interaction-dependent (dynamical) changes in the
observed system. Then, the Poincaré group structure demands that boosts
have a non-trivial interaction-dependent effect as well. This simple observa-
tion has far-reaching consequences. In particular, it implies that universal
Lorentz transformations of special relativity can be rigorously applied only
to non-interacting systems. In the interacting case, the boost transforma-
tions should involve small, but crucially important, system-dependent and
interaction-dependent corrections. Thus, in our approach, the Minkowski
space-time is a non-rigorous, approximate concept.
The validity of special relativity is usually supported by reference to nu-
merous experiments. However, at closer inspection, it appears that the ma-
jority of these measurements refer either to total observables of compound
systems or to non-interacting particles. In these cases, predictions of our
theory and special relativity are exactly the same. When truly interacting
systems are observed (as in the case of “time dilation” in decays of moving
particles), the differences between the two approaches are extremely small.
Summary:
• Lorentz transformations of special relativity are not exact. Correct

boost transformation laws must depend on the state of the observed
system and on interactions acting there.
• The equivalence between space and time coordinates postulated in spe-

cial (and general) relativity is neither exact nor fundamental. The 4-
dimensional Minkowski space-time formalism should not be used for
describing interacting relativistic systems.
• Interactions between particles propagate instantaneously. This does

not violate the principle of causality.
• Fields (either quantum or classical) should not be considered as fun-

damental constituents of physical reality. Quantum fields are formal
mathematical constructs, which cannot be observed or measured.
• Classical electrodynamics can be formulated as a theory of directly

interacting particles, where electromagnetic fields (as well as their mo-
mentum and energy) do not play any role.
• The most direct way to confirm RQD experimentally is to measure

the superluminal speed of propagation of bound (evanescent) electric
and/or magnetic “fields”.
Part III
MATHEMATICAL
APPENDICES
Appendix A
Groups and vector spaces
A.1 Groups
A group is a set where a product ab of any two elements a and b is defined.
This product is also an element of the group, and the following conditions
are satisfied:
1. associativity:
(ab)c = a(bc) (A.1)
2. there is a unique unit element e such that for any a
ea = ae = a (A.2)
3. for each element a there is a unique inverse element a−1 such that
aa−1 = a−1 a = e (A.3)
In many cases a group can be described as a set of transformations
preserving certain symmetries. Consider, for example, a square shown in Fig.
A.1(a) and the set of rotations around its center. There are four special
rotations (by the angles 0°, 90°, 180°, −90°) which transform the square into
itself. This set of four elements (see Fig. A.1(b)) is the group of
symmetries of the square. Evidently, 0° is the unit element of the group. The
composition law of rotations leads us to the multiplication table A.1 and the
inversion table A.2 for this simple group.

[Figure A.1 (image not reproduced): panel (b) shows the four rotations 0°, 90°, 180°, −90°.]
Figure A.1: (a) Square; (b) the group of (rotational) symmetries of the
square.

Table A.1: Multiplication table for the symmetry group of the square

         |    0°     90°    180°    −90°
  -------+------------------------------
     0°  |    0°     90°    180°    −90°
    90°  |   90°    180°    −90°      0°
   180°  |  180°    −90°      0°     90°
   −90°  |  −90°      0°     90°    180°

The group considered above is commutative (or Abelian), because ab = ba
for any two elements a and b in the group. However, this property is not
required for a general group. For example, it is easy to see that the group of
rotational symmetries of a cube is not Abelian: A 90° rotation of the cube
about its x-axis followed by a 90° rotation about the y-axis is a transformation
that is different from these two rotations performed in the reverse order.
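The group axioms and the two tables for the square can be checked mechanically. The sketch below is an illustration with hypothetical function names: it represents each rotation by its angle in degrees, composes rotations by adding angles modulo 360°, and then contrasts this Abelian group with two 90° rotations of a cube, written as 3×3 integer matrices, which do not commute.

```python
from itertools import product

ANGLES = (0, 90, 180, -90)  # the four rotational symmetries of the square


def normalize(angle: int) -> int:
    """Map an angle in degrees onto the representative used in Table A.1."""
    a = angle % 360
    return -90 if a == 270 else a


def compose(a: int, b: int) -> int:
    """Group product: perform rotation b, then rotation a (angles add)."""
    return normalize(a + b)


def multiplication_table() -> dict:
    """All products of two elements, keyed by the pair (a, b)."""
    return {(a, b): compose(a, b) for a, b in product(ANGLES, repeat=2)}


def inverse(a: int) -> int:
    """The unique element b with compose(a, b) equal to the unit element 0."""
    return next(b for b in ANGLES if compose(a, b) == 0)


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


# 90-degree rotations of a cube about the x-axis and about the y-axis
RX = ((1, 0, 0), (0, 0, -1), (0, 1, 0))
RY = ((0, 0, 1), (0, 1, 0), (-1, 0, 0))

for a in ANGLES:
    print(a, [compose(a, b) for b in ANGLES], "inverse:", inverse(a))
print("cube rotations commute:", matmul3(RX, RY) == matmul3(RY, RX))
```

The rows printed for the square reproduce the multiplication and inversion tables, while the final line reports that the two cube rotations give different results in different orders.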
A.2 Vector spaces

Table A.2: Inversion table for the symmetry group of the square

  element | inverse element
  --------+----------------
      0°  |     0°
     90°  |   −90°
    180°  |   180°
    −90°  |    90°

A vector space H is a set of objects (called vectors and further denoted by
boldface letters x) with two operations: addition of two vectors and
multiplication of a vector by scalars. In this book we are interested only in vector
spaces whose scalars are either complex (C) or real (R) numbers. If x and y
are two vectors and a and b are two scalars, then
ax + by
is also a vector. A vector space forms an Abelian group with respect to vector
addition. This means commutativity (x + y = y + x), associativity
(x + y) + z = x + (y + z),
existence of the group unity (denoted by 0 and called zero vector )
x+0=0+x=x
and existence of the opposite (additive inverse) element denoted by −x
x + (−x) = 0.
In addition, the following properties are postulated in the vector space:

The associativity of scalar multiplication
a(bx) = (ab)x
The distributivity of scalar sums:
(a + b)x = ax + bx
The distributivity of vector sums:
a(x + y) = ax + ay
The scalar multiplication identity:
1x = x
We leave it to the reader to prove from these axioms the following useful
results for an arbitrary scalar a and a vector x
0x = a0 = 0
(−a)x = a(−x) = −(ax)
ax = 0 ⇒ a = 0 or x = 0
An example of a vector space is the set of all columns of n numbers¹
\[
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]
The sum of two columns is
\[
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
+
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}
\]
The multiplication of a column by a scalar λ is
\[
\lambda \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} \lambda x_1 \\ \lambda x_2 \\ \vdots \\ \lambda x_n \end{bmatrix}
\]

¹ If $x_i$ are real (complex) numbers then this vector space is denoted by $\mathbb{R}^n$ ($\mathbb{C}^n$).

A set of nonzero vectors $\{\mathbf{x}_i\}$ is called linearly independent if from
\[
\sum_i a_i \mathbf{x}_i = \mathbf{0}
\]
it follows that $a_i = 0$ for each $i$. A set of linearly independent vectors $\mathbf{x}_i$ is
called a basis if adding an arbitrary nonzero vector $\mathbf{y}$ to this set makes it no longer
linearly independent. If $\mathbf{x}_i$ is a basis and $\mathbf{y}$ is an arbitrary nonzero vector,
then the equation
\[
a_0 \mathbf{y} + \sum_i a_i \mathbf{x}_i = \mathbf{0}
\]
has a solution in which $a_0 \neq 0$.² This means that we can express an arbitrary
vector $\mathbf{y}$ as a linear combination of basis vectors
\[
\mathbf{y} = -\sum_i \frac{a_i}{a_0} \mathbf{x}_i = \sum_i y_i \mathbf{x}_i
\tag{A.4}
\]
Note that any vector $\mathbf{y}$ has unique components $y_i$ with respect to the basis
$\mathbf{x}_i$. Indeed, suppose we found another set of components $y_i'$, so that
\[
\mathbf{y} = \sum_i y_i' \mathbf{x}_i
\tag{A.5}
\]
Then subtracting (A.5) from (A.4) we obtain
\[
\mathbf{0} = \sum_i (y_i' - y_i) \mathbf{x}_i
\]
and $y_i' = y_i$ since $\mathbf{x}_i$ are linearly independent.

One can choose many different bases in the same vector space. However,
the number of vectors in any basis is the same, and this number is called
the dimension of the vector space V (denoted dim V). The dimension of the
space of n-member columns is n. An example of a basis set in this space is
given by n vectors
\[
\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix},\quad
\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix},\quad
\ldots,\quad
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

² Because otherwise we would have $a_i = 0$ for all $i$, meaning that the full set $\{\mathbf{x}_i, \mathbf{y}\}$ is
linearly independent in disagreement with our assumption.

A linear subspace is a subset of vectors in H which is closed with respect
to addition and multiplication by scalars. For any set of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots$
there is a spanning subspace (or simply span) $\mathrm{Sp}(\mathbf{x}_1, \mathbf{x}_2, \ldots)$ which is the set
of all linear combinations $\sum_i a_i \mathbf{x}_i$ with arbitrary coefficients $a_i$. The span of
a single non-zero vector $\mathrm{Sp}(\mathbf{x})$ is also called a ray.
Appendix B
Delta function and useful integrals
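Dirac's delta function $\delta(x)$ is defined by the property of the integral
$$
\int_{-a}^{a} f(x)\,\delta(x)\,dx = f(0)
$$
where $f(x)$ is any smooth function, and $a > 0$. The delta function can also be defined by its integral representation
$$
\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{\frac{i}{\hbar}ax}\,da = \delta(x)
$$
Another useful property is
$$
\delta(ax) = \frac{1}{|a|}\,\delta(x)
$$
for any real $a \neq 0$. The delta function of a vector argument $\mathbf{r} = (x, y, z)$ is defined as
$$
\delta(\mathbf{r}) = \delta(x)\,\delta(y)\,\delta(z)
$$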
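or
$$
\frac{1}{(2\pi\hbar)^3}\int e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k} = \delta(\mathbf{r}) \tag{B.1}
$$
It has the property
$$
\frac{\partial^2}{\partial \mathbf{r}^2}\,\frac{1}{4\pi r} = -\delta(\mathbf{r}) \tag{B.2}
$$
The step function $\theta(t)$ is defined as
$$
\theta(t) \equiv
\begin{cases}
1, & \text{if } t \ge 0 \\
0, & \text{otherwise}
\end{cases} \tag{B.3}
$$
It has the following integral representation
$$
\theta(t) = -\frac{1}{2\pi i}\int_{-\infty}^{\infty} \frac{e^{-ist}}{s + i\epsilon}\,ds \tag{B.4}
$$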
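Consider the integral¹
$$
\begin{aligned}
\int \frac{d\mathbf{r}}{r}\, e^{\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}}
&= \int_0^{\pi} \sin\theta\, d\theta \int_0^{2\pi} d\varphi \int_0^{\infty} r^2\, dr\, \frac{e^{\frac{i}{\hbar} p r \cos\theta}}{r}
 = 2\pi \int_{-1}^{1} dz \int_0^{\infty} dr\, r\, e^{\frac{i}{\hbar} p r z} \\
&= 2\pi\hbar \int_0^{\infty} r\, dr\, \frac{e^{\frac{i}{\hbar} p r} - e^{-\frac{i}{\hbar} p r}}{i p r}
 = \frac{4\pi\hbar}{p} \int_0^{\infty} dr\, \sin\!\left(\frac{p r}{\hbar}\right) \\
&= \frac{4\pi\hbar^2}{p^2} \int_0^{\infty} d\rho\, \sin(\rho)
 = -\frac{4\pi\hbar^2}{p^2}\left(\cos(\infty) - \cos(0)\right)
 = \frac{4\pi\hbar^2}{p^2}
\end{aligned} \tag{B.5}
$$
¹ In this derivation one can set $\cos(\infty) = 0$ because in applications the plane wave $e^{\frac{i}{\hbar}\mathbf{p}\cdot\mathbf{r}}$ in the integrand does not have infinite extension. Typically it has a smooth damping factor that makes it tend to zero at large values of $r$, so that $\cos(\infty)$ can be effectively taken as zero.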
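The role of the damping factor in the last step of (B.5) can be illustrated numerically. The sketch below is only an illustration under an assumption that is not in the text: the damping is taken to be the exponential $e^{-\epsilon\rho}$, for which $\int_0^\infty \sin(\rho)\, e^{-\epsilon\rho}\, d\rho = 1/(1+\epsilon^2)$. The function name `damped_sine_integral` and the trapezoid rule are illustrative choices.

```python
import math

def damped_sine_integral(eps, steps_per_unit=200):
    """Trapezoid estimate of the integral of sin(rho)*exp(-eps*rho) over [0, 40/eps].

    The exponential damping is an assumed model of the smooth cutoff;
    beyond rho = 40/eps the integrand is below exp(-40) and is neglected.
    """
    upper = 40.0 / eps
    n = int(upper * steps_per_unit)
    h = upper / n
    total = 0.0
    for k in range(n + 1):
        rho = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.sin(rho) * math.exp(-eps * rho)
    return total * h

# As the damping becomes weaker the integral approaches 1,
# which is the value obtained by setting cos(infinity) = 0.
for eps in (0.5, 0.1, 0.05):
    print(eps, damped_sine_integral(eps), 1.0 / (1.0 + eps * eps))
```

For each value of `eps` the two printed numbers should agree closely, and both tend to 1 as `eps` decreases.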
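Next consider the integral
$$
K = \int d\mathbf{x}\, d\mathbf{y}\, \frac{e^{\frac{i}{\hbar}(\mathbf{p}\cdot\mathbf{x} + \mathbf{q}\cdot\mathbf{y})}}{|\mathbf{x} - \mathbf{y}|}
$$
First we change the integration variables
$$
\begin{aligned}
\mathbf{x} &= \tfrac{1}{2}(\mathbf{z} + \mathbf{t}) \\
\mathbf{y} &= \tfrac{1}{2}(\mathbf{z} - \mathbf{t}) \\
\mathbf{x} - \mathbf{y} &= \mathbf{t} \\
\mathbf{x} + \mathbf{y} &= \mathbf{z}
\end{aligned}
$$
The Jacobian of this transformation is
$$
J \equiv \left|\det \frac{\partial(\mathbf{x}, \mathbf{y})}{\partial(\mathbf{z}, \mathbf{t})}\right| = 1/8
$$
Then, using integrals (B.1) and (B.5), we obtain
$$
\begin{aligned}
K &= \frac{1}{8}\int d\mathbf{t}\, d\mathbf{z}\, \frac{e^{\frac{i}{2\hbar}(\mathbf{p}\cdot(\mathbf{z}+\mathbf{t}) + \mathbf{q}\cdot(\mathbf{z}-\mathbf{t}))}}{t}
 = \frac{1}{8}\int d\mathbf{t}\, d\mathbf{z}\, \frac{e^{\frac{i}{2\hbar}(\mathbf{z}\cdot(\mathbf{p}+\mathbf{q}) + \mathbf{t}\cdot(\mathbf{p}-\mathbf{q}))}}{t} \\
&= (2\pi\hbar)^3 \delta(\mathbf{p} + \mathbf{q}) \int d\mathbf{t}\, \frac{e^{\frac{i}{2\hbar}\mathbf{t}\cdot(\mathbf{p}-\mathbf{q})}}{t}
 = \frac{(2\pi\hbar)^6\, \delta(\mathbf{p} + \mathbf{q})}{2\pi^2 \hbar\, p^2}
\end{aligned} \tag{B.6}
$$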
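Other useful integrals are
$$
\int \frac{d\mathbf{k}}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}} = \frac{(2\pi\hbar)^3}{4\pi\hbar^2 r} \tag{B.7}
$$
$$
\int \frac{d\mathbf{k}\,\mathbf{k}}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}
= -i\hbar \frac{\partial}{\partial \mathbf{r}} \int \frac{d\mathbf{k}}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}
= -\frac{i(2\pi\hbar)^3}{4\pi\hbar} \frac{\partial}{\partial \mathbf{r}} \frac{1}{r}
= \frac{i(2\pi\hbar)^3\, \mathbf{r}}{4\pi\hbar r^3} \tag{B.8}
$$
$$
\int \frac{d\mathbf{k}\; \mathbf{q}\cdot[\mathbf{k}\times\mathbf{p}]}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}
= \frac{i(2\pi\hbar)^3\, \mathbf{q}\cdot[\mathbf{r}\times\mathbf{p}]}{4\pi\hbar r^3} \tag{B.9}
$$
$$
\int \frac{d\mathbf{k}\,(\mathbf{q}\cdot\mathbf{k})(\mathbf{p}\cdot\mathbf{k})}{k^4}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}
= \frac{(2\pi\hbar)^3}{8\pi\hbar^2 r}\left[(\mathbf{q}\cdot\mathbf{p}) - \frac{(\mathbf{q}\cdot\mathbf{r})(\mathbf{p}\cdot\mathbf{r})}{r^2}\right] \tag{B.10}
$$
$$
\int \frac{d\mathbf{k}\,(\mathbf{p}\cdot\mathbf{k})(\mathbf{q}\cdot\mathbf{k})}{k^2}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}}
= \frac{(2\pi\hbar)^3}{4\pi r^3}\left[(\mathbf{p}\cdot\mathbf{q}) - 3\frac{(\mathbf{p}\cdot\mathbf{r})(\mathbf{q}\cdot\mathbf{r})}{r^2}\right] + \frac{1}{3}(\mathbf{p}\cdot\mathbf{q})\,\delta(\mathbf{r}) \tag{B.11}
$$
$$
\int \frac{d\mathbf{k}}{k^4}\, e^{\frac{i}{\hbar}\mathbf{k}\cdot\mathbf{r}} = E - \frac{(2\pi\hbar)^3\, r}{8\pi\hbar^4} \tag{B.12}
$$
where $E$ is an infinite constant (see [Wei64a]).
$$
\int d\mathbf{r}\, e^{-a r^2 + \mathbf{b}\cdot\mathbf{r}} = (\pi/a)^{3/2}\, e^{b^2/(4a)} \tag{B.13}
$$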
Lemma B.1 (Riemann-Lebesgue [GR00]) The Fourier image of a smooth function tends to zero at infinity.
When talking about smooth functions in this book we will presume that these
functions are continuous, can be differentiated as many times as needed and
do not contain singularities.
Appendix C
Some lemmas for orthocomplemented lattices.
From the axioms of orthocomplemented lattices¹ one can prove a variety of useful results.
Lemma C.1
$$
z \le x \wedge y \;\Rightarrow\; z \le x \tag{C.1}
$$
Proof. From Postulate 1.8 we have $x \wedge y \le x$, hence $z \le x \wedge y \le x$ and by the transitivity Lemma 1.5 we obtain $z \le x$.
Lemma C.2
$$
x \le y \;\Leftrightarrow\; x \wedge y = x \tag{C.2}
$$
Proof. From $x \le y$ and $x \le x$ it follows by Postulate 1.9 that $x \le x \wedge y$. On the other hand, $x \wedge y \le x$ (1.8). Lemma 1.4 then implies $x \wedge y = x$. The reverse statement follows from Postulate 1.8 written in the form
$$
x \wedge y \le y \tag{C.3}
$$
If $x \wedge y = x$, then we can replace the left hand side of (C.3) with $x$ and obtain the left hand side of (C.2).
¹ They are summarized in Table 1.1 as statements 1.3 - 1.21.
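Lemma C.3 For any proposition $z$
$$
x \le y \;\Rightarrow\; x \wedge z \le y \wedge z \tag{C.4}
$$
Proof. This follows from $x \wedge z \le x \le y$ and $x \wedge z \le z$ by using Postulate 1.9.
One can also prove equations
$$
x \wedge x = x \tag{C.5}
$$
$$
\emptyset \wedge x = \emptyset \tag{C.6}
$$
$$
I \wedge x = x \tag{C.7}
$$
$$
\emptyset^{\perp} = I \tag{C.8}
$$
which are left as an exercise for the reader.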
Proofs of lemmas and theorems for orthocomplemented lattices are facil-
itated by the following observation: Given an expression composed of lattice
elements we can form a dual expression by the following rules:
• 1) change places of ∧ and ∨ signs;
• 2) change the direction of the implication signs ≤;
• 3) change ∅ to I and change I to ∅.
Then it is easy to see that all axioms in Table 1.1 have the property of duality:
Each axiom is either self-dual or its dual is also a valid axiom. Therefore, for
each logical (in)equality, its dual is also a valid (in)equality. For example, by
duality we have from (C.1), (C.2) and (C.4) - (C.8)
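$$
\begin{aligned}
x \vee y \le z \;&\Rightarrow\; x \le z \\
x \le y \;&\Leftrightarrow\; x \vee y = y \\
y \le x \;&\Rightarrow\; y \vee z \le x \vee z \\
x \vee x &= x \\
I \vee x &= I \\
\emptyset \vee x &= x \\
I^{\perp} &= \emptyset
\end{aligned}
$$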
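The lemmas (C.1), (C.2), (C.4) - (C.8) and their duals can be checked by brute force in a small concrete model. The sketch below assumes, purely for illustration, the lattice of all subsets of a three-element set, with meet as intersection, join as union, and orthocomplement as set complement. This is a Boolean (distributive) lattice, so it is a more special object than a general orthocomplemented lattice; all names in the code are hypothetical.

```python
from itertools import combinations

TOP = frozenset({1, 2, 3})   # plays the role of the maximal proposition I
BOTTOM = frozenset()         # plays the role of the minimal proposition

def all_elements():
    """All subsets of TOP, i.e. all elements of the example lattice."""
    items = sorted(TOP)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def meet(x, y):
    return x & y

def join(x, y):
    return x | y

def perp(x):
    return TOP - x

def leq(x, y):
    return x <= y

def lemmas_hold():
    """Check (C.1), (C.2), (C.4)-(C.8) and their duals for every element."""
    elems = all_elements()
    for x in elems:
        # (C.5)-(C.7)
        if not (meet(x, x) == x and meet(BOTTOM, x) == BOTTOM and meet(TOP, x) == x):
            return False
        # duals of (C.5)-(C.7)
        if not (join(x, x) == x and join(TOP, x) == TOP and join(BOTTOM, x) == x):
            return False
        for y in elems:
            # (C.2) and its dual
            if leq(x, y) != (meet(x, y) == x):
                return False
            if leq(x, y) != (join(x, y) == y):
                return False
            for z in elems:
                # (C.1) and its dual
                if leq(z, meet(x, y)) and not leq(z, x):
                    return False
                if leq(join(x, y), z) and not leq(x, z):
                    return False
                # (C.4) and its dual
                if leq(x, y) and not leq(meet(x, z), meet(y, z)):
                    return False
                if leq(y, x) and not leq(join(y, z), join(x, z)):
                    return False
    # (C.8) and its dual
    return perp(BOTTOM) == TOP and perp(TOP) == BOTTOM

print(lemmas_hold())
```

A passing check in one model does not prove the lemmas, but a failure would expose a misprint in any of the formulas.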
Appendix D
Rotation group
D.1 Basics of the 3D space

Let us now consider the familiar 3D position space. This space consists of
points. We can arbitrarily select one such point 0 and call it the origin.
Then we can draw a vector a from the origin to any other point in space.
We can also define a sum of two vectors (by the parallelogram rule as shown
in Fig. D.1) and the multiplication of a vector by a real scalar. There is a
natural definition of the length of a vector |a| (also denoted by a) and the
angle α(a, b) between two vectors a and b. Then the dot product (or scalar
product ) of two vectors is defined by formula
a · b = b · a = ab cos α(a, b) (D.1)

Two non-zero vectors are called perpendicular or orthogonal if their dot prod-
uct is zero.
We can build an orthonormal basis of 3 mutually perpendicular vectors of unit length $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ along the $x$, $y$, and $z$ axes, respectively.¹ Then each vector $\mathbf{a}$ can be represented as a linear combination
$$
\mathbf{a} = a_x \mathbf{i} + a_y \mathbf{j} + a_z \mathbf{k}
$$
or as a column of its components or coordinates²
$$
\mathbf{a} = \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix}
$$
The transposed vector can be represented as a row
$$
\mathbf{a}^T = [a_x, a_y, a_z]
$$
¹ Let us agree that the triple of basis vectors $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ forms a right-handed system as shown in Fig. D.1. Such a system is easy to recognize by the following rule of thumb: If we point a corkscrew in the direction of $\mathbf{k}$ and rotate it in the clockwise direction (from $\mathbf{i}$ to $\mathbf{j}$), then the corkscrew will move in the direction of vector $\mathbf{k}$.

Figure D.1: Some objects in the vector space $\mathbb{R}^3$: the origin $\mathbf{0}$, the basis vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, a sum of two vectors $\mathbf{a} + \mathbf{b}$ via the parallelogram rule.
One can easily verify that the dot product (D.1) can be written in several equivalent forms
$$
\mathbf{b}\cdot\mathbf{a} = \sum_{i=1}^{3} b_i a_i = b_x a_x + b_y a_y + b_z a_z
= [b_x, b_y, b_z] \begin{bmatrix} a_x \\ a_y \\ a_z \end{bmatrix} = \mathbf{b}^T \mathbf{a}
$$
where $\mathbf{b}^T \mathbf{a}$ denotes the usual “row by column” product of the row $\mathbf{b}^T$ and column $\mathbf{a}$.
The length of the vector $\mathbf{a}$ can be written as $a \equiv |\mathbf{a}| = \sqrt{\mathbf{a}\cdot\mathbf{a}} \equiv \sqrt{a^2}$, and the distance between two points (or vectors) $\mathbf{a}$ and $\mathbf{b}$ is defined as $d = |\mathbf{a} - \mathbf{b}|$.
² So, physical space can be identified with the vector space $\mathbb{R}^3$ of all triples of real numbers (see subsection A.2). We will mark vector indices either by letters $x, y, z$ or by numbers 1, 2, 3, as convenient.
D.2 Scalars and vectors

There are two approaches to rotations, as well as to any inertial transfor-
mation: active and passive. An active rotation rotates all objects around
the origin while keeping the orientation of basis vectors. A passive rotation
simply changes the directions of the basis vectors and thus affects only com-
ponents of “real” vectors but not the physical vectors themselves. Unless
noted otherwise, we will use the passive representation of rotations.
We call a quantity A a 3-scalar if it is not affected by rotations. Examples
of scalars are distances and angles.
Let us now find how rotations change the coordinates of vectors in R3 .
By definition, rotations preserve the origin and linear combinations of vec-
tors, so the action of a rotation on a column vector can be represented as
multiplication by a 3 × 3 matrix R
$$
a'_i = \sum_{j=1}^{3} R_{ij} a_j \tag{D.2}
$$
or in the matrix form
$$
\mathbf{a}' = R\mathbf{a} \tag{D.3}
$$
$$
\mathbf{b}'^{T} = (R\mathbf{b})^T = \mathbf{b}^T R^T \tag{D.4}
$$
where $R^T$ denotes the transposed matrix.
D.3 Orthogonal matrices

Since rotations preserve distances and angles, they also preserve the dot
product:
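$$
\mathbf{b}\cdot\mathbf{a} = \mathbf{b}^T \mathbf{a} = (R\mathbf{b})^T (R\mathbf{a}) = \mathbf{b}^T R^T R \mathbf{a} \tag{D.5}
$$
The validity of equation (D.5) for any $\mathbf{a}$ and $\mathbf{b}$ implies that rotation matrices satisfy the condition
$$
R^T R = I \tag{D.6}
$$
where $I$ denotes the unit matrix
$$
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$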
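Multiplying by the inverse matrix $R^{-1}$ from the right, equation (D.6) can also be written as
$$
R^T = R^{-1} \tag{D.7}
$$
This implies a useful property
$$
R\mathbf{b}\cdot\mathbf{a} = \mathbf{b}^T R^T \mathbf{a} = \mathbf{b}^T R^{-1} \mathbf{a} = \mathbf{b}\cdot R^{-1}\mathbf{a} \tag{D.8}
$$
In the coordinate notation, condition (D.6) takes the form
$$
\sum_{j=1}^{3} R^T_{ij} R_{jk} = \sum_{j=1}^{3} R_{ji} R_{jk} = \delta_{ik} \tag{D.9}
$$
where $\delta_{ij}$ is the Kronecker delta symbol
$$
\delta_{ij} =
\begin{cases}
1, & \text{if } i = j \\
0, & \text{if } i \neq j
\end{cases} \tag{D.10}
$$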
Matrices satisfying condition (D.7) are called orthogonal. Thus, any rotation
has a unique representative in the set of orthogonal matrices.
However, not every orthogonal matrix $R$ corresponds to a rotation. To see that, we can write
$$
1 = \det(I) = \det(R^T R) = \det(R^T)\det(R) = (\det(R))^2
$$
which implies that if $R$ is orthogonal then $\det(R) = \pm 1$. Any rotation can be connected by a continuous path with the trivial rotation, which is represented, of course, by the unit matrix with unit determinant. Since continuous transformations cannot abruptly change the determinant from 1 to -1, only matrices with
$$
\det(R) = 1 \tag{D.11}
$$
have a chance to represent rotations.³ We conclude that rotations are in one-to-one correspondence with orthogonal matrices having a unit determinant.
The notion of a vector is more general than just an arrow directed to a point in space. We will call any triple of quantities $\vec{A} = (A_x, A_y, A_z)$ a 3-vector if it transforms under rotations in the same way as vector arrows (D.2).
Let us now derive explicit forms of rotation matrices. Any rotation around the $z$-axis does not change $z$-components of 3-vectors. The most general matrix satisfying this property can be written as
$$
R_z = \begin{bmatrix} a & b & 0 \\ c & d & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
and condition (D.11) translates into $ad - bc = 1$. One can verify directly that the inverse matrix is
$$
R_z^{-1} = \begin{bmatrix} d & -b & 0 \\ -c & a & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
According to the property (D.7) we must have
$$
\begin{aligned}
a &= d \\
b &= -c
\end{aligned}
$$
therefore
$$
R_z = \begin{bmatrix} a & b & 0 \\ -b & a & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
The condition $\det(R_z) = a^2 + b^2 = 1$ implies that matrix $R_z$ depends on one parameter $\phi$ such that $a = \cos\phi$ and $b = \sin\phi$
$$
R_z = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{D.12}
$$
Obviously, parameter $\phi$ is just the rotation angle.⁴ The matrices for rotations around the $x$- and $y$-axes are
$$
R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & \sin\phi \\ 0 & -\sin\phi & \cos\phi \end{bmatrix} \tag{D.13}
$$
and
$$
R_y = \begin{bmatrix} \cos\phi & 0 & -\sin\phi \\ 0 & 1 & 0 \\ \sin\phi & 0 & \cos\phi \end{bmatrix} \tag{D.14}
$$
respectively.
³ Matrices with $\det(R) = -1$ describe rotations coupled with inversion (see subsection 2.2.4).
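The two defining conditions, orthogonality (D.6) and unit determinant (D.11), are easy to check numerically for the matrices (D.12) - (D.14). The following is a minimal sketch in plain Python; the helper names (`rot_x`, `rot_y`, `rot_z`, `is_rotation`) and the tolerance are illustrative choices, not part of the text.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def rot_x(phi):
    """Passive rotation about the x-axis, equation (D.13)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def rot_y(phi):
    """Passive rotation about the y-axis, equation (D.14)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def rot_z(phi):
    """Passive rotation about the z-axis, equation (D.12)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

def is_rotation(r):
    """True if r satisfies R^T R = I (D.6) and det(R) = 1 (D.11)."""
    return close(matmul(transpose(r), r), IDENTITY) and abs(det3(r) - 1.0) < 1e-12

print(is_rotation(rot_z(0.7)), is_rotation(matmul(rot_x(0.3), rot_y(1.1))))
```

The inversion matrix $-I$ is orthogonal but has determinant -1, so `is_rotation` rejects it, in line with footnote 3.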
D.4 Invariant tensors

A tensor of the second rank⁵ $A_{ij}$ is defined as a set of 9 quantities, which depend on two indices and transform as a vector with respect to each index
$$
A'_{ij} = \sum_{kl=1}^{3} R_{ik} R_{jl} A_{kl} \tag{D.15}
$$
Similarly, one can also define tensors of higher rank, e.g., $A_{ijk}$.
⁴ Note that positive values of $\phi$ correspond to a clockwise rotation (from $\mathbf{i}$ to $\mathbf{j}$) of the basis vectors which drives the corkscrew in the positive $z$-direction.
⁵ Scalars and vectors are sometimes called tensors of rank 0 and 1, respectively.
There are two invariant tensors which play a special role because they have the same components independent of the orientation of the basis vectors. The first invariant tensor is the Kronecker delta $\delta_{ij}$.⁶ Its invariance follows from the orthogonality of $R$-matrices (D.9).
$$
\delta'_{ij} = \sum_{kl=1}^{3} R_{ik} R_{jl} \delta_{kl} = \sum_{k=1}^{3} R_{ik} R_{jk} = \delta_{ij}
$$
Another invariant tensor is the Levi-Civita symbol $\epsilon_{ijk}$, which is defined as $\epsilon_{xyz} = \epsilon_{zxy} = \epsilon_{yzx} = -\epsilon_{xzy} = -\epsilon_{yxz} = -\epsilon_{zyx} = 1$, and all other components of $\epsilon_{ijk}$ are zero. We show its invariance by applying an arbitrary rotation $R$ to $\epsilon_{ijk}$. Then
$$
\begin{aligned}
\epsilon'_{ijk} = \sum_{lmn=1}^{3} R_{il} R_{jm} R_{kn} \epsilon_{lmn}
&= R_{i1} R_{j2} R_{k3} + R_{i3} R_{j1} R_{k2} + R_{i2} R_{j3} R_{k1} \\
&\quad - R_{i2} R_{j1} R_{k3} - R_{i3} R_{j2} R_{k1} - R_{i1} R_{j3} R_{k2}
\end{aligned} \tag{D.16}
$$
The right hand side has the following properties:
1. it is equal to zero if any two indices coincide: $i = j$ or $i = k$ or $j = k$;
2. it does not change after cyclic permutation of indices $ijk$;
3. $\epsilon'_{123} = \det(R) = 1$.
These are the same properties as those used to define the Levi-Civita symbol above. So, the right hand side of (D.16) must have the same components as $\epsilon_{ijk}$
$$
\epsilon'_{ijk} = \epsilon_{ijk}
$$
⁶ See equation (D.10).
Using invariant tensors $\delta_{ij}$ and $\epsilon_{ijk}$ we can convert between scalar, vector and tensor quantities, as shown in Table D.1. For example, any antisymmetric 3-tensor has 3 independent components, so it can always be represented as
$$
A_{ij} = \sum_{k=1}^{3} \epsilon_{ijk} V_k
$$
where $V_k$ are components of some 3-vector.

Table D.1: Converting between quantities of different rank using invariant tensors

Scalar $S$ → $S\delta_{ij}$ (tensor)
Scalar $S$ → $S\epsilon_{ijk}$ (antisymmetric tensor)
Vector $V_i$ → $\sum_{k=1}^{3} \epsilon_{ijk} V_k$ (antisymmetric tensor)
Tensor $T_{ij}$ → $\sum_{ij=1}^{3} \delta_{ij} T_{ji}$ (scalar)
Tensor $T_{ij}$ → $\sum_{jk=1}^{3} \epsilon_{ijk} T_{kj}$ (vector)
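Using invariant tensors one can also build a scalar or a vector from two independent vectors $\mathbf{A}$ and $\mathbf{B}$. The scalar is constructed by using the Kronecker delta
$$
\mathbf{A}\cdot\mathbf{B} = \sum_{ij=1}^{3} \delta_{ij} A_i B_j \equiv A_1 B_1 + A_2 B_2 + A_3 B_3
$$
This is the usual dot product (D.1). The vector can be constructed using the Levi-Civita tensor
$$
[\mathbf{A}\times\mathbf{B}]_i = \sum_{jk=1}^{3} \epsilon_{ijk} A_j B_k
$$
This vector is called the cross product (or vector product) of $\mathbf{A}$ and $\mathbf{B}$. It has the following components
$$
\begin{aligned}
[\mathbf{A}\times\mathbf{B}]_x &= A_y B_z - A_z B_y \\
[\mathbf{A}\times\mathbf{B}]_y &= A_z B_x - A_x B_z \\
[\mathbf{A}\times\mathbf{B}]_z &= A_x B_y - A_y B_x
\end{aligned}
$$
and properties
$$
[\mathbf{A}\times\mathbf{B}] = -[\mathbf{B}\times\mathbf{A}]
$$
$$
[\mathbf{A}\times[\mathbf{B}\times\mathbf{C}]] = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}) \tag{D.17}
$$
The mixed product is a scalar which can be built from three vectors with the help of the Levi-Civita invariant tensor
$$
[\mathbf{A}\times\mathbf{B}]\cdot\mathbf{C} = \sum_{ijk=1}^{3} \epsilon_{ijk} A_i B_j C_k
$$
Its properties are
$$
[\mathbf{A}\times\mathbf{B}]\cdot\mathbf{C} = [\mathbf{B}\times\mathbf{C}]\cdot\mathbf{A} = [\mathbf{C}\times\mathbf{A}]\cdot\mathbf{B} \tag{D.18}
$$
$$
[\mathbf{A}\times\mathbf{B}]\cdot\mathbf{B} = 0
$$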
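The index definitions of the cross and mixed products can be turned directly into code. The sketch below uses indices 0, 1, 2 in place of $x, y, z$ and integer vectors so that identities (D.17) and (D.18) can be checked exactly; the closed-form expression used for the Levi-Civita symbol is a convenient shortcut for this index range and is an implementation choice, not something taken from the text.

```python
def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}: +1, -1 or 0."""
    return (i - j) * (j - k) * (k - i) // 2

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def cross(a, b):
    """[A x B]_i = sum over j, k of eps_ijk A_j B_k."""
    return [sum(levi_civita(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3))
            for i in range(3)]

def mixed(a, b, c):
    """[A x B] . C = sum over i, j, k of eps_ijk A_i B_j C_k."""
    return sum(levi_civita(i, j, k) * a[i] * b[j] * c[k]
               for i in range(3) for j in range(3) for k in range(3))

A, B, C = [1, 2, 3], [-4, 5, 6], [7, -8, 9]

# Double cross product, equation (D.17)
lhs = cross(A, cross(B, C))
rhs = [B[i] * dot(A, C) - C[i] * dot(A, B) for i in range(3)]
print(lhs == rhs)

# Cyclic property of the mixed product, equation (D.18)
print(mixed(A, B, C) == mixed(B, C, A) == mixed(C, A, B))
```

Because all arithmetic is done on integers, the comparisons are exact and no tolerance is needed.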
D.5 Vector parameterization of rotations

The matrix representation of rotations (D.2) is useful for describing transformations of vector and tensor components. However, sometimes it is more convenient to characterize a rotation in a more physical way by the rotation axis and the rotation angle. In other words, a rotation can be described by a single vector $\vec{\phi} = \phi_x \mathbf{i} + \phi_y \mathbf{j} + \phi_z \mathbf{k}$, such that its direction represents the axis of the rotation and its length $\phi \equiv |\vec{\phi}|$ represents the angle of the rotation. So we can characterize any rotation by three real numbers $\{\vec{\phi}\} = \{\phi_x, \phi_y, \phi_z\}$.⁷
Let us now make a link between the matrix and vector representations of rotations. First, we find the matrix $R_{\vec{\phi}}$ corresponding to the passive rotation $\{\vec{\phi}\}$. Here it will be convenient to consider the equivalent active rotation by the angle $\{-\vec{\phi}\}$. Each vector $\mathbf{P}$ in $\mathbb{R}^3$ can be decomposed into two parts: $\mathbf{P} = \mathbf{P}_\parallel + \mathbf{P}_\perp$. The first part $\mathbf{P}_\parallel \equiv \left(\mathbf{P}\cdot\frac{\vec{\phi}}{\phi}\right)\frac{\vec{\phi}}{\phi}$ is parallel to the rotation axis, and the second part $\mathbf{P}_\perp = \mathbf{P} - \mathbf{P}_\parallel$ is perpendicular to the rotation axis (see Fig. D.2). Rotation does not affect the parallel part of the vector, so after rotation
$$
\mathbf{P}'_\parallel = \mathbf{P}_\parallel \tag{D.19}
$$
If $\mathbf{P}_\perp = 0$ then rotation does not change the vector $\mathbf{P}$ at all. If $\mathbf{P}_\perp \neq 0$, we denote
$$
\mathbf{n} = -\frac{[\mathbf{P}_\perp \times \vec{\phi}]}{\phi}
$$
the vector which is orthogonal to both $\vec{\phi}$ and $\mathbf{P}_\perp$ and is equal to the latter in length. Note that the triple $(\mathbf{P}_\perp, \mathbf{n}, \vec{\phi})$ forms a right-handed system, just like vectors $(\mathbf{i}, \mathbf{j}, \mathbf{k})$. Then the result of the passive rotation through the angle $\vec{\phi}$ in the plane spanned by vectors $\mathbf{P}_\perp$ and $\mathbf{n}$ is the same as a rotation about the axis $\mathbf{k}$ in the plane spanned by vectors $\mathbf{i}$ and $\mathbf{j}$, i.e., it is given by the matrix (D.12)
$$
\mathbf{P}'_\perp = \mathbf{P}_\perp \cos\phi + \mathbf{n} \sin\phi \tag{D.20}
$$
Figure D.2: Transformation of vector components under active rotation through the angle $-\vec{\phi}$.
⁷ This characterization is not unique: there are many vectors describing the same rotation (see Appendix H.4).
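Combining equations (D.19) and (D.20) we obtain
$$
\mathbf{P}' = \mathbf{P}'_\parallel + \mathbf{P}'_\perp
= \left(\mathbf{P}\cdot\frac{\vec{\phi}}{\phi}\right)\frac{\vec{\phi}}{\phi}(1 - \cos\phi) + \mathbf{P}\cos\phi - \left[\mathbf{P}\times\frac{\vec{\phi}}{\phi}\right]\sin\phi \tag{D.21}
$$
or in the component notation
$$
\begin{aligned}
P'_x &= (P_x\phi_x + P_y\phi_y + P_z\phi_z)\frac{\phi_x}{\phi^2}(1 - \cos\phi) + P_x\cos\phi - (P_y\phi_z - P_z\phi_y)\frac{\sin\phi}{\phi} \\
P'_y &= (P_x\phi_x + P_y\phi_y + P_z\phi_z)\frac{\phi_y}{\phi^2}(1 - \cos\phi) + P_y\cos\phi - (P_z\phi_x - P_x\phi_z)\frac{\sin\phi}{\phi} \\
P'_z &= (P_x\phi_x + P_y\phi_y + P_z\phi_z)\frac{\phi_z}{\phi^2}(1 - \cos\phi) + P_z\cos\phi - (P_x\phi_y - P_y\phi_x)\frac{\sin\phi}{\phi}
\end{aligned}
$$
This transformation can also be represented in a matrix form.
$$
\mathbf{P}' = R^{-1}_{\vec{\phi}}\mathbf{P} = R_{-\vec{\phi}}\mathbf{P}
$$
where the orthogonal matrix $R_{\vec{\phi}}$ has the following matrix elements
$$
(R_{\vec{\phi}})_{ij} = \cos\phi\,\delta_{ij} + \sum_{k=1}^{3}\phi_k \epsilon_{ijk}\frac{\sin\phi}{\phi} + \phi_i\phi_j\frac{1 - \cos\phi}{\phi^2}
$$
$$
R_{\vec{\phi}} =
\begin{bmatrix}
\cos\phi + m_x^2(1 - \cos\phi) & m_x m_y(1 - \cos\phi) + m_z\sin\phi & m_x m_z(1 - \cos\phi) - m_y\sin\phi \\
m_x m_y(1 - \cos\phi) - m_z\sin\phi & \cos\phi + m_y^2(1 - \cos\phi) & m_y m_z(1 - \cos\phi) + m_x\sin\phi \\
m_x m_z(1 - \cos\phi) + m_y\sin\phi & m_y m_z(1 - \cos\phi) - m_x\sin\phi & \cos\phi + m_z^2(1 - \cos\phi)
\end{bmatrix} \tag{D.22}
$$
and $\mathbf{m} \equiv \vec{\phi}/\phi$. For $\mathbf{m}$ along the $z$-, $x$- and $y$-axes this matrix reduces to (D.12), (D.13) and (D.14), respectively.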
Inversely, let us start from an arbitrary orthogonal matrix $R_{\vec{\phi}}$ with $\det(R_{\vec{\phi}}) = 1$ and try to find the corresponding rotation vector $\vec{\phi}$. Obviously, this vector is not changed by the transformation $R_{\vec{\phi}}$, so
$$
R_{\vec{\phi}}\,\vec{\phi} = \vec{\phi}
$$
which means that $\vec{\phi}$ is an eigenvector of the matrix $R_{\vec{\phi}}$ with eigenvalue 1. Each orthogonal $3 \times 3$ matrix with unit determinant has eigenvalues $(1, e^{i\phi}, e^{-i\phi})$,⁸ so that eigenvalue 1 is not degenerate (unless the rotation is trivial). Then the direction of the vector $\vec{\phi}$ is uniquely specified. Now we need to find the length of this vector, i.e., the rotation angle $\phi$. The trace of the matrix $R_{\vec{\phi}}$ is given by the sum of its eigenvalues
$$
\mathrm{Tr}(R_{\vec{\phi}}) = 1 + e^{i\phi} + e^{-i\phi} = 1 + 2\cos\phi
$$
Therefore, we can define the function $\vec{\Phi}(R_{\vec{\phi}}) = \vec{\phi}$ (which maps from the set of rotation matrices to corresponding rotation vectors) by the following rules:
• the direction of the rotation vector $\vec{\phi}$ coincides with the direction of the eigenvector of $R_{\vec{\phi}}$ with eigenvalue 1;
• the length $\phi$ of the rotation vector, i.e., the rotation angle, is given by
$$
\phi = \cos^{-1}\frac{\mathrm{Tr}(R_{\vec{\phi}}) - 1}{2} \tag{D.23}
$$
As expected, this formula is basis-independent, because the trace of a matrix does not depend on the basis (see Lemma F.7).
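The correspondence between rotation vectors and matrices can be sketched in a few lines of Python. The code builds $(R_{\vec{\phi}})_{ij}$ from the index formula given above (Kronecker delta, Levi-Civita symbol and the $\phi_i\phi_j$ term) and recovers the angle from the trace as in (D.23). Function names are illustrative, indices 0, 1, 2 stand for $x, y, z$, and the clamp before `acos` is a numerical safeguard that is not part of the text.

```python
import math

def levi_civita(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def rotation_matrix(phi_vec):
    """Matrix elements cos(phi) d_ij + sum_k phi_k eps_ijk sin(phi)/phi
    + phi_i phi_j (1 - cos(phi))/phi^2 for the rotation vector phi_vec."""
    phi = math.sqrt(sum(c * c for c in phi_vec))
    if phi == 0.0:
        return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    r = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            delta = 1.0 if i == j else 0.0
            eps_term = sum(levi_civita(i, j, k) * phi_vec[k] for k in range(3))
            r[i][j] = (math.cos(phi) * delta
                       + eps_term * math.sin(phi) / phi
                       + phi_vec[i] * phi_vec[j] * (1.0 - math.cos(phi)) / phi ** 2)
    return r

def rotation_angle(r):
    """Rotation angle from the trace, equation (D.23); returns a value in [0, pi]."""
    trace = r[0][0] + r[1][1] + r[2][2]
    return math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))

def apply(r, v):
    return [sum(r[i][j] * v[j] for j in range(3)) for i in range(3)]

phi_vec = (0.3, -0.4, 1.2)          # length sqrt(0.09 + 0.16 + 1.44) = 1.3
R = rotation_matrix(phi_vec)
print(rotation_angle(R))            # close to 1.3
print(apply(R, phi_vec))            # the rotation vector itself is left unchanged
```

Since (D.23) uses the inverse cosine, the recovered angle always lies between 0 and $\pi$; rotation vectors longer than $\pi$ are mapped to an equivalent shorter one, which reflects the non-uniqueness mentioned in footnote 7.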
D.6 Group properties of rotations

One can see that rotations form a group. If we perform a rotation $\{\vec{\phi}_1\}$ followed by a rotation $\{\vec{\phi}_2\}$, then the resulting transformation preserves the origin, the linear combinations of vectors and their dot product, so it is another rotation.
The identity element in the rotation group is the rotation through zero angle $\{\vec{0}\}$, which leaves all vectors intact and is represented by the unit matrix $R_{\vec{0}} = I$. For each rotation $\{\vec{\phi}\}$ there exists an opposite (or inverse) rotation $\{-\vec{\phi}\}$ such that
$$
\{-\vec{\phi}\}\{\vec{\phi}\} = \{\vec{0}\}
$$
The inverse rotation is represented by the inverse matrix $R_{-\vec{\phi}} = R^{-1}_{\vec{\phi}} = R^{T}_{\vec{\phi}}$. The associativity law
$$
\{\vec{\phi}_1\}(\{\vec{\phi}_2\}\{\vec{\phi}_3\}) = (\{\vec{\phi}_1\}\{\vec{\phi}_2\})\{\vec{\phi}_3\}
$$
follows from the associativity of the matrix product.
⁸ One can check this result by using the explicit representation (D.22).

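Rotations about different axes do not commute. However, two rotations $\{\phi\vec{n}\}$ and $\{\psi\vec{n}\}$ about the same axis⁹ do commute. Moreover, our choice of the vector parameterization of rotations leads to the following important relationship
$$
R_{\phi\vec{n}} R_{\psi\vec{n}} = R_{\psi\vec{n}} R_{\phi\vec{n}} = R_{(\phi+\psi)\vec{n}} \tag{D.24}
$$
For example, considering two rotations around the $z$-axis we can write
$$
\begin{aligned}
R_{(0,0,\phi)} R_{(0,0,\psi)}
&= \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} \cos(\phi+\psi) & \sin(\phi+\psi) & 0 \\ -\sin(\phi+\psi) & \cos(\phi+\psi) & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= R_{(0,0,\phi+\psi)}
\end{aligned}
$$
We will say that rotations about the same axis $\vec{n}$ form a one-parameter subgroup of the rotation group.
⁹ Here $\vec{n}$ is a unit vector.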
D.7 Generators of rotations

Rotations in the vicinity of the unit element can be represented as a Taylor series¹⁰
$$
\{\vec{\theta}\} = 1 + \sum_{i=1}^{3}\theta^i t_i + \frac{1}{2}\sum_{ij=1}^{3}\theta^i\theta^j t_{ij} + \ldots
$$
At small values of $\theta$ we have simply
$$
\{\vec{\theta}\} \approx 1 + \sum_{i=1}^{3}\theta^i t_i
$$
Quantities $t_i$ are called generators or infinitesimal rotations. Generators can be formally represented as derivatives of elements in one-parameter subgroups with respect to parameters $\theta^i$, e.g.,
$$
t_i = \lim_{\vec{\theta}\to 0}\frac{d}{d\theta^i}\{\vec{\theta}\}
$$
For example, in the matrix notation, the generator of rotations around the $z$-axis is given by the matrix
$$
J_z = \lim_{\phi\to 0}\frac{d}{d\phi}R_z(\phi)
= \lim_{\phi\to 0}\frac{d}{d\phi}\begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{D.25}
$$
Similarly, for generators of rotations around $x$- and $y$-axes we obtain from (D.13) and (D.14)
$$
J_x = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix},\qquad
J_y = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} \tag{D.26}
$$
¹⁰ Here we denote $1 \equiv \{\vec{0}\}$ the identity element of the group.
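Using the additivity property (D.24) we can express a general rotation $\{\vec{\theta}\}$ as an exponential function of generators
$$
\{\vec{\theta}\} = \lim_{N\to\infty}\left\{N\frac{\vec{\theta}}{N}\right\}
= \lim_{N\to\infty}\left\{\frac{\vec{\theta}}{N}\right\}^N
= \lim_{N\to\infty}\left(1 + \sum_{i=1}^{3}\frac{\theta^i}{N}t_i\right)^N
= \exp\left(\sum_{i=1}^{3}\theta^i t_i\right) \tag{D.27}
$$
Let us verify this formula in the case of a rotation around the $z$-axis
$$
\begin{aligned}
e^{J_z\phi} &= 1 + \phi J_z + \frac{1}{2!}\phi^2 J_z^2 + \ldots \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
+ \begin{bmatrix} 0 & \phi & 0 \\ -\phi & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
+ \begin{bmatrix} -\frac{\phi^2}{2} & 0 & 0 \\ 0 & -\frac{\phi^2}{2} & 0 \\ 0 & 0 & 0 \end{bmatrix} + \ldots \\
&= \begin{bmatrix} 1 - \frac{\phi^2}{2} + \ldots & \phi + \ldots & 0 \\ -\phi + \ldots & 1 - \frac{\phi^2}{2} + \ldots & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \\
&= R_z = \{0, 0, \phi\}
\end{aligned}
$$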
The exponential of any linear combination of generators $t_i$ also results in an orthogonal matrix with unit determinant, i.e., represents a rotation. Therefore, objects $t_i$ form a basis in the vector space of generators of the rotation group. This vector space is referred to as the Lie algebra of the rotation group. General properties of Lie algebras will be discussed in Appendix E.2.
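Both forms of (D.27) can be tested numerically for the generator $J_z$ of (D.25). The sketch below compares the truncated power series of $e^{J_z\phi}$ and the finite-$N$ product $(1 + \phi J_z/N)^N$ with the matrix (D.12). The truncation at 30 terms and the choice $N = 2^{20}$ (evaluated by repeated squaring) are arbitrary illustrative parameters.

```python
import math

J_Z = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scale(m, c):
    return [[c * m[i][j] for j in range(3)] for i in range(3)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def exp_series(m, terms=30):
    """Truncated power series 1 + m + m^2/2! + ... for the matrix exponential."""
    result, term = identity(), identity()
    for n in range(1, terms):
        term = scale(matmul(term, m), 1.0 / n)
        result = add(result, term)
    return result

def exp_limit(m, squarings=20):
    """(1 + m/N)^N with N = 2**squarings, computed by repeated squaring."""
    n = 2 ** squarings
    power = add(identity(), scale(m, 1.0 / n))
    for _ in range(squarings):
        power = matmul(power, power)
    return power

def rot_z(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

phi = 0.7
print(max_diff(exp_series(scale(J_Z, phi)), rot_z(phi)))   # rounding-level difference
print(max_diff(exp_limit(scale(J_Z, phi)), rot_z(phi)))    # small, shrinks as N grows
```

The series agrees with $R_z$ to machine precision, while the finite-$N$ product carries an error of order $\phi^2/N$, as expected for a limit formula.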
Appendix E
Lie groups and Lie algebras
E.1 Lie groups

In general, a group¹ can be thought of as a set of points (elements) with a multiplication law such that the “product” of any two points gives a third element in the set. In addition, there is an inversion law that maps each point to an “inverse” point. For some groups the corresponding sets of points are discrete.² Here we would like to discuss a special class of groups that are called Lie groups.³ The characteristic feature of a Lie group is that its set of points is continuous and smooth and that multiplication and inversion laws are described by smooth functions. This set of points can be visualized as a multi-dimensional “hypersurface”, which is called the group manifold.
We saw in Appendix D.5⁴ that elements of the rotation group are in isomorphic correspondence with points $\vec{\phi}$ in a certain smooth manifold. The multiplication and inversion laws define two smooth mappings between points in this manifold. Thus, the rotation group is an example of a Lie group. Similar to the rotation group, elements in a general Lie group can be parameterized by $n$ continuous parameters $\theta^i$, where $n$ is the dimension of the Lie group. We will join these parameters in one $n$-dimensional “vector” $\vec{\theta}$ and denote a general group element as $\{\vec{\theta}\} = \{\theta^1, \theta^2, \ldots, \theta^n\}$, so that the group multiplication and inversion laws are smooth functions of these parameters.
¹ See subsection A.1.
² See example in Appendix A.1.
³ Lie groups and algebras were named after the Norwegian mathematician Sophus Lie, who first developed their theory.
⁴ See also Appendix H.4.
It appears that similar to the rotation group, in a general Lie group it is also possible to choose a parameterization $\{\theta^1, \theta^2, \ldots, \theta^n\}$ such that the following properties are satisfied
• the unit element has parameters $(0, 0, \ldots, 0)$;
• $\{\vec{\theta}\}^{-1} = \{-\vec{\theta}\}$;
• if elements $\{\vec{\psi}\}$ and $\{\vec{\phi}\}$ belong to the same one-parameter subgroup, then
$$
\{\vec{\psi}\}\{\vec{\phi}\} = \{\vec{\psi} + \vec{\phi}\}
$$
We will always assume that group parameters satisfy these properties. Then, similar to subsection D.7, we can introduce infinitesimal transformations or generators $t_a$ ($a = 1, 2, \ldots, n$) for a general Lie group and express group elements in the vicinity of the unit element as exponential functions of generators
$$
\{\vec{\theta}\} = \exp\left(\sum_{a=1}^{n}\theta^a t_a\right)
= 1 + \sum_{a=1}^{n}\theta^a t_a + \frac{1}{2!}\sum_{bc=1}^{n}\theta^b\theta^c t_{bc} + \ldots \tag{E.1}
$$
Let us introduce function $\vec{g}(\vec{\zeta}, \vec{\xi})$ which associates with two points $\vec{\zeta}$ and $\vec{\xi}$ in the group manifold a third point $\vec{g}(\vec{\zeta}, \vec{\xi})$ according to the group multiplication law, i.e.,
$$
\{\vec{\zeta}\}\{\vec{\xi}\} = \{\vec{g}(\vec{\zeta}, \vec{\xi})\} \tag{E.2}
$$
Function $\vec{g}(\vec{\zeta}, \vec{\xi})$ must satisfy conditions
$$
\vec{g}(\vec{0}, \vec{\theta}) = \vec{g}(\vec{\theta}, \vec{0}) = \vec{\theta} \tag{E.3}
$$
$$
\vec{g}(\vec{\theta}, -\vec{\theta}) = \vec{0}
$$
which follow from the group properties (A.2) and (A.3), respectively. To ensure agreement with equation (E.3), the Taylor expansion of $\vec{g}$ up to the 2nd order in parameters must look like
$$
g^a(\vec{\zeta}, \vec{\xi}) = \zeta^a + \xi^a + \sum_{bc=1}^{n} f^a_{bc}\,\xi^b\zeta^c + \ldots \tag{E.4}
$$
where $f^a_{bc}$ are real coefficients. Now we substitute expansions (E.1) and (E.4) into (E.2)
$$
\begin{aligned}
&\left(1 + \sum_{a=1}^{n}\xi^a t_a + \frac{1}{2}\sum_{bc=1}^{n}\xi^b\xi^c t_{bc} + \ldots\right)
\left(1 + \sum_{a=1}^{n}\zeta^a t_a + \frac{1}{2}\sum_{bc=1}^{n}\zeta^b\zeta^c t_{bc} + \ldots\right) \\
&= 1 + \sum_{a=1}^{n}\left(\zeta^a + \xi^a + \sum_{bc=1}^{n} f^a_{bc}\,\xi^b\zeta^c + \ldots\right)t_a
+ \frac{1}{2}\sum_{ab=1}^{n}(\zeta^a + \xi^a + \ldots)(\zeta^b + \xi^b + \ldots)t_{ab} + \ldots
\end{aligned}
$$
Factors multiplying $1, \zeta, \xi, \zeta^2, \xi^2$ are exactly the same on both sides of this equation, but the factor in front of $\xi\zeta$ produces a non-trivial condition
$$
\frac{1}{2}(t_{bc} + t_{cb}) = t_b t_c - \sum_{a=1}^{n} f^a_{bc}\,t_a
$$
The left hand side is symmetric with respect to the interchange of indices $b$ and $c$. Therefore the right hand side must be symmetric as well
$$
t_b t_c - \sum_{a=1}^{n} f^a_{bc}\,t_a - t_c t_b + \sum_{a=1}^{n} f^a_{cb}\,t_a = 0 \tag{E.5}
$$
If we define the commutator of two generators by formula
$$
[t_b, t_c] \equiv t_b t_c - t_c t_b
$$
then, according to (E.5), this commutator is a linear combination of generators
$$
[t_b, t_c] = \sum_{a=1}^{n} C^a_{bc}\,t_a \tag{E.6}
$$
where real parameters $C^a_{bc} = f^a_{bc} - f^a_{cb}$ are called structure constants of the Lie group.
Theorem E.1 Generators of a Lie group satisfy the Jacobi identity
$$
[t_a, [t_b, t_c]] + [t_b, [t_c, t_a]] + [t_c, [t_a, t_b]] = 0 \tag{E.7}
$$
Proof. Let us first write the associativity law (A.1) in the form⁵
$$
\begin{aligned}
0 &= g^a(\vec{\zeta}, \vec{g}(\vec{\xi}, \vec{\eta})) - g^a(\vec{g}(\vec{\zeta}, \vec{\xi}), \vec{\eta}) \\
&\approx \zeta^a + g^a(\vec{\xi}, \vec{\eta}) + f^a_{bc}\,\zeta^b g^c(\vec{\xi}, \vec{\eta}) - g^a(\vec{\zeta}, \vec{\xi}) - \eta^a - f^a_{bc}\,g^b(\vec{\zeta}, \vec{\xi})\,\eta^c \\
&\approx \zeta^a + \xi^a + \eta^a + f^a_{bc}\,\xi^b\eta^c + f^a_{bc}\,\zeta^b(\xi^c + \eta^c + f^c_{xy}\,\xi^x\eta^y) \\
&\quad - \zeta^a - \xi^a - f^a_{xy}\,\zeta^x\xi^y - \eta^a - f^a_{bc}(\zeta^b + \xi^b + f^b_{xy}\,\zeta^x\xi^y)\eta^c \\
&= f^a_{bc}\,\xi^b\eta^c + f^a_{bc}\,\zeta^b\xi^c + f^a_{bc}\,\zeta^b\eta^c + f^a_{bc}f^c_{xy}\,\zeta^b\xi^x\eta^y \\
&\quad - f^a_{xy}\,\zeta^x\xi^y - f^a_{bc}\,\eta^c\zeta^b - f^a_{bc}\,\eta^c\xi^b - f^a_{bc}f^b_{xy}\,\eta^c\zeta^x\xi^y \\
&= f^a_{bc}f^c_{xy}\,\zeta^b\xi^x\eta^y - f^a_{bc}f^b_{xy}\,\zeta^x\xi^y\eta^c \\
&= (f^a_{bc}f^c_{xy} - f^a_{cy}f^c_{bx})\,\zeta^b\xi^x\eta^y
\end{aligned}
$$
Since elements $\{\vec{\zeta}\}$, $\{\vec{\xi}\}$ and $\{\vec{\eta}\}$ are arbitrary, this implies
$$
f^c_{kl}f^a_{bc} - f^c_{bk}f^a_{cl} = 0 \tag{E.8}
$$
Now let us turn to the left hand side of the Jacobi identity (E.7)
$$
\begin{aligned}
&[t_a, [t_b, t_c]] + [t_b, [t_c, t_a]] + [t_c, [t_a, t_b]] \\
&= [t_a, C^x_{bc}t_x] + [t_b, C^x_{ca}t_x] + [t_c, C^x_{ab}t_x] \\
&= (C^x_{bc}C^y_{ax} + C^x_{ca}C^y_{bx} + C^x_{ab}C^y_{cx})\,t_y
\end{aligned}
$$
The expression in parentheses is
$$
\begin{aligned}
&(f^x_{bc} - f^x_{cb})(f^y_{ax} - f^y_{xa}) + (f^x_{ca} - f^x_{ac})(f^y_{bx} - f^y_{xb}) + (f^x_{ab} - f^x_{ba})(f^y_{cx} - f^y_{xc}) \\
&= f^x_{bc}f^y_{ax} - f^x_{bc}f^y_{xa} - f^x_{cb}f^y_{ax} + f^x_{cb}f^y_{xa} + f^x_{ca}f^y_{bx} - f^x_{ca}f^y_{xb} \\
&\quad - f^x_{ac}f^y_{bx} + f^x_{ac}f^y_{xb} + f^x_{ab}f^y_{cx} - f^x_{ab}f^y_{xc} - f^x_{ba}f^y_{cx} + f^x_{ba}f^y_{xc} \\
&= (f^x_{bc}f^y_{ax} - f^x_{ab}f^y_{xc}) + (f^x_{ca}f^y_{bx} - f^x_{bc}f^y_{xa}) - (f^x_{cb}f^y_{ax} - f^x_{ac}f^y_{xb}) \\
&\quad - (f^x_{ba}f^y_{cx} - f^x_{cb}f^y_{xa}) + (f^x_{ab}f^y_{cx} - f^x_{ca}f^y_{xb}) - (f^x_{ac}f^y_{bx} - f^x_{ba}f^y_{xc})
\end{aligned} \tag{E.9}
$$
According to (E.8) all terms in parentheses on the right hand side of (E.9) are zero, which proves the theorem.
⁵ The burden of writing summation signs becomes unbearable at this point, so we will adopt here Einstein's summation rule, which allows us to drop the summation signs and assume that the sums are performed over all pairs of repeating indices. Moreover, we keep only 2nd order terms in the expansion (E.4).
E.2 Lie algebras

A Lie algebra is a vector space over real numbers $\mathbb{R}$ with an additional operation called the Lie bracket. This operation is denoted $[A, B]$ and it maps two vectors $A$ and $B$ to a third vector. The Lie bracket is postulated to satisfy the following set of conditions⁶
$$
[A, B] = -[B, A]
$$
$$
[A, B + C] = [A, B] + [A, C]
$$
$$
[A, \lambda B] = [\lambda A, B] = \lambda[A, B], \quad \text{for any } \lambda \in \mathbb{R}
$$
$$
0 = [A, [B, C]] + [B, [C, A]] + [C, [A, B]] \tag{E.10}
$$
From our discussion in the preceding section it is clear that generators of a Lie group form a Lie algebra, in which the role of the Lie bracket is played by the commutator of generators. Consider, for example, the group of rotations. In the matrix representation, the generators are linear combinations of matrices (D.25) - (D.26), i.e., they are arbitrary antisymmetric matrices satisfying $A^T = -A$. The commutator is represented by⁷
$$
[A, B] = AB - BA
$$
which is also an antisymmetric matrix, because
$$
(AB - BA)^T = B^T A^T - A^T B^T = BA - AB = -(AB - BA)
$$
⁶ Equation (E.10) is called the Jacobi identity.
⁷ Note that this representation of the Lie bracket as a difference of two products can be used only when the generators are identified with matrices. This formula (as well as (E.11)) does not apply to abstract Lie algebras, because the product of two elements $AB$ is not defined there.
We will frequently use the following property of commutators in the matrix representation
$$
\begin{aligned}
[A, BC] &= ABC - BCA = ABC - BAC + BAC - BCA \\
&= (AB - BA)C + B(AC - CA) = [A, B]C + B[A, C]
\end{aligned} \tag{E.11}
$$
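The algebraic properties of the matrix commutator discussed above can be checked with exact integer arithmetic. The sketch below takes three arbitrary integer combinations of the generator matrices (D.25) - (D.26) and verifies that their commutator is antisymmetric, that the Jacobi identity (E.10) holds, and that property (E.11) is satisfied. The coefficients of the combinations are arbitrary illustrative values.

```python
J_X = [[0, 0, 0], [0, 0, 1], [0, -1, 0]]
J_Y = [[0, 0, -1], [0, 0, 0], [1, 0, 0]]
J_Z = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def scale(m, c):
    return [[c * m[i][j] for j in range(3)] for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def commutator(a, b):
    """[A, B] = AB - BA."""
    return add(matmul(a, b), scale(matmul(b, a), -1))

def combo(cx, cy, cz):
    """Linear combination cx*J_X + cy*J_Y + cz*J_Z (an antisymmetric matrix)."""
    return add(add(scale(J_X, cx), scale(J_Y, cy)), scale(J_Z, cz))

A, B, C = combo(2, -1, 3), combo(1, 4, -2), combo(-3, 5, 1)

# The commutator of antisymmetric matrices is antisymmetric.
AB = commutator(A, B)
print(transpose(AB) == scale(AB, -1))

# Jacobi identity (E.10)
jacobi = add(add(commutator(A, commutator(B, C)),
                 commutator(B, commutator(C, A))),
             commutator(C, commutator(A, B)))
print(jacobi == ZERO)

# Property (E.11): [A, BC] = [A, B]C + B[A, C]
lhs = commutator(A, matmul(B, C))
rhs = add(matmul(commutator(A, B), C), matmul(B, commutator(A, C)))
print(lhs == rhs)
```

Note that the product $BC$ used in the last check is a matrix product; as footnote 7 stresses, it has no meaning in an abstract Lie algebra.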
The structure constants of the Lie algebra of the rotation group can be
obtained by direct calculation from explicit expressions (D.25) - (D.26)
[Jx , Jy ] = Jz
[Jx , Jz ] = −Jy
[Jy , Jz ] = Jx
This can be written more compactly as
$$
[J_i, J_j] = \sum_{k=1}^{3}\epsilon_{ijk}J_k
$$
In the vicinity of the unit element, any Lie group element can be represented as the exponential $\exp(x)$ of a Lie algebra element $x$ (see equation (E.1)). As the product of two group elements is another group element, we must have for any two Lie algebra elements $x$ and $y$
$$
\exp(x)\exp(y) = \exp(z) \tag{E.12}
$$
where $z$ is also an element from the Lie algebra. Then there should exist a mapping in the Lie algebra which associates with any two elements $x$ and $y$ a third element $z$, such that equation (E.12) is satisfied. The Baker-Campbell-Hausdorff theorem [WM62] gives us the explicit form of this mapping
$$
\begin{aligned}
z &= x + y + \frac{1}{2}[x, y] + \frac{1}{12}[[x, y], y] + \frac{1}{12}[[y, x], x] \\
&\quad + \frac{1}{24}[[[y, x], x], y] - \frac{1}{720}[[[[x, y], y], y], y] + \frac{1}{360}[[[[x, y], y], y], x] \\
&\quad + \frac{1}{360}[[[[y, x], x], x], y] - \frac{1}{120}[[[[x, y], y], x], y] - \frac{1}{120}[[[[y, x], x], y], x] \ldots
\end{aligned}
$$
This means that Lie bracket relations in the Lie algebra contain full information about the group multiplication law in the vicinity of the unit element. In many cases, it is much easier to deal with generators and their Lie brackets than directly with group elements and their multiplication law.
In applications one often finds useful the following identity
$$
\exp(ax)\,y\,\exp(-ax) = y + a[x, y] + \frac{a^2}{2!}[x, [x, y]] + \frac{a^3}{3!}[x, [x, [x, y]]] \ldots \tag{E.13}
$$
where $a \in \mathbb{R}$. This formula can be proved by noticing that both sides are solutions of the same differential operator equation
$$
\frac{dy(a)}{da} = [x, y(a)]
$$
with the same initial condition $y(0) = y$.
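Identity (E.13) can be checked numerically in a matrix representation, where the product of two elements is defined. The sketch below compares the left hand side, computed from truncated exponential series, with the nested-commutator series on the right hand side. The $2 \times 2$ matrices `x`, `y`, the value of `a`, and the truncation order are arbitrary illustrative choices.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def scale(m, c):
    return [[c * v for v in row] for row in m]

def commutator(a, b):
    return add(matmul(a, b), scale(matmul(b, a), -1.0))

def mat_exp(m, terms=40):
    """Truncated power series for exp(m)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scale(matmul(term, m), 1.0 / k)
        result = add(result, term)
    return result

def conjugated(x, y, a):
    """Left hand side of (E.13): exp(a x) y exp(-a x)."""
    return matmul(matmul(mat_exp(scale(x, a)), y), mat_exp(scale(x, -a)))

def nested_series(x, y, a, terms=40):
    """Right hand side of (E.13): y + a[x,y] + a^2/2! [x,[x,y]] + ..."""
    result, nested, coeff = [row[:] for row in y], [row[:] for row in y], 1.0
    for k in range(1, terms):
        nested = commutator(x, nested)
        coeff *= a / k
        result = add(result, scale(nested, coeff))
    return result

def max_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(len(p)) for j in range(len(p)))

x = [[0.0, 1.0], [0.5, -0.3]]
y = [[1.0, 2.0], [0.0, -1.0]]
print(max_diff(conjugated(x, y, 0.4), nested_series(x, y, 0.4)))
```

The printed difference should be at the level of floating-point rounding; at $a = 0$ both sides reduce to $y$, which is the initial condition used in the proof.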
There is a unique Lie algebra AG corresponding to each Lie group G.
However, there are many Lie groups with the same Lie algebra. These groups
have the same structure in the vicinity of the unit element, but their global
topological properties can be different.
A Lie subalgebra B of a Lie algebra A is a subspace in A which is closed
with respect to the Lie bracket, i.e., if x, y ∈ B, then [x, y] ∈ B. If H is a
subgroup of a Lie group G, then its Lie algebra AH is a Lie subalgebra of
AG .
Appendix F
Hilbert space
F.1 Inner product

An inner product space $H$ is defined as a complex vector space¹ which has a mapping from ordered pairs of vectors to complex numbers. This mapping is called the inner product $(|y\rangle, |x\rangle)$ and it satisfies the following properties
$$
(|x\rangle, |y\rangle) = (|y\rangle, |x\rangle)^* \tag{F.1}
$$
$$
(|z\rangle, \alpha|x\rangle + \beta|y\rangle) = \alpha(|z\rangle, |x\rangle) + \beta(|z\rangle, |y\rangle) \tag{F.2}
$$
$$
(|x\rangle, |x\rangle) \in \mathbb{R} \tag{F.3}
$$
$$
(|x\rangle, |x\rangle) \ge 0 \tag{F.4}
$$
$$
(|x\rangle, |x\rangle) = 0 \;\Leftrightarrow\; |x\rangle = 0 \tag{F.5}
$$
where $\alpha$ and $\beta$ are complex numbers. Given the inner product we can define the distance between two vectors by formula $d(|x\rangle, |y\rangle) \equiv \sqrt{(|x - y\rangle, |x - y\rangle)}$.
The inner product space $H$ is called complete if any Cauchy sequence² of vectors in $H$ converges to a vector in $H$. Analogously, a subspace in $H$ is called a closed subspace if any Cauchy sequence of vectors belonging to the subspace converges to a vector in this subspace. The Hilbert space is simply a complete inner product space.³
¹ See Appendix A.2. Vectors in $H$ will be denoted by $|x\rangle$.
² A Cauchy sequence is an infinite sequence of vectors $|x_i\rangle$ in which the distance between two vectors $|x_n\rangle$ and $|x_m\rangle$ tends to zero when their indices tend to infinity $n, m \to \infty$.
³ The notions of completeness and closedness are rather technical. Finite dimensional inner product spaces are always complete, and their subspaces are always closed. Although in quantum mechanics we normally deal with infinite-dimensional spaces, most properties having relevance to physics do not depend on the number of dimensions. So, we will ignore the difference between finite- and infinite-dimensional spaces and freely use finite $n$-dimensional examples in our proofs and demonstrations. In particular, we will tacitly assume that every subspace $A$ is closed or forced to be closed by adding all vectors which are limits of Cauchy sequences in $A$.
F.2 Orthonormal bases

Two vectors $|x\rangle$ and $|y\rangle$ are called orthogonal if $(|x\rangle, |y\rangle) = 0$. Vector $|x\rangle$ is called unimodular if $(|x\rangle, |x\rangle) = 1$. In Hilbert space we can consider orthonormal bases consisting of mutually orthogonal unimodular vectors $|e_i\rangle$ which satisfy
$$
(|e_i\rangle, |e_j\rangle) =
\begin{cases}
1, & \text{if } i = j \\
0, & \text{if } i \neq j
\end{cases}
$$
or, using the Kronecker delta symbol
$$
(|e_i\rangle, |e_j\rangle) = \delta_{ij} \tag{F.6}
$$
Suppose that vectors $|x\rangle$ and $|y\rangle$ have components $x_i$ and $y_i$, respectively, in this basis
$$
|x\rangle = x_1|e_1\rangle + x_2|e_2\rangle + \ldots + x_n|e_n\rangle
$$
$$
|y\rangle = y_1|e_1\rangle + y_2|e_2\rangle + \ldots + y_n|e_n\rangle
$$
Then using (F.1), (F.2) and (F.6) we can express the inner product through vector components
$$
\begin{aligned}
(|x\rangle, |y\rangle) &= (x_1|e_1\rangle + x_2|e_2\rangle + \ldots + x_n|e_n\rangle,\; y_1|e_1\rangle + y_2|e_2\rangle + \ldots + y_n|e_n\rangle) \\
&= x_1^* y_1 + x_2^* y_2 + \ldots + x_n^* y_n = \sum_i x_i^* y_i
\end{aligned}
$$
F.3. BRA AND KET VECTORS 639
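The component formula for the inner product in section F.2 can be checked numerically. The following is only a minimal sketch: the helper name `inner` and the sample vectors are illustrative assumptions, and vectors are represented as plain lists of components in an orthonormal basis.

```python
def inner(x, y):
    """Inner product (|x>, |y>) = sum_i conj(x_i) * y_i in an orthonormal basis."""
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

# Hypothetical sample components, chosen only for illustration
x = [1 + 2j, 3j]
y = [2 - 1j, 1 + 1j]

ip_xy = inner(x, y)    # (|x>, |y>)
ip_yx = inner(y, x)    # (|y>, |x>), the complex conjugate by (F.1)
norm_sq = inner(x, x)  # real and non-negative by (F.3), (F.4)
```

The convention used here (complex conjugation applied to the first argument) is the one implied by property (F.2), which makes the inner product linear in its second argument.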
F.3 Bra and ket vectors

The notation $(|x\rangle, |y\rangle)$ for the inner product is rather cumbersome. We will use instead a more convenient bra-ket formalism suggested by Dirac, which greatly simplifies manipulations with objects in the Hilbert space. Let us call vectors in the Hilbert space ket vectors. We define a linear functional $\langle f| : H \to \mathbb{C}$ as a function (denoted by $\langle f|x\rangle$) which maps each ket vector $|x\rangle$ in H to a complex number in such a way that linearity is preserved, i.e.,
$$\langle f|(\alpha|x\rangle + \beta|y\rangle) = \alpha\langle f|x\rangle + \beta\langle f|y\rangle$$
Since any linear combination $\alpha\langle f| + \beta\langle g|$ of functionals $\langle f|$ and $\langle g|$ is again a functional, all functionals form a vector space (denoted $H^*$). The vectors in $H^*$ will be called bra vectors. We can define an inner product in $H^*$ so that it becomes a Hilbert space. To do that, let us choose an orthonormal basis $|e_i\rangle$ in H. Then each functional $\langle f|$ defines a set of complex numbers $f_i$ which are values of this functional on the basis vectors
$$f_i = \langle f|e_i\rangle$$
These numbers define the functional uniquely, i.e., if two functionals $\langle f|$ and $\langle g|$ are different, then their values are different for at least one basis vector $|e_k\rangle$: $f_k \neq g_k$.⁴ Now we can define the inner product of bra vectors $\langle f|$ and $\langle g|$ by the formula
$$(\langle f|, \langle g|) = \sum_i f_i g_i^*$$
and verify that it satisfies all properties of the inner product listed in (F.1) - (F.5). The Hilbert space $H^*$ is called a dual of the Hilbert space H. Note that each vector $|y\rangle$ in H defines a unique linear functional $\langle y|$ in $H^*$ by the formula
$$\langle y|x\rangle \equiv (|y\rangle, |x\rangle) \tag{F.7}$$
for each $|x\rangle \in H$. This bra vector $\langle y|$ will be called the dual of the ket vector $|y\rangle$. Equation (F.7) tells us that in order to calculate the inner product of $|y\rangle$ and $|x\rangle$ we should find the bra vector (functional) dual to $|y\rangle$ and then find its value on $|x\rangle$. So, the inner product is obtained by coupling bra and ket vectors $\langle x|y\rangle$, thus forming a closed bra(c)ket expression, which is a complex number.

⁴ Otherwise, using linearity we would be able to prove that the values of functionals $\langle f|$ and $\langle g|$ are equal on all vectors in H, i.e., $\langle f| = \langle g|$.
Clearly, if $|x\rangle$ and $|y\rangle$ are different kets, then their dual bras $\langle x|$ and $\langle y|$ are different as well. We may notice that just like vectors in $H^*$ define linear functionals on vectors in H, any vector $|x\rangle \in H$ also defines an antilinear functional on bra vectors by the formula $\langle y|x\rangle$, i.e.,
$$(\alpha\langle y| + \beta\langle z|)|x\rangle = \alpha^*\langle y|x\rangle + \beta^*\langle z|x\rangle$$
Then applying the same arguments as above, we see that if $\langle y|$ is a bra vector, then there is a unique ket $|y\rangle$ such that for any $\langle x| \in H^*$ we have
$$\langle x|y\rangle = (\langle x|, \langle y|) \tag{F.8}$$
Thus we established an isomorphism of two Hilbert spaces H and $H^*$. This statement is known as the Riesz theorem.

Lemma F.1 If kets $|e_i\rangle$ form an orthonormal basis in H, then dual bras $\langle e_i|$ also form an orthonormal basis in $H^*$.

Proof. Suppose that $\langle e_i|$ do not form a basis. Then there is a nonzero vector $\langle z| \in H^*$ which is orthogonal to all $\langle e_i|$. The values of the functional $\langle z|$ on all basis vectors $|e_i\rangle$ are then zero, so $\langle z| = 0$, which contradicts the assumption that $\langle z|$ is nonzero. The orthonormality of $\langle e_i|$ follows from equations (F.8) and (F.6)
$$(\langle e_i|, \langle e_j|) = \langle e_i|e_j\rangle = (|e_i\rangle, |e_j\rangle) = \delta_{ij}$$
The components $x_i$ of a vector $|x\rangle$ in the basis $|e_i\rangle$ are conveniently represented in the bra-ket notation as
$$\langle e_i|x\rangle = \langle e_i|(x_1|e_1\rangle + x_2|e_2\rangle + \ldots + x_n|e_n\rangle) = x_i$$
So we can write
$$|x\rangle = \sum_i |e_i\rangle x_i = \sum_i |e_i\rangle\langle e_i|x\rangle \tag{F.9}$$
The bra vector $\langle y|$ dual to the ket $|y\rangle$ has complex conjugate components in the dual basis
$$\langle y| = \sum_i y_i^* \langle e_i| \tag{F.10}$$
This can be verified by checking that the functional on the right hand side being applied to any vector $|x\rangle \in H$ yields
$$\sum_i y_i^* \langle e_i|x\rangle = \sum_i y_i^* x_i = (|y\rangle, |x\rangle) = \langle y|x\rangle$$
F.4 Tensor product of Hilbert spaces

Given two Hilbert spaces $H_1$ and $H_2$ one can construct a third Hilbert space H which is called the tensor product of $H_1$ and $H_2$ and denoted by $H = H_1 \otimes H_2$. For each pair of basis ket vectors $|i\rangle \in H_1$ and $|j\rangle \in H_2$ there is exactly one basis ket in H, which is denoted by $|i\rangle \otimes |j\rangle$. Other vectors in H are linear combinations of the basis kets $|i\rangle \otimes |j\rangle$ with complex coefficients.
The inner product of two basis vectors $|a_1\rangle \otimes |a_2\rangle \in H$ and $|b_1\rangle \otimes |b_2\rangle \in H$ is defined as $\langle a_1|b_1\rangle\langle a_2|b_2\rangle$. This inner product is extended to linear combinations of basis vectors by linearity.
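As an illustration of the tensor product construction, the sketch below represents a product vector by its components in the basis $|i\rangle \otimes |j\rangle$ and checks that the inner product factorizes as $\langle a_1|b_1\rangle\langle a_2|b_2\rangle$. The helper names `inner` and `tensor` and the sample vectors are assumptions made for this example only.

```python
def inner(x, y):
    """<x|y> = sum_i conj(x_i) * y_i in an orthonormal basis."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def tensor(u, v):
    """Components of |u> (x) |v> in the product basis |i> (x) |j>."""
    return [ui * vj for ui in u for vj in v]

# Hypothetical vectors in a 2-dimensional H1 and a 3-dimensional H2
a1, a2 = [1, 1j], [2, 0, 1j]
b1, b2 = [1j, 3], [1, 1, 1]

lhs = inner(tensor(a1, a2), tensor(b1, b2))  # inner product in H1 (x) H2
rhs = inner(a1, b1) * inner(a2, b2)          # product of the two factors
```

Here the product vectors are not basis vectors, so the equality of `lhs` and `rhs` relies on extending the definition by linearity, as described above.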
F.5 Linear operators

Linear transformations of vectors in the Hilbert space (also called operators) play a very important role in quantum formalism. Such transformations
$$T|x\rangle = |x'\rangle$$
have the property
$$T(\alpha|x\rangle + \beta|y\rangle) = \alpha T|x\rangle + \beta T|y\rangle$$
for any two complex numbers $\alpha$ and $\beta$ and any two vectors $|x\rangle$ and $|y\rangle$. Given an operator T we can find images of basis vectors
$$T|e_i\rangle = |e'_i\rangle$$
and find the expansion of these images in the original basis $|e_i\rangle$
$$|e'_i\rangle = \sum_j t_{ji}|e_j\rangle$$
Coefficients $t_{ji}$ of this expansion are called the matrix elements of the operator T in the basis $|e_i\rangle$. In the bra-ket notation we can find a convenient expression for the matrix elements
$$\langle e_j|(T|e_i\rangle) = \langle e_j|e'_i\rangle = \langle e_j|\sum_k t_{ki}|e_k\rangle = \sum_k t_{ki}\langle e_j|e_k\rangle = \sum_k t_{ki}\delta_{jk} = t_{ji}$$
Knowing matrix elements of the operator T and components of the vector $|x\rangle$ in the basis $|e_i\rangle$ one can always find the components of the transformed vector $|x'\rangle = T|x\rangle$
$$x'_i = \langle e_i|x'\rangle = \langle e_i|(T|x\rangle) = \langle e_i|\sum_j (T|e_j\rangle)x_j = \sum_{jk}\langle e_i|e_k\rangle t_{kj}x_j = \sum_{jk}\delta_{ik}t_{kj}x_j = \sum_j t_{ij}x_j \tag{F.11}$$
In the bra-ket notation, the operator T has the form
$$T = \sum_{ij}|e_i\rangle t_{ij}\langle e_j| \tag{F.12}$$
Indeed, by applying the right hand side of equation (F.12) to an arbitrary vector $|x\rangle$ we obtain
$$\sum_{ij}|e_i\rangle t_{ij}\langle e_j|x\rangle = \sum_{ij}|e_i\rangle t_{ij}x_j = \sum_i x'_i|e_i\rangle = |x'\rangle = T|x\rangle$$
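The relation between an operator and its matrix elements can be sketched in a few lines. This is an illustrative example only: the names `apply`, `inner` and `matrix_element`, and the sample matrix, are assumptions; the operator is stored as a nested list `t[i][j]` with the convention $t_{ij} = \langle e_i|T|e_j\rangle$ of (F.12).

```python
def apply(t, x):
    """Components x'_i = sum_j t_ij x_j of the transformed vector T|x>, cf. (F.11)."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]

def inner(x, y):
    """<x|y> = sum_i conj(x_i) * y_i."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def matrix_element(t, i, j, n):
    """<e_i|T|e_j>, computed by acting with T on the basis ket |e_j>."""
    e_i = [1 if k == i else 0 for k in range(n)]
    e_j = [1 if k == j else 0 for k in range(n)]
    return inner(e_i, apply(t, e_j))

# Hypothetical 2x2 operator matrix
t = [[1, 2j], [3, 4]]

recovered = [[matrix_element(t, i, j, 2) for j in range(2)] for i in range(2)]
x_new = apply(t, [1, 1j])
```

Sandwiching the operator between basis bras and kets returns the same numbers that were used to define it, which is the content of (F.12).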
F.6 Matrices and operators

Sometimes it is convenient to represent vectors and operators in the Hilbert space H in a matrix notation. Let us fix an orthonormal basis $|e_i\rangle \in H$ and represent each ket vector $|y\rangle$ by a column of its components
$$|y\rangle = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$
The bra vector $\langle x|$ will be represented by a row
$$\langle x| = [x_1^*, x_2^*, \ldots, x_n^*]$$
of complex conjugate components in the dual basis $\langle e_i|$. Then the inner product is obtained by the usual "row by column" matrix multiplication rule.
$$\langle x|y\rangle = [x_1^*, x_2^*, \ldots, x_n^*] \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \sum_i x_i^* y_i$$
Matrix elements of the operator T in (F.12) can be conveniently arranged in the matrix
$$T = \begin{pmatrix} t_{11} & t_{12} & \ldots & t_{1n} \\ t_{21} & t_{22} & \ldots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n1} & t_{n2} & \ldots & t_{nn} \end{pmatrix}$$
Then the action of the operator T on a vector, $|x'\rangle = T|x\rangle$, can be represented as a product of the matrix corresponding to T and the column vector $|x\rangle$⁵
$$\begin{pmatrix} x'_1 \\ x'_2 \\ \vdots \\ x'_n \end{pmatrix} = \begin{pmatrix} t_{11} & t_{12} & \ldots & t_{1n} \\ t_{21} & t_{22} & \ldots & t_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ t_{n1} & t_{n2} & \ldots & t_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \sum_j t_{1j}x_j \\ \sum_j t_{2j}x_j \\ \vdots \\ \sum_j t_{nj}x_j \end{pmatrix}$$
So, each operator has a unique matrix and each $n \times n$ matrix defines a unique linear operator. This establishes an isomorphism between operators and matrices. In what follows we will often use the terms operator and matrix interchangeably.
The matrix corresponding to the identity operator is $\delta_{ij}$, i.e., the unit matrix
$$I = \begin{pmatrix} 1 & 0 & \ldots & 0 \\ 0 & 1 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & 1 \end{pmatrix}$$
A diagonal operator has a diagonal matrix $d_i\delta_{ij}$
$$D = \begin{pmatrix} d_1 & 0 & \ldots & 0 \\ 0 & d_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & d_n \end{pmatrix}$$

⁵ Compare with (F.11).
The action of operators in the dual space $H^*$ will be denoted by multiplying the bra row by the operator matrix from the right
$$[y'_1, y'_2, \ldots, y'_n] = [y_1, y_2, \ldots, y_n] \begin{pmatrix} s_{11} & s_{12} & \ldots & s_{1n} \\ s_{21} & s_{22} & \ldots & s_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n1} & s_{n2} & \ldots & s_{nn} \end{pmatrix}$$
or symbolically
$$y'_i = \sum_j y_j s_{ji}$$
$$\langle y'| = \langle y|S$$
Suppose that operator T with matrix $t_{ij}$ in the ket space H transforms vector $|x\rangle$ to $|y\rangle$, i.e.,
$$y_i = \sum_j t_{ij}x_j \tag{F.13}$$
What is the matrix of operator S in the bra space $H^*$ which connects corresponding dual vectors $\langle x|$ and $\langle y|$? As $\langle x|$ and $\langle y|$ have components complex conjugate to those of $|x\rangle$ and $|y\rangle$, and S acts on bra vectors from the right, we can write
$$y_i^* = \sum_j x_j^* s_{ji} \tag{F.14}$$
On the other hand, taking the complex conjugate of equation (F.13) we obtain
$$y_i^* = \sum_j t_{ij}^* x_j^*$$
Comparing this with (F.14) we have
$$s_{ij} = t_{ji}^*$$
This means that the matrix representing the action of the operator T in the dual space $H^*$ is different from the matrix T in that rows are substituted by columns,⁶ and matrix elements are complex-conjugated. This combined operation "transposition + complex conjugation" is called Hermitian conjugation. The Hermitian conjugate (or adjoint) of operator T is denoted $T^\dagger$. In particular, we can write
$$\langle x|(T|y\rangle) = (\langle x|T^\dagger)|y\rangle \tag{F.15}$$
$$\det(T^\dagger) = (\det(T))^* \tag{F.16}$$

⁶ This is equivalent to the reflection of the matrix with respect to the main diagonal. Such a matrix operation is called transposition.

F.7 Functions of operators

The sum of two operators A and B and the multiplication of an operator A by a complex number $\lambda$ are easily expressed in terms of matrix elements
$$(A + B)_{ij} = a_{ij} + b_{ij}$$
$$(\lambda A)_{ij} = \lambda a_{ij}$$
We can define the product AB of two operators as the transformation obtained by a sequential application of B and then A. This product is also a linear transformation, i.e., an operator. The matrix of the product AB is the "row-by-column" product of their matrices $a_{ij}$ and $b_{ij}$
$$(AB)_{ij} = \sum_k a_{ik}b_{kj}$$
Lemma F.2 The adjoint of a product of operators is equal to the product of adjoint operators in the opposite order.
$$(AB)^\dagger = B^\dagger A^\dagger$$
Proof.
$$(AB)^\dagger_{ij} = (AB)^*_{ji} = \sum_k a^*_{jk}b^*_{ki} = \sum_k b^*_{ki}a^*_{jk} = \sum_k (B^\dagger)_{ik}(A^\dagger)_{kj} = (B^\dagger A^\dagger)_{ij}$$
The inverse operator $A^{-1}$ is defined by its two properties
$$A^{-1}A = AA^{-1} = I$$
The corresponding matrix is the inverse of the matrix A.
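The rule $s_{ij} = t^*_{ji}$ for the action on bra rows and Lemma F.2 can both be checked on small matrices. The sketch below is illustrative only; the helper names (`dagger`, `matmul`, `apply`, `apply_right`) and the sample matrices are assumptions, not part of the text.

```python
def dagger(t):
    """Hermitian conjugate: (T^dagger)_ij = conj(t_ji)."""
    return [[t[j][i].conjugate() for j in range(len(t))] for i in range(len(t[0]))]

def matmul(a, b):
    """Row-by-column matrix product (AB)_ij = sum_k a_ik b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def apply(t, x):
    """Ket as a column: T acts from the left, cf. (F.13)."""
    return [sum(t[i][j] * x[j] for j in range(len(x))) for i in range(len(t))]

def apply_right(row, s):
    """Bra as a row: S acts from the right, cf. (F.14)."""
    return [sum(row[j] * s[j][i] for j in range(len(row))) for i in range(len(s[0]))]

# Hypothetical sample data
A = [[1 + 1j, 2], [3j, 4 - 1j]]
B = [[0, 1], [1j, 2]]
x = [1, 2j]

y = apply(A, x)                             # |y> = A|x>
bra_x = [c.conjugate() for c in x]          # row of <x|
bra_y = apply_right(bra_x, dagger(A))       # row of <y| = <x| A^dagger

lhs = dagger(matmul(A, B))                  # (AB)^dagger
rhs = matmul(dagger(B), dagger(A))          # B^dagger A^dagger
```

The row `bra_y` should coincide with the complex conjugate of the components of `y`, and `lhs` should coincide with `rhs`.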

Using the basic operations of addition, multiplication, and inversion we can define various functions $f(A)$ of the operator A. For example, the exponential function is defined by its Taylor series
$$e^F = 1 + F + \frac{1}{2!}F^2 + \ldots \tag{F.17}$$
For any two operators A and B the expression
$$[A, B] \equiv AB - BA \tag{F.18}$$
is called the commutator. We say that two operators A and B commute with each other if $[A, B] = 0$. Clearly, any two powers of A commute: $[A^n, A^m] = 0$, and $[A, A^{-1}] = 0$. Consequently, any two functions of A commute as well: $[f(A), g(A)] = 0$.
The trace of a matrix is defined as the sum of its diagonal elements
$$Tr(A) = \sum_i A_{ii}$$
Lemma F.3 The trace of a product of operators is invariant with respect to any cyclic permutation of factors.

Proof. Take for example the trace of the product of three operators
$$Tr(ABC) = \sum_{ijk} A_{ij}B_{jk}C_{ki}$$
Then
$$Tr(BCA) = \sum_{ijk} B_{ij}C_{jk}A_{ki}$$
Changing in this expression summation indices $k \to i$, $i \to j$, and $j \to k$, we obtain
$$Tr(BCA) = \sum_{ijk} B_{jk}C_{ki}A_{ij} = Tr(ABC)$$
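Lemma F.3 is easy to test numerically. In the following sketch the helper names and the three sample matrices are illustrative assumptions; the last line shows that a non-cyclic permutation generally gives a different trace.

```python
def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    """Sum of diagonal matrix elements."""
    return sum(a[i][i] for i in range(len(a)))

# Hypothetical 2x2 matrices
A = [[1, 2], [3, 4]]
B = [[0, 1j], [1, 0]]
C = [[2, 0], [1, -1]]

tr_abc = trace(matmul(matmul(A, B), C))
tr_bca = trace(matmul(matmul(B, C), A))
tr_cab = trace(matmul(matmul(C, A), B))
tr_acb = trace(matmul(matmul(A, C), B))  # ACB is not a cyclic permutation of ABC
```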
We can define two classes of operators (and their matrices) which play important roles in quantum mechanics (see Table F.1). These are Hermitian and unitary operators. We call operator T Hermitian or self-adjoint if
$$T = T^\dagger \tag{F.19}$$
For a Hermitian T we can write
$$t_{ii} = t_{ii}^* \tag{F.20}$$
$$t_{ij} = t_{ji}^*$$
i.e., diagonal matrix elements are real, and non-diagonal matrix elements symmetrical with respect to the main diagonal are complex conjugates of each other. Moreover, from equations (F.15) and (F.19) we can calculate the inner product of vectors $\langle x|$ and $T|y\rangle$ with a Hermitian T
$$\langle x|(T|y\rangle) = (\langle x|T^\dagger)|y\rangle = (\langle x|T)|y\rangle \equiv \langle x|T|y\rangle$$
From this symmetric notation it is clear that a Hermitian T can act either to the right (on $|y\rangle$) or to the left (on $\langle x|$).

Table F.1: Actions on operators and types of linear operators in the Hilbert space
\begin{tabular}{lll}
 & Symbolic & Condition on matrix elements or eigenvalues \\
\multicolumn{3}{l}{Action on operators} \\
Complex conjugation & $A \to A^*$ & $(A^*)_{ij} = A^*_{ij}$ \\
Transposition & $A \to A^T$ & $(A^T)_{ij} = A_{ji}$ \\
Hermitian conjugation & $A \to A^\dagger = (A^*)^T$ & $(A^\dagger)_{ij} = A^*_{ji}$ \\
Inversion & $A \to A^{-1}$ & inverse eigenvalues \\
Determinant & $\det(A)$ & product of eigenvalues \\
Trace & $Tr(A)$ & $\sum_i A_{ii}$ \\
\multicolumn{3}{l}{Types of operators} \\
Identity & $I$ & $I_{ij} = \delta_{ij}$ \\
Diagonal & $D$ & $D_{ij} = d_i\delta_{ij}$ \\
Hermitian & $A = A^\dagger$ & $A_{ij} = A^*_{ji}$ \\
AntiHermitian & $A = -A^\dagger$ & $A_{ij} = -A^*_{ji}$ \\
Unitary & $A^{-1} = A^\dagger$ & unimodular eigenvalues \\
Projection & $A = A^\dagger$, $A^2 = A$ & eigenvalues 0 and 1 only \\
\end{tabular}

Operator U is called unitary if
$$U^{-1} = U^\dagger$$
or, equivalently
$$U^\dagger U = UU^\dagger = I$$
A unitary operator preserves the inner product of vectors, i.e.,
$$\langle Ua|Ub\rangle \equiv (\langle a|U^\dagger)(U|b\rangle) = \langle a|U^{-1}U|b\rangle = \langle a|I|b\rangle = \langle a|b\rangle \tag{F.21}$$
Lemma F.4 If F is a Hermitian operator then $U = e^{iF}$ is unitary.

Proof.
$$U^\dagger U = (e^{iF})^\dagger (e^{iF}) = e^{-iF^\dagger}e^{iF} = e^{-iF}e^{iF} = e^{-iF + iF} = e^0 = I$$
Lemma F.5 The determinant of a unitary matrix U is unimodular.

Proof. We use equation (F.16) to write
$$|\det(U)|^2 = \det(U)(\det(U))^* = \det(U)\det(U^\dagger) = \det(UU^\dagger) = \det(I) = 1$$
Operator A is called antilinear if $A(\alpha|x\rangle + \beta|y\rangle) = \alpha^* A|x\rangle + \beta^* A|y\rangle$ for any complex $\alpha$ and $\beta$. An antilinear operator with the property $\langle Ay|Ax\rangle = \langle y|x\rangle^*$ is called antiunitary.
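Lemmas F.4 and F.5 can be illustrated numerically by summing the Taylor series (F.17) for a small Hermitian matrix. This is a rough sketch under stated assumptions: the function `expm` below is a hand-written truncated series (not a library routine), the number of terms is an arbitrary choice that is sufficient for this small matrix, and the matrix `F` is a hypothetical example.

```python
def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate: transposition plus complex conjugation."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def expm(a, terms=40):
    """Truncated Taylor series (F.17): 1 + A + A^2/2! + ..."""
    n = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

F = [[1.0, 0.5 - 0.5j], [0.5 + 0.5j, -1.0]]  # hypothetical Hermitian matrix
iF = [[1j * v for v in row] for row in F]
U = expm(iF)                                  # U = exp(iF)
UdU = matmul(dagger(U), U)                    # should be close to the unit matrix
det_modulus = abs(det2(U))                    # should be close to 1
```

Because the series is truncated and evaluated in floating point, the identities hold only up to rounding errors, so comparisons should use a small tolerance.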
F.8 Linear operators in different orthonormal bases

So far, we have been working with matrix elements of operators in a fixed orthonormal basis $|e_i\rangle$. However, in a different basis the operator is represented by a different matrix. Nevertheless, we are going to show that properties of operators defined above remain valid in all orthonormal basis sets. In other words, we would like to demonstrate that the above operator properties are basis-independent.

Theorem F.6 $|e_i\rangle$ and $|e'_i\rangle$ are two orthonormal bases if and only if there exists a unitary operator U such that
$$U|e_i\rangle = |e'_i\rangle \tag{F.22}$$
Proof. The basis $|e'_i\rangle$ obtained by applying a unitary transformation U to the orthonormal basis $|e_i\rangle$ is orthonormal, because unitary transformations preserve inner products of vectors (F.21). To prove the reverse statement let us form a matrix
$$\begin{pmatrix} \langle e_1|e'_1\rangle & \langle e_1|e'_2\rangle & \ldots & \langle e_1|e'_n\rangle \\ \langle e_2|e'_1\rangle & \langle e_2|e'_2\rangle & \ldots & \langle e_2|e'_n\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle e_n|e'_1\rangle & \langle e_n|e'_2\rangle & \ldots & \langle e_n|e'_n\rangle \end{pmatrix}$$
with matrix elements
$$u_{ji} = \langle e_j|e'_i\rangle$$
The operator U corresponding to this matrix can be written as
$$U = \sum_{jk}|e_j\rangle u_{jk}\langle e_k| = \sum_{jk}|e_j\rangle\langle e_j|e'_k\rangle\langle e_k|$$
So, acting on the vector $|e_i\rangle$
$$U|e_i\rangle = \sum_{jk}|e_j\rangle\langle e_j|e'_k\rangle\langle e_k|e_i\rangle = \sum_{jk}|e_j\rangle\langle e_j|e'_k\rangle\delta_{ki} = \sum_j |e_j\rangle\langle e_j|e'_i\rangle = |e'_i\rangle$$
it makes vector $|e'_i\rangle$ as required. Moreover, this operator is unitary because⁷
$$(U^\dagger U)_{ij} = \sum_k u^*_{ki}u_{kj} = \sum_k \langle e_k|e'_i\rangle^*\langle e_k|e'_j\rangle = \sum_k \langle e'_i|e_k\rangle\langle e_k|e'_j\rangle = \langle e'_i|e'_j\rangle = \delta_{ij} = I_{ij}$$

⁷ Here we use the following representation of the identity operator
$$I = \sum_i |e_i\rangle\langle e_i| \tag{F.23}$$
which is valid in each orthonormal basis $|e_i\rangle$.

If F is an operator with matrix elements $f_{ij}$ in the basis $|e_k\rangle$, then its matrix elements $f'_{ij}$ in the basis $|e'_k\rangle = U|e_k\rangle$ can be obtained by the formula
$$f'_{ij} = \langle e'_i|F|e'_j\rangle = (\langle e_i|U^\dagger)F(U|e_j\rangle) = \langle e_i|U^\dagger FU|e_j\rangle = \langle e_i|U^{-1}FU|e_j\rangle \tag{F.24}$$
Equation (F.24) can be viewed from two different but equivalent perspectives. One can regard (F.24) either as matrix elements of F in the new basis set $U|e_i\rangle$ (a passive view) or as matrix elements of the transformed operator $U^{-1}FU$ in the original basis set $|e_i\rangle$ (an active view).
When the basis is changing, the matrix of the operator changes as well, but the operator's type remains the same. If operator F is Hermitian, then in the new basis⁸
$$(F')^\dagger = (U^{-1}FU)^\dagger = U^\dagger F^\dagger (U^{-1})^\dagger = U^{-1}FU = F'$$
it is Hermitian as well.
If operator V is unitary, then for the transformed operator $V'$ we have
$$(V')^\dagger V' = (U^{-1}VU)^\dagger V' = U^\dagger V^\dagger (U^{-1})^\dagger V' = U^{-1}V^\dagger UV' = U^{-1}V^\dagger UU^{-1}VU = U^{-1}V^\dagger VU = U^{-1}U = I$$
so $V'$ is also unitary.

Lemma F.7 The trace of an operator is basis-independent.

Proof. From Lemma F.3 we obtain
$$Tr(U^{-1}AU) = Tr(AUU^{-1}) = Tr(A)$$

⁸ Adopting the active view and omitting symbols for basis vectors.
F.9 Diagonalization of Hermitian and unitary matrices

We see that the choice of basis in the Hilbert space is a matter of convenience. So, when performing calculations it is always a good idea to choose a basis in which operators have the simplest form, e.g., diagonal. It appears that Hermitian and unitary operators can always be made diagonal by an appropriate choice of basis. Suppose that vector $|x\rangle$ satisfies the equation
$$F|x\rangle = \lambda|x\rangle$$
where $\lambda$ is a complex number called an eigenvalue of the operator F. Then $|x\rangle$ is called an eigenvector of the operator F.

Theorem F.8 (spectral theorem) For any Hermitian or unitary operator F there is an orthonormal basis $|e_i\rangle$ such that
$$F|e_i\rangle = f_i|e_i\rangle \tag{F.25}$$
where $f_i$ are complex numbers.

For the proof of this theorem see ref. [Rud91].

Equation (F.25) means that the matrix of the operator F is diagonal in the basis $|e_i\rangle$
$$F = \begin{pmatrix} f_1 & 0 & \ldots & 0 \\ 0 & f_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & f_n \end{pmatrix} \tag{F.26}$$
and according to (F.12) each Hermitian or unitary operator can be expressed through its eigenvectors and eigenvalues
$$F = \sum_i |e_i\rangle f_i\langle e_i| \tag{F.27}$$
Lemma F.9 Eigenvalues of a Hermitian operator are real.

Proof. Any Hermitian operator can be brought to the diagonal form (F.26) with eigenvalues on the diagonal. It follows from (F.20) that these diagonal matrix elements are real.
Lemma F.10 Eigenvalues of a unitary operator are unimodular.

Proof. Using representation (F.27) we can write
$$I = UU^\dagger = \left(\sum_i |e_i\rangle f_i\langle e_i|\right)\left(\sum_j |e_j\rangle f_j^*\langle e_j|\right) = \sum_{ij} f_i f_j^*|e_i\rangle\langle e_i|e_j\rangle\langle e_j| = \sum_{ij} f_i f_j^*|e_i\rangle\delta_{ij}\langle e_j| = \sum_i |f_i|^2|e_i\rangle\langle e_i|$$
Since all eigenvalues of the identity operator are 1, we obtain $|f_i|^2 = 1$.
One benefit of diagonalization is that functions of operators are easily defined in the diagonal form. If operator A has a diagonal form
$$A = \begin{pmatrix} a_1 & 0 & \ldots & 0 \\ 0 & a_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & a_n \end{pmatrix}$$
then operator $f(A)$ (in the same basis) has the form
$$f(A) = \begin{pmatrix} f(a_1) & 0 & \ldots & 0 \\ 0 & f(a_2) & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & f(a_n) \end{pmatrix}$$
For example, the matrix of the inverse operator is⁹
$$A^{-1} = \begin{pmatrix} a_1^{-1} & 0 & \ldots & 0 \\ 0 & a_2^{-1} & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & a_n^{-1} \end{pmatrix}$$
From Theorem F.8 and Lemma F.10, there is a basis in which the matrix of a unitary operator U is diagonal
$$U = \begin{pmatrix} e^{if_1} & 0 & \ldots & 0 \\ 0 & e^{if_2} & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & e^{if_n} \end{pmatrix}$$
with real $f_i$. It then follows that each unitary operator can be represented as
$$U = e^{iF}$$
where F is Hermitian
$$F = \begin{pmatrix} f_1 & 0 & \ldots & 0 \\ 0 & f_2 & \ldots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \ldots & f_n \end{pmatrix}$$
Together with Lemma F.4 this establishes an isomorphism between the sets of Hermitian and unitary operators.

⁹ Note that the inverse operator $A^{-1}$ is defined only if all eigenvalues of A are nonzero.
Lemma F.11 Unitary transformation of a Hermitian or unitary operator does not change the spectrum of its eigenvalues.

Proof. If $|\psi_k\rangle$ is an eigenvector of M with eigenvalue $m_k$
$$M|\psi_k\rangle = m_k|\psi_k\rangle$$
then vector $U|\psi_k\rangle$ is an eigenvector of the unitarily transformed operator $M' = UMU^{-1}$ with the same eigenvalue
$$M'(U|\psi_k\rangle) = UMU^{-1}(U|\psi_k\rangle) = UM|\psi_k\rangle = Um_k|\psi_k\rangle = m_k(U|\psi_k\rangle)$$

Appendix G
Subspaces and projection operators
G.1 Projections

Two subspaces A and B in the Hilbert space H are called orthogonal (denoted $A \perp B$) if any vector from A is orthogonal to any vector from B. The span of all vectors which are orthogonal to A is called the orthogonal complement to the subspace A and denoted $A'$.
For a subspace A (with $\dim(A) = m$) in the Hilbert space H (with $\dim(H) = n > m$) we can select an orthonormal basis $|e_i\rangle$ such that the first m vectors with indices $i = 1, 2, \ldots, m$ belong to A and vectors with indices $i = m + 1, m + 2, \ldots, n$ belong to the orthogonal complement $A'$. Then for each vector $|y\rangle$ we can write
$$|y\rangle = \sum_{i=1}^{n}|e_i\rangle\langle e_i|y\rangle = \sum_{i=1}^{m}|e_i\rangle\langle e_i|y\rangle + \sum_{i=m+1}^{n}|e_i\rangle\langle e_i|y\rangle$$
The first sum lies entirely in A and is denoted by $|y_\parallel\rangle$. The second sum lies in $A'$ and is denoted $|y_\perp\rangle$. This means that we can always make a decomposition of $|y\rangle$ into two uniquely defined mutually orthogonal components $|y_\parallel\rangle$ and $|y_\perp\rangle$¹
$$|y\rangle = |y_\parallel\rangle + |y_\perp\rangle$$
$$|y_\parallel\rangle \in A$$
$$|y_\perp\rangle \in A'$$
Then we can define a linear operator $P_A$ called the projection on the subspace A, which associates with any vector $|y\rangle$ its component in the subspace A
$$P_A|y\rangle = |y_\parallel\rangle$$
The subspace A is called the range of the projection $P_A$. In the bra-ket notation we can also write
$$P_A = \sum_{i=1}^{m}|e_i\rangle\langle e_i|$$
so that in the above basis $|e_i\rangle$ the operator $P_A$ has a diagonal matrix with the first m diagonal entries equal to 1, and all others equal to 0. From this, it immediately follows that
$$P_{A'} = 1 - P_A$$
A set of projections $P_\alpha$ on mutually orthogonal subspaces $H_\alpha$ is called a decomposition of unity if
$$1 = \sum_\alpha P_\alpha$$
or, equivalently
$$H = \oplus_\alpha H_\alpha$$
Thus $P_A$ and $P_{A'}$ provide an example of the decomposition of unity.

¹ We will also say that Hilbert space H is represented as a direct sum ($H = A \oplus A'$) of orthogonal subspaces A and $A'$.
Theorem G.1 Operator P is a projection if and only if P is Hermitian and $P^2 = P$.

Proof. For Hermitian P, there is a basis $|e_i\rangle$ in which this operator is diagonal.
$$P = \sum_i |e_i\rangle p_i\langle e_i|$$
Then
$$0 = P^2 - P = \left(\sum_i |e_i\rangle p_i\langle e_i|\right)\left(\sum_j |e_j\rangle p_j\langle e_j|\right) - \sum_i |e_i\rangle p_i\langle e_i| = \sum_{ij}|e_i\rangle p_i p_j\delta_{ij}\langle e_j| - \sum_i |e_i\rangle p_i\langle e_i| = \sum_i |e_i\rangle(p_i^2 - p_i)\langle e_i|$$
Therefore $p_i^2 - p_i = 0$ and either $p_i = 0$ or $p_i = 1$. From this we conclude that P is a projection on the subspace spanned by eigenvectors with eigenvalue 1.
To prove the inverse statement we note that any projection operator is Hermitian because it has real eigenvalues 1 and 0. Furthermore, for any vector $|y\rangle$
$$P^2|y\rangle = P|y_\parallel\rangle = |y_\parallel\rangle = P|y\rangle$$
which proves that $P^2 = P$.
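The two defining properties in Theorem G.1 can be checked for a projection on a one-dimensional subspace. The sketch below is illustrative: the unit vector `e` and the helper names are assumptions made for this example, and the comparisons use a tolerance because the arithmetic is done in floating point.

```python
def outer(v):
    """Matrix of |v><v|: element (i, j) is v_i * conj(v_j)."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Hermitian conjugate."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices within a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

e = [0.6, 0.8j]   # hypothetical unit vector: |0.6|^2 + |0.8j|^2 = 1
P = outer(e)      # projection on the subspace spanned by |e>
Q = [[(1 if i == j else 0) - P[i][j] for j in range(2)] for i in range(2)]  # 1 - P
zero = [[0, 0], [0, 0]]
```

With these definitions, `P` is idempotent and Hermitian, and `Q` projects on the orthogonal complement, so the product of `P` and `Q` vanishes.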
G.2 Commuting operators

Lemma G.2 Subspaces A and B are orthogonal if and only if $P_A P_B = P_B P_A = 0$.

Proof. Assume that
$$P_A P_B = P_B P_A = 0 \tag{G.1}$$
and suppose that there is a vector $|y\rangle \in B$ such that $|y\rangle$ is not orthogonal to A. Then $P_A|y\rangle = |y_A\rangle \neq 0$. From these properties we obtain
$$P_A P_B|y\rangle = P_A|y\rangle = |y_A\rangle = P_A|y_A\rangle$$
$$P_B P_A|y\rangle = P_B|y_A\rangle$$
From the commutativity of $P_A$ and $P_B$ we obtain
$$P_A|y_A\rangle = P_A P_B|y\rangle = P_B P_A|y\rangle = P_B|y_A\rangle$$
$$P_A P_B|y_A\rangle = P_A P_A|y_A\rangle = P_A|y_A\rangle \neq 0$$
So, we found a vector $|y_A\rangle$ for which $P_A P_B|y_A\rangle \neq 0$, in disagreement with our original assumption (G.1).
The inverse statement is proven as follows. For each vector $|x\rangle$, the projection $P_A|x\rangle$ is in the subspace A. If A and B are orthogonal, then the second projection $P_B P_A|x\rangle$ yields the zero vector. The same arguments show that $P_A P_B|x\rangle = 0$, and $P_A P_B = P_B P_A$.
Lemma G.3 If $A \perp B$ then $P_A + P_B$ is the projection on the direct sum $A \oplus B$.

Proof. If we build an orthonormal basis $|e_i\rangle$ in $A \oplus B$ such that the first $\dim(A)$ vectors belong to A, and the next $\dim(B)$ vectors belong to B, then
$$P_A + P_B = \sum_{i=1}^{\dim(A)}|e_i\rangle\langle e_i| + \sum_{j=\dim(A)+1}^{\dim(A)+\dim(B)}|e_j\rangle\langle e_j| = P_{A \oplus B}$$
Lemma G.4 If $A \subseteq B$ (A is a subspace of B) then
$$P_A P_B = P_B P_A = P_A$$
Proof. If $A \subseteq B$ then there exists a subspace C in B such that $C \perp A$ and $B = A \oplus C$.² According to Lemmas G.2 and G.3
$$P_A P_C = P_C P_A = 0$$
$$P_B = P_A + P_C$$
$$P_A P_B = P_A(P_A + P_C) = P_A^2 = P_A$$
$$P_B P_A = (P_A + P_C)P_A = P_A$$

² This subspace is composed of vectors in B which are orthogonal to A.

If there exist three mutually orthogonal subspaces X, Y, and Z, such that $A = X \oplus Y$ and $B = X \oplus Z$, then subspaces A and B (and projections $P_A$ and $P_B$) are called compatible.

Lemma G.5 Subspaces A and B are compatible if and only if their corresponding projections commute
$$[P_A, P_B] = 0$$
Proof. Let us first show that if $[P_A, P_B] = 0$ then $P_A P_B = P_B P_A = P_{A \cap B}$ is the projection on the intersection of subspaces A and B.
First we find that
$$(P_A P_B)^2 = P_A P_B P_A P_B = P_A^2 P_B^2 = P_A P_B$$
and that operator $P_A P_B$ is Hermitian, because
$$(P_A P_B)^\dagger = P_B^\dagger P_A^\dagger = P_B P_A = P_A P_B$$
Therefore, $P_A P_B$ is a projection by Theorem G.1. If $A \perp B$, then the direct statement of the Lemma follows from Lemma G.2. Suppose that A and B are not orthogonal and denote $C = A \cap B$ (C can consist of the zero vector only, of course). We can always represent $A = C \oplus X$ and $B = C \oplus Y$, therefore
$$P_A = P_C + P_X$$
$$P_B = P_C + P_Y$$
$$[P_C, P_X] = 0$$
$$[P_C, P_Y] = 0$$
We are left to show that X and Y are orthogonal. This follows from the commutator
$$0 = [P_A, P_B] = [P_C + P_X, P_C + P_Y] = [P_C, P_C] + [P_C, P_Y] + [P_X, P_C] + [P_X, P_Y] = [P_X, P_Y]$$
Let us now prove the inverse statement. From the compatibility of A and B it follows that
$$P_A = P_X + P_Y$$
$$P_B = P_X + P_Z$$
$$P_X P_Y = P_X P_Z = P_Y P_Z = 0$$
$$[P_A, P_B] = [P_X + P_Y, P_X + P_Z] = 0$$
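A small three-dimensional example may help to see Lemma G.5 at work. The matrices below are illustrative assumptions: `P_A` and `P_B` project on the compatible subspaces $A = X \oplus Y$ and $B = X \oplus Z$ spanned by coordinate axes, while `P_v` projects on the line along $(|e_1\rangle + |e_2\rangle)/\sqrt{2}$, which is not compatible with B.

```python
def matmul(a, b):
    """Row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def commutator(a, b):
    """[A, B] = AB - BA, cf. (F.18)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

P_A = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]          # projection on X (+) Y
P_B = [[1, 0, 0], [0, 0, 0], [0, 0, 1]]          # projection on X (+) Z
P_v = [[0.5, 0.5, 0], [0.5, 0.5, 0], [0, 0, 0]]  # projection on the line along e1 + e2
zero = [[0] * 3 for _ in range(3)]

comm_ab = commutator(P_A, P_B)   # vanishes: A and B are compatible
prod_ab = matmul(P_A, P_B)       # projection on the intersection X
comm_vb = commutator(P_v, P_B)   # does not vanish: not compatible
```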
Lemma G.6 If projection P is compatible with all other projections in the Hilbert space, then either $P = 0$ or $P = 1$.

Proof. Suppose that $P \neq 0$ and $P \neq 1$. Then P has a non-empty range A, which is different from H. So, the orthogonal complement $A'$ is not empty as well. Choose an arbitrary vector $|y\rangle$ with non-zero components $|y_\parallel\rangle$ and $|y_\perp\rangle$ with respect to A. Then it is easy to show that the projection on $|y\rangle$ does not commute with P. Therefore, by Lemma G.5 this projection is not compatible with P.
Note that two or more eigenvectors of a Hermitian operator F may correspond to the same eigenvalue (such an eigenvalue is called degenerate). Then any linear combination of these eigenvectors is again an eigenvector with the same eigenvalue. The span of all eigenvectors with the same eigenvalue f is called the eigensubspace of the operator F, and one can associate a projection $P_f$ on this subspace with eigenvalue f. Then Hermitian operator F can be written as
$$F = \sum_f f P_f \tag{G.2}$$
where index f now runs over all distinct eigenvalues of F and $P_f$ are referred to as spectral projections of F. This means that each Hermitian operator defines a unique decomposition of unity $I = \sum_f P_f$. Inversely, if $P_f$ is a decomposition of unity and f are real numbers then equation (G.2) defines a unique Hermitian operator.
Lemma G.7 If two Hermitian operators F and G commute then all spectral projections of F commute with G.

Proof. Consider operator P which is a spectral projection of F. Take any vector $|x\rangle$ in the range of P, i.e.,
$$P|x\rangle = |x\rangle$$
$$F|x\rangle = f|x\rangle$$
for some real f. Let us first prove that the vector $G|x\rangle$ also lies in the range of P. Indeed, using the commutativity of F and G we obtain
$$FG|x\rangle = GF|x\rangle = Gf|x\rangle = fG|x\rangle$$
This means that operator G leaves all eigensubspaces of F invariant. Then for any vector $|x\rangle$ the vectors $P|x\rangle$ and $GP|x\rangle$ lie in the range of P. Therefore
$$PGP = GP \tag{G.3}$$
Taking the adjoint of both sides we obtain
$$PGP = PG \tag{G.4}$$
Now subtracting (G.4) from (G.3) we obtain
$$[G, P] = GP - PG = 0$$
Theorem G.8 Two Hermitian operators F and G commute if and only if all their spectral projections commute.

Proof. We write
$$F = \sum_i f_i P_i \tag{G.5}$$
$$G = \sum_j g_j Q_j \tag{G.6}$$
If $[P_i, Q_j] = 0$ for all i, j, then obviously $[F, G] = 0$. To prove the reverse statement we notice that from Lemma G.7 each spectral projection $P_i$ commutes with G. From the same Lemma it follows that each spectral projection of G commutes with $P_i$.

Theorem G.9 If two Hermitian operators F in (G.5) and G in (G.6) commute then there is a basis $|e_i\rangle$ in which both F and G are diagonal, i.e., $|e_i\rangle$ are common eigenvectors of F and G.

Proof. The identity operator can be written in three different ways
$$I = \sum_i P_i$$
$$I = \sum_j Q_j$$
$$I = I \cdot I = \left(\sum_i P_i\right)\left(\sum_j Q_j\right) = \sum_{ij} P_i Q_j$$
where $P_i$ and $Q_j$ are spectral projections of operators F and G, respectively. Since F and G commute, the operators $P_i Q_j$ with different i and/or j are projections on mutually orthogonal subspaces. So, these projections form a spectral decomposition of unity, and the desired basis is obtained by coupling bases in the subspaces $P_i Q_j$.
Appendix H
Representations of groups

A representation of a group G is a homomorphism¹ between the group G and the group of linear transformations in a vector space. In other words, to each group element g there corresponds a matrix $U_g$ with non-zero determinant.² The group multiplication is represented by the matrix product and the following conditions are satisfied
$$U_{g_1}U_{g_2} = U_{g_1 g_2}$$
$$U_{g^{-1}} = U_g^{-1}$$
$$U_e = I$$
Each group has a trivial representation in which each group element is represented by the identity operator. If the linear space of the representation is a Hilbert space H, then we can define a particularly useful class of unitary representations. These representations are made of unitary operators.

¹ Homomorphism = a mapping that preserves group operations.
² Matrices with zero determinant cannot be inverted, so they cannot represent group elements.

H.1 Unitary representations of groups

Two representations $U_g$ and $U'_g$ of a group G in the Hilbert space H are called unitarily equivalent if there exists a unitary operator V such that for each $g \in G$
$$U'_g = VU_gV^{-1} \tag{H.1}$$
Having two representations $U_g$ and $V_g$ in Hilbert spaces $H_1$ and $H_2$ respectively, we can always build another representation $W_g$ in the Hilbert space $H = H_1 \oplus H_2$ by joining two matrices in the block diagonal form.
$$W_g = \begin{pmatrix} U_g & 0 \\ 0 & V_g \end{pmatrix} \tag{H.2}$$
This is called the direct sum of two representations. The direct sum is denoted by the sign $\oplus$
$$W_g = U_g \oplus V_g$$
A representation is called reducible if there is a unitary transformation

(H.1) that brings representation matrices to the block diagonal form (H.2)
for all g. Otherwise, the representation is called irreducible.
Casimir operators are operators which commute with all representatives
of group elements.
Lemma H.1 (Schur’s first lemma [Hsi00]) Casimir operators of an uni-

tary irreducible representation of any group are constant multiples of the unit
matrix.
From Appendix E.1 we know that elements of any Lie group in the vicinity
of the unit element can be represented as
g = eA
where A is an element from the Lie algebra of the group. Correspondingly,

any matrix of the unitary group representation in H can be written as
i
Ug = e− ~ FA
H.2. STONE’S THEOREM 669
where FA is a Hermitian operator and ~ is a real constant.3 Operators FA

form a representation of the Lie algebra in the Hilbert space H. If the Lie
bracket of two Lie algebra elements is [A, B] = C, then the commutator of
their Hermitian representatives is
[FA , FB ] ≡ FA FB − FB FA = i~Fc
H.2 Stone’s theorem

Stone’s theorem provides a valuable information about unitary representa-
tions of 1-dimensional Lie groups. Such groups are called also one-parameter
Lie groups, because all their elements g(z) can be parameterized with one
real parameter z ∈ R, so that
g(0) = e
g(z1 )g(z2 ) = g(z1 + z2 )
g(z)−1 = g(−z)
Theorem H.2 (Stone [Sto32]) If Ug is a unitary representation of a 1-

dimensional Lie group in the Hilbert space H, then there exists an Hermitian
operator T in H, such that
i
Ug(z) = e− ~ T z (H.3)
This theorem is useful not only for 1-dimensional Lie groups, but also
for Lie groups of arbitrary dimension. The reason is that in any Lie group
one can find multiple one-parameter subgroups, for which the theorem can be
applied.4 For example consider an arbitrary Lie group G and a basis vector
~t from its Lie algebra. Consider a set of group elements of the form
~
g(z) = ez t (H.4)
3 Here we use Planck’s constant, but any other nonzero real constant will do as well.
4 See Appendix E.1.
where parameter z runs through all real numbers z ∈ R. It is easy to see

that the set (H.4) forms a one-parameter subgroup in G. Indeed, this set
contains the unit element (when z = 0); the group product is defined as
$$g(z_1) g(z_2) = e^{z_1\vec{t}} e^{z_2\vec{t}} = e^{(z_1 + z_2)\vec{t}} = g(z_1 + z_2)$$
and the inverse element is
$$g(z)^{-1} = e^{-z\vec{t}} = g(-z)$$
From Stone’s theorem we can then conclude that in any unitary
representation of G representatives of g(z) have the form (H.3) with some fixed
Hermitian operator T.
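The content of Theorem H.2 can be illustrated numerically in finite dimension. The sketch below is only an illustration under stated assumptions: the 2 × 2 Hermitian matrix `T`, the choice of units with `hbar = 1`, and the helper name `U` are not from the text. It builds $e^{-\frac{i}{\hbar}Tz}$ from the spectral decomposition of `T` and checks the three one-parameter group properties together with unitarity.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1, chosen only for this illustration

# An arbitrary 2x2 Hermitian matrix standing in for the operator T.
T = np.array([[1.0, 2.0 - 1.0j],
              [2.0 + 1.0j, -0.5]])

def U(z):
    """Return exp(-i T z / hbar) via the spectral decomposition of T."""
    eigvals, eigvecs = np.linalg.eigh(T)
    phases = np.exp(-1j * eigvals * z / hbar)
    return eigvecs @ np.diag(phases) @ eigvecs.conj().T

z1, z2 = 0.7, -1.9
identity = np.eye(2)

is_unit_at_zero = np.allclose(U(0.0), identity)              # g(0) = e
is_unitary = np.allclose(U(z1).conj().T @ U(z1), identity)   # U is unitary
composes = np.allclose(U(z1) @ U(z2), U(z1 + z2))            # g(z1) g(z2) = g(z1 + z2)
inverts = np.allclose(np.linalg.inv(U(z1)), U(-z1))          # g(z)^{-1} = g(-z)
print(is_unit_at_zero, is_unitary, composes, inverts)
```

The theorem itself runs in the opposite direction (from the unitary group to the existence of T); the code only confirms that any Hermitian T generates such a group.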
H.3 Heisenberg Lie algebra

The Heisenberg Lie algebra $h_{2n}$ of dimension 2n has basis elements $P_i$ and
$R_i$ (i = 1, 2, . . . , n) with Lie brackets
$$[P_i, P_j] = [R_i, R_j] = 0$$
$$[R_i, P_j] = \delta_{ij}$$
The following theorem is applicable
Theorem H.3 (Stone-von Neumann [vN31]) If $(P_i, R_i)$ (i = 1, 2, . . . , n)
is a Hermitian representation5 of the Heisenberg Lie algebra $h_{2n}$ in the Hilbert
space H, then
1. representatives $P_i$ and $R_i$ have continuous spectra from −∞ to ∞.

5 This means that Hermitian operators $P_i$ and $R_i$ satisfy commutation relations
$$[P_i, P_j] = [R_i, R_j] = 0$$
$$[R_i, P_j] = i\hbar\delta_{ij}$$
where $\hbar$ is a real constant.

2. any irreducible representation of $h_{2n}$ is unitarily equivalent to the
so-called Schrödinger representation. In the physically relevant case n = 3,
the Schrödinger representation is the one described in subsection 5.2.3:
vectors in the Hilbert space are represented by complex functions on $R^3$;
operator R multiplies these functions by r; operator P is the differentiation
$-i\hbar\, d/d\mathbf{r}$.
H.4 Double-valued representations of the rotation group
The rotation group6 has a peculiar non-trivial topology: Results of two
rotations around the same axis by angles φ + 2πn (with different integer n)
are physically indistinguishable. Then the region of independent rotation
vectors7 in $R^3$ can be described as the interior of the sphere of radius π with
opposite points on the surface of the sphere being equivalent. This set of
points will be referred to as the ball Π (see Fig. H.1). The unit element
$\{\vec{0}\}$ is in the center of the ball. We will be interested in one-parameter
families of group elements8 which form continuous curves in the group manifold
Π. Since the opposite points on the surface of the ball are identical in our
topology, any continuous path that crosses the surface must reappear on the
opposite side of the sphere (see Fig. H.1(a)).
A topological space is simply connected if every loop can be continuously
deformed to a single point. An example of a simply connected topological
space is the surface of a sphere. However, the manifold Π of the rotation
parameters is not simply connected. The loop shown in Fig. H.1(a) crosses
the sphere once and cannot be shrunk to a single point. However, the loop
shown in Fig. H.1(b) can be continuously deformed to a point, because it
crosses the sphere twice. It appears that for any rotation R there are two
classes of paths from the group’s unit element $\{\vec{0}\}$ to R. They are also
called the homotopy classes. These two classes consist of paths that cross
6 See Appendix D.
7 Recall from Appendix D.5 that the direction of the rotation vector $\vec{\phi}$ coincides with the
axis of rotation, and its length φ is the rotation angle.
8 They are not necessarily one-parameter subgroups.
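[Figure H.1: two panels, (a) and (b), each showing the ball Π with center 0; panel (a) marks the surface points A and A′, panel (b) marks A, A′, B and B′.]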
Figure H.1: The space of parameters of the rotation group is not simply
connected: (a) a loop which starts from the center of the ball $\{\vec{0}\}$, reaches
the surface of the sphere Π at point A and then continues from the opposite
point A′ back to $\{\vec{0}\}$; this loop cannot be continuously collapsed to $\{\vec{0}\}$,
because it crosses the surface an odd number of times (1); (b) a loop
$\{\vec{0}\} \to A \to A' \to B \to B' \to \{\vec{0}\}$ which crosses the surface of the sphere Π twice
can be deformed to the point $\{\vec{0}\}$. This can be achieved by moving the points
A′ and B (and, correspondingly, the points A and B′) close to each other, so
that the segment A′ → B of the path disappears.
the surface of the sphere Π an even and an odd number of times, respectively. Two
paths from different classes cannot be continuously deformed to each other.
If we build a projective representation of the rotation group, then, similar
to our discussion of the Poincaré group in subsection 3.2.2, central charges can
be eliminated by a proper choice of numerical constants added to generators.
Then a unitary representation of the rotation group can be constructed in
which the identity rotation is represented by the identity operator and by
traveling a small loop in the group manifold from the identity element $\{\vec{0}\}$
back to $\{\vec{0}\}$ we will end up with the identity operator I again. However, if
we travel the long path $\{\vec{0}\} \to A \to A' \to \{\vec{0}\}$ in Fig. H.1(a), there is no
guarantee that in the end we will find the same representative I of the identity
transformation. We can get some other equivalent unitary operator from the
ray containing I, so the representative of $\{\vec{0}\}$ may acquire a phase factor $e^{i\phi}$
after travel along such a loop. On the other hand, making two passes on
the loop $\{\vec{0}\} \to A \to A' \to \{\vec{0}\} \to A \to A' \to \{\vec{0}\}$ we obtain a loop which
crosses the surface of the sphere twice and hence can be deformed to a point.
Therefore $e^{2i\phi} = 1$ and $e^{i\phi} = \pm 1$. This demonstrates that there are two types
of unitary representations of the rotation group: single-valued and
double-valued representations. For single-valued representations, the representative
of the identity rotation is always I. For double-valued representations, the
identity rotation has two representatives I and −I and the product of two
operators in (3.16) may have a non-trivial sign factor
$$U_{g_1} U_{g_2} = \pm U_{g_1 g_2}$$
For irreducible representations of the rotation group (both single-valued and

double-valued) see Appendix H.5.
H.5 Unitary irreducible representations of the rotation group
There is an infinite number of unitary irreducible representations $D^s$ of the
rotation group, which are characterized by the value of spin s = 0, 1/2, 1, . . ..
These representations are thoroughly discussed in a number of good
textbooks, see, e.g., ref. [Ros57]. In Table H.1 we just provide a summary of
these results: the dimension of the representation space, the value of the
Casimir operator $S^2$, the spectrum of each component of the spin operator9
and an explicit form of the three generators of the representation.
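Table H.1: Unitary irreducible representations of SU(2)

$$
\begin{array}{l|c|c|c|c}
\text{Spin:} & s = 0 & s = 1/2 & s = 1 & s = 3/2, 2, \ldots \\ \hline
\text{dimension} & 1 & 2 & 3 & 2s + 1 \\
\langle S^2 \rangle & 0 & \frac{3}{4}\hbar^2 & 2\hbar^2 & \hbar^2 s(s + 1) \\
s_x \text{ or } s_y \text{ or } s_z & 0 & -\hbar/2,\ \hbar/2 & -\hbar,\ 0,\ \hbar & -\hbar s,\ \hbar(-s + 1), \ldots, \hbar(s - 1),\ \hbar s \\
S_x & 0 & \begin{bmatrix} 0 & \hbar/2 \\ \hbar/2 & 0 \end{bmatrix} & \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -i\hbar \\ 0 & i\hbar & 0 \end{bmatrix} & \\
S_y & 0 & \begin{bmatrix} 0 & -i\hbar/2 \\ i\hbar/2 & 0 \end{bmatrix} & \begin{bmatrix} 0 & 0 & i\hbar \\ 0 & 0 & 0 \\ -i\hbar & 0 & 0 \end{bmatrix} & \text{see, e.g., ref. [Ros57]} \\
S_z & 0 & \begin{bmatrix} \hbar/2 & 0 \\ 0 & -\hbar/2 \end{bmatrix} & \begin{bmatrix} 0 & -i\hbar & 0 \\ i\hbar & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} &
\end{array}
$$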
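Representations characterized by integer spin s are single-valued.
Half-integer spin representations are double-valued.10 For example, in the
2-dimensional representation (s = 1/2), the rotation through the angle 2π
around the z-axis is represented by negative unity
$$e^{-\frac{i}{\hbar} S_z 2\pi} = \exp\left(-\frac{2\pi i}{\hbar} \begin{bmatrix} \hbar/2 & 0 \\ 0 & -\hbar/2 \end{bmatrix}\right) = \begin{bmatrix} e^{-i\pi} & 0 \\ 0 & e^{i\pi} \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} = -I$$
while a 4π rotation is represented by the unit matrix
$$e^{-\frac{i}{\hbar} S_z 4\pi} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I$$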
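The single-valued versus double-valued behavior can be checked directly from the $S_z$ matrices of Table H.1. The following is a minimal sketch, assuming units with `hbar = 1`; the helper name `rotation_about_z` is hypothetical. It exponentiates $S_z$ for s = 1/2 and s = 1 and compares the 2π and 4π rotations with ±I.

```python
import numpy as np

hbar = 1.0  # assumption: units with hbar = 1

# S_z for s = 1/2 and s = 1, copied from Table H.1.
Sz_half = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)
Sz_one = hbar * np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]], dtype=complex)

def rotation_about_z(Sz, angle):
    """Return exp(-i Sz angle / hbar) for a Hermitian matrix Sz."""
    eigvals, eigvecs = np.linalg.eigh(Sz)
    phases = np.exp(-1j * eigvals * angle / hbar)
    return eigvecs @ np.diag(phases) @ eigvecs.conj().T

half_2pi = rotation_about_z(Sz_half, 2 * np.pi)  # expected to equal -I
half_4pi = rotation_about_z(Sz_half, 4 * np.pi)  # expected to equal +I
one_2pi = rotation_about_z(Sz_one, 2 * np.pi)    # expected to equal +I

print(np.round(half_2pi.real, 12))
print(np.round(half_4pi.real, 12))
print(np.round(one_2pi.real, 12))
```

For spin 1 the 2π rotation already returns the unit matrix, which is the single-valued case described above.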
9 We denote by $S_x, S_y, S_z$ the Hermitian representatives of the Lie algebra basis vectors
$J_x, J_y, J_z$. See Appendix D.7.
10 See Appendix H.4.
H.6 Pauli matrices

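Generators of the spin 1/2 representation of the rotation group (see Table
H.1) can be conveniently expressed through Pauli matrices $\sigma_i$ (i = x, y, z)
$$S_i = \frac{\hbar}{2}\sigma_i \tag{H.5}$$
where
$$\sigma_x \equiv \sigma_1 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
$$\sigma_y \equiv \sigma_2 = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}$$
$$\sigma_z \equiv \sigma_3 = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$
Sometimes it is convenient to define a fourth Pauli matrix
$$\sigma_t \equiv \sigma_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$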
For reference we list here some properties of the Pauli matrices
$$[\sigma_i, \sigma_j] = 2i \sum_{k=1}^{3} \epsilon_{ijk}\sigma_k$$
$$\{\sigma_i, \sigma_j\} = 2\delta_{ij}$$
$$\sigma_i^2 = 1$$
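For arbitrary numerical 3-vectors a and b we have
$$(\vec{\sigma} \cdot \mathbf{a})\vec{\sigma} = \mathbf{a}\sigma_0 + i[\vec{\sigma} \times \mathbf{a}] \tag{H.6}$$
$$\vec{\sigma}(\vec{\sigma} \cdot \mathbf{a}) = \mathbf{a}\sigma_0 - i[\vec{\sigma} \times \mathbf{a}] \tag{H.7}$$
$$(\vec{\sigma} \cdot \mathbf{a})(\vec{\sigma} \cdot \mathbf{b}) = (\mathbf{a} \cdot \mathbf{b})\sigma_0 + i\vec{\sigma} \cdot [\mathbf{a} \times \mathbf{b}] \tag{H.8}$$
$$[\vec{\sigma}_{pr} \times \mathbf{a}] \cdot [\vec{\sigma}_{el} \times \mathbf{a}] = [[\vec{\sigma}_{el} \times \mathbf{a}] \times \vec{\sigma}_{pr}] \cdot \mathbf{a}$$
$$= (\mathbf{a}(\vec{\sigma}_{el} \cdot \vec{\sigma}_{pr}) - \vec{\sigma}_{pr}(\vec{\sigma}_{el} \cdot \mathbf{a})) \cdot \mathbf{a}$$
$$= a^2(\vec{\sigma}_{el} \cdot \vec{\sigma}_{pr}) - (\vec{\sigma}_{pr} \cdot \mathbf{a})(\vec{\sigma}_{el} \cdot \mathbf{a})$$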
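The single-particle identities listed in this section are easy to confirm numerically. The sketch below is an illustration only; the sample vectors `a` and `b` and the helper names `levi_civita` and `sigma_dot` are assumptions, not part of the text. It checks the commutator and anticommutator relations and the product formula (H.8).

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
], dtype=complex)
sigma0 = np.eye(2, dtype=complex)

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices taken from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) / 2

def sigma_dot(vec):
    """Return the 2x2 matrix sigma . vec for a numerical 3-vector."""
    return np.einsum('i,ijk->jk', np.asarray(vec, dtype=complex), sigma)

commutators_ok = all(
    np.allclose(sigma[i] @ sigma[j] - sigma[j] @ sigma[i],
                2j * sum(levi_civita(i, j, k) * sigma[k] for k in range(3)))
    for i in range(3) for j in range(3)
)
anticommutators_ok = all(
    np.allclose(sigma[i] @ sigma[j] + sigma[j] @ sigma[i],
                2 * (i == j) * sigma0)
    for i in range(3) for j in range(3)
)

a = np.array([0.3, -1.2, 2.0])   # arbitrary sample vectors
b = np.array([1.5, 0.4, -0.7])
lhs_H8 = sigma_dot(a) @ sigma_dot(b)
rhs_H8 = (a @ b) * sigma0 + 1j * sigma_dot(np.cross(a, b))
h8_ok = np.allclose(lhs_H8, rhs_H8)
print(commutators_ok, anticommutators_ok, h8_ok)
```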
Appendix I
Special relativity
In this Appendix we present major assertions of Einstein’s special relativity

[Ein05]. In chapter 17 we argued that this theory is approximate. We also
suggested there an alternative rigorous approach, which ensures the validity
of the relativity principle in interacting systems.
I.1 4-vector representation of the Lorentz group

The Lorentz group is a 6-dimensional subgroup of the Poincaré group, which
is formed by rotations and boosts. Linear (tensor) representations of the
Lorentz group play a significant role in many physical problems.
The 4-vector representation of the Lorentz group forms the mathematical
framework of special relativity discussed in this Appendix. This represen-
tation resembles the 3-vector representation of the rotation group.1 Let us
first define the vector space where this representation is acting. This is a
4-dimensional real vector space M whose vectors are denoted by2
$$\tilde{\tau} = \begin{bmatrix} ct \\ x \\ y \\ z \end{bmatrix}$$
1 See Appendix D.2.
2 Here c is the speed of light. Also in this book we always denote 4-vectors by the tilde.
By the way, here we do not bestow the Minkowski space M with any physical meaning.
For us M is just an abstract vector space, unrelated to the physical space and time.
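and the pseudoscalar product of any two 4-vectors $\tilde{\tau}_1$ and $\tilde{\tau}_2$ can be written
in a number of equivalent forms3
$$
\begin{aligned}
\tilde{\tau}_1 \cdot \tilde{\tau}_2 &\equiv c^2 t_1 t_2 - x_1 x_2 - y_1 y_2 - z_1 z_2 = \sum_{\mu\nu=0}^{3} (\tau_1)_\mu g^{\mu\nu} (\tau_2)_\nu \\
&= [ct_1, x_1, y_1, z_1] \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} \begin{bmatrix} ct_2 \\ x_2 \\ y_2 \\ z_2 \end{bmatrix} \\
&= \tilde{\tau}_1^T g \tilde{\tau}_2
\end{aligned}
\tag{I.1}
$$
where $g^{\mu\nu}$ are matrix elements of the so-called metric tensor
$$g = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$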
For compact notation it is convenient to define a vector with a “raised
index” and to adopt Einstein’s convention about summation over repeated
indices
$$\tau^\mu \equiv \sum_{\nu=0}^{3} g^{\mu\nu}\tau_\nu \equiv g^{\mu\nu}\tau_\nu = (ct, -x, -y, -z)$$
Then, the pseudoscalar product can be rewritten as
$$\tilde{\tau}_1 \cdot \tilde{\tau}_2 \equiv (\tau_1)_\mu (\tau_2)^\mu = (\tau_1)^\mu (\tau_2)_\mu \tag{I.2}$$
The tilde notation allows us to distinguish the pseudoscalar square (or
4-square) of the 4-vector $\tilde{\tau}$
$$\tilde{\tau}^2 \equiv \tilde{\tau} \cdot \tilde{\tau} = \tau_\mu \tau^\mu = \tau_0^2 - \tau_1^2 - \tau_2^2 - \tau_3^2 = \tau_0^2 - \tau^2$$

3 Here indices µ and ν run from 0 to 3: $\tau_0 = ct$, $\tau_1 = x$, $\tau_2 = y$, $\tau_3 = z$.
from the square of its 3-vector part
$$\tau^2 \equiv (\vec{\tau} \cdot \vec{\tau}) = \tau_1^2 + \tau_2^2 + \tau_3^2$$
A 4-vector $(\tau_0, \vec{\tau})$ is called space-like if $\tau^2 > \tau_0^2$. Time-like 4-vectors have
$\tau^2 < \tau_0^2$, and for null 4-vectors the condition is $\tau^2 = \tau_0^2$.
The 4-vector representation of the Lorentz group is defined as a repre-
sentation by linear transformations in the vector space M that conserve the
pseudoscalar product of 4-vectors. In other words, representation matrices
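Λ must satisfy
$$\tilde{\tau}_1' \cdot \tilde{\tau}_2' \equiv \Lambda\tilde{\tau}_1 \cdot \Lambda\tilde{\tau}_2 = \tilde{\tau}_1^T \Lambda^T g \Lambda \tilde{\tau}_2 = \tilde{\tau}_1^T g \tilde{\tau}_2 = \tilde{\tau}_1 \cdot \tilde{\tau}_2$$
which means that matrices Λ must have the property
$$g = \Lambda^T g \Lambda \tag{I.3}$$
One useful implication of this result is
$$\Lambda\tilde{\tau}_1 \cdot \tilde{\tau}_2 = \tilde{\tau}_1^T \Lambda^T g \tilde{\tau}_2 = \tilde{\tau}_1^T g \Lambda^{-1} \tilde{\tau}_2 = \tilde{\tau}_1 \cdot \Lambda^{-1}\tilde{\tau}_2 \tag{I.4}$$
Another property of Λ can be obtained by taking the determinant of both
sides of (I.3)
$$-1 = \det(g) = \det(\Lambda^T g \Lambda) = \det(\Lambda^T)\det(g)\det(\Lambda) = -\det(\Lambda)^2$$
which implies det(Λ) = ±1. Writing equation (I.3) for the $g_{00}$ component we
also get
$$1 = g_{00} = \sum_{\alpha',\beta'=0}^{3} \Lambda_{\alpha' 0}\, g_{\alpha'\beta'}\, \Lambda_{\beta' 0} = -\Lambda_{10}^2 - \Lambda_{20}^2 - \Lambda_{30}^2 + \Lambda_{00}^2$$
It then follows that $\Lambda_{00}^2 \geq 1$, which means that either $\Lambda_{00} \geq 1$ or $\Lambda_{00} \leq -1$.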
The unit element of the group is represented by the identity transforma-
tion I, which obviously has det(I) = 1 and I00 = 1. As we are interested
only in rotations and boosts which can be continuously connected to the unit
element, we must choose
det(Λ) = 1 (I.5)
Λ00 ≥ 1 (I.6)
The matrices satisfying equation (I.3) with additional conditions (I.5) - (I.6)
will be called pseudoorthogonal. Thus we can say that 4×4 pseudoorthogonal
matrices form a representation of the Lorentz group.
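Boost transformations can be written as
$$\begin{bmatrix} ct' \\ x' \\ y' \\ z' \end{bmatrix} = B(\vec{\theta}) \begin{bmatrix} ct \\ x \\ y \\ z \end{bmatrix} \tag{I.7}$$
where the general pseudoorthogonal matrix of a boost is4
$$
B(\vec{\theta}) = \begin{bmatrix}
\cosh\theta & -\frac{\theta_x}{\theta}\sinh\theta & -\frac{\theta_y}{\theta}\sinh\theta & -\frac{\theta_z}{\theta}\sinh\theta \\
-\frac{\theta_x}{\theta}\sinh\theta & 1 + \chi\theta_x^2 & \chi\theta_x\theta_y & \chi\theta_x\theta_z \\
-\frac{\theta_y}{\theta}\sinh\theta & \chi\theta_x\theta_y & 1 + \chi\theta_y^2 & \chi\theta_y\theta_z \\
-\frac{\theta_z}{\theta}\sinh\theta & \chi\theta_x\theta_z & \chi\theta_y\theta_z & 1 + \chi\theta_z^2
\end{bmatrix} \tag{I.8}
$$
where we denoted $\chi = (\cosh\theta - 1)\theta^{-2}$. In particular, boosts along x, y, and
z axes are represented by the following 4 × 4 matrices
$$B(\theta, 0, 0) = \begin{bmatrix} \cosh\theta & -\sinh\theta & 0 & 0 \\ -\sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{I.9}$$
$$B(0, \theta, 0) = \begin{bmatrix} \cosh\theta & 0 & -\sinh\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sinh\theta & 0 & \cosh\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{I.10}$$
$$B(0, 0, \theta) = \begin{bmatrix} \cosh\theta & 0 & 0 & -\sinh\theta \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh\theta & 0 & 0 & \cosh\theta \end{bmatrix} \tag{I.11}$$
Conservation of the pseudoscalar product by these transformations can be
easily verified.
4 Compare with equations (2.50) and (2.51).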
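The verification mentioned above can be sketched numerically for the general boost matrix (I.8). In the code below the function name `boost`, the sample rapidity vector and the sample 4-vectors are assumptions made for illustration. It checks property (I.3), conditions (I.5) - (I.6), and the conservation of the pseudoscalar product (I.1).

```python
import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])  # metric tensor from (I.1)

def boost(theta_vec):
    """Build the 4x4 boost matrix (I.8) for a rapidity 3-vector."""
    theta_vec = np.asarray(theta_vec, dtype=float)
    theta = np.linalg.norm(theta_vec)
    if theta == 0.0:
        return np.eye(4)
    n = theta_vec / theta                      # components theta_i / theta
    B = np.eye(4)
    B[0, 0] = np.cosh(theta)
    B[0, 1:] = -n * np.sinh(theta)
    B[1:, 0] = -n * np.sinh(theta)
    # chi * theta_i * theta_j = (cosh(theta) - 1) * n_i * n_j
    B[1:, 1:] += (np.cosh(theta) - 1.0) * np.outer(n, n)
    return B

def pseudoscalar(a, b):
    """Pseudoscalar product (I.1) of two 4-vectors with components (ct, x, y, z)."""
    return a @ g @ b

B = boost([0.4, -1.1, 0.25])             # arbitrary sample rapidity
tau1 = np.array([3.0, 1.0, -2.0, 0.5])   # arbitrary sample 4-vectors
tau2 = np.array([-1.0, 0.7, 0.2, 4.0])

preserves_metric = np.allclose(B.T @ g @ B, g)                       # (I.3)
unit_determinant = np.isclose(np.linalg.det(B), 1.0)                 # (I.5)
time_component_ok = B[0, 0] >= 1.0                                   # (I.6)
preserves_product = np.isclose(pseudoscalar(B @ tau1, B @ tau2),
                               pseudoscalar(tau1, tau2))
print(preserves_metric, unit_determinant, time_component_ok, preserves_product)
```

Calling `boost([theta, 0, 0])` reproduces the special case (I.9).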
Rotations are represented by 4 × 4 matrices
$$R(\vec{\phi}) = \begin{bmatrix} 1 & 0 \\ 0 & R_{\vec{\phi}} \end{bmatrix}$$
where $R_{\vec{\phi}}$ is a 3 × 3 rotation matrix (D.22). A general element of the Lorentz
group can be represented as (rotation) × (boost),5 so its matrix
$$\Lambda = R(\vec{\phi})B(\vec{\theta}) \tag{I.12}$$
preserves the pseudoscalar product just as $R(\vec{\phi})$ and $B(\vec{\theta})$ do.
So far we discussed the matrix representation of finite Lorentz transfor-
mations. Let us now find the matrix representation of the corresponding
Lie algebra. According to our discussion in Appendix H.1, the matrix of a
general Lorentz group element can be represented in the exponential form
$$\Lambda = e^{aF}$$
where F is an element of the Lie algebra and a is a real constant. Condition
(I.3) then can be rewritten as
$$0 = \Lambda^T g \Lambda - g = e^{aF^T} g e^{aF} - g = (1 + aF^T + \ldots)g(1 + aF + \ldots) - g = a(F^T g + gF) + \ldots$$
where the ellipsis indicates terms proportional to $a^2$, $a^3$, etc. This sets the
following restriction on the matrices F
$$F^T g + gF = 0.$$
5 This order of factors agrees with our convention (2.47).
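We can easily find 6 linearly independent 4 × 4 matrices satisfying this
condition. Three generators of rotations are6
$$
J_x = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix}, \quad
J_y = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, \quad
J_z = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\tag{I.13}
$$
Three generators of boosts can be obtained by differentiating explicit
representation of boosts (I.9) - (I.11)
$$
K_x = \frac{1}{c}\begin{bmatrix} 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
K_y = \frac{1}{c}\begin{bmatrix} 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
K_z = \frac{1}{c}\begin{bmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{bmatrix}
\tag{I.14}
$$
These six matrices represent basis elements of the Lie algebra, so that for
representatives of finite transformations we have
$$R(\vec{\theta}) = e^{\vec{J}\vec{\theta}}$$
$$B(\vec{\theta}) = e^{c\vec{K}\vec{\theta}}$$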
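The link between the generators (I.13) - (I.14) and finite transformations can be illustrated by exponentiating them numerically. This is a sketch under assumptions: the helpers `unit` and `expm_series` (a truncated Taylor series, adequate only for matrices of modest norm) and the sample parameter `theta` are introduced here for illustration. The code checks that each exponential satisfies condition (I.3) and that $e^{cK_x\theta}$ reproduces the boost matrix (I.9).

```python
import numpy as np

c = 299792458.0  # speed of light in m/s
g = np.diag([1.0, -1.0, -1.0, -1.0])

def unit(row, col, value):
    """Return a 4x4 matrix with a single nonzero entry (hypothetical helper)."""
    M = np.zeros((4, 4))
    M[row, col] = value
    return M

# Rotation generators (I.13) and boost generators (I.14).
Jx = unit(2, 3, 1.0) + unit(3, 2, -1.0)
Jy = unit(1, 3, -1.0) + unit(3, 1, 1.0)
Jz = unit(1, 2, 1.0) + unit(2, 1, -1.0)
Kx = (unit(0, 1, -1.0) + unit(1, 0, -1.0)) / c
Ky = (unit(0, 2, -1.0) + unit(2, 0, -1.0)) / c
Kz = (unit(0, 3, -1.0) + unit(3, 0, -1.0)) / c

def expm_series(M, terms=40):
    """Matrix exponential by a truncated Taylor series."""
    result = np.eye(4)
    term = np.eye(4)
    for n in range(1, terms):
        term = term @ M / n
        result = result + term
    return result

theta = 0.9  # arbitrary sample parameter
exponents = [theta * Jx, theta * Jy, theta * Jz,
             theta * c * Kx, theta * c * Ky, theta * c * Kz]
all_preserve_metric = all(
    np.allclose(expm_series(E).T @ g @ expm_series(E), g) for E in exponents
)

boost_x = expm_series(theta * c * Kx)
expected_I9 = np.array([
    [np.cosh(theta), -np.sinh(theta), 0.0, 0.0],
    [-np.sinh(theta), np.cosh(theta), 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
])
matches_I9 = np.allclose(boost_x, expected_I9)
print(all_preserve_metric, matches_I9)
```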
I.2 Lorentz transformations for time and position
The fundamental idea of special relativity is that Minkowski space-time M
is a faithful representation of the physical space and time. In particular,
coordinates (ct, x, y, z) can be interpreted as space-time coordinates of real
physical events7 localized in space and time. Moreover, it is claimed that 4×4
matrices (I.12) accurately describe transformations of these coordinates to
reference frames affected by rotations and/or boosts. Suppose that observer
O′ moves with respect to O with rapidity $\vec{\theta}$. Suppose also that (t, x) are
space-time coordinates of an event viewed by observer O. Then, according
to special relativity, the space-time coordinates (t′, x′) of this event from the
point of view of O′ are given by formula (I.7), which is called the Lorentz
transformation for time and position of the event. In particular, if observer
O′ moves with the speed v = c tanh θ along the x-axis, then the matrix $B(\vec{\theta})$
is (I.9)
$$B(\theta, 0, 0) = \begin{bmatrix} \cosh\theta & -\sinh\theta & 0 & 0 \\ -\sinh\theta & \cosh\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{I.15}$$
and Lorentz transformation (I.7) can be written in a more familiar form
$$t' = t\cosh\theta - (x/c)\sinh\theta \tag{I.16}$$
$$x' = x\cosh\theta - ct\sinh\theta \tag{I.17}$$
$$y' = y \tag{I.18}$$
$$z' = z \tag{I.19}$$
6 Note that matrices (D.25) - (D.26) are 3×3 submatrices of (I.13).
7 For definition of an event see subsection 17.2.1.
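Formulas (I.16) - (I.19) translate directly into a few lines of code. The sketch below is illustrative only; the function names `lorentz_boost_x` and `interval` and the sample event are assumptions. It checks three consequences of the formulas: the pseudoscalar square of the event 4-vector is unchanged, the spatial origin of O′ moves with speed v = c tanh θ in the frame O, and cosh θ coincides with the usual factor $1/\sqrt{1 - v^2/c^2}$.

```python
import math

c = 299792458.0  # speed of light in m/s

def lorentz_boost_x(t, x, y, z, theta):
    """Apply (I.16)-(I.19): boost with rapidity theta along the x-axis."""
    t_new = t * math.cosh(theta) - (x / c) * math.sinh(theta)
    x_new = x * math.cosh(theta) - c * t * math.sinh(theta)
    return t_new, x_new, y, z

def interval(t, x, y, z):
    """Pseudoscalar square of the 4-vector (ct, x, y, z)."""
    return (c * t) ** 2 - x ** 2 - y ** 2 - z ** 2

theta = 0.8                            # arbitrary sample rapidity
v = c * math.tanh(theta)               # corresponding speed of O'
event = (2.0e-6, 300.0, -40.0, 15.0)   # arbitrary sample event (t, x, y, z) seen by O
boosted = lorentz_boost_x(*event, theta)

# A point moving as x = v t in O sits at the spatial origin x' = 0 of O'.
t0 = 1.0e-6
origin_of_moving_frame = lorentz_boost_x(t0, v * t0, 0.0, 0.0, theta)

gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
print(boosted)
print(origin_of_moving_frame[1], gamma, math.cosh(theta))
```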
It is important to note that special relativity makes the following assertion
Assertion I.1 (the universality of Lorentz transformations) Lorentz trans-
formations (I.16) - (I.19) are exact and universal: they are valid for all kinds
of events in any physical system; they do not depend on the composition of
the physical system and on interactions acting there.
In the main body of this book8 we explain why Assertion I.1 does not hold
in the relativistic theory (RQD) developed here. The key difference between our
approach and the standard logic of special relativity is that in RQD boost
transformations of space-time coordinates of events involving interacting
particles have a more complicated form, which depends on the interaction and
on the state of the physical system. So, from our standpoint all consequences
of Assertion I.1 described in the rest of this Appendix are neither rigorous
nor accurate.
8
See, especially, chapter 17.
I.3 Minkowski space-time and manifest covariance
An important consequence of Assertion I.1 is the idea of the Minkowski
4-dimensional space-time. It wouldn’t be an exaggeration to say that this idea
is the foundation of the entire mathematical formalism of modern relativistic
physics.
The logic of introducing the Minkowski space-time was as follows: Ac-
cording to Assertion I.1, Lorentz transformations (I.16) - (I.19) are universal
and interaction-independent. These transformations coincide with the ab-
stract 4-vector representation of the Lorentz group introduced in Appendix
I.1. It is then natural to assume that the abstract 4-dimensional vector space
M with pseudo-scalar product defined in Appendix I.1 can be identified with
the space-time arena where all real physical processes occur. Then space and
time coordinates of any event become unified as different components of the
same time-position 4-vector, and the real geometry of the world becomes
a 4-dimensional one. Space and time of the old physics become unified as the
Minkowski space-time endowed with pseudo-Euclidean metric. Minkowski
described this space and time unification in the following words:
From henceforth, space by itself and time by itself, have vanished

into the merest shadows and only a kind of blend of the two exists
in its own right. H. Minkowski
In analogy with familiar 3D scalars, vectors, and tensors (see Appendix

D), special relativity of Einstein and Minkowski requires that physical quan-
tities transform in a linear “manifestly covariant” way, i.e., as 4-scalars, or
4-vectors, or 4-tensors, etc.
Assertion I.2 (manifest covariance of physical laws [Ein20]) Every gen-

eral law of nature must be so constituted that it is transformed into a law of
exactly the same form when, instead of the space-time variables t, x, y, z of
the original coordinate system K, we introduce new space-time variables t′ ,
x′ , y ′, z ′ of a coordinate system K ′ . In this connection the relation between
the ordinary and the accented magnitudes is given by the Lorentz transfor-
mation. Or in brief: General laws of nature are co-variant with respect to
Lorentz transformations.
From Assertions I.1 and I.2 one can immediately obtain many important
physical predictions of special relativity. One consequence of Lorentz
transformations is that the length of a measuring rod reduces by a universal factor
$$l' = l/\cosh\theta \tag{I.20}$$
from the point of view of a moving reference frame. Another well-known
result is that the duration of time intervals between any two events increases
by the same factor cosh θ
$$\Delta t' = \Delta t\cosh\theta \tag{I.21}$$

One experimentally verifiable consequence of this time dilation formula will
be discussed in the next section.
I.4 Decay of moving particles in special relativity
Suppose that from the viewpoint of observer O the unstable particle is
prepared at rest in the origin x = y = z = 0 at time t = 0 in the non-decayed
state, so that ω(0, 0) = 1.9 Then observer O may associate the space-time
point
$$(t, x, y, z)_{prep} = (0, 0, 0, 0) \tag{I.22}$$
with the event of preparation. We know that the non-decay probability
decreases with time by the (almost) exponential decay law10
$$\omega(0, t) \approx \exp\left(-\frac{t}{\tau_0}\right) \tag{I.23}$$
At time $t = \tau_0$ the non-decay probability is exactly $\omega(0, \tau_0) = e^{-1}$. This “one
lifetime” event has space-time coordinates
$$(t, x, y, z)_{life} = (\tau_0, 0, 0, 0) \tag{I.24}$$
according to the observer O.
9 Here we follow notation from chapter 13 by writing ω(θ, t) for the non-decay probability
observed from the reference frame O′ moving with respect to O with rapidity θ at time t
(measured by a clock attached to O′).
10 Actually, as we saw in subsection 13.2.3, the decay law is not exactly exponential, but
this is not important for our derivation of equation (I.25) here.

Let us now take the point of view of the moving observer O′. According
to special relativity, this observer will also see the “preparation”
and the “one lifetime” events, when the non-decay probabilities are 1 and
$e^{-1}$, respectively. However, observer O′ may disagree with O about the
space-time coordinates of these events. Substituting (I.22) and (I.24) in
(I.16) - (I.19) we see that from the point of view of O′, the “preparation”
event has coordinates (0, 0, 0, 0), and the “lifetime” event has coordinates
$(\tau_0\cosh\theta, -c\tau_0\sinh\theta, 0, 0)$. Therefore, the time elapsed between these two
events is cosh θ times longer than in the reference frame O. This also means
that the decay law is exactly cosh θ times slower from the point of view of the
moving observer O′. This finding is summarized in Einstein’s famous “time
dilation” formula
$$\omega(\theta, t) = \omega\left(0, \frac{t}{\cosh\theta}\right) \tag{I.25}$$
which was confirmed in numerous experiments [RH41, ACG+ 71, RMR+ 80],
most accurately for muons accelerated to relativistic speeds in a cyclotron
[BBC+ 77, Far92]. These experiments were certainly a triumph of Einstein’s
theory. However, as we see from the above discussion, equation (I.25) can be
derived only under Assertion I.1, which lacks proper justification. Therefore,
a question remains whether equation (I.25) is a fundamental exact result
or simply an approximation that can be disproved by more accurate
measurements. This question is addressed in chapter 13.
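The special-relativistic formula (I.25) can be put into numbers with a short sketch. The inputs here are assumptions chosen for illustration: a purely exponential rest-frame decay law as in (I.23), the approximate muon mean lifetime of 2.197 µs as `tau0`, and a sample rapidity `theta = 3`. The function names are hypothetical. The code evaluates the coordinates of the “one lifetime” event in the frame O′ and confirms that the non-decay probability equals $e^{-1}$ there.

```python
import math

c = 299792458.0   # speed of light in m/s
tau0 = 2.197e-6   # approximate muon mean lifetime at rest, in seconds

def nondecay_at_rest(t):
    """Exponential decay law (I.23) in the rest frame O."""
    return math.exp(-t / tau0)

def nondecay_moving(theta, t):
    """Time dilation formula (I.25): omega(theta, t) = omega(0, t / cosh(theta))."""
    return nondecay_at_rest(t / math.cosh(theta))

theta = 3.0  # sample rapidity; v = c tanh(theta) is about 0.995 c

# Coordinates of the "one lifetime" event (I.24) in O', from (I.16)-(I.17).
t_life = tau0 * math.cosh(theta)
x_life = -c * tau0 * math.sinh(theta)

probability_at_t_life = nondecay_moving(theta, t_life)
print(t_life / tau0, x_life, probability_at_t_life)
```

With these inputs the lifetime seen from O′ is about ten times longer than at rest, since cosh 3 ≈ 10.07.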
I.5 Ban on superluminal signaling

Perhaps, the most famous assertion of special relativity is
Assertion I.3 (no superluminal signaling) No signal may propagate faster

than the speed of light.
Figure I.1: Illustration to the special-relativistic “proof” that superluminal
signals violate the principle of causality.
The “proof” of this Assertion [Rus05] relies on the principle of causality,

which says that the cause precedes the effect in all reference frames. Suppose
that two events “Cause” and “Effect” are causally related, while separated
by a space-like interval11 in the reference frame O with coordinate axes (t, x),
as in Fig. I.1. In special relativity, we obtain coordinates of the two events
in the moving reference frame O ′ from Lorentz formulas (I.16) - (I.19). This
transformation can be represented graphically as a pseudorotation of the co-
ordinate axes shown in the figure. If the speed of O ′ is high enough, then this
observer will find that the “Effect” happens earlier than the “Cause,” which
clearly violates the principle of causality. Thus, within special relativity all
superluminal signals are forbidden. In chapter 16 we will discuss experiments
that challenge this conclusion.
11 This means that the signal has propagated from the “Cause” to the “Effect”
superluminally.
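The special-relativistic argument above can be made concrete with formula (I.16). In this sketch the two events and the function name `time_in_moving_frame` are assumptions for illustration: the “Cause” is placed at the origin and the “Effect” at a space-like separation. For rapidities above the threshold tanh θ = cΔt/Δx the transformed time of the “Effect” becomes negative, so O′ sees it before the “Cause”.

```python
import math

c = 299792458.0  # speed of light in m/s

def time_in_moving_frame(t, x, theta):
    """Time coordinate in O' according to (I.16), boost along the x-axis."""
    return t * math.cosh(theta) - (x / c) * math.sinh(theta)

cause = (0.0, 0.0)         # (t, x) in frame O
effect = (1.0e-6, 600.0)   # c*t is about 300 m < 600 m: space-like separation

# Rapidity at which O' sees the two events as simultaneous.
threshold = math.atanh(c * effect[0] / effect[1])

t_effect_slow = time_in_moving_frame(*effect, 0.5 * threshold)
t_effect_fast = time_in_moving_frame(*effect, 2.0 * threshold)
t_cause = time_in_moving_frame(*cause, 2.0 * threshold)
print(threshold, t_effect_slow, t_effect_fast, t_cause)
```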
Appendix J
Quantum fields for fermions
According to our interpretation of quantum field theory, quantum fields are

not fundamental ingredients of the material world. They are just convenient
mathematical expressions, which simplify the construction of relativistic and
cluster-separable interaction operators. For this reason, discussion of quan-
tum fields is placed in this Appendix rather than in the main body of the
book. Here we will discuss quantum fields for spin 1/2 fermions (electrons,
protons, neutrinos, and their antiparticles). In the next Appendix we will
consider the photon’s quantum field.
J.1 Dirac’s gamma matrices

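Let us introduce the following 4 × 4 Dirac gamma matrices.1
$$\gamma^0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix} = \begin{bmatrix} \sigma_0 & 0 \\ 0 & -\sigma_0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \tag{J.1}$$
$$\gamma^x = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & \sigma_x \\ -\sigma_x & 0 \end{bmatrix}$$
$$\gamma^y = \begin{bmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ -i & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & \sigma_y \\ -\sigma_y & 0 \end{bmatrix}$$
$$\gamma^z = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & \sigma_z \\ -\sigma_z & 0 \end{bmatrix}$$
$$\vec{\gamma} = \begin{bmatrix} 0 & \vec{\sigma} \\ -\vec{\sigma} & 0 \end{bmatrix} \tag{J.2}$$
1 On the right hand sides each 2 × 2 block is expressed in terms of Pauli matrices from
Appendix H.6.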
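These matrices have the following properties2
$$\gamma^0\vec{\gamma} = \vec{\gamma}^\dagger\gamma^0 = -\vec{\gamma}\gamma^0 \tag{J.3}$$
$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu} \tag{J.4}$$
$$\gamma^0\gamma^0 = 1 \tag{J.5}$$
$$\gamma^i\gamma^i = -1 \tag{J.6}$$
$$Tr(\gamma^\mu) = 0 \tag{J.7}$$
$$Tr(\gamma^\mu\gamma^\nu) = 4g^{\mu\nu} \tag{J.8}$$
$$\gamma_\mu\gamma^\mu = -\gamma^x\gamma^x - \gamma^y\gamma^y - \gamma^z\gamma^z + \gamma^0\gamma^0 = 4 \tag{J.9}$$
$$\gamma_\mu\gamma_\nu\gamma^\mu = -\gamma_\nu\gamma_\mu\gamma^\mu + 2g_{\mu\nu}\gamma^\mu = -4\gamma_\nu + 2\gamma_\nu = -2\gamma_\nu \tag{J.10}$$
If A, B, C are any linear combinations of gamma-matrices, then
$$\gamma_\mu A\gamma^\mu = -2A \tag{J.11}$$
$$\gamma_\mu AB\gamma^\mu = 2(AB + BA) \tag{J.12}$$
$$\gamma_\mu ABC\gamma^\mu = -2CBA \tag{J.13}$$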
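Several of the listed properties can be confirmed numerically with the explicit matrices (J.1) - (J.2). The code below is a sketch for illustration; the helper names `contract` and `random_combination` and the random coefficients are assumptions. It checks (J.4), (J.7), (J.9) and the contraction identities (J.11) - (J.13) for random real linear combinations A, B, C of gamma matrices.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)

# gamma[0] = gamma^0 and gamma[1..3] = gamma^x, gamma^y, gamma^z as in (J.1)-(J.2).
gamma = [np.block([[s0, zero], [zero, -s0]])]
gamma += [np.block([[zero, s], [-s, zero]]) for s in (sx, sy, sz)]
g = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4, dtype=complex)

def contract(X):
    """Return gamma_mu X gamma^mu, lowering the index with the diagonal metric."""
    return sum(g[m, m] * gamma[m] @ X @ gamma[m] for m in range(4))

rng = np.random.default_rng(0)

def random_combination():
    """Random real linear combination of the four gamma matrices."""
    coeffs = rng.normal(size=4)
    return sum(coef * G for coef, G in zip(coeffs, gamma))

A, B, C = (random_combination() for _ in range(3))

clifford_ok = all(
    np.allclose(gamma[m] @ gamma[n] + gamma[n] @ gamma[m], 2 * g[m, n] * I4)
    for m in range(4) for n in range(4)
)                                                              # (J.4)
traceless_ok = all(abs(np.trace(G)) < 1e-12 for G in gamma)    # (J.7)
j9_ok = np.allclose(contract(I4), 4 * I4)                      # (J.9)
j11_ok = np.allclose(contract(A), -2 * A)                      # (J.11)
j12_ok = np.allclose(contract(A @ B), 2 * (A @ B + B @ A))     # (J.12)
j13_ok = np.allclose(contract(A @ B @ C), -2 * C @ B @ A)      # (J.13)
print(clifford_ok, traceless_ok, j9_ok, j11_ok, j12_ok, j13_ok)
```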
J.2 Bispinor representation of the Lorentz group

In this section, we would like to build the bispinor representation D(Λ) of
the Lorentz group. Similar to the 4-vector representation from Appendix I.1,
the bispinor representation is realized by 4 × 4 matrices.
2 The indices take values µ, ν = 0, 1, 2, 3, i = 1, 2, 3.
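The boost and rotation generators of the bispinor representation of the
Lorentz group are defined through commutators of gamma matrices
$$\vec{K} = \frac{i\hbar}{4c}[\gamma^0, \vec{\gamma}] = \frac{i\hbar}{2c}\begin{bmatrix} 0 & \vec{\sigma} \\ \vec{\sigma} & 0 \end{bmatrix} \tag{J.14}$$
$$J_x = \frac{i\hbar}{4}[\gamma_y, \gamma_z] = \frac{\hbar}{2}\begin{bmatrix} \sigma_x & 0 \\ 0 & \sigma_x \end{bmatrix} \tag{J.15}$$
$$J_y = \frac{i\hbar}{4}[\gamma_z, \gamma_x] = \frac{\hbar}{2}\begin{bmatrix} \sigma_y & 0 \\ 0 & \sigma_y \end{bmatrix} \tag{J.16}$$
$$J_z = \frac{i\hbar}{4}[\gamma_x, \gamma_y] = \frac{\hbar}{2}\begin{bmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{bmatrix} \tag{J.17}$$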
Using properties of Pauli matrices from Appendix H.6, it is not difficult
to verify that these generators indeed satisfy commutation relations of the
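Lorentz algebra (3.53), (3.54), and (3.56). For example,
$$[J_x, J_y] = \frac{\hbar^2}{4}\begin{bmatrix} [\sigma_x, \sigma_y] & 0 \\ 0 & [\sigma_x, \sigma_y] \end{bmatrix} = \frac{i\hbar^2}{2}\begin{bmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{bmatrix} = i\hbar J_z$$
$$
\begin{aligned}
[J_x, K_y] &= \frac{i\hbar^2}{4c}\left(\begin{bmatrix} \sigma_x & 0 \\ 0 & \sigma_x \end{bmatrix}\begin{bmatrix} 0 & \sigma_y \\ \sigma_y & 0 \end{bmatrix} - \begin{bmatrix} 0 & \sigma_y \\ \sigma_y & 0 \end{bmatrix}\begin{bmatrix} \sigma_x & 0 \\ 0 & \sigma_x \end{bmatrix}\right) \\
&= -\frac{\hbar^2}{2c}\begin{bmatrix} 0 & \sigma_z \\ \sigma_z & 0 \end{bmatrix} = i\hbar K_z
\end{aligned}
$$
$$
\begin{aligned}
[K_x, K_y] &= -\frac{\hbar^2}{4c^2}\left(\begin{bmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{bmatrix}\begin{bmatrix} 0 & \sigma_y \\ \sigma_y & 0 \end{bmatrix} - \begin{bmatrix} 0 & \sigma_y \\ \sigma_y & 0 \end{bmatrix}\begin{bmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{bmatrix}\right) \\
&= -\frac{\hbar^2}{4c^2}\begin{bmatrix} [\sigma_x, \sigma_y] & 0 \\ 0 & [\sigma_x, \sigma_y] \end{bmatrix} = -\frac{i\hbar^2}{2c^2}\begin{bmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{bmatrix} = -\frac{i\hbar}{c^2}J_z
\end{aligned}
$$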
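The three sample commutators above, together with the block forms (J.14) and (J.17), can also be checked by direct matrix multiplication. In the sketch below the numerical values `hbar = 1.0` and `c = 2.0` are arbitrary test values (an assumption, chosen so that stray factors of c would show up), and the helper `comm` is hypothetical.

```python
import numpy as np

hbar, c = 1.0, 2.0  # arbitrary test values, not physical constants

s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)

g0 = np.block([[s0, zero], [zero, -s0]])
gx, gy, gz = (np.block([[zero, s], [-s, zero]]) for s in (sx, sy, sz))

def comm(A, B):
    """Matrix commutator [A, B]."""
    return A @ B - B @ A

# Generators (J.14)-(J.17); two lowered spatial indices give the same product
# as two raised ones, so the matrices gx, gy, gz can be used directly.
Kx, Ky, Kz = ((1j * hbar / (4 * c)) * comm(g0, gi) for gi in (gx, gy, gz))
Jx = (1j * hbar / 4) * comm(gy, gz)
Jy = (1j * hbar / 4) * comm(gz, gx)
Jz = (1j * hbar / 4) * comm(gx, gy)

jj_ok = np.allclose(comm(Jx, Jy), 1j * hbar * Jz)
jk_ok = np.allclose(comm(Jx, Ky), 1j * hbar * Kz)
kk_ok = np.allclose(comm(Kx, Ky), -(1j * hbar / c**2) * Jz)
block_J_ok = np.allclose(Jz, (hbar / 2) * np.block([[sz, zero], [zero, sz]]))
block_K_ok = np.allclose(Kx, (1j * hbar / (2 * c)) * np.block([[zero, sx], [sx, zero]]))
print(jj_ok, jk_ok, kk_ok, block_J_ok, block_K_ok)
```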
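We also get the following representation of finite boosts3
$$
\begin{aligned}
D_{ij}(e^{-\frac{ic}{\hbar}\vec{K}\cdot\vec{\theta}}) &= \exp\left(\frac{1}{2}\begin{bmatrix} 0 & \vec{\sigma}\cdot\vec{\theta} \\ \vec{\sigma}\cdot\vec{\theta} & 0 \end{bmatrix}\right) \\
&= 1 + \frac{1}{2}\begin{bmatrix} 0 & \vec{\sigma}\cdot\vec{\theta} \\ \vec{\sigma}\cdot\vec{\theta} & 0 \end{bmatrix} + \frac{1}{2!}\left(\frac{\theta}{2}\right)^2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \ldots \\
&= I\cosh\frac{\theta}{2} + \frac{2c}{i\hbar}\vec{K}\cdot\frac{\vec{\theta}}{\theta}\sinh\frac{\theta}{2}
\end{aligned}
\tag{J.18}
$$
3 Note that this representation is not unitary.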
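This equation allows us to prove another important property of gamma
matrices
$$D^{-1}(\Lambda)\gamma^\mu D(\Lambda) = \sum_\nu \Lambda_{\mu\nu}\gamma^\nu \tag{J.19}$$
where Λ is any Lorentz transformation, and $\Lambda_{\mu\nu}$ is a 4 × 4 matrix (I.12)
realizing the 4-vector representation of the Lorentz group. Indeed, let us
consider a particular case of this formula with µ = 0 and Λ being a boost
with rapidity θ along the x-axis. Then
$$
\begin{aligned}
D^{-1}(\Lambda)\gamma^0 D(\Lambda)
&= \left(I\cosh\frac{\theta}{2} - \frac{2c}{i\hbar}K_x\sinh\frac{\theta}{2}\right)\gamma^0\left(I\cosh\frac{\theta}{2} + \frac{2c}{i\hbar}K_x\sinh\frac{\theta}{2}\right) \\
&= \left(\cosh\frac{\theta}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \sinh\frac{\theta}{2}\begin{bmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{bmatrix}\right)\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \times \\
&\quad\ \left(\cosh\frac{\theta}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \sinh\frac{\theta}{2}\begin{bmatrix} 0 & \sigma_x \\ \sigma_x & 0 \end{bmatrix}\right) \\
&= \cosh^2\frac{\theta}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} - 2\sinh\frac{\theta}{2}\cosh\frac{\theta}{2}\begin{bmatrix} 0 & -\sigma_x \\ \sigma_x & 0 \end{bmatrix} + \sinh^2\frac{\theta}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \\
&= \gamma^0\cosh\theta + \gamma^x\sinh\theta
\end{aligned}
$$
in agreement with the formula for the boost matrix $\Lambda_{\mu\nu}$ (I.9).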
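The intermediate steps of this calculation can be reproduced numerically. The sketch below is an illustration under assumptions: the sample rapidity `theta`, the matrix name `M` (standing for $\frac{2c}{i\hbar}K_x$, which is independent of ħ and c) and the truncated-series helper `expm_series` are introduced here. It compares the closed form of (J.18) for a boost along the x-axis with the exponential series and then evaluates $D^{-1}\gamma^0 D$.

```python
import numpy as np

s0 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
zero = np.zeros((2, 2), dtype=complex)

g0 = np.block([[s0, zero], [zero, -s0]])   # gamma^0 from (J.1)
gx = np.block([[zero, sx], [-sx, zero]])   # gamma^x from (J.2)
M = np.block([[zero, sx], [sx, zero]])     # equals (2c / (i hbar)) K_x by (J.14)
I4 = np.eye(4, dtype=complex)

def expm_series(X, terms=40):
    """Matrix exponential by a truncated Taylor series (hypothetical helper)."""
    result = np.eye(4, dtype=complex)
    term = np.eye(4, dtype=complex)
    for n in range(1, terms):
        term = term @ X / n
        result = result + term
    return result

theta = 0.7  # arbitrary sample rapidity along the x-axis

D_closed = I4 * np.cosh(theta / 2) + M * np.sinh(theta / 2)   # closed form (J.18)
D_series = expm_series(0.5 * theta * M)                       # first line of (J.18)
D_inverse = np.linalg.inv(D_closed)

lhs = D_inverse @ g0 @ D_closed
rhs = g0 * np.cosh(theta) + gx * np.sinh(theta)

closed_form_ok = np.allclose(D_closed, D_series)
transformation_ok = np.allclose(lhs, rhs)
print(closed_form_ok, transformation_ok)
```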

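One can also check for pure boosts
$$
\begin{aligned}
\gamma^0 D(e^{-\frac{ic}{\hbar}\vec{K}\cdot\vec{\theta}})\gamma^0 &= 1 + \frac{1}{2}\gamma^0\begin{bmatrix} 0 & \vec{\sigma}\cdot\vec{\theta} \\ \vec{\sigma}\cdot\vec{\theta} & 0 \end{bmatrix}\gamma^0 + \frac{1}{2!}\left(\frac{\theta}{2}\right)^2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \ldots \\
&= 1 - \frac{1}{2}\begin{bmatrix} 0 & \vec{\sigma}\cdot\vec{\theta} \\ \vec{\sigma}\cdot\vec{\theta} & 0 \end{bmatrix} + \frac{1}{2!}\left(\frac{\theta}{2}\right)^2\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \ldots \\
&= D\left(e^{\frac{ic}{\hbar}\vec{K}\cdot\vec{\theta}}\right) = D^{-1}\left(e^{-\frac{ic}{\hbar}\vec{K}\cdot\vec{\theta}}\right)
\end{aligned}
$$
A similar calculation for rotations should convince us that for a general
transformation Λ from the Lorentz group
$$\gamma^0 D(\Lambda)\gamma^0 = D^{-1}(\Lambda) \tag{J.20}$$
Another useful formula is
$$D(\Lambda)\gamma^0 D(\Lambda) = D(\Lambda)\gamma^0 D(\Lambda)\gamma^0\gamma^0 = D(\Lambda)D^{-1}(\Lambda)\gamma^0 = \gamma^0 \tag{J.21}$$
It will be convenient to introduce a slash notation for pseudoscalar
products of $\gamma^\mu$ with 4-vectors $\tilde{k}$
$$\not{k} \equiv k_\mu\gamma^\mu \equiv \gamma^0 k_0 - \vec{\gamma}\cdot\mathbf{k} \tag{J.22}$$
$$\not{k}^2 = \gamma^\mu k_\mu\gamma^\nu k_\nu = \frac{1}{2}(\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu)k_\mu k_\nu = g^{\mu\nu}k_\mu k_\nu = \tilde{k}^2 \tag{J.23}$$
$$(\not{k} - mc^2)(\not{k} + mc^2) = \not{k}\not{k} - m^2c^4 = \tilde{k}^2 - m^2c^4 \tag{J.24}$$
$$\gamma_\mu\not{k} + \not{k}\gamma_\mu = 2k_\mu \tag{J.25}$$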
J.3 Construction of the Dirac field

According to Step 1 in subsection 9.1.1, in order to construct relativistic
interaction operators, we need to associate with each particle type a
finite-dimensional representation of the Lorentz group and a quantum field. In this
section we are going to build the quantum field for electrons and positrons.
We postulate that this Dirac field has 4 components that transform by means
of the representation D(Λ) constructed above. The explicit formula for the
field is4
$$
\psi_\alpha(\tilde{x}) \equiv \psi_\alpha(\mathbf{x}, t)
= \int \frac{d\mathbf{p}}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf{p}}}}\sum_\sigma\left[e^{-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}u_\alpha(\mathbf{p}, \sigma)a_{\mathbf{p},\sigma} + e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}v_\alpha(\mathbf{p}, \sigma)b^\dagger_{\mathbf{p},\sigma}\right]
\tag{J.26}
$$
4 This form (apart from the overall normalization of the field) can be uniquely
established [Wei95] from the properties (I) - (IV) in Step 1 of subsection 9.1.1. The bispinor
index α takes values 1,2,3,4.
Here $a_{\mathbf{p},\sigma}$ is the electron annihilation operator, and $b^\dagger_{\mathbf{p},\sigma}$ is the positron
creation operator. For brevity, we denote $\tilde{p} \equiv (\omega_{\mathbf{p}}, cp_x, cp_y, cp_z)$ the
energy-momentum 4-vector and $\tilde{x} \equiv (t, x/c, y/c, z/c)$ the 4-vector in the Minkowski
space-time.5 The pseudo-scalar product of the 4-vectors is denoted by a
dot: $\tilde{p}\cdot\tilde{x} \equiv p_\mu x^\mu \equiv \mathbf{p}\mathbf{x} - \omega_{\mathbf{p}}t$ and $\omega_{\mathbf{p}} \equiv \sqrt{m^2c^4 + p^2c^2}$. Numerical factors
$u_\alpha(\mathbf{p}, \sigma)$ and $v_\alpha(\mathbf{p}, \sigma)$ will be discussed in Appendix J.4. Note that according
to equations (8.36) and (8.37)
$$\psi_\alpha(\mathbf{x}, t) = e^{-\frac{i}{\hbar}H_0 t}\psi_\alpha(\mathbf{x}, 0)e^{\frac{i}{\hbar}H_0 t} \tag{J.27}$$
so, the t-dependence demanded by equation (8.52) for regular operators is
satisfied in our definition (J.26).
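The Dirac field can be represented by a 4-component column of operator
functions
$$\psi(\tilde{x}) = \begin{bmatrix} \psi_1(\tilde{x}) \\ \psi_2(\tilde{x}) \\ \psi_3(\tilde{x}) \\ \psi_4(\tilde{x}) \end{bmatrix}$$
We will also need the conjugate field
$$
\psi^\dagger_\alpha(\tilde{x}) = \int \frac{d\mathbf{p}}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf{p}}}}\sum_\sigma\left[e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}u^\dagger_\alpha(\mathbf{p}, \sigma)a^\dagger_{\mathbf{p},\sigma} + e^{-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}v^\dagger_\alpha(\mathbf{p}, \sigma)b_{\mathbf{p},\sigma}\right]
$$
which is usually represented as a row
$$\psi^\dagger = [\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*]$$
The adjoint field
$$\overline{\psi}_\alpha(\tilde{x}) \equiv \sum_\beta \psi^\dagger_\beta(\tilde{x})\gamma^0_{\beta\alpha} \tag{J.28}$$
is also represented as a row
$$
\overline{\psi} \equiv \psi^\dagger\gamma^0 = [\psi_1^*, \psi_2^*, \psi_3^*, \psi_4^*]\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}
= [\psi_1^*, \psi_2^*, -\psi_3^*, -\psi_4^*]
$$
5 As discussed in section 17.4, the only purpose for introducing quantum fields is to
build interaction operators as in (9.13) - (9.14). In these formulas field arguments x, y, z
are integration variables. Therefore, they should not be identified with positions in the
physical space. Moreover, in applications bispinor labels α serve as dummy summation
indices, so no physical meaning should be assigned to them as well.
The quantum field for the proton-antiproton system is built similarly to
(J.26)
$$
\Psi(\tilde{x}) = \int \frac{d\mathbf{p}}{(2\pi\hbar)^{3/2}}\sqrt{\frac{Mc^2}{\Omega_{\mathbf{p}}}}\sum_\sigma\left[e^{-\frac{i}{\hbar}\tilde{P}\cdot\tilde{x}}w(\mathbf{p}, \sigma)d_{\mathbf{p},\sigma} + e^{\frac{i}{\hbar}\tilde{P}\cdot\tilde{x}}s(\mathbf{p}, \sigma)f^\dagger_{\mathbf{p},\sigma}\right]
\tag{J.29}
$$
where $\Omega_{\mathbf{p}} = \sqrt{M^2c^4 + p^2c^2}$, M is the proton mass, $\tilde{P}\cdot\tilde{x} \equiv \mathbf{p}\mathbf{x} - \Omega_{\mathbf{p}}t$, and
coefficient functions w(p, σ) and s(p, σ) are the same as u(p, σ) and v(p, σ)
but with the electron mass m replaced by the proton mass M.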
J.4 Properties of factors u and v

The key components of the quantum field formula (J.26) are numerical func-
tions uα (p, σ) and vα (p, σ). We can represent them as 4 × 2 matrices with
the (bispinor) index α = 1, 2, 3, 4 enumerating rows and the (spin projec-
tion) index σ = −1/2, 1/2 enumerating columns. Let us first postulate the
following form of these matrices at zero momentum
   
0 1 0 0
 1 0   0 0 
u(0) = 
 0
 , v(0) =  
0   0 1 
0 0 1 0
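Sometimes it is convenient to represent these matrices as four column vectors
$$
u(\mathbf 0,-1/2) = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \tag{J.30}
$$
$$
u(\mathbf 0,1/2) = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \tag{J.31}
$$
$$
v(\mathbf 0,-1/2) = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \tag{J.32}
$$
$$
v(\mathbf 0,1/2) = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \tag{J.33}
$$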
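We will get more compact formulas if we introduce 2-component quantities
$$
\chi_{1/2} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
\chi_{-1/2} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
\chi^\dagger_{1/2} = (1, 0), \quad
\chi^\dagger_{-1/2} = (0, 1) \tag{J.34}
$$
Then we can write
$$
u(\mathbf 0,\sigma) = \begin{bmatrix} \chi_\sigma \\ 0 \end{bmatrix}, \qquad
v(\mathbf 0,\sigma) = \begin{bmatrix} 0 \\ \chi_\sigma \end{bmatrix}
$$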
Let us verify that matrix $u(\mathbf 0)$ has the following property
$$
\sum_\beta D_{\alpha\beta}(R)\, u_\beta(\mathbf 0,\sigma) = \sum_\tau u_\alpha(\mathbf 0,\tau)\, D^{1/2}_{\tau\sigma}(R) \tag{J.35}
$$
where $D$ is the bispinor representation of the Lorentz group,6 $D^{1/2}$ is the 2-dimensional unitary irreducible representation of the rotation group,7 and $R$ is any rotation. By denoting $J_k$ the generators of rotations in the representation $D_{\alpha\beta}(R)$ and $S_k$ the generators of rotations in the representation $D^{1/2}_{\sigma\sigma'}(R)$ we can write equation (J.35) in an equivalent differential form
$$
\sum_\beta (J_k)_{\alpha\beta}\, u_\beta(\mathbf 0,\sigma) = \sum_\tau u_\alpha(\mathbf 0,\tau)\, (S_k)_{\tau\sigma}
$$
6
see Appendix J.1
7
see Table H.1
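Let us check, for example, that this equation is satisfied for rotations around the x-axis. Acting with the 4 × 4 matrix (J.15)
$$
J_x = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}
$$
on the index $\beta$ in $u_\beta(\mathbf 0,\sigma)$ we obtain
$$
J_x u(\mathbf 0) = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
$$
This has the same effect as acting with the 2 × 2 matrix (see Table H.1)
$$
S_x = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
$$
on the index $\tau$ in $u_\alpha(\mathbf 0,\tau)$
$$
u(\mathbf 0) S_x = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
$$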
This proves equation (J.35). Similarly, one can show
$$
\sum_\beta D_{\alpha\beta}(R)\, v_\beta(\mathbf 0,\sigma) = \sum_\tau v_\alpha(\mathbf 0,\tau)\, D^{*1/2}_{\tau\sigma}(R)
$$
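The x-axis check of (J.35) can also be reproduced mechanically. The following is a minimal Python sketch, assuming units with ħ = 1 and the column ordering σ = −1/2, 1/2 stated above; the helper `matmul` is a name introduced here only for illustration.

```python
# Check the differential form of (J.35) for rotations about the x-axis:
# J_x u(0) should equal u(0) S_x (units with hbar = 1, an assumption).

def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# 4x2 matrix u(0); columns correspond to sigma = -1/2, +1/2.
u0 = [[0, 1], [1, 0], [0, 0], [0, 0]]

# Generators of x-rotations: bispinor (4x4) and spin-1/2 (2x2).
Jx = [[0, 0.5, 0, 0], [0.5, 0, 0, 0], [0, 0, 0, 0.5], [0, 0, 0.5, 0]]
Sx = [[0, 0.5], [0.5, 0]]

lhs = matmul(Jx, u0)   # generator acting on the bispinor index
rhs = matmul(u0, Sx)   # generator acting on the spin index
print(lhs == rhs)
```

The same comparison can be repeated for the y- and z-axes once the corresponding generators from (J.15) and Table H.1 are typed in.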
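The corresponding formula for the adjoint factor $\overline u$ is obtained as follows: take the Hermitian conjugate of (J.35), multiply it by $\gamma^0$ from the right and take into account equations (J.5) and (J.20)
$$
\begin{aligned}
u^\dagger(\mathbf 0,\sigma)\gamma^0\gamma^0 D^\dagger(R)\gamma^0 &= \sum_\tau u^\dagger(\mathbf 0,\tau)\gamma^0 D^{1/2}_{\tau\sigma}(R) \\
\overline u(\mathbf 0,\sigma) D^\dagger(-R) &= \sum_\tau \overline u(\mathbf 0,\tau) D^{1/2}_{\tau\sigma}(R)
\end{aligned} \tag{J.36}
$$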
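So far, we have discussed zero-momentum values of functions $u$ and $v$. The values of $u_\alpha(\mathbf p,\sigma)$ and $v_\alpha(\mathbf p,\sigma)$ at arbitrary momentum $\mathbf p$ are defined by applying the bispinor representation matrix (J.18) of the standard boost $\lambda_{\mathbf p}$ (5.3) to zero-momentum values
$$
u_\alpha(\mathbf p,\sigma) \equiv \sum_\beta D_{\alpha\beta}(\lambda_{\mathbf p})\, u_\beta(\mathbf 0,\sigma) \tag{J.37}
$$
$$
v_\alpha(\mathbf p,\sigma) \equiv \sum_\beta D_{\alpha\beta}(\lambda_{\mathbf p})\, v_\beta(\mathbf 0,\sigma) \tag{J.38}
$$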
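Taking a Hermitian conjugate of (J.37) and multiplying by $\gamma^0$ from the right we obtain factors in adjoint fields
$$
\begin{aligned}
\overline u(\mathbf p,\sigma) &\equiv u^\dagger(\mathbf p,\sigma)\gamma^0 = u^\dagger(\mathbf 0,\sigma) D^\dagger(\lambda_{\mathbf p})\gamma^0 = u^\dagger(\mathbf 0,\sigma)\gamma^0\gamma^0 D(\lambda_{\mathbf p})\gamma^0 \\
&= u^\dagger(\mathbf 0,\sigma)\gamma^0 D^{-1}(\lambda_{\mathbf p}) = \overline u(\mathbf 0,\sigma) D^{-1}(\lambda_{\mathbf p})
\end{aligned} \tag{J.39}
$$
$$
\overline v(\mathbf p,\sigma) = \overline v(\mathbf 0,\sigma) D^{-1}(\lambda_{\mathbf p})
$$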
J.5 Explicit formulas for u and v

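Now let us find explicit expressions for factors $u$, $v$, $\overline u$, and $\overline v$ for all momenta. Using formulas (5.3), (J.18), (J.14) and
$$
\begin{aligned}
\theta &= \tanh^{-1}(v/c) \\
\tanh\frac{\theta}{2} &= \frac{\tanh\theta}{1 + \sqrt{1 - \tanh^2\theta}} = \frac{v/c}{1 + \sqrt{1 - v^2/c^2}} = \frac{pc}{\omega_{\mathbf p} + mc^2} \\
\cosh\frac{\theta}{2} &= \frac{1}{\sqrt{1 - \tanh^2\frac{\theta}{2}}} = \sqrt{\frac{\omega_{\mathbf p} + mc^2}{2mc^2}} \\
\sinh\frac{\theta}{2} &= \tanh\frac{\theta}{2}\cosh\frac{\theta}{2}
\end{aligned}
$$
we obtain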
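$$
\begin{aligned}
D(\lambda_{\mathbf p}) &= e^{-\frac{ic}{\hbar}\vec K\cdot\frac{\mathbf p}{p}\theta_{\mathbf p}} = I\cosh\frac{\theta_{\mathbf p}}{2} + \frac{2c}{i\hbar}\frac{\vec K\cdot\mathbf p}{p}\sinh\frac{\theta_{\mathbf p}}{2} \\
&= \cosh\frac{\theta_{\mathbf p}}{2}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \sinh\frac{\theta_{\mathbf p}}{2}\begin{bmatrix} 0 & \frac{\vec\sigma\cdot\mathbf p}{p} \\ \frac{\vec\sigma\cdot\mathbf p}{p} & 0 \end{bmatrix} \\
&= \cosh\frac{\theta_{\mathbf p}}{2}\left(1 + \tanh\frac{\theta_{\mathbf p}}{2}\begin{bmatrix} 0 & \frac{\vec\sigma\cdot\mathbf p}{p} \\ \frac{\vec\sigma\cdot\mathbf p}{p} & 0 \end{bmatrix}\right) \\
&= \sqrt{\frac{\omega_{\mathbf p} + mc^2}{2mc^2}}\left(1 + \frac{pc}{\omega_{\mathbf p} + mc^2}\begin{bmatrix} 0 & \frac{\vec\sigma\cdot\mathbf p}{p} \\ \frac{\vec\sigma\cdot\mathbf p}{p} & 0 \end{bmatrix}\right) \\
&= \sqrt{\frac{\omega_{\mathbf p} + mc^2}{2mc^2}}\begin{bmatrix} 1 & \frac{\vec\sigma\cdot\mathbf p\,c}{\omega_{\mathbf p} + mc^2} \\ \frac{\vec\sigma\cdot\mathbf p\,c}{\omega_{\mathbf p} + mc^2} & 1 \end{bmatrix}
\end{aligned}
$$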
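The half-angle relations used in this derivation are easy to confirm numerically. Below is a small Python sketch under stated assumptions: units with m = c = 1, an arbitrary test momentum |p| = 0.75, and the standard relation v = pc²/ω for the particle speed (not spelled out in this section).

```python
import math

# Illustrative values (assumed): m = c = 1 and |p| = 0.75.
m, c, p = 1.0, 1.0, 0.75

omega = math.sqrt(m**2 * c**4 + p**2 * c**2)   # free-particle energy
v = p * c**2 / omega                           # particle speed
theta = math.atanh(v / c)                      # rapidity, theta = atanh(v/c)

# Closed-form half-angle expressions quoted in the text.
tanh_half = p * c / (omega + m * c**2)
cosh_half = math.sqrt((omega + m * c**2) / (2 * m * c**2))
sinh_half = tanh_half * cosh_half

err_tanh = abs(math.tanh(theta / 2) - tanh_half)
err_cosh = abs(math.cosh(theta / 2) - cosh_half)
err_sinh = abs(math.sinh(theta / 2) - sinh_half)
print(err_tanh, err_cosh, err_sinh)
```

All three differences should be at the level of floating-point rounding.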
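Then, inserting this result in (J.37) we obtain
$$
\begin{aligned}
u(\mathbf p,\sigma) &= \sqrt{\frac{\omega_{\mathbf p} + mc^2}{2mc^2}}\begin{bmatrix} 1 & \frac{\vec\sigma\cdot\mathbf p\,c}{\omega_{\mathbf p} + mc^2} \\ \frac{\vec\sigma\cdot\mathbf p\,c}{\omega_{\mathbf p} + mc^2} & 1 \end{bmatrix}\begin{bmatrix} \chi_\sigma \\ 0 \end{bmatrix} \\
&= \begin{bmatrix} \sqrt{\omega_{\mathbf p} + mc^2} \\ \sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}}
\end{aligned} \tag{J.40}
$$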
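Similarly, the explicit expressions for $v$, $\overline u$, and $\overline v$ are
$$
v(\mathbf p,\sigma) = \begin{bmatrix} \sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} \\ \sqrt{\omega_{\mathbf p} + mc^2} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}} \tag{J.41}
$$
$$
\overline u(\mathbf p,\sigma) = \frac{\chi^\dagger_\sigma}{\sqrt{2mc^2}}\left[\sqrt{\omega_{\mathbf p} + mc^2},\; -\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p}\right] \tag{J.42}
$$
$$
\overline v(\mathbf p,\sigma) = \frac{\chi^\dagger_\sigma}{\sqrt{2mc^2}}\left[\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p},\; -\sqrt{\omega_{\mathbf p} + mc^2}\right] \tag{J.43}
$$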
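These functions are normalized to unity in the sense that8
$$
\begin{aligned}
\overline u(\mathbf p,\sigma)\, u(\mathbf p,\sigma')
&= \chi^\dagger_\sigma\left[\sqrt{\omega_{\mathbf p} + mc^2},\; -\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\mathbf p\cdot\vec\sigma}{p}\right]\begin{bmatrix} \sqrt{\omega_{\mathbf p} + mc^2} \\ \sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\mathbf p\cdot\vec\sigma}{p} \end{bmatrix}\chi_{\sigma'}\frac{1}{2mc^2} \\
&= \chi^\dagger_\sigma\left(\omega_{\mathbf p} + mc^2 - (\omega_{\mathbf p} - mc^2)\frac{(\mathbf p\cdot\vec\sigma)(\mathbf p\cdot\vec\sigma)}{p^2}\right)\chi_{\sigma'}\frac{1}{2mc^2} = \chi^\dagger_\sigma\chi_{\sigma'} \\
&= \delta_{\sigma,\sigma'}
\end{aligned} \tag{J.44}
$$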
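The normalization (J.44) can be spot-checked numerically from the explicit formulas (J.40) and (J.41). The sketch below is illustrative only: it assumes units with m = c = 1, an arbitrary test momentum, and the Dirac-representation matrix γ⁰ = diag(1, 1, −1, −1); the function names `sigma_dot`, `spinors` and `bar_dot` are introduced here and do not come from the text.

```python
import math

def sigma_dot(n):
    """2x2 matrix (sigma . n) for a 3-vector n, built from the Pauli matrices."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def spinors(p, m=1.0, c=1.0):
    """Return (u, v): lists of two 4-component bispinors from (J.40), (J.41).
    List index 0, 1 labels chi = (1, 0) and chi = (0, 1)."""
    pabs = math.sqrt(sum(x * x for x in p))
    omega = math.sqrt(m**2 * c**4 + pabs**2 * c**2)
    a = math.sqrt(omega + m * c**2)
    b = math.sqrt(omega - m * c**2)
    sn = sigma_dot([x / pabs for x in p])
    norm = math.sqrt(2 * m * c**2)
    u, v = [], []
    for chi in ([1, 0], [0, 1]):
        # (sigma . p / p) chi
        s = [sn[0][0] * chi[0] + sn[0][1] * chi[1],
             sn[1][0] * chi[0] + sn[1][1] * chi[1]]
        u.append([a * chi[0] / norm, a * chi[1] / norm, b * s[0] / norm, b * s[1] / norm])
        v.append([b * s[0] / norm, b * s[1] / norm, a * chi[0] / norm, a * chi[1] / norm])
    return u, v

def bar_dot(x, y):
    """Compute xbar y = x^dagger gamma^0 y with gamma^0 = diag(1, 1, -1, -1)."""
    g0 = [1, 1, -1, -1]
    return sum(complex(x[i]).conjugate() * g0[i] * y[i] for i in range(4))

u, v = spinors((0.3, -0.4, 1.2))
uu = [[bar_dot(u[i], u[j]) for j in range(2)] for i in range(2)]
vv = [[bar_dot(v[i], v[j]) for j in range(2)] for i in range(2)]
print(uu)  # expected to be close to the 2x2 unit matrix
print(vv)  # expected to be close to minus the unit matrix
```

With these conventions the product for $v$ comes out with the opposite sign, which is why the corresponding trace over bispinor indices is negative.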
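Let us also calculate the sum $\sum_{\sigma=-1/2}^{1/2} u(\mathbf p,\sigma)u^\dagger(\mathbf p,\sigma)$. At zero momentum we can use the explicit representation (J.30) - (J.33)
$$
\begin{aligned}
\sum_{\sigma=-1/2}^{1/2} u(\mathbf 0,\sigma)u^\dagger(\mathbf 0,\sigma)
&= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \\
&= \frac{1}{2}\left(1 + \gamma^0\right)
\end{aligned}
$$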
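To generalize this formula for arbitrary momentum, we use (J.37), (J.39), the Hermiticity of the matrix $D(\lambda_{\mathbf p})$ and properties (J.5), (J.19) - (J.21)
$$
\begin{aligned}
\sum_{\sigma=-1/2}^{1/2} u(\mathbf p,\sigma)u^\dagger(\mathbf p,\sigma)
&= D(\lambda_{\mathbf p})\left(\sum_{\sigma=-1/2}^{1/2} u(\mathbf 0,\sigma)u^\dagger(\mathbf 0,\sigma)\right)D^\dagger(\lambda_{\mathbf p}) \\
&= \frac{1}{2}D(\lambda_{\mathbf p})\left(1 + \gamma^0\right)D(\lambda_{\mathbf p}) = \frac{1}{2}\left(D(\lambda_{\mathbf p})D(\lambda_{\mathbf p}) + \gamma^0\right) \\
&= \frac{1}{2}\left(D(\lambda_{\mathbf p})\gamma^0\gamma^0 D(\lambda_{\mathbf p})\gamma^0\gamma^0 + \gamma^0\right)
= \frac{1}{2}\left(D(\lambda_{\mathbf p})\gamma^0 D^{-1}(\lambda_{\mathbf p})\gamma^0 + \gamma^0\right) \\
&= \frac{1}{2}\left(D(\lambda_{\mathbf p})\gamma^0 D^{-1}(\lambda_{\mathbf p}) + 1\right)\gamma^0
= \frac{1}{2}\left(\gamma^0\cosh\theta - \frac{\vec\gamma\cdot\vec\theta}{\theta}\sinh\theta + 1\right)\gamma^0 \\
&= \frac{1}{2mc^2}\left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c + mc^2\right)\gamma^0
= \frac{1}{2mc^2}\left(\not{p} + mc^2\right)\gamma^0
\end{aligned} \tag{J.45}
$$
8
Here we used (H.8).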
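Similarly we can derive a number of useful formulas9
$$
\sum_{\sigma=-1/2}^{1/2} u(\mathbf p,\sigma)\overline u(\mathbf p,\sigma) = \frac{1}{2mc^2}\left(\not{p} + mc^2\right) \tag{J.46}
$$
$$
\sum_\alpha\sum_{\sigma=-1/2}^{1/2} u_\alpha(\mathbf p,\sigma)\overline u_\alpha(\mathbf p,\sigma) = \frac{1}{2mc^2}\mathrm{Tr}\left(\gamma^0\omega_{\mathbf p} - c\vec\gamma\cdot\mathbf p + mc^2\right) = 2
$$
$$
\sum_{\sigma=-1/2}^{1/2} v(\mathbf p,\sigma)v^\dagger(\mathbf p,\sigma) = \frac{1}{2mc^2}\left(\not{p} - mc^2\right)\gamma^0 \tag{J.47}
$$
$$
\sum_{\sigma=-1/2}^{1/2} v(\mathbf p,\sigma)\overline v(\mathbf p,\sigma) = \frac{1}{2mc^2}\left(\not{p} - mc^2\right) \tag{J.48}
$$
$$
\sum_\alpha\sum_{\sigma=-1/2}^{1/2} v_\alpha(\mathbf p,\sigma)\overline v_\alpha(\mathbf p,\sigma) = -2
$$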
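The spin sum (J.46) lends itself to a direct numerical comparison. The following Python sketch is a hedged illustration: it assumes m = c = 1, the Dirac representation with γ⁰ = diag(1, 1, −1, −1) and γ = [[0, σ], [−σ, 0]], and an arbitrary test momentum; the helper names are hypothetical.

```python
import math

def spinor_u(p, chi, m=1.0, c=1.0):
    """Bispinor u(p, sigma) from (J.40) for a 2-component chi."""
    pabs = math.sqrt(sum(x * x for x in p))
    omega = math.sqrt(m**2 * c**4 + pabs**2 * c**2)
    a = math.sqrt(omega + m * c**2)
    b = math.sqrt(omega - m * c**2)
    nx, ny, nz = (x / pabs for x in p)
    # (sigma . p / p) chi written out with the Pauli matrices
    s0 = nz * chi[0] + (nx - 1j * ny) * chi[1]
    s1 = (nx + 1j * ny) * chi[0] - nz * chi[1]
    norm = math.sqrt(2 * m * c**2)
    return [a * chi[0] / norm, a * chi[1] / norm, b * s0 / norm, b * s1 / norm]

def spin_sum(p, m=1.0, c=1.0):
    """4x4 matrix sum_sigma u(p, sigma) ubar(p, sigma), with ubar = u^dagger gamma^0."""
    g0 = [1, 1, -1, -1]
    total = [[0j] * 4 for _ in range(4)]
    for chi in ([1, 0], [0, 1]):
        u = spinor_u(p, chi, m, c)
        for i in range(4):
            for j in range(4):
                total[i][j] += u[i] * complex(u[j]).conjugate() * g0[j]
    return total

def pslash_plus_m(p, m=1.0, c=1.0):
    """(gamma^0 omega - c gamma.p + m c^2) / (2 m c^2) as a 4x4 matrix."""
    px, py, pz = p
    omega = math.sqrt(m**2 * c**4 + (px**2 + py**2 + pz**2) * c**2)
    sp = [[pz, px - 1j * py], [px + 1j * py, -pz]]   # sigma . p
    mat = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        mat[i][i] = omega + m * c**2
        mat[i + 2][i + 2] = -omega + m * c**2
        for j in range(2):
            mat[i][j + 2] = -c * sp[i][j]
            mat[i + 2][j] = c * sp[i][j]
    return [[x / (2 * m * c**2) for x in row] for row in mat]

p = (0.3, -0.4, 1.2)
lhs = spin_sum(p)
rhs = pslash_plus_m(p)
max_err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(4) for j in range(4))
trace = sum(lhs[i][i] for i in range(4))
print(max_err, trace)
```

The trace of the left-hand side should reproduce the value 2 quoted above.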
J.6 Convenient notation

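To simplify QED calculations we introduce the following combinations of particle operators
$$
A_\alpha(\mathbf p) = \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma u_\alpha(\mathbf p,\sigma)\, a_{\mathbf p,\sigma} \tag{J.49}
$$
9
Here we used the facts that the trace of any gamma-matrix is zero, and that the trace of the unit 4×4 matrix is 4.
$$
\overline A^\dagger_\alpha(\mathbf p) = \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma \overline u_\alpha(\mathbf p,\sigma)\, a^\dagger_{\mathbf p,\sigma} \tag{J.50}
$$
$$
B^\dagger_\alpha(\mathbf p) = \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma v_\alpha(\mathbf p,\sigma)\, b^\dagger_{\mathbf p,\sigma} \tag{J.51}
$$
$$
\overline B_\alpha(\mathbf p) = \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma \overline v_\alpha(\mathbf p,\sigma)\, b_{\mathbf p,\sigma} \tag{J.52}
$$
$$
D_\alpha(\mathbf p) = \sqrt{\frac{Mc^2}{\Omega_{\mathbf p}}}\sum_\sigma w_\alpha(\mathbf p,\sigma)\, d_{\mathbf p,\sigma} \tag{J.53}
$$
$$
\overline D^\dagger_\alpha(\mathbf p) = \sqrt{\frac{Mc^2}{\Omega_{\mathbf p}}}\sum_\sigma \overline w_\alpha(\mathbf p,\sigma)\, d^\dagger_{\mathbf p,\sigma} \tag{J.54}
$$
$$
F^\dagger_\alpha(\mathbf p) = \sqrt{\frac{Mc^2}{\Omega_{\mathbf p}}}\sum_\sigma s_\alpha(\mathbf p,\sigma)\, f^\dagger_{\mathbf p,\sigma} \tag{J.55}
$$
$$
\overline F_\alpha(\mathbf p) = \sqrt{\frac{Mc^2}{\Omega_{\mathbf p}}}\sum_\sigma \overline s_\alpha(\mathbf p,\sigma)\, f_{\mathbf p,\sigma} \tag{J.56}
$$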
In this notation, indices α, β = 1, 2, 3, 4 are those corresponding to the
bispinor representation of the Lorentz group, and index σ = ±1/2 enumer-
ates two spin projections of fermions.
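With the above conventions, the electron/positron and proton/antiproton quantum fields can be written compactly
$$
\psi_\alpha(\tilde x) = (2\pi\hbar)^{-3/2}\int d\mathbf p\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}A_\alpha(\mathbf p) + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}B^\dagger_\alpha(\mathbf p)\right] \tag{J.57}
$$
$$
\Psi_\alpha(\tilde x) = (2\pi\hbar)^{-3/2}\int d\mathbf p\left[e^{-\frac{i}{\hbar}\tilde P\cdot\tilde x}D_\alpha(\mathbf p) + e^{\frac{i}{\hbar}\tilde P\cdot\tilde x}F^\dagger_\alpha(\mathbf p)\right] \tag{J.58}
$$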
J.7 Transformation laws

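Operators (J.49)-(J.56) have simple boost transformation laws. For example, we can use (3.59), (8.37), (5.16), and (J.35) to obtain
$$
\begin{aligned}
U_0(\Lambda;0)A(\mathbf p)U_0^{-1}(\Lambda;0) &= e^{-\frac{ic}{\hbar}\vec K_0\vec\theta}A(\mathbf p)e^{\frac{ic}{\hbar}\vec K_0\vec\theta} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma u(\mathbf p,\sigma)\, e^{-\frac{ic}{\hbar}\vec K_0\vec\theta}a_{\mathbf p,\sigma}e^{\frac{ic}{\hbar}\vec K_0\vec\theta} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sum_\sigma u(\mathbf p,\sigma)\sum_{\sigma'}D^{1/2}_{\sigma\sigma'}(-\vec\phi_W)\, a_{\Lambda\mathbf p,\sigma'} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}D(\lambda_{\mathbf p})\sum_\sigma u(\mathbf 0,\sigma)\sum_{\sigma'}D^{1/2}_{\sigma\sigma'}(-\vec\phi_W)\, a_{\Lambda\mathbf p,\sigma'} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}D(\lambda_{\mathbf p})D(-\vec\phi_W)\sum_\sigma u(\mathbf 0,\sigma)\, a_{\Lambda\mathbf p,\sigma} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}D(\lambda_{\mathbf p})D(\lambda_{\mathbf p}^{-1}\Lambda^{-1}\lambda_{\Lambda\mathbf p})\sum_\sigma u(\mathbf 0,\sigma)\, a_{\Lambda\mathbf p,\sigma} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}D(\Lambda^{-1})D(\lambda_{\Lambda\mathbf p})\sum_\sigma u(\mathbf 0,\sigma)\, a_{\Lambda\mathbf p,\sigma} \\
&= \sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}D(\Lambda^{-1})\sum_\sigma u(\Lambda\mathbf p,\sigma)\, a_{\Lambda\mathbf p,\sigma} \\
&= \frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}D(\Lambda^{-1})A(\Lambda\mathbf p)
\end{aligned} \tag{J.59}
$$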
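Similarly, using (J.36)
$$
\begin{aligned}
U_0(\Lambda;0)\overline A^\dagger(\mathbf p)U_0^{-1}(\Lambda;0) &= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma \overline u(\mathbf p,\sigma)\, e^{-\frac{ic}{\hbar}\vec K_0\vec\theta}a^\dagger_{\mathbf p,\sigma}e^{\frac{ic}{\hbar}\vec K_0\vec\theta} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sum_\sigma \overline u(\mathbf p,\sigma)\sum_{\sigma'}(D^{1/2})^*_{\sigma\sigma'}(-\vec\phi_W)\, a^\dagger_{\Lambda\mathbf p,\sigma'} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sum_\sigma\sum_{\sigma'}\overline u(\mathbf 0,\sigma)(D^{1/2})^*_{\sigma\sigma'}(-\vec\phi_W)D^{-1}(\lambda_{\mathbf p})\, a^\dagger_{\Lambda\mathbf p,\sigma'} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sum_\sigma \overline u(\mathbf 0,\sigma)D(\lambda_{\Lambda\mathbf p}^{-1}\Lambda\lambda_{\mathbf p})D^{-1}(\lambda_{\mathbf p})\, a^\dagger_{\Lambda\mathbf p,\sigma} \\
&= \sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sqrt{\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}}\sum_\sigma \overline u(\Lambda\mathbf p,\sigma)D(\Lambda)\, a^\dagger_{\Lambda\mathbf p,\sigma} \\
&= \frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}\overline A^\dagger(\Lambda\mathbf p)D(\Lambda)
\end{aligned} \tag{J.60}
$$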
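Let us show that quantum field $\psi_\alpha(\tilde x)$ has the required covariant transformation law (9.1)
$$
U_0(\Lambda;\tilde a)\psi_\alpha(\tilde x)U_0^{-1}(\Lambda;\tilde a) = \sum_\beta D_{\alpha\beta}(\Lambda^{-1})\psi_\beta(\Lambda(\tilde x + \tilde a)) \tag{J.61}
$$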
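Transformations with respect to translations are
$$
\begin{aligned}
U_0(1;\tilde a)\psi_\alpha(\tilde x)U_0^{-1}(1;\tilde a)
&= \int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}U_0(1;\tilde a)A_\alpha(\mathbf p)U_0^{-1}(1;\tilde a) + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}U_0(1;\tilde a)B^\dagger_\alpha(\mathbf p)U_0^{-1}(1;\tilde a)\right] \\
&= \int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\left[e^{-\frac{i}{\hbar}\tilde p\cdot(\tilde x+\tilde a)}A_\alpha(\mathbf p) + e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x+\tilde a)}B^\dagger_\alpha(\mathbf p)\right] = \psi_\alpha(\tilde x + \tilde a)
\end{aligned}
$$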
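For transformations with respect to boosts we use equations (J.59), (J.60), (5.26), and (I.4)
$$
\begin{aligned}
U_0(\Lambda;0)\psi(\tilde x)U_0^{-1}(\Lambda;0)
&= (2\pi\hbar)^{-3/2}\int d\mathbf p\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}U_0(\Lambda;0)A(\mathbf p)U_0^{-1}(\Lambda;0) + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}U_0(\Lambda;0)B^\dagger(\mathbf p)U_0^{-1}(\Lambda;0)\right] \\
&= (2\pi\hbar)^{-3/2}D(\Lambda^{-1})\int d\mathbf p\,\frac{\omega_{\Lambda\mathbf p}}{\omega_{\mathbf p}}\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}A(\Lambda\mathbf p) + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}B^\dagger(\Lambda\mathbf p)\right] \\
&= (2\pi\hbar)^{-3/2}D(\Lambda^{-1})\int d\mathbf q\left[e^{-\frac{i}{\hbar}(\Lambda^{-1}\tilde q\cdot\tilde x)}A(\mathbf q) + e^{\frac{i}{\hbar}(\Lambda^{-1}\tilde q\cdot\tilde x)}B^\dagger(\mathbf q)\right] \\
&= (2\pi\hbar)^{-3/2}D(\Lambda^{-1})\int d\mathbf q\left[e^{-\frac{i}{\hbar}(\tilde q\cdot\Lambda\tilde x)}A(\mathbf q) + e^{\frac{i}{\hbar}(\tilde q\cdot\Lambda\tilde x)}B^\dagger(\mathbf q)\right] \\
&= D(\Lambda^{-1})\psi(\Lambda\tilde x)
\end{aligned}
$$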
We leave to the reader the proof of equation (J.61) in the case of rotations.
Thus we conclude that in agreement with Step 1(II) in subsection 9.1.1,
the Dirac field transforms according to the 4D bispinor representation of the
Lorentz group.
J.8 Functions Uµ and Wµ.

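In QED calculations one often meets products like $\overline u\gamma^\mu u$ and $\overline w\gamma^\mu w$. It is convenient to introduce special symbols for them
$$
U^\mu(\mathbf p,\sigma;\mathbf p',\sigma') \equiv \overline u(\mathbf p,\sigma)\gamma^\mu u(\mathbf p',\sigma') \tag{J.62}
$$
$$
W^\mu(\mathbf p,\sigma;\mathbf p',\sigma') \equiv \overline w(\mathbf p,\sigma)\gamma^\mu w(\mathbf p',\sigma') \tag{J.63}
$$
Note that quantities $U^\mu$ and $W^\mu$ are four-vectors with respect to Lorentz transformations of their momentum and spin labels.10 For example, using (J.35), (J.36), (J.37), (J.39), and (J.19), we obtain11
$$
\begin{aligned}
&U^\mu\left(\Lambda\mathbf p, R^{-1}(\mathbf p,\Lambda)\sigma;\ \Lambda\mathbf p', R(\mathbf p',\Lambda)\sigma'\right) \\
&\equiv \overline u\left(\Lambda\mathbf p, R^{-1}(\mathbf p,\Lambda)\sigma\right)\gamma^\mu u\left(\Lambda\mathbf p', R(\mathbf p',\Lambda)\sigma'\right) \\
&= \overline u\left(\mathbf 0, R^{-1}(\mathbf p,\Lambda)\sigma\right)D^{-1}(\lambda_{\Lambda\mathbf p})\gamma^\mu D(\lambda_{\Lambda\mathbf p'})u\left(\mathbf 0, R(\mathbf p',\Lambda)\sigma'\right) \\
&= \overline u\left(\mathbf 0, R^{-1}(\mathbf p,\Lambda)\sigma\right)D^{-1}\left(\Lambda\lambda_{\mathbf p}R^{-1}(\mathbf p,\Lambda)\right)\gamma^\mu D\left(\Lambda\lambda_{\mathbf p'}R^{-1}(\mathbf p',\Lambda)\right)u\left(\mathbf 0, R(\mathbf p',\Lambda)\sigma'\right) \\
&= \overline u\left(\mathbf 0, R^{-1}(\mathbf p,\Lambda)\sigma\right)D\left(R^{-1}(\mathbf p,\Lambda)\right)D^{-1}(\lambda_{\mathbf p})D^{-1}(\Lambda)\gamma^\mu D(\Lambda)D(\lambda_{\mathbf p'})D\left(R^{-1}(\mathbf p',\Lambda)\right)u\left(\mathbf 0, R(\mathbf p',\Lambda)\sigma'\right) \\
&= \overline u(\mathbf 0,\sigma)D^{-1}(\lambda_{\mathbf p})D^{-1}(\Lambda)\gamma^\mu D(\Lambda)D(\lambda_{\mathbf p'})u(\mathbf 0,\sigma') \\
&= \overline u(\mathbf p,\sigma)D^{-1}(\Lambda)\gamma^\mu D(\Lambda)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(\Lambda^\mu{}_\nu\gamma^\nu\right)u(\mathbf p',\sigma') \\
&= \Lambda^\mu{}_\nu U^\nu(\mathbf p,\sigma;\mathbf p',\sigma')
\end{aligned}
$$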
J.9 (v/c)2 approximation

Often it is useful to obtain QED results in a weakly-relativistic or non-
relativistic case, when momenta of electrons are much less than mc and
momenta of protons are much less than Mc. In these cases, with reasonable
accuracy we can represent all quantities as series in powers of v/c and leave
only terms having orders not higher than (v/c)2 . First, we can use (8.96) to
write
10
Such a transformation acts by matrix Λ (rotation×boost) on momentum arguments
and by the corresponding Wigner rotation R on spin components. See subsection 9.2.2.
11
Here R(p, Λ) is the Wigner rotation defined in (5.16).
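$$
\begin{aligned}
\sqrt{\omega_{\mathbf p} + mc^2} &\approx \sqrt{mc^2 + \frac{p^2}{2m} + mc^2} = \sqrt{2mc^2 + \frac{p^2}{2m}} \\
&= \sqrt{2mc^2}\sqrt{1 + \frac{p^2}{4m^2c^2}} \approx \sqrt{2mc^2}\left(1 + \frac{p^2}{8m^2c^2}\right) \\
\sqrt{\omega_{\mathbf p} - mc^2} &\approx \sqrt{mc^2 + \frac{p^2}{2m} - mc^2} = \frac{p}{\sqrt{2m}}
\end{aligned}
$$
$$
(q + k \div q)^2 = (\omega_{\mathbf q+\mathbf k} - \omega_{\mathbf q})^2 - c^2k^2 \approx -c^2k^2 \tag{J.64}
$$
$$
\begin{aligned}
\frac{Mmc^4}{\sqrt{\Omega_{\mathbf p-\mathbf k}\,\omega_{\mathbf q+\mathbf k}\,\Omega_{\mathbf p}\,\omega_{\mathbf q}}}
&\approx \frac{1}{\sqrt{1 + \frac{(\mathbf p-\mathbf k)^2}{2M^2c^2}}}\,\frac{1}{\sqrt{1 + \frac{p^2}{2M^2c^2}}}\,\frac{1}{\sqrt{1 + \frac{(\mathbf q+\mathbf k)^2}{2m^2c^2}}}\,\frac{1}{\sqrt{1 + \frac{q^2}{2m^2c^2}}} \\
&\approx 1 - \frac{(\mathbf p-\mathbf k)^2}{4M^2c^2} - \frac{p^2}{4M^2c^2} - \frac{(\mathbf q+\mathbf k)^2}{4m^2c^2} - \frac{q^2}{4m^2c^2} \\
&= 1 - \frac{p^2}{2M^2c^2} + \frac{\mathbf p\mathbf k}{2M^2c^2} - \frac{k^2}{4M^2c^2} - \frac{q^2}{2m^2c^2} - \frac{\mathbf q\mathbf k}{2m^2c^2} - \frac{k^2}{4m^2c^2}
\end{aligned} \tag{J.65}
$$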
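To get a feeling for the accuracy of the two square-root approximations above, one can compare them with the exact expressions at a small momentum. This is only a sketch under assumed values (m = c = 1 and p = 0.1 mc); the variable names are illustrative.

```python
import math

# Assumed illustrative values: m = c = 1, momentum p = 0.1 mc.
m, c = 1.0, 1.0
p = 0.1 * m * c

omega = math.sqrt(m**2 * c**4 + p**2 * c**2)   # exact energy

# sqrt(omega + mc^2) versus sqrt(2mc^2) (1 + p^2 / (8 m^2 c^2))
exact_plus = math.sqrt(omega + m * c**2)
approx_plus = math.sqrt(2 * m * c**2) * (1 + p**2 / (8 * m**2 * c**2))

# sqrt(omega - mc^2) versus p / sqrt(2m)
exact_minus = math.sqrt(omega - m * c**2)
approx_minus = p / math.sqrt(2 * m)

rel_plus = abs(exact_plus - approx_plus) / exact_plus
rel_minus = abs(exact_minus - approx_minus) / exact_minus
print(rel_plus, rel_minus)
```

The first relative error is of order (p/mc)⁴, while the second is of order (p/mc)²; the latter is acceptable because $\sqrt{\omega_{\mathbf p} - mc^2}$ is itself already of first order in p/mc.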
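To obtain the $(v/c)^2$ approximation for expressions (J.62), (J.63) we use equations (J.40) - (J.43) and (H.6) - (H.8)
$$
\begin{aligned}
U^0(\mathbf p,\sigma;\mathbf p',\sigma') &= \overline u(\mathbf p,\sigma)\gamma^0 u(\mathbf p',\sigma') = u^\dagger(\mathbf p,\sigma)u(\mathbf p',\sigma') \\
&= \chi^\dagger_\sigma\left[\sqrt{\omega_{\mathbf p} + mc^2},\; \sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\mathbf p\cdot\vec\sigma}{p}\right]\begin{bmatrix} \sqrt{\omega_{\mathbf p'} + mc^2} \\ \sqrt{\omega_{\mathbf p'} - mc^2}\,\frac{\mathbf p'\cdot\vec\sigma}{p'} \end{bmatrix}\chi_{\sigma'}\frac{1}{2mc^2} \\
&= \chi^\dagger_\sigma\left(\sqrt{\omega_{\mathbf p} + mc^2}\sqrt{\omega_{\mathbf p'} + mc^2} + \sqrt{\omega_{\mathbf p} - mc^2}\sqrt{\omega_{\mathbf p'} - mc^2}\,\frac{(\mathbf p\cdot\vec\sigma)(\mathbf p'\cdot\vec\sigma)}{pp'}\right)\chi_{\sigma'}\frac{1}{2mc^2} \\
&\approx \chi^\dagger_\sigma\left(\left(1 + \frac{p^2}{8m^2c^2}\right)\left(1 + \frac{(p')^2}{8m^2c^2}\right) + \frac{pp'}{4m^2c^2}\frac{(\mathbf p\cdot\vec\sigma)(\mathbf p'\cdot\vec\sigma)}{pp'}\right)\chi_{\sigma'} \\
&= \chi^\dagger_\sigma\left(1 + \frac{p^2 + (p')^2 + 2\mathbf p\cdot\mathbf p' + 2i\vec\sigma\cdot[\mathbf p\times\mathbf p']}{8m^2c^2}\right)\chi_{\sigma'} \\
&= \chi^\dagger_\sigma\left(1 + \frac{(\mathbf p + \mathbf p')^2 + 2i\vec\sigma\cdot[\mathbf p\times\mathbf p']}{8m^2c^2}\right)\chi_{\sigma'}
\end{aligned} \tag{J.66}
$$
$$
W^0(\mathbf p,\sigma;\mathbf p',\sigma') = \overline w(\mathbf p,\sigma)\gamma^0 w(\mathbf p',\sigma') \approx \chi^\dagger_\sigma\left(1 + \frac{(\mathbf p + \mathbf p')^2 + 2i\vec\sigma\cdot[\mathbf p\times\mathbf p']}{8M^2c^2}\right)\chi_{\sigma'} \tag{J.67}
$$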
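$$
\begin{aligned}
\mathbf U(\mathbf p,\sigma;\mathbf p',\sigma') &= \overline u(\mathbf p,\sigma)\vec\gamma\, u(\mathbf p',\sigma') \\
&= \chi^\dagger_\sigma\left[\sqrt{\omega_{\mathbf p} + mc^2},\; -\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\mathbf p\cdot\vec\sigma}{p}\right]\begin{bmatrix} 0 & \vec\sigma \\ -\vec\sigma & 0 \end{bmatrix}\begin{bmatrix} \sqrt{\omega_{\mathbf p'} + mc^2} \\ \sqrt{\omega_{\mathbf p'} - mc^2}\,\frac{\mathbf p'\cdot\vec\sigma}{p'} \end{bmatrix}\chi_{\sigma'}\frac{1}{2mc^2} \\
&= \chi^\dagger_\sigma\left[\sqrt{\omega_{\mathbf p} + mc^2},\; -\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\mathbf p\cdot\vec\sigma}{p}\right]\begin{bmatrix} \sqrt{\omega_{\mathbf p'} - mc^2}\,\frac{\vec\sigma(\mathbf p'\cdot\vec\sigma)}{p'} \\ -\vec\sigma\sqrt{\omega_{\mathbf p'} + mc^2} \end{bmatrix}\chi_{\sigma'}\frac{1}{2mc^2} \\
&= \chi^\dagger_\sigma\left(\sqrt{\omega_{\mathbf p} + mc^2}\sqrt{\omega_{\mathbf p'} - mc^2}\,\frac{\vec\sigma(\mathbf p'\cdot\vec\sigma)}{p'} + \sqrt{\omega_{\mathbf p} - mc^2}\sqrt{\omega_{\mathbf p'} + mc^2}\,\frac{(\mathbf p\cdot\vec\sigma)\vec\sigma}{p}\right)\chi_{\sigma'}\frac{1}{2mc^2} \\
&\approx \chi^\dagger_\sigma\left(\sqrt{2mc^2}\,\frac{p'}{\sqrt{2m}}\,\frac{\vec\sigma(\mathbf p'\cdot\vec\sigma)}{p'} + \sqrt{2mc^2}\,\frac{p}{\sqrt{2m}}\,\frac{(\mathbf p\cdot\vec\sigma)\vec\sigma}{p}\right)\chi_{\sigma'}\frac{1}{2mc^2} \\
&= \frac{1}{2mc}\chi^\dagger_\sigma\left((\vec\sigma\cdot\mathbf p)\vec\sigma + \vec\sigma(\vec\sigma\cdot\mathbf p')\right)\chi_{\sigma'} \\
&= \frac{1}{2mc}\chi^\dagger_\sigma\left(\mathbf p + i[\vec\sigma\times\mathbf p] + \mathbf p' - i[\vec\sigma\times\mathbf p']\right)\chi_{\sigma'} \\
&= \frac{1}{2mc}\chi^\dagger_\sigma\left(\mathbf p + \mathbf p' + i[\vec\sigma\times(\mathbf p - \mathbf p')]\right)\chi_{\sigma'}
\end{aligned} \tag{J.68}
$$
$$
\mathbf W(\mathbf p,\sigma;\mathbf p',\sigma') \approx \frac{1}{2Mc}\chi^\dagger_\sigma\left(\mathbf p + \mathbf p' + i[\vec\sigma\times(\mathbf p - \mathbf p')]\right)\chi_{\sigma'} \tag{J.69}
$$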
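In the non-relativistic limit $c\to\infty$, all formulas are further simplified
$$
\lim_{c\to\infty}\omega_{\mathbf p} = mc^2
$$
$$
\lim_{c\to\infty}\Omega_{\mathbf p} = Mc^2
$$
$$
\lim_{c\to\infty}\frac{Mmc^4}{\sqrt{\Omega_{\mathbf p-\mathbf k}\,\omega_{\mathbf q+\mathbf k}\,\Omega_{\mathbf p}\,\omega_{\mathbf q}}} = 1 \tag{J.70}
$$
$$
\lim_{c\to\infty}U^0(\mathbf p,\sigma;\mathbf p',\sigma') = \chi^\dagger_\sigma\chi_{\sigma'} = \delta_{\sigma,\sigma'} \tag{J.71}
$$
$$
\lim_{c\to\infty}W^0(\mathbf p,\sigma;\mathbf p',\sigma') = \delta_{\sigma,\sigma'}
$$
$$
\lim_{c\to\infty}\mathbf U(\mathbf p,\sigma;\mathbf p',\sigma') = 0
$$
$$
\lim_{c\to\infty}\mathbf W(\mathbf p,\sigma;\mathbf p',\sigma') = 0
$$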
J.10 Anticommutation relations

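To check the anticommutation relations (9.3) we calculate, for example,12
$$
\begin{aligned}
&\{\psi_\alpha(\mathbf x,0), \psi^\dagger_\beta(\mathbf y,0)\} \\
&= \int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\int\frac{d\mathbf p'}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p'}}}\sum_{\sigma,\sigma'=-1/2}^{1/2}
\Big\{e^{-\frac{i}{\hbar}\mathbf p\mathbf x}u_\alpha(\mathbf p,\sigma)a_{\mathbf p,\sigma} + e^{\frac{i}{\hbar}\mathbf p\mathbf x}v_\alpha(\mathbf p,\sigma)b^\dagger_{\mathbf p,\sigma}, \\
&\qquad\qquad e^{\frac{i}{\hbar}\mathbf p'\mathbf y}u^\dagger_\beta(\mathbf p',\sigma')a^\dagger_{\mathbf p',\sigma'} + e^{-\frac{i}{\hbar}\mathbf p'\mathbf y}v^\dagger_\beta(\mathbf p',\sigma')b_{\mathbf p',\sigma'}\Big\} \\
&= \int\frac{d\mathbf p\,d\mathbf p'}{(2\pi\hbar)^3}\frac{mc^2}{\sqrt{\omega_{\mathbf p}\omega_{\mathbf p'}}}\sum_{\sigma,\sigma'=-1/2}^{1/2}\Big(e^{-\frac{i}{\hbar}\mathbf p\mathbf x + \frac{i}{\hbar}\mathbf p'\mathbf y}u_\alpha(\mathbf p,\sigma)u^\dagger_\beta(\mathbf p',\sigma')\{a_{\mathbf p,\sigma}, a^\dagger_{\mathbf p',\sigma'}\} \\
&\qquad\qquad + e^{\frac{i}{\hbar}\mathbf p\mathbf x - \frac{i}{\hbar}\mathbf p'\mathbf y}v_\alpha(\mathbf p,\sigma)v^\dagger_\beta(\mathbf p',\sigma')\{b^\dagger_{\mathbf p,\sigma}, b_{\mathbf p',\sigma'}\}\Big) \\
&= \int\frac{d\mathbf p\,d\mathbf p'}{(2\pi\hbar)^3}\frac{mc^2}{\omega_{\mathbf p}}\sum_{\sigma,\sigma'=-1/2}^{1/2}\Big(e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}u_\alpha(\mathbf p,\sigma)u^\dagger_\beta(\mathbf p',\sigma')\delta(\mathbf p-\mathbf p')\delta_{\sigma,\sigma'} \\
&\qquad\qquad + e^{\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}v_\alpha(\mathbf p,\sigma)v^\dagger_\beta(\mathbf p',\sigma')\delta(\mathbf p-\mathbf p')\delta_{\sigma,\sigma'}\Big) \\
&= (2\pi\hbar)^{-3}\int\frac{d\mathbf p\,mc^2}{\omega_{\mathbf p}}\sum_{\sigma=-1/2}^{1/2}\Big(e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}u_\alpha(\mathbf p,\sigma)u^\dagger_\beta(\mathbf p,\sigma) + e^{\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}v_\alpha(\mathbf p,\sigma)v^\dagger_\beta(\mathbf p,\sigma)\Big) \\
&= \int\frac{d\mathbf p\,mc^2}{(2\pi\hbar)^3\omega_{\mathbf p}}e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}\sum_{\sigma=-1/2}^{1/2}\Big(u_\alpha(\mathbf p,\sigma)u^\dagger_\beta(\mathbf p,\sigma) + v_\alpha(-\mathbf p,\sigma)v^\dagger_\beta(-\mathbf p,\sigma)\Big) \\
&= \int\frac{d\mathbf p\,mc^2}{(2\pi\hbar)^3\omega_{\mathbf p}}e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}\frac{\omega_{\mathbf p}}{mc^2}(\gamma^0\gamma^0)_{\alpha\beta} \\
&= \delta(\mathbf x-\mathbf y)\delta_{\alpha\beta}
\end{aligned} \tag{J.72}
$$
12
Here we used (J.45) and (J.47).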
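We will also find useful the following anticommutators
$$
\begin{aligned}
\{A_\alpha(\mathbf p), \overline A^\dagger_\beta(\mathbf p')\} &= \frac{mc^2}{\omega_{\mathbf p}}\sum_{\sigma\sigma'}u_\alpha(\mathbf p,\sigma)\overline u_\beta(\mathbf p',\sigma')\{a_{\mathbf p,\sigma}, a^\dagger_{\mathbf p',\sigma'}\} \\
&= \frac{mc^2}{\omega_{\mathbf p}}\left(\sum_\sigma u_\alpha(\mathbf p,\sigma)\overline u_\beta(\mathbf p,\sigma)\right)\delta(\mathbf p-\mathbf p') \\
&= \frac{1}{2\omega_{\mathbf p}}\left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c + mc^2\right)_{\alpha\beta}\delta(\mathbf p-\mathbf p')
\end{aligned} \tag{J.73}
$$
$$
\sum_\alpha\{A^\dagger_\alpha(\mathbf p), A_\alpha(\mathbf p')\} = 2\delta(\mathbf p-\mathbf p') \tag{J.74}
$$
$$
\{B^\dagger_\alpha(\mathbf p), \overline B_\beta(\mathbf p')\} = \frac{1}{2\omega_{\mathbf p}}\left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c - mc^2\right)_{\alpha\beta}\delta(\mathbf p-\mathbf p') \tag{J.75}
$$
$$
\sum_\alpha\{B^\dagger_\alpha(\mathbf p), B_\alpha(\mathbf p')\} = 2\delta(\mathbf p-\mathbf p') \tag{J.76}
$$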
J.11 Dirac equation

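We can write the electron-positron quantum field (J.26) as a sum of two terms
$$
\psi_\alpha(\tilde x) = \psi^+_\alpha(\tilde x) + \psi^-_\alpha(\tilde x)
$$
$$
\psi^+_\alpha(\tilde x) \equiv \sum_\sigma\int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\,e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}\,u_\alpha(\mathbf p,\sigma)\,a_{\mathbf p,\sigma}
$$
$$
\psi^-_\alpha(\tilde x) \equiv \sum_\sigma\int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\,e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}\,v_\alpha(\mathbf p,\sigma)\,b^\dagger_{\mathbf p,\sigma}
$$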
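Let us now act on the component $\psi^+(\tilde x)$ by the operator in parentheses13
$$
\begin{aligned}
&\left(\gamma^0\frac{\partial}{\partial t} + c\vec\gamma\frac{\partial}{\partial\mathbf x} - \frac{imc^2}{\hbar}\right)\psi^+(\tilde x) \\
&= \left(\gamma^0\frac{\partial}{\partial t} + c\vec\gamma\frac{\partial}{\partial\mathbf x} - \frac{imc^2}{\hbar}\right)\sum_\sigma\int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\,e^{-\frac{i}{\hbar}\mathbf p\mathbf x + \frac{i}{\hbar}\omega_{\mathbf p}t}\,u(\mathbf p,\sigma)\,a_{\mathbf p,\sigma} \\
&= \frac{i}{\hbar}\sum_\sigma\int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\left(\gamma^0\omega_{\mathbf p} - c\vec\gamma\cdot\mathbf p - mc^2\right)u(\mathbf p,\sigma)\,e^{-\frac{i}{\hbar}\mathbf p\mathbf x + \frac{i}{\hbar}\omega_{\mathbf p}t}\,a_{\mathbf p,\sigma}
\end{aligned}
$$
13
Here we use explicit definitions of gamma matrices from (J.1) and (J.2) as well as equation (J.40).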
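For the product on the right hand side we obtain
$$
\begin{aligned}
&\left(\gamma^0\omega_{\mathbf p} - c\vec\gamma\cdot\mathbf p - mc^2\right)u(\mathbf p,\sigma) \\
&= \omega_{\mathbf p}\begin{bmatrix} \sqrt{\omega_{\mathbf p} + mc^2} \\ -\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}} - pc\begin{bmatrix} \sqrt{\omega_{\mathbf p} - mc^2} \\ -\sqrt{\omega_{\mathbf p} + mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}} - mc^2u(\mathbf p,\sigma) \\
&= \begin{bmatrix} \omega_{\mathbf p}\sqrt{\omega_{\mathbf p} + mc^2} - (\omega_{\mathbf p} - mc^2)\sqrt{\omega_{\mathbf p} + mc^2} - mc^2\sqrt{\omega_{\mathbf p} + mc^2} \\ -\omega_{\mathbf p}\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} + \sqrt{\omega_{\mathbf p} + mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p}\,pc - mc^2\sqrt{\omega_{\mathbf p} - mc^2}\,\frac{\vec\sigma\cdot\mathbf p}{p} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}} \\
&= \begin{bmatrix} (\omega_{\mathbf p} - mc^2)\sqrt{\omega_{\mathbf p} + mc^2} - (\omega_{\mathbf p} - mc^2)\sqrt{\omega_{\mathbf p} + mc^2} \\ \left(-(\omega_{\mathbf p} + mc^2)\sqrt{\omega_{\mathbf p} - mc^2} + (\omega_{\mathbf p} + mc^2)\sqrt{\omega_{\mathbf p} - mc^2}\right)\frac{\vec\sigma\cdot\mathbf p}{p} \end{bmatrix}\frac{\chi_\sigma}{\sqrt{2mc^2}} \\
&= 0
\end{aligned} \tag{J.77}
$$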
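The cancellation in (J.77) can be confirmed numerically for a sample momentum. The sketch below assumes m = c = 1, the Dirac representation γ⁰ = diag(1, 1, −1, −1), γ = [[0, σ], [−σ, 0]], and the explicit bispinor (J.40); `spinor_u` and `dirac_residual` are hypothetical helper names introduced for this illustration.

```python
import math

def spinor_u(p, chi, m=1.0, c=1.0):
    """Bispinor u(p, sigma) from (J.40) for a 2-component chi."""
    pabs = math.sqrt(sum(x * x for x in p))
    omega = math.sqrt(m**2 * c**4 + pabs**2 * c**2)
    a = math.sqrt(omega + m * c**2)
    b = math.sqrt(omega - m * c**2)
    nx, ny, nz = (x / pabs for x in p)
    s0 = nz * chi[0] + (nx - 1j * ny) * chi[1]
    s1 = (nx + 1j * ny) * chi[0] - nz * chi[1]
    norm = math.sqrt(2 * m * c**2)
    return [a * chi[0] / norm, a * chi[1] / norm, b * s0 / norm, b * s1 / norm]

def dirac_residual(p, chi, m=1.0, c=1.0):
    """Components of (gamma^0 omega - c gamma.p - m c^2) u(p, sigma)."""
    px, py, pz = p
    omega = math.sqrt(m**2 * c**4 + (px**2 + py**2 + pz**2) * c**2)
    u = spinor_u(p, chi, m, c)
    upper, lower = u[:2], u[2:]
    sp = [[pz, px - 1j * py], [px + 1j * py, -pz]]   # sigma . p

    def apply(mat, vec):
        return [mat[0][0] * vec[0] + mat[0][1] * vec[1],
                mat[1][0] * vec[0] + mat[1][1] * vec[1]]

    sp_upper, sp_lower = apply(sp, upper), apply(sp, lower)
    # Block form: [[omega - mc^2, -c sigma.p], [c sigma.p, -omega - mc^2]]
    top = [(omega - m * c**2) * upper[k] - c * sp_lower[k] for k in range(2)]
    bottom = [c * sp_upper[k] - (omega + m * c**2) * lower[k] for k in range(2)]
    return top + bottom

p = (0.3, -0.4, 1.2)
residuals = [dirac_residual(p, chi) for chi in ([1, 0], [0, 1])]
max_residual = max(abs(x) for r in residuals for x in r)
print(max_residual)
```

The printed value should be at the level of floating-point rounding for both spin projections.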
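This leads to the Dirac equation for the field component $\psi^+(\tilde x)$
$$
\left(\gamma^0\frac{\partial}{\partial t} + c\vec\gamma\frac{\partial}{\partial\mathbf x} - \frac{imc^2}{\hbar}\right)\psi^+(\tilde x) = 0 \tag{J.78}
$$
The same equation is satisfied by the component $\psi^-(\tilde x)$. So, the Dirac equation for the full field is
$$
\left(\gamma^0\frac{\partial}{\partial t} + c\vec\gamma\frac{\partial}{\partial\mathbf x} - \frac{imc^2}{\hbar}\right)\psi(\tilde x) = 0 \tag{J.79}
$$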
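The equation conjugate to (J.77) is
$$
\begin{aligned}
0 &= u^\dagger(\mathbf p,\sigma)\left((\gamma^0)^\dagger\omega_{\mathbf p} - c(\vec\gamma)^\dagger\cdot\mathbf p - mc^2\right) = u^\dagger(\mathbf p,\sigma)\left(\gamma^0\omega_{\mathbf p} + c\vec\gamma\cdot\mathbf p - mc^2\right) \\
&= u^\dagger(\mathbf p,\sigma)\gamma^0\gamma^0\left(\gamma^0\omega_{\mathbf p} + c\vec\gamma\cdot\mathbf p - mc^2\right) = \overline u(\mathbf p,\sigma)\gamma^0\left(\gamma^0\omega_{\mathbf p} + c\vec\gamma\cdot\mathbf p - mc^2\right) \\
&= \overline u(\mathbf p,\sigma)\left(\gamma^0\omega_{\mathbf p} - c\vec\gamma\cdot\mathbf p - mc^2\right)\gamma^0
\end{aligned} \tag{J.80}
$$
Therefore, the equation satisfied by the conjugated field is
$$
\frac{\partial}{\partial t}\psi^\dagger(\tilde x)\gamma^0 - c\frac{\partial}{\partial\mathbf x}\psi^\dagger(\tilde x)\vec\gamma^\dagger + \frac{imc^2}{\hbar}\psi^\dagger(\tilde x) = 0
$$
or multiplying from the right by $\gamma^0$ and using (J.3), we obtain
$$
\frac{\partial}{\partial t}\overline\psi(\tilde x)\gamma^0 + c\frac{\partial}{\partial\mathbf x}\overline\psi(\tilde x)\vec\gamma + \frac{imc^2}{\hbar}\overline\psi(\tilde x) = 0 \tag{J.81}
$$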
It should be emphasized that in our approach to QFT the Dirac equation appears as a rather unremarkable property of the electron-positron quantum field ψ(x̃). This equation does not play the fundamental role assigned to it in many textbooks. In particular, the Dirac equation cannot be regarded as a “relativistic analog of the Schrödinger equation for electrons”.14 The correct electron wave functions and the corresponding relativistic Schrödinger equations should be constructed by using the Wigner-Dirac theory of unitary representations of the Poincaré group. For free electrons such derivations are performed in chapter 5. The relativistic analog of the Schrödinger equation for an interacting electron-proton system is constructed in chapter 12.
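In the slash notation (J.22) the momentum-space Dirac equations (J.77) and (J.80) take compact forms
$$
(\not{p} - mc^2)\,u(\mathbf p,\tau) = 0 \tag{J.82}
$$
$$
\overline u(\mathbf p,\tau)\,(\not{p} - mc^2) = 0 \tag{J.83}
$$
If we denote $\not{k} \equiv \not{p}' - \not{p}$, then it follows from (J.82) - (J.83)
$$
\begin{aligned}
U^\mu(\mathbf p,\sigma;\mathbf p',\sigma')k_\mu &= \overline u(\mathbf p,\sigma)\not{k}\,u(\mathbf p',\sigma') = \overline u(\mathbf p,\sigma)\left[\not{p}'u(\mathbf p',\sigma')\right] - \left[\overline u(\mathbf p,\sigma)\not{p}\right]u(\mathbf p',\sigma') \\
&= (mc^2 - mc^2)\,\overline u(\mathbf p,\sigma)u(\mathbf p',\sigma') = 0
\end{aligned} \tag{J.84}
$$
$$
W^\mu(\mathbf p,\sigma;\mathbf p',\sigma')k_\mu = 0 \tag{J.85}
$$
We will also need the Gordon identity15
$$
\begin{aligned}
\overline u(\mathbf p,\sigma)(\gamma_\kappa\not{k} - \not{k}\gamma_\kappa)u(\mathbf p',\sigma')
&= \overline u(\mathbf p,\sigma)\left(\gamma_\kappa(\not{p}' - \not{p}) - (\not{p}' - \not{p})\gamma_\kappa\right)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(\gamma_\kappa(mc^2 - \not{p}) - (\not{p}' - mc^2)\gamma_\kappa\right)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(2\gamma_\kappa mc^2 - \gamma_\kappa\not{p} - \not{p}'\gamma_\kappa\right)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(2\gamma_\kappa mc^2 + \not{p}\gamma_\kappa - 2p_\kappa + \gamma_\kappa\not{p}' - 2p'_\kappa\right)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(2\gamma_\kappa mc^2 + mc^2\gamma_\kappa - 2p_\kappa + mc^2\gamma_\kappa - 2p'_\kappa\right)u(\mathbf p',\sigma') \\
&= \overline u(\mathbf p,\sigma)\left(4mc^2\gamma_\kappa - 2p_\kappa - 2p'_\kappa\right)u(\mathbf p',\sigma')
\end{aligned} \tag{J.86}
$$
14
A point of view similar to ours is adopted also in textbook [Wei95].
15
See Problem 3.2 in [PS95b].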
J.12 Fermion propagator

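Let us calculate the electron propagator, which is frequently used in Feynman-Dyson perturbation theory
$$
D_{ab}(\tilde x_1,\tilde x_2) \equiv \langle 0|T(\psi_a(\tilde x_1)\overline\psi_b(\tilde x_2))|0\rangle
$$
If $t_1 > t_2$ we can omit the time ordering sign and use (J.46)
$$
\begin{aligned}
D_{ab}(\tilde x_1,\tilde x_2) &= \langle 0|\psi_a(\tilde x_1)\overline\psi_b(\tilde x_2)|0\rangle \propto \langle 0|(a + b^\dagger)(a^\dagger + b)|0\rangle \propto \langle 0|aa^\dagger|0\rangle \\
&= \langle 0|\left(\int\frac{d\mathbf p}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf p}}}\sum_\sigma e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x_1}u_a(\mathbf p,\sigma)a_{\mathbf p,\sigma}\right)\times \\
&\qquad\left(\int\frac{d\mathbf q}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf q}}}\sum_\tau e^{\frac{i}{\hbar}\tilde q\cdot\tilde x_2}\left(u^\dagger(\mathbf q,\tau)\gamma^0\right)_b a^\dagger_{\mathbf q,\tau}\right)|0\rangle \\
&= \int\frac{d\mathbf p\,d\mathbf q}{(2\pi\hbar)^3}\frac{mc^2}{\sqrt{\omega_{\mathbf p}\omega_{\mathbf q}}}\sum_{\sigma\tau}e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x_1}u_a(\mathbf p,\sigma)\,e^{\frac{i}{\hbar}\tilde q\cdot\tilde x_2}\overline u_b(\mathbf q,\tau)\,\delta(\mathbf p-\mathbf q)\delta_{\sigma\tau} \\
&= \int\frac{d\mathbf p}{(2\pi\hbar)^3}\frac{mc^2}{\omega_{\mathbf p}}e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_2-\tilde x_1)}\sum_\sigma u_a(\mathbf p,\sigma)\overline u_b(\mathbf p,\sigma) \\
&= \int\frac{d\mathbf p}{(2\pi\hbar)^3}e^{\frac{i}{\hbar}(\omega_{\mathbf p}(t_2-t_1) - \mathbf p(\mathbf x_2-\mathbf x_1))}\frac{1}{2\omega_{\mathbf p}}\left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c + mc^2\right)_{ab}
\end{aligned}
$$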
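If $t_1 < t_2$ we use (J.48) to obtain16
$$
\begin{aligned}
D_{ab}(\tilde x_1,\tilde x_2) &= -\langle 0|\overline\psi_b(\tilde x_2)\psi_a(\tilde x_1)|0\rangle \propto -\langle 0|(a^\dagger + b)(a + b^\dagger)|0\rangle \propto -\langle 0|bb^\dagger|0\rangle \\
&= -\langle 0|(2\pi\hbar)^{-3/2}\int d\mathbf p\,e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x_2}\overline B_b(\mathbf p)\,(2\pi\hbar)^{-3/2}\int d\mathbf q\,e^{\frac{i}{\hbar}\tilde q\cdot\tilde x_1}B^\dagger_a(\mathbf q)|0\rangle \\
&= -(2\pi\hbar)^{-3}\int d\mathbf p\,\frac{mc^2}{\omega_{\mathbf p}}e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_1-\tilde x_2)}\sum_\sigma\overline v_b(\mathbf p,\sigma)v_a(\mathbf p,\sigma) \\
&= -\int\frac{d\mathbf p}{(2\pi\hbar)^3}e^{\frac{i}{\hbar}(\omega_{\mathbf p}(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}\left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c - mc^2\right)_{ab}
\end{aligned}
$$
The sum of these two terms gives
$$
\begin{aligned}
D_{ab}(\tilde x_1,\tilde x_2) &= \theta(t_1-t_2)\int\frac{d\mathbf p}{(2\pi\hbar)^3}e^{\frac{i}{\hbar}(\omega_{\mathbf p}(t_2-t_1) - \mathbf p(\mathbf x_2-\mathbf x_1))}\frac{1}{2\omega_{\mathbf p}}P_{ab}(\mathbf p,\omega_{\mathbf p}) \\
&+ \theta(t_2-t_1)\int\frac{d\mathbf p}{(2\pi\hbar)^3}e^{\frac{i}{\hbar}(\omega_{\mathbf p}(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}P_{ab}(-\mathbf p,-\omega_{\mathbf p})
\end{aligned} \tag{J.88}
$$
where we denoted
$$
P_{ab}(\mathbf p,\omega_{\mathbf p}) = \left(\gamma^0\omega_{\mathbf p} - \vec\gamma\cdot\mathbf p\,c + mc^2\right)_{ab}
$$
and $\theta(t)$ is the step function defined in (B.3). Our next goal is to rewrite equation (J.88) so that integration goes by 4 independent components of the 4-vector of momentum $(p_0, p_x, p_y, p_z)$. We use an integral representation (B.4) for the step function to obtain
$$
\begin{aligned}
D_{ab}(\tilde x_1,\tilde x_2)
&= -\frac{1}{2\pi i}\int ds\int\frac{d\mathbf p}{(2\pi\hbar)^3}\frac{e^{-is(t_1-t_2)}}{s+i\epsilon}e^{-\frac{i}{\hbar}(\omega_{\mathbf p}(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}P_{ab}(\mathbf p,\omega_{\mathbf p}) \\
&\quad - \frac{1}{2\pi i}\int ds\int\frac{d\mathbf p}{(2\pi\hbar)^3}\frac{e^{is(t_1-t_2)}}{s+i\epsilon}e^{\frac{i}{\hbar}(\omega_{\mathbf p}(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}P_{ab}(-\mathbf p,-\omega_{\mathbf p}) \\
&= -\frac{1}{2\pi i}\int\frac{d\mathbf p}{(2\pi\hbar)^3}\int ds\,\frac{1}{s+i\epsilon}\frac{1}{2\omega_{\mathbf p}}\times \\
&\quad\left[e^{-\frac{i}{\hbar}((\omega_{\mathbf p}+\hbar s)(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(\mathbf p,\omega_{\mathbf p}) + e^{\frac{i}{\hbar}((\omega_{\mathbf p}+\hbar s)(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(-\mathbf p,-\omega_{\mathbf p})\right] \\
&= -\frac{1}{2\pi i}\int\frac{d\mathbf p}{(2\pi\hbar)^3}\int dp_0\,\frac{1}{p_0-\omega_{\mathbf p}+i\epsilon}\frac{1}{2\omega_{\mathbf p}}\times \\
&\quad\left[e^{-\frac{i}{\hbar}(p_0(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(\mathbf p,\omega_{\mathbf p}) + e^{\frac{i}{\hbar}(p_0(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(-\mathbf p,-\omega_{\mathbf p})\right] \\
&= -\frac{1}{2\pi i}\int\frac{d\mathbf p}{(2\pi\hbar)^3}\int dp_0\,\frac{1}{p_0-\omega_{\mathbf p}+i\epsilon}\frac{1}{2\omega_{\mathbf p}}\times \\
&\quad\left[e^{-\frac{i}{\hbar}(p_0(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(\mathbf p,p_0) + e^{-\frac{i}{\hbar}(-p_0(t_1-t_2) + \mathbf p(\mathbf x_1-\mathbf x_2))}P_{ab}(-\mathbf p,-p_0)\right] \\
&= -\frac{1}{2\pi i}\int\frac{d\mathbf p}{(2\pi\hbar)^3}\int dp_0\,e^{-\frac{i}{\hbar}(p_0(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}\left(\frac{P_{ab}(\mathbf p,p_0)}{p_0-\omega_{\mathbf p}+i\epsilon} + \frac{P_{ab}(\mathbf p,p_0)}{-p_0-\omega_{\mathbf p}+i\epsilon}\right) \\
&= \frac{1}{2\pi i}\int\frac{d\mathbf p}{(2\pi\hbar)^3}\int dp_0\,e^{-\frac{i}{\hbar}(p_0(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}\frac{1}{2\omega_{\mathbf p}}P_{ab}(\mathbf p,p_0)\frac{2\omega_{\mathbf p}}{p_0^2-\omega_{\mathbf p}^2+i\epsilon} \\
&= \frac{1}{2\pi i(2\pi\hbar)^3}\int d^4p\,e^{-\frac{i}{\hbar}(\tilde p\cdot\tilde x)}\frac{P_{ab}(\mathbf p,p_0)}{p_0^2 - c^2p^2 - m^2c^4 + i\epsilon} \\
&= \frac{1}{2\pi i(2\pi\hbar)^3}\int d^4p\,e^{-\frac{i}{\hbar}(\tilde p\cdot\tilde x)}\frac{(\gamma^0p_0 - \vec\gamma\cdot\mathbf p\,c + mc^2)_{ab}}{\tilde p^2c^2 - m^2c^4 + i\epsilon} \\
&= \frac{1}{2\pi i(2\pi\hbar)^3}\int d^4p\,e^{-\frac{i}{\hbar}(\tilde p\cdot\tilde x)}\frac{(\not{p} + mc^2)_{ab}}{\tilde p^2c^2 - m^2c^4 + i\epsilon}
\end{aligned} \tag{J.89}
$$
16
Note that for the anticommuting fermion field the definition of the time ordered product involves a change of sign (compare with (7.16))
$$
T[\psi_a(\tilde x_1)\overline\psi_b(\tilde x_2)] = \begin{cases} \psi_a(\tilde x_1)\overline\psi_b(\tilde x_2), & \text{if } t_1 > t_2 \\ -\overline\psi_b(\tilde x_2)\psi_a(\tilde x_1), & \text{if } t_1 < t_2 \end{cases} \tag{J.87}
$$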
Appendix K
Quantum field for photons
K.1 Construction of the photon’s quantum field
Let us now construct a quantum field based on creation (c†p,τ ) and annihila-
tion (cp,τ ) operators for photons.1 Our goal is to satisfy conditions listed in
Step 1. in subsection 9.1.1.
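We will postulate that Lorentz transformations (9.1) of the photon field $A_\mu(\tilde x)$ are associated with the 4-dimensional representation of the Lorentz group from subsection I.1
$$
U_0(\Lambda;\tilde a)A_\mu(\tilde x)U_0^{-1}(\Lambda;\tilde a) = \sum_\nu\Lambda^{-1}_{\mu\nu}A^\nu(\Lambda(\tilde x+\tilde a)) \tag{K.1}
$$
with indices μ and ν taking values 0,1,2,3. Then we attempt to define a 4-component quantum field for photons as2
$$
A_\mu(\tilde x) \equiv A_\mu(\mathbf x,t) = \frac{\hbar\sqrt c}{(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt{2p}}\sum_\tau\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}e_\mu(\mathbf p,\tau)c_{\mathbf p,\tau} + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}e^*_\mu(\mathbf p,\tau)c^\dagger_{\mathbf p,\tau}\right] \tag{K.2}
$$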
1
2
In equation (K.23) we will see that, actually, this field does not satisfy our requirement
(K.1) for boosts.
where $\tilde p\cdot\tilde x \equiv \mathbf p\mathbf x - cpt$ and the coefficient functions $e_\mu(\mathbf p,\tau)$ should be chosen such that (K.1) is satisfied. Following the recipe from subsection J.4, we first choose the value of the coefficient function at the standard momentum k = (0, 0, 1)3 appropriate for massless photons
$$
e_\mu(\mathbf k,\tau) = \frac{1}{\sqrt 2}\begin{bmatrix} 0 \\ 1 \\ i\tau \\ 0 \end{bmatrix} \tag{K.3}
$$
For all other photon momenta p we define4
$$
e(\mathbf p,\tau) = \lambda_{\mathbf p}e(\mathbf k,\tau) \tag{K.4}
$$
$$
e^\dagger(\mathbf p,\tau) = e^\dagger(\mathbf k,\tau)\lambda_{\mathbf p} \tag{K.5}
$$
where $\lambda_{\mathbf p}$ is a boost transformation which takes the particle from the standard momentum k to an arbitrary p.
$$
\lambda_{\mathbf p} = R_{\mathbf p}B_{\mathbf p} \tag{K.6}
$$
where $B_{\mathbf p}$ is a boost along the z-axis and $R_{\mathbf p}$ is a pure rotation, as in equation (5.62).
K.2 Explicit formula for eµ (p, τ )

Note that the boost Bp in equation (5.61) has no effect on the 4-vector (K.3).
The 0-th component of this vector is not affected by rotations Rp as well.
Therefore, we conclude that for all p and x
e0 (p, τ ) = 0 (K.7)
A0 (x, t) = 0 (K.8)
3
see equation (5.54)
4
This is similar in spirit to the massive case (J.37) - (J.38)
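Let us now find the 3-vector part of $e_\mu(\mathbf p,\tau)$, which we denote by $\mathbf e(\mathbf p,\tau)$. From (K.3), (K.4), (K.6), and (D.22) we obtain
$$
\begin{aligned}
\sqrt 2\,\mathbf e(\mathbf p,\tau)
&= \begin{bmatrix}
\cos\phi + n_x^2(1-\cos\phi) & n_xn_y(1-\cos\phi) - n_z\sin\phi & n_xn_z(1-\cos\phi) + n_y\sin\phi \\
n_xn_y(1-\cos\phi) + n_z\sin\phi & \cos\phi + n_y^2(1-\cos\phi) & n_yn_z(1-\cos\phi) - n_x\sin\phi \\
n_xn_z(1-\cos\phi) - n_y\sin\phi & n_yn_z(1-\cos\phi) + n_x\sin\phi & \cos\phi + n_z^2(1-\cos\phi)
\end{bmatrix}\begin{bmatrix} 1 \\ i\tau \\ 0 \end{bmatrix} \\
&= \begin{bmatrix}
\frac{p_z}{p} + \frac{p_y^2(p-p_z)}{p(p_x^2+p_y^2)} & -\frac{p_xp_y(p-p_z)}{p(p_x^2+p_y^2)} & \frac{p_x}{p} \\
-\frac{p_xp_y(p-p_z)}{p(p_x^2+p_y^2)} & \frac{p_z}{p} + \frac{p_x^2(p-p_z)}{p(p_x^2+p_y^2)} & \frac{p_y}{p} \\
-\frac{p_x}{p} & -\frac{p_y}{p} & \frac{p_z}{p}
\end{bmatrix}\begin{bmatrix} 1 \\ i\tau \\ 0 \end{bmatrix} \\
&= \begin{bmatrix}
\frac{p_zp_x^2 + pp_y^2}{p(p_x^2+p_y^2)} & -\frac{p_xp_y(p-p_z)}{p(p_x^2+p_y^2)} & \frac{p_x}{p} \\
-\frac{p_xp_y(p-p_z)}{p(p_x^2+p_y^2)} & \frac{p_zp_y^2 + pp_x^2}{p(p_x^2+p_y^2)} & \frac{p_y}{p} \\
-\frac{p_x}{p} & -\frac{p_y}{p} & \frac{p_z}{p}
\end{bmatrix}\begin{bmatrix} 1 \\ i\tau \\ 0 \end{bmatrix} \\
&= \frac{1}{p(p_x^2+p_y^2)}\begin{bmatrix}
p_zp_x^2 + pp_y^2 - i\tau p_xp_y(p-p_z) \\
-p_xp_y(p-p_z) + i\tau(p_zp_y^2 + pp_x^2) \\
-p_x(p_x^2+p_y^2) - i\tau p_y(p_x^2+p_y^2)
\end{bmatrix}
\end{aligned}
$$
Therefore
$$
e(\mathbf p,\tau) = \frac{1}{\sqrt 2\,p(p_x^2+p_y^2)}\begin{bmatrix}
0 \\
p_zp_x^2 + pp_y^2 - i\tau p_xp_y(p-p_z) \\
-p_xp_y(p-p_z) + i\tau(p_zp_y^2 + pp_x^2) \\
-p_x(p_x^2+p_y^2) - i\tau p_y(p_x^2+p_y^2)
\end{bmatrix} \tag{K.9}
$$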
One can easily see that e(p, τ ) is orthogonal to the momentum vector p =
(px , py , pz ) and that
eµ (p, τ )pµ = 0 (K.10)
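The transversality of (K.9) can be verified numerically for a sample momentum. The following Python sketch evaluates the spatial part of the polarization vector and checks that it is orthogonal to p and has unit norm; the function name `polarization` and the test momentum (1, 2, 2) are assumptions made for this illustration, and the formula requires $p_x^2 + p_y^2 \neq 0$.

```python
import math

def polarization(p, tau):
    """Spatial part of e(p, tau) from (K.9); tau = +1 or -1."""
    px, py, pz = p
    pabs = math.sqrt(px**2 + py**2 + pz**2)
    rho2 = px**2 + py**2            # must be nonzero for this formula
    pref = 1.0 / (math.sqrt(2) * pabs * rho2)
    ex = pz * px**2 + pabs * py**2 - 1j * tau * px * py * (pabs - pz)
    ey = -px * py * (pabs - pz) + 1j * tau * (pz * py**2 + pabs * px**2)
    ez = -(px + 1j * tau * py) * rho2
    return [pref * ex, pref * ey, pref * ez]

p = (1.0, 2.0, 2.0)                 # |p| = 3
checks = []
for tau in (1, -1):
    e = polarization(p, tau)
    dot_p = sum(e[i] * p[i] for i in range(3))   # should vanish
    norm2 = sum(abs(x)**2 for x in e)            # should equal 1
    checks.append((abs(dot_p), norm2))
print(checks)
```

Since the time component vanishes by (K.7), orthogonality of the 3-vector part to p is all that (K.10) requires.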

K.3 Useful commutator

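For our derivations in subsection 9.2.1 we need the following expression
$$
C_{\alpha\beta}(\mathbf p) = \frac{\hbar\sqrt c}{\sqrt{2p}}\gamma^\mu_{\alpha\beta}\sum_\tau e_\mu(\mathbf p,\tau)c_{\mathbf p,\tau} \tag{K.11}
$$
and the commutator
$$
\begin{aligned}
\left[C^\dagger_{\alpha\beta}(\mathbf p), C_{\gamma\delta}(\mathbf p')\right]
&= \frac{\hbar^2c}{2\sqrt{pp'}}\sum_{\tau\tau'}\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\,e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p',\tau')[c^\dagger_{\mathbf p,\tau}, c_{\mathbf p',\tau'}] \\
&= -\frac{\hbar^2c}{2p}\sum_{\tau\tau'}\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\,e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p',\tau')\delta(\mathbf p-\mathbf p')\delta_{\tau,\tau'} \\
&= -\frac{\hbar^2c}{2p}\sum_\tau\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\,e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p,\tau)\delta(\mathbf p-\mathbf p') \\
&= -\frac{\hbar^2c}{2p}\gamma^\mu_{\alpha\beta}\gamma^\nu_{\gamma\delta}\,h_{\mu\nu}(\mathbf p)\delta(\mathbf p-\mathbf p')
\end{aligned} \tag{K.12}
$$
where
$$
h_{\mu\nu}(\mathbf p) \equiv \sum_\tau e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p,\tau)
$$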
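is a sum frequently appearing in calculations. First we calculate this sum at the standard momentum k = (0, 0, 1) with the help of (K.3)
$$
\begin{aligned}
h_{\mu\nu}(\mathbf k) &= \frac{1}{2}\begin{bmatrix} 0 \\ 1 \\ i \\ 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & -i & 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 \\ 1 \\ -i \\ 0 \end{bmatrix}\begin{bmatrix} 0 & 1 & i & 0 \end{bmatrix} \\
&= \frac{1}{2}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & i & 0 \\ 0 & -i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\end{aligned}
$$
which can be also expressed in terms of components of the standard vector k
$$
h_{0\mu}(\mathbf k) = h_{\mu 0}(\mathbf k) = 0
$$
$$
h_{ij}(\mathbf k) = \delta_{ij} - \frac{k_ik_j}{k^2}
$$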
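At arbitrary momentum p we use formulas (K.4), (K.5), and (K.6)
$$
\begin{aligned}
h_{\mu\nu}(\mathbf p) &= \sum_\tau e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p,\tau)
= R_{\mathbf p}B_{\mathbf p}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}B_{\mathbf p}^{-1}R_{\mathbf p}^{-1} \\
&= R_{\mathbf p}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}R_{\mathbf p}^{-1}
\end{aligned}
$$
It then follows that $h_{0\mu}(\mathbf p) = h_{\mu 0}(\mathbf p) = 0$, that the 3 × 3 submatrix is
$$
h_{ij}(\mathbf p) = R_{\mathbf p}\left(\delta_{ij} - \frac{k_ik_j}{k^2}\right)R_{\mathbf p}^{-1} = \delta_{ij} - \frac{p_ip_j}{p^2} \tag{K.13}
$$
and the final formula for $h_{\mu\nu}(\mathbf p)$ is
$$
h_{\mu\nu}(\mathbf p) = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 1 - \frac{p_x^2}{p^2} & -\frac{p_xp_y}{p^2} & -\frac{p_xp_z}{p^2} \\
0 & -\frac{p_xp_y}{p^2} & 1 - \frac{p_y^2}{p^2} & -\frac{p_yp_z}{p^2} \\
0 & -\frac{p_xp_z}{p^2} & -\frac{p_zp_y}{p^2} & 1 - \frac{p_z^2}{p^2}
\end{bmatrix} \tag{K.14}
$$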
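The polarization sum (K.13) can be cross-checked against the explicit vectors (K.9). The sketch below is an illustration under assumptions: it sums over the two helicities τ = ±1, uses an arbitrary test momentum, and introduces the hypothetical helpers `polarization` and `h_spatial`.

```python
import math

def polarization(p, tau):
    """Spatial part of e(p, tau) from (K.9); tau = +1 or -1."""
    px, py, pz = p
    pabs = math.sqrt(px**2 + py**2 + pz**2)
    rho2 = px**2 + py**2            # must be nonzero for this formula
    pref = 1.0 / (math.sqrt(2) * pabs * rho2)
    ex = pz * px**2 + pabs * py**2 - 1j * tau * px * py * (pabs - pz)
    ey = -px * py * (pabs - pz) + 1j * tau * (pz * py**2 + pabs * px**2)
    ez = -(px + 1j * tau * py) * rho2
    return [pref * ex, pref * ey, pref * ez]

def h_spatial(p):
    """3x3 matrix sum_tau e_i(p, tau) e_j^*(p, tau) over tau = +1, -1."""
    h = [[0j] * 3 for _ in range(3)]
    for tau in (1, -1):
        e = polarization(p, tau)
        for i in range(3):
            for j in range(3):
                h[i][j] += e[i] * e[j].conjugate()
    return h

p = (0.5, -1.5, 2.0)
p2 = sum(x * x for x in p)
h = h_spatial(p)
# Expected projector delta_ij - p_i p_j / p^2 from (K.13).
expected = [[(1.0 if i == j else 0.0) - p[i] * p[j] / p2 for j in range(3)]
            for i in range(3)]
max_err = max(abs(h[i][j] - expected[i][j]) for i in range(3) for j in range(3))
print(max_err)
```

The result is the projector onto the plane perpendicular to p, so its trace equals 2, the number of photon helicities.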
K.4 Equal time commutator of photon fields

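The photon quantum field (K.2) commutes with itself at space-like intervals ($\mathbf x \neq \mathbf y$), as required in equation (9.4)
$$
\begin{aligned}
&[A_\mu(\mathbf x,0), A^\dagger_\nu(\mathbf y,0)] \\
&= \frac{\hbar^2c}{2(2\pi\hbar)^3}\int\frac{d\mathbf p\,d\mathbf p'}{\sqrt{pp'}}\sum_{\tau\tau'}\Big[e^{-\frac{i}{\hbar}\mathbf p\mathbf x}e_\mu(\mathbf p,\tau)c_{\mathbf p,\tau} + e^{\frac{i}{\hbar}\mathbf p\mathbf x}e^*_\mu(\mathbf p,\tau)c^\dagger_{\mathbf p,\tau}, \\
&\qquad\qquad e^{\frac{i}{\hbar}\mathbf p'\mathbf y}e^*_\nu(\mathbf p',\tau')c^\dagger_{\mathbf p',\tau'} + e^{-\frac{i}{\hbar}\mathbf p'\mathbf y}e_\nu(\mathbf p',\tau')c_{\mathbf p',\tau'}\Big] \\
&= \frac{\hbar^2c}{2(2\pi\hbar)^3}\int\frac{d\mathbf p\,d\mathbf p'}{\sqrt{pp'}}\sum_{\tau\tau'}\Big(e^{-\frac{i}{\hbar}\mathbf p\mathbf x}e^{\frac{i}{\hbar}\mathbf p'\mathbf y}e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p',\tau')[c_{\mathbf p,\tau}, c^\dagger_{\mathbf p',\tau'}] \\
&\qquad\qquad + e^{\frac{i}{\hbar}\mathbf p\mathbf x}e^{-\frac{i}{\hbar}\mathbf p'\mathbf y}e^*_\mu(\mathbf p,\tau)e^{*\dagger}_\nu(\mathbf p',\tau')[c^\dagger_{\mathbf p,\tau}, c_{\mathbf p',\tau'}]\Big) \\
&= \frac{\hbar^2c}{2(2\pi\hbar)^3}\int\frac{d\mathbf p\,d\mathbf p'}{p}\delta(\mathbf p-\mathbf p')\sum_{\tau\tau'}\delta_{\tau,\tau'}\Big(e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p',\tau') - e^{\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}e^*_\mu(\mathbf p,\tau)e^{*\dagger}_\nu(\mathbf p',\tau')\Big) \\
&= \frac{\hbar^2c}{2(2\pi\hbar)^3}\int\frac{d\mathbf p}{p}\sum_\tau\Big(e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}e_\mu(\mathbf p,\tau)e^\dagger_\nu(\mathbf p,\tau) - e^{\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}e^*_\mu(\mathbf p,\tau)e^{*\dagger}_\nu(\mathbf p,\tau)\Big) \\
&= \frac{\hbar^2c}{2(2\pi\hbar)^3}\int\frac{d\mathbf p}{p}\Big(e^{-\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)} - e^{\frac{i}{\hbar}\mathbf p(\mathbf x-\mathbf y)}\Big)h_{\mu\nu}(\mathbf p) \\
&= -\frac{i\hbar^2c}{(2\pi\hbar)^3}\int\frac{d\mathbf p}{p}\sin\left(\frac{\mathbf p(\mathbf x-\mathbf y)}{\hbar}\right)h_{\mu\nu}(\mathbf p) \\
&= 0
\end{aligned} \tag{K.15}
$$
because the integrand in (K.15) is an odd function of p.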
K.5 Photon propagator

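Next we need to calculate the photon propagator. We use the integral representation (B.4) of the step function to write
$$
\begin{aligned}
&\langle 0|T[A_\mu(\tilde x_1)A_\nu(\tilde x_2)]|0\rangle \\
&= \hbar^2c\int\frac{d\mathbf p}{2(2\pi\hbar)^3p}h_{\mu\nu}(\mathbf p)\left[e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_1-\tilde x_2)}\theta(t_1-t_2) + e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_2-\tilde x_1)}\theta(t_2-t_1)\right] \\
&= -\frac{\hbar^2c}{2\pi i}\int_{-\infty}^{\infty}ds\int\frac{d\mathbf p}{2(2\pi\hbar)^3p}h_{\mu\nu}(\mathbf p)\left[e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_1-\tilde x_2)}\frac{e^{-is(t_1-t_2)}}{s+i\epsilon} + e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_2-\tilde x_1)}\frac{e^{is(t_1-t_2)}}{s+i\epsilon}\right] \\
&= -\frac{\hbar^2c}{2\pi i}\int_{-\infty}^{\infty}ds\int\frac{d\mathbf p}{2(2\pi\hbar)^3p}h_{\mu\nu}(\mathbf p)\frac{1}{s+i\epsilon}\times \\
&\qquad\left[e^{\frac{i}{\hbar}(cp(t_1-t_2) - \mathbf p(\mathbf x_1-\mathbf x_2))}e^{-is(t_1-t_2)} + e^{\frac{i}{\hbar}(-cp(t_1-t_2) + \mathbf p(\mathbf x_1-\mathbf x_2))}e^{is(t_1-t_2)}\right] \\
&= -\frac{\hbar^2c}{2\pi i}\int_{-\infty}^{\infty}ds\int\frac{d\mathbf p}{2(2\pi\hbar)^3p}h_{\mu\nu}(\mathbf p)e^{-\frac{i}{\hbar}\mathbf p(\mathbf x_1-\mathbf x_2)}\left[\frac{e^{\frac{i}{\hbar}(cp-\hbar s)(t_1-t_2)}}{s+i\epsilon} + \frac{e^{-\frac{i}{\hbar}(cp-\hbar s)(t_1-t_2)}}{s+i\epsilon}\right]
\end{aligned}
$$
Next we change variables: in the first integral $p_0 = cp - \hbar s$; in the second integral $p_0 = -cp + \hbar s$
$$
\begin{aligned}
&\langle 0|T[A_\mu(\tilde x_1)A_\nu(\tilde x_2)]|0\rangle \\
&= -\frac{\hbar^2c}{2\pi i}\int_{-\infty}^{\infty}dp_0\int\frac{d\mathbf p}{2(2\pi\hbar)^3p}h_{\mu\nu}(\mathbf p)e^{-\frac{i}{\hbar}\mathbf p(\mathbf x_1-\mathbf x_2)}\left[\frac{e^{\frac{i}{\hbar}p_0(t_1-t_2)}}{cp - p_0 + i\epsilon} + \frac{e^{\frac{i}{\hbar}p_0(t_1-t_2)}}{cp + p_0 + i\epsilon}\right] \\
&= \frac{\hbar^2c^2}{2\pi i}\int_{-\infty}^{\infty}dp_0\int\frac{d\mathbf p}{(2\pi\hbar)^3}h_{\mu\nu}(\mathbf p)e^{\frac{i}{\hbar}p_0(t_1-t_2)}e^{-\frac{ic}{\hbar}\mathbf p(\mathbf x_1-\mathbf x_2)}\frac{1}{\tilde p^2 + i\epsilon} \\
&= \frac{\hbar^2c^2}{2\pi i}\int\frac{d^4p}{(2\pi\hbar)^3}h_{\mu\nu}(\mathbf p)e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_1-\tilde x_2)}\frac{1}{\tilde p^2 + i\epsilon}
\end{aligned} \tag{K.16}
$$
where we denoted $d^4p \equiv dp_0\,d\mathbf p$.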

The matrix $h_{\mu\nu}(\mathbf p)$ has been calculated in (K.14). However, as explained in subsection 9.2.3, it is more convenient to use the Feynman-Dyson approach where this matrix is replaced by the metric tensor $h_{\mu\nu}(\mathbf p) = g_{\mu\nu}$. Then we obtain our final propagator formula
$$
\langle 0|T[A_\mu(\tilde x_1)A_\nu(\tilde x_2)]|0\rangle = \frac{\hbar^2c^2}{2\pi i}\int\frac{d^4p}{(2\pi\hbar)^3}e^{\frac{i}{\hbar}\tilde p\cdot(\tilde x_1-\tilde x_2)}\frac{g_{\mu\nu}}{\tilde p^2 + i\epsilon} \tag{K.17}
$$
K.6 Poincaré transformations of the photon field
Now we need to determine transformations of the photon field with respect
to the non-interacting representation of the Poincaré group. Note that we
have defined coefficient functions eµ (p, τ ) in subsection K.2 in the hope to
achieve the transformation law (K.1) for the photon field. This approach
was successful in the case of electron-positron field in Appendix J.7. How-
ever, for massless photons the situation is more complicated. The actions of
translations and rotations do agree with our condition (K.1)
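$$
\begin{aligned}
U_0(R;0)A_0(\mathbf x,t)U_0^{-1}(R;0) &= A_0(R\mathbf x,t) \\
U_0(R;0)\mathbf A(\mathbf x,t)U_0^{-1}(R;0) &= R^{-1}\mathbf A(R\mathbf x,t) \\
U_0(1;\mathbf r,\tau)A_\mu(\mathbf x,t)U_0^{-1}(1;\mathbf r,\tau) &= A_\mu(\mathbf x+\mathbf r,t+\tau)
\end{aligned} \tag{K.18}
$$
However, transformations with respect to boosts disagree with our expectation [Wei64a]
$$
U_0(\Lambda;0)A_\mu(\tilde x)U_0^{-1}(\Lambda;0) = \sum_\nu\Lambda^{-1}_{\mu\nu}A^\nu(\Lambda\tilde x) \tag{K.19}
$$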
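To demonstrate this disagreement we first use equations (8.38) and (8.39) to write
$$
\begin{aligned}
&U_0(\Lambda;0)A_\mu(\tilde x)U_0^{-1}(\Lambda;0) \\
&= \frac{\hbar\sqrt c}{(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt{2p}}\sum_\tau\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}e_\mu(\mathbf p,\tau)U_0(\Lambda;0)c_{\mathbf p,\tau}U_0^{-1}(\Lambda;0) + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}e^*_\mu(\mathbf p,\tau)U_0(\Lambda;0)c^\dagger_{\mathbf p,\tau}U_0^{-1}(\Lambda;0)\right] \\
&= \frac{\hbar\sqrt c}{(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt{2p}}\sqrt{\frac{|\Lambda\mathbf p|}{p}}\sum_\tau\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}e_\mu(\mathbf p,\tau)e^{-i\tau\phi_W(\mathbf p,\Lambda)}c_{\Lambda\mathbf p,\tau} + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}e^*_\mu(\mathbf p,\tau)e^{i\tau\phi_W(\mathbf p,\Lambda)}c^\dagger_{\Lambda\mathbf p,\tau}\right]
\end{aligned} \tag{K.20}
$$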
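Next we take equation (K.4) for vector $\Lambda\mathbf p$
$$
e(\Lambda\mathbf p,\tau) = \lambda_{\Lambda\mathbf p}e(\mathbf k,\tau)
$$
and multiply both sides from the left by $\Lambda^{-1}$
$$
\Lambda^{-1}e(\Lambda\mathbf p,\tau) = \lambda_{\mathbf p}\left(\lambda_{\mathbf p}^{-1}\Lambda^{-1}\lambda_{\Lambda\mathbf p}\right)e(\mathbf k,\tau)
$$
The term in parentheses is a member of the little group5 which corresponds to a Wigner rotation through the angle $-\phi_W$, so we can use representation (5.55)
$$
\begin{aligned}
&\lambda_{\mathbf p}^{-1}\Lambda^{-1}\lambda_{\Lambda\mathbf p}e(\mathbf k,\tau) = S(X_1,X_2,-\phi_W)e(\mathbf k,\tau) \\
&= \begin{bmatrix}
1 + (X_1^2+X_2^2)/2 & X_1 & X_2 & -(X_1^2+X_2^2)/2 \\
X_1\cos\phi_W - X_2\sin\phi_W & \cos\phi_W & -\sin\phi_W & -X_1\cos\phi_W + X_2\sin\phi_W \\
X_1\sin\phi_W + X_2\cos\phi_W & \sin\phi_W & \cos\phi_W & -X_1\sin\phi_W - X_2\cos\phi_W \\
(X_1^2+X_2^2)/2 & X_1 & X_2 & 1 - (X_1^2+X_2^2)/2
\end{bmatrix}\begin{bmatrix} 0 \\ 1 \\ i\tau \\ 0 \end{bmatrix} \\
&= e^{-i\tau\phi_W(\mathbf p,\Lambda)}\begin{bmatrix} 0 \\ 1 \\ i\tau \\ 0 \end{bmatrix} + (X_1 + i\tau X_2)\begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \\
&= e^{-i\tau\phi_W(\mathbf p,\Lambda)}e(\mathbf k,\tau) + \frac{X_1 + i\tau X_2}{c}\tilde k
\end{aligned}
$$
where $k_\mu = (c, 0, 0, c)$ and $X_1$, $X_2$ are certain functions of $\Lambda$ and $\mathbf p$. Our next goal is to eliminate these unknown functions from our formulas. Denoting
$$
X_\tau(\mathbf p,\Lambda) = \frac{X_1 + i\tau X_2}{c} \tag{K.21}
$$
we obtain
$$
\begin{aligned}
\sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}e^\nu(\Lambda\mathbf p,\tau) &= e^{-i\tau\phi_W(\mathbf p,\Lambda)}\lambda_{\mathbf p}e_\mu(\mathbf k,\tau) + X_\tau(\mathbf p,\Lambda)\lambda_{\mathbf p}k_\mu \\
&= e^{-i\tau\phi_W(\mathbf p,\Lambda)}e_\mu(\mathbf p,\tau) + X_\tau(\mathbf p,\Lambda)\frac{p_\mu}{p}
\end{aligned} \tag{K.22}
$$
5
where $p_\mu = (p, p_x, p_y, p_z)$ is the energy-momentum 4-vector corresponding to the 3-momentum $\mathbf p$. By letting $\mu = 0$ and taking into account (K.7) we also obtain
$$
\sum_{\nu=0}^{3}\Lambda^{-1}_{0\nu}e^\nu(\Lambda\mathbf p,\tau) = e^{-i\tau\phi_W(\mathbf p,\Lambda)}e_0(\mathbf p,\tau) + X_\tau(\mathbf p,\Lambda)\frac{p_0}{p} = X_\tau(\mathbf p,\Lambda)
$$
$$
\begin{aligned}
e^{-i\tau\phi_W(\mathbf p,\Lambda)}e_\mu(\mathbf p,\tau) &= \sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}e^\nu(\Lambda\mathbf p,\tau) - X_\tau(\mathbf p,\Lambda)\frac{p_\mu}{p} \\
&= \sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}e^\nu(\Lambda\mathbf p,\tau) - \frac{p_\mu}{p}\sum_{\nu=0}^{3}\Lambda^{-1}_{0\nu}e^\nu(\Lambda\mathbf p,\tau) \\
&= \sum_{\nu=0}^{3}\left(\Lambda^{-1}_{\mu\nu} - \Lambda^{-1}_{0\nu}\frac{p_\mu}{p}\right)e^\nu(\Lambda\mathbf p,\tau)
\end{aligned}
$$
The complex conjugate of this equation is
$$
e^{i\tau\phi_W(\mathbf p,\Lambda)}e^*_\mu(\mathbf p,\tau) = \sum_{\nu=0}^{3}\left(\Lambda^{-1}_{\mu\nu} - \Lambda^{-1}_{0\nu}\frac{p_\mu}{p}\right)e^{*\nu}(\Lambda\mathbf p,\tau)
$$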
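Then using (5.25) and (I.4) we can rewrite equation (K.20) as
$$
\begin{aligned}
&U_0(\Lambda;0)A_\mu(\tilde x)U_0^{-1}(\Lambda;0) \\
&= \frac{\hbar\sqrt c}{\sqrt 2(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt p}\sqrt{\frac{|\Lambda\mathbf p|}{p}}\sum_{\tau=-1}^{1}\sum_{\nu=0}^{3}\Big[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}\left(\Lambda^{-1}_{\mu\nu} - \Lambda^{-1}_{0\nu}\frac{p_\mu}{p}\right)e^\nu(\Lambda\mathbf p,\tau)c_{\Lambda\mathbf p,\tau} \\
&\qquad\qquad + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}\left(\Lambda^{-1}_{\mu\nu} - \Lambda^{-1}_{0\nu}\frac{p_\mu}{p}\right)e^{*\nu}(\Lambda\mathbf p,\tau)c^\dagger_{\Lambda\mathbf p,\tau}\Big] \\
&= \frac{\hbar\sqrt c}{\sqrt 2(2\pi\hbar)^{3/2}}\sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}\int\frac{d(\Lambda\mathbf p)}{|\Lambda\mathbf p|}\sqrt{|\Lambda\mathbf p|}\sum_{\tau=-1}^{1}\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}e^\nu(\Lambda\mathbf p,\tau)c_{\Lambda\mathbf p,\tau} + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}e^{*\nu}(\Lambda\mathbf p,\tau)c^\dagger_{\Lambda\mathbf p,\tau}\right] \\
&\quad - \frac{\hbar\sqrt c}{\sqrt 2(2\pi\hbar)^{3/2}}\int\frac{d(\Lambda\mathbf p)}{|\Lambda\mathbf p|}\sqrt{|\Lambda\mathbf p|}\sum_{\tau=-1}^{1}\frac{p_\mu}{p}\sum_{\nu=0}^{3}\Lambda^{-1}_{0\nu}\left[e^{-\frac{i}{\hbar}\tilde p\cdot\tilde x}e^\nu(\Lambda\mathbf p,\tau)c_{\Lambda\mathbf p,\tau} + e^{\frac{i}{\hbar}\tilde p\cdot\tilde x}e^{*\nu}(\Lambda\mathbf p,\tau)c^\dagger_{\Lambda\mathbf p,\tau}\right] \\
&= \sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}\left[\frac{\hbar\sqrt c}{\sqrt 2(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt p}\sum_{\tau=-1}^{1}\left(e^{-\frac{i}{\hbar}\tilde p\cdot\Lambda\tilde x}e^\nu(\mathbf p,\tau)c_{\mathbf p,\tau} + e^{\frac{i}{\hbar}\tilde p\cdot\Lambda\tilde x}e^{*\nu}(\mathbf p,\tau)c^\dagger_{\mathbf p,\tau}\right)\right] \\
&\quad - \frac{\hbar\sqrt c}{(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt{2p}}\sum_{\tau=-1}^{1}\frac{(\Lambda^{-1}p)_\mu}{|\Lambda^{-1}\mathbf p|}\sum_{\nu=0}^{3}\Lambda^{-1}_{0\nu}\left[e^{-\frac{i}{\hbar}\Lambda^{-1}\tilde p\cdot\tilde x}e^\nu(\mathbf p,\tau)c_{\mathbf p,\tau} + e^{\frac{i}{\hbar}\Lambda^{-1}\tilde p\cdot\tilde x}e^{*\nu}(\mathbf p,\tau)c^\dagger_{\mathbf p,\tau}\right] \\
&= \sum_{\nu=0}^{3}\Lambda^{-1}_{\mu\nu}A^\nu(\Lambda\tilde x) + \Omega_\mu(\tilde x,\Lambda)
\end{aligned} \tag{K.23}
$$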
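Thus we see that property (K.19) is not satisfied. In addition to the desired covariant transformation $\Lambda^{-1}A(\Lambda\tilde x)$, there is an extra term
$$
\begin{aligned}
\Omega_\mu(\tilde x,\Lambda) \equiv &-\frac{\hbar\sqrt c}{(2\pi\hbar)^{3/2}}\int\frac{d\mathbf p}{\sqrt{2p}}\sum_{\tau=-1}^{1}\frac{(\Lambda^{-1}p)_\mu}{|\Lambda^{-1}\mathbf p|}\sum_{\nu=0}^{3}\Lambda^{-1}_{0\nu}\times \\
&\left[e^{-\frac{i}{\hbar}\Lambda^{-1}\tilde p\cdot\tilde x}e^\nu(\mathbf p,\tau)c_{\mathbf p,\tau} + e^{\frac{i}{\hbar}\Lambda^{-1}\tilde p\cdot\tilde x}e^{*\nu}(\mathbf p,\tau)c^\dagger_{\mathbf p,\tau}\right]
\end{aligned} \tag{K.24}
$$
in the boost transformation law. The presence of this extra term is the reason why QED with massless photons cannot be formulated via simple steps outlined in subsection 9.1.1. A more elaborate construction is required in order to maintain the relativistic invariance of QED as detailed in subsection 9.1.2 and in Appendix N.2.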
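From
$$
\lim_{\theta\to 0}\sum_{\rho=0}^{3}\Lambda^{-1}_{0\rho}e^\rho(\mathbf p,\tau) = \sum_{\rho=0}^{3}\delta_{0\rho}e^\rho(\mathbf p,\tau) = e_0(\mathbf p,\tau) = 0 \tag{K.25}
$$
we obtain the following useful property
$$
\Omega(\tilde x,1) = 0 \tag{K.26}
$$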
Appendix L
QED interaction in terms of particle operators
L.1 Current density

In QED an important role is played by the operator of current density, which
is defined as the sum of the electron/positron current density $J^\mu(\tilde{x})$ and the
proton/antiproton current density $\mathcal{J}^\mu(\tilde{x})$
$$j^\mu(\tilde{x}) = J^\mu(\tilde{x}) + \mathcal{J}^\mu(\tilde{x}) \equiv -ec\overline{\psi}(\tilde{x})\gamma^\mu\psi(\tilde{x}) + ec\overline{\Psi}(\tilde{x})\gamma^\mu\Psi(\tilde{x}) \tag{L.1}$$
where $e$ is the absolute value of the electron charge, gamma matrices $\gamma^\mu$ are
defined in equations (J.1) - (J.2) and quantum fields $\psi(\tilde{x})$, $\overline{\psi}(\tilde{x})$, $\Psi(\tilde{x})$, $\overline{\Psi}(\tilde{x})$
are defined in Appendix J.3.¹ Let us consider the electron/positron part
$J^\mu(\tilde{x})$ of the current density and derive three important properties of this
operator.² First, with the help of (J.19), (J.21), and (J.61) we can find that
the current operator (L.1) transforms as a 4-vector function on the Minkowski
space-time
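$$\begin{aligned}
U_0(\Lambda;0)J^\mu(\tilde{x})U_0^{-1}(\Lambda;0)
&= -ecU_0(\Lambda;0)\psi^\dagger(\tilde{x})\gamma^0\gamma^\mu\psi(\tilde{x})U_0^{-1}(\Lambda;0)\\
&= -ecU_0(\Lambda;0)\psi^\dagger(\tilde{x})U_0^{-1}(\Lambda;0)\gamma^0\gamma^\mu U_0(\Lambda;0)\psi(\tilde{x})U_0^{-1}(\Lambda;0)\\
&= -ec\psi^\dagger(\Lambda\tilde{x})D^\dagger(\Lambda^{-1})\gamma^0\gamma^\mu D(\Lambda^{-1})\psi(\Lambda\tilde{x})\\
&= -ec\psi^\dagger(\Lambda\tilde{x})D^\dagger(\Lambda^{-1})\gamma^0D(\Lambda^{-1})D(\Lambda)\gamma^\mu D(\Lambda^{-1})\psi(\Lambda\tilde{x})\\
&= -ec\psi^\dagger(\Lambda\tilde{x})\gamma^0D(\Lambda)\gamma^\mu D(\Lambda^{-1})\psi(\Lambda\tilde{x})\\
&= -ec\sum_{\nu=0}^{3}\psi^\dagger(\Lambda\tilde{x})\gamma^0(\Lambda^{-1})^{\mu}{}_{\nu}\gamma^\nu\psi(\Lambda\tilde{x})\\
&= \sum_{\nu=0}^{3}(\Lambda^{-1})^{\mu}{}_{\nu}J^\nu(\Lambda\tilde{x})
\end{aligned} \tag{L.2}$$

¹ Note that $\psi(\tilde{x})$ is a 4-component bispinor-column and $\overline{\psi}(\tilde{x})$ is a 4-component bispinor-row. So, the product $\overline{\psi}(\tilde{x})\gamma^\mu\psi(\tilde{x})$ is a scalar in the bispinor space.
² Properties of the proton/antiproton part $\mathcal{J}^\mu(\tilde{x})$ are similar.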
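From this we obtain a useful commutator
$$\begin{aligned}
[K_{0z}, J_0(\tilde{x})]
&= -\frac{i\hbar}{c}\lim_{\theta\to 0}\frac{d}{d\theta}\,e^{\frac{ic}{\hbar}K_{0z}\theta}J_0(\tilde{x})e^{-\frac{ic}{\hbar}K_{0z}\theta}\\
&= -\frac{i\hbar}{c}\lim_{\theta\to 0}\frac{d}{d\theta}\Big[J_0\Big(x, y, z\cosh\theta - ct\sinh\theta,\ t\cosh\theta - \frac{z}{c}\sinh\theta\Big)\cosh\theta\\
&\qquad\qquad + J_z\Big(x, y, z\cosh\theta - ct\sinh\theta,\ t\cosh\theta - \frac{z}{c}\sinh\theta\Big)\sinh\theta\Big]\\
&= i\hbar\Big(\frac{z}{c^2}\frac{d}{dt} + t\frac{d}{dz}\Big)J_0(\tilde{x}) - \frac{i\hbar}{c}J_z(\tilde{x})
\end{aligned} \tag{L.3}$$
Space-time translations act by shifting the argument of the current
$$U_0(0;\tilde{a})J^\mu(\tilde{x})U_0^{-1}(0;\tilde{a}) = J^\mu(\tilde{x}+\tilde{a}) \tag{L.4}$$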
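Second, the current density satisfies the continuity equation, which can be
proven by using Dirac equations (J.79), (J.81), and property (J.3)
$$\begin{aligned}
\frac{\partial}{\partial t}J^0(\tilde{x})
&= -ec\frac{\partial}{\partial t}\big(\overline{\psi}(\tilde{x})\gamma^0\psi(\tilde{x})\big)\\
&= -ec\Big(\frac{\partial}{\partial t}\overline{\psi}(\tilde{x})\,\gamma^0\psi(\tilde{x}) + \overline{\psi}(\tilde{x})\gamma^0\frac{\partial}{\partial t}\psi(\tilde{x})\Big)\\
&= ec\Big(c\frac{\partial}{\partial\mathbf{x}}\psi^\dagger(\tilde{x})\vec{\gamma}^{\,\dagger} + \frac{i}{\hbar}mc^2\psi^\dagger(\tilde{x})\Big)\gamma^0\psi(\tilde{x})
 + ec\overline{\psi}(\tilde{x})\Big(c\vec{\gamma}\frac{\partial}{\partial\mathbf{x}}\psi(\tilde{x}) - \frac{i}{\hbar}mc^2\psi(\tilde{x})\Big)\\
&= ec^2\frac{\partial}{\partial\mathbf{x}}\overline{\psi}(\tilde{x})\,\vec{\gamma}\psi(\tilde{x}) + ec^2\overline{\psi}(\tilde{x})\vec{\gamma}\frac{\partial}{\partial\mathbf{x}}\psi(\tilde{x})\\
&= ec^2\frac{\partial}{\partial\mathbf{x}}\big(\overline{\psi}(\tilde{x})\vec{\gamma}\psi(\tilde{x})\big)\\
&= -c\frac{\partial}{\partial\mathbf{x}}\mathbf{J}(\tilde{x})
\end{aligned} \tag{L.5}$$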
Third, from equation (J.72) it follows that current components commute at
spacelike separations
$$[j^\mu(\mathbf{x},t), j^\nu(\mathbf{y},t)] = 0, \quad \text{if } \mathbf{x}\neq\mathbf{y}$$

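Using expressions for fields (J.57) and (J.58), we can also write the current
density operator (L.1) in the normally ordered form³
$$\begin{aligned}
j^\mu(\tilde{x}) &= -ec\overline{\psi}(\tilde{x})\gamma^\mu\psi(\tilde{x}) + ec\overline{\Psi}(\tilde{x})\gamma^\mu\Psi(\tilde{x})\\
&= ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,d\mathbf{p}'\,\Big(
 -\big[e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\overline{A}^\dagger_\alpha(\mathbf{p}) + e^{-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\overline{B}_\alpha(\mathbf{p})\big]\gamma^\mu_{\alpha\beta}\big[e^{-\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}}A_\beta(\mathbf{p}') + e^{\frac{i}{\hbar}\tilde{p}'\cdot\tilde{x}}B^\dagger_\beta(\mathbf{p}')\big]\\
&\qquad + \big[e^{\frac{i}{\hbar}\tilde{P}\cdot\tilde{x}}\overline{D}^\dagger_\alpha(\mathbf{p}) + e^{-\frac{i}{\hbar}\tilde{P}\cdot\tilde{x}}\overline{F}_\alpha(\mathbf{p})\big]\gamma^\mu_{\alpha\beta}\big[e^{-\frac{i}{\hbar}\tilde{P}'\cdot\tilde{x}}D_\beta(\mathbf{p}') + e^{\frac{i}{\hbar}\tilde{P}'\cdot\tilde{x}}F^\dagger_\beta(\mathbf{p}')\big]\Big)\\
&= ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,d\mathbf{p}'\,\gamma^\mu_{\alpha\beta}\,\Big(
 -\overline{A}^\dagger_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}
 -\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}\\
&\qquad -\overline{B}_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}
 -\overline{B}_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}\\
&\qquad +\overline{D}^\dagger_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}
 +\overline{D}^\dagger_\alpha(\mathbf{p})F^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}\\
&\qquad +\overline{F}_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}
 +\overline{F}_\alpha(\mathbf{p})F^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}\Big)\\
&= ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,d\mathbf{p}'\,\gamma^\mu_{\alpha\beta}\,\Big(
 -\overline{A}^\dagger_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}
 -\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}\\
&\qquad -\overline{B}_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}
 +B^\dagger_\beta(\mathbf{p}')\overline{B}_\alpha(\mathbf{p})e^{\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}\\
&\qquad +\overline{D}^\dagger_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}
 +\overline{D}^\dagger_\alpha(\mathbf{p})F^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}\\
&\qquad +\overline{F}_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}
 -F^\dagger_\beta(\mathbf{p}')\overline{F}_\alpha(\mathbf{p})e^{\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}\\
&\qquad -\{\overline{B}_\alpha(\mathbf{p}), B^\dagger_\beta(\mathbf{p}')\}e^{\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}
 +\{\overline{F}_\alpha(\mathbf{p}), F^\dagger_\beta(\mathbf{p}')\}e^{\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}\Big)
\end{aligned}$$

³ Summation over bispinor indices α and β is assumed.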
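Let us show that the two last terms vanish. We use anticommutator (J.75)
and properties of gamma matrices to rewrite these two terms as
$$\begin{aligned}
&ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,d\mathbf{p}'\,\gamma^\mu_{\alpha\beta}\,\Big(
 -\frac{1}{2\omega_{\mathbf{p}}}(\gamma^0\omega_{\mathbf{p}} + \vec{\gamma}\mathbf{p}c - mc^2)_{\beta\alpha}\delta(\mathbf{p}-\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}\\
&\qquad\qquad +\frac{1}{2\Omega_{\mathbf{p}}}(\gamma^0\Omega_{\mathbf{p}} + \vec{\gamma}\mathbf{p}c - Mc^2)_{\beta\alpha}\delta(\mathbf{p}-\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}\Big)\\
&= ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,\gamma^\mu_{\alpha\alpha}\Big(\frac{mc^2}{2\omega_{\mathbf{p}}} - \frac{Mc^2}{2\Omega_{\mathbf{p}}}\Big)
 + ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,(\gamma^\mu\gamma^0)_{\alpha\alpha}\Big(-\frac{1}{2}+\frac{1}{2}\Big)\\
&\qquad + ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,(\gamma^\mu\vec{\gamma})_{\alpha\alpha}\Big(-\frac{\mathbf{p}c}{2\omega_{\mathbf{p}}} + \frac{\mathbf{p}c}{2\Omega_{\mathbf{p}}}\Big)\\
&= ec(2\pi\hbar)^{-3}\,Tr(\gamma^\mu)\int d\mathbf{p}\Big(\frac{mc^2}{2\omega_{\mathbf{p}}} - \frac{Mc^2}{2\Omega_{\mathbf{p}}}\Big)
 + ec^2(2\pi\hbar)^{-3}\,Tr(\gamma^\mu\vec{\gamma})\int d\mathbf{p}\,\mathbf{p}\Big(-\frac{1}{2\omega_{\mathbf{p}}} + \frac{1}{2\Omega_{\mathbf{p}}}\Big)
\end{aligned} \tag{L.6}$$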
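The first term vanishes due to the property (J.7). The second integral is
zero, because the integrand is an odd function of **p**.⁴ So, finally, the normally
ordered form of the current density is
$$\begin{aligned}
j^\mu(\tilde{x}) = ec(2\pi\hbar)^{-3}\int d\mathbf{p}\,d\mathbf{p}'\,\gamma^\mu_{\alpha\beta}\,\Big(
 &-\overline{A}^\dagger_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}
 -\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}\\
 &-\overline{B}_\alpha(\mathbf{p})A_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{p}'+\tilde{p})\cdot\tilde{x}}
 +B^\dagger_\beta(\mathbf{p}')\overline{B}_\alpha(\mathbf{p})e^{\frac{i}{\hbar}(\tilde{p}'-\tilde{p})\cdot\tilde{x}}\\
 &+\overline{D}^\dagger_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}
 +\overline{D}^\dagger_\alpha(\mathbf{p})F^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}\\
 &+\overline{F}_\alpha(\mathbf{p})D_\beta(\mathbf{p}')e^{-\frac{i}{\hbar}(\tilde{P}'+\tilde{P})\cdot\tilde{x}}
 -F^\dagger_\beta(\mathbf{p}')\overline{F}_\alpha(\mathbf{p})e^{\frac{i}{\hbar}(\tilde{P}'-\tilde{P})\cdot\tilde{x}}\Big)
\end{aligned} \tag{L.7}$$

⁴ Note that cancelation in (L.6) was possible only because our theory contains two particle types (electrons and protons) with opposite electric charges.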
L.2 First-order interaction in QED

Inserting (L.7) and (K.11) in (9.13) we obtain the 1st order interaction ex-
pressed in terms of creation and annihilation operators
Z
e ′ † ′ − ~i (p′ −p)·x
V1 = dxdpdp dk −A α (p)Aβ (p )e + . . . ×
(2π~)9/2
i i

†
e− ~ kx Cαβ (k) + e ~ kx Cαβ (k)
Z
e
= dkdp ×
(2π~)3/2
† †
†
− Aα (p + k)Aβ (p)Cαβ (k) − Aα (p − k)Aβ (p)Cαβ (k)
† ††
+ D α (p + k)Dβ (p)Cαβ (k) + D α (p − k)Dβ (p)Cαβ (k)
+ Bβ† (p + k)B α (p)Cαβ (k) + Bβ† (p − k)B α (p)Cαβ
†
(k)
− Fβ† (p + k)F α (p)Cαβ (k) − Fβ† (p − k)F α (p)Cαβ
†
(k)
† †
− Aα (p + k)Bβ† (p)Cαβ (k) − Aα (p − k)Bβ† (p)Cαβ
†
(k)
†
− Aβ (p + k)B α (p)Cαβ (k) − Aβ (p − k)B α (p)Cαβ (k)
† †
+ D α (p + k)Fβ† (p)Cαβ (k) + Dα (p − k)Fβ† (p)Cαβ
†
(k)

†
+ Dβ (p + k)F α (p)Cαβ (k) + Dβ (p − k)F α (p)Cαβ (k) (L.8)
This operator is of the pure unphys type.
L.3 Second-order interaction in QED

The second order interaction Hamiltonian (9.14) has a rather long expression
in terms of particle operators
Z
1 1
V2 = 2 dxdyj0 (x, 0) j0 (y, 0)
c 8π|x − y|
Z Z
−6
X 1
2
= e (2π~) γαβ γγδ dxdy dpdp′ dqdq′
0 0
×
αβγδ
8π|x − y|
† i ′ † i ′
− Aα (p)Aβ (p′ )e− ~ (p −p)·x − Aα (p)Bβ† (p′ )e ~ (p +p)·x
i ′ i ′
− B α (p)Aβ (p′ )e− ~ (p +p)·x − B α (p)Bβ† (p′ )e ~ (p −p)·x
† i ′ † i ′
+ D α (p)Dβ (p′ )e− ~ (p −p)·x + D α (p)Fβ† (p′ )e+ ~ (p +p)·x
i ′ i ′

+ F α (p)Dβ (p′ )e− ~ (p +p)·x + F α (p)Fβ† (p′ )e ~ (p −p)·x ×
† i ′ † i ′
− Aγ (q)Aδ (q′ )e− ~ (q −q)·y − Aγ (q)Bδ† (q′ )e ~ (q +q)·y
i ′ i ′
− B γ (q)Aδ (q′ )e− ~ (q +q)·y − B γ (q)Bδ† (q′ )e ~ (q −q)·y
† i ′ † i ′
+ D γ (q)Dδ (q′ )e− ~ (q −q)·y + D γ (q)Fδ† (q′ )e+ ~ (q +q)·y
i ′ i ′

+ F γ (q)Dδ (q′ )e− ~ (q +q)·y + F γ (q)Fδ† (q′ )e ~ (q −q)·y
XZ Z
γαβ0 0
γγδ
−6
2
= e (2π~) dxdy dpdp′ dqdq′ ×
αβγδ
8π|x − y|
† † i ′ i ′
+ Aα (p)Aβ (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ Aα (p)Aβ (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Aα (p)Aβ (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Aα (p)Aβ (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aα (p)Aβ (p′ )Dγ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aα (p)Aβ (p′ )Dγ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Aα (p)Aβ (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Aα (p)Aβ (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ Aα (p)Bβ† (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
+ Aα (p)Bβ† (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ Aα (p)Bβ† (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ Aα (p)Bβ† (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− Aα (p)Bβ† (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− Aα (p)Bβ† (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− Aα (p)Bβ† (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− Aα (p)Bβ† (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e ~ (p +p)·x
† i ′ i ′
+ B α (p)Aβ (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ B α (p)Aβ (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ B α (p)Aβ (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ B α (p)Aβ (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− B α (p)Aβ (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− B α (p)Aβ (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− B α (p)Aβ (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− B α (p)Aβ (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ B α (p)Bβ† (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ B α (p)Bβ† (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ B α (p)Bβ† (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ B α (p)Bβ† (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
− B α (p)Bβ† (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
− B α (p)Bβ† (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− B α (p)Bβ† (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− B α (p)Bβ† (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e ~ (p −p)·x
† † i ′ i ′
− D α (p)Dβ (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− D α (p)Dβ (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− D α (p)Dβ (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− D α (p)Dβ (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ D α (p)Dβ (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ D α (p)Dβ (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ D α (p)Dβ (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ D α (p)Dβ (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− D α (p)Fβ† (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
− D α (p)Fβ† (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
− D α (p)Fβ† (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
− D α (p)Fβ† (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
+ D α (p)Fβ† (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
+ D α (p)Fβ† (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
+ D α (p)Fβ† (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
+ D α (p)Fβ† (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e+ ~ (p +p)·x
† i ′ i ′
− F α (p)Dβ (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− F α (p)Dβ (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− F α (p)Dβ (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− F α (p)Dβ (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ F α (p)Dβ (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ F α (p)Dβ (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ F α (p)Dβ (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ F α (p)Dβ (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− F α (p)Fβ† (p′ )Aγ (q)Aδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
− F α (p)Fβ† (p′ )Aγ (q)Bδ† (q′ )e ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− F α (p)Fβ† (p′ )B γ (q)Aδ (q′ )e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− F α (p)Fβ† (p′ )B γ (q)Bδ† (q′ )e ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ F α (p)Fβ† (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ F α (p)Fβ† (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ F α (p)Fβ† (p′ )F γ (q)Dδ (q′ )e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′

+ F α (p)Fβ† (p′ )F γ (q)Fδ† (q′ )e ~ (q −q)·y e ~ (p −p)·x (L.9)
We need to convert this expression to the normal order, i.e., move all creation
operators in front of annihilation operators. After this is done we will obtain
a sum of phys, unphys, and renorm terms. It can be shown that the renorm
terms are infinite. This is an indication of renormalization troubles with
QED. These troubles are discussed in detail in chapter 10. In the rest of this
section we will simply ignore the renorm part of interaction.
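There are some cancelations among unphys terms. To see how they work,
let us convert to the normal order the 12th term in (L.9)
$$\begin{aligned}
&e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta\gamma\delta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,d\mathbf{q}'\,\frac{\gamma^0_{\alpha\beta}\gamma^0_{\gamma\delta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')\overline{B}_\gamma(\mathbf{q})B^\dagger_\delta(\mathbf{q}')e^{\frac{i}{\hbar}(\mathbf{q}'-\mathbf{q})\cdot\mathbf{y}}e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}\\
&= -e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta\gamma\delta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,d\mathbf{q}'\,\frac{\gamma^0_{\alpha\beta}\gamma^0_{\gamma\delta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')B^\dagger_\delta(\mathbf{q}')\overline{B}_\gamma(\mathbf{q})e^{\frac{i}{\hbar}(\mathbf{q}'-\mathbf{q})\cdot\mathbf{y}}e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}\\
&\quad + e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta\gamma\delta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,d\mathbf{q}'\,\frac{\gamma^0_{\alpha\beta}\gamma^0_{\gamma\delta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')\{\overline{B}_\gamma(\mathbf{q}), B^\dagger_\delta(\mathbf{q}')\}e^{\frac{i}{\hbar}(\mathbf{q}'-\mathbf{q})\cdot\mathbf{y}}e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}
\end{aligned}$$
Now we denote the second term on the right hand side of this expression by
$I$ and use (J.75), (J.7) - (J.8), and (B.6)
$$\begin{aligned}
I &= e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta\gamma\delta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,d\mathbf{q}'\,\frac{\gamma^0_{\alpha\beta}\gamma^0_{\gamma\delta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')\frac{1}{2\omega_{\mathbf{q}}}(\gamma^0\omega_{\mathbf{q}} + \vec{\gamma}\mathbf{q}c - mc^2)_{\gamma\delta}\,\delta(\mathbf{q}'-\mathbf{q})e^{\frac{i}{\hbar}(\mathbf{q}'-\mathbf{q})\cdot\mathbf{y}}e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}\\
&= e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,\frac{\gamma^0_{\alpha\beta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')\frac{1}{2\omega_{\mathbf{q}}}\big(\omega_{\mathbf{q}}Tr(\gamma^0\gamma^0) + \mathbf{q}c\,Tr(\gamma^0\vec{\gamma}) - mc^2\,Tr(\gamma^0)\big)e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}\\
&= 2e^2(2\pi\hbar)^{-6}\sum_{\alpha\beta}\int d\mathbf{x}\,d\mathbf{y}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,\frac{\gamma^0_{\alpha\beta}}{8\pi|\mathbf{x}-\mathbf{y}|}\,
\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')e^{\frac{i}{\hbar}(\mathbf{p}'+\mathbf{p})\cdot\mathbf{x}}\\
&= \frac{2e^2\hbar^2}{(2\pi\hbar)^3}\sum_{\alpha\beta}\int d\mathbf{p}\,d\mathbf{p}'\,d\mathbf{q}\,\gamma^0_{\alpha\beta}\,\overline{A}^\dagger_\alpha(\mathbf{p})B^\dagger_\beta(\mathbf{p}')\frac{\delta(\mathbf{p}'+\mathbf{p})}{(\mathbf{p}'+\mathbf{p})^2}
\end{aligned} \tag{L.10}$$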
This term is infinite. However there are three other infinite terms in (L.9)
that arise in a similar manner from $-\overline{A}^\dagger B^\dagger\overline{F}F^\dagger + \overline{B}B^\dagger\overline{A}^\dagger B^\dagger - \overline{F}F^\dagger\overline{A}^\dagger B^\dagger$.
These terms cancel exactly with (L.10). Similar to (L.6), this cancelation is
possible only because of the condition $q_{electron} + q_{proton} = 0$.
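The trace properties invoked in (L.6) and (L.10) can be checked with explicit matrices. The sketch below assumes the Dirac representation of the gamma matrices (the book's own definition is in (J.1) - (J.2) and is not reproduced here; traces do not depend on the representation). The helper names `matmul`, `trace`, and `block` are illustrative only.

```python
# Check Tr(gamma^mu) = 0, Tr(gamma^0 gamma^0) = 4, Tr(gamma^0 gamma^i) = 0
# using 4x4 gamma matrices in the Dirac representation (an assumption of this sketch).

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))

def neg(m):
    """Elementwise negation of a matrix."""
    return [[-x for x in row] for row in m]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
sigma = [
    [[0, 1], [1, 0]],       # Pauli sigma_x
    [[0, -1j], [1j, 0]],    # Pauli sigma_y
    [[1, 0], [0, -1]],      # Pauli sigma_z
]

gamma0 = block(I2, Z2, Z2, neg(I2))
gammas = [gamma0] + [block(Z2, s, neg(s), Z2) for s in sigma]

traces_single = [trace(g) for g in gammas]                  # Tr(gamma^mu)
traces_g0_gmu = [trace(matmul(gamma0, g)) for g in gammas]  # Tr(gamma^0 gamma^mu)

print(traces_single)
print(traces_g0_gmu)
```

With these traces the bracket in the second line of (L.10) reduces to $4\omega_{\mathbf{q}}$, which is where the overall factor 2 in the third line comes from.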
Taking into account the above results and using anticommutators like
(J.73) and (J.75) we can bring the second order interaction (L.9) to the
normal order
XZ Z 0
γαβ 0
γγδ
2 −6 ′ ′
V2 = e (2π~) dxdy dpdp dqdq ×
αβγδ
8π|x − y|
† † i ′ i ′
( − Aα (p)Aγ (q)Aβ (p′ )Aδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aα (p)Aγ (q)Aβ (p′ )Bδ† (q′ )e ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Aα (p)Aβ (p′ )Aδ (q′ )B γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Aα (p)Aβ (p′ )Bδ† (q′ )B γ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aα (p)Aβ (p′ )Dγ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aα (p)Aβ (p′ )Dγ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Aα (p)Aβ (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Aα (p)Aβ (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ Aα (p)Aγ (q)Aδ (q′ )Bβ† (p′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
+ Aα (p)Aγ (q)Bβ† (p′ )Bδ† (q′ )e ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ Aα (p)Aδ (q′ )Bβ† (p′ )B γ (q)e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− Aα (p)Bβ† (p′ )Bδ† (q′ )B γ (q)e ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− Aα (p)Bβ† (p′ )Dγ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− Aα (p)Bβ† (p′ )Dγ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− Aα (p)Bβ† (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ Aα (p)Bβ† (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e ~ (p +p)·x
† i ′ i ′
− Aγ (q)Aβ (p′ )Aδ (q′ )B α (p)e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ Aγ (q)Aβ (p′ )Bδ† (q′ )B α (p)e ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Aβ (p′ )Aδ (q′ )B α (p)B γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Aβ (p′ )Bδ† (q′ )B α (p)B γ (q)e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− Aβ (p′ )B α (p)D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− Aβ (p′ )B α (p)D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− Aβ (p′ )B α (p)Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Aβ (p′ )B α (p)Fδ† (q′ )F γ (q)e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− Aγ (q)Aδ (q′ )Bβ† (p′ )B α (p)e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ Aγ (q)Bβ† (p′ )Bδ† (q′ )B α (p)e ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Aδ (q′ )Bβ† (p′ )B α (p)B γ (q)e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Bβ† (p′ )Bδ† (q′ )B α (p)B γ (q)e ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ Bβ† (p′ )B α (p)D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ Bβ† (p′ )B α (p)D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ Bβ† (p′ )B α (p)Dδ (q′ )F γ (q)e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Bβ† (p′ )B α (p)Fδ† (q′ )F γ (q)e ~ (q −q)·y e ~ (p −p)·x
† † i ′ i ′
− Aγ (q)Aδ (q′ )Dα (p)Dβ (p′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aγ (q)Bδ† (q′ )D α (p)Dβ (p′ )e ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Aδ (q′ )B γ (q)D α (p)Dβ (p′ )e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Bδ† (q′ )B γ (q)D α (p)Dβ (p′ )e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Dα (q)D γ (p)Dβ (q′ )Dδ (p′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Dα (p)D γ (q)Dβ (p′ )Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ Dα (p)Dβ (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− Dα (p)Dβ (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− Aγ (q)Aδ (q′ )D α (p)Fβ† (p′ )e− ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
− Aγ (q)Bδ† (q′ )Dα (p)Fβ† (p′ )e ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
− Aδ (q′ )B γ (q)D α (p)Fβ† (p′ )e− ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
+ Bδ† (q′ )B γ (q)Dα (p)Fβ† (p′ )e ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
+ D α (p)D γ (q)Dδ (q′ )Fβ† (p′ )e− ~ (q −q)·y e+ ~ (p +p)·x
† † i ′ i ′
+ D α (p)D γ (q)Fβ† (p′ )Fδ† (q′ )e+ ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
+ D α (p)Dδ (q′ )Fβ† (p′ )F γ (q)e− ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
− D α (p)Fβ† (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e+ ~ (p +p)·x
† i ′ i ′
− Aγ (q)Aδ (q′ )Dβ (p′ )F α (p)e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− Aγ (q)Bδ† (q′ )Dβ (p′ )F α (p)e ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− Aδ (q′ )B γ (q)Dβ (p′ )F α (p)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Bδ† (q′ )B γ (q)Dβ (p′ )F α (p)e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− D γ (q)Dβ (p′ )Dδ (q′ )F α (p)e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ D γ (q)Dβ (p′ )Fδ† (q′ )F α (p)e+ ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Dβ (p′ )Dδ (q′ )F α (p)F γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ Dβ (p′ )Fδ† (q′ )F α (p)F γ (q)e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
+ Aγ (q)Aδ (q′ )Fβ† (p′ )F α (p)e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ Aγ (q)Bδ† (q′ )Fβ† (p′ )F α (p)e ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ Aδ (q′ )B γ (q)Fβ† (p′ )F α (p)e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Bδ† (q′ )B γ (q)Fβ† (p′ )F α (p)e ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
− D γ (q)Dδ (q′ )Fβ† (p′ )F α (p)e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ D γ (q)Fβ† (p′ )Fδ† (q′ )F α (p)e+ ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Dδ (q′ )Fβ† (p′ )F α (p)F γ (q)e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− Fβ† (p′ )Fδ† (q′ )F α (p)F γ (q)e ~ (q −q)·y e ~ (p −p)·x )
Next we switch summation labels α ↔ γ and integration variables x ↔ y

and p ↔ q to simplify
XZ Z 0
γαβ 0
γγδ
−6
2
V2 = e (2π~) dxdy dpdp′ dqdq′ ×
αβγδ
8π|x − y|
† † i ′ i ′
( − Aα (p)Aγ (q)Aβ (p′ )Aδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† i ′ i ′
+ 2Aα (p)Aβ (p′ )Aδ (q′ )B γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− 2Aα (p)Aβ (p′ )Bδ† (q′ )B γ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− 2Aα (p)Aβ (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− 2Aα (p)Aβ (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− 2Aα (p)Aβ (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ 2Aα (p)Aβ (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ 2Aα (p)Aγ (q)Aδ (q′ )Bβ† (p′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
+ Aα (p)Aγ (q)Bβ† (p′ )Bδ† (q′ )e ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ 2Aα (p)Aδ (q′ )Bβ† (p′ )B γ (q)e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− 2Aα (p)Bβ† (p′ )Bδ† (q′ )B γ (q)e ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− 2Aα (p)Bβ† (p′ )D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p +p)·x
† † i ′ i ′
− 2Aα (p)Bβ† (p′ )D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
− 2Aα (p)Bβ† (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e ~ (p +p)·x
† i ′ i ′
+ 2Aα (p)Bβ† (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e ~ (p +p)·x
i ′ i ′
+ Aβ (p′ )Aδ (q′ )B α (p)B γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ 2Aβ (p′ )Bδ† (q′ )B α (p)B γ (q)e ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− 2Aβ (p′ )B α (p)D γ (q)Dδ (q′ )e− ~ (q −q)·y e− ~ (p +p)·x
† i ′ i ′
− 2Aβ (p′ )B α (p)D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
− 2Aβ (p′ )Bα (p)Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ 2Aβ (p′ )Bα (p)Fδ† (q′ )F γ (q)e ~ (q −q)·y e− ~ (p +p)·x
i ′ i ′
− Bβ† (p′ )Bδ† (q′ )B α (p)B γ (q)e ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ 2Bβ† (p′ )B α (p)D γ (q)Dδ (q′ )e− ~ (q −q)·y e ~ (p −p)·x
† i ′ i ′
+ 2Bβ† (p′ )B α (p)D γ (q)Fδ† (q′ )e+ ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
+ 2Bβ† (p′ )B α (p)Dδ (q′ )F γ (q)e− ~ (q +q)·y e ~ (p −p)·x
i ′ i ′
− 2Bβ† (p′ )B α (p)Fδ† (q′ )F γ (q)e ~ (q −q)·y e ~ (p −p)·x
† † i ′ i ′
− D α (p)D γ (q)Dβ (p′ )Dδ (q′ )e− ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
− 2D α (p)D γ (q)Dβ (p′ )Fδ† (q′ )e+ ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
+ 2D α (p)Dβ (p′ )Dδ (q′ )F γ (q)e− ~ (q +q)·y e− ~ (p −p)·x
† i ′ i ′
− 2D α (p)Dβ (p′ )Fγ† (q′ )F δ (q)e ~ (q −q)·y e− ~ (p −p)·x
† † i ′ i ′
+ D α (p)D γ (q)Fβ† (p′ )Fδ† (q′ )e+ ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
+ 2D α (p)Dδ (q′ )Fβ† (p′ )F γ (q)e− ~ (q +q)·y e+ ~ (p +p)·x
† i ′ i ′
− 2D α (p)Fβ† (p′ )Fδ† (q′ )F γ (q)e ~ (q −q)·y e+ ~ (p +p)·x
i ′ i ′
+ Dβ (p′ )Dδ (q′ )F α (p)F γ (q)e− ~ (q +q)·y e− ~ (p +p)·x
i ′ i ′
+ 2Dβ (p′ )Fδ† (q′ )F α (p)F γ (q)e ~ (q −q)·y e− ~ (p +p)·x
i ′ i ′
− Fβ† (p′ )Fδ† (q′ )F α (p)F γ (q)e ~ (q −q)·y e ~ (p −p)·x )
Integrals over x and y can be evaluated by using formula (B.6)
Z
e2 ~2 X
V2 = dpdp′ dqdq′ γαβ
0 0
γγδ ×
2(2π~)3 αβγδ
† † 1
− Aα (p)Aγ (q)Aβ (p′ )Aδ (q′ )δ(q′ − q + p′ − p)
− q|2 |q′
† 1
+ 2Aα (p)Aβ (p′ )Aδ (q′ )B γ (q)δ(q′ + q + p′ − p) ′
|q + q|2
† 1
− 2Aα (p)Aβ (p′ )Bδ† (q′ )B γ (q)δ(q′ − q − p′ + p) ′
|q − q|2
† † 1
− 2Aα (p)Aβ (p′ )Dγ (q)Dδ (q′ )δ(q′ − q + p′ − p) ′
|q − q|2
† † 1
− 2Aα (p)Aβ (p′ )Dγ (q)Fδ† (q′ )δ(q′ + q − p′ + p) ′
|q + q|2
† 1
− 2Aα (p)Aβ (p′ )Dδ (q′ )F γ (q)δ(q′ + q + p′ − p) ′
|q + q|2
† 1
+ 2Aα (p)Aβ (p′ )Fδ† (q′ )F γ (q)δ(q′ − q − p′ + p)
|q′ − q|2
† † 1
+ 2Aα (p)Aγ (q)Aδ (q′ )Bβ† (p′ )δ(q′ − q − p′ − p) ′
|q − q|2
† † 1
+ Aα (p)Aγ (q)Bβ† (p′ )Bδ† (q′ )δ(q′ + q + p′ + p) ′
|q + q|2
† 1
+ 2Aα (p)Aδ (q′ )Bβ† (p′ )B γ (q)δ(q′ + q − p′ − p) ′
|q + q|2
† 1
− 2Aα (p)Bβ† (p′ )Bδ† (q′ )B γ (q)δ(q′ − q + p′ + p) ′
|q − q|2
† † 1
− 2Aα (p)Bβ† (p′ )D γ (q)Dδ (q′ )δ(q′ − q − p′ − p) ′
|q − q|2
† † 1
− 2Aα (p)Bβ† (p′ )D γ (q)Fδ† (q′ )δ(q′ + q + p′ + p) ′
|q + q|2
† 1
− 2Aα (p)Bβ† (p′ )Dδ (q′ )F γ (q)δ(q′ + q − p′ − p) ′
|q + q|2
† 1
+ 2Aα (p)Bβ† (p′ )Fδ† (q′ )F γ (q)δ(q′ − q + p′ + p) ′
|q − q|2
1
+ Aβ (p′ )Aδ (q′ )B α (p)B γ (q)δ(q′ + q + p′ + p) ′
|q + q|2
1
+ 2Aβ (p′ )Bδ† (q′ )B α (p)B γ (q)δ(q′ − q − p′ − p) ′
|q − q|2
† 1
− 2Aβ (p′ )B α (p)D γ (q)Dδ (q′ )δ(q′ − q + p′ + p) ′
|q − q|2
† 1
− 2Aβ (p′ )B α (p)D γ (q)Fδ† (q′ )δ(q′ + q − p′ − p) ′
|q + q|2
1
− 2Aβ (p′ )Bα (p)Dδ (q′ )F γ (q)δ(q′ + q + p′ + p) ′
|q + q|2
1
+ 2Aβ (p′ )Bα (p)Fδ† (q′ )F γ (q)δ(q′ − q − p′ − p) ′
|q − q|2
1
− Bβ† (p′ )Bδ† (q′ )B α (p)B γ (q)δ(q′ − q + p′ − p) ′
|q − q|2
† 1
+ 2Bβ† (p′ )B α (p)D γ (q)Dδ (q′ )δ(q′ − q − p′ + p) ′
|q − q|2
† 1
+ 2Bβ† (p′ )B α (p)Dγ (q)Fδ† (q′ )δ(q′ + q + p′ − p)
|q′ + q|2
1
+ 2Bβ† (p′ )B α (p)Dδ (q′ )F γ (q)δ(q′ + q − p′ + p) ′
|q + q|2
1
− 2Bβ† (p′ )B α (p)Fδ† (q′ )F γ (q)δ(q′ − q + p′ − p) ′
|q − q|2
† † 1
− Dα (p)D γ (q)Dβ (p′ )Dδ (q′ )δ(q′ − q + p′ − p) ′
|q − q|2
† † 1
− 2Dα (p)D γ (q)Dβ (p′ )Fδ† (q′ )δ(q′ + q − p′ + p) ′
|q + q|2
† 1
+ 2Dα (p)Dβ (p′ )Dδ (q′ )F γ (q)δ(q′ + q + p′ − p) ′
|q + q|2
† 1
− 2Dα (p)Dβ (p′ )Fγ† (q′ )F δ (q)δ(q′ − q − p′ + p) ′
|q − q|2
† † 1
+ Dα (p)D γ (q)Fβ† (p′ )Fδ† (q′ )δ(q′ + q + p′ + p) ′
|q + q|2
† 1
+ 2Dα (p)Dδ (q′ )Fβ† (p′ )F γ (q)δ(q′ + q − p′ − p) ′
|q + q|2
† 1
− 2Dα (p)Fβ† (p′ )Fδ† (q′ )F γ (q)δ(q′ − q + p′ + p) ′
|q − q|2
1
+ Dβ (p′ )Dδ (q′ )F α (p)F γ (q)δ(q′ + q + p′ + p) ′
|q + q|2
1
+ 2Dβ (p′ )Fδ† (q′ )F α (p)F γ (q)δ(q′ − q − p′ − p) ′
|q − q|2
1
− Fβ† (p′ )Fδ† (q′ )F α (p)F γ (q)δ(q′ − q + p′ − p) ′
|q − q|2
Finally, we integrate this expression on q′ and divide V2 into phys and unphys
parts
V2 = V2phys + V2unphys
Z
e2 ~2 X
V2phys = dpdp′ dqγαβ
0 0
γγδ ×
2(2π~)3 αβγδ
† † 1
− Aα (p)Aγ (q)Aβ (p′ )Aδ (q − p′ + p)
|p′ − p|2
† 1
− 2Aα (p)Aβ (p′ )Bδ† (q + p′ − p)B γ (q) ′
|p − p|2
† † 1
− 2Aα (p)Aβ (p′ )Dγ (q)Dδ (q − p′ + p) ′
|p − p|2
† 1
+ 2Aα (p)Aβ (p′ )Fδ† (+q + p′ − p)F γ (q) ′
|p − p|2
† 1
+ 2Aα (p)Aδ (−q + p′ + p)Bβ† (p′ )B γ (q) ′
|p + p|2
† 1
− 2Aα (p)Bβ† (p′ )Dδ (−q + p′ + p)F γ (q) ′
|p + p|2
† 1
− 2Aβ (p′ )B α (p)D γ (q)Fδ† (−q + p′ + p) ′
|p + p|2
1
− 2Aβ (p′ )Bα (p)Dδ (−q − p′ − p)F γ (q) ′
|p + p|2
1
− Bβ† (p′ )Bδ† (q − p′ + p)B α (p)B γ (q) ′
|p − p|2
† 1
+ 2Bβ† (p′ )B α (p)D γ (q)Dδ (q + p′ − p) ′
|p − p|2
1
− 2Bβ† (p′ )B α (p)Fδ† (q − p′ + p)F γ (q) ′
|p − p|2
† † 1
− D α (p)D γ (q)Dβ (p′ )Dδ (q − p′ + p) ′
|p − p|2
† 1
− 2D α (p)Dβ (p′ )Fγ† (q + p′ − p)F δ (q) ′
|p − p|2
† 1
+ 2D α (p)Dδ (−q + p′ + p)Fβ† (p′ )F γ (q) ′
|p + p|2
1
† ′ † ′
− Fβ (p )Fδ (q − p + p)F α (p)F γ (q) ′ (L.11)
|p − p|2
Z
e2 ~2 X
V2unphys = 3
dpdp′ dqγαβ
0 0
γγδ ×
2(2π~) αβγδ
† 1
+ 2Aα (p)Aβ (p′ )Aδ (−q − p′ + p)B γ (q)
|p′ − p|2
† † 1
− 2Aα (p)Aβ (p′ )D γ (q)Fδ† (−q + p′ − p)
|p′ − p|2
† 1
− 2Aα (p)Aβ (p′ )Dδ (−q − p′ + p)F γ (q) ′
|p − p|2
† † 1
+ 2Aα (p)Aγ (q)Aδ (q + p′ + p)Bβ† (p′ ) ′
|p + p|2
† † 1
+ Aα (p)Aγ (q)Bβ† (p′ )Bδ† (−q − p′ − p) ′
|p + p|2
† 1
− 2Aα (p)Bβ† (p′ )Bδ† (q − p′ − p)B γ (q) ′
|p + p|2
† † 1
− 2Aα (p)Bβ† (p′ )D γ (q)Dδ (q + p′ + p) ′
|p + p|2
† † 1
− 2Aα (p)Bβ† (p′ )D γ (q)Fδ† (−q − p′ − p) ′
|p + p|2
† 1
+ 2Aα (p)Bβ† (p′ )Fδ† (q − p′ − p)F γ (q) ′
|p + p|2
1
+ Aβ (p′ )Aδ (−q − p′ − p)B α (p)B γ (q) ′
|p + p|2
1
+ 2Aβ (p′ )Bδ† (q + p′ + p)B α (p)B γ (q) ′
|p + p|2
† 1
− 2Aβ (p′ )B α (p)D γ (q)Dδ (q − p′ − p) ′
|p + p|2
1
− 2Aβ (p′ )Bα (p)Dδ (−q − p′ − p)F γ (q) ′
|p + p|2
1
+ 2Aβ (p′ )Bα (p)Fδ† (q + p′ + p)F γ (q) ′
|p + p|2
† 1
+ 2Bβ† (p′ )B α (p)D γ (q)Fδ† (−q − p′ + p) ′
|p − p|2
1
+ 2Bβ† (p′ )B α (p)Dδ (−q + p′ − p)F γ (q) ′
|p − p|2
† † 1
− 2Dα (p)D γ (q)Dβ (p′ )Fδ† (−q + p′ − p) ′
|p − p|2
† 1
+ 2Dα (p)Dβ (p′ )Dδ (−q − p′ + p)F γ (q) ′
|p − p|2
† † 1
+ D α (p)D γ (q)Fβ† (p′ )Fδ† (−q − p′ − p)
|p′ + p|2
† 1
− 2D α (p)Fβ† (p′ )Fδ† (q − p′ − p)F γ (q) ′
|p + p|2
1
+ Dβ (p′ )Dδ (−q − p′ − p)F α (p)F γ (q) ′
|p + p|2
1
′ † ′
+ 2Dβ (p )Fδ (q + p + p)F α (p)F γ (q) ′ (L.12)
|p + p|2
Appendix M
Loop integrals in QED
M.1 4-dimensional delta function

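In covariant Feynman-Dyson perturbation theory one often needs the 4-dimensional
delta function of the 4-momentum $(p_0, p_x, p_y, p_z)$
$$\delta^4(\tilde{p}) \equiv \delta(p_0)\delta(p_x)\delta(p_y)\delta(p_z) = \delta(p_0)\delta(\mathbf{p}) \tag{M.1}$$
which has the following integral representation
$$\frac{1}{(2\pi\hbar)^4}\int e^{\frac{i}{\hbar}(\tilde{p}\cdot\tilde{x})}d^4x = \delta^4(\tilde{p}) \tag{M.2}$$
In our notation
$$\tilde{x} = (t, \mathbf{x}), \qquad \tilde{p} = (p_0, \mathbf{p}), \qquad \tilde{p}\cdot\tilde{x} = p_0t - \mathbf{p}\cdot\mathbf{x}, \qquad d^4x \equiv dt\,d\mathbf{x}$$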
M.2 Feynman’s trick

In QED loop calculations one often meets integrals on the loop 4-momentum
$\tilde{k}$ of expressions like $1/(abc\ldots)$, where $a, b, c, \ldots$ are certain functions of $\tilde{k}$.
Calculations become much simpler if one can replace the integrand $1/(abc\ldots)$
with an expression in which $a, b, c, \ldots$ are present in the denominator in a
linear form. This can be achieved using a trick first introduced by Feynman
[Fey49].
The simplest example of such a trick is given by the integral representation
of the product $1/(ab)$
$$\int_0^1\frac{dx}{(ax+b(1-x))^2} = \left.\frac{1}{(b-a)(ax+b(1-x))}\right|_0^1 = \frac{1}{(b-a)a} - \frac{1}{(b-a)b} = \frac{1}{ab} \tag{M.3}$$
The denominator on the left hand side is a square of a function linear in a and
b. In spite of adding one more integral (on x), the overall integration task is
greatly simplified, as we will see in many examples in this Appendix. Using
this result, we can convert to the linear form more complex expressions, e.g.,
$$\frac{1}{a^2b} = -\frac{d}{da}\frac{1}{ab} = -\frac{d}{da}\int_0^1\frac{dx}{(ax+b(1-x))^2} = \int_0^1\frac{2x\,dx}{(ax+b(1-x))^3} \tag{M.4}$$
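Formulas (M.3) and (M.4) are easy to confirm numerically. The following is a minimal sketch, assuming real positive test values for a and b (in the loop integrals these are functions of the 4-momentum); the helper `simpson` is a plain composite Simpson rule written for this illustration.

```python
# Numerical check of the Feynman parametrizations (M.3) and (M.4).

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def feynman_ab(a, b):
    # Left-hand side of (M.3): integral of dx / (a x + b (1 - x))^2 over [0, 1]
    return simpson(lambda x: 1.0 / (a * x + b * (1 - x)) ** 2, 0.0, 1.0)

def feynman_a2b(a, b):
    # Right-hand side of (M.4): integral of 2 x dx / (a x + b (1 - x))^3 over [0, 1]
    return simpson(lambda x: 2 * x / (a * x + b * (1 - x)) ** 3, 0.0, 1.0)

a, b = 2.0, 5.0  # arbitrary positive test values (assumption of this sketch)
print(feynman_ab(a, b), 1 / (a * b))
print(feynman_a2b(a, b), 1 / (a * a * b))
```

Each printed pair should agree to within the quadrature error.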
These two results can be used to get an integral representation for $1/(abc)$
$$\begin{aligned}
\frac{1}{abc} &= \frac{1}{bc}\cdot\frac{1}{a} = \left[\int_0^1\frac{dy}{(by+c(1-y))^2}\right]\frac{1}{a}\\
&= \int_0^1dy\int_0^1 2x\,dx\,\frac{1}{[(by+c(1-y))x + a(1-x)]^3}\\
&= 2\int_0^1x\,dx\int_0^1\frac{dy}{[byx + cx(1-y) + a(1-x)]^3}
\end{aligned} \tag{M.5}$$
Another useful formula is¹
$$\begin{aligned}
\frac{1}{abc} &= 2\int_0^1dx\int_0^1dy\int_0^1dz\,\frac{\delta(x+y+z-1)}{[ax+by+cz]^3}\\
&= 2\int_0^1dx\int_0^{1-x}dy\,\frac{1}{[ax+by+c(1-x-y)]^3}
\end{aligned} \tag{M.6}$$
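The two representations (M.5) and (M.6) of $1/(abc)$ can be compared numerically. This is a sketch under the assumption of real positive a, b, c; `simpson`, `rhs_m5`, and `rhs_m6` are names introduced here for illustration.

```python
# Numerical check that the double integrals (M.5) and (M.6) both equal 1/(abc).

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def rhs_m5(a, b, c):
    # 2 * int_0^1 x dx int_0^1 dy / [b y x + c x (1 - y) + a (1 - x)]^3
    def inner(x):
        return simpson(
            lambda y: 1.0 / (b * y * x + c * x * (1 - y) + a * (1 - x)) ** 3,
            0.0, 1.0)
    return 2 * simpson(lambda x: x * inner(x), 0.0, 1.0)

def rhs_m6(a, b, c):
    # 2 * int_0^1 dx int_0^{1-x} dy / [a x + b y + c (1 - x - y)]^3
    def inner(x):
        return simpson(
            lambda y: 1.0 / (a * x + b * y + c * (1 - x - y)) ** 3,
            0.0, 1.0 - x)
    return 2 * simpson(inner, 0.0, 1.0)

a, b, c = 2.0, 3.0, 5.0  # arbitrary positive test values (assumption of this sketch)
print(rhs_m5(a, b, c), rhs_m6(a, b, c), 1 / (a * b * c))
```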
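Next differentiate equation (M.4) with respect to $a$
$$\frac{1}{a^3d} = -\frac{1}{2}\frac{d}{da}\frac{1}{a^2d} = -\frac{d}{da}\int_0^1\frac{z\,dz}{[az+d(1-z)]^3} = 3\int_0^1\frac{z^2\,dz}{[az+d(1-z)]^4}$$
This results in
$$\begin{aligned}
\frac{1}{abcd} &= \left[2\int_0^1x\,dx\int_0^1dy\,\frac{1}{[a(1-x)+bxy+cx(1-y)]^3}\right]\frac{1}{d}\\
&= 6\int_0^1x\,dx\int_0^1dy\int_0^1\frac{z^2\,dz}{[az(1-x)+bxyz+cxz(1-y)+d(1-z)]^4}
\end{aligned} \tag{M.7}$$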
Obviously these calculations can be continued for expressions with larger

numbers of factors in denominators. See, e.g., the last formula on page 520
of [Sch61] and equation (11.A.1) in [Wei95].
M.3 Some basic 4D integrals

In our studies of loop integrals we will follow Feynman's approach [Fey49]
and begin with the following simple integral
$$K = \int\frac{d^4k}{(\tilde{k}^2-L)^3} \equiv \int\frac{dk_0\,d\mathbf{k}}{(k_0^2-c^2k^2-L+i\epsilon)^3} \tag{M.8}$$
The integral on $k_0$ has two 3rd order poles at $k_0 = \pm\sqrt{c^2k^2+L}$. We can
rotate² the integration contour on $k_0$, so that it goes along the imaginary
axis and then change the integration variables $ik_0 = m_4$ and $c\mathbf{k} = \mathbf{m}$. Then
the integral is
$$K = \frac{1}{c^3}\int_{-i\infty}^{i\infty}dk_0\int\frac{d\mathbf{m}}{(k_0^2-m^2-L)^3} = \frac{i}{c^3}\int_{-\infty}^{\infty}dm_4\int\frac{d\mathbf{m}}{(-m_4^2-m^2-L)^3}$$
Next we introduce 4-dimensional spherical coordinates [Blu60] where $r^2 = m_4^2 + m^2$, and the area of a unit sphere³ is $\int d\Omega = 2\pi^2$
$$K = -\frac{2\pi^2i}{c^3}\int_0^\infty\frac{r^3\,dr}{(r^2+L)^3} = -\frac{\pi^2i}{c^3}\int_L^\infty\frac{(t-L)\,dt}{t^3} = -\frac{\pi^2}{2ic^3L} \tag{M.9}$$

¹ Equation (131.2) in [BLP01].
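The real radial integral entering (M.9) has the closed form $\int_0^\infty r^3dr/(r^2+L)^3 = 1/(4L)$ for $L > 0$. The sketch below checks only this factor numerically and leaves the complex prefactor aside. To integrate over a finite interval it uses the substitution $r = u/(1-u)$, which is a choice made for this illustration and not a step taken in the text.

```python
# Numerical check of int_0^inf r^3 dr / (r^2 + L)^3 = 1 / (4 L), for L > 0.

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def radial_integral(L):
    # With r = u / (1 - u), dr = du / (1 - u)^2, the integrand becomes
    # u^3 (1 - u) / (u^2 + L (1 - u)^2)^3 on the interval 0 <= u <= 1.
    def g(u):
        return u ** 3 * (1 - u) / (u * u + L * (1 - u) ** 2) ** 3
    return simpson(g, 0.0, 1.0)

L = 2.5  # arbitrary positive test value (assumption of this sketch)
print(radial_integral(L), 1 / (4 * L))
```

Multiplying by the unit-sphere area $2\pi^2$ gives $\pi^2/(2L)$, the magnitude that appears in (M.9).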
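From symmetry properties we also get
$$\int\frac{d^4k\,k_\sigma}{(\tilde{k}^2-L)^3} = 0 \tag{M.10}$$
Replacing $\tilde{k}\to\tilde{k}-\tilde{p}$ in (M.8) and calling $L-\tilde{p}^2 = \Delta$ we get
$$-\frac{\pi^2}{2i(\tilde{p}^2+\Delta)c^3} = \int\frac{d^4k}{((\tilde{k}-\tilde{p})^2-L)^3} = \int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}\tilde{k}+\tilde{p}^2-L)^3} = \int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3} \tag{M.11}$$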
Making the same substitutions in (M.10) we obtain
$$0 = \int\frac{d^4k\,(k_\sigma-p_\sigma)}{((\tilde{k}-\tilde{p})^2-L)^3} = \int\frac{d^4k\,(k_\sigma-p_\sigma)}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3}$$
Then
$$\int\frac{d^4k\,k_\sigma}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3} = \int\frac{d^4k\,p_\sigma}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3} = -\frac{\pi^2p_\sigma}{2i(\tilde{p}^2+\Delta)c^3} \tag{M.12}$$

² This step is known as the Wick rotation [PS95b].
³ See equation (7.81) in [PS95b].
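Differentiating both sides of (M.11) either by $\Delta$ or by $p_\sigma$ we obtain
$$\int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4} = \frac{\pi^2}{6i(\tilde{p}^2+\Delta)^2c^3} \tag{M.13}$$
$$\int\frac{d^4k\,k_\sigma}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4} = \frac{\pi^2p_\sigma}{6i(\tilde{p}^2+\Delta)^2c^3} \tag{M.14}$$
Next differentiate both sides of (M.12) by $p_\tau$. If $\tau\neq\sigma$ then
$$\int\frac{d^4k\,k_\sigma k_\tau}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4} = \frac{\pi^2p_\sigma p_\tau}{6i(\tilde{p}^2+\Delta)^2c^3} \tag{M.15}$$
If $\tau = \sigma$
$$\int\frac{d^4k\,k_\sigma k_\sigma}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4} = \frac{\pi^2p_\sigma p_\sigma}{6i(\tilde{p}^2+\Delta)^2c^3} - \frac{\pi^2}{12i(\tilde{p}^2+\Delta)c^3} \tag{M.16}$$
Combining (M.15) and (M.16) yields
$$\int\frac{d^4k\,k_\sigma k_\tau}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4} = \frac{\pi^2\left(p_\sigma p_\tau - \frac{1}{2}\delta_{\sigma\tau}(\tilde{p}^2+\Delta)\right)}{6i(\tilde{p}^2+\Delta)^2c^3}$$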
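As a worked illustration of the differentiation steps, here is how (M.13) and (M.14) follow from (M.11). The sketch treats $\Delta$ and $\tilde{p}$ as independent variables and differentiates under the integral sign.

```latex
% Differentiating the integrand and the right-hand side of (M.11) by \Delta:
\frac{\partial}{\partial\Delta}\,\frac{1}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3}
  = \frac{3}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4},
\qquad
\frac{\partial}{\partial\Delta}\left[-\frac{\pi^2}{2i(\tilde{p}^2+\Delta)c^3}\right]
  = \frac{\pi^2}{2i(\tilde{p}^2+\Delta)^2c^3}
% Dividing both sides by 3 gives (M.13).

% Differentiating by p^\sigma, with
% \partial(\tilde{p}\tilde{k})/\partial p^\sigma = k_\sigma and
% \partial\tilde{p}^2/\partial p^\sigma = 2p_\sigma:
\frac{\partial}{\partial p^\sigma}\,\frac{1}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^3}
  = \frac{6k_\sigma}{(\tilde{k}^2-2\tilde{p}\tilde{k}-\Delta)^4},
\qquad
\frac{\partial}{\partial p^\sigma}\left[-\frac{\pi^2}{2i(\tilde{p}^2+\Delta)c^3}\right]
  = \frac{\pi^2p_\sigma}{i(\tilde{p}^2+\Delta)^2c^3}
% Dividing both sides by 6 gives (M.14).
```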
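Next we use (M.4) and (M.11) to calculate
$$\begin{aligned}
&\int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)^2(\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)}\\
&= \int_0^1 2x\,dx\int\frac{d^4k}{[(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)x + (\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)(1-x)]^3}\\
&= \int_0^1 2x\,dx\int\frac{d^4k}{[\tilde{k}^2x - 2\tilde{p}_1\tilde{k}x - \Delta_1x + \tilde{k}^2 - 2\tilde{p}_2\tilde{k} - \Delta_2 - \tilde{k}^2x + 2\tilde{p}_2\tilde{k}x + \Delta_2x]^3}\\
&= \int_0^1 2x\,dx\int\frac{d^4k}{[\tilde{k}^2-2\tilde{p}_x\tilde{k}-\Delta_x]^3} = -\frac{\pi^2}{ic^3}\int_0^1\frac{x\,dx}{\tilde{p}_x^2+\Delta_x}
\end{aligned} \tag{M.17}$$
where $\tilde{p}_1$, $\tilde{p}_2$ are two arbitrary 4-vectors, $\Delta_1$, $\Delta_2$ are numerical constants and
$$\tilde{p}_x = x\tilde{p}_1 + (1-x)\tilde{p}_2$$
$$\Delta_x = x\Delta_1 + (1-x)\Delta_2$$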
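Similarly, we use (M.12) to obtain
$$\begin{aligned}
&\int\frac{d^4k\,k_\sigma}{(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)^2(\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)}\\
&= \int_0^1 2x\,dx\int\frac{d^4k\,k_\sigma}{[\tilde{k}^2-2\tilde{p}_x\tilde{k}-\Delta_x]^3} = -\frac{\pi^2}{ic^3}\int_0^1\frac{p_{x\sigma}\,x\,dx}{\tilde{p}_x^2+\Delta_x}
\end{aligned} \tag{M.18}$$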
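Three more integrals are obtained by differentiating (M.17) with respect to
$\Delta_2$ and $p_{2\tau}$ and by differentiating (M.18) with respect to $p_{2\tau}$
$$\int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)^2(\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)^2} = \frac{\pi^2}{ic^3}\int_0^1\frac{x(1-x)\,dx}{(\tilde{p}_x^2+\Delta_x)^2} \tag{M.19}$$
$$\int\frac{d^4k\,k_\tau}{(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)^2(\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)^2} = \frac{\pi^2}{ic^3}\int_0^1\frac{p_{x\tau}\,x(1-x)\,dx}{(\tilde{p}_x^2+\Delta_x)^2} \tag{M.20}$$
$$\int\frac{d^4k\,k_\sigma k_\tau}{(\tilde{k}^2-2\tilde{p}_1\tilde{k}-\Delta_1)^2(\tilde{k}^2-2\tilde{p}_2\tilde{k}-\Delta_2)^2} = \frac{\pi^2}{ic^3}\int_0^1\frac{\left(p_{x\sigma}p_{x\tau} - \frac{1}{2}\delta_{\sigma\tau}(\tilde{p}_x^2+\Delta_x)\right)x(1-x)\,dx}{(\tilde{p}_x^2+\Delta_x)^2} \tag{M.21}$$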
M.4 Electron self-energy integral

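The loop integral in square brackets in (10.18) can be represented in the
form⁴
$$J_{ad}(\not{p}) = \gamma_\mu(\not{p}+mc^2)I\gamma^\mu - \gamma_\mu\gamma_\kappa I^\kappa\gamma^\mu = (-2\not{p}+4mc^2)I + 2\gamma_\kappa I^\kappa \tag{M.22}$$
where
$$I \equiv \int\frac{d^4k}{[(\tilde{p}-\tilde{k})^2-m^2c^4]\tilde{k}^2}$$
$$I^\kappa \equiv \int\frac{d^4k\,k^\kappa}{[(\tilde{p}-\tilde{k})^2-m^2c^4]\tilde{k}^2}$$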
The factor $1/\tilde{k}^2$ in the integrand is a source of both ultraviolet and infrared
divergences. So, the integrals need to be regularized, as described in subsection
10.1.1. To do that, we introduce two parameters: the ultraviolet cutoff
$\Lambda$ and the infrared cutoff $\lambda$.⁵ Then we replace the troublesome factor $1/\tilde{k}^2$
by the integral
$$\frac{1}{\tilde{k}^2} \to -\int_{\lambda^2c^4}^{\Lambda^2c^4}\frac{dL}{(\tilde{k}^2-L)^2} \tag{M.23}$$
At the end of calculations we should take the limits $\Lambda\to\infty$ and $\lambda\to 0$. In this
limit the integral reduces to $1/\tilde{k}^2$, as expected
$$-\int_0^\infty\frac{dL}{(\tilde{k}^2-L)^2} = -\int_{-\tilde{k}^2}^{\infty}\frac{dx}{x^2} = \frac{1}{\tilde{k}^2}$$

⁴ Here we used equations (J.9), (J.10), and (J.11).
⁵ $\Lambda$ and $\lambda$ have the dimension of mass.
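The behaviour of the regularized factor (M.23) can be illustrated numerically. The sketch below works in units with c = 1 and takes a negative (spacelike) value of $\tilde{k}^2$ so that the integrand has no pole on the integration path. Both choices are assumptions made for this illustration. The closed form used in `regularized_closed` follows from the antiderivative $1/(\tilde{k}^2-L)$ of the integrand.

```python
# Regularized replacement for 1/k^2 from (M.23), in units with c = 1:
#   -int_{lam2}^{Lam2} dL / (k2 - L)^2 = 1/(k2 - lam2) - 1/(k2 - Lam2)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def regularized_numeric(k2, lam2, Lam2):
    # Direct quadrature of the integral in (M.23)
    return -simpson(lambda L: 1.0 / (k2 - L) ** 2, lam2, Lam2)

def regularized_closed(k2, lam2, Lam2):
    # Closed form obtained from the antiderivative 1 / (k2 - L)
    return 1.0 / (k2 - lam2) - 1.0 / (k2 - Lam2)

k2 = -3.0  # spacelike test value (assumption of this sketch)
print(regularized_numeric(k2, 0.1, 50.0), regularized_closed(k2, 0.1, 50.0))

# Removing the cutoffs recovers 1/k^2
for lam2, Lam2 in [(1e-2, 1e2), (1e-6, 1e6), (1e-12, 1e12)]:
    print(lam2, Lam2, regularized_closed(k2, lam2, Lam2), 1.0 / k2)
```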
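Then we can use (M.17) and (M.18) with parameters
$$\Delta_1 = L; \quad \tilde{p}_1 = 0; \quad \Delta_2 = m^2c^4-\tilde{p}^2; \quad \tilde{p}_2 = \tilde{p} \tag{M.24}$$
$$\tilde{p}_x = (1-x)\tilde{p}; \quad \Delta_x = xL + (1-x)(m^2c^4-\tilde{p}^2) \tag{M.25}$$
to rewrite our integrals⁶
$$\begin{aligned}
I &= -\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int\frac{d^4k}{(\tilde{k}^2-2\tilde{p}\tilde{k}+\tilde{p}^2-m^2c^4)(\tilde{k}^2-L)^2}\\
&= \frac{\pi^2}{ic^3}\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{x\,dx}{\tilde{p}_x^2+\Delta_x}\\
&= \frac{\pi^2}{ic^3}\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{x\,dx}{(1-x)^2\tilde{p}^2 + xL + (1-x)(m^2c^4-\tilde{p}^2)}\\
&= \frac{\pi^2}{ic^3}\int_0^1dx\,\ln\Big[(1-x)^2\tilde{p}^2 + xL + (1-x)(m^2c^4-\tilde{p}^2)\Big]\Big|_{L=\lambda^2c^4}^{L=\Lambda^2c^4}\\
&= \frac{\pi^2}{ic^3}\int_0^1dx\,\ln\frac{(1-x)^2\tilde{p}^2 + x\Lambda^2c^4 + (1-x)(m^2c^4-\tilde{p}^2)}{(1-x)^2\tilde{p}^2 + x\lambda^2c^4 + (1-x)(m^2c^4-\tilde{p}^2)}\\
&\approx \frac{\pi^2}{ic^3}\int_0^1dx\,\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + (1-x)(m^2c^4-\tilde{p}^2)}
\end{aligned} \tag{M.26}$$

⁶ We have assumed that $\Lambda^2 \gg m^2c^4$.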
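$$\begin{aligned}
I^\kappa &= -\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int d^4k\,\frac{k^\kappa}{(\tilde{k}^2 - 2\tilde{p}\cdot\tilde{k} + \tilde{p}^2 - m^2c^4)(\tilde{k}^2 - L)^2}\\
&= \frac{\pi^2}{ic^3}\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{p^\kappa\,x(1-x)\,dx}{(1-x)^2\tilde{p}^2 + xL + (1-x)(m^2c^4 - \tilde{p}^2)}\\
&= \frac{\pi^2}{ic^3}\int_0^1 dx\,(1-x)p^\kappa\,\ln\frac{(1-x)^2\tilde{p}^2 + x\Lambda^2c^4 + (1-x)(m^2c^4 - \tilde{p}^2)}{(1-x)^2\tilde{p}^2 + x\lambda^2c^4 + (1-x)(m^2c^4 - \tilde{p}^2)}\\
&\approx \frac{\pi^2}{ic^3}\int_0^1 dx\,(1-x)p^\kappa\,\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + (1-x)(m^2c^4 - \tilde{p}^2)}
\end{aligned} \tag{M.27}$$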
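Inserting (M.26) and (M.27) in (M.22) we obtain

$$\begin{aligned}
J_{ad}(\slashed{p}) &\approx \frac{\pi^2}{ic^3}(-2\slashed{p} + 4mc^2)\int_0^1 dx\,\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + (1-x)(m^2c^4 - \tilde{p}^2)}\\
&\quad + \frac{2\pi^2\slashed{p}}{ic^3}\int_0^1 dx\,(1-x)\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + (1-x)(m^2c^4 - \tilde{p}^2)}\\
&= \frac{\pi^2}{ic^3}\int_0^1 dx\,(4mc^2 - 2\slashed{p}x)\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + (1-x)(m^2c^4 - \tilde{p}^2)}
\end{aligned} \tag{M.28}$$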
For our discussion in subsections 10.2.1 and 10.2.2 it will be convenient to represent $J_{ad}$ in the form of a Taylor expansion around the on-mass-shell value of the 4-momentum $\slashed{p} = mc^2$

$$J_{ad}(\slashed{p}) = C_0\delta_{ad} + C_1(\slashed{p} - mc^2)_{ad} + R(\slashed{p}) \tag{M.29}$$

where $C_0$ is a constant term (independent of $p_\mu$), $C_1(\slashed{p} - mc^2)_{ad}$ is linear in $\slashed{p} - mc^2$, and $R(\slashed{p})$ combines all other terms (quadratic, cubic, etc. in $\slashed{p} - mc^2$). To calculate $C_0$ we simply set $\tilde{p}^2 = m^2c^4$ in (M.28).⁷

$$\begin{aligned}
C_0 &= \frac{2\pi^2mc^2}{ic^3}\int_0^1 dx\,(2-x)\ln\frac{x\Lambda^2c^4}{(1-x)^2m^2c^4}\\
&\approx \frac{2\pi^2mc^2}{ic^3}\int_0^1(2-x)\,dx\,\ln\frac{\Lambda^2}{m^2} + \frac{2\pi^2mc^2}{ic^3}\int_0^1(2-x)\,dx\,\ln\frac{x}{(1-x)^2}\\
&= \frac{3\pi^2mc^2}{ic^3}\ln\frac{\Lambda^2}{m^2} + \frac{3\pi^2mc^2}{2ic^3} = \frac{3\pi^2mc^2}{2ic^3}\left(4\ln\frac{\Lambda}{m} + 1\right)
\end{aligned} \tag{M.30}$$
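The two elementary integrals behind (M.30) are easy to confirm numerically. The sketch below is an illustration only: the helper `midpoint` and the variable names are introduced here, and a plain midpoint rule is used because it never evaluates the logarithm at the endpoints where it diverges.

```python
import math


def midpoint(f, a, b, n=200_000):
    """Midpoint rule on [a, b]; the integrand is never evaluated at a or b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Coefficient of ln(Λ²/m²): ∫_0^1 (2 - x) dx
linear_part = midpoint(lambda x: 2.0 - x, 0.0, 1.0)

# Λ-independent remainder: ∫_0^1 (2 - x) ln(x / (1 - x)²) dx
log_part = midpoint(lambda x: (2.0 - x) * math.log(x / (1.0 - x) ** 2), 0.0, 1.0)

print(linear_part, log_part)
```

Multiplying both numbers by the prefactor $2\pi^2mc^2/(ic^3)$ should reproduce the two terms in the last line of (M.30), provided the values come out close to 3/2 and 3/4.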
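For the coefficient $C_1$ we obtain⁸

$$\begin{aligned}
C_1 &= \frac{dJ_{ad}}{d\slashed{p}}\bigg|_{\slashed{p}=mc^2}\\
&= -\frac{2\pi^2}{ic^3}\int_0^1 x\,dx\,\ln\frac{x\Lambda^2c^4}{(1-x)^2\tilde{p}^2 + x\lambda^2c^4 + (1-x)(m^2c^4 - \tilde{p}^2)}\bigg|_{\slashed{p}=mc^2}\\
&\quad - \frac{2\pi^2}{ic^3}mc^2\int_0^1(2-x)\,dx\,\frac{(1-x)^2\tilde{p}^2 + x\lambda^2c^4 + (1-x)(m^2c^4 - \tilde{p}^2)}{x\Lambda^2c^4}\times\frac{x\Lambda^2c^4\left(2(1-x)^2\slashed{p} - 2(1-x)\slashed{p}\right)}{\left((1-x)^2\tilde{p}^2 + x\lambda^2c^4 + (1-x)(m^2c^4 - \tilde{p}^2)\right)^2}\bigg|_{\slashed{p}=mc^2}\\
&= -\frac{2\pi^2}{ic^3}\int_0^1 x\,dx\,\ln\frac{x\Lambda^2}{(1-x)^2m^2} - \frac{2\pi^2mc^2}{ic^3}\int_0^1 dx\,(2-x)\frac{2(1-x)^2mc^2 - 2(1-x)mc^2}{(1-x)^2m^2c^4 + x\lambda^2c^4}\\
&= -\frac{2\pi^2}{ic^3}\int_0^1 x\,dx\,\ln\frac{x\Lambda^2}{(1-x)^2m^2} - \frac{4\pi^2}{ic^3}\int_0^1 dx\,\frac{(2-x)(x^2-x)}{(1-x)^2 + x\lambda^2/m^2}\\
&= -\frac{2\pi^2}{ic^3}\int_0^1 x\,dx\,\ln\frac{x}{(1-x)^2} - \frac{\pi^2}{ic^3}\ln\frac{\Lambda^2}{m^2} - \frac{2\pi^2}{ic^3}\left(1 + \ln\frac{\lambda^2}{m^2}\right)\\
&= -\frac{2\pi^2}{ic^3}\left(\ln\frac{\Lambda}{m} + 2\ln\frac{\lambda}{m} + \frac{9}{4}\right)
\end{aligned} \tag{M.31}$$

⁷ Note that $\tilde{p}^2$ is a function of $\slashed{p}$ due to (J.23). When doing calculations with slash symbols $\slashed{p}$ and $\gamma$-matrices, it is convenient to use properties (J.3) - (J.13).
⁸ Here we used the integral $\int_0^1 dx\,x\ln\left(x/(1-x)^2\right) = 5/4$.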
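Two ingredients of (M.31) can be checked numerically: the finite integral $\int_0^1 x\ln(x/(1-x)^2)\,dx$ that appears in the derivation, and the small-$\lambda$ behavior of the infrared-sensitive integral, which the text replaces by $\frac{1}{2}(1 + \ln(\lambda^2/m^2))$. This is a sketch only; the value `eps` standing for $\lambda^2/m^2$ is a hypothetical sample, and the substitution $1 - x = e^{-s}$ is a choice made here to resolve the narrow peak near $x = 1$.

```python
import math


def midpoint(f, a, b, n=200_000):
    """Midpoint rule on [a, b]; endpoints are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Finite part: ∫_0^1 x ln(x / (1 - x)²) dx
finite_integral = midpoint(lambda x: x * math.log(x / (1.0 - x) ** 2), 0.0, 1.0)


def infrared_integral(eps, s_max=60.0, n=60_000):
    """∫_0^1 (2 - x)(x² - x) / ((1 - x)² + x*eps) dx, computed with 1 - x = exp(-s)."""
    def g(s):
        u = math.exp(-s)          # u = 1 - x
        x = 1.0 - u
        # dx = u ds, and s runs from 0 (x = 0) to infinity (x = 1)
        return (2.0 - x) * (x * x - x) / (u * u + x * eps) * u
    return midpoint(g, 0.0, s_max, n)


eps = 1e-8  # hypothetical sample value of λ²/m²
ir_numeric = infrared_integral(eps)
ir_asymptotic = 0.5 * (1.0 + math.log(eps))
print(finite_integral, ir_numeric, ir_asymptotic)
```

The agreement of `ir_numeric` with `ir_asymptotic` is only expected up to corrections that vanish as `eps` goes to zero.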
Then the residual term

$$R(\slashed{p}) = J_{ad}(\slashed{p}) - C_0\delta_{ad} - C_1(\slashed{p} - mc^2)_{ad}$$

is ultraviolet-finite, because all $\Lambda$-dependent terms there cancel out

$$\frac{\pi^2}{ic^3}\ln\Lambda^2\int_0^1 dx\,(4mc^2 - 2\slashed{p}x) - \frac{2\pi^2mc^2}{ic^3}\ln\Lambda^2\int_0^1 dx\,(2-x) + \frac{2\pi^2}{ic^3}(\slashed{p} - mc^2)\ln\Lambda^2\int_0^1 x\,dx = 0$$

We see that $C_0$ is ultraviolet-divergent, while $C_1$ has both ultraviolet and infrared divergences. It can be said that $J_{bc}(\slashed{p})$, as a function of $\slashed{p}$, is infinite at the point $\slashed{p} = mc^2$ and has an infinite first derivative at this point. However, the 2nd and higher derivatives are all finite.
M.5 Integral for the vertex renormalization

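Let us calculate the integral in square brackets in equation (10.40)⁹

$$\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= \int d^4h\,\gamma_\mu\frac{-\slashed{h} + \slashed{q} + mc^2}{(\tilde{h}-\tilde{q})^2 - m^2c^4}\gamma_\kappa\frac{-\slashed{h} + \slashed{q}' + mc^2}{(\tilde{h}-\tilde{q}')^2 - m^2c^4}\gamma_\mu\frac{1}{\tilde{h}^2}\\
&\approx -\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int d^4h\,\frac{\gamma_\mu(-\slashed{h} + \slashed{q} + mc^2)\gamma_\kappa(-\slashed{h} + \slashed{q}' + mc^2)\gamma_\mu}{(\tilde{h}^2 - 2\tilde{q}\cdot\tilde{h})(\tilde{h}^2 - 2\tilde{q}'\cdot\tilde{h})(\tilde{h}^2 - L)^2}
\end{aligned}$$

⁹ We used equation (M.23) and took into account that $\tilde{q}$ and $\tilde{q}'$ are on the mass shell, so that $\tilde{q}^2 = (\tilde{q}')^2 = m^2c^4$.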
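The numerator can be rewritten as

$$\begin{aligned}
\gamma_\mu(-\slashed{h} + \slashed{q} + mc^2)\gamma_\kappa(-\slashed{h} + \slashed{q}' + mc^2)\gamma_\mu
&= \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu - \gamma_\mu\slashed{h}\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu\\
&\quad - \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa\slashed{h}\gamma_\mu + \gamma_\mu\slashed{h}\gamma_\kappa\slashed{h}\gamma_\mu
\end{aligned}$$

Then the desired integral is

$$\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu J - \gamma_\mu\gamma_\sigma\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu J_\sigma\\
&\quad - \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa\gamma_\sigma\gamma_\mu J_\sigma + \gamma_\mu\gamma_\sigma\gamma_\kappa\gamma_\tau\gamma_\mu J_{\sigma\tau}
\end{aligned} \tag{M.32}$$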
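where¹⁰

$$J = -\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int\frac{d^4h}{[\tilde{h}^2 - 2\tilde{h}\cdot\tilde{q}_y]^2[\tilde{h}^2 - L]^2} \tag{M.33}$$

$$J_\sigma = -\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int\frac{d^4h\,h_\sigma}{[\tilde{h}^2 - 2\tilde{h}\cdot\tilde{q}_y]^2[\tilde{h}^2 - L]^2} \tag{M.34}$$

$$J_{\sigma\tau} = -\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int\frac{d^4h\,h_\sigma h_\tau}{[\tilde{h}^2 - 2\tilde{h}\cdot\tilde{q}_y]^2[\tilde{h}^2 - L]^2} \tag{M.35}$$

¹⁰ The denominators were combined using (M.3) and $\tilde{q}_y \equiv y\tilde{q} + (1-y)\tilde{q}'$.

These are particular cases of integrals (M.19) - (M.21) with parameters $\tilde{p}_1 = \tilde{q}_y$, $\Delta_1 = 0$, $\tilde{p}_2 = 0$, $\Delta_2 = L$, $\tilde{p}_x = x\tilde{q}_y$, and $\Delta_x = (1-x)L$

$$J = -\frac{\pi^2}{ic^3}\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{x(1-x)\,dx}{(x^2\tilde{q}_y^2 + (1-x)L)^2} \tag{M.36}$$

$$J_\sigma = -\frac{\pi^2}{ic^3}\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{q_{y\sigma}\,x^2(1-x)\,dx}{(x^2\tilde{q}_y^2 + (1-x)L)^2} \tag{M.37}$$

$$J_{\sigma\tau} = -\frac{\pi^2}{ic^3}\int_0^1 dy\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1\frac{\left[x^2q_{y\sigma}q_{y\tau} - \tfrac{1}{2}\delta_{\sigma\tau}(x^2\tilde{q}_y^2 + (1-x)L)\right]x(1-x)\,dx}{(x^2\tilde{q}_y^2 + (1-x)L)^2} \tag{M.38}$$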
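In the limit $\Lambda \to \infty$ we obtain for (M.36)

$$\begin{aligned}
J &= \frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\,\frac{x}{x^2\tilde{q}_y^2 + (1-x)L}\bigg|_{L=\lambda^2c^4}^{L=\infty} = -\frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\,\frac{x}{x^2\tilde{q}_y^2 + (1-x)\lambda^2c^4}\\
&\approx -\frac{\pi^2}{2ic^3}\int_0^1\frac{dy}{\tilde{q}_y^2}\ln\left(-x^2\tilde{q}_y^2 - (1-x)\lambda^2c^4\right)\Big|_{x=0}^{x=1} = -\frac{\pi^2}{2ic^3}\int_0^1\frac{dy}{\tilde{q}_y^2}\left[\ln(-\tilde{q}_y^2) - \ln(-\lambda^2c^4)\right]\\
&= -\frac{\pi^2}{2ic^3}\int_0^1 dy\,\frac{1}{\tilde{q}_y^2}\ln\frac{\tilde{q}_y^2}{\lambda^2c^4}
\end{aligned} \tag{M.39}$$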
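To proceed further with this integral we introduce the 4-vector of the transferred momentum

$$\tilde{k} \equiv \tilde{q}' - \tilde{q} \tag{M.40}$$

Then from $(\tilde{q}')^2 = (\tilde{q} + \tilde{k})^2$ and $(\tilde{q}')^2 = \tilde{q}^2 = m^2c^4$ it follows that

$$\begin{aligned}
2\tilde{q}\cdot\tilde{k} &= -\tilde{k}^2\\
\tilde{q}_y &= \tilde{q} + (1-y)\tilde{k}\\
\tilde{q}_y^2 &= m^2c^4 - (1-y)y\tilde{k}^2
\end{aligned}$$

Instead of $\tilde{k}^2$ and the integration variable $y$ it is convenient to introduce two new variables $\theta$ and $\alpha$, such that¹¹

$$\tilde{k}^2 \equiv 4m^2c^4\sin^2\theta \tag{M.41}$$

$$\begin{aligned}
y &= \frac{1}{2}\left(1 + \frac{\tan\alpha}{\tan\theta}\right)\\
1 - y &= \frac{1}{2}\left(1 - \frac{\tan\alpha}{\tan\theta}\right)\\
\tilde{q}_y^2 &= m^2c^4 - 4m^2c^4\sin^2\theta\cdot\frac{1}{2}\left(1 + \frac{\tan\alpha}{\tan\theta}\right)\cdot\frac{1}{2}\left(1 - \frac{\tan\alpha}{\tan\theta}\right)\\
&= m^2c^4 - m^2c^4\cos^2\theta(\tan^2\theta - \tan^2\alpha) = m^2c^4\frac{\cos^2\theta}{\cos^2\alpha}\\
dy &= \frac{d\alpha}{2\tan\theta}\frac{d}{d\alpha}\frac{\sin\alpha}{\cos\alpha} = \frac{d\alpha}{2\tan\theta\cos^2\alpha}\\
\frac{dy}{\tilde{q}_y^2} &= \frac{d\alpha}{2m^2c^4\cos^2\theta\tan\theta} = \frac{d\alpha}{m^2c^4\sin(2\theta)}
\end{aligned}$$

¹¹ Note that, by definition, $0 \le \tilde{k}^2 \le 4m^2c^4$.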
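The change of variables from $y$ to $\alpha$ can be spot-checked numerically. The sketch below works in units where $m^2c^4 = 1$; the sample angles and the function names `q_y_squared` and `y_of_alpha` are hypothetical choices made for illustration.

```python
import math


def q_y_squared(y, theta, m2c4=1.0):
    """q̃_y² = m²c⁴ - (1 - y) y k̃², with k̃² = 4 m²c⁴ sin²θ."""
    k2 = 4.0 * m2c4 * math.sin(theta) ** 2
    return m2c4 - (1.0 - y) * y * k2


def y_of_alpha(alpha, theta):
    """y = (1 + tan α / tan θ) / 2."""
    return 0.5 * (1.0 + math.tan(alpha) / math.tan(theta))


theta = 0.7   # hypothetical sample angle, 0 < θ < π/2
alpha = 0.3   # sample value with |α| <= θ

# Check q̃_y² = m²c⁴ cos²θ / cos²α
lhs = q_y_squared(y_of_alpha(alpha, theta), theta)
rhs = math.cos(theta) ** 2 / math.cos(alpha) ** 2

# Check the measure: (dy/dα) / q̃_y² = 1 / (m²c⁴ sin 2θ), via a central difference
h = 1e-6
dy_dalpha = (y_of_alpha(alpha + h, theta) - y_of_alpha(alpha - h, theta)) / (2 * h)
measure = dy_dalpha / lhs
expected_measure = 1.0 / math.sin(2 * theta)
print(lhs, rhs, measure, expected_measure)
```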
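Integral $J$ is infrared-divergent¹²

$$\begin{aligned}
J &= -\frac{\pi^2}{2ic^3}\int_{-\theta}^{\theta}\frac{d\alpha}{m^2c^4\sin(2\theta)}\ln\frac{m^2\cos^2\theta}{\lambda^2\cos^2\alpha}\\
&= -\frac{2\pi^2\theta}{ic^3m^2c^4\sin(2\theta)}\ln\frac{m}{\lambda} - \frac{\pi^2}{2ic^3m^2c^4\sin(2\theta)}\int_{-\theta}^{\theta}d\alpha\,\ln\frac{\cos^2\theta}{\cos^2\alpha}\\
&= -\frac{2\pi^2\theta}{ic^3m^2c^4\sin(2\theta)}\ln\frac{m}{\lambda} - \frac{2\pi^2}{ic^3m^2c^4\sin(2\theta)}\int_0^{\theta}d\alpha\left[\ln(\cos\theta) - \ln(\cos\alpha)\right]\\
&= -\frac{2\pi^2\theta}{ic^3m^2c^4\sin(2\theta)}\ln\frac{m}{\lambda} - \frac{2\pi^2\theta\ln(\cos\theta)}{ic^3m^2c^4\sin(2\theta)} + \frac{2\pi^2}{ic^3m^2c^4\sin(2\theta)}\int_0^{\theta}d\alpha\,\ln(\cos\alpha)\\
&= \frac{2A}{\theta}\left(\theta\ln\frac{m}{\lambda} - \int_0^{\theta}\alpha\tan\alpha\,d\alpha\right)
\end{aligned} \tag{M.42}$$

where we defined

$$A \equiv -\frac{\pi^2\theta}{ic^3m^2c^4\sin(2\theta)}$$

¹² Here we took the following integral by parts: $\int_0^{\theta}d\alpha\,\ln(\cos\alpha) = \theta\ln(\cos\theta) + \int_0^{\theta}\alpha\tan\alpha\,d\alpha$.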
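The integration by parts used for $\int_0^\theta \ln(\cos\alpha)\,d\alpha$ can be confirmed numerically. A minimal sketch, with a hypothetical sample angle and a `simpson` helper introduced here:

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


theta = 1.1  # hypothetical sample angle below π/2

# Left side: ∫_0^θ ln(cos α) dα
lhs = simpson(lambda a: math.log(math.cos(a)), 0.0, theta)

# Right side: θ ln(cos θ) + ∫_0^θ α tan α dα
rhs = theta * math.log(math.cos(theta)) + simpson(lambda a: a * math.tan(a), 0.0, theta)
print(lhs, rhs)
```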
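Next we calculate (M.37) using variables $\theta$ and $\alpha$ introduced above. Taking the limits $\lambda \to 0$, $\Lambda \to \infty$ we obtain a finite result (both infrared and ultraviolet divergences are absent)

$$\begin{aligned}
J_\sigma &= \frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\,\frac{x^2q_{y\sigma}}{x^2\tilde{q}_y^2 + (1-x)L}\bigg|_{L=\lambda^2c^4}^{L=\Lambda^2c^4} \approx -\frac{\pi^2}{ic^3}\int_0^1 dy\,\frac{q_{y\sigma}}{\tilde{q}_y^2}\\
&= -\frac{\pi^2}{ic^3}\int_{-\theta}^{\theta}\frac{d\alpha}{m^2c^4\sin(2\theta)}\left[q_\sigma + \frac{k_\sigma}{2}\left(1 - \frac{\tan\alpha}{\tan\theta}\right)\right]\\
&= -\frac{\pi^2}{ic^3m^2c^4\sin(2\theta)}\,2\theta\left(q_\sigma + \frac{k_\sigma}{2}\right) + \frac{\pi^2k_\sigma}{2ic^3m^2c^4\sin(2\theta)\tan\theta}\int_{-\theta}^{\theta}d\alpha\,\tan\alpha\\
&= A(q_\sigma + q'_\sigma)
\end{aligned} \tag{M.43}$$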
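Next we need to calculate (M.38)¹³

$$\begin{aligned}
J_{\sigma\tau} &= -\frac{\pi^2}{ic^3}\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1 dx\int_0^1 dy\,\frac{x^3(1-x)q_{y\sigma}q_{y\tau}}{(x^2\tilde{q}_y^2 + (1-x)L)^2} + \frac{\pi^2}{2ic^3}\int_{\lambda^2c^4}^{\Lambda^2c^4}dL\int_0^1 dx\int_0^1 dy\,\frac{\delta_{\sigma\tau}\,x(1-x)}{x^2\tilde{q}_y^2 + (1-x)L}\\
&\approx \frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\left[\frac{x^3q_{y\sigma}q_{y\tau}}{x^2\tilde{q}_y^2 + (1-x)\Lambda^2c^4} - \frac{xq_{y\sigma}q_{y\tau}}{\tilde{q}_y^2}\right] + \frac{\pi^2}{2ic^3}\int_0^1 dx\int_0^1 dy\,\delta_{\sigma\tau}\,x\left[\ln((1-x)\Lambda^2c^4) - \ln(x^2\tilde{q}_y^2)\right]\\
&\approx -\frac{\pi^2}{2ic^3}\int_0^1 dy\,\frac{q_{y\sigma}q_{y\tau}}{\tilde{q}_y^2} + \frac{\pi^2\delta_{\sigma\tau}}{2ic^3}\int_0^1 dx\int_0^1 dy\,x\ln\frac{(1-x)\Lambda^2c^4}{x^2\tilde{q}_y^2}\\
&= -\frac{\pi^2}{2ic^3}\int_0^1 dy\,\frac{q_{y\sigma}q_{y\tau}}{\tilde{q}_y^2} + \frac{\pi^2\delta_{\sigma\tau}}{2ic^3}\int_0^1 dx\,x\ln\frac{1-x}{x^2} + \frac{\pi^2\delta_{\sigma\tau}}{4ic^3}\int_0^1 dy\,\ln\frac{\Lambda^2c^4}{\tilde{q}_y^2}\\
&= -\frac{\pi^2}{2ic^3}\int_0^1 dy\,\frac{q_{y\sigma}q_{y\tau}}{\tilde{q}_y^2} + \frac{\pi^2\delta_{\sigma\tau}}{4ic^3}\int_0^1 dy\,\ln\frac{\Lambda^2c^4}{\tilde{q}_y^2} - \frac{\pi^2\delta_{\sigma\tau}}{8ic^3}
\end{aligned}$$

¹³ Here we assumed that $\Lambda^2c^4 \gg \tilde{q}_y^2 \gg \lambda^2c^4$ and used the integral $\int_0^1 dx\,x\ln\left((1-x)/x^2\right) = \left[(x^2/2)\ln\left((1-x)/x^2\right) + x^2/4 - x/2 - \tfrac{1}{2}\ln(1-x)\right]_0^1 = -1/4$.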
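Integrations over $y$ are performed using variables $\theta$ and $\alpha$¹⁴

$$\begin{aligned}
\int_0^1 dy\,\frac{q_{y\sigma}q_{y\tau}}{\tilde{q}_y^2} &= \int_{-\theta}^{\theta}\left(q_\sigma + \frac{1}{2}k_\sigma - \frac{k_\sigma\tan\alpha}{2\tan\theta}\right)\left(q_\tau + \frac{1}{2}k_\tau - \frac{k_\tau\tan\alpha}{2\tan\theta}\right)\frac{d\alpha}{m^2c^4\sin(2\theta)}\\
&= \left(q_\sigma + \frac{1}{2}k_\sigma\right)\left(q_\tau + \frac{1}{2}k_\tau\right)\int_{-\theta}^{\theta}\frac{d\alpha}{m^2c^4\sin(2\theta)} + \frac{k_\sigma k_\tau}{4\tan^2\theta}\int_{-\theta}^{\theta}\frac{\tan^2\alpha\,d\alpha}{m^2c^4\sin(2\theta)}\\
&= \frac{\theta}{2m^2c^4\sin(2\theta)}(q_\sigma + q'_\sigma)(q_\tau + q'_\tau) + \frac{k_\sigma k_\tau\cos\theta}{4m^2c^4\sin^3\theta}\int_0^{\theta}\tan^2\alpha\,d\alpha\\
&= \frac{\theta}{2m^2c^4\sin(2\theta)}(q_\sigma + q'_\sigma)(q_\tau + q'_\tau) + \frac{k_\sigma k_\tau}{\tilde{k}^2}(1 - \theta\cot\theta)
\end{aligned}$$

$$\begin{aligned}
\int_0^1 dy\,\ln\frac{\Lambda^2c^4}{\tilde{q}_y^2} &= \int_0^{\theta}\frac{d\alpha}{\tan\theta\cos^2\alpha}\ln\frac{\Lambda^2\cos^2\alpha}{m^2\cos^2\theta}\\
&= \ln\frac{\Lambda^2}{m^2\cos^2\theta} + \frac{1}{\tan\theta}\left(-2\theta + \ln(\cos^2\theta)\tan\theta + 2\tan\theta\right)\\
&= 2\ln\frac{\Lambda}{m} + 2(1 - \theta\cot\theta)
\end{aligned}$$

¹⁴ Here we used the integrals $\int\tan^2(x)\,dx = \tan(x) - x + C$, $\int\cos^{-2}(x)\,dx = \tan(x) + C$ and $\int\ln(\cos^2(x))/\cos^2(x)\,dx = -2x + 2\tan(x) + \tan(x)\ln(\cos^2(x)) + C$.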
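The auxiliary integrals quoted in the footnotes to the $J_{\sigma\tau}$ calculation lend themselves to a quick numerical check. The following sketch (helper names and sample point are choices made here) evaluates $\int_0^1 x\ln((1-x)/x^2)\,dx$ with a midpoint rule and compares the derivative of the quoted antiderivative of $\ln(\cos^2 x)/\cos^2 x$ with the integrand.

```python
import math


def midpoint(f, a, b, n=200_000):
    """Midpoint rule on [a, b]; endpoints are never evaluated."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# ∫_0^1 x ln((1 - x)/x²) dx
x_log_integral = midpoint(lambda x: x * math.log((1.0 - x) / (x * x)), 0.0, 1.0)


def F(x):
    """Quoted antiderivative of ln(cos² x)/cos² x."""
    return -2.0 * x + 2.0 * math.tan(x) + math.tan(x) * math.log(math.cos(x) ** 2)


x0, h = 0.8, 1e-5  # hypothetical sample point and step for a central difference
numeric_derivative = (F(x0 + h) - F(x0 - h)) / (2 * h)
integrand_value = math.log(math.cos(x0) ** 2) / math.cos(x0) ** 2
print(x_log_integral, numeric_derivative, integrand_value)
```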
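Then we see that integral $J_{\sigma\tau}$ is ultraviolet-divergent

$$J_{\sigma\tau} = \frac{A}{4}(q_\sigma + q'_\sigma)(q_\tau + q'_\tau) + Dk_\sigma k_\tau + E\delta_{\sigma\tau} \tag{M.44}$$

$$D \equiv -\frac{\pi^2(1 - \theta\cot\theta)}{2ic^3\tilde{k}^2}$$

$$E \equiv \frac{\pi^2}{2ic^3}\left(\ln\frac{\Lambda}{m} + \frac{3}{4} - \theta\cot\theta\right)$$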
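Using results (M.42), (M.43), (M.44), we obtain for the full integral (M.32)

$$\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= J\gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu\\
&\quad - A\gamma_\mu(\slashed{q} + \slashed{q}')\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu - A\gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q} + \slashed{q}')\gamma_\mu\\
&\quad + \frac{A}{4}\gamma_\mu(\slashed{q} + \slashed{q}')\gamma_\kappa(\slashed{q} + \slashed{q}')\gamma_\mu + D\gamma_\mu\slashed{k}\gamma_\kappa\slashed{k}\gamma_\mu + 4E\gamma_\kappa\\
&= JT_1 - AT_2 - AT_3 + \frac{A}{4}T_4 + D\gamma_\mu\slashed{k}\gamma_\kappa\slashed{k}\gamma_\mu + 4E\gamma_\kappa
\end{aligned} \tag{M.45}$$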
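Let us now use (J.11) - (J.13) and process these terms one-by-one

$$\begin{aligned}
T_1 &= \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu\\
&= \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}'\gamma_\mu + mc^2\gamma_\mu\gamma_\kappa\slashed{q}'\gamma_\mu + mc^2\gamma_\mu\slashed{q}\gamma_\kappa\gamma_\mu + m^2c^4\gamma_\mu\gamma_\kappa\gamma_\mu\\
&= -2\slashed{q}'\gamma_\kappa\slashed{q} + 2mc^2\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}'\gamma_\kappa + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q} - 2m^2c^4\gamma_\kappa\\
&= -2(\slashed{q} + \slashed{k})\gamma_\kappa(\slashed{q}' - \slashed{k}) + 2mc^2\gamma_\kappa\slashed{q}' + 2mc^2(\slashed{q} + \slashed{k})\gamma_\kappa + 2mc^2\slashed{q}\gamma_\kappa\\
&\quad + 2mc^2\gamma_\kappa(\slashed{q}' - \slashed{k}) - 2m^2c^4\gamma_\kappa\\
&= -2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{k}\gamma_\kappa\slashed{q}' + 2\slashed{q}\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} + 2mc^2\gamma_\kappa\slashed{q}'\\
&\quad + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q}' - 2mc^2\gamma_\kappa\slashed{k} - 2m^2c^4\gamma_\kappa
\end{aligned}$$

According to (10.40), integral $I^\kappa(\tilde{q},\tilde{q}')$ is multiplied by $u(\mathbf{q},\sigma)$ from the left and by $u(\mathbf{q}',\sigma')$ from the right. Then, due to (J.82) - (J.83), in the above summands the factor $\slashed{q}$ standing on the left and the factor $\slashed{q}'$ standing on the right can both be changed to $mc^2$

$$\begin{aligned}
T_1 &= -2m^2c^4\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} + 2m^2c^4\gamma_\kappa\\
&\quad + 2m^2c^4\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa + 2m^2c^4\gamma_\kappa + 2m^2c^4\gamma_\kappa - 2mc^2\gamma_\kappa\slashed{k} - 2m^2c^4\gamma_\kappa\\
&= 2\slashed{k}\gamma_\kappa\slashed{k} + 4m^2c^4\gamma_\kappa
\end{aligned}$$

It follows from (J.4) and (J.23) that

$$\begin{aligned}
\slashed{k}\gamma_\kappa\slashed{k} &= \gamma_\mu\gamma_\kappa\gamma_\nu k^\mu k^\nu = -\gamma_\kappa\gamma_\mu\gamma_\nu k^\mu k^\nu + 2\gamma_\nu g_{\mu\kappa}k^\mu k^\nu\\
&= -\gamma_\kappa\slashed{k}^2 + 2\slashed{k}k_\kappa = -\gamma_\kappa\tilde{k}^2 + 2(\slashed{q}' - \slashed{q})k_\kappa
\end{aligned}$$

The last term vanishes when sandwiched between $u(\mathbf{q},\sigma)$ and $u(\mathbf{q}',\sigma')$. So, we can set $\slashed{k}\gamma_\kappa\slashed{k} = -\tilde{k}^2\gamma_\kappa$. Then

$$T_1 = (-2\tilde{k}^2 + 4m^2c^4)\gamma_\kappa$$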
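We use the same techniques to obtain the 2nd, 3rd, and 4th terms in (M.45)

$$\begin{aligned}
T_2 &= \gamma_\mu(\slashed{q} + \slashed{q}')\gamma_\kappa(\slashed{q}' + mc^2)\gamma_\mu\\
&= \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}'\gamma_\mu + \gamma_\mu\slashed{q}'\gamma_\kappa\slashed{q}'\gamma_\mu + mc^2\gamma_\mu\slashed{q}\gamma_\kappa\gamma_\mu + mc^2\gamma_\mu\slashed{q}'\gamma_\kappa\gamma_\mu\\
&= -2\slashed{q}'\gamma_\kappa\slashed{q} - 2\slashed{q}'\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q} + 2mc^2\slashed{q}'\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q}'\\
&= -2(\slashed{q} + \slashed{k})\gamma_\kappa(\slashed{q}' - \slashed{k}) - 2(\slashed{q} + \slashed{k})\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\gamma_\kappa(\slashed{q}' - \slashed{k})\\
&\quad + 2mc^2(\slashed{q} + \slashed{k})\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q}'\\
&= -2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{k}\gamma_\kappa\slashed{q}' + 2\slashed{q}\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} - 2\slashed{q}\gamma_\kappa\slashed{q}'\\
&\quad - 2\slashed{k}\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q}' - 2mc^2\gamma_\kappa\slashed{k}\\
&\quad + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{q}'\\
&= -2m^2c^4\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} - 2m^2c^4\gamma_\kappa\\
&\quad - 2mc^2\slashed{k}\gamma_\kappa + 2m^2c^4\gamma_\kappa + 2m^2c^4\gamma_\kappa - 2mc^2\gamma_\kappa\slashed{k}\\
&\quad + 2m^2c^4\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa + 2m^2c^4\gamma_\kappa\\
&= 2\slashed{k}\gamma_\kappa\slashed{k} - 2mc^2\slashed{k}\gamma_\kappa + 4m^2c^4\gamma_\kappa\\
&= (-2\tilde{k}^2 + 4m^2c^4)\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa
\end{aligned}$$

$$\begin{aligned}
T_3 &= \gamma_\mu(\slashed{q} + mc^2)\gamma_\kappa(\slashed{q} + \slashed{q}')\gamma_\mu\\
&= \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}\gamma_\mu + \gamma_\mu mc^2\gamma_\kappa\slashed{q}\gamma_\mu + \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}'\gamma_\mu + \gamma_\mu mc^2\gamma_\kappa\slashed{q}'\gamma_\mu\\
&= -2\slashed{q}\gamma_\kappa\slashed{q} + 2mc^2\gamma_\kappa\slashed{q} + 2mc^2\slashed{q}\gamma_\kappa - 2\slashed{q}'\gamma_\kappa\slashed{q} + 2mc^2\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}'\gamma_\kappa\\
&= -2\slashed{q}\gamma_\kappa(\slashed{q}' - \slashed{k}) + 2mc^2\gamma_\kappa(\slashed{q}' - \slashed{k}) + 2mc^2\slashed{q}\gamma_\kappa - 2(\slashed{q} + \slashed{k})\gamma_\kappa(\slashed{q}' - \slashed{k})\\
&\quad + 2mc^2\gamma_\kappa\slashed{q}' + 2mc^2(\slashed{q} + \slashed{k})\gamma_\kappa\\
&= -2\slashed{q}\gamma_\kappa\slashed{q}' + 2\slashed{q}\gamma_\kappa\slashed{k} + 2mc^2\gamma_\kappa\slashed{q}' - 2mc^2\gamma_\kappa\slashed{k} + 2mc^2\slashed{q}\gamma_\kappa\\
&\quad - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{k}\gamma_\kappa\slashed{q}' + 2\slashed{q}\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k}\\
&\quad + 2mc^2\gamma_\kappa\slashed{q}' + 2mc^2\slashed{q}\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa\\
&= -2m^2c^4\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k} + 2m^2c^4\gamma_\kappa - 2mc^2\gamma_\kappa\slashed{k} + 2m^2c^4\gamma_\kappa\\
&\quad - 2m^2c^4\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k}\\
&\quad + 2m^2c^4\gamma_\kappa + 2m^2c^4\gamma_\kappa + 2mc^2\slashed{k}\gamma_\kappa\\
&= (4m^2c^4 - 2\tilde{k}^2)\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k}
\end{aligned}$$

$$\begin{aligned}
T_4 &= \gamma_\mu(\slashed{q} + \slashed{q}')\gamma_\kappa(\slashed{q} + \slashed{q}')\gamma_\mu\\
&= \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}\gamma_\mu + \gamma_\mu\slashed{q}'\gamma_\kappa\slashed{q}\gamma_\mu + \gamma_\mu\slashed{q}\gamma_\kappa\slashed{q}'\gamma_\mu + \gamma_\mu\slashed{q}'\gamma_\kappa\slashed{q}'\gamma_\mu\\
&= -2\slashed{q}\gamma_\kappa\slashed{q} - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{q}'\gamma_\kappa\slashed{q} - 2\slashed{q}'\gamma_\kappa\slashed{q}'\\
&= -2\slashed{q}\gamma_\kappa(\slashed{q}' - \slashed{k}) - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2(\slashed{q} + \slashed{k})\gamma_\kappa(\slashed{q}' - \slashed{k}) - 2(\slashed{q} + \slashed{k})\gamma_\kappa\slashed{q}'\\
&= -2\slashed{q}\gamma_\kappa\slashed{q}' + 2\slashed{q}\gamma_\kappa\slashed{k} - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{k}\gamma_\kappa\slashed{q}'\\
&\quad + 2\slashed{q}\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} - 2\slashed{q}\gamma_\kappa\slashed{q}' - 2\slashed{k}\gamma_\kappa\slashed{q}'\\
&= -2m^2c^4\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k} - 2m^2c^4\gamma_\kappa - 2m^2c^4\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa\\
&\quad + 2mc^2\gamma_\kappa\slashed{k} + 2\slashed{k}\gamma_\kappa\slashed{k} - 2m^2c^4\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa\\
&= -8m^2c^4\gamma_\kappa + 4mc^2\gamma_\kappa\slashed{k} - 4mc^2\slashed{k}\gamma_\kappa - 2\tilde{k}^2\gamma_\kappa
\end{aligned}$$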
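The contraction rules applied repeatedly in $T_1$ through $T_4$ have the pattern $\gamma_\mu\slashed{a}\gamma_\mu = -2\slashed{a}$, $\gamma_\mu\slashed{a}\slashed{b}\gamma_\mu = 2(\slashed{a}\slashed{b} + \slashed{b}\slashed{a})$ and $\gamma_\mu\slashed{a}\slashed{b}\slashed{c}\gamma_\mu = -2\slashed{c}\slashed{b}\slashed{a}$. The sketch below checks these three identities numerically. It is an illustration under assumptions that are not taken from the text: the standard Dirac representation and the metric signature (+, -, -, -) are used, whereas the book fixes its own conventions in Appendix J; all helper names and sample vectors are hypothetical.

```python
# Pure-Python 4x4 complex matrices; no external libraries are needed.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(4)] for i in range(4)]

def scale(s, a):
    return [[s * v for v in row] for row in a]

# Gamma matrices in the Dirac representation (assumed convention)
G = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
METRIC = [1, -1, -1, -1]  # assumed signature (+, -, -, -)

def slash(v):
    """Contract a 4-vector with the gamma matrices using the metric above."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = add(out, scale(METRIC[mu] * v[mu], G[mu]))
    return out

def contract(m):
    """Sum over mu of metric[mu] * gamma^mu M gamma^mu."""
    out = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        out = add(out, scale(METRIC[mu], matmul(matmul(G[mu], m), G[mu])))
    return out

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Hypothetical sample 4-vectors
a, b, c = (1.0, 0.2, -0.3, 0.5), (0.7, -1.1, 0.4, 0.0), (2.0, 0.3, 0.9, -0.6)
A, B, C = slash(a), slash(b), slash(c)

err_one = max_diff(contract(A), scale(-2, A))
err_two = max_diff(contract(matmul(A, B)), scale(2, add(matmul(A, B), matmul(B, A))))
err_three = max_diff(contract(matmul(matmul(A, B), C)), scale(-2, matmul(matmul(C, B), A)))
print(err_one, err_two, err_three)
```

All three printed deviations should vanish up to rounding, independently of the representation chosen, since only the anticommutation relations of the gamma matrices enter.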
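Putting all terms together we obtain¹⁵

$$\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= J(-2\tilde{k}^2 + 4m^2c^4)\gamma_\kappa\\
&\quad - A\left((-2\tilde{k}^2 + 4m^2c^4)\gamma_\kappa - 2mc^2\slashed{k}\gamma_\kappa\right) - A\left((4m^2c^4 - 2\tilde{k}^2)\gamma_\kappa + 2mc^2\gamma_\kappa\slashed{k}\right)\\
&\quad + \frac{A}{4}\left(-8m^2c^4\gamma_\kappa + 4mc^2\gamma_\kappa\slashed{k} - 4mc^2\slashed{k}\gamma_\kappa - 2\tilde{k}^2\gamma_\kappa\right) + 2D\tilde{k}^2\gamma_\kappa + 4E\gamma_\kappa\\
&= \left(-2J\tilde{k}^2 + 2D\tilde{k}^2 + 4Jm^2c^4 + 4E - 10Am^2c^4 + \frac{7}{2}A\tilde{k}^2\right)\gamma_\kappa - Amc^2(\gamma_\kappa\slashed{k} - \slashed{k}\gamma_\kappa)\\
&= \left(-2J\tilde{k}^2 + 2D\tilde{k}^2 + 4Jm^2c^4 + 4E - 14Am^2c^4 + \frac{7A\tilde{k}^2}{2}\right)\gamma_\kappa + 2Amc^2(q_\kappa + q'_\kappa)
\end{aligned}$$

¹⁵ Here we used (J.86) to write $\gamma_\kappa\slashed{k} - \slashed{k}\gamma_\kappa = 4mc^2\gamma_\kappa - 2(q_\kappa + q'_\kappa)$.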
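The coefficient in front of $\gamma_\kappa$ is

$$\begin{aligned}
&-\frac{2\pi^2(-2\tilde{k}^2 + 4m^2c^4)}{ic^3m^2c^4\sin(2\theta)}\left(\theta\ln\frac{m}{\lambda} - \int_0^{\theta}\alpha\tan\alpha\,d\alpha\right) - \frac{\pi^2(1 - \theta\cot\theta)}{ic^3}\\
&\quad + \frac{2\pi^2}{ic^3}\left(\ln\frac{\Lambda}{m} + (1 - \theta\cot\theta) - \frac{1}{4}\right) + \frac{14\pi^2\theta}{ic^3\sin(2\theta)} - \frac{7\pi^2\tilde{k}^2\theta}{2ic^3m^2c^4\sin(2\theta)}\\
&= -\frac{2\pi^2(-8m^2c^4\sin^2\theta + 4m^2c^4)}{ic^3m^2c^4\sin(2\theta)}\left(\theta\ln\frac{m}{\lambda} - \int_0^{\theta}\alpha\tan\alpha\,d\alpha\right)\\
&\quad + \frac{\pi^2(1 - \theta\cot\theta)}{ic^3} + \frac{2\pi^2}{ic^3}\ln\frac{\Lambda}{m} - \frac{\pi^2}{2ic^3} + \frac{14\pi^2\theta}{ic^3\sin(2\theta)} - \frac{14\pi^2\theta m^2c^4\sin^2\theta}{ic^3m^2c^4\sin(2\theta)}\\
&= -\frac{8\pi^2}{ic^3\tan(2\theta)}\left(\theta\ln\frac{m}{\lambda} - \int_0^{\theta}\alpha\tan\alpha\,d\alpha\right)\\
&\quad + \frac{\pi^2(1 - \theta\cot\theta)}{ic^3} + \frac{2\pi^2}{ic^3}\ln\frac{\Lambda}{m} - \frac{\pi^2}{2ic^3} + \frac{7\pi^2\theta\cot\theta}{ic^3}
\end{aligned}$$

Therefore, finally

$$\begin{aligned}
I^\kappa(\tilde{q},\tilde{q}') &= \frac{\pi^2\gamma^\kappa}{ic^3}\left(-\frac{8\theta}{\tan(2\theta)}\ln\frac{m}{\lambda} + \frac{8}{\tan(2\theta)}\int_0^{\theta}\alpha\tan\alpha\,d\alpha + \frac{1}{2} + 6\theta\cot\theta + 2\ln\frac{\Lambda}{m}\right)\\
&\quad - \frac{2\pi^2\theta(q + q')^\kappa}{imc^5\sin(2\theta)}
\end{aligned} \tag{M.46}$$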
M.6 Integral for the ladder diagram

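For the integral (10.54)

$$b(p,q,k) = \int\frac{d^4h}{[\tilde{h}^2 + 2(\tilde{q}\cdot\tilde{h})][\tilde{h}^2 - 2(\tilde{p}\cdot\tilde{h})][\tilde{h}^2 - \lambda^2c^4][\tilde{h}^2 + 2(\tilde{h}\cdot\tilde{k}) + \tilde{k}^2 - \lambda^2c^4]}$$

we follow the calculation technique from [Red53]. First use equation (M.7) and notation

$$\begin{aligned}
a &= \tilde{h}^2 + 2(\tilde{k}\cdot\tilde{h}) + \tilde{k}^2 - \lambda^2c^4\\
b &= \tilde{h}^2 + 2(\tilde{q}\cdot\tilde{h})\\
c &= \tilde{h}^2 - 2(\tilde{p}\cdot\tilde{h})\\
d &= \tilde{h}^2 - \lambda^2c^4
\end{aligned}$$

to write

$$\begin{aligned}
b(p,q,k) &= 6\int d^4h\int_0^1 dx\int_0^1 dy\int_0^1 xz^2\,dz\,\Big[(\tilde{h}^2 + 2(\tilde{k}\cdot\tilde{h}) + \tilde{k}^2 - \lambda^2c^4)z(1-x) + (\tilde{h}^2 + 2(\tilde{q}\cdot\tilde{h}))xyz\\
&\qquad + (\tilde{h}^2 - 2(\tilde{p}\cdot\tilde{h}))xz(1-y) + (\tilde{h}^2 - \lambda^2c^4)(1-z)\Big]^{-4}\\
&= 6\int d^4h\int_0^1 dx\int_0^1 dy\int_0^1 xz^2\,dz\,\Big[\tilde{h}^2 - 2\tilde{h}\cdot(-\tilde{k}z(1-x) - \tilde{q}xyz + \tilde{p}xz(1-y))\\
&\qquad + \tilde{k}^2z(1-x) + \lambda^2c^4(zx - 1)\Big]^{-4}\\
&= 6\int d^4h\int_0^1 dx\int_0^1 dy\int_0^1\frac{xz^2\,dz}{[\tilde{h}^2 - 2(\tilde{h}\cdot\tilde{p}_x)z - \Delta]^4}
\end{aligned}$$

where

$$\begin{aligned}
\Delta &\equiv \lambda^2c^4(1 - zx) - \tilde{k}^2z(1-x)\\
\tilde{p}_x &= -\tilde{k}(1-x) + \tilde{p}x(1-y) - \tilde{q}xy = -\tilde{k}(1-x) + x\tilde{p}_y\\
\tilde{p}_y &= \tilde{p}(1-y) - \tilde{q}y
\end{aligned}$$

From (M.13) we obtain

$$b(p,q,k) = \frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\int_0^1\frac{xz^2\,dz}{(z^2\tilde{p}_x^2 + \Delta)^2}$$

We have $\tilde{q}' = \tilde{q} - \tilde{k}$ and $\tilde{p}' = \tilde{p} + \tilde{k}$. Taking squares of both sides of these equations and using $\tilde{q}^2 = (\tilde{q}')^2 = m^2c^4$ and $\tilde{p}^2 = (\tilde{p}')^2 = M^2c^4$ we obtain

$$(\tilde{q}\cdot\tilde{k}) = \tilde{k}^2/2 \tag{M.47}$$

$$(\tilde{p}\cdot\tilde{k}) = -\tilde{k}^2/2 \tag{M.48}$$

$$\begin{aligned}
(\tilde{k}\cdot\tilde{p}_y) &= (\tilde{k}\cdot\tilde{p})(1-y) - (\tilde{k}\cdot\tilde{q})y = -\frac{\tilde{k}^2}{2}(1-y) - \frac{\tilde{k}^2}{2}y = -\frac{\tilde{k}^2}{2}\\
\tilde{p}_x^2 &= (x\tilde{p}_y - \tilde{k}(1-x))^2 = x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)^2 - 2x(1-x)(\tilde{p}_y\cdot\tilde{k})\\
&= x^2\tilde{p}_y^2 + \tilde{k}^2 - 2\tilde{k}^2x + \tilde{k}^2x^2 + \tilde{k}^2x - \tilde{k}^2x^2 = x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)
\end{aligned}$$

$$b(p,q,k) = \frac{\pi^2}{ic^3}\int_0^1 dx\int_0^1 dy\int_0^1\frac{xz^2\,dz}{[z^2(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) + \lambda^2c^4(1 - zx) - \tilde{k}^2z(1-x)]^2}$$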
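The kinematic relations (M.47), (M.48) and the simplification of $\tilde{p}_x^2$ can be illustrated with explicit 4-vectors. The sketch below is hypothetical in all its numerical choices: it assumes the signature (+, -, -, -) for the 4-scalar product (the identity itself does not depend on this choice), and it uses a frame in which $\tilde{k}$ is purely spatial, so that $\tilde{q}' = \tilde{q} - \tilde{k}$ and $\tilde{p}' = \tilde{p} + \tilde{k}$ automatically stay on their mass shells.

```python
def dot(a, b):
    """4-scalar product with signature (+, -, -, -) (an assumed convention)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def comb(*terms):
    """Linear combination of 4-vectors given as (coefficient, vector) pairs."""
    return tuple(sum(coef * vec[i] for coef, vec in terms) for i in range(4))


kappa = 0.8                              # hypothetical momentum transfer
k = (0.0, 0.0, 0.0, kappa)
q = (1.3, 0.0, 0.0, 0.5 * kappa)         # q' = q - k only flips the z-component
p = (5.0, 0.0, 0.0, -0.5 * kappa)        # p' = p + k only flips the z-component

# Relations (M.47) and (M.48)
qk_residual = dot(q, k) - dot(k, k) / 2
pk_residual = dot(p, k) + dot(k, k) / 2

# p_y = p(1 - y) - q y and p_x = -k(1 - x) + x p_y for sample Feynman parameters
x, y = 0.35, 0.6
p_y = comb((1 - y, p), (-y, q))
p_x = comb((-(1 - x), k), (x, p_y))

lhs = dot(p_x, p_x)
rhs = x * x * dot(p_y, p_y) + dot(k, k) * (1 - x)
print(qk_residual, pk_residual, lhs, rhs)
```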
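Even though $\lambda$ is small, the term $\lambda^2c^4(1 - zx)$ cannot be neglected¹⁶ when $x \to 0$, $z \to 0$, when $x \to 1$, $z \to 0$ and when $x \to 0$, $z \to 1$. Therefore, we are going to break the region of integration over $x$ into three parts $0 < x < \epsilon$, $\epsilon < x < 1 - \delta$ and $1 - \delta < x < 1$, where $\epsilon$ and $\delta$ are small, but large enough, so that in the interval $\epsilon < x < 1 - \delta$ the term $\lambda^2c^4(1 - zx)$ can be neglected. Integrations over $x$ in these three regions split our integral into three parts

$$b(p,q,k) = L_I + L_{II} + L_{III}$$

¹⁶ because other terms in the denominator can be even smaller

In the second region we neglect the $\lambda$-term

$$L_{II} \approx \frac{\pi^2}{ic^3}\int_\epsilon^{1-\delta}dx\int_0^1 dy\int_0^1\frac{x\,dz}{[z(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) - \tilde{k}^2(1-x)]^2}$$

use table integrals

$$\int\frac{dz}{(az + b)^2} = -\frac{1}{a(az + b)} + \text{const}$$

$$\int\frac{dx}{x(1-x)} = \ln(x) - \ln(x - 1) + \text{const}$$

and obtain

$$\begin{aligned}
L_{II} &= -\frac{\pi^2}{ic^3}\int_\epsilon^{1-\delta}dx\int_0^1 dy\,\frac{x}{[x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)][z(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) - \tilde{k}^2(1-x)]}\bigg|_{z=0}^{z=1}\\
&= -\frac{\pi^2}{ic^3}\int_\epsilon^{1-\delta}dx\int_0^1 dy\,\frac{x}{x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)}\left(\frac{1}{x^2\tilde{p}_y^2} + \frac{1}{\tilde{k}^2(1-x)}\right)\\
&= -\frac{\pi^2}{ic^3\tilde{k}^2}\int_\epsilon^{1-\delta}dx\int_0^1 dy\,\frac{1}{x\tilde{p}_y^2(1-x)}\\
&= -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2}\left(\ln(x) - \ln(x-1)\right)\Big|_{x=\epsilon}^{x=1-\delta}\\
&\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2}\left(-\ln(\delta) - \ln(-1) - \ln(\epsilon) + \ln(-1)\right)\\
&= \frac{\pi^2\ln(\delta\epsilon)}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{\tilde{p}_y^2}
\end{aligned}$$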
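In the third integral we replace $x \to 1 - x$

$$\begin{aligned}
L_{III} &= -\frac{\pi^2}{ic^3}\int_\delta^0 dx\int_0^1 dy\int_0^1\frac{(1-x)z^2\,dz}{[z^2((1-x)^2\tilde{p}_y^2 + \tilde{k}^2x) + \lambda^2c^4(1 - z(1-x)) - \tilde{k}^2zx]^2}\\
&\approx -\frac{\pi^2}{ic^3}\int_\delta^0 dx\int_0^1 dy\int_0^1\frac{z^2\,dz}{[z^2\tilde{p}_y^2 + z^2\tilde{k}^2x + \lambda^2c^4(1-z) - \tilde{k}^2zx]^2}\\
&= \frac{\pi^2}{ic^3}\int_0^1 dy\int_0^1\frac{z\,dz}{\tilde{k}^2(z-1)}\left(\frac{1}{z^2\tilde{p}_y^2 + z^2\tilde{k}^2x + \lambda^2c^4(1-z) - \tilde{k}^2zx}\right)\bigg|_{x=\delta}^{x=0}\\
&= \frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\int_0^1\frac{z\,dz}{z-1}\left(\frac{1}{z^2\tilde{p}_y^2 + \lambda^2c^4(1-z)} - \frac{1}{z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta}\right)\\
&= \frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\int_0^1\frac{z\,dz}{z-1}\cdot\frac{z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta - z^2\tilde{p}_y^2 - \lambda^2c^4(1-z)}{(z^2\tilde{p}_y^2 + \lambda^2c^4(1-z))(z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta)}\\
&= \frac{\pi^2\delta}{ic^3}\int_0^1 dy\int_0^1\frac{z^2\,dz}{[z^2\tilde{p}_y^2 + \lambda^2c^4(1-z)][z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta]}
\end{aligned}$$

We now break the $z$ integration into two regions $0 \le z < z_c$ and $z_c \le z < 1$, where $z_c$ is chosen such that $\lambda^2c^4 \ll z_c^2\tilde{p}_y^2 \ll \tilde{k}^2z_c\delta$. We also use table integrals

$$\int\frac{dz}{z(az + b)} = \frac{1}{b}\left[\ln(z) - \ln(az + b)\right] + \text{const}$$

$$\begin{aligned}
\int dz\,\frac{a + bz}{a + cz^2} &= \frac{b}{2c}\ln(a + cz^2) + \sqrt{\frac{a}{c}}\tan^{-1}\left(\frac{\sqrt{c}\,z}{\sqrt{a}}\right) + \text{const}\\
&\approx \frac{b}{2c}\ln(a + cz^2) + \text{const}
\end{aligned} \tag{M.49}$$

$$\int dz\,\frac{a}{a + cz} = \frac{a}{c}\ln(a + cz) + \text{const}$$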
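Then

$$L_{III} = L_{IIIa} + L_{IIIb}$$

$$\begin{aligned}
L_{IIIb} &= \frac{\pi^2\delta}{ic^3}\int_0^1 dy\int_{z_c}^1\frac{z^2\,dz}{[z^2\tilde{p}_y^2 + \lambda^2c^4(1-z)][z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta]}\\
&\approx \frac{\pi^2\delta}{ic^3}\int_0^1\frac{dy}{\tilde{p}_y^2}\int_{z_c}^1\frac{dz}{z[z(\tilde{p}_y^2 + \tilde{k}^2\delta) - \tilde{k}^2\delta]}\\
&= -\frac{\pi^2\delta}{ic^3}\int_0^1\frac{dy}{\tilde{p}_y^2}\frac{1}{\tilde{k}^2\delta}\left[\ln(z) - \ln\left(z(\tilde{p}_y^2 + \tilde{k}^2\delta) - \tilde{k}^2\delta\right)\right]\Big|_{z=z_c}^{z=1}\\
&= -\frac{\pi^2\delta}{ic^3}\int_0^1\frac{dy}{\tilde{p}_y^2}\frac{1}{\tilde{k}^2\delta}\left(-\ln(\tilde{p}_y^2) - \ln(z_c) + \ln[z_c(\tilde{p}_y^2 + \tilde{k}^2\delta) - \tilde{k}^2\delta]\right)\\
&\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{\tilde{p}_y^2}\ln\left(\frac{-\tilde{k}^2\delta}{\tilde{p}_y^2z_c}\right)
\end{aligned} \tag{M.50}$$

$$\begin{aligned}
L_{IIIa} &= \frac{\pi^2\delta}{ic^3}\int_0^1 dy\int_0^{z_c}\frac{z^2\,dz}{[z^2\tilde{p}_y^2 + \lambda^2c^4(1-z)][z^2\tilde{p}_y^2 + z^2\tilde{k}^2\delta + \lambda^2c^4(1-z) - \tilde{k}^2z\delta]}\\
&\approx \frac{\pi^2\delta}{ic^3}\int_0^1 dy\int_0^{z_c}\frac{z^2\,dz}{(z^2\tilde{p}_y^2 + \lambda^2c^4)(\lambda^2c^4 - \tilde{k}^2z\delta)}\\
&= -\frac{\pi^2\delta}{ic^3}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2\lambda^2c^4 + \tilde{k}^4\delta^2}\int_0^{z_c}dz\left(\frac{\lambda^2c^4 + \tilde{k}^2z\delta}{z^2\tilde{p}_y^2 + \lambda^2c^4} - \frac{\lambda^2c^4}{\lambda^2c^4 - \tilde{k}^2z\delta}\right)\\
&= -\frac{\pi^2\delta}{ic^3}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2\lambda^2c^4 + \tilde{k}^4\delta^2}\left(\frac{\tilde{k}^2\delta}{2\tilde{p}_y^2}\ln(\lambda^2c^4 + z^2\tilde{p}_y^2) + \frac{\lambda^2c^4}{\tilde{k}^2\delta}\ln(\lambda^2c^4 - \tilde{k}^2z\delta)\right)\bigg|_{z=0}^{z=z_c}\\
&= -\frac{\pi^2\delta}{ic^3}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2\lambda^2c^4 + \tilde{k}^4\delta^2}\bigg(\frac{\tilde{k}^2\delta}{2\tilde{p}_y^2}\ln(\lambda^2c^4 + z_c^2\tilde{p}_y^2) + \frac{\lambda^2c^4}{\tilde{k}^2\delta}\ln(\lambda^2c^4 - \tilde{k}^2z_c\delta)\\
&\qquad - \frac{\tilde{k}^2\delta}{2\tilde{p}_y^2}\ln(\lambda^2c^4) - \frac{\lambda^2c^4}{\tilde{k}^2\delta}\ln(\lambda^2c^4)\bigg)\\
&\approx -\frac{\pi^2\delta}{ic^3}\int_0^1 dy\,\frac{1}{\tilde{p}_y^2\lambda^2c^4 + \tilde{k}^4\delta^2}\bigg(\frac{\tilde{k}^2\delta}{2\tilde{p}_y^2}\ln(z_c^2\tilde{p}_y^2) + \frac{\lambda^2c^4}{\tilde{k}^2\delta}\ln(-\tilde{k}^2z_c\delta)\\
&\qquad - \frac{\tilde{k}^2\delta}{2\tilde{p}_y^2}\ln(\lambda^2c^4) - \frac{\lambda^2c^4}{\tilde{k}^2\delta}\ln(\lambda^2c^4)\bigg)\\
&\approx -\frac{\pi^2}{ic^3}\int_0^1 dy\,\frac{1}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{z_c^2\tilde{p}_y^2}{\lambda^2c^4}\right)
\end{aligned} \tag{M.51}$$

Adding together (M.50) and (M.51) we obtain

$$\begin{aligned}
L_{III} = L_{IIIa} + L_{IIIb} &= -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{\tilde{p}_y^2}\ln\left(\frac{-\tilde{k}^2\delta}{\tilde{p}_y^2z_c}\right) - \frac{\pi^2}{ic^3}\int_0^1 dy\,\frac{1}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{z_c^2\tilde{p}_y^2}{\lambda^2c^4}\right)\\
&\approx -\frac{\pi^2}{ic^3}\int_0^1\frac{dy}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{\tilde{k}^4\delta^2}{\tilde{p}_y^4z_c^2}\right) - \frac{\pi^2}{ic^3}\int_0^1 dy\,\frac{1}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{z_c^2\tilde{p}_y^2}{\lambda^2c^4}\right)\\
&= -\frac{\pi^2}{ic^3}\int_0^1\frac{dy}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{\tilde{k}^4\delta^2}{\tilde{p}_y^2\lambda^2c^4}\right)
\end{aligned}$$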
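In the integral $L_I$ we replace $z \to 1 - z$

$$L_I = \frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\int_0^1\frac{x(1-z)^2\,dz}{[(1-z)^2(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) + \lambda^2c^4(1 - (1-z)x) - \tilde{k}^2(1-z)(1-x)]^2}$$

and break the $z$-integration into two regions $0 \le z < z_c$ and $z_c \le z \le 1$, where $z_c$ is small, but large enough, so that in the second region we can neglect the $\lambda$-term. Then

$$\begin{aligned}
L_{Ia} &\approx \frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\int_0^{z_c}\frac{x\,dz}{[(1-2z)(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) + \lambda^2c^4 - \tilde{k}^2(1-z)(1-x)]^2}\\
&= \frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\int_0^{z_c}\frac{x\,dz}{[(x^2\tilde{p}_y^2 + \lambda^2c^4) - (2x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z]^2}\\
&= -\frac{\pi^2}{ic^3}\int_0^\epsilon x\,dx\int_0^1 dy\,\frac{1}{[2x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)][-(x^2\tilde{p}_y^2 + \lambda^2c^4) + (2x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z]}\bigg|_{z=0}^{z=z_c}\\
&= -\frac{\pi^2}{ic^3}\int_0^\epsilon x\,dx\int_0^1 dy\,\frac{1}{2x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)}\left(\frac{1}{-(x^2\tilde{p}_y^2 + \lambda^2c^4) + (2x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z_c} + \frac{1}{x^2\tilde{p}_y^2 + \lambda^2c^4}\right)\\
&\approx -\frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\,\frac{x}{\tilde{k}^2(1-x)(x^2\tilde{p}_y^2 + \lambda^2c^4)} \approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^\epsilon dx\int_0^1 dy\,\frac{x(1+x)}{x^2\tilde{p}_y^2 + \lambda^2c^4}\\
&\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^\epsilon dx\int_0^1 dy\left(\frac{x}{x^2\tilde{p}_y^2 + \lambda^2c^4} + \frac{1}{\tilde{p}_y^2}\right)
\end{aligned}$$

The last term in parentheses can be neglected when integrated over $x$. Using integral (M.49) with $a = \lambda^2c^4$, $b = 1$, $c = \tilde{p}_y^2$ we obtain

$$\begin{aligned}
L_{Ia} &\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{2\tilde{p}_y^2}\ln(x^2\tilde{p}_y^2 + \lambda^2c^4)\Big|_{x=0}^{x=\epsilon}\\
&= -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{2\tilde{p}_y^2}\ln\frac{\epsilon^2\tilde{p}_y^2}{\lambda^2c^4}
\end{aligned}$$

In the second part $L_{Ib}$ we neglect the $\lambda$-term

$$\begin{aligned}
L_{Ib} &\approx \frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\int_{z_c}^1\frac{x(1-z)^2\,dz}{[(1-z)^2(x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)) - \tilde{k}^2(1-z)(1-x)]^2}\\
&\approx \frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\int_{z_c}^1\frac{x\,dz}{[-x^2\tilde{p}_y^2 + (x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z]^2}\\
&= -\frac{\pi^2}{ic^3}\int_0^\epsilon x\,dx\int_0^1 dy\,\frac{1}{[x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)][-x^2\tilde{p}_y^2 + (x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z]}\bigg|_{z=z_c}^{z=1}\\
&= -\frac{\pi^2}{ic^3}\int_0^\epsilon dx\int_0^1 dy\,\frac{x}{x^2\tilde{p}_y^2 + \tilde{k}^2(1-x)}\left(\frac{1}{\tilde{k}^2(1-x)} - \frac{1}{-x^2\tilde{p}_y^2 + (x^2\tilde{p}_y^2 + \tilde{k}^2(1-x))z_c}\right)\\
&\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^\epsilon dx\int_0^1 dy\,x\left(\frac{1}{\tilde{k}^2} - \frac{1}{\tilde{k}^2z_c}\right)\\
&\approx 0
\end{aligned}$$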
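Collecting all non-vanishing contributions we obtain

$$\begin{aligned}
b(p,q,k) &\approx L_{Ia} + L_{II} + L_{III}\\
&\approx -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\,\frac{1}{2\tilde{p}_y^2}\ln\frac{\epsilon^2\tilde{p}_y^2}{\lambda^2c^4} + \frac{\pi^2\ln(\delta\epsilon)}{ic^3\tilde{k}^2}\int_0^1\frac{dy}{\tilde{p}_y^2} - \frac{\pi^2}{ic^3}\int_0^1\frac{dy}{2\tilde{k}^2\tilde{p}_y^2}\ln\left(\frac{\tilde{k}^4\delta^2}{\tilde{p}_y^2\lambda^2c^4}\right)\\
&= -\frac{\pi^2}{ic^3\tilde{k}^2}\int_0^1 dy\,\frac{1}{2\tilde{p}_y^2}\ln\left(\frac{\epsilon^2\tilde{p}_y^2}{\lambda^2c^4\,\delta^2\epsilon^2}\cdot\frac{\tilde{k}^4\delta^2}{\tilde{p}_y^2\lambda^2c^4}\right)\\
&= -\frac{\pi^2}{ic^3\tilde{k}^2}\ln\left(\frac{\tilde{k}^2}{\lambda^2c^4}\right)\int_0^1 dy\,\frac{1}{\tilde{p}_y^2}
\end{aligned} \tag{M.52}$$

This is equation (A20) in [Red53].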
M.7 Coulomb scattering in 2nd order

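Here we will calculate the 3D integral

$$D(\mathbf{q},\mathbf{q}') = \int\frac{d\mathbf{s}}{[(\mathbf{q} - \mathbf{s})^2 + \lambda^2c^2][s^2 - q^2 + i\mu][(\mathbf{s} - \mathbf{q}')^2 + \lambda^2c^2]}$$

in formula (14.22) for the 4th order commutator term in the electron-proton interaction. We are interested in leading terms surviving in the limits $\lambda \to 0$, $\mu \to 0$. The calculation method was adopted from §121 in [BLP01].¹⁷

¹⁷ See also [Kac59].

First we use (M.6) and the elastic scattering condition $(q')^2 = q^2$ to write

$$\begin{aligned}
D(\mathbf{q},\mathbf{q}') &= 2\int_0^1 dx\int_0^{1-x}dy\int\frac{d\mathbf{s}}{[((\mathbf{q} - \mathbf{s})^2 + \lambda^2c^2)x + ((\mathbf{s} - \mathbf{q}')^2 + \lambda^2c^2)y + (s^2 - q^2 - i\mu)(1 - x - y)]^3}\\
&= 2\int_0^1 dx\int_0^{1-x}dy\int\frac{d\mathbf{s}}{[s^2 - 2(\mathbf{q}\mathbf{s})x - 2(\mathbf{q}'\mathbf{s})y + \lambda^2c^2(x + y) + q^2(2x + 2y - 1) - i\mu]^3}
\end{aligned}$$

Next we shift the integration variable $\mathbf{s} \to \mathbf{h} \equiv \mathbf{s} - x\mathbf{q} - y\mathbf{q}'$, which completes the square in the denominator, and take into account that $2(\mathbf{q}\mathbf{q}') = 2q^2 - k^2$, where the vector of transferred momentum is defined as $\mathbf{k} = \mathbf{q}' - \mathbf{q}$

$$\begin{aligned}
D(\mathbf{q},\mathbf{q}') &= 2\int_0^1 dx\int_0^{1-x}dy\int\frac{d\mathbf{h}}{[h^2 + q^2(-x^2 - y^2 + 2x + 2y - 1) - 2(\mathbf{q}\mathbf{q}')xy + \lambda^2c^2(x + y) - i\mu]^3}\\
&= 2\int_0^1 dx\int_0^{1-x}dy\int\frac{d\mathbf{h}}{[h^2 - q^2(x + y - 1)^2 + k^2xy + \lambda^2c^2(x + y) - i\mu]^3}\\
&= \frac{i\pi^2}{2}\int_0^1 dx\int_0^{1-x}dy\,\frac{1}{[q^2(x + y - 1)^2 - k^2xy - \lambda^2c^2(x + y) - i\mu]^{3/2}}
\end{aligned}$$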
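Change integration variables $\xi = x + y$, $\zeta = x - y$

$$\begin{aligned}
D(\mathbf{q},\mathbf{q}') &= \frac{i\pi^2}{2}\int_0^1 d\xi\int_0^\xi d\zeta\,\frac{1}{(q^2(\xi - 1)^2 - k^2\xi^2/4 + k^2\zeta^2/4 - \lambda^2c^2\xi - i\mu)^{3/2}}\\
&= \frac{i\pi^2}{2}\int_0^1\frac{\xi\,d\xi}{(q^2(\xi - 1)^2 - k^2\xi^2/4 - \lambda^2c^2\xi - i\mu)\sqrt{q^2(\xi - 1)^2 - \lambda^2c^2\xi - i\mu}}
\end{aligned}$$

Next we introduce parameter $\delta$, such that $1 \gg \delta \gg \lambda^2c^2/q^2$, and split the integration range into two parts

$$\begin{aligned}
D(\mathbf{q},\mathbf{q}') &= D_1(\mathbf{q},\mathbf{q}') + D_2(\mathbf{q},\mathbf{q}')\\
D_1(\mathbf{q},\mathbf{q}') &= \frac{i\pi^2}{2}\int_0^{1-\delta}\ldots\,d\xi\\
D_2(\mathbf{q},\mathbf{q}') &= \frac{i\pi^2}{2}\int_{1-\delta}^1\ldots\,d\xi
\end{aligned}$$

In the first integral we ignore the $\lambda$-term

$$\begin{aligned}
D_1(\mathbf{q},\mathbf{q}') &\approx \frac{i\pi^2}{2q^3}\int_0^{1-\delta}\frac{\xi\,d\xi}{[(\xi - 1)^2 - k^2\xi^2/(4q^2) - i\mu](\xi - 1)}\\
&= \frac{i\pi^2}{2q^3}\cdot\frac{2q^2}{k^2}\ln\left(\frac{-k^2\xi^2/(4q^2) + \xi^2 - 2\xi + 1}{(1 - \xi)^2}\right)\bigg|_0^{1-\delta} \approx \frac{i\pi^2}{qk^2}\ln\left(\frac{-k^2}{4q^2\delta^2}\right)
\end{aligned}$$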
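The antiderivative used for $D_1$ can be verified by numerical differentiation. In the sketch below `kappa` stands for $k^2/(4q^2)$, the small $i\mu$ term is dropped, and the sample point is chosen where the argument of the logarithm is positive; all of these are hypothetical choices made for illustration.

```python
import math

kappa = 0.25  # hypothetical sample value of k²/(4q²)


def antiderivative(xi):
    """(1 / (2 kappa)) ln( ((ξ - 1)² - kappa ξ²) / (1 - ξ)² ), i.e. (2q²/k²) ln(...)."""
    n = (xi - 1.0) ** 2 - kappa * xi * xi
    return math.log(n / (1.0 - xi) ** 2) / (2.0 * kappa)


def integrand(xi):
    """ξ / ( [(ξ - 1)² - kappa ξ²] (ξ - 1) )."""
    n = (xi - 1.0) ** 2 - kappa * xi * xi
    return xi / (n * (xi - 1.0))


xi0, h = 0.3, 1e-5
numeric_derivative = (antiderivative(xi0 + h) - antiderivative(xi0 - h)) / (2 * h)
print(numeric_derivative, integrand(xi0))
```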
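In the second integral we change the integration variable $y = \xi - 1$

$$\begin{aligned}
D_2(\mathbf{q},\mathbf{q}') &\approx \frac{i\pi^2}{2}\int_{-\delta}^0\frac{dy\,(y + 1)}{(q^2y^2 - k^2/4)\sqrt{q^2y^2 - \lambda^2c^2}} \approx \frac{2i\pi^2}{k^2}\int_0^\delta\frac{dy}{\sqrt{q^2y^2 - \lambda^2c^2}}\\
&= \frac{2i\pi^2}{qk^2}\ln\left(q\sqrt{q^2y^2 - \lambda^2c^2} + q^2y\right)\Big|_0^\delta = \frac{2i\pi^2}{qk^2}\ln\left(\frac{q\sqrt{q^2\delta^2 - \lambda^2c^2} + q^2\delta}{iq\lambda c}\right)\\
&\approx \frac{i\pi^2}{qk^2}\ln\left(\frac{-4q^2\delta^2}{\lambda^2c^2}\right)
\end{aligned}$$

Putting both parts of the integral together we finally obtain

$$D(\mathbf{q},\mathbf{q}') \approx \frac{i\pi^2}{qk^2}\ln\left(\frac{-k^2}{4q^2\delta^2}\cdot\frac{-4q^2\delta^2}{\lambda^2c^2}\right) = \frac{i\pi^2}{qk^2}\ln\left(\frac{k^2}{\lambda^2c^2}\right) \tag{M.53}$$

which is equation (121.16) in [BLP01].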

Appendix N
Relativistic invariance of RQD
N.1 Relativistic invariance of simple QFT

Here we would like to verify that the interacting theory presented in subsection 9.1.1 is, indeed, relativistically invariant [Wei95, Wei64b]. In other words, we are going to prove the validity of Poincaré commutators (6.22) - (6.26) for the interacting energy and boost operators

$$V = \int d\mathbf{x}\,V(\mathbf{x},0), \qquad \mathbf{Z} = \frac{1}{c^2}\int d\mathbf{x}\,\mathbf{x}V(\mathbf{x},0) \tag{N.1}$$

in (9.9) - (9.10).

Equation (6.22) follows directly from the property (9.7) in the case of space translations and rotations. The potential boost $\mathbf{Z}$ in (N.1) is a 3-vector by construction, so equation (6.24) is valid as well. Let us now prove the commutator (6.23)

$$[P_{0i}, Z_j] = \frac{i\hbar}{c^2}V\delta_{ij}$$

Consider the case $i = j = z$. Then, using equation (9.7) with $\Lambda = 1$, we obtain

$$\begin{aligned}
[P_{0z}, Z_z] &= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\,e^{\frac{i}{\hbar}P_{0z}a}\,zV(\mathbf{x},0)\,e^{-\frac{i}{\hbar}P_{0z}a}\\
&= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\,zV(x,y,z+a,0)\\
&= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\,(z - a)V(x,y,z,0)\\
&= \frac{i\hbar}{c^2}\int d\mathbf{x}\,V(x,y,z,0) = \frac{i\hbar}{c^2}V
\end{aligned} \tag{N.2}$$

which is exactly equation (6.23).
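The proof of equation (6.26) is more challenging. Let us consider the case $i = z$ and attempt to prove¹

$$[K_{0z}, V(t)] + [Z_z(t), H_0] - [V(t), Z_z(t)] = 0 \tag{N.3}$$

¹ In this calculation it is convenient to write condition (6.26) in a $t$-dependent form, i.e., multiply this equation by $\exp(\frac{i}{\hbar}H_0t)$ from the left and $\exp(-\frac{i}{\hbar}H_0t)$ from the right, as in (7.10). At the end of calculations we will set $t = 0$.

For the first term on the left hand side we use (9.7) and

$$\begin{aligned}
\lim_{\theta\to0}\frac{d}{d\theta}V(\Lambda\tilde{x}) &= \lim_{\theta\to0}\frac{d}{d\theta}V\left(x, y, z\cosh\theta - ct\sinh\theta, t\cosh\theta - \frac{z}{c}\sinh\theta\right)\\
&= \lim_{\theta\to0}\left[\frac{\partial V}{\partial z}(z\sinh\theta - ct\cosh\theta) + \frac{\partial V}{\partial t}\left(t\sinh\theta - \frac{z}{c}\cosh\theta\right)\right]\\
&= -ct\frac{\partial V}{\partial z} - \frac{z}{c}\frac{\partial V}{\partial t}
\end{aligned}$$

where $\Lambda$ is the boost matrix (I.11). Then

$$\begin{aligned}
[K_{0z}, V(t)] &= -\frac{i\hbar}{c}\lim_{\theta\to0}\frac{d}{d\theta}\,e^{\frac{ic}{\hbar}K_{0z}\theta}\int d\mathbf{x}\,V(\tilde{x})\,e^{-\frac{ic}{\hbar}K_{0z}\theta}\\
&= -\frac{i\hbar}{c}\lim_{\theta\to0}\frac{d}{d\theta}\int d\mathbf{x}\,V(\Lambda\tilde{x})\\
&= -\frac{i\hbar}{c}\int d\mathbf{x}\left(-ct\frac{\partial V(\mathbf{x},t)}{\partial z} - \frac{z}{c}\frac{\partial V(\mathbf{x},t)}{\partial t}\right)
\end{aligned} \tag{N.4}$$

For the second term we obtain, using the definition (N.1) of the potential boost,

$$[Z_z(t), H_0] = -i\hbar\frac{\partial}{\partial t}Z_z(t) = -\frac{i\hbar}{c^2}\frac{\partial}{\partial t}\int d\mathbf{x}\,zV(\mathbf{x},t) \tag{N.5}$$

The last term in (N.3) vanishes due to (9.8). Now we can set $t = 0$ and see that (N.4) and (N.5) cancel each other, which proves (N.3).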
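The derivative of $V(\Lambda\tilde{x})$ with respect to the boost parameter at $\theta = 0$ can be illustrated with finite differences. In the sketch below the test function `V`, the numerical value `C` of the speed of light and the sample point are hypothetical; only the boost formula for the arguments is taken from the text.

```python
import math

C = 3.0  # speed of light in arbitrary units (hypothetical value for the test)


def V(x, y, z, t):
    """Arbitrary smooth test function standing in for the density V(x, t)."""
    return math.exp(-(x * x + y * y + z * z)) * math.cos(0.7 * t + 0.2 * z)


def boosted(theta, x, y, z, t):
    """V evaluated at the boosted point (x, y, z cosh θ - ct sinh θ, t cosh θ - (z/c) sinh θ)."""
    return V(x, y,
             z * math.cosh(theta) - C * t * math.sinh(theta),
             t * math.cosh(theta) - (z / C) * math.sinh(theta))


x, y, z, t = 0.3, -0.2, 0.5, 0.4
h = 1e-5

# Left side: d/dθ V(Λx̃) at θ = 0, by a central difference
lhs = (boosted(h, x, y, z, t) - boosted(-h, x, y, z, t)) / (2 * h)

# Right side: -ct ∂V/∂z - (z/c) ∂V/∂t, with central differences for the partial derivatives
dV_dz = (V(x, y, z + h, t) - V(x, y, z - h, t)) / (2 * h)
dV_dt = (V(x, y, z, t + h) - V(x, y, z, t - h)) / (2 * h)
rhs = -C * t * dV_dz - (z / C) * dV_dt
print(lhs, rhs)
```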
Derivation of the last remaining nontrivial commutation relation

$$[K_{0i}, Z_j] + [Z_i, K_{0j}] + [Z_i, Z_j] = 0$$

is left as an exercise for the reader.
N.2 Relativistic invariance of QED

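In this Appendix we are going to prove the relativistic invariance of the field-theoretical formulation of QED presented in subsection 9.1.2. In other words, we are going to prove the validity of Poincaré commutators (6.22) - (6.26).² The proof presented here is taken from Weinberg's works [Wei95, Wei64b] and, especially, Appendix B in [Wei65].

² We write conditions (6.22) - (6.26) in a $t$-dependent form. See footnote 1 in section N.1.

The interaction operator $V(t)$ in (9.12) clearly commutes with operators of the total momentum and total angular momentum, so equation (6.22) is easily verified. The potential boost $\mathbf{Z}$ in (9.16) is a 3-vector by construction, so equation (6.24) is valid as well. Let us now prove the commutator (6.23)

$$[P_{0i}, Z_j(t)] = \frac{i\hbar}{c^2}V(t)\delta_{ij}$$

Consider the case $i = j = x$ and denote

$$V(\mathbf{x},t) \equiv \frac{\hbar}{\sqrt{c}}\,\mathbf{j}(\mathbf{x},t)\mathbf{A}(\mathbf{x},t) + \frac{1}{2c^2}\int d\mathbf{y}\,j_0(\mathbf{x},t)G(\mathbf{x} - \mathbf{y})j_0(\mathbf{y},t)$$

$$G(\mathbf{x}) \equiv \frac{1}{4\pi|\mathbf{x}|}$$

so that

$$\begin{aligned}
V(t) &= \int d\mathbf{x}\,V(\mathbf{x},t)\\
\mathbf{Z}(t) &= \frac{1}{c^2}\int d\mathbf{x}\,\mathbf{x}V(\mathbf{x},t) + \frac{\hbar}{c^{5/2}}\int d\mathbf{x}\,j_0(\mathbf{x},t)\mathbf{C}(\mathbf{x},t)
\end{aligned} \tag{N.6}$$

where

$$\mathbf{C}(\tilde{x}) \equiv \frac{i\hbar^2\sqrt{c}}{\sqrt{2(2\pi\hbar)^3}}\int\frac{d\mathbf{p}}{p^{3/2}}\sum_\tau\left[e^{-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\mathbf{e}(\mathbf{p},\tau)c_{\mathbf{p},\tau} - e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\mathbf{e}^*(\mathbf{p},\tau)c^\dagger_{\mathbf{p},\tau}\right] \tag{N.7}$$

Then, using equations (L.4) and (8.38) - (8.39) we obtain

$$\begin{aligned}
[P_{0x}, Z_x(t)] &= -i\hbar\lim_{a\to0}\frac{d}{da}\,e^{\frac{i}{\hbar}P_{0x}a}Z_x(t)e^{-\frac{i}{\hbar}P_{0x}a}\\
&= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\,e^{\frac{i}{\hbar}P_{0x}a}\left(xV(\mathbf{x},t) + \frac{\hbar}{\sqrt{c}}j_0(\mathbf{x},t)C_x(\mathbf{x},t)\right)e^{-\frac{i}{\hbar}P_{0x}a}\\
&= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\left(xV(x+a,y,z,t) + \frac{\hbar}{\sqrt{c}}j_0(x+a,y,z,t)C_x(x+a,y,z,t)\right)\\
&= -\frac{i\hbar}{c^2}\lim_{a\to0}\frac{d}{da}\int d\mathbf{x}\left((x - a)V(x,y,z,t) + \frac{\hbar}{\sqrt{c}}j_0(x,y,z,t)C_x(x,y,z,t)\right)\\
&= \frac{i\hbar}{c^2}\int d\mathbf{x}\,V(x,y,z,t) = \frac{i\hbar}{c^2}V(t)
\end{aligned} \tag{N.8}$$

which is exactly equation (6.23).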
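The proof of equation (6.26) is more challenging. Let us consider the case $i = z$ and attempt to prove

$$[K_{0z}, V_1(t)] + [K_{0z}, V_2(t)] - i\hbar\frac{d}{dt}Z_z(t) - [V(t), Z_z(t)] = 0 \tag{N.9}$$

where we took into account that $[Z_z(t), H_0] = -i\hbar\frac{d}{dt}Z_z(t)$. We will calculate all four terms on the left hand side of (N.9) separately. Consider the first term and use equations (L.3), (K.23), (K.26), (I.4)

$$\begin{aligned}
[K_{0z}, V_1(t)] &= -\frac{i\hbar}{c}\lim_{\theta\to0}\frac{d}{d\theta}\,e^{\frac{ic}{\hbar}K_{0z}\theta}V_1(t)e^{-\frac{ic}{\hbar}K_{0z}\theta}\\
&= -\frac{i\hbar^2}{c^{3/2}}\lim_{\theta\to0}\frac{d}{d\theta}\,e^{\frac{ic}{\hbar}K_{0z}\theta}\int d\mathbf{x}\,\tilde{j}(\tilde{x})\cdot\tilde{A}(\tilde{x})\,e^{-\frac{ic}{\hbar}K_{0z}\theta}\\
&= -\frac{i\hbar^2}{c^{3/2}}\lim_{\theta\to0}\frac{d}{d\theta}\int d\mathbf{x}\left(\Lambda^{-1}\tilde{j}(\Lambda\tilde{x})\cdot\Lambda^{-1}\tilde{A}(\Lambda\tilde{x}) + \Lambda^{-1}\tilde{j}(\Lambda\tilde{x})\cdot\Omega(\tilde{x},\Lambda)\right)\\
&= -\frac{i\hbar^2}{c^{3/2}}\lim_{\theta\to0}\frac{d}{d\theta}\int d\mathbf{x}\left(\tilde{j}(\Lambda\tilde{x})\cdot\tilde{A}(\Lambda\tilde{x}) + \Lambda^{-1}\tilde{j}(\Lambda\tilde{x})\cdot\Omega(\tilde{x},\Lambda)\right)\\
&= -\frac{i\hbar^2}{c^{3/2}}\lim_{\theta\to0}\int d\mathbf{x}\,\bigg(\frac{d}{d\theta}\tilde{j}(\Lambda\tilde{x})\cdot\tilde{A}(\tilde{x}) + \tilde{j}(\tilde{x})\cdot\frac{d}{d\theta}\tilde{A}(\Lambda\tilde{x}) + \frac{d}{d\theta}\Lambda^{-1}\tilde{j}(\tilde{x})\cdot\Omega(\tilde{x},1)\\
&\qquad + \frac{d}{d\theta}\tilde{j}(\Lambda\tilde{x})\cdot\Omega(\tilde{x},1) + \tilde{j}(\tilde{x})\cdot\frac{d}{d\theta}\Omega(\tilde{x},\Lambda)\bigg)\\
&= -\frac{i\hbar^2}{c^{3/2}}\lim_{\theta\to0}\int d\mathbf{x}\left(\frac{d}{d\theta}\tilde{j}(\Lambda\tilde{x})\cdot\tilde{A}(\tilde{x}) + \tilde{j}(\tilde{x})\cdot\frac{d}{d\theta}\tilde{A}(\Lambda\tilde{x}) + \tilde{j}(\tilde{x})\cdot\frac{d}{d\theta}\Omega(\tilde{x},\Lambda)\right)
\end{aligned} \tag{N.10}$$

where $\Omega(\tilde{x},\Lambda)$ is given by equation (K.24) and $\Lambda$ is matrix (I.11). Next we use the following results

$$\begin{aligned}
\lim_{\theta\to0}\frac{d}{d\theta}\tilde{j}(\Lambda\tilde{x}) &= \lim_{\theta\to0}\frac{d}{d\theta}\tilde{j}\left(x, y, z\cosh\theta - ct\sinh\theta, t\cosh\theta - \frac{z}{c}\sinh\theta\right)\\
&= \lim_{\theta\to0}\left[\frac{\partial\tilde{j}}{\partial z}(z\sinh\theta - ct\cosh\theta) + \frac{\partial\tilde{j}}{\partial t}\left(t\sinh\theta - \frac{z}{c}\cosh\theta\right)\right]\\
&= -ct\frac{\partial\tilde{j}}{\partial z} - \frac{z}{c}\frac{\partial\tilde{j}}{\partial t}
\end{aligned} \tag{N.11}$$

$$\lim_{\theta\to0}\frac{d}{d\theta}\tilde{A}(\Lambda\tilde{x}) = -ct\frac{\partial\tilde{A}}{\partial z} - \frac{z}{c}\frac{\partial\tilde{A}}{\partial t} \tag{N.12}$$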
Calculation of the dΩ/dθ term is more involved
d
lim Ωµ (x̃, Λ)
θ→0 dθ
√ 3 Z 1
~ c X d dp X (Λ−1 p)µ
= − lim √ ×
(2π~)3/2 θ→0 ν=0 dθ 2p τ =−1 |Λ−1p|
3
X
− ~i Λ−1 p̃·x̃ i −1
Λ−1
0ν e eν (p, τ )cp,τ + e ~
Λ p̃·x̃
e∗ν (p, τ )c†p,τ (N.13)
νρ=0
The only quantities dependent on θ are Λ-matrices. Therefore, taking the

derivative on the right hand side of equation (N.13) we will obtain four
d −1 d −1 d d
terms, those containing dθ Λνµ , dθ Λ0ρ , dθ |Λ−1 p|−1 and dθ exp(±iΛ−1 p̃ · x̃).
After taking the derivative we must set θ → 0. It follows from equation
(K.25) that the only non-zero term is that containing
d −1 d
lim Λ0ρ = lim (cosh θ, 0, 0, sinh θ)
θ→0 dθ θ→0 dθ
= lim(− sinh θ, 0, 0, cosh θ)
θ→0
= (0, 0, 0, 1)
Thus
d
lim Ωµ (x̃, Λ)
θ→0 dθ
√ Z 1
~ c dppµ X − i p̃·x̃ i
p̃·x̃ ∗ †

= − √ e ~ ez (p, τ )cp,τ + e ~ ez (p, τ )cp,τ
(2π~)3/2 2p3/2 τ =−1
√ Z 1
i~2 c dp X − i p̃·x̃ i
p̃·x̃ ∗ †

= −p ∂µ e ~ e z (p, τ )c p,τ − e ~ ez (p, τ )c p,τ
2(2π~)3 p3/2 τ =−1
= −∂µ Cz (x̃) (N.14)
where ∂µ ≡ (− 1c ∂t
∂ ∂
, ∂x ∂
, ∂y ∂
, ∂z ). So, using (N.14) and the continuity equation
(L.5), we obtain that the last term on the right hand side of equation (N.10)
is3
3
due to the property (9.2) all functions f and g of quantum fields vanish at infinity,
therefore we can take integrals by parts (ξ ≡ (t, x, y, z))
Z∞ Z∞ Z∞
d d d
dx f (ξ) g(ξ) = dx (f (ξ)g(ξ)) − dxf (ξ) g(ξ)
dx dx dx
−∞ −∞ −∞
Z
i~2 d
− 3/2 lim dxj̃(x̃) · Ω(x̃, Λ)
c θ→0 dθ
2 XZ
i~
= dxjµ (x, t)gµν ∂ν Cz (x, t)
c3/2 µν
Z Z
i~2 ∂Cz (x, t) i~2 ∂Cz (x, t)
= 5/2
dxj0 (x, t) + 3/2 dxj(x, t)
c ∂t c ∂x
2 Z 2 Z
i~ ∂Cz (x, t) i~ ∂j(x, t)
= 5/2
dxj0 (x, t) − 3/2 dx Cz (x, t)
c ∂t c ∂x
Z Z
i~2 ∂Cz (x, t) i~2 ∂j0 (x, t)
= 5/2
dxj0 (x, t) + 5/2 dx Cz (x, t)
c ∂t c ∂t
Z
i~2 ∂
= dxj0 (x, t)Cz (x, t) (N.15)
c5/2 ∂t
Substituting results (N.11), (N.12), (N.14), and (N.15) in equation (N.10)
and setting t = 0 we obtain
[K0z , V1 (t)]
Z !
i~2 z ∂ j̃ z ∂ Ã 1 ∂
= − 3/2 dx − · Ã(x̃) − j̃(x̃) · − 2 (j0 (x̃)Cz (x̃))
c c ∂t c ∂t c ∂t
Z
i~2 ∂
= − 5/2 dx −z(j̃(x̃) · Ã(x̃)) − j0 (x̃)Cz (x̃) (N.16)
c ∂t
For the second term on the left hand side of (N.9) we use equation (L.3)
[K0z , V2 (t)]
Z Z
1 1
= dxdx [K0z , j0 (x̃)]G(x − x )j0 (x̃ ) + 2 dxdx′ j0 (x̃)G(x − x′ )[K0z , j0 (x̃′ )]
′ ′ ′
2c2 2c
Z∞
d
= f (x = ∞)g(x = ∞) − f (x = −∞)g(x = −∞) − dxf (ξ) g(ξ)
dx
−∞
Z∞
d
= − dxf (ξ) g(ξ)
dx
−∞
Z
1
= 2
dxdx′ [K0z , j0 (x̃)]G(x − x′ )j0 (x̃′ )
c
Z
i~ ′ z ∂j0 (x̃) 1
= dxdx − jz (x̃) G(x − x′ )j0 (x̃′ )
c2 c2 ∂t c
Z Z
i~ ′ ∂j0 (x̃) ′ ′ ′ i~ ′ ∂j0 (x̃)
= 4
dxdx (z − z )G(x − x )j0 (x̃ ) + 4
dxdx zG(x − x′ )j0 (x̃′ )
2c ∂t 2c ∂t
Z Z
i~ ∂j 0 (x̃) i~
+ 4 dxdx′ z ′ G(x − x′ )j0 (x̃′ ) − 3 dxdx′ jz (x̃)G(x − x′ )j0 (x̃′ )
2c ∂t c
Z Z
i~ ′ ∂j(x̃) i~ ∂j0 (x̃)
= − 3 dxdx (z − z )G(x − x )j0 (x̃ ) + 4 dxdx′
′ ′ ′
zG(x − x′ )j0 (x̃′ )
2c ∂x 2c ∂t
Z ′ Z
i~ ′ ′ ∂j0 (x̃ ) i~
+ 4 dxdx j0 (x̃)zG(x − x ) − 3 dxdx′ jz (x̃)G(x − x′ )j0 (x̃′ )
2c ∂t c
Z Z
i~ ′ ∂((z − z ′ )G(x − x′ )) ′ i~ ∂
= dxdx j(x̃) j0 (x̃ ) + 4 dxdx′ j0 (x̃)zG(x − x′ )j0 (x̃′ )
2c3 ∂x 2c ∂t
Z
i~
− 3 dxdx′ jz (x̃)G(x − x′ )j0 (x̃′ ) (N.17)
c
Using expression (N.6) for Z(t) we obtain for the third term on the left hand
side of equation (N.9)
Z Z
∂ i~2 ∂ i~2 ∂
− i~ Zz (t) = − 5/2 dxzj(x, t)A(x, t) − 5/2 dxj0 (x, t)Cz (x, t)
∂t c ∂t c ∂t
Z
i~ ∂
− dxdyj0 (x, t)zG(x − y)j0 (y, t) (N.18)
2c4 ∂t
In order to calculate the last term in (N.9), we notice that the only term in
Z(t) which does not commute with V (t) is that containing C, therefore
Z
~2
− [V (t), Zz (t)] = − 3 dxdx′ j(x̃)j0 (x̃′ )[A(x̃), Cz (x̃′ )] (N.19)
c
To calculate the commutator, we set t = 0 and use equation (B.12)
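\[
\begin{aligned}
&[A_i(\mathbf{x},0), C_z(\mathbf{x}',0)] \\
&= i\hbar(2\pi\hbar)^{-3}\int\frac{d\mathbf{p}\,d\mathbf{q}}{2\sqrt{q^3 p}}\sum_{\sigma\tau}
\Big[e_i(\mathbf{p},\sigma)c_{\mathbf{p},\sigma}e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x}} + e_i^*(\mathbf{p},\sigma)c^{\dagger}_{\mathbf{p},\sigma}e^{-\frac{i}{\hbar}\mathbf{p}\mathbf{x}},\;
e_z(\mathbf{q},\tau)c_{\mathbf{q},\tau}e^{\frac{i}{\hbar}\mathbf{q}\mathbf{x}'} - e_z(\mathbf{q},\tau)c^{\dagger}_{\mathbf{q},\tau}e^{-\frac{i}{\hbar}\mathbf{q}\mathbf{x}'}\Big] \\
&= i\hbar(2\pi\hbar)^{-3}\int\frac{d\mathbf{p}\,d\mathbf{q}}{2\sqrt{q^3 p}}\sum_{\sigma\tau}e_i(\mathbf{p},\sigma)e_z^*(\mathbf{q},\tau)\times
\Big(-\delta(\mathbf{p}-\mathbf{q})\delta_{\sigma,\tau}e^{\frac{i}{\hbar}\mathbf{p}\mathbf{x}-\frac{i}{\hbar}\mathbf{q}\mathbf{x}'}
 - \delta(\mathbf{p}-\mathbf{q})\delta_{\sigma,\tau}e^{-\frac{i}{\hbar}\mathbf{p}\mathbf{x}+\frac{i}{\hbar}\mathbf{q}\mathbf{x}'}\Big) \\
&= -i\hbar(2\pi\hbar)^{-3}\int\frac{d\mathbf{p}}{2p^2}\sum_{\sigma\tau}e_i(\mathbf{p},\sigma)e_z^*(\mathbf{p},\tau)\delta_{\sigma,\tau}
\Big(e^{\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')} + e^{-\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')}\Big) \\
&= -i\hbar(2\pi\hbar)^{-3}\int\frac{d\mathbf{p}}{2p^2}\left(\delta_{iz}-\frac{p_i p_z}{p^2}\right)
\Big(e^{\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')} + e^{-\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')}\Big) \\
&= -i\hbar(2\pi\hbar)^{-3}\int\frac{d\mathbf{p}}{p^2}\left(\delta_{iz}-\frac{p_i p_z}{p^2}\right)e^{\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')} \\
&= -\frac{i}{\hbar}\delta_{iz}G(\mathbf{x}-\mathbf{x}') + \frac{i\hbar(-i\hbar)^2}{(2\pi\hbar)^3}\,\partial_{x_i}\partial_z\int\frac{d\mathbf{p}}{p^4}\,e^{\frac{i}{\hbar}\mathbf{p}(\mathbf{x}-\mathbf{x}')} \\
&= -\frac{i}{\hbar}\delta_{iz}G(\mathbf{x}-\mathbf{x}') + i\hbar^3\,\partial_{x_i}\partial_z\frac{|\mathbf{x}-\mathbf{x}'|}{8\pi\hbar^4} \\
&= -\frac{i}{\hbar}\delta_{iz}G(\mathbf{x}-\mathbf{x}') + \frac{i}{2\hbar}\,\partial_{x_i}\big((z-z')\,G(\mathbf{x}-\mathbf{x}')\big)
\end{aligned}
\tag{N.20}
\]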
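The last step of (N.20) can be spot-checked numerically. The sketch below assumes G(x) = 1/(4π|x|), which is what the last two lines of (N.20) imply, and compares a finite-difference mixed derivative of |x|/(8π) with a finite-difference derivative of zG(x)/2. The function names, the test point and the step size are illustrative choices, not part of the original derivation.

```python
import math

def green(x, y, z):
    """Assumed form G(x) = 1 / (4*pi*|x|), inferred from the last lines of (N.20)."""
    return 1.0 / (4.0 * math.pi * math.sqrt(x * x + y * y + z * z))

def dist_over_8pi(x, y, z):
    """|x| / (8*pi): the function differentiated twice in the next-to-last line."""
    return math.sqrt(x * x + y * y + z * z) / (8.0 * math.pi)

def half_z_green(x, y, z):
    """z * G(x) / 2: the function differentiated once in the last line."""
    return 0.5 * z * green(x, y, z)

def shifted(point, axis, step):
    moved = list(point)
    moved[axis] += step
    return moved

def d1(f, point, axis, h=1e-3):
    """Central first derivative along one axis."""
    return (f(*shifted(point, axis, h)) - f(*shifted(point, axis, -h))) / (2 * h)

def d2_mixed(f, point, axis_a, axis_b, h=1e-3):
    """Central mixed second derivative d^2 f / (d axis_a d axis_b)."""
    pp = shifted(shifted(point, axis_a, h), axis_b, h)
    pm = shifted(shifted(point, axis_a, h), axis_b, -h)
    mp = shifted(shifted(point, axis_a, -h), axis_b, h)
    mm = shifted(shifted(point, axis_a, -h), axis_b, -h)
    return (f(*pp) - f(*pm) - f(*mp) + f(*mm)) / (4 * h * h)

POINT = [0.7, -0.4, 0.9]  # arbitrary value of x - x' away from the origin

def residuals(point=POINT):
    """Difference between both sides for i = x, y, z (axis index 2 is z)."""
    return [d2_mixed(dist_over_8pi, point, i, 2) - d1(half_z_green, point, i)
            for i in range(3)]

print(residuals())
```

All three residuals should vanish up to finite-difference error, since both sides reduce to the derivative of z/(8π|x|).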
Then
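\[
-[V(t), Z_z(t)]
= \frac{i\hbar}{c^3}\sum_{i=1}^{3}\int d\mathbf{x}\,d\mathbf{x}'\,j_i(\tilde{x})
\left(\delta_{iz}\,G(\mathbf{x}-\mathbf{x}')\,j_0(\tilde{x}') - \frac{1}{2}\,\partial_{x_i}\big[(z-z')\,G(\mathbf{x}-\mathbf{x}')\big]\,j_0(\tilde{x}')\right)
\tag{N.21}
\]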
Now we can set t = 0, add four terms (N.16), (N.17), (N.18), and (N.21)
together and see that the first two terms in (N.18) cancel with the two terms
on the right hand side of (N.16); the third term in (N.18) cancels the sec-
ond term on the right hand side of (N.17); and (N.21) exactly cancels the
remaining first and third terms on the right hand side of (N.17). This proves
equation (N.9).
The proof of the last remaining commutation relation
[K0i , Zj ] + [Zi , K0j ] + [Zi , Zj ] = 0 (N.22)

is left as an exercise for the reader.
N.3 Relativistic invariance of classical electrodynamics
In this Appendix we will prove the relativistic invariance of the classical limit
of RQD constructed in subsections 12.1.2 and 15.1.1.
From our derivation in chapter 12 it follows that the Darwin-Breit Hamil-
tonian (12.10) is a part of a relativistically invariant theory in the instant form
of dynamics. This means that there exists an interacting boost operator K,
which satisfies all commutation relations of the Poincaré Lie algebra together
with the Darwin-Breit Hamiltonian H. In principle, it should be possible to
find the explicit form of the operator K by applying the unitary dressing
transformation4 to the boost operator (9.15) - (9.16) of QED. However, here
we will choose a different route. Together with [CV68, CO70, KF74] we will
simply postulate the form of K and verify that Poincaré commutators are,
indeed, satisfied in the (v/c)^2 approximation.
Let us first write the non-interacting generators of the Poincaré group for
a two-particle system as sums of one-particle generators5
P0 = p1 + p2 (N.23)
J0 = [r1 × p1 ] + s1 + [r2 × p2 ] + s2 (N.24)
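\[
H_0 = h_1 + h_2
\approx m_1c^2 + m_2c^2 + \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} - \frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2}
\tag{N.25}
\]
\[
\begin{aligned}
\mathbf{K}_0 &= -\frac{h_1\mathbf{r}_1}{c^2} - \frac{[\mathbf{p}_1\times\mathbf{s}_1]}{m_1c^2+h_1} - \frac{h_2\mathbf{r}_2}{c^2} - \frac{[\mathbf{p}_2\times\mathbf{s}_2]}{m_2c^2+h_2} \\
&\approx -m_1\mathbf{r}_1 - m_2\mathbf{r}_2 - \frac{p_1^2\mathbf{r}_1}{2m_1c^2} - \frac{p_2^2\mathbf{r}_2}{2m_2c^2}
 + \frac{1}{2c^2}\left(\frac{[\mathbf{s}_1\times\mathbf{p}_1]}{m_1} + \frac{[\mathbf{s}_2\times\mathbf{p}_2]}{m_2}\right)
\end{aligned}
\tag{N.26}
\]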
The full interacting generators are
H = H0 + V
K = K0 + Z (N.27)
4
5
see equations (17.17) - (17.20)
The potential energy V is given by (15.2), and the potential boost is postu-
lated as [CV68, CO70, KF74]
\[
\mathbf{Z} \approx -\frac{q_1q_2(\mathbf{r}_1+\mathbf{r}_2)}{8\pi c^2 r}
\tag{N.28}
\]
The non-trivial Poisson brackets of the Poincaré Lie algebra (3.52) - (3.58)
that need to be verified are those involving interacting generators H and K
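\[
[J_{0i}, K_j]_P = \sum_{k=x,y,z}\epsilon_{ijk}K_k \tag{N.29}
\]
\[
[\mathbf{J}_0, H]_P = [\mathbf{P}_0, H]_P = 0 \tag{N.30}
\]
\[
[K_i, K_j]_P = -\frac{1}{c^2}\sum_{k=x,y,z}\epsilon_{ijk}J_{0k} \tag{N.31}
\]
\[
[K_i, P_{0j}]_P = -\frac{1}{c^2}H\delta_{ij} \tag{N.32}
\]
\[
[\mathbf{K}, H]_P = -\mathbf{P}_0 \tag{N.33}
\]
where i, j, k = (x, y, z).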

The proof of (N.29) - (N.30) follows easily from the Poisson brackets of
particle observables (17.12) - (17.16) and formula (6.96) for brackets involving
complex expressions. This proof is left as an exercise for the reader. For the
less trivial brackets (N.31) - (N.33), it will be convenient to write H and K
as series in powers of (v/c)^2 (the superscript in parentheses is the power of
(v/c)^2)
\[
H \approx H^{(-1)} + H^{(0)} + H^{(1)}_{\text{orb}} + H^{(1)}_{\text{spin-orb}} + H^{(1)}_{\text{spin-spin}}
\]
\[
\mathbf{K} \approx \mathbf{K}^{(0)} + \mathbf{K}^{(1)}_{\text{orb}} + \mathbf{K}^{(1)}_{\text{spin-orb}}
\]
where
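\[
\begin{aligned}
H^{(-1)} &= m_1c^2 + m_2c^2 \\
H^{(0)} &= \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + \frac{q_1q_2}{4\pi r} \\
H^{(1)}_{\text{orb}} &= -\frac{p_1^4}{8m_1^3c^2} - \frac{p_2^4}{8m_2^3c^2}
 - \frac{q_1q_2}{8\pi m_1m_2c^2 r}\left((\mathbf{p}_1\cdot\mathbf{p}_2) + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right) \\
H^{(1)}_{\text{spin-orb}} &= -\frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_1}{8\pi m_1^2c^2r^3}
 + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_2}{8\pi m_2^2c^2r^3}
 + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_1}{4\pi m_1m_2c^2r^3}
 - \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_2}{4\pi m_1m_2c^2r^3} \\
H^{(1)}_{\text{spin-spin}} &= \frac{(\mathbf{s}_1\cdot\mathbf{s}_2)}{4\pi m_1m_2c^2r^3} - \frac{3(\mathbf{s}_1\cdot\mathbf{r})(\mathbf{s}_2\cdot\mathbf{r})}{4\pi m_1m_2c^2r^5} \\
\mathbf{K}^{(0)} &= -m_1\mathbf{r}_1 - m_2\mathbf{r}_2 \\
\mathbf{K}^{(1)}_{\text{orb}} &= -\frac{p_1^2\mathbf{r}_1}{2m_1c^2} - \frac{p_2^2\mathbf{r}_2}{2m_2c^2} - \frac{q_1q_2(\mathbf{r}_1+\mathbf{r}_2)}{8\pi c^2 r} \\
\mathbf{K}^{(1)}_{\text{spin-orb}} &= \frac{1}{2c^2}\left(\frac{[\mathbf{s}_1\times\mathbf{p}_1]}{m_1} + \frac{[\mathbf{s}_2\times\mathbf{p}_2]}{m_2}\right)
\end{aligned}
\]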
Then we find that the following relationships need to be proven
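\[
-\frac{1}{c^2}H^{(-1)}\delta_{ij} = [K^{(0)}_i, P_{0j}]_P \tag{N.34}
\]
\[
0 = [K^{(0)}_i, H^{(-1)}]_P \tag{N.35}
\]
\[
-P_{0i} = [K^{(0)}_i, H^{(0)}]_P \tag{N.36}
\]
\[
0 = [K^{(0)}_i, K^{(0)}_j]_P \tag{N.37}
\]
\[
-\frac{1}{c^2}H^{(0)}\delta_{ij} = [K^{(1)}_{i\text{-orb}}, P_{0j}]_P + [K^{(1)}_{i\text{-spin-orb}}, P_{0j}]_P \tag{N.38}
\]
\[
\begin{aligned}
0 &= [K^{(1)}_{i\text{-orb}}, H^{(0)}]_P + [K^{(1)}_{i\text{-spin-orb}}, H^{(0)}]_P + [K^{(0)}_i, H^{(1)}_{\text{orb}}]_P \\
&\quad + [K^{(0)}_i, H^{(1)}_{\text{spin-orb}}]_P + [K^{(0)}_i, H^{(1)}_{\text{spin-spin}}]_P
\end{aligned}
\tag{N.39}
\]
\[
\begin{aligned}
-\frac{1}{c^2}\sum_{k=1}^{3}\epsilon_{ijk}J_{0k}
&= [K^{(1)}_{i\text{-orb}}, K^{(0)}_j]_P + [K^{(1)}_{i\text{-spin-orb}}, K^{(0)}_j]_P + [K^{(0)}_i, K^{(1)}_{j\text{-orb}}]_P \\
&\quad + [K^{(0)}_i, K^{(1)}_{j\text{-spin-orb}}]_P
\end{aligned}
\tag{N.40}
\]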
Again, we skip the easy-to-prove (N.34), (N.35), (N.36), and (N.37). For
equation (N.38) we obtain
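\[
\begin{aligned}
&[K^{(1)}_{x\text{-orb}} + K^{(1)}_{x\text{-spin-orb}}, P_{0x}]_P \\
&= -\left[\frac{p_1^2 r_{1x}}{2m_1c^2}, p_{1x}\right]_P - \left[\frac{p_2^2 r_{2x}}{2m_2c^2}, p_{2x}\right]_P
 - \left[\frac{q_1q_2}{8\pi c^2}\frac{r_{1x}+r_{2x}}{r}, p_{1x}\right]_P - \left[\frac{q_1q_2}{8\pi c^2}\frac{r_{1x}+r_{2x}}{r}, p_{2x}\right]_P \\
&= -\frac{p_1^2}{2m_1c^2} - \frac{p_2^2}{2m_2c^2} - \frac{q_1q_2}{4\pi c^2 r} \\
&= -\frac{1}{c^2}H^{(0)}
\end{aligned}
\]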
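The orbital part of (N.38) can be spot-checked numerically. The sketch below evaluates the Poisson bracket by central finite differences at one arbitrary phase-space point, using the convention [r, p]_P = 1 implied by the brackets above. The numerical values of the masses, the charge product and c, the state layout, and all function names are assumptions made only for this illustration.

```python
import math

# Arbitrary test values (assumptions of this sketch, not physical constants).
M1, M2, Q1Q2, C = 1.3, 0.7, 0.9, 5.0

# Phase-space point layout:
# s = [r1x, r1y, r1z, r2x, r2y, r2z, p1x, p1y, p1z, p2x, p2y, p2z]

def dist(s):
    """r = |r1 - r2|."""
    return math.sqrt(sum((s[i] - s[i + 3]) ** 2 for i in range(3)))

def h0(s):
    """H^(0): non-relativistic kinetic energies plus the Coulomb potential."""
    p1sq = sum(s[6 + i] ** 2 for i in range(3))
    p2sq = sum(s[9 + i] ** 2 for i in range(3))
    return p1sq / (2 * M1) + p2sq / (2 * M2) + Q1Q2 / (4 * math.pi * dist(s))

def k1_orb_x(s):
    """x-component of K^(1)_orb."""
    p1sq = sum(s[6 + i] ** 2 for i in range(3))
    p2sq = sum(s[9 + i] ** 2 for i in range(3))
    return (-p1sq * s[0] / (2 * M1 * C ** 2)
            - p2sq * s[3] / (2 * M2 * C ** 2)
            - Q1Q2 * (s[0] + s[3]) / (8 * math.pi * C ** 2 * dist(s)))

def p0x(s):
    """x-component of the total momentum P0."""
    return s[6] + s[9]

def partial(f, s, k, h=1e-5):
    """Central finite-difference derivative of f with respect to s[k]."""
    up, dn = list(s), list(s)
    up[k] += h
    dn[k] -= h
    return (f(up) - f(dn)) / (2 * h)

def poisson(f, g, s):
    """[f, g]_P = sum_k (df/dr_k * dg/dp_k - df/dp_k * dg/dr_k)."""
    return sum(partial(f, s, k) * partial(g, s, k + 6)
               - partial(f, s, k + 6) * partial(g, s, k)
               for k in range(6))

STATE = [0.3, -0.2, 0.5, -0.4, 0.1, 0.25,
         0.6, -0.1, 0.2, -0.3, 0.4, 0.15]

def n38_residual(s=STATE):
    """[K^(1)_x-orb, P0x]_P + H^(0)/c^2, which should vanish."""
    return poisson(k1_orb_x, p0x, s) + h0(s) / C ** 2

print(n38_residual())
```

The spin-orbit part of the boost does not depend on positions, so its bracket with P0x is zero and it is omitted from this check.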
Individual terms on the right hand side of (N.39) are
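\[
\begin{aligned}
&[K^{(1)}_{x\text{-orb}}, H^{(0)}]_P \\
&= -\frac{p_1^2}{4m_1^2c^2}[r_{1x}, p_1^2]_P - \frac{q_1q_2r_{1x}}{8\pi m_1c^2}\left[p_1^2, \frac{1}{r}\right]_P
 - \frac{p_2^2}{4m_2^2c^2}[r_{2x}, p_2^2]_P - \frac{q_1q_2r_{2x}}{8\pi m_2c^2}\left[p_2^2, \frac{1}{r}\right]_P \\
&\quad - \frac{q_1q_2}{16\pi m_1c^2r}[r_{1x}, p_1^2]_P - \frac{q_1q_2r_{1x}}{16\pi m_1c^2}\left[\frac{1}{r}, p_1^2\right]_P
 - \frac{q_1q_2r_{2x}}{16\pi m_1c^2}\left[\frac{1}{r}, p_1^2\right]_P - \frac{q_1q_2r_{1x}}{16\pi m_2c^2}\left[\frac{1}{r}, p_2^2\right]_P \\
&\quad - \frac{q_1q_2}{16\pi m_2c^2r}[r_{2x}, p_2^2]_P - \frac{q_1q_2r_{2x}}{16\pi m_2c^2}\left[\frac{1}{r}, p_2^2\right]_P \\
&= -\frac{p_1^2p_{1x}}{2m_1^2c^2} - \frac{q_1q_2(r_{1x}-r_{2x})(\mathbf{p}_1\cdot\mathbf{r})}{8\pi m_1c^2r^3}
 - \frac{p_2^2p_{2x}}{2m_2^2c^2} - \frac{q_1q_2(r_{1x}-r_{2x})(\mathbf{p}_2\cdot\mathbf{r})}{8\pi m_2c^2r^3} \\
&\quad - \frac{q_1q_2p_{1x}}{8\pi m_1c^2r} - \frac{q_1q_2p_{2x}}{8\pi m_2c^2r}
\end{aligned}
\tag{N.41}
\]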

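\[
\begin{aligned}
[K^{(1)}_{x\text{-spin-orb}}, H^{(0)}]_P
&= \frac{1}{2c^2}\left[\frac{1}{m_1}[\mathbf{s}_1\times\mathbf{p}_1]_x + \frac{1}{m_2}[\mathbf{s}_2\times\mathbf{p}_2]_x,\; \frac{q_1q_2}{4\pi r}\right]_P \\
&= \frac{q_1q_2[\mathbf{s}_1\times\mathbf{r}]_x}{8\pi m_1c^2r^3} - \frac{q_1q_2[\mathbf{s}_2\times\mathbf{r}]_x}{8\pi m_2c^2r^3}
\end{aligned}
\tag{N.42}
\]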
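\[
\begin{aligned}
&[K^{(0)}_x, H^{(1)}_{\text{orb}}]_P \\
&= \frac{1}{8m_1^2c^2}[r_{1x}, p_1^4]_P + \left[r_{1x},\; \frac{q_1q_2}{8\pi m_2c^2r}\left((\mathbf{p}_1\cdot\mathbf{p}_2) + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right)\right]_P \\
&\quad + \frac{1}{8m_2^2c^2}[r_{2x}, p_2^4]_P + \left[r_{2x},\; \frac{q_1q_2}{8\pi m_1c^2r}\left((\mathbf{p}_1\cdot\mathbf{p}_2) + \frac{(\mathbf{p}_1\cdot\mathbf{r})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right)\right]_P \\
&= \frac{p_1^2p_{1x}}{2m_1^2c^2} + \frac{q_1q_2}{8\pi m_2c^2r}\left(p_{2x} + \frac{(r_{1x}-r_{2x})(\mathbf{p}_2\cdot\mathbf{r})}{r^2}\right) \\
&\quad + \frac{p_2^2p_{2x}}{2m_2^2c^2} + \frac{q_1q_2}{8\pi m_1c^2r}\left(p_{1x} + \frac{(\mathbf{p}_1\cdot\mathbf{r})(r_{1x}-r_{2x})}{r^2}\right)
\end{aligned}
\tag{N.43}
\]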
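\[
\begin{aligned}
&[K^{(0)}_x, H^{(1)}_{\text{spin-orb}}]_P \\
&= \left[-m_1r_{1x} - m_2r_{2x},\; -\frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_1}{8\pi m_1^2c^2r^3}
 + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_2}{8\pi m_2^2c^2r^3}
 + \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_2]\cdot\mathbf{s}_1}{4\pi m_1m_2c^2r^3}
 - \frac{q_1q_2[\mathbf{r}\times\mathbf{p}_1]\cdot\mathbf{s}_2}{4\pi m_1m_2c^2r^3}\right]_P \\
&= -\frac{q_1q_2[\mathbf{s}_2\times\mathbf{r}]_x}{8\pi m_2c^2r^3} + \frac{q_1q_2[\mathbf{s}_1\times\mathbf{r}]_x}{8\pi m_1c^2r^3}
 + \frac{q_1q_2[\mathbf{s}_2\times\mathbf{r}]_x}{4\pi m_2c^2r^3} - \frac{q_1q_2[\mathbf{s}_1\times\mathbf{r}]_x}{4\pi m_1c^2r^3} \\
&= -\frac{q_1q_2[\mathbf{s}_1\times\mathbf{r}]_x}{8\pi m_1c^2r^3} + \frac{q_1q_2[\mathbf{s}_2\times\mathbf{r}]_x}{8\pi m_2c^2r^3}
\end{aligned}
\tag{N.44}
\]
\[
[K^{(0)}_x, H^{(1)}_{\text{spin-spin}}]_P = 0 \tag{N.45}
\]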
Summing up the right hand sides of equations (N.41) - (N.45) we see that
equation (N.39) is, indeed, satisfied. For equation (N.40) we obtain
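\[
\begin{aligned}
&[K^{(1)}_{x\text{-orb}}, K^{(0)}_y]_P + [K^{(0)}_x, K^{(1)}_{y\text{-orb}}]_P + [K^{(1)}_{x\text{-spin-orb}}, K^{(0)}_y]_P + [K^{(0)}_x, K^{(1)}_{y\text{-spin-orb}}]_P \\
&= \frac{r_{1x}}{2c^2}[p_1^2, r_{1y}]_P + \frac{r_{2x}}{2c^2}[p_2^2, r_{2y}]_P + \frac{r_{1y}}{2c^2}[r_{1x}, p_1^2]_P + \frac{r_{2y}}{2c^2}[r_{2x}, p_2^2]_P \\
&\quad - \frac{1}{2c^2}\left[-\frac{1}{m_1}s_{1z}p_{1y} - \frac{1}{m_2}s_{2z}p_{2y},\; m_1r_{1y} + m_2r_{2y}\right]_P \\
&\quad - \frac{1}{2c^2}\left[m_1r_{1x} + m_2r_{2x},\; \frac{1}{m_1}s_{1z}p_{1x} + \frac{1}{m_2}s_{2z}p_{2x}\right]_P \\
&= -\frac{1}{c^2}[\mathbf{r}_1\times\mathbf{p}_1]_z - \frac{1}{c^2}[\mathbf{r}_2\times\mathbf{p}_2]_z - \frac{1}{c^2}(s_{1z}+s_{2z}) = -\frac{1}{c^2}J_{0z}
\end{aligned}
\tag{N.46}
\]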
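The spin-independent parts of (N.39) and (N.40) can also be spot-checked numerically. The following sketch sets both spins to zero, so that only the orbital terms survive, and evaluates the Poisson brackets by central finite differences with the convention [r, p]_P = 1. The test values, the state layout and the helper names are assumptions introduced for illustration only.

```python
import math

# Arbitrary test values (assumptions of this sketch).
M1, M2, Q1Q2, C = 1.3, 0.7, 0.9, 5.0

def parts(s):
    """Split s = [r1, r2, p1, p2] (3 components each); also return r = r1 - r2."""
    r1, r2, p1, p2 = s[0:3], s[3:6], s[6:9], s[9:12]
    return r1, r2, p1, p2, [a - b for a, b in zip(r1, r2)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def h0(s):
    _, _, p1, p2, r = parts(s)
    return (dot(p1, p1) / (2 * M1) + dot(p2, p2) / (2 * M2)
            + Q1Q2 / (4 * math.pi * math.sqrt(dot(r, r))))

def h1_orb(s):
    _, _, p1, p2, r = parts(s)
    rr = math.sqrt(dot(r, r))
    darwin = dot(p1, p2) + dot(p1, r) * dot(p2, r) / rr ** 2
    return (-dot(p1, p1) ** 2 / (8 * M1 ** 3 * C ** 2)
            - dot(p2, p2) ** 2 / (8 * M2 ** 3 * C ** 2)
            - Q1Q2 * darwin / (8 * math.pi * M1 * M2 * C ** 2 * rr))

def k0(axis):
    """Component of K^(0) = -m1 r1 - m2 r2."""
    return lambda s: -M1 * s[axis] - M2 * s[3 + axis]

def k1_orb(axis):
    """Component of K^(1)_orb."""
    def component(s):
        r1, r2, p1, p2, r = parts(s)
        rr = math.sqrt(dot(r, r))
        return (-dot(p1, p1) * r1[axis] / (2 * M1 * C ** 2)
                - dot(p2, p2) * r2[axis] / (2 * M2 * C ** 2)
                - Q1Q2 * (r1[axis] + r2[axis]) / (8 * math.pi * C ** 2 * rr))
    return component

def partial(f, s, k, h=1e-5):
    up, dn = list(s), list(s)
    up[k] += h
    dn[k] -= h
    return (f(up) - f(dn)) / (2 * h)

def poisson(f, g, s):
    """[f, g]_P with positions at indices 0..5 and momenta at 6..11."""
    return sum(partial(f, s, k) * partial(g, s, k + 6)
               - partial(f, s, k + 6) * partial(g, s, k)
               for k in range(6))

def j0z_orbital(s):
    """z-component of [r1 x p1] + [r2 x p2]."""
    r1, r2, p1, p2, _ = parts(s)
    return r1[0] * p1[1] - r1[1] * p1[0] + r2[0] * p2[1] - r2[1] * p2[0]

STATE = [0.3, -0.2, 0.5, -0.4, 0.1, 0.25,
         0.6, -0.1, 0.2, -0.3, 0.4, 0.15]

def n39_spinless_residual(s=STATE):
    """Sum of (N.41) and (N.43), which should vanish."""
    return poisson(k1_orb(0), h0, s) + poisson(k0(0), h1_orb, s)

def n40_orbital_residual(s=STATE):
    """Orbital brackets of (N.46) plus orbital J0z / c^2, which should vanish."""
    lhs = poisson(k1_orb(0), k0(1), s) + poisson(k0(0), k1_orb(1), s)
    return lhs + j0z_orbital(s) / C ** 2

print(n39_spinless_residual(), n40_orbital_residual())
```

Finite differences are used here instead of symbolic algebra so that the check needs only the standard library. The spin-dependent terms (N.42) and (N.44) cancel each other separately and are not covered by this sketch.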
Appendix O
Dimensionality checks
In our formulas in this book we chose to show explicitly all fundamental
constants, like c and ħ, rather than adopt the usual convention ħ = c = 1. This
makes our expressions slightly lengthier, but has the benefit of easier control
of dimensions and checking correctness at each calculation step. In this
subsection we are going to suggest a few rules for such dimension estimates in
formulas involving quantum fields.
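From the familiar formula
\[
\int d\mathbf{p}\,\delta(\mathbf{p}) = 1
\]
it follows that the dimension of the delta function is¹
\[
\langle\delta(\mathbf{p})\rangle = \frac{1}{\langle p^3\rangle}
\]
Then (anti)commutation relations of creation and annihilation operators
\[
\{a^{\dagger}_{\mathbf{p},\sigma}, a_{\mathbf{p}',\sigma'}\} = \delta(\mathbf{p}-\mathbf{p}')\delta_{\sigma,\sigma'}
\]
\[
[c_{\mathbf{p},\tau}, c^{\dagger}_{\mathbf{p}',\tau'}] = \delta(\mathbf{p}-\mathbf{p}')\delta_{\tau,\tau'}
\]
suggest that dimensions of these operators are
\[
\langle a^{\dagger}_{\mathbf{p},\sigma}\rangle = \langle a_{\mathbf{p}',\sigma'}\rangle = \langle c_{\mathbf{p},\tau}\rangle = \langle c^{\dagger}_{\mathbf{p}',\tau'}\rangle = \frac{1}{\langle p^{3/2}\rangle}
\tag{O.1}
\]
1
Angle brackets ⟨A⟩ denote the dimension of an observable A, as it has been introduced
in subsection 2.3.1. For example, ⟨p⟩ = ⟨m⟩⟨v⟩ = ⟨E⟩/⟨v⟩ denotes the dimension of
momentum. Note that dimension of the 4D delta function (M.1) is ⟨E^{-1}⟩⟨p^{-3}⟩.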
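In the definition of the Dirac’s quantum field (J.26)
\[
\psi(\mathbf{x},t) = \int\frac{d\mathbf{p}}{(2\pi\hbar)^{3/2}}\sqrt{\frac{mc^2}{\omega_{\mathbf{p}}}}\sum_{\sigma}
\left(e^{-\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\,u(\mathbf{p},\sigma)\,a_{\mathbf{p},\sigma} + e^{\frac{i}{\hbar}\tilde{p}\cdot\tilde{x}}\,v(\mathbf{p},\sigma)\,b^{\dagger}_{\mathbf{p},\sigma}\right)
\]
4-vectors p̃ and x̃ have dimensions of energy ⟨p̃⟩ = ⟨E⟩ and time ⟨x̃⟩ = ⟨t⟩,
respectively. The dimension of the Planck’s constant is ⟨ħ⟩ = ⟨p⟩⟨r⟩ =
⟨E⟩⟨t⟩, which implies that arguments (i/ħ) p̃·x̃ of exponents are dimensionless, as
expected. Functions u and v are dimensionless as well.² Then the dimension
of the Dirac quantum field is
\[
\langle\psi\rangle = \frac{\langle p^3\rangle}{\langle\hbar^{3/2}\rangle\langle p^{3/2}\rangle}
= \frac{\langle p^{3/2}\rangle}{\langle\hbar^{3/2}\rangle} = \frac{1}{\langle r^{3/2}\rangle}
\]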
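Similarly, we obtain the dimension of the photon’s quantum field (K.2)³
\[
\langle A\rangle = \frac{\langle\hbar\rangle\langle c^{1/2}\rangle}{\langle r^{3/2}\rangle\langle p^{1/2}\rangle}
= \frac{\langle p^{1/2}\rangle\langle c^{1/2}\rangle}{\langle r^{1/2}\rangle}
\tag{O.2}
\]
current density operator (L.1)
\[
\langle j\rangle = \langle e c\overline{\psi}\psi\rangle = \frac{\langle e\rangle\langle c\rangle}{\langle r^3\rangle}
\]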
2
see (J.40) - (J.43)
3
In different texts one can find various definitions of quantum fields, which can differ
from definitions adopted here by their numerical factors and dimensions. However, as we
stress in subsection 17.4.2, quantum fields do not correspond to any observable quantities.
They are just formal mathematical objects, whose role is to provide convenient “building
blocks” for interaction operators (9.13), (9.14) and (9.16). So, there is a significant freedom
in choosing concrete forms of quantum fields. All these choices should lead to the same
forms of the physically meaningful interaction operators V1 , V2 , and Z.
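and potential energy (9.13)⁴
\[
\langle V_1\rangle = \frac{1}{\langle c\rangle}\langle r^3\rangle\,\frac{\langle e\rangle\langle c\rangle}{\langle r^3\rangle}\,\frac{\langle p^{1/2}\rangle\langle c^{1/2}\rangle}{\langle r^{1/2}\rangle}
= \frac{\langle e\rangle\langle p^{1/2}\rangle\langle c^{1/2}\rangle}{\langle r^{1/2}\rangle}
= \frac{\langle e\rangle\langle\hbar^{1/2}\rangle\langle c^{1/2}\rangle}{\langle r\rangle}
= \frac{\langle e^2\rangle}{\langle r\rangle}
\]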
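This is exactly the dimension of energy, as one can expect from the Coulomb
law V = e^2/(4πr). The 2nd order QED potential (9.14) also has the dimension
of energy
\[
\langle V_2\rangle = \frac{1}{\langle c^2\rangle}\,\frac{\langle r^3\rangle\langle r^3\rangle}{\langle r\rangle}\,\frac{\langle e\rangle\langle c\rangle}{\langle r^3\rangle}\,\frac{\langle e\rangle\langle c\rangle}{\langle r^3\rangle}
= \frac{\langle e^2\rangle}{\langle r\rangle}
\]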
By following the same rules it is easy to establish that all three terms in the
potential boost (9.16) have the dimension ⟨m⟩⟨r⟩, as expected.
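The bookkeeping above can be automated. The sketch below represents a dimension by rational exponents of mass, length and time and repeats the steps leading to ⟨ψ⟩, ⟨A⟩, ⟨j⟩, ⟨V1⟩ and ⟨V2⟩. The class `Dim`, the choice of base quantities and the variable names are assumptions made for this illustration. The relation ⟨e²⟩ = ⟨ħ⟩⟨c⟩ is the one used in the text, and G is taken to have the dimension 1/⟨r⟩, as in the expression for ⟨V2⟩.

```python
from fractions import Fraction

class Dim:
    """Exponents of (mass, length, time); this base choice is an assumption."""

    def __init__(self, mass=0, length=0, time=0):
        self.exps = (Fraction(mass), Fraction(length), Fraction(time))

    def __mul__(self, other):
        return Dim(*(a + b for a, b in zip(self.exps, other.exps)))

    def __truediv__(self, other):
        return Dim(*(a - b for a, b in zip(self.exps, other.exps)))

    def __pow__(self, power):
        return Dim(*(a * Fraction(power) for a in self.exps))

    def __eq__(self, other):
        return isinstance(other, Dim) and self.exps == other.exps

    def __repr__(self):
        return "Dim(mass=%s, length=%s, time=%s)" % self.exps

HALF = Fraction(1, 2)
ONE = Dim()                                   # dimensionless quantity

m, r, t = Dim(mass=1), Dim(length=1), Dim(time=1)
v = r / t                                     # velocity
c = v                                         # speed of light
p = m * v                                     # momentum
E = p * v                                     # energy
hbar = p * r                                  # <hbar> = <p><r>
e = (hbar * c) ** HALF                        # from <e^2> = <hbar><c>

delta3 = ONE / p ** 3                         # 3D delta function in momentum
ladder = delta3 ** HALF                       # creation/annihilation operators (O.1)
psi = p ** 3 * ladder / hbar ** Fraction(3, 2)            # Dirac field
A = hbar * c ** HALF / (r ** Fraction(3, 2) * p ** HALF)  # photon field (O.2)
j = e * c * psi * psi                         # current density
V1 = r ** 3 * j * A / c                       # first-order potential energy
V2 = r ** 3 * r ** 3 * j * j / (c ** 2 * r)   # second-order potential energy

print(psi, A, j, V1, V2)
```

Comparing `V1 == E` and `V2 == E` then reproduces the conclusion that both potentials have the dimension of energy. Keeping exponents as `Fraction` values avoids rounding problems with half-integer powers.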
Let us illustrate the dimensionality checks on the example of the scattering
amplitude (9.32). The S-operator is a dimensionless quantity and particle
creation-annihilation operators have the dimension ⟨p^{-3/2}⟩. Therefore, the
dimension of the matrix element ⟨0|a_{q,τ} d_{p,σ} S_2 d†_{p′,σ′} a†_{q′,τ′}|0⟩ is expected to be
⟨p^{-6}⟩. Turning to the final result (9.34) we may note that according to (9.28)
\[
\langle\delta^4(p)\rangle = \frac{1}{\langle E\rangle\langle p^3\rangle}
\]
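Then the dimension of (9.34)
\[
\frac{\langle e^2\rangle\langle c^2\rangle}{\langle\hbar\rangle\langle E\rangle\langle p^3\rangle\langle E\rangle}
= \frac{\langle c^3\rangle}{\langle E^3\rangle\langle p^3\rangle} = \frac{1}{\langle p^6\rangle}
\]
is consistent with expectations.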
Note also that d^4x ≡ dt dx and d^4p ≡ dE dp, so
\[
\langle d^4x\rangle = \langle t\rangle\langle r^3\rangle
\]
\[
\langle d^4p\rangle = \langle E\rangle\langle p^3\rangle
\]
4
This expression was simplified by using ⟨e^2⟩ = ⟨ħ⟩⟨c⟩, which follows from the fact that
α ≡ e^2/(4πħc) ≈ 1/137 is the dimensionless fine structure constant.
Bibliography
[AB59] Y. Aharonov and D. Bohm. Significance of electromagnetic

potentials in quantum mechanics. Phys. Rev., 115:485, 1959.
[ACG+ 71] D. S. Ayres, A. M. Cormack, A. J. Greenberg, R. W. Kenney,

D. O. Caldwell, V. B. Elings, W. P. Hesse, and R. J. Morrison.
Measurements of the lifetime of positive and negative pions.
Phys. Rev. D., 3:1051, 1971.
[AD78a] D. Aerts and I. Daubechies. About the structure-preserving

maps of a quantum mechanical propositional system. Helv.
Phys. Acta, 51:637, 1978.
[AD78b] D. Aerts and I. Daubechies. Physical justification for using the

tensor product to describe two quantum systems as one joint
system. Helv. Phys. Acta, 51:661, 1978.
[AFKW64] T. Alväger, F. J. M. Farley, J. Kjellman, and I. Wallin. Test

of the second postulate of special relativity in the GeV region.
Phys. Lett., 12:260, 1964.
[AHR04] J. M. Aguirregabiria, A. Hernández, and M. Rivas. Linear mo-

mentum density in quasistatic electromagnetic systems, 2004.
arXiv:physics/0404139.
[AN13] H. Aichmann and G. Nimtz. The superluminal tunneling story,

2013. arXiv:1304.3155v1.
[AP65] A. B. Arons and M. B. Peppard. Einstein’s proposal of the

photon concept - a translation of the Annalen der Physik paper
of 1905. Am. J. Phys., 33:367, 1965.
[APV88] Y. Aharonov, P. Pearle, and L. Vaidman. Comment on

”Proposed Aharonov-Casher effect: Another example of an
Aharonov-Bohm effect arising from a classical lag”. Phys. Rev.
A, 37:4052, 1988.
[Are72] I. Ya. Aref’eva. Renormalized scattering theory for the Lee
model. Theor. Math. Phys., 12:859, 1972.
[AW75] S. M. W. Ahmad and E. P. Wigner. Invariant theoretic deriva-
tion of the connection between momentum and velocity. Nuovo
Cimento A, 28:1, 1975.
[Bac89] H. Bacry. The notions of localizability and space: from Eugene
Wigner to Alain Connes. Nucl. Phys. Proc. Suppl., 6:222, 1989.
[Bac04] H. Bacry. The foundations of the Poincaré group and the
validity of general relativity. Rep. Math. Phys., 53:443, 2004.
[Bak61] B. Bakamjian. Relativistic particle dynamics. Phys. Rev.,
121:1849, 1961.
[Bal98] L. E. Ballentine. Quantum Mechanics: A Modern Develop-
ment. World Scientific, Singapore, 1998.
[Bar12] S. J. Barnett. On electromagnetic induction and relative mo-
tion. Phys. Rev., 35:323, 1912.
[BBC+ 77] J. Bailey, K. Borer, F. Combley, H. Drumm, F. Krienen,
F. Lange, E. Picasso, W. von Ruden, F. J. M. Farley, J. H.
Field, W. Flegel, and P. M. Hattersley. Measurements of rel-
ativistic time dilatation for positive and negative muons in a
circular orbit. Nature, 268:301, 1977.
[BCOR09] S. Blanes, F. Casas, J.A. Oteo, and J. Ros. The Magnus ex-
pansion and some of its applications. Phys. Rep., 470:151,
2009. arXiv:0810.5488v1.
[BD64] J. D. Bjorken and S. D. Drell. Relativistic quantum mechanics.
McGraw-Hill, New York, 1964.
[BD65] J. D. Bjorken and S. D. Drell. Relativistic quantum fields.
McGraw-Hill, New York, 1965.
[BD97] Ph. Balcou and L. Dutriaux. Dual optical tunneling times in

frustrated total internal reflection. Phys. Rev. Lett., 78:851,
1997.
[Ber65] R. A. Berg. Position and intrinsic spin operators in quantum
theory. J. Math. Phys., 6:34, 1965.
[BF62] B. Barsella and E. Fabri. Angular momenta in relativistic
many-body theory. Phys. Rev., 128:451, 1962.
[BJK69] G. E. Brown, D. D. Jackson, and T. T. S. Kuo. Nucleon-
nucleon potentials and minimal relativity. Nucl. Phys.,
A113:481, 1969.
[BLP01] V. B. Berestetskiı̆, E. M. Lifshitz, and L. P. Pitaevskiı̆. Quantum
electrodynamics. Fizmatlit, Moscow, 2001. (in Russian).
[Blu60] L. E. Blumenson. A derivation of n-dimensional spherical co-
ordinates. Am. Math. Monthly, 67:63, 1960.
[Boy05] T. H. Boyer. The paradoxical forces for the classical electro-
magnetic lag associated with the Aharonov-Bohm phase shift,
2005. arXiv:physics/0506180v1.
[Boy06] T. H. Boyer. Darwin-Lagrangian analysis for the interaction
of a point charge and a magnet: Considerations related to
the controversy regarding the Aharonov-Bohm and Aharonov-
Casher phase shifts. J. Phys. A:Math. Gen., 39:3455, 2006.
arXiv:physics/0506181v1.
[Boy07a] T. H. Boyer. Comment on experiments related to the
Aharonov-Bohm phase shift, 2007. arXiv:0708.3194v1.
[Boy07b] T. H. Boyer. Unresolved classical electromagnetic aspects of
the Aharonov-Bohm phase shift, 2007. arXiv:0709.0661v1.
[Boy08] T. H. Boyer. Illustrating some implications of the conservation
laws in relativistic mechanics, 2008. arXiv:0812.1017v1.
[BRBG09] D. Babson, S. P. Reynolds, R. Bjorkquist, and D. J. Griffiths.
Hidden momentum, field momentum and electromagnetic im-
pulse. Am. J. Phys., 77:826, 2009.
[Bre68] E. Breitenberger. Magnetic interactions between charged par-

ticles. Am. J. Phys., 36:505, 1968.
[Bro05] H. R. Brown. Physical relativity: Space-time structure from a

dynamical perspective. Oxford University Press, Oxford, 2005.
[BT53] B. Bakamjian and L. H. Thomas. Relativistic particle dynam-

ics. II. Phys. Rev., 92:1300, 1953.
[Bud09] N. V. Budko. Observation of locally negative velocity of

the electromagnetic field in free space. Phys. Rev. Lett.,
102:020401, 2009.
[Bud10] N. V. Budko. Superluminal, subluminal, and negative

velocities in free-space electromagnetic propagation, 2010.
arXiv:1006.5576v1.
[But69] J. W. Butler. A proposed electromagnetic momentum-energy

4-vector for charged bodies. Am. J. Phys., 37:1258, 1969.
[BvN36] G. Birkhoff and J. von Neumann. The logic of quantum me-

chanics. Ann. Math., 37:823, 1936.
[BY94] D. Buchholz and J. Yngvason. There are no causality problems

with Fermi’s two atom system. Phys. Rev. Lett., 73:613, 1994.
arXiv:hep-th/9403027.
[Can65] D. J. Candlin. Physical operators and the representations of

the inhomogeneous Lorentz group. Nuovo Cim., 37:1396, 1965.
[Car00] S. Carlip. Aberration and the speed of gravity. Phys. Lett. A,

267:81, 2000. arXiv:gr-qc/9909087.
[Car05] R. Carroll. Remarks on photons and the aether, 2005.

[CB09] A. Caprez and H. Batelaan. Feynman’s relativistic electrody-

namics paradox and the Aharonov-Bohm effect. Found. Phys.,
39:295, 2009.
[CBB07] A. Caprez, B. Barwick, and H. Batelaan. A macroscopic test of

the Aharonov-Bohm effect. Phys. Rev. Lett., 99:210401, 2007.
arXiv:0708.2428v1.
[CCTS13] G. Cavalleri, E. Cesaroni, E. Tonni, and G. Spavieri.
Interpretation of the longitudinal forces detected in a recent
experiment of electrodynamics. Eur. Phys. J. D, 26:221, 2003.
[Cha60] R. G. Chambers. Shift of an electron interference pattern by

enclosed magnetic flux. Phys. Rev. Lett., 5:3, 1960.
[Cha64] A. J. Chakrabarti. On the canonical relativistic kinematics of

N-particle systems. J. Math. Phys., 5:922, 1964.
[CJS63] D. G. Currie, T. F. Jordan, and E. C. G. Sudarshan. Relativis-

tic invariance and Hamiltonian theories of interacting particles.
Rev. Mod. Phys., 35:350, 1963.
[CO70] F. E. Close and H. Osborn. Relativistic center-of-mass mo-

tion and the electromagnetic interaction of systems of charged
particles. Phys. Rev. D, 2:2127, 1970.
[Com96] E. Comay. Exposing ”hidden momentum”. Am. J. Phys.,

64:1028, 1996.
[Com97] E. Comay. Decomposition of electromagnetic fields into radi-

ation and bound components. Am. J. Phys., 65:862, 1997.
[Com00] E. Comay. Lorentz transformation of a system carrying ”hid-

den momentum”. Am. J. Phys., 68:1007, 2000.
[Cor86] F. H. J. Cornish. An electric dipole in self-accelerated trans-

verse motion. Am. J. Phys., 54:166, 1986.
[CP82] F. Coester and W. N. Polyzou. Relativistic quantum mechanics

of particles with direct interactions. Phys. Rev. D, 26:1348,
1982.
[CR03] P. Caban and J. Rembielinski. Photon polarization and

Wigner’s little group, 2003. arXiv:quant-ph/0304120.
[CSR96] A. E. Chubykalo and R. Smirnov-Rueda. Action at a distance

as a full-value solution of Maxwell equations: basis and appli-
cation of separated potential’s method. Phys. Rev. E, 53:5373,
1996.
[Cul52] E. G. Cullwick. Electromagnetic momentum and Newton’s
third law. Nature, 170:425, 1952.
[CV68] S. Coleman and J. H. Van Vleck. Origin of “hidden momentum
forces” on magnets. Phys. Rev., 171:1370, 1968.
[CZJW00] J. J. Carey, J. Zawadzka, D. A. Jaroszynski, and K. Wynne.
Noncausal time response in frustrated total internal reflection?
Phys. Rev. Lett., 84:1431, 2000.
[DG73] G. Dillon and M. M. Giannini. On the clothing transformation
in the Lee model. Nuovo Cim., 18A:31, 1973.
[DG75] G. Dillon and M. M. Giannini. On the potential description of
the V N −NNθ sector of the Lee model. Nuovo Cim., 27A:106,
1975.
[Dir49] P. A. M. Dirac. Forms of relativistic dynamics. Rev. Mod.
Phys., 21:392, 1949.
[DLM12] J. M. Dahlström, A. L’Huillier, and A. Maquet. Introduction
to attosecond delays in photoionization. J. Phys. B: At. Mol.
Opt. Phys., 45:183001, 2012.
[dlT04a] A. C. de la Torre. Understanding light quanta: First quan-
tization of the free electromagnetic field, 2004. arXiv:quant-
ph/0410171.
[dlT04b] A. C. de la Torre. Understanding light quanta: The photon,
2004. arXiv:quant-ph/0410179.
[dlT05] A. C. de la Torre. Understanding light quanta: Construc-
tion of the free electromagnetic field, 2005. arXiv:quant-
ph/0503023v2.
[Dol76] J. D. Dollard. Interpretation of Kato’s invariance principle in
scattering theory. J. Math. Phys., 17:46, 1976.
[DS10] I. Dubovyk and O. Shebeko. The method of unitary clothing

transformations in the theory of nucleon–nucleon scattering.
Few-Body Systems, 48:109, 2010.
[dSFP+ 12] R. de Sangro, G. Finocchiaro, P. Patteri, M. Piccolo, and

G. Pizzella. Measuring propagation speed of Coulomb fields,
2012. arXiv:1211.2913v2.
[DW65] H. Van Dam and E. P. Wigner. Classical relativistic mechanics

of interacting point particles. Phys. Rev., B138:1576, 1965.
[Dys51] F. J. Dyson. The renormalization method in quantum electro-

dynamics. Proc. Roy. Soc., A207:395, 1951.
[EF13] H. Essén and M. C. N. Fiolhais. The Darwin-Breit magnetic

interaction and superconductivity, 2013. arXiv:1312.1607v1.
[Ein05] A. Einstein. Zur Elektrodynamik bewegter Körper. Annalen

der Physik, 17:891, 1905.
[Ein20] A. Einstein. Relativity: The Special and General Theory.

Methuen and Co, 1920.
[Ein49] A. Einstein. in Albert Einstein: Philosopher-Scientist. Open

Court, Peru, 1949.
[EKL76] W. F. Edwards, C. S. Kenyon, and D. K. Lemon. Continuing

investigation into possible electric fields arising from steady
conductor currents. Phys. Rev. D, 14:922, 1976.
[Eks60] H. Ekstein. Equivalent Hamiltonians in scattering theory.

Phys. Rev., 117:1590, 1960.
[EKU62] H. Ezawa, K. Kikkawa, and H. Umezawa. Potential represen-

tation in quantum field theory. Nuovo Cim., 23:751, 1962.
[EL08] A. Einstein and J. Laub. Über die elektromagnetischen
Grundgleichungen für bewegte Körper. Ann. Phys. (Leipzig), 26:532,
1908.
[EN93] A. Enders and G. Nimtz. Evanescent-mode propagation and

quantum tunneling. Phys. Rev. E, 48:632, 1993.
[Eng05] W. Engelhardt. Instantaneous interaction between charged

particles, 2005. arXiv:physics/0511172v1.
[EPC+ 08] P. Eckle, A. N. Pfeiffer, C. Cirelli, A. Staudte, R. Dörner, H. G.

Muller, M. Büttiker, and U. Keller. Attosecond ionization
and tunneling delay time measurements in Helium. Science,
322:1525, 2008.
[Ess95] H. Essén. A study of lattice and magnetic interactions of con-

duction electrons. Phys. Scr., 52:388, 1995.
[Ess96] H. Essén. Darwin magnetic interaction energy and its macro-

scopic consequences. Phys. Rev. E, 53:5228, 1996.
[Ess99] H. Essén. Magnetism of matter and phase space energy of

charged particle systems. J. Phys. A: Math. Gen., 32:2297,
1999.
[Ess07] H. Essén. Circulating electrons, superconductivity, and the

Darwin-Breit interaction, 2007. arXiv:cond-mat/0002096.
[Fad63] L. D. Faddeev. On the separation of self-interaction and scat-

tering effects in perturbation theory. Dokl. Akad. Nauk SSSR,
152:573, 1963.
[Far92] F. J. M. Farley. The CERN (g-2) measurements. Z. Phys. C,

56:S88, 1992.
[Fey49] R. P. Feynman. Space-time approach to quantum electrody-

namics. Phys. Rev., 76:769, 1949.
[Fey85] R. P. Feynman. Q.E.D. Princeton University Press, 1985.
[FGR78] L. Fonda, G. C. Ghirardi, and A. Rimini. Decay theory of

unstable quantum systems. Rep. Prog. Phys., 41:587, 1978.
[Fie97] J. H. Field. A new kinematical derivation of the Lorentz trans-

formation and the particle description of light. Helv. Phys.
Acta, 70:542, 1997. arXiv:physics/0410062.
[Fie04] J. H. Field. On the relationship of quantum mechanics to

classical electromagnetism and classical relativistic mechanics,
2004. arXiv:physics/0403076.
[Fie06a] J. H. Field. Classical electromagnetism as a consequence

of Coulomb’s law, special relativity and Hamilton’s principle
and its relationship to quantum electrodynamics. Phys. Scr.,
74:702, 2006. arXiv:physics/0501130v5.
[Fie06b] J. H. Field. Space-time transformation properties of inter-

charge forces and dipole radiation: Breakdown of the
classical field concept in relativistic electrodynamics, 2006.
[Fiv70] D. I. Fivel. Solutions of the Lee model in all sectors by dy-

namical algebra. J. Math. Phys., 11:699, 1970.
[FM96] G. Fiore and G. Modanese. General properties of the decay am-

plitudes for massless particles. Nucl. Phys., B477:623, 1996.
[FN94] W. I. Fushchich and A. G. Nikitin. Symmetries of equations

of quantum mechanics. New York, 1994.
[Fol61] L. L. Foldy. Relativistic particle systems with interaction.

Phys. Rev., 122:275, 1961.
[Fou] D. J. Foulis. A half century of quan-

tum logic. What have we learned?
http://www.quantonics.com/Foulis On Quantum Logic.html
.
[Fra07] J. Franklin. The nature of electromagnetic energy, 2007.

arXiv:0707.3421v2.
[Fri94] A. Friedman. Nonstandard extension of quantum logic and

Dirac’s bra-ket formalism of quantum mechanics. Int. J.
Theor. Phys., 33:307, 1994.
[Frö52] H. Fröhlich. Interaction of electrons with lattice vibrations.

Proc. Roy. Soc. (London), A 215:291, 1952.
[Frö61] H. Fröhlich. The theory of the superconductive state. Rep.

Prog. Phys., 24:1, 1961.
[FS64] R. Fong and J. Sucher. Relativistic particle dynamics and the

S-matrix. J. Math. Phys., 5:456, 1964.
[FS73] V. A. Fateev and A. S. Shvarts. Dressing operators in quantum

field theory. Dokl. Akad. Nauk SSSR, 209:66, 1973. [English
translation in Sov. Phys. Dokl. 18 (1973), 165.].
[FS88] G. Feinberg and J. Sucher. Two-photon-exchange force be-

tween charged systems: Spinless particles. Phys. Rev. D,
38:3763, 1988.
[Fur69] W. H. Furry. Examples of momentum distributions in the

electromagnetic field and in matter. Am. J. Phys., 37:621,
1969.
[Gal01] Galileo Galilei. Dialogues Concerning the Two Chief World

Systems. Modern Library Science Series, New York, 2001.
[Gal05] E. A. Galapon. Theory of confined quantum time of arrivals,

2005. arXiv:quant-ph/0504174.
[Gau03] N. Gauthier. What happens to energy and momentum when

two oppositely-moving wave pulses overlap? Am. J. Phys.,
71:787, 2003.
[GI91a] G. C. Giakos and T. K. Ishii. Anomalous microwave propaga-

tion in open space. Microwave and Optical Technology Letters,
4:79, 1991.
[GI91b] G. C. Giakos and T. K. Ishii. Rapid pulsed microwave prop-

agation. IEEE Microwave and Guided Wave Letters, 1:374,
1991.
[GJR01] N. Graneau, T. Phipps Jr., and D. Roscoe. An experimen-

tal confirmation of longitudinal electrodynamic forces. Europ.
Phys. J. D, 15:87, 2001.
[GL] C. Giunti and M. Laveder. Neutrino mixing. in Develop-

ments in Quantum Physics, edited by F. H. Columbus and V.
Krasnoholovets, (Nova Science, New York, 2004) pp. 197-254,
arXiv:hep-ph/0310238.
[Gla97] St. D. Glazek. Similarity renormalization group approach

to boost invariant Hamiltonian dynamics, 1997. arXiv:hep-
th/9712188.
[Gle57] A. M. Gleason. Measures on the closed subspaces of a Hilbert

space. J. Math. Mech., 6:885, 1957.
[GR80] S. N. Gupta and S. F. Radford. Quantum field-theoretical elec-

tromagnetic and gravitational two-particle potentials. Phys.
Rev. D, 21:2213, 1980.
[GR00] I. S. Gradshteyn and I. M. Ryzhik. Tables of Integrals, Series,

and Products. Academic Press, San Diego, 2000.
[Gri86] D. J. Griffiths. Electrostatic levitation of a dipole. Am. J.

Phys., 54:744, 1986.
[GRI89] S. N. Gupta, W. W. Repko, and C. J. Suchyta III. Muonium

and positronium potentials. Phys. Rev. D, 40:4100, 1989.
[GRT96] N. Grot, C. Rovelli, and R. S. Tate. Time-of-arrival in quantum

mechanics, 1996. arXiv:quant-ph/9603021v1.
[GS58] O. W. Greenberg and S. S. Schweber. Clothed particle oper-

ators in simple models of quantum field theory. Nuovo Cim.,
8:378, 1958.
[Gup63] A. K. Das Gupta. Unipolar machines, association of the mag-

netic field with the field-producing magnet. Am. J. Phys.,
31:428, 1963.
[GW64] M. L. Goldberger and K. M. Watson. Collision theory. J. Wiley

& Sons, New York, 1964.
[GW93] St. D. Glazek and K. G. Wilson. Renormalization of hamilto-

nians. Phys. Rev. D, 48:5863, 1993.
[GZL04] T. P. Gill, W. W. Zachary, and J. Lindesay. The classical

electron problem, 2004. arXiv:physics/0405131v1.
[Har62] T. E. Hartman. Tunneling of a wave packet. J. Appl. Phys.,

33:3427, 1962.
[HBH+ 01] J. B. Hertzberg, S. R. Bickman, M. T. Hummon, Jr. D. Krause,

S. K. Peck, and L. R. Hunter. Measurement of the relativistic
potential difference across a rotating magnetic dielectric cylin-
der. Am. J. Phys., 69:648, 2001.
[HC01] H. Halvorson and R. Clifton. No place for particles in rela-

tivistic quantum theories?, 2001. arXiv:quant-ph/0103041.
[Heg98] G. C. Hegerfeldt. Instantaneous spreading and Einstein causal-

ity in quantum theory. Ann. Phys. (Leipzig), 7:716, 1998.
arXiv:quant-ph/9809030.
[Hei58] W. Heisenberg. Physics and Philosophy. Harper and Brothers,

New York, 1958.
[HMS79] D. Hasselkamp, E. Mondry, and A. Scharmann. Direct obser-

vation of the transversal Doppler-shift. Z. Physik A, 289:151,
1979.
[HN00] A. Haibel and G. Nimtz. Universal tunneling time in photonic

barriers, 2000. arXiv:physics/0009044.
[HN08] G. C. Hegerfeldt and J. T. Neumann. The Aharonov-Bohm

effect: the role of tunneling and associated forces, 2008.
arXiv:0801.0799v2.
[Hni04] V. Hnizdo. On linear momentum in quasistatic electromag-

netic systems, 2004. arXiv:physics/0407027v1.
[Hob12] A. Hobson. There are no particles, there are only fields, 2012.
arXiv:1204.4616.
[Hol04] B. R. Holstein. Effective interactions and the hydrogen atom.

Am. J. Phys., 72:333, 2004.
[Hov55] L. Van Hove. Energy corrections and persistent perturbation

effects in continuous spectra. Physica, 21:901, 1955.
[Hov56] L. Van Hove. Energy corrections and persistent perturba-

tion effects in continuous spectra. II. The perturbed stationary
states. Physica, 22:343, 1956.
[How44] G. W. O. Howe. A problem of two electrons and Newton’s

third law. Wireless Engineer, 21:105, 1944.
[Hsi00] W. Y. Hsiang. Lectures on Lie Groups. World Scientific, Sin-

gapore, 2000.
[Hua13] K. Huang. A critical history of renormalization, 2013.

arXiv:1310.5533v1.
[III13] R. J. DeJonghe III. Rebuilding Mathemat-

ics on a Quantum Logical Foundation. Uni-
versity of Illinois, Chicago, 2013. PhD Thesis,
http://indigo.uic.edu/bitstream/handle/10027/10195/DeJonghe Richard.pdf?sequence=
[IPL99] M. Ibison, H. E. Puthoff, and S. R. Little. The speed of gravity

revisited, 1999. arXiv:physics/9910050.
[IS38] H. E. Ives and G. R. Stilwell. An experimental study of the

rate of a moving clock. J. Opt. Soc. Am, 28:215, 1938.
[IS41] H. E. Ives and G. R. Stilwell. An experimental study of the

rate of a moving clock. II. J. Opt. Soc. Am, 31:369, 1941.
[Ito65] T. Itoh. Derivation of nonrelativistic Hamiltonian for electrons

from quantum electrodynamics. Rev. Mod. Phys., 37:159,
1965.
[Jac99] J. D. Jackson. Classical electrodynamics. J. Wiley and Sons,

3rd edition, 1999.
[Jac04] J. D. Jackson. Torque or no torque? Simple charged particle

motion observed in different inertial frames. Am. J. Phys.,
72:1484, 2004.
[Jau71] J. M. Jauch. Projective representation of the Poincaré group

in a quaternionic Hilbert space. in Group theory and its ap-
plications, edited by E.M. Loebl. Academic Press, New York,
1971.
[Jef99a] O. D. Jefimenko. A relativistic paradox seemingly violating

conservation of momentum law in electromagnetic systems.
Eur. J. Phys., 20:39, 1999.
[Jef99b] O. D. Jefimenko. The Trouton-Noble paradox. J. Phys. A:

Math. Gen., 32:3755, 1999.
[Joh96] L. Johansson. Longitudinal electrodynamic forces -

and their possible technological applications, 1996.
MSc Thesis, Lund Institute of Technology, Sweden,
http://www.df.lth.se/ snorkelf/LongitudinalMSc.pdf.
[Jor77] T. F. Jordan. Identification of the velocity operator for an

irreducible unitary representation of the Poincaré group. J.
Math. Phys., 18:608, 1977.
[Jor80] T. F. Jordan. Simple derivation of the Newton-Wigner position

operator. J. Math. Phys., 21:2028, 1980.
[Kac59] C. Kacser. Higher Born approximations in non-relativistic

Coulomb scattering. Nuovo Cim., 13:303, 1959.
[Kaz71] E. Kazes. Analytic theory of relativistic interactions. Phys.

Rev. D, 4:999, 1971.
[KCS07] V. Yu. Korda, L. Canton, and A. V. Shebeko. Relativistic

interactions for the meson-two-nucleon system in the clothed-
particle unitary representation. Ann. Phys., 322:736, 2007.
arXiv:nucl-th/0603025v1.
[Kei94] B. D. Keister. Forms of relativistic dynamics: What are the

possibilities?, 1994. arXiv:nucl-th/9406032.
[Kel42] J. M. Keller. Newton’s third law and electrodynamics. Am. J.

Phys., 10:302, 1942.
[Ken17] E. H. Kennard. On unipolar induction: Another experiment

and its significance as evidence for the existence of the aether.
Phil. Mag., 33:179, 1917.
[KF74] R. A. Krajcik and L. L. Foldy. Relativistic center-of-mass vari-

ables for composite systems with arbitrary internal interac-
tions. Phys. Rev. D, 10:1777, 1974.
[Kha97] L. A. Khalfin. Quantum theory of unstable particles

and relativity, 1997. Preprint of Steklov Mathemati-
cal Institute, St. Petersburg Department, PDMI-6/1997
http://www.pdmi.ras.ru/preprint/1997/97-06.html.
[Kho] A. L. Kholmetskii. The author’s collection of

relativistic paradoxes in classical electrodynam-
ics. http://www.space-lab.ru/files/pages/PIRT VII-
XII/pages/text/PIRT IX/Kholmetskii 1.pdf.
[Kho03] A. L. Kholmetskii. One century later: Remarks on the Barnett

experiment. Am. J. Phys., 71:558, 2003.
[Kho04] A. L. Kholmetskii. Remarks on momentum and energy flux of

a non-radiating electromagnetic field. Ann. Found. Louis de
Broglie, 29:549, 2004.
[Kho05] A. L. Kholmetskii. On momentum and energy

of a non-radiating electromagnetic field, 2005.
[Kho06] A. L. Kholmetskii. Momentum-energy of the non-radiating

electromagnetic field: open problems? Phys. Scr., 73:620,
2006.
[Kit66] H. Kita. A non-trivial example of a relativistic quantum the-

ory of particles without divergence difficulties. Progr. Theor.
Phys., 35:934, 1966.
[Kit68] H. Kita. Another convergent relativistic model theory of in-

teracting particles. Progr. Theor. Phys., 39:1333, 1968.
[Kit70] H. Kita. Structure of the state space in a convergent relativistic

model theory of interacting particles. Progr. Theor. Phys.,
43:1364, 1970.
[Kit72a] H. Kita. A model of relativistic quantum mechanics of inter-

acting particles. Progr. Theor. Phys., 48:2422, 1972.
[Kit72b] H. Kita. Vertex functions in convergent relativistic model the-

ories. Progr. Theor. Phys., 47:2140, 1972.
[Kit73] H. Kita. A realistic model of convergent relativistic quan-

tum mechanics of interacting particles. Progr. Theor. Phys.,
49:1704, 1973.
[KMS90] A. Yu. Korchin, Yu. P. Mel’nik, and A. V. Shebeko. Angu-

lar distributions and polarization of protons in the d(e,e’p)n
reaction. Few-Body Systems, 9:211, 1990.
[KMSR07a] A. L. Kholmetskii, O. V. Missevitch, and R. Smirnov-Rueda.

Measurement of propagation velocity of bound electromagnetic
fields in near zone. J. Appl. Phys., 102:013529, 2007.
[KMSR+ 07b] A. L. Kholmetskii, O. V. Missevitch, R. Smirnov-Rueda, R. I.

Tzonchev, A. E. Chubykalo, and I. Moreno. Experimen-
tal evidence on non-applicability of the standard retarda-
tion condition to bound magnetic fields and on new gener-
alized Biot-Savart law. J. Appl. Phys., 101:023532, 2007.
[KP91] B. D. Keister and W. N. Polyzou. Relativistic Hamiltonian

dynamics in nuclear and particle physics. in Advances in Nu-
clear Physics vol. 20, edited by J. W. Negele and E. W. Vogt.
Plenum Press, 1991. http://www.physics.uiowa.edu/~wpolyzou/papers/rev.pdf.
[KPR85] M. Kaivola, O. Poulsen, and E. Riis. Measurement of the

relativistic Doppler shift in neon. Phys. Rev. Lett., 54:255,
1985.
[KS04] V. Yu. Korda and A. V. Shebeko. Clothed particle represen-

tation in quantum field theory: Mass renormalization. Phys.
Rev. D, 70:085011, 2004.
[KSO97] M. Kobayashi, T. Sato, and H. Ohtsubo. Effective interactions

for mesons and baryons in nuclei. Progr. Theor. Phys., 98:927,
1997.
[KV02] A. Kislev and L. Vaidman. Relativistic causality and conser-

vation of energy in classical electromagnetic theory. Am. J.
Phys., 70:1216, 2002. arXiv:physics/0201042v1.
[KvB06] J. Kofler and Č. Brukner. Classical world because of quantum

physics, 2006. arXiv:quant-ph/0609079.
[KY07a] A. L. Kholmetskii and T. Yarman. Apparent paradoxes in

classical electrodynamics: relativistic transformation of force.
Eur. J. Phys., 28:537, 2007.
[KY07b] A. L. Kholmetskii and T. Yarman. Relativistic transforma-

tion of force: resolution of apparent paradoxes. Eur. J. Phys.,
28:1081, 2007.
[KY08] A. L. Kholmetskii and T. Yarman. Energy flow in a bound

electromagnetic field: resolution of apparent paradoxes. Eur.
J. Phys., 29:1135, 2008.
[LEK92] D. K. Lemon, W. F. Edwards, and C. S. Kenyon. Phys. Lett.

A, 162:105, 1992.
[Liv47] I. M. Livshitz. JETP, 11:1017, 1947.
[LK75] A. R. Lee and T. M. Kalotas. Lorentz transformations from

the first postulate. Am. J. Phys., 43:434, 1975.
[LL76] J.-M. Lévy-Leblond. One more derivation of the Lorentz trans-

formation. Am. J. Phys., 44:271, 1976.
[LL77] L. Landau and E. Lifshitz. Course of theoretical physics, Vol-

ume 3, Quantum mechanics, Non-relativistic theory. Perga-
mon, 1977.
[LLB90] J.-M. Lévy-Leblond and F. Balibar. Quantics. North Holland,

1990.
[Mac63] G. W. Mackey. The mathematical foundations of quantum me-

chanics. W. A. Benjamin, New York, 1963. see esp. Section
2-2.
[Mac86] D. W. MacArthur. Special relativity: Understanding experi-

mental tests and formulations. Phys. Rev. A, 33:1, 1986.
[Mac00] G. W. Mackey. The theory of unitary group representations.

University of Chicago Press, Chicago, 2000.
[Mag54] W. Magnus. On the exponential solution of differential equa-

tions for a linear operator. Commun. Pure Appl. Math., 7:649,
1954.
[Mal96] D. B. Malament. In defence of dogma: Why there cannot be

a relativistic quantum mechanics of (localizable) particles. in
Perspectives on quantum reality, edited by R. Clifton. Kluwer,
1996.
[Man12] M. Mansuripur. Trouble with the Lorentz law of force: Incom-

patibility with special relativity and momentum conservation.
Phys. Rev. Lett., 108:193901, 2012. arXiv:1205.0096v1.
[Mat75] T. Matolcsi. Tensor product of Hilbert lattices and free or-

thodistributive product of orthomodular lattices. Acta Sci.
Math. (Szeged), 37:263, 1975.
[McDa] K. T. McDonald. Cullwick’s paradox: Charged particle on the

axis of a toroidal magnet. http://puhep1.princeton.edu/~mcdonald/examples/cullwick.pdf.
[McDb] K. T. McDonald. The Wilson-Wilson experiment.

http://128.112.100.2/~mcdonald/examples/wilson.pdf.
[McD00] K. T. McDonald. Limits on the applicability of classical elec-

tromagnetic fields as inferred from the radiation reaction, 2000.
[McD06] K. T. McDonald. Onoochin’s paradox,

2006. http://puhep1.princeton.edu/~mcdonald/examples/onoochin.pdf.
[Mer09] N. D. Mermin. What’s bad about this habit. Physics Today,
May:8, 2009.
[MIB03] G. Matteucci, D. Iencinella, and C. Beeli. The Aharonov-Bohm
phase shift and Boyer’s critical considerations: New experi-
mental result but still an open subject? Found. Phys., 33:577,
2003.
[Mit02] P. Mittelstaedt. Quantum physics and classical physics - in
the light of quantum logic, 2002. arXiv:quant-ph/0211021.
[MKSR11] O. V. Missevitch, A. L. Kholmetskii, and R. Smirnov-Rueda.
Anomalously small retardation of bound (force) electromag-
netic fields in antenna near zone. Europhys. Lett., 93:64004,
2011.
[MM97] A. H. Monahan and M. McMillan. Lorentz boost of the
Newton-Wigner-Pryce position operator. Phys. Rev. D,
56:2563, 1997.
[MRR00] D. Mugnai, A. Ranfagni, and R. Ruggeri. Observation of su-
perluminal behaviors in wave propagation. Phys. Rev. Lett.,
84:4830, 2000.
[Mut78] U. Mutze. A no-go theorem concerning the cluster decomposi-
tion property of direct interaction scattering theories. J. Math.
Phys., 19:231, 1978.
[NFRS78] D. Newman, G. W. Ford, A. Rich, and E. Sweetman. Precision
experimental verification of special relativity. Phys. Rev. Lett.,
40:1355, 1978.
[Nik08] H. Nikolić. Time in relativistic and nonrelativistic quantum
mechanics, 2008. arXiv:0811.1905v1.
[NPS12] G. Naumenko, Yu. Popov, and M. Shevelev. Direct obser-
vation of a semi-bare electron Coulomb field recover, 2012.
arXiv:1112.1649v1.
[NPSP10] G. A. Naumenko, A. P. Potylitsyn, L. G. Sukhikh, and

Yu. A. Popov. The investigation of relativistic electron electro-
magnetic field features during interaction with matter, 2010.
arXiv:1006.2477v1.
[NW49] T. D. Newton and E. P. Wigner. Localized states for elemen-

tary systems. Rev. Mod. Phys., 21:400, 1949.
[OMK+ 86] N. Osakabe, T. Matsuda, T. Kawasaki, J. Endo, A. Tono-

mura, S. Yano, and H. Yamada. Experimental confirmation of
Aharonov-Bohm effect using a toroidal magnetic field confined
by a superconductor. Phys. Rev. A, 34:815, 1986.
[opu59] J. Łopuszański. The Ruijgrok-van Hove model of field theory

in terms of “dressed” operators. Physica, 25:745, 1959.
[ORU98] J. Oppenheim, B. Reznik, and W. G. Unruh. Time-of-arrival

states, 1998. arXiv:quant-ph/9807043.
[Osb68] H. Osborn. Relativistic center-of-mass variables for two-

particle system with spin. Phys. Rev., 176:1514, 1968.
[Par02] S. Parrott. Radiation from a uniformly accelerated charge

and the equivalence principle. Found. Phys., 32:407, 2002.
arXiv:gr-qc/9303025.
[Par05] S. Parrott. Variant forms of Eliezer’s theorem, 2005. arXiv:gr-

qc/0505042.
[Pin04] M. J. Pinheiro. Do Maxwell’s equations need a revision? - A

methodological note, 2004. arXiv:physics/0511103v3.
[Pir64] C. Piron. Axiomatique quantique. Helv. Phys. Acta, 37:439,

1964.
[Pir76] C. Piron. Foundations of Quantum Physics. W. A. Benjamin,

Reading, 1976.
[PL66] P. Pechukas and J. C. Light. On the exponential form of

time-displacement operators in quantum mechanics. J. Chem.
Phys., 44:3897, 1966.
[PN45] L. Page and N. I. Adams Jr. Action and reaction between

moving charges. Am. J. Phys., 13:141, 1945.
[Pol85] W. N. Polyzou. Manifestly covariant, Poincaré invariant quan-
tum theories of directly interacting particles. Phys. Rev. D,
32:995, 1985.
[Pol01] R. Polishchuk. Derivation of the Lorentz transformations,
2001. arXiv:physics/0110076.
[Pol03] W. N. Polyzou. Relativistic quantum mechanics - particle pro-
duction and cluster properties. Phys. Rev. C, 68:015202, 2003.
arXiv:nucl-th/0302023.
[Pry48] M. H. L. Pryce. The mass-centre in the restricted theory of
relativity and its connexion with the quantum theory of ele-
mentary particles. Proc. Royal Soc. London, Ser. A, 195:62,
1948.
[PS95a] G. N. Pellegrini and A. R. Swift. Maxwell’s equations in a
rotating medium: Is there a problem? Am. J. Phys., 63:694,
1995.
[PS95b] M. E. Peskin and D. V. Schroeder. An introduction to quantum
field theory. Westview Press, 1995.
[PS98a] A. Pineda and J. Soto. The Lamb shift in dimensional regu-
larization. Phys. Lett. B, 420:391, 1998.
[PS98b] A. Pineda and J. Soto. Potential NRQED: The positronium
case, 1998. arXiv:hep-ph/9805424.
[RB99] F. Richman and D. Bridges. A constructive proof of Gleason’s
theorem. J. Funct. Anal., 162:287, 1999.
[Rec09] E. Recami. Superluminal waves and objects: an overview
of the relevant experiments. J. Phys.: Conference Series,
196:012020, 2009.
[Red53] M. L. G. Redhead. Radiative corrections to the scattering
of electrons and positrons by electrons. Proc. Roy. Soc. A,
220:219, 1953.
[RF69] R. A. Reck and D. L. Fry. Orbital and spin magnetization in

Fe-Co, Fe-Ni, and Ni-Co. Phys. Rev., 184:492, 1969.
[RFPM93] A. Ranfagni, P. Fabeni, G. P. Pazzi, and D. Mugnai. Anoma-
lous pulse delay in microwave propagation: A plausible con-
nection to the tunneling time. Phys. Rev. E, 48:1453, 1993.
[RGC01] M. T. Reiten, D. Grischkowsky, and R. A. Cheville. Optical
tunneling of single-cycle terahertz bandwidth pulses. Phys.
Rev. E, 64:036604, 2001.
[RGR91] H. Rubio, J. M. Getino, and O. Rojo. The Aharonov-Bohm ef-
fect as a classical electromagnetic effect using electromagnetic
potentials. Nuovo Cim. B, 106:407, 1991.
[RH41] B. Rossi and D. B. Hall. Variation of the rate of decay of
mesotron with momentum. Phys. Rev., 59:223, 1941.
[Rit61] V. I. Ritus. Transformations of the inhomogeneous Lorentz
group and the relativistic kinematics of polarized states. Soviet
Physics JETP, 13:240, 1961.
[RM96] A. Ranfagni and D. Mugnai. Anomalous pulse delay in mi-
crowave propagation: A case of superluminal behavior. Phys.
Rev. E, 54:5692, 1996.
[RMGC01] M. T. Reiten, K. McClatchey, D. Grischkowsky, and R. A.
Cheville. Incidence-angle selection and spatial reshaping of
terahertz pulses in optical tunneling. Optics Lett., 26:1900,
2001.
[RMM74] T. N. Rescigno, C. W. McCurdy, and V. McKoy. Discrete
basis set approach to nonspherical scattering. Chem. Phys.
Lett., 27:401, 1974.
[RMR+ 80] C. E. Roos, J. Marraffino, S. Reucroft, J. Waters, M. S. Web-
ster, E. G. H. Williams, A. Manz, R. Settles, and G. Wolf. Σ±
lifetimes and longitudinal acceleration. Nature, 286:244, 1980.
[Rob00] T. Roberts. What is the experimen-
tal basis of special relativity?, 2000.
http://math.ucr.edu/home/baez/physics/Relativity/SR/experiments.html.
[Roh] F. Rohrlich. The theory of the electron.

http://www.philsoc.org/1962Spring/1526transcript.html.
[Roh60] F. Rohrlich. Self-energy and stability of the classical electron.

Am. J. Phys., 28:639, 1960.
[Rom66] R. H. Romer. Angular momentum of static electromagnetic

fields. Am. J. Phys., 34:772, 1966.
[Ros57] M. E. Rose. Elementary theory of angular momentum. John

Wiley & Sons, New York, 1957.
[Ros93] W. G. V. Rosser. Classical electromagnetism and relativity: A

moving magnetic dipole. Am. J. Phys., 61:371, 1993.
[Rud91] W. Rudin. Functional Analysis. McGraw-Hill, New York,

1991.
[Rui] Th. W. Ruijgrok. On localisation in relativistic quantum me-

chanics. in Lecture Notes in Physics, Theoretical Physics.
Fin de Siècle, vol. 539, edited by A. Borowiec, W. Cegla, B.
Jancewicz, and W. Karwowski, (Springer, Berlin, 2000), pp.
52-74.
[Rui59] Th. W. Ruijgrok. Exactly renormalizable model in quantum

field theory. III. Renormalization in the case of two V-particles.
Physica, 25:357, 1959.
[Rui98] Th. W. Ruijgrok. General requirements for a relativistic quantum

theory. Few-Body Systems, 25:5, 1998.
[Rus05] G. Russo. Conditions for the generation of causal paradoxes

from superluminal signals. Electronic J. Theor. Phys., 8:36,
2005.
[Sak67] J. Sakurai. Advanced quantum mechanics. Addison-Wesley,

Reading, Mass., 1967.
[Sar47] R. D. Sard. The forces between moving charges. Electrical
Engineering, January:61, 1947.
[Sar82] D. A. Sardelis. Unified derivation of the Galileo and the

Lorentz transformations. Eur. J. Phys., 3:96, 1982.
[Sat66] S. Sato. Some remarks on the formulation of the theory of

elementary particles. Progr. Theor. Phys., 35:540, 1966.
[SC92] G. Spavieri and G. Cavalleri. Interpretation of the Aharonov-

Bohm and the Aharonov-Casher effects in terms of classical
electromagnetic fields. Europhys. Lett., 18:301, 1992.
[Sch] S. Schleif. What is the experimen-

tal basis of the special relativity theory?
http://www.weburbia.demon.co.uk/physics/experiments.html.
[Sch35] E. Schrödinger. Die gegenwärtige Situation in der Quantenmechanik.

Naturwissenschaften, 23:807, 823, 844, 1935.
[Sch61] S. S. Schweber. An introduction to relativistic quantum field

theory. Row, Peterson & Co., Evanston, Il, 1961.
[Sch84] H. M. Schwartz. Deduction of the general Lorentz transfor-

mations from a set of necessary assumptions. Am. J. Phys.,
52:346, 1984.
[SF11] A. V. Shebeko and P. A. Frolov. A possible way for construct-

ing generators of the Poincaré group in quantum field theory,
2011. arXiv:1107.5877v1.
[SG03] G. Spavieri and G. T. Gillies. Fundamental tests of electro-

dynamic theories: Conceptual investigations of the Trouton-
Noble and hidden momentum effects. Nuovo Cim., 118B:205,
2003.
[Shi72] M. Shirokov. Quantum field theory: “dressing” contra diver-

gencies, 1972. Preprint JINR, P2-6454, Dubna.
[Shi93] M. I. Shirokov. Dressing and bound states in quantum field

theory, 1993. Preprint JINR E4-93-55, Dubna.
[Shi94] M. I. Shirokov. Bound states of “dressed” particles, 1994.

Preprint JINR E2-94-82, Dubna.
[Shi04] M. I. Shirokov. Decay law of moving unstable particle. Int. J.

Theor. Phys., 43:1541, 2004.
[Shi06] M. I. Shirokov. Evolution in time of moving unstable systems.

Concepts of Physics, 3:193, 2006. arXiv:quant-ph/0508087.
[Shi07] M. I. Shirokov. “Dressing” and Haag’s theorem, 2007.

arXiv:math-ph/0703021.
[SJ67] W. Shockley and R. P. James. “Try simplest cases” discovery

of “hidden momentum” forces on “magnetic currents”. Phys.
Rev. Lett., 18:876, 1967.
[Sok75] S. N. Sokolov. Physical equivalence of the point and instan-

taneous forms of relativistic dynamics. Theor. Math. Phys.,
24:799, 1975.
[SS78] S. N. Sokolov and A. N. Shatnii. Physical equivalence of the

three forms of relativistic dynamics and addition of interac-
tions in the front and instant forms. Theor. Math. Phys.,
37:1029, 1978.
[SS98] A. V. Shebeko and M. I. Shirokov. Relativistic quantum field

theory (RQFT) treatment of few-body systems. Nucl. Phys.,
A631:564c, 1998.
[SS01] A. V. Shebeko and M. I. Shirokov. Unitary transformations

in quantum field theory and bound states. Phys. Part. Nucl.,
32:15, 2001. arXiv:nucl-th/0102037v1.
[SSS+ 02] G. G. Shishkin, A. G. Shishkin, A. G. Smirnov, A. V. Dudarev,

A. V. Barkov, P. P. Zagnetov, and Yu. M. Rybin. Investigation
of possible electric potential arising from a constant current
through a superconductor coil. J. Phys. D: Applied physics,
35:497, 2002.
[Ste96] E. V. Stefanovich. Quantum effects in relativistic decays. Int.

J. Theor. Phys., 35:2539, 1996.
[Ste01] E. V. Stefanovich. Quantum field theory without infinities.

Ann. Phys. (NY), 292:139, 2001.
[Ste02] E. V. Stefanovich. Is Minkowski space-time compatible with

quantum mechanics? Found. Phys., 32:673, 2002.
[Ste05] E. V. Stefanovich. Renormalization and dressing in quantum

field theory, 2005. arXiv:hep-th/0503076.
[Ste06] E. V. Stefanovich. Violations of Einstein’s time dilation for-

mula in particle decays, 2006. arXiv:physics/0603043v2.
[Ste08] E. V. Stefanovich. Classical electrodynamics without fields and

the Aharonov-Bohm effect, 2008. arXiv:0803.1326v2.
[Ste13] A. M. Steane. The non-existence of the self-accelerating dipole,

and related questions, 2013. arXiv:1311.5798v1.
[Sto32] M. H. Stone. On one-parameter unitary groups in Hilbert

space. Ann. Math., 33:643, 1932.
[Str04] F. Strocchi. Relativistic quantum mechanics and field theory.

Found. Phys., 34:501, 2004. arXiv:hep-th/0401143.
[Stu60] E. C. G. Stueckelberg. Quantum theory in real Hilbert space.

Helv. Phys. Acta, 33:727, 1960.
[Tan59] S. Tani. Formal theory of scattering in the quantum field the-

ory. Phys. Rev., 115:711, 1959.
[Tay09] G. I. Taylor. Proc. Cam. Phil. Soc., 15:114, 1909.
[Teu96] S. A. Teukolsky. The explanation of the Trouton-Noble exper-

iment revisited. Am. J. Phys., 64:1104, 1996.
[The62] J. W. Then. Experimental study of the motional electromotive

force. Am. J. Phys., 30:411, 1962.
[Tho52] L. H. Thomas. The relativistic dynamics of a system of parti-

cles interacting at a distance. Phys. Rev., 85:868, 1952.
[TN04] F. T. Trouton and H. R. Noble. The mechanical forces acting

on a charged electric condenser moving through space. Phil.
Trans. Roy. Soc. London A, 202:165, 1904.
[TOM+ 86] A. Tonomura, N. Osakabe, T. Matsuda, T. Kawasaki, J. Endo,

S. Yano, and H. Yamada. Evidence for Aharonov-Bohm effect
with magnetic field completely shielded from electron wave.
Phys. Rev. Lett., 56:792, 1986.
[TY13] M. Tuval and A. Yahalom. Newton’s third law in the frame-

work of special relativity, 2013. arXiv:1302.2537v1.
[Uhl63] U. Uhlhorn. Representation of symmetry transformations in

quantum mechanics. Arkiv f. Phys., 23:307, 1963.
[Urb14] K. Urbanowski. Decay law of relativistic particles: Quantum

theory meets special relativity. Phys. Lett. B, 737:346, 2014.
arXiv:1408.6564v1.
[vN31] J. von Neumann. Die Eindeutigkeit der Schrödingerschen Op-

erationen. Math. Ann., 104:570, 1931.
[VS74] M. M. Vişinesku and M. I. Shirokov. Perturbation approach

to the field theory, “dressing” and divergences. Rev. Roum.
Phys., 19:461, 1974.
[Wal70] R. Walter. Recoil effects in scalar-field model. Nuovo Cim.,

68A:426, 1970.
[Wal98] T. S. Walhout. Similarity renormalization, Hamiltonian flow

equations, and Dyson’s intermediate representation, 1998.
arXiv:hep-th/9806097.
[Wal00] W. D. Walker. Experimental evidence of near-field

superluminally propagating electromagnetic fields, 2000.
[Wal01] D. Wallace. Emergence of particles from bosonic quantum field

theory, 2001. arXiv:quant-ph/0112149.
[Web99] A. Weber. Bloch–Wilson Hamiltonian and a generalization of

the Gell-Mann–Low theorem, 1999. arXiv:hep-th/9911198.
[Web05] A. Weber. Fine and hyperfine structure in different bound

systems, 2005. arXiv:hep-ph/0509019.
[Wei64a] S. Weinberg. Photons and gravitons in S-matrix theory:

Derivation of charge conservation and equality of gravitational
and inertial mass. Phys. Rev., 135:B1049, 1964.
[Wei64b] S. Weinberg. The quantum theory of massless particles. in

Lectures on Particles and Field Theory, vol. 2, edited by S.
Deser and K. W. Ford. Prentice-Hall, Englewood Cliffs, 1964.
[Wei65] S. Weinberg. Photons and gravitons in perturbation theory:

Derivation of Maxwell’s and Einstein’s equations. Phys. Rev.,
138:B988, 1965.
[Wei95] S. Weinberg. The Quantum Theory of Fields, Vol. 1. Univer-

sity Press, Cambridge, 1995.
[Wei97] S. Weinberg. What is quantum field theory, and what did we

think it is?, 1997. arXiv:hep-th/9702027.
[Wes98] J. P. Wesley. Induction produces Aharonov-Bohm effect. Ape-

iron, 5:73, 1998.
[WFF+ 10] W. Withayachumnankul, B. M. Fischer, B. Ferguson, B. R.

Davis, and D. Abbott. A systemized view of superluminal
wave propagation. Proc. IEEE, 98:1, 2010.
[WHSK+ 95] M. Weitz, A. Huber, F. Schmidt-Kaler, D. Leibfried,

W. Vassen, C. Zimmermann, K. Pachucki, T. W. Hänsch,
L. Julien, and F. Biraben. Precision measurement of the 1S
ground-state Lamb shift in atomic hydrogen and deuterium by
frequency comparison. Phys. Rev. A, 52:2664, 1995.
[Wig31] E. P. Wigner. Gruppentheorie und Ihre Anwendung auf die

Quantenmechanik der Atomspektren. F. Vieweg und Sohn,
Braunschweig, 1931.
[Wig39] E. P. Wigner. On unitary representations of the inhomoge-

neous Lorentz group. Ann. Math., 40:149, 1939.
[Wil99] F. Wilczek. Quantum field theory. Rev. Mod. Phys., 71:S58,

1999. arXiv:hep-th/9803075.
[WL02] A. Weber and N. E. Ligterink. The generalized Gell-Mann–

Low theorem for relativistic bound states. Phys. Rev. D,
65:025009, 2002. arXiv:hep-ph/0101149.
[WL05] A. Weber and N. E. Ligterink. Bound states in Yukawa model,

2005. arXiv:hep-ph/0506123.
[WM62] G. H. Weiss and A. A. Maradudin. The Baker-Hausdorff for-

mula and a problem in crystal physics. J. Math. Phys., 3:771,
1962.
[WW13] M. Wilson and H. A. Wilson. On the electric effect of rotating

a magnetic insulator in a magnetic field. Proc. R. Soc. London,
Ser. A, 89:99, 1913.
[WWS+ 12] R. E. Wagner, M. R. Ware, E. V. Stefanovich, Q. Su, and

R. Grobe. A study of local and non-local spatial densities in
quantum field theory. Phys. Rev. A, 85:022121, 2012.
[WX06] Z.-Y. Wang and C.-D. Xiong. Arrival time in relativistic quan-
tum mechanics, 2006. arXiv:quant-ph/0608031.
[You04] T. Young. Experimental demonstration of the general law of

the interference of light. Philosophical Transactions, Royal
Soc. London, 94:1, 1804. (reprinted in Great Experiments in
Physics, Morris Shamos, ed. (Holt, Rinehart and Winston, New
York, 1959), p. 96.).
[Zub00] F. S. G. Von Zuben. Quantum time and spatial localization.

in Position Location and Navigation Symposium. IEEE, San
Diego, 2000.
Index
< “less than”, 20 active rotation, 615, 622

Sp(. . . , . . .), span of subspaces, 41 active transformation, 615
[. . . , . . .] Lie bracket, 633 addition of interactions, 191
[. . . , . . .] commutator, 631, 647 adiabatic switching, 226
[. . . , . . .]P Poisson bracket, 209 adjoint field, 694
∩, intersection of subspaces, 41 adjoint operator, 646
◦, 258 Aharonov-Bohm effect, 513
/k slash notation, 693 angular momentum, 104, 116
÷, 307 annihilation, 590
↔ compatibility, 36 annihilation operator, 244, 246
≤ “less than or equal to”, 20 anticommutator, 264
⊥ orthocomplement, 22 antilinear functional, 640
τ̃ 4-vector, 677 antilinear operator, 650
expr , 223 antisymmetric tensor, 633
\underbrace{expr}, 223 antisymmetric wave function, 174
antiunitary operator, 85, 86, 650
∨ join, 22
assertion, 12
∧ meet, 21
associativity, 21, 22, 601, 603, 625
{. . . , . . .} anticommutator, 245
atom, 26, 41
pow (. . .), power of operator, 122
atomic lattice, 26
2-particle potential, 188, 272
atomic proposition, 26
3-vector, 617 bare particle, 361, 393
4-scalar, 107, 684 Barnett experiment, 508
4-square, 107, 678 baryon number, 255, 261
4-vector, 82, 107, 677, 684 basic observables, 104
basis, 605
Abelian group, 602, 603 Bethe logarithm, 470
aberration, 166 Biot-Savart force law, 491, 500
acceleration, 561 Birman-Kato invariance principle, 236
action integral, 213, 515 bispinor representation, 690
INDEX 825
Bohr radius, 405 composition of transformations, 65

Boolean lattice, 40 compound system, xxx, 170
Boolean logic, 27, 28, 42 Compton scattering, 10, 282, 321, 532
boost, xxxiii, 66 conjugate field, 694
boost operator, 93, 100, 104, 120 connected diagram, 285, 291
bosonic operator, 257 connected operator, 291
bosons, 174 conservation law, 254
bound field, 535 conservative force, 505
bound states, 196 conserved observable, 106, 254
bra vector, 639 contact interaction, 402
bra-ket formalism, 639 continuity equation, 728
Breit-Wigner distribution, 435 continuous spectrum, xxxii
bremsstrahlung, 389 contraposition, 23
Brewster angle, 542 coordinates, 614
corpuscular theory, 5
camera obscura, 5, 53 Coulomb gauge, 303
canonical form of operator, 120 Coulomb potential, 284, 402
Casimir operator, 107, 668 counterterms, 333
causality, 549, 587 coupling constant, 276
center of a lattice, 40 cover, 26
central charges, 93 creation operator, 244, 246
characteristic function, 34 cross product, 620
charge conservation law, 255 Cullwick paradox, 516
circular frequency, 10 current density, 727
classical logic, 28, 30
classical mixed state, 35 Darwin Hamiltonian, 484
closed subspace, 637 Darwin potential, 402
cluster, 187 Darwin-Breit potential, 398, 402
cluster separability, 187, 275 decay law, 414, 685
coefficient function, 258 decay potential, 261
commutation relations, 247 decay products, 414
commutativity, 21, 22 decay rate, 438
commutator, 631, 647 decomposition of unity, 47, 658
compatibility, 36 degenerate eigenvalue, 662
compatible observables, 14 delta function, 607, 747
compatible propositions, 36 density matrix, 48
compatible subspaces, 661 density operator, 48
complete inner product space, 637 diagonal matrix, 644
diagram, 277 energy, 104

diffraction, 6 energy function, 259
dimension, 78, 605 energy shell, 259
Dirac equation, 710 energy-momentum 4-vector, 108, 137
Dirac field, 693 ensemble, 14
direct product, 34 entangled states, 173
direct sum, 657, 668 evanescent waves, 544
disconnected diagram, 285 event, 554
discrete spectrum, xxxii expectation value, 51
disjoint propositions, 24 experiment, 14
distributive laws, 28
distributivity postulate, 17 Faraday’s law of induction, 506
Doppler effect, 164, 576 fermions, 174
dot product, 613, 620 Feynman diagram, 316
double negation, 23 Feynman rules, 316
double-valued representation, 673 Feynman-Dyson interaction operator, 313
dressed particle, 361, 372, 393
dressing transformation, 378 Feynman-Dyson perturbation theory, 312
dual Hilbert space, 171, 639
dual vector, 640 fine structure, 407
duality, 612 fine structure constant, xxvi, 347
dynamical inertial transformation, 178, 450 Fock space, 239
force, 486, 561
dynamics, 89 forms of dynamics, 179
front form, 179
eigenstate, 47 frustrated total internal reflection, 543
eigensubspace, 47, 663
eigenvalue, 46, 653 g-factor, 478
eigenvector, 47, 653 Galilei group, 67
elastic potential, 388 Galilei Lie algebra, 68
electric charge, 255 gamma matrices, 689
electric field, 506 generator, 92, 626, 630
electromagnetic induction, 504 Gleason theorem, 48
electron, 133 Gordon identity, 711
electron propagator, 712 group, 601
electron self-energy, 339 group inversion, 602
elementary particle, 131 group manifold, 629
emitting antenna, 540 group multiplication, 602
group product, 601 infrared divergences, 326

gyromagnetic ratio, 478 inhomogeneous Lorentz group, 74
inner product, 637
Hamilton’s equations of motion, 211 inner product space, 637
Hamiltonian, 104 instant form, 179
Hamiltonian interaction operator, 313 interacting representation, 177
Hartman effect, 543 interaction, 169
Heisenberg equation, 101 interference, 7, 531
Heisenberg Lie algebra, 133, 182, 670 internal line, 279
Heisenberg picture, 89, 101 intrinsic angular momentum, 116
Heisenberg uncertainty relation, 204 intrinsic properties, 107
helicity, 115, 161 invariant tensor, 619
Hermitian conjugation, 646 inverse element, 601
Hermitian operator, 648 inverse matrix, 647
hidden momentum, 517 inverse operator, 647
Hilbert space, 637 irreducible lattice, 40
homopolar generator, 506 irreducible representation, 132, 668
homotopy class, 671 isolated system, xxx
hydrogen atom, xxxii, 197, 403, 475
hyperfine structure, 407 Jacobi identity, 633
join, 22
identity matrix, 644
Kennedy-Thorndike experiment, 576
identity operator, 651
ket vector, 639
identity transformation, 65
kinematical inertial transformation, 178, 450
implication, 20
improper state, 144
kinetic energy, 402
index of potential, 257
Kronecker delta symbol, 616, 619
induced representation method, 142
inelastic potential, 388 laboratory, xxxiii
inertial frame of reference, xxxiii Lagrangian, 213
inertial observer, xxxiii Lamb shift, 391, 475, 477
inertial transformations of observables, xxxiv, 101 Larmor’s formula, 463
lattice, 22
inertial transformations of observers, xxxiii lattice irreducible, 40
lattice reducible, 40
infinitesimal rotation, 626 law of addition of velocities, 106
infinitesimal transformation, 630 length contraction, 557, 685
infrared cutoff, 328, 753 lepton number, 255
less than, 20 metric tensor, 678

less than or equal to, 20 Michelson-Morley experiment, 576
Levi-Civita symbol, 619 minimal proposition, 19
Liénard-Wiechert fields, 527 Minkowski space-time, 684
Lie algebra, 627, 633 mixed product, 621
Lie bracket, 633 mixed state, 49
Lie group, 629 momentum, 104
lifetime, 438 muon, 132
line external, 277
line in diagram, 277 neutrino, 133
linear functional, 639 neutrino oscillations, 133, 261
linear independence, 605 neutron, 132
linear subspace, 606 Newton’s first law, 487
little group, 143 Newton’s second law, 487
local gauge invariance, 373 Newton’s third law, 487
loop, 281, 293 Newton-Wigner position operator, 115
loop diagram, 325 non-conservative force, 505
loop momentum, 281 non-contradiction, 23, 24
Lorentz group, 80, 681 non-decay probability, 414
Lorentz transformations, 102, 558, 568, 573, 683
non-interacting representation, 177, 242, 418
normal order, 256
Møller wave operator, 236 null 4-vector, 679
magnetic moment, 478
magnetic quantum number, 405 observable, xxix
manifest covariance, 577, 684 observer, xxxiii
many worlds interpretation, 56 one-parameter subgroup, 625, 669
mass, 107, 418 operator, 641
mass distribution, 423, 425 operator of mass, 108
mass hyperboloid, 139 operator of time, 579
mass operator, 108 orbital angular momentum, 116
mass shell, 319 orbital quantum number, 405
matrix, 644 origin, 613
matrix element, 642 orthocomplement, 22
maximal proposition, 19 orthocomplemented lattice, 24
measurement, xxix orthogonal complement, 657
measuring apparatus, xxix, 56 orthogonal matrix, 616
meet, 21 orthogonal subspaces, 657
orthogonal vectors, 613, 638
orthomodular lattice, 40
orthomodularity, 40
orthonormal basis, 613, 638
oscillation potential, 261

pair annihilation, 389
pair conversion, 389
pair creation, 389
pairing, 279
partial ordering, 20
partially ordered set, 20
particle, xxx
particle dressed, 372
particle observables, 245
particle operators, 245
particle-wave duality, 11
passive transformation, 615
Pauli exclusion principle, 175, 244
Pauli matrices, 675
Pauli-Lubanski operator, 109
perturbation order, 276
phase space, 30, 31, 34, 206
phonon, 591
photon, 10, 164
photon propagator, 314, 721
phys operator, 262
physical equivalence, 232
physical particle, 361
physical system, xxix
pilot wave interpretation, 56
pion, 132
Piron’s theorem, 43
Planck constant, 10, 104
Poincaré group, 79
Poincaré invariance, 577
Poincaré Lie algebra, 79
Poincaré stress, 501
point form, 179
Poisson bracket, 209
polaron, 591
position operator, 111, 115
position-time 4-vector, 677
postulate, 12
potential, 257
potential boost, 180
potential energy, 180
potential energy density, 301
power of operator, 122
preparation device, xxix
primary term, 121
principal quantum number, 405
principal value integral, 433
principle of causality, 687
principle of relativity, 62, 63
probability density, 35, 435
probability measure, 18, 47
projection operator, 658
projective representation, 90
proposition, 17
proposition-valued measure, 45
propositional system, 18
propositional system quantum, 40
pseudo-Euclidean metric, 684
pseudoorthogonal matrix, 680
pseudoscalar, 74, 109
pseudoscalar product, 678
pseudotensor, 74
pseudovector, 74
pure quantum state, 49

QED (quantum electrodynamics), xix
QFT (quantum field theory), xix
quantum electrodynamics, 240, 298
quantum field, 299
quantum field theory, 298
quantum logic, 4, 16, 36, 40
quantum mechanics, 3
quasiclassical state, 203
quaternions, 44

radiation field, 535
radiation reaction, 389, 531
radiative corrections, 326
radiative transitions, 463
range, 658
rank, 619
rank of a lattice, 40
rapidity, 80
ray, 41, 49, 86, 606
receiving antenna, 540
reduced mass, 404
reducible lattice, 40
reducible representation, 132, 668
reflectivity, 20
regular operator, 258
regularization, 328
relative momentum, 182
relative position, 182
relativistic Hamiltonian dynamics, 177
relativistic quantum dynamics, xxi, 371
renorm potential, 260
renormalizability, 359
renormalization, 326
renormalization conditions, 328
representation of group, 667
representation of Lie algebra, 669
representations of rotation group, 674
resonance, 438
rest mass energy, 108
Riemann-Lebesgue lemma, 610
Riesz theorem, 640
right-handed coordinate system, 613
RQD (relativistic quantum dynamics), xxi

S-matrix, 220
S-operator, 219
scalar, 602, 615
scalar product, 613
scattering equivalence, 231
scattering phase operator, 224
scattering states, 226
Schrödinger cat, 54
Schrödinger picture, 89, 100
secondary term, 121
sector, 240
self-adjoint operator, 648
simply connected space, 671
single-valued representation, 673
smooth function, 610
smooth operator, 291
smooth potential, 188, 275, 291
space-like 4-vector, 679
span, 41, 606
special relativity, 677
spectral projection, 47, 663
spectral theorem, 653
spectrum of observable, xxxii
spin, 107, 116, 135
spin operator, 111
spin-orbit potential, 403
spin-spin potential, 403
spin-statistics theorem, 175
standard momentum, 143, 158, 716
state, xxix, xxx
statement, 12
stationary Schrödinger equation, 195, 404
step function, 608
Stone’s theorem, 91, 669
structure constants, 93, 631
subalgebra, 635
sublattice, 46
superselection rules, 242
symmetric wave function, 174
symmetry, 20

T-matrix, 228
tensor, 618
tensor product, 171, 176, 641
tertiary term, 121
Thomson formula, 322
time dependent Schrödinger equation, 156, 200, 212
time dilation, 686
time evolution, xxxiv
time evolution operator, 200
time ordering, 223, 712
time-like 4-vector, 679
total internal reflection, 542
total observables, 105
trace, 48, 647
trajectory, 206
transitivity, 20
transposed matrix, 615
transposed vector, 614
transposition, 646
tree diagram, 325
trivial representation, 667
Trouton-Noble paradox, 503
true vector, 74
truth function, 28
truth table, 29

ultraviolet cutoff, 328, 753
ultraviolet divergences, 293, 326
unimodular vector, 638
unit element, 601
unitary equivalent representations, 667
unitary operator, 649
unitary representation, 667
universal covering group, 99
unphys operator, 262

vacuum polarization, 342
vacuum subspace, 241
vacuum vector, 241
vector, 602, 613
vector components, 605
vector product, 620
velocity, 126
vertex, 277
virtual particle, 361

wave function, 51
wave function in the momentum representation, 144
wave function in the position representation, 149
wave packet, 203
Wick rotation, 750
Wigner angle, 141, 164
Wigner’s theorem, 85
Wilson-Wilson experiment, 510

zero vector, 603