The origins of the exclusion principle: an extremely natural prescriptive rule

The exclusion principle was the final outcome of Pauli’s struggle to under-
stand some spectroscopic anomalies in the early 1920s: doublets were
observed in the spectra of alkali metals, singlets and triplets in the spectra
of the alkaline earths, and even more anomalous patterns were observed
when chemical elements were placed in an external magnetic field (anom-
alous Zeeman effect and Paschen–Back effect). These anomalous spectra
challenged the old quantum theory, and prompted a radical theoretical
change (Section 2.1). From 1920 to 1924 Alfred Landé, Werner
Heisenberg, and Niels Bohr were all engaged in trying to save the trad-
itional spectroscopic model (the so-called atomic core model) and to
reconcile it with the observed anomalies. The impasse was solved only
with Pauli’s introduction of a fourth degree of freedom for the electron,
and the consequent demise of the atomic core model (Section 2.2). What
Pauli called the ‘twofoldness’ [Zweideutigkeit] of the electron’s angular
momentum was soon reinterpreted as the electron’s spin (Section 2.3).
Pauli’s exclusion rule was announced in this semi-classical spectroscopic
context that characterized the revolutionary transition from the old quan-
tum theory to the new quantum theory around 1925.

2.1 The prehistory of Pauli’s exclusion principle

2.1.1 Atomic spectra and the Bohr–Sommerfeld theory of
atomic structure
The existence of spectral lines had been known to scientists since
the beginning of the nineteenth century when Wollaston and Fraunhofer
first observed the dark absorption lines in the spectrum of the Sun. It took
almost eighty years – the time necessary to improve the quality of technical
instruments – to find an empirical law governing the line distribution in the
hydrogen spectrum as observed in stars. In 1885 Balmer discovered that
each of the fourteen lines composing the spectrum of hydrogen obeyed the
simple rule

$$\lambda = h \, \frac{n_2^2}{n_2^2 - n_1^2} \qquad (2.1)$$

where $\lambda$ is wavelength, $h = 3645.6$ Å, $n_1 = 2$ and $n_2 = 3, 4, 5, 6, \ldots$ respectively
for the first, second, third, and so forth elements of the series. A few years
later it became clear that this formula could be derived from another more
general formula due to Rydberg, which gave the frequency for any spectral
series as

$$\nu_n = \frac{R}{(n_1 + \mu_1)^2} - \frac{R}{(n_2 + \mu_2)^2}$$

where $\nu_n$ is the frequency of the n-th member of the series, R is the Rydberg
constant, and $\mu$ denotes spectral terms (e.g. $\mu_1 = {}^3P_1$, $\mu_2 = {}^3S_1$). By setting
$\mu_1 = 0$, $\mu_2 = 0$, $n_1 = 2$ and $n_2 = 3, 4, 5, \ldots$ Rydberg’s formula reduces to
Balmer’s formula.
Rydberg was the first to distinguish between a sharp series (S) and a
diffuse series (D). Other types of series were later discovered: the so-called
principal series (P) and the fundamental series (F). Jointly they form the
four chief series (S, P, D, F) available for every type of line (i.e. singlet,
doublet, triplet, . . . ).1 According to Rydberg’s formula, each line of a
series is then given by the difference between two spectral terms.2 This
was a useful phenomenological law, but did not offer a theoretical
understanding of spectra. Only with Bohr’s atomic theory of hydrogen in 1913
was the first step towards such an understanding taken.
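
Balmer’s rule (2.1) can be checked numerically. The following is a minimal sketch, assuming only the constant h = 3645.6 Å and the integers n1 = 2, n2 = 3, 4, 5, 6 quoted in the text; the function names are illustrative and not taken from any source.

```python
# Balmer's empirical rule (2.1): lambda = h * n2**2 / (n2**2 - n1**2)
# BALMER_H_ANGSTROM is Balmer's constant h as quoted in the text
# (it is not Planck's constant).
BALMER_H_ANGSTROM = 3645.6


def balmer_wavelength(n2: int, n1: int = 2) -> float:
    """Wavelength in angstroms of the line with running integer n2 > n1."""
    if n2 <= n1:
        raise ValueError("n2 must be larger than n1")
    return BALMER_H_ANGSTROM * n2**2 / (n2**2 - n1**2)


def balmer_series(count: int = 4) -> list:
    """First `count` lines of the series, starting from n2 = 3."""
    return [balmer_wavelength(n2) for n2 in range(3, 3 + count)]


if __name__ == "__main__":
    for n2, wavelength in zip(range(3, 7), balmer_series()):
        print(f"n2 = {n2}: {wavelength:.1f} angstroms")
```

For large n2 the computed wavelength approaches h itself, which is why h plays the role of the series limit in Balmer’s rule.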
Not only could Bohr derive from his atomic theory Rydberg’s and
Balmer’s formulae for the hydrogen series, but he provided Rydberg’s
phenomenological law with a physical meaning: the difference between the
spectral terms was interpreted as the difference between two energy-states
(stationary states) of the atom. The frequency of a spectral line $\nu_n$ would
then be proportional to the difference between the initial state $W_2$ in which
the electron is located and the final state $W_1$ into which it jumps by
emitting radiation. The fixed term $\mu_2$ of Rydberg’s formula was identified
with the final or lower energy-state, and the running term $\mu_1$ with the
different possible initial states. From Planck’s quantum postulate it follows
that radiation is not emitted continuously but only in discrete quanta
of energy, so Bohr wrote his formula $h\nu = W_2 - W_1$, where h is Planck’s
constant.

1 The standard notation uses the superscript on the left of the capital letter to denote the type
of line; for instance the singlet four chief series are denoted as ${}^1S$, ${}^1P$, ${}^1D$, ${}^1F$.
2 For instance, the three lines composing a triplet of the sharp series S are given by the
following differences: ${}^3P_2 - {}^3S_1$; ${}^3P_1 - {}^3S_1$; ${}^3P_0 - {}^3S_1$, where the subscripts indicate the
specific line of the series at issue (e.g. ${}^3P_2$ denotes the second line of the principal series
triplet). These differences are obtained by plugging into Rydberg’s formula $\mu_2 = {}^3S_1$ (fixed
term) and $\mu_1 = {}^3P_2, {}^3P_1, {}^3P_0$ respectively (running terms).

Sommerfeld significantly extended Bohr’s atomic theory of hydrogen
and hydrogen-like atoms to include elliptic orbits.3 In his model, an
electron moving in an elliptic orbit is a system with two degrees of freedom
and two quantum conditions. In polar coordinates, the position of the
electron is given by the electron–nuclear distance r and the azimuthal angle
$\varphi$ between r and the major axis of the elliptic orbit. Accordingly the
electron had two momentum coordinates:

$$\text{(i)} \quad p_\varphi = m r^2 \dot{\varphi} \qquad (2.3)$$

$$\text{(ii)} \quad p_r = m \dot{r} \qquad (2.4)$$

where m is electron mass. By applying Bohr’s quantum condition to the two
momentum coordinates, Sommerfeld’s quantum conditions followed:4

$$\text{(1)} \quad \oint p_\varphi \, d\varphi = k h \qquad (2.5)$$

$$\text{(2)} \quad \oint p_r \, dr = r h \qquad (2.6)$$

where k and r are respectively the azimuthal and the radial quantum
number and, following Bohr’s quantum condition, they take only discrete
integral values (r = 0, 1, 2, 3, . . . ; k = 1, 2, 3, 4, . . . ). The sum r + k was
equal to n, where n was the so-called principal quantum number denoting
the energy-state of the electron. Like r and k, n also took only integral
values (n = 1, 2, 3, . . . ). While r and k determined respectively the size and
shape of the orbit, in order to determine the orientation of the orbit in an
external magnetic field, it was necessary to introduce a third degree of
freedom.

3 Sommerfeld (1916a), (1919).
4 For details see White (1934), pp. 42ff.

As is well known from classical electrodynamics, in particular from
Larmor’s theorem,5 in the presence of an external magnetic field of intensity
H the orbit of the electron precesses about the field direction with
uniform angular velocity

$$\omega = H \, \frac{e}{2mc} \qquad (2.7)$$

and uniform frequency

$$\nu_L = H \, \frac{e}{4\pi mc} \qquad (2.8)$$

This is equivalent to saying that Larmor’s angular velocity is given by the
field strength H times the magneto-mechanical ratio $\mu/p = e/(2mc)$, where
$\mu$ is the electron’s magnetic moment due to its orbital motion and p is its
orbital angular momentum; on the other hand, Larmor’s precession frequency
is given by $H/2\pi$ times $\mu/p$. Interestingly enough, Larmor’s magnetic
effect was known to be proportional to the electron’s charge-to-mass
ratio e/m well before Thomson’s discovery of the electron. The unit
$eh/(4\pi mc)$ (or, equivalently, $e\hbar/(2mc)$) was called the Bohr magneton $\mu_B$
and in the old quantum theory it was taken as the unit of the magnetic
moment for an electron bound in an atom. According to Larmor’s theorem,
the presence of an external magnetic field did not alter the size and
shape of the electron’s orbit, because of a balance between the Coriolis
force and the Lorentz radial force on the electron. But it altered the
orientation of the orbit with respect to the field direction. If we represent
the orbital angular momentum of the electron by a vector K (in modern
notation L), under the effect of Larmor precession K turns about the field
vector H with an angle $\theta$. According to Larmor’s theorem, the angle $\theta$ is
constant, i.e. K takes continuous values in the precession. But, as
Sommerfeld6 pointed out, the vector K did not project itself on the field
direction in a continuous way. Rather, the cosine of $\theta$ took only discrete
values denoted by a new quantum number, the so-called magnetic quantum
number m, which like n and k took only discrete integral values7 ($m = \pm 1,
\pm 2, \pm 3, \ldots, \pm k$ for k = 1, 2, 3, 4, . . . , n). For instance, for n = 1 and
k = 1, m took the values $\pm 1$; for n = 2 and k = 1, 2, $m = \pm 1, \pm 2$
corresponding to the four possible projections of the orbital angular momentum
vector K on the field direction (Fig. 2.1): the orbit appeared to be
space-quantized. And space quantization seemed to be confirmed by the
Stern–Gerlach experiment in 1921.8

5 Larmor (1897).
6 Sommerfeld (1916b).
7 It was assumed that m could not take the value 0, because this would lead to the electron
colliding with the nucleus.

Fig. 2.1 Space-quantization diagrams for Bohr–Sommerfeld orbits with
orbital angular momentum k = 1, 2, and 3.
Source: H. E. White (1934) Introduction to Atomic Spectra (New York,
London: McGraw-Hill Book Company). Reproduced with permission of
The McGraw-Hill Companies.
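
The counting of Bohr–Sommerfeld orbits and of their space-quantized orientations described above can be sketched as follows. This is only an illustration under the old-quantum-theory assumptions stated in the text (k = 1, . . . , n; radial number equal to n minus k; m running from −k to +k with m = 0 excluded); the function names are hypothetical.

```python
def orbits_for(n: int) -> list:
    """All (k, radial) pairs for principal quantum number n.

    k is the azimuthal quantum number (1..n) and the radial quantum
    number is n - k, so that radial + k = n as in the text.
    """
    return [(k, n - k) for k in range(1, n + 1)]


def magnetic_numbers(k: int) -> list:
    """Allowed magnetic quantum numbers m = -k..+k, with m = 0 excluded.

    The exclusion of m = 0 is the old-quantum-theory assumption
    mentioned in the text, not a feature of modern quantum mechanics.
    """
    return [m for m in range(-k, k + 1) if m != 0]


if __name__ == "__main__":
    for n in (1, 2, 3):
        for k, radial in orbits_for(n):
            print(f"n={n} k={k} radial={radial} m={magnetic_numbers(k)}")
```

Each value of k yields 2k orientations, so that k = 2 gives the four projections mentioned above.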
Following up on Sommerfeld, at the beginning of the 1920s Bohr developed
a building-up schema for the elements of Mendeleev’s periodic table.
He assumed that the number of electrons in an atom was equal to the
atomic number Z; each electron was in a definite quantized state and no
radiation was emitted as long as the electron remained in its stationary
state. Each stationary state corresponded to a possible $n_k$-orbit for the
electron, where n was the principal quantum number and k the azimuthal
quantum number. Bohr fixed n = 1 for the electrons of hydrogen (Z = 1)
and helium (Z = 2); n = 2 for the electrons of the elements from lithium
(Z = 3) to neon (Z = 10); n = 3 for the electrons of the atoms of the 3rd
period, and so forth. Thus, hydrogen would have a single $1_1$ orbit (n = 1
and k = 1); helium would have two $1_1$ orbits for its two electrons; lithium
would have the two $1_1$ orbits of helium plus a new $2_1$ orbit for its 3rd
electron; the 4th electron of beryllium (Z = 4) would be in a second $2_1$
orbit, and so on up to carbon (Z = 6), whereas the 7th electron of nitrogen
(Z = 7) to the 10th electron of neon (Z = 10) would be placed in as many $2_2$
orbits.

8 The Stern–Gerlach experiment consisted of directing a beam of silver atoms in a high
vacuum through collimating slits along a strong field gradient so that the beam split into
two sub-beams. According to classical mechanics, since the magnetic moments of the atoms
were isotropically distributed (i.e. they had equal probability to take on any value
between M and –M), one would expect one single spot at the centre of the distribution
corresponding to the average magnetic moment. But Stern and Gerlach observed two spots
centred at two different points corresponding to two discrete values of M. This result was
greeted as a confirmation of Sommerfeld’s space quantization, as evidence for the discrete
orientations atoms would take up in the presence of an external magnetic field. As a matter
of fact, the Stern–Gerlach experiment can be unambiguously interpreted only with the
introduction of the spin: the correct explanation is that the total angular momentum of the
silver atoms coincides with the spin of the valence electron (which is alone responsible for
the magnetic moment M of the atom). And the spin projection takes only two discrete
eigenvalues ($+1/2$ and $-1/2$ in units of $h/2\pi$) corresponding to the two observed spots.

Each chemical element then seemed to be built up by adding an outer
electron to the electronic configuration of the previous element in the
periodic table, which consequently became the atomic core of the element
at issue. Bohr’s fundamental assumption was that the addition of the outer
electron did not affect the quantum numbers n and k of the already bound
electrons of the atomic core. This invariance and permanence of quantum
numbers9 is known as Bohr’s building-up principle (Aufbauprinzip). Since
this principle will be crucial for the rest of our story, it is necessary to linger
a little on it.
The building-up principle was justified by the classical theory of con-
ditionally periodic systems via Bohr’s correspondence principle. The cor-
respondence principle established a general correspondence between
classical electrodynamics and Bohr’s atomic theory. In the original for-
mulation, the principle established a correspondence – in the region of high
quantum numbers – between the classical concept of harmonic compon-
ents in the motion of an electrically charged particle and the quantum
concept of an electron’s transitions between stationary states.10 However,
Bohr soon regarded this principle, more generally, as the methodological
guideline underpinning his project to shape quantum theory as closely as
possible along the lines of classical physics.11 As such, the correspondence
principle is at work in the justification of the building-up principle: via the
correspondence principle, Bohr extended analogically the classical notion
of adiabatic invariants to atomic structure.

9 See Bohr (1923a). Reprinted in Bohr (1977), p. 632.
10 ‘Although the process of radiation cannot be described on the basis of the ordinary theory
of electrodynamics, according to which the nature of the radiation emitted by an atom is
directly related to the harmonic components occurring in the motion of the system, there is
found nevertheless to exist a far-reaching correspondence between the various types of
possible transitions between the stationary states on the one hand and the various
harmonic components of the motion on the other. This correspondence is of such a
nature that the present theory of spectra is in a certain sense to be regarded as a rational
generalization of the ordinary theory of radiation.’ Bohr (1914). Reprinted in
Bohr (1976), p. 301.
11 For a detailed analysis of the implications of the correspondence principle for Bohr’s later
research as well as for the development of quantum mechanics, see Petruccioli (1988).

From the end of the nineteenth century it was known that all periodic
mechanical systems possessed adiabatic invariants, i.e. magnitudes that
remained invariant while certain parameters underwent slow variations.12
Bohr extended in a non-mechanical way the classical notion of adiabatic
invariants to the electrons of an atom. In the case of a single-electron atom
such as hydrogen, the action of the electron moving around the nucleus
was taken to be adiabatically invariant: the atom was unaffected by the
slowly varying forces of the electric field. But problems arose with many-
electron atoms, because in this case the motion of the electrons did not
exhibit periodic properties. It was at this point that Bohr resorted to the
correspondence principle. Even if it was not possible to apply the classical
theory of conditionally periodic systems to $n_k$-orbits of many-electron
atoms, yet
the general stability of atomic structures leads to the view that, in every atom
with several electrons, there are properties of the motion of each electron and of
the interplay among the electrons, which possess an invariant character that
cannot be explained mechanically, but that finds its meaningful expression just
in the quantum numbers. This view might be called a formal postulate of the
invariance and permanence of quantum numbers.13

As a result, the statistical weights of the stationary states characterized by
these quantum numbers were now regarded as adiabatically invariant. In
other words, the addition of an electron in the building-up process was
expected to let the atom undergo a slow variation of the electric field,
without however affecting the stationary states of the already bound
electrons. This explains why the building-up principle can also be known
in the literature as Bohr’s adiabatic principle.14 But this terminology is
misleading given the existence of another adiabatic principle in quantum
theory due to Ehrenfest;15 henceforth I prefer to refer to it as the building-
up principle.

12 The concept of adiabatic change goes back to Boltzmann’s and Clausius’s attempts to
reduce the second law of thermodynamics to pure mechanics, while the notion of adiabatic
motion was developed by Hertz and Helmholtz.
13 Bohr (1923a). Reprinted in Bohr (1977), pp. 631–2.
14 See for instance Serwer (1977), p. 204.
15 Ehrenfest (1913) was the first to extend the classical notion of adiabatic invariants to
quantum theory as well as the first to introduce – following a terminology suggested by
Einstein – an adiabatic principle. For the history of this principle, see Jammer (1966),
second edition (1989), pp. 99–107. It is worth noticing that Bohr derived the invariance of
statistical weights precisely from Ehrenfest’s adiabatic principle: ‘A basis for the
treatment of this problem [i.e. the problem of the weights that must be assigned to the
individual stationary states] is contained in a general condition, derived by Ehrenfest . . .
stating that in a continuous transformation of a system, the statistical weights of the
individual stationary states do not change as long as the degree of periodicity remains the
same. For systems for which the degree of periodicity is equal to the number s of the
degrees of freedom, this condition involves that all stationary states must be assigned the
same weight $h^s$.’ Bohr (1923a). Reprinted in Bohr (1977), p. 616.

Bohr’s building-up schema offered a relatively simple and elegant
classification of electrons in $n_k$-orbits, in which symmetry considerations
played a relevant role.16 Yet the schema had a major drawback: symmetry
considerations about the electronic distribution often spoiled the consistency
of the building-up process through the periodic table. For example,
argon (Z = 18) was built up from the atomic core of the previous noble gas
in the periodic table, namely neon (Z = 10), by adding eight new electrons –
placed in suitable $3_k$-orbits – to the $2_k$-orbits of neon. Then, one would
expect the next noble gas, krypton (Z = 36), to be built up from the atomic
core of argon by adding eighteen new electrons – placed in suitable
$4_k$-orbits – to the $3_k$-orbits of argon. But this was not the case. According
to Bohr’s building-up schema, the electronic distribution of argon was not
strictly maintained in the building-up process of krypton. Argon orbits
were instead reopened and the electronic distribution rearranged so as to
satisfy symmetry in the structure of krypton, i.e. so as to have three groups
of six electrons each, placed in suitable $3_k$-orbits:

$$\mathrm{Kr}\,(Z = 36) = (\text{neon})^{10}\,(3_1)^6\,(3_2)^6\,(3_3)^6\,(4_1)^4\,(4_2)^4 .$$

Analogous arbitrary reopenings of orbits occurred in the 4th and 5th
periods. As we shall see, this drawback led Edmund Stoner, a few years
later, to modify Bohr’s building-up schema in a way that would turn out to
be crucial for Pauli’s introduction of his exclusion rule. Suffice it to say that
despite this drawback, Bohr’s building-up schema led to the successful
prediction of the chemical properties of an undiscovered element with
atomic number 72, and Hevesy and Coster’s discovery of hafnium in
1922 was hailed as a confirmation of Bohr’s schema.17 The widespread
consensus that Bohr’s theory gained in the period up to 1920 could not
however prevent it from facing serious difficulties when it came to the
explanation of some puzzling spectroscopic anomalies.

16 This is particularly evident in the electronic distribution of neon (Z = 10), where the ten
electrons were distributed as follows: the first two electrons in two ($1_1$) orbits, the following
four electrons in four ($2_1$) orbits and the last four electrons in four ($2_2$) orbits. For a detailed
discussion of Bohr’s schema and its drawback, see Heilbron (1982), p. 273.
17 For the discovery of hafnium, see Kragh (1979).

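
The electron bookkeeping behind the ‘reopening’ of argon’s orbits can be made explicit with a small tally. The sketch below uses only the occupation numbers quoted in the text (neon’s distribution and the krypton formula, plus the eight electrons added for argon); the dictionary layout and function names are illustrative assumptions, not Bohr’s notation.

```python
# Occupation numbers keyed by (n, k), as quoted in the text.
NEON = {(1, 1): 2, (2, 1): 4, (2, 2): 4}
KRYPTON_BEYOND_NEON = {(3, 1): 6, (3, 2): 6, (3, 3): 6, (4, 1): 4, (4, 2): 4}

# The text only states that argon adds eight electrons in 3_k-orbits;
# their distribution over k is not given, so only the total is used here.
ARGON_ELECTRONS_IN_N3 = 8


def total_electrons(*groups: dict) -> int:
    """Total number of electrons over several occupation tables."""
    return sum(sum(group.values()) for group in groups)


def electrons_with_n(n: int, *groups: dict) -> int:
    """Number of electrons whose principal quantum number equals n."""
    return sum(
        count
        for group in groups
        for (n_i, _k), count in group.items()
        if n_i == n
    )


if __name__ == "__main__":
    print("neon:", total_electrons(NEON))
    print("krypton:", total_electrons(NEON, KRYPTON_BEYOND_NEON))
    n3_in_krypton = electrons_with_n(3, NEON, KRYPTON_BEYOND_NEON)
    print("extra n = 3 electrons relative to argon:",
          n3_in_krypton - ARGON_ELECTRONS_IN_N3)
```

The tally shows that krypton’s n = 3 group holds eighteen electrons against argon’s eight, which is the rearrangement the text describes as inconsistent with a strict building-up process.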
2.1.2 The doublet riddle and the riddle of statistical weights

Bohr had resorted to the principal and azimuthal quantum numbers to
distribute electrons in $n_k$-orbits. However, it soon became evident that
these two quantum numbers were not sufficient to account for the multi-
plets observed in the spectra of many-electron atoms. In 1920 doublets and
triplets were well known to spectroscopists, and after the work of Catalan
in 1922 higher multiplicities were discovered. The jungle of observed
doublets, triplets, quintuplets, and sextuplets in the series spectra hinted
at the existence of a multiplicity in the energy-levels that was not ade-
quately mirrored by the values of n and k alone. Multiplets stemmed
supposedly from an energy difference due to a splitting of the orbital
angular momentum state k. Thus, energy-levels for individual electrons
labelled s, p, d, f18 were associated with the azimuthal quantum numbers
k = 1, 2, 3, . . . , n, and it was assumed that each energy-level – except s –
could further split to yield doublets, triplets, or other multiplets. For
instance, the series spectra of alkali metals, i.e. the elements of the first
column of the periodic table (Li, Na, K, Rb, Cs), exhibited doublets that
supposedly corresponded to double energy-levels p, d, f. The spectra of the
alkaline earths, i.e. the elements of the second column (Be, Mg, Ca, Sr, Ba),
could display triplets and singlets, but only singlets were usually observed.
How did the alkali doublets originate? And why did alkaline earths exhibit
only singlets, even if triplets were theoretically allowed? More generally,
why did the angular momentum state split?
The current physical explanation of multiplets involves the so-called
spin–orbit coupling. To put it in an extremely simple way, consider the
semi-classical vector model of a one-electron atom, where we can represent
the total angular momentum as a vector J, which is the sum of the valence
electron’s orbital angular momentum vector L and of its spin angular
momentum vector S. The electron moves in the electrostatic field E created
by the protons of the nucleus. This field appears in the electron frame as an
internal magnetic field B at the electron normal to the plane of the orbit.
The magnetic moment of the electron spin interacts with this internal field
in such a way that it can line up either parallel to B when J = L − S, or
antiparallel to B when J = L + S. These two possible spin–orbit
combinations correspond to doublet fine structure levels (e.g. for p electrons
in sodium and potassium, ${}^2P_{1/2}$ and ${}^2P_{3/2}$). It is the spin–orbit
coupling that is at work in the doublet shift of the alkalis, as well as in
other spectroscopic anomalies. No wonder then that in the pre-spin era
before 1925 the alkali doublets appeared as a puzzling phenomenon.

18 For instance k = 1 corresponds to s; k = 2 to p; k = 3 to d and k = 4 to f. Notice the
notational difference between the small letters s, p, d, f referring to the energy-levels (and
hence to the spectral terms) for individual electrons and the capital letters S, P, D, F
referring to the spectral terms resultant from the combination of any two or more
non-equivalent s, p, d, f electrons.

The first step towards the solution of this riddle came from the analysis
of the fine structure of X-ray spectra. As known since 1914,19 an atom
emits X-ray doublets during the transition of an outer electron into an
internal vacant orbit. Sommerfeld20 pointed out that the width of the
X-ray doublets was proportional to the fourth power of the effective
nuclear charge, $(Z - s)^4$, where Z is the atomic number corrected by a
factor s due to the screening exerted by the inner electrons of the atom.
Sommerfeld thought that the electron responsible for the emission of
X-rays underwent a relativistic increase in mass due to the different orbital
angular momenta of the two orbits having different eccentricities. This
difference accounted satisfactorily for the X-ray doublet fine structure.
The step from the ‘relativistic’ X-ray doublets to the ‘optical’ alkali doub-
lets was short, and at the beginning of the 1920s attempts were made to
extend the relativistic explanation to optical doublets.21 But the two kinds
of doublets were not analogous, since the energy-levels at work in the alkali
doublets did not have different orbital angular momenta. No unified
explanation was possible for relativistic and optical doublets: hence the
so-called ‘doublet riddle’.22
The second important step was taken once again by Sommerfeld a few
years later.23 To bring about the required multiplicities in the energy-
levels, Sommerfeld introduced a further quantum number j in addition
to Bohr’s n and k. This new quantum number, called the inner quantum
number, denoted the energy-sublevels into which the $n_k$-orbits supposedly
split to give rise to multiplets. Without any clear geometrical or physical
meaning, j led nonetheless to a useful empirical rule for doublets and
triplets: doublets were supposed to originate from two energy-sublevels
$j = k,\ k - 1$ while triplets from three sublevels $j = k,\ k - 1,\ k - 2$, with
the selection rule $\Delta j = 0, \pm 1$. One year later Landé24 provided the inner
quantum number j with a physical meaning in the light of the so-called
atomic core model.

19 For a historical reconstruction of the research on the X-rays and their spectra see Heilbron
(1966), (1967).
20 Sommerfeld (1916a).
21 See de Broglie and Dauvillier (1922).
22 See Forman (1968).
23 Sommerfeld (1920).
24 Landé (1921a), (1921b).

The atomic core model – differently re-elaborated by Landé,
Heisenberg, Bohr and Pauli – became the standard spectroscopic model
for the explanation of multiplets until Pauli’s breakthrough in 1924.
According to the model, the atomic core of any atom (i.e. the internal
part – valence electrons aside – consisting of the nucleus plus the bound
electrons in closed shells) was supposed to have a non-zero angular
momentum in units of $h/2\pi$, denoted by a core quantum number r and
represented by the core angular momentum vector R. The orbital angular
momentum of the valence electron, denoted by the azimuthal quantum
number k, was represented by the vector K (in modern notation L). The
sum of K and R was supposed to give the total angular momentum vector
of the atom J. Landé identified the total angular momentum with
Sommerfeld’s inner quantum number j. He was perhaps driven to such
an identification by Sommerfeld’s selection rule for j, which surprisingly
coincided with Rubinowicz’s selection rule for the total angular momen-
tum of the atom.25 Multiplets seemed to arise from different energy-
sublevels due to different orientations of K with respect to R, and, as required
by space quantization, only a few discrete orientations (in units of $h/2\pi$) were
allowed, namely those associated with the multiplet components.
The atomic core model accounted satisfactorily for the observed multi-
plets, because the supposed electron–core coupling was a mistaken empir-
ical surrogate for the spin–orbit coupling: it entailed the same phenomena
that are actually entailed in the spin–orbit coupling. The atomic core soon
turned out to play no role in the total angular momentum of the atom and
its alleged angular momentum was substituted by the electron’s spin
angular momentum. This turning point dates back to Pauli’s spectroscopic
research, as I shall explain in Section 2.2.4. But in the pre-spin era, in order
to account for the observed multiplicity of spectral terms, the most natural
way was to ascribe a non-zero angular momentum to the atomic core and
to assume a coupling between the core and the electron’s orbital angular
momentum.

25 Rubinowicz (1918). Rubinowicz was one of Sommerfeld’s students and assistants. While
working on the conservation of angular momentum in the classical process of radiation, he
found the following integral selection rules for the azimuthal and the magnetic quantum
numbers: $\Delta k = 0, \pm 1$ and $\Delta m = 0, \pm 1$. But Rubinowicz identified the azimuthal quantum
number with the total angular momentum of the atom, rather than with the electron’s
orbital angular momentum. Accordingly, the physical meaning of this rule was that the
total angular momentum of the atom was quantized as required by Sommerfeld’s space
quantization. For the historical reconstruction of the way Landé was driven to identify
j with the total angular momentum via Rubinowicz’s selection rule, see Forman (1970),
p. 196.

There were however a few striking difficulties with the atomic core
model. First, it contradicted Bohr’s building-up principle. Consider, for
instance, the atomic cores of alkalis. They coincide with the noble gases
occupying the column immediately preceding them in Mendeleev’s peri-
odic table. Hence, if alkalis had a non-zero core momentum as the model
required, noble gases too should have a non-zero angular momentum. But
this was in contrast with experimental evidence.
Furthermore, even granted a non-zero angular momentum for noble
gases, it was still impossible to get doublets and, more generally,
multiplets of even multiplicity. According to the atomic core model, the
values of the inner
quantum number j were defined as $|k - r| \le j \le |k + r|$. But given the
integral values of the electron azimuthal quantum number k and the core
quantum number r, it was possible to get spectral terms of odd multiplicity
(e.g. singlets, triplets, . . . ), but not of even multiplicity (e.g. doublets,
quadruplets, . . . ). Alkali doublets remained unexplained.
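
The parity argument above can be verified by direct enumeration. The sketch below assumes that j runs in unit steps between |k − r| and k + r, which is how the inequality is read here; the half-integral case in the usage lines is added only for contrast, and the function names are hypothetical.

```python
from fractions import Fraction


def inner_quantum_numbers(k, r) -> list:
    """Values of j with |k - r| <= j <= k + r, taken in unit steps.

    The unit step is an assumption of this sketch. Fractions are used so
    that half-integral inputs are handled exactly.
    """
    k, r = Fraction(k), Fraction(r)
    j = abs(k - r)
    values = []
    while j <= k + r:
        values.append(j)
        j += 1
    return values


def multiplicity(k, r) -> int:
    """Number of j-sublevels, i.e. the multiplicity of the term."""
    return len(inner_quantum_numbers(k, r))


if __name__ == "__main__":
    # Integral k and r, as in the atomic core model: odd multiplicities only.
    for k in (1, 2, 3):
        for r in (0, 1, 2):
            print(f"k={k} r={r} multiplicity={multiplicity(k, r)}")
    # Contrast case (not part of the integral model): a half-integral r.
    print("k=1 r=1/2:", inner_quantum_numbers(1, Fraction(1, 2)))
```

With integral k and r the count is 2·min(k, r) + 1, which is always odd; only a half-integral value produces the two sublevels that a doublet requires.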
The atomic core model contradicted Bohr’s building-up principle in
another way: the inner quantum number j and the number of states
associated with it did not remain adiabatically invariant during the
building-up process. As Landé26 pointed out, the inner quantum number $j^{+}$ of
an ion did not coincide with the core quantum number r of the immediately
preceding element in the periodic table, as Bohr’s principle required.
Rather, Landé found that

$$j^{+} = r - 1/2 \qquad (2.9)$$

and more generally

$$j = r + k - 1/2,\; r + k - 3/2,\; \ldots \qquad (2.10)$$

The number of states associated with j seemed to decrease in the building-up
process by an inexplicable half-integral unit. And this decrease brought
with it a corresponding decrease in the statistical weights of the states. But
the building-up principle required the statistical weights to remain invariant.
The riddle of statistical weights27 was no less puzzling than the
doublet riddle. Indeed they were two sides of the same coin as the analysis
of the Zeeman effect soon revealed.

26 Landé (1923a). See also Serwer (1977), pp. 206–7.
27 For this terminology see Serwer (1977), pp. 204–7.

2.1.3 The anomalous Zeeman effect and the mystery
of half-integral quantum numbers
The atomic core model was also deployed to account for an even more
puzzling spectroscopic anomaly: the Zeeman effect. In 1896 Pieter Zeeman
observed that when a sodium flame was placed between the poles of a
Ruhmkorff electromagnet the two lines of the first principal doublet were
considerably broadened.28 One year later Lorentz provided a classical
explanation for this phenomenon.29 The external magnetic field H was
supposed to induce a change in the motion of the electrons. The electrons,
whose orbit-planes were normal to the direction of H, were speeded up (if
moving in a counter-clockwise direction) or slowed down (if moving in a
clockwise direction) from their usual orbital frequency  0 by an amount of
respectively  depending on the strength of the field, the charge-to-
mass ratio of the electron and the velocity of light (i.e.   ¼ eH/(4p mc)).
It was soon realised that  was nothing but Larmor’s frequency. If the
electrons’ motions are viewed parallel to the direction of the field, the
emitted light is right- and left-handed circularly polarized so that a doublet
is observed (Fig. 2.2B) with frequency  d ¼  0   L. On the other hand, if
viewed perpendicular to the field, three components are observed: a central
unshifted line and two lines displaced on either side of the central one, with

(a) (b) (c) (d)
y– X+ y–
X y+ X X


⊥ ⊥

–∆ν +∆ν

–∆ν +∆ν

s s s ν0 s
ν0 ν
Z (e) (A) (B)

Fig. 2.2 Diagrams for Lorentz’s explanation of the normal Zeeman effect,
where (A) is the diagram for the triplet and (B) for the doublet.
Source: H. E. White (1934) Introduction to Atomic Spectra (New York,
London: McGraw-Hill Book Company). Reproduced with permission of
The McGraw-Hill Companies.

28 29
Zeeman (1896). Lorentz (1897).
frequency $\nu_t = \nu_0,\ \nu_0 \pm \nu_L$ respectively (Fig. 2.2A). This triplet is called the
Lorentz normal triplet and is the spectral pattern of the so-called normal
Zeeman effect typical of the series spectra of zinc, copper, and cadmium,
among others.
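
As a rough numerical illustration of Lorentz’s result, the following Python sketch evaluates the Larmor frequency $\nu_L = eH/(4\pi mc)$ in Gaussian units, together with the line positions of the doublet and of the normal triplet. The function names, the line frequency and the field strength are illustrative assumptions and are not taken from the sources discussed here.

```python
import math

# Physical constants in Gaussian (CGS) units, rounded standard values.
E_CHARGE = 4.80320e-10   # electron charge, esu
E_MASS = 9.10938e-28     # electron mass, g
C_LIGHT = 2.99792e10     # speed of light, cm/s


def larmor_frequency(h_gauss):
    """Larmor frequency nu_L = eH / (4 pi m c) in Hz, for a field H in gauss."""
    return E_CHARGE * h_gauss / (4 * math.pi * E_MASS * C_LIGHT)


def longitudinal_doublet(nu0, h_gauss):
    """Lines seen parallel to the field: nu0 - nu_L and nu0 + nu_L."""
    nu_l = larmor_frequency(h_gauss)
    return (nu0 - nu_l, nu0 + nu_l)


def normal_triplet(nu0, h_gauss):
    """Lines seen perpendicular to the field: nu0 - nu_L, nu0, nu0 + nu_L."""
    nu_l = larmor_frequency(h_gauss)
    return (nu0 - nu_l, nu0, nu0 + nu_l)


if __name__ == "__main__":
    nu0 = 5.0e14      # hypothetical unperturbed line frequency, Hz
    field = 1.0e4     # hypothetical field strength, gauss
    print(f"nu_L = {larmor_frequency(field):.4e} Hz")
    print("doublet:", longitudinal_doublet(nu0, field))
    print("triplet:", normal_triplet(nu0, field))
```

With a field of the order of 10^4 gauss the shift is about 1.4 × 10^10 Hz, tiny compared with an optical frequency of some 10^14 Hz, which is in keeping with the remark that instruments of low resolving power show only a broadening of the lines.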
The low resolving power of the instruments did not actually allow
Zeeman to observe doublets and triplets; he could only observe that the
lines were widened in the presence of a magnetic field. It was Preston who
first observed – using instruments with greater resolving power – that not
only were certain lines of the zinc spectrum split up into triplets when
viewed perpendicular to the field, but also that others were split into as
many as four and even six components. In December 1897, Preston
reported to the Royal Dublin Society that the two D lines of sodium
were split into a quadruplet and a sextuplet (Fig. 2.3).
Similar anomalous patterns were later observed in the series spectra of
other elements and they all fell under the name of anomalous Zeeman
effect. A few years later, Runge30 pointed out that the frequencies of the
components of the anomalous Zeeman pattern were rational fractions –
called Runge fractions – of the so-called Lorentz unit of the Zeeman
effect.31 Despite Runge’s law and similar empirical rules, an explanation
of the anomalous Zeeman effect was beyond the reach of spectroscopy
until 1921, when Landé made an important contribution. He managed to

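[Figure 2.3 panels: the zinc singlet and the sodium principal doublet, each shown with no field and in a weak field. The zinc line splits into the normal triplet; the sodium lines split into anomalous patterns.]

Fig. 2.3 Spectral patterns of the normal Zeeman effect of zinc and of the
anomalous Zeeman effect of sodium, viewed perpendicular to the magnetic field.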
Source: H. E. White (1934) Introduction to Atomic Spectra (New York,
London: McGraw-Hill Book Company). Reproduced with permission of
The McGraw-Hill Companies.

30 Runge (1907).
31 For instance, the anomalous Zeeman pattern of the sodium principal series doublet can be
expressed as $\pm 2/3\,L$ for the quadruplet and $\pm 4/3\,L$ for the sextuplet, where L is the Lorentz
unit of the Zeeman effect, $L = eH/(4\pi mc^2)$.
derive the anomalous doublets, but at the cost of strengthening the doublet
riddle and the riddle of statistical weights.
Landé32 empirically ascribed half-integral values to the magnetic quantum
number (and hence also to the inner quantum number j):
$m = \pm 1/2, \pm 3/2, \pm 5/2, \ldots, \pm(j - 1/2)$, so that he could derive doublets. But
half-integral values for j and m clearly violated the Bohr–Sommerfeld theory,
which allowed only integral values for quantum numbers, as well as Bohr’s
building-up principle, which forbade the decrease of a half-integral unit in
the building-up process. The riddle of statistical weights was still looming
on the horizon. Furthermore, half-integral quantum numbers for alkali
doublets were in contrast with relativistic X-ray doublets that required
only integral quantum numbers: Landé’s proposal strengthened the doub-
let riddle.
Landé also calculated the energy W of a Zeeman state (i.e. the Zeeman
Hamiltonian describing the interaction energy between the external mag-
netic field and the magnetic moment of the atom) as
$$W = W_0 + [\,m g h\,(eH/4\pi mc)\,] = W_0 + m g h \nu_L \tag{2.11}$$
where $W_0$ is the Hamiltonian of the atom in the absence of an external
field, m is the magnetic quantum number,33 $\nu_L$ is the Larmor frequency
and g is the so-called Landé g factor. The g factors were introduced on a
purely empirical basis as Grundenergieniveau specific for each spectral
term.34 The magnetic quantum number m times g turned out to give the
right splitting factors for the anomalous doublets. For instance, by empir-
ically fixing the g values for the following spectral terms of the sodium
principal and sharp series doublets,35 the resultant mg splitting factors
could be obtained:

$g = 2/3$ for the term $^2P_{1/2}$ $\Rightarrow$ $mg = \pm 1/3$

$g = 4/3$ for the term $^2P_{3/2}$ $\Rightarrow$ $mg = \pm 2/3, \pm 6/3$

$g = 2$ for the term $^2S_{1/2}$ $\Rightarrow$ $mg = \pm 1$
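
The arithmetic behind these splitting factors can be checked with a few lines of Python. This is only a sketch: the helper name `splitting_factors` is hypothetical, and it reads the subscript of each term as j with m running from −j to +j in unit steps, which is the modern reading of the term symbols rather than Landé’s own bookkeeping.

```python
from fractions import Fraction


def splitting_factors(g, j):
    """Return the products m*g for m = -j, -j+1, ..., +j (m half-integral)."""
    g, j = Fraction(g), Fraction(j)
    count = int(2 * j) + 1                      # number of magnetic substates
    return [(-j + i) * g for i in range(count)]


# Empirical g values quoted above for the sodium doublet terms.
terms = {
    "2P1/2": (Fraction(2, 3), Fraction(1, 2)),
    "2P3/2": (Fraction(4, 3), Fraction(3, 2)),
    "2S1/2": (Fraction(2, 1), Fraction(1, 2)),
}

for name, (g, j) in terms.items():
    factors = ", ".join(str(f) for f in splitting_factors(g, j))
    print(f"{name}: g = {g}, mg = {factors}")
```

Because the fractions are reduced automatically, the values ±6/3 of the 2P3/2 term are printed as ±2.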

32 Landé (1921a).
33 Notice the difference between the first m in the formula (the magnetic quantum number),
and the second m in the denominator, which denotes the electron’s mass in the
charge-to-mass ratio of the Larmor frequency.
34 Landé (1921a). For a historical reconstruction, see Forman (1970).
35 Here the subscript of the spectral term denotes Landé’s half-integral j as well as the
corresponding magnetic number m, e.g. the term $^2P_{1/2}$ corresponds to $j = 1/2$ and
$m = \pm 1/2$. For details see White (1934), p. 158.
Landé proceeded to generalize this result to singlets and triplets. But how
to justify the g factors and their fractional values? How to explain the
$2(2k - 1)$ states observed in the presence of a magnetic field? Landé’s
phenomenological analysis left these questions unanswered. As we shall
see in the following sections, it was Heisenberg’s, Bohr’s, and Pauli’s task
to address them and find possible solutions.
Two years later, Landé returned to the problem.36 By using the atomic
core model he found an empirical formula for the g factors, which turned
out to depend on the quantum numbers j, r and k:
$$g = \frac{3}{2} - \frac{1}{2}\,\frac{k^2 - r^2}{j^2 - \frac{1}{4}} \tag{2.12}$$

Setting $k = l + 1/2$, $r = s + 1/2$, and $j = j' + 1/2$, Landé’s formula corresponds
to the current formula for the g factor, in modern notation

$$g = \frac{3}{2} - \frac{1}{2}\,\frac{l(l+1) - s(s+1)}{j'(j'+1)} \tag{2.13}$$

where the spin quantum number s has replaced the core quantum number
r and $j = \sqrt{j(j+1)}$. Landé’s procedure was entirely a posteriori: both the
g factors and the formula (2.12) were obtained from a phenomenological
analysis of spectra. Looking for a justification, Landé hypothesized – in
the light of the atomic core model – that the mg splitting factor corres-
ponded to the sum of the projections of the electron angular momentum
vector K and the core angular momentum vector R on the external field
vector H. Both K and R Larmor-precessed around the resultant total
angular momentum vector J, with J in turn Larmor-precessing around
H with angular velocity $g\omega_L$. There was however a puzzling anomaly
concerning the observed unexpected value 2 of the g factor. In other
words, the mg splitting factor was equal to

$$mg = |K| \cos(K, H) + 2|R| \cos(R, H) \tag{2.14}$$

rather than

$$mg = |K| \cos(K, H) + |R| \cos(R, H) \tag{2.15}$$

36 Landé (1923a).
as was expected from Larmor’s theorem applied to K and R. This suggested
that R precessed twice as fast around H as K did, i.e. R’s precessional
angular velocity seemed to be twice Larmor’s angular velocity $\omega_L$.
This faster precession was ascribed to an anomalous magneto-mechanical
ratio between the core magnetic moment $\mu_r$ and the core angular momentum
$p_r$, a ratio that was supposed to be twice the corresponding
magneto-mechanical ratio for the electron

$$\mu_r / p_r = 2(\mu_k / p_k) = 2(e/2mc) \tag{2.16}$$

There were only two alternatives to account for this ‘core magnetic anom-
aly’, experimentally detected a few years earlier:37 either to modify
Larmor’s theorem, or to postulate a further rotation of the core so as to
account for its faster precessional angular velocity. Landé first tried to
modify Larmor’s theorem without success. In 1923 he was driven to the
second alternative.38 Both alternatives were wrong and a conclusive under-
standing of this anomaly came only with Pauli’s rejection of the atomic
core model and the later introduction of the electron spin, whose magneto-
mechanical ratio is exactly twice the orbital magneto-mechanical ratio.
Furthermore, as we shall see in Chapter 4, only with Dirac’s equation for
the electron in 1928 did it become possible to derive the spin magnetic
moment, and to explain the anomalous value 2 of the g factor. This was
indeed one of the most important achievements of the new quantum
theory. But in the framework of the (pre-spin) old quantum theory, the
culprit to blame for the observed anomaly was the atomic core.
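
A short Python check makes the relation between (2.12), (2.13) and the anomalous factor 2 of (2.14) concrete. It is a sketch under stated assumptions: the function names are hypothetical, and `g_from_projections` uses the later quantum-mechanical vector-model averages for the projections on the total angular momentum, which is not how Landé himself argued in 1923.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def g_lande_1923(k, r, j):
    """Eq. (2.12): Landé's empirical formula in his own quantum numbers."""
    return Fraction(3, 2) - HALF * (k * k - r * r) / (j * j - Fraction(1, 4))


def g_modern(l, s, j):
    """Eq. (2.13): the same formula in terms of l, s and the modern j."""
    return Fraction(3, 2) - HALF * (l * (l + 1) - s * (s + 1)) / (j * (j + 1))


def g_from_projections(l, s, j, weight):
    """g obtained by weighting the second (core or spin) projection by `weight`.

    Assumption: modern vector-model averages of the two angular momenta
    along the total angular momentum.
    """
    jj, ll, ss = j * (j + 1), l * (l + 1), s * (s + 1)
    orbital = (jj + ll - ss) / 2      # projection of the orbital part
    second = (jj + ss - ll) / 2       # projection of the core/spin part
    return (orbital + weight * second) / jj


# Sodium doublet terms as (l, s, j) in modern notation.
TERMS = {
    "2S1/2": (0, HALF, HALF),
    "2P1/2": (1, HALF, HALF),
    "2P3/2": (1, HALF, 3 * HALF),
}

for name, (l, s, j) in TERMS.items():
    print(name,
          g_lande_1923(l + HALF, s + HALF, j + HALF),
          g_modern(l, s, j),
          g_from_projections(l, s, j, weight=2),
          g_from_projections(l, s, j, weight=1))
```

For each term the first three numbers coincide (2, 2/3 and 4/3, the empirical values quoted earlier), whereas weight 1, the case expected from Larmor’s theorem as in (2.15), gives g = 1 throughout.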
Landé’s analysis of the anomalous Zeeman effect left some important
questions unanswered. First, the failure of Larmor’s theorem in the anom-
alous Zeeman effect needed to be explained. Second, half-integral values
cried out for a theoretical explanation: they were postulated to accommo-
date spectroscopic data, although they were not allowed in the
Bohr–Sommerfeld theory (doublet riddle and riddle of statistical weights).
Should this theoretical impasse be overcome by abandoning the
Bohr–Sommerfeld theory together with the building-up principle? Or
should half-integral quantum numbers be rejected as non-orthodox?
Could the two horns of this dilemma be reconciled? In the short but intense

37 In 1919 E. Beck performed a series of experiments that shed light on the existence of an
anomalous magneto-mechanical ratio for all ferromagnetic materials. However, Beck
thought that such an anomalous ratio was due to the electron’s orbital motion, and not to
the atomic core as later Landé (and after him Heisenberg) supposed.
38 An attempt to modify Larmor’s theorem without any compelling theoretical reason is
given in Landé (1921b), while the second option is discussed in Landé (1923a).
period between 1921 and 1924 Bohr, Heisenberg, and Pauli addressed
these questions and proposed alternative solutions that I now turn to in
some detail.

2.2 Bohr, Heisenberg, and Pauli on spectroscopic anomalies

2.2.1 Niels Bohr: nothing but a ‘non-mechanical constraint’?
In an article written for a special issue of Annalen der Physik dedicated to
the spectroscopist Kayser, Bohr offered a survey of the state of the art in
spectroscopy up to 1923.39 Starting with the quantum theory of condition-
ally periodic systems and the correspondence principle, Bohr investigated
how this theoretical framework fared on the score of spectroscopic find-
ings. While the treatment of one-electron atom spectra was fairly unprob-
lematic, an adequate treatment of many-electron atom spectra had proved
more complicated. Even more serious difficulties arose in the case of the
anomalous Zeeman effect and the Paschen–Back effect:40 while the former
concerns spectral patterns in a weak magnetic field, the latter concerns
spectral patterns in a strong magnetic field.41 Bohr could not help noting
that Landé’s experimental findings about the anomalous Zeeman effect
were at odds with the quantum theory of periodic systems:

It follows from Landé’s analysis that the value of the component of the angular
momentum of the atom parallel to the field, divided by 2p, may not always be
equal to an integer, as assumed in the interpretation of the Zeeman effect of the
hydrogen lines . . . Landé actually obtained values for this quantity . . . differing
from an integer by 1/2. This finding has made questionable the justification of
fixing the motion in the stationary states of an atom with several electrons in
direct analogy with the quantum theory of periodic systems.42

Not disheartened by Landé’s experimental findings, Bohr put forward a
very speculative hypothesis that could accommodate evidence. Following
the atomic core model, Bohr assumed that the spectral multiplicity arose

39 Bohr (1923a). 40 Paschen and Back (1921).
41 In its simplest form, the Paschen–Back effect concerns the appearance of triplets that
resemble the Lorentz normal triplets of the normal Zeeman effect. In the terminology of
the semi-classical vector model, just as the doublet splitting in the anomalous Zeeman effect is
due to the spin–orbit coupling, in which both k and s precess (with s twice as fast as k),
so the triplets of the Paschen–Back effect are due to the spin–orbit decoupling: in
a strong magnetic field, k and s are decoupled, separately quantized, and precess around the
field independently of each other.
42 Bohr (1923a). Reprinted in Bohr (1977), p. 630.
from the coupling of the core’s inner orbits with the valence electron’s
outer orbit. This resulted in radiative transitions not corresponding to the
fundamental harmonic component of the electronic motion, but to smaller
harmonic components: the correspondence principle still offered the gen-
eral framework for an understanding of complex spectra. Furthermore, he
noticed a ‘recurring closure of certain electron groups with increasing
atomic number, a closure expressing itself in a disinclination of these
groups to admit any additional electrons in orbits with the same values
of the quantum number n and k’.43 As we shall see in Section 2.2.4, what
Bohr here mildly describes as a ‘disinclination’ of groups to admit add-
itional electrons in orbits with the same quantum numbers soon turned out
to be a ‘prohibition’ according to Pauli’s exclusion rule. By contrast, Bohr
traced this ‘disinclination’ back to the symmetry requirements he envi-
saged for the closure of electronic orbits of noble gases,44 even if, as
mentioned in Section 2.1.1, these symmetry requirements implied a prob-
lematic reopening of already closed electronic orbits.
The interplay between inner and outer orbits (i.e. the core–electron
coupling) could – once suitably adjusted – account both for the alkali
doublets and for the anomalous Zeeman effect. In the case of alkalis,
Bohr fixed the j-values of the doublet p-terms ($k = 2$) equal to $k + d$ and
$k + d - 1$, whereas for the singlet s-term ($k = 1$) $j = 1 + d$, ‘where d is a
quantity that is independent of k, but is to begin with undetermined’.45
Bohr’s explanation of the anomalous Zeeman effect was even more ingenious.
As Landé’s analysis had revealed, an atom placed in a weak magnetic field
could take on $2(2k - 1)$ states. But this number was not equal to the
product of the number of electron states with the number of core states
in the magnetic field. In fact, the atomic core could take on only one state,
i.e. perpendicular to the field direction, by analogy with noble gases. On
the other hand, the valence electron could take on only 2k positions in the
field so that the product of the respective positions of the core and electron
was 2k, a number half as large as Landé’s experimentally detected
$2(2k - 1)$. To accommodate this discrepancy, Bohr advanced the following

43 Ibid., p. 642.
44 ‘We may assume that, in the normal states of the atoms of the noble gases, we are dealing
with electron configurations of pronounced symmetry. Actually, we see just in this
symmetry the explanation of the closed character of these electron groups; for the admission
of additional electrons, which would destroy this symmetry, would, in fact, not show that
analogy with a classical radiation process which could be established for the possible
transitions between stationary states of simply or multiply periodic systems.’ Ibid., p. 645.
45 Ibid., p. 645.
We are led to the view that, because of stability properties of the atom which
cannot be described mechanically, the coupling of the series electron to the
atomic core is subject to a constraint (Zwang) which is not analogous to the
effect of an external field, but which forces the atomic core to assume two
different positions in the atom, instead of the single orientation possible in a
constant external field, while at the same time, as a result of the same constraint,
the outer electron, instead of 2k possible orientations in an external field, can
only assume $2k - 1$ orientations in the atomic assemblage.46

A ‘non-mechanical constraint’47 (unmechanischer Zwang) not analogous
to the action of any external field was supposed to intervene in the
electron–core coupling to set right the total number of the states in the
anomalous Zeeman effect: the Zwang would subtract one position from
the 2k positions of the electron (leaving $2k - 1$) and would
add it to the core, which as a result would take two possible positions instead
of one.48 As if by magic, the total number of anomalous Zeeman states
turned out to be $2(2k - 1)$. The ‘non-mechanical constraint’ was plucked
out of the air as an ad hoc auxiliary hypothesis to retain the validity of the
atomic core model, and to reconcile it with recalcitrant evidence. But its
physical nature remained obscure: since it was not analogous to the effect
of a magnetic or electric field, what kind of physical constraint was it and
how did it originate? Without answering these questions, Bohr contented
himself with reconciling theory and evidence.
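
The bookkeeping behind Bohr’s Zwang is simple enough to spell out. The sketch below (function names are illustrative) counts states as the product of core positions and electron positions, before and after the constraint, and compares the result with Landé’s $2(2k - 1)$. The last function is an added cross-check in the later language of spin, assuming l = k − 1 and j = l ± 1/2; it is not part of Bohr’s argument.

```python
from fractions import Fraction


def states_without_constraint(k):
    """One core position times 2k electron positions."""
    return 1 * (2 * k)


def states_with_constraint(k):
    """Two core positions times 2k - 1 electron positions (Bohr's Zwang)."""
    return 2 * (2 * k - 1)


def states_from_spin(k):
    """Sum of 2j + 1 over j = l + 1/2 and j = l - 1/2, with l = k - 1."""
    l = k - 1
    js = [l + Fraction(1, 2)]
    if l > 0:                       # for l = 0 only j = 1/2 exists
        js.append(l - Fraction(1, 2))
    return int(sum(2 * j + 1 for j in js))


for k in (1, 2, 3):
    print(k, states_without_constraint(k),
          states_with_constraint(k), states_from_spin(k))
```

For the p-terms (k = 2), for example, the unconstrained product gives 4 states against the 6 that Landé’s analysis required, and the constrained count agrees with the count obtained from the two fine-structure values of j.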
Bohr was however well aware that this way of amending the atomic
core model was in contrast with the quantization rules for periodic sys-
tems. A clear sign of this contrast was the half-integral values of the
magnetic quantum number m and inner quantum number j introduced
by Landé:
This finding makes it natural to surmise that the orientation of the orbit of the
series electron relative to the atomic core cannot be described either by integer
values of j; in particular, for reasons of symmetry, the formal assumption
$d = 1/2$ might offer itself.49

Given the total angular momentum j of the atom, saying that its half-integral
values are due to the sum of k with a new quantity $d = 1/2$, is empirically
equivalent to saying, in the later semi-classical terminology, that the total

46 Ibid., p. 646.
47 The adjective ‘non-mechanical’ was later added to Bohr’s ‘constraint’ by Heisenberg
(1925a). On Bohr’s unmechanischer Zwang, see Serwer (1977).
48 The hypothesis of a non-mechanical force not analogous to a Coulomb force had been
previously used by Van Vleck (1922) to explain the anomalous instability of the helium
49 Bohr (1923a). Reprinted in Bohr (1977), p. 647.
angular momentum of the atom, which is contributed by the valence electron
alone, is made up of two parts: the electron’s orbital angular momentum
(given by k) and the electron’s spin angular momentum ($s = 1/2$) so that
$j = k \pm s$, hence the doublet p-terms $j = k + 1/2$ and $j = k - 1/2$. Bohr’s
quantity d played a role empirically equivalent to that of the electron’s spin s to
bring about the half-integral values of the total angular momentum,
and hence the doublet terms. However, d did not anticipate the electron’s
spin, because it was not meant to be a property of the electron. It was
rather an empirically fixed quantity suitably chosen to accommodate experi-
mental evidence, but whose physical meaning was as obscure as that of the
Surprisingly enough, half-integral values for j and m did not undermine
the validity of the building-up principle, whose scope was restricted to
quantum numbers for individual electrons in nk-orbits. Thus, while allow-
ing half-integral values for j and m, Bohr insisted that half-integral values
for the azimuthal quantum number k ‘must be regarded as a departure
from [the quantum theory of periodic systems] which can hardly be sub-
stantiated’, and whose consequences ‘seem to contradict our experience
about spectra’.50 The target of this blow, explicitly mentioned in the same
passage, was a bold version of the atomic core model elaborated by Werner
Heisenberg in 1921 that relied precisely on unorthodox (building-up prin-
2.2.2 Heisenberg’s first core model: the sharing principle.
Does success justify the means?
After Landé’s empirical findings on the anomalous Zeeman effect,51
Heisenberg too tried to accommodate the anomalous evidence with the
atomic core model. To this purpose he elaborated a first version of the
model in 1922, and a second improved one in 1924, known in the historical
literature respectively as Heisenberg’s first and second core models. The
first core model52 hinged on the so-called sharing principle: the valence
electron was supposed to ‘share’ half of its angular momentum with the
core, which was taken as originally having no net angular momentum. The
quantum condition for the electron’s angular momentum was fixed as

50 Ibid., footnote on p. 647. 51 Landé (1921a), (1921b).
52 Heisenberg (1922). For a detailed historical analysis of the first core model, see Cassidy
(1979) on which I draw for this section.

$$\oint p \, d\varphi = k^{*} h \tag{2.17}$$

where p is the electron’s angular momentum, $\varphi$ is the azimuthal angle, and
k* is the azimuthal quantum number that under the sharing principle was
set equal to $k - 1/2$ (the asterisk indicates that the number was half-integral
valued). While Landé and Bohr retained integral values for k as required
by the Bohr–Sommerfeld quantum conditions (2.5)–(2.6), Heisenberg
assigned it – in a completely unorthodox way – half-integral values
($k = 1/2, 3/2, \ldots$). On the other hand, the core picked up the half-quantum
from the valence electron so that its angular momentum was equal to 1/2.
In this case, it was the core (not Bohr’s quantity d) that played a role
empirically equivalent to that of the electron spin to bring about the alkali
doublets and their Zeeman splitting. The half-integral values for k* and r*
allowed Heisenberg to obtain doublets in agreement with Sommerfeld,
who for $k = 2$ (i.e. p-terms) had fixed the doublet states $j = k$ and $j = k - 1$.
Summing k* and r* Heisenberg got respectively $j = (k - 1/2) + 1/2 = k$ and
$j = (k - 1/2) - 1/2 = k - 1$.
In the vectorial notation, the electron’s angular momentum vector k*
was supposed to move around the core’s angular momentum vector r*
with an angle $\theta$ in such a way that an internal magnetic field HInt (whose
direction coincided with k*) was created. The core precessed about HInt
with interaction energy
$$E = \frac{1}{2} \cdot \frac{h}{2\pi} \cdot \frac{e}{2mc} \cdot |H_{Int}| \cos\theta = \frac{1}{2}\, h \nu_L \cos\theta \tag{2.18}$$


Fig. 2.4 The core–electron coupling in the presence of an external magnetic field
H, according to Heisenberg’s first core model. Adapted from Cassidy (1979).
where $\nu_L$ is the Larmor frequency. In the presence of an external magnetic
field H, because of the interaction between the internal HInt and
the external field H, the core was supposed to stay along the resultant
H + HInt (Fig. 2.4) with corresponding interaction energy
$$E = h \nu_L (k \cos\alpha + r \cos\beta + \gamma \cos\theta) \tag{2.19}$$
where $\gamma = |H_{Int}| / |H|$; $k \cos\alpha$ was the projection of the electron angular
momentum on H;53 $r \cos\beta$ was the projection of the core angular momentum
($r = 1/2$) on H; and $\gamma \cos\theta$ denoted the internal magnetic interaction
between the core and the electron. Heisenberg gave an explanation of the
anomalous Zeeman effect as well as of the Paschen–Back effect using
simple geometrico-mechanical properties of this model. He pointed out
that if H increases (strong magnetic field), the angle $\theta$ between HInt + H
and HInt increases. Accordingly, the ratio $\gamma = |H_{Int}| / |H|$ decreases, with
the consequence that the internal field HInt is gradually overcome and the
decoupling of the two fields causes the appearance of the triplet typical of
the Paschen–Back effect.
On the other hand, if H is small (weak magnetic field), the ratio $\gamma$
increases ($\gamma \to \infty$), while $\theta$ decreases so that $\cos\theta$ approaches $\pm 1$. If
$\cos\theta = -1$, the core (which, recall, was supposed to stay along the
HInt + H axis) is aligned parallel to HInt. If $\cos\theta = +1$, the core is aligned
anti-parallel to HInt. This simple geometrical picture accounted for the
doublet fine structure. The parallel or anti-parallel alignment of the core
vector r with the electron vector k yielded a result empirically equivalent
to the one obtained by taking into account the spin–orbit coupling con-
tribution in the fine structure Hamiltonian. In other words, the total
angular momentum $j = k \pm r$ was empirically equivalent to what in the
later semi-classical spinning electron model would be $j = l \pm s$, with the spin
magnetic moment $\mu_s$ aligned either anti-parallel or parallel to H, respect-
ively (see Fig. 2.5). From this model, Heisenberg could derive the
Sommerfeld–Voigt formula, from which the Landé g factors for doublets
could be retrieved.54 The calculated splitting for the lithium 2p level,

53 In contrast with Landé, who identified the magnetic quantum number m with the
projection of the total angular momentum on H, Heisenberg identified m with the
projection of the electron angular momentum. Given the half-integral values for k*, m*
also took half-integral values ($m^{*} = \pm 1/2, \pm 3/2, \ldots, \pm k$) in agreement with Landé.
54 In September 1921 Sommerfeld (1922) was working on the Voigt theory of the anomalous
Zeeman effect for sodium D lines. Only with Heisenberg’s introduction of half-integral k
could the Sommerfeld–Voigt formula yield the optical doublets. See Cassidy (1979),
pp. 196–202.
Fig. 2.5 The spin–orbit coupling for the two fine-structure states (a) $j = l + 1/2$
and (b) $j = l - 1/2$, according to the semi-classical spinning electron model.
Source: H. E. White (1934) Introduction to Atomic Spectra (New York,
London: McGraw-Hill Book Company). Reproduced with permission of The
McGraw-Hill Companies.

$\Delta\nu_{2p} = 0.32\ \mathrm{cm}^{-1}$, was close to the experimentally found value,
$\Delta\nu_{2p} = 0.34\ \mathrm{cm}^{-1}$.
Despite the relative success in recovering both the Paschen–Back effect
and the anomalous Zeeman effect, the model faced serious theoretical
difficulties in the case of intermediate fields. As mentioned, Heisenberg
assumed that the core stayed along the resultant axis HInt + H, and that
the transition from weak to strong fields was an adiabatic process. Thus,
for intermediate fields, the core was forced to stay invariantly in the
direction of HInt + H, without precessing about the electron vector k
under the effect of the internal magnetic field HInt. But this implied (1) a
violation of Larmor’s theorem; (2) a violation of Rubinowicz’s integral
selection rules $\Delta j = 0, \pm 1$ and $\Delta m = 0, \pm 1$.
The violation of Larmor’s theorem was not a novelty. As we saw in Section
2.1.3, Landé’s analysis had already revealed that Larmor’s theorem was
violated in the anomalous Zeeman effect, and he even suggested modifying
it. Pauli too – as we shall see later in this chapter – had to deal with the failure
of Larmor’s theorem, and indeed this was the main stumbling-block towards
his final breakthrough. But it is one thing to claim that Larmor’s theorem
fails because of a core magnetic anomaly (as Landé mistakenly supposed), or
because of the anomalous spin magnetic moment (as it was realized after
Pauli’s breakthrough). It is another thing to claim, as Heisenberg did, that
Larmor’s theorem applies both to the core–field and to the electron–field
interaction, but not to the core–electron interaction. Given the internal
magnetic field HInt, why should not the core precess about it? This was a
completely arbitrary and unjustified deviation from Larmor’s theorem.
Even worse was the violation of Rubinowicz’s selection rules, which
Landé complained of to Heisenberg. If the core is forced to stay invariantly
along the HInt + H direction, it can take on any continuous value. This
violated space quantization as well as the Rubinowicz rules: the total
angular momentum was not restricted to a set of discrete values. And
since Rubinowicz’s rules were obtained from the conservation of the
angular momentum in the classical theory of radiation, their violation
could be interpreted as due either to a violation of the conservation of
angular momentum or to a violation of the classical theory of radiation.
Between the two, Heisenberg sacrificed the latter by introducing non-
classical waves with fractional angular momenta, whose sum supposedly
yielded a classical wave with integral angular momentum.
This solution could not however prevent Heisenberg’s model from
facing even more theoretical difficulties, namely those Bohr was concerned
with. Fractional angular momenta violated the Bohr–Sommerfeld quan-
tum conditions. Furthermore, half-integral k were also a departure from
the relativistic X-ray doublets, for which the usual integral values were
used. Thus, Heisenberg’s first core model accounted for the optical doub-
lets in a way that contradicted the treatment of relativistic doublets: the
theoretical unification of optical and relativistic doublets was still far away
and the doublet riddle unsolved.
No less problematic was the half-integral value of the core. By picking
up half-quantum angular momentum from the valence electron, the
atomic core patently violated Bohr’s building-up principle. The quantum
numbers and the statistical weights did not remain adiabatically invariant.
The sharing principle strengthened the riddle of statistical weights.
To sum up, Heisenberg’s first core model could empirically entail alkali
doublets and their Zeeman splitting. But the price to pay for this empirical
success was the violation of several theoretical assumptions: Larmor’s
theorem, Rubinowicz’s selection rules, Sommerfeld’s space quantization,
the classical theory of radiation, and the Bohr–Sommerfeld quantum
conditions. Heisenberg announced his result in a letter to Pauli.
Intellectual honesty compelled him to admit the ‘drawbacks of this
theory’. Yet he optimistically concluded with the Machiavellian motto that
‘despite all: success justifies the means’.55 But, as a matter of fact, empirical
success does not justify the use of whatever means. No wonder then that
after two years Heisenberg proposed a second improved core model, which
in his expectations would succeed where the first model failed.

55 Heisenberg to Pauli, 19 November 1921. Pauli (1979), p. 44.
2.2.3 Heisenberg’s second core model: the branching rule
and a new quantum principle
Heisenberg’s first core model gave rise to a lively discussion. The theor-
etical scenario was messy enough to urge ‘the introduction of new
hypotheses – either new quantum conditions or proposals for changing
mechanics’.56 The year 1923 was the annus mirabilis for the history of the
anomalous Zeeman effect. Landé57 published his result on the g formula
(2.12) and the core magnetic anomaly, as we saw in Section 2.1.3. In
September, Pauli wrote to Landé that the quantities figuring in Landé’s
formula for the g factors (namely r, k and j) were not to be taken as the true
[wahren] angular momenta of the core, the valence electron, and the atom,
respectively. The ‘true’ angular momenta were rather given by $R = r + 1/2$,
$K = k + 1/2$, $J = j + 1/2$:
It seems that each angular momentum is not represented through one quantum
number, but rather through a pair of numbers. Under certain respects, angular
momenta appear to be twofold [zweideutig]. For instance, the invariance of the
statistical weights is violated during the [electron–core] coupling because the
core angular momentum is given by either of the numbers. Notice moreover that
this ‘twofoldness’ [Zweideutigkeit] concerns also k. According to this point of
view, there is no half-integral k. This seems to be in a better agreement also with
the X-rays.58

In this historically important passage, Pauli anticipated an idea that turned
out to be crucial for later developments: the Zweideutigkeit of the angular
momentum. By this vague expression Pauli denoted an otherwise unspecified
‘twofoldness’ of the angular momentum. A year later Pauli
came to use this expression to refer exclusively to the electron’s angular
momentum, as I shall discuss in detail in the next section. The historically
important fact is that already in September 1923 Pauli had introduced the
notion of Zweideutigkeit to denote a ‘twofold’ angular momentum. At that
stage it did not refer to the electron’s angular momentum in particular, but to any
angular momentum, i.e. that of the core, of the electron, and of the atom. Let
me then underline two significant consequences of this theoretical step:
(i) The Zweideutigkeit of the core’s angular momentum shed light on the
riddle of statistical weights: the core of an alkali ion did not coincide

56 57
58 Pauli to Landé, 23 September 1923. In Pauli (1979), p. 123.

with the preceding noble gas because there were now two possibilities
for its angular momentum, either r or r + 1/2.
(ii) The Zweideutigkeit of the electron’s angular momentum shed light on
the problematic half-integral values for k, which turned out to be due
to k + 1/2, while integral-valued k was in agreement with X-ray
Pauli’s notion of Zweideutigkeit exerted a direct influence on
Heisenberg. In May 1924, Heisenberg and Landé published a joint article
on the term-structure of higher-level multiplets,59 starting from Landé’s
recent analysis of the neon spectrum.60 The neon ion consisted of two
p-terms: a $p_1$-term with $j = 2$ and a $p_2$-term with $j = 1$. The addition of the
outer electron in the building-up process apparently made the $p_1$-term split
into a triplet and a quintuplet, while the $p_2$-term split into a singlet and a
triplet. Hence:
(i) from the neon ion ( j ¼ 1) with the addition of the valence electron, two
s-terms were derived with respectively j ¼ 1/2 and j ¼ 3/2;
(ii) from the neon ion ( j ¼ 2) another two s-terms were derived with j ¼ 3/2
and j ¼ 5/2.
Both these results seemed to spring from j ‘branching’ [Verzweigung] into
j  1/2 with the addition of the valence electron.61 Landé and Heisenberg
generalized this result in a ‘branching rule’ [Verzweigungsregel]:
When the atom is built up from the states of the ion that are characterized
through the j-values j1, j2, . . . , jn we assume that the atom possesses 2n s-terms
with the j-values j1  1/2, . . . , jn  1/2, and accordingly the atom shows 2n
multiplet-systems with multiplicity 2 j ¼ 2(j1  1/2), 2( j2  1/2), . . . and so on.62
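
The branching rule lends itself to a short computational sketch. The Python fragment below is only an illustration and not part of Landé and Heisenberg's paper: the function name `branch` and the layout of the returned tuples are hypothetical choices, and exact fractions are used so that half-integral j-values are not rounded.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def branch(ion_j_values):
    """Apply the branching rule j -> j - 1/2, j + 1/2 to each ion j-value.

    Returns a list of (ion_j, s_term_j, multiplicity) tuples, where the
    multiplicity of each multiplet system is taken to be 2 * s_term_j.
    """
    terms = []
    for ion_j in ion_j_values:
        ion_j = Fraction(ion_j)
        for s_term_j in (ion_j - HALF, ion_j + HALF):
            terms.append((ion_j, s_term_j, int(2 * s_term_j)))
    return terms

# The two neon-ion values discussed in the text.
for ion_j, s_term_j, multiplicity in branch([1, 2]):
    print(f"ion j = {ion_j}: s-term j = {s_term_j}, multiplicity {multiplicity}")
```

Applied to the two neon-ion values j = 1 and j = 2, the sketch gives the multiplicities 1, 3, 3, 5, that is, the singlet, the two triplets, and the quintuplet mentioned above: n ion states yield 2n multiplet systems.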

This new rule, phenomenologically derived from the neon spectrum, could
account for the multiplet structure. Pauli’s Zweideutigkeit was tacitly at
work in the Heisenberg–Landé branching rule. The Zahlenpaar that Pauli
regarded as the ‘true’ angular momenta had here become the ‘branching’
quantum numbers of the total angular momentum. Yet the rule violated
Bohr’s building-up principle. Given the ‘branching’ process, the terms of the
atom could not be expected to coincide with those of the ion. The riddle of
statistical weights, once again, loomed on the horizon. Landé and
Heisenberg could not help concluding the article with an apologetic remark:

59 Landé and Heisenberg (1924). 60 Landé (1923c).
61 Landé and Heisenberg (1924), p. 280. 62 Ibid., p. 284.
This formally so easily representable branching process of the angular
momentum has become one of the main objections against the applicability of
the quantum rule, valid for conditionally periodic motions, to coupled systems
and it has consequently led to search for a simple modification of the quantum
rules used so far.63

A footnote accompanied this last sentence mentioning a forthcoming work
of Heisenberg in Zeitschrift für Physik. One month later (in June 1924)
Heisenberg published an article significantly entitled ‘On a modification of
the formal rules of quantum theory in the problem of the anomalous
Zeeman effect’.64 Apropos of this, the historian Daniel Serwer has noticed
that the branching rule was a sort of publicity gimmick to smooth the path
to the introduction of Heisenberg’s new quantum principle, which solved
both the riddle of statistical weights and the doublet riddle.65 But I think
that the branching rule smoothed the path to Heisenberg’s new quantum
principle precisely because it was, in Heisenberg’s words, ‘directly equiva-
lent’ to it.66 Indeed, the new quantum principle was nothing but the
Heisenberg–Landé branching rule (inspired by Pauli’s Zweideutigkeit)
but this time devoid of the problematic branching process.
Pauli received Heisenberg’s draft in December 1923 together with a letter
in which Heisenberg asked him ‘1. if you regard it as totally rubbish, 2. and if
this is not the case, (a) would you be so kind to send it to Bohr for criti-
cisms . . . since I would like to have the ‘‘papal blessing’’ before the publica-
tion’.67 As a reply, Bohr invited Heisenberg to Copenhagen in spring 1924,
where under Bohr’s influence Heisenberg modified the draft, finally pub-
lished in June 1924.68 At the beginning of the article, Heisenberg recalled the
riddle of statistical weights: j was expected to be invariant according to the
building-up principle, and yet spectroscopic evidence suggested j + 1/2 and
j − 1/2. Hence the necessity to introduce a new quantum rule:
A determinate value of the coupling energy between the electron and the atomic
core is not associated, as it has been so far assumed, to one value of the inner
quantum number j, but rather with two values (j + 1/2 and j − 1/2 in our

Heisenberg (1924), p. 301.
Heisenberg to Pauli, 7 December 1923. In Pauli (1979), p. 132.
The original title of Heisenberg’s work was ‘Über ein neues Quantenprinzip und dessen
Anwendung auf die Theorie der anomalen Zeemaneffekte’; a copy of the original
manuscript is in AHQP (45, 8).
Heisenberg (1924), p. 292.

As Heisenberg himself admitted, this was nothing but the Heisenberg–Landé
branching rule suitably devoid of the branching process. Starting from it,
Heisenberg formulated a new quantum principle. The coupling energy
between the core and the electron was given by

$$H_{qu} = \ldots \qquad (2.20)$$

By plugging in the derivative ΔF = F(j + 1/2) − F(j − 1/2), the quantum
mechanical Hamiltonian Hqu turned out to be

$$H_{qu} = \int_{j-1/2}^{j+1/2} H_{cl}\, dj \qquad (2.21)$$

This formula suggested a correspondence between the quantum mechanical
Hamiltonian Hqu and the classical Hamiltonian Hcl. Not only was Hqu
obtained by integrating over Hcl in the interval imposed by the new
quantum rule, i.e. from j − 1/2 to j + 1/2, but ‘in the case of high quantum
numbers, they almost coincide’.70 There could hardly be a better choice to
get Bohr’s ‘papal blessing’ than elaborating a new quantum principle along
the lines of Bohr’s correspondence principle.
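
The claim that the two Hamiltonians almost coincide for high quantum numbers can be illustrated numerically. The sketch below is a toy example under a stated assumption: the classical Hamiltonian is taken to be H_cl(j) = j², a hypothetical choice made only because its integral is easy to write down, and not Heisenberg's actual coupling energy. The function names are likewise illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def h_cl(j):
    """Toy classical Hamiltonian (hypothetical choice): H_cl(j) = j**2."""
    return Fraction(j) ** 2

def big_f(j):
    """Antiderivative of the toy H_cl: F(j) = j**3 / 3."""
    return Fraction(j) ** 3 / 3

def h_qu(j):
    """Integral of H_cl from j - 1/2 to j + 1/2, i.e. F(j + 1/2) - F(j - 1/2)."""
    j = Fraction(j)
    return big_f(j + HALF) - big_f(j - HALF)

for j in (1, 10, 100):
    print(j, h_qu(j), float(h_qu(j) / h_cl(j)))
```

For this toy choice the integral exceeds H_cl(j) by the constant 1/12, so the ratio of the two tends to 1 as j grows, in the spirit of the correspondence described above.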
Implementing the above formula in the Ersatzmodell,71 Heisenberg
obtained the Landé g factors, under Pauli’s proviso that zweideutig k,
r and j replaced Landé’s integral quantum numbers. But what was the
physical meaning of Heisenberg’s new quantum principle?
The more the rule . . . simplifies the problem of the Zeeman effect in the formal
respect, the more dubious its physical interpretation seems. Prima facie it seems
an unusual feature of the theory that a value of the coupling energy with two
quantum numbers j is associated . . . But perhaps the formal analogy between
[the new principle] and the frequency condition could later provide a physical
interpretation of [the new principle], because the problem of the coupling of
electrons is closely related to the theory of radiation.72

This rather obscure passage betrays the real nature of Heisenberg’s new
quantum principle. This was shaped along the lines of the recent
Bohr–Kramers–Slater73 (BKS) project of a quantum theory of radiation,
which was meant to be the highest expression of Bohr’s correspondence

The Ersatzmodell was nothing but Landé’s atomic core vector model that he used for the
analysis of the anomalous Zeeman effect, as described in Section 2.1.3.
Heisenberg (1924), p. 299. 73 Bohr, Kramers, and Slater (1924).

principle. According to the BKS theory, the frequency of the radiation
emitted during the transition between stationary states was equal to
the frequency emitted by a set of virtual harmonic oscillators. The corres-
pondence was only ‘virtual’, in the sense that the model was ‘a purely
logical tool, a theoretical fiction which, though constructed within
a conceptual framework irreducible to the quantum theoretical, can never-
theless enable us to explore certain aspects of the reality of the atoms’.74
Similarly, the correspondence between the classical and the quantum
Hamiltonian in Heisenberg’s new quantum principle was only a theoretical
fiction to explore the anomalous Zeeman effect. As Heisenberg teasingly
announced to Pauli, ‘despite you, I am going now to publish it (without
physical interpretation) with the papal blessing’.75
The main tribute to Bohr, the one which led him to concede the ‘papal
blessing’, concerned the building-up principle. Whereas Heisenberg’s first
core model and the Heisenberg–Landé branching rule violated it, the new
quantum principle ‘was in a natural connection with the building-up
principle . . . The building-up principle is directly connected to the ascrip-
tion of two quantum numbers to the coupling energy’.76 The riddle of
statistical weights was avoided: thanks to a zweideutig j, no anomalous
decrease in the number of states occurred during the building-up process.
The new quantum principle could entail the doublet fine structure and,
more generally, the multiplet fine structure, since it entailed the Landé g
factors. In spite of that, as Heisenberg readily admitted, the new principle
did not explain the anomalous Zeeman effect. It was more a contribution
to the mathematics than to the physics of the Zeeman effect. Furthermore,
the principle was successfully extended to many-electron systems only in
the case of strong magnetic fields, not in the case of weak fields. There were
other difficulties and, as Heisenberg knew, ‘a theory can still be false when
it gives something right, but it can never be right when it gives something
false’.77 The origin of half-integral k for optical doublets was still unclear,
especially if compared to the corresponding integral values for relativistic
doublets. No unification for optical and relativistic doublets was yet
possible. Also the origin of the alleged core magnetic anomaly remained
mysterious. Heisenberg simply noticed that the addition of the valence
electron with k = 1/2 did not seem to affect the magnetic factor 2 of the

Petruccioli (1988), English translation (1993), p. 120. See the same chapter for a detailed
analysis of the BKS theory.
Heisenberg to Pauli, 8 June 1924. In Pauli (1979), p. 155.
Heisenberg (1924), pp. 301, 303.
Heisenberg to Pauli, 21 February 1923. In Pauli (1979), p. 82.
core.78 But he did not take the further crucial step of associating the
magnetic anomaly with the electron’s half-integral values. This step was
taken by Pauli, who dispensed with the atomic core and finally identified
the valence electron as the only culprit of all these anomalies.

2.2.4 Pauli: from the electron’s Zweideutigkeit to the exclusion rule

Heisenberg’s viewpoint sheds no light on the half-integral quantum numbers
and on the failure of Larmor’s theorem. But I regard such an explanation the
most important step and since [Heisenberg’s] entire story is purely formal and
contains no new physical idea, it is not the theory I hope for.79

With these words Pauli dismissed Heisenberg’s second core model.
Explaining the failure of Larmor’s theorem and half-integral values in a
fashion that did not land quantum theory in incoherence became Pauli’s
main goal in the year 1923–4. Following this path Pauli came to anticipate
the concept of electron spin and to introduce the ‘exclusion rule’.
Pauli spent the winter term 1922–3 in Copenhagen, working with Bohr on
a paper about the anomalous Zeeman effect in which no half-integral
quantum numbers were used. As Bohr intended, the paper was to offer
an analysis of the Zeeman effect in perfect agreement with the
orthodox quantum conditions. After many difficulties, Bohr and Pauli with-
drew the paper. Yet an echo of Bohr’s strenuous attempt to retain integral
quantum numbers remained in Pauli’s aversion to Heisenberg’s models.
In April 1923, Pauli made his first important contribution to the anom-
alous Zeeman effect.80 He decided to follow a different route to the
problem, one that did not pass through any modelling of empirical data.
The aim was to recover the term values for the anomalous Zeeman
effect from the known term values for the Paschen–Back effect via a new
rule, since the Paschen–Back effect was more convenient to analyse than
the anomalous Zeeman effect. Pauli calculated the mg splitting factors for
doublets, triplets, quadruplets, and quintuplets in strong fields where the
magnetic quantum number m was set equal to
m = m1 + μ
with m1 taking (2k − 1) integral values 0, ±1, . . . , ±(k − 1) and μ taking
2r values ±1/2, . . . , ±(r − 1/2) for even multiplets (e.g. doublets), and

79 Pauli to Kramers, 19 December 1923. In Pauli (1979), p. 135. 80 Pauli (1923).

0, ±1, . . . , ±(r − 1/2) for odd multiplets (e.g. triplets). Following the
programmatic intent, no model-based interpretation was given for these quantum
numbers. Only in a later article (submitted in October 1923)81 did Pauli
explicitly fall back on the atomic core model in interpreting them as

m1 → m_k and μ → m_r

where the first is the orbital magnetic moment and the second the core
magnetic moment, so that the magnetic quantum number m = m1 + μ
denoted the total magnetic moment of the atom. Without explicitly
mentioning the atomic core model, in the first article Pauli went on to calculate
the Zeeman energy W = W0 + mghν_L (2.11) for strong magnetic fields
(where crucially g = 1) as

$$\frac{W}{h\nu_L} = m_1 + 2\mu = m + \mu = 2m - m_1 \qquad (2.22)$$

where once again the puzzling value 2μ appeared in the Zeeman energy.
The strong-field term values for both even and odd multiplets followed.
Most interestingly, once these had been calculated, it was also possible
to get the term values for weak fields thanks to a new rule giving the
‘symmetry condition for the values of the terms in the transition from
strong to weak fields’,82 the so-called ‘permanence of the g sums’ as
Landé dubbed it:83

The sum of the energy values in all those stationary states belonging to given
values of m and k, remains a linear function of the field strength during an entire
transition from weak to strong field.84
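
The rule can be checked on the simplest case, an alkali p-doublet (k = 2, r = 1). The sketch below is an illustration only. It takes the strong-field values m1 + 2μ from formula (2.22), with μ the half-integral core contribution, and compares them with weak-field values m·g. The weak-field Landé factors g = 4/3 and g = 2/3 for the two doublet levels (labelled here by the modern values j = 3/2 and j = 1/2) are standard values assumed from outside this passage, and all function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def strong_field_sums(k, r):
    """Sum of W/(h nu_L) = m1 + 2*mu over all states sharing m = m1 + mu.

    Valid for even multiplets, where r is an integer and mu is half-integral.
    """
    m1_values = range(-(k - 1), k)                             # 2k - 1 integral values
    mu_values = [Fraction(2 * i + 1, 2) for i in range(-r, r)]  # 2r half-integral values
    sums = defaultdict(Fraction)
    for m1 in m1_values:
        for mu in mu_values:
            sums[m1 + mu] += m1 + 2 * mu
    return dict(sums)

def weak_field_sums(levels):
    """Sum of m*g over all levels (j, g) that contain the magnetic number m."""
    sums = defaultdict(Fraction)
    for j, g in levels:
        m = -j
        while m <= j:
            sums[m] += m * g
            m += 1
    return dict(sums)

# Assumed weak-field data for an alkali p-doublet: (j, Lande g factor).
DOUBLET_P = [(Fraction(3, 2), Fraction(4, 3)), (Fraction(1, 2), Fraction(2, 3))]

print(strong_field_sums(2, 1))
print(weak_field_sums(DOUBLET_P))
```

For every value of m the two sums agree (2, 1, −1, −2 for m = 3/2, 1/2, −1/2, −3/2), which is what allows the weak-field terms to be recovered from the strong-field ones.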

As Pauli acknowledged, this rule had already been introduced in
Heisenberg’s first core model, although the particular interpretation that
Heisenberg gave to it in terms of the statistical conservation of the angular

Pauli (1923), p. 162. Notice that in the later semi-classical vector model for the spin–orbit
coupling, this rule came to mean that the sum of the g factors, for levels with the same total
angular momentum J or magnetic moment M, is the same in all field strengths,
independent of the coupling scheme used. In the semi-classical vector model, there are two
main coupling schemes for the spin s and the orbital angular momentum l of the valence
electron: the LS coupling (i.e. the so-called Russell–Saunders coupling) and the jj coupling.
In the LS coupling, given two valence electrons, the two l are coupled together to form the
resultant L and so also the two s to form the resultant S, where in turn L and S are coupled
together to give J. In the jj coupling, on the other hand, l1 is coupled with s1 to form j1 and l2
with s2 to form j2, where in turn j1 and j2 form the resultant J. The g factors change
depending on the LS coupling or jj coupling. Yet, according to Pauli’s permanence of the
g-sum rule, for given L, S, M or for given j1, j2, M the sum of the g factors remains the same
in the passage from strong to weak fields. See White (1934), pp. 189–91.

momentum was unacceptable. A few short remarks concluded the article
expressing Pauli’s dissatisfaction with his own result: since the half-integral
quantum numbers had not yet been explained, this contribution was
purely formal and without any new physical idea, precisely as with
Heisenberg. Pauli’s main worry remained the violation of Larmor’s
theorem, as evident from formula (2.22). There was a sort of doubling of the
magnetic moment of the atom (2m) in the presence of an external field that
Pauli, like Landé and Heisenberg, traced back to an alleged core magnetic
anomaly (2μ).
In the autumn–winter 1923–4, while Heisenberg was working on his new
quantum principle, Pauli was struggling with half-integral quantum num-
bers and with Larmor’s theorem, as he wrote in a letter to Bohr, which is
worth quoting in some detail:
The atomic physicists in Germany fall today into two groups. One group
first works through a given problem with half-integral values of the quantum
numbers and if it does not agree with experience, then they work with integral
quantum numbers. The others calculate first with integral numbers and if it
does not agree, they calculate with halves. Both groups however have the
characteristic in common that there is no a priori argument to be had from
their theories that tells which quantum numbers and which atoms should be
calculated with half-integral values . . . and which with integral values. They
can decide this only a posteriori by comparison with experience. I myself have
no taste for this kind of theoretical physics . . . I am far more radical than the
‘half-integral-number’ atomic physicists. This is because I do not believe that
the deviation of reality from the results obtained from the theory of periodic
systems in many-electron atoms can be explained by just plugging in half-integral
numbers in the final formulae of that theory . . . But I believe that considerations
of this kind can be fruitful only if we manage to put them in direct association
with the failure of Larmor’s theorem. (The interpretation of this failure through
the simple assumption that the ratio between the core magnetic moment and its
angular momentum is twice as big as the classical value is too formal. It must be
replaced by another interpretation. Unfortunately, with respect to this main
point Heisenberg’s considerations do not lead us beyond what we already

And the search for an interpretation of the failure of Larmor’s theorem put
Pauli on the right track that neither Bohr nor Heisenberg had envisaged.
Pauli first tried to explain the failure of Larmor’s theorem by hypothesizing a
relativistic correction to the mass of the electrons in the core. In an important
letter to Landé,86 Pauli raised the question as to whether a deviation from the

Pauli to Landé, 10 November 1924. In Pauli (1979).

Larmor frequency ν_L would be expected in classical circumstances, once the
effect of the relativistic change of the mass of the electron

$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}} \qquad (2.23)$$

was taken into account. The magneto-mechanical ratio of the atom would
accordingly deviate from its normal value

$$\frac{M}{p} = \frac{e}{2 m_0 c} \qquad (2.24)$$

by a correction factor

$$\gamma = \frac{m_0}{m} = \sqrt{1 - v^2/c^2} \qquad (2.25)$$

that would affect both the Zeeman energy

$$W = m \gamma h \nu_L \qquad (2.26)$$

and the Larmor frequency

$$\nu = \gamma \nu_L = \nu_L \left\{ 1 + \frac{\alpha^2 Z^2}{\left[ n - k + \sqrt{k^2 - \alpha^2 Z^2} \right]^2} \right\}^{-1/2} \qquad (2.27)$$

where α is the fine structure constant α = 2πe²/hc and Z is the atomic
number. When n = k = 1 (i.e. in the case of the hydrogen atom), γ would
reduce to

$$\gamma = \sqrt{1 - \alpha^2 Z^2} \qquad (2.28)$$

with α²Z² so small (since Z = 1) that the correction factor would not differ
significantly from 1, and hence no influence on the Zeeman splitting would
be observed. But for high Z, γ would supposedly differ considerably from
1, and it would be possible to detect the violation of the Larmor frequency.
Pauli then went on to calculate the possible influence of γ on the Zeeman
effect for elements with high Z. Although he had the strong feeling that
‘the calculated deviations of the Zeeman effect from the usual ones are not
actually available’,87 he asked Landé to let him know whether any
relativistic correction could actually be detected. Anticipating a possible
negative answer, Pauli concluded:

Ibid., p. 171.
What would then follow in the case that the calculated deviations were not
effectively detected (as I think it is likely to be)? The entire calculation relies on
the assumption that the K-shell [i.e. the core] . . . possesses a non-vanishing
resultant angular momentum. But this momentum of the K-shell is not to be
taken too seriously and has something unrealistic about it . . . (I am tempted
to say that not only has the core momentum half-integral values, but also a
half-integral reality) . . . A future, adequate (not à la Heisenberg) theory of the
anomalous Zeeman effect must give reasons for the non-availability of the
relativistic correction factor.88
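
To get a feeling for the size of the effect Pauli was looking for, the correction factor of formula (2.27) can be evaluated for a few atomic numbers. The sketch below is a rough estimate under stated assumptions: it uses n = k = 1 for the K-shell, a rounded modern value of the fine structure constant, and the atomic numbers of zinc (30), cadmium (48), and thallium (81), none of which are given in the passage itself. The function name is hypothetical.

```python
import math

ALPHA = 1 / 137.035999  # fine structure constant, rounded modern value (assumption)

def gamma_factor(Z, n=1, k=1):
    """Correction factor of formula (2.27) for atomic number Z.

    For n = k = 1 this reduces to sqrt(1 - (ALPHA * Z)**2).
    """
    a2z2 = (ALPHA * Z) ** 2
    denominator = (n - k + math.sqrt(k * k - a2z2)) ** 2
    return (1 + a2z2 / denominator) ** -0.5

for name, Z in (("hydrogen", 1), ("zinc", 30), ("cadmium", 48), ("thallium", 81)):
    gamma = gamma_factor(Z)
    print(f"{name:9s} Z = {Z:2d}  gamma = {gamma:.4f}  deviation = {1 - gamma:.1%}")
```

On this estimate the deviation from 1 is negligible for hydrogen, amounts to a few per cent for zinc and cadmium, and approaches a fifth for thallium, which is why heavy elements offered the decisive test.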

In the following few days Landé, from the Institute for Physics at
Tübingen, informed Pauli that the observed relativistic correction factor
in cadmium amounted to 6% and in zinc to 2%. But Pauli was
not quite satisfied with these results . . . One must consider elements with higher
atomic numbers . . . in order to decide the question with certainty . . . It is a pity
that the Zeeman effect is not precisely measured in alkalis with higher atomic
numbers. Do we know the precise values of the Ba spark spectrum? And those in
the III column? I am convinced since the beginning that there is no relativistic

Landé finally announced the negative result for thallium (Z = 81) that
Pauli welcomed as ‘totally sufficient to eliminate the last part of a possible
doubt about the empirical state of affairs’.90 The absence of a relativistic
correction in thallium spoke against the assumption of an atomic core
allegedly endowed with a non-zero angular momentum. Against this
assumption stood other facts that Pauli listed in the same important letter
to Landé on 24 November 1924. For instance, in the observable character-
istics, the K-shell did not differ from the closed shells with higher quantum
numbers. But the K-shell was typically ascribed a non-vanishing angular
momentum, whereas the closed shells with higher quantum numbers were
ascribed a zero angular momentum. This introduced an asymmetry in the
theory, to which nothing corresponded at the empirical level. Pauli drew
the following remarkable conclusion:
In the alkalis, the valence electron alone makes the complex structure as well as
the anomalous Zeeman effect. The contribution of the atomic core is out of
question (also in other elements). In a puzzling, non-mechanical way, the
valence electron manages to run about in two states with the same k but with
different angular momenta.91

Pauli to Landé, 24 November 1924. In Pauli (1979), p. 176. 91 Ibid., p. 177.

The puzzling, non-mechanical way in which the valence electron manages
to run about in two different states with the same orbital angular momentum
k was referred to as the electron’s Zweideutigkeit in an article submitted
one week later (on 2 December 1924), in which Pauli summarized
his unsuccessful search for the relativistic correction:
From this viewpoint the doublet structure of the alkali spectra as well as the
failure of Larmor’s theorem arise through a specific, classically non-describable
sort of ‘twofoldness’ [Zweideutigkeit] of the quantum-theoretical properties of
the valence electron.92

The concept of Zweideutigkeit – as we have seen in Section 2.2.3 – originally
sat squarely within the atomic core model as the ‘twofoldness’, i.e. the
‘two-quantum-numbered’ (Zahlenpaar) angular momentum of the core,
electron, and atom. As such, it underpinned Heisenberg and Landé’s
branching rule. One year later, Pauli reinterpreted this very same concept
as the non-mechanical ability of the valence electron to run about in two
different states with the same orbital angular momentum. As the following
scientific developments revealed, this was nothing but the two-valuedness
of the spin angular momentum. Pauli did not speak in terms of spin in
1924, but he was the first to abandon the time-honoured atomic core
model in the explanation of complex spectra and of the anomalous
Zeeman effect. He was the first to regard the valence electron as alone
responsible for these spectroscopic anomalies in the light of its enigmatic,
non-mechanical Zweideutigkeit.
In the same letter to Landé, Pauli ascribed three quantum numbers to
the valence electron along the lines of the Bohr–Coster theory of X-rays:93
n_{k1,k2} where n was the principal quantum number, k1 the azimuthal quantum
number and k2 denoted the magnitude of the relativistic correction.
Pauli went on to introduce another quantum number, the magnetic number
m, under the assumption that the atom was placed in a strong magnetic
field. Given the null experimental result about the relativistic correction,
Pauli then replaced k2 with the magnetic quantum number m2, and m
with m1, where the number m2 represented the total magnetic interaction
energy with the strong field, with ‘m1 half-integral; why, not yet clear;
m2 = m1 ± 1/2’. In so doing, Pauli could

Pauli (1925a), p. 385.
treatment by identifying any X-ray term with three quantum numbers (n_{k1,k2}), with
Δk1 = ±1, and Δk2 = ±1, 0. Landé (1923b) identified k2 with the inner quantum number j
of the alkali doublets, because it could take two states k2 = k1, k1 − 1 as j did. In so doing he
took an important step towards the unification of relativistic and optical doublets.

catch a glimpse of a logical connection between the number 2 of these states and
the half-integral quantum numbers, on the one side, and the violation of
Larmor’s theorem on the other side.94

He imagined that a strong magnetic field could be associated with any
atom so that the Paschen–Back states were realized, and every electron
could accordingly be characterized by the four quantum numbers n_{k1,m1,m2}.
The first consequence of this assumption was that ‘Bohr’s Zwang . . .
disappears’.95 No atomic core angular momentum, no unmechanischer
Zwang in the alleged core–electron coupling. It was then the turn of the
Landé–Heisenberg branching rule:

From a free atomic core with Ni possible states springs a term-system with
(Ni + 1)(2k1 − 1) states and one with (Ni − 1)(2k1 − 1) in the presence of a
field . . . Now I interpret these total 2Ni(2k1 − 1) states simply as follows: the Ni
states of the atomic core remain, and the valence electron takes 2(2k − 1) states
as in the alkalis.96

No core angular momentum, no branching states with the addition of an
outer electron: the doubling of the number of states was attributed to the
electron’s Zweideutigkeit. The riddle of statistical weights could then be
averted: ‘The Aufbauprinzip is strictly valid in my case. This fact seems to
me to make the here suggested viewpoint so superior over those held so far,
that despite all difficulties I regard it as the physically more correct’.97
But even more interesting for our story was what Pauli announced as
‘the second strength of my viewpoint’, namely the possibility of classifying
equivalent orbits ‘in the most natural way’.98 He referred back to a recent
work of Edmund Stoner, a Ph.D. student of Rutherford at Cambridge,
who a few months earlier had published an article in the Philosophical
Magazine99 on the electronic distribution in atomic shells. Stoner’s insight
was that the largest possible number N of electrons in a closed n_kj shell
coincided with the number N of sublevels into which the term n_kj split in a
weak magnetic field. Using this rule, Stoner could account for the electronic
distribution in the periodic table (2, 8, 18, 32, . . . ): the 1_11 group was
closed with helium, the 2_11 group was closed with beryllium, the 2_21 with
carbon, and the 2_22 with neon. Not only was Stoner’s rule better than
Bohr’s rival building-up schema, as it gave a more natural classification of

Pauli to Landé, 24 November 1924. In Pauli (1979), p. 178. 95 Ibid., p. 178.
Ibid., p. 180.
Stoner (1924). On Stoner’s work and its relevance to the origin of the exclusion principle
see the excellent article of Heilbron (1982), pp. 280–7.

subshells without arbitrary reopenings. But it also accounted successfully
for some evidence about the spectrum of ionized carbon that patently
violated Bohr’s schema.100
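
Stoner's counting can be reproduced in a few lines. The sketch below rests on an assumption that is not spelled out in the passage: a sublevel n_kj is taken to hold 2j electrons, with the inner quantum number j equal to k − 1 or k (and only j = 1 for k = 1), which is the number of weak-field sublevels in the old quantum-number convention. The function names are hypothetical.

```python
def stoner_sublevels(n):
    """List (n, k, j, capacity) for shell n, assuming each n_kj holds 2j electrons."""
    levels = []
    for k in range(1, n + 1):
        for j in (k - 1, k):
            if j >= 1:                      # j = 0 does not occur in this scheme
                levels.append((n, k, j, 2 * j))
    return levels

def shell_capacity(n):
    """Total number of electrons in the closed shell with principal number n."""
    return sum(capacity for *_, capacity in stoner_sublevels(n))

# Shell capacities, then the atomic numbers at which the first groups close.
print([shell_capacity(n) for n in range(1, 5)])
z = 0
for n in (1, 2):
    for _, k, j, capacity in stoner_sublevels(n):
        z += capacity
        print(f"group {n}_{k}{j} closes at Z = {z}")
```

Under this assumption the shells hold 2, 8, 18, and 32 electrons, and the first four groups close at Z = 2, 4, 6, and 10, with helium and neon at the two ends.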
Pauli reinterpreted Stoner’s rule in the light of the electron’s
Zweideutigkeit. Having dispensed with the atomic core, Pauli could abandon
the inner quantum number j that Stoner had used. In the presence of a strong
magnetic field, the number of sublevels into which the spectral term split was
known to be 2(2k − 1) for each k, i.e. 2n² in total once summed over
k = 1, . . . , n. But 2n² was also Rydberg’s ‘cabalistic’101
formula – as Sommerfeld called it – for the n periods of Mendeleev’s table, in
which the numbers of electrons per period (2, 8, 18, . . . ) were intended as
2 × (n = 1)², 2 × (n = 2)², 2 × (n = 3)². Hence, surprisingly enough, the number
of states an electron can take up in a strong magnetic field coincided with
the number of states an electron can take up in the n-th period of the periodic
table. This was justified by Pauli’s permanence of g sums: the sum of the
energy-states associated with a given magnetic quantum number m remains
the same in all field strengths. So it remains the same in the transition from
strong fields (Paschen–Back effect), to weak fields, up to zero fields (normal
atoms, following Rydberg’s rule for electronic distribution). Having so
reinterpreted Stoner’s rule, Pauli traced back the lengths of the periods to
what he presented as a prescriptive rule [Vorschrift]:
I can trace back the closure of groups [i.e. in Stoner’s schema] . . . to a single
prescription that seems to me extremely natural. I am thinking of a so strong
magnetic field that all electrons can be characterised through the symbol
n_{k1,m1,m2} as above described. Then it should be forbidden that more than one
electron with the same (equivalent) n belongs to the same values of the three
quantum numbers k1, m1, m2. When an electron corresponds to a given
n_{k1,m1,m2}-state, this state is occupied.102
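
The prescription can be turned into a small bookkeeping sketch. It relies on a reading that goes beyond the letter of the passage: the labels m1 and m2 are built from the later orbital and spin projections as m1 = m_l + m_s and m2 = m_l + 2 m_s, with m_l running over the 2k1 − 1 integral values allowed for a given k1. This reproduces m2 = m1 ± 1/2 but uses language Pauli did not have in 1924. The names `states` and `Shell` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def states(n):
    """All (k1, m1, m2) labels available for principal quantum number n."""
    labels = set()
    for k1 in range(1, n + 1):
        for m_l in range(-(k1 - 1), k1):     # 2*k1 - 1 integral values
            for m_s in (-HALF, HALF):        # the electron's twofoldness
                labels.add((k1, m_l + m_s, m_l + 2 * m_s))
    return labels

class Shell:
    """Tracks which (k1, m1, m2) states of a given n are already occupied."""

    def __init__(self, n):
        self.free = states(n)
        self.occupied = set()

    def add_electron(self, label):
        # The prescriptive rule: an occupied state cannot take a second electron.
        if label not in self.free:
            raise ValueError(f"state {label} is occupied or does not exist")
        self.free.remove(label)
        self.occupied.add(label)

print([len(states(n)) for n in range(1, 5)])
```

Counting the labels gives 2, 8, 18, and 32 for n = 1 to 4, so the rule that each label can be used once fixes the lengths of the periods.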

There was still a long way to go for this prescriptive rule to become the
exclusion principle. As we shall see in Chapter 4, the process that led to
promoting this rule to the rank of a scientific principle is linked to the
development of quantum mechanics as a theoretical framework that

‘The strongest evidence for Stoner came from Alfred Fowler’s finding that the spectrum of
ionized carbon shows doublets that, because of their appearance and position, had to arise
from a 2_2 term. The natural interpretation of this unexpected oddity was, as Bohr himself
explained it, that ‘‘the singly charged carbon ion in its normal state besides two electrons in
2_1 orbits possesses one electron in a 2_2 orbit’’. That interpretation required a correction in
the interpretation he had laid down: the fifth electron for some reason does not take up the
2_1 path he [Bohr] had prepared for it. The reason, according to Stoner: since only two 2_1
orbits exist, the fifth electron necessarily falls into a 2_2 circle.’ Heilbron (1982), pp. 283–4.
Pauli to Landé, 24 November 1924. In Pauli (1979), p. 180.

Pauli’s rule came to be built into from the ground up. Hence, in this
original historical context, it is more appropriate to refer to it as a rule
than as a principle. Pauli himself called it Ausschließungsregel103 or meine
Ausschlußregel (exclusion rule),104 while Heisenberg teasingly called it
Pauli’s Verbot der äquivalenten Bahnen (Pauli’s prohibition of equivalent
orbits).105
Pauli admitted that ‘we cannot give a closer foundation to this rule, yet it
seems to present itself in a very natural way’.106 The exclusion rule was then
introduced as a theoretical consequence of reinterpreting Stoner’s rule in the
light of the electron’s Zweideutigkeit, with the help of the permanence of the
g-sum rule. The closure of electronic groups in the periodic table followed
naturally from it. Furthermore, the rule also accounted for the possible
combinations of electrons in non-closed shells. For instance, it finally
became clear why – despite alkaline earths containing two s-electrons in
their ground state, which could give rise to singlets and triplets – only
singlets were usually observed. The reason is that the two s-electrons are
equivalent and hence certain terms, namely triplets, are forbidden according
to Pauli’s rule. Only when one of the two electrons is excited to an s-orbit of
different n, does the triplet appear: similarly, for any two equivalent
p-electrons or d-electrons. Out of all logically possible combinations of
any two electrons, only a few of them can actually be realized: those in
which no two equivalent electrons are present. Everything seemed to fit
Pauli’s rule very well. Only Bohr’s correspondence principle was left out:
how to reconcile the classical periodic motions presupposed by the corres-
pondence principle with the classically non-describable Zweideutigkeit of
the electron’s angular momentum? Pauli could not help remarking that ‘this
is indeed a very strange state of affairs’.107
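
The counting argument above can be checked by brute force. The following is a minimal sketch in modern notation, not Pauli’s own 1925 bookkeeping: it assumes each electron is labelled by the later quantum numbers (m_l, m_s), and the helper names `one_electron_states` and `j_levels` are hypothetical, introduced only for illustration. It lists the total angular momentum values J of the levels formed by two electrons of the same l, with and without the exclusion rule.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product

HALF = Fraction(1, 2)


def one_electron_states(l):
    """All (m_l, m_s) labels for one electron with orbital quantum number l."""
    return [(ml, ms) for ml in range(-l, l + 1) for ms in (HALF, -HALF)]


def j_levels(l, equivalent):
    """J values of the levels formed by two electrons that share the same l.

    equivalent=True  : same n and l, so the exclusion rule applies and the
                       two electrons must differ in (m_l, m_s).
    equivalent=False : different n, so every pairing is allowed.
    """
    states = one_electron_states(l)
    if equivalent:
        # Unordered pairs of distinct one-electron states only.
        pairs = combinations(states, 2)
    else:
        # The electrons are distinguished by n, so all ordered pairs count.
        pairs = product(states, repeat=2)

    # Tally the total magnetic quantum number M_J of every allowed pair.
    counts = Counter(a[0] + a[1] + b[0] + b[1] for a, b in pairs)

    # The number of levels with a given J equals N(M_J = J) - N(M_J = J + 1).
    levels = []
    for m in sorted((m for m in counts if m >= 0), reverse=True):
        levels.extend([int(m)] * (counts[m] - counts.get(m + 1, 0)))
    return levels


if __name__ == "__main__":
    print("two equivalent s-electrons:    ", j_levels(0, equivalent=True))
    print("two non-equivalent s-electrons:", j_levels(0, equivalent=False))
    print("two equivalent p-electrons:    ", j_levels(1, equivalent=True))
```

Under these assumptions, two equivalent s-electrons yield a single J = 0 level (the singlet), whereas two s-electrons of different n also yield a J = 1 level (the triplet). Two equivalent p-electrons yield five levels, with J = 2, 2, 1, 0, 0.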

2.3 The turning point

Bohr welcomed Pauli’s exclusion rule, although he did not hide his per-
plexity about the classically non-describable nature of the Zweideutigkeit
and the impossibility of reconciling it with the correspondence principle:

Pauli to Landé, 15 December 1924. In Pauli (1979), p. 191.
Pauli to Landé, 25 December 1924. In Pauli (1979), p. 196.
Heisenberg to Pauli, 16 November 1925. In Pauli (1979), p. 256.

Dear Pauli,
I cannot easily describe how welcome your submission was. We are all excited
for the many new beautiful things you have brought to light. I do not need to
advance any criticism, since you by yourself, better than anyone else could have
done, have characterised the whole thing in your letter as complete madness . . .
I am not very sure either whether you are overstepping a dangerous threshold –
intoning your old ‘Carthaginem esse delendam’ – when you declare the definitive
death sentence about a correspondence-like explanation of the closure of
groups . . . I sense that we are here standing at a decisive turning point.108

The Zweideutigkeit stood indeed at a decisive turning point, as Bohr
noticed. It introduced a new crucial property for the electron, which
escaped any classical description as much as the exclusion rule eluded
any quantum theoretical proof. The theoretical breakthrough called for
experimental confirmation. On 9 January 1925 Pauli visited Landé at the
Institute for Physics in Tübingen, which was the leading centre for
spectroscopic research. The data on the spectrum of lead (Z = 82) turned out to be
in striking agreement with Pauli’s rule.109 Intense discussions followed
within the small Gesellschaft of physicists gathered at Landé’s house till
late at night. Among them was a young Ph.D. graduate from Columbia
University, Ralph Kronig, who had the chance to read Pauli’s letter
to Landé where the new rule was announced. As Kronig recalled many
years later:
Pauli’s letter made a great impression on me and naturally my curiosity was
aroused as to the meaning of the fact that each individual electron of the atom
was to be described in terms of quantum numbers familiar from the spectra of
the alkali atoms, in particular the two angular momenta l and s = 1/2
encountered there. Evidently s could now no longer be attributed to a core, and
it occurred to me immediately that it might be considered as an intrinsic angular
momentum of the atom. In the language of the models which before the advent
of quantum mechanics were the only basis for discussion one had, this could
only be pictured as due to a rotation of the electron about its axis . . . the same

Bohr to Pauli, 22 December 1924. In Pauli (1979), pp. 194–5.
‘Lead in its normal state has two superficial p-electrons (electrons with k = 2). Five
possibilities therefore exist for the lowest state . . . There result five nkj terms defined by
values of j of 2, 2, 1, 0, 0, a prescription testable by calculating the g values and examining
the anomalous Zeeman effect in lead. Pauli was not pleased to find that only four levels
had been identified, and that they had been assigned j’s of 1, 2, 1, 0. If the experimentalists
had not erred, he wrote Landé, ‘‘my closure rule must be modified for complicated cases’’
(Pauli to Landé, 15 and 25 December 1924). Heroic measures were called for. Pauli
stopped at the dull town of Tübingen en route to Hamburg from Vienna, where he had
spent Christmas. There, on Landé’s kitchen table, he examined Back’s latest photographs
of the lead-spectrum in a magnetic field. Analysis disclosed five terms with the predicted j

afternoon, still quite under the influence of the letter I had read, I succeeded in
deriving with it the so-called relativistic doublet formula.110

Kronig introduced the idea of electron spin. If a spinning, or self-rotating,
electron carries a magnetic moment of one Bohr magneton, then in the
rest frame of the electron the electric field created by the protons of the
nucleus appears, via a Lorentz transformation, partly as a magnetic field,
with which the electron’s magnetic moment interacts.
Since the idea of a spinning electron still rested on a semi-classical
mechanical model, and Pauli was reluctant to use such models, he
dismissed it with the remark ‘this is indeed quite a witty idea’.111 Kronig,
disheartened, abandoned the idea himself. In the meantime, on 16
January, Pauli submitted the paper in which the exclusion rule was
announced. The article ended with an apologetic remark on the impossi-
bility of reconciling the new view with the correspondence principle. As an
intellectual tribute to Bohr, Pauli advanced the hope that ‘in the near
future, a fusion of these two viewpoints [i.e. Zweideutigkeit and correspon-
dence principle] will be achieved’.112 Yet Pauli remained deeply convinced
that no classical description for the electron’s Zweideutigkeit could ever be
given. Indeed, the subsequent developments did justice to his intuition.
The ‘witty idea’ was resurrected nine months later by two Dutch physi-
cists, George Uhlenbeck and Samuel Goudsmit, who – independently of
Kronig – arrived at a similar conclusion starting from Pauli’s work.
Uhlenbeck and Goudsmit published a short note in Naturwissenschaften
where the fourth degree of freedom for the electron was identified with a
rotation of the electron about its axis.113 The gyromagnetic ratio for this
new degree of freedom was twice the corresponding ratio for the orbital
motion, and the coupling between the orbital angular momentum vector
and the intrinsic angular momentum vector was invoked as the key for an
understanding of doublets.
The main obstacle the spinning electron model faced was a discrepancy
between the observed doublet splitting and the calculated fine structure: a
factor of 2 had already appeared in Kronig’s calculation. Note that this
factor had nothing to do with the well-known value 2 of the Landé g factor:
an explanation of the latter could be given in Uhlenbeck and Goudsmit’s
model, where the gyromagnetic ratio for a self-rotating electron with sur-
face charge turned out to be equal to 2 e/(2mc). But an explanation of this
other factor 2 appearing in Kronig’s calculation proved more difficult.

Pauli (1925b), p. 771.
Uhlenbeck and Goudsmit (1925).

Heisenberg repeated Kronig’s calculation in November 1925, and in a
letter to Pauli he himself expressed doubts about the model.114 Despite
this and other difficulties,115 Bohr adhered to the spinning electron model
because it restored a desirable, classically describable picture. Bohr’s
enthusiasm overcame Heisenberg’s reluctance,116 but not Pauli’s, who
firmly rejected ‘the new heresy’.
In February 1926 a solution was finally found for the puzzling factor 2.
As Bohr announced to Pauli,117 a young English physicist, Llewellyn Hilleth
Thomas, who had spent the previous half year in Copenhagen, had dis-
covered that this factor was simply due to a mistake in the calculation of
the relative motion of the electron and the atomic nucleus: an additional
angular velocity of the nucleus, due to the effect of special relativity in the
rest frame of the electron, should be introduced in the calculation of the
equation of motion of the electron’s magnetic moment. A copy of
Thomas’s article118 was enclosed with the letter.
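
The origin of the factor 2 can be summarized in a short worked formula. This is a sketch in modern textbook notation and Gaussian units, not a reproduction of Thomas’s own derivation: V(r) is assumed to be the electron’s potential energy in the central field, S and L are the spin and orbital angular momentum vectors, g is the spin g-factor, a and v are the electron’s acceleration and velocity, and only terms of lowest order in v/c are kept.

```latex
% Spin-orbit energy obtained from the rest-frame magnetic field alone,
% i.e. the calculation without the relativistic correction:
\begin{align}
U'_{\mathrm{so}} &= \frac{g}{2m^{2}c^{2}}\,
    (\mathbf{S}\cdot\mathbf{L})\,\frac{1}{r}\frac{dV}{dr}, \\
% Additional angular velocity (Thomas precession) of the electron's rest
% frame, using m\,\mathbf{a} = -\frac{\mathbf{r}}{r}\frac{dV}{dr}:
\boldsymbol{\omega}_{T} &\simeq \frac{1}{2c^{2}}\,\mathbf{a}\times\mathbf{v}
    = -\frac{1}{2m^{2}c^{2}}\,\frac{1}{r}\frac{dV}{dr}\,\mathbf{L}, \\
% Corrected spin-orbit energy:
U_{\mathrm{so}} &= U'_{\mathrm{so}} + \mathbf{S}\cdot\boldsymbol{\omega}_{T}
    = \frac{g-1}{2m^{2}c^{2}}\,
    (\mathbf{S}\cdot\mathbf{L})\,\frac{1}{r}\frac{dV}{dr}.
\end{align}
```

With g = 2 the corrected energy is exactly half of the uncorrected one, which is the discrepancy between the calculated and the observed doublet splitting.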
Pauli did not welcome the new result favourably. In his reply to Bohr119
he insisted that the question could not be solved the way Thomas sug-
gested and he even asked Bohr to block the publication of Thomas’s paper.
But Bohr remained firm in his position: ‘Dear Pauli, . . . your letter has
only strengthened our belief in the validity and justification of the
argument.’120 Pauli was left with no other choice than to ‘capitulate
completely’.121 Pauli’s capitulation coincided with the publication of a second
article by Uhlenbeck and Goudsmit,122 where the spinning electron model
was deployed to calculate the hydrogen spectrum, while in May a
confirmation of the validity of Thomas’s calculation came from the work of
Frenkel.123
As later developments revealed, Pauli’s reluctance towards the spinning
electron model contained a kernel of truth. The classically non-describable

Heisenberg to Pauli, 24 November 1925. In Pauli (1979), p. 265.
As Uhlenbeck later recalled, ‘It was quite clear that the picture of the rotating electron, if
taken seriously, would give rise to serious difficulties. For one thing, the magnetic energy
would be so large that by the equivalence of mass and energy the electron would have a
larger mass than the proton, or, if one sticks to the known mass, the electron would be
bigger than the whole atom! In any case, it seemed to be nonsense.’ Quotation from van
der Waerden (1960), p. 214.
‘Bohr’s optimism about Goudsmit’s theory has so much influenced me, that I’d really like
to believe in the magnetic electron.’ Heisenberg to Pauli, 24 December 1925. In Pauli
(1979), p. 271.
Pauli to Bohr, 26 February 1926. In Pauli (1979), p. 297.
Bohr to Pauli, 9 March 1926. In Pauli (1979), p. 309.
Pauli to Bohr, 12 March 1926. In Pauli (1979), p. 310.
Uhlenbeck and Goudsmit (1926).
123 Frenkel (1926).

electron’s Zweideutigkeit could not be cast in classical terms, and the
electron’s spin turned out to have no classical analogue. In 1928 Dirac
finally showed that
the incompleteness of the previous theories [i.e. Pauli’s and Uhlenbeck–
Goudsmit’s] [lay] in their disagreement with relativity . . . All the same there is a
great deal of truth in the spinning electron model, at least as a first
approximation.124

The Dirac relativistic wave equation for the electron finally allowed the
derivation of the spin magnetic moment, and in so doing vindicated Pauli’s
original reluctance to cast the electron’s Zweideutigkeit in the semi-classical
spinning electron model. But that is another story, and I shall come back to
it in Chapter 4.

Dirac (1928a), p. 610.
