Michael Weiss - Lie Groups and QM PDF
Michael Weiss
1 Introduction
These notes attempt to develop some intuition about Lie groups, Lie algebras,
spin in quantum mechanics, and a network of related ideas. The level is rather
elementary— linear algebra, a little topology, a little physics. I don’t see any
point in copying proofs or formal definitions that can be had from a shelf full
of standard texts. I focus on a couple of concrete examples, at the expense of
precision, generality, and elegance. See the first paragraph on Lie groups to get
the flavor of my “definitions”. I state many facts without proof. Verification
may involve anything from routine calculation to a deep theorem. Phrases like
“Fact:” or “it turns out that” give warning that an assertion is not meant to
be obvious.
A quote from the Russian mathematician V. I. Arnol’d:
A taste of things to come: consider the spin of an electron. One would like to
visualize the electron as a little spinning ball. This is not right, yet not totally
wrong. A spinning ball spins about an axis, and the angular velocity vector
points along this axis. You can imagine changing the axis by rotating the space
containing the ball. 1 Analogously, the quantum spin state of an electron has
an associated axis, which can be changed by rotating the ambient space.
The classical concepts of rotation and angular velocity are associated with
SO(3), the group of rotations in 3-space. SO(3) is an example of a Lie group.
1 If you want to be really concrete, imagine a spinning gyroscope fitting snugly in a box.
Another Lie group, SU (2), plays a key role in the theory of electron spin. Now
SO(3) and SU (2) are not isomorphic, but they are “locally isomorphic”, mean-
ing that as long as we consider only small rotations, we can’t detect any difference. However, a rotation of 360° corresponds to an element of SU(2) that is not
the identity. Technically, SU (2) is a double cover of SO(3).
Associated with every Lie group is something called its Lie algebra. The Lie
algebra is a vector space, but it has additional structure: a binary operation
called the Lie bracket. For the rotation group, the elements of the corresponding
Lie algebra can be thought of as angular velocities. Indeed, angular velocities
are usually pictured as vectors in elementary physics (right hand rule of thumb).
The Lie bracket for this example turns out to be the familiar cross-product from
vector algebra. (Unfortunately, I won’t get round to discussing the Lie bracket.)
The Lie algebras of SO(3) and SU (2) are isomorphic. This is the chief technical
justification for the “electron = spinning ball” analogy. The non-isomorphism
of SU (2) and SO(3) has subtle consequences. I can’t resist mentioning them,
though these notes contain few further details. Electrons are fermions, a term in
quantum mechanics which implies (among other things) that the Pauli exclusion
principle applies to them. Photons on the other hand are bosons, and do not
obey the exclusion principle. This is intimately related to the difference between
the groups SU(2) and SO(3). Electrons have spin 1/2, and photons have spin 1.
In general, particles with half-odd-integer spin are fermions, and particles with
integer spin are bosons.
The deeper study of the electron involves the Dirac equation, which arose out
of Dirac’s attempt to marry special relativity and quantum mechanics. The
relevant Lie group here is the group of all proper Lorentz transformations.
A Rough Road-map. The basic plan of attack: show how elements of SU (2)
correspond to rotations; then apply this to the spin of the electron.
I start with the most basic concepts of Lie group and Lie algebra theory. SO(3)
is the ideal illustrative example: readily pictured, yet complicated enough to be
interesting. The main goal is the double covering result. I do not take the most
direct path to this goal, attempting to make it appear “naturally”. Once we do
have it, the urge to explore some of the related topology is irresistible.
Next comes physics. Usually introductory quantum mechanics starts off with
things like wave/particle duality, the Heisenberg uncertainty principle, and so
forth. Technically these are associated with the Hilbert space of complex-valued
L2 functions on R3 — not the simplest Hilbert space to start with. If one ignores
these issues and concentrates solely on spin, the relevant Hilbert space is C2 .
(Feynman’s Lectures on Physics, volume III, was the first textbook to take this
approach in its first few chapters.) SU (2) makes its entrance as a symmetry
group on C2 . I conclude with a few hand-waves on some loose ends.
2 Lie Groups
A Lie matrix group is a continuous subgroup of the group of all non-singular
n × n matrices over a field K, where K is either R or C. “Continuous” really is
a shorthand for saying that the Lie group is a manifold. The rough idea is that
the components of a matrix in the group can vary smoothly; thus, concepts like
“differentiable function f : R → G” should make sense. I’ll just say “Lie group”
for Lie matrix group, though many mathematicians would groan at this.
Example: O(n) is the group of all orthogonal n × n matrices, i.e. all matrices A
with real components such that A^t A = 1. This is just the group of all isometries
of R^n which leave the origin fixed. (Standard calculation: let x be a column
vector. Then (Ax)^t (Ax) = x^t x, i.e., the norm of x equals the norm of Ax.)
Note also that the equations A^t = A^{-1} and AA^t = 1 follow from A^t A = 1.
If A^t A = 1, then we have immediately det(A)^2 = 1, i.e., det(A) = ±1. SO(n)
is the subgroup of all matrices in O(n) with determinant 1. Fact: SO(n) is
connected, and is in fact the connectedness component of 1 in O(n). I will focus
initially on SO(2) and SO(3). These are, colloquially, the groups of rotations
in 2-space and 3-space. O(n) is the group of reflections and rotations.
Digression: the well-known puzzle, “why do mirrors reverse left and right, but
not up and down?” is resolved mathematically by pointing out that a mirror
perpendicular to the y-axis performs the reflection:

$$(x, y, z) \to (x, -y, z)$$

For some psychological reason, people tend to think of this as the composition of
a 180° rotation about the z-axis followed by a reflection in a plane perpendicular
to the x-axis: (x, y, z) → (−x, −y, z) → (x, −y, z). This makes it seem that the
mirror is treating the x and z axes differently (left/right vs. up/down), though
it really isn’t. End of digression.
Example: SO(2), rotations in 2-space. Since det(A) = 1, it is easy to write
down the components of A−1 . Equating these to At , we see that A has the
form:

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$
with the constraint that a2 + b2 = 1. We can set up a one-one correspondence
between this matrix and the complex number a + ib on the unit circle. This is
a Lie group isomorphism between SO(2) and the unit circle. We can of course
find an angle θ for which a = cos θ and b = sin θ.
Elements of SO(2) have real components, but it is enlightening to consider
SO(2) as a subgroup of the group of all non-singular complex 2 × 2 matrices.
Fact: any matrix in SO(2) is similar to a matrix of the form

$$\begin{pmatrix} a+ib & 0 \\ 0 & a-ib \end{pmatrix} = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{pmatrix}$$
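The correspondence between SO(2) and the unit circle can be checked numerically. A small sketch, assuming NumPy is available; the helper names `so2` and `to_complex` are mine:

```python
import numpy as np

def so2(theta):
    """The SO(2) element [[a, -b], [b, a]] with a = cos(theta), b = sin(theta)."""
    a, b = np.cos(theta), np.sin(theta)
    return np.array([[a, -b], [b, a]])

def to_complex(M):
    """Read off a + ib from the first column of an SO(2) matrix."""
    return M[0, 0] + 1j * M[1, 0]

A, B = so2(0.7), so2(1.1)
# Matrix multiplication corresponds to multiplication of unit complex numbers:
assert np.isclose(to_complex(A @ B), to_complex(A) * to_complex(B))
# and each matrix satisfies the defining relations A^t A = 1, det A = 1:
assert np.allclose(A.T @ A, np.eye(2)) and np.isclose(np.linalg.det(A), 1.0)
```

Composing two rotations multiplies the corresponding unit complex numbers, which is exactly the claimed isomorphism.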
3 Lie Algebras
Let G be a Lie group. Let x(t) be a smooth curve in G passing through the
unit element 1of G, i.e., a smooth mapping from a neighborhood of 0 on the
real line into G with x(0) = 1. T (G), the tangent space of G at 1, consists of
dx(t)
all matrices of the form dt , or just x (0) in a less clumsy notation.
t=0
T (G) is the Lie algebra of G. I will show in a moment that T (G) is a vector
space over R, and I really should (but I won’t) define a binary operation [x, y]
(the Lie bracket) on T (G) × T (G).
Proof that T(G) is a vector space over R: if x(t) is a smooth curve and x(0) = 1,
then set y(t) = x(kt), k ∈ R. This is also a smooth curve and y'(0) = kx'(0). So
T(G) is closed under multiplication by elements of R. (Note this argument fails
for complex k.) Similarly, differentiating z(t) = x(t)y(t) (with x(0) = y(0) = 1,
as usual) shows that T(G) is closed under addition.
Historically, the Lie algebra arose from considering elements of G “infinites-
imally close to the identity”. Suppose ε ∈ R is very small, or (pardon the
expression), “infinitesimally small”. Then x'(0) is approximately (x(ε) − x(0))/ε, or
(remembering x(0) = 1)

$$x(\varepsilon) \approx 1 + \varepsilon x'(0)$$

Historically, x(ε) is a so-called infinitesimal generator of G.
Robinson has shown how this classical approach can be made rigorous, using
non-standard analysis. Even without this, the classical notions provide a lot of
insight. For example, let n be an “infinite” integer. Then if t ∈ R is an ordinary
real number (not “infinitesimal”), we can let ε = t/n and so

$$x(\varepsilon)^n \approx \left(1 + \frac{t\,x'(0)}{n}\right)^n \approx e^{t\,x'(0)}$$

Assume that the left hand side is an ordinary “finite” element of G. Write v for
x'(0), an arbitrary element of the Lie algebra T(G). This suggests there should
be a map (t, v) → e^{tv} from R × T(G) into G.
In fact, the following is true: for any Lie group G with Lie algebra T (G), we have
a mapping exp from T (G) into G such that exp(0) = 1, and exp ((t1 + t2 )v) =
exp(t1 v) exp(t2 v), for any t1 , t2 ∈ R and v ∈ T (G).
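The one-parameter subgroup property of exp can be verified numerically. A sketch assuming NumPy and SciPy (`scipy.linalg.expm` is the matrix exponential), with an element of so(2) as the test case:

```python
import numpy as np
from scipy.linalg import expm

v = np.array([[0.0, -1.0], [1.0, 0.0]])   # an element of so(2)
t1, t2 = 0.4, 0.9
# exp((t1 + t2) v) = exp(t1 v) exp(t2 v): a one-parameter subgroup
assert np.allclose(expm((t1 + t2) * v), expm(t1 * v) @ expm(t2 * v))
# exp(0) = 1
assert np.allclose(expm(0 * v), np.eye(2))
```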
It also turns out that the Lie algebra structure determines the Lie group struc-
ture “locally”: if the Lie algebras of two Lie groups are isomorphic, then the
Lie groups are locally isomorphic. Here, the Lie algebra structure includes the
bracket operation, and of course one has to define local isomorphism.
Now for our standard example, SO(n). Notation: the Lie algebra of SO(n) is
so(n). If you differentiate the condition A^t A = 1 and plug in A(0) = 1, you
will conclude that all elements of so(n) are anti-symmetric. Fact: the converse
is true.
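Both directions can be spot-checked numerically: exponentiating a random antisymmetric matrix lands in SO(n). A sketch, assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
B = M - M.T                          # antisymmetric, hence in so(4)
A = expm(B)
assert np.allclose(A.T @ A, np.eye(4))      # A is orthogonal
assert np.isclose(np.linalg.det(A), 1.0)    # determinant 1, so A lies in SO(4)
```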
Example: SO(2), rotations in 2-space. All elements of so(2) have the form

$$\begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix}$$
In the earlier discussion of SO(2), I set up a one-one correspondence

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \leftrightarrow a + ib$$

Example: SO(3), rotations in 3-space. The general element of so(3) has the form

$$\begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$$

(We’ll see the reason for the peculiar choice of signs and arrangement of a, b,
and c shortly.)
Fact: the vector (a, b, c) is the angular velocity vector for the above element of
so(3). What does this mean? Well first, let v0 ∈ R3 be some arbitrary vector;
if A(t) is a curve in SO(3), and we set v(t) = A(t)v0 , then v(t) is a rotating
vector, whose tip traces out the trajectory of a moving point. The velocity of
this point at t = 0 is A (0)v0 . It turns out that A (0)v0 equals the cross-product
(a, b, c) × v0 , which characterizes the angular velocity vector. The next few
paragraphs demonstrate this equality less tediously than by direct calculation.
Let

$$\hat{x} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \hat{y} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad \hat{z} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
so the general element of so(3) can be written ax̂ + bŷ + cẑ. And x̂, ŷ, and
ẑ are simply the elements of so(3) corresponding to unit speed uniform rotations
about the x, y, and z axes, respectively— as can be seen by considering their
effects on the standard orthonormal basis.
This verifies the equation A (0)v0 = (a, b, c) × v0 for the special cases of A (0) =
x̂, ŷ, and ẑ. The general case now follows by linearity.
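The angular velocity claim is easy to verify numerically. A sketch assuming NumPy, with arbitrary coefficients and test vector of my choosing:

```python
import numpy as np

xhat = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], float)
yhat = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], float)
zhat = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float)

a, b, c = 0.3, -1.2, 2.0
W = a * xhat + b * yhat + c * zhat      # general element of so(3)
v0 = np.array([1.0, 2.0, -0.5])
# W v0 = (a, b, c) x v0: the so(3) element acts as cross product
# with its angular velocity vector
assert np.allclose(W @ v0, np.cross([a, b, c], v0))
```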
Note that this action of SO(3) on R^3 is faithful, i.e., the mapping determines the element of SO(3).
General fact: for any Lie group G, there is a homomorphism (also known as a
representation) of G into the group of non-singular linear transformations on
the vector space T (G), with kernel Z(G), the center of G.
Here’s how it goes. For any group G, we have the group of inner automorphisms
Inn(G) and a homomorphism G → Inn(G) defined by g → ι_g, where ι_g(h) =
ghg^{-1}. The kernel is Z(G). The automorphism ι_g is furthermore determined
completely by its effects on any set of generators for G.
Now take G to be a Lie group. Let’s consider the effect of ι_g on an “infinitesimal”
generator 1 + εh, where h ∈ T(G).
Unitary Matrices: SU (n). Now for a different example. U (n) is the group
of unitary n × n matrices, i.e., complex matrices satisfying A^* A = 1. An easy
computation shows that | det(A)| = 1. SU (n) is the subgroup for which the
determinant is 1 (unimodular matrices). Unlike the situation with O(n) and
SO(n), the dimensions of U (n) and SU (n) (as manifolds) differ by 1.
The Lie algebras of U (n) and SU (n) are denoted u(n) and su(n), respectively.
Differentiating A^* A = 1 we conclude that u(n) consists of anti-Hermitian matrices: B^* = −B. Note that B is anti-Hermitian if and only if iB is Hermitian.
Fact: if A(0) = 1, then

$$\left.\frac{d\,\det A(t)}{dt}\right|_{t=0} = \operatorname{tr} A'(0)$$

(where tr = trace). (Expanding by minors does the trick.) This makes one half of the following fact obvious: the
Lie algebra for the Lie group of unimodular matrices consists of all the traceless
matrices.
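The determinant–trace fact and its consequence can both be checked numerically. A sketch assuming NumPy and SciPy; the random matrix and step size are arbitrary choices of mine:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
B = rng.standard_normal((3, 3))      # plays the role of A'(0); take A(t) = exp(tB)
# d/dt det A(t) at t = 0 should equal tr B
t = 1e-6
numeric = (np.linalg.det(expm(t * B)) - 1.0) / t
assert np.isclose(numeric, np.trace(B), atol=1e-4)
# and a traceless B exponentiates to a unimodular matrix:
B0 = B - np.trace(B) / 3 * np.eye(3)
assert np.isclose(np.linalg.det(expm(B0)), 1.0)
```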
For the special case SU (2), things work out very nicely. Since det(A) = 1, one
can write down the components for A−1 easily, and equating them to A∗ , one
concludes that SU(2) consists of all matrices of the form:

$$\begin{pmatrix} a+id & c+ib \\ -c+ib & a-id \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} + c\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + d\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$

$$= a1 + bi + cj + dk, \qquad a^2 + b^2 + c^2 + d^2 = 1$$
defining i, j, k as the given 2 × 2 matrices in SU(2). Exercise: these four elements
satisfy the multiplication table of the quaternions, so SU (2) is isomorphic to
the group of quaternions of norm 1. (The somewhat peculiar arrangement of
a, b, c, d in the displayed element of SU (2) is dictated by convention.)
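The exercise can be checked with NumPy. A sketch: the squares and anticommutation come out exactly as for quaternions; with the sign conventions of the display above the cyclic products carry a minus sign (ij = −k, etc.), which the substitution i → −i, j → −j, k → −k absorbs, so one still gets the quaternion group:

```python
import numpy as np

one = np.eye(2, dtype=complex)
i = np.array([[0, 1j], [1j, 0]])              # the matrices i, j, k displayed above
j = np.array([[0, 1], [-1, 0]], dtype=complex)
k = np.array([[1j, 0], [0, -1j]])

# squares are -1, distinct elements anticommute
assert np.allclose(i @ i, -one) and np.allclose(j @ j, -one) and np.allclose(k @ k, -one)
assert np.allclose(i @ j, -(j @ i))
# with these sign conventions the cyclic products come out reversed:
assert np.allclose(i @ j, -k) and np.allclose(j @ k, -i) and np.allclose(k @ i, -j)
# so (-i, -j, -k) obey the textbook rules ij = k, jk = i, ki = j:
mi, mj, mk = -i, -j, -k
assert np.allclose(mi @ mj, mk) and np.allclose(mj @ mk, mi) and np.allclose(mk @ mi, mj)
```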
Next, an arbitrary anti-Hermitian matrix looks like:

$$\begin{pmatrix} i(a+d) & c+ib \\ -c+ib & i(a-d) \end{pmatrix} = a\begin{pmatrix} i & 0 \\ 0 & i \end{pmatrix} + b\begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix} + c\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} + d\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$

$$= ai1 + bi + cj + dk$$
This is traceless if and only if a = 0. So we have a canonical 1–1 correspondence
between su(2) and R3 , and so also with so(3): bi + cj + dk ↔ (b, c, d) ↔
bx̂ + cŷ + dẑ.
It turns out that this correspondence is a Lie algebra isomorphism. SU (2) and
SO(3) are locally isomorphic, but not isomorphic— as we will see next.
SU (2) acts on su(2) via the adjoint representation. But we have a 1–1 corre-
spondence between su(2) and R3 , so we have a representation of SU (2) in the
group of real 3×3 matrices. Let A be an element of SU (2) and v = bi+cj+dk be
an element of su(2). Note that det v = b2 + c2 + d2 . Since the map v → AvA−1
preserves determinants, it preserves norms when considered as acting on R3 .
So the adjoint representation maps SU (2) into O(3). Fact: it maps SU (2) onto
SO(3).
Incidentally, you can see directly that v → AvA−1 preserves anti-Hermiticity
by writing it v → AvA∗ .
The kernel of the adjoint representation for SU (2) is its center, which clearly
contains ±1— and in fact, consists of just those two elements. So we have a
2–1 mapping SU (2) → SO(3). Our double cover! I’ll look at the topology of
this in a moment.
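The adjoint representation and the two-to-one property can be exhibited concretely. A sketch assuming NumPy; the helper `adjoint` (mine) extracts the 3×3 real matrix using the trace inner product ½ tr(σᵢσⱼ) = δᵢⱼ:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
paulis = [sx, sy, sz]

def adjoint(A):
    """The 3x3 real matrix of v -> A v A* on traceless Hermitian v."""
    R = np.empty((3, 3))
    for j, s in enumerate(paulis):
        w = A @ s @ A.conj().T
        for i, t in enumerate(paulis):
            R[i, j] = 0.5 * np.trace(t @ w).real   # coefficient of t in w
    return R

a, b, c, d = 0.5, 0.5, 0.5, 0.5                    # a^2+b^2+c^2+d^2 = 1
A = a * np.eye(2) + 1j * (b * sx + c * sy + d * sz)
R = adjoint(A)
assert np.allclose(R.T @ R, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0)
# A and -A give the same rotation: the map SU(2) -> SO(3) is two-to-one
assert np.allclose(adjoint(-A), R)
```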
Physicists prefer to work with the Pauli spin matrices instead of the quaternions.
The Pauli matrices are just the Hermitian counterparts to i, j, and k:
i = iσx , j = iσy , k = iσz
They form a basis (with 1) for the vector space of Hermitian 2 × 2 matrices:

$$\begin{pmatrix} a+d & b-ic \\ b+ic & a-d \end{pmatrix} = a\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} + c\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} + d\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

$$= a1 + b\sigma_x + c\sigma_y + d\sigma_z$$
SU (2) acts on the space of traceless Hermitian 2 × 2 matrices in the same way
as on su(2): h → ghg −1 .
So if we understand one SU (2) action, we understand the other. I’ll use Pauli
matrices from now on.
An arbitrary element A of SU(2) looks like

$$A = \begin{pmatrix} a+id & c+ib \\ -c+ib & a-id \end{pmatrix} = a1 + bi\sigma_x + ci\sigma_y + di\sigma_z, \qquad a^2 + b^2 + c^2 + d^2 = 1$$
and we see that A∗ = a1 − biσx − ciσy − diσz . So the result of acting with
A on v can be computed simply by working out the product (a1 + biσx +
ciσy + diσz)(xσx + yσy + zσz)(a1 − biσx − ciσy − diσz). For this we need the
multiplication table of the σ’s. This is simply:

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1, \qquad \sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z, \qquad \sigma_y\sigma_z = -\sigma_z\sigma_y = i\sigma_x, \qquad \sigma_z\sigma_x = -\sigma_x\sigma_z = i\sigma_y$$
The easiest way to check this is to work out the action of the σ’s on (p, q) ∈ C^2:

$$\sigma_x\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} q \\ p \end{pmatrix}, \qquad \sigma_y\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} -iq \\ ip \end{pmatrix}, \qquad \sigma_z\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} p \\ -q \end{pmatrix}$$
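Both the multiplication table and the action on (p, q) are quick to confirm in NumPy (a sanity-check sketch; p and q are arbitrary values of mine):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2)

# squares are 1, distinct sigmas anticommute, products cycle with a factor of i
assert np.allclose(sx @ sx, one) and np.allclose(sy @ sy, one) and np.allclose(sz @ sz, one)
assert np.allclose(sx @ sy, 1j * sz) and np.allclose(sy @ sz, 1j * sx) and np.allclose(sz @ sx, 1j * sy)
assert np.allclose(sx @ sy, -(sy @ sx))

# action on (p, q), matching the displayed formulas
p, q = 2.0 + 1j, -0.5
v = np.array([p, q])
assert np.allclose(sx @ v, [q, p])
assert np.allclose(sy @ v, [-1j * q, 1j * p])
assert np.allclose(sz @ v, [p, -q])
```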
(Warning: the exponential of a sum is not in general the product of the expo-
nentials, because of non-commutativity.) For the rotation group (as we’ve seen)
this says simply that an angular velocity determines a rotation— e.g., by the
prescription “rotate at the given angular velocity for one time unit”. The basis
of “angular velocities” in su(2) is (iσx, iσy, iσz). Let us consider rotations about
the z-axis.
$$e^{ib\sigma_z} = \begin{pmatrix} e^{ib} & 0 \\ 0 & e^{-ib} \end{pmatrix} = \cos b\,1 + i\sin b\,\sigma_z$$
(since iσz acts separately on each coordinate.) Perhaps the clearest way to
exhibit the action of this rotation on v = xσx + yσy + zσz is to work entirely in
matrix form:

$$\begin{pmatrix} e^{ib} & 0 \\ 0 & e^{-ib} \end{pmatrix}\begin{pmatrix} z & x-iy \\ x+iy & -z \end{pmatrix}\begin{pmatrix} e^{-ib} & 0 \\ 0 & e^{ib} \end{pmatrix} = \begin{pmatrix} z & e^{2ib}(x-iy) \\ e^{-2ib}(x+iy) & -z \end{pmatrix}$$

i.e., a rotation about the z-axis of −2b radians. Exercise: the same sort of thing
holds for iσx and iσy.
So one can directly picture the action of SU (2) on vectors in 3-space. The
“double angles” 2b, etc., stem from the two multiplications in the action: v →
AvA∗ . And the double angles in turn are the reason the map from SU (2) to
SO(3) is two-to-one.
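The "rotation by −2b" claim can be checked directly. A NumPy sketch, with b and (x, y, z) arbitrary values of mine:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

b = 0.3
A = np.cos(b) * np.eye(2) + 1j * np.sin(b) * sz      # e^{ib sigma_z}
x, y, z = 1.0, 2.0, -0.7
v = x * sx + y * sy + z * sz
w = A @ v @ A.conj().T

# expected: (x, y) rotated about the z-axis by -2b radians, z unchanged
t = -2 * b
xr, yr = x * np.cos(t) - y * np.sin(t), x * np.sin(t) + y * np.cos(t)
assert np.allclose(w, xr * sx + yr * sy + z * sz)
```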
SU (2) acts on C2 via left multiplication: v → Av, where v is a column vector.
Can one picture v as some kind of geometric object in 3-space? Yes indeed!
An object known as a spin vector embodies v geometrically. But I won’t get to
them.
4 Quantum Mechanics: Two-state Systems
The framework of quantum mechanics rests on three pillars: the Hilbert space
of quantum states; the Hermitian operators, also called observables; and the
unitary evolution operators. I start by trying to attach some pictures to these
abstractions.
The simplest classical system consists of a single point particle coasting along
in space (perhaps subject to a force field). To “quantize” this, you’ll need
the Hilbert space of complex-valued L2 functions on R3 , and you’ll encounter
unbounded operators on this space. So goes the tale of history: Heisenberg,
Schrödinger, Dirac and company cut their milk teeth on this problem.
I will take an ahistorical but mathematically gentler approach. The Hilbert
space for a two-state quantum system is C2 , and the operators can all be rep-
resented as 2 × 2 complex matrices. The spin of an electron provides a physical
example. That is, if we simply ignore position and momentum (“mod them
out”, so to speak), we have a physical picture that can be modelled by this
(relatively) simple framework. (As noted before, Feynman’s Lectures, volume
III, starts off like this.)
We associate a complex number w or else ∞ with each class (a : b). If b ≠ 0, then (a : b) = (w : 1), where
w = a/b. All pairs of the form (a, 0) belong to the class (1 : 0). So we associate a/b with
(a : b) if b ≠ 0, and ∞ with (1 : 0). Mapping the complex plane plus ∞ to the Riemann
sphere via the usual stereographic projection completes the trick. Some sample points to bear
in mind: the north pole is (1 : 0); the south pole is (0 : 1); points on the equator have the
form (eiθ : 1). (For the purist, the special treatment of ∞ rankles. It is not singled out on
either the complex projective line or on the Riemann sphere. Later I will show how to set up
the correspondence without this blemish.)
Classically, the angular momentum of a spinning ball is a vector pointing along the axis of rotation, with length proportional to the
speed of rotation. This defines the vector up to sign. The sign ambiguity is
resolved, conventionally, by the right hand rule of thumb: if you curl the fingers
of your right hand in the direction of rotation, the thumb points in the direction
of the angular momentum vector.
Classically, the electron could be spinning at any speed, so its angular momen-
tum could have any magnitude. In quantum mechanics, the angular momentum
is quantized: its magnitude (measured along the axis of rotation) must be h̄/2
for any spin 1/2 particle. (Note: h̄ = h/2π, where h of course is Planck’s con-
stant.)
The quantization of angular momentum, although inexplicable classically, is
easy enough to picture: we just stipulate that our spinning ball must be spinning
at a particular rate. So we should be able to specify the spin state of the electron
just by giving a direction in 3-space, or equivalently, by picking a point on the
sphere S 2 . But I just noted that the set of states is “isomorphic” to the Riemann
sphere. So everything fits.
Every Hermitian operator on C^2 possesses an orthonormal basis of eigenvectors; another reason why spin is simpler than position.
Suppose we want to measure the component of spin along the z-axis. Since the electron is charged, it acts like a
little magnet, with north and south poles along the axis of rotation. (Circulating
charge causes a magnetic field. Think of an electromagnet— a coil of wire with
an electric current flowing around in it.) A physicist would say that the electron
has a magnetic moment.
We can use the magnetic moment to measure the spin. Stern and Gerlach got a
Nobel prize for doing just that. They sent a beam of electrically neutral silver
atoms through a magnetic field. It turns out that the magnetic moments of
the electrons in a silver atom cancel out in pairs except for one electron, so we
can pretend (so far as the spin is concerned) that we’re looking at a beam of
electrons passing through a magnetic field.
The magnetic field was designed to produce a force on the electrons. An electron
with spin pointing up would look like a magnet with its N pole up and S pole
down, and would experience an upward force; an electron with spin pointing
down would look like a magnet with its S pole up and N pole down, and would
experience a downward force. Classically, you would expect
an electron with spin at angle α to the vertical to experience an upward force
proportional to cos α.5 So the electron beam should be spread out into a vertical
smear, according to classical mechanics.
In fact, the beam splits into two beams, one up, one down. In other words, if we
measure the component of the spin along the vertical axis, we always find that
the spin is entirely up or entirely down. This is the most basic sense in which
the “spinning ball” analogy is wrong. The same two-valued behavior holds for
any measurement axis.
Classically this is inexplicable. How can the electron have spin up and spin
sideways at the same time? Answer: it doesn’t. After you’ve measured the
spin along the z-axis, the electron has vertical spin (say spin up). If you take
your vertically spinning electron and measure its spin along the x-axis, you have
a 50–50 chance at getting spin left or spin right. If you now repeat the spin
measurement along the z-axis, you have a 50–50 chance of getting spin up or
spin down. The x-axis measurement has destroyed the information obtained
from the z-axis measurement.
Let A be the Hermitian operator corresponding to “measure the spin along the
z-axis”. The eigenvalues (i.e., possible results) will be 1 and −1, if we choose
our units right.
Pick a basis of two eigenvectors; then the matrix for A in this
basis is just

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

i.e., the Pauli matrix σz. (Common notation for the
eigenvectors is |up⟩ and |down⟩, although |dead⟩ and |alive⟩ are popular for that
other famous two-state system, Schrödinger’s Cat.)
5 Why wouldn’t the electron simply snap into alignment with the magnetic field? Answer:
the spinning electron would act like a gyroscope, and precess in response to the torque exerted
by the field. Thus it would maintain its angle of inclination to the field.
If σz is here, can σx and σy be far behind? In fact, these are the matrices for
measuring the x-component (respectively y-component) of spin, provided we
continue to use the same basis |up⟩ and |down⟩.
It turns out that (1 : 0) represents spin up along the z-axis, (1 : 1) represents
spin along the x-axis, and (i : 1) represents spin along the y-axis. In a different
notation, |up⟩ + |down⟩ and i|up⟩ + |down⟩ are the state vectors for these two
spin directions. The x-axis and y-axis state vectors are not eigenvectors of σz.
The rules for calculating probabilities (clothed in any philosophy you like) yield
the 50–50 chances mentioned earlier.
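The 50–50 probabilities follow from the Born rule. A NumPy sketch; note that with the sign conventions used here, (i, 1) comes out as the eigenvalue −1 eigenvector of σy:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

up = np.array([1.0, 0.0])                               # the state (1 : 0)
right = (up + np.array([0.0, 1.0])) / np.sqrt(2)        # |up> + |down>, the state (1 : 1)
side = (1j * up + np.array([0.0, 1.0])) / np.sqrt(2)    # i|up> + |down>, the state (i : 1)

# Born rule: probability of finding spin up is |<state, up>|^2 = 1/2 for both
assert np.isclose(abs(np.vdot(right, up)) ** 2, 0.5)
assert np.isclose(abs(np.vdot(side, up)) ** 2, 0.5)
# (1, 1) is an eigenvector of sigma_x; (i, 1) of sigma_y (eigenvalue -1 here)
assert np.allclose(sx @ right, right)
assert np.allclose(sy @ side, -side)
```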
As an exercise, you may like to chew on these remarks: if two measurements
can be done simultaneously, then the associated Hermitian operators must have
the same set of eigenvectors, and so the operators must commute. But the σ
matrices don’t commute. This accounts mathematically for the non-intuitive (or
at least non-classical) results of the Stern-Gerlach experiment. The Heisenberg
uncertainty principle stems from the same sort of considerations.
Now a general comment. Any linear operator on a Hilbert space H induces
a mapping on the space of states, since if x and cx are two state vectors for
the same state, then Ax and Acx = cAx will represent the same state. Can I
dispense with the Hilbert space entirely and just work with the space of states
and the induced mappings? The answer is yes, but it would be inconvenient.
If A is an observable with eigenvector v, say Av = λv, then the eigenvalue λ
has physical significance. But when we look at the action of A on the space of
states, all we notice (at first) is that A leaves the state corresponding to v fixed.
Nonetheless, the results of measurement are encoded in the action of A on the
space of states. A and cA, c ≠ 0, induce the same mapping on the states, and
the converse holds for the cases of interest to us (if A and B induce the same
state mapping, then A = cB for some scalar c ≠ 0). This scalar c will be real
for Hermitian A and B. If c ≠ 1, then A and B really represent the same
measurement, but expressed in different units (e.g., foot-pounds vs. ergs.)
Example: suppose Av = λv and Aw = μw, and Bv = λ′v, Bw = μw, with
λ ≠ λ′. A and B do the same thing to the quantum states determined by v
and w— namely, the states are left fixed. However, A and B send the state
determined by v + w to different states.
It turns out you can even assume a little more: the change of state caused by rotating
the electron is induced by an operator in SU (2). The operator in question is
called a rotation operator.
How should we visualize the action of a rotation operator R on a state vector
v? We saw how to picture R as a rotation in 3-space by looking at its effects on
traceless Hermitian matrices: A → RAR∗ , where A = xσx +yσy +zσz . How can
we hook up the action of R on state vectors with the action of R on traceless
Hermitian matrices? It seems we need a correspondence between states (say
(a : b)) and matrices of the form xσx + yσy + zσz . We won’t get quite this, but
we’ll get something just as good.
The trick is to set up a correspondence between states and yet another kind
of matrix: a projection matrix. You probably noticed that the matrix σz does
a pretty good job specifying the state “spin up along the z-axis”. As it turns
out, σz is not a projection matrix, but it corresponds in a natural fashion to
½(1 + σz), which is.
Here’s how it goes for an arbitrary state vector v = a|up⟩ + b|down⟩. Suppose v
is normalized, so |a|^2 + |b|^2 = 1. The projection matrix for v is given by taking
the product of the column vector v with the row vector v^*:

$$vv^* = \begin{pmatrix} a \\ b \end{pmatrix}\begin{pmatrix} a^* & b^* \end{pmatrix} = \begin{pmatrix} aa^* & ab^* \\ a^*b & bb^* \end{pmatrix}$$
Why do I call vv ∗ a projection matrix? Answer: by analogy with projections in
ordinary real vector spaces, say R3 . If v is a vector of norm 1, and w is an arbi-
trary vector, then the projection of w “along the vector v” (i.e., in the subspace
spanned by v) is (v · w)v. Analogous to this, we define proj_v(w) = ⟨v, w⟩v,
using the notation ⟨v, w⟩ for the inner product. In the “row vector, column
vector” notation, this is (v^* w)v = v(v^* w) = vv^* w. In physicists’ notation, this
is |v⟩⟨v|w⟩.
We have acquired a new way of picturing the action of SU (2) on 3-space. The
formula v → Avv ∗ A∗ captures it succinctly. The mapping v → vv ∗ sets up a
one-one correspondence between the states (a : b) (i.e., the complex projective
line) and points on a sphere in 3-space. In fact this is just the Riemann sphere
mapping!
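The Riemann sphere correspondence can be made concrete: each normalized v gives vv* = ½(1 + xσx + yσy + zσz) with (x, y, z) on the unit sphere. A NumPy sketch, using the trace to read off the coefficients (a and b below are arbitrary values of mine):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

a, b = 0.6, 0.8j                       # a normalized state a|up> + b|down>
v = np.array([a, b])
P = np.outer(v, v.conj())              # the projection matrix v v*

# P is a Hermitian projection of trace 1
assert np.allclose(P @ P, P) and np.allclose(P, P.conj().T)
# its Pauli coefficients (x, y, z) land on the unit sphere
x, y, z = (np.trace(P @ s).real for s in (sx, sy, sz))
assert np.isclose(x * x + y * y + z * z, 1.0)
# spin up, v = (1, 0), lands on the north pole (0, 0, 1)
Pup = np.outer([1, 0], [1, 0])
assert np.isclose(np.trace(Pup @ sz).real, 1.0)
```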
So the quantum states for the spin of an electron can be pictured as points
on a sphere. Elements of SU (2) correspond to the change in state induced by
rotating the electron, and this action of SU (2) can be pictured as a rotation of
the sphere. The naive pictures match up with the SU (2) formalism flawlessly.
The element −1 of SU (2) induces the identity mapping on the space of states,
since v and −v represent the same quantum state.
A simple computation illustrates how everything meshes. The rotation operator
for a clockwise 90° rotation about the y-axis is (1/√2)(1 + iσy). Indeed, if you work
out (1 + iσy)σx(1 − iσy), you get 2σz, and likewise (1 + iσy)σz(1 − iσy) = −2σx.
The x-axis maps to the z-axis, and the z-axis maps to minus the x-axis.
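The same computation in NumPy, as a sanity check:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

R = (np.eye(2) + 1j * sy) / np.sqrt(2)    # rotation operator, 90 degrees about y
Rs = R.conj().T
assert np.allclose(R @ Rs, np.eye(2))      # R is unitary
# x-axis -> z-axis, z-axis -> minus the x-axis, y-axis fixed
assert np.allclose(R @ sx @ Rs, sz)
assert np.allclose(R @ sz @ Rs, -sx)
assert np.allclose(R @ sy @ Rs, sy)
```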
The example of electron spin illustrates two features of quantum mechanics very
clearly.
Some more technical features, also embodied in this example, and typical of
quantum mechanics:
• Need for complex numbers: The neat correspondence with the classi-
cal spinning ball picture wouldn’t work if we did everything over R.
• Non-commuting observables: You cannot simultaneously measure the
x and z components of spin (for example), because σx and σz do not
commute.
• Symmetry groups and observables: The rotation symmetry group
gives rise indirectly to the σ matrices, and ultimately to the notion of
angular momentum. The mathematical basis is the Lie group-Lie algebra
correspondence.
Had I started with the first historical example, the single spinless particle coast-
ing in space, I would be illustrating the same morals with different actors:
• Need for complex numbers: The appropriate Hilbert space is the space
of complex-valued L2 functions on 3-space.
• Non-commuting observables: The momentum and position operators
do not commute, and you cannot simultaneously measure position and
momentum.
• Symmetry groups and observables: The group of translations in 3-
space gives rise to a Lie group acting on the L^2 Hilbert space; the momentum
operator emerges from the corresponding Lie algebra.
5 Loose Ends
There are so many loose ends that it seems pointless to try to tie them all up. I
will finish off with a few observations, meant more to tantalize than enlighten.
The Lie bracket. Can one reconstruct the Lie group from the Lie algebra?
In general, the multiplication table of a group is determined if you know the
multiplication table for its generators; why not try this with the “infinitesi-
mal” generators? If you try this approach, you will find you need to know the
commutators of infinitesimal elements, like x(ε)y(ε)x(ε)^{-1}y(ε)^{-1}.
My “definition” of the Lie algebra involved approximating infinitesimal gener-
ators by Taylor expansions out to the first order. In other words, I used only
first order derivatives. But to the first order, the commutators are zero!
Say we approximate an “infinitesimal” element of the Lie group out to the
second order:

$$x(\varepsilon) \approx 1 + \varepsilon x'(0) + \frac{\varepsilon^2}{2} x''(0)$$
If you work out the commutator, you will find expressions of the form vw − wv
appearing, where v and w belong to the Lie algebra. And one can verify that
vw − wv belongs in fact to the Lie algebra, as I’ve defined it, although vw and
wv in general don’t.
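Both halves of that claim can be checked for so(3). A NumPy sketch: the commutator of two antisymmetric matrices is antisymmetric (though the plain product isn't), and on the basis x̂, ŷ, ẑ the brackets reproduce the cross product:

```python
import numpy as np

xhat = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], float)
yhat = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], float)
zhat = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], float)

def bracket(v, w):
    return v @ w - w @ v

# vw - wv stays in so(3) (antisymmetric), though vw alone does not
B = bracket(xhat, yhat)
assert np.allclose(B, -B.T)
assert not np.allclose(xhat @ yhat, -(xhat @ yhat).T)
# the brackets mirror the cross product: [x, y] = z and cyclically
assert np.allclose(bracket(xhat, yhat), zhat)
assert np.allclose(bracket(yhat, zhat), xhat)
assert np.allclose(bracket(zhat, xhat), yhat)
```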
Remarkably, knowledge of these second order terms completely specifies the
structure of the Lie group near the identity. That is, if the Lie algebras are
isomorphic, then the Lie groups are locally isomorphic. Third and higher-order
terms are not needed.
Special relativity, spinors, and the Dirac equation. The crucial Lie
group for special relativity is the Poincaré group: all transformations of Minkowski
4-space (spacetime) that preserve the Minkowski pseudo-metric. The Lorentz
group is the subgroup that leaves the origin fixed, and the proper Lorentz group
is the subgroup of orientation preserving Lorentz transformations. The proper
Lorentz group in turn contains the rotation group of 3-space, SO(3).
Just as SU (2) is the double cover of SO(3), so SL(2) is the double cover of the
proper Lorentz group, where SL(2) is the group of unimodular 2 × 2 complex
matrices.
Say A is in SL(2) and v is in C2 . It turns out to be important to pry the
mapping vv ∗ → Avv ∗ A∗ apart into v → Av and v ∗ → v ∗ A∗ . The vector v
can be pictured as a geometric object consisting of a vector in space (rooted at
the origin) with an attached “flag”, i.e., a half-plane whose “edge” contains the
vector. Moreover, if the flag is rotated through 360◦ , v turns into −v! (Recall
the earlier remarks on untangling threads.) Such an object is called a spin
vector. And just as one can create tensors out of the raw material of vectors,
so one creates spinors out of spin vectors.
Dirac invented spinors in the course of inventing (or discovering) the Dirac
equation, the correct relativistic wave equation for the electron. As it happens,
the σ matrices are not enough to carry the load; Dirac had to go up to 4 × 4
matrices (called the Dirac matrices). The σ matrices are imbedded in the Dirac
matrices.
I won’t repeat the story of how Dirac discovered antiparticles. Nor the story of
how he rediscovered knitting and purling (see Gamow’s Thirty Years that Shook
Physics.)
$$P_{\mathrm{ex}} v = -v$$
and for the boson case,
$$P_{\mathrm{ex}} v = v$$
The minus sign for fermions ultimately derives from the double covering of
SO(3) via SU (2). Spinors also get into the act. Since I don’t fully understand
the story myself, this seems like a good place to stop.