Hughes, Structure and Interpretation of Quantum Mechanics
Preface
An Unexceptionable Interpretation of Quantum Logic
Putnam on Quantum Logic
Properties and Deviant Logic
9 Measurement
Three Principles of Limitation
Indeterminacy and Measurement
Projection Postulates
Measurement and Conditionalization
The Measurement Problem and Schrodinger's Cat
Jauch's Model of the Measurement Process
A Problem for Internal Accounts of Measurement
Three Accounts of Measurement
* * *
Among those to whom I owe thanks are Malcolm McMillan and M. H. L.
Pryce of the University of British Columbia, in whose physics classes I
learned about quantum mechanics; I hope that in the pages that follow they
can recognize the beautiful theory that they taught me. For what I learned
when I came to teach courses myself I owe a debt to my students at the
University of British Columbia, at the University of Toronto, at Princeton,
and at Yale. Allen I'llk hllhln, in particular, read most of the manuscript in
his senior year at Yale and would return to me weekly, politely drawing my
attention to obscurities, fallacies, and simple errors of fact. Roger Cooke,
Michael Keane, and W. Moran have kindly allowed me to reprint their
"elementary proof" of Gleason's theorem, and my commentary on it has
been much improved by Roger Cooke's suggestions. For detailed comments
on a late draft of the book I am also indebted to Jon Jarrett, while for specific
advice, encouragement, and appropriate reproof I would like to thank
Steven Savitt, Michael Feld, Clark Glymour, David Malament, and Lee
Smolin.
Sections of the book were written in railway carriages, airport lounges,
and theatrical dressing rooms, but for more tranquil environments I am
grateful to Sue Hughes and Paul Schleicher, to Susan Brison, and to Margot
Livesey, in whose houses whole chapters took shape. The final manuscript
was typed up swiftly and accurately by Caroline Curtis, and the diagrams
elegantly rendered by Mike Leone; Patricia Slatter is even now at work on
the index. At Harvard University Press, Lindsay Waters has been a source of
great encouragement over several years, and Kate Schmit edited the manuscript
with great care and sensitivity. My thanks to all of them.
I saved my two greatest personal debts till last. I met Ed Levy within days
of my arrival at U.B.C.; he it was who first stimulated my interest in the
philosophical foundations of quantum mechanics, who later supervised my
dissertation in that area, and who has continued to help me to clarify my
thoughts on the subject. Bas van Fraassen and I met at the University of
Toronto; since then we have discussed the problems quantum theory raises
(along with the architecture of the Renaissance and the plays of Friedrich
Dürrenmatt) on two continents and in half a dozen countries. Both have
helped me more, perhaps, than they know.
I would also like to express my gratitude to the following firms and
institutions:
To Addison-Wesley Publishing Company, Reading, Massachusetts, for
permission to reprint a diagram from The Feynman Lectures on Physics
(1965), by R. P. Feynman, R. B. Leighton, and M. Sands.
To Kluwer Academic Publishers, Dordrecht, Holland, for permission to
reprint a diagram from J. Earman's A Primer on Determinism (1986).
To Cambridge University Press, Cambridge, U.K., for permission to reprint
in Appendix A "An Elementary Proof of Gleason's Theorem," by R.
Cooke, M. Keane, and W. Moran, from Mathematical Proceedings of the
Cambridge Philosophical Society (1985).
To the Frederick W. Hilles Publication Fund of Yale University, with
whose assistance the manuscript was prepared.
I seik about this warld unstabille
To find ane sentence convenabille,
Bot I can nocht in all my wit
Sa trew ane sentence fynd off it
As say, it is dessaveabille.
-WILLIAM DUNBAR
INTRODUCTION
Quantum mechanics is at once one of the most successful and one of the
most mysterious of scientific theories. Its success lies in its capacity to clas-
sify and predict the behavior of the physical world; the mystery resides in
the problem of what the physical world must be like to behave as it does.
The theory deals with the fundamental entities of physics-particles like
protons, electrons, and neutrons, from which matter is built; photons,
which carry electromagnetic radiation; and the host of "elementary particles"
which mediate the other interactions of physics. We call these "particles"
despite the fact that some of their properties are totally unlike the
properties of the particles of our ordinary, macroscopic world, the world of
billiard balls and grains of sand. Indeed, it is not clear in what sense these
"particles" can be said to have properties at all.
Physicists have been using quantum mechanics for more than half a
century; yet there is still wide disagreement about how the theory is best
understood. On one interpretation, the so-called state functions of quantum
mechanics apply only to ensembles of physical systems; on another, they
describe individual systems themselves. On one view we can say something
useful about a particle only when it interacts with a piece of measuring
equipment; on another, such a particle can be perfectly well described at all
times, but to do so we need a language in which the ordinary laws of logic do
not hold.
Are we, perhaps, foolish to seek an interpretation of this theory? Maybe
we should take the advice Richard Feynman (1965, p. 129) offers in one of
his lectures on The Character of Physical Law:
I am going to tell you what nature behaves like ... Do not keep saying to yourself,
if you can possibly avoid it, "But how can it be like that?" because you will get
"down the drain," into a blind alley from which nobody has yet escaped. Nobody
knows how it can be like that.
[Figure: diagram of the apparatus, with the labels "diaphragm", "magnet", and "DuBois magnet viewed obliquely"]
round the direction of the field, like a spinning top precessing round the
vertical. In a nonuniform field, on the other hand, the magnet will feel a net
force in one direction or the other, depending on which pole is in the
stronger part of the field.
But to picture a silver atom as a tiny compass needle would be wrong. For
if the atoms behaved in that way we would expect to find their magnetic
axes oriented randomly as they entered the field; that being so, those de-
flected most in one direction would be those with their axes aligned parallel
with the field gradient, and those deflected most in the other would be those
with their axes antiparallel with the field. In addition, however, there would
be large numbers of atoms that were not aligned exactly upward or down-
ward and that would suffer deflections intermediate between these two
extremes. In other words, instead of two spots of silver on the glass plate,
Stern and Gerlach would have seen a smeared line.
Of course, we could take the two spots they observed to show that the
magnetic axes of the atoms were oriented either upward or downward but
nowhere in between. When the magnets were rotated 90°, however, the
beam was again split into two, but now one part was deflected to the left and
the other to the right; by parallel reasoning, this would show that the
magnetic axes of all the atoms were oriented horizontally. Clearly, the
simple compass-needle model of a silver atom will not do.
More formally, we can contrast the behavior of these atoms with that of a
compass needle as follows. A classical magnet has a magnetic axis; its
magnetic moment is directed along this axis, but this moment has a compo-
nent in any direction we choose, whose value ranges continuously from a
maximum in the direction of the axis through zero along a line perpendicu-
lar to it, to a maximum negative value in the opposite direction (see Figure
1.2). However, it seems that the components in any direction in space of the
magnetic moment of a silver atom can have only one of two values; these are
numerically the same as each other, but one is positive with respect to that
direction and the other is negative.
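The contrast can be made concrete with a small numerical sketch. It assumes a classical moment of unit magnitude and, purely for illustration, takes the two values found for the atom to be plus and minus one unit; the names `MU`, `ATOM_VALUE`, and `classical_component` are hypothetical and do not come from the text.

```python
import math

MU = 1.0          # magnitude of the classical magnetic moment (assumed units)
ATOM_VALUE = 1.0  # magnitude of the two values found for the atom (assumed units)

def classical_component(angle_deg: float) -> float:
    """Component of a classical moment along a direction at angle_deg to its axis."""
    return MU * math.cos(math.radians(angle_deg))

# Classical magnet: the component varies continuously from +MU through 0 to -MU.
classical_values = [round(classical_component(a), 3) for a in range(0, 181, 45)]

# Silver atom: along any chosen direction only two values are ever found.
atom_values = [+ATOM_VALUE, -ATOM_VALUE]

print(classical_values)
print(atom_values)
```

Sampling the classical component at finer angular steps fills in the whole interval between the two extremes, which is the smeared line the compass-needle model would lead us to expect.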
At first the experiment was explained in terms of the "magnetic core"
hypothesis of Sommerfeld and Landé, a hypothesis long since discarded,
which attributed the deflection of the beam to the magnetic properties of the
nucleus and inner electrons of an atom. In 1925 an alternative explanation
was to hand, proposed by Goudsmit and Uhlenbeck, and it is this explana-
tion which was incorporated into quantum theory as we now know it.
The explanation is roughly this. An electron possesses an intrinsic angular
momentum, known as "spin," which gives rise to a magnetic moment. A
component of the spin in any direction has one of two values, $+\tfrac{1}{2}\hbar$ or $-\tfrac{1}{2}\hbar$;
hence we list electrons among the "spin-$\tfrac{1}{2}$ particles." (The constant $\hbar$ is the
Figure 1.2 If the magnetic moment $\mu$ of the compass needle is directed along the dotted
line, then the component of $\mu$ along AB will equal $\mu\cos\theta$.
separate beams accordingly. This conclusion would be confirmed were we
to block off one of the beams as it left the apparatus, the spin-down beam,
let us say. The emerging atoms would now all be spin-up, as we can verify
by placing a second magnet in tandem with the first (Figure 1.3). No further
splitting of the beam would take place, though the beam as a whole would
be deflected further upward. Let us call this "Experiment VV."
Now consider a different experiment (Experiment VH) in which the sec-
ond apparatus is rotated 90° (Figure 1.4). The incoming beam-that is, the
spin-up beam from the first apparatus-will be split into two horizontally
separated beams, spin-left and spin-right. (So far our account and quantum
theory are entirely in harmony.) However, now let us block off the spin-
right beam. What are we to say of the atoms which now emerge? Our
account suggests that they have been through two filters: they have passed
the first by virtue of having a spin-up vertical component of spin, and the
second by virtue of a spin-left horizontal component of spin. In other words,
it suggests that we can specify both the horizontal and the vertical compo-
nents of spin these atoms possess: were a third apparatus set up to receive
this beam, whether the magnetic field gradient were vertical or horizontal,
no further splitting would occur.
Unfortunately, this is not the case (Feynman, Leighton, and Sands, 1965,
vol. 3, lecture 5). With the axis of the third apparatus set in any direction but
horizontal, the emergent beam will be split into two. With it vertical (Exper-
iment VHV, Figure 1.5) the two parts of the beam will be equal in intensity.
This, at least, is what quantum theory predicts for idealized experiments of
this kind, and all the evidence from actual experiments, some of which are
very close in principle to those described, confirms its predictions. It seems
that, somewhere along the line, there is a divergence between the quantum-
theoretic analysis of what happens and our account of it: at some point an
unwarranted assumption or two has found its way into the latter. Within it
we find at least four separate assumptions at work:
(1) That when we assign a numerical value to a physical quantity for a
system (as when we say that the vertical component of spin of an electron is
$+\tfrac{1}{2}\hbar$), we can think of this quantity as a property of the system; that is, we
can talk meaningfully of the electron having such and such a vertical compo-
nent of spin.
(2) That we can assign a value for each physical quantity to a system at
any given instant-for example, that we can talk of a silver atom as being
both spin-up and spin-left.
(3) That the apparatus sorts out the atoms according to the values of one
particular quantity (such as the values of the vertical component of spin), in
other words, according to the properties they possess.
(4) That as it does so the system's other properties remain unchanged.
The evidence of Experiments VV and VH is consistent with all of these
assumptions; that of Experiment VHV with (1), (2), and (3), but not (4). It
looks as though the spin-left, spin-right measurement effected by the second
apparatus disturbs the values of the vertical components of spin. But,
oddly enough, it disturbs only half of them. According to our interpretation
of Experiment VV, all the atoms entering the second apparatus of VH have
spin-up vertical components of spin, but as they emerge half of them are
spin-up and half spin-down. Or so Experiment VHV informs us, as interpreted
on the basis of assumptions (1), (2), and (3).
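The equal intensities reported for Experiment VHV can be reproduced by a short calculation. What follows is only a sketch under an assumption that this passage does not itself state: that spin states are represented by normalized two-component vectors, and that the probability of passing a filter is the squared inner product of the prepared state with the filtered one. The vectors `UP`, `DOWN`, `LEFT`, `RIGHT` and the function `probability` are illustrative names, and which horizontal state is called "left" is a labeling convention.

```python
import math

# Assumed spin state vectors, written in the vertical (up/down) basis.
UP = (1.0, 0.0)
DOWN = (0.0, 1.0)
LEFT = (1 / math.sqrt(2), 1 / math.sqrt(2))
RIGHT = (1 / math.sqrt(2), -1 / math.sqrt(2))

def probability(prepared, outcome):
    """Squared inner product of two real, normalized state vectors."""
    amplitude = sum(p * o for p, o in zip(prepared, outcome))
    return amplitude ** 2

# Experiment VV: a spin-up beam meets a second vertical apparatus.
p_vv_up = probability(UP, UP)

# Experiment VH: a spin-up beam meets a horizontal apparatus.
p_vh_left = probability(UP, LEFT)
p_vh_right = probability(UP, RIGHT)

# Experiment VHV: the spin-left beam meets a third, vertical apparatus.
p_vhv_up = probability(LEFT, UP)
p_vhv_down = probability(LEFT, DOWN)

print(p_vv_up, p_vh_left, p_vh_right, p_vhv_up, p_vhv_down)
```

On these assumptions the spin-up beam is not split further in VV, while in VHV the beam divides evenly, whatever filter the atoms passed first.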
On this analysis, quantum theory owes us an explanation for the selectiv-
ity displayed by the second apparatus. Why is it, we may ask, that half the
atoms entering this apparatus are tipped upside down, while the other half
journey on undisturbed? Quantum theory declines to tell us. Rather, it
suggests that we not only abandon (4) but also look severely at the other
principles involved. Assumption (2) may be the first casualty: there may be
distinct properties which are incompatible. These properties would not just
be mutually exclusive values of one quantity, like spin-up and spin-down,
but also properties associated with two different quantities, vertical and
horizontal components of spin, for example, or the possibly more familiar
pair, position and momentum. The possession of a well-defined value for
one such quantity would rule out its possession for the other: to say that an
atom was spin-up would rule out our saying that it was also spin-left. But if
this is the case, then measurement will not be the simple process suggested
by (3); if the vertical and horizontal components of spin are incompatible,
and a system has a well-defined vertical component, then a measurement of
the horizontal component will not merely reveal what value of the latter the
system possesses. The measurement process may have to be seen as in some
sense bringing this value about. To say that properties are not revealed by
measurement, however, serves to point out an oddity, not only in the
quantum concept of measurement but also in the notion of a property at
work here. If we accept assumption (1)-that is, the identification of a
property with a particular value of a physical quantity-then we may find
ourselves dealing with properties of a very peculiar kind. All four assump-
tions, not just the last of them, need careful scrutiny.
• For a contemporary view of the problems it raises, see Einstein and Ehrenfest (1922),
reprinted in Ehrenfest (1959), pp. 452-455.
I
The Structure of
Quantum Theory
1
Vector Spaces
1.1 Vectors
Consider the two-dimensional real space of the plane of the paper, $\mathbb{R}^2$. We
pick a particular point in $\mathbb{R}^2$ and call it the zero vector, $\mathbf{0}$. The other vectors in
$\mathbb{R}^2$ are arrows of finite length which lie within the plane with their tails at
zero; any arrow of this kind is a nonzero vector of $\mathbb{R}^2$. (See Figure 1.1.)
We can define vector addition, the operation by which we add two vectors
to form a third, as follows. Given two vectors u and v, we construct a
parallelogram with u and v as adjacent sides (see Figure 1.2). The diagonal
of this parallelogram, which passes through $\mathbf{0}$, will also be a vector: call it w.
This vector is the vector sum of u and v. We write,
$$w = u + v$$
$$v + (-1)v = \mathbf{0}$$
Note that $\mathbf{0}$ here denotes the zero vector, not the number zero.
So far we have proceeded entirely geometrically, using a geometrical
construction to obtain u + v and giving a geometrical meaning to av (where
a is a real number). However, an alternative, arithmetical approach is open
to us. We may impose a coordinate system on our space and then refer to
each vector by the coordinates of its tip. Each vector will be designated by a
pair of numbers, which we write as a column, thus:
$$\begin{pmatrix} x \\ y \end{pmatrix}$$
Figure 1.1
Figure 1.2
The numbers x and y by which we denote a particular vector v will of course
vary according to the coordinate system we have chosen (see Figure 1.4), or,
as we say, according to the basis we use. Provided we are consistent and
don't switch haphazardly from one to another, in principle it doesn't matter
what basis we choose, though one may be more convenient than another. In
every basis the zero vector is represented by
$$\begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
Unless otherwise stated, we shall assume from now on that we are using a
single (arbitrarily chosen) fixed basis.
Corresponding to the operations of vector addition and scalar multiplication
carried out on vectors, we can perform very simple arithmetical operations
on representations of vectors. It is easily shown that, for
$$v = \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{and}\quad u = \begin{pmatrix} x' \\ y' \end{pmatrix},$$
we have
$$v + u = \begin{pmatrix} x + x' \\ y + y' \end{pmatrix} \quad\text{and}\quad av = \begin{pmatrix} ax \\ ay \end{pmatrix}$$
Figure 1.3
Figure 1.4 The vector v is represented by one column of numbers in basis 1 and by a
different column in basis 2; thus the same vector can be represented in many different ways.
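Note that
$$v + (-1)v = \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} -x \\ -y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \mathbf{0}$$
as required.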
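The arithmetical approach lends itself to a direct check. The sketch below represents a vector of the plane by a pair of coordinates in one fixed basis; the function names `add` and `scale` and the sample vectors are illustrative choices, not notation from the text.

```python
from typing import Tuple

Vector = Tuple[float, float]

def add(u: Vector, v: Vector) -> Vector:
    """Vector addition: add corresponding coordinates."""
    return (u[0] + v[0], u[1] + v[1])

def scale(a: float, v: Vector) -> Vector:
    """Scalar multiplication: multiply each coordinate by the number a."""
    return (a * v[0], a * v[1])

ZERO: Vector = (0.0, 0.0)

v: Vector = (3.0, -2.0)
u: Vector = (1.0, 5.0)

total = add(v, u)                 # coordinates (3 + 1, -2 + 5)
cancelled = add(v, scale(-1, v))  # v + (-1)v

print(total)
print(cancelled == ZERO)
```

The second result is the zero vector, not the number zero: it is still a pair of coordinates.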
1.2 Operators
We now consider operators on our set of vectors. An operator transforms any
vector in the space into another vector; one example is a rotation operator,
which swings any vector round through a certain angle without altering its
length. We will denote operators by boldface capital letters and write Av for
the vector which results when the operator A acts on the vector v. In Figure
1.5 v' is the vector we get by swinging v round through an angle $\theta$
(counterclockwise), and so we write,
$$v' = R_\theta v$$
Figure 1.5
Figure 1.6
Figure 1.7
$$\begin{pmatrix} x \\ y \end{pmatrix}$$
and transforms it into the vector
$$\begin{pmatrix} x \\ 0 \end{pmatrix}$$
Notice that, while for a given vector v there is only one vector $P_x v$ (or else $P_x$
would not be an operator), nevertheless we may well have distinct vectors u
and v such that $P_x u = P_x v$ (see Figure 1.8). In this way projection operators
differ from rotation and reflection operators.
We may, of course, perform a series of operations on a given vector v. We
may, for instance, rotate it through an angle $\theta$ to produce $R_\theta v$ and then
project the resulting vector onto the x-axis, producing $P_x(R_\theta v)$. Now, provided
$\theta$ is neither 0° nor 180°, this vector is different from the one we get if
(1.1) $A(u + v) = Au + Av$
(1.2) $A(cv) = c(Av)$
such that if
$$v = \begin{pmatrix} x \\ y \end{pmatrix}$$
then
(1.3) $Av = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$
operator.
To perform the manipulations in (1.3), think of taking the top line (a b) of
the matrix, rotating it so that it matches up with the vector v (multiply a by x
and b by y), and then adding ax to by to get the top entry of the vector Av.
The bottom entry is obtained by doing the same with the bottom line of the
matrix. (For help with the proof of this theorem, or for a fuller account of
elementary vector-space theory, consult any book on linear algebra, such as
Lang, 1972.)
Figure 1.8
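The row-by-column recipe just described can be written out in a few lines. This is a minimal sketch: the function names `apply` and `add` and the sample numbers are illustrative, and the matrix is stored as a pair of rows.

```python
def apply(A, v):
    """Act on the vector v with the matrix A, one row at a time."""
    (a, b), (c, d) = A
    x, y = v
    top = a * x + b * y      # top line (a b) matched against the column (x y)
    bottom = c * x + d * y   # bottom line (c d) matched against the same column
    return (top, bottom)

def add(u, v):
    """Vector addition in coordinates."""
    return (u[0] + v[0], u[1] + v[1])

A = ((1, 2), (3, 4))
v = (5, 6)
u = (1, -1)

Av = apply(A, v)
# Linearity: acting on a sum gives the sum of the results.
lhs = apply(A, add(u, v))
rhs = add(apply(A, u), apply(A, v))

print(Av)
print(lhs == rhs)
```

Any operator computed in this way is additive, which is one half of what it means for the operator to be linear.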
The operators we have looked at all have simple matrix representations.
For instance,
$$S_y = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad P_x = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$$
A little thought should show why these operators have the representations
shown, and it is a useful exercise to show that
$$R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
This is the operator that leaves any vector as it found it: Iv = v, for all v.
If we have two operators A and B, with representations
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} e & f \\ g & h \end{pmatrix}$$
what is the representation of their product AB? It is the matrix which, when
we operate with it on the vector v according to the rule (1.3), yields the same
result as operating with B and A, in that order. A little brisk manipulation
shows that
$$AB = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}$$
(We obtain the top left-hand entry by matching the top line of A with the left
column of B, the top right entry by matching the top line of A with the right
column of B, and so on.)
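It's worth going into this in more detail. Let
$$v = \begin{pmatrix} x \\ y \end{pmatrix}$$
Then if $v' = Bv$,
$$v' = \begin{pmatrix} ex + fy \\ gx + hy \end{pmatrix}$$
Now
$$Av' = \begin{pmatrix} a(ex + fy) + b(gx + hy) \\ c(ex + fy) + d(gx + hy) \end{pmatrix} = \begin{pmatrix} (ae + bg)x + (af + bh)y \\ (ce + dg)x + (cf + dh)y \end{pmatrix}$$
and so
$$AB = \begin{pmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{pmatrix}$$
$$R_\theta P_x = \begin{pmatrix} \cos\theta & 0 \\ \sin\theta & 0 \end{pmatrix} \quad\text{and}\quad P_x R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ 0 & 0 \end{pmatrix} \quad (*)$$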
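That the order of operations matters can be seen by multiplying the matrices both ways round. The sketch below assumes the usual matrices for a counterclockwise rotation and for projection onto the x-axis; `matmul`, `rotation`, and `P_X` are illustrative names.

```python
import math

def matmul(A, B):
    """Product AB: rows of A matched against columns of B."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def rotation(theta_deg):
    """Matrix of a counterclockwise rotation through theta_deg degrees."""
    t = math.radians(theta_deg)
    return ((math.cos(t), -math.sin(t)),
            (math.sin(t), math.cos(t)))

P_X = ((1.0, 0.0), (0.0, 0.0))  # projection onto the x-axis

R = rotation(30)
RP = matmul(R, P_X)  # project first, then rotate
PR = matmul(P_X, R)  # rotate first, then project

print(RP)
print(PR)
```

For a rotation of 30° the two products differ: the first has a zero right-hand column, the second a zero bottom row.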
Figure 1.9). That is, the vector v can be written as the sum: $v = v_L + v_{L\perp}$. We
now define the projection operator $P_\theta$ onto the line L as the operator $P_\theta$ such
that $P_\theta v = v_L$. As an exercise it's worth showing that
(1.4) $P_\theta = \begin{pmatrix} \cos^2\theta & \cos\theta\sin\theta \\ \cos\theta\sin\theta & \sin^2\theta \end{pmatrix}$
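A numerical check of this matrix is straightforward. The sketch assumes that the matrix for projection onto the line at angle θ to the x-axis has entries cos²θ, cos θ sin θ, cos θ sin θ, sin²θ; it then splits a sample vector into the part lying in the line and the remainder, and confirms that the two parts are at right angles. The names `projection` and `apply` are illustrative.

```python
import math

def projection(theta_deg):
    """Matrix of the projection onto the line at theta_deg to the x-axis."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    return ((c * c, c * s), (c * s, s * s))

def apply(A, v):
    """Act on the vector v with the matrix A."""
    (a, b), (c, d) = A
    x, y = v
    return (a * x + b * y, c * x + d * y)

P = projection(45)
v = (2.0, 0.0)

v_line = apply(P, v)                          # the part of v lying in the line
v_rest = (v[0] - v_line[0], v[1] - v_line[1]) # what is left over
dot = v_line[0] * v_rest[0] + v_line[1] * v_rest[1]

# Projecting a second time changes nothing.
twice = apply(P, v_line)

print(v_line, v_rest, dot)
```

With θ = 0 the same function returns the matrix of the projection onto the x-axis.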
The addition of two linear operators is easily defined: we write, for all
vectors v,
$$(A + B)v = Av + Bv$$
$$A + B = \begin{pmatrix} a + e & b + f \\ c + g & d + h \end{pmatrix}$$
In this case
Av = (:) = 2v
That is, the vector which results is just a multiple of the vector we started
with . This is not always so with this operator; if we evaluate Au, where
Thus A does not simply double the length of all vectors. In fact, if we
interpret A geometrically, we find that it corresponds to the operation of
first doubling the length of a vector, then rotating it 90° counterclockwise,
and finally reflecting the result about the y-axis. (To see this, check that
~2 ~1
x
/
/
However, for any vector v', lying along the same line as
(that is, any vector lying along the line Ll at 45° to our axes-Figure 1.10),
and hence of the form
(;)
we find that Av' = 2v' (*l Also, we can check very quickly that if w is a
vector of the form
Aw = ( 2X) =
-2x
- 2w
Vectors of the form v' and w are known as eigenvectors of A, and the
eigenvalues corresponding to them are 2 and -2, respectively. More formally,
(1.5) v is said to be an eigenvector of a linear operator A, with corresponding
eigenvalue a, if $v \neq \mathbf{0}$ and Av = av.
Note that we do not allow the zero vector to be an eigenvector of any operator.
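The definition can be turned directly into a test. The sketch below assumes that A is the matrix with rows (0, 2) and (2, 0), which is what the geometrical description of A (double, rotate through 90°, reflect about the y-axis) amounts to; the function names `apply` and `is_eigenvector` are illustrative.

```python
def apply(A, v):
    """Act on the vector v with the matrix A."""
    (a, b), (c, d) = A
    x, y = v
    return (a * x + b * y, c * x + d * y)

def is_eigenvector(A, v, value):
    """True if v is nonzero and Av equals value times v."""
    if v == (0, 0):
        return False  # the zero vector is never counted as an eigenvector
    return apply(A, v) == (value * v[0], value * v[1])

A = ((0, 2), (2, 0))  # assumed matrix of the operator A

print(is_eigenvector(A, (3, 3), 2))    # a vector along the line at 45 degrees
print(is_eigenvector(A, (-3, 3), -2))  # a vector along the line at 135 degrees
print(is_eigenvector(A, (1, 0), 2))    # a vector along the x-axis
print(is_eigenvector(A, (0, 0), 2))    # the zero vector
```

The first two calls succeed and the last two fail: a vector along the x-axis is sent to one along the y-axis, and the zero vector is excluded by definition.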
Not all operators have eigenvectors. For instance, the rotation operator $R_\theta$
has, in general, no eigenvectors, since, unless $\theta = 0°$ or $\theta = 180°$, the vector
$R_\theta v$ cannot lie along the same line as the vector v. When $\theta = 0°$, $R_\theta$ is the
identity operator I; for the two special cases, I and $R_{180}$, every nonzero vector is an
eigenvector; the corresponding eigenvalues are +1 and -1, respectively.
Likewise, any purely multiplicative operator has every nonzero vector in the space as
an eigenvector.
Now consider the reflection operator $S_y$ and the projection operator $P_\theta$.
Do these admit eigenvectors, and, if so, how many? In general, obviously,
the vector $S_y v$ does not lie along the same line as v. However, it does so in
two special cases: first, when v lies along the y-axis (so that $S_y v = v$); and,
second, when v lies along the x-axis (so that $S_y v = -v$). Thus we have two
classes of eigenvector and two eigenvalues, +1 and -1. The projection
operator $P_\theta$ maps all vectors onto the line $L_\theta$ at an angle $\theta$ to the x-axis (see
Figure 1.11). Thus any nonzero vector in this line is an eigenvector of $P_\theta$, with
eigenvalue 1. Now consider a vector v along the line $L_{\theta+90}$ at right angles to
$L_\theta$. For this vector we have $P_\theta v = \mathbf{0} = 0v$. As always, the symbol $\mathbf{0}$ denotes
the zero vector, while the symbol 0 denotes the number zero. In fact we have
here a special case of the eigenvector equation Av = av in which a = 0.
(Note that, although the zero vector is not an admissible eigenvector, the
number zero is a perfectly good eigenvalue.) We see that the eigenvectors of
$P_\theta$ lie along either $L_\theta$ or $L_{\theta+90}$, and the corresponding eigenvalues are 1 and 0
in the two cases.
Figure 1.11 Eigenvectors of $S_y$ lie along y-axis (eigenvalue 1) or along x-axis (eigenvalue
-1); eigenvectors of $P_\theta$ lie along $L_\theta$ (eigenvalue 1) or along $L_{\theta+90}$ (eigenvalue 0).
These examples suggest some very general conclusions. The operators we
have looked at fall into three classes. One class has no eigenvectors at all:
this class includes all the rotation operators except $R_0$ and $R_{180}$. In the
second class we find the projection and reflection operators and the example
A used at the beginning of this section. In each of these cases all the
eigenvectors lie along one or the other of two lines. With each set of eigen-
vectors (that is, with all the eigenvectors lying along a particular line) is
associated a particular eigenvalue; thus to each operator of this type we can
associate a pair of distinct eigenvalues. Now, in the examples we looked at,
the two lines containing the eigenvectors are at right angles one to the other.
While this is not the case for all the operators on $\mathbb{R}^2$ that (as we say) admit
two eigenvectors with distinct eigenvalues, nevertheless this result holds for
a very significant subclass of such operators, namely those among them
which are symmetric. The matrix representing a symmetric operator on $\mathbb{R}^2$
may be recognized by the fact that its top right element is equal to its bottom
left element.
The third class contains operators like I and $R_{180}$. They admit all the
vectors in the space as eigenvectors, all sharing a common eigenvalue. Note
that these operators are also symmetric.
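We can now use the operator
$$A = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}$$
to illustrate an important result. Clearly A is symmetric: further, it has two
eigenvalues, $a_1$ and $a_2$, where $a_1 = 2$ and $a_2 = -2$. The eigenvectors lie
along two lines, at 45° and at 135° to the x-axis; to the eigenvalue $a_1$
corresponds the line $L_1$ in Figure 1.10, and to the eigenvalue $a_2$ corresponds
$L_2$. We can find the matrices which represent the projection operators onto
these lines. Using (1.4), we obtain:
$$P_1 = \begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} \quad\text{and}\quad P_2 = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{2} \\ -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}$$
$$a_1 P_1 + a_2 P_2 = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} + \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} = A$$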
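The arithmetic of this decomposition can be checked exactly. The sketch below assumes that the two projection matrices are those for the lines at 45° and 135°, whose entries are all plus or minus one half, and that the eigenvalues are 2 and -2; exact fractions are used so that no rounding enters. The names `weighted_sum` and `matmul` are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
P1 = ((HALF, HALF), (HALF, HALF))    # projection onto the line at 45 degrees
P2 = ((HALF, -HALF), (-HALF, HALF))  # projection onto the line at 135 degrees

def weighted_sum(a1, M1, a2, M2):
    """Entry-by-entry sum a1*M1 + a2*M2 of two 2 x 2 matrices."""
    return tuple(
        tuple(a1 * M1[i][j] + a2 * M2[i][j] for j in range(2))
        for i in range(2)
    )

def matmul(M, N):
    """Product MN of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

A = weighted_sum(2, P1, -2, P2)

print(A == ((0, 2), (2, 0)))               # the weighted sum recovers A
print(matmul(P1, P1) == P1)                # projecting twice changes nothing
print(matmul(P1, P2) == ((0, 0), (0, 0)))  # the two lines are at right angles
```

The last line reflects the fact that a vector projected onto one line has no component along the other.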
This result turns out to be quite general. That is, if we take a symmetric
operator A which admits two eigenvectors with distinct eigenvalues $a_1$ and
$a_2$, then the eigenvectors corresponding to $a_1$ all lie within a line $L_1$, and
those corresponding to $a_2$ all lie within a line $L_2$ ($L_1 \perp L_2$). If $P_1$ projects onto
$L_1$ and $P_2$ onto $L_2$, then, as in the case above,
$$A = a_1 P_1 + a_2 P_2$$
It's worth approaching these ideas in a slightly different way. Any linear
operator A on $\mathbb{R}^2$ may be "decomposed" into the sum of other linear operators,
as follows. Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
Then
This is called the spectral decomposition theorem for $\mathbb{R}^2$. There are two cases:
(i) $a_1 \neq a_2$, (ii) $a_1 = a_2$.
In both cases (that is, whenever A is symmetric), A admits eigenvectors.
When $a_1 \neq a_2$, the decomposition of A into the weighted sum of two projection
operators is unique. Furthermore, all eigenvectors of A lie either along
the line onto which $P_1$ projects or along the line onto which $P_2$ projects.
Those in the first line have corresponding eigenvalue $a_1$, while those in the
The introduction of the notion of the inner product of two vectors, also called
the dot product or the scalar product, enables us to give numerical expression
to such geometrical ideas as the length of a vector and the orthogonality of
vectors. (Vectors at right angles one to the other are said to be orthogonal.)
Using the notation introduced by Dirac, we denote the inner product of two
vectors u and v by $\langle u|v\rangle$. We define it for $\mathbb{R}^2$ as follows: if
$$u = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \quad\text{and}\quad v = \begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$$
then
$$\langle u|v\rangle = x_1 x_2 + y_1 y_2$$
The + here is the plus sign of ordinary arithmetical addition. We see that,
although u and v are vectors, their inner product is just a number. For
instance, let $u = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $v = \begin{pmatrix} 3 \\ 4 \end{pmatrix}$; then $\langle u|v\rangle = (2)(3) + (1)(4) = 10$.
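The worked example can be reproduced in a couple of lines. This is a minimal sketch in which vectors of the plane are pairs of coordinates in one fixed basis; the function name `inner` is illustrative.

```python
def inner(u, v):
    """Inner product on the plane: multiply matching coordinates and add."""
    return u[0] * v[0] + u[1] * v[1]

u = (2, 1)
v = (3, 4)

print(inner(u, v))            # (2)(3) + (1)(4)
print(inner(v, u))            # the order makes no difference for real vectors
print(inner((1, 0), (0, 1)))  # two vectors at right angles
```

The result is a single number in each case, and it vanishes for the pair of vectors lying along the two coordinate axes.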
How does this number acquire a geometrical significance? Consider the
case when
$$u = v = \begin{pmatrix} x \\ y \end{pmatrix}$$
In this case $\langle u|v\rangle = \langle u|u\rangle = x^2 + y^2$. Here the geometrical significance is
clear; by Pythagoras' theorem, $\langle u|u\rangle$ is equal to the square of the length of
the vector u. We denote the length of the vector v by $|v|$ and observe that
$$|v| = \sqrt{\langle v|v\rangle}$$
products (using x and y coordinates) involves reference to a particular coordinate
system. The general result, for two vectors u and v at an angle $\phi$ to
each other, is:
(1.10) $\langle u|v\rangle = |u||v|\cos\phi$
I will not prove this general result, but will show that it yields the right
answer in the case of two normalized vectors, u and v, such that u lies along
the x-axis and v is at an angle $\phi$ to it (see Figure 1.12). In this case
$$v = \begin{pmatrix} \cos\phi \\ \sin\phi \end{pmatrix}$$
Figure 1.12
$$v = \begin{pmatrix} \cos\phi \\ \sin\phi \end{pmatrix} \quad\text{and so}\quad P_x v = \begin{pmatrix} \cos\phi \\ 0 \end{pmatrix}$$
(See Figure 1.13.) It follows that $\langle v|P_x v\rangle = \cos^2\phi + (\sin\phi)(0) = \cos^2\phi$, in
accordance with (1.11). In the general case, we see from trigonometry that
$|Pv| = |v|\cos\phi$. By (1.10),
Figure 1.13
Figure 1.14
As is well known, the number 36 is the square of 6, and also the square of
-6; there is no real number x such that $x^2 = -36$. However, we can imagine
the sort of properties such a number would have, if it existed. It would be
twice the square root of -9, for instance, so that it would conform to the
equation $(x/2)^2 = x^2/4$, and it would be a root of the equation $x^2 + 36 = 0$.
In fact, if we included "imaginary" numbers like x in our set of numbers,
then every quadratic equation would be capable of solution. Equations like
$x^2 - 36 = 0$ would have real solutions, whereas those like $x^2 + 36 = 0$
would have imaginary solutions, the square roots of negative numbers. If
the inclusion of imaginary numbers is worrying, it is worth considering the
sense in which a negative number, -6 say, is real, or, come to that, the
sense in which 6 itself is real. Of course, the sum of your worries may not be
decreased by such considerations.
From what has been said, if a is a positive real number, then -a is
negative, and $\sqrt{-a}$ will be imaginary. Our imaginary numbers are to conform
to the same rules as the real numbers, so
2(3i) = 6i = (3i)2
$$(a + ib)(a - ib) = a^2 + b^2$$
This is both real and non-negative. We call a - ib the complex conjugate of a + ib,
and conversely. We denote by c* the complex conjugate of the complex
number c. Thus (a + ib)* = a - ib, and (a - ib)* = a + ib. For all c, (c*)* =
c, but c* = c if and only if c is real. We have just shown that (c*)(c) is always
real and non-negative.
Observe that the use of complex numbers enables us to factorize expressions
like $a^2 + b^2$, which previously resisted factorization.
The quantity $|c| = \sqrt{(c)(c^*)}$ is known as the norm of c. We are often interested
in complex numbers of norm 1; from the definition of the norm, it
follows that if c = a + ib and $|c| = 1$, then $a^2 + b^2 = 1$. This in turn implies
that there is an angle $\theta$ such that $a = \cos\theta$ and $b = \sin\theta$. Thus any complex
number of norm 1 can be written in the form:
$$c = \cos\theta + i\sin\theta = e^{i\theta}$$
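These properties of complex numbers can be checked with the complex arithmetic built into Python, in which the imaginary unit is written `1j`. The sample number 3 + 4i and the angle of 60° are arbitrary choices made for illustration.

```python
import cmath
import math

c = 3 + 4j

product = c * c.conjugate()     # (a + ib)(a - ib) = a^2 + b^2
norm = math.sqrt(product.real)  # the norm is the square root of c times c*
twice_conjugated = c.conjugate().conjugate()

theta = math.pi / 3
unit = complex(math.cos(theta), math.sin(theta))  # cos(theta) + i sin(theta)
euler = cmath.exp(1j * theta)                     # e to the power i*theta

print(product)
print(norm)
print(abs(unit - euler))
```

The product of a number with its conjugate has no imaginary part, and the two ways of writing a number of norm 1 agree to within rounding error.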
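(The number e is the base of so-called natural logarithms; it is the sum of the
infinite series:
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \cdots$$
For any x,
$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots)$$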
(:)
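as in the case when our vectors were pairs of reals. We allow scalar multiplication
by any complex number:
$$c\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} cc_1 \\ cc_2 \end{pmatrix}$$
and the zero vector is
$$\mathbf{0} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
An operator on a complex space is like an operator on a real space: it is an
instruction to transform a vector into some other vector. As in $\mathbb{R}^2$, to each
linear operator on $\mathbb{C}^2$ there corresponds a 2 × 2 matrix of numbers, but in
this case the numbers are complex. For instance, a typical operator on $\mathbb{C}^2$ is
represented by the matrix
$$\begin{pmatrix} 2 & 1 - i \\ 1 + i & 3 \end{pmatrix} = A$$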
The algorithm for determining Av, given A and v, is the same as before, as is
the procedure for finding the matrix AB, the product of A and B, given those
matrices. For example, let A be as above, and let
$$v = \begin{pmatrix} 1 \\ 1 + i \end{pmatrix}$$
Then
$$Av = \begin{pmatrix} 2 & 1 - i \\ 1 + i & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 + i \end{pmatrix} = \begin{pmatrix} (2)(1) + (1 - i)(1 + i) \\ (1 + i)(1) + (3)(1 + i) \end{pmatrix} = \begin{pmatrix} 4 \\ 4(1 + i) \end{pmatrix} = 4v$$
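The same calculation can be carried out mechanically. The sketch below takes A to be the matrix with rows (2, 1 - i) and (1 + i, 3) and v to be the column with entries 1 and 1 + i, as in the calculation above; the function name `apply` is illustrative.

```python
def apply(A, v):
    """Act on the vector v with the matrix A; the rule is the same as for real entries."""
    (a, b), (c, d) = A
    x, y = v
    return (a * x + b * y, c * x + d * y)

A = ((2, 1 - 1j), (1 + 1j, 3))
v = (1, 1 + 1j)

Av = apply(A, v)
four_v = (4 * v[0], 4 * v[1])

print(Av)
print(Av == four_v)
```

The top entry uses the fact that (1 - i)(1 + i) = 2, which is the factorization of a sum of two squares noted earlier.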
We see that v is an eigenvector of A, with corresponding eigenvalue 4. It
turns out that the vector
But now consider the inner product $\langle u|u\rangle$. From (1.12), $\langle u|u\rangle = c_1^* c_1 + c_2^* c_2$.
We know that, for any complex number c, c*c is a non-negative real number;
thus, for any vector u of $\mathbb{C}^2$, $\langle u|u\rangle$ is real and non-negative.
This means that even in complex space we can talk of the length of a
vector; we define it by writing
$$|u| = \sqrt{\langle u|u\rangle}$$
We see that
As with the definitions of inner product, length, and orthogonality, wherever
possible we find suitable generalizations in $\mathbb{C}^2$ of the concepts familiar
from the real space $\mathbb{R}^2$. For instance, the analogues in $\mathbb{C}^2$ of the lines of $\mathbb{R}^2$ are
the one-dimensional subspaces of $\mathbb{C}^2$. If two vectors of $\mathbb{R}^2$ lie along the same
line, then one is a multiple of the other; similarly, if two vectors v and v' lie
within the same one-dimensional subspace of $\mathbb{C}^2$, then v' = cv, where c is
some (complex) number, and conversely. We usually use the term ray instead
of the cumbersome one-dimensional subspace.
Let us now generalize the notion of a projection operator. We do this by
following the route taken in Section 1.2; we can usefully use a diagram
(Figure 1.15), provided that we remember that what we see in the diagram is
only the analogue in $\mathbb{R}^2$ of what we have in $\mathbb{C}^2$.
Let L be any ray of $\mathbb{C}^2$, and v be any vector of $\mathbb{C}^2$. Then there exist two
vectors $v_L$ and $v_{L\perp}$, such that (i) $v_L + v_{L\perp} = v$, (ii) $v_L$ lies within L, and (iii)
$v_{L\perp} \perp v_L$. Further, for a given vector v and ray L, $v_L$ and $v_{L\perp}$ are unique. As in
the analogous case in $\mathbb{R}^2$ (see Figure 1.15), we can define the projection
operator P onto L by writing, for any vector v, $Pv = v_L$, where v, $v_L$, and $v_{L\perp}$
stand in the relations given by (i), (ii), and (iii) above.
As in $\mathbb{R}^2$ we find that, for every vector v and every projection operator P,
$\langle v | Pv \rangle = |Pv|^2 = \langle Pv | Pv \rangle$
This is always real and positive; furthermore, whenever v is normalized,
$0 \le \langle v | Pv \rangle \le 1$
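A small numerical sketch may make the decomposition concrete. It assumes the inner product conjugates its first argument, and it uses a ray and a normalized vector chosen purely for illustration; neither is taken from the text.

```python
def inner(u, w):
    """Inner product <u|w>, conjugating the components of the first vector."""
    return sum(a.conjugate() * b for a, b in zip(u, w))


def project_onto_ray(direction, v):
    """Return v_L, the part of v lying in the ray that contains `direction`."""
    scale = inner(direction, v) / inner(direction, direction)
    return [scale * d for d in direction]


direction = [1, 1 + 1j]   # any nonzero vector in the ray L (illustrative choice)
v = [0.6, 0.8j]           # a normalized vector (illustrative choice)

v_L = project_onto_ray(direction, v)          # v_L = Pv
v_perp = [a - b for a, b in zip(v, v_L)]      # the remainder, orthogonal to L

lhs = inner(v, v_L)       # <v|Pv>
rhs = inner(v_L, v_L)     # <Pv|Pv> = |Pv|^2

print(abs(inner(v_perp, v_L)))   # numerically zero: condition (iii)
print(lhs, rhs)                  # equal, real, and between 0 and 1
```

Conditions (i) and (ii) hold by construction; the printed values illustrate condition (iii) and the two relations stated above.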
[Figure 1.15: diagram showing $v_L = Pv$.]
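$\begin{pmatrix} 1 \\ 1+i \end{pmatrix}$ is
$P_1 = \frac{1}{3}\begin{pmatrix} 1 & 1-i \\ 1+i & 2 \end{pmatrix}$
and the projection operator $P_2$ onto the ray containing the other eigenvector of A is
$P_2 = \frac{1}{3}\begin{pmatrix} 2 & -1+i \\ -1-i & 1 \end{pmatrix}$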
The eigenvalues are 4 and 1, respectively, and a brisk calculation shows that
(1.15) holds. As an exercise (**), it is worth considering how one could
show that $P_1$ and $P_2$ are given by these particular matrices.
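The "brisk calculation" can be sketched numerically. Equation (1.15) is not reproduced in this part of the text; the sketch below assumes it is the decomposition $A = 4P_1 + 1P_2$, and the helper names are illustrative.

```python
A = [[2, 1 - 1j], [1 + 1j, 3]]
P1 = [[1 / 3, (1 - 1j) / 3], [(1 + 1j) / 3, 2 / 3]]
P2 = [[2 / 3, (-1 + 1j) / 3], [(-1 - 1j) / 3, 1 / 3]]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def matmul(X, Y):
    """Product of two 2 x 2 matrices."""
    return [[sum(X[r][k] * Y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def combine(a, X, b, Y):
    """The matrix aX + bY."""
    return [[a * X[r][c] + b * Y[r][c] for c in range(2)] for r in range(2)]


def close(X, Y, tol=1e-12):
    """True when two matrices agree entry by entry, up to rounding error."""
    return all(abs(X[r][c] - Y[r][c]) < tol for r in range(2) for c in range(2))


checks = {
    "P1 P1 = P1": close(matmul(P1, P1), P1),
    "P2 P2 = P2": close(matmul(P2, P2), P2),
    "P1 P2 = 0": close(matmul(P1, P2), ZERO),
    "P1 + P2 = I": close(combine(1, P1, 1, P2), IDENTITY),
    "4 P1 + 1 P2 = A": close(combine(4, P1, 1, P2), A),  # assumed form of (1.15)
}
for name, ok in checks.items():
    print(name, ok)
```

A numerical check of this kind confirms the matrices but does not replace the exercise, which asks how they are derived.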
the quantities we met in the discussion of Stern and Gerlach's results in
the Introduction. The matrix representations of these operators are shown
below.
The three matrices involved are known as the Pauli spin matrices.
PROBLEMS
1. Show that $S_x$ and $S_y$ do not commute, and evaluate $S_xS_y - S_yS_x$. Express
this difference in terms of $S_z$, and show that this relation holds cyclically
among the three operators.
2. Let
$y_- = \frac{1}{2}\begin{pmatrix} 1-i \\ -1-i \end{pmatrix}$
Show that $x_+$ and $x_-$ are eigenvectors of $S_x$, and that $y_+$ and $y_-$ are
eigenvectors of $S_y$. In each case, what are the corresponding eigenvalues?
3. Show (i) that $x_+$ and $y_+$ are both normalized; and (ii) that $x_+$ is orthogonal
to $x_-$, and that $y_+$ is orthogonal to $y_-$. Why might one expect (ii) to be the
case?
4. Determine the eigenvectors and corresponding eigenvalues of $S_z$.
5. Let $P_{x+}$ be the projection operator onto the one-dimensional subspace of
$\mathbb{C}^2$ containing $x_+$. We extend the notation in an obvious way to $P_{x-}$, $P_{y+}$,
and so on. Then
-0
(i) Show that $P_{x-}x_- = x_-$ and $P_{y+}y_+ = y_+$. (ii) Determine the vector
$P_{y+}y_-$. (iii) Show that $P_{y+}$ is indeed the projection operator onto the ray
containing $y_+$. (iv) Evaluate $P_{x+}P_{x-}$ and $P_{y+}P_{y+}$ (= $P_{y+}^2$). Why are these
results predictable?
6. Evaluate $\langle x_+ | P_{y+}x_+ \rangle$, $\langle x_- | P_{y+}x_- \rangle$, $\langle y_+ | P_{y+}y_+ \rangle$, and $\langle y_- | P_{y+}y_- \rangle$. Confirm
that these are equal to $|P_{y+}x_+|^2$, $|P_{y+}x_-|^2$, $|P_{y+}y_+|^2$, and $|P_{y+}y_-|^2$, respectively.
There are just four members of this set, since a rotation of 360° is equivalent
to no rotation at all; in fact, we have $R_{360} = R_0 = I$. Now, given any pair of
linear operators A and B, we can form their product AB. A distinguishing
feature of the set Rot is that the product of any pair of these operators
(including the product of any one of them with itself) is again a member
of the set. We have, for example: $R_{90}R_{180} = R_{270}$, $R_0R_{90} = R_{90}$,
$R_{270}R_{270} = R_{180}$, and so on. We say that multiplication is a binary operation
on the set.
Consider now the set of four numbers, two real and two imaginary,
{1, i, -1, -i}. We can of course multiply any two of them together in the
usual way, and if we do so we find that they have the same property that we
observed in the set of four rotation operators: the product of any two of
them is also a member of the set: (i)(-1) = -i, (1)(i) = i, (-i)(-i) = -1,
and so on. Observe that these equations involving numbers exactly match
up with the earlier equations involving operators.
We find that each operator from the first set, Rot, corresponds to a number
from the second; furthermore, to the operation of operator multiplication on
Rot corresponds the operation of arithmetical multiplication on the set of
numbers. The two sets are alike, not in the objects they contain, but in their
mathematical structure. When, as in this case, the structural similarity is
such that a perfect one-to-one correspondence exists between two sets, we
say that they are isomorphic.
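The correspondence can be spelled out exhaustively. In this sketch each rotation is represented simply by its angle in degrees, and composing two rotations adds their angles modulo 360; that encoding, and the names used, are assumptions made for the illustration.

```python
ANGLES = (0, 90, 180, 270)

# The proposed correspondence between rotations and numbers.
to_number = {0: 1, 90: 1j, 180: -1, 270: -1j}


def compose(a, b):
    """Product of two rotations: rotate through b degrees, then through a."""
    return (a + b) % 360


# The product of two rotations must correspond to the product of the
# two corresponding numbers, for every one of the sixteen pairs.
structure_preserved = all(
    to_number[compose(a, b)] == to_number[a] * to_number[b]
    for a in ANGLES
    for b in ANGLES
)

# The correspondence must also be one-to-one.
one_to_one = len(set(to_number.values())) == len(ANGLES)

print(structure_preserved, one_to_one)
```

Both conditions are needed: a map that preserved products but sent two rotations to the same number would exhibit only the weaker kind of similarity discussed next.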
However, we're interested in a weaker kind of similarity. We want to
specify, for instance, the way in which Rot is similar to the set of rotations
through multiples of 60°. In fact, this set and the two we started with,
together with the product operations on them, are all examples of a very
general kind of mathematical structure known as a group, and defined as
follows.
(1.16a) $(a \circ b) \circ c = a \circ (b \circ c)$
(1.16b) $a \circ I = a = I \circ a$
that there exist objects in the set with particular properties: for example,
every group has to contain an identity element. To describe a structure, we
make a list: first on the list is the relevant set of objects, then come the
operations performed on that set, and finally we list the elements with
specific properties. Thus the first structure we looked at in this section is the
group (Rot, $\circ$, $^{-1}$, $R_0$), which includes the set Rot, the binary operation of
multiplication, the singulary operation which gives the inverse of any element
of Rot, and the identity element of Rot.
I should emphasize that our present concern is not the mathematical
theory of groups per se. It so happens that a group is a particularly simple
form of mathematical structure and that by talking about groups we can see
what is meant by an operation on a set, by one structure being isomorphic to
another, and so on. Armed with these ideas, we can move to a more compli-
cated structure, that of a vector space-recalling, as we go, Russell's (1917,
p. 59) suggestion that "Mathematics may be defined as the subject in which
we never know what we are talking about, nor whether what we are saying
is true."
Let $\mathcal{F} = (F, +^\circ, \cdot^\circ, 0^\circ, 1)$ be a field. (The binary operations and zero
element are tagged with degree signs to distinguish them from the operations
on the vector space, which are customarily represented by the same
symbols.) The elements of F will be called scalars. Then
(1.17a) $(u + v) + w = u + (v + w)$
(1.17b) $u + v = v + u$
(1.17c) $v + 0 = v$
(1.17d) $a \cdot (b \cdot v) = (a \cdot^\circ b) \cdot v$
(1.17e) $(a +^\circ b) \cdot v = a \cdot v + b \cdot v$
(1.17f) $a \cdot (v + u) = a \cdot v + a \cdot u$
(1.17g) $0^\circ \cdot v = 0$
(1.17h) $1 \cdot v = v$
if we do so, then, because all the clauses of (1.17) are satisfied, we have
defined a vector space in which the sequences are the vectors.
Alternatively, consider the set of all complex-valued functions of a real
number. Examples are the squaring function, which maps a real number
onto its square, the function which maps a real number x onto the complex
number x + ix, the function which yields the cosine of x, and so on. This set
can be made into a vector space as follows. The functions themselves are the
vectors in the space; given any two functions, $\phi$ and $\psi$, we define their sum
$\phi + \psi$, so that, for all real numbers x,
$(\phi + \psi)(x) = \phi(x) +^\circ \psi(x)$
Similarly, for any scalar a, we define the function $a \cdot \phi$ so that, for all real numbers x,
$(a \cdot \phi)(x) = a \cdot^\circ \phi(x)$
performed on vectors has now served its purpose. It may be omitted without
loss of clarity; however, it is still a useful exercise to see which operation is
being referred to by a particular symbol on any given occasion.)
Unless otherwise stated, in what follows vector spaces are assumed to be
over the field of the complex numbers.
(1.18) A mapping A of the set V into itself is called a linear operator if for all
vectors u and v and for any scalar a,
(1.18a) A(v + u) = Av + Au
(1.18b) A(av) = a(Av)
(writing x for A, $\phi$ and $\psi$ for v and u), it is a linear operator. Again, the
differential operator d/dx is a linear operator on the space of functions of x.
However, though the process of squaring any function $\phi$ of x, so that
$\phi^2(x) = [\phi(x)]^2$, could be regarded as the action of an operator, such an
operator would not be linear, because both (1.18a) and (1.18b) would be
violated.
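The contrast between a linear and a non-linear operator on functions can be probed numerically. The sketch below assumes that the linear operator referred to above is multiplication by x, and it tests conditions (1.18a) and (1.18b) at a few sample points only, so a pass is evidence rather than proof. All names are illustrative.

```python
import math


def times_x(phi):
    """Operator sending phi to the function x * phi(x) (assumed example)."""
    return lambda x: x * phi(x)


def square(phi):
    """Operator sending phi to the function [phi(x)]^2."""
    return lambda x: phi(x) ** 2


def linearity_report(op, phi, psi, a, points, tol=1e-9):
    """Check (1.18a) additivity and (1.18b) homogeneity at the sample points."""
    total = lambda t: phi(t) + psi(t)
    scaled = lambda t: a * phi(t)
    additive = all(
        abs(op(total)(x) - (op(phi)(x) + op(psi)(x))) < tol for x in points
    )
    homogeneous = all(abs(op(scaled)(x) - a * op(phi)(x)) < tol for x in points)
    return additive, homogeneous


phi = lambda x: x + 1j * x      # the function x -> x + ix mentioned earlier
psi = math.cos                  # the cosine function
a = 2 - 1j                      # an arbitrary complex scalar
points = [0.5, 1.0, 2.0]

print(linearity_report(times_x, phi, psi, a, points))
print(linearity_report(square, phi, psi, a, points))
```

For the squaring operator both checks fail, which matches the remark that both clauses of (1.18) are violated.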
The definitions of an eigenvector and an eigenvalue of an operator carry
through to the general case:
(1.20) If A and B are linear operators on $\mathcal{V}$, then A + B and AB are linear
operators such that, for all v in $\mathcal{V}$,
(1.20a) $(A + B)v = Av + Bv$
(1.20b) $(AB)v = A(Bv)$
The restriction on the sequences is needed to ensure that the inner product
defined in this way will always be finite, that is, will be a scalar. The space of
such sequences is called $\ell^2$.
Similar considerations lead us to define an inner product, not on the
vector space of all functions of x, but on the vector space of all square-integrable
functions of x, that is, functions $\phi(x)$ such that $\int_{-\infty}^{\infty} \phi(x)^*\phi(x)\,dx$ is
finite. This space is known as $L^2$; we define an inner product on it by writing,
for vectors $\phi$ and $\psi$,
$\langle \phi | \psi \rangle = \int_{-\infty}^{\infty} \phi(x)^*\psi(x)\,dx$
A remarkable mathematical fact now presents itself, that the spaces $\ell^2$
and $L^2$ are isomorphic. (See Fano, 1971, p. 269.) That is, we can find a
correspondence between sequences in $\ell^2$ and square-integrable functions in
$L^2$ such that, if to the sequences x and y there correspond functions $\phi$ and $\psi$,
then (i) to the sequence x + y corresponds the function $\phi + \psi$, (ii) to the
sequence ax corresponds the function $a\phi$, and (iii) $\langle x | y \rangle = \langle \phi | \psi \rangle$ (provided
that each of these inner products is evaluated in the way appropriate to the
vectors involved). This isomorphism is relevant to the history of the development
of quantum mechanics. The early formulations of the theory, by
Heisenberg and Schrödinger, were, respectively, in terms of sequences and
$+ a_n x^n$. Then a typical subspace of $\mathcal{P}$ contains all the polynomials of
order two or less. (A function of the form $a_0 + a_1 x + a_2 x^2$ is a polynomial of
order 2.) Clearly, the addition of two such functions yields another of the
same kind, as does multiplication by an arbitrary number a, and this is all we
require.
Orthogonality between vectors was defined in Section 1.11. We also say
that a vector v is orthogonal to a subspace L if v is orthogonal to every vector in
L, and that two subspaces $L_1$ and $L_2$ are orthogonal if every vector in $L_1$ is
orthogonal to every vector in $L_2$. Note that in Figure 1.16 the planes $L_1$ and $L_2$
are at right angles to each other but are not orthogonal, since $L_1$ contains
vectors which are not orthogonal to all vectors in $L_2$. In particular, the vector
v is common to both.
Note that the zero subspace (containing just the vector 0) is orthogonal to
all subspaces.
The projection operators we encountered on $\mathbb{R}^2$ and $\mathbb{C}^2$ were projection
operators onto rays (one-dimensional subspaces) of $\mathbb{R}^2$ and $\mathbb{C}^2$, respectively.
We can define projection operators onto subspaces of any dimension; if L is
a plane of $\mathbb{R}^3$, then $P_L$, the projection operator onto L, maps vectors into L, as
shown in Figure 1.17.
We define projection operators in the general case just as we did for $\mathbb{C}^2$.
That is, given a subspace L, we can decompose any vector v into two parts,
$v_L$ and $v_{L\perp}$, so that $v_L$ lies in L, $v_{L\perp}$ is orthogonal to L (and hence to $v_L$), and
[Figure 1.16]
[Figure 1.17: axes x and z.]
(1.24) A linear operator A on $\mathcal{V}$ is said to be Hermitian if, for all vectors u
and v, $\langle u | Av \rangle = \langle Au | v \rangle$.
(1.25) A linear operator A on $\mathcal{V}$ is said to be idempotent if AA = A. (We
write $A^2 = A$.)
on $\mathbb{C}^2$ is idempotent.
Our new definition of a projection operator is:
It can be shown that such an operator is an operator which has the property
of "projecting onto a subspace" in the way described above, and conversely
(**). (See Jordan, 1969, pp. 26-27.)
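The two properties defined in (1.24) and (1.25) are independent of one another, as a small sketch can show. It assumes, as is standard, that a projection operator is one that is both Hermitian and idempotent; the matrix `Q` and the probe vectors are illustrative choices, and `P1` is the matrix found earlier for the ray containing (1, 1+i).

```python
def inner(u, w):
    """Inner product <u|w>, conjugating the components of the first vector."""
    return sum(a.conjugate() * b for a, b in zip(u, w))


def apply(M, v):
    """Action of the matrix M on the column vector v."""
    return [sum(M[r][k] * v[k] for k in range(len(v))) for r in range(len(M))]


def is_idempotent(M, probes, tol=1e-12):
    """Check M(Mv) = Mv on each probe vector, as in (1.25)."""
    return all(
        abs(x - y) < tol
        for v in probes
        for x, y in zip(apply(M, apply(M, v)), apply(M, v))
    )


def is_hermitian(M, probes, tol=1e-12):
    """Check <u|Mv> = <Mu|v> on each pair of probe vectors, as in (1.24)."""
    return all(
        abs(inner(u, apply(M, v)) - inner(apply(M, u), v)) < tol
        for u in probes
        for v in probes
    )


# The two basis vectors suffice in two dimensions; the others are extra probes.
probes = [[1, 0], [0, 1], [1, 1j], [2 - 1j, 0.5]]

P1 = [[1 / 3, (1 - 1j) / 3], [(1 + 1j) / 3, 2 / 3]]
Q = [[1, 1], [0, 0]]   # idempotent, but not Hermitian (illustrative)

print(is_idempotent(P1, probes), is_hermitian(P1, probes))
print(is_idempotent(Q, probes), is_hermitian(Q, probes))
```

The operator `Q` maps every vector into a ray but does so obliquely, which is why it fails the Hermitian test.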
The set of projection operators (or projectors) on a vector space is in
one-to-one correspondence with the set of subspaces of that space. It includes
the zero operator (which projects onto the zero subspace), the
operator $P_0$ such that, for all v, $P_0 v = 0$, and also the identity operator I,
which projects onto the whole space: for all v, Iv = v. As we shall see,
projectors play an enormously important role in quantum theory; in fact, the
discussions of the theory in later chapters are almost entirely in terms of
these operators.
I conclude this section with a general proof of a relation we have met a
couple of times already. Let P be any projection operator and v be any
vector, then:
As proof, consider:
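One standard way to carry out such a proof, assuming only that P is Hermitian in the sense of (1.24) and idempotent in the sense of (1.25), is the following sketch, which establishes $\langle v | Pv \rangle = |Pv|^2$ as stated for $\mathbb{C}^2$ earlier.

```latex
\begin{align*}
\langle v | Pv \rangle
  &= \langle v | P(Pv) \rangle
     && \text{since } PP = P \text{ (idempotence, 1.25)} \\
  &= \langle Pv | Pv \rangle
     && \text{by (1.24), taking } u = v \text{ and } Pv \text{ in place of } v \\
  &= |Pv|^2
     && \text{by the definition of length.}
\end{align*}
```

Because the last expression is a squared length, the inner product $\langle v | Pv \rangle$ is real and cannot be negative.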
The set
forms a convenient orthonormal basis for $\mathbb{R}^2$, and also for $\mathbb{C}^2$. Notice that
there is no unique orthonormal basis for any space. For example, we see
from Section 1.7 that each of the three sets $\{x_+, x_-\}$, $\{y_+, y_-\}$, and $\{z_+, z_-\}$ is an
orthonormal basis for $\mathbb{C}^2$, as are nondenumerably many other pairs of
vectors.
An n-dimensional space can be spanned by n mutually orthogonal vectors.
If the space is infinitely dimensional, an infinite set $\{v_i\}$ is required.
If $\{v_1, v_2, \ldots, v_n\}$ is an orthonormal basis for a vector space $\mathcal{V}$, then any
subset of this basis is an orthonormal basis for a subspace of $\mathcal{V}$. The set
$\{v_1, v_3\}$, for example, is an orthonormal basis for a two-dimensional subspace
L; we say that L is spanned by $\{v_1, v_3\}$, and we also talk of the rays
containing $v_1$ and $v_3$ spanning L. These rays, of course, are themselves
spanned by $\{v_1\}$ and $\{v_3\}$, respectively.
A result that will be important in Section 8.8 is the following. Let
$\{v_1, \ldots, v_n\}$ be any orthonormal basis for $\mathcal{V}$. Then any linear operator A
on $\mathcal{V}$ is uniquely determined by the vectors $Av_1, \ldots, Av_n$. In other
words, to specify A we need only specify its action on an arbitrary orthonormal
basis. This result follows immediately from the definition of linearity
in (1.18). For let v be an arbitrary vector in $\mathcal{V}$; then for some $c_1, \ldots, c_n$
we have:
$v = c_1 v_1 + \cdots + c_n v_n$
And, by linearity,
$Av = c_1 Av_1 + \cdots + c_n Av_n$
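The result can be illustrated by building the action of an operator from nothing but its values on a basis. The sketch uses the standard basis of $\mathbb{C}^2$ and, as the images of the basis vectors, the columns of the example matrix A used earlier; the coefficients are computed as $c_i = \langle v_i | v \rangle$, which is the usual formula for an orthonormal basis. Function and variable names are illustrative.

```python
def inner(u, w):
    """Inner product <u|w>, conjugating the components of the first vector."""
    return sum(a.conjugate() * b for a, b in zip(u, w))


basis = [[1, 0], [0, 1]]                 # orthonormal basis v_1, v_2
images = [[2, 1 + 1j], [1 - 1j, 3]]      # the vectors A v_1 and A v_2


def apply_via_basis(v):
    """Compute Av using only the coefficients of v and the images A v_i."""
    coefficients = [inner(b, v) for b in basis]      # c_i = <v_i|v>
    result = [0, 0]
    for c, image in zip(coefficients, images):
        result = [r + c * x for r, x in zip(result, image)]
    return result


print(apply_via_basis([1, 1 + 1j]))
```

Applied to the vector (1, 1+i), the function returns 4 times that vector, in agreement with the earlier eigenvector calculation, although the matrix of A was never used directly.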
[Case (i) corresponds to the case where m = n and, for all j, $d_j = 1$.] By a
curious usage, when $d_j > 1$, we say that $a_j$ is degenerate. As in case (i), we can
still choose an orthonormal basis for $\mathcal{V}$ consisting only of eigenvectors of A:
we first choose an orthonormal basis for each $L_j$ (each of which will consist,
obviously, only of eigenvectors of A) and then form the union of these sets
of vectors.
With this as background, here is the general form of the spectral decomposition
theorem for a finitely dimensional vector space $\mathcal{V}$.
$A = \sum_{j=1}^{m} a_j P_j$
(1.32a) $a_i \neq a_j$ unless $i = j$
Note that in the expression for A above we are adding operators: we have
(1.33) Let $P_1$ and $P_2$ project onto subspaces $L_1$ and $L_2$ of a vector space $\mathcal{V}$; we
define the relation $\le$ ("less than or equal to") by: $P_1 \le P_2$ if and only if
$L_1 \subseteq L_2$ if and only if every vector in $L_1$ is in $L_2$.
before the step, and each step is itself a projection operator $P_j$. These steps
occur at the eigenvalues; the requirement of continuity from the right simply
means that, for example, $P(x) = P_1$ when $a_1 \le x < a_2$, rather than when
$a_1 < x \le a_2$.
The spectrum of A is the set of points where P(x) changes value, in this
case the set of eigenvalues of A.
Now, when A is a Hermitian operator on an infinitely dimensional space
it is still possible to associate a spectral measure P(x) with A, but it can
happen that, where P(x) increases, it increases continuously rather than by
steps. The sets of points over which P(x) increases are in this case intervals on
the real line. Again, the set of all such points is called the spectrum of A, but
now we say that A has a continuous rather than a discrete spectrum.
What does it mean to say, in the continuous case, that P(x) is associated
with A? We can explain this by analogy with the discrete case. In the discrete
case we have, by (1.32),
$A = \sum_j a_j P_j$
Let us look at the inner products $\langle v | P_j v \rangle$. We know from (1.27) that any
expression of this form yields a real number. Now, in terms of the spectral
measure of A, the projector $P_j$ is the "step" by which P(x) changes at $a_j$.
Writing $P(<a_j)$ for the greatest value of P(x) when $x < a_j$, we have
$P(a_j) = P(<a_j) + P_j$, and so
(1.35) For any Hermitian operator A there is a spectral measure {P(x)} such
that, for any vector v,
We will meet these operators all the time in our discussion of quantum
theory.
1.16 Hilbert Spaces
The vector spaces we shall use are known as "Hilbert spaces," a term coined
by von Neumann (see Stein, 1972, p. 427, n. 10). A Hilbert space is just a
vector space on which an inner product has been defined, and which is also
complete: a vector space is said to be complete if any converging sequence of
vectors in the space converges to a vector in the space. All finitely dimensional
vector spaces are complete. To show what's involved in the infinite
case, let us look at a space which does not meet this condition.
Consider the space S of all finite sequences of real numbers. S includes
So = (1)
SI = (1 , t )
S2 = (1, t, t )
Now $s_0, s_1, \ldots$ forms a converging sequence on S, but its limit, that is, the
sequence to which it converges, is infinite, and so does not lie in S. Thus S is
not complete and so is not a Hilbert space.
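A sketch of the phenomenon follows. The entries of the sequences displayed above are not legible in this copy, so the sketch assumes, purely for illustration, that $s_n = (1, 1/2, 1/4, \ldots, 1/2^n)$; distances are measured with the usual Euclidean length after padding the shorter sequence with zeros.

```python
import math


def s(n):
    """Hypothetical member s_n = (1, 1/2, ..., 1/2^n) of the space S."""
    return [0.5 ** k for k in range(n + 1)]


def distance(a, b):
    """Euclidean distance between two finite sequences, padding with zeros."""
    length = max(len(a), len(b))
    a = a + [0.0] * (length - len(a))
    b = b + [0.0] * (length - len(b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Successive members get arbitrarily close to one another ...
gaps = [distance(s(n), s(n + 1)) for n in range(1, 20)]

# ... yet each member has more entries than the last, so the sequence
# they approach has infinitely many nonzero entries and is not in S.
lengths = [len(s(n)) for n in range(1, 20)]

print(gaps[0], gaps[-1])
print(lengths[0], lengths[-1])
```

Any other choice of entries whose squares have a finite sum would serve equally well for the illustration.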
On a related topic: in the infinitely dimensional case one should distin-
guish between subspaces and closed subspaces. Any subspace contains all
linear combinations of any finite set of vectors within it [see (1.23)]; a
subspace is closed if, additionally, it contains the limit vector of any con-
verging sequence of vectors within it. Thus a closed subspace of a Hilbert
space is itself a Hilbert space. Quantum mechanics deals with closed sub-
spaces; however, since the examples presented in this book are almost all
finitely dimensional, the distinction will largely be ignored in what follows.
This concludes our hasty introductory survey of vector-space theory. In
Chapter 5 I return to a few selected topics, prompted by some questions
which arise in the discussion of quantum theory in Chapters 2-4.
2
States and Observables in
Quantum Mechanics
electric field. It is these laws which determine how the system will evolve as
time goes on.
The position and momentum of each particle are particularly significant
members of $\Pi_v$. (The momentum of a particle is the product of its mass and
its velocity.) We express this by saying that specification of the position and
the momentum of each particle at time t gives us the state of the system at
time t. Once the state at time t is specified, then specification of $\Pi_c$ and $\Lambda$
determines the values of all the properties in $\Pi_v$ at that time.
As an example, consider a system of charged particles. The electrostatic
potential energy of that system at time t is determined by (a) the relative
positions of the particles at that time (given by the state), (b) the charge on
each particle (given in $\Pi_c$) and (c) the Coulomb law of electrostatic force
(given in $\Lambda$).
Classical mechanics is usually taken to be deterministic; a complete specification
of $\Pi_c$ and the state at a given time would determine the values of $\Pi_v$
at all other times, provided the system remains isolated. I return to this point
in Section 2.4.
As I mentioned, the state of a system is given by the positions and the
momenta of the particles which compose it. Since physical space is three-dimensional,
to specify the position of each particle we need three numbers
(or position coordinates) $q_x$, $q_y$, and $q_z$ to locate it relative to an appropriate
coordinate system. Similarly, in order to specify the momentum fully, so
that we know not only how fast the particle is going but also in what
direction, we need three more numbers $p_x$, $p_y$, and $p_z$, the momentum coordinates,
often called the components of momentum, parallel to the three axes of
our coordinate system. For each particle these six numbers are independent
of one another. Thus, given a system of n particles, the state is specified by a
total of 6n numbers.
In the same way that we can think of a pair of numbers as specifying a
point in a two-dimensional space like the plane of the paper, and of a trio of
numbers as representing a point in three-dimensional space, we can say that
the state of a system is represented by a point in 6n-dimensional real space.
All that we mean by this is that 6n independent real numbers are needed to
specify the state. This abstract 6n-dimensional space just consists of all
possible sequences of 6n real numbers; it is called the phase space for the
system. We denote the phase space by $\Omega$ and the state of the system by $\omega$.
Clearly, $\omega \in \Omega$.
For illustration I will often consider the simple case of a single particle
constrained to move in one dimension. For this particle the phase space can
be represented by a plane (that is, it can be drawn on the paper); specification
of the x and y coordinates of any point in the plane picks out a possible
state of the particle by telling us its (single coordinate of) position, q, and its
momentum, p; we have $\omega = (q,p)$.
$T = f_T(q,p) = \frac{p^2}{2m}$
Most theoretically significant quantities in classical mechanics have a
continuum of possible values, but experimentally, of course, we content
ourselves with the rationals, and it is possible to construct artificial "observable
quantities" which take on only certain discrete values; an example is
"the observable whose value is 1 when the momentum is positive, and 0
otherwise." To call such quantities "artificial" is not to dismiss them: any
method of testing a system which just gives a yes/no (or pass/fail) answer
measures an observable of this kind, and we can develop an alternative
account of the notion of "state" in terms of such tests. I will return to our
simple example to show what is involved.
[Figure 2.1]
(for yes) or 0 (for no). In this vein let us denote the question, "Does observable
A have a value within $\Delta$?" (where $\Delta$ is some subset of the reals) by $(A,\Delta)$,
and the value assigned to it by the state by $\omega(A,\Delta)$, thus making it clear that
$\omega$ is a function. We then obtain
$\omega(A,\Delta) = 1$ if and only if $f_A(\omega) \in \Delta$
To see that this equivalence holds, consider the conditions under which
each side is true. The left-hand side, $\omega(A,\Delta) = 1$, is true when the state of the
system is such that the experimental question "$(A,\Delta)$", that is, the question
"Does the observable A have a value within $\Delta$?", receives the answer yes.
But, on the right-hand side, $f_A(\omega)$ gives us the value of the observable A
when the system is in state $\omega$; it follows that $f_A(\omega) \in \Delta$ just when $\omega(A,\Delta) = 1$.
The experimental questions we deal with are all of the form $(A,\Delta)$, and to
each of them corresponds a region, technically a subset of the phase space.
Later on we shall be concerned with the algebraic structure of the set of
experimental questions in classical mechanics; unsurprisingly, it has the
structure of the set of subsets of a space.
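The idea of a classical state as a function from experimental questions to 1 or 0 can be sketched for the one-dimensional particle. In the sketch a state is a pair (q, p), an observable is a function of the state, and $\Delta$ is restricted to a closed interval given by its two endpoints; these encodings, the unit mass, and all names are assumptions made for the illustration.

```python
def position(state):
    q, p = state
    return q


def kinetic_energy(state, mass=1.0):
    """T = p^2 / 2m, with an assumed unit mass."""
    q, p = state
    return p * p / (2 * mass)


def momentum_positive(state):
    """The 'artificial' observable: 1 when the momentum is positive, else 0."""
    q, p = state
    return 1 if p > 0 else 0


def ask(state, observable, delta):
    """Value assigned by the state to the question (A, Delta): 1 for yes, 0 for no."""
    low, high = delta
    return 1 if low <= observable(state) <= high else 0


omega = (2.0, -3.0)   # a point of the phase space: q = 2, p = -3

answers = {
    "position in [0, 1]": ask(omega, position, (0.0, 1.0)),
    "kinetic energy in [4, 5]": ask(omega, kinetic_energy, (4.0, 5.0)),
    "momentum positive": ask(omega, momentum_positive, (1, 1)),
}
print(answers)
```

The set of states for which a given question receives the answer 1 is exactly the region of phase space that the text associates with that question.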
[Figure 2.2: Experiments in Moose Jaw and Medicine Hat.]
In the analysis above, the state appeared as a function mapping experimental
questions into 1 (yes) or 0 (no). If this account of it seems willfully
obscure, the following implausible narrative may be helpful. Two experimenters,
one in Moose Jaw and one in Medicine Hat, regularly receive
consignments of identical physical systems. Both of them proceed in the
same way: each new system is treated in some way or other ("prepared")
and then tested. Figure 2.2, which should be thought of as a pair of flow
charts rather than as sketches of experimental arrangements, shows the
principle. Now imagine that each of the experimenters has a variety of
methods of preparation at his or her disposal and that the methods they use
are quite different. All methods of testing, on the other hand, are common to
both, and all their tests are of the pass/fail sort. Clearly, they can soon find
out whether, despite the differences in the modes of preparation, systems
prepared using Method X by Medicine Hat Man are in effectively the same
state as those prepared using Method Y by Moose Jaw Woman. They just list
the tests run on these systems and compare the results. They will also find it
convenient to refer to a prepared state not by specifying its method of
preparation (since these are not common to both experimenters), but in
terms of the test performances which identify it. Thus the state specification
might read, "Test A, pass; Test B, pass; Test C, fail; . . ." and so on. But this
is just to regard a state as assigning a value to each experimental question, in
other words, to treat it as a function. To a set theoretician, in fact, a function
is precisely a set of ordered pairs like those we have here.
The two experimenters cannot know whether their specification of a state
is complete, that is, whether there is no further test which, with additional
equipment, would sort out some apparently homogeneous state still further.
If their systems are classical, this knowledge would of course be available to
them if they could establish the components of position and momentum for
all the particles of their prepared systems. As we have seen, classical me-
chanics tells us that all significant tests are tests of the values of various
functions of these variables.
From this discussion it seems that in classical mechanics we have two
ways of thinking of the state of a system. We defined it as a sequence of 6n
coordinates (where n is the number of particles in the system), each of which
tells us a component of position or momentum of a particle. This can be
regarded as a description of the system: the specification of the state is
effectively a list of some of the system's properties. On the other hand,
when we regard it as a two-valued function of the set of experimental
questions, then we are drawing attention to the system's dispositions to
behave in certain ways. The distinction between properties and dispositions
may be challenged; all properties, it may be argued, are just dispositions to
certain kinds of behavior. I will put this question to one side; for the present I
will assume that such a distinction can be made (but see Section 10.2). This
granted, then the specification of the state in classical mechanics can be said
to have two distinct aspects. As we shall see, in quantum mechanics this is
less clearly so: while the specification of the state still serves to summarize a
system's dispositions, its descriptive role is moot.
Many but not all of these operators admit eigenvectors; as noted in Section 1.15,
a notable exception is the position operator x. Such exceptions we
will return to later; for the moment we will confine discussion to the operators
which admit eigenvectors, that is, to the case when, for the operator A,
there are vectors $v_1, v_2, \ldots$ such that, for each i, $Av_i = a_i v_i$ (and, since A is
Hermitian, each $a_i$ is a real number).
In these cases, the eigenvalues $a_1, a_2, \ldots$ of the operator are the possible
values of the observable quantity which the operator represents. We can
see immediately that this aspect of the theory gives us very different results
from classical theory: instead of a continuum of possible values, the observables
we are now dealing with can have only certain specific values. A
measurement of the observable A represented by the operator A will yield a given value
$a_i$ with certainty, provided that $a_i$ is an eigenvalue of A and that the state of
the system on which the measurement is carried out is represented by the
corresponding eigenvector. In general, however, the state, v, of the system
will not be an eigenvector of A; in such a case we cannot say with certainty
what the result of such a measurement would be. Instead we assign to each
eigenvalue (or possible value) of A a probability calculated as follows.
Let $v_i$ be the eigenvector with $a_i$ as corresponding eigenvalue, and denote
by $P_i^A$ the projection operator onto the ray containing $v_i$ (see Figure 2.3).
Then, according to quantum theory, the probability $p_v(A,a_i)$ that a measurement
of A conducted on a system in state v will yield a result $a_i$ is given by
(2.1) $p_v(A,a_i) = \langle v | P_i^A v \rangle = |P_i^A v|^2$
[Figure 2.3]
Since v is normalized, we know from a previous discussion (Section 1.6) that
the inner product $\langle v | P_i^A v \rangle$ can only take values between 0 and 1. In other words its
values are appropriate to probability measurements.
As examples, consider the operators $S_x$, $S_y$, and $S_z$ used to represent three
components of spin of a fermion. They have familiar matrix representations:
$S_x = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $S_y = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, $S_z = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$
These are just the spin matrices encountered in Section 1.7. Each operator
has eigenvalues $+\frac{1}{2}$ and $-\frac{1}{2}$, and these eigenvalues are the only possible
values of each component of spin of a fermion. (Note that we are working in
natural units of spin, measuring spin in multiples of Planck's constant $\hbar$.)
The eigenvectors of $S_x$ are the vectors $x_+$ and $x_-$, where, as in Section 1.7,
$y_- = \frac{1}{2}\begin{pmatrix} 1-i \\ -1-i \end{pmatrix}$
will yield $+\frac{1}{2}$ with certainty (we say that $y_+$ is an eigenstate of $S_y$); in the
second the probability of such a result is zero, and in the third the chances of
such a result are fifty-fifty. Of course, the state of the particle need not be an
eigenstate of any of these particular components of spin. For instance, it
might be represented by the (normalized) vector
If a measurement of $S_x$ or $S_z$ is performed
$p_u(S_z, +\frac{1}{2}) = \frac{9}{25}$ and $p_u(S_z, -\frac{1}{2}) = \frac{16}{25}$
In each case, there are only two possible outcomes, and so the probabili-
ties of these outcomes add to unity.
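The probability rule (2.1) can be sketched for the spin example. For a normalized eigenvector e, the projector onto its ray gives $\langle v | Pv \rangle = |\langle e | v \rangle|^2$, which is what the function below computes. The vector u used here, (3/5, 4/5), is an assumption: it is one normalized vector that reproduces the probabilities 9/25 and 16/25 quoted for $S_z$, and need not be the vector the text has in mind. The eigenvectors of $S_z$ are taken to be (1, 0) and (0, 1).

```python
def inner(u, w):
    """Inner product <u|w>, conjugating the components of the first vector."""
    return sum(a.conjugate() * b for a, b in zip(u, w))


def probability(state, eigenvector):
    """p = <v|Pv> = |<e|v>|^2 for a normalized eigenvector e and state v."""
    return abs(inner(eigenvector, state)) ** 2


z_plus = [1, 0]                            # eigenvector of S_z, eigenvalue +1/2
z_minus = [0, 1]                           # eigenvector of S_z, eigenvalue -1/2
y_minus = [(1 - 1j) / 2, (-1 - 1j) / 2]    # eigenvector of S_y, eigenvalue -1/2

u = [3 / 5, 4 / 5]                         # assumed state, chosen for illustration

p_up = probability(u, z_plus)
p_down = probability(u, z_minus)
p_certain = probability(y_minus, y_minus)  # state equal to the eigenvector

print(p_up, p_down, p_up + p_down)
print(p_certain)
```

When the state coincides with an eigenvector, as in the last line, the corresponding outcome has probability 1, which is the case of an eigenstate.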
Before dealing with operators which do not admit eigenvectors, I will
amplify a remark made earlier.
We denote by $L_i^A$ the one-dimensional subspace containing the eigenvector
$v_i$ of the operator A, to which corresponds the eigenvalue $a_i$. Briefly, $L_i^A$ is
the subspace onto which $P_i^A$ projects. Then a measurement of A yields $a_i$
with certainty if, and only if, the vector v representing the system's state lies
within $L_i^A$. In that case $P_i^A v = v$, and
$p_v(A,a_i) = \langle v | P_i^A v \rangle = \langle v | v \rangle = 1$
With this result in mind, we can now extend the discussion to include
those operators which, like the position operator x on the Hilbert space of
square-integrable functions of x, admit no eigenvectors. The possible values
of position lie anywhere along a continuum, and the operator has a continuous
spectrum (see Section 1.15). But, whatever species of observable we are
dealing with, the following holds true.
Let A be an operator representing an observable A. Then to each interval
$\Delta$ on the real line there corresponds a subspace $L_\Delta^A$ of the Hilbert space, such
that a measurement of A yields a value within $\Delta$ with certainty if and only if
the state v of the system lies within $L_\Delta^A$. Let $P_\Delta^A$ be the projector onto $L_\Delta^A$. The
expression (2.1) for the probability of a particular experimental outcome
now has a straightforward generalization. We write $p_v(A,\Delta)$ for the probability
that a measurement of the observable A conducted on a system in state v
will yield a result in the interval $\Delta$, and obtain
$p_v(A,\Delta) = \langle v | P_\Delta^A v \rangle$
When $\Delta$ contains both $+\frac{1}{2}$ and $-\frac{1}{2}$, as in case (4),
the projection operator $P_\Delta^{S_y}$ is the identity operator, which maps
every vector in $\mathcal{H}$ onto itself; it is the projector onto the whole
space. In case (4), for any pure state v, $p_v(S_y, \Delta) = \langle v | I v
\rangle = \langle v | v \rangle = 1$. Experiments are certain to yield a result
within $\Delta$, since each outcome is either $+\frac{1}{2}$ or $-\frac{1}{2}$.
When $\Delta$ contains neither of these numbers, as in case (1), the projection
operator is the zero operator, which maps all vectors onto the zero vector. In
this case we have $p_v(S_y, \Delta) = \langle v | P_0 v \rangle = 0$.
[case (2)]
[case (3)]
ing the state does more than assign probabilities to experimental questions.
On this view, the function $p_v$ is conceptually prior to the vector v; this vector
then appears as a convenient mathematical way to represent the function in
question, and, whereas in classical mechanics the state could be said to have
both a descriptive and a dispositional aspect, in quantum theory the de-
scriptive aspect disappears and we are left with the dispositional aspect
alone.
This is a view which, ultimately, I will reject (see Section 10.2), but it is
true that there is no obvious analogue in quantum theory for the equation
$\omega = (p,q)$ of classical mechanics, which specifies the state in terms of the
properties of the system. On this, more later; this is a good point at which to
pause and summarize what has been said. To this end, Table 2.1 sets out the
main differences between the mathematical representation of quantum
theory and that of classical mechanics.
The proof is simple. Let $P_i$ be the projection operator onto the one-dimen-
sional subspace spanned by the eigenvector $v_i$. Since the eigenvectors are
mutually orthogonal, we have $P_i v = c_i v_i$. It follows that
The second result is an expression for the expectation value of an observ-
able, that is, the average value we would expect to obtain if we measured the
value of A in a large number of trials on systems all of which were in the
same state.
We denote the expectation value of A by $\langle A \rangle$. It is obtained by
weighting each possible outcome, $a_i$, of the measurement by its probability
$p_v(A, a_i)$. As before, we confine ourselves to those observables with a
discrete spectrum; en route to our conclusion we use a result argued for at the
end of Section 1.14, that, for an operator A admitting eigenvectors,
$A = \sum_i a_i P_i$ (where $P_i$ projects onto the space spanned by the
eigenvector $v_i$, as before).
We have, then,
$$\langle A \rangle = \sum_i p_v(A, a_i)\, a_i = \sum_i \langle v | P_i v \rangle\, a_i = \langle v | \sum_i a_i P_i v \rangle$$
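The chain of equalities above can be checked on a small case. The following Python sketch is illustrative only: the two eigenvectors, the eigenvalues 2 and -1, and the state (1, 0) are assumptions chosen so that exact rational arithmetic works, and the helper names are hypothetical.

```python
from fractions import Fraction

def inner(u, w):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, w))

def projector(unit):
    """Matrix of the projector onto the ray spanned by a unit vector."""
    return [[a * b for b in unit] for a in unit]

def apply(matrix, v):
    """Matrix-vector product."""
    return [inner(row, v) for row in matrix]

# Assumed example: two orthonormal eigenvectors with eigenvalues 2 and -1.
eigenvectors = [[Fraction(3, 5), Fraction(4, 5)],
                [Fraction(-4, 5), Fraction(3, 5)]]
eigenvalues = [Fraction(2), Fraction(-1)]
v = [Fraction(1), Fraction(0)]  # normalized state vector

projectors = [projector(e) for e in eigenvectors]

# Route 1: weight each eigenvalue by the probability of its outcome.
probabilities = [inner(v, apply(p, v)) for p in projectors]
weighted = sum(p * a for p, a in zip(probabilities, eigenvalues))

# Route 2: build A as the weighted sum of projectors, then take
# the inner product of v with A v directly.
A = [[sum(a * p[r][c] for a, p in zip(eigenvalues, projectors))
      for c in range(2)] for r in range(2)]
direct = inner(v, apply(A, v))
```

Both routes give the same number, which is the content of the derivation: the probability-weighted average of the eigenvalues equals the inner product of v with Av.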
(numerically, the force needed to stretch the "spring" a unit distance), and q
is the position coordinate. We have, then,
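$$\text{Total energy} = H(q,p) = \frac{p^2}{2m} + \frac{kq^2}{2}$$
It follows that
$$\frac{dq}{dt} = \frac{\partial H}{\partial p} = \frac{p}{m} \qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q} = -kq$$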
Figure 2.4

A propos of these equations, note that dq/dt is the rate of change of
position with time - in other words, the velocity v of the particle - and also
that p = mv; thus the left-hand equation informs us that v = v, which is
reassuring, if not very enlightening. Note, however, that the right-hand
equation yields Newton's second law of motion, since kq is the force pushing
the particle back toward the origin. More relevant to our present concerns is
that these equations govern the evolution of the system's state. Assume for
argument's sake that the particle is displaced by a distance d and instanta-
neously at rest at the time t = 0, so that its initial state is (d,0). Then the
equations tell us that the particle's position and momentum as time goes on
are given by the graphs shown in Figure 2.5. To put this another way, as
time goes by the state of the particle will follow the trajectory in phase space
shown by Figure 2.6; in the absence of retarding forces like friction, this will
be an ellipse.
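A minimal numerical sketch of this evolution follows. It is not the author's procedure: it assumes the illustrative values m = 1, k = 4, d = 0.5, integrates Hamilton's equations with a leapfrog scheme (a swapped-in numerical method), and compares the result with the familiar closed-form solution for the ideal spring. The function names are hypothetical.

```python
import math

def evolve(q, p, m, k, dt, steps):
    """Leapfrog integration of dq/dt = p/m, dp/dt = -k*q."""
    for _ in range(steps):
        p_half = p - 0.5 * dt * k * q   # half step in momentum
        q = q + dt * p_half / m         # full step in position
        p = p_half - 0.5 * dt * k * q   # second half step in momentum
    return q, p

def energy(q, p, m, k):
    """The Hamiltonian H(q, p) = p^2 / 2m + k q^2 / 2."""
    return p * p / (2 * m) + k * q * q / 2

# Assumed illustrative values; the initial state is (d, 0).
m, k, d = 1.0, 4.0, 0.5
t, steps = 1.0, 10000
q_num, p_num = evolve(d, 0.0, m, k, t / steps, steps)

# Closed-form solution for comparison, with omega = sqrt(k/m).
omega = math.sqrt(k / m)
q_exact = d * math.cos(omega * t)
p_exact = -m * omega * d * math.sin(omega * t)

energy_drift = abs(energy(q_num, p_num, m, k) - energy(d, 0.0, m, k))
```

Because H keeps its initial value along the trajectory, every state (q, p) visited satisfies the same quadratic equation in q and p, which is why the phase-space trajectory is an ellipse.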
2.6 Determinism
In the last decade of the eighteenth century, the Marquis de Laplace wrote:
We ought then to regard the present state of the universe as the effect of its anterior
state and as the cause of the one which is to follow. Given for one instant an
intelligence which could comprehend all the forces by which nature is animated and
the respective situations of the beings who compose it-an intelligence sufficiently
vast to submit these data to analysis-it would embrace in the same formula the
movements of the greatest bodies of the universe and those of the lightest atom; for
it, nothing would be uncertain, and the future, as well as the past, would be present
to its eyes. (1951 [1814], pp. 3-4)
determined by the laws of nature. Laplace put this metaphysical thesis in episte-
mic terms by talking of the knowledge available to a "supermind"; this
supermind could work out the answer to any question about the future or
the past if it had a complete description of how things are now - the
"situation of the beings" who comprise the world - and the forces which
determine how the world changes with time.
The epistemic thesis is stronger than the metaphysical one. The meta-
physical thesis is that (1) there is exactly one state $\omega_1$ of the world at
time $t_1$ which is physically compatible with its state $\omega_0$ at $t_0$
($t_1 > t_0$); further, (2) these states, $\omega_0$ and $\omega_1$, determine the
values of all physical quantities in the world at the times in question. The
metaphysical thesis might be true, and the epistemic one false: $\omega_1$ might
not be calculable from $\omega_0$, even by a supermind. (For a discussion, see
Earman, 1986, chap. 2.) Both theses have been associated with the classical
world picture. Indeed, the stronger, epistemic, thesis finds precise expression
in the Hamilton-Jacobi version of classical mechanics. Or so it would appear.
Consider a system of particles. We may think of the specification of its
state as a precise formulation of what Laplace meant by the "situation of the
beings" which comprise it. In order to know all there is to know about the
present, a Laplacean supermind would have to know the state of the entire
universe. To deduce from this a description of the universe at any time in the
past or future, this supermind would need in addition to "comprehend all
the forces by which nature is animated," or, equivalently, to know the
Hamiltonian function for the entire cosmos.
Our minds, alas, fall short of the Laplacean ideal. Given a system of any
complexity, the Hamiltonian may be impossible for us to ascertain, or too
cumbersome for us to employ. Nonetheless, Laplace's vision can, and in-
deed did, function as a regulative ideal for classical physics. That is to say, a
metaphysical presupposition, that the universe is deterministic, can govern
the search for scientific laws. And even our finite intelligences can work out
what happens to a particle on the end of an (ideal) spring, as the example in
the previous section shows. We obtained the graphs in Figure 2.5 by solving
Hamilton's equations. Because Hamilton's equations are first-order differ-
ential equations, they can only be solved within a constant term; to obtain
unique solutions we plug in the particular values of p and q at one specified
time (in this case, when t = 0). But from the resulting graphs we can now
read off the state of the particle $(q_t, p_t)$ at any time t in the future, and from
this state we can deduce all its (mechanical) properties at that time. All this
knowledge is available to us through our knowledge of "the forces by which
nature is animated" (or, equivalently, the Hamiltonian for the system we are
looking at), and the "situation of the beings who compose it," in this case the
initial state of the single particle involved.
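A worked sketch of this step for the spring example may help. It assumes the standard angular frequency $\omega = \sqrt{k/m}$, which is not spelled out above, and writes $C_1$ and $C_2$ for the two undetermined constants (labels introduced here for illustration).

```latex
\begin{align*}
q(t) &= C_1\cos\omega t + C_2\sin\omega t, &
p(t) &= m\,\frac{dq}{dt} = m\omega\,(-C_1\sin\omega t + C_2\cos\omega t),\\
q(0) &= C_1 = d, &
p(0) &= m\omega\,C_2 = 0,\\
q_t &= d\cos\omega t, &
p_t &= -m\omega d\,\sin\omega t .
\end{align*}
```

Two first-order equations leave two constants open, and the two initial values q = d and p = 0 at t = 0 fix both, which is why specifying the state at one time suffices.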
We can see why Laplace took the universe to be both classically governed
and deterministic, but the link between the two is not as clear-cut as he
assumed; the laws of classical physics do not entail the thesis of determi-
nism. As Earman (1986, chap. 3) has pointed out, classical physics can be
made deterministic only by the adoption of seemingly ad hoc assumptions.
Such assumptions are needed, for example, to ensure that the universe is a
closed system; within a framework of Newtonian space-time, they turn out
to be deeply problematic. Thus, far from entailing the deterministic thesis,
classical physics may not even be compatible with it.
Here I will set these fundamental problems on one side and merely
indicate how less problematic, but certainly nontrivial, assumptions must be
made if the Hamilton-Jacobi formulation of classical mechanics is to be a
deterministic theory.
If the state of a classical system of n particles is to evolve deterministically,
then all 6n differential equations describing this evolution must have unique
solutions for any time t. To guarantee this, H must be continuously
differentiable (more precisely, it must be differentiable in principle) with respect
to q and p for all physically possible states of the system. The curves showing
the variation of H with each position and momentum coordinate must be
smooth, and exhibit no singularities. This is not an empty requirement. It
rules out, for instance, the view that atoms are incompressible spheres of a
certain definite radius which exert forces on each other only when they
touch. If they were, then the graph representing the force exerted by one
atom on another would leap incontinently to infinity as they made contact,
and the requirement would be violated (see Figure 2.7). On the assumption
that classical mechanics is true, this would mean not merely that no solu-
tions to the set of equations governing the evolution of the universe were
calculable, either by our finite minds or by a supermind, but that no unique
solutions to these equations existed. Both versions of the thesis of determi-
nism, the epistemic and the metaphysical, would fail to hold; there could be
(at least) two distinct states of the world at time $t_1$, both of which were
compatible with a given state at $t_0$.

Figure 2.7 Force-distance graphs for (left) incompressible and (right) compressible
spheres.
$$i\hbar \frac{\partial v}{\partial t} = Hv$$
and this equation is known as Schrödinger's time-dependent equation, or
sometimes simply as Schrödinger's equation.
There is an equivalent way to describe what happens as time goes on. It is
possible to use H to construct an operator $U_t$, which, as the notation implies,
is a function of the time. We use this operator to obtain a simple expression
for the state $v_t$ of a system at some time t in terms of its present state $v_0$:
$$v_t = U_t v_0$$
Each term in this series is well defined in the algebra of operators, and the
series converges. Its sum is more easily expressed (see Section 1.5) as
(2.8b) $U_t = e^{-itH/\hbar}$
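To make the series concrete, here is a hedged Python sketch that sums the power series for the exponential of $-itH/\hbar$ term by term. Everything specific in it is an assumption for illustration: the 2 x 2 Hamiltonian with off-diagonal entries 1, units in which hbar = 1, the time t = 0.7, the cutoff of 30 terms, and the function names. For this particular H the exponential has a well-known closed form in terms of cos t and sin t, which the sketch uses as a check.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def evolution_operator(h, t, hbar=1.0, terms=30):
    """Partial sum of the series for exp(-i t H / hbar)."""
    n = len(h)
    x = [[-1j * t / hbar * h[i][j] for j in range(n)] for i in range(n)]
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]          # order-zero term: the identity
    for order in range(1, terms):
        term = matmul(term, x)                # raise the power by one
        term = [[entry / order for entry in row] for row in term]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

# Assumed example Hamiltonian and time (hbar set to 1).
h = [[0.0, 1.0], [1.0, 0.0]]
t = 0.7
u = evolution_operator(h, t)

# Closed form for this particular H: cos(t) * I - i sin(t) * H.
expected = [[math.cos(t), -1j * math.sin(t)],
            [-1j * math.sin(t), math.cos(t)]]
max_error = max(abs(u[i][j] - expected[i][j])
                for i in range(2) for j in range(2))

# Evolve the state (1, 0) and check that its length is preserved.
v0 = [1.0, 0.0]
vt = [sum(u[i][j] * v0[j] for j in range(2)) for i in range(2)]
norm_sq = sum(abs(c) ** 2 for c in vt)
```

The evolved vector stays normalized, as it must if it is still to assign probabilities that sum to one.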
senting the state. Quantum mechanics, we may say, uses the models sup-
plied by Hilbert spaces.
Implicit in this way of presenting quantum mechanics is a general account
of scientific theories. A theory T displays a set of models within which the
behavior of ideal "possible systems" (or "T-systems") can be represented.
For a realist, at least, to accept T is to say that there exist actual systems
which are T-systems. (For an antirealist but still model-theoretic view, see
van Fraassen, 1980.) The actual solar system, for example, is (approxi-
mately) a Newtonian system, that is, a system representable within the math-
ematical models supplied by the theory of classical mechanics. A system S is
a quantum system if the behavior of S is representable within a Hilbert-space
model in the way I have outlined.
This model-theoretic account of a scientific theory is by no means
original - it can even be called "the new orthodoxy" in the philosophy of
science. (See Suppes, 1967; Giere, 1979; Suppe, 1977, pp. 221-230.) It
stands in contrast to "the received view" (the phrase is Putnam's: Putnam,
1962), which takes an axiomatic approach to theories and emphasizes the
role of theoretical laws (see Suppe, 1977, pp. 3-61). While I don't quite
share Schopenhauer's view of the Euclidean method (it is, he said, as if a
man were to cut off both legs in order to be able to walk on crutches;
Blanche, 1962), I would reject any claim that an axiom system is the ideal,
canonical form for the expression of a scientific theory. The point is this. For
any axiom system there exists a class of models; Peano's axioms for arith-
metic, for example, have as a model the set of natural numbers. And within
science we are not interested in axioms for their own sake, but in the class of
models they define. It does not matter how this class is specified, provided
that the specification is precise. When we investigate a theory, demands
typical of the axiomatic approach-like the requirement that the specifica-
tion be expressed in a first-order language, or that the predicates of this
language be divided into two classes, observational and theoretical-give
undue prominence to linguistic matters and are extraneous to our concerns.
Thus van Fraassen (1980, p. 44):
The syntactic picture of a theory identifies it with a body of theorems, stated in one
particular language chosen for the expression of that theory. This should be con-
trasted with the alternative of presenting a theory in the first instance by identifying
a class of structures as its models. In this second, semantic, approach the language
used to express the theory is neither basic nor unique; the same class of structures
could well be described in radically different ways, each with its own limitations.
The models occupy center stage.
But when we say that quantum theory uses the models supplied by
Hilbert spaces, what sort of models are these? They are models in two
apparently dissimilar senses. In the first place, they are models as that term
is used in contemporary mathematics; in other words, they are mathemati-
cal structures of the kind described in Section 1.8, containing sets of ele-
ments on which certain operations and relations are defined. More surpris-
ingly, they are also models in the way that a Tinkertoy construction can be a
model of the Eiffel Tower. Just as a point on the model can represent a point
on the tower, so, for example, an operator on a Hilbert space can represent a
physical quantity.
The two senses are linked in the following way. When we recognize that
the Tinkertoy model is a model of the Eiffel Tower, we not only see that
points on the model represent points on the tower, but also that certain
important relations are preserved in this representation; for example, we
would expect the ratio of the overall height to the length of one side of the
base to be the same for both the tower and the model. That is to say, we
expect the tower and the model to be isomorphic. But isomorphic structures
are just the subject matter of model theory in the first, mathematical, sense.
The outline of quantum theory given in this chapter uses the mathemati-
cal structure of Hilbert space (a model in the first sense) to provide a repre-
sentation (a model in the second sense) of the behavior of systems. This
behavior has itself been described in very abstract terms; there is a wide gap
between the way a working physicist uses quantum theory and the account
of the theory I have offered. Of such accounts, Cartwright (1983, pp. 135-
136) says,
One may know all of this and not know any quantum mechanics. In a good under-
graduate text these . . . principles are covered in one short chapter. It is true that
the Schrödinger equation tells how a quantum system evolves subject to the Hamil-
tonian; but to do quantum mechanics, one has to know how to pick the Hamiltonian.
The principles that tell us how to do so are the real bridge principles of quantum
mechanics.
matical theory" (p. 159). These models, however, have a very different
function from the mathematical model in which we represent states and
observables. They are essentially models in the second, Tinkertoy, sense,
which represent actual entities, like a ruby laser, in terms of fictional ele-
ments ("two-level atoms" in this instance) whose behavior is amenable to
theoretical treatment. These are just useful representations, simulacra of
what they represent, and are contrasted with the underlying mathematical
theory: "a model - a specially prepared, usually fictional description of the
system under study - is employed whenever a mathematical theory is ap-
plied to reality . . . Without [models] there is just abstract mathematical
structure, formulae with holes in them, bearing no relation to reality" (pp.
158-159). This view of the mathematical theory is at odds with my sugges-
tion that the mathematical models supplied by Hilbert spaces are also re-
presentational. Such models are not simulacra, nor are they to be contrasted
with the theory; in fact, to present the theory is just to exhibit this class of
models. In what sense, then, are they more than "abstract mathematical
structures"? What, we may ask, do they represent?
Well, to ask this question is precisely to seek an interpretation of quantum
theory. When we construct models of the Eiffel Tower or of the ruby laser,
we start from these objects and proceed to the task of model building. In the
case of quantum theory, we have certain notions like "state" and "observ-
able" which find a representation in the model. Antecedent to the theory,
however, these are very insubstantial concepts. We rely on the theory's
models to tell us how they are to be understood. The process of interpreting
quantum theory is thus the reverse of that of building a model of a preexist-
ing object. We judge our models of the Eiffel Tower and the ruby laser by
how well they represent the objects modeled. When we try to interpret
quantum theory we assume that the representation the theory offers is a
good one and ask Feynman's forbidden question: what sort of world could it
represent? In the most abstract, perhaps metaphysical sense, what must the
world be like, if it is representable by the mathematical models that quan-
tum theory employs?
3
Physical Theory and
Hilbert Spaces
The previous chapter outlined, in rather summary fashion, the way Hilbert
spaces supply mathematical models for quantum theory. In fact, Hilbert-
space theory was developed for just this purpose. If someone were to ask,
"Why Hilbert spaces?" we might think the question a little peculiar; the
obvious answer would be, "Because that's the way the world is." But we
can refine the question, and ask what it is about the mathematical theory of
Hilbert spaces which makes it clearly suitable for the representation of the
physical world. More specifically, given the task of representing the quan-
tum world within a mathematical framework, why might we turn to Hil-
bert space theory?
The example of classical mechanics shows us that there are possible
representations of physical theories which do not involve Hilbert spaces. Of
course, this doesn't mean that classical mechanics could not be reformulated
in this way. In fact, our strategy for providing a partial answer to the
question, "Why Hilbert spaces?" will be to show that the theory of vectors
has very general application. We will take as an example a particular physi-
cal situation and model it mathematically. The situation will be paradig-
matically of the kind with which physical theory deals, but our description
will be general enough to leave open the question of what sorts of processes,
Here $v_x$, $v_y$, and $v_z$ are the projections of v onto an orthogonal triple of
rays spanning $\mathbb{R}^3$ - or, as we can call them, the axes of our coordinate
system (see Figure 3.1).
Pythagoras' theorem tells us that
$$|v|^2 = |v_x|^2 + |v_y|^2 + |v_z|^2$$
Figure 3.1
the assumption, that the world is such that in certain specifiable circum-
stances various events can be assigned definite probabilities. I take this
assumption to be minimal if we are to have any physical theory at all: we
assume that there are links, albeit only probabilistic ones, between one set of
occurrences (the initial circumstances) and another (the resulting events).
If the world were fully determined then the assumption would still hold,
although ultimately all the probabilities involved would take either one or
zero as values.
To place our theory in a specific context, let us imagine modified versions
of the schematic experiments described in Section 2.2. In each of those
experiments, a preparation of a system was followed by a test, and the result
of this test was assumed to depend on the mode of preparation. The tests
were all of the pass/fail kind, and it was tacitly assumed that a given method
of preparation would always yield the same test results. We may relax both
these conditions. We will consider a test for which there are a number of
possible outcomes: for present purposes we will assume their number to be
at most denumerably infinite, so that they may be labeled $x_1$, $x_2$, $x_3$, and so
on. This allows us to consider any test which involves assigning a rational
another. Certainly the term system then refers (or, to the scrupulous, appears
to refer) to whatever is in common between two, possibly very different,
preparation-measurement procedures. But this just shows that the minimal
Need the subspace be one-dimensional? No: by leaving the dimension
unspecified we allow for the fact that our tests may be coarse-grained.
Another test which we regard as a refinement of our original procedure
might persuade us that what we had previously regarded as one outcome,
$x_2$, say, should properly be regarded as two: $x_{2a}$ and $x_{2b}$. In that case we
would come to regard the subspace corresponding to $x_2$ as the span of two
others. Of course, if we have reason to believe that no further discrimination
is possible, that the outcomes are in a sense atomic, then we are at liberty to
(3.2a) $p(E_0) = 0$
(3.2b) $p(E_1) = 1$, where $E_1 = \bigcup_i \{x_i\}$
(3.2c) for events $e_i$ and $e_j$, $p(e_i \cup e_j) = p(e_i) + p(e_j)$, provided $e_i \cap e_j = E_0$.
Two methods of preparation are identified if and only if to each outcome
one gives the same probability as the other. Then, by definition, each dis-
tinct method of preparation results in a different assignment of probabili-
ties, that is, a different function p.
Now let us turn to the vector space $\mathcal{V}$. Let $L_i$ be the subspace
corresponding to the event $e_i$, and $P_i$ the projection operator onto $L_i$.
Since subspaces and projection operators are in one-to-one correspondence we
may regard $P_i$ as representing $e_i$. Note that the zero operator, $P_0$,
corresponds to $E_0$ and the identity operator, I, to $E_1$. We define the length
of a vector $v \in \mathcal{V}$, as usual, in terms of the inner product with which
we have equipped $\mathcal{V}$ and denote it by $|v|$. The projection $P_i v$ of the
vector v onto the subspace $L_i$ will be of length $|P_i v|$.
Now let v be a normalized vector of $\mathcal{V}$. We have $P_0 v = 0$ and $Iv = v$,
whence
(3.3a) $|P_0 v|^2 = 0$
and
by Pythagoras' theorem.
For the present we restrict this function to the set of subspaces in corre-
spondence with the set of events. We then obtain
(3.4a) $\mu_v(0) = 0$
(3.4b) $\mu_v(\mathcal{V}) = 1$
(3.4c) for subspaces $L_i$ and $L_j$, $\mu_v(L_i \oplus L_j) = \mu_v(L_i) + \mu_v(L_j)$ provided
$L_i \cap L_j = 0$.
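These three properties can be checked on a toy case. The sketch below is illustrative and makes assumptions not found in the text: a three-dimensional real space with the standard orthonormal basis, a coarse-grained outcome represented by a two-dimensional subspace, the normalized vector (1/3, 2/3, 2/3), and the hypothetical helper names `project` and `mu`. It takes the measure of a subspace to be the squared length of the projection of v onto it.

```python
from fractions import Fraction

def project(basis, v):
    """Project v onto the span of the orthonormal vectors in `basis`."""
    result = [Fraction(0)] * len(v)
    for e in basis:
        coeff = sum(a * b for a, b in zip(e, v))
        result = [r + coeff * a for r, a in zip(result, e)]
    return result

def mu(basis, v):
    """Squared length of the projection of v onto the span of `basis`."""
    return sum(x * x for x in project(basis, v))

# Assumed example: orthonormal basis of a three-dimensional real space.
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
L1 = [e1, e2]   # a coarse-grained outcome: two-dimensional subspace
L2 = [e3]       # a second outcome: one-dimensional subspace
v = [Fraction(1, 3), Fraction(2, 3), Fraction(2, 3)]  # normalized vector

mu_zero = mu([], v)                # the zero subspace
mu_whole = mu([e1, e2, e3], v)     # the whole space
mu_L1, mu_L2 = mu(L1, v), mu(L2, v)
mu_sum = mu(L1 + L2, v)            # direct sum of the two subspaces
```

The zero subspace receives measure 0, the whole space measure 1, and the measure of the direct sum is the sum of the measures, mirroring the three conditions imposed on the probability function p.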
It appears that our representation of the set of outcomes within the vector
space $\mathcal{V}$ has enabled us to represent not only each possible event in
$\mathcal{E}$, but also the probability measures on that set. The probabilities of
the various events are physically determined by the method of preparation, or,
as we may say, by the state of the system being tested. Thus, already, from our
generalized "theory" we can begin to see the rationale behind the use of
vectors to represent quantum-mechanical states; further, if we look back at
Equation (2.1) we find that, both here and in quantum mechanics, probabili-
ties are computed in the same way from the state vector: for any vector v and
projection operator $P_i$ we have
(where the subspace $L_i$ and the projection operator $P_i$ represent the outcome
in question). We can also show the converse, that any probability function p
on the particular set of events we are dealing with can be represented by a
$$v = \sum_i c_i v_i$$
and specify that, for each i,
Notice that all the vectors $v_i$ are mutually orthogonal; whence, by Pytha-
goras' theorem, we know that
Thus
chose from each subspace $L_i$ an arbitrary normalized vector $v_i$: a different
choice would have resulted in a different vector v representing the same
probability measure. There are two reasons why a choice of vectors is
available to us. The first is that we have not claimed that our test outcomes
are atomic: we have given each subspace $L_i$ arbitrary dimensionality. The
second is that, even within a one-dimensional subspace, there is more than
one normalized vector. If $\mathcal{V}$ is a vector space over the reals, and $v_i$ is a
are some in which only certain vectors are eligible to represent states, and, as
noted in the last section, there are others in which some states are not
representable by a vector at all. I turn now to physical theories of the first
$|P_i v_i| = 1$ but
$|P_j v_i| = 0$ whenever $j \neq i$
Bearing this in mind, let us look at one of the ways in which the difference
between classical mechanics and quantum theory has been characterized.
In chapter 1 of his Principles of Quantum Mechanics, Dirac (1930, pp. 10-18)
locates the major difference between the two theories in the role played by
the principle of superposition in quantum mechanics. Put in general terms,
the principle states that,
We can now see the significance of this principle. It is clear that no deter-
ministic theory can include it, for on such a theory the only vectors allowed
to represent pure states lie within the subspaces $L_1, L_2, \ldots$ which repre-
sent the outcomes $x_1, x_2, \ldots$. Though these vectors span the whole space,
that is, any vector v can be written as a sum $\sum_i c_i v_i$ of such vectors, we are
not free to regard every vector constructed in this way as representing a physi-
cally possible pure state. In the two-dimensional case, where there are two
possible outcomes $x_1$ and $x_2$ represented by the one-dimensional subspaces
$L_1$ and $L_2$, respectively (see Figure 3.2), then, on a deterministic theory, the
(normalized) vectors $v_1 \in L_1$ and $v_2 \in L_2$ may represent pure states, but the
vector $v_3 = (1/\sqrt{2})v_1 + (1/\sqrt{2})v_2$ may not.
Within quantum mechanics, on the other hand, the principle holds; thus
all vectors in the space $\mathcal{V}$ can represent possible physical states, and they
may all be written in the form $\sum_i c_i v_i$, that is, as linear sums of vectors within
are arranged in horizontal lines, and in any line the distance between adja-
cent pins is again slightly greater than the diameter of the ball. The pins of
each line are staggered with respect to those in the lines immediately above
and below, so that, when a ball passes through a gap in one line, it will strike
a pin in the line below. Beneath the array there is a series of boxes, each box
directly below a gap in the lowest line of pins. Each box corresponds to a
different outcome of the test, and each outcome has a certain probability of
occurrence.

Figure 3.3
that a ball would bounce to the left as to the right after striking a pin. In that
case we could assign an expected probability to each outcome as follows.
Given n rows of pins there will be n + 1 different outcomes, which we can
label from the left to right, $x_0, x_1, \ldots, x_n$, and the binomial theorem
would lead us to expect that:
$$p(x_k) = \frac{1}{2^n} \, \frac{n!}{k!\,(n-k)!}$$
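For a concrete feel for these numbers, the short Python sketch below tabulates the expected probabilities. The choice of n = 5 rows and the function name `outcome_probabilities` are assumptions made for illustration; the formula itself is the binomial expression above, valid when left and right bounces are equally likely.

```python
from fractions import Fraction
from math import comb

def outcome_probabilities(n):
    """Probability of each of the n + 1 boxes for n rows of pins,
    assuming left and right bounces are equally likely."""
    return [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]

# Assumed example: five rows of pins, hence six boxes.
probs = outcome_probabilities(5)
total = sum(probs)
```

The probabilities are symmetric about the middle boxes and sum to one, as they must if the boxes exhaust the possible outcomes.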
and $p_i(x_j) = 0$ for $i \neq j$. Now, because of the way the apparatus is set up,
each of these pure states may have a particular probability of occurring. Let
the probability of occurrence of the pure state corresponding to $p_i$ be $b_i$. Then
the probability function p on the set of outcomes can be expressed as a
weighted sum of the functions $p_i$: for each outcome $x_j$, we have:
$$p(x_j) = \sum_i b_i \, p_i(x_j)$$
We see that, provided there are at least two coefficients $b_j$ and $b_k$ greater than
zero, p will be the probability function corresponding to a mixed state rather
than a pure state.
Within our vector-space representation, each probability function $p_i$ is
represented by the function $\mu_i$ on the set of subspaces such that
where $v_i$ is some pure state corresponding to $p_i$. Note, incidentally: for the
reasons given in Section 3.3, more than one vector can represent a given
pure state. We may, however, pick a representative $v_i \in L_i$ and proceed as
though there was no such degeneracy. Given, then, a mixed state repre-
sented by the weighted sum $\sum_i b_i \mu_i$, one may ask why we may not represent
it by a vector which is a suitable weighted sum of the vectors $v_i$. It is certainly
not mathematically impossible to do so. In Section 3.3 we saw that, as long
as we are confining ourselves to events associated with one particular ex-
periment, we can represent any probability measure on the set of these
By doing so, however, we violate the principle that we use vectors to repre-
sent only pure states; in a determinist theory the only vectors that do this are
the vectors Vi' To put it another way, by doing so we use the principle of
superposition.
If we are not to lose an important distinction, we need to find a way of
representing the weighted sum of two or more probability functions which
is distinct from merely adding the (suitably weighted) vectors which repre-
sent them. We do so by finding an alternative representation of pure states.
We have already noted that a ray in $\mathcal{V}$ serves to represent a pure state; we
use, not the ray, but the projection operator onto it. Mixed states are then
represented by weighted sums of projection operators in a very direct way:
if each $p_i$ is represented by the projection operator $P_i$, then $\sum_i b_i p_i$ is
represented by $\sum_i b_i P_i$. Since $\sum_i b_i P_i$ is not a projector unless there is
exactly one coefficient $b_i$ which is nonzero (and hence equal to one), the
distinction between pure states and mixed states is made clear.
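The distinction can be seen in a two-dimensional toy case. The Python sketch below is illustrative only and rests on assumptions not in the text: two outcomes represented by the coordinate axes, weights 9/25 and 16/25, a superposed vector (3/5, 4/5) whose squared coefficients equal those weights, and hypothetical helper names. It tests whether an operator is a projector by checking that it equals its own square.

```python
from fractions import Fraction

def inner(u, w):
    """Inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, w))

def projector(unit):
    """Matrix of the projector onto the ray spanned by a unit vector."""
    return [[a * b for b in unit] for a in unit]

def apply(matrix, v):
    """Matrix-vector product."""
    return [inner(row, v) for row in matrix]

def matmul(a, b):
    """Product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_projector(m):
    """A projector equals its own square."""
    return matmul(m, m) == m

# Assumed example: pure states along the axes, with weights b.
v1, v2 = [Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]
P1, P2 = projector(v1), projector(v2)
b = [Fraction(9, 25), Fraction(16, 25)]

# Mixed state: weighted sum of the projectors.
mixed = [[b[0] * P1[i][j] + b[1] * P2[i][j] for j in range(2)]
         for i in range(2)]

# Superposition: a single vector whose squared coefficients are the weights.
v = [Fraction(3, 5), Fraction(4, 5)]
superposed = projector(v)

# Outcome probabilities: weighted average over pure states for the mixture,
# inner product of v with its projection for the superposition.
mixed_probs = [sum(bk * inner(vk, apply(P, vk)) for bk, vk in zip(b, (v1, v2)))
               for P in (P1, P2)]
super_probs = [inner(v, apply(P, v)) for P in (P1, P2)]
```

Both assign the same probabilities to the two outcomes of this one experiment, yet the weighted sum fails the projector test while the superposed vector's operator passes it, so the two representations remain distinct.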
This treatment of states is developed in Chapter 5. There I will treat the
problem of finding an algorithm to relate probabilities to these weighted
sums of projectors in the way that the equation
and so was representable by $\frac{3}{4}P_j + \frac{1}{4}P_k$ - this would be taken to mean that the
ball was actually in one of the states $\mu_j$ or $\mu_k$; the cumulative effect of factors
individually too small to allow for means that we do not know which state it
is in, but we do know that the ball is three times as likely to be in $\mu_j$ as $\mu_k$.
Clearly, any classical theory dealing with systems about which our infor-
mation is less than complete can use the notion of a mixed state interpreted
in this way. Perhaps more surprisingly, mixed states appear in quantum
theory as well, but there the "ignorance interpretation," as we may call it,
gives rise to a number of problems. I discuss these in Chapter 5.
One question we can pose at this stage is this: what, in quantum me-
chanics, distinguishes the mixed state represented by $\sum_i b_i P_i$ from the state
represented by the vector $\sum_i c_i v_i$, where $|c_i|^2 = b_i$? Each yields the same
probability to any given outcome $x_j$ of our experiment (which we may take
as measuring some quantum-mechanical magnitude). To put the question
Since each outcome corresponds to a particular value of the observable A,
we could regard it as atomic (in the sense described in Section 3.2), and make
each subspace $L_j$ one-dimensional. We need not do this, however, and in the
remainder of this section I shall not assume we have done so. As before, we
denote by $P_j$ the projection operator onto $L_j$. We now construct the operator
$\sum_j a_j P_j$ on $\mathcal{V}$, and claim that this operator represents the observable A: in fact,
we show this by using the same letter for the operator as for the observable
and writing

$$A = \sum_j a_j P_j$$
It remains to show just what this claim involves and how it is justified.
As a preliminary, let us distinguish what is happening here from what
was going on in the previous section when we constructed the mixed state
$\sum_j b_j P_j$. There each $P_j$ represented a pure state, and (on the ignorance
interpretation) each $b_j$ represented the probability of its occurrence. Here every $P_j$
represents an outcome of an experiment, and each $a_j$ the value of the observable
to which the outcome corresponds. Now let us consider the claim itself.
First, we may observe that any operator on $\mathcal{V}$ of the form $\sum_j a_j P_j$ (where all
the numbers $a_j$ are real) is Hermitian. Conversely, the spectral decomposition
theorem (1.32) tells us (i) that any Hermitian operator on a finitely
dimensional vector space is expressible in this way, as a weighted sum of
projectors onto mutually orthogonal subspaces, and (ii) that, if all the $a_j$ are
distinct, then this decomposition is unique. (One further condition, the
compactness of the operator, is required if the space is infinitely
dimensional: see Fano, 1971, pp. 81, 291.) This means that we cannot construct the
same operator in two distinct ways: if

$$A = \sum_i a_i P_i = \sum_j b_j P'_j$$

(where all the $a_i$ are distinct from one another, as are all the $b_j$), then
$\{a_1, \ldots, a_n\} = \{b_1, \ldots, b_n\}$, $\{P_i\} = \{P'_j\}$, and, for any $i$ and $j$, if $P_i = P'_j$
then $a_i = b_j$. Thus, locked up, as it were, in the operator A is all the information
we have about the observable A: that the observable can take the values
$a_1$, $a_2$, and so on; that we take an outcome $x_j$ of the test to mean that the value
of this observable for the system is $a_j$; and that we represent this outcome $x_j$
within our vector space by the subspace $L_j$ (projection operator $P_j$).
It is worth noting that the values $a_j$ are the eigenvalues of the operator A
we have constructed, and that each corresponding eigenvector $v_j$ lies within
the subspace $L_j$. As in quantum theory, eigenvalues of an operator are the
permissible values of the corresponding observable.
But this mathematical object, the operator A, is not just a memory bank
within which we store information about the observable in question. That
alone might be enough to justify the claim that A represents the observable,
but more can be said. For we may use this operator, together with the vector
representing the (pure) state of the system, to calculate probabilities and
expectation values. The algorithms are exactly as they are in quantum
theory: from Equation (2.1) we know that, in quantum mechanics, the
probability that a measurement of observable A will yield value $a_j$ is given by

$$p(x_j) = |P_j v|^2$$

From what has been said it is obvious that we should identify $p(x_j)$ in this
equation with $p_v(A, a_j)$ from the earlier one.
Given identical procedures for assigning probabilities to the various pos-
sible values of a given observable, we could hardly compute expectation
values, denoted $\langle A \rangle$, differently in our general representation and in quantum
theory: in each case they are calculated by weighting the various possible
values by the probability of their occurrence. As in Section 2.4, we obtain

$$\langle A \rangle = \sum_j |P_j v|^2 \, a_j = \langle v | A v \rangle$$
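The two algorithms just described can be tried out on a small example. What follows is a minimal sketch, assuming a hypothetical observable on $\mathbb{R}^3$ with values 2 and -1; the projectors, the state vector `v`, and the helper names are illustrative assumptions. One of the outcome subspaces is a plane, to reflect the remark that the $L_j$ need not be one-dimensional.

```python
import math

def apply(m, v):
    """Apply matrix m (list of rows) to vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def norm_sq(v):
    """Squared length of a vector."""
    return sum(abs(x) ** 2 for x in v)

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

# Hypothetical observable: values a_j and projectors P_j (assumed for illustration).
values = [2.0, -1.0]
projectors = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 0]],  # onto the plane spanned by e1, e2
    [[0, 0, 0], [0, 0, 0], [0, 0, 1]],  # onto the line along e3
]

# A = sum_j a_j P_j
A = [[sum(a * p[i][j] for a, p in zip(values, projectors)) for j in range(3)]
     for i in range(3)]

v = [1 / math.sqrt(3)] * 3  # a normalized vector representing a pure state

probs = [norm_sq(apply(p, v)) for p in projectors]          # p(x_j) = |P_j v|^2
exp_weighted = sum(pr * a for pr, a in zip(probs, values))  # sum_j |P_j v|^2 a_j
exp_inner = inner(v, apply(A, v)).real                      # <v|Av>

print(probs, exp_weighted, exp_inner)
```

The two ways of computing the expectation value agree to within rounding, as the displayed identity requires.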
have not investigated how the results of one kind of test might be related to
those of another. We saw in the last section that each test can be thought of
as a measurement of a physical quantity, or observable; in this section we
will look at some of the ways in which two observables can be related .
As before, we associate an observable with a measurement procedure; the
various outcomes from the measurement correspond to values of the ob-
servable in question. Again, for simplicity, I will not consider observables
with a continuous spectrum; for an observable of that kind, an outcome
corresponds to a range of values (a Borel set of the reals), rather than to one
value in particular. Most of what we could say about such observables can
be inferred from the discussion of observables with a point (or discrete)
spectrum.
Let us consider, then, two observables A and B: the values of A are
associated with the various outcomes $x_1, x_2, \ldots$ of a suitable experiment,
and values of B with outcomes $y_1, y_2, \ldots$ of another. We now ask, what
relationships can exist between the probabilities $p(x_i)$ assigned to the outcomes
of an A-experiment by a given state and the probabilities $p(y_j)$ which
that state assigns to the outcomes of the B-experiment? More formally, let
$\mathcal{H}_A$ be the vector space within which we represent the (outcomes associated
with) observable A, and $\mathcal{H}_B$ the vector space within which we represent
observable B. Then within $\mathcal{H}_A$ there is a set $\{v_A\}$ of normalized vectors which
represent admissible probability measures on the outcomes of measurements
of A: we may call these the admissible pure A-states. Similarly, let $\{v_B\}$
be the set of admissible pure B-states. Then an ordered pair $(v_A, v_B)$ will
represent a probability measure which simultaneously assigns probabilities
to A-outcomes and to B-outcomes. Any relationship that obtains between
observables A and B will effect a constraint on the set of ordered pairs which
we regard as admissible pure AB-states.
Consider first the relation (or nonrelation) of independence. In this case
there are no constraints on the set: if A and B are independent, then the
ascription of a set of probabilities to A-outcomes gives us no information
about the B-outcomes. We may say that A and B are independent if and only
if each ordered pair $(v_A, v_B)$ represents an admissible AB-state. Within classical
mechanics each component of linear momentum and of position is
independent of all the others, and within quantum theory each component
of linear momentum is independent of each component of spin. The condition
for the independence of A and B requires us to treat $\mathcal{H}_A$ and $\mathcal{H}_B$ as two
distinct vector spaces. We may, if we wish, think of the state of a system as a
vector in the direct sum of these, $\mathcal{H}_A \oplus \mathcal{H}_B$, and use the ordered pair $(v_A, v_B)$
to represent this vector. If we do so, $v_A$ and $v_B$ will be the components of
$(v_A, v_B)$ in the subspaces $\mathcal{H}_A$ and $\mathcal{H}_B$ of $\mathcal{H}_A \oplus \mathcal{H}_B$. This, in fact, is how the
It follows that each state which assigns probability 1 to any of the outcomes
$b_{i1}, \ldots, b_{ij}, \ldots$ of B also assigns probability 1 to the outcome $a_i$ of A.
Hence the sets $\{b_{ij}\}$ corresponding to different $a_i$'s are mutually exclusive.
In terms of the vector-space representation, we represent an outcome $a_i$
of A by the span $L_i^A$ of the subspaces $L_{ij}^B$ corresponding to different outcomes
we obtain
Figure 3.4 Subspaces compatible with $L$ are (a) the zero subspace {0}; (b) any line in $L$,
and the line $L^\perp$ perpendicular to $L$; (c) the plane $L$, and any plane obtained by rotating $L_a$
about the line $L^\perp$ (for example, $L_a$, $L_b$, $L_c$); (d) the whole space $\mathbb{R}^3$.
( I Ii) In any vector space $\mathcal{V}$, subspaces $L_a$ and $L_b$ are said to be compatible if
there exist mutually orthogonal subspaces $L_{a0}$, $L_{b0}$, and $L_c$ in $\mathcal{V}$ (any or
all of which may be the zero subspace) such that
$$B = \sum_j b_j P_j^B$$
From (2), all the projection operators $P_i^A$ and $P_j^B$ commute with each other;
this in turn guarantees that the operators A and B commute. Thus we obtain
the elegant result that compatible observables may be represented by commuting
operators.
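The result lends itself to a quick numerical check. The sketch below is an illustration under assumptions of my choosing: an orthonormal basis `f1, f2, f3` of $\mathbb{R}^3$, the numerical values assigned to the outcomes, and the helper names are all hypothetical. Operators `A` and `B` are built from projectors onto rays of one and the same orthogonal set, and so should commute; `C` uses a ray oblique to that set and serves as a contrast.

```python
import math

def outer(u):
    """Projector |u><u| onto the ray through the real unit vector u."""
    return [[a * b for b in u] for a in u]

def combine(coeffs, mats):
    """Return sum_k coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commute(a, b, tol=1e-12):
    """True if ab = ba to within the tolerance."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
# One set of mutually orthogonal rays (hypothetical choice of basis).
F1, F2, F3 = outer([s, s, 0.0]), outer([s, -s, 0.0]), outer([0.0, 0.0, 1.0])

A = combine([1.0, 1.0, 2.0], [F1, F2, F3])  # outcome subspaces span{f1,f2}, span{f3}
B = combine([5.0, 7.0, 7.0], [F1, F2, F3])  # outcome subspaces span{f1}, span{f2,f3}

# A third operator whose outcome ray is oblique to the rays above.
C = outer([s, 0.0, s])

print(commute(A, B), commute(A, C))
```

Here `B` is not diagonal in the standard basis, so the commutation of `A` and `B` is not a mere artifact of both matrices being diagonal.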
Figure 3.8
$\mathcal{H}_x$, those of an $S_y$-experiment within a two-dimensional Hilbert space $\mathcal{H}_y$,
and those of an $S_z$-experiment within a two-dimensional Hilbert space
$\mathcal{H}_z$. Thus, vis-a-vis this trio of observables, any state can be represented by a
triple $(v_x, v_y, v_z)$ of vectors, where $v_x \in \mathcal{H}_x$, $v_y \in \mathcal{H}_y$, and $v_z \in \mathcal{H}_z$. But it
turns out that these vectors are not independent; we can use the same
two-dimensional Hilbert space to represent all three observables, so that,
for any pure state, $v_x = v_y = v_z$, and this vector will assign probabilities to all
three pairs of outcomes. To do so we first need to make $\mathcal{H}_x$, $\mathcal{H}_y$, and $\mathcal{H}_z$
complex (that is, to use the space $\mathbb{C}^2$ for all three of them) and then to
rotate $\mathcal{H}_x$ and $\mathcal{H}_y$, as it were, to fit them on top of $\mathcal{H}_z$.
To speak geometrically (that is, analogically, since $\mathbb{C}^2$ is complex rather
than real), within $\mathbb{C}^2$ the rays we use to represent, say, x+ and z+ can be
obliquely inclined to each other in a way that captures the relation between
$p(x+)$ and $p(z+)$ for all states of the system. For any pair of spin observables,
some, though not all, states are representable in $\mathbb{R}^2$; within the partial
representation of $S_x$ and $S_z$ which $\mathbb{R}^2$ affords, the x+ ray must be at 45° to the
(3.9) We say that two observables are mutually transformable if (a) they are
representable in a Hilbert space $\mathcal{H}$ by operators A and B, and (b)
there exists a unitary operator U on $\mathcal{H}$ such that $A = UBU^{-1}$.
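$$U = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \qquad U^{-1} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$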
It is simple to show that Sx and Sz are mutually transformable.
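A short computation bears this out. The sketch below assumes one particular unitary operator, a rotation of $\mathbb{R}^2$ through 45°, as the transforming operator; that choice, like the helper names, is an assumption made for illustration, and other unitary operators would serve as well. It checks that `U` is unitary and that $U S_z U^{-1} = S_x$.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U = [[s, -s], [s, s]]   # assumed transforming operator: rotation through 45 degrees
U_inv = transpose(U)    # for a real orthogonal matrix the inverse is the transpose

S_z = [[1.0, 0.0], [0.0, -1.0]]
S_x = [[0.0, 1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

is_unitary = close(matmul(U, U_inv), identity)
transforms = close(matmul(matmul(U, S_z), U_inv), S_x)
print(is_unitary, transforms)
```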
Where there are no incompatible observables, the relationship of mutual
transformability becomes trivial, as the "transformations" involved reduce
to a relabeling of the outcomes of a single experiment. However, mutual
transformability is an important characteristic of sets of observables in
quantum mechanics, and I discuss one such set in detail in Chapter 4.
Unlike Definition (3.9), the definitions of functional dependence and
compatibility given in Section 3.7 made no direct reference to the representation
of observables within a Hilbert space. Both definitions, however, can
be reformulated in these terms; the definition of functional dependency is a
bit cumbersome, and I omit it, but a definition of compatibility of striking
simplicity presents itself:
tion it might be made into a deterministic theory P (see Bohm, 1957). Of
course this would mean that the "pure states" of T had, so to speak, been
misidentified; presumably they would appear as mixed states in P. Supplementary
"hidden-variable" theories of this kind have in fact been proposed
for quantum mechanics, and I discuss them in Section 7.8.
Although both the principles under discussion entail that the theory is
inherently probabilistic, they are conceptually independent. The existence
of incompatible observables does not entail that we can add any (suitably
weighted) pair of pure states to obtain another; conversely, we can envisage
a theory in which all pairs of observables are either compatible or independent
but in which the principle of superposition holds. In the latter case,
however, when all pairs of nonindependent observables are compatible, the
principle of superposition may have no empirical content. In the absence of
incompatible observables there may be no way to distinguish a superposition
of two pure states from a mixture of them.
To see what's involved here, let us return to the familiar incompatible
observables $S_x$ and $S_z$ and the (pure) states $p_{z+}$ and $p_{z-}$ which assign probability
1 to outcomes z+ and z-, respectively, of an $S_z$-experiment. Note that we
have
In the space $\mathbb{C}^2$ the states $p_{z+}$ and $p_{z-}$ (the eigenstates of the observable $S_z$) are
represented by the vectors

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

(the eigenvectors of the $S_z$ matrix: see Section 1.7). Now consider the state
represented by the vector
This function $p_{x+}$ is thus a pure state such that, for each $S_z$-outcome $z_i$,
we now obtain
The (pure) superposition $p_{x+}$ is distinguished from the mixture $p$ not by the
probabilities it assigns to the $S_z$-outcomes, but by those assigned to the
$S_x$-outcomes. It is the existence of an observable $S_x$ incompatible with $S_z$
which enables us to distinguish the mixed state $p$ from the pure state $p_{x+}$.
The fact that different probabilities are assigned to the $S_x$-outcomes by $p$ and
$p_{x+}$ is associated with the fact that the subspaces in $\mathbb{C}^2$ representing these
outcomes are (geometrically speaking) obliquely inclined to those representing
the $S_z$-outcomes: as we noted, in the partial representation of $S_x$ and
$S_z$ available in $\mathbb{R}^2$ (see Figure 3.9), the x+ line is at 45° to both the z+ line and
the z- line.
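The contrast can be made concrete with a few lines of arithmetic. The sketch below rests on two assumptions that should be flagged: the mixture is taken to be the equal-weight mixture of $p_{z+}$ and $p_{z-}$, and the probability a mixture assigns to an outcome is computed as the weighted average of the probabilities assigned by its components (the "weighted sum of probability functions" reading). The function and variable names are hypothetical.

```python
import math

def prob(state, outcome):
    """|<outcome|state>|^2 for real unit vectors."""
    return sum(a * b for a, b in zip(outcome, state)) ** 2

def mixture_prob(weights, states, outcome):
    """Weighted average of the probabilities assigned by the component states."""
    return sum(w * prob(s, outcome) for w, s in zip(weights, states))

s = 1 / math.sqrt(2)
z_plus, z_minus = [1.0, 0.0], [0.0, 1.0]
x_plus, x_minus = [s, s], [s, -s]   # at 45 degrees to the z+ and z- lines

weights, components = [0.5, 0.5], [z_plus, z_minus]  # assumed equal-weight mixture

# Probabilities for the S_z-outcomes: the two states agree.
pure_on_z = [prob(x_plus, z) for z in (z_plus, z_minus)]
mixed_on_z = [mixture_prob(weights, components, z) for z in (z_plus, z_minus)]

# Probabilities for the S_x-outcomes: the two states differ.
pure_on_x = [prob(x_plus, x) for x in (x_plus, x_minus)]
mixed_on_x = [mixture_prob(weights, components, x) for x in (x_plus, x_minus)]

print(pure_on_z, mixed_on_z)
print(pure_on_x, mixed_on_x)
```

Only the second pair of lists separates the superposition from the mixture, which is the role played by the incompatible observable $S_x$.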
If all the outcomes in question could be represented by mutually orthogonal
subspaces, or by subspaces all of which were generated from one set
of mutually orthogonal rays (if, in other words, the observables were
compatible), then such differences would not occur. Assume, for instance,
that each outcome a of observable A, and each outcome b of observable B
(which is not independent of A), can be represented by subspaces $M_a$ and $M_b$
such that
where each of the subspaces $L_{ai}$ and $L_{bj}$ is a member of a set $\{L_i\}$ of mutually
orthogonal rays of a space $\mathcal{H}$. Assume further that the set $\{M_a\}$ of subspaces
corresponding to A-outcomes spans $\mathcal{H}$, as does the set $\{M_b\}$ of those corresponding
to B-outcomes. Clearly, A and B are compatible.
We see that any function $\mu$ on the set $\{L_i\}$, such that (a) $0 \le \mu(L_i) \le 1$ for
each $L_i$ in $\{L_i\}$ and (b) $\sum_i \mu(L_i) = 1$, determines a probability function $p$ on the
sets of A-outcomes and of B-outcomes such that
provided that

$$\mu_i(L_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
Figure 3.10
Now the existence of incompatible observables is not enough to guarantee
either (2) or (3). For example, in the Hilbert space of square-integrable
functions of x there are incompatible observables P and Q (momentum and
position) but, it seems, no genuine observables corresponding to the Hermitian
operators P + Q or PQ + QP, to name but two (see Wigner, 1973,
p. 369). For the Hilbert spaces representing spin systems, however, (1), (2)
and (3) all hold; this was established by Swift and Wright (1980). To demonstrate
(3) Swift and Wright showed that under certain idealizing
assumptions (in particular, the assumption that we can create in the laboratory
any electromagnetic field consistent with Maxwell's equations) an
arbitrary Hermitian operator on a spin system can be measured using a
suitable generalization of the Stern-Gerlach experiment. (They also ignore
masking effects due to charge; see Section 10.1.)
Thus, at least in the case of spin systems, quantum theory makes use of
the full representational capacity of a Hilbert space.
As this equation shows, H defines not just a single unitary operator $V$, but
a family $\{V_t\}$ of such operators indexed by the time $t$. The question we now
address is: why should the dynamical evolution of states be given by operators
of this kind? Is there, so to speak, an a priori derivation of Schrodinger's
equation?
Note first that the family $\{V_t\}$ has a structure: it forms a one-parameter
group parameterized by the real numbers. This statement needs some
amplification.
Consider two sets of numbers, the set $\mathbb{R}$ = {t: t is a real number} and the
set P = {$e^t$: t is a real number}. $\mathbb{R}$ forms a group under the operation of
addition, and the identity element of this group is the number zero (see
Section 1.8). Since (i) for all $t_1, t_2 \in \mathbb{R}$, $e^{t_1 + t_2} = e^{t_1} \cdot e^{t_2}$ and (ii) $e^0 = 1$, it
follows that $\langle \mathbb{R}, +, 0 \rangle$ is isomorphic to $\langle P, \cdot, 1 \rangle$. In other words, P also forms a
group (under multiplication) whose identity element is 1.
The set we are interested in, $\{V_t\}$, is a set not of numbers but of operators,
each expressible in the form $e^{-iHt}$. However, the rules for operator multiplication
echo those for arithmetical multiplication:
Hence the set $\{V_t\}$ also forms a group isomorphic to $\langle \mathbb{R}, +, 0 \rangle$; the group
operation is operator multiplication, and the identity element is the identity
operator I. It should be clear what is meant when we say that this group is
parameterized by the real numbers.
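The group structure can be exhibited numerically. The sketch below is offered under stated assumptions: `H` is an arbitrary 2x2 Hermitian matrix chosen for illustration, the exponential $e^{-iHt}$ is approximated by a truncated power series (adequate for the small values of $t$ used here, not a general-purpose routine), and all function names are hypothetical. It checks that $V_{t_1} V_{t_2} = V_{t_1 + t_2}$, that $V_0$ is the identity, and that each $V_t$ is unitary.

```python
def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(m, c):
    return [[c * x for x in row] for row in m]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def dagger(m):
    """Conjugate transpose."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm(m, terms=40):
    """Truncated power series for exp(m); fine for small matrices of modest norm."""
    result = identity(len(m))
    term = identity(len(m))
    for k in range(1, terms):
        term = scale(matmul(term, m), 1.0 / k)  # term is now m^k / k!
        result = add(result, term)
    return result

def close(a, b, tol=1e-9):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))

H = [[1.0, 0.5], [0.5, -1.0]]  # hypothetical Hermitian operator

def V(t):
    """V_t = exp(-i H t)."""
    return expm(scale(H, -1j * t))

t1, t2 = 0.7, 1.1
group_law = close(matmul(V(t1), V(t2)), V(t1 + t2))
has_identity = close(V(0.0), identity(2))
unitary = close(matmul(V(t1), dagger(V(t1))), identity(2))
inverse = close(matmul(V(t1), V(-t1)), identity(2))
print(group_law, has_identity, unitary, inverse)
```

The last check shows that the inverse of $V_t$ is $V_{-t}$, the member of the family indexed by the additive inverse of $t$.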
We can show that,
The significance of this theorem is this: if we can show why the dynamical
evolution of states should be given by a weakly continuous one-parameter
group of unitary operators, then it will follow from the theorem that there is
a single Hermitian operator governing this evolution. (See Jordan, 1969,
p. 52; weak continuity is defined below, but see also Fano, 1971, p. 331.)
What such an investigation will not show is why this operator should be the
Hamiltonian (the energy operator) for the system.
Let us ignore, for the moment, the fact that a Hilbert-space representation
of the states of systems exists, and consider a state just as a probability
function on a set of experimental questions, a set {$(A, a_i)$: A an observable, $a_i$
an outcome of an A-experiment}. We assume that the state $p_2$ at time $t_2$ is
specifiable in terms of the state $p_1$ at $t_1$ ($t_1 \le t_2$), whatever the latter may be.
Thus we can write,
where $\mathbf{V}_{t_1}^{t_2}$ is some function on the set S of states; formally $\mathbf{V}_{t_1}^{t_2} : S \to S$ is a
mapping of the set of states into itself.
If the state $p_1$ is in turn specifiable in terms of the state $p_0$ at $t_0$ ($t_0 \le t_1$),
that is, if
then
and, using the standard notation for the composition of functions, we may
write
$\mathbf{V}_{t_1}^{t_2} = \mathbf{V}_t$, where $t = t_2 - t_1$
The definition of the product of these functions now gives us, for all $t_1$, $t_2$,
and $t_3$,
Thus from just two assumptions, (1) statistical determinism, that the state
at time $t_2$ is a function of the state at time $t_1$ ($t_1 \le t_2$), and (2) homogeneity, that
time is homogeneous, it follows that the evolution of states is governed by a
family $\{\mathbf{V}_t\}$ of functions having the structure of a one-parameter commutative
semigroup. By adding the further assumption, (3) continuity, that the
probabilities given by the state vary continuously with time (so that small
changes in time result in small changes in probability), we give $\{\mathbf{V}_t\}$ the
structure of a continuous one-parameter commutative semigroup.
If $\{\mathbf{V}_t\}$ is to be a group, then (4) each mapping $\mathbf{V}_t$ of S into S must be
one-to-one. That is, to each mapping $\mathbf{V}_t : S \to S$ there must correspond an
inverse mapping $\mathbf{V}_t^{-1} : S \to S$, so that
Mackey (1963, p. 81) called this assumption (4) "reversibility," but this name
is "not quite appropriate," as Stein (1972, p. 390 and n. 21) has remarked,
because the assumption does not imply that, for each possible dynamical
evolution of the system, there is another evolution like the first but in the
reverse order.
We may associate each inverse mapping $\mathbf{V}_t^{-1}$ with a negative number $-t$
by writing
Each such mapping (which, so to say, moves the state backward through time)
obtains its physical significance only from $\mathbf{V}_t$, the member of the original
semigroup of which it is the inverse.
Two more assumptions are needed to ensure that each operator $\mathbf{V}_t$ can be
represented by a unitary operator $V_t$ on a Hilbert space $\mathcal{H}$. The first is (5)
preservation of pure states, that $\mathbf{V}_t$ maps pure states into pure states. Then its
representation in $\mathcal{H}$ maps vectors into vectors, and so is an operator $V_t$ on
$\mathcal{H}$. Furthermore, since all vectors representing pure states are normalized,
$V_t$ leaves the lengths of such vectors unchanged.
A second assumption is needed to ensure that $V_t$ is linear; this may be
expressed by either of two requirements. The first is (6a) preservation of
superpositions, that $V_t$ preserves superpositions on $\mathcal{H}$: that for all scalars a
and b, and for all vectors u and v,
We may get some feel for the physical consequences of (6b) from the
following considerations. Let $p_0$ and $q_0$ be two pure states, and let us assume
that for some experimental question (A,a), $p_0(A,a) = 1$. Now let $p_0$ and $q_0$
evolve under the same evolution operator $\mathbf{V}_t$ to pure states $p_t$ and $q_t$, respectively,
such that $p_t(B,b) = 1$ for some new experimental question (B,b). Then,
provided that (6b) holds for the operator $V_t$ representing $\mathbf{V}_t$,

$$q_0(A,a) = q_t(B,b)$$

To use a term we have not hitherto come across, (6b) guarantees that
transition probabilities between states are preserved under dynamical evolution.
Assumptions (5) and (6) between them ensure that each $V_t$ is a linear
operator which leaves the lengths of vectors invariant. Since we have assumed
(4) that each $V_t$ has an inverse, it follows from Definition (2.9) that
each $V_t$ is a unitary operator on $\mathcal{H}$.
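The preservation of transition probabilities by a unitary operator is easily illustrated. In the sketch below the matrix `U`, the angles that define it, and the two state vectors are hypothetical choices made only for the purpose of the example; the quantity compared before and after the transformation is $|\langle u | v \rangle|^2$.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

# A hypothetical unitary operator on C^2, fixed by two arbitrary angles.
a, b = 0.4, 1.3
c, s = math.cos(a), math.sin(a)
U = [[c, -s * cmath.exp(-1j * b)],
     [s * cmath.exp(1j * b), c]]

# Two normalized vectors standing in for pure states (illustrative values).
u = [0.6, 0.8j]
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]

before = abs(inner(u, v)) ** 2
after = abs(inner(apply(U, u), apply(U, v))) ** 2
length_after = abs(inner(apply(U, u), apply(U, u)))

print(before, after, length_after)
```

The transition probability is the same before and after, and the length of the transformed vector is still one.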
Hence, given assumptions (1) - (6), we know that the Hilbert-space representations
of the evolution operators satisfy the antecedent of Stone's
theorem (3.12). It follows that, if these assumptions are satisfied, then
$$i \, \frac{\partial v}{\partial t} = A v$$
where A is a Hermitian operator. As was stated earlier, however, the as-
sumptions do not tell us why this Hermitian operator should be the energy
operator for the system.*
* For an argument by analogy with classical mechanics, see Jordan, 1969, pp. 101-102.
4
Spin and Its Representation
The quantum theory can be adapted to a great many difficulties. It is an open theory,
in the sense that apparent inadequacies can be accounted for in an ad hoc manner, by
adding suitable operators or elements to the Hamiltonian, rather than by recasting
the whole structure,
Both sets of remarks are true, but both ignore the fact that the Hilbert-
space formalism is, in an important sense, not theory-neutral. This fact has
been hinted at in the discussion of minimal representations in Section 3.8
and of representational capacity in Section 3.9. In this chapter it is illustrated
by an analysis of one particular problem.
The problem is this. Suppose that we neglect, for the moment, the physi-
cal significance of spin, the interaction of spin with a magnetic field, for
instance. Are there very general constraints to which the family $\{S_\alpha\}$ of
components of spin conform, and which guarantee that the family is representable
in $\mathbb{C}^2$ in just the way that quantum theory tells us? According to
quantum mechanics, we can represent $S_x$, $S_y$, and $S_z$ by the Pauli spin
matrices; we can also produce a general form of matrix by which to repre-
sent any component of spin. What is it about spin that establishes that this
representation must be the right one? Come to that, what is it about spin that
establishes that a minimal representation in a Hilbert space exists? We shall
find that the possibility of such a representation depends crucially on certain
features of the family $\{S_\alpha\}$; we can portray systems, not very dissimilar to
quantum systems, whose behavior cannot be modeled in this way. These
results will give us good reason to think that Hilbert spaces provide repre-
sentations of quantum behavior which are not only versatile and adaptable,
but physically significant.
So that each point on the sphere is represented by just one pair of coordinates
we set

$$-\pi < \phi \le \pi \qquad -\frac{\pi}{2} < \theta \le \frac{\pi}{2}$$
The pure states of the system are those which assign probability 1 to exactly
one question $\alpha+$ (and hence probability 0 to the complementary question
$\alpha-$).
Let us now see what the effect is of imposing some very general con-
straints, like symmetry and continuity, on the way that the probability
varies over the sphere. We assume the following to be the case.
(4.1a) There exists a family $\{S_\alpha\}$ of observables, indexed by points on the
unit sphere S of $\mathbb{R}^3$ (in other words, by directions in physical space).
(4.1b) For each point $\alpha$ on S, the observable $S_\alpha$ has two possible values, +
and -, which we associate with directions parallel and antiparallel to
$\alpha$.
(4.1c) The pure states w of the system assign probabilities $p_w$ to all values of
the members of $\{S_\alpha\}$.
(4.1d) (i) For each pure state w there is one direction in space $\alpha_w$ such that
$p_w(\alpha_w^+) = 1$.
(ii) For each direction in space $\alpha$ there is one pure state w such that
$p_w(\alpha^+) = 1$.
Alternatively:

$$p_x(\alpha^+) = p_x(\alpha'^+) = \tfrac{1}{2}$$
_ where ap is the angular separation of a and p. Further, t(O) = 1 and
t(n) = o.
We have also seen that t(n/2) = t and, in general, that
Figure 4.2 Unit circle in physical space (left) and representation space $\mathbb{R}^2$ (right).
Since, for all $\alpha$, $p_{z+}(\alpha) = 1 - p_{z+}(\alpha')$, $L(\phi)$ is orthogonal to $L(\pi - \phi)$, as
required.
In this way we obtain a representation of $\{S_\phi\}$ and WI within $\mathbb{R}^2$. Can this
construction also give us a representation of $\{S_\phi\}$ and $W_G$? The question to be
answered is this. We have mapped the unit circle G into the set of rays of $\mathbb{R}^2$
in a way that yields the probabilities $p_{z+}[(\phi,0)^+]$ for each $S_\phi$ in $\{S_\phi\}$. These are
the probabilities assigned by the pure state z+. But does the construction
hold good for pure states associated with other points on G? Are the rays
$L(\phi)$ oriented in such a way as to yield the correct probabilities for all such
states? For instance, consider the state x+, such that $p_{x+}(x+) = 1$. This state
must be represented by a unit vector in $L(\pi/2)$. Now this certainly gives the
correct probabilities to the possible values of $S_z$, since we have
and, by our previous construction, $L(x+)$ is at 45° to $L(z+)$ and $L(z-)$ (see
Figure 4.3).
However, consider the angle $\phi$ such that $\psi_\phi = \pi/8$, in other words, the
point $(\phi,0)$ on G such that $p_{z+}[(\phi,0)^+] = \cos^2(\pi/8)$. The subspace $L(\phi)$ is at an
angle $\pi/8$ (22.5°) to $L(z+)$. Clearly, if our representation is to hold good for
the state x+, then the question $(\phi,0)^+$ has to be assigned the same value by x+
as by z+. But, on the assumptions (4.1), this means that the point $(\pi/4,0)$,
equidistant from $(0,0)$ and $(\pi/2,0)$, must be among the points of G mapped
onto $L(\phi)$. (Note that, on the assumptions (4.1), the function t need not be
Figure 4.3
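$$t(\pi) = 0 = \cos^2\left(\frac{\pi}{2}\right)$$

$$t\left(\frac{\pi}{2}\right) = \frac{1}{2} = \cos^2\left(\frac{\pi}{4}\right)$$

$$t\left(\frac{\pi}{4}\right) = \cos^2\left(\frac{\pi}{8}\right)$$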
In fact, an extension of the argument given above to pure states associated
with the points $(\pi/4,0)$, $(\pi/8,0)$, and so on, shows that, for every nonnegative
integer n,

$$t\left(\frac{\pi}{2^n}\right) = \cos^2\left(\frac{\pi}{2^{n+1}}\right)$$

Use of the relation $t(\lambda) = 1 - t(\pi - \lambda)$, together with the continuity
assumption, now gives us:
does our vector space representation hold good for all the pure states associated
with points on the great circle G.
To recapitulate, (4.2) told us that, given certain assumptions about $\{S_\alpha\}$,
the probability $p_\alpha(\beta+)$ is a function of the angular separation of $\alpha$ and $\beta$; (4.4)
tells us what this function must be if we are to represent the subset $\{S_\phi\}$ of
$\{S_\alpha\}$, together with its associated pure states, in $\mathbb{R}^2$; (4.4) is a necessary
condition for obtaining a representation of $\{S_\phi\}$ and $W_G$ in $\mathbb{R}^2$.
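The force of this necessary condition can be seen in a small numerical experiment. The sketch below is an illustration under assumptions that should be made explicit: it takes $t(\lambda) = \cos^2(\lambda/2)$ as the candidate satisfying the condition, and, as a rival, the linear function $t(\lambda) = 1 - \lambda/\pi$, which also satisfies $t(0) = 1$, $t(\pi) = 0$ and $t(\lambda) = 1 - t(\pi - \lambda)$; it restricts attention to points of G with $0 \le \phi \le \pi$; and its function names are hypothetical. For each candidate the ray $L(\phi)$ is placed at the angle $\psi_\phi$ to $L(z+)$ fixed by $\cos^2\psi_\phi = t(\phi)$, and the probabilities read off from the angles between rays are compared with those demanded by $t$.

```python
import math

def cos_sq(lam):
    """Candidate t(lambda) = cos^2(lambda / 2)."""
    return math.cos(lam / 2) ** 2

def linear(lam):
    """Rival candidate t(lambda) = 1 - lambda / pi (illustrative only)."""
    return 1 - lam / math.pi

def ray_angle(t, phi):
    """Angle psi between L(phi) and L(z+), fixed by cos^2(psi) = t(phi)."""
    value = min(1.0, max(0.0, t(phi)))  # guard against rounding outside [0, 1]
    return math.acos(math.sqrt(value))

def holds_for_all_states(t, samples=50, tol=1e-9):
    """Compare cos^2 of the angle between rays with t(angular separation)."""
    points = [math.pi * k / samples for k in range(samples + 1)]
    for phi0 in points:      # point of G associated with the pure state
        for phi in points:   # point of G associated with the question
            wanted = t(abs(phi - phi0))
            got = math.cos(ray_angle(t, phi) - ray_angle(t, phi0)) ** 2
            if abs(wanted - got) > tol:
                return False
    return True

print(holds_for_all_states(cos_sq), holds_for_all_states(linear))
```

The construction built on the state z+ carries over to the other states on G for the first candidate but not for the second.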
Equation (4.4) does indeed hold for spin-$\tfrac{1}{2}$ probabilities, and so the representation
we have constructed is perfectly adequate, as far as it goes. But it
does not go far enough. The only states that find representation in it are
those associated with points on G; for full generality we need to consider,
and to represent, the full set $W_S$ of states, or every state corresponding to a
point on S. The state y+, for example, such that, in accordance with (4.2) and
(4.3),

$$x_1^2 + y_1^2 + z_1^2 = x_2^2 + y_2^2 + z_2^2$$

The identity $x^2 + y^2 + z^2 = 1$ is invariant under rotations.
We can readily show that a set of transformations under which an identity
is invariant forms a group (see Section 1.8). The symmetry group of S is just
the set SO(3) of all rotations of S about its center. Inter alia, this leaves
invariant the angular separation of pairs of points on the sphere.
Let us now look at the way symmetry considerations enter into the problem
of finding the conditions under which a Hilbert-space representation of
$\{S_\alpha\}$ and $W_S$ exists. As we have seen, one task is to find a mapping of points of
S onto the rays of some two-dimensional representation space which yields
probability assignments consistent with assumptions (4.1). Within the representation
space these probabilities are determined by the "angles" between
rays. (The term angle is metaphorical, if we are in a complex space: in
general, probabilities are given by expressions of the form $|\langle u | v \rangle|^2$, where u
and v are normalized vectors within the two rays.) The symmetry assumptions
(4.1f) and (4.1g) require that, to any automorphism of S under which
the angular separation of points of S is invariant (that is, to any rotation of S),
there correspond an automorphism of the set of rays of the representation
space which leaves invariant the "angles" between them; to such an automorphism,
in turn, will correspond a unitary operator on the representation
space (see Section 2.7).
We may express this by saying that assumptions (4.1f) and (4.1g) require
the group SO(3) of rotations of S to have a representation in the representation
space. A group $\mathcal{G}$ is said to have a representation within a space $\mathcal{V}$ if there
exists a set of unitary operators on $\mathcal{V}$ which, under the operation of operator
multiplication, forms a group isomorphic to $\mathcal{G}$. Using this terminology, we
can attribute the partial success and ultimate inadequacy of $\mathbb{R}^2$ as the representation
space to the fact that, while (obviously) there exists within $\mathbb{R}^2$ a
representation of SO(2) (the group of rotations of the unit circle G), there is
no representation within it of SO(3).
But, as Felix Klein showed in the late nineteenth century, SO(3) does have
a representation within $\mathbb{C}^2$ which is effectively unique (see Goldstein, 1950,
chap. 4.5 and bibliography on p. 140). (I say the representation is "effectively"
unique because any rotation can be mapped onto two matrices, M
and -M, in $\mathbb{C}^2$.) Further, this representation (which is a mapping of rotation
operators on $\mathbb{R}^3$ onto unitary operators on $\mathbb{C}^2$) is consistent with a particular
mapping of points of S onto subspaces of $\mathbb{C}^2$, namely the mapping which
takes the point $\alpha = (\phi, \theta)$ of S into the ray $L(\alpha)$, whose projector $P(\alpha)$ is given
by

(4.5)  $$P(\alpha) = \begin{pmatrix} \cos^2\frac{\phi}{2} & \cos\frac{\phi}{2}\sin\frac{\phi}{2}\,e^{-i\theta} \\ \cos\frac{\phi}{2}\sin\frac{\phi}{2}\,e^{i\theta} & \sin^2\frac{\phi}{2} \end{pmatrix}$$

(Compare this projector with Po, discussed in Section 1.2.)
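A brief check of this projector may be useful. The sketch below assumes the standard form of the projector onto the ray through $(\cos\frac{\phi}{2},\ \sin\frac{\phi}{2}\,e^{i\theta})$, with off-diagonal entries $\cos\frac{\phi}{2}\sin\frac{\phi}{2}\,e^{\mp i\theta}$; the sample angles and the function names are hypothetical. It verifies that the matrix is idempotent and Hermitian, and that the squared length of its action on the z+ vector $(1, 0)$ is $\cos^2(\phi/2)$.

```python
import cmath
import math

def projector(phi, theta):
    """Assumed form of P(alpha) for the point alpha = (phi, theta)."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    return [[c * c, c * s * cmath.exp(-1j * theta)],
            [c * s * cmath.exp(1j * theta), s * s]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(n) for j in range(n))

def dagger(m):
    """Conjugate transpose."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

phi, theta = 1.1, 0.6  # arbitrary sample point on the sphere
P = projector(phi, theta)

idempotent = close(matmul(P, P), P)
hermitian = close(dagger(P), P)

# Acting on z+ = (1, 0) picks out the first column of P.
Pz = [P[0][0], P[1][0]]
prob_from_z_plus = sum(abs(x) ** 2 for x in Pz)

print(idempotent, hermitian, prob_from_z_plus, math.cos(phi / 2) ** 2)
```

The last two printed numbers agree, so that on this representation the state z+ assigns the question $\alpha+$ the probability $\cos^2(\phi/2)$, whatever the value of $\theta$.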
The argument so far has shown that, if the probabilities associated with
$\{S_\alpha\}$ conform to assumptions (4.1a - g), then the only possible representation
of $\{S_\alpha\}$ within $\mathbb{C}^2$ will use the mapping given above. But it has not yet been
shown that the probability function given by this representation is the
function $t(\widehat{\alpha\beta}) = p_\alpha(\beta+)$, which actually obtains in quantum theory, still less
that it is the one which must obtain. In the remainder of this section I will
deal with the first of these issues; the second I postpone to Section 4.4.
The subspace $L(\alpha)$ projected onto by $P(\alpha)$ is to represent the experimental
question $\alpha+$. The pure state w such that $p_w(\alpha+) = 1$ can be represented by a
normalized vector in $L(\alpha)$, where

$$P(\alpha)\,z_+ = \begin{pmatrix} \cos^2\frac{\phi}{2} \\ \sin\frac{\phi}{2}\cos\frac{\phi}{2}\,e^{i\theta} \end{pmatrix}$$

whence
Before doing this calculation note that, for each point $\alpha$ on the unit sphere S
of $\mathbb{R}^3$, there will be an operator $S_\alpha$. Although the steps of the calculation are
best performed using the angular coordinates of $\alpha$, in the final stages it is
worth moving to Cartesian coordinates, so that $\alpha = (x,y,z)$, where $x^2 +
y^2 + z^2 = 1$. We set $\phi = 0$ along the z-axis and $\theta = 0$ along the x-axis, as
before.
A wonderfully simple result now presents itself:

(4.7)  $$S_\alpha = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix}$$

The Pauli matrices $S_x$, $S_y$, and $S_z$ appear as special cases of (4.7). In terms of
these matrices we obtain

(4.8)  $$S_\alpha = x S_x + y S_y + z S_z$$
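These relations can be confirmed numerically for any chosen direction. The sketch below makes one assumption beyond the text, namely the conversion from angular to Cartesian coordinates $x = \sin\phi\cos\theta$, $y = \sin\phi\sin\theta$, $z = \cos\phi$, which is the usual one when $\phi = 0$ lies along the z-axis and $\theta = 0$ along the x-axis; the sample angles and the function names are hypothetical. It checks that the matrix with rows $(z,\ x - iy)$ and $(x + iy,\ -z)$ equals $xS_x + yS_y + zS_z$, that its square is the identity (so that its eigenvalues are +1 and -1), and that the vector $(\cos\frac{\phi}{2}e^{-i\theta/2},\ \sin\frac{\phi}{2}e^{i\theta/2})$ is an eigenvector with eigenvalue +1.

```python
import cmath
import math

S_x = [[0, 1], [1, 0]]
S_y = [[0, -1j], [1j, 0]]
S_z = [[1, 0], [0, -1]]

def spin_operator(x, y, z):
    """Matrix with rows (z, x - iy) and (x + iy, -z)."""
    return [[z, x - 1j * y], [x + 1j * y, -z]]

def combine(coeffs, mats):
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(2)]
            for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(m, v):
    return [sum(row[k] * v[k] for k in range(2)) for row in m]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

phi, theta = 0.9, 0.4  # arbitrary sample direction
# Assumed coordinate conversion (phi measured from the z-axis).
x = math.sin(phi) * math.cos(theta)
y = math.sin(phi) * math.sin(theta)
z = math.cos(phi)

S_alpha = spin_operator(x, y, z)
matches_pauli_sum = close(S_alpha, combine([x, y, z], [S_x, S_y, S_z]))
squares_to_identity = close(matmul(S_alpha, S_alpha), [[1, 0], [0, 1]])

vec = [math.cos(phi / 2) * cmath.exp(-1j * theta / 2),
       math.sin(phi / 2) * cmath.exp(1j * theta / 2)]
image = apply(S_alpha, vec)
is_plus_eigenvector = all(abs(a - b) < 1e-12 for a, b in zip(image, vec))

print(matches_pauli_sum, squares_to_identity, is_plus_eigenvector)
```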
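                   z+         z-         x+                           x-                            y+                         y-
$\phi$             0          $\pi$      $\pi/2$                      $-\pi/2$                      $\pi/2$                    $-\pi/2$
$\theta$           0          0          0                            0                             $\pi/2$                    $\pi/2$
$\cos(\phi/2)$     1          0          $1/\sqrt{2}$                 $1/\sqrt{2}$                  $1/\sqrt{2}$               $1/\sqrt{2}$
$\sin(\phi/2)$     0          1          $1/\sqrt{2}$                 $-1/\sqrt{2}$                 $1/\sqrt{2}$               $-1/\sqrt{2}$
$e^{-i\theta/2}$   1          1          1                            1                             $(1-i)/\sqrt{2}$           $(1-i)/\sqrt{2}$
$e^{i\theta/2}$    1          1          1                            1                             $(1+i)/\sqrt{2}$           $(1+i)/\sqrt{2}$
vector             $(1, 0)$   $(0, 1)$   $\frac{1}{\sqrt{2}}(1, 1)$   $\frac{1}{\sqrt{2}}(1, -1)$   $\frac{1}{2}(1-i, 1+i)$    $\frac{1}{2}(1-i, -1-i)$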
4.4 Conclusion
The conditions imposed by (4.1) guarantee that, if a representation of {S_α}
and W_s exists in ℂ², then it is the one which employs the Pauli spin matrices.
Further, if this representation is faithful, then the function t of (4.2) is given
by

(4.9)  t(\widehat{\alpha\beta}) = \cos^2 \tfrac{1}{2}(\widehat{\alpha\beta})

It follows that, unless we can show why this is the only t-function possible,
we have not established that {S_α} must be representable in ℂ². But (4.9)
cannot be derived from (4.1a-g). Any monotone function t_ψ, of the form
(where, as the notation implies, ψ(φ) is a function of φ) is consistent with
these assumptions, provided that
Typical admissible variations of ψ(φ) with φ are shown in Figure 4.4.
As an illustration, consider this whimsical example, proposed by Mielnik
(1968, p. 55; see also Beltrametti and Cassinelli, 1981, pp. 204-207). Imagine
that we have a spherical container, exactly half full of some liquid.
Imagine, further, that the surface of the liquid in the sphere is always a plane
through the sphere's center. This container, we assume, can be divided in
half by a thin partition along any plane through its center, and whenever
this is done we find that all the liquid ends up on one side or other of the
partition; thus the liquid exhibits quantum behavior. Furthermore, the side
of the partition that the liquid moves to is not determined; rather, there is a
certain probability of the liquid's moving to one side of the partition rather
than the other, and this probability depends on the orientation of the
partition to the original surface of the liquid, as follows. If V_L is the (volume of
the) hemisphere originally occupied by the liquid, and V_A is the hemisphere
on side A of the partition, then the probability that all the liquid will be
found on side A of the partition is given by
[Figure: graph with the horizontal axis marked at π/2 and π.]
but instead by
What constraint, then, must we add to (4.1) to guarantee that (4.9) holds?
Well, what is nowhere expressed in the assumptions (4.1) is the sense in
which the members of {S_α} are components of a physical quantity. From
(4.8) we see that, if we assume that {S_α} is representable as a set of Hermitian
operators in ℂ², then these are indeed vector operators, which can be
resolved into components (Messiah, 1958, vol. 2, p. 509). But it's not obvious
how such a relation might be expressible just in terms of the probabilities
that states assign to values of S_α. Clearly such probabilities cannot add
vectorially, on pain of yielding probabilities less than zero.
Figure 4.5
However, a possible condition on expectation values presents itself. We
write ⟨S_α⟩_w for the expectation value of S_α, as in Section 2.4; then
To see this, assume that the system is in a pure state w and that the angular
separation of α and α_w is φ. We now choose a coordinate system such that
α = (φ,0) = (sin φ, 0, cos φ) and α_w = (0,0) = (0,0,1). Then

p_w(α+) = cos² ½(φ)   [Q.E.D.]

[Figure: the z+ axis and the direction α = (φ,0).]
The question posed at the beginning of the chapter now has an answer.
Under the assumptions (4.1) and (4.10), a family {S_α} of observables and a
set W_s of states has a representation in ℂ², and this representation, involving
the Pauli spin matrices, is just that employed in quantum mechanics for the
spin-½ particle. Further, these assumptions are nontrivial; as Mielnik's
example shows, there could be "quantum systems" for which no such
minimal representation was possible.
Two more general conclusions can be drawn. The first is that any inter-
pretation of quantum mechanics must recognize that the theory deals with
families of observables which are knitted together in a way precisely cap-
tured by the Hilbert-space representation. The mutual interdependence of
the members of {S_α} is not a functional interdependence of the kind found in
classical mechanics, but an essentially probabilistic interdependence; the
observables are, in the technical sense, mutually transformable, as defined
in (3.9). Prima facie, any interpretation which invites us to consider them
independently should be mistrusted.
The second is that the way in which the relations between the observables
S_α in quantum mechanics are determined by the symmetries of
three-dimensional physical space typifies the way in which the relations within
any family of mutually transformable observables are determined by un-
derlying symmetries in nature.
5
Density Operators and
Tensor-Product Spaces
When the idea of a mixed state was introduced in Chapter 3, I suggested that
a weighted sum of projectors could represent such a state but postponed the
problem of providing a statistical algorithm. The problem is that of finding a
natural generalization of Equation (2.1):
p_v(A,Δ) = ⟨v | P^A_Δ v⟩
that is, of the equation whereby to each experimental question (A,Δ) the
state assigns a probability.
I will attend to this problem first. In the rest of the chapter I will discuss
the vector-space representation of states of complex systems; when two
hitherto independent systems interact, they behave as one complex system,
and we can represent the states of this complex system, and observables on
it, within a new vector space, the tensor product of the spaces appropriate to
the two component systems.
whence
(5.3) Tr(P) = n
conversely; see Jordan, 1969, S' .6), and all linear operators on a finitely
dimensional vector space are bounded.
The terms statistical operator and density matrix are also used.
From what has been said, any projection operator P projecting onto a ray
of ℋ is a density operator. Further, let {P_i} be a family of projection
operators projecting onto rays of ℋ. Then, by (5.4) and (5.5),
(5.8) D = Σ_i a_i P_i is a density operator, provided (a) 0 ≤ a_i, for each
a_i, and (b) Σ_i a_i = 1. (*)
We see that (5.8) gives us a recipe for constructing density operators from
projectors. But does it also give us a prescription for decomposing a density
operator? Specifically, (i) can we always express a density operator as a
weighted sum of projectors, and (ii) is this decomposition unique?
The answer to (i) is yes. Every density operator D admits a set {a_i} of
eigenvalues. (This is because every density operator is compact: see Fano,
1971, pp. 376, 291.) Assume, for the moment, that there is no degeneracy
(see Section 1.14). From the discussion in the previous section, these
eigenvalues are all positive and add to one, and the spectral decomposition
theorem (1.32) then guarantees that a set {P_i} of projectors exists (each
projector P_i projecting onto a ray containing eigenvectors of D with
eigenvalue a_i) and that D = Σ_i a_i P_i.
Even if there is degeneracy, we can still apply the spectral decomposition
theorem and stipulate that each P_i project onto a ray of ℋ. We will then find
that not all the a_i are distinct, that a_j = a_k, for instance. But all this means is
that some a_i are going to appear more than once in the summation that yields
Σ_i a_i = 1 in clause (b) of (5.8).
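For the two-dimensional case the spectral decomposition can be carried out by hand, and the following sketch does so for one illustrative density matrix. It assumes a nondegenerate D on ℂ² and uses the elementary fact that (D − a′I)/(a − a′) projects onto the eigenray belonging to a when a′ is the other eigenvalue; the matrix entries and function name are made up for illustration.

```python
import math

def spectral_decomposition(d):
    """Return [(a1, P1), (a2, P2)] with D = a1*P1 + a2*P2 for a nondegenerate
    2x2 density matrix D (the two eigenvalues must differ)."""
    tr = (d[0][0] + d[1][1]).real
    det = (d[0][0] * d[1][1] - d[0][1] * d[1][0]).real
    root = math.sqrt(tr * tr - 4 * det)
    a1, a2 = (tr + root) / 2, (tr - root) / 2
    identity = [[1, 0], [0, 1]]

    def proj(a, other):
        # (D - other*I) / (a - other) projects onto the eigenray of a.
        return [[(d[i][j] - other * identity[i][j]) / (a - other)
                 for j in range(2)] for i in range(2)]

    return [(a1, proj(a1, a2)), (a2, proj(a2, a1))]

# Illustrative density matrix: Hermitian, trace 1, positive determinant.
D = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
(a1, P1), (a2, P2) = spectral_decomposition(D)

# Weighted sum of the two projectors, which should reproduce D.
rebuilt = [[a1 * P1[i][j] + a2 * P2[i][j] for j in range(2)] for i in range(2)]
```

The two weights are the eigenvalues of D, they are positive and add to one, and the projectors are onto mutually orthogonal rays, so the weighted sum has exactly the form required by (5.8).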
½P_x+ + ½P_x− = ½P_y+ + ½P_y− = ½P_z+ + ½P_z−
More fundamentally, the very construction employed in (5.8) ensures
that density operators do not, in general, have a unique decomposition. For
in that construction there was no requirement that the rays onto which the
projectors P_i projected were to be mutually orthogonal. Yet we know from
the spectral decomposition theorem that for each D there exists a set {P′_i} of
projectors onto mutually orthogonal rays such that D = Σ_i b_i P′_i. Thus, in
general, we have
\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}

These are, of course, familiar: σ_1 = 2S_x, σ_2 = 2S_y, and σ_3 = 2S_z (see Section
1.7).
Let A be a linear operator on ℂ².
(5.9) If A is Hermitian, then there are real numbers r_1, r_2, r_3, and r_4 such
that
where r_1² + r_2² + r_3² = 1.
Let p_1 and p_2 be the points on the unit sphere of ℝ³ corresponding to the
projectors P_1 and P_2 on ℂ².
Figure 5.1 The set of density operators on ℂ²; D = a_1P_1 + a_2P_2 = b_1P_3 + b_2P_4; P_1 is
orthogonal to P_2.
(5.15) If A is a density operator on ℂ², then A may be written in the form
where r_1² + r_2² + r_3² ≤ 1.
The last two results of this section are included solely on account of their
elegance; they will not be used in what follows.
(5.16) The set of Hermitian operators on ℂ² forms a four-dimensional
vector space over the reals, and {σ_1, σ_2, σ_3, I} forms a basis for this
space.
(5.17) ½Tr(AB) supplies an inner product for this space; with respect to this
inner product, the basis {σ_1, σ_2, σ_3, I} is orthonormal (see Section
1.9). (*)
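Both results lend themselves to a direct check. The sketch below, in plain Python, computes the table of inner products ½Tr(AB) for the four basis operators and then expands an arbitrary Hermitian matrix in that basis; the sample matrix `A` and the helper name `inner` are illustrative assumptions.

```python
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_2 = [[0, -1j], [1j, 0]]
SIGMA_3 = [[1, 0], [0, -1]]
IDENTITY = [[1, 0], [0, 1]]
BASIS = [SIGMA_1, SIGMA_2, SIGMA_3, IDENTITY]

def inner(a, b):
    """The inner product (1/2) Tr(AB) for 2x2 matrices."""
    return 0.5 * sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

# Table of inner products; orthonormality means this is the 4x4 identity.
gram = [[inner(a, b) for b in BASIS] for a in BASIS]

# An arbitrary Hermitian matrix and its four real coefficients in the basis.
A = [[2.0, 1 - 3j], [1 + 3j, -1.0]]
coefficients = [complex(inner(A, b)).real for b in BASIS]

# Reassemble A from its coefficients.
rebuilt = [[sum(r * b[i][j] for r, b in zip(coefficients, BASIS))
            for j in range(2)] for i in range(2)]
```

Because the basis is orthonormal, each coefficient is obtained simply by taking the inner product of A with the corresponding basis operator, just as for components of an ordinary vector.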
then, by definition,
Tr(QP_v) = Σ_i ⟨v_i | QP_v v_i⟩
Using the strategy used to derive (5.2), let us take a basis {v_i} containing v as
one of its members. Then P_v v = v, and P_v v_i = 0 when v_i ≠ v, whence
It follows that, if we represent a pure state by the projector P_v rather than the
vector v, then p_v(A,Δ), the probability that this pure state assigns to (A,Δ), is
given by
For any subspace L and projector P_L we have, using (5.4) and (5.5),
= Σ_j a_j μ_j(L)
Since D is a density operator, the constraints on the a_j are just those we need;
thus to D there corresponds the probability measure μ_D = Σ_j a_j μ_j on the set
S(ℋ) of subspaces of ℋ. To each subspace L of ℋ it assigns the weighted
sum of the probabilities assigned by the pure states P_j according to the
algorithm
Within this equation, the density operator D represents the state of a system.
The use of density operators allows us to give a vector-space representation
to mixed states. Mathematically, these are just appropriately weighted
sums of pure states, so that, for instance, if P_1 and P_2 represent distinct pure
states, then any density operator D = a_1P_1 + a_2P_2 (with a_1 > 0, a_2 > 0, and
a_1 + a_2 = 1) represents a mixed state. We express this fact by saying that the
set of states forms a convex set, of which the extremal points are the pure
states. This geometrical mode of expression seems particularly apt in the
case of ℂ², where the terms convex set and extremal point find a literal
representation. Recall from Section 5.3 that the set of density operators on
ℂ² (that is, the set of all states) can be put into one-to-one correspondence
with the set of points in the unit ball of ℝ³. Within the set of states, the
extremal points, or pure states, represented by projectors onto the rays of
ℂ², are in one-to-one correspondence with the points on the surface of this
ball (in other words, with the points on the unit sphere of ℝ³). Of course,
after the discussion of the spin-½ particle in Chapter 4, this latter fact should
hardly come as a surprise.
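The correspondence with the unit ball can be made concrete. The sketch below assumes the standard form D = ½(I + r_1σ_1 + r_2σ_2 + r_3σ_3), which is consistent with the constraint r_1² + r_2² + r_3² ≤ 1 quoted in Section 5.3 but is stated here as an assumption; the function names are illustrative.

```python
import math

def bloch_vector(d):
    """Point (r1, r2, r3) of R^3 paired with the 2x2 density matrix D,
    assuming D = (I + r1*sigma1 + r2*sigma2 + r3*sigma3) / 2."""
    off_diagonal = complex(d[1][0])          # equals (r1 + i*r2) / 2
    r3 = (complex(d[0][0]) - complex(d[1][1])).real
    return (2 * off_diagonal.real, 2 * off_diagonal.imag, r3)

def radius(r):
    """Distance of the point r from the centre of the ball."""
    return math.sqrt(sum(c * c for c in r))

def mix(a1, p1, a2, p2):
    """The weighted sum a1*P1 + a2*P2 of two 2x2 matrices."""
    return [[a1 * p1[i][j] + a2 * p2[i][j] for j in range(2)] for i in range(2)]

P_Z_PLUS = [[1, 0], [0, 0]]
P_Z_MINUS = [[0, 0], [0, 1]]
P_X_PLUS = [[0.5, 0.5], [0.5, 0.5]]

pure_radius = radius(bloch_vector(P_X_PLUS))                                # 1: on the sphere
mixed_radius = radius(bloch_vector(mix(0.75, P_Z_PLUS, 0.25, P_Z_MINUS)))   # inside the ball
centre_radius = radius(bloch_vector(mix(0.5, P_Z_PLUS, 0.5, P_Z_MINUS)))    # the centre
```

Pure states land on the surface, proper mixtures land strictly inside, and the even mixture of z+ and z− sits at the centre of the ball.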
Let me once more emphasize the distinction between a superposition
and a mixture of two pure states, using, yet again, the example of spin.
Consider the pure states z+ and z− (equivalently, P_z+ and P_z−). We can form a
This particular mixed state, in which the particle is, as we say, completely
unpolarized, is one we shall come across again in future chapters.
The customary interpretation of mixed states used to be the ignorance
interpretation. According to this interpretation, a system in a state D =
a_1P_1 + a_2P_2 was really in some pure state (P_1 or P_2), and the coefficients a_1
and a_2 represented the likelihoods of its being in one or the other; these were
epistemic probabilities, representing our best estimates of the chances.
This interpretation of a mixed state is clearly appropriate to a classical
theory (see Section 3.5), but it is open to two objections in the
quantum-mechanical case. The first stems from the nonuniqueness of decomposition: as
we saw in Section 5.2, any density operator D which is not itself a projector
can be decomposed in an infinite number of ways. Now this may just mean
that our ignorance when we represent a state by D is (vastly) greater than we
had assumed; still, it does seem odd that when we cannot say which are the
possible pure states of a system, we can assign to a particular pair of them
probabilities which add to one. In the case of the unpolarized spin-½ particle,
for instance, can we say that there is a probability of 0.5 that the particle is in
the x+ state and a probability of 0.5 that it is in the x_ state, and that the same
holds true for the y+ state and the y_ state, and for the z+ state and the z_
state, not to mention the nondenumerable infinity of other pairs of states
associated with different directions in space? And this is not merely a diffi-
culty associated with the central point of the set of states; all mixed states
allow an infinite number of decompositions.
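The nonuniqueness is easy to exhibit for the unpolarized state. The following sketch builds the projectors for the x, y, and z directions from the angular form of P(α) used in Chapter 4 (an assumption about notation: φ is measured from the z-axis and θ from the x-axis) and forms the even mixture for each axis; the names are illustrative.

```python
import cmath
import math

def projector(phi, theta):
    """Projector onto the 'spin up' ray for the direction (phi, theta)."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    return [[c * c, c * s * cmath.exp(-1j * theta)],
            [c * s * cmath.exp(1j * theta), s * s]]

def even_mixture(p, q):
    """The density matrix (1/2) P + (1/2) Q."""
    return [[0.5 * p[i][j] + 0.5 * q[i][j] for j in range(2)] for i in range(2)]

half_pi = math.pi / 2
# Each pair holds the projectors for the + and - states along one axis.
pairs = {
    "x": (projector(half_pi, 0.0), projector(half_pi, math.pi)),
    "y": (projector(half_pi, half_pi), projector(half_pi, 3 * half_pi)),
    "z": (projector(0.0, 0.0), projector(math.pi, 0.0)),
}
mixtures = {axis: even_mixture(p, q) for axis, (p, q) in pairs.items()}
```

All three mixtures are the same matrix, half the identity, so the density operator by itself carries no record of which pair of pure states was mixed.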
It may be that the particular decomposition we should consider is in all
cases determined for us by the preparation the system has undergone. If so,
this is a fact that the formal specification of the state fails to reveal. And
there still remains a second, possibly more telling, objection against the
ignorance interpretation, which I will spell out in Section 5.8.
Nonetheless, even though the ignorance interpretation is suspect, the
following remains true.
Assume that we prepare an ensemble of systems in a mixed state D and
that D can be decomposed according to the equation D = Σ_i a_i P_i. Then our
estimate of the relative frequency of any given experimental result from this
ensemble is exactly what we would get if the ensemble consisted of various
subensembles, each in a pure state P_i, and each of these subensembles were
represented in the whole ensemble with relative frequency a_i. This follows
from the fact that, for any projector P,
Tr(DP) = Σ_i a_i Tr(P_i P)
D = aP_a + bP_b
Let D, P_a, and P_b evolve under (5.21) in time t to D′, P′_a, and P′_b, respectively.
Then
striking results. A theorem due to Kadison (1951), effectively the converse
of the result quoted above, shows the consequences of assuming
preservation of convexity: that the convex structure of the set of states is preserved
under dynamical evolution.
Let f_t be a mapping of the set S of density operators on a Hilbert space ℋ
onto itself: f_t : S → S. Then
(5.22) If f_t preserves the convex structure of the set S, then there is a unitary
operator U_t on ℋ (with inverse U_t⁻¹) such that, for every density
operator D in S,
U_t = e^{-iAt}
respectively.
The question arises: does this exhaust the set of possible probability
measures on S(ℋ)? In other words, is every probability measure on S(ℋ)
representable by a density operator? To this question, "The affirmative
answer was assumed by von Neumann, conjectured by Mackey, and
proved by Gleason" (Beltrametti and Cassinelli, 1981, p. 115; see Mackey,
1963; Gleason, 1957).
The formal statement of Gleason's theorem runs as follows.
(5.23) Let μ be any measure on the closed subspaces of a separable (real or
complex) Hilbert space ℋ of dimension at least 3. There exists a
positive self-adjoint operator T of the trace class such that, for all
closed subspaces L of ℋ,
Σ_i f(v_i) = W
onal rays which spans ℋ as representing a set of mutually exclusive and
jointly exhaustive outcomes of a possible experiment; the probabilities
assigned to these rays should therefore add to 1.
A frame function is said to be regular if there exists a self-adjoint
(Hermitian) operator T on ℋ such that, for all normalized vectors v,
f(v) = ⟨v | Tv⟩
It is straightforward ('tr) to show that (5.23) follows from the fact that all
frame functions are regular.
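What it means for f(v) = ⟨v|Tv⟩ to be a frame function can be illustrated in ℝ³. The sketch below picks an illustrative symmetric operator T with trace 1 (the entries are made up) and sums f over two different orthonormal bases; it shows only that a regular function of this form has the same weight on every basis, not the converse, which is the hard part of Gleason's theorem.

```python
import math

# Illustrative symmetric, positive operator on R^3 with trace 1.
T = [[0.5, 0.1, 0.0],
     [0.1, 0.3, 0.05],
     [0.0, 0.05, 0.2]]

def frame_function(v):
    """f(v) = <v|Tv> for a real unit vector v."""
    tv = [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]
    return sum(v[i] * tv[i] for i in range(3))

standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# A second orthonormal basis: the standard one rotated about the third axis.
c, s = math.cos(0.4), math.sin(0.4)
rotated = [(c, s, 0), (-s, c, 0), (0, 0, 1)]

weight_standard = sum(frame_function(v) for v in standard)
weight_rotated = sum(frame_function(v) for v in rotated)
```

Both sums equal the trace of T, which is why an operator of trace 1 yields values that can serve as probabilities for the mutually exclusive outcomes associated with any orthonormal basis.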
The importance of the theorem can be summarized in this way. A quan-
tum-mechanical state gives a simultaneous assignment of probabilities to all
experimental questions involving observables in a given family (for exam-
ple, to all questions involving components of spin). Quantum theory allows
us to represent all members of this family on the same Hilbert space ℋ, and
tells us that certain states are representable by vectors in ℋ. With respect to
these (pure) states, the structure of the set of all these experimental
questions (the structure of the set of quantum-mechanical events) is that
of the set S(ℋ) of subspaces of ℋ. Gleason's theorem tells us what the set of
all possible states on this structure is: it contains just those states which are
representable by density operators on ℋ; they form a convex set with the
pure states as its extremal points.
As we shall see in the next chapter, any straightforward account of the
properties of a quantum-mechanical system is ruled out by this result.
Since the set {v_i^A ⊗ u_j^B} spans ℋ^A ⊗ ℋ^B, this equation defines an inner
product on the whole tensor-product space. In any vector space, |v| = 0 if
and only if v is the zero vector [see (1.21)]; it follows from (5.24) that, for any
v^A ∈ ℋ^A and u^B ∈ ℋ^B,
(5.25) v^A ⊗ 0 = 0 = 0 ⊗ u^B
For our purposes, the details of the construction of ℋ^A ⊗ ℋ^B are not
important (see Jauch, 1968, chap. 11.7; van Fraassen, 1972, pp. 351-362).
But a highly significant result of this construction is that the set of vectors
expressible in the form v^A ⊗ u^B is only a proper subset of ℋ^A ⊗ ℋ^B. In other
words, although every vector in the space we construct is a linear sum of
vectors expressible in the form v^A ⊗ u^B, not every vector in the space is itself
expressible in that form. Thus the tensor product of ℋ^A and ℋ^B is not simply
the Cartesian (or topological) product of ℋ^A and ℋ^B, but includes it as a
proper subset.
Since all vectors in a space are linear sums of the basis vectors, we can
define linear operators in terms of the transformations they effect on the
latter (see Section 1.13). We use this fact to define an operator A^A ⊗ A^B on
ℋ^A ⊗ ℋ^B in terms of the action of linear operators A^A and A^B on ℋ^A and
ℋ^B, respectively, by writing:
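The difference between product vectors and sums of product vectors can be seen already in ℂ² ⊗ ℂ². The sketch below represents a vector of the product space by its four components relative to the product basis and uses a simple determinant test, valid only for this 2 × 2 case, to decide whether it has the form v ⊗ u; the function names are illustrative.

```python
def tensor(v, u):
    """Components of v (x) u in the product basis, ordered 00, 01, 10, 11."""
    return [a * b for a in v for b in u]

def is_product(w, tol=1e-12):
    """A vector w of C^2 (x) C^2 has the form v (x) u exactly when the 2x2
    array of its components has zero determinant."""
    return abs(w[0] * w[3] - w[1] * w[2]) < tol

up, down = [1, 0], [0, 1]

# A vector built directly as a product.
product_vector = tensor([0.6, 0.8], [1, 1j])

# A linear sum of two product vectors (left unnormalized for simplicity).
sum_of_products = [a - b for a, b in zip(tensor(up, down), tensor(down, up))]

product_is_product = is_product(product_vector)     # True
sum_is_product = is_product(sum_of_products)        # False
```

The second vector lies in the tensor-product space, since it is a sum of product vectors, but no choice of v and u reproduces it as a single product.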
Let us make this question more precise. Let D be a density operator on
ℋ^A ⊗ ℋ^B representing a state of the composite system. Assume that the
spectral decompositions of arbitrary Hermitian operators A^A and A^B on ℋ^A
and ℋ^B are given by {P^A} and {P^B}, respectively. The question is now, are
there states D^A and D^B of the component systems which, for all observables
A^A and A^B, and for all Δ and Γ, satisfy the equations below?
In sum, if the composite and component states satisfy (5.27), then:
(5.28a) If the component states are pure (that is, representable by vectors
v^A and u^B), then the composite state is pure and is represented by
v^A ⊗ u^B.
(5.28b) If the component states are mixed, then the composite state is not
uniquely defined by them; in particular, it may sometimes be a pure
state not expressible in the form v^A ⊗ u^B.
(5.28c) Any composite state D defines uniquely two component states, D^A
and D^B.
(5.28d) If (and only if) the composite state is expressible in the form v^A ⊗ u^B
are the component states pure.
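These four points can be illustrated on ℂ² ⊗ ℂ². The sketch below assumes that the component states are obtained from the composite state by the partial trace, the standard construction that meets the requirement of reproducing the probabilities for observables on each component; the function names and the choice of example vectors are illustrative.

```python
import math

def density(w):
    """Projector |w><w| for a normalized vector w of C^2 (x) C^2."""
    return [[w[r] * complex(w[c]).conjugate() for c in range(4)] for r in range(4)]

def reduce_first(d):
    """Component state of the first system: trace out the second factor."""
    return [[sum(d[2 * i + j][2 * k + j] for j in range(2)) for k in range(2)]
            for i in range(2)]

def reduce_second(d):
    """Component state of the second system: trace out the first factor."""
    return [[sum(d[2 * i + j][2 * i + l] for i in range(2)) for l in range(2)]
            for j in range(2)]

def purity(d):
    """Tr(D^2), which equals 1 exactly when D is a pure state."""
    n = len(d)
    return complex(sum(d[i][k] * d[k][i] for i in range(n) for k in range(n))).real

r = 1 / math.sqrt(2)
singlet = [0, r, -r, 0]        # pure, but not of the form v (x) u
product = [1, 0, 0, 0]         # z+ (x) z+

singlet_purity = purity(density(singlet))
singlet_component_purity = purity(reduce_first(density(singlet)))
product_component_purity = purity(reduce_second(density(product)))
```

The product state has pure components, whereas the singlet state, although itself pure, has components with purity ½: each is the completely unpolarized mixture.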
II
The Interpretation of
Quantum Theory
6
The Problem of Properties
p_D(A,Δ) = Tr(D P^A_Δ)
observables change over time as the system's state changes, but at any time
a measurement of any quantity will (ideally) yield a value within any
desired range of accuracy. A specification of the state gives us these values; as
we saw, the classical state w acts as a two-valued function on the set of pairs
(A,Δ): when w(A,Δ) = 1, the system possesses the property in question;
when w(A,Δ) = 0, it does not.
In this way classical mechanics allows us to preserve certain elements of
the ontological structure of the world first enunciated in Aristotle's Catego-
ries. * Where Aristotle had talked of "substance" and "quantity," in classical
mechanics we speak of "system" and "property." The question this chapter
addresses is whether these categorial elements can be preserved in an inter-
pretation of quantum theory.
In the discussion of quantum theory in Chapter 2, a pair (A,Δ) was
described as an "experimental question." But what exactly does such a
question ask? In classical mechanics too, the pair can be thought of as a question:
it asks of a system whether it has the property (A,Δ), to which the state gives
the answer yes or no. The functions defined by the states of quantum
theory, however, are not two-valued; their values lie anywhere in the
interval [0,1]. Nor do classical states (states, that is, which assign to every
question either a yes or a no) emerge as special cases. In any theory which
uses the full representational capacity of a Hilbert space, there will be
questions represented by incompatible subspaces to which no state
simultaneously assigns the limiting values 1 or 0. Thus there will be no
dispersion-free states. This is easily seen geometrically. Consider, for example, (S_z,+)
and (S_x,+). As we saw in Section 4.2, we can represent these experimental
questions, together with a selection of states (including z+, z−, x+, and x−),
in ℝ² (Figure 4.3). Clearly, any vector lying in, or at right angles to, the (S_z,+)
ray will be at 45° to the (S_x,+) ray. But these are the only vectors which
assign limiting values to (S_z,+), and they all assign a probability of ½ to
(S_x,+). In fact, imagine the state vector v moving round the representation
space ℝ². Then p_v(S_z,+) = cos²ψ, but p_v(S_x,+) = cos²(ψ − π/4), and we see
that each probability approaches a limiting value only when the other
approaches ½ (Figure 6.1). This holds even if we move to ℂ², for none of the
additional states representable in ℂ² but not in ℝ² assigns a limiting value to
either question.
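The geometrical point can be checked with a few lines of arithmetic. The sketch below takes the state vector v = (cos ψ, sin ψ) in ℝ², with the (S_z,+) ray along (1, 0) and the (S_x,+) ray along (1, 1)/√2, and computes each probability as the squared inner product; the variable names are illustrative.

```python
import math

Z_PLUS = (1.0, 0.0)
X_PLUS = (1 / math.sqrt(2), 1 / math.sqrt(2))

def probability(psi, ray):
    """Squared inner product of v = (cos psi, sin psi) with a unit vector."""
    v = (math.cos(psi), math.sin(psi))
    return (v[0] * ray[0] + v[1] * ray[1]) ** 2

# The angles at which p(S_z,+) takes a limiting value, 0 or 1.
limiting_angles = [k * math.pi / 2 for k in range(4)]
values_at_limits = [(probability(psi, Z_PLUS), probability(psi, X_PLUS))
                    for psi in limiting_angles]

# Agreement with the closed forms quoted in the text, at an arbitrary angle.
psi = 0.3
closed_form_gap = abs(probability(psi, X_PLUS) - math.cos(psi - math.pi / 4) ** 2)
```

At every angle where the probability of (S_z,+) is 0 or 1, the probability of (S_x,+) is ½, so no vector gives limiting values to both questions at once.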
For observables with a continuous spectrum, the situation is even more
* In Categories 6 Aristotle suggests that the only quantities of substance are position, length,
area, and volume, but in Physics IV.14 locomotion (speed) also appears as a quantity. These
works are included in Aristotle (1984), among many other editions.
Figure 6.1 Probabilities of (S_z,+) and (S_x,+) for the state α+, where α = (φ,0), as φ varies
from 0 to π.
pv(Q,[a,b)) = 1
is the set ℝ itself (Busch and Lahti, 1985; see also Section 9.1).
In quantum theory the dispersion principle holds: there are no dispersion-
free states (see Section 9.1). But neither the claim that the pairs (A,Δ)
represent properties nor the claim that individual systems possess a full range of
such properties is necessarily at odds with this principle. Imagine the
following hypothetical situation. At all times each observable for a system has
a well-defined value. Thus, for any putative property (A,Δ) at any juncture,
either the system has that property or it does not. Our present theory,
however, can only predict the probability that a given system has the prop-
erty in question; as a description of reality the theory is incomplete. If, in this
situation, we were to rest content with the theory we had, then there would
be serious and systematic limitations to our knowledge of the world. On
Einstein's view, this is just the situation in which we are placed by quantum
mechanics.
If, without in any way disturbing a system, we can predict with certainty (i.e., with
probability equal to unity) the value of a physical quantity, then there exists an element of
physical reality corresponding to this physical quantity. (P. 777)
I will call this the EPR criterion for physical reality. The quotation above
makes it clear that the "elements of physical reality" they are concerned
with are values of physical quantities. These are thought of as properties
(A,a) of systems, as in classical mechanics, and (on our account) are
representable by subspaces of a Hilbert space. A necessary condition for the
completeness of a theory, EPR says, is that "every element of the physical
reality must have a counterpart in the physical theory" (p. 777).
What Einstein, Podolsky, and Rosen now claim about position and
momentum applies equally well to the two noncommuting observables S_z and
S_x for the spin-½ particle:
If both of them had simultaneous reality-and thus definite values-these values
would enter into the complete description, according to the condition of complete-
ness. If then the wave function provided such a complete description of reality, it
would contain these values . .. (P. 778)
As we have seen, the spin state vector cannot "contain" the values of S_z and
S_x simultaneously. However, the fact that they both can't enter at one time
into the kind of description which the state vector provides may just indicate
that they cannot have simultaneous reality. We could say, for instance, that
in the state z+ the particle has the property (S_z,+); the value of S_z is
predictable with certainty, and so there is an element of reality corresponding to it.
However, we could also say that, in this state, the particle has neither the
property (S_x,+) nor the property (S_x,−), that neither of these properties
constitutes an element of reality.
Einstein, Podolsky, and Rosen saw that the fact that quantum mechanics
admits no dispersion-free states does not, on its own, tell us whether the
theory is complete or not. As they write,
From [the dispersion principle] it follows that either (1) the quantum-mechanical
description of reality given by the wave function [in our terminology the state vector] is
not complete or (2) when operators corresponding to two physical quantities do not
commute the two quantities cannot have simultaneous reality. (P. 778)
Now it may be surprising that, by using the theory itself, one could ever be
led to embrace alternative (1) of this disjunction. Although the EPR criterion
is only a sufficient condition for the ascription of reality, if this is the only
criterion we have, then what we regard as real will be limited by what we
can predict with certainty. But these predictions are provided by the theory.
How can a theory fail to predict with certainty something which it predicts
with certainty?
tions, to generate predictions about the other. These predictions have
probability one, and so, according to the EPR criterion, properties of the second
particle acquire the status of elements of reality. Furthermore, since we may
choose what measurement to carry out on the first particle, such predictions
can be made about either of two incompatible observables. But it is implau-
sible that the reality of a property of the second particle depends on what
measurement is carried out on the first; hence values of both of these
observables should be considered elements of reality. Since this contradicts
alternative (2) of the EPR disjunction, we are therefore led to alternative (1): the
quantum-mechanical description of reality is not complete.
In the thought experiment the paper describes, the incompatible
observables in question are position and momentum. I will describe an analogous
experiment suggested in 1951 by Bohm, in which the observables are
different components of spin of the spin-½ particle.
It is possible to prepare pairs of particles, such as an electron-positron
pair, whose total spin in any direction is zero. If the pair then separates,
theory suggests that if, for instance, an Sz experiment is carried out on each
system, then the results will always be opposite in sign: if the result of
measuring Sz on the electron is +, then on the positron it will be -, and vice
versa. The same holds for all directions in space (that is, for Sx, Sy, and so on),
provided that both experiments measure the same component of spin.
It's worth sketching the formalism by which quantum mechanics reaches
this result; the general result, Equation (6.1), will be important later. We
represent the spin state of a single spin-½ particle on a two-dimensional
complex space; call it ℋ. States of the composite system, electron +
positron, will be represented in the tensor-product space ℋ^e ⊗ ℋ^p of two
such spaces (see Section 5.7). Now let v+ and v− be the eigenvectors for
some component of spin S^e_α for the electron, and let u+ and u− be the
eigenvectors of the same component of spin, S^p_α, for the positron. The singlet
spin state in which the system is prepared is given by
The intriguing thing about this state is that it is independent of the direction
α; that is, we get the same vector in ℋ^e ⊗ ℋ^p no matter what component of
spin we choose to work with, provided only that we choose the same
component for both systems. Compare this with the single system, for
which
Ψ gives us a specification of the state of the composite system. To measure
an observable on the composite system we can perform an experiment on
each of the component systems; for instance, we may measure S^e_α on the
electron and S^p_β on the positron. Such a (joint) observable is represented by
the operator S^e_α ⊗ S^p_β on ℋ^e ⊗ ℋ^p. The probabilities computed by using the
standard quantum-mechanical algorithm on the tensor-product space are
joint probabilities, the probability, for instance, that a measurement of S^e_α on
the electron will yield + and that a measurement of S^p_β on the positron will
also yield +. It turns out that, for the singlet spin state, this joint probability
is given by
(6.1) (**)
where \widehat{\alpha\beta} is the angle between the directions α and β.
Notice that when \widehat{\alpha\beta} = 0 (when α and β coincide) there is zero probability
that both measurements will yield +; this is exactly in line with what was
said earlier, that if the result of measuring S^e_α (say) is +, then the result of
measuring S^p_α must be −. In fact we have, for any direction α,

(6.2) p_Ψ[(S^e_α,+),(S^p_α,−)] = ½ = p_Ψ[(S^e_α,−),(S^p_α,+)]

Effectively, in these cases \widehat{\alpha\beta} = 180°.
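The joint probabilities for the singlet state can be computed directly. The sketch below assumes the usual form of the singlet vector, (z+ ⊗ z− − z− ⊗ z+)/√2, and the 'spin up' vector (cos(φ/2), sin(φ/2)e^{iθ}) for the direction (φ,θ); it compares the result with ½ sin²(½\widehat{\alpha\beta}), the closed form that the two limiting cases in the text suggest, which is treated here as an assumption. The function names are illustrative.

```python
import cmath
import math

def plus_vector(phi, theta):
    """Normalized 'spin up' vector for the direction (phi, theta)."""
    return [math.cos(phi / 2), math.sin(phi / 2) * cmath.exp(1j * theta)]

def joint_plus_plus(alpha, beta):
    """Probability that S_alpha on the electron and S_beta on the positron both
    yield +, for the singlet state (z+ (x) z- - z- (x) z+) / sqrt(2)."""
    v, u = plus_vector(*alpha), plus_vector(*beta)
    # The true amplitude is the complex conjugate of this expression, which
    # has the same modulus.
    amplitude = (v[0] * u[1] - v[1] * u[0]) / math.sqrt(2)
    return abs(amplitude) ** 2

def unit_vector(direction):
    phi, theta = direction
    return (math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi))

def angle_between(alpha, beta):
    """Angle between two directions, from the dot product of unit vectors."""
    dot = sum(a * b for a, b in zip(unit_vector(alpha), unit_vector(beta)))
    return math.acos(max(-1.0, min(1.0, dot)))

def opposite(direction):
    """The direction whose + outcome is the - outcome of the given one."""
    phi, theta = direction
    return (math.pi - phi, theta + math.pi)

alpha, beta = (0.8, 0.3), (2.1, 1.9)            # arbitrary illustrative directions
same_direction = joint_plus_plus(alpha, alpha)            # both + along alpha
plus_minus = joint_plus_plus(alpha, opposite(alpha))      # + and - along alpha
general = joint_plus_plus(alpha, beta)
closed_form = 0.5 * math.sin(angle_between(alpha, beta) / 2) ** 2
```

With the same direction on both sides the probability of a double + vanishes, and the probability of + on the electron together with − on the positron is ½, in agreement with (6.2).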
independently of what happens to the electron once the pair has separated.
In particular they are assumed to exist independently of the fact that we
perform measurements upon it. Notice that although certainty of prediction
is a sufficient condition for ascription of reality, what exists is not to be
identified with what we can predict. This lifts the paradox we met at the end
of Section 6.2: there is no suggestion that we can predict with certainty the
values, for example, of both S^p_α and S^p_β at the same time. For any given pair,
we can choose to perform either an S^e_α or an S^e_β experiment. Each of these
experiments would reveal an element of reality associated with the positron.
It is because (if locality obtains) our choice will not disturb the positron in
any way that we can claim that both these elements of reality exist
simultaneously. In the words of EPR, to make "the reality [of S^p_α and S^p_β] depend on
the process of measurement carried out on the first system, which does not
disturb the second system in any way" is something that "no reasonable
definition of reality could be expected to permit" (EPR, 1935, p. 780).
The summary I have given departs from EPR, not only by reworking the
argument in terms of spin components as Bohm suggested, but also by
putting it in terms of incompatible properties of the second particle, whereas
EPR assigns it two distinct states. (I discuss EPR in terms of states in Chapter
8; see also Beltrametti and Cassinelli, 1981, pp. 69-72.) I have rewritten it in
this way partly to emphasize that the argument, if valid, does not convict
quantum theory of internal inconsistency. Nor was that its aim. As will
appear, there are other deep problems which the EPR experiment raises, but
here I have been concerned to bring out the thesis argued by the original
authors, that we can regard quantum mechanics as complete only at the cost
of abandoning a particular, and appealing, account of physical reality.*
* For a detailed analysis of EPR, see Hooker (1972); for a full account of responses to it, see
Jammer (1974, chap. 6).
state vector or density operator as applicable to an ensemble of similarly
prepared systems, rather than to an individual system (Ballentine, 1970).
The term ensemble is borrowed from statistical thermodynamics; it refers
to a conceptual entity: a set of similarly prepared particles. As Ballentine
(1970, p. 361) points out, this should not be confused with a beam of
particles, whose individual members may well interact with each other.
On this interpretation, the state description provides statistical informa-
tion about such ensembles; a natural, though not necessary, concomitant of
this is the view that quantum mechanics is a classical statistical theory, in
that the probabilities yielded by the state vector give the relative frequencies
of occurrence of properties among the members of the ensemble. If, for
example, an ensemble of spin-½ particles were in the z+ state, so that
$p(S_x,+\tfrac12) = p(S_x,-\tfrac12) = \tfrac12$, then half of the members of the ensemble would
have the property $(S_x,+\tfrac12)$ and half the property $(S_x,-\tfrac12)$. Which property
any particular system had would be revealed upon measurement.
It is clear that, on this interpretation, the description of individual systems
offered by quantum mechanics is invariably less than complete.
The view I have sketched here has three components, which can be called
the Precise Value Principle (PVP), the Relative Frequency Principle (RFP),
and the Faithful Measurement Principle (FMP). (I use the nomenclature of
Healey, 1979, here, and the general direction of this chapter is closely
aligned with that of his paper. RFP is implicit in his account, though not
explicitly stated.) According to PVP, whatever the state of a system (or, more
properly, of the ensemble containing the system), each observable has a
precise value for the individual system. According to RFP, the quantum-
mechanical statistics represent the relative frequency of occurrence of these
values within the ensemble. FMP suggests that every successful measure-
ment reveals the (preexisting) value of that observable for the particular
system under test. FMP thus tells us that, if the value a of an observable A
occurs in an ensemble with relative frequency n, then (ideal) measurements
of A will yield that value with the same frequency.* Thus the measured
frequencies coincide with the existing frequencies of particular values, pro-
vided, that is, that the measured sample can be thought of as a genuine
ensemble.
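The way the three principles fit together can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the helper names `make_ensemble` and `measure`, the ensemble size, and the choice of the observable $S_x$ with values ±½ are illustrative and do not come from the text.

```python
import random
from collections import Counter

def make_ensemble(n, seed=0):
    """Build a toy ensemble of n systems (hypothetical helper).

    PVP: every system carries a definite S_x value before measurement.
    RFP: the values occur with the quantum-mechanical frequencies,
    here 1/2 each, as for spin-1/2 particles in the z+ state.
    """
    rng = random.Random(seed)
    values = [+0.5] * (n // 2) + [-0.5] * (n - n // 2)
    rng.shuffle(values)
    return values

def measure(system_value):
    # FMP: an ideal measurement simply reveals the preexisting value.
    return system_value

ensemble = make_ensemble(1000)
outcomes = Counter(measure(v) for v in ensemble)
measured_freq = {value: count / len(ensemble) for value, count in outcomes.items()}
print(measured_freq)
```

On these assumptions the measured frequencies coincide with the existing ones by construction, which is exactly the link that FMP is meant to supply.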
Elements of this view are to be found in the work of Einstein and of
Popper. Certainly, both believed that the quantum-mechanical formalism
applied to ensembles of systems, and both espoused PVP. (See, for example,
Einstein, 1948; Popper, 1982; Ballentine, 1972.) And, as Healey points out,
without FMP, PVP has little empirical content. Note, however, that Popper
* In an acidulous footnote Fine (1979, p. 152) disputes this correlation, but his objection to it
seems, instead, to be a rejection of FMP.
(1982, pp. 64-74) did not interpret probabilities as relative frequencies,
preferring instead a propensity interpretation.
Independently of any cachet bestowed by its pedigree, the statistical
interpretation is prima facie a very plausible and attractive view of quantum
theory. Unfortunately it cannot be maintained, at least not in the simple
form in which I have presented it.
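natural extensions of those used for the 2 × 2 matrices of $\mathbb{C}^2$ (see Section
1.6). The analogues of the Pauli spin matrices for the x-, y-, and z-components
of spin are

$$S_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix} \qquad S_y = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix} \qquad S_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

We see that

$$S_x^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad S_y^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \qquad S_z^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

The operators $S_x$, $S_y$, and $S_z$ do not commute with each other; like the Pauli
matrices, they obey a cyclic commutation relation (see Section 1.7). The
operators $S_x^2$, $S_y^2$, and $S_z^2$, on the other hand, commute with each other. Each
of them has eigenvalues 0 and 1, and so these are the possible values of the
observables they represent. Their sum is given by

$$S_x^2 + S_y^2 + S_z^2 = 2I$$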
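The algebraic claims about the spin-1 operators can be checked numerically. The sketch below assumes one particular matrix representation, the Cartesian one with entries $(S_k)_{jl} = -i\epsilon_{kjl}$; this choice of basis is an assumption made for illustration and may differ from the representation printed in the book. The helper names are likewise hypothetical.

```python
def matmul(a, b):
    # Plain 3 x 3 matrix product on nested lists.
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def sub(a, b):
    return [[a[r][c] - b[r][c] for c in range(3)] for r in range(3)]

def add3(a, b, c):
    return [[a[r][k] + b[r][k] + c[r][k] for k in range(3)] for r in range(3)]

def scale(s, a):
    return [[s * a[r][c] for c in range(3)] for r in range(3)]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Assumed representation: (S_k)_{jl} = -i * epsilon_{kjl}.
i = 1j
Sx = [[0, 0, 0], [0, 0, -i], [0, i, 0]]
Sy = [[0, 0, i], [0, 0, 0], [-i, 0, 0]]
Sz = [[0, -i, 0], [i, 0, 0], [0, 0, 0]]

Sx2, Sy2, Sz2 = matmul(Sx, Sx), matmul(Sy, Sy), matmul(Sz, Sz)

# The components do not commute: [Sx, Sy] = i Sz (cyclic relation).
cyclic = commutator(Sx, Sy) == scale(i, Sz)

# The squares commute pairwise.
squares_commute = all(commutator(a, b) == ZERO
                      for a, b in [(Sx2, Sy2), (Sy2, Sz2), (Sz2, Sx2)])

# The squares sum to twice the identity.
sum_is_2I = add3(Sx2, Sy2, Sz2) == scale(2, IDENTITY)

print(cyclic, squares_commute, sum_is_2I)
```

In this representation each squared operator is diagonal with entries 0 and 1, so its eigenvalues can be read off directly.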
such functional relations among compatible operators are defined on just
this basis. Kochen and Specker address the first assumption by proposing an
experiment which would yield values to the observable represented by

$$K = aS_x^2 + bS_y^2 + cS_z^2$$

The system they consider is an atom of orthohelium. Thus they establish not
only that $2I$ represents a genuine observable (when a = b = c), but also that
$S_x^2$, $S_y^2$, and $S_z^2$ are actually commeasurable as well as being compatible. For
the possible values of K (its eigenvalues) are a + b, b + c, and c + a, which
will be distinct provided that a, b, and c are. From our second assumption,
these values correspond to the cases when $S_z^2$, $S_x^2$, and $S_y^2$, respectively, have
value 0.
There remains the question of the uniqueness of the observables represented
by the S matrices (by $S_x^2$, for example), but I will defer discussion of
this until Section 6.8.
I will give the impossibility proof in an elegant version due to Friedberg
(first published in Jammer, 1974, p. 325).
Let us assume (A): We can assign a value of 0 or 1 to each point on a sphere
in such a way that, of any orthogonal triple of points, just one receives value
0. Call such an assignment an A-assignment. We then show: (I) There is an
angle $\phi$ such that, if any point p on the sphere receives value 0 on an
A-assignment, then so does any point q at an angular distance $\phi$ from p. (II) If
one point on the sphere receives value 0 on an A-assignment, then, from (I),
so do all the others. But (II) contradicts our original assumption; it follows
that no A-assignment exists.
In what follows, our notation shows A-assignments assigning values to
vectors rather than to points on the unit sphere; for example, we understand
by v(x + y) the value given by an A-assignment to the point q on the sphere
where it is pierced by the vector x + y (in its positive direction).
To show (I): Consider an orthonormal triple of vectors, {x,y,z}, from the
center of the sphere. From this triple we generate two more orthogonal (but
not normalized) triples of vectors: {x + y, x - y, z}, {x + z, y, x - z}. We
now show that there is no A-assignment v such that

(6.3) $v(x + y) = 1 = v(x - y)$
(6.4) $v(x + z) = 1 = v(x - z)$

For such an assignment would yield, from (6.3), v(z) = 0 and, from (6.4),
v(y) = 0, thus violating assumption (A).
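This step of the argument is small enough to check by brute force. The sketch below enumerates every 0/1 assignment to the seven points involved, keeps those that give exactly one 0 to each of the three orthogonal triples, and confirms that none of them gives value 1 to all four of x + y, x - y, x + z, x - z. Reading (6.3) and (6.4) as these two conditions is an assumption based on the surrounding text, and all function names are illustrative.

```python
from itertools import combinations, product

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# The orthonormal triple and the two orthogonal triples generated from it.
triples = [
    (x, y, z),
    (add(x, y), sub(x, y), z),
    (add(x, z), y, sub(x, z)),
]

# Each triple really is mutually orthogonal.
orthogonal = all(dot(u, v) == 0 for t in triples for u, v in combinations(t, 2))

points = sorted({p for t in triples for p in t})

def respects_A(v):
    # Assumption (A): exactly one member of each triple receives 0,
    # so the three values in each triple sum to 2.
    return all(sum(v[p] for p in t) == 2 for t in triples)

valid = []
for values in product((0, 1), repeat=len(points)):
    v = dict(zip(points, values))
    if respects_A(v):
        valid.append(v)

# Assignments that would satisfy both conditions at once.
forbidden = [v for v in valid
             if v[add(x, y)] == 1 == v[sub(x, y)]
             and v[add(x, z)] == 1 == v[sub(x, z)]]

print(len(points), len(valid), len(forbidden))
```

The enumeration only covers these seven points; it illustrates the local step, not the full impossibility proof over the whole sphere.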
$v[(y + z) + x] = 0 = v[(y + z) - x]$
Figure 6.2
$\phi$, then the required sequence is $\{p, \ldots, p_i, q\}$. If the angular separation of
$p_i$ from q is less than $\phi$, then, by continuity, there is a $p_k$ whose angular
separation from both $p_i$ and q is equal to $\phi$ (see Figure 6.2), and the required
sequence is $\{p, \ldots, p_i, p_k, q\}$.
This concludes the proof.
others th ' valu(' () 1" '1'111 '/ yH 1I '1lI do '$ not have the prop rties (A /,a !), (A j ,a 2 ),
etc.]. The impossibility proof in the previous section showed that in a
three-dimensional real space we cannot assign the values 0, 1, and 1 consistently
to each orthogonal triple of rays; trivially, we cannot assign the values
1, 0, and 0, either. The proof extends straightforwardly to complex spaces
within which there are orthogonal triples of vectors, that is, to any space of
dimension three or greater. The crucial condition on assignments, the condition
impossible to fulfill, is that, of any mutually orthogonal set of rays
spanning the space, exactly one be assigned the value 1 while the others are
all assigned 0. In this extended proof, we replace talk of angular separation
of points on the sphere by formulations involving the inner product of two
vectors. (Recall that, in $\mathbb{R}^3$, $\langle x|y\rangle = |x| \cdot |y| \cdot \cos\theta$.)
Alternatively, the (generalized) impossibility proof can be viewed as a
corollary of Gleason's theorem (see Section 5.6). For assume a function f
exists mapping all rays of a Hilbert space $\mathcal{H}$ onto {0,1}, which has value 1 for
exactly one ray of each set of mutually orthogonal rays which span $\mathcal{H}$. Such
a function would, in Gleason's terminology, be a frame function, and from
his theorem it follows that, provided $\mathcal{H}$ has dimensionality higher than two,
there exists a density operator D on $\mathcal{H}$ such that

$$f(\alpha) = \mathrm{Tr}(DP_\alpha)$$

$$1 = \mathrm{Tr}(b_iP_i) = b_i\mathrm{Tr}(P_i) = b_i$$

But D is a density operator; we have $b_j \ge 0$ for all j, and $\sum_j b_j = 1$. Thus D is
the projection operator $P_i$, and hence, for any ray $\alpha$ in $\mathcal{H}$ distinct from i,
$P_\alpha \ne P_i$, and so
(Recall from Section 5.4 that $\mathrm{Tr}(P_iP_\alpha) = \langle v_\alpha|P_iv_\alpha\rangle$, where $v_\alpha$ is a normalized
vector in $\alpha$.)
Hence i is the only ray in $\mathcal{H}$ assigned 1 by f, contrary to our assumptions. It
follows that no such function exists.
All the proofs given here are open to the following objection. They make
the assumption that a full (and hence nondenumerable) set of observables
exists for the space $\mathcal{H}$ (as does the version given by Bell; see Bub, 1974, pp.
69-70). Now, as we saw in Section 3.9, while this assumption may be
well-founded for the space $\mathbb{C}^2$ of the spin-½ particle, it may not be true in
general in quantum theory. Kochen and Specker's own proof, on the other
hand, makes no such assumption. They show that the required mapping
fails in three-dimensional space for a set of triples involving only 117 points.
As they point out, this avoids the objection that "it is not meaningful to
assume that there are a continuum number of quantum mechanical proposi-
tions" (Kochen and Specker, 1967, p. 70).
Now

$$\tfrac12 \sin^2 \tfrac12(\widehat{ac}) = \tfrac12 \sin^2 60^\circ = \tfrac38$$
sumed (explicitly or implicitly) or entailed by their argument. Collectively
we can refer to them as the assumption of "local realism."
Bell's result gives a surprising turn to discussions of EPR. It was never
suggested by Einstein, Podolsky, and Rosen that quantum theory was wrong
in its predictions, but rather that it failed to satisfy a particular criterion of
completeness. But it now appears that to accept their conclusion is to make
certain assumptions which are actually inconsistent with quantum theory.
Thus, if we test the theory's predictions for coupled systems, we are also,
surprisingly, testing a cluster of metaphysical assumptions. For, should the
theory's predictions be confirmed, and the Bell-Wigner inequality be vio-
lated, this would offer a severe challenge to these assumptions; one might
even be tempted to say that they were falsified.
I return to this topic in Chapter 8, but a few preliminary remarks are in
order. Since the difference between what quantum theory predicts and
what the Bell-Wigner inequality demands was first pointed out, a number of
experiments have been performed to see whether the inequality holds (see
Clauser and Shimony, 1978; d'Espagnat, 1979). The results, though not
unanimous, have largely borne out the predictions of quantum theory; we
may take the evidence of those favorable to quantum theory as particularly
significant, since the requirement that certain predictions are precisely real-
ized is more stringent than the requirement that a certain inequality obtains.
The consensus of opinion is that these results have been a remarkable test of
the theory, which it has survived.
procedures, or these quantities may be in principle inaccessible (see Jammer,
1974, p. 267).
The suggestion that there may be such "hidden variables" is as old as the
probabilistic interpretation of the state vector. It was made by Born (1962b,
p. 825) a few months after he first proposed that interpretation: "Anyone
dissatisfied with these ideas may feel free to assume that there are additional
parameters not yet introduced into the theory which determine the individual
event." But almost as old is the denial that such hidden variables can
exist. By considering sequences of experiments like the sequence VH, VHV,
and so on described in the Introduction, von Neumann was led to believe
that the existence of hidden variables would contradict quantum theory.
For, on a natural account of hidden variables, these experiments would act
as quantum theory tells us they cannot, that is, as a sequence of filters which
would eventually yield a homogeneous beam; the value of the hidden
variable would be the same for all its members, and it would be incapable of
being split further (see Jammer, 1974, p. 267).
Von Neumann's book, The Mathematical Foundations of Quantum Theory
(1932, chap. 4), contains the first "no-go" theorem for hidden-variable
theories (henceforth "HV theories"). A "no-go" theorem is a theorem to
show that no HV theory which satisfies certain constraints can reproduce
the quantum mechanical statistics.
The constraints suggested by von Neumann have since been challenged
as overly stringent, and the theorems of Kochen and Specker and of Bell are
now considered much more decisive. Although a survey of HV theories
would take us too far afield, I will indicate the kinds of HV theories which
these two theorems disallow. (For a survey see Belinfante, 1973, or Jammer,
1974, chap. 7; Bub, 1974, has a good discussion of certain no-go theorems.)
Kochen and Specker ask whether it is possible to construct a classical
phase space $\Omega$, involving hidden variables, which allows a "reconstruction"
of the quantum statistics. Recall from Chapter 2 that, in a classical theory, a
physical quantity A is represented as a real-valued function $f_A : \Omega \to \mathbb{R}$ on
the phase space. Kochen and Specker require that the algebraic relations
obtaining among quantum-mechanical observables are preserved in the
algebra of these real-valued functions on $\Omega$.
The relations they consider are just those involving compatible (they
write "commeasurable") operators on the quantum-mechanical Hilbert
space $\mathcal{H}$; to use a term we shall meet in Chapter 7, they require that the
partial algebra of Hermitian operators on $\mathcal{H}$ be embeddable in the set $\mathbb{R}^\Omega$ of
functions from a classical phase space $\Omega$ to the reals. It turns out that a
necessary condition for this embedding is that a mapping exists of the rays
of $\mathcal{H}$ (equivalently, the projectors onto these rays) onto {0,1} such that, of
any mutually orthogonal set of rays spanning $\mathcal{H}$, exactly one ray receives
value 1. But, as we saw in Sections 6.5 and 6.6, there are no such mappings.
Hence no HV theory satisfying their requirements is possible.
To see exactly what kind of HV theory this rules out, we need to examine
the assumptions Kochen and Specker make. I drew attention to these assumptions
in Section 6.5. One of them in particular might be questioned,
namely the assumption that a Hermitian operator on a Hilbert space represents
a unique observable. The proof rests on the requirement that, if $\alpha$, $\beta$,
and $\gamma$ are any three directions in physical space, then of the observables $S_\alpha^2$,
$S_\beta^2$, and $S_\gamma^2$, two must be given value 1 and the third 0. It is then assumed that
if we assign to $S_z^2$, say, the value 0 when we encounter it as a member of the
triple $S_x^2$, $S_y^2$, $S_z^2$ (for example, when we measure $S_x^2 + S_y^2 + S_z^2$), then it
must also be given value 0 when it is viewed as a member of the triple $S_{x'}^2$,
$S_{y'}^2$, $S_z^2$, where x' and y' are directions in space different from x and y. It is
assumed, in other words, that the value to be assigned to $S_z^2$ is not contextual.
A contextualist HV theory would not require this consistency of assignment
to $S_z^2$. On such a theory, a Hermitian operator which belonged to more
than one set of mutually compatible operators would not be taken to represent
one single observable. Gudder (1970) has shown that (provided we
restrict ourselves to a single system) we can always, as it were, piece together
HV theories, each dealing with a mutually compatible set of Hermitian
operators, and thus produce a contextual HV theory.
Gudder's theorem shows no more than the mathematical possibility of
producing such a theory, nor did he claim more for it. It gives no physical
grounding for one, and indeed one may think that the move to a contextual
theory has sapped the project of much of its motivation.
This reservation apart, it's important to note the restriction to a single
system. For if Kochen and Specker's result limits us to contextualist HV
theories, then Bell's theorem limits us to nonlocal ones (as does a result by
Stairs, 1983b, which applies an argument like Kochen and Specker's to
coupled systems). A local theory is one in which the hidden variables describing
spatially separated systems are independent of one another. However,
as soon as we seek an HV theory to deal with composite systems, we
are faced with the correlations typical of EPR-type experiments. By abandoning
Einstein's assumption that spatially separated systems are independent
of each other, and appealing to interactions between the systems
concerned, it may be possible to reproduce the quantum-mechanical predictions
for these experiments. However, it's not possible to reproduce them by
recourse to a classical probability space, and a fortiori not by recourse to a
classical probability space wherein such frequencies appear as relative
frequencies of classical states.
The theorem has implications extending beyond the topic of HV
theories, and I discuss these implications in Chapter 8. The conclusion to be
drawn from it in this section is that no local HV theory for quantum
mechanics is possible.
To sum up, any HV theory that reproduces the quantum-mechanical
statistics must be both contextual and nonlocal.
see what kind of world is representable within the class of models the theory
employs.
Recall that, on the semantic view of theories, a scientific theory provides a
representation, or model, of a certain domain. Thus geometrical optics provides
a geometrical representation of the transmission, reflection, and refraction
of light, the Bohr theory of the atom a model of atomic structure.
Sometimes these models have a physical representation, sometimes they
are wholly abstract mathematical structures, but in both cases they supply
representations of the phenomena, or, as in the case of the Bohr model, of
the structures postulated as underlying the phenomena. The Hilbert spaces
of quantum theory are, obviously, of the second, abstract kind.
We interpret the theory by recognizing, in the models the theory provides,
elements of a particular conceptual scheme. For example, in the
Hamilton-Jacobi theory of classical mechanics for a single particle, the element $\omega$ of
the phase space is interpreted as an encapsulated summary of the primary
qualities of the particle, and the mathematical expression $-\nabla H(\omega)$
[$= (-\partial H/\partial x, -\partial H/\partial y, -\partial H/\partial z)$, where H is the Hamiltonian function
for the system] is interpreted as the force acting on the particle, such forces
being the efficient causes responsible for the processes the theory describes.
Thus the theory is interpreted within a particular categorial framework. I
borrow the phrase from Körner (1969, pp. 192-210); a categorial framework
is a set of fundamental metaphysical assumptions about what sorts of
entities and what sorts of processes lie within the theory's domain. The loci
classici for the articulation of the categorial framework of classical me-
* "Classical mechanics" is here identified with a class $(T_1,I_1)$, $(T_2,I_2)$, ... of theories and
interpretations.
which of these can be regarded as the system's properties. A value-state
will thus map pairs (A,a) onto 1 or 0, depending on whether the system
possesses the property in question or not, and so will resemble a classical
state.
Two remarks need to be made about this value-state. In the first place, the
attribution of properties it provides is over and above the work done by the
theory simpliciter. We use it to yield an interpretation of the theory which
accommodates the notion of the properties of a system, but another alternative
is always open to us, that of finding a categorial framework in which the
notion does not appear. Second, even if we hang on to properties, the
concomitant value-states cannot be just like their classical counterparts. For
Kochen and Specker's theorem tells us that, for most quantum systems,
there can be no function $\lambda$ mapping all pairs (A,a) onto 1 or 0 in accordance
with PVP; in other words, so that for each A there is exactly one value a for
which $\lambda(A,a) = 1$. Any workable account of a value-state must therefore be
modified away from adherence to PVP. Different modifications will yield
different interpretations of quantum theory.
A number of these interpretations can best be explicated using the vocabulary
of "quantum logic"; partly for that reason the next chapter is devoted
to that topic.
7
Quantum Logic
Various enterprises are subsumed under the heading quantum logic. Two
useful introductions to the topic, Mittelstaedt (1981) and van Fraassen
(1981a), appear in the same volume; more extended accounts are given by
Beltrametti and Cassinelli (1981) and Holdsworth and Hooker (1983).
Common to all quantum-logical enterprises is the aim of giving, or utilizing,
an algebraic account of quantum theory.
In Sections 7.2-7.4 I define the algebraic structures that quantum logic
makes use of (Boolean algebras, partial Boolean algebras, and orthomodular
lattices), and show how these structures can be found embedded within
Hilbert spaces. In Section 7.5 I look at the work of a group of writers
(Mackey, Maczinski, Finkelstein, Jauch, and Piron) who have sought to
recapture the Hilbert-space formalism of quantum theory by looking at the
algebraic constraints to which the event structure of any theory must con-
form. Finally, in Sections 7.6-7.9 I show how quantum logic can be thought
of as a logic, in the sense in which that word is used when we speak of
"deductive logic," and I discuss whether a "quantum-logical" interpreta-
tion of quantum mechanics will allow us to salvage the notion of a property
of a system.
To illustrate how all these enterprises hang together, I start by examining
a very simple classical system, showing, first, how the algebraic structure of
a field of sets can coincide with the structure of the set of properties the
system can possess and, second, how this structure can also be viewed as a
logical structure.
p P (penny, heads-up)
P P (penny, tails-up)
Figure 7.2
At the top of the diagram is the whole space, and at the bottom is the empty
set $\emptyset$.
If any point on the diagram can be reached from another by traveling
upwards along the lines of the diagram, then the subset represented by the
higher node properly contains the subset represented by the lower. Thus a
line running upwards between two points (possibly passing through others
en route) represents the relation of set inclusion.
Figure 7.3
Figure 7.4
Each of these subsets, and hence each node, represents a possible property
of the system and so the diagram also displays the relations between
these properties. Now, associated with each property is a sentence expressing
the fact that the system has the property in question. In fact, the sentence
$\omega \in P$ (where $\omega$ is the state of the system) expresses the fact that the penny
is heads-up.
Let p be synonymous with $\omega \in P$, and q with $\omega \in Q$. Then, using the
standard logical connectives & and v for "and" and "or," we can write p & q
for $\omega \in P \cap Q$, and p v q for $\omega \in P \cup Q$.* Clearly, to each node on the
diagram we can attach the corresponding sentence: to the nodes representing
P and Q we attach p and q, respectively, and to the nodes representing $\bar{P}$
and $\bar{Q}$ we attach -p and -q, where - is to be read as, "It is not the case
that ..." Let LC be the set of sentences which can be formed from p and q
by using the connectives &, v, and -. These three connectives mimic the
set-theoretic operations $\cap$, $\cup$, and complementation, as Figure 7.4 shows.
* Throughout this chapter sentences and sentence schemata of the logical language will not
be marked off by quotation marks or quasi-quotation marks. Quelle horreur.
(7.2) $a \vee a = a$    $a \wedge a = a$
(/ .1)
In view of (7.3), there are elements of B, namely $a \vee a^\perp$ and $a \wedge a^\perp$, which,
although they are obtained from a single element a by Boolean operations,
do not depend on the choice of a. These are the designated elements 1 and 0
respectively. We have then, by definition,
(7.4) $a \vee a^\perp = 1$    $a \wedge a^\perp = 0$
(7.5) $(a^\perp)^\perp = a$
( /.1)
$0 \vee 1 = 1 = 1 \vee 0$    $0 \wedge 1 = 0 = 1 \wedge 0$
$0 \vee 0 = 0 = 0 \wedge 0$
$1 \vee 1 = 1 = 1 \wedge 1$
The operations $\vee$, $\wedge$, and $\perp$ on the right-hand sides of these equations are
operations on $Z_2$.
The importance of this mapping for classical logic is clear. Consider the
Boolean algebra $\mathcal{B}_{16}$ pictured in Figure 7.4; this is the algebra of the set $\Pi_{16}$ of
propositions expressible using just two atomic sentences, p and q, together
with the usual connectives. If we think of the two elements, 1 and 0, of $Z_2$ as
true and false, respectively, then each of the homomorphisms of $\mathcal{B}_{16}$ onto $Z_2$
offers a systematic way of assigning truth-values to the propositions of $\Pi_{16}$
(or, more precisely, to the sentences expressing them). On these assignments,
as we shall see, the connectives &, v, and - are the familiar
truth-functional connectives given by truth tables in any introductory logic text
(such as Kleene, 1967, p. 9). In the case of $\mathcal{B}_{16}$, there are just four such
homomorphisms, each corresponding to a possible assignment of
truth-values to p and q.
Each homomorphism is associated with one of the four atoms of $\mathcal{B}_{16}$, that
is, with one of the points immediately above 0 in the diagram. Each homomorphism
maps just one of these atoms onto 1, together with all the points
lying above that atom. The set of these elements is said to form an ultrafilter
on $\mathcal{B}_{16}$. Figure 7.5 shows the elements of $\mathcal{B}_{16}$ which are mapped onto 1 by
the homomorphism associated with the atom a. The remainder are mapped
onto 0.
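The link between atoms, ultrafilters, and homomorphisms onto $Z_2$ can be made concrete with a small sketch. It models the sixteen-element algebra as the subsets of a four-element set of atoms, encoded as 4-bit integers; that encoding and the function names are assumptions made for illustration, not notation from the text.

```python
ATOMS = 4
TOP = (1 << ATOMS) - 1          # 0b1111, the maximum element 1
ELEMENTS = range(TOP + 1)       # the sixteen elements; 0 is the minimum

def join(x, y):
    return x | y

def meet(x, y):
    return x & y

def comp(x):
    return TOP & ~x

def hom(atom):
    """Map an element to 1 if it lies above the given atom, else to 0."""
    def h(x):
        return 1 if x & (1 << atom) else 0
    return h

def is_homomorphism(h):
    # On Z_2, join is max, meet is min, and complement is 1 - value.
    binary_ok = all(h(join(x, y)) == max(h(x), h(y)) and
                    h(meet(x, y)) == min(h(x), h(y))
                    for x in ELEMENTS for y in ELEMENTS)
    comp_ok = all(h(comp(x)) == 1 - h(x) for x in ELEMENTS)
    return binary_ok and comp_ok

# The ultrafilter for an atom is the set of elements mapped onto 1.
ultrafilters = {a: [x for x in ELEMENTS if hom(a)(x) == 1] for a in range(ATOMS)}
all_ok = all(is_homomorphism(hom(a)) for a in range(ATOMS))

print(all_ok, {a: len(u) for a, u in ultrafilters.items()})
```

The sketch verifies that each of the four atom-based maps is a homomorphism; it does not search for other maps, so it does not by itself show that there are only four.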
Figure 7.5 A typical ultrafilter on $\mathcal{B}_{16}$.
The generalization of this example to any atomic Boolean algebra is
straightforward, but some preliminary work is required.
In the discussion of $\mathcal{B}_{16}$, I have talked of the atoms as points "immediately
above" 0. We need an algebraic specification of that relation. Note first
that, for all a and b in B,
The relation $\le$ is reflexive, transitive, and antisymmetric. That is, for all a
and b in B,
(7.11a) $a \le a$
(7.11b) $a \le b$ and $b \le c$ together imply $a \le c$
(7.11c) $a \le b$ and $b \le a$ together imply a = b
While all finite Boolean algebras are atomic (that is, they contain atoms),
some infinite ones do not. I will restrict present discussion to atomic Boolean
algebras, although, in fact, (7.13) and (7.14) below are perfectly general
results.
An ultrafilter U on an atomic Boolean algebra $\mathcal{B}$ is a set of elements of B
containing just one atom a and all points b such that $a \le b$. We find that, if U
is an ultrafilter on a Boolean algebra $\mathcal{B}$, then, for all a and b in B,
There is a one-to-one correspondence between the set of ultrafilters on $\mathcal{B}$
and the set of homomorphisms of $\mathcal{B}$ onto $Z_2$ such that, if U is an ultrafilter on
$\mathcal{B}$ and $h_U$ is the corresponding homomorphism, then, for all a in B,
From (7.13) and (7.14) we can see why, if we have a Boolean algebra of
propositions, the connectives of the language expressing them behave
tru th -functionally.
The definition of a Boolean algebra given by (7.1) is purely structural, and
so the theorems (7.2)-(7.12) are completely general; no interpretation of $\vee$,
$\wedge$, and $\perp$ is assumed, nor are there restrictions on what B may contain. To
emphasize the general nature of a Boolean algebra, and to provide an
example which will be useful in the next section, let us look at an interpretation
of $\mathcal{B}_{16}$ very different from those we have considered.
Let $\mathcal{A}$ be the algebra (A, LCM, HCF, COMP, 1, 210), such that A contains
the sixteen numbers 1, 2, 3, 5, 7, 6, 10, 14, 15, 21, 35, 30, 42, 70, 105, 210,
while the two binary operations on A yield the lowest common multiple
(LCM) and the highest common factor (HCF) of any two numbers in A, and
COMP(a) = 210/a, for all a in A. This algebra is isomorphic to $\mathcal{B}_{16}$, that is,
we can attach each number to a node on Figure 7.3; the maximum element
(1) of this algebra is 210, the minimum element (0) is 1, and the atoms are the
primes 2, 3, 5, and 7.
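The claim that the divisors of 210 form a Boolean algebra under LCM, HCF, and COMP can be checked directly. The following is a minimal sketch; the function names `lcm`, `hcf`, `comp`, and `leq` are chosen for illustration, and ordering by divisibility is assumed as the natural reading of the order relation on this algebra.

```python
from math import gcd
from itertools import product

TOP = 210
A = [d for d in range(1, TOP + 1) if TOP % d == 0]   # the sixteen divisors

def lcm(a, b):
    return a * b // gcd(a, b)

def hcf(a, b):
    return gcd(a, b)

def comp(a):
    return TOP // a

def leq(a, b):
    # a lies below b exactly when a divides b, i.e. HCF(a, b) = a.
    return hcf(a, b) == a

# A is closed under both binary operations.
closed = all(lcm(a, b) in A and hcf(a, b) in A for a, b in product(A, repeat=2))

# Complement laws: join with the complement is the maximum (210),
# meet with the complement is the minimum (1).
complemented = all(lcm(a, comp(a)) == TOP and hcf(a, comp(a)) == 1 for a in A)

# Distributivity of meet (HCF) over join (LCM).
distributive = all(hcf(a, lcm(b, c)) == lcm(hcf(a, b), hcf(a, c))
                   for a, b, c in product(A, repeat=3))

# Atoms: elements other than the minimum with nothing strictly between.
atoms = [a for a in A if a != 1 and all(b in (1, a) for b in A if leq(b, a))]

print(len(A), closed, complemented, distributive, atoms)
```

The complement laws hold here because 210 is a product of distinct primes, so a and 210/a never share a factor.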
Nonetheless, among all the possible realizations of Boolean algebras, one
type of realization has a privileged status: we know from a representation
theorem due to M. H. Stone that every Boolean algebra is isomorphic to a
field of sets (Bell and Slomson, 1969).
The significance of this theorem for our present purposes is this. The
presentation in Section 7.1 may suggest that, because the propositions of a
classical theory are represented by subsets of a phase space, their algebraic
structure, or logic, is Boolean; however, it is more accurate to say that,
because their logic is Boolean, they can be represented by the subsets of a
phase space.
Figure 7.6 Some finite posets and lattices. A is a poset with no maximum element: (7.18)
fails. (7.18) holds for B, but B is not complemented: (7.19) fails. (7.19) holds for C, but C is
not orthocomplemented: (7.20) fails. (The arrows show how complementation works.)
(7.20) holds for D, but D is not orthomodular: (7.22) fails. E is a poset with maximum and
minimum elements, but it is not a lattice. (Nor is A.) F is an orthocomplemented distributive
lattice; it is a Boolean algebra. Compare Figure 7.3 and Figure 7.8.
We do not require that, for all a and b in A, either $a \le b$ or $b \le a$. (A set for
which this holds is said to be totally ordered by $\le$.) In the rest of this section,
$\mathcal{A}$ is taken to be the poset $(A, \le)$.
If a and b are elements of A, then there may exist an element c such that
(7.17a) $c \le a$ and $c \le b$;
(7.18) $a \le 1$
(7.19) $a \wedge a^\perp = 0$
These equations should be read, "Sup{a,a⊥} exists and is equal to 1," and
"Inf{a,a⊥} exists and is equal to 0."
$\mathcal{A}$ is said to be orthocomplemented if it is complemented and, for all a in A,
(7.20a) $(a^\perp)^\perp = a$;
For an orthocomplemented poset, De Morgan's laws hold for sup and inf
wherever they are defined; see (7.6). Notice that, if $\mathcal{A}$ is orthocomplemented,
then sup{a,b} is defined if and only if inf{a,b} is.
We can define a relation of orthogonality on an orthocomplemented poset
by the following condition:
It may be that sup{a,b} and inf{a,b} are defined for all pairs, {a,b}, of
elements of A. In that case, $\mathcal{A}$ is said to be a lattice. We can now regard $\vee$ and
$\wedge$ as binary operations on A and refer to them (unsurprisingly) as "join" and
"meet."
Notice that the lattice condition is independent of conditions (7.18)-
(7.20) and (7.23)-(7.24). We can apply these constraints to the class of
lattices to obtain, successively, lattices with maximum and minimum elements,
complemented lattices, orthocomplemented lattices, orthocomplete
lattices, and orthomodular lattices.
It is easy to show that, for all lattices, clauses (7.1a-c) of the definition of a
Boolean algebra hold (that is, commutativity, associativity, and absorption),
as do (7.2) (idempotence) and (7.10), which now appears as a theorem
rather than a definition. (7.1e) and (7.4) hold for complemented, and (7.5)-
(7.6) for orthocomplemented lattices.
A lattice for which (7.1d) holds is known as a distributive lattice. An
orthocomplemented distributive lattice is a Boolean algebra.
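Now let $\mathcal{A}$ be a complemented distributive lattice. Take $a, b \in A$ such that
$a \le b$. Then

$$\begin{aligned}
b &= b \wedge (a \vee a^\perp) && [(7.1e)] \\
  &= (b \wedge a) \vee (b \wedge a^\perp) && [(7.1d)] \\
  &= a \vee (b \wedge a^\perp) && [(7.10)]
\end{aligned}$$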
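The identity reached at the end of this derivation can be checked on a small concrete case. The sketch below uses the subsets of a three-element set as the complemented distributive lattice, with union as join, intersection as meet, and set complement as the complement; this choice of example and the helper name `comp` are assumptions made for illustration.

```python
from itertools import combinations

UNIVERSE = frozenset({1, 2, 3})

# All eight subsets of the universe, as frozensets.
A = [frozenset(c) for r in range(len(UNIVERSE) + 1)
     for c in combinations(sorted(UNIVERSE), r)]

def comp(a):
    return UNIVERSE - a

# All ordered pairs with a below b (here: a is a subset of b).
pairs = [(a, b) for a in A for b in A if a <= b]

# The identity b = a v (b ^ a-perp) for every such pair.
orthomodular = all(b == a | (b & comp(a)) for a, b in pairs)

print(len(A), len(pairs), orthomodular)
```

A check on one finite example is no substitute for the derivation, but it shows what the three lines of algebra assert in a familiar setting.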
(7.25a) L ∧ M = L ∩ M
(7.25b) L ∨ M = ∩{N : N ∈ S(ℋ) and L ⊆ N, M ⊆ N}
Notice that the latter is not the union of two subspaces, but their span. The
union of two rays, for example, contains just the vectors in the two rays;
since it does not contain all linear superpositions of these vectors, it is not a
subspace. The span of two rays is the plane containing them both and it is
this which, in lattice-theoretic terms, is the join of the two rays.
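The contrast between union and span can be made concrete with a small computation. This is a sketch only, assuming real three-dimensional space in place of a general Hilbert space and rational arithmetic for exactness; the helper names rank and in_span are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def in_span(v, basis):
    """True if v is a linear combination of the vectors in basis."""
    return rank(basis + [v]) == rank(basis)

RAY_X = [(1, 0, 0)]          # ray along the x-axis
RAY_Y = [(0, 1, 0)]          # ray along the y-axis
SUPERPOSITION = (1, 1, 0)    # a linear superposition of the two

# The superposition lies in neither ray, hence not in their union...
IN_UNION = in_span(SUPERPOSITION, RAY_X) or in_span(SUPERPOSITION, RAY_Y)
# ...but it does lie in their span, which is the lattice join.
IN_JOIN = in_span(SUPERPOSITION, RAY_X + RAY_Y)
JOIN_DIMENSION = rank(RAY_X + RAY_Y)
print(IN_UNION, IN_JOIN, JOIN_DIMENSION)
```

The join of the two rays has dimension two, the plane containing them both, while the union fails to be closed under addition.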
ℋ is the maximum element and {0} the minimum element of ℒ(ℋ). The
closure (see Section 1.16) of the set of vectors orthogonal to L forms a
subspace L⊥, which is the orthocomplement of L, obeying (7.19) and (7.20).
a ∨_i b = a ∨_j b
(7.27) for all a, b, c ∈ ∪{B_i}, if there are i, j, k ∈ I such that
a, b ∈ B_i
a ∧ b = a ∧_i b
PBAs. The difference between the lattice and the PBA is this: whereas the
lattice operations ∨ and ∧ are defined for all pairs of points on the lattice,
the operations ∨ and ∧ on a PBA are partial operations, defined only for
pairs of points, both of which are in the same Boolean subalgebra of the
PBA. We call such points compatible, noting that in the PBA ℬ(ℋ) two
points are compatible in this sense if and only if the subspaces they correspond
to are compatible in the sense of (3.8). But now notice that (3.8),
suitably rewritten, gives a purely algebraic definition of compatibility:
[The orthogonality relation here is, of course, the algebraic relation defined
by (7.21).] This definition allows the coherence condition mentioned above
to be simply restated:
This condition on posets does the work of (7.27) (Hardegree and Frazer,
1981).
It turns out that the feature we noted in the case G12 is perfectly general:
every maximal set of mutually compatible elements of a coherent ortho-
modular lattice is a Boolean algebra. Thus, to obtain a PBA from a coherent
orthomodular lattice L, we just define partial operations on L which are the
restrictions of lattice join and meet to pairs of compatible elements within L .
Conversely, there is a natural ordering definable on the transitive PBA
ℬ(ℋ), and there are unique extensions of the partial operations on ℬ(ℋ) to
meet and join with respect to that ordering; the resulting structure is a
coherent orthomodular lattice.
The second question remains open. It is not known whether there is a
purely algebraic way to specify those partial algebras (or those orthomodular
lattices) which are isomorphic to S(ℋ). The sorts of considerations at
work in Chapter 4 suggest that the most promising approach would be to
consider PBAs on which groups of transformations were definable which
reproduced the symmetry groups within Hilbert spaces. These transformations
would map one Boolean subalgebra of the PBA onto another; recall
that a selection of subspaces of ℝ³ giving rise to G12 was obtained by taking one
orthogonal triple in ℝ³ and rotating it about the z-axis to yield another. (See
Gudder, 1973, for work along these lines; see also Holdsworth and Hooker,
1983, pp. 135-136, for further references.)
with different procedures. Note, however, that a complementation operation
is everywhere defined, since each event a has a complement a⊥ in the
(unique) Boolean algebra which contains it. In addition, we may plausibly
identify the null events of all measurement procedures, and also the certain
events. The intuition at work here is that two events are identical if no
preparation will yield a different result for one than for the other. This rather
vague criterion will be made more precise shortly, but for the present it will
serve. For since we stipulated that for each measurement procedure the set
of outcomes was to be exhaustive, it follows that, whatever preparation
procedure is used, the null event, 0, will receive the answer no, and the
certain event, I, the answer yes, no matter what measurement we carry out.
Thus ℰ is a family of Boolean algebras in one-to-one correspondence with
the set of measurement procedures. This family is pasted together at top and
bottom; it is an example of a very elementary kind of structure known as an
orthoalgebra. I defer discussion of such structures until Section 8.1; the
present question is, what further constraints can we lay upon ℰ? In particular,
in order to relate events associated with different measurement procedures,
can we make precise the criterion used just now, when we identified
all the null events associated with different measurements, on the grounds
that no preparation yielded a different result for one than for another?
Well, every preparation gives a certain probability to the various events of
ℰ. We associate with each preparation a state, w, which assigns a probability
w(a) to each event a of ℰ. This enables us to define a relation ≤ on ℰ:
If, further, we identify two events a and b whenever, for all states w, w(a) =
w(b), then ≤ is a partial ordering on ℰ.
It seems that, without significant loss of generality, we have shown that ℰ
must be a poset-a poset, moreover, with maximum and minimum ele-
ments and on which a complementation operation has been defined. Alas,
dancing in the streets would be premature; without making some significant
assumptions we can't expect the complementation operation to mesh prop-
erly with the ordering relation. Consider, for instance, the experiment
shown diagrammatically in Figure 7.9, which consists of coupling together
two Stern-Gerlach apparatuses, one to measure S_z and the other to measure
S_y, so that just one of the beams emerging from the S_z apparatus, the z- beam,
say, passes through the S_y apparatus. (This example comes from Beltrametti
and Cassinelli, 1981, p. 145; see also Cooke and Hilgevoord, 1981.) If we
consider the coupled apparatuses as one experiment, then there will be
three possible outcomes, z+, y+, and y-.
Figure 7.9 Coupled Stern-Gerlach devices.
In this case we will find that, for all states w, w(y+) = w(y-). It follows
that, if a = {z+, y+} and b = {y-}, then b ≤ a. Hence b ∨ a = a. But since a and
b are mutually exclusive and jointly exhaustive, a = b⊥. It follows that b ∨
b⊥ = b ∨ a = a ≠ 1, contrary to (7.19). Thus the operation ⊥ is not an
orthocomplementation with respect to ≤.
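The claim that w(y+) = w(y-) for every state, and hence that b ≤ a, can be illustrated numerically. The sketch below assumes ideal devices: a spin-1/2 system prepared in the state alpha|z+> + beta|z->, with the z- beam entering the S_y apparatus in the state |z->, so that y+ and y- are equally likely. The function names and the sample amplitudes are illustrative assumptions, not taken from the text.

```python
def outcome_probabilities(alpha, beta):
    """Probabilities of z+, y+, y- for the spin state alpha|z+> + beta|z->.

    Assumes ideal devices: the z- beam enters the S_y apparatus in the
    state |z->, which yields y+ and y- with probability 1/2 each.
    """
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    p_z_plus = abs(alpha) ** 2 / norm
    p_z_minus = abs(beta) ** 2 / norm
    return {"z+": p_z_plus, "y+": p_z_minus / 2, "y-": p_z_minus / 2}

def event_probability(event, probs):
    """Probability of an event, taken as a set of outcomes."""
    return sum(probs[outcome] for outcome in event)

A = {"z+", "y+"}
B = {"y-"}

# A few sample preparations (amplitudes chosen only for illustration).
STATES = [(1, 0), (0, 1), (1, 1), (1, 1j), (0.6, 0.8j), (2, -1)]

# b <= a in the sense of the text: w(b) <= w(a) for every sampled state w.
B_BELOW_A = all(
    event_probability(B, outcome_probabilities(al, be))
    <= event_probability(A, outcome_probabilities(al, be))
    for al, be in STATES
)
print(B_BELOW_A)
```

For the preparation (0, 1), that is, the state |z->, the event a receives probability 1/2, so a cannot be the certain event even though it is the complement of b within this experiment.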
How might one outlaw such experimental arrangements? One strategy is
to make explicit the assumption that we are dealing with measurement
procedures: we may demand that each event have an internal conceptual
structure and be recognizable as an experimental question (A,Δ). Then
anomalous cases like this one are ruled out, on the grounds that the apparatus
does not measure a specific observable. In doing so, however, we lose
some of the generality we sought; we confine discussion to possible theories
couched in terms of observable quantities and their values. We also assume
that we can recognize which experimental devices provide measurements
of these quantities and which do not. To the extent that our project is that of
prescribing a logical form for the event structure of all successor theories,
these constraints seem unduly restrictive. Nonetheless, the approach is still
general enough to accommodate theories like quantum theory and classical
mechanics.
With these general considerations in mind, let us move to a more formal
mode. (The exposition essentially follows Mackey, 1963; see also Mac-
zynski, 1967.) We take as primitive notions those of observable and state; we
also use the resources of number theory, in particular, the notion of a Borel
set of the reals. (All physically significant sets of reals, and many others, are
Borel sets; see Fano, 1971, p. 215.) Let 𝒪 be the set of observables, S the set of
states, B(ℝ) the set of Borel subsets of the reals. A pair (A,Δ), where, as usual,
A ∈ 𝒪 and Δ ∈ B(ℝ), we call an experimental question.
Each state w defines a probability function on the set of questions, such
that for all A ∈ 𝒪 and for all Δ, Γ ∈ B(ℝ), w(A,Δ) ∈ [0,1], w(A,∅) = 0,
w(A,ℝ) = 1, and w(A,Δ ∪ Γ) = w(A,Δ) + w(A,Γ) provided Δ ∩ Γ = ∅.
We identify two states (w1 = w2) if they give the same probability to each
question (A,Δ), and we identify two observables A and B if, for all w ∈ S and
for all Δ ∈ B(ℝ), w(A,Δ) = w(B,Δ). Respectively, these identifications state
that the set of questions and the set of states are complete.
We say that two questions are equivalent, (A,Δ) ~ (B,Γ), if, for all w ∈ S,
w(A,Δ) = w(B,Γ). Each equivalence class of questions, [(A,Δ)], contains all
and only questions equivalent to (A,Δ). Modifying our previous usage, we
refer to an equivalence class of questions as an event; this modification does
not affect the substance of what is said. As before, let ℰ be the set of events.
Clearly, any state w can also be thought of as a function on the set of
events such that, for all a in ℰ, if a = [(A,Δ)], then w(a) = w(A,Δ). We define a
relation of orthogonality on ℰ as follows.
(7.33) For all a, b ∈ ℰ, we say that a is orthogonal to b (a ⊥ b) if, for all w ∈ S,
w(a) + w(b) ≤ 1.
Now consider the following postulate (Postulate M).
[The ≤ relation, remember, defined in (7.32) is such that a ≤ b if, for all w ∈
S, w(a) ≤ w(b).] Orthocomplementation on this poset is defined by:
ordering of events. For these authors, events are specified in operational
terms; thus to make good their claim they need to give a general prescription
whereby, from two recipes (one for asking a and the other, possibly using
a totally different experimental arrangement, for asking b) there can be
generated two more, for a ∧ b and a ∨ b, with the required properties. It is
this problem, of giving experimental definitions of the lattice-theoretic
operations, which resists adequate solution. *
The question arises: what further assumptions guarantee that the orthomodular
poset (ℰ, ≤) suggested by the Mackey-Maczynski approach will be
a lattice? We find (Beltrametti and Cassinelli, 1981, pp. 118, 152, and 297 -
298) that
* For detailed criticisms, see Hughes, 1982; note that the relations on lines 27 and 31 of page
249 of that article should read "0 < w(q · q⊥) < 1" and "T ≤ (q · q⊥)⊥," respectively. See also
Holdsworth and Hooker, 1983, pp. 136-141.
A pure state is defined as an extreme point in the convex set S. Incompatibility
is defined thus:
( /, 1</) Let a and b be any two orthogonal elements of ℰ distinct from the null
event; then there exists a non-null event c in ℰ, distinct from both a
and b, such that c < a ∨ b.
relations between the elements of ℬ16. The connectives of the language are
& (conjunction), v (disjunction), and ~ (negation), and the mapping f taking
sentences of Lc into elements of ℬ16 is such that, for all sentences
A, B ∈ Lc,
The following purely algebraic theorem also holds. Let ℬ be a Boolean
algebra; then for all a, b ∈ ℬ,
(7.42) a ≤ b if and only if every ultrafilter on ℬ containing a also contains b.
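Theorem (7.42) can be verified exhaustively for a finite Boolean algebra. The sketch below assumes the sixteen-element algebra realized as the power set of four atoms, and relies on the standard fact that in a finite Boolean algebra every ultrafilter is principal, that is, consists of all elements lying above some atom. The atom labels and function names are illustrative.

```python
from itertools import combinations

ATOMS = ("p", "q", "r", "s")   # four atoms give a 16-element algebra

def powerset(items):
    """All subsets of items, as frozensets."""
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

ELEMENTS = powerset(ATOMS)

def ultrafilter(atom):
    """Principal ultrafilter generated by an atom: all elements above it."""
    return {a for a in ELEMENTS if atom in a}

# Assumption: these principal ultrafilters are all the ultrafilters,
# which holds for any finite Boolean algebra.
ULTRAFILTERS = [ultrafilter(x) for x in ATOMS]

def below(a, b):
    """The ordering a <= b, here set inclusion."""
    return a <= b

def ultrafilter_criterion(a, b):
    """True if every ultrafilter containing a also contains b."""
    return all(b in u for u in ULTRAFILTERS if a in u)

# Compare the two conditions on all 256 ordered pairs of elements.
AGREE = all(below(a, b) == ultrafilter_criterion(a, b)
            for a in ELEMENTS for b in ELEMENTS)
print(len(ELEMENTS), AGREE)
```

Note that the zero element lies in no ultrafilter, so the criterion holds vacuously for it, in agreement with 0 ≤ b for every b.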
Whence we obtain,
Recall from Section 7.2 that the ultrafilters of ℬ16 play a special role: they
represent maximal consistent sets of propositions. Each possible truth-assignment
to the sentences of Lc is associated with a homomorphism of ℬ16
onto Z2, that is, with a function that maps all and only the members of some
ultrafilter of ℬ16 onto the element 1 of Z2. Only the propositions lying in the
ultrafilter are assigned the value "True" by the associated truth assignment.
Thus (7.43) is the algebraic equivalent of:
between the connectives of LQ and the operations on the lattice. For all
sentences A, B ∈ LQ,
We can see from Figure 7.10 that, in our example, f(~p) = f(q ∨ r), and
hence that ~p is equivalent to q ∨ r under the mapping f. As in the classical
case, I will confine myself to a single mapping; thus, in what follows, I will
omit the phrase "under the mapping f" which, ideally, should accompany
all statements about logical relations between the members of LQ. As before,
the restriction to a single mapping licenses talk of logical relations
between propositions, in this case, quantum propositions.
The orthomodular lattice ℒ(ℋ) of the set of subspaces of a Hilbert space is
atomic; that is, there are elements of ℒ(ℋ), to wit the rays of ℋ, immediately
above the zero of ℒ(ℋ). [(7.12) provides a formal definition.] In what
follows I restrict myself to atomic orthomodular lattices.
As in the case of an atomic Boolean algebra, an ultrafilter U on such a
lattice can be simply defined:
·111) We say that a holds under the assignment u if and only if a is in the
ultrafilter U.
(7.50a) A ⊨Q A ∨ B
if A ⊨Q C and B ⊨Q C, then A ∨ B ⊨Q C
(7.50b) A, B ⊨Q A ∧ B
A ∧ B ⊨Q B
⊨Q A ∨ ~A
⊨Q ~(A ∧ ~A)
[We write ⊨Q A if A holds on all truth-assignments; note also that the upper
line of (7.50b) involves a modest extension of our notation.] There are of
course casualties among the theorems of classical logic. Notoriously, A ∧
(B ∨ C) ⊭Q (A ∧ B) ∨ (A ∧ C), since, in general, an orthomodular lattice is
not distributive. (Friedman and Glymour, 1972, provide an axiomatization
of orthomodular quantum logic which was proved complete by Hughes,
1979; see also Dalla Chiara, 1986, and Gibbins, 1987, chap. 9.)
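The failure of distributivity can be exhibited in the simplest nontrivial subspace lattice. The sketch below assumes the real plane, with three distinct rays A, B, and C, and represents each subspace by a list of spanning vectors; it uses the fact that the meet of a ray with any subspace is either the ray itself or the zero subspace. All names are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def join(l, m):
    """Join of two subspaces given by spanning lists: the span of both."""
    return l + m

def meet_with_ray(ray, m):
    """Meet of a one-dimensional subspace with m.

    A ray is an atom, so the meet is the ray if it lies inside m and the
    zero subspace (empty spanning list) otherwise.
    """
    return ray if rank(m + ray) == rank(m) else []

A = [(1, 1)]   # three distinct rays in the real plane (illustrative choice)
B = [(1, 0)]
C = [(0, 1)]

LHS = meet_with_ray(A, join(B, C))                     # A ^ (B v C)
RHS = join(meet_with_ray(A, B), meet_with_ray(A, C))   # (A ^ B) v (A ^ C)
print(rank(LHS), rank(RHS))
```

The left-hand side is the ray A itself, of dimension one, while the right-hand side is the zero subspace; so the proposition A ∧ (B ∨ C) can hold where (A ∧ B) ∨ (A ∧ C) does not.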
More fundamentally, we may think, the assignments provided by the
ultrafilters on ℒ do not behave truth-functionally, as classical truth-assignments
do. That is to say, the truth-values of compound sentences are not
uniquely determined by the truth-values of their components.
Consider, for example, the assignment u determined by the ultrafilter Uq
on G12 which contains the atom q (see Figure 7.11). On this assignment q
holds, but the other atoms do not. The proposition p V q lies in the ultrafilter
Uq and therefore holds on this assignment. But p V q is identical with the
proposition s V t. Hence, on this assignment, we have
Let us now cash this out in terms of quantum systems and their states, and so
obtain an interpretation of the propositions of a logic based on the lattice
ℒ(ℋ). We need first to distinguish three kinds of things: quantum events,
quantum propositions, and subspaces of a Hilbert space. The subspaces of a
Hilbert space act as mathematical representations both of quantum events
and of quantum propositions. A quantum proposition is whatever is expressed
by a sentence of quantum logic: just what this is we rely on our
interpretation to tell us. A quantum event (also called an "experimental
question") is a pair (A,Δ). The fact that we are not at this stage giving any
further account of these entities does not mean that none is needed; on the
contrary, we are still engaged in the project, announced at the beginning of
Chapter 6, of gaining more insight into their nature, and, indeed, one reason
for seeking an interpretation of quantum logic is that it may help us to do so.
Propositions, events and subspaces are in one-to-one-to-one correspondence;
I will use lowercase italic letters a, b, c, . . . for propositions, E_a, E_b,
E_c, . . . for the corresponding quantum events, and L_a, L_b, L_c, . . . for the
corresponding subspaces of ℋ. Strictly, the three sets form three isomorphic
lattices, but I will refer to all three structures indiscriminately as ℒ(ℋ),
relying on context to make clear what the elements of the lattice in question
are.
Each atom L_a of ℒ(ℋ) is a one-dimensional subspace of ℋ and so represents
a pure state of a system. Thus the set of pure states is in one-to-one
correspondence with the set of valuations of our quantum logic. Now let P_a
be the projector onto the atom L_a, and for any element L_b of ℒ(ℋ) (that is,
any subspace of ℋ), let P_b be the projector onto L_b. As we know, each
subspace L_b (alternatively, each projector P_b) represents a quantum event E_b,
and every such event is assigned a probability by the state; if the state is P_a
this probability is given by,
also unambitious. On the proposed reading of quantum propositions, these
propositions are just a subset of the predictions quantum mechanics makes
about the probabilities of quantum events, and quantum logic offers merely
a partial reformulation of quantum theory in the formal mode, that is, a
reformulation expressed in terms of sentences and the relations between
them. But many devotees of quantum logic were after bigger game. In
particular, they took the logico-algebraic approach to quantum theory to
offer a way, or various ways, to talk of the properties of systems. The next
section looks at one such proposal.
In this lattice,
reflecting the fact that no state makes either b1 and a1, or b1 and a2, or b1 and
a3 simultaneously true. The fact that we cannot infer (II) from (I) is, of
course, one example of the failure of the (classical) distributivity law.
Consider now the objection to Putnam made by Harrison (1983). (He too
talks of "position" and "momentum," and in the quotations below I have
replaced these words by "B-value" and "A-value," respectively.) Harrison
suggests first that, according to quantum theory, if a system has a determi-
nate B-value, it is false that it has any of the A-values specified in the second
conjunct of (I). He continues:
Hence, if quantum theory is true, the truth of the first conjunct in (I) implies the
falsity of the second, and (I) itself must be false. Thus the very circumstance, that a
particle cannot have a determinate B-value and A-value, which implies the falsity
of (II), also implies the falsity of (I), and the difficulty for classical logic is removed.
(P. 84)
But his argument from the falsity of a1, a2, and a3 to the falsity of a1 ∨ a2 ∨
a3 relies entirely on his treating ∨ as classical truth tables prescribe. Clearly,
it is no objection to Putnam's system just to say that truth-table analysis tells
us that the truth of a disjunction requires the truth of at least one of the
disjuncts. This merely tells Putnam something he already knows: that his
system is not classical.
supplied with a semantics which maps them systematically onto the elements
of a lattice. The purported logical relations between the sentences are
then read off from the algebraic relations which hold between these elements.
But the logic that results constitutes an alternative to classical logic
only when the sentences of this formal language are given a specific interpretation;
as Section 7.7 showed, an unexceptionable, if unadventurous
interpretation of the lattice elements as modal propositions, □(A,Δ), is available.
Under this interpretation quantum logic formalizes a particular account
of necessity; it supplements but does not supplant classical logic.
When Putnam says that the rules of quantum logic "conflict with classical
logic" and that the lesson to be drawn is that "we must change our logic"
(p. 221), he has another interpretation of the formal system in mind. As we
saw in Section 7.8 he reads the propositions of quantum logic as indicative
propositions ascribing properties to microsystems.
It is this interpretation that has given quantum logic that hint of philosophical
perversity (delicious or detestable according to taste) conveyed
in the phrase "deviant logic." On the one hand, these are propositions of a
kind to which, prima facie, we would expect classical logic to apply; on the
other, they are just the statements which results like the Kochen and
Specker theorem tell us behave in a nonstandard way: given an exhaustive
list of the possible values of each observable for a system, at no time can we
truly ascribe exactly one of these values to each observable.
Now this problem is going to be faced by anyone who offers an interpre-
tation of quantum mechanics which involves ascribing properties to sys-
tems. And no matter what kind of account is given of why the properties
behave as they do, this account will always have a counterpart in the formal
mode. Assume, for example, that the account posits states of affairs which
can or cannot obtain. Then, corresponding to each of these states of affairs
there will be a statement which may or may not be truly asserted. Constraints
on possible states of affairs will appear in the formal mode as
restrictions on what may be truly said about them. It follows that anyone
who talks of the properties of systems is committed to some version of
"quantum logic."
Witness Harrison, whom we met inveighing "Against Quantum Logic" in
Section 7.8. He writes, "I had always supposed that, according to quantum
theory, . . . [if] a particle's position is determinate, it is false that it has any
of the velocities specified in [an exhaustive list of velocities]" (Harrison,
1983, p. 84). In other words, in quantum mechanics the truth of one atomic
proposition (an ascription of position) entails the falsity of another
(any specific ascription of velocity). Now any systematic account of such
entailment constitutes a logic; further, since no classical conjunction of
sentences can be meaningfully connected; the connectives ∧ and ∨ are thus
"partial connectives" in the same sense that the operations on a PBA are
partial operations. At any time only one maximal Boolean subalgebra of
propositions applies to the system. The ultrafilters on that subalgebra act as
two-valued truth-assignments to the propositions within it, and to each
ultrafilter corresponds a value-state. Among the propositions within this
subalgebra the laws of classical logic obtain. The propositions that lie outside
it may conveniently be given some third truth-value, neither true nor
false, to indicate that neither they nor their negations are true. (Hughes,
1985b, gives a detailed account of the semantics of this logic.)
Algebraically, this is precisely the quantum logic that Bub finds implicit in
Bohr's writings. This is not to say that Bohr and Kochen share a common
interpretation of quantum theory. Rather, they offer interpretations which
differ both in detail and in the metaphysical attitudes they express. In the
first place, whereas on Kochen's account the Boolean interaction algebras
are selected by any kind of interaction, on Bohr's view the classical nature of
measuring instruments gives measurement interactions a special status.
Secondly, the ontological commitment urged by Kochen is not shared by
Bohr, who indeed took pains to distance himself from others (also associated
with the Copenhagen tradition) who held that physical attributes were
"created by measurement" (Bohr, 1949).
Nonetheless, formally the logics are exactly the same; on the partial
Boolean semantics they employ, sentences conjoining a position ascription
and a momentum ascription are not well formed, and hence are meaning-
less. Thus, although this algebraic logic can be made to collapse to a three-
valued semantics, what results is very unlike the logic Reichenbach pro-
posed as an alternative to the Bohr-Heisenberg interpretation. As we have
noted, on Reichenbach's logic, conjunctions of complementary propositions
are perfectly well formed, though never true; furthermore, unlike the collapsed
algebraic logic, Reichenbach's is truth-functional.
These analogies and disanalogies, however, serve only to underscore our
previous conclusion: that much of the debate between advocates of quantum
logic and their opponents has been misdirected. If Kochen, on the one
hand, and Bub, acting on Bohr's behalf, on the other, can start from radically
different interpretations of quantum theory and yet produce formally iden-
tical quantum logics, then this adds strong support to the view that, what-
ever interpretation we adopt, the logic of property ascriptions to quantum
systems will be nonclassical. The choice we confront is not between adopting,
for example, the Copenhagen interpretation and embarking on "the
heroic course" of changing our logic (Putnam, 1969, p. 222); it is between
adopting a deviant logic and eschewing the notion of a property.
The term probability has, up to now, been treated as though it were entirely
unproblematic. Surely this is too optimistic by far. There is, for instance, the
problem of the interpretation of probability: does it represent a degree of
belief, or a relative frequency, or a mysterious propensity, or something else
again? The view taken in this book is that nearly all the probabilities ap-
pearing in theoretical quantum mechanics are objective probabilities. That is
to say, they inhere in the world and do not simply reflect the degrees of
belief of an observer; rather, they determine what this degree of belief
should ideally be: if an event E is assigned an objective probability of, say,
0.1, then a fully informed observer should assign a subjective probability of
0.1 to E and place her bets accordingly (see Lewis, 1980). I wrote just now
that "nearly all" quantum-theoretic probabilities are objective. The possible
exceptions occur when a system is in a mixed state. If we adopt the igno-
rance interpretation of a given mixture, then we assign a subjective proba-
bility to each of the pure states represented in it, and each of these in turn
assigns objective probabilities to events. Heisenberg, for one, suggested that
the interplay between objective and subjective components of probability
assignments could be made to do interpretive work, and I discuss his sug-
gestions in Section 9.5. Note, however, that, as we saw in Section 5.8, not all
mixtures can be given the ignorance interpretation.
Leaving aside the possible exception of mixtures, I will assume that quan-
tum theory deals with objective probabilities. However, I will not discuss
how the concept of objective probability is to be interpreted (see, for exam-
ple, Giere, 1973, 1976; Skyrms, 1980, chap. IA; van Fraassen, 1980, chap.
6), but will instead focus on a problem raised by quantum mechanics for the
mathematical theory of probability. Quantum mechanics requires us to
modify this theory, or rather to generalize the mathematical account of it
given by Kolmogorov (1933). But, surprisingly, this revision yields remarkable
benefits; it helps us to provide explanations of the "causal anomalies"
which beset quantum theory. Or so I shall suggest.
Running through this chapter, in what I hope will be a euphonious
counterpoint, are three main themes: (1) the generalization of probability
theory, (2) the "causal anomalies" of quantum mechanics, and (3) the resolution
of these anomalies in terms of generalized probability theory. A
discussion of scientific explanation appears as a coda.
8.1 Probability Generalized
The classical presentation of probability theory was given by Kolmogorov
(1933). On this account, probabilities are assigned to sets. In Kolmogorov's
original presentation, these sets were said to be subsets of a set E of "elementary
events." These "elementary events," however, played no further part
in the discussion; following standard practice, I will use the term event to
refer to any subset of a set E to which a probability is assigned. If a probability
is assigned to two events A and B, we also require it to be defined for their
union, A ∪ B, for their intersection, A ∩ B, and for their complements, E - A
(8.1) We say that the triple (E, ℱ, p) is a classical probability space if ℱ is a
field of subsets of E and p is a function p: ℱ → [0,1] satisfying
(8.1a) p(E) = 1 and p(∅) = 0;
(8.1b) p(A ∪ B) = p(A) + p(B), for all A, B ∈ ℱ such that A ∩ B = ∅.
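Definition (8.1) can be instantiated on a small finite example. The sketch below assumes a fair six-sided die, takes the field to be the full power set of the outcome set, and uses exact rational arithmetic; the names FIELD, is_field, and satisfies_8_1 are illustrative and not part of the text.

```python
from fractions import Fraction
from itertools import combinations

E = frozenset({1, 2, 3, 4, 5, 6})   # outcomes of a fair die (illustrative)

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

FIELD = powerset(E)   # the largest field of subsets of E

def p(a):
    """Uniform measure: each outcome has probability 1/6."""
    return Fraction(len(a), len(E))

def is_field(family, universe):
    """Closed under union, intersection, and complement within universe."""
    fam = set(family)
    return all((a | b) in fam and (a & b) in fam and (universe - a) in fam
               for a in fam for b in fam)

def satisfies_8_1(prob, family, universe):
    """Check conditions (8.1a) and (8.1b) for prob on the given family."""
    normalized = prob(universe) == 1 and prob(frozenset()) == 0
    additive = all(prob(a | b) == prob(a) + prob(b)
                   for a in family for b in family if not (a & b))
    return normalized and additive

print(is_field(FIELD, E), satisfies_8_1(p, FIELD, E))
```

A family such as {∅, {1}, E} fails the closure test, since the complement of {1} is missing; this is why the definition requires a field rather than an arbitrary collection of subsets.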
In fact it's now usual to define the measure on a σ-field of sets, that is, one
which is closed not only under finite union and intersection, but also under
(denumerably) infinite union and intersection. In this case (8.1b) becomes:
These axioms are due to Hardegree and Frazer (1981). From them we may
derive the following theorems:
(II ,III) 0⊥ = 1;
1⊥ = 0;
(11 ,1/1) (a⊥)⊥ = a;
(II ,It') a ⊕ b = a ⊕ c only if b = c;
exists a set C of pairwise orthogonal members of 𝒜 such that each member b
of B is the orthogonal sum of some subset of C; in other words, for each
b ∈ B, there exist c1, c2, . . . , cn ∈ C such that b = ⊕i ci. When B is the pair
{a,b}, this condition reduces to the familiar definition (7.30).
In the "operational" orthoalgebra ℰ sketched above, we could regard all
events as compatible if all the possible experiments could be performed
simultaneously without interfering one with the other. In that case the
algebra of events would be embeddable within a Boolean algebra, in fact
within a field of sets.
Less stringent constraints than the requirement of universal joint compat-
ibility yield the transitive partial Boolean algebras (equivalently, coherent
orthomodular posets) of quantum logic.
We now define a generalized probability function p.
(8.6) p(L) = Tr(DP)
Note that if ℋ has dimension two, while each density operator on ℋ yields a
GPF, the converse does not hold, witness the probability function on ℝ²
which assigns 1 to points in the first and third quadrants and 0 to points in
the second and fourth.
Gleason's theorem is a very strong result; the measures supplied by the
density operators on ℋ are the only natural extensions of classical probability
functions to the non-Boolean structure of the set of quantum-mechanical
propositions.
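The trace formula (8.6) is easy to evaluate by hand in low dimensions, and a small computation shows the additivity over orthogonal subspaces that makes it a generalized probability function. The sketch below assumes a real two-dimensional space and an illustrative density operator D with rational entries; the helper names are not from the text.

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def trace(x):
    """Sum of the diagonal entries of a square matrix."""
    return sum(x[i][i] for i in range(len(x)))

def projector(v):
    """Projector onto the ray through the nonzero real vector v."""
    norm_sq = sum(c * c for c in v)
    return [[F(a * b) / norm_sq for b in v] for a in v]

def prob(density, proj):
    """The probability assigned by (8.6): Tr(DP)."""
    return trace(matmul(density, proj))

# Illustrative density operator: symmetric, positive, trace one.
D = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]
P_PLUS = projector((1, 1))     # ray along (1, 1)
P_MINUS = projector((1, -1))   # the orthogonal ray
IDENTITY = [[F(1), F(0)], [F(0), F(1)]]

print(prob(D, P_PLUS), prob(D, P_MINUS), prob(D, IDENTITY))
```

The probabilities of the two orthogonal rays add up to the probability of their join, the whole space, which is 1.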
Bub (1977) pointed out another highly significant result, that the
non-Boolean structure of S(ℋ) also necessitates a revised account of conditional
probability. In classical probability theory every Kolmogorov probability
function p defines a conditional probability measure ℙ; the probability
ℙ(A|B) of event A conditional on event B is given by
For any given nonzero event B, the function ℙ(X|B) (where X is any event in
E) is itself a classical probability measure. In fact, it is the only classical
probability measure on the set E of events such that, for all A in E,
Then there exists a unique GPF ℙ(X|L_B) on S(ℋ) such that, whenever L_A ⊆
L_B,
ℙ(L_A|L_B) = p(L_A) / p(L_B)
D_B = P_B D P_B / Tr(P_B D P_B)
Note that in (8.7) there is no restriction on L_A; we do not require that L_A ⊆ L_B.
However, we see that, as in the classical case, the Lüders rule gives the only
probability measure that, for events L_A ⊆ L_B, just involves a renormalization
of the GPF given by the operator D. This offers strong grounds for regarding
it as the appropriate conditionalization rule for GPFs on S(ℋ). Additional
grounds for thinking of it as the natural extension of the classical condition-
alization rule appear from its behavior in two special cases (see also Bub,
1977, and Section 9.3).
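The renormalization property of the Lüders rule can be checked on a small example. The sketch below assumes real three-dimensional space, takes D to be the projector onto the ray through (1, 1, 1), conditions on the plane L_B spanned by the first two coordinate axes, and evaluates the result on the ray L_A along the first axis, which lies inside L_B. The choice of operators and all helper names are illustrative assumptions.

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def trace(x):
    """Sum of the diagonal entries of a square matrix."""
    return sum(x[i][i] for i in range(len(x)))

def projector(v):
    """Projector onto the ray through the nonzero real vector v."""
    norm_sq = sum(c * c for c in v)
    return [[F(a * b) / norm_sq for b in v] for a in v]

def prob(density, proj):
    """Probability Tr(DP) assigned to the subspace with projector proj."""
    return trace(matmul(density, proj))

def luders(density, p_b):
    """Conditionalized operator D_B = P_B D P_B / Tr(P_B D P_B)."""
    numerator = matmul(matmul(p_b, density), p_b)
    t = trace(numerator)
    if t == 0:
        raise ValueError("conditioning event has probability zero")
    return [[x / t for x in row] for row in numerator]

D = projector((1, 1, 1))                                   # a pure state
P_B = [[F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(0)]]
P_A = projector((1, 0, 0))                                 # L_A inside L_B

D_B = luders(D, P_B)
CONDITIONAL = prob(D_B, P_A)            # probability of L_A given L_B
RATIO = prob(D, P_A) / prob(D, P_B)     # p(L_A) / p(L_B)
print(CONDITIONAL, RATIO)
```

For L_A inside L_B the two numbers coincide, as the renormalization property requires; the function luders is nonetheless defined for any projector P_A, in keeping with the remark that (8.7) places no restriction on L_A.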
First of all, consider the case when L_A and L_B are compatible. In this case
we have
where P_C projects onto L_A ∩ L_B. (If L_A ⊥ L_B, then P_C is the zero operator.)
Using (8.6) we obtain
(8.8)
where I_a and I_b are the identity operators on ℋ_a and ℋ_b, respectively.
Now by Lüders-rule conditionalization,
(8.9)
* A neutron interference experiment which is the exact analogue of the two-slit experiment
has been performed by a group led by Summhammer. It is simply described in Leggett (1986).
Figure 8.1 The two-slit experiment; curves show distribution of "hits" in experiments a
and b and (far right) experiment c. (From Feynman, Leighton, and Sands, 1965.)
we find that
detectable at the screen; for this "collapse of the wave packet" the account
offers no explanation. But the particle account fares no better. Let us assume
that, when both apertures are open, a number $N_A$ of particles reach X per
unit time after passing through A, and that a number $N_B$ reach X per unit
time by passing through B. Then, since on a wholehearted particle analysis
each particle reaching X must pass through exactly one aperture,
(2) Experiments can be unambiguously described only in classical terms
(see Bohr, 1949, p. 209).
(3) "Any given application of classical concepts precludes the simulta-
neous use of other classical concepts which in a different connection
are equally necessary for the elucidation of the phenomena" (Bohr,
1934, p. 10).
The limitations on classical concepts announced in (3) are due to the inde-
terminacies associated with the quantum of action: there is always a "finite
and uncontrollable interaction between the objects and the measuring in-
struments in the field of quantum theory" (Bohr, 1935a, p. 700), and this
precludes simultaneous ascription of, for example, position and momentum
to a particle. I discuss Bohr's interpretation of these indeterminacy relations
in Section 9.2. (For a full and sympathetic discussion of Bohr's views, see
Hooker, 1972.)
We have met (1) already, in Section 7.9; on Bohr's account, the use of a
particular concept (such as momentum or position) presupposes the exis-
tence of a particular physical situation; only when that situation obtains is
discourse involving that concept meaningful. Similarly with the wave-par-
ticle duality. The language associated with a particle model of physical
processes acquires meaning in specific experimental contexts. Further, con-
cepts are readily linked to particular models-momentum to the wave
model (as the wave number of the wave), and position to the particle model.
Bohr's position is elegantly summarized by Petersen (1963, p. 12):
In the language of physics there are various sets of concepts such as space and time,
and the so-called dynamical concepts like momentum and energy. Corresponding to
these different sets of concepts are different types of measuring tools. For example,
to determine the position of the object, one must use rulers firmly attached together
to form a reference frame. On the other hand, to measure an object's momentum one
may let it collide with a freely movable body of known mass, and then measure the
resultant velocity of the test body . . .
In quantum physics we use the same concepts [as in classical physics] and thus the
same measuring tools, but . . . the dissimilarity between the measuring tools be-
comes crucially important. Here we cannot use the different types of instruments in
combination. We cannot combine the information about the system that we get from
one type of instrument with the information we get from another. Therefore a
quantum physical phenomenon is characterized by the type of measuring instru-
ment we use. Two phenomena obtained by observing the same system with two
different types of instruments are mutually exclusive. Bohr called this logical rela-
tion of exclusion complementarity.
The lesson to be drawn is that if we use the same set of concepts in
ammeter. I can, without further elaboration, use the term ammeter in describing an experiment to any living physicist. This is because we share a
common theoretical vocabulary which includes electric current. But the concept of electric current is not a "refined version" of any concept at all that
was available to, say, Galileo. Still less can Bohr's claim be made on behalf of
magnetic flux density or electrostatic potential, both perfectly good classical
concepts.
My point is simply this, that the vocabulary of the experimenter and that
of the theoretical physicist (to make a dubious distinction) have always been
intertwined; as terms for radically new concepts enter the theoretician's
vocabulary, so they will enter the experimenter's. This process did not stop
at the stroke of midnight on December 31, 1899.
What holds for the concepts used in the theory holds also for the models
in which they appear. One problem which a new theory should not be
called upon to answer is why it makes only partial use of the models used by
its predecessors. Given the historical matrix from which quantum mechanics emerged, it is not surprising that a great deal of early quantum
theory was expressed in terms of wave and particle concepts. For every
physicist at the turn of the century, these were ready-to-hand pieces of
theoretical equipment. For sound pragmatic reasons physicists were loath to
discard them. In 1900, however, with Planck's attribution of particle properties to electromagnetic waves, they began to be used in unorthodox ways;
Planck's move was mirrored twenty-five years later by de Broglie's attribution of wave properties to electrons.
What, then, of the so-called wave-particle duality that results? I say more
about this duality in Section 10.2; however, we can agree with Bohr that
each model, while proving heuristically valuable, offers only a partial anal-
ogy to the behavior of light and matter. Further we can agree (how could we
not?) that the two models are mutually at odds. We can deny, however, that
there is any radical epistemological lesson to be drawn from all this.
These episodes in the prehistory of quantum theory do not teach us to
abjure a unified understanding of quantum phenomena in favor of a doc-
trine of epistemological complementarity, according to which we are com-
pelled to move to and fro between two incompatible ways of picturing the
world. They teach us merely that neither of these ways is fully adequate. We
can draw a different conclusion than did Bohr, even while agreeing with
him that "The two views on the nature of light are rather to be considered as
different attempts at an interpretation of experimental evidence, in which
the limitations of the classical concepts are expressed in complementary
ways" (1934, p. 56).
and, since
and so
Then we find that each electron arriving at the screen has triggered exactly
one of the counters, and exclusivity seems to be verified. But the presence of
the counters also destroys the pattern at the screen; when they are present
(8.20) holds (see Feynman, Leighton, and Sands, 1965, vol. 3, pp. 1.6-1.9).
This effect is certainly peculiar, and, as with many quantum effects, it is
tempting to see it as symptomatic of a deep-seated epistemological recalcitrance at the quantum level. But I think this temptation should be resisted.
The experiment with counters is designed to answer a specific question: are
the events A and B mutually exclusive? The answer it gives is unambiguous:
they are.
Of course, our interest in this particular question is a by-product of our
search for an account of the interference pattern at the screen, and a remark-
able effect of the experiment is that this pattern is replaced by another.
Nonetheless, although we would like to know why this effect takes place,
that's a separate problem. Even if we couldn't solve it there would be no
obvious reason why we shouldn't take the evidence the counters supply at
face value; they show that A and B are indeed mutually exclusive. (A similar
point is made by Fine, 1972, p. 25.)
If we accept this result, then we need to locate the problem in the derivation of (8.19) elsewhere. Putnam (1969) suggested that the illicit move in
this derivation is that from (8.13) to (8.14) (see Section 7.8). As he pointed
out, this move is an application of the distributive law, which doesn't hold
within quantum logic. It is certainly true that if we reject the distributive law,
then the inference from (8.13) to (8.14), and hence the derivation of (8.19), is
blocked. The trouble is, this is a purely negative result. It merely tells us that
the additive pattern is not guaranteed at the screen. It gives us no reason
why, in general, the interference pattern occurs, nor why the interference
pattern becomes the additive pattern when the screen is moved close
enough to the diaphragm. (Compare Bub, 1977; see also Gibbins, 1981b,
and 1987, pp. 147-151.)
Clearly, the simple rejection of a particular law of logic will not supply
much in the way of an explanation of what goes on. And, from Putnam's
1969 paper, one might well think that the only important thing about
quantum logic was that it gave up the distributive law. However, as Putnam
has recognized (Friedman and Putnam, 1978), the quantum-logical ap-
proach can offer a much deeper analysis of the problem than this. This
alternative analysis suggests that, rather than sniffing suspiciously at individual moves in the derivation of (8.19), we should reject the whole derivation. For the probabilities we are dealing with are assigned, not by probability functions on a classical probability space, as the derivation assumes, but
by generalized probability functions defined on a Hilbert space. In fact Bub
This term gives the difference between $p_c(X)$ and $\tfrac{1}{2}[p_a(X) + p_b(X)]$. It only
vanishes for all $\Delta x$ when $t = 0$, that is, when the screen is very close to the
diaphragm.
Finally, consider the case when there are counters present. Assume, for
example, that, with both apertures open, the counter beside the A-aperture
registers an electron. Then, after the event $A \vee B$, another event $A$ has
occurred, to wit, the restriction of the electron to the region round the
A-counter. Provided that the counter is sufficiently close to the aperture for
no significant evolution of the state to occur between the two, the effect will
be a two-stage transition,
With the counters present, the ensemble of electrons will be divided into
two subensembles in states $\Psi_A$ and $\Psi_B$. At the screen this will give the
statistics of a mixture of the states $U_t\Psi_A$ and $U_t\Psi_B$, and the additive pattern
characteristic of classical particles will appear. What was previously an odd
and inexplicable effect drops out quite naturally from the analysis.
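The contrast between the interference pattern and the additive pattern can be illustrated with a toy calculation. This is only a sketch under strong assumptions: each aperture contributes a unit-modulus amplitude whose phase depends on path length, and the wave number, slit positions, and screen distance are invented for the example. It does not model the time evolution $U_t$ of wave packets discussed above.

```python
import cmath
import math

# Toy model (assumed, not from the text): unit-modulus amplitudes from two slits.
K = 20.0         # hypothetical wave number
DIST = 10.0      # hypothetical diaphragm-to-screen distance
SLIT_A = +1.0    # hypothetical position of aperture A
SLIT_B = -1.0    # hypothetical position of aperture B

def amplitude(x, slit):
    """Amplitude at screen position x for a path through the given slit."""
    path = math.hypot(DIST, x - slit)
    return cmath.exp(1j * K * path)

def superposition_pattern(x):
    """Pure state (Psi_A + Psi_B)/sqrt(2): amplitudes add before squaring."""
    a, b = amplitude(x, SLIT_A), amplitude(x, SLIT_B)
    return abs((a + b) / math.sqrt(2)) ** 2

def mixture_pattern(x):
    """Equal-weight mixture of Psi_A and Psi_B: probabilities add."""
    a, b = amplitude(x, SLIT_A), amplitude(x, SLIT_B)
    return 0.5 * abs(a) ** 2 + 0.5 * abs(b) ** 2

def interference_term(x):
    """Difference between the two patterns, Re(a * conj(b))."""
    a, b = amplitude(x, SLIT_A), amplitude(x, SLIT_B)
    return (a * b.conjugate()).real

for x in (0.0, 0.7, 1.4):
    print(x, superposition_pattern(x), mixture_pattern(x), interference_term(x))
```

In this toy model the mixture gives a flat relative intensity, while the superposition oscillates around it by exactly the interference term.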
How does this analysis relate to previous chapters, and where does it
leave us? The lesson of Chapters 6 and 7 was that we might be better off if
we dispensed with talk of the "properties" of a quantum system. Probably
the hardest property to free ourselves from conceptually is that of the
system's position in space. For if we stop attributing a position to a system at
all times, we will no longer be able to describe the electron in experiment c as
passing either through aperture A or through aperture B. Thus we will no
longer be able to regard it as mediating a causal process, at least insofar as we
require such processes to be characterized by spatio-temporal continuity.
We will be left with a story told, not in terms of causal processes, but in terms
of quantum events and their probabilities conditional on other quantum
events.
Of course, similar stories can be told for classical processes involving
probabilities. The difference is that in the classical case, when the probabili-
ties are Kolmogorov probabilities, causal supplements of these stories are
available; for an example, consider the way in which the derivation of (8.19)
earlier in this section could be supplemented by a causal account in terms of
particles. However, when quantum probabilities defined on a Hilbert space
are involved, no such causal supplementation is possible. Nevertheless,
contra Kant, this doesn't make a quantum story unintelligible. And, for all its
unfamiliarity, the account of the two-slit experiment outlined above has one
great merit: it tallies with the facts.
affected by what is done to a second system spatially separated from the
first. Although Wigner's formulation of the theorem, in terms of probabilities, was used, these probabilities were interpreted as the statistical interpretation suggests, that is, as relative frequencies of the occurrence of properties within an ensemble.
However, we can now redescribe Wigner's result in terms of probability
theory alone. His proof demonstrates that no probability function on a
certain kind of classical probability space can yield the probability assignments of quantum theory.
As we saw, Wigner considers assignments of probabilities to sextuples
$(i,j,k;l,m,n)$. Each member of a sextuple is either + or -; i, j, and k represent
values of certain components of spin, $S^1_a$, $S^1_b$, and $S^1_c$, for particle 1, and l, m,
and n values for the same components of spin for particle 2, $S^2_a$, $S^2_b$, and $S^2_c$.
These $2^6$ sextuples provide a partition of a classical probability space, that is,
a set of mutually exclusive and jointly exhaustive events. It turns out that no
classical probability assignment to this partition can yield the quantum
statistics for $S^1_a$, $S^2_a$, et cetera.
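A small calculation shows how such a classical partition comes into conflict with the quantum statistics. The sketch below rests on assumptions of my own: it uses perfect anticorrelation to reduce the sextuples to the eight triples for particle 1, it uses one familiar Wigner-type inequality, $p(a+;b+) \le p(a+;c+) + p(c+;b+)$, and it takes the singlet-state value $\tfrac{1}{2}\sin^2\tfrac{1}{2}\theta$ for the joint probability of two + results along directions separated by the angle $\theta$. The three coplanar directions at 0, 60, and 120 degrees are chosen for illustration.

```python
import itertools
import math

def classical_inequality_holds():
    """Check the inequality for every deterministic triple (a, b, c) of particle 1.

    Assumption: perfect anticorrelation, so particle 2 shows + along a direction
    exactly when particle 1 would show - along it. If the inequality holds for
    each triple, it holds for every probability assignment to the partition.
    """
    for a, b, c in itertools.product((+1, -1), repeat=3):
        lhs = int(a == +1 and b == -1)                                  # (a+ ; b+)
        rhs = int(a == +1 and c == -1) + int(c == +1 and b == -1)       # (a+ ; c+), (c+ ; b+)
        if lhs > rhs:
            return False
    return True

def quantum_joint(theta):
    """Singlet-state probability of + on both particles, angle theta apart."""
    return 0.5 * math.sin(theta / 2) ** 2

# Hypothetical coplanar directions: a at 0, c at 60, b at 120 degrees.
quantum_lhs = quantum_joint(math.radians(120))       # a and b
quantum_rhs = 2 * quantum_joint(math.radians(60))    # a and c, plus c and b

print(classical_inequality_holds(), quantum_lhs, quantum_rhs)
```

Every classical assignment satisfies the inequality, yet for these directions the quantum left-hand side exceeds the right-hand side.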
Bub (1974, chap. 6) has argued that this version of Bell's theorem just
provides further evidence that quantum mechanics requires a nonclassical
account of probability. Indeed, the problem with the postulated probability
space is not far to seek. Effectively, the sextuples defining the members of
the partition are assumed to be sixfold classical conjunctions; thus the set
$\{(S^1_a,+), (S^1_b,+), (S^1_c,+)\}$, for instance, is assumed jointly compatible. In the
event structure of quantum mechanics, however, this is just not so. Nor,
crucially, can the quantum Hilbert-space structure be embedded into a
classical (Boolean) structure on which this partition might be defined; we
know this from Kochen and Specker's (extended) theorem. The postulated
classical probability space was therefore doomed to inadequacy, indepen-
dently of considerations involving coupled systems. Bub concludes that the
Bell argument "has nothing whatsoever to do with locality" and empha-
sizes the point by generating a similar inequality for a single particle, using a
classical partition with $2^3$ members, each of form $(i,j,k)$ (p. 83).
Accardi and Fedullo (1982) have done likewise. Nonetheless, as Bub now
acknowledges, more can be said about the two-particle version of Bell's
inequality, in particular about the problems it raises for our concept of
causality.
Electron-positron pairs are produced, with the composite system in the
so-called singlet spin state $\Psi$. This is a pure state in the tensor product space
$\mathcal{H}^e \otimes \mathcal{H}^p$:

$$\Psi = \frac{1}{\sqrt{2}}\,(v_+ \otimes u_- \;-\; v_- \otimes u_+)$$

where $v_+$ and $v_-$ are the eigenvectors of some component of spin, $S^e_\alpha$ for the
electron, and $u_+$ and $u_-$ are the eigenvectors of the same component of spin
for the positron, $S^p_\alpha$.
To the vector $\Psi$ corresponds the density operator $D_\Psi$ on $\mathcal{H}^e \otimes \mathcal{H}^p$:
This yields the two reduced states $D^e$ and $D^p$ for the two components of the
coupled system; however, these are mixed states rather than pure states (see
Section 5.8). In fact, they are mixed states without unique orthogonal decompositions; we have:
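$$D^e = \tfrac{1}{2}P^e_{\alpha+} + \tfrac{1}{2}P^e_{\alpha-}$$

$$D^p = \tfrac{1}{2}P^p_{\alpha+} + \tfrac{1}{2}P^p_{\alpha-}$$

and, in general,

$$p_\Psi(S^e_\alpha, +\tfrac{1}{2};\; S^p_\beta, +\tfrac{1}{2}) = \tfrac{1}{2}\sin^2 \tfrac{1}{2}\widehat{\alpha\beta} \qquad (8.23)$$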
The problem is, how are we to explain these correlations? We can sort out
putative explanatory accounts into rough groups; interaction accounts suggest that the correlations are due to interactions between the component
systems after they have separated, while preparation accounts trace the
correlations back to the original preparation, either of the composite system
(type S), or of the experimental set-up (type E). Each kind of account, it turns
out, runs counter to our basic beliefs about causality. (Note that a causal
preparation account would involve what Salmon, 1984, chap. 6, calls an
interactive fork. Ah, well.)
As an elementary example of an interaction account, let us hypothesize
that the performance of an experiment on one particle (the positron, say)
changes the state of the other. Assume, for the sake of argument, that the
$\alpha$-component of spin is measured for the positron and found to have value
$+\tfrac{1}{2}$. Then the probabilities assigned to measurement results on the electron
of the same pair will change. Whereas we had, for any direction $\beta$,
$p(S_\beta, +\tfrac{1}{2}) = 0.5$, the correlation now gives us $p(S_\beta, +\tfrac{1}{2}) = \sin^2 \tfrac{1}{2}\widehat{\alpha\beta}$. But these
are just the probabilities assigned to events $(S_\beta, +\tfrac{1}{2})$ when the electron is in
the $\alpha_-$ eigenstate of spin (see Chapter 4).
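These probabilities can be recovered directly from the singlet state. The following is a sketch under simplifying assumptions: both spin directions lie in a single plane, so that real two-component vectors suffice, and the angles chosen for $\alpha$ and $\beta$ are arbitrary. The function names are hypothetical.

```python
import math

def up(theta):
    """'+' eigenvector of spin along a direction at angle theta (in one fixed plane)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def down(theta):
    """'-' eigenvector of spin along the same direction."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def singlet_amplitude(e_vec, p_vec):
    """Amplitude of the product vector e_vec (x) p_vec in the singlet state.

    The singlet is taken as (up-down minus down-up)/sqrt(2) in the reference basis.
    """
    return (e_vec[0] * p_vec[1] - e_vec[1] * p_vec[0]) / math.sqrt(2)

def joint(e_vec, p_vec):
    """Joint probability of the electron event e_vec and the positron event p_vec."""
    return singlet_amplitude(e_vec, p_vec) ** 2

alpha, beta = 0.4, 1.5   # arbitrary illustrative angles, in radians

# Marginal probabilities: each is one half, whatever the directions.
p_positron_plus = joint(up(beta), up(alpha)) + joint(down(beta), up(alpha))
p_electron_plus = joint(up(beta), up(alpha)) + joint(up(beta), down(alpha))

# Probability of (S_beta, +) for the electron, given (S_alpha, +) for the positron.
conditional = joint(up(beta), up(alpha)) / p_positron_plus

# Probability of (S_beta, +) for an electron in the alpha-minus eigenstate.
b, a = up(beta), down(alpha)
from_eigenstate = (b[0] * a[0] + b[1] * a[1]) ** 2

print(p_electron_plus, p_positron_plus, conditional, from_eigenstate)
```

The conditional probability coincides with the probability computed from the $\alpha_-$ eigenstate, and both equal $\sin^2 \tfrac{1}{2}\widehat{\alpha\beta}$ for the chosen angles.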
On our hypothesis, the measurement on the positron has effected a
change in the state of the electron. Prior to the measurement it was in the
mixed state $D^e$; subsequently it is in the pure state $P^e_{\alpha-}$. However, the
hypothesis seems to raise as many problems as it solves. In particular, how
can we account for this interaction without contravening the special theory
of relativity (STR)? For it is a fundamental result of that theory, variously
called the principle of Einstein-separability or Einstein-locality, that no causal
signals can propagate at a speed faster than light. And, in the first place,
most of the experimental tests confirming quantum-mechanical predictions
for coupled systems have looked not at spin correlations for an electron-
positron pair, but at polarization correlations between photons; these pho-
tons travel (of course) at the speed of light, and so only a signal traveling
faster than that could pass between them (see Clauser and Shimony, 1978;
d'Espagnat, 1979). Second, even if the interaction involved an electron-
positron pair (and some have been done using proton-proton pairs), it
should be possible to perform an experiment on particle 2 which, although
performed later than the experiment on particle 1 in the laboratory frame of
reference, is nevertheless space-like separated from it (Taylor and Wheeler,
1963), so that, according to STR, no causal transaction could take place
between the two.
STR is one of the most firmly established and best corroborated theories
of modern physics. We should be, at least, deeply suspicious of any account
of the EPR correlations which violates it. However, as Bell (1964, p. 199)
pointed out, it would not be a direct contravention of STR to postulate that
the setting of one measurement device affected the results obtained on the
other. Such interactions would violate locality in one sense, in that the
devices would not function independently of one another, but it would not
necessarily violate Einstein-locality; the postulated interactions could prop-
agate at a speed less than that of light and achieve their effects before the
actual measurements occurred. The proposed solution is, in effect, a preparation account of type E, and traces the correlations to the experimental
set-up. It recalls Bohr's dictum: "The problem again emphasizes the necessity of considering the whole experimental arrangement, the specification of
which is imperative for any well-defined application of the quantum-mechanical formalism" (Bohr, 1949, p. 230). Though Bohr is (again) making a
point about the conditions for meaningful discourse, rather than offering a
causal account of the correlations, any experiment which puts this particular
causal account to the test will also tell us whether Bohr's holistic resolution
of the EPR problem is adequate (contra Leggett, 1986, p. 44; for Bohr's
treatment of EPR see Bohr, 1935a, 1949).
Such an experiment, using correlated polarizations of photons, was suggested by Aspect (1976). His idea was to change "rapidly, repeatedly and
independently the orientations of the polarizers." Each change of orientation of a polarizer was to be space-like separated from the corresponding
experiment carried out with the other. Aspect continued, "Thus one finds as
a consequence of the principle of separability [Einstein-locality] that the
response of one polarizer, when analyzing a photon, cannot be influenced
by the orientation of the other polarizer at the same time (when analyzing
the coupled photon)." The experiment was performed by Aspect, Dalibard,
and Roger. They reported that, "The result violates the generalized Bell
Inequality . . . and is in good agreement with [the quantum-mechanical
prediction]" (Wheeler and Zurek, 1983, p. 442n).
On the one hand, their result both undercuts Bohr's response to the EPR
paper and effectively rules out type-E preparation accounts of the statistical
correlations between the measurements. In order to avoid invoking super-
luminal signals, these accounts appeal to the prior configuration of the
apparatus; however, the statistical relations are the same even when there
is, so to speak, no prior configuration. On the other hand, the result also
confirms our earlier suspicions about interaction accounts. It suggests that
all interaction accounts of the EPR correlations, whether they trace these
correlations to interactions between the component systems or between the
measurement devices, will violate the principle of Einstein-locality.
A statement which is at the same time more general and more precise than
this has been proved by Hellman (1982a). He shows that, if any deterministically Einstein-local theory gives anticorrelation results for two distinct
observables, so that, for example,
and
and
In anticipation of Section 8.7 I might also add that the completeness
requirement (8.24) strongly resembles part of Reichenbach's and of
Salmon's specification of what it is for $\lambda$ to be the common cause of two
statistically governed events +e and +p. As we shall see, it is not only
deterministic causality which is threatened by the violation of the Bell
inequalities.
$$P_{exp} \cup P_{met} \vdash I$$
Quantum mechanics predicts the correlations of $P_{exp}$ but also predicts results
at odds with $I$. Experimental results which bear out the quantum-mechanical predictions thus tell us that $I$ does not hold but that the premises in $P_{exp}$
do. It follows that some or all of the premises in $P_{met}$ must be discarded.
The usual Duhemian reservations of course apply. We might, for in-
stance, consider the allegedly theory-neutral correlation experiments to be
so infected with theoretical assumptions that $P_{met}$ could be rescued (see
Shimony, 1981). But in the cases to hand there seems little doubt that we are
genuinely, and remarkably, putting metaphysica l theses to experimental
test.
Furthermore, it follows that as many different theses are being tested as
there are sets of premises from which Bell-type inequalities can be derived.
New derivations of I are thus interesting insofar as they start from different
premises and make explicit the set $P_{met}$ of assumptions at work. For example,
we have already seen tested (a) the thesis that the quantum statistics may be
reconstructed on a classical probability space (Wigner), and (b) the thesis
that quantum mechanics (and the world) is deterministically Einstein-local
(Hellman).
As with both of these examples, negative results do two related things.
They rule out certain kinds of reconstructions or amplifications of quantum
theory (hidden-variable theories), and they also rule out the possibility of
explaining the EPR correlations in certain kinds of ways. Hellman's result,
for instance, tells us that we will look in vain for a deterministically Ein-
stein-local account of them, Jarrett's that we should not accept any stochas-
tically Einstein-local account involving complete state descriptions.
A particularly striking derivation of the inequality, by van Fraassen
(1982), is closely related to Jarrett's. It tells us that the correlations are not to
be explained by reference to a common cause, and threatens any preparation
account which invokes that concept. In van Fraassen's derivation, $P_{exp}$ contains, together with the usual anticorrelation statements, premises which he
calls statements of "Surface Locality." We have already met them as statements of Parameter Independence (stochastic Einstein-locality): they state
that the probability of a particular outcome of an experiment on one particle
is not affected by the fact that a measurement is being performed on the
other, whatever the latter experiment may be. Van Fraassen makes the point
that these premises, like all the others in $P_{exp}$, are, indeed, obtainable by
induction from experiment.
$P_{met}$ contains three different kinds of premises, labeled "Causality,"
"Hidden Locality," and "Hidden Autonomy." The notion of a common
cause which these premises are designed to capture is due to Reichenbach
(1956, pp. 160-161; see also Salmon, 1984, chap. 6, especially pp. 158-163). He sought an account of a causal mechanism which affected probabilities, and so would be appropriate in a nondeterministic setting. In particular,
he wished to supply a causal account of statistical correlations.
He suggested that a correlation between events A and B is attributable to a
common cause C provided that C precedes A and B in time, and
$$p(A \,\&\, B \mid C) = p(A \mid C) \cdot p(B \mid C)$$
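The factorization condition can be illustrated with a classical toy distribution. The sketch below shows only this one condition, not the whole of Reichenbach's analysis, and every number in it is hypothetical: a cause C occurs with probability one half, and A and B are each more likely when C is present.

```python
# Hypothetical numbers chosen for illustration only.
P_C = 0.5
P_A_GIVEN = {True: 0.9, False: 0.1}   # p(A | C) and p(A | not-C)
P_B_GIVEN = {True: 0.8, False: 0.2}   # p(B | C) and p(B | not-C)

def joint(a, b, c):
    """p(A=a, B=b, C=c), with A and B independent once C is fixed."""
    pc = P_C if c else 1 - P_C
    pa = P_A_GIVEN[c] if a else 1 - P_A_GIVEN[c]
    pb = P_B_GIVEN[c] if b else 1 - P_B_GIVEN[c]
    return pc * pa * pb

bools = (True, False)
p_a = sum(joint(True, b, c) for b in bools for c in bools)
p_b = sum(joint(a, True, c) for a in bools for c in bools)
p_ab = sum(joint(True, True, c) for c in bools)

# Conditional on C the joint probability factorizes ...
p_ab_given_c = joint(True, True, True) / P_C
factorized = P_A_GIVEN[True] * P_B_GIVEN[True]

# ... although unconditionally A and B are correlated.
print(p_ab, p_a * p_b, p_ab_given_c, factorized)
```

Here A and B are correlated overall, but the correlation disappears once one conditionalizes on C.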
(8.29) Type-S preparation accounts invoking a common cause are ruled out
by van Fraassen's result.
However, Reichenbach's analysis is the best, arguably the only, causal
account we have of statistical correlations between separated events. No
preparation account, it seems, can both save the quantum phenomena and
explain them in terms of causal processes. But, from (8.25a), any interaction
account which does so will need to invoke superluminal causal signals.
Either way, the prospects for a causal explanation of these correlations look
bleak.
Correlation and Surface Locality as empirical principles, the conditional
probabilities appearing in them are thought of as classical conditional probabilities. (Note, in this regard, that van Fraassen's analysis is entirely in
terms of a classical probability space.) However, in the cases we are dealing
with here, it turns out that the two conditionalizations coincide. It was
shown in Section 8.2 that, if A and B are quantum events associated with the
two components of a composite system, then the Lüders rule reduces to its
classical counterpart (see also Appendix C); we have
Let us then return to the electron-positron pair in the singlet spin state $D_\Psi$
(see Section 8.6). This state of the composite system yields the two reduced
states, $D^e$ and $D^p$, for the components. The probability $p$ is given by

$$p(S^e_\alpha,+) = \tfrac{1}{2} = p(S^p_\beta,+)$$

$$p(S^e_\alpha,+;\; S^p_\beta,+) = \tfrac{1}{2}\sin^2 \tfrac{1}{2}\widehat{\alpha\beta}$$
Here, and in what follows, the function p is the union of three generalized
probability functions; it takes as arguments events associated with the elec-
tron, events associated with the positron, and conjunctions of an electron
event and a positron event, giving them the values assigned by the states $D^e$,
$D^p$, and $D_\Psi$, respectively.
Perfect Correlation follows trivially from (8.9):
(8.30)
Note in passing that the probabilities $p(S^e_\alpha,+)$ and $p(S^p_\beta,+)$ are not statistically independent; quantum mechanics, as we expected, violates Jarrett's
completeness condition (8.24a-b).
The account of Perfect Correlation given above starts from (8.9), and this
in turn is derived by Lüders-rule conditionalization on a tensor-product
space. Thus the electron-positron pair is treated as a whole even when the
two components are spatially separated. The correlation is not predictable
from the states $D^e$ and $D^p$ of the two components, but from the state $D_\Psi$ of
the composite system; the system e + p is therefore not reducible to the sum
of its parts. Indeed, it is a consequence of the way that quantum mechanics
constructs the probability space $\mathcal{H}^e \otimes \mathcal{H}^p$ for e + p from the probability
spaces $\mathcal{H}^e$ and $\mathcal{H}^p$ of the components that this is so. In this respect the tensor
product of two quantum probability spaces differs radically from the prod-
uct of two classical probability spaces. Stairs (1984, p. 357; see also 1983a)
puts the point admirably:
Because of the way Boolean algebras (or, more importantly, classical probability
spaces) combine, every measure on the product space will either render the systems
statistically independent or else will be a statistical mixture of such measures. On the
other hand, if the systems are associated with quantum logical fields of propositions,
then their product need not exhibit this feature. That is, there may be propositions
about the pair of systems which are neither equivalent to nor implied by conjunctions such as a & b, and there may be measures which are not decomposable into
measures which render the subsystems independent.
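$$
\begin{aligned}
&= \sum_i \mathrm{Tr}(P_i D P_i Q) \\
&= \sum_i \mathrm{Tr}(D Q P_i) \\
&= \mathrm{Tr}\Big(D Q \sum_i P_i\Big) \\
&= \mathrm{Tr}(D Q I) \\
&= \mathrm{Tr}(D Q) \\
&= p_D(Q)
\end{aligned}
$$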
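The chain of equalities above can be checked numerically. The sketch assumes, as the step from the first line to the second appears to require, that $Q$ commutes with every $P_i$ and that the $P_i$ sum to the identity; the particular three-dimensional matrices are hypothetical and were chosen so that $D$ itself does not commute with the $P_i$.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))

third = 1.0 / 3.0
D = [[third] * 3 for _ in range(3)]                          # projector onto (1, 1, 1)/sqrt(3)

# Two mutually orthogonal projectors that sum to the identity (assumed example).
P1 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
P2 = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A projector Q that commutes with both P1 and P2 (assumed example).
Q = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.0, 0.5, 0.5]]

# First line of the chain: sum over i of Tr(P_i D P_i Q).
lhs = sum(trace(matmul(matmul(matmul(p, D), p), Q)) for p in (P1, P2))

# Last lines of the chain: Tr(D Q).
rhs = trace(matmul(D, Q))

print(lhs, rhs)
```

With these matrices the two traces agree, even though $D$ is altered by the individual projections.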
we obtain

$$p(S^e_\alpha,+) = \tfrac{1}{2} = p(S^p_\beta,-)$$

and
In some respects, the account of the EPR correlations which this analysis
gives us resembles an interaction account, in others a preparation account. It
is a preparation account in that the source of these correlations is the preparation of pairs of particles in the singlet spin state. It is an interaction account
in that an experiment performed on one particle effectively changes the
state of the other; conditionalization on an event $(S^p_\alpha,+)$ associated with the
positron "projects" the state of the electron into the eigenstate $\alpha_-$ of spin.
The proof of this last result, to be given shortly, provides a summary of this
section. And an examination of this proof will show the crucial respect in
which the present account of the EPR correlations differs from those pro-
posed in Section 8.6. There it was tacitly assumed that, whatever type of
account was forthcoming, whether interaction or preparation, it would tell a
causal story. But, as we saw in Section 8.7, there is good reason to think that
no causal explanation can yield the quantum-mechanical statistics. In contrast, the present account has no causal component. To recapitulate, it traces
the EPR correlations to three nonclassical features of quantum probability
spaces. The first is that, in these spaces, probability measures and density
operators are in one-to-one correspondence, the second is that conditionalization on these spaces is given by the Lüders rule, and the third is the way in
which the tensor-product spaces associated with composite quantum systems are related to the spaces associated with their components.
As I commented earlier, this third feature tells us that the components of
such systems cannot be treated independently, even when they are spatially
separated from each other. But we can now see that this particular "nonlocality" need not be thought to conflict with the Special Theory of Relativity.* No superluminal transmission capable of carrying information is involved. If we deal with an ensemble of pairs, this fact is shown by Surface
Locality. In the case of a single pair, the occurrence of, say, the event $(S^e_\alpha,+)$
associated with the electron could never on its own tell us that an event
$(S^p_\alpha,-)$ had taken place. True, it would tell us that, if an $S^p_\alpha$-event of any kind
had occurred (that is, if $S_\alpha$ had been measured for the positron), then it
must have been the event $(S^p_\alpha,-)$. But this fact on its own is not inconsistent
with relativity theory. It is easy to imagine unproblematic, everyday examples involving pairs of billiard balls in which similar "superluminal signals"
are sent and received, as when, from a prior configuration, the red falls into
pocket a if and only if the black falls into pocket b.
It remains, then, to show how the event $(S^p_\alpha,+)$ "projects" the electron's
state from $D^e$ to $P^e_{\alpha-}$. Here I will give a purely formal account; the interpretation of this "projection" is discussed in Chapters 9 and 10.
* See Shimony (1980, p. 4); Jarrett (1984, pp. 575-578); Cushing and McMullin (1989). But
see also Shimony (1986); and Section 10.2, below.
The set $\{v_{++}, v_{+-}, v_{-+}, v_{--}\}$ is an orthonormal basis for $\mathcal{H}^e \otimes \mathcal{H}^p$. (See
Section 5.7.)
The singlet spin state $D_\Psi$ for the composite system is the projector onto the
vector $\frac{1}{\sqrt{2}}(v_{+-} - v_{-+})$.
According to the Lüders rule, conditionalization on $(S^p_\alpha, +)$ projects $D_\Psi$
into $D'$, where
* Equation (8.39) is easily obtained using the Dirac notation for projectors, since
$D_\Psi = \frac{1}{2}|v_{+-} - v_{-+}\rangle\langle v_{+-} - v_{-+}|$; see Dirac (1930, p. 25).
Whence
From (8.40) and (8.41), and the discussion at the end of Section 1.13, we
obtain
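The displayed equations of this formal account have not survived in the present text, but the computation it describes can be checked numerically. The following is a minimal sketch, not the author's own derivation: it assumes the basis ordering $v_{++}, v_{+-}, v_{-+}, v_{--}$ (electron index first), takes the Lüders rule in the form $D \to PDP/\mathrm{Tr}(PD)$, and obtains the electron's state by a partial trace over the positron. All function and variable names are illustrative.

```python
from math import sqrt

def outer(u, v):
    """Return the operator |u><v| as a nested list."""
    return [[a * b.conjugate() for b in v] for a in u]

def matmul(a, b):
    """Plain matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def kron(a, b):
    """Tensor (Kronecker) product; the first factor is the electron."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def luders(d, p):
    """Lueders rule (assumed form): D -> P D P / Tr(P D)."""
    pdp = matmul(matmul(p, d), p)
    norm = trace(matmul(p, d))
    return [[x / norm for x in row] for row in pdp]

def electron_state(d):
    """Partial trace over the positron factor of a 4x4 operator."""
    return [[d[2 * i][2 * j] + d[2 * i + 1][2 * j + 1] for j in range(2)]
            for i in range(2)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
P_UP = [[1.0, 0.0], [0.0, 0.0]]
P_DOWN = [[0.0, 0.0], [0.0, 1.0]]

# Singlet vector (v+- - v-+)/sqrt(2) in the basis v++, v+-, v-+, v--.
singlet = [0.0, 1 / sqrt(2), -1 / sqrt(2), 0.0]
d_psi = outer(singlet, singlet)

positron_up = kron(IDENTITY, P_UP)
positron_down = kron(IDENTITY, P_DOWN)

electron_before = electron_state(d_psi)
electron_after = electron_state(luders(d_psi, positron_up))

# Non-selective case: the positron is measured but the result is ignored.
unread = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(
    matmul(matmul(positron_up, d_psi), positron_up),
    matmul(matmul(positron_down, d_psi), positron_down))]
electron_unread = electron_state(unread)
```

Under these assumptions `electron_before` is the maximally mixed state, `electron_after` is the projector onto the electron's spin-down vector, and `electron_unread` coincides with `electron_before`, which is the sense in which no information is carried to the electron's side.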
* I now find that McMullin (1977) has already used this phrase; he has in mind something
closer to Cartwright's simulacrum account of explanation (see below) than to my structural
explanations.
inertial frames. The answer offered in the last decade of the nineteenth
century was to say that measuring rods shrank at high speeds in such a way
that a measurement of this velocity in a moving frame always gave the same
value as one in a stationary frame. This causal explanation is now seen as
seriously misleading; a much better answer would involve sketching the
models of space-time which special relativity provides and showing that in
these models, for a certain family of pairs of events, not only is their spatial
separation x proportional to their temporal separation t, but the quantity x/t
is invariant across admissible (that is, inertial) coordinate systems; further,
for all such pairs, x/t always has the same value. This answer makes no
appeal to causality; rather it points out structural features of the models that
special relativity provides. It is, in fact, an example of a structural explanation.
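The structural point about x/t can be illustrated with a short calculation. The sketch below assumes one spatial dimension, units in which the invariant velocity is 1, and the standard Lorentz boost; the function name and the sample separations are illustrative only.

```python
from math import sqrt

C = 1.0  # invariant velocity in natural units (an assumption of this sketch)

def boost(t, x, v):
    """Transform the separation (t, x) to an inertial frame moving at velocity v."""
    gamma = 1.0 / sqrt(1.0 - (v / C) ** 2)
    return gamma * (t - v * x / C ** 2), gamma * (x - v * t)

FRAME_VELOCITIES = (0.0, 0.3, -0.6, 0.9)

def ratios(t, x):
    """Return x/t for the same pair of events as described in several frames."""
    result = []
    for v in FRAME_VELOCITIES:
        t_new, x_new = boost(t, x, v)
        result.append(x_new / t_new)
    return result

# A pair of events with x/t = C: the ratio is the same in every frame.
lightlike_ratios = ratios(2.0, 2.0)

# A pair with x/t = 0.5: the ratio now depends on the frame.
timelike_ratios = ratios(2.0, 1.0)
```

Only for the family of pairs with x/t = C does the ratio survive the change of frame; nothing in the calculation refers to a mechanism acting on rods or clocks.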
If one believes (as I do) that scientific theories-even those expressed in
highly abstract form-provide explanations, then one's account of expla-
nations will be tied to one's account of scientific theories. Consider, for
example, the view that an ideal scientific theory should be laid out axiomatically,
in the manner of Euclid's geometry, with particular results deducible
from general laws, and those in turn deducible from a few fundamental
axioms. This axiomatic view of theories ties in naturally with a "covering
law" account of explanation of the kind favored by Hempel (1965), who
suggested that one event, or set of events, could explain another if the
second could be deduced from the first, given the laws of nature. We may
contrast this view with the semantic view of theories appealed to in this
book. On this view a theory provides a set of models, and ground-level
explanation consists in exhibiting relevant features of these mathematical
structures.
The term ground-level is important. Explanation comes at many levels, as
does scientific theorizing. It is the foundational level which concerns us
here, since it is at this level that structural explanation occurs. Cartwright
(1983), who also distinguishes two levels of explanation, calls them
"causal" and "theoretical" (p. 75) and argues convincingly for a simulacrum
account of the former. She writes, "To explain a phenomenon is to find a
model that fits it into the basic framework of the theory" (p. 152). The
models she refers to here she calls "simulacra" to emphasize the partial
representation of phenomena which they provide. In Section 2.9 I distin-
guished models of this kind from the mathematical models which, on the
semantic view of theories, appear in the exposition of the "basic framework
of the theory." It is this second kind of model which is appealed to in a
theoretical explanation.
A related distinction, between different kinds of theories, was first made
by Einstein, and has been emphasized by Bub (1974, pp. viii, 143) and
Demopoulos (1976, p. 721). For these authors, quantum mechanics, like
special relativity, is a "principle theory." Such theories may be contrasted
with "constructive theories," like the kinetic theory of gases, which show
how one theory (such as the phenomenological theory of gases) can be
embedded in another (in this case Newtonian mechanics). Principle
theories, in contrast, are foundational. They "introduce abstract structural
constraints that events are held to satisfy" (Bub, 1974, p. 143). They do so by
supplying models which display the structure of a set of events. The four-
dimensional manifold postulated by special relativity models the structure
of the set of physically localizable events; the Hilbert spaces of quantum
mechanics are models of the possibility structure of the set of quantum
events. (Here I echo Bub, 1974, and, in particular, Stairs, 1982; see also
Stairs, 1984.)
Whenever we appeal to a principle theory to provide a theoretical expla-
nation, I claim, the explanation consists in making explicit the structural
features of the models the theory employs. In the same way that we explain
the constancy of one particular velocity with respect to all inertial frames by
appealing to the structure of Minkowski space-time, we explain paradoxical
quantum-mechanical effects by showing, first of all, how Hilbert spaces
provide natural models for probabilistic theories (as in Chapters 3 and 4),
and, second, what the consequences of accepting these models are (as in the
present chapter).
A theory of the kind Salmon looks forward to, which brings with it a new
conception of causality, will also, presumably, be a principle theory. And
should we identify certain processes as causal in this new theory, and appeal
to them within scientific explanations, it seems likely that these explana-
tions will effectively be structural explanations; that is to say, in providing
them we will isolate a particular class of elements and relations within the
representations the theory provides.
9
Measurement
In the last three chapters we have seen the pairs $(A, \Delta)$ treated variously as
properties of systems, as propositions in a quantum logic, and as quantum
events. The last interpretation seems the most promising: as we saw, talk of
the properties of quantum systems is problematic, and talk of the propositions
of a quantum logic is uninstructive unless these propositions are
themselves interpreted.
But these pairs were originally introduced as experimental questions, to
which the theory assigned probabilities and to which individual experiments
gave the answers yes or no. In Chapter 2 the question $(A, \Delta)$ was
glossed as, "Will the measurement of observable A yield a result in the set $\Delta$
of the reals?" During the course of this chapter I will clarify the relation
between quantum events and experimental questions, but the main topic
addressed is the measurement process itself and the account of it available
in quantum theory: can the theory tell us what goes on when "a measurement
of A yields a result in the set $\Delta$"? As a preliminary, I discuss a principle
which has often been taken to imply a constraint on possible measurements,
namely, the uncertainty principle.
momentum (P), for a particle constrained to move in one dimension. Recall
that both these observables have a continuous spectrum.
The general principle (9.1a) follows from Gleason's theorem (alternatively,
from Kochen and Specker's theorem); (9.1b) follows from a theorem proved
by Busch and Lahti (1985, pp. 66-67).
The support of a state, with respect to a given observable, is, intuitively,
the set which contains just the values of that observable to which the state
assigns nonzero probability. More formally,
Unless this state is an eigenstate of the observable A, measurements of A will
not yield the same value for each member of the ensemble, but a series of
different values, each occurring with a certain frequency. These values
scatter round a mean, the expectation value $\langle A \rangle$ of observable A. In the case
when A admits eigenvectors we obtain $\langle A \rangle$ as in Section 2.4, by weighting
and so
$$S_x - \langle S_x \rangle = \pm\frac{1}{2}$$
The mean value of the square of these deviations from the mean is given by
(We take the squares in order that the differences between the observed
values and the mean should effectively be regarded as positive.) Now we
define the variance, $\sigma_D(S_x) = \Delta S_x$, of $S_x$ in state D as the square root of this
mean square deviation:
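Thus, for the system in state z+ (in other words, such that $D = P_{z+}$),
$$\Delta S_x = \frac{1}{2}$$
$$(\Delta S_\phi)^2 = \frac{1}{4}\left[\cos^2\frac{\phi}{2}\,(1 - \cos\phi)^2 + \sin^2\frac{\phi}{2}\,(1 + \cos\phi)^2\right]$$
It follows that
$$\Delta S_\phi = \frac{1}{2}\sin\phi$$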
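The step from the mean square deviation to $\Delta S_\phi = \frac{1}{2}\sin\phi$ can be verified numerically. The sketch below assumes the standard spin-$\frac{1}{2}$ probabilities for a particle in state z+: a measurement of spin along a direction at angle $\phi$ to the z-axis yields $+\frac{1}{2}$ with probability $\cos^2(\phi/2)$ and $-\frac{1}{2}$ with probability $\sin^2(\phi/2)$. The function name is illustrative.

```python
from math import cos, sin, sqrt, pi

def spread_s_phi(phi):
    """Root mean square deviation of spin along angle phi, for state z+."""
    p_plus = cos(phi / 2) ** 2    # probability of the value +1/2 (assumed form)
    p_minus = sin(phi / 2) ** 2   # probability of the value -1/2 (assumed form)
    mean = 0.5 * p_plus - 0.5 * p_minus
    mean_square_deviation = (p_plus * (0.5 - mean) ** 2
                             + p_minus * (-0.5 - mean) ** 2)
    return sqrt(mean_square_deviation)

ANGLES = [0.0, pi / 6, pi / 3, pi / 2, 2 * pi / 3, pi]
spreads = [spread_s_phi(phi) for phi in ANGLES]
closed_form = [0.5 * abs(sin(phi)) for phi in ANGLES]
```

For $\phi = \pi/2$ the direction is the x-axis and the spread takes its largest value, $\frac{1}{2}$; for $\phi = 0$ the state is an eigenstate of the measured observable and the spread vanishes.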
$$[A, B] =_{\mathrm{df}} AB - BA = C$$
It is important to emphasize that all three quantities involved, $\Delta A$, $\Delta B$, and
$\langle C \rangle$, are state-dependent.
In our special case, that of the spin-$\frac{1}{2}$ particle in state z+, we saw that
$$\Delta S_x = \Delta S_y = \frac{1}{2}$$
and so
All the values of spin used in this example are in natural units, that is, they
are multiples of Planck's constant, $\hbar$. In the next paragraph, $\hbar$ has not been
"suppressed" in this way.
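Consider the position and momentum observables Q and P, represented
by the operators $x$ and $-i\hbar\,\partial/\partial x$ on the space of square-integrable functions
$\Psi(x)$ (see Section 1.11).
For any function $\Psi(x)$ we have
$$QP\Psi(x) = x\left[-i\hbar\,\frac{\partial \Psi(x)}{\partial x}\right]$$
But
$$PQ\Psi(x) = -i\hbar\,\frac{\partial [x\Psi(x)]}{\partial x} = -i\hbar x\,\frac{\partial \Psi(x)}{\partial x} - i\hbar\,\Psi(x)$$
Clearly,
$$(PQ - QP)\Psi(x) = -i\hbar\,\Psi(x)$$
and so
$$[P, Q] = -i\hbar$$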
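The commutation relation can be checked on a sample wave function. The following sketch assumes units in which $\hbar = 1$, approximates the derivative by a central difference, and uses an unnormalized Gaussian as the test function; none of these choices comes from the text.

```python
from math import exp

HBAR = 1.0     # natural units (an assumption of this sketch)
STEP = 1e-5    # step for the central-difference derivative

def psi(x):
    """Sample square-integrable function (unnormalized Gaussian)."""
    return exp(-x * x)

def momentum(f):
    """Return the function P f, with P = -i*hbar d/dx taken numerically."""
    return lambda x: -1j * HBAR * (f(x + STEP) - f(x - STEP)) / (2 * STEP)

def position(f):
    """Return the function Q f, with Q acting as multiplication by x."""
    return lambda x: x * f(x)

def commutator_pq(f):
    """Return the function (PQ - QP) f."""
    pq = momentum(position(f))
    qp = position(momentum(f))
    return lambda x: pq(x) - qp(x)

sample_point = 0.7
computed = commutator_pq(psi)(sample_point)
expected = -1j * HBAR * psi(sample_point)
```

Up to the discretization error of the derivative, `computed` agrees with `expected`, that is, with $-i\hbar\Psi(x)$ at the chosen point.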
* By a slight abuse of notation, we use "$\langle [S_x, S_y] \rangle$" to signify the expectation value of the
observable represented by the operator $[S_x, S_y]$.
We see that, if for a certain state we can obtain a very small uncertainty in
the predictions made about momentum measurements, this will be accompanied
by a correspondingly large uncertainty in those we make about
position measurements, and vice versa. The product of these uncertainties
never falls below a certain value.
P and Q are incompatible operators with continuous spectra. They are
nontypical in that there is no state for which the product of their uncertainties
lies below a certain (nonzero) value. However, whenever one of a pair of
observables A and B admits eigenvectors, then the product $\Delta A \cdot \Delta B$ can be
made as small as we wish by a judicious choice of state. For, if $a_i$ is an
eigenvalue of A, and $v_i$ the corresponding eigenvector, then when the
system is in the state $v_i$, $\Delta A = 0$, and so, for any observable B, $\Delta A \cdot \Delta B = 0$.
For example, given an ensemble in the state x+,
$$\Delta S_y = \frac{1}{2}$$
Note that this does not violate the indeterminacy principle, since $[S_x, S_y] = iS_z$
and, for the state x+, $\langle S_z \rangle = 0$. Thus, in general, the indeterminacy
principle does not tell us that the product of the variances associated with
incompatible observables has a least value greater than zero. (For a careful
discussion of this, see Beltrametti and Cassinelli, 1981, pp. 24-26.)
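The state-dependence of the bound can be made concrete for the two spin states just discussed. The sketch below assumes the usual spin-$\frac{1}{2}$ matrices in natural units and takes the indeterminacy relation in the form $\Delta A \cdot \Delta B \geq \frac{1}{2}|\langle [A, B] \rangle|$; the helper names are illustrative.

```python
from math import sqrt

# Spin-1/2 operators in natural units (standard matrices, assumed here).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(op, v):
    """Expectation value <v|op|v> for a normalized vector v."""
    return sum(v[i].conjugate() * op[i][j] * v[j]
               for i in range(2) for j in range(2))

def spread(op, v):
    """Square root of the mean square deviation of op in state v."""
    mean = expect(op, v).real
    mean_square = expect(matmul(op, op), v).real
    return sqrt(max(mean_square - mean ** 2, 0.0))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def product_and_bound(a, b, v):
    """Return (spread(a)*spread(b), |<[a, b]>|/2) for state v."""
    return spread(a, v) * spread(b, v), 0.5 * abs(expect(commutator(a, b), v))

z_plus = [1.0, 0.0]
x_plus = [1 / sqrt(2), 1 / sqrt(2)]

product_z, bound_z = product_and_bound(SX, SY, z_plus)
product_x, bound_x = product_and_bound(SX, SY, x_plus)
```

In state z+ both the product and the bound equal $\frac{1}{4}$; in state x+ both vanish (the product only up to rounding error), so the relation holds without imposing any nonzero least value.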
Gibbins (1981a) offers some other examples. More typical is Bohr, who
oscillates between an implicit reliance on the ontic reading and an explicit
adherence to a reading in terms of constraints on measurements. Thus, in
his account of a thought-experiment involving a single slit in a diaphragm,
he writes (1949, pp. 213-214):
Consequently the description of the state involves a certain latitude $\Delta p$ in the momentum
component of the particle and, in the case of a diaphragm with a shutter, an
additional latitude $\Delta E$ of the kinetic energy.
Since a measure for the latitude $\Delta q$ in location of the particle in the plane of the
diaphragm is given by the radius a of the hole, and since $\theta \approx 1/\sigma a$, we get . . .
just $\Delta p \approx \theta p \approx h/\Delta q$ in accordance with the indeterminacy relation . . .
However, five pages earlier, Bohr has introduced the indeterminacy principle
rather differently:
The commutation rule imposes a reciprocal limitation on the fixation of two conjugate
parameters q and p expressed by the relation
$$\Delta q \cdot \Delta p \approx h$$
where $\Delta q$ and $\Delta p$ are suitably defined latitudes in the determination of these variables.
(p. 209)
Here the latitudes are not in the quantities themselves, but in their "determination";
Bohr is explicitly endorsing Heisenberg's view that the indeterminacy
principle expresses a limitation on measurement.
This interpretation of the principle was for many years the dominant one.
In Robertson's words (1929, p. 163),
The principle, as formulated by Heisenberg for two conjugate quantum-mechanical
variables, states that the accuracy with which two such variables can be measured
simultaneously is subject to the restriction that the product of the uncertainties in the
two measurements is at least of order h.
It was Robertson who first derived the indeterminacy principle in the gen-
eral form in which we now have it, so that it applies to any pair of observables
representable in the same Hilbert space. The quotation above is from
his preamble to the derivation; the uncertainties are clearly identified with
the limits of accuracy obtainable in simultaneous measurements of these
observables. Yet, half a dozen lines into the derivation itself, we find Robert-
son writing,
The "uncertainty" $\Delta A$ in the value A is then defined, in accordance with statistical
usage, as the root mean square of the deviation of A from [the] mean. (p. 163)
von Neumann's own conclusions, the reason they cannot be performed
with arbitrarily high accuracy is that they cannot be performed at all.
Be that as it may, let us look at Heisenberg's arguments. These are plausibility
arguments, the best-known of which (and the one used by von Neumann)
involves "Heisenberg's microscope" (1927, p. 174; von Neumann,
1932, pp. 239-247). This is an idealized instrument similar in principle to an
optical microscope, but which uses radiations of short wavelengths, like
$\gamma$-rays, to form images of very small particles. If a small particle were in the
field of view of the microscope (see Figure 9.1) then it would be observed if a
photon (that is, a $\gamma$-ray particle) struck it and were deflected upward into the
aperture of the microscope.
We can estimate the coordinate of position of the particle under observation
by finding the position of the image formed by the instrument. Let $\theta$ be
the angle subtended at the aperture of the instrument by the particle. Then,
writing $\Delta x$ for the uncertainty in our measurement of the x-coordinate of
position and $\lambda$ for the wavelength of the radiation, we get
$$\Delta x \sim \frac{\lambda}{\theta}$$
Now, when the photon strikes the particle, a certain amount of momen-
tum is transferred to the particle by the collision; thus any estimate we make
of the particle's momentum will have to allow for this. By our conservation
laws, we expect the momentum transferred to the particle to be equal and
opposite to the change of momentum of the photon. The trouble is, we don't
know exactly how much this is. If we knew the path followed by the photon
through the microscope, then it would be easy to evaluate it, but we don't
know exactly where in the aperture the photon enters the instrument. In
fact, making $\theta$ large (in order to obtain a high resolution) has the effect of
increasing the uncertainty $\Delta p$ in our estimate of momentum. We have
$$\Delta p \sim \frac{h\theta}{\lambda}$$
and so
$$\Delta x \cdot \Delta p \sim h$$
thinking that incompatible observables are not commeasurable is another
matter. Various authors have suggested otherwise; in fact Margenau, both
on his own and later in collaboration with Park, proposed a number of
experiments whereby values for position and momentum could be obtained
simultaneously with no more limitation of accuracy than one might expect if
they were measured individually (Margenau, 1950, p. 376; Park and Mar-
genau, 1968, 1971). These proposals, however, have not gone unchallenged
(see Busch and Lahti, 1984).
I return to the question of simultaneous measurability in the next section; I
suggest there that, where it is forbidden, it is forbidden by the support
principle. Again, it is this principle, rather than the indeterminacy principle,
which summarizes fundamental features of the theory.
Some recent work by Busch and Lahti (1985) offers an interesting foot-
note to this section. They point out that, in orthodox quantum mechanics,
no joint probability measures exist on the set of pairs of Q-events and
P-events. That is, no probability function exists that maps all conjunctions
of Q-events and P-events into [0,1] and that reduces to the usual quantum-mechanical
assignments of probabilities to Q-events when the P-event is
the certain event $(P, \mathbb{R})$, and vice versa (see Beltrametti and Cassinelli, 1981,
pp. 23-24). However, it is possible to define "unsharp" position and momentum
operators with respect to which such measures are well-defined
(Davies, 1976). The operators are the usual Q and P operators on $L^2$, modified
by functions f and g to become the operators $Q_f$ and $P_g$. (I omit the
technical details of the modifications.) The modifiers f and g are probability
density functions with mean values equal to zero and variances $\Delta f$ and $\Delta g$;
they are designed to represent the fact that position and momentum measurements
are not sharp, that is, not localized at a point on the real number
line. An "uncertainty" relation now holds between $\Delta f$ and $\Delta g$; we have
we can record the spot where any individual electron strikes the screen. If
we now replace the screen by two thin photographic plates, placed together
and parallel to each other, the electron will go through both, and the mark
where the electron strikes the second will be very close to the mark where it
struck the first. What we have here are two consecutive experiments, both of
which measure the position of the electron in a plane perpendicular to the
axis of the experiment. The second yields the same result as the first, as step
(1) requires. Of course, this second experiment must be "immediately after"
the first; that is, between the two measurements the system's state must
neither be changed discontinuously, by interactions with other devices, nor
must it evolve significantly according to Schrodinger's equation. In the
example given, the further apart the plates are, the further apart the two
marks may be, because diffraction occurs again after the first impact.
One of the problems with step (1), however, is that some measurements
- perhaps most measurements - do not allow a second look at the system;
the electron, photon, or whatever is effectively annihilated by the measure-
ment process. Additionally, Landau and Peierls (1931) suggested that,
among experiments which allow repetition, we can find, and distinguish
between, those which yield the same result the second time as the first and
those which do not (see Jammer, 1974, p. 487n). Following Pauli (1933), we
call the former "experiments of the first kind." These considerations restrict
the scope of von Neumann's proof. They show that incompatible observables
are not measurable to arbitrarily high precision by measurements of
the first kind. Thus, although von Neumann's result is consistent with the
stronger claim, that no possible measurement could do the job, Margenau
(1950, pp. 360-364) could accept the proof and still maintain that simultaneous
measurability of incompatibles is feasible. Note, however, that a
proof of the strong claim, resting on a particular account of the measure-
ment process, has been offered by van Fraassen (1974a, pp. 301-303).
Let us turn to step (2) of the argument, or rather its analogue for the case of
an observable with a continuous spectrum, like position. In the experiment
described just now, prior to striking the screen the electron behaves like a
wave; its probability distribution is spread out in space. The event of its
striking the screen is often called "the collapse of the wave packet." It is
sometimes described as a change in the properties of the electron - from
being spread out in space the electron becomes localized in a small region -
and sometimes as a passage from potentiality to actuality: of all the possible
events associated with small areas of the screen, just one is actualized.
Von Neumann postulates that this collapse (however regarded) is accompanied
by a change in the state of the electron: the state changes in such a way
that a repetition of the experiment will with certainty yield the same result as
before. Margenau (1950) called this postulate the projection postulate, and
it is now generally referred to by that name. We can revert to the case of an
observable with a discrete spectrum to see a particularly simple instance of
the postulate.
Let us assume that there is no degeneracy and that the result of the first
measurement of observable A is $a_i$. Such an experiment is a maximal measurement
of A. If the original state is a pure state v, then the projection
postulate requires that the transformation
$$v \xrightarrow{(A,\,a_i)} v_i \qquad [\text{equivalently, } P_v \xrightarrow{(A,\,a_i)} P_{v_i}]$$
Figure 9.2 Probability transition according to the von Neumann projection postulate.
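A minimal sketch of the postulate for a maximal measurement may help fix ideas. It assumes a nondegenerate observable A and represents a pure state by its coefficients in the eigenbasis of A, so that the result $a_i$ has probability $|c_i|^2$; the function names and the sample state are illustrative and not taken from the text.

```python
def probabilities(state):
    """Probabilities |c_i|^2 of the results a_i, for a normalized state."""
    return [abs(c) ** 2 for c in state]

def project(state, i):
    """Projection postulate (maximal case): v -> v_i once result a_i has occurred."""
    if abs(state[i]) == 0:
        raise ValueError("result a_i has probability zero in this state")
    return [1.0 if k == i else 0.0 for k in range(len(state))]

# Sample pure state v = 0.6 v_0 + 0.8i v_1 (coefficients chosen for illustration).
v = [0.6, 0.8j, 0.0]

probs_before = probabilities(v)   # statistical predictions for the first measurement
state_after = project(v, 1)       # suppose the first measurement yields a_1
probs_after = probabilities(state_after)
```

Before the measurement the results $a_0$ and $a_1$ have probabilities 0.36 and 0.64; after the projection an immediate repetition yields $a_1$ with certainty, which is the behavior required of measurements of the first kind.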
coupled system of the kind used in EPR-type experiments, then an event
associated with one subsystem may be both a measurement of an observable
for that subsystem and an event which projects the other system into an
eigenstate of that observable. The Compton-Simon experiment to which
von Neumann (1932, p. 212) appealed for evidence in support of the projection
postulate is similar in kind.
In this experiment light was scattered by electrons and the scattering process was
controlled in such a way that the scattered light and the scattered electrons were
subsequently intercepted, and their energy and momentum measured.
Given the initial trajectories of a photon and an electron,
the measurement of the path of the light quanta of the electron after collision
suffices to determine the position of the central line of the collision. The Compton-
Simons [sic] experiment now shows that these two observations give the same
result. (P. 213)
The two observations need not occur simultaneously; if they do not, we
can infer the result of the second from the result of the first. Prior to the first
observation, we could only make statistical predictions about the second,
whereas after the first one has been made, the second is "already determined
causally [sic] and uniquely" (p. 213).
As an argument for the projection postulate, this has recently come under
heavy fire. Van Fraassen (1974a, p. 297), for example, writes,
Upon what slender support dogma may be founded! In the experiment described,
measurements are made directly on two objects .. . which have interacted and
then separated again. The observables directly measured are ones which have be-
come correlated by the interaction . . . And on the basis of this, an inference is
made about what would happen if a single experiment could be immediately re-
peated on the same object!
Indeed, one wonders why von Neumann chose this particular experiment
for his purposes. Einstein, in contrast, was content to illustrate the projection
postulate by two polarizers $P_1$ and $P_2$; if their axes of polarization are
parallel, then any photon passing the first will also pass the second (Ein-
stein, in correspondence with Margenau; see Jammer, 1974, p. 228).
One motivation was von Neumann's desire to use the Compton-Simon
experiment to make a further point. The experiment shows that, contrary to
a suggestion made by Bohr, Kramers, and Slater (1924; see Jammer, 1966,
pp. 183 -188), the principles of conservation of energy and momentum
hold in individual cases and are not merely statistical laws. As von Neu-
mann (1932, p. 213) pointed out, this implies that the quantum world lies
somewhere between a purely statistical world and a wholly determined
world; for him the projection postulate was an expression of this intermediate
"degree of causality." With hindsight, we can reread von Neumann's
argument as an argument not for his version of the projection postulate but
for the Lüders rule viewed as a rule of probability conditionalization. Both
rules indicate where, within a statistical theory, deterministic correlations
may obtain.
That said, in the remainder of this chapter I will leave aside the connection
between conditionalization and measurement, and look solely at the latter.
In particular, I postpone a discussion of Teller's views until Section 10.1.
(9.9) $D_0 \to D_t = U_t D_0 U_t^{-1}$
(9.10)
probability of occurrence.
As this shows, the problem of the projection postulate is just one element
of another, larger problem confronting quantum theory, the problem of
measurement. What account can quantum mechanics offer of the statistically
governed but individually undetermined events characteristic of measurement
processes? The problem has two aspects. First, whatever theoretical
account we give, the processes it describes may have more than one
possible outcome. Second, this account, though couched in quantum-theoretical
terms, must include some treatment of the classical measuring device.
Apropos of the second point, Schrodinger (1935, pp. 156-157) has
pointed out that we are led to bizarre conclusions if we try to apply the
quantum-mechanical formalism to a macroscopic object. He instances the
case of a radioactive atom and a detector. An alpha-particle within a radio-
active nucleus evolves into a superposition of states, so that as time goes on
there is an increasing probability of its being detected outside the nucleus. (It
"tunnels through" the potential barrier which the nucleus provides; Bohm,
1951, pp. 240-242.) Schrödinger (1935, pp. 156-157) writes colloquially of
the state being "blurred":
The state of a radioactive nucleus is presumably blurred in such a degree and fashion
that neither the instant of decay nor the direction in which the emitted $\alpha$-particle
leaves the nucleus is well-established. Inside the nucleus, blurring doesn't bother us.
The emerging particle is described, if one wants to explain intuitively, as a spherical
wave that continuously emanates in all directions from the nucleus and that im-
pinges continuously on a surrounding luminescent screen over its full expanse.
But while we may accept this "blurred" picture of the microscopic system,
we cannot accept a similar picture of the macroscopic measurement appa-
ratus. Schrodinger continues,
The screen however does not show a more or less constant uniform surface glow, but
rather lights up at one spot-or, to honor the truth, it lights up now here, now there,
for it is impossible to do the experiment with only a single radioactive atom.
* Tierliebhaber (1939) discusses the relationship of this animal to Buridan's ass.
acid. If one has left this entire system to itself for one hour, one would say that the cat
still lives if meanwhile no atom has decayed. The first atomic decay would have
poisoned it. The $\psi$-function of the entire system would express this by having in it
the living and the dead cat (pardon the expression) mixed or smeared out in equal
parts.
(9.11d) Whenever the evolution takes M to state $u_j$, then it takes S to the
eigenstate $v_j$ of A which has eigenvalue $a_j$.
process M being in state $u_i$ with probability $p_v(A, a_i)$. Let $p_i = p_v(A, a_i)$; then the
general requirement can be expressed by saying that, after the measurement,
M must be in the mixed state $D_M = \sum_i p_i P^M_i$ (where $P^M_i$ projects onto $u_i$).
Prima facie this does not violate (9.11a) since, in contrast to superpositions,
mixtures of classical states are perfectly respectable.
Along these lines Heisenberg (1958, p. 53) wrote that
The probability function [of quantum mechanics] combines objective and subjective
elements . . . In ideal cases the subjective element . . . may be practically negli-
gible as compared with the objective one. The physicists then speak of a "pure case."
Although in this passage Heisenberg doesn't use the term, we may add that,
conversely, a mixture is a probability function within which a subjective
element, "our incomplete knowledge of the world," may be represented.
Any measurement process, says Heisenberg (p. 54), produces an interplay
between these two elements:
After the interaction has taken place, the probability function contains the objective
element of tendency and the subjective element of incomplete knowledge, even if it
has been in a "pure case" before.
ters a positive value for A, and $u_-$ the state when it registers a negative value
for A. We assume that the quantum-mechanical formalism can be applied to
M, and that these three states are representable by vectors $u_0$, $u_+$, and $u_-$,
respectively, in a Hilbert space $\mathcal{H}^M$. No assumption is made that superpositions
of $u_0$, $u_+$, and $u_-$ are also possible states of M.
We represent the states of the coupled system S + M in the tensor-product
space $\mathcal{H}^S \otimes \mathcal{H}^M$. Assume that, before the measurement begins, the system S
is in the pure state v, where $v = c_+v_+ + c_-v_-$, and that M is in the state $u_0$;
then the original state of S + M will be $\Psi_0 = v \otimes u_0$. During the course of the
measurement interaction this state will evolve continuously, according to
Schrödinger's equation. Accordingly, at the end of the interaction, S + M
will again be in a pure state $\Psi \in \mathcal{H}^S \otimes \mathcal{H}^M$, where $\Psi = U\Psi_0$, and U is some
unitary operator on $\mathcal{H}^S \otimes \mathcal{H}^M$. U must obey the following constraints: when
$v = v_+$ (that is, when $c_- = 0$), we require that $\Psi = \Psi_+ = v_+ \otimes u_+$; when $v = v_-$
(that is, when $c_+ = 0$), we require that $\Psi = \Psi_- = v_- \otimes u_-$.
In each of these two cases U takes '1'0 into a state of S + M reducible into a
pure state of S and the corresponding pure state of M . By the linearity of U
we obtain, for the general case,
where
and
The operators $P^S_+$, $P^S_-$, $P^M_+$, $P^M_-$ project onto rays in $\mathcal{H}^S$ and $\mathcal{H}^M$ containing,
respectively, $v_+$, $v_-$, $u_+$, $u_-$.
This seems to give precisely what we want. The measurement process
evolves according to Schrodinger's equation, but the final state of the mea-
surement device is a weighted sum of the indicator states. These weights are
exactly the probabilities which quantum theory assigns to the corresponding
outcomes (see Section ...).
Moreover, as Jauch points out, we can also show that indicator states are
correlated with final states of S. For assume that we carry out a measurement
$P^S_+ \otimes P^M_-$ on the composite system in the state $\Psi$. That is, we test for the joint
event $[(A,+);(A_M,-)]$, where $A_M$ is the act of observing M. In this case,
$$\langle \Psi | (P^S_+ \otimes P^M_-)\Psi \rangle = 0$$
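The two claims of this model, that the device ends in a weighted sum of indicator states and that indicator states are correlated with final states of S, can be checked in a small calculation. The sketch below assumes that linearity of U gives the final state $\Psi = c_+(v_+ \otimes u_+) + c_-(v_- \otimes u_-)$ and that the state of M is obtained from $\Psi$ by a partial trace over S; the coefficient values and all names are illustrative.

```python
def final_state(c_plus, c_minus):
    """Coefficients of Psi: rows index v+, v-; columns index u0, u+, u-."""
    return [[0, c_plus, 0],
            [0, 0, c_minus]]

def reduced_state_of_m(psi):
    """Partial trace over S: D_M[m][n] = sum_s psi[s][m] * conj(psi[s][n])."""
    return [[sum(psi[s][m] * psi[s][n].conjugate() for s in range(2))
             for n in range(3)] for m in range(3)]

def joint_probability(psi, s, m):
    """Probability of the joint event: S found in v_s and M found in u_m."""
    return abs(psi[s][m]) ** 2

# Sample amplitudes with |c+|^2 + |c-|^2 = 1 (chosen for illustration).
c_plus, c_minus = 0.6, 0.8j
psi = final_state(c_plus, c_minus)

d_m = reduced_state_of_m(psi)

# Row 0 is v+; column 2 is u-: the "mismatched" joint event.
p_mismatch = joint_probability(psi, 0, 2)
p_match = joint_probability(psi, 0, 1)
```

Under these assumptions `d_m` is diagonal, with weights $|c_+|^2$ and $|c_-|^2$ on the indicator states $u_+$ and $u_-$ and no interference terms between them, while the mismatched joint event has probability zero.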
to the set of admissible pure states, we cannot write $i(d\Psi/dt) = H\Psi$, as this
would require $\Psi$ to pass through the "no-man's-land" between admissible
pure states.
Since an internal account of the measurement process is, by definition,
one that conforms to Schrodinger's equation, it would seem that no internal
account conforming to (9.11a) can be given.
A way out is suggested by Beltrametti and Cassinelli (1981, chap. 8) and
independently by Wan (1980). Beltrametti and Cassinelli's strategy is to
distinguish between the mathematical account of the time-evolution of the
state vector and the interpretation of this as the evolution of a particular
kind of state. On their account of the measurement process, the state vector
$\Psi$ of S + M evolves according to the Schrödinger equation. However, only
when $\Psi$ has the form $v \otimes u_i$ (where $v \in \mathcal{H}^S$ and $u_i$ is an indicator state of M)
does $\Psi$ represent a pure state; when it does not, it is interpreted as a (classical)
mixture of such states.
Before assessing this account, let us see how quantum theory treats situations
in which not every normalized vector in the relevant Hilbert space can
represent a pure state of a system.
A rule forbidding us to form a pure state by the superposition of other
pure states is called a superselection rule. Such a rule restricts pure states to
those representable by vectors in orthogonal subspaces $L_0, L_1, \ldots$ of the
Hilbert space $\mathcal{H}$ for the system; $L_0, L_1, \ldots$ are known as the superselection
subspaces (sometimes the coherent subspaces) of $\mathcal{H}$. In the presence of
superselection rules, not every Hermitian operator on the space can represent an
observable (see Jordan, 1969, sec. 28; Beltrametti and Cassinelli, 1981, chap.
5). In fact a Hermitian operator A on $\mathcal{H}$ can represent an observable only if
each superselection subspace $L_i$ of $\mathcal{H}$ reduces A; in other words, only if
$A\Psi \in L_i$ whenever $\Psi \in L_i$. This condition holds if and only if every projector
in the spectral decomposition of A projects onto a subspace of some
superselection subspace $L_i$ of $\mathcal{H}$. It follows that, in the presence of superselection
rules, (1) any function of an observable A is reduced by every superselection
subspace, and (2) every projector $P_E$ representing a quantum event E
projects onto a subspace of some superselection subspace (or is the sum of such
projectors); hence $P_E$ is also reduced by every superselection subspace.
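The reduction condition can be made concrete in a small finite-dimensional case. The sketch below assumes, purely for illustration, that each superselection subspace is spanned by a subset of the standard basis vectors, so that "reduces A" amounts to A having no matrix elements linking different subspaces. The function name and the example matrices are hypothetical.

```python
def is_reduced_by(A, blocks, tol=1e-12):
    """Return True if A maps each coordinate subspace in `blocks` into itself.

    `blocks` lists, for each superselection subspace, the indices of the
    basis vectors spanning it; together they must cover every index of A.
    """
    owner = {}
    for b, indices in enumerate(blocks):
        for i in indices:
            owner[i] = b
    n = len(A)
    # A[i][j] is the component of A e_j along e_i; it must vanish whenever
    # e_i and e_j lie in different superselection subspaces.
    return all(abs(A[i][j]) <= tol
               for i in range(n) for j in range(n)
               if owner[i] != owner[j])

# Superselection subspaces span{e0, e1} and span{e2} (an assumed example).
blocks = [[0, 1], [2]]

# Block-diagonal Hermitian operator: a candidate observable.
A_ok = [[1.0, 2.0, 0.0],
        [2.0, 3.0, 0.0],
        [0.0, 0.0, 5.0]]

# Hermitian, but it links e0 with e2, so neither subspace reduces it.
A_bad = [[0.0, 0.0, 1.0],
         [0.0, 0.0, 0.0],
         [1.0, 0.0, 0.0]]

ok_is_observable_candidate = is_reduced_by(A_ok, blocks)
bad_is_observable_candidate = is_reduced_by(A_bad, blocks)
```

Here the first operator passes the test and the second fails it, although both are Hermitian.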
Now consider a normalized vector $\Psi$ which is a nontrivial superposition
of two normalized vectors $\Psi_1$ and $\Psi_2$ in distinct superselection subspaces $L_1$
and $L_2$ of $\mathcal{H}$: $\Psi = c_1\Psi_1 + c_2\Psi_2$. Note that $\Psi_1 \perp \Psi_2$, and $|c_1|^2 + |c_2|^2 = 1$. Let $P_\Psi$
be the projector onto the ray containing $\Psi$, and $P_1$ and $P_2$ the projectors onto
the rays containing $\Psi_1$ and $\Psi_2$, respectively.
In the absence of superselection rules the superposition $\Psi = c_1\Psi_1 + c_2\Psi_2$
would not be statistically equivalent to the mixture $D = |c_1|^2 P_1 + |c_2|^2 P_2$.
That is, there would be a quantum event E for which $p_\Psi(E) \neq p_D(E)$. For let E
be represented by the projector $P_E$ on $\mathcal{H}$. Then $p_D(E) = \mathrm{Tr}(P_E D)$; using an
orthonormal basis which includes $\Psi_1$ and $\Psi_2$, we obtain
We see that, in the presence of superselection rules, $\Psi$ and D are statistically
equivalent. (Recall, in this connection, the discussion in Section 3.9.)
Thus although, in accordance with the superselection rule, $\Psi$ may not
represent a pure state of the system, we may use it to represent a mixture; D
and $\Psi$ become two mathematically equivalent ways to represent the same
state.
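A small numerical check may help fix the point. The sketch below assumes a three-dimensional real space with superselection subspaces span{e0, e1} and span{e2}, and takes $\Psi_1 = e_0$, $\Psi_2 = e_2$; these choices, and all names in the code, are illustrative assumptions. It compares $\langle \Psi | P_E \Psi \rangle$ with $\mathrm{Tr}(P_E D)$, first for a projector reduced by both subspaces and then for a projector that a superselection rule would exclude.

```python
def matvec(M, v):
    """Matrix-vector product for nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def prob_pure(P, psi):
    """Probability <psi | P psi> assigned by the vector psi."""
    return sum(a * b for a, b in zip(psi, matvec(P, psi)))

def prob_mixture(P, weights, states):
    """Tr(P D) for the mixture D = sum_k weights[k] * |states[k]><states[k]|."""
    return sum(w * prob_pure(P, s) for w, s in zip(weights, states))

c1, c2 = 0.6, 0.8
psi1, psi2 = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]      # in distinct subspaces
psi = [c1 * a + c2 * b for a, b in zip(psi1, psi2)]
weights = [c1 ** 2, c2 ** 2]

# Reduced by both subspaces: a ray inside span{e0, e1} plus the ray of e2.
P_allowed = [[0.5, 0.5, 0.0],
             [0.5, 0.5, 0.0],
             [0.0, 0.0, 1.0]]

# Projector onto the ray of (e0 + e2)/sqrt(2): it straddles the two subspaces.
P_forbidden = [[0.5, 0.0, 0.5],
               [0.0, 0.0, 0.0],
               [0.5, 0.0, 0.5]]

gap_allowed = prob_pure(P_allowed, psi) - prob_mixture(P_allowed, weights, [psi1, psi2])
gap_forbidden = prob_pure(P_forbidden, psi) - prob_mixture(P_forbidden, weights, [psi1, psi2])
```

For the reduced projector the two assignments agree; for the straddling projector they differ by the interference term $2 c_1 c_2 \langle \Psi_1 | P_E \Psi_2 \rangle$. Only projectors of the second kind can tell $\Psi$ and D apart, and those are just the ones a superselection rule removes from the set of quantum events.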
Let us now return to Beltrametti and Cassinelli's account of the measurement
process. They too argue that S + M inherits the superselection rules
characteristic of M, and that the superselection subspaces of $\mathcal{H}^{S+M}$ are
$\mathcal{H}^S \otimes L^M_0$, $\mathcal{H}^S \otimes L^M_1$, and so on, where $L^M_0, L^M_1, \ldots$ are the rays in $\mathcal{H}^M$ containing
the indicator states $u_0, u_1, \ldots$ of M (Beltrametti and Cassinelli, 1981,
p. 84, though their argument to this conclusion is not the same as the one
given here).
As in Section 9.6, we consider the case when the admissible pure states of
M are $u_0$, $u_+$, and $u_-$, and $u_+$ and $u_-$ correspond to the two values of an
observable A associated with eigenstates $v_+$ and $v_-$ of S, respectively. We
take the initial state of S + M to be $\Psi_0 = v \otimes u_0$, where $v = c_+v_+ + c_-v_-$.
Like Jauch, Beltrametti and Cassinelli suggest that $\Psi_0$ evolves during the
measurement process in accordance with the Schrödinger equation, so that
The most sober of the three accounts is offered by Daneri, Loinger, and
Prosperi (1962). It may seem odd that I portray them as showing why a
macroscopic system merely seems to behave classically, since they write
that
In order that objective meaning may be attributed to the macro-states of large
bodies, it is of course necessary that .. . states incompatible with the macroscopic
observables be actually impossible. (P. 298; Wheeler and Zurek, 1983, p. 658)
It sounds as though, like Beltrametti and Cassinelli, they are going to rule
out superpositions of indicator states as possible pure states of S + M. (Here I
am stretching previous usage by using "indicator state" to refer not merely
to states of M but to those states of S + M which would be admissible on a
wholly classical picture.) However, this is not what they do. Rather, they
show that, because S + M is a very large system, the pure states into which it
evolves behave like mixtures. Starting from the fact that a measuring instru-
ment is a system of many particles and with correspondingly many degrees
of freedom, they argue from thermodynamical considerations that, when
such a system is in a superposition of indicator states, the interference terms
characteristic of superpositions effectively cancel out (pp. 301/661 and
305/665). As a result, a superposition will be statistically indistinguishable
from a mixture with respect to all relevant observables. If we measure the
macroscopic system S + M (call it "I") by using another macroscopic system
("II"), then
A statistical operator . . . for the system I which corresponds to a pure state
described by a superposition of vectors belonging to different [macroscopic states] is
equivalent, so far as the macroscopic observables on II are concerned, to a statistical
operator which is a mixture of the above macroscopic states. (P. 314/674)
This resembles the move made by Beltrametti and Cassinelli (Section
9.7). On both approaches, the state to which S + M evolves, and which is
given mathematically by a linear superposition, is shown to be
indistinguishable from one given by the weighted sum of projection operators. The
difference is this. Beltrametti and Cassinelli suggest that the state in question
is a mixed state, Daneri, Loinger, and Prosperi that it is pure; however,
according to the latter this pure state is statistically indistinguishable from a
mixture. But, unless we think that a state-function refers essentially to an
ensemble of systems, statistical indistinguishability is not enough. What
Daneri, Loinger, and Prosperi conclude is that an ensemble of macroscopic
quantum systems will behave like an ensemble of classical systems. As
Cartwright (1983, pp. 169-171) has pointed out, however, what we need is
an account within which individual systems exhibit classical behavior; if a
superposition of indicator states does not represent a classically permissible
pure state, then Daneri, Loinger, and Prosperi have failed to provide us with
one (see also Bub, 1968; Putnam, 1965).
In brief, their account does not produce the final state we want; Beltra-
metti and Cassinelli, on the other hand, show us the desired state, but in
doing so they make it unattainable.
THE MANY-WORLDS INTERPRETATION
* As J. P. Jarrett has pointed out to me, not all proponents of the "relative state" approach
(Everett's term) accept the many-worlds interpretation of it; see, for example, Geroch (1984). I
discuss MWI from a slightly different perspective (and with greater charity) in Section 10.4.
"actual." This branching is determined by the states of the systems
involved. Now a feature of Everett's presentation is that, in an interaction, the
state of one system is specified with respect to the other; indeed, Everett
(1957) called the interpretation the "'Relative State' Formulation of Quantum
Mechanics." However, this specification of states is not symmetrical.
(This follows from an argument due to Cartwright, 1974.) In other words,
the set of possible worlds reachable from the perspective of one participant
in an interaction will not mesh with the set reachable from the perspective of
the other. There is thus no specifiable set of worlds into which the preinter-
action world divides.
Nice examples of criticisms of the second type are given by Healey (1984,
pp. 591-593), who spends several pages outlining the "antinomies" to
which MWI has been thought to give rise; with one exception, which I
discuss below, I will not rehearse them here. (Healey also discusses the
problem of space-time structure and the modal realist version of MWI; see
below.)
A criticism of the third kind has been voiced by Earman (1986, p. 224):
What has rarely been explored is the implication for space-time structure of taking
[MWI] seriously. To make sure that the different branches cannot interact even in
principle they must be made to lie on sheets of space-time that are topologically
disconnected after measurement, implying a splitting of space-time something like
that illustrated [in Figure 9.3]. I do not balk at giving up the notion, held sacred until
now, that space-time is a Hausdorff manifold. But I do balk at trying to invent a
causal mechanism by which a measurement of the spin of an electron causes a global
bifurcation of space-time.
No doubt the many-worlds theorist would reject the demand for a causal
explanation, but, if he does, he needs to say what alternative he has up his
sleeve. Lacking one, he is open to the fourth kind of objection.
That is, even if advocates of MWI can respond to criticisms of the first two
kinds, one is led by Earman's objection to doubt on general grounds
whether the speculative metaphysics they offer provides a genuine answer
to a physical problem. In particular, I would question whether what has
been produced is anything more than a semantic model for probability
statements associated with the measurement process. In the last twenty
years philosophers have offered illuminating analyses of a great number of
modal concepts in terms of "possible worlds." (See Loux, 1979, for a careful
introduction to the literature.) To take a couple of trivial examples, a logi-
cally necessary statement is analyzed as a statement that is true in all possible
worlds, whereas a contingent statement is one that is true in some worlds but
not in others. Now probability is itself a modal concept (van Fraassen, 1980,
chap. 6, calls it "The New Modality of Science") and it too has been
analyzed in terms of possible worlds (Bigelow, 1976; Giere, 1976). The
suspicion that this kind of conceptual analysis is all that the many-worlds
interpretation supplies is strengthened by de Witt's claim (1970, p. 161) that "the
mathematical formalism of the quantum theory is capable of yielding its own
interpretation" (emphasis in the original).
But perhaps the many-worlds theorist could accept the description of his
enterprise as one of providing a semantic analysis of the probability state-
ments of quantum theory and claim nonetheless that it was true that each
measurement interaction resulted in a division of the world into multiple
copies of itself. Our possible-world analyses of modal concepts, he might
say, are not merely formal; on our best metaphysical picture of the universe,
this world is one of many equally real worlds. David Lewis (1986, p. 3)
writes,
readily acknowledges that many will find the ontological price of his modal
realism too much to pay for the theoretical benefits it brings (p. 5). Let us
assume, however, that we are willing to make the purchase on the
many-worlds theorist's behalf. This still won't give the theorist what he needs.
Consider the fact that, on Lewis's account, although all possible worlds are
equally real, for us only this world is the actual world. In the grand meta-
physical scheme of things, from the perspective of the Almighty, actuality
may only function as an indexical marker on the set of worlds (like "here"
and "present" across the set of points in space and time; pp. 92-94), but for
each observer there is only one actual world, the one which she inhabits.
Compare Everett's insistence that "all elements of a superposition (all
'branches') are 'actual,' none are more 'real' than the rest" (Everett, 1957,
p. 146n). This, it might be said, is a purely verbal difference: Everett uses
"actual" and "real" synonymously, where Lewis would use only "real." But
what, on Everett's account, has become of the world which is actual in
Lewis's? If there is no such privileged world, then something odd happens
to our conception of probability. For if all (relevant) events with nonzero
probability are realized in some world or other, then are not all those events
certain of occurrence? (This was pointed out by Healey, 1984, p. 593.) And if
I wager on what the outcome of a measurement will be, will it not pay me
to place my bet on whatever outcome is quoted at the highest odds, without
regard to the probabilities involved? We cannot just say, for example, that
there are three times as many worlds, and hence three times the total payoff,
corresponding to an event A, which has probability 3/4, as there are
corresponding to event B, which has probability 1/4, since no principle of
individuation distinguishes one A-world from another. (Before an epidemic of
long-odds betting is upon us, however, I should add that even the National
Security Council would be hard put to divert funds from my Swiss bank
account in one world to its counterpart in another.)
These levities aside, we may ask what new understanding of the measure-
ment process MWI gives us. After a measurement each observer will inhabit
a world (for her the actual world) in which a particular result of the measure-
ment has occurred. And the "total lack of effect of one branch on another
also implies that no observer will ever be aware of any 'splitting' process"
(Everett, 1957, p. 147n). What is this observer to say about the physical
process which has just occurred? From where she stands, the wave packet
has collapsed no less mysteriously, albeit no more so, than before.
We are still left with the dualism that the interpretation sought to eradi-
cate. As de Witt (1970, pp. 164-165) himself remarks, the many-worlds
interpretation of quantum mechanics "leads to experimental predictions
identical with the (dualist) Copenhagen view." The difference is that any
WIGNER'S FRIEND
Part One of this book gave an abstract summary of a physical theory; Part
Two has asked, what must the world be like if this theory accurately de-
scribes it? In this final chapter I offer a tentative answer to this question. In
Section 10.2 I present an interpretation of the theory which I call the "quantum
event interpretation"; in Section 10.3 I compare it with a version of the
Copenhagen interpretation; and lastly, in Sections 10.4 and 10.5, I discuss
the relation between the quantum world and the classical, macroscopic
world.
Prior to this, however, I consider the implications, some might say the
hazards, of working with an account of the theory that is as abstract as the
one presented in the first half of the book.
free electrons behave differently from the electrically neutral atoms
experimented on by Stern and Gerlach. Bohr also claimed that measurements of
the electron's spin components were, for conceptual reasons, impossible
(Rosenfeld, 1971, in Cohen and Stachel, 1979, p. 694). However, in the
1950s Crane devised a technique for performing such measurements which
evaded the problems of the Stern-Gerlach approach, and since then
proton-proton pairs have been used by Lamehi-Rachti and Mittig in
experiments to show that Bell's inequality is violated. (See Clauser and Shimony,
1978, pp. 1917-1918; d'Espagnat, 1979.) There is also no masking problem
when the spin of a neutron is measured (see Leggett, 1986, p. 39). Thus spin
components are indeed measurable, though not as easily as I have sug-
gested.
To return to the threatened criticism, that my account has been too ab-
stract, the obvious response is to say that the aim of Part One was precisely
that of showing the abstract conceptual structure of the theory. Philoso-
phers of science may rashly tend to equate such abstract structures with the
whole of a theory, and thereby be led to mistakes of assessment, but that is
another matter. For example, the rejection of the wave picture urged in
Section 8.3 may possibly be a mistake of this kind; although at the abstract
level the picture is unhelpful, perhaps it is indispensable for pragmatic
reasons when physical applications of the theory are at issue. It may be so.
Nonetheless, although discussions of these applications would be needed to
flesh out an abstract, skeletal account of the theory and give it breath, all
these applications will involve a common set of mathematical models, and
these abstract structures repay investigation.
A separate question is whether or not the significant features of these
structures are being correctly identified. Here I am thinking in particular of
the importance attributed, both in Chapter 8 and in the remainder of this
chapter, to quantum conditionalization. In contrast, Teller (1983, p. 428)
suggests that the Lüders rule is simply a "fortuitous approximation," an
approximation because actual processes do not localize the state in precisely
the sharp way that the rule suggests, and fortuitous because "there can be
no uniform way, no formula which even in principle could be fixed in
advance for turning the approximation into exact statements."
I agree on both counts; how then can I resist Teller's conclusion that
then, because the "initial state" is continuously changing, the result of this
projection will depend on the time at which it occurs. But there is no
theoretical reason to locate the projection at any one time in the measurement
process rather than another: "No formulation of the projection postulate
tells us exactly at which point to apply it" (Teller, 1983, p. 425). Hence,
Teller could continue, there is no warrant for thinking of the postulate as
giving an idealization of a physical process.
This is a powerful argument, but it draws its strength, I think, from the
fact that Teller looks at the projection postulate solely in terms of its relation
to the measurement process. As I pointed out in Sections 9.4 and 9.5, the
question of the projection postulate is conceptually separable from the main
problem of measurement. Considered just as a constraint on accounts of
measurement, the postulate is a seemingly arbitrary stipulation which lacks
obvious links with the rest of quantum theory. On the other hand, if we
view the Lüders rule as Bub suggests, as the rule of conditionalization
appropriate to quantum event structures, we see it in a different light. As the
quantum analogue of the classical conditionalization rule, it is built into the
non-Boolean event structures around which quantum theory is constructed.
I acknowledge that the Lüders rule differs from the ideal gas law in an
important respect. The deviations of real gases from the ideal gas law have
explanations (the finite size of actual molecules, their mutual attraction, and
so on); further, these explanations also tell us, in general terms, why van der
Waals' equation is an improvement. In contrast, we have no decent account
of when and why the Lüders rule is a less than adequate idealization. But my
reaction to this is not to revise my view of the Lüders rule, but to say that
quantum mechanics still faces a major empirical and conceptual task, that of
sorting out the relation of quantum systems to the classical, macroscopic
world. Take, for example, the simple case of an electron striking a
diaphragm with a hole in it (as in Section 8.3). We need to know what it is about
the physical structure of a real diaphragm that makes the wave function of
an electron passing through it differ from the ideal localized wave function
predicted by the Lüders rule. But these gaps in our knowledge do not make
the rule a fortuitous approximation; the idealizations it relies on are those
assumed by quantum theory itself.
This fact, however, that there is no systematic way to explain deviations
from the Lüders rule, prompts a return to the question of the value of
abstract theory, since it hints at a deeper issue than the particular problems I
have looked at so far.
Duhem (1906) thought that what are often called "fundamental" physi-
cal theories (Maxwell's theory of the electromagnetic field, for instance) did
no more than provide a formal unification of a wide range of phenomena.
He endorses the view that "a physical theory . . . is an abstract system
whose aim is to summarize and classify logically a group of experimental
laws" (p. 7). And he quotes approvingly Hertz's dictum that "Maxwell's
theory is the system of Maxwell's equations" (p. 80). Whether or not he is
right about Maxwell's theory, Duhem's description accurately fits the ver-
sion of quantum mechanics given here. This yokes together a disparate
group of phenomena in a purely formal way. The analogies between these
phenomena, one might think, do no more than allow a unified mathemati-
cal treatment of diverse aspects of nature; no further significance attaches to
them.
Certainly, the deployment of abstract analogies is part of the physicist's
repertoire. For example, in his Lectures on Physics Feynman introduces the
idea at an early stage, in his discussion of damped harmonic oscillations, by
displaying the pair of equations (Feynman, Leighton, and Sands, 1965, vol.
I, p. 25-8):
(10.1) $m \frac{d^2x}{dt^2} + \gamma m \frac{dx}{dt} + kx = F$
(10.2) $L \frac{d^2q}{dt^2} + R \frac{dq}{dt} + \frac{q}{C} = V$
continuous, and continuously differentiable, quantities. If similar differential
equations appeared throughout our fundamental physical theories, then
the implication would be that all physical quantities were continuous in
nature. This would then be a significant element within our metaphysical
picture of the world. In fact we no longer believe in the continuity of electric
charge, and so Equation (10.2) is in this respect misleading: charge is a
discrete, not a continuous quantity. The equation is a pragmatically useful
approximation, not a part of our foundational theory. (Recall the discussion
of Hooke's law.) That, I suggest, is the salient difference in significance
between the modeling of oscillations given by the two equations (particu-
larly the latter) and the models furnished by our abstract account of quan-
tum theory.
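The formal character of the analogy between Equations (10.1) and (10.2) can be illustrated by noting that a single numerical routine serves for both. The sketch below is an illustration under stated assumptions, not anything in Feynman's text: the routine `simulate`, the semi-implicit Euler scheme, and the parameter values are all hypothetical choices. The mechanical reading takes (inertia, damping, stiffness, drive) = (m, γm, k, F); the electrical reading takes (L, R, 1/C, V).

```python
def simulate(inertia, damping, stiffness, drive, dt=0.01, steps=5000):
    """Integrate inertia*y'' + damping*y' + stiffness*y = drive, starting at rest.

    Uses semi-implicit Euler: the rate of change is updated first, and the
    quantity itself is then advanced with the new rate.
    """
    y, rate = 0.0, 0.0
    path = []
    for _ in range(steps):
        accel = (drive - damping * rate - stiffness * y) / inertia
        rate += accel * dt
        y += rate * dt
        path.append(y)
    return path

# Mechanical oscillator, Equation (10.1): illustrative values only.
m, gamma, k, F = 2.0, 0.5, 8.0, 1.0
x_path = simulate(m, gamma * m, k, F)

# Electrical circuit, Equation (10.2): values chosen to match term by term.
L, R, C, V = 2.0, 1.0, 0.125, 1.0
q_path = simulate(L, R, 1.0 / C, V)

# Both settle toward drive/stiffness, that is, F/k and C*V respectively.
x_final, q_final = x_path[-1], q_path[-1]
```

With matched coefficients the displacement and the charge follow the same sequence of numbers. The routine is indifferent to which quantity it is modeling, and in both cases it treats that quantity as continuous.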
My point is this. Even if - or especially if - we accept Duhem's account
of physical theories, it is nevertheless worthwhile to examine the models a
theory employs, to see what metaphysical picture is implicit in them. This is
precisely what goes on when we look to Hilbert spaces in order to find a
categorial framework within which to interpret quantum theory. To this
end, the more abstract the presentation of the theory the better.
To seek such a categorial framework is the reverse of a process Duhem
elsewhere condemns, whereby physical theories are assessed in the light of
prior metaphysical commitments; instead, we are asking the theory to pro-
vide our metaphysics. Nonetheless, a resolute Duhemian skeptic might
insist that the search for a categorial framework was not a useful philosophi-
cal occupation. This itself, however, would betray a certain metaphysical
commitment, albeit one expressed in antimetaphysical terms. There seems
no a priori reason to think that the search should be either fruitless or
uninteresting. And should the skeptic persist-so weak is the power of
rational argument to persuade-one could only say, "It was not you for
whom Part Two of this book was written."
* But see Teller (1989). And, in addition, Ned Hall has pointed out to me the problems raised
by the Pauli exclusion principle.
position is just to say that there is an extended region of space within which
there is a nonzero probability of finding it. The wave formalism offers a
convenient mathematical representation of this latency, for not only can the
mathematics of wave effects, like interference and diffraction, be expressed
in terms of the addition of vectors (that is, their linear superposition; see
Feynman, Leighton, and Sands, 1965, vol. 1, chap. 29-5), but the converse
also holds. Clearly, this mathematical equivalence is independent of the fact
that vectors can represent probability assignments; hence the propriety of
talking of the "interference effects" obtained in, for example, the two-slit
experiment. In contrast, "particle" effects typically occur when position is
localized; in other words, when a quantum event occurs, latency is actual-
ized and the "wave packet" collapses.
Thus the quantum event interpretation offers both an abstraction and a
generalization of the thesis of wave-particle duality; on the one hand, it
severs the thesis from its classical nineteenth-century antecedents, and, on
the other, it accommodates all quantum observables, not merely position
and momentum.
The sense in which a latency is a natural probabilistic generalization of a
property can be made more precise. Although the exact ontological status of
a property (greenness, for example) may be questioned, one thing is not in
dispute (Staniland, 1972). If we say of a billiard ball that it is green, then our
statement entails that, if viewed under normal conditions, it will have a
certain appearance; simply put, that it will appear green. In classical physics,
the ascription of a property to an object entails the truth of various
conditionals of the form, "If an (ideal) measurement of A is made, then the result
will lie within $\Delta$." I will call such a conditional a "measurement conditional"
and write it as $M_A \to (A,\Delta)$. $(A,\Delta)$ is, as usual, the event that an
A-measuring device gives a result within $\Delta$.
A complete description of a classical system would give us all its proper-
ties, so that every measurement conditional would be assigned "True" or
"False."* (This description is familiar from Chapter 2.) In contrast, the
ascription of a latency to a quantum system entails the truth or falsity of a
host of conditionals of the form:
$M_A \to [p(A,\Delta) = x]$
occurrence of a change of latency becomes frame relative, and this certainly
offends the spirit, if not the letter, of STR.
Indeed, at this point I can hear the objection that the interpretation
offered has just too many unpalatable features. On the one hand, so the
criticism runs, nonlocal conditionalization might be acceptable as a
convenient mathematical way to summarize the correlations associated with
coupled systems; on the other, the suggestion that there is a new ontological
category called "latency" seems fairly inoffensive. But when it transpires
that (1) these physically significant latencies can be changed by nonlocal
actions, and that (2) these alleged changes are not relativistically invariant,
that is just too much to swallow.
Not much can be said, I fear, to sweeten this particular pill, but perhaps
we can say more on behalf of the individual ingredients which together
prove so distasteful. To reiterate what was said in Section 10.1, in seeking an
interpretation of a theory we start from Duhem's thesis that a theory
provides an abstract summary and logical classification of a group of
experimental laws. However, that is only where we start. Though our final
convictions may be instrumentalist, we are setting these attitudes aside for the
time being and asking, what sort of world could be represented by the
mathematical models the theory provides? Further, if we are not
instrumentalists, we may hope that this way of proceeding sidesteps Duhem's
argument that, since "explanations" are formulated only with respect to a
set of prior metaphysical assumptions, to think that theories provide expla-
nations is misguided. We perform this sidestep by looking within the theory
for the categorial framework it suggests, and which is to be appealed to in
explanations.
Within quantum mechanics we find, in a word, probabilities. However,
the probability functions the theory uses cannot be regarded as weighted
sums of dispersion-free probability functions - that is, as weighted sums of
property ascriptions; quantum theory is irreducibly probabilistic. Rejecting
properties from our categorial framework, we replace them with their prob-
abilistic analogues, latencies. But why replace them with anything? Why
grant ontological status to these remote and shadowy quasi-attributes? A
specific argument for doing so will be offered in the next section; mean-
while, here are some general considerations.
We invoke latencies for much the same reasons that, in the macroscopic
world, we invoke properties. Attempts to give a purely phenomenalistic
account of properties notoriously failed (see, for example, Hirst, 1967); a
property ascription is more than the logical product of a set of conditionals
of the kind, "If I were looking at object X now, under normal conditions of
illumination, then I would be having sensations of greenness." Similarly for
latencies; these too license infinitely many subjunctive conditionals (of
which a proper subset are quantum measurement conditionals), but, for
much the same reasons, are not reducible to them.
What then of the projection postulate? This too emerges from the
nonclassical nature of the probability spaces we deal with. Regarded not just as a
postulate applying (occasionally) to the measurement process, but as the
quantum version of conditionalization, it provides explanations of the
otherwise inexplicable. In Section 8.9 I called these explanations "struc-
tural" but, if conditionalization is seen as a change in the latencies of a
system, they also acquire an ontological foundation.
It turns out that there is a price to be paid. Some of the conditionalizations
which figure in these explanations are nonlocal: latencies may be affected
by action at a distance. Even though stochastic Einstein-locality is respected,
the price may seem too high. The interpretation may still violate too many
intuitions. But so may quantum theory. And, like Isabella on a different
occasion, the fierce defender of intuitions may have got his priorities wrong.
After all, what's so hot about intuitions? Aren't these the folks who gave us
Bell's inequality? Duhem would have had little truck with them.
and call them $(a_1,a_2)$, $(b_1,b_2)$, $(c_1,c_2)$, respectively. Then a finer six-way
partition $(a_1,a_2,b_1,b_2,c_1,c_2)$ of $\Omega$ is available, and quantum probabilities
appear according to the recipe (for observable A, in this example):
Note that the event $a_1 \cup a_2$ is not identified with the events $b_1 \cup b_2$ and
$c_1 \cup c_2$, as it would be in the construction of an orthoalgebra of quantum
events (see Section 8.1). On the contrary, these three events are mutually
exclusive.
The example may be generalized. That is, given any generalized probability
function p defined on an orthoalgebra A, the probabilities p assigns to
members of A may be reproduced as classical conditional probabilities on a
Kolmogorov probability space as follows. Consider the family $\{\mathcal{B}_j\}$ of
maximal Boolean subalgebras of A. We embed these algebras individually in a
Kolmogorov probability space in such a way that their maxima are mutually
exclusive and jointly exhaustive: $I_i \cap I_j = \emptyset$ when $i \neq j$, and $\bigcup_j I_j = \Omega$. $\{I_j\}$ is
thus a partition of $\Omega$, and if $p_K$ is any classical probability function on $\Omega$, then
$\sum_j p_K(I_j) = 1$. To reproduce the probabilities assigned by p to members of A,
we stipulate that, for any event e in $\mathcal{B}_j$,
Such a probability function $p_K$ always exists, but since the assignments $p_K(I_j)$
are arbitrary (though they must all be nonzero), $p_K$ is not uniquely defined
by p.
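The construction can be sketched for the three-observable example. The code below is an illustration under assumptions: it takes the stipulation to be that the classical probability of an event e, conditional on the cell $I_j$ containing it, equals p(e), so that on atoms $p_K(e) = p_K(I_j) \cdot p(e)$; the particular probabilities, the weights, and the function names are all hypothetical.

```python
from fractions import Fraction

def embed(contexts, weights):
    """Build a classical joint distribution over (context, outcome) pairs.

    contexts: maps each measurement context to the generalized probabilities
              p assigns to its outcomes (each context sums to 1).
    weights:  the arbitrary nonzero classical probabilities p_K(I_j) of the
              cells of the partition (summing to 1).
    """
    joint = {}
    for ctx, dist in contexts.items():
        for outcome, p in dist.items():
            joint[(ctx, outcome)] = weights[ctx] * p
    return joint

def conditional(joint, ctx, outcome):
    """Classical conditional probability of an outcome given its cell I_j."""
    cell_total = sum(p for (c, _), p in joint.items() if c == ctx)
    return joint[(ctx, outcome)] / cell_total

# Hypothetical generalized probabilities for three incompatible observables.
contexts = {
    "A": {"a1": Fraction(1, 3), "a2": Fraction(2, 3)},
    "B": {"b1": Fraction(1, 2), "b2": Fraction(1, 2)},
    "C": {"c1": Fraction(1, 4), "c2": Fraction(3, 4)},
}

# Two different, equally admissible choices of the weights p_K(I_j).
weights_even = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}
weights_skew = {"A": Fraction(1, 2), "B": Fraction(1, 4), "C": Fraction(1, 4)}

joint_even = embed(contexts, weights_even)
joint_skew = embed(contexts, weights_skew)
```

Both joint distributions return the same conditional probabilities, namely those assigned by p, although they differ as classical probability functions on the six-way partition. This is the sense in which $p_K$ is not uniquely defined by p.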
To summarize. The Copenhagen view of quantum theory and the quan-
tum event view differ significantly in their treatment of probabilities.
Whereas on the quantum event view probabilities in quantum mechanics
are assigned by generalized probability functions to members of an orthoal-
gebra A of events, on the Copenhagen view the underlying probability
space is classical. This classical space is coarsely partitioned, each member of
the partition being the event that a particular measurement occurs, and each
corresponding to a maximal Boolean subalgebra of $A$. Probabilities are
assigned to events in this classical space by a Kolmogorov probability function.
$p_e(a_i) = p_e(M_A) \cdot q$
On the quantum event interpretation the equation holds because (1) the
state makes the conditional $M_A \rightarrow [p(A,a_i) = q]$ true, and (2) $M_A$ has
probability $p_e(M_A)$. On the Copenhagen interpretation we obtain the same
equation, since
$q = p_e(a_i \mid M_A) = \dfrac{p_e(a_i)}{p_e(M_A)}$
In the light of this one may ask, what does the quantum event interpreta-
tion achieve that a Copenhagen interpretation does not? What is gained by
the appeal to arcane nonclassical algebraic structures, let alone by the invo-
cation of dubiously metaphysical "latencies"?
The same question was raised at the end of Section 10.2, and I can now
amplify the answer given there. One specific achievement is the ability to
talk of the probability of one quantum event conditional on another. On the
quantum event interpretation, to ask what the probability is that a
measurement of $A$ will yield result $a_i$, given that an event $(B,b_j)$ has occurred, is to ask,
for what value of $x$ is the statement $M_A \rightarrow (p[(A,a_i) \mid (B,b_j)] = x)$ true? Since $p$
is a generalized probability function (GPF) defined on the set of subspaces
of a Hilbert space, the conditional probability $p[(A,a_i) \mid (B,b_j)]$ is given
straightforwardly by the Lüders rule. Chapter 8 demonstrated just how
fruitful the application of this rule can be. In contrast, on the Copenhagen
approach we have no ready means of dealing with sequences of events;
$p_K(a_i \mid b_j)$ will always be zero if $A$ and $B$ are incompatible.
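The contrast can be made concrete with a minimal sketch for a spin-1/2 system. It assumes the usual density-operator form of the Lüders rule, $p(a \mid b) = \mathrm{tr}(P_a P_b \rho P_b) / \mathrm{tr}(\rho P_b)$, and takes $B = S_z$ and $A = S_x$ with positive outcomes; the helper names and the initial state are hypothetical.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    return X[0][0] + X[1][1]


def projector(v):
    """Projector onto the ray spanned by the normalized 2-vector v."""
    return [[v[i] * complex(v[j]).conjugate() for j in range(2)]
            for i in range(2)]


def luders(state, P_b, P_a):
    """Probability of P_a conditional on P_b, by the Lüders rule."""
    numerator = trace(matmul(P_a, matmul(P_b, matmul(state, P_b))))
    denominator = trace(matmul(state, P_b))
    return (numerator / denominator).real


z_up = (1.0, 0.0)                               # S_z = +1/2
x_up = (1 / math.sqrt(2), 1 / math.sqrt(2))     # S_x = +1/2
initial = projector((math.cos(0.3), math.sin(0.3)))  # arbitrary pure state

p_quantum = luders(initial, projector(z_up), projector(x_up))
# On the classical partition, a_i and b_j lie in disjoint cells, so the
# Kolmogorov conditional probability p_K(a_i | b_j) is zero.
p_classical = 0.0
```

Here `p_quantum` comes out as one half whatever initial state is used (provided it gives $S_z = +1/2$ nonzero probability), whereas the classical conditional probability vanishes.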
More generally and fundamentally, the Copenhagen approach offers no
account at all of the relations between incompatible observables. There are
probability functions $p_K$, definable on the Kolmogorov space $\Omega$ constructed
according to the Copenhagen prescription, which do not generate
quantum-mechanical probabilities. To return to our earlier example involving
observables A, B, and C, a perfectly respectable classical probability measure
on the partition $\{a_1, a_2, b_1, b_2, c_1, c_2\}$ assigns to each of $a_1$, $b_1$, and $c_1$ the value
[…], and to each of $a_2$, $b_2$, and $c_2$ the value […]. This would yield the quantum
probabilities
Yet if A, B, and C are the familiar components of spin, $S_x$, $S_y$, and $S_z$,
respectively, no quantum-mechanical state assigns probability […] to the
positive value of each observable. (To be precise, no quantum state
simultaneously assigns to all three events probabilities greater than
$1/2 + \sqrt{3}/6 \approx 0.789$.)
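The bound can be checked numerically. The sketch below assumes the standard Bloch-vector description of a spin-1/2 state, in which a pure state with unit vector $n$ assigns probability $(1 + n_i)/2$ to the positive value of the spin component along axis $i$; mixed states have shorter vectors and cannot do better. The function names and the grid resolution are choices made here for illustration.

```python
import math


def spin_up_probabilities(theta, phi):
    """Probabilities of the positive value of S_x, S_y, S_z for the pure
    spin-1/2 state whose Bloch vector has polar angle theta, azimuth phi."""
    n = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    return tuple((1.0 + c) / 2.0 for c in n)


def best_common_probability(steps=400):
    """Grid search for the largest value that all three probabilities
    can reach simultaneously."""
    best = 0.0
    for i in range(steps + 1):
        theta = math.pi * i / steps
        for j in range(steps):
            phi = 2.0 * math.pi * j / steps
            best = max(best, min(spin_up_probabilities(theta, phi)))
    return best


BOUND = 0.5 + math.sqrt(3) / 6   # attained when n = (1, 1, 1) / sqrt(3)
```

The grid search approaches `BOUND` from below, so a classical measure whose conditional probabilities for all three positive values exceed it has no quantum counterpart.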
The Copenhagen interpretation offers no reason why such assignments
are ruled out. In rewriting the probabilities assigned by any GPF to elements
of an orthoalgebra as conditional probabilities defined on a classical
probability space, it takes no account of the fact that quantum mechanics uses
orthoalgebras which have a very rich structure; each is isomorphic to the set
of subspaces of some Hilbert space.
Not only does the quantum event interpretation regard that fact as cen-
tral, a partial explanation of it has already been offered which leads natu-
rally to the concept of latency.
The ascription of a particular latency to a system assigns probabilities to
the values of a family of observables. With this in mind, consider the analy-
sis of spin in Chapter 4. The question that chapter asks is, what are the
results of assuming that the probabilities associated with a particular family
of observables are constrained in ways suggested by "natural" symmetries
- the isotropy of space, for example? The answer is that only if all the
observables in the family can in some sense be regarded as components of a
vector is a model of the set of events available which uses the full
representational capacity of a Hilbert space; a condition must be put on the
probabilities associated with the component observables, analogous to those
obtaining when we deal with classical vector quantities. Equations (4.10) and
(4.11) give equivalent statements of the required condition.
My suggestion is that we think of this intricately related set of
probabilities as determined by some one feature of the system, and give the name
"latency" to this feature. Again, latencies appear as the probabilistic
analogues of properties. In classical mechanics a vector property, that is, a
particular value of a vector quantity like momentum, determines the values
of all components of that quantity. Analogously, in quantum theory, the
latency associated with, say, spin determines the probabilities assigned to
the values of all its component observables.
algebra, and so the statements describing these events will allow bivalent
truth assignments. Second, its indicator states are classical states, and can
be thought of as a partial list of its properties. This second feature is in fact
entailed by the first.
We see that, although the latencies of quantum theory are latencies with
respect to a set of events with a thoroughly non-Boolean structure,
nevertheless the set of events realizable at any given juncture - namely, the set of
events associated with any experimental situation - will form a Boolean
algebra. While the contribution of quantum theory is to show that the set of
all events, together with the states that assign them probabilities, can be
represented in a Hilbert space, the first requirement of this representation is
that it respects the classical structure of the set of events associated with a
given observable. This is the sense in which, on the quantum event
interpretation, the classical world is conceptually prior to the quantum world.
Implicit in quantum theory is a reference to a classical world. But where is
the boundary to be drawn? And what is the relation between the worlds?
Wigner, as we saw in Section 9.8, located the boundary at the level of
consciousness; the only classical device was a conscious observer. This is
undesirable on two grounds, (1) that it is too subjective for our tastes, and (2)
that it relies on a dubious distinction between mind and body. In contrast,
the original Copenhagen interpretation assumed a self-evident distinction
between the quantum and the classical worlds. This, however, is unhelpful
to those to whom the grounds for such a distinction do not immediately
reveal themselves. Quantum systems, we may say, are smaller than
macrosystems: an electron is paradigmatically the kind of system treated in
quantum theory, a piece of polaroid plus a photographic plate can act as a
classical measuring device. But is there a number N such that all systems of
N particles are microsystems, whereas all systems of N + 1 particles are
macrosystems? That sounds implausible.
One of Everett's aims, in his "relative state" formulation of quantum
mechanics (see Section 9.8), was to present the theory in such a way that this
"cut" between the quantum and the classical worlds disappears. Quantum
theory, on this account, would be a global theory; it would not be
conceptually improper - as it is on the Copenhagen interpretation - to talk of the
"universal wave function" (Everett, 1973). In implementing this program,
the "relative state" formulation ran into a difficulty (see the appendix to
Shimony, 1986): if the "branching" of the universe was to correspond
properly to the (apparent) collapse of the wave packet, then, contrary to
quantum mechanics, there had to be one preferred basis in which the states
of measurement systems were represented. Certain systems, in other words,
could not be properly accommodated within the theory; the "cut" which
Everett sought to eliminate did not disappear after all, and the problem of
the relation of the classical world to the quantum world was still with us.
Everett's interpretation provokes the question whether we can talk
meaningfully, as he thought we could, about the "universal wave func-
tion." If, as I have claimed, a reference to a classical world is implicit in
quantum mechanics, does this mean that this kind of talk is conceptually
confused?
It does not. I have argued that a quantum-mechanical state represents, at
least in part, dispositions to behave in certain ways in interactions with
certain classical systems. These dispositions do not go away if the interac-
tions are not realized; and even if, in our present universe, these dispositions
cannot be realized, we can still speak counterfactually about what would
happen were our universe to be embedded in another. It is thus not incoher-
ent to suggest that the universe has a quantum-mechanical state which is
unfolding as it should, even though there is (by definition) no external
material agency available to scrutinize it.
However, before arriving at a wave function for the universe, we need to
obtain wave functions for its components - including those which, as mea-
surement devices, furnish the classical world within which the latencies of,
say, an electron are realized. This confronts us with the measurement prob-
lem in its abstract form: if a particular set of quantum events is defined by
reference to the classical behavior of a given system, can we give a quan-
tum-mechanical account of that system?
Let us see how this thesis bears on the analysis of measurement. Assume
that a measuring system M interacts with a quantum system S and that we
describe this interaction as a measurement by M of the observable A for S.
Then, according to the quantum event interpretation, the quantum event $E_j$
occurs, where $E_j = (A,a_j)$, for some $a_j$. Thus, when we describe M classically,
$E_j$ occurs.
What happens if we now describe M quantum-mechanically, as,
according to the conventionality thesis, we may? We now portray S + M as
evolving according to the Schrödinger equation; no quantum event occurs unless
S + M interacts with a new measurement system M* which lies above the
new classical horizon and measures the value of some new observable,
either for M or for S + M. Nonetheless, von Neumann (1932, pp. 436-442)
provides an argument to show that there is a sense in which the two ways of
regarding M are equivalent.
Using the notation of Section 9.6, we assume that A has two values, and
that the eigenvectors of A in $\mathcal{H}^S$ are $v_+$ and $v_-$. The states of M are $u_0$ (the
ground state) and the two indicator states $u_+$ and $u_-$. When we represent M
quantum-mechanically, $u_0$, $u_+$, and $u_-$ become orthogonal vectors in $\mathcal{H}^M$.
Prior to the interaction with M, let the initial state of S be $v = c_+ v_+ + c_- v_-$;
we assume M to be in its ground state. Then, if we regard M classically,
quantum theory suggests that the event $E_+ = (A,+)$ will occur with
probability $|c_+|^2$; conditionalization on $E_+$ projects the state of S into $v_+$.
Let us now regard M as a quantum system. Consider the observable $A^M$ for
system M whose eigenvectors (in $\mathcal{H}^M$) are $u_0$, $u_+$, and $u_-$, with eigenvalues
0, +1, and -1, respectively. If we apply the Schrödinger equation to the
interaction between S and M, we obtain
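A minimal sketch of what such an interaction yields, assuming it takes the standard von Neumann form in which $v_\pm \otimes u_0$ evolves to $v_\pm \otimes u_\pm$ (this form, the amplitudes, and the function names are assumptions made for illustration). The sketch computes the reduced state of M by tracing out S.

```python
import math

M_LABELS = ("0", "+", "-")   # u_0, u_+, u_- in the space of M
S_LABELS = ("+", "-")        # v_+, v_- in the space of S


def after_interaction(c_plus, c_minus):
    """Amplitudes psi[(s, m)] of S + M after the assumed interaction
    v_s (x) u_0 -> v_s (x) u_s, starting from (c_+ v_+ + c_- v_-) (x) u_0."""
    psi = {(s, m): 0j for s in S_LABELS for m in M_LABELS}
    psi[("+", "+")] = complex(c_plus)
    psi[("-", "-")] = complex(c_minus)
    return psi


def reduced_state_of_M(psi):
    """Partial trace over S: rho[m][m2] = sum_s psi[s, m] * conj(psi[s, m2])."""
    return {
        m: {m2: sum(psi[(s, m)] * psi[(s, m2)].conjugate() for s in S_LABELS)
            for m2 in M_LABELS}
        for m in M_LABELS
    }


c_plus, c_minus = math.sqrt(0.3), math.sqrt(0.7)   # hypothetical amplitudes
rho_M = reduced_state_of_M(after_interaction(c_plus, c_minus))
```

Under these assumptions the reduced state of M is diagonal in the indicator basis, with weights $|c_+|^2$ and $|c_-|^2$, so the probability of finding M in $u_+$ agrees with the classical-description probability of $E_+$.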
Despite this reassurance, we still face a problem. Assume that we use M*
to observe M, and that we obtain the result $(A^M,+)$. We are here regarding M
not as a classical measuring device but as a quantum system. The question is,
in this situation does the event $E_+$ occur or not? It seems that, by deciding to
draw the classical horizon below M rather than above it, we can bring about
the event $E_+$; in other words, it seems that a conventional choice of horizon
has an ontological consequence. Prima facie, this seems to bode ill for the
conventionalist thesis.
In fact, as this analysis shows, it is the conventionalist thesis that creates
the measurement problem. Note that only the least contentious aspect of the
quantum event interpretation - the claim that the registration of a value by
a measurement device can be called an event - is invoked in the generation
of this problem. If, additionally, the projection postulate is accepted, then a
further problem appears: does the state of S change to $v_+$ as a result of the
interaction with M or not?
One strategy open to the conventionalist is this. He may say that when S
interacts with M the quantum event (A,+) occurs, leaving M with the
property corresponding to the positive value of A (call this property A+). To say
this is to describe M in classical terms. This does not rule out the possibility of
describing it quantum-theoretically, he may continue, but if we do so we
forgo two things. We can no longer speak of a quantum event occurring,
since that would involve reference to a classical system, nor can we speak of
M as having a property. However, this means neither that no quantum
event has taken place, nor that M does not in fact have a property; it is rather
that quantum mechanics only allows us to speak of latencies. When M*
"looks at" M, we can describe that interaction classically: M* tells us what
property M has; we may also describe it quantum-mechanically, as the
occurrence of the event $(A^M,+)$. These two modes of description are, again,
two alternative ways to describe M.
There are two things to be said about this suggestion. The first is that it
does not entirely dissolve the problem; it shifts it to a new location. It breaks
the "property-eigenvector link" usually assumed to hold of measuring
systems. We describe M classically as having the property A+, or we describe it
quantum-mechanically as being in the eigenstate $u_+$; the assumption is
usually made that M has the property A+ if and only if it is in the indicator
state $u_+$. On the suggested analysis, the conditional holds in one direction
only: if M is in the state $u_+$ then it has the property A+. However, it may
also have the property A+ even when it is in the mixed state
$D^M = |c_+|^2 P^M_+ + |c_-|^2 P^M_-$ as a result of its interaction with S (see Section 9.6).
Second, although the suggestion allows us to deal with properties, it will
not work for the projection postulate. Whereas we can say without
inconsistency that M has the property A+ when it is in the mixed state $D^M$, we
cannot say that S is both in the pure state $v_+$ and in the mixed state
$D^S = |c_+|^2 P^S_+ + |c_-|^2 P^S_-$.
The strategy, together with the two corollaries just mentioned, moves us
very close to van Fraassen's "modal interpretation" of quantum theory. Van
Fraassen (1974a, pp. 300-301) presents in the formal mode what I have put
in ontological terms:
We distinguish two kinds of statements-state attributions and value attribu-
tions . . . The state of the system describes what is possibly the case about values
of observables; what is actual is only possible relative to the state and is not deduci-
ble from it.
Van Fraassen is happy to reject the projection postulate (p. 299) and,
although he does not write in quite these terms, the severing of the
property-eigenvector link appears as a small price to pay for allowing a measurement
device to be given both a classical and a quantum-theoretic description.
Ingenious though this interpretation is, I do not think it is right. I say this
with some reluctance, since, as we shall see, it solves a number of intractable
problems. My reservation stems in part from a belief in the explanatory
value of the Lüders rule (which in one guise acts as a projection postulate),
and in part from a belief that van Fraassen's partial rejection of the
property-eigenvector link does not go far enough. I consider even a partial
identification of classical properties of a macroscopic system with quantum
states of that system to be problematic.
For the question one cannot avoid is, are nontrivial superpositions of
these quantum states also admissible pure states of the measurement de-
vice? If so, then they are pure states in a wholly Pickwickian sense, since no
observable distinguishes them from the corresponding mixtures of indicator
states. If not, if the set of admissible pure states is restricted to the indicator
states, then the account runs into the difficulty described in Section 9.7: this
restriction on the set of admissible quantum states is incompatible with the
application to the system of Schrödinger's equation, and hence with treating
it as a quantum system. This is not to say that classical systems admit no
quantum-mechanical description, just that, to the extent that an indicator
state is a classical property, it is implausible to treat it as a quantum state.*
Von Neumann's consistency proof, in my view, has little to do with
measurement or with the question of the classical horizon. If the system
M + S evolves according to the Schrödinger equation, then the states $u_0$, $u_+$,
and $u_-$ of M cannot be regarded as classical indicator states of M, and so M
cannot function as a measurement apparatus in the way the proof suggests.
What the proof shows is the possibility of a quantum amplification device or
relay.
* See also Leggett (1986). This paper came to my attention too late to be discussed here.
The mere rejection of a particular identification of classical states with
quantum states does nothing, however, to resolve the crucial and persistent
question we are left with: what is the conceptual relation between the
quantum world and the classical world? This is the touchstone, pyx, assay,
ordeal, the High Noon, the Big Enchilada for all interpretations of quantum
theory.
Let us approach the question from the classical side, and ask: are there in
the actual world systems which behave like classical systems? To reiterate a
point made earlier, I am not asking whether there are systems whose behav-
ior is governed by the laws of nineteenth-century physics. The question is:
are there systems to which we can consistently ascribe properties, the set of
which forms a Boolean algebra? If we permit ourselves the kind of idealiza-
tion appropriate to any physical theory, the answer is clearly yes. Call these
C-systems. It turns out that there are very small systems whose behavior
with respect to certain C-systems differs markedly from that of other
C-systems. The most complete specification of the state of one of these small
systems that we can obtain assigns probabilities to events associated with
properties of the C-systems with which it interacts. Call these Q-systems.
Are Q-systems and C-systems different in kind? Our best theory tells us
that C-systems are made up of a great number of interacting Q-systems.
Further, our theory of Q-systems includes an account of what happens
when a number of Q-systems together form a larger system, and this ac-
count has received experimental confirmation. We are led to postulate six
theses.
these models. If "to justify" here means more than "to show that they save
the phenomena," and what is required is some deeper analysis warranting
their use, then it cannot do so.
This argument is not intended to provide "a tranquilizing
philosophy, . . . a gentle pillow for the true believer from which he cannot easily
be aroused" (Einstein, letter to Schrödinger, May 1928, on the Copenhagen
interpretation; quoted in Bub, 1974, p. 46). It is an argument which claims
that the scope of quantum theory is limited by its own structure.
Landau and Lifschitz (1977, p. 3) write,
Thus quantum mechanics occupies a very unusual place among physical theories: it
contains classical mechanics as a limiting case, yet at the same time it requires this
limiting case for its own formulation.
I suggest that the explanation of how this can be so, how we can use the
limiting cases of quantum theory in order to formulate the theory, cannot be
given within the theory itself. It will have to await the arrival of a new
physical theory, a theory which is not formulated against a classical horizon
in the way that quantum mechanics is.
Can there be such a theory?
Probably.
APPENDIX A
Gleason's Theorem
Gleason's theorem is of fundamental importance, not only for the theory of
Hilbert spaces, but for the interpretation of quantum mechanics. Though
the original proof, published in 1957, was mathematically very difficult, in
1984 an "elementary proof" of the theorem was given by Cooke, Keane,
and Moran (whom I shall refer to collectively as "CKM"), and it is
reproduced below. For the amateur mathematician, even this proof is difficult
enough. To ease the reader's task I have added a commentary consisting
partly of explanations of unfamiliar terms, but mostly of answers to the
questions I asked myself as I worked through the proof. These questions
were of two kinds: "Why is this move being made here?" and "How does
this follow?" I assume a familiarity with Section 5.6 of the text, and with the
vocabulary and notation of set theory (see, for example, Monk, 1969). CKM
also use one theorem from topology which I quote but do not explain. The
theorem guarantees the existence of the limit of certain sequences; the
reader will have to take it on trust, but in context the intuitive content of its
conclusion will be clear.
I have not altered the symbols used by CKM to make them conform to
those used in the text and in my commentary, but since the symbols in the
proof are all defined on first use, this should cause no problems; the reader
need only be aware that such differences exist.
Abstract
Gleason's theorem characterizes the totally additive measures on the closed
subspaces of a separable real or complex Hilbert space of dimension greater
than two. This paper presents an elementary proof of Gleason's theorem
which is accessible to undergraduates having completed a first course in real
analysis.
Introduction
Let H be a separable Hilbert space over the real or complex field. A (normal-
ized) state on H is a function assigning to H the value 1, assigning to each
closed subspace of H a number in the unit interval, and satisfying the
following additivity property: If any given subspace is written as an orthog-
onal sum of a finite or countable number of subspaces, then the value of the
state on the given subspace is equal to the sum of the values of the state on
the summands. States should be thought of as 'quantum mechanical proba-
bility measures'; they play an essential role in the quantum mechanical
formalism. For an exposition of these ideas we refer to Mackey (1963).
Examples of normalized states are obtained by considering positive self-
adjoint trace class operators with trace 1 on H. Such operators correspond to
preparation procedures in quantum mechanics. If A is such an operator,
then it is easy to see that we can define a state by associating to each
one-dimensional subspace generated by a unit vector $x \in H$ the inner product
$(Ax,x)$ and extending to subspaces of dimension greater than one by
countable additivity. States of this type are called regular states.
In his course on the mathematical foundations of quantum mechanics
Mackey (1963) proposed the following problem: determine the set of states
on an arbitrary real or complex Hilbert space. This problem was solved by
Gleason (1957) and the principal result, known as Gleason's theorem, states
that every state on a real or complex Hilbert space of dimension greater than
two is regular. Gleason's proof uses the representation theory of O(3), and
relies on an intricate continuity argument. Because of the role which
Gleason's theorem plays in the foundations of quantum mechanics, there have
been several attempts to simplify its proof. Using elementary methods, Bell
(1966) proved a special case of the theorem, namely, that there exist no
states on the closed subspaces of a Hilbert space of dimension greater than
two taking only the values zero and one. Kochen and Specker (1967) proved
a similar result for states restricted to a finite number of closed subspaces.
Piron (1976) produced an elementary proof of Gleason's theorem for the
special case that the state is extreme (i.e. assigns the value 1 to some
one-dimensional subspace).
In this article we give an elementary proof of Gleason's theorem in full
generality. Although this proof is longer than Gleason's proof, we believe
that it contributes to the intuitive understanding of the underlying reasons
for the validity of the theorem. The structure of the argument is as follows.
In § 1 we show that it is enough to handle the case $H = \mathbb{R}^3$. This was part of
Gleason's original argument, and is well understood; the essential difficulty
of the proof is the treatment of the case $H = \mathbb{R}^3$. For this purpose it is
convenient to study a certain class of real-valued functions on the unit
sphere of $\mathbb{R}^3$, called frame functions. §§ 2 and 3 are devoted to an exposition
of the properties of frame functions and the statement of the theorem in the
case of $\mathbb{R}^3$ in terms of frame functions. § 3 also contains two 'warm-up
theorems' whose contents were essentially known to 19th century mathe-
maticians. Coupled with a basic lemma in § 4 (essentially due to Gleason
and Piron), they yield a new proof for the extreme case, which is given in § 5.
In § 6 we show that a weak form of continuity in the general case follows
from the result of § 5, and in § 7 we treat the general case. The proofs in
§§ 2-7 are accessible to undergraduates who have completed a first course in
real analysis.
1. Reduction to $H = \mathbb{R}^3$
Let H be a real or complex separable Hilbert space, and let L be the set of
closed subspaces of H. If $A \in L$ and $B \in L$, then we write $A \perp B$ if A and B are
orthogonal. For $A_i \in L$, $i \in I$, $\bigvee_{i \in I} A_i$ denotes the smallest closed subspace
containing $A_i$ for all $i \in I$. If x is a vector in H, then $\bar{x}$ denotes the
one-dimensional subspace generated by x.
Definition. A function $p: L \to [0,1]$ is called a state if for all sequences
$\{A_i\}_{i=1}^{\infty}$, $A_i \in L$, $i = 1, \ldots$; with $A_i \perp A_j$ for $i \neq j$:
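The condition meant here is the countable additivity described in words in the Introduction. A sketch of it in the notation of this section, with the normalization taken from the Introduction (the exact typography is an assumption):

```latex
\[
  p\Bigl(\bigvee_{i=1}^{\infty} A_i\Bigr) \;=\; \sum_{i=1}^{\infty} p(A_i),
  \qquad p(H) = 1 .
\]
```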
p(x) = (Ax,x).
p(x ) = B(x,x).
B(x,y) = BE(X,y).
$M = \sup_{x \in H} p(x)$.
Proof. Every state on H necessarily induces a continuous symmetric
bilinear form on every completely real three-dimensional subspace, and every
completely real two-dimensional subspace can be embedded in a
completely real three-dimensional subspace. It follows that the restriction of a
state on H to any two-dimensional completely real subspace is regular, and
from the above lemmata it follows that every state on H is regular. ∎
2. Frame Functions
In this section, we define frame functions, collect some of their properties,
and give examples. Denote by S the unit sphere of a fixed three-dimensional
real Hilbert space. If s and s' are elements (i.e. vectors) of S, the angle
between s and s' is designated by $\theta(s,s')$. If $\theta(s,s') = \pi/2$, we write $s \perp s'$.
Definition. A frame is an ordered triple (p,q,r) of elements of S such that
$p \perp q$, $p \perp r$ and $q \perp r$.
Given a frame (p,q,r), each point in S (and in the vector space) can be
uniquely expressed as xp + yq + zr, with $x,y,z \in \mathbb{R}$. We call (x,y,z) the frame
coordinates of the point with respect to the given frame.
Definition. A frame function is a function $f: S \to \mathbb{R}$ such that the sum
$w(af) = a\,w(f)$,
$w(f + g) = w(f) + w(g)$ ($a \in \mathbb{R}$; f, g frame functions).
Proof Given s with f(s) > M ~, c ho()s ' () 0 such thaI [(s) , M <: I (),
and t' such that $f(t') < m + \delta$. Then s and t' determine a great circle on S, and
if t, s' are chosen on this great circle with $s \perp t$, $s' \perp t'$, P3 yields:
and the sum of the squares of these three numbers is one since $p_0 \in S$. Hence
is a frame function, with w(f) = 1. Next, fix a frame $(p_0,q_0,r_0)$ and a triple
$(\alpha,\beta,\gamma)$ of real numbers. Let $(x_0,y_0,z_0)$ denote the frame coordinates of a point
$s \in S$ with respect to $(p_0,q_0,r_0)$. By the above and by P1,
Q(s) = (s,As)
M = sup f(s)
m = inf f(s)
$\alpha = w(f) - M - m$.
Then there exists a frame (p,q,r) such that if the frame coordinates with respect to
(p,q,r) of $s \in S$ are (x,y,z),
$f(s) = Mx^2 + \alpha y^2 + mz^2$
for all $s \in S$.
In particular, the proposition of § 2 provides all bounded frame functions.
We remark that the above representation implies that $m \leq \alpha \leq M$; if
$m < \alpha < M$, then the frame (p,q,r) is unique up to change of sign; if $m \leq \alpha < M$
then p is unique up to change of sign, and similarly for $m < \alpha \leq M$.
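The converse direction, that a function of this quadratic form is indeed a frame function, is easy to check numerically. The sketch below assumes the usual definition (the sum of the values of f over the three members of any frame is a constant, the weight w(f)) and tests it on randomly generated frames; the helper names and the numerical values of M, $\alpha$, and m are hypothetical.

```python
import math
import random


def normalize(v):
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def random_frame(rng):
    """Return an orthonormal triple (p, q, r) of unit vectors in R^3."""
    p = normalize([rng.gauss(0, 1) for _ in range(3)])
    v = [rng.gauss(0, 1) for _ in range(3)]
    dot = sum(a * b for a, b in zip(v, p))
    q = normalize([a - dot * b for a, b in zip(v, p)])   # Gram-Schmidt step
    r = (p[1] * q[2] - p[2] * q[1],                      # cross product p x q
         p[2] * q[0] - p[0] * q[2],
         p[0] * q[1] - p[1] * q[0])
    return p, q, r


def quadratic_frame_function(M, alpha, m):
    """f(s) = M x^2 + alpha y^2 + m z^2 in fixed coordinates (x, y, z)."""
    def f(s):
        x, y, z = s
        return M * x * x + alpha * y * y + m * z * z
    return f


def frame_sum(f, frame):
    return sum(f(s) for s in frame)


rng = random.Random(0)
f = quadratic_frame_function(M=0.6, alpha=0.3, m=0.1)
sums = [frame_sum(f, random_frame(rng)) for _ in range(100)]
```

Every entry of `sums` equals M + alpha + m up to rounding, because the squared first (or second, or third) coordinates of the members of any frame add up to one.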
In order to clarify the idea behind our proof of the above result, we now
state and prove a theorem which might be called an 'abelianized' version of
Gleason's theorem. Its content was essentially known to 19th century
mathematicians.
'WARM-UP' THEOREM I. Let $f: [0,1] \to \mathbb{R}$ be a bounded function such that for all
$a,b,c \in [0,1]$ with a + b + c = 1, f(a) + f(b) + f(c) has the same value $w =
w(f)$. Then $f(a) = (w - 3f(0))a + f(0)$ for all $a \in [0,1]$.
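As a quick consistency check (not part of the proof), one can confirm that a function of the stated form does satisfy the hypothesis. For any $a + b + c = 1$:

```latex
\[
  f(a) + f(b) + f(c)
  = \bigl(w - 3f(0)\bigr)(a + b + c) + 3f(0)
  = w - 3f(0) + 3f(0)
  = w .
\]
```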
Proof. By subtracting a constant, we may assume f(O) = O. Now take c = 0,
b = 1 - a to obtain
for all $a, b, a + b \in [0,1]$. This implies immediately that
$f(a) = wa$
for all rational a, and for general $a \in [0,1]$ and $n \geq 1$ with $na \leq 1$ we have
$f(na) = nf(a)$.
Hence as a tends to 0, f(a) must tend to 0 because f is bounded, and thus
for rational r, r' with $ra_0$, $r'a_0$, $(r + r')a_0 \in [0,1]$, and hence
for rational r with $ra_0 \in [0,1]$. It now follows from (2) that f(a) = a for all
$a \in [0,1] \setminus C$. ∎
$D_s = \{t \in N : t \perp s^{\perp}\}$ $(s \in N \setminus \{p\})$
will be called the descent through s; it is the great circle through s which has s
as its northernmost point. (For $e \in E$, $D_e = E$.) We can now state the basic
lemma:
BASIC LEMMA. Let f be a frame function such that
(1) $f(p) = \sup_{s \in S} f(s)$, and
(2) f(e) has the same value for all $e \in E$.
Then if $s \in N \setminus \{p\}$ and if $s' \in D_s$,
$f(s) \geq f(s')$.
Let $s \in N \setminus \{p\}$ and $s' \in D_s$. Choose $t, t' \in D_s$ with $s \perp t$, and $s' \perp t'$. By
property P3,
Figure A1.1
Figure A1.2
angle between $s_i$ and $s_{i+1}$ in the plane is $\pi/n$ (see Figure A1.3). Then $s_n$ has
coordinates $(-y,0)$ and we wish to show that $y - x \to 0$ as $n \to \infty$. Let $d_k$ be
the distance of $s_k$ from the origin. Then $d_0 = x$ and $d_n = y$. For each i we have
$d_{i+1}/d_i = 1/\cos(\pi/n)$,
and hence
$1 \leq \dfrac{y}{x} = \dfrac{d_n}{d_0} = \prod_{i=1}^{n} \dfrac{d_i}{d_{i-1}} = \dfrac{1}{(\cos \pi/n)^n} \leq \dfrac{1}{(1 - \pi^2/2n^2)^n}$,
Figure A1.3 ($s_n = (-y, 0)$, $s_0 = (x, 0)$)
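The limit claimed in this step can be checked numerically. A minimal sketch, assuming only the relation $d_{i+1}/d_i = 1/\cos(\pi/n)$ stated above and the elementary inequality $\cos t \geq 1 - t^2/2$; the function names are chosen here for illustration, and n is taken to be at least 3 so that both expressions are defined and positive.

```python
import math


def radial_growth(n):
    """Return y / x = d_n / d_0 = (cos(pi/n))**(-n) for n >= 3 steps,
    each turning through the angle pi/n."""
    return math.cos(math.pi / n) ** (-n)


def upper_bound(n):
    """The bound (1 - pi^2 / (2 n^2))**(-n), from cos(t) >= 1 - t^2 / 2."""
    return (1.0 - math.pi ** 2 / (2.0 * n ** 2)) ** (-n)


ratios = {n: radial_growth(n) for n in (3, 10, 100, 1000, 10000)}
```

The ratios decrease toward 1 as n grows (roughly like $\exp(\pi^2/2n)$), so y - x can be made as small as desired by taking enough steps.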
$(1/(M - m))(f - m)$). Let $s,t \in N \setminus \{p\}$ with $l(s) > l(t)$. Then by the geometric
lemma and the basic lemma of the preceding section, we have
$f(s) \geq f(t)$.
Then $\bar{f}(1) = \underline{f}(1) = 1$, $\bar{f}(0) = \underline{f}(0) = 0$, and if $l, l' \in [0,1]$ with $l < l'$, it follows
from the above that
$\bar{f}(l) \leq \underline{f}(l')$.
Hence the set $C := \{l : \bar{f}(l) > \underline{f}(l)\}$ is at most countable, as
$\sum_{l \in C} (\bar{f}(l) - \underline{f}(l)) \leq 1$.
If $l, l', l'' \in [0,1]$ with $l + l' + l'' = 1$, then there exists a frame (q,q',q'') with
$l(q) = l$, $l(q') = l'$, $l(q'') = l''$. That is, the function f satisfies the hypotheses of
Warm-up Theorem II, and we conclude that
6. Extremal Values
In this section we use the results of the preceding section to show that
bounded frame functions attain their extremal values. Let f be a bounded
frame function,
$M = \sup_{s \in S} f(s)$,
$p = \lim p_n$,
$\lim c_n = p$.
$(s \in S)$.
$(s \in S)$.
Under the product topology, this space is compact, so that the sequence $h_n$
has an accumulation point, which we denote by h. Then:
(1) $h(p) = 2M = \sup_{s \in S} h(s)$.
(2) h is constant on E.
(3) h is a frame function, since the frame functions form a closed subset of
$[2m,2M]^S$.
By the theorem of § 5, h is continuous (and has a special form, which does
not interest us here).
STEP 4. f(p) = M
Choose $\varepsilon > 0$, and choose $e \in C_0$ such that $h(e) > 2M - \varepsilon$. Applying the
approximate version of the basic lemma to $h_n$ and noting that we can reach $e$
from $c_n$ in two steps (easiest case of the geometric lemma) for sufficiently
large $n$, we obtain
such that
$$\lim_{i \to \infty} h_{n_i}(e) > 2M - \varepsilon.$$
$$f(s) + f(\hat{p}s)$$
takes the constant value $m + \alpha$ on the equator, and attains its supremum $2M$
at $p$. Letting
we have from § 4
Proof of claim. (a) Note that $\hat{r}(x,y,z) = (-y,x,z)$; $\hat{p}(x,y,z) = (x,-z,y)$. Applying
these operations in succession, one verifies:
$$\hat{p}\hat{p}\hat{r}(x,x,z) = (-x,-x,-z),$$
$$\hat{p}\hat{r}\hat{r}(x,z,z) = (-x,-z,-z),$$
Suppose $s = (x,x,z)$. Since $f(s) = f(-s)$, $g(s) = g(-s)$ (by property P2), we
conclude from (*):
$$f(s) + f(\hat{r}s) = g(s) + g(\hat{r}s),$$
$$f(\hat{r}s) + f(\hat{p}\hat{r}s) = g(\hat{r}s) + g(\hat{p}\hat{r}s),$$
$$f(\hat{p}\hat{r}s) + f(\hat{p}\hat{p}\hat{r}s) = g(\hat{p}\hat{r}s) + g(\hat{p}\hat{p}\hat{r}s);$$
subtracting the second equation from the sum of the first and third, we
conclude that $f(s) = g(s)$. The other two cases under (a) are proved similarly.
(b) Suppose $s = (x,-x,z)$; then $\hat{r}(x,-x,z) = (x,x,z)$, which lies on the great
circle $x = y$. From (a) we know that $f(\hat{r}s) = g(\hat{r}s)$, and from (*) we conclude
that also $f(s) = g(s)$. The other two cases in (b) are proved similarly, and the
claim is proved.
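The coordinate identities on which the claim rests are easy to verify mechanically. The sketch below encodes the two quarter-turn rotations stated at the start of the proof, (x,y,z) → (−y,x,z) and (x,y,z) → (x,−z,y); the function names `r_hat` and `p_hat` and the sample coordinates are labels chosen for this illustration, not notation taken from CKM.

```python
def r_hat(v):
    """Quarter turn sending (x, y, z) to (-y, x, z)."""
    x, y, z = v
    return (-y, x, z)


def p_hat(v):
    """Quarter turn sending (x, y, z) to (x, -z, y)."""
    x, y, z = v
    return (x, -z, y)


def negate(v):
    """Return the antipodal point -v."""
    return tuple(-c for c in v)


x, z = 3, 5  # arbitrary sample coordinates; normalization plays no role here

case_a1 = p_hat(p_hat(r_hat((x, x, z))))  # expected to equal -(x, x, z)
case_a2 = p_hat(r_hat(r_hat((x, z, z))))  # expected to equal -(x, z, z)
case_b = r_hat((x, -x, z))                # expected to land on the circle x = y

print(case_a1, case_a2, case_b)
```

Since a frame function takes the same value at s and −s, the first two identities are what allow the three equations for s = (x,x,z) to be combined into f(s) = g(s).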
Now define $h := g - f$. $h$ is clearly a frame function, and the claim implies
that $h(p) = h(q) = h(r) = 0$, so that the weight of $h$ is zero. We also know that
$h$ is zero on the six great circles $x = \pm y$, $x = \pm z$, $y = \pm z$. The proof is completed
by showing that $h$ is identically zero. Assume that $h$ is not identically
zero; then by §5 we may put
(ii) $\alpha' = 0$: This follows immediately from (i) and the fact that $h$ has
weight zero.
(iii) $h(x',x',z') = M'(x'^2 - z'^2)$, where $(x',y',z')$ denote the $(p',q',r')$-frame
coordinates. Using the previous two steps, this follows from the claim, upon
substituting $h$ for $f$ and $M'(x'^2 - z'^2)$ for $g$.
(iv) On the great circle x' = y', h takes the value zero at exactly the
following four points: (x',x',x'), (x',x',-x'), (-x',-x',x') and (-x',-x',-x').
The great circles x = y, x = z and y = z intersect in the two points: (x,x,x)
and (-x,-x,-x). As h is zero on these great circles, we see that the great
circle x' = y' must pass through the points (x,x,x) and (-x,-x,-x), since
otherwise there would be six points on x' = y' at which h takes the value
zero. The great circles x = - y and x = - z intersect at (x, - x, - x) and
(- x,x,x). x' = y' must also intersect these points, since otherwise it would
intersect x = -y and x = -z at four points, making six points at which h
would take the value zero on x' = y'. However, there is only one great circle,
passing through the four points (x,x,x), (- x, - x, - x), (x, - x, - x) and (- x,x,x),
namely y = z. It follows that y = z and x' = y' describe the same great circle,
and therefore h must take the value zero at all points of x' = y'. This contradicts
step (iv) and the theorem is proved.
B(x,y) = [B(y,x)]*
The restriction of B to pairs of vectors of the form (x,x) yields a quadratic form
Q (see §2), such that
Q(x) = B(x,x)
In the second argument CKM show that, for an arbitrary vector $y$, $p(y)$ is
given by an expression involving just $y$, $x$, and $M$. The vectors $x$ and $y$ are
assumed to be normalized, and, although the main result does not depend
on it, so is $p$: that is, CKM assume that $p(\mathcal{H}) = 1$; hence, for any $x^{\perp}$ orthogonal
to $x$, $p(x^{\perp}) = 1 - M$. For any angle $\theta$, since $e^{i\theta}$ is a scalar, $\bar{y}$ contains $e^{i\theta}y$.
We choose $\theta$ so that $\langle x | e^{i\theta}y \rangle$ is real; then within the two-dimensional complex
space $\mathcal{H}$, there is a completely real two-dimensional space $\mathcal{H}_R$ containing
both $x$ and $e^{i\theta}y$. Let $x^{\perp}$ be a normalized vector in $\mathcal{H}_R$ orthogonal to $x$; then
there exist real numbers $b_1$ and $b_2$ such that $b_1^2 + b_2^2 = 1$ and
$e^{i\theta}y = b_1 x + b_2 x^{\perp}$. Note that $b_1 = \langle x | e^{i\theta}y \rangle$ and $b_2 = \langle x^{\perp} | e^{i\theta}y \rangle$. The restriction
of $p$ to $\mathcal{H}_R$ is again a normalized state, and by the assumption of the lemma
there is a self-adjoint trace-class operator $A$ on $\mathcal{H}_R$ such that, for all normalized
$v$ in $\mathcal{H}_R$, $p(v) = \langle Av | v \rangle$. Furthermore, we can show that, since $A$ is a
trace-class operator on $\mathcal{H}_R$ and $\langle Av | v \rangle$ is at a maximum when $v = x$, $x$ is an
eigenvector of $A$. Since $\mathcal{H}_R$ is two-dimensional and $A$ is self-adjoint, $x^{\perp}$ is
also an eigenvector of $A$. For any normalized eigenvector $v$ of $A$ with
corresponding eigenvalue $a$, we have $\langle Av | v \rangle = a$; hence the eigenvalues of
$A$ corresponding to $x$ and $x^{\perp}$ are, respectively, $M$ and $1 - M$.
As Cooke has pointed out to me (pers. com., May 1988), the neatest way
to obtain the result of the lemma is now to use the remark following equation
(*) in §2. To see this, consult the commentary on §2 and consider the
frame $\{x, x^{\perp}\}$ in the two-dimensional space $\mathcal{H}_R$. The coordinates of $e^{i\theta}y$ with
respect to this frame are $\langle x | e^{i\theta}y \rangle$ and $\langle x^{\perp} | e^{i\theta}y \rangle$, respectively, and, as we have
noted, $\langle x | e^{i\theta}y \rangle^2 + \langle x^{\perp} | e^{i\theta}y \rangle^2 = 1$. By plugging in these coordinates and the
eigenvalues of $A$ into the (two-dimensional version of) equation (†) of the
commentary on §2, we get:
Commentary on §2
The injunction after the equation marked (*) may well tax the resources of
memory. The result can be quickly shown for quadratic forms on $\mathbb{R}^3$, as
follows.
To each quadratic form $Q$ on $\mathbb{R}^3$ there corresponds a unique symmetric
bilinear form $B$ on $\mathbb{R}^3$, such that $B(x,y) = B(y,x)$, and a symmetric operator $A$
on $\mathbb{R}^3$ such that, for all $s \in \mathbb{R}^3$,
In the proposition at the end of the section, CKM treat a more general
case, since $A$ is not necessarily symmetric. To extend the above proof to the
general case, we need to consider the symmetric operator $\tfrac{1}{2}(A + A^{T})$ (see
Fano, 1971, p. 68).
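The reason the symmetrized operator suffices is that an operator and its symmetric part determine the same quadratic form on real vectors. A minimal numerical sketch follows; the matrix `A` and the test vectors are arbitrary values chosen for illustration, and the helper names are not drawn from the text.

```python
def quad_form(A, s):
    """Return <As|s> = sum_ij s_i A_ij s_j for a real 3x3 matrix A and real vector s."""
    return sum(s[i] * A[i][j] * s[j] for i in range(3) for j in range(3))


def symmetric_part(A):
    """Return (A + A^T) / 2."""
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(3)] for i in range(3)]


# An arbitrary non-symmetric operator (illustrative values only).
A = [[1.0, 2.0, 0.0],
     [-1.0, 0.5, 3.0],
     [4.0, 0.0, -2.0]]
A_sym = symmetric_part(A)

for s in [(1.0, 0.0, 0.0), (0.6, 0.8, 0.0), (1.0, -2.0, 0.5)]:
    # The antisymmetric part of A contributes nothing to the quadratic form.
    print(quad_form(A, s), quad_form(A_sym, s))
```

The two printed values agree for every vector, because the antisymmetric part of A cancels term by term in the double sum.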
Further notes. (1) Property P4 plays a large part in what follows; note that it
yields an inequality: [(t) < m + C;.
(2) Compare the frame function $f(s) = \cos^2 \theta(p_0, s)$ with equation (4.4).
Commentary on §3
To recognize the theorem given by CKM at the beginning of this section as
Gleason's theorem, note that (1) every state $p$ on $\mathbb{R}^3$ is a bounded frame
function; we must also show (2) that from the conclusion of the CKM
theorem it follows that there is a symmetric operator $A$ on $\mathbb{R}^3$ such that, for
every $s \in S$, $p(\bar{s}) = f(s) = \langle As | s \rangle$. (Recall that $\bar{s}$ is the ray containing $s$.) (2) is
the converse of the proposition of §2. Proof by construction: Let
$A = MP_p + \alpha P_q + mP_r$, where $P_p$, $P_q$, $P_r$ project onto $\bar{p}$, $\bar{q}$, $\bar{r}$, respectively. If
the coordinates of $s$ with respect to $(p,q,r)$ are $(x,y,z)$, then, since
$\langle s | P_p s \rangle = |P_p s|^2 = x^2$ (and similarly for $q$ and $r$), we obtain
$$\langle As | s \rangle = Mx^2 + \alpha y^2 + mz^2,$$
as required.
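The construction can be illustrated numerically. The sketch below assumes that the frame (p,q,r) is the standard basis, so that the operator built from the three projectors is diagonal, and it uses illustrative values for the three coefficients (the middle one is written `ALPHA`). It evaluates the resulting quadratic function on two different frames and checks that the three values always sum to the same weight, which is the defining property of a frame function.

```python
import math

# Illustrative coefficients only: maximum, middle value, minimum.
M, ALPHA, m = 0.7, 0.2, 0.1


def f(s):
    """Quadratic function <As|s> for A diagonal in the (p, q, r) frame."""
    x, y, z = s
    return M * x * x + ALPHA * y * y + m * z * z


def weight(frame):
    """Sum of f over the three mutually orthogonal unit vectors of a frame."""
    return sum(f(s) for s in frame)


frame_1 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

a, b, c = 1 / math.sqrt(3), 1 / math.sqrt(2), 1 / math.sqrt(6)
frame_2 = [(a, a, a), (b, -b, 0.0), (c, c, -2 * c)]  # another orthonormal triple

print(weight(frame_1), weight(frame_2), M + ALPHA + m)
```

Both frames give the weight M + ALPHA + m, as they must, since the weight is the trace of the operator and does not depend on the frame chosen.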
The implicit quantifications in the proof of "warm-up" theorem I may
give trouble. Throughout this theorem we are considering a fixed (although
arbitrary) $f$ with the property $f(a) + f(b) + f(c) = w$ for all triples $(a,b,c)$ such
that $a + b + c = 1$. We take an arbitrary $a$ and obtain $f(a) = w - f(1 - a)$, by
considering $a$ as part of the triple $(a, 1 - a, 0)$. This holds for all $a \in [0,1]$, and
is applied to $(a + b)$ to obtain
tion, $a_0 \in [0,1] \setminus C$),
From (1) and (3) $f$ is bounded from above, and, using (2), we obtain
But now assume that, for some $a_0$, $f(a_0) \ne a_0$. Then for a sequence $r_0, r_1,
r_2, \ldots$ such that $r_j a_0 \to 1$, we have
Commentary on §4
Figure Al.4 illustrates the two lemmata of this section.
Commentary on §5
The geometric lemma, together with the basic lemma of §4, shows that,
given premises (1) and (2) of the basic lemma,
The geometric lemma itself shows that from any point s in N one can reach
another of lower latitude via a sequence of descents, starting with the
[Figure A1.4]
n-polygon starting from $s$ and moving in the direction of $t$ will intersect the
line $pt$: call the point of intersection $t_{(n)}$. The next step of the CKM proof
shows that, by making $n$ big enough, we can obtain $l(t_{(n)}) > l(t)$. The required
sequence of descents takes us round this n-polygon from $s$ to $t_{(n)}$ and then,
using the maneuver of Stage (1), from $t_{(n)}$ to $t$.
Three points in the theorem deserve comment.
Whence $l' \le \underline{f}(l') \le \bar{f}(l') \le l'$, and so $\bar{f}(l') - \underline{f}(l') = 0$, contra hypothesis.
Commentary on §6
At this stage it is useful to compare the theorem of §5 with the statement of
Gleason's theorem in §3. The premises of the §5 theorem are stronger than
the premises of Gleason's theorem: they impose both (1) an extreme-value
requirement and (2) a symmetry condition. The extreme-value requirement
not only requires that $f$ be bounded ($\sup f(s) = M$, $\inf f(s) = m$), but also that
there exist a point $p$ on $S$ such that $f(p) = M$ and a (set of) points for which
$f(s) = m$. Symmetry requires that for all $s \perp p$ (that is, for $s \in E$), $f(s) = m$.
§6 shows that the extreme-value requirement holds for any bounded
frame function $f$; it also introduces a technique for symmetrization which is
used again in §7. Note, however, that in general $f$ does not satisfy the
symmetry condition.
The conclusion of the §5 theorem is a special case of the conclusion of the
§3 version of Gleason's theorem; if the function $f$ is expressed in terms of
coordinates $(x,y,z)$ with respect to the frame $(p,q,r)$ $(q, r \in E)$, it appears as:
product topology is important, since there are other topologies under which
the space $[2m, 2M]^S$ is not compact (see Kelley, 1955, pp. 217-218).
Step 4. The existence of this function h allows the final move to be made in
the chain of inequalities that yields the desired result.
Commentary on §7
We now know that any bounded frame function $f$ satisfies the extreme-value
requirement, and we have a technique for using $f$ to define a function
$f_{\mathrm{sym}}$ which fulfills the symmetry condition of the §5 theorem: we write
$f_{\mathrm{sym}}(s) = f(s) + f(\hat{p}s)$ (§6, step 2). By the §5 theorem we also know the form of
$f_{\mathrm{sym}}$ [§7, Equation (*)].
In §7 $f$ is compared with a quadratic frame function $g$ which has the same
extreme values, $M$ and $m$, and the same weight, $M + m + \alpha$, as does $f$. We
see first that $g_{\mathrm{sym}} = f_{\mathrm{sym}}$, and then that $g(s) = f(s)$ for points on selected great
circles [claims (a) and (b): see Figure A1.6]. Lastly, the function $h = g - f$ is
shown to be zero, not merely for points on these great circles, but over all $S$.
Hence any bounded frame function is a quadratic frame function, and
Gleason's theorem is proved.
Further notes. Step (iii) in showing that $g - f = 0$ is an elegant move
whereby claim (a) is made on behalf of $h$ and the quadratic frame function
$M'(x'^2 - z'^2)$; this quadratic frame function is constructed from the extreme
values of $h$ as was $g$ from the extreme values of $f$.
APPENDIX B
In Section 8.2 it was shown that, if subspaces $P$ and $Q$ are compatible, then
the Lüders rule yields classical conditionalization; in other words, if we
conditionalize according to the rule, then
$$\mathbb{P}(P \mid Q) = \frac{p(P \cap Q)}{p(Q)}$$
The Lüders rule thus renormalizes the probabilities assigned to all $P \subseteq Q$, so
that $q(Q) = 1$. We now show that the rule specifies the only GPF on $S(\mathcal{H})$
which does this.
In this proof, $Q$ denotes both a subspace and the projector onto it, and $Q^{\perp}$
denotes both the orthocomplement of $Q$ and the corresponding projector.
We write $\bar{v}$ for the ray containing a normalized vector $v$.
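Note first that, for any density operator $D$, the operator $QDQ$ is a trace-class
operator [see (5.6)], and hence the operator
$$\frac{QDQ}{\mathrm{Tr}(QDQ)}$$
$$\begin{aligned}
q(\bar{v}) &= \langle v | D_q v \rangle \\
&= \langle Qv + Q^{\perp}v | D_q(Qv + Q^{\perp}v) \rangle \\
&= \langle Qv + Q^{\perp}v | D_q Qv \rangle && \text{[(3), above]} \\
&= \langle D_q(Qv + Q^{\perp}v) | Qv \rangle && \text{[Hermiticity]} \\
&= \langle D_q Qv | Qv \rangle && \text{[(3), above]} \\
&= \langle Qv | D_q Qv \rangle && \text{[Hermiticity]}
\end{aligned}$$
We see that any GPF $q$ on $S(\mathcal{H})$ such that $q(Q) = 1$ is completely specified by
the values which it assigns to the rays within $Q$. Hence, given any GPF $p$,
there is a unique GPF $q$ on $S(\mathcal{H})$ such that, for all $P \subseteq Q$,
$$q(P) = \frac{p(P)}{p(Q)}$$
$$\frac{QDQ}{\mathrm{Tr}(QDQ)}$$
$$D_q = \frac{QDQ}{\mathrm{Tr}(QDQ)}$$
Thus the Lüders rule gives the unique GPF $q$ with the property that, for all
$P \subseteq Q$,
$$q(P) = \frac{p(P)}{p(Q)}$$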
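The renormalization property can be checked on a small example. The following sketch works in a real three-dimensional space, forms the conditioned operator QDQ/Tr(QDQ), and compares the probability it assigns to rays inside Q with the ratio p(P)/p(Q). The particular density operator, the choice of Q as the plane spanned by the first two basis vectors, and all helper names are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def projector(v):
    """Projector onto the ray containing the (not necessarily normalized) vector v."""
    norm2 = sum(c * c for c in v)
    return [[a * b / norm2 for b in v] for a in v]


def prob(D, P):
    """Probability Tr(DP) assigned to the subspace with projector P."""
    return trace(matmul(D, P))


def luders(D, Q):
    """Conditioned operator QDQ / Tr(QDQ)."""
    QDQ = matmul(matmul(Q, D), Q)
    t = trace(QDQ)
    return [[entry / t for entry in row] for row in QDQ]


# Illustrative density operator: an equal mixture of a pure state and a diagonal state.
pure = projector((1.0, 1.0, 1.0))
diag = [[0.5, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, 0.0, 0.2]]
D = [[0.5 * pure[i][j] + 0.5 * diag[i][j] for j in range(3)] for i in range(3)]

Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]  # plane spanned by e1, e2
D_q = luders(D, Q)

for ray in [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, -1.0, 0.0)]:
    P = projector(ray)  # a ray lying inside Q
    print(prob(D_q, P), prob(D, P) / prob(D, Q))
```

For each ray inside Q the two printed numbers coincide, and the conditioned operator assigns probability 1 to Q itself.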
APPENDIX C
At the end of Section 8.8, it was shown that, when an electron-positron pair
is prepared in the singlet spin state, Lüders-rule conditionalization on the
event $(S_\alpha, +)$, associated with the positron, projects the state of the electron
into the pure state p~_; further, that this state indeed yields the quantum
theoretic probabilities for measurements of spin on the electron, given that a
measurement of $S_\alpha$ on the positron has yielded the result +.
Here I generalize this result, by taking a coupled system in an arbitrary
initial state $D$ and looking at the effect of conditionalizing on an event
associated with one of its components. I use the notation of the last part of
Section 8.2, and the proof is an extension of the one which appears there.
Consider a coupled system with components $a$ and $b$, whose states are
representable in a Hilbert space $\mathcal{H}^a \otimes \mathcal{H}^b$. Assume that a measurement of $A^b$
is conducted on system $b$, and let $A^a$ be an observable associated with system
$a$. We can then form a classical probability space partitioned by the conjunctions
$(P^a \cdot P^b)$ of $A^a$-events and $A^b$-events. Since this space is classical, conditionalization
on the event $P^b$ (the result of the measurement of $A^b$) will yield
conditional probabilities for the $A^a$-events given by the classical rule:
But $A^a$ was an arbitrarily chosen $a$-observable, and so this rule holds for all
$a$-events $P^a$.
Note that the probabilities appearing in the expression on the right of this
equation are given by the statistical algorithm of quantum mechanics, if we
know the initial state of the composite system. For if this system has been
prepared in a quantum state $D$, which reduces to states $D^a$ and $D^b$ of the
components, we then have
$$p(P^a \cdot P^b) = \mathrm{Tr}[D(P^a \otimes P^b)]$$
$$p(P^b) = \mathrm{Tr}[D(I^a \otimes P^b)] = \mathrm{Tr}(D^b P^b) \qquad [(5.27)]$$
for an arbitrary $a$-event $P^a$; in other words, that the conditional probabilities
for all $a$-events are as though $P^b$ projects the state of system $a$ to D~.
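As a concrete instance, the two trace formulas can be evaluated for the singlet state mentioned at the start of this appendix. The sketch below takes the usual classical conditional probability, the ratio of the joint probability to the probability of the conditioning event, and computes it from the traces. The basis ordering (++, +−, −+, −−, with system a listed first), the choice of spin-up on b as the conditioning event, and all helper names are assumptions made for illustration.

```python
import math


def kron(A, B):
    """Tensor (Kronecker) product of two matrices given as lists of rows."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


# Singlet vector (|+-> - |-+>)/sqrt(2) in the assumed basis ordering ++, +-, -+, --.
psi = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
D = [[u * v for v in psi] for u in psi]  # pure-state density operator

I2 = [[1.0, 0.0], [0.0, 1.0]]
P_plus = [[1.0, 0.0], [0.0, 0.0]]   # spin-up projector on a single component
P_minus = [[0.0, 0.0], [0.0, 1.0]]  # spin-down projector on a single component

p_b_plus = trace(matmul(D, kron(I2, P_plus)))            # p(P^b)
p_joint_minus = trace(matmul(D, kron(P_minus, P_plus)))  # p(P^a . P^b), a down
p_joint_plus = trace(matmul(D, kron(P_plus, P_plus)))    # p(P^a . P^b), a up

cond_minus = p_joint_minus / p_b_plus
cond_plus = p_joint_plus / p_b_plus
print(p_b_plus, cond_minus, cond_plus)
```

Conditional on spin-up for b, the a-event spin-down receives probability 1 and spin-up receives probability 0, which is what one expects if the state of a has been projected to the spin-down pure state.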
Accardi, L., and A. Fedullo, 1982. "On the Statistical Meaning of Complex Numbers
in Quantum Mechanics." Lettere al Nuovo Cimento 34:161-172 .
Aristotle. 1984. The Complete Works of Aristotle, 2 vols. Ed. J. Barnes. Princeton, N.J.:
Princeton University Press.
Aspect, A. 1976. "Proposed Experiment to Test the Non-Separability of Quantum
Mechanics." Physical Review D 14:1944-1951. Reprinted in Wheeler and
Zurek (1983), pp. 435-442.
Asquith, P. D., and R. N. Giere, eds. 1980. PSA 1980, vol. 1. East Lansing, Mich.:
Philosophy of Science Association.
- - - 1981. PSA 1980, vol. 2. East Lansing, Mich.: Philosophy of Science Association.
Asquith, P. D., and T. Nickles, eds. 1982. PSA 1982, vol. 1. East Lansing, Mich.:
Philosophy of Science Association.
Ballentine, L. E. 1970. "The Statistical Interpretation of Quantum Mechanics." Re-
views of Modern Physics 42:358-381.
- - - 1972. " Einstein's Interpretation of Quantum Mechanics." American Journal
of Physics 40:1763-1771.
Belinfante, F. J. 1973. A Survey of Hidden Variable Theories. Oxford: Pergamon Press.
Bell, J. L., and A. D. Slomson. 1969. Models and Ultraproducts: An Introduction.
Amsterdam: North Holland.
Bell, J. S. 1964. "On the Einstein-Podolsky-Rosen Paradox." Physics 1:195-200.
Reprinted in Wheeler and Zurek (1983), pp. 403-408.
- - - 1966. "On the Problem of Hidden Variables in Quantum Mechanics."
Reviews of Modern Physics 38:447-452.
Beltrametti, E. G., and G. Cassinelli. 1981. The Logic of Quantum Mechanics. Reading,
Mass.: Addison Wesley.
Beltrametti, E. G., and B. C. van Fraassen, eds. 1981. Current Issues in Quantum Logic.
New York: Plenum Press.
Bigelow, J. C. 1976. "Possible Worlds Foundations for Probability." Journal of
Philosophical Logic 5:299-320.
Birkhoff, G., and J. von Neumann. 1936. "The Logic of Quantum Mechanics."
Annals of Mathematics 37:823-843. Reprinted in Hooker (1975), pp. 1-26.
Blanche, R. 1962. Axiomatics. Trans. G. B. Keene. London: Routledge and Kegan
Paul.
Bohm, D. 1951. Quantum Theory. Englewood Cliffs, N.J.: Prentice Hall.
- - - 1957. Causality and Chance in Modern Physics. London: Routledge and Kegan
Paul.
Bohr, N. 1934. Atomic Theory and the Description of Nature. Cambridge: Cambridge
University Press.
- - - 1935a. "Can Quantum-Mechanical Description of Reality Be Considered
Complete?" Physical Review 48:696 -702. Reprinted in Wheeler and Zurek
(1983), pp. 145-151.
- - - 1935b. "Quantum Mechanics and Physical Reality." Nature 12:65. Re-
printed in Wheeler and Zurek (1983), p. 144.
- - - 1949. "Discussion with Einstein on Epistemological Problems in Atomic
Physics." In Schilpp (1949), pp. 200-241. Reprinted in Wheeler and Zurek
(1983), pp. 9-49.
Bohr, N., H. A. Kramers, and J. C. Slater. 1924. "Über die Quantentheorie der
Strahlung." Zeitschrift für Physik 24:69-87.
Born, M. 1926a. "Zur Quantenmechanik der Stossvorgänge." Zeitschrift für Physik
37:863-867. Trans. in Wheeler and Zurek (1983), pp. 52-55.
- - - 1926b. "Quantenmechanik der Stossvorgänge." Zeitschrift für Physik
38:803-827.
Bub, J. 1968. "The Daneri-Loinger-Prosperi Quantum Theory of Measurement." Il
Nuovo Cimento 57B:503-520.
- - - 1974. The Interpretation of Quantum Mechanics. Dordrecht, Holland: Reidel.
- - - 1975. "Popper's Propensity Interpretation of Probability and Quantum
Mechanics." In Maxwell and Anderson (1975), pp. 416-429.
- - - 1977. "Von Neumann's Projection Postulate as a Possibility Conditionaliza-
tion Rule in Quantum Mechanics." Journal of Philosophical Logic 6:381-390.
- - - 1979. "The Measurement Problem of Quantum Mechanics." Problems in the
Philosophy of Physics (72d Corso). Bologna: Societa Italiana di Fisica.
- - - 1987. "How to Solve the Measurement Problem of Quantum Mechanics."
Paper delivered at the VIIIth International Congress of Logic, Methodology and
Philosophy of Science, in Moscow, 1987. College Park, Md.: University of
Maryland, mimeo.
Busch, P., and P. Lahti. 1984. "On Various Joint Measurements of Position and
Momentum Observables." Physical Review D 29:1634-1646.
- - - 1985. "A Note on Quantum Theory, Complementarity and Uncertainty."
Philosophy of Science 52:64-77.
Carnap, R. 1974. An Introduction to the Philosophy of Science. Ed. M. Gardner. New
York: Basic Books.
Cartwright, N. 1974. "Van Fraassen's Modal Model of Quantum Mechanics." Phi-
losophy of Science 41:199-202.
- - - 1983. How the Laws of Physics Lie. Oxford: Clarendon Press.
Clauser, J. F., M. A. Horne, A. Shimony, and R. A. Holt. 1969. "Proposed Experiment
to Test Hidden Variable Theories." Physical Review Letters 23:880-883.
Clauser, J. F., and A. Shimony. 1978. "Bell's Theorem: Experimental Tests and
Implications." Reports on Progress in Physics 41:1881-1927.
Cohen, R. S., C. A. Hooker, A. C. Michalos, and J. W. van Evra, eds. 1976. PSA 1974.
Boston Studies in the Philosophy of Science, vol. 32. Dordrecht, Holland: Reidel.
Cohen, R. S., and J. J. Stachel, eds. 1979. Selected Papers of Leon Rosenfeld. Dordrecht,
Holland: Reidel.
Cohen, R. S., and M. W. Wartofsky, eds. 1969. Boston Studies in the Philosophy of
Science, vol. 5. Dordrecht, Holland: Reidel.
- - - 1974. Logical and Epistemological Studies in Contemporary Physics. Boston
Studies in the Philosophy of Science, vol. 13. Dordrecht, Holland: Reidel.
Colodny, R. A., ed. 1965. Beyond the Edge of Certainty. Englewood Cliffs, N.J.:
Prentice Hall.
- - - 1972. Paradigms and Paradoxes: The Philosophical Challenge of the Quantum
Domain. Pittsburgh: University of Pittsburgh Press.
Cooke, R. M., and J. Hilgevoord. 1981. "A New Approach to Equivalence in Quantum
Logic." In Beltrametti and van Fraassen (1981), pp. 101-113.
Cooke, R., M. Keane, and W. Moran. 1985. "An Elementary Proof of Gleason's
Theorem." Mathematical Proceedings of the Cambridge Philosophical Society
98:117-128.
Cushing, J. T., C. F. Delaney, and G. Gutting, eds. 1984. Science and Reality: Recent
Work in the Philosophy of Science. Notre Dame, Ind.: University of Notre Dame
Press.
Cushing, J. T., and E. McMullin, eds. 1989. Philosophical Consequences of Quantum
Theory. Notre Dame, Ind.: University of Notre Dame Press.
Dalla Chiara, M. L. 1977. "Quantum Logic and Physical Modalities." Journal of
Philosophical Logic 6:391-404.
- - - 1986. "Quantum Logic." In Gabbay and Guenther (1986), vol. 3, pp. 427-469.
Daneri, A., A. Loinger, and G. M. Prosperi. 1962. "Quantum Theory of Measure-
ment and Ergodicity Conditions." Nuclear Physics 33:297 -319. Reprinted in
Wheeler and Zurek (1983), pp. 657-679.
Davies, E. B. 1976. Quantum Theory and Open Systems. London: Academic Press.
Davies, P. C. W. 1984. Quantum Mechanics. London: Routledge and Kegan Paul.
de Boer, J., E. Dal, and O. Ulfbeck, eds. 1986. The Lesson of Quantum Theory: Niels
Bohr Centennial Symposium, 1985. Amsterdam: North Holland.
Demopoulos, W. 1976. "What Is the Logical Interpretation of Quantum Mechanics?"
In Cohen et al. (1976), pp. 721-728.
D'Espagnat, B. 1979. "The Quantum Theory and Reality." Scientific American
241:158-180.
de Witt, B. S. 1970. "Quantum Mechanics and Reality." Physics Today 23:30-35.
Reprinted in de Witt and Graham (1973), pp. 155-165.
de Witt, B. S., and N. Graham, eds. 1973. The Many Worlds Interpretation of Quantum
Mechanics. Princeton, N.J.: Princeton University Press.
Dirac, P. A. M. [1930]1967. The Principles of Quantum Mechanics, 4th ed., rev.
Oxford: Clarendon Press.
Duhem, P. [1906]1962. The Aim and Structure of Physical Theory. Trans. P. P. Wiener.
New York: Atheneum.
Earman, J. 1986. A Primer on Determinism. Dordrecht, Holland: Reidel.
Eberhard, P. H. 1977. "Bell's Theorem without Hidden Variables." Il Nuovo Cimento
38B (1):75-79.
Eco, U. 1979. The Role of the Reader. Bloomington, Ind.: Indiana University Press.
Eddington, A S. 1935a. " The Theory of Groups." In Eddington (1935b). Reprinted
in Newman (1956), vol. 3, pp. 1558-1573.
---1935b. New Pathways in Science. Cambridge: Cambridge University Press.
Edwards, P., ed. 1967. The Encyclopedia of Philosophy, 8 vols. New York: Macmillan.
Ehrenfest, P. 1959. Collected Scientific Papers. Ed. M. Klein. Amsterdam: North
Holland.
Einstein, A. 1948. "Quantenmechanik und Wirklichkeit." Dialectica 2:320-324.
Einstein, A., and P. Ehrenfest. 1922. "Quantentheoretische Bemerkungen zum
Experiment von Stern und Gerlach." Zeitschrift für Physik 11:31-34. Reprinted in
Ehrenfest (1959), pp. 452-455.
Einstein, A., B. Podolsky, and N. Rosen. 1935. "Can Quantum Mechanical Description
of Physical Reality Be Considered Complete?" Physical Review 47:777-780.
Reprinted in Wheeler and Zurek (1983), pp. 138-141.
Everett, H., III. 1957. "'Relative State' Formulation of Quantum Mechanics."
Reviews of Modern Physics 29:454-462. Reprinted in de Witt and Graham (1973),
pp. 141-149, and in Wheeler and Zurek (1983), pp. 315-323.
- - - 1973. " The Theory of the Universal Wave Function." In de Witt and Gra-
ham (1973), pp. 3-140.
Fano, G. 1971. Mathematical Methods of Quantum Mechanics. New York: McGraw
Hill.
Fano, U. 1957. "Description of States in Quantum Mechanics by Density Matrix and
Operator Techniques." Reviews of Modern Physics 29:74-93.
Feyerabend, P. K. 1962. "On the Quantum Theory of Measurement." In Korner
(1962), pp. 121-130.
- - - 1975. Against Method. London: New Left Books.
Feynman, R. P. 1965. The Character of Physical Law. Cambridge, Mass.: M.I.T.
Press.
Feynman, R. P., R. B. Leighton, and M. Sands. 1965. The Feynman Lectures on
Physics, 3 vols. Reading, Mass.: Addison Wesley.
Finch, P. D. 1969. "On the Structure of Quantum Logic." Journal of Symbolic Logic
34:275-282. Reprinted in Hooker (1975), pp. 415-425.
Fine, A. 1970. "Insolubility of the Quantum Measurement Problem." Physical
Review D 2:2783-2787.
- - - 1972. "Some Conceptual Problems of Quantum Theory." In Colodny
(1972), pp. 3-31.
- - - 1979. "How to Count Frequencies: A Primer for Quantum Realists." Synthese
42:145-154.
- - - 1984. "Einstein's Realism." In Cushing, Delaney, and Gutting (1984), pp.
106-133.
Finkelstein, D. 1969. "Matter, Space and Logic." In Cohen and Wartofsky (1969),
pp. 199-215.
French, A. P., ed. 1979. Einstein, A Centenary Volume. Cambridge, Mass.: Harvard
University Press.
Friedman, M., and C. Glymour. 1972. " If Quanta Had Logic." Journal of Philosophi-
cal Logic 1:16-28.
Friedman, M., and H . Putnam. 1978. "Quantum Logic, Conditional Probability and
Interference." Dialectica 32:305-315.
Gabbay, D., and F. Guenther. 1986. Handbook of Philosophical Logic, 4 vols. Dor-
drecht, Holland: Reidel.
Geroch, R. 1984. "The Everett Interpretation." Nous 18:617-633.
Gibbins, P. 1981a. "A Note on Quantum Logic and the Uncertainty Principle."
Philosophy of Science 48:122-126.
- - - 1981b. "Putnam on the Two-Slit Experiment." Erkenntnis 16:235-241.
- - - 1987. Particles and Paradoxes. Cambridge: Cambridge University Press.
Giere, R. N. 1973. "Objective Single-Case Probabilities and the Foundations of
Statistics." In Suppes et al. (1973), pp. 467-483.
- - - 1976. "A Laplacean Formal Semantics for Single-Case Propensities." Journal
of Philosophical Logic 5:321-353.
- - - 1979. Understanding Scientific Reasoning. New York: Holt, Rinehart, Win-
ston.
Gillespie, D. T. 1970. A Quantum Mechanics Primer. Leighton Buzzard, Beds.: Inter-
national Textbook Company.
Gleason, A. M. 1957. " Measures on the Closed Subspaces of a Hilbert Space."
Journal of Mathematics and Mechanics 6:885-893.
Godel, K. 1933. " An Interpretation of the Intuitionistic Sentential Logic." Trans. J.
Hintikka and L. Rossi. In Hintikka (1969), pp. 128-129.
Goldstein, H. 1950. Classical Mechanics. Reading, Mass.: Addison Wesley.
Good, I. J., ed. 1961. The Scientist Speculates. London: Heinemann.
Gudder, S. P. 1970. "On Hidden Variable Theories." Journal of Mathematical Physics
11:431-436.
- - - 1972. "Partial Algebraic Structures Associated with Orthomodular Posets. "
Pacific Journal of Mathematics 41:712-730.
- - - 1973. "Quantum Logics, Physical Space, Position Observables and Sym-
metry." Reports on Mathematical Physics 4:193-202.
- - - 1976. "A Generalised Measure and Probability Theory for the Physical
Sciences." In Harper and Hooker (1976), pp. 121-141.
Haag, R. 1973. Boulder Lectures in Theoretical Physics, vol. 14B. Ed. W. E. Britten.
New York: Gordon and Breach.
Hacking, I. 1983. Representing and Intervening: Introductory Topics in the Philosophy
of Science. Cambridge: Cambridge University Press.
Halmos, P. R. 1957. Introduction to Hilbert Space and the Theory of Spectral
Multiplicity, 2d ed. New York: Chelsea.
Hanson, N. R. 1967. "Quantum Mechanics, Philosophical Implications of." In Ed-
wards (1967), vol. 7, pp. 41-49.
Hardegree, G. M. 1980. "Micro-States in the Interpretation of Quantum Theory." In
Asquith and Giere (1980), pp. 43-54.
Hardegree, G. M., and P. J. Frazer. 1981. "Charting the Labyrinth of Quantum
Logics: A Progress Report." In Beltrametti and van Fraassen (1981), pp. 53-76.
Harper, W. L., and C. A Hooker, eds. 1976. Foundations and Philosophy of Statistical
Theories in the Physical Sciences. Dordrecht, Holland: Reidel.
Harrison, J. 1983. "Against Quantum Logic." Analysis 43:82-85.
Healey, R. 1979. "Quantum Realism: Naïveté Is No Excuse." Synthese 42:121-144.
- - - 1984. "How Many Worlds?" Nous 18:591-616.
Heisenberg, W. 1927. "Über den anschaulichen Inhalt der quantentheoretischen
Kinematik und Mechanik." Zeitschrift für Physik 43:172-198. Trans. as "The
Physical Content of Quantum Kinematics and Mechanics," in Wheeler and
Zurek (1983), pp. 62-84.
- - - 1958. Physics and Philosophy: The Revolution in Modern Science. New York:
Harper and Row.
Heitler, W. 1949. "The Departure from Classical Thought in Modern Physics." In
Schilpp (1949), pp. 181-198.
Hellman, G. 1982a. "Einstein and Bell: Tightening the Case for Microphysical Ran-
domness." Synthese 53:445-460.
- - - 1982b. "Stochastic Einstein Locality and the Bell Theorems." Synthese
53:461-504.
- - 1984. "Introduction." Nous 18:557 -567.
Hempel, C. G. 1954. "A Logical Appraisal of Operationism." Scientific Monthly
79:215-220. Reprinted in Hempel (1965), pp. 123-133.
- - 1965. Aspects of Scientific Explanation and Other Essays in the Philosophy of
Science. New York: Free Press.
Hintikka, J., ed. 1969. The Philosophy of Mathematics. Oxford: Oxford University
Press.
Hirst, R. J. 1967. "Phenomenalism." In Edwards (1967), vol. 6, pp. 130-135.
Holdsworth, D. G., and C. A Hooker. 1983. "A Critical Survey of Quantum Logic."
In Logic in the 20th Century. Scientia 1983:127-246.
Holland, S. S., Jr. 1970. "The Current Interest in Orthomodular Lattices." In Trends in
Lattice Theory. New York: van Nostrand. Reprinted in Hooker (1975), pp. 437 -
496.
Hooker, C. A 1972. "The Nature of Quantum Mechanical Reality: Einstein versus
Bohr." In Colodny (1972), pp. 67-302.
Hooker, C. A, ed. 1973. Contemporary Research in the Foundations and Philosophy of
Quantum Theory. Dordrecht, Holland: Reidel.
- - - 1975. The Logico-Algebraic Approach to Quantum Mechanics, vol. 1: Historical
Evolution. Dordrecht, Holland: Reidel.
Hughes, G. E., and M. J. Cresswell. 1968. An Introduction to Modal Logic. London:
Methuen.
Hughes, R. I. G. 1979. Systems of Quantum Logic. Ph.D. diss. Vancouver: University
of British Columbia.
- - - 1981. "Quantum Logic." Scientific American 243:202-213.
- - - 1982. "The Logic of Experimental Questions." In Asquith and Nickles
(1982), pp. 243-256.
- - - 1985a. "Logics Based on Partial Boolean Algebras" [Review Article]. Journal
of Symbolic Logic 50:558-566.
- - - 1985b. "Semantic Alternatives in Partial Boolean Quantum Logic." Journal
of Philosophical Logic 14:411-446.
Hughes, R. I. G., and B. C. van Fraassen. 1988. "Can the Measurement Problem Be
Solved by Superselection Rules?" Forthcoming.
Jammer, M. 1966. The Conceptual Development of Quantum Mechanics. New York:
McGraw Hill.
- - - 1974. The Philosophy of Quantum Mechanics. The Interpretation of Quantum
Mechanics in Historical Perspective. New York: John Wiley.
Jarrett, J. P. 1984. "On the Physical Significance of the Locality Conditions in the Bell
Arguments." Nous 18:569-589.
- - - 1989. "Bell's Theorem: A Guide to the Implications." In Cushing and
McMullin (1989).
Jauch, J. M. 1968. Foundations of Quantum Mechanics. Reading, Mass.: Addison
Wesley.
Jauch, J. M., and C. Piron. 1963. "Can Hidden Variables Be Excluded in Quantum
Mechanics?" Helvetica Physica Acta 38:827-837.
Jeffrey, R. C., ed. 1980. Studies in Inductive Logic and Probability, vol. 2. Berkeley,
Calif.: University of California Press.
Jordan, T. F. 1969. Linear Operators for Quantum Mechanics. New York: John Wiley.
Kadison, R. 1951. "Isometries of Operator Algebras." Annals of Mathematics
54:325-338.
Kant, I. [1787]1929. Critique of Pure Reason. Trans. N. Kemp Smith. New York: St.
Martin's Press.
- - - [1786]1970. Metaphysical Foundations of Natural Science. Trans. J. Ellington.
Indianapolis: Bobbs Merrill.
Kelley, J. L. 1955. General Topology. New York: van Nostrand.
Kleene, S. C. 1967. Mathematical Logic. New York: John Wiley.
Kochen, S. 1978. "The Interpretation of Quantum Mechanics." Address to the
Biennial Conference of the Philosophy of Science Association, 1978. Princeton,
N.J.: Princeton University, mimeo.
Kochen, S., and E. P. Specker. 1965. "Logical Structures Arising in Quantum
Theory." Symposium on the Theory of Models. Amsterdam: North Holland.
Reprinted in Hooker (1975), pp. 263-276.
- - - 1967. "The Problem of Hidden Variables in Quantum Mechanics." Journal
of Mathematics and Mechanics 17:59-87.
Margenau, H. 1963. "Measurements in Quantum Mechanics." Annals of Physics
23:469-485.
Maxwell, G., and R. M. Anderson, Jr., eds. 1975. Induction, Probability and Confirmation.
Minnesota Studies in Philosophy of Science, vol. 6. Minneapolis: University
of Minnesota Press.
McMullin, E. 1978. "Structural Explanation." American Philosophical Quarterly
15:139-147.
Messiah, A. 1958. Quantum Mechanics, 2 vols. Vol. 1 trans. G. M. Tenner; vol. 2
trans. J. Potter. New York: John Wiley.
Mielnik, B. 1968. "Geometry of Quantum States." Communications in Mathematical
Physics 9:55-80.
Mittelstaedt, P. 1981. "Classification of Different Areas of Work Afferent to Quantum
Logic." In Beltrametti and van Fraassen (1981), pp. 3-16.
Monk, J. D. 1969. Introduction to Set Theory. New York: McGraw Hill.
Morgenbesser, S., ed. 1967. Philosophy of Science Today. New York: Basic Books.
Mott, N. F., and H. S. W. Massey. 1965. The Theory of Atomic Collisions. Oxford:
Clarendon Press. Reprinted in part in Wheeler and Zurek (1983), pp.
701-706.
Nagel, E., P. Suppes, and A. Tarski, eds. 1962. Logic, Methodology and Philosophy of
Science. Stanford: Stanford University Press.
Newman, J. R., ed. 1956. The World of Mathematics, 4 vols. New York: Simon and
Schuster.
Noakes, G. R. 1957. New Intermediate Physics. London: Macmillan.
Pagels, H. R. 1982. The Cosmic Code: Quantum Physics as the Language of Nature. New
York: Simon and Schuster.
Park, J. L., and H. Margenau. 1968. "Simultaneous Measurability in Quantum
Theory." International Journal of Theoretical Physics 1:211-283.
- - - 1971. "The Logic of Noncommutability of Quantum Mechanical Operators
and Its Empirical Consequences." In Yourgrau and van der Merwe (1971), pp.
37-70.
Pauli, W. 1933. "Die allgemeinen Prinzipien der Wellenmechanik." Handbuch der
Physik (ed. H. Geiger and K. Scheel), 2d ed., vol. 24, pp. 83-272. Berlin:
Springer Verlag.
Penrose, R., and C. J. Isham, eds. 1986. Quantum Concepts in Space and Time. Oxford:
Clarendon Press.
Petersen, A. 1963. "The Philosophy of Niels Bohr." Bulletin of the Atomic Scientists,
September 1963, pp. 8-14.
Piron, C. 1972. "Survey of General Quantum Physics." Foundations of Physics
2:287-314.
- - - 1976. Foundations of Quantum Physics. Reading, Mass.: Benjamin.
Popper, K. R. 1959. The Logic of Scientific Discovery. London: Hutchinson.
- - - 1982. Quantum Theory and the Schism in Physics. Totowa, N.J.: Rowman and
Littlefield.
PSSC (Physical Sciences Study Committee). 1960. Physics. New York: Heath.
Putnam, H. 1962. "What Theories Are Not." In Nagel, Suppes, and Tarski (1962),
pp. 240-251.
- - - 1965. "A Philosopher Looks at Quantum Mechanics." In Colodny (1965),
pp. 75-101.
- - - 1969. "Is Logic Empirical?" In Cohen and Wartofsky (1969), pp. 181-241.
Reprinted in Hooker (1975), pp. 181-206.
Reichenbach, H. 1944. Philosophic Foundations of Quantum Mechanics. Berkeley,
Calif.: University of California Press.
- - - 1956. The Direction of Time. Berkeley, Calif.: University of California Press.
Robertson, H. P. 1929. "The Uncertainty Principle." Physical Review 34:163-164.
Reprinted in Wheeler and Zurek (1983), pp. 127-128.
Rosenfeld, L. 1971. "Quantum Theory in 1929." In Cohen and Stachel (1979); see
also Wheeler and Zurek (1983), pp. 699-700.
Russell, B. 1917. Mysticism and Logic. London: Allen and Unwin.
Salmon, W. C. 1984. Scientific Explanation and the Causal Structure of the World.
Princeton, N.J.: Princeton University Press.
Schilpp, P. A., ed. 1949. Albert Einstein: Philosopher-Scientist. La Salle, Ill.: Open
Court.
Schrödinger, E. 1935. "Die gegenwärtige Situation in der Quantenmechanik." Naturwissenschaften
22:807-812, 823-828, 844-849. Trans. as "The Present Situation
in Quantum Mechanics" by J. D. Trimmer, in Wheeler and Zurek (1983),
pp. 152-167.
- - - 1953. "What Is Matter?" Scientific American, September 1953, pp. 52-56.
Shimony, A. 1980. "The Point We Have Reached." Epistemological Letters, June
1980.
- - - 1981. "Critique of the Papers of Fine and Suppes." In Asquith and Giere
(1981), pp. 572-580.
- - - 1986. "Events and Processes in the Quantum World." In Penrose and Isham
(1986), pp. 182-203.
Sikorsky, R. 1964. Boolean Algebras, 2d ed. Berlin: Springer Verlag.
Simon, B. 1976. "Quantum Dynamics: From Automorphism to Hamiltonian." In
Lieb, Simon, and Wightman (1976), pp. 327-349.
Skyrms, B. 1980. Causal Necessity: A Pragmatic Investigation of the Necessity of Laws.
New Haven, Conn.: Yale University Press.
Stairs, A. 1982. "Quantum Logic and the Lüders Rule." Philosophy of Science
49:422-436.
- - - 1983a. "On the Logic of Pairs of Quantum Systems." Synthese 56:47-60.
- - - 1983b. "Quantum Logic, Realism and Value Definiteness." Philosophy of
Science 50:578-602.
- - - 1984. "Sailing into the Charybdis: van Fraassen on Bell's Theorem." Synthese
61:351-359.
Staniland, H. 1972. Universals. New York: Anchor Books.
Stein, H. 1972. "On the Conceptual Structure of Quantum Mechanics." In Colodny
(1972), pp. 367-438.
Suppe, F. 1977. "The Search for Philosophic Understanding of Scientific Theories."
In The Structure of Scientific Theories, ed. F. Suppe, pp. 3-232. Urbana: University
of Illinois Press.
Suppes, P. 1966. "The Probabilistic Argument for a Nonclassical Logic in Quantum
Mechanics." Philosophy of Science 33:14-21.
- - - 1967. "What Is a Scientific Theory?" In Morgenbesser (1967), pp. 55-67.
Suppes, P., L. Henkin, A. Joja, and G. C. Moisil, eds. 1973. Logic, Methodology and
Philosophy of Science, vol. 4. Amsterdam: North-Holland.
Swift, A. R., and R. Wright. 1980. "Generalized Stern-Gerlach Experiments and the
Observability of Arbitrary Spin Operators." Journal of Mathematical Physics
21(1):77-82.
Taylor, E. F., and J. A. Wheeler. 1963. Space-Time Physics. San Francisco: Freeman.
Teller, P. 1979. "Quantum Mechanics and the Nature of Continuous Physical
Quantities." Journal of Philosophy 76:345-360.
- - - 1983. "The Projection Postulate as a Fortuitous Approximation." Philosophy
of Science 50:413-431.
- - - 1989. "Relativity, Relational Holism, and the Bell Inequalities." In Cushing
and McMullin (1989).
Tierliebhaber, X. 1939. "Katzen und Affen, Affen und Katzen: die Tiere der Philosophen."
Zeitschrift für Philosophische Zoologie 1:1-26.
Toulmin, S. 1953. Philosophy of Science: An Introduction. London: Hutchinson.
van Fraassen, B. C. 1972. "A Formal Approach to the Philosophy of Science." In
Colodny (1972), pp. 303-366.
- - - 1974a. "The Einstein-Podolsky-Rosen Paradox." Synthese 29:291-309.
- - - 1974b. "The Labyrinth of Quantum Logic." In Cohen and Wartofsky
(1974), pp. 72-102. Reprinted in Hooker (1975), pp. 577-607.
- - - 1980. The Scientific Image. Oxford: Clarendon Press.
- - - 1981a. "Assumptions and Interpretations of Quantum Logic." In Beltra-
metti and van Fraassen (1981), pp. 17-31.
- - - 1981b. "A Modal Interpretation of Quantum Mechanics." In Beltrametti
and van Fraassen (1981), pp. 229-258.
- - - 1982. "The Charybdis of Realism: Epistemological Implications of Bell's
Inequality." Synthese 52:25-38.
- - - 1985. "Salmon on Explanation." Contribution to a symposium at the East-
ern Division Meeting of the American Philosophical Association, 1985. Prince-
ton, N.J.: Princeton University, mimeo.
von Neumann, J. [1932] 1955. Mathematical Foundations of Quantum Mechanics.
Trans. R. T. Beyer. Princeton, N.J.: Princeton University Press.
Wan, K.-K. 1980. "Superselection Rules, Quantum Measurement and the
Schrödinger's Cat." Canadian Journal of Physics 58:976-982.
Weyl, H. 1952. Symmetry. Princeton, N.J.: Princeton University Press.
Wheeler, J. A. 1957. "Assessment of Everett's 'Relative State' Formulation of Quan-
tum Theory." Reviews of Modern Physics 29:463-465. Reprinted in de Witt and
Graham (1973), pp. 151-153.
Wheeler, J. A., and W. H. Zurek, eds. 1983. Quantum Theory and Measurement.
Princeton, N.J.: Princeton University Press.
Wigner, E. P. 1961. "Remarks on the Mind-Body Question." In Good (1961). Reprinted
in Wigner (1967), pp. 171-184.
- - - 1963. "The Problem of Measurement." American Journal of Physics 31:6-15.
Reprinted in Wigner (1967), pp. 153-170, and Wheeler and Zurek (1983), pp.
324-341.
- - - 1967. Symmetries and Reflections. Bloomington, Ind.: Indiana University
Press.
- - - 1970. "On Hidden Variables and Quantum Mechanical Probabilities."
American Journal of Physics 38:1005-1009.
- - - 1973. "Epistemological Perspectives in Quantum Theory." In Hooker
(1973), pp. 369-385.
Yourgrau, W., and A. van der Merwe, eds. 1971. Perspectives in Quantum Theory:
Essays in Honor of Alfred Lande. Cambridge, Mass.: M.I.T. Press.
Index
Absorption, 182, 189 Bohr, N., 214, 216, 217, 228-231, 241, 266,
Accardi, L., and A. Fedullo, 238 296,297,306,310,317
Accumulation point, 338 Bohr, N., H. Kramers, and J. Slater, 277
Algebra: of events, 194-201, 302; of Bohr-Heisenberg interpretation, 214, 216
properties, 178 -182; of propositions, 203 Bohr's theory of atom, 175
Approximation: empirical, 298; fortuitous, Boolean algebras, 178, 182-186, 189;
297, 298; uniform, 298 atomic, 185; finite, 185; σ-algebra, 220
Aristotle, 156 Boolean manifold, 192
Aspect, A., 241, 245, 247, 308 Borel set, 197
Aspect, A., J. Dalibar, and G. Roget, 241 Born, M., 158, 173,232,302
Associativity, 182, 189 Born's rule, 162
Automorphism, 128 Bub, J., 170, 173, 214,216,217,220,22:1,
Available properties, 215 224, 234, 238, 258, 273, 289, 299, 3 1H, :11 '/
Axiomatic view of theories, 80, 256 Buridan's ass, 279
Busch, P., and P. Lahti, 107, 157,260, 2(,\
270
Ballentine, L., 163
Basis, 13; orthonormal, 48 ℂ², 11, 31-36
Belinfante, F., 173 Cartwright, N., 81, 119, 256, 257, 2H'I, '/'/1
Bell, J. L., and A. Slomson, 183, 186 296
Bell, J. S., 170, 172, 173, 174, 175, 238, 240, Cassinelli, G. See Beltrametti and Cassinelli
322 Categorial framework, 175,217, 2JO, .lUl
Bell's inequality, 238, 241, 242, 243, 244, 305
245, 297, 306 Causal anomalies, 219, 228
Bell's theorem, 237, 238 CHSH inequality, 242
Bell-Wigner inequality, 170-172, 237 CKM. See Cooke, Keane, and Moran
Beltrametti, E., and G. Cassinelli, 132, 139, Classical: horizon, 312-313, 319; logic,
146, 147, 148, 162, 178, 191, 196, 198, 184, 186, 202, 203
200, 235, 265, 270, 285, 286, 287, 288, 289 Classical mechanics, 175, 194; Hamilton-Jacobi
Bigelow, J., 292 formulation of, 57, 72, 75, 17/,
Bilinear form, 337 Clauser, J., 242
Binomial theorem, 94 Clauser, J., and A. Shimony, 17'1, 1~1I , JoI .
Birkhoff, G., and J. von Neumann, 214. See Cohen, R., and J. Stachel, 297
also MacLane and Birkhoff Coherence, 193, 194
Blanché, R., 80 Collapse of wave packet, 227, 272, 111i1 1111
Bohm, D., 109, 159, 160, 170, 237,238 Common cause, 246
Kadison, R., 146 McMullin, E., 256, 266, 269. See also Cushing
Kant, I., 176, 237, 317 and McMullin
Keane, M. See Cooke, Keane, and Moran Maczinski, M., 178, 197, 200
Kelley, J., 338, 346 Magnetic core hypothesis, 3
Kinetic energy, 59 Magnetic moment, 2, 3, 296
Kleene, S., 184 Many-worlds interpretation (MWI),
Klein, F., 128 289 - 294
Kochen and Specker's theorem, 164-168, Mapping, 42
170,173,174,177,206, 213,238,260,322 Margenau, H., 266, 270, 271, 272, 273, 276,
Kochen, S., 176,215,216,217 277, 302
Kolmogorov, A., 218, 220, 306 Massey, H. See Mott and Massey
Kolmogorov event space, 255, 307 Matrix, 17; diagonal elements of, 33;
Kolmogorov probability axioms, 88, 142, 219 representation, 18, 32, 128
Kolmogorov probability function, 219, 222, Maximum element, 188, 191
237 Maxwell's equations, 113, 300
Körner, S., 175 Maxwell's theory, 299, 300
Kramers, H. See Bohr, Kramers, and Slater Meaning incommensurability, 230
Measurement, 299, 302; maximal, 273;
ℓ², 44 problem, 79, 212, 217, 280, 312
L², 44 Medicine Hat, 62
Lahti, P. See Busch and Lahti Meet, 182, 190
Lameti-Rachti, M., and W. Mittig, 297 Messiah, A., 134
Landau, L., 272 Metaphysical nostalgia, 217
Landau, L., and E. Lifschitz, 319 Mielnik, B., 132, 135
Lande, A. See Sommerfeld and Lande Mind-body problem, 294
Laplace, P., 74, 75, 76, 312 Minimum element, 188, 191
Latency, 302-304, 308, 309 Minkowski space-time, 242, 258
Lattices, 189-190; atomic, 204; comple- Mittelstaedt, P., 178
mented, 189; distributive, 189; nondistrib- Mittig, W. See Lameti-Rachti and Mittig
utive, 191; orthocomplemented, 189, 191; Mixture. See States, mixed
orthocomplete, 189; orthomodular, 178, Modal: interpretation, 315, 317, 318; logic,
189, 190, 192; ultrafilter on, 204 206; operator, 206
Least upper bound. See Supremum Models, 79-82
Leggett, A., 226, 241, 297 Monk, J., 321
Leighton, R. See Feynman, Leighton, and Moose Jaw, 62
Sands Mott, N., and H. Massey, 296
Lewis, D., 218, 292, 293 MWI. See Many-worlds interpretation
Locality condition, 161
Local realism, 172,237 National Security Council, 293
Logic: classical, 184, 186, 202, 203; Natural units, 4, 65, 105, 264
deductive, 178; intuitionistic, 207; modal, Negation, 202
206; quantum, 177, 178-217 Newman, J. , 39
Loinger, A. See Daneri, Loinger, and Prosperi Newton's laws of motion, 73
Loux, M., 292 Noakes, G., 298
Lüders rule, 224-226, 235, 236, 248, 249, No-go theorems, 173-175
250, 253, 254, 274, 275, 278, 297, 298, Non-Euclidean geometry, 209
299, 308, 315, 347-348
Ludwig, G., 265 Observables, 59, 63, 69, 82, 97, 98, 155,
197; compatible, 102, 104, 165; Fourier-
Mackey, G., 116, 146, 147, 178, 197, 200, 322 connected, 259; full set of, 168; function-
Mackey-Maczinski theorem, 198 ally dependent, 101, 102; incompatible,
MacKinnon, E., 214 104-107, 111, 157, 159,309; indepen-
MacLane, S., and G. Birkhoff, 39, 40 dent, 100; minimal representation of,
IOil ; IIHII1 ... "III,I\, 10 / , I,", "'"lll1l1y 1'111"""'111 111\11111"1 , " , II!" 2M, 267
transformable, I 0(', 135, 259; position, Popper, K., 163, 220, 26 , 266
64, 66, 107, 264 Poset, 186-190; complemented, 188;
Operationalism, 97, 195, 214, 230 orthocomplemented, 188; orthocomplete,
Operations, 38-40; on field, 40; on group, 189; orthogonality on, 188; orthomodu-
39; on vector space, 41; partial, 192, 194; lar, 189; separable, 200
set-theoretic, 87, 181 Possible worlds, 292
Operators, 14, 97; addition of, 20, 43; Postulate M, 198, 199
bounded, 137, 138; commuting, 15, 103, Precise value principle, 163, 164, 168, 171,
104; continuous, 137; decomposition of, 177, 217
25, 36; density, 138; differential, 43; Preparation, 62, 85, 88, 196
eigenvalues of, 22, 23, 32, 43, 49, 64; ei- Principle theories, 258
genvectors of, 22, 23, 32, 43, 49, 64; Probability: classical space, 219, 306;
Fourier-connected, 107; Fourier-Plan- conditional, 223, 232, 306-307;
cherel, 107; Hamiltonian, 77, 81, 115, epistemic, 144; function, 89, 219; general-
296; Hermitian, 32, 34, 47, 49, 52, 63; ized function (GPF), 222, 308, 347;
idempotent, 47; identity, 18, 48; linear, generalized theory of, 219-222;
17, 42; matrix representation of, 18, 32; Kolmogorov function, 219, 222, 237;
momentum, 107, 264; multiplication of, measure, 89, 142, 146; objective, 218;
18, 32, 43; position, 64, 66, 107, 264; propensity interpretation of, 164, 218;
positive, 136, 147; projection, 15, 19, 23, relative frequency interpretation of, 163,
24, 35, 46-47; reflection, 14, 23; rotation, 218; subjective, 218; transition, 117
14, 23, 78; self-adjoint, 147; statistical, Projection postulate, 271-275, 299, 314, 315
138; symmetric, 24, 33; trace class, 136; Properties, 155, 176, 237, 259, 302, 303; vs.
trace of, 136; unitary, 78; vector, 134, dispositions, 62; emergent, 317. See also
310; zero, 15 Available properties
Orthoalgebra, 196, 199, 220-222, 302; Property-eigenvalue link, 314, 315
associative, 221 Propositions, 182, 184, 202; maximal
Orthogonality, 193, 221; of subspaces, 46; consistent sets of, 202; quantum, 205,
of vectors, 26, 27, 34, 45 208, 259
Orthogonal sum, 221 Prosperi, G. See Daneri, Loinger, and Prosperi
Orthomodular identity, 189, 191 Ptolemaic astronomy, 158
Outcome, 85, 195 Pure states, preservation of, 117. See also
Outcome independence, 243 States, pure
Putnam, H., 80, 205, 209-212, 216, 217,
Pagels, H., 252 234, 289
Parameter independence, 243 Pythagoras' theorem, 83, 84
Park, J., 270
Partial Boolean algebras, 178, 192; Quadratic form, 338
intransitive, 193; transitive, 193, 222 Quantum: conditionalization, 297;
Partial ordering, 185. See also Poset connectives, 203-204; event, 259, 275,
Particle model, 227 314; event interpretation, 276, 278, 296,
Partition, 238 301-306,314; logic, 177, 178-217;
Pauli, W., 276 measurement conditionals, 303, 306;
Pauli spin matrices, 36-38, 65, 67, 131, propositions, 205, 208, 259
135, 139, 165
PBA. See Partial Boolean algebras ℝ², 11, 12-28
Peano's axioms, 80 Ray, 35
Peierls, R., 272 Reichenbach, H., 214, 216, 217, 227, 245,
Petersen, A., 229 246,248
Phase space, 58, 175, 179 Relation: antisymmetric, 185; reflexive, 185;
Piron, C., 178, 199, 206, 322, 330 transitive, 185
Planck, M., 231 Relative frequency principle, 163, 171
Relative state formulation, 291, 311 micro-, 176; mixed, 63, 93, 110, 111, I 6,
Relativity: general theory of, 209; special 141, 143 (see also Ignorance interpretation);
theory of, 240, 253, 304, 305 position, 63; pure, 63, 69, 91, 111,
Reversibility, 116 121, 129, 201; reduction of, 149-151;
Robertson, H., 267 singlet spin, 160, 239, 254; spin, for
Roget, G. See Aspect, Dalibar, and Roget spin-½ particle, 63, 131; statistical, 176,
Rosenfeld, L., 230, 297 215; sufficient set of, 200; value, 215
Ruby laser, 82 Statistical algorithm, 67, 136, 143, 155, 236
Russell, B., 40 Statistical determinism, 116, 146
Statistical interpretation, 162-164, 238
Salmon, W., 240, 245, 246, 255, 256, 258 Stein, H., 45, 55, 116, 195, 221
Sands, M. See Feynman, Leighton, and Sands Stern-Gerlach experiment, 1-8, 37, 113,
Scalars, 40 296, 297
Schopenhauer, A., 80 Stone's theorem: on Boolean algebras, 186,
Schrödinger, E., 44, 230, 232, 279, 280, 319 Stone's theorem: on Boolean algebras, 186,
Schrödinger equation, 77, 78, 81, 113-118, 220; on unitary operators, 114, 117, 146,
145, 146, 201, 235, 278, 280, 298, 313, 315 201
Schrödinger's cat, 279, 290 Structural explanation, 256-258
Semantic view of theories, 175, 257 SU(3), 128
Semigroup, 116, 117 Sublattice, 192
Shimony, A., 243, 245, 246, 253, 304, 311. ble, 102; dimenSionality of, 49; orthogo-
See also Clauser and Shimony nality of, 46
Sikorsky, R., 182 Summhammer, J., 226
Simon, B., 146 Superposition, 143, 236, 303; preservation
Simulacrum, 81; account of explanation, 257 of, 117; principle of, 92,108,111,200
Skyrms, B., 218 Superselection: rule, 285, 318; subspace, 285
Slater, J. See Bohr, Kramers, and Slater Suppe, F., 80
Slomson, A. See Bell and Slomson Suppes, P., 80, 220
Sommerfeld, A., 7 Support: principle, 260; requirement, 273
Sommerfeld, A., and A. Lande, 3 Supremum, 187
Spaces: basis for, 13, 48; Cartesian product Surface locality, 243, 246, 250-252
of, 149; complete, 55; complex, 11; Swift, A., and R. Wright, 113
dimensionality of, 49; direct sum of, 100; Symmetry, 120-121, 127, 135, 309
Hilbert, 11,55,63,69; phase, 58, 69,175, Systems, 86, 301, 302; classical, 57-59, 69;
179; physical, 124; real, 11; representa- composite, 136, 148-151, 161,349-350;
tional, 124; state, 63, 69; topological Newtonian, 79; quantum, 69, 79; spin-1,
product of, 149; vector, 11, 40-42, 88. 164; spin-½, 3, 11, 119-130, 263
See also Tensor-product spaces
Span, 48, 49,87, 190
Spectral: decomposition, 55, 67; decomposi- Taylor, E., and J. Wheeler, 240, 242
tion theorem, 25, 36, 50, 54, 98; measure, Teller, P., 210, 212, 274, 278, 297, 298, 299,
51-55,235 302
Spectrum: continuous, 51-55, 66, 67, 69, Tensor-product spaces, 136, 160, 253; inner
210,235,260; discrete, 49, 69, 71, 97 product on, 148; linear operators on, 149;
Spin, 3, 11, 63, 296, 309. See also Pauli spin orthonormal basis for, 148; reduction of
matrices; Systems, spin-½ states on, 149-151; zero vector in, 149
Stachel, J. See Cohen and Stachel Three-valued logic, 214
Stairs, A., 165, 174, 209, 212, 217, 250, Tierliebhaber, X., 279
258, 275 Tinkertoy, 81
Staniland, H., 303 Total ordering, 187
States, 82, 88, 90, 196, 197; classical, 58; Trace of operator, 136
complete set of, 198; dispersion-free, 79, Truth: assignments, 203, 205, 206; tables,
159; dynamical, 176; dynamical evolution 184; values, 184, 214, 216
of, 72, 77-78, 113-118, 145-146, 201; Two-slit experiment, 226-238, 255
Uhlenbeck, G., ,I multiplication of, 12, 34, 40; zero, 12, 13,
Ultrafilter, 1114, 1H ~, 202, 204, 205 41. See also Spaces, vector
Uncertainty, 262, 267; interpretations of, von Neumann, J., 11, 45, 55, 108, 146, 162,
266; principle, 108, 111, 200, 269 173, 267, 268, 269, 271, 272, 273, 274,
Unicorns, 217 275,276,277,278,288,294,304,313,315
Universal wave function, 312
Unpolarized electron, 144, 215
Wave model, 227, 297
Valuations, 207 Wave-particle duality, 228, 231, 302, 303
van der Waals' law, 298, 299 Wheeler, J., 290. See also Taylor and Wheeler
van Fraassen, B., 79, 80, 149, 176, 178, 218, Wheeler, J., and W. Zurek, 232, 241, 288
246-247,248,249,271, 272,277,287, Wigner, E., 113, 170,238,246,278,294,
288, 292, 306, 315, 318 295,311,317
Variance, 262 Wigner's friend, 294-295
Vectors, 12; addition of, 12, 13,34,40; Wright, R. See Swift and Wright
inner product of, 26, 33, 34, 44-45;
length of, 26, 33, 88; normalized, 26, 34, Z2, 183
45; orthogonality of, 26, 27, 34, 45; scalar Zurek, W. See Wheeler and Zurek