General properties of entropy

Alfred Wehrl*
Institute for Theoretical Physics, University of Vienna, Vienna, Austria

Reviews of Modern Physics, Vol. 50, No. 2, April 1978
It is rather paradoxical that, although entropy is one of the most important quantities in physics, its main properties are rarely listed in the usual textbooks on statistical mechanics. In this paper we try to fill this gap by discussing these properties, as, for instance, invariance, additivity, concavity, subadditivity, strong subadditivity, continuity, etc., in detail, with reference to their implications in statistical mechanics. In addition, we consider related concepts such as relative entropy, skew entropy, dynamical entropy, etc. Taking into account that statistical mechanics deals with large, essentially infinite systems, we finally will get a glimpse of systems with infinitely many degrees of freedom.
… of entropy and may frequently lead to rather obscure conceptions and to very speculative or even mystical ideas. (An example is the famous heat death.) However, it has to be stressed that the concept of entropy is not at all unclear but a very well defined one. Of course, a correct definition is only possible in the framework of quantum mechanics, whereas in classical mechanics entropy can only be introduced in a somewhat limited and artificial manner.

Admittedly entropy has an exceptional position among the physical quantities. For instance, it does not show up in the fundamental equations of motion, such as the Schrödinger equation. Its nature is rather, roughly speaking, a statistical or probabilistic one; entropy can be interpreted as a measure of the amount of chaos within a quantum-mechanical mixed state. However, entropy by no means has to be considered as an entirely new quantity going beyond the concepts of classical or quantum mechanics. This idea has been discussed frequently in the past and, from time to time, is even found in the present-day literature. Let me emphasize that for a description of entropy the usual concepts of quantum mechanics such as Hilbert space, wave function, observables, and density matrices are absolutely sufficient (Sec. I.A).

Entropy relates macroscopic and microscopic aspects of nature and determines the behavior of macroscopic systems, i.e., real matter, in equilibrium (or close to equilibrium). Why this is true unfortunately is not yet understood in full detail, in spite of a century's efforts of thousands and thousands of physicists. There are many opinions and proposals for a solution to this problem; however, none of them seems to be completely satisfactory. Since there is an abundant literature on this topic, I will not, in this review, try to take account of all the results obtained so far, but will restrict myself to a few remarks only (Sec. I.B).

What I rather want to do is to give a survey of the general properties of entropy, i.e., those properties that do not depend on certain specific systems but are generally true. This is the main content of Sec. II. Certainly some of these properties are well known, whereas others seem to have escaped general attention, as, for instance, strong subadditivity. But all of them are very important and indispensable for, say, a correct treatment of the thermodynamic limit and various other problems. I have tried to indicate in several places what these properties are good for in physics, however. Sometimes this will be rather sketchy, and I will outline the main ideas only and will have to refer to the original papers for a detailed treatment.

Besides entropy itself there are many other quantities related to it that are of interest, as, for instance, the relative entropy and several other concepts. They will be treated in Secs. III and IV. One thing should be said in this connection: there is a tremendous variety of entropylike quantities, especially in the classical case, and perhaps every month somebody invents a new one. Among all these "entropies" I have tried to select those that, in my opinion, are of some physical significance. Maybe my choice will be felt to be subjective.

Since statements in statistical mechanics are frequently true in the infinite limit only, one cannot dispense with an ab initio description of infinite systems. This will be done in the last section, but again, I can present only a very sketchy treatment because there are severe mathematical obstacles that require extensive studies and go beyond the scope of this review. But after all one cannot avoid this approach, because such important properties as ergodicity, mixing, stability, etc., can (quantum-mechanically) only hold in strictly infinite systems.

As already mentioned, entropy can be considered as a measure of the amount of chaos, or, to what extent a density matrix can be considered as "mixed." In Sec. II.C an elaborate version of this concept of "mixedness" of a density matrix is presented. Since, on the other hand, entropy can also be regarded as a measure of the lack of information about a system (this is just another point of view of the preceding statement), it is also necessary to comment on the relation between (physical) entropy and information theory (Sec. II.G).

Of course, a few words also have to be said about the classical ensembles of statistical mechanics (Sec. I.C) as well as about the history of the subject (Sec. I.D). Again, this will be rather cursory because there exists a rich and excellent literature about all that.

I hope that the physics will not be hidden behind mathematical technicalities. At least I have tried to avoid this.
I. GENERALITIES

A. Definition of entropy

As already discussed in the introduction, entropy is different from most physical quantities. In quantum mechanics one has to distinguish between observables and states. Observables, like position, momentum, angular momentum, etc., are mathematically described by self-adjoint operators in Hilbert space. States, which generally are mixed, are characterized by a density matrix, say, ρ, i.e., a Hermitian operator ≥ 0 with trace = 1. The expectation value of an observable A in the state ρ is ⟨A⟩ = Tr ρA.

Now entropy is not an observable; that means that there does not exist an operator with the property that its expectation value in some state would be its entropy. It is rather a function of a state. If the state is described by the density matrix ρ, its entropy is defined by

S(ρ) = −k_B Tr ρ ln ρ.

This formula is due to von Neumann (1927) and generalizes the classical expression of Boltzmann and Gibbs to quantum mechanics. [Von Neumann's derivation is based on earlier arguments by Einstein (1914) and Szilard (1925).] k_B is Boltzmann's constant = 1.38 × 10⁻¹⁶ erg/K. In what follows we will put it equal to 1, which corresponds to measuring the temperature in ergs instead of Kelvin; thus entropy becomes dimensionless. (Occasionally we will insert in the formula for S(ρ) an arbitrary, compact, positive operator rather than a density matrix. The quantity thus obtained has, of course, no direct physical meaning.)¹

¹ Sometimes, mainly in the mathematical literature, one uses the letter H instead of S for entropy. It is claimed that the H should be a capital "eta"; however, this is not so certain. In any case, the letter H was introduced by Burbury only in 1890, whereas Boltzmann himself originally used "E." In physics, H is not a very good notation because of the risk of confusion with the Hamiltonian. The name "entropy" is due to Clausius (1865) and means transformation (τροπή). The prefix "en" was chosen to have a resemblance to the word "energy."

Entropy is a well defined quantity, no matter what the kind or size of the system under consideration is. (This statement, however, has nothing to do with the question to what extent entropy is a useful quantity in physics.) It is always ≥ 0 and, as we will see immediately, = 0 exactly for the pure states, possibly = +∞. (In a certain sense this latter possibility happens to be the usual case. Fortunately, this has no serious consequences in physics, cf. Sec. II.D.) It is another question how well it can be measured (cf. Sec. IV.B). Admittedly, in most cases one is not able to perform sufficiently many measurements in order to determine the density matrix ρ, and thus S(ρ), completely. But this problem does not concern entropy specifically, only quite generally the quantum-mechanical concepts of density matrices and wave functions. However, it is true that even if one knows ρ completely, it may be extremely hard to calculate S(ρ), although, of course, this can be done in principle, because one would have to diagonalize an infinite matrix in order to compute the trace of a function of it, namely, −ρ ln ρ.

1. Various interpretations of the expression for the entropy

Before trying to clarify the relation between the expression S(ρ) and physical reality, I want to mention a few interpretations of von Neumann's formula.

Ludwig Boltzmann's great discovery was the celebrated formula

S = k ln W,

which appeared² in a paper in 1877 and established the connection between the variable of state, "entropy," which had been derived from phenomenological considerations, and the "amount of chaos" (or disorder) of a system, which, more precisely, means the number of microstates which have the same prescribed macroscopic properties. (This number has been denoted as "thermodynamical probability," in German "thermodynamische Wahrscheinlichkeit," hence the letter W.) Of course, Boltzmann's treatment was a purely classical one. Since the "number of microstates" does not literally make sense in classical mechanics, he took it as the available volume in phase space divided by the volume of an (at first arbitrarily chosen) "unit cell." In quantum mechanics, however, there is no ambiguity at all; the "number of microstates" may be interpreted as the number of pure states with some prescribed expectation values. Let us assume that in a certain system there are W different pure states, each of them occurring with the same probability. Then the entropy is S = ln W (remember that we have put k_B = 1). The density matrix of this system is ρ = (1/W)P, P being a W-dimensional projection. One easily can see that ln W = −Tr ρ ln ρ.

² Not quite in this form, which is due to Planck (1906).

If ρ is of a more general type, then one has to look for an expression that interpolates between density matrices of the form 1/W times a W-dimensional projection. Of course, this is done by S(ρ) = −Tr ρ ln ρ, but there are many more expressions which do the same (for instance, −ln Tr ρ², cf. Sec. IV.B). However, S(ρ) = −Tr ρ ln ρ is the only possibility with reasonable properties (such as additivity and subadditivity, cf. Secs. II.E and II.F; furthermore, this expression enjoys nice "mixing properties" that are very desirable from the point of view of physics, cf. Sec. II.B).

It is rather instructive to pay attention to the combinatorial aspects of von Neumann's formula. Each density matrix can be diagonalized: ρ = Σ_k p_k |k⟩⟨k| [where |k⟩ = normed eigenvector corresponding to the eigenvalue p_k, |k⟩⟨k| = projection onto |k⟩, p_k ≥ 0, Σ p_k = 1]. Then S(ρ) = −Σ_k p_k ln p_k (we understand that 0 ln 0 = 0). p_k is the probability of finding the system in the pure state |k⟩. If one performs N measurements, one will obtain as a result that (at least for large N) the system is found p₁N times in the state |1⟩, p₂N times in the state |2⟩, etc. (Of course, these quantities need to be integers, but this is only a minor point which easily can be corrected.) Now the density matrix does not contain any information about the order in which one will find the states |1⟩, |2⟩, etc. There are N!/[(p₁N)!(p₂N)!···] possibilities for this; and for N → ∞ we find (by virtue of Stirling's formula) that 1/N times the logarithm of this number of possibilities converges to S.

One may likewise interpret this fact in the following manner: consider N copies of the same system [represented by the Hilbert space H ⊗ H ⊗ ··· ⊗ H, H = Hilbert space of the original system]. In this new system there are microstates of the form |1⟩ ⊗ |2⟩ ⊗ ···, etc., where |1⟩ occurs p₁N times, |2⟩ p₂N times, and so on. All these microstates have the same weight. According to Boltzmann one obtains for the entropy ln W_N [with W_N = N!/((p₁N)!(p₂N)!···)]. The corresponding portion for one system is (1/N) ln W_N, which goes to S(ρ) as N → ∞.
2. Entropy and information theory

As already explained, entropy is a measure of the "amount of chaos" or of the lack of information about a system. If one has complete information, i.e., if one is concerned with a pure state, entropy = 0. Otherwise it is > 0, and it is the bigger the more microstates exist and the smaller their statistical weight is. [One easily checks the inequality S(ρ) ≥ ln(1/p₁), p₁ being the biggest eigenvalue (= operator norm) of ρ.] This principle, namely, that entropy is a measure of our ignorance about a system, described by a density matrix, or, in the classical case, by a probability distribution, enables one to apply results of mathematical information theory to physics (Sec. II.G). Also the formal correspondence between the expression −Σ p_k ln p_k and Shannon's expression for the information content of a discrete probability distribution suggests such a procedure. We will discuss it in detail in Sec. II.G.
3. The classical approximation

The "classical limit" of the expression for the entropy is obtained by the usual prescription (we first consider the case of one degree of freedom only):

density matrix → probability distribution in phase space,
trace → ∫ dp dq/(2πℏ).

This can be justified mathematically by means of coherent states.³

³ Coherent states were introduced by Schrödinger in 1927. [A detailed treatment is presented in the book of Klauder and Sudarshan (1968).] They are functions of the form U(p,q)|0,0⟩ ≡ |p,q⟩, U(p,q) ≡ exp[(i/ℏ)(pQ − qP)], p, q = numbers, P, Q = momentum or position operator, respectively; |0,0⟩ = the wave function in configuration space π^(−1/4) exp(−x²/2). We have ⟨p,q|P|p,q⟩ = p, ⟨p,q|Q|p,q⟩ = q. The |p,q⟩ are Gaussian wave packets with minimal uncertainty. One should bear in mind that the |p,q⟩ are normed but not mutually orthogonal. More generally, one can take |p,q⟩ = exp[(i/ℏ)(ω^(1/2) pQ − ω^(−1/2) qP)]|0,0⟩; our following considerations are equally valid for these kinds of coherent states.

One can prove the following important relation:

Tr A = ∫ (dp dq/2πℏ) ⟨p,q|A|p,q⟩.

If one defines

ρ^cl(p,q) = ⟨p,q|ρ|p,q⟩   (1.2)

as the classical probability distribution in phase space, then

∫ (dp dq/h) ρ^cl(p,q) = Tr ρ = 1,

S = −∫ (dp dq/h) ⟨p,q| ρ ln ρ |p,q⟩.

The classical approximation consists in replacing ⟨p,q| ρ ln ρ |p,q⟩ by ρ^cl ln ρ^cl:

S^cl = −∫ (dp dq/h) ρ^cl ln ρ^cl.   (1.4)

Since −x ln x is a concave function,

−⟨p,q| ρ ln ρ |p,q⟩ ≥ −ρ^cl(p,q) ln ρ^cl(p,q),

and

S^cl ≥ S.   (1.5)

(In the rest of this article, we simply will write S, ρ instead of S^cl, ρ^cl, if there is no risk of confusion.)

The above inequality is a consequence of the following inequality for matrix elements: let f be a convex (concave) function, A a self-adjoint operator, and φ a normed vector. Then ⟨φ|f(A)|φ⟩ ≥ (≤) f(⟨φ|A|φ⟩). For the proof let us (for the sake of simplicity only) assume that A has a pure point spectrum: A = Σ_k α_k |k⟩⟨k|. Then

⟨φ|f(A)|φ⟩ = Σ_k |c_k|² f(α_k) ≥ (≤) f(Σ_k |c_k|² α_k) = f(⟨φ|A|φ⟩),   c_k = ⟨k|φ⟩.

For many density matrices, the error due to the replacement of ⟨p,q|ρ ln ρ|p,q⟩ by ρ^cl ln ρ^cl will be negligibly small. It turns out that the classical approximation is good as long as ρ^cl(p,q) is a smooth function spread over a volume in phase space that is ≫ h (Wehrl, 1977). If there are small-distance fluctuations, or if ρ^cl is concentrated on small regions of phase space, then the classical approximation can be very bad. (For an estimate of this error in typical situations, cf. Sec. C.)

There is a striking paradox, since quantum-mechanically one always has S(ρ) ≥ 0: because S(ρ) = −Σ p_k ln p_k, and since p_k ≥ 0, Σ p_k = 1, p_k ≤ 1, one has −p_k ln p_k ≥ 0 (= 0 if, and only if, p_k = 0 or 1; hence if one p_k = 1, all the others must be = 0, therefore ρ is a one-dimensional projection, i.e., a pure state). Thus S(ρ) ≥ 0 (cf. also Sec. II.A). The conventional classical entropy, however, may very well be < 0, even −∞ (see Fig. 1).

FIG. 1. Graph of f(x) = −x ln x.

Suppose that S^cl(f) < 0. Because ∫ f (dp dq/h) = 1, the extent of the region where f > 1 must be < h. Hence a negative classical entropy arises if one tries to localize a particle in phase space in a region < h, i.e., if the uncertainty relation is violated. Therefore in applying the conventional classical expression one has to keep in mind that not every classical probability distribution can be observed in nature.
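To illustrate the sign issue, here is a small numerical sketch (mine, not the paper's; units with ℏ = 1, so h = 2π, and Gaussian widths σ_q, σ_p chosen by hand). It evaluates S^cl = −∫ (dp dq/h) f ln f for a Gaussian phase-space distribution and shows that S^cl becomes negative precisely when the distribution is squeezed below the h scale:

```python
import numpy as np

hbar = 1.0
h = 2 * np.pi * hbar

def classical_entropy(sigma_q, sigma_p, L=12.0, n=801):
    """S_cl = -Int (dp dq / h) f ln f for a Gaussian f normalized so that
    Int (dp dq / h) f = 1.  Exact value: 1 + ln(sigma_q * sigma_p / hbar)."""
    q = np.linspace(-L * sigma_q, L * sigma_q, n)
    p = np.linspace(-L * sigma_p, L * sigma_p, n)
    Q, P = np.meshgrid(q, p, indexing="ij")
    f = (h / (2 * np.pi * sigma_q * sigma_p)) * \
        np.exp(-Q**2 / (2 * sigma_q**2) - P**2 / (2 * sigma_p**2))
    flnf = np.where(f > 0, f * np.log(np.where(f > 0, f, 1.0)), 0.0)
    return -flnf.sum() * (q[1] - q[0]) * (p[1] - p[0]) / h

# sigma_q * sigma_p = 1.0, 0.5 (uncertainty limit), 0.25 (forbidden region):
for s in (1.0, 0.5, 0.25):
    sig = np.sqrt(s)
    print(s, classical_entropy(sig, sig))  # negative once s < 1/e ~ 0.37 < 1/2
```

At the uncertainty limit σ_q σ_p = ℏ/2 the value is 1 − ln 2 > 0; a negative value requires σ_q σ_p < ℏ/e, which no density matrix can produce.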
Although the conventional classical entropy has some other inconvenient features (for instance, it is not monotonic as the "true" classical entropy is; cf. Sec. II.F), we will nevertheless, for the rest of this paper, always understand by "classical entropy" the conventional one, if not otherwise stated, in order to avoid any confusion.

At this place it should also be remarked (which, of course, is well known to everybody) that in purely classical reasoning the expression for the entropy can only be derived up to an additive constant. For dp dq has the dimension of an action, hence in order to obtain something dimensionless in the normalization condition "∫ ρ^cl = 1" one has to divide dp dq by some quantity of the dimension of an action (= volume of a "unit cell"). The right quantity, as we have just seen, is Planck's constant h (not ℏ!). If one takes some other quantity, say h′ (in a classical theory h cannot occur), we obtain the normalization condition ∫ (dp dq/h′) ρ′^cl = 1 and for the entropy

S′^cl = −∫ (dp dq/h′) ρ′^cl ln ρ′^cl = correct classical entropy − ln(h′/h),   ρ′^cl = (h′/h) ρ^cl.

If, in particular, ρ = const in a certain region ("phase volume") of the (p,q) plane, otherwise = 0, then S′^cl = logarithm of the phase volume measured in units of h′ = logarithm of the number of "cells." If the size of the cells is changed, then of course the expression for the entropy changes too. (In classical statistical mechanics this problem is partly overcome by the ad hoc postulate of the third law of thermodynamics.)

4. The classical discrete approximation

In approximating the expression for the entropy one can go one step further and discretize the classical probability distribution ρ^cl. That means that one partitions the phase space into cells of size h (enumerated by some index i) and replaces ρ^cl in each cell by its average, which we will denote by p_i, i.e.,

p_i = ∫_(cell i) (dp dq/h) ρ^cl.   (1.6)

Then Σ p_i = 1. The classical discrete entropy is defined by

S^(cl,d) = −Σ p_i ln p_i.

Because of the inequality

x(ln x − ln y) ≥ x − y,

one obtains

S^(cl,d) ≥ S^cl.

Like the classical (continuous) approximation, the classical discrete approximation may be sufficiently good for many purposes.

It should be noted that the same formal structure arises if all density matrices under consideration in a certain problem commute. In this case, there exists a common set of eigenvectors |i⟩ such that ρ^α|i⟩ = p_i^α|i⟩ (α labels the density matrices) and S(ρ^α) = −Σ p_i^α ln p_i^α. This shows that every general theorem that is true in quantum mechanics also must be true in the classical discrete case, or, vice versa, if a theorem is not true in the classical discrete case, it also cannot be true in the general quantum-mechanical case.
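The inequality S^(cl,d) ≥ S^cl is easily demonstrated numerically. The sketch below (an illustration with an arbitrary one-dimensional distribution standing in for phase space; not from the original text) averages a fine-grained distribution over cells and compares the two entropies:

```python
import numpy as np

# Fine-grained classical distribution on [0,1), normalized so sum(f) * dx = 1.
x = np.linspace(0.0, 1.0, 10_000, endpoint=False)
dx = x[1] - x[0]
f = np.exp(-((x - 0.3) / 0.02) ** 2)      # a narrow peak (arbitrary choice)
f /= f.sum() * dx

def entropy(g, dmu):
    gl = np.where(g > 0, g * np.log(np.where(g > 0, g, 1.0)), 0.0)
    return -gl.sum() * dmu                # with 0 ln 0 = 0

m = 100                                    # grid points per cell (the "cell of size h")
f_cg = np.repeat(f.reshape(-1, m).mean(axis=1), m)   # replace f by its cell average

print(entropy(f, dx), entropy(f_cg, dx))   # S_cl,d >= S_cl, via x(ln x - ln y) >= x - y
```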
5. Hilbert spaces for statistical mechanics

Although the expression for the entropy does not refer to any special structure of a system, there are some particular features of many-body systems. Let me begin with a short review of the Hilbert spaces of those systems that are of primary interest in statistical mechanics. For a careful presentation, see Ruelle, 1969.

One-particle systems

The Hilbert space of a particle moving in a subvolume V of R^d (d = space dimension) is L²(V) = space of square-integrable functions ψ(x) (x ∈ V). Here and throughout the rest of this paper we neglect spin, since our treatment will be a nonrelativistic one only.

Many-particle systems (Maxwell–Boltzmann statistics)

Here the Hilbert space is the tensor product of N copies of L²(V); thus the particles are supposed to be distinguishable. Since in nature there are only very few distinguishable particles, Maxwell–Boltzmann statistics is not very well suited for purposes of statistical mechanics.

Bose–Einstein statistics

The Hilbert space of N identical particles obeying B–E statistics is the symmetric tensor product of N copies of L²(V):

H_N^s(V) = L²(V) ⊗_s ··· ⊗_s L²(V),

which equals the space of square-integrable functions ψ(x₁, …, x_N) (x_i ∈ V) that are symmetric in x₁, …, x_N.

Fermi–Dirac statistics

Like the B–E case, but with "symmetric" replaced by "antisymmetric."

Fock space

If the number of particles is not kept fixed but if one rather wants to take into account the possibility of a variable number of particles, one considers Fock space

H^s(V) = ⊕_(N=0)^∞ H_N^s(V)

[H₀^s ≡ C (one-dimensional space = vacuum)]. If the measure of the intersection of two volumina V₁ and V₂ is zero (by abuse of language we will always write V₁ ∩ V₂ = ∅), then H^s(V₁ ∪ V₂) = H^s(V₁) ⊗ H^s(V₂), and analogously in the antisymmetric case.

Let us illustrate this by the simple example of a "microcanonical" density matrix ρ = (1/W)P (P = projection of dimension W, and W ≫ N). Here S = logarithm of the number of microstates, = ln(W over N) for fermions, = ln(W+N−1 over N) for bosons. In either case S ≈ −ln N! + N ln W.⁴

⁴ The term −ln N! that appears in these calculations can be derived from a rule known as correct Boltzmann counting: microstates of the type, say, |1⟩ ⊗ |2⟩ ⊗ ··· and |2⟩ ⊗ |1⟩ ⊗ ··· are to be identified (which clearly is a …

3. Relative entropy: dμ = σ d^(3N)p d^(3N)q/(h^(3N) N!), dν = ρ d^(3N)p d^(3N)q/(h^(3N) N!), σ, ρ being probability distributions. In this case the generalized BGS entropy is −∫ ρ(ln ρ − ln σ) d^(3N)p d^(3N)q/(h^(3N) N!); with the + sign in front, this quantity is called "relative entropy" and plays an im…
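For the microcanonical example just given, the counting can be checked directly. The short sketch below (an illustration, with arbitrary W and N) compares ln(W over N), ln(W+N−1 over N), and the common approximation N ln W − ln N! for W ≫ N:

```python
import math

def S_fermi(W, N):   # ln binom(W, N): N fermions in W one-particle states
    return math.lgamma(W + 1) - math.lgamma(N + 1) - math.lgamma(W - N + 1)

def S_bose(W, N):    # ln binom(W + N - 1, N): N bosons in W one-particle states
    return math.lgamma(W + N) - math.lgamma(N + 1) - math.lgamma(W)

W, N = 10**8, 100    # W >> N
approx = N * math.log(W) - math.lgamma(N + 1)    # N ln W - ln N!
print(S_fermi(W, N), S_bose(W, N), approx)       # all three nearly coincide
```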
B. Entropy and physics

The relation between entropy and physics is established by an empirical principle, namely, the second law of thermodynamics. There are several formulations of this law, of varying degrees of validity, which we will now briefly discuss. However, as mentioned in the introduction, the problem of the second law of thermodynamics does not appear to be fully understood yet.

1. A paradox

A very common formulation of the second law of thermodynamics reads as follows: the entropy of a closed system never decreases; it can only remain constant or increase. A less sharp formulation is the following (maximum entropy principle): the entropy of a closed system in equilibrium always takes the maximal possible value. (Of course, both formulations are a little bit vague and have to be specified in concrete instances.)

These statements are, however, in striking contradiction to the fact that the entropy of a system obeying the Schrödinger equation (with a time-independent Hamiltonian) always remains constant. For the density matrix at time t, let us denote it by ρ(t), is obtained from the density matrix at time 0, ρ, by the formula

ρ(t) = exp(−iHt) ρ exp(iHt).   (1.12)

Since exp(−iHt) is a unitary operator, the eigenvalues of ρ(t) are the same as the eigenvalues of ρ. But the expression for the entropy only involves the eigenvalues of the density matrix, hence S(ρ(t)) = S(ρ). (In the classical case, the analogous statement is a consequence of Liouville's theorem.)

This result seems to be absurd, since one knows by experimental experience that the second law is something very sensible and very useful. There is one way out of this dilemma; that is, that the time evolution of a system is not described by the Schrödinger equation but by some other equation. In fact, in statistical mechanics one uses, with great success, equations like the Boltzmann equation, the master equation, and other equations.
2. The Boltzmann equation

To begin with, let us look at the classical Boltzmann equation (Boltzmann, 1872). In historical development, this equation was the first one to describe an irreversible behavior of a system in a rigorous way. Yet this equation is still the best known to most physicists. Many of its features are characteristic of all equations that aim at overcoming the difficulty that microscopic description and irreversibility do not fit together. (See the article by Grad, 1958, or Cohen and Thirring, Eds., The Boltzmann Equation, 1972. Of course, the Boltzmann equation is also discussed in all textbooks on statistical mechanics.) Perhaps the reader should be warned that although usual macroscopic equations, such as the Navier–Stokes equations, can be derived from the Boltzmann equation by means of further approximations, the Boltzmann equation also has to be considered as a macroscopic equation, because it provides a description of the system in which the original number of degrees of freedom, 10²³, is reduced to 6.

The Boltzmann equation is by no means an immediate consequence of the laws of classical mechanics, i.e., the Hamiltonian equations. Rather it is based on several assumptions, such as, for instance, the molecular chaos, or the "Stosszahlansatz," and upon the fact that one considers the one-particle correlation function only, instead of taking the whole probability distribution in phase space. It turns out that, although the time evolution of the total system is given by the Hamiltonian dynamics, under certain conditions the time evolution of the first correlation function can be described, in fairly good approximation, by an irreversible equation.

The correlation function of a system of N identical particles (with mass = 1) is defined by

F(p, q, t) d³p d³q = number of particles in the volume d³p d³q at time t

[hence (1/N)F = probability of finding one particle in d³p d³q, irrespective of where the others are]. From the definition,

∫ F d³p d³q = N.

F is obtained from ρ^cl(p₁, q₁, …) by the formula

F(p, q, t) = ∫ [d³p₂ d³q₂ ··· d³p_N d³q_N / h^(3N) (N − 1)!] ρ^cl((p, q, p₂, q₂, …), t).

(Because of the symmetry of ρ^cl the exceptional position of the first particle is only fictitious.)

The assumption of molecular chaos states that the number of pairs of particles in the element d³q in configuration space, with momenta in d³p₁ or d³p₂, respectively, equals [F(p₁, q, t) d³p₁ d³q][F(p₂, q, t) d³p₂ d³q]. From it one derives (we consider the simplest case: no external forces, no internal degrees of freedom, etc.) the Boltzmann equation

∂F₁/∂t + p₁·∇_q F₁ = ∫ d³p₂ dΩ |p₁ − p₂| σ(Ω) (F₁′F₂′ − F₁F₂),

where σ(Ω) is the differential cross section for a collision (p₁, p₂) → (p₁′, p₂′) (Ω = solid angle), F_i = F(p_i, q, t), F_i′ = F(p_i′, q, t) (i = 1, 2).

The Boltzmann equation implies the H theorem: the function

H(t) = −∫ d³p d³q F ln F   (1.17)

is nondecreasing in time. The following remarks apply:

(1) H, as defined by Eq. (1.17), does not coincide with the classical entropy in general. This is the case only if ρ^cl factorizes: ρ^cl(w₁, w₂, …, w_N) = ρ₁^cl(w₁) ··· ρ₁^cl(w_N) [w_i = (p_i, q_i)]; we then have

S^cl = N S(ρ₁^cl)   (1.18)

(cf. Sec. A). Otherwise, H > S (cf. Sec. II.F).

(2) The correlation function F is obtained from the "true" distribution in phase space by some sort of averaging.

(3) The assumption of molecular chaos cannot be justified from first principles. It may be probable to a more or less high extent, but it certainly is neither necessary nor true for all time.
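The monotonicity stated in the H theorem is easy to observe numerically in a simplified setting. The sketch below does not integrate the full collision term; it uses the BGK (relaxation-time) caricature ∂F/∂t = (M − F)/τ, with M the Maxwellian sharing the density, momentum, and energy of F, for which an H theorem of the same kind holds (this substitution is my assumption for the illustration, not the equation in the text):

```python
import numpy as np

v = np.linspace(-6, 6, 401)
dv = v[1] - v[0]

F = np.exp(-(v - 2.0) ** 2)               # initial shifted distribution
F /= F.sum() * dv

# Maxwellian with the same density n, mean velocity u, and temperature T as F:
n = F.sum() * dv
u = (F * v).sum() * dv / n
T = (F * (v - u) ** 2).sum() * dv / n
M = n / np.sqrt(2 * np.pi * T) * np.exp(-(v - u) ** 2 / (2 * T))

def Hfun(F):
    Fl = np.where(F > 0, F * np.log(np.where(F > 0, F, 1.0)), 0.0)
    return -Fl.sum() * dv                 # H = -Int F ln F dv

tau, dt = 1.0, 0.01
for step in range(501):
    if step % 100 == 0:
        print(round(step * dt, 2), Hfun(F))   # H(t) is nondecreasing
    F = F + dt * (M - F) / tau            # BGK relaxation toward M
```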
From our discussion up to now we have learned that the mechanism of nondecrease of entropy is based upon averaging and probability assumptions. (We will recognize this in a somewhat clearer fashion in the example of the "master equation.") However, it should be mentioned that there is a rigorous derivation of the Boltzmann equation (but only for small times) for a gas of hard spheres of diameter d in the limit d → 0, n·d³ kept fixed, where n = number of particles/cm³, by Lanford (1975). In this limit, the system consists of infinitely many particles. This is one of the hints that rigorous versions of irreversibility, and quite generally thermodynamical behavior, are to be expected for infinite systems (and possibly for a restricted class of initial states) only [cf. Sec. IV.C].

Let us return to finite systems and proceed by discussing ergodicity and mixing properties of classical systems.

3. Ergodicity and mixing

I want to start with the concept of energy shell. Let ρ(t) be the time evolution of a density matrix ρ, i.e., ρ(t) = exp(−iHt) ρ exp(iHt), and let |n⟩ be the eigenvectors of the Hamiltonian, thus H|n⟩ = E_n|n⟩. Then the matrix elements of ρ(t) are

ρ(t)_mn = exp[−i(E_m − E_n)t] ρ_mn.   (1.19)

We may classify them as follows:

(a) Matrix elements that change in a significant way only during macroscopic time intervals (say, 10⁻³ sec). They are connected with extremely small (unmeasurable) energy differences E_m − E_n (< 10⁻²⁴ erg).

(b) If the difference E_m − E_n is bigger, then ρ(t)_mn is a very rapidly oscillating function of t. Since macroscopic measurements last rather long compared with the frequency of these oscillations, they in fact will average over ρ(t)_mn. The mean value of these matrix elements being of order of magnitude 1/(E_m − E_n)Δt (Δt = period of the measurement), one can neglect them, or, expressed in other terms, those fluctuations are too rapid to be observed.

Now let E = Tr ρ(t)H be the expectation value of the energy. Of course, this is a constant of motion, since Tr ρ(t)H = Tr exp(−iHt) ρ exp(iHt) H = Tr exp(−iHt) ρH exp(iHt) = Tr ρH. On the other hand, E = Σ ρ(t)_nn E_n. Due to our foregoing considerations, for a description of macroscopic changes of the system one only has to take into account the matrix elements of class (a). Hence we can restrict ourselves to that subspace of the Hilbert space that is spanned by those energy eigenvectors |n⟩ for which |E − E_n| < ε, with ε being sufficiently small. We will call this Hilbert space the energy shell. (In our considerations we always have assumed that the Hamiltonian has a pure point spectrum. However, it is a simple matter to generalize our arguments to Hamiltonians with a continuous spectrum.)

In the classical case (cf. Sec. A) we formally can put ε = 0 and therefore we are allowed to choose the energy shell infinitesimally thin. Of course, here "energy shell" no longer means a subspace of the Hilbert space L²(R^(3N)), but rather a subset of R^(6N). We will denote the classical energy shell by Ω_E (or simply Ω):

Ω_E = {(p₁, …, q_N): H(p₁, …, q_N) = E}   (1.20)

[H(…) = classical Hamiltonian]. In the following we want to use the abbreviation w ≡ (p₁, …, q_N).

The restriction of the measure d³p₁ ··· d³q_N/N! (as always, we assume the particles to be identical) to the energy shell Ω_E, formally given by

δ(E − H(p₁, …, q_N)) d³p₁ ··· d³q_N / N!,   (1.21)

defines a measure dw. [For a more precise definition see, for example, Reed and Simon (1972) or Arnold and Avez (1969), and other textbooks on ergodic theory.] By virtue of Liouville's theorem this measure is time invariant, i.e., dw = dw(t). Let me denote by W(Ω) (or simply W) the measure of all of Ω, by W(A) the measure of a subset A ⊂ Ω.

In classical statistical mechanics the concept of ergodicity has been introduced by Boltzmann in order to justify the microcanonical ensemble. A "microcanonical ensemble" means a uniform probability distribution over the energy shell, i.e.,

ρ_mc = 1/W(Ω).   (1.22)

[We write ρ_mc instead of the more precise notation ρ_mc^cl, recalling our remark following Eq. (1.5).] However, ergodicity certainly is too weak a property to establish that every probability distribution tends (at least in a certain sense) to the microcanonical one. Therefore one has to introduce a stronger notion: mixing. (This concept is due to Hopf, 1932.)

A system is called "mixing" if the following is true: let A_t be the time evolution of a subset A ⊂ Ω [i.e., A_t = {w(t): w(0) ∈ A}]. Then, for any two sets A, B ⊂ Ω, always

lim_(t→∞) W(A_t ∩ B) = W(A)W(B)/W(Ω).   (1.23)

Ergodicity only would state that

lim_(T→∞) (1/T) ∫₀^T W(A_t ∩ B) dt = W(A)W(B)/W(Ω).   (1.24)

There is no direction of time favored in this definition, which perhaps is not easy to recognize at first, because

lim_(t→+∞) W(A_t ∩ B) = lim_(t→+∞) W(A ∩ B_t) = lim_(t→−∞) W(A ∩ B_t) = W(A)W(B)/W(Ω).

One can think of such a system as a flow with strongly turbulent aspects. After sufficiently large times every set A is so ragged that its relative portion in every fixed part B of Ω is just W(A)/W(Ω) (see Fig. 2). It should be kept in mind, however, that such a behavior has nothing to do with irreversibility. Given A_t for t > 0, one can reconstruct A for t = 0. In fact, this sometimes even can be done …
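A standard toy model of mixing (my illustration; the paper's Fig. 2 makes the same point graphically) is the measure-preserving baker's map on the unit square. The Monte Carlo sketch below estimates W(A_t ∩ B) and watches it approach W(A)W(B)/W(Ω), as in Eq. (1.23):

```python
import numpy as np

rng = np.random.default_rng(1)

def baker(x, y):
    """One step of the measure-preserving (and mixing) baker's map."""
    return (2 * x) % 1.0, (y + np.floor(2 * x)) / 2.0

# Omega = unit square, W(Omega) = 1; A = left half, B = bottom quarter.
x, y = rng.random((2, 1_000_000))
in_A = x < 0.5                             # label the points that start in A

for t in range(10):
    estimate = np.mean(in_A & (y < 0.25))  # Monte Carlo value of W(A_t ∩ B)
    print(t, estimate)                     # tends to W(A) W(B) = 1/8
    x, y = baker(x, y)
```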
Macroscopic measuring apparatuses have only a restricted precision and are not able to distinguish between points inside one cell. Then also ρ(w) cannot be measured by them exactly, but rather only its mean value over the cells. (Of course, there is a certain arbitrariness with this concept, because there is no canonical way of defining these cells.) Let us define the coarse-grained density as follows:

ρ_cg(w) = [1/W(Ω_i)] ∫_(Ω_i) ρ(w′) dw′   (1.25)

if w ∈ Ω_i. (That coarse-graining is essential in statistical mechanics was first pointed out by the Ehrenfests, 1911.) This corresponds to replacing ρ by a distribution that is uniform inside the cells. As discussed before, one cannot distinguish ρ and ρ_cg by macroscopic measurements. The coarse-grained entropy is

S_cg(ρ) = S(ρ_cg) = −Σ_i P_i ln[P_i/W(Ω_i)],   P_i = ∫_(Ω_i) ρ(w) dw.   (1.26)

Of course, S_cg ≥ S, since we have lost information. (A proof is easily obtained by means of the inequality below.) Now mixing implies that

lim_(t→∞) ∫_(Ω_i) ρ(w(t)) dw = W(Ω_i)/W(Ω).   (1.27)

… the master equation (Pauli, 1928), which, of course, is of restricted validity but may be suitable for practical purposes.

Take a distribution that is constant in cell i and = 0 in all other cells. By Hamilton's equations one obtains from it the density distribution at time t, ρ(t), and the probabilities P_j(t). If ρ(t = 0) were concentrated in the cell i, but not constant, one would obtain another density distribution ρ̃(t > 0) and other probabilities P̃_j(t). Now if the cells are not too small, one can find arguments that in the overwhelming majority of possible cases P_j(t) ≈ P̃_j(t). Starting with arbitrary distributions ρ(t = 0), no longer necessarily concentrated within one cell, one concludes that "almost always" P_j(t) can be calculated from the P_i(0):

P_j(t) = Σ_i T_ji(t) P_i(0).   (1.30)

On the other hand, ρ(t + t′) can be calculated from ρ(t); hence, similarly, P_j(t + t′) = Σ_k T_jk(t′) P_k(t). For simplicity and mathematical convenience one may impose the Markov property on the T's:

T_ji(t + t′) = Σ_k T_jk(t′) T_ki(t)   (1.31)

(Chapman–Kolmogorov equation). The differential form of this equation is obtained by inserting t′ = dt; then T_jk(dt) must be of the form …
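The structure of Eqs. (1.30) and (1.31) can be illustrated with a small master-equation sketch (mine, with an arbitrary symmetric rate matrix; equal cell sizes are assumed, so that the uniform distribution is stationary). It checks the Chapman–Kolmogorov property and the nondecrease of the coarse-grained entropy discussed below:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)

# Symmetric off-diagonal rates W_ik = W_ki; the generator's columns sum to 0,
# so T(t) = exp(Lt) is doubly stochastic and the uniform distribution is stationary.
n = 6
R = rng.random((n, n)); R = (R + R.T) / 2
np.fill_diagonal(R, 0.0)
L = R - np.diag(R.sum(axis=0))

def T(t):
    return expm(L * t)

P = rng.random(n); P /= P.sum()            # initial cell probabilities P_i

def S_cg(P):
    return -np.sum(P[P > 0] * np.log(P[P > 0]))

print(np.allclose(T(0.3) @ T(0.7), T(1.0)))   # Chapman-Kolmogorov, Eq. (1.31)
for t in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(t, S_cg(T(t) @ P))               # coarse-grained entropy nondecreasing
```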
… phenomenological equations, for which I want to refer to the literature only. (For a good bibliography, see Reif, 1965.)

The considerations presented above are not intended to give any "proof" of the second law of thermodynamics. I rather wanted to draw attention to those assumptions that are necessary in order to "produce" an irreversible behavior, or to derive equations predicting approach to equilibrium. Let me single out once more the main features:

(1) Some averaging procedure is needed. Concerning the Boltzmann equation, it consisted in considering the first correlation function instead of the complete distribution in phase space. For closed systems, this leads to a nonlinear equation. For open systems in free space (which we did not discuss) this leads to a linear transport equation. The treatment of open systems in bounded regions (systems in a heat bath) leads to master equations of the type mentioned above or, more generally, to dynamical semigroups (Kossakowski, 1972; Gorini, …).

(2) … there are always many uncontrollable perturbations that will have the effect that the "true" dynamics of the system is spoiled and that the time evolution of a point in phase space no longer obeys the Hamiltonian equations but rather behaves in a stochastic manner.

Properties of the solutions of the master equation were first discussed by von Mises (1931), Fréchet (1938), and Feller (1950), just to mention the earliest treatments. In particular, the master equation implies a monotone increase (or nondecrease) of the coarse-grained entropy

S_cg = −Σ_i P_i ln[P_i/W(Ω_i)].

A more general result in this direction is that for density matrices ρ(t), whose time evolution is given by a dynamical semigroup, the relative entropy S(ρ₀|ρ(t)) decreases. [See Eq. (1.41) below; ρ₀ = the stationary state.] The entropy production −(d/dt) S(ρ₀|ρ(t)) … (The latter fact relies on Lieb's theorem; cf. Sec. III.)

For the proof of our first statement, let us, for simplicity, assume that all cells have the same size, W(Ω_i) = ω. Then

S_cg = ln ω − Σ P_i ln P_i.

Now we arrange the P_i in decreasing order: P₁ ≥ P₂ ≥ ···. For the sum of the biggest n P_i's we find, with symmetric transition rates W_ik = W_ki,

(d/dt) Σ_(i=1)^n P_i = Σ_(i≤n) Σ_(k>n) W_ik (P_k − P_i) ≤ 0,

since P_k ≤ P_i for k > n ≥ i; hence the ordered partial sums decrease, the distribution becomes more mixed, and S_cg increases.

In our last discussion, we introduced coarse-graining. This may seem to be somewhat artificial. However, after all, one can regard it in another way. Suppose we are dealing with a system consisting of two subsystems, the phase space of one of them being discrete, Ω₁ = {1, 2, …}, the phase space of the second one being continuous, and the phase space of the whole system being Ω = Ω₁ × Ω₂. In addition, assume that the composite system is mixing. Let ρ = ρ(i, w) (i ∈ Ω₁, w ∈ Ω₂) be a probability distribution. The corresponding density distributions of the two subsystems obviously are to be taken as

ρ₁(i) = ∫ ρ(i, w) dw,   ρ₂(w) = Σ_i ρ(i, w). …
… classical one. Also the same remarks apply as in the classical case concerning its validity.

In principle, however, all our previous statements about quantum-mechanical mixing are false on a formal basis because of the recurrence paradox ("Wiederkehreinwand"). In our case it states that, if the Hamiltonian has a discrete spectrum, the function Tr ρ_t Q is almost periodic in t. The way out is well known: the time it would take until the system gets close to the original state again is tremendously large for macroscopic systems and beyond any sensible imagination. Thus, if t = ∞ means something like "t = age of the universe," things are certainly okay. To correct for our above considerations in a mathematically incontestable way we have to deal with strictly infinite systems. We will do this in the last section only, because of the mathematical technicalities that are involved. At this point let me just mention that ergodicity and mixing make perfect sense in the infinite case.

So far a few remarks (admittedly rather superficial) have been given on the problem of approach to equilibrium. For a more careful and detailed discussion I have to refer the reader to the literature, as announced in the introduction.

In the rest of this section I would like to comment on some properties of equilibrium states.

It is often argued on philosophical grounds that the microcanonical state is the equilibrium state (if the energy is fixed), because, after all, there is no physical principle which would distinguish between the different energy eigenstates of the energy shell, and therefore any of them must occur with the same probability. However, it is not obviously certain that this application of Laplace's principle of insufficient reason to physical systems is really legitimate; one definitely has to elaborate those physical laws which are responsible for the validity of this principle for real matter.

Equilibrium states, and only they, also enjoy remarkable stability properties, which roughly may be characterized as follows: small local perturbations of the …

(1) By the interpretation of entropy given at the beginning of this chapter, the density matrix with maximal entropy is the most probable one.

(2) Under certain conditions to be discussed in the next section, the classical ensembles are equivalent in the sense that in the thermodynamic limit they give the same expressions for the intensive thermodynamical quantities. This shows that for large systems it is not really necessary that they be in the state with maximal possible entropy, but that deviations from this state that are not too big do not change the thermodynamics. Thus "not too big" means that, for instance, a difference in entropy to the maximal value of, say, order √N does not at all matter and can be neglected. Before further discussing the classical ensembles, let me state some mathematical aspects of states with maximal entropy.

5. States with maximal entropy

We study the following problem: given E = Tr ρH (H being a fixed Hamiltonian), what does the density matrix with maximal entropy look like? The answer is well known: it is

σ_β = exp(−βH)/Tr exp(−βH) ≡ Z⁻¹ exp(−βH),   Z = Tr exp(−βH) (partition function)   (1.39)

(the Gibbs state), where β is chosen such that Tr σ_β H = E.

The proof is based on Klein's inequality (see, for example, Ruelle, 1969): let f be a convex (concave) function. Then

Tr[f(B) − f(A)] ≥ (≤) Tr[(B − A) f′(A)].   (1.40)

This inequality is rather powerful, and we will frequently make use of it. Let us state some important special cases.

(1) Take f(x) = −x ln x, A, B ≥ 0. Then

Tr A(ln A − ln B) ≥ Tr(A − B).

If, in particular, A and B are density matrices ρ, σ, then we find for the relative entropy S(σ|ρ) = Tr ρ(ln ρ − ln σ) ≥ 0.

… This is a consequence of the following

Lemma 1 (Wehrl, 1974): Let f be concave (convex) with f(0) = 0. Then S(f(ρ)/Tr f(ρ)) ≥ (≤) S(ρ).

This lemma is itself a consequence of the next one, as we already have seen in our discussion of the properties of the solutions of the master equation.

Lemma 2: The mappings ρ → f(ρ)/Tr f(ρ) (for concave f) and ρ → f(ρ)/Tr f(ρ) (for convex f), all with f(0) = 0, are mixing-enhancing in the sense that, for the eigenvalues P₁ ≥ P₂ ≥ ··· of ρ and P₁′ ≥ P₂′ ≥ ··· of f(ρ)/Tr f(ρ),

Σ_(i=1)^n P_i′ ≤ (≥) Σ_(i=1)^n P_i for all n.

(Remember that Σ_n P_n = 1.)

For the proof of Klein's inequality, note that ⟨φ|f(B)|φ⟩ ≥ (≤) f(⟨φ|B|φ⟩) (cf. Sec. I.A.3). Hence, since convexity (concavity) implies f(y) − f(x) ≥ (≤) (y − x) f′(x), one obtains, for the eigenvectors φ_i of A belonging to the eigenvalues α_i,

Σ_i ⟨φ_i| f(B) − f(A) |φ_i⟩ ≥ (≤) Σ_i (⟨φ_i|B|φ_i⟩ − α_i) f′(α_i),

which is just Tr(B − A) f′(A). [Since f is convex (concave), there exist at least both one-sided derivatives f₊′, f₋′, and the inequality is true for both of them.]

Now we return to the Gibbs state. Suppose that Tr ρH = E and that Tr σ_β H = E. Then

−Tr ρ ln σ_β = β Tr ρH + ln Tr exp(−βH),

and hence, by the positivity of the relative entropy, S(ρ) ≤ −Tr ρ ln σ_β = βE + ln Z = S(σ_β). Defining the free energy functional F(ρ, β, H) = Tr ρH − β⁻¹ S(ρ), one likewise sees that, for ρ = σ_β, F(ρ, β, H) is minimal (namely, = F). One easily verifies the standard relation …
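The variational characterization of the Gibbs state can be checked numerically. The sketch below (an illustration with a random Hamiltonian; the bracketing interval for β is an arbitrary choice) determines β from E = Tr σ_β H and verifies the key inequality of the proof, S(ρ) ≤ β Tr ρH + ln Z, with equality for ρ = σ_β:

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import brentq

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
H = (A + A.T) / 2

def S(rho):
    p = np.linalg.eigvalsh(rho); p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

def gibbs(beta):
    G = expm(-beta * H)
    return G / np.trace(G)

ev = np.linalg.eigvalsh(H)
E = float(np.percentile(ev, 30))           # an energy strictly between min and max
beta = brentq(lambda b: np.trace(gibbs(b) @ H) - E, -30.0, 30.0)
sigma = gibbs(beta)
lnZ = np.log(np.trace(expm(-beta * H)))

print(S(sigma), beta * E + lnZ)            # equal: S(sigma_beta) = beta E + ln Z
for _ in range(3):                         # any other state: S(rho) <= beta Tr(rho H) + ln Z
    B = rng.standard_normal((5, 5)); rho = B @ B.T; rho /= np.trace(rho)
    print(S(rho) <= beta * np.trace(rho @ H) + lnZ + 1e-9)
```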
… there exists a function F(z), analytic in the open strip and continuous in the closed strip 0 ≤ Im z ≤ β, such that, for real t, F(t) = ⟨B A_t⟩_β and F(t + iβ) = ⟨A_t B⟩_β.

The KMS condition is of greater importance, since it extends to infinite systems (where σ_β no longer exists). It turns out that it entails far-reaching consequences for the structure of infinite systems (Kubo, 1957; Martin and Schwinger, 1959; cf. the last section. The role of the KMS condition in infinite systems was realized by Haag, Hugenholtz, and Winnink, 1967.)

Finally it should be mentioned that there is an interesting inequality between the quantum-mechanical and the classical partition function, somewhat similar to inequality (1.5) but much more powerful. If the Hamiltonian is of the form H = Σ_i P_i²/2 + V(x), then the quantum-mechanical partition function is

Z = Tr exp(−βH) = exp(−βF),

whereas the classical partition function is

Z^cl = ∫ [d^(3N)p d^(3N)q / h^(3N)] exp{−β[Σ_i p_i²/2 + V(q)]} = exp(−βF^cl)   (1.46)

(we put m = ℏ = 1 and, for simplicity, suppose Boltzmann statistics; F^cl = classical free energy). Define the convolution

V_ω(q) = (ω/π)^(3N/2) ∫ d^(3N)q′ V(q′) exp[−ω Σ_i (q_i − q_i′)²] + (3N/2)ω,   (1.49)

and Z_ω^cl, F_ω^cl as Z^cl, F^cl above, but with V(q) being replaced by V_ω(q). Then, for all ω,

Z_ω^cl ≤ Z ≤ Z^cl,

or   (1.50)

F^cl ≤ F ≤ F_ω^cl

[Eq. (1.46)].

The upper bound for Z relies on the Golden–Thompson inequality (Golden, 1965; Thompson, 1965): Tr exp(A + B) ≤ Tr exp(A) exp(B). Inserting A = kinetic energy, B = V(x), one arrives at

Z ≤ Tr exp(−β Σ_i p_i²/2) exp(−βV).

To exp(−β Σ_i p_i²/2) there corresponds an integral kernel K with

K(x, x) = (2πβ)^(−3N/2) = ∫ [d^(3N)p/(2π)^(3N)] exp(−β Σ_i p_i²/2),

which immediately yields the desideratum. For the lower bound we will consider the case of one degree of freedom only; the general case is obtained in a straightforward manner. Using coherent states |z⟩ (see the first section),

Z = Tr exp(−βH) ≥ ∫ (dp dq/2π) exp(−β⟨z|H|z⟩),

and by an easy explicit calculation one obtains ⟨z|H|z⟩ = p²/2 + 1/2 + V₁(q). That 1 can be replaced by ω follows from our remarks concerning the definition of coherent states in Sec. A. A similar inequality holds for spin systems (Lieb, 1973a).
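The Golden–Thompson inequality itself is easy to test numerically. The following sketch (mine; random Hermitian matrices of an arbitrary size) checks Tr exp(A + B) ≤ Tr exp(A) exp(B) on a few samples:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

def rand_herm(n):
    X = rng.standard_normal((n, n))
    return (X + X.T) / 2

for _ in range(5):
    A, B = rand_herm(4), rand_herm(4)
    lhs = np.trace(expm(A + B))
    rhs = np.trace(expm(A) @ expm(B))
    print(lhs <= rhs + 1e-9, lhs, rhs)     # Tr e^{A+B} <= Tr e^A e^B
```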
C. The classical ensembles

The most important kinds of density matrices to be considered in statistical mechanics are the classical ensembles (see again Ruelle, 1969).

The microcanonical ensemble is the density matrix in the Hilbert space H_N^(s,a)(V) defined by

ρ_mc = χ_[E−ε,E](H) / Tr χ_[E−ε,E](H)   (1.51)

(χ_[E−ε,E] = characteristic function of the interval [E − ε, E], ε = "thickness" of the energy shell). Here S_mc, the microcanonical entropy, is

S_mc = ln Tr χ_[E−ε,E](H) = logarithm of the number of energy levels between E − ε and E.

The number ε is undetermined to a certain extent. In classical mechanics it can be chosen equal to zero (cf. Sec. B; it certainly is not necessary here to display the corresponding classical probability distribution ρ_mc^cl); on the contrary, in quantum mechanics it is very convenient to choose ε = E, thus

ρ_mc = Θ(E − H) / Tr Θ(E − H).   (1.52)

(We suppose H ≥ 0.) It turns out that in fact, under certain conditions, it does not matter how big ε is chosen, as we will indicate below.

The canonical ensemble in H_N^(s,a)(V) is given by

σ_β = exp(−βH) / Tr exp(−βH),   (1.53)

σ_β being the Gibbs state at inverse temperature β. The entropy is

S_can = β(E − F).

The grand-canonical ensemble is defined in Fock space H^(s,a)(V) by

ρ_gc = exp(−βH + αN) / Tr exp(−βH + αN) = exp[−β(pV + H − μN)]   (1.54)

(α = βμ, μ = chemical potential, p = pressure). There are also other kinds of ensembles that are sometimes of use in physics (for instance, the "pressure ensemble"; Lewis and Siegert, 1956). If there are more parameters, like electric and magnetic fields, one clearly also has to consider more complicated ensembles. However, I do not want to go into details, since these things are covered in all textbooks on statistical mechanics.

I would rather like to concentrate on only a few aspects that are of some importance for the rest of this paper.

1. The thermodynamic limit

This is the question whether, if a sequence of volumes V tends to infinity, the limits

s = lim S/|V|,   f = lim F/|V|,   …,

or of p as defined by Eq. (3.4), exist, provided that N/|V| and E/|V| (or ε, μ) are kept fixed. (We write |V| for the measure of the region V if there is a risk of confusion.) The existence of such a limit only justifies the usual distinction between "extensive" quantities, like entropy, free energy, etc., and "intensive" ones, like temperature, pressure, etc.

At first one has to make clear what is meant by "V tends to infinity." The least restrictive notion in this direction is "tends to infinity in the sense of van Hove" (van Hove, 1949, 1950): Let V(a), with a ∈ R^d or Z^d, be a parallelepiped {x ∈ R^d: 0 ≤ x_i ≤ a_i}. Consider all translates of V(a) of the form {x: x_i = n_i a_i + ξ_i, n_i = integer, 0 ≤ ξ_i ≤ a_i}. Let n_a⁻, or n_a⁺, respectively, be the number of all translates of V(a) that are contained in a given volume V, or have a nonempty intersection with it, respectively (see Fig. 4). A sequence of volumes is said to tend to infinity in the sense of van Hove if, for any a, n_a⁻/n_a⁺ → 1.

FIG. 4. Definition of n_a⁻ and n_a⁺.

More restrictive is the tendency to infinity in the sense of Fisher. Define V_h to be the set of all points that have a distance less than h to the boundary of V. Let D(V) be the diameter of V. Then V → ∞ in the sense of Fisher if there is a function π(α), with π(α) → 0 for α → 0, such that, for sufficiently small α and all V,

|V_(αD(V))| / |V| ≤ π(α),

and, in addition, |V| → ∞.

One often considers less sophisticated kinds of limits V → ∞, such as sequences of parallelepipeds V(a), with all a_i → ∞, or sequences of balls with radii going to infinity. However, the disadvantage of these limits is that one never can be sure that the limit is shape independent, and, for instance, for a sequence of cubes it could be different from that for a sequence of balls.

Now the result concerning the thermodynamic limit is that it exists for N-particle Hamiltonians of the form

H_N = Σ_i p_i²/2 + Σ_(i<j) φ(x_i − x_j),   (1.55)

provided that there is a lower bound for the Hamiltonian, H_N ≥ −NC, C being a fixed constant, and if, for |x| sufficiently large,

φ(x) ≤ A|x|^(−λ),   λ > d.   (1.56)

(One can also formulate a similar theorem for Hamiltonians including many-body interactions.) The proof of this theorem is quite involved and shall be omitted here. (See Ruelle, 1969. The idea of using "corridors" in order to prove the existence of the thermodynamic limit is due to Yang and Lee, 1952, and van Hove. Rigorous proofs are due to Fisher, 1964, and Griffiths, 1965.)

Unfortunately, however, these considerations do not really apply to physics, because, after all, the interaction between particles is not "tempered" in the sense of Eq. (1.56) but, with great accuracy, goes as 1/x: there is one contribution from gravitation and an electrostatic part. Also, one usually has to consider several species of particles (electrons, nuclei, …). If one neglects electrostatic forces (i.e., if one considers neutron stars or something similar), then a thermodynamic limit in the usual sense no longer exists. If one takes the free energy as a function of particle number, inverse temperature, and volume, the limit

lim_(N→∞) N^(−7/3) F(N, N^(−4/3)β, N^(−1)V)

exists instead of

lim N^(−1) F(N, β, V),

i.e., the scaling behavior of the thermodynamical quantities is entirely different from the usual one (Hertel and Thirring, 1971; Hertel, Narnhofer, and Thirring, 1972). This shows that it is by no means self-evident, for instance, that entropy is an extensive quantity.

If gravitation can be neglected, but there are only electric forces, then for neutral systems (and provided that at least one species of particles are fermions) it can be shown that a lower bound H_N ≥ −NC is true (Dyson and Lenard, 1967, 1968; Dyson, 1967; a much better method for obtaining a lower bound was presented by Lieb and Thirring, 1975), and a thermodynamic limit in the usual sense exists (Lebowitz and Lieb, 1969; Lieb and Lebowitz, 1972). This, undoubtedly, is one of the deepest results of mathematical physics: stability of matter. (For all that concerns stability of matter, cf. the review article by Lieb, 1976.)
2. Equivalence of ensembles

The next problem is that of equivalence of ensembles: Are the thermodynamic quantities, computed with the various ensembles, asymptotically equal in the thermodynamic limit? This, of course, need not always be true (for instance, in the regions of phase transitions). On the other hand, it is "normally" to be expected. Let me illustrate this question by two examples only.

a. Equivalence of the microcanonical ensembles

Let

s(η) = lim (1/|V|) ln Tr Θ(η|V| − H)   (1.57)

(η = energy density; the underlying Hilbert space is that of N particles, with N/|V| kept fixed). It turns out that s is a concave, nondecreasing function of η. As long as it is strictly increasing, all the other ensembles defined by (1.51) yield the same entropy density too (see Ruelle, 1969).

b. Comparison between microcanonical and canonical ensemble

Assuming differentiability of s, take β = ∂s/∂η, and define f = η − β⁻¹s. Then the limit of the density of the free energy of the canonical ensemble exists and equals this f. (For more details, once more Ruelle's book should be consulted.)

What one can learn from all this is that for large systems, provided that the thermodynamic limit exists, the precise structure of the density matrix is not so important. To come back to our last example, the difference of the entropies in the microcanonical and the canonical ensemble is of order ln N, which is big but on the other hand is negligible in the thermodynamic limit. Therefore our starting point of Sec. B, the maximum entropy principle, must be formulated as follows: given some intensive quantities (such as temperature, density, or energy density, etc.), the entropy density of the corresponding equilibrium state is maximal.
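The size of the microcanonical/canonical discrepancy can be seen in the simplest model: N independent two-level systems with level spacing ε (my illustration, not an example from the paper). The sketch compares S_mc = ln(N over n) at energy E = nε with the canonical S = β(E − F) at the β reproducing the same mean energy; the difference is of order ln N:

```python
import math

eps = 1.0
N = 10**6
n = N // 4                                 # E = n * eps excited two-level systems

# Microcanonical: S_mc = ln binom(N, n)
S_mc = math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)

# Canonical: choose beta so that <n> = N / (e^{beta eps} + 1) equals n,
# i.e., beta = ln((N - n)/n) / eps; then S = beta (E - F) = N (beta eps <n_1> + ln Z_1).
beta = math.log((N - n) / n) / eps
lnZ1 = math.log(1 + math.exp(-beta * eps))
S_can = N * (beta * eps / (math.exp(beta * eps) + 1) + lnZ1)

print(S_mc, S_can, S_can - S_mc)           # difference of order ln N
print((S_can - S_mc) / N)                  # vanishing difference per particle
```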
D. Historical remarks

I would like to conclude this introductory chapter by some remarks concerning history. There are many reviews, historical surveys as well as reprint volumes, containing the decisive papers in this field (for instance, by Brush, 1965, 1966; Klein, 1972; Koenig, 1959; Roller, 1950), that certainly describe the historical development of the subject much better than I could do. I only want to make a very few comments that concern "entropy" itself, without pretending that they are exhaustive or take into account all important steps in the past.

Thermodynamics in the modern sense has its origin in the work of Mayer (1842) and Joule (1845), to whom major credit is to be given in the recognition of the first law of thermodynamics (conservation of energy, which in times past was called "force"), and of Clausius (1850) and Thomson (Lord Kelvin) (1852, 1857), who, based on previous work of Carnot (1824), formulated the principle of dissipation of energy and the second law of thermodynamics. That this principle leads to the heat death was worked out by von Helmholtz (1854). The notion of entropy finally was introduced by Clausius in 1865.

At about the same time the kinetic theory was put forward by Maxwell (1860) and Clausius. An important step towards the understanding of irreversibility was Maxwell's Demon (Maxwell in a letter to Tait, 1867), which illustrated the statistical nature of it.

The main contributions, however, in this direction, are due to L. Boltzmann: the Boltzmann equation and the H theorem (1872), the relation between entropy and probability theory (1877), etc. Boltzmann's ideas caused many controversies, and there were many objections, such as the reversibility paradox (Loschmidt, 1876) and the recurrence paradox (Zermelo, 1896; based on work of Poincaré, 1890), etc., and, as we have seen, there are still open problems with them, at least if one desires full mathematical rigor.

The next step towards the modern concept of entropy was taken by J. W. Gibbs (1902), who adopted the ensemble point of view and gave a definition of entropy as the average index of probability-in-phase (this probability-in-phase is just the classical probability distribution of Sec. A; the "index" is ln(1/ρ), hence the average is ∫ ρ ln(1/ρ)).

How the paradox that entropy, after all, should remain constant could be resolved was pointed out by the Ehrenfests (1911), who recognized the role of coarse-graining.

Finally, the quantum-mechanical expression for the entropy was given by von Neumann in 1927.

As far as more recent developments are concerned, I have tried to give the relevant literature in the text. There have been significant contributions concerning classical entropy and classical statistical mechanics, and there has been a strong impetus in creating such fields as ergodic theory and the theory of dynamical systems. In contrast, the properties of quantum-mechanical entropy have not been investigated in detail for a very long time, and it certainly is to the credit of E. Lieb to have put forward their study in the last few years.
II. PROPERTIES OF ENTROPY

A. Simple properties

In this chapter we come to the very object of this review, namely, to describe the various general properties of entropy. Let me start with a few extremely simple ones. Some of them we already have discussed in Sec. I, as, for instance, the following:

Entropy is defined for every density matrix; it is always ≥ 0, possibly = ∞. For the pure states, and only for them, S = 0.

Again this shows a weak point in a purely classical approach, because in the classical case the "pure states" certainly are density distributions that are concentrated at one point (i.e., δ functions), and their entropy is −∞. This does not fit into the interpretation of entropy as "lack of information."

One easily verifies that the range of S(ρ) is the whole extended real half-line [0, ∞], i.e., to every number c, 0 ≤ c ≤ ∞, there exists a density matrix ρ such that S(ρ) = c.

The range of the generalized Boltzmann–Gibbs–Shannon (cf. Sec. I.A) entropy is [u, v] or (u, v], u ≡ inf ln μ(X) [X = non-null measurable subset of Ω], v ≡ ln μ(Ω), depending on whether the infimum is attained or not (Ochs, 1976). For the classical Boltzmann–Gibbs entropy, in particular, it is all of R.

An important property of entropy is invariance. Since S(ρ) = −Σ p_i ln p_i, p_i being the eigenvalues of ρ, S only depends on the (strictly) positive part of the spectrum of ρ. Any mapping ρ → ρ′ that leaves the positive spectrum unchanged also leaves the entropy unchanged. Examples for such mappings are the following:

1. ρ′ = UρU⁻¹, U unitary (unitary invariance). Note, however, that for the coarse-grained entropy the above invariance property is not true; hence it may change with time. If, in particular, U is a permutation matrix with respect to the eigenvectors of ρ, then the invariance property is also called symmetry. In other words: S is a symmetric function of the p_i's.

2. Let H′ = H ⊕ H″, ρ′ = ρ ⊕ 0. In that case the invariance property is called expansibility. In graphical language, one adds zeros to ρ as seen in Fig. 5.

FIG. 5. Construction of ρ′ = ρ ⊕ 0.

Another simple property is insensitivity: S is a continuous function of finitely many eigenvalues p_i, provided that the rest of them are kept fixed. (This, however, does not mean that S is a continuous function of ρ; cf. Sec. D.)

Let us give an example of a function of a density matrix for which the insensitivity does not hold: the quantum-mechanical version of the Hartley entropy S₀(ρ) is defined as the logarithm of the number of eigenvalues that are ≠ 0. If, for instance, p₁ = ··· = p_n = 1/n, p_(n+1) = p_(n+2) = ··· = 0, then S₀(ρ) = S(ρ). If now p₃ = p₄ = ··· = 0, then S(ρ) = −p₁ ln p₁ − p₂ ln p₂ is continuous in p₁ and p₂, whereas S₀(ρ) = ln 2 if 0 < p₁, p₂ < 1, otherwise = 0; hence S₀(ρ) is discontinuous in p₁ and p₂. However, it should be remarked that, apart from insensitivity, many properties of the Hartley entropy are shared in common with the "right" entropy, such as additivity, subadditivity, and the above invariance property.

If ρ is of finite rank, i.e., if only finitely many eigenvalues are ≠ 0, then S(ρ) < ∞. Now let ρ be an arbitrary density matrix. The canonical approximations are obtained as follows:

ρ^(n) ≡ [Σ_(k=1)^n p_k |k⟩⟨k|] / λ_n,   λ_n ≡ Σ_(k=1)^n p_k;

then

S(ρ^(n)) = −λ_n⁻¹ Σ_(k=1)^n p_k (ln p_k − ln λ_n) → S(ρ) for n → ∞.

B. Concavity

Concavity states that for the density matrix ρ = λ₁ρ₁ + λ₂ρ₂ (ρ₁, ρ₂ density matrices, λ₁, λ₂ ≥ 0, λ₁ + λ₂ = 1)

S(ρ) ≥ λ₁S(ρ₁) + λ₂S(ρ₂)   (2.1)

(see Lieb, 1975, or Ruelle, 1969). [There is equality for λ₁, λ₂ > 0 only if ρ₁ = ρ₂, or if S(ρ₁), or S(ρ₂), respectively, is equal to ∞.]

The proof is very simple. Let ρ = Σ p_k |k⟩⟨k| and s(x) = −x ln x. Then

S(ρ) = Σ_k s(⟨k|ρ|k⟩) = Σ_k s(λ₁⟨k|ρ₁|k⟩ + λ₂⟨k|ρ₂|k⟩) ≥ Σ_k [λ₁ s(⟨k|ρ₁|k⟩) + λ₂ s(⟨k|ρ₂|k⟩)] ≥ λ₁ Σ_k ⟨k|s(ρ₁)|k⟩ + λ₂ Σ_k ⟨k|s(ρ₂)|k⟩ = λ₁S(ρ₁) + λ₂S(ρ₂),

where we used the concavity of s(x) and the matrix-element inequality of Sec. I.A.3. Of course, we did not use any special property of s(x) besides concavity; thus, for any concave function f, the mapping ρ → Tr f(ρ) is concave.

Why is concavity considered to be important? Entropy is a measure of lack of information. Hence, if two ensembles are fitted together (what in mathematical language is described by the convex combination λ₁ρ₁ + λ₂ρ₂), one loses the information that tells from which ensemble a special sample stems, and therefore entropy increases (Wigner and Yanase, 1963).

Let us illustrate things by a simple example. Let ρ₁, ρ₂ be one-dimensional projections, i.e., pure states. In a case in which λ₁, λ₂ > 0, ρ = λ₁ρ₁ + λ₂ρ₂ is a mixed state (unless ρ₁ = ρ₂). Therefore, S(ρ) > λ₁S(ρ₁) + λ₂S(ρ₂) = 0. By the way, in that case it is no longer possible to reconstruct ρ₁, ρ₂ from ρ.

Mixing of pure states (the forming of convex combinations of them) yields a mixed state. More generally, it seems to be plausible to argue: if we are given two mixed states that are unitarily equivalent (ρ₂ = U*ρ₁U), then mixing of them yields a new state that is more strongly mixed than the two original ones, and S(ρ) ≥ λ₁S(ρ₁) + λ₂S(ρ₂) = S(ρ₁). We will make this consideration precise in the next section.
invariance property. It should be anticipated that concavity is a consequence
of subadditivity (Sec. F).
If p is of finite rank, i. e. , if only finitely many eigen-
values are &0, then S(p) &~. Now let p be an arbitrary Clearly concavity generalizes as foll. ows: let p„
density matrix. The canonical approximations are ob-
tained as follows:
p„. . . , p„be density
~0, with ZX, =1.
matrices,
Then
X„A.„. . . , A.„numbers
p'"'=- g p, l»(ul g p, . S A.
] p] & A]S pq (2. 2)
S(p) =S g fk
l( p "Q"' ~—
g ia
l(. p" (Ink, + lnp„" )
It should be remarked that this fact allows an axiomatic
characterization of entropy: I et 4 be a mapping of the
set of density matrices into the extended real half-line,
= —
Q q inn, —Q l(, Q p(, "Inp,"' which fulfills the invariance and continuity properties of
Sec. A. Also let M = H, S ~ ~ s SH„, p, = density matrices in
— H, p= l(, p, (B ~ ~ ~ SX„p„. If C (p) always satisfies 4(p)
=
Q X( Ink(+ Q +(S(p() ~
A. 4
, +4 A, A being a diagonal matrix in the Hil-
&
- j„(p
„.
have (using P' =P, + P, )I„(P . . , P2) =P'j, (P ~/P', P2/P')
p+) g.
/(1 —P'), . . . , P„/(1 —P' )) + I, (P', 1 —P')
n
+os ~ (2. 6) (1 p')
&p lplp a.ndj„, (p', p„. . . , p„) =p'I, (l)+I, (p', 1 —p')
+ (1 —p')I„, (p, /(1 —p'), . . . , p„/(1-p')), hence I„(p„.. . , p„)
One ean prove this as follows: l. et g„.. . , g„be another
set of pairwise orthogonal, normed vectors, all of them
=j„, (P, +P., P„. , y„)+ (P, +P, )j, ll, /(P, +I, ), P, /(II, +P, )).
See Fig. 6; the left-hand side refers to the first equality,
contained in the subspace spanned by . . , @„. Then P„.
it is ~~~~di~t~ly seen that Z ((t ( p I l
Now choose g„& l1&, l2&, . . . , ln —1& (remember that lk&
H(
=eigenvector of p belonging to the eigenvalue p~). This (I —dim. }
is always possible since the dimension of the subspace (2-dim. )
Pp
spanned by qb, ' ~ ~ qb„ is &n —1. Continue by choosing
I», l» -, ln-2&, 4.; 4. , &», , ln-3&, e„,
~ ".
Hp Hp
at inequality (2. 6). ((n-2} —dim. ) ((n-2) —dim. )
Define H " as the subspace generated
by P, H, . . . , P„H,
Pn
thonormal basis for H~"~. Because of Ky I'"an's inequality, FIG. 6. Illustration of the preceding identity.
=c ——ln —— 1 ——ln 1 —— choose c =P„(p'). Then Trf (p') =nP„(p') +Q;" „„P„(p')
& nP„(p')++~"=„„p.(p) ~
Trf(p) since P„(p') &P„(p).
which completes the proof (see Fig. 7). Similarly, p ~ p' if, and only if, for every convex
function f, Trf(p) & Trf(p'). (Unfortunately it is not
C. Uhlmann's theory true that Trp ~Trp'~ for all P: 1 ~P&~ implies that
p ~ p'. )
On seve ral occasions we have already met the notion of Now' let us consider another example of mixing-en-
mixing-enhancement. It states that for the eigenvalues of hanc ement:
two density matrices p and p', arranged in decreasing (5) Let U„. .. , U„, ... be
unitary operators, A.; ~ 0,
order, the inequalities Z&; =1. Thenthe convex combination p= &, U,"p'U,
p, (p)- p, (p'), p, Q)+p. (p)- p, (p')+p. (p'),
+ ~ . ~ +~„U„*p'U + ~ .
is more mixed than p'.
(2.7)
This is again a consequence of Ky Fan's inequality.
p (p)+ +p (p)- p (p')+ +p (p')
Let Q, be the eigenvectors belonging to the P~(p). Then
hold. One then says that p is mo re mixed then p' (or ~, (p). ~ ~
'P.
(p) =Z."-,Z;~, {e.IU;*p'U;~3
more chaotic), or that p' is purer than p, and writes =ZZ~;«;e. p'Ue. = ZZ~, p, (p ) =Z",=,u. (p').
This explains the word "mixing-enhancement. " Re-
l &
amount of information in any sensible interpretation. trix elements reduces the information and increases the
(Cf. also Sec. G. ) Now p is obtained from the U*p'U degree of mixture. 'The proof is easily obtained by
by a mixing procedure, hence there is a loss of infor- means of Ky Fan's inequality.
mation. (Note that the constituents of p cannot be re- The coarse-grained
constructed. )
density matrix p =+A;P, (A. ; TrP;
By the way, in example 5 it is not necessary that the
= TrpP, , cf. Sec. I.B isi+P, pP;, . hence
s. p; there-
fore not only S(p ) & S(p), but also Trf(p ~) & Trf(p)
U, be unitary; they may be only isometric (i. e. , U,*U;
=1)
for any concave function f.
(Remember Fig. 3:
both mappings are mixing-enhancing. )
Clearly the relation ~ is transitive and reflexive, i. e. , .
Et is worth mentioning that Uhlmann's theory has been
a preorder. Thus it generates an equivalence relation: generalized to arbitrary von Neumann algebras by
p- p' if, and only if, p ~ p' and p'I- p, hence if p and p' Wehrl (1975), Alberti (thesis, 1973), and Uhlmann him-
have the same positive spectrum. This equivalence
self. It turns out that this theory provides a powerful
relation may be regarded as the most general concept tool in the investigation of tPe structure of von Neumann
of invarianee (Sec. A): from the entropy, or more gen-
algebras and, in a certain sense, is the "dual" of the
erally from the information-theoretical point of view,
von Neumann —Murray dimensidn theory.
density matrices with the same positive spectra are
equally good.
Uhlmann's theorem states that, in essence, mixing- D. Continuity properties
enhancement is always produced by the mechanism
described in example 5: p ~ p if, and only if, p is in In infinite-dimensional Hilbert spaces, entropy, as a
the (weak) closure of the convex hull of {U*p'U: U function of density matrices, is discontinuous in the
usual topologies. There are only a few restricted con-
unitaryj.
tinuity properties. The problems that arise in this con-
We will only sketch the proof. It consists of four nection may be divided into two groups:
steps: (1) Those which are of more mathematical interest
(1) The set A of all operators. A & 0 such that P, (A) and which we will not treat in great detail here.
- p, (p'), p, (A) +p, (A) - p, ( p') +p, (p'), . . . is by virtue of (2) Technical considerations that are of use in ex-
Ky Fan's inequality convex and weakly closed, hence tending theorems that can be proven for finite-dimen-
weakly compact. sional matrices, to the general case. (Cf. the end of
(2) Its extremal elements a.re exactly those A for Sec. A. For a typical example, see Sec. III.A. )
which P;(A. ) =P;( p') for all i or P;(A) =P;(p') for i From section A we already know insensitivity. Other
& n, p;(A. ) = 0 otherwise. restricted continuity properties are:
(3) Apply the theorem of Krein and Milman: Louer Semicontinuity. (This fact seems to have been
= closed convex hull of the extremalA. . well known for a long time, but was written down only
(4) All extremalA. are in the weak closure of by Naudts, 1969; Wehrl, 1976. For other proofs,
EU*p'W. cf. Secs. III.B and IV. B). Let p„, p be density matrices,
such that Trip —p„l- 0. Then S(p) &liminfS(p„).
In the finite-dimensional case, there is another way
of proving the theorem invoking Birkhoff's theorem Ky Fan's inequality tells us that, for the eigenvalues
mentioned in Sec. I. B. Namely, for two sequences of of p„, or p, respectively, arranged in decreasing order,
numbers n, & n, & ~ & n„, or P, & P, & ~ & P„(n; & 0, P& . +p. (p)
p, (p. )+. . +p&(p. )-»lp. —pl+p. (p)+
~ ~ ~ ~
forward application of Birkhoff's theorem yields hence l(p&(p) —p. (p. ))+ . +(p, (p) —pa(pn)l - 0 and
Uhlmann's theorem. eventually p~(p„) —p„(p). Thus
I.et us now make a short remark on the order struc-
ture of density matrices (this expression is due to —
Q p, (p) lnp, . (p) = liml. — Q p, (p„) lnp, (p„)j
Thirring, 1975. The lattice structure of density ma-
trices was recognized by Wehrl, 1974): and
For any two density matrices p„p, there exist (up to
equivalence) a "purest" density fnatrix ~ p„p, and a
"most mixed" one &p» p, . Thus the equivalence classes S(p) = suPQ ( —p, (p) lnp, .(p))
of density matrices form a lattice. Its "purest" element
clearly is the equivalence class of the pure states. A
most mixed element does not exist in infinite-dimen- & lim inf supp ( —p, (p„) lnp;(p„)) .
In.
sional Hilbert space, only in the finite-dimensional
case, namely, p = 1/dim H.
Remark: It is also true that, if p„"~"-p, S(p)
Next, let us generalize example 3: let P; be a family
&lim infS(p„), prozrided that p is a density matnx
of pairwise orthogonal projections (not necessarily one-
this case, also Trip„—pl-0. If p is not a density ma-
dimensional) with QP; = 1. Then gP;pP, ~ p.
trix, it can happen that S(p) & lim infS(p„)(Wehrl, 1976;
This means intuitively that deleting off-diagonal ma- see also Davies, 1972, and dell 'Antonio, 1967).
Unboundedness in Every ghborhood. Let p be a One can even dispense with the requirement that
density matrix and e &0 be an arbitrary number. Then Trp„H- Tro H if Tre "& ~ for all P: 0 & P & ~. Then
there always exists another density matrix p' with S(p) in continuous on the sets (p: TrpH «C & ~], even if
Tr~p —p'~ &» andS(p') = ~. (Clearly this implies that Trp„H- TrpH. Namely, Trp(PH) —S(p) & lim inf[Trp„(PH)
S(p) i s disc ontinuou s. ) —S(p„)]; hence —S(p) & lim inf [—S(p„)] + lim sup~ Trp„(pH) ~
In the earlier sections the special role of entropy did tributions p, (w2), or p, (w, }, from a distribution p(w„w2),
not appear; rather, the traces of any concave function by integrating over the other variable:
was taken to be more or less as good as entropy. How-
ever, the property of additivity distinguishes entropy
among all functionals of the form p- Trf(p), where is f
p, (w, ) = f dw, p(w„w, ) (2.11)
a measurable function: if Trf(p, p, ) = Trf(p, )+ Trf(P, ), and vice versa.
then f(p) = constp lnp. Now subadditivity states that
Due to the assumption Zf(p„q;) =Zf(p„)+Zf(qj) S(P) &S(P, )+S(P,) =S(p) HP2) . (2.12)
=Q[q~f(p„)+p„f(q, )] fo.r all sequences p„, q&, hence
f(P, q ) =q;f(P, )+P,f(q;). For g(x) =f(x)&x, g(P. q, ) This appears plausible since, when forming p, and p„
=g(p„)+g(q, ), i.e. , g(x) = const lnx, f(x) = constx lnx. one loses the information about the correlations. (Also,
There are, of course, other additive functionals of p, one cannot reconstruct p from p, and p, .) However, it
'but they are not of the form p- Trf(p). is false that p, (3 p, )- p. (Lieb, private communication.
An example is provided by the so-called n entropies.
[In the classical case they were introduced by Renyi.
This follows also from the fact that the n entropies for
o. o 0, 1 are not subadditive; see below ).
proof is obtained from the inequality for the relative
A
See Renyi, 1966. I or the quantum-mechanical case,
see Wehrl, 1976; Thirring, 1975. The case n = 0 was entropy (1.41) S(p, (2 p, p) -0. Now S(p, (3 p, p)
~ ~
Inm(P1 q1»
. Paqf». Pn qm) or, with P, =y, P, =x,
= I„(p„.. . , p„)+ I (q„. . . , q.), (&-~&f( ) +g(~&=(&-»f(& ) +&(»
whereas subadditivity states that
Differentiating with respect to x and y yields
I„(r„, . . . , 2„,, . . . , 2„)&X„(p„.. . , p„)+I (q, , . . . , q ) .
In the last relation, the x~z are a double sequence of
non-negative numbers with Z„, J22f =1, and P; =Z22';~,
(1 —x)' 1 —x 1 —y2 -(, )
(One can prove that all derivatives exist. ) For s =y/
q; =Z„2». As in Sec. B, the first part of the proof con- 1 —x, t=x/1 —y, this becomes
sists in showing that, for 0&p&1, the information func-
tion I,(p, 1-p) =aS(p, 1 —p)+A, ; with S(p, 1 —p) = —plnp f
s(1 —s) "(s) = t(1 —t) "(t) f .
-(1 —p) ln(1 —p) and a and A2 being constants. For this, The left-hand side depending on s only, and the right-
we need three lemmas:
hand side depending on t only, both sides must be a con-
(1) Let f(p) =I (p, 1 —p). We note that f(p) =f(l —p). stant, and by integration one arrives at
From the next lemma it follows that is nondecreasing f
in [0, 1/2] and nonincreasing in [ 1/2, 1], and that it is f(t) = —a[ t (ln t —1) + (1 —t) (ln(l —t) —1)] + bt+ C .
concave.
Because of the symmetry, 5=0, and thus we have ob-
(2) Symmetry, additivity, and subadditivity can be
used to obtain the inequality tained the expression for f(p) asserted above, with A. ,
=C+a. How do we get from I, to I ? Suppose all P,-&0
I.(1 —q, q) —g(1 —P)(1 —q)+P(1 —~), (1 —P)q+P~) and consider the expression
I (p(l —q), pq, p„. . . , p
&
I2(1 —q, q) —I2((l —p)(1 —q) + p(1 —2.), (1 —p)q+ pz) From this one concludes easily that it is a constant, A
- I.(P(1 —q)+(1-P)(1 —3), pq+(1-P)2') - I.(1 —2", 2) . Therefore
Take, as an example, H, = H, = C, @„P, (or g„g„re- It should be noted that the triangle inequality is false
spectively) = orthonormal basis in H, (or H.„respective- in the classical continuous case, because the analog of
ly). Let p= projection onto 2 above does not hold.
Now let us describe some applications of subadditiv-
ity.
for the right" classical continuous entropy (not the con- Does there exist a. limit S(V)/i Vi (i Vi being the volume
ventional one), i. e. , for p" as defined in Sec. I. A(Wehrl, of V, or the number of lattice points in V, respectively),
1977). as V-~ in some suitable sense (for instance, in the
Let us come back to our example above (p pure, p, sense of van Hove; see Sec. I. C), provided that the sys-
not). Two remarks are appropriate: tem is translationally invariant, i. e. , S(V+ a) =S(V) for
(1) If p is pure, then S(p, ) =S(p, ); moreover, the posi- all a(= R", or Z", respectively.
tive spectra of p, and p, coincide. To begin with let us consider the case of a one-dimen-
sional lattice system. I et V be an interval of length l:
Let Q„g~ be orthonormal bases in H„or H„respec-
V=(k, @+1,. . . , k+ l —lj. Because of translational in-
tively. Let y be the vector Zc&„p&@g . p=—iaaf)(pi
variance, S(V) is a. function of l only: S(V) =F(l), l — = Vi.
I
i
=Bc,~c&~i @,)(&f&& ~. Let C be an infinite matrix with en- By subadditivity of the entropy, is also a subadditive
function of l:
tries c,~. The eigenval. ues of p, equal those of CC*.
Similarly, one finds that the eigenvalues of p, equal those F (l, + l, ) ~ F (l, ) + E(l, ) . (2. iS)
of C*C. But it is well known that. CC* and C*C have the
same positive spectrum. Using a classical theorem of analysis (e. g. , Polya and
Szego, 1970) one concludes that the limit F(l) jl exists:
(2) Given p„one always can find a. Hilbert space H, and
F(l) = lim
. S(V) . . F(l)
a pure density matrix in Hy(3H2 such that p, = TrH.,p. lim = s, with s =—inf (2. 16)
g ~ce
Let p, =ZP, @,)(P, i. Take for H, a Hilbert space with
i
The same argument would work in the continuous case
the same dimension as H„and with an orthonormal bas-
Is (Z being replaced by R) too, provided that one would
ki~ 4~~ ~ ~ ~
have some bound on E(l). However, subadditivity is not
sufficient to provide such a bound.
'I p= I x&&~ I x—+
= ~P»
0 ~ «. As an example, consider for V intervals [a, b) and de-
fine "S"([a,b]) = 0 if b —a is rational, =™if b —a is ir-
From remarks 1 and 2 one can derive the triangle in-
equality (Araki and Lich, 1970) which gives a partial
r ational.
We will see in Sec. III.A that such a bound is in fact,
compensation for the failure of monotonicity:
for quantum systems, provided by strong subadditivity,
iS(p ) —S(p ) i
- S(p)- S(p, )+S(p, ). (2. 13) which is a sharpening of the subadditivity property dis-
cussed in this section. Namely, strong subadditivity
(Of course, the right-hand side is merely subadditivity. )
yields, for /' &lo, the inequality
We want to prove the inequality S(p, ) & S(p)+S(p, ); in-
terchanging remarks 1 and 2 yields the rest. p is a den-
F(l') +F(2l, —l') ~ 2F(l, ), (2. 17)
sity matrix in Hy(SHp Due to remark 2 there exists a hence in the quantum case, E(l') ~ 2E(l, ), because the
Hilbert space H, and a pure density matrix 0 in. HyH2 quantum-mechanical entropy is always &0.
..Tr„...
H, such that p= Tr o'. Let o', = p. S(o, ) =S(p) But inequality (2. 17), which also holds in the classical
because of remark 1. p, = TrH. ,o'. S(p, ) =S(o'»), o» case (see Sec. III. A), allows us to prove the existence
—
= TrH o'. By subadditivity, S(p, ) =S(a'») ~ S(p2)+S(o'3) of the mean entropy even for classical continuous sys-
=S(p, ) +S(p). tems, where it is not true that E(l') ~ 2E(l, ). We have,
(2. 19)
If S were monotonic there would be no need to refer (&) =
to strong subadditivity in order to establish the exis- 1)
tence of limF(l)/I. But monotonicity is generally true Since in a classical theory there is no need of introduc-
only in the classical. discrete case. It will turn out, how- ing the 'correct" normalization condition (1.3) one
ever, that it is also true in our case of translationally chooses it as follows:
invariant systems; again, this result relies on strong
subaddit ivity.
Nevertheless there are some instances where some
sort of monotonicity can be proved even without any
e- fVi
f " dd, dd„pt"' =1 (2. 20)
'
I.A. ) Letus drop the assumption d=1. Remember that "classical Fock space. " It is very different from the
H(V) = H„, H„being Hilbert spaces of fixed finite di- grand-canonical entropy of Sec. I for two reasons: (i)
mension, say v. One readily verifies that the normalization condition, and (ii) the kinetic energy
is omitted.
One can show, extending our previous arguments, that
Let VC: V'. By subadditivity, S(V') ~ S(V)+S(V"), V" the analog of (2. 18) holds:
= V'gv), hence
.
s. „,(v) s. „,(v) . v v'~ v, (2.22)
and also that subadditivity (and even strong subadditivity)
mechanical case, let
holds.
S(V) =S(V) — V~ in~. ~
Namely,
ee- )V)
dq, ~ ~
dye(l —pcs') (by Klein's inequality; cf. Secs. I.A and I.B) = 1 —1 =0 .
~=o
'The second inequality follows from the first one and subadditivity: For VC V',
. .
s, „,(v') -s, „,(v)+s. „(v . i v) -s..„,(v) .
Turning to subadditivity, we first have to define S„„,(V) and pP' in terms of pg':
~g, N
yl yu pP ( 1& ' & N&yl»yhf)
etc (V"=V'g. V, x;c V, y~c=- V"). Similarly, S„„,(V") and pt~f,' are defined We have.
since every point q(= V' must either belong to V or to V", and
t'~ l
r N
dq dq =
K=Q
g I
( 3 V
dXj ~ ~ ~ dX
e-
.
S, „,(V) =- g
I VI
VN
dx, ~ ~ ~
dxN pp'(x„. . . , x„)Inp&~"'
~ ~ ~
e- lvl e- V" I
l
dX 1
NpM
¹& M.
~ ~ ~ CfXN
N
VgrM
1
~ ~ ~
AM
No te that IVI+ IV" I= IV'I ~ A similar formula holds for S, ,(V"), and therefore
S„„f(V)+S„,~(V") —S„,f(V')
e- l V l
(N+ M)
dX ~ ~dX N y1 dyNP'v) (x1) ' ' )xN)y1) ' ' ' )yhl)
N! M! pgM
(again by Klein's inequality) ~ 1 —1 =0. To illustrate this let us consider 9 or the configura-
Let us return to our example and indicate how mono- tional entropy. For simplicity we will use the same
tonicity can be used in order to establish the existence letter 8 for both S itself and the configurational entropy.
of Iim~(V)/IVI or IimS„,r(V)/IV (Of course, forquantum Let a =(a„.. . , a~) be a vector in Z ~ or R~, and let V(a)
lattice systems the existence of limS(V)/ V also implies
I
I I
be the box $xc Z~ or lR~: 0&x,. & a,. Let us also ).
the existence of limS(V)/ V I.) We shall consider the
I
define
above case (a) only (for the configurational entropy things
S(V(a))
work in quite the same manner) since for our following
! V(a)!
arguments we need only relations (2. 18), or (2.22), re-
spectively. Suppose now that a sequence of volumes V tends to in-
Choose &, l„
l, and n as before in example 1, re- finity in the sense of van Hove (cf. Sec. I.C). We choose
placing S, however, by S in the definition of P(E), thus a, such that B(V(aQ)) ~ IV(aQ) I(s+&). Define nv, or nv,
defining s = infP(l)/l, etc. Inequality (2.1V) is then re- respectively. , as in Sec. I.C. By assumption, nv/n'„- l.
placed by Monotonicity and subadditivity imply in the same way as
S'(l)
before that
Z(nt, ) nl, Z(l, )
l 2 20 n-„-
S(V} ~ —"
(s+e) . (2.24)
IVl
Consequently,
It remains to show an inequality of the kind
llm
&(I) . I'(I)
lnf
S(V}
lvl
) n'„-V
(2.25)
exists.
which would be a consequence of the inequality
3. Dimensions & 't S(1' ) - s I V( . I,) (2.26)
In the case of dimension &1, subadditivity is definite- where I"v denotes the union of the n+„ translates of V(a, )
ly too weak to establish the existence of the mean en- that cover V. However, the latter inequality can only be
tropy, even in lattice systems. Also monotonicity is not obtained by invoking strong subadditivity (cf. Sec.
suf f icient. III.A).
1957). However, as we have already discussed in Sec. P(l') 2F(l) .& (3.3)
I.B, one has to be careful with such arguments because
This is true because any interval of length E' can be
they only make plausible, but do not actually prove, the
maximum entropy principle.
represented as the intersection of two intervals of
length 7; thus by strong subadditivity
On the other hand it is amusing to note that in practical
applications of information theory, such as in technology, F(l') & F(l') + F(2l —I') & F(l) + E(l),
biology, etc. , the second law of thermodynamics has
On the other hand, a:s we have seen in Sec. II.F, the
been adopted and there called the negentropy principle
second inequality of the latter relation is also sufficient
(see Brillouin, 1962). Thus we find a mutual interaction
between physics and information theory rather than a to prove the existence of the mean entropy in the classi-
perfect understanding of statistical mechanics on the cal continuous case (see Sec. II.F).
grounds of information theory.
The following remark, due to E. Lieb, applies: let
x =2l —I', y = I' in the last formula. Then 2E(x+y)/2
~ E(x)+E(y), i.e. , E is weakly concave. To show that
III. STRONG SUBADDITIVITY AND LIEB'S E is concave, i.e. , F(Xx+(I —X)y) ~'A F( x)+( I—X)E(y),
THEOREM it is sufficient to have I bounded above in any interval.
A. Strong subadditivity Conversely, if F is concave, this implies strong sub-
In Sec. II.F it turned out that mere subadditivity often additivity.
is too weak a property, and that strong subadditivity is (2) This problem is closely related to the problem of
needed. By this, the following is meant: given three monotmzicity of the quantum-mechanical entropy, i.e. ,
of proving that F(I') &F(l) (cf., our remarks of Sec. II.F).
„
Hilbert spaces H„H„H, let p be a density matrix in
H1 H~(3 H3. Define the partial tra If there is no translational invariance, we already have
seen that this need not be true, However, if the system
p» —Tr„p, etc. (In order to have a less cumbersome no- is translationally invariant, then one can use strong sub-
tation, f rom now on instead of p we will write p»„ instead
additivity to show that
of Tr„„we
Hp H3 will write Tr», instead of Tr„weH&
will
write Tr„etc.) Then F(l) —F(l') F(l+m) —F(l'+ m)
S(pz2, )+S(p, ) &S(p„)+S(p„) . (3.1) for every m ~ 0, in particular for m =zz(l —l'), zz being an
If H, is one-dimensional, this reduces to normal sub- integer. Consequently,
additivity. 'The same inequality holds in the classical — P [ F(l + zz(l —I ')) —E(l '+ n(l —I'))]
case, there being given three "phase spaces" Q„Q„Q,; E(l) —P(l ') ~ ~ n=i
p„, is a probability distribution in 0, x 0, Q, , p, (w, )
—[F(I+a(f —I')) —F(I)]
&&
dw, dw, dw, p», (lnp», —lno) ~ 0, 3. Now let us consider translationally invariant sys-
tems in dimensions &1. If we are given a lattice system
valid for every probability distribution o, and to take and consider a. sequence of boxes whose lengths tend to
o = p„p„/p, . Then infinity, then again, as in example 1 of Sec. Q.F,
limS(V)/I V exists. I
H, such that, according to remark 2 of Sec. II.F, there So we have got a proof of the concavity of the condi-
is a pure p»34 in (H, S H, H, ) (3 H, such that p»3 tional entropy by assuming the va, lidity of the lemma, .
T r4 p»34 Then
Unfortunately, the proof of the latter is not easy at all
+ S2 —S~2 —8~4 (see Sec. C).
by (3.5), which establishes strong subadditivity. B. Relative entropy
One might think that there are other inequalities of the
We have met the concept of relative entropy, which in
type of the above ones, for instance between S»3+Sy+S2 general form is due to Umegaki (1962) and Lindblad
S3 and S» + S» + S», but thi s is not the c ase . Al so S»
(1973), on several occasions already, the first being in
Sy S2 is neither c one ave nor convex. For a further
Secs. I.A (as a special case of the generalized Boltz-
discussion of which inequalities are true and which are
mann —Gibbs —Shannon entropy) and I.B. [in our dis-
not, see Lieb (1975). cussion of the free energy E(p, P, H)].
'The above is the original proof of Lieb and Ruskai of Remember that it was defined as S(ol p) — = Trp(lnp
strong subadditivity. There is another way, due to —1no). We have proven that S(vip) ~ 0 for all
Uhlmann, of proving strong subadditivity from the con- density matrices cr, p', by the way, going through our
cavity of S» —S, . Let all Hilbert spaces under con- proof of Klein's inequality, one sees that S(crlp) =0 if
sideration be finite-dimensional. Now, as above, S», and only if 0 = p',
S23 i s concave . Denote by dU3 the H aa r m ea, sure of The second important property is joint convexity for
the group of unitary operators in H3. Then density matrices pa~ p2~0j~a'2 and A. : 0 «A « I,
3p»3 3
— p» I
-
S((alp) )(.S(o, p, ) + (1 —)).)S((x, lp, ),
l (3.6)
d
where o =—)(.o, + (1 —X)cr„p=
—)(.p, + (1 —)()p, .
(d, =dimension of H, ; U, is identified with 1(g)U, ). Thus Joint convexity arises from Lieb's concavity theorem,
which we will discuss in the next section. The latter
(dU (S„—
S„)(U p U, ) - (S —5 )
I
p Sl), states that TrKA'K~B' ', for 0 «t «I a,nd A~O, 8& 0,
and any K is jointly concave in A and B. Hence setting
K = I, taking the derivative for t =0, one finds that
S», —S» «(S» —lnd, ) —(S, —lnd, ) . Tr(A'B' ') l, , = TrB(lnA —lnB) (3.7)
dt
So what we have to do is to prove the concavity of S»
—S, . We will do this for the finite-dimensional case; is concave, or S(alp) is convex.
the general case follows from an application of our re- As a consequence, the conditional entropy S» —S, is
sults of Sec. II.D. 'To make things more transparent, concave: suppose all Hilbert spaces to be finite-dimen-
let us also abuse language and write p~ instead of p, sional. Then
(SI in H~(3H~. I
'The essential ingredient of the proof is the following. S» —S~ = -S' p» p ~ + lnd
d2
I.emma (Lich, 1973b). For finite-dimensional ma-
trices, the mapping A- Tr exp(K+InA) (for A &0, & self (d, = dimension of H, ), observing that ln(p, I/d, )
adjoint) is concave. =Inp~ 001 —1(3(lnd2). (The transition to the infinite-di-
Now let p» —)).p,', + (1 —)(. ) p,",(0 «)(. « I). Define mensional case follows the methods indicated in Sec.
6 = Trp»(lnp» —lnp, ) —)). Trp» (lnp~, —ln p', ) II.D. )
There is a representation of S(alp) in which the argu-
—(1 —)() Trp,", (lnp,", —lnp,'), ment of the trace does not contain a product of two non-
E' = Trp~2 (Inp» —Inp~ —Inp~2+Inp~), commuting operators:
and b. " similarly. b. =AD'+(1 —A)I), '.
We want to show S(ol p) = sup S„(olp),
tha. t A «0, or e «I. Because of the convexity of the
exponential function, S,(o l p) =-(I/)(. )[S()(o+ (1 —~) p) —)(. S(o) —(1 —) )S(p) J (3.8)
Rev. Mod. Phys. , Vol. 50, No. 2, April tS78
Alfred Wehrl: General properties of entropy 251
valid for ordinary entropy, for skew entropy, namely, -(yl(S, +T, )ly)"(yl(s, +T, )llr)'/'because of Schwarz's
concavity in p. [Of course, it cannot be expected that inequality sls, + t, t2 & (s', + t, )' '(s', + t', )' '. (s, =- lls', "y
etc.) Hence IIRl QR2' 'll -1~ by taking + =R, ' 'x,
II
inva. riance holds, except for the trivial statement that, ttt
—S((u ~ .
6„= .~ C "(u) —S((u v - ~ ~ ~ C" 1g) system (clearly, C then means "space translation"; Rob-
=S(C'" cu, u ~ . ~ ~C " 'cu) . (4. 6)
inson and Ruelle, 1967). Then, if s (the mean entropy)
is &~, KS and mean entropy coincide. This means that
Due to strong subadditivity, the KS entropy, in essence, is a, mean entropy, (By the
way, it is possible to generalize KS entropy by replacing
S(o. , 0) ~ S(n, Pvy) (4. 7) the group of discrete time translation by more general
for any partitions n, P, y [because of Eq. (15.5)], hence groups, for instance, Z~. Many of the important results
consequently, lim&„exists, and since then, after obvious modifications, remain valid. )
There is a serious problem with the KS entropy because
S((uv ~ v C" '(u) =S((u)+ &, + ~ + &„, ,
~ ~ ~
it refers to a discrete time evolution. In the more real-
— istic case of a continuous one-parameter group 4, of
lim (S((uv. v C" '(u)) = limS(C" '(u, &uv ~ ~ v C" '~) time evolutions the construction presented above does
n
not work for two reasons: (a) it is not obvious by what
= lim &„=s(~, C ) . (4. 8) quantity S(wvC &uv vC" 'u) has to be replaced. In pa, r-
The entropy of C (KS invariant) is defined as ticular, there may arise measurability questions be-
cause in the continuous case uncountable unions and in-
s (C ) =— sups (~, C ), (4. 9) tersections of the sets C,Q& are involved which need not
the sup being taken over all finite partitions co. be measurable. (b) If we adopt the view that the KS en-
It should be noted that, in contradistinction to usual tropy is a mean entropy, then, certainly, in the con-
notions of entropy, this kind of entropy is not a function tinuous case strong subadditivity enters in a very essen-
of a state but rather a function of the dynamics of the tial way. Thus, in any case, the construction of an anal-
system. og of the KS entropy in the continuous case must be much
The Kolmogorov-Sinai invariant has the following im- more sophisticated.
portant properties: As concerns quantum mechanics, one could think of
(1) It is an inva1'lant of the dynamical system in the fol- imitating the original method of Kolmogorov and Sinai
lowing sense: the system is described by Q, the Liou- according to our "translation table" of Sec. I.B. How-
ville measure p, , and the measure-preserving one-to- ever, this does not work in general. The difficulty lies
one mapping 4: Q- Q. Suppose there is another triple in the possible noncommutativity. In quantum mechan-
Q', p, ', 4 ' with the same properties, and an isomorphism ics, clearly a partition ~ has to be defined as a set of
f: Q-Q', etc. , such that the diagram pairwise orthogonal projections P„wit hZP, = 1. How-
ever, if we are given two partitions cu, =(PI and co, "}
=(P&"']-, then it is unclear how to define a, v w„since
the products I"& "I'&" in general will not be projections;
they will not even be Hermitian. Also the dimension of
the algebra generated by co, and co, can be exceedingly
large, so that in any case subadditivity arguments cannot
be used.
There is partial success in constructing a KS entropy
for quantum-mechanical K systems (Emch, 1976). They
is commutative. Then s(C') = s(C ). are analogs of the classical K systems (Kolmogorov,
(2) Kolmogorov's theorem. The partition o is called 1953), which are systems with a mixing property that is
a generator if the 0 algebra generated by the sets much stronger than the mixing property we have used
C (&)(m = 0, al, a2, . . . , A c &) is all measurable subsets in Sec. I. B. Unfortunately, this is rather lengthy to de-
of Q. Then scribe and demands a good knowledge of the theory of
von Neumann algebras, so I must refer the reader to the
s(C) =s(o, C) (4. 10)
original papers. There is also a construction for Ber-
(Kolmogorov, 1953, 1959). noulli shifts on the hyperfinite II, factor by Connes and
Before stating the next important property, one re- Stdrmer (1975).
mark should be made. Namely, all our considerations Recently, Lindblad (1977) succeeded in giving a de-
above apply to abstract dynamical systems too, where finition of a quantum analog of the KS entropy which is
Q need not be phase space or even any smooth manifold, not based on a noncommutative generalization of par-
but can be any set. Also 4 need not have anything to do titions but is rather analogous to the definition of the
with time evolution but can be any automorphism, for mean entropy for quantum lattice systems.
instance, space translation, or any symmetry operation. Besides its interpretation as a mean entropy, KS en-
There is the following theorem relevant to classical dy- tropy can also be taken as a measure of the strength of
namical systems. mixing of 4. Remember that
~oucknir enko's tkeoxem. The KS entropy of finite clas-
sical dynamical systems is finite. (For abstract sys- s(Q3, @) = lim [S ((dv ' ' ' v4 (d) —S (cd v ' 'v g? co)] .
tems it may be infinite. ) (Kouchnirenko, 1965, 1967.)
The construction of the KS entropy is very similar to Let n= 1; the following argument can easily be trans-
the one of the mean entropy in Sec. II. F. It can be shown ferred to the general case. If S(srvC u) =S(~), then C u
that for classical lattice systems one can find a trans- =m, i.e. , 4 leaves the sets Q, unchanged. If, on the
formation such that they become an abstract dynamical other hand, the difference S(mvC &o) —S(w) is big, this
(4. 12)
F IG. 10. Interpretation of the Kolmogorov-Sinai-invariant. We know that p„pp, hence S(p„)~ S(p). Performing
another measurement corresponding to another partition
co', one obtains
means that the intersection of every set 40, with the
original Q~ must be quite significant. (See Fig. 10. The
shaded area is 4Q„. we have not represented the other sets i.e. ,again a loss of information. For details see Wehrl,
4Q, ' ~ CQ, for reasons of clearness. ) Thus big KS en- 1977, and Staszewski, 1977. It also arises in other sit-
tropy means that the sets of any partition co get rapidly uations. Let co be the set of spectral projections of a
distributed over the whole phase space, and that the sys- Hamiltonian. H (and let us assume that there are no de-
tem exhibits strong mixing properties. generacies). Then, using the notation after Eq. (2. 2),
Similar to KS entropy are Kouchnirenko's A entropies: S,„(p, (u) =S(p„) .
let A be a sequence of integers a, &a, &a, & ~ ~ . Then .
IU entropy has (of course, besides invariance) many
s A. (m, C ) = lim sup —
n
[S(C "(ov ' ' ' V O' "M)] properties in common with classical discrete entropy,
n for instance concavity, additivity, and subadditivity (the
latter ones in some appropriate sense). There are also
continuous analogs of it (Grabowski, 1977). Since S(p)
s~ (4') = sup sA(~~ 4 ) . = inf„S~U(p, co), and =Sz„(p, (o) if and only if u consists
of the spectral projections of p, the quantity Sz„(p, &u)
They also are invariants of dynamical systems(cf. —S(p) may be considered as a measure of noncommuta-
Arnold and Avez, 1969). tivity between p and the partition co.
Some concepts measuring the amount of information
have been described. The list is not exhaustive and it
B. Various other concepts is left to everyone to invent new such quantities. How-
ever, it will be very hard to establish their physical
On several occasions we already have met entropylike meaning.
concepts that were of a certain use, either directly in
physics, as, for instance, the coarse-grained entropy,
or in order to show that certain properties of the "right" C. Systems with infinitely many degrees of freedom
entropy were not as obvious as one might think at first.
Many theorems of statistical mechanics refer to the
Let me write down a short list of these concepts, as
infinite case, i. e. , systems with infinitely many particles
far as we were concerned with them, or as they seem
moving in an infinite volume. %e have seen that only in
to have a certain relevance for physics.
this case phenomena such as quantum-mechanical. er-
(1) Coarse-grained entropy (see Sec. I.B).
godicity, etc. can be expected to hold in a rigorous man-
(2) n entropies (see See. II. E). One property of n en-
ner
tropies should be added: for n&1, they are continuous,
i. e. , Tr~ p„—p~ -0
implies S (p„)-S(n). For fixed p,
the mappings n-S (p) are convex and decreasing; since 1. Description of infinite systems
S(p) = sup~» S (p), this provides a third proof of lower In Sec. II. F we obtained a description of infinitely ex-
semicontinuity of entropy.
tended systems by attaching to every bounded region V
(3) Daroczy and other entropies (see Sec. I.G). a Hilbert space H~ and a density matrix p~; thus we sup-
(4) Measures of noncommutativity (see Sec. III. C). posed the family of density matrices to be compatible.
(5) Inga. rden —Urbanik (IU) entropy (Ingarden and Ur- Remember that in the continuous case H~ was the Fock
banik, 1962; Ingarden, 1965, 1973). This concept in fact
space SH~(V), with H~~(V) being the space of symmetric
appeared very early, namely in the papers of the Ehren-
fests (1911), Pauli (1928), and von Neumann (1929), but (or, antisymmetric, respectively) square-integrable
functions g(x„. . . , xz), where the arguments x, were re-
was intensively studied in the 1960s. It arises in con-
nection with conslderatlons about the measul ement pro- stricted to V.
cess. Let ur =(P, } be a partition of one-dimensional pro- It seems to be quite natural to describe an infinitely
extended system in d dimensions simply by replacing V
jections, i. e. , commensurable "counters" in physical
language. Then a measurement yields the numbers p& by R". The Hilbert space then would be
= Trp P&, and the amount of information obtained by this H'(R~) = C 6 L'(R~) 6 [L'(R~) L'(R )] S ~ ~
. (4. 13)
measurement clearly is s(a)
This construction makes perfect sense. The unfortunate
(4. 11) thing, however, is that,in general, there is no density
matrix in this space describing the state. To be more For a Gibbs state this means that it depends on the tem-
precise, for any bounded region V, as in Eq. (1.9), perature.
H+(V)@H'(R )V) = H'(R ), (4. 14) We have not yet said anything about time evolution. Re-
garding that, from the algebraic point of view, the time
but there is no density matrix p in H'(R~) such that
evolution in 8 H (V)], the bounded linear operators on
~
Tr H ~(R~y v) p pv H (V),
If such a density matrix existed, then, for instance, the etHtT e- fHt )
particle density n = IimN(V)/~ V would be = 0. This
means that the Hilbert space (4. 13) cannot be the right
~
is else than an automorphism of the algebra A(V)
nothing
one for the description of the system.
= &! H(V)]; itis natural to consider the time evolution in
A also as an automorphism of A (or, better, as a one-
The algebraic approach (see Ruelie, 1969; Eckmann
and Guenin, 1969; Emch, 1972) now essentially proposes
parameter group of automorphisms Tt: A —A). In gen-
the following procedure:
eral there will not be a Hamiltonian II(=—A such that
Since it is at first unclear what the right Hilbert space etHtA e tHt-
of the system is, one should not worry too much about.
However, in the GAS construction performed with a
it. One should rather concentrate on the operators re- time-invariant state td, i. e. , an cu such that (d(A) = ~&(&g)
presenting the observables of the system. Let A(V) be
for all t, there exists a H„such that
the algebra, for all operators on H(V), for V bounded. If
V'~ V, then every operator T c A (V) can be identified (gg) eiH~t~ (~) 8-tH~t (4. 17)
with the operator T' = Tl on H(V') = H(V)Q H(V' jV),
hence A(U) c: A(V') (isotony). Define
In that case, Q„ is invariant:
:
A—U A(V). (4. 15)
[Note that H„ in general neither belongs to v„(A) nor can
(4. 18)
totically Abelian (Doplicher, Kastler, and Robinson, KMS states have attracted great interest in recent
1966; Huelle, 1966; Doplicher, Kastler, Kadison, and years both from the physical and the mathematical side,
Robinson, 1967), i.e. , that the commutator [v, A; B.]. and there is a rich literature about them (the study of
vanishes in some appropriate sense as t-+~. (There KMS states was initiated by IIaag, Hugenholtz, and
are different notions of this property which we do not Winnink, 196V). Let me just mention a few results.
want to discuss in detail here. So let us for simplicity (1) KMS states are automatically time invariant.
suppose that ll[v', A, B]11-0 as t-~.
) This is certainly (2') To a given state u there is exactly one group of
true for free systems where the commutator goes as time automorphisms r, such that a is KMS for them.
t ' '.
For systems with repulsive forces only, one can (However, there may be more than one KMS state for
expect even stronger commutation properties, and for a given time evolution. )
attractive forces, asymptotic Abelianness will pre- (3) KMS states can be decomposed into extremal ones,
sumbaly hold as long as the attraction is not too strong. i.e. , those that cannot be written as a genuine convex
If asymptotic Abelianness is true, then lim&u(C(v, A)B) combination of two other KMS states. These extremal
.
=lim~(r, A BC) =&@(A)co(BC), hence in the GNS con- KMS states are factorial, i.e. , w„(A)" is a factor.
struction m„(v, A} —co(A) times the unit operator in H„. [w (A)' =set of all operators on H„ that commute with
Now let co' be a state that is normal with respect to (d, all of 7t'„(A), m„(A)" =all operators that commute with
by which we mean that there exists a density matrix p' ~„(A)'. m„(A)' is called the commutant of m„(A), m„(A)"
in H„such that ~'(A) = Trp'~„(A). Then ~'(v, A) the bicommutant. "Factor" means that the center
= Trp'm„(v, A)- ur(A); i.e. , states not too far from a m„(A)'Q w„(A)" consists of the multiples of the identity
mixing state converge towards the latter. This is a only. j
rigorous result concerning approach to equilibrium (4) Factorial states (whether they are KMS or not) are
(Sec. I.B). always mixing.
Now take a bounded subvolume V. For any A A(V), c (5) m„(A)' and m„(A)" are anti-isomorphic. There is a
(A) = ~'(v, A) —~(A). Let p~, p~ be density matrices
&u',
— deep theory studying this symmetry; the so-called
on H(V) defined by Tomita- Takesaki theory, which is one of the mo st f ruit-
ful recent concepts in the field of operator algebras
(d (A) = Trp~A,
(Takesaki, 19VO).
u(A) = TrpvA,
and let p~(t) be the time evolution of pv, defined in an
obvious manner. Then 4. Stability.
Tr p„'(t) A —Tr pvA, We already have mentioned stability properties of
equilibrium states: small perturbations of the dynamics
and, consequently, Tr~pv(t) —pv~-0 (see Davies, 19V2; do not lead to global changes of the state. I et me sketch
Wehrl, 1976). As we have seen in Sec. II.D, this does one result in this direction (Haag, Kastler, and Trych-
not necessarily imply that S(p~(t))-S(pg, but under Pohlmeyer, 1974; Haag and Trych-Pohlmeyer, 1977;
some weak additional assumptions (which in general can for another approach, cf. Araki and Sewell, 1977).
be expected to be fulfilled) this will in fact be true. A small, local perturbation of the dynamics may be
It usually is not possible to define the entropy of the described by changing v, to v~", where ~~~" is defined via
state co of the whole system; any sensible definition its infinitesimal generator (which is in mathematical
would give S =~. However, in addition to the mean en- language a derivation of the algebra A) as
tropy (Secs. II.F and III.A), one can define the relative
entropy of two states by i —
et '
v' — +Ah
=i et 7'
This concept turns out to be very useful for infinite sys- infinite series involving time-ordered integrals of multi-
tems too; however, due to mathematical complications commutators. ) Let co (or m~", respectively) where &u~"
(one has to know about Tomita —Takesaki theory}, we is defined in a similar way to r ~", be a time-invariant
have to refer the reader to the literature (Araki, 1975). state of the unperturbed, or perturbed, system, re-
spectively, and suppose that for every h
3. KIVIS states
Iles~" cuff-0 as x-0 . (4.23)
~ (for s~all ~),
If f II[v, A, B] lid« ~, fll[v& "A, B] lldt&
In general, for an infinite system there exists no
then, for factorial states ~, they turn out to be KMS for
operator H belonging to A or which can be constructed some P. (The P comes in as some "modulus of stabil-
as a limit: of elements of A such that the time evolution ity. ") On the other hand, every factorial KMS state has
is given by A. —O' 'A. e ' '. Therefore one also cannot the stability property (4.23).
use Eq. (1.39) to describe Gibbs states. But one can
Let me close with a few words about a very general
use the KMS condition (Sec. I.B) in order to obtain an
concept of entropy that refers to von Neumann algebras,
analog of them: A. state (d is called a KMS state at in- i.e. , weakly closed*- algebras of operators containing the
verse temperature P if there is a function E(z) with the identity. [Examples are m(A)', w(A)", in fact von Neu-
analyticity properties stated after Eq. (1.47), namely mann algebras are exactly those operator algebras N
cu(Bv', A) =E(t), &u(v, AB) =E(t+zP) . (4.22) for which N = N".]
5. Segal entropy (Segal, 196O) Aczel, J., B. Forte, and C. T. Ng, 1974, Adv. Appl. Prob. 6,
131.
We have remarked at the end of Sec. I.A that this is in Alberti, P. , 1973, thesis, Leipzig.
.some sense the most general concept of entropy. It is Alberti, P. , 1977, to be published in Wiss. Z. Karl-Marx-Univ. ,
defined as follows: let N C B(H) be a von Neumann alge- Leipzig.
bra. Let 4 be a faithful normal-semifinite trace on N, dell'Antonio, G. F., 1967, Commun. Pure Appl. Math. 20, 413.
i.e. , a mapping of the positive part of N into [0, ~] such Araki, H. , 1975, preprint RIMS 190, Kyoto.
that 4 (R) w 0 if R w 0, 4 (tR) = X4 (R) (X ~ 0), 4 (R + S) = 4 (R) Araki, H„and E. Lieb, 1970, Commun. Math. Phys 18, 160 .
Araki, H. , and G. L. Sewell, 1977, Commun. Math. Phys. 52,
+4'(S), 4'(&*R&) =4 (R) for U =unitary, HN; furthermore 103.
if R„0R, then 4(R„) %4(R); and finally that, to every R Arnold, V. , and A. Avez, 1969, Ergodic Problems of Classical
there exists S c0, ~R, with 4 (S) & ~. Mechanics (Benjamin, New York).
(The usual trace Tr ~ fulfills all requirements; how Baumann, F., and R. Jost, 1969, in The Problems of Theore-
ever, there are algebras such that Tr T =~ for every tical Physics: Essays Devoted to Bogoliubov (Nauka,
¹ ¹
Fisher, M. E., 1964, Arch. Ration. Mech. Anal. 17, 377. Kouchnirenko, A. G. , 1967, Funkz. Analys i ego Prilojenija,
Frdchet, M. , 1938, Mdthode des fonctions arbitraires, theoric Moscow, 1, 103.
des evenements en chasse dans le cas d'un nombre fini d'etats Kraus, K. , 1970, Ann. Phys. 64, 311.
possibles (Dunod, Paris). Kubo, R. , 1957, J.
Phys. Soc. Japan 12, 570.
Gibbs, J. W. , 1902, Elementary Principles in Statistical Me- Ky Fan, 1959, Proc. Natl. Acad. Sci. USA 35, 652.
chanics (Yale University, New Haven, Conn. ). Landau, L. D. , and E. M. Lifschitz, 1966, Statistische Physik
Glasman, I. M. , and Y. I. Gubich, 1969, I"inite-dimensional (Akademie, Berlin}.
Linear Analysis, in Russian (Gostekhisdat, Moscow). Lanford, O. E., 1975, in Dynamical Systems, Theory and Ap-
Golden, S., 1965, Phys. Rev. B 137, 1127. plications, edited by J.
Moser (Springer, Berlin).
Gorini, V. , A. Kossakowski, and E. C. G. Sudarshan, 1976, J. Lanford, O. E., and D. Robinson, 1968, J.Math. Phys. 9, 1120.
Math. Phys. 17, 821. Lassner, G. and G. , 1978, to be published in Rep. Math. Phys.
Gorini, V. , A. Frigerio, A. Kossakowski, E. C. G. Sudarshan, Lebowitz, J.
, and E. Lieb, 1969, Phys. B,ev. Lett. 22, 631.
and M. Verri, 1976, to be published in Bep. Math. Phys. Lebowitz, J.
, and H. Spohn, 1977, preprint, Yeshiva Univer-
Grabowski, M. , 1977, to be published in Rep. Math. Phys. sity.
Grad, H. , 1958, in Handbuch der Physik, Vol. XII: Thermo- Lenard, A. , and F. J. Dyson, 1968, J. Math. Phys. 9, 698.
dynamik der Gase, edited by S. Flugge (Springer, Berlin). Lewis, M. B., and A. Siegert, 1956, Phys. Rev. 101, 1227.
Griffiths, R. B., 1965, J.
Math. Phys. 6, 1447. Lieb, E., 1973a, Commun. Math. Phys. 31, 327.
Guichardet, A. , 1974, Systemes dynamiques non-commutatifs Lich, E., 1973b, Adv. Math. 11, 267.
(Gauthier-Villars, Paris). Lieb, E., 1975, BuO. Am. Math. Soc. 81, 1.
Haag, R. , N. M. Hugenholtz, and M. Winnink, 1967, Commun. Lieb, E., 1976, Rev. Mod. Phys. 48, 553.
Math. Phys. . 5, 215. Lieb, E., and J. Lebowitz, 1972, Adv. Math. 9, 316.
Haag, B., D. Kastler, and E. Trych-Pohlmeyer, 1974, Com- Lich, E., and M. B. Ruskai, 1973a, Phys. Rev. Lett. 30, 434.
mun. Math. Phys. 38, 143. Lieb, E., and M. B. Ruskai, 1973b, .J. Math. Phys. 14, 1938.
Haag, B., and E. Trych-Pohlmeyer, 1977, to be published in Lieb, E., and W. Thirring, 1975, Phys. Rev. Lett. 35, 687.
Commun. Math. Phys. Lindblad, G. , 1973, Commun. Math. Phys. 33, 305.
Halmos, P. R. , 1956, Lectures in Ergodic Theory (Chelsea, Lindblad, G. , 1974, Commun. Math. Phys. 39, 111.
New York). Lindblad, G. , 1975, Commun. Math. Phys. 40, 147.
Hardy, G. H. , J.
E. Littlewood, and G. Polya, 1934 (1967), Lindblad, G. , 1976a, Commun. Math. Phys. 48, 119.
Inequalities (Cambridge University, Cambridge, England). Lindblad, G. , 1976b, Lett. Math. Phys. 1, 219.
von Helmholtz, H. , 18'54, reprinted in Popular Scientific Lec- Lindblad, G. , 1977, Preprint, Stockholm.
tures, 1962 (Dover, New York). Loschmidt, J.
, 1876, Wiener Ber. 73, 128.
Hertel, P. , H. Narnhofer, and W. Thirring, 1972, Commun. Maison, D. , 1971, Commun. Math. Phys. 22, 166.
Math. Phys. 28, 159. Martin, P. C. , and J. Schwinger, 1959, Phys. Rev. 115, 1342.
Hertel, P. , and W. Thirring, 1971, Commun. Math. Phys. 24, Maxwell, J. C., 1860, Philos. Mag. 19, 19; 20, 21.
22. Maxwell, J.C. , 1871, Theory of Heat (Longmans, London).
Hopf, E., 1932, Proc. Natl. Acad. Sci. USA 18, 204. Mayer, J. E., 1961, Chem. Phys. 34, 1207.
J.
van Hove, L. , 1949, Physica 15, 951. Mayer, B., 1842, Die Mechanik der lVarme (Ostwalds Klassi-
van Hove, L., 1950, Physica 16, 137. ker, Leipzig, 1911).
van Hove, L., 1962, in Eundamenta/ Problems in Statistical Mead, A. , 1977, J. Chem. Phys. 66, 459.
Mechanics, edited by E. G. D. Cohen (North-Holland, Amster- von Mises, B., 1931, Wahrscheinlichkeitsr echnung und ihre
dam). Anmendung in der Statistik und theoretischen Physik {Deuticke,
Hu»g, K. , 1963, Statistical Mechanics (Wiley, New York). Leipzig) .
Ingarden, R. S., 1965, Fortschr. Phys. 13, 755. Naudts, J.
, 1969, thesis, Brussels, unpublished.
Ingarden, R. S., 1973, Acta Phys. Pol. 43, 3. von Neumann, J.
, 1927, Gott. Nachr. 273.
Ingarden, R. S., and K. Urbanik, 1962, Acta Phys. Pol. 21, von Neumann, J.
, 1929, Z. Phys. 57, 30.
281. von Neuinann, J.
, 1932, Proc. Natl. Acad. Sci. USA 18, 710.
Jacobs, K. , 1960, Neuere Methoden und Ergebnisse der Ochs, W. , 1975, Bep. Math. Phys. 8, 109.
Ergodentheorie (Springer, Berlin). Ochs, W. , 1976, Bep. Math. Phys. 9, 135.
Jacobs, K. , 1963, Lectures on Ergodic Theory {Aarhus Uni- Ochs, W. , and W. Bayer, 1973, Z. Naturforsch. A 28a, 1571.
versity, Aarhus). Ochs, W. , and H. Spohn, 1976, preprint, Munchen.
Jaynes, E. T. , 1957, Phys. Bev. 106, 620. Pauli, W. , 1928, in E'estschrift zum 60. Geburtstage A. Sommer-
Joule, J. P. , 1845, Phil. Mag. 27, 205. felds, edited by P. Debye (Hirzel, Leipzig).
van Kampen, N. G. , 1962, in F'undamental Problems in Statis- Pauli, W. , and M. Fierz, 1937, Z. Phys. 106, 572.
tical Mechanics, edited by E. G. D. Cohen (North-Holland, Peierls, R. , 1936, Proc. Camb. Philos. Soc. 32, 477.
Amsterdam). Planck, M. , 1906, Vorlesungen nber die Theoric der
Katai, S., 1967, Ann. Univ. Sci. Budapest, Eotvos Sect. Math. Wa'rmestrahlun g (Barth, I.eipzig).
12, 81. Poincare, H. , 1890, Acta Math. 13, 1.
Klauder, J. R. , and E. C. G. Sudarshan, 1968, fundamentals Polya, G. , and G. Szego, 1970, Aufgaben und Lehrsatze der
of Quantum Optics (Benjamin, New York). Analysis (Springer, Heidelberg).
Klein, M. , 1972, in The Boltzmann Equation, edited by E. G. D. Prigogine, I., 1972, in The Boltzmann Equation, edited by
Cohen and W. Thirring (Springer, Vienna). E. G. D. Cohen and W. Thirring (Springer, Vienna).
Koenig, F. O. , 1959, in Men -and Moments in the History of Reed, M. , and B. Simon, 1972, Methods of'Modern Mathemati-
Science, edited by H. M. Evans (University of Washington, cal Physics (Academic, New York).
Seattle) . Reif, F., 1965, 5'undameritals of'Statistical and Thermal
Kolmogorov, A. N. , 1953, Dokl. Akad. Nauk SSSR 119, 861. Physics (McGraw-Hill, New York).
Kolmogorov, A. N. , 1959, Dokl. Akad. Nauk SSSR 124, 754. Benyi, A. , 1965, Rev. Int. Statist. Inst. 33, 1.
Koopman, B. O. , 1931, Proc. Natl. Acad. Sci. USA 17, 315. Benyi, A. , 1966, Wahrscheinlichkeitsrechnung {Deutscher
Kossakowski, A. , 1972, Rep. Math. Phys. 3, 247. Verlag der Wissenschaften, Berlin).
Kothe, G. , 1960, Topologische lineare Raume I (Springer, Robertson, A. and W. , 1964, Topological Vector Spaces (Cam-
Berlin). bridge University, Cambridge, England).
Kouchnirenko, A. G. , 1965, Dokl. Akad. Nauk SSSB 161, 37. Robinson, D. W. , 1971, The Thermodynamical Pressure in
Quantum Statistical Mechanics (Springer, Berlin). Takesaki, M. , 1970, Tomita's Theory of Modular Hilbert Al-
Robinson, D. W. , and D. Ruelle, 1967, Commun. Math. Phys. gebras and Its App/ications (Springer, Berlin).
5, 288. Takesaki, M. , 1972, J. Funct. Anal. 9, 306.
Roller, D. , 1950, The Early DeveloPment of the ConcePts of Thirring, W. , 1975, Vorlesungen uber mathematische Physik
Temperature and Heat (Harvard University, Cambridge, (University of Vienna. ).
Mass. ). Thompson, C. , 1965, J. Math. Phys. 6, 1812.
Ruch, E., 1975, Theor. Chim. Acta 38, 167. Thomson, W. (Lord Kelvin), 1852, Philos. Mag. 4, 304.
Ruch, E., and A. Mead, 1976, Theor. Chim. Acta 41, 95. Thomson, W. (Lord Kelvin), 1857, Proc. B. Soc. Edinb. 3, 139.
Ruelle, D. , 1966, Commun. Math. Phys. 3, 133. Uhlmann, A. , 1971, Wiss. Z. Karl-Marx-Univ. Leipzig 20, 633.
Ruelle, D. , 1969, Statistica/ Mechanics (Benjamin, New York). Uhlmann, A. , 1972, Wiss. Z. Karl-Marx-Univ. Leipzig 21, 427.
Schlogl, F., 1976, Z. Phys. B 25, 411. Uhlmann, A. , 1973, Wiss. Z. Karl-Marx-Univ. Leipzig 22, 139.
Schrodinger, E., 1927, Naturwissenschaften 14, 644. Uhlmann, A. , 1975, Bep. Math. Phys. 7, 449.
Segal, I. E., 1951, Duke Math. J. 18, 221. Uhlmann, A. , 1976, Commun. Math. Phys. 54, 21.
Segal, I. E., 1960, J.
Math. Mech. 9, 623. Uhlmann, A. , 1977, preprint, Leipzig.
Shannon, C. , and W. Weaver, 1949, The Mathematical Theory Umegaki, H. , 1954, Tohoku Math. J. 6, 177.
of Communication (University of Illinois, Urbana). Umegaki, H. , 1962, Kodai Math. Sem. Rep. 14, 59.
Simon, B., 1973, appendix to Lieb and Ruskai, J.
Math. Phys. Wehrl, A. , 1973, Acta Phys. Austriaca 37, 361.
1973. Wehrl, A. , 1974, Rep. Math. Phys. 6, 15.
Simon, B., private communication. Wehrl, A. , 1975, Acta Phys. Austriaca 41, 197.
Sinai, Ya. G. , 1961, Dokl. Akad. Nauk SSSR 25, 899. Wehrl, A. , 1976a, to be published in Rep. Math. Phys.
Sinai, Ya. G. , 1965, Usp. Math. Nauk 20, 232. Wehrl, A. , 1976b, Lectures at the CIME Summer School in
von Smoluchowski, M. , 1914, Vortrage uber die kinetische Bressanone.
Theoric der Materie und Elekt~zitat (Teubner, Leipzig). Wehrl, A. , 1976c, Rep. Math. Phys. 10, 159.
Spohn, H. , private communication. Wehrl, A. , 1977, to be published inRep. Math. Phys.
Spohn, H. , 1977, preprint. Wigner, E. P. , and M. M. Yanase, 1963, Proc. Natl. Acad. Sci.
Staszewski, P. , 1977, to be published in Rep. Math. Phys. USA 49, 910.
Stinespring, W. F., 1955, Proc. Am. Math. Soc. 6, 211. Yang, C. N. , and T. D. Lee, 1952, Phys. Bev. 87, 404.
Szilard, L. , 1925, Z. Phys. 32, 777. Zermelo, E., 1896, Ann. Phys. 57, 485; 59, 793.