Lecture Notes on Statistical Mechanics and Thermodynamics

Universität Leipzig

Instructor: Prof. Dr. S. Hollands

Contents

1 Introduction and Historical Overview ..... 3
2 Basic Statistical Notions ..... 7
2.1 Probability Theory and Random Variables ..... 7
2.2 Entropy ..... 11
2.3 Examples of probability distributions ..... 12
2.3.1 Gaussian distribution ..... 13
2.3.2 Binomial and Poisson distribution ..... 13
2.3.3 Ising model ..... 14
2.3.4 Site percolation ..... 15
2.3.5 Random walk on a lattice ..... 17
2.4 Ensembles in Classical Mechanics ..... 18
2.5 Ensembles in Quantum Mechanics (Statistical Operators / Density Matrices) ..... 24
3 Time-evolving ensembles ..... 29
3.1 Boltzmann Equation in Classical Mechanics ..... 29
3.2 Boltzmann Equation, Approach to Equilibrium in Quantum Mechanics ..... 36
4 Equilibrium Ensembles ..... 39
4.1 Generalities ..... 39
4.2 Micro-Canonical Ensemble ..... 39
4.2.1 Micro-Canonical Ensemble in Classical Mechanics ..... 39
4.2.2 Microcanonical Ensemble in Quantum Mechanics ..... 46
4.2.3 Mixing entropy of the ideal gas ..... 49
4.3 Canonical Ensemble ..... 51
4.3.1 Canonical Ensemble in Quantum Mechanics ..... 51
4.3.2 Canonical Ensemble in Classical Mechanics ..... 54
4.3.3 Equidistribution Law and Virial Theorem in the Canonical Ensemble ..... 57
4.4 Grand Canonical Ensemble ..... 61
4.5 Summary of different equilibrium ensembles ..... 64
4.6 Approximation methods ..... 65
4.6.1 The cluster expansion ..... 65
4.6.2 Peierls contours ..... 67
5 The Ideal Quantum Gas ..... 71
5.1 Hilbert Spaces, Canonical and Grand Canonical Formulations ..... 71
5.2 Degeneracy pressure for free fermions ..... 78
5.3 Spin Degeneracy ..... 81
5.4 Black Body Radiation ..... 83
5.5 Degenerate Bose Gas ..... 87
6 The Laws of Thermodynamics ..... 91
6.1 The Zeroth Law ..... 92
6.2 The First Law ..... 94
6.3 The Second Law ..... 99
6.4 Cyclic processes ..... 102
6.4.1 The Carnot Engine ..... 102
6.4.2 General Cyclic Processes ..... 107
6.4.3 The Diesel Engine ..... 109
6.5 Thermodynamic potentials ..... 110
6.6 Chemical Equilibrium ..... 116
6.7 Phase Co-Existence and Clausius-Clapeyron Relation ..... 118
6.8 Osmotic Pressure ..... 123
A Dynamical Systems and Approach to Equilibrium ..... 125
A.1 The Master Equation ..... 125
A.2 Properties of the Master Equation ..... 127
A.3 Relaxation time vs. ergodic time ..... 129
A.4 Monte Carlo methods and Metropolis algorithm ..... 132
A.5 Eigenstate thermalization ..... 134
B Exercises ..... 137
B.1 Exercises for chapter 2 ..... 137
B.2 Exercises for chapter 3 ..... 141
B.3 Exercises for chapter 4 ..... 143
B.4 Exercises for chapter 5 ..... 149
B.5 Exercises for chapter 6 ..... 152

List of Figures

1.1 Boltzmann's tomb with his famous entropy formula engraved at the top ..... 4
2.1 Graphical expression for the first four moments. ..... 10
2.2 Sketch of a well-potential $\mathcal{W}$. ..... 19
2.3 Evolution of a phase space volume under the flow map $\Phi_{t}$ ..... 20
2.4 Sketch of the situation described in the proof of Poincaré recurrence. ..... 23
3.1 Classical scattering of particles in the "fixed target frame". ..... 31
3.2 Pressure on the walls due to the impact of particles. ..... 33
3.3 Sketch of the air-flow across a wing. ..... 34
4.1 Gas in a piston maintained at pressure $P$. ..... 43
4.2 The joint number of states for two systems in thermal contact. ..... 45
4.3 Number of states with energies lying between $E-\Delta E$ and $E$. ..... 48
4.4 Two gases separated by a removable wall. ..... 49
4.5 A small system in contact with a large heat reservoir. ..... 51
4.6 Distribution and velocity of stars in a galaxy. ..... 59
4.7 Sketch of a potential $\mathcal{V}$ of a lattice with a minimum at $Q_{0}$. ..... 59
4.8 A small system coupled to a large heat and particle reservoir. ..... 61
4.9 A Peierls contour ..... 68
5.1 The potential $\mathcal{V}(\vec{r})$ occurring in (5.46). ..... 81
5.2 Lowest-order Feynman diagram for photon-photon scattering in Quantum Electrodynamics. ..... 84
5.3 Sketch of the Planck distribution for different temperatures. ..... 86
6.1 The triple point of ice, water and vapor in the $(P, T)$ phase diagram ..... 93
6.2 A large system divided into subsystems I and II by an imaginary wall. ..... 94
6.3 Change of system from initial state $i$ to final state $f$ along two different paths. ..... 94
6.4 A curve $\gamma:[0,1] \rightarrow \mathbb{R}^{2}$. ..... 95
6.5 Sketch of the submanifolds $\mathcal{A}$. ..... 99
6.6 Adiabatics of the ideal gas ..... 101
6.7 Carnot cycle for an ideal gas. The solid lines indicate isotherms and the dashed lines indicate adiabatics. ..... 104
6.8 The Carnot cycle in the $(T, S)$-diagram. ..... 106
6.9 A generic cyclic process in the $(T, S)$-diagram. ..... 107
6.10 A generic cyclic process divided into two parts by an isotherm at temperature $T_{I}$. ..... 108
6.11 The process describing the Diesel engine in the $(P, V)$-diagram. ..... 109
6.12 The phase boundary between solution and a solute. ..... 119
6.13 Imaginary phase diagram for the case of 6 different phases. At each point on a phase boundary which is not an intersection point, $\varphi=2$ phases are supposed to coexist. At each intersection point $\varphi=4$ phases are supposed to coexist. ..... 120
6.14 Phase boundary of a vapor-solid system in the $(P, T)$-diagram ..... 122

Chapter 1

Introduction and Historical Overview

As the name suggests, thermodynamics historically developed as an attempt to understand phenomena involving heat. This notion is intimately related to irreversible processes involving typically many, essentially randomly excited, degrees of freedom. The proper understanding of this notion, as well as the 'laws' that govern it, took the better part of the 19th century. The basic rules that were, essentially empirically, observed were clarified and laid out in the so-called "laws of thermodynamics". These laws are still useful today, and will most likely survive most microscopic models of physical systems that we use.
Before the laws of thermodynamics were identified, other theories of heat were also considered. A curious example from the 17th century is a theory of heat proposed by J. Becher, who put forward the idea that heat was carried by special particles. This idea was refuted by scientists such as A.L. de Lavoisier$^{2}$, who showed that the existence of such a particle did not explain, and was in fact inconsistent with, the phenomenon of burning, which he instead correctly associated with chemical processes involving oxygen. Heat had already previously been associated with friction, especially through the work of B. Thompson, who showed that in this process work (mechanical energy) is converted to heat. That heat transfer can generate mechanical energy was in turn exemplified by the steam engine as developed by inventors such as J. Watt, J. Trevithick, and T. Newcomen, the key technical invention of the 18th and 19th centuries. A broader theoretical description of processes involving heat transfer was put forward in 1824 by N.L.S. Carnot, who emphasized in particular the importance of the notion of equilibrium. The quantitative understanding of the relationship between heat and energy was found by J.P. Joule and R. Mayer, who were the first to state clearly that heat is a form of energy. This finally led to the principle of conservation of energy put forward by H. von Helmholtz in 1847.
Parallel to this largely phenomenological view of heat, there were also early attempts to understand the phenomenon from a microscopic angle. This viewpoint seems to have been first stated in a transparent fashion by D. Bernoulli in 1738 in his work on hydrodynamics, in which he proposed that heat is transferred from regions with energetic molecules (high internal energy) to regions with less energetic molecules (low internal energy). The microscopic viewpoint ultimately led to the modern 'bottom up' view of heat due to J.C. Maxwell, J. Stefan and especially L. Boltzmann. According to Boltzmann, heat is associated with a quantity called "entropy" which increases in irreversible processes. In the context of equilibrium states, entropy can be understood as a measure of the number of accessible states at a given energy according to his famous formula
$$S = k_{\mathrm{B}} \log W(E),$$
which Planck later had engraved on Boltzmann's tombstone at the Wiener Zentralfriedhof:
Figure 1.1: Boltzmann's tomb with his famous entropy formula engraved at the top.
The formula thereby connects a macroscopic, phenomenological quantity $S$ to the microscopic states of the system (counted by $W(E)$, the number of accessible states of energy $E$). His proposal to relate entropy to counting problems for microscopic configurations, and thereby to ideas from probability theory, was entirely new and ranks as one of the major intellectual accomplishments in Physics. The systematic understanding of the relationship between the distributions of microscopic states of a system and macroscopic quantities such as $S$ is the subject of statistical mechanics. That subject nowadays goes well beyond the original goal of understanding the phenomenon of heat and is more broadly aimed at the analysis of systems with a large number of, typically interacting, degrees of freedom and their description in an "averaged", or "statistical", or "coarse grained" manner. As such, statistical mechanics has found an ever growing number of applications to many diverse areas of science, such as
  • Neural networks and other networks
  • Financial markets
  • Data analysis and mining
  • Astronomy
  • Black hole physics
and many more. Here is an, obviously incomplete, list of some key innovations in the subject:

Timeline

17th century:

Ferdinand II, Grand Duke of Tuscany: Quantitative measurement of temperature

18th century:

A. Celsius, C. von Linné: Celsius temperature scale
A.L. de Lavoisier: basic calorimetry
D. Bernoulli: basics of kinetic gas theory
B. Thompson (Count Rumford): mechanical energy can be converted to heat

19th century:
1802 J. L. Gay-Lussac: heat expansion of gases
1824 N.L.S.Carnot: thermodynamic cycles and heat engines
1847 H. von Helmholtz: energy conservation (1st law of thermodynamics)
1848 W. Thomson (Lord Kelvin): definition of absolute thermodynamic temperature scale based on Carnot processes
1850 W. Thomson and H. von Helmholtz: impossibility of a perpetuum mobile (2nd law)
1857 R. Clausius: equation of state for ideal gases
1860 J.C. Maxwell: distribution of the velocities of particles in a gas
1865 R. Clausius: new formulation of the 2nd law of thermodynamics, notion of entropy
1877 L. Boltzmann: $S = k_{\mathrm{B}} \log W$
1876 (as well as 1896 and 1909): controversies concerning entropy; the objection that Poincaré recurrence is not compatible with irreversible macroscopic behavior
1894 W. Wien: black body radiation
20th century:
1900 M. Planck: radiation law $\rightarrow$ Quantum Mechanics
1911 P. Ehrenfest: foundations of Statistical Mechanics
1924 Bose-Einstein statistics
1925 Fermi-Pauli statistics
1931 L. Onsager: theory of irreversible processes
1937 L. Landau: phase transitions, later extended to superconductivity by Ginzburg
1930's W. Heisenberg, E. Ising, R. Peierls,... : spin models for magnetism
1943 S. Chandrasekhar, R.H. Fowler: applications of statistical mechanics in astrophysics
1956 J. Bardeen, L.N. Cooper, J.R. Schrieffer: explanation of superconductivity
1956-58 L. Landau: theory of Fermi liquids
1960's T. Matsubara, E. Nelson, K. Symanzik,... : application of Quantum Field Theory methods to Statistical Mechanics
1970's L. Kadanoff, K.G. Wilson, W. Zimmermann, F. Wegner,...: renormalization group methods in Statistical Mechanics
1973 J. Bardeen, B. Carter, S. Hawking, J. Bekenstein, R.M. Wald, W.G. Unruh,....: laws of black hole mechanics, Bekenstein-Hawking entropy
1975 - Neural networks
1985 - Statistical physics in economy

Chapter 2

Basic Statistical Notions

2.1 Probability Theory and Random Variables

Statistical mechanics is an intrinsically probabilistic description of a system, so we do not ask questions like "What is the velocity of the $N^{\text{th}}$ particle?" but rather questions of the sort "What is the probability of the $N^{\text{th}}$ particle having velocity between $v$ and $v+\Delta v$?" in an ensemble of particles. Thus, basic notions and manipulations from probability theory can be useful, and we now introduce some of these, without any attention paid to mathematical rigor.
  • A random variable $x$ can have different outcomes forming a set $\Omega = \{x_{1}, x_{2}, \ldots\}$, e.g. for tossing a coin $\Omega_{\text{coin}} = \{\text{head}, \text{tail}\}$, for a dice $\Omega_{\text{dice}} = \{1, 2, 3, 4, 5, 6\}$, or for the velocity of a particle $\Omega_{\text{velocity}} = \{\vec{v} = (v_{x}, v_{y}, v_{z}) \in \mathbb{R}^{3}\}$, etc.
  • An event is a subset $E \subset \Omega$ (not all subsets need to be events).
  • A probability measure is a map that assigns a number $P(E)$ to each event, subject to the following general rules:
    (i) $P(E) \geqslant 0$.
    (ii) $P(\Omega) = 1$.
    (iii) If $E \cap E^{\prime} = \varnothing$, then $P(E \cup E^{\prime}) = P(E) + P(E^{\prime})$.
In mathematics, the data $(\Omega, P, \{E\})$ is called a probability space, and the above axioms basically correspond to the axioms for such spaces. For instance, for a fair dice the probabilities would be $P_{\text{dice}}(\{1\}) = \ldots = P_{\text{dice}}(\{6\}) = \frac{1}{6}$, and $E$ would be any subset of $\{1, 2, 3, 4, 5, 6\}$. In practice, probabilities are determined by repeating the experiment (independently) many times, e.g. throwing the dice very often. Thus, the "empirical definition" of the probability of an event $E$ is
$$P(E) = \lim_{N \rightarrow \infty} \frac{N_{E}}{N}, \tag{2.1}$$
where $N_{E}$ is the number of times $E$ occurred and $N$ is the total number of experiments. For one real variable $x \in \Omega \subset \mathbb{R}$, it is common to write the probability of an event $E \subset \mathbb{R}$ formally as
$$P(E) = \int_{E} p(x)\, dx \tag{2.2}$$
Here, $p(x)$ is the probability density "function", defined formally by:
$$``\, p(x)\, dx = P((x, x+dx)) \,"$$
The axioms for $p$ formally imply that we should have
$$\int_{-\infty}^{\infty} p(x)\, dx = 1, \qquad 0 \leqslant p(x) \leqslant \infty$$
A mathematically more precise way to think about the quantity $p(x)\,dx$ is provided by measure theory: we should really think of $p(x)\,dx = d\mu(x)$ as defining a measure, and of $\{E\}$ as the corresponding collection of measurable subsets. A typical case is that $p$ is a smooth (or even just integrable) function on $\mathbb{R}$ and that $dx$ is the Lebesgue measure, with $E$ from the set of all Lebesgue measurable subsets of $\mathbb{R}$. However, we can also consider more pathological cases, e.g. by allowing $p$ to have certain singularities. It is possible to define "singular" measures $d\mu$ relative to the Lebesgue measure $dx$ which cannot be written as $p(x)\,dx$ with $p$ an integrable function that is non-negative almost everywhere. An example is the Dirac measure, which is formally written as
$$p(x) = \sum_{i=1}^{N} p_{i}\, \delta(x - y_{i}), \tag{2.3}$$
where $p_{i} \geqslant 0$ and $\sum_{i} p_{i} = 1$. Nevertheless, we will, by abuse of notation, stick with the informal notation $p(x)\,dx$. We can also consider several random variables, such as $x = (x_{1}, \ldots, x_{N}) \in \Omega = \mathbb{R}^{N}$. The probability density function would then be, again formally, a function $p(x) \geqslant 0$ on $\mathbb{R}^{N}$ with total integral 1.
Of course, as the example of the coin shows, one can and should also consider discrete probability spaces such as $\Omega = \{1, \ldots, N\}$, with the events $E$ being all possible subsets. For the elementary event $\{n\}$ the probability $p_{n} = P(\{n\})$ is then a non-negative number, and $\sum_{i} p_{i} = 1$. The collection $\{p_{1}, \ldots, p_{N}\}$ completely characterizes the probability distribution.
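The empirical definition (2.1) is easy to illustrate numerically. The following sketch (the sample size and the seed are arbitrary choices of ours) estimates the elementary probabilities $p_n$ of a fair dice from simulated throws and compares them with the exact value $1/6$:

```python
import random
from collections import Counter

def empirical_probabilities(n_throws, seed=0):
    """Estimate P({n}) for a fair dice as the relative frequency N_E / N, cf. (2.1)."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_throws))
    return {face: counts[face] / n_throws for face in range(1, 7)}

probs = empirical_probabilities(60_000)
# By (2.1), each relative frequency should approach 1/6 ≈ 0.1667 as N grows.
for face, p in sorted(probs.items()):
    print(f"P({{{face}}}) = {p:.4f}")
```

The deviations from $1/6$ shrink like $1/\sqrt{N}$, anticipating the central limit theorem discussed below.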
Let us collect some standard notions and terminology associated with probability spaces:
  • The expectation value $\langle F(x)\rangle$ of a function ("observable") $\mathbb{R}^{N} \supset \Omega \ni x \mapsto F(x) \in \mathbb{R}$ of a random variable is
$$\langle F(x)\rangle := \int_{\Omega} F(x)\, p(x)\, d^{N}x. \tag{2.4}$$
Here, the function $F(x)$ should be such that this expression is actually well-defined, i.e. $F$ should be integrable with respect to the probability measure $d\mu = p(x)\, d^{N}x$.
  • The moments $m_{n}$ of a probability density function $p$ of one real variable $x \in \Omega = \mathbb{R}$ are defined by
$$m_{n} := \langle x^{n}\rangle = \int_{-\infty}^{\infty} x^{n}\, p(x)\, dx \tag{2.5}$$
Note that it is not automatically guaranteed that the moments are well-defined, and the same remark applies to the expressions given below. The probability distribution $p$ can be reconstructed from the moments under certain conditions; this is known as the "Hamburger moment problem".
  • The characteristic function $\tilde{p}$ of a probability density function of one real variable is its Fourier transform, defined as
$$\tilde{p}(k) = \int_{-\infty}^{\infty} dx\, e^{-ikx}\, p(x) = \langle e^{-ikx}\rangle = \sum_{n=0}^{\infty} \frac{(-ik)^{n}}{n!}\langle x^{n}\rangle \tag{2.6}$$
The Fourier inversion formula gives
$$p(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} dk\, e^{ikx}\, \tilde{p}(k) \tag{2.7}$$
  • The cumulants $\langle x^{n}\rangle_{c}$ are defined via
$$\log \tilde{p}(k) = \sum_{n=1}^{\infty} \frac{(-ik)^{n}}{n!}\langle x^{n}\rangle_{c}. \tag{2.8}$$
The first four are given in terms of the moments by
$$\begin{aligned}
\langle x\rangle_{c} &= \langle x\rangle \\
\langle x^{2}\rangle_{c} &= \langle x^{2}\rangle - \langle x\rangle^{2} = \langle (x - \langle x\rangle)^{2}\rangle \\
\langle x^{3}\rangle_{c} &= \langle x^{3}\rangle - 3\langle x^{2}\rangle\langle x\rangle + 2\langle x\rangle^{3} \\
\langle x^{4}\rangle_{c} &= \langle x^{4}\rangle - 4\langle x^{3}\rangle\langle x\rangle - 3\langle x^{2}\rangle^{2} + 12\langle x^{2}\rangle\langle x\rangle^{2} - 6\langle x\rangle^{4}.
\end{aligned}$$
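These relations can be checked mechanically. The sketch below, using the fair dice as an example distribution (our choice), computes the first four moments exactly with rational arithmetic, evaluates the cumulants from the formulas above, and verifies them against the standard central-moment identities $\langle x^2\rangle_c = \langle(x-\langle x\rangle)^2\rangle$, $\langle x^3\rangle_c = \langle(x-\langle x\rangle)^3\rangle$, and $\langle x^4\rangle_c = \langle(x-\langle x\rangle)^4\rangle - 3\langle x^2\rangle_c^2$:

```python
from fractions import Fraction

# Fair dice: p_i = 1/6 on {1,...,6}; moments m[n] = <x^n>.
outcomes = range(1, 7)
p = Fraction(1, 6)
m = {n: sum(p * x ** n for x in outcomes) for n in range(1, 5)}

# Cumulants from the moment formulas in the text.
c1 = m[1]
c2 = m[2] - m[1] ** 2
c3 = m[3] - 3 * m[2] * m[1] + 2 * m[1] ** 3
c4 = m[4] - 4 * m[3] * m[1] - 3 * m[2] ** 2 + 12 * m[2] * m[1] ** 2 - 6 * m[1] ** 4

# Independent check via central moments <(x - <x>)^n>.
mu = {n: sum(p * (x - m[1]) ** n for x in outcomes) for n in range(2, 5)}
assert c2 == mu[2]                      # variance
assert c3 == mu[3]                      # vanishes for this symmetric distribution
assert c4 == mu[4] - 3 * mu[2] ** 2
print(c1, c2, c3, c4)
```

For the dice one finds $\langle x\rangle_c = 7/2$, $\langle x^2\rangle_c = 35/12$ and $\langle x^3\rangle_c = 0$ by symmetry.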
There is an important combinatorial scheme relating moments to cumulants. The result expressed by this combinatorial scheme is called the linked cluster theorem, and a variant of it will appear when we discuss the cluster expansion. In order to state and illustrate the content of the linked cluster theorem, we represent the first four moments graphically as follows:
In terms of connected moments (clusters), the graphical expansion reads
$$\begin{aligned}
\langle x\rangle &= \langle x\rangle_{c} \\
\langle x^{2}\rangle &= \langle x^{2}\rangle_{c} + \langle x\rangle_{c}^{2} \\
\langle x^{3}\rangle &= \langle x^{3}\rangle_{c} + 3\,\langle x^{2}\rangle_{c}\langle x\rangle_{c} + \langle x\rangle_{c}^{3} \\
\langle x^{4}\rangle &= \langle x^{4}\rangle_{c} + 4\,\langle x^{3}\rangle_{c}\langle x\rangle_{c} + 3\,\langle x^{2}\rangle_{c}^{2} + 6\,\langle x^{2}\rangle_{c}\langle x\rangle_{c}^{2} + \langle x\rangle_{c}^{4}
\end{aligned}$$

Figure 2.1: Graphical expression for the first four moments.
A blob indicates a connected moment, also called a 'cluster'. The linked cluster theorem states that the numerical coefficient in front of each term can be obtained by counting the number of ways to break the points into clusters of the given type. A proof of the linked cluster theorem can be obtained as follows: we write
$$\sum_{m=0}^{\infty} \frac{(-ik)^{m}}{m!}\langle x^{m}\rangle = e^{\sum_{n=1}^{\infty} \frac{(-ik)^{n}}{n!}\langle x^{n}\rangle_{c}} = \prod_{n=1}^{\infty} \sum_{i_{n}=0}^{\infty}\left[\frac{(-ik)^{n i_{n}}}{i_{n}!}\left(\frac{\langle x^{n}\rangle_{c}}{n!}\right)^{i_{n}}\right] \tag{2.9}$$
from which we conclude that
$$\langle x^{m}\rangle = \sum_{\{i_{n}\}}^{\prime} m! \prod_{n} \frac{\langle x^{n}\rangle_{c}^{i_{n}}}{i_{n}!\,(n!)^{i_{n}}}, \tag{2.10}$$
where $\sum^{\prime}$ is restricted to $\sum_{n} n\, i_{n} = m$. The claimed graphical expansion follows because $\frac{m!}{\prod_{n} i_{n}!\,(n!)^{i_{n}}}$ is the number of ways to break $m$ points into clusters characterized by the numbers $\{i_{n}\}$, where $i_{n}$ is the number of clusters with $n$ points.
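The combinatorial factor in (2.10) can be verified by brute force. The sketch below (a pure-Python enumeration; the function names are ours) lists all cluster types $\{i_n\}$ for $m=4$ points and computes $m!/\prod_n i_n!\,(n!)^{i_n}$ for each; the resulting coefficients $1, 4, 3, 6, 1$ are exactly those appearing in the expansion of $\langle x^4\rangle$, and their sum is the total number of ways to partition 4 points into clusters (the Bell number $B_4 = 15$):

```python
from math import factorial

def cluster_types(m):
    """Yield dicts {n: i_n} with sum(n * i_n) == m, i.e. partitions of m by part size."""
    def partitions(remaining, max_part):
        if remaining == 0:
            yield []
            return
        for part in range(min(remaining, max_part), 0, -1):
            for rest in partitions(remaining - part, part):
                yield [part] + rest
    for parts in partitions(m, m):
        yield {n: parts.count(n) for n in set(parts)}

def coefficient(m, counts):
    """Number of ways to break m points into clusters of type {i_n}."""
    denom = 1
    for n, i_n in counts.items():
        denom *= factorial(i_n) * factorial(n) ** i_n
    return factorial(m) // denom

coeffs = [coefficient(4, c) for c in cluster_types(4)]
print(sorted(coeffs), sum(coeffs))
```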
Let $p$ be a probability density on the space $\Omega = \Omega_{A} \times \Omega_{B} = \{x = (x_{A}, x_{B}): x_{A} \in \Omega_{A}, x_{B} \in \Omega_{B}\}$. If the density $p$ on $\Omega$ is factorized, as in
$$p(x_{A}, x_{B}) = p_{A}(x_{A})\, p_{B}(x_{B}), \tag{2.11}$$
then we say that the variables $x_{A}$ and $x_{B}$ are independent. If $F_{A}$ is an observable for $x_{A}$ and $F_{B}$ an observable for $x_{B}$, then for independent random variables $x_{A}, x_{B}$ one has
$$\langle F_{A} F_{B}\rangle = \langle F_{A}\rangle \langle F_{B}\rangle \tag{2.12}$$
and one says that $A$ and $B$ are uncorrelated.
The notion of independence can be generalized immediately to any "Cartesian product" $\Omega = \Omega_{1} \times \ldots \times \Omega_{N}$ of probability spaces. In the case of independent identically distributed real random variables $x_{i}$, $i = 1, \ldots, N$, there is an important theorem characterizing the limit $N \rightarrow \infty$, which is treated in more detail in problem B.1. Basically it says that (under certain assumptions about $p$) the random variable $y = \frac{1}{\sqrt{N}}\sum_{i}(x_{i} - \mu)$ has, for large $N$, approximately a Gaussian distribution with mean 0 and spread $\sigma$, where $\mu$ and $\sigma$ are the mean and spread of a single $x_{i}$; equivalently, the sample mean $\frac{1}{N}\sum_{i} x_{i}$ fluctuates around $\mu$ with spread $\sigma/\sqrt{N}$. Thus, in this sense, a sum of a large number of independent random variables is approximately distributed as a Gaussian random variable. This so-called "Central Limit Theorem" explains, in some sense, the empirical observation that the random variables appearing in various applications are distributed as Gaussians.
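A quick numerical illustration of the central limit theorem (the sample sizes and the seed are arbitrary choices of ours): sums of i.i.d. uniform variables, centered and scaled by $\sqrt{N}$, cluster around 0 with spread $\sigma$, and roughly 68% of them fall within one $\sigma$, as expected for a Gaussian.

```python
import math
import random

rng = random.Random(1)
N, trials = 50, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)   # mean and spread of a single uniform(0,1) variable

# y = (1/sqrt(N)) * sum_i (x_i - mu), sampled many times
ys = [sum(rng.random() - mu for _ in range(N)) / math.sqrt(N) for _ in range(trials)]

mean_y = sum(ys) / trials
std_y = math.sqrt(sum(y ** 2 for y in ys) / trials - mean_y ** 2)
frac_1sigma = sum(abs(y) < sigma for y in ys) / trials

print(f"mean = {mean_y:.3f}, spread = {std_y:.3f} (sigma = {sigma:.3f}), "
      f"P(|y| < sigma) = {frac_1sigma:.3f}")
```

The fraction within one standard deviation approaches the Gaussian value $\operatorname{erf}(1/\sqrt{2}) \approx 0.683$.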

2.2 Entropy

An important quantity associated with a probability distribution is its "information entropy". Let $\{p_{i}\}$ be a probability distribution for a random variable taking values in $\Omega = \{x_{1}, \ldots, x_{N}\}$. If the probability $p_{i}$ for finding $x_{i}$ is very small, then we should be surprised when $x_{i}$ occurs. A measure of surprise for the event $x_{i}$ is
$$\text{surprise at seeing event } x_{i} = \log \frac{1}{p_{i}} \tag{2.13}$$
because (i) the surprise is larger the smaller $p_{i}$ is, and (ii) the surprise of independent events (see above) should be additive, so we should take a logarithm. The average surprise is
$$\text{average surprise} = \left\langle \log \frac{1}{p_{i}} \right\rangle = -\sum_{i} p_{i} \log p_{i} \tag{2.14}$$
This average surprise is defined to be the "information entropy":
Definition: Let $\Omega = \{x_{1}, \ldots, x_{N}\}$ and let $\{p_{i}\}$ be a probability distribution. The quantity
$$S_{\mathrm{inf}}(\{p_{i}\}) := -k_{\mathrm{B}} \sum_{i} p_{i} \log p_{i} \tag{2.15}$$
is called the information entropy. (Our convention is that $0 \log 0 = 0$, and $\log$ is the natural logarithm.)
The factor $k_{\mathrm{B}}$ is merely inserted here to be consistent with the conventions of statistical physics. In computer science it is dropped, and the natural log is replaced by the logarithm with base 2, which is natural if we think of information encoded in bits. The information entropy is also sometimes defined with the opposite sign, since more entropy means less information. It can be shown that the information entropy (in the computer science normalization) is roughly equal to the average (with respect to the given probability distribution) number of yes/no questions necessary to determine which event has occurred (cf. problem B.4).
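A small sketch of this interpretation in the computer-science normalization ($k_{\mathrm{B}}$ dropped, base-2 logarithm; the example distributions are our choices): a uniform distribution over $2^k$ outcomes has entropy exactly $k$ bits, matching the $k$ yes/no questions of a binary search, while any non-uniform distribution over the same outcomes has lower entropy.

```python
import math

def entropy_bits(probs):
    """Information entropy -sum_i p_i log2 p_i, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform8 = [1 / 8] * 8   # 8 equally likely outcomes: 3 yes/no questions suffice
skewed8 = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 32, 1 / 64, 1 / 128, 1 / 128]

print(entropy_bits(uniform8))   # 3 bits
print(entropy_bits(skewed8))    # fewer bits: the outcome is more predictable
```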
Maximum entropy principle: A practical application of information entropy is as follows. Suppose one has an ensemble whose probability distribution $\{p_{i}: i = 1, \ldots, n\}$ is not completely known. One would like to make a good guess about $\{p_{i}\}$ based on some partial information, such as a finite number of moments or other observables. Thus, suppose that $F_{A}(x)$, $A = 1, 2, \ldots, m$, are $m < n$ observables for which the expectation values $\langle F_{A}(x)\rangle = f_{A}$ are known. Then a good guess, representing in some sense a minimal bias about $\{p_{i}\}$, is to maximize $S_{\mathrm{inf}}$ subject to the constraints $\langle F_{A}(x)\rangle = f_{A}$. In the case when the observables are the mean value $\mu$ and the variance $\sigma^{2}$, the distribution obtained in this way is the Gaussian. So the Gaussian is, in this sense, our best guess if we only know $\mu$ and $\sigma$ (cf. problem B.3).
The maximum entropy principle is typically analyzed with the help of Lagrange multipliers. For the constraint $\sum_i p_i = 1$ we take a multiplier $\mu$, and for the other constraints $\sum_i p_i F_A(x_i) = f_A$ we take multipliers $\lambda_A$. Then we should first look at the unconstrained maximization problem
$$
\phi(p_1,\ldots,p_n) = S(p_1,\ldots,p_n) + \mu\Big(\sum_i p_i - 1\Big) + \sum_A \lambda_A\Big(\sum_i p_i F_A(x_i) - f_A\Big) \;\longrightarrow\; \text{maximum}
$$
The constraints are linear in the $p_i$ and $S(\{p_i\})$ is a concave function. Thus, a stationary point that is a solution to the constraints is automatically a maximum.
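Concretely, stationarity of $\phi$ gives $p_i \propto \exp\big(\sum_A \lambda_A F_A(x_i)\big)$, with the multipliers fixed by the constraints. As a numerical illustration (our own sketch, with hypothetical function names), the following snippet finds the maximum-entropy distribution of a six-sided die with a prescribed mean, solving for the single multiplier by bisection:

```python
import math

def maxent_die(target_mean, xs=(1, 2, 3, 4, 5, 6)):
    """Maximum-entropy distribution on xs with a prescribed mean.

    Stationarity of phi gives p_i proportional to exp(lam * x_i); the
    multiplier lam is fixed by the mean constraint, solved by bisection
    (the mean is strictly increasing in lam, its derivative is a variance)."""
    def mean(lam):
        w = [math.exp(lam * x) for x in xs]
        return sum(x * wi for x, wi in zip(xs, w)) / sum(w)

    lo, hi = -40.0, 40.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = [math.exp(lam * x) for x in xs]
    Z = sum(w)
    return [wi / Z for wi in w]

p = maxent_die(4.5)   # biased die: probabilities increase with the face value
```

For a prescribed mean of $3.5$ the method returns the uniform distribution, as expected from the unconstrained maximum of $S$.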

2.3 Examples of probability distributions

We next give some important examples of probability distributions and related models in statistical mechanics:

2.3.1 Gaussian distribution

The Gaussian distribution for one real random variable $x \in \Omega = \mathbb{R}$ is defined by the following probability density:
\begin{equation*}
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{2.17}
\end{equation*}
We find $\mu = \langle x\rangle$ and $\sigma^2 = \langle x^2\rangle - \langle x\rangle^2 = \langle x^2\rangle_c$. The higher moments are all expressible in terms of $\mu$ and $\sigma$ in a systematic fashion. For example:
$$
\begin{aligned}
\langle x^2\rangle &= \sigma^2 + \mu^2 \\
\langle x^3\rangle &= 3\sigma^2\mu + \mu^3 \\
\langle x^4\rangle &= 3\sigma^4 + 6\sigma^2\mu^2 + \mu^4
\end{aligned}
$$
The generating functional for the moments is $\langle e^{-ikx}\rangle = e^{-ik\mu} e^{-\sigma^2 k^2/2}$. The $N$-dimensional generalization of the Gaussian distribution ($\Omega = \mathbb{R}^N$) is expressed in terms of a "covariance matrix" $C$, which is real, symmetric, with positive eigenvalues, and a vector $\vec\mu$. It is
\begin{equation*}
p(\vec x) = \frac{1}{(2\pi)^{N/2}(\det C)^{1/2}}\, e^{-\frac{1}{2}(\vec x - \vec\mu)\cdot C^{-1}(\vec x - \vec\mu)}. \tag{2.18}
\end{equation*}
The first two moments are $\langle x_i\rangle = \mu_i$ and $\langle x_i x_j\rangle = C_{ij} + \mu_i\mu_j$, so $C_{ij}$ is the generalization of the variance and $\mu_i$ that of the mean value.
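The moment formulas above are easy to verify numerically. The following sketch (our own illustration; parameter values are arbitrary) computes Gaussian moments by Gauss–Hermite quadrature, which is exact for polynomial integrands after the substitution $x = \mu + \sqrt{2}\,\sigma t$:

```python
import numpy as np

def gaussian_moment(n, mu, sigma, order=20):
    """n-th moment <x^n> of a Gaussian via Gauss-Hermite quadrature.

    Substituting x = mu + sqrt(2)*sigma*t turns the Gaussian density into
    the weight e^{-t^2}, for which hermgauss supplies nodes and weights;
    the rule is exact for polynomials of degree up to 2*order - 1."""
    t, w = np.polynomial.hermite.hermgauss(order)
    x = mu + np.sqrt(2.0) * sigma * t
    return np.sum(w * x**n) / np.sqrt(np.pi)

mu, s = 0.7, 1.3                      # arbitrary test values
m2 = gaussian_moment(2, mu, s)        # expect sigma^2 + mu^2
m3 = gaussian_moment(3, mu, s)        # expect 3 sigma^2 mu + mu^3
m4 = gaussian_moment(4, mu, s)        # expect 3 sigma^4 + 6 sigma^2 mu^2 + mu^4
```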

2.3.2 Binomial and Poisson distribution

The binomial distribution occurs naturally when we perform a yes/no probability experiment independently a number of times. Fix $N$ and let $\Omega = \{0, 1, \ldots, N\}$. Then the events are subsets of $\Omega$, such as $\{n\}$. We think of $n$ as the number of times an outcome $A$ occurs in $N$ trials, where $0 \leqslant q \leqslant 1$ is the probability of the event $A$.
\begin{align*}
& P_N(\{n\}) = \binom{N}{n} q^n (1-q)^{N-n} \tag{2.19}\\
& \Rightarrow \quad \tilde p_N(k) = \langle e^{-ikn}\rangle = \big(q e^{-ik} + (1-q)\big)^N \tag{2.20}
\end{align*}
The Poisson distribution is the limit of the binomial distribution for $N \to \infty$ when $n$ is fixed and $q = \frac{\alpha}{N}$, with $\alpha$ fixed (rare events). It is given by ($n \in \{0, 1, 2, \ldots\} = \Omega$):
\begin{equation*}
p(n) = \frac{\alpha^n}{n!}\, e^{-\alpha}. \tag{2.21}
\end{equation*}
To see this, we write the binomial distribution as
\begin{align*}
p_N(n) &= \frac{N(N-1)\cdots(N-n+1)}{n!}\, \frac{\alpha^n}{N^n} \Big(1 - \frac{\alpha}{N}\Big)^{N-n} \\
&= \underbrace{\frac{N(N-1)\cdots(N-n+1)}{N^n}}_{\to 1}\, \frac{\alpha^n}{n!}\, \underbrace{\Big(1 - \frac{\alpha}{N}\Big)^{N}}_{\to e^{-\alpha}}\, \underbrace{\Big(1 - \frac{\alpha}{N}\Big)^{-n}}_{\to 1} \tag{2.22}
\end{align*}
A standard application of the Poisson distribution is radioactive decay: let $q = \lambda \Delta t$ be the decay probability in a time interval $\Delta t = \frac{T}{N}$. If $n$ denotes the number of decays in the total time $T$, then the probability is obtained as:
\begin{equation*}
p(n) = \frac{(\lambda T)^n}{n!}\, e^{-\lambda T} \tag{2.23}
\end{equation*}
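The limit (2.22) can be observed numerically. The following sketch (our own illustration, with hypothetical function names) compares the binomial distribution with $q = \alpha/N$ against the Poisson distribution for large $N$:

```python
import math

def binom_pmf(n, N, q):
    """Binomial probability P_N({n}), eq. (2.19)."""
    return math.comb(N, n) * q**n * (1 - q) ** (N - n)

def poisson_pmf(n, alpha):
    """Poisson probability p(n), eq. (2.21)."""
    return alpha**n / math.factorial(n) * math.exp(-alpha)

alpha = 2.0
# With q = alpha/N fixed, the binomial probabilities approach the Poisson
# ones as N grows; the deviation over small n shrinks like O(1/N).
err_coarse = max(abs(binom_pmf(n, 100, alpha / 100) - poisson_pmf(n, alpha))
                 for n in range(10))
err_fine = max(abs(binom_pmf(n, 10_000, alpha / 10_000) - poisson_pmf(n, alpha))
               for n in range(10))
```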

2.3.3 Ising model

The Ising model is basically a probability distribution for spins on a lattice. For each lattice site $i$ (atom), there is a spin taking values $\sigma_i \in \{\pm 1\}$. So an individual spin configuration $C$ is a collection $C = \{\sigma_i\}$ of spin values, one for each site. In $d$ dimensions, the lattice is usually taken to be a volume $V = [0, L]^d \subset \mathbb{Z}^d$. The number of lattice sites is then $|V| = L^d$ (up to boundary corrections), and the set of possible configurations $C = \{\sigma_i\}$ is $\Omega = \{C\} = \{-1, 1\}^{|V|}$, since each spin can take precisely two values. In the Ising model, one assigns to each configuration an energy
\begin{equation*}
H(C) = -J \sum_{ik} \sigma_i \sigma_k - b \sum_i \sigma_i, \tag{2.24}
\end{equation*}
where $J, b$ are parameters, and where the first sum runs over all lattice bonds $ik$ in the volume $V$. The second sum runs over all lattice sites in $V$. The probability of a configuration is then given by the Boltzmann weight
\begin{equation*}
p(C) = \frac{1}{Z} \exp[-\beta H(C)], \tag{2.25}
\end{equation*}
where $\beta$ is the inverse temperature.
A large coupling constant $J \gg 1$ energetically favors adjacent spins being parallel, and a large $b \gg 1$ favors spins pointing preferentially up ($+1$). The coupling $b$ can thus be thought of as an external magnetic field. $Z = Z(V, J, b)$ is a normalization constant ensuring that all the probabilities add up to unity. For the computation of $Z$ in dimension $d=1$ and for $b=0$, see problem B.5. Of particular interest in the Ising model are the mean magnetization $m = |V|^{-1} \sum \langle\sigma_i\rangle$ and the free energy density $f = -\beta^{-1} |V|^{-1} \log Z$, see problem B.16. Another quantity of interest is the two-point function $\langle\sigma_i \sigma_j\rangle$ in the limit $V \to \mathbb{Z}^d$ (called the "thermodynamic limit") at large separation between $i$ and $j$.
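To make the setup concrete, here is a small brute-force check (our own sketch, not from the notes): for the open chain in $d = 1$ with $b = 0$, summing the Boltzmann weights over all $2^N$ configurations reproduces the standard closed form $Z = 2\,(2\cosh\beta J)^{N-1}$:

```python
import math
from itertools import product

def ising_Z_brute(N, beta, J):
    """Partition function of the open 1d Ising chain (b = 0) by summing
    the Boltzmann weight over all 2^N spin configurations."""
    Z = 0.0
    for spins in product((-1, 1), repeat=N):
        # Energy (2.24) with b = 0: sum over nearest-neighbour bonds.
        H = -J * sum(spins[i] * spins[i + 1] for i in range(N - 1))
        Z += math.exp(-beta * H)
    return Z

N, beta, J = 10, 0.5, 1.0
Z = ising_Z_brute(N, beta, J)
Z_exact = 2 * (2 * math.cosh(beta * J)) ** (N - 1)   # closed form, open chain
```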

2.3.4 Site percolation

Consider a finite lattice $V = [0, L]^d \subset \mathbb{Z}^d$. Now we occupy each lattice site $i$ randomly with a spin $\sigma_i = +1$ with probability $q$, and with spin $\sigma_i = -1$ with probability $1-q$. If we assume that these choices are independent, then the probability of a configuration $C = \{\sigma_i\}$ of spin values is
\begin{equation*}
p(C) = q^{N_+} (1-q)^{N_-}, \tag{2.26}
\end{equation*}
with $N_\pm$ the number of $+$ resp. $-$ spins. The probability space is, as before, $\Omega = \{C\}$. A physical interpretation of such a model is, for example, that sites occupied by $+$ spins are conducting, whereas those occupied by $-$ spins are not. A cluster is a set of sites containing only $+$ spins such that between any two sites of the cluster there is a path containing only $+$ spins. If there is a cluster spanning from one side of the lattice to the opposite side, then the lattice is a conductor; otherwise it is an insulator.
Interesting observables are, for example:
  • $n_s(C)$: the normalized number of clusters of size $s$ in $C$, i.e., the number of such clusters divided by the number of all lattice sites (i.e., $L^d$).
  • $\chi_i(C)$: equal to one if $i$ belongs to a cluster in $C$ and zero otherwise.
  • $\xi(C)$: the average of the sizes of all clusters in $C$, etc.
We can then form expectation values as usual, for instance
\begin{equation*}
\langle n_s\rangle = \sum_C n_s(C)\, p(C) = \text{average normalized number of clusters of size } s, \tag{2.27}
\end{equation*}
so $s \langle n_s\rangle$ is the probability that an arbitrarily chosen site belongs to a cluster of size $s$. We can also consider
\begin{equation*}
N = \Big\langle \sum_{s=1}^{\infty} n_s \Big\rangle = \text{average normalized number of clusters.} \tag{2.28}
\end{equation*}
The probability that a site occupied by a $+$ spin belongs to a cluster of size $s$ is
\begin{equation*}
\frac{s \langle n_s\rangle}{\sum_{s'=1}^{\infty} s' \langle n_{s'}\rangle}. \tag{2.29}
\end{equation*}
Consequently, the mean size of the clusters is
\begin{equation*}
S = \sum_{s=1}^{\infty} s\, \frac{s \langle n_s\rangle}{\sum_{s'=1}^{\infty} s' \langle n_{s'}\rangle}. \tag{2.30}
\end{equation*}
Physically, one is interested in how, for example, $S$ depends on $q$. Can the clusters become macroscopically large for sufficiently large $q$? If this is the case, then the material is conducting; otherwise it is an insulator, and the transition from one behavior to the other is thought of as a "phase transition". To study this question more precisely, one takes $L \to \infty$ and asks whether $S$ can diverge as $q$ approaches some critical value $q_*$. Near such a critical value, one expects for instance
\begin{equation*}
S \propto (q - q_*)^{-\gamma} \tag{2.31}
\end{equation*}
and calls $\gamma$ a "critical exponent". Other critical exponents can be defined as well. For instance, let $P$ be the probability that a given "$+$" site belongs to an infinite cluster. This number can be computed as
\begin{equation*}
P = 1 - \frac{1}{q} \sum_{s=1}^{\infty} s \langle n_s\rangle, \tag{2.32}
\end{equation*}
where the limit $L \to \infty$ is again understood in the end. To see this, consider an arbitrary lattice site. It either has spin $-$, or it has spin $+$ and therefore belongs either to some finite cluster of size $s$ or to an infinite cluster. This means $1 = 1 - q + \sum_{s=1}^{\infty} s \langle n_s\rangle + qP$, which gives the statement. Near the critical value $q_*$ we then likewise expect
\begin{equation*}
P \propto \begin{cases} 0 & \text{for } q < q_* \\ (q - q_*)^{\beta} & \text{for } q \geqslant q_* \end{cases} \tag{2.33}
\end{equation*}
The numbers $\gamma, \beta$ and similar quantities are called "critical exponents". Analogous quantities can be defined for many systems, and their determination is one of the standard problems in statistical mechanics, since, as in this model, the appearance of a macroscopic cluster signals the change of a macroscopic property of the system (conductor vs. insulator here). The importance of the values of the exponents is that we may expect them to be independent of the precise nature of the model at the microscopic level. For instance, we would expect to get the same values if we replace the cubic lattice by some other lattice which is not too different.
Unfortunately, the determination of critical points and critical exponents is a very complicated business, and we will not have the time to introduce the methods for doing this in this lecture course. By way of an example, let us treat the trivial case $d=1$ (chain) of the percolation model, which can be handled by elementary means. In this example, we can immediately calculate the normalized cluster number $\langle n_s\rangle$: On the one hand, as we have already noted, the probability that an arbitrarily chosen site $i$ belongs to a cluster of size $s$ is given by $s \langle n_s\rangle$. On the other hand, it equals $s q^s (1-q)^2$, because $s$ consecutive sites have to carry spin $+$ (factor $q^s$), the sites at the left and right boundaries of the cluster have to carry spin $-$ (factor $(1-q)^2$), and there are $s$ clusters of size $s$ containing the chosen site $i$ (factor $s$). Thus, we conclude that
\begin{equation*}
\langle n_s\rangle = q^s (1-q)^2. \tag{2.34}
\end{equation*}
As long as there is no infinite cluster, we also have $\sum_{s=1}^{\infty} s \langle n_s\rangle = q$, because the probabilities that an arbitrarily chosen site belongs to a cluster of size $s$ add up to the probability that this site is occupied by a $+$ spin. As a consequence, our formula for $S$ gives
\begin{align*}
S &= \frac{1}{q} \sum_{s=1}^{\infty} s^2 \langle n_s\rangle \\
&= \frac{1}{q} \sum_{s=1}^{\infty} s^2 q^s (1-q)^2 \\
&= \frac{(1-q)^2}{q} \Big(q \frac{d}{dq}\Big)^2 \sum_{s=1}^{\infty} q^s \tag{2.35}\\
&= \frac{(1-q)^2}{q} \Big(q \frac{d}{dq}\Big)^2 \frac{q}{1-q} \\
&= \frac{1+q}{1-q}.
\end{align*}
So we read off that $q_* = 1$ and that $\gamma = 1$ for $d=1$.
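The series (2.35) can also be checked numerically against the closed form (our own sketch; the truncation point is arbitrary):

```python
def mean_cluster_size(q, smax=4000):
    """S = (1/q) * sum_s s^2 <n_s> with <n_s> = q^s (1-q)^2, eq. (2.34),
    truncated at smax (the tail is exponentially small for q < 1)."""
    return sum(s * s * q**s * (1 - q) ** 2 for s in range(1, smax + 1)) / q

# Compare the truncated series with the closed form S = (1+q)/(1-q),
# which diverges as q approaches the critical value q_* = 1.
q = 0.8
S_series = mean_cluster_size(q)
S_closed = (1 + q) / (1 - q)
```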

2.3.5 Random walk on a lattice

A walk $\omega$ in a volume $V$ of a lattice as in the Ising model can be characterized by the sequence of sites $\omega = (x, i_1, i_2, \ldots, i_{N-1}, y)$ encountered by the walker, where $x$ is the fixed starting point and $y$ the fixed endpoint. The number of sites in the walk is denoted $l(\omega)$ ($= N+1$ in the example), and the number of self-intersections is denoted by $n(\omega)$. The set of walks from $x$ to $y$ is our probability space $\Omega_{x,y}$, and a natural probability distribution on it is
\begin{equation*}
P(\omega) = \frac{1}{Z}\, e^{-\mu l(\omega) - g n(\omega)}. \tag{2.36}
\end{equation*}
Here, $\mu, g$ are positive constants. For $\mu \gg 1$, short walks between $x$ and $y$ are favored, and for $g \gg 1$, self-avoiding walks are favored. $Z = Z_{x,y}(V, \mu, g)$ is a normalization constant ensuring that the probabilities add up to unity. Of interest are, e.g., the "free energy density" $f = |V|^{-1} \log Z$, or the average number of steps the walk spends in a given subset $S \subset V$, given by $\langle \#\{S \cap \omega\}\rangle$.
In general, such observables are very difficult to calculate, but for $g = 0$ (unconstrained walks) there is a nice connection between $Z$ and the Gaussian distribution, which is the starting point for many further results. Let $\partial_\alpha f(i) = f(i + \vec e_\alpha) - f(i)$ be the "lattice partial derivative" of a function $f(i)$ defined on the lattice sites $i \in V$, in the direction of the $\alpha$-th unit vector $\vec e_\alpha$, $\alpha = 1, \ldots, d$, and let $\Delta = \sum_\alpha \partial_\alpha^2$ be the "lattice Laplacian". The lattice Laplacian can be identified with a matrix $\Delta_{ij}$ of size $|V| \times |V|$, defined by $\Delta f(i) = \sum_j \Delta_{ij} f(j)$. Define the covariance matrix as $C = (-\Delta + m^2)^{-1}$ and consider the corresponding Gaussian measure for the variables $\{\phi_i\} \in \mathbb{R}^{|V|}$ (one real variable per lattice site in $V$). One shows that
\begin{equation*}
Z_{x,y} = \langle \phi_x \phi_y\rangle \equiv \frac{1}{(2\pi)^{|V|/2}(\det C)^{1/2}} \int \phi_x \phi_y\, e^{-\frac{1}{2} \sum \phi_i (-\Delta + m^2)_{ij} \phi_j}\, d^{|V|}\phi \tag{2.37}
\end{equation*}
for $g = 0$, $\mu = \log(2d + m^2)$ (exercise).
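The claimed identity can be checked numerically on a small lattice. The sketch below (our own illustration; the ring geometry and parameter values are our choices) uses a one-dimensional ring with the symmetric lattice Laplacian. The walk representation follows from expanding $C = \big((2d+m^2)\mathbb{1} - A\big)^{-1} = \sum_{k \geq 0} A^k / (2d+m^2)^{k+1}$, $A$ the adjacency matrix, since $(A^k)_{xy}$ counts the $k$-step nearest-neighbor walks from $x$ to $y$ and a $k$-step walk has $l(\omega) = k+1$ sites:

```python
import numpy as np

L, m2 = 8, 1.0                      # ring of L sites (d = 1), toy mass^2
A = np.zeros((L, L))                # nearest-neighbour adjacency matrix
for i in range(L):
    A[i, (i + 1) % L] = A[i, (i - 1) % L] = 1.0

# Covariance C = (-Delta + m^2)^{-1}, with the symmetric lattice Laplacian
# Delta = A - 2d*1, so that -Delta + m^2 = (2d + m^2)*1 - A.
M = (2 * 1 + m2) * np.eye(L) - A
C = np.linalg.inv(M)

# Sum over unconstrained walks from x to y, weighted by e^{-mu*l(w)} with
# l(w) = k + 1 sites for a k-step walk and e^{-mu} = 1/(2d + m^2):
x, y = 0, 3
w = 1.0 / (2 * 1 + m2)
Ak, Z_walks = np.eye(L), 0.0
for k in range(300):                # geometric series converges (2/3 < 1)
    Z_walks += Ak[x, y] * w ** (k + 1)
    Ak = Ak @ A
```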

2.4 Ensembles in Classical Mechanics

The basic ideas of probability theory outlined in the previous sections can be used for the statistical description of systems obeying the laws of classical mechanics. Consider a classical system of $N$ particles, described by $6N$ phase space coordinates${}^{1}$, which we abbreviate as
\begin{equation*}
(P, Q) = (\vec p_1, \ldots, \vec p_N;\, \vec x_1, \ldots, \vec x_N) \in \mathbb{R}^{(3+3)N} = \Omega. \tag{2.38}
\end{equation*}
A classical ensemble is simply a probability density function $\rho(P, Q)$, i.e.
\begin{equation*}
\int_\Omega \rho(P, Q)\, d^{3N}P\, d^{3N}Q = 1, \qquad \rho(P, Q) \geqslant 0. \tag{2.39}
\end{equation*}
According to the basic concepts of probability theory, the ensemble average of an observable $F(P, Q)$ is then simply
\begin{equation*}
\langle F(P, Q)\rangle = \int_\Omega F(P, Q)\, \rho(P, Q)\, d^{3N}Q\, d^{3N}P. \tag{2.40}
\end{equation*}
The probability distribution $\rho(P, Q)$ represents our limited knowledge about the system, which, in reality, is of course supposed to be described by a single trajectory $(P(t), Q(t))$ in phase space. In practice, we cannot know this trajectory precisely except for a very small number of particles $N$, and, in some sense, we do not really want to know it at all. The idea behind ensembles is rather that the time evolution (= the phase space trajectory $(Q(t), P(t))$) typically scans the entire accessible phase space (or sufficiently large parts of it), such that the time average of $F$ equals the ensemble average of $F$; i.e., in many cases we expect to have:
\begin{equation*}
\lim_{T \to \infty} \frac{1}{T} \int_0^T F(P(t), Q(t))\, dt = \langle F(P, Q)\rangle \tag{2.41}
\end{equation*}
for a suitable (stationary) probability density function. This is closely related to the "ergodic theorem", which in turn is related to the fact that the equations of motion are derivable from a (time-independent) Hamiltonian. Hamilton's equations are
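A minimal illustration of (2.41) (our own sketch, not from the notes) is the one-dimensional harmonic oscillator $H = (p^2 + x^2)/2$: its trajectory traverses the whole energy surface (a circle of radius $\sqrt{2E}$), and the time average of $F = x^2$ converges to the ensemble average over that circle, which equals $E$:

```python
import math

# Harmonic oscillator H = (p^2 + x^2)/2, integrated with the (symplectic)
# leapfrog scheme; the initial condition lies on the energy surface E = 1/2.
x, p, dt, T = 1.0, 0.0, 1e-3, 200.0
E = 0.5 * (p * p + x * x)
acc = 0.0
for _ in range(int(T / dt)):
    p -= 0.5 * dt * x     # half kick
    x += dt * p           # drift
    p -= 0.5 * dt * x     # half kick
    acc += x * x * dt

time_avg = acc / T        # time average of F = x^2 along the trajectory
ens_avg = E               # average of x^2 over the circle of radius sqrt(2E)
```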
\begin{equation*}
\dot x_{i\alpha} = \frac{\partial H}{\partial p_{i\alpha}}, \qquad \dot p_{i\alpha} = -\frac{\partial H}{\partial x_{i\alpha}}, \tag{2.42}
\end{equation*}
where $i = 1, \ldots, N$ and $\alpha = 1, 2, 3$. The Hamiltonian $H$ is typically of the form
\begin{equation*}
H = \underbrace{\sum_i \frac{\vec p_i^{\,2}}{2m}}_{\text{kinetic energy}} + \underbrace{\sum_{i<j} \mathcal{V}(\vec x_i - \vec x_j)}_{\text{interaction}} + \underbrace{\sum_j \mathcal{W}(\vec x_j)}_{\text{external potential}} \tag{2.43}
\end{equation*}
if there are no internal degrees of freedom. It is a standard theorem in classical mechanics that the energy $E = H(P, Q)$ is conserved under time evolution. Let us imagine a well-potential $\mathcal{W}(\vec x)$ as in the following picture:
Figure 2.2: Sketch of a well-potential $\mathcal{W}$.
Then $\Omega_E = \{(P, Q) \mid H(P, Q) = E\}$ is compact for sufficiently large $\mathcal{W}_0$. We call $\Omega_E$ the energy surface. The "hyper"-area of this surface is denoted by $|\Omega_E|$, so
\begin{equation*}
|\Omega_E| \equiv \int_{\Omega_E} dS \equiv \int_\Omega \delta(H(P, Q) - E)\, d^{3N}P\, d^{3N}Q \tag{2.44}
\end{equation*}
Particle trajectories do not leave this surface by energy conservation. If Hamilton's equations admit other constants of motion, then it is natural to define a corresponding surface with respect to all constants of motion.
An important feature of the dynamics given by Hamilton's equations is
Liouville's Theorem: The flow map $\Phi_t : (P, Q) \mapsto (P(t), Q(t))$ is volume-preserving.
Figure 2.3: Evolution of a phase space volume under the flow map $\Phi_t$.
Proof of the theorem: Let $(P', Q') = (P(t), Q(t))$, with initial conditions $(P(0) = P,\, Q(0) = Q)$. Then we have
\begin{equation*}
d^{3N}P'\, d^{3N}Q' = \frac{\partial(P', Q')}{\partial(P, Q)}\, d^{3N}P\, d^{3N}Q, \tag{2.45}
\end{equation*}
and we would like to show that the Jacobian $J_{P,Q}(t) = \frac{\partial(P', Q')}{\partial(P, Q)}$ equals $1$ for all $t$. Since the flow evidently satisfies $\Phi_{t+t'}(P, Q) = \Phi_{t'}(\Phi_t(P, Q))$, the chain rule and the multiplicativity of Jacobians imply $J_{P,Q}(t+t') = J_{P,Q}(t)\, J_{P',Q'}(t')$. It therefore suffices to show that $\partial J_{P,Q}/\partial t\,(0) = 0$ at every phase space point. For small $t$, we can expand as follows:
P = P + t P ˙ + O ( t 2 ) = P t H Q + O ( t 2 ) Q = Q + t Q ˙ + O ( t 2 ) = Q + t H P + O ( t 2 ) P = P + t P ˙ + O t 2 = P t H Q + O t 2 Q = Q + t Q ˙ + O t 2 = Q + t H P + O t 2 {:[P^(')=P+tP^(˙)+O(t^(2))=P-t(del H)/(del Q)+O(t^(2))],[Q^(')=Q+tQ^(˙)+O(t^(2))=Q+t(del H)/(del P)+O(t^(2))]:}\begin{aligned} & P^{\prime}=P+t \dot{P}+\mathcal{O}\left(t^{2}\right)=P-t \frac{\partial H}{\partial Q}+\mathcal{O}\left(t^{2}\right) \\ & Q^{\prime}=Q+t \dot{Q}+\mathcal{O}\left(t^{2}\right)=Q+t \frac{\partial H}{\partial P}+\mathcal{O}\left(t^{2}\right) \end{aligned}P=P+tP˙+O(t2)=PtHQ+O(t2)Q=Q+tQ˙+O(t2)=Q+tHP+O(t2)
It follows that
J P , Q ( t ) = ( P , Q ) ( P , Q ) = det [ ( 1 3 N × 3 N 0 0 1 3 N × 3 N ) + t ( P Q H Q 2 H P 2 H Q P H ) + O ( t 2 ) ] = 1 + ttr ( P Q H Q 2 H P 2 H Q P H ) + O ( t 2 ) = 1 + t ( α , i 2 H x i α p i α + α , i 2 H p i α x i α = 0 ) + O ( t 2 ) = 1 + O ( t 2 ) . J P , Q ( t ) = P , Q ( P , Q ) = det 1 3 N × 3 N 0 0 1 3 N × 3 N + t P Q H Q 2 H P 2 H Q P H + O t 2 = 1 + ttr P Q H Q 2 H P 2 H Q P H + O t 2 = 1 + t ( α , i 2 H x i α p i α + α , i 2 H p i α x i α = 0 ) + O t 2 = 1 + O t 2 . {:[J_(P,Q)(t)=(del(P^('),Q^(')))/(del(P,Q))],[=det[([1_(3N xx3N),0],[0,1_(3N xx3N)])+t([-del_(P)del_(Q)H,-del_(Q)^(2)H],[del_(P)^(2)H,del_(Q)del_(P)H])+O(t^(2))]],[=1+ttr([-del_(P)del_(Q)H,-del_(Q)^(2)H],[del_(P)^(2)H,del_(Q)del_(P)H])+O(t^(2))],[=1+t(ubrace(-sum_(alpha,i)(del^(2)H)/(delx_(i alpha)delp_(i alpha))+sum_(alpha,i)(del^(2)H)/(delp_(i alpha)delx_(i alpha))ubrace)_(=0))+O(t^(2))],[=1+O(t^(2)).]:}\begin{aligned} J_{P, Q}(t) & =\frac{\partial\left(P^{\prime}, Q^{\prime}\right)}{\partial(P, Q)} \\ & =\operatorname{det}\left[\left(\begin{array}{cc} \mathbb{1}_{3 N \times 3 N} & 0 \\ 0 & \mathbb{1}_{3 N \times 3 N} \end{array}\right)+t\left(\begin{array}{cc} -\partial_{P} \partial_{Q} H & -\partial_{Q}^{2} H \\ \partial_{P}^{2} H & \partial_{Q} \partial_{P} H \end{array}\right)+\mathcal{O}\left(t^{2}\right)\right] \\ & =1+\operatorname{ttr}\left(\begin{array}{cc} -\partial_{P} \partial_{Q} H & -\partial_{Q}^{2} H \\ \partial_{P}^{2} H & \partial_{Q} \partial_{P} H \end{array}\right)+\mathcal{O}\left(t^{2}\right) \\ & =1+t(\underbrace{-\sum_{\alpha, i} \frac{\partial^{2} H}{\partial x_{i \alpha} \partial p_{i \alpha}}+\sum_{\alpha, i} \frac{\partial^{2} H}{\partial p_{i \alpha} \partial x_{i \alpha}}}_{=0})+\mathcal{O}\left(t^{2}\right) \\ & =1+\mathcal{O}\left(t^{2}\right) . \end{aligned}JP,Q(t)=(P,Q)(P,Q)=det[(13N×3N0013N×3N)+t(PQHQ2HP2HQPH)+O(t2)]=1+ttr(PQHQ2HP2HQPH)+O(t2)=1+t(α,i2Hxiαpiα+α,i2Hpiαxiα=0)+O(t2)=1+O(t2).
(Here we used det ( I + t A ) = 1 + t tr A + O ( t 2 ) det ( I + t A ) = 1 + t tr A + O t 2 det(I+tA)=1+t tr A+O(t^(2))\operatorname{det}(I+t A)=1+t \operatorname{tr} A+\mathcal{O}\left(t^{2}\right)det(I+tA)=1+ttrA+O(t2) for any matrix A.) This implies
\begin{equation*}
\partial J_{P,Q}/\partial t\,(0) = \lim_{t \to 0} \frac{1}{t}\left(J_{P,Q}(t) - J_{P,Q}(0)\right) = \lim_{t \to 0} \frac{1}{t}\, \mathcal{O}(t^2) = 0, \tag{2.46}
\end{equation*}

using $J_{P,Q}(0) = 1$. The functional equation for the Jacobian then implies that the time derivative vanishes for arbitrary $t$:

\begin{equation*}
\frac{\partial}{\partial t} J_{P,Q}(t) = \frac{\partial}{\partial t'} J_{P,Q}(t + t')\Big|_{t'=0} = J_{P,Q}(t)\, \frac{\partial}{\partial t'} J_{P',Q'}(t')\Big|_{t'=0} = 0. \tag{2.47}
\end{equation*}

Together with $J_{P,Q}(0) = 1$, this gives the result $J_{P,Q}(t) = 1$ for all $t$, i.e. the flow is area-preserving.
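As a numerical sanity check of Liouville's theorem (our own sketch, not part of the notes), the following code integrates Hamilton's equations for a single degree of freedom with the illustrative Hamiltonian $H = p^2/2 + q^4/4$ and estimates the Jacobian determinant of the flow map $\Phi_t$ by finite differences; the function names, step sizes, and initial data are arbitrary choices.

```python
def flow(p, q, t, dt=1e-4):
    """Leapfrog (symplectic) integration of p' = -dH/dq = -q^3, q' = dH/dp = p."""
    for _ in range(int(round(t / dt))):
        p -= 0.5 * dt * q**3
        q += dt * p
        p -= 0.5 * dt * q**3
    return p, q

def jacobian_det(p0, q0, t, eps=1e-6):
    """Central-difference estimate of det d(P', Q')/d(P, Q) at (p0, q0)."""
    pp, qp = flow(p0 + eps, q0, t)
    pm, qm = flow(p0 - eps, q0, t)
    dPdp, dQdp = (pp - pm) / (2 * eps), (qp - qm) / (2 * eps)
    pp, qp = flow(p0, q0 + eps, t)
    pm, qm = flow(p0, q0 - eps, t)
    dPdq, dQdq = (pp - pm) / (2 * eps), (qp - qm) / (2 * eps)
    return dPdp * dQdq - dPdq * dQdp

J = jacobian_det(0.3, 1.0, t=2.0)  # Liouville: should stay equal to 1
```

The leapfrog scheme is itself symplectic, so the discrete flow preserves phase space area exactly; the residual deviation of `J` from $1$ comes only from the finite-difference approximation.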
If we start with a classical ensemble, described by a probability distribution $\rho(P, Q)$ on phase space, then its time evolution is defined as

\begin{equation*}
\rho(P, Q; t) \equiv \rho(P(-t), Q(-t)) \equiv \rho_t(P, Q), \tag{2.48}
\end{equation*}

where $(P(t), Q(t))$ are the phase space trajectories. Using our notation for the flow map, we could also write this as

\begin{equation*}
\rho_t(P, Q) = \rho(\Phi_{-t}(P, Q)). \tag{2.49}
\end{equation*}

The reason for the ``$-$'' is as follows: the probability to be at $(P, Q)$ at time $t$ should be given by the probability for having been at $\Phi_{-t}(P, Q)$ at time $0$. Note that the time evolution of observables (corresponding to the Heisenberg picture in quantum mechanics) is defined oppositely:

\begin{equation*}
F_t(P, Q) = F(P(t), Q(t)). \tag{2.50}
\end{equation*}

Using Liouville's theorem, one then easily checks that $\langle F \rangle_{\rho_t} = \langle F_t \rangle_\rho$.

Differentiating (2.49) with respect to $t$ gives

\begin{equation*}
\frac{\partial}{\partial t} \rho_t(P, Q) = -\frac{\partial \rho_t(P, Q)}{\partial P} \underbrace{\frac{\partial P}{\partial t}}_{=-\frac{\partial H}{\partial Q}} - \frac{\partial \rho_t(P, Q)}{\partial Q} \underbrace{\frac{\partial Q}{\partial t}}_{=\frac{\partial H}{\partial P}} = \{H, \rho_t\}(P, Q), \tag{2.51}
\end{equation*}
where $\{\cdot, \cdot\}$ denotes the Poisson bracket. An ensemble is called stationary if $\rho_t$ remains constant, i.e. $\partial \rho_t / \partial t = 0$. It follows that an ensemble is stationary if and only if $\rho$ Poisson-commutes with $H$, that is $\{\rho, H\} = 0$. Examples of stationary ensembles are thus the functions of $H$, i.e.

\begin{equation*}
\rho(P, Q) = f(H(P, Q)), \tag{2.52}
\end{equation*}

where $f$ is some function. A particular example of a stationary ensemble is

\begin{equation*}
\rho(P, Q) = \frac{1}{Z_\beta} e^{-\beta H(P, Q)}, \tag{2.53}
\end{equation*}

where $\beta > 0$ is a parameter and the normalization factor $Z_\beta = \int e^{-\beta H}\, d^{3N}P\, d^{3N}Q$ ensures that $\rho$ is properly normalized. We will come back to this ensemble below.
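For a single degree of freedom, such a normalization factor can be evaluated directly. A minimal sketch (our own illustration, assuming the harmonic Hamiltonian $H = (p^2 + q^2)/2$, for which the Gaussian integral gives $Z_\beta = 2\pi/\beta$ exactly):

```python
import numpy as np

beta = 2.0
# Phase-space grid for one degree of freedom (illustrative choice of cutoff)
p = np.linspace(-8, 8, 1001)
q = np.linspace(-8, 8, 1001)
P, Q = np.meshgrid(p, q)
H = 0.5 * (P**2 + Q**2)

dp = p[1] - p[0]
Z_numeric = np.sum(np.exp(-beta * H)) * dp * dp  # Riemann sum for Z_beta
Z_exact = 2 * np.pi / beta                       # analytic Gaussian integral
```

The cutoff at $|p|, |q| \leqslant 8$ is harmless here because $e^{-\beta H}$ is already negligible at the boundary.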
The flow $\Phi_t$ is not only area-preserving on the entire phase space, but also on the energy surface $\Omega_E$ (with the natural integration element understood). Under certain conditions, such area-preserving flows imply that the phase space average equals the time average, cf. (2.41). This is expressed by the ergodic theorem:

Theorem: Let the orbit $(P(t), Q(t))$ be dense in $\Omega_E$ and let $F$ be continuous. Then the time average is equal to the ensemble average:

\begin{equation*}
\lim_{T \to \infty} \frac{1}{T} \int_0^T F(P(t), Q(t))\, dt = \frac{1}{|\Omega_E|} \int_{\Omega_E} F(P, Q)\, dS. \tag{2.54}
\end{equation*}
The ergodic theorem may thus be summarized as

time average = ensemble average,

where the ``ensemble'' is the uniform distribution on $\Omega_E$ given by $\rho(P, Q) = 1/|\Omega_E|$ (we will come back to this ensemble later in the context of ``$S = k_B \log W$'', where it is also called the ``micro-canonical ensemble''). In quantum theory, there is an analogue of this phenomenon going under the name ``eigenstate thermalization'', which is outlined in the appendix.

The key hypothesis is that the orbit lies dense in $\Omega_E$ and that this surface is compact. The first condition clearly fails if there are further constants of motion, since the orbit must then lie on a submanifold of $\Omega_E$ corresponding to particular values of these constants. The Kolmogorov-Arnold-Moser (KAM) theorem shows that small perturbations of systems with sufficiently many constants of motion again possess such invariant submanifolds, i.e. the ergodic theorem does not hold in such cases. Nevertheless, the ergodic theorem remains an important motivation for studying ensembles.
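The mechanism behind (2.54) — a dense orbit equidistributes, so time averages converge to the uniform average — can be illustrated with a much simpler toy system than a Hamiltonian flow: an irrational rotation of the circle. This discrete-time stand-in is our own illustration, not an example from the notes.

```python
import numpy as np

# Rotation theta -> theta + 2*pi*alpha on the circle; for irrational alpha
# the orbit is dense and equidistributed (Weyl), so time avg = uniform avg.
alpha = (np.sqrt(5) - 1) / 2        # golden-ratio rotation number
F = lambda th: np.cos(th) ** 2      # a continuous observable

n = np.arange(1_000_000)
orbit = (0.1 + 2 * np.pi * alpha * n) % (2 * np.pi)

time_avg = F(orbit).mean()
ensemble_avg = 0.5                  # (1/2pi) * integral of cos^2 = 1/2
```

For a rational rotation number the orbit is periodic (an invariant submanifold in miniature), and the time average would instead depend on the initial point.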
One puzzling consequence of Liouville's theorem is that a trajectory starting at $(P_0, Q_0)$ comes back arbitrarily close to that point, a phenomenon called Poincaré recurrence. An intuitive ``proof'' of this statement can be given as follows:

Figure 2.4: Sketch of the situation described in the proof of Poincaré recurrence.

Let $B_0$ be an $\epsilon$-neighborhood of a point $(P_0, Q_0)$. For $k \in \mathbb{N}$ define $B_k := \Phi_k(B_0)$, which are neighborhoods of $(P_k, Q_k) = \Phi_k((P_0, Q_0))$. Let us assume that the statement of the theorem is wrong. This yields

\begin{equation*}
B_0 \cap B_k = \varnothing \quad \forall k \in \mathbb{N}.
\end{equation*}
Applying $\Phi_n$ and using $B_n \cap B_{k+n} = \Phi_n(B_0 \cap B_k)$, it then follows that

\begin{equation*}
B_n \cap B_k = \varnothing \quad \forall n, k \in \mathbb{N},\ n \neq k.
\end{equation*}

Now, by Liouville's theorem we have

\begin{equation*}
|B_0| = |B_1| = \ldots = |B_k| = \ldots
\end{equation*}

which immediately yields

\begin{equation*}
|\Omega_E| \geqslant |B_0| + \ldots + |B_k| + \ldots = \infty.
\end{equation*}

This clearly contradicts the assumption that $\Omega_E$ is compact (hence of finite volume), and therefore the statement of the theorem has to be true.
Historically, the recurrence argument played an important role in early discussions of the notion of irreversibility, i.e. the fact that systems generically tend to approach an equilibrium state, whereas they never seem to spontaneously leave an equilibrium state and evolve back to the (non-equilibrium) initial conditions. Explaining the origin of, and the mechanisms behind, irreversibility is one of the major challenges of non-equilibrium thermodynamics, and we shall briefly come back to this point later. For the moment, we simply note that in practice the recurrence time $\tau_{\text{recurrence}}$ would be extremely large compared to the natural scales of the system, such as the equilibration time. We will verify this by investigating the dynamics of a toy model in the appendix. Here we only give a heuristic explanation. Consider a gas of $N$ particles in a volume $V$. The volume is partitioned into sub-volumes $V_1, V_2$ of equal size. We start the system in a state where the atoms only occupy $V_1$. By the ergodic theorem we estimate that the fraction of time the system spends in such a state is $\langle \chi_{Q \in V_1} \rangle = 2^{-N}$ (for an ideal gas), where $\chi_{Q \in V_1}$ equals $1$ if all particles are in $V_1$, and zero otherwise. For $N = 1\,\mathrm{mol}$, i.e. $N = \mathcal{O}(10^{23})$, this fraction is astronomically small. So there is no real puzzle!
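The size of this fraction is easy to quantify. A small sketch (our own, assuming each particle independently occupies either half of the volume with probability $1/2$):

```python
import math

def log10_fraction(N):
    """log10 of the fraction of time all N particles sit in one half volume."""
    return -N * math.log10(2)

f_small = 10 ** log10_fraction(10)   # N = 10: about 1e-3, still observable
exp_mole = log10_fraction(6.022e23)  # N = 1 mol: exponent of order -1.8e23
```

For a mole of gas the fraction is $10^{-1.8 \times 10^{23}}$, a number with more zeros after the decimal point than there are particles in the observable universe.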
One often has the situation that a system can be divided up into two (or more) parts which can be treated approximately as isolated, or which have some simple well-known interaction. One way to model this situation is to suppose that the phase space is a direct product $\Omega = \Omega_A \times \Omega_B$, where $A$ and $B$ label the subsystems. For instance, in an $N$-particle system, $\Omega_A$ could comprise the phase space coordinates of one (or more) distinguished particle, and $\Omega_B$ those of all the others. If we have a probability density (ensemble) of the total system, i.e. a function $\rho(P, Q)$ on $\Omega$, we may decide to probe it with observables of system $A$ only, i.e. with observables $F_A = F_A(P_A, Q_A)$ which are functions of the phase space coordinates of $\Omega_A$ only. It is then clear that for such an $F_A$, we can write (with $P = (P_A, P_B)$, $Q = (Q_A, Q_B)$)

\begin{equation*}
\langle F_A \rangle = \int_{\Omega_A \times \Omega_B} \rho(P_A, P_B, Q_A, Q_B)\, F_A(P_A, Q_A) = \int_{\Omega_A} \rho_A(P_A, Q_A)\, F_A(P_A, Q_A), \tag{2.56}
\end{equation*}

so it is natural to make the following definition:

Definition: For a phase space $\Omega = \Omega_A \times \Omega_B$ describing two subsystems $A$ and $B$ of a given system, the reduced probability distribution for $A$ is defined by

\begin{equation*}
\rho_A(P_A, Q_A) = \int_{\Omega_B} \rho(P_A, P_B, Q_A, Q_B). \tag{2.57}
\end{equation*}

The reduced probability distribution is that assigned to the system by an observer having access only to observables of system $A$ (and similarly for $B$). Note that $\rho_A$ again satisfies all the axioms of a probability density (on $\Omega_A$).
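A discrete sketch of definition (2.57) (our own illustration): marginalizing a joint distribution over the $B$ variables yields a distribution on $\Omega_A$ that reproduces all expectation values of $A$-observables, as in (2.56).

```python
import numpy as np

# Discrete analogue of (2.57): rho[i, j] is a joint weight on Omega_A x Omega_B.
rng = np.random.default_rng(0)
rho = rng.random((4, 6))
rho /= rho.sum()                 # normalize to a probability distribution

rho_A = rho.sum(axis=1)          # "integrate out" the B coordinates

# Expectation of an A-observable computed both ways, as in (2.56):
F_A = rng.random(4)
lhs = (rho * F_A[:, None]).sum()   # full joint average
rhs = (rho_A * F_A).sum()          # average with the reduced distribution
```

The reduced array is again nonnegative and sums to one, i.e. it satisfies the axioms of a probability distribution on $\Omega_A$.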

2.5 Ensembles in Quantum Mechanics (Statistical Operators / Density Matrices)

Quantum mechanical systems are of an intrinsically probabilistic nature, so the language of probability theory is, in this sense, not just optional but actually essential. In fact, to say that the system is in a state $|\Psi\rangle$ really means that, if $A$ is a self-adjoint operator and

\begin{equation*}
A = \sum_i a_i |i\rangle\langle i| \tag{2.58}
\end{equation*}

is its spectral decomposition${}^2$, then the probability for measuring the outcome $a_i$ is given by

\begin{equation*}
p_{A, \Psi}(a_i) = |\langle \Psi \mid i \rangle|^2 \equiv p_i.
\end{equation*}

Thus, if we assign the state $|\Psi\rangle$ to the system, the set of possible measuring outcomes for $A$ is the probability space $\Omega = \{a_1, a_2, \ldots\}$ with (discrete) probability distribution given by $\{p_1, p_2, \ldots\}$.
In statistical mechanics we have incomplete information about the state of a quantum mechanical system. In particular, we do not want to prejudice ourselves by ascribing a pure state $|\Psi\rangle$ to the system. Instead, we describe it by a statistical mixture, i.e. an ensemble of pure states. Suppose we believe that the system is in the state $|\Psi_i\rangle$ with probability $\lambda_i$, where, as usual, $\sum \lambda_i = 1$, $\lambda_i \geqslant 0$. For example, before preparing the state we perform a classical random experiment to determine which state $|\Psi_i\rangle$ we prepare. The states $|\Psi_i\rangle$ should be normalized, i.e. $\langle \Psi_i \mid \Psi_i \rangle = 1$, but they do not have to be orthogonal or complete. Then the expectation value $\langle A \rangle$ of an operator is defined as

\begin{equation*}
\langle A \rangle = \sum_i \lambda_i \langle \Psi_i | A | \Psi_i \rangle. \tag{2.59}
\end{equation*}

Introducing the density matrix $\rho = \sum_i \lambda_i |\Psi_i\rangle\langle\Psi_i|$, this may also be written as

\begin{equation*}
\langle A \rangle = \operatorname{tr}(\rho A). \tag{2.60}
\end{equation*}

The density matrix has the properties $\operatorname{tr} \rho = 1$ as well as $\rho^\dagger = \rho$. Furthermore, for any state $|\Phi\rangle$ we have

\begin{equation*}
\langle \Phi | \rho | \Phi \rangle = \sum_i \lambda_i |\langle \Psi_i \mid \Phi \rangle|^2 \geqslant 0.
\end{equation*}

The density matrix should be thought of as analogous to the classical probability distribution $\{p_i\}$ given by the eigenvalues $p_i$ of $\rho$ (which coincide with the $\lambda_i$ if and only if the states $|\Psi_i\rangle$ are orthogonal).
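The remark about eigenvalues can be made concrete. A small sketch (the two states are illustrative choices of ours) builds $\rho$ from an equal mixture of two non-orthogonal qubit states and checks that the eigenvalues $p_i$ of $\rho$ differ from the mixture weights $\lambda_i$:

```python
import numpy as np

# Mixture rho = sum_i lambda_i |Psi_i><Psi_i| with NON-orthogonal states.
psi1 = np.array([1.0, 0.0])                # "up"
psi2 = np.array([1.0, 1.0]) / np.sqrt(2)   # ("up" + "down") / sqrt(2)
lam = np.array([0.5, 0.5])

rho = sum(l * np.outer(v, v.conj()) for l, v in zip(lam, [psi1, psi2]))

trace = np.trace(rho).real
p_i = np.linalg.eigvalsh(rho)   # the "classical" probability distribution
```

Here the eigenvalues come out as roughly $0.854$ and $0.146$, not $\{0.5, 0.5\}$, precisely because the two states overlap.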
In the context of quantum mechanical ensembles one can define a quantity that is closely analogous to the information entropy for ordinary probability distributions. This quantity is defined as

\begin{equation*}
S_{\mathrm{v.N.}}(\rho) = -k_B \operatorname{tr}(\rho \log \rho) = -k_B \sum_i p_i \log p_i \tag{2.61}
\end{equation*}

and is called the von Neumann entropy associated with $\rho$. According to the rules of quantum mechanics, the time evolution of a state is described by Schrödinger's equation,

\begin{aligned}
i\hbar \frac{d}{dt} |\Psi(t)\rangle &= H |\Psi(t)\rangle \\
\Rightarrow \quad i\hbar \frac{d}{dt} \rho(t) &= [H, \rho(t)] \equiv H\rho(t) - \rho(t)H.
\end{aligned}
Therefore an ensemble is stationary if $[H, \rho] = 0$. In particular, $\rho$ is stationary if it is of the form

\begin{equation*}
\rho = f(H) = \sum_i f(E_i) |\Psi_i\rangle\langle\Psi_i|,
\end{equation*}

where $\sum_i f(E_i) = 1$ and $p_i = f(E_i) > 0$ (here, $E_i$ label the eigenvalues of the Hamiltonian $H$ and $|\Psi_i\rangle$ its eigenstates, i.e. $H|\Psi_i\rangle = E_i|\Psi_i\rangle$). The characteristic example is given by

\begin{equation*}
f(H) = \frac{1}{Z_\beta} e^{-\beta H}, \tag{2.62}
\end{equation*}

where $Z_\beta = \sum_i e^{-\beta E_i}$. More generally, if $\{Q_\alpha\}$ are operators commuting with $H$, then another choice is

\begin{equation*}
\rho = \frac{1}{Z(\beta, \mu_\alpha)}\, e^{-\beta H - \sum_\alpha \mu_\alpha Q_\alpha}. \tag{2.63}
\end{equation*}

We will come back to discuss such ensembles below in chapter 4.
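A quick numerical illustration (our own, with a randomly generated Hermitian matrix standing in for $H$) of why (2.62) is stationary: a Gibbs state built from the eigendecomposition of $H$ commutes with $H$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2            # a Hermitian "Hamiltonian" (illustrative)

beta = 1.3
E, U = np.linalg.eigh(H)            # H = U diag(E) U^dagger
rho = (U * np.exp(-beta * E)) @ U.conj().T
rho = rho / np.trace(rho).real      # divide by Z_beta = sum_i exp(-beta E_i)

commutator = H @ rho - rho @ H      # should vanish: rho is a function of H
```

Since $\rho$ is constructed in the eigenbasis of $H$, the commutator vanishes up to floating-point roundoff, in accordance with the stationarity criterion $[H, \rho] = 0$.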
One often deals with situations in which a system is comprised of two sub-systems $A$ and $B$ described by Hilbert spaces $\mathcal{H}_A, \mathcal{H}_B$. The total Hilbert space is then $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ ($\otimes$ is the tensor product). If $\{|i\rangle_A\}$ and $\{|j\rangle_B\}$ are orthonormal bases of $\mathcal{H}_A$ and $\mathcal{H}_B$, an orthonormal basis of $\mathcal{H}$ is given by $\{|i,j\rangle = |i\rangle_A \otimes |j\rangle_B\}$.

Consider a (pure) state $|\Psi\rangle$ in $\mathcal{H}$, i.e. a pure state of the total system. It can be expanded as

\begin{equation*}
|\Psi\rangle = \sum_{i,j} c_{i,j} |i,j\rangle.
\end{equation*}
We assume that the state is normalized, meaning that
\begin{equation*}
\sum_{i,j} |c_{i,j}|^2 = 1. \tag{2.64}
\end{equation*}

Observables describing measurements of subsystem $A$ consist of operators of the form $\tilde{a} = a \otimes \mathbb{1}_B$, where $a$ is an operator on $\mathcal{H}_A$ and $\mathbb{1}_B$ is the identity operator on $\mathcal{H}_B$ (similarly, an observable describing a measurement of system $B$ corresponds to $\tilde{b} = \mathbb{1}_A \otimes b$). For such an operator we can write:

\begin{aligned}
\langle\Psi| \tilde{a} |\Psi\rangle &= \sum_{i,j,k,l} \bar{c}_{i,k}\, c_{j,l}\, \langle i,k| a \otimes \mathbb{1}_B |j,l\rangle \\
&= \sum_{i,j,k,l} \bar{c}_{i,k}\, c_{j,l}\, {}_A\langle i| a |j\rangle_A \underbrace{{}_B\langle k \mid l\rangle_B}_{\delta_{kl}} \\
&= \sum_{i,j} \underbrace{\Big(\sum_k \bar{c}_{i,k}\, c_{j,k}\Big)}_{=: (\rho_A)_{ji}}\, {}_A\langle i| a |j\rangle_A \\
&= \operatorname{tr}_A(a \rho_A).
\end{aligned}

The operator $\rho_A$ on $\mathcal{H}_A$ by definition satisfies $\rho_A^\dagger = \rho_A$ and, by (2.64), $\operatorname{tr} \rho_A = 1$. It is also not hard to see that $\langle\Phi| \rho_A |\Phi\rangle \geqslant 0$. Thus, $\rho_A$ defines a density matrix on the Hilbert space $\mathcal{H}_A$ of system $A$. One similarly defines $\rho_B$ on $\mathcal{H}_B$.
Definition: The operator $\rho_A$ is called the reduced density matrix of subsystem $A$, and $\rho_B$ that of subsystem $B$.

The reduced density matrix reflects the limited information of an observer only having access to a subsystem. The quantity

\begin{equation*}
S_{\text{ent}} := S_{\mathrm{v.N.}}(\rho_A) = -k_B \operatorname{tr}(\rho_A \log \rho_A) \tag{2.65}
\end{equation*}

is called the entanglement entropy of subsystem $A$. One shows that $S_{\mathrm{v.N.}}(\rho_A) = S_{\mathrm{v.N.}}(\rho_B)$, so it does not matter which of the two subsystems we use to define it.
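The symmetry $S_{\mathrm{v.N.}}(\rho_A) = S_{\mathrm{v.N.}}(\rho_B)$ can be checked numerically for a random pure state. In the sketch below (our own), the coefficients $c_{i,j}$ are stored as a matrix, for which $\rho_A = c\, c^\dagger$ and $\rho_B$ is, up to relabeling, $c^\dagger c$; entropies are in units of $k_B$.

```python
import numpy as np

rng = np.random.default_rng(2)
dA, dB = 3, 5
c = rng.normal(size=(dA, dB)) + 1j * rng.normal(size=(dA, dB))
c /= np.linalg.norm(c)          # normalization (2.64): sum_{i,j} |c_ij|^2 = 1

rho_A = c @ c.conj().T          # (rho_A)_{ji} = sum_k conj(c_{ik}) c_{jk}
rho_B = c.conj().T @ c          # same nonzero spectrum as rho_A

def vN_entropy(rho):
    """Von Neumann entropy in units of k_B (zero eigenvalues discarded)."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]
    return float(-(p * np.log(p)).sum())

S_A, S_B = vN_entropy(rho_A), vN_entropy(rho_B)
```

The equality holds because $c c^\dagger$ and $c^\dagger c$ share the same nonzero eigenvalues, the essential content of the Schmidt decomposition.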
Example: Let $\mathcal{H}_A = \mathbb{C}^2 = \mathcal{H}_B$ with orthonormal basis $\{|\uparrow\rangle, |\downarrow\rangle\}$ for either system $A$ or $B$. An orthonormal basis of $\mathcal{H}$ is then given by $\{|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle\}$.
(i) Let $|\Psi\rangle = |\uparrow\downarrow\rangle$. Then

\begin{equation*}
\langle\Psi| \tilde{a} |\Psi\rangle = \langle\uparrow\downarrow| a \otimes \mathbb{1}_B |\uparrow\downarrow\rangle = \langle\uparrow| a |\uparrow\rangle, \tag{2.66}
\end{equation*}
from which it follows that the reduced density matrix of subsystem A is given by
\begin{equation*}
\rho_A = |\uparrow\rangle\langle\uparrow|. \tag{2.67}
\end{equation*}
The entanglement entropy is
\begin{equation*}
S_{\text{ent}} = -k_B \operatorname{tr}(\rho_A \log \rho_A) = -k_B (1 \cdot \log 1) = 0. \tag{2.68}
\end{equation*}
(ii) Let $|\Psi\rangle = \frac{1}{\sqrt{2}}(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle)$. Then

\begin{align*}
\langle\Psi| \tilde{a} |\Psi\rangle &= \frac{1}{2}\left(\langle\uparrow\downarrow| - \langle\downarrow\uparrow|\right)\left(a \otimes \mathbb{1}_B\right)\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right) \\
&= \frac{1}{2}\left(\langle\uparrow| a |\uparrow\rangle + \langle\downarrow| a |\downarrow\rangle\right), \tag{2.69}
\end{align*}
from which it follows that the reduced density matrix of subsystem A is given by
$$
\rho_{A}=\frac{1}{2}(|\uparrow\rangle\langle\uparrow|+|\downarrow\rangle\langle\downarrow|) . \tag{2.70}
$$
The entanglement entropy is
$$
S_{\mathrm{ent}}=-k_{B}\operatorname{tr}\left(\rho_{A}\log\rho_{A}\right)=-k_{B}\left(\frac{1}{2}\log\frac{1}{2}+\frac{1}{2}\log\frac{1}{2}\right)=k_{B}\log 2 \tag{2.71}
$$
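The two examples can be checked numerically. The following is a minimal sketch (not part of the text) that computes $\rho_A$ by tracing out subsystem B and evaluates the entanglement entropy, in units where $k_B=1$:

```python
import numpy as np

# Basis: |up> = e0, |down> = e1; two-spin states live in C^2 (x) C^2.
up, down = np.eye(2)

def reduced_density_matrix(psi):
    """Trace out subsystem B of a two-spin pure state |psi>."""
    m = psi.reshape(2, 2)          # coefficients c_{ab} in |a>_A (x) |b>_B
    return m @ m.conj().T          # rho_A = tr_B |psi><psi|

def entanglement_entropy(rho, kB=1.0):
    """S_ent = -kB tr(rho log rho); here in units with kB = 1."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]   # convention: 0 log 0 = 0
    return -kB * np.sum(evals * np.log(evals))

# (i) product state |up down>: no entanglement
psi_prod = np.kron(up, down)
S_prod = entanglement_entropy(reduced_density_matrix(psi_prod))

# (ii) singlet (|up down> - |down up>)/sqrt(2): maximally entangled
psi_singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
S_singlet = entanglement_entropy(reduced_density_matrix(psi_singlet))
```

The two values reproduce (2.68) and (2.71): $S_{\mathrm{ent}}=0$ for the product state and $S_{\mathrm{ent}}=\log 2$ for the singlet.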

Chapter 3

Time-evolving ensembles

3.1 Boltzmann Equation in Classical Mechanics

In order to understand the dynamical properties of systems in statistical mechanics one has to study non-stationary (i.e. time-dependent) ensembles. A key question, already brought up earlier, is whether systems initially described by a non-stationary ensemble will eventually approach an equilibrium ensemble. An important quantitative tool for understanding the approach to equilibrium (e.g. in the case of dilute media or weakly coupled systems) is the Boltzmann equation, which we discuss here in the case of classical mechanics.
Let $\rho$ be an ensemble and $\rho_{t}(P,Q)=\rho(P(t),Q(t))$ the time-evolving ensemble defined in the previous chapter. We would like to learn something about this function $\rho_{t}$. It is of course in general impossible to determine it exactly, because we cannot find the particle trajectories $(P(t),Q(t))$ for a system with a large number $N$ of particles. Also, even if we could, the full $\rho_{t}(P,Q)$ as a function of $6N$ phase space coordinates contains in general far more information than we really need in practice. Instead, we often only want to know the time evolution of a relatively small subset of observables. One such observable is the 1-particle density $f_{1}$, which is defined by
$$
\begin{aligned}
f_{1}\left(\vec{p}_{1},\vec{x}_{1};t\right) &:=\left\langle\sum_{i}\delta^{3}\left(\vec{p}_{1}-\vec{p}_{i}\right)\delta^{3}\left(\vec{x}_{1}-\vec{x}_{i}\right)\right\rangle \\
&=N\int\rho_{t}\left(\vec{p}_{1},\vec{x}_{1},\vec{p}_{2},\vec{x}_{2},\ldots,\vec{p}_{N},\vec{x}_{N}\right)\prod_{i=2}^{N}d^{3}x_{i}\,d^{3}p_{i} . 
\end{aligned} \tag{3.1}
$$
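Note the normalization implied by (3.1): because of the sum over $i$, $f_{1}$ integrates to $N$ over phase space, not to $1$. A minimal numerical sketch (with assumed toy units, not taken from the text) of this convention, estimating a marginal of $f_1$ by histogramming sampled momenta:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy parameters: N particles, unit mass and inverse temperature.
N, m, beta = 5000, 1.0, 1.0

# One microstate per particle from a Maxwellian ensemble: each Cartesian
# momentum component is Gaussian with variance m/beta.
p = rng.normal(0.0, np.sqrt(m / beta), size=(N, 3))

# Histogram estimate of the p_x marginal of f_1 (positions and the other
# momentum components already integrated out).
hist, edges = np.histogram(p[:, 0], bins=50)
bin_width = edges[1] - edges[0]
f1_px = hist / bin_width

# By (3.1), integrating f_1 over all of phase space must reproduce N.
total = f1_px.sum() * bin_width
```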
Similarly, the two-particle density can be computed from $\rho$ via
$$
f_{2}\left(\vec{p}_{1},\vec{x}_{1},\vec{p}_{2},\vec{x}_{2};t\right)=N(N-1)\int\rho_{t}\left(\vec{p}_{1},\vec{x}_{1},\vec{p}_{2},\vec{x}_{2},\ldots,\vec{p}_{N},\vec{x}_{N}\right)\prod_{i=3}^{N}d^{3}x_{i}\,d^{3}p_{i} \tag{3.2}
$$
Analogously, we define the $s$-particle densities $f_{s}$ for $2<s\leqslant N$. Note that the $s$-particle densities are, up to normalization, nothing but the reduced probability distributions obtained from $\rho_{t}$ if we divide the total system into a system A consisting of $s$ fixed particles and a system B consisting of the remaining ones; for instance, for $s=1$,
$$
\rho_{tA}\left(\vec{p}_{1},\vec{x}_{1}\right)=\frac{1}{N}f_{1}\left(\vec{p}_{1},\vec{x}_{1};t\right) . \tag{3.3}
$$
The Hamiltonian $H_{s}$ describing the interaction between $s$ particles can be written as
$$
H_{s}=\sum_{i=1}^{s}\frac{\vec{p}_{i}^{\,2}}{2m}+\sum_{1\leqslant i<j\leqslant s}\mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right)+\sum_{i=1}^{s}\mathcal{W}\left(\vec{x}_{i}\right) \tag{3.4}
$$
so that in particular $H_{N}=H$. One finds the relations
$$
\underbrace{\frac{\partial f_{s}}{\partial t}-\left\{H_{s},f_{s}\right\}}_{\text{streaming term}}=\underbrace{\sum_{i=1}^{s}\int d^{3}p_{s+1}\,d^{3}x_{s+1}\,\frac{\partial\mathcal{V}\left(\vec{x}_{i}-\vec{x}_{s+1}\right)}{\partial\vec{x}_{i}}\cdot\frac{\partial f_{s+1}}{\partial\vec{p}_{i}}}_{\text{collision term}} \tag{3.5}
$$
This system of equations is called the BBGKY hierarchy (for Bogoliubov-Born-Green-Kirkwood-Yvon hierarchy). The first equation ($s=1$) is given by
$$
\left[\frac{\partial}{\partial t}-\underbrace{\frac{\partial\mathcal{W}}{\partial\vec{x}_{1}}}_{=\vec{F}\text{ (ext. force)}}\cdot\frac{\partial}{\partial\vec{p}_{1}}+\underbrace{\frac{\vec{p}_{1}}{m}}_{=\vec{v}\text{ (velocity)}}\cdot\frac{\partial}{\partial\vec{x}_{1}}\right]f_{1}=\int d^{3}p_{2}\,d^{3}x_{2}\,\frac{\partial\mathcal{V}\left(\vec{x}_{1}-\vec{x}_{2}\right)}{\partial\vec{x}_{1}}\cdot\underbrace{\frac{\partial f_{2}}{\partial\vec{p}_{1}}}_{\text{unknown!}} \tag{3.6}
$$
An obvious feature of the BBGKY hierarchy is that the equation for $f_{1}$ involves $f_{2}$, that for $f_{2}$ involves $f_{3}$, etc. In this sense the equations for the individual $f_{s}$ are not closed. To get a manageable system, some approximations/truncations are necessary.
The simplest truncation, which leads to the Boltzmann equation, is to assume uncorrelated densities, i.e.
$$
\begin{aligned}
f_{2}\left(\vec{p}_{1},\vec{x}_{1},\vec{p}_{2},\vec{x}_{2}\right) &=\frac{N(N-1)}{N^{2}}f_{1}\left(\vec{p}_{1},\vec{x}_{1}\right)f_{1}\left(\vec{p}_{2},\vec{x}_{2}\right) \\
&\simeq f_{1}\left(\vec{p}_{1},\vec{x}_{1}\right)f_{1}\left(\vec{p}_{2},\vec{x}_{2}\right),
\end{aligned} \tag{3.7}
$$
where in the first line we took into account the different proportionality factors relating the reduced probability distributions and the $f_{s}$. With this assumption, the equation (3.6) closes. In general, this assumption is inconsistent with the dynamical equation for $f_{2}$, i.e. it is not preserved under time evolution. However, in a certain limit in which $N\rightarrow\infty$ and the interaction range $d\rightarrow 0$, one can prove the consistency of this truncation. To discuss conditions under which the assumption is approximately valid, one introduces several time scales, which have to be sufficiently separated:
(i) Let $v$ be the typical velocity of gas particles (e.g. $v\approx 100\,\frac{\mathrm{m}}{\mathrm{s}}$ at room temperature and 1 atm) and let $L$ be the scale over which $\mathcal{W}(\vec{x})$ varies, i.e. the box size. Then $\tau_{v}:=\frac{L}{v}$ is the extrinsic scale (e.g. $\tau_{v}\approx 10^{-5}\,\mathrm{s}$ for $L\approx 1\,\mathrm{mm}$).
(ii) If $d$ is the range of the interaction $\mathcal{V}(\vec{x})$ (e.g. $d\approx 10^{-10}\,\mathrm{m}$), then $\tau_{c}:=\frac{d}{v}$ is the collision time (e.g. $\tau_{c}\approx 10^{-12}\,\mathrm{s}$). We should have $\tau_{c}\ll\tau_{v}$.
(iii) We can also define the mean free time $\tau_{x}\approx\frac{\tau_{c}}{nd^{3}}\approx\frac{1}{nvd^{2}}$, where $n=\frac{N}{V}$, which is the average time between subsequent collisions. We have $\tau_{x}\approx 10^{-8}\,\mathrm{s}\gg\tau_{c}$ in our example.
For the truncation to be justified, we should thus have $\tau_{v}\gg\tau_{x}\gg\tau_{c}$.
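The hierarchy of time scales can be checked directly with the example values quoted above; the number density is an assumption here (a Loschmidt-like value for a gas at roughly 1 atm, not given in the text):

```python
# Check of the time-scale hierarchy tau_v >> tau_x >> tau_c for an air-like gas.
v = 100.0        # typical particle speed in m/s (value from the text)
L = 1e-3         # box size in m (value from the text)
d = 1e-10        # interaction range in m (value from the text)
n = 2.7e25       # number density in 1/m^3 (assumed, ~1 atm)

tau_v = L / v                  # extrinsic time scale  (~1e-5 s)
tau_c = d / v                  # collision time        (~1e-12 s)
tau_x = 1.0 / (n * v * d**2)   # mean free time        (~1e-8 s)
```

With these numbers one indeed finds $\tau_{v}\gg\tau_{x}\gg\tau_{c}$, each ratio being two or more orders of magnitude.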
As, by the assumptions on the scales, a particle moves freely between two encounters, we can approximate the interaction of two particles as a scattering process, described by the differential cross section $\frac{d\sigma}{d\Omega}$, as indicated in the following sketch:
Figure 3.1: Classical scattering of particles in the "fixed target frame".
Here $d\sigma=b\,db\,d\phi$ is the infinitesimal area element shaded in grey, through which a flux of particles passes, and $d\Omega$ is the infinitesimal area element on the unit sphere, indicating into which direction these particles are scattered. In other words, $\left|\frac{d\sigma}{d\Omega}\right|$ is the Jacobian between $\vec{b}$ and $\hat{\Omega}=(\theta,\phi)$. Hence, if $F=\frac{\#\text{ of incoming particles}}{\text{area}\cdot\text{time}}$ is the incoming flux of particles, then $\left|\frac{d\sigma}{d\Omega}(\Omega)\right|F\,d\Omega$ is the number rate of particles being scattered into the solid-angle element $d\Omega$.
If, instead of the fixed target frame indicated in the figure, one considers scattering in the center of mass frame and denotes by $\vec{p}=\vec{p}_{1}-\vec{p}_{2}$ and $\vec{p}^{\,\prime}=\vec{p}_{1}^{\,\prime}-\vec{p}_{2}^{\,\prime}$ the relative momentum before and after the scattering (where we assume an elastic collision, i.e. $|\vec{p}|=|\vec{p}^{\,\prime}|$), one can then arrive at the Boltzmann equation
$$
\begin{aligned}
&\left[\frac{\partial}{\partial t}-\vec{F}\cdot\frac{\partial}{\partial\vec{p}_{1}}+\vec{v}_{1}\cdot\frac{\partial}{\partial\vec{x}_{1}}\right]f_{1}\left(\vec{p}_{1},\vec{x}_{1};t\right)= \\
&-\int d^{3}p_{2}\,d^{2}\Omega\,\underbrace{\left|\frac{d\sigma}{d\Omega}\right|}_{\text{cross-section}}\cdot\underbrace{\left|\vec{v}_{1}-\vec{v}_{2}\right|}_{\propto\ \text{flux}}\cdot\left[f_{1}\left(\vec{p}_{1},\vec{x}_{1};t\right)f_{1}\left(\vec{p}_{2},\vec{x}_{1};t\right)-f_{1}\left(\vec{p}_{1}^{\,\prime},\vec{x}_{1};t\right)f_{1}\left(\vec{p}_{2}^{\,\prime},\vec{x}_{1};t\right)\right],
\end{aligned} \tag{3.8}
$$
where $\Omega=(\theta,\phi)$ is the solid angle between $\vec{p}$ and $\vec{p}^{\,\prime}$, and $d^{2}\Omega=\sin\theta\,d\theta\,d\phi$. The integral expression on the right side of the Boltzmann equation (3.8) is called the collision operator and is often denoted as $C\left[f_{1}\right]\left(t,\vec{p}_{1},\vec{x}_{1}\right)$. It represents the change in the 1-particle distribution due to collisions of particles. The two terms in the brackets $[\ldots]$ under the integral in (3.8) take into account that particles with momentum $\vec{p}_{1}$ can be lost or created, respectively, when momentum is transferred in a collision process.
It is important to know whether $f_{1}(\vec{p},\vec{x};t)$ is stationary, i.e. time-independent. Intuitively, this should be the case when the collision term $C\left[f_{1}\right]$ vanishes. This in turn should happen if
$$
f_{1}\left(\vec{p}_{1},\vec{x};t\right)f_{1}\left(\vec{p}_{2},\vec{x};t\right)=f_{1}\left(\vec{p}_{1}^{\,\prime},\vec{x};t\right)f_{1}\left(\vec{p}_{2}^{\,\prime},\vec{x};t\right) \tag{3.9}
$$
As we will now see, one can derive the functional form of the 1-particle density from this condition. Taking the logarithm on both sides of (3.9) gives, with $F_{1}=\log f_{1}(\vec{p}_{1},\vec{x};t)$ etc.,
$$
F_{1}+F_{2}=F_{1^{\prime}}+F_{2^{\prime}} \tag{3.10}
$$
whence $F$ must be additively conserved in every collision, i.e. a linear combination of the conserved quantities $\frac{\vec{p}^{\,2}}{2m}$ (kinetic energy), $\vec{p}$ (momentum) and a constant: $F=-\beta\frac{\vec{p}^{\,2}}{2m}+\vec{\alpha}\cdot\vec{p}+\gamma$. It follows, after renaming constants, that
$$
f_{1}=c\cdot e^{-\beta\frac{\left(\vec{p}-\vec{p}_{0}\right)^{2}}{2m}} \tag{3.11}
$$
In principle $c,\beta,\vec{p}_{0}$ could be functions of $\vec{x}$ and $t$ at this stage, but then the left hand side of the Boltzmann equation would not vanish in general. So (3.11) represents the general stationary homogeneous solution to the Boltzmann equation. It is known as the Maxwell-Boltzmann distribution. The proper normalization follows from $\int f_{1}\,d^{3}p\,d^{3}x=N$:
$$
c=\frac{N}{V}\left(\frac{\beta}{2\pi m}\right)^{\frac{3}{2}},\quad\vec{p}_{0}=\langle\vec{p}\rangle . \tag{3.12}
$$
The mean kinetic energy is found to be $\left\langle\frac{\vec{p}^{\,2}}{2m}\right\rangle=\frac{3}{2\beta}$, so $\beta=\frac{1}{k_{B}T}$ is identified with the inverse temperature of the gas.
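The moment $\left\langle\frac{\vec{p}^{\,2}}{2m}\right\rangle=\frac{3}{2\beta}$ can be verified by direct numerical integration of (3.11). A minimal sketch with arbitrary assumed values of $m$ and $\beta$ (since each Cartesian component of $\vec{p}$ is an independent Gaussian, a one-dimensional integral suffices):

```python
import numpy as np

# Verify <p^2/(2m)> = 3/(2 beta) for the Maxwell-Boltzmann distribution (3.11)
# with p_0 = 0; m and beta are arbitrary assumed toy values.
m, beta = 1.3, 0.7

# 1D grid for one momentum component; the weight is the Gaussian marginal.
px = np.linspace(-30.0, 30.0, 120001)
dx = px[1] - px[0]
w = np.exp(-beta * px**2 / (2 * m))
w /= w.sum() * dx                       # normalize the marginal numerically

mean_px2 = (px**2 * w).sum() * dx       # should equal m/beta
mean_ekin = 3 * mean_px2 / (2 * m)      # should equal 3/(2 beta)
```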
This interpretation of $\beta$ is reinforced by considering a gas of $N$ particles confined to a box of volume $V$. The pressure of the gas results from a force $K$ acting on a wall element of area $A$, as depicted in the figure below. The force is equal to:
$$
\begin{aligned}
K &=\frac{1}{\Delta t}\int d^{3}p\;\overbrace{\#\binom{\text{particles impacting }A\text{ during }\Delta t}{\text{with momenta between }\vec{p}\text{ and }\vec{p}+d\vec{p}}}^{\left(f_{1}(\vec{p})\,d^{3}p\right)\cdot\left(A v_{x}\Delta t\right)}\times\overbrace{\binom{\text{momentum transfer}}{\text{in }x\text{-direction}}}^{2p_{x}} \\
&=\frac{1}{\Delta t}\int_{0}^{\infty}dp_{x}\int_{-\infty}^{\infty}dp_{y}\int_{-\infty}^{\infty}dp_{z}\,f_{1}(\vec{p})\left(A v_{x}\Delta t\right)\cdot\left(2p_{x}\right) .
\end{aligned}
$$
Note that the first integral runs only over half of the range of $p_{x}$, due to the fact that only particles moving towards the wall will hit it.
Together with (3.11) it follows that the pressure $P$ is given by
$$
P=\frac{K}{A}=\int d^{3}p\,f_{1}(\vec{p})\,\frac{p_{x}^{2}}{m}=\frac{n}{\beta} . \tag{3.13}
$$
Comparing with the equation of state for an ideal gas, $PV=Nk_{B}T$, we get $\beta=\frac{1}{k_{B}T}$.
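The pressure formula (3.13) can also be checked by a small Monte Carlo experiment: sample momenta from the Maxwell-Boltzmann distribution and compare $n\langle p_{x}^{2}\rangle/m$ with $n/\beta$. A sketch in assumed toy units:

```python
import numpy as np

# Monte Carlo check of (3.13): P = n <p_x^2>/m = n/beta.
rng = np.random.default_rng(1)
m, beta, n = 1.0, 2.0, 1.0    # mass, inverse temperature, density (assumed)

# Under (3.11) with p_0 = 0, p_x is Gaussian with variance m/beta.
px = rng.normal(0.0, np.sqrt(m / beta), size=2_000_000)
P = n * np.mean(px**2) / m    # Monte Carlo estimate of the pressure

P_ideal = n / beta            # ideal gas law P = n k_B T = n/beta
```

With $2\times10^{6}$ samples the relative statistical error of the estimate is of order $10^{-3}$.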
Figure 3.2: Pressure on the walls due to the impact of particles.
It is noteworthy that, in the presence of external forces, other stationary solutions (for which the streaming and collision terms do not vanish separately) are also possible. One only has to think of the following situation, representing a stationary air flow across a wing:
Figure 3.3: Sketch of the air-flow across a wing.
In this case we have to deal with a much more complicated $f_{1}$, not equal to the Maxwell-Boltzmann distribution. As the example of an air flow suggests, the Boltzmann equation is also closely related to other equations for fluids, such as the Euler or Navier-Stokes equations, which can be seen to arise as approximations of the Boltzmann equation.
The Boltzmann equation can easily be generalized to a gas consisting of several species $\alpha,\beta,\ldots$, which interact via the 2-body potentials $\mathcal{V}_{\alpha,\beta}(\vec{x}^{(\alpha)}-\vec{x}^{(\beta)})$. As before, we can define the 1-particle density $f_{1}^{(\alpha)}(\vec{p},\vec{x},t)$ for each species. The same derivation leading to the Boltzmann equation now gives the system of equations
$$
\left[\frac{\partial}{\partial t}-\vec{F}\cdot\frac{\partial}{\partial\vec{p}}+\vec{v}\cdot\frac{\partial}{\partial\vec{x}}\right]f_{1}^{(\alpha)}=\sum_{\beta}C^{(\alpha,\beta)} \tag{3.14}
$$
where the collision term $C^{(\alpha,\beta)}$ is given by
$$
\begin{aligned}
C^{(\alpha,\beta)}= &-\int d^{3}p_{2}\,d^{2}\Omega\left|\frac{d\sigma_{\alpha,\beta}}{d\Omega}\right|\left|\vec{v}_{1}-\vec{v}_{2}\right|\times \\
&\times\left[f_{1}^{(\alpha)}\left(\vec{p}_{1},\vec{x}_{1};t\right)f_{1}^{(\beta)}\left(\vec{p}_{2},\vec{x}_{1};t\right)-f_{1}^{(\alpha)}\left(\vec{p}_{1}^{\,\prime},\vec{x}_{1};t\right)f_{1}^{(\beta)}\left(\vec{p}_{2}^{\,\prime},\vec{x}_{1};t\right)\right]
\end{aligned} \tag{3.15}
$$
This system of equations has great importance in practice e.g. for the evolution of the abundances of different particle species in the early universe. In this case
$$
f_{1}^{(\alpha)}(\vec{p},\vec{x};t)\approx f_{1}^{(\alpha)}(\vec{p},t) \tag{3.16}
$$
are homogeneous distributions, and the external force $\vec{F}$ on the left hand side of equations (3.14) is related to the expansion of the universe.
Demanding equilibrium now amounts to
$$
f_{1}^{(\alpha)}\left(\vec{p}_{1};t\right)f_{1}^{(\beta)}\left(\vec{p}_{2};t\right)=f_{1}^{(\alpha)}\left(\vec{p}_{1}^{\,\prime};t\right)f_{1}^{(\beta)}\left(\vec{p}_{2}^{\,\prime};t\right) \tag{3.17}
$$
and similar arguments as above lead to
$$
f_{1}^{(\alpha)}\propto e^{-\beta\frac{\left(\vec{p}-m_{\alpha}\vec{v}_{0}\right)^{2}}{2m_{\alpha}}}, \tag{3.18}
$$
i.e. we have the same temperature $T$ and average velocity $\vec{v}_{0}$ for all species $\alpha$. In the context of the early universe it is essential to study deviations from equilibrium in order to explain the observed abundances.
In contrast to the original system of equations (Hamilton's equations or the BBGKY hierarchy), the Boltzmann equation is irreversible. This can be seen, for example, by introducing the function
$$
h(t)=-k_{B}\int d^{3}x\,d^{3}p\,f_{1}(\vec{p},\vec{x};t)\log f_{1}(\vec{p},\vec{x};t)=S_{\mathrm{inf}}\left(f_{1}(t)\right), \tag{3.19}
$$
which is called the Boltzmann H-function. It can be shown (cf. problem B.11) that $\dot{h}(t)\geqslant 0$, with equality if
$$
f_{1}\left(\vec{p}_{1},\vec{x};t\right)f_{1}\left(\vec{p}_{2},\vec{x};t\right)=f_{1}\left(\vec{p}_{1}^{\,\prime},\vec{x};t\right)f_{1}\left(\vec{p}_{2}^{\,\prime},\vec{x};t\right)
$$
a result which is known as the H-theorem. We just showed that this equality holds if and only if $f_{1}$ is given by the Maxwell-Boltzmann distribution. Thus, we conclude that $h(t)$ is a strictly increasing function as long as $f_{1}$ is not equal to the Maxwell-Boltzmann distribution. In particular, the evolution of $f_{1}$, as described by the Boltzmann equation, is irreversible. Since the Boltzmann equation is only an approximation to the full BBGKY hierarchy, which is reversible, there is no mathematical inconsistency. However, it is not clear, a priori, at which stage of the derivation the irreversibility has been allowed to enter. Looking at the approximations made above, it is clear that the assumption that the 2-particle correlations $f_{2}$ factorize, as in (3.7), cannot be exactly true, since the outgoing momenta of the particles are correlated. Although this correlation is extremely small after several collisions, it is not exactly zero. Our decision to neglect it can be viewed as one reason for the emergence of irreversibility on a macroscopic scale.
The close analogy between the definition of the Boltzmann $H$-function and the information entropy $S_{\mathrm{inf}}$, as defined in (2.15), together with the monotonicity of $h(t)$, suggests that $h$ should represent some sort of entropy of the system. The H-theorem is then viewed as a "derivation" of the 2nd law of thermodynamics (see Chapter 6). However, this point of view is not entirely correct, since $h(t)$ only depends on the 1-particle density $f_{1}$ and not on the higher particle densities $f_{s}$, which in general should also contribute to the entropy. It is not clear how an entropy with sensible properties should be defined in a completely general situation, in particular when the above approximations are not justified.
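The qualitative content of the H-theorem can be illustrated by a small toy simulation (an assumed setup, not from the text): start a homogeneous gas far from equilibrium and relax it by random elastic pair collisions; the entropy of the (binned) velocity distribution should not decrease as the gas approaches a Maxwellian.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 20000

# Far-from-equilibrium start: every particle has unit speed, random direction.
dirs = rng.normal(size=(N, 3))
v = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)

def histogram_entropy(vx, bins=np.linspace(-3, 3, 61)):
    """Discrete entropy -sum q log q of the binned v_x marginal."""
    dens, _ = np.histogram(vx, bins=bins, density=True)
    q = dens[dens > 0] * (bins[1] - bins[0])   # probability mass per bin
    return -np.sum(q * np.log(q))

h0 = histogram_entropy(v[:, 0])

# Elastic pair collisions: re-emit the relative velocity isotropically in the
# centre-of-mass frame; this conserves total momentum and kinetic energy.
for _ in range(10):
    idx = rng.permutation(N)
    a, b = idx[: N // 2], idx[N // 2 :]
    vcm = 0.5 * (v[a] + v[b])
    g = np.linalg.norm(v[a] - v[b], axis=1, keepdims=True)
    newdir = rng.normal(size=(N // 2, 3))
    newdir /= np.linalg.norm(newdir, axis=1, keepdims=True)
    v[a] = vcm + 0.5 * g * newdir
    v[b] = vcm - 0.5 * g * newdir

h1 = histogram_entropy(v[:, 0])   # exceeds h0: relaxed toward a Maxwellian
```

Since the collisions conserve the total kinetic energy exactly, the mean of $|\vec{v}|^{2}$ stays at its initial value while the histogram entropy grows, mimicking $\dot{h}(t)\geqslant 0$.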

3.2 Boltzmann Equation, Approach to Equilibrium in Quantum Mechanics

A version of the Boltzmann equation and of the H-theorem can also be derived in the quantum mechanical context. The main difference to the classical case is a somewhat modified collision term: the classical differential cross section is replaced by the quantum mechanical differential cross section (in the Born approximation), and the combination
$$
f_{1}\left(\vec{p}_{1},\vec{x};t\right)f_{1}\left(\vec{p}_{2},\vec{x};t\right)-f_{1}\left(\vec{p}_{1}^{\,\prime},\vec{x};t\right)f_{1}\left(\vec{p}_{2}^{\,\prime},\vec{x};t\right)
$$
is somewhat changed in order to accommodate Bose-Einstein resp. Fermi-Dirac statistics (see section 5.1 for an explanation of these terms). This then leads to the corresponding equilibrium distributions in the stationary case. Starting from the quantum Boltzmann equation, one can again derive a corresponding H-theorem. Rather than explaining the details, we give a simplified "derivation" of the H-theorem, which will also allow us to introduce a simple-minded but very useful approximation of the dynamics of probabilities, discussed in more detail in the Appendix.
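For orientation (this formula is not spelled out in the text; it is the standard Uehling-Uhlenbeck modification, with $f_{1}$ understood as a dimensionless occupation number), the bracket is replaced by

$$
f_{1}(\vec{p}_{1})f_{1}(\vec{p}_{2})\bigl(1\pm f_{1}(\vec{p}_{1}^{\,\prime})\bigr)\bigl(1\pm f_{1}(\vec{p}_{2}^{\,\prime})\bigr)-f_{1}(\vec{p}_{1}^{\,\prime})f_{1}(\vec{p}_{2}^{\,\prime})\bigl(1\pm f_{1}(\vec{p}_{1})\bigr)\bigl(1\pm f_{1}(\vec{p}_{2})\bigr),
$$

with the upper sign for bosons and the lower sign for fermions; demanding that this expression vanish in the stationary case then yields the Bose-Einstein resp. Fermi-Dirac distributions.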
The basic idea is to ascribe the approach to equilibrium to an incomplete knowledge of the true dynamics due to perturbations. The true Hamiltonian is written as
\begin{equation*}
H = H_0 + H_1, \tag{3.20}
\end{equation*}
where $H_1$ is a tiny perturbation over which we do not have control. For simplicity, we assume that the spectrum of the unperturbed Hamiltonian $H_0$ is discrete and we write $H_0 |n\rangle = E_n |n\rangle$. For a typical eigenstate $|n\rangle$ we then have
\begin{equation*}
\frac{\langle n| H_1 |n\rangle}{E_n} \ll 1. \tag{3.21}
\end{equation*}
Let $p_n$ be the probability that the system is in the state $|n\rangle$, i.e. we ascribe to the system the density matrix $\rho = \sum_n p_n |n\rangle\langle n|$. For generic perturbations $H_1$, this ensemble is not stationary with respect to the true dynamics because $[\rho, H] \neq 0$. Consequently, the von Neumann entropy $S_{\mathrm{v.N.}}$ of $\rho(t) = e^{itH} \rho\, e^{-itH}$ depends upon time. We define this to be the $H$-function
\begin{equation*}
h(t) := S_{\mathrm{v.N.}}(\rho(t)) = -\mathrm{k}_{\mathrm{B}} \sum_n p_n(t) \log p_n(t). \tag{3.22}
\end{equation*}
Next, we approximate the dynamics by imagining that our perturbation $H_1$ will cause jumps from state $|n\rangle$ to state $|m\rangle$, leading to time-dependent probabilities $p_n(t)$ as described by the master equation$^1$
\begin{equation*}
\dot{p}_n(t) = \sum_{m:\, m \neq n} \big( T_{nm}\, p_m(t) - T_{mn}\, p_n(t) \big), \tag{3.23}
\end{equation*}
where $T_{nm}$ is the transition rate for going from state $|m\rangle$ to state $|n\rangle$ (in the Born approximation the rates are symmetric, $T_{nm} = T_{mn}$, being proportional to $|\langle n| H_1 |m\rangle|^2$; we assume this in the following). Thus, the approximated, time-dependent density matrix is $\rho(t) = \sum_n p_n(t) |n\rangle\langle n|$, with $p_n(t)$ obeying the master equation. By the latter,
\begin{align*}
\dot{h}(t) & = -\mathrm{k}_{\mathrm{B}} \sum_n \left( \dot{p}_n \log p_n + \dot{p}_n \right) \\
& = -\frac{\mathrm{k}_{\mathrm{B}}}{2} \Big( \sum_n \dot{p}_n \log(e \cdot p_n) + \sum_m \dot{p}_m \log(e \cdot p_m) \Big) \\
& = \frac{\mathrm{k}_{\mathrm{B}}}{2} \sum_{n, m} T_{nm} \big[ p_n(t) - p_m(t) \big] \big[ \log p_n(t) - \log p_m(t) \big] \geqslant 0.
\end{align*}
The latter inequality follows from the fact that both factors in square brackets $[\cdots]$ have the same sign, just as in the proof of the classical $H$-theorem (problem B.11). Note that if we had defined $h(t)$ as the von Neumann entropy using a density matrix $\rho$ that is diagonal in an eigenbasis of the full Hamiltonian $H$ (rather than of the unperturbed Hamiltonian), then we would have obtained $[\rho, H] = 0$ and consequently $\rho(t) = \rho$, i.e. a constant $h(t)$. Thus, in this approach, the $H$-theorem is viewed as a consequence of our partial ignorance about the system, which prompts us to ascribe to it a density matrix $\rho(t)$ that is diagonal with respect to $H_0$. In order to justify working with a density matrix $\rho$ that is diagonal with respect to $H_0$ (and therefore also in order to explain the approach to equilibrium), one may argue very roughly as follows: suppose that we start with a system in a state $|\Psi\rangle = \sum_n \gamma_n |n\rangle$ that is not an eigenstate of the true Hamiltonian $H$. Let us write
\[
|\Psi(t)\rangle = \sum_n \gamma_n(t)\, e^{-i E_n t/\hbar}\, |n\rangle \equiv e^{-iHt/\hbar} |\Psi\rangle
\]
for the time-evolved state. If there is no perturbation, i.e. $H_1 = 0$, we get
\[
\gamma_n(t) = \gamma_n = \text{const.},
\]
but for $H_1 \neq 0$ this is typically not the case. The time average of an operator (observable) $A$ is given by
\begin{equation*}
\lim_{T \to \infty} \frac{1}{T} \int_0^T \langle \Psi(t)| A |\Psi(t)\rangle \, dt = \lim_{T \to \infty} \operatorname{tr}\big(\rho(T) A\big), \tag{3.24}
\end{equation*}
with
\begin{equation*}
\langle n| \rho(T) |m\rangle = \frac{1}{T} \int_0^T \gamma_n(t)\, \overline{\gamma_m(t)}\, e^{i t (E_n - E_m)/\hbar} \, dt. \tag{3.25}
\end{equation*}
For $T \to \infty$ the oscillating phase factor $e^{it(E_n - E_m)/\hbar}$ is expected to cause the integral to vanish for $E_n \neq E_m$ (destructive interference). Thus, we expect that $\langle n| \rho(T) |m\rangle \xrightarrow[T \to \infty]{} p_n\, \delta_{n,m}$. It follows that
\begin{equation*}
\lim_{T \to \infty} \frac{1}{T} \int_0^T \langle \Psi(t)| A |\Psi(t)\rangle \, dt = \operatorname{tr}(A \rho), \tag{3.26}
\end{equation*}
where the density matrix $\rho$ is $\rho = \sum_n p_n |n\rangle\langle n|$. Since $[\rho, H_0] = 0$, the ensemble described by $\rho$ is stationary with respect to $H_0$. The plausibility of this "derivation" rests on the basic assumption that, while $\langle n| H_1 |n\rangle \ll E_n$, the perturbation can be large compared to the level spacing $\Delta E_n = E_n - E_{n+1} = \mathcal{O}(e^{-N})$ (where $N$ is the particle number) and can therefore induce transitions causing the system to equilibrate.
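The approximation just described lends itself to a numerical illustration (a sketch, not part of the notes; the rates below are made up). In units $\mathrm{k}_{\mathrm{B}} = 1$, the following integrates the master equation (3.23) with symmetric rates $T_{nm} = T_{mn}$ by a small Euler step; the resulting update matrix is doubly stochastic, so $h(t)$ is non-decreasing and $p_n(t)$ relaxes to the uniform distribution.

```python
import math
import random

def entropy(p):
    # h = -sum_n p_n log p_n, in units k_B = 1
    return -sum(q * math.log(q) for q in p if q > 0)

def evolve(p, T, dt, steps):
    """Euler steps for dp_n/dt = sum_{m != n} (T[n][m] p_m - T[m][n] p_n)."""
    n = len(p)
    hs = [entropy(p)]
    for _ in range(steps):
        dp = [sum(T[i][j] * p[j] - T[j][i] * p[i] for j in range(n) if j != i)
              for i in range(n)]
        p = [p[i] + dt * dp[i] for i in range(n)]
        hs.append(entropy(p))
    return p, hs

rng = random.Random(0)
n = 5
T = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i):                    # made-up symmetric rates T_nm = T_mn
        T[i][j] = T[j][i] = rng.uniform(0.1, 1.0)

p0 = [0.7, 0.1, 0.1, 0.05, 0.05]          # initial ensemble, far from equilibrium
p, hs = evolve(p0, T, dt=0.01, steps=2000)
# hs is monotonically non-decreasing; p ends up close to [1/5, ..., 1/5]
```

With symmetric rates the uniform distribution is the unique stationary state, in accordance with the equal a priori probabilities of the micro-canonical ensemble introduced in section 4.2.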

Chapter 4

Equilibrium Ensembles

4.1 Generalities

In the probabilistic description of a system with a large number of constituents one considers probability distributions (= ensembles) $\rho(P, Q)$ on phase space, rather than individual trajectories. In the previous chapter, we have given various arguments leading to the expectation that the time evolution of an ensemble will generally lead to an equilibrium ensemble. The study of such ensembles is the subject of equilibrium statistical mechanics. Standard equilibrium ensembles are:
(a) Micro-canonical ensemble (section 4.2).
(b) Canonical ensemble (section 4.3).
(c) Grand canonical (Gibbs) ensemble (section 4.4).

4.2 Micro-Canonical Ensemble

4.2.1 Micro-Canonical Ensemble in Classical Mechanics

Recall that in classical mechanics the phase space $\Omega$ of a system consisting of $N$ particles without internal degrees of freedom is given by
\begin{equation*}
\Omega = \mathbb{R}^{6N}. \tag{4.1}
\end{equation*}
As before, we define the energy surface $\Omega_E$ by
\begin{equation*}
\Omega_E = \{ (P, Q) \in \Omega : H(P, Q) = E \}, \tag{4.2}
\end{equation*}
where $H$ denotes the Hamiltonian of the system. In the micro-canonical ensemble each point of $\Omega_E$ is considered to be equally likely. In order to write down the corresponding ensemble, i.e. the density function $\rho(P, Q)$, we define the invariant volume $|\Omega_E|$ of $\Omega_E$ by
\begin{equation*}
|\Omega_E| := \lim_{\Delta E \to 0} \frac{1}{\Delta E} \int_{E - \Delta E \leqslant H(P,Q) \leqslant E} d^{3N}P \, d^{3N}Q, \tag{4.3}
\end{equation*}
which can also be expressed as
\begin{equation*}
|\Omega_E| = \frac{\partial \Phi(E)}{\partial E}, \quad \text{with} \quad \Phi(E) = \int_{H(P,Q) \leqslant E} d^{3N}P \, d^{3N}Q. \tag{4.4}
\end{equation*}
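The relation (4.4) can be checked in a toy example (not part of the notes): for a single one-dimensional harmonic oscillator, $H = p^2/2m + \frac{1}{2} m \omega^2 x^2$, the region $H \leqslant E$ is an ellipse of area $\Phi(E) = 2\pi E/\omega$, so $|\Omega_E| = \partial \Phi/\partial E = 2\pi/\omega$. A minimal Monte Carlo sketch:

```python
import random

def phi(E, m=1.0, omega=1.0, samples=200_000, rng=random.Random(1)):
    """Monte Carlo estimate of Phi(E): phase-space area with H(p, x) <= E,
    for H = p^2/(2m) + (m omega^2 / 2) x^2 (exact value: 2 pi E / omega)."""
    pmax = (2.0 * m * E) ** 0.5                  # |p| bound of the ellipse
    xmax = (2.0 * E / (m * omega ** 2)) ** 0.5   # |x| bound of the ellipse
    hits = 0
    for _ in range(samples):
        p = rng.uniform(-pmax, pmax)
        x = rng.uniform(-xmax, xmax)
        if p * p / (2.0 * m) + 0.5 * m * omega ** 2 * x * x <= E:
            hits += 1
    return hits / samples * (2.0 * pmax) * (2.0 * xmax)

E, dE = 1.0, 0.1
omega_E = (phi(E + dE) - phi(E - dE)) / (2.0 * dE)  # estimate of dPhi/dE
# omega_E should come out close to 2 pi / omega ~ 6.2832
```

Here the central difference is exact up to sampling noise, since $\Phi$ is linear in $E$ for this system.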
Thus, we can write the probability density of the micro-canonical ensemble as
\begin{equation*}
\rho(P, Q) = \frac{1}{|\Omega_E|} \delta\big(H(P, Q) - E\big). \tag{4.5}
\end{equation*}
To avoid subtleties coming from the $\delta$-function for sharp energy, one sometimes replaces this expression by
\begin{equation*}
\rho(P, Q) = \frac{1}{|\{ E - \Delta E \leqslant H(P,Q) \leqslant E \}|} \cdot
\begin{cases}
1, & \text{if } H(P,Q) \in (E - \Delta E, E), \\
0, & \text{if } H(P,Q) \notin (E - \Delta E, E).
\end{cases} \tag{4.6}
\end{equation*}
Strictly speaking, this depends not only on $E$ but also on $\Delta E$. But in typical cases $|\Omega_E|$ depends exponentially on $E$, so there is practically no difference between these two expressions for $\rho(P, Q)$ as long as $\Delta E \lesssim E$. We may alternatively write the second definition as:
\begin{equation*}
\rho = \frac{1}{W(E)} \big[ \Theta(H - E + \Delta E) - \Theta(H - E) \big]. \tag{4.7}
\end{equation*}
Here we have used the Heaviside step function $\Theta$, defined by
\[
\Theta(E) = \begin{cases} 1, & \text{for } E > 0, \\ 0, & \text{otherwise.} \end{cases}
\]
We have also defined
\begin{equation*}
W(E) = |\{ E - \Delta E \leqslant H(P,Q) \leqslant E \}|. \tag{4.8}
\end{equation*}
Following Boltzmann, we give the following
Definition: The entropy of the micro-canonical ensemble is defined by
\begin{equation*}
S(E) = \mathrm{k}_{\mathrm{B}} \log W(E). \tag{4.9}
\end{equation*}
As we have already said, in typical cases changing the definition of $S(E)$ to $\mathrm{k}_{\mathrm{B}} \log |\Omega_E|$ will not significantly change the result. It is not hard to see that we may equivalently write in either case
\begin{equation*}
S(E) = -\mathrm{k}_{\mathrm{B}} \int \rho(P,Q) \log \rho(P,Q) \, d^{3N}P \, d^{3N}Q = S_{\mathrm{inf}}(\rho), \tag{4.10}
\end{equation*}
i.e. Boltzmann's definition of entropy coincides with the definition of the information entropy (2.15) of the micro-canonical ensemble $\rho$. As defined, $S$ is a function of $E$ and, implicitly, of $V$ and $N$, since these enter the definition of the Hamiltonian and of the phase space. Sometimes one also specifies constants of motion or parameters of the system other than $E$ when defining $S$. Denoting these constants collectively as $\{I_\alpha\}$, one defines $W$ accordingly with respect to $E$ and $\{I_\alpha\}$ by replacing the energy surface with:
\begin{equation*}
\Omega_{E, \{I_\alpha\}} := \big\{ (P, Q) \in \Omega : H(P, Q) = E, \; I_\alpha(P, Q) = I_\alpha \big\}. \tag{4.11}
\end{equation*}
In this case $S = S(E, \{I_\alpha\}, N, V)$ becomes a function of more variables.

Example:

The ideal gas of $N$ particles in a box has the Hamiltonian $H = \sum_{i=1}^{N} \left( \frac{\vec{p}_i^{\,2}}{2m} + \mathcal{W}(\vec{x}_i) \right)$, where the external potential $\mathcal{W}$ represents the walls of a box of volume $V$. For a box with hard walls we take, for example,
\begin{equation*}
\mathcal{W}(\vec{x}) = \begin{cases} 0 & \text{inside } V, \\ \infty & \text{outside } V. \end{cases} \tag{4.12}
\end{equation*}
For the energy surface $\Omega_E$ we then find
\begin{equation*}
\Omega_E = \Big\{ (P, Q) \in \Omega \;\Big|\; \underbrace{\vec{x}_i \text{ inside the box}}_{\to\, V^N}, \quad \underbrace{\sum_{i=1}^N \vec{p}_i^{\,2} = 2Em}_{\substack{\text{hypersphere of dimension } 3N-1 \\ \text{and radius } \sqrt{2Em}}} \Big\}, \tag{4.13}
\end{equation*}
from which it follows that
\begin{equation*}
|\Omega_E| = V^N \sqrt{2Em}^{\,3N-1}\; \underbrace{\operatorname{area}\big(S^{3N-1}\big)}_{= \frac{2 \pi^{3N/2}}{\Gamma(\frac{3N}{2})}}. \tag{4.14}
\end{equation*}
Here, $\Gamma(x) = (x-1)!$ denotes the $\Gamma$-function. The entropy $S(E, V, N)$ is therefore given by
\begin{equation*}
S(E, V, N) \approx \mathrm{k}_{\mathrm{B}} \left[ N \log V + \frac{3N}{2} \log(2\pi m E) - \frac{3N}{2} \log \frac{3N}{2} + \frac{3N}{2} \right], \tag{4.15}
\end{equation*}
where we have used Stirling's approximation:
\begin{align*}
\log x! &\approx \sum_{i=1}^x \log i \approx \int_1^x \log y \, dy = x \log x - x + 1 \\
\Rightarrow \quad x! &\approx e^{-x} x^x.
\end{align*}
Thus, we obtain for the entropy of the ideal gas:
\begin{equation*}
S(E, V, N) \approx N \mathrm{k}_{\mathrm{B}} \log \left[ V \left( \frac{4\pi e m E}{3N} \right)^{3/2} \right]. \tag{4.16}
\end{equation*}
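The accuracy of the Stirling approximation behind (4.16) can be quantified by evaluating $\log |\Omega_E|$ from (4.14) exactly with `math.lgamma` and comparing. A sketch in units $\mathrm{k}_{\mathrm{B}} = m = 1$ (the values of $E/N$ and $V$ are arbitrary illustrations):

```python
import math

def log_omega_exact(E, V, N, m=1.0):
    """log|Omega_E| from (4.14): V^N sqrt(2mE)^(3N-1) * 2 pi^(3N/2) / Gamma(3N/2)."""
    return (N * math.log(V)
            + 0.5 * (3 * N - 1) * math.log(2.0 * m * E)
            + math.log(2.0) + 1.5 * N * math.log(math.pi)
            - math.lgamma(1.5 * N))

def s_stirling(E, V, N, m=1.0):
    """Entropy (4.16) in units k_B = 1: N log[V (4 pi e m E / 3N)^{3/2}]."""
    return N * math.log(V * (4.0 * math.pi * math.e * m * E / (3.0 * N)) ** 1.5)

E_per_particle, V = 1.5, 10.0
rels = []
for N in (100, 10_000, 1_000_000):
    exact = log_omega_exact(N * E_per_particle, V, N)
    rels.append(abs(exact - s_stirling(N * E_per_particle, V, N)) / abs(exact))
# rels decreases toward 0 as N grows: (4.16) becomes exact per particle
```

Already at $N = 100$ the relative error is below one part in a thousand, which is why the distinction between $W(E)$ and $|\Omega_E|$ made above is immaterial in practice.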
Given the function $S(E, V, N)$ for a system, one can define the corresponding temperature, pressure and chemical potential as follows:
Definition: The empirical temperature $T$, pressure $P$ and chemical potential $\mu$ of the micro-canonical ensemble are defined as:
\begin{equation*}
\frac{1}{T} := \left. \frac{\partial S}{\partial E} \right|_{V, N}, \quad P := \left. T \frac{\partial S}{\partial V} \right|_{E, N}, \quad \mu := -\left. T \frac{\partial S}{\partial N} \right|_{E, V}. \tag{4.17}
\end{equation*}
For the ideal classical gas this definition, together with (4.16), yields for instance
\begin{equation*}
\frac{1}{T} = \frac{\partial S}{\partial E} = \frac{3}{2} \frac{N \mathrm{k}_{\mathrm{B}}}{E}, \tag{4.18}
\end{equation*}
which we can rewrite in the more familiar form
\begin{equation*}
E = \frac{3}{2} N \mathrm{k}_{\mathrm{B}} T. \tag{4.19}
\end{equation*}
This formula states that for the ideal gas we have the equidistribution law
\begin{equation*}
\frac{\text{average energy}}{\text{degree of freedom}} = \frac{1}{2} \mathrm{k}_{\mathrm{B}} T. \tag{4.20}
\end{equation*}
One can similarly verify that the abstract definition of $P$ in (4.17) above gives
\begin{equation*}
P V = N \mathrm{k}_{\mathrm{B}} T, \tag{4.21}
\end{equation*}
which is the familiar "equation of state" for an ideal gas.
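As a consistency check (a sketch, not part of the notes; units $\mathrm{k}_{\mathrm{B}} = m = 1$ and arbitrary values of $E$, $V$, $N$), one can apply the definitions (4.17) to the explicit entropy (4.16) by numerical differentiation and recover (4.18) and (4.21):

```python
import math

def S(E, V, N, m=1.0):
    # ideal-gas entropy (4.16), in units k_B = 1
    return N * math.log(V * (4.0 * math.pi * math.e * m * E / (3.0 * N)) ** 1.5)

def d(f, x, h=1e-6):
    # central finite difference
    return (f(x + h) - f(x - h)) / (2.0 * h)

E, V, N = 300.0, 50.0, 200

inv_T = d(lambda e: S(e, V, N), E)      # 1/T = dS/dE at fixed V, N
T = 1.0 / inv_T
P = T * d(lambda v: S(E, v, N), V)      # P = T dS/dV at fixed E, N
# expect: E = (3/2) N T  and  P V = N T
```

With these numbers, $T = 2E/(3N)$ and $PV = NT$ hold to the accuracy of the finite differences, as (4.19) and (4.21) require.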
In order to further motivate the second relation in (4.17), we consider a system comprised of a piston applied to an enclosed gas chamber:
Figure 4.1: Gas in a piston maintained at pressure $P$.
Here, we obviously have $PV = mgz$. The total energy is obtained as
\begin{equation*}
H_{\text{total}} = H_{\text{gas}}(P, Q) + H_{\text{piston}}(p, z) = H_{\text{gas}}(P, Q) + \underbrace{p^2/2m}_{\text{kin. energy of piston}} + \overbrace{mgz}^{\text{pot. energy of piston}}, \tag{4.22}
\end{equation*}
where $m$ is the mass of the piston (in a moment, we will let $m \to \infty$ and at the same time $g \to 0$). Next, we calculate the reduced probability distribution of the piston, assuming that the total (piston-gas) system is in the micro-canonical ensemble:
\begin{align*}
\rho_{\text{piston}}(p, z) &= \frac{1}{W_{\text{total}}(E_{\text{total}}, N)} \int_{E_{\text{total}} - \Delta E \leqslant H_{\text{gas}} + p^2/2m + mgz \leqslant E_{\text{total}}} d^{3N}P \, d^{3N}Q \\
&= \frac{W_{\text{gas}}\big(E_{\text{total}} - PV - p^2/2m, \, V, \, N\big)}{W_{\text{total}}(E_{\text{total}}, N)},
\end{align*}
using that the force $F = mg$ is also equal to $F = PA$, where $A$ is the area of the piston (so that $V = Az$ is the volume occupied by the gas, hence $mgz = PV$). Now we let $m \to \infty$, keeping the force $F = mg$ acting on the piston constant. Then the dependence on $p$ clearly drops out. Let us calculate for which $z$ the probability $\rho_{\text{piston}}(z) \equiv \rho_{\text{piston}}(z, p)$ of finding the piston at position $z$ is maximized. This happens when
\begin{align*}
0 &= \frac{d}{dz} W_{\text{gas}}(E_{\text{total}} - PV, V, N) = \frac{d}{dV} W_{\text{gas}}(E_{\text{total}} - PV, V, N) \\
&= \frac{\partial W_{\text{gas}}}{\partial E} \cdot (-P) + \frac{\partial W_{\text{gas}}}{\partial V} = \left( -\frac{P}{\mathrm{k}_{\mathrm{B}}} \frac{\partial S_{\text{gas}}}{\partial E} + \frac{1}{\mathrm{k}_{\mathrm{B}}} \frac{\partial S_{\text{gas}}}{\partial V} \right) e^{S_{\text{gas}}/\mathrm{k}_{\mathrm{B}}}.
\end{align*}
Using $S_{\text{gas}} = \mathrm{k}_{\mathrm{B}} \log W_{\text{gas}}$, it follows that
\begin{equation*}
\left. \frac{\partial S_{\text{gas}}}{\partial V} \right|_{E, N} = P \frac{\partial S_{\text{gas}}}{\partial E} = \frac{P}{T}, \tag{4.23}
\end{equation*}
which gives the desired relation
\begin{equation*}
P = \left. T \frac{\partial S}{\partial V} \right|_{E, N}. \tag{4.24}
\end{equation*}
The quantity $E_{\text{total}} = E_{\text{gas}} + PV$ is also called the enthalpy.
It is instructive to compare the definition of the temperature in (4.17) with the parameter $\beta$ that arose in the Maxwell-Boltzmann distribution (3.11), which we also interpreted as a temperature there. We first ask the following question: what is the probability of finding particle number 1 with momentum lying between $\vec{p}_1$ and $\vec{p}_1 + d\vec{p}_1$? The answer is $W(\vec{p}_1)\, d^3p_1$, where $W(\vec{p}_1)$ is given by
\begin{equation*}
W(\vec{p}_1) = \int \rho(P, Q) \, d^3p_2 \cdots d^3p_N \, d^3x_1 \cdots d^3x_N. \tag{4.25}
\end{equation*}
This is of course nothing but the reduced probability distribution where system A consists of the phase space coordinate $\vec{p}_1$, and system B of all other coordinates $(\vec{x}_1, \vec{p}_2, \vec{x}_2, \vec{p}_3, \vec{x}_3, \ldots)$. We wish to calculate this for the ideal gas. To this end we introduce the Hamiltonian $H'$ and the kinetic energy $E'$ for the remaining atoms:
\begin{align*}
H' &= \sum_{i=2}^N \left( \frac{\vec{p}_i^{\,2}}{2m} + \mathcal{W}(\vec{x}_i) \right), \tag{4.26} \\
E' &= E - \frac{\vec{p}_1^{\,2}}{2m}, \qquad E - H = E' - H'. \tag{4.27}
\end{align*}
From this we get, together with (4.25) and (4.5):
\begin{align*}
W(\vec{p}_1) &= \frac{V}{|\Omega_E|} \int \delta(E' - H') \prod_{i=2}^N d^3p_i \, d^3x_i = \frac{V\, |\Omega_{E', N-1}|}{|\Omega_{E, N}|} \\
&= \frac{\left(\frac{3}{2}N - 1\right)!}{\pi^{3/2} \left(\frac{3}{2}N - \frac{5}{2}\right)! \, (2mE)^{3/2}} \left( \frac{E'}{E} \right)^{\frac{3N}{2} - \frac{5}{2}}. \tag{4.28}
\end{align*}
Using now the relations
\[
\frac{\left(\frac{3N}{2} + a\right)!}{\left(\frac{3N}{2} + b\right)!} \approx \left( \frac{3N}{2} \right)^{a - b}, \quad \text{for } a, b \ll \frac{3N}{2},
\]
we see that for a sufficiently large number of particles (e.g. $N = \mathcal{O}(10^{23})$)
\begin{equation*}
W(\vec{p}_1) \approx \left( \frac{3N}{4\pi m E} \right)^{3/2} \left( 1 - \frac{\vec{p}_1^{\,2}}{2mE} \right)^{\frac{3N}{2} - \frac{5}{2}}. \tag{4.29}
\end{equation*}
Using
$$
\left(1-\frac{a}{N}\right)^{b N} \xrightarrow{N \rightarrow \infty} e^{-a b}
$$
and $\beta=\frac{3 N}{2 E}$ (equivalently $E=\frac{3}{2} N \mathrm{k}_{\mathrm{B}} T$), we find that
$$
\begin{equation*}
\left(1-\frac{\vec{p}_{1}^{\,2}}{2 m E}\right)^{\frac{3 N}{2}-\frac{5}{2}} \xrightarrow{N \rightarrow \infty} e^{-\frac{3 N}{2} \frac{\vec{p}_{1}^{\,2}}{2 m E}} \tag{4.30}
\end{equation*}
$$
Consequently, we get exactly the Maxwell-Boltzmann distribution
$$
\begin{equation*}
W\left(\vec{p}_{1}\right)=\left(\frac{\beta}{2 \pi m}\right)^{\frac{3}{2}} e^{-\beta \frac{\vec{p}_{1}^{\,2}}{2 m}} \tag{4.31}
\end{equation*}
$$
which confirms our interpretation of $\beta$ as $\beta=\frac{1}{\mathrm{k}_{\mathrm{B}} T}$.
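This convergence is easy to check numerically. The following is a minimal sketch, with the (purely illustrative) choice of units $m=\mathrm{k}_{\mathrm{B}}=1$ and $\beta=1$, comparing the finite-$N$ distribution (4.29) with its Maxwell-Boltzmann limit (4.31):

```python
import math

def w_micro(p, N, m=1.0, beta=1.0):
    """Finite-N momentum density of one particle, eq. (4.29),
    with the total energy fixed by E = 3N/(2*beta)."""
    E = 3 * N / (2 * beta)
    prefac = (3 * N / (4 * math.pi * m * E)) ** 1.5
    return prefac * (1 - p ** 2 / (2 * m * E)) ** (1.5 * N - 2.5)

def w_maxwell(p, m=1.0, beta=1.0):
    """Maxwell-Boltzmann limit, eq. (4.31)."""
    return (beta / (2 * math.pi * m)) ** 1.5 * math.exp(-beta * p ** 2 / (2 * m))

# the finite-N curve approaches the Maxwell-Boltzmann one as N grows
for N in (10, 10**3, 10**6):
    print(N, w_micro(1.0, N), w_maxwell(1.0))
```

The deviation shrinks like $1/N$, so already for modest particle numbers the two distributions are indistinguishable in practice.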
We can also confirm the interpretation of $\beta$ by the following consideration: consider two initially isolated systems and put them in thermal contact. The resulting joint probability distribution is given by
$$
\begin{equation*}
\rho(P, Q)=\frac{1}{\left|\Omega_{E}\right|} \delta\big(\underbrace{H_{1}\left(P_{1}, Q_{1}\right)}_{\text{system 1}}+\underbrace{H_{2}\left(P_{2}, Q_{2}\right)}_{\text{system 2}}-E\big). \tag{4.32}
\end{equation*}
$$
Since only the overall energy is fixed, we may write for the total allowed phase space volume (exercise):
$$
\begin{align*}
\left|\Omega_{E}\right| & =\int d E_{1}\, d E_{2} \underbrace{\left|\Omega_{E_{1}}\right|}_{\text{system 1}} \cdot \underbrace{\left|\Omega_{E_{2}}\right|}_{\text{system 2}} \cdot \delta\left(E-E_{1}-E_{2}\right) \\
& =\int d E_{1}\, e^{\frac{S_{1}\left(E_{1}\right)+S_{2}\left(E-E_{1}\right)}{\mathrm{k}_{\mathrm{B}}}} \tag{4.33}
\end{align*}
$$
For typical systems, the integrand is very sharply peaked at the maximum $\left(E_{1}^{*}, E_{2}^{*}\right)$, as depicted in the following figure:
Figure 4.2: The joint number of states for two systems in thermal contact.
At the maximum we have $\frac{\partial S_{1}}{\partial E}\left(E_{1}^{*}\right)=\frac{\partial S_{2}}{\partial E}\left(E_{2}^{*}\right)$, from which we get the relation:
$$
\begin{equation*}
\frac{1}{T_{1}}=\frac{1}{T_{2}}=\frac{1}{T} \quad \text{(uniformity of temperature).} \tag{4.34}
\end{equation*}
$$
Since one expects the integrand to be very sharply peaked at $\left(E_{1}^{*}, E_{2}^{*}\right)$, the integral in (4.33) can be approximated by
$$
S(E) \approx S_{1}\left(E_{1}^{*}\right)+S_{2}\left(E_{2}^{*}\right),
$$
which means that the entropy is (approximately) additive. Note that from the condition that $\left(E_{1}^{*}, E_{2}^{*}\right)$ is a genuine maximum (not just a stationary point), one gets the important stability condition
$$
\begin{equation*}
\frac{\partial^{2} S_{1}}{\partial E_{1}^{2}}+\frac{\partial^{2} S_{2}}{\partial E_{2}^{2}} \leqslant 0 \tag{4.35}
\end{equation*}
$$
implying $\frac{\partial^{2} S}{\partial E^{2}} \leqslant 0$ if applied to two copies of the same system. We can apply the same considerations if $S$ depends on additional parameters, such as other constants of motion. Denoting the parameters collectively as $X=\left(X_{1}, \ldots, X_{n}\right)$, the stability condition becomes
$$
\begin{equation*}
\sum_{i, j} \frac{\partial^{2} S}{\partial X_{i} \partial X_{j}} v_{i} v_{j} \leqslant 0 \tag{4.36}
\end{equation*}
$$
for any choice of displacements $v_{i}$ (negative semi-definiteness of the Hessian matrix). Thus, in this case, $S$ is a concave function of its arguments. Otherwise, if the Hessian matrix has a positive eigenvalue, e.g. in the $i$-th coordinate direction, then the corresponding displacement $v_{i}$ will drive the system to an inhomogeneous state, i.e. one where the quantity $X_{i}$ takes different values in different parts of the system (different phases).
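The sharp-peak approximation behind the additivity of the entropy can be checked numerically. In the sketch below (a toy model, with $\mathrm{k}_{\mathrm{B}}=1$) we take two identical ideal-gas-like systems with $|\Omega_{E_i}| = E_i^{a}$ where $a \sim \frac{3N}{2}$; then the convolution integral (4.33) is a Beta function, and its logarithm can be compared with the value of the integrand at the maximum $E_1^* = E/2$:

```python
import math

def log_omega_total(E, a):
    """log of the convolution integral (4.33) for two identical systems with
    |Omega_{E_i}| = E_i**a (ideal-gas-like, a ~ 3N/2, k_B = 1):
    int_0^E E1**a (E - E1)**a dE1 = E**(2a+1) * B(a+1, a+1)."""
    log_beta_fn = 2 * math.lgamma(a + 1) - math.lgamma(2 * a + 2)
    return (2 * a + 1) * math.log(E) + log_beta_fn

def log_peak(E, a):
    """log of the integrand at the maximum E1* = E/2, i.e. (S1 + S2)/k_B."""
    return 2 * a * math.log(E / 2)

E = 1.0
for a in (15.0, 1.5e3, 1.5e6):   # a = 3N/2 for N = 10, 10**3, 10**6
    exact, peak = log_omega_total(E, a), log_peak(E, a)
    print(a, exact, peak, abs(exact - peak) / abs(peak))
```

The relative discrepancy is of order $(\log N)/N$, i.e. subextensive, so $S(E) \approx S_1(E_1^*) + S_2(E_2^*)$ becomes exact per particle in the thermodynamic limit.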

4.2.2 Microcanonical Ensemble in Quantum Mechanics

Let $H$ be the Hamiltonian of a system with eigenstates $|n\rangle$ and eigenvalues $E_{n}$, i.e. $H|n\rangle=E_{n}|n\rangle$, and consider the density matrix
$$
\begin{equation*}
\rho=\frac{1}{W} \sum_{n:\, E-\Delta E \leqslant E_{n} \leqslant E}|n\rangle\langle n|, \tag{4.37}
\end{equation*}
$$
where the normalization constant $W$ is chosen such that $\operatorname{tr} \rho=1$. The density matrix $\rho$ is analogous to the distribution function $\rho(P, Q)$ of the classical microcanonical ensemble, eq. (4.6), since it gives equal probability to all eigenstates with energies lying between $E-\Delta E$ and $E$. By analogy with the classical case we get
$$
\begin{equation*}
W=\text{number of states with energies between } E-\Delta E \text{ and } E, \tag{4.38}
\end{equation*}
$$
and we define the corresponding entropy S ( E ) S ( E ) S(E)S(E)S(E) again by
$$
\begin{equation*}
S(E)=\mathrm{k}_{\mathrm{B}} \log W(E). \tag{4.39}
\end{equation*}
$$
Since $W(E)$ is equal to the number of states with energies lying between $E-\Delta E$ and $E$, it also depends, strictly speaking, on $\Delta E$. But for $\Delta E \lesssim E$ and large $N$, this dependence can be neglected. Note that
$$
\begin{align*}
S_{\mathrm{v.N.}}(\rho) & =-\mathrm{k}_{\mathrm{B}} \operatorname{tr}(\rho \log \rho)=-\mathrm{k}_{\mathrm{B}} \sum_{n:\, E-\Delta E \leqslant E_{n} \leqslant E} \frac{1}{W} \log \frac{1}{W} \\
& =\mathrm{k}_{\mathrm{B}} \log W \cdot \frac{1}{W} \sum_{n:\, E-\Delta E \leqslant E_{n} \leqslant E} 1 \\
& =\mathrm{k}_{\mathrm{B}} \log W,
\end{align*}
$$
so $S=\mathrm{k}_{\mathrm{B}} \log W$ is equal to the von Neumann entropy of the statistical operator $\rho$ defined in (4.37) above.
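Since the microcanonical $\rho$ is diagonal with $W$ equal eigenvalues $1/W$, this identity is also easy to verify directly. A minimal sketch (with $\mathrm{k}_{\mathrm{B}}=1$ and a made-up list of energy levels, both purely illustrative):

```python
import math

def von_neumann_entropy(probs, kB=1.0):
    """S = -k_B * sum p log p for a density matrix with eigenvalues `probs`."""
    return -kB * sum(p * math.log(p) for p in probs if p > 0)

# made-up energy levels; microcanonical window [E - dE, E]
levels = [0.5, 1.1, 1.4, 1.9, 2.0, 2.3, 3.7]
E, dE = 2.5, 1.5
W = sum(1 for En in levels if E - dE <= En <= E)   # eq. (4.38)
rho_eigenvalues = [1.0 / W] * W                     # eq. (4.37): equal weights
print(W, von_neumann_entropy(rho_eigenvalues), math.log(W))
```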
Example: Free particle in a box We consider a free particle ($N=1$) in a box of side lengths $\left(L_{x}, L_{y}, L_{z}\right)$. The Hamiltonian is given by $H=\frac{1}{2 m}\left(p_{x}^{2}+p_{y}^{2}+p_{z}^{2}\right)$. We impose boundary conditions such that the normalized wave function $\Psi$ vanishes at the boundary of the box. This yields the eigenstates
$$
\begin{equation*}
\Psi(x, y, z)=\sqrt{\frac{8}{V}} \sin\left(k_{x} x\right) \sin\left(k_{y} y\right) \sin\left(k_{z} z\right) \tag{4.40}
\end{equation*}
$$
where $k_{x}=\frac{\pi n_{x}}{L_{x}}, \ldots$, with $n_{x}=1,2,3, \ldots$
The corresponding energy eigenvalues are given by
$$
\begin{equation*}
E_{n}=\frac{\hbar^{2}}{2 m}\left(k_{x}^{2}+k_{y}^{2}+k_{z}^{2}\right) \tag{4.41}
\end{equation*}
$$
since $p_{x}=\frac{\hbar}{i} \frac{\partial}{\partial x}$, etc. Recall that $W$ was defined by
$$
W=\text{number of states } \left|n_{x}, n_{y}, n_{z}\right\rangle \text{ with } E-\Delta E \leqslant E_{n} \leqslant E.
$$
The following figure gives a sketch of this situation (with $k_{z}=0$):
Figure 4.3: Number of states with energies lying between E Δ E E Δ E E-Delta EE-\Delta EEΔE and E E EEE.
In the continuum approximation we have (recalling that $\hbar=\frac{h}{2 \pi}$):
$$
\begin{align*}
W & =\sum_{E-\Delta E \leqslant E_{n} \leqslant E} 1 \approx \int_{\left\{E-\Delta E \leqslant E_{n} \leqslant E\right\}} d^{3} n \\
& =\int_{\left\{E-\Delta E \leqslant \frac{\hbar^{2}}{2 m}\left(k_{x}^{2}+k_{y}^{2}+k_{z}^{2}\right) \leqslant E\right\}} \frac{L_{x} L_{y} L_{z}}{\pi^{3}}\, d^{3} k \\
& =\left(\frac{2 m}{\hbar^{2}}\right)^{\frac{3}{2}} \frac{V}{\pi^{3}} \int_{\left\{E-\Delta E \leqslant E^{\prime} \leqslant E\right\}} \frac{1}{2} E^{\prime \frac{1}{2}}\, d E^{\prime} \int_{1/8 \text{ of } S^{2}} d^{2} \Omega \\
& =\left.\frac{4 \pi}{3} \frac{V}{(2 \pi)^{3}}\left(\frac{2 m E^{\prime}}{\hbar^{2}}\right)^{\frac{3}{2}}\right|_{E^{\prime}=E-\Delta E}^{E^{\prime}=E} \tag{4.42}
\end{align*}
$$
If we compute $W$ according to the definition in classical mechanics, we would get
$$
\begin{align*}
W & =\int_{\{E-\Delta E \leqslant H \leqslant E\}} d^{3} p\, d^{3} x=V \int_{\left\{E-\Delta E \leqslant \frac{\vec{p}^{\,2}}{2 m} \leqslant E\right\}} d^{3} p \\
& =V(2 m)^{\frac{3}{2}} \int_{\left\{E-\Delta E \leqslant E^{\prime} \leqslant E\right\}} \frac{1}{2} E^{\prime \frac{1}{2}}\, d E^{\prime} \int_{S^{2}} d^{2} \Omega \\
& =\left.\frac{4 \pi}{3} V\left(2 m E^{\prime}\right)^{\frac{3}{2}}\right|_{E^{\prime}=E-\Delta E}^{E^{\prime}=E}
\end{align*}
$$
This is just $h^{3}$ times the quantum mechanical result. For the case of $N$ particles, this suggests the following relation${}^{1}$:
$$
\begin{equation*}
W_{N}^{\mathrm{qm}} \approx \frac{1}{h^{3 N}} W_{N}^{\mathrm{cl}} \tag{4.44}
\end{equation*}
$$
This can be understood intuitively by recalling the uncertainty relation $\Delta p\, \Delta x \gtrsim h$, together with $p \sim \frac{\pi n \hbar}{V^{1 / 3}}$, $n \in \mathbb{N}$.
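The continuum approximation (4.42) can be tested by brute force: count the integer triples $(n_x, n_y, n_z)$ in the spherical shell directly and compare with the formula. A minimal sketch, with the (illustrative) choice of units $\hbar = m = L_x = L_y = L_z = 1$, so $V = 1$:

```python
import math

# Free particle in a box with hbar = m = L_x = L_y = L_z = 1 (so V = 1);
# then E_n = (pi**2 / 2) * (nx**2 + ny**2 + nz**2), nx, ny, nz = 1, 2, 3, ...
def count_states(E, dE):
    """Brute-force count of states |nx, ny, nz> with E - dE <= E_n <= E."""
    c = math.pi ** 2 / 2
    n_max = math.isqrt(int(E / c)) + 2
    return sum(1
               for nx in range(1, n_max + 1)
               for ny in range(1, n_max + 1)
               for nz in range(1, n_max + 1)
               if E - dE <= c * (nx ** 2 + ny ** 2 + nz ** 2) <= E)

def continuum(E, dE):
    """Continuum approximation (4.42) with V = hbar = m = 1."""
    f = lambda e: (4 * math.pi / 3) / (2 * math.pi) ** 3 * (2 * e) ** 1.5
    return f(E) - f(E - dE)

E, dE = 2.0e4, 2.0e3
print(count_states(E, dE), continuum(E, dE))
```

The two numbers agree up to surface corrections of relative order $1/\sqrt{E}$, which is the usual boundary effect of replacing a lattice sum by an integral.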

4.2.3 Mixing entropy of the ideal gas

A puzzle concerning the definition of entropy in the micro-canonical ensemble (e.g. for an ideal gas) is revealed if we consider the following situation of two chambers, each of which is filled with an ideal gas:
Figure 4.4: Two gases separated by a removable wall.
The total volume is given by $V=V_{1}+V_{2}$, the total particle number by $N=N_{1}+N_{2}$, and the total energy by $E=E_{1}+E_{2}$. Both gases are at the same temperature $T$. Using the expression (4.16) for the classical ideal gas, the entropies $S_{i}\left(N_{i}, V_{i}, E_{i}\right)$ are calculated as
$$
\begin{equation*}
S_{i}\left(N_{i}, V_{i}, E_{i}\right)=N_{i} \mathrm{k}_{\mathrm{B}} \log \left[V_{i}\left(\frac{4 \pi e m_{i} E_{i}}{3 N_{i}}\right)^{\frac{3}{2}}\right] \tag{4.45}
\end{equation*}
$$
The wall is now removed and the gases can mix. The temperature of the resulting ideal gas is determined by
$$
\begin{equation*}
\frac{3}{2} \mathrm{k}_{\mathrm{B}} T=\frac{E_{1}+E_{2}}{N_{1}+N_{2}}=\frac{E_{i}}{N_{i}} \tag{4.46}
\end{equation*}
$$
The total entropy $S$ is now found as (we assume $m_{1}=m_{2} \equiv m$ for simplicity):
$$
\begin{align*}
S & =N \mathrm{k}_{\mathrm{B}} \log \left[V\left(2 \pi e m \mathrm{k}_{\mathrm{B}} T\right)^{\frac{3}{2}}\right] \\
& =\underbrace{N \mathrm{k}_{\mathrm{B}} \log V-N_{1} \mathrm{k}_{\mathrm{B}} \log V_{1}-N_{2} \mathrm{k}_{\mathrm{B}} \log V_{2}}_{\Delta S}+S_{1}+S_{2}, \tag{4.47}
\end{align*}
$$
From this it follows that the mixing entropy Δ S Δ S Delta S\Delta SΔS is given by
$$
\begin{align*}
\Delta S & =N_{1} \mathrm{k}_{\mathrm{B}} \log \frac{V}{V_{1}}+N_{2} \mathrm{k}_{\mathrm{B}} \log \frac{V}{V_{2}} \\
& =-N \mathrm{k}_{\mathrm{B}} \sum_{i} c_{i} \log v_{i} \tag{4.48}
\end{align*}
$$
with $c_{i}=\frac{N_{i}}{N}$ and $v_{i}=\frac{V_{i}}{V}$. This holds also for an arbitrary number of components and raises the following paradox: if both gases are identical, with the same density $\frac{N_{1}}{V_{1}}=\frac{N_{2}}{V_{2}}$, then from a macroscopic viewpoint clearly "nothing happens" as the wall is removed. Yet, $\Delta S \neq 0$. The resolution of this paradox is that the particles have been treated as distinguishable, i.e. the states

have been counted as microscopically different. However, if both gases are the same, they ought to be treated as indistinguishable. This results in a different definition of $W$ in the two cases. Namely, depending on the case considered, the correct definition of $W$ should be:
$$
W\left(E, V,\left\{N_{i}\right\}\right):= \begin{cases}\left|\Omega\left(E, V,\left\{N_{i}\right\}\right)\right| & \text{if distinguishable} \\ \frac{1}{\prod_{i} N_{i}!}\left|\Omega\left(E, V,\left\{N_{i}\right\}\right)\right| & \text{if indistinguishable}\end{cases} \tag{4.49}
$$
where $N_{i}$ is the number of particles of species $i$. Thus, the second definition is the physically correct one in our case. With this change (which in turn results in a different definition of the entropy $S$), the mixing entropy of two identical gases is now $\Delta S=0$. In quantum mechanics the symmetry factor $\frac{1}{N!}$ in $W^{\mathrm{qm}}$ (for each species of indistinguishable particles) is automatically included due to the Bose/Fermi alternative, which we shall discuss later, leading to an automatic resolution of the paradox.
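The effect of the Gibbs factor $1/N!$ is easy to see numerically. In this sketch (with the illustrative choice $\mathrm{k}_{\mathrm{B}}=m=1$ and made-up parameter values) a gas is split into two identical halves at equal temperature and density; the "distinguishable" entropy (4.45) produces a spurious mixing entropy $N \mathrm{k}_{\mathrm{B}} \log 2$, while the $1/N!$-corrected one gives essentially zero:

```python
import math

def S_dist(N, V, E, m=1.0):
    """Ideal-gas entropy (4.45) for distinguishable particles, k_B = 1."""
    return N * math.log(V * (4 * math.pi * math.e * m * E / (3 * N)) ** 1.5)

def S_indist(N, V, E, m=1.0):
    """Same entropy with the Gibbs factor 1/N! of (4.49): S -> S - log N!."""
    return S_dist(N, V, E, m) - math.lgamma(N + 1)

# one gas split into two identical halves (equal T and equal density)
N, V, E = 1.0e4, 1.0, 1.5e4
for S in (S_dist, S_indist):
    dS = S(N, V, E) - 2 * S(N / 2, V / 2, E / 2)
    print(S.__name__, dS)   # N*log(2) without the Gibbs factor, ~0 with it
```

The residual mixing entropy in the corrected case is only of order $\log N$ (a Stirling correction), which is negligible compared to the extensive $N \log 2$.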
The non-zero mixing entropy of two identical gases is seen to be unphysical also
at the classical level because the entropy should be an extensive quantity. Indeed, the arguments of the previous subsection suggest that for $V_{1}=V_{2}=\frac{1}{2} V$ and $N_{1}=N_{2}=\frac{1}{2} N$ we have
$$
\begin{align*}
|\Omega(E, V, N)| & =\int d E^{\prime}\left|\Omega\left(E-E^{\prime}, \frac{V}{2}, \frac{N}{2}\right)\right|\left|\Omega\left(E^{\prime}, \frac{V}{2}, \frac{N}{2}\right)\right| \\
& \approx\left|\Omega\left(\frac{1}{2} E, \frac{1}{2} V, \frac{1}{2} N\right)\right|^{2}
\end{align*}
$$
(the integrand above is sharply peaked at its maximum $E^{\prime}=\frac{E}{2}$). It follows for the entropy that, approximately,
$$
\begin{equation*}
S(E, N, V)=2 S\left(\frac{E}{2}, \frac{N}{2}, \frac{V}{2}\right) \tag{4.50}
\end{equation*}
$$
The same consideration can be repeated for $\nu$ subsystems and yields
$$
\begin{equation*}
S(E, N, V)=\nu\, S\left(\frac{E}{\nu}, \frac{N}{\nu}, \frac{V}{\nu}\right) \tag{4.51}
\end{equation*}
$$
and thus
$$
\begin{equation*}
S(E, N, V)=N \sigma(\epsilon, n) \tag{4.52}
\end{equation*}
$$
for some function $\sigma$ of two variables, where $\epsilon=\frac{E}{N}$ is the average energy per particle and $n=\frac{N}{V}$ is the particle density. Hence $S$ is an extensive quantity, i.e. $S$ is proportional to $N$. A non-zero mixing entropy would contradict this extensivity property of $S$.

4.3 Canonical Ensemble

4.3.1 Canonical Ensemble in Quantum Mechanics

We consider a system (system A) in thermal contact with an (infinitely large) heat reservoir (system B):
Figure 4.5: A small system in contact with a large heat reservoir.
The overall energy $E=E_{A}+E_{B}$ of the combined system is fixed, as are the particle numbers $N_{A}, N_{B}$ of the subsystems. We think of $N_{B}$ as much larger than $N_{A}$; in fact, we shall let $N_{B} \rightarrow \infty$ at the end of our derivation. We accordingly describe the total Hilbert space of the system by a tensor product, $\mathcal{H}=\mathcal{H}_{A} \otimes \mathcal{H}_{B}$. The total Hamiltonian of the combined system is
$$
\begin{equation*}
H=\underbrace{H_{A}}_{\text{system A}}+\underbrace{H_{B}}_{\text{system B}}+\underbrace{H_{A B}}_{\text{interaction (neglected)}}, \tag{4.53}
\end{equation*}
$$
where the interaction term $H_{A B}$ is needed so that the subsystems can exchange energy. Its precise form is not needed, as we shall assume that the interaction strength is arbitrarily small. The Hamiltonians $H_{A}$ and $H_{B}$ of the subsystems $A$ and $B$ act on the Hilbert spaces $\mathcal{H}_{A}$ and $\mathcal{H}_{B}$, and we choose bases so that:
$$
\begin{align*}
H_{A}|n\rangle_{A} & =E_{n}^{(A)}|n\rangle_{A} \\
H_{B}|m\rangle_{B} & =E_{m}^{(B)}|m\rangle_{B} \\
|n, m\rangle & =|n\rangle_{A} \otimes|m\rangle_{B}.
\end{align*}
$$
Since $E$ is conserved, the quantum mechanical statistical operator of the combined system is given by the microcanonical ensemble with density matrix
$$
\begin{equation*}
\rho=\frac{1}{W} \sum_{\substack{n, m: \\ E-\Delta E \leqslant E_{n}^{(A)}+E_{m}^{(B)} \leqslant E}}|n, m\rangle\langle n, m|. \tag{4.54}
\end{equation*}
$$
The reduced density matrix for subsystem A is calculated as
$$
\rho_{A}=\frac{1}{W} \sum_{n} \overbrace{\Bigg(\sum_{m:\, E-E_{n}^{(A)}-\Delta E \leqslant E_{m}^{(B)} \leqslant E-E_{n}^{(A)}} 1\Bigg)}^{=W_{B}\left(E-E_{n}^{(A)}\right)}|n\rangle_{A}\, {}_{A}\langle n|.
$$
Now, using the extensiveness of the entropy $S_{B}$ of system B, we find (with $n_{B}=N_{B} / V_{B}$ the particle density and $\sigma_{B}$ the entropy per particle of system B)
$$
\begin{align*}
\log W_{B}\left(E-E_{n}^{(A)}\right) & =\frac{1}{\mathrm{k}_{\mathrm{B}}} S_{B}\left(E-E_{n}^{(A)}\right) \\
& =\frac{N_{B}}{\mathrm{k}_{\mathrm{B}}} \sigma_{B}\left(\frac{E}{N_{B}}, n_{B}\right)-\frac{N_{B}}{\mathrm{k}_{\mathrm{B}}} \frac{E_{n}^{(A)}}{N_{B}} \frac{\partial \sigma_{B}}{\partial \epsilon}\left(\frac{E}{N_{B}}, n_{B}\right) \\
& \quad +\frac{1}{2} \underbrace{\frac{N_{B}}{\mathrm{k}_{\mathrm{B}}}\left(\frac{E_{n}^{(A)}}{N_{B}}\right)^{2} \frac{\partial^{2} \sigma_{B}}{\partial \epsilon^{2}}\left(\frac{E}{N_{B}}, n_{B}\right)+\ldots}_{=\,\mathcal{O}\left(\frac{1}{N_{B}}\right) \rightarrow 0 \text{ (as } N_{B} \rightarrow \infty \text{, i.e. an infinitely large reservoir)}}
\end{align*}
$$
Thus, using $\beta=\frac{1}{\mathrm{k}_{\mathrm{B}} T}$ and $\frac{1}{T}=\frac{\partial S}{\partial E}$, we have for an infinite reservoir
$$
\begin{equation*}
\log W_{B}\left(E-E_{n}^{(A)}\right)=\log W_{B}(E)-\beta E_{n}^{(A)}, \tag{4.55}
\end{equation*}
$$
which means that, up to an $E_{n}^{(A)}$-independent constant which we absorb into the normalization $Z$ below,
$$
\begin{equation*}
W_{B}\left(E-E_{n}^{(A)}\right) \propto e^{-\beta E_{n}^{(A)}}. \tag{4.56}
\end{equation*}
$$
Therefore, we find the following expression for the reduced density matrix for system A:
$$
\begin{equation*}
\rho_{A}=\frac{1}{Z} \sum_{n} e^{-\beta E_{n}^{(A)}}|n\rangle_{A}\, {}_{A}\langle n|, \tag{4.57}
\end{equation*}
$$
where $Z=Z\left(\beta, N_{A}, V_{A}\right)$ is called the canonical partition function. Explicitly:
$$
\begin{equation*}
Z(\beta, N, V)=\operatorname{tr}\left[e^{-\beta H(N, V)}\right]=\sum_{n} e^{-\beta E_{n}} \tag{4.58}
\end{equation*}
$$
Here we have dropped the subscripts "A" referring to our subsystem, since at this point we can forget about the role of the reservoir B (so $H=H_{A}$, $V=V_{A}$, etc. in this formula). This finally leads to the statistical operator of the canonical ensemble:
$$
\begin{equation*}
\rho=\frac{1}{Z(\beta, N, V)} e^{-\beta H(N, V)}. \tag{4.59}
\end{equation*}
$$
In particular, the only quantity characterizing the reservoir that enters this formula is the temperature $T$.
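The emergence of the Boltzmann weights from microcanonical counting can be illustrated numerically. In the sketch below the reservoir is modelled as $M$ two-level units (a hypothetical toy reservoir, not from the text, with $W_{B}(e) = \binom{M}{e}$ and $\mathrm{k}_{\mathrm{B}}=1$); the microcanonical probabilities $\propto W_{B}(E - E_n)$ are then compared with the canonical ones $e^{-\beta E_n}/Z$ at $\beta = \partial S_{B}/\partial E$:

```python
import math

# Toy reservoir: M two-level units with unit energy gap, so the number of
# reservoir microstates at total energy e is W_B(e) = binomial(M, e).
M, E_total = 10**6, 3 * 10**5      # fixed total energy of system + reservoir
levels = [0, 1, 2, 3]              # energy levels E_n of the small system A

def log_WB(e):
    """log of binomial(M, e) via log-gamma."""
    return math.lgamma(M + 1) - math.lgamma(e + 1) - math.lgamma(M - e + 1)

# microcanonical counting: p_n proportional to W_B(E_total - E_n), cf. (4.54)
logW = [log_WB(E_total - En) for En in levels]
shift = max(logW)
Z_micro = sum(math.exp(lw - shift) for lw in logW)
p_micro = [math.exp(lw - shift) / Z_micro for lw in logW]

# canonical prediction (4.57): p_n = exp(-beta*E_n)/Z with beta = dS_B/dE
beta = math.log((M - E_total) / E_total)
Z = sum(math.exp(-beta * En) for En in levels)
p_canon = [math.exp(-beta * En) / Z for En in levels]
print(p_micro)
print(p_canon)
```

For $M = 10^6$ reservoir units the two sets of probabilities agree to about six decimal places; the discrepancy is the $\mathcal{O}(1/N_B)$ curvature term dropped in the expansion above.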

4.3.2 Canonical Ensemble in Classical Mechanics

In the classical case, we can make considerations similar to those in the quantum mechanical case. Consider the same situation as above. The phase space coordinates of the combined system are divided up as
$$
(P, Q)=(\underbrace{P_{A}, Q_{A}}_{\text{system A}}, \underbrace{P_{B}, Q_{B}}_{\text{system B}}).
$$
The Hamiltonian of the total system is written as
$$
\begin{equation*}
H(P, Q)=H_{A}\left(P_{A}, Q_{A}\right)+H_{B}\left(P_{B}, Q_{B}\right)+H_{A B}(P, Q) \tag{4.60}
\end{equation*}
$$
$H_{A B}$ accounts for the interaction between the particles from both systems and is neglected in the following. By analogy with the quantum mechanical case we get a reduced probability distribution $\rho_{A}$ for subsystem A:
\begin{equation*}
\rho_{A}\left(P_{A}, Q_{A}\right)=\int d^{3 N_{B}} P_{B}\, d^{3 N_{B}} Q_{B}\, \rho\left(P_{A}, Q_{A}, P_{B}, Q_{B}\right),
\end{equation*}
with
\begin{equation*}
\rho=\frac{1}{W} \cdot \begin{cases}1 & \text { if } E-\Delta E \leqslant H(P, Q) \leqslant E \\ 0 & \text { otherwise. }\end{cases}
\end{equation*}
From this it follows that
\begin{aligned}
\rho_{A}\left(P_{A}, Q_{A}\right) & =\frac{1}{W} \int_{\left\{E-\Delta E \leqslant H_{A}+H_{B} \leqslant E\right\}} d^{3 N_{B}} P_{B}\, d^{3 N_{B}} Q_{B} \\
& =\frac{1}{W} \int_{\left\{E-H_{A}\left(P_{A}, Q_{A}\right)-\Delta E \leqslant H_{B}\left(P_{B}, Q_{B}\right) \leqslant E-H_{A}\left(P_{A}, Q_{A}\right)\right\}} d^{3 N_{B}} P_{B}\, d^{3 N_{B}} Q_{B} \\
& =\frac{1}{W(E)}\, W_{B}\left(E-H_{A}\left(P_{A}, Q_{A}\right)\right),
\end{aligned}
where $W_{B}$ denotes the phase space volume of the corresponding energy shell of system B.
It is then demonstrated precisely as in the quantum mechanical case that the reduced distribution $\rho \equiv \rho_{A}$ for system A is given by (for an infinitely large system B):
\begin{equation*}
\rho(P, Q)=\frac{1}{Z} e^{-\beta H(P, Q)}, \tag{4.61}
\end{equation*}
where $P=P_{A}$, $Q=Q_{A}$, $H=H_{A}$ in this formula. The classical canonical partition function $Z=Z(\beta, N, V)$ for $N$ indistinguishable particles is conventionally fixed by the normalization $\left(h^{3 N} N!\right)^{-1} \int \rho\, d^{3 N} P\, d^{3 N} Q=1$. For an external square well potential ($H(P, Q)=\sum_{i=1}^{3 N} \frac{P_{i}^{2}}{2 m}+\mathcal{V}_{N}(Q)$) confining the system to a box of volume $V$ this leads to
\begin{align*}
Z & :=\frac{1}{N! h^{3 N}} \int d^{3 N} P\, d^{3 N} Q\, e^{-\beta H(P, Q)} \\
& =\frac{1}{N! h^{3 N}}\left(\frac{2 \pi m}{\beta}\right)^{3 N / 2} \int_{V^{N}} d^{3 N} Q\, e^{-\beta \mathcal{V}_{N}(Q)} \tag{4.62}
\end{align*}
The quantity $\lambda:=\frac{h}{\sqrt{2 \pi m \mathrm{k}_{\mathrm{B}} T}}$ is sometimes called the "thermal de Broglie wavelength". As a rule of thumb, quantum effects start being significant when $\lambda$ exceeds the typical length scales of the system, such as the mean free path or the system size. Using this definition, we can write
\begin{equation*}
Z(\beta, N, V)=\frac{1}{N! \lambda^{3 N}} \int_{V^{N}} d^{3 N} Q\, e^{-\beta \mathcal{V}_{N}(Q)} \tag{4.63}
\end{equation*}
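As a numerical illustration of the rule of thumb for $\lambda$, the following sketch evaluates the thermal de Broglie wavelength of helium-4 at two temperatures (the atomic mass below is a rounded literature value inserted for the example):

```python
import math

h  = 6.62607015e-34    # Planck constant, J s
kB = 1.380649e-23      # Boltzmann constant, J/K
m  = 6.646e-27         # helium-4 atomic mass, kg (rounded)

def thermal_wavelength(T):
    """Thermal de Broglie wavelength: lambda = h / sqrt(2 pi m kB T)."""
    return h / math.sqrt(2.0 * math.pi * m * kB * T)

for T in (300.0, 1.0):
    print(f"T = {T:5.1f} K : lambda = {thermal_wavelength(T) * 1e9:.3f} nm")
```

At 300 K the result (about 0.05 nm) is well below the interatomic spacing, while at 1 K it approaches 1 nm, signaling the quantum regime of helium.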
Of course, this form of the partition function applies to classical, not quantum, systems. The unconventional factor of $h^{3 N}$ is nevertheless put in by analogy with the quantum mechanical case, because one imagines that the "unit" of phase space for $N$ particles (i.e. the phase space measure) is given by $d^{3 N} P\, d^{3 N} Q /\left(N! h^{3 N}\right)$, inspired by the uncertainty principle $\Delta Q \Delta P \sim h$; see e.g. our discussion of the atom in a cube for why the normalized classical partition function then approximates the quantum partition function. The factor $N!$ is motivated by the fact that we want to treat the particles as indistinguishable. A permuted phase space configuration should therefore be viewed as equivalent to the unpermuted one, and since there are $N!$ permutations, the factor $1/N!$ compensates the corresponding overcounting (here we implicitly assume that $\mathcal{V}_{N}$ is symmetric under permutations). For the discussion of the $N!$-factor, see also our discussion of mixing entropy. In practice, these factors often do not play a major role, because the quantities most directly related to thermodynamics are derivatives of
\begin{equation*}
F:=-\beta^{-1} \log Z(\beta, N, V) \tag{4.64}
\end{equation*}
for instance $P=-\partial F /\left.\partial V\right|_{T, N}$; see chapter 6.5 for a detailed discussion of such relations. $F$ is also called the free energy.
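For the ideal gas, $\mathcal{V}_{N}=0$ and (4.63) gives $Z=V^{N} /\left(N! \lambda^{3 N}\right)$, so $P=-\partial F /\left.\partial V\right|_{T, N}$ should reproduce $P=N \mathrm{k}_{\mathrm{B}} T / V$. A short sketch checking this with a numerical derivative (arbitrary units with $\mathrm{k}_{\mathrm{B}}=1$; not part of the original notes):

```python
import math

def logZ_ideal(beta, N, V, lam):
    """log Z for the ideal gas, eq. (4.63) with V_N = 0:
    Z = V**N / (N! * lam**(3*N))."""
    return N * math.log(V) - math.lgamma(N + 1) - 3 * N * math.log(lam)

def pressure(beta, N, V, lam, dV=1e-4):
    """P = -dF/dV at fixed T, N, with F = -log(Z)/beta (central difference)."""
    F = lambda vol: -logZ_ideal(beta, N, vol, lam) / beta
    return -(F(V + dV) - F(V - dV)) / (2.0 * dV)

beta, N, V, lam = 1.0, 1000, 5.0, 0.1   # arbitrary units, k_B = 1
print(pressure(beta, N, V, lam), N / (beta * V))   # both equal N kB T / V
```

Note that the $N!$ and $\lambda^{3N}$ factors drop out of the $V$-derivative, as stated in the text.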

Example:

One may use the formula (4.61) to obtain the barometric formula for the average particle density at a position $\vec{x}$ in a given external potential. In this case the Hamiltonian $H$ is given by
\begin{equation*}
H=\sum_{i=1}^{N} \frac{\vec{p}_{i}^{\,2}}{2 m}+\sum_{i=1}^{N} \underbrace{\mathcal{W}\left(\vec{x}_{i}\right)}_{\substack{\text {external potential,}\\ \text {no interaction}\\ \text {between the particles}}},
\end{equation*}
which yields the probability distribution
\begin{equation*}
\rho(P, Q)=\frac{1}{Z} e^{-\beta H(P, Q)}=\frac{1}{Z} \prod_{i=1}^{N} e^{-\beta\left(\frac{\vec{p}_{i}^{\,2}}{2 m}+\mathcal{W}\left(\vec{x}_{i}\right)\right)} . \tag{4.65}
\end{equation*}
The particle density $n(\vec{x})$ is given by
\begin{equation*}
n(\vec{x})=\left\langle\sum_{i=1}^{N} \delta^{3}\left(\vec{x}_{i}-\vec{x}\right)\right\rangle=N \int d^{3} p\, \frac{1}{Z_{1}}\, e^{-\beta\left(\frac{\vec{p}^{\,2}}{2 m}+\mathcal{W}(\vec{x})\right)} \tag{4.66}
\end{equation*}
where
\begin{equation*}
Z_{1}=\int d^{3} p\, d^{3} x\, e^{-\beta\left(\frac{\vec{p}^{\,2}}{2 m}+\mathcal{W}(\vec{x})\right)}=\left(\frac{2 \pi m}{\beta}\right)^{\frac{3}{2}} \int d^{3} x\, e^{-\beta \mathcal{W}(\vec{x})} \tag{4.67}
\end{equation*}
From this we obtain the barometric formula
\begin{equation*}
n(\vec{x})=n_{0}\, e^{-\beta \mathcal{W}(\vec{x})} \tag{4.68}
\end{equation*}
with n 0 n 0 n_(0)n_{0}n0 given by
\begin{equation*}
n_{0}=\frac{N}{\int d^{3} x\, e^{-\beta \mathcal{W}(\vec{x})}} \tag{4.69}
\end{equation*}
In particular, for the gravitational potential, $\mathcal{W}(x, y, z)=m g z$, we find
\begin{equation*}
n(z)=n_{0}\, e^{-\frac{m g z}{\mathrm{k}_{\mathrm{B}} T}} \tag{4.70}
\end{equation*}
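Plugging in rough numbers for the terrestrial atmosphere (treated, unrealistically, as isothermal; the molecular mass below is an averaged value chosen for the example) gives the familiar scale height of about 8-9 km:

```python
import math

kB = 1.380649e-23     # Boltzmann constant, J/K
g  = 9.81             # gravitational acceleration, m/s^2
m  = 4.8e-26          # average mass of an "air molecule" (~29 u), kg

T = 290.0             # assumed uniform temperature, K
H = kB * T / (m * g)  # scale height: n(z) = n0 * exp(-z/H)

print(f"scale height H = {H / 1000:.1f} km")
print(f"n(5.5 km)/n0   = {math.exp(-5500.0 / H):.2f}")
```

Under these assumptions the density drops to roughly half its sea-level value at an altitude of about 5-6 km.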
To double-check with our intuition we provide an alternative derivation of this formula: let $P(\vec{x})$ be the pressure at $\vec{x}$ and $\vec{F}(\vec{x})=-\vec{\nabla} \mathcal{W}(\vec{x})$ the force acting on one particle. For the average force density $\vec{f}(\vec{x})$ in equilibrium we thus obtain
\begin{equation*}
\vec{f}(\vec{x})=n(\vec{x})\, \vec{F}(\vec{x})=-n(\vec{x})\, \vec{\nabla} \mathcal{W}(\vec{x})=\vec{\nabla} P(\vec{x}) \tag{4.71}
\end{equation*}
Together with $P(\vec{x})=n(\vec{x})\, \mathrm{k}_{\mathrm{B}} T$ it follows that
\begin{equation*}
\mathrm{k}_{\mathrm{B}} T\, \vec{\nabla} n(\vec{x})=-n(\vec{x})\, \vec{\nabla} \mathcal{W}(\vec{x}) \tag{4.72}
\end{equation*}
and thus
\begin{equation*}
\mathrm{k}_{\mathrm{B}} T\, \vec{\nabla} \log n(\vec{x})=-\vec{\nabla} \mathcal{W}(\vec{x}) \tag{4.73}
\end{equation*}
which again yields the barometric formula
\begin{equation*}
n(\vec{x})=n_{0}\, e^{-\beta \mathcal{W}(\vec{x})} \tag{4.74}
\end{equation*}

4.3.3 Equidistribution Law and Virial Theorem in the Canonical Ensemble

We first derive the equidistribution law for classical systems with a Hamiltonian of the form
\begin{equation*}
H=\sum_{i=1}^{N} \frac{\vec{p}_{i}^{\,2}}{2 m_{i}}+\mathcal{V}(Q), \quad Q=\left(\vec{x}_{1}, \ldots, \vec{x}_{N}\right) \tag{4.75}
\end{equation*}
We take the canonical ensemble as discussed in the previous subsection, with probability distribution
\begin{equation*}
\rho(P, Q)=\frac{1}{Z} e^{-\beta H(P, Q)} \tag{4.76}
\end{equation*}
Then we have for any observable $A(P, Q)$:
\begin{align*}
0 & =\int d^{3 N} P\, d^{3 N} Q\, \frac{\partial}{\partial p_{i \alpha}}\big(A(P, Q)\, \rho(P, Q)\big) \\
& =\int d^{3 N} P\, d^{3 N} Q\left(\frac{\partial A}{\partial p_{i \alpha}}-\beta A \frac{\partial H}{\partial p_{i \alpha}}\right) \rho(P, Q) \\
& =\left\langle\frac{\partial A}{\partial p_{i \alpha}}\right\rangle-\beta\left\langle A \frac{\partial H}{\partial p_{i \alpha}}\right\rangle, \quad i=1, \ldots, N,\; \alpha=1,2,3 . \tag{4.77}
\end{align*}
From this we obtain the relation
\begin{equation*}
\mathrm{k}_{\mathrm{B}} T\left\langle\frac{\partial A}{\partial p_{i \alpha}}\right\rangle=\left\langle A \frac{\partial H}{\partial p_{i \alpha}}\right\rangle \tag{4.78}
\end{equation*}
and similarly
\begin{equation*}
\mathrm{k}_{\mathrm{B}} T\left\langle\frac{\partial A}{\partial x_{i \alpha}}\right\rangle=\left\langle A \frac{\partial H}{\partial x_{i \alpha}}\right\rangle \tag{4.79}
\end{equation*}
The function $A$ should be chosen such that the integrand falls off sufficiently rapidly. For $A(P, Q)=p_{i \alpha}$ and $A(P, Q)=x_{i \alpha}$, respectively, we find
\begin{align*}
\left\langle p_{i \alpha} \frac{\partial H}{\partial p_{i \alpha}}\right\rangle & =\left\langle\frac{p_{i \alpha}^{2}}{m_{i}}\right\rangle=\mathrm{k}_{\mathrm{B}} T \tag{4.80}\\
\left\langle x_{i \alpha} \frac{\partial H}{\partial x_{i \alpha}}\right\rangle & =\left\langle x_{i \alpha} \frac{\partial \mathcal{V}}{\partial x_{i \alpha}}\right\rangle=\mathrm{k}_{\mathrm{B}} T \tag{4.81}
\end{align*}
The first of these equations is called the equipartition or equidistribution law.
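Equation (4.80) is easy to verify by sampling: in the canonical ensemble each momentum component is Gaussian distributed with variance $m_{i} \mathrm{k}_{\mathrm{B}} T$. A quick Monte Carlo sketch (arbitrary units with $\mathrm{k}_{\mathrm{B}}=1$; a sketch, not part of the original notes):

```python
import math, random

random.seed(0)
m, kBT = 2.0, 1.5                  # arbitrary units, k_B = 1
sigma = math.sqrt(m * kBT)         # each p_{i,alpha} is Gaussian, <p^2> = m kB T

samples = [random.gauss(0.0, sigma) for _ in range(200_000)]
estimate = sum(p * p for p in samples) / (m * len(samples))
print(estimate)                    # should be close to kBT = 1.5
```

The statistical error of the estimate scales as $1/\sqrt{\text{number of samples}}$.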
We split up the potential $\mathcal{V}$ into a part coming from the interactions of the particles and a part describing an external potential, i.e.
\begin{equation*}
\mathcal{V}(Q)=\underbrace{\sum_{i<j} \mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right)}_{\text {interactions }}+\underbrace{\sum_{i} \mathcal{W}\left(\vec{x}_{i}\right)}_{\text {external potential }}, \qquad \sum_{i<j} \mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right)=\frac{1}{2} \sum_{i \neq j} \mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right), \tag{4.82}
\end{equation*}
where the second equality uses the symmetry $\mathcal{V}(\vec{x})=\mathcal{V}(-\vec{x})$ of the pair potential.
Writing $\vec{x}_{k l} \equiv \vec{x}_{k}-\vec{x}_{l}$ for the relative distance between the $k$-th and the $l$-th particle, we find by a lengthy calculation:
\begin{aligned}
\sum_{i, \alpha}\left\langle x_{i \alpha} \frac{\partial \mathcal{V}(Q)}{\partial x_{i \alpha}}\right\rangle & =\frac{1}{2} \sum_{i, \alpha} \sum_{k \neq l}\left\langle x_{i \alpha} \frac{\partial}{\partial x_{i \alpha}} \mathcal{V}\left(\vec{x}_{k}-\vec{x}_{l}\right)\right\rangle+\sum_{i, \alpha} \sum_{k}\left\langle x_{i \alpha} \frac{\partial}{\partial x_{i \alpha}} \mathcal{W}\left(\vec{x}_{k}\right)\right\rangle \\
& =\frac{1}{2} \sum_{i, \alpha, \beta, k \neq l}\left\langle x_{i \alpha}\left(\frac{\partial \mathcal{V}}{\partial x_{\beta}}\right)\left(\vec{x}_{k}-\vec{x}_{l}\right)\right\rangle\left(\delta_{i k}-\delta_{i l}\right) \delta_{\alpha \beta}+\int d^{3} x\left\langle\sum_{k} \delta^{3}\left(\vec{x}-\vec{x}_{k}\right)\right\rangle \vec{x} \cdot \vec{\nabla} \mathcal{W}(\vec{x}) \\
& =\frac{1}{2} \sum_{k \neq l}\left\langle\left(\vec{x}_{k}-\vec{x}_{l}\right) \cdot \vec{\nabla} \mathcal{V}\left(\vec{x}_{k}-\vec{x}_{l}\right)\right\rangle-\int d^{3} x \underbrace{n(\vec{x})\, \vec{F}(\vec{x})}_{\vec{\nabla} P} \cdot \vec{x} \\
& =\frac{1}{2} \sum_{k \neq l}\left\langle\vec{x}_{k l} \cdot \frac{\partial \mathcal{V}}{\partial \vec{x}_{k l}}\right\rangle+\int d^{3} x \underbrace{(\vec{\nabla} \cdot \vec{x})}_{=3}\, P \\
& =\frac{1}{2} \sum_{k \neq l}\left\langle\vec{x}_{k l} \cdot \frac{\partial \mathcal{V}}{\partial \vec{x}_{k l}}\right\rangle+3 P V,
\end{aligned}
where we assumed that the potential $\mathcal{W}$ is constant within the volume, so that the pressure is also constant. According to the equipartition law we have $\sum_{i \alpha}\left\langle x_{i \alpha} \frac{\partial \mathcal{V}}{\partial x_{i \alpha}}\right\rangle=3 N \mathrm{k}_{\mathrm{B}} T$, and we therefore obtain the virial law for classical systems
\begin{equation*}
P V=N \mathrm{k}_{\mathrm{B}} T-\underbrace{\frac{1}{6} \sum_{k \neq l}\left\langle\vec{x}_{k l} \cdot \frac{\partial \mathcal{V}}{\partial \vec{x}_{k l}}\right\rangle}_{=0 \text { for ideal gas }} . \tag{4.83}
\end{equation*}
Thus, interactions tend to increase P P PPP when they are repulsive, and tend to decrease P P PPP when they are attractive. This is of course consistent with our intuition.
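The sign statement can be checked directly in a toy model: for two particles on a line with a purely repulsive pair potential, the virial term $\left\langle x_{12}\, \mathcal{V}^{\prime}\left(x_{12}\right)\right\rangle$ comes out negative, so it increases $P V$. A sketch using brute-force integration of the Boltzmann weight (hypothetical Gaussian pair potential and parameters, $\mathrm{k}_{\mathrm{B}} T=1$):

```python
import math

# Toy model: two particles on a line of length L with a repulsive
# (hypothetical) Gaussian pair potential V(x) = eps * exp(-x^2/s^2).
L, eps, s, beta = 10.0, 1.0, 1.0, 1.0
n = 400                       # grid points per particle coordinate
h = L / n

num = den = 0.0
for i in range(n):
    x1 = (i + 0.5) * h
    for j in range(n):
        x12 = x1 - (j + 0.5) * h
        V = eps * math.exp(-(x12 / s) ** 2)
        dV = -2.0 * x12 / s ** 2 * V      # V'(x12)
        w = math.exp(-beta * V)           # Boltzmann weight
        num += x12 * dV * w
        den += w

virial = num / den            # <x12 V'(x12)>, negative for repulsion
print(virial)
```

Here $x \mathcal{V}^{\prime}(x)=-2(x/s)^{2} \mathcal{V}(x) \leqslant 0$ pointwise, so the negative sign is guaranteed; the integration merely quantifies it.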
A well-known application of the virial law is the following example:

Example: estimation of the mass of distant galaxies:

Figure 4.6: Distribution and velocity of stars in a galaxy.
We use the relations (4.80) and (4.81) found above, summed over $\alpha$,
\begin{equation*}
\left\langle\frac{\vec{p}_{1}^{\,2}}{m_{1}}\right\rangle=\left\langle\frac{\partial \mathcal{V}}{\partial \vec{x}_{1}} \cdot \vec{x}_{1}\right\rangle=3 \mathrm{k}_{\mathrm{B}} T
\end{equation*}
assuming that the stars in the outer region have reached thermal equilibrium, so that they can be described by the canonical ensemble. We put $\vec{v}=\vec{p}_{1} / m_{1}$, $v=|\vec{v}|$ and $R=\left|\vec{x}_{1}\right|$, and assume that $\left\langle\vec{v}^{2}\right\rangle \approx\langle v\rangle^{2}$ as well as
\begin{equation*}
\left\langle\frac{\partial \mathcal{V}}{\partial \vec{x}_{1}} \cdot \vec{x}_{1}\right\rangle=m_{1} G \sum_{j \neq 1}\left\langle\frac{m_{j}}{\left|\vec{x}_{1}-\vec{x}_{j}\right|}\right\rangle \approx m_{1} M G\left\langle\frac{1}{R}\right\rangle \approx m_{1} M G \frac{1}{\langle R\rangle} \tag{4.84}
\end{equation*}
supposing that the potential felt by star 1 is dominated by the Newton potential created by the core of the galaxy containing most of the mass $M \approx \sum_{j} m_{j}$. Under these approximations, we conclude that
\begin{equation*}
\frac{M}{\langle R\rangle} \approx \frac{\langle v\rangle^{2}}{G} \tag{4.85}
\end{equation*}
This relation is useful for estimating $M$, because $\langle R\rangle$ and $\langle v\rangle$ can be measured or estimated. Typically $\langle v\rangle=\mathcal{O}\left(10^{2}\, \frac{\mathrm{km}}{\mathrm{s}}\right)$.
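With the quoted order of magnitude $\langle v\rangle \sim 200\, \mathrm{km/s}$ and an assumed $\langle R\rangle \sim 10\, \mathrm{kpc}$ (both illustrative inputs, not measured values from the notes), (4.85) gives a mass of order $10^{11}$ solar masses:

```python
G    = 6.674e-11    # gravitational constant, m^3 kg^-1 s^-2
Msun = 1.989e30     # solar mass, kg
kpc  = 3.086e19     # kiloparsec, m

v = 2.0e5           # assumed <v> = 200 km/s
R = 10.0 * kpc      # assumed <R> = 10 kpc

M = v ** 2 * R / G  # virial estimate, eq. (4.85)
print(f"M ~ {M / Msun:.1e} solar masses")
```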
Continuing with the general discussion, if the potential attains a minimum at $Q=Q_{0}$ we have $\frac{\partial \mathcal{V}}{\partial x_{i \alpha}}\left(Q_{0}\right)=0$, as sketched in the following figure:
Figure 4.7: Sketch of a potential V V V\mathcal{V}V of a lattice with a minimum at Q 0 Q 0 Q_(0)Q_{0}Q0.
Setting $\mathcal{V}\left(Q_{0}\right) \equiv \mathcal{V}_{0}$, we can Taylor expand around $Q_{0}$:
\begin{equation*}
\mathcal{V}(Q)=\mathcal{V}_{0}+\frac{1}{2} \sum_{i \alpha, j \beta} \underbrace{\frac{\partial^{2} \mathcal{V}}{\partial x_{i \alpha} \partial x_{j \beta}}\left(Q_{0}\right)}_{=f_{i \alpha j \beta}} \Delta x_{i \alpha} \Delta x_{j \beta}+\ldots \tag{4.86}
\end{equation*}
where $\Delta Q=Q-Q_{0}$. In this approximation (valid for $|\Delta Q| \ll 1$, i.e. for small oscillations around the minimum) we have, setting the zero point energy $\mathcal{V}_{0}=0$,
\begin{align*}
\sum_{i, \alpha}\left\langle x_{i \alpha} \frac{\partial \mathcal{V}}{\partial x_{i \alpha}}\right\rangle & \approx 2\langle\mathcal{V}\rangle=\sum_{i, \alpha} \mathrm{k}_{\mathrm{B}} T=3 N \mathrm{k}_{\mathrm{B}} T \tag{4.87}\\
\sum_{i, \alpha}\left\langle\frac{p_{i \alpha}^{2}}{m_{i}}\right\rangle & =2\left\langle\sum_{i} \frac{\vec{p}_{i}^{\,2}}{2 m_{i}}\right\rangle=3 N \mathrm{k}_{\mathrm{B}} T \tag{4.88}
\end{align*}
It follows that the mean energy $\langle H\rangle$ of the system is given by
\begin{equation*}
\langle H\rangle=3 N \mathrm{k}_{\mathrm{B}} T . \tag{4.89}
\end{equation*}
This relation is called the Dulong-Petit law. For real lattice systems there are deviations from this law at low temperature T T TTT through quantum effects and at high temperature T T TTT through non-linear effects, which are not captured by the approximation (4.86).
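Numerically, (4.89) predicts a temperature-independent molar heat capacity $C=3 N_{A} \mathrm{k}_{\mathrm{B}}=3 R$:

```python
kB = 1.380649e-23         # Boltzmann constant, J/K
NA = 6.02214076e23        # Avogadro constant, 1/mol

# <H> = 3 N kB T  =>  molar heat capacity C = d<H>/dT per mole = 3 NA kB = 3R
C = 3.0 * NA * kB
print(f"Dulong-Petit molar heat capacity: {C:.2f} J/(mol K)")
```

This value of about 25 J/(mol K) is close to the room-temperature heat capacities measured for many elemental solids.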
Our discussion for classical systems can be adapted to the quantum mechanical context, but there are some changes. Consider the canonical ensemble with statistical operator $\rho=\frac{1}{Z} e^{-\beta H}$. From this it immediately follows that
\begin{equation*}
[\rho, H]=0 \tag{4.90}
\end{equation*}
which in turn implies that for any observable $A$ we have
\begin{equation*}
\langle[H, A]\rangle=\operatorname{tr}(\rho[H, A])=\operatorname{tr}([\rho, H] A)=0 . \tag{4.91}
\end{equation*}
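Equation (4.91) can be checked numerically for a toy system: take $H$ diagonal in its eigenbasis, build $\rho=e^{-\beta H} / Z$, and verify that $\operatorname{tr}(\rho[H, A])$ vanishes for an arbitrary Hermitian $A$ (the matrices and energies below are hypothetical):

```python
import math

# Toy 3-level system: H diagonal in its eigenbasis (hypothetical energies).
beta = 0.7
E = [0.0, 1.0, 2.5]
w = [math.exp(-beta * e) for e in E]
Z = sum(w)
rho = [[w[i] / Z if i == j else 0.0 for j in range(3)] for i in range(3)]
H = [[E[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

# An arbitrary Hermitian observable A (made-up entries).
A = [[1.0, 2.0 - 1.0j, 0.5j],
     [2.0 + 1.0j, -1.0, 3.0],
     [-0.5j, 3.0, 0.2]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

HA, AH = matmul(H, A), matmul(A, H)
comm = [[HA[i][j] - AH[i][j] for j in range(3)] for i in range(3)]  # [H, A]
expval = sum(matmul(rho, comm)[i][i] for i in range(3))             # tr(rho [H,A])
print(abs(expval))   # vanishes up to rounding
```

Since $[H, A]_{n m}=\left(E_{n}-E_{m}\right) A_{n m}$ has vanishing diagonal in the eigenbasis of $H$, the trace against the diagonal $\rho$ is exactly zero.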
Now let $A=\sum_{i} \vec{x}_{i} \cdot \vec{p}_{i}$ and assume, as before, that
\begin{equation*}
H=\sum_{i} \frac{\vec{p}_{i}^{\,2}}{2 m_{i}}+\mathcal{V}_{N}(Q) \tag{4.92}
\end{equation*}
By using $[a, b c]=[a, b] c+b[a, c]$ and $\vec{p}_{j}=\frac{\hbar}{i} \frac{\partial}{\partial \vec{x}_{j}}$ we obtain
\begin{aligned}
{[H, A] } & =\sum_{i, j}\left[\frac{\vec{p}_{i}^{\,2}}{2 m_{i}}, \vec{x}_{j}\right] \cdot \vec{p}_{j}+\sum_{j} \vec{x}_{j} \cdot\left[\mathcal{V}(Q), \vec{p}_{j}\right] \\
& =\frac{\hbar}{i} \sum_{j} \frac{\vec{p}_{j}^{\,2}}{m_{j}}+i \hbar \sum_{j} \vec{x}_{j} \cdot \partial_{\vec{x}_{j}} \mathcal{V}(Q)
\end{aligned}
which gives
\begin{equation*}
\sum_{j}\left\langle\vec{x}_{j} \cdot \frac{\partial \mathcal{V}}{\partial \vec{x}_{j}}\right\rangle=2\left\langle H_{\text {kin }}\right\rangle \quad(\hbar \text { cancels out}) . \tag{4.93}
\end{equation*}
Applying now the same arguments as in the classical case to evaluate the left hand side leads to
\begin{equation*}
P V=\underbrace{\frac{2}{3}\left\langle H_{\mathrm{kin}}\right\rangle}_{\substack{\neq N \mathrm{k}_{\mathrm{B}} T \text { for ideal gas } \\ \Rightarrow \text { quantum effects! }}}-\frac{1}{6} \sum_{k \neq l}\left\langle\vec{x}_{k l} \cdot \frac{\partial \mathcal{V}}{\partial \vec{x}_{k l}}\right\rangle . \tag{4.94}
\end{equation*}
For an ideal gas the contribution from the potential is by definition absent, but the contribution from the kinetic piece does not give the same formula as in the classical case, as we will discuss in more detail below in chapter 5. Thus, even for an ideal quantum gas ( V = 0 ) ( V = 0 ) (V=0)(\mathcal{V}=0)(V=0), the classical formula P V = N k B T P V = N k B T PV=Nk_(B)TP V=N \mathrm{k}_{\mathrm{B}} TPV=NkBT receives corrections!

4.4 Grand Canonical Ensemble

This ensemble describes the following physical situation: a small system (system A) is coupled to a large reservoir (system B). Energy and particle exchange between A and B are possible.
Figure 4.8: A small system coupled to a large heat and particle reservoir.
The treatment of this ensemble is similar to that of the canonical ensemble. For definiteness, we consider the quantum mechanical case. We have $E=E_{A}+E_{B}$ for the total energy, and $N=N_{A}+N_{B}$ for the total particle number. The total system $\mathrm{A}+\mathrm{B}$ is described by the microcanonical ensemble, since $E$ and $N$ are conserved. The Hilbert space for the total system is again a tensor product, and the statistical operator $\rho$ of the total system is accordingly given by
\begin{equation*}
\rho=\frac{1}{W} \sum_{\substack{E-\Delta E \leqslant E_{n}^{(A)}+E_{m}^{(B)} \leqslant E \\ N_{n}^{(A)}+N_{m}^{(B)}=N}}|n, m\rangle\langle n, m|, \tag{4.95}
\end{equation*}
where the total Hamiltonian of the combined system is
\begin{equation*}
H=\underbrace{H_{A}}_{\text{system A}}+\underbrace{H_{B}}_{\text{system B}}+\underbrace{H_{A B}}_{\text{interaction (neglected)}}. \tag{4.96}
\end{equation*}
We are using notations similar to the canonical ensemble, such as $|n, m\rangle=|n\rangle_{A}|m\rangle_{B}$ and
\begin{align*}
H_{A / B}|n\rangle_{A / B} &= E_{n}^{(A / B)}|n\rangle_{A / B}, \tag{4.97}\\
\hat{N}_{A / B}|n\rangle_{A / B} &= N_{n}^{(A / B)}|n\rangle_{A / B}. \tag{4.98}
\end{align*}
Note that the particle numbers of the individual subsystems fluctuate, so we describe them by number operators $\hat{N}_{A}, \hat{N}_{B}$ acting on $\mathcal{H}_{A}, \mathcal{H}_{B}$.
The statistical operator for system A is described by the reduced density matrix $\rho_{A}$ for this system, namely by
\begin{equation*}
\rho_{A}=\frac{1}{W} \sum_{n} W_{B}\left(E-E_{n}^{(A)}, N-N_{n}^{(A)}, V_{B}\right)|n\rangle_{A}\,{}_{A}\langle n|. \tag{4.99}
\end{equation*}
As before, in the canonical ensemble, we use that the entropy is an extensive quantity to write
\begin{align*}
\log W_{B}\left(E_{B}, N_{B}, V_{B}\right) &=\frac{1}{\mathrm{k}_{\mathrm{B}}} S_{B}\left(E_{B}, N_{B}, V_{B}\right) \\
&=\frac{1}{\mathrm{k}_{\mathrm{B}}} V_{B}\, \sigma_{B}\left(\frac{E_{B}}{V_{B}}, \frac{N_{B}}{V_{B}}\right),
\end{align*}
for some function $\sigma_{B}$ of two variables. Now we let $V_{B} \rightarrow \infty$, keeping $\frac{E}{V_{B}}$ and $\frac{N}{V_{B}}$ constant. Arguing precisely as in the case of the canonical ensemble, and using now also the definition of the chemical potential in (4.17), we find
\begin{equation*}
\log W_{B}\left(E-E_{n}^{(A)}, N-N_{n}^{(A)}, V_{B}\right)=\log W_{B}\left(E, N, V_{B}\right)-\beta E_{n}^{(A)}+\beta \mu N_{n}^{(A)} \tag{4.100}
\end{equation*}
for $N_{B}, V_{B} \rightarrow \infty$. By the same arguments as for the temperature in the canonical ensemble, the chemical potential $\mu$ is the same for both systems in equilibrium. We obtain for the reduced density matrix of system A:
\begin{equation*}
\rho_{A}=\frac{1}{Y} \sum_{n} e^{-\beta\left(E_{n}^{(A)}-\mu N_{n}^{(A)}\right)}|n\rangle_{A}\,{}_{A}\langle n|. \tag{4.101}
\end{equation*}
Thus, only the quantities $\beta$ and $\mu$ characterizing the reservoir (system B) have an influence on system A. Dropping from now on the reference to "A", we can write the statistical operator of the grand canonical ensemble as
\begin{equation*}
\rho=\frac{1}{Y} e^{-\beta(H(V)-\mu \hat{N}(V))}, \tag{4.102}
\end{equation*}
where $H$ and $\hat{N}$ are now operators. The constant $Y=Y(\mu, \beta, V)$ is determined by $\operatorname{tr} \rho=1$ and is called the grand canonical partition function. Explicitly:
\begin{equation*}
Y(\mu, \beta, V)=\operatorname{tr}\left[e^{-\beta(H(V)-\mu \hat{N})}\right]=\sum_{n} e^{-\beta\left(E_{n}-\mu N_{n}\right)}. \tag{4.103}
\end{equation*}
The analog of the free energy for the grand canonical ensemble is the Gibbs free energy. It is defined by
\begin{equation*}
G:=-\beta^{-1} \log Y(\beta, \mu, V). \tag{4.104}
\end{equation*}
The grand canonical partition function can be related to the canonical partition function. The Hilbert space of our system (i.e., system A) can be decomposed as
\begin{equation*}
\mathcal{H}=\underbrace{\mathbb{C}}_{\text{vacuum}} \oplus \underbrace{\mathcal{H}_{1}}_{1 \text{ particle}} \oplus \underbrace{\mathcal{H}_{2}}_{2 \text{ particles}} \oplus \underbrace{\mathcal{H}_{3}}_{3 \text{ particles}} \oplus \ldots \tag{4.105}
\end{equation*}
with $\mathcal{H}_{N}$ the Hilbert space for a fixed number $N$ of particles${ }^{2}$, and the total Hamiltonian is given by
\begin{align*}
H &= H_{1}+H_{2}+H_{3}+\ldots, \\
H_{N} &= \sum_{i=1}^{N} \frac{\vec{p}_{i}^{\,2}}{2 m}+\mathcal{V}_{N}\left(\vec{x}_{1}, \ldots, \vec{x}_{N}\right).
\end{align*}
Then $[H, \hat{N}]=0$ (where $\hat{N}$ has eigenvalue $N$ on $\mathcal{H}_{N}$), and $H$ and $\hat{N}$ are simultaneously diagonalized, with (assuming a discrete spectrum of $H$)
\begin{equation*}
H|\alpha, N\rangle=E_{\alpha, N}|\alpha, N\rangle \quad \text{and} \quad \hat{N}|\alpha, N\rangle=N|\alpha, N\rangle. \tag{4.106}
\end{equation*}
From this we get:
\begin{align*}
Y(\beta, \mu, V) &=\sum_{\alpha, N} e^{-\beta\left(E_{\alpha, N}-\mu N\right)}=\sum_{N} e^{+\beta \mu N} \sum_{\alpha} e^{-\beta E_{\alpha, N}} \\
&=\sum_{N} \underbrace{Z(N, \beta, V)}_{\text{canonical partition function!}}\, e^{\beta \mu N}, \tag{4.107}
\end{align*}
which is the desired relation between the canonical and the grand canonical partition function.
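The relation (4.107) can be checked numerically in a toy model. The model below, $M$ noninteracting fermionic modes with a common energy $\varepsilon$, is our own illustrative assumption, not taken from the text; for it, $Z(N, \beta)=\binom{M}{N} e^{-\beta N \varepsilon}$, and the grand canonical trace has the closed form $Y=(1+e^{\beta \mu} e^{-\beta \varepsilon})^{M}$ by the binomial theorem.

```python
import math

# Toy check of Y(β, μ, V) = Σ_N Z(N, β, V) e^{βμN}  (relation (4.107)).
# Model (a made-up illustration): M noninteracting fermionic modes with
# common energy eps, so Z(N) = C(M, N) e^{-βNε}.
M, eps, beta, mu = 8, 1.3, 0.7, 0.4
z = math.exp(beta * mu)                          # fugacity z = e^{βμ}

def Z(N):
    # C(M, N) distinct N-particle states, each with energy N * eps
    return math.comb(M, N) * math.exp(-beta * N * eps)

Y_sum = sum(z**N * Z(N) for N in range(M + 1))   # right hand side of (4.107)
Y_closed = (1.0 + z * math.exp(-beta * eps))**M  # direct trace over Fock space
assert abs(Y_sum - Y_closed) < 1e-12 * Y_closed
```

Increasing $\mu$ (hence $z$) shifts the weights $z^{N} Z(N)$ toward larger $N$, in line with the remark below on the role of the chemical potential.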
We also note that for a potential of the standard form
\begin{equation*}
\mathcal{V}_{N}=\sum_{1 \leqslant i<j \leqslant N} \mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right)+\sum_{1 \leqslant i \leqslant N} \mathcal{W}\left(\vec{x}_{i}\right),
\end{equation*}
we may think of the replacement $H_{N} \rightarrow H_{N}-\mu N$ as being due to $\mathcal{W} \rightarrow \mathcal{W}-\mu$. Therefore, for variable particle number $N$, there is no arbitrary additive constant in the 1-particle potential $\mathcal{W}$; it is fixed by the chemical potential $\mu$. A larger $\mu$ gives greater statistical weight in $Y$ to states with larger $N$, just as a larger $T$ (smaller $\beta$) gives greater weight to states with larger $E$.

4.5 Summary of different equilibrium ensembles

Let us summarize the equilibrium ensembles we have discussed in this chapter in a table:
| Ensemble | Defining property | Partition function | Statistical operator $\rho$ |
| :--- | :--- | :--- | :--- |
| Microcanonical ensemble | no energy exchange, no particle exchange | $W(E, N, V)$ | $\frac{1}{W}[\Theta(H-E+\Delta E)-\Theta(H-E)]$ |
| Canonical ensemble | energy exchange, no particle exchange | $Z(\beta, N, V)$ | $\frac{1}{Z} e^{-\beta H}$ |
| Grand canonical ensemble | energy exchange, particle exchange | $Y(\beta, \mu, V)$ | $\frac{1}{Y} e^{-\beta(H-\mu \hat{N})}$ |
Table 4.1: Properties of the different equilibrium ensembles.
The relationship between the partition functions $W, Z, Y$ and the corresponding natural thermodynamic "potentials" is summarized in the following table:
Further explanations regarding the various thermodynamic potentials are given below in section 6.7.
| Ensemble | Name of potential | Symbol | Relation with partition function |
| :--- | :--- | :--- | :--- |
| Microcanonical ensemble | Entropy | $S(E, N, V)$ | $S=\mathrm{k}_{\mathrm{B}} \log W$ |
| Canonical ensemble | Free energy | $F(\beta, N, V)$ | $F=-\beta^{-1} \log Z$ |
| Grand canonical ensemble | Gibbs free energy | $G(\beta, \mu, V)$ | $G=-\beta^{-1} \log Y$ |
Table 4.2: Relationship to different thermodynamic potentials.

4.6 Approximation methods

For interacting systems, it is normally impossible to calculate thermodynamic quantities exactly. In these cases, approximations, estimates, or numerical methods must be used. In the appendix, we present an example of a numerical method, the Monte Carlo algorithm, which can be turned into an efficient method for numerically evaluating quantities like partition functions. In problem B.16 we discuss the mean field approximation in the example of the Ising model. In the following two subsections we present an example of an expansion technique and an example of a method based on rigorous estimates.
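The Monte Carlo algorithm itself is presented in the appendix; as a minimal sketch of the idea, the following implements Metropolis updates for the two-dimensional Ising model. The lattice size, coupling, inverse temperature, random seed, and sweep count are arbitrary illustrative choices, and periodic boundary conditions are used for simplicity.

```python
import math, random

# Minimal Metropolis sketch for the 2D Ising model (illustrative only;
# parameters and boundary conditions are our own choices).
L, J, beta = 16, 1.0, 1.0            # lattice size, coupling, inverse temperature
random.seed(0)
spins = [[1] * L for _ in range(L)]  # start fully magnetized

def local_field(i, j):
    """Sum of the four nearest-neighbour spins (periodic boundaries)."""
    return (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
            + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])

def sweep():
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        dE = 2.0 * J * spins[i][j] * local_field(i, j)  # energy cost of a flip
        # accept the flip with probability min(1, e^{-β dE})
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]

for _ in range(200):
    sweep()
m = abs(sum(map(sum, spins))) / (L * L)  # magnetization per spin
```

At $\beta J = 1$, well above the critical value $\beta_{c} J \approx 0.44$ of the square lattice, the magnetization per spin stays close to 1, consistent with the low-temperature ordering proved by the Peierls argument in section 4.6.2.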

4.6.1 The cluster expansion

For simplicity, we consider a classical system in a box of volume $V$, with $N$-particle Hamiltonian $H_{N}$ given by
\begin{equation*}
H_{N}=\sum_{i=1}^{N} \frac{\vec{p}_{i}^{\,2}}{2 m}+\sum_{1 \leqslant i<j \leqslant N} \mathcal{V}_{i j}, \tag{4.108}
\end{equation*}
where $\mathcal{V}_{i j}=\mathcal{V}\left(\vec{x}_{i}-\vec{x}_{j}\right)$ is the two-particle interaction between the $i$-th and the $j$-th particle. The partition function for the grand canonical ensemble is (see (4.103)):
\begin{align*}
Y(\mu, \beta, V) &=\sum_{N=0}^{\infty} e^{\beta \mu N} Z(\beta, V, N) \\
&=\sum_{N=0}^{\infty} e^{\beta \mu N} \cdot \frac{1}{N!\,\lambda^{3 N}} \int_{V^{N}} d^{3 N} Q\, e^{-\beta \mathcal{V}_{N}(Q)}. \tag{4.109}
\end{align*}
Here, $\lambda=\frac{h}{\sqrt{2 \pi m \mathrm{k}_{\mathrm{B}} T}}$ is the thermal de Broglie wavelength. To compute the remaining integral over $Q=\left(\vec{x}_{1}, \ldots, \vec{x}_{N}\right)$ is generally impossible, but one can derive an expansion whose first few terms may often be evaluated exactly. For this we write the integrand as
\begin{equation*}
e^{-\beta \mathcal{V}_{N}}=\exp \left(-\beta \sum_{i<j} \mathcal{V}_{i j}\right)=\prod_{i<j} e^{-\beta \mathcal{V}_{i j}} \equiv \prod_{i<j}\left(1+f_{i j}\right), \tag{4.110}
\end{equation*}
where we have set $f_{i j} \equiv f\left(\vec{x}_{i}-\vec{x}_{j}\right)=e^{-\beta \mathcal{V}_{i j}}-1$. The idea is that we can think of $\left|f_{i j}\right|$ as small in some situations of interest, e.g. when the gas is dilute (such that $\left|\mathcal{V}_{i j}\right| \ll 1$ in "most of phase space"), or when $\beta$ is small (i.e. for large temperature $T$). With this in mind, we expand the above product as
\begin{equation*}
e^{-\beta \mathcal{V}_{N}}=1+\sum_{i<j} f_{i j}+\sum_{i<j,\, k<l} f_{i j} f_{k l}+\ldots \tag{4.111}
\end{equation*}
and substitute the result into the integral $\int_{V^{N}} d^{3 N} Q\, e^{-\beta \mathcal{V}_{N}(Q)}$. The general form of the resulting integrals that we need to evaluate is suggested by the following representative example for $N=6$ particles:
\begin{equation*}
\int d^{3} x_{1} \ldots d^{3} x_{6}\, f_{12} f_{35} f_{45} f_{36}=\left(\int d^{3} x_{1}\, d^{3} x_{2}\, f_{12}\right)\left(\int d^{3} x_{3}\, d^{3} x_{4}\, d^{3} x_{5}\, d^{3} x_{6}\, f_{35} f_{45} f_{36}\right), \tag{4.112}
\end{equation*}
which we represent by a graph consisting of the two clusters $\{1,2\}$ and $\{3,4,5,6\}$.
To keep track of all the integrals that come up, we introduce the following convenient graphical notation. In our example, this notation amounts to the following. Each circle corresponds to an integration, e.g.
\begin{equation*}
\textcircled{1} \leftrightarrow \int d^{3} x_{1}, \tag{4.114}
\end{equation*}
and each line corresponds to an f i j f i j f_(ij)f_{i j}fij in the integrand, e.g.
\begin{equation*}
\textcircled{1}\!-\!\textcircled{2} \longleftrightarrow f_{12}. \tag{4.115}
\end{equation*}
The connected parts of a diagram are called "clusters". Obviously, the integral associated with a graph factorizes into the corresponding integrals for the clusters. Therefore, the "cluster integrals" are the building blocks, and we define
\begin{equation*}
b_{l}(V, \beta)=\frac{1}{l!\,\lambda^{3 l-3} V} \cdot(\text{sum of all } l\text{-point cluster integrals}). \tag{4.116}
\end{equation*}
The main result in this context, known as the linked cluster theorem${ }^{3}$, is that
\begin{equation*}
\frac{1}{V} \log Y(\mu, V, \beta)=\frac{1}{\lambda^{3}} \sum_{l=1}^{\infty} b_{l}(V, \beta)\, z^{l}, \tag{4.117}
\end{equation*}
where $z=e^{\beta \mu}$ is sometimes called the fugacity. If the $f_{i j}$ are sufficiently small, the first few terms $b_{1}, b_{2}, b_{3}, \ldots$ will give a good approximation. Explicitly, one finds (exercise):
\begin{align*}
b_{1} &=\frac{1}{1!\,\lambda^{0} V} \int_{V} d^{3} x=1, \tag{4.118}\\
b_{2} &=\frac{1}{2!\,\lambda^{3} V} \int_{V^{2}} d^{3} x_{1}\, d^{3} x_{2}\, f_{12}, \tag{4.119}\\
b_{3} &=\frac{1}{3!\,\lambda^{6} V} \int_{V^{3}} d^{3} x_{1}\, d^{3} x_{2}\, d^{3} x_{3}\,\big(\underbrace{f_{12} f_{23}+f_{13} f_{12}+f_{13} f_{23}}_{3 \text{ equal integrals}}+f_{12} f_{13} f_{23}\big), \tag{4.120}
\end{align*}
since the possible 1-, 2- and 3-clusters are given by:
As exemplified by the first 3 terms in b 3 b 3 b_(3)b_{3}b3, topologically identical clusters (i.e. ones that differ only by a permutation of the particles) give the same cluster integral. Thus, we only need to evaluate the cluster integrals for topologically distinct clusters.
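The identity (4.110) underlying the expansion can be verified by brute force for small $N$: expanding $\prod_{i<j}(1+f_{ij})$ produces one term per subset of pairs, exactly the terms of (4.111). The pair energies in the sketch below are made-up numbers chosen only for illustration.

```python
from itertools import combinations
from math import exp, prod

# Brute-force check of e^{-βV_N} = Π_{i<j} (1 + f_ij) for N = 3 particles,
# with arbitrary (made-up) pair interaction energies V_ij.
beta = 0.9
V = {(0, 1): 0.3, (0, 2): -0.5, (1, 2): 1.1}      # hypothetical values of V_ij
pairs = list(V)
f = {p: exp(-beta * V[p]) - 1.0 for p in pairs}   # Mayer function f_ij

lhs = exp(-beta * sum(V.values()))                # e^{-βV_N} directly
# Expanding the product gives one term per subset of pairs: the empty
# subset contributes 1, single pairs give Σ f_ij, and so on (cf. (4.111)).
rhs = sum(prod((f[p] for p in subset), start=1.0)
          for r in range(len(pairs) + 1)
          for subset in combinations(pairs, r))
assert abs(lhs - rhs) < 1e-12
```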
Given an approximation for 1 V log Y 1 V log Y (1)/(V)log Y\frac{1}{V} \log Y1VlogY, one obtains approximations for the equations of state etc. by the general methods described in more detail in section 6.5.
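For a concrete feel for the cluster integrals, $b_{2}$ in (4.119) can be evaluated for a hard-sphere gas, a standard textbook choice used here as an illustration (the core diameter $\sigma$ and thermal wavelength $\lambda$ below are arbitrary values). In the limit $V \rightarrow \infty$, translation invariance reduces (4.119) to a single radial integral, $b_{2}=\frac{1}{2 \lambda^{3}} \int f(r)\, d^{3} r=-\frac{2 \pi}{3} \frac{\sigma^{3}}{\lambda^{3}}$, since $f(r)=-1$ inside the core and $0$ outside.

```python
import math

# Numerical check of b2 (4.119) for hard spheres: V(r) = ∞ for r < sigma,
# 0 otherwise, so the Mayer function is f(r) = -1 inside the core, 0 outside.
sigma, lam = 1.0, 0.5        # core diameter, thermal wavelength (arbitrary)

def f(r):
    return -1.0 if r < sigma else 0.0

# midpoint rule for ∫ f(r) 4π r² dr; the integrand vanishes beyond r = sigma
n, R = 100_000, 2.0 * sigma
dr = R / n
integral = sum(f((k + 0.5) * dr) * 4.0 * math.pi * ((k + 0.5) * dr) ** 2 * dr
               for k in range(n))

b2 = integral / (2.0 * lam**3)                       # (4.119) in the V → ∞ limit
b2_exact = -(2.0 * math.pi / 3.0) * sigma**3 / lam**3
assert abs(b2 - b2_exact) < 1e-3 * abs(b2_exact)
```

Truncating (4.117) after this term reproduces the leading (second virial) correction to the ideal gas equation of state.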

4.6.2 Peierls contours

Next, we present an example of a rigorous estimation method proving the existence of a phase transition in the two-dimensional Ising model. In this model, we have spins $\sigma_{i}= \pm 1$ on a square lattice, and the Hamiltonian (energy) is
\begin{equation*}
H(\{\sigma\})=-J \sum_{i k}\left(\sigma_{i} \sigma_{k}-1\right)-b \sum_{i} \sigma_{i},
\end{equation*}
where the first sum is over all lattice bonds $i k$ and the second sum is over all lattice sites $i$. Note that we have shifted the energies in the first term such that a pair of parallel spins gives a vanishing contribution to the total energy. We present an argument, due to Peierls, showing that under the boundary condition that the spins on the boundary of the lattice are positive, this model shows an equilibrium magnetization at sufficiently low temperatures. In the following, we set $b=0$.
Each configuration $\left\{\sigma_{i}\right\}=\left\{\sigma_{1}, \sigma_{2}, \ldots\right\}$ of spins is in one-to-one correspondence with a set of (connected) contours separating regions of positive and negative spins, known as Peierls contours. They may be chosen so that they consist of the line segments lying midway between opposite spins, see figure 4.9.
Figure 4.9: A Peierls contour
Due to our boundary condition, these contours are closed. Each pair of antiparallel spins contributes an energy of $+2J$. Since the total number of antiparallel spin pairs in a configuration corresponding to the contours $C_{1}, \ldots, C_{r}$ is given by the sum of their lengths, $\left|C_{1}\right|+\cdots+\left|C_{r}\right|$, the energy of that configuration is
\begin{equation*}
H=2 J \sum_{i=1}^{r}\left|C_{i}\right|. \tag{4.121}
\end{equation*}
Consider the totality of configurations having a given contour $C$ as a Peierls contour or domain wall. The sum of their probabilities, $P_{C}=\sum_{\{\sigma\} \supset C} Z^{-1} e^{-\beta H(\{\sigma\})}$, is the probability that $C$ occurs as a Peierls contour. We can estimate this probability as follows. For each configuration $\{\sigma\}$ containing $C$, we may define a modified configuration $\{\sigma^{\prime}\}$ obtained by flipping all spins inside the domain defined by $C$. Then $\{\sigma^{\prime}\}$ no longer has the Peierls contour $C$. The set of all distinct configurations $\{\sigma\}$ containing $C$ leads in this way to a set of distinct configurations $\{\sigma^{\prime}\}$ without $C$, and this set is itself a subset of all configurations without $C$. Since, by (4.121), the ratio of the probabilities of the original and the modified configuration is $e^{-2 \beta J|C|}$, and since the sum of the probabilities of configurations without $C$ is at most 1, the probability for $C$ to be a Peierls contour is at most $e^{-2 \beta J|C|}$.
Now, if the spin $\sigma_{x}$ at some site $x$ is negative, there must be a Peierls contour surrounding $x$. Therefore, the probability $P_{x}^{-}$ that the spin at $x$ is negative can be estimated by the sum of the probabilities $P_{C} \leqslant e^{-2 \beta J|C|}$ for there to be a contour $C$ surrounding $x$. Thus,
\begin{equation*}
P_{x}^{-} \leqslant \sum_{C} e^{-2 \beta J|C|},
\end{equation*}
where the sum runs over all contours $C$ surrounding $x$. Sorting them by length $l$ and extending the sum to infinite lengths, we obtain
\begin{equation*}
P_{x}^{-} \leqslant \sum_{l=4}^{\infty} N(l)\, e^{-2 \beta J l},
\end{equation*}
where $N(l)$ is the number of contours of length $l$ surrounding $x$. The sum starts at length 4, because the length of a contour is at least 4 times the lattice spacing, which we assume to be 1.
To get an estimate for $N(l)$, we observe the following. First, we consider the set of all possible shapes of contours of the given length $l$, where two contours are considered to have the same shape if they are congruent after some translation. We follow a given contour from some starting point, from which we have 2 possibilities to proceed (taking into account that the orientation is irrelevant). At each subsequent site, we have at most 3 possibilities to continue, as we cannot go straight back. At the last site, there is at most one possibility to close the contour, so there are at most $2 \times 3^{l-2}$ shapes of closed curves of length $l$. (Actually, this is a vast over-counting, since the curves included in this counting may neither be closed nor free of self-intersections!) Now we impose that $x$ must lie within the contour. Within a contour of length $l$ there can be at most $(l / 4)^{2}$ points, since $l / 4$ is the side length of a square of circumference $l$. If a contour surrounds $x$, then we may shift it in as many ways as there are points inside it while still keeping $x$ inside. Therefore,
\begin{equation*}
N(l) \leqslant 2(l / 4)^{2}\, 3^{l-2},
\end{equation*}
hence
\begin{equation*}
P_{x}^{-} \leqslant \sum_{l=4}^{\infty} 2(l / 4)^{2}\, 3^{l-2}\, e^{-2 \beta J l}=\frac{1}{72} \sum_{l=4}^{\infty} l^{2}\, e^{(-2 \beta J+\log 3) l}.
\end{equation*}
By the integral test, the right hand side converges for $2 \beta J>\log 3$, and it tends to zero as $\beta \rightarrow \infty$, i.e. $T \rightarrow 0$. Hence, for every $m \in(0,1)$, there exists a temperature $T_{0}$ such that $P_{x}^{-}<\frac{1}{2}(1-m)$ for all lower temperatures. Then $P_{x}^{+}>\frac{1}{2}(1+m)$, and thus
\begin{equation*}
\left\langle\sigma_{x}\right\rangle=P_{x}^{+}-P_{x}^{-}>m
\end{equation*}
for all $T<T_{0}$. Since this holds for all lattice sites $x$, we conclude that below a certain threshold temperature the system shows a macroscopic magnetization.
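The Peierls bound is easy to evaluate numerically by truncating the series once its terms are negligible. The value $\beta J = 1$ below is an arbitrary choice satisfying $2\beta J > \log 3$, used only to illustrate that the bound is already far below $\frac{1}{2}$ there.

```python
import math

# Numerical evaluation of the Peierls bound
#   P_x^- ≤ (1/72) Σ_{l≥4} l² e^{(-2βJ + log 3) l},
# truncated at a large cutoff (the terms decay geometrically for
# 2βJ > log 3, so the truncation error is negligible).
def peierls_bound(betaJ, lmax=10_000):
    a = -2.0 * betaJ + math.log(3.0)
    return sum(l * l * math.exp(a * l) for l in range(4, lmax + 1)) / 72.0

# the bound decreases with β and tends to zero as β → ∞ ...
assert peierls_bound(2.0) < peierls_bound(1.0)
assert peierls_bound(10.0) < 1e-20
# ... and at βJ = 1 it is already below 1/2, so ⟨σ_x⟩ > 0 there
assert peierls_bound(1.0) < 0.5
```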

Chapter 5

The Ideal Quantum Gas

5.1 Hilbert Spaces, Canonical and Grand Canonical Formulations

When discussing the mixing entropy of classical ideal gases in section 4.2.3, we noted that Gibbs' paradox could be resolved by treating the particles of the same gas species as indistinguishable. How does one treat indistinguishable particles in quantum mechanics? If we have $N$ particles, the state vectors $\Psi$ are elements of a Hilbert space, such as $\mathcal{H}_{N}=L^{2}\left(V \times \ldots \times V, d^{3} x_{1} \ldots d^{3} x_{N}\right)$ for particles in a box $V \subset \mathbb{R}^{3}$ without additional quantum numbers. The probability density of finding the $N$ particles at prescribed positions $\vec{x}_{1}, \ldots, \vec{x}_{N}$ is given by $\left|\Psi\left(\vec{x}_{1}, \ldots, \vec{x}_{N}\right)\right|^{2}$. For identical particles, this should be the same as $\left|\Psi\left(\vec{x}_{\sigma(1)}, \ldots, \vec{x}_{\sigma(N)}\right)\right|^{2}$ for any permutation
\begin{equation*}
\sigma: \begin{pmatrix} 1 & 2 & 3 & \ldots & N-1 & N \\ \sigma(1) & \sigma(2) & \sigma(3) & \ldots & \sigma(N-1) & \sigma(N) \end{pmatrix}.
\end{equation*}
Thus, the map $\mathcal{U}_\sigma: \Psi(\vec{x}_1, \ldots, \vec{x}_N) \mapsto \Psi(\vec{x}_{\sigma(1)}, \ldots, \vec{x}_{\sigma(N)})$ should be represented by a phase, i.e.
\begin{equation*}
\mathcal{U}_\sigma \Psi = \eta_\sigma \Psi, \quad |\eta_\sigma| = 1.
\end{equation*}
Every permutation $\sigma$ can be expressed as a concatenation of transpositions, i.e., interchanges of two elements. Performing a transposition $\pi$ twice yields the original wave function; hence $\mathcal{U}_\pi^2 = \mathbb{1}$, so $\eta_\pi^2 = 1$. It follows that $\eta_\pi \in \{\pm 1\}$. Furthermore, from $\mathcal{U}_\sigma \mathcal{U}_{\sigma'} = \mathcal{U}_{\sigma\sigma'}$ it follows that $\eta_\sigma \eta_{\sigma'} = \eta_{\sigma\sigma'}$, and as any permutation $\sigma$ can be expressed as a product of transpositions, the only possible constant assignments for $\eta_\sigma$ are therefore given by
\begin{equation*}
\eta_\sigma = \begin{cases} 1 & \forall \sigma \quad \text{(Bosons)} \\ \operatorname{sgn}(\sigma) & \forall \sigma \quad \text{(Fermions).} \end{cases} \tag{5.1}
\end{equation*}
Here the signum of $\sigma$ is defined as
\begin{equation*}
\operatorname{sgn}(\sigma) = (-1)^{\#\{\text{transpositions in } \sigma\}} = (-1)^{\#\{\text{``crossings'' in } \sigma\}}. \tag{5.2}
\end{equation*}
The second characterization also makes plausible the fact that $\operatorname{sgn}(\sigma)$ is an invariant satisfying $\operatorname{sgn}(\sigma)\operatorname{sgn}(\sigma') = \operatorname{sgn}(\sigma\sigma')$.

Example:

Consider a permutation whose diagram (not reproduced here) has four ``crossings''. In this example we have $\operatorname{sgn}(\sigma) = +1 = (-1)^4$.
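Since a crossing in such a diagram is just an inversion of the permutation, $\operatorname{sgn}$ and its multiplicativity can be checked directly. A minimal Python sketch (the function names `sgn` and `compose` are our own):

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation given as a tuple of 0-based images.
    Counts inversions (pairs i < j with perm[i] > perm[j]), which
    correspond to the "crossings" in the diagram."""
    crossings = sum(1 for i in range(len(perm))
                      for j in range(i + 1, len(perm))
                      if perm[i] > perm[j])
    return (-1) ** crossings

def compose(s, t):
    """(s . t)(i) = s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

# sgn is multiplicative: sgn(s) sgn(t) = sgn(s . t)
for s in permutations(range(3)):
    for t in permutations(range(3)):
        assert sgn(s) * sgn(t) == sgn(compose(s, t))

print(sgn((1, 0, 2)))  # a single transposition: -1
```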
In order to go from the Hilbert space $\mathcal{H}_N$ of distinguishable particles, such as
\begin{align*}
\mathcal{H}_N &= \underbrace{\mathcal{H}_1 \otimes \ldots \otimes \mathcal{H}_1}_{N \text{ factors}}, \\
\mathcal{H}_N \ni |\Psi\rangle &= |i_1\rangle \otimes \ldots \otimes |i_N\rangle \equiv |i_1 \ldots i_N\rangle, \\
H|i\rangle &= E_i |i\rangle \quad \text{(1-particle Hamiltonian on } \mathcal{H}_1\text{),}
\end{align*}
to the Hilbert space for bosons/fermions, one can apply the projection operators
\begin{align*}
\mathcal{P}_+ &= \frac{1}{N!} \sum_{\sigma \in S_N} \mathcal{U}_\sigma, \\
\mathcal{P}_- &= \frac{1}{N!} \sum_{\sigma \in S_N} \operatorname{sgn}(\sigma)\, \mathcal{U}_\sigma.
\end{align*}
As projectors, the operators $\mathcal{P}_\pm$ fulfill the following relations:
\begin{equation*}
\mathcal{P}_\pm^2 = \mathcal{P}_\pm, \quad \mathcal{P}_\pm^\dagger = \mathcal{P}_\pm, \quad \mathcal{P}_+ \mathcal{P}_- = \mathcal{P}_- \mathcal{P}_+ = 0.
\end{equation*}
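These projector relations can be verified numerically for a small toy system. The sketch below (our own construction, with single-particle dimension $d = 2$ and $N = 2$ particles) builds the permutation operators $\mathcal{U}_\sigma$ as matrices on the product basis:

```python
import numpy as np
from itertools import permutations, product
from math import factorial

d, N = 2, 2                                # toy single-particle dim, particle number
dim = d ** N
basis = list(product(range(d), repeat=N))  # product basis |i_1 ... i_N>
index = {b: n for n, b in enumerate(basis)}

def U(sigma):
    """Permutation operator: U_sigma |i_1...i_N> = |i_sigma(1), ..., i_sigma(N)>."""
    M = np.zeros((dim, dim))
    for b in basis:
        M[index[tuple(b[sigma[j]] for j in range(N))], index[b]] = 1.0
    return M

def sgn(sigma):
    return (-1) ** sum(sigma[i] > sigma[j]
                       for i in range(N) for j in range(i + 1, N))

P_plus  = sum(U(s) for s in permutations(range(N))) / factorial(N)
P_minus = sum(sgn(s) * U(s) for s in permutations(range(N))) / factorial(N)

# idempotent, self-adjoint (real symmetric here), mutually orthogonal
assert np.allclose(P_plus @ P_plus, P_plus)
assert np.allclose(P_minus @ P_minus, P_minus)
assert np.allclose(P_plus, P_plus.T)
assert np.allclose(P_plus @ P_minus, np.zeros((dim, dim)))
```

For $d = 2$, $N = 2$ the traces come out as $3$ and $1$: the dimensions of the symmetric and antisymmetric subspaces.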
The Hilbert spaces for bosons/fermions, respectively, are then given by
\begin{equation*}
\mathcal{H}_N^\pm = \begin{cases} \mathcal{P}_+ \mathcal{H}_N & \text{for Bosons} \\ \mathcal{P}_- \mathcal{H}_N & \text{for Fermions.} \end{cases} \tag{5.3}
\end{equation*}
In the following, we consider $N$ non-interacting, non-relativistic particles of mass $m$ in a box with volume $V = L^3$, together with Dirichlet boundary conditions. The Hamiltonian of the system in either case is given by
\begin{equation*}
H_N = \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m} = \sum_{i=1}^N -\frac{\hbar^2}{2m} \partial_{\vec{x}_i}^2. \tag{5.4}
\end{equation*}
The eigenstates for a single particle are given by the wave functions
\begin{equation*}
\Psi_{\vec{k}}(\vec{x}) = \sqrt{\frac{8}{V}} \sin(k_x x) \sin(k_y y) \sin(k_z z), \tag{5.5}
\end{equation*}
where $k_x = \frac{\pi n_x}{L}, \ldots$, with $n_x = 1, 2, 3, \ldots$, and similarly for the $y,z$-components. The product wave functions $\Psi_{\vec{k}_1}(\vec{x}_1) \cdots \Psi_{\vec{k}_N}(\vec{x}_N)$ do not satisfy the symmetry requirements for bosons/fermions. To obtain these, we have to apply the projectors $\mathcal{P}_\pm$ to the states $|\vec{k}_1\rangle \otimes \cdots \otimes |\vec{k}_N\rangle \in \mathcal{H}_N$. We define:
\begin{equation*}
|\vec{k}_1, \ldots, \vec{k}_N\rangle_\pm := \frac{N!}{\sqrt{c_\pm}}\, \mathcal{P}_\pm \big(\underbrace{|\vec{k}_1\rangle \otimes \ldots \otimes |\vec{k}_N\rangle}_{|\vec{k}_1, \ldots, \vec{k}_N\rangle}\big), \tag{5.6}
\end{equation*}
where $c_\pm$ is a normalization constant, defined by demanding that ${}_\pm\langle \vec{k}_1, \ldots, \vec{k}_N \mid \vec{k}_1, \ldots, \vec{k}_N \rangle_\pm = 1$. (We have used the Dirac notation $\langle \vec{x} \mid \vec{k}\rangle \equiv \Psi_{\vec{k}}(\vec{x})$.) Explicitly, we have:
\begin{align*}
|\vec{k}_1, \ldots, \vec{k}_N\rangle_+ &= \frac{1}{\sqrt{c_+}} \sum_{\sigma \in S_N} |\vec{k}_{\sigma(1)}, \ldots, \vec{k}_{\sigma(N)}\rangle \quad \text{for Bosons,} \tag{5.7} \\
|\vec{k}_1, \ldots, \vec{k}_N\rangle_- &= \frac{1}{\sqrt{c_-}} \sum_{\sigma \in S_N} \operatorname{sgn}(\sigma)\, |\vec{k}_{\sigma(1)}, \ldots, \vec{k}_{\sigma(N)}\rangle \quad \text{for Fermions.} \tag{5.8}
\end{align*}
Note that the factor $\frac{1}{N!}$ coming from $\mathcal{P}_\pm$ has been absorbed into $c_\pm$.

Examples:

(a) Fermions with $N = 2$: A normalized two-particle fermion state is given by
\begin{equation*}
|\vec{k}_1, \vec{k}_2\rangle_- = \frac{1}{\sqrt{2}} \big(|\vec{k}_1, \vec{k}_2\rangle - |\vec{k}_2, \vec{k}_1\rangle\big),
\end{equation*}
with $|\vec{k}_1, \vec{k}_2\rangle_- = 0$ if $\vec{k}_1 = \vec{k}_2$. This implements the Pauli principle. More generally, for an $N$-particle fermion state we have
\begin{equation*}
|\ldots, \vec{k}_i, \ldots, \vec{k}_j, \ldots\rangle_- = 0 \quad \text{whenever } \vec{k}_i = \vec{k}_j. \tag{5.9}
\end{equation*}
(b) Bosons with $N = 2$: A normalized two-particle boson state is given by
\begin{equation*}
|\vec{k}_1, \vec{k}_2\rangle_+ = \begin{cases} \frac{1}{\sqrt{2}} \big(|\vec{k}_1 \vec{k}_2\rangle + |\vec{k}_2 \vec{k}_1\rangle\big) & \vec{k}_1 \neq \vec{k}_2 \\ |\vec{k}_1, \vec{k}_2\rangle & \vec{k}_1 = \vec{k}_2. \end{cases} \tag{5.10}
\end{equation*}
(c) Bosons with $N = 3$: A normalized three-particle boson state with $\vec{k}_1 = \vec{k}$, $\vec{k}_2 = \vec{k}_3 = \vec{p}$ is given by
\begin{equation*}
|\vec{k}, \vec{p}, \vec{p}\rangle_+ = \frac{1}{\sqrt{3}} \big(|\vec{p}, \vec{p}, \vec{k}\rangle + |\vec{p}, \vec{k}, \vec{p}\rangle + |\vec{k}, \vec{p}, \vec{p}\rangle\big). \tag{5.11}
\end{equation*}
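Example (a) and the Pauli principle (5.9) can be illustrated numerically with toy mode labels. A sketch (the helper `antisymmetrized_state` is our own; it uses $c_- = N!$, as derived below):

```python
import numpy as np
from itertools import permutations
from math import factorial

def antisymmetrized_state(modes, dim):
    """(1/sqrt(N!)) sum_sigma sgn(sigma) |k_sigma(1), ..., k_sigma(N)>,
    i.e. eq. (5.8) with c_- = N!, for integer mode labels 0..dim-1."""
    N = len(modes)
    vec = np.zeros(dim ** N)
    for sigma in permutations(range(N)):
        sgn = (-1) ** sum(sigma[i] > sigma[j]
                          for i in range(N) for j in range(i + 1, N))
        idx = 0
        for j in range(N):
            idx = idx * dim + modes[sigma[j]]   # flatten the tensor index
        vec[idx] += sgn
    return vec / np.sqrt(factorial(N))

# distinct modes give a unit-norm state ...
assert np.isclose(np.linalg.norm(antisymmetrized_state((0, 1), dim=2)), 1.0)
# ... while equal modes give the zero vector: the Pauli principle, eq. (5.9)
assert np.allclose(antisymmetrized_state((0, 0), dim=2), 0.0)
```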
The normalization factors $c_+, c_-$ are given in general as follows:
(a) Bosons: Let $n_{\vec{k}}$ be the number of appearances of the mode $\vec{k}$ in $|\vec{k}_1, \ldots, \vec{k}_N\rangle_+$, i.e. $n_{\vec{k}} = \sum_i \delta_{\vec{k}, \vec{k}_i}$. Then $c_+$ is given by
\begin{equation*}
c_+ = N! \prod_{\vec{k}} n_{\vec{k}}!. \tag{5.12}
\end{equation*}
In example (c) above we have $n_{\vec{k}} = 1$, $n_{\vec{p}} = 2$ and thus
\begin{equation*}
c_+ = 3!\, 2!\, 1! = 12.
\end{equation*}
Note that this is correct, since
\begin{equation*}
|\vec{k}, \vec{p}, \vec{p}\rangle_+ = \frac{1}{\sqrt{12}} \big(|\vec{p}, \vec{p}, \vec{k}\rangle + |\vec{p}, \vec{k}, \vec{p}\rangle + |\vec{k}, \vec{p}, \vec{p}\rangle + |\vec{p}, \vec{p}, \vec{k}\rangle + |\vec{p}, \vec{k}, \vec{p}\rangle + |\vec{k}, \vec{p}, \vec{p}\rangle\big),
\end{equation*}
because there are $3! = 6$ permutations in $S_3$.
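Equation (5.12) can be checked numerically: summing over all permutations and dividing by $\sqrt{c_+}$ should give a unit vector. A sketch with toy mode labels (the helper `symmetrized_state` is our own):

```python
import numpy as np
from itertools import permutations
from math import factorial
from collections import Counter

def symmetrized_state(modes, dim):
    """sum_sigma |k_sigma(1), ..., k_sigma(N)>, normalized by 1/sqrt(c_+)
    with c_+ = N! * prod_k (n_k!) as in eq. (5.12)."""
    N = len(modes)
    c_plus = factorial(N)
    for n in Counter(modes).values():   # occupation numbers n_k
        c_plus *= factorial(n)
    vec = np.zeros(dim ** N)
    for sigma in permutations(range(N)):
        idx = 0
        for j in range(N):
            idx = idx * dim + modes[sigma[j]]   # flatten the tensor index
        vec[idx] += 1.0
    return vec / np.sqrt(c_plus)

# example (c): mode k once, mode p twice -> c_+ = 3! * 2! = 12, unit norm
psi = symmetrized_state((0, 1, 1), dim=2)
assert np.isclose(np.linalg.norm(psi), 1.0)
```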
(b) Fermions: In this case we have $c_- = N!$. To check this, we note that
\begin{align*}
\langle\{\vec{k}\} \mid \{\vec{k}\}\rangle &= \frac{1}{c_-} \sum_{\sigma, \sigma' \in S_N} \operatorname{sgn}(\sigma)\operatorname{sgn}(\sigma') \langle \vec{k}_{\sigma(1)}, \ldots, \vec{k}_{\sigma(N)} \mid \vec{k}_{\sigma'(1)}, \ldots, \vec{k}_{\sigma'(N)}\rangle \\
&= \frac{N!}{c_-} \sum_{\sigma \in S_N} \operatorname{sgn}(\sigma) \langle \vec{k}_1, \ldots, \vec{k}_N \mid \vec{k}_{\sigma(1)}, \ldots, \vec{k}_{\sigma(N)}\rangle \\
&= \frac{N!\, n_{\vec{k}_1}!\, n_{\vec{k}_2}! \cdots n_{\vec{k}_N}!}{c_-} = \frac{N!}{c_-} = 1,
\end{align*}
because the term under the second sum is zero unless the permuted $\vec{k}$'s are identical (this happens $\prod_{\vec{k}} n_{\vec{k}}!$ times for either bosons or fermions, with $\operatorname{sgn}(\sigma) = 1$ for all surviving terms), and because for fermions the occupation numbers $n_{\vec{k}}$ can only be zero or one, so that all $n_{\vec{k}}! = 1$.
The canonical partition function $Z^\pm$ is now defined as:
\begin{equation*}
Z^\pm(N, V, \beta) := \operatorname{tr}_{\mathcal{H}_N^\pm}\big(e^{-\beta H}\big). \tag{5.13}
\end{equation*}
In general the partition function is difficult to calculate. It is easier to pass momentarily to the grand canonical ensemble, where the particle number $N$ is variable, i.e. it is given by a particle number operator $\hat{N}$ with eigenvalues $N = 0, 1, 2, \ldots$. The Hilbert space is then given by the bosonic $(+)$ or fermionic $(-)$ Fock space
\begin{equation*}
\mathcal{H}^\pm = \bigoplus_{N \geq 0} \mathcal{H}_N^\pm = \mathbb{C} \oplus \mathcal{H}_1^\pm \oplus \ldots \tag{5.14}
\end{equation*}
On $\mathcal{H}_N^\pm$ the particle number operator $\hat{N}$ has eigenvalue $N$. The grand canonical partition function $Y^\pm$ is then defined as before as (cf. (4.103) and (4.107)):
\begin{equation*}
Y^\pm(\mu, V, \beta) := \operatorname{tr}_{\mathcal{H}^\pm}\big(e^{-\beta(H - \mu\hat{N})}\big) = \sum_{N=0}^\infty e^{+\mu\beta N} Z^\pm(N, V, \beta). \tag{5.15}
\end{equation*}
Another representation of the states in $\mathcal{H}^\pm$ is the one based on the occupation numbers $n_{\vec{k}}$:
(a) $|\{n_{\vec{k}}\}\rangle_+, \quad n_{\vec{k}} = 0, 1, 2, 3, \ldots$ for Bosons,
(b) $|\{n_{\vec{k}}\}\rangle_-, \quad n_{\vec{k}} = 0, 1$ for Fermions.
In the bosonic case, one defines the creation and annihilation operators for a mode $\vec{k}$ as
\begin{align*}
a_{\vec{k}}^\dagger |\ldots, n_{\vec{k}}, \ldots\rangle_+ &= \sqrt{1 + n_{\vec{k}}}\, |\ldots, n_{\vec{k}} + 1, \ldots\rangle_+, \tag{5.16} \\
a_{\vec{k}} |\ldots, n_{\vec{k}}, \ldots\rangle_+ &= \sqrt{n_{\vec{k}}}\, |\ldots, n_{\vec{k}} - 1, \ldots\rangle_+. \tag{5.17}
\end{align*}
These raise/lower the number of particles in mode $\vec{k}$ by one. One easily checks that they fulfill the commutation relations
\begin{equation*}
[a_{\vec{k}}, a_{\vec{p}}^\dagger] = \delta_{\vec{k},\vec{p}}, \quad [a_{\vec{k}}, a_{\vec{p}}] = [a_{\vec{k}}^\dagger, a_{\vec{p}}^\dagger] = 0, \tag{5.18}
\end{equation*}
so they behave as the ladder operators of a system of independent harmonic oscillators (one for each mode $\vec{k}$). In particular, $\hat{N}_{\vec{k}} = a_{\vec{k}}^\dagger a_{\vec{k}}$ is the operator counting the number of particles in mode $\vec{k}$.
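On a truncated single-mode Fock space, (5.16)-(5.18) become finite matrices, which makes the relations easy to check numerically. A sketch (the truncation level `M` is our own choice; $[a, a^\dagger] = \mathbb{1}$ can only hold exactly below the truncation level):

```python
import numpy as np

M = 6  # truncate the one-mode bosonic Fock space at occupation M - 1

# annihilation operator: a|n> = sqrt(n) |n-1>, cf. eq. (5.17)
a = np.diag(np.sqrt(np.arange(1, M)), k=1)
adag = a.T                            # creation operator (real matrix)

# number operator a† a with eigenvalues n = 0, 1, ..., M-1
N_op = adag @ a
assert np.allclose(np.diag(N_op), np.arange(M))

# [a, a†] = 1 holds exactly on states below the truncation level
comm = a @ adag - adag @ a
assert np.allclose(comm[:M-1, :M-1], np.eye(M - 1))
```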
To arrive at similar creation/annihilation operators in the fermionic case, we first consider a single mode, which can be occupied by no or one particle. In the corresponding basis $\{|0\rangle, |1\rangle\}$, the annihilation and creation operators have the following matrix representation:
\begin{equation*}
a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad a^\dagger = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \tag{5.19}
\end{equation*}
In particular, we find that
\begin{align*}
a a^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad a^\dagger a = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{5.20} \\
a a = 0, \qquad a^\dagger a^\dagger = 0, \tag{5.21}
\end{align*}
which in particular imply that
\begin{equation*}
\{a, a^\dagger\} = \mathbb{1}, \quad \{a, a\} = \{a^\dagger, a^\dagger\} = 0, \tag{5.22}
\end{equation*}
where $\{A, B\} = AB + BA$ denotes the anticommutator of two operators. Furthermore, we see that $a^\dagger a = \hat{N}$ can again be interpreted as the number operator. It is natural to require anticommutation relations also for creation/annihilation operators corresponding to different modes, i.e.
\begin{equation*}
\{a_{\vec{k}}^\dagger, a_{\vec{p}}\} = \delta_{\vec{k},\vec{p}}, \quad \{a_{\vec{k}}, a_{\vec{p}}\} = \{a_{\vec{k}}^\dagger, a_{\vec{p}}^\dagger\} = 0. \tag{5.23}
\end{equation*}
To implement these, we can define their action in the above basis as
\begin{align*}
a_{\vec{k}}^\dagger |\ldots, n_{\vec{k}}, \ldots\rangle_- &= (-1)^{\sum_{\vec{p} < \vec{k}} n_{\vec{p}}} \big(1 - n_{\vec{k}}\big) |\ldots, n_{\vec{k}} + 1, \ldots\rangle_-, \tag{5.24} \\
a_{\vec{k}} |\ldots, n_{\vec{k}}, \ldots\rangle_- &= (-1)^{\sum_{\vec{p} < \vec{k}} n_{\vec{p}}}\, n_{\vec{k}} |\ldots, n_{\vec{k}} - 1, \ldots\rangle_-. \tag{5.25}
\end{align*}
For the sign factors on the right hand side, which implement the anticommutation relations between operators corresponding to different modes, one chooses an arbitrary (but fixed) ordering of the wave vectors $\vec{k}$.
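The sign factors in (5.24)-(5.25) are exactly the Jordan-Wigner construction: each mode lives on a two-level system, and a factor $(-1)^{n_{\vec{p}}}$ (a Pauli $Z$) is inserted for every earlier mode. A sketch for three modes, verifying the anticommutation relations (5.23) (function names are our own; matrices are real, so the adjoint is a transpose):

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])                   # (-1)^n on a single mode
a1 = np.array([[0.0, 1.0], [0.0, 0.0]])    # single-mode annihilator, eq. (5.19)

def kron(*ops):
    return reduce(np.kron, ops)

def fermi_annihilator(j, n_modes):
    """Annihilator for mode j: Z factors on all earlier modes implement
    the sign (-1)^{sum_{p<k} n_p} of eqs. (5.24)-(5.25)."""
    return kron(*([Z] * j + [a1] + [I2] * (n_modes - j - 1)))

n = 3
a = [fermi_annihilator(j, n) for j in range(n)]
anti = lambda A, B: A @ B + B @ A

# {a_k, a_p†} = delta_{kp},  {a_k, a_p} = 0, cf. eq. (5.23)
for j in range(n):
    for k in range(n):
        assert np.allclose(anti(a[j], a[k].T), np.eye(2 ** n) * (j == k))
        assert np.allclose(anti(a[j], a[k]), np.zeros((2 ** n, 2 ** n)))
```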
Both for bosons and fermions, the operator counting the number of particles in mode $\vec{k}$ is $\hat{N}_{\vec{k}} = a_{\vec{k}}^\dagger a_{\vec{k}}$, with eigenvalues $n_{\vec{k}}$. The Hamiltonian may then be written as
\begin{equation*}
H = \sum_{\vec{k}} \epsilon(\vec{k})\, \hat{N}_{\vec{k}} = \sum_{\vec{k}} \epsilon(\vec{k})\, a_{\vec{k}}^\dagger a_{\vec{k}}, \tag{5.26}
\end{equation*}
where $\epsilon(\vec{k}) = \frac{\hbar^2 \vec{k}^2}{2m}$ for non-relativistic particles. With the formalism of creation and annihilation operators at hand, the grand canonical partition function for bosons and fermions, respectively, may now be calculated as follows:
(a) Bosons ("+"):
\begin{align*}
Y^+(\mu, V, \beta) &= \sum_{\{n_{\vec{k}}\}} {}_+\langle \{n_{\vec{k}}\}| e^{-\beta(H - \mu\hat{N})} |\{n_{\vec{k}}\}\rangle_+ \\
&= \sum_{\{n_{\vec{k}}\}} e^{-\beta \sum_{\vec{k}} n_{\vec{k}} (\epsilon(\vec{k}) - \mu)} \\
&= \prod_{\vec{k}} \left( \sum_{n=0}^\infty e^{-\beta(\epsilon(\vec{k}) - \mu) n} \right) \\
&= \prod_{\vec{k}} \left( 1 - e^{-\beta(\epsilon(\vec{k}) - \mu)} \right)^{-1}, \tag{5.27}
\end{align*}
where the geometric series converges because $\mu < \epsilon(\vec{k})$ for all $\vec{k}$.
(b) Fermions ("-"): here each occupation number only takes the values $n = 0, 1$, so
\begin{align*}
Y^-(\mu, V, \beta) &= \sum_{\{n_{\vec{k}}\}} {}_-\langle \{n_{\vec{k}}\}| e^{-\beta(H - \mu\hat{N})} |\{n_{\vec{k}}\}\rangle_- \\
&= \prod_{\vec{k}} \left( 1 + e^{-\beta(\epsilon(\vec{k}) - \mu)} \right). \tag{5.28}
\end{align*}
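For a handful of modes, the product formulas (5.27) and (5.28) can be checked against a brute-force sum over occupation configurations. A sketch with toy mode energies and chemical potential (all numerical values are our own, chosen with $\mu < \epsilon(\vec{k})$):

```python
import numpy as np
from itertools import product

beta, mu = 1.3, -0.5                 # toy values; mu below all mode energies
eps = np.array([0.2, 0.7, 1.1])      # three toy mode energies

# Fermions: brute-force sum over occupations n_k in {0, 1} ...
Y_minus_sum = sum(np.exp(-beta * np.dot(n, eps - mu))
                  for n in product((0, 1), repeat=len(eps)))
# ... agrees with the product formula (5.28)
Y_minus_prod = np.prod(1.0 + np.exp(-beta * (eps - mu)))
assert np.isclose(Y_minus_sum, Y_minus_prod)

# Bosons: truncating each geometric series at a large n_max reproduces (5.27)
n_max = 40
Y_plus_sum = sum(np.exp(-beta * np.dot(n, eps - mu))
                 for n in product(range(n_max), repeat=len(eps)))
Y_plus_prod = np.prod(1.0 / (1.0 - np.exp(-beta * (eps - mu))))
assert np.isclose(Y_plus_sum, Y_plus_prod)
```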
The expected number densities $\bar{n}_{\vec{k}}$, which are defined as
\begin{equation*}
\bar{n}_{\vec{k}} := \langle \hat{N}_{\vec{k}} \rangle_\pm = \operatorname{tr}_{\mathcal{H}^\pm}\left( \frac{\hat{N}_{\vec{k}}}{Y^\pm}\, e^{-\beta(H - \mu\hat{N})} \right), \tag{5.29}
\end{equation*}
can be calculated by means of a trick. Let us consider the bosonic case ("+"). From the above commutation relations we obtain
\begin{equation*}
\hat{N}_{\vec{k}}\, a_{\vec{p}}^\dagger = a_{\vec{p}}^\dagger \big(\hat{N}_{\vec{k}} + \delta_{\vec{k},\vec{p}}\big) \quad \text{and} \quad a_{\vec{p}}\, \hat{N}_{\vec{k}} = \big(\hat{N}_{\vec{k}} + \delta_{\vec{k},\vec{p}}\big) a_{\vec{p}}. \tag{5.30}
\end{equation*}
From this it follows by a straightforward calculation that
\begin{align*}
\bar{n}_{\vec{k}} &= \operatorname{tr}_{\mathcal{H}^+}\left( \frac{1}{Y^+}\, a_{\vec{k}}^\dagger a_{\vec{k}}\, e^{-\beta(H - \mu\hat{N})} \right) = \operatorname{tr}_{\mathcal{H}^+}\left( \frac{1}{Y^+}\, a_{\vec{k}}^\dagger a_{\vec{k}}\, e^{-\sum_{\vec{p}} \beta(\epsilon(\vec{p}) - \mu)\hat{N}_{\vec{p}}} \right) \\
&= \operatorname{tr}_{\mathcal{H}^+}\left( \frac{1}{Y^+}\, a_{\vec{k}}^\dagger\, e^{-\sum_{\vec{p}} \beta(\epsilon(\vec{p}) - \mu)(\hat{N}_{\vec{p}} + \delta_{\vec{k},\vec{p}})}\, a_{\vec{k}} \right) \\
&= \operatorname{tr}_{\mathcal{H}^+}\left( \frac{1}{Y^+}\, a_{\vec{k}} a_{\vec{k}}^\dagger\, e^{-\sum_{\vec{p}} \beta(\epsilon(\vec{p}) - \mu)(\hat{N}_{\vec{p}} + \delta_{\vec{k},\vec{p}})} \right) \qquad \text{(cyclicity of the trace)} \\
&= e^{-\beta(\epsilon(\vec{k}) - \mu)} \operatorname{tr}_{\mathcal{H}^+}\Big( \frac{1}{Y^+} \underbrace{a_{\vec{k}} a_{\vec{k}}^\dagger}_{1 + \hat{N}_{\vec{k}}}\, e^{-\sum_{\vec{p}} \beta(\epsilon(\vec{p}) - \mu)\hat{N}_{\vec{p}}} \Big) \\
&= e^{-\beta(\epsilon(\vec{k}) - \mu)} \big(1 + \bar{n}_{\vec{k}}\big). \tag{5.31}
\end{align*}
Solving this linear equation for $\bar{n}_{\vec{k}}$ yields the Bose-Einstein distribution in (5.32) below.
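For a single bosonic mode, the result of this calculation can be confirmed directly by a (truncated) trace over the Fock space. A quick check with toy parameters of our own choosing:

```python
import numpy as np

beta, mu, eps = 1.0, -0.2, 0.5       # toy values with mu < eps
n = np.arange(200)                   # single-mode Fock space, truncated
w = np.exp(-beta * (eps - mu) * n)   # weights <n| e^{-beta(H - mu N)} |n>
n_bar = np.sum(n * w) / np.sum(w)    # tr(N rho) on the truncated space

# agrees with n_bar = 1 / (e^{beta(eps - mu)} - 1), the solution of (5.31)
assert np.isclose(n_bar, 1.0 / (np.exp(beta * (eps - mu)) - 1.0))
```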
Applying similar arguments in the fermionic case we find for the expected number densities:
(5.32) n ¯ k = 1 e β ( ϵ ( k ) μ ) 1 , for bosons n ¯ k = 1 e β ( ϵ ( k ) μ ) + 1 , for fermions. (5.32) n ¯ k = 1 e β ( ϵ ( k ) μ ) 1 ,  for bosons  n ¯ k = 1 e β ( ϵ ( k ) μ ) + 1 ,  for fermions.  {:(5.32){:[ bar(n)_( vec(k))=(1)/(e^(beta(epsilon( vec(k))-mu))-1)","," for bosons "],[ bar(n)_( vec(k))=(1)/(e^(beta(epsilon( vec(k))-mu))+1)","quad" for fermions. "]:}:}\begin{array}{ll} \bar{n}_{\vec{k}}=\frac{1}{e^{\beta(\epsilon(\vec{k})-\mu)}-1}, & \text { for bosons } \\ \bar{n}_{\vec{k}}=\frac{1}{e^{\beta(\epsilon(\vec{k})-\mu)}+1}, \quad \text { for fermions. } \tag{5.32} \end{array}(5.32)n¯k=1eβ(ϵ(k)μ)1, for bosons n¯k=1eβ(ϵ(k)μ)+1, for fermions. 
These distributions are called the Bose-Einstein distribution and the Fermi-Dirac distribution, respectively. Note that for bosons we have to require that the chemical potential is lower than the ground state energy, $\mu<\epsilon(0)$, in order to avoid a diverging particle number. Also note that the particular form of $\epsilon(\vec{k})$ was not important in the derivation; in particular, (5.31) and (5.32) also hold for relativistic particles (see section 5.4). The classical distribution $\bar{n}_{\vec{k}} \propto e^{-\beta\epsilon(\vec{k})}$ is obtained in the limit $\beta\epsilon(\vec{k})\gg 1$, i.e. $\epsilon(\vec{k})\gg k_{\mathrm{B}}T$, consistent with our experience that quantum effects are usually only important for energies that are small compared to the temperature.
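The crossover between the quantum and classical regimes can be made concrete numerically. The following sketch evaluates (5.32) against the Maxwell-Boltzmann limit in units where $k_{\mathrm{B}}T=1$; the values chosen for $\mu$ and $\epsilon(\vec{k})$ are purely illustrative.

```python
import math

def n_bose(eps, mu, beta):
    """Bose-Einstein occupation (5.32); requires mu < eps."""
    return 1.0 / (math.exp(beta * (eps - mu)) - 1.0)

def n_fermi(eps, mu, beta):
    """Fermi-Dirac occupation (5.32)."""
    return 1.0 / (math.exp(beta * (eps - mu)) + 1.0)

def n_classical(eps, mu, beta):
    """Maxwell-Boltzmann limit, valid for beta*(eps - mu) >> 1."""
    return math.exp(-beta * (eps - mu))

beta, mu = 1.0, -0.5   # units of kB*T = 1; mu chosen below the ground state energy
for eps in (0.1, 1.0, 10.0):
    nb, nf, nc = n_bose(eps, mu, beta), n_fermi(eps, mu, beta), n_classical(eps, mu, beta)
    print(f"eps={eps:5.1f}  bose={nb:.4e}  fermi={nf:.4e}  classical={nc:.4e}")
```

For $\epsilon \gg k_{\mathrm{B}}T$ all three occupations agree to high accuracy, while at low energies the bosonic occupation exceeds and the fermionic occupation undercuts the classical one.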
The mean energy $E_{\pm}$ is given by
\begin{equation*}
E_{\pm}=\langle H\rangle_{\pm}=\sum_{\vec{k}}\left\langle \epsilon(\vec{k})\,\hat{N}_{\vec{k}}\right\rangle_{\pm}=\sum_{\vec{k}} \epsilon(\vec{k})\, \bar{n}_{\vec{k}}^{\pm}. \tag{5.33}
\end{equation*}

5.2 Degeneracy pressure for free fermions

Let us now go back to the canonical ensemble, with density matrix $\rho^{\pm}$ given by
\begin{equation*}
\rho^{\pm}=\frac{1}{Z_{N}^{\pm}}\, \mathcal{P}_{\pm}\, e^{-\beta H_{N}}. \tag{5.34}
\end{equation*}
Our aim is now to calculate the canonical partition function $Z_{N}^{\pm}$ [for more details, see e.g. Ch. 7 of M. Kardar: "Statistical Physics of Particles", Cambridge (2007), which we mostly follow in this section]. Let $|\{\vec{x}\}\rangle_{\pm}$ be an eigenbasis of the position operators. Then, with $\eta \in \{+,-\}$:
where $\Psi_{\vec{k}}(\vec{x}) \in \mathcal{H}_{1}$ are the 1-particle wave functions and
\begin{equation*}
\eta_{\sigma}= \begin{cases} 1 & \text{for bosons} \\ \operatorname{sgn}(\sigma) & \text{for fermions.} \end{cases} \tag{5.36}
\end{equation*}
The sum $\sum'_{\{\vec{k}_{1},\ldots,\vec{k}_{N}\}}$ is restricted in order to ensure that each identical particle state appears only once. We may equivalently work in terms of the occupation number representation $\left|\{n_{\vec{k}}\}\right\rangle_{\pm}$. It is then clear that
\begin{equation*}
\sum_{\{\vec{k}\}}{}^{\prime} = \sum_{\{\vec{k}\}} \frac{\prod_{\vec{k}} n_{\vec{k}}!}{N!}, \tag{5.37}
\end{equation*}
where the factor in the unrestricted sum compensates the over-counting. With the formulas for $c_{\eta}$ derived above, this gives
\begin{equation*}
{}_{\eta}\langle\{\vec{x}'\}|\,\rho\,|\{\vec{x}\}\rangle_{\eta}=\sum_{\{\vec{k}\}} \frac{\prod_{\vec{k}} n_{\vec{k}}!}{N!}\, \frac{1}{\prod_{\vec{k}} n_{\vec{k}}!\, N!} \sum_{\sigma, \sigma' \in S_{N}} \frac{\eta_{\sigma}\,\eta_{\sigma'}}{Z_{N}}\; e^{-\beta \sum_{i} \frac{\hbar^{2} \vec{k}_{i}^{2}}{2m}}\; \Psi_{\sigma'\{\vec{k}\}}^{*}(\{\vec{x}'\})\, \Psi_{\sigma\{\vec{k}\}}(\{\vec{x}\}). \tag{5.38}
\end{equation*}
It is now convenient to work with periodic boundary conditions instead of the Dirichlet boundary conditions used so far, i.e., we require that $\Psi(0,y,z)=\Psi(L,y,z)$, with $L$ the length of the cube, and similarly for the $y$ and $z$ directions. The normalized eigenmodes are then $\Psi_{\vec{k}}=\frac{1}{\sqrt{V}}\, e^{i\vec{k}\cdot\vec{x}}$ with $\vec{k}=\frac{2\pi}{L}(n_{x},n_{y},n_{z})$, where $n_{x},n_{y},n_{z}\in\mathbb{Z}$. Considering that the spacing between two wave vectors is $\frac{2\pi}{L}$ in every direction, we may replace the sum $\sum_{\vec{k}}$ by $\frac{V}{(2\pi)^{3}}\int d^{3}k$ in the limit $V\rightarrow\infty$, which yields
\begin{aligned}
{}_{\eta}\langle\{\vec{x}'\}|\,\rho\,|\{\vec{x}\}\rangle_{\eta} &= \frac{1}{Z_{N}\, N!^{2}}\, \frac{V^{N}}{(2\pi)^{3N}} \sum_{\sigma,\sigma'} \eta_{\sigma}\,\eta_{\sigma'} \int \frac{1}{V^{N}}\, d^{3N}k\; e^{-\beta \sum_{i=1}^{N} \frac{\hbar^{2}\vec{k}_{i}^{2}}{2m}}\; e^{-i\sum_{j=1}^{N}\left(\vec{k}_{\sigma j}\cdot\vec{x}_{j}-\vec{k}_{\sigma' j}\cdot\vec{x}'_{j}\right)} \\
&= \frac{1}{Z_{N}\, N!^{2}} \sum_{\sigma,\sigma'} \eta_{\sigma}\,\eta_{\sigma'} \prod_{j} \int \frac{d^{3}k}{(2\pi)^{3}}\; e^{-i\vec{k}\cdot\left(\vec{x}_{\sigma j}-\vec{x}'_{\sigma' j}\right)}\; e^{-\beta\frac{\hbar^{2}\vec{k}^{2}}{2m}}.
\end{aligned}
The Gaussian integrals can be explicitly performed, giving the result
\begin{equation*}
\frac{1}{\lambda^{3}}\; e^{-\frac{\pi}{\lambda^{2}}\left(\vec{x}_{\sigma j}-\vec{x}'_{\sigma' j}\right)^{2}}
\end{equation*}
with the thermal de Broglie wavelength $\lambda=\frac{h}{\sqrt{2\pi m k_{\mathrm{B}}T}}$. Relabeling the summation indices then results in
\begin{equation*}
{}_{\eta}\langle\{\vec{x}'\}|\,\rho\,|\{\vec{x}\}\rangle_{\eta}=\frac{1}{Z_{N}\,\lambda^{3N}\, N!} \sum_{\sigma} \eta_{\sigma}\; e^{-\frac{\pi}{\lambda^{2}}\sum_{j}\left(\vec{x}'_{j}-\vec{x}_{\sigma j}\right)^{2}}. \tag{5.39}
\end{equation*}
Setting $\vec{x}'=\vec{x}$, integrating both sides over $\int d^{3N}x$, and using $\operatorname{tr}\rho \stackrel{!}{=} 1$ gives:
\begin{equation*}
Z_{N}=\frac{1}{N!\,\lambda^{3N}} \int d^{3N}x \sum_{\sigma\in S_{N}} \eta_{\sigma}\; e^{-\frac{\pi}{\lambda^{2}}\sum_{j}\left(\vec{x}_{j}-\vec{x}_{\sigma j}\right)^{2}}. \tag{5.40}
\end{equation*}
The terms with $\sigma\neq\mathrm{id}$ are suppressed for $\lambda\rightarrow 0$ (i.e. for $h\rightarrow 0$ or $T\rightarrow\infty$), so the leading-order contribution comes from $\sigma=\mathrm{id}$. The next-to-leading-order corrections come from those $\sigma$ having precisely one transposition (there are $\frac{N(N-1)}{2}$ of them). A permutation with precisely one transposition corresponds to an exchange of two particles. Neglecting next-to-next-to-leading-order corrections, the canonical partition function is given by
\begin{align*}
Z_{N} &= \frac{1}{N!\,\lambda^{3N}} \int d^{3N}x \left[1+\frac{N(N-1)}{2}\,\eta\; e^{-\frac{2\pi}{\lambda^{2}}\left(\vec{x}_{1}-\vec{x}_{2}\right)^{2}}+\ldots\right] \\
&= \frac{1}{N!}\left(\frac{V}{\lambda^{3}}\right)^{N}\left[1+\frac{N(N-1)}{2V}\,\eta \int d^{3}r\; e^{-\frac{2\pi}{\lambda^{2}}\vec{r}^{2}}+\ldots\right] \tag{5.41}\\
&= \frac{1}{N!}\left(\frac{V}{\lambda^{3}}\right)^{N}\left[1+\frac{N(N-1)}{2V}\,\eta\, \frac{\lambda^{3}}{2^{\frac{3}{2}}}+\ldots\right].
\end{align*}
The free energy $F:=-k_{\mathrm{B}}T\log Z_{N}$ is now calculated as
\begin{equation*}
F=\underbrace{-N k_{\mathrm{B}} T \log\left[\frac{e}{\lambda^{3}}\cdot\frac{V}{N}\right]}_{\text{using } N!\,\approx\, N^{N}e^{-N}} \;-\; \underbrace{\frac{k_{\mathrm{B}}T\, N^{2}}{2V}\,\frac{\lambda^{3}}{2^{\frac{3}{2}}}\,\eta}_{\text{using } \log(1+\epsilon)\,\approx\,\epsilon} \;+\;\ldots \tag{5.42}
\end{equation*}
Together with the following relation for the pressure (cf. (4.17)),
\begin{equation*}
P=-\left.\frac{\partial F}{\partial V}\right|_{T}, \tag{5.43}
\end{equation*}
it follows that
\begin{equation*}
P=n\, k_{\mathrm{B}} T\left(1-\eta\, n\,\frac{\lambda^{3}}{2^{\frac{5}{2}}}+\ldots\right), \tag{5.44}
\end{equation*}
where $n=\frac{N}{V}$ is the particle density. Comparing to the classical ideal gas, where we had $P=n k_{\mathrm{B}}T$, we see that when $n\lambda^{3}$ is of order 1, quantum effects significantly increase the pressure for fermions ($\eta=-1$), while they decrease the pressure for bosons ($\eta=+1$). As we can see by comparing the expression (5.41) with the leading-order term in the cluster expansion of the classical gas (see chapter 4.6), this effect is also present, to leading order, in a classical gas with a 2-body potential $\mathcal{V}(\vec{r})$ such that
\begin{equation*}
e^{-\beta\mathcal{V}(\vec{r})}-1=\eta\; e^{-\frac{2\pi\vec{r}^{2}}{\lambda^{2}}} \quad (\text{from } (5.41)). \tag{5.45}
\end{equation*}
It follows that the potential $\mathcal{V}(\vec{r})$ is given by
\begin{equation*}
\mathcal{V}(\vec{r})=-k_{\mathrm{B}}T\,\log\left[1+\eta\, e^{-\frac{2\pi\vec{r}^{2}}{\lambda^{2}}}\right] \approx -k_{\mathrm{B}}T\,\eta\, e^{-\frac{2\pi\vec{r}^{2}}{\lambda^{2}}}, \quad \text{for } r \gtrsim \lambda. \tag{5.46}
\end{equation*}
A sketch of $\mathcal{V}(\vec{r})$ is given in the following picture:
Figure 5.1: The potential $\mathcal{V}(\vec{r})$ occurring in (5.46).
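For a rough numerical feel for this curve, the following sketch evaluates (5.46) at a few separations, measuring lengths in units of $\lambda$; the temperature chosen is arbitrary since only $\mathcal{V}/k_{\mathrm{B}}T$ matters. The bosonic potential comes out attractive, the fermionic one repulsive, as the sign of $\eta$ in (5.46) suggests.

```python
import math

kB = 1.381e-23  # Boltzmann constant in J/K (rounded)

def V_eff(r, lam, eta, T):
    """Effective statistical potential (5.46): eta=+1 for bosons, eta=-1 for fermions."""
    return -kB * T * math.log(1.0 + eta * math.exp(-2.0 * math.pi * r**2 / lam**2))

T, lam = 300.0, 1.0  # arbitrary temperature; lengths in units of lambda
for r in (0.5, 1.0, 2.0):
    vb = V_eff(r, lam, +1, T) / (kB * T)
    vf = V_eff(r, lam, -1, T) / (kB * T)
    print(f"r/lambda={r:3.1f}  V_bosons/kBT={vb:+.4f}  V_fermions/kBT={vf:+.4f}")
```

Both curves decay on the scale of $\lambda$, which is why the effective interaction disappears in the classical limit $\lambda\to 0$.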
Thus, we can say that quantum effects lead to an effective potential: attractive for bosons, repulsive for fermions. For fermions the resulting correction to the pressure $P$ in (5.44) is called degeneracy pressure. Note that according to (5.44) the degeneracy pressure is proportional to $k_{\mathrm{B}}T\, n^{2}\lambda^{3}$ for fermions, so it increases strongly with increasing density $n$. It provides a mechanism to support very dense objects against gravitational collapse, e.g. in neutron stars.
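To gauge when the correction in (5.44) actually matters, one can evaluate the degeneracy parameter $n\lambda^{3}$ for a concrete gas. The numbers below (helium-4 at 4 K and 1 atm, with rounded constants) are illustrative assumptions, not values from the text:

```python
import math

h, kB, amu = 6.626e-34, 1.381e-23, 1.661e-27  # SI values, rounded

def thermal_wavelength(m, T):
    """Thermal de Broglie wavelength lambda = h / sqrt(2 pi m kB T)."""
    return h / math.sqrt(2.0 * math.pi * m * kB * T)

# Illustrative choice: helium-4 gas at T = 4 K, P = 1 atm (ideal-gas density)
T, P, m = 4.0, 1.013e5, 4.0 * amu
n = P / (kB * T)                 # particle density from P = n kB T
lam = thermal_wavelength(m, T)
x = n * lam**3                   # degeneracy parameter appearing in (5.44)
print(f"lambda = {lam:.3e} m,  n*lambda^3 = {x:.3f}")
print(f"relative pressure correction n*lambda^3 / 2^(5/2) = {x / 2**2.5:.4f}")
```

With these inputs $n\lambda^{3}\approx 0.15$, so the quantum correction to the pressure is at the percent level; at room temperature the same gas has $n\lambda^{3}\ll 1$ and behaves classically.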

5.3 Spin Degeneracy

For particles with spin the energy levels have a corresponding $g$-fold degeneracy. Since different spin states have the same energy, the Hamiltonian is now given by
\begin{equation*}
H=\sum_{\vec{k},s}\epsilon(\vec{k})\; a_{\vec{k},s}^{\dagger}\, a_{\vec{k},s}, \quad s=1,\ldots,g=2S+1, \tag{5.47}
\end{equation*}
where the creation/annihilation operators $a_{\vec{k},s}^{\dagger}$ and $a_{\vec{k},s}$ fulfill the (anti)commutation relations
\begin{equation*}
\left[a_{\vec{k},s},\, a_{\vec{k}',s'}^{\dagger}\right]=\delta_{\vec{k},\vec{k}'}\,\delta_{s,s'}, \qquad \left\{a_{\vec{k},s},\, a_{\vec{k}',s'}^{\dagger}\right\}=\delta_{\vec{k},\vec{k}'}\,\delta_{s,s'} \tag{5.48}
\end{equation*}
for bosons/fermions respectively. For the grand canonical ensemble the Hilbert space of particles with spin is given by
\begin{equation*}
\mathcal{H}^{\pm}=\bigoplus_{N\geqslant 0}\mathcal{H}_{N}^{\pm}, \qquad \mathcal{H}_{1}=L^{2}\left(V,d^{3}x\right)\otimes\mathbb{C}^{g}. \tag{5.49}
\end{equation*}
It is easy to see that for the grand canonical ensemble this results in the following expressions for the expected number densities $\bar{n}_{\vec{k}}$ and the mean energy $E_{\pm}$:
\begin{align*}
\bar{n}_{\vec{k}}^{\pm} &= \left\langle\hat{N}_{\vec{k}}\right\rangle_{\pm}=\frac{g}{e^{\beta(\epsilon(\vec{k})-\mu)}\mp 1}, \tag{5.50}\\
E_{\pm} &= \langle H\rangle_{\pm}=g\sum_{\vec{k}}\frac{\epsilon(\vec{k})}{e^{\beta(\epsilon(\vec{k})-\mu)}\mp 1}. \tag{5.51}
\end{align*}
In the canonical ensemble we find similar expressions. For a non-relativistic gas we get, with $\sum_{\vec{k}}\rightarrow V\int\frac{d^{3}k}{(2\pi)^{3}}$ for $V\rightarrow\infty$:
\begin{equation*}
\epsilon_{\pm}:=\frac{E_{\pm}}{V}=g\int\frac{d^{3}k}{(2\pi)^{3}}\;\frac{\hbar^{2}k^{2}}{2m}\;\frac{1}{e^{\beta\left(\frac{\hbar^{2}k^{2}}{2m}-\mu\right)}\mp 1}. \tag{5.52}
\end{equation*}
Setting $x=\frac{\hbar^{2}k^{2}}{2m k_{\mathrm{B}}T}$, or equivalently $k=\frac{2\pi^{1/2}}{\lambda}\, x^{1/2}$, and defining the fugacity $z:=e^{\beta\mu}$, we find
\begin{equation*}
\frac{\epsilon_{\pm}}{k_{\mathrm{B}}T}=\frac{g}{\lambda^{3}}\,\frac{2}{\sqrt{\pi}}\int_{0}^{\infty}\frac{dx\; x^{3/2}}{z^{-1}e^{x}\mp 1}, \tag{5.53}
\end{equation*}
or similarly
\begin{equation*}
\bar{n}_{\pm}=\frac{\langle\hat{N}\rangle_{\pm}}{V}=\frac{g}{\lambda^{3}}\,\frac{2}{\sqrt{\pi}}\int_{0}^{\infty}\frac{dx\; x^{1/2}}{z^{-1}e^{x}\mp 1}. \tag{5.54}
\end{equation*}
Furthermore, we also have the following relation between the pressure $P_{\pm}$ and the grand canonical potential $G_{\pm}=-k_{\mathrm{B}}T\log Y^{\pm}$ (cf. section 4.4):
\begin{equation*}
P_{\pm}=-\left.\frac{\partial G_{\pm}}{\partial V}\right|_{T,\mu}. \tag{5.55}
\end{equation*}
From (5.27), (5.28) it follows that in the case of spin degeneracy the grand canonical partition function $Y^{\pm}$ is given by
\begin{equation*}
Y^{\pm}=\left[\prod_{\vec{k}}\left(1\mp z\, e^{-\beta\epsilon(\vec{k})}\right)\right]^{\mp g}. \tag{5.56}
\end{equation*}
Taking the logarithm on both sides and taking the large-volume limit $V\rightarrow\infty$ to approximate the sum by an integral as before yields
\begin{align*}
\frac{P_{\pm}}{k_{\mathrm{B}}T} &= \mp\, g\int\frac{d^{3}k}{(2\pi)^{3}}\,\log\left[1\mp z\, e^{-\frac{\hbar^{2}k^{2}}{2m k_{\mathrm{B}}T}}\right] \\
&= \frac{g}{\lambda^{3}}\,\frac{4}{3\sqrt{\pi}}\int_{0}^{\infty}\frac{dx\; x^{3/2}}{z^{-1}e^{x}\mp 1}. \tag{5.57}
\end{align*}
To go to the last line, we used an integration by parts in $x$. For $z\ll 1$, i.e. $\beta\mu=\frac{\mu}{k_{\mathrm{B}}T}\ll 0$, one can expand $\bar{n}_{\pm}$ in $z$ around $z=0$. Using the relation
\begin{equation*}
\int_{0}^{\infty}\frac{dx\; x^{m-1}}{z^{-1}e^{x}-\eta}=\eta\,(m-1)!\,\sum_{n=1}^{\infty}\frac{(\eta z)^{n}}{n^{m}}
\end{equation*}
(which for η z = 1 η z = 1 eta z=1\eta z=1ηz=1 yields the Riemann ζ ζ zeta\zetaζ-function), one finds that
\begin{align*}
\frac{\bar{n}_{\pm}\lambda^{3}}{g} &= z\pm\frac{z^{2}}{2^{3/2}}+\frac{z^{3}}{3^{3/2}}\pm\frac{z^{4}}{4^{3/2}}+\ldots \tag{5.58}\\
\frac{\beta P_{\pm}\lambda^{3}}{g} &= z\pm\frac{z^{2}}{2^{5/2}}+\frac{z^{3}}{3^{5/2}}\pm\frac{z^{4}}{4^{5/2}}+\ldots \tag{5.59}
\end{align*}
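As a sanity check, the integral identity used above can be verified numerically; note that for non-integer $m$, $(m-1)!$ is to be read as $\Gamma(m)$. The midpoint-rule quadrature below is a quick sketch added here, not part of the original notes.

```python
import math

def lhs(m, z, eta, xmax=40.0, nsteps=100000):
    """Midpoint-rule evaluation of int_0^inf x^(m-1) / (z^-1 e^x - eta) dx."""
    h = xmax / nsteps
    total = 0.0
    for i in range(1, nsteps + 1):   # integrand vanishes at both endpoints
        x = (i - 0.5) * h
        total += x**(m - 1.0) / (math.exp(x) / z - eta)
    return total * h

def rhs(m, z, eta, nterms=200):
    """eta * Gamma(m) * sum_{n>=1} (eta z)^n / n^m, i.e. the series side."""
    return eta * math.gamma(m) * sum((eta * z)**n / n**m for n in range(1, nterms + 1))

for eta in (+1, -1):
    for m in (1.5, 2.5):
        print(f"eta={eta:+d} m={m}: integral={lhs(m, 0.3, eta):.8f} "
              f"series={rhs(m, 0.3, eta):.8f}")
```

The cases $m=\tfrac{3}{2}$ and $m=\tfrac{5}{2}$ are exactly the integrals appearing in (5.54) and (5.57), which is how the expansions (5.58), (5.59) arise.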
Solving (5.58) for z z zzz and substituting into (5.59) gives
\begin{equation*}
P_{\pm}=\bar{n}_{\pm}\, k_{\mathrm{B}}T\left[1\mp\frac{1}{2^{5/2}}\left(\frac{\bar{n}_{\pm}\lambda^{3}}{g}\right)+\ldots\right], \tag{5.60}
\end{equation*}
which for $g=1$ gives the same result for the degeneracy pressure that we obtained previously in (5.44). Note again the "+" sign for fermions.
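The inversion of (5.58) into (5.60) can be checked numerically: truncate (5.58) and (5.59) at order $z^{4}$, pick a small fugacity, and compare $P/(\bar{n} k_{\mathrm{B}}T)$ with the prediction of (5.60). This small-$z$ sketch (working in units $\beta = \lambda = g = 1$) is an addition for illustration:

```python
def n_series(z, eta):
    """(5.58): n lambda^3 / g, truncated at O(z^4); eta=+1 bosons, eta=-1 fermions."""
    return z + eta * z**2 / 2**1.5 + z**3 / 3**1.5 + eta * z**4 / 4**1.5

def p_series(z, eta):
    """(5.59): beta P lambda^3 / g, truncated at O(z^4)."""
    return z + eta * z**2 / 2**2.5 + z**3 / 3**2.5 + eta * z**4 / 4**2.5

z = 1e-3
for eta in (+1, -1):
    n, p = n_series(z, eta), p_series(z, eta)
    # (5.60) predicts p/n = 1 - eta * n / 2^(5/2) + O(n^2)
    print(f"eta={eta:+d}: p/n - 1 = {p / n - 1.0:.6e}, "
          f"predicted = {-eta * n / 2**2.5:.6e}")
```

The two columns agree up to $O(n^{2})$ terms, confirming the sign pattern: pressure enhanced for fermions, reduced for bosons.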

5.4 Black Body Radiation

We know that the dispersion relation for photons is given by (note that the momentum is $\vec{p}=\hbar\vec{k}$):
\begin{equation*}
\epsilon(\vec{k})=\hbar c\,|\vec{k}|. \tag{5.61}
\end{equation*}
There are two possibilities for the helicity ("spin") of a photon, which is either parallel or anti-parallel to $\vec{p}$, corresponding to the polarization of the light. Hence, the degeneracy factor for photons is $g=2$ and the Hamiltonian is given by
\begin{equation*}
H=\sum_{\vec{p}\neq 0,\; s=\pm 1}\epsilon(\vec{p})\; a_{\vec{p},s}^{\dagger}\, a_{\vec{p},s}+\underbrace{\ldots}_{\text{interaction}} \tag{5.62}
\end{equation*}
Under normal circumstances there is practically no interaction between the photons, so the interaction terms indicated by "$\ldots$" can be neglected in the previous formula. The following picture is a sketch of a 4-photon interaction, where $\sigma$ denotes the cross section for the corresponding $2\to 2$ scattering process, obtained from the computational rules of quantum electrodynamics:
Figure 5.2: Lowest-order Feynman diagram for photon-photon scattering in Quantum Electrodynamics.
The mean collision time of the photons is given by
\begin{equation*}
\frac{1}{\tau}=\frac{c\,\sigma\, N}{V}=c\,\sigma\, n\approx 10^{-44}\,\frac{\mathrm{cm}^{3}}{\mathrm{s}}\times n, \tag{5.63}
\end{equation*}
where $N=\langle\hat{N}\rangle$ is the average number of photons inside $V$ and $n=N/V$ their density. Even in extreme places like the interior of the sun, where $T\approx 10^{7}\,\mathrm{K}$, this leads to a mean collision time of $10^{18}\,\mathrm{s}$. This is more than the age of the universe, which is approximately $10^{17}\,\mathrm{s}$. From this we conclude that we can safely treat the photons as an ideal gas!
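A rough version of this estimate can be reproduced numerically using the equilibrium photon density $n=\frac{2\zeta(3)}{\pi^{2}}\left(\frac{k_{\mathrm{B}}T}{\hbar c}\right)^{3}$ (for $g=2$) together with the quoted value $c\sigma\approx 10^{-44}\,\mathrm{cm^{3}/s}$. The true cross section depends strongly on the photon energy, so this is an order-of-magnitude sketch only; depending on the assumed density it gives a $\tau$ somewhere at or above the quoted $10^{18}\,\mathrm{s}$, in any case far beyond the age of the universe.

```python
import math

kB, hbar, c = 1.381e-23, 1.055e-34, 2.998e8   # SI values, rounded
zeta3 = 1.2020569                              # Riemann zeta(3)

T = 1e7                                        # K, roughly the solar interior
# equilibrium photon number density for g = 2, in m^-3
n = 2.0 * zeta3 / math.pi**2 * (kB * T / (hbar * c))**3
n_cm3 = n * 1e-6                               # convert to cm^-3
c_sigma = 1e-44                                # cm^3/s, the value quoted in (5.63)
tau = 1.0 / (c_sigma * n_cm3)                  # mean collision time in seconds

print(f"n = {n_cm3:.2e} photons/cm^3, tau = {tau:.2e} s (age of universe ~ 1e17 s)")
```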
By the methods of the previous subsection we find for the grand canonical partition function, with $\mu=0$:
\begin{equation*}
Y=\operatorname{tr}\left(e^{-\beta H}\right)=\left[\prod_{\vec{p}\neq 0}\frac{1}{1-e^{-\beta\epsilon(\vec{p})}}\right]^{2}, \tag{5.64}
\end{equation*}
since the degeneracy factor is $g=2$ and photons are bosons. For the grand canonical potential $G=-k_{\mathrm{B}}T\log Y$ (in the limit $V\rightarrow\infty$) we get${}^{1}$
\begin{align*}
G &= -k_{\mathrm{B}}T\log Y = \frac{2V}{\beta}\int\frac{d^{3}p}{(2\pi\hbar)^{3}}\,\log\left(1-e^{-\beta c p}\right) = \frac{V\left(k_{\mathrm{B}}T\right)^{4}}{\pi^{2}(\hbar c)^{3}}\underbrace{\int_{0}^{\infty}dx\; x^{2}\log\left(1-e^{-x}\right)}_{=\,-2\zeta(4)\,=\,-\frac{\pi^{4}}{45}} \\
&= \frac{V\left(k_{\mathrm{B}}T\right)^{4}}{\pi^{2}(\hbar c)^{3}}\left(-\frac{1}{3}\right)\int_{0}^{\infty}\frac{dx\; x^{3}}{e^{x}-1} = -\frac{V\left(k_{\mathrm{B}}T\right)^{4}}{(\hbar c)^{3}}\,\frac{\pi^{2}}{45} \\
&\Rightarrow\quad G=-\frac{4\sigma}{3c}\, V T^{4}. \tag{5.66}
\end{align*}
Here, $\sigma=5.67\times 10^{-8}\,\frac{\mathrm{J}}{\mathrm{s}\,\mathrm{m}^{2}\,\mathrm{K}^{4}}$ is the Stefan-Boltzmann constant.
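Both ingredients of (5.66) are easy to check numerically: the Bose integral equals $\pi^{4}/15=6\zeta(4)$, and the combination $\sigma=\frac{\pi^{2}k_{\mathrm{B}}^{4}}{60\hbar^{3}c^{2}}$ implied by (5.66) reproduces the Stefan-Boltzmann value. The quadrature routine here is a simple midpoint rule added for illustration.

```python
import math

def bose_integral(xmax=40.0, nsteps=100000):
    """Midpoint-rule evaluation of int_0^inf x^3 / (e^x - 1) dx."""
    h = xmax / nsteps
    return sum(((i - 0.5) * h)**3 / (math.exp((i - 0.5) * h) - 1.0)
               for i in range(1, nsteps + 1)) * h

# CODATA values of kB, hbar, c in SI units
kB, hbar, c = 1.380649e-23, 1.054572e-34, 2.99792458e8
sigma = math.pi**2 * kB**4 / (60.0 * hbar**3 * c**2)

print(f"integral = {bose_integral():.6f}, pi^4/15 = {math.pi**4 / 15:.6f}")
print(f"sigma = {sigma:.4e} W m^-2 K^-4")   # should be close to 5.67e-8
```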
The entropy was defined as $S:=-k_{\mathrm{B}}\operatorname{tr}(\rho\log\rho)$, with $\rho=\frac{1}{Y}e^{-\beta H}$. One easily finds the relation
\begin{equation*}
S=-\left.\frac{\partial G}{\partial T}\right|_{V,\mu=0}=\left.\frac{\partial}{\partial T}\left(k_{\mathrm{B}}T\log Y\right)\right|_{V,\mu=0} \tag{5.67}
\end{equation*}
(see chapter 6.5 for a systematic review of such formulas) or
\begin{equation*}
\Rightarrow\quad S=\frac{16\sigma}{3c}\, V T^{3}. \tag{5.68}
\end{equation*}
The mean energy E E EEE is found as
\begin{gather*}
E=\langle H\rangle=2\sum_{\vec{p}\neq 0}\epsilon(\vec{p})\,\frac{1}{e^{\beta\epsilon(\vec{p})}-1}=2V\int\frac{d^{3}p}{(2\pi\hbar)^{3}}\,\frac{c\,|\vec{p}|}{e^{\beta c|\vec{p}|}-1} \\
\Rightarrow\quad E=\frac{4\sigma}{c}\, V T^{4}. \tag{5.69}
\end{gather*}
Finally, the pressure $P$ can be calculated as
$$
\begin{equation*}
P=-\left.\frac{\partial G}{\partial V}\right|_{T, \mu=0}=\left.\frac{\partial}{\partial V}\left(\mathrm{k}_{\mathrm{B}} T \log Y\right)\right|_{T, \mu=0} \tag{5.70}
\end{equation*}
$$
see again chapter 6.5 for a systematic review of such formulas. This gives
$$
\begin{equation*}
\Rightarrow \quad P=\frac{4 \sigma}{3 c} T^{4} \tag{5.71}
\end{equation*}
$$
Here and in the following, $\zeta$ denotes the Riemann zeta function,

$$
\begin{equation*}
\zeta(s)=\sum_{n \geqslant 1} n^{-s}, \quad \text{for } \operatorname{Re}(s)>1. \tag{5.65}
\end{equation*}
$$

As an example, for the sun, with $T_{\text{sun}}=10^{7}\,\mathrm{K}$, the pressure is $P \approx 25{,}000{,}000\,\mathrm{atm}$, while for a H-bomb, with $T_{\text{bomb}}=10^{5}\,\mathrm{K}$, the pressure is $P \approx 0.25\,\mathrm{atm}$. From (5.69) and (5.71) one obtains

$$
\begin{equation*}
P=\frac{1}{3} \frac{E}{V} \quad \Leftrightarrow \quad E=3 P V \tag{5.72}
\end{equation*}
$$
This is also known as the Stefan-Boltzmann law.
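As a quick numerical sanity check of (5.71) and the examples above, a minimal sketch (the rounded SI values of $\sigma$, $c$ and $1\,\mathrm{atm}$ are our assumptions, not taken from the text):

```python
sigma = 5.67e-8    # Stefan-Boltzmann constant, J s⁻¹ m⁻² K⁻⁴
c = 2.998e8        # speed of light, m/s
atm = 1.013e5      # 1 atm in Pa

def radiation_pressure(T):
    """Eq. (5.71): P = 4σT⁴/(3c)."""
    return 4.0 * sigma * T**4 / (3.0 * c)

P_sun = radiation_pressure(1e7) / atm    # ≈ 2.5e7 atm, as quoted for the sun
P_bomb = radiation_pressure(1e5) / atm   # ≈ 0.25 atm, as quoted for the H-bomb
print(P_sun, P_bomb)
```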
Let $u(\nu)$ be the spectral energy density, i.e., $u(\nu)\, d\nu$ is the contribution to the total energy density due to radiation in the range of frequencies $[\nu, \nu+d\nu]$. To derive an expression for this, we recall that the average number of photons with wave vector $\vec{k}$ is
$$
\begin{equation*}
\bar{n}_{\vec{k}}=\frac{2}{e^{\beta c \hbar k}-1}. \tag{5.73}
\end{equation*}
$$
Hence, taking the usual conversion from discrete to continuous wave vectors into account, the expected number of photons with wave vector in the range $[\vec{k}, \vec{k}+d\vec{k}]$ is $\frac{2}{e^{\beta c \hbar k}-1} V \frac{d^{3} k}{(2 \pi)^{3}}$. For the expected number of photons with the modulus of the wave vector in the range $[k, k+dk]$, we thus get $\frac{V}{\pi^{2}} \frac{k^{2}}{e^{\beta c \hbar k}-1}\, dk$. The frequency of a wave with wave vector $\vec{k}$ is $\nu=\frac{c}{2 \pi} k$, so that the number of photons in the frequency range $[\nu, \nu+d\nu]$ is $\frac{8 \pi V}{c^{3}} \frac{\nu^{2}}{e^{\beta h \nu}-1}\, d\nu$. Multiplying by the energy $h \nu$ per photon and dividing by $V$, we obtain the Planck distribution
$$
\begin{equation*}
u(\nu)=\frac{8 \pi h}{c^{3}} \frac{\nu^{3}}{e^{\frac{h \nu}{\mathrm{k}_{\mathrm{B}} T}}-1} \tag{5.74}
\end{equation*}
$$
This is the famous law found by Planck in 1900, which led to the development of quantum theory! The Planck distribution looks as follows:
Figure 5.3: Sketch of the Planck distribution for different temperatures.
This can be measured by drilling a hole in a cavity and measuring the spectral intensity of the outgoing radiation. An almost perfect black body spectrum is observed in the cosmic microwave background, at $T \simeq 2.7\,\mathrm{K}$.
Solving $u^{\prime}\left(\nu_{\max}\right)=0$ one finds that the maximum of $u(\nu)$ lies at $h \nu_{\max} \approx 2.82\, \mathrm{k}_{\mathrm{B}} T$, a relation also known as Wien's law. The following limiting cases are noteworthy:
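The numerical constant in Wien's law comes from maximizing $x^{3}/(e^{x}-1)$ with $x=h\nu/\mathrm{k_B}T$: setting the derivative to zero yields the transcendental equation $x=3(1-e^{-x})$, which a simple fixed-point iteration solves (a minimal sketch):

```python
import math

# u(ν) ∝ x³/(eˣ - 1): the condition u'(ν_max) = 0 reduces to x = 3(1 - e^{-x})
x = 3.0
for _ in range(100):   # fixed-point iteration; the map is a contraction near the root
    x = 3.0 * (1.0 - math.exp(-x))

print(x)   # ≈ 2.82, so h ν_max ≈ 2.82 k_B T
```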
(i) $h \nu \ll \mathrm{k}_{\mathrm{B}} T$:
In this case we have
$$
\begin{equation*}
u(\nu) \approx \frac{8 \pi \mathrm{k}_{\mathrm{B}} T \nu^{2}}{c^{3}} \tag{5.75}
\end{equation*}
$$
This formula is valid in particular for $h \rightarrow 0$, i.e. it represents the classical limit. It was known before the Planck formula. It is not only inaccurate for larger frequencies but also fundamentally problematic, since it suggests $\langle H\rangle=E \propto \int d\nu\, u(\nu)=\infty$, which indicates an instability not seen in reality.
(ii) $h \nu \gg \mathrm{k}_{\mathrm{B}} T$:
In this case we have
$$
\begin{equation*}
u(\nu) \approx \frac{8 \pi h \nu^{3}}{c^{3}} e^{-\frac{h \nu}{\mathrm{k}_{\mathrm{B}} T}} \tag{5.76}
\end{equation*}
$$
This formula had been found empirically by Wien without a proper interpretation of the constants (and in particular without identifying $h$).
We can also calculate the mean total particle number:
$$
\begin{align*}
\langle\hat{N}\rangle &=\sum_{\vec{p} \neq 0} \frac{2}{e^{\beta c|\vec{p}|}-1} \approx 2 V \int \frac{d^{3} p}{(2 \pi \hbar)^{3}} \frac{1}{e^{\beta c|\vec{p}|}-1} \\
&=\frac{2 \zeta(3)}{\pi^{2}} V\left(\frac{\mathrm{k}_{\mathrm{B}} T}{\hbar c}\right)^{3} \tag{5.77}
\end{align*}
$$
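Formula (5.77) can be applied to the cosmic microwave background mentioned above. The sketch below (rounded SI constants are our assumptions) reproduces the well-known figure of roughly 400 CMB photons per cubic centimetre:

```python
import math

hbar = 1.0546e-34   # J s
c = 2.998e8         # m/s
kB = 1.3806e-23     # J/K

zeta3 = sum(1.0 / n**3 for n in range(1, 100_000))   # ζ(3) ≈ 1.202

def photon_density(T):
    """Eq. (5.77): n = (2ζ(3)/π²) (k_B T / ħc)³, photons per m³."""
    return 2.0 * zeta3 / math.pi**2 * (kB * T / (hbar * c))**3

n_cmb = photon_density(2.725) * 1e-6   # photons per cm³
print(n_cmb)                           # ≈ 410
```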
Combining this formula with that for the entropy $S$, eq. (5.68), gives the relation
$$
\begin{equation*}
S=\frac{2 \pi^{4}}{45\, \zeta(3)} \mathrm{k}_{\mathrm{B}} N \approx 3.6\, N \mathrm{k}_{\mathrm{B}} \tag{5.78}
\end{equation*}
$$
where $N \equiv\langle\hat{N}\rangle$ is the mean total particle number from above. Thus, for an ideal photon gas we have $S=\mathcal{O}(1)\, \mathrm{k}_{\mathrm{B}} N$, i.e. each photon contributes an amount of order one to $\frac{S}{\mathrm{k}_{\mathrm{B}}}$ on average (see problem B.18 for an application of this elementary relation).
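The numerical value of the entropy per photon follows directly from the coefficients in (5.68) and (5.77): writing both as multiples of $V\left(\mathrm{k_B}T/\hbar c\right)^{3}$, with $S=(4\pi^{2}/45)\,\mathrm{k_B}V(\mathrm{k_B}T/\hbar c)^{3}$ from (5.66) and (5.68), the ratio is a pure number. A minimal check:

```python
import math

zeta3 = sum(1.0 / n**3 for n in range(1, 100_000))   # ζ(3) ≈ 1.202

S_coeff = 4.0 * math.pi**2 / 45.0    # S / (k_B V (k_B T/ħc)³), from (5.66), (5.68)
N_coeff = 2.0 * zeta3 / math.pi**2   # N / (V (k_B T/ħc)³), from (5.77)

entropy_per_photon = S_coeff / N_coeff   # = 2π⁴ / (45 ζ(3))
print(entropy_per_photon)                # ≈ 3.60
```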

5.5 Degenerate Bose Gas

Ideal quantum gases of bosonic particles show a particular behavior for low temperature $T$ and large particle number densities $n=\frac{\langle\hat{N}\rangle}{V}$. We first discuss the ideal Bose gas in a finite volume. In this case, the expected particle density was given by
$$
\begin{equation*}
n=\frac{\langle\hat{N}\rangle}{V}=\frac{g}{V} \sum_{\vec{k}} \frac{1}{e^{\beta(\epsilon(\vec{k})-\mu)}-1}. \tag{5.79}
\end{equation*}
$$
The sum is calculated for sufficiently large volumes again by replacing $\sum_{\vec{k}}$ by $V \int \frac{d^{3} k}{(2 \pi)^{3}}$, which yields
$$
\begin{align*}
n & \approx g \int \frac{d^{3} k}{(2 \pi)^{3}} \frac{1}{e^{\beta(\epsilon(\vec{k})-\mu)}-1} \\
& =\frac{g}{2 \pi^{2}} \int_{0}^{\infty} dk \frac{k^{2}}{e^{\beta(\epsilon(k)-\mu)}-1} \tag{5.80}
\end{align*}
$$
The particle density is clearly maximal for $\mu \rightarrow 0$ and its maximal value is given by $n_{c}$ where, with $\epsilon(k)=\frac{\hbar^{2} k^{2}}{2 m}$,
$$
\begin{align*}
n_{c} & =\frac{g}{2 \pi^{2}} \int_{0}^{\infty} dk \frac{k^{2}}{e^{\beta \epsilon(k)}-1} =\frac{g}{2 \pi^{2}}\left(\frac{2 m}{\beta \hbar^{2}}\right)^{\frac{3}{2}} \int_{0}^{\infty} \frac{dx\, x^{2}}{e^{x^{2}}-1} \\
& =\frac{g}{2 \pi^{2}}\left(\frac{2 m}{\beta \hbar^{2}}\right)^{\frac{3}{2}} \sum_{n=1}^{\infty} \int_{0}^{\infty} dx\, x^{2} e^{-n x^{2}} =\frac{g}{\lambda^{3}} \zeta\left(\frac{3}{2}\right)
\end{align*}
$$
and where $\lambda=\sqrt{\frac{h^{2}}{2 \pi m \mathrm{k}_{\mathrm{B}} T}}$ is the thermal de Broglie wavelength. From this we see that $n \leqslant n_{c}$. For a given density $n$, the minimal temperature is (for $T<T_{c}$, we would have $n>n_{c}$)
$$
\begin{equation*}
T_{c}=\frac{h^{2}}{2 \pi m \mathrm{k}_{\mathrm{B}}}\left(\frac{n}{g\, \zeta\left(\frac{3}{2}\right)}\right)^{\frac{2}{3}} \tag{5.81}
\end{equation*}
$$
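To get a feel for the scale of (5.81), one can evaluate it with liquid-helium-like numbers. The sketch below is an illustration under assumptions not taken from the text (mass $m \approx 6.64\times10^{-27}\,\mathrm{kg}$ for $^{4}\mathrm{He}$, density $n \approx 2.2\times10^{28}\,\mathrm{m}^{-3}$); the ideal-gas estimate $T_{c} \approx 3.1\,\mathrm{K}$ lands close to the $\lambda$-point of real $^{4}\mathrm{He}$ (about $2.17\,\mathrm{K}$) despite neglecting interactions:

```python
import math

h = 6.626e-34    # J s
kB = 1.381e-23   # J/K

# ζ(3/2) converges slowly; add an Euler–Maclaurin tail correction to the partial sum
N = 100_000
zeta32 = sum(n**-1.5 for n in range(1, N + 1)) + 2.0 / math.sqrt(N) - 0.5 * N**-1.5

def T_c(n, m, g=1):
    """Eq. (5.81): condensation temperature for density n and particle mass m."""
    return h**2 / (2.0 * math.pi * m * kB) * (n / (g * zeta32))**(2.0 / 3.0)

# Helium-like numbers (assumed for illustration)
print(T_c(2.2e28, 6.64e-27))   # ≈ 3.1 K
```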
Equilibrium states with higher densities $n>n_{c}$ are not possible at finite volume. A new phenomenon occurs, however, for infinite volume, i.e. in the thermodynamic limit, $V \rightarrow \infty$. Here we must be careful, because density matrices are only formal (e.g. the partition function $Y \rightarrow \infty$), so it is better to characterize equilibrium states by the so-called KMS condition (for Kubo-Martin-Schwinger). As we will see, new interesting equilibrium states can be found in this way in the thermodynamic limit. They correspond to a Bose condensate, or a gas in a superfluid state.
In the present context, the KMS condition for a Gibbs state $\langle\ldots\rangle$ for the ideal Bose gas is simply
$$
\begin{equation*}
\left\langle a_{\vec{p}}^{\dagger} a_{\vec{k}}\right\rangle=e^{-\beta(\epsilon(\vec{k})-\mu)}\left\langle a_{\vec{k}} a_{\vec{p}}^{\dagger}\right\rangle, \tag{5.82}
\end{equation*}
$$
which was already derived earlier. We can put this relation in a more convenient form, recalling that in the case of no spin $(g=1)$ we had the commutation relations $\left[a_{\vec{k}}, a_{\vec{p}}^{\dagger}\right]=\delta_{\vec{k}, \vec{p}}$ for the creation/destruction operators. From this it follows that
$$
\begin{equation*}
\left(1-e^{-\beta(\epsilon(\vec{k})-\mu)}\right)\left\langle a_{\vec{p}}^{\dagger} a_{\vec{k}}\right\rangle=e^{-\beta(\epsilon(\vec{k})-\mu)} \delta_{\vec{k}, \vec{p}} \tag{5.83}
\end{equation*}
$$
So far, we are still at finite volume $V$. In the thermodynamic limit (infinite volume), $V \rightarrow \infty$, we should make the replacements
$$
\text{finite volume: } \vec{k} \in\left(\frac{\pi}{L} \mathbb{Z}\right)^{3},\ \begin{cases}a_{\vec{k}} \\ \delta_{\vec{k}, \vec{p}}\end{cases} \quad\longrightarrow\quad \text{infinite volume: } \vec{k} \in \mathbb{R}^{3},\ \begin{cases}a(\vec{k}) \\ \delta^{3}(\vec{k}-\vec{p})\end{cases}
$$
Thus, we expect that in the thermodynamic limit:
$$
\begin{equation*}
\left(1-e^{-\beta\left(\frac{\hbar^{2} \vec{k}^{2}}{2 m}-\mu\right)}\right)\left\langle a^{\dagger}(\vec{p}) a(\vec{k})\right\rangle=e^{-\beta\left(\frac{\hbar^{2} \vec{k}^{2}}{2 m}-\mu\right)} \delta^{3}(\vec{p}-\vec{k}). \tag{5.84}
\end{equation*}
$$
In that limit, the statistical operator $\rho$ of the grand canonical ensemble does not make mathematical sense, because $e^{-\beta H+\beta \mu \hat{N}}$ does not have a finite trace (i.e. $Y=\infty$). Nevertheless, the KMS condition (5.84) still makes perfect sense. We view it as the appropriate substitute for the notion of a Gibbs state in the thermodynamic limit. There, it is possible to get new equilibrium states at given temperature $T$ and chemical potential $\mu$ that are described by certain additional "order parameters", and that are impossible at finite volume. We think of these as describing different phases.
To see this concretely in the case of the ideal Bose gas, we must therefore ask: What are the solutions of the KMS condition (5.84)? For $\mu<0$ the unique solution is the usual Bose-Einstein distribution:
$$
\left\langle a^{\dagger}(\vec{k}) a(\vec{p})\right\rangle=\frac{\delta^{3}(\vec{p}-\vec{k})}{e^{\beta\left(\frac{\hbar^{2} k^{2}}{2 m}-\mu\right)}-1}.
$$
The point is that for $\mu=0$ other solutions are also possible, for instance

$$
\left\langle a^{\dagger}(\vec{p}) a(\vec{k})\right\rangle=\frac{\delta^{3}(\vec{p}-\vec{k})}{e^{\beta \frac{\hbar^{2} k^{2}}{2 m}}-1}+(2 \pi)^{3}\, n_{0}\, \delta^{3}(\vec{p})\, \delta^{3}(\vec{k})
$$

for some $n_{0} \geqslant 0$ (this follows from $\left\langle A^{\dagger} A\right\rangle \geqslant 0$ for operators $A$ in any state). The particle number density in the thermodynamic limit $(V \rightarrow \infty)$ is best expressed in terms of the creation operators at sharp position $\vec{x}$:
$$
\begin{equation*}
a(\vec{p})=\frac{1}{(2 \pi)^{\frac{3}{2}}} \int d^{3} x\, e^{-i \vec{p} \cdot \vec{x}}\, a(\vec{x}). \tag{5.85}
\end{equation*}
$$
The particle number density at the point $\vec{x}$ is then defined as $\hat{N}(\vec{x}):=a^{\dagger}(\vec{x}) a(\vec{x})$ and therefore we have, for $\mu=0$:
$$
\begin{equation*}
n=\langle\hat{N}(\vec{x})\rangle=\frac{1}{(2 \pi)^{3}} \int d^{3} p\, d^{3} k\left\langle a^{\dagger}(\vec{p}) a(\vec{k})\right\rangle e^{-i(\vec{p}-\vec{k}) \cdot \vec{x}}=n_{c}+n_{0} \tag{5.86}
\end{equation*}
$$
Thus, in this equilibrium state we have a macroscopically large occupation number $n_{0}$ of the zero mode, causing a different particle density at $\mu=0$. The fraction of zero modes, that is, of the modes in the "condensate", can be written using our definition of $T_{c}$ as
$$
\begin{equation*}
n_{0}=n\left(1-\left(\frac{T}{T_{c}}\right)^{3 / 2}\right) \tag{5.87}
\end{equation*}
$$
for T T TTT below T c T c T_(c)T_{c}Tc, and n 0 = 0 n 0 = 0 n_(0)=0n_{0}=0n0=0 above T c T c T_(c)T_{c}Tc. The formation of the condensate can thereby be seen as a phase transition at T = T c T = T c T=T_(c)T=T_{c}T=Tc.
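Equation (5.87) gives the condensate fraction as a function of temperature; a minimal sketch (helper name is ours):

```python
def condensate_fraction(T, Tc):
    """Eq. (5.87): n₀/n = 1 - (T/T_c)^{3/2} for T < T_c, and 0 above T_c."""
    return max(0.0, 1.0 - (T / Tc)**1.5)

print(condensate_fraction(0.5, 1.0))   # ≈ 0.65 at T = T_c/2
print(condensate_fraction(1.0, 1.0))   # 0 at the transition
print(condensate_fraction(2.0, 1.0))   # 0 above T_c
```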
We can also write down more general solutions to the KMS-condition, for example:
$$
\begin{equation*}
\left\langle a^{\dagger}(\vec{x}) a(\vec{y})\right\rangle=\int \frac{d^{3} k}{(2 \pi)^{3}} \frac{e^{i \vec{k} \cdot(\vec{x}-\vec{y})}}{e^{\beta \frac{\hbar^{2} k^{2}}{2 m}}-1}+\overline{f(\vec{x})} f(\vec{y}) \tag{5.88}
\end{equation*}
$$
where $f$ is any harmonic function, i.e. a function such that $\vec{\nabla}^{2} f=0$. To understand the physical meaning of these states, we define the particle current operator $\vec{j}(\vec{x})$ as
$$
\begin{equation*}
\vec{j}(\vec{x}):=\frac{-i}{2 m}\left(a^{\dagger}(\vec{x}) \vec{\nabla} a(\vec{x})-\vec{\nabla} a^{\dagger}(\vec{x})\, a(\vec{x})\right). \tag{5.89}
\end{equation*}
$$
An example of a harmonic function is $f(\vec{x})=1+i m \vec{v} \cdot \vec{x}$, and in this case one finds the expectation value
$$
\begin{equation*}
\langle\vec{j}(\vec{x})\rangle=\frac{-i}{2 m}\left(\overline{f(\vec{x})}\, \vec{\nabla} f(\vec{x})-f(\vec{x})\, \vec{\nabla} \overline{f(\vec{x})}\right)=\vec{v} \tag{5.90}
\end{equation*}
$$
This means that the condensate flows in the direction of $\vec{v}$ without leaving equilibrium. Another solution is $f(\vec{x})=f(x, y, z)=x+i y$. In this case one finds
$$
\langle\vec{j}(x, y, z)\rangle=\frac{1}{m}(-y, x, 0)
$$
describing a circular motion around the origin (vortex). The condensate can hence flow or form vortices without leaving equilibrium. This phenomenon goes under the name of superfluidity.
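Both expectation values above can be checked numerically by inserting a condensate "wave function" $f$ into (5.90) and taking the gradients by finite differences. The sketch below works in units with $\hbar=1$; the values of $m$ and $\vec{v}$ are illustrative assumptions:

```python
m, eps = 1.0, 1e-6
v = (0.3, -0.2, 0.5)   # illustrative flow velocity

def current(f, x):
    """⟨j⟩ = (-i/2m)(f̄ ∇f - f ∇f̄), eq. (5.90); gradients via central differences."""
    j = []
    for a in range(3):
        xp, xm = list(x), list(x)
        xp[a] += eps
        xm[a] -= eps
        df = (f(xp) - f(xm)) / (2 * eps)
        fx = complex(f(x))
        j.append((-1j / (2 * m) * (fx.conjugate() * df - fx * df.conjugate())).real)
    return j

flow = lambda x: 1 + 1j * m * (v[0] * x[0] + v[1] * x[1] + v[2] * x[2])
vortex = lambda x: x[0] + 1j * x[1]

print(current(flow, [0.7, -1.1, 0.4]))    # ≈ v: uniform flow
print(current(vortex, [1.0, 2.0, 0.0]))   # ≈ (1/m)(-y, x, 0): circulation around the z-axis
```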

Chapter 6

The Laws of Thermodynamics

The laws of thermodynamics predate the ideas and techniques from statistical mechanics, and are, to some extent, simply consequences of more fundamental ideas derived in statistical mechanics. However, they are still in use today, mainly because:
(i) they are easy to remember.
(ii) they are to some extent universal and model-independent.
(iii) microscopic descriptions are sometimes not known (e.g. black hole thermodynamics) or are not well-developed (non-equilibrium situations).
(iv) they are useful!
The laws of thermodynamics are based on:
(o) The empirical evidence that systems approach a new thermal equilibrium state after being pushed out of equilibrium by an external influence.
(i) The empirical evidence that, for a very large class of macroscopic systems, equilibrium states can generally be characterized by very few parameters. These thermodynamic parameters, often called $X_{1}, \ldots, X_{n}$ in the following, can hence be viewed as "coordinates" on the space of equilibrium systems.
(ii) The idea to perform mechanical work on a system, or to bring equilibrium systems into "thermal contact" with reservoirs in order to produce new equilibrium states in a controlled way. The key idea here is that these changes (e.g. by "heating up a system" through contact with a reservoir system) should be extremely gentle so that the system is not pushed out of equilibrium too much. One thereby imagines that one can describe such a gradual change of the system by a succession of equilibrium states, i.e. a curve in the space of coordinates $X_{1}, \ldots, X_{n}$ characterizing the different equilibrium states. This idealized notion of an infinitely gentle/slow change is often referred to as "quasi-static".
(iii) Given the notion of quasi-static changes in the space of equilibrium states, one can then postulate certain rules guided by empirical evidence that tell us which kinds of changes should be possible, and which ones should not. These are, in essence, the laws of thermodynamics. For example, one knows that if one has access to equilibrium systems at different temperatures, then one system can perform work on the other system. The first and second law state more precise conditions about such processes and imply, respectively, the existence of an energy and an entropy function on equilibrium states. The zeroth law just states that being in thermal equilibrium with each other is an equivalence relation for systems, i.e. in particular transitive. It implies the existence of a temperature function labelling the different equivalence classes.

6.1 The Zeroth Law

$0^{\text{th}}$ law of thermodynamics: If two subsystems I, II are separately in thermal equilibrium with a third system, III, then they are in thermal equilibrium with each other.
The $0^{\text{th}}$ law implies the existence of a function
$$
\Theta:\{\text{equilibrium systems}\} \rightarrow \mathbb{R}
$$
such that $\Theta$ is equal for systems in thermal equilibrium with each other. To see this, let us imagine that the equilibrium states of the systems I, II and III are parametrized by some coordinates $\left\{A_{1}, A_{2}, \ldots\right\},\left\{B_{1}, B_{2}, \ldots\right\}$ and $\left\{C_{1}, C_{2}, \ldots\right\}$. Since a change in I implies a corresponding change in III, there must be a constraint${ }^{1}$
$$
\begin{equation*}
f_{\mathrm{I}, \mathrm{III}}\left(\left\{A_{1}, A_{2}, \ldots ; C_{1}, C_{2}, \ldots\right\}\right)=0 \tag{6.1}
\end{equation*}
$$
and a similar constraint
$$
\begin{equation*}
f_{\mathrm{II}, \mathrm{III}}\left(\left\{B_{1}, B_{2}, \ldots ; C_{1}, C_{2}, \ldots\right\}\right)=0 \tag{6.2}
\end{equation*}
$$
which we can write as
$$
\begin{equation*}
C_{1}=\tilde{f}_{\mathrm{I}, \mathrm{III}}\left(\left\{A_{1}, A_{2}, \ldots ; C_{2}, C_{3}, \ldots\right\}\right)=\tilde{f}_{\mathrm{II}, \mathrm{III}}\left(\left\{B_{1}, B_{2}, \ldots ; C_{2}, C_{3}, \ldots\right\}\right) \tag{6.3}
\end{equation*}
$$
Since, according to the $0^{\text{th}}$ law, we also must have the constraint
$$
\begin{equation*}
f_{\mathrm{I}, \mathrm{II}}\left(\left\{A_{1}, A_{2}, \ldots, B_{1}, B_{2}, \ldots\right\}\right)=0 \tag{6.4}
\end{equation*}
$$
we can proceed by noting that for $\left\{A_{1}, A_{2}, \ldots, B_{1}, B_{2}, \ldots\right\}$ which satisfy the last equation, (6.3) must be satisfied for any $\left\{C_{2}, C_{3}, \ldots\right\}$! Thus, we let III be our reference system and set $\left\{C_{2}, C_{3}, \ldots\right\}$ to any convenient but fixed value. This reduces the condition (6.4) for equilibrium between I and II to:
$$
\begin{align*}
\Theta\left(\left\{A_{1}, A_{2}, \ldots\right\}\right) &:=\tilde{f}_{\mathrm{I}, \mathrm{III}}\left(\left\{A_{1}, A_{2}, \ldots, C_{2}, C_{3}, \ldots\right\}\right) \\
&=\tilde{f}_{\mathrm{II}, \mathrm{III}}\left(\left\{B_{1}, B_{2}, \ldots, C_{2}, C_{3}, \ldots\right\}\right)=\Theta\left(\left\{B_{1}, B_{2}, \ldots\right\}\right). \tag{6.5}
\end{align*}
$$
This means that equilibrium is characterized by some function $\Theta$ of the thermodynamic coordinates, which has the properties of a temperature.
We may choose as our reference system III an ideal gas, with
$$
\begin{equation*}
\frac{P V}{N \mathrm{k}_{\mathrm{B}}}=\text{const.}=T[\mathrm{K}]=:\Theta \tag{6.6}
\end{equation*}
$$
By bringing this system (for $V \rightarrow \infty$) in contact with any other system, we can measure the (absolute) temperature of the latter. For example, one can define the triple point of the system water-ice-vapor to be at $273.16\,\mathrm{K}$. Together with the definition of $\mathrm{k}_{\mathrm{B}}=1.4 \times 10^{-23}\, \frac{\mathrm{J}}{\mathrm{K}}$, this then defines, in principle, the Kelvin temperature scale. Of course, in practice the situation is more complicated because ideal gases do not exist.
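A gas thermometer based on (6.6) simply reads off $\Theta=PV/(N\mathrm{k_B})$; a minimal sketch with made-up readings (the values below are illustrative assumptions, not from the text):

```python
kB = 1.381e-23    # J/K
N_A = 6.022e23    # Avogadro's number, particles per mole

def ideal_gas_temperature(P, V, N):
    """Eq. (6.6): Θ = PV/(N k_B)."""
    return P * V / (N * kB)

# Hypothetical reading: one mole at atmospheric pressure occupying 22.4 litres
print(ideal_gas_temperature(1.013e5, 22.4e-3, N_A))   # ≈ 273 K
```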
Figure 6.1: The triple point of ice, water and vapor in the $(P, T)$ phase diagram
The Zeroth Law implies in particular: The temperature of a system in equilibrium is constant throughout the system. This has to be the case since subsystems obtained by imaginary walls are in equilibrium with each other, see the following figure:
Figure 6.2: A large system divided into subsystems I and II by an imaginary wall.

6.2 The First Law

$1^{\text{st}}$ law of thermodynamics: The amount of work required to change adiabatically a thermally isolated system from an initial state $i$ to a final state $f$ depends only on $i$ and $f$, not on the path of the process.
Figure 6.3: Change of system from initial state $i$ to final state $f$ along two different paths.
Here, by an "adiabatic change", one means a change without heat exchange. Consider a particle moving in a potential. By fixing an arbitrary reference point $X_{0}$, we can define an energy landscape
$$
\begin{equation*}
E(X)=\int_{X_{0}}^{X} \delta W \tag{6.7}
\end{equation*}
$$
where the integral is along any path connecting $X_{0}$ with $X$, and where $X_{0}$ is a reference point corresponding to the zero of energy. $\delta W$ is the infinitesimal change of work done along the path. In order to define more properly the notion of such integrals of "infinitesimals", we will now make a short mathematical digression on differential forms.

Differentials ("differential forms")

A 1-form (or differential) is an expression of the form
$$
\begin{equation*}
\alpha=\sum_{i=1}^{N} \alpha_{i}\left(X_{1}, \ldots, X_{N}\right) \mathrm{d} X_{i} \tag{6.8}
\end{equation*}
$$
Given a curve $\gamma:[0,1] \rightarrow \mathbb{R}^{N}$, $t \mapsto \left(X_{1}(t), \ldots, X_{N}(t)\right)$, we define the integral of $\alpha$ along $\gamma$ as
\begin{equation*}
\int_{\gamma} \alpha:=\int_{0}^{1} \sum_{i=1}^{N} \alpha_{i}\left(X_{1}(t), \ldots, X_{N}(t)\right) \underbrace{\frac{d X_{i}(t)}{d t} d t}_{``=\mathrm{d} X_{i}\text{''}} \tag{6.9}
\end{equation*}
which in general is $\gamma$-dependent. Given a function $f\left(X_{1}, \ldots, X_{N}\right)$ on $\mathbb{R}^{N}$, we write
\begin{equation*}
\mathrm{d} f\left(X_{1}, \ldots, X_{N}\right)=\frac{\partial f}{\partial X_{1}}\left(X_{1}, \ldots, X_{N}\right) \mathrm{d} X_{1}+\ldots+\frac{\partial f}{\partial X_{N}}\left(X_{1}, \ldots, X_{N}\right) \mathrm{d} X_{N} \tag{6.10}
\end{equation*}
$\mathrm{d} f$ is called an "exact" 1-form. From the definition of the path integral along $\gamma$ it is obvious that
\begin{equation*}
\int_{\gamma} \mathrm{d} f=\int_{0}^{1} \frac{d}{d t}\left\{f\left(X_{1}(t), \ldots, X_{N}(t)\right)\right\} d t=f(\gamma(1))-f(\gamma(0)), \tag{6.11}
\end{equation*}
so the integral of an exact 1-form depends only on the beginning and end point of the path. An example of a curve $\gamma:[0,1] \rightarrow \mathbb{R}^{2}$ is given in the following figure:
Figure 6.4: A curve $\gamma:[0,1] \rightarrow \mathbb{R}^{2}$.
The converse is also true: the integral is independent of the path $\gamma$ if and only if there exists a function $f$ on $\mathbb{R}^{N}$ such that $\mathrm{d} f=\alpha$, or equivalently, if and only if $\alpha_{i}=\frac{\partial f}{\partial X_{i}}$.
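This path (in)dependence can be checked numerically. The following minimal sketch (my own illustrative forms and paths, not from the notes) compares the exact 1-form $\mathrm{d}(xy)=y\,\mathrm{d}x+x\,\mathrm{d}y$ with the non-exact form $y\,\mathrm{d}x$ along two curves from $(0,0)$ to $(1,1)$:

```python
import numpy as np

def line_integral(alpha, path, n=20001):
    """Approximate the integral of a 1-form alpha = (a_x, a_y) along path(t), t in [0, 1]."""
    t = np.linspace(0.0, 1.0, n)
    x, y = path(t)
    ax, ay = alpha(x, y)
    # integrand is sum_i alpha_i dX_i/dt, cf. eq. (6.9)
    integrand = ax * np.gradient(x, t) + ay * np.gradient(y, t)
    return np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))  # trapezoid rule

exact = lambda x, y: (y, x)                    # d(xy) = y dx + x dy, an exact 1-form
nonexact = lambda x, y: (y, np.zeros_like(y))  # y dx, not exact

straight = lambda t: (t, t)     # gamma_1: straight line from (0,0) to (1,1)
parabola = lambda t: (t, t**2)  # gamma_2: parabolic arc, same endpoints

I1, I2 = line_integral(exact, straight), line_integral(exact, parabola)      # both ~ 1
J1, J2 = line_integral(nonexact, straight), line_integral(nonexact, parabola)  # 1/2 vs 1/3
```

The exact form gives the same value ($f(1,1)-f(0,0)=1$) on both curves, while the non-exact form does not.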
The notion of a $p$-form generalizes that of a 1-form. It is an expression of the form
\begin{equation*}
\alpha=\sum_{i_{1}, \ldots, i_{p}} \alpha_{i_{1} \ldots i_{p}} \mathrm{d} X_{i_{1}} \ldots \mathrm{d} X_{i_{p}} \tag{6.12}
\end{equation*}
where the $\alpha_{i_{1} \ldots i_{p}}$ are (smooth) functions of the coordinates $X_{i}$. We declare the $\mathrm{d} X_{i}$ to anticommute,
\begin{equation*}
\mathrm{d} X_{i} \mathrm{d} X_{j}=-\mathrm{d} X_{j} \mathrm{d} X_{i} . \tag{6.13}
\end{equation*}
Then we may think of the coefficient tensors as totally anti-symmetric, i.e. we can assume without loss of generality that
\begin{equation*}
\alpha_{i_{\sigma(1)} \ldots i_{\sigma(p)}}=\operatorname{sgn}(\sigma)\, \alpha_{i_{1} \ldots i_{p}}, \tag{6.14}
\end{equation*}
where $\sigma$ is any permutation of $p$ elements and $\operatorname{sgn}$ is its signum (see the discussion of fermions in the chapter on the ideal quantum gas). We may now introduce an operator $\mathrm{d}$ with the following properties:
(i) $\mathrm{d}(f g)=\mathrm{d} f\, g+(-1)^{p} f\, \mathrm{d} g$ if $f$ is a $p$-form and $g$ a $q$-form,
(ii) $\mathrm{d}(\lambda f+\eta g)=\lambda\, \mathrm{d} f+\eta\, \mathrm{d} g$ if $f, g$ are $p$-forms and $\lambda, \eta$ are constants,
(iii) $\mathrm{d} f=\sum_{i} \frac{\partial f}{\partial X_{i}} \mathrm{d} X_{i}$ for 0-forms $f$,
(iv) $\mathrm{d}^{2} X_{i}=0$.
On scalars (i.e. 0-forms) the operator $\mathrm{d}$ is defined as before, and the rules (i)-(iv) then determine it for any $p$-form. The relation (6.13) can be interpreted as saying that we should think of the differentials $\mathrm{d} X_{i}, i=1, \ldots, N$ as "fermionic" or "anti-commuting" variables.${}^{2}$ For instance, we then get for a 1-form $\alpha$:
\begin{align*}
\mathrm{d} \alpha & =\sum_{i, j} \frac{\partial \alpha_{i}}{\partial X_{j}} \underbrace{\mathrm{d} X_{j} \mathrm{d} X_{i}}_{=-\mathrm{d} X_{i} \mathrm{d} X_{j}} \tag{6.15}\\
& =\frac{1}{2} \sum_{i, j}\left(\frac{\partial \alpha_{i}}{\partial X_{j}}-\frac{\partial \alpha_{j}}{\partial X_{i}}\right) \mathrm{d} X_{j} \mathrm{d} X_{i} . \tag{6.16}
\end{align*}
The expression for $\mathrm{d} \alpha$ of a $p$-form follows similarly by applying the rules (i)-(iv). The rules imply the most important relation for $p$-forms,
\begin{equation*}
\mathrm{d}^{2} \alpha=\mathrm{d}(\mathrm{d} \alpha)=0 \tag{6.17}
\end{equation*}
Conversely, it can be shown that any $(p+1)$-form $f$ on $\mathbb{R}^{N}$ such that $\mathrm{d} f=0$ must be of the form $f=\mathrm{d} \alpha$ for some $p$-form $\alpha$. This result is often referred to as the **Poincaré lemma**. An important and familiar example from field theory is provided by force fields $\vec{f}$ on $\mathbb{R}^{3}$. The components $f_{i}$ of the force field may be identified with the components of a 1-form $F=\sum f_{i}\, \mathrm{d} X_{i}$. The condition $\mathrm{d} F=0$ is seen to be equivalent to $\vec{\nabla} \times \vec{f}=0$, i.e. we have a conservative force field. Poincaré's lemma implies the existence of a potential $\mathcal{W}$, such that $F=-\mathrm{d} \mathcal{W}$; in vector notation, $\vec{f}=-\vec{\nabla} \mathcal{W}$. A similar statement holds for $p$-forms.
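The chain $\vec{f}=-\vec{\nabla}\mathcal{W} \Rightarrow \vec{\nabla}\times\vec{f}=0$ (i.e. $\mathrm{d}F=\mathrm{d}(-\mathrm{d}\mathcal{W})=0$, an instance of $\mathrm{d}^2=0$) can be verified symbolically. A small sketch with sympy, using an arbitrary illustrative potential of my own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
W = x**2 * y + x * sp.sin(z)             # an arbitrary smooth potential (illustrative)
f = [-sp.diff(W, v) for v in (x, y, z)]  # conservative force field f = -grad W

# dF = 0 for F = sum f_i dX_i is precisely curl f = 0:
curl = [sp.simplify(sp.diff(f[2], y) - sp.diff(f[1], z)),
        sp.simplify(sp.diff(f[0], z) - sp.diff(f[2], x)),
        sp.simplify(sp.diff(f[1], x) - sp.diff(f[0], y))]
# all three components vanish identically, since mixed partials of W commute
```

Any other smooth $\mathcal{W}$ gives the same result; the check only uses the symmetry of second derivatives.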
Just as a 1-form can be integrated over oriented curves (1-dimensional surfaces), a $p$-form can be integrated over an oriented $p$-dimensional surface $\Sigma$. If that surface is parameterized by $N$ functions $X_{i}\left(t_{1}, \ldots, t_{p}\right)$ of $p$ parameters $\left(t_{1}, \ldots, t_{p}\right) \in U \subset \mathbb{R}^{p}$ (the ordering of which defines an orientation of the surface), we define the corresponding integral as
\begin{equation*}
\int_{\Sigma} \alpha=\int_{U} d t_{1} \ldots d t_{p} \sum_{i_{1}, \ldots, i_{p}} \alpha_{i_{1} \ldots i_{p}}\left(X\left(t_{1}, \ldots, t_{p}\right)\right) \frac{\partial X_{i_{1}}}{\partial t_{1}} \ldots \frac{\partial X_{i_{p}}}{\partial t_{p}} \tag{6.18}
\end{equation*}
The value of this integral is independent of the chosen parameterization up to a sign which corresponds to our choice of orientation. The most important fact pertaining to integrals of differential forms is Gauss' theorem (also called Stokes' theorem in this context):
\begin{equation*}
\int_{\Sigma} \mathrm{d} \alpha=\int_{\partial \Sigma} \alpha \tag{6.19}
\end{equation*}
In particular, the integral of a form $\mathrm{d} \alpha$ vanishes if the boundary $\partial \Sigma$ of $\Sigma$ is empty.
Using the language of differentials, the $1^{\text{st}}$ law of thermodynamics may also be stated as saying that, in the absence of heat exchange, the infinitesimal work is an exact 1-form,
\begin{equation*}
\mathrm{d} E=\delta W \tag{6.20}
\end{equation*}
or alternatively,
\begin{equation*}
\mathrm{d}\, \delta W=0 \tag{6.21}
\end{equation*}
We can break up the infinitesimal work into the various possible forms of work, as in
\begin{equation*}
\mathrm{d} E=\delta W=\sum_{i} \underbrace{J_{i}}_{\text{force}} \underbrace{\mathrm{d} X_{i}}_{\text{displacement}}=-P \mathrm{d} V+\mu \mathrm{d} N+\{\text{other types of work, see table}\}, \tag{6.22}
\end{equation*}
if the change of state is adiabatic (no heat transfer!). If there is heat transfer, then the $1^{\text{st}}$ law gets replaced by
\begin{equation*}
\mathrm{d} E=\delta Q+\sum_{i} \underbrace{J_{i}}_{\text{force}} \underbrace{\mathrm{d} X_{i}}_{\text{displacement}} . \tag{6.23}
\end{equation*}
This relation is best viewed as the definition of the infinitesimal heat change $\delta Q$. Thus, we could say that the first law is just energy conservation, where energy can consist of either mechanical work or heat. We may then write
\begin{equation*}
\delta Q=\mathrm{d} E-\sum_{i} \underbrace{J_{i}}_{\text{force}} \underbrace{\mathrm{d} X_{i}}_{\text{displacement}} \tag{6.24}
\end{equation*}
from which it can be seen that $\delta Q$ is a 1-form in the variables $\left(E, X_{1}, \ldots, X_{n}\right)$.
An overview of several thermodynamic forces and displacements is given in the following table:
| System | Force $J_{i}$ | Displacement $X_{i}$ |
| :---: | :---: | :---: |
| wire | tension $F$ | length $L$ |
| film | surface tension $\tau$ | area $A$ |
| fluid/gas | pressure $P$ | volume $V$ |
| magnet | magnetic field $\vec{B}$ | magnetization $\vec{M}$ |
| electricity | electric field $\vec{E}$ | polarization $\vec{\Pi}$ |
|  | stat. potential $\phi$ | charge $q$ |
| chemical | chemical potential $\mu$ | particle number $N$ |
Table 6.1: Some thermodynamic forces and displacements for various types of systems.
Since $\delta Q$ is not an exact differential (in particular $\mathrm{d}\, \delta Q \neq 0$) we have
\begin{equation*}
\Delta Q_{1}=\int_{\gamma_{1}} \delta Q \neq \int_{\gamma_{2}} \delta Q=\Delta Q_{2}
\end{equation*}
So, there does not exist a function $Q=Q(V, A, N, \ldots)$ such that $\delta Q=\mathrm{d} Q$! Traditionally, one refers to processes with $\delta Q \neq 0$ as "non-adiabatic", i.e. heat is transferred.

6.3 The Second Law

**$2^{\text{nd}}$ law of thermodynamics (Kelvin):** There are no processes in which heat goes over from a reservoir, is completely converted to other forms of energy, and nothing else happens.
One important consequence of the $2^{\text{nd}}$ law is the existence of a state function $S$, called entropy. As before, we denote the $n$ "displacement variables" generically by $X_{i} \in \{V, N, \ldots\}$ and the "forces" by $J_{i} \in \{-P, \mu, \ldots\}$, and consider equilibrium states labeled by $\left(E,\left\{X_{i}\right\}\right)$ in an $(n+1)$-dimensional space. We consider within this space the "adiabatic" submanifold $\mathcal{A}$ of all states that can be reached from a given state $\left(E^{*},\left\{X_{i}^{*}\right\}\right)$ by means of a reversible and quasi-static (i.e. sufficiently slowly performed) process. On this submanifold we must have
\begin{equation*}
\delta Q=\mathrm{d} E-\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i}=0 \tag{6.25}
\end{equation*}
i.e. $\int_{\gamma} \delta Q=0$ for any closed curve $\gamma$ in $\mathcal{A}$. Otherwise there would exist processes disturbing the energy balance (through the exchange of heat), and we could then choose the sign of $\delta Q$ such that heat is completely converted into work, which is impossible by the $2^{\text{nd}}$ law.
We choose a (not uniquely defined) function $S$ labeling the different submanifolds $\mathcal{A}$:
Figure 6.5: Sketch of the submanifolds $\mathcal{A}$.
This means that $\mathrm{d} S$ is proportional to $\mathrm{d} E-\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i}$. Thus, at each point $\left(E,\left\{X_{i}\right\}\right)$ there is a function $\Theta\left(E, X_{1}, \ldots, X_{n}\right)$ such that
\begin{equation*}
\Theta \mathrm{d} S=\mathrm{d} E-\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i} \tag{6.26}
\end{equation*}
$\Theta$ can be identified with the temperature $T\,[\mathrm{K}]$ for a suitable choice of $S=S\left(E, X_{1}, \ldots, X_{n}\right)$, which then uniquely defines $S$. This is seen for instance by comparing the coefficients in
\begin{equation*}
T \mathrm{d} S=T\left(\frac{\partial S}{\partial E} \mathrm{d} E+\sum_{i=1}^{n} \frac{\partial S}{\partial X_{i}} \mathrm{d} X_{i}\right)=\mathrm{d} E-\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i} \tag{6.27}
\end{equation*}
which yields
\begin{equation*}
0=\underbrace{\left(T \frac{\partial S}{\partial E}-1\right)}_{=0} \mathrm{d} E+\sum_{i=1}^{n} \underbrace{\left(T \frac{\partial S}{\partial X_{i}}+J_{i}\right)}_{=0} \mathrm{d} X_{i} \tag{6.28}
\end{equation*}
Therefore, the following relations hold:
\begin{equation*}
\frac{1}{T}=\frac{\partial S\left(E,\left\{X_{j}\right\}\right)}{\partial E} \quad \text{and} \quad -\frac{J_{i}}{T}=\frac{\partial S\left(E,\left\{X_{j}\right\}\right)}{\partial X_{i}} \tag{6.29}
\end{equation*}
We recognize the first of these relations as the defining relation for temperature, whereas the second includes the definitions of the pressure and chemical potential. These relations were already stated in the microcanonical ensemble (cf. section 4.2.1). We can now rewrite (6.26) as
\begin{equation*}
\mathrm{d} E=T \mathrm{d} S+\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i}=T \mathrm{d} S-P \mathrm{d} V+\mu \mathrm{d} N+\ldots \tag{6.30}
\end{equation*}
By comparing this formula with that for energy conservation for a process without heat transfer, we identify
\begin{equation*}
\delta Q=\text{heat transfer}=T \mathrm{d} S \quad \Rightarrow \quad \mathrm{d} S=\frac{\delta Q}{T} \quad \text{(noting that } \mathrm{d}(\delta Q) \neq 0\text{!)}. \tag{6.31}
\end{equation*}
Equation (6.30), which was derived for quasi-static processes, is the most important equation in thermodynamics.
Example: As an illustration, we calculate the adiabatic curves $\mathcal{A}$ for an ideal gas. The defining relation, with $n=1$ and $X_{1}=V$, is in this case
\begin{equation*}
0=\mathrm{d} E+P \mathrm{d} V
\end{equation*}
Since $P V=N \mathrm{k}_{\mathrm{B}} T$ and $E=\frac{3}{2} N \mathrm{k}_{\mathrm{B}} T$ for the ideal gas, we find
\begin{equation*}
P=P(E, V)=\frac{2}{3} \frac{E}{V} \tag{6.32}
\end{equation*}
and therefore
\begin{equation*}
0=\mathrm{d} E+\frac{2}{3} \frac{E}{V} \mathrm{d} V \tag{6.33}
\end{equation*}
Thus, we can parametrize the adiabatic $\mathcal{A}$ by $E=E(V)$, such that $\mathrm{d} E=\frac{\partial E(V)}{\partial V} \mathrm{d} V$ on $\mathcal{A}$. We then obtain
\begin{align*}
0 &=\underbrace{\left(\frac{\partial E}{\partial V}+\frac{2}{3} \frac{E}{V}\right)}_{=0} \mathrm{d} V \\
\Rightarrow \quad E(V) &=E^{*}\left(\frac{V^{*}}{V}\right)^{2 / 3}
\end{align*}
Figure 6.6: Adiabatics of the ideal gas
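The adiabat just found can be cross-checked by integrating the defining ODE $\partial E/\partial V=-\tfrac{2}{3}E/V$ numerically; a short sketch (the step count and endpoint values are illustrative choices of mine):

```python
import numpy as np

def adiabat_energy(E_star, V_star, V_end, n=20000):
    """Integrate dE/dV = -(2/3) E/V (from eq. 6.33) from (V*, E*) to V_end with RK4."""
    V = np.linspace(V_star, V_end, n + 1)
    h = V[1] - V[0]
    f = lambda v, e: -(2.0 / 3.0) * e / v
    E = E_star
    for v in V[:-1]:  # classical fourth-order Runge-Kutta steps
        k1 = f(v, E)
        k2 = f(v + h / 2, E + h * k1 / 2)
        k3 = f(v + h / 2, E + h * k2 / 2)
        k4 = f(v + h, E + h * k3)
        E += (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return E

E_num = adiabat_energy(E_star=1.0, V_star=1.0, V_end=2.0)
E_closed = 1.0 * (1.0 / 2.0) ** (2.0 / 3.0)  # closed form E(V) = E* (V*/V)^(2/3)
```

The numerical solution agrees with the closed form $E(V)=E^{*}(V^{*}/V)^{2/3}$ to machine-level accuracy.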
Of course, we may also switch to other thermodynamic variables, like $(S, V)$, such that $E$ now becomes a function of $(S, V)$:
\begin{equation*}
\mathrm{d} E=T \mathrm{d} S-P \mathrm{d} V=\left(\frac{\partial E}{\partial V}\right) \mathrm{d} V+\left(\frac{\partial E}{\partial S}\right) \mathrm{d} S \tag{6.34}
\end{equation*}
The defining relation for the adiabatics then reads
\begin{equation*}
0=\underbrace{\left(\frac{\partial E}{\partial V}+P\right)}_{=0} \mathrm{d} V+\underbrace{\left(\frac{\partial E}{\partial S}-T\right)}_{=0} \mathrm{d} S \tag{6.35}
\end{equation*}
from which it follows that
\begin{equation*}
T=\left.\frac{\partial E}{\partial S}\right|_{V} \quad \text{and} \quad P=-\left.\frac{\partial E}{\partial V}\right|_{S} \tag{6.36}
\end{equation*}
which hold generally (cf. section 4.2.1, eq. (4.17)). For an ideal gas ($P V=N \mathrm{k}_{\mathrm{B}} T$ and $E=\frac{3}{2} N \mathrm{k}_{\mathrm{B}} T$) we thus find
\begin{align*}
-\frac{\partial E}{\partial V} V & =\mathrm{k}_{\mathrm{B}} N \frac{\partial E}{\partial S} \\
E & =\frac{3}{2} \mathrm{k}_{\mathrm{B}} N \frac{\partial E}{\partial S}
\end{align*}
which we can solve as
\begin{equation*}
E(S, V)=E\left(S^{*}, V\right) e^{\frac{2}{3} \frac{S-S^{*}}{\mathrm{k}_{\mathrm{B}} N}} \tag{6.37}
\end{equation*}
Since we also have
\begin{equation*}
\frac{1}{E} \frac{\partial E}{\partial V}=-\frac{2}{3} \frac{1}{V} \tag{6.38}
\end{equation*}
we find for the function $E(S, V)$:
\begin{equation*}
E(S, V)=E\left(S^{*}, V^{*}\right)\left(\frac{V^{*}}{V}\right)^{\frac{2}{3}} e^{\frac{2}{3} \frac{S-S^{*}}{\mathrm{k}_{\mathrm{B}} N}} \tag{6.39}
\end{equation*}
Solving this relation for $S$, we obtain
\begin{equation*}
S=S^{*}+N \mathrm{k}_{\mathrm{B}} \log \left[\left(\frac{E}{E^{*}}\right)^{\frac{3}{2}} \frac{V}{V^{*}}\right] \tag{6.40}
\end{equation*}
The dependence of this expression on $V, E$ coincides with the expression (4.16) found in section 4.2.1. Indeed, we find
\begin{equation*}
S=N \mathrm{k}_{\mathrm{B}} \log \left[\frac{V}{N}\left(\frac{4 \pi e m}{3} \frac{E}{N}\right)^{\frac{3}{2}}\right] \tag{6.41}
\end{equation*}
i.e. the formula found before in the microcanonical ensemble, for a suitable choice of the entropy $S^{*}=S\left(E^{*}, V^{*}\right)$ at the reference point.${}^{3}$ The entropy at the reference point depends on $N$ and on the microscopic parameters of the system (i.e. the particle mass $m$), which clearly cannot be determined in the present context, since this information is contained neither in the equations of state nor, of course, in the first law of thermodynamics.
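As a consistency check, (6.40) reproduces the relations (6.29) for the ideal gas. A sketch in sympy (symbol names for the reference values are my own):

```python
import sympy as sp

E, V, N, kB, Estar, Vstar, Sstar = sp.symbols(
    'E V N k_B Estar Vstar Sstar', positive=True)

# eq. (6.40): S = S* + N k_B log[(E/E*)^(3/2) V/V*]
S = Sstar + N * kB * sp.log((E / Estar) ** sp.Rational(3, 2) * V / Vstar)

T = 2 * E / (3 * N * kB)  # from E = (3/2) N k_B T
P = N * kB * T / V        # ideal gas law P V = N k_B T

dSdE = sp.simplify(sp.diff(S, E) - 1 / T)  # eq. (6.29): dS/dE = 1/T
dSdV = sp.simplify(sp.diff(S, V) - P / T)  # eq. (6.29): dS/dV = P/T (here J_1 = -P)
```

Both differences simplify to zero, confirming that (6.40) encodes both equations of state.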

6.4 Cyclic processes

6.4.1 The Carnot Engine

We next discuss the Carnot engine for an ideal (monatomic) gas. As discussed in section 4.2, the ideal gas is characterized by the relations:
\begin{equation*}
E=\frac{3}{2} N \mathrm{k}_{\mathrm{B}} T=\frac{3}{2} P V . \tag{6.42}
\end{equation*}
We consider the cyclic process consisting of the following steps:
I $\rightarrow$ II: isothermal expansion at $T=T_{H}$,
II $\rightarrow$ III: adiabatic expansion ($\delta Q=0$),
III $\rightarrow$ IV: isothermal compression at $T=T_{C}$,
IV $\rightarrow$ I: adiabatic compression,

where we assume $T_{H}>T_{C}$.
We want to work out the efficiency $\eta$, which is defined as
\begin{equation*}
\eta:=\frac{\Delta W}{\Delta Q_{\mathrm{in}}} \tag{6.43}
\end{equation*}
where
\begin{equation*}
\Delta Q_{\mathrm{in}}=\int_{I}^{II} \delta Q
\end{equation*}
is the total heat added to the system (analogously, $\Delta Q_{\mathrm{out}}=\int_{III}^{IV} \delta Q$ is the total heat given off by the system into a colder reservoir), and where
\begin{equation*}
\Delta W=\oint \delta W=\left(\int_{I}^{II}+\int_{II}^{III}+\int_{III}^{IV}+\int_{IV}^{I}\right) \delta W
\end{equation*}
is the total work done by the system. We may also write $\delta Q=T \mathrm{d} S$ and $\delta W=P \mathrm{d} V$ (or more generally $\delta W=-\sum_{i=1}^{n} J_{i} \mathrm{d} X_{i}$ if other types of mechanical/chemical work are performed by the system). By definition, no heat exchange takes place during II $\rightarrow$ III and IV $\rightarrow$ I.
We now wish to calculate $\eta_{\text{Carnot}}$. We can for instance take $P$ and $V$ as the variables describing the process. By (6.42) we have $PV=$ const. on isotherms. To calculate the adiabatics, we could use the results from above and change variables from $(E, V) \rightarrow (P, V)$ using (6.42), but it is just as easy to do this from scratch: we start with $\delta Q=0$ for an adiabatic process. From this it follows that
\begin{equation*}
0=\mathrm{d} E+P \mathrm{d} V \tag{6.44}
\end{equation*}
Since on adiabatics we may take $P=P(V)$, this yields
\begin{equation*}
\mathrm{d} E=\frac{3}{2} \mathrm{d}(P V)=\frac{3}{2}\left(V \frac{\partial P}{\partial V}+P\right) \mathrm{d} V \tag{6.45}
\end{equation*}
and therefore
\begin{equation*}
0=\frac{3}{2} \mathrm{d}(P V)+P \mathrm{d} V=\underbrace{\left(\frac{3}{2} V \frac{\partial P}{\partial V}+\frac{5}{2} P\right)}_{=0} \mathrm{d} V \tag{6.46}
\end{equation*}
This yields the following relation:
\begin{equation*}
V \frac{\partial P}{\partial V}=-\frac{5}{3} P \quad \Rightarrow \quad P V^{\gamma}=\text{const.}, \quad \gamma=\frac{5}{3} \tag{6.47}
\end{equation*}
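Equivalently, one can check that $PV^{5/3}$ is constant along the adiabat $E(V)=E^{*}(V^{*}/V)^{2/3}$ found earlier, using $P=\tfrac{2}{3}E/V$. A brief symbolic sketch:

```python
import sympy as sp

V, Estar, Vstar = sp.symbols('V Estar Vstar', positive=True)

E = Estar * (Vstar / V) ** sp.Rational(2, 3)  # adiabat E(V), from eq. (6.33)
P = sp.Rational(2, 3) * E / V                 # eq. (6.32): P = (2/3) E/V

gamma = sp.Rational(5, 3)
dPV = sp.simplify(sp.diff(P * V ** gamma, V))  # vanishes: P V^(5/3) is constant in V
```

Indeed $PV^{5/3}=\tfrac{2}{3}E^{*}(V^{*})^{2/3}$, independent of $V$.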
So in a $(P, V)$-diagram the Carnot process looks as follows:
Figure 6.7: Carnot cycle for an ideal gas. The solid lines indicate isotherms and the dashed lines indicate adiabatics.
From $E=\frac{3}{2} P V$, which gives $\mathrm{d} E=0$ on isotherms, it follows that the total heat added to the system is given by
\begin{align*}
\Delta Q_{\mathrm{in}} &=\int_{I}^{II} \underbrace{(\mathrm{d} E+P \mathrm{d} V)}_{\substack{1^{\text{st}}\text{ law:} \\ \delta Q=\mathrm{d} E+P \mathrm{d} V}}=\int_{I}^{II} \underbrace{P \mathrm{d} V}_{\substack{\text{from } \mathrm{d} E=0 \\ \text{on isotherms}}} \\
&=\underbrace{N \mathrm{k}_{\mathrm{B}} T_{H} \int_{I}^{II} V^{-1} \mathrm{d} V}_{P V=N \mathrm{k}_{\mathrm{B}} T_{H} \text{ on isotherm}} \\
&=N \mathrm{k}_{\mathrm{B}} T_{H} \log \frac{V_{II}}{V_{I}} . \tag{6.48}
\end{align*}
Using this result together with $P \,\mathrm{d} V=-\mathrm{d} E$ on adiabatics, we find for the total mechanical work done by the system:
\begin{align*}
\Delta W &= \int_{I}^{II} P \,\mathrm{d} V+\int_{II}^{III} P \,\mathrm{d} V+\int_{III}^{IV} P \,\mathrm{d} V+\int_{IV}^{I} P \,\mathrm{d} V \\
&= N \mathrm{k}_{\mathrm{B}} T_{H} \log \frac{V_{II}}{V_{I}}-\int_{II}^{III} \mathrm{d} E-N \mathrm{k}_{\mathrm{B}} T_{C} \log \frac{V_{III}}{V_{IV}}-\int_{IV}^{I} \mathrm{d} E \\
&= E_{II}-E_{III}+E_{IV}-E_{I}+N \mathrm{k}_{\mathrm{B}}\left(T_{H} \log \frac{V_{II}}{V_{I}}-T_{C} \log \frac{V_{III}}{V_{IV}}\right)
\end{align*}
By conservation of energy, $\oint \mathrm{d} E=0$, we get
\begin{align*}
E_{II}-E_{III}+E_{IV}-E_{I} &= E_{II}-E_{I}+E_{IV}-E_{III} \\
&= \int_{I}^{II} \mathrm{d} E+\int_{III}^{IV} \mathrm{d} E = 0
\end{align*}
since $\mathrm{d} E=\mathrm{d}\left(\frac{3}{2} N \mathrm{k}_{\mathrm{B}} T\right)=0$ on isotherms. From this it follows that
\begin{equation*}
\Delta W=N \mathrm{k}_{\mathrm{B}}\left(T_{H} \log \frac{V_{II}}{V_{I}}-T_{C} \log \frac{V_{III}}{V_{IV}}\right) \tag{6.49}
\end{equation*}
We can now use (6.48) and (6.49) to find
\begin{equation*}
\eta_{\text{Carnot}}=\frac{\Delta W}{\Delta Q_{\mathrm{in}}}=1-\frac{T_{C}}{T_{H}} \frac{\log V_{III} / V_{IV}}{\log V_{II} / V_{I}} \tag{6.50}
\end{equation*}
The relation (6.47) for the adiabatics, together with the ideal gas condition (6.42) implies
\begin{align*}
P_{II} V_{II}^{\gamma}=P_{III} V_{III}^{\gamma} &\Rightarrow T_{H} V_{II}^{\gamma-1}=T_{C} V_{III}^{\gamma-1} \\
P_{I} V_{I}^{\gamma}=P_{IV} V_{IV}^{\gamma} &\Rightarrow T_{H} V_{I}^{\gamma-1}=T_{C} V_{IV}^{\gamma-1}, \\
&\Rightarrow \frac{V_{II}}{V_{I}}=\frac{V_{III}}{V_{IV}}.
\end{align*}
We thus find for the efficiency of the Carnot cycle
\begin{equation*}
\eta=1-\frac{T_{C}}{T_{H}} \tag{6.51}
\end{equation*}
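As a numerical sanity check (not part of the original derivation), one can tabulate the four corner states of a Carnot cycle for a monatomic ideal gas and verify (6.48), (6.49), and (6.51) directly. The temperatures and volumes below are arbitrary choices, and units with $\mathrm{k}_{\mathrm{B}}=N=1$ are assumed:

```python
import math

# Illustrative check of (6.47)-(6.51) for a monatomic ideal gas (gamma = 5/3).
# Units: k_B = 1, N = 1. States I -> II -> III -> IV; values chosen arbitrarily.
kB, N = 1.0, 1.0
TH, TC = 400.0, 300.0
gamma = 5.0 / 3.0

V1, V2 = 1.0, 3.0                      # chosen volumes on the hot isotherm
# Adiabats obey T V^(gamma-1) = const, which fixes V_III and V_IV:
V3 = V2 * (TH / TC) ** (1.0 / (gamma - 1.0))
V4 = V1 * (TH / TC) ** (1.0 / (gamma - 1.0))

# Heat absorbed on the hot isotherm, eq. (6.48)
Q_in = N * kB * TH * math.log(V2 / V1)
# Total work, eq. (6.49)
W = N * kB * (TH * math.log(V2 / V1) - TC * math.log(V3 / V4))

eta = W / Q_in
print(eta, 1.0 - TC / TH)              # both approximately 0.25
```

The cancellation $V_{II}/V_I = V_{III}/V_{IV}$ derived above is what collapses (6.50) to (6.51) here.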
This fundamental relation for the efficiency of a Carnot cycle can also be derived using the variables $(T, S)$ instead of $(P, V)$, which also reveals the distinguished role played by this process. As $\mathrm{d} T=0$ for isotherms and $\mathrm{d} S=0$ for adiabatic processes, the Carnot cycle is just a rectangle in the $(T, S)$-diagram:
Figure 6.8: The Carnot cycle in the $(T, S)$-diagram.
We evidently have for the total heat added to the system:
\begin{equation*}
\Delta Q_{\mathrm{in}}=\int_{I}^{II} \delta Q=\int_{I}^{II} T \,\mathrm{d} S=T_{H}\left(S_{II}-S_{I}\right) \tag{6.52}
\end{equation*}
To compute $\Delta W$, the total mechanical work done by the system, we observe that (as $\oint \mathrm{d} E=0$)
\begin{align*}
\Delta W &= \oint \delta W=\oint P \,\mathrm{d} V \\
&= \oint(P \,\mathrm{d} V+\mathrm{d} E) \\
&= \oint T \,\mathrm{d} S.
\end{align*}
If $A$ is the domain enclosed by the rectangular curve describing the process in the $(T, S)$-diagram, Stokes' theorem gives
\begin{align*}
\Delta W &= \oint T \,\mathrm{d} S=\int_{A} \mathrm{d}(T \,\mathrm{d} S)=\int_{A} \mathrm{d} T \,\mathrm{d} S \\
&= \left(T_{H}-T_{C}\right)\left(S_{II}-S_{I}\right)
\end{align*}
from which it immediately follows that the efficiency $\eta_{\text{Carnot}}$ is given by
\begin{equation*}
\eta_{\text{Carnot}}=\frac{\Delta W}{\Delta Q_{\mathrm{in}}}=\frac{\left(T_{H}-T_{C}\right) \Delta S}{T_{H} \Delta S}=1-\frac{T_{C}}{T_{H}}<1 \tag{6.53}
\end{equation*}
as before. Since $T_{C}>0$, the efficiency can never reach $100 \%$.

6.4.2 General Cyclic Processes

Consider now the more general cycle given by the curve $C$ in the $(T, S)$-diagram depicted in the figure below:
Figure 6.9: A generic cyclic process in the $(T, S)$-diagram.
We define $C_{\pm}$ to be the parts of the boundary curve $C$ where heat is injected resp. given off. Then we have $\mathrm{d} S>0$ on $C_{+}$ and $\mathrm{d} S<0$ on $C_{-}$. For such a process, we define the efficiency $\eta=\eta(C)$ as before by the ratio of net work $\Delta W$ and injected heat $\Delta Q_{\mathrm{in}}$:
\begin{equation*}
\eta=\frac{\Delta W}{\Delta Q_{\mathrm{in}}} \tag{6.54}
\end{equation*}
The quantities $\Delta W$ and $\Delta Q_{\mathrm{in}}$ are then calculated as
\begin{align*}
\Delta W &= -\oint_{C} \delta W=\oint_{C}(T \,\mathrm{d} S-\mathrm{d} E)=\oint_{C} T \,\mathrm{d} S \\
\Delta Q_{\mathrm{in}} &= \int_{C_{+}} T \,\mathrm{d} S
\end{align*}
from which it follows that the efficiency $\eta=\eta(C)$ is given by
\begin{equation*}
\eta=\frac{\oint_{C} T \,\mathrm{d} S}{\int_{C_{+}} T \,\mathrm{d} S}=1+\frac{\int_{C_{-}} T \,\mathrm{d} S}{\int_{C_{+}} T \,\mathrm{d} S}=1-\frac{\Delta Q_{\mathrm{out}}}{\Delta Q_{\mathrm{in}}} \tag{6.55}
\end{equation*}
Now, if the curve $C$ is completely contained between two isotherms at temperatures $T_{H}>T_{C}$, as in the above figure, then
\begin{align*}
0 \leqslant \int_{C_{+}} T \,\mathrm{d} S &\leqslant T_{H} \int_{C_{+}} \mathrm{d} S \quad (\text{as } \mathrm{d} S \geqslant 0 \text{ on } C_{+}), \\
\int_{C_{-}} T \,\mathrm{d} S &\leqslant T_{C} \int_{C_{-}} \mathrm{d} S \leqslant 0 \quad (\text{as } \mathrm{d} S \leqslant 0 \text{ on } C_{-}).
\end{align*}
The efficiency $\eta_{C}$ of our general cycle $C$ can now be estimated as
\begin{equation*}
\eta_{C}=1+\frac{\int_{C_{-}} T \,\mathrm{d} S}{\int_{C_{+}} T \,\mathrm{d} S} \leqslant 1+\frac{T_{C} \int_{C_{-}} \mathrm{d} S}{T_{H} \int_{C_{+}} \mathrm{d} S}=1-\frac{T_{C}}{T_{H}}=\eta_{\text{Carnot}} \tag{6.56}
\end{equation*}
where we used the above inequalities as well as $0=\oint \mathrm{d} S=\int_{C_{+}} \mathrm{d} S+\int_{C_{-}} \mathrm{d} S$. Thus, we conclude that an arbitrary cyclic process between these temperatures is never more efficient than the Carnot process. This is why the Carnot process plays a distinguished role.
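The bound (6.56) can also be checked numerically. The sketch below is an illustration only: it takes an arbitrarily chosen elliptic cycle in the $(T, S)$-plane, inscribed between the isotherms $T_C$ and $T_H$, approximates $\oint_C T\,\mathrm{d}S$ and $\int_{C_+} T\,\mathrm{d}S$ by Riemann sums, and confirms $\eta_C \leqslant 1-T_C/T_H$:

```python
import math

# Illustrative check of (6.56): an elliptic cycle C in the (T,S)-plane,
# touching the isotherms T_C = 300 and T_H = 400, parametrized by theta.
TH, TC = 400.0, 300.0
T0, a = 0.5 * (TH + TC), 0.5 * (TH - TC)   # ellipse centre and T-semi-axis
b = 1.0                                     # S-semi-axis (arbitrary units)

n = 200_000
W, Q_in = 0.0, 0.0
for k in range(n):
    th = 2.0 * math.pi * k / n
    T = T0 + a * math.cos(th)
    dS = b * math.cos(th) * (2.0 * math.pi / n)   # dS = S'(theta) d(theta)
    W += T * dS                                   # contributes to oint_C T dS
    if dS > 0.0:                                  # the part C_+ where heat flows in
        Q_in += T * dS

eta = W / Q_in
print(eta, 1.0 - TC / TH)   # eta is strictly below the Carnot value 0.25
```

Consistent with (6.56), the smooth cycle stays strictly below the Carnot efficiency because part of the heat intake occurs at temperatures below $T_H$.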
We can get a more intuitive understanding of this important finding by considering the following process:
The heat $\Delta Q_{\mathrm{in}}$ is given by $\Delta Q_{\mathrm{in}}=T_{H} \Delta S$, and as before $\Delta W=\oint_{C} T \,\mathrm{d} S=\int_{A} \mathrm{d} T \,\mathrm{d} S$. Thus, $\Delta W$ is the area of the domain $A$ enclosed by the closed curve $C$. This is clearly smaller than the area enclosed by the corresponding Carnot cycle (dashed rectangle). Now divide a general cyclic process into $C=C_{1} \cup C_{2}$, as sketched in the following figure:
Figure 6.10: A generic cyclic process divided into two parts by an isotherm at temperature $T_{I}$.
This process describes two cyclic processes acting one after the other, where the heat given off during cycle $C_{1}$ is injected during cycle $C_{2}$ at temperature $T_{I}$. It follows from the discussion above that
\begin{equation*}
\eta\left(C_{2}\right)=\frac{\Delta W_{2}}{\Delta Q_{2, \mathrm{in}}} \leqslant \frac{T_{I}-T_{C}}{T_{I}}=1-\frac{T_{C}}{T_{I}} \tag{6.57}
\end{equation*}
which means that the cycle $C_{2}$ is less efficient than the Carnot process acting between temperatures $T_{I}$ and $T_{C}$. It remains to show that the cycle $C_{1}$ is also less efficient than the Carnot cycle acting between temperatures $T_{H}$ and $T_{I}$. The work $\Delta W_{1}$ done along $C_{1}$ is again smaller than the area enclosed by the latter Carnot cycle, i.e. we have $\Delta W_{1} \leqslant\left(T_{H}-T_{I}\right) \Delta S$. Furthermore, by energy conservation $\Delta Q_{1, \mathrm{in}}=\Delta W_{1}+\Delta Q_{1, \mathrm{out}}=\Delta W_{1}+T_{I} \Delta S$, and since $\eta$ increases with $\Delta W_{1}$ at fixed $\Delta Q_{1, \mathrm{out}}$, this yields
\begin{equation*}
\eta\left(C_{1}\right)=\frac{\Delta W_{1}}{\Delta Q_{1, \mathrm{in}}}=\frac{\Delta W_{1}}{\Delta W_{1}+T_{I} \Delta S} \leqslant \frac{\left(T_{H}-T_{I}\right) \Delta S}{\left(T_{H}-T_{I}\right) \Delta S+T_{I} \Delta S}=1-\frac{T_{I}}{T_{H}}.
\end{equation*}
Thus, the cycle $C_{1}$ is less efficient than the Carnot cycle acting between temperatures $T_{H}$ and $T_{I}$. It follows that the cycle $C=C_{1} \cup C_{2}$ must be less efficient than the Carnot cycle acting between temperatures $T_{H}$ and $T_{C}$.

6.4.3 The Diesel Engine

Another example of a cyclic process is the Diesel engine. The idealized version of this process consists of the following 4 steps:
$\mathrm{I} \rightarrow \mathrm{II}$: isentropic (adiabatic) compression

$\mathrm{II} \rightarrow \mathrm{III}$: reversible heating at constant pressure

$\mathrm{III} \rightarrow \mathrm{IV}$: adiabatic expansion with work done by the expanding fluid

$\mathrm{IV} \rightarrow \mathrm{I}$: reversible cooling at constant volume
Figure 6.11: The process describing the Diesel engine in the $(P, V)$-diagram.
As before, we define the thermal efficiency to be
\begin{equation*}
\eta_{\text{Diesel}}=\frac{\Delta W}{\Delta Q_{\mathrm{in}}}=\frac{\left(\int_{I}^{II}+\int_{II}^{III}+\int_{III}^{IV}+\int_{IV}^{I}\right) T \,\mathrm{d} S}{\int_{II}^{III} T \,\mathrm{d} S}
\end{equation*}
As in the discussion of the Carnot process we use an ideal gas, with $P V=N \mathrm{k}_{\mathrm{B}} T$, $E=\frac{3}{2} P V$, and $\mathrm{d} E=T \,\mathrm{d} S-P \,\mathrm{d} V$. Since $\mathrm{d} S=0$ on the paths $\mathrm{I} \rightarrow \mathrm{II}$ and $\mathrm{III} \rightarrow \mathrm{IV}$, it follows that
\begin{equation*}
\eta_{\text{Diesel}}=1+\frac{\int_{IV}^{I} T \,\mathrm{d} S}{\int_{II}^{III} T \,\mathrm{d} S} \tag{6.58}
\end{equation*}
Using (6.42), the integrals in this expression are easily calculated as
\begin{align*}
\int_{IV}^{I} T \,\mathrm{d} S &= \int_{IV}^{I}(\mathrm{d} E+P \,\mathrm{d} V)=\int_{IV}^{I}\Big(\frac{3}{2} V \,\mathrm{d} P+\frac{5}{2} P \underbrace{\mathrm{d} V}_{=0}\Big) \\
&= \frac{3}{2} N \mathrm{k}_{\mathrm{B}}\left(T_{I}-T_{IV}\right), \\
\int_{II}^{III} T \,\mathrm{d} S &= \int_{II}^{III}(\mathrm{d} E+P \,\mathrm{d} V)=\int_{II}^{III}\Big(\frac{3}{2} V \underbrace{\mathrm{d} P}_{=0}+\frac{5}{2} P \,\mathrm{d} V\Big) \\
&= \frac{5}{2} N \mathrm{k}_{\mathrm{B}}\left(T_{III}-T_{II}\right),
\end{align*}
which means that the efficiency $\eta_{\text{Diesel}}$ is given by
\begin{equation*}
\eta_{\text{Diesel}}=1-\frac{3}{5} \frac{T_{IV}-T_{I}}{T_{III}-T_{II}} \tag{6.59}
\end{equation*}
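Formula (6.59) can be verified numerically against the mechanical work $\oint P\,\mathrm{d}V$ for a concrete cycle. The state values below (compression ratio 16, cutoff ratio 2) are arbitrary illustrative choices, with $\mathrm{k}_{\mathrm{B}}=N=1$ so that $T=PV$:

```python
# Illustrative check of (6.59) for a monatomic ideal gas (k_B = N = 1, T = P V).
# State I is chosen freely; II, III, IV follow from the four Diesel steps.
gamma = 5.0 / 3.0

P1, V1 = 1.0, 16.0                      # state I (arbitrary units)
V2 = 1.0                                # adiabatic compression I -> II
P2 = P1 * (V1 / V2) ** gamma
V3, P3 = 2.0, P2                        # isobaric heating II -> III
V4 = V1                                 # adiabatic expansion III -> IV
P4 = P3 * (V3 / V4) ** gamma
T1, T2, T3, T4 = P1 * V1, P2 * V2, P3 * V3, P4 * V4

# Work oint P dV, using W = (P_a V_a - P_b V_b)/(gamma - 1) on adiabats;
# the isochoric step IV -> I does no work (dV = 0).
W = (P1 * V1 - P2 * V2) / (gamma - 1) \
    + P2 * (V3 - V2) \
    + (P3 * V3 - P4 * V4) / (gamma - 1)
Q_in = 2.5 * (T3 - T2)                  # (5/2) N k_B (T_III - T_II), as derived above

eta_direct = W / Q_in
eta_formula = 1.0 - 0.6 * (T4 - T1) / (T3 - T2)   # eq. (6.59)
print(eta_direct, eta_formula)          # the two values agree
```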

6.5 Thermodynamic potentials

The first law can be rewritten in terms of other "thermodynamic potentials", which are sometimes useful, and which are naturally related to different equilibrium ensembles.
We start from the first law of thermodynamics in the form
\begin{equation*}
\mathrm{d} E=T \,\mathrm{d} S-P \,\mathrm{d} V+\mu \,\mathrm{d} N+\ldots \left(=T \,\mathrm{d} S+\sum_{i=1}^{n} J_{i} \,\mathrm{d} X_{i}\right) \tag{6.60}
\end{equation*}
By (6.60), $E$ is naturally viewed as a function of $(S, V, N)$ (or more generally of $S$ and $\{X_{i}\}$). To get a thermodynamic potential that naturally depends on $(T, V, N)$ (or more generally, $T$ and $\{X_{i}\}$), we form the free energy
\begin{equation*}
F=E-T S. \tag{6.61}
\end{equation*}
Taking the differential of this, we get
\begin{align*}
\mathrm{d} F &= \mathrm{d} E-S \,\mathrm{d} T-T \,\mathrm{d} S \\
&= T \,\mathrm{d} S-P \,\mathrm{d} V+\mu \,\mathrm{d} N+\ldots-S \,\mathrm{d} T-T \,\mathrm{d} S \\
&= -S \,\mathrm{d} T-P \,\mathrm{d} V+\mu \,\mathrm{d} N+\ldots \\
&\left(=-S \,\mathrm{d} T+\sum_{i=1}^{n} J_{i} \,\mathrm{d} X_{i}\right)
\end{align*}
Writing out the differential $\mathrm{d} F$ as
\begin{equation*}
\mathrm{d} F=\left.\frac{\partial F}{\partial T}\right|_{V, N} \mathrm{d} T+\left.\frac{\partial F}{\partial V}\right|_{T, N} \mathrm{d} V+\ldots \tag{6.62}
\end{equation*}
and comparing the coefficients, we get
\begin{equation*}
0=\left(\left.\frac{\partial F}{\partial T}\right|_{V, N}+S\right) \mathrm{d} T+\left(\left.\frac{\partial F}{\partial V}\right|_{T, N}+P\right) \mathrm{d} V+\left(\left.\frac{\partial F}{\partial N}\right|_{T, V}-\mu\right) \mathrm{d} N+\ldots \tag{6.63}
\end{equation*}
This yields the following relations:
\begin{equation*}
S=-\left.\frac{\partial F}{\partial T}\right|_{V, N}, \quad P=-\left.\frac{\partial F}{\partial V}\right|_{T, N}, \quad \mu=\left.\frac{\partial F}{\partial N}\right|_{T, V}, \quad \ldots \tag{6.64}
\end{equation*}
By the first of these equations, the entropy $S=S(T, V, N)$ is naturally a function of $(T, V, N)$, which suggests a relation between $F$ and the canonical ensemble. As discussed in section 4.3, in this ensemble we have
\begin{equation*}
\rho=\rho(T, V, N)=\frac{1}{Z} e^{-\frac{H(N, V)}{\mathrm{k}_{\mathrm{B}} T}} \quad \text{and} \quad S=-\mathrm{k}_{\mathrm{B}} \operatorname{tr} \rho \log \rho \tag{6.65}
\end{equation*}
We now seek an $F$ satisfying $S=-\left.\frac{\partial F}{\partial T}\right|_{V, N}$. A simple calculation shows
\begin{equation*}
F(T, V, N)=-\mathrm{k}_{\mathrm{B}} T \log Z(T, V, N) \tag{6.66}
\end{equation*}
Indeed:
\begin{align*}
\frac{\partial F}{\partial T} &= -\mathrm{k}_{\mathrm{B}}\left\{\log \operatorname{tr} e^{-\frac{H}{\mathrm{k}_{\mathrm{B}} T}}+\frac{1}{\mathrm{k}_{\mathrm{B}} T} \frac{\operatorname{tr} H e^{-\frac{H}{\mathrm{k}_{\mathrm{B}} T}}}{\operatorname{tr} e^{-\frac{H}{\mathrm{k}_{\mathrm{B}} T}}}\right\} \\
&= \mathrm{k}_{\mathrm{B}} \operatorname{tr} \rho \log \rho=-S
\end{align*}
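For a concrete illustration, one can check $S=-\partial F/\partial T$ numerically for a two-level system. This example is not taken from the text; the energies $0$ and $\varepsilon$ are arbitrary choices, and $\mathrm{k}_{\mathrm{B}}=1$ is assumed:

```python
import math

# Illustrative check of (6.64)/(6.66) for a two-level system with energies
# 0 and eps (k_B = 1): S = -dF/dT, computed by a central difference, should
# agree with the Gibbs entropy -tr(rho log rho) of the canonical ensemble.
eps, T = 1.0, 0.7

def free_energy(T):
    Z = 1.0 + math.exp(-eps / T)        # canonical partition function
    return -T * math.log(Z)             # F = -k_B T log Z, eq. (6.66)

h = 1e-6
S_from_F = -(free_energy(T + h) - free_energy(T - h)) / (2 * h)

# Gibbs entropy of rho = e^{-H/T} / Z
Z = 1.0 + math.exp(-eps / T)
p = [1.0 / Z, math.exp(-eps / T) / Z]   # Boltzmann weights of the two levels
S_gibbs = -sum(q * math.log(q) for q in p)

print(S_from_F, S_gibbs)                # the two values agree closely
```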
In the same way, we may look for a function $G$ of the variables $(T, \mu, V)$. To this end, we form the grand potential
\begin{equation*}
G=E-T S-\mu N=F-\mu N \tag{6.67}
\end{equation*}
The differential of $G$ is
\begin{align*}
\mathrm{d} G &= \mathrm{d} F-\mu \,\mathrm{d} N-N \,\mathrm{d} \mu \\
&= -S \,\mathrm{d} T-P \,\mathrm{d} V+\mu \,\mathrm{d} N-\mu \,\mathrm{d} N-N \,\mathrm{d} \mu \\
&= -S \,\mathrm{d} T-P \,\mathrm{d} V-N \,\mathrm{d} \mu
\end{align*}
Writing out $\mathrm{d} G$ as
\begin{equation*}
\mathrm{d} G=\left.\frac{\partial G}{\partial T}\right|_{V, \mu} \mathrm{d} T+\left.\frac{\partial G}{\partial V}\right|_{T, \mu} \mathrm{d} V+\ldots
\end{equation*}
and comparing the coefficients, we get
\begin{equation*}
0=\left(\left.\frac{\partial G}{\partial T}\right|_{V, \mu}+S\right) \mathrm{d} T+\left(\left.\frac{\partial G}{\partial V}\right|_{T, \mu}+P\right) \mathrm{d} V+\left(\left.\frac{\partial G}{\partial \mu}\right|_{T, V}+N\right) \mathrm{d} \mu
\end{equation*}
which yields the relations
\begin{equation*}
S=-\left.\frac{\partial G}{\partial T}\right|_{V, \mu}, \quad N=-\left.\frac{\partial G}{\partial \mu}\right|_{T, V}, \quad P=-\left.\frac{\partial G}{\partial V}\right|_{T, \mu} \tag{6.68}
\end{equation*}
In the first of these equations, $S$ is naturally viewed as a function of the variables $(T, \mu, V)$, suggesting a relationship between $G$ and the grand canonical ensemble. As discussed in section 4.4, in this ensemble we have
\begin{equation*}
\rho(T, \mu, V)=\frac{1}{Y} e^{-\frac{H(V)-\mu \hat{N}}{\mathrm{k}_{\mathrm{B}} T}} \quad \text{and} \quad S=-\mathrm{k}_{\mathrm{B}} \operatorname{tr} \rho \log \rho \tag{6.69}
\end{equation*}
We now seek a function $G$ satisfying $S=-\left.\frac{\partial G}{\partial T}\right|_{\mu, V}$ and $N=-\left.\frac{\partial G}{\partial \mu}\right|_{T, V}$. An easy calculation reveals
\begin{equation*}
G(T, \mu, V)=-\mathrm{k}_{\mathrm{B}} T \log Y(T, \mu, V) \tag{6.70}
\end{equation*}
Indeed:
\begin{align*}
\frac{\partial G}{\partial T} &= -\mathrm{k}_{\mathrm{B}}\left\{\log \operatorname{tr} e^{-\frac{H-\mu \hat{N}}{\mathrm{k}_{\mathrm{B}} T}}+\frac{1}{\mathrm{k}_{\mathrm{B}} T} \frac{\operatorname{tr}(H-\mu \hat{N}) e^{-\frac{H-\mu \hat{N}}{\mathrm{k}_{\mathrm{B}} T}}}{\operatorname{tr} e^{-\frac{H-\mu \hat{N}}{\mathrm{k}_{\mathrm{B}} T}}}\right\} \\
&= \mathrm{k}_{\mathrm{B}} \operatorname{tr} \rho \log \rho=-S
\end{align*}
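Analogously, the relation $N=-\left.\partial G/\partial \mu\right|_{T,V}$ can be illustrated numerically for the simplest grand canonical system, a single fermionic mode. The energy and chemical potential below are arbitrary choices, with $\mathrm{k}_{\mathrm{B}}=1$:

```python
import math

# Illustrative check of N = -dG/dmu from (6.68)/(6.70) for one fermionic mode
# of energy eps (k_B = 1): Y = 1 + e^{(mu - eps)/T}, and the derivative should
# reproduce the Fermi-Dirac occupation <N-hat>.
eps, T, mu = 1.0, 0.5, 0.8

def grand_potential(mu):
    Y = 1.0 + math.exp((mu - eps) / T)   # grand partition function
    return -T * math.log(Y)              # G = -k_B T log Y, eq. (6.70)

h = 1e-6
N_from_G = -(grand_potential(mu + h) - grand_potential(mu - h)) / (2 * h)
N_fermi = 1.0 / (math.exp((eps - mu) / T) + 1.0)

print(N_from_G, N_fermi)                 # the two values agree closely
```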
The second relation can be demonstrated in a similar way (with $N=\langle\hat{N}\rangle$). To get a function $H$ which naturally depends on the variables $(P, T, N)$, we form the free enthalpy (or Gibbs potential)
\begin{equation*}
H=E-T S+P V=F+P V. \tag{6.71}
\end{equation*}
It satisfies the relations
\begin{equation*}
S=-\left.\frac{\partial H}{\partial T}\right|_{P, N}, \quad \mu=\left.\frac{\partial H}{\partial N}\right|_{P, T}, \quad V=\left.\frac{\partial H}{\partial P}\right|_{N, T} \tag{6.72}
\end{equation*}
or equivalently
\begin{equation*}
\mathrm{d} H=-S \,\mathrm{d} T+V \,\mathrm{d} P+\mu \,\mathrm{d} N \tag{6.73}
\end{equation*}
The free enthalpy is often used in the context of chemical processes, because these naturally occur at constant atmospheric pressure. For processes at constant pressure $P$ (isobaric processes) we have
\begin{equation*}
\mathrm{d} H=-S \,\mathrm{d} T+\mu \,\mathrm{d} N \tag{6.74}
\end{equation*}
Assuming that the entropy $S=S\left(E, V, N_{i}, \ldots\right)$ is an extensive quantity, we can derive relations between the various potentials. The extensive property of $S$ means that
\begin{equation*}
S\left(\lambda E, \lambda V, \lambda N_{i}\right)=\lambda S\left(E, V, N_{i}\right), \quad \text{for } \lambda>0 \tag{6.75}
\end{equation*}
Taking the partial derivative $\frac{\partial}{\partial \lambda}$ of this expression at $\lambda=1$ gives
\begin{equation*}
S=\left.\frac{\partial S}{\partial E}\right|_{V, N_{i}} E+\left.\frac{\partial S}{\partial V}\right|_{E, N_{i}} V+\sum_{i}\left.\frac{\partial S}{\partial N_{i}}\right|_{V, E} N_{i} \tag{6.76}
\end{equation*}
Together with the relations
\begin{equation*}
\left.\frac{\partial S}{\partial E}\right|_{V, N_{i}}=\frac{1}{T}, \quad \left.\frac{\partial S}{\partial V}\right|_{E, N_{i}}=\frac{P}{T}, \quad \left.\frac{\partial S}{\partial N_{i}}\right|_{V, E}=-\frac{\mu_{i}}{T} \tag{6.77}
\end{equation*}
we find the Gibbs-Duhem relation (after multiplication by $T$ ):
(6.78) E + P V i μ i N i T S = 0 (6.78) E + P V i μ i N i T S = 0 {:(6.78)E+PV-sum_(i)mu_(i)N_(i)-TS=0:}\begin{equation*} E+P V-\sum_{i} \mu_{i} N_{i}-T S=0 \tag{6.78} \end{equation*}(6.78)E+PViμiNiTS=0
or equivalently
(6.79) H = i μ i N i . (6.79) H = i μ i N i . {:(6.79)H=sum_(i)mu_(i)N_(i).:}\begin{equation*} H=\sum_{i} \mu_{i} N_{i} . \tag{6.79} \end{equation*}(6.79)H=iμiNi.
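The extensivity argument can be checked symbolically for a concrete case. The sketch below (assuming sympy is available; units with $k_{B}=m=1$ and constant prefactors dropped) takes a Sackur-Tetrode-type entropy for a single-component monoatomic ideal gas and verifies the relation (6.78):

```python
import sympy as sp

E, V, N = sp.symbols('E V N', positive=True)

# Sackur-Tetrode-type entropy (k_B = m = 1); manifestly extensive:
# S(lambda*E, lambda*V, lambda*N) = lambda * S(E, V, N)
S = N * (sp.log(V/N * (4*sp.pi*E/(3*N))**sp.Rational(3, 2)) + sp.Rational(5, 2))

T = 1 / sp.diff(S, E)       # from dS/dE|_{V,N} = 1/T
P = T * sp.diff(S, V)       # from dS/dV|_{E,N} = P/T
mu = -T * sp.diff(S, N)     # from dS/dN|_{V,E} = -mu/T

# Gibbs-Duhem relation (6.78): E + P V - mu N - T S = 0
assert sp.simplify(E + P*V - mu*N - T*S) == 0
```

As a byproduct, the intermediate expressions reproduce the familiar ideal gas relations $T=2E/(3N)$ and $PV=Nk_{B}T$ in these units.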
Let us summarize the properties of the potentials we have discussed so far in a table:
| Thermodynamic potential | Definition | First Law | Natural variables |
| :---: | :---: | :---: | :---: |
| entropy $S$ | fundamental | $T \mathrm{d} S=\mathrm{d} E+P \mathrm{d} V-\mu \mathrm{d} N$ | $E, V, N$ |
| free energy $F$ | $F=E-T S$ | $\mathrm{d} F=-S \mathrm{d} T-P \mathrm{d} V+\mu \mathrm{d} N$ | $T, V, N$ |
| grand potential $G$ | $G=E-T S-\mu N$ | $\mathrm{d} G=-S \mathrm{d} T-P \mathrm{d} V-N \mathrm{d} \mu$ | $T, V, \mu$ |
| free enthalpy $H$ | $H=E-T S+P V$ | $\mathrm{d} H=-S \mathrm{d} T+V \mathrm{d} P+\mu \mathrm{d} N$ | $T, P, N$ |
Table 6.2: Relationship between various thermodynamic potentials
The relationship between the various potentials can be further elucidated by means of the Legendre transform. This characterization is important because it makes transparent the convexity and concavity properties of $G, F$ that follow from the concavity of $S$.

Example: Virial expansion and van der Waals equation of state

As an example of the use of potentials, we employ the cluster expansion discussed in section 4.6 to derive the equation of state for a realistic monoatomic gas. Calculations are left as an exercise (problem B.26).
Recall that the cluster expansion is given by
1 V log Y = 1 λ 3 l = 1 b l ( V , β ) z l 1 V log Y = 1 λ 3 l = 1 b l ( V , β ) z l (1)/(V)log Y=(1)/(lambda^(3))sum_(l=1)^(oo)b_(l)(V,beta)z^(l)\frac{1}{V} \log Y=\frac{1}{\lambda^{3}} \sum_{l=1}^{\infty} b_{l}(V, \beta) z^{l}1VlogY=1λ3l=1bl(V,β)zl
where Y Y YYY is the grand canonical partition function, λ λ lambda\lambdaλ is the thermal de Broglie wavelength, and z = e β μ z = e β μ z=e^(beta mu)z=\mathrm{e}^{\beta \mu}z=eβμ is the fugacity. Using the Gibbs-Duhem relation and (6.70), the cluster expansion gives an expansion of the pressure,
P = k B T λ 3 l = 1 b l ( V , β ) z l P = k B T λ 3 l = 1 b l ( V , β ) z l P=(k_(B)T)/(lambda^(3))sum_(l=1)^(oo)b_(l)(V,beta)z^(l)P=\frac{k_{B} T}{\lambda^{3}} \sum_{l=1}^{\infty} b_{l}(V, \beta) z^{l}P=kBTλ3l=1bl(V,β)zl
By (6.68) and (6.70), it also yields an expansion of the particle density n = N / V n = N / V n=N//Vn=N / Vn=N/V,
n = 1 λ 3 l = 1 l b l ( V , β ) e l β μ n = 1 λ 3 l = 1 l b l ( V , β ) e l β μ n=(1)/(lambda^(3))sum_(l=1)^(oo)lb_(l)(V,beta)e^(l beta mu)n=\frac{1}{\lambda^{3}} \sum_{l=1}^{\infty} l b_{l}(V, \beta) \mathrm{e}^{l \beta \mu}n=1λ3l=1lbl(V,β)elβμ
We would like to eliminate z z zzz in favor of the more accessible number density n n nnn. For this,
we make the ansatz
z = λ 3 n + a 2 ( λ 3 n ) 2 + a 3 ( λ 3 n ) 3 + z = λ 3 n + a 2 λ 3 n 2 + a 3 λ 3 n 3 + z=lambda^(3)n+a_(2)(lambda^(3)n)^(2)+a_(3)(lambda^(3)n)^(3)+dotsz=\lambda^{3} n+a_{2}\left(\lambda^{3} n\right)^{2}+a_{3}\left(\lambda^{3} n\right)^{3}+\ldotsz=λ3n+a2(λ3n)2+a3(λ3n)3+
and determine the coefficients a 2 , a 3 , a 2 , a 3 , a_(2),a_(3),dotsa_{2}, a_{3}, \ldotsa2,a3, from the expansion of n n nnn :
\begin{equation*} a_{2}=-2 b_{2}, \quad a_{3}=8 b_{2}^{2}-3 b_{3}, \ldots \end{equation*}
Plugging this into the expansion of P P PPP, we obtain
P = k B T n ( 1 + B 2 ( T ) n + B 3 ( T ) n 2 + ) P = k B T n 1 + B 2 ( T ) n + B 3 ( T ) n 2 + P=k_(B)Tn(1+B_(2)(T)n+B_(3)(T)n^(2)+dots)P=k_{B} T n\left(1+B_{2}(T) n+B_{3}(T) n^{2}+\ldots\right)P=kBTn(1+B2(T)n+B3(T)n2+)
where
\begin{equation*} B_{2}=-b_{2} \lambda^{3}, \quad B_{3}=\left(4 b_{2}^{2}-2 b_{3}\right) \lambda^{6}, \ldots \end{equation*}
This is known as the virial expansion, and the B l B l B_(l)B_{l}Bl are known as the virial coefficients. The first order contribution yields the equation of state of the classical monoatomic ideal gas. It corresponds to the situation where the particles do not interact. For a realistic dilute gas, it is reasonable to expect that low orders in the expansion give a good approximation; we consider second order. At this order, we obtain
(6.80) P k B T n ( 1 + B 2 ( T ) n ) . (6.80) P k B T n 1 + B 2 ( T ) n . {:(6.80)P~~k_(B)Tn(1+B_(2)(T)n).:}\begin{equation*} P \approx k_{B} T n\left(1+B_{2}(T) n\right) . \tag{6.80} \end{equation*}(6.80)PkBTn(1+B2(T)n).
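The inversion of the fugacity series leading to the coefficients $a_{l}$ and $B_{l}$ can be automated with a computer algebra system. A sketch (assuming sympy; setting $b_{1}=1$ and writing $x=\lambda^{3} n$), keeping terms up to third order:

```python
import sympy as sp

n, lam, b2, b3, a2, a3 = sp.symbols('n lambda b_2 b_3 a_2 a_3')

x = lam**3 * n
z = x + a2*x**2 + a3*x**3                  # ansatz for the fugacity

# n = (1/lambda^3)(z + 2 b_2 z^2 + 3 b_3 z^3 + ...) must hold order by order
# in n, so the coefficients of n^2 and n^3 have to vanish identically
n_series = sp.expand((z + 2*b2*z**2 + 3*b3*z**3) / lam**3)
sol = sp.solve([n_series.coeff(n, 2), n_series.coeff(n, 3)], [a2, a3])
# sol: {a_2: -2*b_2, a_3: 8*b_2**2 - 3*b_3}

# pressure series P/(k_B T) = (1/lambda^3)(z + b_2 z^2 + b_3 z^3 + ...)
P_series = sp.expand(((z + b2*z**2 + b3*z**3) / lam**3).subs(sol))
B2 = sp.factor(P_series.coeff(n, 2))       # -b_2 * lambda^3
B3 = sp.factor(P_series.coeff(n, 3))       # (4*b_2**2 - 2*b_3) * lambda^6
```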
Let us compute B 2 B 2 B_(2)B_{2}B2 under the assumption that the interaction is described by the spherically symmetric two-body potential
\mathcal{V}(r)= \begin{cases}+\infty & r<r_{0} \\ -u_{0}\left(\frac{r_{0}}{r}\right)^{6} & r \geqslant r_{0}\end{cases}
where r 0 r 0 r_(0)r_{0}r0 is twice the radius of the atoms. This potential models a hard core repulsion at atomic distances and a moderately decreasing attraction at large distances. Using the integral formula (4.119) for b 2 b 2 b_(2)b_{2}b2, in the high temperature limit we find
B 2 V a 2 ( 1 u 0 k B T ) , B 2 V a 2 1 u 0 k B T , B_(2)~~(V_(a))/(2)(1-(u_(0))/(k_(B)T)),B_{2} \approx \frac{V_{a}}{2}\left(1-\frac{u_{0}}{k_{B} T}\right),B2Va2(1u0kBT),
where
$V_{a}=\frac{4 \pi}{3} r_{0}^{3}$
is the effective volume of one particle. Plugging this into the second order virial expansion (6.80), we get
P k B T n + V a 2 ( k B T u 0 ) n 2 . P k B T n + V a 2 k B T u 0 n 2 . P~~k_(B)Tn+(V_(a))/(2)(k_(B)T-u_(0))n^(2).P \approx k_{B} T n+\frac{V_{a}}{2}\left(k_{B} T-u_{0}\right) n^{2} .PkBTn+Va2(kBTu0)n2.
Substituting
1 + x 1 1 x 1 + x 1 1 x 1+x~~(1)/(1-x)1+x \approx \frac{1}{1-x}1+x11x
this can be written in the form
(6.81) ( P + a ( N V ) 2 ) ( V N b ) N k B T (6.81) P + a N V 2 ( V N b ) N k B T {:(6.81)(P+a((N)/(V))^(2))(V-Nb)~~Nk_(B)T:}\begin{equation*} \left(P+a\left(\frac{N}{V}\right)^{2}\right)(V-N b) \approx N k_{B} T \tag{6.81} \end{equation*}(6.81)(P+a(NV)2)(VNb)NkBT
with the coefficients
a = V a u 0 2 , b = V a 2 . a = V a u 0 2 , b = V a 2 . a=(V_(a)u_(0))/(2),quad b=(V_(a))/(2).a=\frac{V_{a} u_{0}}{2}, \quad b=\frac{V_{a}}{2} .a=Vau02,b=Va2.
Equation (6.81) is known as the van der Waals equation. The coefficient $b$ can be interpreted as a volume per particle by which the volume available to the system is reduced due to the mutual exclusion of the particles. Since $r_{0}$ is the distance of minimal approach, i.e., twice the radius of the particles, $b$ amounts to half the effective volume $V_{a}$, i.e., four times the actual volume of a particle. The coefficient $a$ decreases the pressure $P$ in the system due to the attractive particle interaction. In applications we should bear in mind that the equation can be expected to give a good approximation when the volume per particle, $V/N$, is much larger than the effective volume of the particles, $V/N \gg V_{a}$, and for high temperatures, $u_{0} \ll k_{B} T$.
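The high-temperature approximation for $B_{2}$ can also be cross-checked numerically, using the standard Mayer-function expression $B_{2}=-\frac{1}{2} \int\left(\mathrm{e}^{-\beta \mathcal{V}(r)}-1\right) 4 \pi r^{2} \mathrm{~d} r$. A sketch with hypothetical parameter values (numpy only; $r_{0}=1$ and $u_{0}/k_{B}T=0.02$ in arbitrary units):

```python
import numpy as np

r0, u0, kT = 1.0, 1.0, 50.0          # hypothetical units; high-T regime u0 << kT
Va = 4.0 * np.pi / 3.0 * r0**3       # effective volume of one particle

# hard core (r < r0): Mayer function f = -1 there, contributing +Va/2 to B2;
# attractive tail (r >= r0): f(r) = exp(u0 (r0/r)^6 / kT) - 1, integrated numerically
r = np.linspace(r0, 60.0 * r0, 200001)
g = (np.exp(u0 * (r0 / r)**6 / kT) - 1.0) * 4.0 * np.pi * r**2
tail = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r))    # trapezoidal rule

B2_numeric = Va / 2.0 - 0.5 * tail
B2_highT = Va / 2.0 * (1.0 - u0 / kT)                 # approximation from the text
print(B2_numeric, B2_highT)
```

For $u_{0}/k_{B}T$ this small, the exact integral and the high-temperature formula agree up to corrections of second order in $u_{0}/k_{B}T$.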

6.6 Chemical Equilibrium

We consider chemical reactions characterized by a k k kkk-tuple r = ( r 1 , , r k ) r _ = r 1 , , r k r_=(r_(1),dots,r_(k))\underline{r}=\left(r_{1}, \ldots, r_{k}\right)r=(r1,,rk) of integers corresponding to a chemical reaction of the form
(6.82) r i < 0 | r i | χ i r i > 0 | r i | χ i (6.82) r i < 0 r i χ i r i > 0 r i χ i {:(6.82)sum_(r_(i) < 0)|r_(i)|chi_(i)⇆sum_(r_(i) > 0)|r_(i)|chi_(i):}\begin{equation*} \sum_{r_{i}<0}\left|r_{i}\right| \chi_{i} \leftrightarrows \sum_{r_{i}>0}\left|r_{i}\right| \chi_{i} \tag{6.82} \end{equation*}(6.82)ri<0|ri|χiri>0|ri|χi
where χ i χ i chi_(i)\chi_{i}χi is the chemical symbol of the i i iii-th compound. For example, the reaction
C + O 2 CO 2 C + O 2 CO 2 C+O_(2)⇆CO_(2)\mathrm{C}+\mathrm{O}_{2} \leftrightarrows \mathrm{CO}_{2}C+O2CO2
is described by χ 1 = C , χ 2 = O 2 , χ 3 = CO 2 χ 1 = C , χ 2 = O 2 , χ 3 = CO 2 chi_(1)=C,chi_(2)=O_(2),chi_(3)=CO_(2)\chi_{1}=\mathrm{C}, \chi_{2}=\mathrm{O}_{2}, \chi_{3}=\mathrm{CO}_{2}χ1=C,χ2=O2,χ3=CO2 and r 1 = 1 , r 2 = 1 , r 3 = + 1 r 1 = 1 , r 2 = 1 , r 3 = + 1 r_(1)=-1,r_(2)=-1,r_(3)=+1r_{1}=-1, r_{2}=-1, r_{3}=+1r1=1,r2=1,r3=+1, or r = r _ = r_=\underline{r}=r= ( 1 , 1 , + 1 ) ( 1 , 1 , + 1 ) (-1,-1,+1)(-1,-1,+1)(1,1,+1). The full system is described by some complicated Hamiltonian H ( V ) H ( V ) H(V)H(V)H(V) and number operators N ^ i N ^ i hat(N)_(i)\hat{N}_{i}N^i for the i i iii-th compound. Since the dynamics can change the particle number, we will have [ H ( V ) , N ^ i ] 0 H ( V ) , N ^ i 0 [H(V), hat(N)_(i)]!=0\left[H(V), \hat{N}_{i}\right] \neq 0[H(V),N^i]0 in general. We imagine that an entropy S ( E , V , { N i } ) S E , V , N i S(E,V,{N_(i)})S\left(E, V,\left\{N_{i}\right\}\right)S(E,V,{Ni}) can be assigned to an ensemble of states with energy between E Δ E E Δ E E-Delta EE-\Delta EEΔE and E E EEE, and average particle numbers { N i = N ^ i } N i = N ^ i {N_(i)=(: hat(N)_(i):)}\left\{N_{i}=\left\langle\hat{N}_{i}\right\rangle\right\}{Ni=N^i}, but we note that the definition of S S SSS in microscopic terms is far from obvious because N ^ i N ^ i hat(N)_(i)\hat{N}_{i}N^i is not a constant of motion.
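The reaction-tuple notation can be made concrete: a tuple $\underline{r}$ describes an admissible reaction precisely when it conserves every chemical element. A minimal sketch for the example above (the composition matrix simply lists the atom content of each compound):

```python
import numpy as np

# compounds chi = (C, O2, CO2), described by their (carbon, oxygen) atom content
composition = np.array([[1, 0],    # C
                        [0, 2],    # O2
                        [1, 2]])   # CO2

r = np.array([-1, -1, +1])         # reaction tuple for C + O2 <-> CO2

# element conservation: the net atom balance r . composition must vanish
assert np.all(r @ composition == 0)
```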
The entropy should be maximized in equilibrium. Since N = ( N 1 , , N k ) N _ = N 1 , , N k N_=(N_(1),dots,N_(k))\underline{N}=\left(N_{1}, \ldots, N_{k}\right)N=(N1,,Nk) changes
by r = ( r 1 , , r k ) r _ = r 1 , , r k r_=(r_(1),dots,r_(k))\underline{r}=\left(r_{1}, \ldots, r_{k}\right)r=(r1,,rk) in a reaction, the necessary condition for equilibrium is
(6.83) d d n S ( E , V , N + n r ) | n = 0 = 0 (6.83) d d n S ( E , V , N _ + n r _ ) n = 0 = 0 {:(6.83)(d)/(dn)S(E,V,N_+nr_)|_(n=0)=0:}\begin{equation*} \left.\frac{d}{d n} S(E, V, \underline{N}+n \underline{r})\right|_{n=0}=0 \tag{6.83} \end{equation*}(6.83)ddnS(E,V,N+nr)|n=0=0
Since by definition S N i | V , E = μ i T S N i V , E = μ i T (del S)/(delN_(i))|_(V,E)=-(mu_(i))/(T)\left.\frac{\partial S}{\partial N_{i}}\right|_{V, E}=-\frac{\mu_{i}}{T}SNi|V,E=μiT, in equilibrium we must have
(6.84) 0 = μ r = i = 1 k μ i r i (6.84) 0 = μ _ r _ = i = 1 k μ i r i {:(6.84)0=mu _*r_=sum_(i=1)^(k)mu_(i)r_(i):}\begin{equation*} 0=\underline{\mu} \cdot \underline{r}=\sum_{i=1}^{k} \mu_{i} r_{i} \tag{6.84} \end{equation*}(6.84)0=μr=i=1kμiri
Let us now assume that in equilibrium we can use the expression for μ i μ i mu_(i)\mu_{i}μi of an ideal gas with k k kkk distinguishable components and N i N i N_(i)N_{i}Ni indistinguishable particles of the i i iii-th component. This is basically the assumption that interactions contribute negligibly to the entropy of the equilibrium state. According to the discussion in section 4.2.3 the total entropy is given by
(6.85) S = i = 1 k S i + Δ S (6.85) S = i = 1 k S i + Δ S {:(6.85)S=sum_(i=1)^(k)S_(i)+Delta S:}\begin{equation*} S=\sum_{i=1}^{k} S_{i}+\Delta S \tag{6.85} \end{equation*}(6.85)S=i=1kSi+ΔS
where S i = S ( E i , V i , N i ) S i = S E i , V i , N i S_(i)=S(E_(i),V_(i),N_(i))S_{i}=S\left(E_{i}, V_{i}, N_{i}\right)Si=S(Ei,Vi,Ni) is the entropy of the i i iii-th species, Δ S Δ S Delta S\Delta SΔS is the mixing entropy, and we have
(6.86) N i V i = N V , N i = N , V i = V , E i = E (6.86) N i V i = N V , N i = N , V i = V , E i = E {:(6.86)(N_(i))/(V_(i))=(N)/(V)","quad sumN_(i)=N","quad sumV_(i)=V","quad sumE_(i)=E:}\begin{equation*} \frac{N_{i}}{V_{i}}=\frac{N}{V}, \quad \sum N_{i}=N, \quad \sum V_{i}=V, \quad \sum E_{i}=E \tag{6.86} \end{equation*}(6.86)NiVi=NV,Ni=N,Vi=V,Ei=E
The entropy of the i i iii-th species is given by
(6.87) S i = N i k B [ log e V i N i + log ( 4 3 E i N i π e m i ) 3 2 ] (6.87) S i = N i k B log e V i N i + log 4 3 E i N i π e m i 3 2 {:(6.87)S_(i)=N_(i)k_(B)[log((eV_(i))/(N_(i)))+log ((4)/(3)(E_(i))/(N_(i))pi em_(i))^((3)/(2))]:}\begin{equation*} S_{i}=N_{i} \mathrm{k}_{\mathrm{B}}\left[\log \frac{e V_{i}}{N_{i}}+\log \left(\frac{4}{3} \frac{E_{i}}{N_{i}} \pi e m_{i}\right)^{\frac{3}{2}}\right] \tag{6.87} \end{equation*}(6.87)Si=NikB[logeViNi+log(43EiNiπemi)32]
The mixing entropy is given by
(6.88) Δ S = N k B i = 1 k ( c i log c i c i ) , (6.88) Δ S = N k B i = 1 k c i log c i c i , {:(6.88)Delta S=-Nk_(B)sum_(i=1)^(k)(c_(i)log c_(i)-c_(i))",":}\begin{equation*} \Delta S=-N \mathrm{k}_{\mathrm{B}} \sum_{i=1}^{k}\left(c_{i} \log c_{i}-c_{i}\right), \tag{6.88} \end{equation*}(6.88)ΔS=NkBi=1k(cilogcici),
where c i = N i N c i = N i N c_(i)=(N_(i))/(N)c_{i}=\frac{N_{i}}{N}ci=NiN is the concentration of the i i iii-th component. Let μ ¯ i μ ¯ i bar(mu)_(i)\bar{\mu}_{i}μ¯i be the chemical potential of the i i iii-th species without taking into account the contribution due to the mixing:
\begin{aligned} \frac{\bar{\mu}_{i}}{T}=-\left.\frac{\partial S_{i}}{\partial N_{i}}\right|_{V_{i}, E_{i}} & =-\mathrm{k}_{\mathrm{B}} \log \left[\frac{V_{i}}{N_{i}}\left(\frac{4 \pi m_{i} E_{i}}{3 N_{i}}\right)^{\frac{3}{2}}\right] \\ & =-\frac{S_{i}}{N_{i}}+\frac{5}{2} \mathrm{k}_{\mathrm{B}} \end{aligned}
For the total chemical potential of the $i$-th species we have:
μ i = μ ¯ i + k B T log c i = 5 2 k B T S i T N i + k B T log c i = 1 N i ( E i + P V i T S i ) + k B T log c i = h i H i / N i = free enthalpy per particle for species i + k B T log c i , μ i = μ ¯ i + k B T log c i = 5 2 k B T S i T N i + k B T log c i = 1 N i E i + P V i T S i + k B T log c i = h i H i / N i =  free enthalpy   per particle for species  i + k B T log c i , {:[mu_(i)= bar(mu)_(i)+k_(B)T log c_(i)],[=(5)/(2)k_(B)T-(S_(i)T)/(N_(i))+k_(B)T log c_(i)],[=(1)/(N_(i))(E_(i)+PV_(i)-TS_(i))+k_(B)T log c_(i)],[=ubrace(h_(i)ubrace)_({:[H_(i)//N_(i)=" free enthalpy "],[" per particle for species "i]:})+k_(B)T log c_(i)","]:}\begin{aligned} \mu_{i} & =\bar{\mu}_{i}+\mathrm{k}_{\mathrm{B}} T \log c_{i} \\ & =\frac{5}{2} \mathrm{k}_{\mathrm{B}} T-\frac{S_{i} T}{N_{i}}+\mathrm{k}_{\mathrm{B}} T \log c_{i} \\ & =\frac{1}{N_{i}}\left(E_{i}+P V_{i}-T S_{i}\right)+\mathrm{k}_{\mathrm{B}} T \log c_{i} \\ & =\underbrace{h_{i}}_{\substack{H_{i} / N_{i}=\text { free enthalpy } \\ \text { per particle for species } i}}+\mathrm{k}_{\mathrm{B}} T \log c_{i}, \end{aligned}μi=μ¯i+kBTlogci=52kBTSiTNi+kBTlogci=1Ni(Ei+PViTSi)+kBTlogci=hiHi/Ni= free enthalpy  per particle for species i+kBTlogci,
where we have used the equations of state for the ideal gas for each species. From this it follows that the condition for equilibrium becomes
(6.89) 0 = i μ i r i = i ( h i r i + k B T log c i r i ) , (6.89) 0 = i μ i r i = i h i r i + k B T log c i r i , {:(6.89)0=sum_(i)mu_(i)*r_(i)=sum_(i)(h_(i)r_(i)+k_(B)T log c_(i)^(r_(i)))",":}\begin{equation*} 0=\sum_{i} \mu_{i} \cdot r_{i}=\sum_{i}\left(h_{i} r_{i}+\mathrm{k}_{\mathrm{B}} T \log c_{i}^{r_{i}}\right), \tag{6.89} \end{equation*}(6.89)0=iμiri=i(hiri+kBTlogciri),
which yields
(6.90) 1 = e Δ h k B T i c i r i (6.90) 1 = e Δ h k B T i c i r i {:(6.90)1=e^((Delta h)/(k_(B)T))prod_(i)c_(i)^(r_(i)):}\begin{equation*} 1=e^{\frac{\Delta h}{\mathrm{k}_{\mathrm{B}} T}} \prod_{i} c_{i}^{r_{i}} \tag{6.90} \end{equation*}(6.90)1=eΔhkBTiciri
or equivalently
(6.91) e Δ h k B T = r i > 0 c i | r i | r i < 0 c i | r i | (6.91) e Δ h k B T = r i > 0 c i r i r i < 0 c i r i {:(6.91)e^(-(Delta h)/(k_(B)T))=(prod_(r_(i) > 0)c_(i)^(|r_(i)|))/(prod_(r_(i) < 0)c_(i)^(|r_(i)|)):}\begin{equation*} e^{-\frac{\Delta h}{\mathrm{k}_{\mathrm{B}} T}}=\frac{\prod_{r_{i}>0} c_{i}^{\left|r_{i}\right|}}{\prod_{r_{i}<0} c_{i}^{\left|r_{i}\right|}} \tag{6.91} \end{equation*}(6.91)eΔhkBT=ri>0ci|ri|ri<0ci|ri|
with Δ h = i r i h i = Δ h = i r i h i = Delta h=sum_(i)r_(i)h_(i)=\Delta h=\sum_{i} r_{i} h_{i}=Δh=irihi= enthalpy increase for one reaction. The above relation is sometimes called the "mass-action law". It is clearly in general not an exact relation, because we have treated the constituents as ideal gases. Nevertheless, it is often a surprisingly good approximation.
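As a numerical illustration of the mass-action law, consider again $\mathrm{C}+\mathrm{O}_{2} \leftrightarrows \mathrm{CO}_{2}$ with $\underline{r}=(-1,-1,+1)$. Given a (hypothetical) value of the equilibrium constant $K=\mathrm{e}^{-\Delta h / \mathrm{k}_{\mathrm{B}} T}$, the equilibrium concentrations follow from (6.91) by solving for the reaction extent, e.g. by bisection; a sketch:

```python
import numpy as np

K = 50.0                            # hypothetical value of exp(-Delta h / k_B T)
N0 = np.array([1.0, 1.0, 0.0])      # initial particle numbers of (C, O2, CO2)
r = np.array([-1.0, -1.0, +1.0])    # reaction tuple

def residual(nu):
    """c_CO2 / (c_C * c_O2) - K at reaction extent nu, cf. (6.91)."""
    N = N0 + nu * r
    c = N / N.sum()                 # concentrations c_i = N_i / N
    return c[2] / (c[0] * c[1]) - K

# the concentration ratio grows monotonically from 0 to infinity as nu runs
# over (0, 1), so bisection finds the unique equilibrium extent
lo, hi = 1e-9, 1.0 - 1e-9
for _ in range(100):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if residual(mid) < 0 else (lo, mid)
nu_eq = 0.5 * (lo + hi)
c_eq = (N0 + nu_eq * r) / (N0 + nu_eq * r).sum()
```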

6.7 Phase Co-Existence and Clausius-Clapeyron Relation

We consider a system composed of $k$ compounds with particle numbers $N_{1}, \ldots, N_{k}$. It is assumed that chemical reactions are not possible, so each $N_{i}$ is conserved. The entropy is assumed to be given as a function $S=S(\underline{X})$, where $\underline{X}=\left(E, V, N_{1}, \ldots, N_{k}\right)$ (here we also include $E$ among the thermodynamic coordinates). We assume that the system is in an equilibrium state with $\varphi$ coexisting pure phases, labeled by $\alpha=1, \ldots, \varphi$. The equilibrium state for each phase $\alpha$ is thus characterized by some vector $\underline{X}^{(\alpha)}$, or rather the corresponding ray $\left\{\lambda \underline{X}^{(\alpha)} \mid \lambda>0\right\}$, since we can scale up the volume, energy, and numbers of particles by a positive constant.

Examples:

  1. Consider the following example of a phase boundary between coffee and sugar:
Figure 6.12: The phase boundary between solution and a solute.
In this example we have k = 2 k = 2 k=2k=2k=2 compounds (coffee, sugar) with φ = 2 φ = 2 varphi=2\varphi=2φ=2 coexisting phases (solution, sugar at bottom). The solid phase and coffee/sugar solution phases are described by vectors
\begin{equation*} \underline{X}^{(\text {solid })}=\left(E^{(1)}, N_{\text {sugar }}^{(1)}, N_{\text {coffee }}^{(1)}=0, V^{(1)}\right), \quad \underline{X}^{(\text {solution })}=\left(E^{(2)}, N_{\text {sugar }}^{(2)}, N_{\text {coffee }}^{(2)}, V^{(2)}\right), \tag{6.92} \end{equation*}
respectively. In this example we need 2 independent parameters to describe phase equilibrium, such as the temperature T T TTT of the coffee and the concentration c c ccc of sugar, i.e. sweetness of the coffee.
  2. Another example is the ice-vapor-water diagram where we only have $k=1$ substance (water). At the triple point, we have $\varphi=3$ coexisting phases. At the water-vapor boundary, we have $\varphi=2$ coexisting phases, and we need one parameter to fix where we are on this phase boundary. Away from any phase boundary, only $\varphi=1$ phase is possible.
The temperature T T TTT, pressure P P PPP, and chemical potentials μ i μ i mu_(i)\mu_{i}μi must have the same value in each phase, i.e. we have for all α α alpha\alphaα :
(6.93) S E ( X ( α ) ) = 1 T , S V ( X ( α ) ) = P T , S N i ( X ( α ) ) = μ i T . (6.93) S E X _ ( α ) = 1 T , S V X _ ( α ) = P T , S N i X _ ( α ) = μ i T . {:(6.93)(del S)/(del E)(X_^((alpha)))=(1)/(T)","quad(del S)/(del V)(X_^((alpha)))=(P)/(T)","quad(del S)/(delN_(i))(X_^((alpha)))=-(mu_(i))/(T).:}\begin{equation*} \frac{\partial S}{\partial E}\left(\underline{X}^{(\alpha)}\right)=\frac{1}{T}, \quad \frac{\partial S}{\partial V}\left(\underline{X}^{(\alpha)}\right)=\frac{P}{T}, \quad \frac{\partial S}{\partial N_{i}}\left(\underline{X}^{(\alpha)}\right)=-\frac{\mu_{i}}{T} . \tag{6.93} \end{equation*}(6.93)SE(X(α))=1T,SV(X(α))=PT,SNi(X(α))=μiT.
We define a $(k+2)$-component vector $\underline{\xi}$, which is independent of $\alpha$, as follows:
(6.94) ξ = ( 1 T , P T , μ 1 T , , μ k T ) (6.94) ξ _ = 1 T , P T , μ 1 T , , μ k T {:(6.94)xi _=((1)/(T),(P)/(T),-(mu_(1))/(T),dots,-(mu_(k))/(T)):}\begin{equation*} \underline{\xi}=\left(\frac{1}{T}, \frac{P}{T},-\frac{\mu_{1}}{T}, \ldots,-\frac{\mu_{k}}{T}\right) \tag{6.94} \end{equation*}(6.94)ξ=(1T,PT,μ1T,,μkT)
As an example consider the following phase diagram for 6 phases:
Figure 6.13: Imaginary phase diagram for the case of 6 different phases. At each point on a phase boundary which is not an intersection point, φ = 2 φ = 2 varphi=2\varphi=2φ=2 phases are supposed to coexist. At each intersection point φ = 4 φ = 4 varphi=4\varphi=4φ=4 phases are supposed to coexist.
From the discussion in the previous sections we know that
(1) S S SSS is extensive in equilibrium:
(6.95) S ( λ X ) = λ S ( X ) , λ > 0 . (6.95) S ( λ X _ ) = λ S ( X _ ) , λ > 0 . {:(6.95)S(lambdaX_)=lambda S(X_)","quad lambda > 0.:}\begin{equation*} S(\lambda \underline{X})=\lambda S(\underline{X}), \quad \lambda>0 . \tag{6.95} \end{equation*}(6.95)S(λX)=λS(X),λ>0.
(2) S S SSS is a concave function in X R k + 2 X _ R k + 2 X_inR^(k+2)\underline{X} \in \mathbb{R}^{k+2}XRk+2 (subadditivity), and
(6.96) α λ ( α ) S ( X ( α ) ) S ( α λ ( α ) X ( α ) ) , (6.96) α λ ( α ) S X _ ( α ) S α λ ( α ) X _ ( α ) , {:(6.96)sum_(alpha)lambda^((alpha))S(X_^((alpha))) <= S(sum_(alpha)lambda^((alpha))X_^((alpha)))",":}\begin{equation*} \sum_{\alpha} \lambda^{(\alpha)} S\left(\underline{X}^{(\alpha)}\right) \leqslant S\left(\sum_{\alpha} \lambda^{(\alpha)} \underline{X}^{(\alpha)}\right), \tag{6.96} \end{equation*}(6.96)αλ(α)S(X(α))S(αλ(α)X(α)),
as long as $\sum_{\alpha} \lambda^{(\alpha)}=1, \lambda^{(\alpha)} \geqslant 0$. Since the coexisting phases are in equilibrium with each other, we must have "=" rather than "<" in the above inequality. Otherwise, the entropy would attain its maximum at some non-trivial linear combination $\underline{X}_{\max }=\sum_{\alpha} \lambda^{(\alpha)} \underline{X}^{(\alpha)}$, and only the single homogeneous phase given by this maximizer $\underline{X}_{\max }$ could be realized.
By (1) and (2) it follows that in the region C R 2 + k C R 2 + k C subR^(2+k)C \subset \mathbb{R}^{2+k}CR2+k, where several phases can coexist, S S SSS is linear, S ( X ) = ξ X S ( X _ ) = ξ _ X _ S(X_)=xi _*X_S(\underline{X})=\underline{\xi} \cdot \underline{X}S(X)=ξX for all X C , ξ = X _ C , ξ _ = X_in C,xi _=\underline{X} \in C, \underline{\xi}=XC,ξ= const. in C C CCC, and C C CCC consists of positive linear combinations
(6.97) C = { X = α = 1 φ λ ( α ) X ( α ) : λ ( α ) 0 } (6.97) C = X _ = α = 1 φ λ ( α ) X _ ( α ) : λ ( α ) 0 {:(6.97)C={X_=sum_(alpha=1)^(varphi)lambda^((alpha))X_^((alpha))quad:quadlambda^((alpha)) >= 0}:}\begin{equation*} C=\left\{\underline{X}=\sum_{\alpha=1}^{\varphi} \lambda^{(\alpha)} \underline{X}^{(\alpha)} \quad: \quad \lambda^{(\alpha)} \geqslant 0\right\} \tag{6.97} \end{equation*}(6.97)C={X=α=1φλ(α)X(α):λ(α)0}
in other words, the coexistence region C C CCC is a convex cone generated by the vectors X ( α ) , α = 1 , , φ X _ ( α ) , α = 1 , , φ X_^((alpha)),alpha=1,dots,varphi\underline{X}^{(\alpha)}, \alpha=1, \ldots, \varphiX(α),α=1,,φ. The set of points in the space ( P , T , { c i } ) P , T , c i (P,T,{c_(i)})\left(P, T,\left\{c_{i}\right\}\right)(P,T,{ci}) where equilibrium between φ φ varphi\varphiφ phases holds (i.e. the phase boundaries in a P T { c i } P T c i P-T-{c_(i)}P-T-\left\{c_{i}\right\}PT{ci}-diagram) can be characterized as follows. Since ξ ξ _ xi _\underline{\xi}ξ is constant within the convex cone C C CCC, we have for any X C X _ C X_in C\underline{X} \in CXC and any
α = 1 , , φ α = 1 , , φ alpha=1,dots,varphi\alpha=1, \ldots, \varphiα=1,,φ and any I I III :
0 = d d λ ξ I ( X + λ X ( α ) ) | λ = 0 = J X J ( α ) X J ξ I ( X ) = J X J ( α ) 2 X J X I S ( X ) = J X J ( α ) 2 X I X J S ( X ) = J X J ( α ) X I ξ J ( X ) 0 = d d λ ξ I X _ + λ X _ ( α ) λ = 0 = J X J ( α ) X J ξ I ( X _ ) = J X J ( α ) 2 X J X I S ( X _ ) = J X J ( α ) 2 X I X J S ( X _ ) = J X J ( α ) X I ξ J ( X _ ) {:[0=(d)/(d lambda)xi_(I)(X_+lambdaX_^((alpha)))|_(lambda=0)=sum_(J)X_(J)^((alpha))(del)/(delX_(J))xi_(I)(X_)=sum_(J)X_(J)^((alpha))(del^(2))/(delX_(J)delX_(I))S(X_)],[=sum_(J)X_(J)^((alpha))(del^(2))/(delX_(I)delX_(J))S(X_)],[=sum_(J)X_(J)^((alpha))(del)/(delX_(I))xi_(J)(X_)]:}\begin{aligned} 0=\left.\frac{d}{d \lambda} \xi_{I}\left(\underline{X}+\lambda \underline{X}^{(\alpha)}\right)\right|_{\lambda=0} & =\sum_{J} X_{J}^{(\alpha)} \frac{\partial}{\partial X_{J}} \xi_{I}(\underline{X})=\sum_{J} X_{J}^{(\alpha)} \frac{\partial^{2}}{\partial X_{J} \partial X_{I}} S(\underline{X}) \\ & =\sum_{J} X_{J}^{(\alpha)} \frac{\partial^{2}}{\partial X_{I} \partial X_{J}} S(\underline{X}) \\ & =\sum_{J} X_{J}^{(\alpha)} \frac{\partial}{\partial X_{I}} \xi_{J}(\underline{X}) \end{aligned}0=ddλξI(X+λX(α))|λ=0=JXJ(α)XJξI(X)=JXJ(α)2XJXIS(X)=JXJ(α)2XIXJS(X)=JXJ(α)XIξJ(X)
where we denote the k + 2 k + 2 k+2k+2k+2 components of X X _ X_\underline{X}X by { X I } X I {X_(I)}\left\{X_{I}\right\}{XI}. Multiplying this equation by d X I d X I dX_(I)\mathrm{d} X_{I}dXI and summing over I I III, this relation can be written as
(6.98) X ( α ) d ξ = 0 (6.98) X _ ( α ) d ξ _ = 0 {:(6.98)X_^((alpha))*dxi _=0:}\begin{equation*} \underline{X}^{(\alpha)} \cdot \mathrm{d} \underline{\xi}=0 \tag{6.98} \end{equation*}(6.98)X(α)dξ=0
which must hold in the coexistence region $C$. Since the equation must hold for all $\alpha=1, \ldots, \varphi$, the coexistence region is subject to $\varphi$ constraints, and we therefore need $f=(2+k-\varphi)$ parameters to describe the coexistence region in the phase diagram. This statement is sometimes called the Gibbs phase rule.
Examples: Consider again the example of a phase boundary between coffee and sugar, where we had k = 2 k = 2 k=2k=2k=2 compounds (coffee, sugar) with φ = 2 φ = 2 varphi=2\varphi=2φ=2 coexisting phases (solution, sugar at bottom). The phase rule tells us that we need f = 2 + 2 2 = 2 f = 2 + 2 2 = 2 f=2+2-2=2f=2+2-2=2f=2+22=2 independent parameters to describe phase equilibrium, which is correct. In the ice-vapor-water diagram we only had k = 1 k = 1 k=1k=1k=1 substance (water). At the triple point, we have φ = 3 φ = 3 varphi=3\varphi=3φ=3 coexisting phases and f = 1 + 2 3 = 0 f = 1 + 2 3 = 0 f=1+2-3=0f=1+2-3=0f=1+23=0, which is consistent because a point is a 0 -dimensional manifold. At the water-ice coexistence line, we have φ = 2 φ = 2 varphi=2\varphi=2φ=2 and f = 1 + 2 2 = 1 f = 1 + 2 2 = 1 f=1+2-2=1f=1+2-2=1f=1+22=1, which is the correct dimension of a line.
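The counting in these examples is simple enough to encode directly; a trivial sketch:

```python
def gibbs_phase_rule(k, phi):
    """Number of free parameters f = 2 + k - phi for k compounds, phi phases."""
    return 2 + k - phi

assert gibbs_phase_rule(2, 2) == 2   # coffee/sugar: a 2-parameter family
assert gibbs_phase_rule(1, 3) == 0   # triple point of water: an isolated point
assert gibbs_phase_rule(1, 2) == 1   # water-ice coexistence: a line
```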
Now consider a 1-component system ($k=1$), such that $\underline{X}=(E, V, N)$ and $\underline{\xi}=\left(\frac{1}{T}, \frac{P}{T},-\frac{\mu}{T}\right)$. The $\varphi$ different phases are described by
\underline{X}^{(1)}=\left(E^{(1)}, V^{(1)}, N^{(1)}\right), \ldots, \underline{X}^{(\varphi)}=\left(E^{(\varphi)}, V^{(\varphi)}, N^{(\varphi)}\right).
In the case of φ = 2 φ = 2 varphi=2\varphi=2φ=2 different phases we thus have
E ( 1 ) d ( 1 T ) + V ( 1 ) d ( P T ) N ( 1 ) d ( μ T ) = 0 E ( 2 ) d ( 1 T ) + V ( 2 ) d ( P T ) N ( 2 ) d ( μ T ) = 0 E ( 1 ) d 1 T + V ( 1 ) d P T N ( 1 ) d μ T = 0 E ( 2 ) d 1 T + V ( 2 ) d P T N ( 2 ) d μ T = 0 {:[E^((1))d((1)/(T))+V^((1))d((P)/(T))-N^((1))d((mu )/(T))=0],[E^((2))d((1)/(T))+V^((2))d((P)/(T))-N^((2))d((mu )/(T))=0]:}\begin{aligned} & E^{(1)} \mathrm{d}\left(\frac{1}{T}\right)+V^{(1)} \mathrm{d}\left(\frac{P}{T}\right)-N^{(1)} \mathrm{d}\left(\frac{\mu}{T}\right)=0 \\ & E^{(2)} \mathrm{d}\left(\frac{1}{T}\right)+V^{(2)} \mathrm{d}\left(\frac{P}{T}\right)-N^{(2)} \mathrm{d}\left(\frac{\mu}{T}\right)=0 \end{aligned}E(1)d(1T)+V(1)d(PT)N(1)d(μT)=0E(2)d(1T)+V(2)d(PT)N(2)d(μT)=0
We assume that the particle numbers are equal in both phases, $N^{(1)}=N^{(2)} \equiv N$. By the phase rule, $f=2+k-\varphi=1$. Subtracting the two relations thus gives
(6.99) [ E ( 1 ) E ( 2 ) + P ( V ( 1 ) V ( 2 ) ) ] d T T 2 = ( V ( 1 ) V ( 2 ) ) d P T (6.99) E ( 1 ) E ( 2 ) + P V ( 1 ) V ( 2 ) d T T 2 = V ( 1 ) V ( 2 ) d P T {:(6.99)[E^((1))-E^((2))+P(V^((1))-V^((2)))](dT)/(T^(2))=(V^((1))-V^((2)))(dP)/(T):}\begin{equation*} \left[E^{(1)}-E^{(2)}+P\left(V^{(1)}-V^{(2)}\right)\right] \frac{\mathrm{d} T}{T^{2}}=\left(V^{(1)}-V^{(2)}\right) \frac{\mathrm{d} P}{T} \tag{6.99} \end{equation*}(6.99)[E(1)E(2)+P(V(1)V(2))]dTT2=(V(1)V(2))dPT
or, equivalently,
(6.100) d P ( T ) d T = Δ E + P Δ V T Δ V (6.100) d P ( T ) d T = Δ E + P Δ V T Δ V {:(6.100)(dP(T))/(dT)=(Delta E+P Delta V)/(T Delta V):}\begin{equation*} \frac{\mathrm{d} P(T)}{\mathrm{d} T}=\frac{\Delta E+P \Delta V}{T \Delta V} \tag{6.100} \end{equation*}(6.100)dP(T)dT=ΔE+PΔVTΔV
Together with the relation $\Delta E=T \Delta S-P \Delta V$ we find the Clausius-Clapeyron equation
\begin{equation*}
\frac{\mathrm{d} P}{\mathrm{d} T}=\frac{\Delta S}{\Delta V} \tag{6.101}
\end{equation*}
As an application, consider a solid (phase 1) in equilibrium with its vapor (phase 2). For the volumes we should have $V^{(1)} \ll V^{(2)}$, from which it follows that $\Delta V=V^{(1)}-V^{(2)} \approx -V^{(2)}$. For the vapor phase, we assume the ideal gas relation, $P V^{(2)}=\mathrm{k}_{\mathrm{B}} T N^{(2)}=\mathrm{k}_{\mathrm{B}} T N$. Substituting $V^{(2)}=N \mathrm{k}_{\mathrm{B}} T / P$ gives
\begin{equation*}
\frac{\mathrm{d} P}{\mathrm{d} T}=\frac{\Delta Q}{N} \frac{P}{\mathrm{k}_{\mathrm{B}} T^{2}}, \quad \text{with } \Delta Q=-\Delta S \cdot T \tag{6.102}
\end{equation*}
Assuming $\Delta q=\frac{\Delta Q}{N}$ to be roughly independent of $T$, we obtain
\begin{equation*}
P(T)=P_{0} e^{-\frac{\Delta q}{\mathrm{k}_{\mathrm{B}} T}} \tag{6.103}
\end{equation*}
on the phase boundary, see the following figure:
Figure 6.14: Phase boundary of a vapor-solid system in the $(P, T)$-diagram
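The approximate solution (6.103) can be checked against the differential equation (6.102) numerically; a minimal Python sketch, where the values of $P_0$ and $\Delta q$ are arbitrary illustrative choices, not taken from the text:

```python
import math

kB = 1.380649e-23  # Boltzmann constant in J/K
P0 = 1.0e11        # integration constant, illustrative value (Pa)
dq = 5.0e-20       # sublimation heat per particle, illustrative value (J)

def P(T):
    """Vapor pressure on the phase boundary, eq. (6.103)."""
    return P0 * math.exp(-dq / (kB * T))

# Check dP/dT = (dq/1) * P / (kB * T^2), i.e. eq. (6.102) with dq = DQ/N,
# by a central finite difference.
T, h = 300.0, 1e-3
lhs = (P(T + h) - P(T - h)) / (2 * h)
rhs = dq * P(T) / (kB * T**2)
print(abs(lhs - rhs) / rhs)  # relative error, very small
```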

6.8 Osmotic Pressure

We consider a system made up of two compounds and define
\begin{aligned}
N_{1} &= \text{particle number of ``ions'' (solute)} \\
N_{2} &= \text{particle number of ``water molecules'' (solvent).}
\end{aligned}
The corresponding chemical potentials are denoted $\mu_{1}$ and $\mu_{2}$. The grand canonical partition function,
\[
Y\left(\mu_{1}, \mu_{2}, V, \beta\right)=\operatorname{tr}\left[e^{-\beta\left(H(V)-\mu_{1} \hat{N}_{1}-\mu_{2} \hat{N}_{2}\right)}\right]
\]
can be written as
\begin{equation*}
Y\left(\mu_{1}, \mu_{2}, V, \beta\right)=\sum_{N_{1}=0}^{\infty} Y_{N_{1}}\left(\mu_{2}, \beta, V\right) e^{\beta \mu_{1} N_{1}} \tag{6.104}
\end{equation*}
where $Y_{N_{1}}$ is the grand canonical partition function for substance 2 with a fixed number $N_{1}$ of particles of substance 1.$^{5}$ Let now $y_{N}:=\frac{1}{V} \frac{Y_{N}}{Y_{0}}$. It then follows that
\begin{equation*}
\log Y \equiv \log Y_{0}+\log \frac{Y}{Y_{0}}=\log Y_{0}+\log \left[1+\sum_{N_{1}>0} V y_{N_{1}} e^{\beta \mu_{1} N_{1}}\right] \tag{6.105}
\end{equation*}
hence
\begin{equation*}
\log Y=\log Y_{0}+V y_{1}\left(\mu_{2}, \beta\right) e^{\beta \mu_{1}}+\mathcal{O}\left(e^{2 \beta \mu_{1}}\right) \tag{6.106}
\end{equation*}
Here $y_{1}(\mu_{2}, \beta)$ has no $V$-dependence for large systems, since the free energy $G=-\mathrm{k}_{\mathrm{B}} T \log Y \sim V$.
For the (expected) particle number of substance 1 we therefore have
\begin{equation*}
N_{1}=\frac{1}{\beta} \frac{\partial}{\partial \mu_{1}} \log Y\left(\mu_{1}, \mu_{2}, V, \beta\right) \tag{6.107}
\end{equation*}
which follows from
\begin{equation*}
\mathrm{d} G=-S \mathrm{~d} T-P \mathrm{~d} V-N_{1} \mathrm{~d} \mu_{1}-N_{2} \mathrm{~d} \mu_{2} \tag{6.108}
\end{equation*}
using the manipulations with thermodynamic potentials reviewed in section 6.5. Because $\log Y_{0}$ does not depend on $\mu_{1}$, we find
\begin{equation*}
N_{1} / V=n_{1}=y_{1}\left(\mu_{2}, \beta\right) e^{\beta \mu_{1}}+\mathcal{O}\left(e^{2 \beta \mu_{1}}\right) \tag{6.109}
\end{equation*}
On the other hand, we have for the pressure (see section 6.5)
\begin{equation*}
P=\frac{1}{\beta} \frac{\partial}{\partial V} \log Y\left(\mu_{1}, \mu_{2}, V, \beta\right) \tag{6.110}
\end{equation*}
which follows again from (6.108). Using that $y_{1}$ is approximately independent of $V$ for large volume, we obtain the following relation:
\[
P\left(\mu_{2}, N_{1}, \beta\right)=P\left(\mu_{2}, N_{1}=0, \beta\right)+y_{1}\left(\mu_{2}, \beta\right) e^{\beta \mu_{1}} / \beta+\mathcal{O}\left(e^{2 \beta \mu_{1}}\right)
\]
Using $e^{\beta \mu_{1}}=\frac{n_{1}}{y_{1}}+\mathcal{O}\left(n_{1}^{2}\right)$, which follows from (6.109), we get
\begin{equation*}
P\left(\mu_{2}, N_{1}, T\right)=P\left(\mu_{2}, N_{1}=0, T\right)+\mathrm{k}_{\mathrm{B}} T n_{1}+\mathcal{O}\left(n_{1}^{2}\right) \tag{6.111}
\end{equation*}
Here we note that $y_{1}$, which in general is hard to calculate, fortunately does not appear on the right-hand side at this order of approximation.
Consider now two copies of the system, called A and B, separated by a semi-permeable wall which lets through water but not the ions of the solute. The concentration $n_{1}^{(A)}$ of ions on one side of the wall need not be equal to the concentration $n_{1}^{(B)}$ on the other side, so we have different pressures $P^{(A)}$ and $P^{(B)}$. Their difference is
\[
\Delta P=P^{(A)}-P^{(B)}=\mathrm{k}_{\mathrm{B}} T\left(n_{1}^{(A)}-n_{1}^{(B)}\right)
\]
hence, writing $\Delta n=n_{1}^{(A)}-n_{1}^{(B)}$, we obtain the osmotic pressure formula, due to van 't Hoff:
\begin{equation*}
\Delta P=\mathrm{k}_{\mathrm{B}} T \Delta n \tag{6.112}
\end{equation*}
In the derivation of this formula we neglected terms of order $n_{1}^{2}$, which means that the formula is valid only for dilute solutions!
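To get a feeling for the size of the effect, one can evaluate (6.112) for a dilute aqueous solution; a short Python estimate, where the chosen concentration difference is a hypothetical illustrative value:

```python
kB = 1.380649e-23   # Boltzmann constant, J/K
NA = 6.02214076e23  # Avogadro constant, 1/mol

T = 300.0                 # temperature in K
c_molar = 0.1             # hypothetical solute concentration difference, mol/L
dn = c_molar * NA * 1e3   # number density difference in 1/m^3

dP = kB * T * dn          # van 't Hoff formula (6.112)
print(f"osmotic pressure difference: {dP / 1e5:.1f} bar")
```

Even a 0.1 molar concentration difference already produces a pressure difference of a few bar, which is why osmosis is so effective across biological membranes.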

Appendix A

Dynamical Systems and Approach to Equilibrium

A.1 The Master Equation

In this section, we will study a toy model for dynamically evolving (i.e. non-stationary) ensembles [here we follow mostly Ch. 6 of "Physique Statistique" by A. Georges, M. Mézard, École Polytechnique (2010)]. We will not start from a Hamiltonian description of the dynamics, but rather work with a phenomenological description which is already probabilistic. In this approach, the ensemble is described by a time-dependent probability distribution $\{p_{n}(t)\}$, where $p_{n}(t)$ is the probability for the system to be in state $n$ at time $t$. Since the $p_{n}(t)$ are to be probabilities, we evidently should have $\sum_{i=1}^{N} p_{i}(t)=1$ and $p_{i}(t) \geqslant 0$ for all $t$.
We assume that the time dependence is determined by the dynamical law
\begin{equation*}
\frac{d p_{i}(t)}{d t}=\sum_{j \neq i}\left[T_{i j} p_{j}(t)-T_{j i} p_{i}(t)\right] \tag{A.1}
\end{equation*}
where $T_{i j}>0$ is the transition amplitude for going from state $j$ to state $i$ per unit of time. We call this law the "master equation." As already discussed in sec. 3.2, the master equation can be thought of as a version of the Boltzmann equation. In the context of quantum mechanics, the transition amplitudes $T_{i j}$ induced by some small perturbation $H_{1}$ of the dynamics would e.g. be given by Fermi's golden rule, $T_{i j}=\frac{2 \pi}{\hbar}\left|\langle i| H_{1}|j\rangle\right|^{2}$, and would therefore be symmetric in $i$ and $j$, $T_{i j}=T_{j i}$. In this section, we do not assume that the transition amplitudes are symmetric, as this would exclude interesting examples.
It is instructive to check that the master equation has the desired property of keeping $p_{i}(t) \geqslant 0$ and $\sum_{i} p_{i}(t)=1$. The first property is seen as follows. Suppose that $t_{0}$ is the first time that some $p_{i}(t_{0})=0$. From the structure of the master equation, it then follows that $d p_{i}(t_{0}) / d t>0$, unless in fact all $p_{j}(t_{0})=0$. The latter is impossible, because the sum of the probabilities equals 1 for all times. Indeed,
d d t i p i = i d d t p i = i j : j i ( T i j p j T j i p i ) = i , j : i j T i j p j i , j : j i T j i p i = 0 . d d t i p i = i d d t p i = i j : j i T i j p j T j i p i = i , j : i j T i j p j i , j : j i T j i p i = 0 . {:[(d)/(dt)sum_(i)p_(i)=sum_(i)(d)/(dt)p_(i)],[=sum_(i)sum_(j:j!=i)(T_(ij)p_(j)-T_(ji)p_(i))],[=sum_(i,j:i!=j)T_(ij)p_(j)-sum_(i,j:j!=i)T_(ji)p_(i)=0.]:}\begin{aligned} \frac{d}{d t} \sum_{i} p_{i} & =\sum_{i} \frac{d}{d t} p_{i} \\ & =\sum_{i} \sum_{j: j \neq i}\left(T_{i j} p_{j}-T_{j i} p_{i}\right) \\ & =\sum_{i, j: i \neq j} T_{i j} p_{j}-\sum_{i, j: j \neq i} T_{j i} p_{i}=0 . \end{aligned}ddtipi=iddtpi=ij:ji(TijpjTjipi)=i,j:ijTijpji,j:jiTjipi=0.
An equilibrium state corresponds to a distribution $\{p_{i}^{\mathrm{eq}}\}$ which is constant in time and is a solution to the master equation, i.e.
\begin{equation*}
\sum_{j: j \neq i} T_{i j} p_{j}^{\mathrm{eq}}=p_{i}^{\mathrm{eq}} \sum_{j: j \neq i} T_{j i} . \tag{A.2}
\end{equation*}
An important special case is that of symmetric transition amplitudes. This case occurs, for example, if the underlying microscopic dynamics is reversible. In that case, the uniform distribution $p_{i}^{\mathrm{eq}}=\frac{1}{N}$ is always stationary (microcanonical ensemble).

Example: Time evolution of a population of bacteria

Consider a population of some kind of bacteria, characterized by the following quantities:
\begin{aligned}
n &= \text{number of bacteria in the population} \\
M &= \text{mortality rate} \\
R &= \text{reproduction rate} \\
p_{n}(t) &= \text{probability that the population consists of } n \text{ bacteria at instant } t
\end{aligned}
In this case the master equation (A.1) reads, for the state $n=0$,
\begin{equation*}
\frac{d}{d t} p_{0}=M p_{1} \tag{A.3}
\end{equation*}
and, for $n \geqslant 1$,
\begin{equation*}
\frac{d}{d t} p_{n}=M(n+1) p_{n+1}+R(n-1) p_{n-1}-(M+R) n p_{n} . \tag{A.4}
\end{equation*}
This means that the transition amplitudes are given by
\begin{equation*}
\begin{cases}
T_{n(n+1)} =M(n+1) \\
T_{n(n-1)} =R(n-1) \\
T_{i j} =0 \quad \text{otherwise}
\end{cases} \tag{A.5}
\end{equation*}
and the condition for equilibrium becomes
\begin{equation*}
R(n-1) p_{n-1}^{\mathrm{eq}}+M(n+1) p_{n+1}^{\mathrm{eq}}=(R+M) n p_{n}^{\mathrm{eq}}, \quad \text{with } n \geqslant 1 \text{ and } p_{1}^{\mathrm{eq}}=0 \tag{A.6}
\end{equation*}
It follows by induction that in this example the only possible equilibrium state is given by
\begin{equation*}
p_{n}^{\mathrm{eq}}= \begin{cases}1 & \text{if } n=0 \\ 0 & \text{if } n \geqslant 1\end{cases} \tag{A.7}
\end{equation*}
i.e. we have equilibrium if and only if all bacteria are dead.
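One can watch this approach to extinction numerically. A sketch with numpy, where the rates $M > R$ and the truncation of the infinite state space at $n_{\max}$ are illustrative choices (truncation is a good approximation here because probability never accumulates at large $n$ when $M > R$):

```python
import numpy as np

M, R = 1.0, 0.5          # mortality > reproduction; illustrative values
nmax = 60                # truncate the state space n = 0, 1, 2, ... at nmax

# Build the generator from the rates (A.5) on the truncated space
X = np.zeros((nmax + 1, nmax + 1))
for n in range(nmax):
    X[n, n + 1] = M * (n + 1)   # death: n+1 -> n at rate M(n+1)
    X[n + 1, n] = R * n         # birth: n -> n+1 at rate Rn
X -= np.diag(X.sum(axis=0))     # diagonal fixed by probability conservation

p = np.zeros(nmax + 1)
p[5] = 1.0                      # start with exactly 5 bacteria

dt, steps = 5e-3, 10000         # evolve to t = 50 by small Euler steps
for _ in range(steps):
    p = p + dt * (X @ p)

print(p[0])                     # extinction probability, close to 1
```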

A.2 Properties of the Master Equation

We may rewrite the master equation (A.1) as
\begin{equation*}
\frac{d p_{i}(t)}{d t}=\sum_{j} \mathcal{X}_{i j} p_{j}(t) \tag{A.8}
\end{equation*}
where
\begin{equation*}
\mathcal{X}_{i j}= \begin{cases}T_{i j} & \text{if } i \neq j \\ -\sum_{k \neq i} T_{k i} & \text{if } i=j\end{cases} \tag{A.9}
\end{equation*}
We immediately find that $\mathcal{X}_{i j} \geqslant 0$ for all $i \neq j$ and $\mathcal{X}_{i i} \leqslant 0$ for all $i$. We obtain $\mathcal{X}_{i i}<0$ if we assume that for each $i$ there is at least one state $j$ with nonzero transition amplitude $T_{j i}$ for leaving $i$. We make this assumption from now on. The formal solution of (A.8) is given by the following matrix exponential:
\begin{equation*}
\underline{p}(t)=e^{t \mathcal{X}} \underline{p}(0), \quad \underline{p}(t)=\left(p_{1}(t), \ldots, p_{N}(t)\right) . \tag{A.10}
\end{equation*}
(We also assume that the total number $N$ of states is finite.)
We would now like to understand whether there must always exist an equilibrium state, and if so, how it is approached. An equilibrium distribution must satisfy $0=\sum_{j} \mathcal{X}_{i j} p_{j}^{\mathrm{eq}}$, which is possible if and only if the matrix $\mathcal{X}$ has a zero eigenvalue. Thus, we must have some information about the eigenvalues of $\mathcal{X}$. We note that this matrix need not be symmetric, so its eigenvalues, $E$, need not be real, and we are not necessarily able to diagonalize it! Nevertheless, it turns out that the master equation gives us a sufficient amount of information to understand the key features of the eigenvalue distribution. If we define the evolution matrix $A(t)$ by
\begin{equation*}
A(t):=e^{t \mathcal{X}} \tag{A.11}
\end{equation*}
then, since $A(t)$ maps element-wise positive vectors $\underline{p}=\left(p_{1}, \ldots, p_{N}\right)$ to vectors with the same property, it easily follows that $A_{i j}(1) \geqslant 0$ for all $i, j$. Hence, by the Perron-Frobenius theorem, the eigenvector $\underline{v}$ of $A(1)$ whose eigenvalue $\lambda_{\max}$ has the largest real part must be element-wise positive, $v_{i} \geqslant 0$ for all $i$, and $\lambda_{\max}$ must be real and positive,
\begin{equation*}
A(1) \underline{v}=\lambda_{\max} \underline{v}, \quad \lambda_{\max}>0 . \tag{A.12}
\end{equation*}
This (up to rescaling) unique vector $\underline{v}$ must also be an eigenvector of $\mathcal{X}$, with real eigenvalue $\log \lambda_{\max}=E_{\max}$. We next show that any eigenvalue $E$ of $\mathcal{X}$ (possibly $\in \mathbb{C}$) has $\operatorname{Re}(E) \leqslant 0$ by arguing as follows: Let $\underline{w}$ be an eigenvector of $\mathcal{X}$ with eigenvalue $E$, i.e. $\mathcal{X} \underline{w}=E \underline{w}$. Then
\begin{equation*}
\sum_{j \neq i} \mathcal{X}_{i j} w_{j}=\left(E-\mathcal{X}_{i i}\right) w_{i} \tag{A.13}
\end{equation*}
and therefore
\begin{equation*}
\sum_{j \neq i} \mathcal{X}_{i j}\left|w_{j}\right| \geqslant\left|E-\mathcal{X}_{i i}\right|\left|w_{i}\right|, \tag{A.14}
\end{equation*}
which follows from the triangle inequality and $\mathcal{X}_{i j} \geqslant 0$ for $i \neq j$. Taking the sum $\sum_{i}$ and using (A.9) then yields $\sum_{i}\left(\mathcal{X}_{i i}+\left|E-\mathcal{X}_{i i}\right|\right)\left|w_{i}\right| \leqslant 0$, and therefore $\left(\mathcal{X}_{i i}+\left|E-\mathcal{X}_{i i}\right|\right)\left|w_{i}\right| \leqslant 0$ for at least one $i$. Since $\mathcal{X}_{i i}<0$, this is impossible unless $\operatorname{Re}(E) \leqslant 0$. It follows that $E_{\max} \leqslant 0$, and then also $\lambda_{\max} \leqslant 1$. We would now like to argue that in fact $E_{\max}=0$. Assume on the contrary $E_{\max}<0$. Then
\[
\underline{v}(t)=A(t) \underline{v}=e^{t E_{\max}} \underline{v} \rightarrow 0,
\]
which is impossible, as the evolution preserves the sum $\sum_{i} v_{i}(t)>0$. From this we conclude that $E_{\max}=0$, i.e. $\mathcal{X} \underline{v}=0$, and thus
\begin{equation*}
p_{j}^{\mathrm{eq}}=\frac{v_{j}}{\sum_{i} v_{i}} \tag{A.15}
\end{equation*}
is an equilibrium distribution. This equilibrium distribution is unique (by the Perron-Frobenius theorem). Since any other eigenvalue $E$ of $\mathcal{X}$ must have $\operatorname{Re}(E)<0$, any distribution $\{p_{i}(t)\}$ must approach this equilibrium state. We summarize our findings:
  1. There exists a unique equilibrium distribution $\{p_{j}^{\mathrm{eq}}\}$.
  2. Any distribution $\{p_{i}(t)\}$ obeying the master equation must approach equilibrium as $\left|p_{j}(t)-p_{j}^{\mathrm{eq}}\right|=\mathcal{O}\left(e^{-t / \tau_{\mathrm{relax}}}\right)$ for all states $j$, where the relaxation timescale is given by $\tau_{\mathrm{relax}}=-1 / \operatorname{Re}(E_{1})$, with $E_{1}$ the non-zero eigenvalue of $\mathcal{X}$ whose real part is largest.
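Both statements can be checked numerically for a generic choice of rates; a sketch with random, purely illustrative $T_{ij}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
T = rng.uniform(0.1, 1.0, size=(N, N))   # random rates, illustrative only
np.fill_diagonal(T, 0.0)
X = T - np.diag(T.sum(axis=0))           # generator, eq. (A.9)

# Sort eigenvalues by decreasing real part
E = sorted(np.linalg.eigvals(X), key=lambda z: z.real, reverse=True)
E_max, E_1 = E[0], E[1]

print(E_max)            # ~0: the equilibrium eigenvalue always exists
print(-1.0 / E_1.real)  # relaxation timescale tau_relax
```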
In statistical mechanics, one often has
\begin{equation*}
T_{i j} e^{-\beta \epsilon_{j}}=T_{j i} e^{-\beta \epsilon_{i}}, \tag{A.16}
\end{equation*}
where $\epsilon_{i}$ is the energy of the state $i$. Equation (A.16) is called the detailed balance condition. It is easy to see that it implies
\[
p_{i}^{\mathrm{eq}}=e^{-\beta \epsilon_{i}} / Z .
\]
Thus, in this case, the unique equilibrium distribution is the canonical ensemble, which was motivated already in chapter 4.
If the detailed balance condition is fulfilled, we may pass from $\mathcal{X}_{i j}$, which need not be symmetric, to a symmetric (hence diagonalizable) matrix by a change of basis as follows. If we set $q_{i}(t)=p_{i}(t) e^{\frac{\beta \epsilon_{i}}{2}}$, we get
\begin{equation*}
\frac{d q_{i}(t)}{d t}=\sum_{j=1}^{N} \tilde{\mathcal{X}}_{i j} q_{j}(t) \tag{A.17}
\end{equation*}
where
\[
\tilde{\mathcal{X}}_{i j}=e^{\frac{\beta \epsilon_{i}}{2}} \mathcal{X}_{i j} e^{-\frac{\beta \epsilon_{j}}{2}}
\]
is now symmetric. We can diagonalize it with real eigenvalues $\lambda_{n} \leqslant 0$ and real eigenvectors $\underline{w}^{(n)}$, so that $\tilde{\mathcal{X}} \underline{w}^{(n)}=\lambda_{n} \underline{w}^{(n)}$. The eigenvalue $\lambda_{0}=0$ again corresponds to equilibrium, with $w_{i}^{(0)} \propto e^{-\beta \epsilon_{i} / 2}$. Then we can write
\begin{equation*}
p_{i}(t)=p_{i}^{\mathrm{eq}}+e^{-\frac{\beta \epsilon_{i}}{2}} \sum_{n \geqslant 1} c_{n} e^{t \lambda_{n}} w_{i}^{(n)} \tag{A.18}
\end{equation*}
where $c_{n}=\underline{q}(0) \cdot \underline{w}^{(n)}$ are the expansion coefficients. We see again that $p_{i}(t)$ converges to the equilibrium state exponentially, with relaxation time $-\frac{1}{\lambda_{1}}<\infty$, where $\lambda_{1}<0$ is the largest non-zero eigenvalue of $\tilde{\mathcal{X}}$.
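The symmetrization trick is easy to verify in a small numerical experiment; a sketch where the energies $\epsilon_i$ and the symmetric "attempt" rates $S_{ij}$ are random illustrative choices, constructed so that the rates $T_{ij} = S_{ij} e^{-\beta\epsilon_i}$ satisfy detailed balance (A.16):

```python
import numpy as np

rng = np.random.default_rng(2)
N, beta = 5, 1.0
eps = rng.uniform(0.0, 2.0, N)       # state energies, illustrative values

S = rng.uniform(0.5, 1.5, (N, N))
S = (S + S.T) / 2                    # symmetric "attempt" rates (assumption)
T = S * np.exp(-beta * eps[:, None]) # T[i, j] = S_ij e^{-beta eps_i}
np.fill_diagonal(T, 0.0)             # then T_ij e^{-beta eps_j} = T_ji e^{-beta eps_i}

X = T - np.diag(T.sum(axis=0))       # generator (A.9)
# Similarity transform of the text: Xt_ij = e^{beta eps_i/2} X_ij e^{-beta eps_j/2}
Xt = np.exp(beta * eps[:, None] / 2) * X * np.exp(-beta * eps[None, :] / 2)

lam = np.linalg.eigvalsh(Xt)         # real spectrum of the symmetric matrix
p_eq = np.exp(-beta * eps) / np.exp(-beta * eps).sum()

print(np.allclose(Xt, Xt.T))         # symmetrized generator
print(lam.max())                     # largest eigenvalue is 0
print(np.allclose(X @ p_eq, 0.0))    # Gibbs distribution is stationary
```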

A.3 Relaxation time vs. ergodic time

We come back to the question why one never observes in practice that a macroscopically large system returns to its initial state. We discuss this in a toy model consisting of $N$ spins. A state of the system is described by a configuration $C$ of spins:
\begin{equation*}
C=\left(\sigma_{1}, \ldots, \sigma_{N}\right) \in\{+1,-1\}^{N} . \tag{A.19}
\end{equation*}
The system has $2^{N}$ possible states $C$, and we let $p_{C}(t)$ be the probability that the system is in the state $C$ at time $t$. Furthermore, let $\tau_{0}$ be the time scale for one update of the system, i.e. a spin flip occurs with probability $\frac{d t}{\tau_{0}}$ during the time interval $[t, t+d t]$. We assume that all spin flips are equally likely in our model. This leads to a master equation (A.1) of the form
\begin{equation*}
\frac{d p_{C}(t)}{d t}=\frac{1}{\tau_{0}}\left\{\frac{1}{N} \sum_{i=1}^{N} p_{C_{i}}(t)-p_{C}(t)\right\}=\sum_{C^{\prime}} \mathcal{X}_{C C^{\prime}} p_{C^{\prime}}(t) \tag{A.20}
\end{equation*}
Here, the first term in the brackets $\{\ldots\}$ describes the increase in probability due to a change $C_{i} \rightarrow C$, where $C_{i}$ differs from $C$ by flipping the $i^{\text{th}}$ spin. This change occurs with probability $\frac{1}{N}$ per time $\tau_{0}$. The second term in the brackets $\{\ldots\}$ describes the decrease in probability due to the change $C \rightarrow C_{i}$ for any $i$. It can be checked from the definition of $\mathcal{X}$ that
\begin{equation*}
\sum_{C} \mathcal{X}_{C C^{\prime}}=0 \Rightarrow \sum_{C} p_{C}(t)=1 \quad \forall t . \tag{A.21}
\end{equation*}
Furthermore, it can be checked that the equilibrium distribution is given by
\begin{equation*}
p_{C}^{\mathrm{eq}}=\frac{1}{2^{N}} \quad \forall C \in\{-1,+1\}^{N} \tag{A.22}
\end{equation*}
Indeed, $\sum_{C^{\prime}} \mathcal{X}_{C C^{\prime}} p_{C^{\prime}}^{\mathrm{eq}}=0$, so in the equilibrium distribution all states $C$ are equally likely for this model.
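For small $N$, both (A.21) and (A.22) can be verified by explicitly assembling the $2^{N} \times 2^{N}$ generator; a minimal sketch:

```python
import numpy as np
from itertools import product

N, tau0 = 3, 1.0
states = list(product([-1, 1], repeat=N))
index = {s: k for k, s in enumerate(states)}

# Generator of (A.20): each of the N single-spin flips occurs at rate 1/(N tau0)
X = np.zeros((2**N, 2**N))
for s in states:
    for i in range(N):
        flipped = s[:i] + (-s[i],) + s[i + 1:]
        X[index[flipped], index[s]] += 1.0 / (N * tau0)
np.fill_diagonal(X, -1.0 / tau0)      # total escape rate from any C is 1/tau0

p_eq = np.full(2**N, 2.0**-N)
print(np.allclose(X.sum(axis=0), 0.0))  # columns sum to zero, eq. (A.21)
print(np.allclose(X @ p_eq, 0.0))       # uniform distribution is stationary, (A.22)
```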
If we now imagine a discretized version of the process, where at each time step one randomly chosen spin is flipped, then the timescale over which the system returns to the initial condition is estimated by $\tau_{\text{ergodic}} \approx 2^{N} \tau_{0}$, since we have to visit $\mathcal{O}\left(2^{N}\right)$ states before returning and each step takes time $\tau_{0}$. We claim that this is much larger than the relaxation timescale. To estimate the latter, we choose an arbitrary but fixed spin, say the first spin. Then we define $p_{\pm}=\left\langle\delta\left(\sigma_{1} \mp 1\right)\right\rangle$, where the time-dependent average is calculated with respect to the distribution $\{p_{C}(t)\}$; in other words
\begin{equation*}
p_{\pm}(t)=\sum_{C:\, \sigma_{1}= \pm 1} p_{C}(t)=\text{ probability for finding the } 1^{\text{st}} \text{ spin up/down at time } t \tag{A.23}
\end{equation*}
The master equation implies an evolution equation for $p_{+}$ (and similarly $p_{-}$), which is obtained by simply summing (A.20) over all configurations with $\sigma_{1}= \pm 1$, i.e. applying $\sum_{C: \sigma_{1}= \pm 1}$. This gives:
\begin{equation*}
\frac{d p_{+}}{d t}=\frac{1}{\tau_{0}}\left\{\frac{1}{N}\left(1-p_{+}\right)-\frac{1}{N} p_{+}\right\}, \tag{A.24}
\end{equation*}
which has the solution
\begin{equation*}
p_{+}(t)=\frac{1}{2}+\left(p_{+}(0)-\frac{1}{2}\right) e^{-\frac{2 t}{N \tau_{0}}} \tag{A.25}
\end{equation*}
So for $t \rightarrow \infty$, we have $p_{+}(t) \rightarrow \frac{1}{2}$ at an exponential rate. This means $\frac{1}{2}$ is the equilibrium value of $p_{+}$. Since this holds for any chosen spin, we expect that the relaxation time towards equilibrium is $\tau_{\text{relax}} \approx \frac{N}{2} \tau_{0}$, and we see
\begin{equation*}
\tau_{\text{ergodic}} \gg \tau_{\text{relax}} . \tag{A.26}
\end{equation*}
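The relaxation law (A.25) is easy to check numerically. The following Python sketch (illustrative code, not part of the original notes; the function name and parameter defaults are our own) simulates the discretized process on many independent chains: in each time step of duration $\tau_0$ one of the $N$ spins is flipped at random, so the first spin flips with probability $1/N$ per step, and the fraction of chains with $\sigma_1 = +1$ should follow the discrete-time analogue of (A.25), namely $p_+(t) = \tfrac12 + (p_+(0)-\tfrac12)(1-2/N)^t$. Since only $\sigma_1$ is tracked, we need not store the other spins.

```python
import random

def simulate_p_plus(N=20, steps=400, chains=2000, seed=0):
    """Estimate p_+(t) for the random single-spin-flip process.

    All chains start with sigma_1 = +1 (p_+(0) = 1).  In each time step
    (duration tau_0) one of the N spins is flipped at random, so spin 1
    flips with probability 1/N; spins other than 1 do not affect p_+,
    so only sigma_1 is stored per chain.  Returns p_+(t) for t = 0..steps-1.
    """
    rng = random.Random(seed)
    sigma1 = [+1] * chains
    history = []
    for _ in range(steps):
        history.append(sum(1 for s in sigma1 if s == +1) / chains)
        for c in range(chains):
            if rng.randrange(N) == 0:  # spin 1 chosen with probability 1/N
                sigma1[c] = -sigma1[c]
    return history
```

For $N = 20$ the decay factor per step is $1 - 2/N = 0.9$, so the simulated curve should match $\tfrac12 + \tfrac12 \cdot 0.9^t$ to within statistical noise, relaxing to $\tfrac12$ after a few times $N/2$ steps.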
A more precise analysis of the relaxation time involves finding the eigenvalues of the $2^{N}$-dimensional matrix $\mathcal{X}_{C C^{\prime}}$: we think of the eigenvectors $\underline{u}_{0}, \underline{u}_{1}, \underline{u}_{2}, \ldots$ with eigenvalues $\lambda_{0}=0, \lambda_{1}, \lambda_{2}, \ldots$ as functions $u_{0}(C), u_{1}(C), \ldots$, where $C=\left(\sigma_{1}, \ldots, \sigma_{N}\right)$. Then the eigenvalue equation is
\begin{equation*}
\sum_{C^{\prime}} \mathcal{X}_{C C^{\prime}} u_{n}\left(C^{\prime}\right)=\lambda_{n} u_{n}(C) \tag{A.27}
\end{equation*}
and we have
\begin{equation*}
u_{0}(C) \equiv u_{0}\left(\sigma_{1}, \ldots, \sigma_{N}\right)=p_{C}^{\mathrm{eq}}=\frac{1}{2^{N}} \quad \forall\, C \tag{A.28}
\end{equation*}
Now we define the next $N$ eigenvectors $u_{1}^{j}$, $j=1, \ldots, N$, by
\begin{equation*}
u_{1}^{j}\left(\sigma_{1}, \ldots, \sigma_{N}\right)= \begin{cases}\alpha & \text{if } \sigma_{j}=+1 \\ \beta & \text{if } \sigma_{j}=-1\end{cases} \tag{A.29}
\end{equation*}
Imposing the eigenvalue equation gives $\alpha=-\beta$, and then $\lambda_{1}=-\frac{2}{N}$. The eigenvectors are orthogonal to each other. The next set of eigenvectors $u_{2}^{i j}$, $1 \leqslant i<j \leqslant N$, is
\begin{equation*}
u_{2}^{i j}\left(\sigma_{1}, \ldots, \sigma_{N}\right)= \begin{cases}\alpha & \text{if } \sigma_{i}=1, \sigma_{j}=1 \\ -\alpha & \text{if } \sigma_{i}=1, \sigma_{j}=-1 \\ -\alpha & \text{if } \sigma_{i}=-1, \sigma_{j}=1 \\ \alpha & \text{if } \sigma_{i}=-1, \sigma_{j}=-1\end{cases} \tag{A.30}
\end{equation*}
The vectors $u_{2}^{i j}$ are again found to be orthogonal, with the eigenvalue $\lambda_{2}=-\frac{4}{N}$. The subsequent vectors are constructed in the same fashion, and we find $\lambda_{k}=-\frac{2 k}{N}$ for the $k$-th set. The general solution of the master equation is given by (A.10),
\begin{equation*}
p_{C}(t)=\sum_{C^{\prime}}\left(e^{t \mathcal{X}}\right)_{C C^{\prime}} p_{C^{\prime}}(0) \tag{A.31}
\end{equation*}
which we can now evaluate using our eigenvectors. If we write
\[
p_{C}(t)=p_{C}^{\mathrm{eq}}+\sum_{k=1}^{N} \sum_{1 \leqslant i_{1}<\ldots<i_{k} \leqslant N} a_{i_{1} \ldots i_{k}}(t)\, u_{k}^{i_{1} \ldots i_{k}}(C)
\]
we get
\[
a_{i_{1} \ldots i_{k}}(t)=a_{i_{1} \ldots i_{k}}(0)\, e^{-2 k t /\left(N \tau_{0}\right)}
\]
This gives the relaxation behaviour for a general distribution. We see that the relaxation time is governed by the exponential with the slowest decay (the term with $k=1$ in the sum), leading to the relaxation time $\tau_{\text{relax}}=N \tau_{0} / 2$ already guessed before. This is exponentially small compared to the ergodic time! For $N = 1$ mol we have, approximately,
\begin{equation*}
\frac{\tau_{\text{ergodic}}}{\tau_{\text{relax}}}=\mathcal{O}\left(e^{10^{23}}\right) \tag{A.32}
\end{equation*}
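With the choice $\alpha = -\beta = 1$, the eigenvectors above are just the spin products $u_k^{i_1 \ldots i_k}(C) = \sigma_{i_1} \cdots \sigma_{i_k}$, and the eigenvalues $\lambda_k = -2k/N$ can be verified exactly for small $N$. The following Python sketch (our own illustrative code, in units $\tau_0 = 1$, using exact rational arithmetic) applies the generator $(\mathcal{X} u)(C) = \frac{1}{N}\sum_i u(C_i) - u(C)$ to such a spin product and checks the eigenvalue equation (A.27) configuration by configuration:

```python
from itertools import product
from fractions import Fraction

def apply_X(u, N):
    """Apply the master-equation generator (tau_0 = 1) to a function u
    on configurations C in {-1,+1}^N:
    (X u)(C) = (1/N) * sum_i u(C_i) - u(C),
    where C_i is C with spin i flipped."""
    out = {}
    for C in product((-1, +1), repeat=N):
        s = Fraction(0)
        for i in range(N):
            Ci = C[:i] + (-C[i],) + C[i + 1:]
            s += u[Ci]
        out[C] = s / N - u[C]
    return out

def spin_product(idx, N):
    """Eigenvector u_k^{i_1...i_k}(C) = sigma_{i_1} * ... * sigma_{i_k}
    (the choice alpha = -beta = 1 in (A.29), (A.30)); idx is 0-based."""
    u = {}
    for C in product((-1, +1), repeat=N):
        v = 1
        for i in idx:
            v *= C[i]
        u[C] = Fraction(v)
    return u

def check_eigenvalue(idx, N):
    """Verify X u = -(2k/N) u exactly and return the eigenvalue."""
    u = spin_product(idx, N)
    Xu = apply_X(u, N)
    lam = Fraction(-2 * len(idx), N)
    assert all(Xu[C] == lam * u[C] for C in u)
    return lam
```

Since the arithmetic is exact, the check confirms $\lambda_k = -2k/N$ with no numerical tolerance; the number of such eigenvectors for each $k$ is $\binom{N}{k}$, which exhausts the $2^N$ dimensions.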

A.4 Monte Carlo methods and Metropolis algorithm

The Metropolis algorithm is based in an essential way on the fact that $\tau_{\text{relax}} \ll \tau_{\text{ergodic}}$ for typical systems. The general aim of the algorithm is to efficiently compute expectation values of the form
\begin{equation*}
\langle F\rangle=\sum_{C} F(C)\, \frac{e^{-\beta E(C)}}{Z(\beta)} \tag{A.33}
\end{equation*}
where $E(C)$ is the energy of the state $C$ and $F$ is some observable. A good example to have in mind is the Ising model on a $d$-dimensional square lattice, where the energy is given by:
\[
E(C)=-J \sum_{\text{bonds } ij} \sigma_{i} \sigma_{j}+b \sum_{\text{sites } i} \sigma_{i},
\]
and $F=\sigma_{i}$ or $F=\sigma_{i} \sigma_{j}$. Here, a configuration is a set of spins $C=\left\{\sigma_{1}, \ldots, \sigma_{N}\right\}$. In this example, as well as in most other models of statistical physics, the number of configurations scales exponentially with the system size; in the present example this number is $2^{N}$. Furthermore, except for a rather special class of models, it seems impossible to do the sum in closed form. This is the case for the Ising model in all dimensions $d \geqslant 3$. Then, already for a cubic lattice of very modest side length such as 10, we are faced with $2^{1000}$ configurations, meaning that it is utterly out of the question to do this sum by simply adding up all the terms on a computer.
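For concreteness, the Ising energy $E(C)$ above is straightforward to evaluate for a small lattice. A minimal Python sketch (our own illustrative code; the function name and the nested-list representation of the configuration are our choices), here for $d = 2$ with periodic boundary conditions:

```python
def ising_energy(spins, J=1.0, b=0.0):
    """E(C) = -J * sum over bonds of sigma_i * sigma_j
            + b * sum over sites of sigma_i,
    for an L x L square lattice with periodic boundary conditions.
    `spins` is a nested list with entries +1 or -1."""
    L = len(spins)
    E = 0.0
    for x in range(L):
        for y in range(L):
            s = spins[x][y]
            # count each bond exactly once: right and down neighbours (with wrap-around)
            E += -J * s * spins[(x + 1) % L][y]
            E += -J * s * spins[x][(y + 1) % L]
            E += b * s
    return E
```

As a sanity check, the fully aligned configuration on a $4 \times 4$ lattice has $2 L^2 = 32$ bonds, hence $E = -32 J + 16 b$, while a checkerboard configuration (with $b = 0$) has every bond frustrated, $E = +32 J$.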
If we have to evaluate numerically an integral of the form
\[
I=\int_{0}^{1} g(x)\, d x
\]
then if $g(x)$ is sufficiently regular and varies on a scale of order 1, we can do the integral e.g. by generating a sample $X=\left\{x_{1}, \ldots, x_{m}\right\}$, chosen according to the uniform probability distribution on $[0,1]$ (the latter means that we have no prior idea of what $g(x)$ looks like). Then we expect that
\[
I \approx m^{-1} \sum_{x_{i} \in X} g\left(x_{i}\right),
\]
already for a relatively small number $m$ of points. However, this will fail e.g. if $g(x)$ is very sharply peaked near some point $x_{0}$, say with peak width $10^{-1000}$. It is clear that, generically, none of the randomly chosen points $X=\left\{x_{1}, \ldots, x_{m}\right\}$ will hit the peak unless we take $m \approx 10^{1000}$, which is out of the question. Roughly speaking, we typically run into the same kind of problem when evaluating the state sum in statistical physics.
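The failure of uniform sampling, and the cure of sampling where the integrand is large (the idea behind the Metropolis method below), can be seen in a toy Python example (our own illustration, with a milder peak width of $10^{-6}$ so that the arithmetic stays in floating-point range; all names are ours). Uniform Monte Carlo essentially never hits the peak, while drawing samples from a distribution matched to the peak and reweighting gives the answer immediately:

```python
import math
import random

def peaked_g(x, x0=0.5, w=1e-6):
    """A sharply peaked integrand; its integral over [0,1] equals
    w*sqrt(2*pi) up to utterly negligible tail corrections."""
    return math.exp(-(x - x0) ** 2 / (2 * w ** 2))

def uniform_mc(m, seed=1):
    """Plain Monte Carlo with uniform samples on [0,1]: almost always
    misses the peak and returns (nearly) 0 instead of the true integral."""
    rng = random.Random(seed)
    return sum(peaked_g(rng.random()) for _ in range(m)) / m

def importance_mc(m, x0=0.5, w=1e-6, seed=1):
    """Importance sampling: draw x from a Gaussian matched to the peak and
    average the reweighted values g(x)/q(x), where q is the Gaussian density."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(m):
        x = rng.gauss(x0, w)
        q = math.exp(-(x - x0) ** 2 / (2 * w ** 2)) / (w * math.sqrt(2 * math.pi))
        total += peaked_g(x) / q
    return total / m

exact = 1e-6 * math.sqrt(2 * math.pi)  # true value of the integral
```

The uniform estimate is either essentially zero (no sample within the peak) or wildly off (a lucky hit contributes $\sim 1/m$), whereas the importance-sampled estimate reproduces the exact value.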
Again, a simple-minded method would be to generate a uniformly distributed sample $\tilde{C}_{1}, \ldots, \tilde{C}_{u}$, where $u \gg 1$, and to approximate $\langle F\rangle \approx \sum_{i=1}^{u} F\left(\tilde{C}_{i}\right) \frac{e^{-\beta E\left(\tilde{C}_{i}\right)}}{Z}$. But this is a very bad idea in most cases, since the fraction of configurations for which the quantity $e^{-\beta E\left(\tilde{C}_{i}\right)}$ is not practically 0 is exponentially small. The idea is instead to generate a sample $C_{1}, \ldots, C_{m}$ of configurations distributed according to $\propto e^{-\beta E(C)}$. But how do we get such samples? We choose any (!) $T_{C, C^{\prime}}$ satisfying the detailed balance condition (A.16):
\begin{equation*}
T_{C, C^{\prime}}\, e^{-\beta E\left(C^{\prime}\right)}=T_{C^{\prime}, C}\, e^{-\beta E(C)} . \tag{A.34}
\end{equation*}
Then, according to the above discussion, we expect that an initial distribution $p_{C}(t)$ will reach the equilibrium distribution $p_{C}^{\text{eq}}=e^{-\beta E(C)} / Z$ after about $N$ time steps, where $N$ is the system size. If we choose transition amplitudes such that also $\sum_{C^{\prime}} T_{C^{\prime}, C}=1$ for all $C$, the discretized version of the master equation then becomes
\[
p_{C^{\prime}}(t+1)=\sum_{C} T_{C^{\prime}, C}\, p_{C}(t)
\]
In the simplest case, the sum is over all configurations $C$ differing from $C^{\prime}$ by flipping precisely one spin. If $C_{i}^{\prime}$ is the configuration obtained from some configuration $C^{\prime}$ by flipping spin $i$, we therefore assume that $T_{C^{\prime}, C}$ is non-zero only if $C=C_{i}^{\prime}$ for some $i$.
Stating the algorithm in a slightly different way, we can say that, for a given configuration, we accept the change $C \rightarrow C^{\prime}$ randomly with probability $T_{C^{\prime}, C}$. A very simple and practical choice for the acceptance probability (in other words $T_{C^{\prime}, C}$) satisfying our conditions is given by
\begin{equation*}
p_{\text{accept}}= \begin{cases}1 & \text{if } E\left(C^{\prime}\right) \leqslant E(C) \\ e^{-\beta\left[E\left(C^{\prime}\right)-E(C)\right]} & \text{if } E\left(C^{\prime}\right) \geqslant E(C)\end{cases} \tag{A.35}
\end{equation*}
We may then summarize the algorithm as follows:

Metropolis Algorithm

(1) Choose an initial configuration $C$.
(2) Choose randomly a spin $i$ and determine the change in energy $\delta_{i} E=E\left(C_{i}\right)-E(C)$ for the new configuration $C_{i}$ obtained by flipping the spin $i$.
(3) Choose a uniformly distributed random number $u \in[0,1]$. If $u<e^{-\beta \delta_{i} E}$, change $\sigma_{i} \rightarrow-\sigma_{i}$; otherwise leave $\sigma_{i}$ unchanged.
(4) Rename $C_{i} \rightarrow C$.
(5) Go back to (2).
Running the algorithm $m$ times, going through approximately $N$ iterations each time, gives the desired sample $C_{1}, \ldots, C_{m}$, distributed approximately according to $e^{-\beta E(C)} / Z$. The expectation value $\langle F\rangle$ is then computed as the average of $F(C)$ over the sample $C_{1}, \ldots, C_{m}$. An important practical point is that the change in energy when we flip one spin is very easy to calculate in the example of the Ising model, because the interaction is local (i.e. we have to compute only the $2 d$ terms associated with the nearest neighbours of $i$), as it is in most models.
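Steps (1)–(5) can be sketched in a few lines of Python for the 2D Ising model with $b = 0$ (an illustrative implementation, not from the original notes; the function name, lattice representation, and parameter defaults are our own choices). Note how the locality of the interaction enters: flipping a spin changes only its $2d = 4$ nearest-neighbour bonds, so $\delta_i E$ costs $\mathcal{O}(1)$ work.

```python
import math
import random

def metropolis_ising(L=8, beta=0.3, J=1.0, sweeps=200, seed=0):
    """Metropolis sampling for the 2D Ising model (b = 0) on an L x L
    periodic lattice, following steps (1)-(5).  One sweep = N = L*L
    single-spin-flip attempts; returns the per-spin magnetization
    measured once per sweep."""
    rng = random.Random(seed)
    N = L * L
    # step (1): random initial configuration
    spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
    mags = []
    for _ in range(sweeps):
        for _ in range(N):
            # step (2): pick a random spin; only the 4 nearest-neighbour
            # bonds change, so delta_i E = E(C_i) - E(C) = 2*J*s*(sum of nn)
            x, y = rng.randrange(L), rng.randrange(L)
            s = spins[x][y]
            nn = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
                  + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
            dE = 2.0 * J * s * nn
            # step (3): accept with probability min(1, e^{-beta*dE}), cf. (A.35)
            if dE <= 0 or rng.random() < math.exp(-beta * dE):
                spins[x][y] = -s  # step (4): C_i becomes the new C
        mags.append(sum(sum(row) for row in spins) / N)
    return mags
```

At $\beta = 0.3$, well above the critical temperature of the 2D model, the sampled magnetization should fluctuate around zero; averaging an observable over the recorded sample approximates $\langle F \rangle$ as described above.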

A.5 Eigenstate thermalization

[In this section, we shall be following Srednicki: J. Phys. A 32 (1999) 1163-1175, Sections 2 and 3.] We have seen previously that in classical systems, the ensemble average of an observable equals the time average under typical conditions. This can be seen as a kind of statement about equilibration. For quantum systems showing quantum chaos, one can give another argument why the system will equilibrate no matter what its initial state was. It does not rely on an incomplete knowledge of the dynamics (as discussed in chapter 3.2) but rather on the specific structure of observables in such systems. Assume that the dynamics is ruled by the time-independent (exact) Hamiltonian $\hat{H}$ with nondegenerate eigenvalues $E_{n}$ and eigenstates $|n\rangle$. Then, in the energy eigenbasis, the diagonal matrix elements of an observable $A$ can be expressed in terms of a function of one variable $E_{n}$, and the off-diagonal matrix elements in terms of a function of two variables $E_{n}$ and $E_{m}$, or in terms of their sum and difference
\[
E_{n m}=\frac{E_{n}+E_{m}}{2}, \quad \omega_{n m}=\frac{E_{n}-E_{m}}{2} .
\]
Thus, one can make the ansatz
\begin{equation*}
A_{n m}=\alpha\left(E_{n}\right) \delta_{n m}+\mathrm{e}^{-\frac{S\left(E_{n m}\right)}{2}} f\left(E_{n m}, \omega_{n m}\right) R_{n m} \tag{A.36}
\end{equation*}
where $R_{n m}$ is some complex matrix, $\alpha$ and $f$ are smooth real-valued functions, and where
\[
S(E)=\log \left(E \sum_{n} h\left(E-E_{n}\right)\right)
\]
with $h$ a nonnegative smooth function having unit integral, chosen so that $S$ is monotonic. Calculation of the expectation value of $A$ in the canonical ensemble at temperature $T$ yields
\[
\langle A\rangle_{\text{ens}}=\frac{\operatorname{tr}\left(A \mathrm{e}^{-\beta \hat{H}}\right)}{\operatorname{tr}\left(\mathrm{e}^{-\beta \hat{H}}\right)}=\frac{\int_{0}^{\infty} \frac{\mathrm{d} E}{E}\, \alpha(E)\, \mathrm{e}^{S(E)-\beta E}}{\int_{0}^{\infty} \frac{\mathrm{d} E}{E}\, \mathrm{e}^{S(E)-\beta E}}+O\left(\mathrm{e}^{-S / 2}\right),
\]
where in the last term, $S$ stands for the entropy of the ensemble. Extending this interpretation to the function $S$ in the integrals and using that the entropy is an extensive quantity, one may evaluate the integrals by the method of stationary phase (see e.g. Appendix A.2 in the script on quantum mechanics). This leads to
\begin{equation*}
\langle A\rangle_{\mathrm{ens}}=\alpha\left(\langle\hat{H}\rangle_{\mathrm{ens}}\right)+O\left(N^{-1}\right)+O\left(\mathrm{e}^{-S / 2}\right), \tag{A.37}
\end{equation*}
where $T$, $S$ and $E$ are related by $E=\langle\hat{H}\rangle_{\mathrm{ens}}$ and the usual relationship
\[
T=\left(\left.\frac{\partial S}{\partial E}\right|_{E}\right)^{-1}
\]
On the other hand, for the time average $\langle A\rangle_{\Psi, \text{time}}$ of the expectation value $\left\langle\Psi_{t} \mid A \Psi_{t}\right\rangle$ in some given state $\Psi$, we find
\begin{equation*}
\langle A\rangle_{\Psi, \mathrm{time}}=\lim _{\tau \rightarrow \infty} \frac{1}{\tau} \int_{0}^{\tau}\left\langle\Psi_{t}\right| A\left|\Psi_{t}\right\rangle \mathrm{d} t=\sum_{n}\left|\gamma_{n}\right|^{2} A_{n n} \tag{A.38}
\end{equation*}
where $\gamma_{n}$ are the expansion coefficients of $\Psi(0)$ relative to the energy eigenbasis,
\[
\Psi(0)=\sum_{n} \gamma_{n}|n\rangle .
\]
In view of (A.36), this yields
\begin{equation*}
\langle A\rangle_{\Psi, \text{time}}=\sum_{n}\left|\gamma_{n}\right|^{2} \alpha\left(E_{n}\right)+O\left(\mathrm{e}^{-S / 2}\right) . \tag{A.39}
\end{equation*}
For a state of a macroscopic system that could realistically be prepared in a lab, the uncertainty of $\hat{H}$ in the state $\Psi$ typically satisfies
\[
\Delta_{\Psi}(\hat{H}) \sim \frac{\langle\hat{H}\rangle_{\Psi}}{\sqrt{N}}
\]
so it is small for large $N$. Thus, expanding $\alpha$ in (A.39) about $\langle\hat{H}\rangle_{\Psi}$ to second order (the first-order term vanishes since $\langle\hat{H}\rangle_{\Psi}=\sum_{n}\left|\gamma_{n}\right|^{2} E_{n}$), we obtain the approximation
\begin{align*}
\langle A\rangle_{\Psi, \text{time}} &=\alpha\left(\langle\hat{H}\rangle_{\Psi}\right)+\frac{1}{2} \alpha^{\prime \prime}\left(\langle\hat{H}\rangle_{\Psi}\right) \Delta_{\Psi}^{2}(\hat{H})+\cdots+O\left(\mathrm{e}^{-S / 2}\right) \\
&=\alpha\left(\langle\hat{H}\rangle_{\Psi}\right)+O\left(\Delta_{\Psi}^{2}(\hat{H})\right)+O\left(\mathrm{e}^{-S / 2}\right)
\end{align*}
Combining this with the approximation (A.37) for the ensemble average of $A$ and choosing $T$ so that $\langle\hat{H}\rangle_{\text{ens}}=\langle\hat{H}\rangle_{\Psi}$, we finally obtain
\[
\langle A\rangle_{\Psi, \text{time}}=\langle A\rangle_{\mathrm{ens}}+O\left(\Delta_{\Psi}^{2}(\hat{H})\right)+O\left(N^{-1}\right)+O\left(\mathrm{e}^{-S / 2}\right) .
\]
As a result, the time average of the expectation value of A A AAA in a realistically preparable state is approximately equal to the ensemble average of A A AAA at the appropriate temperature. Therefore, no matter what (realistic) state the system is prepared in, it will always equilibrate.

Appendix B

Exercises

B.1 Exercises for chapter 2

Problem B.1. [Random walk] Let $w(x)\, d x$ be an arbitrary probability distribution describing the probability for finding a real random variable $X$ in the 'interval' $[x, x+d x]$. Let the mean and spread be defined as usual as
\[
\mu=\int x\, w(x)\, d x, \quad \sigma=\left(\int(x-\mu)^{2} w(x)\, d x\right)^{\frac{1}{2}} .
\]
Consider now a random walk on the real axis $\mathbb{R}$ with increment/decrement $X_{i}$ at step $i$. The mean distance covered from the origin after $N$ steps is $Y=\frac{1}{N}\left(X_{1}+\cdots+X_{N}\right)$. The aim is to show that, for large $N$, $Y$ has approximately a Gaussian probability distribution, with mean $\mu$ and spread $\sigma / \sqrt{N}$.
a) Introduce the variable $Z=\sum\left(X_{i}-\mu\right) / \sqrt{N}$ and demonstrate that its probability distribution $w_{Z}(z)\, d z$ is given by
\[
w_{Z}(z)=\int \frac{d k}{2 \pi} e^{i k z} \int d x_{1} \ldots d x_{N} \prod w\left(x_{i}\right) \exp \left(-i k\left(x_{1}+\cdots+x_{N}\right) / \sqrt{N}+i k \mu \sqrt{N}\right)
\]
b) Introduce the 'characteristic function'
\[
\chi(q)=\tilde{w}(q)=\int d x\, w(x)\, e^{-i q x},
\]
where $\tilde{w}(q)$ is the Fourier transform of $w(x)$, and write $w_{Z}(z)$ in terms of it.
c) Show that the first terms in the expansion $\log \chi(q)=\sum_{n}\left\langle x^{n}\right\rangle_{c}(-i q)^{n} / n!$ are given by the cumulants $\langle x\rangle_{c}=\mu$, $\left\langle x^{2}\right\rangle_{c}=\sigma^{2}$. Substitute this into b) and show that the result may be written as
\[
w_{Z}(z)=\int \frac{d k}{2 \pi} e^{i k z-\frac{1}{2}(\sigma k)^{2}+\cdots}
\]
where $\cdots$ stands for terms going to zero as $1 / \sqrt{N}$ or faster for large $N$.
d) Deduce that for large $N$ one has $w_{Z}(z) \rightarrow \frac{1}{\sqrt{2 \pi}\, \sigma} \exp \left(-z^{2} /\left(2 \sigma^{2}\right)\right)$. Relating this to the distribution of $Y$, obtain the desired result.
e) What is the wider significance of the result beyond a random walk on R R R\mathbb{R}R ?
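The content of this problem (the central limit theorem) is easy to check empirically. A short Python sketch (our own illustration; the function name is ours), drawing the $X_i$ uniformly from $[0,1]$, so that $\mu = 1/2$ and $\sigma^2 = 1/12$, and sampling $Z = \sum_i (X_i - \mu)/\sqrt{N}$ many times:

```python
import random
import statistics

def sample_Z(N, trials, seed=0):
    """Draw `trials` realizations of Z = sum_i (X_i - mu) / sqrt(N)
    for X_i uniform on [0,1] (mu = 1/2, sigma^2 = 1/12)."""
    rng = random.Random(seed)
    mu = 0.5
    return [sum(rng.random() - mu for _ in range(N)) / N ** 0.5
            for _ in range(trials)]
```

For $N = 100$ the empirical mean of $Z$ should be near 0, its spread near $\sigma = 1/\sqrt{12} \approx 0.289$, and the fraction of samples with $|Z| < \sigma$ near the Gaussian value $\approx 0.683$, all within statistical error, independently of the (non-Gaussian) distribution of the individual $X_i$.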
Problem B.2 (Entropy maximization 1). A system has $N$ states occupied with probabilities $p_{n}$, $n=0,1, \ldots, N$. The $n$-th state has energy $E_{n}=n$. The average energy, $U=\sum_{n} E_{n} p_{n}$, is assumed to be given.
a) Show that the entropy is maximized by a distribution of the form $p_{n}=e^{-\beta n} / Z$.
b) In the case $N=2$, work out the explicit form of $\beta$ and $Z$ in terms of $U$.
c) Suppose now that the standard deviation,
\begin{equation*}
\Delta U=\sqrt{\left(\sum_{n} E_{n}^{2} p_{n}\right)-U^{2}} \tag{B.1}
\end{equation*}
is also known. What is the form of the distribution maximizing the entropy in this case for general $N$?
Your answers should indicate why the distribution is an actual maximum of the entropy and not just a stationary point.
Problem B.3 (Entropy maximization 2). Repeatedly throwing an ideal die will evidently yield the average result of 3.5. However, it is found for an (obviously manipulated) die that the average is instead 4. In the absence of further information, what is the probability distribution, to within 3 significant figures, assigned to this die?
Hint: Maximize the information entropy. You may use a computer (e.g. MATHEMATICA) to help you with any equation that you need to solve numerically.
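The numerical step in the hint can be carried out in a few lines. Below is an illustrative Python sketch (our own code, not part of the exercise; any tool with a root finder would do just as well): entropy maximization under the mean constraint gives $p_n \propto e^{-\beta n}$ on the faces $n = 1, \ldots, 6$, and $\beta$ is fixed by bisection, since the mean is a monotonically decreasing function of $\beta$.

```python
import math

def maxent_die(target_mean, lo=-5.0, hi=5.0, tol=1e-12):
    """Maximum-entropy distribution on die faces 1..6 with prescribed mean:
    p_n = exp(-beta * n) / Z, with beta found by bisection."""
    def mean(beta):
        ws = [math.exp(-beta * n) for n in range(1, 7)]
        return sum(n * w for n, w in zip(range(1, 7), ws)) / sum(ws)

    # the mean decreases monotonically from ~6 (beta -> -inf) to ~1 (beta -> +inf)
    assert mean(lo) > target_mean > mean(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) > target_mean:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    ws = [math.exp(-beta * n) for n in range(1, 7)]
    Z = sum(ws)
    return beta, [w / Z for w in ws]
```

As a check, `maxent_die(3.5)` returns $\beta = 0$ and the uniform distribution of the ideal die; a mean above 3.5 requires $\beta < 0$, tilting the distribution towards the high faces.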
Problem B.4 (Information entropy). This problem motivates the definition of information entropy. Consider an experiment with $N$ possible outcomes that occur randomly with probabilities $p_{1}, \ldots, p_{N}$. (Think of throwing a die, where $N=6$, $p_{i}=\frac{1}{6}$.) To determine which outcome $O$ has occurred, we allow ourselves yes/no questions of the type: Is $O \in S$?, where $S$ is some subset of $\{1, \ldots, N\}$. For example, for $S=\{i\}$, the corresponding question would be: Is $O \in\{i\}$, i.e. has event $i$ occurred? Or, if $S=\{1,3\}$, the question is: Has event 1 or event 3 occurred?
a) Consider the following question strategy to find out what $O$ was: We first ask: Has outcome 1 occurred? If yes, we are done; if no, we ask: Has outcome 2 occurred? And so on. What is the maximum number of questions needed to determine $O$ in this strategy? What is the average number of questions needed in this strategy?
b) Let $I=-\sum_{i} p_{i} \log _{2} p_{i}$ be the information entropy. Show that in any strategy, the average number of questions needed to determine $O$ is $\geqslant I$.
c) Verify that this is indeed the case for strategy a), applied to a die. For a die, suggest a strategy which requires fewer questions on average than a).
d)* Show that there always exists a strategy such that the average number of questions needed is $\leqslant I+1$. Parts b) and d) show that $I$ is an estimate of the average number of questions needed to find out what the outcome was.
Problem B.5 (Ising spin chain). We consider the 1-dimensional Ising spin chain with periodic boundary conditions. In this model, we have $n$ spins $\sigma_{1}, \ldots, \sigma_{n} \in\{\pm 1\}$, and the energy of a configuration is given by
$$H=-J\left(\sigma_{1} \sigma_{2}+\sigma_{2} \sigma_{3}+\cdots+\sigma_{n-1} \sigma_{n}+\sigma_{n} \sigma_{1}\right)$$
where $J>0$. The probability distribution in the canonical ensemble is
$$P\left(\left\{\sigma_{j}\right\}\right)=\frac{1}{Z} \exp\left(-\beta H\left(\left\{\sigma_{j}\right\}\right)\right).$$
a) Draw a picture of the spin chain for a configuration minimizing/maximizing $H$.
b) Show that $Z=\sum_{\left\{\sigma_{j}\right\}} \exp\left(-\beta H\left(\left\{\sigma_{j}\right\}\right)\right)$.
c) Show that $Z$ can be written alternatively as
$$Z=\sum_{\sigma_{1}=\pm 1} \cdots \sum_{\sigma_{n}=\pm 1} T_{\sigma_{1} \sigma_{2}} T_{\sigma_{2} \sigma_{3}} \cdots T_{\sigma_{n-1} \sigma_{n}} T_{\sigma_{n} \sigma_{1}}=\operatorname{tr} T^{n}$$
with the "transfer matrix" $T_{\sigma \sigma^{\prime}}=e^{\beta J \sigma \sigma^{\prime}}$. In matrix form,
$$T=\begin{pmatrix} T_{++} & T_{+-} \\ T_{-+} & T_{--} \end{pmatrix}=\begin{pmatrix} e^{\beta J} & e^{-\beta J} \\ e^{-\beta J} & e^{\beta J} \end{pmatrix}$$
Hint: write out $H$ and the multiple sums in $Z=\sum_{\left\{\sigma_{j}\right\}} \exp\left(-\beta H\left(\left\{\sigma_{j}\right\}\right)\right)$.
d) Show that $T$ has eigenvalues $\lambda_{1}=2 \cosh(\beta J)$, $\lambda_{2}=2 \sinh(\beta J)$. (This means that we can diagonalize $T$, i.e. we have $T=U D U^{\dagger}$, where $D=\operatorname{diag}\left(\lambda_{1}, \lambda_{2}\right)$ and $U$ is a unitary matrix.) Use this and $Z=\operatorname{tr} T^{n}$ to show that
$$Z=2^{n}\left[(\cosh(\beta J))^{n}+(\sinh(\beta J))^{n}\right].$$
Note that you do not need to compute $U$ (explain why!).
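For small $n$, parts b)–d) can be verified numerically. The sketch below compares a brute-force sum over all $2^n$ configurations of the periodic chain with $\operatorname{tr} T^n$ and with the eigenvalue formula; the parameter values are arbitrary test choices:

```python
import numpy as np
from itertools import product

J, beta, n = 1.0, 0.5, 6   # arbitrary test values

# Brute force: sum exp(-beta H) over all 2^n configurations.
Z_direct = 0.0
for s in product((-1, 1), repeat=n):
    E = -J * sum(s[i] * s[(i + 1) % n] for i in range(n))  # periodic b.c.
    Z_direct += np.exp(-beta * E)

# Transfer-matrix form: Z = tr T^n.
T = np.array([[np.exp(beta * J), np.exp(-beta * J)],
              [np.exp(-beta * J), np.exp(beta * J)]])
Z_tm = np.trace(np.linalg.matrix_power(T, n))

# Eigenvalue formula from part d).
Z_eig = 2.0**n * (np.cosh(beta * J)**n + np.sinh(beta * J)**n)

print(Z_direct, Z_tm, Z_eig)
```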
Problem B.6 (Initial conditions). $1\,\mathrm{mm}^{3}$ of a gas at normal pressure and temperature contains about $10^{15}$ particles. Considering the particles as point-like and classical, provide a rough, conservative estimate for how many hard drives would be necessary to store the initial conditions of all gas particles. (As of 2013, a normal hard drive can store about 5 TB of data.)
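For orientation, a minimal version of the estimate, assuming 64-bit floating-point storage per coordinate (the precision is a choice left open by the problem, so this is only one possible answer):

```python
# Each classical point particle needs 3 position + 3 momentum
# coordinates; assume 8 bytes (a 64-bit float) per coordinate.
n_particles = 1e15
bytes_per_particle = 6 * 8
total_bytes = n_particles * bytes_per_particle   # 4.8e16 B = 48 PB
drive_bytes = 5e12                               # 5 TB drive (2013)
n_drives = total_bytes / drive_bytes
print(n_drives)   # on the order of 10^4 drives
```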
Problem B.7 (Time evolution of ensemble averages).
a) Let $(P, Q) \equiv\left(\vec{p}_{1}, \vec{q}_{1}, \ldots, \vec{p}_{N}, \vec{q}_{N}\right) \in \mathbb{R}^{6 N}$ be a point in the phase space of $N$ particles, and let the dynamical law be given through a time-independent Hamiltonian $H(P, Q)$, which defines the trajectories $(P(t), Q(t))$ via Hamilton's equations. Let $\Phi_{t}(P, Q):=(P(t), Q(t))$ with initial condition $(P(0), Q(0))=(P, Q)$. Show that for any observable $O(P, Q)$, the "time-translated" observable $O_{t}(P, Q)=O\left[\Phi_{t}(P, Q)\right]$ has the same expectation value, $\langle O\rangle=\left\langle O_{t}\right\rangle$ for all $t$, provided the probability distribution describing the ensemble has constant value on each energy surface.
b) Is the phase space for $N$ particles always of the form $\mathbb{R}^{6 N}$? Hint: Think e.g. of a gas consisting of molecules.
Problem B.8 (Phase space density). The purpose of this problem is to explain why the phase space density for an equilibrium ensemble must generically be a function of $E$ alone.
a) Show that the "classical trace" of a rapidly decaying function $f(P, Q)$ on the phase space $\mathbb{R}^{6 N}$, defined by
$$\operatorname{tr}(f)=\int f(P, Q)\, d^{3 N} Q\, d^{3 N} P$$
has the properties
$$\operatorname{tr}\{f, g\}=0, \quad(\operatorname{tr}(f))^{*}=\operatorname{tr}\left(f^{*}\right), \quad \operatorname{tr}\left(f^{*} f\right) \geqslant 0$$
Hint: Use the results of problem B.7.
b) Let $\rho(P, Q)$ be a classical phase space distribution, $O(P, Q)$ an observable, and $\langle O\rangle=\operatorname{tr}(O \rho)$, with the "classical trace" defined in a). Show that if $\rho=\rho\left(I_{0}, \ldots, I_{N}\right)$ is a function of conserved quantities $I_{i}$ of the system (which include at least $I_{0}=H$), then the ensemble defined by $\rho$ is stationary, i.e. $\left\langle O_{t}\right\rangle=\langle O\rangle$ for all $t$ and all $O$. Is the converse statement also true? What are the conserved quantities for a generic Hamiltonian of the form
$$H(P, Q)=\sum_{i} \frac{\left|\vec{p}_{i}\right|^{2}}{2 m}+\sum_{i<j} V\left(\left|\vec{x}_{i}-\vec{x}_{j}\right|\right)+\sum_{i} W\left(\vec{x}_{i}\right)$$
(note that $W$ can be used to describe the "walls" of a box). What if $W=0$? What if $W\left(\vec{x}_{i}\right)=w\left(\left|\vec{x}_{i}\right|\right)$, i.e. a spherically symmetric external potential? What if $V=W=0$?
Problem B.9 (Density matrices).
a) Verify the following elementary properties of the trace of matrices:
$$\operatorname{tr}[A, B]=0, \quad(\operatorname{tr} A)^{*}=\operatorname{tr}\left(A^{\dagger}\right), \quad \operatorname{tr}\left(A^{\dagger} A\right) \geqslant 0$$
b) Let $\rho, \sigma$ be density matrices. Show that for any nonnegative real numbers $p, q$ satisfying $p+q=1$, the matrix $p \rho+q \sigma$ has the properties of a density matrix.
Problem B.10 (Entanglement entropy). Ignoring all degrees of freedom other than spin, the Hilbert space of a single neutron is $\mathscr{H}=\mathbb{C}^{2}$, with spin operators
$$\sigma_{x}=\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_{y}=\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_{z}=\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
a) The neutron is aligned with probability $p$ in the $+z$ direction and with probability $1-p$ in the $+x$ direction. What is the density matrix describing this ensemble? What are its eigenvalues?
b) Now consider two neutrons in the normalized state $|\psi\rangle=\alpha|\uparrow \uparrow\rangle+\beta|\downarrow \downarrow\rangle$, where $|\uparrow\rangle$ resp. $|\downarrow\rangle$ are the normalized eigenstates for the $+z$ resp. $-z$ direction and $|\uparrow \uparrow\rangle=|\uparrow\rangle \otimes|\uparrow\rangle$ etc. What is the reduced density matrix for the first neutron? When is the corresponding entanglement entropy maximal/minimal?
c) Assume now that the two-neutron system is in an eigenstate $|\chi\rangle$ with zero total spin in the $z$-direction. What is the reduced density matrix for the first neutron? Show that if $|\chi\rangle$ is either symmetric or anti-symmetric, then the entanglement entropy is maximized.
d) Can an arbitrary density matrix $\rho_{1}$ for the first neutron arise as the reduced density matrix of a suitable (pure) state of the two-neutron system?
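For part b), the partial trace can be carried out numerically. The sketch below (plain NumPy; the helper names are ours) builds $|\psi\rangle=\alpha|\uparrow\uparrow\rangle+\beta|\downarrow\downarrow\rangle$, traces out the second neutron, and evaluates the (natural-log) entanglement entropy for the choice $\alpha=\beta=1/\sqrt{2}$:

```python
import numpy as np

def reduced_rho_first(alpha, beta):
    # |psi> = alpha |up,up> + beta |down,down> in C^2 (x) C^2,
    # basis ordering (up,up), (up,down), (down,up), (down,down).
    psi = np.zeros(4, dtype=complex)
    psi[0], psi[3] = alpha, beta
    rho = np.outer(psi, psi.conj()).reshape(2, 2, 2, 2)  # (a, b, a', b')
    return np.einsum('abcb->ac', rho)                    # trace out neutron 2

def entanglement_entropy(rho1):
    evals = np.linalg.eigvalsh(rho1)
    evals = evals[evals > 1e-12]          # drop zero eigenvalues
    return float(-np.sum(evals * np.log(evals)))

rho1 = reduced_rho_first(1 / np.sqrt(2), 1 / np.sqrt(2))
S = entanglement_entropy(rho1)
print(rho1, S)
```

For this choice the reduced state is maximally mixed, $S=\log 2$, the largest value possible on $\mathbb{C}^2$.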

B.2 Exercises for chapter 3

Problem B.11 (Boltzmann's H-theorem). Let $f(\vec{v}, \vec{x}, t)$ be the 1-particle probability distribution, expressed in terms of the velocity $\vec{v}=\vec{p} / m$. Define a function $H(t)$ by
$$H(t)=-\int d^{3} v\, d^{3} x\, f(\vec{v}, \vec{x}, t) \log f(\vec{v}, \vec{x}, t)$$
which is similar in nature to the von Neumann ($\propto$ information-) entropy. The aim of this problem is to derive the important consequence $\dot{H} \geqslant 0$ of the Boltzmann equation.
a) Explain why the Boltzmann equation can alternatively be written as
$$\left[\frac{\partial}{\partial t}+\vec{v} \cdot \frac{\partial}{\partial \vec{x}}+\frac{1}{m} \vec{F}(\vec{x}) \cdot \frac{\partial}{\partial \vec{v}}\right] f(\vec{x}, \vec{v}, t)=\int d^{3} v_{2}\, d^{3} v_{3}\, d^{3} v_{4}\, W\left(\vec{v}, \vec{v}_{2}, \vec{v}_{3}, \vec{v}_{4}\right)\left[f\left(\vec{x}, \vec{v}_{3}, t\right) f\left(\vec{x}, \vec{v}_{4}, t\right)-f(\vec{x}, \vec{v}, t) f\left(\vec{x}, \vec{v}_{2}, t\right)\right]$$
for some $W>0$. Hint: "Undo" the momentum- and energy-conservation rule which has already been included in the Boltzmann equation by introducing $\delta^{3}\left(\vec{v}+\vec{v}_{2}-\vec{v}_{3}-\vec{v}_{4}\right)$ and new integrations, etc.
b) Argue physically why the following relations should hold:
$$\begin{aligned}
W\left(\vec{v}_{1}, \vec{v}_{2}, \vec{v}_{3}, \vec{v}_{4}\right)&=W\left(\vec{v}_{2}, \vec{v}_{1}, \vec{v}_{4}, \vec{v}_{3}\right) \\
W\left(-\vec{v}_{1},-\vec{v}_{2},-\vec{v}_{3},-\vec{v}_{4}\right)&=W\left(\vec{v}_{1}, \vec{v}_{2}, \vec{v}_{3}, \vec{v}_{4}\right) \\
W\left(\vec{v}_{1}, \vec{v}_{2}, \vec{v}_{3}, \vec{v}_{4}\right)&=W\left(-\vec{v}_{3},-\vec{v}_{4},-\vec{v}_{1},-\vec{v}_{2}\right)
\end{aligned}$$
Hint: What is the physical meaning of these equations for the collision?
c) Let $H(\vec{x}, t)$ be defined as $H(t)$ but without the $d^{3} x$-integration. Show that
$$\dot{H}(\vec{x}, t)=\frac{\partial}{\partial \vec{x}} \cdot \int d^{3} v\, \vec{v}\, f(\vec{x}, \vec{v}, t) \log f(\vec{x}, \vec{v}, t)+I(\vec{x}, t)$$
where
$$I=\int d^{3} v_{1}\, d^{3} v_{2}\, d^{3} v_{3}\, d^{3} v_{4}\, W_{1234}\left(f_{1} f_{2}-f_{3} f_{4}\right)\left(1+\log f_{1}\right)$$
using the shorthand $f_{1}=f\left(\vec{x}, \vec{v}_{1}, t\right)$, $f_{2}=f\left(\vec{x}, \vec{v}_{2}, t\right)$, etc.
d) Using b), show that $I$ can be written as
$$I=\frac{1}{4} \int d^{3} v_{1}\, d^{3} v_{2}\, d^{3} v_{3}\, d^{3} v_{4}\, W_{1234}\left(f_{1} f_{2}-f_{3} f_{4}\right) \log \frac{f_{1} f_{2}}{f_{3} f_{4}}$$
and conclude that $I \geqslant 0$. Using c), show that $\dot{H} \geqslant 0$.
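The positivity of $I$ in d) rests on an elementary inequality, recorded here as a reminder (apply it with $a=f_1 f_2$ and $b=f_3 f_4$, together with $W_{1234} \geqslant 0$):

```latex
% Monotonicity of the logarithm: for positive reals a, b,
(a - b)\log\frac{a}{b} \;\geqslant\; 0, \qquad \text{with equality iff } a = b.
```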
e) What is the physical significance of this result?
Problem B.12 (Master equation). We consider a time-dependent probability distribution $\left\{p_{i}(t)\right\}$ described by the master equation:
$$\frac{d}{d t} p_{i}(t)=\sum_{j: j \neq i}\left[T_{i j} p_{j}(t)-T_{j i} p_{i}(t)\right]$$
Each transition amplitude $T_{i j}$ is assumed to be positive. Assume the detailed balance condition:
$$T_{i j} e^{-\beta E_{j}}=T_{j i} e^{-\beta E_{i}}$$
a) Show that the time-independent distribution $p_{i}=e^{-\beta E_{i}} / Z$ is a (stationary) solution to the master equation.
b) Show that the master equation implies $\sum_{i} p_{i}(t)=1$ for all $t$ if this is true at $t=0$. (Hint: differentiate this sum and use the master equation.) Give an argument why each $p_{i}(t)$ has to remain positive for all times if this holds initially. (Hint: consider a $t_{0}$ such that $p_{i}\left(t_{0}\right)=0$ and use the master equation.)
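Parts a) and b) can be checked numerically on a small system. The sketch below uses a hypothetical 3-level system with Metropolis-type rates (one arbitrary choice satisfying detailed balance) and a simple Euler integration; it confirms that normalization is preserved and that the distribution relaxes to $e^{-\beta E_i}/Z$:

```python
import numpy as np

beta = 1.0
E = np.array([0.0, 0.5, 1.3])     # arbitrary energy levels

# Metropolis-type rates: T_ij = min(1, exp(-beta (E_i - E_j))),
# which satisfy the detailed balance condition by construction.
T = np.minimum(1.0, np.exp(-beta * (E[:, None] - E[None, :])))
np.fill_diagonal(T, 0.0)

def rhs(p):
    # dp_i/dt = sum_{j != i} [T_ij p_j - T_ji p_i]
    return T @ p - T.sum(axis=0) * p

p = np.array([1.0, 0.0, 0.0])     # start far from equilibrium
dt = 0.01
for _ in range(20000):            # crude Euler scheme, t_final = 200
    p = p + dt * rhs(p)

p_eq = np.exp(-beta * E) / np.exp(-beta * E).sum()
print(p, p_eq)
```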
c) Consider a population of bacteria. Let $n$ be the number of bacteria, $M$ the mortality rate, and $R$ the reproduction rate. $p_{n}(t)$ is the probability that the population consists of $n$ bacteria. We consider the evolution equation
$$\dot{p}_{n}(t)=R(n-1) p_{n-1}(t)+M(n+1) p_{n+1}(t)-(M+R) n\, p_{n}(t)$$
for $n>0$ and $\dot{p}_{0}(t)=M p_{1}(t)$ for $n=0$. Show that the equation has the form of a master equation. Derive the possible equilibrium state(s) of this system.

B.3 Exercises for chapter 4

Problem B.13 (1-dimensional classical Ising model). The $d$-dimensional Ising model is exactly solvable in $d=1,2$, but not beyond. Here we look at the (easier) case when $d=1$. Consider a 1-dimensional lattice of $N+1$ atoms, each of which is assumed to carry a spin $\sigma_{i}=\pm 1$, $i=0, \ldots, N$. The energy of the state described by $\left\{\sigma_{i}\right\}=\left(\sigma_{0}, \ldots, \sigma_{N}\right)$ is assumed to be
$$H\left(\left\{\sigma_{i}\right\}\right)=-J \sum_{i=1}^{N} \sigma_{i} \sigma_{i-1}$$
where $J$ is a constant which determines the strength of the interaction.
a) Neighboring spins can have equal or opposite signs, in which case they are called "parallel" resp. "anti-parallel". Let $\nu$ be the number of anti-parallel pairs in $\left\{\sigma_{i}\right\}$. Express the energy in terms of $\nu$. Count the number of states with $\nu$ anti-parallel pairs. Hence, what is the number $W(E)$ of configurations $\left\{\sigma_{i}\right\}$ having $H\left(\left\{\sigma_{i}\right\}\right)=E$?
b) Using the result of a), calculate the canonical partition function
$$Z(\beta)=\sum_{\left\{\sigma_{i}\right\}} \mathrm{e}^{-\beta H\left(\left\{\sigma_{i}\right\}\right)}$$
Hint: Rewrite the sum as a sum over $\nu$.
c) Calculate the free energy per spin $F/N$ and the entropy per spin $S/N$ in the canonical and micro-canonical ensembles for large $N$.
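Once a) and b) are solved, the counting can be checked numerically for small $N$. The sketch below assumes the degeneracy $2\binom{N}{\nu}$ for configurations with $\nu$ anti-parallel pairs (the factor 2 from the free choice of $\sigma_0$) and the energy $E(\nu)=-J(N-2\nu)$, which is what part a) should yield, and compares against direct enumeration:

```python
from itertools import product
from math import comb, exp

J, beta, N = 1.0, 0.7, 8          # N bonds, N + 1 spins; test values

# Direct enumeration over all 2^(N+1) configurations.
Z_direct = 0.0
for spins in product((-1, 1), repeat=N + 1):
    E = -J * sum(spins[i] * spins[i - 1] for i in range(1, N + 1))
    Z_direct += exp(-beta * E)

# Sum over the number nu of anti-parallel pairs, assuming the
# counting 2 * C(N, nu) and E(nu) = -J * (N - 2 * nu).
Z_nu = sum(2 * comb(N, nu) * exp(beta * J * (N - 2 * nu))
           for nu in range(N + 1))
print(Z_direct, Z_nu)
```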
Problem B.14 (Heat capacity of a crystal). We study a simplified microscopic model to understand the heat capacity of a crystal. We suppose that the crystal consists of $N$ atoms (or ions) arranged in some sort of lattice. The equilibrium position of an atom is at some lattice site, about which it can oscillate. We assume that the oscillations are small and can be described by a harmonic oscillator, and that the individual oscillators are independent, i.e. do not interact with one another (to what extent is the last assumption realistic in practice?). The total Hamiltonian is hence the sum of the Hamiltonians for the individual oscillators:
$$H=\sum_{i=1}^{N} H_{i}, \quad H_{i}=\frac{1}{2 m} \vec{p}_{i}^{\,2}+\frac{m}{2} \omega^{2} \vec{x}_{i}^{\,2}$$
where $\vec{p}_{i}$ is the momentum, $\vec{x}_{i}$ is the position relative to the equilibrium position, and $m$ is the mass of the atom.
a) Einstein model, canonical approach.
i) Describe the eigenstates and eigenvalues (energy levels) of the crystal in terms of those of a single 1-dimensional harmonic oscillator.
ii) Give the quantum canonical partition function $Z_{N}$ as a function of the temperature.
iii) Deduce the mean energy $U$ and the specific heat $C$ of the system. Compare this to the specific heat of a paramagnetic (non-interacting) spin chain. What is the behavior of $C$ for high/low temperatures? What is the numerical value for $C$ at high temperature for a crystal containing $N_{A}=6.022 \times 10^{23}$ atoms? What is the characteristic temperature $T_{0}=\hbar \omega / k_{B}$ for a value $\hbar \omega=0.1\, \mathrm{eV}$?
iv) Evaluate also the classical canonical partition function $Z_{N}^{\text{class}}$ of $N$ distinguishable classical oscillators. Demonstrate that it is comparable to the quantum partition function for $T \gg T_{0}$. Comment?
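As a numerical companion to a) iii), the sketch below assumes the Einstein-model heat capacity $C=3 N k_B\, x^{2} e^{x}/(e^{x}-1)^{2}$ with $x=T_0/T$ (the result one should obtain above) and checks its two limits: the Dulong-Petit value $3 N k_B$ at high $T$ and exponential suppression at low $T$:

```python
import numpy as np

def einstein_C(T, T0, N=1.0):
    # Heat capacity in units of k_B for N three-dimensional
    # Einstein oscillators; x = T0 / T with T0 = hbar*omega/k_B.
    x = T0 / np.asarray(T, dtype=float)
    return 3.0 * N * x**2 * np.exp(x) / (np.exp(x) - 1.0)**2

T0 = 1160.0                          # K, roughly hbar*omega = 0.1 eV
C_high = einstein_C(20 * T0, T0)     # approaches 3 N k_B
C_low = einstein_C(0.05 * T0, T0)    # exponentially small
print(C_high, C_low)
```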
b) Micro-canonical approach. Here, one fixes the total energy, $E=\left(M+\frac{3}{2} N\right) \hbar \omega$, where $M \gg 1$.
i) What is the number $W(E)$ of accessible micro-states of the system?
ii) Deduce the entropy $S(E)$ of the system.
iii) Express $E$ as a function of the temperature.
iv) Compute $C$ and compare the result to that found in part a).
c) Modified Einstein model. The prediction for the heat capacity as a function of $T$ is qualitatively in accord with experiments at high temperatures, but not at low temperatures, where experiments show the behavior $C \sim T^{3}$. In order to have a model which is in accord with that behavior for low $T$, suppose that we have instead $N$ oscillators with variable frequencies $\omega_{i}$, $i=1, \ldots, N$, between 0 and some maximum $\omega_{\max}$. The heat capacity is now given by a sum. Approximate this sum by an integral in terms of the frequency distribution function $D(\omega)$, defined in such a way that $D(\omega)\, d\omega$ is the number of atoms with frequency in the range $\omega \ldots \omega+d\omega$. Assuming that $D(\omega)$ behaves as $D(\omega) \sim A \omega^{\nu}$ for $\omega \ll 1$, find the value of $\nu$ that reproduces the behavior $C \sim T^{3}$ for low $T$.
Problem B.15 (Paramagnetism). Consider a system of a large number $N$ of spins. Let $s_{i}=\pm 1$ be the value of the $i$-th spin in the $z$-direction.
a) In the absence of a magnetic field, all spin configurations $\left(s_{1}, \ldots, s_{N}\right)$ are equally probable.
i) What is the probability for a fixed configuration of spins?
ii) What is the probability of finding a configuration with $N_{+}$ positive spins (and $N_{-}=N-N_{+}$ negative spins)?
iii) Generalize the probability law in ii) to the case when a positive spin occurs with probability $p$ and a negative spin with probability $1-p$. Calculate, in this case, the mean value $\left\langle N_{+}\right\rangle$ and the spread $\Delta N_{+}$. What is the dependence on $N$ for a large system?
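A quick sampling check of iii): for the biased binomial law one expects $\langle N_+\rangle=Np$ and $\Delta N_+=\sqrt{Np(1-p)}$, so the relative spread shrinks like $1/\sqrt{N}$. The parameter values below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 10_000, 0.3
samples = rng.binomial(N, p, size=200_000)   # draws of N_+

mean_est = samples.mean()                    # expect N p = 3000
spread_est = samples.std()                   # expect sqrt(N p (1-p)) ~ 45.8
print(mean_est, spread_est, spread_est / mean_est)
```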
b) Now suppose the spins are associated with the electrons sitting at distinct lattice sites of a crystal. The magnetic moment associated with the $i$-th spin is
$$\vec{\mu}_{i}=-\mu \vec{\sigma}_{i}, \quad \mu=\frac{|q| \hbar}{2 m_{\mathrm{e}}}=9.27 \times 10^{-24}\, \mathrm{J/T}$$
where $\vec{\sigma}=\left(\sigma_{x}, \sigma_{y}, \sigma_{z}\right)$ are the Pauli matrices. What is the Hamiltonian in the presence of a constant magnetic field $\vec{B}$ in the $z$-direction? What are the maximum and minimum values $E_{\max/\min}$ of the energy of the system interacting with the external magnetic field? Express the degeneracy of a given energy level as a function of $N_{+}$, $N_{-}$.
c) Entropy:
i) Calculate the entropy of the system given the information that the energy is at the fixed value
$$E=\mu B\left(N_{+}-N_{-}\right).$$
Express the entropy in terms of $N$ and the variable $\epsilon=\frac{E}{N \mu B}$ (energy per spin in units of $\mu B$). Sketch $S=S(N, \epsilon)$. Verify that the entropy is a concave function of the energy, meaning that
$$\frac{\partial^{2} S(N, \epsilon)}{\partial \epsilon^{2}} \leqslant 0.$$
(You may recall Stirling's formula, which states $\log n!=n \log n-n+O(\log n)$ for large $n$.) Why is it plausible that the entropy should be concave?
ii) Suppose the energy of the system is not known exactly, but only up to $\Delta U$, where we assume $\Delta U \ll N \mu B$. What is the entropy of the system? Is the result significantly different from that in i)?
d) Temperature: Recall that the absolute temperature of the system is defined by
$$T=\frac{1}{\partial S / \partial E}$$
i) Express $T(E)$ as a function of energy for a spin system with $N$ spins. Sketch this function, and comment on the behavior of $T(E)$ for $E>0$!
ii) Invert the relation between temperature and energy and obtain the energy as a function $E=E(T, N)$ of $T$ and $N$.
iii) Consider a spin system with positive energy. The spin system is put in thermal contact with an ideal monoatomic gas at temperature $T_{\mathrm{g}}$. The energy of the gas is, as usual, $E_{\mathrm{g}}=\frac{3}{2} N_{\mathrm{g}} k_{B} T_{\mathrm{g}}$. Once thermal equilibrium is reached, what can be said about the final temperature $T_{\mathrm{f}}$? What are its limits for $N / N_{\mathrm{g}} \rightarrow \infty$ resp. $\rightarrow 0$?
e) Curie's law: The magnetization $\vec{M}=\left(M_{x}, M_{y}, M_{z}\right)$ is defined as the average magnetic moment in the spin system. The magnetic susceptibility per volume $V$ is defined in the small-field limit as${}^{1}$
$$\chi=\lim_{B \rightarrow 0} \frac{M}{V B}.$$
A substance having $\chi>0$ is called paramagnetic, while a substance having $\chi<0$ is called diamagnetic. For paramagnetic substances, one finds experimentally that, to a good precision, $\chi$ is inversely proportional to the absolute temperature. This behavior is called "Curie's law". Suppose the spin system is in thermal equilibrium at temperature $T$. Give the magnetic moment $M \equiv M_{z}$ as a function of $\beta$, $B \equiv B_{z}$, $N$.
Deduce the susceptibility and verify Curie's law. Calculate also the heat capacity $C=\partial E / \partial T$ and sketch $C$ as a function of $B/T$ and $T/B$.
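The small-field limit can be checked numerically. The sketch below assumes the magnetization $M=N \mu \tanh(\beta \mu B)$ (the result one should derive in e)) and confirms that the susceptibility approaches $N \mu^{2} \beta / V$, i.e. $\chi \propto 1/T$, as Curie's law states:

```python
import numpy as np

mu, N, V = 1.0, 1000.0, 1.0       # arbitrary units

def M(beta, B):
    # Magnetization of N independent spins (result of part e)).
    return N * mu * np.tanh(beta * mu * B)

beta = 2.0
B_small = 1e-6                    # approximates the B -> 0 limit
chi_numeric = M(beta, B_small) / (V * B_small)
chi_curie = N * mu**2 * beta / V  # Curie form, proportional to 1/T
print(chi_numeric, chi_curie)
```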
Problem B.16 (Mean field theory and ferromagnetism). A ferromagnetic material has a spontaneous magnetization below a critical temperature $T_{c}$, even in the absence of an external magnetic field $B$. Above $T_{c}$, the spontaneous magnetization is zero, and the material behaves like a paramagnet. To understand this effect, we study the famous Ising model. In this model, one considers spins $\sigma_{i}=\pm 1$ on the sites $i$ of a hypercubic lattice $\mathbb{Z}^{d}$ in $d$ spatial dimensions. The energy of a configuration of spins $\left\{\sigma_{i}\right\}$ is taken to be
$$H\left(\left\{\sigma_{i}\right\}\right)=-J \sum_{i j} \sigma_{i} \sigma_{j}-b \sum_{i} \sigma_{i}$$
The first sum is over all lattice bonds, i.e. pairs $(i, j)$ with $i<j$ such that spin $i$ and spin $j$ are nearest neighbors. The second sum is over all lattice sites. $b$ is related to the background field by $b=\mu B$, where $\mu$ is of the order of the Bohr magneton, $\sim 9.3 \times 10^{-24}\,\mathrm{J/T}$. $J$ is the ferromagnetic coupling between the spins. The probability distribution for the spin configurations is
$$\rho(\{\sigma_i\})=\frac{1}{Z} \exp\left(-\beta H(\{\sigma_i\})\right)$$
with the partition function $Z=Z(\beta, J, b)$.
a) Write down a formula for the partition function.
b) Write down a formula for $\rho(\sigma_1, \ldots, \sigma_i=+1, \ldots)/\rho(\sigma_1, \ldots, \sigma_i=-1, \ldots)$ in terms of $\beta$ and the effective field $h_i$ defined in c) below. Let $p_\pm$ be the probabilities that the $i$-th spin is $\pm 1$, respectively. Show that the mean magnetization, defined as $m=\langle\sigma_i\rangle$, is independent of $i$ and can be written as
$$m=\frac{p_+/p_- - 1}{p_+/p_- + 1}$$
(The mean value is taken with respect to the probability distribution $\rho$ given above.)
c) In the mean field approximation, each individual spin can be thought of as being subject to an "effective" magnetic field
$$h_i=b+J \sum_{\text{bonds containing } i} \sigma_j$$
for a given configuration $\{\sigma_j\}$ of the other spins $j \neq i$. One writes the energy in terms of these fields,
$$H=-\frac{1}{2} \sum_i \sigma_i\left(h_i+b\right)$$
and assumes that it is consistent to replace $h_i$ with its mean value $h:=\langle h_i\rangle$. Assuming the mean field approximation, derive, using b), the "self-consistency" relation
$$m=\tanh \beta(J v m+b)$$
where $v$ is the number of nearest neighbors in the lattice, i.e. $v=2$ in 1d, $v=4$ in 2d, $v=6$ in 3d, etc.
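The structure of this self-consistency relation is easy to explore numerically. The following sketch (plain Python, not part of the problem; it assumes units with $J=k_B=1$, the 3d value $v=6$, and $b=0$, all illustrative choices) iterates the map $m \mapsto \tanh(\beta v J m)$ to a fixed point:

```python
import math

def mean_field_m(T, v=6, J=1.0, b=0.0, m0=0.9, tol=1e-12):
    """Solve m = tanh(beta*(J*v*m + b)) by fixed-point iteration
    (units with k_B = 1, so beta = 1/T)."""
    beta = 1.0 / T
    m = m0
    for _ in range(10000):
        m_new = math.tanh(beta * (J * v * m + b))
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

# Below T_c = v*J = 6 the iteration settles on a nonzero magnetization;
# above T_c it collapses to m = 0, in line with part e).
print(mean_field_m(3.0))
print(mean_field_m(9.0))
```

Starting from $m_0=0.9$, the iteration finds the nonzero branch for $T<T_c=vJ/k_B$; for $T>T_c$ the only fixed point is $m=0$.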
d) The free energy is given as usual by $F=-k_B T \log Z$, where $\beta^{-1}=k_B T$. Verify that
$$N m=-\frac{\partial F}{\partial b} \tag{B.2}$$
Here $N$ is the total number of lattice sites, which we assume to be finite in this part (box). To calculate $Z$, write $\sigma_i=\langle\sigma_i\rangle+\delta \sigma_i$, with $\delta \sigma_i=\sigma_i-\langle\sigma_i\rangle$. Substitute this into the formula for $H$, and neglect terms that are quadratic in $\delta \sigma_i$. Calculate $Z$ and $F$ in this approximation, and verify that
$$F=-\beta^{-1} N \log \left[2 \cosh \beta(v J m+b)\right]+\frac{1}{2} N v J m^2$$
Verify that the self-consistency relation of c) is consistent with eq. (B.2) in this approximation.
e) Now let $b=0$ (no external field). Show that the self-consistency equation for $m$ in c) has $m=0$ as its only solution if $T>T_c$, where $T_c:=v J/k_B$. Hence, in this case there is no spontaneous magnetization. Show that, for $T<T_c$, the self-consistency equation has two nonzero solutions. Thus, below the critical temperature, there is spontaneous magnetization. For $T_c-T>0$ and small, solve the self-consistency equation by expanding the $\tanh$ around $m=0$. Show that the solution $m(T)$ behaves as
$$m(T) \sim\left(T_c-T\right)^{1/2}$$
as $T \rightarrow T_c$. This behavior is characteristic of the theory of phase transitions. The exponent $1/2$ is called a critical exponent.
Problem B.17 (Directed polymer). A polymer consists of atoms $i=0,1,2, \ldots, N$ at positions $(x_i, y_i) \in \mathbb{Z}^2$ of a square lattice. The atom at the origin is fixed at the position $x_0=y_0=0$, and the other atoms are chained together such that $x_i-x_{i-1}=1$ and $|y_i-y_{i-1}|=1$. This polymer is hence oriented in the $x$-direction and does not self-intersect.
a) Determine the total number of micro-states of the polymer.
b) Determine the number of microstates $W(N, y)$ having the property that $y_N=y$.
c) Determine, more generally, the number of microstates $W(i, y)$ having the property that $y_i=y$. Also calculate the mean $\langle y\rangle$ with respect to $W(i, y)$.
d) Determine the number of micro-states $W(i, y, N, y')$ having the property that $y_i=y$, $y_N=y'$. Also calculate the mean $\langle y y'\rangle$ with respect to $W(i, y, N, y')$.
e) Finally, calculate the typical deflection of the chain end,
$$\left\langle y_N^2\right\rangle=\frac{\sum_y y^2 W(N, y)}{\sum_y W(N, y)}$$
Hint: Consider the partition function of the canonical ensemble
$$Z(N, \beta)=\sum_y e^{\beta y} W(N, y)\,.$$
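As a cross-check on b) and e): each link raises or lowers $y$ by one unit, so $W(N,y)$ is a binomial coefficient, and the weighted sums above can be evaluated directly. A small Python sketch (the closed form $\langle y_N^2\rangle=N$ is the familiar random-walk result):

```python
from math import comb

def W(N, y):
    """Number of configurations with y_N = y: of the N links,
    (N+y)/2 must step up and (N-y)/2 down, a binomial count."""
    if (N + y) % 2 != 0 or abs(y) > N:
        return 0
    return comb(N, (N + y) // 2)

def mean_square_deflection(N):
    ys = range(-N, N + 1)
    num = sum(y * y * W(N, y) for y in ys)
    den = sum(W(N, y) for y in ys)
    return num / den

print(mean_square_deflection(20))  # the random-walk result <y_N^2> = N
```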

B.4 Exercises for chapter 5

Problem B.18 (Entropy budget of the Earth). It is estimated that the mass of carbon bound in newly generated biomass on earth is about $10^{11}-10^{12}$ tons per year. Carbon is mostly taken out of the atmosphere by converting $\mathrm{CO}_2$ gas and water vapor into organic material via photosynthesis. Organic material consists of highly organized structures and consequently should have a much lower entropy than water vapor and $\mathrm{CO}_2$ gas. In order to reconcile this with the principle that the entropy of a system cannot decrease, one notes that the earth is not an isolated system, but receives high energy photons from the sun and emits heat in the form of low energy photons back into space. Through this process, the entropy of the photons is increased. The aim of this question is to estimate this gain and to show that it can account for the entropy decrease through newly generated biomass.
a) Most photons arriving from the sun have a wavelength of $\sim 520\,\mathrm{nm}$. Using the Einstein relation $E=h \nu$ for the energy of a single photon, and using the value $1400\,\mathrm{J/(s \cdot m^2)}$ for the energy of solar radiation per area per unit of time, estimate the number of photons arriving on earth from the sun per year. Of the photons arriving on earth, only about $50\,\%$ are absorbed at the surface, whereas the rest is reflected or absorbed by clouds etc. Of these, only about $0.1\,\%$ participate in the actual photosynthesis that results in a net gain of glucose (and then biomass). Hence, what is the number of photons per year participating in the creation of new biomass?
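A rough numerical version of this estimate, assuming an Earth radius $R \approx 6.4\times 10^6\,\mathrm{m}$ for the intercepting cross section and $3.15\times 10^7$ seconds per year (both values are assumptions not given in the text):

```python
import math

h = 6.626e-34      # Planck constant, J s
c = 2.998e8        # speed of light, m/s
R_earth = 6.4e6    # Earth radius in m (assumed value, not from the text)
year = 3.15e7      # seconds per year

E_photon = h * c / 520e-9            # Einstein relation: E = h*nu = h*c/lambda
power = 1400 * math.pi * R_earth**2  # solar flux times Earth's cross section
N_arriving = power * year / E_photon # photons per year arriving from the sun
N_biomass = N_arriving * 0.5 * 0.001 # absorbed (50%) and used for photosynthesis (0.1%)

print(f"{N_arriving:.1e} photons/year arrive, {N_biomass:.1e} go into biomass")
```

With these inputs one finds on the order of $10^{43}$ photons per year arriving and a few times $10^{39}$ participating in photosynthesis.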
b) The average temperature at the surface of the earth is about $T=280\,\mathrm{K}$. Assuming that the intensity of low energy photons emitted from the earth back into space follows a black body distribution,
$$I(\nu)=\frac{2 \pi h \nu^3}{c^2} \frac{1}{\exp\left(\frac{h \nu}{k_B T}\right)-1}$$
what is the most probable frequency $\nu$ of photons emitted from earth, i.e. the one maximizing $I$ (do this first for general $T$)? The total energy of photons absorbed by earth is approximately equal to that of the photons emitted back into space (this follows from energy conservation; we can ignore the chemical energy stored in new biomass). Hence, what is the ratio of photons received on earth to that emitted?
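Setting $x=h\nu/(k_BT)$, maximizing $I(\nu)\propto \nu^3/(e^x-1)$ leads to the condition $3(1-e^{-x})=x$, which has no closed-form solution but is easily solved by iteration. A sketch (the value $x^\ast\approx 2.82$ is the frequency form of the Wien displacement constant):

```python
import math

def wien_x(iterations=100):
    """Solve 3*(1 - exp(-x)) = x, the stationarity condition for the
    maximum of I(nu) with x = h*nu/(k_B*T), by fixed-point iteration."""
    x = 3.0
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x

x_star = wien_x()                # ~ 2.821
k_B, h = 1.381e-23, 6.626e-34
nu_max = x_star * k_B * 280 / h  # most probable emitted frequency at T = 280 K
print(x_star, nu_max)
```

At $T=280\,\mathrm{K}$ the peak sits in the far infrared, around $1.6\times 10^{13}\,\mathrm{Hz}$.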
c) The entropy of a gas of $N_\gamma$ photons in thermal equilibrium is $S_\gamma \sim 0.9\, k_B N_\gamma$. Hence, what is the gain in entropy coming from those photons participating in the creation of new biomass (you can leave $k_B$ in the formula)?
d) Now estimate the entropy decrease through the creation of new biomass. In an extremely simplified description of this process, we can say that $\mathrm{CO}_2$ gets converted to C, which is bound in organic material, and $\mathrm{O}_2$, which is released back into the atmosphere. The atmosphere is treated as an ideal gas. According to the formula for the entropy of an ideal gas in the lecture, the entropy contributions are, respectively,
$$S_\alpha=k_B N_\alpha \log \left[V\left(4 \pi m_\alpha k_B T\right)^{3/2}\right], \quad \alpha=\mathrm{O}_2,\ \mathrm{CO}_2$$
The entropy of bound carbon in organic material is neglected. Using the atomic masses $12\,u$ for C and $16\,u$ for O, what is the decrease in entropy due to the conversion of $\mathrm{CO}_2$ into $\mathrm{O}_2$ via the creation of new biomass per year (you can leave $k_B$ in the formula)? Compare your answer to c). Comment?
Problem B.19 (Atmospheric pressure). Consider a gas of $N$ classical, non-interacting particles enclosed in an infinitely high cylinder with base occupying an area $S$. The cylinder is placed upright in a gravitational field, i.e. $\vec{F}=m \vec{g}$, where the axis of the cylinder is parallel to $\vec{g}$.
a) What is the Hamiltonian of the system? What is the density function $\rho$ for the canonical ensemble?
b) What is the average number of particles above height $h$?
c) Derive from b) a formula for the pressure as a function of $h$.
Problem B.20 (Relativistic classical ideal gas). For a relativistic particle, the energy-momentum relation is $\epsilon(\underline{p})=\sqrt{m^2 c^4+c^2 p^2}$, where $p=|\underline{p}|$. We first consider a classical gas of $N$ indistinguishable massless particles enclosed in a box with volume $V$.
a) Show that the partition function (canonical ensemble) is given by
$$Z=\frac{1}{N!}\left(8 \pi V\left(\frac{k_B T}{h c}\right)^3\right)^N$$
b) By analogy with the non-relativistic case, where $\lambda_{\mathrm{cl}}=h/\sqrt{m k_B T}$, we see that the thermal de Broglie wave length is now $\lambda_{\mathrm{rel}}=h c/(k_B T)$. Using Stirling's formula, find the expression
$$F \approx -N k_B T\left[1+\log \left(\frac{8 \pi V}{N \lambda_{\mathrm{rel}}^3}\right)\right]$$
for the free energy. Use the standard relation $P=-\partial F/\partial V|_T$ to derive the equation of state for the relativistic gas. Compare your result to the non-relativistic case. Do the same for the internal energy $E=F-T\, \partial F/\partial T|_V$.
c) Show that the relativistic and non-relativistic de Broglie wave lengths are related by
$$\frac{\lambda_{\mathrm{rel}}}{\lambda_{\mathrm{cl}}}=\sqrt{\frac{m c^2}{k_B T}}$$
so that, for non-relativistic particles with $k_B T \ll m c^2$, we have $\lambda_{\mathrm{rel}} \gg \lambda_{\mathrm{cl}}$. We can consider $d=(V/N)^{1/3}$ as the mean distance between the particles. Quantum effects should become important when $d$ becomes less than the de Broglie wave length. Increasing $N$, where should quantum effects show up first, in the non-relativistic or the relativistic system?
d) At what temperature is the de Broglie wavelength comparable to the wave length of photons in the visible part of the spectrum, say $500\,\mathrm{nm}$? What wave length do photons have if their wave length is equal to the de Broglie wave length at room temperature?
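A quick numerical estimate for part d), using $\lambda_{\mathrm{rel}}=hc/(k_BT)$ and assuming room temperature $T\approx 293\,\mathrm{K}$ (a choice not fixed by the text):

```python
h, c, k_B = 6.626e-34, 2.998e8, 1.381e-23  # SI values

# Temperature at which lambda_rel = h*c/(k_B*T) equals 500 nm:
T_500nm = h * c / (k_B * 500e-9)

# Wavelength matching lambda_rel at room temperature (assumed T = 293 K):
lam_room = h * c / (k_B * 293.0)

print(T_500nm, lam_room)
```

The first number comes out near $3\times 10^4\,\mathrm{K}$; the second near $50\,\mu\mathrm{m}$, i.e. in the far infrared.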
e) Repeat the derivation in a) and b) for a massive relativistic particle, working to first non-trivial order in $m c^2/k_B T$.

B.5 Exercises for chapter 6

Problems

Problem B.21 (First law of thermodynamics). Consider the first law of thermodynamics:
$$T\, d S=d E+P\, d V-\mu\, d N$$
a) Derive the relations
$$\left.\frac{\partial N}{\partial E}\right|_{V, S}=\frac{1}{\mu}, \quad \left.\frac{\partial N}{\partial S}\right|_{V, E}=-\frac{T}{\mu}, \quad \left.\frac{\partial N}{\partial V}\right|_{E, S}=\frac{P}{\mu}\,.$$
Hint: rewrite the first law in terms of the differentials $d E, d V, d S$.
b) Introduce the free energy by $F=E-T S$, viewed as a function of $T, N, V$. Write the first law in terms of $F$ instead of $E$.
c) Write the first law as $d S=\ldots$. Applying the exterior differential $d$ to the resulting equation and using $d(d S)=0$, derive the relation
$$\left.\frac{\partial T}{\partial V}\right|_{E, N}-P\left.\frac{\partial T}{\partial E}\right|_{V, N}+T\left.\frac{\partial P}{\partial E}\right|_{V, N}=0\,.$$
Hint: Keep in mind that $d E\, d V=-d V\, d E$.
Problem B.22 (Idealized Otto engine). An idealized Otto engine is described by the following cycle:
  • I $\rightarrow$ II: Adiabatic compression of air: piston moves up.
  • II $\rightarrow$ III: Constant volume heat transfer: ignition and burning of fuel.
  • III $\rightarrow$ IV: Adiabatic expansion: power stroke, piston moves down.
  • IV $\rightarrow$ I: Constant volume cooling.
a) Draw the cycle in a $(P, V)$-diagram. Identify those processes in the diagram where work is performed by/on the system, and where heat is injected/given off by the system.
b) Treating the fluid as an ideal gas, compute the net work $\Delta W$ performed by the system in one cycle, and the heat $\Delta Q_{\mathrm{in}}$ injected into the system. (Give each of these quantities in terms of the temperatures $T_I, \ldots, T_{IV}$.) Compute the efficiency $\eta$ of the idealized Otto cycle in terms of the temperatures.
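As a check on b), the following sketch runs the cycle numerically for a monatomic ideal gas ($\gamma=5/3$) with arbitrarily chosen state values (the temperatures and volumes below are illustrative assumptions, not data from the problem), and compares the resulting efficiency against the standard compression-ratio formula $\eta=1-(V_{II}/V_{I})^{\gamma-1}$:

```python
# Idealized Otto cycle for a monatomic ideal gas, in units with N*k_B = 1.
# State labels follow the problem: I -> II adiabatic compression,
# II -> III heat in at constant V, III -> IV adiabatic expansion,
# IV -> I cooling at constant V.
gamma = 5.0 / 3.0
V1, V2 = 10.0, 2.0      # compression ratio V1/V2 = 5 (assumed numbers)
T1, T3 = 300.0, 1800.0  # intake and post-ignition temperatures (assumed)

T2 = T1 * (V1 / V2) ** (gamma - 1)  # adiabatic: T * V^(gamma-1) = const
T4 = T3 * (V2 / V1) ** (gamma - 1)

C_V = 1.5                            # (3/2) N k_B in our units
Q_in = C_V * (T3 - T2)               # heat injected at constant volume
W_net = C_V * ((T3 - T2) - (T4 - T1))  # net work = Q_in - Q_out
eta = W_net / Q_in
print(eta, 1 - (V2 / V1) ** (gamma - 1))  # the two expressions agree
```

The agreement of the two printed numbers reflects the identity $\eta = 1 - (T_{IV}-T_I)/(T_{III}-T_{II}) = 1-(V_{II}/V_I)^{\gamma-1}$, valid for any choice of the assumed state values.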
Problem B.23 (Cyclic process). Consider the following cyclic process:
  1. I $\rightarrow$ II: Adiabatic (constant $S$) expansion
  2. II $\rightarrow$ III: Isochoric (constant $V$) cooling
  3. III $\rightarrow$ IV: Adiabatic (constant $S$) compression
  4. IV $\rightarrow$ I: Isothermal (constant $T$) expansion
Throughout it is assumed that the particle number $N$ remains constant (so that $d N=0$ in the entire process), and we assume that the equations of state of an ideal gas hold:
$$P V=N k_B T, \quad E=\frac{3}{2} P V\,.$$
a) Show that $P V=\mathrm{const.}$ on isotherms and $P V^{5/3}=\mathrm{const.}$ on adiabatics, using the equation(s) of state and the first law $T d S=d E+P d V$. (If you cannot do this, carry on with b)-e) assuming these results.)
b) Sketch the process in a $(P, V)$-diagram, and identify where heat is injected/given off by the system.
c) What is the work $\Delta W$ performed by the system in one cycle?
d) What is the heat $\Delta Q_{\mathrm{in}}$ injected into the system in one cycle?
e) What is the efficiency $\eta=\Delta W/\Delta Q_{\mathrm{in}}$?
State your answers in c)-e) in terms of $N, T_I, V_I, V_{II}\,(=V_{III}), V_{IV}$.
Problem B.24 (Gibbs-Duhem relation). Consider a system in equilibrium characterized by a fixed energy $E$, volume $V$, and particle number $N_i$ for the $i$-th species of particle. We argued in chapter 4 that the entropy $S(E, V, \{N_i\})$ of such an equilibrium state is extensive in the sense that, for each $\nu$, we have
$$S(E, V, \{N_i\})=\nu\, S(E/\nu, V/\nu, \{N_i/\nu\})\,.$$
a) Differentiate this relation to obtain the Gibbs-Duhem relation
$$E+P V-\sum_i \mu_i N_i-T S=0$$
(Use the definitions of $T, P, \mu_i$ in terms of $S$ given in the lectures.)
b) Write this relation as $H=\sum_i \mu_i N_i$ in terms of the free enthalpy $H$, and derive the relationship
$$d P=s\, d T+\sum_i n_i\, d \mu_i$$
for the pressure, where $s=S/V$ and $n_i=N_i/V$ are the entropy and number densities. Derive the identities
$$\left.\frac{\partial s}{\partial \mu_i}\right|_{P, T}=\left.\frac{\partial n_i}{\partial T}\right|_{\{\mu_j\}, P}, \quad \left.\frac{\partial n_i}{\partial \mu_j}\right|_{P, T}=\left.\frac{\partial n_j}{\partial \mu_i}\right|_{P, T}$$
c) Consider two copies of a system characterized by the variables $z^{(1)}=(E^{(1)}, V^{(1)}, N^{(1)})$ and $z^{(2)}=(E^{(2)}, V^{(2)}, N^{(2)})$, which are separately in equilibrium but not necessarily with each other. Since the entropy of the composite system is maximal in equilibrium, we should normally have
$$S(z^{(1)})+S(z^{(2)}) \leqslant S(z^{(1)}+z^{(2)})\,.$$
Argue that the entropy must be a concave function. [Recall that a function $f(x)$ of $n$ variables $x=(x_1, \ldots, x_n)$ is called concave iff $f(\lambda x+(1-\lambda) y) \geqslant \lambda f(x)+(1-\lambda) f(y)$ for all $0 \leqslant \lambda \leqslant 1$.] In particular, show that
$$\left.\frac{\partial T}{\partial E}\right|_{V, N} \geqslant 0$$
Problem B.25 (Charged gas). We consider a gas of particles of unit charge $\pm q$. The eigenstates of the charge operator $\hat{Q}$ and the Hamiltonian $\hat{H}$ are $|n_+, n_-\rangle$ with
$$\hat{H}|n_+, n_-\rangle=\epsilon_{n_+, n_-}|n_+, n_-\rangle, \quad \hat{Q}|n_+, n_-\rangle=q(n_+-n_-)|n_+, n_-\rangle$$
where $n_+, n_- \geqslant 0$ are integers that have the interpretation of the number of positively resp. negatively charged particles in the state. We consider a density matrix of the form
$$\rho=\sum_{n_+, n_- \geqslant 0} p_{n_+, n_-}|n_+, n_-\rangle\langle n_+, n_-|\,.$$
The information entropy is defined by $S(\rho)=-k_B \operatorname{tr} \rho \log \rho$, the mean energy by $E=\langle\hat{H}\rangle$, and the mean charge by $Q=\langle\hat{Q}\rangle$.
a) Using the method of Lagrange multipliers, show that the density matrix which maximizes $S(\rho)$ for fixed $E, Q$ is of the form
$$p_{n_+, n_-}=\frac{1}{Y} \exp\left(-\beta\left[\epsilon_{n_+, n_-}+\Phi q(n_+-n_-)\right]\right)$$
or equivalently
$$\rho=\frac{1}{Y} \exp[-\beta(\hat{H}+\Phi \hat{Q})]$$
where
$$Y=\operatorname{tr} \exp[-\beta(\hat{H}+\Phi \hat{Q})]$$
(Here $\beta$ and $\Phi$ are constants.)
b) Define $G=-k_B T \log Y(T, \Phi)$, where $\beta^{-1}=k_B T$. Show that
$$S=-\left.\frac{\partial G}{\partial T}\right|_{\Phi}, \quad Q=-\left.\frac{\partial G}{\partial \Phi}\right|_T \tag{B.3}$$
where $S, Q$ are defined as above.
c) For a charged gas at fixed volume, the first law of thermodynamics is $T d S=d E-\Phi d Q$. What is the physical meaning of $\Phi$? Show that if we define $G=E-T S-\Phi Q$, then $G=G(T, \Phi)$ satisfies $d G=-S d T-Q d \Phi$.
d) Verify the relations (B.3) using $d G=-S d T-Q d \Phi$.
Problem B.26 (Virial expansion and van der Waals equation of state). The aim of this exercise is to use the linked cluster expansion in order to derive an equation of state for a realistic monoatomic gas. Recall that the cluster expansion for a classical monoatomic non-relativistic gas is
$$\frac{1}{V} \log Y(\mu, V, \beta)=\frac{1}{\lambda^3} \sum_{l=1}^{\infty} b_l(V, \beta)\, z^l$$
where $Y$ is the grand canonical partition function, $z=e^{\beta \mu}$ is the fugacity, and $\lambda$ is the thermal de Broglie wavelength.
a) Using the Gibbs-Duhem relation and expressing the grand potential $G$ in terms of $Y$ according to (6.70), show that
$$\frac{P}{k_B T}=\frac{1}{\lambda^3} \sum_{l=1}^{\infty} b_l(V, \beta)\, z^l$$
b) Using $N=-\left.\frac{\partial G}{\partial \mu}\right|_{T, V}$, demonstrate the relation
$$\lambda^3 n=\sum_{l=1}^{\infty} l\, b_l(V, \beta)\, z^l=z+2 b_2 z^2+3 b_3 z^3+\ldots$$
where $n=N/V$ is the particle density.
c) We next want to eliminate $z$ in favor of $n$ in a). For this we write $z=\lambda^3 n+a_2(\lambda^3 n)^2+a_3(\lambda^3 n)^3+\ldots$ and substitute this into b) in order to determine $a_2, a_3$. Show that $a_2=-2 b_2$ and $a_3=8 b_2^2-3 b_3$.
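The coefficients $a_2, a_3$ can be verified mechanically by composing truncated power series. A small Python sketch (plain numerical series arithmetic; the particular values of $b_2, b_3$ are arbitrary test inputs):

```python
def poly_mul(p, q, order):
    """Multiply truncated power series given as coefficient lists (p[k] ~ x^k)."""
    r = [0.0] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                r[i + j] += a * b
    return r

def check_inversion(b2, b3, order=3):
    # z = x + a2 x^2 + a3 x^3 with the claimed coefficients, where x = lambda^3 n
    a2, a3 = -2 * b2, 8 * b2**2 - 3 * b3
    z = [0.0, 1.0, a2, a3]
    z2 = poly_mul(z, z, order)
    z3 = poly_mul(z2, z, order)
    # lambda^3 n = z + 2 b2 z^2 + 3 b3 z^3 should reduce to the series x
    return [z[k] + 2 * b2 * z2[k] + 3 * b3 * z3[k] for k in range(order + 1)]

print(check_inversion(b2=0.7, b3=-1.3))  # expect [0, 1, 0, 0] up to rounding
```

Substituting $z(x)$ with the claimed coefficients back into $z+2b_2z^2+3b_3z^3$ returns the series $x+0\cdot x^2+0\cdot x^3$, confirming the inversion.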
d) Using the result obtained in c) in a), derive the virial expansion
$$\frac{P}{k_B T}=n\left(1+B_2(T)\, n+B_3(T)\, n^2+\ldots\right)$$
where $B_2=-b_2 \lambda^3$ and $B_3=(4 b_2^2-2 b_3) \lambda^6$.
e) Let us now study the virial coefficient $B_2$ for a typical gas. We use the following approximation:
$$v(r)= \begin{cases}+\infty & \text{for } r<r_0 \\ -u_0\left(r_0/r\right)^6 & \text{for } r>r_0\end{cases}$$
Sketch this potential. Show that
$$2 \lambda^3 b_2=-\frac{4 \pi r_0^3}{3}+4 \pi \int_{r_0}^{\infty}\left[e^{u_0(r_0/r)^6/(k_B T)}-1\right] r^2\, d r$$
Approximating the integrand by $\approx u_0(r_0/r)^6/(k_B T)$ in the high temperature limit $u_0/k_B T \ll 1$, show that
$$B_2 \approx \frac{V_a}{2}\left[1-u_0/(k_B T)\right]$$
where $V_{a}=\frac{4 \pi r_{0}^{3}}{3}$ is the effective volume of one atom.
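The quality of this high-temperature approximation can be tested numerically. The following sketch (not part of the original problem; plain Python, units with $r_{0}=1$) evaluates the full integral of e) via the substitution $t=r_{0}/r$, which maps $\left(r_{0}, \infty\right)$ to $(0,1)$, and compares with the leading result:

```python
# Numerical check of the high-temperature approximation for B_2 in e).
# Units with r_0 = 1; beta_u0 = u_0/(k_B T) is the small parameter.
import math

def B2_exact(beta_u0, n_steps=100000):
    """B_2 (in units of r_0^3) from the full integral in e).
    With t = r_0/r the integral becomes
        I = int_0^1 (exp(beta_u0 * t^6) - 1) / t^4 dt
    and B_2 = -b_2 lambda^3 = 2*pi/3 - 2*pi*I."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h        # midpoint rule; integrand -> 0 as t -> 0
        total += (math.exp(beta_u0 * t**6) - 1.0) / t**4
    return 2.0 * math.pi / 3.0 - 2.0 * math.pi * total * h

def B2_highT(beta_u0):
    """Leading high-temperature result  B_2 = (V_a/2)(1 - u_0/(k_B T))."""
    return (2.0 * math.pi / 3.0) * (1.0 - beta_u0)

print(B2_exact(0.05), B2_highT(0.05))   # nearly equal for u_0/(k_B T) << 1
```

For $u_{0} /\left(k_{B} T\right)=0.05$ the two values agree to better than a tenth of a percent, while the deviation grows as the temperature is lowered.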
f) Using the results of e), we can write the virial expansion as
$$\frac{P}{k_{B} T}=n+\frac{V_{a}}{2}\left[1-u_{0} /\left(k_{B} T\right)\right] n^{2}+\ldots$$
Show that this can be rearranged into
$$\frac{1}{k_{B} T}\left(P+\frac{u_{0} V_{a}}{2} n^{2}\right) \approx \frac{n}{1-n V_{a} / 2}$$
neglecting orders higher than $n^{2}$ (i.e., assuming low density). Show that this can be written as
$$\left(P+a(N / V)^{2}\right)(V-b N) \approx N k_{B} T$$
which is known as the van der Waals equation. Identify the van der Waals parameters $a, b$ with the microscopic parameters of the system.
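As a consistency check, one can verify symbolically that the van der Waals form reproduces the virial expansion of f) through order $n^{2}$ with the identification $a=u_{0} V_{a} / 2$, $b=V_{a} / 2$. A sketch (assuming Python with sympy; not part of the original problem), using the per-particle form $\left(P+a n^{2}\right)(1 / n-b)=k_{B} T$ with $n=N / V$:

```python
# Check that (P + a*n^2)(1/n - b) = kB*T agrees with the expansion of f)
# through order n^2, with a = u0*Va/2 and b = Va/2.
import sympy as sp

n, kT, u0, Va = sp.symbols('n kT u0 Va', positive=True)
a, b = u0*Va/2, Va/2

# van der Waals form solved for P:  P = kB*T*n/(1 - b*n) - a*n^2
P_vdw = kT*n/(1 - b*n) - a*n**2

# virial form from f):  P = kB*T*(n + (Va/2)*(1 - u0/(kB*T))*n^2)
P_virial = kT*(n + (Va/2)*(1 - u0/kT)*n**2)

diff = sp.series(P_vdw - P_virial, n, 0, 3).removeO()
print(sp.simplify(diff))   # 0: the two forms agree up to O(n^3)
```

The difference vanishes through order $n^{2}$, confirming the identification of $a$ and $b$.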
g) Plot the isotherms of the van der Waals equation using a computer programme such as Mathematica. It is sensible to plot $p=P / P_{c}$, where $P_{c}=a /\left(27 b^{2}\right)$, against $v=V /(3 b N)$ for several isotherms around $T_{c}=8 a /\left(27 b k_{B}\right)$ in the range $0<p<2$ and $0<v<10$. You should see a distinctive qualitative change of the isotherms above and below $T_{c}$. Compare this to the isotherms of the ideal gas.
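A Python/matplotlib version of the requested plot can be sketched as follows (numpy and matplotlib are assumed to be available). Inserting $P=p P_{c}$, $V=3 b N v$, $T=t T_{c}$ into the van der Waals equation gives the universal reduced form $p=8 t /(3 v-1)-3 / v^{2}$:

```python
# Reduced van der Waals isotherms: p = P/P_c against v = V/(3bN) for
# several temperatures t = T/T_c around the critical point.
def p_reduced(v, t):
    """Reduced van der Waals pressure; valid for v > 1/3.
    At the critical point, p_reduced(1, 1) = 1."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

if __name__ == "__main__":
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")              # headless backend; writes a PNG file
    import matplotlib.pyplot as plt

    v = np.linspace(0.5, 10.0, 1000)
    for t in (0.85, 0.95, 1.0, 1.05, 1.15):
        plt.plot(v, p_reduced(v, t), label=f"$T/T_c = {t}$")
    plt.xlabel(r"$v = V/(3bN)$")
    plt.ylabel(r"$p = P/P_c$")
    plt.ylim(0, 2)
    plt.legend()
    plt.savefig("vdw_isotherms.png")
```

Below $T_{c}$ the isotherms are non-monotonic in $v$ (the unstable branch signalling the liquid-gas transition), while above $T_{c}$ they decrease monotonically, much like ideal-gas isotherms.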

Acknowledgements

These lecture notes are based on lectures given by Prof. Dr. Stefan Hollands at the University of Leipzig.

${ }^{1}$ Of course this theory turned out to be incorrect. Nevertheless, we nowadays know that heat can be radiated away by particles which we call "photons". This shows that, in science, even a wrong idea can contain a germ of truth.
${ }^{2}$ It seems that Lavoisier's foresight in political matters did not match his superb scientific insight. He became very wealthy owing to his position as a tax collector during the "Ancien Régime" but got in trouble for this lucrative but highly unpopular job during the French Revolution and was eventually sentenced to death by a revolutionary tribunal. After his execution, one onlooker famously remarked: "It takes one second to chop off a head like this, but centuries to grow a similar one."
${ }^{1}$ This description is not always appropriate, as the example of a rigid body shows. Here the phase space coordinates take values in the co-tangent space of the space of all orthogonal frames describing the configuration of the body, i.e. $\Omega \cong T^{*} SO(3)$, with $SO(3)$ the group of orientation preserving rotations.
${ }^{2}$ A general self-adjoint operator on a Hilbert space will have a spectral decomposition $A=\int_{-\infty}^{\infty} a \, d E_{A}(a)$. The spectral measure does not have to be atomic, as suggested by the formula (2.58). The corresponding probability measure is in general $d \mu(a)=\left\langle\Psi \mid d E_{A}(a) \Psi\right\rangle$.
${ }^{1}$ This equation can be viewed as a discretized analog of the Boltzmann equation in the present context. See the Appendix for further discussion of this equation.
${ }^{1}$ The quantity $W^{\mathrm{cl}}$ is for this reason often defined by
\begin{equation*}
W^{\mathrm{cl}}(E, N):=h^{-3 N}\left|\Omega_{E, N}\right| \tag{4.43}
\end{equation*}
Also, one often includes further combinatorial factors to account for the distinction between distinguishable and indistinguishable particles, cf. (4.49).
${ }^{2}$ For distinguishable particles, this would be $\mathcal{H}_{N}=L^{2}\left(\mathbb{R}^{N}\right)$. However, in real life, quantum mechanical particles are either bosons or fermions, and the corresponding definition of the $N$-particle Hilbert space has to take this into account, see Ch. 5.
${ }^{3}$ The proof of the linked cluster theorem is very similar to that of the formula (2.10) for the cumulants $\left\langle x^{n}\right\rangle_{c}$, see section 2.1.
${ }^{1}$ Here, we make use of the Riemann zeta function, which is defined by $\zeta(s)=\sum_{n=1}^{\infty} n^{-s}$.
${ }^{1}$ This is how one could actually mathematically implement the idea of "thermal contact".
${ }^{2}$ Mathematically, the differentials $\mathrm{d} X_{i}$ are the generators of a Grassmann algebra of dimension $N$.
${ }^{3}$ Here we quote the formula for indistinguishable particles, which means that we should include the $\frac{1}{N!}$ into the definition of the microcanonical partition function $W(E, N, V)$ for indistinguishable particles, cf. section 4.2.3.
${ }^{4}$ One also uses the enthalpy, defined as $E+P V$. Its natural variables are $S, P, N$, which is more useful for processes leaving $N$ unchanged.
${ }^{5}$ Here we assume implicitly that $\left[H, \hat{N}_{1}\right]=0$ so that $H$ maps subspaces of $N_{1}$-particles to themselves.
${ }^{1}$ More precisely, $\chi_{i j}=\lim \frac{M_{i}}{V \cdot B_{j}}$ is a tensor. Here we only look at the $z z$-component of this tensor, which is relevant in our situation.