Advanced Statistical Mechanics - uni-leipzig.dekroy/materials/sm2_16.pdf · Advanced Statistical...

Advanced Statistical Mechanics

K. Kroy

Leipzig, 2016∗

Contents

I Interacting Many-Body Systems 1

1 Packing structure and material behavior 11.1 Pair interactions and pair correlations . . . . . . . . . . . . . . . . . . . 31.2 Correlation functions and constitutive relations . . . . . . . . . . . . . . 81.3 Dilute gases and the virial expansion . . . . . . . . . . . . . . . . . . . . 9

2 Attacking the many-body problem 142.1 Ornstein–Zernike integral equation . . . . . . . . . . . . . . . . . . . . . 142.2 Thermodynamic perturbation theory & FDT . . . . . . . . . . . . . . . 172.3 Density functional theories . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3 Critical phenomena and renormalization 293.1 Scaling, exponent relations, Landau theory . . . . . . . . . . . . . . . . 293.2 Learning from toy models: the Ising chain . . . . . . . . . . . . . . . . . 343.3 The idea of renormalization . . . . . . . . . . . . . . . . . . . . . . . . . 38

4 Stochastic Thermodynamics 454.1 Einstein 1905: Brownian motion in a nutshell . . . . . . . . . . . . . . . 454.2 Langevin, Ornstein, Fokker–Planck, and all that . . . . . . . . . . . . . 51

∗The script is not meant to be a substitute for reading proper textbooks nor for dissemina-tion. (See the notes for the introductory course for background information.) Comments andsuggestions are highly welcome.

i

Part I

Interacting Many-Body Systems

Die Begrundung der statistischenMechanik bereitet begrifflicheSchwierigkeiten. Die dabeidurchzufuhrenden Rechnungensind jedoch meist einfach. DieserSachverhalt kehrt sich bei denAnwendungen der statistischenMechanik gerade um. Begrifflichgibt es dabei nahezu keine, dafurum so mehr mathematischeSchwierigkeiten.

W. Brenig

The aim of this part is to introduce general theoretical tools that have beendeveloped and used in many areas of physics to address interacting many-bodysystems on a microscopic basis.

1 Packing structure and material behavior

The empty canvas

Statistical Mechanics provides a microscopic foundation of the Thermostaticsof many-body systems. The central paradigmatic many-body system discussedin introductory courses is an isothermal classical ideal gas. This trivial casewith no interactions and no packing structure is briefly recapitulated to set thestage. Thermodynamically, the ideal gas is defined in an operational way by twoequations of state: (i) the caloric equation or energy equation

Uig =3

2NkBT , (1)

which can be measured in a calorimeter, and which, in the language of StatisticalMechanics, amounts to equipartition of energy among the degrees of freedom;(ii) an equation linking thermal and mechanical properties,

pV = NkBT . (2)

Using Eqs. (1) and (2) to express the intensive variables in terms of the exten-sive ones, and invoking homogeneity, the differential fundamental relation (per

1

particle)

ds(u, v) =1

Tdu+

p

Tdv (3)

in a three-dimensional thermodynamic state space spanned by U , V , and N canbe integrated to yield the fundamental relation S(U, V,N) = Ns(U/N, V/N) andthus all thermostatic properties of the gas from these two equations of state.

In Statistical Mechanics, as opposed to Thermodynamics, one is used to ad-dressing the fundamental relation directly—at least on a conceptual level—bycomputing the (e.g. canonical) partition sum

Z ≡∫

dΓ e−βH = e−βF (T,V,N) dΓ ≡ ΠidqidpiN !hdN

. (4)

This is how it works in principle but rarely in practice. In fact, most of thefollowing serves to introduce more practical approaches to squeeze out someinformation without doing this monstrous integral. The brute-force approachworks for toy-models, such as the isothermal mono-atomic classical ideal gas (ofindistinguishable particles in d = 3 dimensions). In fact, with the HamiltonianHig =

∑i p

2i /2m, one easily finds the ideal-gas fundamental relation

Zig = (V/λ3T )N/N ! ⇒ Fig = −kBT lnZig = NkBT [ln(λ3

Tn)− 1] (5)

The thermal wavelength

λT ≡h√

2πmkBT(6)

appears as a natural length scale in a quantum statistical description. It is thede Broglie wavelength of a typical gas particle of mass m with an energy of aboutkBT , as dictated by equipartition. It suggests itself as an elementary unit cellsize in the classical configuration space. The naive interpretation of V/λ3

T asthe number of “available classical states” for each of N non-interacting particlesturns out to be inappropriate if these are indistinguishable, though. Since particlepermutations produce no new states in this case, each particle can only explorethe volume v = V/N = 1/n instead of V , and there are only W ≈ v/λ3

T classicalstates per particle available, which is also apparent from the entropy,

Sig = NkB[5/2− ln(λ3Tn)] . (7)

Not too surprisingly, the model therefore breaks down if one attempts toraise the density or to lower the temperature beyond the limit nλ3

T ' 1, whichcauses the condensation of an increasing fraction of particles into their quantumground states. These particles do not scatter any more between different states.Therefore they carry no heat or entropy and behave very differently from whatis expected for a gas. The number of particles available to contribute to theclassical gas behavior, or to any sort of thermodynamics, is thus reduced and the

2

thermodynamic properties are modified with respect to the classical expectation;e.g., it is found that the pressure is bound from below for fermions and from abovefor bosons. With a grain of salt, one can say that the thermodynamics beyondthis observation is essentially all due the normal fraction of the gas, which has areduced density nc = n(T/Tchar)

ζ (with system-specific characteristic temperatureTchar and exponent ζ) and otherwise behaves more or less like a classical idealgas.

1.1 Pair interactions and pair correlations

Pair interactions

For atoms and molecules, potential interactions can usually not be neglected. Itis usually via these “direct interactions” (as opposed to the “statistical” exchangeinteractions) that we notice the existence of these particles, in the first place. Andthe existence of various phases of the same substance and its physical propertiesare primarily the result of strong mutual particle interactions. The reason thatthe model of an ideal (interaction-free) gas is useful at all, and in fact highlysuccessful, is due to the fact that many elementary excitations in condensedmatter may (after quantization) to a good approximation be described as dilutegases of quasi-particles (so-called “phonons”, “magnons”, “electrons”, “holes”,etc.), at least at low temperatures. Interactions govern the phase transitionsinto these condensed states, their packing structure and energetics, and, last butnot least, the often complex and composite (hybrid) nature of the elementaryexcitations that may then, eventually, be treated as if they were a dilute Bose orFermi gas.

Before entering the discussion of direct interactions, it is useful to estimatetheir importance relative to the effective “statistical interactions” arising fromthe exchange statistics of indistinguishable particles. Roughly, the latter shouldstart to matter once the effective statistical pair potential becomes of similarrange and strength as the direct interactions. A common feature of all directinteractions between atoms and molecules is their strong hard-core repulsion atcenter-of-mass distances on the order of a few Bohr radii aB = ~2/mee

2. For theweak statistical interactions to become relevant, they need to be of considerablylonger range,

λT/aB � 1 ⇒ kBT � Ryme/ma = O(m eV) , (8)

where me and ma denote the masses of the electron and the atom, respectively.Also note that at the characteristic distance λT the statistical interaction strengthis only about ±2 · 10−3kBT . In contrast, the strength of the repulsion is propor-tional to the electron density, which decays exponentially and is down to about1 Ry/e at the nominal atom radius. Altogether, this argument thus suggests

3

that atoms and molecules have to be cooled down substantially below room tem-perature, if not the lowest temperatures in the universe, before the exchangeinteractions become relevant. A related estimate via the comparison of the directand exchange contributions to the second virial coefficient is discussed in the ex-ercises. Both estimates show that the energetics and packing structure of atomsand molecules is usually completely dominated by “direct” potential interac-tions1. Of course, these direct interactions themselves arise as an effective picturefrom a complex combination of the wave-like nature of matter (uncertainty rela-tion), exchange interactions (Pauli principle), and Coulomb interactions withinthe electronic degrees of freedom of the atoms, as discussed in atomic physicsclasses.

The most convenient starting point for studying the effects due to direct in-teractions on the many-body physics is the pair approximation. It assumes thatthe molecular interactions are reasonably well represented by pair-wise additiveinteractions between the individual particles. In the following, only such pairinteractions V({rj}) =

∑i<j ν(ri − rj) corresponding to the Hamiltonian

H =∑i

p2i

2m+∑i<j

ν(ri − rj) (9)

are considered. This excludes the effective interactions due to the exchange statis-tics and other possible M -particle interactions with M > 2 from the very begin-ning. And even without invoking quantum mechanics there are many situationswhere this is inappropriate. As an example one may mention the effective deple-tion attractions induced between two tagged hard spheres by other hard spheresthat cannot enter the depletion zones around these particles (sketch). Theseeffective attractions are, in general, not pair-wise additive2 although they arisefrom the purely pairwise interactions between the individual hard-spheres. Thisexample therefore also serves to illustrate the point that the restriction to pairinteractions does not exclude the possibility that complicated many-body corre-lations govern the packing structure and macroscopic behavior.

The general task now, is to calculate the partition sum, or rather (since theintegration over momenta can immediately be performed) the configuration in-tegral for the Hamiltonian (9). Except for a few toy models, this is analyticallycompletely out of reach, which leaves one with essentially two choices. The firstone is to calculate Z numerically. This replaces an arbitrary generic problem inequilibrium statistical mechanics by a rather specific technical challenge, namelythe accurate numerical evaluation of an extremely high-dimensional integral. Theuniversal method to address this is called Monte Carlo simulation. To get thebasic idea, imagine a simple 2-dimensional example, namely the evaluation of thearea of a circle. Draw your circle inside a sand pit of known area and let the

1For conduction electrons in solids exchange interactions are crucial, though.2A dilute, strongly bi-disperse mixture of hard spheres provides an exception (exercises).

4

kids randomly throw pebbles into the sand pit. At the end of the day you collectthe pebbles from inside and outside the circle area and compare their accumu-lated weights to get an estimate of the ratio of the circle area to the known sandpit area (see exercises). An alternative approach, which is less suitable for theexploitation of child labor, is pursued in the following.

Pair correlations: the radial distribution function

As another restriction, the following discussion will focus almost exclusively onthe discussion of pair correlations. Thanks to the restriction to pair interactions,they contain in many cases the most salient information. Higher-order correla-tions may well contain essential additional information, but they are certainlymore difficult to measure and to calculate. In fact, a framework based on pair-interactions and pair-correlations is to date the most common and most practicalway to approach many of the formidable challenges provided by many-body sys-tems. But beware! In general, pair correlations will not only not tell us the wholestory, but they may even fail to provide a good qualitative picture of the overallbehavior. Concentrated hard spheres can serve as a striking example. They inter-act only pairwise, yet exhibit a dramatic slowing down of their dynamics, calleda glass transition, without any obvious sign in their pair correlations3. Similarly,complex ring-like bridges involving subtle correlations between more than justpairs of particles are held responsible for the jamming of granular flows (whichcomplicates the construction of a good hourglass).

In the following, the grand canonical formulation is applied and for the mi-croscopic particle concentration the abbreviation

n(r) ≡∑i

δ(r− ri) (10)

is introduced. The position-dependent “1-body” or “1-point” density

n(r) = 〈n(r)〉 = 〈∑i

δ(r− ri)〉 = 〈Nδ(r− r1)〉 (11)

simply contains the information how many particles are found on average in a cer-tain region of configuration space. For a homogeneous (i.e., translation-invariant)system, n(r) = n ≡ N/V it is just the overall density, which is constant. Themore interesting function to look at is then the 2-point density or pair correlationfunction. It shall here be defined as 〈n(r)n(r′)〉 minus the self-correlations4,

n(r′, r′′) ≡ 〈∑i 6=j

δ(r′ − ri)δ(r′′ − rj)〉 = 〈N(N − 1)δ(r′ − r1)δ(r′′ − r2)〉 . (12)

3In contrast, their crystalization is well detectable in the pair correlations.4The experts often tacitly use the same notation, no matter whether they want the self-

correlations in or out, which has then to be decided from the context.

5

There are usually still interesting pair correlations even if the system is translationinvariant, in which case n(r′, r′′) is solely dependent on the relative position r =r′ − r′′. This can be exploited by setting the origin of the coordinate systemat the center of an arbitrary particle and defining the dimensionless correlationfunction g(r)

n2g(r) ≡ 1

V

∫dr′ n(r + r′, r′)

ng(r) =1

〈N〉

∫dr′ n(r + r′, r′) =

1

〈N〉

∫dr′ 〈

∑i 6=j

δ(r + r′ − ri)δ(r′ − rj)〉

=1

〈N〉〈∑i 6=j

δ(r + rj − ri)〉 =1

〈N〉〈N(N − 1)δ(r + r1 − r2)〉 ;

g(r)r12≡r2−r1=

V

〈N〉2〈N(N − 1)δ(r− r12)〉 N=const.

= V 〈δ(r− r12)〉

(13)

This manifestly translation invariant form of the pair correlation function encodesthe neighbor correlations in the simplest possible way, namely as a local particledensity as seen from the position of an arbitrary particle. If the system is moreoverisotropic, the argument of g(r) is actually the distance r ≡ |r|, and one speaks ofthe radial distribution function g(r). It gives the probability density 4πr2g(r)/Vto find a particle at distance r from an arbitrarily chosen test particle. For astatistically homogeneous and isotropic system the full information about thepacking structure, as far as pairs of particles are concerned (i.e. excluding theinformation, with which probability three or more particles are found in a certaincorrelated state), is thus encoded in a single scalar function of a scalar variable.For hard core particles5 g(r → 0) = 0. For systems without long-range order,such as liquids and gases, the limit g(r → ∞) = 1 exists, for crystals g(r) willoscillate indefinitely around it. For Fourier transforming g(r), it is customary tosubtract the 1, which would otherwise always produce a δ-function correspondingto the forward scattering6 of a homogeneous material. The resulting function,which only retains the non-trivial pair correlations or “pair fluctuations”, is oftendenoted by

h(r) ≡ g(r)− 1 =[〈n(r)n(0)〉 − 〈n(0)〉2

]/n2 . (14)

(The last expression should be understood as a notational convention; the selfcorrelations drop out.)

5Remember that self-correlations, which would contribute a δ(r) to ng(r), were subtracted.6As an experimentalist setting up a scattering experiment you would also better like to

subtract this contribution by a beam stop to save your detector.

6

Pair correlations: the structure factor

It should be familiar from elementary optics that the Fraunhofer interferencepatterns resulting from scattering in optically dilute (i.e. essentially transparent)media correspond to a Fourier transform of the pattern of point scatterers. Therelevant non-technical information about the (relative) scattering intensity withscattering vector q is contained in the structure factor

Sq ≡1

〈N〉〈∑ij

exp[−iq · (ri − rj)]〉

=1

〈N〉

∫dr′dr′′ exp[−iq · (r′ − r′′)]〈n(r′)n(r′′)〉

=1 +1

〈N〉

∫dr′dr′′ exp[−iq · (r′ − r′′)]n(r′, r′′)

=1 + n

∫dr exp(−iq · r)g(r)

=1 + (2π)dnδ(q) + n

∫dr exp(−iq · r)h(r) .

(15)

For the first equality, introduce a 1 in the form∫dr′dr′′ δ(r′ − ri)δ(r

′′ − rj). Theleading 1 emerging in the third line corresponds to the incoherent scatteringfrom 〈N〉 independent particles and results from the self-correlations in the den-sity auto-correlation 〈n(r)n(r′)〉. The remaining terms are due to interferenceand encode correlations. The δ-function amounts to the forward “scattering”from the homogeneous contribution contained in g(r) and subtracted in h(r). Itis quite often tacitly omitted, which makes the limit q → 0 of the structure fac-tor continuous, in agreement with the experimental procedure of using a beamstop to block the contribution at q = 0. The last term contains the interest-ing information in the form of a Fourier transform of the pair fluctuations. Theinformation encoded in small-angle scattering (q 6= 0) is the integral over thefluctuations h(r), hence according to Eqs. (13), (14) and using a result from in-troductory statistical mechanics relating number fluctuations and compressibility(see exercises), we have

Sq→0 ∼ 1 + n

∫drh(r) = 1 +

〈N(N − 1)〉 − 〈N〉2

〈N〉= nkBTκT (16)

This so-called compressibility sum rule provides a reverse perspective at the re-lation between fluctuations and response. While the fluctuations are usually un-derstood to increase in magnitude as a consequence of a decrease of the responsecoefficient that determines the restoring forces, Eq. (16), in return, expresses thevalue of a response coefficient by an integral over the fluctuations. It therebyclearly displays the connection between a diverging correlation length, which en-tails a (spatially) slow decay of the fluctuations encoded in h(r), and a divergenceof response coefficients at a critical point.

7

In summary, the pair distribution completely determines one of the mostimportant experimental observables of the packing structure of a material, namelythe static structure factor. Its spatial integral moreover also determines thematerials’ mechanical strength (its compression modulus) via a sum rule, whichprovides a special case of the general fluctuation-response theorem, one of themajor predictions by statistical mechanics that repeatedly reappears in variousgeneralizations throughout this part of the lecture.

Exercise: Argue loosely that for dilute gases, one can calculate g(r) by onlyconsidering two particles, and that, in this limit, g(r) ∼ e−βV(r), where V(r)represents interaction potential between the particles. Use this to show thatg(r) = θ(r − σ) for a gas of hard spheres with diameter σ and density n �σ−3. Calculate and sketch the structure factor Sq and discuss the small anglescattering. Sketch your qualitative expectations for g(r) and Sq for a liquid, fora crystal, and close to a liquid-vapor spinodal.

1.2 Correlation functions and constitutive relations

Exploiting the simplifications ensuing from the limitation to pair interactions,Eq. (16) can be taken somewhat further, to express the two constitutive equations(or equations of state) of an interacting gas in terms of the pair distributionfunction g(r).

The caloric equation of state or energy equation, Eq. (1) is readily generalized,

U = 〈H〉 = 3〈N〉kBT/2 + 〈V〉 . (17)

Noting that all pairs of particles give the same contribution to the last term, itmay be rewritten as

〈V〉 = 〈N(N − 1)ν(r12)〉/2 . (18)

Now, introducing a 1 in the form∫dr δ(r − r12) and using Eq. (13), this may

immediately be rephrased in the form

U = 〈H〉 =3

2〈N〉kBT

[1 +

n

3

∫dr g(r)βν(r)

]. (19)

Similarly, the thermal (mechanical) equation of state may be extended and relatedto the radial distribution,

βp = −β∂V F |T = ∂V ln(ZigQ

)= n+Q−1∂VQ . (20)

The normalized configuration integral Q is defined via Z ≡ ZigQ, hence (in thecanonical ensemble)

Q =

∫dNr

V Ne−βV({rj}) . (21)

8

Due to the normalization the only volume dependence in Q is due to ν(rij), i.e.∂V ν(rij) = ∇ν(rij)∂V rij. Exploiting homogeneity and writing rij = V 1/3ξij withvolume-independent matrix elements ξij, one finds ∂V rij = rij/3V and

∂VQ = −∫

dNr

V N

N(N − 1)

2

β r12

3V· ∇ν(r12)e−βV({rj}) . (22)

Using again the trick with the inserted delta function, and keeping the N ’s insidethe averages to allow for a grand canonical interpretation if required, one thushas

Q−1∂VQ = − 1

6kBTV

∫dr r · ∇ν(r)〈N(N − 1)δ(r− r12)〉 (23)

With Eq. (13) the average can be rewritten as 〈N〉2g(r)/V , so that the correctionto the thermo-mechanical constitutive equation of the ideal gas resulting from thepotential interactions is found to be given by the average of the virial (weightedby the pair distribution), hence the name virial equation of state

p = nkBT

[1− n

6kBT

∫dr g(r)r · ∇ν(r)

]. (24)

Equations (16), (19) and (24) show how thermodynamics follows from packingstructure7. They reduce the task of deriving the thermodynamic equations ofstate to integrations over the spatial density fluctuations. Or, more precisely,the thermodynamics of a homogeneous isotropic fluid with pair interactions canbe expressed solely in terms of the two-point correlation function that reducesto g(r) for a homogeneous system. To make any practical use of this formal re-sult, one has to find g(r) in a continuous parameter region, which is indeed thecentral task of so-called liquid state theories and of many Monte Carlo simula-tions. In particular, a very educated guess about the general form of g(r) andν(r) is indispensable to apply the result in reverse, namely to infer aspects of themicroscopic packing structure and interaction potential from macroscopic ther-modynamic measurements. Historically, we owe much of our knowledge aboutmolecular interactions to this connection, and thanks to recent developments inmicroscopy, it might become a fruitful path to explore even further, in the future.

1.3 Dilute gases and the virial expansion

It is intuitively obvious that encounters between particles in a dilute gas arerare. Then, multiple collisions are probably very weakly correlated and thereforeexponentially rare, i.e., the probability for M particles to collide simultaneouslydecays like (σ3n)M , with σ a characteristic interaction range. Hence, to extract

7Three constitutive equations cannot be independent, of course, and errors of approximateliquid-state theories are sometimes estimated (or reduced) by comparing (or superimposing)the predictions obtained from Eq. (24) and Eq. (16), respectively.

9

the dominant thermodynamic effect of the interactions it suffices to calculatethe radial distribution of one particle in the field created by a second particle,which may be located at the origin of the coordinate system, for a homogeneoussystem. Given the interpretation of g(r) as a real space probability distribution,this problem is of course equivalent to the derivation of the barometer equation,with the known result8 (see the exercises)

g(r) ∼ e−βν(r) (n→ 0) . (25)

Before a more formal discussion is provided, some immediate consequences ofthis shall be derived. Inserting Eq. (25) into the energy and virial equations ofstate, Eqs. (19), (24), one immediately deduces the equations of state of a diluteinteracting gas9

U

3NkBT/2∼ 1− 2n

3T∂TB(T ) (26)

andp

nkBT∼ 1 +

n

6

∫dr r · ∇f(r) = 1 + nB(T ) , (27)

respectively. The notation makes use of two important and common definitions,the second virial coefficient

B(T ) ≡ −1

2

∫dr f(r) , (28)

and the Mayer functionf(r) ≡ e−βν(r) − 1 , (29)

which here plays the role of the function denoted by the same symbol in therelated expansion at the end of the introductory statistical mechanics lecture.The Mayer function vanishes at large arguments for well-behaved interactions,thanks to the subtracted 1, which renders the above integrals finite. Also noticethat the Mayer-function is the low-density limit of the pair correlation func-tion, h(r) ∼ f(r) for n → 0. The second virial coefficient B(T ) is an effec-tive volume that quantifies the leading corrections to the ideal gas limit of theconstitutive equations. A positive/negative B(T ) (corresponding to effectivelyexcluded/gained volume) is indicative of a predominantly repulsive/attractiveinteraction, which increases/decreases the pressure and the energy density, cor-respondingly.

A more formal derivation of the above results starts from the grand canonicalpotential

pV

kBT= lnZG = ln

∑N

zNZ(N) , (30)

8The symbol ∼ means “asymptotically equal”, i.e. an exact equality in the indicated limit.9Use βνe−βν = −β∂βf = T∂T f , and the definitions of f(r) and B(T ).

10

where Z(N) ≡ Z(V, T,N) is the canonical partition sum. Expanding ZG in thefugacity z ≡ eβµ, which is a small quantity zig = nλ3

T � 1 in the classical ideal-gas limit, one obtains a series with coefficients10 that roughly increase like powersof 1/z. The corresponding cumulant expansion (see exercises)

lnZG = ln[1 + Z(1)z + Z(2)z2 . . . ] = Z(1)z + [2Z(2)− Z(1)2]z2

2+ . . . (31)

is much better behaved. It not only has a small expansion parameter but alsosmall expansion coefficients for a system close to the ideal gas limit. Note that thesecond order simply measures the deviation of the 2-particle partition sum Z(2)from that of 2 independent particles, Z(1)2/2!, and corresponding cancellationsoccur in all higher orders, as suggested by the observation that the leading termof the cumulant expansion already yields the exact result for an ideal gas aftereliminating z in favor of N ,

N = ∂βµ lnZG = z∂z lnZG = Z(1)z + [2Z(2)− Z(1)2]z2 + . . . (32)

For the leading terms of Eqs. (30), (31), one then has

pV

kBT= lnZG = N − [2Z(2)− Z(1)2]

z2

2+ . . .

= N − nV N 2Z(2)− Z(1)2

[Z(1)z + . . . ]2z2

2+ . . .

(33)

The series in powers of the particle concentration n thus obtained is called thevirial expansion

p

nkBT= 1 +B(T )n+O(n2) (34)

with the second virial coefficient (observe that Z(1) = Zig(1) = V/λ3T and

Zig(2) = Z2(1)/2)

B(T ) = −V2

[2Z(2)

Z(1)2− 1

]= −V

2[Q(2)− 1] = −1

2

∫dr f(r) . (35)

This result demonstrates that it is indeed sufficient to solve a two-body problem,namely the two-particle configuration integral Q(2), to compute the leading ther-modynamic correction to the ideal gas. While the solution of a one-body problemwas sufficient to describe a gas of N non-interacting particles, the solution of atwo-body problem (which can always be reduced to a one-particle problem in anexternal field, viz. the barometer equation) suffices to deal with N interactingparticles as long as simultaneous encounters of three or more particles are rare.

10For the classical ideal gas Z(1) = V/λ3T = N/(nλ3

T ) = N/zig⇒ Zig(N) = Z(1)N/N ! ≈ z−Nig .

11

In summary, the corrections to the thermodynamics of an ideal gas are fordilute systems completely contained in B(T ), which contains a two-body configu-ration integral. Assuming the general functional form of the interaction potentialto be given — e.g. Lennard–Jones, (sticky) hard sphere, etc. — measurements ofB(T ) can be used (and have excessively been used) to determine the potentialparameters for atoms, simple molecules, colloidal particles or globular proteins.Dilute interacting gases are not more complex than ideal gases in an externalfield, i.e. the barometer equation. Calculation of higher order terms in the virialexpansion requires increasing effort and is of limited use, as the whole expansionwill break down at the first really interesting occasion (e.g. a phase transition,or long-ranged interactions, which jeopardize the otherwise plausible fast conver-gence of the series).

Motivated by Eq. (25), one writes also the pair distribution g(r) of a densefluid in the form

g(r) = e−βw(r) , (36)

which can be used to define the potential of mean force w(r). With Eq. (13)

∇w(r) =V

βg(r)〈∇r12δ(r− r12)〉 =

V

g(r)〈δ(r− r12)∇r12ν(r12)〉 = 〈∇r12ν(r12)〉r12=r

follows by partial integration, which explains the name and provides an alterna-tive definition. From the last expression or — by analogy with the barometerformula — from Eq. (36), w(r) is seen to be the reversible work needed to ap-proach two infinitely distant particles of the fluid to a distance r through the“background solvent” provided by all the other particles. Splitting off the corre-sponding work ν(r) for the pair in empty space,

w(r) = ν(r) + ∆w(r) , (37)

one obtains the work — or, in fact, free energy — ∆w(r) isothermally absorbedby the solvent during this process, i.e. the part of the reversible work solelydue to the presence of the solvent. In other words, the other particles induce aneffective interaction ∆w(r) between a chosen pair of test particles. This is furtherdiscussed in the exercises for the example of the so-called “depletion attractions”induced in a fluid of purely repulsive particles.

The hard sphere fluid: paradigm of condensed matter

To illustrate the above formalism, it is useful to consider a fluid of hard spheres,for which

ν(r) =

{∞ r < σ

0 r > σ⇒ g(r) =

{0 r < σ

e−β∆w(r) r > σ(38)

12

This model idealizes the most salient feature of all atomic, molecular and most col-loidal interactions, a strong hard-core repulsion at short center-of-mass distances,which plays the major role in determining the characteristic packing structure ofmost simple fluids. This is why the hard sphere fluid may justly be called aparadigm of condensed matter. Attractive interactions are often less pronouncedthan the hard-core repulsion and can be taken into account perturbatively oncethe hard sphere fluid is under control.

The only work in displacing a pair of hard spheres for distances r > σ is dueto the effective interactions ∆w(r) induced by the solvent, i.e. by the surroundingspheres. The function ∆w(r) can therefore be identified as the free energy costfor bringing two cavities of the size of the spheres from infinity to a distancer. The corresponding part of the radial distribution function gc(r) = e−β∆w(r)

is therefore also known as the “cavity distribution”. Inserting Eq. (36) into thevirial equation (24) yields a relation between the pressure and the contact valueg(σ) of g(r),

p

nkBT= 1− n

6

∫dr e−βν(r)−β∆w(r)βν ′(r)r

= 1 +2πn

3

∫dr(e−βν(r)

)′e−β∆w(r)r3

= 1 +2πn

3

∫dr δ(r − σ)gc(r)r

3

= 1 +2π

3nσ3g(σ) = 1 + 4φg(σ) .

(39)

Here, φ ≡ 4π/3 ·(σ/2)3n denotes the volume fraction of the spheres, and the finalg(σ) should be understood11 as gc(σ) = g(r → σ+). The take-home message isthat the equation of state of a hard sphere fluid is determined by the contact valueof the radial distribution function, i.e. by the number of collisions between thespheres. For concentrations n ≈ ncp = 1/vcp near close-packing12, a simple freevolume consideration suggests that the radial distribution function must developa pointed next-neighbor peak of width ∆ ≈ (v − vcp)/3σ2 at r = σ. The areaunder the peak,

4πn

∫ σ+∆

σ

dr r2g(r) ∼ C (n→ ncp) (40)

gives the number of nearest neighbors or “coordination number” C. For densefluids and close-packed crystals C ≈ 12. A simple estimate for the hard sphereequation of state p(φ) is therefore obtained by parametrizing the next-neighborpeak for r & σ, e.g. by and exponential g(σ)e−r/∆, plugging this into Eq. (40), and

11In contrast to g, gc is smooth at r = σ and no ambiguities due to discontinuity arise.12In three dimensions, closest packing is attained in a fcc/hcp-crystal with n =

√2/σ3 cor-

responding to φ ≈ 0.74, as conjectured by J. Kepler, proved by K. F. Gauß with respect to allcrystalline packings, and, at the end of the 20th century, by T. C. Hales in full generality.

13

solving for the contact value g(σ) as a function of n = 1/v. More accurate semi-empirical formulas for the equation of state and the free energy of the hard-spherefluid were proposed by Carnahan and Starling (and others) by extrapolating fromthe leading terms of the virial expansion:

p

nkBT≈ 1 + φ

4− 2φ

(1− φ)3,

F

NkBT≈ Fig

NkBT+ φ

4− 3φ

(1− φ)2. (41)

They can also be understood as results of a heuristic approach called “scaledparticle theory” and, alternatively, the Percus–Yevick expression for g(r), intro-duced further below. Reality is more complex. Already at φ ≈ 0.58 one observesa dramatic slowdown of the dynamics attributed to a glass transition that cannot directly be guessed by simple free-volume arguments or any other of the men-tioned theories, but is very well described by mode-coupling theory13, a dynamictheory that builds on liquid state theory. Moreover, there is a fluid-crystal phasecoexistence with p(n) = constant for 0.49 . φ . 0.55, which is best addressedby density functional theories, as outlined further below.

2 Attacking the many-body problem

2.1 Ornstein–Zernike integral equation

A central task in many-body theory is the prediction of phase transitions and theaccompanying structural changes, as these provide not only the most spectacularapplications but also the most significant and critical tests of the theory. Thechallenge then is to demonstrate how long-range correlations and divergent re-sponse coefficients emerge from a typically short-ranged interaction potential. Ata critical point or at a spinodal, the pair correlation function h(r) = g(r)− 1 hasto become long ranged such that its integral and, according to Eq. (16), the com-pressibility Sq→0/(nkBT ) diverges within a narrow parameter range. A standardtrick to guarantee that this subtle structure is preserved in practical calculations(which are, of course, approximate calculations) consists in introducing a newfunction c(r) and its Fourier transform cq, called the direct correlation function,via

Sq = 1 + nhq =1

1− ncq(42)

A small change in the density n then translates into singular behavior of Sq→0,and a finite error in cq will only affect the precise location of the phase transitionbut not wreck its singular nature. This equation, and its transformation into realspace

h(r) = c(r) + n

∫dr′ c(r− r′)h(r′) , (43)

13W. Gotze, Complex Dynamics of Glass-Forming Liquids: A Mode-Coupling Theory, OxfordUniv. Press, Oxford 2009.

14

are called Ornstein–Zernike (integral) equations (OZE). A formal justification ofthe pertinence of this structure and the interpretation of c(r) are postponed tothe next section, while the remainder of this section first attempts to provide anintuitive physical interpretation.

By expanding the denominator in Eq. (42) or iterating Eq. (43), respectively,one immediately sees that the OZE decomposes the correlations in h(r) intochains of “direct correlations” as expressed by c(r) (each ∗ stands for a convolu-tion integral in real space or multiplication in Fourier space, respectively)

h = c+ nc ∗ c+ n2c ∗ c ∗ c+ . . . . (44)

Indeed, the intuition of long-range correlations in g(r) building up as chains ofshort-range correlations in c(r) is an important guide in applications of the OZE.For long-ranged interactions, such as a Coulomb potential, the trick also worksbackwards and helps to explain how a long-ranged pair interaction gets screenedby the presence of other charges. The chain structure of Eq. (44) suggests that thedirect correlation function c(r) should be a close relative of the pair potential ν(r),since it is the latter that helps to transmit interactions from particle to particleand thereby generates correlations between distant pairs of particles. However,these correlations are not merely transmitted by chains (there could also be loopsin between). Therefore, the “dressed” interaction c(r) and the “bare” interactionν(r), although they play similar roles, are not identical to each other.

There are essentially two different ways of applying Eq. (43) to a system ofinterest. First, as an exact self-consistency relation that has to be solved forthe functions h and c, simultaneously. To this end, one has to supply closureconditions to constrain the two unknowns. Ideally, one would like to chooseclosures that avoid violations of any a priori known properties of g(r), but thereis no general systematic way how to find an exact closure. In practice, the self-consistent solutions are therefore non-perturbative, uncontrolled approximationscorresponding to partial resummations of the virial series. A popular example isthe Percus–Yevick closure for hard spheres,

h(r < σ) = −1 c(r > σ) = 0 . (45)

While the first equation simply expresses the no-overlap condition, g(r < σ) = 0,and therefore is obviously exact, the second expresses the expectation that thedirect correlation function should be of similar range as the interaction potentialitself, i.e. it is a physically motivated approximation of a priori unknown quality.With the closure (and some effort), the OZE is analytically solvable and yields

c(r < σ) =1

(1− φ)4

[6φ(1 + φ/2)2r/σ − (1 + 2φ)2(1 + φ r3/2σ3)

]. (46)

Here, φ ≡ nπσ3/6 again denotes the volume fraction of the spheres. From thisone gets pretty accurate approximations (see the yellow curve in Fig. 2.1) to the

15

numerically exact g(r) and Sq, which are known from Monte Carlo simulations(not shown). After integration, it also yields the Carnahan–Starling expressionfor the equation of state, Eq. (41).

The second reading of the OZE is as a kind of perturbation series that es-sentially “creates a highbrow h(r) out of a lowbrow c(r)”. This view can alsoprovide some insight as to what might be a good closure. For example, one maystart from the observation that in the low-density limit

n→ 0 ⇒ h(r) ∼ c(r) ∼ f(r) = e−βν(r) − 1 . (47)

Using this approximation for c(r) in the OZE, one obtains the first correction tothe limiting form for h(r) ∼ f(r), namely

h(r) ∼ f(r) + n

∫dr′ f(r− r′)f(r′) . (48)

In particular, for hard spheres f(r > σ) = 0 and h(r > σ) = e−β∆w(r) − 1 ∼−β∆w(r) at low densities, so that one immediately reads off

β∆w(r12) = −n∫

dr3 f(r13)f(r23) +O(n2) . (49)

Inserting the hard-sphere Mayer function and doing the integral one thus obtainsthe interesting prediction that a pair of test spheres experiences an effectiveattraction (∆w < 0), called depletion attraction, due to the unbalanced pressureexerted by the surrounding particles (see exercises). The product of the twoMayer functions indicates that, to leading order, ∆w(r) takes the effect of asingle additional particle onto the interaction of the pair of test particles intoaccount. The leading order approximation therefore amounts to treating the“solvent” of other particles as an ideal gas, except for its interactions with the twotest particles. This is known as the Asakura–Oosawa approximation in colloidscience. A more accurate expression for ∆w for hard-sphere suspensions thattakes the interactions of the solvent particles into account, is easily obtainedfrom Eqs. (36), (37) and the Percus–Yevick form for g(r), of course.

To summarize, by an apparently simple rewriting (expressing h or S in termsof c), one has thus gained a robust and efficient approximation scheme for theotherwise forbiddingly complicated many-body problem. Remember that theoriginal task was to solve a ridiculously high-dimensional configuration integralfor N ' 1023 strongly interacting particles. The OZE involves only a singleintegration and the guessing of appropriate closures. With the OZE, already thesimple limiting form of the direct correlation function for vanishing density (infact, nothing but the good old barometer equation) yields interesting predictionsfor the pair correlations and the structure factor that moreover stay qualitativelytrustworthy for finite (not too large) densities n and may even give a rough ideawhereabout and how a phase transition might occur.

16

0 5 10 15 200.0

0.5

1.0

1.5

qΣ

S q

Figure 1: The structure factor of a hard sphere fluid of volume fraction φ = 0.3(purple) and φ = 0.6 (blue) as estimated by the simplistic procedure describedin the main text, which uses the hard-sphere Mayer function f(r) = θ(r− σ)− 1as an approximation for c(r); see Eq. (48). While not quantitatively useful, itcaptures the qualitative features of the exact result (and, in fact, of the structurefactor of any simple fluid) surprisingly well, as demonstrated by the comparisonwith the quite accurate Percus–Yevick result, Eq. (46) for φ = 0.3 (yellow).

The following section establishes the above structure of pair correlations anddirect correlations and their relation by the OZE on a more systematic basis, thedouble hierarchy of correlation functions generated from a twin couple of gen-erating functionals generalizing the equilibrium free energy and grand canonicalpotential.

2.2 Thermodynamic perturbation theory & FDT

The formalism to be introduced in this section provides the formal grounds forthe Ornstein–Zernike equation and also for the discussion of the fluctuation-dissipation theorem and the density functional theories in the remainder of thispart of the lecture.

The 2-lane stairway of correlations & response

This section generalizes the double hierarchy of isothermal derivatives of theequilibrium free energy F (T, V,N) and grand canonical potential J(T, V, µ), orrather their volume densities f and j, respectively,

∂nf = µ , ∂2nβf = ∂nβµ = (n2kBTκT )−1

−∂µj = n , −∂2βµβj = ∂βµn = nSq→0 ,

(50)

where β and V are understood to be constant throughout. The procedure canclearly be continued to higher order derivatives, which will however not be pur-

17

sued in the following. Note that the compressibility sum rule, Eq. (16) is equiva-lent to the statement that the second derivatives of the potentials are inverse toeach other,

(∂2nβf)(−∂2

βµβj) = (∂2nf)(−∂2

µj) = 1 (51)

This double hierarchy becomes a very useful tool when it is upgraded to allow forspatial variations of all quantities involved except for the temperature T = 1/kBβ,which parametrizes the local equilibrium, and the volume V used to turn allextensive quantities into local densities. Spatial variations can arise from externalforces that are represented by a potential U(r) or from the (pair-)interactionpotential V(r),

H =∑i

p2i

2m+ V({ri}) + U({ri}) =

∑i

p2i

2m+∑i<j

ν(ri − rj) +∑i

u(ri)

=∑i

p2i

2m+

1

2

∫drdr′ n(r)ν(r− r′)n(r′)−Nν(0)/2 +

∫dr n(r)u(r)

(52)

The subtraction of Nν(0) from — and the prefactor 1/2 in front of — the in-teraction term correct for counting the self-interactions and for double countingall pairs, respectively. The main difference of this local formalism compared tothe above is that partial derivatives of the free energy densities turn into func-tional derivatives of the free energies themselves and that multiplications of freeenergies have to be interpreted as convolutions. (By Fourier transformation, onecan always get rid of this complication). The benefit is that derivatives gener-ate correlation and response functions and not merely moments of some globalquantities and response coefficients, respectively.

Concerning the free energies, notice that the average of the external potential

〈U({ri})〉 =

∫dr 〈n(r)〉u(r) =

∫drn(r)u(r) (53)

has the form of a space-dependent generalization of the chemical potential term−µN . In fact, this familiar structure is recovered for u(r) = −µ = const. Withthe notion of a spatially heterogeneous generalized chemical potential (known as“electrochemical potential” in solid state physics)

µ(r) ≡ µ− u(r) (54)

one can therefore delegate much of influence of the external potential to the grandcanonical potential. Its functional derivatives with respect to µ(r) are

−δJδµ(r)

= n(r) ,−δ2βJ

δβµ(r)δβµ(r′)=

δn(r)

δβµ(r′)= 〈δn(r)δn(r′)〉 ≡ G(r, r′) , (55)

with δn(r) ≡ n(r)−〈n(r)〉. Also recall that the pair correlation function G(r, r′) isessentially the Fourier transform of the structure factor (times particle number).

18

The corresponding “free-energy route” follows from the Hamiltonian in Eq. (52)after subtracting 〈U〉. The remaining parts give rise to the intrinsic free energy

F [n(r)] = J +

∫drn(r)µ(r) , (56)

which is the Legendre transformation of the functional J [µ(r)] with respect toµ(r). Its functional derivatives are14

δF

δn(r)= µ(r) ,

δ2βF

δn(r)δn(r′)=δβµ(r)

δn(r′)= G−1(r′, r) , (57)

so that∫dr′′

−δ2J

δµ(r)δµ(r′′)

δ2F

δn(r′)δn(r′′)=

∫dr′′G(r, r′′)G−1(r′′, r′) = δ(r− r′) . (58)

It is often useful to isolate the exactly known ideal gas contribution or, moregenerally, any other “known” reference free energy. Up to gauge terms O(N)(such as the −1 inside the curly brackets), the free energy of the inhomogeneousideal gas reads

βFig[n(r)] =

∫drn(r){ln[n(r)λ3

T ]− 1} (59)

with the derivatives

δFig

δn(r)= kBT ln[n(r)λ3

T ] ≡ µig(r) ,δ2βFig

δn(r)δn(r′)=

1

n(r)δ(r− r′) . (60)

This suggests to introduce the so-called excess free energy Fex ≡ F − Fig, withthe derivatives

δFex

δn(r)= µ(r)− µig(r) ≡ µex(r) ,

δ2βFex

δn(r)δn(r′)=δβµ(r)

δn(r′)− 1

n(r)δ(r− r′) =

δβµex(r)

δn(r′)≡ −c(r, r′).

(61)

By comparison with the second line of Eq. (52), c(r, r′) can be said to be thedressed version of the bare pair interaction ν(r−r′) (the free energy looks like theHamiltonian but with n and c in place of n and ν, respectively). The notationc(r, r′) is not an accident. The quantity thus defined, or rather n(r)c(r, r′), isseen to encode “excess” correlations beyond the δ-correlations of the ideal gasand reduces to the direct correlation function n c(r − r′), introduced above, in

14Do not worry about the distribution of r′ and r′′ onto the first and second arguments ofthe functions involved. Second derivatives of free energies are interchangeable.

19

the special case of a translation invariant system. Inserting Eq. (61) togetherwith the (total) pair correlation function

n(r)n(r′)h(r, r′) ≡ n(r, r′)− n(r)n(r′) = G(r, r′)− n(r)δ(r− r′) , (62)

(where n(r, r′) is the usual 2-point correlation function without self-correlations)into Eq.(58), one finds the familiar structure (exercises):

h(r, r′) = c(r, r′) +

∫dr′′ n(r′′)c(r′′, r)h(r′, r′′) . (63)

This is, of course, nothing but the Ornstein–Zernike equation in new garment.The earlier versions are immediately recovered for a homogeneous system withconstant density n(r) = n, for which also the definition of h(r, r′) reduces to thatof h(r− r′).

This now further clarifies the pertinence of the OZE and the notion of the di-rect correlation function, and shows that both are deeply ingrained in the generalstructure of equilibrium thermodynamics and statistical mechanics, providing aparadigm for quantum field theories and non-equilibrium statistical mechanics.The OZE constitutes a relation between the 2-point correlation and/or responsefunctions of different hierarchies, one obtained along the free-energy route, theother along the grand-potential route. In terms of response functions, it simplystates the (mathematically trivial) reciprocity of ∂µn and ∂nµ in terms of theexcess fluctuations and response. The response of the density to an infinitesimalfluctuation of the chemical potential is the reciprocal of the response of the chemi-cal potential to an infinitesimal density fluctuation. As the above discussion triedto convey, despite this seemingly trivial content, the OZE can serve as a powerfulpasskey to the otherwise forbiddingly complicated many-body problem.

Fluctuation-response theorem

The second derivatives of the generalized thermodynamic functionals F [n(r)] andJ [µ(r)] in Eqs. (55) and (57) can be interpreted as spatially varying susceptibilityfunctions or correlation functions, respectively. These equations thus generalizeresults about the equivalence of fluctuations and response coefficients; be it therelation between grand canonical number fluctuations and the compressibility de-rived in introductory texts on statistical mechanics, the compressibility sum rulein Eq. (16), or the relations between the constitutive equations and the packingstructure established in Sec. 1.1. These relations between packing structure, ther-mal fluctuations, and macroscopic material properties are taken one step furtherand extended to include detailed spatial information, by the above formalism. Tocast the result into a more familiar form, it shall now once more be re-derived andspecialized to the case of a homogeneous fluid. For the sake of the argument, theexternal perturbation is assumed to be periodic (considering a Fourier component

20

of any arbitrary perturbation, say),

u(r) =uqVeiq·r . (64)

This corresponds to the perturbation Hamiltonian U = uqn−q/V , with nq =∑i e−iq·ri the Fourier transformed microscopic density. Then, the resulting den-

sity field 〈nq〉U in presence of the perturbation U is obtained to leading order inthe perturbing field as

〈nq〉U =〈nqe−βU〉〈e−βU〉

∼ 〈nq(1− βU)〉〈1− βU〉

∼ nq +βuqV

(〈nq〉〈n−q〉 − 〈nqn−q〉

).

Here and in the following, quantities calculated in, or pertaining to the perturbedstate are marked with U to distinguish them from corresponding quantities in thereference fluid. The strength of the deviations from the homogeneous referencestate are controlled by the structure factor Sq of the unperturbed system, which—at the same time—characterizes the fluctuations and packing structure,15

∆n(U)q ≡ 〈nq〉U − 〈nq〉 = −βuq

V〈δnqδn−q〉 = −βuqnSq . (65)

This is a typical linear-response result. The response of the density is linearlyproportional to the perturbing field uq. The prefactor is an unperturbed equilib-rium average independent of uq, hence characteristic of the equilibrium system.The local susceptibility χ generalizing the overall isothermal compressibility κTis the functional derivative of the local density with respect to the perturbingpotential. For a homogeneous system, it turns into an ordinary partial derivativein Fourier space,

χ(r, r′) ≡ δn(r)/δu(r′) , χq = ∂nq/∂uq = −βnSq . (66)

The structure factor, the two-point correlation function of the density fluctua-tions, is at the same time the susceptibility controlling the density change inresponse to an external perturbing potential. This is the fluctuation-responsetheorem, which is of course already contained in Eq. (55). It reduces to thecompressibility sum rule as the special case q → 0, and it is a special case (fortemporally stationary situations) of the more general fluctuation-dissipation the-orem (FDT) to be discussed in the context of stochastic dynamics, further below.

Perturbation theory

Perturbation theories can be used to systematically improve a “known” referencesystem, such as the ideal gas or the hard sphere fluid. The ultimate aim would

15Note the overloading of the symbols Sq (q = 0 contribution included/excluded) and δ (Diracdelta, functional derivative and variation δn ≡ n− n around a local/global average n ≡ 〈n〉).

21

be to calculate the excess free energy. The fluctuation-response relation derivedin the preceding paragraph concerns the linear response to (or first-order pertur-bation by) an external potential. This can be integrated to obtain a “perturbed”free energy that contains corrections to the reference system. A similar proce-dure can be applied with the interaction potential ν(r, r′) in place of the externalpotential or with the density itself. Starting from

δJ

δν(r− r′)=

δF

δν(r− r′)=

1

2n(r, r′) , (67)

if the interaction να = ν0 +α(ν−ν0) (with ν0 = 0 in the simplest case) is switchedon by a charging parameter α, one has

Fex[n(r)] =1

2

∫ 1

0

dα

∫drdr′ nα(r, r′)να(r, r′) . (68)

The dependence of the correlation function on α keeps track of the accumulatingchanges in the pair structure upon switching on the interactions. Starting fromEq. (57), instead, one has to integrate the direct correlation function from thereference state to the state with the desired density n of excitations. Since thereare now two factors of the density, one has to integrate twice:

βFex[n(r)] = −∫ 1

0

dα

∫drn(r)

∫ α

0

dα′∫

dr′ n(r′) cα′(r, r′) . (69)

The first integration produces a non-trivial force at a finite density nα that hasthen to be integrated up over the differential response ndα to get the free energy.The “charging parameter” α can thus loosely be thought of as the amount of airpumped into a dinghy of shape n(r) to inflate it16. The dependence of the directcorrelation function on α keeps track of the accumulating changes in the pairstructure upon increasing αn from zero to n. For a homogeneous final state, andusing the identity ∫ 1

0

dα

∫ α

0

dα′ cα′ =

∫ 1

0

dα (1− α)cα , (70)

this simplifies to

βfex = n2

∫ 1

0

dα (α− 1)

∫dr cα(r) . (71)

The problem with the exact relations Eqs. (69), (68) is that cα and nα areof course not known, a priori. As already mentioned in the context of the con-stitutive equations, one needs expressions for the radial distribution in a whole

16Do not confuse the integrations over α with functional integrations over the free energy,i.e. the inverse operation of taking a functional derivative: Eq. (69) is just a way of writingδFex[δn(r)] for finite (not infinitesimal) density variations that are not integrated out.

22

continuous parameter range to infer the thermodynamics from the packing struc-ture. In some practical situations, one might be satisfied with a less ambitiousprogram, though. First order perturbation results are for instance obtained byreplacing the unknown pair functions in Eqs. (69), (68) by those pertaining to thereference state. This is reminiscent of the first order corrections to energy levels inquantum mechanics, which are given by the matrix elements of the perturbationHamiltonian in the basis functions of the unperturbed problem.

Further, the so-called RPA (“random phase approximation”) replaces Eqs. (69),(68) by17

Fex[n(r)] =1

2

∫drdr′ n(r)ν(r− r′)n(r′) (RPA) . (72)

This ubiquitous approximation can be interpreted as arising from either

n(r, r′)→ n(r)n(r′) or c(r, r′)→ −βν(r, r′) . (73)

The first is known as the decoupling approximation, which shows that the RPAbecomes exact at large distances in a (non-critical) fluid, where the correlationsdecay over a finite length scale. The second amounts to replacing the dressed bythe bare interaction. It has already been used above in a slightly upgraded version— namely in the form c(r) → f(r) = e−βν(r) − 1, which amounts to a differentpartial resummation of the virial series (see Fig. 2.1). There, it was motivated asa low-density approximation. From the first interpretation in Eq. (73), it is nowseen to also hold for dense fluids at large distances |r − r′| → ∞. In particular,Eq. (73) corroborates that c(r) and ν(r) are of similar range in a fluid, wheren(r, r′)→ n(r)n(r′) for |r− r′| � σ. This property is an important ingredient inall successful closure schemes for the OZE of fluids.

Finally, the minimum property of the grand potential,

δJ

δn(r)=

δF

δn(r)− µ(r) =

δFex

δn(r)− µex(r) = 0 , (74)

can be exploited to venture beyond the above approximations by a variationalprinciple. Free energies calculated with approximate structure functions (or basedon guessed trial functions) can then a posteriori be minimized with respect tofree parameters. Similarly, the independence of the free energies of artificialdecompositions of the interactions into a reference part ν0 and perturbation partν − ν0 can be exploited. All of the above techniques (and many others) are usedin so-called density functional theories.

As a schematic example, of how these theories work in principle (not literally),think of a fluid of interacting particles, where the interactions can be split into aneffective hard-sphere part and a perturbation part, as illustrated in the sketch in

17The homogeneous contribution n2νq→0 to the integrand is sometimes excluded by replacingthe density fields n(r) by their deviations δn(r) from the homogeneous reference state n(r) = n.

23

Figure 2: Schematic of how the various perturbative and non-perturbative ap-proximation methods introduced so far — hard-sphere-fluid, PY, OZE, thermo-dynamic integration, RPA, free energy minimization — can (in principle) becombined to produce approximate expressions for the free energy functional F [n]of a dense interacting fluid with pair interaction potential ν(r).

Fig. 2. The hard-sphere part is dealt with in the Percus–Yevick approximation toobtain c(r) analytically via the Ornstein-Zernike equation, while the perturbationpart is dealt with in perturbation theory, e.g. in the RPA approximation. Finally,the free energy obtained by integration is minimized to find an optimum valuefor some artificial model parameters (schematically represented by the effectivehard-core diameter σ in the figure). To put such ideas to work in practice is anintricate story of its own. The next section only provides a very brief outline.

2.3 Density functional theories

Density functional theory (DFT) is a very successful and popular frameworkbased on the above formalism that extends (and includes as a special case) thewell-known Ginzburg–Landau theory discussed in introductory texts, but alsotheories of freezing and melting transitions (in simple fluids as well as in liquidcrystals), and the Poisson–Boltzmann and Debye–Huckel theory. Appropriatelyextended versions of DFT involving particle exchange effects are very useful inquantum mechanical many-body problems, e.g. in describing electrons and quasi-particles in (semi-)conductors and large atoms. In the following, the emphasis ison a few paradigmatic examples of fundamental interest in many areas of physics.

24

Inhomogeneous ideal gas

Basically the only exactly known density functional of an inhomogeneous fluidis that of the inhomogeneous ideal gas. It is obtained by replacing the constantdensity n by the spatially varying field n(r) in the known bulk formula,

βFig[n(r)] =

∫drn(r)

(ln[n(r)λ3

T ]− 1)⇒ βδ2Fig

δn(r)δn(r′)=

1

n(r)δ(r− r′) . (75)

In the homogeneous state the fields degenerate to n(r) = n, G(r, r′) = nδ(r− r′),and hence ng(r) = n, h(r) = c(r) = 0, and Sq ≡ 1.

Slightly inhomogeneous fluids

Even for a strongly interacting many-body system, weak fluctuations around ahomogeneous reference state may often be treated as a weakly interacting gas.The corresponding model of a slightly inhomogeneous fluid is among the mostcommonly applied models in many-body physics. It is the (classical) basis forthe quantized elementary excitations and quasi-particles that govern the worldof solid-state physics, but it is equally useful in soft matter physics. The in-homogeneous part of the free energy in terms of the weak density fluctuationsδn(r) = n(r)− n with |δn(r)| � n reads18

βFih =

∫dr {n(r) ln[n(r)/n]− δn(r)} − 1

2

∫drdr′ c(r, r′)δn(r)δn(r′) . (76)

Note that due to the assumed weakness of the fluctuations, there is no addi-tional integration over a “charging parameter” α involved, here. This is simplya rewriting of the second line of Eq. (61) as a differential form for infinitesimalδn(r). The direct correlation function of the reference state is assumed to beknown from somewhere (e.g. from the OZE) or estimated via the RPA. A fa-miliar example for such theories is the square-gradient expansion familiar fromthe Ginzburg–Landau free energy for critical fluctuations, that amounts to theadditional small wave vector approximation cq ∼ c0 + c1q

2/2. (Here, c1/4β wouldbecome the coefficient in front of the square gradient term in the free energy afterFourier back transformation.) But many other remarkably successful DFTs —in fact all those introduced here — are constructed in the same way.

For the inhomogeneous free energy, Eq. (76), the minimum condition, Eq. (74),reads

βδFih

δn(r)= βµih(r) = ln[n(r)/n]−

∫dr′ c(r, r′)δn(r′) . (77)

This has to vanish or to balance an external potential −βu(r), if present. (Theoverall chemical potential µ serves to fix the value of n.) In other words, the

18The last term in the curly brackets is a gauge term (actually vanishing upon integration)kept to make later cancellations with the first term explicit.

25

spatial density profile n(r) obeys the self-consistency relation

n(r) = ne−βueff(r) with βueff(r) = βu(r)−∫

dr′ c(r, r′)δn(r′) (78)

a position dependent effective potential, which is self-generated by the inter-actions. This generalizes the well-known self-consistency equation (most oftenfirst encountered in the mean-field theory for the Ising model in introductorystatistical-mechanics texts) to a spatially varying mean field. It incidentally im-proves it by replacing the bare pair interaction potential with the direct correla-tion function. (In return, one learns that, if mean-field theory is exact in infinitedimensions and/or for van-der-Waals-type pair interactions that couple all possi-ble pairs with equal strength, then the same must hold for the RPA, which furthersupports the robust character of this approximation.) For vanishing interactionsthe barometer equation is recovered. The calculation of weak density fluctua-tions in an interacting fluid thus parallels the solution of the barometer equationfor an ideal gas with the additional (hard) problem of finding the self-consistentself-generated mean field (the last term in the equation). If the direct correlationfunction is not yet known (e.g. from the OZE), it is commonly replaced by thebare potential (or by the Mayer function), in the RPA approximation. Even then,the Eq. (78) represents an infinite-dimensional nonlinear eigenvalue problem thatusually cannot be solved exactly by analytical means.

A widely known and applied example for the above formalism is the Poisson–Boltzmann theory (also known by the name Guoy–Chapman theory to chemists):

∇2βφ(r) = κ2 sinh βφ(r) (“Poisson–Boltzmann”) . (79)

In this theory, βueff is taken to be the electrostatic potential

βφ(r) =

∫dr′

z2`B|r− r′|

ne(r′) , (80)

generated by a globally neutral plasma of particles carrying the charge ±ze, withthe net charge density ne(r) ≡ n+(r) − n−(r) (the local mismatch between thepositive and negative fractions of the total charge density). The characteristiclength scales `B ≡ βe2 and κ−1, with κ2 ≡ 4πz2`Bn, are called Bjerrum lengthand Debye screening length, respectively. The first is defined by the balance ofelectrostatic and thermal energy for two elementary charges at distance `B. Andby linearizing Eq. (79), the second is seen to be the characteristic distance overwhich a charge mismatch is compensated by the polarization of the surroundingplasma.

The name Poisson–Boltzmann theory derives from the following obvious in-terpretation. If the electrostatic potential φ(r) is generated by the charge densityne(r) it is a solution of Poisson’s equation

∇2βφ(r) = −4πz2`Bne(r) = −κ2[n+(r)− n−(r)]/n (Poisson) . (81)

26

From this, one arrives at Eq. (79) by simply estimating the local densities of thepositive and negative charges on the right from the barometer equation for theself-generated potential φ(r), namely,

n±(r) = (n/2)e∓βφ(r) (barometer equation) . (82)

This derivation reveals the mean-field character of the theory, as it treats the par-ticles as an ideal gas in the self-generated mean field (without considering moresubtle inter-particle correlations), consistent with the assumption of a weakly in-homogeneous system, underlying Eq. (78). The direct comparison with Eq. (78)moreover reveals a replacement of the direct correlation function by the bareCoulomb potential in Eq. (80), which clearly amounts to yet another approxima-tion, namely the RPA. If moreover βφ � 1 is invoked to linearize Eq. (79), oneends up with the so-called Debye–Huckel approximation: φ(r) ∝ e−κr.

Freezing and mesophases

Other significant early examples for DFTs were the Thomas–Fermi theory foratoms with many electrons and Onsager’s theory of a nematic transition in sus-pensions of hard rods. Onsager’s theory19 from 1949 is particularly interesting,since it combines two independent theoretical concepts introduced above, thevirial expansion and density functional theory. It is a paradigm of partial order-ing (orientational ordering without positional ordering) into a so-called micro- ormesophase, a theme with countless variations in (soft) condensed matter theory;and finally it is an early example for a purely entropic or “geometric” transition.Thereby, it anticipates later theories of freezing and melting of simple fluids.

The essential physics of the nematic transition is that isotropic rods of lengthL and diameter a jam up at high densities, as familiar from the mikado game.To avoid an entropy crisis, i.e., a complete loss of conformational entropy at thejamming density20 n = nc ≈ 1/L2a, the system sacrifices a bit of its orientationaldisorder to retain some more of its translational freedom. This trick only worksif a macroscopic fraction of the molecules can agree on an average preferredorientation: spontaneous long-range orientational order has to emerge in orderthat the center-of-mass coordinates can retain their fluid-like distribution (whencethe name liquid crystal, referring to the lack of positional long-range order).

In 1979, Ramakrishnan und Yussouff used Eq. (76) along similar lines withgood success to elucidate the geometric nature of the freezing and melting tran-sitions in classical fluids, explain the universality of the packing structure nearthese transitions, and numerically compute quantitative predictions for the phase

19L. Onsager, Annals of the New York Academy of Sciences 51 (627-659) 1949.20Onsager argues that, by virtue of the low critical volume fraction a2Lnc ' a/L, the virial

series can be truncated after the second virial coefficient and the direct correlation function canbe approximated by the Mayer function, both with negligible error, for very thin needles.

27

diagram. Their analysis provided a theoretical basis for successful phenomeno-logical rules of thumb, such as the Verlet rule and the Lindemann criterion, whichare widely used in practical applications. The Lindemann criterion states thatcrystals melt if the mean-square fluctuations of the atoms exceed 10% of their av-erage next-neighbor distance (the lattice unit), and the Hansen–Verlet rule statesthat fluids freeze if the height of the main peak in their structure factor exceeds3, thereby hinting at different underlying mechanisms for both transitions.

The DFT approach regards the crystal as a (not so) slightly inhomogeneousfluid. The procedure can be characterized as “fishing for crystal structures”.Briefly, one looks for instances of a non-zero order parameter nKi

≡ 〈∑

j eiKi·rj〉 =

O(N) (i.e. Bragg peaks) in a preferably exhaustive set of possible reciprocal latticevectors Ki. To this end one needs an ansatz. Typically, a sum of Gaussiandistributions of width a centered at the lattice sites {R}

n(r) = ns(√

πa/a0

)−3∑{R}

e−(r−R)2/2a2

(83)

is plugged into Eq. (76) as a trial function. Local minima of the resulting freeenergy other than the homogeneous fluid reference state are then searched byvarying the model parameters ns, a0, and a for a prescribed trial crystallographiclattice {R(a0)} such as BBC, FCC, etc., and the one with the lowest free energyis then proposed as the (most likely) equilibrium structure. While the latticeconstant a0 refers to the abstract lattice, the density ns refers to the actual masssitting on this lattice. If it is treated as an independent parameter (not tied to1/a3

0, say), the ansatz can account even for vacancies.The above examples illustrate that DFT can be a quick and easy way to

get impressive results by comparably little effort — essentially by dropping themagic hats on the densities n(r) in the Hamiltonian to accomplish much of theimpossible task of calculating the partition sum of an interacting many-bodysystem. The trick works remarkably well, even for some pretty complex first-order phase transitions. Are we in paradise, then, or what are the limitationsof this strikingly simple and powerful recipe? Even though DFT exploits andbuilds on formally exact relations and admits fluctuations via the spatially varyingdensity field n(r), it is mean-field like (as van-der-Waals or Curie-Weiß theory), inpractice. The approximations required to address real-life applications (such asweak inhomogeneity, RPA) usually introduce uncontrolled errors that spoil thecritical behavior at continuous phase transitions and other subtle correlations.Briefly, from the structure of the OZE and the compressibility sum rule

nκT = βSq→0 =β

1− ncq→0

RPA≈ β

1 + nβνq→0

=k−1B

T − Tc. (84)

So the compressibility κT can clearly be made to diverge like (T − Tc)−γ with

γ = 1 at a critical temperature Tc ≡ −nνq→0/kB. But it remains obscure how γ

28

might ever acquire a non-mean-field (non-rational) value without extraordinaryconspiracy. Another limitation of DFT is that it usually does not get rid of someguessing of trial functions or, in more elaborate versions, at least of some freeparameters. With respect to the task of obtaining numerically precise predictionsthis freedom may suit the practitioner, while it is somewhat less satisfying withrespect to the task of generating insight into fundamental physical mechanisms.

3 Critical phenomena and renormalization

3.1 Scaling, exponent relations, Landau theory

Critical exponents

Critical behavior is the thermodynamic behavior associated with continuous(“second-order”) phase transitions, as typically found at an end of a discontin-uous (“first-order”) phase transition lines, such as the liquid-vapor coexistenceline in a p− T−diagram. The critical point for Bose–Einstein condensation dis-cussed in the introductory lecture is a somewhat trivial example, since the modelis interaction-free. A more generic example is provided by a paramagnet spon-taneously turning into a ferromagnet upon cooling below the Curie temperature.Quite generally, one characterizes the associated critical behavior phenomenolog-ically by a number of critical exponents characterizing the singular dependen-cies of various thermodynamic quantities of interest at small t ≡ T/Tc − 1 (ort ≡ 1− Tc/T ),

CV ∝ |t|−α (85)

ψ ∝ |t|β (t < 0) (86)

χT ∝ |t|−γ (ψ> = 0;ψ< = ψ1,2) (87)

h ∝ |ψ|δ (t = 0) (88)

ξ ∝ |t|−ν (89)

Broadly speaking, α characterizes the thermal effects and γ the mechanical, mag-netic, etc. instability at the transition. For the liquid-gas critical point χT = κT ,and γ quantifies the divergence of the compressibility. The shape of the criticalisotherm is characterized by δ. The notation |t| indicates that both positive andnegative t are to be considered, and that the singular behavior above and belowthe transition is quite symmetric (except for numerical factors). The emergingorder parameter is quantified by β (defined only for t < 0), and its spatial cor-relations by the correlation length ξ with exponent ν. A spontaneous symmetrybreaking is often associated with the transition, as e.g. the spontaneously emerg-ing magnetization clearly has to point in some definite but a priori arbitrarydirection.

29

There are two most interesting and perplexing experimental facts about crit-ical phenomena:

1. Their universality, which means that phenomenologically very different sys-tems such as magnets and fluids may share the identical critical behavior(the same critical exponents), which allows to group the wide variety of phe-nomenological materials and other physical and even non-physical systemsinto a few universality classes.

2. With few exceptions, the critical exponents are irrational numbers relatedby so-called exponent relations.

Breakdown of Landau theory: upper/lower critical dimension

A mechanistic theory proposed to deal with small order parameter fluctuationsψ for arbitrary phenomenological systems near their critical points is Landau–Ginzburg theory. It essentially consists of a Taylor expansion of whatever (un-known) free energy of whatever complicated system of interest, in the small pa-rameters ψ, t ≡ T/Tc − 1 and h in the vicinity of a critical point:

LG = nckBTc

∫V

drLG , LG ≡`2

2(∇ψ)2 +

t

2ψ2 +

g

4ψ4 − hψ . (90)

The critical values nc and Tc of the density and temperature have been used tobring the free-energy density LG into a dimensionless form, ` is a length on theorder of the interaction range, g some phenomenological number, and h a nor-malized (dimensionless) external field. Since the exact free energy is usually notknown, one has to guess the Landau function on the basis of the dimensionalityof the order parameter and the embedding space, and considering only terms inthe Taylor expansion that are allowed by symmetry21 (e.g. for an up-down sym-metric magnet no odd powers of the magnetization are allowed). The resultingtheory clearly is of the sort of DFTs devised above to deal with weak fluctuationsabout a reference state (here the critical state). Upon minimizing LG, it correctlypredicts the emergence of order below Tc. And it suggests that thermodynamicbehavior may be classified into universality classes according to the topology ofthe embedding space and the order parameter field and according to fundamen-tal symmetries. So it accounts for observation (1) from the preceding paragraph.However, as already pointed out, such a theory cannot predict non-mean-fieldvalues for the critical exponents. Moreover, it naturally breaks down when the(relative) fluctuations diverge, which actually happens at the critical point. Sothe theory cannot account for observation (2) from the preceding paragraph. Thefailure of the theory is not merely a quantitative problem, which could be over-come by a slightly improved model. It is a failure in principle, due to the basicassumption that the free energy can be expanded in a Taylor series.

21This may sound familiar from the standard model of particle physics — a Landau theory.

30

You should not hastily conclude that Landau theory is pointless, though. It is,in fact, the standard starting point for gaining a qualitative and (ultimately even)quantitative understanding of the odd behavior of critical systems within statisti-cal mechanics. This understanding only emerged in the 1970’s and was rewardedwith the Nobel prize for K. G. Wilson (see links on the homepage). Moreover,Landau theory is a useful starting point to deal with a variety of phenomenaaway from criticality, such as surface tension, nucleation, and coarsening.

Luckily, Landau–Ginzburg theory predicts a criterion for its own breakdown,known as the Ginzburg criterion. The idea is that at the critical point, where thequadratic term ∝ ψ2 in the Landau free energy vanishes, the task of confining theorder parameter fluctuations is largely left to the gradient term (the ψ4−termbeing very flat). Using dimensional analysis, the relevant integration range andthe ∇ are estimatd by the characteristic correlation length ξ, i.e.

∫dr → ξd,

∇ → ξ−1. Invoking equipartition (which states that each quadratic classicaldegree of freedom carries kBT/2 of thermal energy in its fluctuations) for thecorrelation volumes ξd, which correspond to the effective (collective) degrees offreedom near the critical point, one has

nckBTc

∫V

dr `2(∇ψ)2 ' nckBTcξd−2`2δψ2 ' kBTc (91)

The overall strength of the order-parameter fluctuations δψ ≡ ψ − ψ1,2 is then

δψ2 ' `−dn−1c (ξ/`)2−d = `−dn−1

c |t|(d−2)/2 , (92)

using ξ = `|t|−ν with ν = 1/2 for the characteristic range of the order parametercorrelations22. The result is now compared with the (squared) absolute valueψ2

1,2 = |t|/g of the order parameter predicted by Landau theory, which is obtainedby minimizing LG for a homogeneous system with constant ψ(r) = ψ, belowTc (exercises). Both δψ and ψ1,2 are functions of t, and one would expect theGinzburg theory to be trustworthy in a range of t where the fluctuations δψ donot exceed ψ1,2 in magnitude:

δψ2/ψ21 ' `−dn−1

c g|t|(d−4)/2 . 1 ⇒ |t| & (g/`dnc)2/(4−d) . (93)

This is the so-called Ginzburg criterion. It shows that d = du = 4 plays the role ofan upper critical dimension. For d > du fluctuations vanish for t→ 0 and Landautheory should hold close enough to the critical point. In contrast, for d < du,Eq. (93) defines a critical region |t| . (g/`dnc)

2/(4−d) around the critical pointwhere Landau theory fails. In brief: “murder at the critical point — macroscopicorder parameter fluctuations kill Landau theory.” Mean-field theory thus breaksdown close to the critical point, while further away from criticality (albeit nottoo far for a Taylor-expanded Landau free energy) one can trust its predictions

22The mean-field value ν = 1/2 is confirmed by a direct calculation, Eqs. (97), (98), below.

31

(sketch). As an exception to the rule, “classical” superconductors do obey mean-field theory, since their critical region is very small due to the large off-criticalcorrelation length `� n

−1/3c due to the strongly delocalized Cooper pairs.

The above estimate of the absolute order parameter fluctuations δψ in Eq. (92)moreover suggests that the order-parameter fluctuations become large in abso-lute terms, not only relatively to ψ1,2, below d . dl = 2. This hints at aneven more severe breakdown of the theory below a lower critical dimension dl.This suspicion, and actually the whole above discussion, is corroborated by astraightforward diagonalization of LG, using Fourier modes

ψq ≡1√V

∫V

drψ(r)e−iq·r ψ(r) =1√V

∑q

ψqeiq·r , (94)

in the harmonic approximation (exercises)

LG = nckBTc1

2

∑q

(t+ q2`2)|ψq|2 . (95)

For simplicity, T & Tc (t > 0) and h = 0 shall be assumed. Now due to equiparti-tion of energy for the eigenmodes ψq, the average strength of the order parameterfluctuations follows from

nckBTc(t+ q2`2)〈|ψq|2〉 = kBTc . (96)

Observe that, while the average order parameter 〈ψ〉 vanishes for T > Tc, itssquared fluctuations have non-vanishing mode amplitudes

nc`2〈|ψq|2〉 =

1

ξ−2 + q2. (97)

Here the parameter combination ξ = `/√|t| is identified with the correlation

length of order parameter fluctuations, which corroborates Eq. (92). It controlsthe spatial decay of correlations, which becomes more evident from the Fourierback transform, the order parameter correlation function in real space,

h(r) = 〈ψ(r)ψ(0)〉 = FT−1 〈|ψq|2〉√V∝

e−r/ξ

r(d−1)/2(r →∞)

1/r(d−2) (ξ →∞) .(98)

The average of the product of two values of the order parameter at two points adistance r apart does not vanish unless r � ξ, although 〈ψ(r)〉 is zero for T > Tc.

From the last expression in Eq. (98), one can again read off the lower criti-cal space dimension dl = 2. Correlations should decay with distance, not growindefinitely. The more general (perplexing) observation that hydrodynamic fluc-tuations can cause the complete breakdown of hydrodynamic theories below a

32

lower critical dimension dl (not only a lack of precision, as below the upper crit-ical dimension) has been called hydrodynamic suicide23. It is not an artifact ofthe theory but due to a real physical effect. Due to the reduced number of con-straining neighbors, long-wavelength hydrodynamic excitations and modes tryingto restore broken symmetries (also called Goldstone modes) become all pervasiveat and below the lower critical dimension dl = 2 at any T > 0. That this pre-vents the phase ordering phenomena observed in higher dimensions (althoughnot phase separation as such) for all continuous order parameters in d ≤ 2, if theinteractions are not themselves long-ranged, has been proved in great generality(“Mermin–Wagner theorem”). It implies that there are, in the thermodynamiclimit, no (proper) one- or two-dimensional crystals, which hints at the fact thatthe physics of membranes and polymers is dominated by fluctuations. Phasetransitions exactly at d = dl (first discussed by Kosterlitz, Thouless, Nelson andHalperin) are quite subtle and have kept statistical physicists busy for decades.

There is another important subtlety related to the second Eq. (98). Note thatit makes a prediction for the behavior inside the critical region, where one shouldactually not trust the prediction of the Ginzburg theory. Precise measurementshave indeed revealed small systematic deviations from the Ginzburg form forr � ξ in d < 4 space dimensions. They have (first phenomenologically) beenrationalized by introducing a so-called “anomalous dimension” η, writing

〈|ψq|2〉 ∝ q−2+η , 〈ψ(r)ψ(0)〉 ∝ 1/rd−2+η , ξ →∞ . (99)

The name derives from the perplexing observation that this relation apparentlyviolates ordinary dimensional analysis. It implies that there must be a subtle hid-den dependence on the microscopic length scale `, even — or rather in particular— at the critical point, where `/ξ → 0. This is a consequence of the strong cou-pling of fluctuations on all scales (large and small) at the critical point. As a sideremark, Eq. (99) implies the (universal) exponent relation γ = ν(2−η). It followsfrom the compressibility sum rule that relates the order-parameter fluctuationsin the form of the integral over h(r) to the corresponding response coefficient (theisothermal compressibility). This relation clearly is not limited to gases or fluids.For a magnet, one simply has to replace particle number fluctuations 〈N2〉 by spinnumber (or magnetization) fluctuations and the isothermal compressibility κT bythe magnetic susceptibility χT to arrive at the same conclusion via Eq. (98).

Scaling hypothesis and exponent relations

As pointed out above, a major shortcoming of Landau theory and density-functional theories, in general, is their failure to properly predict the non-trivialbehavior at continuous phase transitions and critical points. Phenomenologi-cally, the observation of exponents with irrational values can be rationalized by

23W. Brenig, Statistical Theory of Heat, Nonequilibrium Phenomena.

33

the postulate of a generalized homogeneity relation for the Landau free energy L,or rather its singular part Ls. This postulate is known as the scaling hypothesis

λLs(t, h) = Ls(λatt, λahh) , (100)

where λ is a positive scale factor and at and ah are real numbers. In particular,choosing λ = |t|−1/at , Eq. (100) reduces to

Ls(t, h) = |t|1/atLs(±1, h/|t|ah/at) ≡ |t|1/atL±(h/|t|ah/at) . (101)

Landau’s analyticity assumption for L is now replaced by regularity requirementsfor the scaling function L± and its first and second derivatives at the origin,complemented by the asymptotic condition L′± ≡ ∂xL±

x→∞∼ −x1/δ (to reproducethe critical isotherm). Then Eq. (101) immediately produces the desired power-law behavior for t→ 0:

CV ∝ −T∂2tLs(t, h = 0) ∝ |t|1/at−2L±(0) (102)

ψ ∝ − ∂hLs|h=0 ∝ |t|(1−ah)/atL′±(0) (103)

χT ∝ − ∂2hLs∣∣h=0∝ |t|(1−2ah)/atL′′±(0) (104)

ψ(h) ∝ −∂hLs ∼ −|t|(1−ah)/atL′±(h/|t|ah/at) (105)

∝ |t|1/at−ah/at(1+1/δ)h1/δ (t→ 0) .

Comparison with the definitions in Eqs. (85)-(89) altogether yields

α = 2− 1/at (106)

β = (1− ah)/at (107)

γ = (2ah − 1)/at (108)

δ = ah/(1− ah) . (109)

The fact that two independent exponents create a substantial part of the Greekalphabet immediately explains the existence of exponent relations, such as 2−α =2β+γ, which historically motivated the scaling hypothesis, in the first place. Alsonote that equipartition implies that for T ≈ Tc correlation volumes ξd have a freeenergy Ls equal to the thermal energy kBTc. In other words, the density Ls scaleslike kBTcξ

−d, hence ξ−d ∼ |t|1/at or dν = 2− α (so-called hyper-scaling).The scaling hypothesis is an entirely phenomenological approach to critical

phenomena. It postulates a specific scaling form of the free energy but does notgive a clue how it might arise from a microscopic description. To find this out, itis useful to consider some toy models, which is the strategy pursued in the nextsection.

3.2 Learning from toy models: the Ising chain

To gain further insight into the origin of non-trivial critical behavior below theupper critical dimension, it is useful to analyze toy models and try to extract ageneral strategy for the study of critical behavior.

34

The Ising model

The Ising model consists of a regular lattice of N binary “spin” variables si,which can take the values si = ±1 (or si = ±1/2, si = 1, 0). Depending onthe application one has in mind, these variables can e.g. represent constrainedmagnetic moments that can only point up or down, or the presence or absence ofatoms in a lattice model of a gas, fluid or solid, votes in an election, firing statesof neurons, etc. etc. Depending on the type and dimensionality of the lattice andthe spin interactions, the Ising model may thus serve as an idealized minimalmodel for a large variety of complex physical, biological, or even social systemsencountered in the real world. In presence of an external field h, the Hamiltonianreads

H({si}) = −J∑

ip

sisj − h∑i

si (110)

Here ip stands for a sum over all Nip interacting pairs. Two very useful ide-alizations are provided by the extreme choices that the pair interactions areeither (1) van-der-Waals type infinite-range interactions (

∑ip →

∑i<j, Nip =

N(N − 1)/2), which produces mean-field behavior, and (2) nearest-neighbor in-teractions (

∑ip →

∑〈ij〉, Nip = Nq/2, with q being the coordination number, i.e.

the number of nearest neighbors per lattice site), which produces non-trivial crit-ical behavior in and below 4 dimensions24. Ferromagnetic and anti-ferromagneticinteractions can be realized by choosing a positive or negative J , respectively.More generally, one may consider a spin glass with a non-thermal (quenched)distribution of interactions Jij, if one is after a truly complex model.

Historically, Onsager’s 1944 solution of the two-dimensional Ising model withnearest neighbor interactions settled debates about the possibility of non mean-field values for critical exponents, and whether these can be deduced from apartition sum, at all. In the 1950’s, the theory of Yang-Lee and Fischer zerosfinally elucidated the general mathematical mechanism how a singular free energymay emerge from a sum of infinitely many positive analytical Boltzmann factors.

Transfer matrix for the Ising chain

The 1-dimensional Ising chain with nearest neighbor interactions and periodicboundary conditions calls for the popular method of the transfer matrix. Note

24Writing si = σ+(si−σ) in terms of the mean spin magnetization σ = 〈si〉 plus fluctuationssi − σ around it, and neglecting correlations of fluctuations according to

−Jsisj = Jσ2 − Jσ(si + sj)− J(si − σ)(sj − σ) ≈ Jσ2 − Jσ(si + sj) , (111)

amounts to treating the spins as independently exposed to the self-generated mean field Jqσ,i.e. to the mean-field approximation, which becomes exact for large q → N − 1 and J → I/N .

35

that the partition sum

Z =∑{si=±1}

e−βH =∑{si=±1}

eβ∑

i

[Jsisi+1+hsi

]. (112)

decays into a product of structurally identical terms

Z =∑{si=±1}

∏i

eβJsisi+1+βh(si+si+1)/2 . (113)

This can be rewritten using an operator T with matrix elements

〈si|T |si+1〉 = exp[βJsisi+1 + βh(si + si+1)/2] (114)

and matrix representation(exp[β(J + h)] exp[−βJ ]

exp[−βJ ] exp[β(J − h)]

)(“transfer matrix”) (115)

in the spin space with basis {|+〉, |−〉} for si = ±1. With T one can write

Z =∑{si=±1}

〈s1|T |s2〉〈s2|T |s3〉 . . . 〈sN |T |s1〉 . (116)

Using the completeness of the spin basis, this reduces to

Z =∑s1=±1

〈s1|TN |s1〉 = trTN = λN+ + λN− , (117)

where λ+ and λ− denote the transfer matrix eigenvalues

eβJ[cosh βh± (e−4βJ + sinh2βh)1/2

]∼

eβJ ± e−βJ (βh = 0)

2 cosh βh , 0 (βJ = 0)

eβJ±β|h| (βJ →∞)

. (118)

Emergence of singularities for T → 0, N →∞

Observe that for finite βJ and βh there is a non-degenerate largest eigenvalueλ+ > λ−, which is an analytical function of βJ and βh. (This is the Frobenius–Perron theorem for matrices with finite positive matrix elements). So for finitematrix entries (finite βJ and βh), the larger eigenvalue λ+ will dominate thepartition sum in the thermodynamic limit (N → ∞), corresponding to the freeenergy per spin

βf = limN→∞

βF

N= − lnλ+ =

− ln[2 cosh βJ ]− (βheβJ)2/2 (βh→ 0)

− ln[2 cosh βh] (βJ = 0)

−βJ − |βh| (βJ →∞)

.

36

In the high-temperature limit (βJ → 0), F reduces to −TS with S = kBN ln 2the entropy of the up-down degeneracy for each spin in a paramagnet. Becauseof the analytic parameter dependence of the transfer matrix, it is only the low-temperature limit βJ → ∞, in which its entries diverge or vanish, that canproduce the non-analytical behavior characteristic of a phase transition. Takingderivatives of f with respect to h, gives the site magnetization

−∂βhβf = 〈σ〉 =sinh βh√

sinh2βh+ e−4βJ∼

βhe2βJ (βh→ 0)

tanh βh (βJ → 0)

sgn(βh) (βJ →∞)

(119)

and the susceptibilityχ = ∂h〈σ〉|h=0 = βe2βJ . (120)

The latter reduces to the well-known Curie law χ ' β (the constitutive equationfor a paramagnet) at high temperature (βJ → 0). Its strange (non-powerlaw) di-vergence25 at low temperature (βJ →∞) suggests the unusual critical parametert = e−βJ and γ = 2.

Now, reversing the order of limits, i.e. keeping N finite and taking the limitβJ →∞ first, and using (1 + x/N)N ∼ ex, Eqs. (117) & (118) give for small βh

Z = eNβJ[(1 + |βh|)N + (1− |βh|)N

]= eNβJ2 cosh(Nβh) , (121)

hence

βf ≡ βF

N= −βJ − 1

Nln[2 cosh(Nβh)]

N→∞∼ −βJ − |βh| . (122)

While the ultimate result in the double limit βJ, N → ∞ is the same as foundabove by taking the limit in the reverse order, it is interesting to see also thesub-leading behavior and to analyze the non-extensive contributions for finite N .The argument of the cosh is no longer the energy h of a single spin in an externalfield, as in a paramagnet, but instead the energy Nh of N aligned spins in thefield h. This is reflected in the site magnetization

〈σ〉 = −∂hf = tanhNβh ∼ sgn(βh) , (123)

and in the susceptibility

χ = −∂2hf =

Nβ

cosh2Nβh∼ 2βδ(βh) . (124)

They both clearly display how the singularity develops (for βJ → ∞) fromsmooth functions as a consequence of spontaneous collective behavior, upon tak-ing the thermodynamic limit N →∞. For T → 0 all spins align and are turned

25This “Schottky anomaly” is due to the finite energy gap between the ground state (T = 0)and its excitations (there are no Goldstone modes with arbitrary small energy).

37

by an infinitesimally small field like a single giant spin S = Nsi, hence χ = βN ,for h = 0. In contrast, once the giant spin has been oriented by an infinitesimalfield, increasing h has no further effect, so that χ = 0 for h 6= 0.

The above discussion of the order of the limits βJ →∞ and N →∞ corrob-orates the physical intuition that, for any finite J 6= 0, a finite chain will orderat T = 0, while an infinite chain will always be disordered at any T > 0. Italso provides a straightforward interpretation of Eq. (121), which is simply thepartition sum of a paramagnet consisting of a single spin of length N (shifted bythe ground state energy −JN), as could have been guessed without calculation.

The consequence of the formation of a giant spin can also be seen in thebehavior of the specific heat ch = −T∂2

Tf . Starting from Eq. (122), which cor-responds to the taking the limit N → ∞ after βJ → ∞, reveals a tiny specificheat

chkB

=N(βh)2

cosh2Nβh∼ p

N[δ+βh + δ−βh] . (125)

The symbolic notation δ± refers to a Kronecker−δ centered around slightly pos-itive or negative arguments, respectively, and p ≈ 0.44. As all spins align intoa giant spin S = Nsi in the low-temperature limit, the free energy becomessimply the ground state energy −J − |h|. The only remaining entropy is thenon-extensive contribution kB ln 2, as seen by setting h = 0 in Eq. (122) beforetaking the thermodynamic limit N →∞. It corresponds to the two-fold groundstate degeneracy, i.e. to turning the giant spin. For this to happen the field hasto be switched off, since otherwise S is firmly aligned along h. Accordingly, onlya small, non-extensive amount ≈ kB of heat can be taken up, altogether, so thatthe specific heat vanishes in the thermodynamic limit. Yet, the limit in Eq. (125)is subtle and interesting. To reveal the specific heat, the ergodicity of the en-semble has to be broken artificially by an infinitesimally small field (|h| > 0) tosingle out the up or the down component. (This is what typically would au-tomatically happen in a real experiment.) The giant spin can then explore theother half of the ensemble after the field is switched off (h→ 0) and some heat isprovided. This doubles the accessible phase space volume, which manifests itselfas a (tiny) specific heat. In contrast, setting βh = 0 from the outset would haveallowed both sub-components of the phase-space to remain equally populated.This would have killed the specific heat entirely, as it formally corresponds toT =∞, meaning that the system cannot be heated up.

3.3 The idea of renormalization

The qualitative idea of the renormalization group and ε−expansion.

Remember that, according to the Ginzburg criterion, the critical region shrinksto zero above the upper critical dimension, du = 4, so that the simple theoriesapply even at the critical point. In that sense, a dimension d > du is as good

38

as infinitely many dimensions. A successful strategy to gain access to the truecritical exponents in dimensions d < du aims to exploit this by extrapolatingthe predictions of the mean-field theory down from the upper critical dimensiondu. One systematic scheme to do so is the famous asymptotic expansion in the“small” parameter ε ≡ 4− d known as the ε−expansion.

The intuitive picture behind this is the following. Above the upper criticaldimension du, there are enough neighbors to constrain the fluctuations to themolecular scale `, so that the deterministic thermodynamic theory applies at allscales much larger than `. Upon crossing du, fluctuations start to mess up the sim-ple theory. Fluctuations on all scales get coupled to each other and ultimately tothe microscopic scale ` in a subtle way, which is why simple dimensional analysisfails and exponents acquire complicated non-rational values, related to each otherby the scaling hypothesis and the ensuing exponent relations. The spatial struc-ture of the order parameter fluctuations ψ(r) corresponds to a homogeneity —in the statistical sense — of a strangely ramified self-similar or “fractal” patternof heterogeneities, and this is the physical origin of the mathematical structurepostulated in the scaling hyothesis. The renormalization group (“RG”) targetsand exploits precisely this non-trivial spatial self-similarity of the order param-eter fluctuations, i.e. of all of the physics described by the Landau–Ginzburgtheory and therefore the theory itself. The RG attributes this special symmetry(a statistical dilation symmetry) to a fixed point of a semi-group transformation.This RG-transformation consists of two steps: an incremental coarse graining (orlow-pass filtering) and rescaling. Above and below the critical point (i.e. awayfrom the RG fixed point), the repeated application of the RG-transformationtakes one to the totally disordered and ordered states corresponding to T = 0and T → ∞, respectively. Right at the fixed point, where the pattern of order-parameter fluctuations is fully self-similar, the pattern remain unchanged (in thestatistical sense). In terms of the mathematical structure of the theory, this meansthat, in the vicinity of the fixed point, the “true” Landau–Ginzburg functionalreproduces itself (up to sub-leading terms) with rescaled parameters under theRG-transformation. And from the parameter rescaling one reads off the criticalexponents just as one reads off the exponent γ in the ordinary Landau theoryfrom the prefactor of the quadratic term in L(ψ). Above du, the whole procedureis trivial, corresponding to the conventional homogeneity of the ordinary Lan-dau functional, and simple rational exponents, as familiar from van-der-Waalstype models. But for intermediate dimensions dl < d < du, the second, moreinteresting type of self-similarity emerges, corresponding to a “non-trivial” newfixed point of the RG-transformation and odd numerical values of the exponents.This is, what one tries to get hold of perturbatively, using the ε expansion toextrapolate from du to d.

39

Decimation transformation

To see how the RG-idea works in practice, it is useful to consider a simple example,where it is not masked by technicalities of the perturbation theory. The exactlysolvable Ising chain is a suitable toy model, although it suffers from two flaws,namely its discrete state space and the associated Schottky anomaly and strangeexponential form of the critical parameter t, and the fact that Tc = 0 so thereis no “other side” of the critical point. The starting point is a new perspectiveonto Eq. (116). Instead of trying to do the sum directly to obtain the free energy(which would be completely out of reach for more interesting models), one asksthe following question: how does the partition sum (or the Hamiltonian) evolveupon taking a partial trace? The partial trace is the technical realization ofthe above mentioned incremental coarse-graining. For the simple example of theIsing chain with nearest-neighbor interactions, it can be performed exactly andis particularly simple. Due to the completeness of the basis, tracing out everysecond spin simply amounts to deleting all terms

∑s2i=±1 |s2i〉〈s2i| = 1 for integer

i in Eq. (116):

ZN(K) =∑

{s2i−1=±1}

〈s1|T 2|s3〉 . . . 〈sN−1|T 2|s1〉 . (126)

This defines a new Ising chain with only half as many spins and twice the latticeconstant of the original chain. In other words, ZN can be interpreted as the par-tition sum ZN of a chain with parameters that have been renormalized accordingto the rule

N → N = N/2 , `→ ˜= 2` , (127)

and with a new transfer matrix T ∝ T 2 instead of T . Specializing to the fieldfree case, h ≡ 0, and writing the original transfer matrix in the form

T =

(t−1 tt t−1

)with t ≡ e−βJ , (128)

one observes the following transformation rule “under renormalization” (i.e. underthis procedure of simultaneously taking a partial trace and rescaling the systemsparameters to map the coarse-grained system onto the original system):

T → T 2 =

(t−2 + t2 2

2 t−2 + t2

)= 2t−1

(t−1 tt t−1

)≡ 2t−1T . (129)

Up to the overall factor 2t−1, we thus have

t→ t with 2t−2 ≡ t−2 + t2 = 2 cosh 2βJ , (130)

for the matrix elements. This corresponds to

2K → 2K = ln cosh 2K (131)

40

for the dimensionless coupling strengths K ≡ βJ , respectively, which shows thatthe renormalization steps can be thought of as causing a shift in temperatureor interaction strength. To make the model exactly self-similar under the RGtransformation, the partition sum ZN and the corresponding free energy perspin, f ≡ −(lnZN)/(βN), have to absorb the remaining overall factors 2t−1 inEq. (129) in each step. Their transformation rules thus read

ZN(K) = 2N coshN/2(2K)ZN(K) , (132)

(N/N)βf(K) = βf(K)− K − ln 2 . (133)

This is suggestive of a singular part of the free energy that scales in N/N or,for a regular lattice with lattice unit ` in d = 1 space dimension, ˜d/`d = ξd,according to Eq. (127). This is in agreement with the scaling hypothesis andthe hyperscaling relation (to recover the full scaling form, the dependence on theexternal field h would have to be considered, as well).

The above recursion relations under decimation are called “renormalizationgroup equations”. They can be read as an equation of motion of the couplingsand the free energy under coarse-graining. By solving them, one obtains therenormalization group flow. For example, for a ferromagnet with K > 0, anyfinite coupling strength is driven towards the origin (K = 0), corresponding tothe high-temperature limit of a paramagnet, with the RG-flow given by Eq. (131)(sketch). The limits K = ∞ and K = 0 thus correspond to the unstable andstable fixed point of the renormalization group equations, describing total orderand disorder of the spins, respectively. A typical chain conformation at lowtemperature may have large aggregates of aligned adjacent spins, but these willshrink in successive coarse-graining steps and eventually degenerate into singlespins pointing in arbitrary directions. After sufficiently many coarse-grainingsteps, the system looks like a pure paramagnet. With a conformal mapping (inthe variable t2) to the new critical parameter

z ≡ 1− t2

1 + t2= tanhK ≤ 1 , (134)

Eq. (131) takes the strikingly simple form

z = z2 (135)

which makes the above statements most obvious (iteration clearly takes any z < 1to zero). It moreover gives direct access to the critical exponent ν if one considersξ(z) near the unstable critical point (corresponding to the phase transition).Since the decimation transformation, as a purely formal renaming exercise doesof course not change the physics of the system, it leaves the physical correlationlength invariant,

ξ ≡ ˜ξ(z) = `ξ(z) = const. , (136)

41

but shrinks the dimensionless correlation length ξ, measured in units of the latticeconstant ` that increases by ˜/` = 2 during each renormalization step:

ξ(z) = ξ(z2) = ξ(z)`/˜= ξ(z)/2 . (137)

The solution of ξ−1(z2) = 2ξ−1(z) is

ξ−1 ∝ − ln z = − ln tanhK ∼ 2e−2K ∝ t2 , (138)

where the final asymptotic result holds for large K = βJ (small t), correspondingto the low-temperature limit. This is in accord with the above identification oft = e−βJ as the critical parameter and the value ν = 2 for the critical exponentcharacterizing the divergence of the correlation length ξ. Together with the aboveobservation that γ = 2, and the exponent relation γ = ν(2 − η) deduced fromEq. (99), one finds η = 1.

Yang-Lee zeros

The above calculations have shown explicitly how thermodynamic quantities de-velop non-analytic behavior as a function of the coupling strength, the externalfield, and the particle number. Another way of looking at what is going on herefocuses on the zeros of the partition sum in the complex field or temperatureplane for large βJ and N . Starting from the partition sum for finite N andβJ → ∞, corresponding to the free energy in Eq. (122), one has to find thecomplex solutions of

Z = 2eNβJ coshNβh = 0 ⇒ βh = ±iπ(n− 1/2)/N n ∈ N . (139)

While these lie on the imaginary βh-axis, they close up to the origin for increasingN and eventually accumulate there in the limit N → ∞ — thereby destroyingthe analyticity at the origin by literally cutting the real axis into two halves.

Similarly, starting from the partition sum for vanishing field (h = 0), andusing Ne−2βJ � 1 and (1 + x/N)N ∼ ex for large N , one finds

Z =(eβJ + e−βJ)N + (eβJ − e−βJ)N = eNβJ [(1 + e−2βJ)N + (1− e−2βJ)N ]

∼2eNβJ cosh(Ne−2βJ)(140)

and thus the zeros of Z in the complex “temperature” plane from

Z = 2eNβJ cosh[Ne−2βJ ] = 0 ⇒ t2 ≡ e−2βJ = ±iπ(n− 1/2)/N n ∈ N .(141)

Note that the unusual temperature parameter t = e−βJ has again been employed.As for the complex h−plane, there is an accumulation of zeros that cut thecomplex t−plane into pieces at the origin for N → ∞. The zeros approach on

42

the imaginary axis of the complex t2−plane or along the diagonals in the complext-plane, respectively.

The conformal mapping Eq. (134) is convenient for establishing a connectionbetween the zeros of the partition sum and so-called Julia sets. It maps the imag-inary axis of the complex t2−plane onto the unit circle of the complex z−plane.Of the four special points t2 = ±i, 0,∞ the first two are invariant under theconformal mapping (they go to z = ±i), the remaining two (corresponding to thelow temperature fixed points for the ferro- and antiferromagnet, respectively) goto z = ±1. Equation (134) therefore also maps the complex zeros of the parti-tion sum in the complex t2−plane, which all lie on the imaginary axis, onto theunit circle in the complex z−plane. The zeros of the partition function are mosteasily deduced from the factor coshN/2 2K in Eq. (132) that absorbs the zerosthe partition sum looses upon decimation. Reverting this argument, one shouldget the partition sum zeros of long Ising chains by applying the inverse of thedecimation transformation to a short chain. Starting with a ring made of twospins,

Z2 = tr T 2 = 2(t−2 + t2) = 0 ⇒ t22 = ±i ⇒ z2 = ±i , (142)

and working backwards by repeatedly taking the square-root of z2 (i.e. simplydividing its phase angle by two), one finds the complex zeros of an Ising chainmade of N = 2m spins,

z2, . . . , zN = (±i)1, . . . , (±i)1/2m = e±iπ/2, . . . , e±iπ/2m

. (143)

Note that these zeros accumulate densely on the real axis at the low-temperaturefixed point at z = 1 (corresponding to t → 0, K → ∞, T → 0, respectively),in the limit N , m → ∞. And it definitely matters whether you are on the leftor right of the point where the real axis is pinched, or which limit is taken first,N → ∞ or T → 0. The accumulation of complex zeros therefore destroys theanalyticity of the free energy on the real z−axis at z = 1, (and thus on thereal t2−axis at t2 = 0 and on the real K−axis at K → ∞) and divides it intoanalytically disconnected regions.

The unit circle in the complex z−plane represents the boundary between thepoints that iterate to the trivial fixed point z = 0 (corresponding to the param-agnet with t2 = 1, K = 0, T → ∞) under the renormalization transformation,Eq. (135), and those iterating to infinity (which have no obvious physical inter-pretation for the Ising chain). It is an invariant set of Eq. (135) and its inversethat dissects the complex plane into disjoint regions. One can ask the generalquestion, how such sets look like for more complicated transformation rules, e.g.for the rule z → z2 + c. This produces somewhat more impressive sets of Ju-lia sets. And one may wonder for which values of c these are connected, whichproduces the famous Mandelbrot set. Illustrations can e.g. be found in the book“The Beauty of Fractals” by H.-O. Peitgen and P. H. Richter.

43

Recovering the scaling hypothesis and finite-size scaling

The non-analytic strong coupling behavior revealed by the above discussion isindicative of a phase transition at T = 0. It is worthwhile checking whether thegeneral framework of the scaling hypothesis is applicable here. Recall that thesingular part fs of the free energy density was supposed to diverge as t1/at = t2−α

in absence of an external field (h = 0), where by convention t = T/Tc − 1 isused to measure the dimensionless distance from the critical point. This was thescaling hypothesis, which provided a neat phenomenological explanation of theobserved non-trivial divergencies at the critical point T = Tc. However, the abovediscussion of the Ising model has suggested the very unusual critical parametert = e−βJ and Tc = 0, instead. These artifacts of the simple toy model make thecomparison a bit indirect.

The existence of a single characteristic diverging length scale ξ ∝ t−ν nearthe critical point together with dimensional analysis suggests that fs, as a spatialdensity, should obey fs ∝ ξ−d ∝ tdν . From the scaling hypothesis follows fs ∝t2−α, hence the exponent relation

dν = 2− α (144)

known as the hyperscaling relation. With d = 1 and ν = 2, one finds α = 0as required by the absence of a divergence in the specific heat. From Eq. (120),γ = 2 was deduced, which together with the exponent relation 2 − α = 2β + γimplies β = 0, in accord with the absence of a spontaneous magnetization forany finite coupling strength. To check the scaling ansatz for the singular partof the free energy density, one starts from the free energy f derived from thelargest eigenvalue Eq. (118), keeping only the leading order terms in the criticalparameters e−βJ and βh for large βJ , to find

βf ∼ −βJ − e−2βJ√

1 + (βhe2βJ)2 . (145)

The singular part fs of the free energy f per spin, which is up to a divisionwith the lattice constant the same as the free energy density f, is indeed of theexpected scaling form

βfs = βf + βJ ∼ −ξ−1ϕ(βhξ) = −tνϕ(βh/tν) = −t2−αϕ(βh/tγ+β) (146)

where the last expression corresponds to the general scaling hypothesis as intro-duced above. Similarly, for large (but finite) N , βJ and βh = 0, the free energyfollows from Eq. (140) as

βfs ∼−1

Nln[(2 cosh(Ne−2βJ)]

)∼− e−2βJ

[Ne−2βJ −N3e−6βJ/6 + . . .

]/2

∼− ξ−1φ(N/ξ) (βJ →∞ , N →∞)

(147)

44

where φ = O(1) should hold for large and small arguments, to reproduce thecritical behavior, and because the free energy density must be intensive. The finitesize scaling relation, Eq. (147), clarifies that any finite chain already becomes fullycritical at a length-dependent finite distance from the critical point, namely assoon as ξ exceeds N , so that φ(N/ξ) ∼ φ(0) saturates. Or, turned the otherway round, one can find the exponent ν from comparing the free energy of finitechains of different lengths N . By tuning the temperature such that the singularpart of the free energy density vanishes like fs ∝ 1/N , one knows that one hasachieved N/ξ = constant, so that one can infer the divergence of the correlationlength with temperature. The method is not limited to the free energy, of course,and is in fact a standard trick used in computer simulations to extract precisevalues of critical exponents for systems of finite size.

4 Stochastic Thermodynamics

Another way of making Ginzburg–Landau theory useful for the calculation ofcritical phenomena is to explicitly dress it up with fluctuations. One adds a forceterm that represents the thermal noise and drives the hydrodynamic equations ofmotion corresponding to Eq. (96) — i.e. the Euler–Lagrange equation pertainingto the GL free energy. Apart from critical phenomena (which are not pursuedany further, here), such an approach is useful for many things, in particular fordescribing the mesoscopic dynamics of many-body systems, which is neither fullydeterministic, as the macroscopic behavior, nor quite as wild as the molecularchaos — and therefore often quite interesting. One usually first writes downthe hydrodynamic equations of motion for the slow modes, which are typicallydensities of conserved quantities and broken-symmetry variables. These are thencomplemented by random forces representing the influence of the irregular dy-namics of all the other degrees of freedom. For a system in thermal equilibrium,with a strong scale separation between the slow modes and the fluctuating “mi-croscopic” degrees of freedom, there is no need to actually derive the dynamicsof the thermal forces. One can simply postulate their statistical properties inaccord with fundamental symmetries and the structure of the Gibbs ensembles.Einstein’s paper on Brownian motion provides a paradigm for this efficient andconvenient approach.

4.1 Einstein 1905: Brownian motion in a nutshell

Einstein’s paper26 from 1905 nicely summarizes, in a nutshell, the various levelsof a statistical mechanics descriptions of a many-body system with a scale separa-tion between microscopic and mesoscopic degrees of freedom, namely thermostat-

26Uber die von der molekulartheoretischen Theorie der Warme geforderte Bewegung von inruhenden Flussigkeiten suspendierten Teilchen, Annalen der Physik, Leipzig 17 (1905) 549

45

ics, thermodynamics, and stochastic thermodynamics. It exploits the “middle-world”27 character of Brownian motion, intermediate between the molecular andthe macroscopic world, to arrive, in a few lines, at the simplest versions of thecorner stones of equilibrium stochastic thermodynamics.

Brownian motion, as Einstein puts it, is the thermal motion of particles sus-pended in a solvent, which are small enough to jiggle perceptibly, but large enoughto be visible in the microscope. Einstein proposed to study this motion as a probeof the atomistic structure of the (then) invisible molecular world, i.e. to exploitthat it provides a scaled-up and more tangible model of the invisible molecularworld. Indeed Perrin received the Nobel prize in 1926 “for proving atoms real”on this basis, which eventually quieted protests by those calling themselves “En-ergetiker” (Ostwald and some other anti-atom freaks in Leipzig). ParaphrasingEinstein somewhat, one could also say that Brownian motion is the thermal mo-tion of dispersed particles small enough to jiggle perceptibly, but large enough toadmit some coarse-graining over the solvent degrees of freedom. In fact, Einsteinactually describes the solvent in the simplest possible hydrodynamic approxima-tion (known today as the “Markov approximation”), which greatly simplifies histask. This statement may justly prompt the question how this could ever leadto a proof of the atomic structure of matter, and a proper historian would more-over note that what is called the Einstein relation, below, was first derived byan Australian researcher named Sutherland. But then, you should of course notmistake our common myths of scientific discovery for a proper history of science.

Step 1: thermostatic force balance

Einstein argues against the view — still common at the time — that suspendedparticles may be dismissed in thermodynamics. He insists that N particles, eachof volume v◦, which are suspended in a solvent at small number density n� v−1

◦represent an (almost) ideal gas and therefore give rise to an osmotic pressure28

p = nkBT , (148)

thereby exerting a finite, albeit small, thermodynamic force if spatially restricted.In presence of a constant volume force K (e.g. gravity, K = −mgez) acting onthe particles one thus has the force balance

n(r)K = ∇p(r) , (149)

or, using Eq. (148) in Eq. (149), the barometer equation

p(r) = n(r)kBT ∝ eK·r/kBT = e−mgz/kBT (150)

in isothermal equilibrium (T = const.).

27Mark Haw: Middle World, the restless heart of life and matter, MacMillan Science.28A truly thermostatic approach would use the gas constant and count matter in moles.

46

Step 2: detailed balance of thermodynamic fluxes

In a second step, Einstein suggests to reconsider this stationary balance of ther-modynamic forces as a dynamic balance of the thermodynamic fluxes excited bythese forces, namely the diffusion and drift currents

−jdiff = jdrift ≡ nvdrift . (151)

He employs two (linear-response) constitutive equations that were well estab-lished by 1905 to express the currents in terms of the forces: Fick’s first law ofdiffusion29 from 1855,

jdiff = −D∇n , (152)

and Stokes’ law for the viscous drag on a suspended particle in uniform motionfrom 1851,

vdrift = K/ζ . (153)

The two response coefficients or susceptibilities, ζ and D, characterize the frictionand diffusivity of the Brownian particles, respectively. Eliminating K and nbetween the force and flux balance equations (149) and (151), one infers thecelebrated Einstein relation30,

D = kBT/ζ . (154)

This is a relation between two transport coefficients appearing in two general-ized hydrodynamic equations, namely the above equations by Stokes and Fick.What made Eq. (154) historically remarkable, is the appearance of the Boltz-mann constant kB (then still called Planck’s constant) originally written in theform of the gas constant divided by Avogadro’s number (then called Loschmidtnumber). The presence of this truly microscopic quantity in Eq. (154) meant thatit could be deduced from measurements of macroscopic (or at least mesoscopic)quantities, alone. Intriguingly, Eq. (154) thereby indicated a relation of the ki-netic coefficients D, ζ in the hydrodynamic equations by Fick and Stokes to themicroscopic, molecular world and its discrete atomistic structure (the ratio D/ζwould vanish in the continuum limit of infinite Loschmidt number). In the jointsbetween some unsuspicious smooth continuum equations lurked, much to the dis-may of Ostwald, the grotesque face of the atomic world. In this sense, Eq. (154)is reminiscent of another famous relation put forward by Einstein at about thesame time, E = ~ω, and played an equally important role for establishing thediscrete atomistic structure of matter. In contrast to the latter, Eq. (154) wasnot postulated but derived — namely from the (at the time neither new nor gen-erally accepted) assumption that thermodynamics applies to suspended particles.There was no reference, though, to the molecular structure of the solvent thatEinstein tacitly coarse-grained away.

29Combining it with mass conservation, ∂tn+∇ · jdiff = 0, yields Fick’s 2nd law, Eq. (156).30“Stokes–Einstein” if ζ = 6πηa for a particle of radius a in a solvent of viscosity η is used.

47

Step 3: stochastic thermodynamics

The explicit link to the underlying thermal fluctuations is established at theend of Einstein’s paper, which undertakes a probabilistic derivation of the dif-fusion equation based on a particularly simple model for Brownian fluctuations,nowadays called a random walk. A random walk is just a caricature of the realphysical Brownian motion. But Einstein appeals to the universality of the hydro-/thermodynamic laws, anticipating that the simplest conceivable representativeof a huge universality class of microscopic systems will do as good a job as anyother for his purpose.

Einstein’s starting point is the interpretation of the particle concentration as aprobability density to find a single Brownian particle (for simplicity in one spacedimension) at position x. Using common sense, its value at time t + τ can beexpressed as a function of its values at time t,

n(x, t+ τ) =

∫dy n(x+ y, t)ϕτ (y) . (155)

The auxiliary “jump probabilities” ϕτ (y) to move a distance y in time τ in thisMaster– or Chapman–Kolmogorov -type equation, as it would be called today, hasto be normalized and (for a homogeneous space) symmetric. Einstein’s approachcircumvents the construction of an explicit microscopic dynamical theory of theprocess by Taylor expanding n(x, t) with respect to τ and y and imposing theseconditions on ϕτ (y). Matching the leading order terms on both sides, he recoversthe diffusion equation

∂tn(x, t) = D∇2n(x, t) (156)

along with a stochastic expression for the diffusion coefficient

2D =1

τ

∫dy y2ϕτ (y) =

〈δx2〉τ

=〈δr2〉dτ

. (157)

The expressions for D (with d the spatial dimension) are understood to be in-dependent of the time interval τ considered, as long as τ is large31 compared toa molecular collision time. As suggested by Eq. (157), and as you will verifyin greater detail in the exercises, a crucial point is that the mean-square dis-placement (MSD) per time unit τ remains a well defined quantity at least for alarge range of values of τ , quite in contrast to the velocity 〈|δx|〉/τ , which someexperimentalists had been trying to measure. What they naturally expected tobe a “good” variable on the basis of Newton’s law of motion turned out to bemeaningless, which reveals the crucial importance of a good theory (or a goodintuition) about what is worthwhile to be measured.

31Note that the derivation also invokes the dubious limit “τ → 0”.

48

Coda: Langevin’s noisy hydrodynamics

Three years after Einstein’s paper appeared, Langevin wrote a paper where heproposed another very elegant way of dealing with Brownian particles. In itscomplete version (due to Ornstein) it can essentially be understood as startingfrom Einstein’s final result, the link between thermal fluctuations and diffusivity(or friction). Einstein’s discussion of the random walk amounts to representingthe cumulative net random force exerted by the solvent molecules on the Brown-ian particle by a Gaussian random “noise” variable ξ(t) of zero mean, correlatedaccording to

〈ξ(t)ξ(t′)〉 = 2kBTζδ(t− t′) , (158)

TTI=⇒ 〈ξ(t)ξ(0)〉 = 2kBTζδ(t) . (159)

In the second line, the time-translation invariance (TTI) or “stationarity” ofequilibrium averages has been exploited to shift the origin of the time axis32 tot′. The fact that the product of the friction coefficient and temperature quan-tifies the strength of the thermal noise is general and goes under the name offluctuation-dissipation theorem (FDT) “of the second kind”. Now, one boldlyextends Stokes’ law, Eq. (153) (actually derived for stationary forces), to suchwildly time-dependent forces,

ζx = ξ (t > 0) . (160)

These two idealizations together (neglecting memory in the friction and in thethermal noise) amount to the so-called Markov approximation, which is alsoimplicit in Einstein’s derivation. It entails that the Fourier transform of Eq. (158),the so-called “power spectrum” (sometimes symbolically written as 〈ξωξ−ω〉)∫ ∞

−∞dt 〈ξ(t)ξ(0)〉eiωt = 2ζkBT (161)

is flat, i.e., it contains all frequencies with equal strength, and is therefore calledwhite noise. With the particular form of the noise correlations in Eq. (158), thelaws of Brownian motion as discussed by Einstein may completely be recoveredfrom this intuitive model. (You may try to show this as an exercise.)

Fluctuation-Dissipation & Green-Kubo relations

Some further, equivalent formulations are easily derived from Eqs. (158), (160).Integrating Eq. (158) over the whole time axis, one immediately gets the followingintegral representation for the friction coefficient33,

ζ =1

2kBT

∫ ∞−∞

dt 〈ξ(t)ξ(0)〉 =1

kBT

∫ ∞0

dt 〈ξ(t)ξ(0)〉 = limω→0

〈ξωξ−ω〉2kBT

(162)

32First write 〈ξ(t)ξ(t′)〉 = 〈ξ(t− t′)ξ(0)〉, then rename t− t′ as t.33Note that the integral over half of the δ−function, which is of course nothing but the

idealization of a narrowly peaked symmetric function enclosing unit area, is set to 1/2.

49

and, using Eqs. (154), (160), the diffusion coefficient

D =

∫ ∞0

dt 〈x(t)x(0)〉 . (163)

Here, the time-translation invariance of equilibrium averages has again been ex-ploited to shift the origin of the time axis to t′, and the time symmetry of thecorrelator allows to restrict the integration range to positive times. Such “Green–Kubo relations”, expressing a transport coefficient as a time integral over correla-tion functions of equilibrium fluctuations (or, equivalently, as the low frequencylimit of their power spectral density), are generally very useful for estimatingtransport coefficients, without actually forcing a system, in numerical simula-tions or experiments.

Integrating Eq. (158) only from −t to t, and using Eq. (160) and the timeasymmetry of 〈x(t)x(0)〉 = −〈x(−t)x(0)〉, one finds

1

ζsgn(t− t′) =

1

kBT∂t′〈x(t)x(t′)〉 . (164)

Rewriting this once more, using the causal response function or susceptibility〈δx(t)/δf(t′)〉 (that vanishes for t < t′) and

x(t) =

∫ t

−∞dt′′ ξ(t′′)/ζ (165)

yields the fluctuation-dissipation theorem (FDT) “of the first kind”

〈δx(t)/δξ(t′)〉 =θ(t− t′)kBT

∂t′〈x(t)x(t′)〉 , (166)

which is a dynamic generalization of the fluctuation response relations discussedearlier in the lecture. There are a couple of interesting (and quite general) obser-vations to be made, here. First, the step function θ(t) reconciles the causality ofthe response function with the time-inversion invariance of the equilibrium cor-relation function 〈x(t)x(t′)〉, which immediately follows from its time-translationinvariance (TTI)34, and the corresponding time asymmetry of its time derivative.Secondly, the signum function in Eq. (164) suggests a sign change of the frictionterm in Eq. (160) upon time reversal, a property referred to as “passivity” (frictionsucks — always, you cannot escape it, not even by running backwards in time),which is a reformulation of the second law. The discontinuity is a consequenceof the Markov approximation, namely the assumption of a δ−correlated thermalnoise. In the more general non-Markovian case, where one takes into account thepersistence (or memory) of the thermal forces, the discontinuous friction func-tion ζ−1sgn(t− t′) is replaced by a continuous time-asymmetric friction function

34Use 〈a(t)b(0)〉 = 〈a(0)b(−t)〉 = 〈b(−t)a(0)〉 for a ≡ b ≡ x.

50

2iχ′′(t) defined via 2iχ′′(t − t′)θ(t − t′) ≡ δx(t)/δf(t′). It is then customary tosuppress the step function in Eq. (166) by writing the FDT in Fourier space,using the (real) Fourier transform of χ′′.

Finally, to close the circle back to Einstein’s discussion, consider the timederivative of the mean-square displacement MSD = 〈[x(t)− x(0)]2〉:

∂tMSD = −2〈x(t)x(0)〉 = 2〈x(t)x(0)〉 = 2

∫ t

0

dt 〈x(t)x(0)〉 ≡ 2D(t) . (167)

The first equality follows from the stationarity of equilibrium averages (which im-plies, in particular, that one-point functions are time independent) and the secondfrom the time asymmetry of the correlation function 〈x(t)x(0)〉 = −〈x(−t)x(0)〉 =−〈x(0)x(t)〉. Thereby, Eq. (157) is recovered for sufficiently late times, for whichD(t) = D = constant and the Markov assumption holds.

4.2 Langevin, Ornstein, Fokker–Planck, and all that

The above crash course in stochastic (Brownian) dynamics provides a broadoverview over the key results in the field, but may provoke a number of ques-tions. The calculation of time-dependent correlation functions 〈a(t)b(0)〉 wasapparently achieved without ever explicitly implementing the equilibrium aver-ages 〈. . . 〉. In fact, the symbol 〈. . . 〉 acquired a new meaning (averaging over allpossible realizations of the thermal noise force ξ) in the Langevin picture, and itwould be desirable to understand its precise relation with the canonical definitionin the Gibbs ensembles. Moreover, one would like to make the relation betweenthe Langevin picture and the diffusion equation more explicit and to see howsuch equations emerge from a microscopic many-body description in terms of theLiouville equation. All this is what the present section is about.

Irreversible Thermodynamics

As a start, one may retrace Einstein’s steps, again. Step 1 is about thermostaticforce balances, a topic that was formalized in the first two chapters of this lec-ture. Step 2 enters dynamics in the sense of dissipative fluxes, but it simply stealsthe main equations from Fick and Stokes. Stokes arrived at Eq. (153) by solv-ing the hydrodynamic equation of motion of a slowly flowing liquid (the Stokesequation), which is essentially a mechanical-engineering problem. In contrast,Fick’s equation (152) and the diffusion equation (156) emanating from it, aresuch hydrodynamic equations. So how does one arrive at these hydrodynamic or“thermodynamic” (in the proper sense) equations, in the first place? As it turnsout, Fick’s first law may be obtained as an immediate consequence of Stokes’ lawv = K/ζ and the static force balance35 nK = −∇p invoked by Einstein, which

35Mind the minus sign: now the pressure is meant to cause (not balance) the force K.

51

allows to write the particle flux in the form

jn = nv = −∇pζ

= −(∂np)Tζ∇n = − 1

κTnζ∇n = −D∇n . (168)

Fick’s gradient diffusion coefficient can thus be expressed as

D =1

κTnζ=

D0

κTnkBT. (169)

The second equality employs the Einstein relation for the diffusion coefficientD0 = kBT/ζ of a single particle. Indeed, Fick’s D is seen to reduce to D0 (only)for a dilute suspension, for which the isothermal compressibility takes the ideal-gas form κT = (nkBT )−1.

Another way of looking at Fick’s first law is obtained with the Gibbs–Duhemrelation dµ)T = n−1dp, namely

jn = −D0n

kBT∇µ , (170)

which suggests gradients of the chemical potential µ(r, t) (rather than osmoticpressure gradients) as the driving force behind the phenomenon of diffusion. Thisperspective also naturally emerges from the appropriate fundamental relation ofan isothermal system, and betrays Fick’s law as a direct manifestation of thesecond law of thermodynamics. Namely, normalizing the free energy to a fixedvolume V , and interpreting the total differential as a time derivative, one has

df = µdn , ⇒ ∂tf = µ∂tn , ⇒ ∂tf +∇ · µjn = jn · ∇µ . (171)

The last step made use of the continuity equation for the particle density,

∂tn(r, t) +∇ · jn(r, t) = 0 , (172)

to recast the free energy balance into the form of a continuity equation with asource term. Since the divergence vanishes when the equation is integrated overthe total volume of a closed system, the change of the total free energy is entirelydue to this source term:

F (t) =

∫V

dr ∂tf(r, t) =

∫V

dr jn · ∇µ . (173)

Using Eq. (170), this becomes

F (t) = −∫V

drD0n

kBT(∇µ)2 ≤ 0 , (174)

which says that that diffusion consumes free energy and is therefore a dissipativeprocess in the sense of the second law. Starting from a macrosopically inhomoge-neous initial state, in which the system is only in a local (or partial) equilibrium,

52

it proceeds spontaneously to bring the system towards global equilibrium. Gradi-ents in the chemical potential or in the osmotic pressure can be identified as thenatural driving forces behind the diffusion flux jn. Also note that the dynamicfree energy F (t) owes its time dependence to the dynamics of the relevant systemvariables and simultaneously obeys F (t) ≥ 0 and F (t) ≤ 0. It thereby guaranteesglobal stability and thus qualifies as a so-called Ljapunov functional.

One may wonder how it is possible at all that equilibrium thermodynamicsor equilibrium statistical mechanics admit and even describe time variations atall. Above, the time dependence was smuggled in via the continuity equation36,Eq. (173). Therefore, the dynamic free energy F (t) is a unique and immediategeneralization of the thermostatic free energy. This indicates that the irreversiblethermodynamics thus obtained is a theory for processes close to equilibrium,essentially for relaxations into equilibrium. Such a straightforward procedureis not available for processes far from equilibrium, which may never reach ornot even approach equilibrium, such as non-equilibrium steady states or agingsystems.

Formally, one may deal with time-dependent equilibrium correlation functionsin the same way as with spatially varying equilibrium correlators that were in-troduced in the preceding chapters to generalize the (originally homogeneous)thermostatic relations. Global equilibrium is relaxed and only local equilibriumrequired, which now amounts to the assumption of an adiabatic time-scale sep-aration between the processes establishing the thermal equilibrium between thehydrodynamic variables at a certain instant in time and the slow drift of theircorresponding equilibrium values that eventually establishes global equilibrium.Historically, the generalization from spatially to temporally inhomogeneous sys-tems is known as Onsager’s regression hypothesis (or theorem), and it is equiv-alent to the classical FDT. In words, it states the dynamic equivalence of therelaxation of hydrodynamic variables (if not excited too far from their equilib-rium values) with that of spontaneous thermal fluctuations. The intuitive idea isthat a moderate hydrodynamic excitation could have been caused by a sponta-neous fluctuation, so that it should decay in the same way. Consider a variableM of vanishing mean 〈M〉 = 0 and a weak constant external field h that couplesto it via a small perturbation term −Mh in the Hamiltonian. The phase spacedensity and average with and without the field are defined as ρh, 〈. . . 〉h and ρ,〈. . . 〉, respectively. With this notation, the static fluctuation-response theorem,see e.g. Eq. (65), reads

〈M〉h = χh = 〈MM〉βh . (175)

The response is linear in the perturbing field with the linear susceptibility χgiven by the correlator 〈MM〉 over kBT . This result was originally derived byexpanding the Boltzmann factor for the perturbed density to leading order in the

36In the same way, Fick’s law (152) can be turned into the diffusion equation Eq. (156)

53

perturbation Hamiltonian, namely

ρh ∼ ρ(1 + βhM) = ρ(1 + M〈M〉h/〈MM〉) . (176)

The funny rewriting in the last expression utilizes Eq. (175) and serves to upgradeit to dynamic processes. The formal trick goes as follows. Assume that the fieldh has been constant since t = −∞ but is switched off at t = 0, i.e., h→ hθ(−t).Next, interpret all M without arguments as M(t = 0) and note that 〈M(t)〉 = 0can be dropped, while 〈M(0)〉h is a finite constant. Then, multiplication ofboth sides of Eq. (176) by M(t) and a phase-space integration formally yields anequality for t ≥ 0 that can be identified with the regression theorem:

〈M(t)〉h〈M〉h

=〈M(t)M〉〈MM〉

(t ≥ 0) (177)

The proper meaning of 〈M(t)〉h is

〈M(t)〉h = 〈M(t)〉|〈M(0)〉=〈M〉h , (178)

i.e. it denotes the free decay of 〈M(t)〉 from its initial value 〈M(0)〉 = 〈M〉h tozero. Hence, both sides of Eq. (177) are understood to decay from 1 to 0.

That this purely formal hocus-pocus does yield (though not prove) a sensibleresult is supported by the explicit discussion of the special case of an Ising spin,M = s. The conditional expectation value 〈s(t)〉s(0)=±1 of the spin at time t ≥ 0is easily expressed in terms of its possible values s(0) = ±1 at time t = 0, and sois the time-dependent two-point correlation function of the two:

〈s(t)〉 = (1/2)〈s(t)〉s(0)=1 + (1/2)〈s(t)〉s(0)=−1

〈s(t)s(0)〉 = (1/2)〈s(t)〉s(0)=1 − (1/2)〈s(t)〉s(0)=−1 .(179)

Here, the conditional probabilities are turned into absolute probabilities throughmultiplication with the probabilities for the initial spin values. Without an ex-ternal field, s(0) = ±1 both occur with probability 1/2. In presence of the weakfield hθ(−t), these probabilities change to e±βh/(eβh + e−βh) ∼ (1 ± βh)/2 toleading order in h, hence

〈s(t)〉h = 〈s(t)〉s(0)=1(1 + βh)/2 + 〈s(t)〉s(0)=−1(1− βh)/2 . (180)

The interpretation of the average on the left is the same as in Eq. (178). Using〈s(t)〉 = 0 for the unconditioned spin in zero field, Eq. (180) can be written as

〈s(t)〉h = 〈s(t)s(0)〉βh , (181)

which, with 〈s(0)〉h = βh and 〈s2〉 = 1, indeed corroborates Eq. (177).

54

Ornstein–Uhlenbeck: fluctuating hydrodynamics in a nutshell

The preceding paragraph explains why the above formalism was historically calledirreversible thermodynamics or nonequilibrium thermodynamics. But it shouldactually better be called generalized hydrodynamics or more simply thermody-namics if the latter notion had not been abused so much for something thatwould more appropriately be be called thermostatics. In any case, the theorydescribes thermodynamic processes by deterministic laws, driven by (free) en-ergy differences, as familiar from classical (continuum) mechanics. It is blind tothe atomistic thermal fluctuations that are actually the microscopic reason forthe relaxation to equilibrium and the second law, as evidenced from the variousfluctuation-response relations discussed throughout the lecture. To bring themexplicitly into the game, one can proceed phenomenologically and microscopically.

Along the first route, one starts from the above observation that hydrody-namic (thermodynamic) equations of motion describe dissipative relaxations toequilibrium. One may quite generally blame some sort of friction force for bring-ing all systematic motion to rest at late times, so that the equations are all ofthe type “initial dynamics halted by friction”. But things are not really totallyat rest, in thermal equilibrium, they fluctuate thermally, which is the rationalfor speaking of temperature, entropy, and the second law. Friction, as a system-atic manifestation of the stochastic dynamics on the molecular scale, is thus justone side of the medal of molecular chaos. The thermal forces introduced in theLangevin equation are its other side, namely its non-systematic, random side,which was wiped under the carpet in the deterministic hydrodynamic descriptionof the preceding paragraph. The effect of the fast molecular dynamics onto theslow (almost deterministic) thermodynamic or hydrodynamic variables can thusbe decomposed into a systematic part (friction) and a random part (noise), whichare firmly tied to each other by the requirement of thermal equilibrium.

To see how these words can be turned into equations, consider a simple genericexample, a hydrodynamic variable p, which may be thought of as the momentumof a Brownian particle. The simplest possible hydrodynamic equation of motionp = −γp has no other systematic force than the mandatory friction −γp with γ =ζ/m. Its solution is p(t) = p(0)e−γt. Note that hydrodynamics, as a macroscopictheory ruled by the second law, is meant to predict the future from the past, sothere is a good reason here37 for restricting the equation and the application ofits solution to positive times t. Now enter the noise ξ:

p(t) = −γ p(t) + ξ(t) (t > 0) . (182)

Obviously, if all systematic parts of the thermal force are contained in the frictionterm, the average of the remaining noise force must vanish,

〈ξ〉 = 0 (“noisy noise”) . (183)

37in contrast to the Schrodinger equation, say

55

Moreover, any systematic influence of the hydrodynamic past p(t < 0) onto thenoisy present ξ(0) should be prohibited, at least if no such memory of the pastis showing up in the systematic friction force, in the first place,

〈ξ(t)p(0)〉 = 0 (t > 0) (“truly noisy noise”) . (184)

This is equivalent to saying that the energy and momentum supplied to the bathare dissipated instantaneously — obviously an idealization for a description onlong time scales38. Together with the instantaneous (local in time) form of thefriction, this statement constitutes the so-called Markov condition. At late times,and in particular for Brownian degrees of freedom (pertaining to objects that arelarge compared to the atoms of the solvent or heat reservoir), it is moreover con-ceivable that the random force represents the cumulative effect of a large numberof essentially stochastic sub-processes. It is then natural (and of course the sim-plest thing one can do!) to appeal to the central limit theorem and assume thatξ is, at any time, a Gaussian distributed random variable. While this assumptionis not necessary, it is extremely convenient and indeed the standard assumption.It has proven to work extremely well for a large number of applications close toequilibrium. Equally famous is its spectacular failure for financial markets andother phenomena very far from equilibrium.

An important side effect of the introduction of the noise is the possibilityto reinterpret the averages 〈. . . 〉 in terms of the ensemble of realizations of thestochastic process ξ(t). While this allows for generalizations of the averagingprocedure beyond the one of standard equilibrium statistical mechanics, it is ofcourse most interesting to find the conditions under which the two are consistent.This is the next task.

The first requirement is that the known deterministic hydrodynamic equa-tions should emerge on average. Note that this is an immediate consequence ofEqs. (182), (183). The second requirement is more subtle. Equilibrium averagesmust be time-translation invariant or stationary in the same way as spatial av-erages are translation invariant in spatially homogeneous systems. This implies,for example,

〈p(t)p(0)〉 = 〈p(0)p(−t)〉 = 〈p(−t)p(0)〉 = 〈p(|t|)p(0)〉 (TTI) , (185)

with the last expression clearly exposing time-inversion invariance or “time sym-metry” as an immediate consequence of stationarity. The time symmetry of〈p(t)p(0)〉, in turn, implies the time-asymmetry of its time derivative,

〈p(t)p(0)〉 = −〈p(−t)p(0)〉 = −〈p(t)p(0)〉 . (186)

as promptly checked by renaming t → −t (hence ∂t → −∂t). Now, multiplyingthe Langevin equation (182) by p(0) and averaging, one finds with Eq. (184)

〈p(t)p(0)〉 = −γ〈p(t)p(0)〉 (t > 0) . (187)

38In reality, momentum is locally conserved in the solvent, so that it has to be dispersed by(fast) propagating sound modes and (slow) diffusive shear modes, which takes some time.

56

The correlator obeys the deterministic hydrodynamic equation. Since the left sideis symmetric and the right side is asymmetric under time reversal, an extensionto negative times is only possible if one side is multiplied by a sgn-function,

〈p(t)p(0)〉 = −γ sgn t 〈p(t)p(0)〉 (∀t) . (188)

The obvious solution is 〈p(t)p(0)〉 = 〈p2〉e−γ|t|, which is now perfectly acceptablefor a correlator of equilibrium fluctuations. Taking another time-derivative yields

〈p(t)p(0)〉 = 2γ δ(t)〈p2(0)〉+ γ sgn(t)〈p(t)p(0)〉 . (189)

Hence, altogether, the proper extension of Eq. (182) to negative times must be

p(t) = −γ sgn(t)p(t) + ξ(t)

〈ξ(t)〉 = 0 , 〈ξ(t)ξ(0)〉 = 2γ 〈p2〉δ(t) = 2ζkBTδ(t) .(190)

Assuming that p denotes the momentum of a Brownian particle, the equiparti-tion theorem and the definition γ = ζ/m were used, in the last step.39 Withthe above assumption that the noise distribution is Gaussian, and therefore com-pletely specified by its first two moments (average and variance), the model def-inition is complete and bears the name Ornstein–Uhlenbeck process. In the ex-ercises you check that the model reproduces the static averages of the canonicalensemble and moreover allows for a quick and easy calculation of equilibriumtime-dependent correlation functions, and that even non-equilibrium averagesand correlation functions can be calculated by imposing a non-equilibrium initialcondition. For equilibrium initial conditions at t = 0, or after taking a canonicalaverage over initial data, or for very long times, the solutions of Eq. (190) arestationary. In fact, up to trivial variations, the Ornstein–Uhlenbeck process isthe only stationary Gaussian Markov process, which makes it quite a celebrity.

The Fokker–Planck equation

Note that the Gaussian property of the noise distribution is not destroyed bya linear coordinate transformation. It is therefore transmitted from the noisecorrelations that drive the dynamics to the solutions of the linear Langevin equa-tion (190). Therefore, the conditional distribution ρ(p, t|p0) to find values of thevariable p at time t, given an initial condition p = p0 at time t = 0, must beGaussian at all times. One can thus immediately write down the general form ofthe solution:

ρ(p, t|p0) =1

[2π∆(t)]1/2e−[p−p(t)]2/2∆(t) . (191)

The explicit expressions for the dynamic first and second moments,

p(t) ≡ 〈p(t)〉p0 = p0e−γ|t| (192)

∆(t) ≡ 〈[p(t)− p(t)]2〉p0 = 〈p2(t)〉p0 − p2(t) = mkBT (1− e−2γ|t|) , (193)

39To check the last Eq. (190), use the first on its left side together with Eq. (189) and TTI.

57

follow from Eq. (190). Note that the deterministic initial condition is recov-ered at short times but forgotten at late times, where the conventional canonicalequilibrium distribution is approached:

limt→0

ρ(p, t|p0) = δ(p− p0) ,

limt→∞

ρ(p, t|p0) = (2πmkBT )−1/2e−p2/2mkBT .

(194)

An average over the initial condition p0 would turn the conditional distributioninto the momentum distribution ρ(p, t). With a Maxwell–Boltzmann distributedinitial momentum p0, it is Gaussian, too.

It is an interesting question how to get the distribution function in caseswhen it is not Gaussian and possibly a much richer object than just its first twomoments, e.g., for a nonlinear Markov processes with a more general drift termv(p) in place of −γp. The answer is provided by an equation of motion for thedistribution function that can be derived for any Markovian Langenvin equation,the Fokker–Planck equation. The task is essentially to go from the “Heisenberg”picture adopted in the Langevin equation to the corresponding “Liouville” or“von Neumann” picture. For the very simple Eq. (160), Einstein’s derivation ofthe diffusion equation (156) already achieves this task. If a (possibly nonlinear)drift term v(p) is included in the Langevin equation, also the diffusion equationacquires an extra drift term. The general derivation can be accomplished withthe help of the functional calculus, introduced earlier in the lecture. Here we takea shortcut, instead, sticking to the above simple example of the linear Ornstein–Uhlenbeck process, for which the Fokker–Planck equation for ρ(p, t|p0) is obtainedfrom its known solution, Eq. (191), by differentiation:

∂tρ = γ∂p(pρ) + ζkBT∂2pρ or

1

γ∂tρ = ∂v(vρ) +

kBT

m∂2vρ . (195)

This structure survives in the case of vector variables p and nonlinear MarkovianLangevin equations,

∂tρ = −∂pv(p)ρ+ ζkBT∂2pρ . (196)

As a concrete example, consider a Brownian particle in an external force fieldf(x) − ∂xU(x), with the vector (x, p) generalizing the momentum coordinate pand (p/m, f(x) − γp) generalizing the drift term −γp. For the harmonic forcef(x) = −kx, this is the Brownian oscillator, discussed in the exercises. ItsFokker–Planck equation for the density ρ(x, p, t) reads

∂tρ+ ∂x(ρp/m) + ∂p[f(x)− γp]ρ = ζkBT∂2pρ . (197)

One recognizes the Liouville structure of a continuity equation ∂tρ+ div(vρ) = 0for the trajectories in phase space, streaming with velocity v = (p/m, f) albeitwith some of the momentum-space “velocity” f taken away by friction, and thenon-streaming but mass-conserving diffusion term due to the noise, on the right.

58

The Smoluchowski equation

The Fokker–Planck operator is generally not self-adjoined, so that it is difficult tosay much about its solutions, in general. The situation is much better for its over-damped version, which is obtained from the corresponding Langevin equationafter dropping the inertial term p against the friction, i.e.,

��AAp+ ζx = f(x) + ξ(t) . (198)

This matters so much that the corresponding Fokker–Planck equation for ρ(x, t),

∂tρ = ∂xD(∂x − βf)ρ or ζ∂tρ = ∂x(kBT∂x − f)ρ , (199)

deserves its own name and is called the Smoluchowski equation. For vanishingforce, f(x) ≡ 0, it reduces to the simple diffusion equation of a single particle. Itsstraightforward extension to many (1014, say) mutually interacting colloidal par-ticles — with inter-particle forces fi(x) = −∂iV({xj}) — is still a very challengingequation, though. The variable x becomes a high-dimensional vector {xj}) con-taining all particle coordinates and the diffusion coefficient turns into a matrixDij({xi}) that still depends on the mutual hydrodynamic interactions40 betweenthe particles and hence all their positions. The full equation for the N−particledensity ρ({xi}, t) then reads

∂tρ =∑ij

∂i ·Dij · [∂j − βfj]ρ . (200)

It is the standard microscopic description for a wealth of problems in colloid andpolymer physics.

Going back to the single-particle case with an external potential U(x), notethat the equation can be cast into the form of a continuity equation

∂tρ+ ∂xj = 0 (201)

for ρ(x, t), which is in then nothing but the ordinary 1-point particle densityn(x, t). The current

j(x, t) ≡ ρv = −ζ−1(kBT∂x − f)ρ (202)

of the conserved density flows with the velocity

v(x, t) ≡ −ζ−1∂x(kBT ln ρ+ U) = −ζ−1∂xµ(x, t) , (203)

and is seen to be driven by gradients of the generalized chemical potential

µ(x, t) = µig(x, t) + U(x) = kBT ln ρ+ U . (204)

40In the Markovian approximation, these are idealized as an instantaneous action at a distancewith a complicated spatial structure that follows from the stationary Stokes equation.

59

The chemical potential is thus (again) recognized as a neat way of defining athermodynamic force −∂xµ that can be treated on par with real (mechanical,electrical, . . . ) forces and thereby effectively allows the Liouville structure of astreaming of a conserved phase space volume to be restored, without any (ap-parent) source terms. Thermal fluctuations are thereby smoothly integrated intoa (continuum) mechanical formalism that allows to predict dynamical phenom-ena involving all kinds of local dynamic forces along with dissipative diffusiveprocesses.

Note that a stationary state, defined by the vanishing of all time derivativesin the theory (i.e. ∂tρ ≡ 0) is characterized by the condition that the divergenceof the current vanishes. In equilibrium, the current itself must vanish. A con-stantly sheared colloidal suspension is in a non-equilibrium steady state but notin equilibrium, for example. As already discussed in the context of irreversiblethermodynamics, the relaxation from a non-equilibrium initial condition to theglobal equilibrium state ρ0(x) = e−βU(x)+βF0 is governed by a Ljapunov functional

∆F (t) = kBT

∫dx ρ ln(ρ/ρ0) =

∫dxµρ−F0 =

∫dx (kBT ln ρ+U)ρ−F0 (205)

that generalizes the equilibrium free energy. More precisely, it is the excessover the equilibrium free energy F0, expressed in terms of the chemical potentialdensity µρ integrated over the configuration space. Because of the minimumcondition for the equilibrium free energy, it cannot be negative, whereas its timederivative cannot be positive,

∆F (t) =

∫dxµ∂tρ = −

∫dxµ∂xj =

∫dx j∂xµ = −ζ−1

∫dx (∂xµ)2ρ ≤ 0 . (206)

For the first equality, one makes use of the normalization of the density and of thefact that the potential U(x) and the equilibrium free energy F0 are conservativepotentials, for the second and fourth of Eqs. (201), (203), respectively, and forthe third one of the fact that nothing flows across the system boundaries.

In view of such perfect conduct of the Smoluchowski equation, it should notcome as a surprise that the Smoluchowski–operator LS can be made self-adjoined.This is achieved by a simple normalization of the distribution ρ with the squareroot of its equilibrium form ρ0(x) ∝ e−βU(x) (the proportionality factor is irrele-vant), namely

ψ(x, t) ≡ ρeβU/2 , (207)

as one might have guessed from the following rewriting of the Smoluchowskiequation

∂tρ ≡ LSρ = D∂x(∂x + βU ′)ρ = D∂xe−βU∂xe

βUρ . (208)

Indeed, the dynamics for the normalized density ψ is straightforwardly shown tobe given by

∂tψ = DeβU/2∂xe−βU/2e−βU/2∂xe

βU/2ψ , (209)

60

with a manifestly self-adjoined dynamic operator. Moreover, with yet somemore rewriting, this equation can be recast into the form of a supersymmetricSchrodinger equation in the imaginary time variable τ = −i~Dt,

i~∂τψ = (−∂2x + Ususy)ψ , Ususy ≡ (βU ′/2)2 − βU ′′/2 . (210)

This establishes a 1:1 relation between solutions of the 1-particle Smoluchowskiequation and the Schrodinger equation for supersymmetric potentials Ususy. Sincethe Schrodinger equation is a subject for another lecture, and the supersymmet-ric form of the potential implies that every solution will find a supersymmetricpartner, this seems to be a suitable point for a happy end.

61

Advanced Statistical Mechanics - uni-leipzig.dekroy/materials/sm2_16.pdf · Advanced Statistical...

Documents

Transcript of Advanced Statistical Mechanics - uni-leipzig.dekroy/materials/sm2_16.pdf · Advanced Statistical...