Post on 16-May-2017
Statistical Mechanics
Part I: General Principles
1 Conservation of Information, Energy, Entropy, and
Temperature
1.1 Introduction
Statistical mechanics is often thought of as the theory of how atoms combine to form
gases, liquids, solids, and even plasmas and black body radiation. But it is both more
and less than that. Statistical mechanics is a useful tool in many areas of science in which
a large number of variables have to be dealt with using statistical methods. My son, who
studies neural networks, uses it. I have no doubt that some of the financial wizards at AIG
and Lehman Brothers used it. Saying that Stat Mech is the theory of gases is rather
like saying calculus is the theory of planetary orbits. Stat Mech is really a particular type
of probability theory.
Coin flipping is a good place to start. The probabilities for heads (H) and tails (T) are
both equal to 1/2. Why do I say that? One answer is that the symmetry between H and T
means their probabilities are equal. Here is another example. Let’s take a die (as in dice)
and color the six faces red, yellow, blue, green, orange, and purple (R, Y,B,G, O, P ). The
obvious cubic symmetry of the die dictates that the probabilities are equal to 1/6. But
what if we don’t have a symmetry to rely on? How do we assign a priori probabilities?
Suppose for example instead of the coloring scheme that I indicated above, I chose to
color the purple face red. Then there would be only five colors. Would the probability
of throwing a given color be 1/5? After all, if I just write (R, Y, B,G, O), the 5 names
are just as symmetric as the original 6 names. Nonsense, you say: the real symmetry is
among the 6 faces, and that is so. But what if there really is no obvious symmetry at all,
for example if the die is weighted in some unfair way?
In that case we would have to rely on a bunch of details such as the precise way the
die was thrown by the hand that threw it, the wind, maybe even the surface that the die
lands on (can it bounce?). As is often the case, we have to think of the system in question
as part of a bigger system. But what about the bigger system? How do we assign its
probabilities?
Here is another idea that involves some dynamics. Suppose there is a law of motion (in
this example time is discrete) that takes a configuration and in the next instant replaces
it by another unique configuration. For example R → B, B → Y, Y → G, G → O, O → P,
P → R. I can then ask what fraction of the time does the die spend in each configuration?
The answer is 1/6. In fact there are many possible laws for which the answer will be the
same. For example, R → B, B → G, G → P, P → O, O → Y, Y → R, or R → Y, Y → P,
P → G, G → O, O → B, B → R.
But what about the law R → B, B → G, G → R, P → O, O → Y, Y → P? In this case
there are two trajectories through the space of states. If we are on one of them we don't
jump to the other. So the probability will depend on the relative probability for beginning
on the two cycles.
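These deterministic laws are easy to simulate. Here is a minimal sketch (the dictionaries and helper function are mine, not from the text) that follows one trajectory and records the fraction of time spent in each color:

```python
from collections import Counter

# The two laws discussed above (state names as in the text).
single_cycle = {"R": "B", "B": "Y", "Y": "G", "G": "O", "O": "P", "P": "R"}
two_cycles   = {"R": "B", "B": "G", "G": "R", "P": "O", "O": "Y", "Y": "P"}

def occupation_fractions(law, start, steps=6000):
    """Follow one trajectory and return the fraction of time in each state."""
    counts = Counter()
    state = start
    for _ in range(steps):
        counts[state] += 1
        state = law[state]
    return {s: counts[s] / steps for s in law}

# The single cycle visits every state, so each fraction is 1/6.
print(occupation_fractions(single_cycle, "R"))
# With two disjoint cycles the answer depends on where you start:
print(occupation_fractions(two_cycles, "R"))  # only R, B, G are visited
print(occupation_fractions(two_cycles, "P"))  # only P, O, Y are visited
```

The two-cycle law illustrates the point in the text: the long-run fractions are 1/3 on one cycle and 0 on the other, depending on the starting state.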
In this last case there is a conserved quantity. Suppose we assign the number 1 to
R, B, G and 0 to O, Y, P. Let's call this quantity the Zilch. Obviously Zilch is conserved.
Whenever we have a conserved quantity like Zilch, we have to specify its value.
That's ok; there are conserved quantities in nature, the most important in statistical
mechanics being energy. But once we specify all the conserved Zilches, we can proceed
as usual, and say that all the states on a trajectory of fixed Zilch-number are equally
probable.
That sounds good, but there are lots of counterexamples. Here is one: R → R, B → R,
G → R, P → R, O → R, Y → R. No matter where you begin, you go to red in the next
instant. There are no conserved quantities, but obviously the probability after a short time
is completely unequal; only red is possible.
This last law has something odd about it. It is an example of a law that does not
respect the "conservation of distinctions," or "conservation of information." In the previous
examples distinct starting points always lead to distinct outcomes. But in this case we
quickly lose track of where we started. Trajectories merge!
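The difference between an invertible law and the all-roads-lead-to-red law can be seen by evolving all six starting points at once. This little sketch (helper names are mine) counts how many distinct states survive:

```python
# "Conservation of distinctions": an invertible law keeps six distinct states
# distinct forever, while the all-roads-lead-to-red law merges them in one step.
invertible = {"R": "B", "B": "Y", "Y": "G", "G": "O", "O": "P", "P": "R"}
merging    = {s: "R" for s in "RBYGOP"}

def distinct_after(law, steps):
    """Evolve all six starting points and count how many distinct states remain."""
    states = set("RBYGOP")
    for _ in range(steps):
        states = {law[s] for s in states}
    return len(states)

print(distinct_after(invertible, 10))  # 6: distinctions conserved
print(distinct_after(merging, 1))      # 1: trajectories merge, information lost
```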
One of the central principles of classical mechanics (it has a quantum analog) is that
information (distinctions) is conserved. This principle is so important to the validity of
stat mech and thermodynamics that I would call it the minus first law of thermo (the
zeroth law is already taken).
What the conservation of distinctions says is that trajectories in phase space never run
into each other. Even stronger, if you start with a volume in phase space and follow it
using Hamilton's equations of motion, the volume is conserved. That suggests that for
a closed system, a priori probability is uniform over the phase space. We will come back
to this.
1.2 Energy
The First Law of Thermodynamics is that energy is conserved. For a closed system1,
dE/dt = 0
If a system is composed of two very weakly interacting subsystems2, the energy is
additive,
E = E1 + E2
Interactions between the subsystems can exchange energy between them. Thus the
energy of either subsystem is not conserved. In equilibrium the energy of a subsystem
fluctuates although the total energy is fixed.
1.3 Entropy
Entropy is an information-theoretic concept. It is a measure of how much we DON'T know
about a system. Our ignorance may be due to the fact that the degrees of freedom are
too small, too numerous, and too rapidly changing to follow, or just due to plain laziness. In either case entropy
measures our ignorance.
Suppose a system can be in any of N states. If we know nothing about the state of
the system (complete ignorance) the entropy, S, is defined to be
S = log N.
Suppose that we know only that the system is in any of M states with M < N. Then
we have some nontrivial knowledge. In this case the entropy is defined to be
S = log M.
Evidently, the more we know, the less the entropy.
Here are some examples. Let the system be n distinguishable coins. Each coin is either
H (heads), or T (tails). A state consists of a sequence of H’s and T’s.
1 By a closed system we mean one with a time independent Hamiltonian and with insulating walls so that no energy can enter or leave the system from outside.
2 Weakly interacting means that the energy of interaction is very small compared to the separate energies of the subsystems by themselves. If the Hamiltonian is H1 + H2 + Hint, then the numerical value of Hint should be negligible compared with either H1 or H2.
H H T H T T H.....
Suppose we know nothing at all. The total number of states is 2n and the entropy is
S = log 2n = n log 2.
Note that by taking the log we have made entropy proportional to the number of degrees
of freedom (extensive). Entropy is measured in bits. If the entropy is S = log 2n = n log 2,
one says that there are n bits of entropy.
Another example is that we know everything that there is to know about the coins. In
that case we know the exact state and M = 1. The entropy is zero.
Next, suppose we know that all the coins but one are H but we don’t know which one
is T. The system can be in any of n states and the entropy is
S = log n
In this case the entropy is not additive but that is because there are strong correlations
(to be defined later).
Homework Problem: Suppose that n is even and that we know that half the coins
are H and half are T, but that is all we know. What is the entropy?
For large n, what is the entropy if we know that a fraction f of the coins are H and
a fraction 1 − f are T?
In each case we are given a probability distribution on the space of states. If we label
the states with an index i, then the probability to be in state i is called P(i). In the examples
above, P(i) is zero for the impossible states and P(i) = 1/M for the possible states.
We can write a formula for S in terms of P(i),
S = −∑_i P(i) log P(i). (1.1)
Those states with P = 0 don’t contribute and those with P = 1/M contribute 1/M log M .
Since there are M such states the sum gives log M.
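The formula 1.1 is easy to check numerically. The sketch below (using natural logarithms, so entropy comes out in nats rather than bits; the helper name is mine) confirms that a distribution uniform over M of the N states gives S = log M:

```python
import math

def entropy(P):
    """S = -sum_i P(i) log P(i); states with P = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in P if p > 0)

# Uniform over M = 4 of the N = 16 states reproduces S = log M:
N, M = 16, 4
P = [1 / M] * M + [0.0] * (N - M)
print(entropy(P), math.log(M))  # the two values agree
```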
For a general probability distribution we require only that P(i) ≥ 0 and
∑_i P(i) = 1. (1.2)
The general definition of entropy is given by 1.1. Obviously 0 ≤ P(i) ≤ 1, which implies
that each contribution to 1.1 is positive or zero.
The entropy defined by 1.1 roughly measures the log of the number of states that have
non-negligible probability in the distribution P(i). In other words, exp S is the "width" of
the distribution. Note that S is zero if and only if P = 1 for a single state and P = 0
for all the others.
Homework Problems:
1) Suppose you have a set of N coins. Each coin independently has a probability 1/3
to be heads and 2/3 to be tails. What is the total entropy?
2) A variable q has a uniformly spaced spectrum of values with very small spacing δ.
The sum over states can be accurately approximated by
∑_i → (1/δ) ∫ dq.
Suppose that the probability distribution for q is proportional to e^{−q²}. What is the entropy?
1.4 Temperature
The average energy associated with a probability distribution (I will just call it E) is given
by
E = ∑_i E_i P(i) (1.3)
where E_i is the energy of the state i.
Now suppose we have a one-parameter family of probability distributions labeled by
the average energy, P(i; E). For each value of E, P(i; E) satisfies the usual requirements of
a probability distribution. Later we will think of it as the thermal equilibrium distribution
for given average E. But for now it is just a one-parameter family.
At each value of E we can compute the entropy so that S becomes a function of E,
S(E).
Consider the amount of energy that is needed to increase the entropy by one bit (by
log 2 ). It is given by
δE = (dE/dS) log 2 (1.4)
We call the quantity dE/dS the temperature, T:
T ≡ dE/dS (1.5)
Slogan: Apart from a factor of log 2, the temperature is the amount of energy needed
to increase the entropy by one bit.
For example, if you erase a bit of information from your computer you are really
transferring it from the computer out into the atmosphere, where it shows up as some
heat. How much heat? The answer is T log 2.
Except in very unusual circumstances, the temperature is always positive, i.e., entropy
is a monotonically increasing function of energy.
1.5 The Zeroth and Second Laws of Thermodynamics
The Second Law of Thermodynamics is that entropy always increases. We can state it
in the following way: When a closed system which is out of thermal equilibrium comes
to equilibrium the entropy of the system increases. We will come back to the reasons for
this, but for now let’s accept it.
From the 2nd law we can prove that heat always flows from hot to cold. Consider two
isolated systems, A and B, at different temperatures. Let them have energies, temperatures,
and entropies E_A, E_B, T_A, T_B, S_A, S_B. Without loss of generality we can assume
that T_B > T_A.
Now bring them into contact so that energy (heat) flows between them. Suppose a small
quantity of energy is exchanged. The total change in energy must be zero. Therefore
T_A dS_A + T_B dS_B = 0 (1.6)
Since they must tend to equilibrium, if the entropy is not maximum it must increase.
Hence
dS_A + dS_B > 0 (1.7)
We can use 1.6 to eliminate dS_B from 1.7. We find
(T_B − T_A) dS_A > 0 (1.8)
Since system B is initially the hotter of the two, (T_B − T_A) is positive. Therefore dS_A
and also T_A dS_A are positive. Equation 1.6 then tells us that T_B dS_B is negative. Equivalently,
energy flows from hot to cold as equilibrium is established. The final equilibrium
configuration, in which energy has stopped flowing, must have T_A = T_B. In other words,
temperature must be uniform in a system in thermal equilibrium.
2 The Boltzmann Distribution
If a system A is in contact with (weakly interacting with) a much larger system called
the bath, then it can exchange energy with the bath. After a long time the combined system
will have jumped around over all states with the given total energy. There will be equal
probability for all the various ways the energy can be shared. This is the assumption of
chaos, and although it is very difficult to prove for a given system, it is almost certainly
true for almost all systems. Whenever it is true, the small system A will be in thermal
equilibrium with a certain probability distribution for being in the state i. (Note that i
refers to states of the small system A and not the combined system.)
We will illustrate the consequences of this principle by choosing the bath to be N − 1
copies of A so that all together we have a total system consisting of N copies of A.
The copies are labeled 1, 2, 3, ..., N and they are in states i_1, i_2, i_3, ..., i_N with energies
E_{i_1}, E_{i_2}, E_{i_3}, ..., E_{i_N}.
Let us define the "occupation numbers" n_i to be the number of copies that occupy the
state i. The n_i satisfy two conditions,
∑_i n_i = N (2.1)
and
∑_i n_i E_i = E_total. (2.2)
These equations state that the total number of copies adds up to N and that the total
energy adds up to some fixed value, E_total.
The chaotic assumption tells us that all of the configurations of the total system with
a given total energy are equally likely. Thus the probability for a given partitioning of the
energy (given set of n_i) is equal to the number of configurations with that set of n_i. How
many distinct configurations of the total system are there with given (n_1, n_2, ..., n_N)? This
is a combinatoric problem that I will leave to you. The answer is
N!/∏_i n_i! (2.3)
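The count 2.3 is the multinomial coefficient, which can be checked directly in a couple of lines (the helper name is mine):

```python
from math import factorial, prod

def multiplicity(ns):
    """Number of distinct configurations with occupation numbers ns: N!/prod(n_i!)."""
    N = sum(ns)
    return factorial(N) // prod(factorial(n) for n in ns)

# Four copies split as (2, 1, 1) over three states:
print(multiplicity([2, 1, 1]))  # 4!/(2! 1! 1!) = 12 ways
```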
The important point is that when N and the n_i become large, subject to the constraints
2.1 and 2.2, the quantity in 2.3 becomes very sharply peaked around some set of
occupation numbers. Before we compute the occupation numbers that maximize 2.3, let
me introduce some changes of notation.
Define P(i) to be the fraction of copies in state i,
P(i) = n_i/N,
and let E be the average energy of a copy. Obviously
E = E_total/N.
Then 2.1 and 2.2 take a form identical to 1.2 and 1.3, namely
∑_i P(i) = 1
and
∑_i P(i) E_i = E.
Now we will assume that N and the n_i are very large and use Stirling's approximation3
in 2.3. But first let us take its logarithm (maximizing a positive quantity is the same as
maximizing its log):
log N!/∏_i n_i! ≈ N log N − ∑_i n_i log n_i.
We want to maximize this subject to 2.1 and 2.2.
Substituting n_i/N = P(i) and E_total = NE, we find that this is equivalent to maximizing
−∑_i P(i) log P(i)
subject to
∑_i P(i) = 1
and
∑_i P(i) E_i = E.
In other words the probability distribution for thermal equilibrium maximizes the entropy
subject to the constraint of a given average energy.
3 Stirling's approximation is n! ≈ n^n e^{−n}.
In order to find the P(i) we use the method of Lagrange multipliers to implement the
two constraints. The two multipliers are called α and β. Thus we maximize
−∑_i P(i) log P(i) − α ∑_i P(i) − β ∑_i P(i) E_i (2.4)
At the end we choose α, β so that the constraints are satisfied.
Differentiating with respect to P(i) and setting the result to zero gives
P(i) = e^{−(α+1)} e^{−βE_i}. (2.5)
This is the Boltzmann distribution.
Let us define
e^{(α+1)} = Z.
Then 2.5 has the familiar form
P(i) = e^{−βE_i}/Z. (2.6)
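As a quick illustration of 2.6, the sketch below normalizes Boltzmann weights for an assumed toy spectrum (the energies and helper name are my choices) and shows the distribution flattening at high temperature and concentrating on the ground state at low temperature:

```python
import math

def boltzmann(energies, beta):
    """P(i) = e^{-beta E_i} / Z for a finite list of state energies."""
    weights = [math.exp(-beta * E) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

E = [0.0, 1.0, 2.0]          # assumed toy spectrum
for beta in (0.1, 1.0, 10.0):
    print(beta, boltzmann(E, beta))
# Small beta (high temperature): nearly uniform.
# Large beta (low temperature): concentrated on the ground state.
```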
2.1 Solving for the Lagrange Multipliers
Solving for α is equivalent to solving for Z which is done by setting the sum of the P to
1. This gives the famous formula for the partition function,
Z = ∑_i e^{−βE_i}. (2.7)
Next consider the equation
∑_i P(i) E_i = E,
which becomes
(1/Z) ∑_i e^{−βE_i} E_i = E.
Let us use e^{−βE_i} E_i = −∂_β e^{−βE_i} and we obtain
E = −(1/Z) ∂_β Z = −∂ log Z/∂β. (2.8)
This equation can be used to fix β in terms of the average energy E.
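Equation 2.8 can be verified numerically: a finite-difference derivative of log Z should reproduce the directly computed average energy. A sketch, with an arbitrary toy spectrum and helper names of my own:

```python
import math

def log_Z(energies, beta):
    """log of the partition function Z = sum_i e^{-beta E_i}."""
    return math.log(sum(math.exp(-beta * E) for E in energies))

def average_E(energies, beta):
    """Direct average: sum_i E_i e^{-beta E_i} / Z."""
    w = [math.exp(-beta * E) for E in energies]
    return sum(E * x for E, x in zip(energies, w)) / sum(w)

energies, beta, h = [0.0, 0.5, 1.3], 0.7, 1e-6
# E = -d(log Z)/d(beta), approximated by a central difference:
finite_diff = -(log_Z(energies, beta + h) - log_Z(energies, beta - h)) / (2 * h)
print(finite_diff, average_E(energies, beta))  # the two values agree
```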
The Lagrange multiplier β is, of course, the inverse temperature. We will demonstrate
this, but first let us derive another familiar thermodynamic formula.
2.2 Helmholtz Free Energy
Using S = −∑ P log P we can write the entropy in the form
S = ∑_i (1/Z) e^{−βE_i} (βE_i + log Z) = βE + log Z.
The quantity A = −T log Z is called the Helmholtz free energy (reminder: T = 1/β.)
Thus we find
S = β(E − A)
or
A = E − TS (2.9)
The Helmholtz free energy satisfies
dA = dE − TdS − SdT
Using dE = TdS we find
dA = −SdT (2.10)
2.3 Why is T = 1/β?
We have proposed two definitions of temperature. The first is equation 1.5 and the second
is the inverse of the Lagrange multiplier, β. We would like to see that they are really the
same.
Consider a small change in the energy of a system,
dE = −d[∂_β log Z].
Now use 2.9 in the form E = TS − T log Z to get
dE = T dS + S dT − T (∂ log Z/∂β) dβ − log Z dT.
Using 2.9 again, and the definition of A, we find that the last three terms cancel, leaving
dE = T dS. (2.11)
This is of course equivalent to 1.5. Thus the two definitions of temperature are the same.
3 Fluctuations
So far we have been deriving classical thermodynamics from statistical mechanics. We go
beyond thermodynamics when we consider fluctuations of quantities about their averages.
Such fluctuations are observable—for example Einstein’s theory of the Brownian motion.
In this section I will illustrate by considering the fluctuations of the energy of a system in
contact with a heat bath.
Given a probability distribution P(x), the fluctuation in x (called ∆x) is defined by
(∆x)² = ⟨(x − ⟨x⟩)²⟩ (3.1)
which is also equal to
(∆x)² = ⟨x²⟩ − ⟨x⟩² (3.2)
where ⟨⟩ means average. For any function f(x) the average is defined by ⟨f(x)⟩ =
∑_x f(x)P(x). If x is continuous then the sum is replaced by an integral in the obvious
way.
Let us consider the fluctuation of energy of a system in equilibrium. We use the
following:
⟨E⟩ = −(1/Z) ∂_β Z
and
⟨E²⟩ = (1/Z) ∂²_β Z.
The first identity is the usual identification of the average energy in terms of the derivative of
Z. The second identity is derived the same way as the first, noting that each derivative
acting on exp(−βE) brings down a factor of −E. Thus
(∆E)² = ⟨E²⟩ − ⟨E⟩²
is given by
(∆E)² = (1/Z) ∂²_β Z − ((1/Z) ∂_β Z)².
But this expression is equivalent to
(∆E)² = ∂²_β log Z = −∂_β E.
Using T = 1/β we get
(∆E)² = T² dE/dT.
Now note that dE/dT is just the heat capacity of the system. Call it C. The final identity is
(∆E)² = T² C. (3.3)
Thus we find that the fluctuation of the energy is proportional to the specific heat.
It may seem odd that the fluctuations should be so large. But that is because we have
set the Boltzmann constant to 1. If we put k back into the equation it becomes
(∆E)2 = kT 2C. (3.4)
Recall that k is a very small number in units of meters, kilograms, seconds, and degrees
Kelvin: k = 1.4 × 10^{−23}. Thus fluctuations are small because k is small.
Nevertheless, despite their small magnitude, fluctuations can be measured and they
provide a method for determining Boltzmann’s constant.
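Equation 3.3 (in k = 1 units) is easy to test on a toy spectrum: compute the variance of E directly from the Boltzmann distribution and compare it with T² times a finite-difference heat capacity. All numbers below are arbitrary illustrative choices:

```python
import math

def stats(energies, T):
    """Return (mean E, variance of E) in the Boltzmann distribution at temperature T."""
    beta = 1.0 / T
    w = [math.exp(-beta * E) for E in energies]
    Z = sum(w)
    mean = sum(E * x for E, x in zip(energies, w)) / Z
    mean_sq = sum(E * E * x for E, x in zip(energies, w)) / Z
    return mean, mean_sq - mean * mean

energies, T, h = [0.0, 1.0, 3.0], 1.5, 1e-5
_, var = stats(energies, T)
# Heat capacity C = dE/dT by central difference:
C = (stats(energies, T + h)[0] - stats(energies, T - h)[0]) / (2 * h)
print(var, T * T * C)  # the two sides of (Delta E)^2 = T^2 C agree numerically
```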
4 Control Parameters
So far we have considered closed systems characterized by constant values of the param-
eters. These parameters include all the parameters in the Lagrangian such as the masses
of particles, the values of external electric and magnetic fields, and the shape and volume
of the containers that enclose the system. Some of these parameters such as the volume
of the system and the external fields may be controllable from the outside, for example
by moving pistons to change the volume. We will call such macroscopic control variables
Xm. For simplicity we will consider the case of only one X although the principles are the
same for several of them. If you want to think of a specific example, X can represent the
volume of the system.
4.1 The Adiabatic Theorem and the First Law
An adiabatic process means two things. First of all it means that the system is isolated so
that no energy in the form of heat can flow into or out of the system. Secondly, it means
that the control parameters are varied very slowly. In that situation the system will remain
in equilibrium throughout the process, although typically the energy and temperature will
change. Those things which do not change during an adiabatic process are called adiabatic
invariants. It is a theorem that entropy is an adiabatic invariant. It is true in classical
mechanics but it is most easily understood in quantum mechanics.
Consider the energy levels of the system, E_i. In general they will depend on X: E_i =
E_i(X). If a sudden change is made in X the system will not remain in an eigenstate
of energy. But if the change is arbitrarily slow then the adiabatic theorem says that
the system remains in an eigenstate, simply tracking the slow time dependence of the
instantaneous energy levels. In fact the levels will not cross over or disappear. The
implication is that the probability function P (i) is constant for each level, even as the
energy along the way gradually varies. Obviously if P (i) is constant, so is the entropy.
That is why entropy is an adiabatic invariant.
Now consider the change in energy of the system during the adiabatic process. That
energy change is by definition, the work done on the system by changing X. The most
familiar example is the work done in slowly compressing a gas in an insulated container. If
the change in X is small (call it dX) we may assume the work done is small (call it dW ).
We can express the above idea in equations:
dW = (∂E/∂X)|_S dX.
In the general case of several control parameters this becomes
dW = ∑_n (∂E/∂X_n)|_S dX_n.
Let us define the conjugate variables, Y_n, to the X_n, by the formula
Y_n = −(∂E/∂X_n)|_S (4.1)
and it follows that
dW = −∑_n Y_n dX_n. (4.2)
The most familiar example of (X, Y ) is volume and pressure (V, P ).
dW = −PdV
Let us suppose that an infinitesimal adiabatic change is followed by a second process in
which energy is added to the system in the form of heat—in other words a second process
in which the control parameters are constant but the entropy changes. For this second
process dE = TdS so that the combined effect of the work (adiabatic process) and the
added heat give a change in energy,
dE = TdS − PdV
More generally
dE = T dS − ∑_n Y_n dX_n. (4.3)
This relation is called the First Law of Thermodynamics, but it is really an expression
of energy conservation. The term TdS (energy due to a change of entropy) is called heat
and is sometimes denoted dQ. But there is no function Q. Technically dQ is not an exact
differential. To see this we write 4.3 in the form
dQ = dE + Y dX.
If Q were a well defined function of E and X then
∂Q/∂E = 1
and
∂Q/∂X = Y.
Consider ∂²Q/∂X∂E. Since the order of differentiation does not matter, one finds (∂Y/∂E)|_X = 0.
For example one would find that the pressure of a gas at fixed volume does not depend on its energy.
This of course is false, so dQ cannot be an exact differential.
The meaning of all of this is that it is possible to bring a system through a series of
changes that bring it back to its original equilibrium state, in such a way that the net
input of heat is not zero. What must be zero for such a cycle is the change in the energy.
In other words
∮ (dQ + dW)
must be zero.
4.2 Processes at Fixed Temperature
We have defined the conjugate variables Y such as pressure in terms of adiabatic or
constant entropy processes. We can also define them in terms of constant temperature
(isothermal). Such processes usually mean that the system is in contact with a large heat
bath which is so big that its temperature does not change.
First we need a calculus theorem that I leave for you to prove. Suppose we have a
function S(T,X) of two variables T and X. Let there also be a second function E(T, X).
By solving for T in terms of S,X, we can think of E as a function of S and X.
Homework Problem: Prove the following identity:
(∂E/∂X)|_S = (∂E/∂X)|_T − (∂S/∂X)|_T (∂E/∂S)|_X
This identity is general, but in the case where S, T, E have their usual thermodynamic
meaning, we can use
(∂E/∂S)|_X = T
to get an expression for Y ≡ −(∂E/∂X)|_S:
(∂E/∂X)|_S = (∂E/∂X)|_T − T (∂S/∂X)|_T = ∂(E − TS)/∂X|_T.
Finally, using E − TS = A this can be written (∂A/∂X)|_T. Thus,
Y = −(∂A/∂X)|_T (4.4)
Thus we can either define conjugate variables like pressure in terms of derivatives
of E with respect to X at fixed entropy, or in terms of derivatives of A with respect
to X at fixed temperature.
Part II: Some Simple Applications
5 Dilute Gas
5.1 Ideal Gas
The ideal gas is one in which the interactions between particles can be ignored.
Consider a container of volume V containing N identical non-interacting point particles,
each with mass m. In order to apply the reasoning of Part I we must know how to define
the state of the system and how to sum over states. Begin with a single particle.
The state of a classical system is described by a point in its phase space, i.e., a value
for each generalized coordinate and each generalized momentum. Thus in the present
case a value for each of 3N coordinates restricted to the interior of the container, and 3N
momenta (unrestricted).
To sum over states we imagine replacing the phase space by a discrete collection of
small cells. The phase-space volume (with units of momentum times length, to the power 3N)
of each cell tends to zero at the end of the calculation. Quantum mechanics suggests that
we take it to be ħ^{3N}. Thus we replace the sum over states by
∑_i → ∫ d^{3N}x d^{3N}p / ħ^{3N}.
An ideal gas is one composed of non-interacting particles. The energy of a state at
point x, p is
E(x, p) = (1/2m) ∑_{n=1}^{3N} p_n² (5.1)
and the partition function is
Z(β) = (1/N!) ∫ (d^{3N}x d^{3N}p / ħ^{3N}) e^{−βE(x,p)}. (5.2)
The factor 1/N! is put in to avoid over-counting configurations of identical particles. For
example, if there are two particles, there is no difference between the configuration with particle-one
at point x, p and particle-two at x′, p′, and the configuration with particle-two at point x, p and
particle-one at x′, p′. Thus we divide by the number of equivalent configurations, which
in this case is 2 and in the general case is N!. We will use Stirling's approximation,
N! ≈ N^N e^{−N}, in evaluating the partition function.
The x integrals in 5.2 are trivial since the integrand does not depend on x. They just give
V^N. When combined with the 1/N! this gives
(eV/N)^N.
Since N/V is the particle density, call it ρ, these factors combine to give
(e/ρ)^N.
Notice how the N and V dependence nicely combine to give an expression which only
depends on the density which we will keep fixed as the number of particles tends to infinity.
The momentum integral in 5.2 is a gaussian integral over 3N variables. In fact it is
the 3N-th power of the one dimensional integral
∫ dp e^{−βp²/2m} = √(2mπ/β).
The final result for the partition function is
Z(β) = (e/ρ)^N (2mπ/β)^{3N/2} (5.3)
If we want to explicitly exhibit the dependence on volume we replace ρ by N/V:
Z(β) = (eV/N)^N (2mπ/β)^{3N/2} (5.4)
Homework Problem:
Given the partition function in 5.4, compute A, E, S, and P as functions of the temperature.
Derive the energy per particle and the ideal gas law P = ρT. What is the average
speed (magnitude of velocity) of a particle?
On a PV diagram (pressure on one axis, volume on the other) what are the curves of
constant temperature (isotherms) and constant entropy (adiabats)?
What is the relation between pressure and temperature at fixed entropy?
5.2 Almost Ideal Gas
Now let us introduce interactions between particles. The potential energy is a sum over
pairs of particles (more complicated 3-body, 4-body, ... potentials are possible but we will
ignore them).
U(x_1, x_2, ..., x_N) = ∑_{m>n} U(x_m − x_n) (5.5)
We will treat U as a small quantity and compute Z to first order.
Z(β) = (1/N!) ∫ (d^{3N}x d^{3N}p / ħ^{3N}) e^{−βE(x,p)},
where
E(x, p) = ∑_n p_n²/2m + ∑_{m>n} U(x_m − x_n).
To linear order in U,
Z = (1/N!) ∫ d^{3N}x d^{3N}p e^{−β∑p²/2m} [1 − β ∑_{m>n} U(x_m − x_n)],
which is given by
(V^N/N!) (2mπ/β)^{3N/2} − β (V^{N−2}/N!) (2mπ/β)^{3N/2} (N(N−1)/2) ∫ dx dx′ U(x − x′).
Using ∫ dx dx′ U(x − x′) = V ∫ dx U(x) ≡ V U_0, approximating N(N − 1) ≈ N², and using
Stirling's approximation, one finds
Z = (e/ρ)^N (2mπ/β)^{3N/2} [1 − (βN/2) ρ U_0] (5.6)
To calculate log Z we use the first order Taylor series for log(1 − ε), namely log(1 − ε) ≈ −ε:
log Z = −N log ρ − (3N/2) log β − (Nβ/2) ρ U_0 + const (5.7)
Homework Problems:
1) Given the partition function in 5.7, compute A, E, S, and P as functions of the
temperature. Derive the energy per particle. Find the correction to the ideal gas law,
P = ρT + .... What is the average energy of a particle?
On a PV diagram (pressure on one axis, volume on the other) what are the curves of
constant temperature (isotherms) and constant entropy (adiabats)?
2) Calculate log Z to the next order (second order) in U and find the next correction
to the ideal gas law.
5.3 Ideal Gas in a Potential
Consider a box of gas in an external potential. For example the box could be in a gravita-
tional field so that every particle has a potential energy U = mgy where y is the vertical
direction. More generally U = U(x). The value of the potential at every point can be
thought of as a control parameter. In the general case we have a continuous infinity of
control parameters. Ordinary derivatives are replaced by functional derivatives.
What is the conjugate to the potential at point x? If we adiabatically increase the
potential a tiny bit in some region, every particle in that region will have its energy
increased. So the change in energy is
δE = ∫ d³x ρ(x) δU(x).
The variational derivative of E with respect to U(x) is just ρ(x). Thus the conjugate to
U(x) is −ρ(x).
Now consider the partition function. The only difference from the ordinary ideal gas is
that the integral ∫ d³x, which gave a factor of volume for each particle, becomes
∫ d³x e^{−βU(x)}.
Thus the partition function becomes
Z(β) = (∫ d³x e^{−βU(x)})^N (2mπ/β)^{3N/2} (5.8)
The entire dependence of the free energy on U will be in the term
−TN log(∫ d³x e^{−βU(x)}).
If we want the density at point y we functionally differentiate this expression with respect
to U(y):
ρ(y) = (TNβ/∫ d³x e^{−βU(x)}) e^{−βU(y)}
or
ρ(y) = (N/∫ d³x e^{−βU(x)}) e^{−βU(y)}.
The factor N/∫ d³x e^{−βU(x)} is independent of position. It can be replaced by a normalization
constant K determined by the fact that the integral of the density must equal the total
number of particles in the box:
ρ(y) = K e^{−βU(y)}. (5.9)
Thus, as we might expect, the particles are most dense where the potential is lowest.
For a gravitational field the density as a function of height is proportional to
e^{−βmgy}.
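As a numerical illustration of 5.9 for the gravitational case, here is a sketch with roughly nitrogen-like numbers (the mass, temperature, and heights are my illustrative choices, not from the text):

```python
import math

k = 1.4e-23          # Boltzmann's constant in SI units, as quoted in the text
m = 4.7e-26          # assumed mass of an N2-like molecule, kg
g = 9.8              # gravitational acceleration, m/s^2
T = 300.0            # assumed temperature, K

def density_ratio(y):
    """rho(y)/rho(0) = e^{-m g y / k T} for an isothermal column of gas."""
    return math.exp(-m * g * y / (k * T))

for y in (0.0, 1000.0, 8000.0):
    print(f"height {y:6.0f} m: relative density {density_ratio(y):.3f}")
```

The density falls off exponentially with height, with a characteristic scale kT/mg of several kilometers for these numbers.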
5.4 Diatomic Molecular Gas
So far we have considered a gas composed of point particles. If the molecules of the gas
have structure, they are capable of internal motion such as rotation and vibration. Just
to illustrate the ideas we will consider a gas composed of diatomic molecules modeled by
two mass points connected by a rigid massless rod of fixed length. The partition function
will factorize into a translational part and an internal part.
The energy of the molecule consists of its overall translational energy p²/2m, where p
represents the linear momentum of the molecule. In addition there is rotational energy.
Let's suppose the rod is oriented in space with angular coordinates u, v where 0 < u < π
and 0 < v < 2π. You can think of u as the polar angle, and v as the azimuthal angle. The
rotational kinetic energy is
E_rot = (I/2)(u̇² + v̇² sin²u) (5.10)
The canonical momenta are
p_u = I u̇,
p_v = I v̇ sin²u,
and the energy (Hamiltonian) is
E_rot = p_u²/2I + p_v²/(2I sin²u) (5.11)
The integral ∫ d³x d³p e^{−βp²/2m} (there is one such factor for each molecule) is the same
as for the point particle.
The internal factor for each molecule is
∫ du dv dp_u dp_v e^{−βE_rot}. (5.12)
First do the momentum integrals and obtain
(2πI/β) ∫ du dv sin u = 8π²I/β.
The constant 8π²I is of no importance and we can drop it. The important thing is that
there is a factor 1/β for each molecule. This means that the partition function in 5.3 is
changed by replacing β^{−3N/2} by β^{−5N/2}. The effect is to change the energy per molecule
from (3/2)T to (5/2)T.
In general we get an additional energy of (1/2)T for each internal degree of freedom of the
molecule.
This leads to a famous paradox: as the distance between the parts of the molecule tends
to zero, one would think the molecule tends to the point particle case. But the added
internal energy does not depend on the radius. The resolution of the paradox involves
quantum mechanics.
Homework Problem:
Calculate the thermodynamic properties of the diatomic gas. In particular find the
adiabats on the PV diagram. Generalize the result to the case of n internal degrees of
freedom.
5.5 A Note about Volume and the van der Waals Gas
Consider the factor V^N/N! in the ideal gas partition function. Using Stirling we can write
this as

\left(\frac{V}{N}\right)^N
One way to interpret this is to say that each particle moves in a box of size V/N .
Now let us suppose each particle is not really a point but rather an impenetrable sphere
of volume v. In that case the volume over which the center of mass can move is smaller
and of order V/N − v. The partition function contains the factor

\left(\frac{V}{N} - v\right)^N
If we combine the idea of an impenetrable spherical particle with the corrections due
to a long range weak potential, equation 5.6 becomes
Z = \left(\frac{V}{N} - v\right)^N\left(\frac{2\pi m}{\beta}\right)^{3N/2}\left[1 - \frac{\beta N}{2}\rho U_0\right] \qquad (5.13)
and the logarithm of the partition function is

\log Z = N\log\left(\frac{V}{N} - v\right) - \frac{3N}{2}\log\beta - \frac{N\beta}{2}\,\frac{N}{V}\,U_0 + \text{const} \qquad (5.14)
Calculating the pressure and rearranging the equation gives the van der Waals equation,

\left(P - \frac{U_0}{2}\rho^2\right)(V - Nv) = NT \qquad (5.15)
6 Simple Magnet and Magnetization
Consider a collection of N spins σ_1, σ_2, ..., σ_N, each of which can be up or down along the z
axis (I will use the notation σ to indicate the z component of spin). There is a magnetic
field H oriented along z and the energy of each spin is σµH. Note that in this case the
magnetic field is a control parameter.
Let the number of up spins be n and the number of down spins be m where n+m = N .
The energy of such a configuration is
E = (n−m)Hµ
and the number of such configurations is
\frac{N!}{n!\,m!} = \frac{N!}{n!\,(N-n)!}.
The partition function is
Z(\beta, H) = \sum_n \frac{N!}{n!\,(N-n)!}\, e^{-\beta(n-m)H\mu}.
I have written the partition function in terms of β and H in order to indicate its dependence
on temperature and the control parameter.
The sum is a binomial expansion and the result is
Z(\beta, H) = \left(e^{\beta H\mu} + e^{-\beta H\mu}\right)^N = (2\cosh\beta H\mu)^N. \qquad (6.1)

and the free energy A is

A(\beta, H) = -NT\log(2\cosh\beta H\mu). \qquad (6.2)

The conjugate variable to the control parameter H is the magnetization M.

M = -\left(\frac{\partial A}{\partial H}\right)\Big|_T. \qquad (6.3)

Using 6.2, we find that 6.3 gives

M = N\mu\tanh\beta H\mu.
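For a handful of spins, 6.1 and the average spin can be confirmed by brute-force enumeration. A sketch (the parameter values are illustrative); with the energy convention E = σµH used above, the average of Σσ comes out to −N tanh βHµ:

```python
import itertools
import math

N, beta, mu, H = 6, 0.7, 1.0, 0.3   # illustrative values

# Sum over all 2^N configurations with E = mu*H*sum(sigma)
Z = 0.0
sigma_sum = 0.0
for spins in itertools.product([1, -1], repeat=N):
    w = math.exp(-beta * mu * H * sum(spins))
    Z += w
    sigma_sum += sum(spins) * w
sigma_sum /= Z

print(Z, (2 * math.cosh(beta * mu * H)) ** N)    # equation 6.1
print(sigma_sum, -N * math.tanh(beta * mu * H))  # average of sum(sigma)
```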
Homework Problems:
1) Compute energy and entropy of the simple magnet. Find the functional form of the
energy in terms of entropy. Do you see anything strange?
2) Show that M is the average total magnetic moment of the assembly of spins.
6.1 Ising Model
The so called Ising magnet is equivalent to the simple magnet we just studied. Think of a
one dimensional line of N +1 spins, each interacting with its two nearest neighbors except
for the end-spins which have only a single neighbor. The system is defined by its energy.
E = -j\sum_{i=1}^{N}\sigma_i\sigma_{i+1}. \qquad (6.4)
The constant j is the strength of interaction between the neighboring spins. At this point
we will not introduce an external field although we could, but that makes the problem
harder. Note that I have chosen the sign so that for positive j the energy is minimum
when the spins are aligned. This is the ferromagnetic case. The anti-ferromagnetic case is
defined by choosing j to be negative.
The trick is to write the partition function as the sum of two terms. The first term
contains all configurations in which spin-one is up. To calculate this term let's change
variables. Define “dual” spin variables µi. (These µ′s are not magnetic moments. There
should be no confusion; a µ without an index is a magnetic moment. With an index it is
a dual spin variable. )
µi = σiσi+1.
There are N µi-variables labeled µ1, µ2, µ3, ...µN . They are all independent and determine
the original spins by the transformation
\sigma_j = \prod_{i=1}^{j-1}\mu_i.
For example, in the sector with the first spin up,
σ1 = 1
σ2 = µ1
σ3 = µ1µ2
and so on. Note also that the µi take on the values ±1 just as the original spins.
The partition function in this sector is given by
Z = \sum \exp\left(\beta j\sum\mu_i\right).
This is exactly the same as the earlier model with the following substitutions:
Hµ → −j
23
and
σi → µi.
Thus the partition function in this sector is
Z = (2 cosh βj)N
Finally we have to add the sector with σ1 = −1. But the entire problem is symmetric
with respect to changing the sign of all the spins simultaneously. Therefore the contribution
of the second sector is the same as the first, and we find,
Z = 2(2 cosh βj)N (6.5)
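Equation 6.5 is exact and can be confirmed by summing over all configurations of a short open chain. A minimal sketch (the chain length and coupling are illustrative):

```python
import itertools
import math

N_bonds, beta, j = 8, 0.6, 1.0   # chain of N_bonds + 1 spins with free ends

Z = 0.0
for spins in itertools.product([1, -1], repeat=N_bonds + 1):
    E = -j * sum(spins[i] * spins[i + 1] for i in range(N_bonds))
    Z += math.exp(-beta * E)

print(Z, 2 * (2 * math.cosh(beta * j)) ** N_bonds)  # should agree exactly
```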
Homework Problem:
Add a term to 6.4 corresponding to an external magnetic field. In other words replace
6.4 by

E = -j\sum_{i=1}^{N}\sigma_i\sigma_{i+1} + \sum_{i=1}^{N+1}\mu H\sigma_i. \qquad (6.6)
Compute the magnetization, to linear order in the magnetic field. Note what happens
in the limits of small and large temperature. Can you explain the behavior?
7 The Maxwell Relations
Unlike heat, the total energy of a system is a well defined function of the thermodynamic
equilibrium state. This means it is a function of all the control variables and one other
variable such as the temperature or the entropy. For simplicity we will work with a single
control variable, the volume, and freeze all the others.
The equation
dE = TdS − PdV
suggests that we think of E as a function of S and V . We can then write
\frac{\partial E}{\partial S} = T, \qquad \frac{\partial E}{\partial V} = -P \qquad (7.1)
Now from the fact that \frac{\partial^2 E}{\partial S\,\partial V} = \frac{\partial^2 E}{\partial V\,\partial S} we derive the first Maxwell Relation.

\frac{\partial T}{\partial V}\Big|_S = -\frac{\partial P}{\partial S}\Big|_V \qquad (7.2)
Figure 1 illustrates a thought experiment to confirm the first Maxwell Relation.
Equation 7.2 is a remarkable bit of magic. With very little input we derive the general
fact that for all systems the change in temperature when we adiabatically change the
volume is the negative of the change in pressure when we add a bit of entropy at fixed
volume. It is very general and applies to solids, liquids, gases, black body radiation...
We can derive additional relations by focusing on the Helmholtz free energy A = E − TS.
dA = dE − TdS − SdT = (TdS − PdV )− TdS − SdT
or
dA = −SdT − PdV. (7.3)
We think of A as a function of T and V . Thus
\frac{\partial A}{\partial T} = -S, \qquad \frac{\partial A}{\partial V} = -P
and by using the fact that derivatives commute, we derive the second Maxwell relation.
\frac{\partial S}{\partial V}\Big|_T = \frac{\partial P}{\partial T}\Big|_V \qquad (7.4)
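The relation 7.4 can be checked numerically for a concrete system. A sketch using the ideal gas free energy A = −NT(log V + (3/2) log T), with additive constants dropped (this particular form of A is an assumption made for illustration):

```python
import math

N = 1.0

def A(T, V):
    # ideal gas Helmholtz free energy, up to additive constants
    return -N * T * (math.log(V) + 1.5 * math.log(T))

eps = 1e-4

def S(T, V):
    return -(A(T + eps, V) - A(T - eps, V)) / (2 * eps)   # S = -dA/dT

def P(T, V):
    return -(A(T, V + eps) - A(T, V - eps)) / (2 * eps)   # P = -dA/dV

T0, V0 = 2.0, 3.0
dS_dV = (S(T0, V0 + eps) - S(T0, V0 - eps)) / (2 * eps)
dP_dT = (P(T0 + eps, V0) - P(T0 - eps, V0)) / (2 * eps)
print(dS_dV, dP_dT)  # both sides of the second Maxwell relation
```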
The first Maxwell Relation follows from considering the energy as a function of the
pair (S, V ), and the second Maxwell Relation from A(T, V ). Two more relations can be
obtained from functions H(S, P ) (Enthalpy) and G(P, T ) (Gibbs free energy).
H = E + PV
dH = TdS + V dP (7.5)
Enthalpy is useful when considering a process at fixed pressure. For example, a gas in a
cylinder with a piston such that the pressure is just the weight of the piston. The change
in enthalpy if you add some heat is just the heat added (TdS). If we call the heat added
Q, then ∆E + P∆V = Q.
The corresponding Maxwell relation is

\frac{\partial T}{\partial P}\Big|_S = \frac{\partial V}{\partial S}\Big|_P. \qquad (7.6)
Figure 1: Experiment Illustrating First Maxwell Relation: On the left we have an insulating cylinder with a movable piston that can be used to change the volume. We slowly move the piston and change the volume by dV. On the right we have a heat-conducting box of fixed volume and at temperature T. We allow an amount of heat dQ into the box and the pressure changes by dP.
Finally the Gibbs free energy is defined by
G = H − TS
dG = V dP − SdT (7.7)
and the last Maxwell relation is

\frac{\partial V}{\partial T}\Big|_P = -\frac{\partial S}{\partial P}\Big|_T. \qquad (7.8)
When using the Maxwell relations we can replace a differential change in entropy by dQ/T
where dQ is the added heat-energy.
Homework:
Design experiments for confirming the second, third, and fourth Maxwell Relations similar to the one shown in the figure.
8 The Second Law
Let us simplify the story a bit by considering an initial probability distribution on phase
space which is constant over some small blob, and zero outside the blob. In this case the
entropy can be taken to be the logarithm of the phase space volume of the blob. In some
sense it increases with time, but that seems to violate basic principles of mechanics.
The Second Law has an interesting history. Boltzmann originally tried to define a
quantity called H that always increased, dH/dt ≥ 0, along a trajectory in phase space. But
Loschmidt said that mechanics is reversible, so if H increases along a trajectory it will
decrease along the time reversed trajectory. Since the time reversed trajectory is also a
solution to Newton’s equations, it is not possible to have a function on (p,x) that always
increases.
The answer is chaos and coarse graining.
Chaos is the fact that for most systems the orbits in phase space are unstable. An
example is a frictionless billiard system (ignore the pockets). Start with the 15 numbered
balls all arranged in a tight stack and carefully aim the Q-ball. If the initial conditions
are perfectly reproducible then every time you do it the result will be the same no matter
how long you follow the orbits. But the tiniest error, either in the way the numbered balls
are stacked, or in the initial trajectory of the Q-ball will exponentially grow, so that after
just a couple of collisions the outcome will be extremely different.
Figure 2: The Second Law. The phase space volume and topology are exactly conserved but when coarse grained it grows.
This is a generic phenomenon in mechanics once the number of coordinates is larger
than one. It is characterized by a quantity called the Lyapunov exponent. Let the distance
in phase space of two initial points be ε. After a time t the distance between them will
be of order ε e^{λt} where λ is the (largest) Lyapunov exponent. It is easy to prove that the Lyapunov
exponent must be greater than or equal to zero. Zero is a very exceptional value and for
almost all systems with more than one degree of freedom, it is positive. Then the system
is chaotic.
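The exponent is easy to estimate numerically. The sketch below uses the logistic map x → rx(1 − x), which is not a Hamiltonian system but is the standard toy example of chaos, with the usual estimator λ ≈ (1/n)Σ log |f′(xₙ)|. For r = 4 the exact value is log 2 ≈ 0.693:

```python
import math

r = 4.0        # fully chaotic logistic map
x = 0.3        # generic initial condition
steps = 100000

lam = 0.0
for _ in range(steps):
    lam += math.log(abs(r * (1 - 2 * x)))  # log |f'(x)| along the orbit
    x = r * x * (1 - x)
lam /= steps
print(lam)  # should be near log(2) ~ 0.693
```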
To see what chaos implies, imagine following a patch of phase space—say at a fixed
energy—as it evolves. The Liouville theorem says that the volume and topology of the
region does not change, but in general the shape does change. If the system is chaotic,
then points that were initially close will soon diverge. The shape branches out and forms
a crazy fractal made of thinner and thinner tendrils spreading out over the phase space as
in Figure 2.
Eventually the tendrils will get arbitrarily close to any point on the energy surface of
the initial points.
Suppose we take into account our inability to resolve points of phase space with arbi-
trary precision. In other words we introduce a cutoff, replacing each phase space point by
a small but finite sphere. We can coarse grain the phase space blob by drawing such a
sphere over each point in the original blob. Assuming the original blob is bigger than the
cutoff, then at first, coarse graining has little effect. But as the tendrils spread, the coarse
grained version fills a larger and larger volume, even though the fine grained blob has a
fixed volume. Eventually the coarse grained blob will fill the entire energy surface. Liouville's
theorem says nothing that prevents the coarse grained volume from increasing. Of course
the coarse grained volume cannot be smaller than the fine grained, so the coarse grained
volume cannot decrease.
Now come back to Loschmidt. After a long time imagine time reversing every particle
(this means reversing its momentum). If we do this for every point in the fine grained
blob, it will trace back to the original small round blob. Here is an example: take all the
air in the room and start it out in a very small volume in the corner of the room. In a
short amount of time the molecules will spread out over the volume and fill the room. Can
the opposite happen? No, because it violates the second law; the entropy of a confined
gas is less than that of a gas that fills the room. But if we time reverse every molecule of
the final state, the air WILL rush back to the corner.
The problem is that if we make a tiny error in the motion of just a single molecule,
that error will exponentially grow with the Lyapunov exponent and instead of going off
to the corner, the air will continue to fill the room. We see that in the second half of Figure
2. Some time reversed trajectories (blue lines) lead back to the original blob. Most don't.
Even trajectories that start very close to one which does go back to the blob quickly depart
and go somewhere else. In fact if we run the coarse grained blob backward in time (or
forward) and then coarse grain the result, the phase space volume will be even bigger.
Freak accidents do happen, if you wait long enough. Given enough time the air in the
room will by accident congregate in the corner. The correct statement is not that unlikely
things never happen, but only that they rarely happen. The time that you would have to
wait for the unusual air-event to take place is exponential in the number of molecules.
There are two kinds of entropies, fine grained and coarse grained. The fine grained
entropy never changes because of Liouville’s theorem. The coarse grained entropy does
increase.
Homework Problem: Suppose the room has a volume of 100 cubic meters. Consider
the possibility that all the air accidentally accumulates in a one cubic meter volume in the
corner. What is the probability for this to happen? How long would you expect to have
to wait to see it happen?
9 Quantum Considerations
There are a number of paradoxes in statistical mechanics that were only resolved by the
introduction of quantum mechanics. We have already seen one. The energy per molecule
of an ideal gas depends on the molecular structure: for point molecules it is 3T/2. For
the diatomic molecule with fixed separation it is 5T/2. But as the diatomic molecule gets
smaller and smaller, it should tend to the point molecule. It does not, at least according to
classical (non quantum) stat mech. Another example is the infinite energy of black body
radiation.
Let’s begin with a simple example. Suppose there is a single Harmonic Oscillator in
equilibrium with a heat bath. We can think of the HO as a molecule if we ignore its
translational degrees of freedom. The Hamiltonian is
E = \frac{p^2}{2m} + \frac{kx^2}{2} \qquad (9.1)
If we think of the HO as a molecule then p and x are not the center of mass position and
momentum, but rather the relative variables. The mass m would be the reduced mass of
the system. The partition function is
Z = \int dx\, dp\, e^{-\beta\left(\frac{p^2}{2m} + \frac{kx^2}{2}\right)}

or

Z = \int dp\, e^{-\beta p^2/2m} \int dx\, e^{-\beta kx^2/2}

The integrals are gaussian, and using \omega = \sqrt{k/m}, give

Z = \sqrt{\frac{2\pi m}{\beta}}\sqrt{\frac{2\pi}{k\beta}} = \frac{2\pi}{\omega}\,\beta^{-1}. \qquad (9.2)
An easy calculation gives the average energy of the oscillator to be
E = T. (9.3)
Two things to notice about this formula. First is that it is independent of m and especially
k. Second, in the molecular interpretation it is the internal energy of the molecule and
would be added to the usual 3T/2.
What is strange is that even in the limit k →∞ the internal energy does not go away.
One might have expected that when k is very large the molecule should be indistinguishable
from a point molecule which has no internal energy.
Let us redo the calculation taking account of quantum mechanics. All we really need
from QM is the fact that the energy spectrum is

E_n = \hbar\omega n \qquad (9.4)

where n = 0, 1, 2, \ldots and \omega = \sqrt{k/m}.

The partition function is

Z = \sum_{n=0}^{\infty} e^{-\beta\hbar\omega n}.
The sum is a geometric series:

Z = \frac{1}{1 - e^{-\beta\hbar\omega}} = \frac{e^{\beta\hbar\omega}}{e^{\beta\hbar\omega} - 1} \qquad (9.5)
Finally, the average energy is

E = \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1} \qquad (9.6)
Note that at high temperature (small β) the answer agrees with the classical answer
E = T . But for low temperature the energy is exponentially small:
E \approx \hbar\omega\, e^{-\hbar\omega/T}. \qquad (9.7)
Notice that in the limit k → ∞ or equivalently ω → ∞ the energy of the oscillator
goes to zero. In other words the HO molecule does behave like the classical point particle
which has no internal energy. Only when the temperature becomes large enough that the
classical internal energy T becomes comparable to a single quantum does the molecule
reveal its structure.
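Equation 9.6 and its limits are easy to explore numerically (units with ℏ = 1; the parameter values are illustrative):

```python
import math

hbar = 1.0

def E_quantum(T, omega):
    # average energy of a quantum oscillator, equation 9.6
    b = 1.0 / T
    return hbar * omega / (math.exp(b * hbar * omega) - 1.0)

print(E_quantum(100.0, 1.0))  # high T: close to the classical E = T
print(E_quantum(0.1, 1.0))    # low T: exponentially small
print(E_quantum(1.0, 50.0))   # stiff spring (large omega): negligible
```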
Homework:
1) Compute the free energy and entropy of the classical and quantum oscillators. How
do they compare for high and low temperature? Where is the transition from quantum to
classical?
2) In section 5.4 the energy for the diatomic molecule could have been written

E = \frac{L^2}{2I}

In quantum mechanics L^2 takes on the values n(n+1)\hbar^2 and each energy level has degeneracy
2n + 1. The partition function for a single molecule can be written as an infinite
sum over n. Estimate the sum in the limit of large and small temperature. Show that at
high temperature the classical diatomic result is reproduced for the energy. Show that at
low temperature the energy per particle tends to the point molecule limit.
10 Thermal Radiation
Consider a box (cube) with sides of length L. The walls of the box are reflecting. Electro-
magnetic radiation can exist in the box. The radiation can be decomposed into standing
waves with the form
F(x, y, z) = \sum_{n_x, n_y, n_z} X(n_x, n_y, n_z)\,\sin\frac{n_x\pi x}{L}\,\sin\frac{n_y\pi y}{L}\,\sin\frac{n_z\pi z}{L} \qquad (10.1)
The dynamical degrees of freedom are the amplitudes X(nx, ny, nz) which behave like
harmonic oscillators of frequency

\omega_n = \frac{n\pi c}{L} \qquad (10.2)

where n \equiv \sqrt{n_x^2 + n_y^2 + n_z^2}.
The partition function is an infinite product of harmonic oscillator partition functions,
one for each mode of oscillation. Using the quantum result 9.5,

Z = \prod_{n_x, n_y, n_z} \frac{e^{\beta\hbar\omega_n}}{e^{\beta\hbar\omega_n} - 1}
Let us consider the energy of the system. Since log Z is a sum over the modes, the
energy is just the sum of the energy of the oscillators. As we have seen, a classical oscillator
has an energy equal to T = β−1, independently of its frequency. Therefore each mode has
energy T and the total energy is infinite. Most of the energy is in very short wavelengths
(large n) and the infinite result is called “The Ultraviolet Catastrophe.”
The resolution of the catastrophe is quantum mechanics. As we have seen, the energy
stored in a quantum oscillator is much smaller than the classical answer when the frequency
of the oscillator is large. Using equation 9.6 we find the total energy is the sum

E = \sum_{n_x, n_y, n_z} \frac{\hbar\omega_n}{e^{\beta\hbar\omega_n} - 1} \qquad (10.3)
which converges.
When the volume of the box is large the neighboring values of the \omega_n are very close and
the sum can be approximated as an integral. Let us define the wave vector k by k_x = n_x\pi/L
and similarly for y and z. The difference between neighboring wave vectors is

\Delta k_x = \frac{\pi}{L}

and the sum is replaced by

\sum_{n_x, n_y, n_z} \rightarrow \frac{L^3}{\pi^3}\int d^3k
We also note that \omega_n = ck so that 10.3 becomes

E = \frac{L^3}{\pi^3}\int d^3k\, \frac{c\hbar k}{e^{\beta\hbar ck} - 1}

Multiplying by T\beta = 1 gives

E = \frac{L^3}{\pi^3}\, T \int d^3k\, \frac{\beta c\hbar k}{e^{\beta\hbar ck} - 1}

Changing integration variables we get

E = \frac{L^3}{(\pi c\hbar)^3}\, T^4 \int d^3u\, \frac{|u|}{e^{|u|} - 1}
I have done the calculation as if the field were a scalar with only one component. In
fact the field has two components corresponding to the two possible polarizations. The
net effect is that there are twice as many oscillators and the total energy is twice the above
value. We can also simplify the integral by writing

\int d^3u = 4\pi\int u^2\, du.
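After writing ∫d³u = 4π∫u² du, the remaining dimensionless integral ∫₀^∞ u³/(eᵘ − 1) du has the known value π⁴/15. A quick numerical check with a simple trapezoidal rule:

```python
import math

def integrand(u):
    # u^3/(e^u - 1); expm1 avoids loss of precision near u = 0
    return u ** 3 / math.expm1(u) if u > 0 else 0.0

a, b, n = 0.0, 50.0, 200000   # the tail beyond u = 50 is negligible
h = (b - a) / n
I = (sum(integrand(a + k * h) for k in range(1, n))
     + 0.5 * (integrand(a) + integrand(b))) * h
print(I, math.pi ** 4 / 15)  # should agree to several digits
```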
Homework: Compute A, S, and P for thermal radiation. Work out all integrals.
11 Chemical Potential
For most of the rest of this quarter we are going to study three topics: magnets;
the classical liquid-gas phase transition; and Bose Einstein condensation. For all three
applications we will need the concept of chemical potential.
The chemical potential is a useful tool for studying systems in which the number
of particles fluctuates. Roughly speaking it is related to particle number the same way
temperature is related to energy: as a Lagrange multiplier.
Let’s return to the derivation of the Boltzmann distribution in Section 2. Let’s suppose
that there are some conserved quantities in addition to energy. For example suppose the
system is a container of helium. The total number of helium atoms is conserved (assuming
that the temperature is not high enough to break up the atoms).
There are situations in which the number of particles in each subsystem may fluctuate
even though the total is fixed. For example, in considering the ensemble of N copies, we
may allow helium atoms to be transferred from one subsystem to another. In that case
only the total number of atoms is conserved. Suppose the number of atoms in the nth copy
is called Nn. The constraint takes the form
\sum_n N_n = N_{total}.
A configuration of the (sub)system is now specified by an energy level and a number of atoms N,
and the probability function depends on N as well as on the energy level. To implement the
constraint of fixed Ntotal we introduce a new Lagrange multiplier. The standard convention
is to include a factor of β in the definition of the LM. The LM is βµ where µ is called the
chemical potential.
Following the same logic as in Section 2, it is easy to see that the probability function
takes the form
P (i, N) = Z−1e−β(Ei+µN).
In this formula, the label i refers to the energy levels of the N particle system and in
general these levels will depend on N .
The partition function is defined so that the total probability is unity. Thus
Z(\beta, \mu) = \sum_N e^{-\beta\mu N} Z_N \qquad (11.1)
where ZN is the partition function with fixed number of particles N .
Recall that in the case of energy, the Lagrange multiplier β is determined so as to
fix the average energy. In the same way the chemical potential is determined to fix the
average number of atoms. It is easy to see that
N = \frac{\partial A}{\partial \mu} \qquad (11.2)
where as usual, A = −T log Z.
Since N = V ρ we can also think of the chemical potential as the parameter that allows
us to vary the particle density.
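Here is a toy check of this machinery, using the convention P(i, N) ∝ e^{−β(Eᵢ+µN)} adopted above. I take Z_N = z^N/N! (non-interacting particles, z being the single-particle partition function; the numerical values are illustrative), so the sum 11.1 can be done explicitly and ⟨N⟩ can be compared with the Lagrange-multiplier formula −T ∂ log Z/∂µ:

```python
import math

beta, mu, z = 1.0, 0.5, 3.0   # illustrative values
T = 1.0 / beta

def grand_Z(m_):
    # Z(beta, mu) = sum_N e^{-beta*mu*N} z^N/N! (truncated; terms fall fast)
    return sum((z * math.exp(-beta * m_)) ** n / math.factorial(n)
               for n in range(60))

Z = grand_Z(mu)
N_avg = sum(n * (z * math.exp(-beta * mu)) ** n / math.factorial(n)
            for n in range(60)) / Z

eps = 1e-6
N_from_logZ = -T * (math.log(grand_Z(mu + eps))
                    - math.log(grand_Z(mu - eps))) / (2 * eps)

print(N_avg, N_from_logZ)  # both equal z * e^{-beta*mu}
```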
Homework: 1) Prove the following relations for processes at fixed volume and other
control parameters:
E = TS − µN − T log Z
dE = TdS − µdN
2) Find an expression for the fluctuation of the number of particles analogous to the
relation in Section 3.
12 Back to Magnets
(Please note that I have made a change in Section 6. I changed the definition of the
constant j so that for positive j the energy is lowest in the ferromagnetic case.)
Let’s return to the Ising model in an external magnetic field and compute the magne-
tization to lowest order in the field. To simplify the notation I will set µH = h (There
are too many things called µ!) We will mostly be interested in the Ferromagnetic case
with j being positive. In this case the energy is lowest when the spins line up in the same
direction.
E = -j\sum_{i=1}^{N}\sigma_i\sigma_{i+1} + \sum_{i=1}^{N+1} h\sigma_i.
The magnetization (eq 6.3) M is given in terms of the derivative of log Z with respect
to h. For simplicity we replace 6.3 by
M = -\left(\frac{\partial A}{\partial h}\right) \qquad (12.1)
To compute M to order h we must compute log Z to order h2. Thus we write
Z = \sum_{\sigma} e^{\beta\left(j\sum_{i=1}^{N}\sigma_i\sigma_{i+1} - \sum_{i=1}^{N+1} h\sigma_i\right)}
and expand to order h2.
Z = \sum_{\sigma} e^{\beta j\sum_{i=1}^{N}\sigma_i\sigma_{i+1}}\left(1 - \beta h\sum_i\sigma_i + \frac{1}{2}\beta^2 h^2\sum_{ij}\sigma_i\sigma_j\right) \qquad (12.2)
It is easy to see that the term proportional to h vanishes. The original Ising model is
symmetric with respect to changing the sign of all spins and that insures that only even
powers of h survive. To order h^2 the logarithm of Z is given by

\log Z = \log Z_I + \frac{\beta^2 h^2}{2} Z_I^{-1}\sum_{\sigma} e^{\beta j\sum_{i=1}^{N}\sigma_i\sigma_{i+1}}\sum_{ij}\sigma_i\sigma_j. \qquad (12.3)
where ZI is the Ising partition function given by 6.5.
For i = j the term \sigma_i\sigma_j = 1. Thus we may write

\sum_{ij}\sigma_i\sigma_j = N + 2\sum_{i>j}\sigma_i\sigma_j

or

\sum_{ij}\sigma_i\sigma_j = N + 2\sum_{i,m}\sigma_i\sigma_{i+m}
where the m-sum starts at m = 1.
Now let us change variables to the dual variables \mu_i = \sigma_i\sigma_{i+1}. The term in 12.3 quadratic
in h is given by

\frac{\beta^2 h^2}{2Z_I}\sum_{\mu} e^{\beta j\sum\mu_i}\left(N + 2\sum_{i,m}\sigma_i\sigma_{i+m}\right). \qquad (12.4)
Now consider \sigma_i\sigma_{i+m}. In terms of the \mu_i it is given by

\sigma_i\sigma_{i+m} = \mu_i\mu_{i+1}\cdots\mu_{i+m-1}

Thus the second term in 12.4 gives

\frac{\beta^2 h^2}{(2\cosh\beta j)^N}\sum_{\mu} e^{\beta j\sum\mu_i}\sum_{i,m}\mu_i\mu_{i+1}\cdots\mu_{i+m-1}. \qquad (12.5)
where I have replaced ZI by the value we computed in Section 6.
For those µ outside the range (i, i + m − 1) the sum over the values of µ just cancels the
corresponding factors of (2 cosh βj) in the denominator. For the points inside the range,
the sum simply replaces (2 cosh βj) by (2 sinh βj). Thus the net result for large N is
N\beta^2 h^2\left(\frac{1}{2} + \sum_{m=1}^{\infty}(\tanh\beta j)^m\right) = N\beta^2 h^2\left(\frac{\tanh\beta j}{1 - \tanh\beta j} + \frac{1}{2}\right)
Figure 3: Magnetization vs external Magnetic field.
or more simply

\frac{N\beta^2 h^2}{2}\, e^{2\beta j}
The magnetization (−∂_h A) is

M = N\beta h\, e^{2\beta j}. \qquad (12.6)
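This small-h behavior can be checked against the exact solution of the chain in a field, obtained with the transfer matrix method (a standard technique not developed in these notes). A sketch comparing the per-spin magnetization at small h with βh e^{2βj}:

```python
import math

beta, j = 1.0, 0.8   # illustrative values
T = 1.0 / beta

def lam_max(h):
    # largest eigenvalue of the 2x2 transfer matrix for
    # E = -j*sum(s_i s_{i+1}) + h*sum(s_i)
    a = math.exp(beta * (j - h))   # (+1,+1) entry
    c = math.exp(beta * (j + h))   # (-1,-1) entry
    b = math.exp(-beta * j)        # off-diagonal entries
    return 0.5 * (a + c) + math.sqrt(0.25 * (a - c) ** 2 + b * b)

h, eps = 1e-4, 1e-6
# for a long chain, M/N = T d(log lambda_max)/dh
m_per_spin = T * (math.log(lam_max(h + eps))
                  - math.log(lam_max(h - eps))) / (2 * eps)
print(m_per_spin, beta * h * math.exp(2 * beta * j))  # agree for small h
```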
First of all, notice that at all finite temperatures the magnetization is linear and continuous
in h for small h. This means that there is no spontaneous magnetization when h
is shut off. But at zero temperature the coefficient e2βj diverges. The magnetization does
not diverge at zero temperature but its derivative becomes infinite. At finite temperature
the graph of M versus h looks like Figure 3a while for zero temperature it looks like 3b.
In other words as you shut off the external field the magnetization persists, but this
phenomenon only occurs at zero temperature for the Ising model. We will see that in
higher dimensions the spontaneous magnetization persists up to a critical temperature.
The Ising model is characteristic of all one-dimensional statistical systems with short
range interactions. There is no phase transition at finite temperature.
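In two dimensions the transition can be seen directly in a few lines of Monte Carlo. Below is a minimal Metropolis sketch (an algorithm not described in these notes; the lattice size, temperature, and sweep count are illustrative), showing that well below the 2-D critical temperature T_c ≈ 2.27 j an initially magnetized lattice stays magnetized:

```python
import math
import random

random.seed(1)
L, j, T = 10, 1.0, 1.5                # T well below T_c ~ 2.27 j
beta = 1.0 / T
spins = [[1] * L for _ in range(L)]   # start fully magnetized

for sweep in range(2000):
    for _ in range(L * L):
        x, y = random.randrange(L), random.randrange(L)
        nn = (spins[(x + 1) % L][y] + spins[(x - 1) % L][y]
              + spins[x][(y + 1) % L] + spins[x][(y - 1) % L])
        dE = 2 * j * spins[x][y] * nn   # energy cost of flipping this spin
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[x][y] *= -1

m = abs(sum(sum(row) for row in spins)) / (L * L)
print(m)  # stays close to 1 well below T_c
```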
I made a historical mistake in class. Ising correctly solved the 1-dimensional system
and correctly told his advisor Lenz that it had no transition at finite T . His mistake was
to conclude that the corresponding model in higher dimensions also had no transition.
12.1 Connection with Fluids
One of the interesting connections that we will explore is the connection between magnetic
systems and the liquid gas transition in molecular systems like water. Imagine a lattice
approximation to space in any number of dimensions. At each point in the lattice there
may either be a particle or not. If all sites are vacant we call the state the vacuum.
Let us introduce a variable at each point that indicates whether or not a particle is
present. Call it σi where i indicates the site. If σ = −1 we say the site is vacant. If
σ = +1 we say a particle occupies the site. In this way we map the problem of a system
of identical molecules to a magnetic system of the type we have studied.
Notice that any site can have at most one particle. This means that the particles have
an infinite hard core repulsion that forbids particles from being closer than a lattice spacing.
Now let us introduce an energy function of the type studied by Ising. Only neighboring
sites interact and the interaction has the form

-j\sum^{*}\sigma_i\sigma_j

where the symbol \sum^{*} means a sum over neighboring sites on the d-dimensional lattice. We take j > 0. If two
particles are far apart the interaction energy is zero but if they are on neighboring sites
the energy is negative. This corresponds to an attractive short range potential. For the
moment we are ignoring the kinetic energy of the particles.
The partition function is the d-dimensional analog of the Ising model.

Z = \sum_{\sigma}\exp\left(\beta j\sum^{*}\sigma_i\sigma_j\right)
So far we have not constrained the number of particles. If particle number is conserved
then we can and should introduce a chemical potential µ which we will use to tune the
number of particles.
The number of particles is given by

N = \sum_i \frac{\sigma_i + 1}{2}
and the partition function becomes

Z = \sum_{\sigma}\exp\left(\beta j\sum^{*}\sigma_i\sigma_j - \beta\mu\sum_i\frac{\sigma_i + 1}{2}\right)
The term \beta\mu\sum_i \frac{1}{2} is an additive constant with no significance and may be dropped, in which case the relevant
physics is described by

Z = \sum_{\sigma}\exp\left(\beta j\sum^{*}\sigma_i\sigma_j - \frac{\beta\mu}{2}\sum_i\sigma_i\right)
Obviously this is the d-dimensional analog of the Ising model in the presence of an external
magnetic field h = µ/2.
Let us consider the magnetization M . It is simply the total number of sites times the
average value of σ. Thus we may identify the magnetization per site (average of σ) as
2ρ− 1.
Let us suppose that a magnetic transition takes place so that at h = 0 there is a jump in
magnetization. In the Ising model this only happens at T = 0 but in higher dimensions it
happens at all temperatures up to some critical temperature T ∗ that depends on dimension.
What does this jump mean from the point of view of the molecular system? It is a sudden
jump in density as the chemical potential is varied. In fact it is the transition from a
gaseous phase to a more dense liquid phase. The phase diagram is shown in Figure 4.
Notice that there are two ways to pass from one phase to another. You can cross the
discontinuity (blue line) or go around it. When you cross it the magnetization (density)
jumps in what is called a first order transition. Going around the critical point involves
no jump and the behavior is continuous. This is characteristic of both magnetic systems
and gas-liquid systems such as water.
One interesting point is that in 1-dimension there is no gas-liquid phase transition.
12.2 Ising-like Systems in Higher Dimensions
Figure 4: Phase diagram of a magnet. The thick blue line represents the discontinuity in the magnetization. In the 1-D Ising model the critical temperature is zero. In the analogy with molecular fluids, negative magnetization represents the gas phase, and positive magnetization represents the liquid phase.