Slides from FYS4411 Lectures


Transcript of Slides from FYS4411 Lectures

  • Slides from FYS4411 Lectures

    Morten Hjorth-Jensen

    Department of Physics and Center of Mathematics for Applications, University of Oslo, N-0316 Oslo, Norway

    Spring 2010

    1 / 540

  • Topics for Week 3, January 18-22

    Introduction, Parallelization, MPI and Variational Monte Carlo

    I Presentation of topics to be covered and introduction to many-body physics (Lecture notes chapter 16, Raimes chapter 1 or Thijssen chapter 4).

    I Variational Monte Carlo theory and presentation of project 1. (Lecture notes chapter 11, Thijssen chapter 12)

    I Introduction to Message Passing Interface (MPI) and parallelization. (Lecture notes chapter 7.7)

    I Assignment for next week: study chapter 11 of the Lecture notes or chapter 12 of Thijssen.

    2 / 540

  • 18 January - 31 May

    Course overview, Computational aspects

    I Parallelization (MPI), high-performance computing topics and object orientation. Choose between F95 and/or C++ as programming languages; Python is also possible. (all projects)

    I Algorithms for Monte Carlo simulations (multidimensional integrals), Metropolis-Hastings and importance sampling algorithms. Improved Monte Carlo methods. (project 1)

    I Statistical analysis of data from Monte Carlo calculations, blocking method. (project 1)

    3 / 540

  • 18 January - 31 May

    Course overview, Computational aspects

    I Search for minima in multidimensional spaces (conjugate gradient method) (project 1)

    I Object orientation (both projects)

    I Solutions of coupled differential equations for Hartree-Fock and density functional calculations. (project 2)

    I Alternative project 2: Lattice quantum chromodynamics or path integral Monte Carlo (many-body physics at finite temperature)

    4 / 540

  • 18 January -31 May, project 1

    Quantum Mechanical Methods and Systems

    1. Variational Monte Carlo for ab initio studies of quantum mechanical many-body systems.

    2. Simulation of atoms like Helium, Beryllium and Neon, with extensions to solids. It is also to be extended to two-dimensional systems like quantum dots.

    3. Aim of project 1: understand how to simulate quantum mechanical systems with many interacting particles using variational Monte Carlo methods.

    The methods of projects 1 and 2 are relevant for atomic, molecular, solid state, materials science, nanotechnology, quantum chemistry and nuclear physics.

    5 / 540

  • 18 January -31 May, project 2

    Quantum Mechanical Methods and Systems

    1. Project 2 (standard development) treats much the same systems as project 1, but introduces Hartree-Fock theory and density functional theory.

    2. The Hartree-Fock solutions are in turn used in the code from project 1 to obtain an ab initio solution for a given system.

    3. This solution is then used to constrain a density functional (actual research).

    4. We will also end up writing a density functional code and use this to compute properties of solids (atoms in a lattice).

    DFT and HF are covered by the lecture notes, chapters 4-6 of Thijssen and the articles of Jones on the webpage of the course.

    6 / 540

  • 18 January -31 May, project 2

    Quantum Mechanical Methods and Systems

    1. Project 2 is however not yet determined. Depending on the interest of the participants we may extend project 1 to deal with path integral Monte Carlo methods. This is relevant for studies of quantum mechanical systems at finite temperature and for example lattice quantum chromodynamics. Open for discussions.

    7 / 540

  • 18 January -31 May

    Projects, deadlines and oral exam

    1. Deadline project 1: March 22
    2. Deadline project 2: 31 May
    3. Oral exam: week 24 (8-12 June), most likely Friday June 11.

    The oral exam is based on your presentation of the projects.

    8 / 540

  • 18 January -31 May

    More on projects

    1. Keep a logbook; important for keeping track of all your changes.

    2. The projects should be written as a regular scientific article, with introduction, formalism, codes which have been developed and discussion of results. Conclusions and references should also be included. An example can be found on the webpage of the course.

    3. The link with the article example also contains an article on how to use LaTeX and write good scientific articles!

    9 / 540

  • Lectures and ComputerLab

    I Lectures: Thursday (14.15-16, room FV329)

    I Detailed lecture notes, all programs presented and projects can be found at the homepage of the course.

    I Computer lab: 16-19 Thursday, room FV329

    I Weekly plans and relevant information are on the official webpage.

    I Chapters 8, 9, 11, 16 and 17 of the FYS3150/4150 lecture notes give a good starting point. We also recommend J. M. Thijssen's text Computational Physics and the text of Raimes as background. For MPI we recommend the text by Gropp, Lusk and Skjellum.

    10 / 540

  • Thijssen's text

    J. M. Thijssen's text

    I Computational Physics

    I Chapters 3-6 and 12, possibly also chapters 8-9

    I see http://www.tn.tudelft.nl/tn/People/Staff/Thijssen/comphybook.html

    11 / 540


  • MPI text

    Gropp, Lusk and Skjellum

    I Using MPI
    I Chapters 1-5
    I see http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10761

    12 / 540


  • Selected Texts and lectures on C/C++

    J. J. Barton and L. R. Nackman, Scientific and Engineering C++, Addison Wesley, 3rd edition 2000.

    B. Stroustrup, The C++ Programming Language, Pearson, 1997.

    George Em Karniadakis and Robert M. Kirby II, Parallel Scientific Computing in C++ and MPI, http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521520805

    D. Yang, C++ and Object-oriented Numeric Computing for Scientists and Engineers, Springer 2000.

    More books reviewed at http://www.accu.org/ and http://www.comeaucomputing.com/booklist/

    13 / 540


  • Definitions and notations

    The Schrödinger equation reads

    H(r_1, r_2, \dots, r_N)\Psi_\lambda(r_1, r_2, \dots, r_N) = E_\lambda \Psi_\lambda(r_1, r_2, \dots, r_N),  (1)

    where the vector r_i represents the coordinates (spatial and spin) of particle i, \lambda stands for all the quantum numbers needed to classify a given N-particle state and \Psi_\lambda is the pertaining eigenfunction. Throughout this course, \Psi refers to the exact eigenfunction, unless otherwise stated.

    14 / 540

  • Definitions and notations

    We write the Hamilton operator, or Hamiltonian, in a generic way

    H = T + V

    where T represents the kinetic energy of the system,

    T = \sum_{i=1}^{N} \frac{p_i^2}{2m_i} = -\sum_{i=1}^{N} \frac{\hbar^2}{2m_i} \nabla_i^2 = \sum_{i=1}^{N} t(r_i),

    while the operator V for the potential energy is given by

    V = \sum_{i=1}^{N} u(r_i) + \sum_{i<j}^{N} v(r_i, r_j) + \sum_{i<j<k}^{N} v(r_i, r_j, r_k) + \dots  (2)

    Hereafter we use natural units, viz. \hbar = c = e = 1, with e the elementary charge and c the speed of light. This means that momenta and masses have dimension energy.

    15 / 540

  • Definitions and notations

    If one does quantum chemistry, after having introduced the Born-Oppenheimer approximation which effectively freezes out the nucleonic degrees of freedom, the Hamiltonian for N = n_e electrons takes the following form:

    H = \sum_{i=1}^{n_e} t(r_i) - \sum_{i=1}^{n_e} \frac{kZ}{r_i} + \sum_{i<j}^{n_e} \frac{k}{r_{ij}}.

    16 / 540

  • Definitions and notations

    We can rewrite this as

    H = H_0 + H_1 = \sum_{i=1}^{n_e} h_i + \sum_{i<j}^{n_e} \frac{k}{r_{ij}},

    where h_i = t(r_i) - kZ/r_i is the one-body operator, so that H_0 contains the kinetic energy and the attraction from the nucleus, while H_1 contains the electron-electron repulsion.

    17 / 540

  • Definitions and notations

    The potential energy term due to the attraction of the nucleus defines the one-body field u_i = u(r_i) of Eq. (2). We have moved this term into the H_0 part of the Hamiltonian, instead of keeping it in V as in Eq. (2). The reason is that we will hereafter treat H_0 as our non-interacting Hamiltonian. For a many-body wavefunction defined by an appropriate single-particle basis, we may solve exactly the non-interacting eigenvalue problem

    H_0 \Psi = e \Psi,

    with e being the non-interacting energy. This energy is defined by the sum over

    single-particle energies to be defined below. For atoms the single-particle energies

    could be the hydrogen-like single-particle energies corrected for the charge Z . For

    nuclei and quantum dots, these energies could be given by the harmonic oscillator in

    three and two dimensions, respectively.

    18 / 540

  • Definitions and notations

    We will assume that the interacting part of the Hamiltonian can be approximated by a two-body interaction. This means that our Hamiltonian is written as

    H = H_0 + H_1 = \sum_{i=1}^{N} h_i + \sum_{i<j}^{N} V(r_{ij}).

    19 / 540

  • Definitions and notations

    Our Hamiltonian is invariant under the permutation (interchange) of two particles. Since we deal with fermions, however, the total wave function is antisymmetric. Let P be an operator which interchanges two particles. Due to the symmetries we have ascribed to our Hamiltonian, this operator commutes with the total Hamiltonian,

    [H, P] = 0,

    meaning that \Psi_\lambda(r_1, r_2, \dots, r_N) is an eigenfunction of P as well, that is

    P_{ij} \Psi_\lambda(r_1, r_2, \dots, r_i, \dots, r_j, \dots, r_N) = \Psi_\lambda(r_1, r_2, \dots, r_j, \dots, r_i, \dots, r_N).

    We have introduced the suffix ij in order to indicate that we permute particles i and j .

    The Pauli principle tells us that the total wave function for a system of fermions has to

    be antisymmetric. What does that mean for the above permutation?

    20 / 540

  • Definitions and notations

    In our case we assume that we can approximate the exact eigenfunction with a Slater determinant

    \Psi(r_1, r_2, \dots, r_N, \alpha, \beta, \dots, \nu) = \frac{1}{\sqrt{N!}}
    \begin{vmatrix}
    \psi_\alpha(r_1) & \psi_\alpha(r_2) & \dots & \psi_\alpha(r_N) \\
    \psi_\beta(r_1)  & \psi_\beta(r_2)  & \dots & \psi_\beta(r_N)  \\
    \vdots           & \vdots           & \ddots & \vdots          \\
    \psi_\nu(r_1)    & \psi_\nu(r_2)    & \dots & \psi_\nu(r_N)
    \end{vmatrix},  (7)

    where the r_i stand for the coordinates and spin values of particle i and \alpha, \beta, \dots, \nu are the quantum numbers needed to describe the single-particle states.

    21 / 540

  • Definitions and notations

    The single-particle functions \psi_\alpha(r_i) are eigenfunctions of the one-body Hamiltonian h_i, that is

    h_i = h(r_i) = t(r_i) + u(r_i),

    with eigenvalues

    h_i \psi_\alpha(r_i) = \left[ t(r_i) + u(r_i) \right] \psi_\alpha(r_i) = \varepsilon_\alpha \psi_\alpha(r_i).

    The energies \varepsilon_\alpha are the so-called non-interacting single-particle energies, or unperturbed energies. The total energy is in this case the sum over all single-particle energies, if no two-body or more complicated many-body interactions are present.

    22 / 540

  • Definitions and notations

    Let us denote the ground state energy by E_0. According to the variational principle we have

    E_0 \le E[\Phi] = \int \Phi^* H \Phi \, d\tau,

    where \Phi is a trial function which we assume to be normalized,

    \int \Phi^* \Phi \, d\tau = 1,

    where we have used the shorthand d\tau = dr_1 dr_2 \dots dr_N.

    23 / 540

  • Definitions and notations

    In the Hartree-Fock method the trial function is the Slater determinant of Eq. (7), which can be rewritten as

    \Psi(r_1, r_2, \dots, r_N, \alpha, \beta, \dots, \nu) = \frac{1}{\sqrt{N!}} \sum_P (-1)^P P\, \psi_\alpha(r_1)\psi_\beta(r_2)\dots\psi_\nu(r_N) = \sqrt{N!}\, A \Phi_H,  (8)

    where we have introduced the antisymmetrization operator A, defined by the summation over all possible permutations of two particles.

    24 / 540

  • Definitions and notations

    It is defined as

    A = \frac{1}{N!} \sum_p (-1)^p P,  (9)

    with p standing for the number of permutations. We have introduced for later use the so-called Hartree function, defined by the simple product of all possible single-particle functions:

    \Phi_H(r_1, r_2, \dots, r_N, \alpha, \beta, \dots, \nu) = \psi_\alpha(r_1)\psi_\beta(r_2)\dots\psi_\nu(r_N).

    25 / 540

  • Definitions and notations

    Both H_0 and H_1 are invariant under all possible permutations of any two particles and hence commute with A:

    [H_0, A] = [H_1, A] = 0.  (10)

    Furthermore, A satisfies

    A^2 = A,  (11)

    since every permutation of the Slater determinant reproduces it.

    26 / 540

  • Definitions and notations

    The expectation value of H_0,

    \int \Phi^* H_0 \Phi \, d\tau = N! \int \Phi_H^* A H_0 A \Phi_H \, d\tau,

    is readily reduced to

    \int \Phi^* H_0 \Phi \, d\tau = N! \int \Phi_H^* H_0 A \Phi_H \, d\tau,

    where we have used Eqs. (10) and (11). The next step is to replace the antisymmetrization operator by its definition in Eq. (9) and to replace H_0 with the sum of one-body operators:

    \int \Phi^* H_0 \Phi \, d\tau = \sum_{i=1}^{N} \sum_p (-1)^p \int \Phi_H^* h_i P \Phi_H \, d\tau.

    27 / 540

  • Definitions and notations

    The integral vanishes if two or more particles are permuted in only one of the Hartree functions \Phi_H, because the individual single-particle wave functions are orthogonal. We then obtain

    \int \Phi^* H_0 \Phi \, d\tau = \sum_{i=1}^{N} \int \Phi_H^* h_i \Phi_H \, d\tau.

    Orthogonality of the single-particle functions allows us to further simplify the integral, and we arrive at the following expression for the expectation value of the sum of one-body Hamiltonians:

    \int \Phi^* H_0 \Phi \, d\tau = \sum_{\mu=1}^{N} \int \psi_\mu^*(r) h \psi_\mu(r) \, dr.  (12)

    28 / 540

  • Definitions and notations

    We introduce the following shorthand for the above integral,

    \langle \mu | h | \mu \rangle = \int \psi_\mu^*(r) h \psi_\mu(r) \, dr,

    and rewrite Eq. (12) as

    \int \Phi^* H_0 \Phi \, d\tau = \sum_{\mu=1}^{N} \langle \mu | h | \mu \rangle.  (13)

    29 / 540

  • Definitions and notations

    The expectation value of the two-body Hamiltonian is obtained in a similar manner. We have

    \int \Phi^* H_1 \Phi \, d\tau = N! \int \Phi_H^* A H_1 A \Phi_H \, d\tau,

    which reduces to

    \int \Phi^* H_1 \Phi \, d\tau = \sum_{i<j}^{N} \sum_p (-1)^p \int \Phi_H^* V(r_{ij}) P \Phi_H \, d\tau,

    by following the same arguments as for the one-body Hamiltonian.

    30 / 540

  • Definitions and notations

    Because of the dependence on the inter-particle distance r_{ij}, permutations of any two particles no longer vanish, and we get

    \int \Phi^* H_1 \Phi \, d\tau = \sum_{i<j}^{N} \int \Phi_H^* V(r_{ij}) \left( 1 - P_{ij} \right) \Phi_H \, d\tau,

    where P_{ij} is the permutation operator interchanging particles i and j.

    31 / 540

  • Definitions and notations

    We obtain

    \int \Phi^* H_1 \Phi \, d\tau = \frac{1}{2} \sum_{\mu=1}^{N} \sum_{\nu=1}^{N} \left[ \int \psi_\mu^*(r_i) \psi_\nu^*(r_j) V(r_{ij}) \psi_\mu(r_i) \psi_\nu(r_j) \, dr_i dr_j - \int \psi_\mu^*(r_i) \psi_\nu^*(r_j) V(r_{ij}) \psi_\nu(r_i) \psi_\mu(r_j) \, dr_i dr_j \right].  (14)

    The first term is the so-called direct term. It is frequently also called the Hartree term,

    while the second is due to the Pauli principle and is called the exchange term or just

    the Fock term. The factor 1/2 is introduced because we now run over all pairs twice.

    32 / 540

  • Definitions and notations

    The last equation allows us to introduce some further definitions. The single-particle wave functions \psi_\mu(r), defined by the quantum numbers \mu and r (recall that r also includes the spin degrees of freedom), are defined as the overlap

    \psi_\mu(r) = \langle r | \mu \rangle.

    33 / 540

  • Definitions and notations

    We introduce the following shorthands for the above two integrals:

    \langle \mu\nu | V | \mu\nu \rangle = \int \psi_\mu^*(r_i) \psi_\nu^*(r_j) V(r_{ij}) \psi_\mu(r_i) \psi_\nu(r_j) \, dr_i dr_j,

    and

    \langle \mu\nu | V | \nu\mu \rangle = \int \psi_\mu^*(r_i) \psi_\nu^*(r_j) V(r_{ij}) \psi_\nu(r_i) \psi_\mu(r_j) \, dr_i dr_j.

    34 / 540

  • Definitions and notations

    The direct and exchange matrix elements can be brought together if we define the antisymmetrized matrix element

    \langle \mu\nu | V | \mu\nu \rangle_{AS} = \langle \mu\nu | V | \mu\nu \rangle - \langle \mu\nu | V | \nu\mu \rangle,

    or for a general matrix element

    \langle \mu\nu | V | \sigma\tau \rangle_{AS} = \langle \mu\nu | V | \sigma\tau \rangle - \langle \mu\nu | V | \tau\sigma \rangle.

    It has the symmetry property

    \langle \mu\nu | V | \sigma\tau \rangle_{AS} = -\langle \mu\nu | V | \tau\sigma \rangle_{AS} = -\langle \nu\mu | V | \sigma\tau \rangle_{AS}.

    35 / 540

  • Definitions and notations

    The antisymmetric matrix element is also hermitian, implying

    \langle \mu\nu | V | \sigma\tau \rangle_{AS} = \langle \sigma\tau | V | \mu\nu \rangle_{AS}.

    With these notations we rewrite Eq. (14) as

    \int \Phi^* H_1 \Phi \, d\tau = \frac{1}{2} \sum_{\mu=1}^{N} \sum_{\nu=1}^{N} \langle \mu\nu | V | \mu\nu \rangle_{AS}.  (15)

    36 / 540

  • Definitions and notations

    Combining Eqs. (13) and (15) we obtain the energy functional

    E[\Phi] = \sum_{\mu=1}^{N} \langle \mu | h | \mu \rangle + \frac{1}{2} \sum_{\mu=1}^{N} \sum_{\nu=1}^{N} \langle \mu\nu | V | \mu\nu \rangle_{AS},  (16)

    which we will use as our starting point for the Hartree-Fock calculations later in this

    course.

    37 / 540

  • Quantum Monte Carlo Motivation

    Most quantum mechanical problems of interest in e.g., atomic, molecular, nuclear and solid state physics consist of a large number of interacting electrons and ions or nucleons. The total number of particles N is usually sufficiently large that an exact solution cannot be found. Typically, the expectation value for a chosen hamiltonian for a system of N particles is

    \langle H \rangle = \frac{\int dR_1 dR_2 \dots dR_N \, \Psi^*(R_1, R_2, \dots, R_N) H(R_1, R_2, \dots, R_N) \Psi(R_1, R_2, \dots, R_N)}{\int dR_1 dR_2 \dots dR_N \, \Psi^*(R_1, R_2, \dots, R_N) \Psi(R_1, R_2, \dots, R_N)},

    an in general intractable problem.

    This integral is actually the starting point in a Variational Monte Carlo calculation.

    Gaussian quadrature: forget it! Given 10 particles and 10 mesh points for each degree of freedom and an ideal 1 Tflops machine (all operations take the same time), how long will it take to compute the above integral? Lifetime of the universe: T \approx 4.7 \times 10^{17} s.

    38 / 540
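    The estimate goes as follows (a back-of-the-envelope count, assuming one floating-point operation per integration point): 10 particles in three dimensions give 30 degrees of freedom, so

    10^{3 \times 10} = 10^{30} \ \text{integration points}, \qquad t \approx \frac{10^{30}\,\mathrm{ops}}{10^{12}\,\mathrm{ops/s}} = 10^{18}\,\mathrm{s} \approx 2\,T,

    i.e. roughly twice the lifetime of the universe.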

  • Quantum Monte Carlo

    As an example from the nuclear many-body problem, we have Schrödinger's equation as a differential equation,

    H \Psi(r_1, \dots, r_A, \alpha_1, \dots, \alpha_A) = E \Psi(r_1, \dots, r_A, \alpha_1, \dots, \alpha_A),

    where

    r_1, \dots, r_A

    are the coordinates and

    \alpha_1, \dots, \alpha_A

    are sets of relevant quantum numbers such as spin and isospin for a system of A nucleons (A = N + Z, N being the number of neutrons and Z the number of protons).

    39 / 540

  • Quantum Monte Carlo

    There are

    2^A \times \binom{A}{Z}

    coupled second-order differential equations in 3A dimensions. For a nucleus like ^{10}Be this number is 2^{10} \times \binom{10}{4} = 215040. This is a truly challenging many-body problem.

    Direct solution methods for the partial differential equations can at most be used for 2-3 particles.

    40 / 540

  • Quantum Many-particle(body) Methods

    1. Monte-Carlo methods

    2. Renormalization group (RG) methods, in particular density matrix RG

    3. Large-scale diagonalization (iterative methods, Lanczos method, dimensionalities of up to ~10^{10} states)

    4. Coupled cluster theory, favoured method in quantum chemistry, molecular and atomic physics. Applications to ab initio calculations in nuclear physics as well, also for large nuclei.

    5. Perturbative many-body methods

    6. Greens function methods

    7. Density functional theory/Mean-field theory and Hartree-Fock theory

    The physics of the system hints at which many-body methods to use. For systems with strong correlations among the constituents, items 5 and 7 are ruled out.

    41 / 540

  • Pros and Cons of Monte Carlo

    I Is physically intuitive.

    I Allows one to study systems with many degrees of freedom. Diffusion Monte Carlo (DMC) and Green's function Monte Carlo (GFMC) yield in principle the exact solution to Schrödinger's equation.

    I Variational Monte Carlo (VMC) is easy to implement but needs a reliable trial wave function, which can be difficult to obtain. This is where we will use Hartree-Fock theory to construct an optimal basis.

    I DMC/GFMC for fermions (spin with half-integer values: electrons, baryons, neutrinos, quarks) has a sign problem. Nature prefers an anti-symmetric wave function, while the PDF is in this case given by the distribution of random walkers (p \ge 0).

    I The solution has a statistical error, which can be large.

    I There is a limit to how large systems one can study; DMC needs a huge number of random walkers in order to achieve stable results.

    I Obtains only the lowest-lying state of a given symmetry. Excited states of other symmetries can be obtained.

    42 / 540

  • Where and why do we use Monte Carlo Methods inQuantum Physics

    I Quantum systems with many particles at finite temperature: Path Integral Monte Carlo with applications to dense matter and quantum liquids (phase transitions from normal fluid to superfluid). Strong correlations.

    I Bose-Einstein condensation of dilute gases, with a transition from a non-linear PDE description to Diffusion Monte Carlo as the density increases.

    I Light atoms, molecules, solids and nuclei.

    I Lattice Quantum-Chromo Dynamics. Impossible to solve without MC calculations.

    I Simulations of systems in solid state physics, from semiconductors to spin systems. Many electrons active and possibly strong correlations.

    43 / 540

  • Bose-Einstein Condensation of atoms, thousands of Atoms in one State, Project 2 in 2007

    44 / 540

  • Quantum Monte Carlo

    Given a hamiltonian H and a trial wave function \Psi_T, the variational principle states that the expectation value of H, defined through

    E[H] = \langle H \rangle = \frac{\int dR\, \Psi_T^*(R) H(R) \Psi_T(R)}{\int dR\, \Psi_T^*(R) \Psi_T(R)},

    is an upper bound to the ground state energy E_0 of the hamiltonian H, that is

    E_0 \le \langle H \rangle.

    In general, the integrals involved in the calculation of various expectation values are multi-dimensional ones. Traditional integration methods such as the Gauss-Legendre will not be adequate for say the computation of the energy of a many-body system.

    45 / 540

  • Quantum Monte Carlo

    The trial wave function can be expanded in the eigenstates of the hamiltonian since they form a complete set, viz.,

    \Psi_T(R) = \sum_i a_i \Psi_i(R),

    and assuming the set of eigenfunctions to be normalized one obtains

    \frac{\sum_{nm} a_m^* a_n \int dR\, \Psi_m^*(R) H(R) \Psi_n(R)}{\sum_{nm} a_m^* a_n \int dR\, \Psi_m^*(R) \Psi_n(R)} = \frac{\sum_n a_n^2 E_n}{\sum_n a_n^2} \ge E_0,

    where we used that H(R)\Psi_n(R) = E_n\Psi_n(R). In general, the integrals involved in the calculation of various expectation values are multi-dimensional ones. The variational principle yields the lowest state of a given symmetry.

    46 / 540

  • Quantum Monte Carlo

    In most cases, a wave function has only small values in large parts of configuration space, and a straightforward procedure which uses homogeneously distributed random points in configuration space will most likely lead to poor results. This suggests that some kind of importance sampling combined with e.g., the Metropolis algorithm may be a more efficient way of obtaining the ground state energy. The hope is then that those regions of configuration space where the wave function assumes appreciable values are sampled more efficiently.

    The tedious part in a VMC calculation is the search for the variational minimum. A good knowledge of the system is required in order to carry out reasonable VMC calculations. This is not always the case, and often VMC calculations serve rather as the starting point for so-called diffusion Monte Carlo calculations (DMC). DMC is a way of solving exactly the many-body Schrödinger equation by means of a stochastic procedure. A good guess for the binding energy and its wave function is however necessary. A carefully performed VMC calculation can aid in this context.

    47 / 540

  • Quantum Monte Carlo

    I Construct first a trial wave function \psi_T^\alpha(R) for a many-body system consisting of N particles located at positions R = (R_1, \dots, R_N). The trial wave function depends on the variational parameters \alpha = (\alpha_1, \dots, \alpha_N).

    I Then we evaluate the expectation value of the hamiltonian H,

    E[H] = \langle H \rangle = \frac{\int dR\, \Psi_T^*(R) H(R) \Psi_T(R)}{\int dR\, \Psi_T^*(R) \Psi_T(R)}.

    I Thereafter we vary \alpha according to some minimization algorithm and return to the first step.

    48 / 540

  • Quantum Monte Carlo

    Choose a trial wave function \psi_T(R). Then

    P(R) = \frac{|\psi_T(R)|^2}{\int |\psi_T(R)|^2 \, dR}

    is our new probability distribution function (PDF). The approximation to the expectation value of the Hamiltonian is now

    E[H] \approx \frac{\int dR\, \Psi_T^*(R) H(R) \Psi_T(R)}{\int dR\, \Psi_T^*(R) \Psi_T(R)}.

    Define a new quantity,

    E_L(R) = \frac{1}{\psi_T(R)} H \psi_T(R),

    called the local energy, which, together with our trial PDF, yields

    E[H] = \langle H \rangle \approx \int P(R) E_L(R) \, dR \approx \frac{1}{N} \sum_{i=1}^{N} E_L(R_i),

    with N being the number of Monte Carlo samples and the R_i drawn from P(R).

    49 / 540

  • Quantum Monte Carlo

    Algorithm:

    I Initialisation: fix the number of Monte Carlo steps. Choose an initial R and variational parameters \alpha and calculate |\psi_T^\alpha(R)|^2.

    I Initialise the energy and the variance and start the Monte Carlo calculation (thermalize):

    1. Calculate a trial position R_p = R + r \cdot step, where r is a random variable r \in [0, 1].

    2. Use the Metropolis algorithm to accept or reject this move: w = P(R_p)/P(R).

    3. If the step is accepted, set R = R_p. Update averages.

    I Finish and compute final averages.

    Observe that the jumping in space is governed by the variable step. This is called brute-force sampling; we need importance sampling to get more relevant sampling. A minimal code sketch is given after this slide.

    50 / 540
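    To make the algorithm above concrete, here is a minimal, self-contained brute-force Metropolis sketch in C++. It uses a hypothetical one-dimensional toy problem (harmonic oscillator with trial function e^{-\alpha x^2}) rather than the course's helium code, and the helper names psi2 and local_energy are illustrative only:

    // Brute-force Metropolis sampling for a 1D toy problem (sketch).
    #include <cmath>
    #include <cstdlib>
    #include <iostream>
    using namespace std;

    // |psi_T(x)|^2 for the trial function exp(-alpha*x*x)
    double psi2(double x, double alpha) { return exp(-2.0*alpha*x*x); }

    // closed-form local energy for the 1D harmonic oscillator
    double local_energy(double x, double alpha) {
      return alpha + x*x*(0.5 - 2.0*alpha*alpha);
    }

    int main() {
      double alpha = 0.5, step = 1.0, x = 0.0, energy = 0.0;
      int n_cycles = 100000, accepted = 0;
      srand(1234);
      for (int cycle = 0; cycle < n_cycles; cycle++) {
        // trial position x_p = x + step*(r-0.5), r uniform in [0,1]
        double r = rand()/((double) RAND_MAX);
        double x_p = x + step*(r - 0.5);
        // Metropolis test with ratio w = P(x_p)/P(x)
        double w = psi2(x_p, alpha)/psi2(x, alpha);
        if (rand()/((double) RAND_MAX) <= w) { x = x_p; accepted++; }
        energy += local_energy(x, alpha);
      }
      cout << "<E> = " << energy/n_cycles << " acceptance = "
           << accepted/((double) n_cycles) << endl;
      return 0;
    }

    At \alpha = 0.5 the local energy is constant and the estimate reproduces the exact ground state energy 1/2, mirroring the zero-variance property discussed below for hydrogen.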

  • Quantum Monte Carlo

    The radial Schrödinger equation for the hydrogen atom can be written as

    -\frac{\hbar^2}{2m} \frac{\partial^2 u(r)}{\partial r^2} - \left( \frac{ke^2}{r} - \frac{\hbar^2 l(l+1)}{2mr^2} \right) u(r) = E u(r),

    or with dimensionless variables

    -\frac{1}{2} \frac{\partial^2 u(\rho)}{\partial \rho^2} - \frac{u(\rho)}{\rho} + \frac{l(l+1)}{2\rho^2} u(\rho) - \lambda u(\rho) = 0,

    with the hamiltonian

    H = -\frac{1}{2} \frac{\partial^2}{\partial \rho^2} - \frac{1}{\rho} + \frac{l(l+1)}{2\rho^2}.

    Use the variational parameter \alpha in the trial wave function

    u_T^\alpha(\rho) = \alpha \rho e^{-\alpha\rho}.

    51 / 540

  • Quantum Monte Carlo

    Inserting this wave function into the expression for the local energy E_L gives

    E_L(\rho) = -\frac{1}{\rho} - \frac{\alpha}{2} \left( \alpha - \frac{2}{\rho} \right).

    \alpha          \langle H \rangle   \sigma^2       \sigma/\sqrt{N}
    7.00000E-01    -4.57759E-01    4.51201E-02    6.71715E-04
    8.00000E-01    -4.81461E-01    3.05736E-02    5.52934E-04
    9.00000E-01    -4.95899E-01    8.20497E-03    2.86443E-04
    1.00000E-00    -5.00000E-01    0.00000E+00    0.00000E+00
    1.10000E+00    -4.93738E-01    1.16989E-02    3.42036E-04
    1.20000E+00    -4.75563E-01    8.85899E-02    9.41222E-04
    1.30000E+00    -4.54341E-01    1.45171E-01    1.20487E-03

    52 / 540
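    The tabulated local energy can be checked by a short calculation with the trial function above. Differentiating u_T^\alpha(\rho) = \alpha\rho e^{-\alpha\rho} twice gives

    \frac{d^2 u_T^\alpha}{d\rho^2} = \left( \alpha^2 - \frac{2\alpha}{\rho} \right) u_T^\alpha(\rho),

    so that for l = 0

    E_L(\rho) = -\frac{1}{2} \frac{u''}{u} - \frac{1}{\rho} = -\frac{1}{\rho} - \frac{\alpha}{2} \left( \alpha - \frac{2}{\rho} \right),

    which reduces to the constant -1/2 at \alpha = 1, consistent with the zero variance in the table.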

  • Quantum Monte Carlo

    We note that at \alpha = 1 we obtain the exact result, and the variance is zero, as it should. The reason is that we then have the exact wave function, and the action of the hamiltonian on the wave function,

    H\psi = \text{constant} \times \psi,

    yields just a constant. The integral which defines various expectation values involving moments of the hamiltonian then becomes

    \langle H^n \rangle = \frac{\int dR\, \Psi_T^*(R) H^n(R) \Psi_T(R)}{\int dR\, \Psi_T^*(R) \Psi_T(R)} = \text{constant}^n \times \frac{\int dR\, \Psi_T^*(R) \Psi_T(R)}{\int dR\, \Psi_T^*(R) \Psi_T(R)} = \text{constant}^n.

    This gives important information: the exact wave function leads to zero variance! Variation is then performed by minimizing both the energy and the variance.

    53 / 540

  • Quantum Monte Carlo

    The helium atom consists of two electrons and a nucleus with charge Z = 2. The contribution to the potential energy due to the attraction from the nucleus is

    -\frac{2ke^2}{r_1} - \frac{2ke^2}{r_2},

    and if we add the repulsion arising from the two interacting electrons, we obtain the potential energy

    V(r_1, r_2) = -\frac{2ke^2}{r_1} - \frac{2ke^2}{r_2} + \frac{ke^2}{r_{12}},

    with the electrons separated at a distance r_{12} = |r_1 - r_2|.

    54 / 540
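    In the natural units introduced earlier (\hbar = e = 1, and lengths measured so that k = 1, as assumed in the course codes), the potential reduces to

    V(r_1, r_2) = -\frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{r_{12}},

    which is the form evaluated in the e_potential loops of the code shown later (with charge = 2).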

  • Quantum Monte Carlo

    The hamiltonian becomes then

    \hat{H} = -\frac{\hbar^2 \nabla_1^2}{2m} - \frac{\hbar^2 \nabla_2^2}{2m} - \frac{2ke^2}{r_1} - \frac{2ke^2}{r_2} + \frac{ke^2}{r_{12}},

    and Schrödinger's equation reads \hat{H}\psi = E\psi. All observables are evaluated with respect to the probability distribution

    P(R) = \frac{|\psi_T(R)|^2}{\int |\psi_T(R)|^2 \, dR}

    generated by the trial wave function. The trial wave function must approximate an exact eigenstate for accurate results to be obtained. Improved trial wave functions also improve the importance sampling, reducing the cost of obtaining a certain statistical accuracy.

    55 / 540

  • Quantum Monte Carlo

    Choice of trial wave function for helium: assume r_1 \to 0. Then

    E_L(R) = \frac{1}{\psi_T(R)} H \psi_T(R) = \frac{1}{\psi_T(R)} \left( -\frac{1}{2} \nabla_1^2 - \frac{Z}{r_1} \right) \psi_T(R) + \text{finite terms},

    or, in terms of the radial part R_T(r_1),

    E_L(R) = \frac{1}{R_T(r_1)} \left( -\frac{1}{2} \frac{d^2}{dr_1^2} - \frac{1}{r_1} \frac{d}{dr_1} - \frac{Z}{r_1} \right) R_T(r_1) + \text{finite terms}.

    For small values of r_1, the terms which dominate are

    \lim_{r_1 \to 0} E_L(R) = \frac{1}{R_T(r_1)} \left( -\frac{1}{r_1} \frac{d}{dr_1} - \frac{Z}{r_1} \right) R_T(r_1),

    since the second derivative does not diverge due to the finiteness of \Psi at the origin.

    56 / 540

  • Quantum Monte Carlo

    This results in

    \frac{1}{R_T(r_1)} \frac{dR_T(r_1)}{dr_1} = -Z,

    and

    R_T(r_1) \propto e^{-Zr_1}.

    A similar condition applies to electron 2 as well. For orbital momenta l > 0 we have

    \frac{1}{R_T(r)} \frac{dR_T(r)}{dr} = -\frac{Z}{l+1}.

    Similarly, studying the case r_{12} \to 0, we can write a possible trial wave function as

    \psi_T(R) = e^{-\alpha(r_1 + r_2)} e^{\beta r_{12}}.

    The last equation can be generalized to

    \psi_T(R) = \phi(r_1)\phi(r_2)\dots\phi(r_N) \prod_{i<j} f(r_{ij}).

    57 / 540

  • VMC code for helium, vmc para.cpp

    // Here we define global variables used in various functions
    // These can be changed by reading from file the different parameters
    int dimension = 3;        // three-dimensional system
    int charge = 2;           // we fix the charge to be that of the helium atom
    int my_rank, numprocs;    // these are the parameters used by MPI to
                              // define which node and how many
    double step_length = 1.0; // we fix the brute force jump to 1 Bohr radius
    int number_particles = 2; // we fix also the number of electrons to be 2

    58 / 540

  • VMC code for helium, vmc para.cpp, main part

    // MPI initializations
    MPI_Init (&argc, &argv);
    MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    time_start = MPI_Wtime();

    if (my_rank == 0 && argc <= 1) {
      cout << "Bad Usage: " << argv[0] <<
        " read also output file on same line" << endl;
      exit(1);
    }

    59 / 540

  • VMC code for helium, vmc para.cpp, main part

    // Setting output file name for this rank:
    ostringstream ost;
    ost << "blocks_rank" << my_rank << ".dat";
    // Open file for writing:
    blockofile.open(ost.str().c_str(), ios::out | ios::binary);

    60 / 540

  • VMC code for helium, vmc para.cpp, main part

    // broadcast the total number of variations
    MPI_Bcast (&max_variations, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast (&number_cycles, 1, MPI_INT, 0, MPI_COMM_WORLD);
    total_number_cycles = number_cycles*numprocs;
    // array to store all energies for last variation of alpha
    all_energies = new double[number_cycles+1];
    // Do the mc sampling and accumulate data with MPI_Reduce
    mc_sampling(max_variations, number_cycles, cumulative_e,
                cumulative_e2, all_energies);
    // Collect data in total averages
    for( i=1; i <= max_variations; i++){
      MPI_Reduce(&cumulative_e[i], &total_cumulative_e[i], 1, MPI_DOUBLE,
                 MPI_SUM, 0, MPI_COMM_WORLD);
      MPI_Reduce(&cumulative_e2[i], &total_cumulative_e2[i], 1, MPI_DOUBLE,
                 MPI_SUM, 0, MPI_COMM_WORLD);
    }

    61 / 540

  • VMC code for helium, vmc para.cpp, main part

    blockofile.write((char*)(all_energies+1),
                     number_cycles*sizeof(double));
    blockofile.close();
    delete [] total_cumulative_e; delete [] total_cumulative_e2;
    delete [] cumulative_e; delete [] cumulative_e2; delete [] all_energies;
    // End MPI
    MPI_Finalize ();
    return 0;
    } // end of main function

    62 / 540

  • VMC code for helium, vmc para.cpp, sampling

    alpha = 0.5*charge;
    // every node has its own seed for the random numbers
    idum = -1-my_rank;
    // allocate matrices which contain the position of the particles
    r_old = (double **) matrix(number_particles, dimension, sizeof(double));
    r_new = (double **) matrix(number_particles, dimension, sizeof(double));
    for (i = 0; i < number_particles; i++) {
      for ( j=0; j < dimension; j++) {
        r_old[i][j] = r_new[i][j] = 0;
      }
    }
    // loop over variational parameters

    63 / 540

  • VMC code for helium, vmc para.cpp, sampling

    for (variate=1; variate <= max_variations; variate++){
      // initialisations of variational parameters and energies
      alpha += 0.1;
      energy = energy2 = 0; accept = 0; delta_e = 0;
      // initial trial position
      for (i = 0; i < number_particles; i++) {
        for ( j=0; j < dimension; j++) {
          r_old[i][j] = step_length*(ran1(&idum)-0.5);
        }
      }
      wfold = wave_function(r_old, alpha);

    64 / 540

  • VMC code for helium, vmc para.cpp, sampling

    // loop over monte carlo cycles
    for (cycles = 1; cycles <= number_cycles; cycles++){
      // new position
      for (i = 0; i < number_particles; i++) {
        for ( j=0; j < dimension; j++) {
          r_new[i][j] = r_old[i][j]+step_length*(ran1(&idum)-0.5);
        }
      }
      wfnew = wave_function(r_new, alpha);
      // Metropolis test
      if(ran1(&idum) <= wfnew*wfnew/wfold/wfold ) {
        for (i = 0; i < number_particles; i++) {
          for ( j=0; j < dimension; j++) {
            r_old[i][j] = r_new[i][j];
          }
        }
        wfold = wfnew;
      }

    65 / 540
  • VMC code for helium, vmc para.cpp, sampling

    // compute local energy
    delta_e = local_energy(r_old, alpha, wfold);
    // save all energies on last variate
    if(variate==max_variations){
      all_energies[cycles] = delta_e;
    }
    // update energies
    energy += delta_e;
    energy2 += delta_e*delta_e;
    } // end of loop over MC trials
    // update the energy average and its squared
    cumulative_e[variate] = energy;
    cumulative_e2[variate] = energy2;
    } // end of loop over variational steps

    66 / 540

  • VMC code for helium, vmc para.cpp, wave function

    // Function to compute the squared wave function, simplest form

    double wave_function(double **r, double alpha)
    {
      int i, j, k;
      double wf, argument, r_single_particle, r_12;

      argument = wf = 0;
      for (i = 0; i < number_particles; i++) {
        r_single_particle = 0;
        for (j = 0; j < dimension; j++) {
          r_single_particle += r[i][j]*r[i][j];
        }
        argument += sqrt(r_single_particle);
      }
      wf = exp(-argument*alpha);
      return wf;
    }

    67 / 540

  • VMC code for helium, vmc para.cpp, local energy

    // Function to calculate the local energy with num derivative

    double local_energy(double **r, double alpha, double wfold)
    {
      int i, j, k;
      double e_local, wfminus, wfplus, e_kinetic, e_potential, r_12,
             r_single_particle;
      double **r_plus, **r_minus;

      // allocate matrices which contain the position of the particles
      // the function matrix is defined in the program library
      r_plus = (double **) matrix(number_particles, dimension, sizeof(double));
      r_minus = (double **) matrix(number_particles, dimension, sizeof(double));
      for (i = 0; i < number_particles; i++) {
        for ( j=0; j < dimension; j++) {
          r_plus[i][j] = r_minus[i][j] = r[i][j];
        }
      }

    68 / 540

  • VMC code for helium, vmc para.cpp, local energy

    // compute the kinetic energy
    e_kinetic = 0;
    for (i = 0; i < number_particles; i++) {
      for (j = 0; j < dimension; j++) {
        r_plus[i][j] = r[i][j]+h;
        r_minus[i][j] = r[i][j]-h;
        wfminus = wave_function(r_minus, alpha);
        wfplus = wave_function(r_plus, alpha);
        e_kinetic -= (wfminus+wfplus-2*wfold);
        r_plus[i][j] = r[i][j];
        r_minus[i][j] = r[i][j];
      }
    }
    // include electron mass and hbar squared and divide by wave function
    e_kinetic = 0.5*h2*e_kinetic/wfold;

    69 / 540

  • VMC code for helium, vmc para.cpp, local energy

    // compute the potential energy
    e_potential = 0;
    // contribution from electron-proton potential
    for (i = 0; i < number_particles; i++) {
      r_single_particle = 0;
      for (j = 0; j < dimension; j++) {
        r_single_particle += r[i][j]*r[i][j];
      }
      e_potential -= charge/sqrt(r_single_particle);
    }
    // contribution from electron-electron potential
    for (i = 0; i < number_particles-1; i++) {
      for (j = i+1; j < number_particles; j++) {
        r_12 = 0;
        for (k = 0; k < dimension; k++) {
          r_12 += (r[i][k]-r[j][k])*(r[i][k]-r[j][k]);
        }
        e_potential += 1/sqrt(r_12);
      }
    }

    70 / 540
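    The kinetic-energy loop above implements the standard three-point (central difference) approximation to the second derivative; schematically, for each coordinate x,

    \nabla^2 \psi \approx \frac{\psi(x+h) + \psi(x-h) - 2\psi(x)}{h^2},

    so that, summed over all particles i and dimensions j,

    \texttt{e\_kinetic} = -\frac{1}{2h^2} \sum_{i,j} \frac{\psi(\dots, x_{ij}+h, \dots) + \psi(\dots, x_{ij}-h, \dots) - 2\psi(R)}{\psi(R)},

    assuming, as the code suggests, that the global constants h and h2 = 1/h^2 are set elsewhere in the program.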

  • Going Parallel with MPI

    In all projects it will be useful to parallelize the code. Task parallelism: the work of a global problem can be divided into a number of independent tasks, which rarely need to synchronize. Monte Carlo simulations or integrations are examples of this. It is almost embarrassingly trivial to parallelize Monte Carlo codes. MPI is a message-passing library where all the routines have corresponding C/C++-bindings

    MPI_Command_name

    and Fortran-bindings (routine names are in uppercase, but can also be in lower case)

    MPI_COMMAND_NAME

    71 / 540

  • What is Message Passing Interface (MPI)? Yet another library!

    MPI is a library, not a language. It specifies the names, calling sequences and results of functions or subroutines to be called from C or Fortran programs, and the classes and methods that make up the MPI C++ library. The programs that users write in Fortran, C or C++ are compiled with ordinary compilers and linked with the MPI library. MPI is a specification, not a particular implementation. MPI programs should be able to run on all possible machines and run on all MPI implementations without change. An MPI computation is a collection of processes communicating with messages. See chapter 7.7 of the lecture notes for more details.

    72 / 540

  • MPI

    MPI is a library specification for the message passing interface, proposed as a standard.

    I independent of hardware;
    I not a language or compiler specification;
    I not a specific implementation or product.

    A message passing standard for portability and ease-of-use. Designed for high performance. Insert communication and synchronization functions where necessary.

    73 / 540

  • Demands from the HPC community

    In the field of scientific computing, there is an ever-lasting wish to do larger simulations using shorter computer time. Development of the capacity of single-processor computers can hardly keep up with the pace of scientific computing:

    I processor speed
    I memory size/speed

    Solution: parallel computing!

    74 / 540

  • The basic ideas of parallel computing

    I Pursuit of shorter computation time and larger simulation size gives rise to parallel computing.
    I Multiple processors are involved to solve a global problem.
    I The essence is to divide the entire computation evenly among collaborative processors. Divide and conquer.

    75 / 540

  • A rough classification of hardware models

    I Conventional single-processor computers can be called SISD (single-instruction-single-data) machines.

    I SIMD (single-instruction-multiple-data) machines incorporate the idea of parallel processing, using a large number of processing units to execute the same instruction on different data.

    I Modern parallel computers are so-called MIMD (multiple-instruction-multiple-data) machines and can execute different instruction streams in parallel on different data.

    76 / 540

  • Shared memory and distributed memory

    I One way of categorizing modern parallel computers is to look at the memory configuration.

    I In shared memory systems the CPUs share the same address space. Any CPU can access any data in the global memory.

    I In distributed memory systems each CPU has its own memory. The CPUs are connected by some network and may exchange messages.

    I A recent trend is ccNUMA (cache-coherent-non-uniform-memory-access) systems, which are clusters of SMP (symmetric multi-processing) machines and have a virtual shared memory.

    77 / 540

  • Different parallel programming paradigms

    I Task parallelism: the work of a global problem can be divided into a number of independent tasks, which rarely need to synchronize. Monte Carlo simulation is one example. Integration is another. However this paradigm is of limited use.

    I Data parallelism: use of multiple threads (e.g. one thread per processor) to dissect loops over arrays etc. This paradigm requires a single memory address space. Communication and synchronization between processors are often hidden, thus easy to program. However, the user surrenders much control to a specialized compiler. Examples of data parallelism are compiler-based parallelization and OpenMP directives.

    78 / 540

  • Different parallel programming paradigms

    I Message-passing: all involved processors have an independent memory address space. The user is responsible for partitioning the data/work of a global problem and distributing the subproblems to the processors. Collaboration between processors is achieved by explicit message passing, which is used for data transfer plus synchronization.

    I This paradigm is the most general one where the user has full control. Better parallel efficiency is usually achieved by explicit message passing. However, message-passing programming is more difficult.

    79 / 540

  • SPMD

    Although message-passing programming supports MIMD, it suffices with an SPMD (single-program-multiple-data) model, which is flexible enough for practical cases:

    I Same executable for all the processors.
    I Each processor works primarily with its assigned local data.
    I Progression of code is allowed to differ between synchronization points.
    I Possible to have a master/slave model. The standard option in Monte Carlo calculations and numerical integration.

    80 / 540

  • Today's situation of parallel computing

    I Distributed memory is the dominant hardware configuration. There is a large diversity in these machines, from MPP (massively parallel processing) systems to clusters of off-the-shelf PCs, which are very cost-effective.

    I Message-passing is a mature programming paradigm and widely accepted. It often provides an efficient match to the hardware. It is primarily used for the distributed memory systems, but can also be used on shared memory systems.

    In these lectures we consider only message-passing for writing parallel programs.

    81 / 540

  • Overhead present in parallel computing

    I Uneven load balance: not all the processors can perform useful work at all times.
    I Overhead of synchronization.
    I Overhead of communication.
    I Extra computation due to parallelization.

    Due to the above overhead, and the fact that certain parts of a sequential algorithm cannot be parallelized, we may not achieve an optimal parallelization.

    82 / 540

  • Parallelizing a sequential algorithm

    I Identify the part(s) of a sequential algorithm that can be executed in parallel. This is the difficult part.

    I Distribute the global work and data among P processors.

    83 / 540

  • Process and processor

    I We refer to a process as a logical unit which executes its own code, in an MIMD style.

    I The processor is a physical device on which one or several processes are executed.

    I The MPI standard uses the concept process consistently throughout its documentation.

    I However, we only consider situations where one processor is responsible for one process and therefore use the two terms interchangeably.

    84 / 540

  • Bindings to MPI routines

    MPI is a message-passing library where all the routines have corresponding C/C++-bindings

    MPI_Command_name

    and Fortran-bindings (routine names are in uppercase, but can also be in lower case)

    MPI_COMMAND_NAME

    The discussion in these slides focuses on the C++ binding.

    85 / 540

  • Communicator

    I A group of MPI processes with a name (context).
    I Any process is identified by its rank. The rank is only meaningful within a particular communicator.
    I By default the communicator MPI_COMM_WORLD contains all the MPI processes.
    I Mechanism to identify a subset of processes.
    I Promotes modular design of parallel libraries.

    86 / 540

  • The 6 most important MPI routines

    I MPI_Init - initiate an MPI computation
    I MPI_Finalize - terminate the MPI computation and clean up
    I MPI_Comm_size - how many processes participate in a given MPI communicator?
    I MPI_Comm_rank - which one am I? (A number between 0 and size-1.)
    I MPI_Send - send a message to a particular process within an MPI communicator
    I MPI_Recv - receive a message from a particular process within an MPI communicator

    87 / 540
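    For reference, the C prototypes of these six routines as given by the MPI standard:

    int MPI_Init(int *argc, char ***argv);
    int MPI_Finalize(void);
    int MPI_Comm_size(MPI_Comm comm, int *size);
    int MPI_Comm_rank(MPI_Comm comm, int *rank);
    int MPI_Send(void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm);
    int MPI_Recv(void *buf, int count, MPI_Datatype datatype,
                 int source, int tag, MPI_Comm comm, MPI_Status *status);

    All of them return an error code, MPI_SUCCESS on success.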

  • The first MPI C/C++ program

    Let every process write "Hello world" on the standard output. This is program2.cpp of chapter 7.

    using namespace std;
    #include <mpi.h>
    #include <iostream>
    int main (int nargs, char* args[])
    {
      int numprocs, my_rank;
      // MPI initializations
      MPI_Init (&nargs, &args);
      MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
      cout << "Hello world, I have rank " << my_rank
           << " out of " << numprocs << endl;
      // End MPI
      MPI_Finalize ();
      return 0;
    }

    88 / 540

  • The Fortran program

    PROGRAM hello
    INCLUDE "mpif.h"
    INTEGER:: size, my_rank, ierr

    CALL MPI_INIT(ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD, my_rank, ierr)
    WRITE(*,*)"Hello world, I've rank ",my_rank," out of ",size
    CALL MPI_FINALIZE(ierr)

    END PROGRAM hello

    89 / 540

  • Note 1

    The output to screen is not ordered since all processes are trying to write to screen simultaneously. It is then the operating system which opts for an ordering. If we wish to have an organized output, starting from the first process, we may rewrite our program as in the next example (program3.cpp), see again chapter 7.7 of the lecture notes.

    90 / 540

  • Ordered output with MPI Barrier

    int main (int nargs, char* args[])
    {
      int numprocs, my_rank, i;
      MPI_Init (&nargs, &args);
      MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
      for (i = 0; i < numprocs; i++) {
        MPI_Barrier (MPI_COMM_WORLD);
        if (i == my_rank) {
          cout << "Hello world, I have rank " << my_rank
               << " out of " << numprocs << endl;
          fflush(stdout);
        }
      }
      MPI_Finalize ();
      return 0;
    }

    91 / 540

  • Note 2

    Here we have used the MPI_Barrier function to ensure that every process has completed its set of instructions in a particular order. A barrier is a special collective operation that does not allow the processes to continue until all processes in the communicator (here MPI_COMM_WORLD) have called MPI_Barrier. The barriers make sure that all processes have reached the same point in the code. Many of the collective operations, like MPI_Allreduce to be discussed later, have the same property; viz. no process can exit the operation until all processes have started. However, this is slightly more time-consuming since the processes synchronize between themselves as many times as there are processes. In the next Hello world example we use the send and receive functions in order to have a synchronized action.

    92 / 540

  • Ordered output with MPI Recv and MPI Send

    .....
    int numprocs, my_rank, flag;
    MPI_Status status;
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    if (my_rank > 0)
      MPI_Recv (&flag, 1, MPI_INT, my_rank-1, 100,
                MPI_COMM_WORLD, &status);
    cout << "Hello world, I have rank " << my_rank
         << " out of " << numprocs << endl;
    if (my_rank < numprocs-1)
      MPI_Send (&my_rank, 1, MPI_INT, my_rank+1, 100,
                MPI_COMM_WORLD);
    MPI_Finalize ();

    93 / 540

  • Note 3

    The basic sending of messages is given by the function MPI_Send, which in C/C++ is defined as

    int MPI_Send(void *buf, int count,
                 MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)

    This single command allows the passing of any kind of variable, even a large array, to any group of tasks. The variable buf is the variable we wish to send, while count is the number of variables we are passing. If we are passing only a single value, this should be 1. If we transfer an array, it is the overall size of the array. For example, if we want to send a 10 by 10 array, count would be 10 × 10 = 100 since we are actually passing 100 values.

    94 / 540

  • Note 4

    Once you have sent a message, you must receive it on another task. The function MPI_Recv is similar to the send call.

    int MPI_Recv( void *buf, int count, MPI_Datatype datatype,
                  int source, int tag, MPI_Comm comm, MPI_Status *status )

    The arguments that are different from those in MPI_Send are buf, which is the name of the variable where you will be storing the received data, and source, which replaces the destination in the send command. This is the return ID of the sender. Finally, we have used MPI_Status status, where one can check if the receive was completed. The output of this code is the same as the previous example, but now process 0 sends a message to process 1, which forwards it further to process 2, and so forth. Armed with this wisdom, having performed all hello world greetings, we are now ready for serious work.

    95 / 540

  • Integrating

    Examples

    I Go to the webpage and click on the programs link

    I Go to MPI and then chapter 7

    I Look at program5.cpp and program6.cpp. (Fortran versions also available.)

    I These codes compute \pi using the rectangular and trapezoidal rules.

    96 / 540

  • Integration algos

    The trapezoidal rule (example6.cpp):

    I = \int_a^b f(x) \, dx = h \left( \frac{f(a)}{2} + f(a+h) + f(a+2h) + \dots + f(b-h) + \frac{f(b)}{2} \right).

    Another very simple approach is the so-called midpoint or rectangle method. In this case the integration area is split in a given number of rectangles with length h and height given by the mid-point value of the function. This gives the following simple rule for approximating an integral:

    I = \int_a^b f(x) \, dx \approx h \sum_{i=1}^{N} f(x_{i-1/2}),

    where f(x_{i-1/2}) is the midpoint value of f for a given rectangle. This is used in example5.cpp.

    97 / 540
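    For comparison with the parallel version dissected next, a serial midpoint-rule function can be written in a few lines. This is a minimal sketch, not one of the course programs:

    // Serial midpoint (rectangle) rule: I = h * sum of midpoint values
    #include <iostream>
    using namespace std;

    double midpoint_rule(double a, double b, int n, double (*func)(double))
    {
      double h = (b-a)/((double) n);
      double sum = 0.;
      for (int i = 0; i < n; i++) {
        sum += (*func)(a + (i+0.5)*h);  // midpoint value f(x_{i-1/2})
      }
      return sum*h;
    }

    double f(double x) { return 4.0/(1.0 + x*x); } // integrand, I = pi

    int main() {
      cout << midpoint_rule(0., 1., 1000, &f) << endl; // approx 3.14159
      return 0;
    }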

  • Dissection of example 5

    1  // Rectangle rule and numerical integration
    2  using namespace std;
    3  #include <mpi.h>
    4  #include <iostream>

    5  int main (int nargs, char* args[])
    6  {
    7    int numprocs, my_rank, i, n = 1000;
    8    double local_sum, rectangle_sum, x, h;
    9    // MPI initializations
    10   MPI_Init (&nargs, &args);
    11   MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
    12   MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);

    98 / 540

  • Dissection of example 5

    After the standard initializations with MPI such as MPI_Init, MPI_Comm_size and MPI_Comm_rank, MPI_COMM_WORLD now contains the number of processes defined by using for example

    mpiexec -np 10 ./prog.x

    In line 14 we check if we have read in from screen the number of mesh points n. Note that in line 7 we fix n = 1000; however, we have the possibility to run the code with a different number of mesh points as well. If my_rank equals zero, which corresponds to the master node, then we read a new value of n if the number of command-line arguments is larger than one. This can be done as follows when we run the code:

    mpiexec -np 10 ./prog.x 10000

    99 / 540

  • Dissection of example 5

    13 // Read from screen a possible new value of n
    14 if (my_rank == 0 && nargs > 1) {
    15   n = atoi(args[1]);
    16 }
    17 h = 1.0/n;
    18 // Broadcast n and h to all processes
    19 MPI_Bcast (&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    20 MPI_Bcast (&h, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    21 // Every process sets up its contribution to the integral
    22 local_sum = 0.;
    23 for (i = my_rank; i < n; i += numprocs) {
    24   x = (i+0.5)*h;
    25   local_sum += 4.0/(1.0+x*x);
    26 }
    27 local_sum *= h;

    In line 17 we also define the step length h. In lines 19 and 20 we use the broadcast function MPI_Bcast. We use this particular function because we want data on one processor (our master node) to be shared with all other processors. The broadcast function sends data to a group of processes.

    100 / 540
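    Line 25 uses the integrand 4/(1+x^2) because its integral on [0, 1] is exactly \pi:

    \int_0^1 \frac{4}{1+x^2} \, dx = 4 \arctan x \Big|_0^1 = 4 \cdot \frac{\pi}{4} = \pi,

    so the accumulated rectangle sum approximates \pi.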

  • Dissection of example 5

    The MPI routine MPI_Bcast transfers data from one task to a group of others. The format for the call is in C++ given by the parameters of

    MPI_Bcast (&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
    MPI_Bcast (&h, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    in the case of a double. The general structure of this function is

    MPI_Bcast( void *buf, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

    All processes call this function, both the process sending the data (with rank zero) and all the other processes in MPI_COMM_WORLD. Every process now has copies of n and h, the number of mesh points and the step length, respectively.

    We transfer the addresses of n and h. The second argument represents the number of data sent. In the case of a one-dimensional array, one needs to transfer the number of array elements. If you have an n × m matrix, you must transfer n × m. We also need to specify whether the variable type we transfer is non-numerical, such as a logical or character variable, or numerical of the integer, real or complex type.

    101 / 540

  • Dissection of example 5

    28 if (my_rank == 0) {
    29   MPI_Status status;
    30   rectangle_sum = local_sum;
    31   for (i=1; i < numprocs; i++) {
    32     MPI_Recv(&local_sum,1,MPI_DOUBLE,MPI_ANY_SOURCE,500,
                    MPI_COMM_WORLD,&status);
    33     rectangle_sum += local_sum;
    34   }
    35   cout << "Result: " << rectangle_sum << endl;
    36 } else
    37   MPI_Send(&local_sum,1,MPI_DOUBLE,0,500,MPI_COMM_WORLD);
    38 // End MPI
    39 MPI_Finalize ();

  • Dissection of example 5

    In lines 23-27, every process sums its own part of the final sum used by the rectangle rule. The receive statement collects the sums from all other processes in case my_rank == 0, else an MPI_Send is performed. If we are not the master node, we send the results; else they are received and the local results are added to the final sum. The above can be rewritten using MPI_Allreduce, as discussed in the next example.

    The above function is not very elegant. Furthermore, the MPI instructions can be simplified by using the functions MPI_Reduce or MPI_Allreduce. The first function takes information from all processes and sends the result of the MPI operation to one process only, typically the master node. If we use MPI_Allreduce, the result is sent back to all processes, a feature which is useful when all nodes need the value of a joint operation. We limit ourselves to MPI_Reduce since it is only one process which will print out the final number of our calculation. The arguments to MPI_Allreduce are the same.

    103 / 540

  • MPI reduce

    Call as

    MPI_Reduce( void *senddata, void* resultdata, int count,
        MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

    The two variables senddata and resultdata are obvious, besides the fact that one sends the address of the variable or the first element of an array. If they are arrays they need to have the same size. The variable count represents the total dimensionality, 1 in case of just one variable, while MPI_Datatype defines the type of variable which is sent and received. The new feature is MPI_Op. It defines the type of operation we want to do. In our case, since we are summing the rectangle contributions from every process, we define MPI_Op = MPI_SUM. If we have an array or matrix we can search for the largest or smallest element by sending either MPI_MAX or MPI_MIN. If we want the location as well (which array element) we simply transfer MPI_MAXLOC or MPI_MINLOC. If we want the product we write MPI_PROD. MPI_Allreduce is defined as

    MPI_Allreduce( void *senddata, void* resultdata, int count,
        MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

    104 / 540
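    As a minimal usage sketch of MPI_Reduce (not one of the course programs), the following sums the rank numbers of all processes onto the master node:

    // Sum my_rank over all processes; only rank 0 receives the result.
    #include <mpi.h>
    #include <iostream>
    using namespace std;

    int main(int nargs, char* args[])
    {
      int numprocs, my_rank, local_value, global_sum;
      MPI_Init(&nargs, &args);
      MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
      MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
      local_value = my_rank;
      // all processes must call MPI_Reduce with the same root
      MPI_Reduce(&local_value, &global_sum, 1, MPI_INT, MPI_SUM, 0,
                 MPI_COMM_WORLD);
      if (my_rank == 0)
        cout << "Sum of ranks = " << global_sum << endl; // N(N-1)/2
      MPI_Finalize();
      return 0;
    }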

  • Dissection of example 6

    // Trapezoidal rule and numerical integration using MPI, example 6
    using namespace std;
    #include <mpi.h>
    #include <iostream>

    // Here we define various functions called by the main program

    double int_function(double );
    double trapezoidal_rule(double , double , int , double (*)(double));

    // Main function begins here
    int main (int nargs, char* args[])
    {
      int n, local_n, numprocs, my_rank;
      double a, b, h, local_a, local_b, total_sum, local_sum;
      double time_start, time_end, total_time;

    105 / 540

  • Dissection of example 6

    // MPI initializations
    MPI_Init (&nargs, &args);
    MPI_Comm_size (MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank (MPI_COMM_WORLD, &my_rank);
    time_start = MPI_Wtime();
    // Fixed values for a, b and n
    a = 0.0 ; b = 1.0; n = 1000;
    h = (b-a)/n; // h is the same for all processes
    local_n = n/numprocs;
    // make sure n > numprocs, else integer division gives zero
    // Length of each process' interval of integration = local_n*h.
    local_a = a + my_rank*local_n*h;
    local_b = local_a + local_n*h;

    106 / 540

  • Dissection of example 6

    total_sum = 0.0;
    local_sum = trapezoidal_rule(local_a, local_b, local_n,
                                 &int_function);
    MPI_Reduce(&local_sum, &total_sum, 1, MPI_DOUBLE,
               MPI_SUM, 0, MPI_COMM_WORLD);
    time_end = MPI_Wtime();
    total_time = time_end-time_start;
    if ( my_rank == 0) {
      cout << "Trapezoidal rule = " << total_sum << endl;
      cout << "Time = " << total_time
           << " on number of processors: " << numprocs << endl;
    }
    // End MPI
    MPI_Finalize ();
    return 0;
    } // end of main program

    107 / 540

  • Dissection of example 6

    // this function defines the function to integrate
    double int_function(double x)
    {
      double value = 4./(1.+x*x);
      return value;
    } // end of function to evaluate

    108 / 540

  • Dissection of example 6

    Implementation of the trapezoidal rule.

    // this function defines the trapezoidal rule
    double trapezoidal_rule(double a, double b, int n,
                            double (*func)(double))
    {
      double trapez_sum;
      double fa, fb, x, step;
      int j;
      step=(b-a)/((double) n);
      fa=(*func)(a)/2. ;
      fb=(*func)(b)/2. ;
      trapez_sum=0.;
      for (j=1; j <= n-1; j++){
        x=j*step+a;
        trapez_sum+=(*func)(x);
      }
      trapez_sum=(trapez_sum+fb+fa)*step;
      return trapez_sum;
    } // end trapezoidal_rule

    109 / 540

  • Strategies

    I Develop codes locally, run with some few processes and test your codes. Do benchmarking, timing and so forth on local nodes, for example your laptop. You can install MPICH2 on your laptop (most new laptops come with dual cores). You can test with one node at the lab.

    I When you are convinced that your codes run correctly, you start your production runs on available supercomputers, in our case titan.uio.no.

    110 / 540

  • How do I run MPI on the machines at the lab(MPICH2)

    The machines at the lab are all quad-cores.

    I Compile with mpicxx, mpic++ or icc for C++ users and mpif90 or ifort for Fortran users.

    I Set up collaboration between processes with

    mpd --ncpus=4

    and run the code with

    mpiexec -n 4 ./nameofprog

    Here we declare that we will use 4 processes via the ncpus option and via -n 4 when running.

    I End with

    mpdallexit

    111 / 540

  • How do I use the titan.uio.no cluster?

    [email protected]

    I Computational Physics requires High Performance Computing (HPC) resources

    I USIT and the Research Computing Services (RCS) provide HPC resources and HPC support

    I Resources: titan.uio.no
    I Support: 14 people
    I Contact: [email protected]

    112 / 540


  • Titan

    Hardware

    I 546 X2200m2, 7 X4200, Magnum 3456 IB switch
    I Hugemem nodes of 128 - 256GB RAM
    I EVA8K 120 TB storage
    I Quad core Intel and AMD (~4000 cores in total)
    I Infiniband and ethernet
    I Heterogenous cluster!

    113 / 540

  • Titan

    Software

    I Batch system: SLURM and MAUI
    I Message Passing Interface (MPI):
      I OpenMPI
      I Scampi
      I MPICH2
    I Compilers: GCC, Intel, Portland and Pathscale
    I Optimized math libraries and scientific applications
    I All you need may be found under /site
    I Available software: http://www.hpc.uio.no/index.php/Titan_software

    114 / 540


  • Getting startedBatch systems

    I A batch system controls the use of the cluster resources
    I Submits the job to the right resource
    I Monitors the job while executing
    I Restarts the job in case of failure
    I Takes care of priorities and queues to control the execution order of unrelated jobs

    Sun Grid Engine

    I SGE is the batch system used on Titan
    I Jobs are executed either interactively or through job scripts
    I Useful commands: showq, qlogin, sbatch
    I http://hpc.uio.no/index.php/Titan_User_Guide

    115 / 540


  • Getting started

    Modules

    I Different compilers, MPI-versions and applications need different sets of user environment variables
    I The modules package lets you load and remove the different variable sets
    I Useful commands:
      I List available modules: module avail
      I Load module: module load <module>
      I Unload module: module unload <module>
      I Currently loaded: module list
    I http://hpc.uio.no/index.php/Titan_User_Guide

    116 / 540


  • Example

    Interactively

    # login to titan
    $ ssh titan.uio.no
    # ask for 4 cpus
    $ qlogin --account=fys3150 --ntasks=4
    # start a job setup, note the dot!
    $ source /site/bin/jobsetup
    # we want to use the scampi module
    $ module load scampi
    # or replace scampi with openmpi
    $ mkdir -p fys3150/mpiexample/
    $ cd fys3150/mpiexample/
    # Use program5.cpp from the course pages, see chapter 7
    # compile the program
    $ mpic++ -O3 -o program5.x program5.cpp
    # and execute it
    $ mpirun ./program5.x
    /opt/scali/bin/mpimon -stdin all ./program5.x -- compute-9-21 1 compute-9-26 1 compute-9-26 1 compute-9-31 1

    Result: 3.14159

    117 / 540

  • The job script is called job.slurm

    job.slurm
    #!/bin/sh
    # Call this file job.slurm
    # 4 cpus with mpi (or other communication)
    #SBATCH --ntasks=4
    # 10 mins of wall time
    #SBATCH --time=0:10:00
    # project fys3150
    #SBATCH --account=fys3150
    # we need 2000 MB of memory per process
    #SBATCH --mem-per-cpu=2000M
    # name of job
    #SBATCH --job-name=program5

    source /site/bin/jobsetup

    # load the module used when we compiled the program
    module load scampi

    # start program
    mpirun ./program5.x

    # END OF SCRIPT

    118 / 540

  • Example

    Submitting

    # login to titan
    $ ssh titan.uio.no
    # we want to use the module scampi
    $ module load scampi
    $ cd fys3150/mpiexample/
    # compile the program
    $ mpic++ -O3 -o program5.x program5.cpp
    # and submit it
    $ sbatch job.slurm
    $ exit

    119 / 540

  • Example

    Checking execution

    # check if job is running:
    $ showq -u mhjensen
    ACTIVE JOBS
    JOBNAME  USERNAME  STATE    PROC  REMAINING  STARTTIME
    883129   mhjensen  Running  4     10:31:17   Fri Oct 2 13:59:25

    1 Active Job    2692 of 4252 Processors Active (63.31%)
                     482 of  602 Nodes Active (80.07%)

    IDLE JOBS
    JOBNAME  USERNAME  STATE  PROC  WCLIMIT  QUEUETIME

    0 Idle Jobs

    BLOCKED JOBS
    JOBNAME  USERNAME  STATE  PROC  WCLIMIT  QUEUETIME

    Total Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0

    120 / 540

  • Tips and admonitions

    Tips

    I Titan FAQ: http://www.hpc.uio.no/index.php/FAQ
    I man-pages, e.g. man sbatch
    I Ask us

    Admonitions
    I Remember to exit from qlogin-sessions; the resource is reserved for you until you
      exit
    I Don't run jobs on login-nodes; these are only for compiling and editing files

    121 / 540


  • Topics for Week 4, 25-29 January

    Work on project 1

    I Repetition from previous weeks
    I Only work on project 1

    122 / 540

  • Topics for Week 5, February 1-5

    Importance sampling and closed form expressions for the local energy

    I Repetition from the last two weeks
    I How to compute the local energy, numerically versus closed form expressions
    I Importance sampling, the basic philosophy.

    Project work this week: finalize 1a and start of 1b.

    123 / 540

  • Structuring the code

    During the development of our code we need to make several checks. It is also very
    instructive to compute a closed form expression for the local energy. Since our wave
    function is rather simple, it is straightforward to find an analytic expression. Consider
    first the case of the simple helium function

    \[ \Psi_T(\mathbf{r}_1, \mathbf{r}_2) = e^{-\alpha(r_1+r_2)} \]

    The local energy is for this case

    \[ E_{L1} = \left(\alpha - Z\right)\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{1}{r_{12}} - \alpha^2, \]

    which gives an expectation value for the local energy given by

    \[ \langle E_{L1}\rangle = \alpha^2 - 2\alpha\left(Z - \frac{5}{16}\right) \]

    124 / 540
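    As a cross-check during development, the closed form E_L1 above is only a few lines
    of code. A minimal sketch in C++, assuming a 2 x 3 array of Cartesian coordinates;
    the function and variable names are illustrative, not taken from the course codes.

    #include <cmath>

    // closed-form local energy E_L1 for psi_T = exp(-alpha*(r1+r2))
    double local_energy_EL1(double r[2][3], double alpha, double Z)
    {
      double r1 = 0.0, r2 = 0.0, r12 = 0.0;
      for (int j = 0; j < 3; j++) {
        r1 += r[0][j]*r[0][j];
        r2 += r[1][j]*r[1][j];
        double diff = r[0][j] - r[1][j];
        r12 += diff*diff;
      }
      r1 = std::sqrt(r1); r2 = std::sqrt(r2); r12 = std::sqrt(r12);
      // E_L1 = (alpha - Z)(1/r1 + 1/r2) + 1/r12 - alpha^2
      return (alpha - Z)*(1.0/r1 + 1.0/r2) + 1.0/r12 - alpha*alpha;
    }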

  • Structuring the code

    With analytic formulae we can speed up the computation of the correlation. In our case
    we write it as

    \[ \Psi_C = \exp\left\{\sum_{i<j}\frac{ar_{ij}}{1+\beta r_{ij}}\right\}. \]

  • Structuring the code

    We can test this by computing the local energy for our helium wave function

    \[ \Psi_T(\mathbf{r}_1, \mathbf{r}_2) = \exp\left(-\alpha(r_1+r_2)\right)\exp\left(\frac{r_{12}}{2(1+\beta r_{12})}\right), \]

    with \alpha and \beta as variational parameters. The local energy is for this case

    \[ E_{L2} = E_{L1} + \frac{1}{2(1+\beta r_{12})^2}\left\{\frac{\alpha(r_1+r_2)}{r_{12}}\left(1 - \frac{\mathbf{r}_1\cdot\mathbf{r}_2}{r_1 r_2}\right) - \frac{1}{2(1+\beta r_{12})^2} - \frac{2}{r_{12}} + \frac{2\beta}{1+\beta r_{12}}\right\} \]

    It is very useful to test your code against these expressions. It means also that you
    don't need to compute derivatives numerically as discussed last week. This week you
    should implement this expression and test the time usage against the code with
    numerical derivation.

    126 / 540
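    The closed form E_L2 extends the sketch above with the Jastrow terms. Again a
    minimal sketch with illustrative names, not the course code:

    #include <cmath>

    // closed-form local energy E_L2 for the helium trial function with the
    // linear Pade-Jastrow factor
    double local_energy_EL2(double r[2][3], double alpha, double beta, double Z)
    {
      double r1 = 0.0, r2 = 0.0, r12 = 0.0, dot = 0.0;
      for (int j = 0; j < 3; j++) {
        r1  += r[0][j]*r[0][j];
        r2  += r[1][j]*r[1][j];
        dot += r[0][j]*r[1][j];              // r_1 . r_2
        double diff = r[0][j] - r[1][j];
        r12 += diff*diff;
      }
      r1 = std::sqrt(r1); r2 = std::sqrt(r2); r12 = std::sqrt(r12);
      double EL1 = (alpha - Z)*(1.0/r1 + 1.0/r2) + 1.0/r12 - alpha*alpha;
      double denom = 1.0 + beta*r12;
      // the curly bracket of the expression for E_L2
      double bracket = alpha*(r1 + r2)/r12*(1.0 - dot/(r1*r2))
                     - 1.0/(2.0*denom*denom) - 2.0/r12 + 2.0*beta/denom;
      return EL1 + bracket/(2.0*denom*denom);
    }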

  • Structuring the code, simple task

    I Make another copy of your code.

    I Implement the closed form expression for the local energy

    I Compile the new and old codes with the -pg option for profiling.

    I Run both codes and profile them afterwards using gprof <executable> > outprofile

    I Study the time usage in the file outprofile

    127 / 540

  • Your tasks till next week

    I Implement the closed form expression for the local energy

    I Convince yourself that the closed form expressions are correct, see slides fromthis week.

    I Implement the above expressions for systems with more than two electrons.

    I Implement importance sampling, see the code vmc_be.cpp.

    I Finish part 1a and begin part 1b. Blocking will be discussed next week.

    I You need to produce random numbers with a Gaussian distribution; a minimal
      sketch is given below.

    I Reading task: Thijssen's text chapters 8.8 and 12.2.

    128 / 540
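    The Gaussian deviates mentioned in the list above can be generated with the polar
    form of the Box-Muller method. A minimal sketch; ran_uniform is a placeholder for
    whatever uniform generator on (0,1) you use, for example ran2 from the course library.

    #include <cmath>
    #include <cstdlib>

    // placeholder uniform generator on (0,1); replace with ran2 or similar
    double ran_uniform() { return (rand() + 1.0)/(RAND_MAX + 2.0); }

    // one Gaussian deviate with zero mean and unit variance
    double gaussian_deviate()
    {
      double v1, v2, rsq;
      do {
        v1 = 2.0*ran_uniform() - 1.0;
        v2 = 2.0*ran_uniform() - 1.0;
        rsq = v1*v1 + v2*v2;
      } while (rsq >= 1.0 || rsq == 0.0);
      // polar Box-Muller; v2*sqrt(...) would give a second, independent deviate
      return v1*std::sqrt(-2.0*std::log(rsq)/rsq);
    }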

  • Efficient calculations of wave function ratios

    In the Metropolis-Hastings algorithm, the acceptance ratio determines the probability
    for a particle to be accepted at a new position. The ratio of the trial wave functions
    evaluated at the new and current positions is given by

    \[ R \equiv \frac{\Psi_T^{\mathrm{new}}}{\Psi_T^{\mathrm{cur}}} = \underbrace{\frac{\Psi_D^{\mathrm{new}}}{\Psi_D^{\mathrm{cur}}}}_{R_{SD}}\,\underbrace{\frac{\Psi_C^{\mathrm{new}}}{\Psi_C^{\mathrm{cur}}}}_{R_C}. \quad (17) \]

    Here \Psi_D is our Slater determinant while \Psi_C is our correlation function. We need
    to optimize the \nabla\Psi_T/\Psi_T ratio and the second derivative as well, that is the
    \nabla^2\Psi_T/\Psi_T ratio. The first is needed when we compute the so-called quantum
    force in importance sampling. The second is needed when we compute the kinetic
    energy term of the local energy.

    \[ \frac{\nabla(\Psi_D\,\Psi_C)}{\Psi_D\,\Psi_C} = \frac{\Psi_C\nabla\Psi_D + \Psi_D\nabla\Psi_C}{\Psi_D\Psi_C} = \frac{\nabla\Psi_D}{\Psi_D} + \frac{\nabla\Psi_C}{\Psi_C} \]

    129 / 540

  • Efficient calculations of wave function ratios

    The expectation value of the kinetic energy expressed in atomic units for electron i is

    \[ \langle \hat{K}_i \rangle = -\frac{1}{2}\frac{\langle\Psi|\nabla_{i}^2|\Psi\rangle}{\langle\Psi|\Psi\rangle}, \quad (18) \]

    \[ K_i = -\frac{1}{2}\frac{\nabla_{i}^2\Psi}{\Psi}. \quad (19) \]

    \[ \frac{\nabla^2\Psi}{\Psi} = \frac{\nabla^2(\Psi_D\,\Psi_C)}{\Psi_D\,\Psi_C} = \frac{\nabla\cdot\left[\nabla(\Psi_D\,\Psi_C)\right]}{\Psi_D\,\Psi_C} = \frac{\nabla\cdot\left[\Psi_C\nabla\Psi_D + \Psi_D\nabla\Psi_C\right]}{\Psi_D\,\Psi_C} \]
    \[ = \frac{\nabla\Psi_C\cdot\nabla\Psi_D + \Psi_C\nabla^2\Psi_D + \nabla\Psi_D\cdot\nabla\Psi_C + \Psi_D\nabla^2\Psi_C}{\Psi_D\,\Psi_C} \quad (20) \]

    \[ \frac{\nabla^2\Psi}{\Psi} = \frac{\nabla^2\Psi_D}{\Psi_D} + \frac{\nabla^2\Psi_C}{\Psi_C} + 2\frac{\nabla\Psi_D}{\Psi_D}\cdot\frac{\nabla\Psi_C}{\Psi_C} \quad (21) \]

    130 / 540

  • Definitions

    We define the correlated function as

    \[ \Psi_C = \prod_{i<j}g(r_{ij}), \]

    with r_{ij} = |\mathbf{r}_i - \mathbf{r}_j| the distance between particles i and j.

  • Efficient calculations of wave function ratios

    The total number of different relative distances r_{ij} is N(N-1)/2. In a matrix storage
    format, the set forms a strictly upper triangular matrix

    \[ \mathbf{r} \equiv \begin{pmatrix} 0 & r_{1,2} & r_{1,3} & \cdots & r_{1,N} \\ \vdots & 0 & r_{2,3} & \cdots & r_{2,N} \\ \vdots & \vdots & 0 & \ddots & \vdots \\ \vdots & \vdots & \vdots & \ddots & r_{N-1,N} \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}. \quad (22) \]

    This applies to g = g(r_{ij}) as well.

    In our algorithm we will move one particle at a time, say the kth particle. Keep this in
    mind in the discussion to come; a sketch of the corresponding O(N) update of the
    distance matrix follows below.

    132 / 540
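    Since only the distances involving the moved particle change, the distance matrix can
    be refreshed in O(N) operations per move. A minimal sketch with illustrative names,
    using std::vector storage instead of the course's matrix allocator:

    #include <cmath>
    #include <vector>

    // after moving particle k, recompute only row k and column k of the
    // strictly upper triangular distance matrix rdist (rdist[i][j] valid for i < j)
    void update_distances(std::vector<std::vector<double> > &rdist,
                          const std::vector<std::vector<double> > &r,  // positions, N x d
                          int k)
    {
      int N = (int) r.size(), d = (int) r[0].size();
      for (int i = 0; i < N; i++) {
        if (i == k) continue;
        double sum = 0.0;
        for (int j = 0; j < d; j++) {
          double diff = r[i][j] - r[k][j];
          sum += diff*diff;
        }
        if (i < k) rdist[i][k] = std::sqrt(sum);  // store in the upper triangle only
        else       rdist[k][i] = std::sqrt(sum);
      }
    }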

  • Efficient calculations of wave function ratios

    \[ R_C = \frac{\Psi_C^{\mathrm{new}}}{\Psi_C^{\mathrm{cur}}} = \prod_{i=1}^{k-1}\frac{g_{ik}^{\mathrm{new}}}{g_{ik}^{\mathrm{cur}}}\,\prod_{i=k+1}^{N}\frac{g_{ki}^{\mathrm{new}}}{g_{ki}^{\mathrm{cur}}}. \quad (23) \]

    For the Pade-Jastrow form

    \[ R_C = \frac{\Psi_C^{\mathrm{new}}}{\Psi_C^{\mathrm{cur}}} = \frac{e^{U_{\mathrm{new}}}}{e^{U_{\mathrm{cur}}}} = e^{\Delta U}, \quad (24) \]

    where

    \[ \Delta U = \sum_{i=1}^{k-1}\left(f_{ik}^{\mathrm{new}} - f_{ik}^{\mathrm{cur}}\right) + \sum_{i=k+1}^{N}\left(f_{ki}^{\mathrm{new}} - f_{ki}^{\mathrm{cur}}\right). \quad (25) \]

    One needs to develop a special algorithm that runs only through the elements of the
    upper triangular matrix g and has k as an index; see the sketch below.

    133 / 540
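    A minimal sketch of such an algorithm for Eq. (25): Delta U touches only the
    elements that involve the moved particle k, and R_C then follows as exp(Delta U).
    The names and the use of full distance matrices are illustrative assumptions.

    #include <cmath>
    #include <vector>

    double f_PJ(double r, double a, double beta) { return a*r/(1.0 + beta*r); }

    // Delta U of Eq. (25); r_old and r_new hold the distances before and after the move
    double deltaU(const std::vector<std::vector<double> > &r_old,
                  const std::vector<std::vector<double> > &r_new,
                  int k, int N, double a, double beta)
    {
      double dU = 0.0;
      for (int i = 0; i < k; i++)      // elements f_{ik}, i < k
        dU += f_PJ(r_new[i][k], a, beta) - f_PJ(r_old[i][k], a, beta);
      for (int i = k + 1; i < N; i++)  // elements f_{ki}, i > k
        dU += f_PJ(r_new[k][i], a, beta) - f_PJ(r_old[k][i], a, beta);
      return dU;                       // R_C = exp(dU)
    }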

  • Efficient calculations of wave function ratios

    The expression to be derived in the following is of interest when computing the
    quantum force and the kinetic energy. It has the form

    \[ \frac{\nabla_i\Psi_C}{\Psi_C} = \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_i}, \]

    for all dimensions and with i running over all particles. For the first derivative only
    N-1 terms survive the ratio because the g-terms that are not differentiated cancel
    with their corresponding ones in the denominator. Then,

    \[ \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\partial g_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\partial g_{ki}}{\partial x_k}. \quad (26) \]

    An equivalent equation is obtained for the exponential form after replacing g_{ij} by
    \exp(f_{ij}), yielding:

    \[ \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\partial f_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{\partial f_{ki}}{\partial x_k}, \quad (27) \]

    with both expressions scaling as O(N).

    134 / 540

  • Efficient calculations of wave function ratios

    Using the identity

    \[ \frac{\partial}{\partial x_i}g_{ij} = -\frac{\partial}{\partial x_j}g_{ij} \quad (28) \]

    on the right hand side terms of Eq. (26) and Eq. (27), we get expressions where all the
    derivatives acting on the particle are represented by the second index of g:

    \[ \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\partial g_{ik}}{\partial x_k} - \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\partial g_{ki}}{\partial x_i}, \quad (29) \]

    and for the exponential case:

    \[ \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\partial f_{ik}}{\partial x_k} - \sum_{i=k+1}^{N}\frac{\partial f_{ki}}{\partial x_i}. \quad (30) \]

    135 / 540

  • Efficient calculations of wave function ratios

    For correlation forms depending only on the scalar distances r_{ij} we can use the
    chain rule. Noting that

    \[ \frac{\partial g_{ij}}{\partial x_j} = \frac{\partial g_{ij}}{\partial r_{ij}}\frac{\partial r_{ij}}{\partial x_j} = \frac{x_j - x_i}{r_{ij}}\frac{\partial g_{ij}}{\partial r_{ij}}, \quad (31) \]

    after substitution in Eq. (29) and Eq. (30) we arrive at

    \[ \frac{\nabla_k\Psi_C}{\Psi_C} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}} - \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\frac{\mathbf{r}_{ki}}{r_{ki}}\frac{\partial g_{ki}}{\partial r_{ki}}. \quad (32) \]

    136 / 540

  • Efficient calculations of wave function ratios

    Note that for the Pade-Jastrow form we can set g_{ij} \equiv g(r_{ij}) = e^{f(r_{ij})} = e^{f_{ij}} and

    \[ \frac{\partial g_{ij}}{\partial r_{ij}} = g_{ij}\frac{\partial f_{ij}}{\partial r_{ij}}. \quad (33) \]

    Therefore,

    \[ \frac{\nabla_k\Psi_C}{\Psi_C} = \sum_{i=1}^{k-1}\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial f_{ik}}{\partial r_{ik}} - \sum_{i=k+1}^{N}\frac{\mathbf{r}_{ki}}{r_{ki}}\frac{\partial f_{ki}}{\partial r_{ki}}, \quad (34) \]

    where

    \[ \mathbf{r}_{ij} = \mathbf{r}_j - \mathbf{r}_i = (x_j - x_i)\mathbf{e}_1 + (y_j - y_i)\mathbf{e}_2 + (z_j - z_i)\mathbf{e}_3 \quad (35) \]

    is the vectorial distance. When the correlation function is the linear Pade-Jastrow we
    set

    \[ f_{ij} = \frac{ar_{ij}}{1+\beta r_{ij}}, \quad (36) \]

    which yields the analytical expression

    \[ \frac{\partial f_{ij}}{\partial r_{ij}} = \frac{a}{(1+\beta r_{ij})^2}. \quad (37) \]

    137 / 540
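    A minimal sketch of Eqs. (34) and (37): since r_ik = -r_ki, both sums collapse into a
    single loop over i != k. Names are illustrative, not taken from the course codes.

    #include <cmath>
    #include <vector>

    // Jastrow contribution to grad_k Psi_C / Psi_C for the linear Pade-Jastrow
    void jastrow_gradient(const std::vector<std::vector<double> > &r,  // positions, N x d
                          int k, double a, double beta,
                          std::vector<double> &grad)                   // output, length d
    {
      int N = (int) r.size(), d = (int) r[0].size();
      for (int j = 0; j < d; j++) grad[j] = 0.0;
      for (int i = 0; i < N; i++) {
        if (i == k) continue;
        double rik = 0.0;
        for (int j = 0; j < d; j++) {
          double diff = r[k][j] - r[i][j];
          rik += diff*diff;
        }
        rik = std::sqrt(rik);
        double dfdr = a/((1.0 + beta*rik)*(1.0 + beta*rik));  // Eq. (37)
        for (int j = 0; j < d; j++)
          grad[j] += (r[k][j] - r[i][j])/rik*dfdr;            // Eq. (34)
      }
    }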

  • Efficient calculations of wave function ratios

    Computing the \nabla^2_k\Psi_C/\Psi_C ratio. We start from

    \[ \frac{\nabla_k\Psi_C}{\Psi_C} = \sum_{i=1}^{k-1}\frac{1}{g_{ik}}\nabla_k g_{ik} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\nabla_k g_{ki}. \]

    After multiplying by \Psi_C and taking the gradient on both sides we get,

    \[ \nabla^2_k\Psi_C = \nabla_k\Psi_C\cdot\left(\sum_{i=1}^{k-1}\frac{1}{g_{ik}}\nabla_k g_{ik} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\nabla_k g_{ki}\right) + \Psi_C\,\nabla_k\cdot\left(\sum_{i=1}^{k-1}\frac{1}{g_{ik}}\nabla_k g_{ik} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\nabla_k g_{ki}\right) \]
    \[ = \Psi_C\left(\frac{\nabla_k\Psi_C}{\Psi_C}\right)^2 + \Psi_C\,\nabla_k\cdot\left(\sum_{i=1}^{k-1}\frac{1}{g_{ik}}\nabla_k g_{ik} + \sum_{i=k+1}^{N}\frac{1}{g_{ki}}\nabla_k g_{ki}\right). \quad (38) \]

    138 / 540

  • Efficient calculations of wave function ratios

    Now,

    \[ \nabla_k\cdot\left(\frac{1}{g_{ik}}\nabla_k g_{ik}\right) = \nabla_k\left(\frac{1}{g_{ik}}\right)\cdot\nabla_k g_{ik} + \frac{1}{g_{ik}}\nabla_k\cdot\nabla_k g_{ik} \]
    \[ = -\frac{1}{g_{ik}^2}\nabla_k g_{ik}\cdot\nabla_k g_{ik} + \frac{1}{g_{ik}}\nabla_k\cdot\left(\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right) \]
    \[ = -\frac{1}{g_{ik}^2}\left(\nabla_k g_{ik}\right)^2 + \frac{1}{g_{ik}}\left[\nabla_k\left(\frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right)\cdot\mathbf{r}_{ik} + \frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\,\nabla_k\cdot\mathbf{r}_{ik}\right] \]
    \[ = -\frac{1}{g_{ik}^2}\left(\frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right)^2 + \frac{1}{g_{ik}}\left[\nabla_k\left(\frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right)\cdot\mathbf{r}_{ik} + \frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\,d\right] \]
    \[ = -\frac{1}{g_{ik}^2}\left(\frac{\partial g_{ik}}{\partial r_{ik}}\right)^2 + \frac{1}{g_{ik}}\left[\nabla_k\left(\frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right)\cdot\mathbf{r}_{ik} + \frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\,d\right], \quad (39) \]

    with d being the number of spatial dimensions.

    139 / 540

  • Efficient calculations of wave function ratios

    Moreover,

    \[ \nabla_k\left(\frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right) = \frac{\mathbf{r}_{ik}}{r_{ik}}\frac{\partial}{\partial r_{ik}}\left(\frac{1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}}\right) = \frac{\mathbf{r}_{ik}}{r_{ik}}\left(-\frac{1}{r_{ik}^2}\frac{\partial g_{ik}}{\partial r_{ik}} + \frac{1}{r_{ik}}\frac{\partial^2 g_{ik}}{\partial r_{ik}^2}\right). \]

    We finally get

    \[ \nabla_k\cdot\left(\frac{1}{g_{ik}}\nabla_k g_{ik}\right) = -\frac{1}{g_{ik}^2}\left(\frac{\partial g_{ik}}{\partial r_{ik}}\right)^2 + \frac{1}{g_{ik}}\left[\frac{d-1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}} + \frac{\partial^2 g_{ik}}{\partial r_{ik}^2}\right]. \]

    140 / 540

  • Efficient calculations of wave function ratios

    Inserting the last expression in Eq. (38) and after division by \Psi_C we get,

    \[ \frac{\nabla^2_k\Psi_C}{\Psi_C} = \left(\frac{\nabla_k\Psi_C}{\Psi_C}\right)^2 + \sum_{i=1}^{k-1}\left\{-\frac{1}{g_{ik}^2}\left(\frac{\partial g_{ik}}{\partial r_{ik}}\right)^2 + \frac{1}{g_{ik}}\left[\frac{d-1}{r_{ik}}\frac{\partial g_{ik}}{\partial r_{ik}} + \frac{\partial^2 g_{ik}}{\partial r_{ik}^2}\right]\right\} \]
    \[ + \sum_{i=k+1}^{N}\left\{-\frac{1}{g_{ki}^2}\left(\frac{\partial g_{ki}}{\partial r_{ki}}\right)^2 + \frac{1}{g_{ki}}\left[\frac{d-1}{r_{ki}}\frac{\partial g_{ki}}{\partial r_{ki}} + \frac{\partial^2 g_{ki}}{\partial r_{ki}^2}\right]\right\}. \quad (40) \]

    141 / 540

  • Efficient calculations of wave function ratios

    For the exponential case we have

    \[ \frac{\nabla^2_k\Psi_C}{\Psi_C} = \left(\frac{\nabla_k\Psi_C}{\Psi_C}\right)^2 + \sum_{i=1}^{k-1}\left\{-\frac{1}{g_{ik}^2}\left(g_{ik}\frac{\partial f_{ik}}{\partial r_{ik}}\right)^2 + \frac{1}{g_{ik}}\left[\frac{d-1}{r_{ik}}g_{ik}\frac{\partial f_{ik}}{\partial r_{ik}} + \frac{\partial}{\partial r_{ik}}\left(g_{ik}\frac{\partial f_{ik}}{\partial r_{ik}}\right)\right]\right\} \]
    \[ + \sum_{i=k+1}^{N}\left\{-\frac{1}{g_{ki}^2}\left(g_{ki}\frac{\partial f_{ki}}{\partial r_{ki}}\right)^2 + \frac{1}{g_{ki}}\left[\frac{d-1}{r_{ki}}g_{ki}\frac{\partial f_{ki}}{\partial r_{ki}} + \frac{\partial}{\partial r_{ki}}\left(g_{ki}\frac{\partial f_{ki}}{\partial r_{ki}}\right)\right]\right\}. \]

    142 / 540

  • Efficient calculations of wave function ratios

    Using

    \[ \frac{\partial}{\partial r_{ik}}\left(g_{ik}\frac{\partial f_{ik}}{\partial r_{ik}}\right) = \frac{\partial g_{ik}}{\partial r_{ik}}\frac{\partial f_{ik}}{\partial r_{ik}} + g_{ik}\frac{\partial^2 f_{ik}}{\partial r_{ik}^2} = g_{ik}\frac{\partial f_{ik}}{\partial r_{ik}}\frac{\partial f_{ik}}{\partial r_{ik}} + g_{ik}\frac{\partial^2 f_{ik}}{\partial r_{ik}^2} = g_{ik}\left(\frac{\partial f_{ik}}{\partial r_{ik}}\right)^2 + g_{ik}\frac{\partial^2 f_{ik}}{\partial r_{ik}^2}, \]

    and substituting this result into the equation above gives rise to the final expression,

    \[ \frac{\nabla^2_k\Psi_{PJ}}{\Psi_{PJ}} = \left(\frac{\nabla_k\Psi_{PJ}}{\Psi_{PJ}}\right)^2 + \sum_{i=1}^{k-1}\left[\frac{d-1}{r_{ik}}\frac{\partial f_{ik}}{\partial r_{ik}} + \frac{\partial^2 f_{ik}}{\partial r_{ik}^2}\right] + \sum_{i=k+1}^{N}\left[\frac{d-1}{r_{ki}}\frac{\partial f_{ki}}{\partial r_{ki}} + \frac{\partial^2 f_{ki}}{\partial r_{ki}^2}\right]. \quad (41) \]

    143 / 540

  • Summing up: Bringing it all together, quantum force

    The general derivative formula of the Jastrow factor is

    \[ \frac{1}{\Psi_C}\frac{\partial\Psi_C}{\partial x_k} = \sum_{i=1}^{k-1}\frac{\partial g_{ik}}{\partial x_k} + \sum_{i=k+1}^{N}\frac{\partial g_{ki}}{\partial x_k}. \]

    However, with our

    \[ \Psi_C = \prod_{i<j}g(r_{ij}) = \exp\left\{\sum_{i<j}f(r_{ij})\right\}, \]

    the gradient needed for the quantum force reduces, via Eqs. (34) and (37), to

    \[ \frac{\nabla_k\Psi_C}{\Psi_C} = \sum_{j\neq k}\frac{\mathbf{r}_k - \mathbf{r}_j}{r_{kj}}\,\frac{a}{(1+\beta r_{kj})^2}. \]

  • Summing up: Bringing it all together, Local energy

    The second derivative of the Jastrow factor divided by the Jastrow factor (the way it
    enters the kinetic energy) is

    \[ \left[\frac{\nabla^2\Psi_C}{\Psi_C}\right]_x = 2\sum_{k=1}^{N}\sum_{i=1}^{k-1}\frac{\partial^2 g_{ik}}{\partial x_k^2} + \sum_{k=1}^{N}\left(\sum_{i=1}^{k-1}\frac{\partial g_{ik}}{\partial x_k} - \sum_{i=k+1}^{N}\frac{\partial g_{ki}}{\partial x_i}\right)^2. \]

    But we have a simple form for the function, namely

    \[ \Psi_C = \prod_{i<j}\exp f(r_{ij}), \]

    for which the expression for particle k takes the closed form given on the next slide.

  • Bringing it all together, Local energy

    Using

    \[ f(r_{ij}) = \frac{ar_{ij}}{1+\beta r_{ij}}, \]

    and g'(r_{kj}) = dg(r_{kj})/dr_{kj} and g''(r_{kj}) = d^2 g(r_{kj})/dr_{kj}^2, we find that for
    particle k we have

    \[ \frac{\nabla^2_k\Psi_C}{\Psi_C} = \sum_{ij\neq k}\frac{(\mathbf{r}_k - \mathbf{r}_i)(\mathbf{r}_k - \mathbf{r}_j)}{r_{ki}r_{kj}}\frac{a}{(1+\beta r_{ki})^2}\frac{a}{(1+\beta r_{kj})^2} + \sum_{j\neq k}\left(\frac{2a}{r_{kj}(1+\beta r_{kj})^2} - \frac{2a\beta}{(1+\beta r_{kj})^3}\right). \]

    146 / 540
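    A minimal sketch of the expression above (equivalently Eq. (41)), reusing
    jastrow_gradient from the earlier sketch; f' and f'' are the derivatives of the linear
    Pade-Jastrow, and all names are illustrative.

    #include <cmath>
    #include <vector>

    // from the earlier sketch
    void jastrow_gradient(const std::vector<std::vector<double> > &r, int k,
                          double a, double beta, std::vector<double> &grad);

    // grad_k^2 Psi_C / Psi_C for the linear Pade-Jastrow, Eq. (41)
    double jastrow_laplacian(const std::vector<std::vector<double> > &r,
                             int k, double a, double beta)
    {
      int N = (int) r.size(), d = (int) r[0].size();
      std::vector<double> grad(d);
      jastrow_gradient(r, k, a, beta, grad);
      double sum = 0.0;
      for (int j = 0; j < d; j++) sum += grad[j]*grad[j];  // (grad_k Psi_C/Psi_C)^2
      for (int i = 0; i < N; i++) {
        if (i == k) continue;
        double rki = 0.0;
        for (int j = 0; j < d; j++) {
          double diff = r[k][j] - r[i][j];
          rki += diff*diff;
        }
        rki = std::sqrt(rki);
        double denom = 1.0 + beta*rki;
        double dfdr  = a/(denom*denom);                  // f'
        double d2fdr = -2.0*a*beta/(denom*denom*denom);  // f''
        sum += (d - 1)/rki*dfdr + d2fdr;                 // radial terms of Eq. (41)
      }
      return sum;
    }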

  • Important feature

    For the correlation part

    \[ \Psi_C = \prod_{i<j}g(r_{ij}) = \exp\left\{\sum_{i<j}f(r_{ij})\right\}, \]

    only the terms involving the moved particle k change from one Metropolis step to the
    next. Updating row and column k of the distance matrix and the corresponding sums
    therefore keeps the cost per move at O(N), as discussed above.

  • Importance sampling, what we want to do

    We need to replace the brute force Metropolis algorithm with a walk in coordinate
    space biased by the trial wave function. This approach is based on the Fokker-Planck
    equation and the Langevin equation for generating a trajectory in coordinate space.
    This is explained later. For a diffusion process characterized by a time-dependent
    probability density P(x, t) in one dimension the Fokker-Planck equation reads (for one
    particle/walker)

    \[ \frac{\partial P}{\partial t} = D\frac{\partial}{\partial x}\left(\frac{\partial}{\partial x} - F\right)P(x, t), \]

    where F is a drift term and D is the diffusion coefficient. The new positions in
    coordinate space are given as the solutions of the Langevin equation using Euler's
    method, namely, we go from the Langevin equation

    \[ \frac{\partial x(t)}{\partial t} = DF(x(t)) + \eta, \]

    with \eta a random variable, yielding a new position

    \[ y = x + DF(x)\Delta t + \xi, \]

    where \xi is a Gaussian random variable (with variance 2D\Delta t) and \Delta t is a
    chosen time step.

    148 / 540
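    A minimal sketch of one Euler step of the Langevin equation for a single coordinate;
    gaussian_deviate is the Box-Muller sketch given earlier, and drawing xi with variance
    2*D*dt is the standard choice for a diffusion process.

    #include <cmath>

    double gaussian_deviate();  // from the earlier sketch

    // y = x + D*F(x)*dt + xi
    double langevin_step(double x, double F, double D, double dt)
    {
      double xi = gaussian_deviate()*std::sqrt(2.0*D*dt);  // random kick
      return x + D*F*dt + xi;                              // drift + diffusion
    }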

  • Importance sampling, what we want to do

    The process of isotropic diffusion characterized by a time-dependent probability
    density P(\mathbf{x}, t) obeys (as an approximation) the so-called Fokker-Planck equation

    \[ \frac{\partial P}{\partial t} = \sum_i D\frac{\partial}{\partial x_i}\left(\frac{\partial}{\partial x_i} - F_i\right)P(\mathbf{x}, t), \]

    where F_i is the ith component of the drift term (drift velocity) caused by an external
    potential, and D is the diffusion coefficient. The convergence to a stationary probability
    density can be obtained by setting the left hand side to zero. The resulting equation is
    satisfied if and only if all the terms of the sum are equal to zero,

    \[ \frac{\partial^2 P}{\partial x_i^2} = P\frac{\partial F_i}{\partial x_i} + F_i\frac{\partial P}{\partial x_i}. \]

    149 / 540

  • Importance sampling, what we want to do

    The drift vector should be of the form \mathbf{F} = g(\mathbf{x})\,\partial P/\partial\mathbf{x}. Then,

    \[ \frac{\partial^2 P}{\partial x_i^2} = P\frac{\partial g}{\partial P}\left(\frac{\partial P}{\partial x_i}\right)^2 + Pg\frac{\partial^2 P}{\partial x_i^2} + g\left(\frac{\partial P}{\partial x_i}\right)^2. \]

    The condition of stationary density means that the left hand side equals zero. In other
    words, the terms containing first and second derivatives have to cancel each other. This
    is possible only if g = 1/P, which yields

    \[ \mathbf{F} = 2\frac{1}{\Psi_T}\nabla\Psi_T, \quad (42) \]

    which is known as the so-called quantum force. This term is responsible for pushing
    the walker towards regions of configuration space where the trial wave function is
    large, increasing the efficiency of the simulation. In contrast, in the Metropolis
    algorithm the walker has the same probability of moving in every direction.

    150 / 540

  • Importance Sampling

    The Fokker-Planck equation yields a transition probability (the solution to the
    equation) given by the Green's function

    \[ G(y, x, \Delta t) = \frac{1}{(4\pi D\Delta t)^{3N/2}}\exp\left(-(y - x - D\Delta t\,F(x))^2/4D\Delta t\right), \]

    which in turn means that our brute force Metropolis algorithm

    \[ A(y, x) = \min(1, q(y, x)), \]

    with q(y, x) = |\Psi_T(y)|^2/|\Psi_T(x)|^2, is now replaced by

    \[ q(y, x) = \frac{G(x, y, \Delta t)|\Psi_T(y)|^2}{G(y, x, \Delta t)|\Psi_T(x)|^2}. \]

    See the program vmc_be.cpp for an example. Read more in Thijssen's text chapters
    8.8 and 12.2.

    151 / 540

  • Importance sampling, new positions in function vmc_be.cpp

    // loop over variational parameters
    for (variate = 1; variate <= max_variations; variate++){
      ...

  • Importance sampling, new positions in function vmc_be.cpp

    // loop over monte carlo cycles
    for (cycles = 1; cycles <= number_cycles; cycles++){
      ...

  • Importance sampling, new positions in function vmc_be.cpp

    // we move only one particle at the time
    for (k = 0; k < number_particles; k++) {
      if (k != i) {
        for (j = 0; j < dimension; j++) {
          r_new[k][j] = r_old[k][j];
        }
      }
    }
    // wave_function_onemove(r_new, qforce_new, &wfnew, beta);
    wfnew = wave_function(r_new, beta);
    quantum_force(r_new, qforce_new, beta, wfnew);

    154 / 540

  • Importance sampling, new positions in function vmc_be.cpp

    // we compute the log of the ratio of the greens functions to be used in the
    // Metropolis-Hastings algorithm
    greensfunction = 0.0;
    for (j = 0; j < dimension; j++) {
      greensfunction += 0.5*(qforce_old[i][j]+qforce_new[i][j])*
        (D*timestep*0.5*(qforce_old[i][j]-qforce_new[i][j])-r_new[i][j]+r_old[i][j]);
    }
    greensfunction = exp(greensfunction);

    155 / 540

  • Importance sampling, new positions in function vmc_be.cpp

    // The Metropolis test is performed by moving one particle at the time
    if (ran2(&idum) <= greensfunction*wfnew*wfnew/wfold/wfold) {
      ...

  • Importance sampling, Quantum force in function vmc_be.cpp

    void quantum_force(double **r, double **qforce, double beta, double wf)
    {
      int i, j;
      double wfminus, wfplus;
      double **r_plus, **r_minus;

      r_plus = (double **) matrix(number_particles, dimension, sizeof(double));
      r_minus = (double **) matrix(number_particles, dimension, sizeof(double));
      for (i = 0; i < number_particles; i++) {
        for (j = 0; j < dimension; j++) {
          r_plus[i][j] = r_minus[i][j] = r[i][j];
        }
      }
      ...

    157 / 540

  • Importance sampling, Quantum force in function vmc_be.cpp

    // compute the first derivative
    for (i = 0; i < number_particles; i++) {
      for (j = 0; j < dimension; j++) {
        r_plus[i][j] = r[i][j] + h;
        r_minus[i][j] = r[i][j] - h;
        wfminus = wave_function(r_minus, beta);
        wfplus = wave_function(r_plus, beta);
        qforce[i][j] = (wfplus - wfminus)*2.0/wf/h;
        r_plus[i][j] = r[i][j];
        r_minus[i][j] = r[i][j];
      }
    }
    } // end of quantum_force function

    158 / 540

  • Topics for Week 6, February 8-12

    Importance sampling, Fokker-Planck and Langevin equations

    I Repetition from last week
    I Derivation of the Fokker-Planck and the Langevin equations (background material)
    I Importance sampling, further discussion of codes
    I Begin discussion of blocking

    Project work this week: finalize 1a and 1b (only the importance sampling part). Next
    week we discuss blocking as a tool to perform statistical analysis of Monte Carlo data.

    159 / 540

  • Your tasks from the previous week plus new tasks

    I Implement the closed form expression for the local energy

    I Convince yourself that the closed form expressions are correct, see slides fromlast week.

    I Implement the above expressions for systems with more than two electrons.

    I Implement importance sampling, see the code vmc_be.cpp.

    I Finish part 1a and begin part 1b. Blocking will be discussed this week.

    I You need to produce random numbers with a Gaussian distribution.

    I Reading task: Thijssen's text chapters 8.8 and 12.2. To be discussed today.

    I Task to next week: Finish coding importance sampling in 1b.

    160 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    A stochastic process is simply a function of two variables, one is the time, the other is
    a stochastic variable X, defined by specifying

    I the set {x} of possible values for X;
    I the probability distribution, w_X(x), over this set, or briefly w(x).

    The set of values {x} for X may be discrete or continuous. If the set of values is
    continuous, then w_X(x) is a probability density so that w_X(x)dx is the probability that
    one finds the stochastic variable X to have values in the range [x, x + dx].

    161 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    An arbitrary number of other stochastic variables may be derived from X. For example,
    any Y given by a mapping of X is also a stochastic variable. The mapping may also be
    time-dependent, that is, the mapping depends on an additional variable t,

    \[ Y_X(t) = f(X, t). \]

    The quantity Y_X(t) is called a random function, or, since t often is time, a stochastic
    process. A stochastic process is a function of two variables, one is the time, the other
    is a stochastic variable X. Let x be one of the possible values of X; then

    \[ y(t) = f(x, t) \]

    is a function of t, called a sample function or realization of the process. In physics one
    considers the stochastic process to be an ensemble of such sample functions.

    162 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    For many physical systems initial distributions of a stochastic variable y tend to
    equilibrium distributions: w(y, t) \rightarrow w_0(y) as t \rightarrow \infty. In equilibrium,
    detailed balance constrains the transition rates,

    \[ W(y \rightarrow y')w_0(y) = W(y' \rightarrow y)w_0(y'), \]

    where W(y' \rightarrow y) is the probability, per unit time, that the system changes from a
    state |y'\rangle, characterized by the value y' for the stochastic variable Y, to a state
    |y\rangle. Note that for a system in equilibrium the transition rate W(y' \rightarrow y) and
    the reverse W(y \rightarrow y') may be very different.

    163 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    Consider, for instance, a simple system that has only two energy levels \epsilon_0 = 0 and
    \epsilon_1 = \Delta E. For a system governed by the Boltzmann distribution we find (the
    partition function has been taken out)

    \[ W(0 \rightarrow 1)\exp(-\epsilon_0/kT) = W(1 \rightarrow 0)\exp(-\epsilon_1/kT). \]

    We then get

    \[ \frac{W(0 \rightarrow 1)}{W(1 \rightarrow 0)} = \exp(-\Delta E/kT), \]

    which goes to zero when T tends to zero.

    164 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    If we assume a discrete set of events, our initial probability distribution function can be
    given by

    \[ w_i(0) = \delta_{i,0}, \]

    and its time-development after a given time step \Delta t = \epsilon is

    \[ w_i(t) = \sum_j W(j \rightarrow i)w_j(t = 0). \]

    The continuous analog to w_i(0) is

    \[ w(\mathbf{x}) \rightarrow \delta(\mathbf{x}), \quad (43) \]

    where we now have generalized the one-dimensional position x to a
    generic-dimensional vector \mathbf{x}. The Kronecker \delta function is replaced by the
    distribution function \delta(\mathbf{x}) at t = 0.

    165 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    The transition from a state j to a state i is now replaced by a transition to a state with
    position \mathbf{y} from a state with position \mathbf{x}. The discrete sum of transition
    probabilities can then be replaced by an integral and we obtain the new distribution at
    a time t + \Delta t as

    \[ w(\mathbf{y}, t + \Delta t) = \int W(\mathbf{y}, t + \Delta t|\mathbf{x}, t)w(\mathbf{x}, t)d\mathbf{x}, \quad (44) \]

    and after m time steps we have

    \[ w(\mathbf{y}, t + m\Delta t) = \int W(\mathbf{y}, t + m\Delta t|\mathbf{x}, t)w(\mathbf{x}, t)d\mathbf{x}. \quad (45) \]

    When equilibrium is reached we have

    \[ w(\mathbf{y}) = \int W(\mathbf{y}|\mathbf{x}, t)w(\mathbf{x})d\mathbf{x}, \quad (46) \]

    that is, no time-dependence. Note our change of notation for W.

    166 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    We can solve the equation for w(\mathbf{y}, t) by making a Fourier transform to momentum
    space. The PDF w(\mathbf{x}, t) is related to its Fourier transform \tilde{w}(\mathbf{k}, t) through

    \[ w(\mathbf{x}, t) = \int_{-\infty}^{\infty} d\mathbf{k}\,\exp(i\mathbf{kx})\tilde{w}(\mathbf{k}, t), \quad (47) \]

    and using the definition of the \delta-function

    \[ \delta(\mathbf{x}) = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\mathbf{k}\,\exp(i\mathbf{kx}), \quad (48) \]

    we see that

    \[ \tilde{w}(\mathbf{k}, 0) = 1/2\pi. \quad (49) \]

    167 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    We can then use the Fourier-transformed diffusion equation

    \[ \frac{\partial\tilde{w}(\mathbf{k}, t)}{\partial t} = -D\mathbf{k}^2\tilde{w}(\mathbf{k}, t), \quad (50) \]

    with the obvious solution

    \[ \tilde{w}(\mathbf{k}, t) = \tilde{w}(\mathbf{k}, 0)\exp\left[-(D\mathbf{k}^2 t)\right] = \frac{1}{2\pi}\exp\left[-(D\mathbf{k}^2 t)\right]. \quad (51) \]

    168 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    Using Eq. (47) we obtain

    \[ w(\mathbf{x}, t) = \int_{-\infty}^{\infty} d\mathbf{k}\,\exp\left[i\mathbf{kx}\right]\frac{1}{2\pi}\exp\left[-(D\mathbf{k}^2 t)\right] = \frac{1}{\sqrt{4\pi Dt}}\exp\left[-(\mathbf{x}^2/4Dt)\right], \quad (52) \]

    with the normalization condition

    \[ \int_{-\infty}^{\infty} w(\mathbf{x}, t)d\mathbf{x} = 1. \quad (53) \]

    169 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    It is rather easy to verify by insertion that Eq. (52) is a solution of the diffusion
    equation. The solution represents the probability of finding our random walker at
    position \mathbf{x} at time t if the initial distribution was placed at \mathbf{x} = 0 at t = 0.
    There is another interesting feature worth observing. The discrete transition probability
    W itself is given by a binomial distribution. The results from the central limit theorem
    state that the transition probability in the limit n \rightarrow \infty converges to the normal
    distribution. It is then possible to show that

    \[ W(i\Delta l \rightarrow j\Delta l, n\epsilon) \rightarrow W(\mathbf{y}, t + \Delta t|\mathbf{x}, t) = \frac{1}{\sqrt{4\pi D\Delta t}}\exp\left[-\left((\mathbf{y}-\mathbf{x})^2/4D\Delta t\right)\right], \quad (54) \]

    and that it satisfies the normalization condition and is itself a solution to the diffusion
    equation.

    170 / 540
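    The verification by insertion mentioned above is a one-line check:

    \[ \frac{\partial w}{\partial t} = \left(\frac{x^2}{4Dt^2} - \frac{1}{2t}\right)w(x, t), \qquad D\frac{\partial^2 w}{\partial x^2} = D\left(\frac{x^2}{4D^2t^2} - \frac{1}{2Dt}\right)w(x, t) = \left(\frac{x^2}{4Dt^2} - \frac{1}{2t}\right)w(x, t), \]

    so that \partial w/\partial t = D\,\partial^2 w/\partial x^2 indeed holds.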

  • Importance sampling, Fokker-Planck and Langevin equation

    Let us now assume that we have three PDFs for times t_0 < t' < t, that is w(\mathbf{x}_0, t_0),
    w(\mathbf{x}', t') and w(\mathbf{x}, t). We have then

    \[ w(\mathbf{x}, t) = \int W(\mathbf{x}, t|\mathbf{x}', t')w(\mathbf{x}', t')d\mathbf{x}', \]

    and

    \[ w(\mathbf{x}, t) = \int W(\mathbf{x}, t|\mathbf{x}_0, t_0)w(\mathbf{x}_0, t_0)d\mathbf{x}_0, \]

    and

    \[ w(\mathbf{x}', t') = \int W(\mathbf{x}', t'|\mathbf{x}_0, t_0)w(\mathbf{x}_0, t_0)d\mathbf{x}_0. \]

    171 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    We can combine these equations and arrive at the famous
    Einstein-Smoluchowski-Kolmogorov-Chapman (ESKC) relation

    \[ W(\mathbf{x}, t|\mathbf{x}_0, t_0) = \int W(\mathbf{x}, t|\mathbf{x}', t')W(\mathbf{x}', t'|\mathbf{x}_0, t_0)d\mathbf{x}'. \]

    We can replace the spatial dependence with a dependence upon say the velocity (or
    momentum), that is we have

    \[ W(\mathbf{v}, t|\mathbf{v}_0, t_0) = \int W(\mathbf{v}, t|\mathbf{v}', t')W(\mathbf{v}', t'|\mathbf{v}_0, t_0)d\mathbf{v}'. \]

    172 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    We will now derive the Fokker-Planck equation. We start from the ESKC equation

    \[ W(\mathbf{x}, t|\mathbf{x}_0, t_0) = \int W(\mathbf{x}, t|\mathbf{x}', t')W(\mathbf{x}', t'|\mathbf{x}_0, t_0)d\mathbf{x}'. \]

    Define s = t' - t_0, \tau = t - t' and t - t_0 = s + \tau. We have then

    \[ W(\mathbf{x}, s + \tau|\mathbf{x}_0) = \int W(\mathbf{x}, \tau|\mathbf{x}')W(\mathbf{x}', s|\mathbf{x}_0)d\mathbf{x}'. \]

    173 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    Assume now that \tau is very small so that we can make an expansion in terms of a
    small step \xi, with \mathbf{x}' = \mathbf{x} - \xi, that is

    \[ W(\mathbf{x}, s|\mathbf{x}_0) + \tau\frac{\partial W}{\partial s} + O(\tau^2) = \int W(\mathbf{x}, \tau|\mathbf{x} - \xi)W(\mathbf{x} - \xi, s|\mathbf{x}_0)d\xi. \]

    We assume that W(\mathbf{x}, \tau|\mathbf{x} - \xi) takes non-negligible values only when \xi is
    small. This is just another way of stating the Master equation!

    174 / 540

  • Importance sampling, Fokker-Planck and Langevin equation

    We say thus that \mathbf{x} changes only by a small amount in the time interval \tau. This
    means that we can make a Taylor expansion in terms of \xi, that is we expand

    \[ W(\mathbf{x}, \tau|\mathbf{x} - \xi)W(\mathbf{x} - \xi, s|\mathbf{x}_0) = \sum_{n=0}^{\infty}\frac{(-\xi)^n}{n!}\frac{\partial^n}{\partial x^n}\left[W(\mathbf{x} + \xi, \tau|\mathbf{x})W(\mathbf{x}, s|\mathbf{x}_0)\right]. \]

    We can then rewrite the ESKC equation as

    \[ \tau\frac{\partial W(\mathbf{x}, s|\mathbf{x}_0)}{\partial s} = -W(\mathbf{x}, s|\mathbf{x}_0) + \sum_{n=0}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[W(\mathbf{x}, s|\mathbf{x}_0)\int \xi^n W(\mathbf{x} + \xi, \tau|\mathbf{x})d\xi\right]. \]

    We have neglected higher powers of \tau and have used that for n = 0 we get simply
    W(\mathbf{x}, s|\mathbf{x}_0) due to normalization.

    175 / 540


  • Importance sampling, Fokker-Planck and Langevin equation

    We simplify the above by introducing the moments

    \[ M_n = \frac{1}{\tau}\int \xi^n W(\mathbf{x} + \xi, \tau|\mathbf{x})d\xi = \frac{\langle[\Delta x(\tau)]^n\rangle}{\tau}, \]

    resulting in

    \[ \frac{\partial W(\mathbf{x}, s|\mathbf{x}_0)}{\partial s} = \sum_{n=1}^{\infty}\frac{(-1)^n}{n!}\frac{\partial^n}{\partial x^n}\left[W(\mathbf{x}, s|\mathbf{x}_0)M_n\right]. \]

    177 / 540

  • Importance sampling, Fokker-Planck and Langevin equation