cs668-lec10-MonteCarlo.ppt


Transcript of cs668-lec10-MonteCarlo.ppt

7/27/2019 cs668-lec10-MonteCarlo.ppt

http://slidepdf.com/reader/full/cs668-lec10-montecarloppt 1/65

Lecture 10 Outline

Monte Carlo methods

History of methods

Sequential random number generators

Parallel random number generators

Generating non-uniform random numbers

Monte Carlo case studies


Monte Carlo Methods

Monte Carlo is another name for statistical sampling methods of great importance to physics and computer science

Applications of Monte Carlo Method

Evaluating integrals of arbitrary functions of 6+ dimensions

Predicting future values of stocks

Solving partial differential equations

Sharpening satellite images

Modeling cell populations

Finding approximate solutions to NP-hard problems


An Interesting History

• In 1738, Swiss physicist and mathematician Daniel Bernoulli published Hydrodynamica, which laid the basis for the kinetic theory of gases: great numbers of molecules move in all directions, their impact on a surface causes the gas pressure that we feel, and what we experience as heat is simply the kinetic energy of their motion.

• In 1859, Scottish physicist James Clerk Maxwell formulated the distribution of molecular velocities, which gave the proportion of molecules having a velocity in a specific range. This was the first-ever statistical law in physics. Maxwell used a simple thought experiment: particles must move independent of any chosen coordinates, hence the only possible distribution of velocities must be normal in each coordinate.

• In 1864, Ludwig Boltzmann, a young student in Vienna, came across Maxwell's paper and was so inspired by it that he spent much of his long, distinguished, and tortured life developing the subject further.


History of Monte Carlo Method

Credit for inventing the Monte Carlo method is shared by Stanislaw Ulam, John von Neumann, and Nicholas Metropolis.

Ulam, a Polish-born mathematician, worked for John von Neumann on the Manhattan Project. Ulam is known for designing the hydrogen bomb with Edward Teller in 1951. He conceived of the MC method in 1946 while pondering the probabilities of winning a card game of solitaire.

Ulam, von Neumann, and Metropolis developed algorithms for computer implementations, as well as exploring means of transforming non-random problems into random forms that would facilitate their solution via statistical sampling. This work transformed statistical sampling from a mathematical curiosity into a formal methodology applicable to a wide variety of problems. It was Metropolis who named the new methodology after the casinos of Monte Carlo. Metropolis and Ulam published a paper called "The Monte Carlo Method" in Journal of the American Statistical Association, 44 (247), 335-341, in 1949.


Solving Integration Problems via Statistical Sampling:

Monte Carlo Approximation

How to evaluate integral of f(x)?


Integration Approximation

Can approximate using another function g(x)


Integration Approximation

Can approximate by taking the average or expected value


Integration Approximation

Estimate the average by taking N samples


Monte Carlo Integration

Im = Monte Carlo estimate

N = number of samples

x1, x2, …, xN are uniformly distributed random numbers between a and b

Im = (b − a) · (1/N) · Σ f(xi)


Monte Carlo Integration


Monte Carlo Integration

We have the definition of expected value and how to estimate it. Since the expected value can be expressed as an integral, the integral is also approximated by the sum. To simplify the integral, we can substitute g(x) = f(x)p(x).
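With a uniform density p(x) = 1/(b − a), the estimate reduces to (b − a) times the sample mean of f. A minimal sketch in Python (the function and variable names are mine, not from the lecture):

```python
import random

def monte_carlo_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) times the
    mean of f evaluated at n uniform random points."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Example: the integral of x^2 over [0, 1] is exactly 1/3.
estimate = monte_carlo_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
```

With 100,000 samples the estimate lands within about 0.003 of 1/3, consistent with the 1/sqrt(N) error behavior discussed on the variance slides.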


Variance

The variance describes how much the sampled values vary from each other.

Variance proportional to 1/N


Variance

Standard Deviation is just the square root of the variance

Standard Deviation proportional to 1 / sqrt(N)

 Need 4X samples to halve the error 


Variance

Problem: Variance (noise) decreases slowly

Using more samples only removes a small amount of noise


Variance Reduction

There are several ways to reduce the variance:

Importance Sampling

Stratified Sampling

Quasi-random Sampling 

 Metropolis Random Mutations


Importance Sampling

Idea: use more samples in important regions of the function

If function is high in small areas, use more samples there


Importance Sampling

Want g/p to have low variance

Choose a good function p similar to g:
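The idea behind these bullets: draw samples from a density p instead of uniformly, and average g(x)/p(x); the closer p is to (a scaled copy of) g, the smaller the variance. A sketch under my own toy example, with names of my choosing:

```python
import random

def importance_sample(g, p_pdf, p_sampler, n, seed=0):
    """Estimate the integral of g as the sample mean of g(x)/p(x),
    where x is drawn from the density p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = p_sampler(rng)
        total += g(x) / p_pdf(x)
    return total / n

# g(x) = x^2 on [0, 1]; the density p(x) = 3x^2 is proportional to g,
# so g/p is constant and the estimator has essentially zero variance.
g = lambda x: x * x
p_pdf = lambda x: 3.0 * x * x
p_sampler = lambda rng: rng.random() ** (1.0 / 3.0)  # inverse CDF of p
estimate = importance_sample(g, p_pdf, p_sampler, 1000)
```

Because g/p is constant here, even 1,000 samples give the exact answer 1/3 up to floating-point roundoff; a uniform sampler with the same budget would carry visible noise.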


Stratified Sampling

Partition S into smaller domains Si

Evaluate integral as sum of integrals over Si

Example: jittering for pixel sampling

Often works much better than importance sampling in practice
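A sketch of the partitioning idea: split [a, b] into equal strata, sample each stratum separately, and sum the per-stratum estimates (function names are mine, not from the lecture):

```python
import math
import random

def stratified_estimate(f, a, b, strata, per_stratum, seed=0):
    """Integrate f over [a, b] by partitioning into equal sub-intervals
    and sampling uniformly within each one."""
    rng = random.Random(seed)
    width = (b - a) / strata
    total = 0.0
    for i in range(strata):
        lo = a + i * width
        mean = sum(f(rng.uniform(lo, lo + width))
                   for _ in range(per_stratum)) / per_stratum
        total += width * mean  # integral over this stratum
    return total

# Integral of sin(x) over [0, pi] is exactly 2.
estimate = stratified_estimate(math.sin, 0.0, math.pi,
                               strata=100, per_stratum=10)
```

With only 1,000 total samples the stratified estimate is accurate to roughly three decimal places, far better than plain uniform sampling at the same cost, because the within-stratum variance of a smooth function is tiny.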


Parallelism in Monte Carlo Methods

Monte Carlo methods are often amenable to parallelism

Find an estimate about p times faster

OR

Reduce error of estimate by a factor of p^(1/2)


Random versus Pseudo-random

Virtually all computers have "random number" generators

Their operation is deterministic

Sequences are predictable

More accurately called "pseudo-random number" generators

In this chapter “random” is shorthand for “pseudo-random” 

“RNG” means “random number generator” 


Properties of an Ideal RNG

Uniformly distributed

Uncorrelated

 Never cycles

Satisfies any statistical test for randomness

Reproducible

Machine-independent

Changing “seed” value changes sequence 

Easily split into independent subsequences

Fast

Limited memory requirements


No RNG Is Ideal

Finite precision arithmetic → finite number of states → cycles

Period = length of cycle

If period > number of values needed, effectively acyclic

Reproducible → correlations

Often speed versus quality trade-offs


Linear Congruential RNGs

X_(i+1) = (a · X_i + c) mod M

a = multiplier
c = additive constant
M = modulus

Sequence depends on choice of seed, X_0


Period of Linear Congruential RNG

Maximum period is M

For 32-bit integers maximum period is 2^32, or about 4 billion

This is too small for modern computers

Use a generator with at least 48 bits of precision


Producing Floating-Point Numbers

X_i, a, c, and M are all integers

X_i ranges in value from 0 to M − 1

To produce floating-point numbers in range [0, 1), divide X_i by M
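The two slides above fit in a few lines. The sketch below uses the 48-bit constants of the POSIX drand48 generator (a = 25214903917, c = 11, M = 2^48) — that specific choice is my assumption, not something the lecture prescribes:

```python
def lcg(seed, a=25214903917, c=11, m=2**48):
    """Linear congruential generator X_(i+1) = (a*X_i + c) mod M,
    yielding floating-point values X_i / M in [0, 1).
    Default constants are those of POSIX drand48 (an assumption
    of this sketch, not taken from the lecture)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x / m  # divide by M to get a float in [0, 1)

gen = lcg(seed=42)
values = [next(gen) for _ in range(1000)]
```

With these constants the generator has full period 2^48, so the first 1,000 values are all distinct, and since M = 2^48 < 2^53 each quotient is represented exactly in a double.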


Defects of Linear Congruential RNGs

Least significant bits correlated

Especially when M is a power of 2

k-tuples of random numbers form a lattice

Points tend to lie on hyperplanes

Especially pronounced when k is large


Lagged Fibonacci RNGs

X_i = X_(i−p) * X_(i−q)

p and q are lags, p > q

* is any binary arithmetic operation:

Addition modulo M
Subtraction modulo M
Multiplication modulo M
Bitwise exclusive or


Properties of Lagged Fibonacci RNGs

Require p seed values

Careful selection of seed values, p, and q can result in very long periods and good randomness

For example, suppose M has b bits

Maximum period for additive lagged Fibonacci RNG is (2^p − 1) · 2^(b−1)
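An additive lagged Fibonacci generator is a sliding window of the previous p values. The sketch below uses the classic lags (p, q) = (55, 24) and b = 32 bits; those parameter values are my choice of illustration, not the lecture's:

```python
import random
from collections import deque

def lagged_fibonacci(seeds, p=55, q=24, m=2**32):
    """Additive lagged Fibonacci generator X_i = (X_(i-p) + X_(i-q)) mod M.
    Requires p seed values; lags (55, 24) are a classic choice."""
    assert len(seeds) == p and 0 < q < p
    buf = deque(seeds, maxlen=p)      # the previous p values
    while True:
        x = (buf[-p] + buf[-q]) % m   # buf[-p] is X_(i-p), buf[-q] is X_(i-q)
        buf.append(x)                 # oldest value drops off automatically
        yield x

rng = random.Random(1)
seeds = [rng.getrandbits(32) for _ in range(55)]  # the p required seed values
gen = lagged_fibonacci(seeds)
values = [next(gen) for _ in range(100)]
```

Note the seeding here leans on another RNG, which mirrors common practice: a small, fast generator bootstraps the p-word state of the lagged Fibonacci generator.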


Ideal Parallel RNGs

All properties of sequential RNGs

No correlations among numbers in different sequences

Scalability

Locality


Parallel RNG Designs

Manager-worker 

Leapfrog

Sequence splitting

Independent sequences


Manager-Worker Parallel RNG

Manager process generates random numbers

Worker processes consume them

If algorithm is synchronous, may achieve goal of consistency

Not scalable

Does not exhibit locality


Leapfrog Method

Process with rank 1 of 4 processes


Properties of Leapfrog Method

Easy to modify linear congruential RNG to support jumping by p

Can allow parallel program to generate same tuples as sequential program

Does not support dynamic creation of new random number streams
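The jump-by-p modification follows from iterating the LCG recurrence p times: X_(i+p) = A·X_i + C mod M, where A = a^p mod M and C = c·(a^(p−1) + … + a + 1) mod M. A sketch (the drand48-style constants are my assumption; function names are mine):

```python
def leapfrog_lcg(rank, nprocs, seed, a=25214903917, c=11, m=2**48):
    """One process's stream: every nprocs-th element of the global LCG
    sequence, starting at element number `rank` (0-indexed)."""
    # Jumped-ahead multiplier and additive constant for a leap of nprocs:
    A = pow(a, nprocs, m)
    C = c * sum(pow(a, k, m) for k in range(nprocs)) % m
    x = seed
    for _ in range(rank + 1):   # advance to this process's first value
        x = (a * x + c) % m
    while True:
        yield x
        x = (A * x + C) % m     # leap nprocs steps in one update

# Process with rank 1 of 4 processes: global values 1, 5, 9, ...
stream = leapfrog_lcg(rank=1, nprocs=4, seed=7)
mine = [next(stream) for _ in range(3)]
```

Interleaving the p per-process streams reproduces the sequential generator's output exactly, which is what lets the parallel program generate the same tuples as the sequential one.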


Sequence Splitting

Process with rank 1 of 4 processes


Properties of Sequence Splitting

Forces each process to move ahead to its starting point

Does not support goal of reproducibility

May run into long-range correlation problems

Can be modified to support dynamic creation of new sequences


Independent Sequences

Run sequential RNG on each process

Start each with different seed(s) or other parameters

Example: linear congruential RNGs with different additive constants

Works well with lagged Fibonacci RNGs

Supports goals of locality and scalability


Statistical Simulation: Metropolis Algorithm

Metropolis algorithm. [Metropolis, Rosenbluth, Rosenbluth, Teller, Teller 1953] Simulate behavior of a physical system according to principles of statistical mechanics.

Globally biased toward "downhill" lower-energy steps, but occasionally makes "uphill" steps to break out of local minima.

Gibbs-Boltzmann function. The probability of finding a physical system in a state with energy E is proportional to e^(−E/(kT)), where T > 0 is temperature and k is a constant.

For any temperature T > 0, the function is a monotone decreasing function of energy E.

System more likely to be in a lower energy state than a higher one.

T large: high and low energy states have roughly the same probability

T small: low energy states are much more probable


Metropolis algorithm.

Given a fixed temperature T, maintain current state S.

Randomly perturb current state S to new state S′ ∈ N(S).

If E(S′) ≤ E(S), update current state to S′.

Otherwise, update current state to S′ with probability e^(−ΔE/(kT)), where ΔE = E(S′) − E(S) > 0.

Convergence Theorem. Let f_S(t) be the fraction of the first t steps in which the simulation is in state S. Then, assuming some technical conditions, with probability 1:

lim (t → ∞) f_S(t) = (1/Z) · e^(−E(S)/(kT)), where Z = Σ over S′ ∈ N(S) of e^(−E(S′)/(kT))

Intuition. Simulation spends roughly the right amount of time in each state, according to the Gibbs-Boltzmann equation.
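The update rule above is short enough to sketch directly. The five-state toy system below (a ring of states with E(s) = s) is my own illustration, not from the lecture:

```python
import math
import random

def metropolis(energy, neighbors, s0, T, steps, k=1.0, seed=0):
    """Metropolis sampler: always accept downhill moves, accept uphill
    moves with probability exp(-dE/(kT)). Returns visit counts."""
    rng = random.Random(seed)
    s = s0
    counts = {}
    for _ in range(steps):
        s_new = rng.choice(neighbors(s))   # random perturbation S' in N(S)
        dE = energy(s_new) - energy(s)
        if dE <= 0 or rng.random() < math.exp(-dE / (k * T)):
            s = s_new                      # accept the move
        counts[s] = counts.get(s, 0) + 1
    return counts

# Toy system: five states on a ring with E(s) = s, sampled at T = 1.
counts = metropolis(lambda s: float(s),
                    lambda s: [(s - 1) % 5, (s + 1) % 5],
                    s0=0, T=1.0, steps=50_000)
```

In line with the convergence theorem, the fraction of time spent in each state tracks e^(−E(S)/(kT)): state 0 is visited roughly e^4 ≈ 55 times as often as state 4.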


Simulated Annealing

Simulated annealing.

T large → probability of accepting an uphill move is large.

T small → uphill moves are almost never accepted.

Idea: turn knob to control T.

Cooling schedule: T = T(i) at iteration i.

Physical analog.

Take a solid and raise it to high temperature; we do not expect it to maintain a nice crystal structure.

Take a molten solid and freeze it very abruptly; we do not expect to get a perfect crystal either.

Annealing: cool material gradually from high temperature, allowing it to reach equilibrium at a succession of intermediate lower temperatures.


Other Distributions

Analytical transformations

Box-Muller Transformation

Rejection method


Analytical Transformation

- probability density function f(x)
- cumulative distribution F(x)

In probability theory, a quantile function of a distribution is the inverse of its cumulative distribution function.


Exponential Distribution:

An exponential distribution arises naturally when modeling the time between independent events that happen at a constant average rate and are memoryless. It is one of the few cases where the quantile function is known analytically:

f(x) = (1/m) · e^(−x/m)

F(x) = 1 − e^(−x/m)

F^(−1)(u) = −m · ln(1 − u)


Example 1:

Produce four samples from an exponential distribution with mean 3

Uniform sample: 0.540, 0.619, 0.452, 0.095

Take the natural log of each value and multiply by −3 (since u and 1 − u are identically distributed, ln u works in place of ln(1 − u))

Exponential sample: 1.850, 1.440, 2.317, 7.072
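The inverse-transform recipe from the quantile-function slide takes one line per sample. A sketch (function name is mine):

```python
import math

def exponential_sample(mean, u):
    """Invert the CDF F(x) = 1 - exp(-x/mean): x = -mean * ln(1 - u)."""
    return -mean * math.log(1.0 - u)

# The slide's shortcut: u and 1 - u are both uniform on (0, 1),
# so -mean * ln(u) works just as well.
uniforms = [0.540, 0.619, 0.452, 0.095]
samples = [-3.0 * math.log(u) for u in uniforms]
```

Feeding the median u = 0.5 into the inverse CDF recovers the familiar median of an exponential, m·ln 2.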


Example 2:

Simulation advances in time steps of 1 second

Time between events is from an exponential distribution with mean 5 seconds

What is the probability that the event will happen in the next second?

F(1) = 1 − exp(−1/5) = 0.181269247

Use a uniform random number to test for occurrence of the event (if u < 0.181 then 'event' else 'no event')


Normal Distributions:

Box-Muller Transformation

Cannot invert the cumulative distribution function to produce a formula yielding random numbers from a normal (Gaussian) distribution

The Box-Muller transformation produces a pair of standard normal deviates g1 and g2 from a pair of uniform deviates u1 and u2


Box-Muller Transformation

repeat
    v1 ← 2·u1 − 1
    v2 ← 2·u2 − 1
    r ← v1^2 + v2^2
until r > 0 and r < 1
f ← sqrt(−2 ln r / r)
g1 ← f·v1
g2 ← f·v2

(fresh uniform deviates u1, u2 are drawn on each iteration)

This is a consequence of the fact that the chi-square distribution with two degrees of freedom is an easily-generated exponential random variable.
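The polar Box-Muller procedure transcribes directly into Python (a sketch; the statistical checks at the end are mine):

```python
import math
import random

def box_muller(rng):
    """Polar Box-Muller: turn uniform deviates into a pair of
    independent standard normal deviates, rejecting points that
    fall outside the unit disk."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        r = v1 * v1 + v2 * v2
        if 0.0 < r < 1.0:
            f = math.sqrt(-2.0 * math.log(r) / r)
            return f * v1, f * v2

rng = random.Random(0)
values = [g for _ in range(5000) for g in box_muller(rng)]
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
```

Each accepted point yields two deviates, and since the unit disk fills π/4 ≈ 79% of the square, the rejection loop rarely runs more than once or twice. The 10,000 generated values have sample mean near 0 and sample variance near 1, as a standard normal requires.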


Example

Produce four samples from a normal distribution with mean 0 and standard deviation 1

u1      u2      v1      v2      r       f       g1      g2
0.234   0.784   -0.532  0.568   0.605   1.290   -0.686  0.732
0.824   0.039   0.648   -0.921  1.269   (rejected: r ≥ 1)
0.430   0.176   -0.140  -0.648  0.439   1.935   -0.271  -1.254


Rejection Method


Example

Generate random variables from this probability density function:

f(x) = sin x,                      if 0 ≤ x ≤ π/4
f(x) = (√2/4)·(π/4 + 2 − x),       if π/4 < x ≤ π/4 + 2
f(x) = 0,                          otherwise


Example (cont.)

Start from the uniform density on the support of f:

h(x) = 1/(π/4 + 2),   if 0 ≤ x ≤ π/4 + 2;   0 otherwise

Scale it by (π/4 + 2)·(√2/2):

h(x) = √2/2,          if 0 ≤ x ≤ π/4 + 2;   0 otherwise

So h(x) ≥ f(x) for all x


Example (cont.)

x_i     u_i     u_i·h(x_i)   f(x_i)   Outcome
0.860   0.975   0.689        0.681    Reject
1.518   0.357   0.252        0.448    Accept
0.357   0.920   0.650        0.349    Reject
1.306   0.272   0.192        0.523    Accept

Two samples from f(x) are 1.518 and 1.306
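Assuming the density is sin x up to π/4 and then a linear ramp hitting zero at π/4 + 2 (this reconstruction matches the f(x_i) values in the table and integrates to 1), the rejection loop can be sketched as:

```python
import math
import random

SQRT2 = math.sqrt(2.0)
B = math.pi / 4 + 2.0          # right edge of the support of f

def f(x):
    """Piecewise target density: sin(x), then a linear ramp to zero."""
    if 0.0 <= x <= math.pi / 4:
        return math.sin(x)
    if math.pi / 4 < x <= B:
        return (SQRT2 / 4.0) * (B - x)
    return 0.0

def rejection_sample(rng):
    """Draw x uniformly on [0, B] and accept it when u * h(x) <= f(x),
    with the constant envelope h(x) = sqrt(2)/2 bounding f everywhere."""
    h = SQRT2 / 2.0
    while True:
        x = rng.uniform(0.0, B)
        if rng.random() * h <= f(x):
            return x

rng = random.Random(0)
samples = [rejection_sample(rng) for _ in range(10_000)]
mean = sum(samples) / len(samples)
```

Since the envelope has area (π/4 + 2)·(√2/2) ≈ 1.97, roughly half of all proposals are accepted, which matches the 2-of-4 acceptance rate in the worked table.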


Case Studies (Topics Introduced)

Temperature inside a 2-D plate (Random walk)

Two-dimensional Ising model (Metropolis algorithm)

Room assignment problem (Simulated annealing)

Parking garage (Monte Carlo time)

Traffic circle (Simulating queues)


Temperature Inside a 2-D Plate

Random walk 


Example of Random Walk

At each step, draw a uniform random number u and move in direction ⌊4u⌋ ∈ {0, 1, 2, 3}

Example direction sequence: 1, 3, 1, 2, 1, 3, 2, 2
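For the 2-D plate case study, the random-walk idea is: start a walk at the interior grid point of interest, step to a uniformly chosen neighbor until a boundary is hit, and average the boundary temperatures reached. A sketch (the grid size and edge temperatures are my example values):

```python
import random

def plate_temperature(ix, iy, n, top, bottom, left, right,
                      walks=4000, seed=0):
    """Estimate the steady-state temperature at interior grid point
    (ix, iy) of an n x n plate by averaging the boundary temperature
    reached by random walks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        x, y = ix, iy
        while 0 < x < n - 1 and 0 < y < n - 1:   # still interior
            dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
            x, y = x + dx, y + dy
        if y == n - 1:
            total += top          # hit the hot edge
        elif y == 0:
            total += bottom
        elif x == 0:
            total += left
        else:
            total += right
    return total / walks

# Center of a 9 x 9 plate with one edge held at 100 degrees;
# by symmetry the exact answer at the center is 25.
temp = plate_temperature(4, 4, 9, top=100.0, bottom=0.0,
                         left=0.0, right=0.0)
```

Each walk contributes a 0-or-100 sample, so the estimate converges as 1/sqrt(walks), the same slow rate discussed on the variance slides.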


 NP-Hard Assignment Problems

TSP: Find a tour of US cities that minimizes distance.


Physical Annealing

Heat a solid until it melts

Cool slowly to allow material to reach state of minimum energy

Produces strong, defect-free crystal with regular structure


Simulated Annealing

Makes analogy between physical annealing and solving a combinatorial optimization problem

Solution to problem = state of material

Value of objective function = energy associated with state

Optimal solution = minimum energy state


How Simulated Annealing Works

Iterative algorithm; slowly lower T

Randomly change solution to create alternate solution

Compute Δ, the change in value of objective function

If Δ < 0, then jump to alternate solution

Otherwise, jump to alternate solution with probability e^(−Δ/T)
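The loop above can be sketched in a few lines. The bumpy 1-D objective below is a toy problem of my own choosing, meant only to show uphill moves escaping local minima:

```python
import math
import random

def simulated_annealing(cost, neighbor, s0, t0=10.0, alpha=0.999,
                        iters=20_000, seed=0):
    """Minimize `cost`: accept downhill moves always, uphill moves
    with probability exp(-delta/T), cooling T geometrically."""
    rng = random.Random(seed)
    s = best = s0
    t = t0
    for _ in range(iters):
        s_new = neighbor(s, rng)              # random perturbation
        delta = cost(s_new) - cost(s)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            s = s_new
            if cost(s) < cost(best):
                best = s                      # track best solution seen
        t *= alpha                            # T_(i+1) = alpha * T_i
    return best

# Toy objective with many local minima; global minimum near x = -0.31.
cost = lambda x: x * x + 10.0 * math.sin(5.0 * x) + 10.0
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best = simulated_annealing(cost, step, s0=4.0)
```

Starting far from the optimum, a pure descent with this step size would park in whichever local basin it hit first; the temperature schedule lets the search cross the ~10-unit barriers early on and polish a good basin once T is small.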


Performance of Simulated Annealing

Rate of convergence depends on initial value of T and temperature change function

Geometric temperature change functions are typical; e.g., T_(i+1) = 0.999 T_i

Not guaranteed to find optimal solution

Same algorithm using different random number streams can converge on different solutions

Opportunity for parallelism


Convergence

Starting with a higher initial temperature leads to more iterations before convergence


Parking Garage

Parking garage has S stalls

Car arrivals fit Poisson distribution with mean A: exponentially distributed inter-arrival times

Stay in garage fits a normal distribution with mean M and standard deviation M/S


Implementation Idea

[Figure: a snapshot of the simulation state, showing the current time (64.2), the list of times at which occupied spaces become available (101.2, 142.1, 70.3, 91.7, 223.1), a running car count, and a count of cars rejected (15 and 2 in the example).]
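The figure suggests an event-driven simulation: time jumps from arrival to arrival ("Monte Carlo time"), and a list of departure times records when occupied stalls free up. A sketch under that reading (parameter values and names are mine):

```python
import heapq
import random

def simulate_garage(stalls, mean_arrival, mean_stay, end_time, seed=0):
    """Event-driven Monte Carlo: advance time by exponential inter-arrival
    gaps, freeing stalls whose departure times have passed.
    Returns (cars_seen, cars_rejected)."""
    rng = random.Random(seed)
    departures = []   # min-heap of times at which occupied stalls free up
    now = 0.0
    count = rejected = 0
    while True:
        now += rng.expovariate(1.0 / mean_arrival)  # next arrival
        if now >= end_time:
            return count, rejected
        count += 1
        while departures and departures[0] <= now:  # free expired stalls
            heapq.heappop(departures)
        if len(departures) < stalls:
            stay = max(0.0, rng.gauss(mean_stay, mean_stay / stalls))
            heapq.heappush(departures, now + stay)
        else:
            rejected += 1                            # garage full

cars, rejected = simulate_garage(stalls=15, mean_arrival=10.0,
                                 mean_stay=120.0, end_time=10_000.0)
```

Because nothing changes between events, the simulation never ticks second by second; it processes only arrivals and departures, which is what makes Monte Carlo time cheap even over long simulated horizons.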


Summary (1/3)

Applications of Monte Carlo methods

 Numerical integration

Simulation

Random number generators

Linear congruential

Lagged Fibonacci


Summary (2/3)

Parallel random number generators

Manager/worker 

Leapfrog

Sequence splitting

Independent sequences

 Non-uniform distributions

Analytical transformations

Box-Muller transformation

Rejection method


Summary (3/3)

Concepts revealed in case studies

Monte Carlo time

Random walk 

Metropolis algorithm

Simulated annealing

Modeling queues