Introduction to stochastic processes
Josselin Garnier (Université Paris VII)
http://www.proba.jussieu.fr/~garnier
Lille January, 2013
A quick introduction to probability theory
Probability space (Ω,A,P)
• Goal: Model a random experiment.
Example 1: toss a coin.
Example 2: pick a number between 0 and 1 (matlab function rand).
• Ω: fundamental set (set of all possible realisations).
→ A realization ω is an element of Ω.
Example 1: Ω = {H, T}. Example 2: Ω = [0, 1].
• A: σ-algebra on Ω (set of events).
Example 1: A = P(Ω); an event: the result is heads, A = {H}.
Example 2: A = B([0, 1]); an event: the number is larger than 1/2, A = [1/2, 1].
• P: probability (gives the probability of events): function from A to [0, 1] such that:
- P(Ω) = 1,
- if (Aj)j≥1 is a countable family of disjoint sets of A, then P(∪_j Aj) = Σ_j P(Aj).
Example 1: P({H}) = P({T}) = 1/2.
Example 2: P([a, b]) = b − a for all 0 ≤ a ≤ b ≤ 1.
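A minimal numpy sketch of both examples (an illustration added here, not from the slides; numpy.random plays the role of the matlab function rand):

```python
import numpy as np

rng = np.random.default_rng(0)

# Example 1: toss a fair coin; P({H}) = P({T}) = 1/2.
coin = rng.choice(["H", "T"])

# Example 2: pick a number uniformly in [0, 1] (the analogue of matlab's rand);
# the axiom P([a, b]) = b - a is checked by an empirical frequency.
a, b = 0.25, 0.75
samples = rng.random(100_000)
freq = np.mean((a <= samples) & (samples <= b))
print(coin, freq)  # freq should be close to b - a = 0.5
```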
Random variable
• Random variable X = random number.
Map X : Ω → R.
A realization X(ω) of a random variable is a real number.
• The distribution of a random variable is characterized by moments of the form
E[φ(X)] for φ ∈ Cb(R,R):
E[φ(X)] = ∫_Ω φ(X(ω)) P(dω)
• The distribution of a (continuous) random variable is characterized by the
probability density function (pdf) pX :
E[φ(X)] = ∫_{−∞}^{+∞} φ(x) p_X(x) dx
• Usual pdfs:
[Figure: the probability density functions p_X(x) of the uniform, exponential, and Gaussian distributions.]
• The mean (expectation) of a random variable X with pdf pX(x) is
E[X] = ∫_{−∞}^{+∞} x p_X(x) dx
• The variance of a random variable X with pdf pX(x) is
Var(X) = E[(X − E[X])²] = E[X²] − E[X]² = ∫_{−∞}^{+∞} x² p_X(x) dx − (∫_{−∞}^{+∞} x p_X(x) dx)²
The variance measures the dispersion of the random variable (around its mean).
• A standard Gaussian random variable X has the pdf
p_X(x) = (1/√(2π)) exp(−x²/2)
Its mean is E[X] = 0 and its variance is Var(X) = 1.
We write X ∼ N (0, 1).
• A Gaussian random variable X with mean µ and variance σ2 has the pdf
p_X(x) = (1/(√(2π) σ)) exp(−(x − µ)²/(2σ²))
We write X ∼ N (µ, σ2).
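As a quick illustrative check (the values µ = 2, σ = 0.5 are assumed, not from the slides), one can sample X ∼ N(µ, σ²) and verify the mean and variance empirically:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 2.0, 0.5

# Sample X ~ N(mu, sigma^2) by shifting and scaling a standard Gaussian N(0, 1).
x = mu + sigma * rng.standard_normal(1_000_000)

print(x.mean(), x.var())  # close to mu = 2.0 and sigma^2 = 0.25
```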
Random vector
• n-dimensional random vector X = collection of n random variables (Xi)i=1,...,n.
Map X : Ω → R^n.
A realization X(ω) of a random vector is a vector in Rn.
The distribution of a (continuous) random vector is characterized by the pdf pX :
E[φ(X)] = ∫_{R^n} φ(x) p_X(x) dx, ∀φ ∈ Cb(R^n, R)
The components of the vector X = (Xi)i=1,...,n are independent if
p_X(x) = Π_{j=1}^n p_{Xj}(xj),
or equivalently
E[φ1(X1) · · · φn(Xn)] = E[φ1(X1)] · · · E[φn(Xn)], ∀φ1, . . . , φn ∈ Cb(R,R)
Example: a normalized Gaussian random vector X has the Gaussian pdf
p_X(x) = (1/(2π)^{n/2}) exp(−|x|²/2)
It is a vector of n independent normalized Gaussian random variables.
• Let X = (Xi)i=1,...,n be a random vector.
• The mean of X is the vector µ = (µj)j=1,...,n:
µj = E[Xj ]
The covariance matrix of X is C = (Cjl)j,l=1,...,n:
Cjl = E[(Xj − E[Xj])(Xl − E[Xl])]
• Let β = (βj)j=1,...,n ∈ R^n. The random variable Z = Σ_{j=1}^n βj Xj has mean
E[Z] = β^T µ = Σ_{j=1}^n βj E[Xj]
and variance
Var(Z) = β^T C β = Σ_{j,l=1}^n Cjl βj βl
(as a byproduct, this shows that C is a positive semidefinite matrix).
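A small numerical check of Var(Z) = β^T C β; the covariance structure below is a hypothetical choice made for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# A (hypothetical) correlated 3-dimensional vector X = A G + m, G standard Gaussian.
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
m = np.array([1.0, -1.0, 0.0])
C = A @ A.T                                      # covariance matrix of X
X = rng.standard_normal((500_000, 3)) @ A.T + m

beta = np.array([0.3, -0.7, 1.1])
Z = X @ beta                                     # Z = sum_j beta_j X_j

print(Z.var(), beta @ C @ beta)  # both close to beta^T C beta
```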
• If the variables are independent then the covariance matrix is diagonal. In
particular:
Var(Σ_{j=1}^n Xj) = Σ_{j=1}^n Var(Xj)
The converse is false (in general).
• Gaussian random vectors.
The random vector X = (Xi)i=1,...,n is Gaussian iff for any λ ∈ Rn, the sum λTX is
a Gaussian random variable.
• The distribution of X is characterized by the mean µ and the covariance matrix R.
When R is invertible, X has the pdf
p(x) = (1/((2π)^{n/2} √(det R))) exp(−(x − µ)^T R^{−1} (x − µ)/2)
• The Xj are independent iff the covariance matrix is diagonal (iff Cov(Xj, Xl) = 0 for all j ≠ l).
• Let A be an m × n matrix. The random vector Y = AX is Gaussian with mean
E[Y] = Aµ
and covariance matrix
E[(Y − E[Y])(Y − E[Y])^T] = A R A^T
→ Any linear transformation maps a Gaussian vector to a Gaussian vector.
Conditional expectation
Let (X, Y) be a random vector, where X is R^n-valued and Y is R-valued.
• The expectation of Y is the deterministic constant that is the best approximation of
the random variable Y :
E[Y] = argmin_{a∈R} E[(Y − a)²]
(in the mean square sense).
• Conditional expectation.
Assume that we observe X. What is the best estimate of Y given X ?
We look for:
ψ0 = argmin_{ψ:R^n→R} E[(Y − ψ(X))²]
and then E[Y |X] = ψ0(X).
→ Complex minimization problem.
Interpretation: E[Y |X] is the orthogonal projection (in L2(Ω)) of Y onto the
(infinite-dimensional) subspace of square-integrable variables of the form ψ(X).
Remark: if X and Y are independent, then E[Y |X] = E[Y ].
• Linear regression.
Assume that we observe X. What is the affine combination of X that best
approximates Y ?
The problem to be solved is:
(αj^reg)_{j=0}^n = argmin_{(αj)_{j=0}^n ∈ R^{n+1}} E[(Y − α0 − Σ_{j=1}^n αj Xj)²]
and one finds Y^reg = α0^reg + Σ_{j=1}^n αj^reg Xj.
→ Quadratic minimization problem (much easier to solve than the computation of
the conditional expectation).
But linear regression is clearly sub-optimal from the point of view of approximation theory: the best affine combination of X has no reason to be the best approximation of Y given X.
• Result: If (X, Y) is Gaussian, then the conditional expectation E[Y|X] is the linear regression of Y on X.
Example: If (X, Y) is a Gaussian vector with mean zero and covariance matrix
( 1 ρ ; ρ 1 ),
then E[Y|X] = ρX.
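An illustrative check (ρ = 0.8 is an assumed value): the least-squares slope of Y on X recovers ρ, which for a Gaussian pair is also the conditional expectation:

```python
import numpy as np

rng = np.random.default_rng(3)
rho = 0.8
n = 1_000_000

# Sample the Gaussian pair (X, Y): zero means, unit variances, correlation rho.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Least-squares (linear regression) slope of Y on X; for a Gaussian pair this
# is also the conditional expectation: E[Y|X] = rho X.
slope = np.sum(x * y) / np.sum(x * x)
print(slope)  # close to rho = 0.8
```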
Gaussian conditioning
Let (X1, X2) be a Gaussian random vector (with X1 of size n1 and X2 of size n2):
L(X1, X2) = N( (0, 0), ( R11 R12 ; R21 R22 ) )
with the covariance matrices R11 of size n1 × n1, R12 of size n1 × n2, R21 = R12^T of size n2 × n1, and R22 of size n2 × n2.
Then the distribution of X1 conditionally on X2 is Gaussian:
L(X1 | X2 = x2) = N( R12 R22^{−1} x2, R11 − R12 R22^{−1} R21 )
Proof: The posterior pdf of X1 given that X2 = x2 is p_post(x1) = p(x1, x2) / ∫ p(x1′, x2) dx1′.
Use
( R11 R12 ; R21 R22 )^{−1} = ( Q^{−1}  −Q^{−1} R12 R22^{−1} ; −R22^{−1} R21 Q^{−1}  R22^{−1} + R22^{−1} R21 Q^{−1} R12 R22^{−1} ),
where Q = R11 − R12 R22^{−1} R21 is the Schur complement.
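A sketch checking the conditioning formula on a hypothetical 3-dimensional example (the matrix R below is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)

# A hypothetical 3D Gaussian vector, split as X1 (size n1 = 2) and X2 (size n2 = 1).
R = np.array([[2.0, 0.6, 0.8],
              [0.6, 1.5, 0.5],
              [0.8, 0.5, 1.0]])
R11, R12, R22 = R[:2, :2], R[:2, 2:], R[2:, 2:]

x2 = np.array([1.0])  # observed value of X2

# Conditional law from the slide: N(R12 R22^{-1} x2, R11 - R12 R22^{-1} R21).
cond_mean = R12 @ np.linalg.solve(R22, x2)
cond_cov = R11 - R12 @ np.linalg.solve(R22, R12.T)

# Empirical check: keep the samples whose X2 falls close to x2.
X = rng.multivariate_normal(np.zeros(3), R, size=2_000_000)
kept = X[np.abs(X[:, 2] - x2[0]) < 0.02, :2]
print(cond_mean, kept.mean(axis=0))  # the two means should agree closely
```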
Example. Let us consider the Gaussian vector (n1 = n2 = 1)
(X1, X2) ∼ N( (0, 0), ( 1 ρ ; ρ 1 ) )
with ρ ∈ [−1, 1] the correlation coefficient of X1 and X2.
The distribution of X1 is
L(X1) = N(0, 1)
If one observes X2 = x2, then:
L(X1 | X2 = x2) = N( ρ x2, 1 − ρ² )
- the mean of X1 is attracted (if ρ > 0) or repelled (if ρ < 0) by the observation.
- the variance of X1 is reduced (if ρ ≠ 0).
Extremal cases: ρ = 0 (one learns nothing from the observation of X2) and ρ = 1 (one knows everything once X2 is observed).
Limit theorems
• Law of Large Numbers.
Let (Xn)n≥1 be independent and identically distributed (i.i.d.) random variables. If E[|X1|] < ∞, then
X̄n = (1/n)(X1 + X2 + ... + Xn) → m almost surely as n → ∞, with m = E[X1].
"The empirical mean converges to the statistical mean."
• Central Limit Theorem. Fluctuations theory.
Let (Xn)n≥1 be i.i.d. random variables. If E[X1²] < ∞, then
√n (X̄n − m) = √n ((1/n)(X1 + X2 + ... + Xn) − m) → N(0, σ²) in law as n → ∞,
where
m = E[X1], σ² = E[X1²] − E[X1]² = E[(X1 − E[X1])²].
"For large n, the error X̄n − m obeys the Gaussian distribution N(0, σ²/n)."
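Both limit theorems can be illustrated numerically; a minimal sketch with uniform Xi (an assumed choice, so m = 1/2 and σ² = 1/12):

```python
import numpy as np

rng = np.random.default_rng(5)

# X_i uniform on (0, 1): m = 1/2, sigma^2 = 1/12.
m, sigma2 = 0.5, 1.0 / 12.0
n, trials = 2_000, 5_000

means = rng.random((trials, n)).mean(axis=1)

print(means.mean())                      # LLN: close to m = 0.5
print((np.sqrt(n) * (means - m)).var())  # CLT: close to sigma^2 = 1/12
```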
Stochastic processes
Toy model
Let F(z) ∈ R be the piecewise constant random process
F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z)
where the Fi are independent random variables with E[Fi] = F̄ and E[(Fi − F̄)²] = σ².
[Figure: a realization of F(z) for z ∈ [0, 10]; here the Fi are independent with uniform distribution over (0, 1).]
Random process
• Random variable X = random number.
A realization of the random variable = a real number.
Distribution of X characterized by moments of the form E[φ(X)] where φ ∈ Cb(R,R).
• Stochastic process (µ(x))x∈Rd = random function.
A realization of the process = a function from Rd to R.
Distribution of (µ(x))x∈Rd characterized by moments of the form
E[φ(µ(x1), . . . , µ(xn))], for any n ∈ N*, x1, . . . , xn ∈ R^d, φ ∈ Cb(R^n, R).
Example: Gaussian process. A random process is Gaussian iff any (finite) linear combination is a Gaussian random variable.
Gaussian process
• Gaussian process (µ(x))x∈Rd characterized by its first two moments
m(x1) = E[µ(x1)] and R(x1,x2) = E[µ(x1)µ(x2)].
Any linear combination µλ = Σ_{i=1}^n λi µ(xi) has Gaussian distribution N(mλ, σλ²) with
mλ = Σ_{i=1}^n λi E[µ(xi)] and σλ² = Σ_{i,j=1}^n λi λj E[µ(xi)µ(xj)] − mλ²
• Simulation: in order to simulate (µ(x1), . . . , µ(xn)):
- evaluate the mean vector Mi = E[µ(xi)] and the covariance matrix
Cij = E[µ(xi)µ(xj)]− E[µ(xi)]E[µ(xj)].
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Y = M + C^{1/2} X. The vector Y has the distribution of (µ(x1), . . . , µ(xn)).
Note: the computation of the square root may be expensive (use the Cholesky method).
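A minimal sketch of this simulation recipe, assuming a squared-exponential covariance E[µ(x)µ(x′)] = exp(−(x − x′)²/2) and zero mean (both illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(6)

# Grid points x_1, ..., x_n and an assumed squared-exponential covariance.
x = np.linspace(0.0, 10.0, 200)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
M = np.zeros_like(x)  # assumed zero mean

# The Cholesky factor L (with L L^T = C) plays the role of C^{1/2};
# a small diagonal jitter keeps the factorization numerically stable.
L = np.linalg.cholesky(C + 1e-8 * np.eye(len(x)))

X = rng.standard_normal(len(x))  # independent N(0, 1) variables
Y = M + L @ X                    # one realization of (mu(x_1), ..., mu(x_n))
```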
Brownian motion
• Brownian motion (Wz)z≥0 (starting from 0) = real Gaussian process with mean 0
and covariance function
E[WzWz′ ] = z ∧ z′
The realizations of the Brownian motion are continuous but not differentiable.
The increments of the Brownian motion are independent:
if zn ≥ zn−1 ≥ · · · ≥ z1 ≥ z0 = 0, then (Wzn −Wzn−1, . . . ,Wz2 −Wz1 ,Wz1) are
independent Gaussian random variables with mean 0 and variance
E[(W_{zj} − W_{zj−1})²] = zj − zj−1
• Simulation: in order to simulate (Wh,W2h, ...,Wnh):
- evaluate the covariance matrix C = (Cjl)j,l=1,...,n with Cjl = (j ∧ l)h.
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Y = C^{1/2} X.
The vector Y has the distribution of (Wh,W2h, ...,Wnh).
• Simulation: in order to simulate (Wh,W2h, . . . ,Wnh) :
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Yj = √h Σ_{i=1}^j Xi.
The vector Y has the distribution of (Wh,W2h, . . . ,Wnh)T .
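The same recipe in a few lines of numpy (the step h = 0.01 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(7)

h, n = 0.01, 5_000  # step h and number of points: simulate (W_h, ..., W_{nh})

X = rng.standard_normal(n)     # independent N(0, 1) variables
W = np.sqrt(h) * np.cumsum(X)  # Y_j = sqrt(h) (X_1 + ... + X_j)

print(W[-1] ** 2, n * h)  # one path; over many paths E[W_{nh}^2] = nh
```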
[Figure: a realization of the Brownian motion z ↦ Wz for z ∈ [0, 50].]
A few results on Gaussian processes
Let (µ(x))x∈Rd be a Gaussian random process with mean zero and covariance
function E[µ(x)µ(x′)] = R(x,x′).
• The smoothness of the trajectory is related to the smoothness of the covariance
function R.
If R ∈ C^{k,k}(R^d), then the trajectories are of class C^{k/2−ε}.
The derivative (∂_{x1} µ(x))_{x∈R^d} is a Gaussian process with mean zero and covariance function E[∂_{x1}µ(x) ∂_{x1′}µ(x′)] = ∂_{x1}∂_{x1′} R(x, x′).
• The local extrema have the shape of the covariance function.
Given the existence of an extremum at x0 with amplitude µ0, then (µ(x))x∈Rd is a
Gaussian process with mean
E[µ(x) | µ(x0) = µ0, ∇µ(x0) = 0] = (R(x, x0) / R(x0, x0)) µ0
and covariance
Cov(µ(x), µ(x′) | µ(x0) = µ0, ∇µ(x0) = 0)
= R(x, x′) − R(x, x0) R(x0, x′) / R(x0, x0) − ∇_{x0}R(x, x0)^T H(x0, x0)^{−1} ∇_{x0}R(x0, x′)
where H(x, x′) = (∂²_{xi, xj′} R(x, x′))_{i,j=1}^d.
Stationary random process
• (µ(x))x∈Rd is stationary if (µ(x+ x0))x∈Rd has the same distribution as (µ(x))x∈Rd
for any x0 ∈ Rd.
Sufficient and necessary condition:
E[φ(µ(x1), . . . , µ(xn))] = E[φ(µ(x0 + x1), . . . , µ(x0 + xn))]
for any n, x0, . . . , xn ∈ R^d, φ ∈ Cb(R^n, R).
• Example: Gaussian process µ(x) with mean zero E[µ(x)] = 0 ∀x and covariance
function E[µ(x′)µ(x′ + x)] = c(x).
• Bochner's theorem: a function c(x) is a covariance function of a stationary process if and only if its Fourier transform ĉ(k) is nonnegative:
ĉ(k) = ∫_{R^d} e^{ik·x} c(x) dx
• Spectral representation (of a real-valued stationary Gaussian process):
µ(x) = (1/(2π)^d) ∫_{R^d} e^{−ik·x} √(ĉ(k)) n̂_k dk
with n̂_k a complex white noise, i.e.:
n̂_k complex-valued, Gaussian, n̂_{−k} = n̂*_k, E[n̂_k] = 0, and E[n̂_k n̂*_{k′}] = (2π)^d δ(k − k′), where * denotes complex conjugation.
(The representation is formal; one should use stochastic integrals dŴ_k = n̂_k dk.)
We have n̂_k = ∫ e^{ik·x} n(x) dx where n(x) is a real white noise, i.e.:
n(x) real-valued, Gaussian, E[n(x)] = 0, and E[n(x)n(x′)] = δ(x − x′).
(In 1D, formally, n(x) = dWx/dx.)
• Simulation (d = 1): in order to simulate (µ(x1), . . . , µ(xn)), xj = (j − 1)h:
- compute the covariance vector C = (c(xi))i=1,...,n.
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- filter with the square root of the Fourier transform of C:
Y = IDFT( √(DFT(C)) × DFT(X) )
→ Y is a realization of (µ(x1), . . . , µ(xn)) (in practice, use FFT and IFFT).
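A sketch of the FFT recipe, assuming the covariance c(x) = exp(−x²/2) (an illustrative choice); the covariance vector is wrapped so that the construction is exact for the periodized process:

```python
import numpy as np

rng = np.random.default_rng(8)

n, h = 4096, 0.05                # grid x_j = (j - 1) h
x = np.arange(n) * h

# Assumed covariance c(x) = exp(-x^2 / 2), wrapped so the vector C is
# symmetric under j -> n - j (the recipe is exact for the periodized process).
C = np.exp(-0.5 * np.minimum(x, n * h - x) ** 2)

X = rng.standard_normal(n)

# Filter with the square root of the spectrum (nonnegative by Bochner;
# tiny negative round-off values are clipped).
spec = np.clip(np.fft.fft(C).real, 0.0, None)
Y = np.fft.ifft(np.sqrt(spec) * np.fft.fft(X)).real  # realization of mu(x_j)
```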
Random differential equations and ordinary differential equations
Goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε)
for a fairly general random process (F(z))z≥0.
Method of averaging: Toy model
Let X^ε(z) ∈ R be the solution of
dX^ε/dz = F(z/ε)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = F̄ and E[(Fi − F̄)²] = σ².
(z ↦ t: a particle in a random velocity field)
[Figure: a realization of F(z/ε) and the corresponding trajectory X^ε(z), for ε = 0.2.]
X^ε(z) = ∫_0^z F(s/ε) ds = ε ∫_0^{z/ε} F(s) ds = ε Σ_{i=1}^{[z/ε]} Fi + ε ∫_{[z/ε]}^{z/ε} F(s) ds

Write the first term as ε[z/ε] × (1/[z/ε]) Σ_{i=1}^{[z/ε]} Fi: as ε → 0, the factor ε[z/ε] converges to z, while by the Law of Large Numbers the normalized sum converges almost surely to E[F1] = F̄. The second term equals ε(z/ε − [z/ε]) F_{[z/ε]+1} and converges almost surely to 0.

Thus:
X^ε(z) → X̄(z) as ε → 0, with dX̄/dz = F̄.
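A numerical illustration of this averaging (with the uniform Fi of the figures, so F̄ = 1/2; grid sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)

def X_eps(eps, z_max=2.0, n_grid=20_000):
    """Integrate dX/dz = F(z/eps), F piecewise constant on unit intervals."""
    F = rng.random(int(z_max / eps) + 1)   # F_i uniform on (0, 1), F_bar = 1/2
    z = np.linspace(0.0, z_max, n_grid, endpoint=False)
    return np.cumsum(F[(z / eps).astype(int)]) * (z_max / n_grid)

for eps in (0.2, 0.05, 0.01):
    print(eps, X_eps(eps)[-1])  # tends to F_bar * z_max = 1.0 as eps -> 0
```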
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.05.]
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.01.]
Goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε)
for a stationary random process (F(z))z≥0.
The previous analysis can be extended provided
(1/Z) ∫_0^Z F(z) dz → F̄ as Z → ∞
(i.e. F is ergodic).
Ergodic theory
Ergodic Theorem. If F satisfies the ergodic hypothesis, then
(1/Z) ∫_0^Z F(z) dz → F̄ a.s. as Z → ∞, where F̄ = E[F(0)] = E[F(z)]
Ergodic hypothesis = "the orbit (F(z))z≥0 visits all of phase space".
Ergodic theorem = "the spatial average is equivalent to the statistical average".
Counter-example for the ergodic hypothesis:
Let F1 and F2 be stationary processes, both satisfying the ergodic theorem, with means F̄j = E[Fj(z)], j = 1, 2, and F̄1 ≠ F̄2.
Let χ be a Bernoulli random variable (independent of F1 and F2) with P(χ = 0) = P(χ = 1) = 1/2.
Let F(z) = χ F1(z) + (1 − χ) F2(z).
F is a stationary process with mean F̄ = (F̄1 + F̄2)/2.
(1/Z) ∫_0^Z F(z) dz = χ ((1/Z) ∫_0^Z F1(z) dz) + (1 − χ) ((1/Z) ∫_0^Z F2(z) dz) → χ F̄1 + (1 − χ) F̄2 as Z → ∞,
which is a random limit different from F̄.
The limit depends on χ because F has been trapped in a part of phase space.
Mean square theory
Let F be a stationary process with E[F(0)²] < ∞. Its covariance function is
c(z) = E[(F(z0) − F̄)(F(z0 + z) − F̄)]
• c is independent of z0 by stationarity of F.
• |c(z)| ≤ c(0) by Cauchy-Schwarz:
|c(z)| ≤ E[(F(0) − F̄)²]^{1/2} E[(F(z) − F̄)²]^{1/2} = c(0)
• c is an even function, c(−z) = c(z):
c(−z) = E[(F(z0 − z) − F̄)(F(z0) − F̄)] =(z0 = z) E[(F(0) − F̄)(F(z) − F̄)] = c(z)
• The Fourier transform of c is nonnegative (Bochner).
Proposition. If ∫_0^∞ |c(z)| dz < ∞, then
E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 0 as Z → ∞
We can show that
Z E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 2 ∫_0^∞ c(z) dz as Z → ∞
Proof:
E[((1/Z) ∫_0^Z F(z) dz − F̄)²] = E[(1/Z²) ∫_0^Z dz1 ∫_0^Z dz2 (F(z1) − F̄)(F(z2) − F̄)]
= (1/Z²) ∫_0^Z dz1 ∫_0^Z dz2 c(z1 − z2)
= (2/Z) ∫_0^Z ((Z − z)/Z) c(z) dz
Using Lebesgue's theorem:
Z E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 2 ∫_0^∞ c(z) dz as Z → ∞
Let F be a stationary zero-mean random process. Denote
S_k(Z) = (1/√Z) ∫_0^Z e^{ikz} F(z) dz
We can show similarly
E[|S_k(Z)|²] → 2 ∫_0^∞ c(z) cos(kz) dz = ∫_{−∞}^{+∞} c(z) e^{ikz} dz as Z → ∞
Simplified form of Bochner’s theorem: If F is a stationary process with integrable
covariance function, then the Fourier transform of its covariance function is
nonnegative.
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε, X^ε(z))
Method of averaging: Khasminskii theorem
dX^ε/dz = F(z/ε, X^ε), X^ε(0) = x0
Assume:
x ↦ F(z, x) is Lipschitz,
z ↦ F(z, x) is stationary and ergodic.
Define:
F̄(x) = E[F(z, x)]
Let X̄ be the solution of dX̄/dz = F̄(X̄), X̄(0) = x0.
Theorem: for any Z > 0,
sup_{z∈[0,Z]} E[|X^ε(z) − X̄(z)|] → 0 as ε → 0
[1] R. Z. Khasminskii, Theory Probab. Appl. 11 (1966), 211-228.
Application to wave propagation in random media
• Let (p^ε(t, z), u^ε(t, z)) be the solution of the one-dimensional acoustic wave equations in a random medium:
ρ(z/ε) ∂u^ε/∂t + ∂p^ε/∂z = 0
∂p^ε/∂t + κ(z/ε) ∂u^ε/∂z = 0
• The Fourier components (in time) of the solution,
û^ε(ω, z) = ∫ u^ε(t, z) e^{iωt} dt, p̂^ε(ω, z) = ∫ p^ε(t, z) e^{iωt} dt,
satisfy dX^ε/dz = F(z/ε, X^ε), where
X^ε = (p̂^ε, û^ε)^T, F(z, X) = −iω ( 0  ρ(z) ; 1/κ(z)  0 ) X
• Apply the method of averaging =⇒ X^ε(ω, z) converges to X̄(ω, z), solution of
dX̄/dz = −iω ( 0  ρ̄ ; 1/κ̄  0 ) X̄, with ρ̄ = E[ρ], κ̄ = (E[κ^{−1}])^{−1}
→ deterministic "effective medium" with parameters ρ̄, κ̄.
• Take an inverse Fourier transform: (p^ε(t, z), u^ε(t, z)) converges to (p̄(t, z), ū(t, z)), solution of the homogeneous effective system
ρ̄ ∂ū/∂t + ∂p̄/∂z = 0
∂p̄/∂t + κ̄ ∂ū/∂z = 0
The propagation speed of (p̄, ū) is c̄ = √(κ̄/ρ̄).
→ the effective speed of the acoustic wave (p^ε, u^ε) as ε → 0 is c̄.
• This analysis is just a small piece of homogenization theory; cf. the book The Theory of Composites by G. Milton.
Example: bubbles in water
ρa = 1.2·10³ g/m³, κa = 1.4·10⁸ g/s²/m, ca = 340 m/s.
ρw = 1.0·10⁶ g/m³, κw = 2.0·10¹² g/s²/m, cw = 1425 m/s.
If the typical pulse frequency is 10 Hz - 30 kHz, then the typical wavelength is 1 cm - 100 m, which is larger than the bubble sizes =⇒ the effective medium theory can be applied.
ρ̄ = E[ρ] = φ ρa + (1 − φ) ρw = 9.9·10⁵ g/m³ if φ = 1%, 9·10⁵ g/m³ if φ = 10%
κ̄ = (E[κ^{−1}])^{−1} = (φ/κa + (1 − φ)/κw)^{−1} = 1.4·10¹⁰ g/s²/m if φ = 1%, 1.4·10⁹ g/s²/m if φ = 10%
where φ = volume fraction of air.
Thus c̄ = 120 m/s if φ = 1% and c̄ = 37 m/s if φ = 10%.
→ the average sound speed c̄ can be much smaller than ess inf(c).
The converse is impossible:
E[c^{−1}] = E[κ^{−1/2} ρ^{1/2}] ≤ E[κ^{−1}]^{1/2} E[ρ]^{1/2} = c̄^{−1}
Thus c̄ ≤ E[c^{−1}]^{−1} ≤ ess sup(c).
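The slide's arithmetic can be reproduced in a few lines (constants taken from the slide; the printed speeds match 120 m/s and 37 m/s up to rounding):

```python
import numpy as np

# Effective-medium parameters for air bubbles in water (values from the slide).
rho_a, kappa_a = 1.2e3, 1.4e8    # air:   g/m^3 and g/s^2/m
rho_w, kappa_w = 1.0e6, 2.0e12   # water: g/m^3 and g/s^2/m

for phi in (0.01, 0.10):         # volume fraction of air
    rho_bar = phi * rho_a + (1 - phi) * rho_w
    kappa_bar = 1.0 / (phi / kappa_a + (1 - phi) / kappa_w)
    c_bar = np.sqrt(kappa_bar / rho_bar)
    print(phi, rho_bar, kappa_bar, c_bar)  # c_bar ~ 120 m/s and ~ 37-40 m/s
```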
Random differential equations and Brownian motion
Toy model
dX^ε/dz = F(z/ε)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = F̄ = 0 and E[(Fi − F̄)²] = σ².
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.05.]
For any z ∈ [0, Z], we have
X^ε(z) → X̄(z) as ε → 0, with dX̄/dz = F̄ = 0.
No macroscopic evolution is noticeable.
→ it is necessary to look at larger z to get an effective behavior:
rescale z ↦ z/ε, i.e. set X̃^ε(z) = X^ε(z/ε), so that
dX̃^ε/dz = (1/ε) F(z/ε²)
Diffusion-approximation: Toy model
dX^ε/dz = (1/ε) F(z/ε²)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = 0 and E[Fi²] = σ².
[Figure: a realization of (1/ε) F(z/ε²) and X^ε(z), for ε = 0.05.]
X^ε(z) = ∫_0^z (1/ε) F(s/ε²) ds = ε ∫_0^{z/ε²} F(s) ds = ε Σ_{i=1}^{[z/ε²]} Fi + ε ∫_{[z/ε²]}^{z/ε²} F(s) ds

Write the first term as ε√[z/ε²] × (1/√[z/ε²]) Σ_{i=1}^{[z/ε²]} Fi: as ε → 0, the factor ε√[z/ε²] converges to √z, while by the Central Limit Theorem the normalized sum converges in law to N(0, σ²). The second term equals ε(z/ε² − [z/ε²]) F_{[z/ε²]+1} and converges almost surely to 0.

Thus: X^ε(z) converges in distribution as ε → 0 to X(z), whose distribution is N(0, σ²z).
With some more work: The process (X^ε(z))z≥0 converges in distribution to a Brownian motion (σWz)z≥0.
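A numerical illustration of the diffusive scaling (Gaussian Fi are an assumed choice; any centered distribution with variance σ² behaves the same):

```python
import numpy as np

rng = np.random.default_rng(10)

eps, z_max, sigma = 0.05, 2.0, 1.0
n = int(z_max / eps**2)          # number of unit intervals covered by z/eps^2

# X^eps(z_max) ~ eps * (F_1 + ... + F_n), with centered F_i of variance sigma^2.
trials = 20_000
X_end = eps * rng.normal(0.0, sigma, size=(trials, n)).sum(axis=1)

print(X_end.var())  # close to sigma^2 * z_max = 2.0, as predicted by the CLT
```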
Diffusion processes and stochastic differential equations
Example: Brownian motion
[Figure: a realization of the Brownian motion z ↦ Wz.]
Wz (started from 0): zero-mean Gaussian process with covariance E[Wz Wz′] = z ∧ z′.
Its increments are independent and satisfy
E[(W_{z+h} − Wz)²] = h
Let φ be a bounded real function:
E[φ(x + Wz)] = ∫ (1/√(2πz)) exp(−w²/(2z)) φ(x + w) dw = ∫ pz(x, y) φ(y) dy,
with pz(x, y) = (1/√(2πz)) exp(−(y − x)²/(2z)).
For x, z fixed, y ↦ pz(x, y) is the pdf of the random variable x + Wz.
Note that pz(x, y) is the kernel of the heat operator.
Example: Brownian motion
• The moment
u(z, x) := E[φ(x + Wz)] = ∫ pz(x, y) φ(y) dy, pz(x, y) = (1/√(2πz)) exp(−(y − x)²/(2z)),
satisfies the Kolmogorov (backward) equation
∂u/∂z = (1/2) ∂²u/∂x², u(z = 0, x) = φ(x)
Converse: the partial differential equation
∂u/∂z = (1/2) ∂²u/∂x², u(z = 0, x) = φ(x)
has the probabilistic representation u(z, x) = E[φ(x + Wz)].
This converse is useful for Monte Carlo simulation techniques for solving PDEs.
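A minimal sketch of such a Monte Carlo solver for the heat equation, with the assumed initial condition φ = cos (chosen because the exact solution e^{−z/2} cos x is known):

```python
import numpy as np

rng = np.random.default_rng(11)

# Monte Carlo solver for du/dz = (1/2) d^2u/dx^2, u(0, x) = phi(x),
# via the representation u(z, x) = E[phi(x + W_z)].
def u_monte_carlo(z, x, phi, n_samples=1_000_000):
    w = np.sqrt(z) * rng.standard_normal(n_samples)  # W_z ~ N(0, z)
    return phi(x + w).mean()

phi = np.cos  # initial condition with a known exact solution
z, x = 0.5, 1.0
print(u_monte_carlo(z, x, phi))    # Monte Carlo estimate
print(np.exp(-z / 2) * np.cos(x))  # exact solution u(z, x) = e^{-z/2} cos x
```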
• The pdf y ↦ pz(x, y) of x + Wz satisfies the Fokker-Planck (Kolmogorov forward) equation as a function of z and y:
∂pz/∂z = (1/2) ∂²pz/∂y², p_{z=0}(x, y) = δ(y − x)
Stochastic differential equations
Let X(z) be the solution of the one-dimensional stochastic differential equation
X(z) = x + ∫_0^z σ(X(s)) dWs + ∫_0^z b(X(s)) ds
Existence and uniqueness are ensured provided b and σ are C¹ with bounded derivatives.
X(z) is continuous a.s., square integrable, and adapted (it depends on (Ws)0≤s≤z).
• Ito integral:
∫_0^z φ(X(s)) dWs = lim_{∆z→0} Σ_k φ(X(zk)) (W_{zk+1} − W_{zk}), 0 = z0 < z1 < · · · < zn = z
Good properties: martingale (mean zero), Ito's isometry:
E[(∫_0^z φ(X(s)) dWs)²] = ∫_0^z E[φ(X(s))²] ds
• Stratonovich integral:
∫_0^z φ(X(s)) ◦ dWs = lim_{∆z→0} Σ_k ((φ(X(zk+1)) + φ(X(zk)))/2) (W_{zk+1} − W_{zk})
• Relation:
∫_0^z φ(X(s)) ◦ dWs = ∫_0^z φ(X(s)) dWs + (1/2) ∫_0^z σ(X(s)) φ′(X(s)) ds
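The slides do not give a discretization scheme; a standard one is Euler-Maruyama, sketched here with illustrative coefficients (not from the slides):

```python
import numpy as np

rng = np.random.default_rng(12)

def euler_maruyama(x0, sigma, b, z_max, n_steps):
    """One path of dX = sigma(X) dW + b(X) dz (Euler-Maruyama scheme)."""
    dz = z_max / n_steps
    dW = np.sqrt(dz) * rng.standard_normal(n_steps)  # Brownian increments
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + sigma(x[k]) * dW[k] + b(x[k]) * dz
    return x

# Illustrative coefficients: dX = 0.3 X dW + 0.1 X dz, X(0) = 1.
path = euler_maruyama(1.0, lambda x: 0.3 * x, lambda x: 0.1 * x, 1.0, 1_000)
print(path[-1])
```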
X(z) = x + ∫_0^z σ(X(s)) dWs + ∫_0^z b(X(s)) ds
• Ito's formula:
φ(X(z)) = φ(x) + ∫_0^z σ(X(s)) φ′(X(s)) dWs + ∫_0^z b(X(s)) φ′(X(s)) ds + (1/2) ∫_0^z σ(X(s))² φ″(X(s)) ds
• X(z) is a diffusion process with the generator Q = (1/2) σ²(x) ∂²/∂x² + b(x) ∂/∂x
• The moment u(z, x) := E[φ(X(z)) | X(0) = x] satisfies the Kolmogorov equation:
∂u/∂z = Qu, u(z = 0, x) = φ(x)
Proof: Let u be the solution of the Kolmogorov equation, and let z0 > 0. By Ito's formula,
du(z0 − z, X(z)) = −∂z u(z0 − z, X(z)) dz + ∂x u(z0 − z, X(z)) σ(X(z)) dWz + ∂x u(z0 − z, X(z)) b(X(z)) dz + (1/2) σ²(X(z)) ∂²_x u(z0 − z, X(z)) dz
= ∂x u(z0 − z, X(z)) σ(X(z)) dWz
Therefore E[u(z0 − z, X(z))] is constant in z:
u(z0, x) = E[u(z0, X(0)) | X(0) = x] = E[u(0, X(z0)) | X(0) = x] = E[φ(X(z0)) | X(0) = x]
• The pdf y ↦ pz(x, y) of X(z) (starting from X(0) = x) satisfies the Fokker-Planck equation as a function of z and y:
∂pz/∂z = Q* pz, p_{z=0}(x, y) = δ(y − x)
where Q* is the adjoint operator of Q:
Q* p(y) = (1/2) ∂²/∂y² (σ²(y) p(y)) − ∂/∂y (b(y) p(y))
Proof: Consider a test function φ. Taking the expectation in Ito's formula
φ(X(z)) = φ(x) + ∫_0^z σ(X(s)) φ′(X(s)) dWs + ∫_0^z b(X(s)) φ′(X(s)) ds + (1/2) ∫_0^z σ(X(s))² φ″(X(s)) ds,
one gets
E[φ(X(z))] = φ(x) + ∫_0^z E[Qφ(X(s))] ds → ∂E[φ(X(z))]/∂z = E[Qφ(X(z))]
Therefore
∫ φ(y) ∂z pz(x, y) dy = ∫ Qφ(y) pz(x, y) dy = ∫ φ(y) Q* pz(x, y) dy
Since this holds true for any φ, one finds ∂z pz(x, y) = Q* pz(x, y).
Diffusion processes
• Let σ and b be C¹(R, R) functions with bounded derivatives, and let Wz be a Brownian motion.
The solution X(z) of the 1D stochastic differential equation
dX(z) = σ(X(z)) dWz + b(X(z)) dz
is a diffusion process with the generator
Q = (1/2) σ²(x) ∂²/∂x² + b(x) ∂/∂x
• Let σ ∈ C¹(R^n, R^{n×m}) and b ∈ C¹(R^n, R^n) with bounded derivatives, and let Wz be an m-dimensional Brownian motion.
The solution X(z) of the stochastic differential equation
dX(z) = σ(X(z)) dWz + b(X(z)) dz
is a diffusion process with the generator
Q = (1/2) Σ_{i,j=1}^n aij(x) ∂²/∂xi∂xj + Σ_{i=1}^n bi(x) ∂/∂xi
with a = σσ^T.
Random differential equations and stochastic differential equations
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = (1/ε) F(z/ε²)
for a fairly general process (F(z))z≥0 with E[F(z)] = 0.
Write
X^ε(z) = (1/ε) ∫_0^z F(s/ε²) ds = ε √(z/ε²) × (1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds
We know that
E[(1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds] = 0,
lim_{ε→0} E[((1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds)²] = lim_{Z→∞} Z E[((1/Z) ∫_0^Z F(s) ds)²] = 2 ∫_0^∞ E[F(0)F(u)] du.
The Gaussian property of the limit of X^ε(z) is ensured by an invariance principle.
Conclusion:
(1/ε) ∫_0^z F(s/ε²) ds → √2 σ Wz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = ∫_0^∞ E[F(0)F(u)] du
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = (1/ε) F(z/ε², X^ε(z))
when
E[F(z, x)] = 0 ∀x
Diffusion-approximation
dX^ε/dz(z) = (1/ε) F(z/ε², X^ε(z)), X^ε(0) = x0 ∈ R^d.
F stationary, centered, and ergodic (in z): E[F(z, x)] = 0.
Theorem: The processes (X^ε(z))z≥0 converge in distribution in C⁰([0,∞), R^d) to the diffusion process X with the generator L:
L = Σ_{i,j=1}^d aij(x) ∂²/∂xi∂xj + Σ_{j=1}^d bj(x) ∂/∂xj
with
aij(x) = ∫_0^∞ E[Fi(0, x) Fj(u, x)] du
bj(x) = Σ_{i=1}^d ∫_0^∞ E[Fi(0, x) ∂xi Fj(u, x)] du
It means that the pdf pz(x) of X(z) satisfies:
∂pz/∂z = Σ_{i,j=1}^d ∂²/∂xi∂xj (aij(x) pz) − Σ_{j=1}^d ∂/∂xj (bj(x) pz), p_{z=0}(x) = δ(x − x0)
Diffusion-approximation - the one-dimensional case
dX^ε/dz(z) = (1/ε) f(z/ε²) h(X^ε(z)), X^ε(0) = x0 ∈ R.
f stationary, centered, and ergodic: E[f(z)] = 0.
Theorem: The processes (X^ε(z))z≥0 converge in distribution in C⁰([0,∞), R) to the diffusion process X with the generator L:
L = a(x) ∂²/∂x² + b(x) ∂/∂x
with
a(x) = (1/2) σ² h²(x), b(x) = (1/2) σ² h(x) h′(x), σ² = 2 ∫_0^∞ E[f(0)f(u)] du
It means that X(z) is the solution of the SDE
dX(z) = σ h(X(z)) dWz + b(X(z)) dz
or
dX(z) = σ h(X(z)) ◦ dWz
Remember that ∫_0^z (1/ε) f(s/ε²) ds converges in distribution to σWz.
→ the "natural" integral is Stratonovich.
Diffusion-approximation - a simple example
dX^ε/dz = (1/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
Method 1. Apply the theorem → (X^ε(z))z≥0 converges in distribution to the diffusion process (X(z))z≥0, solution of the SDE
dX(z) = σ X(z) dWz + (σ²/2) X(z) dz
where
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Method 2. We have
X^ε(z) = x0 exp((1/ε) ∫_0^z µ(s/ε²) ds)
We know that
(1/ε) ∫_0^z µ(s/ε²) ds → σWz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Therefore
X^ε(z) → X(z) := x0 exp(σWz) as ε → 0
in distribution. We have (by Ito's formula):
dX(z) = σ X(z) dWz + (σ²/2) X(z) dz
Averaging and diffusion-approximation - the linear case
dX^ε/dz(z) = (1/ε) F(z/ε²) X^ε(z) + G(z/ε²) X^ε(z), X^ε(0) = x0 ∈ R^d.
F matrix-valued, stationary, ergodic, and centered: E[F(z)] = 0.
G matrix-valued, stationary, and ergodic: E[G(z)] = Ḡ.
Theorem: The expectations (E[X^ε(z)])z≥0 converge in C⁰([0,∞), R^d) to the solution (X̄(z))z≥0 of the linear system
dX̄/dz(z) = F̃ X̄(z) + Ḡ X̄(z), X̄(0) = x0 ∈ R^d,
with
F̃jl = Σ_{k=1}^d ∫_0^∞ E[Fkl(0) Fjk(u)] du
Averaging and diffusion-approximation - a simple example
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
Method 1. Apply the theorem to the real and imaginary parts:
d/dz (X^ε_r, X^ε_i)^T = (1/ε) µ(z/ε²) ( 0  −1 ; 1  0 ) (X^ε_r, X^ε_i)^T
The expectation converges to the solution of
d/dz (X̄r, X̄i)^T = (σ²/2) ( −1  0 ; 0  −1 ) (X̄r, X̄i)^T
where
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
This shows that
E[X^ε(z)] → X̄(z) as ε → 0
where X̄(z) is the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
Method 2. We have
X^ε(z) = x0 exp((i/ε) ∫_0^z µ(s/ε²) ds)
We know that
(1/ε) ∫_0^z µ(s/ε²) ds → σWz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Therefore
X^ε(z) → X(z) := x0 exp(iσWz) as ε → 0
in distribution. Note that
E[exp(iσWz)] = (1/√(2πz)) ∫_{−∞}^{+∞} e^{iσw} e^{−w²/(2z)} dw = e^{−σ²z/2}
This shows that
E[X^ε(z)] → X̄(z) = x0 e^{−σ²z/2} as ε → 0
where X̄(z) is the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
We have shown that the expectation of the solution of
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
converges as ε → 0 to the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
→ random phases give rise to effective damping.
We have, in distribution: the solution of
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
converges as ε → 0 to the solution of
dX = iσX ◦ dWz, X(z = 0) = x0
→ note again the Stratonovich integral. Equivalently,
dX = iσX dWz − (σ²/2) X dz, X(z = 0) = x0