Introduction to stochastic processes
Josselin Garnier (Université Paris VII)
http://www.proba.jussieu.fr/~garnier
Lille January, 2013
A quick introduction to probability theory
Probability space (Ω,A,P)
• Goal: Model a random experiment.
Example 1: toss a coin.
Example 2: pick a number between 0 and 1 (matlab function rand).
• Ω: fundamental set (set of all possible realisations).
→ A realization ω is an element of Ω.
Example 1: Ω = {H, T}. Example 2: Ω = [0, 1].
• A: σ-algebra on Ω (set of events).
Example 1: A = P(Ω); an event: the result is heads, A = {H}.
Example 2: A = B([0, 1]); an event: the number is larger than 1/2, A = [1/2, 1].
• P: probability (gives the probability of events): function from A to [0, 1] such that:
- P(Ω) = 1,
- if (Aj)j≥1 is a countable family of disjoint sets of A, then P(∪_j Aj) = Σ_j P(Aj).
Example 1: P({H}) = P({T}) = 1/2.
Example 2: P([a, b]) = b − a for all 0 ≤ a ≤ b ≤ 1.
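A minimal numpy sketch of both examples (an illustration added here, not from the slides; numpy.random plays the role of the matlab function rand):

```python
import numpy as np

rng = np.random.default_rng(0)

# Example 1: toss a fair coin; P({H}) = P({T}) = 1/2.
coin = rng.choice(["H", "T"])

# Example 2: pick a number uniformly in [0, 1] (the analogue of matlab's rand);
# the axiom P([a, b]) = b - a is checked by an empirical frequency.
a, b = 0.25, 0.75
samples = rng.random(100_000)
freq = np.mean((a <= samples) & (samples <= b))
print(coin, freq)  # freq should be close to b - a = 0.5
```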
Random variable
• Random variable X = random number.
Map X : Ω → R.
A realization X(ω) of a random variable is a real number.
• The distribution of a random variable is characterized by moments of the form
E[φ(X)] for φ ∈ Cb(R,R):
E[φ(X)] = ∫_Ω φ(X(ω)) P(dω)
• The distribution of a (continuous) random variable is characterized by the
probability density function (pdf) pX :
E[φ(X)] = ∫_{−∞}^{+∞} φ(x) p_X(x) dx
• Usual pdfs:
[Figure: the probability density functions p_X(x) of the uniform, exponential, and Gaussian distributions.]
• The mean (expectation) of a random variable X with pdf pX(x) is
E[X] = ∫_{−∞}^{+∞} x p_X(x) dx
• The variance of a random variable X with pdf pX(x) is
Var(X) = E[(X − E[X])²] = E[X²] − E[X]² = ∫_{−∞}^{+∞} x² p_X(x) dx − (∫_{−∞}^{+∞} x p_X(x) dx)²
The variance measures the dispersion of the random variable (around its mean).
• A standard Gaussian random variable X has the pdf
p_X(x) = (1/√(2π)) exp(−x²/2)
Its mean is E[X] = 0 and its variance is Var(X) = 1.
We write X ∼ N (0, 1).
• A Gaussian random variable X with mean µ and variance σ2 has the pdf
p_X(x) = (1/(√(2π) σ)) exp(−(x − µ)²/(2σ²))
We write X ∼ N (µ, σ2).
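As a quick illustrative check (the values µ = 2, σ = 0.5 are assumed, not from the slides), one can sample X ∼ N(µ, σ²) and verify the mean and variance empirically:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 2.0, 0.5

# Sample X ~ N(mu, sigma^2) by shifting and scaling a standard Gaussian N(0, 1).
x = mu + sigma * rng.standard_normal(1_000_000)

print(x.mean(), x.var())  # close to mu = 2.0 and sigma^2 = 0.25
```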
Random vector
• n-dimensional random vector X = collection of n random variables (Xi)i=1,...,n.
Map X : Ω → R^n.
A realization X(ω) of a random vector is a vector in Rn.
The distribution of a (continuous) random vector is characterized by the pdf pX :
E[φ(X)] = ∫_{R^n} φ(x) p_X(x) dx, ∀φ ∈ Cb(R^n, R)
The components of the vector X = (Xi)i=1,...,n are independent if
p_X(x) = Π_{j=1}^n p_{Xj}(xj),
or equivalently
E[φ1(X1) · · · φn(Xn)] = E[φ1(X1)] · · · E[φn(Xn)], ∀φ1, . . . , φn ∈ Cb(R,R)
Example: a normalized Gaussian random vector X has the Gaussian pdf
p_X(x) = (1/(2π)^{n/2}) exp(−|x|²/2)
It is a vector of n independent normalized Gaussian random variables.
• Let X = (Xi)i=1,...,n be a random vector.
• The mean of X is the vector µ = (µj)j=1,...,n:
µj = E[Xj ]
The covariance matrix of X is C = (Cjl)j,l=1,...,n:
Cjl = E[(Xj − E[Xj])(Xl − E[Xl])]
• Let β = (βj)j=1,...,n ∈ R^n. The random variable Z = Σ_{j=1}^n βj Xj has mean
E[Z] = β^T µ = Σ_{j=1}^n βj E[Xj]
and variance
Var(Z) = β^T C β = Σ_{j,l=1}^n Cjl βj βl
(as a byproduct, this shows that C is a positive semidefinite matrix).
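A small numerical check of Var(Z) = β^T C β; the covariance structure below is a hypothetical choice made for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# A (hypothetical) correlated 3-dimensional vector X = A G + m, G standard Gaussian.
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
m = np.array([1.0, -1.0, 0.0])
C = A @ A.T                                      # covariance matrix of X
X = rng.standard_normal((500_000, 3)) @ A.T + m

beta = np.array([0.3, -0.7, 1.1])
Z = X @ beta                                     # Z = sum_j beta_j X_j

print(Z.var(), beta @ C @ beta)  # both close to beta^T C beta
```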
• If the variables are independent then the covariance matrix is diagonal. In
particular:
Var(Σ_{j=1}^n Xj) = Σ_{j=1}^n Var(Xj)
The converse is false (in general).
• Gaussian random vectors.
The random vector X = (Xi)i=1,...,n is Gaussian iff for any λ ∈ Rn, the sum λTX is
a Gaussian random variable.
• The distribution of X is characterized by the mean µ and the covariance matrix R.
When R is invertible, X has the pdf
p(x) = (1/((2π)^{n/2} √(det R))) exp(−(x − µ)^T R^{−1} (x − µ)/2)
• The Xj are independent iff the covariance matrix is diagonal (iff Cov(Xj, Xl) = 0 for all j ≠ l).
• Let A be an m × n matrix. The random vector Y = AX is Gaussian with mean
E[Y] = Aµ
and covariance matrix
E[(Y − E[Y])(Y − E[Y])^T] = A R A^T
→ Any linear transformation maps a Gaussian vector to a Gaussian vector.
Conditional expectation
Let (X, Y) be a random vector, where X is R^n-valued and Y is R-valued.
• The expectation of Y is the deterministic constant that is the best approximation of
the random variable Y :
E[Y] = argmin_{a∈R} E[(Y − a)²]
(in the mean square sense).
• Conditional expectation.
Assume that we observe X. What is the best estimate of Y given X ?
We look for:
ψ0 = argmin_{ψ:R^n→R} E[(Y − ψ(X))²]
and then E[Y |X] = ψ0(X).
→ Complex minimization problem.
Interpretation: E[Y |X] is the orthogonal projection (in L2(Ω)) of Y onto the
(infinite-dimensional) subspace of square-integrable variables of the form ψ(X).
Remark: if X and Y are independent, then E[Y |X] = E[Y ].
• Linear regression.
Assume that we observe X. What is the affine combination of X that best
approximates Y ?
The problem to be solved is:
(αj^reg)_{j=0}^n = argmin_{(αj)_{j=0}^n ∈ R^{n+1}} E[(Y − α0 − Σ_{j=1}^n αj Xj)²]
and one finds Y^reg = α0^reg + Σ_{j=1}^n αj^reg Xj.
→ Quadratic minimization problem (much easier to solve than the computation of
the conditional expectation).
But linear regression is clearly sub-optimal from the point of view of approximation theory: the best affine combination of X has no reason to be the best approximation of Y given X.
• Result: If (X, Y) is Gaussian, then the conditional expectation E[Y|X] is the linear regression of Y on X.
Example: If (X, Y) is a Gaussian vector with mean zero and covariance matrix
( 1 ρ ; ρ 1 ),
then E[Y|X] = ρX.
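An illustrative check (ρ = 0.8 is an assumed value): the least-squares slope of Y on X recovers ρ, which for a Gaussian pair is also the conditional expectation:

```python
import numpy as np

rng = np.random.default_rng(3)
rho = 0.8
n = 1_000_000

# Sample the Gaussian pair (X, Y): zero means, unit variances, correlation rho.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Least-squares (linear regression) slope of Y on X; for a Gaussian pair this
# is also the conditional expectation: E[Y|X] = rho X.
slope = np.sum(x * y) / np.sum(x * x)
print(slope)  # close to rho = 0.8
```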
Gaussian conditioning
Let (X1, X2) be a Gaussian random vector (with X1 of size n1 and X2 of size n2):
L(X1, X2) = N( (0, 0), ( R11 R12 ; R21 R22 ) )
with the covariance matrices R11 of size n1 × n1, R12 of size n1 × n2, R21 = R12^T of size n2 × n1, and R22 of size n2 × n2.
Then the distribution of X1 conditionally on X2 is Gaussian:
L(X1 | X2 = x2) = N( R12 R22^{−1} x2, R11 − R12 R22^{−1} R21 )
Proof: The posterior pdf of X1 given that X2 = x2 is p_post(x1) = p(x1, x2) / ∫ p(x1′, x2) dx1′.
Use
( R11 R12 ; R21 R22 )^{−1} = ( Q^{−1}  −Q^{−1} R12 R22^{−1} ; −R22^{−1} R21 Q^{−1}  R22^{−1} + R22^{−1} R21 Q^{−1} R12 R22^{−1} ),
where Q = R11 − R12 R22^{−1} R21 is the Schur complement.
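A sketch checking the conditioning formula on a hypothetical 3-dimensional example (the matrix R below is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)

# A hypothetical 3D Gaussian vector, split as X1 (size n1 = 2) and X2 (size n2 = 1).
R = np.array([[2.0, 0.6, 0.8],
              [0.6, 1.5, 0.5],
              [0.8, 0.5, 1.0]])
R11, R12, R22 = R[:2, :2], R[:2, 2:], R[2:, 2:]

x2 = np.array([1.0])  # observed value of X2

# Conditional law from the slide: N(R12 R22^{-1} x2, R11 - R12 R22^{-1} R21).
cond_mean = R12 @ np.linalg.solve(R22, x2)
cond_cov = R11 - R12 @ np.linalg.solve(R22, R12.T)

# Empirical check: keep the samples whose X2 falls close to x2.
X = rng.multivariate_normal(np.zeros(3), R, size=2_000_000)
kept = X[np.abs(X[:, 2] - x2[0]) < 0.02, :2]
print(cond_mean, kept.mean(axis=0))  # the two means should agree closely
```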
Example. Let us consider the Gaussian vector (n1 = n2 = 1)
(X1, X2) ∼ N( (0, 0), ( 1 ρ ; ρ 1 ) )
with ρ ∈ [−1, 1] the correlation coefficient of X1 and X2.
The distribution of X1 is
L(X1) = N(0, 1)
If one observes X2 = x2, then:
L(X1 | X2 = x2) = N( ρ x2, 1 − ρ² )
- the mean of X1 is attracted (if ρ > 0) or repelled (if ρ < 0) by the observation.
- the variance of X1 is reduced (if ρ ≠ 0).
Extremal cases: ρ = 0 (one learns nothing from the observation of X2) and ρ = 1 (one knows everything once X2 is observed).
Limit theorems
• Law of Large Numbers.
Let (Xn)n≥1 be independent and identically distributed (i.i.d.) random variables. If E[|X1|] < ∞, then
X̄n = (1/n)(X1 + X2 + ... + Xn) → m almost surely as n → ∞, with m = E[X1].
"The empirical mean converges to the statistical mean."
• Central Limit Theorem. Fluctuations theory.
Let (Xn)n≥1 be i.i.d. random variables. If E[X1²] < ∞, then
√n (X̄n − m) = √n ((1/n)(X1 + X2 + ... + Xn) − m) → N(0, σ²) in law as n → ∞,
where
m = E[X1], σ² = E[X1²] − E[X1]² = E[(X1 − E[X1])²].
"For large n, the error X̄n − m obeys the Gaussian distribution N(0, σ²/n)."
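Both limit theorems can be illustrated numerically; a minimal sketch with uniform Xi (an assumed choice, so m = 1/2 and σ² = 1/12):

```python
import numpy as np

rng = np.random.default_rng(5)

# X_i uniform on (0, 1): m = 1/2, sigma^2 = 1/12.
m, sigma2 = 0.5, 1.0 / 12.0
n, trials = 2_000, 5_000

means = rng.random((trials, n)).mean(axis=1)

print(means.mean())                      # LLN: close to m = 0.5
print((np.sqrt(n) * (means - m)).var())  # CLT: close to sigma^2 = 1/12
```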
Stochastic processes
Toy model
Let F(z) ∈ R be the piecewise constant random process
F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z)
where the Fi are independent random variables with E[Fi] = F̄ and E[(Fi − F̄)²] = σ².
[Figure: a realization of F(z) for z ∈ [0, 10]; here the Fi are independent with uniform distribution over (0, 1).]
Random process
• Random variable X = random number.
A realization of the random variable = a real number.
Distribution of X characterized by moments of the form E[φ(X)] where φ ∈ Cb(R,R).
• Stochastic process (µ(x))x∈Rd = random function.
A realization of the process = a function from Rd to R.
Distribution of (µ(x))x∈Rd characterized by moments of the form
E[φ(µ(x1), . . . , µ(xn))], for any n ∈ N*, x1, . . . , xn ∈ R^d, φ ∈ Cb(R^n, R).
Example: Gaussian process. A random process is Gaussian iff any (finite) linear combination is a Gaussian random variable.
Gaussian process
• Gaussian process (µ(x))x∈Rd characterized by its first two moments
m(x1) = E[µ(x1)] and R(x1,x2) = E[µ(x1)µ(x2)].
Any linear combination µλ = Σ_{i=1}^n λi µ(xi) has Gaussian distribution N(mλ, σλ²) with
mλ = Σ_{i=1}^n λi E[µ(xi)] and σλ² = Σ_{i,j=1}^n λi λj E[µ(xi)µ(xj)] − mλ²
• Simulation: in order to simulate (µ(x1), . . . , µ(xn)):
- evaluate the mean vector Mi = E[µ(xi)] and the covariance matrix
Cij = E[µ(xi)µ(xj)]− E[µ(xi)]E[µ(xj)].
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Y = M + C^{1/2} X. The vector Y has the distribution of (µ(x1), . . . , µ(xn)).
Note: the computation of the square root may be expensive (use the Cholesky method).
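A minimal sketch of this simulation recipe, assuming a squared-exponential covariance E[µ(x)µ(x′)] = exp(−(x − x′)²/2) and zero mean (both illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(6)

# Grid points x_1, ..., x_n and an assumed squared-exponential covariance.
x = np.linspace(0.0, 10.0, 200)
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
M = np.zeros_like(x)  # assumed zero mean

# The Cholesky factor L (with L L^T = C) plays the role of C^{1/2};
# a small diagonal jitter keeps the factorization numerically stable.
L = np.linalg.cholesky(C + 1e-8 * np.eye(len(x)))

X = rng.standard_normal(len(x))  # independent N(0, 1) variables
Y = M + L @ X                    # one realization of (mu(x_1), ..., mu(x_n))
```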
Brownian motion
• Brownian motion (Wz)z≥0 (starting from 0) = real Gaussian process with mean 0
and covariance function
E[WzWz′ ] = z ∧ z′
The realizations of the Brownian motion are continuous but not differentiable.
The increments of the Brownian motion are independent:
if zn ≥ zn−1 ≥ · · · ≥ z1 ≥ z0 = 0, then (Wzn −Wzn−1, . . . ,Wz2 −Wz1 ,Wz1) are
independent Gaussian random variables with mean 0 and variance
E[(W_{zj} − W_{zj−1})²] = zj − zj−1
• Simulation: in order to simulate (Wh,W2h, ...,Wnh):
- evaluate the covariance matrix C = (Cjl)j,l=1,...,n with Cjl = (j ∧ l)h.
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Y = C^{1/2} X.
The vector Y has the distribution of (Wh,W2h, ...,Wnh).
• Simulation: in order to simulate (Wh,W2h, . . . ,Wnh) :
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- compute Yj = √h Σ_{i=1}^j Xi.
The vector Y has the distribution of (Wh,W2h, . . . ,Wnh)T .
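The same recipe in a few lines of numpy (the step h = 0.01 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(7)

h, n = 0.01, 5_000  # step h and number of points: simulate (W_h, ..., W_{nh})

X = rng.standard_normal(n)     # independent N(0, 1) variables
W = np.sqrt(h) * np.cumsum(X)  # Y_j = sqrt(h) (X_1 + ... + X_j)

print(W[-1] ** 2, n * h)  # one path; over many paths E[W_{nh}^2] = nh
```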
[Figure: a realization of the Brownian motion z ↦ Wz for z ∈ [0, 50].]
A few results on Gaussian processes
Let (µ(x))x∈Rd be a Gaussian random process with mean zero and covariance
function E[µ(x)µ(x′)] = R(x,x′).
• The smoothness of the trajectory is related to the smoothness of the covariance
function R.
If R ∈ C^{k,k}(R^d), then the trajectories are of class C^{k/2−ε}.
The derivative (∂_{x1} µ(x))_{x∈R^d} is a Gaussian process with mean zero and covariance function E[∂_{x1}µ(x) ∂_{x1′}µ(x′)] = ∂_{x1}∂_{x1′} R(x, x′).
• The local extrema have the shape of the covariance function.
Given the existence of an extremum at x0 with amplitude µ0, then (µ(x))x∈Rd is a
Gaussian process with mean
E[µ(x) | µ(x0) = µ0, ∇µ(x0) = 0] = (R(x, x0) / R(x0, x0)) µ0
and covariance
Cov(µ(x), µ(x′) | µ(x0) = µ0, ∇µ(x0) = 0)
= R(x, x′) − R(x, x0) R(x0, x′) / R(x0, x0) − ∇_{x0}R(x, x0)^T H(x0, x0)^{−1} ∇_{x0}R(x0, x′)
where H(x, x′) = (∂²_{xi, xj′} R(x, x′))_{i,j=1}^d.
Stationary random process
• (µ(x))x∈Rd is stationary if (µ(x+ x0))x∈Rd has the same distribution as (µ(x))x∈Rd
for any x0 ∈ Rd.
Sufficient and necessary condition:
E[φ(µ(x1), . . . , µ(xn))] = E[φ(µ(x0 + x1), . . . , µ(x0 + xn))]
for any n, x0, . . . , xn ∈ R^d, φ ∈ Cb(R^n, R).
• Example: Gaussian process µ(x) with mean zero E[µ(x)] = 0 ∀x and covariance
function E[µ(x′)µ(x′ + x)] = c(x).
• Bochner's theorem: a function c(x) is a covariance function of a stationary process if and only if its Fourier transform ĉ(k) is nonnegative:
ĉ(k) = ∫_{R^d} e^{ik·x} c(x) dx
• Spectral representation (of a real-valued stationary Gaussian process):
µ(x) = (1/(2π)^d) ∫_{R^d} e^{−ik·x} √(ĉ(k)) n̂_k dk
with n̂_k a complex white noise, i.e.:
n̂_k complex-valued, Gaussian, n̂_{−k} = n̂*_k, E[n̂_k] = 0, and E[n̂_k n̂*_{k′}] = (2π)^d δ(k − k′), where * denotes complex conjugation.
(The representation is formal; one should use stochastic integrals dŴ_k = n̂_k dk.)
We have n̂_k = ∫ e^{ik·x} n(x) dx where n(x) is a real white noise, i.e.:
n(x) real-valued, Gaussian, E[n(x)] = 0, and E[n(x)n(x′)] = δ(x − x′).
(In 1D, formally, n(x) = dWx/dx.)
• Simulation (d = 1): in order to simulate (µ(x1), . . . , µ(xn)), xj = (j − 1)h:
- compute the covariance vector C = (c(xi))i=1,...,n.
- generate a random vector X = (Xi)i=1,...,n of n independent Gaussian random
variables with mean 0 and variance 1.
- filter with the square root of the Fourier transform of C:
Y = IDFT( √(DFT(C)) × DFT(X) )
→ Y is a realization of (µ(x1), . . . , µ(xn)) (in practice, use FFT and IFFT).
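A sketch of the FFT recipe, assuming the covariance c(x) = exp(−x²/2) (an illustrative choice); the covariance vector is wrapped so that the construction is exact for the periodized process:

```python
import numpy as np

rng = np.random.default_rng(8)

n, h = 4096, 0.05                # grid x_j = (j - 1) h
x = np.arange(n) * h

# Assumed covariance c(x) = exp(-x^2 / 2), wrapped so the vector C is
# symmetric under j -> n - j (the recipe is exact for the periodized process).
C = np.exp(-0.5 * np.minimum(x, n * h - x) ** 2)

X = rng.standard_normal(n)

# Filter with the square root of the spectrum (nonnegative by Bochner;
# tiny negative round-off values are clipped).
spec = np.clip(np.fft.fft(C).real, 0.0, None)
Y = np.fft.ifft(np.sqrt(spec) * np.fft.fft(X)).real  # realization of mu(x_j)
```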
Random differential equations and ordinary differential equations
Goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε)
for a fairly general random process (F(z))z≥0.
Method of averaging: Toy model
Let X^ε(z) ∈ R be the solution of
dX^ε/dz = F(z/ε)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = F̄ and E[(Fi − F̄)²] = σ².
(z ↦ t: a particle in a random velocity field)
[Figure: a realization of F(z/ε) and the corresponding trajectory X^ε(z), for ε = 0.2.]
X^ε(z) = ∫_0^z F(s/ε) ds = ε ∫_0^{z/ε} F(s) ds = ε Σ_{i=1}^{[z/ε]} Fi + ε ∫_{[z/ε]}^{z/ε} F(s) ds

Write the first term as ε[z/ε] × (1/[z/ε]) Σ_{i=1}^{[z/ε]} Fi: as ε → 0, the factor ε[z/ε] converges to z, while by the Law of Large Numbers the normalized sum converges almost surely to E[F1] = F̄. The second term equals ε(z/ε − [z/ε]) F_{[z/ε]+1} and converges almost surely to 0.

Thus:
X^ε(z) → X̄(z) as ε → 0, with dX̄/dz = F̄.
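A numerical illustration of this averaging (with the uniform Fi of the figures, so F̄ = 1/2; grid sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(9)

def X_eps(eps, z_max=2.0, n_grid=20_000):
    """Integrate dX/dz = F(z/eps), F piecewise constant on unit intervals."""
    F = rng.random(int(z_max / eps) + 1)   # F_i uniform on (0, 1), F_bar = 1/2
    z = np.linspace(0.0, z_max, n_grid, endpoint=False)
    return np.cumsum(F[(z / eps).astype(int)]) * (z_max / n_grid)

for eps in (0.2, 0.05, 0.01):
    print(eps, X_eps(eps)[-1])  # tends to F_bar * z_max = 1.0 as eps -> 0
```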
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.05.]
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.01.]
Goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε)
for a stationary random process (F(z))z≥0.
The previous analysis can be extended provided
(1/Z) ∫_0^Z F(z) dz → F̄ as Z → ∞
(i.e. F is ergodic).
Ergodic theory
Ergodic Theorem. If F satisfies the ergodic hypothesis, then
(1/Z) ∫_0^Z F(z) dz → F̄ a.s. as Z → ∞, where F̄ = E[F(0)] = E[F(z)]
Ergodic hypothesis = "the orbit (F(z))z≥0 visits all of phase space".
Ergodic theorem = "the spatial average is equivalent to the statistical average".
Counter-example for the ergodic hypothesis:
Let F1 and F2 be stationary processes, both satisfying the ergodic theorem, with means F̄j = E[Fj(z)], j = 1, 2, and F̄1 ≠ F̄2.
Let χ be a Bernoulli random variable (independent of F1 and F2) with P(χ = 0) = P(χ = 1) = 1/2.
Let F(z) = χ F1(z) + (1 − χ) F2(z).
F is a stationary process with mean F̄ = (F̄1 + F̄2)/2.
(1/Z) ∫_0^Z F(z) dz = χ ((1/Z) ∫_0^Z F1(z) dz) + (1 − χ) ((1/Z) ∫_0^Z F2(z) dz) → χ F̄1 + (1 − χ) F̄2 as Z → ∞,
which is a random limit different from F̄.
The limit depends on χ because F has been trapped in a part of phase space.
Mean square theory
Let F be a stationary process with E[F(0)²] < ∞. Its covariance function is
c(z) = E[(F(z0) − F̄)(F(z0 + z) − F̄)]
• c is independent of z0 by stationarity of F.
• |c(z)| ≤ c(0) by Cauchy-Schwarz:
|c(z)| ≤ E[(F(0) − F̄)²]^{1/2} E[(F(z) − F̄)²]^{1/2} = c(0)
• c is an even function, c(−z) = c(z):
c(−z) = E[(F(z0 − z) − F̄)(F(z0) − F̄)] =(z0 = z) E[(F(0) − F̄)(F(z) − F̄)] = c(z)
• The Fourier transform of c is nonnegative (Bochner).
Proposition. If ∫_0^∞ |c(z)| dz < ∞, then
E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 0 as Z → ∞
We can show that
Z E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 2 ∫_0^∞ c(z) dz as Z → ∞
Proof:
E[((1/Z) ∫_0^Z F(z) dz − F̄)²] = E[(1/Z²) ∫_0^Z dz1 ∫_0^Z dz2 (F(z1) − F̄)(F(z2) − F̄)]
= (1/Z²) ∫_0^Z dz1 ∫_0^Z dz2 c(z1 − z2)
= (2/Z) ∫_0^Z ((Z − z)/Z) c(z) dz
Using Lebesgue's theorem:
Z E[((1/Z) ∫_0^Z F(z) dz − F̄)²] → 2 ∫_0^∞ c(z) dz as Z → ∞
Let F be a stationary zero-mean random process. Denote
S_k(Z) = (1/√Z) ∫_0^Z e^{ikz} F(z) dz
We can show similarly
E[|S_k(Z)|²] → 2 ∫_0^∞ c(z) cos(kz) dz = ∫_{−∞}^{+∞} c(z) e^{ikz} dz as Z → ∞
Simplified form of Bochner’s theorem: If F is a stationary process with integrable
covariance function, then the Fourier transform of its covariance function is
nonnegative.
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = F(z/ε, X^ε(z))
Method of averaging: Khasminskii theorem
dX^ε/dz = F(z/ε, X^ε), X^ε(0) = x0
Assume:
x ↦ F(z, x) is Lipschitz,
z ↦ F(z, x) is stationary and ergodic.
Define:
F̄(x) = E[F(z, x)]
Let X̄ be the solution of dX̄/dz = F̄(X̄), X̄(0) = x0.
Theorem: for any Z > 0,
sup_{z∈[0,Z]} E[|X^ε(z) − X̄(z)|] → 0 as ε → 0
[1] R. Z. Khasminskii, Theory Probab. Appl. 11 (1966), 211-228.
Application to wave propagation in random media
• Let (p^ε(t, z), u^ε(t, z)) be the solution of the one-dimensional acoustic wave equations in a random medium:
ρ(z/ε) ∂u^ε/∂t + ∂p^ε/∂z = 0
∂p^ε/∂t + κ(z/ε) ∂u^ε/∂z = 0
• The Fourier components (in time) of the solution,
û^ε(ω, z) = ∫ u^ε(t, z) e^{iωt} dt, p̂^ε(ω, z) = ∫ p^ε(t, z) e^{iωt} dt,
satisfy dX^ε/dz = F(z/ε, X^ε), where
X^ε = (p̂^ε, û^ε)^T, F(z, X) = −iω ( 0  ρ(z) ; 1/κ(z)  0 ) X
• Apply the method of averaging =⇒ X^ε(ω, z) converges to X̄(ω, z), solution of
dX̄/dz = −iω ( 0  ρ̄ ; 1/κ̄  0 ) X̄, with ρ̄ = E[ρ], κ̄ = (E[κ^{−1}])^{−1}
→ deterministic "effective medium" with parameters ρ̄, κ̄.
• Take an inverse Fourier transform: (p^ε(t, z), u^ε(t, z)) converges to (p̄(t, z), ū(t, z)), solution of the homogeneous effective system
ρ̄ ∂ū/∂t + ∂p̄/∂z = 0
∂p̄/∂t + κ̄ ∂ū/∂z = 0
The propagation speed of (p̄, ū) is c̄ = √(κ̄/ρ̄).
→ the effective speed of the acoustic wave (p^ε, u^ε) as ε → 0 is c̄.
• This analysis is just a small piece of homogenization theory; cf. the book The Theory of Composites by G. Milton.
Example: bubbles in water
ρa = 1.2·10³ g/m³, κa = 1.4·10⁸ g/s²/m, ca = 340 m/s.
ρw = 1.0·10⁶ g/m³, κw = 2.0·10¹² g/s²/m, cw = 1425 m/s.
If the typical pulse frequency is 10 Hz - 30 kHz, then the typical wavelength is 1 cm - 100 m, which is larger than the bubble sizes =⇒ the effective medium theory can be applied.
ρ̄ = E[ρ] = φ ρa + (1 − φ) ρw = 9.9·10⁵ g/m³ if φ = 1%, 9·10⁵ g/m³ if φ = 10%
κ̄ = (E[κ^{−1}])^{−1} = (φ/κa + (1 − φ)/κw)^{−1} = 1.4·10¹⁰ g/s²/m if φ = 1%, 1.4·10⁹ g/s²/m if φ = 10%
where φ = volume fraction of air.
Thus c̄ = 120 m/s if φ = 1% and c̄ = 37 m/s if φ = 10%.
→ the average sound speed c̄ can be much smaller than ess inf(c).
The converse is impossible:
E[c^{−1}] = E[κ^{−1/2} ρ^{1/2}] ≤ E[κ^{−1}]^{1/2} E[ρ]^{1/2} = c̄^{−1}
Thus c̄ ≤ E[c^{−1}]^{−1} ≤ ess sup(c).
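The slide's arithmetic can be reproduced in a few lines (constants taken from the slide; the printed speeds match 120 m/s and 37 m/s up to rounding):

```python
import numpy as np

# Effective-medium parameters for air bubbles in water (values from the slide).
rho_a, kappa_a = 1.2e3, 1.4e8    # air:   g/m^3 and g/s^2/m
rho_w, kappa_w = 1.0e6, 2.0e12   # water: g/m^3 and g/s^2/m

for phi in (0.01, 0.10):         # volume fraction of air
    rho_bar = phi * rho_a + (1 - phi) * rho_w
    kappa_bar = 1.0 / (phi / kappa_a + (1 - phi) / kappa_w)
    c_bar = np.sqrt(kappa_bar / rho_bar)
    print(phi, rho_bar, kappa_bar, c_bar)  # c_bar ~ 120 m/s and ~ 37-40 m/s
```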
Random differential equations and Brownian motion
Toy model
dX^ε/dz = F(z/ε)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = F̄ = 0 and E[(Fi − F̄)²] = σ².
[Figure: a realization of F(z/ε) and X^ε(z), for ε = 0.05.]
For any z ∈ [0, Z], we have
X^ε(z) → X̄(z) as ε → 0, with dX̄/dz = F̄ = 0.
No macroscopic evolution is noticeable.
→ it is necessary to look at larger z to get an effective behavior:
rescale z ↦ z/ε, i.e. set X̃^ε(z) = X^ε(z/ε), so that
dX̃^ε/dz = (1/ε) F(z/ε²)
Diffusion-approximation: Toy model
dX^ε/dz = (1/ε) F(z/ε²)
with F(z) = Σ_{i=1}^∞ Fi 1_{[i−1,i)}(z), the Fi independent random variables with E[Fi] = 0 and E[Fi²] = σ².
[Figure: a realization of (1/ε) F(z/ε²) and X^ε(z), for ε = 0.05.]
X^ε(z) = ∫_0^z (1/ε) F(s/ε²) ds = ε ∫_0^{z/ε²} F(s) ds = ε Σ_{i=1}^{[z/ε²]} Fi + ε ∫_{[z/ε²]}^{z/ε²} F(s) ds

Write the first term as ε√[z/ε²] × (1/√[z/ε²]) Σ_{i=1}^{[z/ε²]} Fi: as ε → 0, the factor ε√[z/ε²] converges to √z, while by the Central Limit Theorem the normalized sum converges in law to N(0, σ²). The second term equals ε(z/ε² − [z/ε²]) F_{[z/ε²]+1} and converges almost surely to 0.

Thus: X^ε(z) converges in distribution as ε → 0 to X(z), whose distribution is N(0, σ²z).
With some more work: The process (X^ε(z))z≥0 converges in distribution to a Brownian motion (σWz)z≥0.
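A numerical illustration of the diffusive scaling (Gaussian Fi are an assumed choice; any centered distribution with variance σ² behaves the same):

```python
import numpy as np

rng = np.random.default_rng(10)

eps, z_max, sigma = 0.05, 2.0, 1.0
n = int(z_max / eps**2)          # number of unit intervals covered by z/eps^2

# X^eps(z_max) ~ eps * (F_1 + ... + F_n), with centered F_i of variance sigma^2.
trials = 20_000
X_end = eps * rng.normal(0.0, sigma, size=(trials, n)).sum(axis=1)

print(X_end.var())  # close to sigma^2 * z_max = 2.0, as predicted by the CLT
```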
Diffusion processes and stochastic differential equations
Example: Brownian motion
[Figure: a realization of the Brownian motion z ↦ Wz.]
Wz (started from 0): zero-mean Gaussian process with covariance E[Wz Wz′] = z ∧ z′.
Its increments are independent and satisfy
E[(W_{z+h} − Wz)²] = h
Let φ be a bounded real function:
E[φ(x + Wz)] = ∫ (1/√(2πz)) exp(−w²/(2z)) φ(x + w) dw = ∫ pz(x, y) φ(y) dy,
with pz(x, y) = (1/√(2πz)) exp(−(y − x)²/(2z)).
For x, z fixed, y ↦ pz(x, y) is the pdf of the random variable x + Wz.
Note that pz(x, y) is the kernel of the heat operator.
Example: Brownian motion
• The moment
u(z, x) := E[φ(x + Wz)] = ∫ pz(x, y) φ(y) dy, pz(x, y) = (1/√(2πz)) exp(−(y − x)²/(2z)),
satisfies the Kolmogorov (backward) equation
∂u/∂z = (1/2) ∂²u/∂x², u(z = 0, x) = φ(x)
Converse: the partial differential equation
∂u/∂z = (1/2) ∂²u/∂x², u(z = 0, x) = φ(x)
has the probabilistic representation u(z, x) = E[φ(x + Wz)].
This converse is useful for Monte Carlo simulation techniques for solving PDEs.
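A minimal sketch of such a Monte Carlo solver for the heat equation, with the assumed initial condition φ = cos (chosen because the exact solution e^{−z/2} cos x is known):

```python
import numpy as np

rng = np.random.default_rng(11)

# Monte Carlo solver for du/dz = (1/2) d^2u/dx^2, u(0, x) = phi(x),
# via the representation u(z, x) = E[phi(x + W_z)].
def u_monte_carlo(z, x, phi, n_samples=1_000_000):
    w = np.sqrt(z) * rng.standard_normal(n_samples)  # W_z ~ N(0, z)
    return phi(x + w).mean()

phi = np.cos  # initial condition with a known exact solution
z, x = 0.5, 1.0
print(u_monte_carlo(z, x, phi))    # Monte Carlo estimate
print(np.exp(-z / 2) * np.cos(x))  # exact solution u(z, x) = e^{-z/2} cos x
```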
• The pdf y ↦ pz(x, y) of x + Wz satisfies the Fokker-Planck (Kolmogorov forward) equation as a function of z and y:
∂pz/∂z = (1/2) ∂²pz/∂y², p_{z=0}(x, y) = δ(y − x)
Stochastic differential equations
Let X(z) be the solution of the one-dimensional stochastic differential equation
X(z) = x + ∫_0^z σ(X(s)) dWs + ∫_0^z b(X(s)) ds
Existence and uniqueness are ensured provided b and σ are C¹ with bounded derivatives.
X(z) is continuous a.s., square integrable, and adapted (it depends on (Ws)0≤s≤z).
• Ito integral:
∫_0^z φ(X(s)) dWs = lim_{∆z→0} Σ_k φ(X(zk)) (W_{zk+1} − W_{zk}), 0 = z0 < z1 < · · · < zn = z
Good properties: martingale (mean zero), Ito's isometry:
E[(∫_0^z φ(X(s)) dWs)²] = ∫_0^z E[φ(X(s))²] ds
• Stratonovich integral:
∫_0^z φ(X(s)) ◦ dWs = lim_{∆z→0} Σ_k ((φ(X(zk+1)) + φ(X(zk)))/2) (W_{zk+1} − W_{zk})
• Relation:
∫_0^z φ(X(s)) ◦ dWs = ∫_0^z φ(X(s)) dWs + (1/2) ∫_0^z σ(X(s)) φ′(X(s)) ds
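The slides do not give a discretization scheme; a standard one is Euler-Maruyama, sketched here with illustrative coefficients (not from the slides):

```python
import numpy as np

rng = np.random.default_rng(12)

def euler_maruyama(x0, sigma, b, z_max, n_steps):
    """One path of dX = sigma(X) dW + b(X) dz (Euler-Maruyama scheme)."""
    dz = z_max / n_steps
    dW = np.sqrt(dz) * rng.standard_normal(n_steps)  # Brownian increments
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + sigma(x[k]) * dW[k] + b(x[k]) * dz
    return x

# Illustrative coefficients: dX = 0.3 X dW + 0.1 X dz, X(0) = 1.
path = euler_maruyama(1.0, lambda x: 0.3 * x, lambda x: 0.1 * x, 1.0, 1_000)
print(path[-1])
```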
X(z) = x + ∫_0^z σ(X(s)) dWs + ∫_0^z b(X(s)) ds
• Ito's formula:
φ(X(z)) = φ(x) + ∫_0^z σ(X(s)) φ′(X(s)) dWs + ∫_0^z b(X(s)) φ′(X(s)) ds + (1/2) ∫_0^z σ(X(s))² φ″(X(s)) ds
• X(z) is a diffusion process with the generator Q = (1/2) σ²(x) ∂²/∂x² + b(x) ∂/∂x
• The moment u(z, x) := E[φ(X(z)) | X(0) = x] satisfies the Kolmogorov equation:
∂u/∂z = Qu, u(z = 0, x) = φ(x)
Proof: Let u be the solution of the Kolmogorov equation, and let z0 > 0. By Ito's formula,
du(z0 − z, X(z)) = −∂z u(z0 − z, X(z)) dz + ∂x u(z0 − z, X(z)) σ(X(z)) dWz + ∂x u(z0 − z, X(z)) b(X(z)) dz + (1/2) σ²(X(z)) ∂²_x u(z0 − z, X(z)) dz
= ∂x u(z0 − z, X(z)) σ(X(z)) dWz
Therefore E[u(z0 − z, X(z))] is constant in z:
u(z0, x) = E[u(z0, X(0)) | X(0) = x] = E[u(0, X(z0)) | X(0) = x] = E[φ(X(z0)) | X(0) = x]
• The pdf y ↦ pz(x, y) of X(z) (starting from X(0) = x) satisfies the Fokker-Planck equation as a function of z and y:
∂pz/∂z = Q* pz, p_{z=0}(x, y) = δ(y − x)
where Q* is the adjoint operator of Q:
Q* p(y) = (1/2) ∂²/∂y² (σ²(y) p(y)) − ∂/∂y (b(y) p(y))
Proof: Consider a test function φ. Taking the expectation in Ito's formula
φ(X(z)) = φ(x) + ∫_0^z σ(X(s)) φ′(X(s)) dWs + ∫_0^z b(X(s)) φ′(X(s)) ds + (1/2) ∫_0^z σ(X(s))² φ″(X(s)) ds,
one gets
E[φ(X(z))] = φ(x) + ∫_0^z E[Qφ(X(s))] ds → ∂E[φ(X(z))]/∂z = E[Qφ(X(z))]
Therefore
∫ φ(y) ∂z pz(x, y) dy = ∫ Qφ(y) pz(x, y) dy = ∫ φ(y) Q* pz(x, y) dy
Since this holds true for any φ, one finds ∂z pz(x, y) = Q* pz(x, y).
Diffusion processes
• Let σ and b be C¹(R, R) functions with bounded derivatives, and let Wz be a Brownian motion.
The solution X(z) of the 1D stochastic differential equation
dX(z) = σ(X(z)) dWz + b(X(z)) dz
is a diffusion process with the generator
Q = (1/2) σ²(x) ∂²/∂x² + b(x) ∂/∂x
• Let σ ∈ C¹(R^n, R^{n×m}) and b ∈ C¹(R^n, R^n) with bounded derivatives, and let Wz be an m-dimensional Brownian motion.
The solution X(z) of the stochastic differential equation
dX(z) = σ(X(z)) dWz + b(X(z)) dz
is a diffusion process with the generator
Q = (1/2) Σ_{i,j=1}^n aij(x) ∂²/∂xi∂xj + Σ_{i=1}^n bi(x) ∂/∂xi
with a = σσ^T.
Random differential equations and stochastic differential equations
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = (1/ε) F(z/ε²)
for a fairly general process (F(z))z≥0 with E[F(z)] = 0.
Write
X^ε(z) = (1/ε) ∫_0^z F(s/ε²) ds = ε √(z/ε²) × (1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds
We know that
E[(1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds] = 0,
lim_{ε→0} E[((1/√(z/ε²)) ∫_0^{z/ε²} F(s) ds)²] = lim_{Z→∞} Z E[((1/Z) ∫_0^Z F(s) ds)²] = 2 ∫_0^∞ E[F(0)F(u)] du.
The Gaussian property of the limit of X^ε(z) is ensured by an invariance principle.
Conclusion:
(1/ε) ∫_0^z F(s/ε²) ds → √2 σ Wz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = ∫_0^∞ E[F(0)F(u)] du
Next goal: determine the limit lim_{ε→0} X^ε(z) where
dX^ε/dz = (1/ε) F(z/ε², X^ε(z))
when
E[F(z, x)] = 0 ∀x
Diffusion-approximation
dX^ε/dz(z) = (1/ε) F(z/ε², X^ε(z)), X^ε(0) = x0 ∈ R^d.
F stationary, centered, and ergodic (in z): E[F(z, x)] = 0.
Theorem: The processes (X^ε(z))z≥0 converge in distribution in C⁰([0,∞), R^d) to the diffusion process X with the generator L:
L = Σ_{i,j=1}^d aij(x) ∂²/∂xi∂xj + Σ_{j=1}^d bj(x) ∂/∂xj
with
aij(x) = ∫_0^∞ E[Fi(0, x) Fj(u, x)] du
bj(x) = Σ_{i=1}^d ∫_0^∞ E[Fi(0, x) ∂xi Fj(u, x)] du
It means that the pdf pz(x) of X(z) satisfies:
∂pz/∂z = Σ_{i,j=1}^d ∂²/∂xi∂xj (aij(x) pz) − Σ_{j=1}^d ∂/∂xj (bj(x) pz), p_{z=0}(x) = δ(x − x0)
Diffusion-approximation - the one-dimensional case
dX^ε/dz(z) = (1/ε) f(z/ε²) h(X^ε(z)), X^ε(0) = x0 ∈ R.
f stationary, centered, and ergodic: E[f(z)] = 0.
Theorem: The processes (X^ε(z))z≥0 converge in distribution in C⁰([0,∞), R) to the diffusion process X with the generator L:
L = a(x) ∂²/∂x² + b(x) ∂/∂x
with
a(x) = (1/2) σ² h²(x), b(x) = (1/2) σ² h(x) h′(x), σ² = 2 ∫_0^∞ E[f(0)f(u)] du
It means that X(z) is the solution of the SDE
dX(z) = σ h(X(z)) dWz + b(X(z)) dz
or
dX(z) = σ h(X(z)) ◦ dWz
Remember that ∫_0^z (1/ε) f(s/ε²) ds converges in distribution to σWz.
→ the "natural" integral is Stratonovich.
Diffusion-approximation - a simple example
dX^ε/dz = (1/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
Method 1. Apply the theorem → (X^ε(z))z≥0 converges in distribution to the diffusion process (X(z))z≥0, solution of the SDE
dX(z) = σ X(z) dWz + (σ²/2) X(z) dz
where
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Method 2. We have
X^ε(z) = x0 exp((1/ε) ∫_0^z µ(s/ε²) ds)
We know that
(1/ε) ∫_0^z µ(s/ε²) ds → σWz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Therefore
X^ε(z) → X(z) := x0 exp(σWz) as ε → 0
in distribution. We have (by Ito's formula):
dX(z) = σ X(z) dWz + (σ²/2) X(z) dz
Averaging and diffusion-approximation - the linear case
dX^ε/dz(z) = (1/ε) F(z/ε²) X^ε(z) + G(z/ε²) X^ε(z), X^ε(0) = x0 ∈ R^d.
F matrix-valued, stationary, ergodic, and centered: E[F(z)] = 0.
G matrix-valued, stationary, and ergodic: E[G(z)] = Ḡ.
Theorem: The expectations (E[X^ε(z)])z≥0 converge in C⁰([0,∞), R^d) to the solution (X̄(z))z≥0 of the linear system
dX̄/dz(z) = F̃ X̄(z) + Ḡ X̄(z), X̄(0) = x0 ∈ R^d,
with
F̃jl = Σ_{k=1}^d ∫_0^∞ E[Fkl(0) Fjk(u)] du
Averaging and diffusion-approximation - a simple example
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
Method 1. Apply the theorem to the real and imaginary parts:
d/dz (X^ε_r, X^ε_i)^T = (1/ε) µ(z/ε²) ( 0  −1 ; 1  0 ) (X^ε_r, X^ε_i)^T
The expectation converges to the solution of
d/dz (X̄r, X̄i)^T = (σ²/2) ( −1  0 ; 0  −1 ) (X̄r, X̄i)^T
where
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
This shows that
E[X^ε(z)] → X̄(z) as ε → 0
where X̄(z) is the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
Method 2. We have
X^ε(z) = x0 exp((i/ε) ∫_0^z µ(s/ε²) ds)
We know that
(1/ε) ∫_0^z µ(s/ε²) ds → σWz as ε → 0
in distribution, where (Wz)z≥0 is a Brownian motion and
σ² = 2 ∫_0^∞ E[µ(0)µ(u)] du
Therefore
X^ε(z) → X(z) := x0 exp(iσWz) as ε → 0
in distribution. Note that
E[exp(iσWz)] = (1/√(2πz)) ∫_{−∞}^{+∞} e^{iσw} e^{−w²/(2z)} dw = e^{−σ²z/2}
This shows that
E[X^ε(z)] → X̄(z) = x0 e^{−σ²z/2} as ε → 0
where X̄(z) is the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
We have shown that the expectation of the solution of
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
converges as ε → 0 to the solution of
dX̄/dz = −(σ²/2) X̄, X̄(z = 0) = x0
→ random phases give rise to effective damping.
We have, in distribution: the solution of
dX^ε/dz = (i/ε) µ(z/ε²) X^ε, X^ε(z = 0) = x0
converges as ε → 0 to the solution of
dX = iσX ◦ dWz, X(z = 0) = x0
→ note again the Stratonovich integral. Equivalently,
dX = iσX dWz − (σ²/2) X dz, X(z = 0) = x0