MA4F7 Brownian Motion


March 16, 2013

Contents

1 Brownian Sample Paths
  1.1 Brownian Motion as a Gaussian Process
  1.2 Growth rate of paths
  1.3 Regularity
2 Brownian motion as a Markov Process
  2.1 Markov transition functions
  2.2 Strong Markov Process
  2.3 Arcsine Laws for Brownian motion
3 Brownian Martingales
4 Donsker's theorem
5 Up Periscope

These notes are based on the 2013 MA4F7 Brownian Motion course, taught by Roger Tribe, typeset by Matthew Egginton. No guarantee is given that they are accurate or applicable, but hopefully they will assist your study. Please report any errors, factual or typographical, to [email protected]


MA4F7 Brownian motion Lecture Notes Autumn 2012

The key aim is to show that scaled random walks converge to a limit called Brownian motion. In 1D, $\mathbb{P}(t \mapsto B_t \text{ is nowhere differentiable}) = 1$. We have $E(B_t) = 0$ and $E(B_t^2) = t$, and so $t \mapsto B_t$ is not differentiable at $0$; shifting gives the same at any $t$. We also have the arcsine law
\[ \mathbb{P}\Big( \int_0^1 \chi(B_s > 0)\,ds \in dx \Big) = \frac{1}{\pi\sqrt{x(1-x)}}\,dx. \]

Also $\mathbb{P}(x + B \text{ exits } D \text{ in } A) = U(x)$, where $\Delta U = 0$ in $D$, $U = 1$ on $A$ and $U = 0$ on $\partial D \setminus A$. For an annulus with inner radius $a$ and outer radius $b$,
\[ U(x) = \frac{\log b - \log|x|}{\log b - \log a}, \]
and this converges to $1$ as $b \to \infty$. Thus the probability that planar Brownian motion hits any ball is $1$. For random walks, $\mathbb{P}(x + \text{walk exits at } y) = U(x)$ where
\[ U(x) = \tfrac14\big(U(x+e_1) + U(x-e_1) + U(x+e_2) + U(x-e_2)\big), \]
which can be thought of as a discrete Laplacian. Thus we have a nice equation for Brownian motion, but a not so nice one for random walks.

1 Brownian Sample Paths

Our standard space is a probability space $(\Omega, \mathcal{F}, \mathbb{P})$.

Definition 1.1 A stochastic process $(B_t, t \ge 0)$ is called a Brownian Motion on $\mathbb{R}$ if

1. $t \mapsto B_t$ is continuous for a.s. $\omega$;

2. for $0 \le t_1 < t_2 < \dots < t_n$, the increments $B_{t_2} - B_{t_1}, \dots, B_{t_n} - B_{t_{n-1}}$ are independent;

3. for $0 \le s < t$, $B_t - B_s$ is Gaussian with distribution $N(0, t-s)$.

But does this even exist, and if it does, do the above properties characterise $B$? The answer to both is yes, and we will show this later.

We now define the terms used in the above definition, to avoid any confusion.

Definition 1.2 A random variable $Z$ is a measurable function $Z : \Omega \to \mathbb{R}$. In full, $\mathbb{R}$ has the Borel σ-algebra $\mathcal{B}(\mathbb{R})$, and measurable means: if $A \in \mathcal{B}(\mathbb{R})$ then $Z^{-1}(A) \in \mathcal{F}$.

Definition 1.3 A stochastic process is a family of random variables $(X_t, t \ge 0)$ all defined on $\Omega$.

We do not worry what $\Omega$ is; we are only interested in the law/distribution of $Z$, i.e. $\mathbb{P}(Z \in A)$ or $E(f(Z))$, where $\mathbb{P}(Z \in A) = \mathbb{P}\{\omega : Z(\omega) \in A\}$.

If we fix $\omega$, the function $t \mapsto B_t(\omega)$ is called the sample path for $\omega$. The first property above means that the sample path is continuous for almost all $\omega$. Sadly, some books say that $\mathbb{P}\{\omega : t \mapsto B_t(\omega) \text{ is continuous}\} = 1$ — but how do we know this set is measurable?

Definition 1.4 A real random variable $Z$ is Gaussian $N(\mu, \sigma^2)$ if it has density
\[ \mathbb{P}(Z \in dz) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(z-\mu)^2}{2\sigma^2}}\,dz \]
for $\sigma^2 > 0$, meaning one integrates both sides over a set $A$ to get the probability of $A$. If $\sigma = 0$ then $\mathbb{P}(Z = \mu) = 1$.


1.0.1 Related Animals

The Brownian Bridge is $X_t = B_t - tB_1$ for $t \in [0,1]$.

An Ornstein-Uhlenbeck process is one, for $C > 0$, of the form $X_t = e^{-Ct} B_{e^{2Ct}}$, and is defined for $t \in \mathbb{R}$. We will check that this $X$ is stationary; also $(X_{t+T} : t \ge 0)$ is still an O-U process. It arises as the solution of the simplest SDE, $\frac{dX}{dt} = -CX_t + \sqrt{2C}\,\frac{dB}{dt}$, or in integral form, $X_t = X_0 - C\int_0^t X_s\,ds + \int_0^t \sqrt{2C}\,dB_s$.

A Brownian motion on $\mathbb{R}^d$ is a process $(B_t : t \ge 0)$ with $B_t = (B_t^1, \dots, B_t^d)$, where each $t \mapsto B_t^k$ is a Brownian motion on $\mathbb{R}$ and they are independent.
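None of this simulation material is in the original notes, but the definition translates directly into code: a sketch (plain Python, hypothetical helper name `brownian_path`) that builds a path from independent $N(0, dt)$ increments, properties 2 and 3 of Definition 1.1, and checks empirically that $B_1 \sim N(0,1)$.

```python
import math
import random

def brownian_path(n_steps, T=1.0, rng=random):
    """Simulate a Brownian path on [0, T] at n_steps + 1 grid points,
    using independent Gaussian increments with variance dt."""
    dt = T / n_steps
    b = [0.0]
    for _ in range(n_steps):
        b.append(b[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return b

random.seed(0)
ends = [brownian_path(100)[-1] for _ in range(20000)]
mean = sum(ends) / len(ends)
var = sum(x * x for x in ends) / len(ends)
# B_1 should be (exactly, in distribution) N(0, 1): mean near 0, variance near 1
```

Continuity of the limiting path is of course not visible at a finite grid; the simulation only captures the finite-dimensional distributions.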

1.1 Brownian Motion as a Gaussian Process

Proposition 1.5 (Facts about Gaussians)

1. If $Z \overset{D}{\sim} N(\mu, \sigma^2)$ then for $c \ge 0$ we have $cZ \overset{D}{\sim} N(c\mu, c^2\sigma^2)$.

2. If $Z_1 \overset{D}{\sim} N(\mu_1, \sigma_1^2)$ and $Z_2 \overset{D}{\sim} N(\mu_2, \sigma_2^2)$ are independent, then $Z_1 + Z_2 \overset{D}{\sim} N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$.

3. If $Z_k \overset{D}{\sim} N(\mu_k, \sigma_k^2)$ and $Z_k \to Z$, then $\mu = \lim_{k\to\infty}\mu_k$ and $\sigma^2 = \lim_{k\to\infty}\sigma_k^2$ exist and $Z \overset{D}{\sim} N(\mu, \sigma^2)$.

The convergence above can be any one of the following.

1. Almost sure convergence: $Z_k \overset{a.s.}{\to} Z$ means $\mathbb{P}(\omega : Z_k(\omega) \to Z(\omega)) = 1$.

2. In probability: $Z_k \overset{prob}{\to} Z$ means $\mathbb{P}(|Z_k - Z| > \varepsilon) \to 0$ as $k \to \infty$, for all $\varepsilon > 0$.

3. In distribution: $Z_k \overset{D}{\to} Z$ means $E(f(Z_k)) \to E(f(Z))$ for every continuous bounded $f$.

Example 1.1 $I = \int_0^1 B_t\,dt$ is a Gaussian variable. Indeed
\[ I = \lim_{N\to\infty} \frac{1}{N}\big(B_{1/N} + B_{2/N} + \dots + B_{N/N}\big) = \lim_{N\to\infty} \frac{1}{N}\Big((B_{N/N} - B_{(N-1)/N}) + 2(B_{(N-1)/N} - B_{(N-2)/N}) + \dots + N(B_{1/N} - B_0)\Big), \]
and these increments are independent Gaussians, so each approximant, and hence the limit, is Gaussian.

1.1.1 Transforms

Definition 1.6 We define the Fourier transform, or characteristic function, to be
\[ \varphi_Z(\theta) = E(e^{i\theta Z}). \]
For example, if $Z \overset{D}{\sim} N(\mu, \sigma^2)$ then $\varphi_Z(\theta) = e^{i\theta\mu} e^{-\sigma^2\theta^2/2}$.

Proposition 1.7 (More facts about Gaussians)

4. $\varphi_Z$ determines the law of $Z$: if $\varphi_Z(\theta) = \varphi_Y(\theta)$ for all $\theta$ then $\mathbb{P}(Z \in A) = \mathbb{P}(Y \in A)$.

5. $Z_1, Z_2$ are independent if and only if $E(e^{i\theta_1 Z_1} e^{i\theta_2 Z_2}) = E(e^{i\theta_1 Z_1})\, E(e^{i\theta_2 Z_2})$ for all $\theta_1, \theta_2$.


6. $\varphi_{Z_k}(\theta) \to \varphi_Z(\theta)$ for all $\theta$ if and only if $Z_k \overset{D}{\to} Z$.

These all hold true for $Z = (Z_1, \dots, Z_d)$ with $\varphi_Z(\theta_1, \dots, \theta_d) = E(e^{i\theta_1 Z_1 + \dots + i\theta_d Z_d})$.

Definition 1.8 $Z = (Z_1, \dots, Z_d) \in \mathbb{R}^d$ is Gaussian if $\sum_{k=1}^d \lambda_k Z_k$ is Gaussian on $\mathbb{R}$ for all $\lambda_1, \dots, \lambda_d$. $(X_t, t \ge 0)$ is a Gaussian process if $(X_{t_1}, \dots, X_{t_N})$ is a Gaussian vector on $\mathbb{R}^N$ for any $t_1, \dots, t_N$ and $N \ge 1$.

Check that Brownian motion is a Gaussian process, i.e. that $(B_{t_1}, \dots, B_{t_N})$ is a Gaussian vector, or that $\sum \lambda_k B_{t_k}$ is Gaussian on $\mathbb{R}$. We can massage this sum into the form $\mu_1(B_{t_1} - B_0) + \dots + \mu_N(B_{t_N} - B_{t_{N-1}})$, a combination of independent Gaussians, and so it is Gaussian. As an exercise, check this for Brownian bridges and O-U processes.

Proposition 1.9 (Even more facts about Gaussians)

7. The law of a Gaussian $Z = (Z_1, \dots, Z_d)$ is determined by $E(Z_k)$ and $E(Z_j Z_k)$ for $j, k = 1, \dots, d$.

8. Suppose $Z = (Z_1, \dots, Z_d)$ is Gaussian. Then $Z_1, \dots, Z_d$ are independent if and only if
\[ E(Z_j Z_k) = E(Z_j)\,E(Z_k) \]
for all $j \ne k$.

For 7, it is enough to calculate $\varphi_Z(\theta)$ and see that it is determined by these moments. For 8, one need only check that the transforms factor.

Example 1.2 Let $(B_t)$ be a Brownian motion on $\mathbb{R}$. Then $E(B_t) = 0$ and, for $0 \le s < t$,
\[ E(B_s B_t) = E\big((B_t - B_s)(B_s - B_0) + (B_s - B_0)^2\big) = E(B_t - B_s)\,E(B_s - B_0) + E(B_s - B_0)^2 = s, \]
and similarly $E(B_s B_t) = t$ if $0 \le t < s$; in short, $E(B_s B_t) = s \wedge t$.

Do the same for Brownian bridges and O-U processes.

Theorem 1.10 (Gaussian characterisation of Brownian motion) If $(X_t, t \ge 0)$ is a Gaussian process with continuous paths, $E(X_t) = 0$ and $E(X_s X_t) = s \wedge t$, then $(X_t)$ is a Brownian motion on $\mathbb{R}$.

Proof We simply check properties 1, 2, 3 in the definition of Brownian motion. 1 is immediate. For 2, since the increments are jointly Gaussian, we need only check that $E\big((X_{t_{j+1}} - X_{t_j})(X_{t_{k+1}} - X_{t_k})\big)$ vanishes for disjoint intervals. Suppose $t_j \le t_{j+1} \le t_k \le t_{k+1}$; then
\[ E\big((X_{t_{j+1}} - X_{t_j})(X_{t_{k+1}} - X_{t_k})\big) = t_{j+1} - t_j - t_{j+1} + t_j = 0 \]
as required. For 3, $X_t - X_s$ is a linear combination of jointly Gaussian variables and so is Gaussian. It has mean zero and, for $s < t$,
\[ E(X_t - X_s)^2 = E(X_s^2 - 2X_s X_t + X_t^2) = s - 2\,(s \wedge t) + t = t - s. \]
Q.E.D.

Now let $I = \int_0^1 B_s\,ds$. Then $E(I) = \int_0^1 E(B_s)\,ds = 0$, and
\[ E(I^2) = E\Big(\int_0^1 B_s\,ds \int_0^1 B_r\,dr\Big) = \int_0^1\!\!\int_0^1 E(B_s B_r)\,ds\,dr = \int_0^1\!\!\int_0^1 s \wedge r\,ds\,dr = \frac{1}{3}, \]
but we need to check that we can use Fubini, i.e. that $K = E\big(\int_0^1\!\int_0^1 |B_r||B_s|\,dr\,ds\big) < \infty$. By Cauchy-Schwarz,
\[ K = \int_0^1\!\!\int_0^1 E(|B_r||B_s|)\,dr\,ds \le \int_0^1\!\!\int_0^1 \sqrt{E(B_r^2)\,E(B_s^2)}\,dr\,ds = \int_0^1\!\!\int_0^1 \sqrt{rs}\,dr\,ds = \frac{4}{9} < \infty, \]
as we wanted.
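A quick numerical sanity check, not part of the original notes (plain Python, hypothetical helper name `integral_of_bm`): approximate $I = \int_0^1 B_s\,ds$ by a Riemann sum over a simulated path and compare the sample variance with the value $1/3$ from the Fubini computation above.

```python
import math
import random

def integral_of_bm(n_steps, rng):
    """Riemann-sum approximation of the integral of B_s over [0, 1]
    along one simulated Brownian path."""
    dt = 1.0 / n_steps
    b, total = 0.0, 0.0
    for _ in range(n_steps):
        b += rng.gauss(0.0, math.sqrt(dt))
        total += b * dt
    return total

rng = random.Random(1)
samples = [integral_of_bm(200, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
var = sum(x * x for x in samples) / len(samples) - mean ** 2
# E(I) = 0 and E(I^2) = 1/3, up to Monte Carlo and discretisation error
```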


Lemma 1.11 (Scaling Lemma) Suppose that $B$ is a Brownian motion on $\mathbb{R}$ and $c > 0$. Define $X_t = \frac{1}{c} B_{c^2 t}$ for $t \ge 0$. Then $X$ is a Brownian motion on $\mathbb{R}$.

Proof Clearly $X$ has continuous paths and $E(X_t) = 0$. Now
\[ E(X_s X_t) = E\Big(\frac{1}{c} B_{c^2 s}\cdot \frac{1}{c} B_{c^2 t}\Big) = \frac{1}{c^2}\,\big((c^2 s) \wedge (c^2 t)\big) = s \wedge t, \]
and also
\[ \sum_{k=1}^N \lambda_k X_{t_k} = \sum_{k=1}^N \frac{\lambda_k}{c}\, B_{c^2 t_k}, \]
which is Gaussian since $B$ is a Gaussian process. Q.E.D.
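The scaling lemma can also be sanity-checked numerically; this sketch (not in the original notes) simulates $B_{c^2}$ as a sum of increments, forms $X_1 = \frac{1}{c}B_{c^2}$, and checks that its empirical variance is close to $1$, as it must be if $X$ is a standard Brownian motion.

```python
import math
import random

rng = random.Random(2)
c = 3.0
n = 100                      # grid steps covering [0, c^2]
dt = c * c / n

def scaled_endpoint():
    """Return X_1 = (1/c) * B_{c^2} for one simulated path of B."""
    b = 0.0
    for _ in range(n):
        b += rng.gauss(0.0, math.sqrt(dt))
    return b / c

xs = [scaled_endpoint() for _ in range(20000)]
var = sum(x * x for x in xs) / len(xs)
# The scaling lemma says Var(X_1) = 1
```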

Lemma 1.12 (Inversion lemma) Suppose that $B$ is a Brownian motion on $\mathbb{R}$. Define
\[ X_t = \begin{cases} t B_{1/t} & t > 0, \\ 0 & t = 0. \end{cases} \]
Then $X$ is a Brownian motion.

Proof First,
\[ \sum_{k=1}^N \lambda_k X_{t_k} = \sum_{k=1}^N \lambda_k t_k B_{1/t_k}, \]
which is Gaussian when all $t_k > 0$; if some $t_k = 0$ then its contribution to the sum is zero, so we are fine. Clearly $E(X_t) = 0$ and, for $0 < s < t$,
\[ E(X_s X_t) = E\big(st\, B_{1/s} B_{1/t}\big) = st \cdot \frac{1}{t} = s. \]
We also have no problem with continuity of paths for $t > 0$. However we need to check continuity at $t = 0$, i.e. that $t B_{1/t} \to 0$ as $t \to 0$, or equivalently that $\frac{1}{s} B_s \to 0$ as $s \to \infty$. We expect $B_t \approx \pm\sqrt{t}$, so $B_t/t \to 0$ should be clear.

More precisely, we know that $(X_{t_1}, \dots, X_{t_N}) \overset{D}{=} (B_{t_1}, \dots, B_{t_N})$ provided all $t_i > 0$, and since $B_t \to 0$ as $t \to 0$, surely $X_t \to 0$ as well. We pin this down precisely: using continuity of the paths on $(0,\infty)$,
\[ \{X_t \to 0 \text{ as } t \to 0\} = \{X_q \to 0 \text{ as } q \to 0,\ q \in \mathbb{Q}\} = \{\forall \varepsilon > 0\ \exists \delta > 0 : q \in \mathbb{Q} \cap (0,\delta] \implies |X_q| < \varepsilon\} = \bigcap_{N=1}^\infty \bigcup_{M=1}^\infty \bigcap_{q \in \mathbb{Q}\cap(0,1/M]} \Big\{|X_q| < \frac{1}{N}\Big\}, \]
and so
\[ \mathbb{P}(X_t \to 0 \text{ as } t \to 0) = \lim_{N\to\infty}\lim_{M\to\infty}\lim_{k\to\infty} \mathbb{P}\Big(\bigcap_{i=1}^k \Big\{|X_{q_i}| < \frac{1}{N}\Big\}\Big) = \mathbb{P}(B_t \to 0 \text{ as } t \to 0) = 1, \]
where $q_1, q_2, \dots$ lists $\mathbb{Q} \cap (0, 1/M]$, and we used that the finite-dimensional probabilities for $X$ and $B$ agree. Q.E.D.

We used in the above that

if $A_1 \supseteq A_2 \supseteq \dots$ then $\mathbb{P}(\cap A_N) = \lim_{N\to\infty} \mathbb{P}(A_N)$;

if $A_1 \subseteq A_2 \subseteq \dots$ then $\mathbb{P}(\cup A_N) = \lim_{N\to\infty} \mathbb{P}(A_N)$.

Corollary 1.13 $B_t / t \to 0$ as $t \to \infty$.

In fact $B_t / t^\alpha \to 0$ for $\alpha > \frac12$, but $\limsup_{t\to\infty} \frac{B_t}{\sqrt t} = \infty$ and $\liminf_{t\to\infty} \frac{B_t}{\sqrt t} = -\infty$, and so $B_t$ visits every $x \in \mathbb{R}$ infinitely many times. This brings us nicely to the next subsection.


1.2 Growth rate of paths

Theorem 1.14 (Law of the Iterated Logarithm) Suppose that $B$ is a Brownian motion on $\mathbb{R}$. Then
\[ \limsup_{t\to\infty} \frac{B_t}{\psi(t)} = +1 \quad\text{and}\quad \liminf_{t\to\infty} \frac{B_t}{\psi(t)} = -1, \]
where $\psi(t) = \sqrt{2t \ln(\ln t)}$.

Recall $\limsup_{t\to\infty} X_t = \lim_{t\to\infty} \sup_{s \ge t} X_s$. Thus $\limsup X_t \le 1$ means that for all $\varepsilon > 0$, $\sup_{s\ge t} X_s \le 1 + \varepsilon$ for large $t$, which is the same as: for all $\varepsilon > 0$, $X_t$ is eventually less than $1+\varepsilon$. Similarly $\limsup X_t \ge 1$ if and only if for all $\varepsilon > 0$ we have $\sup_{s\ge t} X_s \ge 1 - \varepsilon$ for all large $t$, which is the same as: for all $\varepsilon > 0$ there exists a sequence $s_N \to \infty$ with $X_{s_N} \ge 1 - \varepsilon$.

It is on an example sheet that if $X_t = e^{-t} B_{e^{2t}}$ then the Law of the Iterated Logarithm can be converted to give $\limsup_{t\to\infty} \frac{X_t}{\sqrt{2\ln t}} = 1$. We can also compare this with $Z_N$ i.i.d. $N(0,1)$, for which $\limsup_{N\to\infty} \frac{Z_N}{\sqrt{2\ln N}} = 1$.

Proof We first show that
\[ \mathbb{P}\Big(\limsup_{t\to\infty} \frac{B_t}{\psi(t)} \le 1\Big) = 1, \]
and this is the case if and only if $\mathbb{P}(B_t \le (1+\varepsilon)\psi(t) \text{ for all large } t) = 1$ for every $\varepsilon > 0$.

We first perform a calculation at a fixed large $t$:
\[ \mathbb{P}(B_t > (1+\varepsilon)\psi(t)) = \mathbb{P}\big(N(0,t) > (1+\varepsilon)\sqrt{2t\ln(\ln t)}\big) = \mathbb{P}\big(N(0,1) > (1+\varepsilon)\sqrt{2\ln(\ln t)}\big) = \int_{(1+\varepsilon)\sqrt{2\ln(\ln t)}}^\infty \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz. \]

Lemma 1.15 (Gaussian Tails)
\[ \frac{1}{a}\Big(1 - \frac{2}{a^2}\Big) e^{-a^2/2} \;\le\; \int_a^\infty e^{-z^2/2}\,dz \;\le\; \frac{1}{a}\, e^{-a^2/2}. \]
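The tail bounds can be checked exactly, since $\int_a^\infty e^{-z^2/2}\,dz = \sqrt{\pi/2}\,\operatorname{erfc}(a/\sqrt2)$. This check is not in the original notes; it is a small Python sketch using the standard library's complementary error function.

```python
import math

def gaussian_tail(a):
    """Exact value of the integral of e^{-z^2/2} over [a, infinity),
    via the complementary error function."""
    return math.sqrt(math.pi / 2) * math.erfc(a / math.sqrt(2))

checks = []
for a in (2.0, 3.0, 5.0, 10.0):
    lower = (1 / a) * (1 - 2 / (a * a)) * math.exp(-a * a / 2)
    upper = (1 / a) * math.exp(-a * a / 2)
    checks.append(lower <= gaussian_tail(a) <= upper)
ok = all(checks)  # Lemma 1.15 holds at each sampled a
```

Both bounds become asymptotically tight as $a \to \infty$, which is what the proof of the LIL exploits.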

Then we get that
\[ \int_{(1+\varepsilon)\sqrt{2\ln(\ln t)}}^\infty \frac{e^{-z^2/2}}{\sqrt{2\pi}}\,dz \le \frac{1}{(1+\varepsilon)\sqrt{2\ln(\ln t)}\,\sqrt{2\pi}}\; e^{-(1+\varepsilon)^2 \ln(\ln t)}. \]

The strategy now is to control $B$ along a grid of times $t_N = \theta^N$ for $\theta > 1$. Then
\[ \mathbb{P}\big(B_{\theta^N} > (1+\varepsilon)\psi(\theta^N)\big) \le \frac{1}{\sqrt{2\pi}}\,\frac{1}{1+\varepsilon}\,\frac{1}{\sqrt{2\ln(N\ln\theta)}}\; e^{-(1+\varepsilon)^2 \ln(N\ln\theta)} \le C(\theta,\varepsilon)\, N^{-(1+\varepsilon)^2}. \]


Lemma 1.16 (Borel-Cantelli part 1) If $\sum_1^\infty \mathbb{P}(A_N) < \infty$ then
\[ \mathbb{P}(\text{only finitely many } A_N \text{ happen}) = 1. \]

Proof Let $\chi_{A_N}$ equal $1$ on $A_N$ and $0$ on $A_N^c$; then the number of $A_N$s that occur is $\sum_1^\infty \chi_{A_N}$, and
\[ E\Big[\sum_1^\infty \chi_{A_N}\Big] = \sum_1^\infty E[\chi_{A_N}] = \sum_1^\infty \mathbb{P}(A_N) < \infty, \]
and so $\sum_1^\infty \chi_{A_N}$ is finite a.s. Q.E.D.

Then by BC1 we have $B_{\theta^N} \le (1+\varepsilon)\psi(\theta^N)$ for all large $N$. We now need to control $B$ over $(\theta^N, \theta^{N+1})$.

Lemma 1.17 (Reflection trick) For $a \ge 0$,
\[ \mathbb{P}\Big(\sup_{s\le t} B_s \ge a\Big) = 2\,\mathbb{P}(B_t \ge a). \]

Proof Define $\Omega_0 = \{\sup_{s\le t} B_s \ge a\}$ and then
\[ \mathbb{P}(\Omega_0) = \mathbb{P}(\Omega_0 \cap \{B_t > a\}) + \mathbb{P}(\Omega_0 \cap \{B_t = a\}) + \mathbb{P}(\Omega_0 \cap \{B_t < a\}) = 2\,\mathbb{P}(\Omega_0 \cap \{B_t > a\}) = 2\,\mathbb{P}(B_t > a), \]
using that $\mathbb{P}(B_t = a) = 0$ and that reflecting the path after it first hits $a$ shows the events $\Omega_0 \cap \{B_t > a\}$ and $\Omega_0 \cap \{B_t < a\}$ have the same probability. We will carefully justify this later by examining the hitting time $T_a = \inf\{t : B_t = a\}$: we consider $(B_{T_a + t} - a,\ t \ge 0)$ and check that this is still a Brownian motion.

From the lemma,
\[ \mathbb{P}(T_a \le t) = \mathbb{P}\Big(\sup_{s\le t} B_s \ge a\Big) = 2\,\mathbb{P}\Big(\frac{B_t}{\sqrt t} \ge \frac{a}{\sqrt t}\Big) = 2\int_{a/\sqrt t}^\infty \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz, \]
and also
\[ \mathbb{P}(T_a \in dt) = \frac{d}{dt}\,\mathbb{P}(T_a \le t)\,dt = 2\cdot\frac{1}{\sqrt{2\pi}}\, e^{-a^2/2t}\cdot\frac{a}{2}\, t^{-3/2}\,dt = \frac{a}{\sqrt{2\pi t^3}}\, e^{-a^2/2t}\,dt, \]
and so $E(T_a) = \infty$. Q.E.D.
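The reflection identity is easy to test by simulation; this sketch (not in the original notes) compares the empirical probability that a discretised path exceeds level $a$ before time $t$ with twice the probability that its endpoint does. The discrete maximum slightly underestimates the continuous supremum, so only approximate agreement is expected.

```python
import math
import random

rng = random.Random(3)
n, t, a = 500, 1.0, 1.0
dt = t / n
trials = 10000

hits = ends = 0
for _ in range(trials):
    b, m = 0.0, 0.0
    for _ in range(n):
        b += rng.gauss(0.0, math.sqrt(dt))
        m = max(m, b)
    hits += m >= a      # running maximum reached a
    ends += b >= a      # endpoint is at least a
p_sup = hits / trials
p_end = ends / trials
# Lemma 1.17: p_sup should be close to 2 * p_end
```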

Thus from this we get that
\[ \mathbb{P}\Big(\sup_{s\le\theta^N} B_s \ge (1+\varepsilon)\psi(\theta^N)\Big) = 2\,\mathbb{P}\big(B_{\theta^N} > (1+\varepsilon)\psi(\theta^N)\big) \le 2\,C(\varepsilon,\theta)\, N^{-(1+\varepsilon)^2}. \]
Borel-Cantelli part 1 still applies, and so for large $N$ we have
\[ \sup_{s\le\theta^N} B_s \le (1+\varepsilon)\psi(\theta^N). \]
Thus if $t \in [\theta^N, \theta^{N+1}]$ we have
\[ \frac{B_t}{\psi(t)} \le \frac{(1+\varepsilon)\psi(\theta^{N+1})}{\psi(\theta^N)} = (1+\varepsilon)\sqrt\theta\,\frac{\sqrt{\ln((N+1)\ln\theta)}}{\sqrt{\ln(N\ln\theta)}} \to (1+\varepsilon)\sqrt\theta, \]


and thus we have that
\[ \limsup_{t\to\infty} \frac{B_t}{\psi(t)} \le (1+\varepsilon)\sqrt\theta \]
for all $\varepsilon > 0$ and $\theta > 1$, and so
\[ \limsup_{t\to\infty} \frac{B_t}{\psi(t)} \le 1. \]

We now show
\[ \limsup_{t\to\infty} \frac{B_t}{\psi(t)} \ge 1 - \varepsilon. \]
If we choose $t_N = \theta^N$ for $\theta > 1$ then, using the lower Gaussian tail bound,
\[ \mathbb{P}\big(B_{\theta^N} > (1-\varepsilon)\psi(\theta^N)\big) \ge C(\theta,\varepsilon)\, N^{-(1-\varepsilon)^2}. \]

Lemma 1.18 (Borel-Cantelli part 2) If $\sum_1^\infty \mathbb{P}(A_N) = \infty$ and the $A_N$ are independent, then
\[ \mathbb{P}(\text{infinitely many } A_N \text{ occur}) = 1. \]

Proof $Z = \sum_1^\infty \chi_{A_N}$ is the total number of $A_N$s that occur. From BC1 we saw $E(Z) < \infty \implies \mathbb{P}(Z < \infty) = 1$; here instead we show $E(e^{-Z}) = 0$, which is equivalent to $\mathbb{P}(Z = \infty) = 1$. With $\alpha = 1 - e^{-1}$, so that $E(e^{-\chi_{A_N}}) = 1 - \alpha\,\mathbb{P}(A_N)$, independence gives
\[ E\big(e^{-\sum \chi_{A_N}}\big) = E\Big(\prod_1^\infty e^{-\chi_{A_N}}\Big) = \prod_1^\infty E\big(e^{-\chi_{A_N}}\big) = \prod_1^\infty \big(1 - \alpha\,\mathbb{P}(A_N)\big) \le \prod_1^\infty e^{-\alpha\,\mathbb{P}(A_N)} = e^{-\alpha\sum\mathbb{P}(A_N)} = 0. \]
Q.E.D.

We want to use this on $A_N = \{B_{\theta^N} > (1-\varepsilon)\psi(\theta^N)\}$, but these are not independent — though nearly so for large $N$. We finish by correcting this. Define instead $\tilde A_N = \{B_{\theta^N} - B_{\theta^{N-1}} > (1-\varepsilon)\sqrt{1-\theta^{-1}}\,\psi(\theta^N)\}$; these are independent, and since $B_{\theta^N} - B_{\theta^{N-1}}$ has variance $\theta^N(1-\theta^{-1})$, the same tail calculation gives
\[ \mathbb{P}(\tilde A_N) \ge C(\theta,\varepsilon)\, N^{-(1-\varepsilon)^2}, \]
whose sum diverges. BC2 tells us that infinitely many $\tilde A_N$ occur a.s.; for such $N$,
\[ B_{\theta^N} \ge (1-\varepsilon)\sqrt{1-\theta^{-1}}\,\psi(\theta^N) + B_{\theta^{N-1}} \ge (1-\varepsilon)\sqrt{1-\theta^{-1}}\,\psi(\theta^N) - (1+\varepsilon)\psi(\theta^{N-1}), \]
using the already proved upper bound (applied to $-B$) in the last step, and so
\[ \frac{B_{\theta^N}}{\psi(\theta^N)} \ge (1-\varepsilon)\sqrt{1-\theta^{-1}} - (1+\varepsilon)\,\frac{\psi(\theta^{N-1})}{\psi(\theta^N)}, \]


and
\[ \frac{\psi(\theta^{N-1})}{\psi(\theta^N)} = \frac{1}{\sqrt\theta}\,\frac{\sqrt{\ln((N-1)\ln\theta)}}{\sqrt{\ln(N\ln\theta)}} \to \sqrt{\theta^{-1}}, \]
and so
\[ \limsup_{N\to\infty} \frac{B_{\theta^N}}{\psi(\theta^N)} \ge (1-\varepsilon)\sqrt{1-\theta^{-1}} - (1+\varepsilon)\sqrt{\theta^{-1}}, \]
and taking $\theta$ large and $\varepsilon$ small gives the result. Q.E.D.

We make some observations:

1. Can we do better? $\mathbb{P}(B_t \le h_t \text{ for large } t) \in \{0, 1\}$ for deterministic $h_t$. This is called the 0-1 law, and we see it in week 4. For $h_t = \sqrt{Ct\ln\ln t}$: if $C < 2$ we get $0$, and if $C > 2$ we get $1$. There is an integral test for general $h_t$.

2. Random walk analogue. Suppose that $X_1, X_2, \dots$ are i.i.d. with $E(X_k) = 0$, $E(X_k^2) = 1$, and $S_N = X_1 + \dots + X_N$. Then
\[ \limsup_{N\to\infty} \frac{S_N}{\sqrt{2N\ln\ln N}} = 1. \]
This was proved in 1913 but the proof was long; a shorter proof, using the Brownian motion result, came in 1941.

3. $X_t = t B_{1/t}$ is still a Brownian motion, and so applying the LIL to $X$ gives
\[ \limsup_{s\to 0} \frac{B_s}{\sqrt{2s\ln\ln(1/s)}} = 1, \]
a result about small-$t$ behaviour.

4. $\mathbb{P}(B \text{ differentiable at } 0) = 0$, and if we fix $t_0 > 0$ then $X_t = B_{t_0+t} - B_{t_0}$ is still a Brownian motion. Thus

Corollary 1.19 $\mathbb{P}(B \text{ differentiable at } t_0) = 0$ for each fixed $t_0$.

5. A warning: suppose $U \sim U[0,1]$ is a uniform r.v., and define $X_t$ to be constant up until time $U$ and then monotone increasing up until $1$. Then $\mathbb{P}(X \text{ differentiable at } t_0) = 1$ for each fixed $t_0$, yet $X$ is not differentiable at the random time $U$. So we cannot easily conclude from Corollary 1.19 alone that Brownian motion is nowhere differentiable.

6. Corollary 1.20 $\mathrm{Leb}\{t : B \text{ is differentiable at } t\} = 0$ a.s.

Proof
\[ E\Big(\int_0^\infty \chi(B \text{ diff at } t)\,dt\Big) = \int_0^\infty E\,\chi(B \text{ diff at } t)\,dt = 0. \]
Q.E.D.

The points where it is differentiable are examples of random exceptional points.


1.3 Regularity

Definition 1.21 A function $f : [0,\infty) \to \mathbb{R}$ is α-Hölder continuous at $t$, for $\alpha \in (0,1]$, if there exist $M, \delta > 0$ such that
\[ |f_{t+s} - f_t| \le M|s|^\alpha \quad \text{for } |s| \le \delta. \]
The case $\alpha = 1$ is called Lipschitz.

The aim of the next part is to show that $\mathbb{P}(B \text{ is α-Hölder continuous at all } t \ge 0) = 1$ provided $\alpha < 1/2$, and that $\mathbb{P}(B \text{ is α-Hölder continuous at any } t \ge 0) = 0$ provided $\alpha > 1/2$.

Corollary 1.22 $\mathbb{P}(B \text{ is differentiable at any } t) = 0$.

The reason is as follows. A differentiable function must lie in some cone: since
\[ \frac{f(t+s) - f(t)}{s} \to a, \]
we have $(f(t+s)-f(t))/s \in (a-\varepsilon, a+\varepsilon)$ for small $s$, and thus $|f(t+s)-f(t)| \le (|a|+\varepsilon)|s|$ for small $s$, so the Lipschitz condition holds at $t$ with $M = |a| + \varepsilon$.

Proposition 1.23 Define $\Omega_{M,\delta} = \{\text{for some } t \in [0,1],\ |B_{t+s} - B_t| \le M|s| \text{ for all } |s| \le \delta\}$. Then $\mathbb{P}(\Omega_{M,\delta}) = 0$, and thus $\mathbb{P}(\cup_{M=1}^\infty \cup_{N=1}^\infty \Omega_{M,1/N}) = 0$.

Proof This argument hasn't been bettered since 1931. Suppose there exists $t \in [K/N, (K+1)/N]$ at which the Lipschitz bound holds, i.e. $|B_{t+s} - B_t| \le M|s|$ for $|s| \le \delta$. Then if $(K+1)/N, (K+2)/N \in [t, t+\delta]$,
\[ |B_{(K+1)/N} - B_t| \le \frac{M}{N}, \qquad |B_{(K+2)/N} - B_t| \le \frac{2M}{N}, \]
and so by the triangle inequality we get
\[ |B_{(K+1)/N} - B_{(K+2)/N}| \le \frac{3M}{N}, \]
and then we have
\[ \mathbb{P}(\Omega_{M,\delta}) \le \mathbb{P}\Big[\text{for some } K = 1, \dots, N-1 : |B_{(K+1)/N} - B_{(K+2)/N}| \le \frac{3M}{N}\Big]. \tag{1.1} \]

We first calculate the probability of the event on the right hand side:
\[ \mathbb{P}\Big[|N(0,1/N)| \le \frac{3M}{N}\Big] = \mathbb{P}\Big[|N(0,1)| \le \frac{3M}{\sqrt N}\Big] = \int_{-3M/\sqrt N}^{3M/\sqrt N} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz \le \frac{6M}{\sqrt{2\pi N}}, \]
and then the right hand side of equation (1.1) satisfies
\[ \mathbb{P}\Big(\bigcup_{K=1}^{N-1}\Big\{|B_{(K+1)/N} - B_{(K+2)/N}| \le \frac{3M}{N}\Big\}\Big) \le \sum_{K=1}^{N-1} \frac{6M}{\sqrt{2\pi N}} \le \frac{6M\sqrt N}{\sqrt{2\pi}}, \]


but this is not useful, because it does not tend to zero. We modify the argument by taking more points. On the event above we already know that
\[ |B_{(K+2)/N} - B_{(K+1)/N}| \le \frac{3M}{N}, \]
but (provided $4/N \le \delta$) we also have
\[ |B_{(K+3)/N} - B_{(K+2)/N}| \le \frac{5M}{N}, \qquad |B_{(K+4)/N} - B_{(K+3)/N}| \le \frac{7M}{N}, \]
and then we say that
\[ \mathbb{P}(\Omega_{M,\delta}) \le \mathbb{P}\Big[\text{for some } K = 1, \dots, N-1 : |B_{(K+2)/N} - B_{(K+1)/N}| \le \frac{3M}{N} \text{ and } |B_{(K+3)/N} - B_{(K+2)/N}| \le \frac{5M}{N} \text{ and } |B_{(K+4)/N} - B_{(K+3)/N}| \le \frac{7M}{N}\Big]. \]
Writing the three increment events for a given $K$ as $A, B, C$, which are independent and each of probability $O(1/\sqrt N)$ by the Gaussian bound above, this is at most
\[ \mathbb{P}\Big(\bigcup_{K=1}^{N-1} \{A \text{ and } B \text{ and } C\}\Big) \le \sum_{K=1}^{N-1} \mathbb{P}(A)\,\mathbb{P}(B)\,\mathbb{P}(C) \le \sum_{K=1}^{N-1} \frac{D}{N^{3/2}} \le \frac{D}{\sqrt N} \to 0, \]
and so $\mathbb{P}(\Omega_{M,\delta}) = 0$. Q.E.D.

We now turn to the other statement, that $B$ is α-Hölder continuous if $\alpha \in (0, 1/2)$.

Theorem 1.24 (Kolmogorov's continuity criterion) Suppose $(X_t, t \in [0,1])$ has continuous paths and satisfies
\[ E[|X_t - X_s|^p] \le C|t-s|^{1+\gamma} \]
for some $\gamma, p > 0$. Then $X$ has α-Hölder paths for $\alpha \in (0, \gamma/p)$.

We use this for $X$ a Brownian motion. In this case
\[ E[|B_t - B_s|^p] = E[|N(0, t-s)|^p] = \sqrt{t-s}^{\,p}\, E[|N(0,1)|^p] = c_p\,|t-s|^{p/2}, \]
so $1 + \gamma = p/2$, and we conclude that Brownian motion has α-Hölder continuous paths of orders
\[ \alpha < \frac{p/2 - 1}{p} = \frac12 - \frac1p, \]
and so, letting $p \to \infty$, Hölder with any $\alpha < \frac12$ but not $\alpha = \frac12$.
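The moment identity used here can be checked numerically for, say, $p = 4$, where $c_4 = E[N(0,1)^4] = 3$, so that $E|B_t - B_s|^4 = 3|t-s|^2$ (giving $\gamma = 1$ and Hölder exponents up to $1/4$ from this choice of $p$). This Monte Carlo check is not part of the original notes.

```python
import math
import random

rng = random.Random(4)
h = 0.3  # the time gap t - s; the increment is N(0, h)
samples = [rng.gauss(0.0, math.sqrt(h)) ** 4 for _ in range(200000)]
fourth_moment = sum(samples) / len(samples)
expected = 3 * h ** 2  # c_4 = 3 for a standard Gaussian
```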

Lemma 1.25 (Markov Inequality) Suppose that $Z \ge 0$ is a non-negative random variable. Then
\[ \mathbb{P}(Z \ge a) \le \frac{1}{a}\, E(Z). \]

Proof
\[ E(Z) = E(Z\chi_{Z<a}) + E(Z\chi_{Z\ge a}) \ge E(Z\chi_{Z\ge a}) \ge a\,\mathbb{P}(Z \ge a). \]
Q.E.D.

We use this as follows:
\[ \mathbb{P}(|X_t - X_s| \ge a) = \mathbb{P}(|X_t - X_s|^p \ge a^p) \le \frac{1}{a^p}\, E(|X_t - X_s|^p) \le \frac{C|t-s|^{1+\gamma}}{a^p}. \]


Suppose we have a grid of size $2^{-N}$, and define
\[ A_N = \bigcup_{K=1}^{2^N} \Big\{|X_{K/2^N} - X_{(K-1)/2^N}| > \frac{1}{2^{N\alpha}}\Big\}. \]
We then estimate this:
\[ \mathbb{P}(A_N) \le \sum_{K=1}^{2^N} \mathbb{P}\Big(|X_{K/2^N} - X_{(K-1)/2^N}| > \frac{1}{2^{N\alpha}}\Big) \le C\, 2^N\, 2^{-N(1+\gamma)}\, 2^{N\alpha p} = C\, 2^{-N(\gamma - \alpha p)}, \]
and thus we need $\gamma > \alpha p$, i.e. $\alpha < \gamma/p$. This is the key idea of the proof. We now have $\sum_1^\infty \mathbb{P}(A_N) < \infty$, and so by Borel-Cantelli I only finitely many $A_N$s occur, i.e. there exists $N_0(\omega)$ such that
\[ |X_{K/2^N} - X_{(K-1)/2^N}| < \frac{1}{2^{N\alpha}} \quad \text{for all } K = 1, \dots, 2^N \text{ and all } N > N_0. \]
We also know that $\mathbb{P}(N_0 < \infty) = 1$. We now fix $\omega$ such that $N_0(\omega) < \infty$. We can control $X$ on $D = \{\text{dyadic rationals, i.e. of the form } K/2^M\}$. Fix $t, s \in D$ with
\[ 1/2^M \le |t - s| \le 1/2^{M-1}, \]

where $M \ge N_0$. We have two cases: either $[s,t]$ straddles two grid points $K/2^M$ and $(K+1)/2^M$, or only one point $K/2^M$. We consider the first case, and leave the second to the reader. We can write
\[ t = \frac{K+1}{2^M} + \frac{1 \text{ or } 0}{2^{M+1}} + \dots + \frac{1 \text{ or } 0}{2^{M+L}}, \]
and then, summing the dyadic increment bounds along this expansion,
\[ |X_t - X_{(K+1)/2^M}| \le \frac{1}{2^{(M+1)\alpha}} + \dots + \frac{1}{2^{(M+L)\alpha}} \le \frac{1}{2^{(M+1)\alpha}}\cdot\frac{1}{1 - 2^{-\alpha}}, \]
and similarly
\[ |X_s - X_{K/2^M}| \le \frac{1}{2^{(M+1)\alpha}}\cdot\frac{1}{1 - 2^{-\alpha}}, \]
and thus
\[ |X_s - X_t| \le |X_s - X_{K/2^M}| + |X_{K/2^M} - X_{(K+1)/2^M}| + |X_t - X_{(K+1)/2^M}| \le \frac{1}{2^{(M+1)\alpha}}\cdot\frac{2}{1-2^{-\alpha}} + \frac{1}{2^{M\alpha}} \le \frac{C}{2^{M\alpha}} \le C'|t-s|^\alpha, \]
using $|t-s| \ge 1/2^M$ in the last step. Extending from dyadic $s, t$ to all $s, t$ by continuity gives the result.


2 Brownian motion as a Markov Process

Roughly, the future $(X_s : s \ge t)$ is independent of the past $(X_s : s \le t)$, conditional on the present $X_t$.

Definition 2.1 Two σ-fields $\mathcal{F}, \mathcal{G}$ on $\Omega$ are called independent if $\mathbb{P}(A \cap B) = \mathbb{P}(A)\,\mathbb{P}(B)$ for all $A \in \mathcal{F}$ and $B \in \mathcal{G}$.

Two variables $X$ and $Y$ are called independent if
\[ \mathbb{P}(X \in C_1, Y \in C_2) = \mathbb{P}(X \in C_1)\,\mathbb{P}(Y \in C_2), \]
or equivalently
\[ E(h(X)g(Y)) = E(h(X))\,E(g(Y)) \]
for all measurable bounded functions $h, g$.

These are equivalent as follows. We can take $h = \chi_{C_1}$ and $g = \chi_{C_2}$, and then the two statements coincide. We then take simple functions and limits of simple functions to get arbitrary $h, g$, using the standard machine.

Definition 2.2 $\sigma(X)$, called the σ-field generated by $X$, is defined as
\[ \{X^{-1}(C) : C \text{ measurable}\}. \]

Note that $X$ is independent of $Y$ if and only if $\sigma(X)$ is independent of $\sigma(Y)$.

Definition 2.3 $\sigma(X_t, t \in I)$, called the σ-field generated by $(X_t)$, is defined to be
\[ \sigma\{X_t^{-1}(C) : t \in I,\ C \text{ measurable}\}, \]
i.e. the smallest σ-field containing the above sets.

Theorem 2.4 (Markov property of Brownian motion 1) Suppose that $B$ is a Brownian motion, and fix $t_0 > 0$. Define $X_t = B_{t+t_0} - B_{t_0}$; then $\sigma(X_t : t \ge 0)$ is independent of $\sigma(B_t : t \le t_0)$.

Note that this is very close to independent increments.

For example, $\int_0^{t_0} B_s\,ds$ is independent of $\int_0^{t_1} X_s\,ds$. We need to check that $\int_0^{t_0} B_s\,ds$ is measurable with respect to $\sigma(B_t : t \le t_0)$:
\[ \int_0^{t_0} B_s\,ds = \lim_{N\to\infty} \sum_{K=1}^N \frac{t_0}{N}\, B_{K t_0/N}, \]
which is measurable since limits and sums of measurable functions are measurable.

A second example: $\sup_{t \le s} B_t$ (for $s \le t_0$) is independent of $\sup_{t\le t_1} X_t$. We can write
\[ \sup_{t\le s} B_t = \sup_{q \in [0,s]\cap\mathbb{Q}} B_q = \lim_{N\to\infty} \max\{B_{q_1}, \dots, B_{q_N}\}, \]
where $q_i$ lists $[0,s] \cap \mathbb{Q}$. Now
\[ \omega \mapsto (B_{q_1}(\omega), \dots, B_{q_N}(\omega)) \mapsto \max_i B_{q_i}(\omega), \]
where the latter map is continuous and the former is measurable, since if $C = C_1 \times \dots \times C_N$ then $\{\omega : (B_{q_1}(\omega), \dots, B_{q_N}(\omega)) \in C\}$ is measurable.

Definition 2.5 A collection of subsets $\mathcal{A}_0$ is called a π-system if it is closed under finite intersections.


Lemma 2.6 If $\mathcal{F}_0$ and $\mathcal{G}_0$ are two π-systems and $\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B)$ for all $A \in \mathcal{F}_0$ and $B \in \mathcal{G}_0$, then $\mathbb{P}(A \cap B) = \mathbb{P}(A)\mathbb{P}(B)$ holds for all $A \in \sigma(\mathcal{F}_0)$ and $B \in \sigma(\mathcal{G}_0)$; i.e. independence of π-systems gives independence of the generated σ-fields.

For us, $\sigma(X_t, t \ge 0) = \sigma\{X_t^{-1}(C) : t \ge 0, C \in \mathcal{B}(\mathbb{R})\}$, and this is generated by the π-system $\{X_{t_1}^{-1}(C_1) \cap \dots \cap X_{t_N}^{-1}(C_N) : t_1, \dots, t_N \ge 0,\ C_K \in \mathcal{B}(\mathbb{R})\}$.

Proof (of Theorem 2.4) By the above lemma, we need only check that $(X_{t_1}, \dots, X_{t_N})$ is independent of $(B_{s_1}, \dots, B_{s_N})$ for $s_k \le t_0$. Rewriting $\sum \lambda_k B_{s_k}$ as a combination $\sum \tilde\lambda_k (B_{s_k} - B_{s_{k-1}})$ of increments before time $t_0$,
\[ E\big[e^{i\sum_k \lambda_k B_{s_k}}\, e^{i\sum_k \mu_k X_{t_k}}\big] = E\big[e^{i\sum_k \tilde\lambda_k (B_{s_k} - B_{s_{k-1}})}\, e^{i\sum_k \mu_k (B_{t_0+t_k} - B_{t_0+t_{k-1}})}\big] = E\big[e^{i\sum_k \lambda_k B_{s_k}}\big]\, E\big[e^{i\sum_k \mu_k X_{t_k}}\big] \]
by independent increments. Q.E.D.

Proposition 2.7 The following are true.

1. For $a \le b \le c \le d$, $\mathbb{P}\big(\max_{[a,b]} B_t \ne \max_{[c,d]} B_t\big) = 1$.

2. $\mathbb{P}\big(\exists!\, t^\star \in [a,b] : B_{t^\star} = \max_{[a,b]} B_t\big) = 1$.

3. Local maxima are dense in $[0,\infty)$.

Proof We can get 2 from 1 by using that $\mathbb{P}\big(\max_{[a,b]} B_t \ne \max_{[c,d]} B_t$ for all rational $a \le b \le c \le d\big) = 1$, and 3 from 2 by doing this for all rational $(a,b)$ simultaneously. We now prove 1. Take $X_t = B_{c+t} - B_c$, and let $Y = \max_{[c,d]}(B_t - B_c) = \max_{[0,d-c]} X_t$, which is independent of $\sigma(B_s : s \le c)$. Take $Z = \max_{[a,b]}(B_t - B_c)$; we aim to show $\mathbb{P}(Z \ne Y) = 1$. $Y$ and $Z$ are independent by the Markov property, and $Y$ has a density by the reflection principle; this implies $\mathbb{P}(Y = Z) = 0$. Q.E.D.

At the last part of this, we have used that if $E(g(Y)h(Z)) = E(g(Y))E(h(Z))$ for all bounded measurable $g, h$, and if $\nu(dy) = \mathbb{P}(Y \in dy)$ and $\mu(dz) = \mathbb{P}(Z \in dz)$, then
\[ E(\varphi(Y,Z)) = \int\!\!\int \varphi(y,z)\,\nu(dy)\,\mu(dz). \]
This can be shown by the standard machine; here we take $\varphi(y,z) = \chi_{\{y = z\}}$, so that $\mathbb{P}(Y = Z) = \int \nu(\{z\})\,\mu(dz) = 0$, since $\nu$ has no atoms.

Definition 2.8 We define the Brownian filtration by $\mathcal{F}_t^B := \sigma(B_s : s \le t)$; we also define $\mathcal{F}_{t+}^B = \cap_{s>t} \mathcal{F}_s^B$, and the germ field is $\mathcal{F}_{0+}^B$.

Note that if $\frac{dB}{dt}(t)$ existed then it would be $\mathcal{F}_{t+}^B$ measurable. Also, $\limsup_{t\to 0} \frac{B_t}{h(t)}$ is $\mathcal{F}_{0+}^B$ measurable, since this limsup is $\mathcal{F}_\delta^B$ measurable for any $\delta > 0$.

Theorem 2.9 (Markov property version 2) Fix $T \ge 0$ and define $X_t = B_{T+t} - B_T$. Then $\sigma(X_t, t \ge 0)$ is independent of $\mathcal{F}_{T+}^B$.

Corollary 2.10 Taking $T = 0$: $\sigma(B_t, t \ge 0)$ is independent of $\mathcal{F}_{0+}^B$, and since $\mathcal{F}_{0+}^B \subseteq \sigma(B_t, t \ge 0)$, the germ field is independent of itself.

Corollary 2.11 $\mathcal{F}_{0+}^B$ is a 0/1 field.


Thus if $Z$ is $\mathcal{F}_{0+}^B$ measurable then $\{\omega : Z(\omega) \le c\}$ has probability $0$ or $1$ for every $c$, and so $Z$ is constant almost surely.

Example 2.1 (Lebesgue's Thorn) Suppose that $B$ is a $d$-dimensional Brownian motion and $F$ is an open set in $\mathbb{R}^d$. Take $\tau = \inf\{t : B_t \in F\}$. We claim that $\{\tau = 0\} \in \mathcal{F}_{0+}^B$. Indeed
\[ \{\tau = 0\} = \bigcap_{N=N_0}^\infty \bigcup_{q \in \mathbb{Q}\cap[0,1/N]} \{B_q \in F\} \in \mathcal{F}_{1/N_0}^B \]
for all $N_0$. Lebesgue's thorn is an example of such an $F$, e.g. $F = \{(x,y,z) : \sqrt{y^2+z^2} \le f(x)\}$, a volume of rotation. Here $\mathbb{P}(\tau > 0) = 1$ for thin thorns, and is $0$ for thick thorns.

Proof (of Theorem 2.9) Take $A \in \mathcal{F}_{T+}^B$ and $B \in \sigma(X_s, s \ge 0)$. The π-system lemma says we may take $B$ determined by $(X_{s_1}, \dots, X_{s_m})$, i.e. it is enough to check that
\[ E\big[e^{i\sum \lambda_k X_{s_k}}\, e^{i\theta \chi_A}\big] \]
splits as a product of the two expectations.

Let $X_s^\varepsilon = B_{T+\varepsilon+s} - B_{T+\varepsilon}$ and note that this is independent of $\mathcal{F}_{T+\varepsilon}^B \supseteq \mathcal{F}_{T+}^B$. Then
\[ E\big[e^{i\sum \lambda_k X_{s_k}^\varepsilon}\, e^{i\theta\chi_A}\big] = E\big[e^{i\sum \lambda_k X_{s_k}^\varepsilon}\big]\, E\big[e^{i\theta\chi_A}\big] \]
by the Markov property version 1. Letting $\varepsilon \to 0$ (so $X_{s_k}^\varepsilon \to X_{s_k}$ by continuity of paths) and using the DCT, we get the required result. Q.E.D.

Example 2.2 (Shakespeare problem) Suppose we have a 2-dimensional Brownian motion, and assume that
\[ \mathbb{P}\big((B_t, t \in [0,1]) \text{ traverses a tube } A\big) = p_0 > 0. \]
Then by the scaling lemma,
\[ \mathbb{P}\big((B_t, t \in [0, 2^{-N}]) \text{ traverses the shrunken tube } 2^{-N/2}A\big) = p_0 \]
for every $N$; call this event $A_N$. Note that $A_N \in \mathcal{F}_{2^{-N}}^B$, and let
\[ \Omega_0 = \bigcap_{M=M_0}^\infty \bigcup_{N=M}^\infty A_N = [A_N \text{ i.o.}] \in \mathcal{F}_{2^{-M_0}}^B \]
for every $M_0$, so $\Omega_0 \in \mathcal{F}_{0+}^B$. Now
\[ \mathbb{P}(\Omega_0) = \lim_{M\to\infty} \mathbb{P}\Big(\bigcup_{N=M}^\infty A_N\Big) \ge p_0 > 0, \]
and so, by the 0/1 law, $\mathbb{P}(\Omega_0) = 1$.

2.1 Markov transition functions

You may know that, for a Markov chain,
\[ \mathbb{P}(X_0 = x_0, \dots, X_N = x_N) = \mathbb{P}(X_0 = x_0)\, p(x_0, x_1) \cdots p(x_{N-1}, x_N), \]
and this is equivalent to
\[ \mathbb{P}(X_N = x_N \mid X_0 = x_0, \dots, X_{N-1} = x_{N-1}) = p(x_{N-1}, x_N). \]
Our aim is to come up with something similar in the continuous case. We intuitively think of the following as the probability that we start at $x$ and end up in $dy$ in time $t$.


Definition 2.12 A Markov transition kernel is a function $p : (0,\infty) \times \mathbb{R}^d \times \mathcal{B}(\mathbb{R}^d) \to [0,1]$, written $p_t(x, A)$, such that

1. $A \mapsto p_t(x,A)$ is a probability measure on $\mathcal{B}(\mathbb{R}^d)$;

2. $x \mapsto p_t(x,A)$ is measurable.

Definition 2.13 A Markov process $(X_t, t \ge 0)$ on $\mathbb{R}^d$ with transition kernel $p$, started at $x$, means that
\[ \mathbb{P}(X_{t_1} \in A_1, \dots, X_{t_N} \in A_N) = \int_{A_1 \times \dots \times A_N} p_{t_1}(x, dx_1)\, p_{t_2-t_1}(x_1, dx_2) \cdots p_{t_N-t_{N-1}}(x_{N-1}, dx_N). \]

Observe that we then have
\[ E(f(X_{t_1}, \dots, X_{t_N})) = \int_{\mathbb{R}^N} f(x_1, \dots, x_N)\, p_{t_1}(x, dx_1) \cdots p_{t_N-t_{N-1}}(x_{N-1}, dx_N). \]

Example 2.3

1. Brownian motion, with $p_t(x, dy) = \frac{1}{\sqrt{2\pi t}}\, e^{-(x-y)^2/2t}\,dy$.

2. Brownian motion with drift, $X_t = B_t + x + ct$.

3. Reflected Brownian motion, $X_t = |x + B_t|$.

4. Absorbed Brownian motion, $X_t = x + B_t$ for $t < \tau$ and $X_t = 0$ for $t \ge \tau$, where $\tau = \inf\{t : x + B_t = 0\}$.

5. Ornstein-Uhlenbeck processes.

6. Radial Brownian motion: $B_t$ a Brownian motion on $\mathbb{R}^d$ and $X_t = \sqrt{\sum_{i=1}^d (B_t^i)^2}$.

7. Solutions of an SDE, $dX = \mu(X)\,dt + \sigma(X)\,dB_t$.

8. A Brownian Bridge is NOT an example. However, it is a time-inhomogeneous Markov process, where $p$ is a function of two times.

Definition 2.14 We define $B_t^x = x + B_t$, i.e. Brownian motion started at $x$.

We check that $(B_t^x, t \ge 0)$ is a Markov process started at $x$ with kernel $p_t(x, dy) = q_t(y - x)\,dy$, where $q_t(z) = \frac{1}{\sqrt{2\pi t}}\, e^{-z^2/2t}$. The short proof is as follows:
\[ \mathbb{P}(B_{t_1}^x \in dx_1, \dots, B_{t_N}^x \in dx_N) = \mathbb{P}\big(B_{t_1}^x \in dx_1,\ B_{t_2}^x - B_{t_1}^x \in d(x_2 - x_1), \dots, B_{t_N}^x - B_{t_{N-1}}^x \in d(x_N - x_{N-1})\big) \]
\[ = q_{t_1}(x_1 - x)\,dx_1\; q_{t_2-t_1}(x_2 - x_1)\,dx_2 \cdots q_{t_N-t_{N-1}}(x_N - x_{N-1})\,dx_N, \]
and then we should integrate both sides over $A_1, \dots, A_N$.

The longer version is as follows:
\[ E(f(B_{t_1}^x, \dots, B_{t_N}^x)) = E\big(g(B_{t_1}^x - x,\ B_{t_2}^x - B_{t_1}^x, \dots, B_{t_N}^x - B_{t_{N-1}}^x)\big), \]
where
\[ g(y_1, \dots, y_N) = f(x + y_1,\ x + y_1 + y_2,\ \dots), \]
so
\[ E(f(B_{t_1}^x, \dots, B_{t_N}^x)) = \int_{\mathbb{R}^N} g(y_1, \dots, y_N)\, q_{t_1}(y_1) \cdots q_{t_N-t_{N-1}}(y_N)\,dy_1 \cdots dy_N = \int_{\mathbb{R}^N} f(x_1, \dots, x_N)\, q_{t_1}(x_1 - x) \cdots q_{t_N-t_{N-1}}(x_N - x_{N-1})\, J\,dx_1 \cdots dx_N, \]
where $J$ is the Jacobian of the change of variables $y_1 = x_1 - x, \dots, y_N = x_N - x_{N-1}$, and equals $1$.
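Implicit in Definition 2.13 is the Chapman–Kolmogorov semigroup property of the kernel, which for the Brownian kernel reads $q_{t+s}(z) = \int q_t(y)\, q_s(z-y)\,dy$. A numerical check of this identity (not in the original notes; plain Python with a midpoint-rule quadrature) runs as follows.

```python
import math

def q(t, z):
    """Gaussian transition density q_t(z) = exp(-z^2 / 2t) / sqrt(2 pi t)."""
    return math.exp(-z * z / (2 * t)) / math.sqrt(2 * math.pi * t)

def convolved(t, s, z, half_width=12.0, n=4000):
    """Midpoint-rule approximation of the convolution of q_t and q_s at z."""
    dy = 2 * half_width / n
    total = 0.0
    for i in range(n):
        y = -half_width + (i + 0.5) * dy
        total += q(t, y) * q(s, z - y)
    return total * dy

t, s = 0.7, 1.4
err = max(abs(convolved(t, s, z) - q(t + s, z)) for z in (-1.0, 0.0, 0.5, 2.0))
# Chapman-Kolmogorov: the convolution should equal q_{t+s}, so err is tiny
```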

Definition 2.15 $(X_t, t \ge 0)$ is a Markov process with transition kernel $p$ means
\[ \mathbb{P}(X_t \in A \mid \mathcal{F}_s^X) = p_{t-s}(X_s, A) \]
for all $s < t$.

This is shorter than the previous definition, and implies it by an induction argument, which we will do.

2.1.1 Conditional Expectation

Suppose that we have a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a random variable $X : \Omega \to \mathbb{R}$. Suppose $\mathcal{G} \subseteq \mathcal{F}$ is a sub-σ-field. Our aim is to define $E(X|\mathcal{G})$ as the closest $\mathcal{G}$-measurable function to $X$.

The natural place for this is $L^2(\Omega, \mathcal{F}, \mathbb{P}) = \{X : \Omega \to \mathbb{R} \mid E(X^2) < \infty\}$, which is an inner product space with inner product $(X,Y) = E(XY)$ and corresponding norm $\|X\|_2 = \sqrt{E(X^2)}$.

For $X \in L^2(\mathcal{F})$ there exists a unique $Y \in L^2(\mathcal{G})$ such that
\[ (X - Y, Z) = 0 \]
for all $Z \in L^2(\mathcal{G})$. This means
\[ E((X-Y)Z) = 0 \iff E(XZ) = E(YZ) \]
for all $Z \in L^2(\mathcal{G})$. It is enough, though, to check this for $Z = \chi_A$, $A \in \mathcal{G}$, by the standard machine of measure theory. The condition is then
\[ \int_A X\,d\mathbb{P} = \int_A Y\,d\mathbb{P} \tag{2.1} \]
and we write $Y = E(X|\mathcal{G})$.

Proposition 2.16 For $X \in L^2(\mathcal{F})$ there is a unique $Y \in L^2(\mathcal{G})$ satisfying (2.1). This can be improved to $X \in L^1(\mathcal{F})$ or to $X \ge 0$.

There are several special cases:

1. $\mathcal{G} = \{\emptyset, \Omega\}$, and then $E(X|\mathcal{G}) = E(X)$.

2. $\mathcal{G} = \{\emptyset, A, A^c, \Omega\}$, and then $E(X|\mathcal{G})$ equals $y_1$ on $A$ and $y_2$ on $A^c$. If $X = \chi_B$ then $y_1 = \mathbb{P}(B|A)$ and $y_2 = \mathbb{P}(B|A^c)$; this is like an extension of Bayes' formula.


3. $\mathcal{G} = \mathcal{F}$, and then $E(X|\mathcal{G}) = X$.

Lemma 2.17 Suppose $X$ is $\mathcal{G}$-measurable, $Z$ is independent of $\mathcal{G}$, and $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is bounded and measurable. Then
\[ E(\varphi(X,Z)|\mathcal{G}) = E(\varphi(x,Z))\big|_{x=X}. \]
This can be proved using the measure theory machine.

Example 2.4

1. $E(B_t|\mathcal{F}_s^B) = E(B_s + (B_t - B_s)|\mathcal{F}_s^B) = B_s$.

2. $E(e^{B_t}|\mathcal{F}_s^B) = E(e^{B_t-B_s}\, e^{B_s}|\mathcal{F}_s^B) = e^{B_s}\, e^{(t-s)/2}$.

3. $E(B_t^2|\mathcal{F}_s^B) = E(B_s^2 + 2B_s(B_t-B_s) + (B_t-B_s)^2\,|\,\mathcal{F}_s^B) = B_s^2 + 2B_s\,E(B_t-B_s) + E(B_t-B_s)^2 = B_s^2 + t - s$.

4. Let $(X,Y)$ be a Gaussian vector on $\mathbb{R}^2$ with mean $0$ and $Y \ne 0$. We postulate that $E(X|\sigma(Y)) = \alpha Y$, and need to find $\alpha$. Write $X = \alpha Y + (X - \alpha Y)$; we need $E(Y(X - \alpha Y)) = 0$, so we choose $\alpha = \frac{E(XY)}{E(Y^2)}$. Then $X - \alpha Y$ is uncorrelated with $Y$, hence (by joint Gaussianity) independent of it, and $E(X|\sigma(Y)) = E(\alpha Y + (X - \alpha Y)|\sigma(Y)) = \alpha Y + 0$.

5. We can do a similar thing for $E\big(\int_0^1 B_s\,ds \,\big|\, \sigma(B_1)\big)$, setting it equal to $\alpha B_1$. We then get $\alpha = 1/2$.
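Example 5 can be checked by simulation: since $E(I B_1) = \int_0^1 s\,ds = 1/2$ and $E(B_1^2) = 1$, the projection coefficient is $\alpha = 1/2$. This sketch (not in the original notes, hypothetical helper name) estimates both moments from simulated paths.

```python
import math
import random

rng = random.Random(5)

def path_integral_and_endpoint(n=200):
    """Return (Riemann sum of the integral of B over [0,1], B_1)
    for one simulated Brownian path."""
    dt = 1.0 / n
    b = total = 0.0
    for _ in range(n):
        b += rng.gauss(0.0, math.sqrt(dt))
        total += b * dt
    return total, b

pairs = [path_integral_and_endpoint() for _ in range(20000)]
num = sum(i * b for i, b in pairs) / len(pairs)   # estimates E(I B_1) = 1/2
den = sum(b * b for _, b in pairs) / len(pairs)   # estimates E(B_1^2) = 1
alpha = num / den
# alpha should be close to 1/2
```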

Lemma 2.18 $X \ge 0$ implies that $E[X|\mathcal{G}] \ge 0$ a.s.

Proof $Y = E[X|\mathcal{G}]$ means that $E(YZ) = E(XZ)$ for bounded $\mathcal{G}$-measurable $Z$. Let $A = \{\omega : Y(\omega) < 0\}$. Then
\[ 0 \le \int_A X\,d\mathbb{P} = \int_A Y\,d\mathbb{P} \le 0, \]
and $A$ is $\mathcal{G}$-measurable, so $\mathbb{P}(A) = 0$. Q.E.D.

We check that Brownian motion is Markov with this new definition. Let $X_t = x + B_t$. Note that $\mathcal{F}_s^X = \mathcal{F}_s^B$, and so, by Lemma 2.17,
\[ \mathbb{P}(X_t \in dy|\mathcal{F}_s^X) = \mathbb{P}(x + B_s + (B_t - B_s) \in dy|\mathcal{F}_s^B) = \mathbb{P}(z + N(0,t-s) \in dy)\big|_{z=x+B_s} = q_{t-s}(y-z)\,dy\big|_{z=x+B_s} = q_{t-s}(y - X_s)\,dy. \]

Suppose now that $X_t = |B_t^x|$; then $\mathcal{F}_s^X \subseteq \mathcal{F}_s^B$, and we guess that the transition kernel is given by
\[ \tilde p_{t-s}(x, dy) = q_{t-s}(y - x)\,dy + q_{t-s}(-y - x)\,dy \quad (y \ge 0), \]
since we can reach either $dy$ or $-dy$ at time $t$. We consider $\mathbb{P}(X_t \in dy|\mathcal{F}_s^X) = \mathbb{P}(|B_t^x| \in dy|\mathcal{F}_s^X)$. Conditioning first on the larger field $\mathcal{F}_s^B$,
\[ \mathbb{P}(|x+B_t| \in dy\,|\,\mathcal{F}_s^B) = \big(q_{t-s}(y - (x+B_s)) + q_{t-s}(-y - (x+B_s))\big)\,dy = \big(q_{t-s}(y - |x+B_s|) + q_{t-s}(-y - |x+B_s|)\big)\,dy = \tilde p_{t-s}(X_s, dy), \]
using the symmetry $q_t(z) = q_t(-z)$; and since this is already $\sigma(X_s)$-measurable, conditioning down to $\mathcal{F}_s^X$ by the tower property does nothing.


However, there is a warning: is $X_t = f(B_t^x)$ still Markov? The answer is usually no; in the case above we were just lucky. For example, consider radial Brownian motion, $X_t = \sqrt{B_1(t)^2 + \dots + B_d(t)^2}$, where we want to show that
\[ E(f(X_{t_1}, \dots, X_{t_N})) = \int f(x_1, \dots, x_N)\, p_{t_1}(x, dx_1) \cdots p_{t_N-t_{N-1}}(x_{N-1}, dx_N). \]
The measure theory machine says that it is enough to check this for $f(x_1, \dots, x_n) = \prod_1^n \varphi_k(x_k)$. We show this by induction:
\[ E\Big(\prod_1^n \varphi_k(X_{t_k})\Big) = E\Big(E\Big(\prod_1^n \varphi_k(X_{t_k})\,\Big|\,\mathcal{F}_{t_{n-1}}^X\Big)\Big) = E\Big(\prod_1^{n-1} \varphi_k(X_{t_k})\, E\big(\varphi_n(X_{t_n})|\mathcal{F}_{t_{n-1}}^X\big)\Big) = E\Big(\prod_1^{n-1} \varphi_k(X_{t_k}) \int \varphi_n(x_n)\, p_{t_n-t_{n-1}}(X_{t_{n-1}}, dx_n)\Big), \]
and then we use the induction hypothesis on the $n-1$ remaining time points, with the last integral absorbed into $\varphi_{n-1}$.

2.2 Strong Markov Process.

Define X_s(ω) = B_{T(ω)+s}(ω) − B_{T(ω)}(ω): is this still a Brownian motion? This is the case if T = T_a = inf{t : B_t = a}, but not if T = sup{t ≤ 1 : B_t = 0}. In other words, T must not look into the future.

Definition 2.19 (Ft, t ≥ 0) is called a filtration on (Ω,F ,P) if Fs ⊆ Ft ⊆ F for s ≤ t.

The key example is (Xt, t ≥ 0) and FXt = σ(Xs : s ≤ t).

Definition 2.20 T : Ω → [0, ∞] is called a stopping time for a filtration (F_t, t ≥ 0) if {T ≤ t} ∈ F_t for all t ≥ 0.

The key example is (F^B_t, t ≥ 0) and T_a = inf{t : B_t = a}. We give two justifications below as to why this is a stopping time. First,

{T_a ≤ t} = {sup_{s≤t} B_s ≥ a} = {sup_{s∈Q∩[0,t]} B_s ≥ a} ∈ F^B_t,

since the supremum over countably many times is a measurable object. Second,

{T_a ≤ t} = ∩_{N=1}^∞ ∪_{q∈Q∩[0,t]} {|B_q − a| ≤ 1/N}.

On the example sheet, we have that T_K = inf{t : B_t ∈ K} for K ⊂ R^d closed is a stopping time; but if K is open then it need not be, as you cannot write {T_K ≤ t} in F^B_t. However, you can write it in F^B_{t+}.

Theorem 2.21 Let B be a Brownian motion and let T be an F^B_t stopping time with T < ∞. Define X_s = B_{T+s} − B_T. Then X is a Brownian motion, independent of F^B_T.

Definition 2.22 (Information up to a stopping time) Suppose that T is a stopping time. Then we define

F_T = {A : A ∩ {T ≤ t} ∈ F_t for all t ≥ 0}


We need to check that F_T is indeed a σ-field, that S ≤ T implies F_S ⊆ F_T for two stopping times, and that T is F_T measurable.

To show the first of these, note that ∅, Ω are clearly in F_T. If A ∈ F_T then

A^c ∩ {T ≤ t} = {T ≤ t} \ (A ∩ {T ≤ t}) ∈ F_t.

If A_1, A_2, ... ∈ F_T then

(∩_1^∞ A_N) ∩ {T ≤ t} = ∩_1^∞ (A_N ∩ {T ≤ t}) ∈ F_t.

To show the last, note that it is enough to check that {T ≤ s} ∈ F_T for all s ∈ R. Indeed

{T ≤ s} ∩ {T ≤ t} = {T ≤ min(s, t)} ∈ F_{min(s,t)} ⊆ F_t.

We now justify the part of the reflection principle that was previously somewhat hand-wavy; this time the argument is rigorous.

We have

P(T_a ≤ t) = P(T_a ≤ t, B_t ≥ a) + P(T_a ≤ t, B_t ≤ a).

The former is P(B_t ≥ a). For the latter, let X_s = B_{T_a+s} − a, a new Brownian motion independent of F^B_{T_a}; then

P(T_a ≤ t, B_t ≤ a) = P(T_a ≤ t, X_{t−T_a} ≤ 0)
= E(P(T_a ≤ t, X_{t−T_a} ≤ 0 | F^B_{T_a}))
= E(χ_{T_a ≤ t} P(X_{t−s} ≤ 0)|_{s=T_a})
= (1/2) P(T_a ≤ t).

Proof (of Theorem 2.21) We first assume that T is discrete, i.e. Ω = ∪_k {T = t_k}, so T only takes the values t_1 < t_2 < .... Pick A ∈ F^B_T and a bounded measurable functional F, and look at

E(F(B_{T+t} − B_T, t ≥ 0) χ_A) = ∑_k E(χ_A χ_{T=t_k} F(B_{t_k+t} − B_{t_k}, t ≥ 0))
= ∑_k E(χ_{A∩{T=t_k}} F(B_{t_k+t} − B_{t_k}, t ≥ 0)).

But A ∩ {T = t_k} = (A ∩ {T ≤ t_k}) \ (A ∩ {T ≤ t_{k−1}}) ∈ F^B_{t_k}, and so by the simple Markov property the above equals

∑_k E(χ_A χ_{T=t_k}) E(F(B_t, t ≥ 0)) = E(F(B_t, t ≥ 0)) P(A).

We now consider general T < ∞ and approximate it by discrete stopping times. Define

T_N = j/N if T ∈ [(j−1)/N, j/N).

Then T_N → T as N → ∞, and B_{T_N} → B_T by the continuity of B. We must verify that T_N is a stopping time:

{T_N ≤ t} = {T < (j−1)/N} ∈ F^B_{(j−1)/N} ⊆ F^B_t


if t ∈ [(j−1)/N, j/N). The above then implies that (B_{T_N+t} − B_{T_N}, t ≥ 0) is a Brownian motion, independent of F^B_{T_N} ⊇ F^B_T. Letting N → ∞ and considering characteristic functions, for A ∈ F^B_T,

E(e^{i∑θ_k(B_{T_N+s_k} − B_{T_N})} χ_A) = E(e^{i∑θ_k(B_{T_N+s_k} − B_{T_N})}) P(A) → E(e^{i∑θ_k(B_{T+s_k} − B_T)}) P(A),

where we have used the DCT in the ultimate limit; the left side converges likewise, giving the claimed distribution and independence. Q.E.D.

We consider the hitting time process (T_a : a ≥ 0). Let S_t = max_{s≤t} B_s; then a ↦ T_a is trying to be the inverse function of t ↦ S_t, but T_a has jumps in any interval (b, c), so this isn't a true inverse.

Proposition 2.23 For 0 ≤ a ≤ b, T_b − T_a is independent of (T_c, c ≤ a), and T_b − T_a D= T_{b−a}.

This is an example of an independent increments process.
Proof Define X_t = B_{T_a+t} − a; this is a new Brownian motion, independent of F^B_{T_a}. Then T_b − T_a = inf{t : X_t = b − a} D= T_{b−a}, and each T_c, c ≤ a, is measurable with respect to F^B_{T_c} ⊆ F^B_{T_a}. Q.E.D.

Proposition 2.24 Let Z = {t : B_t = 0}. Then

1. Leb(Z) = 0.

2. There are no isolated points in Z.

3. Z is uncountable.

4. The Hausdorff dimension of Z is 1/2.

Recall that t ∈ Z is isolated if there exists ε > 0 such that (t − ε, t + ε) ∩ Z = {t}.
Proof We have

Leb(Z) = ∫_0^∞ χ(t ∈ Z) dt = ∫_0^∞ χ(B_t = 0) dt,

and E(Leb(Z)) = ∫_0^∞ P(B_t = 0) dt = 0.
Let τ_s = inf{t ≥ s : B_t = 0}. Then (B_{τ_s+t} − B_{τ_s}) is a new Brownian motion, and the Law of the Iterated Logarithm gives that τ_s is not isolated in Z. Let Z̃ = {τ_s : s ∈ Q}. Is Z̃ = Z? No: the last zero before a given point need not be of the form τ_s for any s.
Pick τ ∈ Z \ Z̃; then τ is not isolated from the left. Indeed, pick a sequence of rationals s_k ↑ τ; then s_k ≤ τ_{s_k} < τ (if τ_{s_k} = τ we would have τ ∈ Z̃), and so τ_{s_k} → τ.
Z is closed, as it is the preimage of a closed set under a continuous map. Thus by the deterministic lemma below, Z is uncountable. Q.E.D.

Lemma 2.25 Suppose that A ⊂ R is closed, has no isolated points, and contains at least two points. Then A is uncountable.

Proof Pick t_0 < t_1 in A and choose disjoint closed balls B_0 = B(t_0, ε_0) and B_1 = B(t_1, ε_1). Since A has no isolated points, choose points t_{00}, t_{01} of A inside B_0 and t_{10}, t_{11} of A inside B_1, and again choose disjoint balls around these points, with radii shrinking to 0. Continue this process.
For each infinite 0-1 string a = a_1 a_2 ... we have a chain B_{a_1} ⊇ B_{a_1 a_2} ⊇ ..., so there exists a unique point t_a in the infinite intersection. A is closed, so t_a ∈ A. If a ≠ b then t_a ≠ t_b, so A is uncountable. Q.E.D.


2.3 Arcsine Laws for Brownian motion

Let M be the (a.s. unique) time in [0, 1] at which B equals its supremum; then

P(M ∈ dt) = dt / (π√(t(1 − t))).

But why is M symmetric in law about 1/2?
For the last zero T = sup{t ≤ 1 : B_t = 0},

P(T ∈ dt) = dt / (π√(t(1 − t))).

Let L = ∫_0^1 χ(B_s > 0) ds, the time spent above zero. Then

P(L ∈ dt) = dt / (π√(t(1 − t))).

These are called arcsine laws because, substituting s = sin²θ,

∫_0^t ds/(π√(s(1 − s))) = (1/π) ∫_0^{sin^{-1}(√t)} (2 sin θ cos θ)/(sin θ cos θ) dθ = (2/π) sin^{-1}(√t).
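The first arcsine law is easy to see numerically: simulate a discretised Brownian path, record the time of its maximum, and compare the empirical CDF at one point with (2/π) sin⁻¹(√t). The grid size and trial count below are arbitrary choices for the sketch:

```python
import random
import math

def arcsine_cdf(t):
    """P(M <= t) = (2/pi) * arcsin(sqrt(t)) for the argmax time M."""
    return (2.0 / math.pi) * math.asin(math.sqrt(t))

random.seed(2)
paths, m = 10_000, 200
hits = 0          # paths whose argmax over [0,1] falls in [0, 0.25]
for _ in range(paths):
    b, best, argmax = 0.0, 0.0, 0.0
    for k in range(1, m + 1):
        b += random.gauss(0.0, 1.0) / math.sqrt(m)
        if b > best:
            best, argmax = b, k / m
    if argmax <= 0.25:
        hits += 1
est = hits / paths
print(round(est, 2), round(arcsine_cdf(0.25), 2))  # both near 1/3
```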

For the second law, let X_r = B_{t+r} − B_t, a new Brownian motion. Then

P(T ≤ t) = P(X does not hit −B_t before time 1 − t),

and, for fixed a,

P(a + X does not hit the origin by time 1 − t) = P(B does not hit a by time 1 − t)
= 1 − P(B does hit a by time 1 − t)
= 1 − 2P(B_{1−t} > a)
= 1 − 2P(N(0, 1 − t) > a)
= 1 − 2P(N(0, 1) > a/√(1 − t))
= 1 − 2 ∫_{a/√(1−t)}^∞ (1/√(2π)) e^{−z²/2} dz
=: 1 − 2Φ(a/√(1 − t)),

where Φ here denotes the standard Gaussian tail probability.

and so, conditioning on B_t (with a = |B_t| by symmetry),

P(T ≤ t) = E(1 − 2Φ(a/√(1 − t))|_{a=|B_t|})
= E(1 − 2Φ(√t |N(0, 1)| / √(1 − t)))
= ∫_{−∞}^∞ (1 − 2Φ(√t |x| / √(1 − t))) (1/√(2π)) e^{−x²/2} dx.


Differentiating in t gives the density

P(T ∈ dt)/dt = (2/π) ∫_0^∞ e^{−tx²/(2(1−t))} e^{−x²/2} · x/(2√t (1 − t)^{3/2}) dx,

which is of the form ∫_0^∞ a x e^{−bx²} dx: the exponents combine to e^{−x²/(2(1−t))}, and ∫_0^∞ x e^{−x²/(2(1−t))} dx = 1 − t, so the density is 1/(π√(t(1 − t))), as claimed.

Theorem 2.26 (Lévy) Suppose S_t = sup_{s≤t} B_s. Then X_t = S_t − B_t is a reflected Brownian motion.

This gives that the last zero of X is equal to the time of maximal value for B, so one would expect the first two arcsine laws to be the same.

Corollary 2.27 T D= M

Remark There is almost a random walk analogue. If we do the same thing for a simple random walk, we get a reflected random walk, but it is "sticky" at the origin. We expect the set of sticky times to be small though, because

∫_0^t χ(B_s = 0) ds = 0.

Proof In order to prove Lévy's theorem, it is left to the reader to check that X is a Markov process with the correct (reflected) transition kernel as given before; one can then find P(X_{t_1} ∈ A_1, ..., X_{t_N} ∈ A_N). We have

P(X_t ∈ dy | F^X_s) = P(X_t ∈ dy | F^B_s) =: (I + II)|_{a = S_s − B_s},

where I and II are as below. Call Y_r = B_{s+r} − B_s; this is a new Brownian motion independent of F^B_s, with running maximum S^Y. There are two possibilities: either the maximum over [0, t] is attained after time s (term I), or before (term II):

I = ∫_a^∞ P(S^Y_{t−s} ∈ dz, Y_{t−s} ∈ dz − y),  II = P(S^Y_{t−s} ≤ a, Y_{t−s} ∈ a − dy).

Q.E.D.

3 Brownian Martingales

Definition 3.1 Fix a filtration (F_t, t ≥ 0). An adapted process (M_t, t ≥ 0) is called an (F_t)-martingale if

E(M_t | F_s) = M_s for all s ≤ t.

Observe that this requires E|M_t| < ∞ for all t ≥ 0.

Theorem 3.2 (Optional Stopping Theorem) If (M_t, t ≥ 0) is an F_t-martingale and T is a bounded stopping time, then

E(M_T) = E(M_0)


There is a financial interpretation of this. M_t could be considered a stock price. The martingale property is then the natural assumption, in discounted money: the best prediction of the future is the present value. The OST tells you that the expected selling value E(M_T) equals the initial value M_0; you cannot make money from a martingale.
Some books use the OST as the definition of a martingale.
"T bounded" means that there exists K ∈ R such that P(T ≤ K) = 1. Consider T_a = inf{t : B_t = a}: this is not bounded. Then B_{T_a} = a so E(B_{T_a}) = a, but E(B_0) = 0, and these are not equal.

Example 3.1 (B_t) is a martingale because E(B_t | F^B_s) = E(B_s + B_t − B_s | F^B_s) = B_s. Take T = T_a ∧ T_b with a < 0 and b > 0; the minimum of two stopping times is a stopping time. T is a.s. finite but not bounded, so we apply the Optional Stopping Theorem at the bounded times T ∧ N: E(B_{T∧N}) = 0, and since |B_{T∧N}| ≤ max(|a|, |b|) the DCT gives E(B_T) = 0. (Always check that this passage to the limit can be justified.) We then have

0 = E(B_T) = b P(T_b < T_a) + a P(T_a < T_b),

and we also have that

1 = P(T_b < T_a) + P(T_a < T_b).

Solving these gives

P(T_b < T_a) = −a/(b − a),  P(T_a < T_b) = b/(b − a).

Example 3.2 (B_t^2 − t, t ≥ 0) is a martingale:

E(B_t^2 | F^B_s) = E(B_s^2 + 2B_s(B_t − B_s) + (B_t − B_s)^2 | F^B_s) = B_s^2 + t − s,

and so E(B_t^2 − t | F^B_s) = B_s^2 − s, as we want for a martingale. The OST says that

E(B^2_{T∧N} − T ∧ N) = 0,

and using the DCT and MCT we get E(B_T^2 − T) = 0, which gives

E(T) = E(B_T^2) = b^2 P(T_b < T_a) + a^2 P(T_a < T_b) = −ab^2/(b − a) + a^2 b/(b − a) = −ab.
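Both conclusions transfer verbatim to the simple symmetric random walk (for integer levels a < 0 < b the same optional stopping argument applies), which makes a quick numerical check possible:

```python
import random

def hit_prob(a, b):
    """P(T_b < T_a) = -a / (b - a), from optional stopping on B_t."""
    return -a / (b - a)

def mean_exit(a, b):
    """E(T_a ∧ T_b) = -a*b, from optional stopping on B_t^2 - t."""
    return -a * b

random.seed(3)
a, b, trials = -3, 2, 20_000
wins, total_steps = 0, 0
for _ in range(trials):
    s, steps = 0, 0
    while a < s < b:
        s += random.choice((-1, 1))
        steps += 1
    wins += (s == b)
    total_steps += steps
print(wins / trials, total_steps / trials)  # near 0.6 and 6.0
```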

Example 3.3 (e^{θB_t − tθ²/2}, t ≥ 0) is a martingale for every θ ∈ R. But how is this a martingale when the process itself seems to become small? The reason is that the process goes to zero but the expectation doesn't. We check the martingale property:

E(e^{θB_t} | F^B_s) = e^{θB_s} E(e^{θ(B_t − B_s)}) = e^{θB_s} e^{θ²(t−s)/2},

and rearranging gives the claim. For a > 0 and θ > 0, the OST at T_a ∧ N together with the DCT gives

E(e^{θB_{T_a∧N} − (T_a∧N)θ²/2}) = 1,


and T_a ∧ N → T_a, B_{T_a∧N} → B_{T_a} = a, so letting N → ∞ gives E(e^{−θ²T_a/2}) = e^{−θa}. Setting λ = θ²/2,

E(e^{−λT_a}) = e^{−√(2λ) a}.

Before, we showed that P(T_a ≤ t) = 2P(B_t ≥ a), which gives the density of T_a, and one can check consistency:

E(e^{−λT_a}) = ∫_0^∞ e^{−λt} (a/t^{3/2}) (1/√(2π)) e^{−a²/2t} dt = ...;

try this integral for yourself.
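The suggested integral can also be checked numerically; the quadrature parameters below are arbitrary. A minimal sketch:

```python
import math

def hitting_density(t, a):
    """Density of T_a = inf{t : B_t = a}: a * t^(-3/2) * phi(a/sqrt(t)),
    obtained by differentiating P(T_a <= t) = 2 P(B_t >= a)."""
    return a / (t ** 1.5 * math.sqrt(2 * math.pi)) * math.exp(-a * a / (2 * t))

a, lam = 1.0, 0.5
# Midpoint rule for integral_0^infinity e^{-lam*t} * density dt; the
# integrand vanishes rapidly at both ends, so [0, 100] suffices.
n, upper = 400_000, 100.0
h = upper / n
integral = sum(math.exp(-lam * (k + 0.5) * h) * hitting_density((k + 0.5) * h, a)
               for k in range(n)) * h
target = math.exp(-a * math.sqrt(2 * lam))   # e^{-sqrt(2*lam)*a}
print(round(integral, 4), round(target, 4))  # both near e^{-1} = 0.3679
```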

The above examples give a hint that many interesting facts about Brownian motioncan be found from the Optional stopping theorem, as applied to some stopping time.

Example 3.4 Ta∧Tb can be done similarly and inversion can be used to give P(Ta∧Tb ∈dt).

Recall that E(e^{−λT}) = φ(λ) = 1 − λE(T) + (λ²/2)E(T²) − ..., so moments can be read off by comparing coefficients.
If we define X_t = B_t − ct with c > 0, then X_t → −∞. Let us find P(T_a < ∞) for a > 0, where T_a = inf{t : X_t = a}. We consider the martingale e^{θB_t − θ²t/2} = e^{θX_t} e^{(θc − θ²/2)t}. Then the OST gives

E(e^{θX_{T_a∧N}} e^{(θc − θ²/2)(T_a∧N)}) = 1.

For θ > 0,

e^{θX_{T_a∧N}} → e^{θa} on {T_a < ∞}, and is dominated by e^{θa};

and for θc − θ²/2 < 0,

e^{(θc − θ²/2)(T_a∧N)} → e^{(θc − θ²/2)T_a} on {T_a < ∞}, and → 0 on {T_a = ∞}.

Setting λ = θ²/2 − θc > 0, we get

E(e^{−λT_a} χ_{T_a<∞}) = e^{−θa}

for θ > 0 with θc − θ²/2 < 0. Now let θ ↓ 2c, so λ ↓ 0, giving P(T_a < ∞) = e^{−2ca}. This makes sense: if a is big, the chance is small, and if c is large, i.e. the drift is strongly negative, the chance is also small.
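There is a discrete analogue of the same computation: for a random walk with up-probability p < 1/2, solving E(e^{θZ}) = 1 and applying optional stopping gives P(ever reach level a) = (p/(1−p))^a, the counterpart of e^{−2ca}. A sketch (the floor truncation is an assumption, just to keep simulated runs finite):

```python
import random

def hit_prob_walk(p, a):
    """P(walk with up-prob p < 1/2 ever reaches level a > 0) = (p/(1-p))**a.
    Solving E(exp(theta*Z)) = 1 gives exp(theta) = (1-p)/p, and optional
    stopping of the exponential martingale yields the formula."""
    return (p / (1.0 - p)) ** a

random.seed(5)
p, a, floor, trials = 0.4, 3, -30, 20_000
hits = 0
for _ in range(trials):
    s = 0
    # Truncate at a deep floor: the chance of recovering from there
    # is (p/(1-p))**(a - floor), which is negligible here.
    while floor < s < a:
        s += 1 if random.random() < p else -1
    hits += (s == a)
print(round(hits / trials, 3), round(hit_prob_walk(p, a), 3))
```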

Theorem 3.3 (OST version 1) Suppose (M_t : t ≥ 0) is an F_t-martingale with continuous paths, T is a bounded stopping time, and M is bounded. Then

E(M_T) = E(M_0)

Proof Suppose first that T is discrete with T ≤ K, say T ∈ {t_1, ..., t_N} where t_1 < t_2 < ... < t_N ≤ K. Then

E(M_T) = ∑_{k=1}^N E(M_{t_k} χ_{T=t_k}).

Since E(M_t | F_s) = M_s, we have E(M_t χ_A) = E(M_s χ_A) for A ∈ F_s. Also {T = t_k} = {T ≤ t_k} \ {T ≤ t_{k−1}} ∈ F_{t_k}, and so

∑_{k=1}^N E(M_{t_k} χ_{T=t_k}) = ∑_{k=1}^N E(M_{t_N} χ_{T=t_k}) = E(M_{t_N}) = E(M_0).


Suppose now that we have a general stopping time T ≤ K. Define T_N = j/N if T ∈ [(j−1)/N, j/N). The T_N are discrete stopping times and T_N → T; also M_{T_N} → M_T by the continuity of paths, and so

E(M_0) = E(M_{T_N}) → E(M_T),

using the DCT, since |M_t(ω)| ≤ C for some constant C and all t. Q.E.D.

Exercise: let B_t = (B^{(1)}_t, ..., B^{(d)}_t) be a Brownian motion on R^d and suppose T = inf{t : B^{(2)}_t = 1}. Find P(B^{(1)}_T ∈ dx). Solve this using the facts that T and B^{(1)} are independent and that P(T ∈ dt) is known.

Proposition 3.4 (Lots of Brownian martingales) Suppose that B is a d-dimensional Brownian motion. Then

f(B_t, t) − ∫_0^t (∂f/∂s (B_s, s) + (1/2)Δf(B_s, s)) ds

is a martingale if f ∈ C^{2,1} and f, ∂f/∂s, ∂f/∂x_j, ∂²f/∂x_i∂x_j are of at most exponential growth.

For example, B_t^2 − t arises from f(x, t) = x², for which (1/2)Δf = 1. Also B_t^4 isn't a martingale, but subtracting ∫_0^t 6B_s^2 ds makes it one. For e^{θB_t − θ²t/2} we choose f(x, t) = e^{θx − θ²t/2} and subtract nothing, since ∂f/∂t + (1/2)Δf = 0.
If B^x_t = x + B_t is d-dimensional Brownian motion started at x, the same statement holds with B_t replaced by B^x_t. We will prove the proposition soon; the proof uses the fact that the Gaussian density solves ∂φ_t/∂t = (1/2)Δφ_t.
Of special importance are f(x, t) satisfying ∂f/∂s + (1/2)Δf = 0, or time-independent f with Δf = 0.
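The fact just quoted, that the Gaussian density solves the heat equation, can be checked by finite differences (the evaluation point and step size below are arbitrary):

```python
import math

def phi(x, t):
    """1-d Gaussian density (2*pi*t)^(-1/2) * exp(-x^2 / (2t))."""
    return math.exp(-x * x / (2 * t)) / math.sqrt(2 * math.pi * t)

# Central differences: d(phi)/dt should equal (1/2) d^2(phi)/dx^2.
x, t, h = 0.7, 1.3, 1e-4
d_t = (phi(x, t + h) - phi(x, t - h)) / (2 * h)
d_xx = (phi(x + h, t) - 2 * phi(x, t) + phi(x - h, t)) / (h * h)
print(abs(d_t - 0.5 * d_xx) < 1e-5)  # True
```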

Consider the Dirichlet problem. Let D ⊂ R^d be open; find u such that

Δu(x) = 0 in D
u(x) = f(x) on ∂D  (3.1)

with f a given function.

Theorem 3.5 Suppose D is a bounded open subset of R^d, and u ∈ C²(R^d) solves (3.1). Then

u(x) = E(f(B^x_T)),

where T = inf{t : B^x_t ∈ ∂D}.

Proof We can modify u outside of D so that it has at most exponential growth. By the above proposition, u(B^x_t) − ∫_0^t (1/2)Δu(B^x_s) ds is then a martingale. The optional stopping theorem, applied at T ∧ N, gives

u(x) = E(u(B^x_{T∧N}) − ∫_0^{T∧N} (1/2)Δu(B^x_s) ds) = E(u(B^x_{T∧N})),

since Δu = 0 inside D. By the DCT we get u(x) = E(u(B^x_T)) = E(f(B^x_T)), since u = f on ∂D. Q.E.D.
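Theorem 3.5 suggests a Monte Carlo method for the Dirichlet problem: run Brownian paths to the boundary and average f. A rough sketch on the unit disc (the step size, walker count and test point are arbitrary, and the Gaussian-step walk only approximates the true exit position):

```python
import random
import math

def dirichlet_mc(x0, y0, f, dt=1e-3, walkers=2000, rng=random):
    """Estimate u(x0, y0) = E f(B_T) on the unit disc by running
    Brownian walkers (Gaussian steps of variance dt) until they exit."""
    total, sdt = 0.0, math.sqrt(dt)
    for _ in range(walkers):
        x, y = x0, y0
        while x * x + y * y < 1.0:
            x += rng.gauss(0.0, 1.0) * sdt
            y += rng.gauss(0.0, 1.0) * sdt
        total += f(x, y)
    return total / walkers

random.seed(7)
# Boundary data f(x,y) = x has harmonic extension u(x,y) = x,
# so the estimate at (0.3, 0.2) should be close to 0.3.
est = dirichlet_mc(0.3, 0.2, lambda x, y: x)
print(round(est, 2))
```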


Example 3.5 Let

u(x) = 1/|x|^{d−2} for d ≥ 3, u(x) = log |x| for d = 2;

this satisfies Δu = 0 except at x = 0 (a check left to the reader, as it has been seen many times before). Let D = {x ∈ R³ : a < |x| < b}, let T_a = inf{t : |B^x_t| = a} and T_b = inf{t : |B^x_t| = b}, and set T = T_a ∧ T_b. Then

1/|x| = E(1/|B^x_T|) = (1/a) P(T_a < T_b) + (1/b) P(T_b < T_a),

with 1 = P(T_a < T_b) + P(T_b < T_a) (T < ∞ by the law of the iterated logarithm), and so

P(T_a < T_b) = (1/|x| − 1/b) / (1/a − 1/b).

If we let b → ∞ we get P(T_a < T_b) → P(T_a < ∞) = a/|x|.

Corollary 3.6 |B_t| → ∞ as t → ∞ in d ≥ 3.

Proof If B_t does not tend to infinity, there exist K and t_N → ∞ with |B_{t_N}| ≤ K. Let T_N = inf{t : |B_t| ≥ N}; the law of the iterated logarithm says T_N < ∞. Then X^{(N)}_t = B_{T_N+t} − B_{T_N} is a Brownian motion and, by the previous example,

P(B hits B(0, K) after time T_N) = (K/N)^{d−2},

and thus

P(∩_{N=K}^∞ {B hits B(0, K) after T_N}) = 0,

and so

P(∪_{K=1}^∞ ∩_{N=K}^∞ {B hits B(0, K) after T_N}) = 0;

but the event that B_t does not tend to infinity is contained in this union. Q.E.D.

For d = 2 the same computation with log gives P(T_a < T_b) = (log b − log |x|)/(log b − log a). Letting b → ∞ gives P(T_a < ∞) = 1. Letting instead a → 0 with b fixed gives P(T_0 < T_b) = 0, and then letting b → ∞ gives P(T_0 < ∞) = 0.

Corollary 3.7 In d = 2, P(B ever hits x) = 0 for x ≠ 0.

Corollary 3.8 Leb({B_t : t ≥ 0}) = 0 a.s. in d = 2, so the path is not a space filling curve.

Proof Leb({B_t : t ≥ 0}) = ∫_{R²} χ_{x ∈ {B_t : t ≥ 0}} dx, and taking expectations,

E(Leb({B_t : t ≥ 0})) = E(∫_{R²} χ_{x ∈ {B_t : t ≥ 0}} dx) = ∫_{R²} P(B ever hits x) dx = 0.

Q.E.D.

Corollary 3.9 The range of (B_t, t ≥ 0) is dense in R².

The Poisson problem is the following, for D ⊂ R^d:

(1/2)Δu = −g in D
u = 0 on ∂D  (3.2)

Here g represents heat pumped in or out, and u is the steady state temperature.


Theorem 3.10 (Poisson version 1) Suppose u ∈ C²(R^d) solves equation (3.2) with D bounded. Then

u(x) = E(∫_0^T g(B^x_s) ds),

with T = inf{t : B^x_t ∈ ∂D}.

Example 3.6 u(x) = (R² − |x|²)/d solves the Poisson problem with g = 1 on a ball of radius R. Then u(x) = E(T), the expected exit time from the ball.

Example 3.7 Suppose D = (0, ∞) ⊂ R and u(x) = −x²/2. However, E(T) ≠ u(x) here: the boundedness of D in the theorem cannot be dropped.

Proof We can modify u away from D to have exponential growth. Then

u(B^x_t) − ∫_0^t (1/2)Δu(B^x_s) ds

is a martingale. The OST then gives

u(x) = E(u(B^x_{T∧N}) − ∫_0^{T∧N} (1/2)Δu(B^x_s) ds) = E(u(B^x_{T∧N}) + ∫_0^{T∧N} g(B^x_s) ds).

The former term tends to E(u(B^x_T)) = 0 by the DCT (u = 0 on ∂D), and the latter tends to E(∫_0^T g(B^x_s) ds) by domination with ||g||_∞ T, for which we need E(T) < ∞. But D is bounded, so T ≤ inf{t : |B^{(1)}_t − x_1| = K} for K large enough, and the right hand side has finite expectation. Q.E.D.

PDE people generally look for solutions to these problems in C²(D) ∩ C(D̄). Sadly, one cannot in general extend such solutions to C²(R^d), or even to C²(D̄).

Example 3.8 (1/2)Δu = −1 on (0, 1)² with u = 0 on the boundary. Then u is not C²([0, 1]²) because of the effects at the corners: at (0, 0) the boundary conditions force Δu = 0, which contradicts Δu = −2.

We can improve, though, to allow solutions in C²(D) ∩ C(D̄) and still conclude that u(x) = E(f(B^x_T)) solves the Dirichlet or Poisson problem. We do this as follows. Shrink the domain by ε, defining D_ε = {x : d(x, D^C) > ε}; then D_ε → D and we can mollify. Let T_ε = inf{t : B_t ∈ ∂D_ε}, so that B_{T_ε} → B_T. One can then believe that u can be modified outside of D_ε to be C²(R^d) with exponential growth, and apply version 1 to D_ε to get u(x) = E(u(B_{T_ε})) → E(f(B_T)) as ε → 0.
By exponential growth, we mean |g(t, x)| ≤ C_0 e^{C_1|x|} for all x, t.

Proof (of Proposition 3.4) Write Lf = ∂f/∂s + (1/2)Δf and M_t = f(B^x_t, t) − ∫_0^t Lf(B^x_s, s) ds. We first show that E|M_t| < ∞ (in one dimension, say):

E(|f(B^x_t, t)|) ≤ C_0 E(e^{C_1|B^x_t|}) ≤ C_0 E(e^{C_1 B^x_t} + e^{−C_1 B^x_t}) ≤ 2C_0 e^{C_1|x|} e^{C_1² t/2} < ∞,

and similarly E(|∫_0^t Lf(B^x_s, s) ds|) < ∞. WLOG we can take x = 0, since shifting preserves exponential growth. Then

E(M_t | F^B_s) = E(f(B_t, t) − ∫_0^t Lf(B_r, r) dr | F^B_s)
= E(f(B_s, s) + (f(B_t, t) − f(B_s, s)) − ∫_0^s Lf(B_r, r) dr − ∫_s^t Lf(B_r, r) dr | F^B_s)
= M_s + E(f(B_s + X_{t−s}, t) − f(B_s, s) − ∫_0^{t−s} Lf(X_r + B_s, s + r) dr | F^B_s)
= M_s + E(f(z + X_{t−s}, t) − f(z, s) − ∫_0^{t−s} Lf(X_r + z, s + r) dr)|_{z=B_s}


where X_r = B_{s+r} − B_s. The result now follows from

E(g(X_t, t) − g(0, 0) − ∫_0^t Lg(X_r, r) dr) = 0,

applied with g(y, r) = f(z + y, s + r). This in turn is the same as

d/dt E(g(X_t, t)) = E(Lg(X_t, t)).

To see this, with φ(x, t) = (2πt)^{−d/2} e^{−|x|²/2t} the Gaussian density,

d/dt E(g(X_t, t)) = d/dt ∫_{R^d} g(x, t) φ(x, t) dx
= ∫_{R^d} (∂g/∂t (x, t) φ(x, t) + g(x, t) ∂φ/∂t (x, t)) dx
= ∫_{R^d} (∂g/∂t (x, t) φ(x, t) + (1/2)(Δφ) g) dx
= ∫_{R^d} (∂g/∂t (x, t) φ(x, t) + (1/2)(Δg) φ) dx
= E(Lg(X_t, t)),

where we used ∂φ/∂t = (1/2)Δφ and then integrated by parts twice, as required. Q.E.D.

Theorem 3.11 (Heat Equation) If u ∈ C^{1,2}([0, ∞) × R^d) is of exponential growth and solves

∂u/∂t = (1/2)Δu, t > 0, x ∈ R^d
u(0, x) = f(x), x ∈ R^d,

then

u(t, x) = E(f(B^x_t)) = ∫_{R^d} f(y) (2πt)^{−d/2} e^{−|x−y|²/2t} dy.

Theorem 3.12 (Heat Equation on a region) Suppose D ⊂ R^d is bounded, and u ∈ C^{1,2}((0, ∞) × D) ∩ C([0, ∞) × D̄) solves

∂u/∂t = (1/2)Δu, t > 0, x ∈ D
u(0, x) = f(x), x ∈ D
u(t, x) = g(x), x ∈ ∂D, t > 0.

Let T = inf{t : B^x_t ∈ ∂D}. Then

u(t, x) = E(g(B^x_T) χ_{T≤t} + f(B^x_t) χ_{T>t}).

We prove both theorems simultaneously.
Proof Fix t > 0 and consider the map

s ↦ u(B^x_s, t − s) − ∫_0^s (−∂u/∂r + (1/2)Δu)(B^x_r, t − r) dr;

in other words, we run backwards in time. On R^d the integrand vanishes by the heat equation, so u(B^x_s, t − s) is a martingale, and the OST gives

u(t, x) = E(u(B^x_t, 0)) = E(f(B^x_t))


as required. On D we stop at T ∧ t, which is a bounded stopping time. Then u(B^x_s, t − s), s ≤ T ∧ t, is a martingale, so

u(t, x) = E(u(B^x_{T∧t}, t − T ∧ t)) = E(χ_{T≤t} g(B^x_T) + χ_{T>t} f(B^x_t)),

as required. Q.E.D.
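As a quick check of Theorem 3.11, take f(y) = y²: then E f(B^x_t) = x² + t, which a direct simulation reproduces (the parameters are arbitrary):

```python
import random
import math

random.seed(8)
x, t, n = 0.5, 2.0, 200_000
# u(t, x) = E f(B^x_t) with f(y) = y^2; exact solution x^2 + t = 2.25.
est = sum((x + math.sqrt(t) * random.gauss(0.0, 1.0)) ** 2
          for _ in range(n)) / n
print(round(est, 2))  # close to 2.25
```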

Example 3.9 Brownian motion staying in a tube. Write u(t, x) = P(|B^x_s| < 1 for all s ≤ t). Then u(t, x) solves the equation

∂u/∂t = (1/2) ∂²u/∂x², x ∈ (−1, 1), t > 0
u(0, x) = 1
u(t, ±1) = 0,

but note that u ∉ C([0, t] × [−1, 1]) (the corners again). The solution is

u(t, x) = ∑_{k=1}^∞ a_k cos((2k − 1)πx/2) e^{−(2k−1)²π²t/8}.

Inspired by this, consider the first term: cos(πx/2) e^{−π²t/8} solves the same equation with initial data u(0, x) = cos(πx/2), and then, with T the exit time of (−1, 1) started from 0,

e^{−π²t/8} = E(cos(πB_t/2) χ_{T>t}) ≤ P(T > t),

and so P(T > t) ≥ e^{−π²t/8}, and this is correct asymptotically.
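The lower bound P(T > t) ≥ e^{−π²t/8} can be compared to a simulated discretised path in the tube. The grid and path counts are arbitrary, and the discretisation slightly overestimates survival, since excursions between grid points are missed:

```python
import random
import math

random.seed(9)
t_end, dt, paths = 1.0, 0.005, 4000
m, sdt = int(t_end / dt), math.sqrt(0.005)
alive = 0
for _ in range(paths):
    b, ok = 0.0, True
    for _ in range(m):
        b += random.gauss(0.0, 1.0) * sdt
        if abs(b) >= 1.0:
            ok = False
            break
    alive += ok
est = alive / paths
lower = math.exp(-math.pi ** 2 * t_end / 8)   # about 0.291
print(round(est, 3), round(lower, 3))
```

The estimate sits above the bound, consistent with the asymptotics (the true survival probability behaves like a constant multiple of e^{−π²t/8}).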

Example 3.10 Find P(B^{x_1}_s, ..., B^{x_d}_s do not collide by time t). Let B^x_t = (B^{x_1}_t, ..., B^{x_d}_t) be a d-dimensional Brownian motion. Let V_d = {x ∈ R^d : x_1 < x_2 < ... < x_d}, which is called a cell, with ∂V_d = {x ∈ R^d : x_i = x_{i+1} for some i}. The non-collision probability solves

∂u/∂t = (1/2)Δu, x ∈ V_d, t > 0
u(0, x) = f(x), x ∈ V_d
u(t, x) = 0, x ∈ ∂V_d,

and this has a famous determinantal solution:

u(t, x) = ∫_{V_d} f(y_1, ..., y_d) det( φ_t(x_i − y_j) )_{i,j=1}^d dy_1 ... dy_d,

where φ_t(z) = (1/√(2πt)) e^{−z²/2t}.

Definition 3.13 φ : R → R is convex means φ(x) = sup{L(x) : L affine, L ≤ φ}.

Note that if φ ∈ C²(R) then φ is convex if and only if φ'' ≥ 0.

Lemma 3.14 (Jensen's Inequality) If E|X| < ∞ and φ : R → R is convex, then

E(φ(X)) ≥ φ(E(X)), and E(φ(X) | F) ≥ φ(E(X | F)) a.s.


Proof For any affine L(x) = ax + b with L ≤ φ,

E(φ(X) | F) ≥ E(L(X) | F) = E(aX + b | F) = a E(X | F) + b = L(E(X | F)) a.s.,

and then taking the supremum over such L gives the result. Note we need to ensure somehow that only countably many L are needed, so the a.s. statements can be combined. Q.E.D.

Corollary 3.15 Suppose (M_t) is a martingale and φ is convex with E|φ(M_t)| < ∞. Then φ(M_t) is a submartingale:

E(φ(M_t) | F_s) ≥ φ(E(M_t | F_s)) = φ(M_s).

Lemma 3.16 (OST for φ(M_t)) Suppose (M_t, t ≥ 0) is a martingale with continuous paths and φ ≥ 0 is a convex function. Suppose T ≤ K is a bounded stopping time. Then

E(φ(M_T)) ≤ E(φ(M_K))

Proof First assume that T is discrete, so T ∈ {t_1, ..., t_M} with t_1 < t_2 < ... < t_M ≤ K. Using the submartingale property of φ(M) and {T = t_k} ∈ F_{t_k},

E(φ(M_T)) = ∑_{k=1}^M E(φ(M_{t_k}) χ_{T=t_k}) ≤ ∑_{k=1}^M E(φ(M_K) χ_{T=t_k}) = E(φ(M_K)).

Now for general T, find discrete stopping times T_N ≤ K with T_N → T; then E(φ(M_{T_N})) ≤ E(φ(M_K)), and using Fatou's lemma we get

E(φ(M_T)) ≤ E(φ(M_K)).

Q.E.D.

We now prove a more general OST, without boundedness of the martingale.

Theorem 3.17 Suppose (M_t, t ≥ 0) is a continuous martingale, and T ≤ K is a bounded stopping time. Then

E(M_T) = E(M_0)

Proof For discrete T this works as above.
Now suppose M_t ≥ 0, and take discrete stopping times T_N → T with E(M_{T_N}) = E(M_0). Writing x = x ∧ L + (x − L)^+, we have

E(M_0) = E(M_{T_N}) = E(M_{T_N} ∧ L) + E((M_{T_N} − L)^+),

and E(M_{T_N} ∧ L) → E(M_T ∧ L) by the DCT, while E(M_T ∧ L) = E(M_T) − E((M_T − L)^+). Fix ε > 0 and choose L large so that, using Lemma 3.16 with the convex function x ↦ (x − L)^+,

E((M_{T_N} − L)^+) ≤ E((M_K − L)^+) < ε,

and then take N large so that

|E(M_{T_N} ∧ L) − E(M_T ∧ L)| ≤ ε.

Finally, truncate a general martingale at +L and −L; this is a bit messier though. Q.E.D.


We now have some final remarks on the Dirichlet problem.
Can we simply define u(x) = E(f(B^x_T)) and check that it solves the Dirichlet problem? Not always. Suppose D = {x ∈ R² : 0 < |x| < 1}, the punctured disc, with boundary data f = 1 on {x : |x| = 1} and f(0) = 0. Since planar Brownian motion never hits the point 0, the formula gives u ≡ 1, which does not satisfy the boundary condition at 0.
However, if u is of that form, then u ∈ C^∞(D) and Δu = 0 on D, always.
We call a point y ∈ ∂D regular if u(x) → f(y) as x → y in D. Thus, if all points of ∂D are regular, then u is a solution. We thus need a sufficient condition for y ∈ ∂D to be regular.

To show the first point, we observe that u being harmonic is the same as u satisfying the ball averaging property, or the sphere averaging property: if B(x, ε) ⊂ D, then

u(x) = (1/|B(x, ε)|) ∫_{B(x,ε)} u(y) dy

or

u(x) = (1/SA_ε) ∫_{∂B(x,ε)} u(y) dS(y),

where SA_ε is the surface area of the sphere. Sphere averaging for our formula is almost obvious. Let S_ε = inf{t : B^x_t ∈ ∂B(x, ε)}, X_t = B_{S_ε+t} − B_{S_ε} and T' = inf{t : X_t + B^x_{S_ε} ∈ ∂D}. Then

u(x) = E(f(B^x_T))
= E(f(B^x_{S_ε} + (B^x_T − B^x_{S_ε})))
= E(f(B^x_{S_ε} + X_{T'}))
= E(E(f(B^x_{S_ε} + X_{T'}) | F^B_{S_ε}))
= E(E(f(z + X_{T'}))|_{z=B^x_{S_ε}})
= E(u(z)|_{z=B^x_{S_ε}})
= E(u(B^x_{S_ε}))

which is sphere averaging, since B^x_{S_ε} is uniform on the sphere ∂B(x, ε).
For the part on regular points, the following is an equivalent definition of regular: P(T_x > ε) → 0 as x → y in D, for all ε > 0, where T_x = inf{t : B^x_t ∈ ∂D}.
This is equivalent to P(σ_y = 0) = 1, where σ_y = inf{t > 0 : B^y_t ∈ D^C}. The 0-1 law says that this probability is either 0 or 1. Remember Lebesgue's thorn: if y sits at the tip of a thin enough spike of D^C then P(σ_y > 0) = 1, and then you cannot solve the Dirichlet problem at y.
A sufficient condition for y to be regular is the cone condition: if there exists a cone in D^C with vertex at y ∈ ∂D, then y is regular. This is because

P(σ_y ≤ ε) ≥ P(B^y_ε ∈ cone) = p(α) > 0,

where α is the solid angle of the cone (p(α) does not depend on ε, by scaling). Letting ε → 0 gives

P(σ_y = 0) ≥ p(α) > 0,

and so the 0-1 law gives P(σ_y = 0) = 1.
See the book by Richard Bass for more on this sort of material.


4 Donsker’s theorem

The idea is to show that random walks converge to Brownian motion. Throughout this chapter, Z_1, Z_2, ... are IID random variables with E(Z_i) = 0 and E(Z_i²) = 1, and we define S_N = ∑_{k=1}^N Z_k, the position at time N. We interpolate the S_N linearly to get S_t:

S_t = (N + 1 − t) S_N + (t − N) S_{N+1}, t ∈ [N, N + 1].

We also define X^{(N)}_t = S_{Nt}/√N. We cannot hope that X^{(N)}_t → B_t pointwise; however, we do have X^{(N)} D→ B, and this is the aim of this section.
We know that

S_N/√N D→ N(0, 1)

by the central limit theorem, and so

X^{(N)}_t = S_{Nt}/√N = S_{⌊Nt⌋}/√N + error = (S_{⌊Nt⌋}/√⌊Nt⌋)(√⌊Nt⌋/√N) + error D→ √t N(0, 1),

we hope. Thus X^{(N)}_t has, in the limit, the same distribution as B_t.

Similarly (X^{(N)}_{t_1}, ..., X^{(N)}_{t_k}) → (B_{t_1}, ..., B_{t_k}), but do we have

max_{t∈[0,1]} X^{(N)}_t D→ max_{t∈[0,1]} B_t  (4.1)

∫_0^1 X^{(N)}_t dt D→ ∫_0^1 B_t dt ∼ N(0, 1/3)  (4.2)

∫_0^1 χ_{X^{(N)}_s > 0} ds D→ ∫_0^1 χ_{B_s > 0} ds ?  (4.3)

(4.2) can be rewritten as ∑_1^N S_K / N^{3/2} D→ N(0, 1/3), and (4.3) can be rewritten, almost, as a statement about (number of times k ≤ N when S_k > 0)/N.

The plan is to think of X^{(N)} := (X^{(N)}_t, t ∈ [0, 1]) and B := (B_t, t ∈ [0, 1]) as random variables in C[0, 1]. We thus need to show that

X^{(N)} D→ B

on C[0, 1]. This is a big improvement of the central limit theorem. We also want

F(X^{(N)}) D→ F(B),

and we observe that X^{(N)} D→ B implies F(X^{(N)}) D→ F(B) whenever F is continuous. In the questions above, the maximum and the integral are continuous as functions C[0, 1] → R, but the last is not.

Definition 4.1 Let (E, d) be a metric space. A measurable map X : (Ω, F, P) → (E, d) (i.e. X^{−1}(B) ∈ F for B Borel in E) is called an E-valued random variable.


We use this with E = C[0, 1], where

d(f, g) = sup_{t∈[0,1]} |f(t) − g(t)|,

an open ball is B(f, ε) = {g : d(f, g) < ε}, and the Borel sets are generated by the open balls.

Lemma 4.2 B(C[0, 1]) = σ(F_0), where

F_0 = { {f : f(t_1) ∈ O_1, ..., f(t_N) ∈ O_N} : t_1, ..., t_N ∈ [0, 1], N ≥ 0, O_i open }.

Proof It is easy to check that F_0 is a π-system.
We check that σ(F_0) ⊂ B(C[0, 1]): each set in F_0 is open. Indeed, if f ∈ {f(t_1) ∈ O_1, ..., f(t_N) ∈ O_N} then, since O_1 is open, there is ε_1 such that d(f, g) < ε_1 implies g(t_1) ∈ O_1, and so on; taking ε = min{ε_1, ..., ε_N} we get B(f, ε) ⊂ {f(t_1) ∈ O_1, ..., f(t_N) ∈ O_N}.
We check that B(C[0, 1]) ⊂ σ(F_0). It is enough to check that closed balls B̄(f, ε) ∈ σ(F_0). Now

B̄(f, ε) = {g : |g(t) − f(t)| ≤ ε for all t ∈ [0, 1]}
= ∩_{t∈[0,1]∩Q} {g : |g(t) − f(t)| ≤ ε}
= ∩_{t∈[0,1]∩Q} ∩_{N≥1} {g : |g(t) − f(t)| < ε + 1/N} ∈ σ(F_0),

and then B(f, ε) = ∪_{N≥1} B̄(f, ε − 1/N). Q.E.D.

An example of this is that B = (B_t, t ∈ [0, 1]) is a C[0, 1]-valued random variable: one observes that {ω : B_{t_1} ∈ O_1, ..., B_{t_N} ∈ O_N} ∈ F. The X^{(N)} are also random variables in C[0, 1]; this follows from the composition of measurable functions Ω → R^N → C[0, 1] given by ω ↦ (Z_1(ω), ..., Z_N(ω)) ↦ X^{(N)}(ω), where the former map is measurable and the latter is continuous.

Definition 4.3 Let X^{(N)}, X be (E, d)-valued random variables. Then X^{(N)} D→ X means

E(F(X^{(N)})) → E(F(X))

for all F : E → R bounded and continuous.

Theorem 4.4 (Donsker)

X^{(N)} D→ B on C[0, 1].

Theorem 4.5 (Continuous Mapping) If X^{(N)} D→ X on (E, d) and G : (E, d) → (E, d) is continuous, then

G(X^{(N)}) D→ G(X) on (E, d).


Proof Take F : E → R bounded and continuous. Then

E(F(G(X^{(N)}))) → E(F(G(X))),

since F ∘ G is bounded and continuous. Q.E.D.

Corollary 4.6

max_{t∈[0,1]} X^{(N)}_t D→ max_{t∈[0,1]} B_t

Proof F(f) = max_{t∈[0,1]} f(t) is a continuous map C[0, 1] → R. Q.E.D.
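Corollary 4.6, combined with the reflection principle (P(max_{[0,1]} B_t ≤ x) = 2Φ(x) − 1), is easy to test against a simple random walk; the walk length and trial count below are arbitrary:

```python
import random
import math

def max_cdf(x):
    """P(max_{[0,1]} B_t <= x) = 2*Phi(x) - 1, from the reflection
    principle; equals erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))

random.seed(10)
n_steps, trials, x = 400, 10_000, 1.0
count = 0
for _ in range(trials):
    s, mx = 0, 0
    for _ in range(n_steps):
        s += random.choice((-1, 1))
        mx = max(mx, s)
    count += (mx / math.sqrt(n_steps) <= x)
est = count / trials
print(round(est, 2), round(max_cdf(x), 2))  # both near 0.68
```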

The map f ↦ ∫_0^1 χ_{(f(t)>0)} dt, however, is not continuous, so ∫_0^1 χ_{(X^{(N)}_t>0)} dt D→ ∫_0^1 χ_{(B_t>0)} dt does not follow directly. Also recall that X^{(N)} D→ X does not imply P(X^{(N)} ∈ (a, b)) → P(X ∈ (a, b)): for example, consider X^{(N)} = 1/N and (a, b) = (0, 1).
Consider the simple random walk S_N. Then S_{2N}/√(2N) D→ N(0, 1), but P(S_{2N}/√(2N) ∈ Q) = 1 ≠ 0 = P(N(0, 1) ∈ Q). However, P(S_{2N}/√(2N) ∈ (a, b)) → P(N(0, 1) ∈ (a, b)).

Theorem 4.7 (Extended Continuous Mapping) Suppose X^{(N)} D→ X on (E, d), F : (E, d) → R is measurable, and Disc(F) = {f : F is discontinuous at f} satisfies P(X ∈ Disc(F)) = 0. Then

F(X^{(N)}) D→ F(X).

This is a big, non-trivial improvement.
For the indicator problem, we guess the discontinuity set: F is discontinuous at f only if ∫_0^1 χ_{(f(t)=0)} dt > 0. But for a Brownian path, ∫_0^1 χ_{(B_t=0)} dt = 0 a.s., and so we have the result we want.

We now try to prove the approximation. We remind ourselves that X^{(N)}_t = S_{Nt}/√N, with S the random walk built from (Z_1, Z_2, ...) IID with mean 0 and variance 1. The aim is to show that X^{(N)} D→ B on C[0, 1] with the supremum norm.
We embed simple random walks as follows. Define T_1 = inf{t : |B_t| = 1}, and then inductively T_{N+1} = inf{t ≥ T_N : |B_t − B_{T_N}| = 1}, and linearly interpolate. We have

(B_{T_1}, B_{T_2}, ...) D= (S_1, S_2, ...)

for the simple symmetric random walk. We are close to a proof. We know that E(T_1) = 1, and the strong Markov property at T_N implies that E(T_{N+1} − T_N) = 1 with T_{N+1} − T_N independent of T_1, ..., T_N, so T_N ≈ N ± O(√N). If we take B^{(N)}_t = B_{Nt}/√N, then we expect X^{(N)} to be close to B^{(N)}.

Lemma 4.8 (Skorokhod) Take Z with E(Z) = 0 and E(Z²) < ∞. Then there exists a stopping time T < ∞ such that B_T D= Z and E(T) = E(Z²).

Financial mathematicians love this lemma. There are at least 14 different ways to prove it.
Note that since B_t² − t is a martingale, E(B²_{T∧N} − T ∧ N) = 0, i.e. E(T ∧ N) = E(B²_{T∧N}), and this, by Fatou, gives E(T) ≥ E(B_T²); so the lemma's E(T) is best possible.
If we take Z independent of B and choose T = inf{t : B_t = Z}, then B_T = Z, but E(T) = E(E(T | σ(Z))) = E(E(T_a)|_{a=Z}) = ∞. This is a bit of a silly example.


Proof We first suppose Z ∈ {a, b} with a < 0 < b; mean zero forces P(Z = a) = b/(b − a) and P(Z = b) = −a/(b − a). Define T = T_{a,b} = inf{t : B_t ∈ {a, b}}; then, by the gambler's ruin computation,

P(B_T = a) = b/(b − a), P(B_T = b) = −a/(b − a),

so B_T D= Z, and E(T) = −ab = E(Z²).
We now take a general Z. Choose random α < 0, β > 0 independent of B, and use T = T_{α,β}. We need to find the distribution of α and β; that is, we need to choose

ν(da, db) = P(α ∈ da, β ∈ db),

and we have the target distributions

µ_+(dz) = P(Z ∈ dz), z ≥ 0,  µ_−(dz) = P(Z ∈ dz), z < 0.

For z ≥ 0 we need

µ_+(dz) = P(B_{T_{α,β}} ∈ dz)
= E(P(B_{T_{α,β}} ∈ dz | σ(α, β)))
= E( (−α/(β − α)) χ(β ∈ dz) )
= ∫∫ (−a/(b − a)) χ(b ∈ dz) ν(da, db)
= ∫ (−a/(z − a)) ν(da, dz),

and so we choose ν(da, dz) = (z − a) µ_+(dz) π(da), where ∫ (−a) π(da) = 1; then we have matched µ_+. For Z < 0 we have

µ_−(dz) = P(B_{T_{α,β}} ∈ dz)
= E( (β/(β − α)) χ(α ∈ dz) )
= ∫∫ (b/(b − a)) χ(a ∈ dz) ν(da, db)
= (∫ b µ_+(db)) π(dz),

and so we choose

π(dz) = µ_−(dz) / ∫ b µ_+(db),

and so α, β are distributed as

ν(da, db) = (b − a) µ_+(db) µ_−(da) / ∫ x µ_+(dx).

We thus have four things to check:

1. P(B_{T_{α,β}} ∈ dz) = P(Z ∈ dz)


2. ∫ (−a) π(da) = 1

3. E(T_{α,β}) = E(Z²)

4. ∫∫ ν(da, db) = 1

These are all true, but we only check 2 and 3; observe that 1 is by construction. For 2, we want to show that

∫ (−a) µ_−(da) = ∫ a µ_+(da),

but we have

0 = E(Z) = ∫ a µ_+(da) + ∫ a µ_−(da),

which is what we want. For 3, observe that

E(T_{α,β}) = E(−αβ)
= ∫∫ (−ab) ν(da, db)
= ∫∫ (−ab)(b − a) µ_+(db) µ_−(da) / ∫ x µ_+(dx)
= ( ∫(−a) dµ_− ∫ b² dµ_+ + ∫ a² dµ_− ∫ b dµ_+ ) / ∫ x dµ_+
= ∫ x² dµ_+ + ∫ x² dµ_− = E(Z²),

using ∫(−a) dµ_− = ∫ b dµ_+ = ∫ x dµ_+.

Q.E.D.
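For a Z with finitely many atoms, all four checks reduce to finite sums, so the construction can be verified exactly in a few lines. The particular law of Z below is an arbitrary mean-zero illustration, not from the notes:

```python
# Z: P(-2)=0.3, P(-1)=0.1, P(1)=0.5, P(2)=0.1 (mean zero).
mu_minus = {-2: 0.3, -1: 0.1}
mu_plus = {1: 0.5, 2: 0.1}
m = sum(b * p for b, p in mu_plus.items())          # integral of b d(mu_+)

# nu(da, db) = (b - a) * mu_plus(db) * mu_minus(da) / m
nu = {(a, b): (b - a) * pb * pa / m
      for a, pa in mu_minus.items() for b, pb in mu_plus.items()}

# Law of B_T: exit of (a, b) lands at b w.p. -a/(b-a), at a w.p. b/(b-a),
# and E(T_{a,b}) = -a*b.
law = {z: 0.0 for z in list(mu_minus) + list(mu_plus)}
expected_T = 0.0
for (a, b), w in nu.items():
    law[b] += w * (-a) / (b - a)
    law[a] += w * b / (b - a)
    expected_T += w * (-a * b)

total_mass = sum(nu.values())                        # check 4: should be 1
ez2 = sum(z * z * p for z, p in {**mu_minus, **mu_plus}.items())
print(total_mass, expected_T, ez2)                   # 1.0, E(T) = E(Z^2)
print(law)                                           # recovers the law of Z
```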

We use a Skorokhod trick: since E(Z²) = 1,

B_{T_{α,β}} D= Z, E(T_{α,β}) = 1.

Take IID copies (α_1, β_1), (α_2, β_2), ... independent of B. Then set T_1 = inf{t : B_t ∈ {α_1, β_1}}, ..., T_{N+1} = inf{t ≥ T_N : B_t − B_{T_N} ∈ {α_{N+1}, β_{N+1}}}, and define (S_1, S_2, ...) = (B_{T_1}, B_{T_2}, ...); this has the random walk distribution that we want. As before,

X^{(N)}_t = S_{Nt}/√N, B^{(N)}_t = B_{Nt}/√N.

The key estimate is

P(||X^{(N)} − B^{(N)}||_∞ > ε) → 0

as N → ∞. Write Ω_{N,ε} = {||X^{(N)} − B^{(N)}||_∞ > ε}. Assume the estimate, and fix F : C[0, 1] → R bounded and uniformly continuous. We get

|E(F(X^{(N)})) − E(F(B^{(N)}))| ≤ E|F(X^{(N)}) − F(B^{(N)})|
≤ 2||F||_∞ P(Ω_{N,ε}) + E(|F(X^{(N)}) − F(B^{(N)})| χ_{Ω^C_{N,ε}}).

Fix η > 0 and use uniform continuity to choose ε so that the second term is at most η/2.


Then choose N large to make the first term at most η/2. We check later that it is enough to use only uniformly continuous functions.

We now show the key estimate. Note that

X^{(N)}_{K/N} := S_K/√N := B_{T_K}/√N = B^{(N)}_{T_K/N} ≈ B^{(N)}_{K/N}

where the approximation is the first gap we need to plug: the difference between the two processes at the grid points K/N. The second gap is the difference at the other parts of the paths.

We plug the first gap. The gaps T₁, T₂ − T₁, T₃ − T₂ and so on are IID with mean 1, so

T_N/N → 1 almost surely

by the strong law of large numbers. Then

max_{k=1,...,N} |T_k/N − k/N| → 0

almost surely, from the above (an Analysis I exercise). Let Ω_{N,δ} = {max_{k=1,...,N} |T_k/N − k/N| ≥ δ}; then

P(Ω_{N,δ}) → 0

as N → ∞.

We now plug the second gap. Suppose ||X^{(N)} − B^{(N)}||_∞ > ε. Then there exists a t ∈ [0, 1] such that

|X^{(N)}_t − B^{(N)}_t| ≥ ε

and suppose that t ∈ [K/N, (K + 1)/N]. Since X^{(N)} is linear on this interval, X^{(N)}_t lies between B^{(N)}_{T_K/N} and B^{(N)}_{T_{K+1}/N}, so either

|B^{(N)}_t − B^{(N)}_{T_K/N}| ≥ ε    or    |B^{(N)}_t − B^{(N)}_{T_{K+1}/N}| ≥ ε

and on Ω^C_{N,δ} the times T_K/N and T_{K+1}/N are within δ + 1/N of t. Hence

P(||X^{(N)} − B^{(N)}||_∞ > ε) ≤ P(Ω_{N,δ}) + P(|B^{(N)}_s − B^{(N)}_t| ≥ ε for some |s − t| ≤ δ + 1/N)
≤ P(Ω_{N,δ}) + P(|B^{(N)}_s − B^{(N)}_t| ≥ ε for some |s − t| ≤ 2δ)

and then choose δ small and N large, using the uniform continuity of Brownian paths on [0, 1]. This concludes the proof of Donsker's theorem, modulo some minor tidy-ups.

Corollary 4.9

∫₀¹ X^{(N)}_t dt →^D ∫₀¹ B_t dt

We apply F(f) = ∫₀¹ f. The left-hand side isn't in a very useful form, so we rewrite it: X^{(N)} is piecewise linear, so the integral is a trapezium sum

∫₀¹ X^{(N)}_t dt = ∑_{K=0}^{N−1} (1/2N) (S_K/√N + S_{K+1}/√N) = (1/N^{3/2}) ∑_{K=0}^{N−1} S_K + S_N/(2N^{3/2})

and so we suspect that

(1/N^{3/2}) ∑_{K=0}^{N−1} S_K →^D N(0, 1/3)

since ∫₀¹ B_t dt is Gaussian with mean zero and variance ∫₀¹ ∫₀¹ min(s, t) ds dt = 1/3.
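A Monte Carlo check of this suspicion (a sketch, not part of the notes): over many simple ±1 random walks, the sample variance of N^{−3/2} ∑_{K<N} S_K should be close to 1/3, the variance of ∫₀¹ B_t dt.

```python
import random

rng = random.Random(1)
N, M = 200, 5000
vals = []
for _ in range(M):
    s, total = 0, 0
    for _ in range(N):
        total += s                 # accumulates S_0 + S_1 + ... + S_{N-1}
        s += rng.choice((-1, 1))
    vals.append(total / N ** 1.5)
mean = sum(vals) / M
var = sum((v - mean) ** 2 for v in vals) / M
print(mean, var)                   # variance should be near 1/3
```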

We need a tidy up lemma:


Lemma 4.10 Suppose that X_N →^D X and Y_N →^D 0. Then X_N + Y_N →^D X.

This is not true if Y_N converges in distribution to a non-zero limit. For example, take X_N = Z ~ N(0, 1) for all N and Y_N = Z for N even, −Z for N odd: then Y_N →^D Z, but X_N + Y_N alternates between 2Z and 0 and has no distributional limit.

Proof We consider characteristic functions:

|E(e^{iθ(X_N+Y_N)}) − E(e^{iθX})| ≤ |E(e^{iθX_N} − e^{iθX})| + |E(e^{iθ(X_N+Y_N)} − e^{iθX_N})|

The former tends to zero as X_N →^D X, and for the latter

|E(e^{iθ(X_N+Y_N)} − e^{iθX_N})| ≤ E|e^{iθY_N} − 1|
≤ √(E|e^{iθY_N} − 1|²)
= √(E(2 − e^{iθY_N} − e^{−iθY_N}))
→ 0

since Y_N →^D 0. Q.E.D.

Now back to the problem in hand. We have F(f) = ∫₀¹ f and X^{(N)} →^D B, and

F(X^{(N)}) = (1/N^{3/2}) ∑_{K=0}^{N−1} S_K + S_N/(2N^{3/2})

Then

E|S_N/(2N^{3/2})| ≤ √(E|S_N/(2N^{3/2})|²) = √(N/(4N³)) = 1/(2N) → 0

This uses the following

Lemma 4.11 If E|X_N| → 0 then X_N →^D 0

and the proof uses

|E(e^{iθX_N} − 1)| ≤ |θ| E|X_N| → 0
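The decay of the error term E|S_N/N^{3/2}| above can be computed exactly, without sampling (a sketch, not from the notes): for the simple ±1 walk, E|S_N| = ∑_k |2k − N| C(N, k) 2^{−N} grows like √(2N/π), so E|S_N|/N^{3/2} is of order 1/N.

```python
from math import comb

def mean_abs_S(N):
    # E|S_N| for the simple random walk, using S_N = 2*Bin(N, 1/2) - N.
    return sum(abs(2 * k - N) * comb(N, k) for k in range(N + 1)) / 2 ** N

errs = [mean_abs_S(N) / N ** 1.5 for N in (10, 100, 1000)]
print(errs)   # decreasing, roughly like sqrt(2/pi)/N
```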

We have another example, with a non-continuous F. Take T_c = inf{N : S_N ≥ c} and we guess that

T_{a√N}/N →^D τ_a = inf{t : B_t = a}

and we check this. We take

F(f) = inf{t : f(t) ≥ a} ∧ 1

where the minimum with 1 is for convenience. Then

F(X^{(N)}) = (T_{a√N}/N) ∧ 1 + error

and the error is at most 1/N. We need only show that

F(X^{(N)}) →^D F(B)


but the problem is that F is not continuous. We thus guess the discontinuity set of F:

Disc(F) ⊂ {f : τ_a(f) < 1, ∃ε > 0 such that f(t) ≤ a for t ∈ [τ_a, τ_a + ε]}

and our aim is to show that

P(B ∈ Disc(F)) = 0

If we define X_t = B_{τ_a+t} − a then X is a new Brownian motion (by the strong Markov property), and P(X_t ≤ 0 for t ∈ [0, ε]) = 0 by the time-inverted law of the iterated logarithm.

We will check the inclusion of the discontinuity set, by proving the complementary statement. Choose δ > 0 and note that sup_{t ≤ F(f)−δ} f(t) = a − ε for some ε > 0. Then if

||g − f||_∞ ≤ ε/2

we have g < a on [0, F(f) − δ], which implies F(g) ≥ F(f) − δ. In other words

lim inf_{g→f} F(g) ≥ F(f)

which is lower semicontinuity, and it holds at every f.

Now take an f not in the set on the right-hand side above: there exist times s_N ↓ F(f) with f(s_N) > a. We need to show that F is continuous at such an f. If f(s_N) = a + ε then any g with ||g − f||_∞ < ε/2 has g(s_N) > a. Thus F(g) ≤ s_N, and so lim sup_{g→f} F(g) ≤ F(f), and F is continuous here.
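A quick numerical check of the guessed convergence (a sketch, not from the notes; the level a√N is rounded to an integer): the reflection principle gives P(τ_a ≤ 1) = 2P(B₁ ≥ a), which for a = 1 is about 0.317, and the walk quantity P(T_{a√N} ≤ N) should be close to it.

```python
import random

rng = random.Random(2)
a, N, M = 1.0, 400, 4000
level = round(a * N ** 0.5)        # the walk must reach a*sqrt(N) within N steps
hits = 0
for _ in range(M):
    s = 0
    for _ in range(N):
        s += rng.choice((-1, 1))
        if s >= level:
            hits += 1
            break
p_hat = hits / M
print(p_hat)                       # compare with 2*(1 - Phi(1)), about 0.317
```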

Theorem 4.12 The following are equivalent for random variables X_N, X taking values in a metric space (E, d).

1. E(F(X_N)) → E(F(X)) for all bounded continuous functions F : E → R.

2. E(F(X_N)) → E(F(X)) for all bounded uniformly continuous functions F : E → R.

3. lim sup_{N→∞} P(X_N ∈ A) ≤ P(X ∈ A) for all closed A.

4. lim inf_{N→∞} P(X_N ∈ A) ≥ P(X ∈ A) for all open A.

5. P(X_N ∈ A) → P(X ∈ A) for all A such that P(X ∈ ∂A) = 0, where ∂A = Ā \ A°.

6. E(F(X_N)) → E(F(X)) for all bounded measurable F such that P(X ∈ Disc(F)) = 0.

Proof 1 =⇒ 2 is immediate.

2 =⇒ 3: For closed A define F_ε(x) = (1 − d(x, A)/ε)⁺; this is uniformly continuous and decreases to χ_A as ε ↓ 0. Then

P(X ∈ A) ← E(F_ε(X)) = lim_{N→∞} E(F_ε(X_N)) ≥ lim sup_{N→∞} P(X_N ∈ A)

where the left arrow is the limit as ε ↓ 0 (using that A is closed) and the inequality holds since F_ε ≥ χ_A.

3 =⇒ 4: P(X ∈ A) = 1 − P(X ∈ A^C), and if A is open then A^C is closed, so apply 3 to A^C.

4 =⇒ 5:

P(X ∈ A°) ≤ lim inf P(X_N ∈ A°) ≤ lim inf P(X_N ∈ A) ≤ lim sup P(X_N ∈ A) ≤ lim sup P(X_N ∈ Ā) ≤ P(X ∈ Ā)

and since

P(X ∈ Ā) − P(X ∈ A°) = P(X ∈ ∂A) = 0

the outer bounds agree, so P(X_N ∈ A) → P(X ∈ A).


5 =⇒ 6: Observe that 5 is the special case F = χ_A, with Disc(χ_A) = ∂A. For a general bounded measurable f, choose α₁ < α₂ < ... < α_{N+1} spanning the (bounded) range of f with |α_{i+1} − α_i| < ε, and approximate f by

f_ε(x) = ∑_{i=1}^{N} α_i χ_{f^{−1}(α_i, α_{i+1}]}(x)

so that |f_ε(x) − f(x)| ≤ ε. It is enough to check E(f_ε(X_N)) → E(f_ε(X)), and for this we apply part 5 with A = f^{−1}(α_i, α_{i+1}].

We claim that

Disc(χ_A) ⊂ Disc(f) ∪ {x : f(x) = α_i} ∪ {x : f(x) = α_{i+1}}

and we prove the complementary statement. Pick x ∉ Disc(f) with f(x) ≠ α_i and f(x) ≠ α_{i+1}. Take x_N → x; then as f is continuous at x we have f(x_N) → f(x), and χ(f(x_N) ∈ (α_i, α_{i+1}]) → χ(f(x) ∈ (α_i, α_{i+1}]) because the indicator is discontinuous only at the values α_i and α_{i+1}.

To apply 5 we also need to know that P(X ∈ {x : f(x) = α_i}) = 0 for each i. To this end let p_α = P(f(X) = α). Now {α : p_α ≥ 1/N} has at most N elements (the p_α sum to at most 1), and so p_α > 0 for only countably many α. Let Q = {α : p_α > 0}; then we need only choose α₁, ..., α_{N+1} not lying in Q. As Q is countable this is easy.

6 =⇒ 1 is immediate. Q.E.D.
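Part 5 can be seen in action with an exact (sampling-free) computation — a sketch, not from the notes. Take X_N = S_N/√N and A = (−∞, x]: the boundary {x} carries no mass for the limit B₁, so the walk probabilities must converge to Φ(x).

```python
from math import comb, erf, sqrt

def cdf_scaled_walk(N, x):
    # P(S_N / sqrt(N) <= x) for the simple walk, using S_N = 2*Bin(N, 1/2) - N.
    return sum(comb(N, k) for k in range(N + 1)
               if (2 * k - N) / sqrt(N) <= x) / 2 ** N

x = 0.5
phi = 0.5 * (1 + erf(x / sqrt(2)))     # P(B_1 <= 0.5), about 0.6915
approx = cdf_scaled_walk(2000, x)
print(approx, phi)
```

The residual gap is the lattice (continuity-correction) effect, of order 1/√N.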

5 Up Periscope

Suppose that a box has N balls, half of which are black and the other half white. We draw at random, without replacement, until the box is empty. Let S_K denote the number of blacks left in the box after draw K minus the number of whites left after draw K. Let X^{(N)}_t = S_{Nt}/√N and we expect:

Theorem 5.1 X^{(N)} →^D Brownian bridge.
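A variance check consistent with the Brownian bridge limit (a sketch, not from the notes; a shuffled draw order simulates drawing without replacement): the bridge has Var at time t equal to t(1 − t), so Var(X^{(N)}_{1/2}) should be near 1/4.

```python
import random

rng = random.Random(3)
N, M = 100, 20000
balls = [1] * (N // 2) + [-1] * (N // 2)     # +1 for black, -1 for white
mid = []
for _ in range(M):
    rng.shuffle(balls)
    # S_{N/2}: black-minus-white count among the balls still in the box
    mid.append(sum(balls[N // 2:]) / N ** 0.5)
m = sum(mid) / M
v = sum((x - m) ** 2 for x in mid) / M
print(v)                                     # compare with t(1-t) = 0.25
```

For sampling without replacement the exact variance is K(N − K)/(N(N − 1)) after scaling, which tends to t(1 − t).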

We consider a population model. Let S_N be the population size at time N. Assume that at each time step every individual independently has two offspring with probability one half and zero offspring with probability one half.

Lemma 5.2

P(S_N > 0 | S_0 = 1) ∼ 2/N

Corollary 5.3

P(S_N > 0 | S_0 = N) = 1 − (1 − P(S_N > 0 | S_0 = 1))^N ∼ 1 − (1 − 2/N)^N ∼ 1 − e^{−2}
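Lemma 5.2 can be checked without sampling (a sketch, not from the notes): the offspring generating function is f(s) = (1 + s²)/2, so the survival probability p_N = P(S_N > 0 | S_0 = 1) satisfies the exact recursion p_{N+1} = 1 − f(1 − p_N) = p_N − p_N²/2 with p_0 = 1, and the lemma says N·p_N → 2.

```python
# Iterate the survival-probability recursion p_{N+1} = p_N - p_N^2 / 2.
p = 1.0
ratios = {}
for n in range(1, 10001):
    p -= p * p / 2
    if n in (10, 100, 1000, 10000):
        ratios[n] = n * p
print(ratios)   # N * p_N creeps up towards 2
```

The convergence is slow (there is a logarithmic correction, 1/p_N = N/2 + O(log N)), which is why the ratio is still visibly below 2 at N = 100.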

If we instead choose S_0 = N and linearly interpolate to get X^{(N)}_t = S_{Nt}/N, we get

Theorem 5.4 X^{(N)} →^D X

where X solves Feller's equation

dX_t = √(X_t) dB_t


Such theorems are called diffusion approximations. One can split the proof into two parts:

1. Soft nonsense: show there exists a convergent subsequence X^{(N_l)}.

2. Characterise the limit points of X^{(N)}.

The first point is to do with compactness.

Theorem 5.5 (Kolmogorov) Suppose X^{(N)} is a continuous process and

E|X^{(N)}_t − X^{(N)}_s| ≤ C|t − s|^{1+ε}

for all s, t ≤ 1 and N ≥ 1. Then X^{(N)} has a subsequence converging in distribution.

One can deduce this from a description of the compact sets K ⊂ C[0, 1]. For example

{f : |f(t) − f(s)| ≤ C₁|t − s|^α for all s, t, |f(0)| ≤ C₂}

is compact, by Arzelà–Ascoli.

The second point is specific to each convergence; we look only at the population example.

How do we characterise Feller's diffusion? We would want

X_{t+Δt} − X_t ≈ √(X_t) (B_{t+Δt} − B_t) ~ N(0, X_t Δt)

The key estimate is the following:

X^{(N)}_{(K+1)/N} − X^{(N)}_{K/N} = S_{K+1}/N − S_K/N

and, given S_K, the right-hand side has mean zero and variance S_K/N² = X^{(N)}_{K/N} · (1/N) (since S_{K+1} − S_K is a sum of S_K independent ±1 steps), which agrees with √(X_t) N(0, Δt) for Δt = 1/N.
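A sketch illustrating the approximation (assumptions: offspring counts simulated by fair bits, and N generations identified with one unit of X-time): the Feller limit started from X_0 = 1 is a martingale with E(X_1) = 1 and Var(X_1) = ∫₀¹ E(X_s) ds = 1, and the rescaled branching process already matches both.

```python
import random

rng = random.Random(4)
N, M = 50, 1000
finals = []
for _ in range(M):
    s = N                                  # S_0 = N, so X_0 = 1
    for _ in range(N):                     # N generations = one unit of X-time
        # each individual leaves 0 or 2 offspring with probability 1/2 each
        s = 2 * sum(rng.getrandbits(1) for _ in range(s))
    finals.append(s / N)                   # X_1 = S_N / N
mean = sum(finals) / M
var = sum((x - mean) ** 2 for x in finals) / M
print(mean, var)                           # both should be near 1
```

For this offspring law the discrete identities are exact: E(S_N | S_0 = N) = N and Var(S_N | S_0 = N) = N², so the Monte Carlo error is the only gap.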
