
Math 635: An Introduction to Brownian Motion and Stochastic Calculus

1. Introduction and review

2. Notions of convergence and results from measure theory

3. Review of Markov chains

4. Change of measure

5. Information and conditional expectations

6. Martingales

7. Brownian motion

8. Stochastic integrals

9. Black-Scholes and other models

10. The multidimensional stochastic calculus

11. Stochastic differential equations

12. Markov property

13. SDEs and partial differential equations

14. Change of measure and asset pricing

15. Martingale representation and completeness

16. Applications and examples

17. Stationary distributions and forward equations

18. Processes with jumps

19. Assignments

20. Problems


February 22 review

• Independence

• Conditional expectations: Basic properties

• Jensen’s inequality

• Functions of known and unknown random variables

• Filtrations and martingales

• Optional sampling theorem

• Doob’s inequalities


1. Introduction and review

• The basic concepts of probability: Models of experiments

• Sample space and events

• Probability measures

• Random variables

• The distribution of a random variable

• Definition of the expectation

• Properties of expectations

• Jensen’s inequality


Experiments

Probability models experiments in which repeated trials typically result in different outcomes.

As a means of understanding the “real world,” probability identifies surprising regularities in highly irregular phenomena.

If we roll a die 100 times, we anticipate that about a sixth of the time the roll is 5.

If that doesn’t happen, we suspect that something is wrong with the die or the way it was rolled.


Probabilities of events

Events are statements about the outcome of the experiment: the roll is 6, the rat died, the television set is defective.

The anticipated regularity is that

$$P(A) \approx \frac{\#\text{ times } A \text{ occurs}}{\#\text{ of trials}}$$

This presumption is called the relative frequency interpretation of probability.
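The relative frequency interpretation can be illustrated numerically. The following is a minimal sketch, assuming a fair die simulated with Python's standard pseudo-random generator; the function name `relative_frequency` and the fixed seed are choices made here for illustration, not part of the notes.

```python
import random

def relative_frequency(face, n_trials, seed=0):
    """Fraction of n_trials simulated fair-die rolls that show `face`."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(1 for _ in range(n_trials) if rng.randint(1, 6) == face)
    return hits / n_trials

# The fraction of rolls equal to 5 should settle near 1/6 as n grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(5, n))
```

For small n the observed fraction can differ noticeably from 1/6; the regularity only emerges as the number of trials grows.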


“Definition” of probability

The probability of an event A should be

$$P(A) = \lim_{n\to\infty} \frac{\#\text{ times } A \text{ occurs in first } n \text{ trials}}{n}$$

The mathematical problem: Make sense out of this.

The real world relationship: Probabilities are predictions about the future.


Random variables

In performing an experiment, numerical measurements or observations are made. Call these random variables since they vary randomly.

Give the quantity a name: $X$

$\{X = a\}$ and $\{a < X < b\}$ are statements about the outcome of the experiment, that is, are events.


The distribution of a random variable

If $X_k$ is the value of $X$ observed on the $k$th trial, then we should have

$$P\{X = a\} = \lim_{n\to\infty} \frac{\#\{k \le n : X_k = a\}}{n}$$

If $X$ has only finitely many possible values, then

$$\sum_{a\in R(X)} P\{X = a\} = 1.$$

This collection of probabilities determines the distribution of $X$.


Distribution function

More generally,

$$P\{X \le x\} = \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} 1_{(-\infty,x]}(X_k)$$

$F_X(x) \equiv P\{X \le x\}$ is the distribution function for $X$.


The law of averages

If $R(X) = \{a_1, \ldots, a_m\}$ is finite, then

$$\lim_{n\to\infty} \frac{X_1 + \cdots + X_n}{n} = \lim_{n\to\infty} \sum_{l=1}^{m} a_l \frac{\#\{k \le n : X_k = a_l\}}{n} = \sum_{l=1}^{m} a_l P\{X = a_l\}$$


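More generally, if $R(X) \subset [c, d]$, $-\infty < c < d < \infty$, then

$$\begin{aligned}
\sum_{l} x_l P\{x_l < X \le x_{l+1}\} &= \lim_{n\to\infty} \sum_{l=1}^{m} x_l \frac{\#\{k \le n : x_l < X_k \le x_{l+1}\}}{n} \\
&\le \lim_{n\to\infty} \frac{X_1 + \cdots + X_n}{n} \\
&\le \lim_{n\to\infty} \sum_{l=1}^{m} x_{l+1} \frac{\#\{k \le n : x_l < X_k \le x_{l+1}\}}{n} \\
&= \sum_{l} x_{l+1} P\{x_l < X \le x_{l+1}\} \\
&= \sum_{l} x_{l+1} \left(F_X(x_{l+1}) - F_X(x_l)\right) \\
&\to \int_c^d x \, dF_X(x)
\end{aligned}$$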


The expectation as a Stieltjes integral

If $R(X) \subset [c, d]$, define

$$E[X] = \int_c^d x \, dF_X(x).$$

If the relative frequency interpretation is valid, then

$$\lim_{n\to\infty} \frac{X_1 + \cdots + X_n}{n} = E[X].$$


A random variable without an expectation

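Example 1.1 Suppose

$$P\{X \le x\} = \frac{x}{1 + x}, \quad x \ge 0.$$

Then

$$\begin{aligned}
\lim_{n\to\infty} \frac{X_1 + \cdots + X_n}{n} &\ge \sum_{l=0}^{m} l \, P\{l < X \le l + 1\} \\
&= \sum_{l=0}^{m} l \left(\frac{l+1}{l+2} - \frac{l}{l+1}\right) \\
&= \sum_{l=0}^{m} \frac{l}{(l+2)(l+1)} \to \infty \text{ as } m \to \infty
\end{aligned}$$

One could say $E[X] = \infty$, and we will.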
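The divergence of the lower bound in Example 1.1 is slow, which a short computation makes visible. This is a sketch only: `partial_sum` is a helper name introduced here, and the values of m are arbitrary. Since $\frac{l}{(l+1)(l+2)} = \frac{2}{l+2} - \frac{1}{l+1}$, the partial sums grow like $\log m$.

```python
def partial_sum(m):
    """Sum of l / ((l + 1)(l + 2)) for l = 0, ..., m (the lower bound in Example 1.1)."""
    return sum(l / ((l + 1) * (l + 2)) for l in range(m + 1))

# Each tenfold increase in m adds roughly log(10) to the sum,
# so the partial sums never level off.
for m in (10, 100, 1_000, 10_000, 100_000):
    print(m, round(partial_sum(m), 4))
```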


Review of basic calculus

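Definition 1.2 A sequence $\{x_n\} \subset \mathbb{R}$ converges to $x \in \mathbb{R}$ ($\lim_{n\to\infty} x_n = x$) if and only if for each $\epsilon > 0$ there exists an $n_\epsilon > 0$ such that $n > n_\epsilon$ implies $|x_n - x| \le \epsilon$.

Let $\{a_k\} \subset \mathbb{R}$. The series $\sum_{k=1}^{\infty} a_k$ converges if $\lim_{n\to\infty} \sum_{k=1}^{n} a_k \in \mathbb{R}$ exists. The series converges absolutely if $\sum_{k=1}^{\infty} |a_k|$ converges. (Or we write $\sum_{k=1}^{\infty} |a_k| < \infty$.)

Examples of things you should know:

$$\lim_{n\to\infty} (a x_n + b y_n) = a \lim_{n\to\infty} x_n + b \lim_{n\to\infty} y_n$$

if the two limits on the right exist.

If $\lim_{n,m\to\infty} |x_n - x_m| = 0$ ($\{x_n\}$ is a Cauchy sequence), then $\lim_{n\to\infty} x_n$ exists.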


Examples

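If $\alpha > 1$, then

$$\sum_{k=1}^{\infty} \frac{1}{k^\alpha} < \infty.$$

For $\alpha = 1$,

$$\sum_{k=1}^{\infty} \frac{1}{k} = \infty;$$

however

$$\sum_{k=1}^{\infty} \frac{(-1)^k}{k} = \lim_{n\to\infty} \sum_{k=1}^{n} \frac{(-1)^k}{k} = \lim_{m\to\infty} \sum_{l=1}^{m} \left(-\frac{1}{2l-1} + \frac{1}{2l}\right) = -\sum_{l=1}^{\infty} \frac{1}{2l(2l-1)}$$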


The sample space

The possible outcomes of the experiment form a set $\Omega$ called the sample space.

Each event (statement about the outcome) can be identified with the subset of the sample space for which the statement is true.


The collection of events

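If

$$A = \{\omega \in \Omega : \text{statement I is true for } \omega\}, \qquad B = \{\omega \in \Omega : \text{statement II is true for } \omega\}$$

Then

$$\begin{aligned}
A \cap B &= \{\omega \in \Omega : \text{statement I and statement II are true for } \omega\} \\
A \cup B &= \{\omega \in \Omega : \text{statement I or statement II is true for } \omega\} \\
A^c &= \{\omega \in \Omega : \text{statement I is not true for } \omega\}
\end{aligned}$$

Let $\mathcal{F}$ be the collection of events. Then $A, B \in \mathcal{F}$ should imply that $A \cap B$, $A \cup B$, and $A^c$ are all in $\mathcal{F}$. $\mathcal{F}$ is an algebra of subsets of $\Omega$.

In fact, we assume that $\mathcal{F}$ is a σ-algebra (closed under countable unions and complements).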


The probability measure

Each event A ∈ F is assigned a probability P (A) ≥ 0.

From the relative frequency interpretation, we must have

P (A ∪B) = P (A) + P (B)

for disjoint events $A$ and $B$ and, by induction, if $A_1, \ldots, A_m$ are disjoint,

$$P(\cup_{k=1}^{m} A_k) = \sum_{k=1}^{m} P(A_k) \qquad \text{(finite additivity)}$$

In fact, we assume countable additivity: If $A_1, A_2, \ldots$ are disjoint events, then

$$P(\cup_{k=1}^{\infty} A_k) = \sum_{k=1}^{\infty} P(A_k).$$

P (Ω) = 1.


A probability space is a measure space

A measure space $(M, \mathcal{M}, \mu)$ consists of a set $M$, a σ-algebra $\mathcal{M}$ of subsets of $M$, and a nonnegative function $\mu$ defined on $\mathcal{M}$ that satisfies $\mu(\emptyset) = 0$ and countable additivity.

A probability space is a measure space $(\Omega, \mathcal{F}, P)$ satisfying $P(\Omega) = 1$.


Random variables

If $X$ is a random variable, then we must know the value of $X$ if we know the outcome $\omega \in \Omega$ of the experiment. Consequently, $X$ is a function defined on $\Omega$.

The statement $X \le c$ must be an event, so

$$\{X \le c\} = \{\omega : X(\omega) \le c\} \in \mathcal{F}.$$

In other words, $X$ is a measurable function on $(\Omega, \mathcal{F}, P)$.

$R(X)$ will denote the range of $X$: $R(X) = \{x \in \mathbb{R} : x = X(\omega), \omega \in \Omega\}$.


Distributions

Definition 1.3 The collection of Borel subsets $\mathcal{B}(\mathbb{R})$ is the smallest σ-algebra of subsets of $\mathbb{R}$ containing $(-\infty, c]$ for all $c \in \mathbb{R}$.

Definition 1.4 The distribution of an $\mathbb{R}$-valued random variable $X$ is the Borel measure defined by $\mu_X(B) = P\{X \in B\}$, $B \in \mathcal{B}(\mathbb{R})$.

µX is called the measure induced by the function X .


Discrete distributions

Definition 1.5 A random variable is discrete or has a discrete distribution if and only if $R(X)$ is countable.

If $X$ is discrete, the distribution of $X$ is determined by the probability mass function

$$p_X(x) = P\{X = x\}, \quad x \in R(X).$$

Note that

$$\sum_{x \in R(X)} P\{X = x\} = 1.$$


Examples

Binomial distribution

$$P\{X = k\} = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0, 1, \ldots, n$$

for some positive integer $n$ and some $0 \le p \le 1$.

Poisson distribution

$$P\{X = k\} = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, \ldots$$

for some $\lambda > 0$.
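As a sanity check on the two mass functions, the sketch below evaluates them numerically and confirms that the probabilities sum to 1. The parameter values and the truncation of the Poisson sum at 100 terms are assumptions made here for illustration; the binomial mean $np$ used in the check is the standard value.

```python
import math

def binomial_pmf(k, n, p):
    """P{X = k} for the binomial distribution with parameters n and p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P{X = k} for the Poisson distribution with parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p, lam = 10, 0.3, 2.5  # illustrative parameter values
binom_total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
binom_mean = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
# The Poisson range is infinite; the tail beyond k = 99 is negligible for lam = 2.5.
poisson_total = sum(poisson_pmf(k, lam) for k in range(100))
print(binom_total, binom_mean, poisson_total)
```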


Absolutely continuous distributions

Definition 1.6 The distribution of $X$ is absolutely continuous if and only if there exists a nonnegative function $f_X$ such that

$$P\{a < X \le b\} = \int_a^b f_X(x)\,dx, \quad a < b \in \mathbb{R}.$$

Then $f_X$ is the probability density function for $X$.


Examples

Normal distribution

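$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Exponential distribution

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases}$$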


Expectations

If $X$ is discrete, then letting $R(X) = \{a_1, a_2, \ldots\}$,

$$X = \sum_i a_i 1_{A_i}$$

where $A_i = \{X = a_i\}$.

If $\sum_i |a_i| P(A_i) < \infty$, then

$$E[X] = \sum_i a_i P\{X = a_i\} = \sum_i a_i P(A_i)$$


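For general $X$, let

$$Y_n = \frac{\lfloor nX \rfloor}{n}, \qquad Z_n = \frac{\lceil nX \rceil}{n}.$$

Then $Y_n \le X \le Z_n$, so we must have $E[Y_n] \le E[X] \le E[Z_n]$. Specifically, if $\sum_k |k| P\{k < X \le k + 1\} < \infty$, which is true if and only if $E[|Y_n|] < \infty$ and $E[|Z_n|] < \infty$ for all $n$ (we will say that $X$ is integrable), then define

$$E[X] \equiv \lim_{n\to\infty} E[Y_n] = \lim_{n\to\infty} E[Z_n]. \tag{1.1}$$

Notation: $E[X] = \int_\Omega X\,dP = \int_\Omega X(\omega) P(d\omega)$.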


Properties

Lemma 1.7 (Monotonicity) If $P\{X \le Y\} = 1$ and $X$ and $Y$ are integrable, then $E[X] \le E[Y]$.

Lemma 1.8 (Positivity) If $P\{X \ge 0\} = 1$, and $X$ is integrable, then $E[X] \ge 0$.

Lemma 1.9 (Linearity) If $X$ and $Y$ are integrable and $a, b \in \mathbb{R}$, then $aX + bY$ is integrable and

E[aX + bY ] = aE[X] + bE[Y ].


Jensen’s inequality

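Lemma 1.10 Let $X$ be a random variable and $\varphi : \mathbb{R} \to \mathbb{R}$ be convex. If $E[|X|] < \infty$ and $E[|\varphi(X)|] < \infty$, then $\varphi(E[X]) \le E[\varphi(X)]$.

Proof. If $\varphi$ is convex, then for each $x$, $\varphi^+(x) = \lim_{y\to x+} \frac{\varphi(y) - \varphi(x)}{y - x}$ exists and

$$\varphi(y) \ge \varphi(x) + \varphi^+(x)(y - x).$$

Setting $\mu = E[X]$,

$$E[\varphi(X)] \ge E[\varphi(\mu) + \varphi^+(\mu)(X - \mu)] = \varphi(\mu) + \varphi^+(\mu) E[X - \mu] = \varphi(\mu).$$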


Consequences of countable additivity

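$P(A^c) = 1 - P(A)$

If $A \subset B$, then $P(B) \ge P(A)$.

If $A_1 \subset A_2 \subset \cdots$, then $P(\cup_{k=1}^{\infty} A_k) = \lim_{n\to\infty} P(A_n)$. Indeed, setting $A_0 = \emptyset$,

$$\begin{aligned}
P(\cup_{k=1}^{\infty} A_k) &= P(\cup_{k=1}^{\infty} (A_k \cap A_{k-1}^c)) = \sum_{k=1}^{\infty} P(A_k \cap A_{k-1}^c) \\
&= \lim_{n\to\infty} \sum_{k=1}^{n} P(A_k \cap A_{k-1}^c) = \lim_{n\to\infty} P(A_n)
\end{aligned}$$

If $A_1 \supset A_2 \supset \cdots$, then $P(\cap_{k=1}^{\infty} A_k) = \lim_{n\to\infty} P(A_n)$. Indeed,

$$A_n = \left(\cap_{k=1}^{\infty} A_k\right) \cup \left(\cup_{k=n}^{\infty} (A_k \cap A_{k+1}^c)\right)$$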


Properties of cumulative distribution functions

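If $X$ is an $\mathbb{R}$-valued random variable ($P\{-\infty < X < \infty\} = 1$), then

$$\lim_{x\to\infty} F_X(x) = \lim_{n\to\infty} P\{X \le n\} = P\{X < \infty\} = 1.$$

$$\lim_{x\to-\infty} F_X(x) = \lim_{n\to\infty} P\{X \le -n\} = P(\cap_n \{X \le -n\}) = 0.$$

$$\begin{aligned}
F_X(x) - F_X(x-) &= \lim_{n\to\infty} \left(F_X(x) - F_X(x - n^{-1})\right) \\
&= \lim_{n\to\infty} P\{x - n^{-1} < X \le x\} = P\{X = x\}.
\end{aligned}$$

Since $F_X(x) - F_X(x-) = P\{X = x\}$, there can be only finitely many discontinuities with $F_X(x) - F_X(x-) \ge n^{-1}$, $n = 1, 2, \ldots$. Consequently, a cdf has at most countably many discontinuities.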


Expectations of nonnegative functions

If $P\{X \ge 0\} = 1$ and $\sum_{l=0}^{\infty} l P\{l < X \le l + 1\} = \infty$, we will define $E[X] = \infty$.

Note, however, whenever I write $E[X]$ I mean that $E[X]$ is finite unless I explicitly allow $E[X] = \infty$.


2. Notions of convergence and results from measure theory

• Three kinds of convergence of random variables

• Limit theorems for expectations and integrals

• Computation of expectations


Three kinds of convergence of random variables

Let $\{X_n\}$ be a sequence of random variables and $X$ another random variable. Let $F_{X_n}(x) = P\{X_n \le x\}$ denote the cdf for $X_n$.

Definition 2.1 Almost sure convergence: $\lim_{n\to\infty} X_n = X$ almost surely (a.s.) if and only if

$$P\{\omega : \lim_{n\to\infty} X_n(\omega) = X(\omega)\} = 1.$$

Convergence in probability: The sequence $X_n$ converges to $X$ in probability if and only if for each $\epsilon > 0$

$$\lim_{n\to\infty} P\{|X_n - X| \ge \epsilon\} = 0.$$

Convergence in distribution: The sequence $X_n$ converges to $X$ in distribution if and only if for each $x$ such that $F_X$ is continuous at $x$

$$\lim_{n\to\infty} F_{X_n}(x) = F_X(x).$$


Examples

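Theorem 2.2 (The strong law of large numbers) Suppose $\xi_1, \xi_2, \ldots$ are independent and identically distributed with $E[|\xi_i|] < \infty$. Then

$$\lim_{n\to\infty} \frac{\xi_1 + \cdots + \xi_n}{n} = E[\xi] \quad \text{a.s.}$$

Theorem 2.3 (The weak law of large numbers) Suppose $\xi_1, \xi_2, \ldots$ are iid with $E[\xi_i^2] < \infty$. Then for $\epsilon > 0$,

$$P\left\{\left|\frac{\xi_1 + \cdots + \xi_n}{n} - E[\xi]\right| \ge \epsilon\right\} \le \frac{Var(\xi)}{n\epsilon^2} \to 0.$$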


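Theorem 2.4 (Central limit theorem) Let $\xi_1, \xi_2, \ldots$ be iid with $E[\xi_i] = \mu$ and $Var(\xi_i) = \sigma^2 < \infty$. Then

$$\lim_{n\to\infty} P\left\{\frac{\sum_{i=1}^{n} \xi_i - n\mu}{\sqrt{n}\,\sigma} \le x\right\} = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,dy$$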


Relationship among notions of convergence

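Lemma 2.5 Almost sure convergence implies convergence in probability. Convergence in probability implies convergence in distribution.

Proof. Suppose $X_n \to X$ a.s. Then for $\epsilon > 0$,

$$A_n \equiv \{\sup_{k \ge n} |X_k - X| \ge \epsilon\} \supset \{|X_n - X| \ge \epsilon\}$$

$A_1 \supset A_2 \supset \cdots$, so $\lim_{n\to\infty} P(A_n) = P(\cap_{k=1}^{\infty} A_k)$.

$\cap_{k=1}^{\infty} A_k = \{\omega : \limsup_{n\to\infty} |X_n - X| \ge \epsilon\}$ and hence $\lim_{n\to\infty} P(A_n) = 0$.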


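Suppose $X_n \to X$ in probability. Then

$$F_{X_n}(x) \le F_X(x + \epsilon) + P\{X_n \le x, X > x + \epsilon\} \le F_X(x + \epsilon) + P\{|X_n - X| \ge \epsilon\}.$$

Therefore,

$$\limsup_{n\to\infty} F_{X_n}(x) \le \lim_{\epsilon\to 0} F_X(x + \epsilon) = F_X(x),$$

and similarly

$$\liminf_{n\to\infty} F_{X_n}(x) \ge \lim_{\epsilon\to 0} F_X(x - \epsilon) = F_X(x-).$$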


Limit theorems for expectations and integrals

Theorem 2.6 (Bounded convergence theorem) Let $\{X_n\}$ be a sequence of random variables such that $X_n \to X$ in distribution. Suppose that there exists $C > 0$ such that $P\{|X_n| > C\} = 0$ for all $n$. Then $\lim_{n\to\infty} E[X_n] = E[X]$.

Proof. Let $a_0 \le -C < a_1 < \cdots < C \le a_m$ be points of continuity for $F_X$. Then

$$\sum_{l=0}^{m-1} a_l P\{a_l < X_n \le a_{l+1}\} \le E[X_n] \le \sum_{l=0}^{m-1} a_{l+1} P\{a_l < X_n \le a_{l+1}\}.$$

Then the left and right sides converge to the left and right sides of

$$\sum_{l=0}^{m-1} a_l P\{a_l < X \le a_{l+1}\} \le E[X] \le \sum_{l=0}^{m-1} a_{l+1} P\{a_l < X \le a_{l+1}\},$$

and $\limsup_{n\to\infty} |E[X_n] - E[X]| \le \max_l (a_{l+1} - a_l)$.


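Lemma 2.7 If $X \ge 0$ a.s., then $\lim_{K\to\infty} E[X \wedge K] = E[X]$.

Proof. If $K > \frac{m}{n}$, then

$$E[X] \ge E[X \wedge K] \ge \sum_{k=1}^{m-1} \frac{k}{n} P\left\{\frac{k}{n} \le X < \frac{k+1}{n}\right\} \xrightarrow{m\to\infty} E\left[\frac{\lfloor nX \rfloor}{n}\right].$$

More generally,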

More generally,

Theorem 2.8 (Monotone convergence theorem) If $0 \le X_1 \le X_2 \le \cdots$ a.s. and $X = \lim_{n\to\infty} X_n$ a.s., then

$$E[X] = \lim_{n\to\infty} E[X_n]$$

allowing $\infty = \infty$.


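Lemma 2.9 (Fatou’s lemma) If $P\{X_n \ge 0\} = 1$ and $X_n$ converges in distribution to $X$, then

$$\liminf_{n\to\infty} E[X_n] \ge E[X]$$

Proof. For each $K > 0$,

$$\liminf_{n\to\infty} E[X_n] \ge \lim_{n\to\infty} E[X_n \wedge K] = E[X \wedge K] \to E[X] \quad \text{as } K \to \infty.$$

The inequality can be strict: let $P\{X_n = n\} = 1 - P\{X_n = 0\} = n^{-1}$. Then $X_n \to 0$ in distribution, and $E[X_n] = 1$ for all $n$.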
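The example following Fatou's lemma can be checked with exact arithmetic. This is a small sketch using hypothetical helper names (`law_of_X`, `expectation`): the mass at 0 tends to 1 while the expectation stays at 1, so the limit of the expectations (1) exceeds the expectation of the limit (0).

```python
from fractions import Fraction

def law_of_X(n):
    """Distribution of X_n as {value: probability}: P{X_n = n} = 1/n, P{X_n = 0} = 1 - 1/n."""
    return {n: Fraction(1, n), 0: 1 - Fraction(1, n)}

def expectation(law):
    """Expectation of a finite discrete distribution given as {value: probability}."""
    return sum(value * p for value, p in law.items())

for n in (1, 10, 100, 1000):
    law = law_of_X(n)
    print(n, expectation(law), float(law[0]))
```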


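Theorem 2.10 (Dominated convergence theorem) Suppose there exists $Y$ such that $|X_n| \le Y$ a.s. with $E[Y] < \infty$ and $\lim_{n\to\infty} X_n = X$ a.s. Then $\lim_{n\to\infty} E[X_n] = E[X]$.

Proof. Applying Fatou’s lemma to the nonnegative sequences $Y - X_n$ and $Y + X_n$,

$$\begin{aligned}
E[Y] - \limsup_{n\to\infty} E[X_n] &= \liminf_{n\to\infty} (E[Y] - E[X_n]) \\
&= \liminf_{n\to\infty} E[Y - X_n] \\
&\ge E[Y - X] = E[Y] - E[X]
\end{aligned}$$

$$\begin{aligned}
E[Y] + \liminf_{n\to\infty} E[X_n] &= \liminf_{n\to\infty} (E[Y] + E[X_n]) \\
&= \liminf_{n\to\infty} E[Y + X_n] \\
&\ge E[Y + X] = E[Y] + E[X]
\end{aligned}$$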


Computation of expectations

Recall $\mu_X(B) = P\{X \in B\}$. $g$ is Borel measurable if $\{x : g(x) \le c\} \in \mathcal{B}(\mathbb{R})$ for all $c \in \mathbb{R}$. Note that $(\mathbb{R}, \mathcal{B}(\mathbb{R}), \mu_X)$ is a probability space and a Borel measurable function $g$ is a “random variable” on this probability space.

Theorem 2.11 If $X$ is a random variable on $(\Omega, \mathcal{F}, P)$ and $g$ is Borel measurable, then $g(X)$ is a random variable and if $E[|g(X)|] < \infty$,

$$E[g(X)] = \int_{\mathbb{R}} g(x)\,\mu_X(dx)$$


Lebesgue measure

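Lebesgue measure is the unique measure $L$ on $\mathcal{B}(\mathbb{R})$ such that for $a < b$, $L((a, b)) = b - a$. Integration with respect to $L$ is denoted

$$\int_{\mathbb{R}} g(x)\,dx \equiv \int_{\mathbb{R}} g(x)\,L(dx)$$

If $f(x) = \sum_{i=1}^{m} a_i 1_{A_i}(x)$, $A_i \in \mathcal{B}(\mathbb{R})$ with $L(A_i) < \infty$, $a_i \in \mathbb{R}$, then

$$\int_{\mathbb{R}} f(x)\,dx \equiv \sum_{i=1}^{m} a_i L(A_i).$$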

Such an f is called a simple function.


General definition of Lebesgue integral

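For $g \ge 0$,

$$\int_{\mathbb{R}} g(x)\,dx \equiv \sup\left\{\int_{\mathbb{R}} f(x)\,dx : f \text{ simple}, 0 \le f \le g\right\}.$$

If $\int_{\mathbb{R}} |g(x)|\,dx < \infty$, then setting $g^+(x) = g(x) \vee 0$ and $g^-(x) = (-g(x)) \vee 0$,

$$\int_{\mathbb{R}} g(x)\,dx \equiv \int_{\mathbb{R}} g^+(x)\,dx - \int_{\mathbb{R}} g^-(x)\,dx.$$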


Riemann integrals

Definition 2.12 Let $-\infty < a < b < \infty$. $g$ defined on $[a, b]$ is Riemann integrable if there exists $I \in \mathbb{R}$ so that for each $\epsilon > 0$ there exists a $\delta > 0$ such that $a = t_0 < t_1 < \cdots < t_m = b$, $s_k \in [t_k, t_{k+1}]$ and $\max_k |t_{k+1} - t_k| \le \delta$ implies

$$\left|\sum_{k=0}^{m-1} g(s_k)(t_{k+1} - t_k) - I\right| \le \epsilon.$$

Then $I$ is denoted by

$$\int_a^b g(x)\,dx$$

$\sum_{k=0}^{m-1} g(s_k)(t_{k+1} - t_k)$ is called a Riemann sum. The Riemann integral is the limit of Riemann sums.


Some properties of Lebesgue integrals

Theorem 2.13 If $g$ is Riemann integrable, then $g$ is Lebesgue integrable and the integrals agree.

Fatou’s lemma, the monotone convergence theorem, and the dominated convergence theorem all hold for the Lebesgue integral.


Distributions with Lebesgue densities

Suppose

$$\mu_X(B) = \int_{\mathbb{R}} 1_B(x) f_X(x)\,dx, \quad B \in \mathcal{B}(\mathbb{R}).$$

Then $f_X$ is a probability density function for $X$.

Theorem 2.14 Let $X$ be a random variable on $(\Omega, \mathcal{F}, P)$ with a probability density function $f_X$. If $g$ is Borel measurable and $E[|g(X)|] < \infty$, then

$$E[g(X)] = \int_{\mathbb{R}} g(x) f_X(x)\,dx$$



3. Review of Markov chains

• Elementary definition of conditional probability

• Independence

• Markov property

• Transition matrix

• Simulation of a Markov chain


Elementary definition of conditional probability

For two events $A, B \in \mathcal{F}$, the definition of the conditional probability of $A$ given $B$ ($P(A|B)$) is intended to capture how we would reassess the probability of $A$ if we knew that $B$ occurred. The relative frequency interpretation suggests that we only consider trials of the experiment on which $B$ occurs. Consequently,

$$P(A|B) \approx \frac{\#\text{ times that } A \text{ and } B \text{ occur}}{\#\text{ times } B \text{ occurs}} = \frac{\#\text{ times that } A \text{ and } B \text{ occur} / \#\text{ trials}}{\#\text{ times } B \text{ occurs} / \#\text{ trials}} \approx \frac{P(A \cap B)}{P(B)}$$

leading to the definition $P(A|B) = \frac{P(A \cap B)}{P(B)}$. Note that for each fixed $B \in \mathcal{F}$ with $P(B) > 0$, $P(\cdot|B)$ is a probability measure on $\mathcal{F}$.


Independence

If knowing that $B$ occurs doesn’t change our assessment of the probability that $A$ occurs ($P(A|B) = P(A)$), then we say $A$ is independent of $B$. By the definition of $P(A|B)$, independence is equivalent to

$$P(A \cap B) = P(A)P(B).$$

$\{A_k, k \in I\}$ are mutually independent if

$$P(A_{k_1} \cap \cdots \cap A_{k_m}) = \prod_{i=1}^{m} P(A_{k_i})$$

for all choices of $\{k_1, \ldots, k_m\} \subset I$.


Independence of random variables

Random variables $\{X_k, k \in I\}$ are mutually independent if

$$P\{X_{k_1} \in B_1, \ldots, X_{k_m} \in B_m\} = \prod_{i=1}^{m} P\{X_{k_i} \in B_i\}$$

for all choices of $\{k_1, \ldots, k_m\} \subset I$ and $B_1, \ldots, B_m \in \mathcal{B}(\mathbb{R})$.


Joint distributions

The collection of Borel subsets of $\mathbb{R}^m$ ($\mathcal{B}(\mathbb{R}^m)$) is the smallest σ-algebra containing the sets $(-\infty, c_1] \times \cdots \times (-\infty, c_m]$.

The joint distribution of $(X_1, \ldots, X_m)$ is the measure on $\mathcal{B}(\mathbb{R}^m)$ defined by

$$\mu_{X_1,\ldots,X_m}(B) = P\{(X_1, \ldots, X_m) \in B\}, \quad B \in \mathcal{B}(\mathbb{R}^m).$$

Lemma 3.1 The joint distribution of $(X_1, \ldots, X_m)$ is uniquely determined by the joint cdf

$$F_{X_1,\ldots,X_m}(x_1, \ldots, x_m) = P\{X_1 \le x_1, \ldots, X_m \le x_m\}.$$


The Markov property

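Let $X_0, X_1, \ldots$ be positive, integer-valued random variables. Then

$$\begin{aligned}
P\{X_0 = i_0, \ldots, X_m = i_m\} &= P\{X_m = i_m | X_0 = i_0, \ldots, X_{m-1} = i_{m-1}\} \\
&\quad \times P\{X_{m-1} = i_{m-1} | X_0 = i_0, \ldots, X_{m-2} = i_{m-2}\} \\
&\quad \times \cdots \times P\{X_1 = i_1 | X_0 = i_0\} P\{X_0 = i_0\}
\end{aligned}$$

$X$ satisfies the Markov property if for each $m$ and all choices of $\{i_k\}$,

$$P\{X_m = i_m | X_0 = i_0, \ldots, X_{m-1} = i_{m-1}\} = P\{X_m = i_m | X_{m-1} = i_{m-1}\}$$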


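The transition matrix

Suppose $P\{X_m = j | X_{m-1} = i\} = p_{ij}$. (The transition probabilities do not depend on $m$.) The matrix $P = ((p_{ij}))$ satisfies

$$\sum_j p_{ij} = 1.$$

$$\begin{aligned}
P\{X_{m+2} = j | X_m = i\} &= \sum_k P\{X_{m+2} = j, X_{m+1} = k | X_m = i\} \\
&= \sum_k P\{X_{m+2} = j | X_{m+1} = k, X_m = i\} P\{X_{m+1} = k | X_m = i\} \\
&= \sum_k p_{ik} p_{kj} \equiv p^{(2)}_{ij}
\end{aligned}$$

Then

$$((p^{(2)}_{ij})) = P^2.$$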


The joint distribution of a Markov chain

The joint distribution of $X_0, X_1, \ldots$ is determined by the transition matrix and the initial distribution, $\nu_i = P\{X_0 = i\}$.

$$P\{X_0 = i_0, \ldots, X_m = i_m\} = \nu_{i_0} p_{i_0 i_1} p_{i_1 i_2} \cdots p_{i_{m-1} i_m}.$$

In general,

$$P\{X_{m+n} = j | X_m = i\} = p^{(n)}_{ij},$$

where

$$((p^{(n)}_{ij})) = P^n.$$
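The identity between n-step transition probabilities and matrix powers can be checked on a small example. The following sketch uses a hypothetical three-state transition matrix chosen for illustration and plain nested lists rather than a linear algebra library; `mat_mul` and `mat_pow` are helper names introduced here.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """n-th power of a square matrix by repeated multiplication (n >= 0)."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

# Hypothetical transition matrix on states 1, 2, 3 (rows sum to 1).
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

P2 = mat_pow(P, 2)
# Two-step probability from state 1 to state 3, summed over the intermediate state k.
p2_13 = sum(P[0][k] * P[k][2] for k in range(3))
print(P2[0][2], p2_13)
```

Both printed numbers are the same quantity $p^{(2)}_{13}$, computed once as an entry of $P^2$ and once from the sum $\sum_k p_{1k} p_{k3}$.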


Simulation of a Markov chain

Let $E$ be the state space (the set of values the chain can assume) of the Markov chain. For simplicity, assume $E = \{1, \ldots, N\}$ or $\{1, 2, \ldots\}$. Define $H : E \times [0, 1] \to E$ by

$$H(i, u) = j, \quad \sum_{k=1}^{j-1} p_{ik} \le u < \sum_{k=1}^{j} p_{ik}$$

and $V : [0, 1] \to E$ by

$$V(u) = j, \quad \sum_{k=1}^{j-1} \nu_k \le u < \sum_{k=1}^{j} \nu_k.$$

If $\xi$ is uniform $[0, 1]$, then

$$P\{V(\xi) = j\} = P\left\{\sum_{k=1}^{j-1} \nu_k \le \xi < \sum_{k=1}^{j} \nu_k\right\} = \nu_j.$$


Theorem 3.2 Let $\xi_0, \xi_1, \ldots$ be iid uniform $[0, 1]$ random variables, and define

$$X_0 = V(\xi_0), \quad X_{n+1} = H(X_n, \xi_{n+1}).$$

Then $\{X_n\}$ is a Markov chain with initial distribution $\{\nu_k\}$ and transition matrix $((p_{ij}))$.

Proof. As noted above, $X_0$ has distribution $\{\nu_k\}$. Note that $X_k$ is a function of $\xi_0, \ldots, \xi_k$ and hence is independent of $\xi_{k+1}, \xi_{k+2}, \ldots$ Therefore,

$$\begin{aligned}
P\{X_m = i_m | X_0 = i_0, \ldots, X_{m-1} = i_{m-1}\} &= P\{H(X_{m-1}, \xi_m) = i_m | X_0 = i_0, \ldots, X_{m-1} = i_{m-1}\} \\
&= P\{H(i_{m-1}, \xi_m) = i_m | X_0 = i_0, \ldots, X_{m-1} = i_{m-1}\} \\
&= P\{H(i_{m-1}, \xi_m) = i_m\} \\
&= p_{i_{m-1} i_m}.
\end{aligned}$$
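The construction in Theorem 3.2 translates almost line for line into code. The following is a minimal sketch for a finite state space $E = \{1, \ldots, N\}$: the functions `V` and `H` follow the definitions in the notes, while `simulate`, the example matrix, the initial distribution, and the seed are assumptions made here for illustration.

```python
import random
from itertools import accumulate

def V(u, nu):
    """Return the state j (1-based) with sum_{k<j} nu_k <= u < sum_{k<=j} nu_k."""
    for j, cumulative in enumerate(accumulate(nu), start=1):
        if u < cumulative:
            return j
    return len(nu)  # guard against round-off in the last cumulative sum

def H(i, u, P):
    """Return the next state given current state i and uniform value u, using row i of P."""
    return V(u, P[i - 1])

def simulate(nu, P, n_steps, rng):
    """Generate X_0, ..., X_{n_steps} via X_0 = V(xi_0), X_{n+1} = H(X_n, xi_{n+1})."""
    x = V(rng.random(), nu)
    path = [x]
    for _ in range(n_steps):
        x = H(x, rng.random(), P)
        path.append(x)
    return path

# Hypothetical chain on {1, 2, 3}.
nu = [0.25, 0.5, 0.25]
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
path = simulate(nu, P, 20_000, random.Random(0))
print(path[:20])
```

Because each step consumes a fresh uniform value, the next state depends on the past only through the current state, which is the content of the proof above.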


Stationary distributions

Definition 3.3 A probability distribution $\pi = \{\pi_k\}$ is a stationary distribution for a Markov chain with transition matrix $P = ((p_{ij}))$ if

$$\sum_k \pi_k p_{kj} = \pi_j$$

Lemma 3.4 If $\pi$ is a stationary distribution for the Markov chain $\{X_n\}$ and $X_0$ has distribution $\pi$, then for each $n$, $X_n$ has distribution $\pi$.

Proof. Noting that

$$P\{X_1 = j\} = \sum_k P\{X_1 = j, X_0 = k\} = \sum_k \pi_k p_{kj} = \pi_j,$$

the lemma follows by induction.

Note that $\pi^T P = \pi^T$, so that $\pi^T$ is a left eigenvector of $P$ for the eigenvalue 1.
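One way to find a stationary distribution numerically is to apply the map $\pi \mapsto \pi P$ repeatedly. This is a sketch of that idea (power iteration, a technique not described in the notes) for a hypothetical three-state matrix; convergence from an arbitrary starting distribution is assumed here and holds for this example because the chain is irreducible and aperiodic.

```python
def step(pi, P):
    """One application of pi -> pi P (row vector times matrix)."""
    n = len(P)
    return [sum(pi[k] * P[k][j] for k in range(n)) for j in range(n)]

# Hypothetical transition matrix on states 1, 2, 3.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]

pi = [1.0, 0.0, 0.0]  # start with all mass on state 1
for _ in range(200):
    pi = step(pi, P)

print(pi)            # approximately [0.25, 0.5, 0.25]
print(step(pi, P))   # unchanged by one more step, i.e. pi P = pi
```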


The ergodic theorem

Definition 3.5 A Markov chain is irreducible if for each $i, j \in E$, there exists $n$ such that $p^{(n)}_{ij} > 0$.

Theorem 3.6 If $\{X_n\}$ is irreducible and has stationary distribution $\pi$, then for each bounded function $f$ on $E$,

$$\lim_{n\to\infty} \frac{f(X_0) + \cdots + f(X_n)}{n + 1} = \sum_k \pi_k f(k).$$


4. Change of measure

• Defining a change of measure

• Expectations under the new measure

• Absolute continuity and the Radon-Nikodym theorem

• Equivalent measures


Defining a change of measure

Lemma 4.1 Let $Z$ be a nonnegative random variable on $(\Omega, \mathcal{F}, P)$ such that $E[Z] \equiv E^P[Z] = 1$. Define

$$\tilde{P}(A) = E^P[1_A Z], \quad A \in \mathcal{F}. \tag{4.1}$$

Then $\tilde{P}$ is a probability measure on $\mathcal{F}$.


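Proof. $\tilde{P}(\Omega) = E^P[1_\Omega Z] = E^P[Z] = 1$.

Suppose $A_1, A_2, \ldots \in \mathcal{F}$ are disjoint. Then

$$1_{\cup_{i=1}^{\infty} A_i} Z = \sum_{i=1}^{\infty} 1_{A_i} Z$$

and

$$\begin{aligned}
\tilde{P}(\cup_{i=1}^{\infty} A_i) &= E^P[1_{\cup_{i=1}^{\infty} A_i} Z] = E^P\left[\sum_{i=1}^{\infty} 1_{A_i} Z\right] \\
&= \lim_{n\to\infty} E^P\left[\sum_{i=1}^{n} 1_{A_i} Z\right] = \lim_{n\to\infty} \sum_{i=1}^{n} E^P[1_{A_i} Z] \\
&= \sum_{i=1}^{\infty} \tilde{P}(A_i)
\end{aligned}$$

The third equality follows by the monotone convergence theorem.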


Expectations under the new measure

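Lemma 4.2 Let $\tilde{P}$ be given by (4.1). Suppose $E^P[|X|Z] < \infty$. Then $E^{\tilde{P}}[X] = E^P[XZ]$.

Proof. Since

$$E^{\tilde{P}}[1_{A_i}] = \tilde{P}(A_i) = E^P[1_{A_i} Z],$$

if $X = \sum_{i=1}^{m} a_i 1_{A_i}$, then

$$E^{\tilde{P}}[X] = \sum_{i=1}^{m} a_i E^{\tilde{P}}[1_{A_i}] = \sum_{i=1}^{m} a_i E^P[1_{A_i} Z] = E^P[XZ].$$

The result follows for general $X$ by approximation.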


Absolute continuity and the Radon-Nikodym theorem

Definition 4.3 Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\tilde{P}$ be another probability measure defined on $\mathcal{F}$. Then $\tilde{P}$ is absolutely continuous with respect to $P$ if and only if $P(A) = 0$ implies $\tilde{P}(A) = 0$.

Theorem 4.4 (Radon-Nikodym theorem) $\tilde{P}$ is absolutely continuous with respect to $P$ if and only if there exists a nonnegative random variable $Z$ such that $\tilde{P}(A) = E^P[1_A Z]$, $A \in \mathcal{F}$.


Equivalent measures

Definition 4.5 Probability measures $P$ and $\tilde{P}$ on $\mathcal{F}$ are equivalent if and only if $P$ is absolutely continuous with respect to $\tilde{P}$ and $\tilde{P}$ is absolutely continuous with respect to $P$.

Lemma 4.6 $P$ and $\tilde{P}$ are equivalent if and only if there exists a nonnegative random variable $Z$ such that $P\{Z > 0\} = 1$ and $\tilde{P}(A) = E^P[1_A Z]$, $A \in \mathcal{F}$. In addition,

$$P(A) = E^{\tilde{P}}[1_A Z^{-1}].$$


Example

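Let $X$ be a standard normal random variable on $(\Omega, \mathcal{F}, P)$, and for $\theta \in \mathbb{R}$, define $Z_\theta = \exp\{\theta X - \frac{1}{2}\theta^2\}$. Then $E^P[Z_\theta] = 1$.

Define $P_\theta(A) = E^P[1_A Z_\theta]$. Then

$$\begin{aligned}
P_\theta\{X \le x\} &= E^P[1_{(-\infty,x]}(X) Z_\theta] \\
&= E^P\left[1_{(-\infty,x]}(X) \exp\{\theta X - \tfrac{1}{2}\theta^2\}\right] \\
&= \int_{-\infty}^{\infty} 1_{(-\infty,x]}(z) \exp\{\theta z - \tfrac{1}{2}\theta^2\} \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}z^2}\,dz \\
&= \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(z - \theta)^2\}\,dz
\end{aligned}$$

so that under $P_\theta$, $X$ is normally distributed with $E^{P_\theta}[X] = \theta$ and $Var^{P_\theta}(X) = 1$.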
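The change of measure in this example can be checked by Monte Carlo: sample $X$ under $P$, weight by $Z_\theta$, and compare with the normal cdf shifted by $\theta$. The sketch below assumes particular values of $\theta$ and $x$, a fixed seed, and the helper names `normal_cdf` and `tilted_cdf_estimate`, none of which come from the notes.

```python
import math
import random

def normal_cdf(x):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def tilted_cdf_estimate(x, theta, n_samples, seed=0):
    """Monte Carlo estimate of E^P[1{X <= x} Z_theta], with X standard normal under P."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        if z <= x:
            total += math.exp(theta * z - 0.5 * theta**2)  # weight Z_theta
    return total / n_samples

theta, x = 0.5, 1.0  # illustrative values
estimate = tilted_cdf_estimate(x, theta, 200_000)
exact = normal_cdf(x - theta)  # cdf of a normal with mean theta and variance 1, at x
print(estimate, exact)
```

The two numbers should agree up to Monte Carlo error, reflecting that $X$ has mean $\theta$ and variance 1 under $P_\theta$.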


5. Information and conditional expectations

• Modeling information

• Information obtained by observing a random variable

• Information evolving in time

• Information obtained by observing a stochastic process

• Discrete observations

• Approximating random variables using available information

• Conditional expectations

• Independence

• Properties of conditional expectations

• Definition of a Markov process


Modeling information

Recall, events correspond to statements, the event being the set of outcomes for which the statement is true.

Available information corresponds to the collection of statements whose truth can be checked with that information.

We model information by the collection $\mathcal{D}$ of events corresponding to those statements.

Clearly, $\mathcal{D}$ is closed under finite intersections, finite unions, and complements, that is, $\mathcal{D}$ is an algebra. We assume that it is a σ-algebra.


Information obtained by observing a random variable

If $X$ is a random variable and the available information is the information obtained by observing the value of $X$, then that information corresponds to $\sigma(X)$, the smallest σ-algebra with respect to which $X$ is measurable.

Lemma 5.1 $\sigma(X) = \{\{X \in B\} : B \in \mathcal{B}(\mathbb{R})\}$.

Proof. Check that the right side is a σ-algebra.


Information evolving in time

We will assume that “time” is continuous and identified with [0,∞).

As time evolves, more information is obtained and, assuming that nothing is forgotten, the information available at time $t$, $\mathcal{F}_t$, includes the information available at time $s$ for all $s < t$, that is, for $s < t$, $\mathcal{F}_s \subset \mathcal{F}_t$.

Definition 5.2 A filtration is an increasing family $\{\mathcal{F}_t\}$ of σ-algebras indexed by $[0, \infty)$: $0 < s < t$ implies $\mathcal{F}_s \subset \mathcal{F}_t$.


Information obtained by observing a stochastic process

A (continuous time) stochastic process is a family of random variables $\{X(t), t \in [0, \infty)\}$ indexed by $[0, \infty)$.

Definition 5.3 The natural filtration corresponding to a stochastic process $X$ is the filtration given by $\mathcal{F}^X_t = \sigma(X(s), s \le t)$, $t \ge 0$.

$\sigma(X(s), s \le t)$ is the smallest σ-algebra with respect to which $X(s)$ is measurable for each $0 \le s \le t$.

Definition 5.4 A stochastic process is adapted to a filtration $\{\mathcal{F}_t\}$ if and only if for each $t \ge 0$, $X(t)$ is $\mathcal{F}_t$-measurable. In particular, $\mathcal{F}^X_t \subset \mathcal{F}_t$.


Discrete observations

Let $\{D_k, k = 1, 2, \ldots\}$ be a partition of $\Omega$, that is, a countable collection of disjoint sets whose union is $\Omega$.

$$\sigma(\{D_k\}) = \{\cup_{k \in K} D_k : K \subset \{1, 2, \ldots\}\}$$

(Check that the right side is a σ-algebra.)


Approximating random variables using available information

Let $X$ be a random variable and $\mathcal{D}$ a σ-algebra modeling the available information. We want to approximate $X$ using the available information.

If $Y$ is the approximation, what must be true about $Y$? For example, $\{Y \le c\}$ must be in $\mathcal{D}$, that is, $Y$ is $\mathcal{D}$-measurable.

If $\mathcal{D} = \sigma(\{D_k\})$, then for each $k$, we “know” whether or not $\omega \in D_k$. For each $k$, there must be a constant $d_k$ such that if $\omega \in D_k$, then $Y(\omega) = d_k$, that is,

$$Y = \sum_k d_k 1_{D_k}.$$


Conditional expectations

We want $Y$ to be a “good” approximation, that is, we want the error $X - Y$ to be small, at least on average.

Assuming $E[X^2] < \infty$, select $Y$ to minimize

$$E[(X - Y)^2].$$

Then for $\mathcal{D}$-measurable $Z$,

$$E[(X - (Y + \epsilon Z))^2] = E[(X - Y)^2] - 2\epsilon E[Z(X - Y)] + \epsilon^2 E[Z^2]$$

has its minimum at $\epsilon = 0$. Differentiating, we must have

$$E[Z(X - Y)] = 0.$$


Definition of conditional expectation

Definition 5.5 $Y = E[X|\mathcal{D}]$ (the conditional expectation of $X$ given $\mathcal{D}$) if

a) $Y$ is $\mathcal{D}$-measurable.

b) For each $D \in \mathcal{D}$, $E[1_D X] = E[1_D Y]$.

Note that the definition makes sense for all $X$ with $E[|X|] < \infty$.


Existence and uniqueness of the conditional expectation

Existence follows by the Radon-Nikodym theorem, since for nonnegative $X$, $Q(D) = E[1_D X]$ defines a measure on $\mathcal{D}$ that is absolutely continuous with respect to $P$ on $\mathcal{D}$.

Uniqueness follows by the observation that if $Y$ and $\tilde{Y}$ satisfy the conditions of the definition, then

$$E[1_{\{Y > \tilde{Y}\}}(Y - \tilde{Y})] = 0 = E[1_{\{Y < \tilde{Y}\}}(Y - \tilde{Y})].$$

It follows that $P\{Y = \tilde{Y}\} = 1$.


Example

Suppose that $\{D_k\}$ is a partition and $\mathcal{D} = \sigma(\{D_k\})$. Then

$$E[X|\mathcal{D}] = \sum_k \frac{E[X 1_{D_k}]}{P(D_k)} 1_{D_k}.$$
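On a finite sample space the partition formula can be evaluated exactly. The sketch below is an illustration under assumptions made here: one roll of a fair die, $X(\omega) = \omega$, the partition into odd and even outcomes, and every cell having positive probability. The function name `conditional_expectation` is hypothetical.

```python
from fractions import Fraction

omega = [1, 2, 3, 4, 5, 6]                  # sample space: one roll of a die
prob = {w: Fraction(1, 6) for w in omega}   # P({w}) for a fair die
X = {w: w for w in omega}                   # X(w) = value of the roll
partition = [{1, 3, 5}, {2, 4, 6}]          # D_1 = odd, D_2 = even

def conditional_expectation(X, prob, partition):
    """Return Y = E[X | sigma(partition)] as a dict w -> Y(w).

    On each cell D_k (assumed to have P(D_k) > 0), Y equals E[X 1_{D_k}] / P(D_k).
    """
    Y = {}
    for D in partition:
        p_D = sum(prob[w] for w in D)
        value = sum(X[w] * prob[w] for w in D) / p_D
        for w in D:
            Y[w] = value
    return Y

Y = conditional_expectation(X, prob, partition)
print(Y)  # constant 3 on odd outcomes, constant 4 on even outcomes
```

The result is constant on each cell of the partition, which is what measurability with respect to $\sigma(\{D_k\})$ requires, and it satisfies $E[1_D X] = E[1_D Y]$ on each cell.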


Independence

As in the elementary definition of independence, we could say that a random variable $X$ is independent of the (information) σ-algebra $\mathcal{D}$ if $E[g(X)|\mathcal{D}] = E[g(X)]$ for every bounded, measurable $g$ (every $g \in B(\mathbb{R})$).

This identity holds for all bounded measurable $g$ if and only if

$$E[g(X) 1_D] = E[g(X)] P(D), \quad g \in B(\mathbb{R}), D \in \mathcal{D}.$$

To be precise:

Definition 5.6 σ-algebras $\mathcal{D}_1$ and $\mathcal{D}_2$ are independent if and only if $P(D_1 \cap D_2) = P(D_1) P(D_2)$ for all $D_1 \in \mathcal{D}_1$, $D_2 \in \mathcal{D}_2$.

A random variable $X$ is independent of a σ-algebra $\mathcal{D}$ if and only if

$$P(\{X \in B\} \cap D) = P\{X \in B\} P(D), \quad B \in \mathcal{B}(\mathbb{R}), D \in \mathcal{D},$$

that is, the σ-algebras $\mathcal{D}$ and $\sigma(X)$ are independent.


Consequences of independence

Lemma 5.7 If $X$ and $Y$ are independent and $g, h : \mathbb{R} \to \mathbb{R}$ are Borel measurable, then $g(X)$ and $h(Y)$ are independent. If in addition, $g(X)$ and $h(Y)$ are integrable, then

$$E[g(X) h(Y)] = E[g(X)] E[h(Y)]. \tag{5.1}$$

Proof. The first part of the lemma follows from the fact that

$$\{g(X) \in B\} = \{X \in g^{-1}(B)\}, \quad \{h(Y) \in C\} = \{Y \in h^{-1}(C)\}$$

where $g^{-1}(B) = \{x : g(x) \in B\}$ is Borel for Borel $B$ by the definition of Borel measurable.

To prove (5.1), check first for indicator functions, then for simple functions, and complete the proof by approximation. (In other words, apply what Shreve calls the “standard machine.”)


Joint distributions of independent random variables

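Lemma 5.8 If $(X_1, \ldots, X_m)$ is independent of $(Y_1, \ldots, Y_n)$ and $C \in \mathcal{B}(\mathbb{R}^m)$, $D \in \mathcal{B}(\mathbb{R}^n)$, then

$$\mu_{X_1,\ldots,X_m,Y_1,\ldots,Y_n}(C \times D) = \mu_{X_1,\ldots,X_m}(C)\,\mu_{Y_1,\ldots,Y_n}(D)$$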


The Tonelli and Fubini theorems

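Theorem 5.9 Suppose $X$ and $Y$ are independent and $\psi : \mathbb{R}^2 \to [0, \infty)$. Define $\varphi_X(y) = E[\psi(X, y)]$ and $\varphi_Y(x) = E[\psi(x, Y)]$. Then

$$\begin{aligned}
E[\psi(X, Y)] &= E[\varphi_X(Y)] = E[\varphi_Y(X)] \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \psi(x, y)\,\mu_X(dx)\,\mu_Y(dy) = \int_{\mathbb{R}} \int_{\mathbb{R}} \psi(x, y)\,\mu_Y(dy)\,\mu_X(dx)
\end{aligned} \tag{5.2}$$

If $\psi : \mathbb{R}^2 \to \mathbb{R}$ is Borel measurable and $E[|\psi(X, Y)|] < \infty$, then (5.2) holds.

Note that the theorem will also hold for independent random vectors.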


Conditioning on random variables

If $X_1, \ldots, X_m$ are random variables, then we will write

$$E[Y|\sigma(X_1, \ldots, X_m)] = E[Y|X_1, \ldots, X_m]$$

Lemma 5.10 Suppose that $Z$ is $\sigma(X_1, \ldots, X_m)$-measurable. Then there exists a $\mathcal{B}(\mathbb{R}^m)$-measurable $h$ such that

$$Z = h(X_1, \ldots, X_m).$$

Consequently, if $Y$ is integrable, there exists a $\mathcal{B}(\mathbb{R}^m)$-measurable $h_Y$ such that

$$E[Y|X_1, \ldots, X_m] = h_Y(X_1, \ldots, X_m)$$


Properties of conditional expectations

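Theorem 5.11 Let $\mathcal{D}, \mathcal{G} \subset \mathcal{F}$ be σ-algebras, $X$, $Y$ integrable random variables, and $a, b \in \mathbb{R}$. Then

a) [Linearity]

$$E[aX + bY|\mathcal{D}] = aE[X|\mathcal{D}] + bE[Y|\mathcal{D}].$$

b) [Positivity] If $P\{X \ge 0\} = 1$, then $P\{E[X|\mathcal{D}] \ge 0\} = 1$.

c) [Monotonicity] If $P\{X \ge Y\} = 1$, then $P\{E[X|\mathcal{D}] \ge E[Y|\mathcal{D}]\} = 1$.

d) [Factoring out known quantity] If $Y$ is $\mathcal{D}$-measurable and $X$ and $XY$ are integrable,

$$E[XY|\mathcal{D}] = Y E[X|\mathcal{D}].$$

If $Y$ is $\mathcal{D}$-measurable, then $E[Y|\mathcal{D}] = Y$.

e) [Iterated conditioning] If $\mathcal{D} \subset \mathcal{G}$,

$$E[E[X|\mathcal{G}]|\mathcal{D}] = E[X|\mathcal{D}].$$

f) [Independence] If $X$ is independent of $\mathcal{D}$, then

$$E[X|\mathcal{D}] = E[X].$$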


Proof. In each case, check the measurability in Part (a) of the definition of conditional expectation and then verify the integral identity in Part (b).

a)

$$\begin{aligned}
E[(aE[X|\mathcal{D}] + bE[Y|\mathcal{D}]) 1_D] &= aE[E[X|\mathcal{D}] 1_D] + bE[E[Y|\mathcal{D}] 1_D] \\
&= aE[X 1_D] + bE[Y 1_D] \\
&= E[(aX + bY) 1_D]
\end{aligned}$$

d) We want

$$E[Y E[X|\mathcal{D}] 1_D] = E[Y X 1_D]$$

First, let $Y$ be an indicator, then a simple $\mathcal{D}$-measurable random variable, and then complete the argument by approximation.

e) For $D \in \mathcal{D} \subset \mathcal{G}$,

$$E[E[X|\mathcal{D}] 1_D] = E[X 1_D] = E[E[X|\mathcal{G}] 1_D],$$

where the last equality follows from the fact that $D \in \mathcal{G}$.

f) For $D \in \mathcal{D}$,

$$E[E[X] 1_D] = E[X] E[1_D] = E[X 1_D]$$


Jensen’s inequality

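Theorem 5.12 If $\varphi$ is convex and $X$ and $\varphi(X)$ are integrable,

$$E[\varphi(X)|\mathcal{D}] \ge \varphi(E[X|\mathcal{D}]).$$

Proof. As before, if $\varphi$ is convex, then for each $x$, $\varphi^+(x) = \lim_{y\to x+} \frac{\varphi(y) - \varphi(x)}{y - x}$ exists and

$$\varphi(y) \ge \varphi(x) + \varphi^+(x)(y - x).$$

Setting $Y = E[X|\mathcal{D}]$,

$$\begin{aligned}
E[\varphi(X)|\mathcal{D}] &\ge E[\varphi(Y) + \varphi^+(Y)(X - Y)|\mathcal{D}] \\
&= \varphi(Y) + \varphi^+(Y) E[X - Y|\mathcal{D}] = \varphi(Y).
\end{aligned}$$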


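Functions of known and unknown random variables

Lemma 5.13 Suppose that $X$ is independent of $\mathcal{D}$ and $Y$ is $\mathcal{D}$-measurable. Suppose that $\varphi : \mathbb{R}^2 \to \mathbb{R}$ and $E[|\varphi(X, Y)|] < \infty$. Define $\psi(y) = E[\varphi(X, y)]$. Then

$$E[\varphi(X, Y)|\mathcal{D}] = \psi(Y).$$

Proof. Since $Y$ is $\mathcal{D}$-measurable, $\psi(Y)$ is $\mathcal{D}$-measurable. For $D \in \mathcal{D}$, $X$ is independent of $(Y, 1_D)$. Consequently, by the Fubini theorem,

$$E[\varphi(X, Y) 1_D] = \int_{\mathbb{R}^2} \int_{\mathbb{R}} \varphi(x, y)\,z\,\mu_X(dx)\,\mu_{Y,1_D}(dy \times dz) = E[\psi(Y) 1_D].$$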


Recursive generation of Markov chains

Theorem 5.15 Let $X_0, \xi_1, \xi_2, \ldots$ be independent $\mathbb{R}$-valued random variables. For $n = 1, 2, \ldots$, let $H_n : \mathbb{R}^2 \to \mathbb{R}$, and define

$$X_n = H_n(X_{n-1}, \xi_n).$$

Then $\{X_n\}$ is Markov with respect to $\mathcal{F}_n = \sigma(X_0, \xi_1, \ldots, \xi_n)$.

Proof. Let $f$ be bounded and measurable, and define

$$T_n f(x) = \int_{\mathbb{R}} f(H_n(x, z))\,\mu_{\xi_n}(dz)$$

By Lemma 5.13,

$$E[f(X_n)|\mathcal{F}_{n-1}] = E[f(H_n(X_{n-1}, \xi_n))|\mathcal{F}_{n-1}] = T_n f(X_{n-1}).$$


Definition of a Markov process

Definition 5.16 If $X$ is a stochastic process adapted to a filtration $\{\mathcal{F}_t\}$, then $X$ is $\{\mathcal{F}_t\}$-Markov (or Markov with respect to $\{\mathcal{F}_t\}$) if and only if for $s, t \ge 0$,

$$E[f(X(t + s))|\mathcal{F}_t] = E[f(X(t + s))|X(t)].$$


6. Martingales

• Definitions of martingale and sub/supermartingale

• Stopping times

• Optional sampling theorem

• Doob’s inequalities


Definitions of martingale and sub/supermartingale

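Definition 6.1 Let $\{\mathcal{F}_t\}$ be a filtration and $X$ a stochastic process adapted to $\{\mathcal{F}_t\}$ such that $X(t)$ is integrable for each $t$.

a) $X$ is a $\{\mathcal{F}_t\}$-martingale if and only if

$$E[X(s)|\mathcal{F}_t] = X(t), \quad 0 \le t \le s$$

b) $X$ is a $\{\mathcal{F}_t\}$-submartingale if and only if

$$E[X(s)|\mathcal{F}_t] \ge X(t), \quad 0 \le t \le s$$

c) $X$ is a $\{\mathcal{F}_t\}$-supermartingale if and only if

$$E[X(s)|\mathcal{F}_t] \le X(t), \quad 0 \le t \le s$$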


Examples

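Let $\xi_1, \xi_2, \ldots$ be iid, and define $S_n = \sum_{k=1}^{n} \xi_k$ and $\mathcal{F}_n = \sigma(\xi_1, \ldots, \xi_n)$.

If $E[\xi_k] = 0$, then $S_n$ is a $\{\mathcal{F}_n\}$-martingale.

If $\rho = E[e^{\xi_k}] < \infty$, then

$$X_n = \frac{e^{S_n}}{\rho^n}$$

is a $\{\mathcal{F}_n\}$-martingale.

Also, see Problem 5.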


Applications of Jensen’s inequality

Lemma 6.2 Let $X$ be a martingale and $\varphi$ be a convex function. If $Y(t) = \varphi(X(t))$ is integrable for each $t$, then $Y$ is a submartingale.

Let $X$ be a submartingale and $\varphi$ be a convex, nondecreasing function. If $Z(t) = \varphi(X(t))$ is integrable for each $t$, then $Z$ is a submartingale.

Proof. Suppose $X$ is a martingale. By Jensen’s inequality, for $s \ge t$,

$$E[\varphi(X(s))|\mathcal{F}_t] \ge \varphi(E[X(s)|\mathcal{F}_t]) = \varphi(X(t)).$$

Suppose $X$ is a submartingale and $\varphi$ is nondecreasing. Then

$$E[\varphi(X(s))|\mathcal{F}_t] \ge \varphi(E[X(s)|\mathcal{F}_t]) \ge \varphi(X(t)).$$


Stopping times

Definition 6.3 Let $\{\mathcal{F}_t\}$ be a filtration. Then a nonnegative (or more generally $[0, \infty]$-valued) random variable $\tau$ is a $\{\mathcal{F}_t\}$-stopping time if and only if $\{\tau \le t\} \in \mathcal{F}_t$ for each $t \ge 0$.

If $\tau$ is the time that an alarm clock goes off and $\mathcal{F}_t$ is the information available to an observer, then the observer hears the alarm clock go off.


Examples

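Lemma 6.4 Let X be a continuous, R-valued process adapted to Ft, and define $\tau_c = \inf\{t : X(t) \ge c\}$. Then τc is a Ft-stopping time.

Proof. For t ≥ 0,

$$\{\tau_c \le t\} = \bigcap_n \bigcup_{s \in \mathbb{Q},\, s \le t} \{X(s) \ge c - n^{-1}\} \in \mathcal{F}_t.$$

Lemma 6.5 Let X be a continuous, $\mathbb{R}^d$-valued process adapted to Ft, and let $K \subset \mathbb{R}^d$ be closed. Define $\tau_K = \inf\{t : X(t) \in K\}$. Then τK is a Ft-stopping time.

Proof. Let $O_n = \{x : \inf_{y \in K} |x - y| < n^{-1}\}$. For t ≥ 0,

$$\{\tau_K \le t\} = \bigcap_n \bigcup_{s \in \mathbb{Q},\, s \le t} \{X(s) \in O_n\} \in \mathcal{F}_t.$$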


Information up to a stopping time

Definition 6.6 Let τ be a Ft-stopping time. Then

$$\mathcal{F}_\tau = \{A \in \mathcal{F} : A \cap \{\tau \le t\} \in \mathcal{F}_t,\ t \ge 0\}$$

Lemma 6.7 a) If τ is a Ft-stopping time, then Fτ is a σ-algebra.

b) If τ1 and τ2 are Ft-stopping times and τ1 ≤ τ2, then $\mathcal{F}_{\tau_1} \subset \mathcal{F}_{\tau_2}$.

c) If X is right continuous and Ft-adapted and τ is a Ft-stopping time, then X(t ∧ τ) is Ft-measurable and Fτ-measurable.


Optional sampling theorem

Theorem 6.8 Let X be a continuous Ft-submartingale and τ1 and τ2 be Ft-stopping times. Then

$$E[X(t \wedge \tau_2) \mid \mathcal{F}_{\tau_1}] \ge X(t \wedge \tau_1 \wedge \tau_2).$$

Note that if X is a supermartingale, the inequality is reversed and if X is a martingale, the inequality can be replaced by equality.


Doob’s inequalities

Theorem 6.9 Let X be a right continuous, nonnegative submartingale. Then for c > 0,

$$P\{\sup_{s \le t} X(s) \ge c\} \le \frac{E[X(t)]}{c}$$

Proof. Let $\tau_c = \inf\{t : X(t) \ge c\}$. Then

$$E[X(t)] \ge E[X(t \wedge \tau_c)] \ge c\,P\{\tau_c \le t\}.$$

(Actually, for right continuous processes, τc may not be a stopping time unless one first modifies the filtration Ft, but the modification can be done without affecting the statement of the theorem.)


Doob’s inequalities

Theorem 6.10 Let X be a nonnegative submartingale. Then for p > 1,

$$E[\sup_{s \le t} X(s)^p] \le \left(\frac{p}{p-1}\right)^p E[X(t)^p]$$

Corollary 6.11 If M is a square integrable martingale, then

$$E[\sup_{s \le t} M(s)^2] \le 4E[M(t)^2].$$
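
As a sanity check of Corollary 6.11 in a discrete-time setting, the following sketch enumerates all paths of the simple symmetric random walk Sk (an assumed example; the function name is hypothetical) and compares E[max_{k≤n} Sk²] with 4E[Sn²] = 4n exactly.

```python
from itertools import product


def doob_l2_check(n):
    """Return (E[max_{k<=n} S_k^2], 4 E[S_n^2]) for the simple symmetric random walk."""
    paths = list(product((-1, 1), repeat=n))
    lhs = rhs = 0.0
    for steps in paths:
        s, running_max_sq = 0, 0
        for step in steps:
            s += step
            running_max_sq = max(running_max_sq, s * s)
        lhs += running_max_sq  # max of S_k^2 along this path
        rhs += 4 * s * s       # 4 S_n^2 at the terminal time
    return lhs / len(paths), rhs / len(paths)


lhs, rhs = doob_l2_check(10)
```

Here `rhs` is 40, and `lhs` is no larger, as the inequality requires.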


7. Brownian motion

• Moment generating functions

• Gaussian distributions

• Characterization by mean and covariance

• Conditions for independence

• Conditional expectations

• Central limit theorem

• Construction of Brownian motion

• Functional central limit theorem

• The binomial model and geometric Brownian motion

• Martingale properties of Brownian motion

• Levy’s characterization

• Relationship to some partial differential equations


• First passage time

• Quadratic variation

• Nowhere differentiability

• Law of the iterated logarithm

• Modulus of continuity

• Markov property

• Transition density and operator

• Strong Markov property

• Reflection principle


Moment generating functions

Definition 7.1 Let X be a $\mathbb{R}^d$-valued random variable. Then the moment generating function for X (if it exists) is given by

$$\varphi_X(\lambda) = E[e^{\lambda \cdot X}], \quad \lambda \in \mathbb{R}^d.$$

Lemma 7.2 If $\varphi_X(\lambda) < \infty$ for all $\lambda \in \mathbb{R}^d$, then ϕX uniquely determines the distribution of X.


Gaussian distributions

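Definition 7.3 A R-valued random variable X is Gaussian (normal) with mean µ and variance σ² if it has density

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

The moment generating function for X is

$$E[e^{\lambda X}] = \exp\Big\{\frac{\sigma^2\lambda^2}{2} + \mu\lambda\Big\}$$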


Multidimensional Gaussian distributions

Definition 7.4 A $\mathbb{R}^d$-valued random variable X is Gaussian if λ · X is Gaussian for every $\lambda \in \mathbb{R}^d$.

Note that if E[X] = µ and

$$cov(X) = E[(X - \mu)(X - \mu)^T] = \Sigma = ((\sigma_{ij})),$$

then

$$E[\lambda \cdot X] = \lambda \cdot \mu, \qquad var(\lambda \cdot X) = \lambda^T \Sigma \lambda.$$

Note that if X = (X1, . . . , Xd) is Gaussian in $\mathbb{R}^d$, then for any choice of $i_1, \dots, i_m \in \{1, \dots, d\}$, $(X_{i_1}, \dots, X_{i_m})$ is Gaussian in $\mathbb{R}^m$.

Lemma 7.5 If X is Gaussian in Rd and C is a m× d matrix, then

Y = CX (7.1)

is Gaussian in Rm.


Characterization by mean and covariance

If X is Gaussian in Rd, then

$$\varphi_X(\lambda) = \exp\Big\{\frac{\lambda^T \Sigma \lambda}{2} + \lambda \cdot \mu\Big\}$$

Since the distribution of X is determined by its moment generating function, the distribution of a Gaussian random vector is determined by its mean and covariance matrix.

Note that for Y given by (7.1), E[Y ] = Cµ and cov(Y ) = CΣCT .


Conditions for independence

Lemma 7.6 If $X = (X_1, X_2) \in \mathbb{R}^2$ is Gaussian, then X1 and X2 are independent if and only if cov(X1, X2) = 0. In general, if X is Gaussian in $\mathbb{R}^d$, then the components of X are independent if and only if Σ is diagonal.

Proof. If R-valued random variables U and V are independent, then cov(U, V) = 0, so if the components of X are independent then Σ is diagonal.

If Σ is diagonal with diagonal entries $\sigma_i^2$, then

$$\varphi_X(\lambda) = \prod_{i=1}^d \exp\Big\{\frac{\lambda_i^2 \sigma_i^2}{2} + \lambda_i \mu_i\Big\},$$

and since the moment generating function of X determines the joint distribution, the components of X must be independent.


Conditional expectations

Lemma 7.7 Let X = (X1, . . . , Xd) be Gaussian. Then

$$E[X_d \mid X_1, \dots, X_{d-1}] = a + \sum_{i=1}^{d-1} b_i X_i,$$

where

$$\sigma_{dk} = \sum_{i=1}^{d-1} b_i \sigma_{ik}, \quad k = 1, \dots, d-1$$

and

$$\mu_d = a + \sum_{i=1}^{d-1} b_i \mu_i.$$

Proof. The bi are selected so that $X_d - (a + \sum_{i=1}^{d-1} b_i X_i)$ is independent of X1, . . . , Xd−1.


Example

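Suppose X1, X2, and X3 are independent, Gaussian, mean zero and variance one. Then

$$E[X_1 \mid X_1 + X_2] = \frac{X_1 + X_2}{2}.$$

Let

$$\tilde{X}_1 = \frac{X_1 + X_2}{2} + \frac{1}{\sqrt{2}} X_3, \qquad \tilde{X}_2 = \frac{X_1 + X_2}{2} - \frac{1}{\sqrt{2}} X_3. \tag{7.2}$$

Then $\tilde{X}_1$ and $\tilde{X}_2$ are independent, Gaussian, mean zero, and variance one.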


Central limit theorem

If ξi are iid with E[ξi] = 0 and Var(ξi) = 1, then $Z_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n \xi_i$ satisfies

$$P\{a < Z_n \le b\} \to \int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\,dx.$$

If we define

$$Z_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{[nt]} \xi_i$$

then the distribution of (Zn(t1), . . . , Zn(tm)) is approximately Gaussian with

$$cov(Z_n(t_i), Z_n(t_j)) \to t_i \wedge t_j.$$

Note that for t0 < t1 < · · · < tm, the increments Zn(tk) − Zn(tk−1), k = 1, . . . , m, are independent.


The CLT as a process


Interpolation lemma

Lemma 7.8 Let Y and Z be independent, R-valued Gaussian random variables with E[Y] = E[Z] = 0, Var(Z) = a, and Var(Y) = 1. Then

$$U = \frac{Z}{2} + \frac{\sqrt{a}}{2} Y, \qquad V = \frac{Z}{2} - \frac{\sqrt{a}}{2} Y$$

are independent Gaussian random variables satisfying Z = U + V.

Proof. Clearly, Z = U + V. To complete the proof, check that cov(U, V) = 0.


Construction of Brownian motion

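We can construct standard Brownian motion by repeated application of the Interpolation Lemma 7.8. Let $\xi_{k,n}$ be iid, Gaussian, $E[\xi_{k,n}] = 0$ and $Var(\xi_{k,n}) = 1$. Define

$$W(1) = \xi_{1,0},$$

and define $W(\frac{k}{2^n})$ inductively by

$$\begin{aligned}
W\Big(\frac{k}{2^n}\Big) &= W\Big(\frac{k-1}{2^n}\Big) + \frac{W(\frac{k+1}{2^n}) - W(\frac{k-1}{2^n})}{2} + \frac{1}{2^{(n+1)/2}}\,\xi_{k,n} \\
&= \frac{1}{2} W\Big(\frac{k-1}{2^n}\Big) + \frac{1}{2} W\Big(\frac{k+1}{2^n}\Big) + \frac{1}{2^{(n+1)/2}}\,\xi_{k,n}
\end{aligned}$$

for k odd and $0 < k < 2^n$.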


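By the Interpolation Lemma,

$$W\Big(\frac{k}{2^n}\Big) - W\Big(\frac{k-1}{2^n}\Big) = \frac{1}{2} W\Big(\frac{k+1}{2^n}\Big) - \frac{1}{2} W\Big(\frac{k-1}{2^n}\Big) + \frac{1}{2^{(n+1)/2}}\,\xi_{k,n}$$

and

$$W\Big(\frac{k+1}{2^n}\Big) - W\Big(\frac{k}{2^n}\Big) = \frac{1}{2} W\Big(\frac{k+1}{2^n}\Big) - \frac{1}{2} W\Big(\frac{k-1}{2^n}\Big) - \frac{1}{2^{(n+1)/2}}\,\xi_{k,n}$$

are independent, Gaussian random variables.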
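
The midpoint refinement can be sketched in a few lines. The code below is an illustration, not the notes' own construction code: it assumes W(0) = 0, draws W(1) first, and then fills in dyadic midpoints level by level with noise of standard deviation 2^{-(n+1)/2}. The function name and the seeding are hypothetical choices.

```python
import random


def dyadic_brownian(levels, rng):
    """Return [W(k / 2**levels) for k = 0..2**levels] built by midpoint refinement."""
    w = [0.0, rng.gauss(0.0, 1.0)]  # assumed W(0) = 0, and W(1) = xi_{1,0}
    for n in range(1, levels + 1):
        refined = []
        for left, right in zip(w, w[1:]):
            # Midpoint = average of the two neighbours plus independent
            # Gaussian noise with standard deviation 2^{-(n+1)/2}.
            mid = 0.5 * (left + right) + 2.0 ** (-(n + 1) / 2) * rng.gauss(0.0, 1.0)
            refined += [left, mid]
        refined.append(w[-1])
        w = refined
    return w


coarse = dyadic_brownian(3, random.Random(0))
fine = dyadic_brownian(4, random.Random(0))
```

With the same seed, `fine[::2]` reproduces `coarse`: refining never changes values already assigned at coarser dyadic points, which is what allows the limit over levels to be taken.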


Convergence to a continuous process

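Theorem 7.9 Let Γ be the set of dyadic rationals in [0, 1]. Define

$$W_n(t) = W\Big(\frac{k}{2^n}\Big) + \Big(t - \frac{k}{2^n}\Big)\, 2^n \Big(W\Big(\frac{k+1}{2^n}\Big) - W\Big(\frac{k}{2^n}\Big)\Big), \qquad \frac{k}{2^n} \le t \le \frac{k+1}{2^n}.$$

Then

$$\sup_{t \in \Gamma} |W(t) - W_n(t)| \to 0 \quad a.s.,$$

and hence W extends to a continuous process on [0, 1].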


Continuous functions on C[0, 1]

Definition 7.10 Let C[0, 1] denote the continuous, real-valued functions on [0, 1]. g : C[0, 1] → R is continuous if xn, x ∈ C[0, 1] satisfying $\sup_{0 \le t \le 1} |x_n(t) - x(t)| \to 0$ implies $\lim_{n \to \infty} g(x_n) = g(x)$.

For example,

$$g(x) = \sup_{0 \le t \le 1} x(t), \qquad g(x) = \int_0^1 x(s)\,ds$$


Functional central limit theorem

Theorem 7.11 (Donsker invariance principle) Let ξi be iid with E[ξi] = 0 and Var(ξi) = σ². Define

$$Z_n(t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{[nt]} \xi_i + \Big(t - \frac{k}{n}\Big)\sqrt{n}\,\xi_{k+1}, \qquad \frac{k}{n} \le t \le \frac{k+1}{n}.$$

Then for each continuous g : C[0, 1] → R, g(Zn) converges in distribution to g(σW).


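The binomial model and geometric Brownian motion

In the binomial model for a stock price, at each time step the price either goes up a fixed percentage or down a fixed percentage. Assume the up-down probabilities are 1/2-1/2 and the time step is 1/n:

$$P\Big\{X_n\Big(\frac{k+1}{n}\Big) = \Big(1 + \frac{\sigma}{\sqrt{n}}\Big) X_n\Big(\frac{k}{n}\Big)\Big\} = P\Big\{X_n\Big(\frac{k+1}{n}\Big) = \Big(1 - \frac{\sigma}{\sqrt{n}}\Big) X_n\Big(\frac{k}{n}\Big)\Big\} = \frac{1}{2}.$$

Then, taking Xn(0) = 1, $\log X_n(t) = \sum_{i=1}^{[nt]} \log(1 + \frac{\sigma}{\sqrt{n}}\xi_i)$ where P{ξi = 1} = P{ξi = −1} = 1/2. By Taylor's formula

$$\log X_n(t) \approx \frac{\sigma}{\sqrt{n}} \sum_{i=1}^{[nt]} \xi_i - \frac{1}{2n} \sum_{i=1}^{[nt]} \sigma^2 \approx \sigma W(t) - \frac{1}{2}\sigma^2 t$$

and

$$X_n(t) \approx \exp\Big\{\sigma W(t) - \frac{1}{2}\sigma^2 t\Big\}.$$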
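
The Taylor approximation can be checked numerically at t = 1. The sketch below (hypothetical function names, assuming Xn(0) = 1) computes the exact mean and variance of log Xn(1), a sum of n iid two-point variables, and compares them with the mean −σ²/2 and variance σ² of σW(1) − σ²/2.

```python
from math import log, sqrt


def binomial_log_mean(sigma, n):
    """Exact E[log X_n(1)]: n iid terms log(1 + sigma * xi / sqrt(n)), xi = +-1."""
    up = log(1 + sigma / sqrt(n))
    down = log(1 - sigma / sqrt(n))
    return n * 0.5 * (up + down)


def binomial_log_variance(sigma, n):
    """Exact Var(log X_n(1)); a two-point variable has variance ((up - down) / 2)^2."""
    up = log(1 + sigma / sqrt(n))
    down = log(1 - sigma / sqrt(n))
    return n * (0.5 * (up - down)) ** 2


mean_large_n = binomial_log_mean(0.3, 10 ** 6)
var_large_n = binomial_log_variance(0.3, 10 ** 6)
```

For σ = 0.3 the two values are close to −0.045 and 0.09, the Gaussian limits.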


Martingales

Recall that an Ft-adapted process M is a Ft-martingale if and only if

E[M(t+ r)|Ft] = M(t)

for all t, r ≥ 0. Note that this requirement is equivalent to

E[M(t+ r)−M(t)|Ft] = 0.


Convergence of conditional expectations

Lemma 7.12 Suppose Xn, X are R-valued random variables and

E[|Xn −X|] → 0.

Then for a sub-σ-algebra D,

E[|E[Xn|D]− E[X|D]|] → 0

Proof. By Jensen's inequality

$$E[|E[X_n \mid \mathcal{D}] - E[X \mid \mathcal{D}]|] = E[|E[X_n - X \mid \mathcal{D}]|] \le E[E[|X_n - X| \mid \mathcal{D}]] = E[|X_n - X|]$$


Martingale properties of Brownian motion

Let $\mathcal{F}_t = \mathcal{F}^W_t = \sigma(W(s) : s \le t)$. Since W has independent increments, we have

$$E[W(t+r) \mid \mathcal{F}_t] = E[W(t+r) - W(t) + W(t) \mid \mathcal{F}_t] = E[W(t+r) - W(t) \mid \mathcal{F}_t] + W(t) = W(t),$$

and W is a martingale. Similarly, let M(t) = W(t)² − t. Then

$$\begin{aligned}
E[M(t+r) - M(t) \mid \mathcal{F}_t]
&= E[(W(t+r) - W(t))^2 + 2(W(t+r) - W(t))W(t) - r \mid \mathcal{F}_t] \\
&= E[(W(t+r) - W(t))^2 \mid \mathcal{F}_t] + 2W(t)E[W(t+r) - W(t) \mid \mathcal{F}_t] - r \\
&= 0
\end{aligned}$$

and hence M is a martingale.


An exponential martingale

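Let $M(t) = \exp\{W(t) - \frac{1}{2}t\}$. Then

$$\begin{aligned}
E[M(t+r) \mid \mathcal{F}_t] &= E[\exp\{W(t+r) - W(t) - \tfrac{1}{2}r\}\, M(t) \mid \mathcal{F}_t] \\
&= M(t)\, e^{-\frac{1}{2}r}\, E[\exp\{W(t+r) - W(t)\} \mid \mathcal{F}_t] \\
&= M(t)
\end{aligned}$$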


A general family of martingales

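Let $f \in C^3_b(\mathbb{R})$ (the bounded continuous functions with three bounded continuous derivatives), and t = t0 < · · · < tm = t + r. Then

$$\begin{aligned}
E[f(W(t+r)) - f(W(t)) \mid \mathcal{F}_t]
&= E\Big[\sum \big(f(W(t_{i+1})) - f(W(t_i))\big) \,\Big|\, \mathcal{F}_t\Big] \\
&= E\Big[\sum \big(f(W(t_{i+1})) - f(W(t_i)) - f'(W(t_i))(W(t_{i+1}) - W(t_i)) \\
&\qquad\quad - \tfrac{1}{2} f''(W(t_i))(W(t_{i+1}) - W(t_i))^2\big) \,\Big|\, \mathcal{F}_t\Big] \\
&\quad + E\Big[\sum \tfrac{1}{2} f''(W(t_i))(t_{i+1} - t_i) \,\Big|\, \mathcal{F}_t\Big] \\
&= Z_1 + Z_2
\end{aligned}$$


Note that since $\mathcal{F}_{t_i} \supset \mathcal{F}_t$,

$$\begin{aligned}
E[f'(W(t_i))(W(t_{i+1}) - W(t_i)) \mid \mathcal{F}_t]
&= E[E[f'(W(t_i))(W(t_{i+1}) - W(t_i)) \mid \mathcal{F}_{t_i}] \mid \mathcal{F}_t] \\
&= E[f'(W(t_i))\,E[W(t_{i+1}) - W(t_i) \mid \mathcal{F}_{t_i}] \mid \mathcal{F}_t] \\
&= 0
\end{aligned}$$

$$\begin{aligned}
E[f''(W(t_i))(W(t_{i+1}) - W(t_i))^2 \mid \mathcal{F}_t]
&= E[E[f''(W(t_i))(W(t_{i+1}) - W(t_i))^2 \mid \mathcal{F}_{t_i}] \mid \mathcal{F}_t] \\
&= E[f''(W(t_i))\,E[(W(t_{i+1}) - W(t_i))^2 \mid \mathcal{F}_{t_i}] \mid \mathcal{F}_t] \\
&= E[f''(W(t_i))(t_{i+1} - t_i) \mid \mathcal{F}_t]
\end{aligned}$$

$$E\Big[\Big|\sum \tfrac{1}{2} f''(W(t_i))(t_{i+1} - t_i) - \int_t^{t+r} \tfrac{1}{2} f''(W(s))\,ds\Big|\Big] \to 0$$

$$E[|Z_1|] \le C\,E\Big[\sum |W(t_{i+1}) - W(t_i)|^3\Big] \le C \sum (t_{i+1} - t_i)^{3/2} \to 0$$

so $f(W(t)) - f(0) - \int_0^t \frac{1}{2} f''(W(s))\,ds$ is a Ft-martingale.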


Levy’s characterization

Theorem 7.13 Suppose M and Z given by Z(t) = M(t)² − t are continuous Ft-martingales, and M(0) = 0. Then M is a standard Brownian motion.

Lemma 7.14 Under the assumptions of the theorem,

$$E[(M(t+r) - M(t))^2 \mid \mathcal{F}_t] = E[M(t+r)^2 - 2M(t)M(t+r) + M(t)^2 \mid \mathcal{F}_t] = r,$$

if τ is a Ft-stopping time, then

$$E[(M((t+r) \wedge \tau) - M(t \wedge \tau))^2 \mid \mathcal{F}_t] = E[(t+r) \wedge \tau - t \wedge \tau \mid \mathcal{F}_t] \le r,$$

and for $f \in C^3_b(\mathbb{R})$,

$$M_f(t) = f(M(t)) - f(M(0)) - \int_0^t \frac{1}{2} f''(M(s))\,ds \tag{7.3}$$

is a Ft-martingale.


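Proof. Taking $f(x) = e^{i\theta x}$,

$$e^{i\theta M(t)} - e^{i\theta M(0)} + \int_0^t \frac{1}{2}\theta^2 e^{i\theta M(s)}\,ds$$

is a martingale, and hence

$$E[e^{i\theta M(t+r)} \mid \mathcal{F}_t] = e^{i\theta M(t)} - \int_t^{t+r} \frac{1}{2}\theta^2 E[e^{i\theta M(s)} \mid \mathcal{F}_t]\,ds$$

and

$$\varphi_{t,\theta}(r) \equiv E[e^{i\theta(M(t+r) - M(t))} \mid \mathcal{F}_t] = 1 - \int_0^r \frac{1}{2}\theta^2 \varphi_{t,\theta}(s)\,ds.$$

Therefore

$$E[e^{i\theta(M(t+r) - M(t))} \mid \mathcal{F}_t] = e^{-\frac{1}{2}\theta^2 r}.$$

It follows that M(t + r) − M(t) is Gaussian and independent of Ft.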


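Proof (of lemma). To prove that (7.3) is a martingale, the problem is to show that

$$E\Big[\sum \big(f(M(t_{i+1})) - f(M(t_i)) - f'(M(t_i))(M(t_{i+1}) - M(t_i)) - \tfrac{1}{2} f''(M(t_i))(M(t_{i+1}) - M(t_i))^2\big) \,\Big|\, \mathcal{F}_t\Big]$$

converges to zero. Let ε, δ > 0, and assume that ti+1 − ti ≤ δ. Define

$$\tau_{\varepsilon,\delta} = \inf\{t : \sup_{t-\delta \le s \le t} |M(t) - M(s)| \ge \varepsilon\}$$

Then

$$\sum |M(t_{i+1} \wedge \tau_{\varepsilon,\delta}) - M(t_i \wedge \tau_{\varepsilon,\delta})|^3 \le \varepsilon \sum (M(t_{i+1} \wedge \tau_{\varepsilon,\delta}) - M(t_i \wedge \tau_{\varepsilon,\delta}))^2,$$

and the expectation of the term on the left is bounded by εt. Select εn → 0 and δn → 0 so that $\tau_{\varepsilon_n,\delta_n} \to \infty$ a.s.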


Applications of the optional sampling theorem

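Let −a < 0 < b, and define $\gamma_{a,b} = \inf\{t : W(t) \notin (-a, b)\}$. Then

$$E[W(\gamma_{a,b} \wedge t)] = 0$$

and

$$E[\gamma_{a,b} \wedge t] = E[W(\gamma_{a,b} \wedge t)^2] \le a^2 \vee b^2.$$

Let t → ∞. Then $E[\gamma_{a,b}] < \infty$, so

$$0 = E[W(\gamma_{a,b})] = -a\,P\{W(\gamma_{a,b}) = -a\} + b\,P\{W(\gamma_{a,b}) = b\} = -a + (a+b)\,P\{W(\gamma_{a,b}) = b\}.$$

Consequently, $P\{W(\gamma_{a,b}) = b\} = \frac{a}{a+b}$ and

$$E[\gamma_{a,b}] = E[W(\gamma_{a,b})^2] = a^2\,\frac{b}{a+b} + b^2\,\frac{a}{a+b} = ab.$$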
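
The same two formulas hold exactly for the simple symmetric random walk with integer barriers, which gives a quick way to check them. The sketch below is a discrete analogue, not Brownian motion itself: it solves the one-step equations u(x) = source + ½u(x−1) + ½u(x+1) on (−a, b) by shooting from the left barrier, using exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction


def shoot(first_step, source, n_steps):
    """Iterate u(x+1) = 2u(x) - u(x-1) - 2*source from u(-a) = 0, u(-a+1) = first_step."""
    prev, cur = Fraction(0), Fraction(first_step)
    values = [prev, cur]
    for _ in range(n_steps - 1):
        prev, cur = cur, 2 * cur - prev - 2 * source
        values.append(cur)
    return values  # values[k] = u(-a + k), k = 0..n_steps


def solve(source, target, a, b):
    """Value at x = 0 of the solution with u(-a) = 0 and u(b) = target."""
    n = a + b
    u0, u1 = shoot(0, source, n), shoot(1, source, n)
    # The solution is affine in the unknown first step, so match u(b) = target.
    c = (target - u0[n]) / (u1[n] - u0[n])
    return u0[a] + c * (u1[a] - u0[a])


hit_b = solve(0, 1, 3, 5)      # P{walk reaches b = 5 before -a = -3}
mean_exit = solve(1, 0, 3, 5)  # expected number of steps until exit
```

With a = 3 and b = 5 this returns a/(a+b) = 3/8 and ab = 15, matching the Brownian formulas.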


Relationship to some partial differential equations

Let W = (W1, W2) where W1 and W2 are independent, R-valued standard Brownian motions. For $f \in C^3_b(\mathbb{R}^2)$,

$$M_f(t) = f(W(t)) - f(0) - \int_0^t \frac{1}{2}\Delta f(W(s))\,ds$$

is a martingale. Let $X^x(t) = x + W(t)$ (standard Brownian motion starting at x). Then

$$M^x_f(t) = f(X^x(t)) - f(x) - \int_0^t \frac{1}{2}\Delta f(X^x(s))\,ds$$

is a martingale. Let $D \subset \mathbb{R}^2$ be bounded, open, and have a smooth boundary. Suppose ∆h(y) = 0, y ∈ D, and h(y) = ϕ(y), y ∈ ∂D (the boundary of D). Define $\tau^x_D = \inf\{t : X^x(t) \notin D\}$. Then

$$h(x) = E[\varphi(X^x(\tau^x_D))]$$


Finiteness of the exit time

Lemma 7.15 For ρ > 0, let $\gamma_\rho = \inf\{t : |W(t)| \ge \rho\}$. Then $E[\gamma_\rho] = \frac{\rho^2}{2}$.

Proof. Let f(x) = |x|² for |x| ≤ ρ. Then ∆f(x) = 4 for |x| ≤ ρ, and the optional sampling theorem implies

$$E[|W(\gamma_\rho \wedge t)|^2] = E[2(\gamma_\rho \wedge t)].$$

Letting t → ∞ gives the result.


Laplace transforms for nonnegative random variables

For a nonnegative random variable X, define the Laplace transform of its distribution by

$$F_X(\alpha) = E[e^{-\alpha X}] = \int_{\mathbb{R}} e^{-\alpha x}\,\mu_X(dx), \quad \alpha \ge 0.$$

Lemma 7.16 The distribution of a nonnegative random variable is uniquely determined by its Laplace transform.


First passage time

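For c > 0, let $\tau_c = \inf\{t : W(t) \ge c\}$. Since, for a > 0,

$$M_a(t) = \exp\{aW(t) - \tfrac{1}{2}a^2 t\}$$

is a martingale, the optional sampling theorem implies

$$1 = E[\exp\{aW(t \wedge \tau_c) - \tfrac{1}{2}a^2 (t \wedge \tau_c)\}] \to E[\exp\{ac - \tfrac{1}{2}a^2 \tau_c\}]$$

since τc < ∞ a.s. Consequently,

$$E[e^{-\frac{1}{2}a^2 \tau_c}] = e^{-ac}$$

and, setting $a = \sqrt{2\alpha}$,

$$F_{\tau_c}(\alpha) = E[e^{-\alpha \tau_c}] = e^{-\sqrt{2\alpha}\,c}.$$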


Quadratic variation

For s < t, E[(W(t) − W(s))²] = t − s and

$$E\big[\big((W(t) - W(s))^2 - (t - s)\big)^2\big] = (t - s)^2\,E\big[(W(1)^2 - 1)^2\big].$$

Consequently, for 0 = t0 < · · · < tm = t,

$$E\Big[\Big(\sum_{i=1}^m (W(t_i) - W(t_{i-1}))^2 - t\Big)^2\Big] = E\big[(W(1)^2 - 1)^2\big] \sum_{i=1}^m (t_i - t_{i-1})^2$$

which converges to zero as max(ti − ti−1) → 0.

We say that $[W]_t = t$.


Covariation

For independent standard Brownian motions W1, W2, a similar calculation implies

$$[W_1, W_2]_t = \lim \sum_{i=1}^m (W_1(t_i) - W_1(t_{i-1}))(W_2(t_i) - W_2(t_{i-1})) = 0$$

where the convergence is in probability.


Nowhere differentiability

Suppose that $f(t) = \int_0^t g(s)\,ds$, where $\int_0^t |g(s)|\,ds < \infty$ for all t > 0. Then

$$\sum_{i=1}^m (f(t_i) - f(t_{i-1}))^2 \le \max_i |f(t_i) - f(t_{i-1})| \int_0^t |g(s)|\,ds \to 0$$

as max(ti − ti−1) → 0.

Definition 7.17 A function f : [0,∞) → R is nowhere differentiable if its derivative does not exist at any point t ∈ [0,∞). Let ΓND denote the collection of nowhere differentiable functions.

Theorem 7.18 $P\{W \in \Gamma_{ND}\} = 1$.


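Law of the iterated logarithm

$$\limsup_{t \to \infty} \frac{W(t)}{\sqrt{2t \log\log t}} = 1 \quad a.s.$$

$\hat{W}(t) = tW(1/t)$ is Brownian motion. $Var(\hat{W}(t)) = t^2 \cdot \frac{1}{t} = t$. Therefore

$$\limsup_{t \to 0} \frac{\hat{W}(1/t)}{\sqrt{2t^{-1} \log\log 1/t}} = \limsup_{t \to 0} \frac{W(t)}{\sqrt{2t \log\log 1/t}} = 1 \quad a.s.$$

Consequently,

$$\limsup_{h \to 0} \frac{W(t+h) - W(t)}{\sqrt{2h \log\log 1/h}} = 1 \quad a.s.$$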


Modulus of continuity

Definition 7.19 A function f : [a, b] → R is Hölder continuous with exponent 0 ≤ α ≤ 1, if

$$\sup_{a \le s < t \le b} \frac{|f(t) - f(s)|}{|t - s|^\alpha} < \infty.$$

Let Γα denote the collection of functions that are Hölder continuous with exponent α.

Theorem 7.20 Let $h(t) = \sqrt{2t \log 1/t}$. Then

$$P\Big\{\lim_{\varepsilon \to 0} \sup_{t_1, t_2 \in [0,1],\, |t_1 - t_2| \le \varepsilon} \frac{|W(t_1) - W(t_2)|}{h(|t_1 - t_2|)} = 1\Big\} = 1$$

Corollary 7.21 For $0 \le \alpha < \frac{1}{2}$, $P\{W \in \Gamma_\alpha\} = 1$.


Markov property


E[f(X(t+ s))|Ft] = E[f(X(t+ s))|X(t)].

Let X(t) = X(0) +W (t), X(0) independent of W .

$$T(t)f(x) \equiv E[f(x + W(t))] = \int_{-\infty}^{\infty} f(y)\,\frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(y-x)^2}{2t}}\,dy$$

$$E[f(X(t+s)) \mid \mathcal{F}^X_t] = E[f(X(t) + W(t+s) - W(t)) \mid \mathcal{F}^X_t] = T(s)f(X(t))$$


Transition density and operator

T(t) is called the transition operator for W and

$$p(t, x, y) = \frac{1}{\sqrt{2\pi t}}\, e^{-\frac{(y-x)^2}{2t}}$$

the transition density. Note that p satisfies the Chapman-Kolmogorov equation

$$p(t + s, x, y) = \int_{\mathbb{R}} p(t, x, z)\,p(s, z, y)\,dz$$
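
The Chapman-Kolmogorov equation can be verified numerically for the Gaussian transition density. The sketch below approximates the z-integral with a midpoint rule on a truncated interval; the truncation width and grid size are assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import exp, pi, sqrt


def p(t, x, y):
    """Transition density of standard Brownian motion."""
    return exp(-(y - x) ** 2 / (2 * t)) / sqrt(2 * pi * t)


def chapman_kolmogorov_rhs(t, s, x, y, half_width=12.0, n=24000):
    """Midpoint-rule approximation of the integral of p(t,x,z) p(s,z,y) over z."""
    lo = min(x, y) - half_width
    hi = max(x, y) + half_width
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        z = lo + (i + 0.5) * h
        total += p(t, x, z) * p(s, z, y)
    return h * total


lhs_value = p(2.0, 0.0, 1.0)
rhs_value = chapman_kolmogorov_rhs(0.5, 1.5, 0.0, 1.0)
```

The two values agree to high accuracy, since the integrand is a product of Gaussians with negligible mass outside the truncated range.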


Stopping times

Recall the definition of a stopping time and the corresponding infor-mation σ-algebra Fτ .

Lemma 7.23 Let τ be a Ft-stopping time. Then there exists a decreasingsequence of discrete stopping times τn ≥ τ such that limn→∞ τn = τ .

Proof. Define

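$$\tau_n = \frac{k+1}{2^n} \quad \text{on } \Big\{\frac{k}{2^n} \le \tau < \frac{k+1}{2^n}\Big\}. \tag{7.4}$$

Then τn > τ on {τ < ∞}, and

$$\{\tau_n \le t\} = \Big\{\tau_n \le \frac{[2^n t]}{2^n}\Big\} = \Big\{\tau < \frac{[2^n t]}{2^n}\Big\} \in \mathcal{F}_t.$$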


Conditioning up to discrete stopping times

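Lemma 7.24 Let E[|Z|] < ∞, and let τ be a discrete Ft-stopping time ($R(\tau) = \{t_1, t_2, \dots\} \cup \{\infty\}$). Then

$$E[Z \mid \mathcal{F}_\tau] = \sum_k E[Z \mid \mathcal{F}_{t_k}]\,1_{\{\tau = t_k\}} + Z\,1_{\{\tau = \infty\}}.$$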


Strong Markov property

Theorem 7.25 Let τ be a $\mathcal{F}^X_t$-stopping time with P{τ < ∞} = 1. Then

$$E[f(X(\tau + t)) \mid \mathcal{F}_\tau] = T(t)f(X(\tau)). \tag{7.5}$$

More generally, define Wτ by Wτ(t) = W(τ + t) − W(τ). Then Wτ is a standard Brownian motion and Wτ is independent of Fτ.


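Proof. Prove first for discrete stopping times and take limits. Let τn be as above. Then

$$\begin{aligned}
E[f(X(\tau_n + t)) \mid \mathcal{F}_{\tau_n}]
&= \sum_k E[f(X(\tau_n + t)) \mid \mathcal{F}_{k2^{-n}}]\,1_{\{\tau_n = k2^{-n}\}} \\
&= \sum_k E[f(X(k2^{-n} + t)) \mid \mathcal{F}_{k2^{-n}}]\,1_{\{\tau_n = k2^{-n}\}} \\
&= \sum_k T(t)f(X(k2^{-n}))\,1_{\{\tau_n = k2^{-n}\}} \\
&= T(t)f(X(\tau_n)).
\end{aligned}$$

Assume that f is continuous so that T(t)f is continuous. Then

$$E[f(X(\tau_n + t)) \mid \mathcal{F}_\tau] = E[T(t)f(X(\tau_n)) \mid \mathcal{F}_\tau]$$

and passing to the limit gives (7.5).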


Lemma 7.26 If γ ≥ 0 is Fτ -measurable, then

E[f(X(τ + γ))|Fτ ] = T (γ)f(X(τ)).

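Proof. First, assume that γ is discrete. Then

$$\begin{aligned}
E[f(X(\tau + \gamma)) \mid \mathcal{F}_\tau]
&= \sum_{r \in R(\gamma)} E[f(X(\tau + \gamma)) \mid \mathcal{F}_\tau]\,1_{\{\gamma = r\}} \\
&= \sum_{r \in R(\gamma)} E[f(X(\tau + r)) \mid \mathcal{F}_\tau]\,1_{\{\gamma = r\}} \\
&= \sum_{r \in R(\gamma)} T(r)f(X(\tau))\,1_{\{\gamma = r\}} \\
&= T(\gamma)f(X(\tau)).
\end{aligned}$$

Assuming that f is continuous, general γ can be approximated by discrete γ.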


Reflection principle

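Lemma 7.27

$$P\{\sup_{s \le t} W(s) \ge c\} = 2P\{W(t) > c\}$$

Proof. Let $\tau = t \wedge \inf\{s : W(s) \ge c\}$, and γ = (t − τ). Then setting $f = 1_{(c,\infty)}$,

$$E[f(W(t)) \mid \mathcal{F}_\tau] = E[f(W(\tau + \gamma)) \mid \mathcal{F}_\tau] = T(\gamma)f(W(\tau)) = \frac{1}{2}\,1_{\{\tau < t\}}$$

and hence, P{τ < t} = 2P{W(t) > c}. Since $\{\tau < t\} \cup \{W(t) = c\} = \{\sup_{s \le t} W(s) \ge c\}$ and P{W(t) = c} = 0, $P\{\sup_{s \le t} W(s) \ge c\} = P\{\tau < t\}$.

Corollary 7.28 The cdf of $\tau_c = \inf\{t : W(t) \ge c\}$ is

$$F_{\tau_c}(t) = 2P\{W(t) > c\} = \frac{2}{\sqrt{2\pi t}} \int_c^\infty e^{-\frac{x^2}{2t}}\,dx.$$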
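
The cdf in Corollary 7.28 and the Laplace transform $e^{-\sqrt{2\alpha}\,c}$ obtained for the first passage time describe the same distribution, and this can be checked numerically. The sketch below writes the cdf as erfc(c/√(2t)), uses its t-derivative c(2πt³)^{-1/2} e^{-c²/(2t)} as the density (a standard calculus step, not stated in the notes), and integrates with a midpoint rule. The truncation point and grid size are assumptions, and the function names are hypothetical.

```python
from math import erfc, exp, pi, sqrt


def first_passage_cdf(t, c):
    """F(t) = 2 P{W(t) > c} = erfc(c / sqrt(2 t)) for t > 0."""
    return erfc(c / sqrt(2 * t)) if t > 0 else 0.0


def first_passage_density(t, c):
    """Derivative of the cdf in t: c / sqrt(2 pi t^3) * exp(-c^2 / (2 t))."""
    return c / sqrt(2 * pi * t ** 3) * exp(-c * c / (2 * t))


def laplace_transform(alpha, c, t_max=60.0, n=200000):
    """Midpoint-rule approximation of E[exp(-alpha * tau_c)] on (0, t_max]."""
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += exp(-alpha * t) * first_passage_density(t, c)
    return h * total


numeric = laplace_transform(1.0, 1.0)
closed_form = exp(-sqrt(2.0) * 1.0)
```

For α = c = 1 the numerical integral matches $e^{-\sqrt{2}}$ closely.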


Robert Brown’s original paper

http://sciweb.nybg.org/science2/pdfs/dws/Brownian.pdf



8. Stochastic integrals

• Definitions

• Martingale properties

• Continuous integrators and cadlag integrands

• Stochastic integrals as integrators

• Iterated integrals

• Quadratic variation of a stochastic integral

• Ito’s formula

• Solution of a stochastic differential equation


Notation for limits through partitions

Let $H(\{t_i\})$ denote a functional of a partition 0 = t0 < t1 < · · · satisfying $\lim_{m \to \infty} t_m = \infty$. For example

$$H(\{t_i\}) = \sum_i f(t_i)(g(T \wedge t_{i+1}) - g(T \wedge t_i))$$

The assertion that

$$\lim_{\{t_i\} \to *} H(\{t_i\}) = H_0 \tag{8.1}$$

will mean that for each ε > 0 there exists a δε > 0 such that maxi(ti+1 − ti) < δε implies $|H(\{t_i\}) - H_0| < \varepsilon$. If $H(\{t_i\})$ and H0 are random variables, then the limit in (8.1) exists in probability if and only if for each ε > 0, there exists a δε > 0 such that maxi(ti+1 − ti) < δε implies

$$P\{|H(\{t_i\}) - H_0| > \varepsilon\} \le \varepsilon.$$


Definitions

Definition 8.1 A Brownian motion W is a Ft-Brownian motion if W is Ft-adapted and W(t + s) − W(t) is independent of Ft for all s, t ≥ 0.

Note that $\mathcal{F}^W_t \subset \mathcal{F}_t$ and that any Ft-Brownian motion is a $\mathcal{F}^W_t$-Brownian motion.

In what follows, we will assume that there is a fixed filtration Ft and that W is a Ft-Brownian motion. All adapted processes are adapted to this filtration.


Simple integrands

Let 0 = r0 < r1 < · · · < rm and rm+1 = ∞. Define

$$X(t) = \sum_{i=0}^m \xi_i\,1_{[r_i, r_{i+1})}(t). \tag{8.2}$$

Then X is a simple process.

Lemma 8.2 The simple process X given by (8.2) is Ft-adapted if and only if for each i, ξi is $\mathcal{F}_{r_i}$-measurable.

Definition 8.3 Let X be an adapted, simple process. Then

$$I(t) = \int_0^t X(s)\,dW(s) \equiv \sum_{i=0}^m \xi_i (W(t \wedge r_{i+1}) - W(t \wedge r_i)).$$
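
Definition 8.3 is a finite sum along a single path, so it translates directly into code. The sketch below evaluates I(t) for one realization: `xi` holds the realized values of the ξi, `r` the times r0 < · · · < rm, and `w` is any function returning the path value at a given time. In the usage lines a deterministic function s ↦ s² stands in for a Brownian path purely so the arithmetic can be followed by hand; all names are hypothetical.

```python
def simple_integral(t, r, xi, w):
    """I(t) = sum_i xi[i] * (w(min(t, r[i+1])) - w(min(t, r[i]))), with r[m+1] = infinity."""
    knots = list(r) + [float("inf")]
    return sum(
        x * (w(min(t, knots[i + 1])) - w(min(t, knots[i])))
        for i, x in enumerate(xi)
    )


def stand_in_path(s):
    """Deterministic placeholder for a path of W (not a Brownian path)."""
    return s * s


value_mid = simple_integral(1.5, [0.0, 1.0, 2.0], [2.0, -1.0, 3.0], stand_in_path)
value_late = simple_integral(3.0, [0.0, 1.0, 2.0], [2.0, -1.0, 3.0], stand_in_path)
```

At t = 1.5 only the first two intervals contribute, giving 2(1 − 0) − (2.25 − 1) = 0.75; at t = 3 the sum is 2 − 3 + 15 = 14.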


Martingale properties

Lemma 8.4 Let X and I be as above. Suppose that for each i, E[|ξi|] < ∞. Then I is a martingale.

Proof. Suppose s > t. Then, if ri ≥ t,

E[ξi(W (s ∧ ri+1)−W (s ∧ ri))|Ft] = 0 = ξi(W (t ∧ ri+1)−W (t ∧ ri)),

and if ri < t,

E[ξi(W (s ∧ ri+1)−W (s ∧ ri))|Ft] = ξiE[(W (s ∧ ri+1)−W (s ∧ ri))|Ft]

= ξi(W (t ∧ ri+1)−W (t ∧ ri)).


Ito isometry

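Theorem 8.5 Let X and I be as above. Suppose that for each i, $E[\xi_i^2] < \infty$. Then

$$E[I^2(t)] = E\Big[\int_0^t X^2(s)\,ds\Big]$$

Proof. The martingale property of W implies that for i ≠ j,

$$E[\xi_i (W(t \wedge r_{i+1}) - W(t \wedge r_i))\,\xi_j (W(t \wedge r_{j+1}) - W(t \wedge r_j))] = 0.$$

Consequently,

$$\begin{aligned}
E[I^2(t)] &= \sum_{i=0}^m E[\xi_i^2 (W(r_{i+1} \wedge t) - W(r_i \wedge t))^2] \\
&= \sum_{i=0}^m E[\xi_i^2 (r_{i+1} \wedge t - r_i \wedge t)] = E\Big[\int_0^t X^2(s)\,ds\Big]
\end{aligned}$$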


An approximation lemma

Lemma 8.6 Let X be adapted and $E[\int_0^t X^2(s)\,ds] < \infty$ for each t > 0. Then there exists a sequence Xn of adapted, simple processes such that

$$\lim_{n \to \infty} E\Big[\int_0^t (X(s) - X_n(s))^2\,ds\Big] = 0. \tag{8.3}$$

Note that (8.3) implies

$$\lim_{n,m \to \infty} E\Big[\int_0^t (X_m(s) - X_n(s))^2\,ds\Big] \le \lim_{n,m \to \infty} \Big(2E\Big[\int_0^t (X(s) - X_n(s))^2\,ds\Big] + 2E\Big[\int_0^t (X(s) - X_m(s))^2\,ds\Big]\Big) = 0.$$


Extension of the definition of the integral

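By Doob's inequality and the martingale properties of the stochastic integral

$$\begin{aligned}
E\Big[\sup_{t \le T} \Big|\int_0^t X_n\,dW - \int_0^t X_m\,dW\Big|^2\Big]
&= E\Big[\sup_{t \le T} \Big|\int_0^t (X_n - X_m)\,dW\Big|^2\Big] \\
&\le 4E\Big[\Big|\int_0^T (X_n - X_m)\,dW\Big|^2\Big] \\
&= 4E\Big[\int_0^T (X_m(s) - X_n(s))^2\,ds\Big]
\end{aligned}$$

Consequently, there exists a continuous stochastic process I such that $\lim_{n \to \infty} E[\sup_{t \le T} |I(t) - \int_0^t X_n\,dW|^2] = 0$. (Note that the proof of this fact requires additional machinery.) We define

$$\int X\,dW = I.$$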


Properties of the stochastic integral

Theorem 8.7 Suppose that X1, X2 are adapted and that for t > 0, $E[\int_0^t X_i^2(s)\,ds] < \infty$. Let $I_i = \int X_i\,dW$.

a) Ii is adapted.

b) For each t ≥ 0, $E[I_i^2(t)] = E[\int_0^t X_i^2(s)\,ds]$.

c) Ii is a martingale.

d) For a, b ∈ R, $\int (aX_1 + bX_2)\,dW = aI_1 + bI_2$.


Continuous integrators and cadlag integrands

Let Y be a continuous, adapted process and X be a cadlag, adapted process. We define

$$\int_0^t X(s-)\,dY(s) = \lim_{\{t_i\} \to *} \sum X(t_i \wedge t)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t)), \tag{8.4}$$

if the limit on the right exists in probability for each t.

Lemma 8.9 Suppose Y = W. If X is cadlag, adapted, and bounded by a constant, the two definitions of the stochastic integral are consistent.


Proof. Note that the sum on the right of (8.4) is $\int X^{\{t_i\}}\,dW$ for

$$X^{\{t_i\}}(t) = \sum X(t_i)\,1_{(t_i, t_{i+1}]}(t)$$

and that

$$\lim_{\{t_i\} \to *} \int_0^t |X^{\{t_i\}}(s) - X(s-)|^2\,ds = 0.$$

By the bounded convergence theorem,

$$\lim_{\{t_i\} \to *} E\Big[\int_0^t |X^{\{t_i\}}(s) - X(s-)|^2\,ds\Big] = 0,$$

and the lemma follows by Lemma 8.8.


Elimination of the moment condition

Theorem 8.10 Suppose that X is cadlag and adapted and that for each t > 0, $\int_0^t X^2(s)\,ds < \infty$ a.s. Then

$$\int_0^t X(s-)\,dW(s) \equiv \lim_{\{t_i\} \to *} \sum X(t_i \wedge t)(W(t_{i+1} \wedge t) - W(t_i \wedge t)), \tag{8.5}$$

in probability.

Proof. Let $\tau_c = \inf\{t : |X(t)| \vee |X(t-)| \ge c\}$. Then $X^{\tau_c}(t) = X(t)\,1_{[0,\tau_c)}(t)$ is cadlag and adapted and satisfies $E[\int_0^t |X^{\tau_c}(s)|^2\,ds] \le c^2 t$. Consequently, by Lemma 8.9,

$$\lim_{\{t_i\} \to *} \sum X(t_i \wedge t)\,1_{[0,\tau_c)}(t_i \wedge t)(W(t_{i+1} \wedge t) - W(t_i \wedge t))$$

exists in probability. Since $\lim_{c \to \infty} \tau_c = \infty$ a.s., (8.5) holds.


Convergence of stochastic integrals

Theorem 8.11 Suppose that Xn, X are cadlag and adapted and that for each t > 0,

$$\lim_{n \to \infty} \int_0^t |X_n(s) - X(s)|^2\,ds = 0$$

in probability. Then

$$\lim_{n \to \infty} \sup_{t \le T} \Big|\int_0^t X_n(s)\,dW(s) - \int_0^t X(s)\,dW(s)\Big| = 0$$

in probability.


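Proof. Let $\gamma^n_c = \inf\{t : \int_0^t X_n^2(s)\,ds \ge c\}$ and $\gamma_c = \inf\{t : \int_0^t X^2(s)\,ds \ge c\}$. Then for each t > 0,

$$\lim_{n \to \infty} E\Big[\int_0^t |1_{[0,\gamma^n_c)}(s)X_n(s) - 1_{[0,\gamma_c)}(s)X(s)|^2\,ds\Big] = 0$$

and hence

$$\lim_{n \to \infty} E\Big[\sup_{t \le T} \Big|\int_0^t 1_{[0,\gamma^n_c)}(s)X_n(s)\,dW(s) - \int_0^t 1_{[0,\gamma_c)}(s)X(s)\,dW(s)\Big|^2\Big] = 0$$

and

$$\lim_{n \to \infty} E\Big[1_{\{\gamma^n_c \wedge \gamma_c > T\}} \sup_{t \le T} \Big|\int_0^t X_n(s)\,dW(s) - \int_0^t X(s)\,dW(s)\Big|^2\Big] = 0.$$

For ε > 0 and c sufficiently large, $\limsup_{n \to \infty} P\{\gamma^n_c \wedge \gamma_c \le T\} \le \varepsilon$ and the theorem follows.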


Stochastic integrals as integrators

Theorem 8.12 Suppose that X and Z are cadlag and adapted and $Y(t) = \int_0^t Z(s-)\,dW(s)$. Then

$$\begin{aligned}
\int_0^t X(s-)\,dY(s) &\equiv \lim_{\{t_i\} \to *} \sum X(t_i)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t)) \\
&= \int_0^t X(s-)Z(s-)\,dW(s) = \int_0^t X(s)Z(s)\,dW(s).
\end{aligned}$$

More generally, if U is cadlag and adapted and $Y(t) = \int_0^t Z(s-)\,dW(s) + \int_0^t U(s)\,ds$, then

$$\int_0^t X(s-)\,dY(s) = \int_0^t X(s)Z(s)\,dW(s) + \int_0^t X(s)U(s)\,ds.$$


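Proof. First check the result for X simple. Then, for X cadlag and adapted, define $X^{\{t_i\}}(t) = \sum X(t_i)\,1_{(t_i, t_{i+1}]}(t)$, so

$$\sum X(t_i)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t)) = \int_0^t X^{\{t_i\}}(s-)\,dY(s) = \int_0^t X^{\{t_i\}}(s-)Z(s-)\,dW(s).$$

Then, by Theorem 8.11,

$$\begin{aligned}
\lim_{\{t_i\} \to *} \sum X(t_i)(Y(t_{i+1} \wedge t) - Y(t_i \wedge t))
&= \lim_{\{t_i\} \to *} \int_0^t X^{\{t_i\}}(s-)Z(s-)\,dW(s) \\
&= \int_0^t X(s-)Z(s-)\,dW(s).
\end{aligned}$$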


Iterated integrals

Corollary 8.13 Suppose X1, X2, U, and Z are cadlag and adapted, and $Y(t) = \int_0^t Z(s-)\,dW(s) + \int_0^t U(s)\,ds$ and $V(t) = \int_0^t X_1(s-)\,dY(s)$. Then

$$\begin{aligned}
\int_0^t X_2(s-)\,dV(s)
&= \int_0^t X_2(s-)X_1(s-)\,dY(s) \\
&= \int_0^t X_2(s-)X_1(s-)Z(s-)\,dW(s) + \int_0^t X_2(s-)X_1(s-)U(s)\,ds \\
&= \int_0^t X_2(s)X_1(s)Z(s)\,dW(s) + \int_0^t X_2(s)X_1(s)U(s)\,ds
\end{aligned}$$


Quadratic variation as a stochastic integral

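Lemma 8.14 $[X]_t$ exists if and only if $\int_0^t X(s)\,dX(s)$ exists, and

$$[X]_t = X^2(t) - X^2(0) - 2\int_0^t X(s)\,dX(s).$$

Proof. Noting that (b − a)² = b² − a² − 2a(b − a),

$$\begin{aligned}
\sum (X(t_{i+1} \wedge t) - X(t_i \wedge t))^2
&= \sum (X^2(t_{i+1} \wedge t) - X^2(t_i \wedge t)) - 2\sum X(t_i \wedge t)(X(t_{i+1} \wedge t) - X(t_i \wedge t)) \\
&= X^2(t) - X^2(0) - 2\sum X(t_i \wedge t)(X(t_{i+1} \wedge t) - X(t_i \wedge t))
\end{aligned}$$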


Quadratic variation of a stochastic integral

Lemma 8.15 If X and Z are adapted, $\int_0^t X^2(s)\,ds < \infty$ and $\int_0^t |Z(s)|\,ds < \infty$ for each t > 0, and

$$Y(t) = \int_0^t X(s)\,dW(s) + \int_0^t Z(s)\,ds,$$

then

$$[Y]_t = \int_0^t X^2(s)\,ds.$$

Proof. Check first for simple X and then extend by approximation.


Ito’s formula

Theorem 8.16 Let

$$Y(t) = Y(0) + \int_0^t X(s)\,dW(s) + \int_0^t Z(s)\,ds$$

and $f \in C^2(\mathbb{R})$. Then

$$\begin{aligned}
f(Y(t)) &= f(Y(0)) + \int_0^t f'(Y(s))\,dY(s) + \int_0^t \frac{1}{2} f''(Y(s))\,d[Y]_s \\
&= f(Y(0)) + \int_0^t f'(Y(s))X(s)\,dW(s) + \int_0^t f'(Y(s))Z(s)\,ds + \int_0^t \frac{1}{2} f''(Y(s))X^2(s)\,ds
\end{aligned}$$


Proof. Assume that $f \in C^3_b(\mathbb{R})$. Then

$$\begin{aligned}
f(Y(t)) &= f(Y(0)) + \sum (f(Y(t_{i+1})) - f(Y(t_i))) \\
&= f(Y(0)) + \sum f'(Y(t_i))(Y(t_{i+1}) - Y(t_i)) + \sum \frac{1}{2} f''(Y(t_i))(Y(t_{i+1}) - Y(t_i))^2 \\
&\quad + O\Big(\sum |Y(t_{i+1}) - Y(t_i)|^3\Big)
\end{aligned}$$

Note that if $\lim \sum (Y(t_{i+1}) - Y(t_i))^2$ exists, $\lim \sum |Y(t_{i+1}) - Y(t_i)|^3 = 0$.

The second term on the right converges by Theorem 8.12 and the third by Problem 10.


Ito’s formula take 2

Theorem 8.17 Let

$$Y(t) = Y(0) + \int_0^t X(s)\,dW(s) + \int_0^t Z(s)\,ds$$

and $f \in C^{2,1}(\mathbb{R} \times \mathbb{R})$. Then

$$\begin{aligned}
f(Y(t), t) &= f(Y(0), 0) + \int_0^t f_x(Y(s), s)\,dY(s) + \int_0^t f_s(Y(s), s)\,ds + \int_0^t \frac{1}{2} f_{xx}(Y(s), s)\,d[Y]_s \\
&= f(Y(0), 0) + \int_0^t f_x(Y(s), s)X(s)\,dW(s) + \int_0^t (f_x(Y(s), s)Z(s) + f_s(Y(s), s))\,ds \\
&\quad + \int_0^t \frac{1}{2} f_{xx}(Y(s), s)X^2(s)\,ds
\end{aligned}$$


Solution of a stochastic differential equation

A solution of the stochastic differential equation

$$dX = \sigma X\,dW + \mu X\,dt$$

is an adapted stochastic process satisfying

$$X(t) = X(0) + \int_0^t \sigma X(s)\,dW(s) + \int_0^t \mu X(s)\,ds. \tag{8.6}$$

Ito's formula suggests applying a substitution method similar to that used to solve linear ordinary differential equations, that is, look for a solution of the form

$$X(t) = X(0)\exp\{aW(t) + bt\} \tag{8.7}$$


Identification of solution

Setting Y(t) = aW(t) + bt so that $[Y]_t = a^2 t$, applying Ito's formula, we have

$$\begin{aligned}
X(t) &= X(0) + \int_0^t X(s)\,dY(s) + \int_0^t \frac{1}{2} X(s)\,d[Y]_s \\
&= X(0) + \int_0^t aX(s)\,dW(s) + \int_0^t \Big(b + \frac{1}{2}a^2\Big) X(s)\,ds
\end{aligned}$$

Consequently, setting a = σ and $b = \mu - \frac{1}{2}\sigma^2$, (8.7) satisfies (8.6), that is

$$X(t) = X(0)\exp\Big\{\sigma W(t) + \Big(\mu - \frac{1}{2}\sigma^2\Big)t\Big\}$$


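Vasicek interest rate model = Ornstein-Uhlenbeck process

$$dR = (\alpha - \beta R)\,dt + \sigma\,dW$$

or

$$R(t) = R(0) + \int_0^t (\alpha - \beta R(s))\,ds + \sigma W(t)$$

Employ an integrating factor setting $Q(t) = e^{\beta t}R(t)$. Then

$$Q(t) = R(0) + \int_0^t \alpha e^{\beta s}\,ds + \sigma \int_0^t e^{\beta s}\,dW(s)$$

Hence

$$R(t) = e^{-\beta t}R(0) + \frac{\alpha}{\beta}(1 - e^{-\beta t}) + \sigma e^{-\beta t} \int_0^t e^{\beta s}\,dW(s)$$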
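
For deterministic R(0), the explicit solution gives the mean and variance of R(t) directly: the stochastic integral has mean zero, and by the Ito isometry its variance is $\int_0^t e^{2\beta s}\,ds$. The sketch below codes the resulting formulas; the function names and parameter values are illustrative assumptions, and β > 0 is assumed.

```python
from math import exp


def vasicek_mean(t, r0, alpha, beta):
    """E[R(t)] = e^{-beta t} R(0) + (alpha / beta) (1 - e^{-beta t})."""
    return exp(-beta * t) * r0 + alpha / beta * (1 - exp(-beta * t))


def vasicek_variance(t, sigma, beta):
    """Var(R(t)) = sigma^2 e^{-2 beta t} * int_0^t e^{2 beta s} ds (Ito isometry)."""
    return sigma ** 2 * (1 - exp(-2 * beta * t)) / (2 * beta)


long_run_mean = vasicek_mean(200.0, 0.03, 0.02, 0.5)
long_run_var = vasicek_variance(200.0, 0.1, 0.5)
```

As t grows the mean tends to α/β and the variance to σ²/(2β), which is the mean-reverting behaviour the model is used for.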


Covariation

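Theorem 8.18 Let

$$Y_1(t) = Y_1(0) + \int_0^t X_1(s)\,dW(s) + \int_0^t U_1(s)\,ds$$

$$Y_2(t) = Y_2(0) + \int_0^t X_2(s)\,dW(s) + \int_0^t U_2(s)\,ds.$$

Y1 and Y2 are called Ito processes. Then

$$\begin{aligned}
[Y_1, Y_2]_t &\equiv \lim_{\{t_i\} \to *} \sum (Y_1(t_{i+1} \wedge t) - Y_1(t_i \wedge t))(Y_2(t_{i+1} \wedge t) - Y_2(t_i \wedge t)) \\
&= Y_1(t)Y_2(t) - Y_1(0)Y_2(0) - \int_0^t Y_1(s)\,dY_2(s) - \int_0^t Y_2(s)\,dY_1(s) \\
&= \int_0^t X_1(s)X_2(s)\,ds
\end{aligned}$$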


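Proof.

$$\begin{aligned}
[Y_1, Y_2]_t &\equiv \lim_{\{t_i\} \to *} \sum (Y_1(t_{i+1} \wedge t) - Y_1(t_i \wedge t))(Y_2(t_{i+1} \wedge t) - Y_2(t_i \wedge t)) \\
&= \sum (Y_1(t_{i+1} \wedge t)Y_2(t_{i+1} \wedge t) - Y_1(t_i \wedge t)Y_2(t_i \wedge t)) \\
&\quad - \lim_{\{t_i\} \to *} \sum Y_1(t_i \wedge t)(Y_2(t_{i+1} \wedge t) - Y_2(t_i \wedge t)) \\
&\quad - \lim_{\{t_i\} \to *} \sum Y_2(t_i \wedge t)(Y_1(t_{i+1} \wedge t) - Y_1(t_i \wedge t)) \\
&= Y_1(t)Y_2(t) - Y_1(0)Y_2(0) - \int_0^t Y_1(s)\,dY_2(s) - \int_0^t Y_2(s)\,dY_1(s)
\end{aligned}$$

To verify the second equality in the theorem, first assume X1 and X2 are simple. Then approximate.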


Integration by parts

Corollary 8.19 Let Y1 and Y2 be Ito processes. Then

$$\int_0^t Y_1(s)\,dY_2(s) = Y_1(t)Y_2(t) - Y_1(0)Y_2(0) - \int_0^t Y_2(s)\,dY_1(s) - [Y_1, Y_2]_t.$$


9. Black-Scholes and other models

• Derivation of the Black-Scholes model

• A population model

• Value of a portfolio

• Hedging a contract

• Pricing contracts using the Black-Scholes model


Derivation of the Black-Scholes model

First consider a discrete time model of the value of a stock. At each time step, the value increases or decreases by a certain percentage, so the value becomes

$$Y_{k+1} = Y_k(1 + \xi_{k+1})$$

If the time steps are small, the percentage change is small, suggesting

$$Y^n_{k+1} = Y^n_k\Big(1 + \frac{1}{\sqrt{n}}\xi_{k+1} + \frac{1}{n}\alpha\Big)$$

where we assume the ξk are independent with

$$E[\xi_k] = 0, \qquad Var(\xi_k) = \sigma^2$$


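Define $S_n(t) = Y^n_{[nt]}$. Then

$$\begin{aligned}
\log S_n(t) &= \log S_n(0) + \sum_{k=0}^{[nt]-1} \log\Big(1 + \frac{1}{\sqrt{n}}\xi_{k+1} + \frac{1}{n}\alpha\Big) \\
&\approx \log S_n(0) + \sum_{k=0}^{[nt]-1} \frac{1}{\sqrt{n}}\xi_{k+1} + \frac{[nt]}{n}\alpha - \frac{1}{2} \sum_{k=0}^{[nt]-1} \Big(\frac{1}{\sqrt{n}}\xi_{k+1} + \frac{1}{n}\alpha\Big)^2 \\
&\approx \log S(0) + \sigma W(t) + \alpha t - \frac{1}{2}\sigma^2 t
\end{aligned}$$

and

$$S(t) = S(0)\exp\Big\{\sigma W(t) + \Big(\alpha - \frac{1}{2}\sigma^2\Big)t\Big\} \tag{9.1}$$

S is called geometric Brownian motion.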


An alternative derivation

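Since $Y^n_{k+1} = Y^n_k(1 + \frac{1}{\sqrt{n}}\xi_{k+1} + \frac{1}{n}\alpha) = Y^n_k + Y^n_k(\frac{1}{\sqrt{n}}\xi_{k+1} + \frac{1}{n}\alpha)$, we have

$$Y^n_{[nt]} = Y^n_0 + \sum_{k=0}^{[nt]-1} (Y^n_{k+1} - Y^n_k) = Y^n_0 + \sum_{k=0}^{[nt]-1} Y^n_k \frac{1}{\sqrt{n}}\xi_{k+1} + \sum_{k=0}^{[nt]-1} \alpha Y^n_k \frac{1}{n}$$

Setting $S_n(t) = Y^n_{[nt]}$, $W_n(t) = \frac{1}{\sigma\sqrt{n}} \sum_{k=1}^{[nt]} \xi_k$, $A_n(t) = \frac{[nt]}{n}$,

$$S_n(t) = S_n(0) + \int_0^t \sigma S_n(s-)\,dW_n(s) + \int_0^t \alpha S_n(s-)\,dA_n(s)$$

suggesting a limit satisfying

$$S(t) = S(0) + \int_0^t \sigma S(s)\,dW(s) + \int_0^t \alpha S(s)\,ds \tag{9.2}$$

We know that (9.1) satisfies (9.2), so this derivation looks plausible.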


A population model

Consider the following discrete-time branching process. Let $\{\xi^k_i, i \ge 1, k \ge 0\}$ be iid, nonnegative, integer-valued with $E[\xi^k_i] = 1$ and $Var(\xi^k_i) = \sigma^2$. $\xi^k_i$ gives the number of offspring of the ith individual in the kth generation, so

$$Z_{k+1} = \sum_{i=1}^{Z_k} \xi^k_i = Z_k + \sum_{i=1}^{Z_k} (\xi^k_i - 1).$$

Consider a sequence of branching processes with $Z^n_0 = n$ and define

$$X_n(t) = \frac{1}{n} Z^n_{[nt]}.$$


Derivation of limiting equation

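$$X_n(t) = 1 + \frac{1}{n} \sum_{k=0}^{[nt]-1} (Z^n_{k+1} - Z^n_k) = 1 + \frac{1}{n} \sum_{k=0}^{[nt]-1} \sum_{i=1}^{Z^n_k} (\xi^k_i - 1)$$

Define

$$W_n(t) = \sum_{k=1}^{[nt]} \frac{1}{\sigma\sqrt{n}\sqrt{Z^n_k}} \sum_{i=1}^{Z^n_k} (\xi^k_i - 1)$$

Claim: $W_n$ and $W_n(t)^2 - \frac{[nt]}{n}$ are martingales, and

$$X_n(t) = 1 + \int_0^t \sigma\sqrt{X_n(s-)}\,dW_n(s)$$

suggesting that Xn converges to a solution of

$$X(t) = 1 + \int_0^t \sigma\sqrt{X(s)}\,dW(s)$$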


Value of a portfolio

Suppose that a market consists of a stock with price given by

$$S(t) = S(0) + \int_0^t \sigma S(s)\,dW(s) + \int_0^t \alpha S(s)\,ds$$

and a money market account paying continuously compounded interest at rate r.

∆(t) denotes the number of shares of stock held by an investor at time t, with the remainder of the investor's wealth, X(t) − ∆(t)S(t), in the money market. The portfolio value satisfies

$$\begin{aligned}
X(t) &= X(0) + \int_0^t \Delta(s)\,dS(s) + \int_0^t r(X(s) - \Delta(s)S(s))\,ds \\
&= X(0) + \int_0^t \sigma\Delta(s)S(s)\,dW(s) + \int_0^t (rX(s) + (\alpha - r)\Delta(s)S(s))\,ds
\end{aligned}$$


Hedging a contract

Suppose a bank sells a contract that pays, at time T, an amount based on the stock price at time T.

That is, the bank pays h(S(T)).

The contract can be hedged if there exists a portfolio whose value X at time T equals h(S(T)). We want an initial value X(0) and an investment strategy ∆ such that X(T) = h(S(T)).


Derivation of the hedging strategy

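Suppose X(t) = q(t, S(t)). Then

$$\begin{aligned}
q(t, S(t)) &= q(0, S(0)) + \int_0^t q_x(s, S(s))\sigma S(s)\,dW(s) \\
&\quad + \int_0^t \Big(q_x(s, S(s))\alpha S(s) + \frac{\sigma^2}{2} q_{xx}(s, S(s))S^2(s) + q_s(s, S(s))\Big)\,ds
\end{aligned}$$

Comparing this identity to

$$\begin{aligned}
X(t) &= X(0) + \int_0^t \Delta(s)\,dS(s) + \int_0^t r(X(s) - \Delta(s)S(s))\,ds \\
&= X(0) + \int_0^t \sigma\Delta(s)S(s)\,dW(s) + \int_0^t (rX(s) + (\alpha - r)\Delta(s)S(s))\,ds
\end{aligned}$$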


A PDE for q

Matching coefficients gives $\Delta(s) = q_x(s, S(s))$ and

$$q_x(s, S(s))\alpha S(s) + \frac{\sigma^2}{2} q_{xx}(s, S(s))S^2(s) + q_s(s, S(s)) = rq(s, S(s)) + (\alpha - r)q_x(s, S(s))S(s),$$

and hence

$$q_x(s, S(s))rS(s) + \frac{\sigma^2}{2} q_{xx}(s, S(s))S^2(s) + q_s(s, S(s)) = rq(s, S(s)).$$


Pricing contracts using the Black-Scholes model

We want q to satisfy the partial differential equation

$$q_s(s, x) + rxq_x(s, x) + \frac{1}{2}\sigma^2 x^2 q_{xx}(s, x) = rq(s, x)$$

with terminal condition q(T, x) = h(x). Then the price of the contract should be X(0) = q(0, S(0)). Holding $\Delta(s) = q_x(s, S(s))$ shares of stock at each time s and the remainder in the money market, the value of the portfolio at time T is q(T, S(T)) = h(S(T)).
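
As an illustration, take the payoff h(x) = max(x − K, 0) of a European call with strike K. This payoff is an assumed example, not one singled out in the notes. The standard closed-form call value should then solve the PDE above, and the sketch below checks this with central finite differences. The function names, step sizes, and parameter values are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(z):
    """Standard normal cdf."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def call_value(s, x, strike, r, sigma, maturity):
    """q(s, x) for h(x) = max(x - strike, 0); requires s < maturity and x > 0."""
    tau = maturity - s
    d1 = (log(x / strike) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return x * norm_cdf(d1) - strike * exp(-r * tau) * norm_cdf(d2)


def pde_residual(s, x, strike, r, sigma, maturity, ds=1e-4, dx=1e-2):
    """q_s + r x q_x + (1/2) sigma^2 x^2 q_xx - r q, by central differences."""
    def q(s_, x_):
        return call_value(s_, x_, strike, r, sigma, maturity)

    q_s = (q(s + ds, x) - q(s - ds, x)) / (2 * ds)
    q_x = (q(s, x + dx) - q(s, x - dx)) / (2 * dx)
    q_xx = (q(s, x + dx) - 2 * q(s, x) + q(s, x - dx)) / dx ** 2
    return q_s + r * x * q_x + 0.5 * sigma ** 2 * x ** 2 * q_xx - r * q(s, x)


price = call_value(0.0, 100.0, 100.0, 0.05, 0.2, 1.0)
residual = pde_residual(0.0, 100.0, 100.0, 0.05, 0.2, 1.0)
```

The residual is zero up to discretization error, and `q_x` in this example is the hedge ratio ∆(s) from the matching of coefficients.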


10. Multidimensional stochastic calculus

• Multidimensional Brownian motion

• Ito processes

• Covariation

• Covariation for Ito processes

• Multidimensional Ito formula


Multidimensional Brownian motion

Definition 10.1 W = (W1, . . . , Wd) is a d-dimensional standard Ft-Brownian motion if W1, . . . , Wd are independent (1-dimensional) standard Brownian motions adapted to Ft and for each t ≥ 0, W(t + ·) − W(t) is independent of Ft.

We will view W as a column vector.


Stochastic integration with respect to W

Let $M^{m \times d}$ denote the collection of m × d matrices. Let X be $M^{m \times d}$-valued, cadlag and adapted. Then

$$\int_0^t X(s)\,dW(s) = \lim_{\{t_i\} \to *} \sum X(t_i)(W(t_{i+1} \wedge t) - W(t_i \wedge t)).$$

Note that $Z(t) = \int_0^t X(s)\,dW(s) \in \mathbb{R}^m$ and that

$$Z_i(t) = \sum_{j=1}^d \int_0^t X_{ij}(s)\,dW_j(s).$$


Ito processes

Definition 10.2 If X is $M^{m \times d}$-valued, cadlag and adapted and U is $\mathbb{R}^m$-valued, cadlag and adapted, then

$$Z(t) = \int_0^t X(s)\,dW(s) + \int_0^t U(s)\,ds$$

is an Ito process.


Covariation

Definition 10.3 The covariation of Y1 and Y2 is

[Y1, Y2]t = limti→∗


Lemma 10.4

$$[Y_1, Y_2]_t = Y_1(t)Y_2(t) - Y_1(0)Y_2(0) - \int_0^t Y_1(s)\,dY_2(s) - \int_0^t Y_2(s)\,dY_1(s)$$

if the two stochastic integrals exist.

Lemma 10.5 The covariation is bilinear in the sense that

$$[aY_1 + bY_2, cZ_1 + dZ_2]_t = ac[Y_1, Z_1]_t + ad[Y_1, Z_2]_t + bc[Y_2, Z_1]_t + bd[Y_2, Z_2]_t.$$
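The identity in Lemma 10.4 already holds exactly at the level of the approximating sums, before any limit is taken, because it is just summation by parts. The following is a minimal sketch that checks this on two sampled Gaussian random-walk paths; the function names and the choice of random-walk paths are illustrative assumptions, not part of the notes.

```python
import random

def discrete_covariation(y1, y2):
    """Sum of products of increments of two paths sampled on a common grid."""
    return sum((y1[i + 1] - y1[i]) * (y2[i + 1] - y2[i]) for i in range(len(y1) - 1))

def left_point_integral(x, y):
    """Left-endpoint sum: sum_i x(t_i) * (y(t_{i+1}) - y(t_i))."""
    return sum(x[i] * (y[i + 1] - y[i]) for i in range(len(y) - 1))

def random_walk_path(n, dt, rng):
    """Path started at 0 with independent N(0, dt) increments."""
    path = [0.0]
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, dt ** 0.5))
    return path

rng = random.Random(0)
n, dt = 1000, 1e-3
y1 = random_walk_path(n, dt, rng)
y2 = random_walk_path(n, dt, rng)

# Left side: discrete covariation. Right side: product rule minus the two integrals.
lhs = discrete_covariation(y1, y2)
rhs = (y1[-1] * y2[-1] - y1[0] * y2[0]
       - left_point_integral(y1, y2) - left_point_integral(y2, y1))
print(abs(lhs - rhs))  # agreement up to floating-point rounding
```

The two sides agree for any pair of sampled paths, which is why the lemma only needs the two stochastic integrals to exist as limits of such sums.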


Covariation for Ito processes

To compute the covariations for an Ito process, it is enough to know the following:

Lemma 10.6 If

$$Y_1(t) = \int_0^t X_1(s)\,dW_1(s) + \int_0^t U_1(s)\,ds,$$

$$Y_2(t) = \int_0^t X_2(s)\,dW_2(s) + \int_0^t U_2(s)\,ds,$$

$$Y_3(t) = \int_0^t X_2(s)\,dW_1(s) + \int_0^t U_2(s)\,ds,$$

then

$$[Y_1, Y_2]_t = 0, \qquad [Y_1, Y_3]_t = \int_0^t X_1(s)X_2(s)\,ds.$$


Covariation for Ito processes

Lemma 10.7 If $Y_1$ and $Y_2$ are $\mathbb{R}$-valued Ito processes with $[Y_1, Y_2]_t = \int_0^t R(s)\,ds$, $X_1$ and $X_2$ are cadlag and adapted, and

$$Z_1(t) = \int_0^t X_1(s)\,dY_1(s), \qquad Z_2(t) = \int_0^t X_2(s)\,dY_2(s),$$

then

$$[Z_1, Z_2]_t = \int_0^t X_1(s)X_2(s)\,d[Y_1, Y_2]_s = \int_0^t X_1(s)X_2(s)R(s)\,ds.$$


Matrix notation for covariation

Let $Z(t) = \int_0^t X(s)\,dW(s) + \int_0^t U(s)\,ds$. Then

$$[Z_i, Z_j]_t = \sum_{k=1}^d \int_0^t X_{ik}(s)X_{jk}(s)\,ds,$$

or, thinking of the covariation as defining a matrix $[Z]_t = (([Z_i, Z_j]_t))$,

$$[Z]_t = \int_0^t X(s)X(s)^T\,ds,$$

where $M^T$ denotes the transpose of the matrix $M$.


Multidimensional Ito formula

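Theorem 10.8 Let $Z$ be an $m$-dimensional Ito process

$$Z(t) = Z(0) + \int_0^t X(s)\,dW(s) + \int_0^t U(s)\,ds$$

and let $f \in C^2(\mathbb{R}^m)$. Define the Hessian matrix $Hf(x) = ((\partial_i\partial_j f(x)))$. Then

$$\begin{aligned} f(Z(t)) &= f(Z(0)) + \int_0^t \nabla f(Z(s))^T\,dZ(s) + \frac{1}{2}\sum_{i,j}\int_0^t \partial_i\partial_j f(Z(s))\,d[Z_i, Z_j]_s \\ &= f(Z(0)) + \int_0^t \nabla f(Z(s))^T X(s)\,dW(s) + \int_0^t \nabla f(Z(s))^T U(s)\,ds \\ &\quad + \frac{1}{2}\int_0^t \operatorname{trace}\big(X(s)^T Hf(Z(s)) X(s)\big)\,ds. \end{aligned}$$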


Examples

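$$f(W(t)) = f(0) + \int_0^t \nabla f(W(s))^T\,dW(s) + \frac{1}{2}\int_0^t \Delta f(W(s))\,ds,$$

where $\Delta$ is the Laplacian

$$\Delta f(x) = \sum_{k=1}^d \partial_k^2 f(x).$$

$$X(t) = X(0)\exp\Big\{\sum_{k=1}^d \sigma_k W_k(t) + \Big(\mu - \frac{1}{2}\sum_{k=1}^d \sigma_k^2\Big)t\Big\}$$

satisfies the stochastic differential equation

$$X(t) = X(0) + \int_0^t X(s)\sigma^T\,dW(s) + \int_0^t \mu X(s)\,ds.$$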
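The second example can be checked in a few lines. The sketch below applies the one-dimensional Ito formula to $e^{Y}$, where $Y$ is the exponent; the auxiliary process $Y$ and the shorthand $|\sigma|^2 = \sum_k \sigma_k^2$ are introduced here only for the computation.

```latex
\begin{aligned}
Y(t) &= \sum_{k=1}^d \sigma_k W_k(t) + \Big(\mu - \tfrac12 |\sigma|^2\Big) t,
\qquad [Y]_t = |\sigma|^2\, t, \qquad X(t) = X(0)\,e^{Y(t)}, \\
dX(t) &= X(t)\,dY(t) + \tfrac12 X(t)\,d[Y]_t \\
      &= X(t)\,\sigma^T dW(t) + \Big(\mu - \tfrac12 |\sigma|^2\Big) X(t)\,dt + \tfrac12 |\sigma|^2 X(t)\,dt \\
      &= X(t)\,\sigma^T dW(t) + \mu X(t)\,dt .
\end{aligned}
```

The correction term $-\tfrac12|\sigma|^2 t$ in the exponent is exactly what cancels the second-order Ito term, leaving drift $\mu X$.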


Diffusion equations

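If $X$ satisfies

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds,$$

then

$$f(X(t)) = f(X(0)) + \int_0^t \nabla f(X(s))^T \sigma(X(s))\,dW(s) + \int_0^t Lf(X(s))\,ds,$$

where

$$Lf(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)\partial_i\partial_j f(x) + b(x) \cdot \nabla f(x)$$

and $a(x) = \sigma(x)\sigma(x)^T$.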



11. Stochastic differential equations

• Brownian bridge

• Convergence of empirical distribution functions

• Kolmogorov-Smirnov test

• Conditioned Brownian motion

• An SDE for Brownian bridge

• Stochastic differential equations

• Moment estimates

• Gronwall inequality

• Schwarz inequality

• Lipschitz conditions for existence and uniqueness


Brownian bridge

Recall that if $(Y_1, \ldots, Y_l, X_1, \ldots, X_m)$ is jointly Gaussian and $\mathrm{Cov}(Y_i, X_j) = 0$ for $i = 1, \ldots, l$ and $j = 1, \ldots, m$, then $Y = (Y_1, \ldots, Y_l)$ and $X = (X_1, \ldots, X_m)$ are independent.

Let $W$ be a scalar, standard Brownian motion. By Problem 6, $W(1)$ is independent of $B$ defined by $B(t) = W(t) - tW(1)$, $0 \le t \le 1$.

Note that $\mathrm{Cov}(B(t), B(s)) = s \wedge t - ts$.

Lemma 11.1 $\sigma(B(t) : 0 \le t \le 1)$ is independent of $\sigma(W(1))$.

Proof. Since $E[B(t)W(1)] = 0$, $(B(t_1), \ldots, B(t_m))$ is independent of $W(1)$ for all choices of $t_1, \ldots, t_m \in [0, 1]$. Independence of the $\sigma$-algebras follows by a standard Math 831 argument.


Convergence of empirical distribution functions

Let $\xi_1, \xi_2, \ldots$ be iid uniform $[0, 1]$ random variables, and define the empirical distribution function

$$F_n(t) = \frac{1}{n}\sum_{i=1}^n 1_{(-\infty, t]}(\xi_i).$$

$E[F_n(t)] = F_\xi(t)$ and

$$\mathrm{Cov}(F_n(t), F_n(s)) = \frac{1}{n}(t \wedge s - ts).$$

Define $B_n(t) = \sqrt{n}(F_n(t) - F_\xi(t))$, $0 \le t \le 1$. Then $(B_n(t_1), \ldots, B_n(t_m))$ converges in distribution to $(B(t_1), \ldots, B(t_m))$, $t_1, \ldots, t_m \in [0, 1]$.


A class of continuous functions

Definition 11.2 Let $D[0, 1]$ denote the cadlag, real-valued functions on $[0, 1]$. Let $C^u_D$ be the collection of functions $g : D[0, 1] \to \mathbb{R}$ such that if $x_n \in D[0, 1]$ and $x \in C[0, 1]$ satisfy $\sup_{0 \le t \le 1} |x_n(t) - x(t)| \to 0$, then $\lim_{n \to \infty} g(x_n) = g(x)$.

For example,

$$g(x) = \sup_{0 \le t \le 1} x(t), \qquad g(x) = \int_0^1 x(s)\,ds.$$


Functional limit theorem

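Theorem 11.3 For each $g \in C^u_D$, $g(B_n)$ converges in distribution to $g(B)$.

In particular,

$$\lim_{n \to \infty} P\Big\{\sup_{0 \le t \le 1} \sqrt{n}(F_n(t) - F_\xi(t)) \le x\Big\} = P\Big\{\sup_{0 \le t \le 1} B(t) \le x\Big\}$$

and

$$\lim_{n \to \infty} P\Big\{\sup_{0 \le t \le 1} |\sqrt{n}(F_n(t) - F_\xi(t))| \le x\Big\} = P\Big\{\sup_{0 \le t \le 1} |B(t)| \le x\Big\}.$$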


More general empirical distributions

Let $X_1, X_2, \ldots$ be iid with continuous distribution function $F_X$. Define $F^X_n(x) = \frac{1}{n}\sum_{i=1}^n 1_{(-\infty, x]}(X_i)$ and

$$D_n(x) = \sqrt{n}(F^X_n(x) - F_X(x)).$$

Since

$$F^X_n(x) = \frac{1}{n}\sum_{i=1}^n 1_{(-\infty, F_X(x)]}(F_X(X_i)) \quad \text{a.s.}$$

and $F_X(X_i)$ is uniform $[0, 1]$, $D_n$ has the same distribution as $B_n \circ F_X$. In particular, $\sup_x D_n(x)$ ($\sup_x |D_n(x)|$) has the same distribution as $\sup_{0 \le t \le 1} B_n(t)$ ($\sup_{0 \le t \le 1} |B_n(t)|$).


Conditioned Brownian motion

Note that

$$E\Big[g\Big(\sup_{0 \le t \le 1} W(t)\Big)\Big|W(1)\Big] = E\Big[g\Big(\sup_{0 \le t \le 1} (B(t) + tW(1))\Big)\Big|W(1)\Big],$$

and since $W(1)$ is independent of $B$, if we define

$$h(y) = E\Big[g\Big(\sup_{0 \le t \le 1} (B(t) + ty)\Big)\Big],$$

we have

$$E\Big[g\Big(\sup_{0 \le t \le 1} W(t)\Big)\Big|W(1)\Big] = h(W(1)).$$

The same analysis works for any function of $W$ restricted to $[0, 1]$, and we say that the conditional distribution of $W$ given that $W(1) = b$ is the distribution of $B_b$ defined by $B_b(t) = B(t) + tb$.


Linear transformations of W

Lemma 11.5 Let $K : \{(t, u) : 0 \le u \le t\} \to \mathbb{R}$ satisfy $\int_0^t K(t, u)^2\,du < \infty$ for all $t \ge 0$, and define

$$Z(t) = \int_0^t K(t, u)\,dW(u).$$

Then $Z$ is a mean-zero Gaussian process with covariance

$$E[Z(t)Z(s)] = \int_0^{t \wedge s} K(t, u)K(s, u)\,du.$$


An SDE for Brownian bridge

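For $0 \le t \le 1$,

$$\begin{aligned} B_n(t) &= \sqrt{n}(F_n(t) - t) \\ &= \sqrt{n}\Big(F_n(t) - \int_0^t \frac{1 - F_n(s)}{1 - s}\,ds\Big) - \sqrt{n}\int_0^t \frac{F_n(s) - s}{1 - s}\,ds \\ &= M_n(t) - \int_0^t \frac{B_n(s)}{1 - s}\,ds, \end{aligned}$$

where $M_n$ is a martingale and

$$[M_n]_t = F_n(t) \to t,$$

suggesting that $B$ has the same distribution as the solution of

$$B(t) = W(t) - \int_0^t \frac{B(s)}{1 - s}\,ds.$$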


Solving the SDE

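Employing an integrating factor,

$$\begin{aligned} \frac{1}{1 - t}B(t) &= \int_0^t \frac{1}{1 - s}\,dB(s) + \int_0^t \frac{1}{(1 - s)^2}B(s)\,ds \\ &= \int_0^t \frac{1}{1 - s}\,dW(s) - \int_0^t \frac{1}{(1 - s)^2}B(s)\,ds + \int_0^t \frac{1}{(1 - s)^2}B(s)\,ds, \end{aligned}$$

so

$$B(t) = (1 - t)\int_0^t \frac{1}{1 - s}\,dW(s).$$

Checking the covariance,

$$\begin{aligned} E[B(t)B(r)] &= (1 - t)(1 - r)\int_0^{t \wedge r} \frac{1}{(1 - s)^2}\,ds \\ &= (1 - t)(1 - r)\Big(\frac{1}{1 - t \wedge r} - 1\Big) = t \wedge r - tr. \end{aligned}$$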
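The covariance check can also be done numerically. The following is a minimal sketch that evaluates $(1-t)(1-r)\int_0^{t \wedge r}(1-s)^{-2}\,ds$ with a midpoint rule and compares it with $t \wedge r - tr$; the function names and the step count are illustrative choices, not part of the notes.

```python
def bridge_cov_integral(t, r, n=20_000):
    """(1-t)(1-r) * int_0^{min(t,r)} (1-s)^(-2) ds, midpoint rule with n cells."""
    upper = min(t, r)
    h = upper / n
    integral = sum(h / (1.0 - (i + 0.5) * h) ** 2 for i in range(n))
    return (1.0 - t) * (1.0 - r) * integral

def bridge_cov_closed_form(t, r):
    """Covariance of the Brownian bridge: min(t, r) - t*r."""
    return min(t, r) - t * r

for t, r in [(0.3, 0.7), (0.5, 0.5), (0.9, 0.2)]:
    print(t, r, bridge_cov_integral(t, r), bridge_cov_closed_form(t, r))
```

Both columns agree to many digits, consistent with the solution of the SDE having the Brownian bridge covariance.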


Stochastic differential equations

We consider general stochastic differential equations of the form

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds, \qquad (11.1)$$

where $\sigma : \mathbb{R}^m \to M^{m \times d}$ and $b : \mathbb{R}^m \to \mathbb{R}^m$.

Definition 11.6 $X$ is a solution of (11.1) if and only if $X$ is adapted to a filtration $\{\mathcal{F}_t\}$ for which $W$ is an $\{\mathcal{F}_t\}$-Brownian motion and the identity (11.1) holds a.s.


The associated differential operator

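If $X$ satisfies

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds,$$

then for $f \in C^2(\mathbb{R}^m)$,

$$f(X(t)) = f(X(0)) + \int_0^t \nabla f(X(s))^T \sigma(X(s))\,dW(s) + \int_0^t Lf(X(s))\,ds,$$

where

$$Lf(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)\partial_i\partial_j f(x) + b(x) \cdot \nabla f(x)$$

and $a(x) = \sigma(x)\sigma(x)^T$.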


Moment estimates

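Lemma 11.7 Let $m \ge 2$. Suppose that there exists $K > 0$ such that $|\sigma(x)| + |b(x)| \le K(1 + |x|)$ and that $E[|X(0)|^m] < \infty$.

Proof. Applying Ito’s formula,

$$\begin{aligned} |X(t)|^m = {} & |X(0)|^m + \int_0^t m|X(s)|^{m-2}X(s)^T\sigma(X(s))\,dW(s) \\ & + \int_0^t m|X(s)|^{m-2}\Big(X(s) \cdot b(X(s)) + \frac{1}{2}\operatorname{trace}\,a(X(s))\Big)\,ds \\ & + \frac{1}{2}\int_0^t m(m - 2)|X(s)|^{m-4}X(s)^T a(X(s))X(s)\,ds. \end{aligned}$$

Let $\tau_c = \inf\{t : |X(t)| \ge c\}$. Then there exists $K > 0$ such that

$$\begin{aligned} E[|X(t \wedge \tau_c)|^m] &\le E[|X(0)|^m] + E\Big[\int_0^{t \wedge \tau_c} K(1 + |X(s)|^m)\,ds\Big] \\ &\le E[|X(0)|^m] + Kt + E\Big[\int_0^t K|X(s \wedge \tau_c)|^m\,ds\Big], \end{aligned}$$

which by Gronwall’s inequality and Fatou’s lemma implies

$$E[|X(t)|^m] \le \liminf_{c \to \infty} E[|X(t \wedge \tau_c)|^m] \le (E[|X(0)|^m] + Kt)e^{Kt}.$$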


Gronwall’s inequality

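Lemma 11.8 Suppose that $A$ is continuous and nondecreasing, $X$ is cadlag, and that

$$0 \le X(t) \le \varepsilon + \int_0^t X(s)\,dA(s). \qquad (11.2)$$

Then $X(t) \le \varepsilon e^{A(t)}$.


Proof. Iterating (11.2), we have

$$\begin{aligned} X(t) &\le \varepsilon + \int_0^t X(s)\,dA(s) \\ &\le \varepsilon + \varepsilon A(t) + \int_0^t \int_0^s X(u)\,dA(u)\,dA(s) \\ &\le \varepsilon + \varepsilon A(t) + \varepsilon\int_0^t A(s)\,dA(s) + \int_0^t \int_0^s \int_0^u X(v)\,dA(v)\,dA(u)\,dA(s). \end{aligned}$$

Checking that

$$\int_0^t A(s)\,dA(s) = \frac{1}{2}A(t)^2, \qquad \int_0^t \int_0^s A(u)\,dA(u)\,dA(s) = \frac{1}{6}A(t)^3,$$

and in general

$$\int_0^t \int_0^{t_1} \cdots \int_0^{t_{n-2}} A(t_{n-1})\,dA(t_{n-1}) \cdots dA(t_1) = \frac{1}{n!}A(t)^n,$$

we see that $X(t) \le \varepsilon e^{A(t)}$.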


Schwarz inequality

Lemma 11.9 Suppose $E[X^2], E[Y^2] < \infty$. Then

$$E[XY] \le \sqrt{E[X^2]E[Y^2]}.$$

Proof. Note that $2ab \le z^{-2}a^2 + z^2b^2$, so

$$2E[XY] \le z^{-2}E[X^2] + z^2E[Y^2].$$

Minimizing the right side with respect to $z$ gives the desired inequality.

The same argument gives

$$E[X \cdot Y] \le \sqrt{E[|X|^2]E[|Y|^2]}$$

for random vectors.


Lipschitz conditions for existence and uniqueness

Theorem 11.10 Suppose that $|\sigma(x) - \sigma(y)| + |b(x) - b(y)| \le K|x - y|$. Then there exists a unique solution of

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds. \qquad (11.3)$$


Proof. [of existence] Assume, in addition, that $\sigma$ and $b$ are bounded. For $h > 0$, define $\eta_h(t) = h[h^{-1}t]$, where $[\cdot]$ denotes the integer part, and let $X_h$ satisfy

$$X_h(t) = X(0) + \int_0^{\eta_h(t)} \sigma(X_h(s))\,dW(s) + \int_0^{\eta_h(t)} b(X_h(s))\,ds.$$

Note that $X_h$ is constant on the interval $[kh, (k + 1)h)$, and that

$$X_h((k + 1)h) = X_h(kh) + \sigma(X_h(kh))\big(W((k + 1)h) - W(kh)\big) + b(X_h(kh))h$$

(known as the Euler-Maruyama scheme in the SDE literature).

Define

$$D_h(t) = \int_{\eta_h(t)}^t \sigma(X_h(s))\,dW(s) + \int_{\eta_h(t)}^t b(X_h(s))\,ds,$$

so

$$X_h(t) + D_h(t) = X(0) + \int_0^t \sigma(X_h(s))\,dW(s) + \int_0^t b(X_h(s))\,ds.$$
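The recursion for $X_h((k+1)h)$ is straightforward to simulate. Below is a minimal sketch for the scalar case ($m = d = 1$), assuming the Brownian increments are generated as independent $N(0, h)$ variables; the function name `euler_maruyama` and the example coefficients are illustrative and not taken from the notes.

```python
import math
import random

def euler_maruyama(x0, sigma, b, h, n_steps, rng):
    """Return [X_h(0), X_h(h), ..., X_h(n_steps * h)] for a scalar SDE."""
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(h))  # W((k+1)h) - W(kh) ~ N(0, h)
        path.append(x + sigma(x) * dw + b(x) * h)
    return path

# Example with Lipschitz coefficients: sigma(x) = 0.2 * x, b(x) = 0.05 * x.
rng = random.Random(42)
path = euler_maruyama(1.0, lambda x: 0.2 * x, lambda x: 0.05 * x, 0.01, 100, rng)
print(len(path), path[-1])
```

With $\sigma \equiv 0$ the scheme reduces to the ordinary Euler method for $x' = b(x)$, which gives a quick deterministic sanity check.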


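Uniform moment bounds

Suppose $2xb(x) + \sigma^2(x) \le K_1 - \varepsilon|x|^2$. (For example, consider the equation $X(t) = X(0) - \int_0^t \alpha X(s)\,ds + W(t)$.) Then

$$\begin{aligned} e^{\varepsilon t}|X(t)|^2 &\le |X(0)|^2 + \int_0^t e^{\varepsilon s}2X(s)\sigma(X(s))\,dW(s) \\ &\quad + \int_0^t e^{\varepsilon s}\big[2X(s)b(X(s)) + \sigma^2(X(s))\big]\,ds + \int_0^t \varepsilon e^{\varepsilon s}|X(s)|^2\,ds \\ &\le |X(0)|^2 + \int_0^t e^{\varepsilon s}2X(s)\sigma(X(s))\,dW(s) + \int_0^t e^{\varepsilon s}K_1\,ds \\ &\le |X(0)|^2 + \int_0^t e^{\varepsilon s}2X(s)\sigma(X(s))\,dW(s) + \frac{K_1}{\varepsilon}(e^{\varepsilon t} - 1), \end{aligned}$$

and hence

$$e^{\varepsilon t}E[|X(t)|^2] \le E[|X(0)|^2] + \frac{K_1}{\varepsilon}[e^{\varepsilon t} - 1].$$

Therefore, we have the uniform bound

$$E[|X(t)|^2] \le e^{-\varepsilon t}E[|X(0)|^2] + \frac{K_1}{\varepsilon}(1 - e^{-\varepsilon t}).$$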
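For the example equation mentioned above, the constants can be read off directly. The following worked instance assumes $\alpha > 0$; the identification $K_1 = 1$, $\varepsilon = 2\alpha$ is a computation added here, not a statement from the notes.

```latex
\begin{aligned}
b(x) &= -\alpha x, \quad \sigma(x) = 1: \qquad
2xb(x) + \sigma^2(x) = 1 - 2\alpha x^2,
\quad\text{so } K_1 = 1,\ \varepsilon = 2\alpha, \\
E[|X(t)|^2] &\le e^{-2\alpha t} E[|X(0)|^2] + \frac{1}{2\alpha}\big(1 - e^{-2\alpha t}\big)
\le E[|X(0)|^2] + \frac{1}{2\alpha}.
\end{aligned}
```

In this linear case the first inequality is in fact an equality, since the second moment of the Ornstein-Uhlenbeck process can be computed exactly, so the bound cannot be improved in general.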


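Vector case

In the vector case

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds,$$

$$\begin{aligned} |X(t)|^2 = {} & |X(0)|^2 + \int_0^t 2X(s)^T\sigma(X(s))\,dW(s) \\ & + \int_0^t \big(2X(s) \cdot b(X(s)) + \operatorname{trace}(\sigma(X(s))\sigma(X(s))^T)\big)\,ds. \end{aligned}$$

As in the univariate case, if we assume

$$2x \cdot b(x) + \operatorname{trace}(\sigma(x)\sigma(x)^T) \le K_1 - \varepsilon|x|^2,$$

then $E[|X(s)|^2]$ is uniformly bounded.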


12. Markov property

• Definition

• Solution of SDE as a function of the initial position

• Flow property

• Markov property

• Strong Markov property

• Differentiability with respect to initial position


Definition

Recall the definition of the Markov property:

$$E[f(X(t + s))|\mathcal{F}_t] = E[f(X(t + s))|X(t)].$$


Solution of SDE as a function of the initial position

For $t \ge r$, let

$$X_r(t, x) = x + \int_r^{r+t} \sigma(X_r(s - r, x))\,dW(s) + \int_r^{r+t} b(X_r(s - r, x))\,ds. \qquad (12.1)$$

Define $W_r(t) = W(r + t) - W(r)$. Then

$$X_r(t, x) = x + \int_0^t \sigma(X_r(s, x))\,dW_r(s) + \int_0^t b(X_r(s, x))\,ds. \qquad (12.2)$$

Assuming that uniqueness holds, since the distribution of $W_r$ does not depend on $r$, the distribution of $X_r$ does not depend on $r$. In particular, for $f \in B(\mathbb{R}^m)$, the bounded, Borel-measurable functions on $\mathbb{R}^m$,

$$T(t)f(x) \equiv E[f(X_r(t, x))]$$

does not depend on $r$.


Flow property

Write $X(t, x) = X_0(t, x)$.

Theorem 12.2 Let $\sigma$ and $b$ be such that solutions of the SDE are unique for all choices of the initial condition (for example, if $\sigma$ and $b$ are Lipschitz). Then

$$X(r + t, x) = X_r(t, X(r, x)).$$

Proof. By subtraction,

$$\begin{aligned} X(r + t, x) &= X(r, x) + \int_r^{r+t} \sigma(X(s, x))\,dW(s) + \int_r^{r+t} b(X(s, x))\,ds \qquad (12.3) \\ &= X(r, x) + \int_0^t \sigma(X(r + s, x))\,dW_r(s) + \int_0^t b(X(r + s, x))\,ds. \end{aligned}$$

Comparing this equation to (12.2), the proof becomes “clear”, although there are a number of technicalities. For example, we need to know that $X_r(t, x)$ is at least a measurable function of $x$.


Corollary 12.3 Under the assumptions of the theorem, if $X(0)$ is independent of $W$ and $X$ satisfies (11.1), then

$$X(r + t) = X(r + t, X(0)) = X_r(t, X(r)).$$


Markov property

Theorem 12.4 Let $\sigma$ and $b$ be such that solutions of the SDE are unique for all choices of the initial condition (for example, if $\sigma$ and $b$ are Lipschitz). If $X$ is a solution of (11.1), then $X$ is Markov.

Proof. By uniqueness, $X(r)$ is independent of $W_r$ and hence of $X_r(t, x)$. Consequently, for $f \in B(\mathbb{R}^m)$,

$$E[f(X(r + t))|\mathcal{F}_r] = E[f(X_r(t, X(r)))|\mathcal{F}_r] = T(t)f(X(r)).$$


Strong Markov property

Theorem 12.5 Let $\sigma$ and $b$ be such that solutions of the SDE are unique for all choices of the initial condition (for example, if $\sigma$ and $b$ are Lipschitz). If $X$ is a solution of (11.1), then for each finite $\{\mathcal{F}_t\}$-stopping time $\tau$ and each $f \in B(\mathbb{R}^m)$,

$$E[f(X(\tau + t))|\mathcal{F}_\tau] = T(t)f(X(\tau)).$$

Proof. By the strong Markov property for Brownian motion, $W_\tau$ defined by $W_\tau(t) = W(\tau + t) - W(\tau)$ is a Brownian motion independent of $X(\tau)$, and $X_\tau$ satisfying

$$X_\tau(t, x) = x + \int_0^t \sigma(X_\tau(s, x))\,dW_\tau(s) + \int_0^t b(X_\tau(s, x))\,ds \qquad (12.4)$$

has the same distribution as $X_r(t, x)$. As with the Markov property,

$$E[f(X(\tau + t))|\mathcal{F}_\tau] = E[f(X_\tau(t, X(\tau)))|\mathcal{F}_\tau] = T(t)f(X(\tau)).$$


Continuity with respect to the initial condition

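Theorem 12.6 Let $k \ge 2$. Under Lipschitz conditions, there exists a constant $M_k$ such that

$$E[|X(t, x) - X(t, y)|^k] \le |x - y|^k e^{M_k t}.$$

Proof. Let $Z_{x,y}(t) = X(t, x) - X(t, y)$. As in the proof of uniqueness,

$$\begin{aligned} |Z_{x,y}(t)|^k = {} & |x - y|^k + \int_0^t k|Z_{x,y}(s)|^{k-2}Z_{x,y}(s)^T\big(\sigma(X(s, x)) - \sigma(X(s, y))\big)\,dW(s) \\ & + \int_0^t k|Z_{x,y}(s)|^{k-2}\Big(Z_{x,y}(s) \cdot \big(b(X(s, x)) - b(X(s, y))\big) \\ & \qquad + \frac{1}{2}\operatorname{trace}\big[\big(\sigma(X(s, x)) - \sigma(X(s, y))\big)\big(\sigma(X(s, x)) - \sigma(X(s, y))\big)^T\big]\Big)\,ds \\ & + \frac{1}{2}\int_0^t k(k - 2)|Z_{x,y}(s)|^{k-4}\big|\big(\sigma(X(s, x)) - \sigma(X(s, y))\big)^T Z_{x,y}(s)\big|^2\,ds. \end{aligned}$$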


Differentiability with respect to initial position

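Assume $d = m = 1$.

$$\begin{aligned} \Delta_{x,y}(t) &\equiv \frac{X(t, x) - X(t, y)}{x - y} \\ &= 1 + \int_0^t \frac{\sigma(X(s, x)) - \sigma(X(s, y))}{X(s, x) - X(s, y)}\Delta_{x,y}(s)\,dW(s) + \int_0^t \frac{b(X(s, x)) - b(X(s, y))}{X(s, x) - X(s, y)}\Delta_{x,y}(s)\,ds \end{aligned}$$

and

$$\partial_x X(t, x) = 1 + \int_0^t \sigma'(X(s, x))\partial_x X(s, x)\,dW(s) + \int_0^t b'(X(s, x))\partial_x X(s, x)\,ds.$$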


13. SDEs and partial differential equations

• Generator for a diffusion process

• One dimensional exit distributions

• Exit times

• Dirichlet problems

• Parabolic equations

• Feynman-Kac formula

• Pricing derivatives


Differential operators and diffusion processes

Consider

$$X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds,$$

where $X$ is $\mathbb{R}^d$-valued, $W$ is an $m$-dimensional standard Brownian motion, $\sigma$ is a $d \times m$ matrix-valued function and $b$ is an $\mathbb{R}^d$-valued function. For a $C^2$ function $f$,

$$f(X(t)) = f(X(0)) + \sum_{i=1}^d \int_0^t \partial_i f(X(s))\,dX_i(s) + \frac{1}{2}\sum_{1 \le i,j \le d} \int_0^t \partial_i\partial_j f(X(s))\,d[X_i, X_j]_s.$$


Computation of covariation

The covariation satisfies

$$[X_i, X_j]_t = \int_0^t \sum_k \sigma_{i,k}(X(s))\sigma_{j,k}(X(s))\,ds = \int_0^t a_{i,j}(X(s))\,ds,$$

where $a = ((a_{i,j})) = \sigma \cdot \sigma^T$, that is, $a_{i,j}(x) = \sum_k \sigma_{ik}(x)\sigma_{jk}(x)$.


Definition of the generator

Let

$$Lf(x) = \sum_{i=1}^d b_i(x)\partial_i f(x) + \frac{1}{2}\sum_{i,j} a_{i,j}(x)\partial_i\partial_j f(x).$$

Then

$$f(X(t)) = f(X(0)) + \int_0^t \nabla f^T(X(s))\sigma(X(s))\,dW(s) + \int_0^t Lf(X(s))\,ds.$$

Since $a = \sigma \cdot \sigma^T$,

$$\sum_{i,j} \xi_i\xi_j a_{i,j} = \xi^T\sigma\sigma^T\xi = |\sigma^T\xi|^2 \ge 0,$$

$a$ is nonnegative definite, and $L$ is an elliptic differential operator.

$L$ is called the generator for the corresponding diffusion process.


The generator for Brownian motion

If

$$X(t) = X(0) + W(t),$$

then $((a_{i,j}(x))) = I$, and $Lf(x) = \frac{1}{2}\Delta f(x)$.


Exit distributions in one dimension

If $d = m = 1$ and $a(x) = \sigma^2(x)$, then

$$Lf(x) = \frac{1}{2}a(x)f''(x) + b(x)f'(x).$$

Suppose $Lf(x) = 0$. Then

$$f(X(t)) = f(X(0)) + \int_0^t f'(X(s))\sigma(X(s))\,dW(s).$$

Fix $\alpha < \beta$, and define $\tau = \inf\{t : X(t) \notin (\alpha, \beta)\}$. If $\sup_{\alpha < x < \beta} |f'(x)\sigma(x)| < \infty$, then

$$f(X(t \wedge \tau)) = f(X(0)) + \int_0^t 1_{[0,\tau)}(s)f'(X(s))\sigma(X(s))\,dW(s)$$

is a martingale, and

$$E[f(X(t \wedge \tau))|X(0) = x] = f(x).$$


Formula for the exit distribution

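Theorem 13.1 Let $f$ satisfy $Lf = 0$. Suppose $\sup_{\alpha < x < \beta} |f'(x)\sigma(x)| < \infty$, $\sup_{\alpha < x < \beta} f(x) < \infty$, and $\tau < \infty$ a.s. Then

$$E[f(X(\tau))|X(0) = x] = f(x) \qquad (13.1)$$

and

$$P(X(\tau) = \beta|X(0) = x) = \frac{f(x) - f(\alpha)}{f(\beta) - f(\alpha)}.$$

Proof. By (13.1),

$$f(\alpha)P(X(\tau) = \alpha|X(0) = x) + f(\beta)P(X(\tau) = \beta|X(0) = x) = f(x),$$

and

$$P(X(\tau) = \beta|X(0) = x) = \frac{f(x) - f(\alpha)}{f(\beta) - f(\alpha)}. \qquad (13.2)$$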


Finiteness of exit time

Let $Lg(x) = 1$. Then

$$g(X(t)) = g(X(0)) + \int_0^t g'(X(s))\sigma(X(s))\,dW(s) + t,$$

and assuming $\sup_{\alpha < x < \beta} |g'(x)\sigma(x)| < \infty$,

$$g(X(t \wedge \tau)) = g(x) + \int_0^{t \wedge \tau} g'(X(s))\sigma(X(s))\,dW(s) + t \wedge \tau$$

is a martingale and hence

$$E[g(X(t \wedge \tau))|X(0) = x] = g(x) + E[t \wedge \tau].$$


Solving the equation

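The equation

$$\frac{1}{2}a(x)f''(x) + b(x)f'(x) = 0$$

gives

$$\frac{d}{dx}\log f'(x) = \frac{f''(x)}{f'(x)} = -\frac{2b(x)}{a(x)}$$

and hence

$$f'(x) = C\exp\Big\{-\int_{x_0}^x \frac{2b(y)}{a(y)}\,dy\Big\}.$$

Similarly,

$$\frac{1}{2}a(x)g''(x) + b(x)g'(x) = 1$$

gives

$$g''(x) + \frac{2b(x)}{a(x)}g'(x) = \frac{2}{a(x)},$$

that is,

$$\frac{d}{dx}\Big(\exp\Big\{\int_{x_0}^x \frac{2b(y)}{a(y)}\,dy\Big\}g'(x)\Big) = \exp\Big\{\int_{x_0}^x \frac{2b(y)}{a(y)}\,dy\Big\}\frac{2}{a(x)}.$$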


Example

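$$X(t) = X(0) + \int_0^t \sigma\sqrt{X(s)}\,dW(s),$$

so $Lf(x) = \frac{1}{2}\sigma^2 x f''(x)$. Note that $Lf(x) = 0$ for $f(x) = x$, so taking $\alpha = 0$, if $\tau < \infty$ a.s.,

$$P\{X(\tau) = 0|X(0) = x\} = \frac{\beta - x}{\beta}.$$

Solving $Lg = 1$,

$$g'(x) = \frac{2}{\sigma^2}\log x + C, \qquad g(x) = \frac{2}{\sigma^2}x\log x$$

for $C = \frac{2}{\sigma^2}$. It follows that

$$E[\tau|X(0) = x] = \frac{2}{\sigma^2}(x\log\beta - x\log x) = \frac{2x}{\sigma^2}\log\frac{\beta}{x}.$$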
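The two formulas in this example can be sanity-checked with finite differences. The sketch below verifies that $g(x) = \frac{2}{\sigma^2}x\log x$ satisfies $Lg = 1$ and that $u(x) = \frac{2x}{\sigma^2}\log\frac{\beta}{x}$ satisfies $Lu = -1$ with $u(\beta) = 0$, which is the boundary value problem the expected exit time solves. The function names, the step size `dx`, and the parameter values are illustrative assumptions.

```python
import math

def generator(f, x, sigma, dx=1e-4):
    """L f(x) = 0.5 * sigma^2 * x * f''(x), with f'' from a central difference."""
    second = (f(x + dx) - 2.0 * f(x) + f(x - dx)) / dx ** 2
    return 0.5 * sigma ** 2 * x * second

def mean_exit_time(x, sigma, beta):
    """E[tau | X(0) = x] = (2x / sigma^2) * log(beta / x) for 0 < x <= beta."""
    return 2.0 * x / sigma ** 2 * math.log(beta / x)

sigma, beta = 0.5, 2.0

def g(x):
    """Particular solution of L g = 1 from the example."""
    return 2.0 / sigma ** 2 * x * math.log(x)

def u(x):
    """Expected exit time as a function of the starting point."""
    return mean_exit_time(x, sigma, beta)

print(generator(g, 1.0, sigma))   # close to 1
print(generator(u, 0.7, sigma))   # close to -1
print(u(beta))                    # 0.0 at the upper boundary
```

The sign flip between $g$ and $u$ reflects the identity $E[g(X(\tau))] = g(x) + E[\tau]$: the exit time is the boundary average of $g$ minus $g$ itself.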


Dirichlet problems

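$$\begin{aligned} Lf(x) &= 0, \quad x \in D, \\ f(x) &= h(x), \quad x \in \partial D, \end{aligned} \qquad (13.3)$$

for $D \subset \mathbb{R}^d$.

Definition 13.3 A function $f$ is Holder continuous with Holder exponent $\delta > 0$ if

$$|f(x) - f(y)| \le L|x - y|^\delta$$

for some $L > 0$.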


Existence of solutions

Theorem 13.4 Suppose $D$ is a bounded, smooth domain, there exists $\varepsilon > 0$ such that

$$\inf_{x \in D} \sum_{i,j} a_{i,j}(x)\xi_i\xi_j \ge \varepsilon|\xi|^2,$$

and $a_{i,j}$, $b_i$, and $h$ are Holder continuous. Then there exists a unique $C^2$ (in the interior of $D$) solution $f$ of the Dirichlet problem (13.3).


Representation of solution of Dirichlet problem

Let

$$X(t, x) = x + \int_0^t \sigma(X(s, x))\,dW(s) + \int_0^t b(X(s, x))\,ds. \qquad (13.4)$$

Define $\tau = \tau(x) = \inf\{t : X(t, x) \notin D\}$. If $f$ is $C^2$ and bounded and satisfies (13.3), then

$$f(x) = E[f(X(t \wedge \tau, x))],$$

and assuming $\tau < \infty$ a.s., $f(x) = E[f(X(\tau, x))]$. By the boundary condition,

$$f(x) = E[h(X(\tau, x))]. \qquad (13.5)$$

Conversely, define $f$ by (13.5), and $f$ will be, at least in some weak sense, a solution of (13.3). Note that if there is a $C^2$, bounded solution $f$ and $\tau < \infty$, $f$ must be given by (13.5), proving uniqueness of $C^2$, bounded solutions.


Harmonic functions

If $\Delta f = 0$ (i.e., $f$ is harmonic) on $\mathbb{R}^d$, and $W$ is standard Brownian motion, then $f(x + W(t))$ is a martingale (at least a local martingale).


Parabolic equations

Suppose $u$ is bounded and satisfies

$$u_t = Lu, \qquad u(0, x) = f(x).$$

By Ito’s formula, for a smooth function $v(t, x)$,

$$v(t, X(t)) = v(0, X(0)) + \text{(local) martingale} + \int_0^t [v_s(s, X(s)) + Lv(s, X(s))]\,ds.$$

For fixed $r > 0$, define $v(t, x) = u(r - t, x)$. Then $\frac{\partial}{\partial t}v(t, x) = -u_1(r - t, x)$, where $u_1(t, x) = \frac{\partial}{\partial t}u(t, x)$. Since $u_1 = Lu$ and $Lv(t, x) = Lu(r - t, x)$, $v(t, X(t))$ is a martingale. Consequently,

$$E[u(r - t, X(t, x))] = u(r, x),$$

and setting $t = r$,

$$u(r, x) = E[u(0, X(r, x))] = E[f(X(r, x))].$$


Equations with a potential

We have that the solution of

$$u_t = Lu, \qquad u(0, x) = f(x)$$

is given by $u(t, x) = E[f(X(t, x))]$. We can also represent the solution of

$$u_t = Lu + \beta u, \qquad u(0, x) = f(x),$$

where $\beta$ is a function of $x$.


Feynman-Kac formula

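Applying Ito’s formula,

$$\begin{aligned} u(t_0 - t, X(t, x))e^{\int_0^t \beta(X(z, x))\,dz} = {} & u(t_0, x) + \int_0^t e^{\int_0^s \beta(X(z, x))\,dz}\nabla_x u(t_0 - s, X(s, x))^T\sigma(X(s, x))\,dW(s) \\ & + \int_0^t e^{\int_0^s \beta(X(z, x))\,dz}\big(\beta(X(s, x))u(t_0 - s, X(s, x)) \\ & \qquad + Lu(t_0 - s, X(s, x)) - u_1(t_0 - s, X(s, x))\big)\,ds \end{aligned}$$

and hence

$$u(t, x) = E\big[f(X(t, x))e^{\int_0^t \beta(X(z, x))\,dz}\big].$$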


Pricing derivatives

Recall that we derived a partial differential equation for pricing contracts using the Black-Scholes model.

We want $q$ to satisfy the partial differential equation

$$q_s(s, x) + r x q_x(s, x) + \frac{1}{2}\sigma^2 x^2 q_{xx}(s, x) = r q(s, x)$$

with terminal condition $q(T, x) = h(x)$. Then the price of the contract should be $X(0) = q(0, S(0))$.

Consequently, we want $q(s, x) = u(T - s, x)$, where

$$u_t = r x u_x + \frac{1}{2}\sigma^2 x^2 u_{xx} - ru, \qquad u(0, x) = h(x).$$


Formula for solution

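$$u(t, x) = E[h(X(t, x))e^{-rt}],$$

where

$$\begin{aligned} X(t, x) &= x + \int_0^t \sigma X(s, x)\,dW(s) + \int_0^t rX(s, x)\,ds \\ &= x\exp\Big\{\sigma W(t) + \Big(r - \frac{1}{2}\sigma^2\Big)t\Big\}. \end{aligned}$$

$$\begin{aligned} q(t, x) &= u(T - t, x) \\ &= E\Big[h\Big(x\exp\Big\{\sigma W(T - t) + \Big(r - \frac{1}{2}\sigma^2\Big)(T - t)\Big\}\Big)e^{-r(T - t)}\Big] \\ &= e^{-r(T - t)}\int_{-\infty}^\infty h\Big(x\exp\Big\{\sigma\sqrt{T - t}\,z + \Big(r - \frac{1}{2}\sigma^2\Big)(T - t)\Big\}\Big)\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\,dz. \end{aligned}$$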


Formula for a European call

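A European call with strike price $K$ and exercise time $T$ is the right to buy a share of the stock at time $T$ at price $K$. Consequently, the payoff at time $T$ is $(S(T) - K)^+$, that is, $h(x) = (x - K)^+$.

Noting that the integrand is zero if

$$x\exp\Big\{\sigma\sqrt{T - t}\,z + \Big(r - \frac{1}{2}\sigma^2\Big)(T - t)\Big\} < K,$$

set $d_-(T - t, x) = \frac{1}{\sigma\sqrt{T - t}}\big[\log\frac{x}{K} + (r - \frac{1}{2}\sigma^2)(T - t)\big]$. Then

$$\begin{aligned} q(t, x) &= e^{-r(T - t)}\int_{-d_-(T - t, x)}^\infty \Big(x\exp\Big\{\sigma\sqrt{T - t}\,z + \Big(r - \frac{1}{2}\sigma^2\Big)(T - t)\Big\} - K\Big)\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}\,dz \\ &= \int_{-d_-(T - t, x)}^\infty x\exp\Big\{-\frac{1}{2}z^2 + \sigma\sqrt{T - t}\,z - \frac{1}{2}\sigma^2(T - t)\Big\}\frac{1}{\sqrt{2\pi}}\,dz - e^{-r(T - t)}K\Phi(d_-(T - t, x)) \\ &= \int_{-d_-(T - t, x)}^\infty x\exp\Big\{-\frac{1}{2}(z - \sigma\sqrt{T - t})^2\Big\}\frac{1}{\sqrt{2\pi}}\,dz - e^{-r(T - t)}K\Phi(d_-(T - t, x)) \\ &= x\Phi(d_-(T - t, x) + \sigma\sqrt{T - t}) - e^{-r(T - t)}K\Phi(d_-(T - t, x)). \end{aligned}$$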
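The closed form can be compared against a direct numerical evaluation of the Gaussian integral for $q(t, x)$ with $h(x) = (x - K)^+$. The sketch below does this using only the standard library; the function names, the variable `tau` standing for $T - t$, the midpoint rule, and the sample parameters are illustrative assumptions rather than anything specified in the notes.

```python
import math

def norm_cdf(z):
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def d_minus(tau, x, strike, r, sigma):
    """d_-(tau, x) with tau = T - t > 0."""
    return (math.log(x / strike) + (r - 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))

def call_price(tau, x, strike, r, sigma):
    """x * Phi(d_- + sigma * sqrt(tau)) - exp(-r * tau) * K * Phi(d_-)."""
    dm = d_minus(tau, x, strike, r, sigma)
    return (x * norm_cdf(dm + sigma * math.sqrt(tau))
            - math.exp(-r * tau) * strike * norm_cdf(dm))

def call_price_by_quadrature(tau, x, strike, r, sigma, n=20_000, z_max=10.0):
    """Midpoint-rule evaluation of the discounted Gaussian integral of the payoff."""
    h = 2.0 * z_max / n
    total = 0.0
    for i in range(n):
        z = -z_max + (i + 0.5) * h
        s_T = x * math.exp(sigma * math.sqrt(tau) * z + (r - 0.5 * sigma ** 2) * tau)
        total += max(s_T - strike, 0.0) * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi) * h
    return math.exp(-r * tau) * total

closed = call_price(1.0, 100.0, 100.0, 0.05, 0.2)
numeric = call_price_by_quadrature(1.0, 100.0, 100.0, 0.05, 0.2)
print(closed, numeric)
```

Agreement between the two numbers is a check on the algebra of completing the square in the exponent, not an independent validation of the model.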



External sphere condition

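Definition 13.5 An open domain $D$ satisfies the external sphere condition at $y \in \partial D$ if there exists a closed ball $B_\delta(x_0) = \{z : |z - x_0| \le \delta\}$ such that $D \cap B_\delta(x_0) = \emptyset$ and $\bar{D} \cap B_\delta(x_0) = \{y\}$.

Define

$$w_y(x) = k(|y - x_0|^{-p} - |x - x_0|^{-p}),$$

and note that $w_y(x) > 0$ for $x \in D$. Then

$$Lw_y(x) = kp|x - x_0|^{-(p+2)}\Big(-\frac{p + 2}{2}\frac{(x - x_0)^T a(x)(x - x_0)}{|x - x_0|^2} + \frac{1}{2}\operatorname{tr}\,a(x) + b(x) \cdot (x - x_0)\Big).$$

For $k$ and $p$ large enough, $Lw_y(x) \le -1$, $x \in B_{2\delta}(x_0) \cap D$.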


Regularity of boundary

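Let $\tau_x = \inf\{t : X(t, x) \notin D \cap B_{2\delta}(x_0)\}$. Then

$$E[w_y(X(\tau_x, x))] = w_y(x) + E\Big[\int_0^{\tau_x} Lw_y(X(s, x))\,ds\Big]$$

and

$$w_y(x) \ge E[w_y(X(\tau_x, x))] + E[\tau_x].$$

Consequently, for $\varepsilon > 0$,

$$\lim_{x \to y} E[\tau_x] = 0, \qquad \lim_{x \to y} P\{|X(\tau_x, x) - y| \ge \varepsilon\} = 0.$$

Definition 13.6 $y \in \partial D$ is regular for $X$ if $\tau_y = 0$ a.s.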


Realization of boundary condition

Lemma 13.7 Suppose $h$ is continuous on $\partial D$ and for each $y \in \partial D$, $D$ satisfies the external sphere condition at $y$. Then for each $y \in \partial D$,

$$\lim_{x \in D,\, x \to y} E[h(X(\tau_x, x))] = h(y).$$


14. Change of measure and asset pricing

• Change of measure

• Bayes theorem

• Radon-Nikodym theorem

• No arbitrage and pricing measures

• Pricing measure determined by a single stock

• Change of measure and filtrations

• Martingales under a change of measure

• Girsanov theorem


Change of measure

Lemma 14.1 Let $(\Omega, \mathcal{F}, Q)$ be a probability space and let $L \ge 0$ be a random variable with $E^Q[L] = 1$. Define $P(A) = E^Q[1_A L]$, $A \in \mathcal{F}$. Then $P$ is a probability measure.

Proof. Clearly, $P(A) \ge 0$ and $P(\Omega) = E^Q[L] = 1$. If $A_k \in \mathcal{F}$, $k = 1, 2, \ldots$, are disjoint, then

$$\sum_{k=1}^\infty P(A_k) = \lim_{n \to \infty} \sum_{k=1}^n E^Q[1_{A_k}L] = \lim_{n \to \infty} E^Q\Big[\sum_{k=1}^n 1_{A_k}L\Big] = E^Q[1_{\cup_k A_k}L].$$

Note that if $Q(A) = 1$, then $P(A) = 1$.
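On a finite sample space the construction in Lemma 14.1 is a weighted sum, which makes the probability-measure properties easy to verify exactly. The following is a minimal sketch using a fair die as the reference measure; the sample space, the particular density `L`, and the helper name `P` are illustrative assumptions.

```python
from fractions import Fraction

# Finite sample space with reference measure Q (a fair six-sided die).
omega = [1, 2, 3, 4, 5, 6]
Q = {w: Fraction(1, 6) for w in omega}

# A nonnegative density with E^Q[L] = 1; it vanishes on the outcome 1.
L = {w: Fraction(2 * (w - 1), 5) for w in omega}

def P(event):
    """P(A) = E^Q[1_A L] for an event given as a collection of outcomes."""
    return sum(Q[w] * L[w] for w in event)

print(P(omega))                               # 1
print(P({2, 3}) + P({4}) == P({2, 3, 4}))     # True: additivity on disjoint events
print(P({1}), Q[1])                           # 0 and 1/6
```

Because this `L` vanishes on an outcome of positive `Q`-probability, `P` is absolutely continuous with respect to `Q` but the two measures are not equivalent.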


Example

For general random variables, suppose $X$ and $Y$ are independent on $(\Omega, \mathcal{F}, Q)$. Let $L = H(X, Y) \ge 0$, and $E[H(X, Y)] = 1$. Define

$$\nu_Y(\Gamma) = Q\{Y \in \Gamma\}, \qquad dP = H(X, Y)\,dQ.$$

Bayes formula becomes

$$E^P[g(Y)|X] = \frac{E^Q[g(Y)H(X, Y)|X]}{E^Q[H(X, Y)|X]} = \frac{\int g(y)H(X, y)\,\nu_Y(dy)}{\int H(X, y)\,\nu_Y(dy)}.$$


Radon-Nikodym theorem

Definition 14.3 Let $P$ and $Q$ be probability measures on $(\Omega, \mathcal{D})$. Then $P$ is absolutely continuous with respect to $Q$ ($P \ll Q$) if and only if $Q(A) = 0$, $A \in \mathcal{D}$, implies $P(A) = 0$.

Theorem 14.4 If $P \ll Q$ on $\mathcal{D}$, then there exists a $\mathcal{D}$-measurable random variable $L \ge 0$ such that

$$P(A) = E^Q[1_A L] = \int_A L\,dQ, \qquad A \in \mathcal{D}.$$

Consequently, a $\mathcal{D}$-measurable random variable $Z$ is $P$-integrable if and only if $ZL$ is $Q$-integrable, and

$$E^P[Z] = E^Q[ZL].$$

Standard notation: $\frac{dP}{dQ} = L$.


Equivalent measures

Definition 14.5 Two measures $P$ and $Q$ are equivalent if $P \ll Q$ and $Q \ll P$.

Lemma 14.6 If $P$ and $Q$ are probability measures on $\mathcal{D}$ with $P \ll Q$ and $dP = L\,dQ$, then $P$ and $Q$ are equivalent if and only if $Q\{L > 0\} = 1$.


Model of a market

Consider financial activity over a time interval $[0, T]$ modeled by a probability space $(\Omega, \mathcal{F}, P)$.

Assume that there is a “fair casino” or market which is complete in the sense that at time 0, for each event $A \in \mathcal{F}$, a price $Q(A) \ge 0$ is fixed for a bet or a contract that pays one dollar at time $T$ if and only if $A$ occurs.

Assume that the market is frictionless in that an investor can either buy or sell the contract at the same price and that it is liquid in that there is always a buyer or seller available. Also assume that $Q(\Omega) < \infty$.

An investor can construct a portfolio by buying or selling a variety of contracts (possibly countably many) in arbitrary multiples.


Price and payoff of a portfolio

If $a_i$ is the “quantity” of a contract for $A_i$ ($a_i < 0$ corresponds to selling the contract), then the payoff at time $T$ is

$$\sum_i a_i 1_{A_i}.$$

Require $\sum_i |a_i|Q(A_i) < \infty$ (only a finite amount of money changes hands) so that the initial cost of the portfolio is (unambiguously)

$$\sum_i a_i Q(A_i).$$

The market has no arbitrage if no combination (buying and selling) of countably many policies with a net cost of zero results in a positive profit at no risk.


No arbitrage condition

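If

$$\sum_i |a_i|Q(A_i) < \infty, \qquad \sum_i a_i Q(A_i) = 0, \qquad \text{and} \qquad \sum_i a_i 1_{A_i} \ge 0 \ \text{a.s.},$$

then

$$\sum_i a_i 1_{A_i} = 0 \ \text{a.s.}$$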


Consequences of the no arbitrage condition

Lemma 14.7 Assume that there is no arbitrage. If $P(A) = 0$, then $Q(A) = 0$. If $Q(A) = 0$, then $P(A) = 0$.

Proof. Suppose $P(A) = 0$ and $Q(A) > 0$. Buy one unit of $\Omega$ and sell $Q(\Omega)/Q(A)$ units of $A$.

$$\text{Cost} = Q(\Omega) - \frac{Q(\Omega)}{Q(A)}Q(A) = 0$$

$$\text{Payoff} = 1 - \frac{Q(\Omega)}{Q(A)}1_A = 1 \ \text{a.s.},$$

which contradicts the no arbitrage assumption.

Now suppose $Q(A) = 0$. Buy one unit of $A$. The cost of the portfolio is $Q(A) = 0$ and the payoff is $1_A \ge 0$. So by the no arbitrage assumption, $1_A = 0$ a.s., that is, $P(A) = 0$.


Price monotonicity

Lemma 14.8 If there is no arbitrage and $A \subset B$, then $Q(A) \le Q(B)$, with strict inequality if $P(A) < P(B)$.

Proof. Suppose $P(B) > 0$ (otherwise $Q(A) = Q(B) = 0$) and $Q(B) \le Q(A)$. Buy one unit of $B$ and sell $Q(B)/Q(A)$ units of $A$.

$$\text{Cost} = Q(B) - \frac{Q(B)}{Q(A)}Q(A) = 0$$

$$\text{Payoff} = 1_B - \frac{Q(B)}{Q(A)}1_A = 1_{B - A} + \Big(1 - \frac{Q(B)}{Q(A)}\Big)1_A \ge 0.$$

Payoff $= 0$ a.s. implies $Q(B) = Q(A)$ and $P(B - A) = 0$.


Q must be a measure

Theorem 14.9 If there is no arbitrage, $Q$ must be a measure on $\mathcal{F}$.

Proof. Let $A_1, A_2, \ldots$ be disjoint and $A = \cup_{i=1}^\infty A_i$. Assume $P(A_i) > 0$ for some $i$. (Otherwise, $Q(A) = Q(A_i) = 0$.)

Let $\rho \equiv \sum_i Q(A_i)$, and buy one unit of $A$ and sell $Q(A)/\rho$ units of $A_i$ for each $i$.

$$\text{Cost} = Q(A) - \frac{Q(A)}{\rho}\sum_i Q(A_i) = 0$$

$$\text{Payoff} = 1_A - \frac{Q(A)}{\rho}\sum_i 1_{A_i} = \Big(1 - \frac{Q(A)}{\rho}\Big)1_A.$$

If $Q(A) \le \rho$, then $Q(A) = \rho$.

If $Q(A) \ge \rho$, sell one unit of $A$ and buy $Q(A)/\rho$ units of $A_i$.


Equivalence of measures

Theorem 14.10 If there is no arbitrage, $Q \ll P$ and $P \ll Q$. ($P$ and $Q$ are equivalent measures.)

Proof. The result follows from Lemma 14.7.


Pricing general payoffs

If $X$ and $Y$ are random variables satisfying $X \le Y$ a.s., then no arbitrage should mean

$$Q(X) \le Q(Y).$$

It follows that for any $Q$-integrable $X$, the price of $X$ is

$$Q(X) = \int X\,dQ.$$

By the Radon-Nikodym theorem, $dQ = L\,dP$ for some nonnegative, integrable random variable $L$, and

$$Q(X) = E^P[XL].$$


Assets that can be traded at intermediate times

$\mathcal{F}_t$ represents the information available at time $t$.

$B(t)$ is the price at time $t$ of a bond that is worth \$1 at time $T$ (e.g. $B(t) = e^{-r(T - t)}$), that is, at any time $0 \le t \le T$, $B(t)$ is the price of a contract that pays exactly \$1 at time $T$.

Note that $B(0) = Q(\Omega)$.

Define $\hat{Q}(A) = Q(A)/B(0)$, so that $\hat{Q}$ is a probability measure.


Martingale properties of tradable assets

Let $S(t)$ be the price at time $t$ of another tradable asset, that is, $S(t)$ is the buying or selling price at time $t$ of an asset that will be worth $S(T)$ at time $T$. $S$ must be $\{\mathcal{F}_t\}$-adapted.

For any stopping time $\tau \le T$, we can buy one unit of the asset at time 0, sell the asset at time $\tau$ and use the money received ($S(\tau)$) to buy $S(\tau)/B(\tau)$ units of the bond. Since the payoff for this strategy is $S(\tau)/B(\tau)$ (the value of the bonds at time $T$), we must have

$$S(0) = \int \frac{S(\tau)}{B(\tau)}\,dQ = E^{\hat{Q}}\Big[\frac{B(0)S(\tau)}{B(\tau)}\Big].$$

Theorem 14.11 If $S$ is the price of a tradable asset, then $S/B$ is a martingale on $(\Omega, \mathcal{F}, \hat{Q})$.


Characterizing martingales by stopping

Theorem 14.12 Let $M$ be an $\{\mathcal{F}_t\}$-adapted process. If $E[M(\tau)] = E[M(0)]$ for every bounded $\{\mathcal{F}_t\}$-stopping time $\tau$, then $M$ is a martingale.

Proof. Let $t < s$ and $A \in \mathcal{F}_t$. Define $\tau = t$ on $A$ and $\tau = s$ on $A^c$. Then $\tau$ is a stopping time and

$$E[M(s)] = E[M(0)] = E[M(\tau)] = E[1_A M(t)] + E[1_{A^c}M(s)].$$

Consequently, $E[1_A M(s)] = E[1_A M(t)]$, $A \in \mathcal{F}_t$, and hence

$$E[M(s)|\mathcal{F}_t] = M(t).$$


Pricing tradable assets in a market with a money-market account

Instead of a bond, suppose that there is an account that pays interest at time $t$ at rate $R(t)$. Then \$1 invested in the account at time zero is worth $L(T) = e^{\int_0^T R(s)\,ds}$ at time $T$. Consequently, if $Q$ is an arbitrage-free pricing measure,

$$1 = \int e^{\int_0^T R(s)\,ds}\,dQ.$$

Define $d\tilde{P} = L(T)\,dQ$, and note that $\tilde{P}$ is a probability measure.

If $S$ is another tradable asset, we must have

$$S(0) = \int S(\tau)e^{\int_\tau^T R(s)\,ds}\,dQ = E^{\tilde{P}}\big[S(\tau)e^{-\int_0^\tau R(s)\,ds}\big],$$

so the discounted asset value $S(t)e^{-\int_0^t R(s)\,ds}$ is a martingale.


Pricing general contracts

Let $V(T)$ be an $\mathcal{F}_T$-measurable random variable, and suppose that $V(T)$ is the value of a contract at time $T$. If the contract is tradable at intermediate times, its discounted price $V(t)e^{-\int_0^t R(s)\,ds}$ must be a martingale under $\tilde{P}$, and hence

$$V(t) = E^{\tilde{P}}\big[V(T)e^{-\int_t^T R(s)\,ds}\big|\mathcal{F}_t\big]. \qquad (14.2)$$

Taking $V(T) \equiv 1$, the price of a bond must be

$$B(t) = E^{\tilde{P}}\big[e^{-\int_t^T R(s)\,ds}\big|\mathcal{F}_t\big].$$


Are the two approaches to pricing consistent?

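Consider

$$\begin{aligned} \int \frac{S(\tau)}{B(\tau)}\,dQ &= E^{\tilde{P}}\Big[\frac{S(\tau)}{B(\tau)}e^{-\int_0^T R(s)\,ds}\Big] \\ &= E^{\tilde{P}}\Big[\frac{S(\tau)}{B(\tau)}E^{\tilde{P}}\big[e^{-\int_0^T R(s)\,ds}\big|\mathcal{F}_\tau\big]\Big] \\ &= E^{\tilde{P}}\big[S(\tau)e^{-\int_0^\tau R(s)\,ds}\big] \\ &= S(0). \end{aligned}$$

Note that

$$d\tilde{P} = e^{\int_0^T R(s)\,ds}\,dQ, \qquad d\hat{Q} = B(0)^{-1}\,dQ$$

are both probability measures. Discounted prices are martingales under $\tilde{P}$ while prices normalized by the bond are martingales under $\hat{Q}$. If $R$ is deterministic, $\tilde{P} = \hat{Q}$.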


A market with one stock and a money market

Suppose we start with a market consisting of one stock with price $S$ and a money market with interest rate $R$, where

$$S(t) = S(0) + \int_0^t \alpha(s)S(s)\,ds + \int_0^t \sigma(s)S(s)\,dW(s)$$

and $\alpha$, $\sigma$, and $R$ are adapted to $\{\mathcal{F}^W_t\}$. Suppose that we are free to trade using the information given by $\{\mathcal{F}^W_t\}$, so that a portfolio worth $X(0)$ at time zero pursuing an adapted trading strategy $\Delta$ is worth

$$\begin{aligned} X(T) &= X(0) + \int_0^T \Delta(t)\,dS(t) + \int_0^T R(t)\big(X(t) - \Delta(t)S(t)\big)\,dt \\ &= X(0) + \int_0^T \Delta(t)\sigma(t)S(t)\,dW(t) + \int_0^T \big(R(t)X(t) + \Delta(t)(\alpha(t) - R(t))S(t)\big)\,dt. \end{aligned}$$

Constraints on the pricing measure

Define $\Theta(t) = \frac{\alpha(t) - R(t)}{\sigma(t)}$. Then

$$X(T) = X(0) + \int_0^T \Delta(t)\sigma(t)S(t)\,dW(t) + \int_0^T \Delta(t)\sigma(t)\Theta(t)S(t)\,dt + \int_0^T R(t)X(t)\,dt.$$

If the market is complete and there is no arbitrage, any pricing measure must satisfy $\int X(T)\,dQ = X(0)$, and under the corresponding $\tilde{P}$, $d\tilde{P} = e^{\int_0^T R(s)\,ds}\,dQ$, $X(t)e^{-\int_0^t R(s)\,ds}$ must be a martingale. Setting $D(t) = e^{-\int_0^t R(s)\,ds}$,

$$X(t)D(t) = X(0) + \int_0^t \Delta(s)\sigma(s)D(s)S(s)\big(dW(s) + \Theta(s)\,ds\big),$$

the integral must be a martingale under $\tilde{P}$.


Change of measure and filtrations

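Lemma 14.13 Suppose $\tilde{P}|_{\mathcal{F}_t} \ll P|_{\mathcal{F}_t}$ and let $Z(t)$ denote the Radon-Nikodym derivative. Then $Z$ is an $\{\mathcal{F}_t\}$-martingale under $P$.

Proof. Suppose $A \in \mathcal{F}_t$. Then, since $A \in \mathcal{F}_{t+s}$,

$$E^P[Z(t + s)1_A] = \tilde{P}(A) = E^P[Z(t)1_A]$$

and hence $Z(t) = E^P[Z(t + s)|\mathcal{F}_t]$.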


Martingales under a change of measure

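Lemma 14.14 Let $Z$ be as in Lemma 14.13. $Y$ is an $\{\mathcal{F}_t\}$-martingale under $\tilde{P}$ if and only if $YZ$ is an $\{\mathcal{F}_t\}$-martingale under $P$.

Proof.

$$\begin{aligned} E^{\tilde{P}}[(Y(t + s) - Y(t))|\mathcal{F}_t] &= \frac{E^P[Z(t + s)(Y(t + s) - Y(t))|\mathcal{F}_t]}{E^P[Z(t + s)|\mathcal{F}_t]} \\ &= \frac{E^P[Z(t + s)Y(t + s) - Z(t)Y(t)|\mathcal{F}_t]}{E^P[Z(t + s)|\mathcal{F}_t]}, \end{aligned}$$

so the left side is zero if and only if the numerator on the right is zero.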


Transformation of martingales

Lemma 14.15 Let $Z$ be as in Lemma 14.13. Suppose that $M$ is an $\{\mathcal{F}_t\}$-local martingale under $P$ and $[Z, M]_t = \int_0^t V(s)\,ds$. Then

$$Y(t) = M(t) - \int_0^t \frac{V(s)}{Z(s)}\,ds$$

is an $\{\mathcal{F}_t\}$-local martingale under $\tilde{P}$.

Proof.

$$\begin{aligned} Z(t)Y(t) &= Z(0)Y(0) + \int_0^t Y(s)\,dZ(s) + \int_0^t Z(s)\,dY(s) + [Z, Y]_t \\ &= Z(0)Y(0) + \int_0^t Y(s)\,dZ(s) + \int_0^t Z(s)\,dM(s). \end{aligned}$$

Changing the distribution of a Brownian motion

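Lemma 14.16 Suppose that $W$ is an $\{\mathcal{F}_t\}$-Brownian motion under $P$ and that $Z$ satisfying

$$Z(t) = 1 - \int_0^t \Theta(s)Z(s)\,dW(s)$$

is an $\{\mathcal{F}_t\}$-martingale under $P$. If $d\tilde{P}|_{\mathcal{F}_t} = Z(t)\,dP|_{\mathcal{F}_t}$, $t \ge 0$, then

$$\tilde{W}(t) = W(t) + \int_0^t \Theta(s)\,ds$$

is an $\{\mathcal{F}_t\}$-Brownian motion under $\tilde{P}$.

Proof. Since $[W, Z]_t = -\int_0^t \Theta(s)Z(s)\,ds$, $\tilde{W}$ is a local martingale (in fact, a martingale) under $\tilde{P}$ with quadratic variation $[\tilde{W}]_t = t$ and hence is a Brownian motion.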
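For a constant $\Theta(s) \equiv \theta$, the density is explicit, $Z(T) = \exp\{-\theta W(T) - \frac{1}{2}\theta^2 T\}$, and the claim that $W(T) + \theta T$ is $N(0, T)$ under the new measure can be checked by one-dimensional quadrature. The sketch below is a numerical check of that special case only, under the assumption of constant $\theta$; the function names and the midpoint rule are illustrative choices.

```python
import math

def gaussian_expectation(func, variance, n=40_000, width=12.0):
    """E[func(N(0, variance))] by the midpoint rule on [-width*sd, width*sd]."""
    sd = math.sqrt(variance)
    h = 2.0 * width * sd / n
    total = 0.0
    for i in range(n):
        w = -width * sd + (i + 0.5) * h
        density = math.exp(-0.5 * w * w / variance) / math.sqrt(2.0 * math.pi * variance)
        total += func(w) * density * h
    return total

def tilted_expectation(g, theta, T):
    """E^P[Z(T) g(W(T) + theta*T)] with Z(T) = exp(-theta*W(T) - theta^2*T/2)."""
    def integrand(w):
        return math.exp(-theta * w - 0.5 * theta ** 2 * T) * g(w + theta * T)
    return gaussian_expectation(integrand, T)

theta, T = 0.7, 2.0
total_mass = tilted_expectation(lambda u: 1.0, theta, T)      # should be 1
mean = tilted_expectation(lambda u: u, theta, T)              # should be 0
second_moment = tilted_expectation(lambda u: u * u, theta, T) # should be T
print(total_mass, mean, second_moment)
```

The check works because the exponent $-w^2/(2T) - \theta w - \theta^2 T/2$ is exactly $-(w + \theta T)^2/(2T)$, so the reweighting recentres the Gaussian density.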


Choice of $\tilde{P}$

Recall that we want

$$X(t)D(t) = X(0) + \int_0^t \Delta(s)\sigma(s)D(s)S(s)\big(dW(s) + \Theta(s)\,ds\big)$$

to be a martingale under $\tilde{P}$. Taking $\tilde{P}$ given by Lemma 14.16, the required condition holds, at least for $\Delta$ satisfying

$$E^{\tilde{P}}\Big[\int_0^t (\Delta(s)\sigma(s)D(s)S(s))^2\,ds\Big] < \infty.$$


Risk-neutral measure

Recall that $\Theta(t) = \frac{\alpha(t) - R(t)}{\sigma(t)}$ and

$$\begin{aligned} D(t)S(t) &= S(0) + \int_0^t \sigma(s)D(s)S(s)\,dW(s) + \int_0^t \sigma(s)\Theta(s)D(s)S(s)\,ds \\ &= S(0) + \int_0^t \sigma(s)D(s)S(s)\,d\tilde{W}(s). \end{aligned}$$

$\Theta(t)$ is called the market price of risk. Under $\tilde{P}$, the market price of risk is zero and hence the model is “risk neutral.”


Black-Scholes formula

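Assuming that $R(t) \equiv r$ and

$$S(t) = S(0) + \int_0^t \sigma S(s)\,dW(s) + \int_0^t \alpha S(s)\,ds,$$

we previously showed that a contract with payoff $h(S(T))$ should have a price at time $t$ of the form $q(t, S(t))$, where $q$ satisfies

$$q_t(t, x) + r x q_x(t, x) + \frac{1}{2}\sigma^2 x^2 q_{xx}(t, x) = r q(t, x)$$

with terminal condition $q(T, x) = h(x)$. We then determined that

$$q(t, x) = E\Big[h\Big(x\exp\Big\{\sigma W(T - t) + \Big(r - \frac{1}{2}\sigma^2\Big)(T - t)\Big\}\Big)e^{-r(T - t)}\Big].$$

By (14.2),

$$V(t) = E^{\tilde{P}}[h(S(T))e^{-r(T - t)}|\mathcal{F}_t],$$

which is $q(t, S(t))$.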


15. Martingale representation and the Fundamental Theorems of asset pricing

• Completeness

• Martingale representation theorems

• Hedging

• Multidimensional Girsanov formula

• Martingale representation for multidimensional Brownian motion

• Markets with multiple stocks

• First Fundamental Theorem

• Second Fundamental Theorem


Completeness

We defined completeness to say, essentially, that any bet could be made. In a market with one stock and a money market, the “bets” that can be made are of the form

$$\begin{aligned} X(T) &= X(0) + \int_0^T \Delta(t)\,dS(t) + \int_0^T R(t)\big(X(t) - \Delta(t)S(t)\big)\,dt \\ &= X(0) + \int_0^T \Delta(t)\sigma(t)S(t)\,dW(t) + \int_0^T \big(R(t)X(t) + \Delta(t)(\alpha(t) - R(t))S(t)\big)\,dt. \end{aligned}$$

Assuming that $\mathcal{F}_t = \mathcal{F}^W_t$ and $\inf \sigma(t) > 0$ (in particular, that $R$ is $\{\mathcal{F}^W_t\}$-adapted), this market is, in fact, complete.


Martingale representation theorems

Definition 15.1 Let $\{\mathcal{F}_t\}$ be a filtration, and let $\mathcal{M}^2_T$ be the space of all square integrable $\{\mathcal{F}_t\}$-martingales indexed by $[0,T]$. For $M \in \mathcal{M}^2_T$, define $\|M\| = \sqrt{E[M(T)^2]}$.

Lemma 15.2 If $E[|M_n(T) - Y|^2] \to 0$, then by Doob's inequality

\[ E\big[\sup_{t \le T} |M_n(t) - E[Y \mid \mathcal{F}_t]|^2\big] \le 4E[|M_n(T) - Y|^2] \to 0. \tag{15.1} \]

Note that we can identify $\mathcal{M}^2_T$ with $L^2(\Omega, \mathcal{F}_T, P)$, the space of square integrable, $\mathcal{F}_T$-measurable random variables.


Subspace of stochastic integrals

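Let $W$ be a standard Brownian motion and $\mathcal{F}_t = \mathcal{F}^W_t$, and let $H_T$ be the collection of $\{\mathcal{F}^W_t\}$-adapted processes $\phi$ on $[0,T]$ satisfying $E[\int_0^T \phi(s)^2\,ds] < \infty$. Define

\[ I^2_T = \Big\{Y : Y = z_0 + \int_0^T \phi(s)\,dW(s),\ z_0 \in \mathbb{R},\ \phi \in H_T\Big\}. \]

Lemma 15.3 $I^2_T$ is a closed subspace of $L^2(\Omega, \mathcal{F}_T, P)$, that is, if $\{Y_n\} \subset I^2_T$ and $E[|Y_n - Y|^2] \to 0$, then $Y \in I^2_T$.

Proof. Since

\[ E[|Y_n - Y_m|^2] = |z^n_0 - z^m_0|^2 + E\Big[\int_0^T |\phi_n(s) - \phi_m(s)|^2\,ds\Big] \to 0, \]

there exist $z_0$ and $\phi$ such that $z^n_0 \to z_0$ and $E[\int_0^T |\phi_n(s) - \phi(s)|^2\,ds] \to 0$. By the Ito isometry, $Y_n \to z_0 + \int_0^T \phi(s)\,dW(s)$ in $L^2$, so $Y = z_0 + \int_0^T \phi(s)\,dW(s) \in I^2_T$.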


Approximation of random variables in L2

Lemma 15.4 For each $Y \in L^2(\Omega, \mathcal{F}^W_T, P)$ there exist a sequence $h_n \in C^\infty_b(\mathbb{R}^n)$ and times $0 \le t^n_1 < \cdots < t^n_n \le T$ such that

\[ E[|Y - h_n(W(t^n_1), \ldots, W(t^n_n))|^2] \to 0. \]

It follows that if we can show the existence of $\phi_n$ and $z^n_0$ such that

\[ h_n(W(t^n_1), \ldots, W(t^n_n)) = z^n_0 + \int_0^{t^n_n} \phi_n(s)\,dW(s), \]

then $I^2_T = L^2(\Omega, \mathcal{F}^W_T, P)$.


Stochastic integral representation

Let $h \in C^\infty_b(\mathbb{R})$, and for $0 \le t \le t_1$, define

\[ v(t,x) = \int h(y)\, \frac{1}{\sqrt{2\pi(t_1 - t)}}\, e^{-\frac{(y-x)^2}{2(t_1 - t)}}\,dy \]

and recall that $v_t(t,x) + \frac{1}{2}v_{xx}(t,x) = 0$ and $v(t_1, x) = h(x)$. Consequently, by Ito's formula,

\[ v(t, W(t)) = v(0,0) + \int_0^t v_x(s, W(s))\,dW(s), \quad 0 \le t \le t_1, \]

and hence

\[ h(W(t_1)) = v(0,0) + \int_0^{t_1} v_x(s, W(s))\,dW(s). \]
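A concrete instance may help. Taking $h(y) = \cos y$ (an illustrative choice, not from the notes) gives $v(t,x) = e^{-(t_1 - t)/2}\cos x$, so $v(0,0) = e^{-t_1/2}$ and $v_x(t,x) = -e^{-(t_1-t)/2}\sin x$. The sketch below simulates one Brownian path on a fine grid, approximates the stochastic integral by a left-endpoint (Ito) sum, and returns the difference between the two sides of the representation, which should be small up to discretization error. The function name and grid size are hypothetical.

```python
import math
import random


def representation_error(t1=1.0, n_steps=10_000, seed=1):
    """Return h(W(t1)) - (v(0,0) + sum of v_x(s_k, W(s_k)) * dW_k) for h = cos."""
    rng = random.Random(seed)
    dt = t1 / n_steps
    w = 0.0
    integral = 0.0
    for k in range(n_steps):
        s = k * dt
        # Integrand evaluated at the left endpoint, as the Ito integral requires.
        v_x = -math.exp(-0.5 * (t1 - s)) * math.sin(w)
        dw = rng.gauss(0.0, math.sqrt(dt))
        integral += v_x * dw
        w += dw
    lhs = math.cos(w)                          # h(W(t1))
    rhs = math.exp(-0.5 * t1) + integral       # v(0,0) + stochastic integral
    return lhs - rhs


if __name__ == "__main__":
    print(representation_error())
```

Refining the grid (larger `n_steps`) should shrink the error, since the Ito sum converges to the integral in $L^2$.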


Induction step

For $h \in C^\infty_b(\mathbb{R}^n)$, define

\[ v(t, x_1, \ldots, x_n) = \int h(x_1, \ldots, x_{n-1}, y)\, \frac{1}{\sqrt{2\pi(t_n - t)}}\, e^{-\frac{(y - x_n)^2}{2(t_n - t)}}\,dy. \]

Then, by Ito's formula,

\[ h(x_1, \ldots, x_{n-1}, W(t_n)) = v(t_{n-1}, x_1, \ldots, x_{n-1}, W(t_{n-1})) + \int_{t_{n-1}}^{t_n} v_{x_n}(s, x_1, \ldots, x_{n-1}, W(s))\,dW(s) \]

and

\[ h(W(t_1), \ldots, W(t_n)) = v(t_{n-1}, W(t_1), \ldots, W(t_{n-1}), W(t_{n-1})) + \int_{t_{n-1}}^{t_n} v_{x_n}(s, W(t_1), \ldots, W(t_{n-1}), W(s))\,dW(s). \]


Representation theorem

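Theorem 15.5 For each $Y \in L^2(\Omega, \mathcal{F}^W_T, P)$, there exist $z_0 \in \mathbb{R}$ and $\phi \in H_T$ such that

\[ Y = z_0 + \int_0^T \phi(s)\,dW(s), \]

and hence for each $M \in \mathcal{M}^2_T$, there exist $z_0 \in \mathbb{R}$ and $\phi \in H_T$ such that

\[ M(t) = z_0 + \int_0^t \phi(s)\,dW(s). \]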


Hedging

Let $Y$ be $\mathcal{F}^W_T$-measurable and bounded by a constant. We want $Y = X(T)$ for $X$ given by

\[ X(t) = X(0) + \int_0^t \Delta(s)\sigma(s)S(s)\,dW(s) + \int_0^t \big(R(s)X(s) + \Delta(s)(\alpha(s) - R(s))S(s)\big)\,ds. \]

Since $XD$ must be a martingale under $\tilde{P}$, by Lemma 14.14, $XDZ$ must be a martingale under $P$. Consequently, we want

\[ X(t)D(t)Z(t) = E^P[Y D(T) Z(T) \mid \mathcal{F}^W_t]. \tag{15.2} \]


Application of the martingale representation theorem

By Theorem 15.5, there exist $z_0$ and $\phi$ satisfying

\[ E[Y D(T) Z(T) \mid \mathcal{F}^W_t] = z_0 + \int_0^t \phi(s)\,dW(s), \]

and by Ito's formula

\[ X(t)D(t)Z(t) = X(0) + \int_0^t \big(\Delta(s)\sigma(s)S(s) - \Theta(s)X(s)\big) D(s)Z(s)\,dW(s), \]

so take $X(0) = z_0$ and

\[ \Delta(s) = \frac{\phi(s) + \Theta(s)X(s)D(s)Z(s)}{\sigma(s)D(s)Z(s)S(s)}. \]


Multidimensional Girsanov formula

Theorem 15.6 Let $W$ be a $d$-dimensional standard Brownian motion and $\Theta$ be $\mathbb{R}^d$-valued and adapted. Let $Z$ satisfy

\[ Z(t) = 1 - \int_0^t Z(s)\Theta(s) \cdot dW(s), \]

so that $Z(t) = \exp\{-\int_0^t \Theta(s) \cdot dW(s) - \frac{1}{2}\int_0^t |\Theta(s)|^2\,ds\}$. Suppose $E[\int_0^T |\Theta(s)|^2 Z(s)^2\,ds] < \infty$, so that $Z$ is a martingale, and suppose that $d\tilde{P}|_{\mathcal{F}^W_T} = Z(T)\,dP|_{\mathcal{F}^W_T}$. Then $\tilde{W}$ defined by

\[ \tilde{W}(t) = W(t) + \int_0^t \Theta(s)\,ds, \quad 0 \le t \le T, \tag{15.3} \]

is a $d$-dimensional standard Brownian motion on $[0,T]$ under $\tilde{P}$.

Proof. Observing that $[W_i, Z]_t = -\int_0^t \Theta_i(s)Z(s)\,ds$ and applying Lemma 14.14, the proof is the same as in one dimension.


Martingale representation for multidimensional Brownian motion

Let $H_T(\mathbb{R}^d)$ be the collection of $\mathbb{R}^d$-valued, adapted and appropriately measurable processes $\phi$ satisfying $E[\int_0^T |\phi(s)|^2\,ds] < \infty$.

Theorem 15.7 Let $W$ be a $d$-dimensional standard Brownian motion. For each $Y \in L^2(\Omega, \mathcal{F}^W_T, P)$, there exist $z_0 \in \mathbb{R}$ and $\phi \in H_T(\mathbb{R}^d)$ such that

\[ Y = z_0 + \int_0^T \phi(s) \cdot dW(s), \]

and hence for each $M \in \mathcal{M}^2_T$, there exist $z_0 \in \mathbb{R}$ and $\phi \in H_T(\mathbb{R}^d)$ such that

\[ M(t) = z_0 + \int_0^t \phi(s) \cdot dW(s). \]


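Markets with multiple stocks

Let $W$ be an $\mathbb{R}^d$-valued standard Brownian motion, $\alpha$ $\mathbb{R}^m$-valued and adapted, and $\sigma$ $M^{m \times d}$-valued and adapted. Let $S = (S_1, \ldots, S_m)$ satisfy

\[ S_i(t) = S_i(0) + \int_0^t \alpha_i(s)S_i(s)\,ds + \sum_{j=1}^d \int_0^t S_i(s)\sigma_{ij}(s)\,dW_j(s), \]

so

\[ S_i(t) = S_i(0) \exp\Big\{\sum_{j=1}^d \int_0^t \sigma_{ij}(s)\,dW_j(s) + \int_0^t \Big(\alpha_i(s) - \frac{1}{2}\sum_{j=1}^d \sigma^2_{ij}(s)\Big)\,ds\Big\}. \]

We consider a market with these $m$ stocks and a money market paying interest rate $R$.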


Equations for discounted prices

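As before,

\[ D(t)S_i(t) = S_i(0) + \int_0^t (\alpha_i(s) - R(s))D(s)S_i(s)\,ds + \sum_{j=1}^d \int_0^t D(s)S_i(s)\sigma_{ij}(s)\,dW_j(s). \]

With $\tilde{W}$ as in (15.3),

\[ D(t)S_i(t) = S_i(0) + \int_0^t \Big(\alpha_i(s) - R(s) - \sum_{j=1}^d \sigma_{ij}(s)\Theta_j(s)\Big)D(s)S_i(s)\,ds + \sum_{j=1}^d \int_0^t D(s)S_i(s)\sigma_{ij}(s)\,d\tilde{W}_j(s). \]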


Condition for a risk-neutral measure

For $DS$ to be a martingale under $\tilde{P}$, we must select $\Theta$ satisfying

\[ \sum_{j=1}^d \sigma_{ij}(s)\Theta_j(s) = \alpha_i(s) - R(s), \quad i = 1, \ldots, m, \]

so ordinarily we must have $d \ge m$. If $d = m$, we want $\sigma$ to be invertible.


Portfolio process

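\[
\begin{aligned}
X(t) &= X(0) + \int_0^t \Delta(s) \cdot dS(s) + \int_0^t R(s)(X(s) - \Delta(s) \cdot S(s))\,ds \\
&= X(0) + \int_0^t R(s)X(s)\,ds + \int_0^t \sum_{i=1}^m \Delta_i(s)(\alpha_i(s) - R(s))S_i(s)\,ds \\
&\quad + \sum_{i=1}^m \sum_{j=1}^d \int_0^t \Delta_i(s)S_i(s)\sigma_{ij}(s)\,dW_j(s).
\end{aligned}
\]

Assuming

\[ \sum_{j=1}^d \sigma_{ij}(s)\Theta_j(s) = \alpha_i(s) - R(s), \quad i = 1, \ldots, m, \]

\[ D(t)X(t)Z(t) = X(0) + \sum_{i=1}^m \sum_{j=1}^d \int_0^t \Delta_i(s)D(s)Z(s)S_i(s)\sigma_{ij}(s)\,dW_j(s) - \sum_{j=1}^d \int_0^t D(s)X(s)Z(s)\Theta_j(s)\,dW_j(s), \]

and the hedge must satisfy

\[ \sum_{i=1}^m \Delta_i(s)S_i(s)\sigma_{ij}(s) = \frac{\phi_j(s) + D(s)X(s)Z(s)\Theta_j(s)}{D(s)Z(s)}. \]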


First fundamental theorem

The First fundamental theorem says essentially that if there is an equivalent pricing measure, then there is no arbitrage. Specifically, if $P\{Y \ge 0\} = 1$ and $E^Q[Y] = 0$, then $Q\{Y > 0\} = 0$ and by equivalence $P\{Y > 0\} = 0$. Stated specifically for the market model with multiple stocks and a money market:

Theorem 15.8 (First fundamental theorem) If the market model has a risk-neutral measure, then it does not admit arbitrage.


A caveat

Consider a model in which at each dyadic time $k/2^n$, $n = 1, 2, \ldots$, $0 \le k \le 2^n$, an even money bet can be placed that the player wins with probability $p > 0$. Suppose the interest rate is 0. At time 0, the player borrows a dollar and bets it. If the player loses, then at time $\frac{1}{2}$, the player borrows \$2 and bets it. Continuing, the player places bets at times $1 - \frac{1}{2^n}$, $n = 0, 1, 2, \ldots$, until the player wins (which will happen with probability one). If the win comes at the $n$th stage, the total borrowed is $\sum_{k=0}^n 2^k = 2^{n+1} - 1$, and the player has $2^n + 2^n = 2^{n+1}$ to pay off the loan with \$1 left over.

A similar strategy $\Delta$ can be defined for a stock with price given by a geometric Brownian motion $S$ and an interest rate $R$.
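The bookkeeping of the doubling strategy can be traced in a few lines. The sketch below is a minimal simulation with a hypothetical function name; the cap `max_stages` exists only so the loop terminates in code and is not part of the model.

```python
import random


def doubling_strategy(p=0.5, seed=0, max_stages=10_000):
    """Play the doubling strategy until the first win.

    Returns (winning stage n, total borrowed, amount left after repaying).
    """
    rng = random.Random(seed)
    borrowed = 0
    for n in range(max_stages):
        stake = 2 ** n              # borrow 2^n at time 1 - 1/2^n and bet it
        borrowed += stake
        if rng.random() < p:        # even money bet won with probability p
            holdings = 2 * stake    # stake returned plus equal winnings
            return n, borrowed, holdings - borrowed
    raise RuntimeError("no win within max_stages (probability (1-p)**max_stages)")


if __name__ == "__main__":
    for seed in range(5):
        print(doubling_strategy(seed=seed))
```

Every run ends with exactly one dollar of profit, but the amount borrowed before the win is unbounded, which is why the caveat matters: a no-arbitrage statement needs some restriction on admissible strategies.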


Second Fundamental Theorem

Theorem 15.9 If the market model has a risk-neutral measure, then it is complete if and only if the measure is unique.

Proof. Suppose the market is complete and $P_1$ and $P_2$ are risk-neutral measures. Then for each portfolio process

\[ E^{P_1}[D(T)V(T)] = X(0) = E^{P_2}[D(T)V(T)]. \]

Taking $V(T) = 1_A$,

\[ \mu_1(A) \equiv E^{P_1}[D(T)1_A] = X(0) = E^{P_2}[D(T)1_A] \equiv \mu_2(A). \]

Since $\mu_1 = \mu_2$ on $\mathcal{F}^W_T$, $dP_1 = D(T)^{-1}d\mu_1 = dP_2$.


Now suppose there is only one risk-neutral measure. Then the solution of

\[ \sum_{j=1}^d \sigma_{ij}(s)\Theta_j(s) = \alpha_i(s) - R(s), \quad i = 1, \ldots, m, \]

is unique. To hedge, we must be able to solve

\[ \sum_{i=1}^m \Delta_i(s)S_i(s)\sigma_{ij}(s) = \kappa_j(s) \]

for the appropriate $\kappa_j$.


A lemma from linear algebra

Lemma 15.10 Let $A$ be an $m \times d$-dimensional matrix, $b \in \mathbb{R}^m$ and $c \in \mathbb{R}^d$. If there exists a unique solution of $Ax = b$, then there exists a solution of $A^T y = c$.

Proof. Existence and uniqueness for $Ax = b$ implies that the only solution of $Az = 0$ is $z = 0$. It follows that the columns of $A$ must be linearly independent and hence the rank of $A$ is $d$. But the row rank is the same as the column rank, so the rows of $A$ span $\mathbb{R}^d$. Since the rows of $A$ are the columns of $A^T$, $A^T y = c$ has a solution.


Applications and examples

• Interest rate models

• Dividends

• Forward contracts

• Pricing a cash flow

• Reflecting Brownian motion


Interest rate models

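Suppose

\[ R(t) = R(0) + \int_0^t \beta(s, R(s))\,ds + \int_0^t \gamma(s, R(s))\,d\tilde{W}(s). \]

Then $R$ is Markov and the bond price must satisfy

\[ B(t,T) = E^{\tilde{P}}\big[e^{-\int_t^T R(s)\,ds} \mid \mathcal{F}_t\big] = f(t, R(t)) \]

for some $f$. Since $D(t)f(t, R(t))$ must be a martingale under $\tilde{P}$, and

\[
\begin{aligned}
D(t)f(t, R(t)) &= f(0, R(0)) + \int_0^t D(s)\big(f_s(s, R(s)) + Lf(s, R(s)) - R(s)f(s, R(s))\big)\,ds \\
&\quad + \int_0^t D(s)\gamma(s, R(s))f_r(s, R(s))\,d\tilde{W}(s),
\end{aligned}
\]

the $ds$ integrand must vanish.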


PDE for bond price

Here

\[ Lf(t,r) = \beta(t,r)f_r(t,r) + \frac{1}{2}\gamma^2(t,r)f_{rr}(t,r), \]

and $f$ must satisfy

\[ f_t(t,r) + Lf(t,r) = r f(t,r). \]


Hull and White interest rate model

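\[ R(t) = R(0) + \int_0^t (a(s) - b(s)R(s))\,ds + \int_0^t \sigma(s)\,d\tilde{W}(s), \]

where $a$, $b$, and $\sigma$ are deterministic functions of time. Apply Ito's formula to $e^{\int_0^t b(r)\,dr}R(t)$ to obtain

\[ e^{\int_0^t b(r)\,dr}R(t) = R(0) + \int_0^t e^{\int_0^s b(r)\,dr}a(s)\,ds + \int_0^t e^{\int_0^s b(r)\,dr}\sigma(s)\,d\tilde{W}(s) \]

and

\[ e^{\int_{t_0}^t b(r)\,dr}R(t) = R(t_0) + \int_{t_0}^t e^{\int_{t_0}^s b(r)\,dr}a(s)\,ds + \int_{t_0}^t e^{\int_{t_0}^s b(r)\,dr}\sigma(s)\,d\tilde{W}(s). \]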


Computation of the bond price

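\[
\begin{aligned}
B(t_0, T) &= E^{\tilde{P}}\big[e^{-\int_{t_0}^T R(s)\,ds} \mid \mathcal{F}_{t_0}\big] \\
&= \exp\Big\{-R(t_0)\int_{t_0}^T e^{-\int_{t_0}^s b(r)\,dr}\,ds\Big\}\, E\big[\exp\{-\alpha(t_0,T) - Z(t_0,T)\}\big],
\end{aligned}
\]

where

\[ \alpha(t_0, T) = \int_{t_0}^T \int_{t_0}^t e^{-\int_s^t b(r)\,dr}a(s)\,ds\,dt \]

and

\[ Z(t_0, T) = \int_{t_0}^T \int_{t_0}^t e^{-\int_s^t b(r)\,dr}\sigma(s)\,d\tilde{W}(s)\,dt. \]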
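For constant $a$, $b$, $\sigma$ (a special case assumed here for illustration, usually called the Vasicek model) the integrals above can be evaluated in closed form: with $\tau = T - t_0$ and $C(\tau) = (1 - e^{-b\tau})/b$, one gets $\alpha = (a/b)(\tau - C(\tau))$, and $Z$ is Gaussian with mean zero and variance $\sigma^2 \int_0^\tau C(u)^2\,du$, so $E[e^{-Z}] = e^{\mathrm{Var}(Z)/2}$. The sketch below codes this and checks by finite differences that the result satisfies the bond-price PDE $f_t + (a - br)f_r + \frac{1}{2}\sigma^2 f_{rr} = rf$. Function names and parameter values are hypothetical.

```python
import math


def bond_price(r0, tau, a, b, sigma):
    """B(t0, t0 + tau) given R(t0) = r0, for constant a, b > 0 and sigma."""
    C = (1.0 - math.exp(-b * tau)) / b                  # int_0^tau e^{-b u} du
    alpha = (a / b) * (tau - C)                         # deterministic drift part
    var_Z = (sigma ** 2 / b ** 2) * (
        tau - 2.0 * C + (1.0 - math.exp(-2.0 * b * tau)) / (2.0 * b)
    )                                                   # sigma^2 * int_0^tau C(u)^2 du
    # Z is mean-zero Gaussian, so E[exp(-Z)] = exp(var_Z / 2).
    return math.exp(-C * r0 - alpha + 0.5 * var_Z)


def pde_residual(t, r, T, a, b, sigma, h=1e-4):
    """Central-difference residual of f_t + (a - b r) f_r + sigma^2/2 f_rr - r f."""
    f = lambda t_, r_: bond_price(r_, T - t_, a, b, sigma)
    f_t = (f(t + h, r) - f(t - h, r)) / (2.0 * h)
    f_r = (f(t, r + h) - f(t, r - h)) / (2.0 * h)
    f_rr = (f(t, r + h) - 2.0 * f(t, r) + f(t, r - h)) / h ** 2
    return f_t + (a - b * r) * f_r + 0.5 * sigma ** 2 * f_rr - r * f(t, r)


if __name__ == "__main__":
    print(bond_price(0.04, 2.0, 0.02, 0.5, 0.1))
    print(pde_residual(0.3, 0.04, 2.0, 0.02, 0.5, 0.1))
```

With $\sigma = 0$ and $r_0 = a/b$ the rate stays constant and the price reduces to $e^{-r_0\tau}$, which gives a quick consistency check.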


Cox-Ingersoll-Ross interest rate model

\[ R(t) = R(0) + \int_0^t (a - bR(s))\,ds + \int_0^t \sigma\sqrt{R(s)}\,d\tilde{W}(s) \]

with $a, b, \sigma > 0$. Note that $R \ge 0$. This process also arises as the diffusion approximation to a branching process with immigration.

The PDE for the bond price becomes

\[ f_t(t,r) + (a - br)f_r(t,r) + \frac{1}{2}\sigma^2 r f_{rr}(t,r) = r f(t,r). \]

Look for a solution of the form

\[ f(t,r) = e^{-rC(t,T) - A(t,T)} \]

and keep in mind that $f(T,r) = 1$.
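Substituting the exponential form into the PDE shows why the ansatz works. The following is a sketch of that computation (derived here, not quoted from the notes), writing $C = C(t,T)$, $A = A(t,T)$ and using a subscript $t$ for the derivative in the first argument:

```latex
\begin{align*}
f_t &= (-r\,C_t - A_t)\,f, \qquad f_r = -C\,f, \qquad f_{rr} = C^2 f, \\
\intertext{so dividing the PDE by $f$ gives}
-r\,C_t - A_t - (a - br)\,C + \tfrac{1}{2}\sigma^2 r\,C^2 &= r. \\
\intertext{Matching the coefficient of $r$ and the terms not involving $r$:}
C_t &= b\,C + \tfrac{1}{2}\sigma^2 C^2 - 1, \qquad C(T,T) = 0, \\
A_t &= -a\,C, \qquad\qquad\qquad\quad\;\; A(T,T) = 0.
\end{align*}
```

The terminal conditions come from $f(T,r) = 1$ for all $r$. The equation for $C$ is a Riccati equation that does not involve $A$, and $A$ is then obtained by integration.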


Dividends

The simplest model of dividend payments assumes that dividends are paid continuously at a rate proportional to the stock price. Payment of dividends causes a reduction in the stock price, so the model becomes

\[ S(t) = S(0) + \int_0^t \sigma(s)S(s)\,dW(s) + \int_0^t (\alpha(s) - A(s))S(s)\,ds, \]

where the cumulative dividends paid to the owner of one share of stock is $\int_0^t A(s)S(s)\,ds$.


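Portfolio value

The dividend rate cancels from the portfolio value

\[
\begin{aligned}
X(t) &= X(0) + \int_0^t \Delta(s)\,dS(s) + \int_0^t \Delta(s)A(s)S(s)\,ds + \int_0^t R(s)(X(s) - \Delta(s)S(s))\,ds \\
&= X(0) + \int_0^t R(s)X(s)\,ds + \int_0^t \Delta(s)S(s)\sigma(s)\,d\tilde{W}(s),
\end{aligned}
\]

where $\tilde{W}(t) = W(t) + \int_0^t \Theta(s)\,ds$, $\Theta(t) = \dfrac{\alpha(t) - R(t)}{\sigma(t)}$.

Note that

\[ S(t) = S(0) + \int_0^t \sigma(s)S(s)\,d\tilde{W}(s) + \int_0^t (R(s) - A(s))S(s)\,ds, \]

so $D(t)S(t)$ is not a martingale under $\tilde{P}$.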


Pricing a cash flow

How should an agreement to make continuing payments over a period of time be priced? Suppose for $0 \le t \le T$, the agreement is to pay a cumulative amount $C(t)$. If the amount received is deposited in the money market, then

\[ X(t) = C(t) + \int_0^t R(s)X(s)\,ds \]

and

\[ D(t)X(t) = \int_0^t D(s)\,dC(s), \]

and the price to be paid at time $t$ to receive the cash flow after time $t$ is

\[ V(t) = \frac{E^{\tilde{P}}[\int_t^T D(s)\,dC(s) \mid \mathcal{F}_t]}{D(t)} = E^{\tilde{P}}\Big[\int_t^T e^{-\int_t^s R(u)\,du}\,dC(s) \,\Big|\, \mathcal{F}_t\Big]. \]


Dividend paying stock

A portfolio consisting of a single share of stock with dividends invested in the money market is worth

\[ X(t) = S(t) + \int_0^t A(s)S(s)\,ds + \int_0^t R(s)(X(s) - S(s))\,ds \]

and

\[ D(t)X(t) = S(0) + \int_0^t D(s)\sigma(s)S(s)\,d\tilde{W}(s). \]


Forward contracts

A forward contract is an agreement to pay a stated price $K$ for an asset at a specified time $T$. No money changes hands at time zero. What should $K$ be at time $t < T$?

The value of the contract at time $T$ is $S(T) - K$, so the price at time $t$ should be

\[ \frac{E^{\tilde{P}}[D(T)(S(T) - K) \mid \mathcal{F}_t]}{D(t)} = \frac{D(t)S(t) - K E^{\tilde{P}}[D(T) \mid \mathcal{F}_t]}{D(t)}, \]

which should be zero. Consequently, solving for $K$,

\[ \mathrm{For}_S(t,T) = \frac{D(t)S(t)}{E^{\tilde{P}}[D(T) \mid \mathcal{F}_t]} = \frac{S(t)}{B(t,T)}. \]
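For a quick numerical illustration, assume a constant interest rate $r$ (an assumption made only for this sketch), so that $B(t,T) = e^{-r(T-t)}$. The hypothetical helper functions below compute the forward price and confirm that the contract with that $K$ has zero value at time $t$.

```python
import math


def forward_price(s_t, r, tau):
    """For_S(t, T) = S(t) / B(t, T) with B(t, T) = exp(-r * tau), tau = T - t."""
    bond = math.exp(-r * tau)
    return s_t / bond


def contract_value(s_t, K, r, tau):
    """Time-t value S(t) - K * B(t, T) of the agreement to pay K for the asset at T.

    Uses that the discounted stock price is a martingale under the
    risk-neutral measure (no dividends assumed).
    """
    return s_t - K * math.exp(-r * tau)


if __name__ == "__main__":
    K = forward_price(100.0, 0.05, 2.0)
    print(K, contract_value(100.0, K, 0.05, 2.0))
```

With a positive rate the forward price exceeds the spot price, reflecting the interest earned on the deferred payment.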


M/M/1 Queueing Model

Arrivals form a Poisson process with parameter $\lambda$.

The service distribution is exponential with parameter $\mu$. Consequently, the length of the queue at time $t$ satisfies

\[ Q(t) = Q(0) + Y_a(\lambda t) - Y_d\Big(\mu \int_0^t 1_{\{Q(s) > 0\}}\,ds\Big), \]

where $Y_a$ and $Y_d$ are independent unit Poisson processes.

Define the busy period $B(t)$ to be

\[ B(t) \equiv \int_0^t 1_{\{Q(s) > 0\}}\,ds. \]

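Rescaled queue length

Rescale to get

\[ X_n(t) \equiv \frac{Q(nt)}{\sqrt{n}}. \]

Then $X_n(t)$ satisfies

\[ X_n(t) = X_n(0) + \frac{Y_a(\lambda_n n t)}{\sqrt{n}} - \frac{1}{\sqrt{n}}\, Y_d\Big(n\mu_n \int_0^t 1_{\{X_n(s) > 0\}}\,ds\Big). \]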

Diffusion approximation

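For a unit Poisson process $Y$, define $\tilde{Y}(u) \equiv Y(u) - u$ and observe that

\[
\begin{aligned}
X_n(t) &= X_n(0) + \frac{1}{\sqrt{n}}\tilde{Y}_a(n\lambda_n t) - \frac{1}{\sqrt{n}}\tilde{Y}_d\Big(n\mu_n \int_0^t 1_{\{X_n(s) > 0\}}\,ds\Big) \\
&\quad + \sqrt{n}(\lambda_n - \mu_n)t + \sqrt{n}\mu_n \int_0^t 1_{\{X_n(s) = 0\}}\,ds.
\end{aligned}
\]

If $\lambda_n \to \lambda$ and $\mu_n \to \mu$, then

\[ W^n_a(t) \equiv \frac{1}{\sqrt{n}}\tilde{Y}_a(n\lambda_n t) \Rightarrow \sqrt{\lambda}\,W_1(t), \qquad W^n_d(t) \equiv \frac{1}{\sqrt{n}}\tilde{Y}_d(n\mu_n t) \Rightarrow \sqrt{\mu}\,W_2(t), \]

where $W_1$ and $W_2$ are standard Brownian motions.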


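Scaled equation

Defining

\[
\begin{aligned}
c_n &\equiv \sqrt{n}(\lambda_n - \mu_n), \\
B_n(t) &\equiv \int_0^t 1_{\{X_n(s) > 0\}}\,ds, \\
\Lambda_n(t) &\equiv \sqrt{n}\mu_n(t - B_n(t)) = \sqrt{n}\mu_n \int_0^t 1_{\{X_n(s) = 0\}}\,ds,
\end{aligned}
\]

we can rewrite $X_n(t)$ as

\[ X_n(t) = X_n(0) + W^n_a(t) - W^n_d(B_n(t)) + c_n t + \Lambda_n(t). \]

$\Lambda_n$ is nondecreasing and increases only when $X_n$ is zero.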


Skorohod problem

Lemma 15.11 For $w \in D_{\mathbb{R}}[0,\infty)$ with $w(0) \ge 0$, there exists a unique pair $(x, \lambda)$ satisfying

\[ x(t) = w(t) + \lambda(t) \tag{15.1} \]

such that $\lambda(0) = 0$, $x(t) \ge 0$ for all $t$, and $\lambda$ is nondecreasing and increases only when $x = 0$. The solution is given by setting $\lambda(t) = 0 \vee \sup_{s \le t}(-w(s))$ and defining $x$ by (15.1).

Proof. [of uniqueness] For $t < \tau_0 = \inf\{s : w(s) \le 0\}$, $\lambda(t) = 0$ and hence $x(t) = w(t)$. For $t \ge \tau_0$, the nonnegativity of $x$ implies

\[ \lambda(t) \ge -w(t), \]

and $\lambda(t)$ nondecreasing implies

\[ \lambda(t) \ge \sup_{s \le t}(-w(s)). \]

If $t$ is a point of increase of $\lambda$, then $x(t) = 0$, so we must have

\[ \lambda(t) = -w(t) \le \sup_{s \le t}(-w(s)). \tag{15.2} \]

Since the right side of (15.2) is nondecreasing, we must have $\lambda(t) \le \sup_{s \le t}(-w(s))$ for all $t > \tau_0$, and the result follows.
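The explicit solution of the Skorohod problem is easy to compute for a path sampled at finitely many times. The sketch below is a discrete-time illustration with a hypothetical function name; it applies $\lambda(t) = 0 \vee \sup_{s \le t}(-w(s))$ and $x = w + \lambda$ to a list of values.

```python
def skorohod_map(w):
    """Discrete Skorohod reflection of a sampled path w (w[0] >= 0).

    Returns (x, lam) with lam[i] = max(0, max_{j <= i} (-w[j])) and
    x[i] = w[i] + lam[i].
    """
    x, lam = [], []
    running_sup = 0.0                       # 0 v sup_{s <= t} (-w(s))
    for value in w:
        running_sup = max(running_sup, -value)
        lam.append(running_sup)
        x.append(value + running_sup)
    return x, lam


if __name__ == "__main__":
    path = [1.0, 0.5, -0.5, -0.2, -1.0, 0.3]
    x, lam = skorohod_map(path)
    print(x)
    print(lam)
```

In the example path, `lam` increases only at the indices where the free path reaches a new minimum below zero, and at exactly those indices the reflected path `x` is zero.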


Convergence to reflecting Brownian motion

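\[ \Lambda_n(t) = 0 \vee \Big(-\inf_{s \le t}\big(X_n(0) + W^n_a(s) - W^n_d(B_n(s)) + c_n s\big)\Big) \]

Consequently, if $X_n(0) + W^n_a(t) - W^n_d(B_n(t)) + c_n t$ converges, so does $\Lambda_n$ and $X_n$ along with it.

Assuming that $c_n \to c$ (which requires $\mu = \lambda$), the limit will satisfy

\[
\begin{aligned}
X(t) &= X(0) + \sqrt{\lambda}\,W_1(t) - \sqrt{\lambda}\,W_2(t) + ct + \Lambda(t), \\
\Lambda(t) &= 0 \vee \sup_{s \le t}\Big(-\big(X(0) + \sqrt{\lambda}(W_1(s) - W_2(s)) + cs\big)\Big).
\end{aligned}
\]

Defining $W = \frac{1}{\sqrt{2}}(W_1 - W_2)$,

\[ X(t) = X(0) + \sqrt{2\lambda}\,W(t) + ct + \Lambda(t), \qquad X(t) \ge 0 \ \ \forall t, \]

where $\Lambda$ is nondecreasing and $\Lambda$ increases only when $X(t) = 0$.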


16. Stationary distributions and forward equations

• Forward equations

• Stationary distributions

• Reflecting diffusions


Equations for probability distributions

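If

\[ X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds, \tag{16.1} \]

$a = \sigma\sigma^T$, $f \in C^2_c(\mathbb{R}^d)$, and

\[ Lf(x) = \frac{1}{2}\sum_{i,j} a_{ij}(x)\partial_i\partial_j f(x) + b(x) \cdot \nabla f(x), \tag{16.2} \]

then

\[ f(X(t)) - \int_0^t Lf(X(s))\,ds \]

is a martingale.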


Forward equation

Since

\[ E[f(X(t))] = E[f(X(0))] + E\Big[\int_0^t Lf(X(s))\,ds\Big] = E[f(X(0))] + \int_0^t E[Lf(X(s))]\,ds, \]

defining $\nu_t(\Gamma) = P\{X(t) \in \Gamma\}$, for $f \in C^2_c(\mathbb{R}^d)$,

\[ \int f\,d\nu_t = \int f\,d\nu_0 + \int_0^t \int Lf\,d\nu_s\,ds, \tag{16.3} \]

which is a weak form of the equation

\[ \frac{d}{dt}\nu_t = L^*\nu_t. \]


Uniqueness for the forward equation

Theorem 16.1 Let $Lf$ be given by (16.2) with $a$ and $b$ continuous, and let $\{\nu_t\}$ be probability measures on $\mathbb{R}^d$ satisfying (16.3) for all $f \in C^2_c(\mathbb{R}^d)$. If (16.1) has a unique solution for each initial condition, then $P\{X(0) \in \cdot\} = \nu_0$ implies $P\{X(t) \in \cdot\} = \nu_t$.


The adjoint operator

In nice situations, $\nu_t(dx) = p_t(x)\,dx$. Then $L^*$ should be a differential operator satisfying

\[ \int_{\mathbb{R}^d} p\,Lf\,dx = \int_{\mathbb{R}^d} f\,L^*p\,dx. \]


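Example 16.2 Let $d = 1$. Integrating by parts, we have

\[
\int_{-\infty}^\infty p(x)\Big(\frac{1}{2}a(x)f''(x) + b(x)f'(x)\Big)\,dx
= \frac{1}{2}p(x)a(x)f'(x)\Big|_{-\infty}^{\infty} - \int_{-\infty}^\infty f'(x)\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big)\,dx.
\]

The first term is zero, and integrating by parts again we have

\[ \int_{-\infty}^\infty f(x)\,\frac{d}{dx}\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big)\,dx, \]

so

\[ L^*p = \frac{d}{dx}\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big). \]

Example 16.3 Let $Lf = \frac{1}{2}f''$ (Brownian motion). Then $L^*p = \frac{1}{2}p''$, that is, $L$ is self adjoint.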


Stationary distributions

Suppose $\int Lf\,d\pi = 0$ for all $f$ in $C^2_c(\mathbb{R}^d)$. Then

\[ \int f\,d\pi = \int f\,d\pi + \int_0^t \int Lf\,d\pi\,ds, \]

and hence $\nu_t \equiv \pi$ gives a solution of (16.3). Under the conditions of Theorem 16.1, if $P\{X(0) \in \cdot\} = \pi$ and $f(X(t)) - \int_0^t Lf(X(s))\,ds$ is a martingale for all $f \in C^2_c(\mathbb{R}^d)$, then $P\{X(t) \in \cdot\} = \pi$, i.e. $\pi$ is a stationary distribution for $X$.


Computation of a stationary distribution if d = 1

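Assuming $\pi(dx) = \pi(x)\,dx$,

\[ \frac{d}{dx}\Big(\underbrace{\frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x)}_{\text{this is a constant: let the constant be } 0}\Big) = 0, \]

so

\[ \frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) = b(x)\pi(x). \]

Applying the integrating factor $\exp(-\int_0^x 2b(z)/a(z)\,dz)$,

\[
\begin{aligned}
\frac{1}{2}e^{-\int_0^x \frac{2b(z)}{a(z)}\,dz}\frac{d}{dx}(a(x)\pi(x)) - b(x)e^{-\int_0^x \frac{2b(z)}{a(z)}\,dz}\pi(x) &= 0, \\
a(x)e^{-\int_0^x \frac{2b(z)}{a(z)}\,dz}\pi(x) &= C, \\
\pi(x) &= \frac{C}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}\,dz}.
\end{aligned}
\]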

Existence of a stationary distribution

Lemma 16.4 Assume $a(x) > 0$ for all $x$. If

\[ C = \int_{-\infty}^\infty \frac{1}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}\,dz}\,dx < \infty, \]

then

\[ \pi(x) = \frac{1}{C\,a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}\,dz} \]

is a stationary density for (16.1).
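The formula in Lemma 16.4 can be evaluated numerically on a grid. The sketch below uses the trapezoid rule for both the inner integral and the normalizing constant; the function name and the test case $a(x) = 1$, $b(x) = -x$ (an Ornstein-Uhlenbeck process, chosen here for illustration) are assumptions. For that case the formula gives $\pi(x) \propto e^{-x^2}$, the $N(0, \frac{1}{2})$ density, so $\pi(0) = 1/\sqrt{\pi}$.

```python
import math


def stationary_density(a, b, grid):
    """Evaluate (1 / (C a(x))) * exp(int_0^x 2 b(z) / a(z) dz) on `grid`.

    `grid` must be increasing and contain 0.0 exactly; integrals use the
    trapezoid rule, and C is the trapezoid approximation on the grid.
    """
    g = [2.0 * b(x) / a(x) for x in grid]
    i0 = grid.index(0.0)
    n = len(grid)
    expo = [0.0] * n
    for i in range(i0 + 1, n):              # integrate to the right of 0
        expo[i] = expo[i - 1] + 0.5 * (grid[i] - grid[i - 1]) * (g[i] + g[i - 1])
    for i in range(i0 - 1, -1, -1):         # to the left: int_0^x = -int_x^0
        expo[i] = expo[i + 1] - 0.5 * (grid[i + 1] - grid[i]) * (g[i] + g[i + 1])
    raw = [math.exp(e) / a(x) for e, x in zip(expo, grid)]
    C = sum(0.5 * (raw[i] + raw[i + 1]) * (grid[i + 1] - grid[i]) for i in range(n - 1))
    return [v / C for v in raw]


if __name__ == "__main__":
    grid = [i * 0.01 for i in range(-600, 601)]
    density = stationary_density(lambda x: 1.0, lambda x: -x, grid)
    print(density[grid.index(0.0)], 1.0 / math.sqrt(math.pi))
```

The grid must be wide enough that the density is negligible at its ends; otherwise the computed constant underestimates $C$.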


Diffusion with a boundary

Suppose

\[ X(t) = X(0) + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds + \Lambda(t) \]

with $X(t) \ge 0$, and that $\Lambda$ is nondecreasing and increasing only when $X(t) = 0$. Then

\[ f(X(t)) = f(X(0)) + \int_0^t \nabla f^T(X(s))\sigma(X(s))\,dW(s) + \int_0^t Lf(X(s))\,ds + \int_0^t f'(X(s))\,d\Lambda(s), \]

so

\[ f(X(t)) - \int_0^t Lf(X(s))\,ds \]

is a martingale, if $f \in C^2_c$ and $f'(0) = 0$.


Adjoint operator for reflecting process

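\[
\begin{aligned}
\int_0^\infty p(x)Lf(x)\,dx &= \underbrace{\Big[\frac{1}{2}p(x)a(x)f'(x)\Big]_0^\infty}_{=0} - \int_0^\infty f'(x)\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big)\,dx \\
&= \Big[-f(x)\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big)\Big]_0^\infty + \int_0^\infty f(x)L^*p(x)\,dx,
\end{aligned}
\]

and hence

\[ L^*p(x) = \frac{d}{dx}\Big(\frac{1}{2}\frac{d}{dx}(a(x)p(x)) - b(x)p(x)\Big) \]

for $p$ satisfying

\[ \Big(\frac{1}{2}a'(0) - b(0)\Big)p(0) + \frac{1}{2}a(0)p'(0) = 0. \tag{16.4} \]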


Forward equation

The density for the distribution of the process should satisfy

\[ \frac{d}{dt}p_t = L^*p_t \]

and the boundary condition (16.4).

The stationary density satisfies

\[ \frac{d}{dx}\Big(\frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x)\Big) = 0 \]

and the boundary condition implies

\[ \frac{1}{2}\frac{d}{dx}(a(x)\pi(x)) - b(x)\pi(x) = 0, \]

giving

\[ \pi(x) = \frac{c}{a(x)}e^{\int_0^x \frac{2b(z)}{a(z)}\,dz}, \quad x \ge 0. \]


Stationary distribution for reflecting Brownian motion

Example 16.5 Let $X(t) = X(0) + \sigma W(t) - bt + \Lambda(t)$, where $a = \sigma^2$ and $b > 0$ are constant. Then

\[ \pi(x) = \frac{2b}{\sigma^2}e^{-\frac{2b}{\sigma^2}x}, \]

so the stationary distribution is exponential.


17. Processes with jumps

• Poisson process

• Martingale properties of the Poisson process

• Strong Markov property for the Poisson process

• Marked and compound Poisson processes

• Stochastic equations with jumps

• General counting processes

• Intensities


Poisson process

A Poisson process is a model for a series of random observations occurring in time. For example, the process could model the arrivals of customers in a bank, the arrivals of telephone calls at a switch, or the counts registered by radiation detection equipment.

[Figure: a time axis with observation times marked by x and a time $t$ marked.]

Let $N(t)$ denote the number of observations by time $t$. In the figure above, $N(t) = 6$. Note that for $t < s$, $N(s) - N(t)$ is the number of observations in the time interval $(t, s]$.


Basic assumptions

1) Observations occur one at a time.

2) Numbers of observations in disjoint time intervals are independent random variables, i.e., if $t_0 < t_1 < \cdots < t_m$, then $N(t_k) - N(t_{k-1})$, $k = 1, \ldots, m$, are independent random variables.

3) The distribution of N(t+ a)−N(t) does not depend on t.


Characterization of a Poisson process

Theorem 17.1 Under assumptions 1), 2), and 3), there is a constant $\lambda > 0$ such that, for $t < s$, $N(s) - N(t)$ is Poisson distributed with parameter $\lambda(s - t)$, that is,

\[ P\{N(s) - N(t) = k\} = \frac{(\lambda(s-t))^k}{k!}e^{-\lambda(s-t)}. \]

Proof. Let $N_n(t)$ be the number of time intervals $(\frac{k}{n}, \frac{k+1}{n}]$, $k = 0, \ldots, [nt]$, that contain at least one observation. Then $N_n(t)$ is binomially distributed with parameters $n$ and $p_n = P\{N(\frac{1}{n}) > 0\}$. Then

\[ (1 - p_n)^n = P\{N_n(1) = 0\} = P\{N(1) = 0\} \]

and $np_n \to \lambda \equiv -\log P\{N(1) = 0\}$, and the rest follows by standard Poisson approximation of the binomial.


Interarrival times

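Let $S_k$ be the time of the $k$th observation. Then

\[ P\{S_k \le t\} = P\{N(t) \ge k\} = 1 - \sum_{i=0}^{k-1} \frac{(\lambda t)^i}{i!}e^{-\lambda t}, \quad t \ge 0. \]

Differentiating to obtain the probability density function gives

\[ f_{S_k}(t) = \begin{cases} \frac{1}{(k-1)!}\lambda(\lambda t)^{k-1}e^{-\lambda t}, & t \ge 0, \\ 0, & t < 0. \end{cases} \]

Theorem 17.2 Let $T_1 = S_1$ and for $k > 1$, $T_k = S_k - S_{k-1}$. Then $T_1, T_2, \ldots$ are independent and exponentially distributed with parameter $\lambda$.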
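Theorem 17.2 also gives a way to simulate the process: add up independent exponential interarrival times and count how many partial sums fall in $[0, t]$. The sketch below (hypothetical function names and parameter values) does this and estimates the mean and variance of $N(t)$, both of which should be close to $\lambda t$ for a Poisson distribution.

```python
import random


def poisson_count(lam, t, rng):
    """N(t), built from i.i.d. exponential(lam) interarrival times T_1, T_2, ..."""
    count = 0
    s = rng.expovariate(lam)        # S_1 = T_1
    while s <= t:
        count += 1
        s += rng.expovariate(lam)   # S_{k+1} = S_k + T_{k+1}
    return count


def empirical_mean_and_variance(lam=2.0, t=3.0, n_runs=20_000, seed=0):
    """Sample mean and variance of N(t) over n_runs independent simulations."""
    rng = random.Random(seed)
    counts = [poisson_count(lam, t, rng) for _ in range(n_runs)]
    mean = sum(counts) / n_runs
    var = sum((c - mean) ** 2 for c in counts) / n_runs
    return mean, var


if __name__ == "__main__":
    print(empirical_mean_and_variance())
```

Equality of mean and variance is a quick diagnostic: a counting process with clustered or overly regular arrivals would show variance above or below the mean.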


Martingale properties of the Poisson process

Theorem 17.3 (Watanabe) If $N$ is a Poisson process with parameter $\lambda$, then $N(t) - \lambda t$ is a martingale. Conversely, if $N$ is a counting process and $N(t) - \lambda t$ is a martingale, then $N$ is a Poisson process.


Proof.

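For a partition $t = s_0 < s_1 < \cdots < s_n = t + r$,

\[
\begin{aligned}
E[e^{i\theta(N(t+r) - N(t))} \mid \mathcal{F}_t]
&= 1 + \sum_{k=0}^{n-1} E\big[\big(e^{i\theta(N(s_{k+1}) - N(s_k))} - 1 - (e^{i\theta} - 1)(N(s_{k+1}) - N(s_k))\big)e^{i\theta(N(s_k) - N(t))} \mid \mathcal{F}_t\big] \\
&\quad + \sum_{k=0}^{n-1} \lambda(s_{k+1} - s_k)(e^{i\theta} - 1)E[e^{i\theta(N(s_k) - N(t))} \mid \mathcal{F}_t].
\end{aligned}
\]

As the mesh of the partition goes to zero, the first sum converges to zero by the dominated convergence theorem, so we have

\[ E[e^{i\theta(N(t+r) - N(t))} \mid \mathcal{F}_t] = 1 + \lambda(e^{i\theta} - 1)\int_0^r E[e^{i\theta(N(t+s) - N(t))} \mid \mathcal{F}_t]\,ds \]

and $E[e^{i\theta(N(t+r) - N(t))} \mid \mathcal{F}_t] = e^{\lambda(e^{i\theta} - 1)r}$.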



A Poisson process $N$ is compatible with a filtration $\{\mathcal{F}_t\}$ if $N$ is $\{\mathcal{F}_t\}$-adapted and $N(t + \cdot) - N(t)$ is independent of $\mathcal{F}_t$ for every $t \ge 0$.

Lemma 17.4 Let $N$ be a Poisson process with parameter $\lambda > 0$ that is compatible with $\{\mathcal{F}_t\}$, and let $\tau$ be an $\{\mathcal{F}_t\}$-stopping time such that $\tau < \infty$ a.s. Define $N_\tau(t) = N(\tau + t) - N(\tau)$. Then $N_\tau$ is a Poisson process that is independent of $\mathcal{F}_\tau$ and compatible with $\{\mathcal{F}_{\tau + t}\}$.


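Proof. Let $M(t) = N(t) - \lambda t$. By the optional sampling theorem,

\[ E[M((\tau + t + r) \wedge T) \mid \mathcal{F}_{\tau + t}] = M((\tau + t) \wedge T), \]

so

\[ E[N((\tau + t + r) \wedge T) - N((\tau + t) \wedge T) \mid \mathcal{F}_{\tau + t}] = \lambda\big((\tau + t + r) \wedge T - (\tau + t) \wedge T\big). \]

By the monotone convergence theorem,

\[ E[N(\tau + t + r) - N(\tau + t) \mid \mathcal{F}_{\tau + t}] = \lambda r, \]

which gives the lemma.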


Marked Poisson processes

Let $N$ be a Poisson process with parameter $\lambda$ and associate with each jump time $S_k$ a mark $\xi_k$ in a space $E$, called the mark space. Assume that the $\xi_k$ are iid and independent of $N$.


An independence lemma

Lemma 17.5 Let $N$ be a Poisson process and $\{\xi_k\}$ be independent and identically distributed and independent of $N$. Then

\[ \mathcal{F}_t = \sigma(N(s), \xi_{N(s)} : s \le t) \]

is independent of

\[ \mathcal{G}_t = \sigma(N(s) - N(t), \xi_{N(t)+k} : s \ge t,\ k = 1, 2, \ldots). \]

Proof. The proof is left as an exercise. The lemma may look obvious, but in general it is false if the assumption that the $\xi_k$ are identically distributed is dropped.


Compound Poisson processes

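Suppose $E = \mathbb{R}$ and let $Q(t) = \sum_{k=1}^{N(t)} \xi_k$. Then $Q$ is a compound Poisson process.

Lemma 17.6 Suppose $E[\xi_k] = \beta$. Then $Q$ has independent increments, $E[Q(t)] = \beta\lambda t$, and

\[ M(t) = Q(t) - \beta\lambda t \]

is a martingale.

Proof. Since $E[\sum_{k=1}^n \xi_k] = n\beta$, $E[Q(t) \mid N(t)] = N(t)\beta$ and

\[ E[Q(t)] = E[E[Q(t) \mid N(t)]] = \lambda t\beta. \]

\[ E[Q(t+s) - Q(t) \mid \mathcal{F}_t] = E\Big[\sum_{k=N(t)+1}^{N(t) + (N(t+s) - N(t))} \xi_k \,\Big|\, \mathcal{F}_t\Big] = \lambda s\beta. \]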
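The identity $E[Q(t)] = \beta\lambda t$ is easy to check by simulation. In the sketch below the marks are taken to be normal with mean $\beta$ and unit variance; that distribution, the function names and the parameter values are illustrative assumptions (the lemma needs only $E[\xi_k] = \beta$).

```python
import random


def compound_poisson(lam, t, draw_mark, rng):
    """Q(t) = sum of the marks attached to the jump times S_k <= t."""
    q = 0.0
    s = rng.expovariate(lam)            # first jump time
    while s <= t:
        q += draw_mark(rng)             # mark xi_k, independent of the jump times
        s += rng.expovariate(lam)
    return q


def mean_of_Q(lam=1.5, t=2.0, beta=0.7, n_runs=20_000, seed=0):
    """Monte Carlo estimate of E[Q(t)] with N(beta, 1) marks (hypothetical choice)."""
    rng = random.Random(seed)
    draw = lambda g: g.gauss(beta, 1.0)
    total = sum(compound_poisson(lam, t, draw, rng) for _ in range(n_runs))
    return total / n_runs


if __name__ == "__main__":
    print(mean_of_Q(), 0.7 * 1.5 * 2.0)
```

Replacing `draw` with any other mark distribution of mean $\beta$ should leave the estimate near $\beta\lambda t$, though the Monte Carlo error will change with the variance of the marks.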


A stochastic equation for a jump process

Theorem 17.7 Let $X(0)$ be independent of the marked Poisson process, let $H : \mathbb{R}^d \times E \to \mathbb{R}^d$, and let $X$ satisfy

\[ X(t) = X(0) + \sum_{k=1}^{N(t)} H(X(S_k-), \xi_k) \quad \Big(= X(0) + \int_{[0,t] \times E} H(X(s-), z)\,N(ds \times dz)\Big). \]

Then $X$ is a Markov process and

\[ f(X(t)) - \int_0^t \lambda \int_E \big(f(X(s) + H(X(s), z)) - f(X(s))\big)\,\mu_\xi(dz)\,ds \]

is a martingale for each $f \in B(\mathbb{R}^d)$.


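Proof. Since

\[ X(t+s) = X(t) + \sum_{k=N(t)+1}^{N(t+s)} H(X(S_k-), \xi_k) \]

is a function of $X(t)$, $N(t + \cdot) - N(t)$, and $(\xi_{N(t)+1}, \xi_{N(t)+2}, \ldots)$, the Markov property follows by Lemma 17.5.

Noting that

\[
\begin{aligned}
E[f(X(t+h)) - f(X(t)) \mid \mathcal{F}_t]
&= E\big[1_{\{N(t+h) - N(t) = 1\}}\big(f(X(t) + H(X(t), \xi_{N(t)+1})) - f(X(t))\big) \mid \mathcal{F}_t\big] + o(h) \\
&= \lambda h e^{-\lambda h}\int_E \big(f(X(t) + H(X(t), z)) - f(X(t))\big)\,\mu_\xi(dz) + o(h),
\end{aligned}
\]

we have, for a partition $\{t_i\}$ of $[t, t+s]$,

\[
\begin{aligned}
E[f(X(t+s)) - f(X(t)) \mid \mathcal{F}_t] &= \sum_i E[f(X(t_{i+1})) - f(X(t_i)) \mid \mathcal{F}_t] \\
&\approx E\Big[\int_t^{t+s} \lambda \int_E \big(f(X(r) + H(X(r), z)) - f(X(r))\big)\,\mu_\xi(dz)\,dr \,\Big|\, \mathcal{F}_t\Big].
\end{aligned}
\]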


Martingales associated with a marked Poisson process

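Lemma 17.8 If $(N, \{\xi_k\})$ is a marked Poisson process, then

\[ E\Big[\sum_{k=N(t)+1}^{N(t+s)} G(k-1, \xi_k) \,\Big|\, \mathcal{F}_t\Big] = E\Big[\int_t^{t+s} \lambda \int_E G(N(r), z)\,\mu_\xi(dz)\,dr \,\Big|\, \mathcal{F}_t\Big], \]

and hence

\[ M_G(t) = \sum_{k=1}^{N(t)} G(k-1, \xi_k) - \int_0^t \lambda \int_E G(N(s), z)\,\mu_\xi(dz)\,ds \]

is a martingale.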


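Lemma 17.9 Let $X$ be cadlag and adapted and $G$ be a bounded, continuous function. Define

\[ Z(t) = \int_0^t G(X(s-), \xi_{N(s)})\,dN(s) = \sum_{k=1}^{N(t)} G(X(S_k-), \xi_k). \]

Then

\[ E[Z(t+s) - Z(t) \mid \mathcal{F}_t] = E\Big[\int_t^{t+s} \lambda \int_E G(X(r), z)\,\mu_\xi(dz)\,dr \,\Big|\, \mathcal{F}_t\Big] \]

and hence

\[ Z(t) - \int_0^t \lambda \int_E G(X(s), z)\,\mu_\xi(dz)\,ds \]

is a martingale.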


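Proof. Note that

\[ Z(t) = \lim_{h \to 0} \sum_{k=0}^{[t/h]} G(X(kh), \xi_{N(kh)+1})\big(N((k+1)h) - N(kh)\big) \]

and

\[ E\big[G(X(kh), \xi_{N(kh)+1})\big(N((k+1)h) - N(kh)\big) \mid \mathcal{F}_{kh}\big] = \int_E G(X(kh), z)\,\mu_\xi(dz)\,\lambda h. \]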


Change of measure

Theorem 17.10 Let $G$ be bounded and continuous, $G \ge -1$, $X$ cadlag and adapted, and $Z$ satisfy

\[ Z(t) = 1 + \int_0^t Z(s-)G(X(s-), \xi_{N(s)})\,dN(s) - \int_0^t Z(s)\lambda \int_E G(X(s), z)\,\mu_\xi(dz)\,ds. \]

Then

\[ Z(t) = \exp\Big\{\int_0^t \log\big(1 + G(X(s-), \xi_{N(s)})\big)\,dN(s) - \int_0^t \lambda \int_E G(X(s), z)\,\mu_\xi(dz)\,ds\Big\} \]

is a martingale.


Transformation of martingales under a change of measure

Lemma 17.11 Let $Z$ be as in Theorem 17.10, and suppose $d\tilde{P}|_{\mathcal{F}_t} = Z(t)\,dP|_{\mathcal{F}_t}$. If $M$ is an $\{\mathcal{F}_t\}$-local martingale under $P$ and $[Z, M]_t = \int_0^t V(s-)\,dN(s)$, then

\[ Y(t) = M(t) - \int_0^t \frac{V(s-)}{Z(s)}\,dN(s) \]

is an $\{\mathcal{F}_t\}$-local martingale under $\tilde{P}$.


Proof.

\[
\begin{aligned}
Z(t)Y(t) &= Z(0)Y(0) + \int_0^t Y(s-)\,dZ(s) + \int_0^t Z(s-)\,dY(s) + [Z, Y]_t \\
&= Z(0)Y(0) + \int_0^t Y(s-)\,dZ(s) + \int_0^t Z(s-)\,dY(s) + [Z, M]_t \\
&\quad - \int_0^t \frac{V(s-)G(X(s-), \xi_{N(s)})}{1 + G(X(s-), \xi_{N(s)})}\,dN(s) \\
&= Z(0)Y(0) + \int_0^t Y(s-)\,dZ(s) + \int_0^t Z(s-)\,dM(s).
\end{aligned}
\]


General counting processes

$N$ is a counting process if $N(0) = 0$, $N$ is right continuous, and $N$ is constant except for jumps of $+1$.

$N$ is determined by its jump times $0 < \sigma_1 < \sigma_2 < \cdots$. If $N$ is adapted to $\{\mathcal{F}_t\}$, then the $\sigma_k$ are $\{\mathcal{F}_t\}$-stopping times.


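Intensity for a counting process

If $N$ is a Poisson process with parameter $\lambda$ and $N$ is compatible with $\{\mathcal{F}_t\}$, then

\[ P\{N(t + \Delta t) > N(t) \mid \mathcal{F}_t\} = 1 - e^{-\lambda\Delta t} \approx \lambda\Delta t. \]

For a general counting process $N$, at least intuitively, a nonnegative, $\{\mathcal{F}_t\}$-adapted stochastic process $\lambda(\cdot)$ is an $\{\mathcal{F}_t\}$-intensity for $N$ if

\[ P\{N(t + \Delta t) > N(t) \mid \mathcal{F}_t\} \approx E\Big[\int_t^{t + \Delta t} \lambda(s)\,ds \,\Big|\, \mathcal{F}_t\Big] \approx \lambda(t)\Delta t. \]

Definition 17.12 $\lambda$ is an $\{\mathcal{F}_t\}$-intensity for $N$ if and only if for each $n = 1, 2, \ldots$,

\[ N(t \wedge \sigma_n) - \int_0^{t \wedge \sigma_n} \lambda(s)\,ds \]

is an $\{\mathcal{F}_t\}$-martingale.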


Modeling with intensities

Let $X$ be a stochastic process (cadlag, $E$-valued for simplicity) that models "external noise." Let $D_c[0,\infty)$ denote the space of counting paths (zero at time zero and constant except for jumps of $+1$).

Condition 17.13

\[ \lambda : [0,\infty) \times D_E[0,\infty) \times D_c[0,\infty) \to [0,\infty) \]

and assume that if $X$ is cadlag and adapted, then $s \to \lambda(s, X, N)$ is cadlag and adapted.


Change of measure

Theorem 17.14 On $(\Omega, \mathcal{F}, P)$, let $N$ be a unit Poisson process, $X$ be cadlag and adapted, $\lambda$ satisfy Condition 17.13, and $\lambda$ be bounded by a constant. Let $Z$ satisfy

\[ Z(t) = 1 + \int_0^t (\lambda(s-, X, N) - 1)Z(s-)\,dN(s) - \int_0^t Z(s)(\lambda(s, X, N) - 1)\,ds. \]

Then $Z$ is a martingale, and if $d\tilde{P}|_{\mathcal{F}_t} = Z(t)\,dP|_{\mathcal{F}_t}$, then under $\tilde{P}$,

\[ N(t) - \int_0^t \lambda(s, X, N)\,ds \]

is a martingale, that is, $N$ is a counting process with intensity $\lambda(s, X, N)$.


18. Assignments

1. Due February 1: Exercises 1.5 and 1.8

2. Due February 8: Exercises 1.14 and 1.15

3. Due February 15: Exercise 2.9 and Problem 2

4. Due February 22: Problems 6, 7, 8

5. Due March 13: Problems 9, 10

6. Due April 19: Problems 11, 12

7. Due April 26: Exercises 5.8 and 5.10

8. Due May 3: Exercises 5.11 and 6.1

9. Due May 10: Exercise 5.14


19. Problems

1. Let X , Y , and Z be random variables. X and Y are conditionally independent given Z if and only if

E[f(X)g(Y )|Z] = E[f(X)|Z]E[g(Y )|Z]

for all bounded measurable f and g.

(a) Show that X and Y are conditionally independent given Z if and only if

E[f(X)|Y,Z] = E[f(X)|Z] (19.1)

for all bounded measurable f .

(b) Let $X$, $Y$, and $Z$ be independent, $\mathbb{R}$-valued random variables, and let $\psi : \mathbb{R}^2 \to \mathbb{R}$ and $\varphi : \mathbb{R}^2 \to \mathbb{R}$ be Borel measurable functions. Define $U = \psi(X, Z)$ and $V = \varphi(Y, Z)$. Show that $U$ and $V$ are conditionally independent given $Z$.

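2. Let $X$ be uniformly distributed on $[0,1]$, and let $\mathcal{D}$ be the $\sigma$-algebra generated by $\{X \le \frac{1}{2}\}$, $\{\frac{1}{2} < X \le \frac{3}{4}\}$, $\{\frac{3}{4} < X \le 1\}$. Give an explicit representation of $E[X \mid \mathcal{D}]$ in terms of $X$. What is the probability distribution of $E[X \mid \mathcal{D}]$?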

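3. Let $Q$ be a probability measure on $(\Omega, \mathcal{F})$ that is absolutely continuous with respect to $P$ and let $L = \frac{dQ}{dP}$ be the corresponding Radon-Nikodym derivative. Let $P_{\mathcal{D}}$ and $Q_{\mathcal{D}}$ be the restrictions of $P$ and $Q$ to the sub-$\sigma$-algebra $\mathcal{D}$. Show that $Q_{\mathcal{D}}$ is absolutely continuous with respect to $P_{\mathcal{D}}$ and that

\[ \frac{dQ_{\mathcal{D}}}{dP_{\mathcal{D}}} = E[L \mid \mathcal{D}]. \]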


4. Suppose $X$ and $Y$ are independent random variables, and let $\mu_X$ and $\mu_Y$ denote the distributions of $X$ and $Y$. Let $h$ be a measurable function satisfying

\[ \int_{-\infty}^\infty \int_{-\infty}^\infty |h(x,y)|\,\mu_X(dx)\,\mu_Y(dy) < \infty. \]

Define $Z = h(X, Y)$ and

\[ g(y) = \int_{-\infty}^\infty h(x,y)\,\mu_X(dx). \]

Show that $E[Z \mid Y] = g(Y)$.

5. Let $X_1, X_2, \ldots$ be independent random variables with $E[X_k] = 1$, $k = 1, 2, \ldots$. Let $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and for $n = 1, 2, \ldots$, let $\mathcal{F}_n = \sigma(X_1, \ldots, X_n)$, that is, $\mathcal{F}_n$ is the smallest $\sigma$-algebra with respect to which $X_1, \ldots, X_n$ are measurable. Let $M_0 = 1$ and for $n = 1, 2, \ldots$, let $M_n = \prod_{k=1}^n X_k$. Show that $E[M_{n+1} \mid \mathcal{F}_n] = M_n$.

6. Let W be a standard Brownian motion and define

B(t) = W (t)− tW (1), 0 ≤ t ≤ 1.

Show that $B$ is independent of $W(1)$. (It is enough to show that $(B(t_1), \ldots, B(t_m))$ is independent of $W(1)$ for each choice of $0 \le t_1 < \cdots < t_m \le 1$.)

7. Use the result in Problem 6 to compute

E[eλW (t)|W (1)].


8. Let $\tilde{W}(t) = tW(\frac{1}{t})$.

Show that $\tilde{W}$ is a standard Brownian motion.

9. Let $X_1$ and $X_2$ be adapted, simple processes. Show that for $a, b \in \mathbb{R}$, $Z = aX_1 + bX_2$ is an adapted, simple process and that

\[ \int Z\,dW = a\int X_1\,dW + b\int X_2\,dW. \]

(Note that if $U$ and $V$ are $\mathcal{D}$-measurable, then $aU + bV$ is $\mathcal{D}$-measurable.)

10. For $n = 1, 2, \ldots$, let $0 = t^n_0 < t^n_1 < \cdots$, and let $\{a^n_i\} \subset [0,\infty)$. Suppose that for each $t > 0$,

\[ \lim_{n \to \infty} \sum_{t^n_i \le t} a^n_i = t. \]

Show that for $f \in C[0,\infty)$,

\[ \lim_{n \to \infty} \sum_{t^n_i \le t} f(t^n_i)a^n_i = \int_0^t f(s)\,ds. \]

(In fact, the result holds for all cadlag $f$.)


11. Consider

\[ X(t) = 1 + \int_0^t \sigma\sqrt{X(s)}\,dW(s) + bt \]

and let $\tau = \inf\{t : X(t) = 0\}$. Give necessary and sufficient conditions on $\sigma$ and $b > 0$ for $P\{\tau < \infty\} > 0$.

12. Consider

\[ X(t) = 1 + \int_0^t \sigma(X(s))\,dW(s) + \int_0^t b(X(s))\,ds \]

where $d = m = 1$, $\sigma$ and $b$ are continuous, and $\inf_{x \in \mathbb{R}} \sigma(x) > 0$. Give necessary and sufficient conditions on $\sigma$ and $b$ for $P\{\lim_{t \to \infty} X(t) = \infty\} = 1$.


List of topics

1. Review of basic concepts of probability

2. From probability to measure theory


4. Information and sigma algebras

5. Conditional expectations

6. Martingales

7. The central limit theorem and Brownian motion

8. Martingale properties of Brownian motion

9. Sample path properties of Brownian motion

10. The Markov property

11. Definition of the Ito integral

12. Properties and examples

13. Ito’s formula

14. Examples and applications

15. Black-Scholes formula

16. Multivariable Ito formula

17. Brownian bridge

18. Convergence of empirical distribution functions


20. Examples and the Markov property

21. SDEs and partial differential equations

22. Examples

23. Multidimensional SDEs

24. Examples

25. Change of probability measure

26. Girsanov formula

27. Pricing and risk neutral measures

28. Martingale representation

29. Fundamental theorems of asset pricing

30. Applications
