An Introduction to Stochastic Calculus
Haijun Li
[email protected]
Department of Mathematics and Statistics
Washington State University
Lisbon, May 2018
Haijun Li An Introduction to Stochastic Calculus Lisbon, May 2018 1 / 169
Outline
Basic Concepts from Probability Theory
Random Vectors
Stochastic Processes
Notations
Sample or outcome space Ω := all possible outcomes ω of the underlying experiment.
σ-field or σ-algebra F: a non-empty class of subsets (observable events) of Ω closed under countable unions, countable intersections and complements.
Probability measure P(·) on F: P(A) denotes the probability of event A.
Random variable X : Ω → R is a real-valued measurable function defined on Ω. That is, the events X^{-1}((a,b)) ∈ F are observable for all a, b ∈ R.
Induced probability measure: P_X(B) := P(X ∈ B) = P(ω : X(ω) ∈ B), for any Borel set B ⊆ R.
Distribution function: F_X(x) := P(X ≤ x), x ∈ R.
Continuous and Discrete Random Variables
Random variable X is said to be continuous if the distribution function F_X has no jumps, that is,

lim_{h→0} F_X(x + h) = F_X(x), ∀x ∈ R.

Most continuous distributions of interest have a density f_X ≥ 0:

F_X(x) = ∫_{−∞}^{x} f_X(y) dy, x ∈ R,

where ∫_{−∞}^{∞} f_X(y) dy = 1.
Random variable X is said to be discrete if the distribution function F_X is a pure jump function:

F_X(x) = Σ_{k : x_k ≤ x} p_k, x ∈ R,

where the probability mass function p_k satisfies 0 ≤ p_k ≤ 1 and Σ_{k=1}^{∞} p_k = 1.
Expectation, Variance and Moments
A General Formula
For a real-valued function g, the expectation of g(X) is given by Eg(X) = ∫ g(x) dF_X(x).

The k-th moment of X is given by E(X^k) = ∫ x^k dF_X(x). The mean μ_X (or "center of gravity") of X is the first moment.
The variance (a measure of "spread") of X is defined as σ²_X = var(X) := E(X − μ_X)². Clearly σ²_X = E(X²) − μ²_X.
If the variance exists, then the Chebyshev inequality holds:

P(|X − μ_X| > kσ_X) ≤ k^{−2}, k > 0.

That is, the probability of the tail regions more than k standard deviations away from the mean is bounded by 1/k².
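The bound is easy to verify by simulation. A minimal sketch in Python/NumPy (the normal distribution, its parameters, and the seed are arbitrary illustrations, not from the slides; any distribution with a finite variance works):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 1.0, 2.0
x = rng.normal(mu, sigma, size=100_000)  # sample with known mean and std dev

for k in (1.5, 2.0, 3.0):
    tail = np.mean(np.abs(x - mu) > k * sigma)  # empirical P(|X - mu| > k*sigma)
    print(k, tail, k**-2)  # empirical tail vs. Chebyshev bound 1/k^2
```

For the normal distribution the bound is quite loose (e.g. the true 2σ tail probability is about 0.046 versus the bound 0.25); Chebyshev trades sharpness for complete generality.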
Random VectorsLet (Ω,F ,P) be a probability space.
X = (X_1, …, X_d) : Ω → R^d denotes a d-dimensional random vector, where its components X_1, …, X_d are real-valued random variables.
The induced probability measure: P_X(B) = P(X ∈ B) := P(ω : X(ω) ∈ B) for all Borel subsets B of R^d.
The distribution function: F_X(x) := P(X_1 ≤ x_1, …, X_d ≤ x_d), x = (x_1, …, x_d) ∈ R^d.
If X has a density f_X ≥ 0, then

F_X(x) = ∫_{−∞}^{x_1} ⋯ ∫_{−∞}^{x_d} f_X(y) dy

with ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f_X(y) dy = 1.
For any J ⊆ {1, …, d}, let X_J := (X_j, j ∈ J) be the J-margin of X. The marginal density of X_J is given by

f_{X_J}(x_J) = ∫ f_X(x) dx_{J^c}.
Expectation, Variance, and Covariance
The expectation or mean value of X is denoted by μ_X = EX := (E(X_1), …, E(X_d)).
The covariance matrix of X is defined as

Σ_X := (cov(X_i, X_j); i, j = 1, …, d),

where the covariance of X_i and X_j is defined as

cov(X_i, X_j) := E[(X_i − μ_{X_i})(X_j − μ_{X_j})] = E(X_i X_j) − μ_{X_i} μ_{X_j}.

The correlation of X_i and X_j is denoted by

corr(X_i, X_j) := cov(X_i, X_j) / (σ_{X_i} σ_{X_j}).

It follows from the Cauchy-Schwarz inequality that −1 ≤ corr(X_i, X_j) ≤ 1.
Independence and Dependence
The events A_1, …, A_n are independent if for any 1 ≤ i_1 < i_2 < ⋯ < i_k ≤ n,

P(∩_{j=1}^{k} A_{i_j}) = ∏_{j=1}^{k} P(A_{i_j}).

The random variables X_1, …, X_n are independent if for any Borel sets B_1, …, B_n, the events {X_1 ∈ B_1}, …, {X_n ∈ B_n} are independent.
The random variables X_1, …, X_n are independent if and only if F_{X_1,…,X_n}(x_1, …, x_n) = ∏_{i=1}^{n} F_{X_i}(x_i) for all (x_1, …, x_n) ∈ R^n.
The random variables X_1, …, X_n are independent if and only if E[∏_{i=1}^{n} g_i(X_i)] = ∏_{i=1}^{n} E g_i(X_i) for any bounded measurable real-valued functions g_1, …, g_n.
In the continuous case, the random variables X_1, …, X_n are independent if and only if f_{X_1,…,X_n}(x_1, …, x_n) = ∏_{i=1}^{n} f_{X_i}(x_i) for all (x_1, …, x_n) ∈ R^n.
Two Examples
Let X = (X_1, …, X_d) have a d-dimensional Gaussian distribution. The random variables X_1, …, X_d are independent if and only if corr(X_i, X_j) = 0 for i ≠ j.
For non-Gaussian random vectors, however, independence and uncorrelatedness are not equivalent. Let X be a standard normal random variable. Since both X and X³ have expectation zero, X and X² are uncorrelated:

cov(X, X²) = E(X³) − EX E(X²) = 0.

But X and X² are clearly dependent. Since {X ∈ [−1,1]} = {X² ∈ [0,1]}, we obtain

P(X ∈ [−1,1], X² ∈ [0,1]) = P(X ∈ [−1,1])
> [P(X ∈ [−1,1])]² = P(X ∈ [−1,1]) P(X² ∈ [0,1]).
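Both facts can be confirmed numerically. A Monte Carlo sketch (sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)

# uncorrelated: cov(X, X^2) = E(X^3) - E(X) E(X^2) ~ 0
cov = np.mean(x**3) - x.mean() * np.mean(x**2)

# dependent: the joint probability exceeds the product of the marginals
p = np.mean(np.abs(x) <= 1.0)                       # P(X in [-1,1]) ~ 0.683
joint = np.mean((np.abs(x) <= 1.0) & (x**2 <= 1.0))  # equals p, since the events coincide
print(cov, joint, p * p)                             # cov ~ 0, joint ~ 0.683 > 0.466
```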
Autocorrelations
For a time series X_0, X_1, X_2, …, the autocorrelation at lag h is defined by corr(X_0, X_h), h = 0, 1, ….
Log-returns: X_t := log(S_t / S_{t−1}), where S_t is the price of a speculative asset (equities, indexes, exchange rates, commodities) at the end of the t-th period. If the relative returns are small, then X_t ≈ (S_t − S_{t−1}) / S_{t−1}. Note that the log-returns are scale-free, additive, stationary, ….
Stylized Fact #1: Log-returns X_t are not iid (independent and identically distributed), although they show little serial autocorrelation.
Stylized Fact #2: Series of absolute returns |X_t| or squared returns X_t² show profound serial autocorrelation (long-range dependence).
Stochastic Processes
A stochastic process X := (X_t, t ∈ T) is a collection of random variables defined on some space Ω, where T ⊆ R.
If the index set T is a finite or countably infinite set, X is said to be a discrete-time process. If T is an interval, then X is a continuous-time process.
A stochastic process X is a (measurable) function of two variables: time t and sample point ω.
For fixed time t, X_t = X_t(ω), ω ∈ Ω, is a random variable.
For fixed sample point ω, X_t = X_t(ω), t ∈ T, is a sample path.
Example: An autoregressive process of order 1 is given by

X_t = φ X_{t−1} + Z_t, t ∈ Z,

where φ is a real parameter and (Z_t) is a noise sequence. Time series models can be understood as discretizations of stochastic differential equations.
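The AR(1) recursion is one line of code. A sketch assuming iid N(0,1) noise Z_t and |φ| < 1 (these are illustrative choices, not stated on the slide), in which case the stationary variance is 1/(1 − φ²) and the lag-1 autocorrelation is φ:

```python
import numpy as np

rng = np.random.default_rng(0)
phi, n = 0.8, 100_000
z = rng.standard_normal(n)

x = np.empty(n)
x[0] = z[0] / np.sqrt(1.0 - phi**2)   # start in the stationary distribution
for t in range(1, n):
    x[t] = phi * x[t - 1] + z[t]

var_emp = x.var()                               # ~ 1 / (1 - phi^2)
acf1 = np.corrcoef(x[:-1], x[1:])[0, 1]         # lag-1 autocorrelation ~ phi
print(var_emp, acf1)
```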
Finite-Dimensional Distributions
All possible values of a stochastic process X = (X_t, t ∈ T) constitute a function space of all sample paths (X_t(ω), t ∈ T), ω ∈ Ω.
Specifying the distribution of X on this function space is equivalent to specifying which information is available in terms of the observable events from the σ-field generated by X.
The distribution of X can be described by the distributions of the finite-dimensional vectors

(X_{t_1}, …, X_{t_n}), for all possible choices of times t_1, …, t_n ∈ T.

Example: A stochastic process is called Gaussian if all its finite-dimensional distributions are multivariate Gaussian. The distribution of such a process is determined by the collection of mean vectors and covariance matrices.
Expectation and Covariance Functions
The expectation function of a process X = (X_t, t ∈ T) is defined as

μ_X(t) := μ_{X_t} = EX_t, t ∈ T.

The covariance function of X is given by

C_X(t, s) := cov(X_t, X_s) = E[(X_t − EX_t)(X_s − EX_s)], t, s ∈ T.

In particular, the variance function of X is given by σ²_X(t) = C_X(t, t) = var(X_t), t ∈ T.
Example: A Gaussian white noise X = (X_t, 0 ≤ t ≤ 1) consists of iid N(0,1) random variables. In this case its finite-dimensional distributions are given by, for any 0 ≤ t_1 ≤ ⋯ ≤ t_n ≤ 1,

P(X_{t_1} ≤ x_1, …, X_{t_n} ≤ x_n) = ∏_{i=1}^{n} P(X_{t_i} ≤ x_i) = ∏_{i=1}^{n} Φ(x_i), ∀x ∈ R^n.

Its expectation and covariance functions are given by μ_X(t) = 0 and

C_X(t, s) = 1 if t = s, and C_X(t, s) = 0 if t ≠ s.
Dependence Structure
A process X = (X_t, t ∈ T) is said to be strictly stationary if for any t_1, …, t_n ∈ T and any h with t_i + h ∈ T,

(X_{t_1}, …, X_{t_n}) =_d (X_{t_1+h}, …, X_{t_n+h}).

That is, its finite-dimensional distribution functions are invariant under time shifts.
A process X = (X_t, t ∈ T) is said to have stationary increments if

X_t − X_s =_d X_{t+h} − X_{s+h}, ∀ t, s, t+h, s+h ∈ T.

A process X = (X_t, t ∈ T) is said to have independent increments if for all t_1 < ⋯ < t_n in T,

X_{t_2} − X_{t_1}, …, X_{t_n} − X_{t_{n−1}} are independent.
Strictly Stationary vs Stationary

A process X is said to be stationary (in the wide sense) if

μ_X(t + h) = μ_X(t) and C_X(t, s) = C_X(t + h, s + h).

If second moments exist, then strict stationarity implies stationarity.
Example: Consider a strictly stationary Gaussian process X. The distribution of X is determined by μ_X(0) and C_X(t, s) = g_X(|t − s|) for some function g_X. In particular, for Gaussian white noise X, g_X(0) = 1 and g_X(x) = 0 for any x ≠ 0.
Homogeneous Poisson Process
A stochastic process X = (X_t, t ≥ 0) is called a Poisson process with intensity rate λ > 0 if
X_0 = 0,
it has stationary, independent increments, and
for every t > 0, X_t has a Poisson distribution Poi(λt).

Simulation of Poisson Processes
Simulate iid exponential Exp(λ) random variables Y_1, Y_2, …, and set T_n := Σ_{i=1}^{n} Y_i. The Poisson process can be constructed by

X_t := #{n : T_n ≤ t}, t ≥ 0.

Example: Claims arriving in an insurance portfolio.
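The recipe above translates directly into code. A sketch (rate λ = 2 and the time horizon are arbitrary choices; the over-allocation of inter-arrival times is a pragmatic shortcut):

```python
import numpy as np

def poisson_path(lam, t_grid, rng):
    """Construct X_t = #{n : T_n <= t}, with T_n the partial sums of iid Exp(lam) gaps."""
    horizon = t_grid[-1]
    # draw comfortably more exponential inter-arrival times than are needed on [0, horizon]
    gaps = rng.exponential(1.0 / lam, size=int(10 * lam * horizon) + 100)
    arrivals = np.cumsum(gaps)                              # T_1, T_2, ...
    return np.searchsorted(arrivals, t_grid, side="right")  # counts the T_n <= t

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 101)
x = poisson_path(2.0, t, rng)   # one sample path; E X_10 = lam * 10 = 20
```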
Outline
Brownian Motion
Simulation of Brownian Sample Paths
Definition
A stochastic process B = (B_t, t ∈ [0,∞)) is called a (standard) Brownian motion or a Wiener process if
B_0 = 0,
it has stationary, independent increments,
for every t > 0, B_t has a normal N(0, t) distribution, and
it has continuous sample paths.

Historical Note: Brownian motion is named after the botanist Robert Brown, who first observed, in the 1820s, the irregular motion of pollen grains immersed in water. By the end of the nineteenth century, the phenomenon was understood by means of kinetic theory as a result of molecular bombardment. In 1900, L. Bachelier employed it to model the stock market, where the analogue of molecular bombardment is the interplay of the myriad of individual market decisions that determine the market price. Norbert Wiener (1923) was the first to put Brownian motion on a firm mathematical basis.
Distributional Properties of Brownian Motion
For any t > s, B_t − B_s =_d B_{t−s} − B_0 = B_{t−s} has an N(0, t − s) distribution. That is, the larger the interval, the larger the fluctuations of B on this interval.
μ_B(t) = EB_t = 0 and, for any t > s,

C_B(t, s) = E[((B_t − B_s) + B_s) B_s] = E(B_t − B_s) EB_s + s = min(s, t).
Brownian motion is a Gaussian process: its finite-dimensionaldistributions are multivariate Gaussian.
Question: How irregular are Brownian sample paths?
Self-Similarity
A stochastic process X = (X_t, t ≥ 0) is H-self-similar for some H > 0 if it satisfies the condition

(T^H X_{t_1}, …, T^H X_{t_n}) =_d (X_{T t_1}, …, X_{T t_n})

for every T > 0 and any choice of t_i ≥ 0, i = 1, …, n.
Self-similarity means that the properly scaled patterns of a sample path in any small or large time interval have a similar shape.

Non-Differentiability of Self-Similar Processes
For any H-self-similar process X with stationary increments and 0 < H < 1,

lim sup_{t↓t_0} |X_t − X_{t_0}| / (t − t_0) = ∞ at any fixed t_0.

That is, sample paths of H-self-similar processes are nowhere differentiable with probability 1.
Path Properties of Brownian Motion
Brownian motion is 0.5-self-similar.
Its sample paths are nowhere differentiable. That is, any sample path changes its shape in the neighborhood of any time epoch in a completely non-predictable fashion (Wiener, Paley and Zygmund, 1930s).

Unbounded Variation of Brownian Sample Paths

sup_τ Σ_{i=1}^{n} |B_{t_i}(ω) − B_{t_{i−1}}(ω)| = ∞, a.s.,

where the supremum is taken over all possible partitions τ: 0 = t_0 < ⋯ < t_n = T of any finite interval [0, T].

The unbounded variation and non-differentiability of Brownian sample paths are major reasons for the failure of classical integration methods, when applied to these paths, and for the introduction of stochastic calculus.
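The unbounded variation is easy to see numerically: over a partition of [0,1] into n equal steps, the sum of absolute Brownian increments has expectation sqrt(2n/π), which diverges as the mesh shrinks. A sketch (grid sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (100, 10_000, 1_000_000):
    # Brownian increments over [0,1]: iid N(0, 1/n)
    dB = rng.normal(0.0, np.sqrt(1.0 / n), size=n)
    tv = np.abs(dB).sum()   # ~ sqrt(2*n/pi): grows without bound as n increases
    print(n, tv)
```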
Brownian Bridge
Let B = (B_t, t ∈ [0,∞)) denote Brownian motion.
The process X := (B_t − tB_1, 0 ≤ t ≤ 1) satisfies X_0 = X_1 = 0. This process is called the (standard) Brownian bridge.
Since multivariate normal distributions are closed under linear transforms, the finite-dimensional distributions of X are Gaussian.
The Brownian bridge is characterized by the two functions μ_X(t) = 0 and C_X(t, s) = min(t, s) − ts, for all s, t ∈ [0,1].
The Brownian bridge appears as the limit process of the normalized empirical distribution function of a sample of iid uniform U(0,1) random variables.
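A quick Monte Carlo check of the covariance function C_X(t, s) = min(t, s) − ts (grid size, number of paths, and the test points s = 0.25, t = 0.75 are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, paths = 100, 20_000
dt = 1.0 / n
# Brownian motion sampled at t = 1/n, ..., 1 via cumulative sums of N(0, dt) increments
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(paths, n)), axis=1)
t_grid = np.arange(1, n + 1) * dt
X = B - t_grid * B[:, -1:]          # bridge X_t = B_t - t * B_1; X at t = 1 is exactly 0

s_i, t_i = 24, 74                   # s = 0.25, t = 0.75
cov_emp = np.mean(X[:, s_i] * X[:, t_i])   # mean is 0, so this estimates the covariance
cov_th = min(0.25, 0.75) - 0.25 * 0.75     # = 0.0625
print(cov_emp, cov_th)
```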
Brownian Motion with Drift
Let B = (B_t, t ∈ [0,∞)) denote Brownian motion.
The process X := (μt + σB_t, t ≥ 0), for constants σ > 0 and μ ∈ R, is called Brownian motion with (linear) drift.
X is a Gaussian process with expectation and covariance functions

μ_X(t) = μt, C_X(t, s) = σ² min(t, s), s, t ≥ 0.
Geometric Brownian Motion (Black, Scholes and Merton 1973)

The process X = (exp(μt + σB_t), t ≥ 0), for constants σ > 0 and μ ∈ R, is called geometric Brownian motion.
Since E e^{tZ} = e^{t²/2} for an N(0,1) random variable Z, it follows from the self-similarity of Brownian motion that

μ_X(t) = e^{μt} E e^{σB_t} = e^{μt} E e^{σ t^{1/2} B_1} = e^{(μ + 0.5σ²)t}.

Since B_t − B_s and B_s are independent for any s ≤ t, and B_t − B_s =_d B_{t−s}, then

C_X(t, s) = e^{(μ + 0.5σ²)(t+s)} (e^{σ² min(s,t)} − 1).

In particular, σ²_X(t) = e^{(2μ + σ²)t} (e^{σ²t} − 1).
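Both moment formulas can be verified by simulating X_t at a single time point (the parameter values are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, t = 0.1, 0.3, 1.0
B_t = rng.normal(0.0, np.sqrt(t), size=1_000_000)   # B_t ~ N(0, t)
X_t = np.exp(mu * t + sigma * B_t)                  # geometric Brownian motion at time t

mean_th = np.exp((mu + 0.5 * sigma**2) * t)                            # mu_X(t)
var_th = np.exp((2 * mu + sigma**2) * t) * (np.exp(sigma**2 * t) - 1)  # sigma^2_X(t)
print(X_t.mean(), mean_th, X_t.var(), var_th)
```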
Central Limit Theorem

Consider a sequence Y_1, Y_2, … of iid non-degenerate random variables with mean μ_Y = EY_1 and variance σ²_Y = var(Y_1) > 0. Define the partial sums: R_0 := 0, R_n := Σ_{i=1}^{n} Y_i, n ≥ 1.

Central Limit Theorem (CLT)
If Y_1 has finite variance, then the sequence (R_n) obeys the CLT via the following uniform convergence:

sup_{x∈R} |P((R_n − ER_n) / [var(R_n)]^{1/2} ≤ x) − Φ(x)| → 0, as n → ∞,

where Φ(x) denotes the distribution function of the standard normal distribution.

That is, for large sample size n, the distribution of (R_n − ER_n)/[var(R_n)]^{1/2} is approximately standard normal.
Functional Approximation
Let (Y_i) be a sequence of iid random variables with mean μ_Y = EY_1 and variance σ²_Y = var(Y_1) > 0. Consider the process S_n = (S_n(t), t ∈ [0,1]) with continuous sample paths on [0,1]:

S_n(t) = (σ²_Y n)^{−1/2} (R_i − μ_Y i) if t = i/n, i = 0, …, n, and linearly interpolated otherwise.

Example: If the Y_i are iid N(0,1), consider the restriction of the process S_n to the points i/n: S_n(i/n) = n^{−1/2} Σ_{k=1}^{i} Y_k, i = 0, …, n.
S_n(0) = 0.
S_n has independent increments: for any 0 ≤ i_1 ≤ ⋯ ≤ i_m ≤ n,

S_n(i_2/n) − S_n(i_1/n), …, S_n(i_m/n) − S_n(i_{m−1}/n) are independent.

For any 0 ≤ i ≤ n, S_n(i/n) has a normal N(0, i/n) distribution.
S_n and Brownian motion B on [0,1], when restricted to the points i/n, have very much the same properties.
Functional Central Limit Theorem
Let C[0,1] denote the space of all continuous functions defined on [0,1]. With the maximum norm, C[0,1] is a complete separable space.

Donsker's Theorem
If Y_1 has finite variance, then the process S_n obeys the functional CLT:

Eφ(S_n) → Eφ(B), as n → ∞,

for all bounded continuous functionals φ : C[0,1] → R, where B = (B_t, t ∈ [0,1]) is Brownian motion on [0,1].

The finite-dimensional distributions of S_n converge to the corresponding finite-dimensional distributions of B: as n → ∞,

P(S_n(t_1) ≤ x_1, …, S_n(t_m) ≤ x_m) → P(B_{t_1} ≤ x_1, …, B_{t_m} ≤ x_m),

for all possible t_i ∈ [0,1] and x_i ∈ R.
The max functional max_{0≤i≤n} S_n(i/n) converges in distribution to max_{0≤t≤1} B_t as n → ∞.
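The convergence of the max functional can be checked against the known law of the limit, P(max_{0≤t≤1} B_t ≤ x) = 2Φ(x) − 1 for x ≥ 0 (a consequence of the reflection principle, a standard fact not derived on these slides). A Monte Carlo sketch with iid N(0,1) steps:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n, paths = 400, 20_000
steps = rng.standard_normal((paths, n)) / math.sqrt(n)   # increments of S_n
S = np.cumsum(steps, axis=1)
m = np.maximum(S.max(axis=1), 0.0)                       # max over the grid, incl. S_n(0) = 0

x = 1.0
p_emp = np.mean(m <= x)
p_th = 2.0 * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))) - 1.0   # 2*Phi(1) - 1 ~ 0.683
print(p_emp, p_th)
```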
Functional CLT for Jump Processes
Stochastic processes are infinite-dimensional objects, and therefore unexpected events may happen. For example, the sample paths of the converging processes may fluctuate very wildly with increasing n. In order to rule out such irregular behavior, a so-called tightness or stochastic compactness condition must be satisfied.
The functional CLT remains valid for the jump process S̃_n = (S̃_n(t), t ∈ [0,1]), where

S̃_n(t) = (σ²_Y n)^{−1/2} (R_{[nt]} − μ_Y [nt])

and [nt] denotes the integer part of the real number nt.
In contrast to S_n, the process S̃_n is constant on the intervals [(i−1)/n, i/n) and has jumps at the points i/n.
S_n and S̃_n coincide at the points i/n, and the differences between these two processes are asymptotically negligible: the normalization n^{1/2} makes the jumps of S̃_n arbitrarily small for large n.
Simulating a Brownian Sample Path
Plot the paths of the process S_n, or of its jump-process variant S̃_n, for sufficiently large n, and get a reasonable approximation to Brownian sample paths.
Since Brownian motion appears as a distributional limit, completely different graphs may appear for different values of n, even for the same sequence of realizations Y_i(ω).

Simulating a Brownian Sample Path on [0,T]

Simulate one path of S_n, or of S̃_n, on [0,1], then scale the time interval by the factor T and the sample path by the factor T^{1/2}.
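A sketch of this recipe, taking iid N(0,1) steps for the Y_i (so σ_Y = 1; plotting is omitted):

```python
import numpy as np

def brownian_path(n, T, rng):
    """Approximate one Brownian path on [0, T]: simulate S_n at the points i/n
    on [0, 1], then scale time by T and the path values by sqrt(T)."""
    steps = rng.standard_normal(n) / np.sqrt(n)      # increments of S_n
    s = np.concatenate(([0.0], np.cumsum(steps)))    # S_n(i/n), i = 0, ..., n
    t = np.linspace(0.0, T, n + 1)
    return t, np.sqrt(T) * s

rng = np.random.default_rng(0)
t, b = brownian_path(1_000, T=4.0, rng=rng)          # b[-1] is exactly N(0, T) here
```

The 0.5-self-similarity of Brownian motion is what justifies the sqrt(T) space scaling.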
Lévy-Ciesielski Representation
Since Brownian sample paths are continuous functions, we can try to expand them in a series.
However, the paths are random functions: for different ω, we obtain different path functions. This means that the coefficients of the series are random variables.
Since the process is Gaussian, the coefficients must be Gaussian as well.

Lévy-Ciesielski Expansion
Brownian motion on [0,1] can be represented in the form

B_t(ω) = Σ_{n=1}^{∞} Z_n(ω) ∫_0^t φ_n(x) dx, t ∈ [0,1],

where the Z_n are iid N(0,1) random variables and (φ_n) is a complete orthonormal function system on [0,1].
Paley-Wiener Representation
There are infinitely many possible representations of Brownian motion.

Let (Z_n, n ≥ 0) be a sequence of iid N(0,1) random variables; then

B_t(ω) = Z_0(ω) t / (2π)^{1/2} + (2/π^{1/2}) Σ_{n=1}^{∞} Z_n(ω) sin(nt/2) / n, t ∈ [0,2π].

This series converges for every t, and uniformly for t ∈ [0,2π].

Simulating a Brownian Path via the Paley-Wiener Expansion
Calculate

Z_0(ω) t_j / (2π)^{1/2} + (2/π^{1/2}) Σ_{n=1}^{M} Z_n(ω) sin(n t_j / 2) / n, t_j = 2πj/N, for 0 ≤ j ≤ N.

The problem of choosing the "right" values for M and N is similar to the choice of the sample size n in the functional CLT.
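A sketch of the truncated expansion (the values of M and N are arbitrary illustrations; as noted above, there is no canonical choice):

```python
import numpy as np

def paley_wiener_path(M, N, rng):
    """Truncated Paley-Wiener series for Brownian motion on [0, 2*pi],
    evaluated on the grid t_j = 2*pi*j/N, 0 <= j <= N."""
    z = rng.standard_normal(M + 1)              # Z_0, Z_1, ..., Z_M
    t = 2.0 * np.pi * np.arange(N + 1) / N
    n = np.arange(1, M + 1)
    series = (np.sin(np.outer(t, n) / 2.0) / n) @ z[1:]
    return t, z[0] * t / np.sqrt(2.0 * np.pi) + (2.0 / np.sqrt(np.pi)) * series

rng = np.random.default_rng(0)
t, b = paley_wiener_path(M=200, N=100, rng=rng)  # b[0] = 0, and Var(B_t) ~ t
```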
Outline
Conditional Expectation: An Illustration
General Conditional Expectation
Discrete Conditioning
Let X be a random variable defined on a probability space (Ω, F, P), and B ⊂ Ω with P(B) > 0.

The conditional distribution function of X given B is defined as

F_X(x|B) := P(X ≤ x | B) = P({X ≤ x} ∩ B) / P(B).

The conditional expectation of X given B is given by

E(X|B) = ∫ x dF_X(x|B) = E(X I_B) / P(B),

where I_B(ω) is the indicator function of the event B.
E(X|B) can be viewed as our estimate of X given the information that the event B has occurred.
E(X|B^c) is defined similarly. Together, E(X|B) and E(X|B^c) provide our estimate of X depending on whether or not B occurs.
Conditional Expectation Under Discrete Conditioning
Think of I_B as a random variable that carries the information on whether the event B occurs. The conditional expectation E(X|I_B) of X given I_B is a random variable defined as

E(X|I_B)(ω) = E(X|B) if ω ∈ B, and E(X|I_B)(ω) = E(X|B^c) if ω ∉ B.

The random variable E(X|I_B) is our estimate of X based on the information provided by I_B.
Consider a discrete random variable Y on Ω taking distinct values y_i, i = 1, 2, …. Let A_i = {ω ∈ Ω : Y(ω) = y_i}. Note that Y carries the information on whether or not the events A_i occur.
Define the conditional expectation of X given Y:

E(X|Y)(ω) := E(X|A_i) = E(X|Y = y_i), if ω ∈ A_i, i = 1, 2, ….

The random variable E(X|Y) can be viewed as our estimate of X based on the information carried by Y.
Example: Uniform Random Variable
Consider the random variable X(ω) = ω on Ω = (0,1], endowed with the probability measure P((a,b]) := b − a for any (a,b] ⊂ (0,1].

Assume that one of the events

A_i = ((i−1)/n, i/n], i = 1, …, n,

occurred. Then

E(X|A_i) = (1/P(A_i)) ∫_{A_i} x f_X(x) dx = (1/2) · (2i−1)/n

(i.e., the center of A_i).
The value E(X|A_i) is the updated expectation on the new space A_i, given the information that A_i occurred.
Define Y(ω) := Σ_{i=1}^{n} ((i−1)/n) I_{A_i}(ω). The conditional expectation E(X|Y)(ω) = (1/2) · (2i−1)/n if ω ∈ A_i, i = 1, …, n.
Since E(X|Y)(ω) is the average of X given the information that ω ∈ ((i−1)/n, i/n], E(X|Y) is a coarser version of X, that is, an approximation to X, given the information on which of the A_i occurred.
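The interval centers can be recovered by simulation: average the samples of X falling in each A_i. A sketch (n = 4 and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.uniform(0.0, 1.0, size=200_000)          # X(omega) = omega on (0, 1]
i = np.clip(np.ceil(x * n).astype(int), 1, n)    # index of the interval A_i containing X

# E(X | A_i): the average of X over the samples that fall in A_i
cond = np.array([x[i == k].mean() for k in range(1, n + 1)])
centers = (2 * np.arange(1, n + 1) - 1) / (2 * n)  # (2i - 1)/(2n), the centers of the A_i
print(cond, centers)
```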
Properties of Conditional Expectation

The conditional expectation is linear: for random variables X_1, X_2 and constants c_1, c_2,

E(c_1X_1 + c_2X_2 | Y) = c_1E(X_1|Y) + c_2E(X_2|Y).

EX = E[E(X|Y)].
If X and Y are independent, then E(X|Y) = EX.
The random variable E(X|Y) is a (measurable) function of Y: E(X|Y) = g(Y), where

g(y) = Σ_{i=1}^{∞} E(X|Y = y_i) I_{{y_i}}(y).
σ-Fields

Observe that the values of Y did not really matter for the definition of E(X|Y) under discrete conditioning, but it was crucial that the conditioning events A_i describe the information carried by all the distinct values of Y.
That is, we estimate the random variable X via E(X|Y) based on the information provided by the observable events A_i and their composite events, such as A_i ∪ A_j and A_i ∩ A_j, ….

Definition of σ-Fields
A σ-field F on Ω is a collection of subsets (observable events) of Ω satisfying the following conditions:
∅ ∈ F and Ω ∈ F.
If A ∈ F, then A^c ∈ F.
If A_1, A_2, … ∈ F, then ∪_{i=1}^{∞} A_i ∈ F and ∩_{i=1}^{∞} A_i ∈ F.
Generated σ-Fields

For any collection C of events, let σ(C) denote the smallest σ-field containing C, obtained by adding all possible unions, intersections and complements. σ(C) is said to be generated by C.

The following are some examples.
F = {∅, Ω} = σ({∅}).
F = {∅, Ω, A, A^c} = σ({A}).
F = {A : A ⊆ Ω} = σ({A : A ⊆ Ω}).
Let C = {(a,b] : −∞ < a < b < ∞}; then any set in B^1 = σ(C) is called a Borel subset of R.
Let C = {(a, b] : −∞ < a_i < b_i < ∞, i = 1, …, d}; then any set in B^d = σ(C) is called a Borel subset of R^d.
σ-Fields Generated by Random Variables
Let Y be a discrete random variable taking distinct values y_i, i = 1, 2, …. Define

A_i = {ω : Y(ω) = y_i}, i = 1, 2, ….

A typical set in the σ-field σ({A_i}) is of the form

A = ∪_{i∈I} A_i, I ⊆ {1, 2, …}.

σ({A_i}) is called the σ-field generated by Y, and is denoted by σ(Y).
Let Y be a d-dimensional random vector and

A(a, b] = {ω : Y(ω) ∈ (a, b]}, −∞ < a_i < b_i < ∞, i = 1, …, d.

The σ-field σ({A(a, b] : a, b ∈ R^d}) is called the σ-field generated by Y, and is denoted by σ(Y).
σ(Y) provides the essential information about the structure of Y, and contains all the observable events {ω : Y(ω) ∈ C}, where C is a Borel subset of R^d.
σ-Fields Generated by Stochastic Processes
For a stochastic process Y = (Y_t, t ∈ T) and any (measurable) set C of functions on T, let

A(C) = {ω : the sample path (Y_t(ω), t ∈ T) belongs to C}.

The σ-field generated by the process Y is the smallest σ-field that contains all the events of the form A(C).
Example: For Brownian motion B = (B_t, t ≥ 0), let

F_t := σ(B_s, s ≤ t)

denote the σ-field generated by Brownian motion prior to time t. F_t contains the essential information about the structure of the process B on [0, t]. One can show that this σ-field is generated by all sets of the form

A_{t_1,…,t_n}(C) = {ω : (B_{t_1}(ω), …, B_{t_n}(ω)) ∈ C}

for all n-dimensional Borel sets C and times t_1, …, t_n ≤ t.
Information Represented by σ-Fields
For a random variable, a random vector or a stochastic process Y on Ω, the σ-field σ(Y) generated by Y contains the essential information about the structure of Y as a function of ω ∈ Ω. It consists of all subsets {ω : Y(ω) ∈ C} for suitable sets C.
Because Y generates a σ-field, we also say that Y contains the information represented by σ(Y), or that Y carries the information σ(Y).
For any measurable function f acting on Y, since

{ω : f(Y(ω)) ∈ C} = {ω : Y(ω) ∈ f^{−1}(C)}, ∀ measurable sets C,

we have σ(f(Y)) ⊆ σ(Y). That is, a function f acting on Y does not provide new information about the structure of Y.
Example: For Brownian motion B = (B_t, t ≥ 0), consider the function f(B) = sup_{0≤t≤1} B_t. Then σ(f(B)) ⊂ σ(B_s, s ≤ t) for any t ≥ 1.
The General Conditional Expectation
Let (Ω, F, P) be a probability space, and let Y, Y_1 and Y_2 denote random variables (or random vectors, or stochastic processes) defined on Ω.

The information of Y is contained in F, i.e. Y does not contain more information than F ⇔ σ(Y) ⊆ F.
Y_1 contains more information than Y_2 ⇔ σ(Y_2) ⊆ σ(Y_1).

Conditional Expectation Given a σ-Field
Let X be a random variable defined on Ω. The conditional expectation given F is a random variable, denoted by E(X|F), with the following properties:

E(X|F) does not contain more information than that contained in F: σ(E(X|F)) ⊆ F.
For any event A ∈ F, E(X I_A) = E(E(X|F) I_A).

By virtue of the Radon-Nikodym theorem, one can show the existence and almost sure (a.s.) uniqueness of E(X|F).
Conditional Expectation Given Generated Information
Let Y be a random variable (random vector or stochastic process) on Ω. The conditional expectation of X given Y, denoted by E(X|Y), is defined as E(X|Y) := E(X|σ(Y)).

The random variables X and E(X|F) are "close" to each other, not in the sense that they coincide for every ω, but in the sense that the averages (expectations) of X and E(X|F) over suitable sets A coincide.
The conditional expectation E(X|F) is a coarser version of the original random variable X and is our estimate of X given the information F.

Example: Let Y be a discrete random variable taking distinct values y_i, i = 1, 2, …. Any set A ∈ σ(Y) can be written as

A = ∪_{i∈I} A_i = ∪_{i∈I} {ω : Y(ω) = y_i}, for some I ⊆ {1, 2, …}.

Let Z := E(X|Y). Then σ(Z) ⊂ σ(Y) and Z(ω) = E(X|A_i) for ω ∈ A_i. Observe that

E(X I_A) = E(X Σ_{i∈I} I_{A_i}) = Σ_{i∈I} E(X I_{A_i}) = Σ_{i∈I} E(X|A_i) P(A_i) = E(Z I_A).
Special Cases

Classical Conditional Expectation: Let B be an event with P(B) > 0 and P(B^c) > 0. Define F_B := σ({B}) = {∅, Ω, B, B^c}. Then

E(X|F_B)(ω) = E(X|B), for ω ∈ B.

Classical Conditional Probability: If X = I_A, then

E(I_A|F_B)(ω) = E(I_A|B) = P(A ∩ B) / P(B), for ω ∈ B.
Rules for Calculation of Conditional Expectations
Let X, X_1, X_2 denote random variables defined on (Ω, F, P).

1. For any two constants c_1, c_2, E(c_1X_1 + c_2X_2|F) = c_1E(X_1|F) + c_2E(X_2|F).
2. EX = E[E(X|F)].
3. If X and F are independent, then E(X|F) = EX. In particular, if X and Y are independent, then E(X|Y) = EX.
4. If σ(X) ⊂ F, then E(X|F) = X. In particular, if X is a function of Y, then σ(X) ⊂ σ(Y) and E(X|Y) = X.
5. If σ(X) ⊂ F, then E(XX_1|F) = X E(X_1|F). In particular, if X is a function of Y, then σ(X) ⊂ σ(Y) and E(XX_1|Y) = X E(X_1|Y).
6. If F and F′ are two σ-fields with F ⊂ F′, then E(X|F) = E[E(X|F′)|F] and E(X|F) = E[E(X|F)|F′].
7. Let G be a stochastic process with σ(G) ⊂ F. If X and F are independent, then for any function h(x, y),

E[h(X, G)|F] = E(E_X[h(X, G)]|F),

where E_X[h(X, G)] denotes the expectation of h(X, G) taken with respect to X.
Examples

Example 1: If X and Y are independent, then

E(XY|Y) = Y EX, and E(X + Y|Y) = EX + Y.

Example 2: Consider Brownian motion B = (B_t, t ≥ 0). The σ-fields F_s = σ(B_x, x ≤ s) represent an increasing stream of information about the structure of the process. Find E(B_t|F_s) = E(B_t|B_x, x ≤ s) for s ≥ 0.

If s ≥ t, then F_s ⊃ F_t and thus E(B_t|F_s) = B_t.
If s < t, then E(B_t|F_s) = E(B_t − B_s|F_s) + E(B_s|F_s) = 0 + B_s = B_s.
Hence E(B_t|F_s) = B_{min(s,t)}.
Another Example: Squared Brownian Motion
Consider again Brownian motion B = (B_t, t ≥ 0), with the σ-fields F_s = σ(B_x, x ≤ s). Define X_t := B_t² − t, t ≥ 0.

If s ≥ t, then F_s ⊃ F_t and thus E(X_t|F_s) = X_t.
If s < t, observe that

X_t = [(B_t − B_s) + B_s]² − t = (B_t − B_s)² + B_s² + 2B_s(B_t − B_s) − t.

Since B_t − B_s and (B_t − B_s)² are independent of F_s, we have

E[(B_t − B_s)²|F_s] = E(B_t − B_s)² = t − s,
E[B_s(B_t − B_s)|F_s] = B_s E(B_t − B_s) = 0.

Since σ(B_s²) ⊂ σ(B_s) ⊂ F_s, we have E(B_s²|F_s) = B_s².
Thus E(X_t|F_s) = (t − s) + B_s² − t = B_s² − s = X_s.
Hence E(X_t|F_s) = X_{min(s,t)}.
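Both conditional-expectation identities can be checked by Monte Carlo, conditioning approximately on {B_s ≈ b} with a thin bin (a sketch; the values of s, t, b and the bin width are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
s, t, paths = 1.0, 2.0, 1_000_000
B_s = rng.normal(0.0, np.sqrt(s), size=paths)
B_t = B_s + rng.normal(0.0, np.sqrt(t - s), size=paths)   # independent increment

b = 0.7
near = np.abs(B_s - b) < 0.05         # crude stand-in for conditioning on B_s = b
e1 = B_t[near].mean()                 # ~ E(B_t | B_s = b)       = b
e2 = (B_t[near]**2 - t).mean()        # ~ E(B_t^2 - t | B_s = b) = b^2 - s
print(e1, b, e2, b**2 - s)
```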
The Projection Property of Conditional Expectations
We now formulate precisely the meaning of the statement that the conditional expectation E(X|F) can be understood as the optimal estimate of X given the information F. Define

L²(F) := {Z : σ(Z) ⊂ F, EZ² < ∞}.

If F = σ(Y), then Z ∈ L²(σ(Y)) implies that Z is a function of Y.

The Projection Property

Let X be a random variable with EX² < ∞. The conditional expectation E(X|F) is the random variable in L²(F) which is closest to X in the mean square sense:

E[X − E(X|F)]² = min_{Z ∈ L²(F)} E(X − Z)².

If F = σ(Y), then E(X|Y) is the function of Y which has a finite second moment and which is closest to X in the mean square sense.
The Best Prediction Based on Available Information

It follows from the projection property that the conditional expectation E(X|F) can be viewed as the best prediction of X given the information F.
For example, for Brownian motion B = (B_t, t ≥ 0), we have, for s ≤ t,

E(B_t|B_x, x ≤ s) = B_s, and E(B_t² − t|B_x, x ≤ s) = B_s² − s.

That is, the best predictions of the future values B_t and B_t² − t, given the information about Brownian motion up to the present time s, are the present values B_s and B_s² − s, respectively. This property characterizes the whole class of martingales with a finite second moment.
Outline
Martingales
Martingale Transforms
Filtration

Let (F_t, t ≥ 0) be a collection of σ-fields on the same probability space (Ω, F, P) with F_t ⊆ F for all t ≥ 0.

Definition
The collection (F_t, t ≥ 0) of σ-fields on Ω is called a filtration if

F_s ⊆ F_t, ∀ 0 ≤ s ≤ t.

A filtration represents an increasing stream of information.
The index t can be discrete; for example, the filtration (F_n, n = 0, 1, …) is a sequence of σ-fields on Ω with F_n ⊆ F_{n+1} for all n ≥ 0.
Adapted Processes
A filtration is usually linked up with a stochastic process.
Definition
The stochastic process Y = (Y_t, t ≥ 0) is said to be adapted to the filtration (F_t, t ≥ 0) if
σ(Y_t) ⊆ F_t, ∀ t ≥ 0.
The stochastic process Y is always adapted to the natural filtration generated by Y:
F_t = σ(Y_s, s ≤ t).
For a discrete-time process Y = (Y_n, n = 0, 1, ...), adaptedness means σ(Y_n) ⊆ F_n for all n ≥ 0.
Example
Let (B_t, t ≥ 0) denote Brownian motion and (F_t, t ≥ 0) denote the corresponding natural filtration. Stochastic processes of the form
X_t = f(t, B_t), t ≥ 0, where f is a function of two variables,
are adapted to (F_t, t ≥ 0).
Examples: X_t^(1) = B_t and X_t^(2) = B_t^2 − t.
More examples: X_t^(3) = max_{0≤s≤t} B_s and X_t^(4) = max_{0≤s≤t} B_s^2.
Examples that are not adapted to the Brownian motion filtration: X_t^(5) = B_{t+1} and X_t^(6) = B_t + B_T for some fixed number T > 0.
Definition
If the stochastic process Y = (Y_t, t ≥ 0) is adapted to the natural Brownian filtration (F_t, t ≥ 0) (that is, Y_t is a function of (B_s, s ≤ t) for all t ≥ 0), we will say that Y is adapted to Brownian motion.
Adapted to Different Filtrations
Consider Brownian motion (B_t, t ≥ 0) and the corresponding natural filtration F_t = σ(B_s, s ≤ t). The stochastic process
X_t := B_t^2, t ≥ 0,
generates its own natural filtration F′_t = σ(B_s^2, s ≤ t), t ≥ 0. The process (X_t, t ≥ 0) is adapted to both (F′_t) and (F_t).
Observe that F′_t ⊂ F_t. For example, from B_t^2 we can reconstruct the whole information about |B_t|, but not about B_t: we can say nothing about the sign of B_t.
Market Information or Information Histories
Share prices, exchange rates, interest rates, etc., can be modelled by solutions of stochastic differential equations which are driven by Brownian motion.
These solutions are then functions of Brownian motion.
The fluctuations of these processes actually represent the information about the market. This relevant knowledge is contained in the natural filtration.
In finance there are always people who know more than the others. For example, they might know that an essential political decision will be taken in the very near future which will completely change the financial landscape.
This enables the informed persons to act with more competence than the others. Thus they have their own filtrations, which can be bigger than the natural filtration.
Martingale
If the information F_s and X are dependent, we can expect that knowing F_s reduces the uncertainty about the value of X_t at t > s. That is, X_t can be better predicted via E(X_t | F_s) with the information F_s than without it.
Definition
The stochastic process X = (X_t, t ≥ 0) adapted to the filtration (F_t, t ≥ 0) is called a continuous-time martingale with respect to (F_t, t ≥ 0) if
1. E|X_t| < ∞ for all t ≥ 0.
2. X_s is the best prediction of X_t given F_s: E(X_t | F_s) = X_s for all 0 ≤ s ≤ t.
A discrete-time martingale is defined similarly, with the second condition replaced by E(X_{n+1} | F_n) = X_n, ∀ n = 0, 1, ....
A martingale has the remarkable property that its expectation function is constant: EX_t = E[E(X_t | F_s)] = EX_s for all 0 ≤ s ≤ t.
Example: Partial Sums
Let (Z_n) be a sequence of independent random variables with finite expectations and Z_0 = 0. Consider the partial sums
R_n = ∑_{i=0}^n Z_i, n ≥ 0,
and the corresponding natural filtration F_n = σ(R_0, ..., R_n) = σ(Z_0, ..., Z_n), n ≥ 0.
Observe that
E(R_{n+1} | F_n) = E(R_n | F_n) + E(Z_{n+1} | F_n) = R_n + EZ_{n+1},
and hence, if EZ_n = 0 for all n ≥ 0, then (R_n, n ≥ 0) is a martingale with respect to the filtration (F_n, n ≥ 0).
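The constant-expectation property of these partial sums can be checked numerically; the following is a quick Monte Carlo sanity check (the increment distribution and sample sizes are our choices, not from the slides).

```python
import numpy as np

# Sanity check: with centered, independent increments Z_i, the partial
# sums R_n form a martingale, so E(R_n) stays at R_0 = 0 for every n.
rng = np.random.default_rng(0)
Z = rng.uniform(-1.0, 1.0, size=(100_000, 50))   # independent, EZ_i = 0
R = Z.cumsum(axis=1)                              # partial sums R_1, ..., R_50
mean_R = R.mean(axis=0)                           # Monte Carlo estimate of E(R_n)
print(float(np.abs(mean_R).max()))                # stays close to 0
```

With 100,000 paths the largest deviation of the estimated E(R_n) from 0 is well within Monte Carlo error.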
Collecting Information About a Random Variable
Let Z be a random variable on Ω with E|Z| < ∞ and (F_t, t ≥ 0) be a filtration on Ω. Define
X_t = E(Z | F_t), t ≥ 0.
Since F_t increases as time goes by, X_t gives us more and more information about the random variable Z. In particular, if σ(Z) ⊆ F_t for some t, then X_t = Z. An appeal to Jensen's inequality yields
E|X_t| = E|E(Z | F_t)| ≤ E[E(|Z| | F_t)] = E|Z| < ∞.
σ(X_t) ⊆ F_t.
For s ≤ t, E(X_t | F_s) = E[E(Z | F_t) | F_s] = E(Z | F_s) = X_s.
X is a martingale with respect to (Ft , t ≥ 0).
Brownian Motion is a Martingale
Let B = (B_t, t ≥ 0) be Brownian motion with the natural filtration F_t = σ(B_s, s ≤ t).
B and (B_t^2 − t, t ≥ 0) are martingales with respect to the natural filtration.
(B_t^3 − 3tB_t, t ≥ 0) is a martingale.
Martingale Transform
Let X = (X_n, n = 0, 1, ...) be a discrete-time martingale with respect to the filtration (F_n, n = 0, 1, ...). Let Y_n := X_n − X_{n−1}, n ≥ 1, and Y_0 := X_0. The sequence Y = (Y_n, n = 0, 1, ...) is called a martingale difference sequence with respect to the filtration (F_n, n = 0, 1, ...).
Consider a stochastic process C = (C_n, n = 1, 2, ...) satisfying σ(C_n) ⊆ F_{n−1}, n ≥ 1. Given F_{n−1}, we completely know C_n at time n − 1. Such a sequence is called predictable with respect to (F_n, n = 0, 1, ...).
Define
Z_0 = 0, Z_n = ∑_{i=1}^n C_i Y_i = ∑_{i=1}^n C_i(X_i − X_{i−1}), n ≥ 1.
The process C · Y := (Z_n, n ≥ 0) is called the martingale transform of Y by C.
Note that if C_n = 1 for all n ≥ 1, then Z_n = X_n − X_0, so C · Y recovers the original martingale up to its starting value.
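A minimal computational sketch of this definition follows (the function and variable names, and the ±1 game used to exercise it, are our illustrative choices):

```python
import numpy as np

# Sketch: the martingale transform Z_n = sum_{i<=n} C_i (X_i - X_{i-1})
# for a predictable stake sequence C.
def martingale_transform(C, X):
    """C[i-1] holds the stake C_i (chosen from F_{i-1}); X holds X_0, ..., X_n."""
    Y = np.diff(X)                                 # differences Y_i = X_i - X_{i-1}
    return np.concatenate([[0.0], np.cumsum(C * Y)])

rng = np.random.default_rng(1)
steps = rng.choice([-1.0, 1.0], size=10)           # fair +-1 game
X = np.concatenate([[0.0], steps.cumsum()])        # martingale path X_0, ..., X_10
C = np.abs(X[:-1]) + 1.0                           # stake C_i depends only on X_0..X_{i-1}
Z = martingale_transform(C, X)
print(Z)
```

Note that C is built from X[:-1] only, which is exactly the predictability requirement σ(C_n) ⊆ F_{n−1}.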
Martingale Transform Leads to a Martingale
Assume that the second moments of Cn and Yn are finite.
It follows from the Cauchy-Schwarz inequality that
E|Z_n| ≤ ∑_{i=1}^n E|C_i Y_i| ≤ ∑_{i=1}^n [EC_i^2 · EY_i^2]^{1/2} < ∞.
Since Y_1, ..., Y_n do not carry more information than F_n, and σ(C_1, ..., C_n) ⊆ F_{n−1} (predictability), we have σ(Z_n) ⊆ F_n.
Due to the predictability of C,
E(Z_n − Z_{n−1} | F_{n−1}) = E(C_n Y_n | F_{n−1}) = C_n E(Y_n | F_{n−1}) = 0.
(Z_n − Z_{n−1}, n ≥ 1) is a martingale difference sequence, and (Z_n, n ≥ 0) is a martingale with respect to (F_n, n = 0, 1, ...).
A Brownian Martingale Transform
Consider Brownian motion B = (B_s, s ≤ t) and a partition
0 = t_0 < t_1 < · · · < t_{n−1} < t_n = t.
The σ-fields at these time instants are described by the filtration
F_0 = {∅, Ω}, F_i = σ(B_{t_j}, 1 ≤ j ≤ i), i = 1, ..., n.
The sequence ∆B := (∆_i B, 1 ≤ i ≤ n) defined by
∆_0 B = 0, ∆_i B = B_{t_i} − B_{t_{i−1}}, i = 1, ..., n,
forms a martingale difference sequence with respect to the filtration (F_i, 1 ≤ i ≤ n).
(B_{t_{i−1}}, 1 ≤ i ≤ n) is predictable with respect to (F_i, 1 ≤ i ≤ n).
The martingale transform B · ∆B is then a martingale: (B · ∆B)_k = ∑_{i=1}^k B_{t_{i−1}}(B_{t_i} − B_{t_{i−1}}), k = 1, ..., n.
This is precisely a discrete-time analogue of the Itô stochastic integral ∫_0^t B_s dB_s.
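This discrete analogue can be computed directly; the sketch below (grid size is our choice) evaluates the transform on a simulated path and compares it with the value (1/2)(B_t^2 − t) that the Itô integral will turn out to have.

```python
import numpy as np

# Sketch: the martingale transform sum_i B_{t_{i-1}} (B_{t_i} - B_{t_{i-1}})
# approximates the Ito integral int_0^t B_s dB_s = (B_t^2 - t)/2.
rng = np.random.default_rng(2)
t, n = 1.0, 100_000
dB = rng.normal(0.0, np.sqrt(t / n), size=n)      # Brownian increments
B = np.concatenate([[0.0], dB.cumsum()])           # B at the grid points
ito_sum = np.sum(B[:-1] * dB)                      # left-endpoint evaluation
exact = 0.5 * (B[-1]**2 - t)                       # value from Ito's formula
print(ito_sum, exact)
```

The left-endpoint evaluation is essential; it is what makes the sums a martingale transform.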
Martingale as a Fair Game
Let X = (X_n, n = 0, 1, ...) be a discrete-time martingale with respect to the filtration (F_n, n = 0, 1, ...). Let Y_n = X_n − X_{n−1}, n ≥ 1, denote the martingale differences, and let C_n, n ≥ 1, be predictable with respect to (F_n, n = 0, 1, ...).
Think of Y_n as your net winnings per unit stake in the n-th game, adapted to the filtration (F_n, n = 0, 1, ...).
Your stake C_n in the n-th game does not contain more information than F_{n−1} does. At time n − 1, this is the best information we have about the game.
C_n Y_n is the net winnings for stake C_n in the n-th game.
(C · Y)_n = ∑_{i=1}^n C_i Y_i is the net winnings up to time n.
The game is fair because the best prediction of the net winnings C_n Y_n of the n-th game, just before the n-th game starts, is zero: E(C_n Y_n | F_{n−1}) = 0.
Outline
The Itô Integrals
The Stratonovich Integrals
Integrating With Respect To a Function
Let B = (B_t, t ≥ 0) be Brownian motion.
Goal: Define an integral of the type ∫_0^1 f(t) dB_t(ω), where f(t) is a function or a stochastic process on [0,1] and B_t(ω) is a Brownian sample path.
Difficulty: The path B_t(ω) does not have a derivative.
The pathwise integral of the Riemann-Stieltjes type is one option. Consider a partition of the interval [0,1]:
τ_n: 0 = t_0 < t_1 < t_2 < · · · < t_{n−1} < t_n = 1, n ≥ 1.
Let f and g be two real-valued functions on [0,1], define ∆_i g := g(t_i) − g(t_{i−1}), 1 ≤ i ≤ n, and form the Riemann-Stieltjes sum
S_n = ∑_{i=1}^n f(y_i) ∆_i g = ∑_{i=1}^n f(y_i)(g(t_i) − g(t_{i−1})),
for t_{i−1} ≤ y_i ≤ t_i, i = 1, ..., n.
Riemann-Stieltjes Integrals
Definition
If the limit S = lim_{n→∞} S_n exists as mesh(τ_n) → 0, and S is independent of the choice of the partitions τ_n and of the intermediate values y_i, then S, denoted by ∫_0^1 f(t) dg(t), is called the Riemann-Stieltjes integral of f with respect to g on [0,1].
When does the Riemann-Stieltjes integral ∫_0^1 f(t) dg(t) exist, and is it possible to take g = B for Brownian motion B on [0,1]?
One usual assumption is that f is continuous and g has bounded variation:
sup_{τ_n} ∑_{i=1}^n |g(t_i) − g(t_{i−1})| < ∞.
But Brownian sample paths B_t(ω) do not have bounded variation.
Bounded p-Variation
The real function h on [0,1] is said to have bounded p-variation for some p > 0 if
sup_{τ_n} ∑_{i=1}^n |h(t_i) − h(t_{i−1})|^p < ∞,
where the supremum is taken over all partitions τ_n of [0,1].
Brownian motion has bounded p-variation on any fixed finite interval, provided that p > 2, and unbounded p-variation for p ≤ 2.
A Sufficient and Almost Necessary Condition
The Riemann-Stieltjes integral ∫_0^1 f(t) dg(t) exists if
1. The functions f and g do not have discontinuities at the same point t ∈ [0,1].
2. The function f has bounded p-variation and the function g has bounded q-variation such that p^{−1} + q^{−1} > 1.
Existence of the Riemann-Stieltjes Integral
Assume that f is a differentiable function with bounded derivative f′(t) on [0,1]. Then f has bounded variation.
The Riemann-Stieltjes integral
∫_0^1 f(t) dB_t(ω)
exists for every Brownian sample path B_t(ω).
Examples: ∫_0^1 t^k dB_t(ω) for k ≥ 0, ∫_0^1 e^t dB_t(ω), ...
But existence does not mean that you can evaluate these integrals explicitly in terms of Brownian motion.
A more serious issue: how to define ∫_0^1 B_t(ω) dB_t(ω)?
Brownian motion has bounded p-variation for p > 2, not for p ≤ 2, and so the sufficient condition 2p^{−1} > 1 for the existence of the Riemann-Stieltjes integral is not satisfied.
In fact, it can be shown that ∫_0^1 B_t(ω) dB_t(ω) does not exist as a Riemann-Stieltjes integral.
Another Fatal Blow to the Riemann-Stieltjes Approach
It can be shown that if ∫_0^1 f(t) dg(t) exists as a Riemann-Stieltjes integral for all continuous functions f on [0,1], then g necessarily has bounded variation. But Brownian sample paths do not have bounded variation on any finite interval.
Since pathwise integration with respect to a Brownian sample path, as suggested by the Riemann-Stieltjes integral, does not lead to a sufficiently large class of integrable functions f, one has to find a different approach to define stochastic integrals such as ∫_0^1 B_t(ω) dB_t(ω).
We will instead define the integral as a probabilistic average, leading to the Itô integrals.
A Motivating Example
Let B = (B_t, t ≥ 0) be Brownian motion. Consider a partition of [0, t]:
τ_n: 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = t, with ∆_i = t_i − t_{i−1}, n ≥ 1,
and the Riemann-Stieltjes sums, for n ≥ 1,
S_n = ∑_{i=1}^n B_{t_{i−1}} ∆_i B, with ∆_i B := B_{t_i} − B_{t_{i−1}}, 1 ≤ i ≤ n.
Rewriting: S_n = (1/2)B_t^2 − (1/2)∑_{i=1}^n (∆_i B)^2 =: (1/2)B_t^2 − (1/2)Q_n(t).
The limit of S_n boils down to the limit of Q_n(t) as n → ∞.
One can show that Q_n(t) does not converge for a given Brownian sample path and suitable choices of partitions τ_n.
We will show that Q_n(t) converges in probability to t, as n → ∞. This is the key to defining the Itô integral!
Quadratic Variation of Brownian Motion
Since Brownian motion has independent and stationary increments, E(∆_i B ∆_j B) = 0 for i ≠ j and
E(∆_i B)^2 = Var(∆_i B) = t_i − t_{i−1} = ∆_i.
Thus E(Q_n(t)) = ∑_{i=1}^n E(∆_i B)^2 = ∑_{i=1}^n ∆_i = t.
Var(Q_n(t)) = ∑_{i=1}^n Var((∆_i B)^2) = ∑_{i=1}^n [E((∆_i B)^4) − ∆_i^2].
Since E(B_1^4) = 3 (standard normal), we have E((∆_i B)^4) = E(B_{t_i − t_{i−1}}^4) = E(∆_i^{1/2} B_1)^4 = 3∆_i^2 (self-similarity). Thus Var(Q_n(t)) = 2∑_{i=1}^n ∆_i^2.
If mesh(τ_n) = max_{1≤i≤n} ∆_i → 0, we obtain
Var(Q_n(t)) = E(Q_n(t) − t)^2 ≤ 2 mesh(τ_n) ∑_{i=1}^n ∆_i = 2t · mesh(τ_n) → 0.
It follows from the Chebyshev inequality that Q_n(t) → t in probability as mesh(τ_n) → 0 (n → ∞). The limiting function f(t) = t is called the quadratic variation of Brownian motion.
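The concentration of Q_n(t) around t is easy to see numerically; a sketch with our choice of horizon and mesh:

```python
import numpy as np

# Sketch: the realized quadratic variation Q_n(t) = sum_i (Delta_i B)^2
# concentrates around t as the mesh goes to 0; here t = 2 on a uniform grid.
rng = np.random.default_rng(3)
t, n = 2.0, 200_000
dB = rng.normal(0.0, np.sqrt(t / n), size=n)   # increments with Var = t/n
Q = float(np.sum(dB**2))                        # realized quadratic variation
print(Q)                                        # close to t = 2
```

By the variance bound above, the standard deviation of Q here is about sqrt(2 t^2 / n) ≈ 0.006, so the printed value is very close to 2.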
Mean Square Limit is a Martingale
The quadratic variation is a distinguishing characteristic of Brownian motion.
Since S_n = (1/2)B_t^2 − (1/2)Q_n(t) converges in mean square to (1/2)B_t^2 − (1/2)t, we define the Itô integral in the mean square sense:
∫_0^t B_s dB_s = (1/2)(B_t^2 − t).
Since the values of Brownian motion were evaluated at the left end points of the intervals [t_{i−1}, t_i], the martingale transform
B · ∆B = ∑_{i=1}^k B_{t_{i−1}}(B_{t_i} − B_{t_{i−1}})
is a martingale with respect to the filtration σ(B_{t_i}, 0 ≤ i ≤ k), for all k = 1, ..., n.
As a result, the mean square limit lim_{n→∞} S_n = (1/2)(B_t^2 − t) is a martingale with respect to the natural Brownian filtration.
Heuristic Rules
The increment ∆_i B = B_{t_i} − B_{t_{i−1}} on the interval [t_{i−1}, t_i] satisfies
E(∆_i B) = 0, Var(∆_i B) = ∆_i = t_i − t_{i−1}.
These properties suggest that (∆_i B)^2 is of order ∆_i.
In terms of differentials, we write
(dB_t)^2 = (B_{t+dt} − B_t)^2 = dt.
In terms of integrals, we write
∫_0^t (dB_s)^2 = ∫_0^t ds = t.
These rules can be made mathematically precise in the mean square sense.
Stratonovich Integral
Consider partitions τ_n:
0 = t_0 < t_1 < · · · < t_{n−1} < t_n = t, with mesh(τ_n) → 0.
Using the same arguments and tools as for the Itô integral, the Riemann-Stieltjes sums
S_n = ∑_{i=1}^n B_{y_i} ∆_i B, with ∆_i B := B_{t_i} − B_{t_{i−1}},
where y_i = (1/2)(t_{i−1} + t_i), 1 ≤ i ≤ n, converge to the mean square limit (1/2)B_t^2.
This quantity is called the Stratonovich stochastic integral and is denoted by
∫_0^t B_s ∘ dB_s = (1/2)B_t^2.
The Riemann-Stieltjes sums ∑_{i=1}^k B_{y_i} ∆_i B, k = 1, ..., n, do not constitute a martingale, and neither does the limit process (1/2)B_t^2.
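The difference an evaluation point makes can be seen by redoing the previous simulation with midpoints; a sketch (grid sizes are our choices), where the path is simulated on a finer grid so the interval midpoints are themselves grid points:

```python
import numpy as np

# Sketch: midpoint sums sum_i B_{y_i} (B_{t_i} - B_{t_{i-1}}), with y_i the
# interval midpoints, approximate the Stratonovich value B_t^2 / 2 (no -t/2 term).
rng = np.random.default_rng(4)
t, n = 1.0, 50_000
# simulate B on 2n steps so that every interval midpoint is a grid point
fine = np.concatenate([[0.0], rng.normal(0.0, np.sqrt(t / (2 * n)), size=2 * n).cumsum()])
B_left, B_mid, B_right = fine[0:-1:2], fine[1::2], fine[2::2]
strat_sum = float(np.sum(B_mid * (B_right - B_left)))
print(strat_sum, 0.5 * fine[-1]**2)
```

Comparing with the left-endpoint simulation shows the two rules really do converge to different limits on the same path.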
Itô Integral vs Stratonovich Integral
The Itô integral is a martingale with respect to the natural Brownian filtration, but it does not obey the classical chain rule of integration.
A chain rule which is well suited for Itô integration is given by the Itô lemma.
The Stratonovich integral is not a martingale, but it does obey the classical chain rule of integration.
It turns out that the Stratonovich integral will also be a useful tool for solving Itô stochastic differential equations.
Outline
The Itô Stochastic Integrals
Simple Processes
Let B = (B_t, t ≥ 0) denote Brownian motion and F_t = σ(B_s, s ≤ t) denote the corresponding natural filtration. Consider a partition of [0,T]:
τ_n: 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = T.
The stochastic process C = (C_t, t ∈ [0,T]) is said to be simple if there exists a sequence (Z_i, i = 1, ..., n) of random variables such that
(Z_i, i = 1, ..., n) is adapted to (F_{t_{i−1}}, 1 ≤ i ≤ n), i.e., Z_i is a function of (B_s, s ≤ t_{i−1}), and EZ_i^2 < ∞, 1 ≤ i ≤ n.
C_t = ∑_{i=1}^n Z_i I_{[t_{i−1}, t_i)}(t) + Z_n I_{{T}}(t).
Example: f_n(t) = ∑_{i=1}^n ((i−1)/n) I_{[(i−1)/n, i/n)}(t) + ((n−1)/n) I_{{T}}(t) on [0,1].
Example: C_n(t) = ∑_{i=1}^n B_{t_{i−1}} I_{[t_{i−1}, t_i)}(t) + B_{t_{n−1}} I_{{T}}(t) on [0,T].
Note that Ct is a function of Brownian motion until time t .
Itô Stochastic Integrals of Simple Processes
Define
∫_0^T C_s dB_s := ∑_{i=1}^n C_{t_{i−1}}(B_{t_i} − B_{t_{i−1}}) = ∑_{i=1}^n Z_i ∆_i B.
Itô Integrals of Simple Processes on [0, t], t_{k−1} ≤ t < t_k
∫_0^t C_s dB_s := ∫_0^T C_s I_{[0,t]}(s) dB_s = ∑_{i=1}^{k−1} Z_i ∆_i B + Z_k(B_t − B_{t_{k−1}}).
Example: ∫_0^t f_n(s) dB_s = ∑_{i=1}^{k−1} ((i−1)/n)(B_{t_i} − B_{t_{i−1}}) + ((k−1)/n)(B_t − B_{t_{k−1}}) for (k−1)/n ≤ t < k/n. Note that
lim_{n→∞} ∫_0^t f_n(s) dB_s = ∫_0^t s dB_s.
Example: ∫_0^t C_n(s) dB_s = ∑_{i=1}^{k−1} B_{t_{i−1}} ∆_i B + B_{t_{k−1}}(B_t − B_{t_{k−1}}) for t_{k−1} ≤ t < t_k.
Itô Integral of a Simple Process is a Martingale
The form of the Itô stochastic integral for simple processes very much reminds us of a martingale transform, which results in a martingale.
A Martingale Property
The stochastic process I_t(C) := ∫_0^t C_s dB_s, t ∈ [0,T], is a martingale with respect to the natural Brownian filtration (F_t, t ∈ [0,T]).
Using the isometry property, E|I_t(C)| < ∞ for all t ∈ [0,T].
I_t(C) is adapted to (F_t, t ∈ [0,T]).
E(I_t(C) | F_s) = I_s(C), for s < t.
Properties
The Itô stochastic integral has expectation zero.
The Itô stochastic integral satisfies the isometry property:
E(∫_0^t C_s dB_s)^2 = ∫_0^t EC_s^2 ds, t ∈ [0,T].
For any constants c_1 and c_2, and simple processes C^(1) and C^(2) on [0,T],
∫_0^t (c_1 C_s^(1) + c_2 C_s^(2)) dB_s = c_1 ∫_0^t C_s^(1) dB_s + c_2 ∫_0^t C_s^(2) dB_s.
For any t ∈ [0,T],
∫_0^T C_s dB_s = ∫_0^t C_s dB_s + ∫_t^T C_s dB_s.
The process I(C) has continuous sample paths.
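The isometry property can be checked by Monte Carlo; a sketch (integrand, horizon, and sample sizes are our choices) with C_s = B_s on [0,1], for which ∫_0^1 EB_s^2 ds = ∫_0^1 s ds = 1/2:

```python
import numpy as np

# Sketch: Monte Carlo check of the isometry E (int_0^1 B_s dB_s)^2 = 1/2.
rng = np.random.default_rng(5)
paths, n, t = 20_000, 500, 1.0
dB = rng.normal(0.0, np.sqrt(t / n), size=(paths, n))   # increments per path
B = np.cumsum(dB, axis=1) - dB                          # left endpoints B_{t_{i-1}}
I = np.sum(B * dB, axis=1)                              # one Ito sum per path
print(float(np.mean(I**2)))                             # close to 0.5
```

Each row of B starts at B_0 = 0, so the row-wise sums are exactly the left-endpoint Riemann-Stieltjes sums used above.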
Basic Assumptions
Assumptions on the Integrand Process C
1. C = (C_t, t ∈ [0,T]) is adapted to Brownian motion on [0,T], i.e., C_t is a function of B_s, s ≤ t.
2. The integral ∫_0^T EC_s^2 ds < ∞.
For fixed t and a given partition τ_n = (t_i) of [0, t], we defined ∫_0^T C_s dB_s = ∑_{i=1}^n C_{t_{i−1}}(B_{t_i} − B_{t_{i−1}}) as a Riemann-Stieltjes sum, for a simple process C.
Brownian motion B = (B_t, t ∈ [0,T]) satisfies the Assumptions.
A simple process C = (C_t, t ∈ [0,T]) satisfies the Assumptions.
Another class of admissible integrands consists of the deterministic functions c(t) on [0,T] with ∫_0^T c^2(t) dt < ∞.
Key Steps to Define Itô Integrals and Proofs
Let C = (C_t, t ∈ [0,T]) be a process satisfying the Assumptions.
We need to find a sequence C^(n) = (C_t^(n), t ∈ [0,T]) of simple processes such that
∫_0^T E[C_s − C_s^(n)]^2 ds → 0, as mesh(τ_n) → 0.
That is, the simple processes C^(n) converge in a certain mean square sense to the integrand process C.
Since C^(n) is simple, we can evaluate the Itô integrals I_t(C^(n)) = ∫_0^t C_s^(n) dB_s for every n and t.
We need to show the existence of a process I(C) on [0,T] such that
E sup_{0≤t≤T} [I_t(C) − I_t(C^(n))]^2 → 0, as mesh(τ_n) → 0.
That is, to show that the sequence (I_t(C^(n))) of Itô stochastic integrals converges in a certain mean square sense to a unique limit process.
The General Itô Stochastic Integral
Definition
The mean square limit I(C) is called the Itô stochastic integral of C. It is denoted by I_t(C) = ∫_0^t C_s dB_s, t ∈ [0,T].
For practical purposes, the following rule of thumb is helpful:
The Itô stochastic integrals I_t(C) = ∫_0^t C_s dB_s, t ∈ [0,T], constitute a stochastic process. For a given partition
τ_n: 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = T
and t ∈ [t_{k−1}, t_k], the random variable I_t(C) is "close" to the Riemann-Stieltjes sum
∑_{i=1}^{k−1} C_{t_{i−1}}(B_{t_i} − B_{t_{i−1}}) + C_{t_{k−1}}(B_t − B_{t_{k−1}}),
and this approximation is the closer (in the mean square sense) to the value of I_t(C) the denser the partition τ_n is in [0,T].
Properties of the General Itô Stochastic Integral
The stochastic process ∫_0^t C_s dB_s is a martingale with respect to the natural Brownian filtration (F_t, t ∈ [0,T]).
The Itô stochastic integral has expectation zero.
The Itô stochastic integral satisfies the isometry property:
E(∫_0^t C_s dB_s)^2 = ∫_0^t EC_s^2 ds, t ∈ [0,T].
For any constants c_1, c_2, and processes C^(1) and C^(2) on [0,T],
∫_0^t (c_1 C_s^(1) + c_2 C_s^(2)) dB_s = c_1 ∫_0^t C_s^(1) dB_s + c_2 ∫_0^t C_s^(2) dB_s.
For any t ∈ [0,T],
∫_0^T C_s dB_s = ∫_0^t C_s dB_s + ∫_t^T C_s dB_s.
The process I(C) has continuous sample paths.
Outline
The Itô Lemma: Stochastic Analogue of the Chain Rule
A Simple Version of the Itô Lemma
Let B = (B_t, t ≥ 0) denote Brownian motion and F_t = σ(B_s, s ≤ t) denote the corresponding natural filtration. Assume that f is twice differentiable; the Taylor expansion gives
f(B_t + dB_t) − f(B_t) = f′(B_t) dB_t + (1/2) f″(B_t)(dB_t)^2 + · · ·.
In contrast to the deterministic case, the contribution of the second-order term in the Taylor expansion is not negligible.
The squared differential (dB_t)^2 can be interpreted as dt.
Integrating both sides in a formal sense and neglecting terms of third and higher order on the right-hand side, we obtain
Itô Lemma (1951)
f(B_t) − f(B_s) = ∫_s^t f′(B_x) dB_x + (1/2) ∫_s^t f″(B_x) dx, s < t,
for any twice continuously differentiable f.
Examples
1. Let f(x) = x^2; then
B_t^2 = 2∫_0^t B_x dB_x + t,
resulting in ∫_0^t B_x dB_x = (1/2)(B_t^2 − t).
2. Let f(x) = x^3; then
B_t^3 − B_s^3 = 3∫_s^t B_x^2 dB_x + 3∫_s^t B_x dx.
We cannot express ∫_s^t B_x dx in simpler terms of Brownian motion; simulations have to be used.
3. Let f(x) = e^x; we have
e^{B_t} − e^{B_s} = ∫_s^t e^{B_x} dB_x + (1/2)∫_s^t e^{B_x} dx > ∫_s^t e^{B_x} dB_x.
So the exponential function is not the Itô exponential.
Extension I of the Itô Lemma
Assume that f(t, x) has continuous partial derivatives of at least second order. Let
f_i(t, x) = ∂f(x_1, x_2)/∂x_i |_{x_1 = t, x_2 = x}, f_{ij}(t, x) = ∂^2 f(x_1, x_2)/(∂x_i ∂x_j) |_{x_1 = t, x_2 = x},
for i, j = 1, 2.
Itô Lemma
For any s < t,
f(t, B_t) − f(s, B_s) = ∫_s^t [f_1(x, B_x) + (1/2) f_{22}(x, B_x)] dx + ∫_s^t f_2(x, B_x) dB_x.
Example: Let f(t, x) = e^{x − 0.5t}; then
e^{B_t − 0.5t} − e^{B_s − 0.5s} = ∫_s^t e^{B_x − 0.5x} dB_x.
e^{B_t − 0.5t} is called the Itô exponential.
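Since the Itô exponential is a stochastic integral, it is a martingale with constant expectation 1; a quick Monte Carlo check (time point and sample size are our choices):

```python
import numpy as np

# Sketch: the Ito exponential exp(B_t - t/2) is a martingale started at 1,
# so E exp(B_t - t/2) = 1 for every t; check at t = 1.
rng = np.random.default_rng(6)
t = 1.0
B_t = rng.normal(0.0, np.sqrt(t), size=1_000_000)   # B_t ~ N(0, t)
M_t = np.exp(B_t - 0.5 * t)
print(float(M_t.mean()))                             # close to 1
```

The -0.5t correction is exactly what cancels the extra (1/2)∫e^{B_x} dx term seen in Example 3 above.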
Geometric Brownian Motion
Consider a particular form of geometric Brownian motion
X_t = f(t, B_t) = e^{(c − 0.5σ^2)t + σB_t},
where c and σ > 0 are constants.
An application of the Itô lemma yields that the process X satisfies
X_t − X_0 = c∫_0^t X_s ds + σ∫_0^t X_s dB_s.
Symbolically, in differential form,
dX_t = σX_t dB_t + cX_t dt.
This is the process suggested by Black, Scholes and Merton.
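The closed form makes this process easy to simulate exactly; a sketch (the parameter values are our illustrative choices) that also checks the mean EX_t = X_0 e^{ct}, which follows by taking expectations in the integral equation:

```python
import numpy as np

# Sketch: exact simulation of geometric Brownian motion
# X_t = X_0 exp((c - 0.5 sigma^2) t + sigma B_t), with a check of E X_t = X_0 e^{ct}.
rng = np.random.default_rng(7)
X0, c, sigma, t = 1.0, 0.05, 0.2, 1.0
B_t = rng.normal(0.0, np.sqrt(t), size=500_000)
X_t = X0 * np.exp((c - 0.5 * sigma**2) * t + sigma * B_t)
print(float(X_t.mean()), X0 * np.exp(c * t))
```

No time-stepping is needed here: since the strong solution is an explicit function of B_t, sampling B_t directly gives exact draws of X_t.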
Itô Processes
Let B = (B_t, t ≥ 0) denote Brownian motion and F_t = σ(B_s, s ≤ t) denote the corresponding natural filtration. Consider
X_t = X_0 + ∫_0^t A_s^(1) ds + ∫_0^t A_s^(2) dB_s,
or symbolically in differential form,
dX_t = A_t^(1) dt + A_t^(2) dB_t,
where the processes A^(1) and A^(2) are adapted to the Brownian filtration (F_t, t ≥ 0).
A process X with this representation is called an Itô process.
One can show that the processes A^(1) and A^(2) are uniquely determined in the sense that, if X has the above representation with the A^(i) replaced by adapted processes D^(i), then A^(i) and D^(i) necessarily coincide.
Geometric Brownian motion is an Itô process with A^(1) = cX and A^(2) = σX.
Extension II of the Itô Lemma
Let f(t, x) be a function whose second-order partial derivatives are continuous.
Itô Lemma
If dX_t = A_t^(1) dt + A_t^(2) dB_t, then for any s < t,
f(t, X_t) − f(s, X_s) = ∫_s^t A_y^(2) f_2(y, X_y) dB_y + ∫_s^t [f_1(y, X_y) + A_y^(1) f_2(y, X_y) + (1/2)(A_y^(2))^2 f_{22}(y, X_y)] dy.
This Itô lemma is frequently given in the following form:
f(t, X_t) − f(s, X_s) = ∫_s^t f_2(y, X_y) dX_y + ∫_s^t [f_1(y, X_y) + (1/2)(A_y^(2))^2 f_{22}(y, X_y)] dy.
Extension III of the Itô Lemma
Let X^(1) and X^(2) be two Itô processes and f(t, x_1, x_2) be a function whose second-order partial derivatives are continuous.
Itô Lemma
f(t, X_t^(1), X_t^(2)) − f(s, X_s^(1), X_s^(2)) = ∫_s^t f_1(y, X_y^(1), X_y^(2)) dy
+ ∑_{i=2}^3 ∫_s^t f_i(y, X_y^(1), X_y^(2)) dX_y^(i−1)
+ (1/2) ∑_{i=2}^3 ∑_{j=2}^3 ∫_s^t f_{ij}(y, X_y^(1), X_y^(2)) A_y^(2,i−1) A_y^(2,j−1) dy,
where, for i = 1, 2,
dX_t^(i) = A_t^(1,i) dt + A_t^(2,i) dB_t.
Stochastic Integration by Parts
Consider the function f(t, x_1, x_2) = x_1 x_2; then we obtain
Integration by Parts Formula
d(X_t^(1) X_t^(2)) = X_t^(2) dX_t^(1) + X_t^(1) dX_t^(2) + A_t^(2,1) A_t^(2,2) dt,
where, for i = 1, 2, dX_t^(i) = A_t^(1,i) dt + A_t^(2,i) dB_t.
Example: Consider X_t^(1) = e^t − 1 = ∫_0^t e^s ds and X_t^(2) = B_t = ∫_0^t dB_s. Integration by parts yields
∫_0^t e^s dB_s = e^t B_t − ∫_0^t B_s e^s ds.
More generally, for any continuously differentiable function f,
∫_0^t f(s) dB_s = f(t)B_t − ∫_0^t f′(s) B_s ds.
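Both sides of the example identity can be approximated by sums on a fine grid along one simulated path; a sketch (grid size is our choice) with f(s) = e^s on [0, 1]:

```python
import numpy as np

# Sketch: pathwise check of int_0^t e^s dB_s = e^t B_t - int_0^t e^s B_s ds,
# approximating the stochastic integral by a left-endpoint Ito sum and the
# ordinary integral by a left-endpoint Riemann sum on the same grid.
rng = np.random.default_rng(8)
t, n = 1.0, 200_000
s = np.linspace(0.0, t, n + 1)                     # grid points s_0, ..., s_n
dB = rng.normal(0.0, np.sqrt(t / n), size=n)       # Brownian increments
B = np.concatenate([[0.0], dB.cumsum()])
lhs = float(np.sum(np.exp(s[:-1]) * dB))                               # int e^s dB_s
rhs = float(np.exp(t) * B[-1] - np.sum(np.exp(s[:-1]) * B[:-1]) * (t / n))
print(lhs, rhs)
```

The two approximations agree up to discretization error of order 1/n, path by path.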
Outline
The Stratonovich and Other Integrals
Itô Stochastic Differential Equations
Sums with the Values at the Center of Subintervals
There is a large variety of other integrals, the Itô integral being just one member of this family.
Let B = (B_t, t ≥ 0) denote Brownian motion and F_t = σ(B_s, s ≤ t) denote the corresponding natural filtration. Consider a partition of [0,T]:
τ_n: 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = T.
Consider first C_t = f(B_t), t ∈ [0,T], for a twice differentiable function f.
Define the Riemann-Stieltjes sums
S_n = ∑_{i=1}^n f(B_{y_i}) ∆_i B, where y_i = (t_{i−1} + t_i)/2.
The mean square limit of the Riemann-Stieltjes sums S_n exists as mesh(τ_n) → 0.
The Stratonovich Integral
Definition
Assume that ∫_0^T E[f(B_t)]^2 dt < ∞. The mean square limit of the sums S_n is called the Stratonovich integral, and is denoted by
S_t(f(B)) = ∫_0^t f(B_s) ∘ dB_s, t ≤ T.
1. Let f(x) = x; then the Itô integral is ∫_0^t B_x dB_x = (1/2)(B_t^2 − t).
2. Let f(x) = x and consider S_n = ∑_{i=1}^n B_{(t_{i−1}+t_i)/2}(B_{t_i} − B_{t_{i−1}}). The mean square limit of the Riemann-Stieltjes sums S_n is (1/2)B_t^2, and thus the Stratonovich integral is
∫_0^t B_s ∘ dB_s = (1/2)B_t^2.
Relation between Itô and Stratonovich Integrals
Assume that ∫_0^T E[f(B_t)]^2 dt < ∞ and ∫_0^T E[f′(B_t)]^2 dt < ∞.
Observe the Taylor expansion
f(B_{(t_{i−1}+t_i)/2}) = f(B_{t_{i−1}}) + f′(B_{t_{i−1}})(B_{y_i} − B_{t_{i−1}}) + · · ·,
where we neglect higher-order terms. Then the Riemann-Stieltjes sums can be written as follows:
∑_{i=1}^n f(B_{y_i}) ∆_i B = S_n^(1) + S_n^(2) + S_n^(3),
1. S_n^(1) = ∑_{i=1}^n f(B_{t_{i−1}}) ∆_i B converges in the mean square sense to the Itô integral ∫_0^t f(B_s) dB_s,
2. S_n^(2) = ∑_{i=1}^n f′(B_{t_{i−1}})(B_{y_i} − B_{t_{i−1}})^2 converges in the mean square sense to (1/2)∫_0^t f′(B_s) ds,
3. S_n^(3) = ∑_{i=1}^n f′(B_{t_{i−1}})(B_{y_i} − B_{t_{i−1}})(B_{t_i} − B_{y_i}) converges in the mean square sense to 0.
Chain Rule for the Stratonovich Integrals
Transformation Formula
∫_0^t f(B_s) ∘ dB_s = ∫_0^t f(B_s) dB_s + (1/2)∫_0^t f′(B_s) ds.
Chain Rule
The Stratonovich stochastic integral satisfies the chain rule of classical calculus:
∫_0^t g′(B_s) ∘ dB_s = g(B_t) − g(B_0).
1. The Stratonovich stochastic integral (S_t(f(B)), t ≤ T) does not constitute a martingale, but obeys the "nice" classical chain rule.
2. The Itô integral does not obey the classical chain rule, but is a martingale that offers rich structural properties.
A More General Transformation Formula
Consider
C_t = f(t, X_t), t ∈ [0,T],
where f(t, x) is a function with continuous partial derivatives of order two. The process X is supposed to be an Itô process given by the stochastic differential equation
X_t = X_0 + ∫_0^t a(s, X_s) ds + ∫_0^t b(s, X_s) dB_s,
where the continuous functions a(t, x) and b(t, x) satisfy some regularity conditions.
Theorem
∫_0^t f(s, X_s) ∘ dB_s = ∫_0^t f(s, X_s) dB_s + (1/2)∫_0^t b(s, X_s) f_2(s, X_s) ds.
Another Approximation
Let f(t, x) be a function whose second-order partial derivatives are continuous. Assume that
∫_0^T E[f(t, X_t)]^2 dt < ∞.
Consider the approximating Riemann-Stieltjes sums
S_n = ∑_{i=1}^n f(t_{i−1}, (X_{t_{i−1}} + X_{t_i})/2) ∆_i B.
One can show that this definition is consistent with the previous one, S_n with f(t, x) = f(x) and X = B.
p-Stochastic Integrals, 0 ≤ p ≤ 1
Let the process (C_t, t ∈ [0,T]) be adapted to Brownian motion B. Consider
S_n^p = ∑_{i=1}^n C_{t_{i−1} + p(t_i − t_{i−1})} ∆_i B.
Under some regularity conditions, the mean square limit of the Riemann-Stieltjes sums S_n^p, as mesh(τ_n) → 0, exists. This mean square limit is called the p-stochastic integral and is denoted by (p)-∫_0^T C_s dB_s.
1. If p = 0, we obtain the Itô integral. If p = 0.5, we obtain the Stratonovich integral.
2. For non-trivial integrands C, the values (p)-∫_0^T C_s dB_s differ for distinct p.
3. For example, using arguments similar to the Itô and Stratonovich cases, one can show that
(p)-∫_0^T B_s dB_s = 0.5B_T^2 + (p − 0.5)T.
Itô Stochastic Differential Equations
Let B = (B_t, t ≥ 0) be Brownian motion. The randomness in the differential equation is introduced via a perturbed initial condition and an additional random noise term:
dX_t = a(t, X_t) dt + b(t, X_t) dB_t, X_0(ω) = Y(ω),
where a(t, x) and b(t, x) are deterministic functions.
An Itô stochastic differential equation with driving process B is given in integral form by
X_t = X_0 + ∫_0^t a(s, X_s) ds + ∫_0^t b(s, X_s) dB_s, X_0(ω) = Y(ω).
It is possible to replace the driving process B by a semimartingale; this class contains Brownian motion and a large variety of jump processes. Semimartingales are useful tools when one is interested in modeling the jump character of real-life processes, e.g., the strong oscillations of foreign exchange rates or crashes of the stock market.
Diffusions: Strong and Weak Solutions
A strong solution is a stochastic process (X_t, t ≥ 0) that satisfies the following conditions:
1. X is adapted to Brownian motion.
2. The integrals in the integral form are well defined as Riemann or Itô stochastic integrals, respectively.
3. X is a function of the underlying Brownian sample path and of the coefficient functions a(t, x) and b(t, x).
For weak solutions the path behavior is not essential; we are only interested in the distribution of X. The initial condition X_0 and the coefficient functions a(t, x) and b(t, x) are given, and we have to find a Brownian motion such that the Itô stochastic differential equation holds.
We only consider strong solutions.
Existence of Strong Solutions
Theorem
A unique strong solution of an Itô stochastic differential equation with driving process B exists on [0, T] if
1. The initial condition X_0 has a finite second moment and is independent of B.
2. The coefficient functions a(t, x) and b(t, x) are continuous.
3. The coefficient functions a(t, x) and b(t, x) satisfy a Lipschitz condition with respect to the second variable.
Example: A linear Itô stochastic differential equation given by

X_t = X_0 + ∫_0^t (c_1 X_s + c_2)ds + ∫_0^t (σ_1 X_s + σ_2)dB_s,  X_0(ω) = Y(ω),

has a unique strong solution.
Linear Itô SDE with Multiplicative Noise

Consider the linear Itô stochastic differential equation

X_t = X_0 + c∫_0^t X_s ds + σ∫_0^t X_s dB_s,  X_0(ω) = Y(ω).
Let X_t = f(t, B_t) for some smooth function f. From the Itô lemma, we have

cf(t, x) = f_1(t, x) + (1/2)f_22(t, x),  σf(t, x) = f_2(t, x).

If f(t, x) = g(t)h(x) is separable, then we have

f(t, x) = g(0)h(0)e^{(c−0.5σ²)t+σx}.
The unique strong solution is given by the geometric Brownian motion:

X_t = X_0 e^{(c−0.5σ²)t+σB_t}.
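A quick Monte Carlo sanity check of this solution (my own sketch, with arbitrary parameter values): since E e^{σB_t} = e^{0.5σ²t}, the Itô correction −0.5σ²t in the exponent is exactly what makes E X_t = X_0 e^{ct}.

```python
import math
import random

# Monte Carlo check of the geometric Brownian motion solution
# X_t = X_0 * exp((c - 0.5*sigma^2) t + sigma*B_t): its mean should be
# E X_t = X_0 * exp(c t).
rng = random.Random(0)
c, sigma, x0, t, n = 0.1, 0.3, 1.0, 1.0, 200_000

total = 0.0
for _ in range(n):
    bt = rng.gauss(0.0, math.sqrt(t))           # B_t ~ N(0, t)
    total += x0 * math.exp((c - 0.5 * sigma**2) * t + sigma * bt)
mc_mean = total / n

exact_mean = x0 * math.exp(c * t)   # = e^0.1, about 1.105
```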
Langevin Equation
Linear SDE with Additive Noise
X_t = X_0 + c∫_0^t X_s ds + σ∫_0^t dB_s,  t ∈ [0, T],

where c is a constant.
In differential form, dX_t = cX_t dt + σdB_t.
This resembles a time series (an autoregressive process of order 1):

X_{t+1} − X_t = cX_t + σ(B_{t+1} − B_t),  or  X_{t+1} = φX_t + Z_t,

where φ = c + 1 and Z_t = σ(B_{t+1} − B_t) ∼ N(0, σ²). This time series model can be considered a discrete analogue of the solution to the Langevin equation.
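The discrete analogue can be simulated directly. The sketch below (my own illustration, with made-up values c = −0.5, σ = 1) iterates the AR(1) recursion in the mean-reverting case |φ| < 1 and compares the sample variance with the stationary variance σ²/(1 − φ²).

```python
import random

# Discrete analogue of the Langevin equation: the AR(1) recursion
# X_{n+1} = phi * X_n + Z_n with phi = c + 1 and Z_n ~ N(0, sigma^2).
# For c < 0 (so |phi| < 1) the recursion is mean-reverting, with
# stationary variance sigma^2 / (1 - phi^2).
rng = random.Random(1)
c, sigma = -0.5, 1.0
phi = c + 1.0                       # phi = 0.5
x, xs = 0.0, []
for _ in range(100_000):
    x = phi * x + rng.gauss(0.0, sigma)
    xs.append(x)

sample_var = sum(v * v for v in xs) / len(xs)
stationary_var = sigma**2 / (1.0 - phi**2)   # = 4/3, about 1.333
```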
Ornstein-Uhlenbeck Process
The unique strong solution of the Langevin equation is given by

X_t = e^{ct}X_0 + σe^{ct}∫_0^t e^{−cs}dB_s.

For a constant initial condition X_0, this process is called an Ornstein-Uhlenbeck process.
The Ornstein-Uhlenbeck process is a Gaussian process.
If X_0 = 0, then

EX_t = 0,  cov(X_t, X_s) = (σ²/(2c))(e^{c(t+s)} − e^{c(t−s)}),  s < t.
Example: Two Independent Driving Brownian Motions

Let B^(i) = (B^(i)_t, t ≥ 0), i = 1, 2, be two independent Brownian motions and σ_i, i = 1, 2, positive real numbers. Define the process

B̃_t = (σ_1² + σ_2²)^{−1/2}(σ_1 B^(1)_t + σ_2 B^(2)_t).
B̃_t is a Brownian motion, because it has exactly the same expectation and covariance functions as standard Brownian motion: E(B̃_t) = 0, cov(B̃_t, B̃_s) = min(s, t).
Consider the integral equation

X_t = X_0 + c∫_0^t X_s ds + σ_1∫_0^t X_s dB^(1)_s + σ_2∫_0^t X_s dB^(2)_s
    = X_0 + c∫_0^t X_s ds + (σ_1² + σ_2²)^{1/2}∫_0^t X_s dB̃_s

for some constants c, σ_1 and σ_2.
The solution: X_t = X_0 e^{[c−0.5(σ_1²+σ_2²)]t + σ_1 B^(1)_t + σ_2 B^(2)_t}.
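The scaling of the combined process can be checked numerically. The sketch below (my own, with arbitrary σ_1, σ_2 and t) samples the combination at a fixed time and verifies that its variance matches that of standard Brownian motion, var(B̃_t) = t.

```python
import math
import random

# Check that B~_t = (sigma1^2 + sigma2^2)^(-1/2) (sigma1*B1_t + sigma2*B2_t)
# has the variance t of standard Brownian motion (illustrative parameters).
rng = random.Random(7)
sigma1, sigma2, t, n = 2.0, 3.0, 1.5, 200_000
norm = math.sqrt(sigma1**2 + sigma2**2)

total_sq = 0.0
for _ in range(n):
    b1 = rng.gauss(0.0, math.sqrt(t))   # B1_t ~ N(0, t), independent of B2_t
    b2 = rng.gauss(0.0, math.sqrt(t))
    bt = (sigma1 * b1 + sigma2 * b2) / norm
    total_sq += bt * bt
sample_var = total_sq / n               # should be close to t = 1.5
```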
Outline
Solving Itô Differential Equations via Stratonovich Calculus
The General Linear Differential Equation
Converting Itô's to Stratonovich's

Consider the Itô differential equation

X_t = X_0 + ∫_0^t a(s, X_s)ds + ∫_0^t b(s, X_s)dB_s,  t ∈ [0, T],

where the coefficient functions a(t, x) and b(t, x) satisfy the regularity conditions for existence and uniqueness of a strong solution.
Using the transformation formula for Stratonovich integrals in terms of Itô and Riemann integrals, we have

∫_0^t b(s, X_s)dB_s = ∫_0^t b(s, X_s) ∘ dB_s − (1/2)∫_0^t b(s, X_s)b_2(s, X_s)ds.

We then arrive at the equivalent Stratonovich stochastic differential equation:

X_t = X_0 + ∫_0^t ã(s, X_s)ds + ∫_0^t b(s, X_s) ∘ dB_s,  t ∈ [0, T],

where ã(t, x) = a(t, x) − (1/2)b(t, x)b_2(t, x).
The Stratonovich Version of the Itô Lemma

Now consider the stochastic process Y_t = u(t, X_t) for some smooth function u(t, x). Using the Itô lemma, we have

Y_t = Y_0 + ∫_0^t (u_1 + au_2 + (1/2)b²u_22)ds + ∫_0^t bu_2 dB_s.

Applying the transformation formula for f = bu_2 and f_2 = b_2u_2 + bu_22, we obtain

Theorem

Y_t = Y_0 + ∫_0^t [u_1 + (a − 0.5bb_2)u_2]ds + ∫_0^t bu_2 ∘ dB_s.

This formula is the exact analogue of the classical chain rule for a twice differentiable function u(t, x), evaluated at x(t) satisfying

dx(t) = [a(t, x(t)) − 0.5b(t, x(t))b_2(t, x(t))]dt + b(t, x(t))dc(t),

where c(t) is a differentiable function.
A Typical Scheme via Stratonovich Integrals

Consider an Itô stochastic differential equation

X_t = X_0 + ∫_0^t [q f(X_s) + (1/2)f(X_s)f′(X_s)]ds + ∫_0^t f(X_s)dB_s.

The equivalent Stratonovich stochastic differential equation is

X_t = X_0 + ∫_0^t q f(X_s)ds + ∫_0^t f(X_s) ∘ dB_s.

This corresponds to the deterministic differential equation

dx(t) = q f(x(t))dt + f(x(t))dc(t),

where c(t) is a differentiable function.
Separating variables, we have

g(x(t)) − g(x(0)) := ∫_{x(0)}^{x(t)} dx/f(x) = qt + c(t) − c(0).

The solution is given by g(X_t) − g(X_0) = qt + B_t.
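As a concrete instance of this scheme (a worked example of my own, not from the lecture), take f(x) = x, so that the Itô equation reads dX_t = (q + 1/2)X_t dt + X_t dB_t. Separation of variables gives

```latex
g(x) := \int^{x} \frac{dy}{f(y)} = \int^{x} \frac{dy}{y} = \ln x ,
\qquad
g(X_t) - g(X_0) = qt + B_t
\;\Longrightarrow\;
X_t = X_0 \, e^{\,qt + B_t} .
```

This agrees with the geometric Brownian motion solution obtained earlier: with σ = 1 and c = q + 1/2, X_t = X_0 e^{(c−0.5)t+B_t} = X_0 e^{qt+B_t}.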
The General Linear Differential Equation
Linear Differential Equation
X_t = X_0 + ∫_0^t (c_1(s)X_s + c_2(s))ds + ∫_0^t (σ_1(s)X_s + σ_2(s))dB_s,  t ∈ [0, T].

1. If the (deterministic) coefficient functions c_i and σ_i are continuous, then the existence and uniqueness conditions guarantee that it has a unique strong solution.
2. It is particularly attractive because it has an explicit solution in terms of the coefficient functions and of the underlying Brownian sample path.
3. We derive this solution by repeated use of different variants of the Itô lemma.
Linear Equations with Additive Noise

Consider

X_t = X_0 + ∫_0^t (c_1(s)X_s + c_2(s))ds + ∫_0^t σ_2(s)dB_s,  t ∈ [0, T].

The process X is not directly involved in the stochastic integral.
Try Y_t = f(t, X_t) := y(t)X_t, where y(t) = e^{−∫_0^t c_1(s)ds}.
An application of the Itô lemma yields dY_t = d(y(t)X_t) = c_2(t)y(t)dt + σ_2(t)y(t)dB_t.

The solution:

X_t = (y(t))^{−1}(X_0 + ∫_0^t c_2(s)y(s)ds + ∫_0^t σ_2(s)y(s)dB_s).

Observe that ∫_0^t σ_2(s)y(s)dB_s is Gaussian with variance ∫_0^t σ_2²(s)y²(s)ds. If X_0 is a constant, X is a Gaussian process.
The Vasicek Interest Rate Model
Let r_t denote the instantaneous interest rate at time t for borrowing and lending money. In the Vasicek model, r_t is described by

dr_t = c[μ − r_t]dt + σdB_t,  t ∈ [0, T],

where c, μ and σ are positive constants.
r_t reverts to the mean μ: whenever r_t deviates from μ, it is drawn back toward μ, and the speed at which this happens is proportional to |μ − r_t| adjusted by the parameter c.
The volatility parameter σ is a measure of the order of magnitude of the fluctuations of r_t around μ.
The Vasicek Process
The solution of the Vasicek interest rate model is

r_t = r_0 e^{−ct} + μ(1 − e^{−ct}) + σe^{−ct}∫_0^t e^{cs}dB_s.

1. If r_0 is a constant, then r is a Gaussian process with

E r_t = r_0 e^{−ct} + μ(1 − e^{−ct}),  var(r_t) = (σ²/(2c))(1 − e^{−2ct}).

2. For μ = 0 we obtain an Ornstein-Uhlenbeck process.
3. As t → ∞, r_t →_d N(μ, σ²/(2c)).
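The Gaussian transition of the Vasicek process makes exact simulation easy. The sketch below (my own; the parameter values are arbitrary) steps many independent paths to a time well past the mean-reversion scale 1/c and checks the limiting N(μ, σ²/(2c)) distribution.

```python
import math
import random

# Exact simulation of the Vasicek process via its Gaussian transition:
# given r_t, r_{t+h} = r_t e^{-c h} + mu (1 - e^{-c h}) + eps, where
# eps ~ N(0, sigma^2 (1 - e^{-2 c h}) / (2 c)).  We check the stationary
# limit r_t ->_d N(mu, sigma^2 / (2 c)) on many independent long paths.
rng = random.Random(3)
c, mu, sigma = 2.0, 0.05, 0.1
h, steps, npaths = 0.1, 100, 20_000

decay = math.exp(-c * h)
noise_sd = math.sqrt(sigma**2 * (1.0 - math.exp(-2.0 * c * h)) / (2.0 * c))

finals = []
for _ in range(npaths):
    r = 0.0                                  # start away from the mean mu
    for _ in range(steps):
        r = r * decay + mu * (1.0 - decay) + rng.gauss(0.0, noise_sd)
    finals.append(r)

mean_r = sum(finals) / npaths                # should approach mu = 0.05
var_r = sum((r - mean_r) ** 2 for r in finals) / npaths
# stationary variance sigma^2/(2c) = 0.0025
```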
Homogeneous Equations with Multiplicative Noise

Consider

X_t = X_0 + ∫_0^t c_1(s)X_s ds + ∫_0^t σ_1(s)X_s dB_s,  t ∈ [0, T].

Without loss of generality, we assume that X_0 = 1.
Since we expect an exponential form of the solution, we assume that X_t > 0 for all t. Let Y_t = f(X_t) := ln X_t.
Applying the Itô lemma, we obtain

dY_t = [c_1(t) − 0.5σ_1²(t)]dt + σ_1(t)dB_t.

The solution:

X_t = X_0 exp{∫_0^t [c_1(s) − 0.5σ_1²(s)]ds + ∫_0^t σ_1(s)dB_s}.

Example: If c_1(t) = c and σ_1(t) = σ, we get the geometric Brownian motion.
The General Case

Consider

X_t = X_0 + ∫_0^t (c_1(s)X_s + c_2(s))ds + ∫_0^t (σ_1(s)X_s + σ_2(s))dB_s,  t ∈ [0, T].

1. Let Y denote the solution of the homogeneous stochastic differential equation with Y_0 = 1.
2. Consider X^(1)_t = Y_t^{−1} and X^(2)_t = X_t. Applying the Itô lemma to X^(1)_t = Y_t^{−1}, we have

dX^(1)_t = [−c_1(t) + σ_1²(t)]X^(1)_t dt − σ_1(t)X^(1)_t dB_t.

3. An appeal to the integration by parts formula yields

d(X^(1)_t X^(2)_t) = [c_2(t) − σ_1(t)σ_2(t)]X^(1)_t dt + σ_2(t)X^(1)_t dB_t.

The solution:

X_t = Y_t(X_0 + ∫_0^t [c_2(s) − σ_1(s)σ_2(s)]Y_s^{−1}ds + ∫_0^t σ_2(s)Y_s^{−1}dB_s).
The Expectation and Variance of the Solution

Consider again

X_t = X_0 + ∫_0^t (c_1(s)X_s + c_2(s))ds + ∫_0^t (σ_1(s)X_s + σ_2(s))dB_s,  t ∈ [0, T].

Let μ_X(t) = E(X_t).
1. Take expectations on both sides and notice that the stochastic integral has expectation zero. Hence

μ_X(t) = μ_X(0) + ∫_0^t (c_1(s)μ_X(s) + c_2(s))ds.

2. This corresponds to the general linear differential equation

μ′_X(t) = c_1(t)μ_X(t) + c_2(t).

3. The variance function can be obtained similarly.
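The mean function can thus be computed by ordinary deterministic numerical integration, with no path simulation at all. A minimal sketch (constant coefficients of my own choosing) Euler-integrates μ′ = c_1 μ + c_2 and compares against the exact solution:

```python
import math

# The mean mu_X(t) of the general linear SDE solves the deterministic ODE
# mu'(t) = c1(t) mu(t) + c2(t).  For constant c1, c2 the exact solution is
# mu(t) = mu(0) e^{c1 t} + (c2 / c1)(e^{c1 t} - 1); a simple Euler
# integration of the ODE should reproduce it (made-up constants).
c1, c2, mu0, T, n = 0.5, 0.2, 1.0, 1.0, 100_000

dt = T / n
mu = mu0
for _ in range(n):
    mu += (c1 * mu + c2) * dt          # Euler step for mu' = c1*mu + c2

exact = mu0 * math.exp(c1 * T) + (c2 / c1) * (math.exp(c1 * T) - 1.0)
```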
Outline
The Black-Scholes Option Pricing Formula
Change of Measures
Extensions and Limitations of the Model
A Short Excursion into Finance
Let X_t denote the price of a risky asset (let's call it a stock) at time t.
Assume that the relative return from the asset over the period [t, t + dt] has a linear trend c dt which is disturbed by a stochastic noise term σdB_t:

(X_{t+dt} − X_t)/X_t = c dt + σdB_t,  or  dX_t = cX_t dt + σX_t dB_t.

The constant c > 0 is the so-called mean rate of return, and σ > 0 is the volatility.
Observe that this is a crude, first-order approximation to a real price process. But people in economics believe in exponential growth, and they are often happy with this model.
Trading Strategy

Assume that you also have a non-risky asset such as a bank account, which can be called a bond. Let β_t denote the bond value at time t. Say your initial capital is β_0, which will be continuously compounded at a constant interest rate r > 0. That is,

dβ_t = rβ_t dt,  or  β_t = β_0 e^{rt}.

Note again that this is an idealization, since the interest rate changes over time as well.
If you have a_t shares of stock and b_t shares of bond at time t, then your portfolio at time t can be represented by (a_t, b_t), t ∈ [0, T], which is called a trading strategy.
You want to adjust your strategy according to the information available to you at time t, so as to maximize your wealth V_t = a_t X_t + b_t β_t (the value of your portfolio) at time t. So it is reasonable to assume that a_t and b_t are stochastic processes adapted to Brownian motion B.
Self-Financing Condition

a_t and b_t can be positive or negative. A negative value of a_t means a short sale of stock (i.e., you sell the stock at time t). A negative value of b_t means that you borrow money at the bond's riskless interest rate r.
For simplicity we neglect transaction costs for operations on stock and bond.
Assume that you spend no money on other purposes (such as food), i.e., you do not make your portfolio smaller by consumption.
We assume finally that your trading strategy (a_t, b_t) is self-financing. That is, the increments of your wealth V_t result only from changes of the prices X_t and β_t of your assets:

dV_t = a_t dX_t + b_t dβ_t = (ca_t X_t + rb_t β_t)dt + σa_t X_t dB_t,

V_t = V_0 + ∫_0^t (ca_s X_s + rb_s β_s)ds + ∫_0^t σa_s X_s dB_s.
Option

An option at time t = 0 is a "ticket" which entitles you to buy one share of stock until or at time T, the time of maturity or time of expiration of the option.
If you can exercise this option (or exercise the call) at a fixed price K, called the exercise price or strike price of the option, only at the time of maturity T, it is called a European call option. If you can exercise it until or at time T, it is called an American call option. There are many other kinds.
The purchaser of a European call option is entitled to a payment of

(X_T − K)^+ = max(0, X_T − K).

We illustrate option pricing using European call options.
A put is an option to sell stock at a strike price K on or until a particular date of maturity T. A European put option is exercised only at the time of maturity, with profit (K − X_T)^+, and an American put can be exercised until or at time T.
Option Pricing
Since you do not know the price X_T at time t = 0 when you purchase the call, a natural question arises:

How much would you be willing to pay for such a ticket, i.e., what is a rational price for this option at time t = 0?

Black, Scholes and Merton responded as follows:
1. After investing this rational value of money in stock and bond at time t = 0, you can manage your portfolio according to a self-financing strategy so as to yield the same payoff (X_T − K)^+ as if the option had been purchased.
2. If the option were offered at any price other than this rational value, there would be an opportunity for arbitrage, i.e., for unbounded profits without an accompanying risk of loss.
Hedging Against the Contingent Claim

Goal: Find a self-financing strategy (a_t, b_t) and a wealth process V_t such that

V_t = a_t X_t + b_t β_t = u(T − t, X_t),  t ∈ [0, T],

for some smooth (a technical assumption) deterministic function u(t, x) with the terminal condition

V_T = u(0, X_T) = (X_T − K)^+.

That is, we hedge against the contingent claim (X_T − K)^+.
1. Apply the Itô lemma to the wealth process V_t = u(T − t, X_t) to obtain an integral representation.
2. Plug b_t = (V_t − a_t X_t)/β_t into the self-financing condition to obtain another integral representation.
3. From these two integral representations, derive a PDE with the condition u(0, x) = (x − K)^+, x > 0.
Black-Scholes PDE

We obtain (the first argument of u being time to maturity)

0.5σ²x²u_22(t, x) + rxu_2(t, x) − u_1(t, x) − ru(t, x) = 0

with boundary conditions

u(t, 0) = 0,  lim_{x→∞} u(t, x)/x = 1  ∀t ∈ [0, T];  u(0, x) = (x − K)^+  ∀x ≥ 0.

Transform the equation into a diffusion equation by using θ = T − t, y = log(x/K) + (r − σ²/2)θ, w(θ, y) = e^{rθ}u(t, x).
We arrive at a heat equation

∂w/∂θ = (σ²/2) ∂²w/∂y²

with the initial condition w(0, y) = K(e^y − 1)^+.
Using the heat kernel, we have

w(θ, y) = (2πσ²θ)^{−1/2}∫_{−∞}^{∞} w(0, z)e^{−(y−z)²/(2σ²θ)}dz.
Black-Scholes-Merton Approach

The explicit solution can be simplified as

u(t, x) = xΦ(g(t, x)) − Ke^{−rt}Φ(h(t, x)),

where Φ is the standard normal distribution function, and

g(t, x) = [ln(x/K) + (r + 0.5σ²)t]/(σt^{1/2}),  h(t, x) = g(t, x) − σt^{1/2}.

Black-Scholes Option Pricing Formula
A rational price at time t = 0 for a European call option with exercise price K is

V_0 = X_0Φ(g(T, X_0)) − Ke^{−rT}Φ(h(T, X_0)).

The stochastic process V_t = u(T − t, X_t) is the value of your self-financing portfolio with trading strategy

a_t = u_2(T − t, X_t) > 0,  b_t = [u(T − t, X_t) − a_t X_t]/β_t.
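The pricing formula is straightforward to implement: Φ can be expressed through the error function as Φ(x) = (1 + erf(x/√2))/2. A sketch of my own (the parameter values are purely illustrative):

```python
import math

def norm_cdf(x):
    """Standard normal distribution function Phi via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(x, k, r, sigma, t):
    """Rational price x*Phi(g) - K*e^{-rt}*Phi(h) of a European call,
    where t is the time to maturity."""
    g = (math.log(x / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    h = g - sigma * math.sqrt(t)
    return x * norm_cdf(g) - k * math.exp(-r * t) * norm_cdf(h)

price = black_scholes_call(x=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
```

For X_0 = K = 100, r = 0.05, σ = 0.2 and T = 1 this gives a call price of about 10.45, a standard reference value.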
The Radon-Nikodym Theorem

Consider two measures μ and ν defined on a σ-field F on Ω. μ is said to be absolutely continuous with respect to ν (denoted by μ ≪ ν) if

ν(A) = 0 implies μ(A) = 0,  ∀A ∈ F.

We say that μ and ν are equivalent measures if μ ≪ ν and ν ≪ μ.

Theorem
Assume μ and ν are two σ-finite measures. Then μ ≪ ν holds if and only if there exists a non-negative measurable function f such that

μ(A) = ∫_A f(ω)dν(ω),  ∀A ∈ F.

Moreover, f is almost everywhere unique with respect to ν. The function f is called the (relative) density of μ with respect to ν, denoted by f = dμ/dν.
Girsanov's Theorem

Let B = (B_t, t ≥ 0) be standard Brownian motion on the probability space (Ω, F, P), and F_t = σ(B_s, s ≤ t) the Brownian filtration. Consider

B̃_t = B_t + qt,  t ∈ [0, T], for some constant q.

Although B̃ is not a standard Brownian motion under P for q ≠ 0, B̃ can be shown to be a standard Brownian motion under a new probability measure Q.

Girsanov-Cameron-Martin Theorem
1. The stochastic process

M_t = exp{−qB_t − (1/2)q²t},  t ∈ [0, T],

is a martingale with respect to the natural Brownian filtration under the probability measure P.
Eliminating the Drift Term

Girsanov-Cameron-Martin Theorem
2. Q(A) = ∫_A M_T(ω)dP(ω), A ∈ F, defines a probability measure Q (called an equivalent martingale measure) on F that is equivalent to P.
3. Under the probability measure Q, the process B̃ is a standard Brownian motion.
4. The process B̃ is adapted to the filtration F_t.

Consider the linear stochastic differential equation

dX_t = cX_t dt + σX_t dB_t,  t ∈ [0, T].

With a linear drift term, X is not a martingale under P.
Define B̃_t = B_t + (c/σ)t, and we have

dX_t = σX_t d(B_t + (c/σ)t) = σX_t dB̃_t,  t ∈ [0, T].

B̃ is a standard Brownian motion under the equivalent martingale measure Q, and thus X is a martingale under Q.
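The theorem can be checked by simulation: a Q-expectation E_Q[f(B̃_T)] equals the P-expectation E_P[M_T f(B̃_T)]. The sketch below (my own, with an arbitrary q) verifies that E_P[M_T] = 1 and that B̃_T = B_T + qT has mean 0 and variance T under Q.

```python
import math
import random

# Monte Carlo check of the Girsanov-Cameron-Martin theorem: with
# M_T = exp(-q*B_T - q^2*T/2), E_P[M_T] = 1, and under Q (i.e. weighting
# by M_T) the shifted process B~_T = B_T + q*T has mean 0 and variance T.
rng = random.Random(11)
q, T, n = 0.8, 1.0, 400_000

w_sum = wb_sum = wb2_sum = 0.0
for _ in range(n):
    bT = rng.gauss(0.0, math.sqrt(T))
    m = math.exp(-q * bT - 0.5 * q * q * T)   # density dQ/dP on F_T
    bt_tilde = bT + q * T
    w_sum += m
    wb_sum += m * bt_tilde
    wb2_sum += m * bt_tilde**2

mean_m = w_sum / n                 # ~ 1 (E_P[M_T])
q_mean = wb_sum / n                # ~ 0 (E_Q[B~_T])
q_var = wb2_sum / n - q_mean**2    # ~ T (var_Q(B~_T))
```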
Significance of the Change-of-Measure Trick
If we had known the solution only for the case without a linear drift, we could have derived the solution for the case with a linear drift via the change of measure.
More significantly, X is a martingale under the equivalent martingale measure Q, and one can make use of the martingale property to prove various results about X.
In fact, this is not just a technical trick; as we demonstrate below, the change of measure provides an effective method to incorporate uncertainty and to hedge against contingent claims.
Recap: The Black-Scholes Model
The price of one share of the risky asset (stock) is described by
dXt = cXtdt + σXtdBt , t ∈ [0,T ].
The price of the riskless asset (bond) is described by
dβ_t = rβ_t dt,  t ∈ [0, T].

Portfolio = (a_t, b_t), with value V_t = a_t X_t + b_t β_t at time t.
The portfolio is self-financing: dV_t = a_t dX_t + b_t dβ_t, t ∈ [0, T].
At the time of maturity, V_T = h(X_T), where h(X_t) is the contingent claim at time t. For a European call option, h(x) = (x − K)^+, and for a European put option, h(x) = (K − x)^+.
Pricing via the Change-of-Measure

Your gain from the option at the time of maturity is h(X_T). To determine the value of this amount of money at t = 0, you have to discount it with the given interest rate r: e^{−rT}h(X_T), and take the expectation of it as the price for the option at t = 0.
You also have to discount the price of one share of stock: X̃_t = e^{−rt}X_t, t ∈ [0, T], and the Itô lemma leads to dX̃_t = σX̃_t dB̃_t, where B̃_t = B_t + ((c − r)/σ)t.
There exists an equivalent martingale measure Q which turns B̃ into a standard Brownian motion, and

X̃_t = X_0 e^{−0.5σ²t+σB̃_t}

becomes a martingale with respect to the natural Brownian filtration under Q.
The value of the portfolio at time t is given by V_t = E_Q[e^{−r(T−t)}h(X_T)|F_t], t ∈ [0, T].
At time t = 0, V_0 = E_Q[e^{−rT}h(X_T)] is a rational price of the option.
The Value of a European Option

Write θ = T − t for t ∈ [0, T].
Since X_t = X_0 e^{(r−0.5σ²)t+σB̃_t}, we have

V_t = E_Q[e^{−rθ}h(X_t e^{(r−0.5σ²)θ+σ(B̃_T−B̃_t)})|F_t].

Since σ(X_t) ⊆ F_t, X_t can be treated as a constant under F_t.
Under Q, B̃_T − B̃_t ∼ N(0, θ), and is independent of F_t.
Thus V_t = f(t, X_t), where

f(t, x) = e^{−rθ}∫_{−∞}^{∞} h(xe^{(r−0.5σ²)θ+σyθ^{1/2}})dΦ(y).

For a European call option, h(x) = (x − K)^+, and thus

f(t, x) = xΦ(z_1) − Ke^{−rθ}Φ(z_2),

z_1 = [ln(x/K) + (r + 0.5σ²)θ]/(σθ^{1/2}),  z_2 = z_1 − σθ^{1/2}.

For a European put option, f(t, x) = Ke^{−rθ}Φ(−z_2) − xΦ(−z_1).
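The risk-neutral pricing recipe can also be implemented directly by Monte Carlo: sample B̃_T under Q, form X_T, and average the discounted payoff. The sketch below (my own, with illustrative parameters) compares the simulated V_0 with the closed-form f(0, X_0).

```python
import math
import random

# Monte Carlo valuation at t = 0 under the martingale measure Q:
# V_0 = E_Q[e^{-rT}(X_T - K)^+] with X_T = X_0 e^{(r-0.5 sigma^2)T + sigma B~_T}.
# The result is compared with the closed-form call price f(0, X_0).
rng = random.Random(2024)
x0, K, r, sigma, T, n = 100.0, 100.0, 0.05, 0.2, 1.0, 400_000

disc = math.exp(-r * T)
total = 0.0
for _ in range(n):
    bT = rng.gauss(0.0, math.sqrt(T))                 # B~_T under Q
    xT = x0 * math.exp((r - 0.5 * sigma**2) * T + sigma * bT)
    total += disc * max(xT - K, 0.0)
mc_price = total / n

Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
z1 = (math.log(x0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
z2 = z1 - sigma * math.sqrt(T)
bs_price = x0 * Phi(z1) - K * disc * Phi(z2)          # about 10.45
```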
Extensions and Limitations of the Model

The Black-Scholes model can be extended to variable (but deterministic) rates and volatilities.
The model may also be used to value European-style options on instruments paying dividends, and closed-form solutions are available if the dividend is a known proportion of the stock price.
The model underestimates extreme moves, which yields tail risk.
In reality, security prices do not follow a strict stationary log-normal process, nor is the risk-free interest rate actually known (and it is not constant over time).
The variance has been observed to be non-constant, leading to models such as GARCH to model volatility changes.
Pricing discrepancies between empirical prices and the Black-Scholes model have long been observed in options corresponding to extreme price changes; such events would be very rare if returns were log-normally distributed, but are observed much more often in practice.
Historical Notes
A sociologist investigating the behavior of the probability community during the early 1990s would surely report an interesting phenomenon. Many of the best minds of this (or any other) generation began concentrating their research in the area of mathematical finance. The main reason for this can be summed up in two words: option pricing. (D. Applebaum, 2004)
The Black-Scholes model is widely employed as a useful approximation, but proper application requires understanding its limitations.
The limitations and defects of the model have led many probabilists to query it.
Lévy Matters

The heavy tails of stock prices, which are incompatible with a Gaussian model, suggest that it might be fruitful to replace Brownian motion with a more general Lévy process.
A Lévy process L = (L_t, t ≥ 0) has independent and stationary increments and is stochastically continuous, i.e., lim_{t→s} P(|L_t − L_s| > ε) = 0 for any ε > 0.
Examples: Brownian motion, the Poisson process, compound Poisson processes and their "combinations".
The Lévy-Itô decomposition for a one-dimensional Lévy process:

L_t = bt + B_t + ∫_{|x|<1} x(N(t, dx) − tν(dx)) + ∫_{|x|≥1} xN(t, dx),

where N is a Poisson random measure and ν is the Lévy measure.
The small-jumps term ∫_{|x|<1} x(N(t, dx) − tν(dx)) describes the day-to-day jitter that causes minor fluctuations in stock prices, while the big-jumps term ∫_{|x|≥1} xN(t, dx) describes large stock price movements caused by major market upsets arising from, e.g., earthquakes or terrorist atrocities.
Outline
More on Change of Measures
The Feynman-Kac Formula
Construction of Risk-Neutral and Distorted Measures
The World is Incomplete.
Risk-Neutral Measure

A risk-neutral measure is a probability measure under which the underlying risky asset has the same expected return as the riskless bond (or money market account).
We often demand more for bearing uncertainty. To price assets, the calculated values need to be adjusted for the risk involved.
One way of doing this is to first take the expectation under the physical distribution and then adjust for risk.
A better way is to first adjust the probabilities of future outcomes by incorporating the effects of risk, and then take the expectation under those adjusted, 'virtual' risk-neutral probabilities.

Definition
A risk-neutral measure is a probability measure under which the current value of all financial assets at time t is equal to the expected future payoff of the asset discounted at the risk-free rate, given the information structure available at time t.
Complete Market
The existence of a risk-neutral measure involves the absence of arbitrage in a complete market.
A market is complete with respect to a trading strategy if all cash flows for the trading strategy can be replicated by a similar synthetic trading strategy.
For example, consider the put-call parity: a put is synthesized by buying the call, investing the strike at the risk-free rate, and shorting the stock.
If at some time before maturity the two differ, then someone could purchase the cheaper portfolio and immediately sell the more expensive one to make a riskless profit (since they have the same value at maturity).
In insurance markets, a complete market models the situation in which agents can buy insurance contracts to protect themselves against any future time and state of the world.
Fundamental Theorem of Arbitrage-Free Pricing

Consider a finite-state market.
1. There is no arbitrage if and only if there exists a risk-neutral measure that is equivalent to the physical probability measure.
2. In the absence of arbitrage, a market is complete if and only if there is a unique risk-neutral measure that is equivalent to the physical probability measure.

Let B = (B_t, t ≥ 0) denote standard Brownian motion and F_t the natural filtration generated by B. When the risky asset price is driven by a single Brownian motion, there is a unique risk-neutral measure Q.

Harrison-Pliska Theorem
If (r_t, t ≥ 0) is the short rate process driven by Brownian motion, and V_t is any F_t-adapted contingent claim payable at time t, then its value at time t ≤ T is given by

V_t = E_Q(e^{−∫_t^T r_u du}V_T|F_t).

The result can be extended to the case when the asset price is driven by a semi-martingale (see Delbaen and Schachermayer 1994).
A PDE Connection

Consider a parabolic partial differential equation

∂u/∂t + μ(t, x)∂u/∂x + (1/2)σ²(t, x)∂²u/∂x² = r(x)u(t, x),  x ≥ 0, t ∈ [0, T],

subject to the terminal condition u(T, x) = h(x).
The functions μ, σ, h and r are known functions, and T is a parameter.
It turns out that the solution can be expressed as a conditional expectation with respect to an Itô process starting at x:

dX_t = μ(t, X_t)dt + σ(t, X_t)dB_t,  X_0 = x.

The Feynman-Kac Formula

u(t, x) = E(e^{−∫_t^T r(X_s)ds}h(X_T)|X_t = x).

Example: For the Black-Scholes PDE of the European call option, μ(t, x) = rx, σ(t, x) = σx, r(t, x) = r and h(x) = (x − K)^+.
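The formula is easy to test in the simplest case μ = 0, σ = 1, r = 0, where X is Brownian motion and the PDE reduces to the backward heat equation u_t + 0.5u_xx = 0. For h(x) = x² the conditional expectation is known exactly: u(t, x) = x² + (T − t), which indeed satisfies that equation. A sketch of my own:

```python
import math
import random

# Feynman-Kac sanity check for mu = 0, sigma = 1, r = 0:
# u(t, x) = E[h(X_T) | X_t = x] with X Brownian motion.  For h(x) = x^2,
# E[(x + B_{T-t})^2] = x^2 + (T - t) exactly.
rng = random.Random(5)
x, t, T, n = 1.5, 0.0, 1.0, 400_000

total = 0.0
for _ in range(n):
    xT = x + rng.gauss(0.0, math.sqrt(T - t))   # X_T given X_t = x
    total += xT * xT
mc_u = total / n

exact_u = x * x + (T - t)    # = 3.25
```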
Exponential Martingales

Positive martingales play a central role in changing probability measures. Since a necessary condition for an Itô process to be a martingale is that its drift term vanishes, many continuous positive martingales used in option pricing have an exponential form in connection with Itô processes.
As usual, let X denote a solution of an Itô SDE dX_t = μ(t, X_t)dt + σ(t, X_t)dB_t.
Consider

M_t = exp{∫_0^t b_s σ(s, X_s)dB_s − (1/2)∫_0^t b_s²σ²(s, X_s)ds},

where (b_t, t ≥ 0) is an F_t-adapted stochastic process.

Novikov's Condition
The process M_t is a martingale with respect to F_t for any process b_t satisfying Novikov's condition E(exp{(1/2)∫_0^T b_s²σ²(s, X_s)ds}) < ∞.

Example: In Girsanov's Theorem, M_t = exp{−qB_t − (1/2)q²t} is a martingale.
Itô Integral Representation

Let B = (B_t, t ≥ 0) be standard Brownian motion on the probability space (Ω, F, P), and F_t = σ(B_s, s ≤ t) the Brownian filtration.
Consider an Itô process dX_t = μ(t, X_t)dt + σ(t, X_t)dB_t.
If μ = 0, X_t = X_0 + ∫_0^t σ(s, X_s)dB_s becomes a martingale with respect to F_t. Conversely, such an integral representation holds for any square integrable martingale.

Martingale Representation Theorem
If a martingale (M_t, t ≥ 0) with respect to F_t satisfies E(M_t²) < ∞ for any t ≥ 0, then there exists a unique F_t-adapted stochastic process σ_M(t) with E(σ_M²(t)) < ∞ (called the volatility process), such that

M_t = M_0 + ∫_0^t σ_M(s)dB_s.

Example: Let X be a random variable on the probability space (Ω, F_T, P) with EX² < ∞. Then X = E(X|F_T) = E(X) + ∫_0^T σ_X(s)dB_s.
Adjusted Measure: A Fundamental Idea of Distortion

We may want to price in uncertainty by adjusting the probability measure under which our expectation is taken.
For a given Itô process, this means adjusting the probability of each path of the process so that the Itô process under the new probabilities has a specific drift.
For pricing an option or a contingent claim, this often requires finding an equivalent probability measure Q under which the underlying asset price process has the same stochastic return as that of the money market account (i.e., risk-neutral) or a process of our choice (e.g., a long-term zero-coupon bond).
The Radon-Nikodym derivative dQ/dP of the adjusted measure Q with respect to the physical measure P can be viewed as a distortion factor for P that incorporates uncertainty. This distortion factor often takes an exponential form.
Example: In Girsanov's Theorem, dQ/dP = exp{−qB_T − (1/2)q²T} is a distortion factor.
An Extension of Girsanov's Theorem

Let B = (B_t, t ≥ 0) be standard Brownian motion on the probability space (Ω, F, P), and F_t = σ(B_s, s ≤ t) the Brownian filtration.
Let (b_t, t ≥ 0) denote an F_t-adapted stochastic process satisfying Novikov's condition.
Define a new probability measure Q(A) = ∫_A M_T dP, where M_t = exp{∫_0^t b_s dB_s − (1/2)∫_0^t b_s²ds}, t ∈ [0, T], is an exponential martingale with respect to F_t. Clearly, Q and P are equivalent.
The stochastic process B̃_t = −∫_0^t b_s ds + B_t, t ∈ [0, T], is standard Brownian motion under the probability measure Q.
Note that −∫_0^t b_s ds + B_t represents a stochastic process with a predetermined drift −∫_0^t b_s ds under P. To make the drift disappear, we adjust the probability of each path by multiplying by the distortion factor M_T.
Consider an Itô SDE dX_t = μ(t, X_t)dt + σ(t, X_t)dB_t under the probability measure P. It has the new drift μ(t, X_t) + σ(t, X_t)b_t under the distorted probability measure Q.
Adjust for a Specified Drift

Pricing a contingent claim often requires us to find a probability measure under which the underlying risky asset has a specified drift.
Consider an Itô SDE dX_t = μ(t, X_t)dt + σ(t, X_t)dB_t under the probability measure P.
Let μ′(t, x) be a continuous function such that

(μ′(t, x) − μ(t, x))/σ(t, x)

satisfies Novikov's condition.
Construct a new probability measure Q with the Radon-Nikodym derivative

dQ/dP = exp{∫_0^T b_s dB_s − (1/2)∫_0^T b_s²ds},  b_t = (μ′(t, X_t) − μ(t, X_t))/σ(t, X_t).

Under Q, X is a solution of the SDE dX_t = μ′(t, X_t)dt + σ(t, X_t)dB̃_t, where B̃_t is standard Brownian motion under Q.
Relation Between Bond Price and Short Rate

Consider a continuously trading bond market over [0, T].
Let P(t, s), 0 ≤ t ≤ s ≤ T, be the price of a default-free zero coupon bond at time t that pays one monetary unit at maturity s. Let P_t be the σ-field generated by the bond prices P(t, s).
The forward rate, compounded continuously for time s and determined at time t, is defined as f(t, s) = −∂ ln P(t, s)/∂s.
The short rate (i.e., the instantaneous interest rate) at time t is defined as r_t = f(t, t).
To ensure no arbitrage in the bond market, there exists a risk-neutral measure Q such that for all s ≥ 0, the discounted process

V(t, s) = e^{−∫_0^t r_u du}P(t, s),  0 ≤ t ≤ s,

is a martingale with respect to P_t.
Thus V(t, s) = E_Q(V(s, s)|P_t), which leads to P(t, s) = E_Q(e^{−∫_t^s r_u du}|P_t).
Hull-White (Extended Vasicek) Interest Rate Model

Assume that the short rate r_t follows the SDE

dr_t = κ(θ(t) − r_t)dt + σdB_t

under the risk-neutral measure Q, where the mean-reverting intensity κ is a positive constant and the long-run average θ(t) is a deterministic function.
Solving it, the short rate (Markov) process is given by

r_t = r_0 e^{−κt} + κ∫_0^t e^{−κ(t−u)}θ(u)du + σ∫_0^t e^{−κ(t−u)}dB_u.

P_t = F_t, the natural Brownian filtration.
The bond price P(t, s) can then be solved, and more generally, for any F_t-contingent claim C(s) payable at time s, its price C(t) at time t is given by C(t) = E_Q(e^{−∫_t^s r_u du}C(s)|F_t).
The closed-form expression for P(t, s) is given by the so-called affine form P(t, s) = e^{A(t,s)−B(s−t)r_t}, where A and B are explicit, deterministic and independent of the short rate.
The One-Factor Gaussian Forward Rate Model
Assume that under a risk-neutral probability measure Q, the forward rate is governed by the SDE

df(t, s) = µ(t, s)dt + σ(t, s)dB_t, 0 ≤ t ≤ s,

where the deterministic function µ(t, s) is the term structure of the forward rate drifts and the deterministic function σ(t, s) is the term structure of the forward rate volatilities.
The forward rate processes are Gaussian.
Since the discounted bond price V(t, s) is a martingale under the risk-neutral measure, we have µ(t, s) = σ(t, s) ∫_t^s σ(t, y)dy. So the term structures are uniquely determined.
Using the Itô lemma, the bond price satisfies the SDE

dP(t, s) = r_t P(t, s)dt − [∫_t^s σ(t, y)dy] P(t, s)dB_t.

This linear homogeneous equation with multiplicative noise can be solved using the standard method.
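The no-arbitrage drift restriction µ(t, s) = σ(t, s) ∫_t^s σ(t, y)dy is easy to evaluate numerically for any chosen volatility term structure. A minimal sketch (illustrative and not from the lecture; a constant volatility is assumed, for which the drift reduces to σ²(s − t)):

```python
import numpy as np

def hjm_drift(sigma_fn, t, s, n=10_001):
    """No-arbitrage drift mu(t,s) = sigma(t,s) * integral_t^s sigma(t,y) dy,
    with the inner integral evaluated by the trapezoidal rule."""
    y = np.linspace(t, s, n)
    vals = np.asarray(sigma_fn(t, y), dtype=float)
    dy = (s - t) / (n - 1)
    integral = dy * (vals.sum() - 0.5 * (vals[0] + vals[-1]))
    return float(sigma_fn(t, s)) * integral

# Constant volatility sigma(t,y) = sig: the drift reduces to sig^2 * (s - t).
sig = 0.015
const_vol = lambda t, y: sig * np.ones_like(np.asarray(y, dtype=float))
mu = hjm_drift(const_vol, t=1.0, s=4.0)
print(mu, sig**2 * (4.0 - 1.0))
```

Any deterministic σ(t, s) can be plugged in; the drift is then fully determined, as the slide states.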
From Risk-Neutral to Forward Risk

Consider again the one-factor Gaussian forward rate model:

df(t, s) = µ(t, s)dt + σ(t, s)dB_t, 0 ≤ t ≤ s,

with µ(t, s) = σ(t, s) ∫_t^s σ(t, y)dy under the risk-neutral measure Q.
For any F_t-adapted contingent claim C(s), payable at time s, its price C(t), t ≤ s, can be obtained via expectation under Q.
Calculating the risk-neutral expectation is sometimes difficult in practice, because the joint distribution of e^{−∫_t^s r_u du} and C(s) under Q needs to be identified.
Rewrite: df(t, s) = σ(t, s)[∫_t^s σ(t, y)dy dt + dB_t], 0 ≤ t ≤ s.
Consider B^s_t = ∫_0^t b(u, s)du + B_t, where b(u, s) = ∫_u^s σ(u, y)dy (the negative of the bond price volatility), so that df(t, s) = σ(t, s)dB^s_t.
Forward Risk Adjusted Measure
Girsanov's Theorem implies that there is a probability measure Q^s, called the forward risk adjusted measure, such that B^s_t, 0 ≤ t ≤ s, is standard Brownian motion under Q^s.
Under Q^s, df(t, s) = σ(t, s)dB^s_t, 0 ≤ t ≤ s, so f(t, s) becomes a martingale.
For any F_t-contingent claim C(s) payable at time s, its discounted price e^{−∫_0^t r_u du} C(t) is a martingale under Q.
It follows from the Martingale Representation Theorem that d(e^{−∫_0^t r_u du} C(t)) = σ_C(t)dB_t for some volatility process σ_C(t).
This martingale representation, the SDE for P(t, s), and the Itô lemma imply that C(t)/P(t, s), 0 ≤ t ≤ s, is a martingale under the forward risk adjusted measure Q^s.
Hence C(t) = P(t, s) E_{Q^s}(C(s) | F_t). That is, the discount factor e^{−∫_t^s r_u du} is separated from the contingent claim payoff under Q^s.
This is useful for pension valuation, for which one often needs to evaluate the expected cash flow from a fixed income portfolio and then discount it using a yield curve.
Bond Option Pricing
Consider European call options on the zero-coupon bond P(t, T) with strike price K and maturity s, t ≤ s ≤ T.
The payoff of the option is (P(s, T) − K)+.
The forward rate f(t, s) follows the one-factor Gaussian model.
The process P(t, T)/P(t, s) is a martingale under the forward risk adjusted measure Q^s, and satisfies

d(P(t, T)/P(t, s)) = −(P(t, T)/P(t, s)) [∫_s^T σ(t, y)dy] dB^s_t.

Hence P(s, T) = P(s, T)/P(s, s) has a lognormal distribution under Q^s.
The price of the call option can then be calculated using φ_c(t) = P(t, s) E_{Q^s}((P(s, T) − K)+ | F_t).
The corresponding put price φ_p(t) = P(t, s) E_{Q^s}((K − P(s, T))+ | F_t) may be obtained by put-call parity.
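Because P(s, T)/P(s, s) is lognormal under Q^s, the call expectation has a Black–Scholes-type closed form. The sketch below is illustrative (the numbers are arbitrary; v denotes the assumed integrated volatility of the forward bond price F = P(t, T)/P(t, s) over [t, s]) and checks put-call parity, φ_c − φ_p = P(t, T) − K·P(t, s):

```python
from math import log, sqrt, erf

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bond_call_put(P_ts, P_tT, K, v):
    """Black-type call/put prices on P(s,T) with strike K and option maturity s.
    F = P(t,T)/P(t,s) is the forward bond price, v its integrated volatility."""
    F = P_tT / P_ts
    d1 = (log(F / K) + 0.5 * v * v) / v
    d2 = d1 - v
    call = P_ts * (F * Phi(d1) - K * Phi(d2))
    put = P_ts * (K * Phi(-d2) - F * Phi(-d1))
    return call, put

call, put = bond_call_put(P_ts=0.95, P_tT=0.90, K=0.94, v=0.02)
# put-call parity: call - put = P(t,T) - K * P(t,s)
print(call, put, call - put, 0.90 - 0.94 * 0.95)
```

The parity relation holds exactly in this lognormal setting, which is a useful internal consistency check.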
Market is Incomplete
If stock prices are modelled by Lévy processes, then a problem arising in non-Gaussian option pricing is that the market is incomplete.
That is, there may be more than one possible pricing formula. This is clearly undesirable, and a number of selection principles, such as entropy minimization, have been employed to overcome this problem.
Outline
Numerical Solutions
References
Numerical Solution of Stochastic Differential Equations
SDEs that admit an explicit solution are rare exceptions. Therefore, numerical techniques for approximating the solution of an SDE are often called for.
One purpose is to visualize a variety of sample paths of the solution. A collection of such paths is called a scenario, which can be used for some kind of “prediction” of the stochastic process at future instants of time.
A second objective is to achieve reasonable approximations to the distributional quantities (expectations, variances, covariances and higher-order moments) of the solution to an SDE.
Only in a few cases is one able to give explicit formulas for these quantities, and even then they frequently involve special functions which have to be approximated numerically.
Numerical solutions allow us to simulate as many sample paths as we want; they constitute the basis for Monte Carlo techniques to obtain the distributional characteristics and option prices.
The Euler Approximation Scheme
For illustration, consider the SDE

dX_t = µ(X_t)dt + σ(X_t)dB_t, t ∈ [0, T].

We assume that the coefficient functions µ(x) and σ(x) are Lipschitz continuous and that EX_0² < ∞, which guarantees the existence and uniqueness of a strong solution.
1 To approximate the solution, partition [0, T] as follows:

τ_n : 0 = t_0 < t_1 < · · · < t_{n−1} < t_n = T, with ∆_i = t_i − t_{i−1}, 1 ≤ i ≤ n,

and mesh(τ_n) = max_{1≤i≤n} ∆_i. Let ∆_i B = B_{t_i} − B_{t_{i−1}}, 1 ≤ i ≤ n.
2 Define recursively, for 1 ≤ i ≤ n,

X^{(n)}_{t_i} = X^{(n)}_{t_{i−1}} + µ(X^{(n)}_{t_{i−1}})∆_i + σ(X^{(n)}_{t_{i−1}})∆_i B,

with X^{(n)}_0 = X_0.
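The recursion translates directly into code. A minimal sketch (illustrative and not from the lecture; the Ornstein–Uhlenbeck coefficients in the example are arbitrary):

```python
import numpy as np

def euler_maruyama(mu, sigma, x0, T, n, rng):
    """One Euler path X^{(n)} on the equidistant grid t_i = iT/n."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(1, n + 1):
        dB = rng.standard_normal() * np.sqrt(dt)   # Delta_i B ~ N(0, dt)
        x[i] = x[i-1] + mu(x[i-1]) * dt + sigma(x[i-1]) * dB
    return x

# Ornstein-Uhlenbeck example: dX = -2X dt + 0.3 dB (illustrative parameters)
rng = np.random.default_rng(42)
path = euler_maruyama(mu=lambda x: -2.0 * x, sigma=lambda x: 0.3,
                      x0=1.0, T=1.0, n=500, rng=rng)
print(path[0], path[-1])
```

Averaging many such paths approximates E X_T; for this OU example the exact mean is e^{−2T} X_0.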
Idea: The First-Order Approximation
1 Consider, for 1 ≤ i ≤ n,

X_{t_i} = X_{t_{i−1}} + ∫_{t_{i−1}}^{t_i} µ(X_s)ds + ∫_{t_{i−1}}^{t_i} σ(X_s)dB_s.

2 The Euler approximation is based on a discretization of the integrals:

∫_{t_{i−1}}^{t_i} µ(X_s)ds ≈ µ(X_{t_{i−1}})∆_i,  ∫_{t_{i−1}}^{t_i} σ(X_s)dB_s ≈ σ(X_{t_{i−1}})∆_i B.

3 That is, for 1 ≤ i ≤ n,

X_{t_i} ≈ X_{t_{i−1}} + µ(X_{t_{i−1}})∆_i + σ(X_{t_{i−1}})∆_i B.

In practice one usually chooses equidistant points t_i, so that mesh(τ_n) = T/n, and

X^{(n)}_{iT/n} = X^{(n)}_{(i−1)T/n} + µ(X^{(n)}_{(i−1)T/n})∆_i + σ(X^{(n)}_{(i−1)T/n})∆_i B, 1 ≤ i ≤ n.
Strong Numerical Solution

Strong Marginal Convergence
1 The numerical solution X^{(n)} converges strongly to X with order γ > 0 if there exists a constant c > 0 such that

E|X_T − X^{(n)}_T| ≤ c · mesh(τ_n)^γ, ∀n ≥ 1.

2 X^{(n)} is a strong numerical solution of the SDE if

E|X_T − X^{(n)}_T| → 0, as mesh(τ_n) → 0.

One could use E sup_{0≤t≤T} |X_t − X^{(n)}_t| as a more appropriate criterion to describe the pathwise closeness of X and X^{(n)}, but this quantity is more difficult to deal with theoretically.

The Euler Approximation
The equidistant Euler approximation converges strongly with order 0.5.
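The order-0.5 behaviour can be checked empirically when the exact solution is known. A minimal sketch (illustrative parameters, assuming geometric Brownian motion) that drives the exact solution and the Euler scheme with the same Brownian increments and compares E|X_T − X^{(n)}_T| at two mesh sizes:

```python
import numpy as np

def euler_strong_error(n, n_paths, rng, mu=0.06, sig=0.3, x0=1.0, T=1.0):
    """Estimate E|X_T - X_T^{(n)}| for GBM dX = mu X dt + sig X dB,
    using the same Brownian increments for the exact solution and Euler."""
    dt = T / n
    dB = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
    x = np.full(n_paths, x0)
    for i in range(n):
        x = x + mu * x * dt + sig * x * dB[:, i]
    # exact GBM solution driven by the same Brownian path
    exact = x0 * np.exp((mu - 0.5 * sig**2) * T + sig * dB.sum(axis=1))
    return np.abs(exact - x).mean()

rng = np.random.default_rng(1)
e_coarse = euler_strong_error(n=16, n_paths=20_000, rng=rng)
e_fine = euler_strong_error(n=256, n_paths=20_000, rng=rng)
print(e_coarse, e_fine)   # error should shrink roughly like mesh^0.5
```

Refining the mesh by a factor of 16 should reduce the strong error by roughly a factor of 4, consistent with order 0.5.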
Weak Numerical Solution
In contrast to a strong numerical solution, a weak numerical solution aims at approximating the moments of the solution X. Let f be chosen from a class of smooth functions, e.g., certain polynomials or functions with a specific polynomial growth.

Weak Marginal Convergence
1 The numerical solution X^{(n)} converges weakly to X with order γ > 0 if there exists a constant c > 0 such that

|Ef(X_T) − Ef(X^{(n)}_T)| ≤ c · mesh(τ_n)^γ, ∀n ≥ 1.

2 X^{(n)} is a weak numerical solution of the SDE if

|Ef(X_T) − Ef(X^{(n)}_T)| → 0, as mesh(τ_n) → 0.

The Euler Approximation
The equidistant Euler approximation converges weakly with order 1.0 for a class of functions f with appropriate polynomial growth.
The Milstein Approximation Scheme
In contrast to the first-order approximation, the Milstein approximation exploits a so-called Taylor–Itô expansion that incorporates a higher-order approximation.
Heuristics: apply the Itô lemma to the integrands µ(X_s) and σ(X_s) at each discretization point t_{i−1}, and then estimate the higher-order terms using the fact that (dB_s)² = ds.
Taylor–Itô expansions involve multiple stochastic integrals. Their rigorous treatment requires a more advanced theory of stochastic calculus.

The Milstein Approximation
Define recursively, for 1 ≤ i ≤ n,

X^{(n)}_{t_i} = X^{(n)}_{t_{i−1}} + µ(X^{(n)}_{t_{i−1}})∆_i + σ(X^{(n)}_{t_{i−1}})∆_i B + (1/2) σ(X^{(n)}_{t_{i−1}}) σ′(X^{(n)}_{t_{i−1}}) [(∆_i B)² − ∆_i],

with X^{(n)}_0 = X_0.

The equidistant Milstein approximation converges strongly with order 1.0.
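The order improvement over Euler is visible numerically. A minimal sketch (illustrative, assuming geometric Brownian motion dX = µX dt + σX dB, for which σ(x)σ′(x) = σ²x and the exact solution is known):

```python
import numpy as np

def strong_errors_gbm(n, n_paths, rng, mu=0.06, sig=0.3, x0=1.0, T=1.0):
    """Compare Euler and Milstein strong errors E|X_T - X_T^{(n)}| for GBM,
    where sigma(x) = sig*x, so the Milstein correction is 0.5*sig^2*x*[(dB)^2 - dt]."""
    dt = T / n
    dB = rng.standard_normal((n_paths, n)) * np.sqrt(dt)
    xe = np.full(n_paths, x0)   # Euler iterates
    xm = np.full(n_paths, x0)   # Milstein iterates
    for i in range(n):
        b = dB[:, i]
        xe = xe + mu * xe * dt + sig * xe * b
        xm = (xm + mu * xm * dt + sig * xm * b
              + 0.5 * sig**2 * xm * (b**2 - dt))
    exact = x0 * np.exp((mu - 0.5 * sig**2) * T + sig * dB.sum(axis=1))
    return np.abs(exact - xe).mean(), np.abs(exact - xm).mean()

rng = np.random.default_rng(2)
err_euler, err_milstein = strong_errors_gbm(n=64, n_paths=20_000, rng=rng)
print(err_euler, err_milstein)   # Milstein should be noticeably more accurate
```

At the same mesh, the Milstein error should be markedly smaller than the Euler error, reflecting strong order 1.0 versus 0.5.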
Monte Carlo vs Numerical Methods
Once sample paths (or scenarios) of the solution of an Itô SDE are obtained, they can be used to estimate the distributional quantities (expectations, variances, covariances and higher-order moments) of the solution.
Since derivative prices are often written as expectations of underlying asset values, which are the solutions of SDEs, the Monte Carlo method becomes an essential tool in the pricing of derivative securities and in risk management.
Monte Carlo is generally not a competitive method for calculating univariate expectations. For example, the error in a trapezoidal rule for the integral of a d-dimensional twice continuously differentiable function is O(n^{−2/d}), in contrast to the standard error O(n^{−1/2}) of the Monte Carlo method for the same problem.
This performance degradation with increasing dimension is a characteristic of all deterministic integration methods, and thus Monte Carlo methods are attractive for evaluating integrals in high dimensions.
Illustrative Example: European Call Option

The price of one share of a risky asset (stock) is described by

dX_t = cX_t dt + σX_t dB_t, t ∈ [0, T].

The price of a riskless asset (bond) is described by dβ_t = rβ_t dt, t ∈ [0, T].
At the time of maturity T, V_T = (X_T − K)+.
Using the Fundamental Theorem of Arbitrage-Free Pricing, we have

C := V_0 = E(e^{−rT}(X_T − K)+),

with X_T = X_0 e^{(r − σ²/2)T + σB_T}.
Although this formula can be written explicitly in terms of the normal distribution (the Black–Scholes formula), we can also estimate C using the Monte Carlo method.
MC Estimate of European Call Options

Algorithm
for i = 1, . . . , n
    generate the standard normal Z_i
    set X_i(T) = X_0 e^{(r − σ²/2)T + σ√T Z_i}
    set C_i = e^{−rT}(X_i(T) − K)+
set C̄_n = (C_1 + · · · + C_n)/n.

The estimator C̄_n is unbiased and strongly consistent.
For finite but at least moderately large n, we can supplement the point estimate C̄_n with a (1 − α)100% confidence interval C̄_n ± t_{α/2,n−1} s_C/√n, where s_C is the sample standard deviation, and t_{α/2,n−1} is the upper 100(α/2)th percentage point of a t distribution with n − 1 degrees of freedom.
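The algorithm above can be sketched directly. This is an illustrative implementation (arbitrary parameter values; the normal quantile 1.96 is used in place of t_{α/2,n−1}, which is harmless for large n), with the Black–Scholes formula for comparison:

```python
import numpy as np
from math import log, sqrt, exp, erf

def mc_european_call(X0, K, r, sig, T, n, rng):
    """Monte Carlo price and 95% CI half-width for a European call."""
    Z = rng.standard_normal(n)
    XT = X0 * np.exp((r - 0.5 * sig**2) * T + sig * np.sqrt(T) * Z)
    C = np.exp(-r * T) * np.maximum(XT - K, 0.0)
    half = 1.96 * C.std(ddof=1) / np.sqrt(n)   # normal approx to the t quantile
    return C.mean(), half

def black_scholes_call(X0, K, r, sig, T):
    """Closed-form benchmark for the same expectation."""
    Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    d1 = (log(X0 / K) + (r + 0.5 * sig**2) * T) / (sig * sqrt(T))
    d2 = d1 - sig * sqrt(T)
    return X0 * Phi(d1) - K * exp(-r * T) * Phi(d2)

rng = np.random.default_rng(3)
est, half = mc_european_call(X0=100, K=100, r=0.05, sig=0.2, T=1.0,
                             n=200_000, rng=rng)
print(est, half, black_scholes_call(100, 100, 0.05, 0.2, 1.0))
```

The Monte Carlo estimate should land within a few confidence-interval half-widths of the closed-form price.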
Another Illustrative Example: Asian Options

Consider the payoff V_T = (X̄ − K)+, where X̄ = (∑_{j=1}^m X_{t_j})/m for a fixed set of dates 0 = t_0 < t_1 < · · · < t_m = T.
Again, the Fundamental Theorem of Arbitrage-Free Pricing implies that C := V_0 = E(e^{−rT}(X̄ − K)+), where

X_{t_{j+1}} = X_{t_j} e^{(r − σ²/2)(t_{j+1} − t_j) + σ√(t_{j+1} − t_j) Z_{j+1}}.

Algorithm
for i = 1, . . . , n
    for j = 1, . . . , m
        generate the standard normal Z_{ij}
        set X_i(j) = X_i(j − 1) e^{(r − σ²/2)(t_j − t_{j−1}) + σ√(t_j − t_{j−1}) Z_{ij}}
    set X̄_i = (X_i(1) + · · · + X_i(m))/m
    set C_i = e^{−rT}(X̄_i − K)+
set C̄_n = (C_1 + · · · + C_n)/n.
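The nested loop vectorizes naturally over paths. A minimal sketch (illustrative parameters and monitoring dates, not from the lecture):

```python
import numpy as np

def mc_asian_call(X0, K, r, sig, dates, n, rng):
    """Monte Carlo price of an arithmetic-average Asian call with
    monitoring dates 0 = t_0 < t_1 < ... < t_m = T."""
    t = np.asarray(dates, dtype=float)
    dt = np.diff(t)                      # t_j - t_{j-1}, j = 1..m
    m = dt.size
    Z = rng.standard_normal((n, m))
    # cumulative log-returns give X at each monitoring date t_1..t_m
    logX = np.log(X0) + np.cumsum((r - 0.5 * sig**2) * dt
                                  + sig * np.sqrt(dt) * Z, axis=1)
    avg = np.exp(logX).mean(axis=1)      # (X_{t_1} + ... + X_{t_m}) / m
    return np.exp(-r * t[-1]) * np.maximum(avg - K, 0.0).mean()

rng = np.random.default_rng(4)
dates = np.linspace(0.0, 1.0, 13)        # monthly averaging over one year
price = mc_asian_call(X0=100, K=100, r=0.05, sig=0.2,
                      dates=dates, n=100_000, rng=rng)
print(price)
```

Since the average has lower volatility than the terminal value, the Asian call price should come out below the corresponding European call price.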
Efficiency of Simulation Estimators
The estimator C̄_n from the above two examples is unbiased and asymptotically normal.
More precisely, let s denote our computational budget and τ the computational time needed for each C_i; then

√s [C̄_{⌊s/τ⌋} − C] →_d N(0, σ_C² τ),

as s → ∞. In comparing unbiased estimators, we should prefer the one for which σ_C² τ is smallest.
Bias frequently occurs in estimation via MC methods. For example, bias can arise from the following errors.
1 Model discretization error: for many models, exact sampling of the continuous-time dynamics is infeasible; some discretization approximation has to be used, resulting in a bias.
2 Payoff discretization error: discretization has to be used for payoffs that are functionals of the underlying asset processes.
3 Nonlinear functions of means: in a compound option, the price of the first option depends on the price of the second option, and so on, but these prices can only be estimated, resulting in a bias.
Some References and Further Reading
These lecture notes were written using the books “Elementary Stochastic Calculus” (World Scientific, 2002) by Thomas Mikosch and “Introductory Stochastic Analysis for Finance and Insurance” (Wiley, 2006) by Sheldon Lin.
A standard advanced textbook on Itô integrals: “Brownian Motion and Stochastic Calculus” (Springer, 1991) by I. Karatzas and S. E. Shreve.
Stochastic integrals and SDEs driven by Lévy processes: “Lévy Processes and Stochastic Calculus” (Cambridge, 2009) by D. Applebaum.
Stochastic finance: “Stochastic Calculus for Finance I, II” (Springer, 2004) by S. E. Shreve.
SDE applications in actuarial science: “Introductory Stochastic Analysis for Finance and Insurance” (Wiley, 2006) by Sheldon Lin, and “Stochastic Control in Insurance” (Springer, 2008) by H. Schmidli.
More References
Numerical analysis of SDEs: “Numerical Solution of Stochastic Differential Equations” (Springer, 1995) by P. Kloeden and E. Platen.
Monte Carlo simulation: “Monte Carlo Methods in Financial Engineering” (Springer, 2004) by Paul Glasserman.
Lévy matters: “Financial Modelling with Jump Processes” (Chapman & Hall, 2004) by Rama Cont and Peter Tankov.
Financial time series (GARCH, univariate and multivariate): “Statistics of Financial Markets” (Springer, 2008) by J. Franke, C. M. Hafner and W. K. Härdle.