
  • M5A44 COMPUTATIONAL STOCHASTIC PROCESSES

    Professor G.A. Pavliotis

    Department of Mathematics, Imperial College London, [email protected]

    WINTER TERM 2014-2015, 14/01/2015

  • Lectures: Mondays, 11:00-13:00, Wednesdays 12:00-13:00, Huxley 658.

    Office Hours: Mondays 14:00-15:00, Wednesdays 14:00-15:00 or by appointment, Huxley 621.

    Course webpage: http://www.ma.imperial.ac.uk/~pavl/comp_stoch_proc.htm

    Text: Lecture notes, available from the course webpage. Also, recommended reading from various textbooks/review articles.

  • This is an introductory course on computational stochastic processes, aimed towards 4th year, MSc and MRes students in applied mathematics and theoretical physics.

    Prior knowledge of basic stochastic processes in continuous time, scientific computing and a programming language such as Matlab or C is assumed.

  • 1. PART I: INTRODUCTION. Random number generators. Random variables, probability distribution functions. Introduction to Monte Carlo techniques. Variance reduction techniques.

    2. SIMULATION OF STOCHASTIC PROCESSES. Introduction to stochastic processes. Brownian motion and related stochastic processes and their simulation. Gaussian stochastic processes. Karhunen-Loève expansion and simulation of Gaussian random fields. Numerical solution of ODEs and PDEs with random coefficients.

  • 3. NUMERICAL SOLUTION OF STOCHASTIC DIFFERENTIAL EQUATIONS. Basic properties of stochastic differential equations. Itô's formula and stochastic Taylor expansions. Examples of numerical methods: the Euler-Maruyama and Milstein schemes. Theoretical issues: convergence, consistency, stability. Weak versus strong convergence. Implicit schemes and stiff SDEs.

  • 4. ERGODIC DIFFUSION PROCESSES. Deterministic methods: numerical solution of the Fokker-Planck equation. Numerical calculation of the invariant measure/steady state simulation. Calculation of expectation values, the Feynman-Kac formula.

    5. MARKOV CHAIN MONTE CARLO (MCMC). The Metropolis-Hastings and MALA algorithms. Diffusion limits for MCMC algorithms. Peskun and Tierney orderings and Markov chain comparison. Optimization of MCMC algorithms. Bias correction and variance reduction techniques.

  • 6. STATISTICAL INFERENCE FOR DIFFUSION PROCESSES. Estimation of the diffusion coefficient (volatility). Maximum likelihood estimation of drift coefficients in SDEs. Nonparametric and Bayesian techniques.

    7. APPLICATIONS. Probabilistic methods for the numerical solution of partial differential equations. Exit time problems. Molecular dynamics/computational statistical mechanics. Molecular motors, stochastic resonance.

  • Prerequisites

    Basic knowledge of ODEs and PDEs.

    Familiarity with the theory of stochastic processes.

    Stochastic differential equations.

    Basic knowledge of functional analysis and stochastic analysis.

    Numerical methods and scientific computing.

    Familiarity with a programming language: Matlab, C, Fortran....


  • Assessment

    Based on 1 project (25%) and a final exam (75%).

    Mastery exam: additional project/paper to study and write a report.

    The project will be (mostly) computational.

    The exam will be theoretical (e.g. analysis of algorithms).


  • Bibliography

    Lecture notes will be provided for all the material that we will cover in this course. They will be posted on the course webpage. Books that cover (parts of) the contents of this course are:

    G.A. Pavliotis, Stochastic Processes and Applications, Springer (2014).

    Kloeden and Platen, Numerical Solution of Stochastic Differential Equations, Springer (1992).

    S.M. Iacus, Simulation and Inference for Stochastic Differential Equations, Springer (2008).

    Asmussen and Glynn, Stochastic Simulation, Springer (2007).

    Huynh, Lai and Soumare, Stochastic Simulation and Applications in Finance with Matlab Programs, Wiley (2008).

  • Lectures

    Slides and whiteboard.

    Computer experiments in Matlab.


  • Why Computational Stochastic Processes?

    Many models in science, engineering and economics are probabilistic in nature and we have to deal with uncertainty.

    Many deterministic problems can be solved more efficiently using probabilistic techniques.

  • Deriving dynamical models from paleoclimatic records. F. Kwasniok and G. Lohmann, Phys. Rev. E 80(6), 066104 (2009).

  • Fit this data to a bistable SDE

    dx = −V′(x; a) dt + σ dW,   V(x) = Σ_{j=1}^{4} a_j x^j.   (1)

    Estimate the coefficients in the drift from the paleoclimatic data using the unscented Kalman filter.

    The resulting potential is highly asymmetric.

  • Computational Statistical Physics

    Sample from the Boltzmann distribution:

    π(x) = (1/Z) e^{−βU(x)},   (2)

    where β = (k_B T)^{−1} is the inverse temperature and the normalization constant Z is the partition function

    Z = ∫_{R^{3k}} e^{−βU(x)} dx.   (3)

    x denotes the configuration of the system, x = {x_i : i = 1, ..., k}, x_i = (x_{i1}, x_{i2}, x_{i3}). The potential U(x) is a sum of pairwise interactions

    U(x) = Σ_{i,j} Φ(|x_i − x_j|),   Φ(r) = 4ε[(σ/r)^{12} − (σ/r)^{6}].

  • We need to be able to calculate Z(β), which is a very high-dimensional integral.

    We want to be able to calculate expectations with respect to π(x):

    E_π f(x) = ∫_{R^{3k}} f(x) π(x) dx.

    One method for doing this is by simulating a diffusion process (MCMC):

    dX_t = −∇U(X_t) dt + √(2β^{−1}) dW_t.

  • Some interesting questions

    How many experiments should we do?

    How do we compute expectations with respect to stationary distributions?

    Can we exploit the problem structure to speed up the computation?

    How do we efficiently compute probabilities of rare events?

    How do we estimate the sensitivity of a stochastic model to changes in a parameter?

    How do we use stochastic simulation to optimize our choice of decision parameters?

  • Example: Brownian motion in a periodic potential

    Consider the Langevin dynamics in a smooth periodic potential

    q̈ = −∇V(q) − γ q̇ + √(2γβ^{−1}) Ẇ,   (4)

    where W(t) denotes standard d-dimensional Brownian motion, γ > 0 the friction coefficient and β the inverse temperature (strength of the noise).

    Write as a first order system:

    dq_t = p_t dt,   (5a)
    dp_t = −∇_q V(q_t) dt − γ p_t dt + √(2γβ^{−1}) dW_t.   (5b)

  • At long times the solution q_t performs an effective Brownian motion with diffusion coefficient D: E q_t² ≈ 2Dt for t ≫ 1. We compute the diffusion coefficient for V(q) = cos(q) using the formula

    D = lim_{t→∞} Var(q_t)/(2t),

    as a function of γ, at a fixed temperature β^{−1}.

    The SDE (5) has to be solved numerically using an appropriate scheme.

    The numerical simulations suggest that

    D ∼ γ^{−1}, for both γ ≪ 1 and γ ≫ 1.   (6)

  • Figure: Variance and effective diffusion coefficient for various values of γ. (a) Var(q_t)/2t vs t, for γ = 0.01, 0.0158, 0.0251, 0.0398, 0.063; (b) D_eff vs γ, together with the fit 0.1826/γ^{0.97}.

  • The One-Dimensional Random Walk

    We let time be discrete, i.e. t = 0, 1, .... Consider the following stochastic process S_n:

    S_0 = 0;

    at each time step it moves to ±1 with equal probability 1/2.

    In other words, at each time step we flip a fair coin. If the outcome is heads, we move one unit to the right. If the outcome is tails, we move one unit to the left.

    Alternatively, we can think of the random walk as a sum of independent random variables:

    S_n = Σ_{j=1}^{n} X_j,

    where X_j ∈ {−1, 1} with P(X_j = ±1) = 1/2.


  • We can simulate the random walk on a computer:

    We need a (pseudo)random number generator to generate n independent random variables which are uniformly distributed in the interval [0,1].

    If the value of the random variable is > 1/2 then the particle moves to the left, otherwise it moves to the right.

    We then take the sum of all these random moves.

    The sequence {S_n}_{n=1}^{N} indexed by the discrete time T = {1, 2, ..., N} is the path of the random walk. We use a linear interpolation (i.e. connect the points {n, S_n} by straight lines) to generate a continuous path.
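    A minimal Matlab sketch of this procedure (the number of steps N, the number of paths M and the direction convention are illustrative assumptions):

    % Simulate M paths of an N-step random walk
    N = 50; M = 3;
    U = rand(N,M);                 % uniform(0,1) samples
    X = 1 - 2*(U > 0.5);           % step -1 if U > 1/2, else +1
    S = [zeros(1,M); cumsum(X)];   % paths with S_0 = 0
    plot(0:N, S);                  % linear interpolation of {n, S_n}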


  • Figure: Three paths of the random walk of length N = 50.

  • Figure: Three paths of the random walk of length N = 1000.

  • Every path of the random walk is different: it depends on the outcome of a sequence of independent random experiments.

    We can compute statistics by generating a large number of paths and computing averages. For example, E(S_n) = 0, E(S_n²) = n.

    The paths of the random walk (without the linear interpolation) are not continuous: the random walk has a jump of size 1 at each time step.

    This is an example of a discrete time, discrete space stochastic process.

    The random walk is a time-homogeneous (the probabilistic law of evolution is independent of time) Markov (the future depends only on the present and not on the past) process.

    If we take a large number of steps, the random walk starts looking like a continuous time process with continuous paths.

  • Consider the sequence of continuous time stochastic processes

    Z_t^n := (1/√n) S_{nt}.

    In the limit as n → ∞, the sequence {Z_t^n} converges (in some appropriate sense) to a Brownian motion with diffusion coefficient D = (Δx)²/(2Δt) = 1/2.

  • Figure: Sample Brownian paths: mean of 1000 paths and 5 individual paths, U(t) vs t on [0, 1].

  • Brownian motion W(t) is a continuous time stochastic process with continuous paths that starts at 0 (W(0) = 0) and has independent, normally distributed (Gaussian) increments.

    We can simulate Brownian motion on a computer using a random number generator that generates normally distributed, independent random variables.

  • We can write an equation for the evolution of the paths of a Brownian motion X_t with diffusion coefficient D starting at x:

    dX_t = √(2D) dW_t,   X_0 = x.

    This is an example of a stochastic differential equation.

    The probability of finding X_t at y at time t, given that it was at x at time t = 0, the transition probability density ρ(y, t), satisfies the PDE

    ∂ρ/∂t = D ∂²ρ/∂y²,   ρ(y, 0) = δ(y − x).

    This is an example of the Fokker-Planck equation.

    The connection between Brownian motion and the diffusion equation was made by Einstein in 1905.

  • ELEMENTS OF PROBABILITY THEORY


  • A collection F of subsets of a set Ω is called a σ-algebra if it contains ∅ and is closed under the operations of taking complements and countable unions of its elements.

    A sub-σ-algebra is a collection of subsets of a σ-algebra which satisfies the axioms of a σ-algebra.

    A measurable space is a pair (Ω, F) where Ω is a set and F is a σ-algebra of subsets of Ω.

    Let (Ω, F) and (E, G) be two measurable spaces. A function X : Ω → E such that the event

    {ω ∈ Ω : X(ω) ∈ A} =: {X ∈ A}

    belongs to F for arbitrary A ∈ G is called a measurable function or random variable.

  • Let (Ω, F) be a measurable space. A function μ : F → [0, 1] is called a probability measure if μ(∅) = 0, μ(Ω) = 1 and μ(∪_{k=1}^{∞} A_k) = Σ_{k=1}^{∞} μ(A_k) for all sequences of pairwise disjoint sets {A_k}_{k=1}^{∞} ∈ F.

    The triplet (Ω, F, μ) is called a probability space.

    Let X be a random variable (measurable function) from (Ω, F, μ) to (E, G). If E is a metric space then we may define expectation with respect to the measure μ by

    E[X] = ∫_Ω X(ω) dμ(ω).

    More generally, let f : E → R be G-measurable. Then,

    E[f(X)] = ∫_Ω f(X(ω)) dμ(ω).

  • Let U be a topological space. We will use the notation B(U) to denote the Borel σ-algebra of U: the smallest σ-algebra containing all open sets of U. Every random variable from a probability space (Ω, F, μ) to a measurable space (E, B(E)) induces a probability measure on E:

    μ_X(B) = P X^{−1}(B) = μ(ω ∈ Ω; X(ω) ∈ B),   B ∈ B(E).

    The measure μ_X is called the distribution (or sometimes the law) of X.

    Example

    Let I denote a subset of the positive integers. A vector ρ_0 = {ρ_{0,i}, i ∈ I} is a distribution on I if it has nonnegative entries and its total mass equals 1: Σ_{i∈I} ρ_{0,i} = 1.

  • We can use the distribution of a random variable to compute expectations and probabilities:

    E[f(X)] = ∫_E f(x) dμ_X(x)

    and

    P[X ∈ G] = ∫_G dμ_X(x),   G ∈ B(E).

    When E = R^d and we can write dμ_X(x) = ρ(x) dx, then we refer to ρ(x) as the probability density function (pdf), or density with respect to Lebesgue measure, for X.

    When E = R^d then by L^p(Ω; R^d), or sometimes L^p(Ω; μ) or even simply L^p(μ), we mean the Banach space of measurable functions on Ω with norm

    ‖X‖_{L^p} = (E|X|^p)^{1/p}.

  • Example

    Consider the random variable X : Ω → R with pdf

    γ_{σ,m}(x) := (2πσ)^{−1/2} exp(−(x − m)²/(2σ)).

    Such an X is termed a Gaussian or normal random variable. The mean is

    EX = ∫_R x γ_{σ,m}(x) dx = m

    and the variance is

    E(X − m)² = ∫_R (x − m)² γ_{σ,m}(x) dx = σ.

  • Example (Continued)

    Let m ∈ R^d and Σ ∈ R^{d×d} be symmetric and positive definite. The random variable X : Ω → R^d with pdf

    γ_{Σ,m}(x) := ((2π)^d det Σ)^{−1/2} exp(−(1/2)⟨Σ^{−1}(x − m), (x − m)⟩)

    is termed a multivariate Gaussian or normal random variable. The mean is

    E(X) = m   (7)

    and the covariance matrix is

    E((X − m) ⊗ (X − m)) = Σ.   (8)

  • Since the mean and variance specify completely a Gaussian random variable on R, the Gaussian is commonly denoted by N(m, σ). The standard normal random variable is N(0, 1). Since the mean and covariance matrix completely specify a Gaussian random variable on R^d, the Gaussian is commonly denoted by N(m, Σ).

  • The Characteristic Function

    Many of the properties of (sums of) random variables can be studied using the Fourier transform of the distribution function.

    The characteristic function of the random variable X is defined to be the Fourier transform of the distribution function:

    φ(t) = ∫_R e^{iλt} dμ_X(λ) = E(e^{itX}).   (9)

    The characteristic function determines uniquely the distribution function of the random variable, in the sense that there is a one-to-one correspondence between F(λ) and φ(t).

    The characteristic function of N(m, Σ) is

    φ(t) = e^{i⟨m, t⟩ − (1/2)⟨t, Σt⟩}.

  • Lemma

    Let {X_1, X_2, ..., X_n} be independent random variables with characteristic functions φ_j(t), j = 1, ..., n, and let Y = Σ_{j=1}^{n} X_j with characteristic function φ_Y(t). Then

    φ_Y(t) = Π_{j=1}^{n} φ_j(t).

    Lemma

    Let X be a random variable with characteristic function φ(t) and assume that it has finite moments. Then

    E(X^k) = (1/i^k) φ^{(k)}(0).

  • Types of Convergence and Limit Theorems

    One of the most important aspects of the theory of random variables is the study of limit theorems for sums of random variables.

    The most well known limit theorems in probability theory are the law of large numbers and the central limit theorem.

    There are various different types of convergence for sequences of random variables.

  • Definition. Let {Z_n}_{n=1}^{∞} be a sequence of random variables. We will say that

    (a) Z_n converges to Z with probability one (almost surely) if

    P(lim_{n→+∞} Z_n = Z) = 1.

    (b) Z_n converges to Z in probability if for every ε > 0

    lim_{n→+∞} P(|Z_n − Z| > ε) = 0.

    (c) Z_n converges to Z in L^p if

    lim_{n→+∞} E[|Z_n − Z|^p] = 0.

    (d) Let F_n(λ), n = 1, ..., +∞, F(λ) be the distribution functions of Z_n, n = 1, ..., +∞, and Z, respectively. Then Z_n converges to Z in distribution if

    lim_{n→+∞} F_n(λ) = F(λ)

    for all λ ∈ R at which F is continuous.

  • Let {X_n}_{n=1}^{∞} be iid random variables with EX_n = V. Then the strong law of large numbers states that the average of the sum of the iid random variables converges to V with probability one:

    P(lim_{N→+∞} (1/N) Σ_{n=1}^{N} X_n = V) = 1.

    The strong law of large numbers provides us with information about the behavior of a sum of random variables (or, a large number of repetitions of the same experiment) on average.

    We can also study fluctuations around the average behavior. Indeed, let E(X_n − V)² = σ². Define the centered iid random variables Y_n = X_n − V. Then the sequence of random variables (1/(σ√N)) Σ_{n=1}^{N} Y_n converges in distribution to a N(0, 1) random variable:

    lim_{N→+∞} P((1/(σ√N)) Σ_{n=1}^{N} Y_n ≤ a) = ∫_{−∞}^{a} (1/√(2π)) e^{−x²/2} dx.

  • Assume that E|X|

  • Random Number Generators

    How can a deterministic machine produce random numbers? We can use a chaotic dynamical system:

    x_{n+1} = f(x_n),   x_0 = x,   n = 1, 2, ...

    Provided that this dynamical system is sufficiently chaotic, we can hope that for n sufficiently large the sequence of numbers x_n, x_{n+1}, x_{n+2}, ... is random with nice statistical properties.

    Statistical tests can be used in order to determine whether this sequence of pseudo-random numbers can be used in order to perform Monte Carlo simulations, simulate stochastic processes etc.

    Modern programming languages have well-tested built-in RNGs. In Matlab: rand ∼ U(0, 1), randn ∼ N(0, 1).

  • Definition. A uniform pseudo-number generator is an algorithm which, starting from an initial value x_0 (the seed), produces a sequence u_i = D^i u_0 of values in [0, 1].

    Lemma. Suppose U ∼ U(0, 1) and let F be a 1d cumulative distribution function. Then X = F^{−1}(U) with F^{−1}(u) = inf{x ; F(x) ≥ u} has the distribution F.
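    A minimal Matlab sketch of this inverse-transform lemma, using the exponential distribution F(x) = 1 − e^{−λx} as an assumed example (so F^{−1}(u) = −log(1 − u)/λ):

    % Inverse-transform sampling from an exponential distribution
    lambda = 2; N = 1e5;
    U = rand(N,1);             % U ~ U(0,1)
    X = -log(1-U)/lambda;      % X = F^{-1}(U) ~ Exp(lambda)
    mean(X)                    % approximately 1/lambda = 0.5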

    We can use the Box-Muller algorithm to generate Gaussian random variables, given a pseudonumber generator of uniform r.v.: given U_1, U_2 ∼ U(0, 1) independent, then

    Z_0 = √(−2 ln U_1) cos(2πU_2),   Z_1 = √(−2 ln U_1) sin(2πU_2)

    are independent N(0, 1) r.v.
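    A minimal Matlab sketch of the Box-Muller algorithm:

    % Box-Muller: two independent N(0,1) samples from two U(0,1) samples
    N = 1e5;
    U1 = rand(N,1); U2 = rand(N,1);
    R = sqrt(-2*log(U1));
    Z0 = R.*cos(2*pi*U2);      % N(0,1)
    Z1 = R.*sin(2*pi*U2);      % N(0,1), independent of Z0
    [mean(Z0), var(Z0)]        % approximately 0 and 1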


  • Fast Chaotic Noise. Fast/slow system:

    dx/dt = x − x³ + (λ/ε) y_2,
    dy_1/dt = (10/ε²)(y_2 − y_1),
    dy_2/dt = (1/ε²)(28 y_1 − y_2 − y_1 y_3),
    dy_3/dt = (1/ε²)(y_1 y_2 − (8/3) y_3).

    Effective dynamics [Melbourne, Stuart '11]:

    dX_t = A(X_t − X_t³) dt + √σ dW_t,

    true values:

    A = 1,   λ = 2/√45,   σ = 2λ² ∫_0^∞ lim_{T→∞} (1/T) ∫_0^T ψ_s(y) ψ_{s+t}(y) ds dt.

  • Fast Chaotic Noise: Estimators

    Values for σ reported in the literature (ε = 10^{−3/2}): σ ≈ 0.126 ± 0.003 via Gaussian moment approximation; σ ≈ 0.13 ± 0.01 via HMM.

    Here: ε = 10^{−1} gives σ ≈ 0.121 and ε = 10^{−3/2} gives σ ≈ 0.124. But we estimate also A.

  • Generation of Multidimensional Gaussian Random Variables

    We are given Z ∼ N(0, I), where I ∈ R^{d×d} denotes the identity matrix.

    We want to generate a Gaussian random vector Y ∼ N(b, Σ), EY = b, E((Y − b)(Y − b)^T) = Σ.

    X = Y − b is mean zero, so it is sufficient to consider this case.

    We can use the Cholesky decomposition: assuming that Σ > 0, there exists a unique lower triangular matrix Λ such that

    Σ = ΛΛ^T.   (10)

    Then X = ΛZ satisfies X ∼ N(0, Σ). The computational cost is O(d³). In Matlab: chol(Σ).
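    A minimal Matlab sketch (the mean b and covariance Σ are assumed example values; note that chol returns an upper triangular factor, hence the transpose):

    % Sample Y ~ N(b, Sigma) via the Cholesky decomposition
    b = [1; 2];
    Sigma = [2, 0.5; 0.5, 1];        % symmetric positive definite
    Lambda = chol(Sigma)';           % lower triangular, Sigma = Lambda*Lambda'
    Z = randn(2, 1e4);               % iid N(0,1) entries
    Y = repmat(b, 1, 1e4) + Lambda*Z;
    cov(Y')                          % empirical covariance, close to Sigma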


  • Eigenvalue decomposition of Cov(X)

    The Cholesky decomposition is not unique for covariance matrices that are not strictly positive definite.

    The eigenvalues and eigenvectors of the covariance matrix provide us with complete information about the Gaussian random vector.

    Principal Component Analysis.

    Let X ∼ N(0, Σ), Σ > 0. Then

    Σ = PDP^T,   PP^T = I,   D = diag(d_1, ..., d_d).   (11)

    Set Λ = P√D and proceed as with the Cholesky decomposition.

    In Matlab: [P, D] = eig(Σ).
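    A minimal Matlab sketch of sampling via the eigenvalue decomposition (Σ is an assumed example):

    % Sample X ~ N(0, Sigma) via the eigenvalue decomposition (PCA)
    Sigma = [2, 0.5; 0.5, 1];
    [P, D] = eig(Sigma);         % Sigma = P*D*P'
    Lambda = P*sqrt(D);          % Lambda*Lambda' = Sigma
    X = Lambda*randn(2, 1e4);    % samples from N(0, Sigma)
    cov(X')                      % empirical covariance, close to Sigma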


  • Other techniques for generating random numbers with a given distribution:

    Acceptance-rejection algorithm.

    Markov Chain Monte Carlo.


  • Monte Carlo Techniques

    Example:

    I = ∫_D f(x) dx = E f(X),   X ∼ U(D),

    ≈ (1/N) Σ_{i=1}^{N} f(x_i) =: I_N,   x_i iid U(D).

    To prove convergence use the (strong) law of large numbers:

    lim_{N→+∞} (1/N) Σ_{i=1}^{N} f(x_i) = I   a.s.


  • To obtain an estimate on the rate of convergence use the central limit theorem:

    √N (I_N − I) → N(0, σ²), in distribution,

    where σ² = Var(f(X)).

    The error of the Monte Carlo approximation is O(N^{−1/2}), regardless of the dimensionality of x.

    For smooth functions, deterministic approximations (Riemann sum, trapezoidal rule, ...) can have a better convergence rate, but they don't scale well as the number of dimensions increases.

  • More generally, consider the integral

    I = ∫_{R^d} f(x) π(x) dx,   (12)

    where π(x) is a probability distribution. We can always write integrals in this form (multiply and divide by π(x)).

    Monte Carlo estimator:

    I_N = (1/N) Σ_{i=1}^{N} f(X_i),   X_i ∼ π(x), iid.   (13)

    The MC estimator is unbiased:

    E I_N = I.

    Variance of the estimator:

    σ_I² := var(I_N) = (1/N) var(f(X)) = (1/N) ∫_{R^d} (f(x) − I)² π(x) dx.

  • From the strong law of large numbers:

    lim_{N→+∞} I_N = I   a.s.

    From the weak LLN and Chebyshev's inequality:

    P(|I_N − I| ≤ ε) ≥ 1 − σ_I²/ε² = 1 − σ_f²/(N ε²).   (14)

    Accuracy of the MC estimator depends on N and on the variance σ_I². In order for

    P(I_N ∈ [I − ε, I + ε]) = 1 − δ,

    for a confidence level δ ∈ (0, 1), we need

    N ≥ σ_f²/(δ ε²).

    The MC error scales like 1/√N, independently of the number of dimensions. It is expected that σ_f² increases as the number of dimensions increases.
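    As an illustration with assumed numbers: for σ_f² = 1, tolerance ε = 0.01 and confidence level δ = 0.05, the bound requires N ≥ 1/(0.05 × 10^{−4}) = 2 × 10^5 samples.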


  • Example (Huynh et al., Stochastic Simulation and Applications in Finance, ch. 5)

    function [Call, VarCall] = CalculateCall2(N)
    % Price a call option
    % N: number of generated points to compute the expectation
    x = exp(sqrt(0.1)*randn(1,N) + 5);
    Call = mean(max(x - 110, 0));
    VarCall = var(max(x - 110, 0))/N;
    end

  • VARIANCE REDUCTION TECHNIQUES

    To improve the accuracy of the MC estimator and to reduce the computational cost we can use variance reduction techniques. Examples of such techniques are:

    Antithetic variables. Control variates. Importance sampling.

    To implement these techniques we need to be able to estimate the mean and variance "on the fly":

    I_{N+1} = (1/(N + 1)) (N I_N + f(X_{N+1}))

    and

    var(I_N) = (1/N) ((1/(N − 1)) Σ_{i=1}^{N} |f(X_i)|² − (N/(N − 1)) I_N²).
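    A minimal Matlab sketch of these running updates (the integrand f(X) = e^X with X ∼ N(0, 1) is an assumed example):

    % Update the MC mean on the fly, then form the variance estimate
    M = 1e5; N = 0; IN = 0; S = 0;       % S accumulates sum of |f(X_i)|^2
    for k = 1:M
        fX = exp(randn);                 % assumed example f(X) = e^X
        IN = (N*IN + fX)/(N + 1);        % running mean I_{N+1}
        N = N + 1;
        S = S + fX^2;
    end
    varIN = (S/(N-1) - N/(N-1)*IN^2)/N   % var(I_N)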


  • Antithetic variables: generate additional random variables that have the same distribution as X_i, i = 1, ..., N, but are negatively correlated to X_i.

    Example: suppose we want to estimate the mean μ_X of iid random variables X_i, i = 1, ..., N. We introduce the auxiliary r.v. X_i^a, i = 1, ..., N, with

    E[(X_i^a − μ_X)(X_j − μ_X)] = −δ_ij σ_X²/2.

    The variance of the sample mean μ̂_a = (1/(2N)) Σ_{i=1}^{N} (X_i + X_i^a) is

    Var(μ̂_a) = (1/(2N)) σ_X² + (1/(2N²)) Σ_{k,j=1}^{N} E[(X_k^a − μ_X)(X_j − μ_X)] = (1/(4N)) σ_X².

    It is not possible in general to find perfectly anti-correlated random variables.

  • Example. Let X ∼ U(0, 1). Then X^a = 1 − X is an antithetic variable of X.

    Example. Let X ∼ N(μ, σ²). Then X^a = 2μ − X is an antithetic variable of X.

    Example. Let X ∼ N(μ, Σ) be a 2d Gaussian random vector. We use antithetic variables to reduce the variance and increase the accuracy of the estimation of Ef(X_1, X_2). We will estimate the expectation of Y_j = e^{X_j}, j = 1, 2. These integrals can be calculated analytically.

  • (Huynh et al., Stochastic Simulation and Applications in Finance, ch. 5)

    function Rep = SimulationsAnti
    % calculate the mean of an exponential gaussian using
    % antithetic variables.
    NbTraj=100;
    % construct the covariance matrix
    sigma1=0.2; sigma2=0.5; rho=-0.3;
    MoyTheo=[10;15];
    CovTheo=[sigma1^2,rho*sigma1*sigma2;rho*sigma1*sigma2,sigma2^2];
    L=chol(CovTheo)';
    % Simulation of the random variables.
    Sample=randn(2,NbTraj);
    SampleSimple=repmat(MoyTheo,1,NbTraj)+L*Sample;
    % Introduction of the antithetic variables
    XAnti=cat(2,SampleSimple,2*repmat(MoyTheo,1,NbTraj)-SampleSimple);
    % Transformation of Xi into Zi
    Z1Anti=exp(XAnti(1,:));
    Z2Anti=exp(XAnti(2,:));
    Rep=mean(Z1Anti,2);
    end

  • CONTROL VARIATES

    Let Z be a random variable. We want to estimate EZ.

    We look for a r.v. W that is strongly correlated with Z and with known EW.

    Consider the new r.v.

    X = Z + λ(W − EW).   (15)

    We have

    Var(X) = Var(Z) + λ² Var(W) + 2λ Cov(Z, W).

    We minimize the variance with respect to λ:

    λ* = −Cov(Z, W)/Var(W)   and   Var_min(X) = Var(Z)(1 − ρ²).

  • CONTROL VARIATES

    The estimator (15) is unbiased and the variance is reduced when the correlation coefficient ρ² = Cov(Z, W)²/(Var(Z) Var(W)) is close to 1.

    The optimal value of λ and Var(X) can be estimated using the empirical means.

    We can also construct confidence intervals using the empirical values.

    We can add R control variates:

    X = Z + Σ_{j=1}^{R} λ_j (W_j − E W_j).   (16)

    A similar idea can be used for vector valued r.v. X, or even in infinite dimensions.
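    A minimal Matlab sketch of a single control variate, assuming we estimate E[e^U] with U ∼ U(0, 1) and use W = U (with EW = 1/2) as the control:

    % Control variate for E[exp(U)], U ~ U(0,1)
    N = 1e5;
    U = rand(N,1);
    Z = exp(U);                      % plain MC samples, EZ = e - 1
    C = cov(Z, U);
    lam = -C(1,2)/var(U);            % empirical lambda* = -Cov(Z,W)/Var(W)
    X = Z + lam*(U - 0.5);           % control-variate samples
    [var(Z), var(X)]                 % the variance is reduced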


  • Importance sampling: choose a good probability distribution from which to simulate random variables:

    ∫_{R^d} f(x) π(x) dx = ∫_{R^d} (f(x) π(x)/ψ(x)) ψ(x) dx = E_ψ (f(X) π(X)/ψ(X)),

    where ψ(x) is a probability distribution.

    Consider the MC estimator

    I_N = (1/N) Σ_{j=1}^{N} f(X_j) π(X_j)/ψ(X_j),   X_j ∼ ψ(x).

    The estimator is unbiased. The variance is

    var(I_N) = (1/N) var(f(X) π(X)/ψ(X)).

  • We want to choose ψ(x) so that it has a similar shape to f(x)π(x).

    The closer f(X)π(X)/ψ(X) is to a constant, the smaller the variance is.

    Example. We use importance sampling to estimate the integral

    I = ∫_0^1 c exp(−a(x − b)⁴) dx = 18.128

    with c = 100, a = 10000, b = 0.5. We choose ψ = N(0.5, 0.05²).

    For N = 10000 we have I_N = 18.2279, var(I_N) = 0.1193, err_N = 0.0999; I_IC = 18.1271, var(I_IC) = 0.0041, err(I_IC) = 8.56 × 10^{−4}.

  • Figure: c exp(−a(x − b)⁴) and the density of N(0.5, 0.05²).

  • function [yun,varyun,errun,yIC,varyIC,errIC] = quarticIS(N)
    % Use importance sampling to calculate the mean
    % of exp(-V) where V is a quartic polynomial
    a = 10000; b = 0.5; c = 100;
    intexact = c*0.18128;
    x = rand(N,1);
    yun = mean(c*exp(-a*(x-b).^4));
    varyun = var(c*exp(-a*(x-b).^4))/N;
    errun = abs(intexact-yun);
    mu = 0.5; sigma = 0.05;
    xx = mu + sigma*randn(N,1);
    yIC = mean((c*exp(-a*(xx-b).^4))./normpdf(xx,mu,sigma));
    varyIC = var((c*exp(-a*(xx-b).^4))./normpdf(xx,mu,sigma))/N;
    errIC = abs(intexact-yIC);

  • ELEMENTS OF THE THEORY OF STOCHASTIC PROCESSES


  • Let T be an ordered set. A stochastic process is a collection of random variables X = {X_t; t ∈ T} where, for each fixed t ∈ T, X_t is a random variable from (Ω, F) to (E, G).

    The measurable space (Ω, F) is called the sample space. The space (E, G) is called the state space.

    In this course we will take the set T to be [0, +∞).

    The state space E will usually be R^d equipped with the σ-algebra of Borel sets.

    A stochastic process X may be viewed as a function of both t ∈ T and ω ∈ Ω. We will sometimes write X(t), X(t, ω) or X_t(ω) instead of X_t. For a fixed sample point ω ∈ Ω, the function X_t(ω) : T → E is called a sample path (realization, trajectory) of the process X.

  • The finite dimensional distributions (fdd) of a stochastic process are the distributions of the E^k-valued random variables (X(t_1), X(t_2), ..., X(t_k)) for arbitrary positive integer k and arbitrary times t_i ∈ T, i ∈ {1, ..., k}:

    F(x) = P(X(t_i) ≤ x_i, i = 1, ..., k)

    with x = (x_1, ..., x_k).

    We will say that two processes X_t and Y_t are equivalent if they have the same finite dimensional distributions.

    From experiments or numerical simulations we can only obtain information about the fdd of a process.

  • Definition

    A Gaussian process is a stochastic process for which E = R^d and all the finite dimensional distributions are Gaussian:

    F(x) = P(X(t_i) ≤ x_i, i = 1, ..., k),

    with Gaussian density

    (2π)^{−n/2} (det K_k)^{−1/2} exp(−(1/2)⟨K_k^{−1}(x − μ_k), x − μ_k⟩),

    for some vector μ_k and a symmetric positive definite matrix K_k.

    A Gaussian process x(t) is characterized by its mean

    m(t) := E x(t)

    and the covariance function

    C(t, s) = E((x(t) − m(t)) ⊗ (x(s) − m(s))).

    Thus, the first two moments of a Gaussian process are sufficient for a complete characterization of the process.

  • Examples of Gaussian stochastic processes

    Random Fourier series: let ξ_j, ζ_j ∼ N(0, 1), j = 1, ..., N, be independent and define

    X(t) = Σ_{j=1}^{N} (ξ_j cos(2πjt) + ζ_j sin(2πjt)).

    Brownian motion is a Gaussian process with m(t) = 0, C(t, s) = min(t, s).

    The Brownian bridge is a Gaussian process with m(t) = 0, C(t, s) = min(t, s) − ts.

    The Ornstein-Uhlenbeck process is a Gaussian process with m(t) = 0, C(t, s) = λ e^{−α|t−s|} with α, λ > 0.

  • To simulate a Gaussian stochastic process:

    Fix Δt and define t_j = (j − 1)Δt, j = 1, ..., N.

    Set X_j := X(t_j) and define the Gaussian random vector X^N = {X_j^N}_{j=1}^{N}. Then X^N ∼ N(μ^N, Γ^N) with μ^N = (μ(t_1), ..., μ(t_N)) and Γ^N_{ij} = C(t_i, t_j).

    Then X^N = μ^N + Λ N(0, I) with Γ^N = ΛΛ^T.

    We can calculate the square root Λ of the covariance matrix either using the Cholesky factorization or by calculating its eigenvalue decomposition (PCA).

    Γ^N might be a full matrix and the calculation of Λ can be computationally expensive.

  • function [t,x,gamma,mx,varx] = gaussiansimulation(T,dt,M)
    % simulate a mean zero gaussian process
    % use the eigenvalue decomposition to calculate sqrt(Gamma)
    % Input: T: length of path
    %        dt: timestep
    %        gamma: covariance
    %        M: number of paths simulated
    % calculate first and second moment
    t = 0:dt:T;
    N = length(t);
    for i = 1:N,
        for j = 1:N,
            % gamma(i,j) = covBM(t(i),t(j));
            % gamma(i,j) = covOU(t(i),t(j),1,1);
            % gamma(i,j) = covBB(t(i),t(j));
            gamma(i,j) = covfBM(t(i),t(j),0.2);
        end
    end
    [vv,dd] = eig(gamma); lambda = vv*sqrt(dd)*vv';
    x = lambda*randn(N,M);
    mx = mean(x'); varx = mean(x'.*x');
    figure; plot(t,x(:,1:10),'Linewidth',1.0);
    hold ...

  • function rr = covBM(x,y)
    % covariance function of Brownian motion
    rr = min(x,y);
    end

    function rr = covBB(x,y)
    % covariance function of the Brownian bridge
    rr = min(x,y) - x*y;
    end

    function rr = covOU(x,y,dd,aa)
    % covariance function of Ornstein-Uhlenbeck process
    % aa: inverse correlation time
    % dd: diffusion coefficient, rr(0) = dd/aa
    rr = (dd/aa)*exp(-aa*abs(x-y));
    end

  • function rr = covfBM(x,y,H)
    % covariance function of fractional Brownian motion
    % H: Hurst exponent
    rr = (1/2)*(abs(x)^(2*H) + abs(y)^(2*H) - abs(x-y)^(2*H));
    end

  • Let (Ω, F, P) be a probability space. Let X_t, t ∈ T (with T = R or Z) be a real-valued random process on this probability space with finite second moment, E|X_t|² < +∞ (i.e. X_t ∈ L²).

    Definition

    A stochastic process X_t ∈ L² is called second-order stationary or wide-sense stationary if the first moment EX_t is a constant and the second moment E(X_t X_s) depends only on the difference t − s:

    EX_t = μ,   E(X_t X_s) = C(t − s).

  • The constant μ is called the expectation of the process X_t. We will set μ = 0.

    The function C(t) is called the covariance or the autocorrelation function of X_t.

    Notice that C(t) = E(X_t X_0), whereas C(0) = E(X_t²), which is finite, by assumption.

    Since we have assumed that X_t is a real valued process, we have that C(t) = C(−t), t ∈ R.

  • The correlation function of a second order stationary process enables us to associate a time scale to X_t, the correlation time τ_cor:

    τ_cor = (1/C(0)) ∫_0^∞ C(τ) dτ = ∫_0^∞ E(X_τ X_0)/E(X_0²) dτ.

    The slower the decay of the correlation function, the larger the correlation time is. We have to assume sufficiently fast decay of correlations so that the correlation time is finite.

  • Example. Consider the mean zero, second order stationary process with covariance function

    R(t) = (D/α) e^{−α|t|}.   (17)

    The spectral density of this process is:

    f(x) = (1/(2π)) (D/α) ∫_{−∞}^{+∞} e^{−ixt} e^{−α|t|} dt
         = (1/(2π)) (D/α) (∫_{−∞}^{0} e^{−ixt} e^{αt} dt + ∫_{0}^{+∞} e^{−ixt} e^{−αt} dt)
         = (1/(2π)) (D/α) (1/(−ix + α) + 1/(ix + α))
         = (D/π) 1/(x² + α²).

  • Example (continued). This function is called the Cauchy or the Lorentz distribution.

    The Gaussian stochastic process with covariance function (17) is called the stationary Ornstein-Uhlenbeck process.

    The correlation time is (we have that C(0) = D/α)

    τ_cor = ∫_0^∞ e^{−αt} dt = α^{−1}.

  • Second order stationary processes are ergodic: time averages equal phase space (ensemble) averages. An example of an ergodic theorem for a stationary process is the following L² (mean-square) ergodic theorem.

    Theorem

    Let {X_t}_{t≥0} be a second order stationary process on a probability space (Ω, F, P) with mean μ and covariance R(t), and assume that R(t) ∈ L¹(0, +∞). Then

    lim_{T→+∞} E |(1/T) ∫_0^T X(s) ds − μ|² = 0.   (18)

  • Proof. We have (setting μ = 0)

    E |(1/T) ∫_0^T X(s) ds|² = (1/T²) ∫_0^T ∫_0^T R(t − s) dt ds
                             = (2/T²) ∫_0^T ∫_0^t R(t − s) ds dt
                             = (2/T²) ∫_0^T (T − u) R(u) du → 0,

    using the dominated convergence theorem and the assumption R(·) ∈ L¹. In the above we used the fact that R is a symmetric function, together with the change of variables u = t − s, v = t and an integration over v.

  • Definition. A stochastic process is called (strictly) stationary if all finite dimensional distributions are invariant under time translation: for any integer k and times t_i ∈ T, the distribution of (X(t_1), X(t_2), ..., X(t_k)) is equal to that of (X(s + t_1), X(s + t_2), ..., X(s + t_k)) for any s such that s + t_i ∈ T for all i ∈ {1, ..., k}. In other words,

    P(X_{t_1+t} ∈ A_1, X_{t_2+t} ∈ A_2, ..., X_{t_k+t} ∈ A_k) = P(X_{t_1} ∈ A_1, X_{t_2} ∈ A_2, ..., X_{t_k} ∈ A_k),   ∀t ∈ T.

    Let X_t be a strictly stationary stochastic process with finite second moment (i.e. X_t ∈ L²). The definition of strict stationarity implies that EX_t = μ, a constant, and E((X_t − μ)(X_s − μ)) = C(t − s). Hence, a strictly stationary process with finite second moment is also stationary in the wide sense. The converse is not true.

  • Remarks

    1. A sequence Y_0, Y_1, ... of independent, identically distributed random variables is a stationary process with R(k) = σ² δ_{k0}, σ² = E(Y_k)².

    2. Let Z be a single random variable with known distribution and set Z_j = Z, j = 0, 1, 2, .... Then the sequence Z_0, Z_1, Z_2, ... is a stationary sequence with R(k) = σ².

    3. The first two moments of a Gaussian process are sufficient for a complete characterization of the process. A corollary of this is that a second order stationary Gaussian process is also a (strictly) stationary process.

  • The most important continuous time stochastic process is Brownian motion. Brownian motion is a mean zero, continuous (i.e. it has continuous sample paths: for a.e. ω ∈ Ω the function X_t is a continuous function of time) process with independent Gaussian increments.

    A process X_t has independent increments if for every sequence t_0 < t_1 < ... < t_n the random variables

    X_{t_1} − X_{t_0}, X_{t_2} − X_{t_1}, ..., X_{t_n} − X_{t_{n−1}}

    are independent.

    If, furthermore, for any t_1, t_2 and Borel set B ⊂ R,

    P(X_{t_2+s} − X_{t_1+s} ∈ B)

    is independent of s, then the process X_t has stationary independent increments.

  • Definition

    A one dimensional standard Brownian motion W(t) : R⁺ → R is a real valued stochastic process with the following properties:

    1. W(0) = 0;
    2. W(t) is continuous;
    3. W(t) has independent increments;
    4. for every t > s ≥ 0, W(t) − W(s) has a Gaussian distribution with mean 0 and variance t − s. That is, the density of the random variable W(t) − W(s) is

    g(x; t, s) = (2π(t − s))^{−1/2} exp(−x²/(2(t − s))).   (19)

  • Definition (Continued)

    A d-dimensional standard Brownian motion W(t) : R⁺ → R^d is a collection of d independent one dimensional Brownian motions:

    W(t) = (W_1(t), ..., W_d(t)),

    where W_i(t), i = 1, ..., d are independent one dimensional Brownian motions. The density of the Gaussian random vector W(t) − W(s) is thus

    g(x; t, s) = (2π(t − s))^{−d/2} exp(−‖x‖²/(2(t − s))).

    Brownian motion is sometimes referred to as the Wiener process.

  • To generate Brownian paths we can use the general algorithm for simulating Gaussian processes with μ^N = 0 and Γ^N = {min(t_i, t_j)}_{i,j=1}^{N}.

    It is easier to use the fact that Brownian motion has independent increments with dW ∼ N(0, dt):

    W_j = W_{j−1} + dW_j,   j = 1, 2, ..., N,   where dW_j = √Δt N(0, 1).

    After having generated Brownian paths we can compute statistics using Monte Carlo:

    E f(W_t) ≈ (1/N) Σ_{j=1}^{N} f(W_t^j).

    We can also simulate processes related to Brownian motion, such as the geometric Brownian motion:

    S_t = S_0 e^{at + σW_t}.   (20)
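    A minimal Matlab sketch of this incremental construction (a, σ and S_0 are assumed example values):

    % Brownian path from independent increments, and a GBM path (20)
    T = 1; N = 500; dt = T/N; t = (0:N)*dt;
    dW = sqrt(dt)*randn(1,N);      % increments dW_j ~ N(0,dt)
    W = [0, cumsum(dW)];           % W_j = W_{j-1} + dW_j
    a = 0.1; sigma = 0.5; S0 = 1;
    S = S0*exp(a*t + sigma*W);     % geometric Brownian motion
    plot(t, W, t, S);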


  • Figure: Brownian sample paths: mean of 1000 paths and 5 individual paths, U(t) vs t on [0, 1].

  • It is possible to prove rigorously the existence of the Wiener process (Brownian motion):

    Theorem (Wiener). There exists an almost-surely continuous process W_t with independent increments such that W_0 = 0 and, for each t ≥ 0, the random variable W_t is N(0, t). Furthermore, W_t is almost surely locally Hölder continuous with exponent γ for any γ ∈ (0, 1/2).

    Notice that Brownian paths are not differentiable.

  • Brownian motion is a Gaussian process. For the d-dimensional Brownian motion, and for I the d × d dimensional identity, we have (see (7) and (8))

    E W(t) = 0   ∀t ≥ 0

    and

    E((W(t) − W(s)) ⊗ (W(t) − W(s))) = (t − s) I.   (21)

    Moreover,

    E(W(t) ⊗ W(s)) = min(t, s) I.   (22)

  • From the formula for the Gaussian density g(x, t − s), eqn. (19), we immediately conclude that W(t) − W(s) and W(t + u) − W(s + u) have the same pdf. Consequently, Brownian motion has stationary increments.

    Notice, however, that Brownian motion itself is not a stationary process.

    Since W(t) = W(t) − W(0), the pdf of W(t) is

    g(x, t) = (1/√(2πt)) e^{−x²/(2t)}.

    We can easily calculate all moments of the Brownian motion:

    E(xⁿ(t)) = (1/√(2πt)) ∫_{−∞}^{+∞} xⁿ e^{−x²/(2t)} dx = { 1·3···(n − 1) t^{n/2},  n even;  0,  n odd. }

  • We can define the OU process through the Brownian motion via a time change.

    Lemma. Let W(t) be a standard Brownian motion and consider the process

    V(t) = e^{−t} W(e^{2t}).

    Then V(t) is a Gaussian second order stationary process with mean 0 and covariance

    K(s, t) = e^{−|t−s|}.

  • Definition. A (normalized) fractional Brownian motion W_t^H, t ≥ 0, with Hurst parameter H ∈ (0, 1) is a centered Gaussian process with continuous sample paths whose covariance is given by

    E(W_t^H W_s^H) = (1/2)(s^{2H} + t^{2H} − |t − s|^{2H}).   (23)

    Fractional Brownian motion has the following properties.

    1. When H = 1/2, W_t^{1/2} becomes the standard Brownian motion.
    2. W_0^H = 0, E W_t^H = 0, E(W_t^H)² = |t|^{2H}, t ≥ 0.
    3. It has stationary increments, E(W_t^H − W_s^H)² = |t − s|^{2H}.
    4. It has the following self similarity property:

    (W_{αt}^H, t ≥ 0) = (α^H W_t^H, t ≥ 0),   α > 0,

    where the equivalence is in law.

  • A very useful result is that we can expand every centered (EX_t = 0) stochastic process with continuous covariance (and hence, every L²-continuous centered stochastic process) into a random Fourier series.

    Let (Ω, F, P) be a probability space and {X_t}_{t∈T} a centered process which is mean square continuous. Then, the Karhunen-Loève theorem states that X_t can be expanded in the form

    X_t = Σ_{n=1}^{∞} ξ_n φ_n(t),   (24)

    where the ξ_n are an orthogonal sequence of random variables with E|ξ_k|² = λ_k, where {λ_k, φ_k}_{k=1}^{∞} are the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance of the process X_t. The convergence is in L²(P) for every t ∈ T.

    If X_t is a Gaussian process then the ξ_k are independent Gaussian random variables.

  • Set T = [0, 1]. Let K : L²(T) → L²(T) be defined by

    Kφ(t) = ∫_0^1 K(s, t) φ(s) ds.

    The kernel of this integral operator is a continuous (in both s and t), symmetric, nonnegative function. Hence, the corresponding integral operator has eigenvalues and eigenfunctions

    Kφ_n = λ_n φ_n,   λ_n ≥ 0,   (φ_n, φ_m)_{L²} = δ_{nm},

    such that the covariance operator can be expanded in a uniformly convergent series

    B(t, s) = Σ_{n=1}^{∞} λ_n φ_n(t) φ_n(s).

  • The random variables ξ_n are defined as

    ξ_n = ∫_0^1 X(t) φ_n(t) dt.

    The orthogonality of the eigenfunctions of the covariance operator implies that

    E(ξ_n ξ_m) = λ_n δ_{nm}.

    Thus the random variables are orthogonal. When X(t) is Gaussian, the ξ_k are Gaussian random variables. Furthermore, since they are also orthogonal, they are independent Gaussian random variables.

  • Example: the Karhunen-Loève expansion for Brownian motion. We set T = [0, 1]. The covariance function of Brownian motion is C(t, s) = min(t, s). The eigenvalue problem Cφ_n = λ_n φ_n becomes

    ∫_0^1 min(t, s) φ_n(s) ds = λ_n φ_n(t).

    Or,

    ∫_0^t s φ_n(s) ds + t ∫_t^1 φ_n(s) ds = λ_n φ_n(t).

    We differentiate this equation twice:

    ∫_t^1 φ_n(s) ds = λ_n φ_n′(t)   and   −φ_n(t) = λ_n φ_n″(t),

    where primes denote differentiation with respect to t.

  • Example (continued). From the above equations we immediately see that the right boundary conditions are φ(0) = φ′(1) = 0. The eigenvalues and eigenfunctions are

    φ_n(t) = √2 sin((1/2)(2n − 1)πt),   λ_n = (2/((2n − 1)π))².

    Thus, the Karhunen-Loève expansion of Brownian motion on [0, 1] is

    W_t = √2 Σ_{n=1}^{∞} ξ_n sin((n − 1/2)πt)/((n − 1/2)π).   (25)

  • Example KL expansion for Brownian motion

    function [t,bb] = KLBmotion(dt)
    % generate paths of Brownian motion on [0,1]
    % using the KL expansion
    t = 0:dt:1;
    N = length(t);
    bb = zeros(1,N);
    for i = 1:N,
        xi = sqrt(2)*randn/((i-0.5)*pi);
        bb = bb + xi*sin((i-0.5)*pi*t);
    end
    figure; plot(t,bb,'Linewidth',2);
    end

  • Example. The Karhunen-Loève expansion of the Brownian bridge B_t = W_t − tW_1 on [0, 1] is

    B_t = Σ_{n=1}^{∞} ξ_n √2 sin(nπt)/(nπ).   (26)

  • Example KL expansion for Brownian bridge

    function [t,bb] = KLBBridge(dt)
    % generate paths of the Brownian bridge on [0,1]
    % using the KL expansion
    t = 0:dt:1;
    N = length(t);
    bb = zeros(1,N);
    for i = 1:N,
        xi = sqrt(2)*randn/(i*pi);
        bb = bb + xi*sin(i*pi*t);
    end
    figure; plot(t,bb,'Linewidth',2);
    end

  • STOCHASTIC DIFFERENTIAL EQUATIONS (SDEs)


  • In this part of the course we will study stochastic differential equations (SDEs): ODEs driven by Gaussian white noise.

    Let W(t) denote a standard m-dimensional Brownian motion, h : Z → R^d a smooth vector-valued function and σ : Z → R^{d×m} a smooth matrix valued function (in this course we will take Z = T^d, R^d or R^l × T^{d−l}). Consider the SDE

    dz/dt = h(z) + σ(z) dW/dt,   z(0) = z_0.   (27)

    We think of the term dW/dt as representing Gaussian white noise: a mean-zero Gaussian process with correlation δ(t − s)I. The function h in (27) is sometimes referred to as the drift and σ as the diffusion coefficient.

  • Such a process exists only as a distribution. The precise interpretation of (27) is as an integral equation for z(t) ∈ C(R⁺, Z):

    z(t) = z_0 + ∫_0^t h(z(s)) ds + ∫_0^t σ(z(s)) dW(s).   (28)

    In order to make sense of this equation we need to define the stochastic integral against W(s).

  • The Itô Stochastic Integral

    For the rigorous analysis of stochastic differential equations it is necessary to define stochastic integrals of the form

    I(t) = ∫_0^t f(s) dW(s),   (29)

    where W(t) is a standard one dimensional Brownian motion. This is not straightforward because W(t) does not have bounded variation.

    In order to define the stochastic integral we assume that f(t) is a random process, adapted to the filtration F_t generated by the process W(t), and such that

    E(∫_0^T f(s)² ds) < ∞.

  • The Itô stochastic integral I(t) is defined as the L²-limit of the Riemann sum approximation of (29):

    I(t) := lim_{K→∞} Σ_{k=1}^{K−1} f(t_{k−1}) (W(t_k) − W(t_{k−1})),   (30)

    where t_k = kΔt and KΔt = t.

    Notice that the function f(t) is evaluated at the left end of each interval [t_{n−1}, t_n] in (30).

    The resulting Itô stochastic integral I(t) is a.s. continuous in t.

    These ideas are readily generalized to the case where W(s) is a standard d-dimensional Brownian motion and f(s) ∈ R^{m×d} for each s.

  • The resulting integral satisfies the Itô isometry

    E|I(t)|² = ∫_0^t E|f(s)|²_F ds,   (31)

    where |·|_F denotes the Frobenius norm |A|_F = √(tr(AᵀA)).

    The Itô stochastic integral is a martingale:

    E I(t) = 0

    and

    E[I(t)|F_s] = I(s)   ∀t ≥ s,

    where F_s denotes the filtration generated by W(s).

  • Example. Consider the Itô stochastic integral

    I(t) = ∫_0^t f(s) dW(s),

    where f, W are scalar-valued. This is a martingale with quadratic variation

    ⟨I⟩_t = ∫_0^t (f(s))² ds.

    More generally, for f, W in arbitrary finite dimensions, the integral I(t) is a martingale with quadratic variation

    ⟨I⟩_t = ∫_0^t (f(s) ⊗ f(s)) ds.

  • The Stratonovich Stochastic Integral

    In addition to the Itô stochastic integral, we can also define the Stratonovich stochastic integral. It is defined as the L²-limit of a different Riemann sum approximation of (29), namely

    I_strat(t) := lim_{K→∞} Σ_{k=1}^{K−1} (1/2)(f(t_{k−1}) + f(t_k)) (W(t_k) − W(t_{k−1})),   (32)

    where t_k = kΔt and KΔt = t. Notice that the function f(t) is evaluated at both endpoints of each interval [t_{n−1}, t_n] in (32).

    The multidimensional Stratonovich integral is defined in a similar way. The resulting integral is written as

    I_strat(t) = ∫_0^t f(s) ∘ dW(s).

  • The limit in (32) gives rise to an integral which differs from the Itô integral.

    The situation is more complex than that arising in the standard theory of Riemann integration for functions of bounded variation: in that case the points in [t_{k−1}, t_k] where the integrand is evaluated do not affect the definition of the integral, via a limiting process.

    In the case of integration against Brownian motion, which does not have bounded variation, the limits differ.

    When f and W are correlated through an SDE, then a formula exists to convert between them.

  • The Itô and Stratonovich stochastic integrals coincide provided that the integrand is sufficiently regular.

    Suppose that there exist C, ε > 0 such that

    E|f(t, ω) − f(s, ω)|² ≤ C|t − s|^{1+ε},

    for all s, t ∈ [0, T]. Then

    ∫_0^T f(t, ω) dW_t = ∫_0^T f(t, ω) ∘ dW_t.

    This is more than the regularity of the solution to an SDE.

    On the other hand:

    ∫_0^T W_t dW_t = (1/2)W_T² − (1/2)T,   ∫_0^T W_t ∘ dW_t = (1/2)W_T².

  • Itô and Stratonovich stochastic integrals (Higham 2001)

    function [itoerr, straterr] = stint(T,N)
    %
    % Ito and Stratonovich integrals of W dW
    %
    dt = T/N;
    dW = sqrt(dt)*randn(1,N);          % increments
    W = cumsum(dW);                    % cumulative sum
    ito = sum([0,W(1:end-1)].*dW)
    strat = sum((0.5*([0,W(1:end-1)]+W) + 0.5*sqrt(dt)*randn(1,N)).*dW)
    itoerr = abs(ito - 0.5*(W(end)^2-T))
    straterr = abs(strat - 0.5*W(end)^2)
    end

  • We can associate a second order differential operator with an SDE, the generator of the process.

    Consider the Itô SDE

    dX_t = b(X_t) dt + σ(X_t) dW(t).   (33)

    We define

    Σ(x) = σ(x)σ(x)ᵀ.   (34)

    The generator of X_t is

    L = b·∇ + (1/2) Σ_{i,j=1}^{d} Σ_ij(x) ∂²/∂x_i∂x_j.   (35)

  • Using the generator we can write Itô's formula, which enables us to calculate the rate of change in time of functions V(t, X_t).

    In the absence of noise the rate of change of V can be written as

    (d/dt)V(t, X_t) = (∂V/∂t)(t, X_t) + A V(t, X_t),   (36)

    where X_t is the solution of the ODE Ẋ_t = b(X_t) and A is

    A = b(x)·∇.   (37)

    For the SDE the chain rule (36) has to be modified by the addition of a term due to noise:

    (d/dt)V(t, X_t) = (∂V/∂t)(t, X_t) + L V(t, X_t) + ⟨∇V(t, X_t), σ(X_t) (dW/dt)(t)⟩.   (38)

  • The additional term in LV is proportional to Σ and it arises from the lack of smoothness of Brownian motion.

    The precise interpretation of the expression for the rate of change of V is in integrated form:

    V(t, X_t) = V(X_0) + ∫_0^t (∂V/∂s)(s, X_s) ds + ∫_0^t L V(s, X_s) ds + ∫_0^t ⟨∇V(s, X_s), σ(X_s) dW(s)⟩.   (39)

    The Brownian differential scales like the square root of the differential in time: E(dW_t)² = dt. We can write Itô's formula in the form

    dV(t, x(t)) = (∂V/∂t) dt + Σ_{i=1}^{d} (∂V/∂x_i) dx_i + (1/2) Σ_{i,j=1}^{d} (∂²V/∂x_i∂x_j) dx_i dx_j.   (40)

    We are using the convention dW_i(t) dW_j(t) = δ_ij dt, dW_i(t) dt = 0, i, j = 1, ..., d. Thus, we can think of (40) as a generalization of Leibniz' rule (38) where second order differentials are kept.

  • Taking the expectation of (40) and using the fact that the stochastic integral is a martingale, we obtain (we assume that V is independent of time and that the I.C. are deterministic)

    E V(X_t) = V(X_0) + E ∫_0^t L V(X_s) ds.

    We can use Itô's formula to calculate the expectation value of functionals of the solution of an SDE.

    For a Stratonovich SDE the rules of standard calculus apply; let X_t denote the solution of the Stratonovich SDE

    dX_t = b(X_t) dt + σ(X_t) ∘ dW_t,   (41)

    where b : R^d → R^d and σ : R^d → R^{d×d}. Then the generator of X_t is

    L = b·∇ + (1/2) σ : ∇(σᵀ∇).   (42)

    Furthermore, the Newton-Leibniz chain rule applies: let h ∈ C²(R^d) and define Y_t = h(X_t). Then

    dY_t = ∇h(X_t)·(b(X_t) dt + σ(X_t) ∘ dW_t).   (43)

  • Example (geometric Brownian motion). Consider the linear Itô SDE with multiplicative noise

    dX_t = μX_t dt + σX_t dW_t,   X_0 = x.   (44)

    The generator of X_t is

    L = μx ∂/∂x + (σ²x²/2) ∂²/∂x².

    The solution of (44) is

    X(t) = x exp((μ − σ²/2)t + σW(t)).   (45)

  • Example (geometric Brownian motion contd.) To derive this formula, we apply Itô's formula to the function f(x) = log(x):

    d log(X_t) = L(log(X_t)) dt + σx ∂_x log(X_t) dW_t
               = (μX_t (1/X_t) + (σ²X_t²/2)(−1/X_t²)) dt + σ dW_t
               = (μ − σ²/2) dt + σ dW_t.

    Consequently:

    log(X_t/X_0) = (μ − σ²/2)t + σW(t),

    from which (45) follows.

  • Existence and Uniqueness of solutions for SDEs

    Definition

    By a solution of (27) we mean an R^d-valued stochastic process {X_t} on t ∈ [0, T] with the properties:

    1. X_t is continuous and F_t-adapted, where the filtration is generated by the Brownian motion W(t);
    2. h(X_t) ∈ L¹((0, T)), σ(X_t) ∈ L²((0, T));
    3. equation (27) holds for every t ∈ [0, T] with probability 1.

    The solution is called unique if any two solutions X_i(t), i = 1, 2, satisfy

    P(X_1(t) = X_2(t), ∀t ∈ [0, T]) = 1.

  • It is well known that existence and uniqueness of solutions for ODEs (i.e. when σ ≡ 0 in (27)) holds for globally Lipschitz vector fields h(x).

    A very similar theorem holds when σ ≠ 0.

    As for ODEs, the conditions can be weakened when a priori bounds on the solution can be found.

  • Theorem

    Assume that both h(·) and σ(·) are globally Lipschitz on R^d and that x_0 is a random variable independent of the Brownian motion W(t) with E|x_0|² < ∞. Then the SDE (27) has a unique solution z(t) ∈ C(R⁺; R^d) with E ∫_0^T |z(t)|² dt < ∞ for all T < ∞.

  • Remarks. The Stratonovich analogue of (27) is

    dX/dt = h(X) + σ(X) ∘ dW/dt,   X(0) = X_0.   (46)

    By this we mean that X ∈ C(R⁺, R^d) satisfies the integral equation

    X(t) = X(0) + ∫_0^t h(X(s)) ds + ∫_0^t σ(X(s)) ∘ dW(s).   (47)

    By using definitions (30) and (32) it can be shown that X satisfying the Stratonovich SDE (46) also satisfies the Itô SDE

    dX/dt = h(X) + (1/2)∇·(σ(X)σ(X)ᵀ) − (1/2)σ(X)∇·(σ(X)ᵀ) + σ(X) dW/dt,   (48a)
    X(0) = X_0,   (48b)

    provided that σ(z) is differentiable.

  • White noise is, in most applications, an idealization of a stationary random process with short correlation time. In this context the Stratonovich interpretation of an SDE is particularly important because it often arises as the limit obtained by using smooth approximations to white noise.

    On the other hand, the martingale machinery which comes with the Itô integral makes it more important as a mathematical object.

    It is very useful that we can convert from the Itô to the Stratonovich interpretation of the stochastic integral.

    There are other interpretations of the stochastic integral, e.g. the Klimontovich stochastic integral.

  • Examples of SDEs

    The Ornstein-Uhlenbeck process ($\alpha, \sigma > 0$):

\[ dX_t = -\alpha X_t\,dt + \sigma\,dW_t. \tag{49} \]

    The solution is

\[ X_t = e^{-\alpha t}X_0 + \sigma\int_0^t e^{-\alpha(t-s)}\,dW_s. \]

    Geometric Brownian motion:

\[ dX_t = rX_t\,dt + \sigma X_t\,dW_t. \tag{50} \]

    The solution is

\[ X_t = X_0\,e^{\sigma W_t + (r - \frac{1}{2}\sigma^2)t}. \]

    Note that the solution is different for the Itô and the Stratonovich SDEs.

    The Cox-Ingersoll-Ross SDE ($\alpha, b, \sigma > 0$):

\[ dX_t = \alpha(b - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t. \tag{51} \]

  • Examples of SDEs (contd.)

    Stochastic Verhulst equation (population dynamics):

\[ dX_t = \big(\lambda X_t - X_t^2\big)\,dt + \sigma X_t\,dW_t. \tag{52} \]

    Lotka-Volterra SDEs:

\[ dX_i(t) = X_i(t)\Big(a_i + \sum_{j=1}^d b_{ij}X_j(t)\Big)\,dt + \sigma_i X_i(t)\,dW_i(t). \tag{53} \]

    Protein kinetics:

\[ dX_t = \big(\alpha - X_t + \lambda X_t(1 - X_t)\big)\,dt + \sigma X_t(1 - X_t)\,dW_t. \tag{54} \]

  • Examples of SDEs (contd.)

    Tracer particle (turbulent diffusion):

\[ dX_t = u(X_t,t)\,dt + \sigma\,dW_t, \quad \nabla\cdot u(x,t) = 0. \tag{55} \]

    Josephson junction (pendulum with friction and noise):

\[ \ddot{\phi}_t = -\sin(\phi_t) - \gamma\dot{\phi}_t + \sqrt{2\gamma\beta^{-1}}\,\dot{W}_t. \tag{56} \]

    Noisy Duffing oscillator (stochastic resonance):

\[ \ddot{X}_t = X_t - X_t^3 - \gamma\dot{X}_t + A\cos(\omega t) + \sigma\,\dot{W}_t. \tag{57} \]

    Stochastic heat equation:

\[ \partial_t u = \partial_x^2 u + \partial_t W(x,t), \tag{58} \]

    on $[0,1]$ with Dirichlet boundary conditions, where $W(x,t)$ denotes an infinite dimensional Brownian motion.

  • Consider the SDE (80). The generator $\mathcal{L}$ and its $L^2$-adjoint are

\[ \mathcal{L} = b(x)\,\partial_x + \frac{1}{2}\sigma^2(x)\,\partial_x^2, \]

    and

\[ \mathcal{L}^* f = \partial_x\Big(-b(x)f + \frac{1}{2}\partial_x\big(\sigma^2(x)f\big)\Big). \]

    We can obtain evolution equations for the expectation of functionals of the solution of the SDE and for the transition probability density.

  • Theorem

    (Kolmogorov) Let $f(x) \in C_b(\mathbb{R})$ and let

\[ u(x,s) := \mathbb{E}\big(f(X_t)\,|\,X_s = x\big) = \int f(y)\,p(y,t|x,s)\,dy \in C_b^2(\mathbb{R}). \]

    Assume furthermore that the functions $b(x)$, $\Sigma(x) = \sigma^2(x)$ are continuous in $x$. Then $u(x,s) \in C^{2,1}(\mathbb{R}\times\mathbb{R}^+)$ and it solves the final value problem

\[ -\frac{\partial u}{\partial s} = b(x)\frac{\partial u}{\partial x} + \frac{1}{2}\Sigma(x)\frac{\partial^2 u}{\partial x^2}, \quad \lim_{s\uparrow t} u(x,s) = f(x). \tag{59} \]

  • Theorem

    (Kolmogorov) Assume that $p(y,t|\cdot,\cdot)$, $b(y)$, $\Sigma(y) \in C^2(\mathbb{R}\times\mathbb{R}^+)$. Then the transition probability density of $X_t$ satisfies the equation

\[ \frac{\partial p}{\partial t} = \frac{\partial}{\partial y}\big(-b(y)p\big) + \frac{1}{2}\frac{\partial^2}{\partial y^2}\big(\Sigma(y)p\big), \quad p(y,0|x) = \delta(y - x). \tag{60} \]

  • Assume that the initial distribution of $X_t$ is $\rho_0(x)$. Define

\[ p(y,t) := \int p(y,t|x)\,\rho_0(x)\,dx. \]

    We multiply the forward Kolmogorov equation (60) by $\rho_0(x)$ and integrate with respect to $x$ to obtain the equation

\[ \frac{\partial p(y,t)}{\partial t} = \frac{\partial}{\partial y}\big(-b(y)\,p(y,t)\big) + \frac{1}{2}\frac{\partial^2}{\partial y^2}\big(\Sigma(y)\,p(y,t)\big), \tag{61} \]

    together with the initial condition

\[ p(y,0) = \rho_0(y). \tag{62} \]

  • Numerical Solution of SDEs

    We consider the scalar Itô SDE

\[ dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x, \tag{63} \]

    posed on the interval $[0,T]$. The initial condition can be either deterministic or random.

    Our goal is to obtain an approximate solution to this equation in order to compute quantities of the form $\mathbb{E}f(X_t)$, where $\mathbb{E}$ denotes the expectation with respect to the law of the process $X_t$.

    The simplest numerical method is the Euler-Maruyama method, the analogue of the explicit Euler method for ODEs.

  • We partition the interval $[0,T]$ into $N$ equal subintervals of size $\Delta t$, with $N\Delta t = T$. We set

\[ X_j := X(j\Delta t), \quad j = 0,1,\dots,N. \]

    The Euler-Maruyama method for the SDE (63) is

\[ X_j = X_{j-1} + b(X_{j-1})\,\Delta t + \sigma(X_{j-1})\,\Delta W_j, \quad j = 1,\dots,N, \tag{64} \]

    where $\Delta W_j = W_j - W_{j-1}$ denotes the $j$th Brownian increment. We can write $\Delta W_j = \sqrt{\Delta t}\,\xi_j$ with $\xi_j \sim \mathcal{N}(0,1)$ i.i.d., so the Euler-Maruyama scheme can be written as

\[ X_j = X_{j-1} + b(X_{j-1})\,\Delta t + \sigma(X_{j-1})\sqrt{\Delta t}\,\xi_j, \quad j = 1,\dots,N. \tag{65} \]
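    A minimal MATLAB sketch of the scheme (65), written here for the Ornstein-Uhlenbeck example of the following slides; the parameter values and the random initial condition are illustrative assumptions.

% Euler-Maruyama (65) for dX = -alpha*X dt + sqrt(2*sigma) dW; a sketch
alpha = 1; sigma = 0.5;            % illustrative parameter values
T = 10; N = 1024; dt = T/N;        % time horizon and step size
X = zeros(1,N+1); X(1) = 2*rand;   % initial condition X0 ~ 2*U(0,1)
for j = 1:N
    X(j+1) = X(j) - alpha*X(j)*dt + sqrt(2*sigma)*sqrt(dt)*randn;
end
plot(0:dt:T, X)                    % one sample path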

  • Figure: Five sample paths of the Ornstein-Uhlenbeck process ($X_t$ against $t$).

  • We solve the Ornstein-Uhlenbeck SDE in one dimension using the EM scheme:

\[ dX_t = -\alpha X_t\,dt + \sqrt{2\sigma}\,dW_t. \tag{66} \]

    We solve (66) for $\alpha = 1$, $\sigma = \tfrac{1}{2}$ for $t \in [0,10]$ with $\Delta t = 0.0098$ and initial conditions $X_0 \sim 2\,\mathcal{U}(0,1)$, where $\mathcal{U}(0,1)$ denotes the uniform distribution in the interval $(0,1)$.

    In Figure 7 we present five sample paths of the OU process.

    In Figure 8 we plot the first two moments of the Euler-Maruyama approximation of the OU process. We compare against the theoretical solution.

  • Figure: First moment $\mathbb{E}X_t$ (panel a) and second moment $\mathbb{E}X_t^2$ (panel b) of the Ornstein-Uhlenbeck process computed with the Euler-Maruyama method, compared against the exact formula.

  • We use the analytical solution to investigate the error of the Euler-Maruyama scheme as a function of $\Delta t$.

    We generate 1,000 paths using the EM scheme as well as the exact solution, using the same Brownian paths.

    Since the noise in the OU SDE is additive, we expect that the strong order of convergence is 1.

    This is verified by our numerical experiments, see Figure 9.

  • Figure: Strong order of convergence for the Euler-Maruyama method for the Ornstein-Uhlenbeck process: $\mathbb{E}|X(T) - X^{em}(T)|$ against $\Delta t$ in log-log scale. The linear equation with slope 1 is plotted for comparison.
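    The experiment behind such a plot can be sketched as follows. It is written here for geometric Brownian motion, whose exact solution (45) can be evaluated from the same Brownian increments that drive the EM scheme; with multiplicative noise the fitted slope is close to 1/2, whereas repeating the experiment for the OU process with its exact solution gives slope 1 as in the figure. All parameter values are illustrative assumptions.

% Strong error of EM at time T as a function of dt, over M paths driven
% by the same Brownian increments as the exact gBM solution (45)
mu = 1; sig = 0.5; X0 = 1; T = 1; M = 1000;
dtvals = T./2.^(4:9);                      % sequence of step sizes
err = zeros(size(dtvals));
for k = 1:length(dtvals)
    dt = dtvals(k); N = round(T/dt);
    dW = sqrt(dt)*randn(M,N);              % Brownian increments
    Xem = X0*ones(M,1);
    for j = 1:N
        Xem = Xem + mu*Xem*dt + sig*Xem.*dW(:,j);
    end
    Xex = X0*exp((mu - 0.5*sig^2)*T + sig*sum(dW,2));
    err(k) = mean(abs(Xex - Xem));         % estimate of E|X(T) - X_em(T)|
end
p = polyfit(log(dtvals), log(err), 1);     % p(1) ~ strong order
loglog(dtvals, err, 'o-')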

  • Consider the motion of a Brownian particle in a bistable potential:

\[ dX_t = -V'(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t, \tag{67} \]

    with

\[ V(x) = \frac{x^4}{4} - \frac{x^2}{2}. \tag{68} \]

    The deterministic dynamical system has two stable equilibria.

    For weak noise strengths the solution of the SDE spends most of its time oscillating around the minima of the potential.

    In Figure 10 we present a sample path of the SDE with $\beta = 10$, obtained using the EM algorithm.
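    A sketch of this computation with the EM scheme, using $V'(x) = x^3 - x$; the step size is an illustrative assumption.

% EM for the bistable SDE (67)-(68): dX = (X - X^3) dt + sqrt(2/beta) dW
beta = 10; T = 500; dt = 0.01; N = round(T/dt);
X = zeros(1,N+1); X(1) = 1;            % start in the right-hand well
for j = 1:N
    X(j+1) = X(j) + (X(j) - X(j)^3)*dt + sqrt(2/beta)*sqrt(dt)*randn;
end
plot(0:dt:T, X)                        % occasional hops between the wells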

  • Figure: Sample path of the SDE (67).

  • Figure: Second moment of the bistable SDE (67).

  • Another standard numerical method for solving stochastic differential equations is Milstein's method:

\[ X_j = X_{j-1} + b(X_{j-1})\,\Delta t + \sigma(X_{j-1})\,\xi_j\sqrt{\Delta t} + \frac{1}{2}\sigma(X_{j-1})\,\sigma'(X_{j-1})\,\Delta t\,\big(\xi_j^2 - 1\big). \tag{69} \]

    Notice that there is an additional term, in comparison to the EM scheme.

    The Milstein scheme converges strongly with order 1.
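    A minimal sketch of one Milstein path (69) for geometric Brownian motion, where $b(x) = \mu x$, $\sigma(x) = sx$ and hence $(\sigma\sigma')(x) = s^2 x$; the parameter values are illustrative.

% Milstein scheme (69) for gBM: dX = mu*X dt + s*X dW
mu = 1; s = 0.5; T = 1; N = 256; dt = T/N;
X = zeros(1,N+1); X(1) = 1;
for j = 1:N
    xi = randn;
    X(j+1) = X(j) + mu*X(j)*dt + s*X(j)*sqrt(dt)*xi ...
             + 0.5*s^2*X(j)*dt*(xi^2 - 1);   % Milstein correction term
end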

  • The drift term is discretized in the same way in the Euler and Milstein schemes. On the other hand, the Milstein scheme has an additional term, which is related to the approximation of the stochastic term in (63).

    We can derive the Milstein scheme as follows. First, we write an increment of the solution to the SDE in the form

\[ X_{j+1} = X_j + \int_{j\Delta t}^{(j+1)\Delta t} b(X_s)\,ds + \int_{j\Delta t}^{(j+1)\Delta t} \sigma(X_s)\,dW_s. \tag{70} \]

    We apply Itô's formula to the drift and diffusion coefficients to obtain

\[ b(X_s) = b(X_j) + \int_{j\Delta t}^{s} (\mathcal{L}b)(X_\ell)\,d\ell + \int_{j\Delta t}^{s} (\sigma b')(X_\ell)\,dW_\ell \]

    and

\[ \sigma(X_s) = \sigma(X_j) + \int_{j\Delta t}^{s} (\mathcal{L}\sigma)(X_\ell)\,d\ell + \int_{j\Delta t}^{s} (\sigma\sigma')(X_\ell)\,dW_\ell, \]

    for $s \in [j\Delta t, (j+1)\Delta t]$, where $\mathcal{L}$ denotes the generator of the process $X_t$.

  • We substitute these formulas into (70) to obtain, with $\xi_j \sim \mathcal{N}(0,1)$,

\[
\begin{aligned}
X_{j+1} = {}& X_j + b(X_j)\,\Delta t + \sigma(X_j)\,\Delta W_j \\
& + \int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s}(\mathcal{L}b)(X_\ell)\,d\ell\,ds + \int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s}(\sigma b')(X_\ell)\,dW_\ell\,ds \\
& + \int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s}(\mathcal{L}\sigma)(X_\ell)\,d\ell\,dW_s + \int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s}(\sigma\sigma')(X_\ell)\,dW_\ell\,dW_s \\
= {}& X_j + b(X_j)\,\Delta t + \sigma(X_j)\,\Delta W_j + (\sigma\sigma')(X_j)\int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s} dW_\ell\,dW_s + O\big(\Delta t^{3/2}\big) \\
= {}& X_j + b(X_j)\,\Delta t + \sigma(X_j)\,\Delta W_j + \tfrac{1}{2}(\sigma\sigma')(X_j)\big(\Delta W_j^2 - \Delta t\big) + O\big(\Delta t^{3/2}\big) \\
= {}& X_j + b(X_j)\,\Delta t + \sigma(X_j)\,\Delta W_j + \tfrac{1}{2}\Delta t\,(\sigma\sigma')(X_j)\big(\xi_j^2 - 1\big) + O\big(\Delta t^{3/2}\big);
\end{aligned}
\]

    discarding the remainder yields the Milstein scheme (69). In the above, we have used the fact that $(\Delta t)^{\alpha}(\Delta W_j)^{\beta} = O\big((\Delta t)^{\alpha+\beta/2}\big)$ and that, in one dimension,

\[ \int_{j\Delta t}^{(j+1)\Delta t}\!\!\int_{j\Delta t}^{s} dW_\ell\,dW_s = \frac{1}{2}\big(\Delta W_j^2 - \Delta t\big). \tag{71} \]

  • We will say that a numerical scheme has strong order of convergence $\gamma$ if there exists a positive constant $C$ such that

\[ \mathbb{E}\big|X_j - X(j\Delta t)\big| \leqslant C\,\Delta t^{\gamma}, \tag{72} \]

    for any $j = 1,\dots,N$ and for $\Delta t$ sufficiently small.

    We usually evaluate (72) at $t = T$.

    We will show that the Euler-Maruyama method has strong order of convergence $\tfrac12$.

    The Milstein scheme has strong order 1.

    The strong order of convergence for the Euler-Maruyama method becomes 1 when the noise is additive.

    There are applications, such as filtering and statistical inference, where strong convergence of the numerical solution of the SDE is needed.

  • Quite often we are interested in the calculation of statistical quantities of solutions to SDEs, such as moments, rather than in pathwise properties.

    For such a purpose, the strong convergence (72) is more than what we actually need.

    We will say that a numerical method has weak order of convergence $\gamma$ provided that there exists a positive constant $C$ such that for all $f \in C_P^{2(\gamma+1)}(\mathbb{R})$ (the class of $2(\gamma+1)$ times continuously differentiable functions which, together with their derivatives up to and including order $2(\gamma+1)$, have at most polynomial growth)

\[ \big|\mathbb{E}f(X_j) - \mathbb{E}f(X(j\Delta t))\big| \leqslant C\,\Delta t^{\gamma}, \tag{73} \]

    for any $j = 1,\dots,N$ and for $\Delta t$ sufficiently small.

    Both the Euler-Maruyama and the Milstein schemes have weak order of convergence 1.

    The Milstein scheme requires more function evaluations.

  • For the calculation of the expectation in (72) and (73) we need to use a Monte Carlo method.

    We generate $N$ sample paths of the exact solution of the SDE and of the numerical discretization that are driven by the same noise, and compute

\[ \epsilon = \frac{1}{N}\sum_{k=1}^{N}\big|X_T^{k} - X_T^{\Delta t, k}\big|. \]

    $\epsilon$ is a random variable for which we can prove a central limit theorem.

    In addition to the discretization error we also have the Monte Carlo error.

    We can perform similar numerical experiments in order to study the weak order of convergence. In this case the exact and the approximate solutions do not need to be driven by the same noise.

  • To estimate the strong order of convergence of a numerical scheme we plot the error as a function of the step size in log-log scale and use least squares fitting:

\[ \log(\mathrm{err}) = \log C + \gamma\log\Delta t. \]

    The weak convergence error consists of two parts, the discretization error and the statistical error due to the MC approximation. We estimate ($X^{\Delta t}_T$ denotes the solution of the numerical scheme):

\[
\begin{aligned}
\epsilon &= \Big|\mathbb{E}f(X_T) - \frac{1}{M}\sum_{m=1}^{M} f\big(X^{\Delta t, m}_{N\Delta t}\big)\Big| \\
&\leqslant \big|\mathbb{E}f(X_T) - \mathbb{E}f(X^{\Delta t}_T)\big| + \Big|\mathbb{E}f(X^{\Delta t}_T) - \frac{1}{M}\sum_{m=1}^{M} f\big(X^{\Delta t, m}_{N\Delta t}\big)\Big| \\
&\leqslant C\,N^{-\gamma} + C\,M^{-1/2}.
\end{aligned}
\]

    Fixed computational resources: optimize over $N$, $M$.
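    A sketch of the weak-error experiment for EM applied to gBM with $f(x) = x$, for which $\mathbb{E}f(X_T) = X_0 e^{\mu T}$ is known in closed form; the computed estimate contains both the $O(\Delta t)$ discretization error and the $O(M^{-1/2})$ statistical error. Parameter values are illustrative.

% Weak error of EM for gBM with f(x) = x; exact value is X0*exp(mu*T)
mu = 1; s = 0.5; X0 = 1; T = 1; M = 1e5;
dtvals = T./2.^(2:6);
werr = zeros(size(dtvals));
for k = 1:length(dtvals)
    dt = dtvals(k); N = round(T/dt);
    X = X0*ones(M,1);                       % M independent paths
    for j = 1:N
        X = X + mu*X*dt + s*X.*(sqrt(dt)*randn(M,1));
    end
    werr(k) = abs(mean(X) - X0*exp(mu*T));  % discretization + MC error
end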

  • The EM or Milstein schemes provide us with the solution of the SDE only at the discretization times. The values at intermediate instants can be estimated using interpolation:

    Constant interpolation:

\[ \bar{X}(t) = X_{j_t}, \quad j_t = \max\{j = 0,1,2,\dots,N : t_j \leqslant t\}; \]

    linear interpolation:

\[ \bar{X}(t) = X_{j_t} + \frac{t - t_{j_t}}{t_{j_t+1} - t_{j_t}}\big(X_{j_t+1} - X_{j_t}\big), \]

    where $t_j = j\Delta t$ and where we have used $\bar{X}(t)$ to denote the interpolated numerical solution of the SDE.
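    A minimal sketch of the two interpolants at an intermediate time; the discrete path used here (a Gaussian random walk) is only an illustrative placeholder.

% Constant and linear interpolation of a discrete path at a time t
dt = 0.1; T = 1; tgrid = 0:dt:T;
X = cumsum([0, sqrt(dt)*randn(1, numel(tgrid)-1)]);    % placeholder path
t = 0.37;                                 % an intermediate instant
jt = find(tgrid <= t, 1, 'last');         % last node with t_j <= t
Xconst = X(jt);                           % constant interpolation
Xlin = X(jt) + (t - tgrid(jt))/dt*(X(jt+1) - X(jt));   % linear interpolation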

  • Consider an approximate Brownian motion $W^h(t)$ constructed as a Gaussian random walk at $t_n = nh$, $h = T/N$, $T = 1$, and by linear interpolation in between. Then

\[ \mathbb{E}\int_0^1 \big|W^h(t) - W(t)\big|\,dt = c_1\,N^{-1/2}, \]

    where $c_1 = \sqrt{\pi/32}$. Furthermore,

\[ \lim_{N\to+\infty}\sqrt{\frac{N}{\log N}}\;\mathbb{E}\sup_{0\leqslant t\leqslant 1}\big|W^h(t) - W(t)\big| = c_2, \]

    for some constant $c_2 \in (0,+\infty)$.

    We can obtain similar results for diffusion processes.

    It is possible to simulate a diffusion process exactly: "Exact simulation of diffusions", A. Beskos and G.O. Roberts, The Annals of Applied Probability, 2005.

  • Consider an SDE in arbitrary dimensions:

\[ dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \tag{74} \]

    where $X_t : [0,T] \to \mathbb{R}^d$, $b \in C(\mathbb{R}^d;\mathbb{R}^d)$, $\sigma \in C(\mathbb{R}^d;\mathbb{R}^{d\times d})$ and $W_t$ is a standard $d$-dimensional Brownian motion.

    The EM scheme is (we use the notation $b^i_n = b^i(X_n)$ for the $i$th component of the vector $b$, etc.)

\[ X^i_{n+1} = X^i_n + b^i_n\,\Delta t + \sum_{j=1}^d \sigma^{ij}_n\,\Delta W^j_n. \tag{75} \]

    It is straightforward to implement this scheme.
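    A vectorized MATLAB sketch of one step of (75); the drift and diffusion matrix below are illustrative placeholders.

% One EM step (75) in d dimensions
d = 2;
b = @(x) -x;                       % drift (illustrative)
sig = @(x) eye(d);                 % diffusion matrix (illustrative)
dt = 0.01; Xn = [1; 0];
dW = sqrt(dt)*randn(d,1);          % d independent Brownian increments
Xnp1 = Xn + b(Xn)*dt + sig(Xn)*dW;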

  • For the Milstein scheme we have:

\[ X^i_{n+1} = X^i_n + b^i_n\,\Delta t + \sum_{j=1}^d \sigma^{ij}_n\,\Delta W^j_n + \sum_{j_1,j_2=1}^{d}\sum_{\ell=1}^{d} \sigma^{\ell j_1}_n\,\frac{\partial \sigma^{i j_2}_n}{\partial x_\ell}\,I_{j_1 j_2}, \tag{76} \]

    where $I_{j_1 j_2}$ denotes a double Itô integral:

\[ I_{j_1 j_2} = \int_{t_n}^{t_{n+1}}\!\!\int_{t_n}^{s_1} dW^{j_1}_{s_2}\,dW^{j_2}_{s_1}. \tag{77} \]

    The diagonal terms $I_{jj}$ are given by the one-dimensional formula (71). The off-diagonal elements might be difficult/expensive to simulate.

    We can obtain higher order schemes using the stochastic Taylor expansion.

    Higher order schemes require the evaluation of a large number of derivatives of the drift and diffusion coefficients.

  • We can exploit the structure of the SDE to simplify the calculation of the Milstein correction to the EM scheme.

    Consider a nonlinear oscillator with friction and noise (example: the noisy Duffing-Van der Pol oscillator):

\[ \ddot{X}_t = b(X_t) - \gamma(X_t)\dot{X}_t + \sum_{j=1}^{d}\sigma_j(X_t)\,\dot{W}_j. \tag{78} \]

    Write it as a first order system:

\[
\begin{aligned}
dX^1_t &= X^2_t\,dt, \\
dX^2_t &= \big(b(X^1_t) - \gamma(X^1_t)X^2_t\big)\,dt + \sum_{j=1}^{d}\sigma_j(X^1_t)\,dW_j.
\end{aligned}
\]

    The Milstein discretization for this SDE is

\[
\begin{aligned}
Y^1_{n+1} &= Y^1_n + Y^2_n\,\Delta t, \\
Y^2_{n+1} &= Y^2_n + \big(b(Y^1_n) - \gamma(Y^1_n)Y^2_n\big)\,\Delta t + \sum_{j=1}^{d}\sigma_j(Y^1_n)\,\Delta W^j_n.
\end{aligned}
\]

  • No Milstein correction appears (the double Itô integrals vanish). This scheme has strong order of convergence 1.

    Solve the first equation for $Y^2_n$ and substitute into the second to obtain a multistep scheme for $Y^1_n$:

\[ Y^1_{n+2} = \big(2 - \gamma(Y^1_n)\Delta t\big)Y^1_{n+1} - \big(1 - \gamma(Y^1_n)\Delta t\big)Y^1_n + b(Y^1_n)(\Delta t)^2 + \sum_{j=1}^{d}\sigma_j(Y^1_n)\,\Delta W^j_n\,\Delta t. \tag{79} \]

    This is the Lépingle-Ribémont scheme. The first equation in the Milstein scheme can be used as a starting routine and for calculating $Y^2_n$.

  • Population dynamics:

\[ dX_t = rX_t(K - X_t)\,dt + \sigma X_t\,dW_t, \quad X(0) = X_0. \]

    Motion in a periodic incompressible velocity field subject to molecular diffusion:

\[ dX_t = v(X_t)\,dt + \sqrt{2D}\,dW_t, \quad \nabla\cdot v(x) = 0. \]

% Compute effective diffusivities for particles moving in periodic velocity
% fields using Monte Carlo. The velocity field is a Taylor-Green flow.
% dt: time step, N: number of steps, M: number of paths
% D: molecular diffusion
function [kxx, kyy, tint] = effdiff(N, dt, M, D)
T = dt*N;
tint = [0:dt:T];
u = zeros(M,N);              % x-coordinates of the M particles
v = zeros(M,N);              % y-coordinates of the M particles
uin = u(:,1);
vin = v(:,1);
DDD = num2str(D);
MMM = num2str(M);
elll = num2str(dt);
dW1 = sqrt(dt)*randn(M,N);   % Brownian increments
dW2 = sqrt(dt)*randn(M,N);
% Euler-Maruyama; chsowx, chsowy evaluate the two components of the
% Taylor-Green velocity field (defined elsewhere in the course codes)
for j = 1:N-1
    u(:,j+1) = u(:,j) + chsowx(u(:,j),v(:,j))*dt + sqrt(2*D)*dW1(:,j);
    v(:,j+1) = v(:,j) + chsowy(u(:,j),v(:,j))*dt + sqrt(2*D)*dW2(:,j);
end
v = [vin, v];
u = [uin, u];
figure
plot(u(1,:), v(1,:))         % trajectory of the first particle
grid on
% effective diffusivities from the variance of the displacement
% (note: kxx from the x-coordinates u, kyy from the y-coordinates v;
% the first entry is 0/0 = NaN since tint(1) = 0)
kxx = var(u)./(2*tint);
kyy = var(v)./(2*tint);
figure
plot(tint, kxx, 'r', 'LineWidth', 3)
ylabel('var(x)/2t','Fontsize',16,'Rotation',0)
xlabel('t','Fontsize',16)
title(['var(x)/2t', ' D = ' DDD ', M = ' MMM ', dt = ' elll])
figure
plot(tint, kyy, 'r', 'LineWidth', 3)
ylabel('var(y)/2t','Fontsize',16,'Rotation',0)
xlabel('t','Fontsize',16)
title(['var(y)/2t', ' D = ' DDD ', M = ' MMM ', dt = ' elll])

  • We consider the Itô SDE

\[ dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t, \quad X_0 = x. \tag{80} \]

    We assume that the coefficients are smooth, globally Lipschitz continuous and satisfy the linear growth condition:

\[ |b(x) - b(y)| + |\sigma(x) - \sigma(y)| \leqslant C|x - y| \tag{81} \]

    and

\[ |b(x)| + |\sigma(x)| \leqslant C(1 + |x|). \tag{82} \]

    We also assume that the initial condition is a random variable independent of the Brownian motion $W_t$ with

\[ \mathbb{E}|X_0|^2 \leqslant C. \]

    Under the above assumptions there exists a unique strong solution to (80) for $t \in [0,T]$.

  • Lemma

    Under the above assumptions the solution to the SDE satisfies

\[ \mathbb{E}\sup_{t\in[0,T]}|X_t|^2 \leqslant C(T). \tag{83} \]

    Proof. Use the assumptions on the coefficients $b(\cdot)$ and $\sigma(\cdot)$ and on the initial condition, together with the Cauchy-Schwarz, Burkholder-Davis-Gundy and Gronwall inequalities.

    The BDG inequality that we need is

\[ c_2\,\mathbb{E}\int_0^T \sigma^2(s)\,ds \;\leqslant\; \mathbb{E}\sup_{t\in[0,T]}\Big(\int_0^t \sigma(s)\,dW_s\Big)^2 \;\leqslant\; C_2\,\mathbb{E}\int_0^T \sigma^2(s)\,ds. \tag{84} \]

  • Let $\{X^{\Delta t}_{t_n}\}_{n=0}^{N}$ denote the Euler discretisation of the SDE (80) with constant interpolation.

    We partition the interval $[0,T]$: $t_n = n\Delta t$, $n = 0,1,\dots,N$, $\Delta t = T/N$.

    Lemma

    Under the above assumptions we have

\[ \mathbb{E}\sup_{t\in[t_n,t_{n+1}]}\big|X_t - X_{t_n}\big|^2 \leqslant C\,\Delta t. \tag{85} \]

    Proof. Use the assumptions on the coefficients, the BDG inequality and Lemma 34.

  • Theorem

    Let $\{X^{\Delta t}_{t_n}\}_{n=0}^{N}$ denote the Euler discretisation of the SDE (80) with constant interpolation. Under the above assumptions we have

\[ \mathbb{E}\sup_{t\in[0,T]}\big|X_t - X^{\Delta t}_t\big|^2 \leqslant C\,\Delta t. \tag{86} \]

    For the proof we will need the discrete Gronwall inequality in the following form: let $\{y_n\}_{n=0}^{N}$ be a sequence and $A, B > 0$ satisfying

\[ y_0 = 0, \quad y_n \leqslant A + Bh\sum_{j=0}^{n-1} y_j, \quad 1 \leqslant n \leqslant N, \quad h = 1/N. \]

    Then

\[ \max_{0\leqslant i\leqslant N} y_i \leqslant A\,e^{B}. \tag{87} \]

  • We can write the Euler scheme in the form

\[ X^{\Delta t}_{t_{n+1}} = X^{\Delta t}_{t_n} + \int_{t_n}^{t_{n+1}} b\big(X^{\Delta t}_{t_n}\big)\,ds + \int_{t_n}^{t_{n+1}} \sigma\big(X^{\Delta t}_{t_n}\big)\,dW_s. \]

    For the exact solution we have

\[ X_{n+1} = X_n + \int_{t_n}^{t_{n+1}} b(X_n)\,ds + \int_{t_n}^{t_{n+1}} \sigma(X_n)\,dW_s + \epsilon_n, \]

    where

\[ \epsilon_n = \int_{t_n}^{t_{n+1}} \big(b(X_s) - b(X_n)\big)\,ds + \int_{t_n}^{t_{n+1}} \big(\sigma(X_s) - \sigma(X_n)\big)\,dW_s. \]

  • Take the difference between the Euler approximation and the exact solution (use the notation $\Delta X_n = X_n - X^{\Delta t}_{t_n}$, $\Delta b(X_n) = b(X_n) - b(X^{\Delta t}_{t_n})$ etc.):

\[ \Delta X_{n+1} = \Delta X_n + \int_{t_n}^{t_{n+1}} \Delta b_n\,ds + \int_{t_n}^{t_{n+1}} \Delta\sigma_n\,dW_s + \epsilon_n. \]

    Sum over $n$ and use $\Delta X_0 = 0$ to obtain

\[ \Delta X_N = \sum_{n=0}^{N-1}\int_{t_n}^{t_{n+1}} \Delta b_n\,ds + \sum_{n=0}^{N-1}\int_{t_n}^{t_{n+1}} \Delta\sigma_n\,dW_s + R_N, \]

    where

\[ R_N = \sum_{n=0}^{N-1}\epsilon_n. \]

  • Use the Lipschitz continuity, Cauchy-Schwarz and BDG inequalities to obtain (for $\Delta t \leqslant 1$)

\[ \mathbb{E}|\Delta X_{n+1}|^2 \leqslant C(T)\,\Delta t\sum_{n=0}^{N-1}\mathbb{E}|\Delta X_n|^2 + C\,\mathbb{E}|R_N|^2. \]

    Use Lemma 35, Lipschitz continuity, Cauchy-Schwarz and BDG to obtain

\[ \mathbb{E}|R_N|^2 \leqslant C\,\Delta t. \]

    Use the discrete Gronwall inequality to conclude the proof of Theorem 36.

  • Theorem

    Let $\{X^{\Delta t}_{t_n}\}_{n=0}^{N}$ denote the Euler discretisation of the SDE (80) with constant interpolation and assume that $b, \sigma \in C^4_b$ and $f \in C^4_P$. Then

\[ \big|\mathbb{E}f(X^{\Delta t}_T) - \mathbb{E}f(X_T)\big| \leqslant C\,\Delta t. \tag{88} \]

  • For the proof of this theorem we will use the backward Kolmogorov equation. Let $u(t,x)$ denote the solution of the backward Kolmogorov equation with $u(T,x) = f(x)$. We have

\[
\begin{aligned}
\mathbb{E}f(X^{\Delta t}_T) - \mathbb{E}f(X_T) &= \mathbb{E}\big(u(T, X^{\Delta t}_T)\big) - u(0, X_0) \\
&= \sum_{i=0}^{n-1}\mathbb{E}\Big[u\big(t_{i+1}, X^{\Delta t}_{t_{i+1}}\big) - u\big(t_i, X^{\Delta t}_{t_i}\big)\Big] \\
&= \sum_{i=0}^{n-1}\mathbb{E}\int_{t_i}^{t_{i+1}}\Big[\partial_t u\big(s, X^{\Delta t}_s\big) + \mathcal{L}\big(X^{\Delta t}_{t_i}\big)u\big(s, X^{\Delta t}_s\big)\Big]\,ds \\
&= \sum_{i=0}^{n-1}\mathbb{E}\int_{t_i}^{t_{i+1}}\Big[\big(\partial_t + \mathcal{L}(X^{\Delta t}_{t_i})\big)\Big(u\big(s, X^{\Delta t}_s\big) - u\big(t_i, X^{\Delta t}_{t_i}\big)\Big)\Big]\,ds.
\end{aligned}
\]

    We then use Itô's formula to estimate the differences; for instance,

\[ \mathbb{E}\big|\partial_t u(s, X^{\Delta t}_s) - \partial_t u(t_i, X^{\Delta t}_{t_i})\big| = O(\Delta t), \]

    uniformly in $s \in [t_i, t_{i+1}]$, so that each of the $n = T/\Delta t$ terms in the sum is $O(\Delta t^2)$, giving (88).

  • When we are interested in weak approximation, we can replace the Brownian increments by different random variables that might be easier to simulate and that have the right statistics.

    We can use, for example, the two-point distributed random variable $\sqrt{\Delta t}\,\xi_n$, where the $\xi_n$ are i.i.d. with

\[ \mathbb{P}\big(\xi_n = \pm 1\big) = \frac{1}{2}. \tag{89} \]

    For higher order approximations we can use the three-point distributed random variable with

\[ \mathbb{P}\big(\xi_n = \pm\sqrt{3}\big) = \frac{1}{6}, \quad \mathbb{P}\big(\xi_n = 0\big) = \frac{2}{3}. \tag{90} \]

    This is useful when considering weak convergence for fully implicit schemes.
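    A sketch of the weak Euler scheme for gBM with the two-point increments (89) in place of Gaussian ones; both choices give weak order 1. Parameter values are illustrative.

% Weak EM for gBM with increments sqrt(dt)*xi, P(xi = +-1) = 1/2
mu = 1; s = 0.5; X0 = 1; T = 1; N = 64; dt = T/N; M = 1e5;
X = X0*ones(M,1);
for j = 1:N
    xi = 2*(rand(M,1) > 0.5) - 1;      % +1 or -1, each with probability 1/2
    X = X + mu*X*dt + s*X.*(sqrt(dt)*xi);
end
[mean(X), X0*exp(mu*T)]                % weak estimate of EX_T vs exact value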

  • Itô's formula can be used to obtain a probabilistic description of solutions to linear partial differential equations of parabolic type.

    Let $X^x_t$ be a diffusion process with drift $b(\cdot)$, diffusion $\Sigma(\cdot) = \sigma\sigma^T(\cdot)$ and generator $\mathcal{L}$, with $X^x_0 = x$, and let $f \in C^2_0(\mathbb{R}^d)$ and $V \in C(\mathbb{R}^d)$, bounded from below. Then the function

\[ u(x,t) = \mathbb{E}\Big(e^{-\int_0^t V(X^x_s)\,ds}\,f(X^x_t)\Big) \tag{91} \]

    is the solution to the initial value problem

\[ \frac{\partial u}{\partial t} = \mathcal{L}u - Vu, \tag{92a} \]
\[ u(0,x) = f(x). \tag{92b} \]

  • To derive this result, we introduce the variable $Y_t = -\int_0^t V(X_s)\,ds$, so that $e^{Y_t} = \exp\big(-\int_0^t V(X_s)\,ds\big)$.

    We rewrite the SDE for $X_t$ as the system

\[ dX^x_t = b(X^x_t)\,dt + \sigma(X^x_t)\,dW_t, \quad X^x_0 = x, \tag{93a} \]
\[ dY^x_t = -V(X^x_t)\,dt, \quad Y^x_0 = 0. \tag{93b} \]

    The process $\{X^x_t, Y^x_t\}$ is a diffusion process with generator

\[ \mathcal{L}_{x,y} = \mathcal{L} - V(x)\frac{\partial}{\partial y}. \]

    We can write

\[ \mathbb{E}\Big(e^{-\int_0^t V(X^x_s)\,ds}\,f(X^x_t)\Big) = \mathbb{E}\big(\phi(X^x_t, Y^x_t)\big), \]

    where $\phi(x,y) = f(x)e^{y}$. We apply Itô's formula to this function to obtain (92).

  • The representation formula (91) for the solution of the initial value problem (92) is called the Feynman-Kac formula.

    We can use the Feynman-Kac formula to develop a numerical scheme for solving parabolic PDEs, based on the numerical solution of the underlying SDE.
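    A sketch of such a scheme: estimate $u(x,t)$ in (91) by EM simulation of the SDE together with a quadrature of $\int_0^t V(X_s)\,ds$. The coefficients $b$, $\sigma$, the potential $V$ and the initial condition $f$ below are illustrative choices, not taken from the notes.

% Monte Carlo evaluation of the Feynman-Kac formula (91) at a point x
b = @(x) -x; sig = @(x) ones(size(x)); % OU-type coefficients (illustrative)
V = @(x) x.^2; f = @(x) cos(x);        % potential and data (illustrative)
x = 0.5; t = 1; N = 200; dt = t/N; M = 1e4;
X = x*ones(M,1); I = zeros(M,1);       % I accumulates int_0^t V(X_s) ds
for j = 1:N
    I = I + V(X)*dt;                   % left-point quadrature of int V
    X = X + b(X)*dt + sig(X).*(sqrt(dt)*randn(M,1));
end
u = mean(exp(-I).*f(X))                % estimate of u(x,t) in (91)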

  • Stability analysis of numerical schemes

    Goal: solve SDEs accurately over long time intervals.

    The constant in the strong/weak order of convergence estimates grows exponentially fast in $T$ (Gronwall's lemma).

    Stability analysis: fix $\Delta t$ and study the limit $T \to +\infty$.

    We test the stability of a scheme on the linear SDE (geometric BM)

\[ dX_t = \lambda X_t\,dt + \mu X_t\,dW_t, \quad X(0) = X_0, \tag{94} \]

    where $\lambda, \mu \in \mathbb{C}$.

  • We will say that an SDE is stable in mean square if for all $X_0 \neq 0$ with probability 1

\[ \lim_{t\to+\infty}\mathbb{E}|X_t|^2 = 0. \tag{95} \]

    We will say that an SDE is asymptotically stable if for all $X_0 \neq 0$ with probability 1

\[ \mathbb{P}\Big(\lim_{t\to+\infty}|X_t| = 0\Big) = 1. \tag{96} \]

    For the gBM (94) we have

\[ \lim_{t\to+\infty}\mathbb{E}|X_t|^2 = 0 \iff \mathrm{Re}(\lambda) + \frac{1}{2}|\mu|^2 < 0, \tag{97} \]

    and

\[ \mathbb{P}\Big(\lim_{t\to+\infty}|X_t| = 0\Big) = 1 \iff \mathrm{Re}\Big(\lambda - \frac{1}{2}\mu^2\Big) < 0. \tag{98} \]

  • Lemma

    The geometric Brownian motion (94) with $\lambda, \mu \in \mathbb{C}$ is stable in mean square provided that

\[ \mathrm{Re}(\lambda) + \frac{1}{2}|\mu|^2 < 0. \tag{99} \]

  • Proof.

    We apply Itô's formula to $|X_t|^2 = (\mathrm{Re}\,X_t)^2 + (\mathrm{Im}\,X_t)^2$ to obtain

\[ d|X_t|^2 = \big(2\,\mathrm{Re}(\lambda) + |\mu|^2\big)|X_t|^2\,dt + dM_t, \]

    where $M_t$ is a martingale. We take the expectation to obtain

\[ \frac{d}{dt}\mathbb{E}|X_t|^2 = \big(2\,\mathrm{Re}(\lambda) + |\mu|^2\big)\mathbb{E}|X_t|^2. \]

    The solution to this equation is

\[ \mathbb{E}|X_t|^2 = \exp\Big(\big(2\,\mathrm{Re}(\lambda) + |\mu|^2\big)t\Big)\,\mathbb{E}|X_0|^2, \]

    from which (99) follows.

  • Mean square stability is more stringent than asymptotic stability.

    Suppose that the parameters $\lambda$ and $\mu$ are chosen so that the gBM is mean square stable. For what values of $\Delta t$ is the EM scheme also mean square stable?

\[ \lim_{j\to+\infty}\mathbb{E}|X_j|^2 = 0 \iff |1 + \Delta t\,\lambda|^2 + \Delta t\,|\mu|^2 < 1. \tag{100} \]

    Similar results can be obtained for the Milstein scheme.

  • Definition

    Let $X^{\Delta t}_t$ denote a time-discrete approximation of the SDE

\[ dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t \tag{101} \]

    with step size $\Delta t$, starting at $X^{\Delta t}_0$ at $t = 0$, and let $\bar{X}^{\Delta t}_t$ denote the approximation starting at $\bar{X}^{\Delta t}_0$. We will say that $X^{\Delta t}_t$ is stochastically numerically stable for the SDE (101) if for any finite interval $[0,T]$ there exists $\Delta t_0 > 0$ such that for every $\epsilon > 0$ and $\Delta t \in (0,\Delta t_0)$ we have

\[ \lim_{|X^{\Delta t}_0 - \bar{X}^{\Delta t}_0|\to 0}\ \sup_{t\in[0,T]}\ \mathbb{P}\Big(\big|X^{\Delta t}_{n_t} - \bar{X}^{\Delta t}_{n_t}\big| \geqslant \epsilon\Big) = 0. \tag{102} \]

    We will say that a numerical scheme is stochastically numerically stable if it satisfies (102) for all SDEs for which it is convergent.

  • Asymptotic stochastic stability: replace (102) with

\[ \lim_{|X^{\Delta t}_0 - \bar{X}^{\Delta t}_0|\to 0}\ \lim_{T\to+\infty}\ \mathbb{P}\Big(\big|X^{\Delta t}_{n_T} - \bar{X}^{\Delta t}_{n_T}\big| \geqslant \epsilon\Big) = 0. \tag{103} \]

    Consider the linear SDE

\[ dX_t = \lambda X_t\,dt + dW_t. \]

    It is mean square stable for $\mathrm{Re}(\lambda) < 0$.

    Consider numerical schemes that can be written in the form

\[ X^{\Delta t}_{n+1} = X^{\Delta t}_n\,G(\lambda\Delta t) + Z^{\Delta t}_n. \]

    The region of absolute stability of the scheme is the set of $\lambda\Delta t$, with $\mathrm{Re}(\lambda) < 0$, for which

\[ |G(\lambda\Delta t)| < 1. \]

    If the region of absolute stability is the left half complex plane, the scheme is A-stable.

  • Just as with ODEs, implicit schemes have better stability properties than explicit schemes.

    The implementation of an implicit scheme requires the solution of an additional algebraic equation at each time step, which can usually be done using the Newton-Raphson algorithm.

    Treating the noise implicitly involves reciprocals of Gaussian random variables, which do not have finite absolute moments.

    We will therefore treat the drift implicitly and the noise explicitly.

    The (drift-)implicit Euler scheme is

\[ X_{n+1} = X_n + b(X_{n+1})\,\Delta t + \sigma(X_n)\,\Delta W_n. \tag{104} \]

    If we are only interested in weak convergence, we can replace the Gaussian increments $\sqrt{\Delta t}\,\mathcal{N}(0,1)$ by random variables that are accurate in the weak sense and whose reciprocals are easier to handle.
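    A sketch of one step of (104) for a scalar SDE, solving the implicit equation with fzero (a Newton-type alternative); the drift, diffusion and numerical values are illustrative.

% One step of the drift-implicit Euler scheme (104), solved with fzero
b = @(x) x - x.^3; sig = 0.5;          % illustrative coefficients
dt = 0.01; Xn = 1.2; dW = sqrt(dt)*randn;
F = @(y) y - Xn - b(y)*dt - sig*dW;    % implicit equation F(X_{n+1}) = 0
Xnp1 = fzero(F, Xn);                   % previous value as initial guess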

  • Family of implicit Euler schemes (the stochastic theta method, STM):

\[ X_{n+1} = X_n + \big(\theta\,b(X_{n+1}) + (1-\theta)\,b(X_n)\big)\,\Delta t + \sigma(X_n)\,\Delta W_n, \tag{105} \]

    for $\theta \in [0,1]$.

    Family of implicit Milstein schemes:

\[ X_{n+1} = X_n + \big(\theta\,b(X_{n+1}) + (1-\theta)\,b(X_n)\big)\,\Delta t + \sigma(X_n)\,\Delta W_n + \frac{1}{2}(\sigma\sigma')(X_n)\big((\Delta W_n)^2 - \Delta t\big). \tag{106} \]

    For both schemes we have

\[ G(\lambda\Delta t) = \frac{1 + (1-\theta)\lambda\Delta t}{1 - \theta\lambda\Delta t}. \]

  • Test problem: geometric Brownian motion (the linear SDE with multiplicative noise), eqn. (94).

    For what stepsizes $\Delta t$ does the stochastic theta method share the stability properties of the test problem?

    We use the concept of mean square stability.

    For the SDE, mean square stability depends on the parameters $\lambda$, $\mu$.

    For the numerical scheme it depends on the model parameters $\lambda$, $\mu$ and on the numerical scheme parameters $\Delta t$, $\theta$.

    Our goal is to find for what values of $\Delta t$, $\theta$, $\lambda$, $\mu$ the stochastic theta method is also mean square stable.

  • The STM for the geometric Brownian motion becomes

\[ X_{n+1} = X_n + (1-\theta)h\lambda X_n + \theta h\lambda X_{n+1} + \sqrt{h}\,\mu\,\xi_n X_n, \tag{108} \]

    with $h := \Delta t$. We can rewrite this equation as

\[ X_{n+1} = \big(a + b\,\xi_n\big)X_n, \tag{109} \]

    with

\[ a = \frac{1 + (1-\theta)h\lambda}{1 - \theta h\lambda}, \quad b = \frac{\sqrt{h}\,\mu}{1 - \theta h\lambda}. \tag{110} \]

    This is a homogeneous Markov chain with an uncountable state space.

    To study the stability properties of the STM we need to study the long time behavior of the Markov chain (109).

  • Definition

    The numerical scheme (Markov chain) (109) is mean square stable iff

\[ \lim_{n\to+\infty}\mathbb{E}|X_n|^2 = 0. \tag{111} \]

    Theorem

    The Markov chain (109) is mean square stable if and only if

\[ |a|^2 + |b|^2 < 1. \tag{112} \]

    In particular, for the STM we have

\[ \frac{|1 + (1-\theta)h\lambda|^2 + h|\mu|^2}{|1 - \theta h\lambda|^2} < 1. \tag{113} \]
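    A sketch that checks condition (113) against a direct simulation of the chain (109); the values of $\lambda$, $\mu$, $\theta$, $h$ are illustrative and satisfy the SDE stability condition (99).

% Mean square stability of the STM: predicted test (112) vs simulation
lambda = -3; mu = sqrt(3); theta = 0.25; h = 0.5;
a = (1 + (1-theta)*h*lambda)/(1 - theta*h*lambda);
b = sqrt(h)*mu/(1 - theta*h*lambda);
stable = abs(a)^2 + abs(b)^2 < 1       % predicted by (112)-(113)
M = 1e4; X = ones(M,1);
for n = 1:200
    X = (a + b*randn(M,1)).*X;         % iterate the Markov chain (109)
end
mean(abs(X).^2)                        % tends to 0 iff (112) holds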

  • Define the stability regions for the gBM and for the STM:

\[ S_{SDE} := \Big\{\lambda, \mu \in \mathbb{C} : \mathrm{Re}(\lambda) + \frac{1}{2}|\mu|^2 < 0\Big\} \tag{114} \]

    and

\[ S_{STM}(\theta,h) := \Big\{\lambda, \mu \in \mathbb{C} : \frac{|1 + (1-\theta)h\lambda|^2 + h|\mu|^2}{|1 - \theta h\lambda|^2} < 1\Big\}. \tag{115} \]

    Theorem

    For all $h > 0$ we have:

    $S_{STM}(\theta,h) \subset S_{SDE}$ for $\theta \in [0, \tfrac12)$;
    $S_{STM}(\theta,h) = S_{SDE}$ for $\theta = \tfrac12$;
    $S_{STM}(\theta,h) \supset S_{SDE}$ for $\theta \in (\tfrac12, 1]$.

  • Definition

    The numerical scheme (109) is (mean square) A-stable provided that, whenever the test problem (94) is mean square stable, (109) is also mean square stable, for all $\Delta t > 0$.

    Corollary

    The STM is A-stable for all $\theta \in [\tfrac12, 1]$.

    If the SDE is unstable, then so is the STM for all $h$.

    If the SDE is stable, then so is the STM for sufficiently small $h$. In this case, the resulting stepsize restriction can be arbitrarily severe.

  • Numerical methods for ergodic SDEs

    We consider SDEs in $\mathbb{R}^d$

\[ dX_t = b(X_t)\,dt + \sigma(X_t)\,dW_t. \tag{116} \]

    We assume that $X_t$ is ergodic with respect to the invariant measure $\pi(dx) = \rho(x)\,dx$.

    Assuming sufficient regularity, the invariant density $\rho$ is the unique normalizable solution of the stationary Fokker-Planck equation $\mathcal{L}^*\rho = 0$.