
Estimation of the joint density

of a Volatility Process

Riste Gjorgjiev

November 12, 2006

Master Thesis

Stochastics and Financial Mathematics
Korteweg-de Vries Institute for Mathematics

Faculty of Science
Universiteit van Amsterdam

supervisors

A.J. van Es
P.J.C. Spreij

Keywords: deconvolution, kernel density estimator, bias, variance


Abstract

We consider stochastic models for financial processes. Using deconvolution theory, Fourier transformation and kernel estimation, we construct an estimator for the density of a process known as the volatility process. We derive expansions for the bias and variance of this multivariate density estimator.


Contents

Abstract

1 Introduction

2 Main result

3 Proof of the theorem
  3.1 Proof of the bound for the bias of the estimator
  3.2 Proof of the bound for the variance of the estimator

4 Technical lemmas

References


Chapter 1

Introduction

We consider a continuous stochastic process S, that is, the log-price process of some stock. The process is assumed to satisfy a stochastic differential equation of the form

\[
dS_t = b_t\,dt + \sigma_t\,dW_t, \qquad (1.1)
\]

where W is a Brownian motion and b and σ are stochastic processes that satisfy certain conditions for the existence of a solution of the differential equation. The process σ is called the volatility process, and the construction of an estimator for the multivariate density of $(\sigma_{t_1},\dots,\sigma_{t_p})$, where $t_1 = 0$, $t_2 = \Delta$, ..., $t_p = (p-1)\Delta$, is the point of our main interest. Certain mixing conditions for the process σ will be needed. We assume that σ is independent of W and its multivariate marginal distribution is required to have a density with respect to the Lebesgue measure.

First we present the estimator of the univariate density of $\log\sigma_t^2$, i.e. at one point x, as proposed in Van Es, Spreij and Van Zanten (2003).

Although our process S is continuous, we observe it at discrete time instants $k\Delta = k\Delta_n$, for $k = 1, 2, \dots, n$, where $\Delta_n \to 0$ and $n\Delta_n \to \infty$ as $n \to \infty$. First we consider the stochastic equation without the drift term b and work with the processes $X_i^\Delta$, where

\[
X_i^\Delta = \frac{1}{\sqrt{\Delta}}\bigl(S_{i\Delta} - S_{(i-1)\Delta}\bigr), \qquad i = 1,\dots,n.
\]

Let $\bar X_i^\Delta$ stand for the product $\sigma_{(i-1)\Delta} Z_i^\Delta$, where

\[
Z_i^\Delta = \frac{1}{\sqrt{\Delta}}\bigl(W_{i\Delta} - W_{(i-1)\Delta}\bigr), \qquad i = 1,\dots,n,
\]

is an i.i.d. sequence of standard normal random variables. For $\Delta$ small enough, $\bar X_i^\Delta$ is a rough approximation of $X_i^\Delta$, because

\[
X_i^\Delta = \frac{1}{\sqrt{\Delta}}\int_{(i-1)\Delta}^{i\Delta}\sigma_t\,dW_t
\approx \sigma_{(i-1)\Delta}\,\frac{1}{\sqrt{\Delta}}\bigl(W_{i\Delta} - W_{(i-1)\Delta}\bigr) = \sigma_{(i-1)\Delta} Z_i^\Delta.
\]


After squaring and taking the logarithm of the last approximation, $X_i^\Delta \approx \sigma_{(i-1)\Delta} Z_i^\Delta$, we have

\[
\log\bigl((X_i^\Delta)^2\bigr) \approx \log\bigl(\sigma^2_{(i-1)\Delta}\bigr) + \log\bigl((Z_i^\Delta)^2\bigr).
\]

In the next step, we use deconvolution theory for random variables X, Y and Z that satisfy $X = Y + Z$, with Y and Z independent. Our goal is to use observations from X to estimate the density of Y, say f, which is unknown. We assume that the density of Z, say k, is known. The density of X is denoted by g. By $\phi_g$, $\phi_f$ and $\phi_k$ we denote the characteristic functions of g, f and k. In this case, g is the convolution of f and k, so for the corresponding characteristic functions we have that

\[
\phi_g(t) = \phi_f(t)\,\phi_k(t).
\]
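In formula form, and assuming $\phi_k$ has no zeros so that the division is allowed (a standard deconvolution step, spelled out here for orientation rather than quoted from the text), Fourier inversion gives

\[
f(y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ity}\,\frac{\phi_g(t)}{\phi_k(t)}\,dt,
\]

and replacing $\phi_g$ by an estimate leads to the kernel-type estimator constructed below.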

Using a kernel estimator with bandwidth h and a kernel function w, we can estimate g and also its characteristic function $\phi_g$. Knowing the characteristic function of k and having the estimator $\phi_{g_{nh}}$ for $\phi_g$, where $g_{nh}$ is a kernel estimator of g, i.e.

\[
g_{nh}(x) = \frac{1}{nh}\sum_{i=1}^{n} w\Bigl(\frac{x - X_i}{h}\Bigr),
\]

we can estimate $\phi_f(t)$ by $\phi_{g_{nh}}(t)/\phi_k(t)$. By $\phi_{emp}$ and $\phi_w$ we denote the empirical characteristic function, i.e. $\phi_{emp}(t) = \frac{1}{n}\sum_j e^{itX_j}$, and the Fourier transform of w. Note that $\phi_{g_{nh}}(t) = \phi_{emp}(t)\,\phi_w(ht)$. Now we define the function $v_h$ by

\[
v_h(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-isx}\,\frac{\phi_w(s)}{\phi_k(s/h)}\,ds
\]

and the estimator $f_{nh}(x)$ of the density f of Y at the point x by

\[
f_{nh}(x) = \frac{1}{nh}\sum_{i=1}^{n} v_h\Bigl(\frac{x - X_i}{h}\Bigr). \qquad (1.2)
\]

In our case the role of the random variables X, Y and Z is taken by $\log((X_i^\Delta)^2)$, $\log(\sigma^2_{(i-1)\Delta})$ and $\log((Z_i^\Delta)^2)$. From the observed data $\log((X_i^\Delta)^2)$, $i = 1,\dots,n$, we will estimate the univariate density $f(x)$ of $\log(\sigma^2_{(i-1)\Delta})$.

To ensure integrability of the function w and its characteristic function $\phi_w$ we require the following conditions.

Condition 1:
(i) $\int_{-\infty}^{\infty}|w(u)|\,du < \infty$, $\int_{-\infty}^{\infty}w(u)\,du = 1$, $\int_{-\infty}^{\infty}u^2|w(u)|\,du < \infty$;
(ii) the characteristic function $\phi_w$ has support on $[-1, 1]$;
(iii) $\phi_w(1-t) = At^{\alpha} + o(t^{\alpha})$ as $t \downarrow 0$, for some $\alpha > 0$;
(iv) $\sup_x |w(x)| < \infty$, $\lim_{|x|\to\infty} w(x) = 0$.
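One concrete kernel satisfying Condition 1 (an illustrative choice, not one singled out in the text) is the kernel whose Fourier transform is

\[
\phi_w(s) = (1 - s^2)^3\,\mathbf{1}_{[-1,1]}(s),
\]

for which $\phi_w(0) = 1$, the support condition (ii) holds, and $\phi_w(1-t) = (2t - t^2)^3 = 8t^3 + o(t^3)$ as $t \downarrow 0$, so (iii) holds with $A = 8$ and $\alpha = 3$; the corresponding w is bounded and decays like $|x|^{-4}$, which is fast enough for (i) and (iv).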

Page 6: Estimation of the joint density of a Volatility Process of the joint density of a Volatility Process Riste Gjorgjiev November 12, 2006 Master Thesis Stochastics and Financial Mathematics

CHAPTER 1. INTRODUCTION 3

Then, using the same kernel function $v_h$, the estimator is defined by

\[
f_{nh}(x) = \frac{1}{nh}\sum_{j=1}^{n} v_h\Bigl(\frac{x - \log((X_j^\Delta)^2)}{h}\Bigr).
\]

Note that $f_{nh}(x)$ is an estimator of the density $f(x)$ of $\log\sigma_t^2$. An estimator of the density of $\sigma_t^2$ can be obtained by a simple transformation (if f is the density of $\log\sigma_t^2$, then the density of $\sigma_t^2$ at a point $v > 0$ is $f(\log v)/v$).
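As an illustration, here is a minimal numerical sketch of the univariate estimator. It uses the known characteristic function of $\log Z^2$ for standard normal Z, $\phi_k(t) = 2^{it}\Gamma(1/2 + it)/\sqrt{\pi}$, and the illustrative kernel with $\phi_w(s) = (1-s^2)^3$ on $[-1,1]$ mentioned after Condition 1; the function names, grid sizes and the use of a numerical integration rule are choices made here, not part of the thesis.

```python
import numpy as np
from scipy.special import gamma
from scipy.integrate import simpson

def phi_k(t):
    """Characteristic function of log(Z^2), Z standard normal: 2^{it} Gamma(1/2 + it) / sqrt(pi)."""
    t = np.asarray(t, dtype=complex)
    return 2.0 ** (1j * t) * gamma(0.5 + 1j * t) / np.sqrt(np.pi)

def phi_w(s):
    """Fourier transform of the illustrative kernel: (1 - s^2)^3 on [-1, 1], zero outside."""
    return np.where(np.abs(s) <= 1.0, (1.0 - s ** 2) ** 3, 0.0)

def v_h(x, h, n_grid=4001):
    """Deconvolution kernel v_h(x) = (1/2pi) * int e^{-isx} phi_w(s) / phi_k(s/h) ds."""
    s = np.linspace(-1.0, 1.0, n_grid)          # phi_w vanishes outside [-1, 1]
    integrand = np.exp(-1j * np.outer(np.atleast_1d(x), s)) * phi_w(s) / phi_k(s / h)
    return np.real(simpson(integrand, x=s, axis=1)) / (2.0 * np.pi)

def f_nh(x, data, h):
    """Estimator (1.2) at points x, with data = log((X_i^Delta)^2), i = 1, ..., n."""
    x = np.atleast_1d(x).astype(float)
    return np.array([np.mean(v_h((xm - data) / h, h)) / h for xm in x])
```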

Remark: The estimator is based on the equality $X = Y + Z$. In our problem, instead of equality, we have an approximation. That will give us extra terms in the bounds for the bias and variance of the estimator, compared to the usual deconvolution estimator (1.2).

Now we return to our original problem, the estimation of the multivariate density of the p-dimensional vector $(\log\sigma^2_0, \dots, \log\sigma^2_{(p-1)\Delta})$.

For a p-dimensional vector $x = (x_1,\dots,x_p)$ we introduce the functions k and w as products of the functions k and w defined above: $k(x) = \prod_{i=1}^{p}k(x_i)$ and $w(x) = \prod_{i=1}^{p}w(x_i)$. As a consequence, the characteristic functions also factorize. We introduce a new kernel function $V_h$ as

\[
V_h(x) = \frac{1}{(2\pi)^p}\int_{\mathbb{R}^p}\frac{\phi_w(s)}{\phi_k(s/h)}\,e^{-is\cdot x}\,ds
\]

and define a new estimator for the p-dimensional density:

\[
f_{nh}(x) = \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} V_h\Bigl(\frac{x - \log((X_j^\Delta)^2)}{h}\Bigr),
\]

where $X_j^\Delta = (X^\Delta_{j-p+1},\dots,X^\Delta_j)$ and $\log(X_j^\Delta)^2 = (\log(X^\Delta_{j-p+1})^2,\dots,\log(X^\Delta_j)^2)$, for $j = p,\dots,n$.
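A sketch of the corresponding p-dimensional estimator, reusing the v_h routine from the univariate sketch above; the blockwise construction of the vectors $\log(X_j^\Delta)^2$ and the function names are choices made here for illustration.

```python
import numpy as np

def V_h(x_vec, h):
    """Product kernel V_h(x) = prod_j v_h(x_j); x_vec has shape (m, p)."""
    x_vec = np.atleast_2d(x_vec)
    vals = np.ones(x_vec.shape[0])
    for j in range(x_vec.shape[1]):
        vals *= v_h(x_vec[:, j], h)
    return vals

def f_nh_p(x, log_x2, h):
    """Estimator of the p-dimensional density of (log sigma^2_0, ..., log sigma^2_{(p-1)Delta}).
    x is a point of length p; log_x2 is the sample log((X_i^Delta)^2), i = 1, ..., n."""
    x = np.asarray(x, dtype=float)
    p, n = len(x), len(log_x2)
    # Overlapping blocks log(X_j)^2 = (log(X^Delta_{j-p+1})^2, ..., log(X^Delta_j)^2), j = p, ..., n
    blocks = np.stack([np.asarray(log_x2)[j - p:j] for j in range(p, n + 1)])
    return np.mean(V_h((x - blocks) / h, h)) / h ** p
```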


Chapter 2

Main result

In this section we give bounds for the bias and the variance of the estimator $f_{nh}$. Certain mixing conditions for the process X are used to give bounds on the variance.

Let $\mathcal{F}_a^b$ be the σ-algebra generated by all $X_t$ with $t \in [a, b]$.

We say that the process X is strongly mixing with coefficients $\alpha(t)$, defined as

\[
\alpha(t) = \sup_{A\in\mathcal{F}^0_{-\infty},\,B\in\mathcal{F}^\infty_{t}}|P(A\cap B) - P(A)P(B)|,
\]

if $\alpha(t)\to 0$ as $t\to\infty$.

As in Van Es et al. (2003), we require the following condition.

Condition 2: For $t \to 0$, we have $E|\sigma^2_t - \sigma^2_0| = O(\sqrt{t})$.

Theorem 1 Assume that $Eb^2_t$ is bounded. Suppose that $\log\sigma^2_t$ has a twice continuously differentiable density f with bounded second derivative, and that $\sigma^2_t$ has a bounded density in a neighborhood of zero. Let the process σ be strongly mixing with coefficients $\alpha(t)$ such that

\[
\int_0^\infty \alpha(t)^q\,dt < \infty,
\]

for some $q\in(0,1)$. Let the kernel function w satisfy Condition 1 of the previous section. For $0 < \delta < 1$ choose the step size $\Delta = n^{-\delta}$ and the bandwidth $h = p\gamma\pi/\log n$ with $\gamma > 4/\delta$. Then for the estimator $f_{nh}$ of the multivariate density $f_\Delta$ we have the following bounds for the bias and the variance:

\[
Ef_{nh}(x) - f_\Delta(x) = \frac{1}{2}h^2\int y^T\nabla^2 f_\Delta(x)\,y\;w(y)\,dy + o(h^2),
\]
\[
\operatorname{var} f_{nh}(x) = O\Bigl(\frac{1}{n\,h^{p(1+q)}\,\Delta}\Bigr) + O\Bigl(\frac{1}{n}\,h^{2p\alpha}\,e^{p\pi/h}\Bigr).
\]
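A quick check of the orders under the prescribed choices, spelled out here for orientation (this computation is not displayed in the text): with $h = p\gamma\pi/\log n$ and $\Delta = n^{-\delta}$,

\[
e^{p\pi/h} = e^{(\log n)/\gamma} = n^{1/\gamma},
\qquad
\frac{1}{n}\,h^{2p\alpha}\,e^{p\pi/h} = n^{\frac{1}{\gamma}-1}\Bigl(\frac{p\gamma\pi}{\log n}\Bigr)^{2p\alpha},
\qquad
\frac{1}{n\,h^{p(1+q)}\,\Delta} = n^{\delta-1}\Bigl(\frac{\log n}{p\gamma\pi}\Bigr)^{p(1+q)},
\]

so both variance terms tend to zero as $n\to\infty$ (the second because $\gamma > 4/\delta > 4$, the first because $\delta < 1$).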


The proof of the theorem is given in the next section. It will be divided into four lemmas. First we derive a bound for the estimator based on the approximated values, and then we study the difference between the estimator based on the true values and the one based on the approximated values.

Remark: The bias of the estimator is of order $h^2$, as for the ordinary kernel estimator. The bound for the variance is different.

Remark: We do not have a concrete value for the first-order term of the expansion of the variance; we only have a bound that depends on h, the number of observations n and the time step ∆.

Remark: Experiments with real-life data are also important, and they allow us to test our choice of the bandwidth. By running simulations from a model with a known volatility density we can compare the estimated density with the true one; this serves as a rough indicator of how good the estimator is.
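A minimal simulation sketch along these lines, assuming a toy exponential Ornstein–Uhlenbeck volatility model (the model, parameter values and function names are illustrative assumptions, not taken from the thesis); it reuses f_nh from the univariate sketch in the introduction.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, mu, nu = 1.0, 0.0, 0.5            # OU parameters of the toy model

def simulate_returns(n, delta):
    """Simulate X_i^Delta ~ sigma_{(i-1)Delta} Z_i^Delta with sigma_t = exp(U_t / 2),
    U an Ornstein-Uhlenbeck process, so log sigma_t^2 = U_t is N(mu, nu^2 / (2 theta))."""
    u = rng.normal(mu, nu / np.sqrt(2.0 * theta))        # stationary initial value
    x = np.empty(n)
    for i in range(n):
        x[i] = np.exp(u / 2.0) * rng.normal()            # normalized return over one interval
        u += theta * (mu - u) * delta + nu * np.sqrt(delta) * rng.normal()  # Euler step
    return x

n, delta = 5_000, 0.01
h = np.pi / np.log(n)                                    # bandwidth of the form gamma*pi/log n (gamma = 1 here)
data = np.log(simulate_returns(n, delta) ** 2)
grid = np.linspace(-2.5, 2.5, 51)
estimate = f_nh(grid, data, h)                           # deconvolution estimate of the log sigma^2 density
s2 = nu ** 2 / (2.0 * theta)                             # true stationary variance of log sigma_t^2
true_density = np.exp(-(grid - mu) ** 2 / (2.0 * s2)) / np.sqrt(2.0 * np.pi * s2)
```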


Chapter 3

Proof of the theorem

3.1 Proof of the bound for the bias of the estimator

Let $\bar X_j = (\bar X^\Delta_{j-p+1},\dots,\bar X^\Delta_j)$, where as before $\bar X^\Delta_j = \sigma_{(j-1)\Delta} Z^\Delta_j$, denote the vector of the approximated values. Then for the estimator $\bar f_{nh}$ based on these values we have

\[
\bar f_{nh}(x) = \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} V_h\Bigl(\frac{x - \log(\bar X_j)^2}{h}\Bigr),
\]

where $\log(\bar X_j)^2$ denotes the vector $(\log(\bar X^\Delta_{j-p+1})^2,\dots,\log(\bar X^\Delta_j)^2)$. We have the following lemma.

Lemma 1

\[
E\bar f_{nh}(x) = \frac{1}{h^p}\int w\Bigl(\frac{x-u}{h}\Bigr) f_\Delta(u)\,du.
\]

Proof: With $\mathcal{F}_\sigma$ the σ-algebra generated by the process σ, it holds that

\[
E\bigl[\bar f_{nh}(x)\,\big|\,\mathcal{F}_\sigma\bigr]
= \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} E\Bigl[V_h\Bigl(\frac{x - \log(\bar X_j)^2}{h}\Bigr)\Big|\mathcal{F}_\sigma\Bigr]
\]
\[
= \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} E\Bigl[\frac{1}{(2\pi)^p}\int\frac{\phi_w(s)}{\phi_k(s/h)}\,
e^{-is\cdot\bigl(x - (\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta}) - (\log(Z^\Delta_{j-p+1})^2,\dots,\log(Z^\Delta_j)^2)\bigr)/h}\,ds\,\Big|\,\mathcal{F}_\sigma\Bigr]
\]
\[
= \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n}\frac{1}{(2\pi)^p}\int\frac{\phi_w(s)}{\phi_k(s/h)}\,
e^{-is\cdot x/h}\, e^{is\cdot(\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})/h}\,
E\bigl[e^{is\cdot(\log(Z^\Delta_{j-p+1})^2,\dots,\log(Z^\Delta_j)^2)/h}\,\big|\,\mathcal{F}_\sigma\bigr]\,ds
\]
\[
= \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n}\frac{1}{(2\pi)^p}\int\frac{\phi_w(s)}{\phi_k(s/h)}\,
e^{-is\cdot x/h}\, e^{is\cdot(\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})/h}\,\phi_k(s/h)\,ds
\]
\[
= \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} w\Bigl(\frac{x - (\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})}{h}\Bigr).
\]

Taking expectations on both sides, and knowing the density $f_\Delta$ of the vector $(\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})$, we have that

\[
E\bar f_{nh}(x) = E\,\frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} w\Bigl(\frac{x - (\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})}{h}\Bigr),
\]

and by stationarity of σ the last expectation equals

\[
E\,\frac{1}{h^p}\, w\Bigl(\frac{x - (\log\sigma^2_{0},\dots,\log\sigma^2_{(p-1)\Delta})}{h}\Bigr)
= \frac{1}{h^p}\int w\Bigl(\frac{x-u}{h}\Bigr) f_\Delta(u)\,du.
\]

For the second lemma we use the functions

\[
\gamma_0(h) = \frac{1}{2\pi}\int_{-1}^{1}\Bigl|\frac{\phi_w(s)}{\phi_k(s/h)}\Bigr|\,ds
\quad\text{and}\quad
\gamma_1(h, x) = e^{\pi/2h} + \frac{1}{h}\exp\Bigl(\frac{\pi}{2}\,\frac{1+\pi/|x|}{h}\Bigr)\log\frac{1+\pi/|x|}{h},
\]

as in [1], p. 456. Define their p-dimensional counterparts $\gamma_0(h)$ and $\gamma_1(h, x)$ as

\[
\gamma_0(h) = \frac{1}{(2\pi)^p}\int_{[-1,1]^p}\Bigl|\frac{\phi_w(s)}{\phi_k(s/h)}\Bigr|\,ds
\]

and

\[
\gamma_1(h, x) = \prod_{j=1}^{p}\gamma_1(h, x_j), \qquad x = (x_1,\dots,x_p).
\]

Lemma 2 For the difference between the expectations of the estimators based on $\bar X_j$ and $X^\Delta_j$, we have

\[
|Ef_{nh}(x) - E\bar f_{nh}(x)|
= O\Bigl(\frac{1}{h^{p+1}}\,\gamma_0(h)\,\frac{\Delta^{1/4}}{\varepsilon}
+ \frac{1}{h^{p}}\,\gamma_0(h)\,\frac{\sqrt{\Delta}}{\varepsilon^2}
+ \gamma_1\Bigl(h,\Bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\Bigr)\Bigr)\,\frac{\varepsilon}{(|\log 2\varepsilon|/h)^p}\Bigr).
\]


Proof: Before starting the proof, note that by Condition 2 in Chapter 2, $E|\sigma^2_t - \sigma^2_0| = O(\sqrt{t})$, and because $(x-y)^2 \le |x^2 - y^2|$ for $x, y \ge 0$, we have $E(\sigma_t - \sigma_0)^2 = O(\sqrt{t})$. Since $E(X^\Delta_1 - \sigma_0 Z^\Delta_1)^2 = \frac{1}{\Delta}E\int_0^\Delta(\sigma_t - \sigma_0)^2\,dt$, we have that

\[
E(X^\Delta_1 - \sigma_0 Z^\Delta_1)^2 \le C\sqrt{\Delta}, \qquad (3.1)
\]

and moreover $E\bigl|\frac{1}{\Delta}\int_0^\Delta\sigma^2_t\,dt - \sigma^2_0\bigr| = O(\sqrt{\Delta})$.
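Spelling out how (3.1) follows from Condition 2 (a step not displayed in the text, assuming the $O(\sqrt{t})$ bound holds uniformly on $[0,\Delta]$):

\[
E(X^\Delta_1 - \sigma_0 Z^\Delta_1)^2
= \frac{1}{\Delta}\int_0^\Delta E(\sigma_t - \sigma_0)^2\,dt
= \frac{1}{\Delta}\int_0^\Delta O(\sqrt{t}\,)\,dt
= \frac{1}{\Delta}\,O\bigl(\Delta^{3/2}\bigr) = O(\sqrt{\Delta}\,).
\]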

Let $W_j$ denote the difference

\[
W_j = V_h\Bigl(\frac{x - \log((X^\Delta_j)^2)}{h}\Bigr) - V_h\Bigl(\frac{x - \log(\bar X_j)^2}{h}\Bigr).
\]

Then, because $f_{nh}(x) - \bar f_{nh}(x) = \frac{1}{(n-p+1)h^p}\sum_{j=p}^n W_j$, we have that

\[
|Ef_{nh}(x) - E\bar f_{nh}(x)|
= \Bigl|E\,\frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} W_j\Bigr|
= \Bigl|E\,\frac{1}{h^p}\,W_j\Bigr|
\le \frac{1}{h^p}\,E|W_j|.
\]

Now we split the space Ω into three sets and on each of them we give bounds for $\frac{1}{h^p}E|W_j|$:

\[
\frac{1}{h^p}E|W_j|
= \frac{1}{h^p}E|W_j|\,I_{[\|X^\Delta_1\|\ge\varepsilon\ \text{and}\ \|\bar X_1\|\ge\varepsilon]}
+ \frac{1}{h^p}E|W_j|\,I_{[\|X^\Delta_1\|\le\varepsilon\ \text{or}\ \|\bar X_1\|\le\varepsilon]}\,I_{[\|X^\Delta_1-\bar X_1\|\ge\varepsilon]}
+ \frac{1}{h^p}E|W_j|\,I_{[\|X^\Delta_1\|\le\varepsilon\ \text{or}\ \|\bar X_1\|\le\varepsilon]}\,I_{[\|X^\Delta_1-\bar X_1\|<\varepsilon]}
= (1)+(2)+(3).
\]

By Lemma 4.3 we have that

\[
\Bigl|V_h\Bigl(\frac{x - \log((X^\Delta_1)^2)}{h}\Bigr) - V_h\Bigl(\frac{x - \log(\bar X_1)^2}{h}\Bigr)\Bigr|
\le \frac{2\gamma_0(h)}{h}\,\bigl\|\log|X^\Delta_1| - \log|\bar X_1|\bigr\|,
\]

and using (3.1) we have the following bound for (1):

\[
(1) \le \frac{2}{h^{p+1}}\,\gamma_0(h)\,E\bigl\|\log|X^\Delta_1| - \log|\bar X_1|\bigr\|\,I_{[\|X^\Delta_1\|\ge\varepsilon\ \text{and}\ \|\bar X_1\|\ge\varepsilon]}
\le \frac{2}{h^{p+1}}\,\frac{1}{\varepsilon}\,\gamma_0(h)\,E\|X^\Delta_1 - \bar X_1\|
\le \frac{2}{h^{p+1}\varepsilon}\,\gamma_0(h)\,\sqrt{pC}\,\Delta^{1/4}.
\]

The last inequality follows from

\[
E\|X^\Delta_1 - \bar X_1\|^2
= E\sum_{j=1}^{p}\bigl(X^\Delta_j - \bar X^\Delta_j\bigr)^2
= \sum_{j=1}^{p}E\bigl(X^\Delta_j - \sigma_{(j-1)\Delta}Z^\Delta_j\bigr)^2
\le \sum_{j=1}^{p}C\sqrt{\Delta} = pC\sqrt{\Delta}.
\]

For similar reasons we obtain a bound for (2):

\[
(2) \le \frac{2}{h^{p}}\,\gamma_0(h)\,P\bigl(\|X^\Delta_1 - \bar X_1\|\ge\varepsilon\bigr)
\le \frac{2}{h^{p}}\,\gamma_0(h)\,\frac{pC\sqrt{\Delta}}{\varepsilon^2}.
\]

For part (3) we use the factorization of $V_h$,

\[
V_h(x) = \prod_{i=1}^{p}v_h(x_i),
\]

where $x = (x_1,\dots,x_p)$. Since the arguments of $v_h$ are larger than $\frac{|\log 2\varepsilon|}{h}$, we may use $|v_h(x)| \le \frac{D}{|x|}\gamma_1(h, x)$ as $|x|\to\infty$, a property from [1], p. 459. Using Lemma 4.4 and the definition of the function $\gamma_1$, we bound (3) by

\[
(3) \le \frac{1}{h^p}\Bigl(\gamma_1\Bigl(h,\frac{|\log 2\varepsilon|}{h}\Bigr)\,\frac{1}{|\log 2\varepsilon|/h}\Bigr)^{p}\,P\bigl(\|\bar X_1\|\le 2\varepsilon\bigr)
\le \frac{1}{h^p}\,\gamma_1\Bigl(h,\Bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\Bigr)\Bigr)\Bigl(\frac{1}{|\log 2\varepsilon|/h}\Bigr)^{p}\,P\bigl(\|\bar X_1\|\le 2\varepsilon\bigr)
\]
\[
\le C\,\frac{1}{h^p}\,\gamma_1\Bigl(h,\Bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\Bigr)\Bigr)\,\frac{p\,\varepsilon}{(|\log 2\varepsilon|/h)^{p}}.
\]

The previous two lemmas will serve us to prove the result for the bias of our estimator,

\[
Ef_{nh}(x) = f_\Delta(x) + \frac{1}{2}h^2\int u^T\nabla^2 f_\Delta(x)\,u\;w(u)\,du + o(h^2). \qquad (3.2)
\]

From Lemma 1 we have that $E\bar f_{nh}(x) = \frac{1}{h^p}\int w\bigl(\frac{x-u}{h}\bigr)f_\Delta(u)\,du$. Using the change of variable $\frac{x-u}{h} = y$, the integral can be written as

\[
E\bar f_{nh}(x) = \int w(y)\,f_\Delta(x - hy)\,dy. \qquad (3.3)
\]


Now, using Taylor's formula, we expand $f_\Delta$ at the point x. The result is

\[
f_\Delta(x - hy) = f_\Delta(x) + \frac{1}{1!}\nabla f_\Delta(x)\cdot(-hy) + \frac{1}{2!}(hy)^T\nabla^2 f_\Delta(x)(hy) + \dots
\]

Substituting this in (3.3) and using the fact that $\int w(u)\,u\,du = 0$, we have

\[
E\bar f_{nh}(x) = \int w(y)f_\Delta(x)\,dy + \frac{1}{2}\int w(y)\,(hy)^T\nabla^2 f_\Delta(x)(hy)\,dy + \mathrm{Rem},
\]

or

\[
E\bar f_{nh}(x) = \int w(y)f_\Delta(x)\,dy + \frac{1}{2}h^2\int w(y)\,y^T\nabla^2 f_\Delta(x)\,y\,dy + o(h^2)
= f_\Delta(x) + \frac{1}{2}h^2\int w(y)\,y^T\nabla^2 f_\Delta(x)\,y\,dy + o(h^2).
\]
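For the product kernel $w(y)=\prod_j w(y_j)$ used here, with $\int u\,w(u)\,du = 0$, the leading bias term can be written more explicitly; writing $\mu_2(w) = \int u^2 w(u)\,du$ (a symbol introduced only for this side remark), a direct check gives

\[
\frac{1}{2}h^2\int y^T\nabla^2 f_\Delta(x)\,y\;w(y)\,dy
= \frac{1}{2}h^2\sum_{i=1}^{p}\sum_{j=1}^{p}\frac{\partial^2 f_\Delta(x)}{\partial x_i\partial x_j}\int y_i y_j\,w(y)\,dy
= \frac{1}{2}h^2\,\mu_2(w)\sum_{i=1}^{p}\frac{\partial^2 f_\Delta(x)}{\partial x_i^2},
\]

since $\int y_i y_j\,w(y)\,dy = 0$ for $i\neq j$ and $\int y_i^2\,w(y)\,dy = \mu_2(w)$.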

The Rem term is $o(h^2)$, because the third-order term in the Taylor expansion is $\frac{1}{3!}(u\cdot\nabla)\bigl[u^T\nabla^2 f_\Delta(x)\,u\bigr]$; after substituting the variable u, the power of h becomes 3.

Now we go back to Lemma 2 and investigate the order in n of the bound on the difference $|Ef_{nh}(x) - E\bar f_{nh}(x)|$. For that purpose we use the bounds for the functions $\gamma_0(h)$ and $\gamma_1(h, x)$ given in Lemmas 4.2 and 4.4 below.

For $0 < \delta < 1$, we choose γ such that $\delta > 4/\gamma$. Then pick

\[
\beta \in \Bigl(\frac{1}{2\gamma},\ \frac{\delta}{4} - \frac{1}{2\gamma}\Bigr),
\qquad \varepsilon = n^{-\beta}, \qquad \Delta = n^{-\delta}, \qquad h = \frac{p\gamma\pi}{\log n}.
\]

Substituting these in the bound for $\gamma_0$,

\[
\gamma_0(h) = O\bigl(h^{p(1+\alpha)}\,e^{p\pi/2h}\bigr),
\]

and then all this in the first term of the result of Lemma 2 gives

\[
O\Bigl(\frac{1}{h^{p+1}}\,\gamma_0(h)\,\frac{p^{1/2}\Delta^{1/4}}{\varepsilon}\Bigr)
= O\Bigl(h^{p\alpha-1}\,e^{p\pi/2h}\,p^{1/2}\,n^{\beta-\delta/4}\Bigr)
= O\Bigl(h^{p\alpha-1}\,p^{1/2}\,n^{\frac{1}{2\gamma}+\beta-\frac{\delta}{4}}\Bigr).
\]

The exponent of n in the last bound is $\frac{1}{2\gamma}+\beta-\frac{\delta}{4}$, which, because of our choice $\beta < \frac{\delta}{4}-\frac{1}{2\gamma}$, is negative.

In the same way we derive the exponent of n in the second and third term of the result of Lemma 2.


The order of the second term, $\frac{1}{h^p}\,\gamma_0(h)\,\frac{\sqrt{\Delta}}{\varepsilon^2}$ (up to constants), is $\frac{1}{2\gamma}+2\beta-\frac{\delta}{2}$, and of the third, $\frac{1}{2\gamma}-\beta$.

Putting these bounds together, we obtain the order

\[
n^{\frac{1}{2\gamma}+\beta-\frac{\delta}{4}} + n^{\frac{1}{2\gamma}+2\beta-\frac{\delta}{2}} + n^{\frac{1}{2\gamma}-\beta}.
\]

All exponents are negative, so this term is negligible with respect to $h^2 = \frac{p^2\gamma^2\pi^2}{\log^2 n}$.

Finally, we derive the result for the difference $Ef_{nh}(x) - f_\Delta(x)$:

\[
|Ef_{nh}(x) - f_\Delta(x)|
= |Ef_{nh}(x) - E\bar f_{nh}(x) + E\bar f_{nh}(x) - f_\Delta(x)|
\le |Ef_{nh}(x) - E\bar f_{nh}(x)| + |E\bar f_{nh}(x) - f_\Delta(x)|.
\]

By the preceding arguments,

\[
Ef_{nh}(x) = f_\Delta(x) + \frac{1}{2}h^2\int w(y)\,y^T\nabla^2 f_\Delta(x)\,y\,dy + o(h^2) + \mathrm{Rem},
\]

where Rem is negligible with respect to $h^2$. This finishes the proof of the bias bound, which, as we can see, is the same as the bias of the ordinary kernel estimator.

3.2 Proof of the bound for the variance of the estimator

For the variance of the estimator $\bar f_{nh}$ we have the following lemma.

Lemma 3 For $h \to 0$, $n \to \infty$, $n\Delta \to \infty$, it holds that

\[
\operatorname{var}\bar f_{nh}(x) = O\Bigl(\frac{1}{n\,h^{p(1+q)}\,\Delta}\Bigr) + O\Bigl(\frac{1}{n}\,h^{2p\alpha}\,e^{p\pi/h}\Bigr).
\]

Proof: Here we use the variance decomposition

\[
\operatorname{var}\bar f_{nh}(x)
= \operatorname{var}\bigl(E(\bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)\bigr)
+ E\bigl(\operatorname{var}(\bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)\bigr).
\]

By the proof of Lemma 1, the conditional expectation $E(\bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)$ equals

\[
\frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} w\Bigl(\frac{x - \log\sigma^2_{(j-1)\Delta}}{h}\Bigr),
\]

where $\log\sigma^2_{(j-1)\Delta}$ is shorthand for the vector $(\log\sigma^2_{(j-p)\Delta},\dots,\log\sigma^2_{(j-1)\Delta})$. Below we use the following theorem by Masry (1983), applied to the case when $Y_j$ stands for $\log\sigma^2_{(j-1)\Delta}$.


Theorem Let Y be a strongly mixing process with coefficients $\alpha(\tau)$ such that $\int[\alpha(\tau)]^q\,d\tau < \infty$, $0 < q < 1$. Then for the estimator $f_{nh}(x)$ of the stationary density of $Y_t$ we have that

\[
\operatorname{cov}\bigl(f_{nh}(x), f_{nh}(y)\bigr) = O\Bigl(\frac{1}{n\,h^{p(1+q)}}\Bigr),
\quad\text{where}\quad
f_{nh}(x) = \frac{1}{(n-p+1)h^p}\sum_{j=p}^{n} w\Bigl(\frac{x - Y_j}{h}\Bigr).
\]

Proof of the theorem: Write $w_h(x) = \frac{1}{h^p}w\bigl(\frac{x}{h}\bigr)$. Then we have

\[
f_{nh}(x) = \frac{1}{n-p+1}\sum_{k=p}^{n} w_h(x - Y_k).
\]

Because of this and the stationarity of the process Y, we can write the covariance as

\[
\operatorname{cov}\bigl(f_{nh}(x), f_{nh}(y)\bigr)
= \frac{1}{(n-p+1)^2}\sum_{j=p}^{n}\sum_{i=p}^{n}\operatorname{cov}\bigl(w_h(x - Y_{j-i}),\, w_h(y - Y_0)\bigr)
\]
\[
= \sum_{k=-(n-p)}^{n-p}\ \sum_{j=1}^{(n-p+1)-|k|}\frac{1}{(n-p+1)^2}\operatorname{cov}\bigl(w_h(x - Y_{|k|}),\, w_h(y - Y_0)\bigr)
= \sum_{k=-(n-p)}^{n-p} I_{n,k}(x, y) = I_{n,0}(x, y) + R_n(x, y),
\]

where

\[
R_n(x, y) = \sum_{\substack{k=-(n-p)\\ k\neq 0}}^{n-p} I_{n,k}(x, y).
\]

First we consider $I_{n,0}(x, y)$ and derive its order in terms of n. We have

\[
I_{n,0}(x, y) = \frac{1}{(n-p+1)^2}\sum_{j=1}^{n-p+1}\operatorname{cov}\bigl(w_h(x - Y_0),\, w_h(y - Y_0)\bigr)
= \frac{1}{n-p+1}\operatorname{cov}\bigl(w_h(x - Y_0),\, w_h(y - Y_0)\bigr)
\]
\[
= \frac{1}{n-p+1}\int w_h(x-u)\,w_h(y-u)\,f_\Delta(u)\,du
- \frac{1}{n-p+1}E\bigl(f_{nh}(x)\bigr)E\bigl(f_{nh}(y)\bigr).
\]


By Theorem 1a) in Masry (1983) we have that $E[f_{nh}(x)] \to f_\Delta(x)$ as $n\to\infty$, which means that the second term is $O(\frac{1}{n})$. The first part equals

\[
\frac{1}{n-p+1}\int\frac{1}{h^p}w\Bigl(\frac{x-u}{h}\Bigr)\frac{1}{h^p}w\Bigl(\frac{y-u}{h}\Bigr)f_\Delta(u)\,du.
\]

After a change of variable $\frac{x-u}{h} = v$, the last integral equals

\[
\frac{1}{n-p+1}\,\frac{1}{h^p}\int w(v)\,w\Bigl(\frac{y-x}{h} + v\Bigr)f_\Delta(x - hv)\,dv.
\]

The integrand is bounded by a constant, because of the assumption that σ has a bounded density. Since $w \in L^1$, and in view of (iv) in Condition 1, it tends to $w^2(v)f_\Delta(x)\delta_{y,x}$, where

\[
\delta_{y,x} = \begin{cases}1 & \text{if } y = x,\\ 0 & \text{if } y \neq x.\end{cases}
\]

By the dominated convergence theorem we have that

\[
\lim_{n\to\infty}(n-p+1)\,h^p\, I_{n,0}(x, y) = f_\Delta(x)\,\delta_{y,x}\int w^2(v)\,dv,
\]

which implies

\[
I_{n,0} = O\Bigl(\frac{1}{(n-p+1)h^p}\Bigr) = O\Bigl(\frac{1}{n\,h^p}\Bigr).
\]

Now we give a bound for

\[
R_n(x, y) = \sum_{\substack{k=-(n-p)\\ k\neq 0}}^{n-p} I_{n,k}(x, y).
\]

For that purpose we will use a lemma by Deo [3] for strongly mixing processes.

Lemma: For a strongly mixing process X, let U and V denote random variables, measurable with respect to the σ-algebras $\mathcal{F}^0_{-\infty}$ and $\mathcal{F}^\infty_\tau$, where $\tau \ge 0$. Suppose also that

\[
E|U|^{2+\delta} < \infty \quad\text{and}\quad E|V|^{2+\delta} < \infty, \quad\text{for some } \delta > 0.
\]

Then we have that

\[
|\operatorname{cov}(U, V)| \le 10\,[\alpha(\tau)]^{\delta/(2+\delta)}\,\bigl\{E|U|^{2+\delta}\,E|V|^{2+\delta}\bigr\}^{1/(2+\delta)}.
\]

In our case we set $U = w_h(y - Y_0)$ and $V = w_h(x - Y_{|k|})$. Then we have that


\[
E|U|^{2+\delta} = \int\bigl[w_h(y-u)\bigr]^{2+\delta}f_\Delta(u)\,du
= \int\Bigl[\frac{1}{h^p}w\Bigl(\frac{y-u}{h}\Bigr)\Bigr]^{2+\delta}f_\Delta(u)\,du,
\]

and making the change of variable $\frac{y-u}{h} = v$, the last integral equals

\[
\frac{h^p}{h^{p(2+\delta)}}\int w^{2+\delta}(v)\,f_\Delta(y - vh)\,dv
= \frac{1+o(1)}{h^{p(1+\delta)}}\,f_\Delta(y)\int|w(v)|^{2+\delta}\,dv < \infty,
\]

by Condition 1 and the dominated convergence theorem. Similarly, $E|V|^{2+\delta} < \infty$. Having this, we can use Deo's lemma and finally derive a bound for $|R_n(x, y)|$.

\[
|R_n(x, y)|
\le \sum_{\substack{k=-(n-p)\\ k\neq 0}}^{n-p}\frac{1}{(n-p+1)^2}\sum_{j=1}^{(n-p+1)-|k|}\bigl|\operatorname{cov}\bigl(w_h(x - Y_{|k|}),\, w_h(y - Y_0)\bigr)\bigr|
\]
\[
\le \sum_{\substack{k=-(n-p)\\ k\neq 0}}^{n-p}\frac{1}{(n-p+1)^2}\sum_{j=1}^{(n-p+1)-|k|}10\,\bigl(\alpha(|k|)\bigr)^{\delta/(2+\delta)}
\Bigl\{\Bigl(\frac{1+o(1)}{h^{p(1+\delta)}}\Bigr)^{2}f_\Delta(y)\,f_\Delta(x)\Bigl(\int|w(v)|^{2+\delta}\,dv\Bigr)^{2}\Bigr\}^{1/(2+\delta)}
\]
\[
= \sum_{k=1}^{n-p}\frac{1}{(n-p+1)^2}\sum_{j=1}^{(n-p+1)-k}20\,\bigl(\alpha(k)\bigr)^{\delta/(2+\delta)}
\Bigl\{\Bigl(\frac{1+o(1)}{h^{p(1+\delta)}}\Bigr)^{2}f_\Delta(y)\,f_\Delta(x)\Bigl(\int|w(v)|^{2+\delta}\,dv\Bigr)^{2}\Bigr\}^{1/(2+\delta)}
\]
\[
\le \frac{20(1+o(1))}{(n-p+1)\,h^{\frac{2p(1+\delta)}{2+\delta}}}\,\bigl(f_\Delta(y)f_\Delta(x)\bigr)^{\frac{1}{2+\delta}}
\Bigl(\int|w(v)|^{2+\delta}\,dv\Bigr)^{\frac{2}{2+\delta}}\int_0^\infty[\alpha(\tau)]^{\delta/(2+\delta)}\,d\tau,
\]

for every $\delta > 0$. If we put $\delta = 2q/(1-q)$, then the power of h in the last expression is $p(1+q)$. Hence

\[
|R_n(x, y)| = O\Bigl(\frac{1}{(n-p+1)\,h^{p(1+q)}}\Bigr) = O\Bigl(\frac{1}{n\,h^{p(1+q)}}\Bigr).
\]
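A quick check of the exponent arithmetic behind this choice of δ (spelled out here, not displayed in the text):

\[
\delta = \frac{2q}{1-q}\ \Longrightarrow\ 2+\delta = \frac{2}{1-q},\qquad 1+\delta = \frac{1+q}{1-q},\qquad
\frac{2p(1+\delta)}{2+\delta} = 2p\,\frac{1+q}{1-q}\cdot\frac{1-q}{2} = p(1+q).
\]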

Having that

\[
\operatorname{cov}\bigl(f_{nh}(x), f_{nh}(y)\bigr) = I_{n,0}(x, y) + R_n(x, y),
\]

we have proved the theorem.

We continue the proof of Lemma 3. Note that, since $Y_j = \log\sigma^2_{(j-1)\Delta}$, the mixing coefficients of the Y process, expressed in terms of the mixing coefficients of the σ process, equal $\alpha(\Delta\tau)$. Hence we have a bound for the first part of the lemma, namely $O\bigl(\frac{1}{n\,h^{p(1+q)}\,\Delta}\bigr)$.

Because, given σ, the vectors $\log(\bar X_j)^2$ are independent for indices at least p apart, we bound the sum-of-variances part of $E\bigl(\operatorname{var}(\bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)\bigr)$ by

\[
\frac{1}{(n-p+1)^2\,h^{2p}}\,E\Bigl[\sum_{j=p}^{n}\operatorname{var}\Bigl(V_h\Bigl(\frac{x - \log(\bar X_j)^2}{h}\Bigr)\Big|\mathcal{F}_\sigma\Bigr)\Bigr]
\le \frac{1}{(n-p+1)\,h^{2p}}\,E\Bigl(V_h\Bigl(\frac{x - \log(\bar X_1)^2}{h}\Bigr)\Bigr)^{2}.
\]

It can be shown that the contribution of the sum of the covariances, of which there are at most $(n-p+1)p$ nonzero, is of the same order. Using the first part of Lemma 4.3, i.e. $|V_h(x)| \le \gamma_0(h)$, the last term is bounded by

\[
\frac{1}{n\,h^{2p}}\,\gamma_0^2(h).
\]

Applying the bound on $\gamma_0(h)$ from Lemma 4.2, we have

\[
E\bigl(\operatorname{var}(\bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)\bigr)
= O\Bigl(\frac{1}{n\,h^{2p}}\,h^{2p(1+\alpha)}\,e^{p\pi/h}\Bigr)
= O\Bigl(\frac{1}{n}\,h^{2p\alpha}\,e^{p\pi/h}\Bigr).
\]

Putting together the last results, we have finished the proof of Lemma 3.

Lemma 4 For $h \to 0$ and $\varepsilon > 0$ we have

\[
\operatorname{var}\bigl(f_{nh}(x) - \bar f_{nh}(x)\bigr)
= O\Biggl(\frac{1}{n}\,\frac{1}{h^{2p+2}}\,\gamma_0^2(h)\,\frac{\sqrt{\Delta}}{\varepsilon^2}
+ \frac{1}{n}\,\frac{\gamma_1^2\bigl(h,\bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\bigr)\bigr)}{|\log 2\varepsilon|^{2p}}\Biggr)
+ \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,O\Biggl(\frac{p^{\frac{1-q}{2}}\,\Delta^{\frac{1-q}{2}}}{h^{2}\varepsilon^{2}} + \varepsilon^{1-q}\Biggr).
\]

Proof: We use the same notation for $W_j$ as in the proof of Lemma 2,

\[
W_j = V_h\Bigl(\frac{x - \log((X^\Delta_j)^2)}{h}\Bigr) - V_h\Bigl(\frac{x - \log(\bar X_j)^2}{h}\Bigr).
\]

Then

\[
f_{nh}(x) - \bar f_{nh}(x) = \frac{1}{n-p+1}\,\frac{1}{h^p}\sum_{j=p}^{n} W_j,
\]

and writing the variance as

\[
\operatorname{var}\bigl(f_{nh}(x) - \bar f_{nh}(x)\bigr)
= \operatorname{var}\bigl(E[f_{nh}(x) - \bar f_{nh}(x)\,|\,\mathcal{F}_\sigma]\bigr)
+ E\bigl[\operatorname{var}(f_{nh}(x) - \bar f_{nh}(x)\,|\,\mathcal{F}_\sigma)\bigr],
\]

we have that, conditioned on $\mathcal{F}_\sigma$, $(X^\Delta_i, \bar X_i)$ and $(X^\Delta_j, \bar X_j)$ are independent whenever the vectors have no common components. This means that the conditional covariances of these pairs disappear. Then

\[
\operatorname{var}\bigl(E[f_{nh}(x) - \bar f_{nh}(x)\,|\,\mathcal{F}_\sigma]\bigr)
= \operatorname{var}\Bigl(E\Bigl[\frac{1}{n-p+1}\,\frac{1}{h^p}\sum_{j=p}^{n} W_j\,\Big|\,\mathcal{F}_\sigma\Bigr]\Bigr)
= \frac{1}{(n-p+1)^2}\,\frac{1}{h^{2p}}\operatorname{var}\Bigl(\sum_{j=p}^{n}E[W_j\,|\,\mathcal{F}_\sigma]\Bigr)
\]
\[
= \frac{1}{(n-p+1)^2}\,\frac{1}{h^{2p}}\Bigl[\sum_{j=p}^{n}\operatorname{var}\bigl(E[W_j\,|\,\mathcal{F}_\sigma]\bigr)
+ \sum_{i=p}^{n}\sum_{\substack{j=p\\ j\neq i}}^{n}\operatorname{cov}\bigl(E[W_j\,|\,\mathcal{F}_\sigma],\,E[W_i\,|\,\mathcal{F}_\sigma]\bigr)\Bigr]
\]
\[
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}}\operatorname{var}\bigl(E[W_1\,|\,\mathcal{F}_\sigma]\bigr)
+ \frac{1}{(n-p+1)^2}\,\frac{1}{h^{2p}}\sum_{i=p}^{n}\sum_{\substack{j=p\\ j\neq i}}^{n}\operatorname{cov}\bigl(E[W_j\,|\,\mathcal{F}_\sigma],\,E[W_i\,|\,\mathcal{F}_\sigma]\bigr)
= (*) + (**).
\]

First we consider the first part.

(∗): Because $\operatorname{var}X = EX^2 - (EX)^2$, we have $\operatorname{var}X \le EX^2$. As in the proof of Lemma 2, we split $\frac{1}{h^p}EW_j^2$ into three terms,

\[
\frac{1}{h^p}EW_j^2
= \frac{1}{h^p}EW_j^2\,I_{[\|X^\Delta_1\|\ge\varepsilon\ \text{and}\ \|\bar X_1\|\ge\varepsilon]}
+ \frac{1}{h^p}EW_j^2\,I_{[\|X^\Delta_1\|\le\varepsilon\ \text{or}\ \|\bar X_1\|\le\varepsilon]}\,I_{[\|X^\Delta_1-\bar X_1\|\ge\varepsilon]}
+ \frac{1}{h^p}EW_j^2\,I_{[\|X^\Delta_1\|\le\varepsilon\ \text{or}\ \|\bar X_1\|\le\varepsilon]}\,I_{[\|X^\Delta_1-\bar X_1\|<\varepsilon]}
= (1)+(2)+(3),
\]

and in the same way, using Lemma 4.3 and (3.1), we derive bounds for them. By Lemma 4.3 we have

\[
EW_j^2 \le \gamma_0^2(h)\,\Bigl\|\frac{\log((X^\Delta_j)^2) - \log(\bar X_j)^2}{h}\Bigr\|^{2}.
\]

Then $\frac{1}{h^p}EW_j^2\,I_{[\|X^\Delta_1\|\ge\varepsilon\ \text{and}\ \|\bar X_1\|\ge\varepsilon]}$ is bounded by

\[
\frac{2}{h^{p+2}}\,\gamma_0^2(h)\,E\bigl\|\log|X^\Delta_1| - \log|\bar X_1|\bigr\|^{2}\,I_{[\|X^\Delta_1\|\ge\varepsilon\ \text{and}\ \|\bar X_1\|\ge\varepsilon]}
\le \frac{2}{h^{p+2}}\,\frac{1}{\varepsilon^2}\,\gamma_0^2(h)\,E\|X^\Delta_1 - \bar X_1\|^{2}
\le \frac{2}{h^{p+2}}\,\gamma_0^2(h)\,\frac{pC\sqrt{\Delta}}{\varepsilon^2},
\]

because of (3.1). In a similar way we derive the bounds for (2) and (3):

\[
(2) \le \frac{4}{h^{p+1}}\,\gamma_0^2(h)\,\frac{pC\sqrt{\Delta}}{\varepsilon^2}
\]

and

\[
(3) \le C\,\gamma_1^2\Bigl(h,\Bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\Bigr)\Bigr)\,h^{p}\,\frac{p\,\varepsilon}{|\log 2\varepsilon|^{2p}}.
\]

So

\[
(*) = O\Biggl(\frac{1}{n}\,\frac{1}{h^{2p+2}}\,\gamma_0^2(h)\,\frac{\sqrt{\Delta}}{\varepsilon^2}
+ \frac{1}{n}\,\frac{\gamma_1^2\bigl(h,\bigl(\frac{|\log 2\varepsilon|}{h},\dots,\frac{|\log 2\varepsilon|}{h}\bigr)\bigr)}{|\log 2\varepsilon|^{2p}}\Biggr).
\]

Before continuing with the investigation of (∗∗), let us recall the notation that is used and introduce a new process:

\[
X^\Delta_i = (X^\Delta_{i-p+1},\dots,X^\Delta_i), \quad\text{where}\quad X^\Delta_i = \frac{1}{\sqrt{\Delta}}\int_{(i-1)\Delta}^{i\Delta}\sigma_t\,dW_t,
\]
\[
\bar X_i = (\bar X^\Delta_{i-p+1},\dots,\bar X^\Delta_i), \quad\text{where}\quad \bar X^\Delta_i = \sigma_{(i-1)\Delta}Z^\Delta_i.
\]

Given the volatility process $\sigma_t$, let $\bar\sigma_i$ denote the process

\[
\bar\sigma_i = (\bar\sigma^\Delta_{i-p+1},\dots,\bar\sigma^\Delta_i), \quad\text{where}\quad \bar\sigma^\Delta_i = \frac{1}{\Delta}\int_{(i-1)\Delta}^{i\Delta}\sigma^2_t\,dt.
\]


Then, for $\bar W_i$ denoting $E[W_i\,|\,\mathcal{F}_\sigma]$, it holds that

\[
\bar W_i = w\Bigl(\frac{x - \log\bar\sigma_i}{h}\Bigr) - w\Bigl(\frac{x - \log\sigma^2_{(i-1)\Delta}}{h}\Bigr),
\]

and stationarity of $W_j$ implies stationarity of $\bar W_i$. Hence

\[
\sum_{i\neq j}\operatorname{cov}(\bar W_i, \bar W_j) = 2\sum_{k=1}^{n-p}(n-p+1-k)\,\operatorname{cov}(\bar W_0, \bar W_k).
\]

For the mixing coefficients $\bar\alpha(k)$ of the $\bar W$ process we have that $\bar\alpha(k) \le \alpha((k-p)\Delta)$ for $k \ge p$ and $\bar\alpha(k) \le 1$ otherwise. Again referring to the lemma by Deo (1973), we bound the last covariance:

\[
\operatorname{cov}(\bar W_0, \bar W_k)
\le 10\,\bar\alpha(k)^{\frac{\tau}{2+\tau}}\bigl\{E|\bar W_0|^{2+\tau}\,E|\bar W_k|^{2+\tau}\bigr\}^{\frac{1}{2+\tau}},
\]

which by stationarity and the inequality for the mixing coefficients can be bounded by

\[
10\,\alpha\bigl((k-p)\Delta\bigr)^{\frac{\tau}{2+\tau}}\bigl\{E|\bar W_1|^{2+\tau}\bigr\}^{\frac{2}{2+\tau}}.
\]

Now we consider (∗∗). We have

\[
\Bigl|\frac{1}{(n-p+1)^2}\,\frac{1}{h^{2p}}\sum_{i=p}^{n}\sum_{\substack{j=p\\ j\neq i}}^{n}\operatorname{cov}(\bar W_i, \bar W_j)\Bigr|
= \Bigl|\frac{1}{(n-p+1)^2}\,\frac{2}{h^{2p}}\sum_{k=1}^{n-p}(n-p+1-k)\,\operatorname{cov}(\bar W_0, \bar W_k)\Bigr|
\]
\[
\le \Bigl|\frac{1}{n-p+1}\,\frac{2}{h^{2p}}\sum_{k=1}^{n-p}\Bigl(1 - \frac{k}{n-p+1}\Bigr)\,10\,\alpha\bigl((k-p)\Delta\bigr)^{\frac{\tau}{2+\tau}}\bigl\{E|\bar W_1|^{2+\tau}\bigr\}^{\frac{2}{2+\tau}}\Bigr|
\]
\[
\le \frac{20}{n-p+1}\,\frac{1}{h^{2p}}\,\bigl\{E|\bar W_1|^{2+\tau}\bigr\}^{\frac{2}{2+\tau}}
\Bigl(\alpha(0)^{\frac{\tau}{2+\tau}} + \frac{1}{\Delta}\int_0^\infty\alpha^{\frac{\tau}{2+\tau}}(t)\,dt\Bigr),
\]

by the monotonicity of the mixing coefficients α.

Now we concentrate on $E|\bar W_1|^{2+\tau}$. First, again, we split the space Ω into three sets and then use bounds similar to those in Lemma 4.4 of the paper [1] referred to above.


\[
E|\bar W_1|^{2+\tau}
= E\Bigl|w\Bigl(\frac{x - \log\bar\sigma_1}{h}\Bigr) - w\Bigl(\frac{x - \log\sigma^2_0}{h}\Bigr)\Bigr|^{2+\tau}\cdot
\Bigl\{I_{\{\|\bar\sigma_1\|\ge\varepsilon\ \text{and}\ \|\sigma^2_0\|\ge\varepsilon\}}
+ I_{\{\|\bar\sigma_1\|<\varepsilon\ \text{or}\ \|\sigma^2_0\|<\varepsilon\}}\,I_{\{\|\bar\sigma_1 - \sigma^2_0\|\ge\varepsilon\}}
+ I_{\{\|\bar\sigma_1\|<\varepsilon\ \text{or}\ \|\sigma^2_0\|<\varepsilon\}}\,I_{\{\|\bar\sigma_1 - \sigma^2_0\|<\varepsilon\}}\Bigr\}
= (1)+(2)+(3),
\]

where $\sigma^2_0$ is shorthand for the vector $(\sigma^2_0,\dots,\sigma^2_{(p-1)\Delta})$.

Then, by the conditions on the function w in the introduction, using Fourier inversion we have that w is Lipschitz and bounded by $1/\pi$. Hence

\[
(1) \le \frac{1}{(\kappa h\varepsilon)^{2+\tau}}\,E\bigl\|\bar\sigma_1^{\kappa} - \sigma_0^{2\kappa}\bigr\|^{2+\tau},
\]
\[
(2) \le P\bigl(\|\bar\sigma_1^{\kappa} - \sigma_0^{2\kappa}\|\ge\varepsilon\bigr)
\le \frac{1}{\varepsilon^{2+\tau}}\,E\bigl\|\bar\sigma_1^{\kappa} - \sigma_0^{2\kappa}\bigr\|^{2+\tau}.
\]

The third term can be bounded by

\[
(3) \le E\Bigl|w\Bigl(\frac{x - \log\bar\sigma_1}{h}\Bigr) - w\Bigl(\frac{x - \log\sigma^2_0}{h}\Bigr)\Bigr|^{2+\tau}\cdot
I_{\{\|\bar\sigma_1\|\le\varepsilon(1+\varepsilon^{1-\kappa})^{1/\kappa}\ \text{and}\ \|\sigma^2_0\|\le\varepsilon(1+\varepsilon^{1-\kappa})^{1/\kappa}\}},
\]

which is bounded by $P(\|\sigma^2_0\|\le 2\varepsilon) = O(\varepsilon)$, since $\sigma^2_0$ has, by assumption, a bounded density.

To bound $E\|\bar\sigma_1 - \sigma^2_0\|^2$, we use that $E|\bar\sigma^\Delta_1 - \sigma^2_0|^2 \le \Delta$. Then

\[
E\|\bar\sigma_1 - \sigma^2_0\|^2
= E\sum_{t=1}^{p}|\bar\sigma^\Delta_t - \sigma^2_{(t-1)\Delta}|^2
= \sum_{t=1}^{p}E|\bar\sigma^\Delta_t - \sigma^2_{(t-1)\Delta}|^2 \le p\Delta.
\]

Finally, from (1), (2) and (3), choosing $\tau = \frac{2q}{1-q}$ and $\kappa = \frac{1}{2+\tau} = \frac{1-q}{2}$, we have

\[
\Bigl|\frac{1}{(n-p+1)^2}\,\frac{1}{h^{2p}}\sum_{i\neq j}\operatorname{cov}(\bar W_i, \bar W_j)\Bigr|
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,
O\Bigl(\Bigl\{\frac{1}{h^{2+\tau}}\,\frac{1}{\varepsilon^{2+\tau}}\,E\|\bar\sigma_1^{\kappa} - \sigma_0^{2\kappa}\|^{2+\tau} + \varepsilon\Bigr\}^{2/(2+\tau)}\Bigr)
\]
\[
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,
O\Bigl(\Bigl\{\frac{1}{h^{2+\tau}}\,\frac{1}{\varepsilon^{2+\tau}}\,E\|\bar\sigma_1 - \sigma_0^{2}\|^{\kappa(2+\tau)} + \varepsilon\Bigr\}^{2/(2+\tau)}\Bigr)
\]
\[
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,
O\Bigl(\frac{\bigl(E\|\bar\sigma_1 - \sigma_0^{2}\|^{2}\bigr)^{1/(2+\tau)}}{h^{2}\varepsilon^{2}} + \varepsilon^{\frac{2}{2+\tau}}\Bigr)
\]
\[
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,
O\Bigl(\frac{(p\Delta)^{1/(2+\tau)}}{h^{2}\varepsilon^{2}} + \varepsilon^{\frac{2}{2+\tau}}\Bigr)
= \frac{1}{n-p+1}\,\frac{1}{h^{2p}\,\Delta}\,
O\Bigl(\frac{p^{\frac{1-q}{2}}\,\Delta^{\frac{1-q}{2}}}{h^{2}\varepsilon^{2}} + \varepsilon^{1-q}\Bigr).
\]
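The exponent bookkeeping behind the last two equalities, spelled out as a check (not displayed in the text):

\[
\tau = \frac{2q}{1-q},\ \ \kappa = \frac{1-q}{2}\ \Longrightarrow\
2+\tau = \frac{2}{1-q},\qquad \kappa(2+\tau) = 1,\qquad \frac{2}{2+\tau} = 1-q,\qquad \frac{1}{2+\tau} = \frac{1-q}{2},
\]

so that $(p\Delta)^{1/(2+\tau)} = p^{\frac{1-q}{2}}\Delta^{\frac{1-q}{2}}$ and $\varepsilon^{2/(2+\tau)} = \varepsilon^{1-q}$.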


Chapter 4

Technical lemmas

Lemma 4.1 For the kernel function $V_h$ we have the following bound:

\[
\|V_h\|_2 = O\bigl(h^{p(1/2+\alpha)}\,e^{p\pi/2h}\bigr).
\]

Proof: By Lemma 5.2 of Van Es et al. (2003), the univariate kernel function $v_h$ satisfies

\[
\|v_h\|_2 = O\bigl(h^{1/2+\alpha}\,e^{\pi/2h}\bigr).
\]

Using this and the fact that $V_h$, by its definition, factorizes as $V_h(x) = \prod_{j=1}^{p}v_h(x_j)$, we obtain the required bound.

Lemma 4.2 For $h \to 0$,

\[
\gamma_0(h) = O\bigl(h^{p(1+\alpha)}\,e^{p\pi/2h}\bigr).
\]

Proof: Using the factorization of the p-dimensional $\gamma_0(h)$ as the p-th power of the univariate $\gamma_0(h)$, and referring to the same paper as in the previous lemma for the bound on the univariate $\gamma_0(h)$, we obtain the required result.

Lemma 4.3 For the kernel function $V_h$ we have

\[
|V_h(x)| \le \gamma_0(h)
\quad\text{and}\quad
|V_h(x+u) - V_h(x)| \le \gamma_0(h)\sum_{j=1}^{p}|u_j|.
\]

Proof:

\[
|V_h(x)| = \Bigl|\frac{1}{(2\pi)^p}\int_{\mathbb{R}^p}\frac{\phi_w(s)}{\phi_k(s/h)}\,e^{-is\cdot x}\,ds\Bigr|
\]


\[
\le \frac{1}{(2\pi)^p}\int_{\mathbb{R}^p}\Bigl|\frac{\phi_w(s)}{\phi_k(s/h)}\,e^{-is\cdot x}\Bigr|\,ds
= \frac{1}{(2\pi)^p}\int_{\mathbb{R}^p}\Bigl|\frac{\phi_w(s)}{\phi_k(s/h)}\Bigr|\,ds = \gamma_0(h),
\]

which finishes the first part of the lemma. For the second part we have

\[
|V_h(x+u) - V_h(x)|
= \Bigl|\frac{1}{(2\pi)^p}\int\frac{\phi_w(s)}{\phi_k(s/h)}\,e^{-is\cdot x}\bigl[e^{-is\cdot u} - 1\bigr]\,ds\Bigr|
\le \frac{1}{(2\pi)^p}\int\Bigl|\frac{\phi_w(s)}{\phi_k(s/h)}\Bigr|\,\bigl|e^{-is\cdot u} - 1\bigr|\,ds
\le \gamma_0(h)\sum_{j=1}^{p}|u_j|.
\]

The last step follows from

\[
\bigl|e^{-is\cdot u} - 1\bigr|
= \bigl|e^{-i(s_1u_1+\dots+s_pu_p)} - 1\bigr|
= \Bigl|\cos\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr) - i\sin\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr) - 1\Bigr|
\]
\[
= \sqrt{\Bigl(\cos\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr) - 1\Bigr)^2 + \sin^2\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr)}
= \sqrt{\cos^2\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr) - 2\cos\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr) + 1 + \sin^2\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr)}
\]
\[
= \sqrt{2\Bigl(1 - \cos\Bigl(\sum_{j=1}^{p}s_ju_j\Bigr)\Bigr)}
= 2\sqrt{\frac{1 - \cos\bigl(\sum_{j=1}^{p}s_ju_j\bigr)}{2}}
= 2\Bigl|\sin\frac{\sum_{j=1}^{p}s_ju_j}{2}\Bigr|
\le 2\Bigl|\frac{\sum_{j=1}^{p}s_ju_j}{2}\Bigr|
\le \sum_{j=1}^{p}|u_j| \qquad\text{for } |s_j| < 1.
\]


Lemma 4.4 As $|x_j| \to \infty$ for $j = 1,\dots,p$, we have that

\[
|V_h(x)| \le C\,\gamma_1(h, x)\prod_{j=1}^{p}\frac{1}{|x_j|}
\]

and

\[
\gamma_1(h, x) = O\Bigl(\frac{|\log h|^p}{h^p}\,e^{\frac{p\pi}{2h}\bigl(1 + \sum_j\frac{\pi}{p|x_j|}\bigr)}\Bigr)
\]

for some constant C, as $h \to 0$.

Proof: Again, we obtain these bounds by combining the factorization of $V_h(x)$ and $\gamma_1(h, x)$ with the univariate bounds

\[
|v_h(x)| \le \frac{D}{|x|}\,\gamma_1(h, x)\ \text{ as } |x| \to \infty
\quad\text{and}\quad
\gamma_1(h, x) = O\Bigl(\frac{|\log h|}{h}\,e^{\pi(1+\pi/|x|)/2h}\Bigr),
\]

as in Van Es et al. (2003). Then

\[
|V_h(x)| = \prod_{j=1}^{p}|v_h(x_j)|
\le \prod_{i=1}^{p}\frac{C_i}{|x_i|}\,\gamma_1(h, x_i)
= C\,\gamma_1(h, x)\prod_{i=1}^{p}\frac{1}{|x_i|}.
\]

For $\gamma_1(h, x)$ we have

\[
\gamma_1(h, x) = \prod_{i=1}^{p}\gamma_1(h, x_i)
= \prod_{i=1}^{p}O\Bigl(\frac{|\log h|}{h}\,e^{\pi(1+\pi/|x_i|)/2h}\Bigr)
= O\Bigl(\frac{|\log h|^p}{h^p}\,e^{\frac{p\pi}{2h}\bigl(1 + \sum_j\frac{\pi}{p|x_j|}\bigr)}\Bigr).
\]


References

[1] B. van Es, P. Spreij, H. van Zanten, Nonparametric volatility density estimation, Bernoulli 9(3), 2003, 451-465.

[2] B. van Es, P. Spreij, H. van Zanten, Nonparametric volatility density estimation for discrete time models, Journal of Nonparametric Statistics 17(2), 2005, 237-251.

[3] C. M. Deo, A note on empirical processes of strong-mixing sequences, Annals of Probability 1, 1973, 870-875.

[4] E. Masry, Asymptotic normality for deconvolution estimators of multivariate densities of stationary processes, Journal of Multivariate Analysis 44, 1993, 47-68.

[5] E. Masry, Multivariate probability density deconvolution for stationary random processes, IEEE Transactions on Information Theory 37(4), 1991, 1105-1115.

[6] E. Masry, Probability density estimation from sampled data, IEEE Transactions on Information Theory IT-29(5), 1983, 696-709.

[7] E. Masry, Strong consistency and rates for deconvolution of multivariate densities of stationary processes, Stochastic Processes and their Applications 47, 1993, 53-74.
