Deterministic versus Stochastic Volatility Nizar Touzi, Nick Webber and seminar participants at the...

48
Deterministic versus Stochastic Volatility * Valentina Corradi University of Warwick Walter Distaso Tanaka Business School Imperial College London November 2007 Abstract This paper provides a testing procedure to discriminate between the classes of deterministic and stochastic volatility models, under minimal assumptions. In particular, apart from standard regularity conditions, no assumptions are made on the functional forms of either the drift or the variance term. Our test is constructed by comparing two estimators of integrated volatility: one is a kernel estimator of the instantaneous variance, averaged over the sample realization on a fixed time span; the other is realized volatility. Under the null hypothesis of a deterministic volatility model, the statistics have a mixed normal limiting distribution, under the alternative diverge at an appropriate rate. Versions of the test robust to jumps and to microstructure noise are also provided. The findings from a Monte Carlo study indicate that the suggested tests have good finite sample properties. An empirical illustration to short term rates is included. Finally, the economic effect on bond pricing of incorrectly choosing deterministic volatility models is quantified. Keywords and phrases: Diffusions, jumps, local times, mixed normal distribution, microstructure noise, realized volatility, realized power variation. JEL classification: C22, C12, G12. * We are grateful to Karim Abadir, Carol Alexander, Federico Bandi, Andrea Buraschi, James Davidson, Mahmoud El-Gamal, Marcelo Fernandes, Marc Hallin, Javier Hidalgo, Rod McCrorie, Nour Meddahi, Antonio Mele, Joon Park, Hashem Pesaran, Peter Phillips, Viktor Todorov, Nizar Touzi, Nick Webber and seminar participants at the University of Exeter, Cass Business School, LSE, Cambridge University, University of Leicester, Rice University, Tanaka Business School, Texas A&M University, Universit`a di Rimini, Universit`a di Venezia, C`a Foscari, 2004 SIS Conference in Bari, 2005 Winter Meeting of the Econometric Society, 2007 Duke Conference, 2007 European Meeting of the Econometric Society in Budapest and the 2007 Brazilian Time Series and Econometrics School in Gramado for very helpful comments and suggestions. The authors gratefully acknowledge financial support from the ESRC, grant codes R000230006 and RES-062-23-0311. University of Warwick, Department of Economics, Coventry CV4 7AL, UK, [email protected]. Imperial College, Tanaka Business School, Finance and Accounting, London, SW7 2AZ, UK, email: [email protected].

Transcript of Deterministic versus Stochastic Volatility Nizar Touzi, Nick Webber and seminar participants at the...

Deterministic versus Stochastic Volatility∗

Valentina Corradi†

University of Warwick

Walter Distaso‡

Tanaka Business School

Imperial College London

November 2007

Abstract

This paper provides a testing procedure to discriminate between the classes of deterministic and stochastic volatility

models, under minimal assumptions. In particular, apart from standard regularity conditions, no assumptions are

made on the functional forms of either the drift or the variance term. Our test is constructed by comparing two

estimators of integrated volatility: one is a kernel estimator of the instantaneous variance, averaged over the sample

realization on a fixed time span; the other is realized volatility. Under the null hypothesis of a deterministic volatility

model, the statistics have a mixed normal limiting distribution, under the alternative diverge at an appropriate rate.

Versions of the test robust to jumps and to microstructure noise are also provided. The findings from a Monte Carlo

study indicate that the suggested tests have good finite sample properties. An empirical illustration to short term

rates is included. Finally, the economic effect on bond pricing of incorrectly choosing deterministic volatility models

is quantified.

Keywords and phrases: Diffusions, jumps, local times, mixed normal distribution,

microstructure noise, realized volatility, realized power variation.

JEL classification: C22, C12, G12.

∗We are grateful to Karim Abadir, Carol Alexander, Federico Bandi, Andrea Buraschi, James Davidson, Mahmoud El-Gamal, Marcelo

Fernandes, Marc Hallin, Javier Hidalgo, Rod McCrorie, Nour Meddahi, Antonio Mele, Joon Park, Hashem Pesaran, Peter Phillips, Viktor

Todorov, Nizar Touzi, Nick Webber and seminar participants at the University of Exeter, Cass Business School, LSE, Cambridge University,

University of Leicester, Rice University, Tanaka Business School, Texas A&M University, Universita di Rimini, Universita di Venezia, Ca

Foscari, 2004 SIS Conference in Bari, 2005 Winter Meeting of the Econometric Society, 2007 Duke Conference, 2007 European Meeting of the

Econometric Society in Budapest and the 2007 Brazilian Time Series and Econometrics School in Gramado for very helpful comments and

suggestions. The authors gratefully acknowledge financial support from the ESRC, grant codes R000230006 and RES-062-23-0311.†University of Warwick, Department of Economics, Coventry CV4 7AL, UK, [email protected].‡Imperial College, Tanaka Business School, Finance and Accounting, London, SW7 2AZ, UK, email: [email protected].

1 Introduction

Continuous time diffusion models are nowadays an accepted paradigm in finance. Since the introduction of the simple

homoskedastic diffusion

dXt = µdt + σdWt (1)

in the 1970s through the work of Merton, they have been an extremely popular choice to characterize the behaviour of

financial assets and have formed the basic infrastructure for pricing contingent claims.

A lot of work has been done in the past twenty years in order to make more realistic and empirically plausible

assumptions on the data generating process. Particular attention has been put on the volatility coefficient σ, which is a

key input in any derivative pricing formula and is of crucial importance in financial risk management.

The first important departure from the basic diffusion described in (1) has been the introduction of time varying

volatility models; this has been accomplished by modeling volatility as a known deterministic function of the underlying

asset. Notable examples are the popular models by Cox, Ingersoll and Ross (1985) and Black, Derman and Toy (1990).

In the paper, we refer to this class as either deterministic volatility or one factor models, since there is only one source

of randomness driving the underlying variable.

A second strand of literature has treated volatility as an additional state variable (albeit latent), driven by random

shocks which can be correlated with those driving the asset. These models, labeled stochastic volatility models, have been

introduced by Hull and White (1987); further important contributions to this literature include Stein and Stein (1991)

and Heston (1993). More recently, an interesting literature has used stochastic volatility models to study the variation

of the yield curve and of the stock market as a function of multiple macroeconomic and unobservable factors. See, for

example, the work by Ang and Piazzesi (2003), Piazzesi (2005), Bibkov and Chernov (2006), Buraschi and Jiltsov (2006)

and Corradi, Distaso and Mele (2006).

Overall, the correct choice between the class of one factor models and of stochastic volatility models is important

for the pricing of derivative securities. In fact, when there is only one source of randomness driving the traded asset,

derivative securities can be priced by no arbitrage, via a pure replication argument, without the need to specify the

behavior of the risk premium. However, in the presence of stochastic volatility, one needs to specify a risk premium, in

order to recover a pricing function. We provide numerical examples showing that the wrong choice of the class of model

used to price contingent claims can lead to severe mispricing errors and hence is economically relevant.

This paper provides a testing procedure which allows to discriminate between the classes of one factor and stochastic

volatility models, under minimal assumptions. Apart from standard regularity conditions, no assumptions are made on

the functional forms of either the drift or the diffusion term. Being able to choose between classes of models, our testing

procedure is nonparametric in nature.

We derive our testing procedure by comparing two estimators of the quadratic variation of the diffusion driving the

underlying asset. One is a kernel estimator of the instantaneous variance, similar to Florens-Zmirou’s (1993), averaged

over the sample realization of the asset trajectory; the other is realized volatility. Under the null hypothesis of a one

factor model, both estimators are consistent for the “true” integrated volatility. Under the alternative hypothesis, the

kernel type estimator is not consistent, while realized volatility retains the consistency property.

We show that the test statistic converges to a mixed normal distribution under the null hypothesis and diverges under

1

the alternative. The derived asymptotic theory is based on the time interval between successive observations approaching

zero, while the time span is kept fixed. As a consequence, the limiting behavior of the statistic is not affected by the

drift specification. Also, no stationarity or ergodicity assumption is required. We also propose appropriately modified

versions of the test, to control for the presence of jumps and market frictions.

Our testing procedure is relevant and can be applied to several situations in financial economics. The first important

application is the literature on interest rate models. In order to have a model which perfectly fits the entire initial

term structure of interest rates, Heath, Jarrow and Morton (2002) proposed to take as primitives the entire structure

of instantaneous forward rates, ft,l, rather than just looking at the short rate st = liml↓t ft,l. In particular, their model

specifies the following dynamics, for any T ,

dst = θtdt + σ′tdWt (2)

dft,T = θt,T dt + σ′t,T

dWt, t ∈ (τ, T ]

where Wt is a vector of standard Brownian motions and the initial term structure f(τ, T ) is observed from the market.

This type of models has the attractive feature of fitting the initial term structure exactly, by construction. It is a standard

practice, in pricing and hedging, to continuously recalibrate the model over t < T , instead of directly tackling the question

of the correct specification of the volatility function. In the case of stochastic volatility, though, continuous recalibration

leads to time-inconsistency, as shown by Buraschi and Corielli (2005); the cost of model specification error (which the

authors call model risk) turns out to be economically significant. Our test could be applied to these situations, in order

to determine whether volatility is stochastic or not.

Another important application, related to the previous one, concerns the pricing of derivatives written on traded

assets. Given an initial set of observed (call) option prices for different strikes K and maturities T at time τ , it is possible

to construct a volatility function σloc(K, T ), called local volatility, such that the implied dynamics (under the risk neutral

density) of the traded asset St takes the form

dSt

St= sdt + σloc(St, t)dWt, t ∈ (τ, T ], (3)

where s denotes the drift of the risk free asset and we use Wt to stress the fact that we are working under the risk

neutral measure. Model (3) fits exactly (by construction) the initial implied volatility smile. This type of models was

proposed independently by Derman and Kani (1994) and Dupire (1994). Of course, for each future t ∈ (τ, T ], a new

set of observed option prices will become available, and one needs to update σloc(St, t) as an input to (3), in order to

perfectly fit the cross-section of option prices. Recalibration seems to be unavoidable, as these models do not appear to be

stable over time, and their out of sample performance is worse than ad hoc (smoothed) version of the constant volatility

Black-Scholes model (see Dumas, Fleming and Whaley, 1998). Again, recalibration turns out to be time inconsistent if

volatility is stochastic (given the instability of local volatility models); hence our procedure could be successfully applied

in this context as well.

Tests for the null hypothesis of one factor models have been already suggested in the financial literature. In fact, one

factor models have three important implications for options: (i) monotonicity. Call (put) option prices are monotonically

increasing (decreasing) in the price of the underlying asset; (ii) perfect correlation. As there is only one Brownian motion

driving the behaviour of the asset price, option prices are perfectly correlated with the underlying asset prices; (iii)

2

redundancy. Option payoffs can be perfectly replicated with the risk free asset and the underlying asset, and so are

redundant securities.

Bakshi, Cao and Chen (2000) have derived testable implications of the monotonicity and perfect correlation properties,

while Buraschi and Jackwerth (2001) have suggested a test for redundancy.

However, there are important differences between the hypotheses tested in the two papers mentioned above and our

procedure.

First, as Bergman, Grundy and Wiener (1996) pointed out, there are counterexamples to the monotonicity and

redundancy properties. These include discontinuous stock price processes. Therefore, in the presence of jumps, the

properties described above do not hold and, using the restrictions proposed by Bakshi, Cao and Chen (2000), one would

reject the null hypothesis even if volatility is a deterministic function of the underlying asset. Our procedure, conversely,

controls for the presence of jumps. Similarly, as Bakshi, Cao and Chen (2000) report, one can reject at least some of

their testable restrictions because of market frictions, which instead are controlled for in our framework.

Second, in the papers cited above, the construction of the tests requires the use of option data. Therefore, one has to

choose both the moneyness and the maturity of the options, and the outcome of the test may be rather sensitive to that.

On the contrary, the test suggested here simply requires the availability of high frequency observations on the price of

the underlying asset.

Summarizing, our testing procedure is geared towards discriminating between the classes of deterministic versus

stochastic volatility models, being based on the object of interest, volatility. Hence, one can expect it to be more

powerful and more informative.

The rest of this paper is organized as follows. In Section 2, the testing procedure is outlined and the relevant limit

theory is derived. Sections 3 and 4 propose robust versions of the test to jumps, and microstructure noise, respectively.

Section 5 reports the findings from a Monte Carlo exercise, in order to assess the finite sample behavior of the proposed

tests. Section 6 contains the results of an empirical application to short term rates and of a numerical exercise aimed at

quantifying the cost of model misspecification. Concluding remarks are given in Section 7. All the proofs are gathered

in the Appendix.

In the paper,p−→, d−→ and a.s.−→ denote respectively convergence in probability, in distribution and almost sure

convergence. We write 1· for the indicator function, b$c for the integer part of $, ‖ · ‖ to denote the euclidean norm,

IJ for the identity matrix of dimension J , MN (·, ·) to denote a mixed normal distribution and N (·, ·) to indicate a normal

distribution.

2 Testing for One Factor vs Stochastic Volatility Models

2.1 Set-Up and Heuristics

As discussed above, our objective is to device a data driven procedure for deciding between diffusion models with

deterministic volatility and stochastic volatility models, under minimal assumptions.

We consider the following class of deterministic volatility models

dXt = µ(Xt)dt + σtdW1,t

3

σ2t = σ2 (Xt) (4)

and the following class of stochastic volatility models

dXt = µ1(Xt)dt + σt

(√1− ρ2dW1,t + ρdW2,t

)

dσ2t = b1(σ2

t )dt + b2(σ2t )dW2,t, (5)

where W1,t, W2,t are standard, independent Brownian motions, and ρ ∈ (−1, 1), so that possible leverage effects are

allowed.

In our procedure, we compare generic classes of deterministic versus stochastic volatility models, without any func-

tional form assumption on either the drift or the variance term, apart from some mild regularity conditions.

We state the hypotheses of interest as

H0 : σ2t = σ2 (Xt) ∈ FX

t , a.s., (6)

versus the alternative

HA : σ2t is a diffusion process not FX

t −measurable, a.s., (7)

where FXt = σ(Xs, s ≤ t).

Thus, under the null hypothesis the volatility process is a measurable function of the price process Xt. For a survey on

the recent econometric results related to this kind of models, see Fan (2005). On the other hand, under the alternative,

the volatility process is a diffusion process driven by a different Brownian motion, not perfectly correlated with the one

driving the asset, and so is no longer a measurable function of Xt. Note that a generic specification for the drift term is

allowed under both hypotheses.

As the contribution of the drift term is negligible over a finite time span, the case of a drift depending on some latent

factor falls under the null.

Suppose that we have data recorded at frequency 1/n over a fixed time span [0, T ], where, without loss of generality,

we set T = 1, so that n denotes the number of intradaily observations. The suggested statistic is based on

Zn,r =√

n

1

n

bnrc−1∑

i=1

S2n(Xi/n)−RVn,r

(8)

where r ∈ (0, 1),

S2n(Xi/n) =

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)n

(X(j+1)/n −Xj/n

)2

∑n−1j=1 K

(Xi/n−Xj/n

ξn

) (9)

and

RVn,r =bnrc−1∑

j=1

(X(j+1)/n −Xj/n

)2. (10)

Here, S2n(Xi/n) is a nonparametric estimator of the volatility process evaluated at Xi/n; Florens-Zmirou (1993) has

established consistency and the asymptotic distribution of a scaled version of (9) when the variance process follows (4),

for the case of a uniform kernel function. Recently, S2n(Xi/n) has been used by Bandi and Phillips (2003, 2007), in the

context of fully nonparametric and parametric estimation of diffusion processes. In the sequel, we require the kernel

function to be differentiable, as this allows us to rely on some results by Bandi and Phillips (2007).

4

The estimator RVn,r, which is known as realized volatility, has been proposed as a measure for volatility concurrently

by Andersen, Bollerslev, Diebold and Labys (2001) and Barndorff-Nielsen and Shephard (2002).

The statistic in (8) can be straightforwardly interpreted as an Hausman test, for all r < 1. Heuristically, under H0

√n

1

n

bnrc−1∑

i=1

S2n(Xi/n)−

∫ r

0

σ2(Xs)ds

d−→ MN

(0, 2

∫ ∞

−∞σ4(a)

LX(r, a)2

LX(1, a)da

), (11)

and

√n

(RVn,r −

∫ r

0

σ2(Xs)ds

)d−→ MN

(0, 2

∫ r

0

σ4(Xs)ds

)

≡ MN(

0, 2∫ ∞

−∞σ4(a)LX(r, a)da

), (12)

where∫ r

0σ2(Xs)ds and

∫ r

0σ4(Xs)ds denote respectively the quadratic variation and the integrated quarticity of the

process, and LX(r, a) denotes the local time of the diffusion around point a, i.e. it is a random variable which measures

the amount of time, between time 0 and time r, spent by the diffusion in a neighborhood of a. The equality on the right

hand side of (12) follows from Lemma 3 in Bandi and Phillips (2003).

For any a, LX(r, a) is an increasing function of r. Thus, under H0, both estimators are consistent, but 1n

∑bnrc−1i=1 S2

n(Xi/n)

is more efficient than RVn,r, given that, for all r < 1, LX(r, a) < LX(1, a). In particular, due to the quadratic term in the

numerator of the variance in (11), the relative efficiency of the former estimator first increases and then decreases with

r. Note that, for r = 1, both estimators are equally efficient. Thus, Zn,1 has a limiting distribution which is degenerate

at zero.

Under the alternative hypothesis, RVn,r is still a consistent estimator of the quadratic variation, while 1n

∑bnrc−1i=1 S2

n(Xi/n)

is no longer consistent. Therefore, it follows that the absolute value of (8) diverges to infinity. As the finite sample size

and the power of Zn,r depend on r, we shall also consider the maximum of Zn,r over a grid of values for r.

In the sequel, we shall need the following assumption.

Assumption 1.

(a) σ (·) and µ (·) , defined in (4), are at least once continuously differentiable functions and satisfy local Lipschitz and

growth conditions. Hence, for any compact subset M of the range of the process Xt under the null hypothesis, there

exist constants CM1 , CM

2 , such that, ∀(x, y) ∈ M ,

|µ(x)− µ(y)|+ |σ(x)− σ(y)| ≤ CM1 |x− y| ,

and

|µ(x)|+ |σ(x)| ≤ CM2 (1 + |x|).

Also,

σ2(·) > 0 on R.

(b) b1 (·), b2 (·), and µ (·), defined in (5), are at least once continuously differentiable functions. Therefore, they satisfy

local Lipschitz and growth conditions. Thus, for each L > 0 there exist constants CL1 , CL

2 , such that, for all

x, y ∈ R× (0,∞) satisfying ‖x‖, ‖y‖ ≤ L,

‖µ(x)− µ(y)‖+ ‖V (x)− V (y)‖ ≤ CL1 ‖x− y‖,

5

and

‖µ(x)‖+ ‖V (x)‖ ≤ CL2 (1 + ‖x‖),

where

x = (x1, x2)′ ∈ R× (0,∞), µ(x) = (µ1(x1), b1(x2))′, V (x) =

x2 ρ√

x2

0 b2(x2)

.

Also, the corresponding diffusion matrix V (·)V (·)′ is positive definite on (0,∞).

(c) The kernel function K(·) is a symmetric and nonnegative function with bounded support, continuously differentiable

in the interior of its support, with bounded first derivative, which satisfies∫ ∞

−∞K(u)du = 1,

∫ ∞

−∞K2(u)du < ∞.

Assumptions 1(a)(b) ensure the existence of a unique strong solution under both hypotheses (see e.g. Karatzas and

Shreve, 1991, Chapter 5). These assumptions are standard in the literature (see Aıt-Sahalia, 1996a), because global

Lipschitz and growth conditions fail to be met by a lot of models of interest in financial applications.

2.2 Limiting Behavior of the Statistic

In this section we study the asymptotic behavior of the statistic Zn,r, as defined in (8) and of the statistic Zn, defined as

Zn = maxj=1,...,J

∣∣Zn,rj

∣∣ ,

where 0 < r1 < . . . < rj−1 < rj < . . . < rJ < 1, with J arbitrarily large but finite.

Theorem 1. Let Assumption 1 hold. Under H0,

(i)a if, as n, ξ−1n →∞, nξn →∞ and for any arbitrarily small ε > 0, n1/2+εξn → 0, then, pointwise in r ∈ (0, 1)

Zn,rd−→ Zr ∼ MN

(0, 2

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da

), (13)

where

LX(r, a) = limψ→0

1σ2(a)

∫ r

0

1Xu∈[a−ψ,a+ψ]σ2(Xu)du

denotes the standardized local time of the process Xt.

(i)b Define Z = maxj=1,...,J

∣∣Zrj

∣∣. If, as n, ξ−1n → ∞, nξn → ∞, and, for any ε > 0 arbitrarily small, n1/2+εξn → 0,

then

Znd−→ Z,

with

Zr1

Zr2

...

ZrJ

∼ MN

0, 2

V (r1, r1) V (r1, r2) . . . V (r1, rJ)

V (r2, r1) V (r2, r2) . . . V (r2, rJ)...

.... . .

...

V (rJ , r1) V (rJ , r2) . . . V (rJ , rJ)

, (14)

where ∀ r, r′,

V (r, r′) = V (r′, r) =∫ ∞

−∞σ4 (a)

LX(min(r, r′), a)LX(1, a)− LX(r, a)LX(r′, a)LX(1, a)

da.

6

Under HA,

(ii) if, as n, ξ−1n →∞, nξn →∞ and for any arbitrarily small ε > 0, n1/2+εξn → 0, then, pointwise in r ∈ (0, 1),

Pr(

ω :1√n|Zn,r(ω)| ≥ ς(ω)

)→ 1,

where ς(ω) > 0 for all ω, apart from a subset with probability measure approaching zero.

From part (i)a of the Theorem above, we see that under the null hypothesis Zn,r converges to a mixed normal

random variable. Part (i)b follows from the joint limiting distribution of (Zn,r1 . . . Zn,rJ)′ and from the continuous

mapping theorem. From part (ii), we note that under the alternative hypothesis there exists a (almost surely) strictly

positive random variable ς, such that (1/√

n) |Zn,r| ≥ ς, with probability approaching one. Intuitively, what drives the

power is the fact that there are subsets with strictly positive probability measure in which Xi/n and Xj/n are very close

to each other, while σ2i/n and σ2

j/n are instead far away from each other.

By choosing a large enough J , the effect of the grid choice on the size and power of the test is almost negligible,

as also confirmed by the Monte Carlo findings. Of course, it would be preferable to have a statistic based over a con-

tinuum of r, say supr∈(0,1) |Zn,r| . Now, convergence in distribution of supr∈(0,1) |Zn,r| requires stochastic equicontinuity

of Zn,r over r ∈ (0, 1). We believe that Zn,r is not stochastic equicontinuous over r ∈ (0, 1). This is because, while√

n(RVn,r −

∫ r

0σ2(Xs)ds

)is indeed stochastic equicontinuous,

√n

1

n

bnrc−1∑

i=1

S2n(Xi/n)−

∫ r

0

σ2(Xs)ds

=

1√n

bnr′c−1∑

i=1

(S2

n(Xi/n)− σ2(Xi/n))

+ op(1)

is not. Heuristically, for any r′ > r,

1√n

bnr′c−1∑

i=bnrc−1

(S2

n(Xi/n)− σ2(Xi/n))

=1

n√

ξn

bnr′c−1∑

i=bnrc−1

1√nξn

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1s

1nξn

∑n−1j=1 K

(Xi/n−Xj/n

ξn

) + op(1)

=1√ξn

(r′ − r)Op(1) + op(1)

which does not necessarily goes to zero as r′ − r → 0. For this reason, in the sequel we limit our attention to the case of

a finite grid of evaluation points, i.e. to a finite J.

The theoretical results derived above provide an unfeasible limit theory, since the variance components have to be

estimated. In order to conduct inference based on the statistic Zn, we first need a consistent estimator of the matrix in

(14). Furthermore, we need to take into account the fact that the maximum over a J−dimensional mixed normal random

variable is no longer distributed as a mixed normal. We now suggest a simple data driven procedure which allows to

compute asymptotic valid critical values for Zn. For j = 1, . . . , J , define

Ωn(rj , rj) =1n

bnrjc−1∑

i=1

bnrjc−1∑

h=1

K(

Xh/n−Xi/n

ξn

)

∑n−1l=1 K

(Xh/n−Xl/n

ξn

) σ4

n(Xi/n), (15)

with

σ4n(Xi/n) =

13

n−1∑

j=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)n2(X(j+1)/n −Xj/n

)4. (16)

7

Also, define

Cn(r, r) =1n

bnrc−1∑

i=1

σ4n(Xi/n),

and construct

Vn(rj , rj) = Cn(rj , rj)− Ωn(rj , rj). (17)

It should be mentioned that a natural estimator for the variance of the model free estimator would be the estimated

integrated quarticity

Cn(rj , rj) =

13

bnrjc−1∑

i=1

n(X(i+1)/n −Xi/n

)4,

whose consistency for∫ rj

0σ4

sds, under both the null and the alternative hypothesis, has been shown by Barndorff-Nielsen

and Shephard (2002).1 The reason why we suggest an estimator based on Cn(rj , rj)− Ωn(rj , rj), rather than one based

on Cn(rj , rj) − Ωn(rj , rj), is that the former is by construction positive definite for any rj , while the latter in finite

sample may be not positive definite.

As for the covariance terms, for all rj 6= r′j , define

Ωn(rj , r′j) =

1n

bnrjc−1∑

i=1

bnr′jc−1∑

h=1

K(

Xh/n−Xi/n

ξn

)

∑n−1l=1 K

(Xh/n−Xl/n

ξn

) σ4

n(Xi/n) (18)

and construct

Vn(rj , r′j) = Cn

(min(rj , r

′j), min(rj , r

′j)

)− Ωn(rj , r′j). (19)

Having constructed the elements of the matrix in (14), we can now proceed as follows to calculate asymptotically

valid critical values. For s = 1, . . . , S, where S denotes the number of replications, let

d(s)n,r =

d(s)n,r1

d(s)n,r2

...

d(s)n,rJ

=√

2

Vn(r1, r1) Vn(r1, r2) . . . Vn(r1, rJ)

Vn(r1, r2) Vn(r2, r2) . . . Vn(r2, rJ)...

.... . .

...

Vn(r1, rJ ) Vn(r2, rJ) . . . Vn(rJ , rJ)

1/2

η(s)1

η(s)2

...

η(s)J

, (20)

where Vn(rj , rj) and Vn(rj , r′j) are defined respectively in (17) and (19) and, for each s,

(η(s)1 η

(s)2 . . . η

(s)J

)′is drawn from

a N(0, IJ ). Then, compute maxj=1,...,J

∣∣∣d(s)n,r

∣∣∣ , repeat this step S times, and construct the empirical distribution. Finally,

let CV S,nα be the (1 − α) quantile of maxj=1,...,J

∣∣∣d(s)n,r

∣∣∣ . The rule based on the (non) rejection of the null hypothesis if

Zn is (smaller or equal) larger than CV S,nα provides a test with asymptotic size equal to α and asymptotic unit power.

The logic underlying this procedure is quite simple. Under the null hypothesis, as n → ∞, the matrix with typical

element Vn(ri, rj) is a consistent estimator of the matrix with element V (ri, rj), as defined in (14). Therefore, as S and

n approach infinity, CV S,nα will converge to the (1− α) critical value of the limiting distribution of Zn.

Under the alternative hypothesis, for all j = 1, . . . , J , the matrix with element Vn(ri, rj) is still bounded in probability,

and so is the implied CV S,nα . On the other hand, Zn diverges to infinity at rate

√n. This ensures consistency of the

proposed test.

More formally, we have:

1Kalnina and Linton (2007) consider estimators of∫ rj

0 σ4sds based on in-fill subsampling.

8

Proposition 1. Let Assumption 1 hold and suppose that as n, ξ−1n →∞, nξn →∞, and, for any ε > 0 arbitrarily small,

n1/2+εξn → 0. Then, as n →∞ and S →∞,

(i) under H0,

limn→∞

Pr(Zn > CV S,n

α

)= α.

(ii) under HA,

limn→∞

Pr(Zn > CV S,n

α

)= 1.

3 The Case of Jump Diffusions

The asymptotic theory derived in the previous subsection relies on the fact that the underlying return process is a

continuous semi-martingale. However, realistic models of asset prices are made of a continuous component plus a jump

component (see, e.g., Andersen, Benzoni and Lund, 2002). In the sequel, we provide a test for discriminating between

the following classes of one-factor jump diffusions

dXt = µ(Xt−)dt + σt−dW1,t + dJ1,t

σ2t = σ2 (Xt−) (21)

versus stochastic volatility jump diffusions of the kind

dXt = µ1(Xt−)dt + σt−(√

1− ρ2dW1,t + ρdW2,t

)+ dJ2,t

dσ2t = b1(σ2

t−)dt + b2(σ2t−)dW2,t + dZt. (22)

Here, for i = 1, 2,

dJi,t =∫

U

c(u)Ni(dt,du),

Ni([t1, t2],A) is a Poisson random measure counting the number of jumps, in the interval [t1, t2], whose size belongs to

the set A, and c(u) is an i.i.d. random variable denoting the size of the jump. Note that, over a finite time span, only

a finite number of jumps can occur. In fact, Ni([t1, t2],A) = v(A)(t2 − t1), where the (Levy) intensity measure v(A),

denoting the expected number of jumps per unit of time of size belonging to A, is a measure of finite variation.

On the other hand, we can allow the volatility process to display either a small number of relatively large positive

jumps, or an infinite number of small jumps. In the former case, dZt is defined as dJ1,t, except for the fact that only

positive jumps are allowed. In the latter case,

dZt =∫

u>1

c(u)N3(dt, du) +∫

0≤u≤1

c(u)(N3(dt,du)− ν(dt,du)),

where N3(dt,du) is a Poisson random measure, but its associated intensity measure ν is now a measure of infinite

variation. The heuristic reason why we need to separately consider the case of 0 ≤ u ≤ 1 and u > 1 is that, as there are

infinitely many small jumps, the intensity measure can blow up in the neighborhood of zero. Hereafter, Ji,t and Zt are

assumed to be independent of (W1,t, W2,t).

It is important to note that the contribution of the jump component to 1n

∑bnrc−1i=1 S2

n(Xi/n) and to RVn,r does not

cancel out; this is because in the former statistic the effect of a jump at time i/n depends on the local time of Xi/n,

9

while in the latter case it doesn’t. Therefore, we cannot use the test statistic defined in (8) to discriminate between (21)

and (22).

The issue of estimating the quadratic variation of jump diffusion processes has received some attention in the literature.

Bandi and Nguyen (2003) provide consistent estimation of the conditional moments of the diffusion in (21), using a double

asymptotics, in which the discrete interval goes to zero and the time span goes to infinity. In particular, they show that

the conditional second moment captures both the continuous and the jump component of volatility. Mancini and Reno

(2006) propose a version of (9) based only on those observations for which(X(j+1)/n −Xj/n

)2 is smaller than a given

threshold. The implementation of their estimator does require the choice of the threshold parameter.

We state the hypotheses of interest as

H′0 : σ2t− = σ2 (Xt−) ∈ FX

t , a.s.,

versus the alternative

H′A : σ2t− is a diffusion process not FX

t −measurable, a.s.,

where FXt = σ(Xs−, s ≤ t). Note that, under the null hypothesis, if there are jumps in the return process, as a

straightforward consequence of the Ito’s Lemma for jump diffusions (see, e.g., Cont and Tankov, 2004, Ch.8), σ2t− is also

a jump diffusion. On the other hand, under the alternative hypothesis, σ2t− may be either a pure diffusion, if Zt = 0

almost surely, or otherwise a jump diffusion process.

The suggested statistic is based on

ZJn,r = µ−32/3

√n

1

n

bnrc−3∑

i=1

SJ2n(Xi/n)− TVn,r

, (23)

where µk = E(|SN |k

), SN denotes a standard normal random variable and r ∈ (0, 1). Here

SJ2n(Xi/n) (24)

=

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)n

(∣∣∆X(j+3)/n

∣∣2/3 ∣∣∆X(j+2)/n

∣∣2/3 ∣∣∆X(j+1)/n

∣∣2/3)

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

) ,

K+ is a right kernel function (i.e. zero for negative values) and

TVn,r =bnrc−3∑

j=1

∣∣∆X(j+3)/n

∣∣2/3 ∣∣∆X(j+2)/n

∣∣2/3 ∣∣∆X(j+1)/n

∣∣2/3. (25)

The limit theory for TVn,r, named tripower variation, has been established by Barndorff-Nielsen, Shephard and Winkel

(2006), who show that√

n(µ−3

2/3TVn,r −∫ r

0σ2

s−ds)

is asymptotically mixed normal, under both H′0 and H′A. In the

sequel, we show that, under H′0,

√n

µ−3

2/3

n

bnrc−3∑

i=1

SJ2n(Xi/n)−

∫ r

0

σ2s−ds

is also asymptotically mixed normal, while under H′A it diverges at rate√

n. Before doing so, we need some different

assumption on the kernel function, to ensure that we can consistently estimate local times in the presence of jumps. As

we are considering the case of state independent intensity and jump size, the conditions for the existence of a unique

10

strong solution are the same as in the continuous case. If we allowed for state dependent intensity parameters, than we

would require that the latter are weakly Lipschitz and satisfy the usual growth conditions. The asymptotic results stated

below rely on Barndorff-Nielsen, Shephard and Winkel (2006), who require that both the jump size and the counting

process governing the number of jumps do not depend on the underlying state variable.

Assumption 2.

(a) The drift and variance function in (21) satisfy Assumption 1(a).

(b) The drift and variance function in (22) satisfy Assumption 1(b).

(c) The kernel function K+(·) is a nonnegative function with bounded positive support, continuously differentiable in

the interior of its support, which satisfies∫ ∞

0

K+(u)du = 1,

∫ ∞

0

∣∣∣∣∂K+(u)

∂u

∣∣∣∣ du < ∞.

Theorem 2. Let Assumption 2 hold. Under H′0,

(i)a if, as n, ξ−1n →∞, nξn →∞ and for any arbitrarily small ε > 0, n1/2+εξn → 0, then, pointwise in r ∈ (0, 1)

ZJn,rd−→ ZJr ∼ MN

(0, γ

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da

), (26)

where

γ =µ3

4/3 − 5µ62/3 + 2

(µ2

2/3µ24/3 + µ4

2/3µ4/3

)

µ62/3

and

LX(r, a) = LX(r, a+) = limψ→0

1σ2(a)

∫ r

0

1a≤Xu≤a+ψσ2(Xu−)du (27)

denotes the standardized local time of the process Xt−.

(i)b Define ZJn = maxj=1,...,J

∣∣ZJn,rj

∣∣ and ZJ = maxj=1,...,J

∣∣ZJrj

∣∣. If, as n, ξ−1n →∞, nξn →∞, and, for any ε > 0

arbitrarily small, n1/2+εξn → 0, then

ZJnd−→ ZJ,

with

ZJr1

ZJr2

...

ZJrJ

∼ MN

0, γ

V (r1, r1) V (r1, r2) . . . V (r1, rJ)

V (r2, r1) V (r2, r2) . . . V (r2, rJ)...

.... . .

...

V (rJ , r1) V (rJ , r2) . . . V (rJ , rJ)

. (28)

Under H′A,

(ii) if, as n, ξ−1n →∞, nξn →∞ and for any arbitrarily small ε > 0, n1/2+εξn → 0, then, pointwise in r ∈ (0, 1),

Pr(

ω :1√n|ZJn,r(ω)| ≥ ς(ω)

)→ 1,

where ς(ω) > 0 for all ω, apart from a subset with probability measure approaching zero.

11

It is immediate to see that the price paid for having a statistic which is robust to jumps is a loss of efficiency. In fact,

by comparing the limiting variances in (13) and (26), we note that in the former case the quarticity is multiplied by 2

while in the latter is multiplied by γ, with γ ≈ 3.

In order to provide valid asymptotic critical values, we can proceed as in Section 2. However, in the case of jumps,

σ4n(Xi/n) is no longer a consistent estimator of σ4

n(Xi/n). Therefore, Vn(rj , r′j) is no longer consistent for Vn(rj , r

′j). For

j = 1, . . . , J , define

Ωn(rj , rj) =1n

bnrjc−4∑

i=1

bnrjc−4∑

h=1

K+(

Xh/n−Xi/n

ξn

)

∑n−1l=1 K+

(Xh/n−Xl/n

ξn

) σ4

n(Xi/n), (29)

with

σ4n(Xi/n)

= µ−41

n−4∑

j=1

K+(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

)n2∣∣∣∆X j+4

n

∣∣∣∣∣∣∆X j+3

n

∣∣∣∣∣∣∆X j+2

n

∣∣∣∣∣∣∆X j+1

n

∣∣∣ . (30)

Also, define

Cn(r, r) =1n

bnrc−4∑

i=1

σ4n(Xi/n),

and construct

Vn(rj , rj) = Cn(rj , rj)− Ωn(rj , rj). (31)

As for the covariance terms, for all rj 6= r′j , define

Ωn(rj , r′j) =

1n

bnrjc−4∑

i=1

bnr′jc−4∑

h=1

K+(

Xh/n−Xi/n

ξn

)

∑n−1l=1 K+

(Xh/n−Xl/n

ξn

) σ4

n(Xi/n) (32)

and construct

Vn(rj , r′j) = Cn

(min(rj , r

′j), min(rj , r

′j)

)− Ωn(rj , r′j). (33)

Then, we can proceed as in (20), replacing Cn (·, ·) with Cn (·, ·) (and 2 by γ), and obtain the quantile CV JS,nα .

Proposition 2. Let Assumption 2 hold and suppose that as n, ξ−1n →∞, nξn →∞, and, for any ε > 0 arbitrarily small,

n1/2+εξn → 0. Then, as n →∞ and S →∞,

(i) under H′0,

limn→∞

Pr(ZJn > CV JS,n

α

)= α.

(ii) under H′A,

limn→∞

Pr(ZJn > CV JS,n

α

)= 1.

4 The case of Microstructure Noise

The asymptotic theory derived in the previous sections assumes that the underlying return process is a continuous semi-

martingale plus a possible jump component. However, we still require that recorded prices are not affected by market

frictions.

12

It is a well documented fact that transaction data occurring in financial markets are contaminated by market mi-

crostructure effects, such as bid-ask spreads, rounding errors and price discreteness. To account for those effects, robust

volatility measures have been suggested in the literature (see, e.g., Zhang, Mykland and Aıt-Sahalia, 2005, Aıt-Sahalia,

Mykland and Zhang, 2006, Barndorff-Nielsen, Hansen, Lunde, and Shephard, 2006, 2007 and Zhang, 2006). Hence,

robust counterparts to RVn,r are already available to us. The purpose of this section is to derive a robust counterpart of

n−1∑bnrc−1

i=1 S2n(Xi/n) in order to construct a valid test.

Kernel based estimators of regression functions with measurement error are available in the literature. Fan and

Troung (1993) have looked at the case where the density of the measurement error is known and Schennach (2004) deals

with the case of unknown density.2 Unfortunately, their results do not apply to our context. The crucial difference is

that in our case the evaluation point of the instantaneous volatility is Xi/n, which is also measured with error. Therefore,

broadly speaking, the fact that the noise contaminated price Yj/n is close to Yi/n does not imply that Xj/n is close to

Xi/n. Furthermore, kernel estimators robust to measurement error converge at a logarithmic rate, when the density of

the error has tails declining exponentially.

We then follow a different route, based on a modified version of the Florens-Zmirou type estimator in (9). For the

consistency of our estimator, we assume that the error term approaches zero as n → ∞, though this can occur at a

rather slow rate. This assumption is not as strong as it may seem. In fact, empirically the microstructure noise is very

small (for example, Bandi and Russell, 2007, report values ranging from 0.87e-07 to 2.1e-07). Based on this observation,

Zhang, Mykland and Aıt-Sahalia (2006) argue that the microstructure noise is indeed “too small” to be considered Op(1),

and derive an Edgeworth correction for the two-scale estimator of Zhang, Mykland and Aıt-Sahalia (2005), under the

assumption that it approaches zero as n →∞.3 Also, the rate at which the noise is allowed to go to zero is quite slow in

our Theorem, and its magnitude is well above the empirically observed values.

Our setup is the following. Suppose that the sources of microstructure noise mentioned above have the effect of

perturbing the efficient market log-price of the traded asset. In other words, what we observe becomes

Yj/n = Xj/n + εj/n,

where X is the efficient log-price and ε is the microstructure noise. We assume that εj/n is an i.i.d. sequence, with

E(Xj/nεj/n) = 0, and that the magnitude of the noise tends to zero as we sample the process more densely. This

specification implies that observed returns are contaminated by a moving average of order 1 error, viz.

Yj/n − Y(j−1)/n =(Xj/n −X(j−1)/n

)+

(εj/n − ε(j−1)/n

).

Our statistic of interest becomes

ZMl,r =√

l

1

l

blrc−1∑

i=1

SM2l (Yi/l)− TRVl,r

, (34)

where l = n/B, and r ∈ (0, 1). Again, the statistic is derived as the scaled difference of two estimators, namely

SM2l (Yi/l) =

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l(Y(jB+b)/n − Y((j−1)B+b)/n

)2

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

) , (35)

2In the case of unknown density of the error, it is required to have (at least) two different noisy measures of the variable of interest.3Awartani, Corradi and Distaso (2006) suggest a test for the null hypothesis of a constant noise variance versus the alternative of a variance

approaching zero, at a slower rate than√

n. When applied to the DJIA stocks, the evidence is that, at frequencies lower than one minute, the

variance of the noise starts tending towards zero.

13

and

TRVl,r =1B

B∑

b=1

blrc−1∑

j=1

(Y(jB+b)/n − Y((j−1)B+b)/n

)2. (36)

SM2l (Yi/l) is obtained by averaging B estimators of instantaneous volatility calculated over nonoverlapping subsamples

of length l. TRV is calculated using a similar subsampling technique, averaging B realized volatilities calculated on

samples of size blrc. TRV it is closely related to the two-scale realized volatility of Zhang, Mykland and Aıt-Sahalia

(2005), but differs from the latter because it is still a biased estimator of integrated volatility. The reason why bias

correction is unnecessary in the present context is that the bias term is the same for both statistics and so its effect

cancels out. Before stating our main result, we need the following assumption.

Assumption 3. Let Yj/n = Xj/n + εj/n, where εj/n = Op(an) uniformly in j and Xj/n is the skeleton of Xt. Xt is

generated according to (4) or (5).

Theorem 3. Let Assumptions 1 and 3 hold. As n →∞, let ξ−1l , a−1

n , l, B →∞, (a) anξ−1l → 0, (b) ξllB

−1/2 → 0, (c)

ξll →∞ and for any arbitrarily small ε > 0, l1/2+εξl → 0. Under H0,

(i)a pointwise in r ∈ (0, 1)

ZMl,rd−→ ZMr ∼ MN

(0,

43

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da

). (37)

(i)b Define ZMl = maxj=1,...,J

∣∣ZMl,rj

∣∣ and ZM = maxj=1,...,J

∣∣ZMrj

∣∣. Then

ZMld−→ ZM,

with

ZMr1

ZMr2

...

ZMrJ

∼ MN

0,43

V (r1, r1) V (r1, r2) . . . V (r1, rJ)

V (r2, r1) V (r2, r2) . . . V (r2, rJ)...

.... . .

...

V (rJ , r1) V (rJ , r2) . . . V (rJ , rJ)

. (38)

Under HA,

(ii) pointwise in r ∈ (0, 1),

Pr(

ω :1√l|Zl,r(ω)| ≥ ς(ω)

)→ 1,

where ς(ω) > 0 for all ω, apart from a subset with probability measure approaching zero.

It is immediate to see that, apart from the different constant term appearing in (38), we have the same limiting

distribution as in Theorems 1 and 2, though convergence occurs at the slower rate√

l. However, this is not a big

problem, because robustness to microstructure noise allows us to sample the process much more densely, without the

danger of noise contamination.

Note that the rate conditions (b) and (c) imply that l should grow at a slower rate than B. Within this constraint, the

faster l approaches infinity, the faster an has to approach zero. For example, if we set l = n2/5, B = n3/5, ξl = n−19/50,

then the rate conditions above are satisfied provided an approaches zero at a rate faster than n−19/50; if we set l = n1/3,

B = n2/3, and ξl = n−19/60, then the rate conditions in the Theorem are satisfied provided an approaches zero at a rate

14

faster than n−19/60. Such a slow rate of convergence to zero is fully consistent with the orders of magnitude of the noise

found in empirical studies.4

Again, Theorem 3 only states an unfeasible limit theory, and we need a consistent estimator of the variance terms,

in order to be able to use the test in practice. Similarly to the previous Sections, we define, for j = 1, . . . , J ,

Ωn,l(rj , rj) =1n

bnrjc−1∑

i=1

bnrjc−1∑

h=1

K(

Yh/n−Yi/n

ξl

)

∑n−1j=1 K

(Yh/n−Yj/n

ξl

)σ4

n,l(Yi/n),

where

σ4n,l(Yi/n)

=13

1B

∑Bb=1

∑l−1j=1 K

(Yi/n−Y((j−1)B+b)/n

ξl

)l2

(Y(jB+b)/n − Y((j−1)B+b)/n

)4

1B

∑Bb=1

∑l−1j=1 K

(Yi/n−Y((j−1)B+b)/n

ξl

) .

Now, let

Cn,l(r, r) =1n

bnrc−1∑

i=1

σ4n,l(Yi/n).

and construct

V n,l(rj , rj) = Cn,l(rj , rj)− Ωn,l(rj , rj).

Analogously, the covariance terms are obtained by defining

Ωn,l(rj , r′j) =

1n

bnrjc−1∑

i=1

bnr′jc−1∑

h=1

K(

Yh/n−Yi/n

ξl

)

∑n−1j=1 K

(Yh/n−Yj/n

ξl

)σ4

n,l(Yi/n),

and constructing

V n,l(rj , r′j) = Cn,l

(min(rj , r

′j),min(rj , r

′j)

)− Ωn,l(rj , r′j).

Then, we can proceed as in (20), replacing Cn (·, ·) with Cn,l (·, ·) (and 2 by 4/3), and obtain the quantile CV MS,n,lα .

Proposition 3. Let Assumptions 1 and 3 hold. If, as n →∞, ξ−1l , a−1

n , l, B →∞, anξ−1l → 0, ξllB

−1/2 → 0, ξll →∞and for any arbitrarily small ε > 0, l1/2+εξl → 0, then

(i) under H′0,

limS,n,l→∞

Pr(ZMl > CV MS,n,l

α

)= α.

(ii) under H′A,

limS,n,l→∞

Pr(ZMl > CV MS,n,l

α

)= 1.

4For example, if n = 86400, corresponding to data sampled every second on a full trading day, then our condition requires that an ≈ 0.027

and var(ε) ≈ 0.0007, which is several orders of magnitude larger than estimated values obtained by the literature (see, e.g., Bandi and Russell,

2007).

15

5 Monte Carlo results

In this section, the small sample performance of the testing procedure proposed in the previous sections is assessed through

a Monte-Carlo experiment. Under the null hypothesis, we consider a version of the constant elasticity of variance model

with a mean reverting component in the drift,

dXt = (κ + µXt)dt + ηXβt dW1,t, µ < 0. (39)

We consider three values of β, namely 0, 0.5, 1. When β = 0, the process in (39) is a version of the Vasicek (1977) model;

when β = 0.5, then the process generating the data is a version of the Cox, Ingersoll and Ross (1985) model; finally,

when β = 1, the process is a GARCH diffusion. We work under the assumption that markets are open continuously.

We first simulate a discretized version of the continuous trajectory of Xt under (39). We use a Milstein scheme in

order to approximate the trajectory, following Pardoux and Talay (1985), who provide conditions for uniform, almost

sure convergence of the discrete simulated path to the continuous path, for given initial conditions and over a finite time

span. In order to get a precise approximation to the continuous path, we choose a small time interval between successive

observations (1/5760); moreover, the initial value is fixed to the unconditional mean of Xt, and the first 1000 observations

are then discarded.

We then sample the simulated process at the chosen frequency, 1/n, and compute the test statistic. In particular, 6

different values have been chosen for the number of intradaily observations n, ranging from 72 (corresponding to data

recorded every twenty minutes) to 1440 (corresponding to data recorded every minute). The process is repeated for a

total of 10000 replications.

Results are reported for Zn. Under the conditions stated in Theorem 1, we know that

Znd−→ Z = max

j=1,...,J

∣∣Zrj

∣∣ ,

where the vector (Zr1Zr2 . . . ZrJ )′ is defined in (14).

The simulation experiment has been conducted for two different grids over the total number of observations n, namely:

• J = 16, with r starting from r1 = .15 and then increasing by .05 until r16 = .85, and

• J = 8, with r starting from r1 = .15 and then increasing by .10 until r8 = .85.

Also, four different bandwidth parameters have been used in the simulations, ranging from ξn = n−10/13 to ξn = n−7/13

and we have used the Epanechnikov kernel to compute our statistic. Finally, the critical values constructed according to

(20) have been obtained with S = 1000. The simulation results over the different bandwidths and grids used are very

similar. Therefore, for space reasons, only results relative to the case where J = 16 and ξn = n−8/13 are reported.

The empirical sizes (at 5% and 10% level) of the proposed test are reported in Table 1, columns 2 and 3, for κ = 0,

η = .02, µ = −.8. Again, results for different values of the parameters needed to generate (39) display a virtually

identical pattern and therefore are omitted. Inspection of the Table reveals an overall good small sample behaviour of

the considered test statistics. The reported empirical sizes are everywhere very close to the nominal ones, with a slight

tendency to underreject for small values of n.

Under the alternative hypothesis, the following model has been considered,

dXt = (κ + µXt)dt + η(σ2

t

)β(√

1− ρ2dW1,t + ρdW2,t

)

16

d ln(σ2

t

)=

(κ1 + µ1 ln

(σ2

t

))dt + η1dW2,t. (40)

A discretized version of (40) has been simulated using a Milstein scheme as above, with κ1 = .2, η1 = 0.02, µ1 = −2.

Then, using the obtained values of σ2t , the series for Xt has been generated, for different parameter values, for β = 0.5, 1

and ρ = 0.

Since the only parameter having some influence on the power was η, we report results for κ and µ fixed at the values

used to generate Xt under (39) and for different values of η.

The findings for the power of the proposed test are reported in Table 2. The experiment reveals that the proposed

test has good power properties.

Power is higher for the case when β = 1. In this case, changing η does not seem to have an effect on the power figures.

When β = 0.5, the power seems to be influenced by the value of the scale parameter η; in general, we observe slightly

higher powers for lower values of η.

5.1 The case of jumps

The jump component has been added to the continuous price process as in Barndorff-Nielsen and Shephard (2004). In

particular, one randomly scattered jump has been added to the log-price process described in (39), and the jump is

N(0, 0.64E

(∫ 1

0σ2(Xs)ds

)). Therefore, the jump component has 64% of the variance as that expected over a day with

no jumps.

Results are reported for our robust test statistic ZJn, using the following positive kernel function

K+ (u) =32

(1− u2

)10≤u≤1.

The empirical sizes (at 5% and 10% level) of the proposed test are reported in Table 1, columns 4 and 5, for the same

set of values as those used for Zn. The reported sizes are very similar to those obtained for Zn and confirm the good

finite sample performance of the test.

Under the alternative hypotheses, jumps have been added to the log-price process (40) in the same way as above.

The obtained power figures are almost identical to those of Zn and are therefore omitted for space reasons.

5.2 Adding microstructure noise

In the simulation exercise, we have fixed B = bn3/5c, l = bn2/5c and ξl = bn−3/10c. Correspondingly, we have chosen

an = n−i/5, for i = 1, 2, 3. Notice that the case i = 1 does not satisfy the regularity conditions of Theorem 3 and

Proposition 3. However, we have still considered the case to crash test our procedure.

Results are reported in Table 3. Starting from the empirical size, we notice that tests are a bit undersized, but the

size distortion tends to disappear as the sample size increases and as the magnitude of the noise decreases.

Power figures are smaller than those of Zn and ZJn. This is because the rate of divergence in this case is√

l = n1/5

instead of the√

n of the previous tests. We also notice the importance of the regularity condition on the magnitude of

the noise. The test does not have any power when the condition is not satisfied.

17

6 Empirical application

In this section we apply our tests to different sets of financial data. Then, we address the question of quantifying the

cost of misspecifying the volatility function.

6.1 Testing for deterministic volatility in the short rate

Different popular theories and models have assigned to the short term rate the important role of the main determinant of

bond prices. Therefore, there has been a lot of effort towards finding the correct specification of the short rate dynamics.

An important contribution, in this literature, is the paper by Aıt-Sahalia (1996b). There, he evaluates different

models for the short term rate, based on a distance between the (marginal) density implied by a given model and the

nonparametric one.

He ends up rejecting all the considered one factor models, apart from one characterized by a nonlinear drift. Similar

findings about nonlinearity in the drift are reached by Stanton (1997). Subsequently, Pritsker (1998) and Chapman and

Pearson (2000) have expressed some concerns about these results, based on size distortion and boundary bias problems

suffered by the methods in Aıt-Sahalia (1996b) and Stanton (1997), respectively.

Hong and Li (2005) have developed a specification test using the distance between the conditional density implied

by a given model and its nonparametric counterpart. Transition densities capture the full dynamics of continuous time

models; hence, tests based on them are expected to yield better results.

Hong and Li reject all the one factor models (including those with a nonlinear drift) and argue that the source of

rejection is likely to come from violation of the markovian property enjoyed by one factor models.

Violation of this property could be due, for example, to continuous time models with stochastic volatility and/or

jumps (which Hong and Li do not test in their paper).

Our procedure differentiates from the cited literature in at least two respects. First, we do not test specific models, but

classes of models. Second, we only focus on volatility and do not perform any specification test on the drift component

of the process. In other words, we are interested in testing whether volatility satisfies certain properties (i.e. being a

measurable function of the underlying asset), rather than whether it is generated by a particular model.

We conduct our analysis considering three countries, namely US, UK and Japan. For the US case, we use the same

data set as Aıt-Sahalia (1996b). Its main features are reported in Table 4 and the plot of the series is shown in Figure 1.

The results of our tests, reported in Table 7, show that the null hypothesis is rejected at the 1% level, using both the

standard test and the one robust to jumps. Hence, our results are similar to the ones by Hong and Li (2005); we are also

able to characterize what kind of violation of the markovian property is causing the rejection of the null hypothesis. It

is not the presence of jumps, which are controlled for without altering the outcome of the test, but rather the presence

of stochastic volatility.

A similar conclusion is reached when we consider UK data (see Table 5, Figure 2 and Table 7, for a summary of the

results).

Japan, instead, shows a different behaviour of the short term rate (see Table 6, Figure 3). The two tests never

reject the null hypothesis (not even at 10%), as reported in Table 7. A visual inspection of Figure 3 reveals that, form

about 1998 to the end of the sample, there has been very little variation in the short rate (Japanese economy has been

18

considered by many to be locked in a liquidity trap). We have also run the test on the subsample up to 1998, to check

whether non rejection of the null was driven by the flatness of the short rate in the post 1998 part of the sample, but the

outcome does not change.

6.2 The cost of model misspecification

Let the short rate, under the risk neutral probability measure, evolve according to

dst = ms (s− st) dt + σtdW1,t

dσ2t = mσ

(σ2 − σ2

t

)dt + σtdW2,t, (41)

where W1,t and W2,t are uncorrelated Brownian motions. Model (41) has been proposed by Fong and Vasicek (1992),

who derive the formula for the price of a zero-coupon bond at time t, paying one dollar at time T as

PFV

(t, T , st, σ

2t

)= exp

(−stD(t, T ) + σ2t F (t, T ) + G(t, T )

).

Suppose that a naive trader (wrongly) thinks that the short rate evolves as a Cox, Ingersoll and Ross (1985) process

dst = ms (s− st) dt +√

stdW1,t.

Then he/she would price the zero coupon bond according to the formula5

PCIR

(t, T , st

)= A(t, T ) exp

(−B(t, T )st

).

We compare PFV

(t, T , st, σ

2t

)with the mispriced PCIR

(t, T , st

), for different states of the world (different values of

st, σ2t , which are simulated under (41)) and different maturities (T − t).

Results are reported in Table 8 and show that the amount of mispricing can become severe, especially in states of the

economy characterized by high levels of volatility.

7 Concluding remarks

This paper provides a testing procedure which allows to discriminate between one factor and stochastic volatility models,

under minimal assumptions. Apart from standard regularity conditions, no assumptions are made on the functional

forms of either the drift or the diffusion term, and jumps and microstructure noise are properly accounted for.

The suggested test statistics can be seen as an Hausman type test. They weakly converge to well defined distributions

under the null hypothesis and diverge at an appropriate rate under the alternative. As the Monte carlo experiment shows,

the tests perform well in finite samples. An application to financial data sheds some new light on the behaviour of the

short term rate. Finally, we quantify the economic cost of model misspecification in the context of pricing bonds.

5For the exact expressions of D(t, T ), F (t, T ), G(t, T ) see Fong and Vasicek (1992). Expressions for A(t, T ), B(t, T ) can be found, for

example, in James and Webber (2000).

19

Table 1: Actual sizes of the tests based on Zn and ZJn for different values of n and β

Zn ZJn

5% nominal size 10% nominal size 5% nominal size 10% nominal size

β = 0

n = 72 0.03 0.07 0.03 0.06

n = 144 0.03 0.07 0.03 0.07

n = 288 0.04 0.08 0.03 0.08

n = 576 0.04 0.10 0.04 0.08

n = 720 0.06 0.11 0.05 0.10

n = 1440 0.06 0.11 0.06 0.10

β = 0.5

n = 72 0.02 0.07 0.03 0.07

n = 144 0.03 0.07 0.03 0.07

n = 288 0.06 0.09 0.04 0.07

n = 576 0.05 0.11 0.04 0.09

n = 720 0.06 0.11 0.05 0.10

n = 1440 0.06 0.12 0.06 0.11

β = 1

n = 72 0.02 0.07 0.03 0.06

n = 144 0.04 0.08 0.03 0.07

n = 288 0.04 0.08 0.03 0.08

n = 576 0.05 0.09 0.04 0.09

n = 720 0.06 0.10 0.05 0.10

n = 1440 0.06 0.11 0.05 0.10

20

Table 2: Actual powers of the tests based on Zn for different values of n, β and η

β = 0.5 β = 1

5% size 10% size 5% size 10% size

η = 0.02

n = 72 0.26 0.36 0.78 0.88

n = 144 0.40 0.48 0.86 0.96

n = 288 0.62 0.74 0.95 1.00

n = 576 0.97 1.00 1.00 1.00

n = 720 1.00 1.00 1.00 1.00

n = 1440 1.00 1.00 1.00 1.00

η = 0.05

n = 72 0.20 0.36 0.70 0.86

n = 144 0.22 0.46 0.86 0.92

n = 288 0.60 0.70 0.94 1.00

n = 576 0.93 1.00 1.00 1.00

n = 720 1.00 1.00 1.00 1.00

n = 1440 1.00 1.00 1.00 1.00

η = 0.1

n = 72 0.14 0.28 0.76 0.88

n = 144 0.30 0.44 0.90 0.96

n = 288 0.50 0.62 0.97 1.00

n = 576 0.82 0.90 1.00 1.00

n = 720 1.00 1.00 1.00 1.00

n = 1440 1.00 1.00 1.00 1.00

21

Table 3: Actual size and power of the tests based on ZMl for different values of n and an, β = 1, η = .05

Size Power

5% nominal size 10% nominal size 5% size 10% size

an = n−1/5

n = 1440 0.01 0.03 0.01 0.02

n = 2880 0.02 0.04 0.01 0.02

n = 5760 0.02 0.04 0.01 0.01

an = n−2/5

n = 1440 0.02 0.06 0.22 0.34

n = 2880 0.03 0.08 0.31 0.39

n = 5760 0.04 0.09 0.38 0.45

an = n−3/5

n = 1440 0.03 0.07 0.25 0.38

n = 2880 0.04 0.09 0.33 0.40

n = 5760 0.04 0.09 0.37 0.45

22

Table 4: Features of US data

Source Bank of America seven-day Eurodollar (bid-ask midpoint)

Sample period 1.6.1973 to 25.2.1995

Sample size 5505

Frequency Daily

1975 1980 1985 1990 1995

0.05

0.10

0.15

0.20

0.25

Figure 1: The US spot rate

23

Table 5: Features of UK data

Source Bank of England, 1 month yield

Sample period 3.3.1997 to 14.11.2005

Sample size 2165

Frequency Daily

1998 2000 2002 2004 2006

0.04

0.05

0.06

0.07

Figure 2: The UK spot rate

24

Table 6: Features of Japan data

Source Bank of Japan, Average interest rates on certificates of

Deposit, less than 30 days

Sample period 4.4.1988 to 14.8.2006

Sample size 959

Frequency Weekly

1990 1995 2000 2005

0.00

0.02

0.04

0.06

0.08

Figure 3: The Japan spot rate

25

Table 7: Results of the test for the different Countries

αStatistic

1% 5% 10 %

US

Zn reject reject reject

ZJn reject reject reject

UK

Zn reject reject reject

ZJn reject reject reject

Japan - Full sample

Zn do not reject do not reject do not reject

ZJn do not reject do not reject do not reject

Japan - Subsample until 1998

Zn do not reject do not reject do not reject

ZJn do not reject do not reject do not reject

26

Table 8: Values of zero coupon bonds

T − t = 6 months T − t = 1 year

PFV

(t, T , s, σ2

)PCIR

(t, T , s

)PFV

(t, T , s, σ2

)PCIR

(t, T , s

)

s = .01, σ2 = .01 .986 .995 .971 .991

s = .01, σ2 = .04 .954 .995 .909 .991

s = .01, σ2 = .08 .914 .995 .832 .991

s = .03, σ2 = .01 .976 .986 .952 .986

s = .03, σ2 = .04 .945 .986 .891 .986

s = .03, σ2 = .08 .905 .986 .816 .986

s = .08, σ2 = .01 .952 .963 .907 .963

s = .08, σ2 = .04 .922 .963 .849 .963

s = .08, σ2 = .08 .882 .963 .777 .963

27

Appendix

Hereafter, without loss of generality, we assume that the kernel functions K and K+ are supported on [−1, 1] and [0, 1],

respectively.

Proof of Theorem 1

Given Assumption 1, the diffusion is not explosive, and, by the usual localization argument, it follows that

sups∈[0,1]

|Xs| = Oa.s.(nε/2),

for any ε > 0, arbitrarily small.

(i)a The proof is then organized in three steps.

Step 1:

Zn,r

=1√n

bnrc−1∑

j=1

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) − 1

2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

+1√n

n∑

j=bnrc

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) 2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s + op(1)

Step 2:

〈Zn,r〉 = 2∫ ∞

−∞

(LX(a, r) (LX(1, a)− LX(r, a)

L(1, a)

)σ4(a)da + op(1)

Step 3:

Zn,rd−→ MN

(0, 2

∫ ∞

−∞

(LX(a, r) (LX(1, a)− LX(r, a)

L(1, a)

)σ4(a)da

)

Proof of Step 1. Note that

Zn,r =1√n

bnrc−1∑

i=1

(S2

n(Xi/n)− σ2(Xi/n))

︸ ︷︷ ︸An,r

−√n

bnrc−1∑

j=1

((X(j+1)/n −Xj/n

)2 − σ2(Xi/n)/n)

︸ ︷︷ ︸Bn,r

. (42)

Using Ito’s formula,

An,r =1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸A

(1)n,r

+1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)µ(Xs)ds

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸A

(2)n,r

(43)

28

+1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)n

(∫ (j+1)/n

j/n

(σ2(Xs)− σ2(Xi/n)

)ds

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸A

(3)n,r

.

Now

A(3)n,r ≤

√n sup|Xs−Xτ |≤ξn

∣∣σ2(Xs)− σ2(Xτ )∣∣

≤ √n sup

τ∈[0,1]

∣∣∇σ2(Xτ )∣∣ sup|Xs−Xτ |≤ξn

|Xs −Xτ |

= O(√

n)Oa.s.(nε/2)Oa.s(ξn) = oa.s.(1). (44)

It is immediate to see that A(2)n,r is of a smaller order of probability than A

(1)n,r. Thus, An,r = A

(1)n,r + op(1), where the

op(1) term holds uniformly in r. By the same argument as above, Bn,r can be expanded as

Bn,r =1√n

bnrc−1∑

i=1

(2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

)+ op(1).

By noting that the two summation in the numerator of Gn,r can be switched, after a few straightforward manipulations,

we have that

Zn,r

=1√n

bnrc−1∑

j=1

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) − 1

2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

︸ ︷︷ ︸Z

(1)n,r

+1√n

n∑

j=bnrc

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) 2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

︸ ︷︷ ︸Z

(2)n,r

+ op(1). (45)

Proof of Step 2:

〈Zn,r〉 =⟨Z(1)

n,r

⟩+

⟨Z(2)

n,r

⟩+ 2

⟨Z(1)

n,r, Z(2)n,r

⟩. (46)

Along the lines of the proof of Theorem 4 in Bandi and Phillips (2007),

〈Z(1)n,r〉

=1n

bnrc−1∑

j=1

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) − 1

2

〈2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s〉

=2n

bnrc−1∑

j=1

bnrc−1∑

i=1

bnrc−1∑

k=1

K(

Xj/n−Xi/n

ξn

)K

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) ∑n−1l=1 K

(Xl/n−Xk/n

ξn

) + 1− 2

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)

×σ4(Xj/n) + op(1)

= 2∫ r

0

∫ r

0

∫ r

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

29

+2∫ r

0

σ4(Xa)da− 4∫ r

0

∫ r

0

K(

Xa−Xu

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

du

σ4(Xa)da + op(1)

= 2∫ ∞

−∞

∫ ∞

−∞

∫ ∞

−∞

K(

a−uξn

)K

(a−vξn

)LX(r, u)LX(r, v)

∫∞−∞K

(c−uξn

)LX(1, c)dc

∫∞−∞K

(c−vξn

)LX(1, c)dc

dudv

×σ4(a)LX(r, a)da + 2∫ ∞

−∞σ4(a)LX(r, a)da

−4∫ ∞

−∞

∫ ∞

−∞

K(

a−uξn

)LX(r, u)

∫∞−∞K

(c−uξn

)LX(1, c)dc

du

σ4(a)LX(r, a)da + op(1)

= 2∫ ∞

−∞

(LX(r, a)3

LX(1, a)2+ LX(r, a)− 2

LX(r, a)2

LX(1, a)

)σ4(a)da + op(1). (47)

Similarly

〈Z(2)n,r〉

=1n

n∑

j=bnrc

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)

2

〈2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s〉

=2n

n∑

j=bnrc

bnrc−1∑

i=1

bnrc−1∑

k=1

K(

Xj/n−Xi/n

ξn

)K

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)∑n−1l=1 K

(Xl/n−Xk/n

ξn

) σ4(Xj/n) + op(1)

= 2∫ 1

r

∫ r

0

∫ r

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da + op(1)

= 2∫ 1

0

∫ r

0

∫ r

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

−2∫ r

0

∫ r

0

∫ r

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

= 2∫ ∞

−∞

(LX(r, a)2

LX(1, a)− LX(r, a)3

LX(1, a)2

)σ4(a)da + op(1).

Finally

〈Z(1)n,r, Z

(2)n,r〉

=1n

bnrc−1∑

k=1

n−1∑

j=bnrc

bnrc−1∑

i=1

K(

Xk/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) − 1

[nr]−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)

×⟨

2n

∫ (k+1)/n

k/n

(Xs −Xk/n

)σ(Xs)dW1,s, 2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ(Xs)dW1,s

= 0.

The statement in Step 2 then follows by noting that⟨Z

(1)n,r

⟩+

⟨Z

(2)n,r

⟩= 2

∫∞−∞

(LX(a,r)(LX(1,a)−LX(r,a)

L(1,a)

)σ4(a)da.

Proof of Step 3.

Let W(1)r =

∑bnrc−1j=1

(X(j+1)/n −Xj/n

)and W

(2)r =

∑nj=bnrc

(X(j+1)/n −Xj/n

), so that they are independent by

30

construction. We need to show that

⟨Z(1)

n,r,W(1)r

⟩= op(1) and

⟨Z(2)

n,r,W(2)r

⟩= op(1). (48)

Now, along the lines of the proof of Theorem 5 in Bandi and Phillips (2007),

⟨Z(1)

n,r,W(1)r

=1n

bnrc−1∑

j=1

bnrc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) − 1

2

〈2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)σ2(Xs)ds〉+ op(1)

= op(1),

as it is of smaller order than⟨Z

(1)n,r

⟩, which is Op(1). By the same argument, it also follows that

⟨Z

(2)n,r,W

(2)r

⟩is op(1).

Thus, by the Dambis, Dubin and Schwartz theorem (Karatzsas and Shreve, 1993, ch.3), it follows that Z

(1)n,r

Z(2)n,r

d−→ MN

0,

2

∫∞−∞

(LX(r,a)3

LX(1,a)2 + LX(r, a)− 2LX(r,a)2

LX(1,a)

)σ4(a)da 0

0 2∫∞−∞

(LX(r,a)2

LX(1,a) − LX(r,a)3

LX(1,a)2

)σ4(a)da

and, as a direct consequence of the continuous mapping theorem,

Zn,r = Z(1)n,r + Z(2)

n,rd−→ MN

(0, 2

∫ ∞

−∞

(LX(a, r) (LX(1, a)− LX(r, a)

L(1, a)

)σ4(a)da

).

This concludes part 1(a).

(i)b Let Z(1)n,rj and Z

(2)n,rj be defined as in (45). For all rj < r′j , using a similar argument as that used in the proof of Step

2 in part (i)a,

〈Z(1)n,rj

, Z(1)n,r′j

=2n

bnrjc−1∑

j=1

bnrjc−1∑

i=1

bnr′jc−1∑

k=1

K(

Xj/n−Xi/n

ξn

)K

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) ∑n−1l=1 K

(Xl/n−Xk/n

ξn

) + 1− 2

bnr′jc−1∑

i=1

K(

Xj/n−Xi/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)

×σ4(Xj/n) + op(1)

= 2∫ ∞

−∞

∫ ∞

−∞

∫ ∞

−∞

K(

a−uξn

)K

(a−vξn

)LX(rj , u)LX(r′j , v)

∫∞−∞K

(c−uξn

)LX(1, c)dc

∫∞−∞K

(c−vξn

)LX(1, c)dc

dudv

×σ4(a)LX(rj , a)da + 2∫ ∞

−∞σ4(a)LX(rj , a)da

−4∫ ∞

−∞

∫ ∞

−∞

K(

a−uξn

)LX(r′j , u)

∫∞−∞K

(c−uξn

)LX(1, c)dc

du

σ4(a)LX(rj , a)da + op(1)

= 2∫ ∞

−∞

(LX(rj , a)2LX(r′j , a)

LX(1, a)2+ LX(rj , a)− 2

LX(r′j , a)LX(r′j , a)LX(1, a)

)σ4(a)da + op(1).

Also,

〈Z(2)n,rj

, Z(2)n,r′j

=2n

n∑

j=bnrjc

bnrjc−1∑

i=1

bnr′jc−1∑

k=1

K(

Xj/n−Xi/n

ξn

)K

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)∑n−1l=1 K

(Xl/n−Xk/n

ξn

) σ4(Xj/n) + op(1)

31

= 2∫ 1

0

∫ rj

0

∫ r′j

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

−2∫ rj

0

∫ rj

0

∫ r′j

0

K(

Xa−Xu

ξn

)K

(Xa−Xv

ξn

)

∫ 1

0K

(Xc−Xu

ξn

)dc

∫ 1

0K

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

= 2∫ ∞

−∞

(LX(rj , a)LX(r′j , a)

LX(1, a)− LX(rj , a)2LX(r′j , a)

LX(1, a)2

)σ4(a)da + op(1).

Finally, by the same argument as in Part i(a), 〈Z(1)n,rj , Z

(2)n,r′j

〉 = op(1). As

〈Zn,rj , Zn,r′j 〉 = 〈Z(1)n,rj

, Z(1)n,r′j

〉+ 〈Z(2)n,rj

, Z(2)n,r′j

=∫ ∞

−∞σ4 (a)

LX(min(r, r′), a)LX(1, a)− LX(r, a)LX(r′, a)LX(1, a)

da,

the statement in i(b) then follows straigthforwardly by the continuous mapping theorem.

(ii) Define

Wt =√

1− ρ2dW1,t + ρdW2,t.

Pointwise in r, we can rewrite Zn,r as

Zn,r =1√n

bnrc−1∑

i=1

(S2

n(Xi/n)− σ2i/n

)

︸ ︷︷ ︸Dn,r

−√n

bnrc−1∑

j=1

(X(j+1)/n −Xj/n

)2 −∫ r

0

σ2sds

︸ ︷︷ ︸En,r

+1√n

bnrc−1∑

i=1

σ2i/n −

√n

∫ r

0

σ2sds

︸ ︷︷ ︸Fn,r

.

Fn,r = op(1) as shown by Barndorff-Nielsen, Graversen, Jacod, Podolskji and Shephard (2006, Section 7), and En,r =

Op(1) by Jacod (1994). Also

Dn,r =1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)√σ2

sdWs

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸D

(1)n,r

+1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)2n

∫ (j+1)/n

j/n

(Xs −Xj/n

)µ1(Xs)ds

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸D

(2)n,r

+1√n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)n

(∫ (j+1)/n

j/n

(σ2

s − σ2i/n

)ds

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

︸ ︷︷ ︸D

(3)n,r

.

32

Now, D(1)n,r is Op(1), again for the same argument used in the proof of part (i)a, Step 1, and similarly D

(2)n,r = op(1), since

it has the same behavior under both H0 and HA.

Because of the modulus of continuity of the diffusion process σ2t ,

1√n

∣∣∣D(3)n,r

∣∣∣ =1n

∣∣∣∣∣∣

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣+ oa.s.(1). (49)

Let ε > 0 be arbitrarily small; the first term on the right hand side of (49) is larger than∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1∣∣∣σ2

j/n−σ2

i/n

∣∣∣>ε

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣(50)

∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1∣∣∣σ2

j/n−σ2

i/n

∣∣∣≤ε

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣,

where the second term in (50) is smaller than than the first, almost surely. The term in (50) can be written as

max

∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣,

∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

<−ε

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣

−min

∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

0<σ2j/n

−σ2i/n

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣,

∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1−ε<σ2

j/n−σ2

i/n<0

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣

.

Now,∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣

≥ ε1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

(51)

and, analogously,∣∣∣∣∣∣∣1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

<−ε

(σ2

j/n − σ2i/n

)

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

∣∣∣∣∣∣∣

≥ ε1n

bnrc−1∑

i=1

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)1

σ2j/n

−σ2i/n

<−ε

∑n−1j=1 K

(Xi/n−Xj/n

ξn

)

. (52)

We first need to show that the right hand sides of the inequalities in (51) and (52) are strictly positive, with probability

(approaching) 1. This will ensure that the left hand sides of (51) and (52) are strictly positive.

33

As for (51), it is enough that1

nξn

n−1∑

j=1

K

(Xi/n −Xj/n

ξn

)

and1

nξn

n−1∑

j=1

K

(Xi/n −Xj/n

ξn

)1

σ2j/n

−σ2i/n

are of the same order of (probabilistic) magnitude. This occurs whenever σ2t and Xt are driven by two non perfectly

correlated Brownian motions. The same line of reasoning applies to (52).

Finally, the left hand sides of (51) and (52) are not equal, as they are two strictly positive random variables, and

therefore, the probability that their difference is zero is almost surely zero.

Therefore,∣∣∣D(3)

n,r

∣∣∣ diverges at rate√

n for all r < 1. ¥

Proof of Proposition 1

(i) For the sake of brevity of exposition we omit the drift term, which is asymptotically negligible by the same argument

used in the proof of Theorem 1, part (i)a, Step 1. We begin to show that σ4n(Xi/n)− σ4(Xi/n) = op(1), uniformly

in i. Now,

σ4n(Xi/n)− σ4(Xi/n)

=13

n−1∑

j=1

K(

Xj/n−Xi/n

ξn

)n2

((X(j+1)/n −Xj/n

)4 − 3n2 σ4(Xi/n)

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

)

=13

n−1∑

j=1

K(

Xj/n−Xi/n

ξn

)n2

(σ(Xj/n)4

(∫ (j+1)/n

j/ndW1,s

)4

− 3n2 σ4(Xi/n)

)

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) + op(1)

=13

n−1∑

j=1

K(

Xj/n−Xi/n

ξn

) (σ(Xj/n)4

(u4

j − 3))

∑n−1l=1 K

(Xl/n−Xi/n

ξn

) + op(1).

where the op(1) term holds uniformly in j, and uj are i.i.d. N(0, 1) variates. Now,

1nξn

n−1∑

l=1

K

(Xl/n −Xi/n

ξn

)= Op(1),

uniformly in i and ∣∣∣∣∣∣1

3nξ

n−1∑

j=1

K

(Xj/n −Xi/n

ξn

) (σ(Xj/n)4

(u4

j − 3))

∣∣∣∣∣∣

= Oa.s.(nε/2)Op

(1√nξn

)= op(1), (53)

because √1nξ

n−1∑

j=1

K

(Xj/n −Xi/n

ξn

)σ(Xj/n)4

(u4

j − 3)

satisfies a central limit for independent, heterogeneous triangular arrays. Also, note that, given the boundedness

of the kernel function, the op(1) term on the last line of (53) holds uniformly in i. Thus,

Ωn(rj , rj) =∫ rj

0

∫ rj

0

K(

Xu−Xa

ξn

)

∫ 1

0K

(Xu−Xb

ξn

)db

du

σ4(Xa)da + op(1)

34

and by the same argument as the one used in the proof of Step 2 of Theorem 1,

Ωn(rj , rj)−∫ ∞

−∞σ4(a)

LX(rj , a)2

LX(1, a)da = op(1).

As for Cn(rj , rj), note that

Cn(rj , rj) =1n

bnrjc−1∑

i=1

σ4(Xi/n) + op(1) =∫ ∞

−∞σ4(a)LX(rj , a)da + op(1).

Therefore, for j = 1, . . . , J,

Cn(rj , rj)− Ωn(rj , rj) = V (rj , rj) + op(1).

Finally, as for the off diagonal terms of the covariance matrix, by the same argument as above, for rj 6= r′j ,

Cn(minrj , r′j,minrj , r

′j)− Ωn(rj , r

′j) = V (rj , r

′j) + op(1).

The statement in the Proposition is then proved.

(ii) Under the alternative hypothesis, Cn(rj , rj) and Ωn(rj , r′j) do no longer have a well defined probability limit.

However, by construction, both Cn(rj , rj) and Ωn(rj , r′j) are Op(1). Thus, CV S,n

α is also Op(1). As Zn diverges at

rate√

n,

limn→∞

Pr(Zn > CV S,nα ) = 1.

The statement follows. ¥

Proof of Theorem 2

Let

TV(j+1)/n (W1) =∣∣∆W1,(j+3)/n

∣∣2/3 ∣∣∆W1,(j+2)/n

∣∣2/3 ∣∣∆W1,(j+1)/n

∣∣2/3,

TVσ,(j+1)/n (W1) = σ2(Xj/n)TV(j+1)/n (W1) ,

TV(j+1)/n (X) =∣∣∆X(j+3)/n

∣∣2/3 ∣∣∆X(j+2)/n

∣∣2/3 ∣∣∆X(j+1)/n

∣∣2/3,

and let Ej/n denote the expectation conditional on Fj/n = σ(Xτ/n; τ/n ≤ j/n),

(i)a The proof is organized in three steps.

Step 1:

ZJn,r

=µ−3

2/3√n

bnrc−3∑

j=1

bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

) − 1

(

n(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

)))

+µ−3

2/3√n

n∑

j=bnrc−2

bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

)(

n(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

)))

+op(1).

Step 2:

〈ZJn,r〉 = γ

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da + op(1).

35

Step 3:

ZJn,rd−→ ZJr ∼ MN

(0, γ

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da

).

Proof of Step 1. Using simple algebra,

√n

µ−3

2/3

n

bnrc−3∑

i=1

SJ2n(Xi/n)−

∫ r

0

σ2(Xs−)ds

(54)

=µ−3

2/3√n

bnrc−3∑

i=1

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)n

(TV(j+1)/n (X)− Ej/n

(TV(j+1)/n (X)

))

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)

+µ−3

2/3√n

bnrc−3∑

i=1

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)n

(Ej/n

(TV(j+1)/n (X)− TVσ,(j+1)/n (W1)

))

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)

+1√n

bnrc−3∑

i=1

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)(µ−3

2/3nEj/n

(TVσ,(j+1)/n (W1)

)− nbnrc−3

∫ r

0σ2(Xs−)ds

)

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

) .

As the absolute value function is differentiable everywhere but zero, the second term on the right hand side of (54) is

op(1) by the same argument as the one used in the proof of equation 7.1 of Theorem 5.1 by Barndorff-Nielsen, Graversen,

Jacod, Podolskij and Shephard (2006). The results of Barndorff-Nielsen, Graversen, Jacod, Podolskij and Shephard

(2006) were proved for the case of jumps in the volatility process, but not in the asset process. However, they continue to

hold here, as shown by Barndorff-Nielsen, Shephard and Winkel (2006). Also, by noting that nEj/n

(TVσ,(j+1)/n (W1)

)=

µ32/3σ

2(Xj/n), the last term on the right hand side of (54) is op(1).

By Theorem 5.1 and Lemma 5.2 in Barndorff-Nielsen, Graversen, Jacod, Podolskij and Shephard (2006),

µ−32/3√n

bnrc−3∑

i=1

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)n

(TV(j+1)/n (X)

)− Ej/n

(TV(j+1)/n (X)

)

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)

=µ−3

2/3√n

bnrc−3∑

i=1

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)n

(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

))

∑n−3j=1 K+

(Xi/n−Xj/n

ξn

)

+op(1). (55)

By the same argument,

√n

(µ−3

2/3TVn,r −∫ r

0

σ2(Xs−)ds

)

=µ−3

2/3√n

bnrc−3∑

j=1

n(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

))+ op(1).

Then, the statement in Step 1 follows by simple algebra.

Proof of Step 2. Let

ZJ (1)n,r =

µ−32/3√n

bnrc−3∑

j=1

bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

) − 1

(

n(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

))),

and

ZJ (2)n,r =

µ−32/3√n

n∑

j=bnrc−2

bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

) (

n(TVσ,(j+1)/n (W1)− Ej/n

(TVσ,(j+1)/n (W1)

))),

36

so that 〈ZJn,r〉 =⟨ZJ

(1)n,r

⟩+

⟨ZJ

(2)n,r

⟩+ 2

⟨ZJ

(1)n,r, ZJ

(2)n,r

⟩.

Thus,

〈ZJ (1)n,r〉

=µ−6

2/3

n

bnrc−3∑

j=1

σ(Xj/n)4

×bnrc−3∑

i=1

bnrc−3∑

k=1

K+(

Xj/n−Xi/n

ξn

)K+

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

) ∑n−1l=1 K+

(Xl/n−Xk/n

ξn

) + 1− 2bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

)

×(E

((nTV(j+1)/n (W1)− nEj/n

(TV(j+1)/n (W1)

))2)

+2n2E((

TV(j+1)/n (W1)− Ej/n

(TV(j+1)/n (W1)

)) (TV(j+2)/n (W1)− Ej/n

(TV(j+2)/n (W1)

)))

+ 2n2E((

TV(j+1)/n (W1)− Ej/n

(TV(j+1)/n (W1)

)) (TV(j+3)/n (W1)− Ej/n

(TV(j+3)/n (W1)

)))))

+op(1)

n

bnrc−3∑

j=1

σ4(Xj/n)

×bnrc−3∑

i=1

bnrc−3∑

k=1

K+(

Xj/n−Xi/n

ξn

)K+

(Xj/n−Xk/n

ξn

)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

) ∑n−1l=1 K+

(Xl/n−Xk/n

ξn

) + 1− 2bnrc−3∑

i=1

K+(

Xi/n−Xj/n

ξn

)

∑n−3l=1 K+

(Xi/n−Xl/n

ξn

) + op(1)

= γ

∫ r

0

∫ r

0

∫ r

0

K+(

Xa−Xu

ξn

)K+

(Xa−Xv

ξn

)

∫ 1

0K+

(Xc−Xu

ξn

)dc

∫ 1

0K+

(Xc−Xv

ξn

)dc

dudv

σ4(Xa)da

∫ r

0

σ4(Xa)da− 2γ

∫ r

0

∫ r

0

K+(

Xa−Xu

ξn

)

∫ 1

0K+

(Xc−Xu

ξn

)dc

du

σ4(Xa)da + op(1)

= γ

∫ ∞

−∞

(LX(r, a)3

LX(1, a)2+ LX(r, a)− 2

LX(r, a)2

LX(1, a)

)σ4(a)da + op(1),

where the last equality follows from Theorem 1 in Bandi and Nguyen (2003), as

1nξn

n−3∑

j=1

K+

(Xi/n − a

ξn

)a.s.−→ LX(1, a+) = LX(1, a).

By the same argument, ⟨ZJ (2)

n,r

⟩= γ

∫ ∞

−∞

(LX(r, a)2

LX(1, a)− LX(r, a)3

LX(1, a)2

)σ4(a)da + op(1)

and⟨ZJ

(1)n,r, ZJ

(2)n,r

⟩= op(1). The statement in Step 2 then follows by simple algebra.

Proof of Step 3: Given Step 2, from Barndorff-Nielsen, Shephard and Winkel (2006, Theorem 1).

Points (i)b and (ii) are proved by the same argument as the one used in the proof of (i)b and (ii) in Theorem 1. ¥

Proof of Proposition 2

σ4n(Xi/n)− σ4

n(Xi/n)

37

=n−4∑

j=1

K+(

Xj/n−Xi/n

ξn

) (n2µ−4

1

∣∣∣∆X j+4n

∣∣∣∣∣∣∆X j+3

n

∣∣∣∣∣∣∆X j+2

n

∣∣∣∣∣∣∆X j+1

n

∣∣∣− σ4n(Xj/n)

)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

)

+n−4∑

j=1

K+(

Xj/n−Xi/n

ξn

) (σ4

n(Xj/n)− σ4n(Xi/n)

)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

)

=n−4∑

j=1

K+(

Xj/n−Xi/n

ξn

)σ4

n(Xj/n)(n2µ−4

1

∣∣∣∆W j+4n

∣∣∣∣∣∣∆W j+3

n

∣∣∣∣∣∣∆W j+2

n

∣∣∣∣∣∣∆W j+1

n

∣∣∣− 1)

∑n−1l=1 K+

(Xl/n−Xi/n

ξn

)

+op(1) (56)

= op(1). (57)

(56) follows by the same argument as the one used in the proof of Step 1 of Theorem 2, and (57) follows from Barndorff-

Nielsen, Graversen, Jacod, Podolskij and Shephard (2006, Theorem 2.1). The statement of the Proposition then follows

by the same argument as that used in the proof of Proposition 1. ¥

Proof of Theorem 3

Hereafter, let s2n = E

((ε(jB+b)/n − ε((j−1)B+b)/n

)2)

.

(i)a The proof is organized in the following five steps.

Step 1:

TRVl,r − lrs2n =

1B

B∑

b=1

blrc−1∑

j=1

(X(jB+b)/n −X((j−1)B+b)/n

)2 + op

(1

l1/2

).

Step 2:

SM2l (Yi/l)− ls2

n

=1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l(X(jB+b)/n −X((j−1)B+b)/n

)2

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξk

) + op

(1

l1/2

),

where the op

(1

l1/2

)is independent of i.

Step 3:

ZMl,r

=1√l

blrc−1∑

j=1

1

B

B∑

b=1

blrc−1∑

i=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

) − 1

×2l

(∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s

)

+1√l

l∑

j=blrc

1

B

B∑

b=1

blrc−1∑

i=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)

×2l

(∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s

)+ op(1). (58)

Step 4:

〈ZMl,r〉 =43

∫ ∞

−∞σ4(a)

LX(r, a) (LX(1, a)− LX(r, a))LX(1, a)

da + op(1).

38

Step 5:

ZMl,rd−→ MN

(0,

43

∫ ∞

−∞

(LX(a, r) (LX(1, a)− LX(r, a)

L(1, a)

)σ4(a)da

).

(i)b and (ii) follow by the same argument as in the proof of Theorem 1.

Proof of Step 1:

√l

B

B∑

b=1

blrc−1∑

j=1

((Y(jB+b)/n − Y((j−1)B+b)/n

)2 − s2n

)

=

√l

B

B∑

b=1

blrc−1∑

j=1

(X(jB+b)/n −X((j−1)B+b)/n

)2

+ 2

√l

B

B∑

b=1

blrc−1∑

j=1

(ε(jB+b)/n − ε((j−1)B+b)/n

) (X(jB+b)/n −X((j−1)B+b)/n

)(59)

+

√l

B

B∑

b=1

blrc−1∑

j=1

((ε(jB+b)/n − ε((j−1)B+b)/n

)2 − s2n

). (60)

We begin by considering the term in (60). As εj/n = Op(an), s2n = O(a2

n), and Bl = n,

lvar

1

B

B∑

b=1

blrc−1∑

j=1

a2n

((ε(jB+b)/n − ε((j−1)B+b)/n

an

)2

− s2n

a2n

)

= lOp

(la4

nB−1)

= op(1),

given that anξ−1l → 0 and ξllB

−1/2 → 0 imply l2a2nB−1 → 0. Since E

(εj/nXj

)= 0, then the term in (59) is op(l−1/2),

regardless of the rate at which an → 0, by the same argument as in the proof of Theorem 1 in Zhang, Mykland and

Aıt-Sahalia (2005).

Proof of Step 2:

1√l

blrc−1∑

i=1

(SM2

l (Yi/l)− ls2n

)

=1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l((

Y(jB+b)/n − Y((j−1)B+b)/n

)2 − s2n

)

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)

=1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l(X(jB+b)/n −X((j−1)B+b)/n

)2

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)

+2√l

blrc−1∑

i=1

×

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l(X(jB+b)/n −X((j−1)B+b)/n

) (ε(jB+b)/n − ε((j−1)B+b)/n

)

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

) (61)

+1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l((

ε(jB+b)/n − ε((j−1)B+b)/n

)2 − s2n

)

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

) . (62)

By the same argument as in Step 1, the term in (62) is of a larger probability order than the one in (61) (involving the

cross term between log-efficient price and microstructure noise). Thus, it suffices to show the term in (62) is op(1).

39

We can rewrite it as

1√l

blrc−1∑

i=1

1Bξ

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)((ε(jB+b)/n − ε((j−1)B+b)/n

)2 − s2n

)

1Blξ

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

) .

As anξ−1 →∞,

1Blξ

B∑

b=1

l−1∑

j=1

K

(Yi/l − Y((j−1)B+b)/n

ξl

)

=1

Blξ

B∑

b=1

l−1∑

j=1

K

(Xi/l −X((j−1)B+b)/n

ξl

)

︸ ︷︷ ︸Op(1)

+ op(1), (63)

uniformly in i. The Op(1) term above is bounded away from zero, because it converges to the the local time of the

diffusion evaluated at Xi/l. Finally,

var

1√l

blrc−1∑

i=1

1Bξ

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)a2

n

((ε(jB+b)/n−ε((j−1)B+b)/n

an

)2

− s2n

a2n

)

1Blξ

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)

= a4nOp

(l2

Bξ2

)= op(1),

since ξ−1l an → 0 and this, togheter with the condition ξllB

−1/2 → 0, implies l2a2nB−1 → 0.

Proof of Step 3:

Given Step 1 and 2,

ZMl,r

=1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)l

((X(jB+b)/n −X((j−1)B+b)/n

)2 − σ2(Xi/l)l

)

1B

∑Bb=1

∑l−1j=1 K

(Yi/l−Y((j−1)B+b)/n

ξl

)

−√

l

B

B∑

b=1

blrc−1∑

j=1

((X(jB+b)/n −X((j−1)B+b)/n

)2 − σ2(X((j−1)B+b)/n

))+ op(1). (64)

By Ito’s formula,

√l

B

B∑

b=1

blrc−1∑

j=1

((X(jB+b)/n −X((j−1)B+b)/n

)2 − σ2(X((j−1)B+b)/n

))

=2√

l

B

B∑

b=1

blrc−1∑

j=1

∫ (Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s + op(1).

For any i, by a mean value expansion around εi/l − ε((j−1)B+b)/n = 0, we have

K

(Yi/l − Y((j−1)B+b)/n

ξl

)= K

(Xi/l −X((j−1)B+b)/n

ξl

)+ K ′

(Y i/l − Y ((j−1)B+b)/n

ξl

)εi/l − ε((j−1)B+b)/n

ξl,

where Y ((j−1)B+b)/n ∈(Y((j−1)B+b)/n, X((j−1)B+b)/n

). As K ′ is bounded and

εi/l−ε((j−1)B+b)/n = Op (an) = op (ξl) ,

40

uniformly in i, j, the first term of the tight hand side of (64) can be rewritten as

1√l

blrc−1∑

i=1

(SM2

l (Xi/l)− σ2(Xi/l))

=1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)l

(∫ (jB+b)/n

((j−1)B+b)/nσ2(Xs)ds− σ2(Xi/l)

l

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)

(65)

+1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)2l

(∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)

=1√l

blrc−1∑

i=1

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)2l

(∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/l−X((j−1)B+b)/n

ξl

)

+op(1), (66)

as (65) is op(1), given that l1/2+εξl → 0. The statement in Step 3 then follows by simple algebra, along the same lines as

in the proof of Theorem 1.

Proof of Step 4:

Let ZM(1)l,r and ZM

(2)l,r denote the first and second term of the right hand side of (58), respectively. Then, up to a op(1)

term,

⟨ZM

(1)l,r

=1

lB2

B∑

b=1

B∑

k=1

blrc−1∑

j=1blrc−1∑

i=1

blrc−1∑

i′=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)K

(Xi′/l−X((j−1)B+k)/n

ξl

)

1B

∑nj=1 K

(Xi/l−X(j−1)/n

ξl

)1B

∑nj=1 K

(Xi′/l−X(j−1)/n

ξl

) + 1− 2blrc−1∑

i=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)

1B

∑nj=1 K

(Xi/l−X(j−1)/n

ξl

)

×⟨

2l

∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s, 2l

∫ (jB+k)/n

((j−1)B+k)/n

(Xs −X((j−1)B+k)/n

)σ (Xs) dW1,s

⟩.

Now,∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)dW1,s is a martingale difference sequence; hence, for k > b,

⟨2l

∫ (jB+b)/n

((j−1)B+b)/n

(Xs −X((j−1)B+b)/n

)σ (Xs) dW1,s, 2l

∫ (jB+k)/n

((j−1)B+k)/n

(Xs −X((j−1)B+k)/n

)σ (Xs) dW1,s

=

⟨2l

∫ (jB+b)/n

((j−1)B+k)/n

(Xs −X((j−1)B+k)/n

)σ (Xs) dW1,s

= 2σ4(X((j−1)B+k)/n

)(1−

(k − b

B

))2

+ op(1).

Thus,

⟨ZM

(1)l,r

=4

lB2

B∑

b=1

blrc−1∑

j=1

41

blrc−1∑

i=1

blrc−1∑

i′=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)K

(Xi′/l−X((j−1)B+k)/n

ξl

)

1B

∑nj=1 K

(Xi/l−X(j−1)/n

ξl

)1B

∑nj=1 K

(Xi′/l−X(j−1)/n

ξl

) + 1− 2blrc−1∑

i=1

K(

Xi/l−X((j−1)B+b)/n

ξl

)

1B

∑nj=1 K

(Xi/l−X(j−1)/n

ξl

)

×σ4(X((j−1)B+b)/n

) j2

B2+ op(1)

=43n

bnrc−1∑

j′=1blrc−1∑

i=1

blrc−1∑

i′=1

K(

Xi/l−X(j−1)/n

ξl

)K

(Xi′/l−X(j−1)/n

ξl

)

1B

∑nm=1 K

(Xi/l−X(m−1)/n

ξl

)1B

∑nm=1 K

(Xi′/l−X(m−1)/n

ξl

) + 1− 2blrc−1∑

i=1

K(

Xi/l−X(j−1)/n

ξl

)

1B

∑nm=1 K

(Xi/l−X(m−1)/n

ξl

)

×σ4(X(j′−1)/n

)+ op(1).

By the same argument used in the proof of Step 2 of Theorem 1(a),

⟨ZM

(1)l,r

⟩=

43

∫ ∞

−∞

(LX(r, a)3

LX(1, a)2+ LX(r, a)− 2

LX(r, a)2

LX(1, a)

)σ4(a)da + op(1),

⟨ZM

(2)l,r

⟩=

43

∫ ∞

−∞

(LX(r, a)2

LX(1, a)− LX(r, a)3

LX(1, a)2

)σ4(a)da + op(1)

and⟨ZM

(1)l,r , ZM

(2)l,r

⟩= op(1). The statement in Step 4 then follows by straigthforward calculations.

Step 5: By the same argument used in the proof of Step 3 of Theorem 1(a). ¥

Proof of Proposition 3

Let ∆Xj/n = Xj/n−X(j−1)/n, and ∆εj/n = εj/n− ε(j−1)/n, and note that as the conditions anξ−1l → 0 and l1/2+εξl → 0

imply la2n → 0. Thus,

l2(Y(jB+b)/n − Y((j−1)B+b)/n

)4

= l2(∆X4

(jB+b)/n + ∆ε4(jB+b)/n + 4∆X3(jB+b)/n∆ε(jB+b)/n

+4∆X(jB+b)/n∆ε3(jB+b)/n + 6∆X2(jB+b)/n∆ε2(jB+b)/n

)

= l2∆X4(jB+b)/n + Op

(l2a4

n

)= l2∆X4

(jB+b)/n + op (1) ,

Thus, recalling (63),

σ4n,l(Xi/l)

=13

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

)l2

(X(jB+b)/n −X((j−1)B+b)/n

)4

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

) + op(1).

So, uniformly in i,

σ4n,l(Xi/l)− σ4(Xi/n)

=13

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

)l2

((X(jB+b)/n −X((j−1)B+b)/n

)4 − 3l2 σ4(Xi/n)

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

) + op(1)

=13

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

)

1B

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

)

42

×l2

σ4(X((j−1)B+b)/n)

(∫ (jB+b)/n

((j−1)B+b)/n

dW1,s

)4

− 3l2

σ4(X((j−1)B+b)/n)

+ op(1)

=13

1lBξl

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

)σ4(X((j−1)B+b)/n)

(u4

((j−1)B+b)/n− 3

)

1lξlB

∑Bb=1

∑l−1j=1 K

(Xi/n−X((j−1)B+b)/n

ξl

) + op(1)

=13

1nξl

∑nj=1 K

(Xi/n−X(j−1)/n

ξl

)σ4(X(j−1)/n)

(u4

j− 3

)

1nζl

∑nj=1 K

(Xi/n−X(j−1)/n

ξl

) + op(1) (67)

and uj are N(0, 1), and E((

u4j− 3

) (u4

k− 3

))= 0 for |j − k| > B.

Thus,

1nξln

n∑

j=1

K

(Xi/n −X(j−1)/n

ξl

)σ4(X(j−1)/n)

(u4

j− 3

)

= Op

(√nBξl

n2ξ2l

)= Op

(√1lξl

)= op(1),

as lξl →∞. The rest of the proof then follows along the same lines in the proof of Proposition 1. ¥

43

References

Aıt-Sahalia, Y. (1996a). Nonparametric Pricing of Interest Rate Derivative Securities. Econometrica, 64, 527-561.

Aıt-Sahalia, Y. (1996b). Testing Continuous-Time Models of the Spot Interest Rate. Review of Financial Studies, 9,

385-426.

Aıt-Sahalia, Y., P.A. Mykland and L. Zhang (2006). Ultra High Frequency Volatility Estimation and Dependent

Microstructure Noise. Manuscript, Princeton University.

Andersen, T.G., L. Benzoni and J. Lund (2002). An Empirical Investigation of Continuous Time Equity Returns

Models. Journal of Finance, VLII, 1239-1284.

Andersen, T.G., T. Bollerslev, F.X. Diebold and P. Labys (2001). The Distribution of Realized Exchange Rate Volatility.

Journal of the American Statistical Association, 96, 42-55.

Ang, A. and M. Piazzesi (2003). A No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeco-

nomic and Latent Variables. Journal of Monetary Economics, 50, 745-787.

Awartani, B., V. Corradi and W. Distaso (2006). Testing Market Microstructure Effects with an Application to the

Dow Jones Industrial Average Stocks. Working Paper, Tanaka Business School, Imperial College, London.

Bakshi, G., C. Cao and Z.W. Chen (2000). Do Call Prices and the Underlying Stock Always Move in the Same Direction?

Review of Financial Studies, 13, 549-584.

Bandi, F.M. and T.H. Nguyen (2003). On the Functional Estimation of Jump Diffusion Models. Journal of Economet-

rics, 116, 293-328.

Bandi, F.M. and P.C.B. Phillips (2003). Fully Nonparametric Estimation of Scalar Diffusion Models. Econometrica,

71, 241-283.

Bandi, F.M. and P.C.B. Phillips (2007). A Simple Approach to the Parametric Estimation of Potentially Nonstationary

Diffusions. Journal of Econometrics, 137, 354-395.

Bandi, F.M. and J.R. Russell (2007). Market Microstructure Noise, Integrated Variance Estimators, and the Accuracy

of Asymptotic Approximations. Working Paper, University of Chicago, Graduate School of Business.

Barndorff-Nielsen, O.E., S.E. Graversen, J. Jacod, M. Podolskij and N. Shephard (2006). A central limit theorem for

realised power and bipower variations of continuous semimartingales. In Kabanov, Y., R. Lipster and J. Stoyanov

(eds.), From Stochastic Analysis to Mathematical Finance, Festschrift for Albert Shiryaev, pp. 33-68, Springer,

New York.

Barndorff-Nielsen, O.E., P.R. Hansen, A. Lunde and N. Shephard (2006). Designing Realised Kernels to Measure the

Ex-Post Variation of Equity Prices in the Presence of Noise. Manuscript, University of Oxford.

44

Barndorff-Nielsen, O.E., P.R. Hansen, A. Lunde and N. Shephard (2007). Subsampling Realised Kernels. Working

Paper, Oxford University.

Barndorff-Nielsen, O.E. and N. Shephard (2002). Econometric Analysis of Realized Volatility and Its Use in Estimating

Stochastic Volatility Models. Journal of the Royal Statistical Society, Series B, 64, 253-280.

Barndorff-Nielsen, O.E. and N. Shephard (2004). Power and Bipower Variation with Stochastic Volatility and Jumps

(with Discussion). Journal of Financial Econometrics, 2, 1-48.

Barndorff-Nielsen, O.E., N. Shephard and M. Winkel (2006). Limit Theorems for Multipower Variation in the Presence

of Jumps. Stochastic Processes and Their Applications, 116, 796-806.

Bergman, Y.Z., B.D. Grundy and Z. Wiener (1996). General Properties of Option Prices. Journal of Finance, LI,

1573-1610.

Bibkov, R. and M. Chernov (2006). No-Arbitrage Macroeconomic Determinants of the Yield Curve. Working Paper,

London Business School.

Black, F., E. Derman and W. Toy (1990). A One-Factor Model of Interest Rates and its Application to treasury Bond

Options. Financial Analyst Journal, 46, 33-39.

Buraschi, A. and F. Corielli (2005). Risk management implications of time-inconsistency: model updating and recali-

bration of no-arbitrage models. Journal of Banking and Finance, 29, 2883-2907.

Buraschi, A. and J. Jackwerth (2001). The Price of a Smile: Hedging and Spanning in Option Markets. Review of

Financial Studies, 14, 495-527.

Buraschi, A. and A. Jiltsov (2006). Model Uncertainty and Option Markets with Heterogeneous Beliefs. Journal of

Finance, LXI, 2841-2897.

Chapman, D.A. and N.D. Pearson (2000). Is the Short Rate Drift Actually Nonlinear? Journal of Finance, LV, 355-388.

Cont, R. and P. Tankov (2004). Financial Modelling With Jump Processes. Chapman and Hall, New York.

Corradi, V., W. Distaso and A. Mele (2006). Macroeconomics Strikes Back: Real Determinants of Volatility Risk-

Premia. Working Paper, Tanaka Business School, Imperial College, London.

Cox, J.C., J.E. Ingersoll and S.A. Ross (1985). A Theory of the Term Structure of Interest Rates. Econometrica, 53,

385-407.

Derman, E. and I. Kani (1994). Riding on a smile. Risk, 7 , 32-39.

Dumas, B., J. Fleming, and R.E. Whaley (1998). Implied volatility functions: Empirical tests. Journal of Finance,

LIII, 2059-2106.

45

Dupire, B. (1994). Pricing with a smile. Risk, 7, 18-20.

Fan, J. (2005). A Selective Overview of Nonparametric Methods in Financial Econometrics (with discussion). Statistical

Science, 20, 317-357.

Fan, J. and Y.K. Truong (1993). Nonparametric Regression with Errors in Variables. Annals of Statistics, 21, 1900-1925.

Florens-Zmirou, D. (1993). On Estimating the Diffusion Coefficient from Discrete Observations. Journal of Applied

Probability, 30, 790-804.

Fong, H.G. and O.A. Vasicek (1992). Interest Rate Volatility as a Stochastic Factor. Gifford Fong Associates Working

Paper.

Heath, D., R.A. Jarrow and A.J. Morton (2002). Bond Pricing and the Term Structure of Interest Rates: A New

Methodology for Contingent Claims Valuation. Econometrica, 60, 77-105.

Heston, S. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency

options. Review of Financial Studies, 6, 327-343.

Hong, Y. and H. Li (2005). Nonparametric Specification Testing for Continuous-Time Models with Applications to

Term Structure of Interest Rates. Review of Financial Studies, 18, 37-84.

Hull, J.C. and A.D. White (1987). The Pricing of Options on Assets with Stochastic Volatilities. Journal of Finance,

XLII, 281-300.

Jacod, J. (1994). Limit of Random Measures Associated with the Increments of a Brownian Semimartingale. Preprint

number 120, Laboratoire de Probabilities, Universite Pierre et Marie Curie, Paris.

Jacod, J. and P. Protter (1998). Asymptotic Error Distribution for the Euler Method for Stochastic Differential

Equations. Annals of Probability, 26, 267-307.

James, J. and N. Webber (2000). Interest Rate Modelling. Wiley, Chichester.

Kalnina, I. and O. Linton (2007). Inference About Realized Volatility Using Infill Subsampling. Working Paper, LSE.

Karatzsas, I. and S.E. Shreve (1991). Brownian Motion and Stochastic Calculus. Springer Verlag, New York.

Mancini, C. and R. Reno (2006). Threshold Estimation of Jump-Diffusion Models and Interest Rate Modelling.

Manuscript, University of Siena.

Pardoux, E. and D. Talay (1985).Discretization and Simulation of Stochastic Differential Equations. Acta Applicandae

Mathematicae, 3, 23-47.

Piazzesi, M. (2005). Bond Yields and the Federal Reserve. Journal of Political Economy, 113, 311-344.

46

Pritsker, M. (1998). Nonparametric Density Estimation of Tests of Continuous Time Interest Rate Models. Review of

Financial Studies, 11, 449-487.

Schennach, S.M. (2004). Nonparametric Regression in the Presence of Measurement Error. Econometric Theory, 20,

1046–1093.

Stanton, R. (1997). A Nonparametric Model of Term Structure Dynamics and the Market Price of Interest Rate Risk.

Journal of Finance, LII, 1973-2002.

Stein, E. and J.C. Stein (1991). Stock Price Distributions with Stochastic Volatility: An Analytic Approach. Review

of Financial Studies, 4, 727-752.

Vasicek, O.A. (1977). An Equilibrium Characterization of the Term Structure. Journal of Financial Economics, 5,

177-188.

Zhang, L. (2006). Efficient Estimation of Stochastic Volatility Using Noisy Observations: A Multi-Scale Approach.

Bernoulli, 12, 1019-1043.

Zhang, L., P.A. Mykland and Y. Aıt-Sahalia (2005). A Tale of Two Time Scales: Determining Integrated Volatility

with Noisy High Frequency Data. Journal of the American Statistical Association, 100, 1394-1411.

Zhang, L., P.A. Mykland and Y. Aıt-Sahalia (2006). Edgeworth Expansions for Realized Volatility and Related Esti-

mators, Manuscript, Princeton University.

47