Advanced topics in Financial Econometrics Bas Werker Tilburg University, SAMSI fellow.

Advanced topics in Financial Econometrics

Bas Werker

Tilburg University, SAMSI fellow

In which we will ...

... consider the modern theory of asymptotic statistics à la Hájek/Le Cam, with a special emphasis on financial econometric applications, semiparametric analysis, and rank based inference methods

Contents

1. Introduction

2. Inference in parametric models

3. Semiparametric analysis for models with i.i.d. observations

4. Semiparametric time series models

5. Rank based statistics

6. Semiparametric efficiency of rank based inference

Literature

Aad W. van der Vaart, “Asymptotic Statistics”, Cambridge University Press, 1998/2000

Reference (AS-x) is to Chapter x of this book

Various papers

Introduction

Contents

Consistency and asymptotic normality (AS-2,3)

M- and Z-estimators (AS-5)

Local alternatives and contiguity (AS-6)

Local power of tests

Stochastic convergence (AS-2)

Consider a sequence $(X_n)_{n \ge 0}$ of $k$-dimensional random vectors

All random variables are (“for fixed sample size”) defined on the same implicit probability space

Weak convergence

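Convergence of the distributions: for each point $x \in \mathbb{R}^k$ where $x \mapsto P\{X \le x\}$ is continuous, we have

$$P\{X_n \le x\} \to P\{X \le x\} \quad \text{as } n \to \infty$$

Convergence in distribution/law

Notation: $X_n \xrightarrow{L} X$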

Convergence in probability

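Convergence of the random variables:

$$P\{\|X_n - X\| > \varepsilon\} \to 0 \quad \text{as } n \to \infty, \text{ for all } \varepsilon > 0$$

Here $\|\cdot\|$ is the Euclidean distance

Basic to the notion of consistency of estimators

Notation: $X_n \xrightarrow{p} X$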

Continuous mapping theorem

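Let $g$ be a function which is continuous at each point of a set $C$ for which $P\{X \in C\} = 1$, then

$$X_n \xrightarrow{L} X \;\Rightarrow\; g(X_n) \xrightarrow{L} g(X)$$

$$X_n \xrightarrow{a.s.} X \;\Rightarrow\; g(X_n) \xrightarrow{a.s.} g(X)$$

$$X_n \xrightarrow{p} X \;\Rightarrow\; g(X_n) \xrightarrow{p} g(X)$$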

o and O notation

Convenient short-hand notation and calculus

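$X_n = O_p(R_n)$ means $X_n / R_n$ bounded in probability, i.e., for all $\varepsilon > 0$ there exists $M > 0$ such that

$$\sup_n P\{\|X_n / R_n\| > M\} < \varepsilon$$

$X_n = o_p(R_n)$ means $X_n / R_n \xrightarrow{p} 0$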

Rules of calculus

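Convenient rules

$$o_p(1) + o_p(1) = o_p(1)$$

$$O_p(1) + o_p(1) = O_p(1)$$

$$O_p(1) + O_p(1) = O_p(1)$$

$$O_p(1)\, o_p(1) = o_p(1)$$

$$(1 + o_p(1))^{-1} = O_p(1)$$

$$o_p(R_n) = R_n\, o_p(1)$$

$$O_p(R_n) = R_n\, O_p(1)$$

$$o_p(O_p(1)) = o_p(1)$$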

Delta method (AS-3)

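Suppose that for numbers $r_n \to \infty$ we have $r_n(T_n - \theta) \xrightarrow{L} T$

Suppose $\phi$ is differentiable at $\theta$

Then

$$r_n\left(\phi(T_n) - \phi(\theta)\right) = \phi'_\theta\left(r_n(T_n - \theta)\right) + o_p(1) \xrightarrow{L} \phi'_\theta(T)$$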
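The Delta method can be checked by simulation. The following is a minimal sketch under assumptions that are not in the slides: the observations are Exp(1), $T_n$ is the sample mean (so $\theta = 1$ and $r_n = \sqrt{n}$), and $\phi(x) = x^2$. The limit variance should then be $\phi'(\theta)^2 \cdot \mathrm{Var}(X) = 4$.

```python
import math
import random
import statistics

def delta_method_demo(n=400, reps=2000, seed=0):
    """Simulate sqrt(n) * (phi(T_n) - phi(theta)) for phi(x) = x**2."""
    rng = random.Random(seed)
    theta = 1.0                       # mean of an Exp(1) observation (assumed example)
    phi = lambda x: x * x             # smooth map with derivative 2 * theta at theta
    draws = []
    for _ in range(reps):
        # T_n is the sample mean of n i.i.d. Exp(1) observations
        t_n = sum(rng.expovariate(1.0) for _ in range(n)) / n
        draws.append(math.sqrt(n) * (phi(t_n) - phi(theta)))
    return statistics.mean(draws), statistics.variance(draws)

mean_sim, var_sim = delta_method_demo()
var_limit = (2 * 1.0) ** 2 * 1.0      # phi'(theta)^2 * Var(X) = 4
print(f"simulated mean {mean_sim:.3f}, variance {var_sim:.3f}, limit variance {var_limit:.1f}")
```

The simulated variance should be close to 4 and the simulated mean close to 0, up to Monte Carlo error and a finite-sample bias of order $1/\sqrt{n}$.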

Uniform Delta method

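Suppose that for numbers $r_n \to \infty$ and vectors $\theta_n$ we have $r_n(T_n - \theta_n) \xrightarrow{L} T$

Suppose $\theta_n \to \theta$

Suppose $\phi$ is continuously differentiable in a neighborhood of $\theta$

Then

$$r_n\left(\phi(T_n) - \phi(\theta_n)\right) = \phi'_\theta\left(r_n(T_n - \theta_n)\right) + o_p(1) \xrightarrow{L} \phi'_\theta(T)$$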

M-estimators

Define a statistic (“estimator”) $\hat\theta_n$ for observations $X_1, \ldots, X_n$ as a maximizer of

$$M_n(\theta) = \frac{1}{n}\sum_{i=1}^n m_\theta(X_i)$$

Z-estimators

Define a statistic (“estimator”) $\hat\theta_n$ for observations $X_1, \ldots, X_n$ as a solution of

$$\Psi_n(\theta) = \frac{1}{n}\sum_{i=1}^n \psi_\theta(X_i) = 0$$

Also called “Estimating equation”

Often, but not always, based on M-estimator

Examples

Maximum likelihood

(Generalized) Method of Moments

Chi-square estimation

... “all” parametric inference
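As an illustration of how maximum likelihood fits both definitions, here is a minimal sketch for an exponential model with rate $\theta$. The model, the function names, and the bisection bounds are assumptions chosen for the example. The M-criterion is the average log-density, the Z-criterion is its derivative (the average score), and the root of the latter coincides with the closed-form MLE $1/\bar X$.

```python
import math
import random

def m_criterion(theta, xs):
    """M_n(theta): average log-density of an Exponential(rate=theta) model."""
    return sum(math.log(theta) - theta * x for x in xs) / len(xs)

def z_criterion(theta, xs):
    """Psi_n(theta): average score, i.e. the derivative of m_criterion in theta."""
    return sum(1.0 / theta - x for x in xs) / len(xs)

def solve_z(xs, lo=1e-6, hi=1e6, tol=1e-12):
    """Bisection for Psi_n(theta) = 0; Psi_n is decreasing in theta in this model."""
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        if z_criterion(mid, xs) > 0:
            lo = mid      # root lies to the right of mid
        else:
            hi = mid      # root lies to the left of mid
    return 0.5 * (lo + hi)

rng = random.Random(1)
xs = [rng.expovariate(2.0) for _ in range(500)]   # simulated data, true rate 2 (assumed)
theta_z = solve_z(xs)
theta_closed_form = len(xs) / sum(xs)              # MLE 1 / sample mean
print(theta_z, theta_closed_form)
```

The Z-estimator found by bisection and the closed-form maximizer of the M-criterion agree up to the numerical tolerance.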

Consistency

Uniform convergence of criterion function leads to consistency of M-estimators

“Approximate” maximization is sufficient

Theorem AS 5.7

Uniform convergence of criterion function leads to consistency of Z-estimators

Asymptotic normality

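Let us be given a Z-estimator $\hat\theta_n \xrightarrow{p} \theta_0$

Suppose the Z-criterion $\psi_\theta(x)$ satisfies

$$\|\psi_{\theta_1}(x) - \psi_{\theta_2}(x)\| \le \dot\psi(x)\,\|\theta_1 - \theta_2\|$$

Suppose $\theta \mapsto E\psi_\theta(X)$ is differentiable with derivative $V_{\theta_0}$ at the zero $\theta_0$ of $\theta \mapsto E\psi_\theta(X)$

Then, under some additional regularity,

$$\sqrt{n}(\hat\theta_n - \theta_0) = -V_{\theta_0}^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^n \psi_{\theta_0}(X_i) + o_p(1)$$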

One-step estimators

A technical trick to reduce the conditions for consistency and asymptotic normality of Z-estimators significantly

Starting from an initial root-n consistent estimator $\tilde\theta_n$, i.e., $\sqrt{n}(\tilde\theta_n - \theta_0) = O_p(1)$, we consider the solution of the (linear) equation

$$\Psi_n(\tilde\theta_n) + V_{\tilde\theta_n}(\theta - \tilde\theta_n) = 0$$
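A one-step estimator can be sketched for the Cauchy location model, an example chosen here and not taken from the slides. The assumptions are: the initial estimator is the sample median (which is root-n consistent), the Z-criterion is the location score, and the derivative in the linear equation is replaced by its known population value, minus the Fisher information $1/2$. The linear equation is then solved in closed form.

```python
import math
import random
import statistics

def cauchy_score(theta, x):
    """psi_theta(x): location score of the standard Cauchy density."""
    u = x - theta
    return 2.0 * u / (1.0 + u * u)

def one_step(xs, theta_init, info=0.5):
    """Solve Psi_n(theta_init) - info * (theta - theta_init) = 0 for theta.

    info is the Fisher information of the Cauchy location model (1/2);
    minus info plays the role of the derivative of E psi_theta(X).
    """
    psi_n = sum(cauchy_score(theta_init, x) for x in xs) / len(xs)
    return theta_init + psi_n / info

rng = random.Random(42)
theta0 = 1.0                                       # true location (assumed)
xs = [theta0 + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(2000)]
theta_tilde = statistics.median(xs)                # root-n consistent starting point
theta_one_step = one_step(xs, theta_tilde)
print(f"median {theta_tilde:.4f}, one-step {theta_one_step:.4f}")
```

No iteration is needed: a single linear update from a root-n consistent start is enough for the asymptotic expansion, which is the point of the trick.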

Asymptotic normality

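The previously derived asymptotic expansion/distribution holds now under the sole condition that, for every $M > 0$,

$$\sup_{\sqrt{n}\|\theta - \theta_0\| \le M}\left\|\sqrt{n}\left(\Psi_n(\theta) - \Psi_n(\theta_0)\right) - V_{\theta_0}\sqrt{n}(\theta - \theta_0)\right\| = o_p(1)$$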

Discretization trick

The previous condition can be relaxed further by considering an initial discretized estimator, i.e., one which essentially only takes a finite number of possible values

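Now, we only need, for all non-random $\theta_n$ with $\sqrt{n}(\theta_n - \theta_0) = O(1)$, that

$$\sqrt{n}\left(\Psi_n(\theta_n) - \Psi_n(\theta_0)\right) - V_{\theta_0}\sqrt{n}(\theta_n - \theta_0) = o_p(1)$$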

Contiguity (AS-6)

To understand the idea, consider a statistical model where we observe one variable $X$ from a distribution $P$ or $Q$

We want to test if the distribution is $P$ or $Q$

If $P$ and $Q$ are orthogonal, this testing problem is trivial

Orthogonality: disjoint support

Contiguity - 2

If $P$ and $Q$ “have the same support”, i.e., are mutually absolutely continuous, the problem is non-trivial (this is the interesting case)

Clearly, “good” tests should in that case be based on the likelihood ratio

$$L = \frac{dQ}{dP}(X)$$

Intermezzo

Radon-Nikodym derivatives $dQ/dP$ always refer to the derivative defined for the part of $Q$ where $P$ dominates

As a consequence, expectations of Radon-Nikodym derivatives may be strictly smaller than one
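A small discrete example makes the last point concrete. The two distributions below are hypothetical and chosen only for illustration: $Q$ puts mass on a point that $P$ does not charge, so the Radon-Nikodym derivative only captures the part of $Q$ dominated by $P$, and its expectation under $P$ equals the $Q$-mass of the support of $P$.

```python
# Hypothetical discrete distributions: Q charges the point "c", P does not.
P = {"a": 0.5, "b": 0.5}
Q = {"a": 0.3, "b": 0.4, "c": 0.3}

def rn_derivative(q, p):
    """dQ/dP on the support of p (the part of q that p dominates)."""
    return {x: q.get(x, 0.0) / p[x] for x in p}

dQdP = rn_derivative(Q, P)
expectation_under_P = sum(P[x] * dQdP[x] for x in P)
mass_of_Q_on_support_of_P = sum(Q[x] for x in P)
print(expectation_under_P, mass_of_Q_on_support_of_P)
```

Both numbers equal 0.7 up to floating-point rounding, strictly below one because $Q$ is not absolutely continuous with respect to $P$.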

Contiguity - definition

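Contiguity is the asymptotic version of absolute continuity for sequences of probability measures

Definition: $Q_n \triangleleft P_n$ if, for every sequence of events $A_n$: $P_n(A_n) \to 0 \Rightarrow Q_n(A_n) \to 0$

Definition: $P_n \triangleleft\triangleright Q_n$ if both $Q_n \triangleleft P_n$ and $P_n \triangleleft Q_n$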

Le Cam’s first lemma

The well known equivalence for absolute continuity translates in the obvious way to contiguity (AS Lemma 6.4)

The following are equivalent

$Q \ll P$

$Q\{dP/dQ = 0\} = 0$

$E_P\, dQ/dP = 1$

Consistency

An estimator which is consistent under a (sequence of) probability (measures) $P_n$ is also consistent under a contiguous (sequence of) probability (measures) $Q_n$

Le Cam’s third lemma

Change of probability measures using contiguous probabilities may be taken to the limit

See AS Theorem 6.6

It looks complicated, but is actually quite intuitive

Local alternatives

The idea of contiguity is basic to the construction of local alternatives

In a sequence of statistical experiments with identical parameter space $\Theta$, asymptotic tests for $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$ are trivial

Non-trivial is $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_0 + h/\sqrt{n}$

Example

Consider the model where we observe $n$ i.i.d. copies of a random variable $N(\theta, 1)$

Denote $P^n_\theta = N(\theta, 1)^n$

When are $P^n_{\theta_n}$ and $P^n_{\theta_0}$ contiguous?

What is the asymptotic distribution of the sample average under $P^n_{\theta_0 + h/\sqrt{n}}$?
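The example can be explored numerically. The sketch below (function names and the simulated data are assumptions) checks that, in the Gaussian location model with unit variance, the log-likelihood ratio of the local alternative $\theta_0 + h/\sqrt{n}$ against $\theta_0$ equals $h\,\sqrt{n}(\bar X - \theta_0) - h^2/2$ exactly. Under $\theta_0$ this is $N(-h^2/2; h^2)$ for every $n$, so the likelihood ratio has expectation one in the limit and the two sequences are contiguous when $\theta_n - \theta_0$ is of order $1/\sqrt{n}$. Under the local alternative, $\sqrt{n}(\bar X - \theta_0)$ is $N(h; 1)$.

```python
import math
import random

def log_density_normal(x, mean):
    """Log-density of N(mean, 1) at x."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (x - mean) ** 2

def log_lr_direct(xs, theta0, h):
    """Log-likelihood ratio of theta0 + h / sqrt(n) versus theta0, term by term."""
    n = len(xs)
    theta_n = theta0 + h / math.sqrt(n)
    return sum(log_density_normal(x, theta_n) - log_density_normal(x, theta0) for x in xs)

def log_lr_expansion(xs, theta0, h):
    """h * Delta_n - h^2 / 2 with Delta_n = sqrt(n) * (mean - theta0); information is 1."""
    n = len(xs)
    delta_n = math.sqrt(n) * (sum(xs) / n - theta0)
    return h * delta_n - 0.5 * h * h

rng = random.Random(7)
theta0, h, n = 0.5, 1.5, 1000                       # hypothetical values
xs = [rng.gauss(theta0, 1.0) for _ in range(n)]      # data generated under theta0
direct = log_lr_direct(xs, theta0, h)
expansion = log_lr_expansion(xs, theta0, h)
print(direct, expansion)
```

In this model the expansion has no remainder term, which is why it serves as the prototype for local asymptotic normality.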

Inference in parametric models

Contents

Local Asymptotic Normality (AS-7)

Optimal testing

Efficiency of estimators (AS-8)

Nuisance parameters and geometry

Limits of experiments (AS-9)

Local Asymptotic Normality (AS-7)

Local Asymptotic Normality (LAN) is the formalization of a “regular” statistical experiment

The concept is a refinement of contiguity

“All” standard econometric models are LAN

LAN - definition

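A statistical model is identified as a sequence of probability models $\{P^{(n)}_\theta : \theta \in \Theta\}$

LAN holds if for each $\theta$ and every sequence $h_n \to h$

$$\log\frac{dP^{(n)}_{\theta + h_n/\sqrt{n}}}{dP^{(n)}_\theta} = h^T\Delta^{(n)}_\theta - \frac{1}{2}h^T I_\theta h + o_p(1) \xrightarrow{L} N\left(-\frac{1}{2}h^T I_\theta h;\; h^T I_\theta h\right)$$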

Remarks

$\Delta^{(n)}_\theta$ is called the central sequence and is the equivalent of the derivative of the log-likelihood in classical statistics

$I_\theta$ is the Fisher information

The root-n rate can be replaced by any other rate, but this is the usual situation

Terminology

The terminology derives from

$$\log\frac{dN(h; I^{-1})}{dN(0; I^{-1})}(X) = h^T I X - \frac{1}{2}h^T I h$$

with $X$ a single observation from $N(h; I^{-1})$

Examples

In models with i.i.d. observations, differentiability conditions on the densities lead to LAN

This is the so-called “differentiability in quadratic mean condition”

See AS Theorem 7.2

Regression, Probit/Logit, etc...

Time series examples

LAN has also been shown to hold for

ARMA (Kreiss, 1987)

ARCH (Linton, 1993)

GARCH (Drost and Klaassen, 1997)

...

In all cases with the “obvious” central sequence

Optimal testing in LAN experiments

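Consider a (test) statistic $T_n$ in a LAN experiment that satisfies, under $\theta_0$,

$$\begin{pmatrix} T_n \\ \Delta^{(n)}_{\theta_0} \end{pmatrix} \xrightarrow{L} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}; \begin{pmatrix} \sigma^2 & c^T \\ c & I \end{pmatrix} \right)$$

An asymptotic size $\alpha$ (under $\theta_0$) test is easily constructed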

Local power

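Consider a sequence of alternatives $\theta_n = \theta_0 + h/\sqrt{n}$

What’s the behavior of $T_n$ under $\theta_n$?

Le Cam’s third lemma: under $\theta_n$

$$\begin{pmatrix} T_n \\ \Delta^{(n)}_{\theta_0} \end{pmatrix} \xrightarrow{L} N\left( \begin{pmatrix} c^T h \\ I h \end{pmatrix}; \begin{pmatrix} \sigma^2 & c^T \\ c & I \end{pmatrix} \right)$$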

Maximize local power

To maximize local power, we need to maximize $c$, the asymptotic covariance between the test statistic and the central sequence

Hence take the central sequence evaluated at the null as statistic

Lagrange multiplier type

Use quadratic forms in multidimensional case
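The effect of the covariance on local power can be sketched in the one-dimensional case. The assumptions here are: the test rejects when $T_n/\sigma$ exceeds the standard normal $1-\alpha$ quantile, $T_n$ has limit $N(ch; \sigma^2)$ under the local alternative, and the numerical values are hypothetical. By the Cauchy-Schwarz inequality $c^2 \le \sigma^2 I$, so the shift $ch/\sigma$ is at most $h\sqrt{I}$, which is attained by the central sequence itself ($c = \sigma^2 = I$).

```python
from statistics import NormalDist

def local_power(c, sigma, h, alpha=0.05):
    """Limit power of the one-sided test rejecting when T_n / sigma > z_{1 - alpha}.

    Under the local alternative T_n / sigma is asymptotically N(c * h / sigma, 1).
    """
    z = NormalDist().inv_cdf(1.0 - alpha)
    return 1.0 - NormalDist().cdf(z - c * h / sigma)

fisher_info = 2.0      # hypothetical Fisher information I
h = 1.0                # hypothetical local parameter

# Some statistic with c = 1, sigma = 1 (admissible since c^2 <= sigma^2 * I).
power_other = local_power(c=1.0, sigma=1.0, h=h)
# The central sequence as test statistic: c = I and sigma^2 = I.
power_score = local_power(c=fisher_info, sigma=fisher_info ** 0.5, h=h)
size = local_power(c=1.0, sigma=1.0, h=0.0)
print(power_other, power_score, size)
```

At $h = 0$ the power reduces to the size $\alpha$, and for $h > 0$ the score-type statistic has the larger local power.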

Efficiency (AS-8)

We may also formalize the Cramér-Rao lower bound idea

Let’s first look at the asymptotic counterpart of an unbiased estimator

... which requires more than mere consistency

Regular estimator

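Consider an estimator $\hat\theta_n$ for $\theta$ satisfying

$$\begin{pmatrix} \sqrt{n}(\hat\theta_n - \theta_0) \\ \Delta^{(n)}_{\theta_0} \end{pmatrix} \xrightarrow{L} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}; \begin{pmatrix} \Sigma_{\theta_0} & C_{\theta_0}^T \\ C_{\theta_0} & I_{\theta_0} \end{pmatrix} \right)$$

under $\theta_0$

How does this estimator behave under $\theta_n = \theta_0 + h/\sqrt{n}$?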

Once more...

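Le Cam’s third lemma, under $\theta_n$,

$$\begin{pmatrix} \sqrt{n}(\hat\theta_n - \theta_0) \\ \Delta^{(n)}_{\theta_0} \end{pmatrix} \xrightarrow{L} N\left( \begin{pmatrix} C_{\theta_0}^T h \\ I_{\theta_0} h \end{pmatrix}; \begin{pmatrix} \Sigma_{\theta_0} & C_{\theta_0}^T \\ C_{\theta_0} & I_{\theta_0} \end{pmatrix} \right)$$

Which leads to the requirement that $C_{\theta_0}$ is the identity matrix

If not, estimator does not follow local shifts

Such an estimator is called “regular”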

Convolution theorem

For any regular estimator we have $\Sigma \ge I^{-1}$, with $\Sigma$ the asymptotic variance of the estimator

The idea of regularity can be relaxed to general limiting distributions $L$

In that case, we find

$$L = N(0; I^{-1}) * M$$

The latter result explains the name

Efficient estimator

An estimator $\hat\theta_n$ is therefore called efficient if

$$\sqrt{n}(\hat\theta_n - \theta) = I^{-1}\Delta^{(n)}_\theta + o_p(1)$$

Note that this estimator is trivially regular

Minimax theorem

Theorem on asymptotic loss of any estimator (regular or not)

Only gives a bound for the asymptotic risk, no more distribution information

Nuisance parameters

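The Convolution theorem also leads to optimal estimators in case we have both a parameter of interest $\theta$ and a (parametric) nuisance parameter $\eta$

In that case we need to consider

$$\begin{pmatrix} \Delta^{(n)}_\theta \\ \Delta^{(n)}_\eta \end{pmatrix} \xrightarrow{L} N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}; \begin{pmatrix} I_{\theta\theta} & I_{\theta\eta} \\ I_{\eta\theta} & I_{\eta\eta} \end{pmatrix} \right)$$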

Efficient estimation

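If one is only interested in estimating $\theta$, one should consider just the upper part of

$$\begin{pmatrix} I_{\theta\theta} & I_{\theta\eta} \\ I_{\eta\theta} & I_{\eta\eta} \end{pmatrix}^{-1} \begin{pmatrix} \Delta^{(n)}_\theta \\ \Delta^{(n)}_\eta \end{pmatrix}$$

From the partitioned inverses formula, this is

$$\left(I_{\theta\theta} - I_{\theta\eta}I_{\eta\eta}^{-1}I_{\eta\theta}\right)^{-1}\left(\Delta^{(n)}_\theta - I_{\theta\eta}I_{\eta\eta}^{-1}\Delta^{(n)}_\eta\right)$$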
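The partitioned inverses formula can be verified numerically in the simplest case of one scalar parameter of interest and one scalar nuisance parameter. The information values and function names below are hypothetical. The sketch checks that the upper-left entry of the inverse information matrix is the inverse of the information that remains after removing the part explained by the nuisance score.

```python
def efficient_information(i_tt, i_te, i_ee):
    """Scalar case of I_tt - I_te * I_ee^{-1} * I_et (t: interest, e: nuisance)."""
    return i_tt - i_te * i_te / i_ee

def upper_left_of_inverse(i_tt, i_te, i_ee):
    """Upper-left entry of the inverse of the 2x2 matrix [[i_tt, i_te], [i_te, i_ee]]."""
    det = i_tt * i_ee - i_te * i_te
    return i_ee / det

def efficient_score(delta_t, delta_e, i_te, i_ee):
    """Residual of projecting the score of interest on the nuisance score."""
    return delta_t - i_te / i_ee * delta_e

i_tt, i_te, i_ee = 2.0, 0.8, 1.0       # hypothetical information matrix entries
eff_info = efficient_information(i_tt, i_te, i_ee)
bound = upper_left_of_inverse(i_tt, i_te, i_ee)
print(eff_info, 1.0 / eff_info, bound)
```

The efficient information never exceeds the information available when the nuisance parameter is known, with equality when the off-diagonal entry is zero.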

The geometry of inference with nuisance parameters

Using the intuition that Fisher information matrices are variances of central sequences, we find that the central sequence to use when there are nuisance parameters is the residual of the projection of the central sequence for the parameter of interest on the central sequences of the nuisance parameters

Limits of experiments (AS-9)

The previous ideas can be extended to a general concept of “convergence of statistical experiments”

Crucial is an identical parameter space

LAN corresponds to a Gaussian shift limit

Other limits are possible
