Likelihood and Asymptotic Theory for Statistical Inference · 2012-11-12 · Last week 1.Likelihood...

Post on 17-Apr-2020

15 views 0 download

Transcript of Likelihood and Asymptotic Theory for Statistical Inference · 2012-11-12 · Last week 1.Likelihood...

Likelihood and Asymptotic Theory forStatistical Inference

Nancy Reid

020 7679 1863reid@utstat.utoronto.ca

n.reid@ucl.ac.uk

http://www.utstat.toronto.edu/reid/ltccF12.html

LTCC Likelihood Theory Week 2 November 12, 2012 1/36

Last week1. Likelihood – definition, examples, direct inference2. Derived quantities – score, mle, Fisher information, Bartlett

identities3. Inference from derived quantities – consistency of mle,

asymptotic normality4. Inference via pivotals – standardized score function,

standardized mle, likelihood ratio, likelihood root5. Nuisance parameters and parameters of interest;

invariance under parameter transformation6. Asymptotics for posteriors

LTCC Likelihood Theory Week 2 November 12, 2012 2/36

This week1. Bayesian approximation

1.1 careful statement of asymptotic normality1.2 Laplace approximation to posterior density and cumulative

distribution function1.3 Laplace approximation to marginal posterior density and cdf1.4 relation to modified profile likelihood

2. Frequentist inference with nuisance parameters2.1 first order summaries; difficulties with profile likelihood2.2 marginal and conditional likelihood2.3 exponential families2.4 transformation families2.5 adjustments to profile likelihood

3. Notation

LTCC Likelihood Theory Week 2 November 12, 2012 3/36

Posterior is asymptotically normal

π(θ | y).∼ N{θ̂, j−1(θ̂)} θ ∈ R, y = (y1, . . . , yn)

careful statement

LTCC Likelihood Theory Week 2 November 12, 2012 4/36

Nancy
Nancy

... posterior is asymptotically normal

π(θ | y).∼ N{θ̂, j−1(θ̂)} θ ∈ R, y = (y1, . . . , yn)

equivalently `π(θ) =

LTCC Likelihood Theory Week 2 November 12, 2012 5/36

Nancy
Nancy

... posterior is asymptotically normalIn fact,

If π(θ) > 0 and π′(θ) is continuous in a neighbourhood of θ0,there exist constants D and ny s.t.

|Fn(ξ)− Φ(ξ)| < Dn−1/2, for all n > ny ,

on an almost-sure set with respect to f (y ; θ0), wherey = (y1, . . . , yn) is a sample from f (y ; θ0), and θ0 is anobservation from the prior density π(θ).

Fn(ξ) = Pr{(θ − θ̂)j1/2(θ̂) ≤ ξ | y}

Johnson (1970); Datta & Mukerjee (2004)

LTCC Likelihood Theory Week 2 November 12, 2012 6/36

Nancy
Nancy

Laplace approximation

π(θ | y).

=1

(2π)1/2 |j(θ̂)|+1/2 exp{`(θ; y)− `(θ̂; y)}π(θ)

π(θ̂)

π(θ | y) =1

(2π)1/2 |j(θ̂)|+1/2 exp{`(θ; y)−`(θ̂; y)}π(θ)

π(θ̂){1+Op(n−1)}

y = (y1, . . . , yn), θ ∈ R1

π(θ | y) =1

(2π)1/2 |jπ(θ̂π)|+1/2 exp{`π(θ; y)−`π(θ̂π; y)}{1+Op(n−1)}

LTCC Likelihood Theory Week 2 November 12, 2012 7/36

Nancy
Nancy
Nancy
Nancy

Posterior cdf∫ θ

−∞π(ϑ | y)dϑ .

=

∫ θ

−∞

1(2π)1/2 e`(ϑ;y)−`(ϑ̂;y)|j(ϑ̂)|1/2π(ϑ)

π(ϑ̂)dϑ

LTCC Likelihood Theory Week 2 November 12, 2012 8/36

Nancy
Nancy

Posterior cdf∫ θ

−∞π(ϑ | y)dϑ .

=

∫ θ

−∞

1(2π)1/2 e`(ϑ;y)−`(ϑ̂;y)|j(ϑ̂)|1/2π(ϑ)

π(ϑ̂)dϑ

SM, §11.3

LTCC Likelihood Theory Week 2 November 12, 2012 9/36

Nancy
Nancy
Nancy
Nancy

BDR, Ch.3, Cauchy with flat prior

Nancy

Nuisance parametersy = (y1, . . . , yn) ∼ f (y ; θ), θ = (ψ, λ)

πm(ψ | y) =

∫π(ψ, λ | y)dλ

=

∫exp{`(ψ, λ; y)π(ψ, λ)dλ∫

exp{`(ψ, λ; y)π(ψ, λ)dψdλ

LTCC Likelihood Theory Week 2 November 12, 2012 12/36

Nancy
Nancy

... nuisance parametersy = (y1, . . . , yn) ∼ f (y ; θ), θ = (ψ, λ)

πm(ψ | y) =

∫π(ψ, λ | y)dλ

=

∫exp{`(ψ, λ; y)π(ψ, λ)dλ∫

exp{`(ψ, λ; y)π(ψ, λ)dψdλ

|j(θ̂)| = |jψψ(θ̂)| |jλλ(θ̂)|

LTCC Likelihood Theory Week 2 November 12, 2012 13/36

Nancy
Nancy

Posterior marginal cdf, d = 1

Πm(ψ | y) =

∫ ψ

−∞πm(ξ | y)dξ

.=

∫ ψ

−∞

1(2π)1/2 e`P(ξ)−`P(ξ̂)j1/2

P (ξ̂)π(ξ, λ̂ξ)

π(ξ̂, λ̂)

|jλλ(ξ̂, λ̂)|1/2

|jλλ(ξ, λ̂ξ)|1/2dξ

LTCC Likelihood Theory Week 2 November 12, 2012 14/36

... posterior marginal cdf, d = 1

Πm(ψ | y).

= Φ(r∗B) = Φ{r +1r

log(qB

r)}

r = r(ψ) =

qB = qB(ψ) =

LTCC Likelihood Theory Week 2 November 12, 2012 15/36

Nancy

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k=2

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 16/36

Nancy

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k=2

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 17/36

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k=2

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 18/36

Nancy

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k = 2, 5, 10

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 19/36

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k = 2, 5, 10

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 20/36

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k = 2, 5, 10

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 21/36

2 3 4 5 6 7 8

0.0

0.2

0.4

0.6

0.8

1.0

normal circle, k = 2, 5, 10

ψψ

p−va

lue

LTCC Likelihood Theory Week 2 November 12, 2012 22/36

Posterior marginal and adjusted log-likelihoods

πm(ψ | y).

=1

(2π)d/2 e`P(ξ)−`P(ξ̂)j1/2P (ξ̂)

π(ξ, λ̂ξ)

π(ξ̂, λ̂)

|jλλ(ξ̂, λ̂)|1/2

|jλλ(ξ, λ̂ξ)|1/2

Πm(ψ | y) =

LTCC Likelihood Theory Week 2 November 12, 2012 23/36

Frequentist inference, nuisance parametersI first-order pivotal quantities

I ru(ψ) = `′P(ψ)jP(ψ̂)1/2 .∼ N(0,1),

I re(ψ) = (ψ̂ − ψ)jP(ψ̂)1/2 .∼ N(0,1),

I r(ψ) = sign(ψ̂ − ψ)2{`P(ψ̂)− `P(ψ)} .∼ N(0,1)

I all based on treating profile log-likelihood as aone-parameter log-likelihood

I example y = Xβ + ε, ε ∼ N(0, ψ)

I ψ̂ = (y − X β̂)T (y − X β̂)/n

LTCC Likelihood Theory Week 2 November 12, 2012 25/36

3 4 5 6 7 8

-6-4

-20

ψ1 2

log-likelihood

Nancy

Eliminating nuisance parametersI by using marginal density

I f (y ;ψ, λ) ∝ fm(t1;ψ)fc(t2 | t1;ψ, λ)

I ExampleN(Xβ, σ2I) : f (y ;β, σ2) ∝ fm(RSS;σ2)fc(β̂ | RSS;β, σ2)

I by using conditional density

I f (y ;ψ, λ) ∝ fc(t1 | t2;ψ)fm(t2;ψ, λ)

I ExampleN(Xβ, σ2I) : f (y ;β, σ2) ∝ fc(RSS | β̂;σ2)fm(β̂;β, σ2)

LTCC Likelihood Theory Week 2 November 12, 2012 27/36

Nancy

Linear exponential familiesI conditional density free of nuisance parameterI f (yi ;ψ, λ) = exp{ψT s(yi) + λT t(yi)− k(ψ, λ)}h(yi)

I f (y ;ψ, λ) =

s = t =

I f (s, t ;ψ, λ) =

I f (s | t ;ψ) =

LTCC Likelihood Theory Week 2 November 12, 2012 28/36

Nancy

Saddlepoint approximation in linear exponentialfamilies

I no nuisance parameters f (yi ; θ) = exp{θT s(yi)− k(θ)}h(yi)

I f (s; θ) = exp{θT s − nk(θ)}h̃(s)

I `(θ; s) = θT s − nk(θ)

I f (s; θ).

=

I f (θ̂; θ).

=

LTCC Likelihood Theory Week 2 November 12, 2012 29/36

Nancy

Saddlepoint approximation to conditional densityI f (yi ;ψ, λ) = exp{ψT s(yi) + λT t(yi)− k(ψ, λ)}h(yi)

I f (s | t ;ψ) =

I f (ψ̂ | t ;ψ).

= c|jP(ψ̂)|1/2e`P(ψ)−`P(ψ̂)

SM §12.3

LTCC Likelihood Theory Week 2 November 12, 2012 31/36

Nancy

Approximating distribution functionI f (θ̂; θ)

.= c|j(θ̂)|1/2 exp{`(θ; θ̂)− `(θ̂; θ̂)}

I∫ θ̂−∞ f (ϑ̂; θ)d ϑ̂ .

=

LTCC Likelihood Theory Week 2 November 12, 2012 32/36

Nancy
Nancy

SummaryI No nuisance parameters

I Bayesian p-value Φ(r∗B)I r∗B = r + 1

r log qBr

I Exponential family p-value Φ(r∗)I r∗ = r + 1

r log qr

I Nuisance parametersI Bayesian p-value Φ(r∗B)I r∗B = r + 1

r log qBr

I Exponential family p-value Φ(r∗)I r∗ = r + 1

r log qr

LTCC Likelihood Theory Week 2 November 12, 2012 33/36

Nancy