Inference in High-Dimensional Varying Coefficient Models (slides)

Inference in High-dimensional Varying Coefficient Models Mladen Kolar The University of Chicago Booth School of Business Dec 15, 2014

Page 1: Inference in High-Dimensional Varying Coefficient Models (slides)

Inference in High-dimensional Varying Coefficient Models

Mladen Kolar

The University of Chicago Booth School of Business

Dec 15, 2014

Page 2: Inference in High-Dimensional Varying Coefficient Models (slides)

Acknowledgments

D. Kozbur

M. Kolar (Chicago Booth) Inference in High-dimensional Varying Coefficient Models December 15, 2014 2

Page 3: Inference in High-Dimensional Varying Coefficient Models (slides)

Varying-coefficient Model

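\[ y_i = x_i^\top \beta(u_i) + \sigma(u_i, x_i)\,\varepsilon_i, \quad \varepsilon_i \sim F_i, \quad E[\varepsilon_i \mid x_i, u_i] = 0, \quad E[\varepsilon_i^2] = 1, \quad i = 1, \dots, n \]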

High dimensional setting: $X_i \in \mathbb{R}^p$, $p \gg n$

Approximate sparsity

\[ E[y_i \mid x_i = x, u_i = u] \approx x_{S(u)}^\top \beta_{S(u)}(u), \]

where $S(u) \subset [p]$ and $|S(u)| \le s \ll n$

Our goal: constructing confidence bands for βj(u)

Page 4: Inference in High-Dimensional Varying Coefficient Models (slides)

Varying-coefficient Model

Widely used

economics, finance, medical science, ecology

Flexible modeling

less restrictive assumptions

domain scientists have prior knowledge that can be used

Interpretable

for each value of the index parameter, one has a parametric model

Page 5: Inference in High-Dimensional Varying Coefficient Models (slides)

Confidence Intervals/Bands

Page 6: Inference in High-Dimensional Varying Coefficient Models (slides)

Local Linear Lasso

For a fixed point u, we estimate β(u) as

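\[ \begin{pmatrix} \hat\beta(u) \\ \hat\delta(u) \end{pmatrix} = \arg\min_{\beta, \delta \in \mathbb{R}^p} \frac{1}{2n} \sum_{i \in [n]} K_h(u_i - u) \left( y_i - x_i^\top \beta - x_i^\top \delta\,(u_i - u) \right)^2 + \lambda \sum_{j \in [p]} \left( \hat\sigma_{1j} |\beta_j| + \hat\sigma_{2j} |\delta_j| \right) \]

where

\[ \hat\sigma_{1j}^2 = n^{-1} \sum_{i \in [n]} K_h(u_i - u)\, x_{ij}^2 \left( y_i - x_i^\top \beta(u_i) \right)^2 \]

estimates the variance of the score vector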
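The penalized local linear objective can be minimized by cyclic coordinate descent with soft-thresholding. The following is a minimal sketch under stated assumptions, not the speaker's implementation: the Epanechnikov kernel is an assumed choice (the slides do not name a kernel), the penalty loadings are passed in as a plain list (default 1 for every coordinate) rather than estimated from the data, and the names `local_linear_lasso`, `loadings` and `sweeps` are hypothetical.

```python
import random

def epanechnikov(t):
    """Epanechnikov kernel K(t); an assumed choice, not taken from the slides."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def soft_threshold(a, t):
    """Soft-thresholding operator S(a, t) = sign(a) * max(|a| - t, 0)."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0

def local_linear_lasso(X, y, u, u0, h, lam, loadings=None, sweeps=200):
    """Minimize (1/2n) sum_i K_h(u_i-u0) (y_i - x_i'b - x_i'd (u_i-u0))^2
    + lam * sum_k loadings[k] * |theta_k| over theta = (b, d)."""
    n, p = len(X), len(X[0])
    w = [epanechnikov((ui - u0) / h) / h for ui in u]          # K_h(u_i - u0)
    # Augmented design row z_i = (x_i, x_i * (u_i - u0)).
    Z = [list(xi) + [v * (ui - u0) for v in xi] for xi, ui in zip(X, u)]
    pen = loadings if loadings is not None else [1.0] * (2 * p)
    theta = [0.0] * (2 * p)
    resid = list(y)                                            # y - Z theta
    curv = [sum(w[i] * Z[i][k] ** 2 for i in range(n)) / n for k in range(2 * p)]
    for _ in range(sweeps):
        for k in range(2 * p):
            if curv[k] == 0.0:
                continue                                       # no data in the window
            rho = sum(w[i] * Z[i][k] * (resid[i] + Z[i][k] * theta[k])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam * pen[k]) / curv[k]
            step = new - theta[k]
            if step != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][k] * step
                theta[k] = new
    return theta[:p], theta[p:]

# Noise-free toy data with a constant coefficient vector (2, 0).
rng = random.Random(0)
U = [rng.random() for _ in range(200)]
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
y = [2.0 * xi[0] for xi in X]
beta_hat, delta_hat = local_linear_lasso(X, y, U, u0=0.5, h=0.3, lam=0.0)
beta_big, delta_big = local_linear_lasso(X, y, U, u0=0.5, h=0.3, lam=1e6)
```

With `lam=0.0` the routine reduces to kernel-weighted least squares, and with a very large `lam` every coordinate is thresholded to zero; the interesting regime lies between the two.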

Page 7: Inference in High-Dimensional Varying Coefficient Models (slides)

Naive Confidence Bands

An idea

Use the local linear lasso to select the model

Use the selected components and refit the model

Construct confidence bands using the results of Fan and Zhang (2000)

Issues

not uniformly valid

hinges on correct model selection

requires stringent design conditions and

Page 8: Inference in High-Dimensional Varying Coefficient Models (slides)

Example

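\[ Y_i = 4 X_i U_i (1 - U_i) + Z_{i1}\sqrt{U_i}/2 + Z_{i2}\sqrt{U_i}/4 + Z_{i3}\sqrt{U_i}/8 + Z_{i4}\sqrt{U_i}/16 + Z_{i5}(1 - U_i)/2 + Z_{i6}(1 - U_i)/4 + Z_{i7}(1 - U_i)/8 + Z_{i8}(1 - U_i)/16 + \varepsilon_i \]

\[ X_i = Z_{i1}\sqrt{U_i}/2 + Z_{i2}\sqrt{U_i}/4 + Z_{i3}\sqrt{U_i}/8 + Z_{i4}\sqrt{U_i}/16 + Z_{i5}(1 - U_i)/2 + Z_{i6}(1 - U_i)/2 + Z_{i7}(1 - U_i)/8 + Z_{i8}(1 - U_i)/16 + \sigma_x \xi_i \]

$\varepsilon_i, \xi_i \sim N(0, 1)$, $Z_i \sim N_p(0, I)$, $p = 50$, $n = 200$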
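A short simulation of this design may help when experimenting with the estimators. It is a sketch under assumptions: the slide does not state the distribution of the index variable, so Uniform(0, 1) is assumed here; the square root is read as applying to U_i alone; and the coefficient lists are copied from the slide as printed. The names `simulate` and `alpha` are hypothetical.

```python
import math
import random

def alpha(u):
    """Coefficient on X_i in the outcome equation: alpha(u) = 4u(1 - u)."""
    return 4.0 * u * (1.0 - u)

# Weights on sqrt(U_i) * Z_i1..Z_i4 (same in both equations).
ROOT_W = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
# Weights on (1 - U_i) * Z_i5..Z_i8 in the Y equation.
LIN_W_Y = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
# Weights on (1 - U_i) * Z_i5..Z_i8 in the X equation, as printed on the slide.
LIN_W_X = [1 / 2, 1 / 2, 1 / 8, 1 / 16]

def simulate(n=200, p=50, sigma_x=0.5, seed=0):
    """Draw (U, Z, X, Y) from the example design; p must be at least 8."""
    rng = random.Random(seed)
    U, Z, X, Y = [], [], [], []
    for _ in range(n):
        u = rng.random()                      # assumption: U_i ~ Uniform(0, 1)
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        root_part = math.sqrt(u) * sum(c * z[k] for k, c in enumerate(ROOT_W))
        lin_x = (1.0 - u) * sum(c * z[4 + k] for k, c in enumerate(LIN_W_X))
        lin_y = (1.0 - u) * sum(c * z[4 + k] for k, c in enumerate(LIN_W_Y))
        x = root_part + lin_x + sigma_x * rng.gauss(0.0, 1.0)
        y = alpha(u) * x + root_part + lin_y + rng.gauss(0.0, 1.0)
        U.append(u)
        Z.append(z)
        X.append(x)
        Y.append(y)
    return U, Z, X, Y

U, Z, X, Y = simulate(n=200, p=50, sigma_x=0.5, seed=0)
```

Only the first eight components of Z_i enter either equation, so the remaining 42 are pure noise controls; a smaller `sigma_x` makes X_i harder to separate from the controls.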

Page 9: Inference in High-Dimensional Varying Coefficient Models (slides)

Example (cont'd)

[Figure: α(u) plotted against u on [0, 1], with one row of panels for σx = 0.5 and one for σx = 1. Panels in each row: Post-Double-Selection Estimator, Post-Single-Selection Estimator, Oracle Estimator.]

Page 10: Inference in High-Dimensional Varying Coefficient Models (slides)

Example (cont'd)

[Figure: panels with horizontal axis α(0.5), with one row for σx = 0.5 and one for σx = 1. Panels in each row: Post-Double-Selection Estimator, Post-Single-Selection Estimator, Oracle Estimator.]

Page 11: Inference in High-Dimensional Varying Coefficient Models (slides)

Example (cont'd)

Confidence Interval at u = 0.5

                         σx = 0.5             σx = 1
n                        100   200   300      100   200   300
Post-Double-Selection    861   907   927      876   872   898
Post-Single-Selection    752   653   574      861   845   866
Oracle                   934   949   944      933   945   944

Confidence Band

                         σx = 0.5             σx = 1
n                        100   200   300      100   200   300
Post-Double-Selection    770   875   915      834   816   825
Post-Single-Selection    585   425   395      785   750   716
Oracle                   780   940   980      855   940   964

Page 12: Inference in High-Dimensional Varying Coefficient Models (slides)

This Talk

Question: How to construct valid confidence intervals in high-dimensional varying-coefficient models?

Requirements:

robust against model selection mistakes

valid for a wide range of data generating processes

Page 13: Inference in High-Dimensional Varying Coefficient Models (slides)

Outline

1 Recent developments

2 Post-double selection estimator

3 An application to inference in graphical models

Page 14: Inference in High-Dimensional Varying Coefficient Models (slides)

Recent Developments

Inference in high-dimensional linear and generalized linear models

least squares regression: Zhang and Zhang (2013), Belloni et al. (2013a), Javanmard and Montanari (2014)

generalized linear models: van de Geer et al. (2014), Belloni et al. (2013d)

LAD and QR: Belloni et al. (2013c), Belloni et al. (2013b)

Gaussian graphical models

Ren et al. (2013), Chen et al. (2013), Jankova and van de Geer (2014)

Page 15: Inference in High-Dimensional Varying Coefficient Models (slides)

Recent Developments

Selective inference

along the path: Lockhart et al. (2014)

fixed λ: Lee et al. (2013), Taylor et al. (2014)

Other

sample splitting: Wasserman and Roeder (2009), Meinshausen et al. (2009)

stability selection: Meinshausen and Buhlmann (2010), Shah and Samworth (2013)

FDR control: Foygel Barber and Candes (2014)

POSI: Berk et al. (2013)

...

Page 16: Inference in High-Dimensional Varying Coefficient Models (slides)

Outline

1 Recent developments

2 Post-double selection estimator

3 An application to inference in graphical models

Page 17: Inference in High-Dimensional Varying Coefficient Models (slides)

Post-Double-Selection Estimator

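Step 1: Regress Y onto X using local linear lasso

Obtain set of relevant predictors $\hat S_1(u)$

Step 2: Regress $X_j$ onto $X_{-j}$ using local linear lasso

Obtain set of relevant predictors $\hat S_2(u)$

Step 3: $\hat S_j(u) = \{j\} \cup \hat S_1(u) \cup \hat S_2(u)$

\[ \begin{pmatrix} \hat\beta(u) \\ \hat\delta(u) \end{pmatrix} = \arg\min_{\beta, \delta} \frac{1}{2n} \sum_{i \in [n]} K_h(u_i - u) \left( y_i - x_{i,\hat S_j(u)}^\top \beta - x_{i,\hat S_j(u)}^\top \delta\,(u_i - u) \right)^2 \]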
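Step 3 is an unpenalized local linear least-squares fit restricted to the union of the two selected sets. The sketch below illustrates only that step and rests on assumptions: the supports `S1` and `S2` are taken as given (in the demo they are hard-coded, hypothetical outputs of Steps 1 and 2), the kernel is assumed to be Epanechnikov, and the normal equations are solved with a small hand-written Gaussian elimination. The names `refit_on_union` and `solve` are hypothetical.

```python
import random

def epanechnikov(t):
    """Epanechnikov kernel K(t); an assumed choice, not taken from the slides."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - tail) / M[r][r]
    return x

def refit_on_union(X, y, u, u0, h, j, S1, S2):
    """Local linear least squares at u0 on columns {j} | S1 | S2.

    Returns a dict mapping each selected column to its level coefficient."""
    S = sorted({j} | set(S1) | set(S2))
    w = [epanechnikov((ui - u0) / h) / h for ui in u]          # K_h(u_i - u0)
    # Row z_i = (x_{i,S}, x_{i,S} * (u_i - u0)).
    Z = [[xi[k] for k in S] + [xi[k] * (ui - u0) for k in S]
         for xi, ui in zip(X, u)]
    q = len(Z[0])
    G = [[sum(wi * zi[a] * zi[b] for wi, zi in zip(w, Z)) for b in range(q)]
         for a in range(q)]
    r = [sum(wi * zi[a] * yi for wi, zi, yi in zip(w, Z, y)) for a in range(q)]
    theta = solve(G, r)                                        # weighted normal equations
    return dict(zip(S, theta[:len(S)]))

# Noise-free toy data: y depends on columns 0 and 3 only.
rng = random.Random(1)
n, p = 300, 5
U = [rng.random() for _ in range(n)]
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [1.5 * xi[0] + 0.5 * xi[3] for xi in X]
beta_refit = refit_on_union(X, y, U, u0=0.5, h=0.3, j=0, S1=[3], S2=[])
```

Including `j` in the union regardless of the selection outcome is what keeps the target coefficient in the refit even when neither lasso step picks it up.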

Page 18: Inference in High-Dimensional Varying Coefficient Models (slides)

Confidence Intervals and Bands

Confidence interval at a point u

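\[ \sigma^{-1}(\hat\beta_j(u)) \left( \hat\beta_j(u) - \beta_j(u) - \mathrm{bias}(\hat\beta_j(u)) \right) \to_D N(0, 1) \]

where

\[ \sigma^2(\hat\beta_j(u)) = \left( n h f(u)\, E[X_S X_S^\top \mid U = u] \right)^{-1}_{jj} \left( \int K^2(t)\,dt \right) \sigma^2(u) \]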

Confidence band

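\[ P\left( (-2 \log h)^{1/2} \left( \sup_{u \in [0,1]} \frac{\left| \hat\beta_j(u) - \beta_j(u) - \mathrm{bias}(\hat\beta_j(u)) \right|}{\{\sigma^2(\hat\beta_j(u))\}^{1/2}} - d_{v,n} \right) < x \right) \to \exp(-2 \exp(-x)) \]

where $d_{v,n} = (-2 \log h)^{1/2} + C(K) / (-2 \log h)^{1/2}$ and $C(K)$ is a constant that depends on the kernel $K$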
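Inverting the limiting distribution gives a critical value for the band: setting exp(-2 exp(-x)) = 1 - alpha yields x = -log(-log(1 - alpha) / 2), and the standardized supremum is then compared with d_{v,n} + x / (-2 log h)^{1/2}. The sketch below carries out this arithmetic; the kernel constant is passed in as `C_K` because the slide does not give its value, and the function names are hypothetical.

```python
import math

def gumbel_type_quantile(alpha):
    """Solve exp(-2 * exp(-x)) = 1 - alpha for x."""
    return -math.log(-0.5 * math.log(1.0 - alpha))

def band_critical_value(alpha, h, C_K):
    """Multiplier c such that the band is estimate - bias +/- c * std. error.

    Requires 0 < h < 1 so that -2 log h > 0. C_K stands for the kernel
    constant C(K) from the slide; its value is an input, not derived here."""
    scale = math.sqrt(-2.0 * math.log(h))
    d_vn = scale + C_K / scale
    return d_vn + gumbel_type_quantile(alpha) / scale

x95 = gumbel_type_quantile(0.05)
crit = band_critical_value(0.05, h=0.2, C_K=0.0)   # C_K = 0.0 is a placeholder
```

The multiplier grows as the bandwidth shrinks, which reflects that a smaller h means more effectively independent locations over which the supremum is taken.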

Page 19: Inference in High-Dimensional Varying Coefficient Models (slides)

What do we need from Lasso?

Prediction bound

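\[ \sup_{u \in [0,1]} \left\| X^\top \left( \hat\beta(u) - \beta(u) \right) \right\|_2 \lesssim_P \sqrt{\frac{s\,(\log p + \log h^{-1})}{nh}} \]

Estimation bound

\[ \sup_{u \in [0,1]} \left\| \hat\beta(u) - \beta(u) \right\|_1 \lesssim_P s \sqrt{\frac{\log p + \log h^{-1}}{nh}} \]

Size of the estimated support

\[ \sup_{u \in [0,1]} |\hat S(u)| \le_P c\,s \]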

Page 20: Inference in High-Dimensional Varying Coefficient Models (slides)

Conditions for Kernel-Lasso

Key Ingredients for Lasso Bounds:

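Strong Convexity $\Longrightarrow$ Local sparse eigenvalue condition. For

\[ A(u) = \frac{1}{n h f(u)} \sum_{i \in [n]} K_h(u_i - u)\, x_i x_i^\top, \]

\[ \kappa_1(C) \le \phi_{\min}(C \cdot s)(A(u)) \le \phi_{\max}(C \cdot s)(A(u)) \le \kappa_2(C) \]

with probability $1 - o(1)$.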

Score Domination:

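\[ \lambda(u) \ge c\, \frac{1}{\sigma_{1j}^2}\, n^{-1} \sum_{i \in [n]} K_h(u_i - u)\, x_{ij} \left( y_i - x_i^\top \beta(u) \right) \]

uniformly in $u \in [0,1]$ and for every $1 \le j \le p$, with probability $1 - o(1)$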

Penalty Loading Quality: For $\ell \to_P 1$ and $u = O_P(1)$,

\[ \ell\, \sigma_{1j}^2 \le \hat\sigma_{1j}^2 \le u\, \sigma_{1j}^2 \quad \text{for every } 1 \le j \le p \]

Page 21: Inference in High-Dimensional Varying Coefficient Models (slides)

Conditions for Kernel-Lasso

Conditions for Lasso: With probability 1− o(1)

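$E[K_h(u - u_i)\, x_{ij}^2 (y_i - x_i^\top \beta(u))^2]$ is bounded from above and away from zero uniformly in $n$ and $u \in [0, 1]$.

$\max_{1 \le j \le p} \sigma_{1j}(u) / \min_{1 \le j \le p} \sigma_{1j}(u) = O(1)$ uniformly in $u \in [0, 1]$

$\max_{1 \le j \le p} |\hat\sigma_{1j}(u) - \sigma_{1j}(u)| / \sigma_{1j}(u) = o(1)$ uniformly in $u \in [0, 1]$

$\max_{1 \le j \le p} \left( n^{-1} \sum_{i \in [n]} K_h(u_i - u)\, x_{ij}^3 (y_i - x_i^\top \beta(u))^3 \right)^{1/3} / \sigma_{1j}(u) = O(1)$ uniformly in $u \in [0, 1]$

$\log^3 p = o(nh)$ and $s \log(\max\{p, n\}) = o(nh)$

Sparse eigenvalue and penalty loading quality conditions are satisfied.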

Page 22: Inference in High-Dimensional Varying Coefficient Models (slides)

Outline

1 Recent developments

2 Post-double selection estimator

3 An application to inference in graphical models

Page 23: Inference in High-Dimensional Varying Coefficient Models (slides)

Application to Inference in Gaussian graphical models

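Model: $X \mid U = u \sim N(\mu(u), \Sigma(u))$

Let $\Omega(u) = \Sigma^{-1}(u) = (\omega_{ab}(u))_{(a,b) \in [p] \times [p]}$. For $I = \{a, b\}$ and $J = [p] \setminus I$,

\[ \Omega_{II}(u) = \left( \Sigma_{II}(u) - \Sigma_{IJ}(u) \Sigma_{JJ}^{-1}(u) \Sigma_{JI}(u) \right)^{-1} =: \begin{pmatrix} \theta_{aa}(u) & \theta_{ab}(u) \\ \theta_{ba}(u) & \theta_{bb}(u) \end{pmatrix}^{-1}, \]

\[ \theta_{aa}(u) = \sigma_{aa}(u) - \gamma_a^\top(u) \Sigma_{JJ}(u) \gamma_a(u), \]

\[ \theta_{bb}(u) = \sigma_{bb}(u) - \gamma_b^\top(u) \Sigma_{JJ}(u) \gamma_b(u), \]

\[ \theta_{ab}(u) = \sigma_{ab}(u) - \gamma_a^\top(u) \Sigma_{JJ}(u) \gamma_b(u), \]

where $\gamma_a(u) = \Sigma_{JJ}^{-1}(u) \Sigma_{Ja}(u)$ are the coefficients in the linear regression of $X_a$ onto $X_J$ given $U = u$.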

Page 24: Inference in High-Dimensional Varying Coefficient Models (slides)

Application to Inference in Gaussian graphical models

Estimate the Markov blanket of Xa, Xb

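\[ \hat J(u) = \mathrm{supp}(\hat\gamma_a(u)) \cup \mathrm{supp}(\hat\gamma_b(u)) \]

Define

\[ \hat\theta_{ab}(u) = \hat\sigma_{ab}(u) - \hat\Sigma_{a \hat J(u)}(u) \left( \hat\Sigma_{\hat J(u) \hat J(u)}(u) \right)^{-1} \hat\Sigma_{\hat J(u) b}(u) \]

and similarly $\hat\theta_{aa}(u)$ and $\hat\theta_{bb}(u)$.

The estimator of $\Omega_{II}(u)$ is

\[ \hat\Omega_{II}(u) = \begin{pmatrix} \hat\theta_{aa}(u) & \hat\theta_{ab}(u) \\ \hat\theta_{ba}(u) & \hat\theta_{bb}(u) \end{pmatrix}^{-1}. \]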
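The 2 x 2 block of the precision matrix is a Schur complement, so it can be computed from a covariance matrix and an index set J without inverting the full matrix. The sketch below checks this on a three-node chain whose precision matrix is known in closed form. It is an illustration under assumptions: the function takes any covariance matrix as input (in the estimator above this would be a kernel-smoothed estimate at u with the estimated set J(u), neither of which is computed here), and the names `omega_block` and `solve` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        piv = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - tail) / M[r][r]
    return x

def omega_block(Sigma, a, b, J):
    """Return the 2x2 matrix (Sigma_II - Sigma_IJ Sigma_JJ^{-1} Sigma_JI)^{-1}, I = (a, b)."""
    I, J = [a, b], list(J)
    S_JJ = [[Sigma[r][c] for c in J] for r in J]
    # gamma_k = Sigma_JJ^{-1} Sigma_Jk: regression coefficients of X_k on X_J.
    gammas = [solve(S_JJ, [Sigma[r][k] for r in J]) if J else [] for k in I]
    theta = [[Sigma[I[r]][I[c]]
              - sum(Sigma[I[r]][J[m]] * gammas[c][m] for m in range(len(J)))
              for c in range(2)] for r in range(2)]
    det = theta[0][0] * theta[1][1] - theta[0][1] * theta[1][0]
    return [[theta[1][1] / det, -theta[0][1] / det],
            [-theta[1][0] / det, theta[0][0] / det]]

# Covariance of a 3-node chain; its inverse is [[2,-1,0],[-1,2,-1],[0,-1,2]].
Sigma = [[0.75, 0.50, 0.25],
         [0.50, 1.00, 0.50],
         [0.25, 0.50, 0.75]]
omega_02 = omega_block(Sigma, 0, 2, [1])   # nodes 0 and 2 are not adjacent
omega_01 = omega_block(Sigma, 0, 1, [2])   # nodes 0 and 1 share an edge
```

In the chain, the off-diagonal entry for the non-adjacent pair (0, 2) comes out as zero and the entry for the adjacent pair (0, 1) as -1, matching the corresponding entries of the full precision matrix.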

Page 25: Inference in High-Dimensional Varying Coefficient Models (slides)

Example – chain graph (p = 200)

[Figure: edge value plotted against u on [0, 1], with one row of panels for n = 500 and one for n = 1000. Panels in each row: Post-Double-Selection Estimator, Oracle Estimator.]

Page 26: Inference in High-Dimensional Varying Coefficient Models (slides)

Improvements

Multiplier bootstrap instead of asymptotic theory

Hypothesis testing for more than one component of the unknown parameter vector

Page 27: Inference in High-Dimensional Varying Coefficient Models (slides)

Thank you!

Page 28: Inference in High-Dimensional Varying Coefficient Models (slides)

References I

A. Belloni, V. Chernozhukov, and C. B. Hansen. Inference on treatment effects after selection amongst high-dimensional controls. Rev. Econ. Stud., 81(2):608–650, Nov 2013a.

A. Belloni, V. Chernozhukov, and K. Kato. Robust inference in high-dimensional approximately sparse quantile regression models. arXiv preprint arXiv:1312.7186, December 2013b.

A. Belloni, V. Chernozhukov, and K. Kato. Uniform post selection inference for lad regression models. arXiv preprint arXiv:1304.0282, 2013c.

A. Belloni, V. Chernozhukov, and Y. Wei. Honest confidence regions for logistic regression with a large number of controls. arXiv preprint arXiv:1304.3969, 2013d.

R. Berk, L. D. Brown, A. Buja, K. Zhang, and L. Zhao. Valid post-selection inference. Ann. Stat., 41(2):802–837, 2013.

Page 29: Inference in High-Dimensional Varying Coefficient Models (slides)

References II

M. Chen, Z. Ren, H. Zhao, and H. H. Zhou. Asymptotically normal and efficient estimation of covariate-adjusted gaussian graphical model. arXiv preprint arXiv:1309.5923, 2013.

J. Fan and W. Zhang. Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand. J. Stat., 27(4):715–731, Dec 2000.

R. Foygel Barber and E. J. Candes. Controlling the false discovery rate via knockoffs. ArXiv e-prints, arXiv:1404.5609, April 2014.

J. Jankova and S. A. van de Geer. Confidence intervals for high-dimensional inverse covariance estimation. ArXiv e-prints, arXiv:1403.6752, March 2014.

A. Javanmard and A. Montanari. Confidence intervals and hypothesis testing for high-dimensional regression. J. Mach. Learn. Res., 15(Oct):2869–2909, 2014.

Page 30: Inference in High-Dimensional Varying Coefficient Models (slides)

References III

J. D. Lee, D. L. Sun, Y. Sun, and J. E. Taylor. Exact post-selection inference with the lasso. ArXiv e-prints, arXiv:1311.6238, November 2013.

R. Lockhart, J. E. Taylor, R. J. Tibshirani, and R. J. Tibshirani. A significance test for the lasso. Ann. Stat., 42(2):413–468, 2014.

N. Meinshausen and P. Buhlmann. Stability selection. J. R. Stat. Soc. B, 72(4):417–473, 2010.

N. Meinshausen, L. Meier, and P. Buhlmann. P-values for high-dimensional regression. J. Am. Stat. Assoc., 104(488), 2009.

Z. Ren, T. Sun, C.-H. Zhang, and H. H. Zhou. Asymptotic normality and optimalities in estimation of large gaussian graphical model. arXiv preprint arXiv:1309.6024, 2013.

R. D. Shah and R. J. Samworth. Variable selection with error control: another look at stability selection. J. R. Stat. Soc. B, 75(1):55–80, 2013.

Page 31: Inference in High-Dimensional Varying Coefficient Models (slides)

References IV

J. E. Taylor, R. Lockhart, R. J. Tibshirani, and R. J. Tibshirani. Post-selection adaptive inference for least angle regression and the lasso. arXiv preprint arXiv:1401.3889, January 2014.

S. A. van de Geer, P. Buhlmann, Y. Ritov, and R. Dezeure. On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Stat., 42(3):1166–1202, Jun 2014.

L. A. Wasserman and K. Roeder. High-dimensional variable selection. Ann. Stat., 37(5A):2178–2201, 2009.

C.-H. Zhang and S. S. Zhang. Confidence intervals for low dimensional parameters in high dimensional linear models. J. R. Stat. Soc. B, 76(1):217–242, Jul 2013.
