Optimal Sup-norm Rate, Adaptive Estimation, and Inference on NPIV
Xiaohong Chen (Yale) and Tim Christensen (NYU)
Cemmap Celebration Conference | Andrew’s Birthday Conference, November 14–16, 2014
Introduction (1)
- we consider nonparametric instrumental variables (NPIV) regression:

  Yi = h0(Xi) + ui
  E[ui | Xi] ≠ 0
  E[ui | Wi] = 0

- endogeneity is an important issue in economics
- being nonparametric in h0 avoids functional form misspecification
- h0 is identified via the conditional moment restriction

  E[Yi | Wi] = E[h0(Xi) | Wi]

- this “smoothes out” features of h0, making h0 difficult to recover
- NPIV is an ill-posed inverse problem with unknown operator
Introduction (2)
- there is a large and growing literature on NPIV:
  1. identification/consistency: Newey & Powell (03); Carrasco, Florens & Renault (07); Andrews (11); ...
  2. convergence rates in L2 norm: Hall & Horowitz (05); Blundell, Chen & Kristensen (BCK, 07); Chen & Reiß (11); Darolles, Fan, Florens & Renault (11); ...
  3. almost rate-adaptive estimation in L2: Horowitz (14)
  4. almost rate-adaptive estimation of linear functionals: Breunig & Johannes (13)
  5. inference on linear functionals of h0: Ai & Chen (AC, 03, 07); Carrasco, Florens & Renault (07); Horowitz & Lee (13)
  6. inference on nonlinear functionals of h0: Chen & Pouzo (14)
  7. testing: Horowitz (12); Canay, Santos & Shaikh (13); Breunig (13); ...
  8. partial identification: Santos (12); Freyberger & Horowitz (13); ...
- all existing published results on NPIV are based on the L2 norm.
Contributions of this paper

1. we derive upper bounds on sup-norm convergence rates for general sieve NPIV estimators.
2. we derive minimax lower bounds in sup-norm loss over Hölder classes of functions for NPIR (nonparametric indirect regression) and NPIV.
3. we show that spline and wavelet sieve NPIV estimators attain the sup-norm minimax lower bounds, and hence attain the optimal sup-norm convergence rates.
4. we introduce a data-driven procedure for choosing the dimension of the sieve NPIV that is sup-norm rate-adaptive.
5. we provide inference theory for plug-in sieve NPIV estimators of nonlinear functionals of h0 under mild conditions.

- an application: inference on exact consumer surplus in nonparametric demand estimation when both price and income are endogenous.
Parametric vs nonparametric IV
- parametric IV model

  Yi = Xi'β0 + ui
  E[ui Xi] ≠ 0
  E[ui Wi] = 0

- identified if rank(E[Xi Wi']) = dim(β0)

- nonparametric IV model

  Yi = h0(Xi) + ui
  E[ui | Xi] ≠ 0
  E[ui | Wi] = 0

- identified if h ↦ E[h(Xi) | Wi = ·] is injective
Parametric vs sieve nonparametric IV
- a parametric IV model can be estimated via 2SLS:

  β̂ = [X'W(W'W)^{-1}W'X]^{-1} [X'W(W'W)^{-1}W'Y]

- NP (03), AC (03), BCK (07): a nonparametric IV model can be estimated via sieve NPIV, i.e., 2SLS on basis functions:

  ĥ(x) = ψ^J(x)'ĉ
  ĉ = [Ψ'B(B'B)^{-1}B'Ψ]^{-1} Ψ'B(B'B)^{-1}B'Y
  ψ^J(x) = (ψ_{J1}(x), . . . , ψ_{JJ}(x))',  Ψ = (ψ^J(X1), . . . , ψ^J(Xn))'
  b^K(w) = (b_{K1}(w), . . . , b_{KK}(w))',  B = (b^K(W1), . . . , b^K(Wn))'

- K ≥ J, with J = sieve number of endogenous regressors (the key smoothing parameter), K = sieve number of instruments
- Horowitz (11): modified sieve NPIV: K = J and b^K = ψ^J = orthonormal series of L2([0, 1]^d)
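For concreteness, a minimal numerical sketch of the sieve NPIV estimator above (Psi and B are assumed pre-built basis matrices; an illustration, not the authors' code):

```python
# Minimal sketch of sieve NPIV: 2SLS on basis functions.
# Psi (n x J) and B (n x K) are assumed pre-built basis matrices,
# e.g. B-splines evaluated at the X_i and W_i respectively.
import numpy as np

def sieve_npiv(Y, Psi, B):
    """Return c_hat = [Psi' P_B Psi]^{-1} Psi' P_B Y, where
    P_B = B (B'B)^{-1} B' projects onto the instrument sieve."""
    PB_Psi = B @ np.linalg.solve(B.T @ B, B.T @ Psi)   # P_B Psi
    c_hat = np.linalg.solve(Psi.T @ PB_Psi, PB_Psi.T @ Y)
    return c_hat                                       # h_hat(x) = psi^J(x)' c_hat
```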
Outline
1. Optimal sup-norm rates
2. Sup-norm rate-adaptive estimation
3. MC study I: Adaptive estimation procedure
4. Application: Asymptotic normality of plug-in NPIV estimators of nonlinear functionals
5. MC study II: Bootstrap uniform confidence sets
Preliminaries: measuring ill-posedness
- let Π_K : L2(W) → B_K denote the orthogonal projection onto the sieve space B_K spanned by b_{K1}, . . . , b_{KK}
- weak norm ‖h‖_{w,2} = ‖Π_K Th‖_{L2(W)}, where Th(Wi) = E[h(Xi)|Wi]
- BCK (07): sieve measure of ill-posedness

  s_{JK}^{-1} = sup_{h ∈ Ψ_J : ‖h‖_{w,2} ≠ 0} ‖h‖_{L2(X)} / ‖h‖_{w,2} = 1 / s_min( G_ψ^{-1/2} S' G_b^{-1/2} )

  where G_b = G_{b,K} = E[b^K(Wi) b^K(Wi)'], G_ψ = G_{ψ,J} = E[ψ^J(Xi) ψ^J(Xi)'], S' = S'_{JK} = E[ψ^J(Xi) b^K(Wi)']

- the NPIV model is said to be
  - mildly ill-posed if s_{JK}^{-1} = O(J^{ς/d}) for some ς > 0
  - severely ill-posed if s_{JK}^{-1} = O(exp((1/2) J^{ς/d})) for some ς > 0
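The sample analogue ŝ_{JK} used in the adaptive procedure later can be computed directly from the basis matrices; a minimal sketch (helper names are ours):

```python
import numpy as np

def inv_sqrt(A):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def s_hat_JK(Psi, B):
    """Sample sieve measure of ill-posedness:
    s_hat = s_min((Psi'Psi)^{-1/2} (Psi'B) (B'B)^{-1/2});
    its reciprocal estimates s_{JK}^{-1}."""
    M = inv_sqrt(Psi.T @ Psi) @ (Psi.T @ B) @ inv_sqrt(B.T @ B)
    return np.linalg.svd(M, compute_uv=False).min()
```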
Preliminaries: roughness properties of the sieve
- following Newey (97), we define ζ(J) = ζ_b(K) ∨ ζ_ψ(J), where

  ζ_b(K) := sup_w ‖G_b^{-1/2} b^K(w)‖_{ℓ2}
  ζ_ψ(J) := sup_x ‖G_ψ^{-1/2} ψ^J(x)‖_{ℓ2}

- we also introduce

  ξ_ψ(J) := sup_x ‖ψ^J(x)‖_{ℓ1}

  which is better suited to studying sup-norm rates
- the sup-norm variance term depends on ξ_ψ(J), e_J = λ_min(G_{ψ,J}), and s_{JK}^{-1}
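On a fine grid of evaluation points these constants are straightforward to approximate; a sketch (the grid approximation is our simplification, not part of the slides):

```python
import numpy as np

def basis_constants(psi_grid, Gpsi):
    """Grid approximations: zeta_psi(J) = sup_x ||G_psi^{-1/2} psi^J(x)||_2,
    xi_psi(J) = sup_x ||psi^J(x)||_1, and e_J = lambda_min(G_psi).
    psi_grid is an (m x J) matrix of basis values on a fine x-grid."""
    Ginv = np.linalg.inv(Gpsi)
    zeta = np.sqrt(np.einsum('ij,jk,ik->i', psi_grid, Ginv, psi_grid)).max()
    xi = np.abs(psi_grid).sum(axis=1).max()
    e_J = np.linalg.eigvalsh(Gpsi).min()
    return zeta, xi, e_J
```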
Assumptions imposed for sup-norm rate
1. (i) (Xi, Yi, Wi), i = 1, . . . , n, is an i.i.d. sample; (ii) X has compact support 𝒳 ⊂ R^d with nonempty interior; W has support 𝒲 ⊂ R^{dw}; (iii) sup_x |h0(x)| < ∞; (iv) h ↦ E[h(X)|W = ·] is injective on L∞(𝒳)
2. (i) sup_w E[ui² | Wi = w] ≤ σ̄²; (ii) E[|ui|^{2+δ}] < ∞ for some δ > 0
3. (i) λ_min(G_{b,K}) > 0; e_J = λ_min(G_{ψ,J}) > 0; J ≤ K; (ii) s_{JK}^{-1} ζ(J) √((J log J)/n) = o(1); (iii) ζ_b(K)^{(2+δ)/δ} √((log J)/n) = o(1)
4. there exists π_J h0 ∈ Ψ_J such that: (i) ‖h0 − π_J h0‖_∞ ≤ C* J^{−p/d}; (ii) s_{JK}^{-1} ‖h0 − π_J h0‖_{w,2} ≤ C*_2 ‖h0 − π_J h0‖_{L2(X)}; (iii) ‖Q_J(h0 − π_J h0)‖_∞ ≤ C*_∞ ‖h0 − π_J h0‖_∞, with Q_J : L2(X) → Ψ_J the oblique projection Q_J h(x) = ψ^J(x)'[S'G_b^{-1}S]^{-1} S'G_b^{-1} E[b^K(Wi) h(Xi)]
Upper bound (1)
Theorem (Upper bound for NPIV)
Let Assumptions 1–4 hold. Then:

  ‖ĥ − h0‖_∞ = O_p( J^{−p/d} + s_{JK}^{-1} ξ_ψ(J) √((log J)/(n e_J)) )

- for Cohen-Daubechies-Vial (CDV) wavelets and B-splines, we show that [ξ_ψ(J)]²/e_J = O(J), hence

  ‖ĥ − h0‖_∞ = O_p( J^{−p/d} + s_{JK}^{-1} √((J log J)/n) )
Upper bound (2)
Corollary
Let 𝒳 = [0, 1]^d, 0 < inf_x f(x) ≤ sup_x f(x) < ∞, and let Ψ_J be spanned by a CDV wavelet basis or B-spline basis of sufficient regularity. Then:

Mildly ill-posed case: choosing J ≍ K ≍ (n/log n)^{d/(2(p+ς)+d)} yields:

  ‖ĥ − h0‖_∞ = O_p( (n/log n)^{−p/(2(p+ς)+d)} )

Severely ill-posed case: choosing J = c'_0 (log n)^{d/ς} for any c'_0 ∈ (0, 1) and K = c_0 J for some finite c_0 ≥ 1 yields:

  ‖ĥ − h0‖_∞ = O_p( (log n)^{−p/ς} )
Optimality (1)
- Chen and Reiß (11) showed that the L2(X) rates
  - ‖ĥ − h0‖_{L2(X)} = O_p(n^{−p/(2(p+ς)+d)}) in the mildly ill-posed case
  - ‖ĥ − h0‖_{L2(X)} = O_p((log n)^{−p/ς}) in the severely ill-posed case
  are optimal in an L2 minimax sense
- sup norm ≥ L2 norm
- therefore our sup-norm rates are optimal in the severely ill-posed case
- what about the mildly ill-posed case?
- we now derive the minimax lower bound in sup-norm loss, i.e. the rate r_n over a parameter space H s.t.

  liminf_{n→∞} inf_{ĥn} sup_{h∈H} P_h( ‖h − ĥn‖_∞ ≥ c r_n ) ≥ c' > 0,

  for constants c, c'.
Optimality (2)
- trick: rewrite the NPIV model in terms of a nonparametric indirect regression (NPIR) model:

  Yi = E[h0(Xi)|Wi] + εi
  E[εi|Wi] = 0
  εi ∼ N(0, σ0(Wi)²)

  where E[ · |Wi] is known and σ0(·)² ≥ σ_0² > 0

- NPIV:

  Yi = h0(Xi) + ui,   ui := E[h0(Xi)|Wi] − h0(Xi) + εi

  where by construction E[ui|Wi] = 0

- NPIR is more informative than NPIV
- implication: lower bound for NPIV ≥ lower bound for NPIR
Lower bound for NPIR
Assumption (S)
(i) h0 ∈ B^p_{∞,∞}([0, 1]^d); (ii) there is a ς > 0 such that

  ‖Th‖_{L2(W)} ≲ ‖h‖_{B^{−ς}_{2,2}}

for all h ∈ B(p, L) := {h ∈ B^p_{∞,∞}([0, 1]^d) : ‖h‖_{B^p_{∞,∞}} ≤ L}.

Theorem (Lower bound for NPIR)
Let Assumption S hold for the NPIR model with a random sample (Yi, Wi), i = 1, . . . , n. Then:

  liminf_{n→∞} inf_{ĥn} sup_{h∈B(p,L)} P_h( ‖h − ĥn‖_∞ ≥ c (n/log n)^{−p/(2(p+ς)+d)} ) ≥ c' > 0,

where inf_{ĥn} denotes the infimum over all estimators based on the sample of size n, and the constants c, c' depend only on p, L, d, ς, σ0.
Lower bound for NPIV
Corollary (Lower bound for NPIV)
Let Assumption S hold for the NPIV model with a random sample (Xi, Yi, Wi), i = 1, . . . , n, and inf_w E[u²|W = w] ≥ σ_0². Then:

  liminf_{n→∞} inf_{ĥn} sup_{h∈B(p,L)} P_h( ‖h − ĥn‖_∞ ≥ c (n/log n)^{−p/(2(p+ς)+d)} ) ≥ c' > 0,

where inf_{ĥn} denotes the infimum over all estimators based on the sample of size n, and the constants c, c' depend only on p, L, d, ς.
Outline
1. Optimal sup-norm rates
2. Sup-norm rate-adaptive estimation
3. MC study I: Adaptive estimation procedure
4. Application: Asymptotic normality of plug-in NPIV estimators of nonlinear functionals
5. MC study II: Bootstrap uniform confidence sets
Adaptive estimation for NPIV
- must choose J optimally to attain the optimal rates
- the optimal choice depends on the unknown p and s_{JK}^{-1}
- want a data-driven method for choosing J optimally
- existing methods focus on L2 loss, minimizing an MSE-type criterion
  - Horowitz (14): modified sieve NPIV: K = J and b^K = ψ^J = orthonormal series of L2([0, 1]^d); optimal in L2 up to a log n factor
  - Liu & Tao (14): Mallows Cp model selection of sieve NPIV assuming homoskedastic errors
- CV/AIC/BIC/Mallows criteria aren't well suited to sup-norm rates
- we introduce a sup-norm adaptive Lepski-type procedure
Lepski-type procedure

- set K = K(J) ≍ J deterministically (e.g. K = c0 J + a)
- choose J by the following method. Define the sets:

  𝒥_0 = { j ∈ [J_min, J_max] : j^{−p/d} ≤ C_0 V_sup(j) }
  𝒥̂ = { j ∈ [J_min, J_max] : ‖ĥ_j − ĥ_l‖_∞ ≤ √2 σ̂ [V̂_sup(j) + V̂_sup(l)] ∀ l ∈ (j, J_max] }

  where

  V_sup(j) = s_{jK(j)}^{-1} ξ_ψ(j) √((log n)/(n e_j))
  V̂_sup(j) = ŝ_{jK(j)}^{-1} ξ_ψ(j) √((log n)/(n ê_j))
  ŝ_{JK(J)} = s_min( (Ψ'Ψ)^{-1/2} (Ψ'B) (B'B)^{-1/2} ),  ê_J = λ_min(Ψ'Ψ/n)

- J_0 = min_{j∈𝒥_0} j is optimal but infeasible
- Ĵ = min_{j∈𝒥̂} j is our data-driven estimator of J
- ĥ_Ĵ denotes the sieve NPIV estimator with J = Ĵ, K = K(Ĵ)
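Putting the pieces together, a minimal sketch of this selection rule (assuming the helpers sieve_npiv, s_hat_JK and basis_constants from the earlier sketches, and a user-built dict of candidate bases):

```python
import numpy as np

def lepski_select(Y, bases, sigma_hat):
    """Sketch of the sup-norm Lepski choice of J. `bases` is an assumed
    dict {j: (Psi_j, B_j, psi_grid_j)} over the candidate grid
    [J_min, J_max], with K = K(j) already built into B_j."""
    n = len(Y)
    js = sorted(bases)
    h_hat, V_hat = {}, {}
    for j in js:
        Psi, B, psi_grid = bases[j]
        h_hat[j] = psi_grid @ sieve_npiv(Y, Psi, B)      # h_hat_j on the x-grid
        _, xi_j, e_j = basis_constants(psi_grid, Psi.T @ Psi / n)
        V_hat[j] = (xi_j / s_hat_JK(Psi, B)) * np.sqrt(np.log(n) / (n * e_j))
    for j in js:                                          # J_hat = min of the set
        if all(np.abs(h_hat[j] - h_hat[l]).max()
               <= np.sqrt(2) * sigma_hat * (V_hat[j] + V_hat[l])
               for l in js if l > j):
            return j, h_hat[j]
    return js[-1], h_hat[js[-1]]
```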
Heuristic argument
- suppose 𝒥_0 ⊆ 𝒥̂. Then:

  Ĵ := min 𝒥̂ ≤ J_0 := min 𝒥_0

  and:

  ‖ĥ_Ĵ − h0‖_∞ ≤ ‖ĥ_{J0} − h0‖_∞ + ‖ĥ_Ĵ − ĥ_{J0}‖_∞
               ≤ ‖ĥ_{J0} − h0‖_∞ + 2√2 σ̂ ŝ_{J0 K(J0)}^{-1} ξ_ψ(J0) √((log n)/(n ê_{J0}))
               ≲ ‖ĥ_{J0} − h0‖_∞ + s_{J0 K(J0)}^{-1} ξ_ψ(J0) √((log n)/(n e_{J0}))   wpa1

  ⇒ ‖ĥ_Ĵ − h0‖_∞ = O_p( J0^{−p/d} + s_{J0 K(J0)}^{-1} ξ_ψ(J0) √((log n)/(n e_{J0})) )

- implication: Ĵ is rate-adaptive to the oracle J_0
Choosing Jmax
- still need to choose J_max
- data-driven estimator Ĵ_max of J_max:

  Ĵ_max = min{ J > J_min : ŝ_{JK(J)}^{-1} ζ̂(J) √((J L(J) log n)/n) ≥ 1 }

  where L(J) = a log(log(J)) for some constant a > 0.
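A sketch of this stopping rule under the same assumed inputs (the b-basis grid values are an additional assumed input):

```python
import numpy as np

def J_max_hat(bases_full, n, a=0.1):
    """Sketch of the data-driven J_max: the first j > J_min at which
    s_hat^{-1} * zeta_hat(j) * sqrt(j L(j) log n / n) crosses 1, with
    L(j) = a log log j (so the grid should start at j >= 3).
    bases_full[j] = (Psi_j, B_j, psi_grid_j, b_grid_j)."""
    js = sorted(bases_full)
    for j in js[1:]:
        Psi, B, psi_grid, b_grid = bases_full[j]
        zeta_psi, _, _ = basis_constants(psi_grid, Psi.T @ Psi / n)
        zeta_b, _, _ = basis_constants(b_grid, B.T @ B / n)
        zeta = max(zeta_psi, zeta_b)                  # zeta(J) = zeta_b v zeta_psi
        L = a * np.log(np.log(j))
        if (zeta / s_hat_JK(Psi, B)) * np.sqrt(j * L * np.log(n) / n) >= 1:
            return j
    return js[-1]
```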
Oracle property

- now consider the special case with a CDV wavelet or B-spline sieve, rectangular support, and well-behaved density

Theorem (Adaptivity)
Let Assumptions 1–4 hold and s_{J̄max K(J̄max)}^{-1} √((J̄max² log n)/n) = o(1). Then: J̲max ≤ Ĵmax ≤ J̄max wpa1 (for deterministic sequences J̲max, J̄max); and

  𝒥_0 ⊆ 𝒥̂ wpa1

and so:

  ‖ĥ_Ĵ − h0‖_∞ = O_p( J0^{−p/d} + s_{J0 K(J0)}^{-1} √((J0 log n)/n) )

- implication: sup-norm rate adaptive in both the mildly and severely ill-posed cases; no loss of a log n factor
- automatically implies L2(X)-norm rate adaptivity in the severely ill-posed case, and almost-adaptivity in the mildly ill-posed case (up to a log n factor)
Outline
1. Optimal sup-norm rates
2. Sup-norm rate-adaptive estimation
3. MC study I: Adaptive estimation procedure
4. Application: Asymptotic normality of plug-in NPIV estimators of nonlinear functionals
5. MC study II: Bootstrap uniform confidence sets
MC design
- Newey and Powell (03) design, but with compact support: generate

  (Ui, Vi*, Wi*)' ∼ N(0, Σ),   Σ = [ 1    0.5  0
                                     0.5  1    0
                                     0    0    1 ]

  and set Xi = Φ((Wi* + Vi*)/√2) and Wi = Φ(Wi*)

- linear design: h0(x) = 4x − 2
- nonlinear design: h0(x) = log(|6x − 3| + 1) sgn(x − 1/2)
- generate 1000 samples of length 1000
- implement with cubic/quartic B-splines (with nested knots) and Legendre polynomials
- use σ̂ = 1 (the true σ) and σ̂ = 0.1
- take L(J) = (1/10) log log J in the definition of Ĵ_max
- compare sup-norm and L2-norm error of the Lepski procedure against the infeasible choice of J that minimizes sup-norm error in each sample
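A sketch replicating this data-generating process (the function name and seeding are ours):

```python
import numpy as np
from scipy.stats import norm

def np03_sample(n=1000, nonlinear=False, seed=None):
    """One sample from the compact-support Newey-Powell (03) design:
    (U, V*, W*)' ~ N(0, Sigma) with corr(U, V*) = 0.5,
    X = Phi((W* + V*)/sqrt(2)), W = Phi(W*), Y = h0(X) + U."""
    rng = np.random.default_rng(seed)
    Sigma = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
    U, Vs, Ws = rng.multivariate_normal(np.zeros(3), Sigma, size=n).T
    X = norm.cdf((Ws + Vs) / np.sqrt(2))
    W = norm.cdf(Ws)
    if nonlinear:
        h0 = np.log(np.abs(6 * X - 3) + 1) * np.sign(X - 0.5)
    else:
        h0 = 4 * X - 2
    return h0 + U, X, W    # Y, X, W
```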
MC design: linear h0 (black), nonlinear h0 (red)
[Figure: the linear and nonlinear h0 plotted on x ∈ [0, 1]; values range over [−2, 2].]
MC design: scatter plot of (Xi,Wi)
[Figure: scatter plot of (Xi, Wi) over [0, 1]²; horizontal axis X, vertical axis W.]
MC design: scatter plot of (Xi, Yi) with nonlinear h0
[Figure: scatter plot of (Xi, Yi) with nonlinear h0; X ∈ [0, 1], Y ranges over [−5, 5].]
MC results: Lepski procedure, linear design
Table 1: Linear design, cubic (r = 4) and quartic (r = 5) B-spline bases
Results with K(J) = J − rJ + rK:

|      | rJ | rK | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|----|----|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 4 | 4 | 0.4262 | 0.1547 | 0.4262 | 0.1547 | 0.4141 | 0.1608 |
| Med. | 4 | 4 | 0.3828 | 0.1394 | 0.3828 | 0.1394 | 0.3708 | 0.1443 |
| Mean | 4 | 5 | 0.4179 | 0.1524 | 0.4209 | 0.1536 | 0.3937 | 0.1540 |
| Med. | 4 | 5 | 0.3681 | 0.1368 | 0.3692 | 0.1370 | 0.3476 | 0.1368 |
| Mean | 5 | 5 | 0.6633 | 0.2355 | 0.6633 | 0.2355 | 0.6243 | 0.2494 |
| Med. | 5 | 5 | 0.6007 | 0.2202 | 0.6007 | 0.2202 | 0.5646 | 0.2311 |

Results with K(J) = 2(J − rJ) + rK + 1:

|      | rJ | rK | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|----|----|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 4 | 4 | 0.4188 | 0.1526 | 0.4188 | 0.1526 | 0.3895 | 0.1552 |
| Med. | 4 | 4 | 0.3696 | 0.1375 | 0.3696 | 0.1375 | 0.3470 | 0.1371 |
| Mean | 4 | 5 | 0.3918 | 0.1439 | 0.3945 | 0.1449 | 0.3720 | 0.1486 |
| Med. | 4 | 5 | 0.3430 | 0.1291 | 0.3430 | 0.1291 | 0.3295 | 0.1311 |
| Mean | 5 | 5 | 0.6366 | 0.2277 | 0.6366 | 0.2277 | 0.5816 | 0.2352 |
| Med. | 5 | 5 | 0.5800 | 0.2089 | 0.5800 | 0.2089 | 0.5228 | 0.2111 |
MC results: Lepski procedure, linear design
Table 2: Linear design, Legendre polynomial bases
Results with K(J) = J:

|      | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 0.0882 | 0.0492 | 0.2943 | 0.1185 | 0.0869 | 0.0494 |
| Med. | 0.0777 | 0.0452 | 0.1674 | 0.0810 | 0.0764 | 0.0453 |

Results with K(J) = 2J:

|      | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 0.0878 | 0.0490 | 0.2745 | 0.1119 | 0.0862 | 0.0492 |
| Med. | 0.0779 | 0.0453 | 0.1640 | 0.0807 | 0.0766 | 0.0455 |
MC results: Lepski procedure, nonlinear design
Table 3: Nonlinear design, cubic (r = 4) and quartic (r = 5) B-spline bases
Results with K(J) = J − rJ + rK:

|      | rJ | rK | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|----|----|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 4 | 4 | 0.4343 | 0.1621 | 0.4343 | 0.1621 | 0.4233 | 0.1671 |
| Med. | 4 | 4 | 0.3855 | 0.1469 | 0.3855 | 0.1469 | 0.3748 | 0.1503 |
| Mean | 4 | 5 | 0.4262 | 0.1600 | 0.4271 | 0.1605 | 0.4030 | 0.1615 |
| Med. | 4 | 5 | 0.3738 | 0.1444 | 0.3744 | 0.1445 | 0.3514 | 0.1445 |
| Mean | 5 | 5 | 0.6726 | 0.2407 | 0.6726 | 0.2407 | 0.6318 | 0.2531 |
| Med. | 5 | 5 | 0.6069 | 0.2278 | 0.6069 | 0.2278 | 0.5646 | 0.2345 |

Results with K(J) = 2(J − rJ) + rK + 1:

|      | rJ | rK | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|----|----|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 4 | 4 | 0.4271 | 0.1601 | 0.4286 | 0.1609 | 0.3987 | 0.1623 |
| Med. | 4 | 4 | 0.3764 | 0.1445 | 0.3764 | 0.1445 | 0.3518 | 0.1443 |
| Mean | 4 | 5 | 0.4002 | 0.1518 | 0.4029 | 0.1528 | 0.3812 | 0.1563 |
| Med. | 4 | 5 | 0.3410 | 0.1384 | 0.3414 | 0.1384 | 0.3258 | 0.1402 |
| Mean | 5 | 5 | 0.6471 | 0.2330 | 0.6471 | 0.2330 | 0.5895 | 0.2390 |
| Med. | 5 | 5 | 0.5797 | 0.2143 | 0.5797 | 0.2143 | 0.5341 | 0.2141 |
MC results: Lepski procedure, nonlinear design
Table 4: Nonlinear design, Legendre polynomial bases
Results with K(J) = J:

|      | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 0.2494 | 0.1305 | 0.4283 | 0.1719 | 0.2297 | 0.1224 |
| Med. | 0.2367 | 0.1266 | 0.3210 | 0.1426 | 0.2218 | 0.1243 |

Results with K(J) = 2J:

|      | L∞ (σ̂ = 1) | L2 (σ̂ = 1) | L∞ (σ̂ = 0.1) | L2 (σ̂ = 0.1) | L∞ (infeas.) | L2 (infeas.) |
|------|------------|------------|--------------|--------------|--------------|--------------|
| Mean | 0.2475 | 0.1306 | 0.4063 | 0.1644 | 0.2241 | 0.1208 |
| Med. | 0.2346 | 0.1267 | 0.3132 | 0.1395 | 0.2178 | 0.1242 |
Outline
1. Optimal sup-norm rates
2. Sup-norm rate-adaptive estimation
3. MC study I: Adaptive estimation procedure
4. Application: Asymptotic normality of plug-in NPIV estimators of nonlinear functionals
5. MC study II: Bootstrap uniform confidence sets
Pointwise and uniform inference
- our sup-norm rates allow for mild low-level conditions for asymptotic normality of plug-in sieve NPIV estimators of possibly nonlinear functionals of h0 in two cases:

1. “pointwise” inference on f(h0)
   - e.g. exact consumer surplus of a price change from p0 to p1 at income i:

     Qi = h0(Pi, Ii) + ui
     f(h0) = S_i(p0), where S_i'(p) = −h0(p, i − S_i(p)) and S_i(p1) = 0

     cf. Hausman & Newey (95), Vanhems (10), Blundell et al. (12); see the ODE sketch after this list
2. “uniform” inference on {f_τ(h0) : τ ∈ T} where T ⊂ R^{dT}
   - e.g. uniform inference on consumer surplus/deadweight loss
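The exact-CS functional is defined implicitly through this ODE; a sketch of evaluating S_i(p0) for a given estimated demand function (the solver choice and function names are our assumptions):

```python
from scipy.integrate import solve_ivp

def exact_cs(h, p0, p1, income):
    """Exact consumer surplus S_i(p0) from the Hausman-Newey ODE
    S'(p) = -h(p, income - S(p)) with terminal condition S(p1) = 0,
    integrated from p1 back to p0. `h` is a demand function (p, i) -> q."""
    sol = solve_ivp(lambda p, S: [-h(p, income - S[0])],
                    t_span=(p1, p0), y0=[0.0], rtol=1e-8)
    return sol.y[0, -1]
```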
Pointwise inference (1)

- we focus on slower-than-√n functionals that are bounded with respect to the sup norm:

5. (i) there exist a linear functional Df(h0)[·] and a constant C s.t.

   |f(h) − f(h0) − Df(h0)[h − h0]| ≤ C ‖h − h0‖²_∞

   for all h ∈ N_n(h0), where ĥ ∈ N_n wpa1;
   (ii) V_n^{-1/2} ‖ĥ − h0‖²_∞ = o_p(n^{-1/2})

- includes CS/DWL functionals and the quadratic functional
- sufficient for a more general condition of Chen and Pouzo (14)
- here V_n ↗ ∞ is the sieve variance:

  V_n = Df(h0)[ψ^J]' Σ_n Df(h0)[ψ^J]
  Σ_n = [S'G_b^{-1}S]^{-1} ( S'G_b^{-1} Ω G_b^{-1} S ) [S'G_b^{-1}S]^{-1},

  where S = E[b^K(Wi) ψ^J(Xi)'], Ω = E[ui² b^K(Wi) b^K(Wi)']
Pointwise inference (2)
Theorem (Pointwise asymptotic normality of sieve t-statistics)
Let Assumptions 1–5 (etc.) hold. Then:

  √n ( f(ĥ) − f(h0) ) / V̂_n^{1/2} →_d N(0, 1)

- here V̂_n ↗ ∞ is the sieve variance estimator:

  V̂_n = Df(ĥ)[ψ^J]' Σ̂ Df(ĥ)[ψ^J]
  Σ̂ = [Ŝ'Ĝ_b^{-1}Ŝ]^{-1} ( Ŝ'Ĝ_b^{-1} Ω̂ Ĝ_b^{-1} Ŝ ) [Ŝ'Ĝ_b^{-1}Ŝ]^{-1}
  Ŝ = B'Ψ/n,  Ĝ_b = B'B/n,  Ω̂ = n^{-1} Σ_{i=1}^n ûi² b^K(Wi) b^K(Wi)'

- just like the 2SLS variance estimator but using basis functions; cf. Chen and Pouzo (14), Newey (13)
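A direct sketch of this sandwich formula (Df is the user-supplied vector Df(ĥ)[ψ^J]; the helper is ours, not the authors' code):

```python
import numpy as np

def sieve_variance(Psi, B, u_hat, Df):
    """V_hat_n = Df' Sigma_hat Df with the 2SLS-type sandwich
    Sigma_hat = [S'Gb^{-1}S]^{-1} (S'Gb^{-1} Omega Gb^{-1} S) [S'Gb^{-1}S]^{-1},
    S = B'Psi/n, Gb = B'B/n, Omega = sum_i u_hat_i^2 b(W_i)b(W_i)'/n."""
    n = len(u_hat)
    S = B.T @ Psi / n
    Gb = B.T @ B / n
    Omega = (B * (u_hat**2)[:, None]).T @ B / n
    GbS = np.linalg.solve(Gb, S)              # Gb^{-1} S
    bread = np.linalg.inv(S.T @ GbS)          # [S'Gb^{-1}S]^{-1}
    Sigma = bread @ (GbS.T @ Omega @ GbS) @ bread
    return float(Df @ Sigma @ Df)
```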
Uniform inference (1)
- now impose a uniform (in τ ∈ T) version of Assumption 5:

5'. (i) Df_τ(h0)[·] is a linear functional for each τ ∈ T; (ii) there exists a constant C s.t.

   sup_{τ∈T} |f_τ(h) − f_τ(h0) − Df_τ(h0)[h − h0]| ≤ C ‖h − h0‖²_∞

   for all h ∈ N_n(h0), where ĥ ∈ N_n wpa1; (iii) sup_{τ∈T} V_{τ,n}^{-1/2} ‖ĥ − h0‖²_∞ = o_p(n^{-1/2})

- here V_{τ,n} = Df_τ(h0)[ψ^J]' Σ_n Df_τ(h0)[ψ^J]
- estimate with V̂_{τ,n} = Df_τ(ĥ)[ψ^J]' Σ̂ Df_τ(ĥ)[ψ^J]
Uniform inference (2)
Theorem (Uniform asymptotic normality of sieve t-statistics)
Let Assumptions 1–5' (etc.) hold. Then there exists a sequence of tight Gaussian processes G_n on ℓ∞(T) with covariance function

  E[G_n(τ1) G_n(τ2)] = Df_{τ1}(h0)[ψ^J]' Σ_n Df_{τ2}(h0)[ψ^J] / ( V_{τ1,n}^{1/2} V_{τ2,n}^{1/2} )

and random variables Z_n =_d sup_{τ∈T} |G_n(τ)| such that

  sup_{τ∈T} | √n ( f_τ(ĥ) − f_τ(h0) ) / V̂_{τ,n}^{1/2} | = Z_n + o_p(1)

as n, J, K → ∞.

- we follow the Chernozhukov, Chetverikov, Kato (14) construction (see also Chernozhukov, Lee, Rosen (13)) rather than strong approximation
Example: uniform confidence bands
- f_τ(h0) = h0(τ) with T = 𝒳, and Df_τ(h)[ψ^J] = ψ^J(τ)
- by the previous theorem, there exists a sequence of tight Gaussian processes G_n on ℓ∞(𝒳) with covariance function

  E[G_n(x1) G_n(x2)] = ψ^J(x1)' Σ_n ψ^J(x2) / ( V_{x1,n}^{1/2} V_{x2,n}^{1/2} )

  and random variables Z_n =_d sup_{x∈𝒳} |G_n(x)| such that

  sup_{x∈𝒳} | √n ( ĥ(x) − h0(x) ) / V̂_{x,n}^{1/2} | = Z_n + o_p(1)

  as n, J, K → ∞.

- invert for a uniform confidence band
Example: uniform inference on exact consumer surplus
- f_τ(h0) = S_i(p) with T = [p̲0, p̄0] × [i̲, ī]

  Df_τ(h0)[ψ^J] = −∫_p^{p1} ψ^J(t, i − S_i(t)) e^{∫_p^t ∂2 h0(u, i − S_i(u)) du} dt
  D̂f_τ(ĥ)[ψ^J] = −∫_p^{p1} ψ^J(t, i − Ŝ_i(t)) e^{∫_p^t ∂2 ĥ(u, i − Ŝ_i(u)) du} dt

- uniform asymptotic normality of {Ŝ_i(p) : (p, i) ∈ [p̲0, p̄0] × [i̲, ī]} follows from the previous theorem
- could equally consider uniform inference on deadweight loss
- our sup-norm rates are critical here to control the bias
Outline
1. Optimal sup-norm rates
2. Sup-norm rate-adaptive estimation
3. MC study I: Adaptive estimation procedure
4. Application: Asymptotic normality of plug-in NPIV estimators of nonlinear functionals
5. MC study II: Bootstrap uniform confidence sets
MC design
- same Newey and Powell (03) linear and nonlinear designs
- estimate Ĵ_max as before; use J = Ĵ_max, K = K(Ĵ_max) to implement the sieve NPIV estimator
- estimate critical values for uniform confidence bands using the sieve score bootstrap (Chen and Pouzo, 14) with the Mammen (93) two-point distribution and 1000 bootstrap replications per sample
- computationally simpler than the bootstrap sieve t-statistic in Chen-Pouzo (14) or the bootstrap in Horowitz-Lee (12)
- compare Monte Carlo coverage probabilities with nominal levels
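A sketch of computing such a score-bootstrap critical value for the sup-t band on a grid; the studentization details here are our reading of the procedure, not the authors' code:

```python
import numpy as np

def score_bootstrap_cv(Psi, B, psi_grid, u_hat, level=0.95, n_boot=1000, seed=None):
    """Multiplier (score) bootstrap critical value for the sup-t band,
    with Mammen (93) two-point weights. Resamples the score
    B'(omega * u_hat)/sqrt(n) and maps it through the 2SLS influence matrix."""
    rng = np.random.default_rng(seed)
    n = len(u_hat)
    S, Gb = B.T @ Psi / n, B.T @ B / n
    GbS = np.linalg.solve(Gb, S)
    A = np.linalg.solve(S.T @ GbS, GbS.T)         # coef error = A (B'u/n)
    Omega = (B * (u_hat**2)[:, None]).T @ B / n
    PA = psi_grid @ A.T if False else psi_grid @ A.T  # placeholder removed below
    PA = psi_grid @ A.T
    sd = np.sqrt(np.einsum('ij,jk,ik->i', PA, Omega, PA))  # pointwise std. dev.
    # Mammen two-point weights: mean 0, variance 1
    p = (np.sqrt(5) + 1) / (2 * np.sqrt(5))
    vals = np.array([-(np.sqrt(5) - 1) / 2, (np.sqrt(5) + 1) / 2])
    sups = np.empty(n_boot)
    for b in range(n_boot):
        omega = vals[(rng.random(n) > p).astype(int)]
        score = B.T @ (omega * u_hat) / np.sqrt(n)
        sups[b] = np.abs(PA @ score / sd).max()   # sup of the bootstrap t-process
    return np.quantile(sups, level)
```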
Estimated UCBs (dashed), ĥ (black line), h0 (red line)

[Figure: estimated uniform confidence bands around ĥ and h0 on x ∈ [0, 1]; values range over [−2.5, 2].]
MC results: coverage probabilities
Table 5: Linear and nonlinear design, cubic (r = 4) and quartic (r = 5) B-spline bases

Results with K(J) = J − rJ + rK:

| Design    | rJ | rK | 90%   | 95%   | 99%   |
|-----------|----|----|-------|-------|-------|
| linear    | 4 | 4 | 0.933 | 0.966 | 0.996 |
| linear    | 4 | 5 | 0.937 | 0.975 | 0.995 |
| linear    | 5 | 5 | 0.961 | 0.983 | 0.997 |
| nonlinear | 4 | 4 | 0.884 | 0.945 | 0.987 |
| nonlinear | 4 | 5 | 0.894 | 0.946 | 0.987 |
| nonlinear | 5 | 5 | 0.956 | 0.978 | 0.995 |

Results with K(J) = 2(J − rJ) + rK + 1:

| Design    | rJ | rK | 90%   | 95%   | 99%   |
|-----------|----|----|-------|-------|-------|
| linear    | 4 | 4 | 0.944 | 0.971 | 0.994 |
| linear    | 4 | 5 | 0.937 | 0.963 | 0.994 |
| linear    | 5 | 5 | 0.959 | 0.985 | 0.997 |
| nonlinear | 4 | 4 | 0.912 | 0.956 | 0.989 |
| nonlinear | 4 | 5 | 0.906 | 0.951 | 0.987 |
| nonlinear | 5 | 5 | 0.951 | 0.979 | 0.996 |
MC results: coverage probabilities
Table 6: Linear and nonlinear design, Legendre polynomial bases
Results with K(J) = J:

| Design    | 90%   | 95%   | 99%   |
|-----------|-------|-------|-------|
| linear    | 0.937 | 0.964 | 0.997 |
| nonlinear | 0.901 | 0.952 | 0.988 |

Results with K(J) = 2J:

| Design    | 90%   | 95%   | 99%   |
|-----------|-------|-------|-------|
| linear    | 0.928 | 0.959 | 0.989 |
| nonlinear | 0.906 | 0.948 | 0.989 |
Conclusions
- contributions:
  1. optimal sup-norm rates and their attainability by sieve estimators
  2. Lepski procedure for adaptive estimation in sup norm
  3. pointwise and uniform inference on possibly nonlinear functionals
- first such results for NPIV (or indeed any ill-posed inverse problem with unknown operator)
- application to inference on consumer surplus in demand estimation and uniform confidence bands