Smoothed stationary bootstrap bandwidth selection for density...
Transcript of Smoothed stationary bootstrap bandwidth selection for density...
Smoothed stationary bootstrap bandwidth selection fordensity estimation with dependent data
Ricardo CaoMODES group, University of A Coruna (Spain)
Joint work with Ines Barbeito Cal
Galician Seminar of Nonparametric Statistical Inference,June 8, 2016
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 1 / 81
Index
1 Introduction and BackgroundAimsSmoothed Bootstrap iid caseBootstrap methods for dependent data
2 Already existing smoothing parameter selectors3 Bootstrap bandwidth selector under independence4 Smooth Stationary Bootstrap under dependence5 Smooth Moving Blocks Bootstrap under dependence6 Simulations7 Real data application8 Conclusions9 References
10 Contact info
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 2 / 81
Motivation & Background Aims
Setup and aims
General dependent data, {Xt}t∈Z: stationary, α-mixing, φ-mixing, . . .Nonparametric Parzen-Rosenblatt kernel density estimation
fh(x) = 1nh
n∑i=1
K
(x−Xi
h
)= 1n
n∑i=1
Kh(x−Xi)
Smooth bootstrap methodsBandwidth (h) selection
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 3 / 81
Motivation & Background Smoothed Bootstrap iid case
Smoothed bootstrap for independent data
Consider some statistic of interest: R(→X,F
)Smoothed bootstrap algorithm
1 Using the sample (X1, . . . , Xn) and the bandwidth h > 0, compute fh2 Draw bootstrap resamples
→X∗= (X∗1 , . . . , X∗n) from fh
3 Obtain the bootstrap version of the statistic: R∗ = R
( →X∗, Fh
)4 Repeat Steps 1-3, B times to obtain R∗(1), . . . , R∗(B)
5 Use the values R∗(1), . . . , R∗(B) to approximate the samplingdistribution of R.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 4 / 81
Motivation & Background Smoothed Bootstrap iid case
How to draw from fh?
Considering two independent random variables: Y ∼ Fn and U withdensity K, it is easy to prove that Y + hU has density fhDrawing resamples from fh
1 Draw naive bootstrap resamples→
XNAIV E∗= (XNAIV E∗1 , . . . , XNAIV E∗
n ) from Fn
2 Draw a sample→U= (U1, . . . , Un) from the density K
3 Obtain the smoothed bootstrap resample→X∗= (X∗1 , . . . , X∗n), where
X∗i = XNAIV E∗i + hUi
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 5 / 81
Motivation & Background Dependent data: bootstrap methods
Moving Blocks Bootstrap (MBB)
MBB algorithmKunsch (1989), Liu and Singh (1992)
1 Fix the block lenght, b ∈ N, and define k = min`∈N ` ≥ nb
2 Define:Bi,b = (Xi, Xi+1, . . . , Xi+b−1)
3 Draw ξ1, ξ2, . . . , ξk with uniform discrete distribution on{B1, B2, . . . , Bq}, with q = n− b+ 1
4 Define→X∗ as the vector formed by the first n components of
(ξ1,1, ξ1,2, . . . , ξ1,b, ξ2,1, ξ2,2 . . . , ξ2,b, . . . , ξk,1, ξk,2, . . . , ξk,b)
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 6 / 81
Motivation & Background Dependent data: bootstrap methods
Stationary Bootstrap (SB)
SB algorithmPolitis and Romano (1994a)
1 Draw X∗1 from Fn
2 Once obtained X∗i = Xj , for some j ∈ {1, 2, . . . , n− 1}, i < n, defineX∗i+1 as follows:
X∗i+1 = Xj+1 (if j = n, Xj+1 = X1), with probability 1− pX∗i+1 is drawn from Fn with probability p
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 7 / 81
Motivation & Background Dependent data: bootstrap methods
SubsamplingSubsampling algorithm (for dependent data)Politis and Romano (1994b)
1 Consider a dependent data sample (X1, . . . , Xn) with marginaldistribution F and θ = θ(F )
2 An estimator Tn = Tn(X1, . . . , Xn) of θ = θ(F ) is considered and
Jn(u, F ) = P (τn(Tn − θ) ≤ u)
3 Fix some b ∈ N such that b < n and define:
Sn,i = Tb(Bi,b), i = 1, 2, . . . , N, where N = n− b+ 1.
4 Use:
Ln(x) = 1N
N∑i=1
111{τb(Sn,i−Tn)≤x}
to approximate the sampling distribution of τn(Tn − θ):R. Cao SSB bandwidth selection GSNSI, June 8, 2016 8 / 81
Already existing bandwidth selectors
Plug-in method under dependence (PI)Hall, Lahiri and Truong (1995)
Minimizing in h the asymptotic MISE:
AMISE(h) = 1nhR(K) + 1
4h4µ2
2R(f ′′)− h6 124µ2µ4R(f ′′′)
+ 1n
(2n−1∑i=1
(1− i
n
)∫gi(x, x)dx−R(f)
).
results in hAMISE =(J1n
)1/5+ J2
(J1n
)3/5, with
gi(x1, x2) = fi(x1, x2)− f(x1)f(x2), fi the density of (Xj , Xi+j),
J1 = R(K)µ2
2R(f ′′)and J2 = µ4R(f ′′′)
20µ2R(f ′′) .
Now hPI =(J1n
)1/5
+ J2
(J1n
)3/5
, with J1 and J2 some estimators
of J1 and J2.R. Cao SSB bandwidth selection GSNSI, June 8, 2016 9 / 81
Already existing bandwidth selectors
Plug-in method under dependence (PI)
Replace R(f ′′) by I2 and R(f ′′′) by I3, where:
Ik = 2θ1k − θ2k, k = 2, 3,
θ1k = 2(n(n− 1)h2k+1
1
)−1 ∑∑1≤i<j≤n
K(2k)1
(Xi −Xj
h1
),
θ2k = 2(n(n− 1)h2(k+1)
1
)−1 ∑∑1≤i<j≤n
∫K
(k)1
(x−Xi
h1
)K
(k)1
(x−Xj
h1
)dx.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 10 / 81
Already existing bandwidth selectors
Leave-(2l + 1)-out cross validation (CVl)Hart and Vieu (1990)
DefineCVl(h) =
∫f2(x)dx− 2
n
n∑j=1
f jl (Xj),
wheref jl (x) = 1
nl
n∑i:|j−i|>l
1hK
(x−Xi
h
).
Choose nl such that:
nnl = #{(i, j) : |i− j| > l}.
The CVl bandwidth selector is
hCVl= arg min
hCVl(h).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 11 / 81
Already existing bandwidth selectors
Penalized cross validation (PCV)
Estevez, Quintela and Vieu (2002) proposed it for hazard rate estimationThe PCV bandwidth selecctor is
hPCV = hCVl+ λ.
λ is chosen empirically as follows:
λn =(0.8e7.9ρ−1
)n−3/10hCVl
100 ,
where ρ is the estimated autocorrelation
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 12 / 81
Already existing bandwidth selectors
Modified cross validation under dependence (SMCV )
Stute (1992) proposed it for independent dataDefine
SMCV (h) =1nh
∫K2(t)dt
+1
n(n− 1)h
∑i6=j
[1h
∫K
(x−Xi
h
)K
(x−Xj
h
)dx
]
−1
nnlh
n∑j=1
n∑i:|j−i|>l
[K
(Xi −Xj
h
)− dK′′
(Xi −Xj
h
)].
The SMCV bandwidth selector is
hSMCV = arg minhSMCV (h)
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 13 / 81
Bootstrap bandwidth selector in the iid case
Exact MISE expression for the iid case
MISE (h) = E[∫ (
fh (x)− f (x))2dx
]= B (h) + V (h) ,
where
B (h) =∫ [
E(fh (x)
)− f (x)
]2dx, e
V (h) =∫V ar
(fh (x)
)dx
Exact expression for MISE(h):
B (h) =∫
(Kh ∗ f (x)− f (x))2 dx, and
V (h) = n−1h−1R (K)− n−1∫
(Kh ∗ f (x))2 dx.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 14 / 81
Bootstrap bandwidth selector in the iid case
Smoothed bootstrap for the iid caseSmooth bootstrap algorithm for bandwidth selection Cao (1993)
1 Starting from (X1, . . . , Xn) (iid), and using a pilot bandwidth, g,compute fg
2 Draw bootstrap resamples (X∗1 , . . . , X∗n) from fg3 For every h > 0, obtain
f∗h(x) = 1nh
n∑i=1
K
(x−X∗ih
)4 Construct the bootstrap version of MISE:
MISE∗(h) =∫
E∗[(f∗h(x)− fg(x)
)2]dx
5 Obtain the bootstrap selector:
h∗MISE = arg minh>0
MISE∗(h).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 15 / 81
Bootstrap bandwidth selector in the iid case
Smoothed bootstrap for the iid case
Closed expression for the bootstrap MISEAn exact expression for MISE∗(h) can be found:
MISE∗(h) = 1n2
n∑i,j=1
[(Kh ∗Kg −Kg) ∗ (Kh ∗Kg −Kg)] (Xi −Xj)
+R(K)nh
− 1n3
n∑i,j=1
[(Kh ∗Kg) ∗ (Kg ∗Kg)] (Xi −Xj) ,
where ∗ denotes the convolution operator: f ∗ g(x) =∫∞−∞ f(x− y)g(y)dy.
Consequently, there is no need to draw bootstrap resamples by MonteCarlo to approximate MISE∗(h).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 16 / 81
Smooth Stationary Bootstrap (SSB)
Exact MISE expression under dependence and stationarity
Exact expression for MISE(h):
MISE (h) = B (h) + V (h) , where
B (h) =∫
(Kh ∗ f (x)− f (x))2 dx, and
V (h) = n−1h−1R (K)−∫
(Kh ∗ f (x))2 dx
+ 2n−2n−1∑`=1
(n− `)∫ ∫
Kh (x− y) f (y) (Kh ∗ f` (•|y)) (x) dx dy,
where f` (•|y) is the conditional density function of Xt+` given Xt = y.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 17 / 81
Smooth Stationary Bootstrap (SSB)
Smooth Stationary BootstrapSSB resampling plan Barbeito and Cao (2016)
1 Draw X∗(SB)1 from Fn.
2 Draw U∗1 with density K and independently of X∗(SB)1 and define
X∗1 = X∗(SB)1 + gU∗1
3 Assume we have drawn X∗1 , . . . , X∗i and consider the indexj/X
∗(SB)i = Xj . Define I∗i+1, such that
P∗(I∗i+1 = 1
)= 1− p,
P∗(I∗i+1 = 0
)= p.
Assign X∗(SB)i+1 |I∗i+1=1 = X(jmodn)+1 and draw X
∗(SB)i+1 |I∗i+1=0 from
the empirical distribution function4 Define X∗i+1 = X
∗(SB)i+1 + gU∗i+1 (where U∗i+1 has density K). Go to
the previous step if i+ 1 < n.R. Cao SSB bandwidth selection GSNSI, June 8, 2016 18 / 81
Smooth Stationary Bootstrap (SSB)
MISE closed expression for SSBAn explicit expression for MISE∗(h) can be obtained:
MISE∗(h) = n−1h−1R (K)
+[n− 1n3 −2
1− p− (1− p)n
pn3 + 2(n− 1) (1− p)n+1 − n (1− p)n + 1− p
p2n4
]·
n∑i,j=1
[(Kh ∗Kg) ∗ (Kh ∗Kg)] (Xi −Xj)
−2n−2n∑
i,j=1
(Kh ∗Kg ∗Kg) (Xi −Xj)
+n−2n∑
i,j=1
(Kg ∗Kg) (Xi −Xj) +2n−3n−1∑`=1
(n− `) (1− p)`
·n∑
k=1
[(Kh ∗Kg) ∗ (Kh ∗Kg)](Xk −Xd(k+`−1)modne+1
).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 19 / 81
Smooth Moving Blocks Bootstrap (SMBB)
Smooth Moving Blocks Bootstrap
SMBB resampling plan1 Fix the block lenght, b ∈ N, and define k = min`∈N ` ≥ n
b
2 Define:Bi,b = (Xi, Xi+1, . . . , Xi+b−1)
3 Draw ξ1, ξ2, . . . , ξk with uniform discrete distribution on{B1, B2, . . . , Bq}, with q = n− b+ 1
4 Define X∗(MBB)1 , . . . , X
∗(MBB)n as the first n components of
(ξ1,1, ξ1,2, . . . , ξ1,b, ξ2,1, ξ2,2 . . . , ξ2,b, . . . , ξk,1, ξk,2, . . . , ξk,b)
5 Define X∗i = X∗(MBB)i + gU∗i , where U∗i has been drawn with density
K and independently from X∗(MBB)i , for all i = 1, 2, . . . , n
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 20 / 81
Smooth Moving Blocks Bootstrap (SMBB)
MISE closed expression for SMBBAn explicit expression for MISE∗(h) can be obtained, considering n anentire multiple of b.
If b = n,
MISE∗(h) =R(K)nh
+1n2
n∑i=1
n∑j=1
ψ(Xi −Xj)
−2n2
n∑i=1
n∑j=1
[(Kh ∗Kg) ∗Kg ] (Xi −Xj)
+1n2
n∑i=1
n∑j=1
[Kg ∗Kg ] (Xi −Xj)
+ψ(0)n
,
where ψ(Xi −Xj) = [(Kh ∗Kg) ∗ (Kh ∗Kg)] (Xi −Xj).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 21 / 81
Smooth Moving Blocks Bootstrap (SMBB)
MISE closed expression for SMBBIf b < n,
MISE∗(h) =R(K)nh
+n∑
i=1
ai
n∑j=1
aj · ψ(Xi −Xj)
−2n
n∑i=1
ai
n∑j=1
[(Kh ∗Kg) ∗Kg ] (Xi −Xj)
+1n2
n∑i=1
n∑j=1
[Kg ∗Kg ] (Xi −Xj)
−b− 1
n(n− b+ 1)2
n−b+1∑i=b−1
n−b+2∑j=b
ψ(Xi −Xj)
−1
nb · (n− b+ 1)2
[b−1∑i=1
b−1∑j=1
(min{i, j})ψ(Xi −Xj)
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 22 / 81
Smooth Moving Blocks Bootstrap (SMBB)
MISE closed expression for SMBB
+b−1∑i=1
i
n−b+1∑j=b
ψ(Xi −Xj) +b−1∑i=1
n∑j=n−b+2
(min{(n− b+ i− j + 1), i})ψ(Xi −Xj)
+n−b+1∑
i=b
b−1∑j=1
j · ψ(Xi −Xj) +n∑
i=n−b+2
(min{(n− i+ 1), b})n−b+1∑
j=b
ψ(Xi −Xj)
+n−b+1∑
i=b
n∑j=n−b+2
(min{(n− j + 1), b}) · ψ(Xi −Xj)
+n∑
i=n−b+2
b−1∑j=1
(min{(n− b+ j − i+ 1), j})ψ(Xi −Xj) + b
n−b+1∑i=b
n−b+1∑j=b
ψ(Xi −Xj)
+n∑
i=n−b+2
n∑j=n−b+2
(n+ 1−max{i, j})ψ(Xi −Xj)
]
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 23 / 81
Smooth Moving Blocks Bootstrap (SMBB)
MISE closed expression for SMBB
+2
nb(n− b+ 1)
b−1∑s=1
n−s∑j=1
(min{j, b− s} −max{1, j + b− n}+ 1)ψ(Xj+s −Xj)
−2
nb(n− b+ 1)2
b∑k,`=1k<`
[b−2∑i=k
b−1∑j=`
ψ(Xi −Xj) +n−b+k∑
i=n−b+2
n−b+`∑j=n−b+3
ψ(Xi −Xj)
+b−2∑i=k
n−b+`∑j=n−b+3
ψ(Xi −Xj) +n−b+k∑
i=n−b+2
b−1∑j=`
ψ(Xi −Xj)
]
+b−1∑k=1
(b− k)b−2∑i=k
n−b+2∑j=b
ψ(Xi −Xj) +b∑
`=2
(`− 1)n−b+1∑i=b−1
b−1∑j=`
ψ(Xi −Xj)
+b∑
`=2
(`− 1)n−b+1∑i=b−1
n−b+`∑j=n−b+3
ψ(Xi −Xj) +b−1∑k=1
(b− k)n−b+k∑
i=n−b+2
n−b+2∑j=b
ψ(Xi −Xj)
],
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 24 / 81
Smooth Moving Blocks Bootstrap (SMBB)
MISE closed expression for SMBB
considering aj such that:
aj =
j
b(n− b+ 1) , if j = 1, . . . , b− 1
1n− b+ 1 , if j = b, . . . , n− b+ 1
n− j + 1b(n− b+ 1) , if j = n− b+ 2, . . . , n
.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 25 / 81
Simulations
Simulated models
Six time series models have been considered
Model 1:Xt = −0.9Xt−1 − 0.2Xt−2 + at,
where the atd= N(0, 1) are independent. Thus Xt
d= N(0, 0.42)Model 2:
Xt = at − 0.9at−1 + 0.2at−2,
where atd= N(0, 1) are independent. Thus Xt
d= N(0, 1.85).
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 26 / 81
Simulations
Simulated models
Model 3:Xt = φXt−1 + (1− φ2)1/2at,
with atd= N(0, 1), φ = 0,±0.3,±0.6,±0.9. Thus Xt
d= N(0, 1).Model 4:
Xt = φXt−1 + at,
where the distribution of at is given by P(It = 1) = φ,P(It = 2) = 1− φ, with at|It=1
d= 0 (constant), at|It=2d= exp(1),
and φ = 0, 0.3, 0.6, 0.9. We have Xtd= exp(1)
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 27 / 81
Simulations
Simulated models
Model 5:Xt = φXt−1 + at,
where the distribution of at is P(It = 1) = φ2, P(It = 2) = 1− φ2,with at|It=1
d= 0 (constant), at|It=2d= Dexp(1), and
φ = 0,±0.3,±0.6,±0.9. Thus Xtd= Dexp(1).
Model 6:
Xt ={X
(1)t with probability 1/2
X(2)t with probability 1/2
,
where X(j)t = (−1)j+1 + 0.5X(j)
t−1 + a(j)t with j = 1, 2, ∀t ∈ Z,
a(j)t
d= N(0, 0.6) independent and Xtd= 1
2N(2, 0.8) + 12N(−2, 0.8)
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 28 / 81
Simulations
Performance measures
The following results will be shown for the six models considered in thesimulations
log(
h
hMISE
)
log(
MISE(h)MISE(hMISE)
),
where h = hCVl, hSMCV , hPCV , hPI , h
∗SSB, h
∗SMBB.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 29 / 81
Simulations
Approximating the optimal bandwidth
Consider some criterion function Ψ(h) (e.g. MISE∗(h) under SSB orSMBB; CVl(h) for Hart and Vieu’s CV, Stute’s MCV or Estevez, Quintelaand Vieu PCV).
1 Consider a set of five equispaced bandwidths, H1 between 0.01 and 102 Obtain hOPT1 = arg minh∈H1 Ψ(h)3 Consider ha the previous value of hOPT1 within H1 and hb the
following value to hOPT1 within H1
4 Construct a new set, H2, of equispaced bandwidths between ha andhb
5 Repeat Steps 2-4 10 times6 The approximated optimal bandwidth is the value obtained in the
10th repetition
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 30 / 81
Simulations
Technical aspects
l = 5 for CVlhSMCV is considered as the smallest h for which SMCV (h) attains alocal minimum, not its global onePilot bandwidth for PI: h1 = 1Pilot bandwidth for h∗SSB and h∗SMBB as in the iid case: some normalreference estimator of
g0 =( ∫
K ′′(t)2dt
ndK∫f (3)(x)2dx
)1/7
p = 0.05 for SSBb = 20 for SMBBFor every model, 1000 random samples of size n = 100 were drawn
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 31 / 81
Simulations
log(h/hMISE). Model 1
CV_l SMCV PCV SSB SMBB PI
−2
02
46
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 32 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 1
CV_l SMCV PCV SSB SMBB PI
01
23
4
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 33 / 81
Simulations
log(h/hMISE). Model 2
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
23
45
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 34 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 2
CV_l SMCV PCV SSB SMBB PI
01
23
4
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 35 / 81
Simulations
log(h/hMISE). Model 3, φ = −0.9
CV_l SMCV PCV SSB SMBB PI
−2
02
4
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 36 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = −0.9
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
3.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 37 / 81
Simulations
log(h/hMISE). Model 3, φ = −0.6
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 38 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = −0.6
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 39 / 81
Simulations
log(h/hMISE). Model 3, φ = −0.3
CV_l SMCV PCV SSB SMBB PI
−3.
0−
2.5
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 40 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = −0.3
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
3.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 41 / 81
Simulations
log(h/hMISE). Model 3, φ = 0
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
2
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 42 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = 0
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
3.0
3.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 43 / 81
Simulations
log(h/hMISE). Model 3, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
−3.
0−
2.5
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 44 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 45 / 81
Simulations
log(h/hMISE). Model 3, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
−2.
0−
1.5
−1.
0−
0.5
0.0
0.5
1.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 46 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 47 / 81
Simulations
log(h/hMISE). Model 3, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
−3
−2
−1
01
2
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 48 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 3, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 49 / 81
Simulations
log(h/hMISE). Model 4, φ = 0
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
2
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 50 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 4, φ = 0
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 51 / 81
Simulations
log(h/hMISE). Model 4, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
23
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 52 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 4, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 53 / 81
Simulations
log(h/hMISE). Model 4, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
23
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 54 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 4, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 55 / 81
Simulations
log(h/hMISE). Model 4, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
−3
−2
−1
01
23
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 56 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 4, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 57 / 81
Simulations
log(h/hMISE). Model 5, φ = −0.9
CV_l SMCV PCV SSB SMBB PI
−4
−2
02
46
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 58 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = −0.9
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 59 / 81
Simulations
log(h/hMISE). Model 5, φ = −0.6
CV_l SMCV PCV SSB SMBB PI
−3
−2
−1
01
23
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 60 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = −0.6
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
3.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 61 / 81
Simulations
log(h/hMISE). Model 5, φ = −0.3
CV_l SMCV PCV SSB SMBB PI
−4
−2
02
46
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 62 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = −0.3
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 63 / 81
Simulations
log(h/hMISE). Model 5, φ = 0
CV_l SMCV PCV SSB SMBB PI
−2
−1
01
2
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 64 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = 0
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
3.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 65 / 81
Simulations
log(h/hMISE). Model 5, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
−2
02
4
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 66 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = 0.3
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 67 / 81
Simulations
log(h/hMISE). Model 5, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
−2
02
4
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 68 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = 0.6
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
2.5
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 69 / 81
Simulations
log(h/hMISE). Model 5, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
−4
−2
02
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 70 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 5, φ = 0.9
CV_l SMCV PCV SSB SMBB PI
0.0
0.5
1.0
1.5
2.0
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 71 / 81
Simulations
log(h/hMISE). Model 6
CV_l SMCV PCV SSB SMBB PI
−4
−3
−2
−1
01
23
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 72 / 81
Simulations
log(MISE(h)/MISE(hMISE)). Model 6
CV_l SMCV PCV SSB SMBB PI
0.0
0.2
0.4
0.6
0.8
1.0
1.2
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 73 / 81
Real data application
Real data application: Data sets considered
1 lynx data set: Number of Canadian lynxes trapped (114 observations).
Time
serie
1820 1840 1860 1880 1900 19200
1000
2000
3000
4000
5000
6000
7000
Time
serie
1820 1840 1860 1880 1900 1920
45
67
89
(1− φ1B − φ2B2)Yt = c+ (1 + θ1B + θ2B
2 + θ3B3)(1 + Θ1B
12)at.
2 sunspot.year data set: Yearly number of sunspots from 1700 to 1988(289 observations).
Time
serie
1700 1750 1800 1850 1900 1950
050
100
150
Time
serie
1700 1750 1800 1850 1900 1950
12
34
5
(1− φ1B − φ2B2 − φ2B3 − φ4B
4)(1−B)(1−B12)Yt =
c+ (1 + θ1B + θ2B2 + θ3B
3 + θ4B4) · (1 +B12Θ1)at.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 74 / 81
Real data application
Real data application: lynx data set
4 6 8 10
0.00
0.05
0.10
0.15
0.20
0.25
0.30
Den
sity
h_SSBh_SMBBh_CV_lh_PCVh_SMCVh_PI
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 75 / 81
Real data application
Real data application: sunspot.year data set
0 1 2 3 4 5 6
0.0
0.1
0.2
0.3
0.4
Den
sity
h_SSBh_SMBBh_CV_lh_PCVh_SMCVh_PI
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 76 / 81
Real data application
Real data application: Bandwidth parameters
h∗SSB h∗SMBB hCVlhPCV hSMCV hPI
0.4345 0.4246 0.3173 0.6194 0.2585 0.4152
Table: Bandwidth parameters for lynx data set.
h∗SSB h∗SMBB hCVlhPCV hSMCV hPI
0.3173 0.3295 0.3002 0.5065 0.196 0.3392
Table: Bandwidth parameters for sunspot.year data set.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 77 / 81
Conclusions
Main conclusions
New SSB and SMBB bootstrap resampling plans under dependence.Closed expressions for MISE∗ under SSB and SMBB. Monte Carlo isnot needed.Bandwidth selection for the KDE with dependent data:
Plug-inLeave-(2l + 1)-out cross validationPenalized cross validationModified cross validationSmooth Stationary BootstrapSmooth Moving Blocks Bootstrap
Good empirical behaviour of hPI , but sometimes it producesextremely large banwdidthsh∗SSB and h∗SMBB display the overall best performance.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 78 / 81
Conclusions
References
Barbeito, I. and Cao, R. (2016). Smoothed stationary bootstrapbandwidth selection for density estimation with dependent data.Unpublished manuscript.
Cao, R. (1993). Bootstrapping the mean integrated squared error. J.Mult. Anal. 45, 137-160.Efron, B. and Tibishirani, R. (1986). An Introduction to theBootstrap. Chapman and Hall.
Estevez-Perez, G., Quintela-del-Rıo, A. and Vieu, P. (2002).Convergence rate for cross-validatory bandwidth in kernel hazardestimation from dependent samples. J. Statist. Planning andInference, 104, 1-30.Hall, P., Lahiri, S.N. and Truong, Y.K.(1995). On bandwidth choicefor density estimation with dependent data. Ann. Statist., 23, 6,2241-2263.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 79 / 81
Conclusions
References (continued)
Hart, J. and Vieu, P. (1990). Data-driven bandwidth choice for densityestimation based on dependent data. Ann. Statist., 18, 873-890.
Kunsch, H.R. (1989). The jackknife and the bootstrap for generalstationary observations. Ann. Statist., 17, 1217-1241.
Liu, R.Y., and Singh, K. (1992). Moving blocks jackknife andbootstrap capture weak dependence, in Exploring the Limits of theBootstrap, ed. by R. LePage and L. Billiard. New York: Wiley.
Politis, D.N. and Romano, J.R. (1994a). The stationary bootstrap. J.Amer. Statist. Assoc., 89, 1303-1313.Politis, D.N. and Romano, J.R. (1994b). Large Sample ConfidenceRegions Based on Subsamples under Minimal Assumptions. Ann.Statist., 22, 2031-2050 .Stute, W. (1992). Modified cross-validation in density estimation. J.Statist. Planning and Inference, 3, 293-305.
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 80 / 81
Conclusions
Contact info
Thank you for your attention!
You can contact me at [email protected]
R. Cao SSB bandwidth selection GSNSI, June 8, 2016 81 / 81