

Chin. Ann. Math. 32B(4), 2012, 609–624. DOI: 10.1007/s11401-012-0718-z

Chinese Annals of Mathematics, Series B © The Editorial Office of CAM and Springer-Verlag Berlin Heidelberg 2012

Convergence Rates of Wavelet Estimators in Semiparametric Regression Models Under NA Samples∗

Hongchang HU1 Li WU1

Abstract Consider the following heteroscedastic semiparametric regression model:

\[ y_i = X_i^T\beta + g(t_i) + \sigma_i e_i, \quad 1 \le i \le n, \]

where $\{X_i, 1 \le i \le n\}$ are random design points, the errors $\{e_i, 1 \le i \le n\}$ are negatively associated (NA) random variables, $\sigma_i^2 = h(u_i)$, and $\{u_i\}$ and $\{t_i\}$ are two nonrandom sequences on $[0,1]$. Some wavelet estimators of the parametric component $\beta$, the nonparametric component $g(t)$ and the variance function $h(u)$ are given. Under some general conditions, the strong convergence rate of these wavelet estimators is $O(n^{-1/3}\log n)$. Hence our results extend the corresponding results obtained for independent random errors.

Keywords Semiparametric regression model, Wavelet estimate, Negatively associated random error, Strong convergence rate

2000 MR Subject Classification 62G05, 62G20

1 Introduction

Consider the following heteroscedastic semiparametric regression model:

\[ y_i = X_i^T\beta + g(t_i) + \sigma_i e_i, \quad 1 \le i \le n, \tag{1.1} \]

where $\{y_i\}$ are scalar response variables, $\beta = (\beta_1, \beta_2, \cdots, \beta_d)^T$ is an unknown $d$-dimensional parameter vector, $\sigma_i^2 = h(u_i)$, $g(\cdot)$ and $h(\cdot)$ are unknown functions defined on $[0,1]$, $\{u_i\}$ and $\{t_i\}$ are two nonrandom sequences on $[0,1]$, $X_i = (x_{i1}, x_{i2}, \cdots, x_{id})^T$ are random design points with $d \le n$, and the random errors $\{e_i, 1 \le i \le n\}$ are negatively associated (the definition is given below) random variables with $Ee_i = 0$ and $Ee_i^2 = 1$.

Following [1], write

\[ x_{ir} = f_r(t_i) + \eta_{ir}, \quad 1 \le i \le n, \ 1 \le r \le d, \tag{1.2} \]

where $f_r(\cdot)$ is an unknown smooth function defined on $[0,1]$, and $\{\eta_i\}$ with $\eta_i = (\eta_{i1}, \cdots, \eta_{id})^T$ is an i.i.d. sequence satisfying

\[ E\eta_i = 0, \quad \mathrm{Var}(\eta_i) = V, \tag{1.3} \]

Manuscript received September 4, 2010. Revised April 14, 2012.
¹College of Mathematics and Statistics, Hubei Normal University, Huangshi 435002, Hubei, China. E-mail: [email protected] [email protected]
∗Project supported by the National Natural Science Foundation of China (No. 11071022), the Key Project of the Ministry of Education of China (No. 209078) and the Youth Project of Hubei Provincial Department of Education of China (No. Q20122202).


where $V = (V_{ij})$ ($i, j = 1, 2, \cdots, d$) is a $d \times d$ positive definite matrix. Moreover, $\{\eta_{ir}\}$ and $\{e_i\}$ are independent of each other.

Definition 1.1 (see [2]) Random variables $\xi_1, \xi_2, \cdots, \xi_n$, $n \ge 2$, are said to be negatively associated (NA) if for every pair of disjoint subsets $B_1$ and $B_2$ of $\{1, 2, \cdots, n\}$,

\[ \mathrm{Cov}\big(l_1(\xi_i, i \in B_1),\ l_2(\xi_j, j \in B_2)\big) \le 0, \]

where $l_1$ and $l_2$ are increasing in every variable (or decreasing in every variable) and such that this covariance exists. A sequence of random variables $\{\xi_i, i \ge 1\}$ is said to be NA if every finite subfamily is NA.
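For intuition, a standard Gaussian example may help (this characterization is classical; see [4]): a jointly normal vector is NA exactly when all of its correlations are nonpositive,

\[ (\xi_1, \cdots, \xi_n) \sim N(\mu, \Sigma) \ \text{with} \ \Sigma_{ij} \le 0 \ \text{for all} \ i \ne j \iff (\xi_1, \cdots, \xi_n) \ \text{is NA}. \]

In particular, centering i.i.d. normals, $\xi_i = Z_i - \overline{Z}$, produces an NA sequence with common correlation $-1/n$.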

The notion of NA random variables was introduced by Alam and Saxena [3] and carefully studied by Joag-Dev and Proschan [4]. Because of its wide applications in multivariate statistical analysis and system reliability, the notion of NA has received considerable attention recently. There are many fundamental convergence results for NA random variables; we refer to Joag-Dev and Proschan [4] for fundamental properties, Matula [5] for the three series theorem, Shao and Su [6] for the law of the iterated logarithm, Wu and Jiang [7] for the law of the iterated logarithm of partial sums, Liang [8] and Baek et al. [9] for complete convergence, Liang and Zhang [10] for the strong convergence of weighted sums of NA arrays, Su et al. [11] for the moment inequality and weak convergence, Jin [12] for the convergence rate in the law of the logarithm for NA random fields, and Roussas [13] for the central limit theorem for random fields.
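Returning to the model (1.1)–(1.3): a minimal data-generating sketch may make the setup concrete. All concrete choices below ($g$, $h$, $f_r$, $\beta$, the sample size) are hypothetical illustrations, not taken from the paper; the NA errors are produced by the Gaussian centering device noted above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 2
t = (np.arange(1, n + 1) - 0.5) / n           # nonrandom design points t_i in [0, 1]
u = t.copy()                                  # nonrandom points u_i

# Illustrative (hypothetical) choices of the unknown functions:
g = lambda t: np.sin(2 * np.pi * t)           # nonparametric component
h = lambda u: 0.5 + u                         # variance function, sigma_i^2 = h(u_i)
f = lambda t: np.column_stack([t, np.cos(np.pi * t)])   # f_r in (1.2)

beta = np.array([1.0, -2.0])
eta = rng.normal(0.0, 0.3, size=(n, d))       # i.i.d. with E eta = 0, cf. (1.3)
X = f(t) + eta                                # random design, model (1.2)

# NA errors: centered i.i.d. normals form a Gaussian vector with common
# correlation -1/n <= 0, hence NA; rescaled so that E e_i^2 = 1.
z = rng.standard_normal(n)
e = (z - z.mean()) / np.sqrt(1.0 - 1.0 / n)

y = X @ beta + g(t) + np.sqrt(h(u)) * e       # model (1.1)
```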

When $\{\sigma_i e_i\}$ are independent and identically distributed random variables, the model (1.1) reduces to the usual semiparametric regression model. Chen [14], Qian and Cai [15], Xue and Liu [16], Chai and Xu [17], Hu [18], Heckman [19], Green and Silverman [20], Hardle et al. [21], Shi and Li [22] and Bianco and Boente [23] used various estimation methods (the piecewise polynomial method, the wavelet method, the ridge method, the spline method, the penalized least squares method, the kernel smoothing method, the robust estimate method) to obtain estimators of the unknown parametric components in (1.1) and discussed properties of these estimators. However, the independence assumption on the errors is not always appropriate in applications. Recently, the semiparametric regression model with correlated errors has attracted the attention of many authors, such as Hu and Hu [24–25], You and Chen [26], Wu and Jiang [7], Liang and Jing [27], Zhou et al. [28], and so on.

When $\{X_i, 1 \le i \le n\}$ are fixed design points, a few authors have studied the model (1.1) with NA errors. For instance, Baek and Liang [29] discussed the strong consistency and the asymptotic normality of weighted least squares estimators; Liang and Wang [30] discussed the convergence rate of a wavelet estimator; and Ren and Chen [31] obtained the strong consistency of a class of estimators. However, there are few asymptotic results for the estimators of the parametric and nonparametric components in model (1.1) with NA errors when $\{X_i, 1 \le i \le n\}$ are random design points.

In this paper, we use the wavelet method to continue the study of the semiparametric regression model with an NA error sequence, and we obtain the strong convergence rates of the wavelet estimators. The paper is organized as follows. In Section 2, wavelet estimators of the parametric component $\beta$, the nonparametric component $g(t)$ and the variance function $h(u)$ are constructed by the wavelet smoothing method. Under some general conditions, the strong convergence rates of these wavelet estimators are established in Section 3. The main proofs are presented in Section 4.

2 Wavelet Estimators

In the following, we apply the wavelet technique to estimate $\beta$, $g(t)$ and $h(u)$. Suppose that there exists a scaling function $\phi(x)$ in the Schwartz space $S_l$ and a multiresolution analysis $\{V_m\}$ of the concomitant Hilbert space $L^2(\mathbb{R})$, with reproducing kernel $E_m(t,s)$ given by

\[ E_m(t,s) = 2^m E_0(2^m t, 2^m s) = 2^m \sum_{k \in \mathbb{Z}} \phi(2^m t - k)\,\phi(2^m s - k). \tag{2.1} \]
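To make (2.1) concrete: for the Haar scaling function $\phi = 1_{[0,1)}$ the sum collapses to a single term, so $E_m(t,s) = 2^m$ when $t$ and $s$ fall in the same dyadic cell $[k2^{-m}, (k+1)2^{-m})$ and $0$ otherwise. A minimal sketch (illustrative only: Haar is not in the Schwartz class $S_l$ required by condition (A3) below):

```python
import numpy as np

def haar_kernel(t, s, m):
    """E_m(t, s) = 2^m * sum_k phi(2^m t - k) * phi(2^m s - k) for phi = 1_[0,1).

    Only one term of the sum is nonzero, namely k = floor(2^m t), so the
    kernel is 2^m iff t and s share the same dyadic cell of length 2^-m.
    """
    return 2.0 ** m if np.floor(2.0 ** m * t) == np.floor(2.0 ** m * s) else 0.0
```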

Let $A_i = [s_{i-1}, s_i]$ denote intervals that partition $[0,1]$ with $t_i \in A_i$, $1 \le i \le n$. Then we can define the wavelet estimators as follows.

First, suppose that $\beta$ is known. We define the estimator of $g(\cdot)$ by

\[ \hat g_0(t, \beta) = \sum_{i=1}^n (y_i - X_i^T\beta)\int_{A_i}E_m(t,s)\,ds. \tag{2.2} \]

Next, we define the wavelet estimator $\hat\beta_n$ by

\[ \hat\beta_n = \arg\min_\beta \sum_{i=1}^n \big(y_i - X_i^T\beta - \hat g_0(t_i, \beta)\big)^2 = (\tilde X^T\tilde X)^{-1}\tilde X^T\tilde Y, \tag{2.3} \]

where $X = (x_{ir})_{n\times d}$, $Y = (y_1, \cdots, y_n)^T$, $S = (S_{ij})_{n\times n}$ with $S_{ij} = \int_{A_j}E_m(t_i, s)\,ds$, $\tilde X = (I-S)X$ and $\tilde Y = (I-S)Y$.

Finally, we define the linear wavelet estimators of $g(\cdot)$ and $h(\cdot)$ by

\[ \hat g_n(t) = \hat g_0(t, \hat\beta_n) = \sum_{i=1}^n (y_i - X_i^T\hat\beta_n)\int_{A_i}E_m(t,s)\,ds, \tag{2.4} \]

\[ \hat h_n(u) = \sum_{i=1}^n (\tilde y_i - \tilde X_i^T\hat\beta_n)^2\int_{A_i}E_m(u,s)\,ds, \tag{2.5} \]

where $\tilde y_i$ denotes the $i$th component of $\tilde Y$ and $\tilde X_i^T$ the $i$th row of $\tilde X$.
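The estimators (2.2)–(2.5) are directly computable once the matrix $S$ is formed. Below is a minimal sketch using the illustrative Haar kernel above (so each $S_{ij}$ has a closed form as an overlap length); `smoother` and `wavelet_fit` are hypothetical helper names, and $m$ is chosen with $2^m \asymp n^{1/3}$ in the spirit of condition (A4) below.

```python
import numpy as np

def smoother(points, breaks, m):
    """S[i, j] = integral of the Haar E_m(points[i], s) over A_j = [breaks[j], breaks[j+1]]:
    2^m times the length of the overlap of A_j with the dyadic cell containing points[i]."""
    n = len(points)
    S = np.zeros((n, n))
    for i, p in enumerate(points):
        lo = np.floor(2.0 ** m * p) * 2.0 ** -m
        hi = lo + 2.0 ** -m
        for j in range(n):
            S[i, j] = 2.0 ** m * max(0.0, min(breaks[j + 1], hi) - max(breaks[j], lo))
    return S

def wavelet_fit(y, X, t, u, m):
    breaks = np.linspace(0.0, 1.0, len(y) + 1)        # cells A_i partitioning [0, 1]
    S = smoother(t, breaks, m)
    Xt, Yt = X - S @ X, y - S @ y                     # tilde X = (I-S)X, tilde Y = (I-S)Y
    beta_hat = np.linalg.solve(Xt.T @ Xt, Xt.T @ Yt)  # (2.3)
    g_hat = S @ (y - X @ beta_hat)                    # (2.4), evaluated at the t_i
    h_hat = smoother(u, breaks, m) @ (Yt - Xt @ beta_hat) ** 2   # (2.5), at the u_i
    return beta_hat, g_hat, h_hat

# With the simulated data above, m is chosen so that 2^m is of order n^(1/3):
# beta_hat, g_hat, h_hat = wavelet_fit(y, X, t, u, m=int(np.log2(len(y)) / 3))
```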

It is well known that the weighted least squares estimator is superior to the least squares estimator in a linear regression model with heteroscedastic errors. Thus, the above estimators are modified as follows:

\[ \tilde\beta_n = (\tilde X^T\Sigma^{-1}\tilde X)^{-1}\tilde X^T\Sigma^{-1}\tilde Y, \tag{2.6} \]

\[ \tilde g_n(t) = \hat g_0(t, \tilde\beta_n) = \sum_{i=1}^n (y_i - X_i^T\tilde\beta_n)\int_{A_i}E_m(t,s)\,ds, \tag{2.7} \]

\[ \tilde h_n(u) = \sum_{i=1}^n (\tilde y_i - \tilde X_i^T\tilde\beta_n)^2\int_{A_i}E_m(u,s)\,ds, \tag{2.8} \]

where $\Sigma = \mathrm{diag}(\sigma_1^2, \sigma_2^2, \cdots, \sigma_n^2)$.


Since the $\sigma_i^2 = h(u_i)$ are unknown, the estimators (2.6)–(2.8) cannot be used directly. Because $\hat h_n(u)$ is a uniformly strongly consistent estimator of $h(u)$ (see Theorem 3.3 below), it is natural to replace $h(u_i)$ in (2.6)–(2.8) by $\hat h_n(u_i)$. Thus, we obtain the further estimators

\[ \check\beta_n = (\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1}\tilde X^T\hat\Sigma^{-1}\tilde Y, \tag{2.9} \]

\[ \check g_n(t) = \sum_{i=1}^n (y_i - X_i^T\check\beta_n)\int_{A_i}E_m(t,s)\,ds, \tag{2.10} \]

\[ \check h_n(u) = \sum_{i=1}^n (\tilde y_i - \tilde X_i^T\check\beta_n)^2\int_{A_i}E_m(u,s)\,ds, \tag{2.11} \]

where $\hat\Sigma = \mathrm{diag}(\hat h_n(u_1), \cdots, \hat h_n(u_n))$.
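The plug-in step (2.9)–(2.11) then reuses the stage-one $\hat h_n$. A sketch continuing the hypothetical helpers above (clipping $\hat h_n$ away from zero before inverting is a practical safeguard, not something discussed in the paper):

```python
import numpy as np

def feasible_gls_fit(y, X, t, u, m):
    beta_hat, g_hat, h_hat = wavelet_fit(y, X, t, u, m)   # stage 1: (2.3)-(2.5)
    w = 1.0 / np.clip(h_hat, 1e-3, None)                  # diagonal of hat-Sigma^{-1}
    breaks = np.linspace(0.0, 1.0, len(y) + 1)
    S = smoother(t, breaks, m)
    Xt, Yt = X - S @ X, y - S @ y
    beta_check = np.linalg.solve(Xt.T @ (w[:, None] * Xt), Xt.T @ (w * Yt))   # (2.9)
    g_check = S @ (y - X @ beta_check)                                        # (2.10)
    h_check = smoother(u, breaks, m) @ (Yt - Xt @ beta_check) ** 2            # (2.11)
    return beta_check, g_check, h_check
```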

To obtain our results, the following five conditions are sufficient:

(A1) $g(\cdot), f_r(\cdot), h(\cdot) \in H^\alpha$ (Sobolev space) for some $\alpha > 1/2$, $1 \le r \le d$.

(A2) $g(\cdot)$, $f_r(\cdot)$ and $h(\cdot)$ are Lipschitz functions of order $\gamma > 0$, $1 \le r \le d$.

(A3) $\phi(\cdot)$ belongs to $S_l$, a Schwartz space, for $l \ge \alpha$; $\phi$ is a Lipschitz function of order 1 with compact support, and $|\hat\phi(\xi) - 1| = O(\xi)$ as $\xi \to 0$, where $\hat\phi$ denotes the Fourier transform of $\phi$.

(A4) The $s_i$ ($i = 1, \cdots, n$) and $m$ satisfy $\max_{1\le i\le n}(s_i - s_{i-1}) = O(n^{-1})$ and $2^m = O(n^{1/3})$, respectively.

(A5) $0 < m_0 \le \min_{1\le i\le n}h(u_i) \le \max_{1\le i\le n}h(u_i) \le M_0 < \infty$.

3 Statements of the Results

Theorem 3.1 Suppose that conditions (A1)–(A4) hold. If $\sup_{i\ge1}E|e_i|^p < \infty$ for some $p > 3$ and $E\eta_{1j}^2 < \infty$ ($j = 1, 2, \cdots, d$), then for $\gamma \ge 1/3$ and $\alpha > 3/2$,

\[ \sup_{1\le i\le d}|\hat\beta_{ni} - \beta_i| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty, \]

where $\hat\beta_{ni}$ is the $i$th component of $\hat\beta_n$.

Theorem 3.2 Under the same assumptions as in Theorem 3.1, we have

\[ \sup_{0\le t\le1}|\hat g_n(t) - g(t)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.3 Under the same assumptions as in Theorem 3.1 and $\sup_{i\ge1}E|e_i|^4 < \infty$, we have

\[ \sup_{0\le u\le1}|\hat h_n(u) - h(u)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.4 Under the same assumptions as in Theorem 3.1, we have

\[ \sup_{1\le i\le d}|\tilde\beta_{ni} - \beta_i| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.5 Under the same assumptions as in Theorem 3.1, we have

\[ \sup_{0\le t\le1}|\tilde g_n(t) - g(t)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]


Theorem 3.6 Under the same assumptions as in Theorem 3.3, we have

\[ \sup_{0\le u\le1}|\tilde h_n(u) - h(u)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.7 Under the same assumptions as in Theorem 3.1, we have

\[ \sup_{1\le i\le d}|\check\beta_{ni} - \beta_i| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.8 Under the same assumptions as in Theorem 3.1, we have

\[ \sup_{0\le t\le1}|\check g_n(t) - g(t)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Theorem 3.9 Under the same assumptions as in Theorem 3.3, we have

\[ \sup_{0\le u\le1}|\check h_n(u) - h(u)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Remark 3.1 Theorem 3.1 is the same as [15, Theorem 2], where the errors are i.i.d. random variables; in the present paper, however, the errors form an NA sequence. Hence the result extends that of Qian and Cai [15]. In addition, the strong consistencies and a weak convergence rate of the wavelet estimators follow easily from the above results. Hence our results extend results obtained in independent random error settings, such as those of Qian and Cai [15], Zhou and You [32], and so on.

Remark 3.2 From the proofs below, Lemma 4.4 and Lemma 4.5 are the crucial tools for establishing the above convergence rates; consequently, the convergence rates of the parametric and nonparametric components are the same under the same conditions. Qian and Cai [15] obtained different convergence rates under different conditions, and we expect that different convergence rates could likewise be obtained here under different conditions by their technique.
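As an informal sanity check of these rates, one can rerun the hypothetical sketches from Sections 1–2 at increasing sample sizes and watch the sup-errors shrink (illustrative only, since the Haar kernel violates (A3)); `f`, `g`, `h`, `beta` and `feasible_gls_fit` are the objects defined in the earlier sketches.

```python
import numpy as np

rng = np.random.default_rng(1)
for n in (200, 800, 3200):
    t = (np.arange(1, n + 1) - 0.5) / n
    u = t.copy()
    X = f(t) + rng.normal(0.0, 0.3, size=(n, 2))
    z = rng.standard_normal(n)
    e = (z - z.mean()) / np.sqrt(1.0 - 1.0 / n)        # NA errors, as in Section 1
    y = X @ beta + g(t) + np.sqrt(h(u)) * e
    bc, gc, hc = feasible_gls_fit(y, X, t, u, m=int(np.log2(n) / 3))
    print(n, np.abs(bc - beta).max(),                  # parametric sup-error
          np.abs(gc - g(t)).max(),                     # nonparametric sup-error
          np.abs(hc - h(u)).max())                     # variance-function sup-error
```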

4 Proofs of Theorems

Throughout this paper, let C denote a generic positive constant which could take differentvalues at each occurrence. To prove the main results, we first introduce some lemmas.

Lemma 4.1 (see [33]) If condition (A3) holds, then

(I) $|E_0(t,s)| \le \dfrac{C_k}{(1+|t-s|)^k}$ and $|E_m(t,s)| \le \dfrac{2^m C_k}{(1+2^m|t-s|)^k}$, where $k$ is a positive integer and $C_k$ is a constant depending on $k$ only;

(II) $\sup_{0\le s\le1}|E_m(t,s)| = O(2^m)$;

(III) $\sup_t \int_0^1 |E_m(t,s)|\,ds \le C$, where $C$ is a positive constant.

Lemma 4.2 (see [24]) Let $\tau_m = 2^{-m(\alpha-1/2)}$ when $1/2 < \alpha < 3/2$, $\tau_m = \sqrt{m}\,2^{-m}$ when $\alpha = 3/2$, and $\tau_m = 2^{-m}$ when $\alpha > 3/2$. If conditions (A1)–(A4) hold, then

\[ \sup_t\Big|f_j(t) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t,s)\,ds\Big)f_j(t_k)\Big| = O(n^{-\gamma}) + O(\tau_m), \]

\[ \sup_t\Big|g(t) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t,s)\,ds\Big)g(t_k)\Big| = O(n^{-\gamma}) + O(\tau_m). \]
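Since this computation recurs below (e.g. in (4.7), (4.20), (4.26) and (4.30)), it is worth recording once, under the standard reading of (A4) that $2^m \asymp n^{1/3}$ and the hypotheses $\gamma \ge 1/3$, $\alpha > 3/2$ of Theorem 3.1:

\[ \tau_m = 2^{-m} \asymp n^{-1/3}, \qquad O(n^{-\gamma}) + O(\tau_m) = O(n^{-1/3}) = o(n^{-1/3}\log n). \]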

Lemma 4.3 (see [34]) Let $\{e_i, i\ge1\}$ be an NA sequence with $\sup_{i\ge1}E|e_i|^p < \infty$ for some $p > 3$. Assume that the $a_{ni}(t)$ are nonnegative functions on $[0,1]$ such that $\sum_{i=1}^n a_{ni}(t) \le c$ and $\max_{1\le i\le n}a_{ni}(t)\,n^{2/3} \le c$ for any $t \in [0,1]$, and $\max_{1\le i\le n}|a_{ni}(t_1) - a_{ni}(t_2)| \le c|t_1 - t_2|$ for any $t_1, t_2 \in [0,1]$. Then

\[ \sup_{0\le t\le1}\Big|\sum_{i=1}^n a_{ni}(t)e_i\Big| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Lemma 4.4 Let $\{e_i, i\ge1\}$ be an NA sequence with $\sup_{i\ge1}E|e_i|^p < \infty$ for some $p > 3$. Suppose that conditions (A3) and (A4) hold. Then

\[ \sup_{0\le t\le1}\Big|\sum_{i=1}^n e_i\int_{A_i}E_m(t,s)\,ds\Big| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Proof Let $a_i(t) = \int_{A_i}E_m(t,s)\,ds$. By Lemma 4.1,

\[ \sup_{0\le t\le1}\sum_{i=1}^n\Big|\int_{A_i}E_m(t,s)\,ds\Big| \le \sup_{0\le t\le1}\int_0^1|E_m(t,s)|\,ds \le c \]

and

\[ \sup_{0\le t\le1}\max_{1\le i\le n}\Big|\int_{A_i}E_m(t,s)\,ds\Big|\,n^{2/3} \le c\,n^{-1}2^m n^{2/3} \le c. \]

Since $E_0(t,s)$ satisfies a Lipschitz condition of order 1 in $t$, we obtain

\[ \max_{1\le i\le n}\Big|\int_{A_i}E_m(t_1,s)\,ds - \int_{A_i}E_m(t_2,s)\,ds\Big| \le \int_{A_i}|E_m(t_1,s) - E_m(t_2,s)|\,ds \le c\,n^{-1}2^m|E_0(2^m t_1, 2^m s) - E_0(2^m t_2, 2^m s)| \le c\,n^{-1}2^{2m}|t_1 - t_2| \le c|t_1 - t_2|. \]

Thus the $a_i(t)$ satisfy the conditions of Lemma 4.3, and Lemma 4.4 follows.

Lemma 4.5 (see [35]) Suppose that conditions (A3) and (A4) hold, and $E\eta_{1j}^2 < \infty$ ($j = 1, 2, \cdots, d$). Then

\[ \sup_t\Big|\sum_{k=1}^n\eta_{kj}\int_{A_k}E_m(t,s)\,ds\Big| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Lemma 4.6 (see [36]) Let $\{e_i, i\ge1\}$ be a strongly mixing sequence with $Ee_i = 0$ and $\sup_{i\ge1}E|e_i|^p < \infty$ for some $p > 2$, and let $\{a_{ni}, i = 1, 2, \cdots\}$ be a real number sequence. Assume that

\[ \sum_{n=1}^\infty\Big(\sum_{i=1}^\infty a_{ni}^2\log n\Big)^{p/2} < \infty \quad \text{and} \quad \sum_{n=1}^\infty\alpha(n)^{(p-2)/p} < \infty, \]

where the $\alpha(n)$ are the mixing coefficients. Then

\[ \sum_{i=1}^\infty a_{ni}e_i = o(1), \quad \text{a.s.}, \ n\to\infty. \]

Lemma 4.7 (see [37]) Let $\{b_n, n\ge1\}$ be a sequence of positive nondecreasing real numbers with $0 < b_n \uparrow \infty$, and let $\{e_i, i\ge1\}$ be NA random variables with $\sum_{n=1}^\infty \sigma_n^2/b_n^2 < \infty$, where $\sigma_n^2 = \mathrm{Var}(e_n)$. Then

\[ \frac{1}{b_n}\sum_{i=1}^n e_i = o(1), \quad \text{a.s.}, \ n\to\infty. \]

Proof of Theorem 3.1 Let $\tilde\varepsilon = (I-S)\varepsilon = (I-S)\Sigma^{1/2}e$ with $e = (e_1, e_2, \cdots, e_n)^T$, and let $g = (g(t_1), \cdots, g(t_n))^T$ and $\tilde g = (I-S)g$. Then

\[ \hat\beta_n - \beta = (n^{-1}\tilde X^T\tilde X)^{-1}(n^{-1}\tilde X^T\tilde g + n^{-1}\tilde X^T\tilde\varepsilon). \tag{4.1} \]

By the proof of Theorem 2.1 in [24], we obtain that

\[ n^{-1}\tilde X^T\tilde X \xrightarrow{\text{a.s.}} V \quad (n\to\infty). \tag{4.2} \]

Next we shall prove that

\[ (n^{-1}\tilde X^T\tilde g)_i = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty, \tag{4.3} \]

where $(n^{-1}\tilde X^T\tilde g)_i$ is the $i$th component of $n^{-1}\tilde X^T\tilde g$. In fact,

\[ (n^{-1}\tilde X^T\tilde g)_i = \frac1n\sum_{h=1}^n\eta_{hi}\Big(g(t_h) - \sum_{r=1}^n S_{hr}g(t_r)\Big) - \frac1n\sum_{h=1}^n\sum_{k=1}^n S_{hk}\eta_{ki}\Big(g(t_h) - \sum_{r=1}^n S_{hr}g(t_r)\Big) + \frac1n\sum_{h=1}^n\Big(f_i(t_h) - \sum_{k=1}^n S_{hk}f_i(t_k)\Big)\Big(g(t_h) - \sum_{r=1}^n S_{hr}g(t_r)\Big) =: J_1 + J_2 + J_3. \tag{4.4} \]

By the Markov inequality and Lemma 4.2, we obtain that

\[ \sum_{n=1}^\infty P(|J_1| \ge n^{-1/3}\log n) \le \sum_{n=1}^\infty\frac{EJ_1^2}{(n^{-1/3}\log n)^2} \le C\sum_{n=1}^\infty n^{-1}(\log n)^{-2} < \infty. \]

Hence, by the Borel–Cantelli lemma,

\[ J_1 = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.5} \]

By Lemma 4.2 and Lemma 4.5, it is easy to obtain

\[ J_2 = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.6} \]

By Lemma 4.2, we obtain

\[ J_3 = O(n^{-2\gamma}) + O(\tau_m^2) = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.7} \]


Hence, (4.3) follows from (4.5)–(4.7). Finally, we shall show that

\[ (n^{-1}\tilde X^T\tilde\varepsilon)_i = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.8} \]

In fact,

\[ (n^{-1}\tilde X^T\tilde\varepsilon)_i = \frac1n\sum_{h=1}^n\Big(\eta_{hi} - \sum_{k=1}^n S_{hk}\eta_{ki}\Big)\Big(\varepsilon_h - \sum_{r=1}^n S_{hr}\varepsilon_r\Big) + \frac1n\sum_{h=1}^n\Big(f_i(t_h) - \sum_{k=1}^n S_{hk}f_i(t_k)\Big)\Big(\varepsilon_h - \sum_{r=1}^n S_{hr}\varepsilon_r\Big) =: T_1 + T_2. \tag{4.9} \]

Write

\[ T_1 = \frac1n\sum_{h=1}^n\Big(\eta_{hi}\varepsilon_h - \Big(\sum_{k=1}^n S_{hk}\eta_{ki}\Big)\varepsilon_h - \Big(\sum_{r=1}^n S_{hr}\varepsilon_r\Big)\eta_{hi}\Big) + \frac1n\sum_{h=1}^n\Big(\sum_{k=1}^n S_{hk}\eta_{ki}\Big)\Big(\sum_{r=1}^n S_{hr}\varepsilon_r\Big) =: T_1^{(1)} - T_1^{(2)} - T_1^{(3)} + T_1^{(4)}. \tag{4.10} \]

Note that $\{\eta_{hi}\}$ and $\{\varepsilon_h\}$ are independent of each other, $E\eta_i = 0$ and $Ee_i = 0$. It is easy to show that $\mathrm{Cov}(\eta_{hi}\varepsilon_h, \eta_{ki}\varepsilon_k) = 0$. Thus $\{\eta_{hi}\varepsilon_h, h = 1, 2, \cdots, n\}$ is a $\rho$-mixing random sequence with $\rho(n) = 0$, and hence an $\alpha$-mixing sequence (note that $0 \le \alpha(n) \le \frac14\rho(n) = 0$, see [2]). Let $a_{nh} = n^{-2/3}\log^{-1}n$. By Lemma 4.6 and $\alpha(n) = 0$, we can easily obtain

\[ T_1^{(1)} = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.11} \]

It is easy to show that if $\{e_i, i = 1, 2, \cdots, n\}$ is NA, then $\{e_i^+, i = 1, 2, \cdots, n\}$ and $\{e_i^-, i = 1, 2, \cdots, n\}$ are NA as well. By Lemma 4.7 (applied with $b_n = n$ to the centered sequences $\{e_i^+ - Ee_i^+\}$ and $\{e_i^- - Ee_i^-\}$), we obtain $n^{-1}\sum_{h=1}^n(e_h^+ - Ee_h^+) = o(1)$ a.s. and $n^{-1}\sum_{h=1}^n(e_h^- - Ee_h^-) = o(1)$ a.s., $n\to\infty$. Therefore, since $|e_i| = e_i^+ + e_i^-$,

\[ n^{-1}\sum_{h=1}^n|e_h| = O(1), \quad \text{a.s.}, \ n\to\infty. \tag{4.12} \]

By Lemma 4.5, (A5) and (4.12), we obtain

\[ |T_1^{(2)}| \le \max_h\Big|\sum_{k=1}^n S_{hk}\eta_{ki}\Big| \cdot \max_h h^{1/2}(u_h) \cdot \Big(\frac1n\sum_{h=1}^n|e_h|\Big) = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.13} \]

By Lemma 4.4 and the strong law of large numbers, we obtain

\[ |T_1^{(3)}| \le \max_h\Big|\sum_{r=1}^n S_{hr}e_r\Big| \cdot \max_h h^{1/2}(u_h) \cdot \Big(\frac1n\sum_{h=1}^n|\eta_{hi}|\Big) = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.14} \]


By Lemma 4.4 and Lemma 4.5, we obtain

\[ |T_1^{(4)}| \le \max_h\Big|\sum_{k=1}^n S_{hk}\eta_{ki}\Big| \cdot \max_h h^{1/2}(u_h) \cdot \max_h\Big|\sum_{r=1}^n S_{hr}e_r\Big| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.15} \]

Hence, from (4.10)–(4.15), we obtain

\[ T_1 = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.16} \]

By Lemma 4.2, (4.12) and Lemma 4.4, we obtain that

\[ |T_2| \le \max_h\Big|f_i(t_h) - \sum_{k=1}^n S_{hk}f_i(t_k)\Big|\Big\{\max_h h^{1/2}(u_h)\cdot\frac1n\sum_{h=1}^n|e_h| + \max_r h^{1/2}(u_r)\cdot\max_h\Big|\sum_{r=1}^n S_{hr}e_r\Big|\Big\} \le Cn^{-1/3}, \quad \text{a.s.} \tag{4.17} \]

Thus, (4.8) follows from (4.9) and (4.16)–(4.17). By (4.1), we obtain

\[ \sup_{1\le i\le d}|\hat\beta_{ni} - \beta_i| \le d\cdot\sup_{1\le i,j\le d}\big|(n^{-1}\tilde X^T\tilde X)^{-1}_{ij}\big|\sup_{1\le j\le d}\big|(n^{-1}\tilde X^T\tilde g)_j + (n^{-1}\tilde X^T\tilde\varepsilon)_j\big|. \tag{4.18} \]

Hence, Theorem 3.1 follows from (4.2)–(4.3), (4.8) and (4.18).

Proof of Theorem 3.2 Note that

\[ \sup_t|\hat g_n(t) - g(t)| \le \sup_t|\hat g_0(t, \beta) - g(t)| + \sup_t\Big|\sum_{j=1}^n X_j^T(\beta - \hat\beta_n)\int_{A_j}E_m(t,s)\,ds\Big| \]

\[ \le \sup_t\Big|\sum_{j=1}^n g(t_j)\int_{A_j}E_m(t,s)\,ds - g(t)\Big| + \sup_t\Big|\sum_{j=1}^n\varepsilon_j\int_{A_j}E_m(t,s)\,ds\Big| + \sum_{j=1}^d\Big(|\hat\beta_{nj} - \beta_j|\sup_t\Big|\sum_{i=1}^n f_j(t_i)\int_{A_i}E_m(t,s)\,ds\Big|\Big) + \sum_{j=1}^d\Big(|\hat\beta_{nj} - \beta_j|\sup_t\Big|\sum_{i=1}^n\eta_{ij}\int_{A_i}E_m(t,s)\,ds\Big|\Big) \]

\[ =: K_1 + K_2 + K_3 + K_4. \tag{4.19} \]

By Lemma 4.2, we obtain

\[ K_1 = O(n^{-\gamma}) + O(\tau_m) = o(n^{-1/3}\log n). \tag{4.20} \]

By Lemma 4.4, we obtain

\[ K_2 = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.21} \]

Let $M = \sup_{j,t_i}|f_j(t_i)|$. Then by Lemma 4.1 and Theorem 3.1, we obtain

\[ K_3 \le d\cdot\sup_{j,t_i}|f_j(t_i)|\cdot\sup_t\int_0^1|E_m(t,s)|\,ds\cdot\max_j|\hat\beta_{nj} - \beta_j| \le C\max_{1\le j\le d}|\hat\beta_{nj} - \beta_j| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.22} \]

By Lemma 4.5 and Theorem 3.1, we obtain

\[ K_4 \le d\sup_{1\le j\le d}|\hat\beta_{nj} - \beta_j|\max_{1\le j\le d}\sup_t\Big|\sum_{i=1}^n\eta_{ij}\int_{A_i}E_m(t,s)\,ds\Big| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.23} \]

From (4.19)–(4.23), the desired conclusion follows.

Proof of Theorem 3.3 By (2.5) and $\tilde y_i = \tilde X_i^T\beta + \tilde g(t_i) + \tilde\varepsilon_i$ (which follows from (1.1)), we obtain that

\[ \hat h_n(u) = \sum_{i=1}^n(\tilde X_i^T(\beta - \hat\beta_n))^2\int_{A_i}E_m(u,s)\,ds + \sum_{i=1}^n\tilde g^2(t_i)\int_{A_i}E_m(u,s)\,ds + \sum_{i=1}^n\tilde\varepsilon_i^2\int_{A_i}E_m(u,s)\,ds \]

\[ + 2\sum_{i=1}^n\tilde X_i^T(\beta - \hat\beta_n)\tilde g(t_i)\int_{A_i}E_m(u,s)\,ds + 2\sum_{i=1}^n\tilde X_i^T(\beta - \hat\beta_n)\tilde\varepsilon_i\int_{A_i}E_m(u,s)\,ds + 2\sum_{i=1}^n\tilde g(t_i)\tilde\varepsilon_i\int_{A_i}E_m(u,s)\,ds \]

\[ =: I_1 + I_2 + I_3 + 2I_4 + 2I_5 + 2I_6, \tag{4.24} \]

where $\tilde g(t_i)$ and $\tilde\varepsilon_i$ denote the $i$th components of $\tilde g = (I-S)g$ and $\tilde\varepsilon = (I-S)\varepsilon$.

By Theorem 3.1, Lemma 4.1, Lemma 4.2 and Lemma 4.5,

\[ |I_1| = \Big|\sum_{i=1}^n\Big(\sum_{r=1}^d\tilde x_{ir}(\beta_r - \hat\beta_{nr})\Big)^2\int_{A_i}E_m(u,s)\,ds\Big| \le \max_{1\le r\le d}|\beta_r - \hat\beta_{nr}|^2\Big|\sum_{i=1}^n\Big(\sum_{r=1}^d(\tilde f_r(t_i) + \tilde\eta_{ir})\Big)^2\int_{A_i}E_m(u,s)\,ds\Big| \]

\[ \le C\max_{1\le r\le d}|\beta_r - \hat\beta_{nr}|^2\Big|\sum_{i=1}^n\sum_{r=1}^d(\tilde f_r^2(t_i) + \tilde\eta_{ir}^2)\int_{A_i}E_m(u,s)\,ds\Big| \le Cn^{-2/3}\log^2 n\Big(Cn^{-1/3} + \sum_{i=1}^n\sum_{r=1}^d\tilde\eta_{ir}^2\int_{A_i}E_m(u,s)\,ds\Big) \]

\[ = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty, \tag{4.25} \]

where $\tilde f_r(t_i) = f_r(t_i) - \sum_{k=1}^n S_{ik}f_r(t_k)$ and $\tilde\eta_{ir} = \eta_{ir} - \sum_{k=1}^n S_{ik}\eta_{kr}$.

By Lemma 4.1 and Lemma 4.2, we have that

\[ \sup_{0\le u\le1}|I_2| \le \sup_{0\le u\le1}\sum_{i=1}^n\Big|\int_{A_i}E_m(u,s)\,ds\Big|\max_{1\le i\le n}|\tilde g^2(t_i)| = O(n^{-2\gamma}) + O(\tau_m^2) = O(n^{-2/3}), \quad \text{a.s.}, \ n\to\infty. \tag{4.26} \]

In the following, we shall prove that

\[ \sup_{0\le u\le1}|I_3 - h(u)| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.27} \]

By the Cauchy inequality and Lemma 4.1, we obtain that

\[ \sup_{0\le u\le1}|I_3 - h(u)| \le \sup_{0\le u\le1}|I_{31}| + \sup_{0\le u\le1}|I_{32}| + C\max_{1\le i\le n}|I_{33}| + 2\sup_{0\le u\le1}\{I_{31} + I_{34}\}^{1/2}\sup_{0\le u\le1}\Big\{\sum_{i=1}^n\int_{A_i}E_m(u,s)\,ds\,|I_{33}|\Big\}^{1/2}, \tag{4.28} \]

where

\[ I_{31} = \sum_{i=1}^n h(u_i)(e_i^2 - 1)\int_{A_i}E_m(u,s)\,ds, \qquad I_{32} = \sum_{i=1}^n h(u_i)\int_{A_i}E_m(u,s)\,ds - h(u), \]

\[ I_{33} = \Big(\sum_{j=1}^n h^{1/2}(u_j)e_j\int_{A_j}E_m(t_i,s)\,ds\Big)^2, \qquad I_{34} = \sum_{i=1}^n h(u_i)\int_{A_i}E_m(u,s)\,ds. \]

Let $\xi_i = e_i^2 - 1 = \big((e_i^+)^2 - E(e_i^+)^2\big) + \big((e_i^-)^2 - E(e_i^-)^2\big) =: \xi_i^+ + \xi_i^-$ (using $Ee_i^2 = 1$). Then $\{\xi_i^+, i\ge1\}$ and $\{\xi_i^-, i\ge1\}$ are NA with $E\xi_i^\pm = 0$ and $\mathrm{Var}(\xi_i^\pm) < \infty$. By Lemma 4.5, we obtain that

\[ \sup_{0\le u\le1}|I_{31}| = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.29} \]

By Lemma 4.2, we have

\[ \sup_{0\le u\le1}|I_{32}| = O(n^{-\gamma}) + O(\tau_m) = O(n^{-1/3}), \quad \text{a.s.}, \ n\to\infty. \tag{4.30} \]

By Lemma 4.4, we have

\[ \max_{1\le i\le n}|I_{33}| = O(n^{-2/3}\log^2 n), \quad \text{a.s.}, \ n\to\infty. \tag{4.31} \]

By Lemma 4.1, we obtain that

\[ |I_{34}| \le \max_{1\le i\le n}|h(u_i)|\int_0^1|E_m(u,s)|\,ds \le C. \tag{4.32} \]

Hence, (4.27) follows from (4.28)–(4.32).

By the Cauchy inequality, (4.25) and (4.26), we obtain that

\[ I_4^2 \le \sum_{i=1}^n(\tilde X_i^T(\beta - \hat\beta_n))^2\int_{A_i}E_m(u,s)\,ds\cdot\sum_{i=1}^n\tilde g^2(t_i)\int_{A_i}E_m(u,s)\,ds = I_1 I_2. \]

Hence,

\[ |I_4| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.33} \]

By the Cauchy inequality, (4.25) and (4.27), we obtain that

\[ I_5^2 \le \sum_{i=1}^n(\tilde X_i^T(\beta - \hat\beta_n))^2\int_{A_i}E_m(u,s)\,ds\cdot\sum_{i=1}^n\tilde\varepsilon_i^2\int_{A_i}E_m(u,s)\,ds = I_1 I_3. \]

Thus

\[ |I_5| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.34} \]


By the Cauchy inequality, (4.26) and (4.27), we obtain that

\[ \sup_{0\le u\le1}|I_6| \le \sup_{0\le u\le1}\Big|\Big(\sum_{i=1}^n\tilde\varepsilon_i^2\int_{A_i}E_m(u,s)\,ds\Big)^{1/2}\Big(\sum_{i=1}^n\tilde g^2(t_i)\int_{A_i}E_m(u,s)\,ds\Big)^{1/2}\Big| \le \sup_{0\le u\le1}\big|I_2^{1/2}I_3^{1/2}\big| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.35} \]

Therefore, Theorem 3.3 follows from (4.24)–(4.27) and (4.33)–(4.35).

Proof of Theorem 3.4 Noting (A5) and

\[ \tilde\beta_n - \beta = (n^{-1}\tilde X^T\Sigma^{-1}\tilde X)^{-1}(n^{-1}\tilde X^T\Sigma^{-1}\tilde g + n^{-1}\tilde X^T\Sigma^{-1}\tilde\varepsilon), \]

the proof of Theorem 3.4 is analogous to that of Theorem 3.1.

Proof of Theorem 3.5 Note that

\[ \sup_t|\tilde g_n(t) - g(t)| \le \sup_t\Big|\sum_{i=1}^n(y_i - X_i^T\beta)\int_{A_i}E_m(t,s)\,ds - g(t)\Big| + \sup_t\Big|\sum_{i=1}^n X_i^T(\beta - \tilde\beta_n)\int_{A_i}E_m(t,s)\,ds\Big| \]

\[ \le \sup_t\Big|\sum_{i=1}^n g(t_i)\int_{A_i}E_m(t,s)\,ds - g(t)\Big| + \sup_t\Big|\sum_{i=1}^n h^{1/2}(u_i)e_i\int_{A_i}E_m(t,s)\,ds\Big| + \sum_{j=1}^d\Big(|\tilde\beta_{nj} - \beta_j|\sup_t\Big|\sum_{i=1}^n f_j(t_i)\int_{A_i}E_m(t,s)\,ds\Big|\Big) + \sum_{j=1}^d\Big(|\tilde\beta_{nj} - \beta_j|\sup_t\Big|\sum_{i=1}^n\eta_{ij}\int_{A_i}E_m(t,s)\,ds\Big|\Big). \]

From Lemma 4.1, Lemma 4.2, Lemma 4.4, Lemma 4.5 and Theorem 3.4, the desired conclusion follows.

Proof of Theorem 3.6 Note that

\[ \tilde h_n(u) = \sum_{i=1}^n(\tilde X_i^T(\beta - \tilde\beta_n))^2\int_{A_i}E_m(u,s)\,ds + \sum_{i=1}^n\tilde g^2(t_i)\int_{A_i}E_m(u,s)\,ds + \sum_{i=1}^n\tilde\varepsilon_i^2\int_{A_i}E_m(u,s)\,ds \]

\[ + 2\sum_{i=1}^n\tilde X_i^T(\beta - \tilde\beta_n)\tilde g(t_i)\int_{A_i}E_m(u,s)\,ds + 2\sum_{i=1}^n\tilde X_i^T(\beta - \tilde\beta_n)\tilde\varepsilon_i\int_{A_i}E_m(u,s)\,ds + 2\sum_{i=1}^n\tilde g(t_i)\tilde\varepsilon_i\int_{A_i}E_m(u,s)\,ds. \]

The proof is similar to that of Theorem 3.3, and hence we omit it here.

Proof of Theorem 3.7 In view of Theorem 3.4, it suffices to prove that

\[ \check\beta_n - \tilde\beta_n = O(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.36} \]

In fact, by (2.6) and (2.9), we obtain that

\[ \check\beta_n - \tilde\beta_n = (\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1}\tilde X^T\hat\Sigma^{-1}(\tilde g + \tilde\varepsilon) - (\tilde X^T\Sigma^{-1}\tilde X)^{-1}\tilde X^T\Sigma^{-1}(\tilde g + \tilde\varepsilon) \]

\[ = (\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1}\tilde X^T(\hat\Sigma^{-1} - \Sigma^{-1})(\tilde g + \tilde\varepsilon) + \big((\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1} - (\tilde X^T\Sigma^{-1}\tilde X)^{-1}\big)\tilde X^T\Sigma^{-1}(\tilde g + \tilde\varepsilon) =: I_1 + I_2 + I_3 + I_4, \tag{4.37} \]

where $I_1$ and $I_2$ denote the $\tilde g$ and $\tilde\varepsilon$ parts of the first term, and $I_3$ and $I_4$ those of the second.

We first prove that

\[ n^{-1}\tilde X^T\Sigma^{-1}\tilde X \xrightarrow{\text{a.s.}} \Sigma_1 \quad (n\to\infty) \tag{4.38} \]

and

\[ n^{-1}\tilde X^T\hat\Sigma^{-1}\tilde X \xrightarrow{\text{a.s.}} \Sigma_2 \quad (n\to\infty), \tag{4.39} \]

where $(\Sigma_1)_{ij} = \lim_{n\to\infty}\frac1n\sum_{h=1}^n h^{-1}(u_h)\eta_{hi}\eta_{hj}$ and $(\Sigma_2)_{ij} = \lim_{n\to\infty}\frac1n\sum_{h=1}^n\hat h_n^{-1}(u_h)\eta_{hi}\eta_{hj}$. The two limits exist because $\{\eta_{hi}\eta_{hj}, h = 1, 2, \cdots, n\}$ are independent random variables, $\{h^{-1}(u_h), h\ge1\}$ is bounded, and $\hat h_n^{-1}(u_h) \to h^{-1}(u_h)$.

Indeed,

\[ (n^{-1}\tilde X^T\Sigma^{-1}\tilde X)_{ij} = \frac1n\sum_{h=1}^n h^{-1}(u_h)\Big(f_i(t_h) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)f_i(t_k)\Big)\Big(\eta_{hj} - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)\eta_{kj}\Big) \]

\[ + \frac1n\sum_{h=1}^n h^{-1}(u_h)\Big(f_j(t_h) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)f_j(t_k)\Big)\Big(\eta_{hi} - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)\eta_{ki}\Big) \]

\[ + \frac1n\sum_{h=1}^n h^{-1}(u_h)\Big(f_i(t_h) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)f_i(t_k)\Big)\Big(f_j(t_h) - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)f_j(t_k)\Big) \]

\[ + \frac1n\sum_{h=1}^n h^{-1}(u_h)\Big(\eta_{hi} - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)\eta_{ki}\Big)\Big(\eta_{hj} - \sum_{k=1}^n\Big(\int_{A_k}E_m(t_h,s)\,ds\Big)\eta_{kj}\Big) =: U_1 + U_2 + U_3 + U_4. \tag{4.40} \]

It is easy to show that

\[ U_1 = O(n^{-\gamma}) + O(\tau_m), \quad U_2 = O(n^{-\gamma}) + O(\tau_m), \quad U_3 = O(n^{-2\gamma}) + O(\tau_m^2), \quad \text{a.s.}, \ n\to\infty. \tag{4.41} \]

By Lemma 4.5, we obtain that

\[ U_4 = \frac1n\sum_{h=1}^n h^{-1}(u_h)\eta_{hi}\eta_{hj} + o(1) \to (\Sigma_1)_{ij}, \quad \text{a.s.}, \ n\to\infty. \tag{4.42} \]


Then (4.38) follows from (4.40)–(4.42). By arguments similar to those for (4.38), we get (4.39).

By (4.3) and Theorem 3.3, we obtain that

\[ (n^{-1}\tilde X^T\hat\Sigma^{-1}\tilde X\,I_1)_i = n^{-1}\sum_{k=1}^n\tilde x_{ki}\big(\hat h_n^{-1}(u_k) - h^{-1}(u_k)\big)\tilde g(t_k) = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.43} \]

By (4.8) and Theorem 3.3, we obtain that

\[ (n^{-1}\tilde X^T\hat\Sigma^{-1}\tilde X\,I_2)_i = (n^{-1}\tilde X^T(\hat\Sigma^{-1} - \Sigma^{-1})\tilde\varepsilon)_i = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.44} \]

Noting that $\hat h_n(u) \xrightarrow{\text{a.s.}} h(u)$ ($n\to\infty$), and arguing as in (4.43) and (4.44), we get

\[ I_3 = \big((\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1} - (\tilde X^T\Sigma^{-1}\tilde X)^{-1}\big)\tilde X^T\Sigma^{-1}\tilde g = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty, \tag{4.45} \]

\[ I_4 = \big((\tilde X^T\hat\Sigma^{-1}\tilde X)^{-1} - (\tilde X^T\Sigma^{-1}\tilde X)^{-1}\big)\tilde X^T\Sigma^{-1}\tilde\varepsilon = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.46} \]

The desired conclusion follows from (4.37)–(4.39) and (4.43)–(4.46).

Proof of Theorem 3.8 In view of Theorem 3.5, it suffices to prove that

\[ \sup_{0\le t\le1}|\check g_n(t) - \tilde g_n(t)| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.47} \]

In fact, by (2.7), (2.10), (4.25) and (4.36), we obtain that

\[ \sup_{0\le t\le1}|\check g_n(t) - \tilde g_n(t)| \le C\sup_{0\le t\le1}\sum_{i=1}^n\Big|X_i^T(\tilde\beta_n - \check\beta_n)\int_{A_i}E_m(t,s)\,ds\Big| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Proof of Theorem 3.9 In view of Theorem 3.6, it suffices to prove that

\[ \sup_{0\le u\le1}|\check h_n(u) - \tilde h_n(u)| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \tag{4.48} \]

In fact, by (2.8) and (2.11), we obtain that

\[ \sup_{0\le u\le1}|\check h_n(u) - \tilde h_n(u)| = \sup_{0\le u\le1}\Big|\sum_{i=1}^n\big(2\tilde y_i - \tilde X_i^T(\check\beta_n + \tilde\beta_n)\big)\big(\tilde X_i^T(\tilde\beta_n - \check\beta_n)\big)\int_{A_i}E_m(u,s)\,ds\Big| = o(n^{-1/3}\log n), \quad \text{a.s.}, \ n\to\infty. \]

Acknowledgement The authors wish to thank the referee for comments and suggestions which helped to improve the presentation of this paper.

References

[1] Speckman, P., Kernel smoothing in partial linear models, J. R. Statist. Soc. B, 50, 1988, 413–436.

[2] Fan, J. and Yao, Q., Nonlinear Time Series: Nonparametric and Parametric Methods, Springer-Verlag, New York, 2003.
[3] Alam, K. and Saxena, K. M. L., Positive dependence in multivariate distributions, Comm. Statist. Theory Methods Ser. A, 10, 1981, 1183–1196.
[4] Joag-Dev, K. and Proschan, F., Negative association of random variables with applications, Ann. Statist., 11, 1983, 286–295.
[5] Matula, P., A note on the almost sure convergence of sums of negatively dependent random variables, Statist. Probab. Lett., 15, 1992, 209–213.
[6] Shao, Q. M. and Su, C., The law of the iterated logarithm for negatively associated random variables, Stochastic Process. Appl., 83, 1999, 139–148.
[7] Wu, Q. and Jiang, Y., A law of the iterated logarithm of partial sums for NA random variables, J. Korean Statist. Soc., 39, 2010, 199–206.
[8] Liang, H. Y., Complete convergence for weighted sums of negatively associated random variables, Statist. Probab. Lett., 48, 2000, 317–325.
[9] Baek, J. I., Kim, T. S. and Liang, H. Y., On the convergence of moving average processes under dependent conditions, Austral. New Zealand J. Statist., 45(3), 2003, 331–342.
[10] Liang, H. and Zhang, J., Strong convergence for weighted sums of negatively associated arrays, Chin. Ann. Math., 31B(2), 2010, 273–288.
[11] Su, C., Zhao, L. and Wang, Y., The moment inequality for NA random variables and the weak convergence (in Chinese), Sci. China Ser. A, 26(12), 1996, 1091–1099.
[12] Jin, J., Convergence rate in the law of logarithm for NA random fields (in Chinese), Acta Math. Scientia Ser. A, 29(4), 2009, 1138–1143.
[13] Roussas, G. G., Asymptotic normality of random fields of positively or negatively associated processes, J. Multivariate Anal., 50, 1994, 152–173.
[14] Chen, H., Convergence rates for parametric components in a partly linear model, Ann. Statist., 16(1), 1988, 136–146.
[15] Qian, W. M. and Cai, G. X., Strong approximability of wavelet estimate in semiparametric regression model (in Chinese), Sci. China Ser. A, 29(3), 1999, 233–240.
[16] Xue, L. and Liu, Q., Bootstrap approximation of wavelet estimates in a semiparametric regression model, Acta Math. Sin. (Engl. Ser.), 26(4), 2010, 763–778.
[17] Chai, G. X. and Xu, K. J., Wavelet smoothing in semiparametric regression model (in Chinese), Chinese J. Appl. Probab. Statist., 15(1), 1999, 97–105.
[18] Hu, H. C., Ridge estimation of a semiparametric regression model, J. Comput. and Appl. Math., 176, 2005, 215–222.
[19] Heckman, N., Spline smoothing in partly linear models, J. R. Statist. Soc. B, 48, 1986, 244–248.
[20] Green, P. J. and Silverman, B. W., Nonparametric Regression and Generalized Linear Models, Chapman and Hall, London, 1995.
[21] Hardle, W., Liang, H. and Gao, J., Partially Linear Models, Physica-Verlag, Heidelberg, 2000.
[22] Shi, P. and Li, G., A note on the convergent rates of M-estimates for a partly linear model, Statistics, 26, 1995, 27–47.
[23] Bianco, A. and Boente, G., Robust estimators in semiparametric partly linear regression models, J. Statist. Plann. Inference, 122, 2004, 229–252.
[24] Hu, H. C. and Hu, D. H., Strong consistency of wavelet estimation in semiparametric regression models (in Chinese), Acta Math. Sinica, 49(5), 2006, 1417–1424.
[25] Hu, H. C. and Hu, D. H., Wavelet estimation in semiparametric regression models with martingale difference time series errors (in Chinese), Adv. Math., 39(1), 2010, 23–30.
[26] You, J. and Chen, G., Semiparametric generalized least squares estimation in partially linear regression models with correlated errors, J. Statist. Plann. Inference, 137(1), 2007, 117–132.
[27] Liang, H. and Jing, B., Asymptotic normality in partial linear models based on dependent errors, J. Statist. Plann. Inference, 139, 2009, 1357–1371.
[28] Zhou, X., Liu, X. and Hu, S., Moment consistency of estimators in partially linear models under NA samples, Metrika, 2009. DOI: 10.1007/s00184-009-0260-5
[29] Baek, J. and Liang, H., Asymptotics of estimators in semi-parametric model under NA samples, J. Statist. Plann. Inference, 136, 2006, 3362–3382.
[30] Liang, H. and Wang, X., Convergence rate of wavelet estimator in semiparametric models with dependent MA(∞) error process, Chinese J. Appl. Probab. Statist., 26(1), 2010, 35–46.
[31] Ren, Z. and Chen, M., Strong consistency of a class of estimators in partial linear model for negatively associated samples (in Chinese), Chinese J. Appl. Probab. Statist., 18(1), 2002, 60–66.
[32] Zhou, X. and You, J., Wavelet estimation in varying-coefficient partially linear regression models, Statist. Probab. Lett., 68, 2004, 91–104.
[33] Antoniadis, A., Gregoire, G. and McKeague, I. W., Wavelet methods for curve estimation, J. Amer. Statist. Assoc., 89, 1994, 1340–1353.
[34] Zhu, C., The convergence rate of the estimate of the regression function under NA sequence, J. Xinjiang Norm. Univ. Nat. Sci. Ed., 27(2), 2008, 5–11.
[35] Hu, H. C., Convergence rates of wavelet estimators in semiparametric regression models with martingale difference error, J. Math. Sci. Adv. Appl., 3(1), 2009, 4–19.
[36] Xu, B., Convergence of the weighted sums with applications for strongly mixing sequence (in Chinese), Acta Math. Sinica, 45(5), 2002, 1025–1034.
[37] Liu, J., Gan, S. and Chen, P., The Hájek–Rényi inequality for NA random variables and its application, Statist. Probab. Lett., 43, 1999, 99–105.