Limiting distribution of the maximal spacing when the density function admits a positive minimum


Statistics & Probability Letters 14 (1992) 53-60

North-Holland

4 May 1992

Philippe Barbe CREST and LSTA-University Paris VI, Paris, France

Received May 1991

Revised September 1991

Abstract: Let $X_1, X_2, \ldots$ be a sequence of random variables with common distribution $F$, and let $f$ be the density function of $F$. Let $X_{1,n} \le \cdots \le X_{n,n}$ be the order statistics of $X_1, \ldots, X_n$ and let $M_n = \max_{2 \le i \le n}(X_{i,n} - X_{i-1,n})$ be the maximal spacing. We assume that $f$ has a positive minimum at $x_0$ and that $f(x_0 + h) = f(x_0) + |h|^r d_{\operatorname{sgn}(h)}(1 + o(1))$ when $h \to 0$. We prove that $\lim_{n\to\infty} P[nM_n < x + a_n] = \exp(-e^{-\phi x})$, where $\phi = f(x_0)$ and $a_n = \phi^{-1}\log n - \phi^{-1}r^{-1}\log\log n + \phi^{-1}\log(r^{-1}d^{-1/r}\Gamma(1/r)\phi^{1/r})$.

Keywords: Maximal spacing, distribution function of the spacings.

1. Introduction

Let $X_1, X_2, \ldots$ be a sequence of real independent and identically distributed (i.i.d.) random variables (r.v.'s), and let $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$ be the order statistics of the first $n$ variables. Let

$$S_{i,n} = X_{i,n} - X_{i-1,n}, \quad 2 \le i \le n,$$

be the spacings, and let

$$M_n = \max_{2 \le i \le n} S_{i,n}$$

be the maximal spacing. When the $X_i$ are uniformly distributed, the behaviour of $M_n$ is well known (see Devroye (1981, 1982), Deheuvels (1982, 1983)). In the non-uniform case, Deheuvels (1983b) gives almost sure bounds for $M_n$.
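The definitions of the spacings $S_{i,n}$ and of the maximal spacing $M_n$ can be illustrated with a minimal Python sketch. The function names `spacings` and `maximal_spacing` are hypothetical helpers introduced here for illustration only.

```python
from typing import List, Sequence


def spacings(sample: Sequence[float]) -> List[float]:
    """Return S_{i,n} = X_{i,n} - X_{i-1,n} for i = 2, ..., n."""
    ordered = sorted(sample)  # order statistics X_{1,n} <= ... <= X_{n,n}
    return [b - a for a, b in zip(ordered, ordered[1:])]


def maximal_spacing(sample: Sequence[float]) -> float:
    """Return M_n, the largest gap between consecutive order statistics."""
    gaps = spacings(sample)
    if not gaps:
        raise ValueError("need at least two observations")
    return max(gaps)
```

For the sample (0.1, 0.5, 0.2, 0.9), the order statistics are 0.1, 0.2, 0.5, 0.9, so the largest gap is the one between 0.5 and 0.9.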

Deheuvels (1986) investigates the asymptotic behaviour of $M_n$ when the extreme values $X_{1,n}$ and $X_{n,n}$ belong to a domain of attraction, i.e. when there exist two sequences $(a_n)$ and $(b_n > 0)$ such that

$$\lim_{n\to\infty} P[b_n^{-1}(X_{n,n} - a_n) \le x] = H(x)$$

Correspondence to: Philippe Barbe, CREST, 12 Rue Boulitte, 75014 Paris, France.

0167-7152/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

Volume 14, Number 1 STATISTICS & PROBABILITY LETTERS 4 May 1992

($H$ being a non-degenerate distribution) and analogously for $X_{1,n}$. In this case (e.g. De Haan (1970)), $H$ is one of the three following distributions:

$\Phi_a(x) = \exp(-x^{-a})$ for $x > 0$, $a > 0$ (Fréchet limiting type),

$\Psi_a(x) = \exp(-(-x)^a)$ for $x < 0$, $a > 0$ (Weibull limiting type),

$\Lambda(x) = \exp(-e^{-x})$ for $x \in \mathbb{R}$ (Gumbel limiting type).

When $X_{n,n}$ and $X_{1,n}$ have a Weibull limiting type with parameter $a < 1$, Deheuvels' techniques (1986) do not give the limiting distribution of $M_n$, because this limiting distribution does not depend on the extremes.

If $a < 1$, the density function is unbounded at the extreme points of its support. In this paper, we deal with densities bounded from below by a positive constant. The limiting behaviour of $M_n$ is related to the minimum of the density function and to the local behaviour of the density function near its minimum, as is the case for the almost sure behaviour of $M_n$ (Deheuvels (1983b)).

2. Results

Let $F(x) = P[X_i \le x]$ be the common distribution function of the sequence $X_1, \ldots, X_n$, and let $f = F'$ be the density function with respect to the Lebesgue measure. We make the following assumptions:

(H1) The support of $f$ is an interval $(A, B)$, $-\infty < A < B < \infty$. $f$ is continuous on $(A, B)$, except maybe at some isolated points, and the number of points of discontinuity is finite.

(H2) $f$ admits a local minimum at a point $x_0$. If $f(x) = f(x_0)$, $f$ is continuously differentiable in a neighbourhood of $x$.

Note that f may have more than one global minimum. Let

$$G_F(x) = 1 - \int_A^B \exp(-x f(y)) f(y)\,dy \quad\text{and}\quad \bar G_F = 1 - G_F.$$

$G_F$ is the limiting distribution of the empirical distribution function of the spacings: if $G_n(x) = n^{-1}\#\{i : nS_{i,n} < x\}$, then $\sup_{x\in\mathbb{R}}|G_n(x) - G_F(x)|$ converges to 0 in probability when $n \to \infty$ (see Pyke (1965)). Barbe (1990) shows that in the cases studied by Deheuvels (1986), the limiting behaviour of $\bar G_F$ as $x \to \infty$ is related to those of the extremes of $F$ and of $M_n$ (using Deheuvels' results (1986)). In the present case, where $f$ is bounded from below by a positive constant, $\bar G_F$ is still related to $M_n$, but in a different manner:

Theorem 1. Under (H1) and (H2), if there exist a sequence $(a_n)$ and a function $\psi(x)$ such that for any $x \in \mathbb{R}$, $\lim_{n\to\infty} n\bar G_F(x + a_n) = \psi(x)$, then

$$\lim_{n\to\infty} P[nM_n < x + a_n] = e^{-\psi(x)}.$$

A consequence of Theorem 1 is that under (H1) and (H2), if two distributions $F_1$ and $F_2$ are associated to the same distribution $G_F$ (i.e. $G_{F_1} = G_{F_2}$), then their maximal spacings have the same limiting behaviour. Let

$$\mu_F(t) = \lambda\{x : f(x) > t\},$$

where $\lambda$ denotes the Lebesgue measure on $\mathbb{R}$. $\mu_F^{-1}$ is the density function of a distribution $K$ and $G_K = G_F$ (Barbe (1990)). Therefore we may replace $f$ by $\mu_F^{-1}$, and, without loss of generality, we may assume that $f$ is unimodal with mode at 0, with support $(0, B)$, and $f(B) = \inf_{0 < x < B} f(x)$. If $\{x : f(x) = f(B)\}$ contains an open set, $f$ is constant on an interval $(B', B)$ and $M_n$ has the same asymptotic behaviour as the maximal spacing of a uniform distribution. Thus we assume from now on that $\{x : f(x) = f(B)\} = \{B\}$. Following Deheuvels (1983b), we also assume that $f$ has an exact order of differentiability in $B$:

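Theorem 2. We suppose that $f(B - h) = \phi + h^r d + o(h^r)$ with $r > 0$, $d > 0$. Let

$$\phi = f(B) = \inf_{x \in (0; B)} f(x) \quad\text{and}\quad \phi > 0.$$

Let

$$a_n = \phi^{-1}\log n - \phi^{-1}r^{-1}\log\log n + \phi^{-1}\log\left(\frac{\Gamma(1/r)\,\phi^{1/r}}{r\,d^{1/r}}\right).$$

Then

$$\lim_{n\to\infty} n\bar G_F(x + a_n) = e^{-\phi x} \quad\text{and}\quad \lim_{n\to\infty} P[nM_n < x + a_n] = \exp(-e^{-\phi x}).$$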
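The centering sequence of Theorem 2 and the limiting distribution function can be evaluated numerically. The following is a minimal sketch, assuming the form $a_n = \phi^{-1}\log n - \phi^{-1}r^{-1}\log\log n + \phi^{-1}\log(\Gamma(1/r)\phi^{1/r}/(r d^{1/r}))$; the names `centering_sequence` and `limit_cdf` are hypothetical and introduced only for illustration.

```python
import math


def centering_sequence(n: int, phi: float, r: float, d: float) -> float:
    """a_n of Theorem 2; needs n >= 2 so that log log n is defined."""
    if n < 2 or phi <= 0 or r <= 0 or d <= 0:
        raise ValueError("need n >= 2 and phi, r, d > 0")
    log_n = math.log(n)
    # log( Gamma(1/r) * phi**(1/r) / (r * d**(1/r)) ), computed on the log scale
    const = math.lgamma(1.0 / r) + (math.log(phi) - math.log(d)) / r - math.log(r)
    return (log_n - math.log(log_n) / r + const) / phi


def limit_cdf(x: float, phi: float) -> float:
    """Limiting distribution function exp(-exp(-phi * x))."""
    return math.exp(-math.exp(-phi * x))
```

For example, `centering_sequence(1000, 1.0, 2.0, 1.0)` evaluates $\log 1000 - \tfrac12\log\log 1000 + \log(\Gamma(1/2)/2)$.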

Remark 1. We failed to obtain an analogous result by assuming only $f(B - h) = \phi + h^r l(h)$, where $l$ is a regularly varying function at 0 with limit 0 or $\infty$ when $h \to 0$; the same failure occurred when $f(B) = 0$ and $f(B - h) = l(h)$.

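Remark 2. The relation $\Gamma(x) = \Gamma(x + 1)/x$ implies $\Gamma(x) \sim 1/x$ when $x \to 0$. Therefore, when $r \to \infty$, $r^{-1}\Gamma(1/r) \to 1$, and

$$\lim_{r\to\infty} \log\left(\frac{\Gamma(1/r)\,\phi^{1/r}}{r\,d^{1/r}}\right) = 0.$$

Thus, if we denote

$$a_n(r) = \phi^{-1}\log n - \phi^{-1}r^{-1}\log\log n + \phi^{-1}\log\left(\frac{\Gamma(1/r)\,\phi^{1/r}}{r\,d^{1/r}}\right),$$

we have $\lim_{r\to\infty} a_n(r) = \phi^{-1}\log n$ for each $n$, and Theorem 2 may be extended to the case of uniform distributions where we have formally $f(B - h) = \phi + h^{\infty} d$.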
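Both statements of Remark 2 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: `constant_term` (a hypothetical name) evaluates the logarithmic constant for large $r$, and `uniform_max_spacing_cdf` (also hypothetical) estimates $P[nM_n < x + \log n]$ by Monte Carlo for uniform points on $(0, 1)$, where $\phi = 1$ and the formal centering is $a_n = \log n$. The sample size, number of replications and seed are arbitrary choices.

```python
import math
import random


def constant_term(r: float, phi: float, d: float) -> float:
    """log( Gamma(1/r) * phi**(1/r) / (r * d**(1/r)) ), on the log scale."""
    return math.lgamma(1.0 / r) - math.log(r) + (math.log(phi) - math.log(d)) / r


def uniform_max_spacing_cdf(n: int, x: float, reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of P[n * M_n < x + log n] for n uniform(0, 1) points."""
    rng = random.Random(seed)
    threshold = (x + math.log(n)) / n
    hits = 0
    for _ in range(reps):
        pts = sorted(rng.random() for _ in range(n))
        m_n = max(b - a for a, b in zip(pts, pts[1:]))  # maximal spacing M_n
        hits += m_n < threshold
    return hits / reps
```

With `constant_term(1e6, 2.0, 3.0)` the constant is already close to 0, and `uniform_max_spacing_cdf(500, 0.0, 2000)` can be compared with the limiting value $\exp(-e^{0}) = e^{-1}$, up to Monte Carlo error.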

Example. We consider a $\beta(p, q)$ distribution, with density function

$$f_{p,q}(t) = B(p, q)^{-1} t^{p-1}(1 - t)^{q-1}$$

for $0 < t < 1$, with $p < 1$ and $q < 1$. We have $x_0 = (p - 1)/(p + q - 2)$,

$$\phi = f_{p,q}(x_0) = (2 - p - q)^{2-p-q}(1 - p)^{p-1}(1 - q)^{q-1}B(p, q)^{-1}.$$

An easy calculation gives $r = 2$ and

$$d = B(p, q)^{-1}(2 - p - q)^{5-p-q}(1 - p)^{p-2}(1 - q)^{q-2}.$$

Theorem 2 gives $a_n$ and the limiting distribution of $M_n$.
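The location $x_0$ and the value $\phi$ of the minimum in this example can be checked numerically. The following is a minimal sketch; the helper names and the grid search are illustrative assumptions, and only $x_0$ and $\phi$ are checked here.

```python
import math


def beta_density(t: float, p: float, q: float) -> float:
    """Density of the beta(p, q) distribution at t in (0, 1)."""
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return math.exp((p - 1) * math.log(t) + (q - 1) * math.log(1 - t) - log_beta)


def minimum_point(p: float, q: float) -> float:
    """x_0 = (p - 1) / (p + q - 2), for p < 1 and q < 1."""
    return (p - 1) / (p + q - 2)


def minimum_value(p: float, q: float) -> float:
    """phi = (2-p-q)^(2-p-q) (1-p)^(p-1) (1-q)^(q-1) / B(p, q)."""
    beta_fn = math.gamma(p) * math.gamma(q) / math.gamma(p + q)
    s = 2 - p - q
    return s ** s * (1 - p) ** (p - 1) * (1 - q) ** (q - 1) / beta_fn


def grid_argmin(p: float, q: float, steps: int = 20000) -> float:
    """Brute-force minimiser of the density over a midpoint grid on (0, 1)."""
    grid = ((k + 0.5) / steps for k in range(steps))
    return min(grid, key=lambda t: beta_density(t, p, q))
```

For $p = 0.5$ and $q = 0.7$ the formula gives $x_0 = 0.625$, which the grid search should reproduce up to the grid resolution.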


3. Proofs

We first prove that the maximal spacing necessarily occurs close to the minimum of the density function. We may then study only the spacings near this minimum. We approximate these spacings by weighted independent exponential r.v.'s; it remains to study the maximum of these variables.

Let us assume that $f$ has a minimum at $x_0$. Let $Q(u) = \inf\{x : F(x) \ge u\}$ be the quantile function. Let $u_0$ be such that $Q(u_0) = x_0$, i.e. $f \circ Q(u_0) = \inf_{A < x < B} f(x)$.

Let $\omega_1, \omega_2, \ldots$ be a sequence of independent and exponentially distributed r.v.'s with $E\omega_i = 1$, and let $W_k = \omega_1 + \cdots + \omega_k$.

Lemma 1.

$$\{W_i/W_{n+1} : 1 \le i \le n\} \stackrel{d}{=} \{U_{i,n} : 1 \le i \le n\},$$

where $U_{1,n} \le \cdots \le U_{n,n}$ are the order statistics of $n$ independent r.v.'s uniformly distributed on $(0, 1)$.

Proof. See e.g. Pyke (1965). □
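The representation of Lemma 1 is easy to simulate. The sketch below is illustrative only; the function names are hypothetical. It builds $(W_1/W_{n+1}, \ldots, W_n/W_{n+1})$ from $n + 1$ unit exponentials and compares the Monte Carlo mean of the $k$-th coordinate with $E[U_{k,n}] = k/(n + 1)$, the known mean of the $k$-th uniform order statistic.

```python
import random
from typing import List


def uniform_order_stats_via_exponentials(n: int, rng: random.Random) -> List[float]:
    """Return (W_1/W_{n+1}, ..., W_n/W_{n+1}) built from n + 1 unit exponentials."""
    omegas = [rng.expovariate(1.0) for _ in range(n + 1)]
    total = sum(omegas)  # W_{n+1}
    partial, out = 0.0, []
    for w in omegas[:n]:
        partial += w  # W_i = omega_1 + ... + omega_i
        out.append(partial / total)
    return out


def mean_of_kth(n: int, k: int, reps: int, seed: int = 0) -> float:
    """Monte Carlo mean of W_k / W_{n+1}, to compare with k / (n + 1)."""
    rng = random.Random(seed)
    total = sum(uniform_order_stats_via_exponentials(n, rng)[k - 1] for _ in range(reps))
    return total / reps
```

The output vector is increasing by construction, so no sorting step is needed; this is what makes the representation convenient in the proofs.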

If $U$ is uniformly distributed on $(0, 1)$, $Q(U)$ has the distribution $F$. We deduce the following lemma:

Lemma 2.

$$\{Q(W_i/W_{n+1}) - Q(W_{i-1}/W_{n+1}) : 2 \le i \le n\} \stackrel{d}{=} \{S_{i,n} : 2 \le i \le n\}. \qquad \square$$

In the sequel we will not distinguish $S_{i,n}$ and $Q(W_i/W_{n+1}) - Q(W_{i-1}/W_{n+1})$. Lemma 2 and Taylor's formula ensure the existence of $\theta_{i,n}$ between $W_{i-1}/W_{n+1}$ and $W_i/W_{n+1}$ such that

$$S_{i,n} = \frac{\omega_i}{W_{n+1}\, f \circ Q(\theta_{i,n})}.$$

We approximate $S_{i,n}$ in the neighbourhood of the minimum of the density function. Let $\delta > 0$ and

$$V = V(\delta) = \{u : f \circ Q(u) < f \circ Q(u_0) + \delta\}.$$

For $\delta$ small enough, $V$ is an interval (or maybe a countable union of disjoint intervals) containing $u_0$ because $f$ is continuous.

Let

$$I_n = I_n(\delta) = \{i : i/n \in V(\delta)\}.$$

Lemma 3.

$$\lim_{n\to\infty} P\Bigl[\max_{i \in I_n^c} S_{i,n} > \max_{i \in I_n} S_{i,n}\Bigr] = 0.$$

Remark. This lemma is implicitly proved in Hall (1984), under the hypotheses of Theorem 2.

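Proof. Let $\nu = \nu_n = 2n^{-1/2}(\log\log n)^{1/2}$ and let $E_n$ be the event

$$E_n = \{\forall i \in I_n^c : \exists x \notin V,\ |\theta_{i,n} - x| < \nu\}.$$

The law of the iterated logarithm ensures that $\lim_{n\to\infty} P(E_n) = 1$.

Taking $\delta$ sufficiently small, we may assume that $f \circ Q$ is continuously differentiable on $V$. On $E_n$, if $i \in I_n^c$,

$$f \circ Q(\theta_{i,n}) \ge (\phi + \delta)(1 + O(\nu_n)) \quad\text{and}\quad S_{i,n} \le \frac{\omega_i}{W_{n+1}(\phi + \delta)(1 + O(\nu_n))}.$$

$f' \circ Q$ is bounded on $V$ and $S_{i,n} = \omega_i Q'(i/n)/W_{n+1}\,(1 + o(1))$ uniformly in $i \in I_n$. Therefore

$$P\Bigl[\max_{i \in I_n^c} S_{i,n} > \max_{i \in I_n} S_{i,n}\Bigr] \le \int_0^\infty \prod_{i \in I_n}\bigl(1 - \exp(-y f \circ Q(i/n))\bigr)\, \#I_n^c\,(\phi + \delta)\bigl(1 - \exp(-y(\phi + \delta)(1 + O(\nu_n)))\bigr)^{\#I_n^c - 1} e^{-y(\phi + \delta)}\,dy\,(1 + o(1))$$

$$= \int_0^\infty \prod_{i \in I_n}\frac{1 - \exp(-y f \circ Q(i/n))}{1 - \exp(-y(\phi + \delta)(1 + O(\nu_n)))}\; \#I_n^c\,(\phi + \delta)\bigl(1 - \exp(-y(\phi + \delta)(1 + O(\nu_n)))\bigr)^{n-1} e^{-y(\phi + \delta)(1 + o(\nu_n))}\,dy\,(1 + o(1)).$$

Note that

$$\lim_{n\to\infty} \prod_{i \in I_n}\frac{1 - \exp(-y f \circ Q(i/n))}{1 - \exp(-y(\phi + \delta)(1 + O(\nu_n)))} = 0$$

and

$$\int_0^\infty \#I_n^c\,(\phi + \delta)\bigl(1 - \exp(-y(\phi + \delta)(1 + O(\nu_n)))\bigr)^{n-1} e^{-y(\phi + \delta)}\,dy < \infty.$$

Lebesgue's dominated convergence theorem gives

$$\lim_{n\to\infty} P\Bigl[\max_{i \in I_n^c} S_{i,n} > \max_{i \in I_n} S_{i,n}\Bigr] = 0. \qquad \square$$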

We shall only study $\max_{i \in I_n} S_{i,n}$. We have already approximated $S_{i,n}$ by $\omega_i/(W_{n+1}\, f \circ Q(i/n))$. The strong law of large numbers asserts that $\lim_{n\to\infty} W_{n+1}/n = 1$ a.s. and therefore $S_{i,n} = \omega_i/(n f \circ Q(i/n))\,(1 + o(1))$ uniformly in $i \in I_n$.

Thus we calculate

$$P\Bigl[n \max_{i \in I_n} S_{i,n} < x + a_n\Bigr] = \prod_{i \in I_n}\Bigl(1 - \exp\bigl(-(x + a_n) f \circ Q(i/n)(1 + o(1))\bigr)\Bigr).$$

Now we need an approximation for

$$S_n := \sum_{i \in I_n} \log\Bigl(1 - \exp\bigl(-(x + a_n) f \circ Q(i/n)\bigr)\Bigr).$$

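Lemma 4. Let

$$T_n = T_n(x) = -n\int_V \exp(-(x + a_n) f \circ Q(u))\,du.$$

Then

$$S_n = T_n + o(1) \quad\text{when } n \to \infty.$$

Proof. Since $a_n \to \infty$ and $f \circ Q(u) \ge \phi > 0$,

$$S_n = -\sum_{i \in I_n} \exp\bigl(-(x + a_n) f \circ Q(i/n)\bigr)\,(1 + o(1)).$$

Let

$$S_n' = -\sum_{i \in I_n} \exp\bigl(-(x + a_n) f \circ Q(i/n)\bigr).$$

Then

$$S_n = S_n'(1 + o(1)).$$

Now

$$|S_n' - T_n| \le n^{-1}\sum_{i \in I_n}\ \sup_{i/n < u \le (i+1)/n}\left|\frac{d}{du}\exp\bigl(-(x + a_n) f \circ Q(u)\bigr)\right| = O(a_n \exp(-a_n\phi)) = o(1)$$

since $a_n \to \infty$.

Thus $|S_n' - T_n| = o(1)$ and $S_n - T_n = o(1)$. □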

Now, to prove Theorem 1 it is sufficient to prove:

Lemma 5.

$$\lim_{n\to\infty} n\bar G_F(x + a_n) = \psi(x) \quad\text{iff}\quad \lim_{n\to\infty} -T_n(x) = \psi(x).$$

Proof. Denote $V^c = [0; 1] - V$, the complement of $V$ in $[0; 1]$. Let

$$I_1(t) = \int_{V^c} \exp(-t f \circ Q(u))\,du$$

and

$$I_2(t) = \int_V \exp(-t f \circ Q(u))\,du.$$

By the change of variable $u = F(y)$, $\bar G_F(t) = \int_0^1 \exp(-t f \circ Q(u))\,du$, so that $\bar G_F = I_1 + I_2$. It is sufficient to prove that $\bar G_F(t) \sim I_2(t)$ when $t \to \infty$.

Jensen's inequality provides the lower bound:

$$I_2(t) \ge v \exp\Bigl(-(t/v)\int_V f \circ Q(u)\,du\Bigr),$$

where $v$ is the Lebesgue measure of $V$. The lower bound $f \circ Q(u) \ge \phi + \delta$ on $V^c$ gives $I_1(t) \le (1 - v)\exp(-t(\phi + \delta))$.

Therefore

$$0 \le I_1(t)/I_2(t) \le v^{-1}\exp\Bigl(-(t/v)\int_V \bigl((\phi + \delta) - f \circ Q(u)\bigr)\,du\Bigr) = v^{-1}\exp(-(t/v)A(v))$$

with $A(v) > 0$. Thus $\lim_{t\to\infty} I_1(t)/I_2(t) = 0$ and $I_1(t) = o(I_2(t))$ when $t \to \infty$. It follows that $\bar G_F(t) \sim I_2(t)$ when $t \to \infty$; since $T_n(x) = -nI_2(x + a_n)$, we may replace $nI_2(x + a_n)$ by $-T_n$. □

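Let us prove Theorem 2. We have shown that

$$\bar G_F(t) \sim \int_0^{\varepsilon} \exp(-t f \circ Q(u))\,du \quad (t \to \infty)$$

for any $\varepsilon > 0$. Now we assume that $f \circ Q(u) = \phi + u^r d(1 + o(1))$ with $0 < d < \infty$.

Lemma 6.

$$\bar G_F(t) \sim \frac{e^{-t\phi}\,\Gamma(1/r)}{t^{1/r}\, r\, d^{1/r}} \quad\text{when } t \to \infty.$$

Proof. For $0 < u < \eta$ small enough,

$$\phi + u^r d(1 - \varepsilon) \le f \circ Q(u) \le \phi + u^r d(1 + \varepsilon).$$

Thus

$$e^{-t\phi}\int_0^{\eta} \exp(-t u^r d(1 + \varepsilon))\,du \le \int_0^{\eta} \exp(-t f \circ Q(u))\,du \le e^{-t\phi}\int_0^{\eta} \exp(-t u^r d(1 - \varepsilon))\,du.$$

We use the change of variable $v = t u^r d(1 + \varepsilon)$ and obtain the integrals as gamma integrals which lead to the result since $\varepsilon$ is arbitrary. □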
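The gamma-integral asymptotics of Lemma 6 can be checked numerically on a model profile. The sketch below assumes the exact form $f \circ Q(u) = \phi + d u^r$ on $(0, 1)$, which is a hypothetical choice made here for illustration (the lemma only requires this behaviour as $u \to 0$). It compares a midpoint-rule value of the integral with the claimed equivalent; the function names are not from the paper.

```python
import math


def tail_integral(t: float, phi: float, d: float, r: float, steps: int = 200000) -> float:
    """Midpoint-rule value of int_0^1 exp(-t * (phi + d * u**r)) du."""
    h = 1.0 / steps
    total = sum(math.exp(-t * (phi + d * ((k + 0.5) * h) ** r)) for k in range(steps))
    return h * total


def lemma6_asymptote(t: float, phi: float, d: float, r: float) -> float:
    """exp(-t * phi) * Gamma(1/r) / (t**(1/r) * r * d**(1/r))."""
    return math.exp(-t * phi) * math.gamma(1.0 / r) / (r * (t * d) ** (1.0 / r))


def ratio(t: float, phi: float, d: float, r: float) -> float:
    """Ratio of the integral to its claimed equivalent; should tend to 1."""
    return tail_integral(t, phi, d, r) / lemma6_asymptote(t, phi, d, r)
```

For this profile the ratio approaches 1 quickly as $t$ grows, because the part of the gamma integral lost by truncating at $u = 1$ is exponentially small in $t$.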

We have only to prove that $a_n$ given in Theorem 2 ensures the convergence of $n\bar G_F(x + a_n)$ to $e^{-\phi x}$. If

$$\lim_{n\to\infty} n\bar G_F(x + a_n) = \psi(x),$$

then

$$\log \bar G_F(x + a_n) + \log n = \log\psi(x) + o(1) \quad (n \to \infty).$$

Furthermore,

$$\log \bar G_F(x + a_n) + \log n = \log n - (x + a_n)\phi + C - r^{-1}\log(x + a_n) + o(1), \tag{1}$$

with

$$C = \log\left(\frac{\Gamma(1/r)}{r\,d^{1/r}}\right)$$

(from the preceding lemma).

Thus $a_n \sim \phi^{-1}\log n$ when $n \to \infty$. Replacing $a_n$ by $\phi^{-1}\log(n)(1 + a_n')$ in (1), we have

$$-x\phi - a_n'\log n + C - r^{-1}\log\log n + r^{-1}\log\phi + o(1) = \log\psi(x). \tag{2}$$

Thus $a_n' \sim -r^{-1}(\log n)^{-1}\log\log n$ when $n \to \infty$. Taking

$$a_n' = -r^{-1}(\log n)^{-1}\log\log(n)(1 + a_n'')$$

in (2) we have

$$-x\phi + r^{-1}a_n''\log\log n + C + r^{-1}\log\phi + o(1) = \log\psi(x).$$

Taking

$$a_n'' = -(\log\log n)^{-1}(C + r^{-1}\log\phi)\,r$$

gives the result. □
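The successive approximations above can be sanity-checked numerically. The sketch below is an illustration under an explicit assumption: $\bar G_F$ is replaced by its Lemma 6 equivalent, so only the algebra of the centering is tested, not the lemma itself. It works with $\log n$ as the input so that very large $n$ can be used without overflow; the function name is hypothetical.

```python
import math


def log_n_times_tail(log_n: float, x: float, phi: float, r: float, d: float) -> float:
    """log( n * Gbar(x + a_n) ), with Gbar replaced by the Lemma 6 equivalent."""
    c = math.lgamma(1.0 / r) - math.log(r) - math.log(d) / r  # the constant C
    # a_n of Theorem 2, written with C + r^{-1} log(phi) as in the proof
    a_n = (log_n - math.log(log_n) / r + c + math.log(phi) / r) / phi
    t = x + a_n
    return log_n - t * phi + c - math.log(t) / r
```

The returned value equals $-\phi x + r^{-1}\log(\log n/(\phi(x + a_n)))$, so it tends to $-\phi x$, although only at a rate of order $\log\log n/\log n$.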

References

Barbe, Ph. (1990), On the distribution function for the spacings (submitted).

Deheuvels, P. (1982), Strong limiting bounds for maximal uniform spacing, Ann. Probab. 10, 1058-1065.

Deheuvels, P. (1983a), Upper bounds for k-th maximal spacings, Z. Wahrsch. Verw. Gebiete 62, 465-474.

Deheuvels, P. (1983b), Strong limit theorems for maximal spacing from a general univariate distribution, Ann. Probab. 12, 1181-1193.

Deheuvels, P. (1986), On the influence of the extremes of an i.i.d. sequence on the maximal spacings, Ann. Probab. 14, 194-208.

Devroye, L. (1981), Laws of the iterated logarithm for order statistics of uniform spacings, Ann. Probab. 9, 860-867.

Devroye, L. (1982), A log log law for maximal uniform spacings, Ann. Probab. 10, 863-868.

De Haan, L. (1970), On Regular Variation and its Application to the Weak Convergence of Sample Extremes, Mathematical Centre Tracts No. 32 (Mathematical Centre, Amsterdam).

Hall, P. (1984), Random nonuniform distribution of line segments on a circle, Stochastic Process. Appl. 18, 239-261.

Pyke, R. (1965), Spacings, J. Roy. Statist. Soc. Ser. B 27, 395-436.