Statistics & Probability Letters 15 (1992) 375-379
North-Holland
8 December 1992
Existence and uniqueness of the maximum likelihood estimator for the two-parameter negative binomial distribution
Jorge Aragón *, David Eberly ** and Shelly Eberly ***
Division of Mathematics, Computer Science, and Statistics, University of Texas, San Antonio, TX, USA
Received September 1991
Revised February 1992
Abstract: Given a sample with mean $\bar{x}$ and second moment $s^2$, Anscombe in 1950 conjectured that the maximum likelihood equations for the two-parameter negative binomial distribution have a unique solution if and only if $s^2 > \bar{x}$. We give a proof of his conjecture.
Keywords: Maximum likelihood estimator; negative binomial distribution; Newton’s method
1. Introduction
The negative binomial distribution has been applied widely in Biology, Psychology, Communications, Insurance, Economics, Medicine, Military, etc. The following parametrization is used here:
$$f(x) = \binom{x+k-1}{k-1} p^k (1-p)^x, \quad x = 0, 1, 2, \ldots$$
We treat $k$ as a continuous parameter with $k \in (0, \infty)$ and refer to the distribution as NB(k, p).
Correspondence to: David Eberly, Computer Science Department, CB 3175, Sitterson Hall, University of North Carolina, Chapel Hill, NC 27599-3175, USA. e-mail: [email protected].
* Research partially supported by NIH grant AI-07358. Current address: Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA.
** Research partially supported by NSF Grant DMS-9003037.
*** Research supported by a NASA/Texas Space Grant Consortium Fellowship. Current address: Statistics Unit, Cornell University, Ithaca, NY 14853, USA.
As a result of the frequent application of the negative binomial distribution, an increasing number of papers on estimation have appeared in the literature (Fisher, 1941; Haldane, 1941; Wise, 1946; Anscombe 1949 and 1950; Bliss and Fisher, 1953; Bliss and Owen, 1958; Shah, 1961; Katti and Gurland, 1962; O’Carroll, 1962; Shenton and Wallington, 1962; Shenton, 1963; Martin and Katti, 1965; Shenton and Myers, 1965; Johnson and Kotz, 1969; Pahl, 1969; Shenton and Bowman, 1967; Pieters, Gates, Matis and Sterling, 1977; Nedelman, 1983; Bowman, 1984; Willson, Folks and Young, 1986; Ross and Preece, 1985; Binet, 1986; Kemp and Kemp, 1987; Binns and Bostanian, 1988; Lam, Shenton and Bowman, 1988; and Piegorsch, 1990; among others). A survey of the articles on the topic can be found in Clark and Perry (1989).
0167-7152/92/$05.00 © 1992 - Elsevier Science Publishers B.V. All rights reserved

For a sample $x_1, \ldots, x_n$, Anscombe (1950) conjectured that the maximum likelihood estimator exists and is unique when the second sample moment $s^2 = \sum_{i=1}^{n} x_i^2/n - \bar{x}^2$ is greater than the sample mean $\bar{x} = \sum_{i=1}^{n} x_i/n$, and that no maximum likelihood estimator exists when $s^2 < \bar{x}$. Proofs of existence of the maximum likelihood estimator when $s^2 > \bar{x}$ were given by Johnson and Kotz (1969), although the book contains some misprints, and by Willson, Folks and Young (1986). Several issues pertaining to the negative binomial distribution have been addressed in the literature, such as what to do when $s^2 < \bar{x}$, how to treat estimates of $k$ less than one, and comparison of different estimation methods for small samples. However, the questions of uniqueness when $s^2 > \bar{x}$ and of existence when $s^2 < \bar{x}$ have gone unanswered. We answer these remaining questions by proving the following:
Theorem. Let $x_i$, $i = 1, \ldots, n$, be a random sample from NB(k, p). The maximum likelihood estimator of $(k, p)$ exists if and only if $s^2 > \bar{x}$. Moreover, if the maximum likelihood estimator exists, then it must be unique.
The nonexistence of the maximum likelihood estimator when $s^2 < \bar{x}$ fits in well with the nonexistence of the method-of-moments estimator and the fact that $\sigma^2 > \mu$ for the negative binomial distribution.
2. Formulation of the problem
Let
$$M = \max_i x_i,$$
so $0 \le x_i \le M$. If $f_j$ is the proportion of the sample values equal to $j$, then
$$\bar{x} = \sum_{j=1}^{M} j f_j$$
and
$$s^2 = \sum_{j=1}^{M} j^2 f_j - \Big( \sum_{j=1}^{M} j f_j \Big)^2.$$
We wish to compute the maximum likelihood estimators $(\hat{k}, \hat{p})$ for the sample. The maximum likelihood estimator for $p$ is given by $\hat{p} = \hat{k}/(\bar{x} + \hat{k})$, where $\hat{k}$ is a solution to
$$g(k) = \sum_{j=1}^{M} \frac{F_j}{k+j-1} - \log\Big(1 + \frac{\bar{x}}{k}\Big) = 0 \qquad (1)$$
and where $F_j = \sum_{i=j}^{M} f_i$ is the proportion of the sample values greater than or equal to $j$.
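As a concrete illustration of these definitions, the following Python sketch computes $f_j$, $F_j$, $\bar{x}$ and $s^2$ from a sample and evaluates $g(k)$, taking $g(k) = G(1/k)$ with $G$ as in (2). The function names and the sample data are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math
from collections import Counter

def tail_proportions(sample):
    """Return (f, F, xbar, s2) for a sample of nonnegative integers.

    f[j] is the proportion of values equal to j and F[j] the proportion
    of values >= j, for j = 0..M (index 0 is kept only for convenience).
    """
    n = len(sample)
    M = max(sample)
    counts = Counter(sample)
    f = [counts.get(j, 0) / n for j in range(M + 1)]
    F = [sum(f[j:]) for j in range(M + 1)]
    xbar = sum(j * f[j] for j in range(1, M + 1))
    s2 = sum(j * j * f[j] for j in range(1, M + 1)) - xbar ** 2
    return f, F, xbar, s2

def g(k, F, xbar):
    """Left-hand side of (1): sum_j F_j / (k + j - 1) - log(1 + xbar / k)."""
    return sum(F[j] / (k + j - 1) for j in range(1, len(F))) - math.log(1 + xbar / k)

sample = [0, 0, 1, 1, 2, 3, 5, 8]          # made-up data, not from the paper
f, F, xbar, s2 = tail_proportions(sample)
print(xbar, s2, sum(F[1:]))                # the tail proportions F_1..F_M sum to xbar
print(g(1.0, F, xbar), g(2.0, F, xbar))    # a sign change brackets a solution of (1)
```

For this made-up sample $s^2 > \bar{x}$, and $g$ changes sign between $k = 1$ and $k = 2$, so a solution of (1) lies in that interval.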
The approach we took to finding the maximum likelihood estimator was motivated by Eberly (1991). In this thesis, data sets were generated from a length-biased truncated negative binomial distribution $1 + \mathrm{NB}(k+1, p)$ where $k > 0$. To construct a maximum likelihood estimator, we used equation (1) with $k$ replaced by $k + 1$. In applying Newton’s method to (1), we had problems with the zero at infinity for $g(k)$. With an inappropriate initial guess, the iterates tended towards infinity. To avoid this problem, we defined $z = 1/k$ and $G(z) = g(k)$, so
$$G(z) = \sum_{j=1}^{M} \frac{z F_j}{(j-1)z + 1} - \log(1 + \bar{x} z), \quad z \in (0, \infty). \qquad (2)$$
This reparametrization has been used by Lam, Shenton and Bowman (1988) and Clark and Perry (1989). For $k \ge 1$ we need only consider (2) for $z \in (0, 1]$. One has much more control on the behavior of Newton’s method on this finite interval. We observed that the graphs of $G$ were of two types, see Figure 1. The function $G$ has the properties $G(0) = G'(0) = 0$, $G(z)/z \to F_1 > 0$ as $z \to \infty$ (so $G(z)$ must be positive for $z$ large), and the convexity/concavity at $z = 0$ determines the shape of the graph. Computing $G''(0)$ and using the definition for $F_j$, we have
$$G''(0) = \bar{x}^2 - 2 \sum_{j=1}^{M} (j-1) F_j = \bar{x} - s^2.$$
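The stated properties of $G$ can be checked numerically. The sketch below is only an illustration on made-up data (the sample and variable names are assumptions, not part of the paper): it compares a central finite-difference estimate of $G''(0)$ with the closed form and with $\bar{x} - s^2$, and evaluates $G(z)/z$ at a large $z$.

```python
import math

sample = [0, 0, 1, 1, 2, 3, 5, 8]          # made-up data, not from the paper
n, M = len(sample), max(sample)
F = [sum(x >= j for x in sample) / n for j in range(M + 1)]   # F[j] = proportion >= j
xbar = sum(sample) / n
s2 = sum(x * x for x in sample) / n - xbar ** 2

def G(z):
    """G(z) from equation (2)."""
    return sum(z * F[j] / ((j - 1) * z + 1) for j in range(1, M + 1)) - math.log1p(xbar * z)

h = 1e-4
second_derivative = (G(h) - 2 * G(0.0) + G(-h)) / h ** 2   # central difference at z = 0
closed_form = xbar ** 2 - 2 * sum((j - 1) * F[j] for j in range(1, M + 1))
slope_at_infinity = G(1e8) / 1e8                           # should be close to F[1]

print(second_derivative, closed_form, xbar - s2)
print(slope_at_infinity, F[1])
```

The formula for $G$ is also well defined for small negative $z$, which is what allows the central difference at $z = 0$.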
If $G''(0) < 0$, then eventually the graph of $G$ must intersect the $z$-axis, thereby providing a solution $\hat{z}$ to $G(z) = 0$ and a maximum likelihood estimator $\hat{k} = 1/\hat{z}$. This reproduces the results in Johnson and Kotz (1969) and Willson, Folks and Young (1986), which show the existence of at least one solution when $s^2 > \bar{x}$. We now give a more detailed analysis to show uniqueness of the zero when $G''(0) < 0$ and the nonexistence of zeros when $G''(0) \ge 0$.
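The existence argument suggests a simple numerical procedure: when $s^2 > \bar{x}$, $G$ is negative just to the right of $z = 0$ and positive for large $z$, so a zero can be bracketed and refined. The sketch below uses plain bisection rather than the Newton iteration discussed in the text, purely to keep the example short and robust. The function name `nb_mle` and its return convention are hypothetical.

```python
import math

def nb_mle(sample):
    """Return (k_hat, p_hat), or None when s^2 <= xbar (no MLE by the Theorem)."""
    n, M = len(sample), max(sample)
    F = [sum(x >= j for x in sample) / n for j in range(M + 1)]
    xbar = sum(sample) / n
    s2 = sum(x * x for x in sample) / n - xbar ** 2
    if s2 <= xbar:
        return None

    def G(z):
        # Equation (2) with z = 1/k.
        return sum(z * F[j] / ((j - 1) * z + 1) for j in range(1, M + 1)) - math.log1p(xbar * z)

    # Bracket the zero: G < 0 just right of 0 (since G''(0) < 0), G > 0 for large z.
    lo = 1.0
    while G(lo) >= 0 and lo > 1e-12:
        lo /= 2
    if G(lo) >= 0:
        raise ValueError("could not bracket the zero numerically")
    hi = lo
    while G(hi) <= 0:
        hi *= 2
    for _ in range(200):                    # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if G(mid) < 0:
            lo = mid
        else:
            hi = mid
    k_hat = 1.0 / (0.5 * (lo + hi))
    return k_hat, k_hat / (xbar + k_hat)

print(nb_mle([0, 0, 1, 1, 2, 3, 5, 8]))     # overdispersed made-up sample
print(nb_mle([1, 1, 2, 1, 2]))              # s^2 < xbar, so None
```

For the first made-up sample the zero of $G$ lies between $z = 0.5$ and $z = 1$, so the returned $\hat{k}$ is between 1 and 2.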
Fig. 1. Graphs for G(z). Panel labels: no roots; one root.
3. The main results
We analyze the problem in three cases.
Case 1. Let $M = 1$; then
$$G(z) = F_1 z - \log(1 + F_1 z),$$
$$G'(z) = F_1 - F_1/(1 + F_1 z)$$
and
$$G''(z) = [F_1/(1 + F_1 z)]^2.$$
Since $G''(z) > 0$ and $G(0) = G'(0) = 0$, the graph of $G$ never intersects the positive $z$-axis. Thus, there is no root $z > 0$. Note that $x_i \in \{0, 1\}$ implies $s^2 = \bar{x} - \bar{x}^2 < \bar{x}$.
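A quick numerical illustration of Case 1, under the assumption of a hypothetical sample of zeros and ones with $F_1 = 0.3$: $G$ stays strictly positive on a range of positive $z$ values, and $s^2 = \bar{x} - \bar{x}^2$ falls below $\bar{x}$.

```python
import math

def G_case1(z, F1):
    """G(z) for M = 1: F1 * z - log(1 + F1 * z)."""
    return F1 * z - math.log1p(F1 * z)

F1 = 0.3                                    # hypothetical proportion of ones
values = [G_case1(z, F1) for z in (1e-3, 0.1, 1.0, 10.0, 1e3)]
xbar = F1                                   # for 0/1 data the mean equals F1
s2 = xbar - xbar ** 2

print(values)                               # all entries are positive
print(s2 < xbar)
```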
Case 2. Let $M = 2$. The essential ideas are illustrated by this case. Our approach requires the change of variables $u = F_1$ and $\bar{x} = F_1 + F_2$. Note that $1 \ge F_1 \ge F_2 > 0$ implies $(u, \bar{x}) \in D$, the set of pairs satisfying $1 \ge u \ge \bar{x} - u > 0$.

We will show that $G(z; u, \bar{x}) = 0$ implicitly defines a unique function $z = \zeta(u, \bar{x})$ on some maximal set of values $E = \mathrm{Domain}(\zeta)$. By doing so we will have shown that for each $(u, \bar{x}) \in E$ there is a unique $z = \zeta(u, \bar{x})$ such that $G(\zeta(u, \bar{x}); u, \bar{x}) = 0$. The MLE is then given by $\hat{k} = 1/\zeta(u, \bar{x})$. The natural approach in constructing $z = \zeta(u, \bar{x})$ is to find a set $E \subset D$ such that for each $(u, \bar{x}) \in E$ there is a unique $z$ corresponding to it. We use a slightly modified approach by selecting $z$ first, constructing all pairs $(u, \bar{x})$ corresponding to it, and showing that each such pair $(u, \bar{x})$ cannot be mapped to any other $z$.

The equation $G(z; u, \bar{x}) = 0$ can be solved to obtain
$$u = u(z, \bar{x}) = \frac{(z+1)\log(1+\bar{x}z) - \bar{x}z}{z^2}, \quad z > 0. \qquad (3)$$
Taking the limit as $z \to 0^+$ yields $u(0, \bar{x}) = \bar{x} - \tfrac{1}{2}\bar{x}^2$, which is equivalent to $G''(0; F_1, F_2) = 0$.
We show that the subregion
$$E = D \cap \{(w, \bar{x}) : w < u(0, \bar{x})\}$$
is the domain for $\zeta$. Figure 2 illustrates the sets $D$ and $E$, and the graphs of $u(0, \bar{x})$ and $u(1, \bar{x})$ in $D$. For each value of $z > 0$ the graph of (3) in $D$ represents all pairs $(u, \bar{x})$ for which $G$ has a zero at $z$. We prove that $E$ is a disjoint union of these graphs. A consequence is that $G$ has a zero if and only if $F_1 < u(0, \bar{x})$, or equivalently, if and only if $s^2 > \bar{x}$. Moreover, since the graphs are disjoint in $D$, a given pair $(u, \bar{x})$ has exactly one graph containing it and the corresponding zero $z$ for $G$ is unique.
We now prove that $E$ is a disjoint union of graphs. Differentiate $u$ to obtain
$$u_{\bar{x} z} = -(1 - \bar{x})\bar{x}/(1 + \bar{x} z)^2 < 0$$
for $\bar{x} \in (0, 1)$. Integrate in $z$ from $z_1$ to $z_2$ (with $z_1 < z_2$) to obtain $u_{\bar{x}}(z_2, \bar{x}) - u_{\bar{x}}(z_1, \bar{x}) < 0$, where the strict inequality holds since the integral of a continuous negative function is negative. Integrate in $\bar{x}$ from 0 to $\bar{x}$ and use $u(z, 0) = 0$ to obtain $u(z_2, \bar{x}) < u(z_1, \bar{x})$. Therefore, the graphs are ordered as claimed. As $z \to \infty$, the graphs of $u(z, \bar{x})$ in $D$ approach the graph of $u(\infty, \bar{x}) = 0$ in $D$, which is the single point $(0, 0)$, and so $E$ is a disjoint union of the graphs.

Fig. 2. Region of existence.
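The Case 2 argument can be illustrated numerically. The sketch below assumes the forms $u(z, \bar{x}) = [(z+1)\log(1+\bar{x}z) - \bar{x}z]/z^2$ for (3) and $G(z; u, \bar{x}) = zu + z(\bar{x}-u)/(z+1) - \log(1+\bar{x}z)$ for $M = 2$. It checks that $u(z, \bar{x})$ makes $G$ vanish at $z$, that the curves are ordered (decreasing in $z$ for fixed $\bar{x}$), and that $u$ approaches $\bar{x} - \bar{x}^2/2$ as $z \to 0^+$. The function names and the value $\bar{x} = 0.6$ are arbitrary choices for illustration.

```python
import math

def u_case2(z, xbar):
    """Right-hand side of (3): the value of u = F_1 for which G has a zero at z."""
    return ((z + 1) * math.log1p(xbar * z) - xbar * z) / z ** 2

def G_case2(z, u, xbar):
    """G(z; u, xbar) for M = 2, using F_1 = u and F_2 = xbar - u."""
    return z * u + z * (xbar - u) / (z + 1) - math.log1p(xbar * z)

xbar = 0.6                                   # any value in (0, 1); chosen arbitrarily
zs = [0.1, 0.5, 1.0, 5.0, 50.0]
us = [u_case2(z, xbar) for z in zs]
residuals = [G_case2(z, u, xbar) for z, u in zip(zs, us)]
u_near_zero = u_case2(1e-4, xbar)            # compare with xbar - xbar**2 / 2

print(us)                                    # strictly decreasing in z
print(residuals)                             # zero up to rounding error
print(u_near_zero, xbar - xbar ** 2 / 2)
```

Each printed $u$ value lies below $\bar{x} - \bar{x}^2/2$, consistent with the curves for $z > 0$ filling out the region $E$ beneath the curve $u(0, \bar{x})$.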
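Case 3. Let $M \ge 3$ and make the change of variables $u = F_1$, $\bar{x} = \sum_{j=1}^{M} F_j$. Denote
$$F' = (F_2, \ldots, F_{M-1}).$$
The conditions $1 \ge F_1 \ge \cdots \ge F_M > 0$ imply that
$$(u, \bar{x}, F') \in D = \{(u, \bar{x}, F') : 1 \ge u \ge F_2 \ge \cdots \ge F_{M-1} \ge \bar{x} - (u + F_2 + \cdots + F_{M-1})\}.$$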
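The equation $G(z; u, \bar{x}, F') = 0$ can be solved to obtain
$$u = u(z, \bar{x}, F') = \frac{[(M-1)z+1]\log(1+\bar{x}z) - \bar{x}z}{(M-1)z^2} - \frac{1}{M-1}\sum_{j=2}^{M-1}\frac{(M-j)F_j}{(j-1)z+1} \qquad (4)$$
for $z > 0$. Taking the limit as $z \to 0^+$ yields
$$u(0, \bar{x}, F') = \bar{x} - \frac{\bar{x}^2}{2(M-1)} - \frac{1}{M-1}\sum_{j=2}^{M-1}(M-j)F_j,$$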
which is equivalent to $G''(0; u, \bar{x}, F') = 0$. As in Case 2, for each $z > 0$ the graph of (4) in the $(u, \bar{x}, F')$ domain represents all triples $(u, \bar{x}, F')$ for which $G$ has a zero at $z$. We show that
$$E = D \cap \{(w, \bar{x}, F') : w < u(0, \bar{x}, F')\}$$
is a disjoint union of these graphs. Existence of a zero for $G$ is guaranteed if and only if $F_1 < u(0, \bar{x}, F')$, or equivalently, if and only if $s^2 > \bar{x}$. The disjointness of the union implies that for a given triple $(u, \bar{x}, F') \in E$ there is exactly one graph which passes through it, and so $G$ has exactly one zero $z$.
Differentiate $u$ to obtain
$$u_{\bar{x} z} = -(M-1-\bar{x})\bar{x}/[(M-1)(1+\bar{x}z)^2] < 0$$
for $\bar{x} \in (0, M-1)$. Note that the graph of $u(0, \bar{x}, F')$ exits $D$ when $\bar{x} = M - 1$. Integrate in $z$ from $z_1$ to $z_2$ (with $z_1 < z_2$) to obtain
$$u_{\bar{x}}(z_2, \bar{x}, F') - u_{\bar{x}}(z_1, \bar{x}, F') < 0.$$
Finally, integrate in $\bar{x}$ from 0 to $\bar{x}$ to obtain
$$u(z_2, \bar{x}, F') - u(z_2, 0, F') < u(z_1, \bar{x}, F') - u(z_1, 0, F').$$
Unlike Case 2, we have two extra terms (where $\bar{x} = 0$) which may affect the ordering of the graphs. However, restricting our attention to the domain $D$, when $\bar{x} = 0$ all the parameters must be zero, $u = 0$ and $F' = 0$, so on $D$ we have $u(z_2, \bar{x}, F') < u(z_1, \bar{x}, F')$ and the graphs are ordered as claimed. As $z \to \infty$, the graphs of $u(z, \bar{x}, F')$ in $D$ approach the graph of $u(\infty, \bar{x}, F') = 0$ in $D$, which is the single point 0, so $E$ is the disjoint union of the graphs.
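The general formula can be sanity-checked in the same way as Case 2. The sketch below assumes the form $u(z, \bar{x}, F') = \{[(M-1)z+1]\log(1+\bar{x}z) - \bar{x}z\}/[(M-1)z^2] - \frac{1}{M-1}\sum_{j=2}^{M-1}(M-j)F_j/[(j-1)z+1]$ for (4). It verifies that setting $F_1 = u(z, \bar{x}, F')$ and $F_M = \bar{x} - (F_1 + F_2 + \cdots + F_{M-1})$ makes $G$ vanish at $z$, that the $z \to 0^+$ limit matches the closed form, and that the formula reduces to (3) when $M = 2$. The function names and numerical inputs are hypothetical; the inputs are picked only to exercise the algebra and are not meant to represent a point of $D$.

```python
import math

def u_general(z, xbar, F_mid):
    """Equation (4). F_mid = (F_2, ..., F_{M-1}), so M = len(F_mid) + 2."""
    M = len(F_mid) + 2
    a = (M - 1) * z + 1
    lead = (a * math.log1p(xbar * z) - xbar * z) / ((M - 1) * z ** 2)
    corr = sum((M - j) * Fj / ((j - 1) * z + 1)
               for j, Fj in enumerate(F_mid, start=2)) / (M - 1)
    return lead - corr

def G_general(z, F, xbar):
    """G(z) from (2) with F = [F_1, ..., F_M]."""
    return sum(z * Fj / ((j - 1) * z + 1)
               for j, Fj in enumerate(F, start=1)) - math.log1p(xbar * z)

z, xbar, F_mid = 0.7, 1.2, (0.4, 0.2)        # arbitrary inputs with M = 4
F1 = u_general(z, xbar, F_mid)
FM = xbar - F1 - sum(F_mid)
residual = G_general(z, [F1, *F_mid, FM], xbar)

M = len(F_mid) + 2
limit_closed_form = (xbar - xbar ** 2 / (2 * (M - 1))
                     - sum((M - j) * Fj for j, Fj in enumerate(F_mid, start=2)) / (M - 1))
limit_numeric = u_general(1e-4, xbar, F_mid)

# With no middle terms (M = 2) the formula should coincide with (3).
case2_value = ((z + 1) * math.log1p(0.6 * z) - 0.6 * z) / z ** 2
print(residual, limit_numeric, limit_closed_form, u_general(z, 0.6, ()), case2_value)
```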
References
Anscombe, F.J. (1949), The statistical analysis of insect counts based on the negative binomial distribution, Biometrics 5, 165-173.
Anscombe, F.J. (1950), Sampling theory of the negative binomial and logarithmic series distributions, Biometrika 37, 358-382.
Binet, F.E. (1986), Fitting the negative binomial distribution, Biometrics 42, 989-992.
Binns, M.R. and N.J. Bostanian (1988), Binomial and censored sampling in estimation and decision making for the negative binomial distribution, Biometrics 44, 473-483.
Bliss, C.I. and R.A. Fisher (1953), Fitting the negative binomial distribution to biological data, Biometrics 9, 176-200.
Bliss, C.I. and A.R.C. Owen (1958), Negative binomial distributions with a common k, Biometrika 45, 36-58.
Bowman, K.O. (1984), Extended moment series and the parameters of the negative binomial distribution, Biometrics 40, 249-252.
Clark, S.J. and J.N. Perry (1989), Estimation of the negative binomial parameter k by maximum quasi-likelihood, Biometrics 45, 309-316.
Eberly, S. (1991), Inferences from length-biased distributions, Thesis, Univ. of Texas (San Antonio, TX).
Fisher, R.A. (1941), The negative binomial distribution, Ann. Eugenics London 11, 182-187.
Haldane, J.B.S. (1941), The fitting of the binomial distributions, Ann. Eugenics London 11, 179-181.
Johnson, N.L. and S. Kotz (1969), Distributions in Statistics, Vol. I: Discrete Distributions (Wiley, New York).
Katti, S.K. and J. Gurland (1962), Efficiency of certain methods of estimation for the negative binomial and the Neyman type A distribution, Biometrika 49, 215-226.
Kemp, A.W. and C.D. Kemp (1987), A rapid and efficient estimation for the negative binomial distribution, Biometric J. 29, 856-863.
Lam, H.K., L.R. Shenton and K.O. Bowman (1988), Some properties of a moment estimator for the index parameter of the negative binomial distribution, ASA Proc. of Statist. Comput., pp. 365-367.
Martin, D.C. and S.K. Katti (1965), Fitting of certain contagious distributions to some available data by the maximum likelihood method, Biometrics 21, 34-48.
Nedelman, J. (1983), A negative binomial model for sampling mosquitoes in a malaria survey, Biometrics 39, 1009-1020.
O’Carroll, F.M. (1962), Fitting a negative binomial distribution to coarsely grouped data by maximum likelihood, Appl. Statist. 11, 196-201.
Pahl (1969), On testing for goodness-of-fit of the negative binomial distribution when expectations are small, Biometrics 25, 143-151.
Piegorsch, W.W. (1990), Maximum likelihood estimation for the negative binomial dispersion parameter, Biometrics 46, 863-867.
Pieters, E.P., C.E. Gates, J.H. Matis and W.L. Sterling (1977), Small-sample comparison of different estimators of negative binomial parameters, Biometrics 33, 718-723.
Ross, G.J.S. and D.A. Preece (1985), The negative binomial distribution, The Statistician 34, 323-336.
Shah, S.M. (1961), The asymptotic variances of method of moments estimates of the parameters of the truncated binomial and negative binomial distributions, J. Amer. Statist. Assoc. 56, 880-994.
Shenton, L.R. (1963), A note on bounds for the asymptotic sampling variance of the maximum likelihood estimator of a parameter in the negative binomial distribution, Ann. Inst. Math. Statis. Tokyo 15, 145-151.
Shenton, L.R. and K.O. Bowman (1967), Remarks on large-sample estimators for some discrete distributions, Technometrics 9, 587-598.
Shenton, L.R. and R. Myers (1965), Comments on estimation for the negative binomial distribution, in: G.P. Patil, ed., Classical and Contagious Discrete Distributions (Statist. Pub. Society, Calcutta) pp. 241-262.
Shenton, L.R. and P.A. Wallington (1962), The bias of moment estimators with an application to the negative binomial distribution, Biometrika 49, 193-204.
Willson, L.J., J.L. Folks and J.H. Young (1986), Complete sufficiency and maximum likelihood estimation for the two-parameter negative binomial distribution, Metrika 33, 349-362.
Wise, M.E. (1946), The use of the negative binomial distribution in an industrial sampling problem, J. Roy. Statist. Soc. Ser. B 8, 202-211.