Discrete Poisson kernel density estimation – with an application to wildcat coal strikes

I. INTRODUCTION

Bank failures, coal mine fatalities, and patents issued provide examples of count data situations often confronted by economists. Past researchers have frequently used specific distributions such as the Poisson distribution to model count data. Kernel density estimators are a more flexible method for estimating the density of such count data. Our purpose is to present results on a new discrete Poisson kernel density estimator. We propose a Poisson kernel method as a simple approach to providing a discrete nonparametric density estimator for count data. Such an estimator has countably infinite support in theory (population) but countably finite support in practice (sample).

Although there has not been extensive work in econometrics on discrete kernel density estimators, a few authors have provided some interesting results that serve as the background for this study. These include Bierens (1983) and Delgado and Mora (1995). For a general discussion of nonparametric estimation see Ullah and Vinod (1993). These works generally focus on discrete nonstochastic variables rather than on discrete random variables. However, much work has been done for binary random variables and multiple-response variables, as opposed to the count data context that we study here. For example, see Horowitz (1993).

In Section II we will introduce our discrete Poisson kernel density estimator and note its similarity to previous discrete kernel approaches. Section III will compare the fit of our Poisson kernel density estimator with exact distribution estimators (e.g. Poisson) serving as the least flexible and the unrestricted relative frequency estimator as the most flexible. Finally, in Section IV we summarize our results and discuss the implications of our work.

II. DISCRETE KERNEL DENSITY ESTIMATORS

The traditional approach to dealing with count data is to start with a specific distributional structure such as the Poisson. For example, see Cameron and Trivedi (1986). The Poisson probability distribution function (pdf) is given as:

$$P(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!} \quad \text{for } y = 0, 1, 2, \ldots, +\infty \qquad (1)$$

Suppose we have a sample of observations $y_1, \ldots, y_n$ on a discrete random variable, $Y$. The distribution of $Y$ is defined as the probability:

$$P(Y = y) = p(y) \quad \text{for } y = 0, 1, 2, \ldots, +\infty \qquad (2)$$

In this paper we propose a nonparametric mass function estimator of $p(y)$ denoted as $p_n(y)$, where $p_n(y)$ is estimated using the sample $y_1, \ldots, y_n$:

$$p_n(y) = P_n(Y = y) = \frac{1}{n} \sum_{i=1}^{n} P(Y = y \mid y_i) \quad \text{for } y = 0, 1, 2, \ldots, +\infty \qquad (3)$$

where $P(Y = y \mid y_i)$ is a kernel probability mass function given the observation $y_i$. We propose a Poisson kernel for $P(Y = y \mid y_i)$.

Applied Economics Letters, 1999, 6, 393–396

LAWRENCE C. MARSH and KAJAL MUKHOPADHYAY

Department of Economics, University of Notre Dame, Notre Dame, IN 46556

Received 30 January 1998

This paper proposes a nonparametric Poisson kernel density estimation technique for discrete distributions. Economists have been using continuous kernels to approximate discrete distributions. This work introduces a discrete kernel as more appropriate for approximating discrete distributions. Simulation results are presented to compare with standard parametric approaches. We apply our discrete Poisson kernel estimator to approximate the distribution of coal mine wildcat strikes in the United States.

1350–5851 © 1999 Routledge

Poisson kernel

We define $P(Y = y \mid y_i)$ as a Poisson distribution with mean $E(Y \mid y_i) = y_i + r$, where $r$ is a fraction between zero and one. The reason for this is that the mode of this distribution occurs at $y_i$. Usually in kernel distributions we put less weight on observations that are further away from $y_i$. This means $P(Y = y \mid y_i)$ will be maximized when $y = y_i$. At values of $y$ other than $y_i$ the probabilities will be smaller than the probability at $y = y_i$, and they will approach zero as $y$ moves further away from the mode $y_i$.

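We can now define the Poisson kernel estimator for Equation 3 as

$$p_n(y) = P_n(Y = y) = \frac{1}{n} \sum_{i=1}^{n} P(Y = y \mid y_i) = \frac{1}{n} \sum_{i=1}^{n} \frac{e^{-(y_i + r)} (y_i + r)^{y}}{y!} \quad \text{for } y = 0, 1, 2, \ldots, +\infty \qquad (4)$$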

The mode of the Poisson distribution with mean $\lambda$ (when $\lambda$ is not an integer) is at $[\lambda]$, where $[\lambda]$ is the greatest integer not greater than $\lambda$. If $\lambda$ is already an integer, then the mode of the distribution occurs at both $\lambda$ and $\lambda - 1$. Since the observations $y_1, \ldots, y_n$ are integer valued, we define the mean to be $y_i + r$ so that the mode is unique at $y_i$ whenever $r < 1$ (at $r = 1$ the mean $y_i + 1$ is an integer, so the probabilities at $y_i$ and $y_i + 1$ are equal). Thus, $P(Y = y \mid y_i)$ is a maximum at $y = y_i$. The choice of $r$ depends on how much relative weight one wants to put on the probabilities when $y = y_i - 1$ and $y = y_i + 1$. When $r$ is close to zero, relatively more weight is put on values of $y$ less than or equal to $y_i$. Conversely, when $r$ is close to one, relatively more weight is put to the right of $y_i$.
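The estimator in Equation 4 and the mode property just described can be illustrated with a minimal Python sketch. The function names, the toy sample, and the truncation point `y_max` are assumptions made for illustration only; they do not come from the paper.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability P(Y = y) for a given mean > 0."""
    # Work in logs to avoid overflow in mean**y and y! for large counts.
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def poisson_kernel_estimate(sample, r, y_max):
    """Equation 4: average of Poisson(y_i + r) pmfs, evaluated at y = 0..y_max."""
    if not 0 < r <= 1:
        raise ValueError("r must lie in (0, 1]")
    n = len(sample)
    return [sum(poisson_pmf(y, yi + r) for yi in sample) / n
            for y in range(y_max + 1)]

# Hypothetical toy sample, not data from the paper.
sample = [0, 0, 1, 3, 3, 4]
p_hat = poisson_kernel_estimate(sample, r=0.3, y_max=40)

# A single kernel centred on y_i = 3 with r = 0.3 should peak at y = 3.
kernel_at_3 = [poisson_pmf(y, 3 + 0.3) for y in range(10)]
mode = max(range(10), key=lambda y: kernel_at_3[y])
```

Truncating the support at `y_max` is a practical choice: the estimator has infinite support in theory, but the probability mass far above the largest observation is negligible.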

However, note that $r \in (0, 1]$. Zero is excluded because if $y_i = 0$ and $r = 0$, the kernel mean is zero and the Poisson distribution is not defined. An applied researcher can choose $r$ based on the data. If the researcher believes that the relationships in the sample data truly represent the corresponding population relationships, then the researcher will choose a value of $r$ for which the kernel distribution is close to the histogram in some sense.

One choice of $r$ is to closely fit the kernel to the histogram by trying various values of $r$ graphically. A second choice of $r$ is to fit the zeros whenever a large proportion of the observations are zeros. This can be done by finding the value of $r$ that equates the proportion of zeros generated by the kernel estimator with the observed relative frequency of zeros (the zeros on the histogram). A third choice of $r$ is to make the kernel estimator fit closely to the full range of observed relative frequencies (the full histogram).
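The third choice, fitting the full histogram, could be automated roughly as follows. The text does not say how closeness to the histogram is measured, so the sum of squared differences and the grid of candidate values used here are assumptions, as are all function names and the toy sample.

```python
import math
from collections import Counter

def poisson_pmf(y, mean):
    """Poisson probability P(Y = y) for a given mean > 0."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def kernel_estimate(sample, r, y_max):
    """Poisson kernel estimate at y = 0..y_max (average of Poisson(y_i + r) pmfs)."""
    n = len(sample)
    return [sum(poisson_pmf(y, yi + r) for yi in sample) / n
            for y in range(y_max + 1)]

def relative_frequencies(sample, y_max):
    """Observed relative frequency of each count 0..y_max (the histogram)."""
    counts = Counter(sample)
    n = len(sample)
    return [counts.get(y, 0) / n for y in range(y_max + 1)]

def squared_distance(sample, r, y_max):
    """Assumed fit criterion: sum of squared gaps between kernel and histogram."""
    est = kernel_estimate(sample, r, y_max)
    freq = relative_frequencies(sample, y_max)
    return sum((e - f) ** 2 for e, f in zip(est, freq))

def choose_r_by_histogram_fit(sample, grid, y_max):
    """Return the candidate r in `grid` with the smallest squared distance."""
    return min(grid, key=lambda r: squared_distance(sample, r, y_max))

# Hypothetical toy sample and candidate grid 0.01, 0.02, ..., 1.00.
sample = [0, 0, 0, 1, 1, 2, 5, 7]
grid = [k / 100 for k in range(1, 101)]
y_max = max(sample) + 20
best_r = choose_r_by_histogram_fit(sample, grid, y_max)
```

Another distance, such as the absolute difference or a likelihood-based criterion, could be swapped in for `squared_distance` without changing the rest of the sketch.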

Implications

Since the mean and variance of the Poisson are equal, we cannot control for the variance parameter separately. For this distribution there is no value of $r$ that will exactly fit the empirical distribution constructed from the histogram. This means that there will always be some degree of smoothing when using the Poisson kernel. Consequently, we need kernels for which the variance parameter can be controlled. However, within the standard families of distributions, there is no specific discrete distribution with infinite support that can control the variance independently of the mean. The negative binomial distribution can only be used if the variance parameter exceeds the mean. By contrast, finite support distributions like the binomial distribution demonstrate underdispersion. This means that the variance parameter, which is related to the bandwidth parameter in standard kernel density estimation, can be systematically reduced to zero as sample size increases towards infinity (see footnote 1).
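The unavoidable smoothing can be made concrete with a short derivation that is not in the paper. It treats the Poisson kernel estimate as an equally weighted mixture of Poisson distributions with means $y_i + r$ and applies the law of total variance. The symbols $\bar{y}$ (sample mean) and $s^2$ (sample variance with divisor $n$) are introduced here for illustration.

```latex
\begin{aligned}
E_{p_n}(Y) &= \frac{1}{n}\sum_{i=1}^{n} (y_i + r) = \bar{y} + r, \\
\operatorname{Var}_{p_n}(Y)
  &= \underbrace{\frac{1}{n}\sum_{i=1}^{n} (y_i + r)}_{\text{average kernel variance}}
   + \underbrace{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^{2}}_{\text{variance of kernel means}}
   = (\bar{y} + r) + s^{2}.
\end{aligned}
```

Under this sketch the fitted distribution is more dispersed than the sample by $\bar{y} + r$ for every admissible $r$, so no choice of $r$ can shrink the added variance to zero.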

III. COMPARATIVE SIMULATION OF DENSITY ESTIMATORS

In order to compare the estimators under both standard and nonstandard discrete distributions, we have performed simulations with both artificial and real life data.

First we generated data according to (a) standard distributions and (b) bimodal distributions. We then fit nonparametric distributions to the histograms generated by these artificial data. We also used wildcat coal mine strike data with (c) skewed and excess zero distributions.

We first generated data with a standard Poisson distribution with $\lambda = 3$. For all of the cases we generated samples of 500 observations. For the Poisson kernel we chose the values of $r$ to be 0.1, 0.3, 1.0 and a value of $r$ that equates the proportion of zeros from the kernel estimator with the proportion of zeros in the sample. To calculate $r$ we summed up the probability of zero from the Poisson kernel over the sample, $\frac{1}{n} \sum_{i=1}^{n} e^{-(y_i + r)}$, and set it equal to the observed proportion of zeros. The formula for computing $r$ is given as

$$r = -\log \left[ \frac{\#(y_i = 0)}{\sum_{i=1}^{n} e^{-y_i}} \right] \qquad (5)$$

where $\#(y_i = 0)$ represents the number of observations equal to zero. In the case where the true distribution is Poisson, we found that the Poisson kernel spreads out the probability by putting more probability in the right tail as shown in Fig. 1. This means that the probabilities associated with the lower counts are underestimated by the Poisson kernel. The data-determined $r$ value in this case was equal to the upper limit ($r = 1.0$) (see Fig. 1 panel (c)). When the true distribution is known to be Poisson, we cannot improve the fit by use of nonparametrics.

1 In other research we define $P(Y = y \mid y_i)$ as a binomial distribution with mean $E(Y \mid y_i) = kp + y_i$, where $k$ is a nonnegative integer which serves as the range of support for the binomial distribution with $y = y_i, \ldots, y_i + k$:

$$P(Y = y \mid y_i) = \binom{k}{y - y_i} p^{(y - y_i)} (1 - p)^{k - (y - y_i)}$$

We follow the same logic as in the Poisson kernel case as explained above. That is, if we want the mode of the distribution to be at $y_i$, then $P(Y = y \mid y_i)$ must be maximized when $y = y_i$. For this to be true we need to choose $p$ such that $(1 - p)^{k - (y - y_i)}$ is maximized when $y = y_i$. We can use the above kernel to obtain

$$p_n(y) = P_n(Y = y) = \frac{1}{n} \sum_{i=1}^{n} \binom{k}{y - y_i} p^{(y - y_i)} (1 - p)^{k - (y - y_i)} \, I(y_i \le y \le y_i + k)$$

where the indicator function, $I(\cdot)$, equals one if the event inside the parentheses is true, and zero otherwise.
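The zero-matching rule in Equation 5 can be sketched in a few lines of Python. The function names and the toy sample are hypothetical. Capping the result at the upper limit of 1 is an assumption, suggested by the report that the Poisson($\lambda = 3$) sample produced a data-determined value equal to the upper limit.

```python
import math

def data_determined_r(sample, upper=1.0):
    """Equation 5: r = -log( #(y_i = 0) / sum_i exp(-y_i) ), capped at `upper`."""
    zeros = sum(1 for yi in sample if yi == 0)
    if zeros == 0:
        raise ValueError("Equation 5 needs at least one zero observation")
    r = -math.log(zeros / sum(math.exp(-yi) for yi in sample))
    # Assumption: a value above the upper limit is truncated to that limit.
    return min(r, upper)

def kernel_zero_probability(sample, r):
    """Kernel estimate of P(Y = 0): the average of exp(-(y_i + r))."""
    return sum(math.exp(-(yi + r)) for yi in sample) / len(sample)

# Hypothetical zero-heavy sample, not data from the paper.
sample = [0, 0, 0, 0, 1, 2, 9, 11]
r = data_determined_r(sample)
observed = sample.count(0) / len(sample)
fitted = kernel_zero_probability(sample, r)
```

When the cap does not bind, `fitted` reproduces `observed` up to rounding error. The uncapped value is strictly positive whenever at least one observation is nonzero, because each zero observation contributes exactly one to the sum in the denominator.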

Fig. 1. Poisson kernel estimates compared to the histogram when the true data generating process is Poisson with lambda = 3

Fig. 2. Poisson kernel estimates compared to the histogram when the true data generating process is bimodal: 0.6 Poisson(0.5) + 0.4 Poisson(10.0)

Next we considered a bimodal situation as shown in Fig. 2. The distribution is generated as a mixture of two Poisson distributions with means 0.5 and 10.0. The observations are randomly selected from two independent Poisson populations such that 60% of the time an observation comes from the Poisson with mean 0.5 and 40% of the time from the Poisson with mean 10.0. These data have a very high count of zeros and exhibit bimodality. The fits are better for smaller values of $r$ such as 0.1 and 0.3 (see Fig. 2 panels (a) and (b)), while high values of $r$ severely underestimate the zeros. The data-determined $r$ value ($r = 0.194$) equates the zero cell probability to the histogram and provides a moderately good fit for the other cell counts (see Fig. 2 panel (c)).
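The bimodal design described above can be imitated with the following sketch, which uses only the Python standard library. The sampler, the seed, and the function names are assumptions made for illustration; the paper does not describe how its random draws were produced, so this will not reproduce the published sample or its value of $r$.

```python
import math
import random

def sample_poisson(rng, mean):
    """Draw one Poisson variate by inverting the cumulative distribution."""
    u = rng.random()
    y = 0
    p = math.exp(-mean)          # P(Y = 0)
    cdf = p
    # Stop once the cdf passes u; the p > 0 guard prevents an endless loop
    # if rounding keeps the accumulated cdf just below u.
    while u > cdf and p > 0.0:
        y += 1
        p *= mean / y            # P(Y = y) obtained from P(Y = y - 1)
        cdf += p
    return y

def sample_mixture(rng, n, weight=0.6, mean_low=0.5, mean_high=10.0):
    """Draw n counts: Poisson(mean_low) with probability `weight`, else Poisson(mean_high)."""
    return [sample_poisson(rng, mean_low if rng.random() < weight else mean_high)
            for _ in range(n)]

rng = random.Random(12345)       # arbitrary seed, chosen only for repeatability
data = sample_mixture(rng, 500)
share_zero = data.count(0) / len(data)

# Population probability of a zero under the mixture, for comparison.
population_zero = 0.6 * math.exp(-0.5) + 0.4 * math.exp(-10.0)
```

Comparing `share_zero` with `population_zero` gives a quick check that a simulated sample has the heavy concentration of zeros the text describes.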

Next we looked at coal mine wildcat strike data, which provided a typical example of a skewed distribution. The coal mine data are proprietary data obtained for a more extensive study with the understanding that their source and time frame would not be revealed other than to say that the data come from coal mines in the midwestern and eastern United States. The histogram for these data is presented as the darker vertical bars in Fig. 3. Since the data are skewed to the right, a lower value of $r$ would presumably fit the data well. A value of $r = 0.1$ might be a good choice. Figure 3 panel (c) compares the Poisson kernel with the original data histogram when $r$ is data determined (in this case, $r = 0.152$). We find that the data-determined $r$ value is marginally better than the 0.1 value.

IV. CONCLUSIONS

The topic we have chosen to study is the use of a discrete Poisson kernel to generate discrete nonparametric probability distributions. We have compared our discrete kernel density with some discrete probability distributions as well as with the sample histogram data to see how well our approach fit the data when generated by different methods. Our simulation approach shows that our kernel density estimator works quite well in most circumstances. It provides a good degree of flexibility while still maintaining some reasonable degree of structure. However, since continuous first and second derivatives do not exist for discrete estimators, we are unable to prove the consistency and asymptotic efficiency of these estimators but have instead provided simulation results under different data structures as well as an example using actual sample data.

REFERENCES

Bierens, H.J. (1983) Uniform consistency of kernel estimators of a regression function under generalized conditions, Journal of the American Statistical Association, 78, 699–707.

Cameron, A.C. and Trivedi, P.K. (1986) Econometric models based on count data: comparisons and applications of some estimators, Journal of Applied Econometrics, 1, 29–53.

Delgado, M.A. and Mora, J. (1995) Nonparametric and semiparametric estimation with discrete regressors, Econometrica, 63, 1477–84.

Horowitz, J.L. (1993) Semiparametric and nonparametric estimation of quantal response models, in: (Eds) G.S. Maddala, C.R. Rao and H.D. Vinod Handbook of Statistics, Vol. 11, North Holland, Amsterdam.

Ullah, A. and Vinod, H.D. (1993) General nonparametric regression estimation and testing in econometrics, in: (Eds) G.S. Maddala, C.R. Rao and H.D. Vinod Handbook of Statistics, Vol. 11, North Holland, Amsterdam.


Fig. 3. Poisson kernel estimates of the distribution compared to the histogram for the coal mine wildcat strike data
