
Transcript
Page 1: Section

Section

Count Data Models

Page 2: Section

Introduction

• Many outcomes of interest are integer counts
  – Doctor visits
  – Lost work days
  – Cigarettes smoked per day
  – Missed school days

• OLS can easily handle some integer-valued outcomes

Page 3: Section

• Example
  – SAT scores are essentially integer values
  – Few observations in the tails
  – Distribution is fairly continuous
  – OLS models these well

• In contrast, suppose the outcome has
  – A high fraction of zeros
  – Small positive values

Page 4: Section

• OLS models will
  – Predict negative values
  – Do a poor job of predicting the mass of observations at zero

• Example
  – Doctor visits in the past year, Medicare patients (65+)
  – 1987 National Medical Expenditure Survey
  – Top-coded (for now) at 10
  – 17% have no visits

Page 5: Section

      visits |      Freq.     Percent        Cum.
 ------------+-----------------------------------
           0 |        915       17.18       17.18
           1 |        601       11.28       28.46
           2 |        533       10.01       38.46
           3 |        503        9.44       47.91
           4 |        450        8.45       56.35
           5 |        391        7.34       63.69
           6 |        319        5.99       69.68
           7 |        258        4.84       74.53
           8 |        216        4.05       78.58
           9 |        192        3.60       82.19
          10 |        949       17.81      100.00
 ------------+-----------------------------------
       Total |      5,327      100.00
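The frequency table can be checked with a few lines of arithmetic. The sketch below copies the counts from the table and recomputes the share of zeros, the share at the top code, and the mean of the top-coded variable; the variable names are illustrative only.

```python
# Frequencies of doctor visits (top-coded at 10), copied from the table above.
freq = {0: 915, 1: 601, 2: 533, 3: 503, 4: 450, 5: 391,
        6: 319, 7: 258, 8: 216, 9: 192, 10: 949}

total = sum(freq.values())
share_zero = freq[0] / total
share_top = freq[10] / total

# Mean of the top-coded variable. The mean of the underlying counts is
# higher, because every count of 10 or more is recorded as 10.
mean_topcoded = sum(k * n for k, n in freq.items()) / total

print(f"N = {total}")
print(f"share with zero visits = {share_zero:.4f}")
print(f"share at the top code  = {share_top:.4f}")
print(f"mean (top-coded)       = {mean_topcoded:.2f}")
```

Two spikes stand out: one at zero and one at the top code. A symmetric, continuous error distribution of the kind OLS assumes cannot reproduce either.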

Page 6: Section

Poisson Model

• yi is drawn from a Poisson distribution

• Poisson parameter varies across observations

• f(yi; λi) = exp(−λi) λi^yi / yi!, for λi > 0 and yi = 0, 1, 2, ...

• E[yi]= Var[yi] = λi = f(xi, β)
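As a quick numerical check of the Poisson density and of the mean = variance property, here is a minimal sketch. The value λ = 4.4 is an illustrative assumption (roughly the mean of the top-coded visits variable), not an estimate from the slides.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Poisson probability exp(-lam) * lam**y / y!, for lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in logs so that large y does not overflow the factorial.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = 4.4  # illustrative value, not an estimate from the slides
support = range(0, 200)  # the mass beyond 200 is negligible for this lam
probs = [poisson_pmf(y, lam) for y in support]

mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(f"sum of probabilities = {sum(probs):.6f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
print(f"Pr(y = 0) = {poisson_pmf(0, lam):.4f}")
```

A single Poisson with this mean puts only about 1.2% of its mass at zero, far below the 17% of patients with no visits. Covariates shift λi across observations, but the comparison hints at why the plain Poisson can struggle with this sample.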

Page 7: Section

• λi must be positive at all times

• Therefore, we CANNOT let λi = xiβ

• Let λi = exp(xiβ)

• ln(λi) = (xiβ)
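To see why the linear form is ruled out, the sketch below evaluates both specifications for a single regressor. The coefficients are hypothetical and chosen only so that the linear index crosses zero.

```python
import math

# Hypothetical intercept and slope, for illustration only.
beta0, beta1 = 1.0, -0.5
xs = [0, 1, 2, 3, 4]

linear_index = [beta0 + beta1 * x for x in xs]   # x*beta, can be negative
lam = [math.exp(v) for v in linear_index]        # exp(x*beta), always positive

for x, v, l in zip(xs, linear_index, lam):
    print(f"x = {x}: x*beta = {v:+.2f}, exp(x*beta) = {l:.3f}")
```

The linear index turns negative for larger x, which is not a valid Poisson parameter, while the exponential form stays positive for every x.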

Page 8: Section

• d ln(λi)/dxi = β

• Remember that d ln(λi) = dλi/λi

• Interpret β as the proportional change in the mean outcome for a one-unit change in x (approximately 100·β percent when β is small)
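The interpretation can be written out in two steps. The derivation below is a sketch for a single scalar regressor; it also gives the exact effect of a one-unit change, which matters when β is not small.

```latex
\begin{align*}
\lambda_i &= \exp(x_i\beta) \\
\frac{\partial \ln \lambda_i}{\partial x_i} &= \beta
  \quad\Longrightarrow\quad
  \frac{\partial \lambda_i}{\lambda_i} = \beta\,\partial x_i \\
\frac{\lambda_i(x_i + 1)}{\lambda_i(x_i)} &= \exp(\beta)
  \quad\Longrightarrow\quad
  \%\Delta\lambda_i = 100\,[\exp(\beta) - 1] \approx 100\,\beta
  \ \text{ for small } \beta
\end{align*}
```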

Page 9: Section

Problems with Poisson

• Variance grows with the mean
  – E[yi] = Var[yi] = λi = f(xi, β)

• Most data sets exhibit overdispersion, where the variance exceeds the mean

• In the doctor visits sample, the mean is 5.6 and s = 6.7, so the variance (about 44.9) is roughly eight times the mean

• Imposing mean = variance is a severe restriction, and it tends to understate the standard errors

Page 10: Section

Negative Binomial Model

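$$
\Pr(Y = y_i) = \frac{\Gamma(y_i + \gamma_i)}{\Gamma(y_i + 1)\,\Gamma(\gamma_i)}
\left(\frac{\delta}{1+\delta}\right)^{y_i}
\left(\frac{1}{1+\delta}\right)^{\gamma_i}
$$

• Where γi = exp(xiβ) and δ ≥ 0

• E[yi] = δγi = δexp(xiβ)

• Var[yi] = δ (1+δ) γi

• Var[yi]/ E[yi] = (1+δ)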
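The moments listed on this slide can be verified numerically. The sketch below assumes the negative binomial density with shape γi and scale δ, which is the form consistent with E[yi] = δγi and Var[yi] = δ(1+δ)γi. The value γi = 1.0 is purely illustrative, and δ = 5.21 is the estimate reported later in these slides.

```python
import math

def negbin_pmf(y: int, gamma_i: float, delta: float) -> float:
    """Pr(Y = y) for a negative binomial with shape gamma_i > 0, scale delta > 0."""
    log_p = (math.lgamma(y + gamma_i) - math.lgamma(y + 1) - math.lgamma(gamma_i)
             + y * math.log(delta / (1 + delta))
             - gamma_i * math.log(1 + delta))
    return math.exp(log_p)

gamma_i = 1.0   # illustrative value of exp(x_i * beta)
delta = 5.21    # dispersion estimate quoted later in the slides
support = range(0, 2000)  # the tail mass beyond 2000 is negligible here
probs = [negbin_pmf(y, gamma_i, delta) for y in support]

mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(f"mean = {mean:.4f}      (delta * gamma_i = {delta * gamma_i:.4f})")
print(f"variance = {var:.4f}  (delta * (1 + delta) * gamma_i = {delta * (1 + delta) * gamma_i:.4f})")
print(f"variance / mean = {var / mean:.4f}")
```

The variance-to-mean ratio equals 1 + δ regardless of the value chosen for γi, which is why a single extra parameter is enough to relax the Poisson restriction.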

Page 11: Section

• δ must always be ≥ 0

• In this case, the variance grows faster than the mean

• If δ = 0, the model collapses into the Poisson

• Always estimate the negative binomial

• If you cannot reject the null that δ = 0, report the Poisson estimates

Page 12: Section

• Notice that ln(E[yi]) = ln(δ) + ln(γi), so

• d ln(E[yi]) /dxi = β

• Parameters have the same interpretation as in the Poisson model

Page 13: Section

In Stata

• poisson estimates a Poisson model by maximum likelihood
  – Syntax: poisson y independent variables

• nbreg estimates a negative binomial model by maximum likelihood
  – Syntax: nbreg y independent variables

Page 14: Section

Interpret results for Poisson

• Those with a CHRONIC condition have about 50% more mean MD visits

• Those in excellent health (EXCEL) have about 78% fewer MD visits

• BLACKS have about 33% fewer visits than whites

• Income elasticity is 0.021, so a 10% increase in income generates a 0.21% increase in visits
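The percentages above read each coefficient directly as a proportional change, which is an approximation that loosens as the coefficient moves away from zero. The sketch below converts the Chronic and Excel coefficients from the results table into exact changes using exp(β) − 1, and works out the income effect from the elasticity.

```python
import math

# Poisson coefficients on the two dummy variables, from the results table.
coefs = {"Chronic": 0.500, "Excel": -0.784}

# Exact percentage change in mean visits when a dummy switches from 0 to 1.
exact_pct = {name: 100 * (math.exp(b) - 1) for name, b in coefs.items()}

# Log-log coefficient (elasticity) on income, and a 10% income increase.
income_elasticity = 0.021
income_pct = 100 * (1.10 ** income_elasticity - 1)

for name, b in coefs.items():
    print(f"{name}: 100*beta = {100 * b:+.1f}%, exact = {exact_pct[name]:+.1f}%")
print(f"10% higher income: {income_pct:+.2f}% change in mean visits")
```

For coefficients as large as these, the exact figures differ noticeably from 100·β, so it is worth stating which convention a reported percentage uses.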

Page 15: Section

Negative Binomial

• Interpret results the same way as Poisson

• Look at coefficient/standard error on delta

• Ho: delta = 0 (Poisson model is correct)

• In this case, delta = 5.21 and its standard error is 0.15, so we easily reject the null

• Var/Mean = 1 + delta = 6.21, so the Poisson is misspecified; expect the standard errors from the misspecified Poisson model to be too small
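The test on delta described above is a simple ratio. The sketch below computes it from the two numbers on the slide and attaches a one-sided normal p-value. Treating the ratio as standard normal is an assumption made here for illustration; because delta = 0 lies on the boundary of the parameter space, software often reports a likelihood-ratio test instead.

```python
import math

delta_hat = 5.21   # estimated dispersion parameter, from the slide
se_delta = 0.15    # its standard error, from the slide

z = delta_hat / se_delta
# One-sided p-value under a standard normal approximation (an assumption).
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))
var_over_mean = 1 + delta_hat

print(f"z = {z:.2f}")
print(f"one-sided p-value = {p_one_sided:.3g}")
print(f"implied variance / mean = {var_over_mean:.2f}")
```

A ratio near 35 leaves no doubt about rejecting delta = 0 in this sample.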

Page 16: Section

Selected Results, Count Models
Parameter (Standard Error)

Variable     Poisson            Negative Binomial
Age65         0.214 (0.026)      0.103 (0.055)
Age70         0.787 (0.026)      0.204 (0.054)
Chronic       0.500 (0.014)      0.509 (0.029)
Excel        -0.784 (0.031)     -0.527 (0.059)
Ln(Inc)       0.021 (0.007)      0.038 (0.016)
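One way to read the table is to compare the standard errors across the two columns. The sketch below copies the reported values and computes, for each variable, the ratio of the negative binomial standard error to the Poisson one, along with the coefficient/standard-error ratios. The dictionary layout is just a convenience for this illustration.

```python
# (Poisson coef, Poisson SE, NegBin coef, NegBin SE), copied from the table.
results = {
    "Age65":   (0.214, 0.026, 0.103, 0.055),
    "Age70":   (0.787, 0.026, 0.204, 0.054),
    "Chronic": (0.500, 0.014, 0.509, 0.029),
    "Excel":   (-0.784, 0.031, -0.527, 0.059),
    "Ln(Inc)": (0.021, 0.007, 0.038, 0.016),
}

se_ratio, z_pois, z_nb = {}, {}, {}
for name, (b_p, se_p, b_nb, se_nb) in results.items():
    se_ratio[name] = se_nb / se_p   # how much larger the NegBin SE is
    z_pois[name] = b_p / se_p       # coefficient / SE under Poisson
    z_nb[name] = b_nb / se_nb       # coefficient / SE under NegBin
    print(f"{name:8s} SE ratio = {se_ratio[name]:.2f}, "
          f"z (Poisson) = {z_pois[name]:6.2f}, z (NegBin) = {z_nb[name]:6.2f}")
```

The negative binomial standard errors come out at roughly twice the Poisson ones for every variable, in line with the warning that the misspecified Poisson understates them. For Age65 the ratio of coefficient to standard error drops from about 8.2 to about 1.9.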