Chapter 17.1 Poisson Regression
Classic Poisson Example
• Number of deaths by horse kick, for each of 14 corps in the Prussian army, from 1875 to 1894
• Did the risk of death show a trend across years for the Guard Corps?
1. Construct Model – Graphical
1. Construct Model - Formal
Write General Linear Model:
General linear model inappropriate for count data:
• Variance likely increases with mean
• Fitted values may be negative
• Errors tend not to be normal
• Zeros are difficult to handle with transformations
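The point about negative fitted values can be checked with a short R sketch. The counts below are made up for illustration (they are not the horse-kick data), and the names x, y, lm_fit and glm_fit are hypothetical. A straight-line fit to counts that decline towards zero predicts negative counts, while a Poisson fit with a log link cannot.

```r
# Made-up counts that decline towards zero (illustrative only, not the horse-kick record)
x <- 0:19
y <- c(5, 4, 3, 3, 2, 1, 1, 1, rep(0, 12))

lm_fit  <- lm(y ~ x)                                   # general linear model
glm_fit <- glm(y ~ x, family = poisson(link = log))    # Poisson regression

range(fitted(lm_fit))    # lower end is below zero: an impossible count
range(fitted(glm_fit))   # strictly positive, because fitted values are exp(linear predictor)
```

The log link keeps every fitted mean positive by construction, which is one reason to move from the general to the generalized linear model for count data.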
Write Generalized Linear Model:
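Written out for this example, the two model statements might look as follows. This is a sketch that assumes the response is the yearly count of deaths and the only explanatory variable is year, as in the R call glm(deaths ~ year, family = poisson(link = log)). The symbols ε, σ² and μ are notation introduced here; β0 is assumed to denote the intercept.

```latex
% General linear model: identity link, normal errors
\text{deaths} = \beta_0 + \beta_{year}\,\text{year} + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)

% Generalized linear model: log link, Poisson errors
\text{deaths} \sim \text{Poisson}(\mu),
\qquad \ln(\mu) = \beta_0 + \beta_{year}\,\text{year}
```

Under the second form the fitted mean is \mu = e^{\beta_0 + \beta_{year}\,\text{year}}, so it is always positive and its variance rises with the mean.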
2. Execute analysis & 3. Evaluate model
glm1 <- glm(deaths~year, family=poisson(link=log), data=horsekick)
4. State population and whether sample is representative.
5. Decide on mode of inference. Is hypothesis testing appropriate?
6. State HA / H0 pair, tolerance for Type I error
Statistic: ΔG (change in deviance)
Distribution: chi-square
7. ANODEV. Calculate change in fit (ΔG) due to explanatory variables.
• The F-statistic is not used for models with non-normal errors
• We will assess improvement in fit (ANODEV)
> anova(glm1, test="Chisq")
Analysis of Deviance Table

Model: poisson, link: log

Response: deaths

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                    19     22.050
year  1  0.61137        18     21.439   0.4343
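The ΔG and p-value in this table can be reproduced by hand from the two residual deviances. The sketch below copies the numbers from the table; the variable names are hypothetical, and the p-value is taken from the chi-square distribution with 1 degree of freedom, as requested by test="Chisq".

```r
# Values copied from the Analysis of Deviance table
null_dev  <- 22.050   # residual deviance of the intercept-only model (19 df)
resid_dev <- 21.439   # residual deviance after adding year (18 df)

delta_G <- null_dev - resid_dev   # change in fit due to year: 0.611
df_diff <- 19 - 18                # one parameter added

# Upper-tail chi-square probability, using the unrounded deviance printed in the table
p_value <- pchisq(0.61137, df = df_diff, lower.tail = FALSE)

delta_G
p_value   # about 0.43, in line with Pr(>Chi) in the table
```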
8. Assess table in view of evaluation of residuals.
– Residuals acceptable
9. Declare decision about terms.
– Do not reject H0: there was no apparent trend in deaths by horse kick over two decades (ΔG=0.611, p=0.4343)
10. Analysis of parameters of biological interest.
– βyear was not significant
– Report mean deaths/yr
• 16 deaths / 20 years = 0.8 deaths/year
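The mean of 0.8 deaths/year is also what an intercept-only Poisson model reports once its coefficient is back-transformed from the log scale. The sketch below uses placeholder counts that merely share the same total (16 deaths in 20 years); they are not the actual Guard Corps record, and the names deaths_placeholder and glm0_sketch are hypothetical.

```r
# Placeholder counts: 20 yearly values summing to 16 (NOT the actual Guard Corps record)
deaths_placeholder <- c(rep(0, 8), rep(1, 9), 2, 2, 3)

# Intercept-only Poisson model with a log link
glm0_sketch <- glm(deaths_placeholder ~ 1, family = poisson(link = log))

mean(deaths_placeholder)   # 16 / 20 = 0.8 deaths per year
coef(glm0_sketch)          # intercept on the log scale, log(0.8)
exp(coef(glm0_sketch))     # back-transformed to deaths per year
```

For an intercept-only Poisson model the maximum-likelihood estimate of the mean is the sample mean, so the back-transformed intercept depends only on the total count and the number of years.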
library(pscl)   # prussian data set
library(Hmisc)  # Lag()
prussian        # print the full data set (all corps)

# Keep the Guard Corps only and rename the columns
horsekick <- subset(prussian, corp == "G")
names(horsekick) <- c("deaths", "year", "corps")

glm0 <- glm(deaths ~ 1, family = poisson(link = log), data = horsekick)     # intercept only
glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick)  # with year term

# Residuals vs. fitted values, then residuals vs. lagged residuals (independence check)
plot(glm1, which = 1, add.smooth = FALSE, pch = 16)
plot(glm1$residuals, Lag(glm1$residuals), xlab = "Residuals", ylab = "Lagged residuals", pch = 16)

# Observed counts with the fitted lines from both models
plot(deaths ~ year, data = horsekick, pch = 16, axes = FALSE, xlab = "Year", ylab = "Deaths (Guard Corps)")
axis(1, at = 75:94, labels = 1875:1894)
axis(2, at = 0:3)
box()
lines(horsekick$year, fitted(glm1))           # with regression term
lines(horsekick$year, fitted(glm0), lty = 2)  # intercept only

anova(glm1, test = "Chisq")