Transcript
Page 1: Chapter 17.1
Page 2: Chapter 17.1

Chapter 17.1 Poisson Regression

Page 3: Chapter 17.1

Classic Poisson Example

• Number of deaths by horse kick, for each of 14 corps in the Prussian army, from 1875 to 1894

• Did the risk of death show a trend across years for the Guard corps?

Page 4: Chapter 17.1

1. Construct Model – Graphical

Page 5: Chapter 17.1

1. Construct Model - Formal

Write General Linear Model:

General linear model inappropriate for count data:
• Variance likely increases with mean
• Fitted values may be negative
• Errors tend not to be normal
• Zeros are difficult to handle with transformations
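A quick simulation makes the first point concrete. This is a minimal sketch, not part of the original slides: the three means, the sample size, and the seed are arbitrary choices. For Poisson counts the variance equals the mean, so the constant-variance assumption of the general linear model cannot hold.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
means <- c(1, 5, 20)               # hypothetical Poisson means

# Draw 10,000 counts at each mean and compute the sample variance
sims <- lapply(means, function(m) rpois(10000, lambda = m))
vars <- sapply(sims, var)

# The variance column should track the mean column closely
print(data.frame(mean = means, variance = round(vars, 2)))
```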

Page 6: Chapter 17.1

1. Construct Model - Formal

Write General Linear Model:

Write Generalized Linear Model:
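The equations shown on these slides did not survive in the transcript. A plausible reconstruction, assuming the response is deaths and the only explanatory variable is year (as in the glm() call used in this chapter), and writing β0 for the intercept (a symbol assumed here, not taken from the slides), would be:

```latex
% General linear model: normal errors, identity link
\[
\text{deaths} = \beta_0 + \beta_{year}\,\text{year} + \varepsilon,
\qquad \varepsilon \sim \mathrm{Normal}(0, \sigma^2)
\]

% Generalized linear model: Poisson errors, log link
\[
\text{deaths} \sim \mathrm{Poisson}(\mu),
\qquad \log(\mu) = \beta_0 + \beta_{year}\,\text{year}
\;\Longleftrightarrow\;
\mu = e^{\beta_0 + \beta_{year}\,\text{year}}
\]
```

Because the mean is an exponential of the linear predictor, it is positive for any values of the coefficients.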

Page 7: Chapter 17.1

2. Execute analysis & 3. Evaluate model

glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick)
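The evaluation step can be sketched on simulated counts. Everything in this example is an assumption for illustration: the data are drawn from a Poisson distribution with an arbitrary mean of 0.8 and are not the horse-kick counts, and the names sim, sim_fit, mu_hat, and dev_res are hypothetical. It shows that fitted values under the log link stay positive, and how deviance residuals might be plotted against fitted values.

```r
set.seed(42)                                    # arbitrary seed
sim <- data.frame(
  year   = 75:94,
  deaths = rpois(20, lambda = 0.8)              # simulated, NOT the real counts
)

# Same model form as glm1, fitted to the simulated data
sim_fit <- glm(deaths ~ year, family = poisson(link = log), data = sim)

mu_hat  <- fitted(sim_fit)                      # fitted means on the response scale
dev_res <- residuals(sim_fit, type = "deviance")

print(range(mu_hat))                            # strictly positive under the log link

# Residuals vs. fitted values: look for trends or changing spread
plot(mu_hat, dev_res, pch = 16,
     xlab = "Fitted values", ylab = "Deviance residuals")
```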

Page 8: Chapter 17.1

2. Execute analysis & 3. Evaluate model

glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick)

Page 9: Chapter 17.1

4. State population and whether sample is representative.

5. Decide on mode of inference. Is hypothesis testing appropriate?

6. State HA / Ho pair, tolerance for Type I error

Statistic:

Distribution:

Page 10: Chapter 17.1

7. ANODEV. Calculate change in fit (ΔG) due to explanatory variables.

• The F-statistic is not used for models with non-normal errors

• We will assess improvement in fit (ANODEV)

Page 11: Chapter 17.1

7. ANODEV. Calculate change in fit (ΔG) due to explanatory variables.

> anova(glm1, test="Chisq")
Analysis of Deviance Table

Model: poisson, link: log

Response: deaths

Terms added sequentially (first to last)

     Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                    19     22.050
year  1  0.61137        18     21.439   0.4343
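The change in fit and its p-value can be reproduced by hand from the deviances in the table. A minimal sketch, assuming ΔG is compared against a chi-square distribution with degrees of freedom equal to the change in residual df (which is what test="Chisq" requests); the variable names are illustrative:

```r
null_dev  <- 22.050   # residual deviance, intercept-only model (19 df)
resid_dev <- 21.439   # residual deviance, model with year (18 df)

delta_G  <- null_dev - resid_dev   # improvement in fit due to year
delta_df <- 19 - 18                # one parameter added

# Upper-tail chi-square probability for the change in deviance
p_value <- pchisq(delta_G, df = delta_df, lower.tail = FALSE)

print(c(delta_G = delta_G, p_value = p_value))
```

The result differs from the table only through rounding of the two deviances.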

Page 12: Chapter 17.1

8. Assess table in view of evaluation of residuals.
– Residuals acceptable

9. Declare decision about terms.
– Do not reject Ho: there was no apparent trend in deaths by horse kick over two decades (ΔG = 0.611, p = 0.4343)

10. Analysis of parameters of biological interest.
– βyear was not significant – report mean deaths/yr
• 16 deaths / 20 years = 0.8 deaths/year
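For an intercept-only Poisson model with a log link, the fitted mean equals the sample mean, so the intercept is simply the log of the mean rate. A minimal sketch of that arithmetic, with illustrative variable names:

```r
total_deaths <- 16
n_years      <- 20

mean_rate <- total_deaths / n_years   # deaths per year
b0        <- log(mean_rate)           # intercept on the log-link scale

print(mean_rate)                      # the reported mean
print(exp(b0))                        # back-transforming the intercept recovers it
```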

Page 13: Chapter 17.1

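library(pscl)   # provides the prussian data set
library(Hmisc)  # provides Lag()
data(prussian)

# Guard corps only; rename columns to match the model formulas
horsekick <- subset(prussian, corp == "G")
names(horsekick) <- c("deaths", "year", "corps")

glm0 <- glm(deaths ~ 1, family = poisson(link = log), data = horsekick)    # intercept only
glm1 <- glm(deaths ~ year, family = poisson(link = log), data = horsekick) # with year term

# Residuals vs. fitted values, then residuals vs. lagged residuals
# (glm1$residuals holds the working residuals)
plot(glm1, which = 1, add.smooth = FALSE, pch = 16)
plot(glm1$residuals, Lag(glm1$residuals), xlab = "Residuals", ylab = "Lagged residuals", pch = 16)

# Data with fitted lines from both models
plot(deaths ~ year, data = horsekick, pch = 16, axes = FALSE, xlab = "Year", ylab = "Deaths (Guard corps)")
axis(1, at = 75:94, labels = 1875:1894)
axis(2, at = 0:3)
box()
lines(horsekick$year, fitted(glm1))          # with regression term
lines(horsekick$year, fitted(glm0), lty = 2) # intercept only

anova(glm1, test = "Chisq")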