Generalized Linear Models
SJSU
November 17, 2016
GLM Overview
Analysis methods
ANOVA: discrete x, continuous y
Regression: discrete/continuous x, continuous y
GLM: discrete/continuous x, discrete/continuous y
Generalized Linear Models is the broadest category
GLM types
Logistic regression
Normal regression - needs the 3 assumptions (independence, Normality, constant variance)
Poisson regression
Negative binomial regression (maybe)
GLM types
Response (y) type:
Normally distributed
categorical - disease present/absent
categorical - disease low/medium/high
integer valued - number of chocolate chips
Logistic Regression (GLM)
Logistic Regression
Each response ($y_i$) corresponds to one or more covariates ($x_i$)
Response is binary (two-category) data:
male-female, healthy-sick, alive-dead, success-failure, win-loss
To simplify notation we denote the response with 0 or 1
We are interested in the probability (p) of y=1
Logistic Regression
Binary response data (Binomial): $F^{-1}(p) = \beta_0 + \beta_1 x$
Standard regression: $y = \beta_0 + \beta_1 x + \varepsilon_i$
$F^{-1}(p) = y$ (standard link)
Link Functions
Link function:
- links the probability (p) and the covariates
- F is a monotone cumulative distribution function
- common: logit link, probit link, complementary log-log (cloglog) link
Logit link: $F^{-1}(p_i) = \mathrm{logit}(p_i) = \log\frac{p_i}{1-p_i}$
$\log\frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_i$
$p_i = \frac{\exp(\beta_0+\beta_1 x_i)}{1+\exp(\beta_0+\beta_1 x_i)}$
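The logit link and its inverse can be checked numerically. This is a minimal sketch: the coefficients b0 and b1 and the covariate values x are made up for illustration, and plogis() and qlogis() are base R's inverse-logit and logit functions.

```r
# Hypothetical coefficients and covariate values, for illustration only
b0 <- -1
b1 <- 0.5
x  <- c(-2, 0, 2, 4)

eta <- b0 + b1 * x                       # linear predictor beta0 + beta1 * x

p_manual  <- exp(eta) / (1 + exp(eta))   # inverse logit written out as in the formula
p_builtin <- plogis(eta)                 # the same quantity via base R

# Applying the logit link to p recovers the linear predictor
logit_manual  <- log(p_manual / (1 - p_manual))
logit_builtin <- qlogis(p_builtin)

print(cbind(x, eta, p = p_manual))
```

Whatever the value of the linear predictor, the resulting p stays strictly between 0 and 1, which is the point of using a link function.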
Logistic Regression
Multiple measurements correspond to the same covariate ($x_i$)
Turn the multiple measurements into binomial counts
$x_i$: covariate
$n_i$: number of responses corresponding to $x_i$
$y_i$: number of responses equal to 1
$\log\frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_i$
Solving for βs
Newton-Raphson algorithm (e.g. the same method as finding the parameters of a Gamma distribution)
Solve the non-linear equation
$\log\frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_i$
Once the $\beta$s are found, solve for $p_i$
$p_i = \frac{\exp(\beta_0+\beta_1 x_i)}{1+\exp(\beta_0+\beta_1 x_i)}$
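The Newton-Raphson idea can be sketched in a few lines of R. This is an illustration, not the code behind the slides: the data are simulated, the true coefficients (-0.5 and 1.2) and the starting value are arbitrary, and the update used is the standard one for 0/1 responses, beta_new = beta + (X'WX)^{-1} X'(y - p) with W = diag(p(1-p)).

```r
set.seed(1)

# Simulated 0/1 data (hypothetical), one covariate
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))

X    <- cbind(1, x)   # design matrix with an intercept column
beta <- c(0, 0)       # arbitrary starting value

for (iter in 1:25) {
  p     <- plogis(drop(X %*% beta))           # current fitted probabilities
  score <- crossprod(X, y - p)                # gradient of the log-likelihood
  info  <- crossprod(X, X * (p * (1 - p)))    # X' W X, with W = diag(p * (1 - p))
  step  <- solve(info, score)                 # Newton-Raphson step
  beta  <- beta + drop(step)
  if (max(abs(step)) < 1e-10) break           # stop once the update is negligible
}

fit <- glm(y ~ x, family = binomial)          # reference fit from glm()
print(rbind(newton = beta, glm = coef(fit)))
```

The two rows should agree to several decimal places; glm() uses iteratively reweighted least squares, which gives the same updates as Newton-Raphson for the logit link.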
Assumptions
Independence still needs to be checked
There is no Normality assumption
There is no constant variance assumption
* the variance is a function of the mean
$E(y_i) = p_i$ and $V(y_i) = E(y_i)(1 - E(y_i)) = p_i(1 - p_i)$
Example 1 - logistic regression
C-section infection data: A C-section is major surgery to deliver a baby and can cause excessive bleeding, blood clots, infection, pain, longer hospital stays, and longer recovery. The data are from example 17.1 and concern infection following a C-section. The response variable (y) is the occurrence or non-occurrence of infection. There are three covariates (x), each at two levels:
x1 noplan - planned=0 and unplanned=1
x2 riskfac - risk factors (diabetes, overweight, previous C-section): present=1, not=0
x3 antibio - antibiotics were given=1 or not=0
Example 1 - Data
                              Planned             No Plan
                              Infection           Infection
                              yes   no  total     yes   no  total
Antibiotics      Risk (yes)     1   17     18      11   87     98
                 Risk (no)      0    2      2       0    0      0
No Antibiotics   Risk (yes)    28   30     58      23    3     26
                 Risk (no)      8   32     40       0    9      9
Example 1 - Code
$\log(p_{infection}/p_{noinfection}) = \beta_0 + \beta_1 \cdot \text{noplan} + \beta_2 \cdot \text{riskfac} + \beta_3 \cdot \text{antibio}$
# counts in the order of the 8 covariate combinations
infection=c(1,11,0,0,28,23,8,0)
total=c(18,98,2,0,58,26,40,9)
proportion=infection/total   # 0/0 gives NaN for the empty cell
noplan=c(0,1,0,1,0,1,0,1)
riskfac=c(1,1,0,0,1,1,0,0)
antibio=c(1,1,1,1,0,0,0,0)
reg1=glm(proportion ~ noplan+riskfac+antibio, family="binomial", weights=total)
summary(reg1)
proportion = $y_i/n_i$
weights = $n_i$
reg1=glm(proportion ~ noplan+riskfac+antibio, family="binomial", weights=ni)
summary(reg1)
Example 1 - Output
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.8926     0.4124  -4.589 4.45e-06 ***
noplan        1.0720     0.4254   2.520   0.0117 *
riskfac       2.0299     0.4553   4.459 8.25e-06 ***
antibio      -3.2544     0.4813  -6.761 1.37e-11 ***
Null deviance: 83.491 on 6 degrees of freedom
Residual deviance: 10.997 on 3 degrees of freedom
(1 observation deleted due to missingness)
AIC: 36.178
When antibiotics are given, the odds P(infection)/P(no infection) are multiplied by the factor exp(-3.25)=0.0388. Equivalently, since 1/0.0388=25.77, the odds decrease by a factor of 25.77.
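The odds interpretation can be reproduced by exponentiating the fitted coefficients. A minimal sketch, refitting the model with the data and glm() call from the slides; the object names fit and odds_ratio are introduced here for illustration.

```r
# C-section data from Example 1 (8 covariate combinations, one of them empty)
infection  <- c(1, 11, 0, 0, 28, 23, 8, 0)
total      <- c(18, 98, 2, 0, 58, 26, 40, 9)
proportion <- infection / total          # NaN for the empty cell, which glm() drops
noplan     <- c(0, 1, 0, 1, 0, 1, 0, 1)
riskfac    <- c(1, 1, 0, 0, 1, 1, 0, 0)
antibio    <- c(1, 1, 1, 1, 0, 0, 0, 0)

fit <- glm(proportion ~ noplan + riskfac + antibio,
           family = "binomial", weights = total)

odds_ratio <- exp(coef(fit))             # multiplicative effect of each term on the odds
print(round(odds_ratio, 4))
print(1 / odds_ratio["antibio"])         # how many times the odds shrink with antibiotics
```

Using the unrounded coefficient gives a factor slightly different from the slide's 25.77, which was computed from the rounded value -3.25.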
Example 1
Observed (no observed proportion for number 4: 0/0=NaN)
proportion= 0.0555 0.112 0.000 NaN 0.4827 0.8846 0.200 0.000
Model prediction proportion:
$\log(p_{infection}/p_{noinfection}) = \beta_0 + \beta_1 \cdot \text{noplan} + \beta_2 \cdot \text{riskfac} + \beta_3 \cdot \text{antibio}$
$p_{infection}/p_{noinfection} = \exp(\beta_0) \cdot \exp(\beta_1 \cdot \text{noplan}) \cdot \exp(\beta_2 \cdot \text{riskfac}) \cdot \exp(\beta_3 \cdot \text{antibio})$
predict(reg1, type="response")
0.0424 0.1145 0.00578 NA 0.534 0.770 0.1309 0.3056
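A fitted probability can be recomputed by hand from the printed coefficients. The sketch below does this for combination 1 (planned, risk factors present, antibiotics given); the coefficients are the rounded values from the summary output, and the names beta, xrow, eta, odds and p are introduced here for illustration.

```r
# Coefficients as printed in the summary output (rounded)
beta <- c(intercept = -1.8926, noplan = 1.0720, riskfac = 2.0299, antibio = -3.2544)

# Combination 1: intercept, noplan = 0, riskfac = 1, antibio = 1
xrow <- c(1, 0, 1, 1)

eta  <- sum(beta * xrow)   # log odds (linear predictor)
odds <- exp(eta)           # odds = product of the exp() factors
p    <- odds / (1 + odds)  # back to a probability

print(c(eta = eta, odds = odds, p = p))
```

The resulting p agrees with the first value returned by predict(reg1, type="response"), up to rounding of the coefficients.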
Deviance score
The deviance of the model measures goodness-of-fit
Output:
Null deviance: 83.491 on 6 degrees of freedom
Residual deviance: 10.997 on 3 degrees of freedom
(1 observation deleted due to missingness)
AIC: 36.178
$\chi^2$ with 3 degrees of freedom: 7 observations - 4 parameters estimated ($\beta$s) = 3
Residual deviance=10.9967
pvalue=1-pchisq(10.9967,3)=0.0117
Reject H0 (that the model fits) and conclude the model does not fit well
Include some interactions?
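The p-value calculation as a short runnable sketch; the variable names are introduced here for illustration, and the numbers are the ones reported in the output.

```r
resid_dev <- 10.9967   # residual deviance reported in the output
df_resid  <- 7 - 4     # 7 non-empty observations minus 4 estimated coefficients

# Upper-tail chi-square probability; same as 1 - pchisq(resid_dev, df_resid)
p_value <- pchisq(resid_dev, df = df_resid, lower.tail = FALSE)
print(round(p_value, 4))
```

A small p-value here means the residual deviance is larger than a well-fitting model would be expected to leave.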
Plot Data and Model
plot(1:8, proportion, pch=15, col="dark blue")
fit=predict(reg1, type="response")
# names(fit) are the row numbers that were kept (row 4 is dropped)
points(as.numeric(names(fit)), fit, pch=19, col="dark green")
legend(1, 0.9, legend=c("Obs. Proportions","Logistic Fit"), pch=c(15,19), col=c("dark blue","dark green"))
x-axis, as (noplan, riskfac, antibio): 1=(0,1,1), 2=(1,1,1), 3=(0,0,1), 4=(1,0,1), 5=(0,1,0), 6=(1,1,0), 7=(0,0,0), 8=(1,0,0)
Logistic Regression
Goodness of fit:
no $R^2$, F or MSE (use a pseudo $R^2$)
Deviance: $D = -2\log\frac{\text{likelihood of the fitted model}}{\text{likelihood of the saturated model}}$
$y_i$ is the number of 1s and $n_i - y_i$ is the number of 0s; $\hat{y}_i = n_i\hat{p}_i$ is the fitted number of 1s
$D = -2\sum_{i=1}^{k}\left\{ y_i \log\frac{\hat{y}_i}{y_i} + (n_i - y_i)\log\frac{n_i - \hat{y}_i}{n_i - y_i} \right\}$
Output:
Null deviance: 83.491 on 6 degrees of freedom
Residual deviance: 10.997 on 3 degrees of freedom
(1 observation deleted due to missingness)
AIC: 36.178
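The residual deviance in the output can be recomputed from the binomial deviance formula. A minimal sketch using the Example 1 data; the helper xlogx() and the other object names are introduced here for illustration, and terms with a zero count are taken as 0 (the usual 0 log 0 = 0 convention).

```r
# C-section data from Example 1
infection <- c(1, 11, 0, 0, 28, 23, 8, 0)
total     <- c(18, 98, 2, 0, 58, 26, 40, 9)
noplan    <- c(0, 1, 0, 1, 0, 1, 0, 1)
riskfac   <- c(1, 1, 0, 0, 1, 1, 0, 0)
antibio   <- c(1, 1, 1, 1, 0, 0, 0, 0)

keep <- total > 0                      # drop the empty cell (0 of 0)
y <- infection[keep]                   # number of 1s
n <- total[keep]                       # number of responses
reg <- glm(cbind(y, n - y) ~ noplan[keep] + riskfac[keep] + antibio[keep],
           family = binomial)
yhat <- n * fitted(reg)                # fitted number of 1s

# x * log(x / m), with the convention that the term is 0 when x = 0
xlogx <- function(x, m) ifelse(x == 0, 0, x * log(x / m))

# D = -2 * sum(y log(yhat / y) + ...) is the same as 2 * sum(y log(y / yhat) + ...)
D <- 2 * sum(xlogx(y, yhat) + xlogx(n - y, n - yhat))
print(c(by_hand = D, from_glm = deviance(reg)))
```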
Example 2
Arrhythmia: patients who undergo coronary artery bypass graft surgery (CABG) have an approximately 19 to 40% chance of developing atrial fibrillation (AF). AF is a quivering, chaotic motion in the upper chambers of the heart, known as the atria. AF can lead to the formation of blood clots, causing greater in-hospital mortality, strokes, and longer hospital stays. While AF can be prevented with drugs, the treatment is very expensive and sometimes dangerous if not warranted. Ideally, identifying several risk factors that indicate an increased risk of developing AF in this population could save lives and money by showing which patients need pharmacological intervention. Researchers began collecting data from CABG patients during their hospital stay, such as demographics like age and sex, as well as heart rate, cholesterol, operation time, etc. Then the researchers recorded which patients developed AF during their hospital stay. The goal was to evaluate the probability of AF given the measured demographic and risk factors.
Example 2: Data
Y fibrillation
X1 age
X2 aortic cross clamp time
X3 cardiopulmonary bypass time
X4 intensive care unit (ICU) time
X5 average heart rate
X6 left ventricle ejection fraction
X7 anamnesis of hypertension
X8 gender (1=female, 0=male)
X9 anamnesis of diabetes
X10 previous MI
R Code - Binomial
proportion = $y_i/n_i$
weights = $n_i$
reg1=glm(proportion ~ x1, family="binomial", weights=ni)
summary(reg1)
OR
Y=0 or 1 arrhythmia, $n_i = 1$ so weights are implied as 1
reg1=glm(Y ~ x1, family="binomial")
OR
ni=die+live
reg1=glm(cbind(die, live) ~ x1, family="binomial")
reg1=glm(die/ni ~ x1, family="binomial", weights=ni)
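The three ways of specifying a binomial response lead to the same coefficient estimates. The sketch below checks this on a small made-up data set; the counts in die and live and the values of x1 are hypothetical, and the 0/1 version is built by expanding each group into individual rows.

```r
# Made-up grouped data, for illustration only
x1   <- c(1, 2, 3, 4, 5)
die  <- c(1, 3, 4, 7, 9)
live <- c(9, 7, 6, 3, 1)
ni   <- die + live

# (a) two-column response: successes and failures
fit_cbind <- glm(cbind(die, live) ~ x1, family = "binomial")

# (b) proportion with the group sizes as weights
fit_prop <- glm(die / ni ~ x1, family = "binomial", weights = ni)

# (c) one 0/1 row per individual: die[i] ones followed by live[i] zeros in group i
Y    <- rep(rep(c(1, 0), times = length(x1)), times = as.vector(rbind(die, live)))
xrow <- rep(x1, times = ni)
fit_01 <- glm(Y ~ xrow, family = "binomial")

print(rbind(cbind = coef(fit_cbind), prop = coef(fit_prop), zero_one = coef(fit_01)))
```

The coefficients and standard errors match across the three fits, but the residual deviance of the 0/1 fit differs from the grouped fits because the saturated model it is compared against is different.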
Example 2
reg1=glm(Y ~ X1+X2+X3+X4+X5+X6+X7+X8+X9+X10, binomial)
summary(reg1)
            Estimate Std. Error z value Pr(>|z|)
Intercept -10.952752   4.539527  -2.413 0.015833 *
X1          0.153628   0.044021   3.490 0.000483 ***
X2          0.024800   0.023960   1.035 0.300635
X3         -0.016837   0.015594  -1.080 0.280272
X4         -0.129457   0.086554  -1.496 0.134737
X5          0.007144   0.029105   0.245 0.806109
X6          0.020674   0.025727   0.804 0.421647
X7         -0.537703   0.613750  -0.876 0.380979
X8         -0.263754   0.631467  -0.418 0.676178
X9          1.093606   0.633264   1.727 0.084179 .
X10         0.341597   0.641249   0.533 0.594237
(Dispersion parameter for binomial family taken to be 1)
Residual deviance: 78.252 on 70 degrees of freedom, AIC: 100.25
Example 2 plot
Too many X values, so find the linear predictor (x-axis): $\beta_0 + \beta_1 x_1 + ...$
The y-axis is the observed data: Y=1 fibrillation, Y=0 no fibrillation
The y-axis is also the fitted probabilities (GLM regression)
Example 2
# design matrix: a column of 1s for the intercept, then the covariates
XX=cbind(rep(1,length(X1)),X1,X2,X3,X4,X5,X6,X7,X8,X9,X10)
# linear predictor for each patient
linpred=drop(XX%*%coef(reg1))
plot(linpred, Y, col="dark blue")
fit=predict(reg1, type="response")
points(linpred, fit, pch=19, col="dark green")
legend(-5, 0.9, legend=c("Obs. Proportions","Logistic Fit"), pch=c(1,19), col=c("dark blue","dark green"))
Example 2
X Correlation Check
round(cor(XX[,2:11]),2)
        X1    X2    X3    X4    X5    X6    X7    X8    X9   X10
X1    1.00 -0.03  0.01 -0.01 -0.18  0.14  0.15  0.21 -0.11 -0.11
X2   -0.03  1.00  0.85  0.34 -0.07 -0.23  0.10 -0.26  0.14 -0.06
X3    0.01  0.85  1.00  0.32  0.13 -0.10  0.12 -0.19  0.15 -0.11
X4   -0.01  0.34  0.32  1.00  0.11 -0.23  0.14 -0.11  0.00  0.04
X5   -0.18 -0.07  0.13  0.11  1.00 -0.13 -0.08 -0.06 -0.02  0.18
X6    0.14 -0.23 -0.10 -0.23 -0.13  1.00  0.03  0.25 -0.18 -0.42
X7    0.15  0.10  0.12  0.14 -0.08  0.03  1.00  0.08  0.18 -0.12
X8    0.21 -0.26 -0.19 -0.11 -0.06  0.25  0.08  1.00 -0.08 -0.09
X9   -0.11  0.14  0.15  0.00 -0.02 -0.18  0.18 -0.08  1.00 -0.10
X10  -0.11 -0.06 -0.11  0.04  0.18 -0.42 -0.12 -0.09 -0.10  1.00
The highest is 0.85 for X2 (clamp time) and X3 (bypass time); remove one
Example 2
Remove X5 average heart rate
Remove X8 gender
Remove X10 previous MI
Remove X2 clamp time
Remove X3 bypass time
Remove X6 left ejection fraction
Remove X7 hypertension
Remove X9 diabetes
Keep X1 age and X4 ICU time
AIC and Parsimony
AIC=100.25: X1,...,X10 (full model)
AIC=92.46: X1, X4, X6, X7, X9
AIC=90.58: X1, X4, X7, X9
AIC=89.44: X1, X4, X9
AIC=89.48: X1 and X4 (small model)
The model with the smallest AIC is the *best* model
AIC gives a penalty for including too many Xs
Or you can look for the smallest "residual deviance"
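The penalty in AIC can be made explicit: AIC = -2 log-likelihood + 2 (number of estimated coefficients). The arrhythmia data are not included in the slides, so this sketch uses the Example 1 C-section data instead; the reduced model without noplan and the helper aic_by_hand() are introduced here only for illustration.

```r
# C-section data from Example 1
infection  <- c(1, 11, 0, 0, 28, 23, 8, 0)
total      <- c(18, 98, 2, 0, 58, 26, 40, 9)
proportion <- infection / total
noplan     <- c(0, 1, 0, 1, 0, 1, 0, 1)
riskfac    <- c(1, 1, 0, 0, 1, 1, 0, 0)
antibio    <- c(1, 1, 1, 1, 0, 0, 0, 0)

full  <- glm(proportion ~ noplan + riskfac + antibio, family = "binomial", weights = total)
small <- glm(proportion ~ riskfac + antibio, family = "binomial", weights = total)

# AIC written out: -2 * log-likelihood + 2 * number of coefficients
aic_by_hand <- function(fit) -2 * as.numeric(logLik(fit)) + 2 * length(coef(fit))

print(c(full = AIC(full), small = AIC(small)))
print(c(full = aic_by_hand(full), small = aic_by_hand(small)))
```

The model with the smaller AIC is preferred; dropping a term lowers the penalty by 2 but raises the deviance, so AIC only improves when the dropped term contributes little.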
Other links
Logit, probit, or clog-log links
reg1=glm(Y ~ X1+X2+X3+...+X10, family=binomial(logit))
reg1=glm(Y ~ X1+X2+X3+...+X10, family=binomial(probit))
reg1=glm(Y ~ X1+X2+X3+...+X10, family=binomial(cloglog))
Very small difference in results
Complementary log-log is good when y=1 is rare
Bayesian algorithms prefer the probit link
(see lab 5)
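To see how the three links differ, the inverse links can be evaluated on a common grid of linear-predictor values. A minimal sketch; the grid eta is arbitrary, and the formulas are the standard inverse links (logistic CDF, Normal CDF, and 1 - exp(-exp(eta)) for the complementary log-log).

```r
eta <- seq(-3, 3, by = 1.5)          # arbitrary grid of linear-predictor values

p_logit   <- plogis(eta)             # inverse logit
p_probit  <- pnorm(eta)              # inverse probit
p_cloglog <- 1 - exp(-exp(eta))      # inverse complementary log-log

print(round(cbind(eta, p_logit, p_probit, p_cloglog), 4))

# The family objects used by glm() carry the same inverse links
p_family <- binomial(link = "cloglog")$linkinv(eta)
```

The logit and probit curves are symmetric around p = 0.5, while the complementary log-log curve is not, which is one reason it is considered when y=1 is rare.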
Poisson Regression (GLM)
Poisson Regression
Response (y) is count data (Poisson)
y={0,1,2,3,...}
Tend to be rare events in a large number of trials
- accidents, incidents of a rare disease, device failure in a time interval
Poisson Regression
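$E(y) = \lambda$ and $V(y) = \lambda$
$y_i \sim \mathrm{Pois}(\lambda_i)$
$\log(\lambda_i) = \beta_0 + \beta_1 x_i$
or multiple x's
$\log(\lambda_i) = \beta_0 + \beta_1 x_{1,i} + \beta_2 x_{2,i} + ...$
$\hat{y}_i = \exp(\beta_0 + \beta_1 x_i)$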
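A worked step that follows from the log link (a standard consequence of the model, not stated on the slide): increasing x by one unit multiplies the expected count by $\exp(\beta_1)$.

```latex
\frac{\lambda(x+1)}{\lambda(x)}
  = \frac{\exp(\beta_0 + \beta_1 (x+1))}{\exp(\beta_0 + \beta_1 x)}
  = \exp(\beta_1)
```

For instance, a hypothetical coefficient of $\beta_1 = 0.1$ would correspond to $\exp(0.1) \approx 1.105$, an increase of about 10.5% in the expected count per unit of x.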
Poisson Regression
Model checking and Goodness-of-fit:
Deviance - $D = 2\sum_{i=1}^{n}\left( y_i \log\frac{y_i}{\hat{y}_i} - (y_i - \hat{y}_i) \right)$
Deviance residuals
Pearson residuals
Freeman-Tukey residuals
Plot the residuals
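The deviance and the first two residual types can be computed by hand and compared with what R reports. This sketch uses the Example 3 counts and the glm() call from the slides; the object names mu, D_by_hand, dev_res and pearson are introduced here, and the formulas are the standard Poisson ones (all counts are positive, so no 0 log 0 terms arise).

```r
# Example 3 data (cell differentiation counts and doses)
y   <- c(11, 18, 20, 39, 22, 38, 52, 69, 31, 68, 69, 128, 102, 171, 180, 193)
tfn <- c(0, 0, 0, 0, 1, 1, 1, 1, 10, 10, 10, 10, 100, 100, 100, 100)
ifn <- c(0, 4, 20, 100, 0, 4, 20, 100, 0, 4, 20, 100, 0, 4, 20, 100)

reg <- glm(y ~ tfn * ifn, family = poisson)
mu  <- fitted(reg)                                   # fitted means (y-hat)

# Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))
unit_dev  <- 2 * (y * log(y / mu) - (y - mu))
D_by_hand <- sum(unit_dev)

dev_res <- sign(y - mu) * sqrt(unit_dev)             # deviance residuals
pearson <- (y - mu) / sqrt(mu)                       # Pearson residuals

print(c(by_hand = D_by_hand, from_glm = deviance(reg)))
plot(mu, dev_res, xlab = "Fitted mean", ylab = "Deviance residual")
abline(h = 0, lty = 2)
```

The squared deviance residuals add up to the residual deviance, so a few large residuals in the plot show which observations drive a poor fit.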
Example 3
Cellular differentiation (when a cell becomes a more specialized cell). This is a study of TNF (tumor necrosis factor) and IFN (interferon) to induce cell differentiation. The number of cells that exhibited markers of differentiation after exposure to TNF or IFN was recorded. There were 16 dose combinations of TNF/IFN and 200 cells were examined.
reg1=glm(y ~ tfn*ifn, family=poisson)
Example 3-Data
y=c(11,18,20,39,22,38,52,69,31,68,69,128,102,171,180,193)
tfn=c(0,0,0,0,1,1,1,1,10,10,10,10,100,100,100,100)
ifn=c(0,4,20,100,0,4,20,100,0,4,20,100,0,4,20,100)
Example 3
summary(reg1)
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.436e+00  6.377e-02  53.877  < 2e-16 ***
tfn          1.553e-02  8.308e-04  18.689  < 2e-16 ***
ifn          8.946e-03  9.669e-04   9.253  < 2e-16 ***
tfn:ifn     -5.670e-05  1.348e-05  -4.205 2.61e-05 ***
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 707.03 on 15 degrees of freedom
Residual deviance: 142.39 on 12 degrees of freedom
AIC: 243.69
Example 3: More code
confint(reg1)
                    2.5 %        97.5 %
(Intercept)  3.308307e+00  3.558360e+00
tfn          1.390603e-02  1.716434e-02
ifn          7.043823e-03  1.083599e-02
tfn:ifn     -8.318686e-05 -3.031362e-05
Example 3
What we expect to see:
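predict(reg1, type="response")
 35.62746  36.47161  40.05305  63.97916  36.09877  36.95410  40.58291  64.82554
 40.63133  41.59405  45.67849  72.96502 132.60092 135.74276 149.07241 238.12240
Note: these fitted values are log-additive in tfn and ifn (the first is exp(3.573)=35.63, not exp(3.436)=31.06), so they appear to come from the model without the interaction term, y ~ tfn + ifn.
Data:
y=c(11,18,20,39,22,38,52,69,31,68,69,128,102,171,180,193)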
Example 3
residuals(reg1, type="deviance")
Example 4
Overdispersion/Underdispersion and the Quasi-Poisson
The Poisson distribution has one parameter for both the mean and the variance (the dispersion parameter is fixed at 1)
There is a strict assumption that the mean=variance
What if that is not the case?
Example 4
Y is the count of faults in the manufacturing of rolls of fabric. X is the length of the roll.
The Poisson model is: $\log(\lambda_i) = \beta_0 + \beta_1 x_i$
glm(y ~ x, family=poisson)
Example 4
Standard Poisson: glm(y ~ x, family=poisson)
             Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.9717506  0.2124693   4.574 4.79e-06 ***
x           0.0019297  0.0003063   6.300 2.97e-10 ***
(Dispersion parameter for poisson family taken to be 1)
Null deviance: 103.714 on 31 degrees of freedom
Residual deviance: 61.758 on 30 degrees of freedom
AIC: 189.06
Example 4
Quasi-Poisson: glm(y ~ x, family=quasipoisson)
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.9717506  0.3095033   3.140 0.003781 **
x           0.0019297  0.0004462   4.325 0.000155 ***
(Dispersion parameter for quasipoisson family taken to be 2.121965)
Null deviance: 103.714 on 31 degrees of freedom
Residual deviance: 61.758 on 30 degrees of freedom
AIC: NA
Example 4
Poisson versus Quasi-Poisson
Same $\beta_0$ and $\beta_1$
Same deviance
Dispersion parameter 1 (Poisson) versus 2.12 (Quasi-Poisson)
Different p-values
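The comparison can be reproduced on simulated data. The fabric data are not listed in the slides, so the sketch below generates hypothetical overdispersed counts from a negative binomial distribution; the roll lengths, true coefficients and size parameter are all made up. It shows that the quasi-Poisson standard errors are the Poisson ones multiplied by the square root of the dispersion estimate, which is also what the two outputs show: 0.3095033/0.2124693 = 1.4567, the square root of 2.121965.

```r
set.seed(42)

# Hypothetical overdispersed counts (variance larger than the mean)
x  <- runif(100, 100, 900)                 # made-up roll lengths
mu <- exp(1 + 0.002 * x)
y  <- rnbinom(100, mu = mu, size = 3)

fit_pois  <- glm(y ~ x, family = poisson)
fit_quasi <- glm(y ~ x, family = quasipoisson)

# Dispersion estimate: Pearson chi-square divided by the residual degrees of freedom
phi <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

se_pois  <- summary(fit_pois)$coefficients[, "Std. Error"]
se_quasi <- summary(fit_quasi)$coefficients[, "Std. Error"]

print(c(dispersion = phi))
print(rbind(poisson = se_pois, quasi = se_quasi, scaled = se_pois * sqrt(phi)))
```

Because the coefficients do not change and only the standard errors are inflated, the quasi-Poisson fit gives larger p-values whenever the estimated dispersion exceeds 1.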
Other GLM
Other GLM
Exponential family response (y)
Normal
Binomial/Bernoulli
Poisson
Gamma