Soc504: Generalized Linear Models
Brandon Stewart¹
Princeton
February 22 - March 15, 2017
¹These slides are heavily influenced by Gary King, with some material from Teppei Yamamoto, Patrick Lam, and Yuri Zhukov. Some individual vignettes are built from the collective effort of generations of teaching fellows for Gov2001 at Harvard.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 1 / 242
Followup
Questions?
Replication Stories?
1 Binary Outcome Models
2 Quantities of Interest
    An Example with Code
    Predicted Values
    First Differences
    General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
    Poisson
    Overdispersion
    Binomial for Known Trials
7 Duration Models
    Exponential Model
    Weibull Model
    Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Binary Outcomes

Binary outcome variable: Yᵢ ∈ {0, 1}

Examples in social science: numerous!

- Turnout (1 = vote; 0 = abstain)
- Education (1 = completed HS; 0 = dropped out)
- Conflict (1 = civil war; 0 = no civil war)
- Eviction (1 = evicted; 0 = not evicted)
- etc.
How Do We Model a Binary Outcome Variable?

For a continuous outcome variable, we model the conditional expectation function with predictors Xᵢ:

E(Yᵢ | Xᵢ) = Xᵢ⊤β (linear regression)

When Yᵢ is binary, E(Yᵢ | Xᵢ) = Pr(Yᵢ = 1 | Xᵢ)

Thus, we model the conditional probability of Yᵢ = 1:

Pr(Yᵢ = 1 | Xᵢ) = g(Xᵢ⊤β)

There are many possible binary outcome models, depending on the choice of g(·)
Linear Probability Model (LPM)

The simplest choice: g(Xᵢ⊤β) = Xᵢ⊤β

This gives the linear probability model (LPM):

Pr(Yᵢ = 1 | Xᵢ) = Xᵢ⊤β

or equivalently

Yᵢ ∼ Bernoulli(πᵢ), with πᵢ = Xᵢ⊤β

Advantages:
- Easy to estimate: regress Yᵢ on Xᵢ
- Easy to interpret: β = ATE if Xᵢ ∈ {0, 1} and exogenous

Disadvantages:
- Estimated probability can go outside of [0, 1]
- Always heteroskedastic
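The first disadvantage is easy to see numerically. A minimal sketch (simulated data, not from the slides): fit the LPM by OLS on a binary outcome that switches from 0 to 1 as the predictor grows, and check the fitted "probabilities" at the extremes.

```python
import numpy as np

# Simulated data: binary outcome that switches from 0 to 1 as x grows
x = np.arange(11, dtype=float)      # predictor: 0, 1, ..., 10
y = (x >= 6).astype(float)          # outcome: 0 for x < 6, 1 otherwise

# LPM = OLS of y on (1, x), solved by least squares
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

fitted = X @ beta                   # "probabilities" pi_i = x_i' beta
print(beta)                         # intercept ~ -0.227, slope ~ 0.136
print(fitted.min(), fitted.max())   # below 0 and above 1: impossible probabilities
```

The fitted line is the best linear approximation to Pr(Yᵢ = 1 | Xᵢ), but nothing constrains it to [0, 1], which is exactly what motivates the CDF-based links on the next slide.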
Logit and Probit Models

Want: 0 ≤ g(Xᵢ⊤β) ≤ 1 for any Xᵢ

Solution: use a CDF

πᵢ = g(Xᵢ⊤β) = F(Xᵢ⊤β)

Note: F is not the CDF of Yᵢ, which is Bernoulli (using a CDF is just a convenient way to ensure 0 ≤ πᵢ ≤ 1)

Logit: logistic CDF (a.k.a. inverse logit function)

πᵢ = logit⁻¹(Xᵢ⊤β) ≡ exp(Xᵢ⊤β) / (1 + exp(Xᵢ⊤β)) = 1 / (1 + exp(−Xᵢ⊤β))

Probit: standard normal CDF

πᵢ = Φ(Xᵢ⊤β)
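Both links fit in a few lines of code. A sketch (`inv_logit` and `probit` are made-up helper names; the normal CDF is written via the error function so no extra libraries are needed):

```python
import math

def inv_logit(eta):
    """Logistic CDF: maps any real eta into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def probit(eta):
    """Standard normal CDF Phi(eta), written via the error function."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

# Both links squash the linear predictor into (0, 1) and agree at 0.5 when eta = 0
for eta in (-5.0, 0.0, 5.0):
    print(eta, inv_logit(eta), probit(eta))
```

The two CDFs look similar near zero but differ in the tails: the normal CDF approaches 0 and 1 faster, so probit pushes extreme linear predictors closer to the boundary than logit does.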
Binary Variable Regression Models

The logistic regression (or “logit”) model:

1. Stochastic component:

   Yᵢ ∼ Bern(yᵢ | πᵢ) = πᵢ^{yᵢ} (1 − πᵢ)^{1−yᵢ} = πᵢ for yᵢ = 1; 1 − πᵢ for yᵢ = 0

2. Systematic component:

   Pr(Yᵢ = 1 | β) ≡ E(Yᵢ) ≡ πᵢ = 1 / (1 + e^{−xᵢβ})

3. Yᵢ and Yⱼ are independent for all i ≠ j, conditional on X
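The two components can be simulated directly, which makes their roles concrete: the systematic component turns xᵢ into πᵢ, and the stochastic component draws an independent Bernoulli for each observation. The coefficients below are made-up for illustration.

```python
import math
import random

# Hypothetical coefficients, chosen only for illustration
beta0, beta1 = -1.0, 0.5

def pi_of(x):
    # Systematic component: pi_i = 1 / (1 + exp(-x_i beta))
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

random.seed(0)
x = [random.gauss(0, 2) for _ in range(5000)]
# Stochastic component: Y_i ~ Bernoulli(pi_i), independent given X
y = [1 if random.random() < pi_of(xi) else 0 for xi in x]

# The empirical share of ones tracks the average of the pi_i
print(sum(y) / len(y))
print(sum(pi_of(xi) for xi in x) / len(x))
```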
Binary Variable Regression Models

The probability density of all the data:

P(y | π) = ∏ᵢ₌₁ⁿ πᵢ^{yᵢ} (1 − πᵢ)^{1−yᵢ}

The log-likelihood:

ln L(π | y) = Σᵢ₌₁ⁿ { yᵢ ln πᵢ + (1 − yᵢ) ln(1 − πᵢ) }
            = Σᵢ₌₁ⁿ { −yᵢ ln(1 + e^{−xᵢβ}) + (1 − yᵢ) ln(1 − 1/(1 + e^{−xᵢβ})) }
            = −Σᵢ₌₁ⁿ ln(1 + e^{(1−2yᵢ)xᵢβ}).

What do we do with this?
Binary Variable Regression Models
The probability density of all the data:
P(y |π) =n∏
i=1
πyii (1− πi )1−yi
The log-likelihood:
ln L(π|y) =n∑
i=1
{yi lnπi + (1− yi ) ln(1− πi )}
=n∑
i=1
{−yi ln
(1 + e−xiβ
)+ (1− yi ) ln
(1− 1
1 + e−xiβ
)}
= −n∑
i=1
ln(
1 + e(1−2yi )xiβ).
What do we do with this?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 9 / 242
![Page 51: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/51.jpg)
Binary Variable Regression Models
The probability density of all the data:
P(y |π) =n∏
i=1
πyii (1− πi )1−yi
The log-likelihood:
ln L(π|y) =n∑
i=1
{yi lnπi + (1− yi ) ln(1− πi )}
=n∑
i=1
{−yi ln
(1 + e−xiβ
)+ (1− yi ) ln
(1− 1
1 + e−xiβ
)}
= −n∑
i=1
ln(
1 + e(1−2yi )xiβ).
What do we do with this?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 9 / 242
![Page 52: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/52.jpg)
Interpreting Functional Forms
Running Example is logit:
πi = 1 / (1 + e^(−xiβ))
Methods:
1. Graphs.
(a) Can use desired instead of observed X’s
(b) Can try entire surface plot for a small number of X’s
(c) Marginal effects: Can hold “other variables” constant at their means, a typical value, or at their observed values
(d) Average effects: compute effects for every observation and average
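The contrast between (c) and (d) can be sketched as follows (β and X are made-up placeholders; the derivative formula βj·πi·(1 − πi) appears later in these slides):

```python
# Sketch: marginal effect at the mean vs. average marginal effect for a logit.
# beta and X are illustrative, not estimates from the slides.
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])  # intercept + one covariate
beta = np.array([0.2, 0.8])

def deriv(X, beta, j=1):
    # d pi / d x_j = beta_j * pi * (1 - pi)
    pi = 1.0 / (1.0 + np.exp(-X @ beta))
    return beta[j] * pi * (1 - pi)

effect_at_mean = deriv(X.mean(axis=0, keepdims=True), beta)[0]  # covariates at means
average_effect = deriv(X, beta).mean()                          # averaged over rows
```

The two generally differ because the logit curve is nonlinear, so reporting which one you computed matters.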
Interpreting Functional Forms
2. Fitted Values for selected combinations of X’s, or “typical” people or types:

   Sex     Age  Home           Income    Pr(vote)
   Male    20   Chicago        $33,000   0.20
   Female  27   New York City  $43,000   0.28
   Male    50   Madison, WI    $55,000   0.72
   ...

We may also want to include uncertainty (fundamental and estimation uncertainty)
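The estimation-uncertainty part can be sketched by simulating coefficient draws (β̂ and its covariance matrix V below are made-up numbers, purely for illustration):

```python
# Sketch: simulation-based interval for a fitted probability.
# beta_hat and V (the covariance of beta_hat) are made-up illustrative values.
import numpy as np

rng = np.random.default_rng(2)
beta_hat = np.array([-1.0, 0.03])                 # intercept, age coefficient
V = np.array([[0.04, -0.0005],
              [-0.0005, 0.0001]])                 # pretend estimated covariance

x_profile = np.array([1.0, 40.0])                 # a "typical person": age 40

draws = rng.multivariate_normal(beta_hat, V, size=5000)  # estimation uncertainty
pi_draws = 1.0 / (1.0 + np.exp(-draws @ x_profile))      # fitted Pr per draw

point = 1.0 / (1.0 + np.exp(-beta_hat @ x_profile))
lo, hi = np.percentile(pi_draws, [2.5, 97.5])            # 95% interval
```

Fundamental uncertainty would add a Bernoulli draw on top of each simulated probability.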
Interpreting Functional Forms
3. First Differences (called Risk Differences in epidemiology)
(a) Define Xs (starting point) and Xe (ending point) as k × 1 vectors of values of X. Usually all values are the same but one.
(b) First difference = g(Xe, β) − g(Xs, β)
(c) D = 1/(1 + e^(−Xeβ)) − 1/(1 + e^(−Xsβ))
(d) Better (and necessary to compute se’s): do by simulation (we’ll repeat the details soon)

   Variable  From        To            First Difference
   Sex       Male     →  Female          .05
   Age       65       →  75             −.10
   Home      NYC      →  Madison, WI     .26
   Income    $35,000  →  $75,000         .14
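Item (d) can be sketched like this (β̂, its covariance V, and the age profiles are made-up numbers for illustration):

```python
# Sketch: a first difference g(Xe, beta) - g(Xs, beta) with a
# simulation-based standard error. All numeric values are illustrative.
import numpy as np

rng = np.random.default_rng(3)
beta_hat = np.array([-2.0, 0.04])        # intercept, age coefficient (made up)
V = np.diag([0.09, 0.0001])              # made-up covariance of beta_hat

Xs = np.array([1.0, 65.0])               # starting profile: age 65
Xe = np.array([1.0, 75.0])               # ending profile:   age 75

def g(x, B):
    # logit probability for profile x; B is a coefficient vector or rows of draws
    return 1.0 / (1.0 + np.exp(-(B @ x)))

D_point = g(Xe, beta_hat) - g(Xs, beta_hat)   # plug-in first difference

draws = rng.multivariate_normal(beta_hat, V, size=5000)
D_draws = g(Xe, draws) - g(Xs, draws)         # one first difference per draw
se_D = D_draws.std()                          # simulation-based standard error
```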
Interpreting Functional Forms
4. Derivatives (i.e. a source of heuristics for talks)
∂πi/∂Xj = ∂[1/(1 + e^(−Xβ))]/∂Xj = βj πi (1 − πi)
(a) Max value of logit derivative: β × 0.5(1− 0.5) = β/4
(b) Max value for probit [πi = Φ(Xiβ)] derivative: β × 0.4
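Both heuristics are easy to verify numerically (β = 2 is an arbitrary choice; the probit constant comes from φ(0) = 1/√(2π) ≈ 0.3989):

```python
# Numeric check of the two heuristics: the logit derivative beta*pi*(1-pi)
# peaks at beta/4, and the probit derivative beta*phi(x*beta) peaks near 0.4*beta.
import numpy as np
from scipy.stats import norm

beta = 2.0
x = np.linspace(-5, 5, 100001)                 # grid that includes x = 0

pi = 1.0 / (1.0 + np.exp(-beta * x))
max_logit = (beta * pi * (1 - pi)).max()       # = beta/4, attained at pi = 0.5

max_probit = (beta * norm.pdf(beta * x)).max() # = beta * phi(0) ≈ 0.3989 * beta
```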
Latent Variable Interpretation
Logit models can also be interpreted in terms of a latent variable Y*i
Let
Yi = { 1 if Y*i > 0
     { 0 if Y*i ≤ 0

Y*i = Xi⊤β + εi, where E(εi) = 0
This is also called a random utility model, where
- Y*i : utility from choosing Yi = 1 instead of Yi = 0
- Xi⊤β : systematic component of utility
- εi : stochastic (random) component of utility
Make distributional assumptions about εi :
- εi ∼ i.i.d. Logistic =⇒ Logit
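A quick simulation (with an arbitrary illustrative value of x′β) confirms that logistic errors in this latent-utility setup reproduce the logit probability:

```python
# Sketch: random utility model Y* = x'beta + eps with i.i.d. logistic errors.
# Observing only the sign of Y* yields Pr(Y = 1) = 1 / (1 + exp(-x'beta)).
import numpy as np

rng = np.random.default_rng(4)
xb = 0.7                                      # illustrative systematic utility
eps = rng.logistic(loc=0.0, scale=1.0, size=200_000)
y = (xb + eps > 0).astype(int)                # Y = 1 iff latent utility positive

simulated = y.mean()                          # share of simulated Y = 1
analytic = 1.0 / (1.0 + np.exp(-xb))          # the logit systematic component
```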
Latent Variable: Walkthrough
Let Y* be a continuous unobserved variable: health, propensity to vote, etc.

A model:

Y*i ∼ P(y*i | µi)
µi = xiβ

with observation mechanism:

yi = { 1 if y*i > τ (i is alive)
     { 0 if y*i ≤ τ (i is dead)

Since Y* is unobserved anyway, define the threshold as τ = 0. (Plus the same independence assumption, which from now on is assumed implicit.)
Same Logit Model, Different Justification and Interpretation
1. If Y* is observed and P(·) is normal, this is a regression.
2. If only yi is observed, and Y* is standardized logistic (which looks close to the normal),

   P(y*i | µi) = STL(y*i | µi) = e^(y*i − µi) / [1 + e^(y*i − µi)]²
then we get a logit model.
Same Logit Model, Different Justification and Interpretation

3. The derivation:

Pr(Y_i = 1 | µ_i) = Pr(Y*_i ≤ 0)
                  = ∫_{−∞}^{0} STL(y*_i | µ_i) dy*_i
                  = F_stl(0 | µ_i)   [the CDF of the STL]
                  = [1 + exp(−X_i β)]^{−1}

The same functional form!
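The key analytic fact in this derivation is that the STL density integrates to a CDF of inverse-logit form, which is why evaluating it at the threshold yields the familiar logit expression. A quick quadrature check (scipy assumed available; the value of µ is arbitrary):

```python
import numpy as np
from scipy.integrate import quad

def stl_pdf(y, mu):
    """Standardized logistic density from the slide."""
    e = np.exp(y - mu)
    return e / (1.0 + e) ** 2

mu = 0.7                                       # hypothetical value of x_i beta
for y0 in (-2.0, 0.0, 1.5):
    area, _ = quad(stl_pdf, -np.inf, y0, args=(mu,))
    cdf = 1.0 / (1.0 + np.exp(-(y0 - mu)))     # inverse-logit form
    print(y0, round(area, 6), round(cdf, 6))   # the two columns match
```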
The Probit Model

4. For the probit model, we modify:

P(y*_i | µ_i) = N(y*_i | µ_i, 1)

with the same observation mechanism, implying

Pr(Y_i = 1 | µ_i) = ∫_{−∞}^{0} N(y*_i | µ_i, 1) dy*_i = Φ(X_i β)

5. ⟹ interpret β as regression coefficients of Y* on X: β₁ is what happens to Y* on average (or to µ_i) when X₁ goes up by one unit, holding constant the other explanatory variables (and conditional on the model). In probit, one unit of Y* is one standard deviation.
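The latent-scale interpretation is easy to illustrate numerically. In this sketch (hypothetical coefficients; scipy assumed available), a one-unit change in X₁ shifts the latent mean by β₁ latent standard deviations, but the implied change in Pr(Y = 1) = Φ(xβ) depends on where on the curve you start:

```python
import numpy as np
from scipy.stats import norm

beta = np.array([0.2, 0.8])     # hypothetical probit coefficients
x_lo = np.array([1.0, 0.0])     # intercept plus covariate at 0
x_hi = np.array([1.0, 1.0])     # covariate raised by one unit

# One unit of X_1 moves the latent mean by beta[1] = 0.8 latent SDs;
# the effect on the probability is nonlinear in the starting point
p_lo = norm.cdf(x_lo @ beta)
p_hi = norm.cdf(x_hi @ beta)
print(p_lo, p_hi, p_hi - p_lo)
```

This is why probit coefficients are not directly comparable to probability changes: the same latent shift produces different probability shifts depending on the baseline.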
An Econometric Interpretation: Utility Maximization

Let U_i^D be the utility for the Democratic candidate, and U_i^R be the utility for the Republican candidate.

Assume U_i^D and U_i^R are independent.

Assume U_i^k ~ P(U_i^k | η_i^k) for k ∈ {D, R}.

Let Y*_i ≡ U_i^D − U_i^R and apply the same interpretation as above: if y*_i > 0, choose the Democrat; otherwise, choose the Republican.

If P(·) is normal, we get a probit model.
If P(·) is generalized extreme value, we get logit.
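The extreme-value case can be checked by simulation: the difference of two independent type-I extreme value (Gumbel) shocks is a standardized logistic variable, so the choice frequencies match the logit form. A sketch with hypothetical systematic utilities (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
eta_D, eta_R = 0.4, -0.1          # hypothetical systematic utilities

# Type-I extreme value (Gumbel) utility shocks; the difference of two
# independent Gumbels follows a standardized logistic distribution
U_D = eta_D + rng.gumbel(size=n)
U_R = eta_R + rng.gumbel(size=n)

share_D = (U_D - U_R > 0).mean()           # fraction choosing Democrat
logit_prob = 1.0 / (1.0 + np.exp(-(eta_D - eta_R)))
print(share_D, logit_prob)                 # should be close
```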
Comparing LPM, Logit, and Probit
[Figure: simulated binary data, π_i against X_i over roughly −3 to 3, with three fitted curves: LPM (β = 0.19), Logit (β = 1.06), Probit (β = 0.63).]

LPM goes outside of [0, 1] for extreme values of X_i

LPM underestimates the marginal effect near the center and overpredicts near the extremes

Logit has slightly fatter tails than probit, but no practical difference

Note that the β are completely different between the models
1 Binary Outcome Models
2 Quantities of Interest
    An Example with Code
    Predicted Values
    First Differences
    General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
    Poisson
    Overdispersion
    Binomial for Known Trials
7 Duration Models
    Exponential Model
    Weibull Model
    Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
How Not to Present Statistical Results
1. This one is typical of current practice, not that unusual.
2. What do these numbers mean?
3. Why so much whitespace? Can you connect cols A and B?
How Not to Present Statistical Results
4. What does the star-gazing add?
5. Can any be interpreted as causal estimates?
6. Can you compute a quantity of interest from these numbers?
Interpretation and Presentation

1. Statistical presentations should:

(a) Convey numerically precise estimates of the quantities of substantive interest,
(b) Include reasonable measures of uncertainty about those estimates,
(c) Require little specialized knowledge to understand,
(d) Include no superfluous information: long lists of coefficients no one understands, star gazing, etc.

2. For example: Other things being equal, an additional year of education would increase your annual income by $1,500 on average, plus or minus about $500.

3. Your work should satisfy a reader who hasn't taken this course.
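A statement of that kind can be produced mechanically from an estimate and its uncertainty. This is a minimal sketch, not the course's own procedure: the coefficient and standard error are invented for illustration, and numpy is assumed available.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical estimates from some fitted model: coefficient on years
# of education (in dollars of annual income) and its standard error
beta_hat, se = 1500.0, 250.0

# Simulate estimation uncertainty, then report a plain-language
# summary instead of a coefficient table with stars
draws = rng.normal(beta_hat, se, size=100_000)
lo, hi = np.percentile(draws, [2.5, 97.5])
print(f"About ${draws.mean():,.0f} more per year of education "
      f"(95% interval: ${lo:,.0f} to ${hi:,.0f})")
```

The same simulation logic generalizes to the quantities of interest covered in the next section, where the draws come from the estimated sampling distribution of the model parameters.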
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 24 / 242
![Page 129: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/129.jpg)
Interpretation and Presentation
1. Statistical presentations should
(a) Convey numerically precise estimates of the quantities of substantiveinterest,
(b) Include reasonable measures of uncertainty about those estimates,(c) Require little specialized knowledge to understand.(d) Include no superfluous information, long lists of coefficients no one
understands, star gazing, etc.
2. For example: Other things being equal, an additional year of educationwould increase your annual income by $1,500 on average, plus or minusabout $500.
3. Your work should satisfy a reader who hasn’t taken this course
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 24 / 242
![Page 130: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/130.jpg)
Interpretation and Presentation
1. Statistical presentations should
(a) Convey numerically precise estimates of the quantities of substantiveinterest,
(b) Include reasonable measures of uncertainty about those estimates,(c) Require little specialized knowledge to understand.(d) Include no superfluous information, long lists of coefficients no one
understands, star gazing, etc.
2. For example: Other things being equal, an additional year of educationwould increase your annual income by $1,500 on average, plus or minusabout $500.
3. Your work should satisfy a reader who hasn’t taken this course
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 24 / 242
![Page 131: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/131.jpg)
Interpretation and Presentation
1. Statistical presentations should
(a) Convey numerically precise estimates of the quantities of substantiveinterest,
(b) Include reasonable measures of uncertainty about those estimates,
(c) Require little specialized knowledge to understand.(d) Include no superfluous information, long lists of coefficients no one
understands, star gazing, etc.
2. For example: Other things being equal, an additional year of educationwould increase your annual income by $1,500 on average, plus or minusabout $500.
3. Your work should satisfy a reader who hasn’t taken this course
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 24 / 242
Reading

King, Tomz, and Wittenberg. "Making the Most of Statistical Analyses: Improving Interpretation and Presentation." American Journal of Political Science, Vol. 44, No. 2 (March 2000): 341-355.

Hanmer and Kalkan (2013). "Behind the Curve: Clarifying the Best Approach to Calculating Predicted Probabilities and Marginal Effects from Limited Dependent Variable Models." American Journal of Political Science.

Greenhill, Ward, and Sacks (2011). "The Separation Plot: A New Visual Method for Evaluating the Fit of Binary Models." American Journal of Political Science.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 25 / 242
Quantities of Interest

How to interpret β in binary outcome models?

- In the LPM, βj is the marginal effect: βj = ∂E[Yi | X] / ∂Xij
- In logit, βj is the log odds ratio:

  βj = log { [Pr(Yi = 1 | Xij = 1) / Pr(Yi = 0 | Xij = 1)] / [Pr(Yi = 1 | Xij = 0) / Pr(Yi = 0 | Xij = 0)] }

- In probit, β has no direct substantive interpretation
- In general, it is bad practice to present only a table of coefficients!
- Instead, always try to present your results in terms of an easy-to-interpret quantity

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 26 / 242
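The log-odds-ratio interpretation above can be checked numerically. This is a minimal sketch with made-up coefficients (`b0`, `b1` are hypothetical, not from any model in these slides); for a binary predictor, the difference in log odds between x = 1 and x = 0 recovers the logit coefficient exactly:

```python
import math

def inv_logit(z):
    """logit^{-1}(z) = 1 / (1 + exp(-z))"""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """log odds of a probability p: log(p / (1 - p))"""
    return math.log(p / (1.0 - p))

# Hypothetical logit model with one binary predictor x.
b0, b1 = -0.7, 1.3
p1 = inv_logit(b0 + b1 * 1)   # Pr(Y = 1 | x = 1)
p0 = inv_logit(b0 + b1 * 0)   # Pr(Y = 1 | x = 0)

# The difference in log odds mathematically equals b1.
print(log_odds(p1) - log_odds(p0))
```

This works because log_odds(inv_logit(z)) = z, so the difference of the two linear predictors is just b1; the identity holds for any choice of coefficients.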
Analytic Quantities of Interest

1. Predicted probability when Xi = x:

   Pr(Yi = 1 | Xi = x) = π(x)

2. Average Treatment Effect (ATE):
   - Given the model, εi is i.i.d., which implies Ti is conditionally ignorable given Wi
   - Thus, the ATE can be identified as

     τ = E[Pr(Yi = 1 | Ti = 1, Wi) − Pr(Yi = 1 | Ti = 0, Wi)]
       = E[π(Ti = 1, Wi) − π(Ti = 0, Wi)]

     where the expectation is taken with respect to both ε and W

3. Marginal effects: For a continuous predictor Xij,

   ∂E[Yi | X] / ∂Xij = βj · logit^{-1}(Xi'β) · (1 − logit^{-1}(Xi'β))   (for logit)
   ∂E[Yi | X] / ∂Xij = βj · φ(Xi'β)                                     (for probit)

   Note: these depend on all of Xi, so one must pick a particular value

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 27 / 242
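The marginal-effect formulas for logit and probit can be computed directly. A minimal sketch, using hypothetical coefficients and a hypothetical covariate profile (none of these numbers come from the slides):

```python
import math

def inv_logit(z):
    """logit^{-1}(z) = 1 / (1 + exp(-z))"""
    return 1.0 / (1.0 + math.exp(-z))

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def marginal_effect(beta, x, j, link="logit"):
    """Marginal effect of predictor j at covariate profile x.

    logit:  beta_j * p * (1 - p), where p = logit^{-1}(x'beta)
    probit: beta_j * phi(x'beta)
    """
    xb = sum(b * v for b, v in zip(beta, x))
    if link == "logit":
        p = inv_logit(xb)
        return beta[j] * p * (1.0 - p)
    elif link == "probit":
        return beta[j] * norm_pdf(xb)
    raise ValueError("unknown link: " + link)

# Hypothetical (intercept, slope) and profile x = (1, 2.0), so x'beta = 0.
beta = [-1.0, 0.5]
x = [1.0, 2.0]
print(marginal_effect(beta, x, 1, "logit"))
print(marginal_effect(beta, x, 1, "probit"))
```

Because the effect depends on the entire profile x, common conventions are to evaluate it at the sample means or to average the effect over the observed data (the "observed value" approach discussed by Hanmer and Kalkan).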
Example: Civil Conflict and Political Instability

Fearon & Laitin (2003):

- Yi: civil conflict
- Ti: political instability
- Wi: geography (log % mountainous)

Estimated model:

Pr(Yi = 1 | Ti, Wi) = logit^{-1}(−2.84 + 0.91 Ti + 0.35 Wi)

Predicted probability:

π(Ti = 1, Wi = 3.10) = 0.299

ATE:

τ = (1/n) Σ_{i=1}^{n} {π(1, Wi) − π(0, Wi)} = 0.127

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 28 / 242
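These quantities can be reproduced straight from the reported coefficients. A minimal sketch: the predicted probability uses the slide's actual estimates, but the Wi values used for the ATE average are hypothetical stand-ins, since the slide's τ = 0.127 averages over the Fearon & Laitin sample, which is not shown here:

```python
import math

def inv_logit(z):
    """logit^{-1}(z) = 1 / (1 + exp(-z))"""
    return 1.0 / (1.0 + math.exp(-z))

def pi(t, w):
    """Predicted probability from the slide's estimated logit model."""
    return inv_logit(-2.84 + 0.91 * t + 0.35 * w)

# Predicted probability at Ti = 1, Wi = 3.10 (slide reports 0.299).
print(pi(1, 3.10))

# ATE estimate: average pi(1, Wi) - pi(0, Wi) over the sample of Wi.
# The Wi values below are hypothetical, just to show the computation.
W = [0.5, 1.2, 2.0, 3.1, 4.0]
ate = sum(pi(1, w) - pi(0, w) for w in W) / len(W)
print(ate)
```

Note that the treatment-effect difference pi(1, w) - pi(0, w) varies with w even though the coefficient on Ti is constant, which is exactly why the ATE is computed as an average over the sample rather than at a single point.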
Example: Civil Conflict and Political Instability
Fearon & Laitin (2003):
Yi : Civil conflict
Ti : Political instability
Wi : Geography (log % mountainous)
Estimated model:
Pr(Yi = 1 | Ti ,Wi )
= logit−1 (−2.84 + 0.91Ti + 0.35Wi )
Predicted probability:
π(Ti = 1,Wi = 3.10) = 0.299
ATE:
τ =1
n
n∑i=1
{π(1,Wi )− π(0,Wi )}
= 0.127
●●
●
●
●●●
●●
●
●
●
●●●●●●●
●
●●
●
●●●●●
●
●
●
●●●●●●●
●●
●●●●
●
●●●●●
●●
●●●
●●●
●●●●●●
●●●●●
●
●●●●
●●●
●●●●●
●●●●●
●
●
●●
●●●
●
●●●
●
●●
●
●●
●
●●●●
●●
●
●
●
●●●
●●
●●
●
●
●
●●
●●●
●●●●●●●
●●●
●
●●
●●●
●
●●●●
●●
●●
●● ●●●●
●●
●
●●●
●●●
●●
●●●●
●●
●
●
●●
●●●
●●●●
●●
●
●●
●
●●●●
●
●
●●●●●
●●●●
●
●●●●
●
●
●
●●●
●
●●
●●●●
●
●
●●●
●
●●
●●●●●●
●●●●
●
●●
●
●●
●●
●●●
●●
●●●
●●●
●
●
●
●●
●
●●
●
●
●●●●
●
●●●●●
●
●
●●●●●●
●
●●●●●●●
●●
●
●●●
●●
●●●
●●●●●●●
●●●●●●●
●
●●
●
●●
●●●●
●
●
●●●
●
●●●
●●●●
●●
●
●●
●●
●
●●
●
●
●●●
●
●●
●●
●●●●●●●●●●●●●●
●●
●●
●●
●
●●●
●
●●
●●●●
●●
●
●●
●●●
●●
●●●
●
●
●
●
●●●
●●●
●●●
●
●●
●
●●●
●●●●●
●●●●●●
●
●●●●
●
●
●●●
●●
●●
●●
●●
●●●●●●
●●●●●●
●●●●●●
●
●●●
●
●●
●●●●●●●●●
●●●
●
●
●●
●●●●●
●●●●
●●●●●
●●
●
●
●
●
●●
●●
●
●
●●●
●
●●
●●●
●●●
●
●●
●●
●
●
●●
●●
●
●●●●●●●●
●
●●●
●●●●
●●
●●●
●●
●●●
●●●●
●
●
●
●
●●●●
●
●
●
●
●
●
●●●●
●
●
●
●
●
●
●
●●●
●●
●●●
●●●
●
●
●
●
●●●●●●●●
●●
●●
●●
●●●
●●●
●
●●●
●●
●●●
●
●●●●●●
●
●
●●●
●
●
●●
●
●
●
●●●●●
●
●●●●
●
●●●●
●●●●●
●●
●●●●●
●●●●
●
●
●●●
●●●
●●
●●●
●
●
●
●
●
●
●●●
●
●●●
●●●●
●
●
●●●●
●●●●●●
●●
●●●●●●
●
●●
●●●●●●●●
●
●●
●
●●●●●●
●
●
●●●●
●●●●
●●●●●●
●
●●●●●
●●●●●
●
●●
●●●●
●●●●
●●
●
●●
●●●●●●
●●●●●●●●●●●●
●
●
●●
●●●●●●
●
●●
●
●●
●●
●●
●●
●
●
●●
●
●●●●●●●●●●●●
●●●
●
●●●●
●●●
●●●
●●
●
●
●●
●●
●●●●●
●
●●
●
●●●
●●
●
●
●
●
●●
●●
●
●●
●
●●●●
●
●
●●●
●●
●
●●
●
●●
●
●●
●●●●
●●●●●
●
●●
●●
●
●●
●●
●●
●
●
●
●●●
●●●●●●●
●
●●
●
●●●
●●
●
●●●
●
●●●●●●
●●●●●●●
●
●●
●●●
●●
●
●●
●
●●
●
●
● ●
●
●●●
●
●●●●
●
●
●
●●●●●●●●
●●
●
●
●●
●●●●
●
●●
●●●●
●●●
●
●
●
●
●
●●
●●
●
●●
●
●
●●
●●●●
●●●
●●
●
●●●●●
●
●●●
●
●
●
●●●
●●●●●
●
●●●
●●●
●●
●●
●●●
●
●
●
●●
●
●●●
●●●●●●●
●
●
●
●
●
●●
●
●●
●●●●●
●●●
●
●●
●●●●
●●
●
●
●●●
●
●●●●●
●
●
●●●
●●●●
●●
●
●●
●
●●
●●
●●●●●●●●●
●
●
●●
●●●●
●
●
●
●
●●
●●●●
●
●
●
●
●●●●●●
●
●●
●
●●●
●●●●
●
●●
●
●
●
●●●
●●
●
●
●
●●
●●●●●●●
●●●
●●●
●
●●●
●
●●
●●●
●
●
●
●●●
●
●
●●●●●
●
●
●●
●
●●●
●●●●●
●
●●
●
●
●
●●●
●●●
●●
●●●
●
●
●●●●
●
●●●●●●
●●●
●●
●
●●●
●
●●●●●
●
●
●●●●
●
●
●●
●
●●●●
●●
●●●●●
●
●
●
●
●●●
●●●●●
●
●
●
●●
●●
●●
●●●
●●
●●
●●
●●●●
●
●
●●●
●●
●
●
●
●●●
●
●
●●●●●
●●●
●●
●●
●●
●●●
●
●
●
●
●
●●●●●●●●
●
●
●●
●●●
●
●
●●●●●●●●●
●●●●●
●●●●●
●●●
●●
●●
●●●●
●
●
●●
●
●●
●●●●
●
●●
●●●
●
●●●
●
●●●
●
●
●
●
●●●
●
●●●
●●
●
●
●
●●●
●●
●●●
●●●●●
●●●●●
●●●●●
●
●
●●●●
●●
●
●●
●●●●
●
●●●●
●
●
●
●●
●
●●●●●●●●●●●
●●●●●●
●●●
●
●●
●●●●●●
●●●●
●
●
●
●●●
●
●
●
●●
●
●●
●
●●●●
●●●●
●●●●●●
●
●●●●
●●●●
●
●●
●
●
●●●●●
●
●
●●●●
●
●
●
●●●●
●●●●
●
●
●●●●
●●
●
●
●
●
●●
●●
●
●●●
●
●●●●●●●
●●●●●●●●●
●●
●●●●●
●
●●
●●●●●
●
●●
●●●
●
●
●●
●●●
●
●●
●●
●●●●●●●●
●●
●
●
●
●●●●
●
●●●
●●●
●
●●●●●
●●
●●●
●●●
●●●●●●●
●
●
●●
●●●
●●
●●●●
●●
●
●
●
●
●●●
●●●●
●●●●●
●●●
●●
●
●
●●●●●
●●●
●
●
●●●●●●
●
●
●●●
●
●●
●
●
●●●
●●●● ●
●
●●●●●
●
●●●●
●
●●
●●
●●●●●
●
●●
●
●
●●
●
●
●
●
●●●
●
●●
●
●
●
●●●
●
●●●
●●
●●●●
●
●
●●
●
●
●
●
●●
●
●
●●●●●
●●●●
●●
●●●●
●●●
●
●
●●
●●●
●
●●●●●
●
●●
●
● ●●
●●
●●
●
●●
●●●●●
●●●●●●
●
●●
●●
●●
●
●●●●●●
●
●●
●●●
●
●
●●
●
●
●
●
●
●●●●
●
●
●●●
●
●●
●
●●●●●
●●
●
●
●●
●●
●●●
●
●●●
●●●●
●
●●
●
●●●
●
●
●●●●●●
●●
●
●
●
●●●●●●
●
●
●
●
●●●●●●●
●●●●
●
●●●●●●
●●●●
●●
●●
●
●
●
●●●●●
●●
●●
●●●●●●●
●●●
●
●
●
●
●
●●
●●●●●●●●●●●
●●●●
●●●
●●●
●●
●
●●
●●●●
●●
●●
●
●●●
●●
●
●
●●
●●●
●●
●●
●●
●●
●
●
●
●
●●●
●
●●●
●
●●●●●
●
●
●
●●●
●
●
●●
●●
●●●●
●
●
●
●●●
●●●●
●●●
●●
●
●
●●●●●●
●
●●
●
●●●●●
●●
●
●
●●●●●
●
●
●
●●●●
●●●●
●
●●
●
●
●
●
●
●●●
●
●●●●
●●●
●●●
●
●
●●●
●●
●
●●●
●
●
●
●
●●
●●
●
●
●
●●●●●●●
●
●
●
●
●
●●
●
●●
●●
●
●●
●●●
●●
●●●
●●●●●●
●●●
●●●●●●●
●
●
●
●●●
●●●
●●●●●●●
●●●
●●
●
●●●
●
●
●●
●
●●
●●
●
●●●●
●●●●●●
●●
●●
●
●
●●
●
●
●●●
●
●●
●
●
●
●●●●●●
●●●●●
●
●
●
●
●
●
●●
●●●
●
●●●●●●●
●●●●
●
●
●
●●●●
●●
●
●●
●
●●●
●●
●
●
●
●●
●●
●●
● ●●
●
●●●●●
● ●
●●●
●
●
●●●
●
●●●
●●
●●●
●●
●●
●
●●●●●●
●●
●●●
●
●●●●●
●●●
●●●
●
●●●●
●
●
●●●
●
●
●
●
●●●
●●●
●●●●
●●● ●●
●●●●●●●
●
●●
●
●
●
●
●●
●●
●●
●●
●●
●
●
●
●
●
●●●●●●
●
●●●●
●
●●
●●●
●
●●
●●
●●
●●●●●
●
●●●●●●
●
●●
●●
●
●
●
●
●●●
●
●
●
●●●●●●●●●●
●●
●●●●●●
●●●
●●
●
●
●●
●●●●
●
●
●●
●●●●●●
●●
●●●●●●●●
●●●●●
●
●
●
●●●●●●●●●●
●
●●
●●●
●
●
●●●
●
●
●
●
●
●●●●●●●
●
●
●
●●
●
●
●●●
●
●●
●●
●●
●
●●
●
●
●
●
●
●●
●
●●
●
●●●●
●
●
●●
●●●
●
●●●●
●
●
●●●
●●
●●●●●
●●
●
●●
●
●
●●●
●
●
●●●●●
●
●●●●
●●●●●
●
●●●●●●●
●
●●
●●
●
●
●
●●
●
●●●
●●
●
●
●
●●●●●●●●
●●●●
●
●●●
●●
●
●
●
●●
●
●●
●
●
●●●●
●
●●●●
●
●
●
●
●●●●●
●●
●●●
●●
●●●
●
●●
●●
●●
●
●
●●●
●
●●●●
●●
●
●●●●●●●●●●
●●
●
●
●
●●●
●
●●●● ●
●●●●●●●●
●
●●●●●●●●
●●
●●
●
●●
●
●●●
●●●
●
●
●
●
●
●●●●●
●
●
●
●●●●
●●
●●
●●●
●●●
●
●
●●
●●
●
●●●●
●
●
●●
●●
●●
●●●
●●●●●●
●●●●●●●●●●
●
●
●●●●●●●●
●●
●●●●
●●●●●
●●●
●●●
●●
●●
●●
●●
●●
●
●
●
●
●
●
●●●●●●●
●
●
●
●●
●●●●
●●●●
●
●●●●
●●
●●
●●●●●●
●
●●●●●●●●●
●●●
●●●
●●●
●
●●●●●
●●●
●
●●●
●
●●●
●●●
●●●
●
●●●
●
●
●
●●●●
●
●●
●
●
●
●●●●
●
●●
●
●
●●
●
●
●●
●●
●●
●
●●
●●
●
●●●●
●
●
●●
●
●
●●●●●●
●
●●
●
●●●●
●
●●●●
●
●●
●
●
●
●
●●
●●●●
●
●●●●●●●●●●
●●
●
●
●●●●
●
●
●●
●
●
●
●
●●●
●●●
●●
●
●●
●●
●●●●●●●
●
●●●●●●
●●●
●●
●
●●
●●●
●●●●
●●
●
●●●●
●
●●
●
●
●●●●●●
●●●●
●●●●●
●●
●●●●●
●●
●●●
●●●
●●●
●
●●●●
●●
●●
●●●●●
●
●●●
●
●
●
●
●
●
●●
●●
●●●
●
●
●
●●
●●●●
●●●●
●
●●●
●
●●●
●
●●●
●●●
●
●●●●
●●
●
●
●
●● ●
●
●●
●
●
●●
●
●●●●
●●●●
●
●●
●●
●●●
●
●
●●●●
●●
●●●
●●
●●
●●
●
●●●
●●●
●
●
●●
●●●●
●●●
●●
●●
●●●
●●●●
●
●●●●
●●
●●
●
●
●●
●
●
●
●
●
●●●●●
●
●●
●●
●●●●●●
●
●●
●●
●●●●●●
●●
●●
●
●●
●
●●●●●●●●●●●●
●●
●●
●●●
●
●
●
●●●
●
●●
●●●●●
●●●●●●●●
●
●
●
●●●●●
●
●
●
●
●●
●
●●
●●
●●●
●
●●
●
●●●
●
●●
●
●
●
●●●
●
●
●
●
●●
●●●
●●●
●
●●
●●●●●●●●
●
●
●●
●●●
●
●
●
●●●
●●●●
●
●●
●●
●●
●
●
●
●●●●
●
●
●
●●●
●
●●●
●
●●●
●●●●
●●●
●●
●●
●
●●
●●
●
●
●
●
●●
●●●●●●
●
●●●●
●
●●●●
●●
●
●
●
●●
●
●●●●
●
●●●●
●●
●●
●●●●●
●
●
●
●
●
●
●
●
●●●●●●
●
●
●●
●●●
●●
●●
●
●●●●●●
●
●●
●
●
●●
●●
●
●
●●●●●
●
●
●●●●
●
●
●
●●
●
●●●●●●●●
●
●●●●
●●●
●
●●●●
●
●●
●
●●●
●
●●●
●●
●
●●●●●●
●
●●●
●●
●
●●●●
●
●
●
●
●●●●●●
●
●●●
●●●●●
●
●
●●
●
●●
●●●
●●
●
●
●●●●●●
●
●
●
●
●●
●●●●●
●
●●●
●●
●
●●
●
●
●
●●●●
●
●●
●
●●
●●
[Figure: predicted probability π̂(Ti, Wi) plotted against Wi (horizontal axis 0 to 4, vertical axis 0.0 to 1.0), with one fitted logit curve for Ti = 1 and one for Ti = 0 and the observed country-years as points; the Philippines in 1982 is highlighted at π̂i = 0.299.]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 28 / 242
Example: Civil Conflict and Political Instability
Fearon & Laitin (2003):
Yi : Civil conflict
Ti : Political instability
Wi : Geography (log % mountainous)
Estimated model:
Pr(Yi = 1 | Ti, Wi) = logit⁻¹(−2.84 + 0.91 Ti + 0.35 Wi)
Predicted probability:
π̂(Ti = 1, Wi = 3.10) = 0.299
ATE:
τ̂ = (1/n) Σ_{i=1}^{n} {π̂(1, Wi) − π̂(0, Wi)} = 0.127
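The quantities on this slide can be computed directly from the fitted coefficients. A minimal sketch, assuming hypothetical values of Wi drawn uniformly on [0, 4] rather than the actual Fearon–Laitin sample, so the resulting τ̂ only roughly tracks the slide's 0.127:

```python
import numpy as np

# Coefficients from the slide's fitted logit:
# Pr(Y=1 | T, W) = logit^{-1}(-2.84 + 0.91*T + 0.35*W)
b0, bT, bW = -2.84, 0.91, 0.35

def pi(t, w):
    """Predicted probability of civil conflict at instability t, terrain w."""
    eta = b0 + bT * t + bW * w
    return 1.0 / (1.0 + np.exp(-eta))

# Philippines 1982 (T = 1, W = 3.10): about 0.30
# (the slide's 0.299 reflects unrounded coefficient estimates)
p_phil = float(pi(1, 3.10))

# ATE: average pi(1, W_i) - pi(0, W_i) over the sample.
# W here is hypothetical illustrative data, not the actual F&L covariate.
rng = np.random.default_rng(0)
W = rng.uniform(0, 4, size=1000)
tau = float(np.mean(pi(1, W) - pi(0, W)))
```

Note that the ATE is averaged over the empirical distribution of Wi, so it depends on the sample, not just the coefficients.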
[Figure: the same plot, now with the sample averages E[Ŷi(1)] and E[Ŷi(0)] marked as horizontal levels; their difference gives τ̂ = 0.127.]
[Figure: combined plot showing both fitted curves (Ti = 1, Ti = 0), the highlighted Philippines 1982 point at π̂i = 0.299, and the E[Ŷi(1)] and E[Ŷi(0)] levels with τ̂ = 0.127.]
Variance Estimation
The variance estimates can be used to calculate confidence intervals for logit/probit β: [β̂MLE − z_{α/2} · s.e., β̂MLE + z_{α/2} · s.e.]

But β itself is (typically) not of direct substantive interest.

Predicted probability (logit): π̂(x) = exp(x⊤β̂MLE) / (1 + exp(x⊤β̂MLE))

ATE (logit): τ̂ = (1/n) Σ_{i=1}^{n} [ exp(β̂T + Wi⊤β̂W) / (1 + exp(β̂T + Wi⊤β̂W)) − exp(Wi⊤β̂W) / (1 + exp(Wi⊤β̂W)) ]

How do we compute standard errors for quantities like π̂ and τ̂?

Three approaches:
1. Analytical approximation: the Delta method
2. Simulating from sampling distributions
3. Resampling: the bootstrap (parametric or nonparametric)
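The Wald interval for a single coefficient is a one-liner. A minimal sketch; the coefficient estimate and standard error below are hypothetical illustrations, not values from the Fearon–Laitin fit:

```python
from statistics import NormalDist

def wald_ci(beta_hat, se, alpha=0.05):
    """Wald CI [beta_hat - z*se, beta_hat + z*se], z = Phi^{-1}(1 - alpha/2)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return beta_hat - z * se, beta_hat + z * se

# Hypothetical logit coefficient and standard error
lo, hi = wald_ci(0.91, 0.30)  # roughly (0.322, 1.498)
```

The harder problem, taken up next, is getting intervals for nonlinear functions of β̂ such as π̂ and τ̂.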
Monte Carlo Approximation
For MLE, we know that θ̂ is approximately distributed N(θ, V(θ))

We can simulate this distribution by sampling from N(θ̂, V̂(θ̂))

For each draw θ∗, compute f(θ∗) by plugging in

Variance in the distribution of θ∗ should transfer to f(θ∗)

This leads to the algorithm of King, Tomz and Wittenberg (2000):
1. Draw R copies θr from N(θ̂, V̂(θ̂))
2. For each θr, compute f(θr)
3a. To obtain the s.e. of f(θ̂), use the sample standard deviation of {f(θ1), ..., f(θR)}
3b. To compute a 95% CI, use the 2.5/97.5 percentiles of {f(θ1), ..., f(θR)} as the lower/upper bounds
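The algorithm above can be sketched for a logit predicted probability. Here θ̂ and V̂(θ̂) are hypothetical stand-ins for the MLE and its estimated variance matrix, not estimates from real data:

```python
import numpy as np

# King, Tomz and Wittenberg (2000) simulation for a logit predicted probability
theta_hat = np.array([-2.84, 0.91, 0.35])   # pretend MLE (intercept, beta_T, beta_W)
V_hat = np.diag([0.05, 0.02, 0.01])         # pretend Var-hat(theta_hat)

rng = np.random.default_rng(1)
R = 1000
draws = rng.multivariate_normal(theta_hat, V_hat, size=R)   # step 1

x = np.array([1.0, 1.0, 3.10])              # covariate profile: T = 1, W = 3.10
f = 1.0 / (1.0 + np.exp(-(draws @ x)))      # step 2: f(theta_r) = pi(theta_r)

se = float(f.std(ddof=1))                   # step 3a: simulation s.e. of pi-hat
ci = np.percentile(f, [2.5, 97.5])          # step 3b: 95% percentile interval
```

Because f is applied to each draw before summarizing, the interval automatically respects the [0, 1] range of a probability, which a delta-method interval need not.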
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 30 / 242
Example: Civil Conflict and Political InstabilityConfidence Intervals for π(Ti = 1,Wi = 3.10):
[Figure: comparison of 95% confidence intervals for π̂ (all roughly in the 0.25–0.35 range) under the Delta method, Clarify, the nonparametric bootstrap, and the parametric bootstrap]

                      Delta Method   Clarify   Nonpara. B.   Para. B.
Normal Approximation  Yes            Yes       No            No
Simulations           No             Yes       Yes           Yes
Derivation-Free       No             Yes       Yes           No
Computation Speed     Instant        Fast      Very Slow     Slow
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 31 / 242
Simulation from any model must reflect all uncertainty
Yi ∼ f (θi , α) stochastic
θi = g(xi , β) systematic
Must simulate anything with uncertainty:
1. Estimation uncertainty: Lack of knowledge of β and α. (Due to inadequacies in your research design: n is not infinite.)

2. Fundamental uncertainty: Represented by the stochastic component. (Due to the nature of nature!)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 32 / 242
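The distinction matters in practice: expected values (like π̂) carry only estimation uncertainty, while predicted values of Y itself add fundamental uncertainty by also drawing from the stochastic component. A sketch for a logit model, again with hypothetical estimates rather than output from any model in these slides:

```python
import numpy as np

rng = np.random.default_rng(1)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical logit MLE output for illustration
beta_hat = np.array([-1.0, 0.5])
vcov = np.array([[0.04, 0.01],
                 [0.01, 0.02]])
x = np.array([1.0, 2.0])

R = 50_000

# Estimation uncertainty only: simulate the parameters,
# then compute the systematic component
betas = rng.multivariate_normal(beta_hat, vcov, size=R)
pis = logistic(betas @ x)      # expected values E[Y | x]

# Estimation + fundamental uncertainty: also draw Y
# from the stochastic component (here Bernoulli)
ys = rng.binomial(1, pis)      # predicted values of Y itself

# Predicted values are necessarily more dispersed than expected values
print(pis.std(), ys.std())
```

The spread of ys reflects both sources of uncertainty, which is why predicted values have much wider intervals than expected values.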
Strategy for Simulating from Generalized Linear Models
All of the models we've talked about so far (and for the next few weeks) belong to the class of generalized linear models (GLM).
Three elements of a GLM
A distribution for Y (stochastic component)
A linear predictor Xβ (systematic component)
A link function that relates the linear predictor to the mean of the distribution. (systematic component)

(Note: the language is slightly different for the latent variable with observation mechanism, but the result is the same)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 33 / 242
Complete Recipe
1. Specify a distribution for Y

2. Specify a linear predictor

3. Specify a link function

4. Estimate Parameters via Maximum Likelihood

5. Simulate or Calculate Quantities of Interest
Let’s do this together for a particular example.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 34 / 242
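The whole recipe can be sketched end to end on simulated data, where the true β is known so we can see the recipe recover it. This is Python rather than the R used in the course, and step 4 is a hand-rolled Newton-Raphson rather than a call to glm(); the data-generating numbers are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Steps 1-3: Bernoulli Y, linear predictor X beta, logit link
# (data simulated from a known beta so the recipe can be checked)
n = 5000
beta_true = np.array([-1.0, 0.5])
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = rng.binomial(1, logistic(X @ beta_true))

# Step 4: maximum likelihood via Newton-Raphson
beta = np.zeros(2)
for _ in range(25):
    p = logistic(X @ beta)
    grad = X.T @ (y - p)                          # score vector
    hess = -(X * (p * (1 - p))[:, None]).T @ X    # Hessian of the log-likelihood
    beta = beta - np.linalg.solve(hess, grad)     # Newton step
vcov = np.linalg.inv(-hess)                       # inverse information

# Step 5: quantity of interest -- predicted probability at x = (1, 1),
# with a simulation-based 95% CI
x = np.array([1.0, 1.0])
draws = rng.multivariate_normal(beta, vcov, size=10_000)
pi_hat = logistic(x @ beta)
ci = np.percentile(logistic(draws @ x), [2.5, 97.5])
print(beta, pi_hat, ci)
```

With n = 5000 the estimates should sit close to beta_true = (−1, 0.5), and the CI for π̂ at x = (1, 1) should bracket logistic(−0.5) ≈ 0.38.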
The Data: Political Assassinations
Taken from Jones and Olken (2009), "Hit or Miss? The Effect of Assassinations on Institutions and War", American Economic Journal: Macroeconomics.

Dataframe is called as and contains information on assassination attempts, success or failure, and various covariates.
> as[as$country == "United States" & as$year == "1975",]
country year leadername age tenure attempt
United States 1975 Ford 62 510 TRUE
survived result dem_score civil_war war
1 24 10 0 0
pop energy solo weapon
215973 2208506 1 gun
Observations are country-year-leaders, so some country-years have multiple observations.

Let's try to predict assassination attempts with some of our covariates.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 35 / 242
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded: Normal
Binary: Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
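To make the list concrete, a quick sketch of what each candidate stochastic component generates (the parameter values are illustrative, not estimates from the assassinations data):

```python
import numpy as np

rng = np.random.default_rng(3)

# What each stochastic component produces:
binary = rng.binomial(1, 0.1, size=1000)       # Bernoulli: attempt / no attempt
counts = rng.poisson(2.0, size=1000)           # Poisson: number of events
durations = rng.exponential(5.0, size=1000)    # Exponential: time until an event

print(binary.mean(), counts.mean(), durations.mean())
```

Since attempt in the assassinations data is TRUE/FALSE for each leader-year, the Bernoulli is the natural choice here.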
![Page 198: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/198.jpg)
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded: Normal
Binary: Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
![Page 199: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/199.jpg)
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded: Normal
Binary: Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
![Page 200: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/200.jpg)
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded:
Normal
Binary: Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
![Page 201: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/201.jpg)
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded: Normal
Binary: Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
![Page 202: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/202.jpg)
1. Specify a distribution for Y
Assume our data was generated from some distribution.
Examples:
Continuous and Unbounded: Normal
Binary:
Bernoulli
Event Count: Poisson
Duration: Exponential
Ordered Categories: Normal with observation mechanism
Unordered Categories: Multinomial
What fits our application?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 36 / 242
2. Specify a linear predictor
We are interested in allowing some parameter of the distribution θ to vary as a (linear) function of covariates. So we specify a linear predictor.
Xβ = β0 + x1β1 + x2β2 + · · ·+ xkβk
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 37 / 242
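The linear predictor above can be sketched in numpy as a matrix-vector product, with an explicit column of ones carrying the intercept β0. The numbers are hypothetical.

```python
import numpy as np

n = 4
X = np.column_stack([
    np.ones(n),                       # intercept column for beta_0
    np.array([1.0, 2.0, 3.0, 4.0]),   # x_1
    np.array([0.0, 1.0, 0.0, 1.0]),   # x_2
])
beta = np.array([0.5, -1.0, 2.0])

eta = X @ beta   # eta_i = beta_0 + x_i1 * beta_1 + x_i2 * beta_2
print(eta)       # -> [-0.5  0.5 -2.5 -1.5]
```

Stacking covariates into X means the whole linear predictor for all n observations is one matrix multiplication, which is how the Xβ notation on the slide is computed in practice.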
What’s in our model?
We wish to predict assassination attempts for country-year-leaders.
tenure: number of days in office
age: age of leader, in years
dem score: polity score, -10 to 10
civil war: is there currently a civil war?
war: is country in an international conflict?
pop: the country’s population, in thousands
energy: energy usage
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 38 / 242
3. Specify a link function
The link function relates the linear predictor to some parameter θ of the distribution for Y (usually the mean).
Let g(·) be the link function and let E(Y) = θ be the mean of the distribution for Y.
g(θ) = Xβ
θ = g−1(Xβ)
Note that we usually use the inverse link function g−1(Xβ) rather than the link function.
Together with the linear predictor, this forms the systematic component that we've been talking about all along.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 39 / 242
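A small sketch (not from the slides) of the g / g−1 relationship, using the logit link as one concrete choice: the inverse link maps the unbounded linear predictor into the parameter's natural range, and applying the link recovers Xβ.

```python
import numpy as np

def logit(p):
    """Link g: mean scale (0, 1) -> linear predictor scale (-inf, inf)."""
    return np.log(p / (1 - p))

def inv_logit(eta):
    """Inverse link g^{-1}: linear predictor scale -> mean scale (0, 1)."""
    return 1 / (1 + np.exp(-eta))

eta = np.array([-2.0, 0.0, 2.0])
theta = inv_logit(eta)                  # theta = g^{-1}(X beta), in (0, 1)
print(np.allclose(logit(theta), eta))   # g(g^{-1}(eta)) == eta
```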
Example Link Functions
Identity:
Link: µ = Xβ
Inverse:
Link: λ−1 = Xβ
Inverse Link: λ = (Xβ)−1
Logit:
Link: ln(π / (1 − π)) = Xβ
Inverse Link: π = 1 / (1 + e−Xβ)
Probit:
Link: Φ−1(π) = Xβ
Inverse Link: π = Φ(Xβ)
Log:
Link: ln(λ) = Xβ
Inverse Link: λ = exp(Xβ)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 40 / 242
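The inverse links from this slide can be sketched as plain functions of the same linear predictor values, to show how each one maps the unbounded Xβ into its parameter's natural range (this is an illustrative sketch, assuming scipy is available for the normal CDF).

```python
import numpy as np
from scipy.stats import norm

xb = np.array([-1.5, 0.5, 1.5])   # hypothetical linear predictor values

mu = xb                                 # identity: mu = Xb, unbounded
lam_inv = 1.0 / xb                      # inverse: lambda = (Xb)^{-1}
pi_logit = 1.0 / (1.0 + np.exp(-xb))    # logit inverse link: pi in (0, 1)
pi_probit = norm.cdf(xb)                # probit inverse link: pi in (0, 1)
lam_log = np.exp(xb)                    # log inverse link: lambda > 0

print(pi_logit.round(3), pi_probit.round(3))
```

Applying the matching link to each result recovers Xβ; for example, ln(π/(1−π)) on `pi_logit` and ln(λ) on `lam_log` both return the original `xb`.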
Logit or Probit?
“The question of which distribution to use is a natural one... There are practical reasons for favoring one or the other in some cases for mathematical convenience, but it is difficult to justify the choice of one distribution or another on theoretical grounds ...[A]s a general proposition, the question is unresolved. In most applications, the choice between these two seems not to make much difference.” -Econometric Analysis, Greene, p. 774.
Let’s do probit. Why? Mostly to avoid giving away the problem set.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 41 / 242
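Greene's point can be checked numerically (a sketch, not from the slides): over a wide range of Xβ, the probit curve and a logistic curve with a rescaled index are nearly indistinguishable. The scaling factor 1.6 here is a common rule of thumb for matching the two CDFs' slopes at zero.

```python
import numpy as np
from scipy.stats import norm

xb = np.linspace(-4, 4, 801)
probit = norm.cdf(xb)                  # standard normal CDF
logit = 1 / (1 + np.exp(-1.6 * xb))    # logistic CDF with rescaled index

# The two curves differ by less than about 0.02 everywhere on this range,
# which is why the choice rarely matters much in applications.
print(np.max(np.abs(probit - logit)))
```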
4. Estimate Parameters via ML
a. Write down the likelihood
b. Estimate all the parameters by maximizing the likelihood.
- In this case it would be the coefficients β
- In the regression case it would be θ = {β, γ} where γ is a reparametrization of the variance.
c. Obtain an estimate of the variance by inverting the negative Hessian
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 42 / 242
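Steps a-c can be sketched for the simplest possible case (a sketch, not from the slides): a Bernoulli likelihood with a single parameter π, maximized numerically, with the variance estimate obtained by inverting the negative Hessian, here just a scalar second derivative computed by finite differences.

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([1, 0, 0, 1, 1, 0, 1, 1])   # made-up binary data

# (a) the (negative) log-likelihood of the Bernoulli model
def neg_loglik(p):
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# (b) maximize the likelihood (minimize its negative)
res = minimize_scalar(neg_loglik, bounds=(1e-6, 1 - 1e-6), method="bounded")
p_hat = res.x   # MLE; for Bernoulli data this is the sample mean

# (c) invert the negative Hessian: central second difference of neg_loglik
h = 1e-5
hess = (neg_loglik(p_hat + h) - 2 * neg_loglik(p_hat) + neg_loglik(p_hat - h)) / h**2
var_hat = 1.0 / hess

print(p_hat, var_hat)
```

For this model the answers are known analytically: p̂ = ȳ and Var(p̂) ≈ p̂(1 − p̂)/n, so the numerical recipe can be checked against closed forms before trusting it on models without them.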
Step 4a: Write Down the Likelihood
The model:
1. Yi ∼ fbern(yi |πi ).
2. πi = Φ(Xiβ) where Φ is the CDF of the standard normal distribution.
3. Yi and Yj are independent for all i ≠ j.
Like all CDFs, Φ has range 0 to 1, so it bounds our πi to the correct space:
Φ(z) = ∫_{−∞}^{z} (1/√(2π)) exp(−u²/2) du.
Step 4a: Write Down the Likelihood
We can then derive the log-likelihood for β:
L(β|y) ∝ ∏_{i=1}^{n} fbern(yi|πi)

       = ∏_{i=1}^{n} (πi)^{yi} (1 − πi)^{(1−yi)}

Therefore:

ln L(β|y) ∝ ∑_{i=1}^{n} yi ln(πi) + (1 − yi) ln(1 − πi)

          = ∑_{i=1}^{n} yi ln(Φ(Xiβ)) + (1 − yi) ln(1 − Φ(Xiβ))
Step 4b: Maximize the Likelihood
First implement a function of the likelihood:
ll.probit <- function(beta, y, X){
  phi <- pnorm(X %*% beta, log = TRUE)
  opp.phi <- pnorm(X %*% beta, log = TRUE, lower.tail = FALSE)
  logl <- sum(y * phi + (1 - y) * opp.phi)
  return(logl)
}

Notes:

1. the standard normal CDF is evaluated with pnorm. R's pre-programmed log of the CDF has greater range than log(pnorm(Z)) (try Z = -50).
2. lower.tail = FALSE gives Pr(Z ≥ z).
3. in practice, also check that an intercept column has been added to X.
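To see why the log = TRUE route matters numerically, here is a quick Python illustration of the same point (scipy's logcdf plays the role of R's pnorm(..., log = TRUE)):

```python
import numpy as np
from scipy.stats import norm

# Naive route: take the CDF first, then log. Phi(-50) underflows to
# exactly 0 in double precision, so the log returns -inf.
naive = np.log(norm.cdf(-50))

# Stable route: compute the log-CDF directly, which stays finite
# far into the tail (about -1254.8 here).
stable = norm.logcdf(-50)
```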
Step 4b: Maximize the Likelihood
y <- as$attempt
X <- as[, c("tenure", "age", "dem_score", "civil_war",
            "war", "pop", "energy")]
yX <- na.omit(cbind(y, X))
y <- yX[, 1]; X <- cbind(1, yX[, -1])

opt <- optim(par = rep(0, ncol(X)), fn = ll.probit, y = y, X = X,
             method = "BFGS", control = list(fnscale = -1, maxit = 1000),
             hessian = TRUE)

opt$par
[1] -1.836166916  0.029141779 -0.005751652 -0.012433873  0.066287021  0.317451778  0.040863289
[8]  0.026859441
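As a hedged illustration of the same maximization, here is a Python sketch on simulated data (the sample size, true coefficients, and seed are invented). scipy minimizes, so we hand it the negative log-likelihood instead of setting fnscale = -1:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + 1 covariate
beta_true = np.array([-0.5, 1.0])
y = rng.binomial(1, norm.cdf(X @ beta_true))           # simulated probit outcomes

def neg_loglik(beta):
    z = X @ beta
    return -np.sum(y * norm.logcdf(z) + (1 - y) * norm.logsf(z))

# BFGS from a vector of zeros, as in the slides' optim call
opt = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
```

With a couple thousand observations, opt.x should land close to beta_true.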
Step 4c: Estimate the Variance-Covariance Matrix
vcov <- solve(-opt$hessian)
Now we can draw from the approximate sampling distribution of β.
MASS::mvrnorm(n=1, mu=opt$par, Sigma=vcov)
[1] -1.819984146 -0.001830225 -0.005933452 -0.012464456 0.122449059 0.380434336 0.072157828
[8] 0.008879418
This is stochastic so we do it again and get a different answer:
MASS::mvrnorm(n=1, mu=opt$par, Sigma=vcov)
[1] -1.792636772 0.081477117 -0.006457063 -0.013436530 0.019081307 0.255634394 0.036075468
[8] 0.073840550
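Step 4c can be sketched in miniature. This Python example (a toy one-parameter Bernoulli likelihood, not the slides' probit) computes the Hessian by finite differences at the MLE, inverts its negative, and recovers the familiar analytic variance p(1 − p)/n:

```python
import numpy as np

# Toy data: k successes out of n Bernoulli trials
n, k = 100, 30

def loglik(p):
    return k * np.log(p) + (n - k) * np.log(1 - p)

p_hat = k / n  # the MLE

# Second derivative by central finite differences at the MLE
h = 1e-5
hess = (loglik(p_hat + h) - 2 * loglik(p_hat) + loglik(p_hat - h)) / h**2

var_hat = 1.0 / (-hess)                 # invert the negative Hessian (scalar case)
var_analytic = p_hat * (1 - p_hat) / n  # the familiar p(1-p)/n
```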
5. Quantities of Interest
What not to do...
                est      SE
Intercept   -1.8362  1.4012
tenure       0.0291  0.2937
age         -0.0058  0.0251
dem score   -0.0124  0.0445
civil war    0.0663  0.8918
war          0.3175  1.0141
pop          0.0409  0.2368
energy       0.0269  0.2432
ses <- sqrt(diag(solve(-opt$hessian)))
table.dat <- cbind(opt$par, ses)
rownames(table.dat) <- colnames(X)
xtable::xtable(table.dat, digits = 4)
5. Quantities of Interest
1 Simulate parameters from multivariate normal.
2 Run Xβ through the inverse link function to get the original parameter (typically the distribution mean).
3 Draw from distribution of Y for predicted values.
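A minimal Python sketch of these three steps, with invented inputs (beta_hat, vcov, and the scenario vector are all hypothetical stand-ins for estimates from step 4):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# hypothetical probit estimates: intercept plus one covariate
beta_hat = np.array([-1.0, 0.5])
vcov = np.array([[0.04, 0.00],
                 [0.00, 0.01]])
x_scenario = np.array([1.0, 2.0])   # covariate profile of interest

# 1. simulate parameters from the multivariate normal
beta_draws = rng.multivariate_normal(beta_hat, vcov, size=5000)

# 2. run X*beta through the inverse link (here the normal CDF) to get pi
pi_draws = norm.cdf(beta_draws @ x_scenario)

# 3. draw from the distribution of Y for predicted values
y_draws = rng.binomial(1, pi_draws)
```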
5. Quantities of Interest
General considerations:
a. Incorporating estimation uncertainty.
b. Incorporating fundamental uncertainty when making predictions.
c. Establishing appropriate baseline values for the QOI, and considering plausible changes in those values.
Expected Values
For this model we will be interested in estimating the predicted probability of an assassination attempt at particular covariate values. In general, E[y|X].

Let's consider a potentially high-risk scenario (we'll call it "highrisk", or XHR); we can then manipulate the risk factors:

Var.        Value
tenure      -0.30
age         54.00
dem score   -3.00
civil war    0.00
war          0.00
pop         -0.18
energy      -0.23
Expected Values
What’s the estimated probability of an assassination at XHR?
Draw β
beta.draws <- MASS::mvrnorm(10000, mu = opt$par, Sigma = vcov)
dim(beta.draws)
[1] 10000 8
Now we simulate the outcome (warning: inefficient code!)
nsims <- 10000
nsims2 <- 1000   # number of outcome draws per parameter draw
p.ests <- vector(length = nrow(beta.draws))
for(i in 1:nsims){
  p.ass.att <- pnorm(highrisk %*% beta.draws[i, ])
  outcomes <- rbinom(nsims2, 1, p.ass.att)
  p.ests[i] <- mean(outcomes)
}
> mean(p.ests)
[1] 0.0166266
> quantile(p.ests, .025); quantile(p.ests, .975)
2.5% 97.5%
0.0134 0.0201
Expected Values

What are the steps that I just took?

1. simulate from the estimated sampling distribution of β to incorporate estimation uncertainty.
2. start our for-loop, which will do steps 3-5 each time.
3. combine one β draw with XHR as XHRβ, then plug into Φ(·) to get the probability of an attempt for that β draw.
4. draw a bunch of outcomes from the Bernoulli(Φ(XHRβ)).
5. average over those draws to get one simulated E[y|XHR].
6. return to step 3.
[Figure: Density of Expected Values at High Risk (N = 10000, bandwidth = 0.000275)]
Expected Values: A Shortcut
There is a shorter way to come up with the same answer, but it requires some care in its application.

beta.draws <- MASS::mvrnorm(10000, mu = opt$par,
                            Sigma = solve(-opt$hessian))
p.ests2 <- pnorm(highrisk %*% t(beta.draws))

> mean(p.ests2)
[1] 0.01659705
> quantile(p.ests2, .025); quantile(p.ests2, .975)
      2.5%      97.5%
0.01395935 0.01955867

This shortcut works because E[y|XHR] = πHR; i.e., the parameter is the expected value of the outcome.
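A small Python check of this claim, with invented numbers: for a Bernoulli outcome, averaging simulated outcomes (the long way) and averaging the π draws themselves (the shortcut) agree up to Monte Carlo noise:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# hypothetical probit scenario (estimates and profile are made up)
beta_hat = np.array([-1.5, 0.3])
vcov = np.diag([0.02, 0.005])
x_hr = np.array([1.0, 1.0])

beta_draws = rng.multivariate_normal(beta_hat, vcov, size=4000)
pi_draws = norm.cdf(beta_draws @ x_hr)   # the shortcut: stop at pi

# the long way: for each draw, also simulate Bernoulli outcomes and average
long_way = np.array([rng.binomial(1, p, size=2000).mean() for p in pi_draws])
```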
When wouldn’t this work?

Ex.: suppose that yi ∼ Expo(λi) where λi = exp(Xiβ). We could find our likelihood, insert our parameterization of λi for each i, and then maximize to find β as usual.

Thus, for some baseline set of covariates XBL, we now have a simulated sampling distribution for λBL which has a mean at E[exp(XBLβ)].

It’s not too hard to show that if y ∼ Expo(λ), then E[y] = 1/λ. The temptation is then to declare that because E[λBL] = E[exp(XBLβ)], then E[y] = 1/E[exp(XBLβ)].

It turns out this is not the case because E[1/λ] ≠ 1/E[λ]. The first averages over the sampling distribution of the means of y. The second averages over the sampling distribution of λ and then plugs into the formula for the mean of y.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 55 / 242
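A quick Monte Carlo makes the gap concrete. Assume (purely for illustration) that the sampling distribution of XBLβ is N(1, 0.5²), so λ = exp(XBLβ) is lognormal; then the two orders of operations give visibly different answers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical sampling distribution of x'beta; lambda = exp(x'beta)
xb_draws = rng.normal(loc=1.0, scale=0.5, size=100_000)
lam_draws = np.exp(xb_draws)

right = np.mean(1.0 / lam_draws)   # averages the means of y over draws
wrong = 1.0 / np.mean(lam_draws)   # plugs E[lambda] into 1/lambda

print(right, wrong)                # right > wrong: 1/lambda is convex
```

For these lognormal draws the two quantities converge to exp(-0.875) ≈ 0.417 and exp(-1.125) ≈ 0.325 respectively, so the plug-in shortcut understates E[y] here.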
When wouldn’t this work?

Why this annoying wrinkle? Jensen’s inequality: given a random variable X, E[g(X)] ≠ g(E[X]) in general (it’s ≥ if g(·) is convex; ≤ if g(·) is concave).

Why can we use our shortcut with the Probit model? If Y ∼ Bern(π) then E[Y] = π. Our guess would then be that E[Y] = E[Φ(XHRβ)], which is fine because 1 · E[Φ(XHRβ)] = E[1 · Φ(XHRβ)].
Rule of thumb: if E [Y ] = θ, you are safe taking the shortcut.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 56 / 242
More Expected Values

What if I want a bunch of these to see how expected values change with some variable?

dem.rng <- -10:10
p.ests <- matrix(data = NA, ncol = length(dem.rng),
                 nrow = 10000)
for (j in 1:length(dem.rng)) {
  highrisk.dem <- highrisk
  highrisk.dem["dem_score"] <- dem.rng[j]
  p.ests[, j] <- pnorm(highrisk.dem %*% t(beta.draws))
}

plot(dem.rng, apply(p.ests, 2, mean), ylim = c(0, .028))
segments(x0 = dem.rng, x1 = dem.rng,
         y0 = apply(p.ests, 2, quantile, .025),
         y1 = apply(p.ests, 2, quantile, .975))
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 57 / 242
[Figure: Probability of Assassination Attempt vs. Dem Score, point estimates with 95% intervals]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 58 / 242
Predicted Values

What if someone asks me to predict whether an assassination will take place? If I were to simulate from the distribution of β, and then draw 1 value from the stochastic component for each simulation, I would get a predictive distribution for y|X.

Q: What will it look like?

There is no need to actually conduct the simulation, though. The simulated outcomes will be Bern(E[y|X]) = Bern(.0166). How is this different from the linear regression case?
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 59 / 242
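To see why the simulation is unnecessary, note that drawing one Bernoulli outcome per β draw marginally gives a Bernoulli with probability equal to the average of the per-draw probabilities. A Python sketch with a hypothetical N(-2.13, 0.09²) sampling distribution for XHRβ (chosen so Φ(XHRβ̂) ≈ .0166, as in the slides):

```python
import numpy as np
from math import erf, sqrt

def phi(z):
    """Standard normal CDF (the probit link)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

rng = np.random.default_rng(2)

# Hypothetical draws of x'beta from the estimated sampling distribution
xb_draws = rng.normal(loc=-2.13, scale=0.09, size=10_000)
p_draws = np.array([phi(z) for z in xb_draws])

# One predicted y per beta draw: the predictive distribution of y|x ...
y_draws = rng.binomial(1, p_draws)

# ... which is just a pile of 0s and 1s with mean equal to mean(p_draws)
print(y_draws.mean(), p_draws.mean())
```

Unlike in linear regression, the predictive distribution here has only two support points, so its histogram carries no more information than the single number E[y|X].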
First Differences

Compare expected values of the outcome for two different scenarios, usually with all predictors held constant but one. Recall from our regression results that war seemed to have a big positive effect on the probability of an assassination attempt.

So let’s find: E[y|XWar] − E[y|XNowar].

Each of these is just a fitted value for the probability parameter, with all covariates at the highrisk values except war, which we control.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 60 / 242
First Differences

highrisk.war <- highrisk
highrisk.war["war"] <- 1
highrisk.nowar <- highrisk
highrisk.nowar["war"] <- 0

fd.ests <- pnorm(highrisk.war %*% t(beta.draws)) -
  pnorm(highrisk.nowar %*% t(beta.draws))

> mean(fd.ests)
[1] 0.01891
> quantile(fd.ests, .025); quantile(fd.ests, .975)
   2.5%   97.5%
0.00578 0.03609
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 61 / 242
Simulating (Parameter) Estimation Uncertainty

To take one random draw of all the parameters γ = (β, α) from their “sampling distribution” (or “posterior distribution” with a flat prior):

1. Estimate the model by maximizing the likelihood function; record the point estimates γ̂ and variance matrix V̂(γ̂).

2. Draw the vector γ̃ from the multivariate normal distribution:

γ̃ ∼ N(γ̂, V̂(γ̂))

Denote the draw γ̃ = vec(β̃, α̃), which has k elements.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 62 / 242
Simulating the Distribution of Predicted Values, Ỹ

Predicted values can be for:

1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)

To simulate one predicted value, follow these steps:

1. Draw one value of γ̃ = vec(β̃, α̃).
2. Choose a predicted value to compute, defined by one value for each explanatory variable as the vector Xc.
3. Extract the simulated β̃ from γ̃; compute θc = g(Xc, β̃) (from the systematic component).
4. Simulate the outcome variable Yc ∼ f(θc, α̃) (from the stochastic component).

Repeat the algorithm, say, M = 1000 times to produce 1000 predicted values. Use these to compute a histogram for the full posterior, the average, variance, percentile values, or others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
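The generic algorithm above can be sketched for the exponential example from earlier, where θc = λc = exp(Xcβ). Everything here is a placeholder (made-up estimates, variance matrix, and profile), just to show the draw-then-simulate structure.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical MLE output for an exponential model, lambda = exp(x'beta)
beta_hat = np.array([0.2, -0.1])
vcov_hat = np.array([[0.02, 0.0],
                     [0.0, 0.01]])
x_c = np.array([1.0, 2.0])                    # chosen covariate profile X_c

M = 1000
y_pred = np.empty(M)
for m in range(M):
    # step 1: draw parameters from their sampling distribution
    beta_m = rng.multivariate_normal(beta_hat, vcov_hat)
    # step 3: systematic component, theta_c = g(X_c, beta)
    lam_m = np.exp(x_c @ beta_m)
    # step 4: stochastic component; numpy's scale is 1/rate, and E[y] = 1/lambda
    y_pred[m] = rng.exponential(scale=1.0 / lam_m)

print(y_pred.mean(), np.quantile(y_pred, [0.025, 0.975]))
```

Unlike an expected-value simulation, each pass keeps the single stochastic draw, so the spread of `y_pred` mixes estimation uncertainty with fundamental (exponential) variability.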
![Page 321: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/321.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
![Page 322: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/322.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
![Page 323: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/323.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
![Page 324: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/324.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
![Page 325: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/325.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
![Page 326: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/326.jpg)
Simulating the Distribution of Predicted Values, ∼ Y
Predicted values can be for:
1. Forecasts: about the future
2. Farcasts: about some area for which you have no y
3. Nowcasts: about the current data (perhaps to reproduce it to see whether it fits)
To simulate one predicted value, follow these steps:
1. Draw one value of γ = vec(β, α).
2. Choose a predicted value to compute, defined by one value for each explanatoryvariable as the vector Xc .
3. Extract simulated β from γ; compute θc = g(Xc , β) (from systematic component)
4. Simulate outcome variable Yc ∼ f (θc , α) (from stochastic component)
Repeat algorithm say M = 1000 times, to produce 1000 predicted values. Use these to
compute a histogram for the full posterior, the average, variance, percentile values, or
others.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 63 / 242
The Distribution of Expected v. Predicted Values

1. Predicted values: draws of Y that are or could be observed.
2. Expected values: draws of fixed features of the distribution of Y, such as E(Y).
3. Predicted values: include both estimation and fundamental uncertainty.
4. Expected values: average away fundamental uncertainty.
5. The variance of expected values (but not of predicted values) goes to 0 as n gets large.
6. Example use of the predicted value distribution: the probability that the temperature tomorrow is colder than 32◦. (The predicted temperature is uncertain both because we have to estimate it and because of natural fluctuations.)
7. Example use of the expected value distribution: the probability that the average temperature on days like tomorrow is colder than 32◦. (The expected temperature is uncertain only because we have to estimate it; natural fluctuations in temperature don't affect the average.)
8. Which should we use for causal effects & first differences?
Simulating the Distribution of Expected Values: An Algorithm

1. Draw one value of γ = vec(β, α).
2. Choose one value for each explanatory variable (Xc is a vector).
3. Taking the one set of simulated β from γ, compute θc = g(Xc, β) (from the systematic component).
4. Draw m values of the outcome variable Yc^(k) (k = 1, . . . , m) from the stochastic component f(θc, α). (This step simulates fundamental uncertainty.)
5. Average over the fundamental uncertainty by calculating the mean of the m simulations to yield one simulated expected value: Ê(Yc) = ∑_{k=1}^{m} Yc^(k) / m.
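A minimal numpy sketch of this inner/outer loop, using a hypothetical Poisson regression: `beta_hat` and `V_beta` here are made-up stand-ins for estimates you would obtain from your own fit (a Poisson model has no ancillary α, so γ reduces to β).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical estimation results (would come from your own fit).
beta_hat = np.array([0.5, 0.8])
V_beta = np.array([[0.010, 0.002],
                   [0.002, 0.020]])
X_c = np.array([1.0, 1.0])   # one value per explanatory variable

M, m = 1000, 100
expected_vals = np.empty(M)
for i in range(M):
    # Step 1: one draw of beta (estimation uncertainty).
    beta_i = rng.multivariate_normal(beta_hat, V_beta)
    # Step 3: systematic component, here the Poisson mean exp(X_c beta).
    theta_c = np.exp(X_c @ beta_i)
    # Step 4: m draws from the stochastic component (fundamental uncertainty).
    y_sims = rng.poisson(theta_c, size=m)
    # Step 5: average the fundamental uncertainty away.
    expected_vals[i] = y_sims.mean()

print(expected_vals.mean())
```

Because E(Yc) = θc for the Poisson, each `expected_vals[i]` is just a noisy version of `theta_c`; the inner loop is shown to make the algorithm concrete.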
Simulating Expected Values: Notes

1. When m = 1, this algorithm produces predicted values.
2. With large m, this algorithm better represents, and averages over, the fundamental uncertainty.
3. Repeat the entire algorithm M times (say 1000), with results differing only due to estimation uncertainty.
4. Use the results to compute a histogram, average, standard error, confidence interval, etc.
5. When E(Yc) = θc, we can skip the last two steps. E.g., in the logit model, once we simulate πi, we don't need to draw Y and then average to get back to πi. (If you're unsure, do it anyway!)
Simulating First Differences

To draw one simulated first difference:

1. Choose vectors Xs, the starting point, and Xe, the ending point.
2. Apply the expected-value algorithm twice, once for Xs and once for Xe (but reuse the random draws).
3. Take the difference of the two expected values.
4. (To save computation time, and to improve the approximation, use the same simulated β in each.)
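The first-difference recipe can be sketched for a hypothetical logit model, where the expected-value shortcut E(Yc) = πc lets us skip the inner loop. The estimates `beta_hat` and `V_beta` are illustrative, not from a real fit; note that the same `beta_sims` matrix is used for both scenarios, so the estimation uncertainty is shared and the difference is less noisy.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical logit estimates (would come from your own fit).
beta_hat = np.array([-0.5, 1.0])
V_beta = np.diag([0.04, 0.09])

X_s = np.array([1.0, 0.0])   # starting point
X_e = np.array([1.0, 1.0])   # ending point

M = 2000
beta_sims = rng.multivariate_normal(beta_hat, V_beta, size=M)

# Expected values for both scenarios from the SAME simulated betas.
pi_s = 1 / (1 + np.exp(-(beta_sims @ X_s)))
pi_e = 1 / (1 + np.exp(-(beta_sims @ X_e)))
fd = pi_e - pi_s             # one first difference per draw of beta

print(fd.mean(), np.percentile(fd, [2.5, 97.5]))
```

The M values of `fd` give the posterior of the first difference; its mean is the point estimate and its percentiles give a confidence interval.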
Tricks for Simulating Parameters

1. Simulate all parameters (in γ), including ancillary parameters, together, unless you know they are orthogonal.
2. Reparameterize to an unbounded scale to:
   - make γ converge more quickly in n (and so work better with small n) to a multivariate normal (the MLEs don't change, but the posteriors do);
   - make the maximization algorithm work faster, without constraints.
3. To do this, all estimated parameters should be unbounded and logically symmetric. E.g.,
   - σ² = e^η (i.e., wherever you see σ² in your log-likelihood function, replace it with e^η).
   - For a probability, π = [1 + e^(−η)]^(−1) (a logit transformation).
   - For −1 ≤ ρ ≤ 1, use ρ = (e^(2η) − 1)/(e^(2η) + 1) (Fisher's Z transformation).

In all 3 cases, η is unbounded: estimate it, simulate from it, and reparameterize back to the scale you care about.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 69 / 242
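The three transformations can be written down directly; this small numpy sketch checks that each maps an unbounded η into the right range, which is exactly why optimizing and simulating on the η scale is safe.

```python
import numpy as np

# Reparameterizations mapping an unbounded eta to a constrained parameter.
def sigma2(eta):
    # Variance: sigma^2 = e^eta, always positive.
    return np.exp(eta)

def prob(eta):
    # Probability: inverse logit, always in (0, 1).
    return 1 / (1 + np.exp(-eta))

def rho(eta):
    # Correlation: Fisher's Z transformation, always in (-1, 1).
    return (np.exp(2 * eta) - 1) / (np.exp(2 * eta) + 1)

# Estimate/simulate on the eta scale, then map back:
eta_draws = np.random.default_rng(3).normal(loc=0.0, scale=2.0, size=10_000)
assert np.all(sigma2(eta_draws) > 0)
assert np.all((prob(eta_draws) > 0) & (prob(eta_draws) < 1))
assert np.all((rho(eta_draws) > -1) & (rho(eta_draws) < 1))
```

However wild the draws of η, the back-transformed parameters respect their constraints, so no boundary handling is needed in the optimizer or the simulation.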
![Page 359: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/359.jpg)
Tricks for Simulating Parameters
1. Simulate all parameters (in γ), including ancillary parameters, together,unless you know they are orthogonal.
2. Reparameterize to unbounded scale to
I make γ converge more quickly in n (and so work better with small n) toa multivariate normal. (MLEs don’t change, but the posteriors do.)
I make the maximization algorithm work faster without constraints
3. To do this, all estimated parameters should be unbounded and logicallysymmetric. E.g.,
I σ2 = eη (i.e., wherever you see σ2, in your log-likelihood function,replace it with eη)
I For a probability, π = [1 + e−η]−1 (a logit transformation).I For −1 ≤ ρ ≤ 1, use ρ = (e2η − 1)/(e2η + 1) (Fisher’s Z transformation)
In all 3 cases, η is unbounded: estimate it, simulate from it, andreparameterize back to the scale you care about.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 69 / 242
![Page 360: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/360.jpg)
Tricks for Simulating Parameters
1. Simulate all parameters (in γ), including ancillary parameters, together,unless you know they are orthogonal.
2. Reparameterize to unbounded scale to
I make γ converge more quickly in n (and so work better with small n) toa multivariate normal. (MLEs don’t change, but the posteriors do.)
I make the maximization algorithm work faster without constraints
3. To do this, all estimated parameters should be unbounded and logicallysymmetric. E.g.,
I σ2 = eη (i.e., wherever you see σ2, in your log-likelihood function,replace it with eη)
I For a probability, π = [1 + e−η]−1 (a logit transformation).I For −1 ≤ ρ ≤ 1, use ρ = (e2η − 1)/(e2η + 1) (Fisher’s Z transformation)
In all 3 cases, η is unbounded: estimate it, simulate from it, andreparameterize back to the scale you care about.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 69 / 242
![Page 361: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/361.jpg)
Tricks for Simulating Parameters
1. Simulate all parameters (in γ), including ancillary parameters, together,unless you know they are orthogonal.
2. Reparameterize to unbounded scale toI make γ converge more quickly in n (and so work better with small n) to
a multivariate normal. (MLEs don’t change, but the posteriors do.)
I make the maximization algorithm work faster without constraints
3. To do this, all estimated parameters should be unbounded and logicallysymmetric. E.g.,
I σ2 = eη (i.e., wherever you see σ2, in your log-likelihood function,replace it with eη)
I For a probability, π = [1 + e−η]−1 (a logit transformation).I For −1 ≤ ρ ≤ 1, use ρ = (e2η − 1)/(e2η + 1) (Fisher’s Z transformation)
In all 3 cases, η is unbounded: estimate it, simulate from it, andreparameterize back to the scale you care about.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 69 / 242
Tricks for Simulating Quantities of Interest

1. Unless you're sure, always compute simulations of Y and use those as the basis for creating simulations of other quantities. (This gets all information from the model into the simulations.)

2. Simulating functions of Y
   (a) If some function of Y, such as ln(Y), is modeled, simulate ln(Y) and then apply the inverse function exp(ln(Y)) to recover Y.
   (b) The usual, but wrong, way: regress ln(Y) on X, compute the predicted value of ln(Y), and exponentiate.
   (c) It's wrong because the regression estimates E[ln(Y)], but E[ln(Y)] ≠ ln[E(Y)], so exp(E[ln(Y)]) ≠ E(Y).
   (d) More generally, E(g[Y]) ≠ g[E(Y)], unless g[·] is linear.

3. Check the approximation error of your simulation algorithm: run it twice and check how many digits of precision don't change. If it's not enough for your tables, increase M (or m) and try again.

4. Analytical calculations and other tricks can speed simulation or improve precision.

5. Canned software options: Clarify in Stata, Zelig in R.
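Point 2 is easy to demonstrate numerically: transform each simulation and then summarize, rather than summarizing on the modeled scale and transforming. A small Python sketch with a made-up lognormal example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Suppose the model is for ln(Y): ln(Y) ~ Normal(mu, sigma^2)
mu, sigma = 2.0, 1.0
lnY_sims = rng.normal(mu, sigma, size=100_000)   # simulations of ln(Y)

# Right: transform each simulation, then take the mean
right = np.exp(lnY_sims).mean()                  # estimates E(Y) = exp(mu + sigma^2/2)

# Wrong: take the mean on the log scale, then transform
wrong = np.exp(lnY_sims.mean())                  # estimates exp(E[ln Y]) = exp(mu)

print(right, wrong)   # about 12.2 vs about 7.4: E(g[Y]) != g(E[Y]) for nonlinear g
```

Here the gap is large precisely because exp(·) is convex, so averaging before transforming systematically understates E(Y) (Jensen's inequality).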
Replication of Rosenstone and Hansen from King, Tomz and Wittenberg (2000)

1. Logit of reported turnout on Age, Age², Education, Income, and Race

2. Quantity of interest: the (nonlinear) effect of age on Pr(vote|X), holding Income and Race constant.

3. Use M = 1000 and compute a 99% CI:
To create this graph, simulate:

1. Set age = 24, education = high school, income = average, race = white

2. Run the logistic regression

3. Simulate 1000 β's

4. Compute 1000 π_i = [1 + e^(−x_i β)]^(−1)

5. Sort them in numerical order

6. Take the 5th and 995th values as the 99% confidence interval

7. Plot a vertical line on the graph at age = 24 representing the CI.

8. Repeat for other ages and for college degree.
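The eight steps above can be sketched end-to-end in Python. Everything below is illustrative: the data are simulated stand-ins (not the actual NES turnout data), and the covariate profile is simplified to age terms plus a college-degree indicator.

```python
import numpy as np
from scipy import optimize

rng = np.random.default_rng(2)

# Hypothetical stand-in data for turnout
n = 2000
age = rng.uniform(18, 80, n)
college = rng.integers(0, 2, n)                   # 1 = college degree (illustrative)
X = np.column_stack([np.ones(n), age / 10, (age / 10) ** 2, college])
true_beta = np.array([-2.0, 0.8, -0.06, 0.5])
vote = rng.binomial(1, 1 / (1 + np.exp(-X @ true_beta)))

# Step 2: logistic regression by MLE
def negll(b):
    xb = X @ b
    return -np.sum(vote * xb - np.log1p(np.exp(xb)))

fit = optimize.minimize(negll, np.zeros(4), method="BFGS")
beta_hat, V = fit.x, fit.hess_inv

# Steps 1 and 3-6 for one profile: age 24, no college degree
x0 = np.array([1.0, 2.4, 2.4 ** 2, 0.0])          # step 1
betas = rng.multivariate_normal(beta_hat, V, size=1000)   # step 3
pis = 1 / (1 + np.exp(-betas @ x0))               # step 4
pis.sort()                                        # step 5
ci = (pis[4], pis[994])                           # step 6: 5th and 995th of 1000 draws
print(ci)
```

Steps 7-8 are a loop over ages and the education indicator, drawing one vertical CI segment per profile.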
Replication of Garrett (King, Tomz and Wittenberg 2000)

Dependent variable: Government Spending as % of GDP
Key explanatory variable: left-labor power (high = solid line; low = dashed)

Garrett used only point estimates to distinguish the eight quantities represented above. What new information do we learn with this approach? Left-labor power only has a clear effect when exposure to trade or capital mobility is high.

See Last Semester's Slides
1 Binary Outcome Models

2 Quantities of Interest
  - An Example with Code
  - Predicted Values
  - First Differences
  - General Algorithms

3 Model Diagnostics for Binary Outcome Models

4 Ordered Categorical

5 Unordered Categorical

6 Event Count Models
  - Poisson
  - Overdispersion
  - Binomial for Known Trials

7 Duration Models
  - Exponential Model
  - Weibull Model
  - Cox Proportional Hazards Model

8 Duration-Logit Correspondence

9 Appendix: Multinomial Models

10 Appendix: More on Overdispersed Poisson

11 Appendix: More on Binomial Models

12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 74 / 242
Model Diagnostics for Binary Outcome Models
How do you quantify the goodness of fit of your model?
Note: Goodness of fit may or may not be very important
A model that fits well is likely to be a good predictive model
But it does not guarantee that the model is a good causal model
Pseudo-R²: a generalization of R² outside of the linear world

    R² = 1 − ℓ(β̂_MLE) / ℓ(ȳ)  ∈ [0, 1]

ℓ(ȳ): log-likelihood of the null model, which sets π̂_i = ȳ for all i
This one is due to McFadden (1974); many other variants exist
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 75 / 242
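As a minimal sketch of the formula above (not from the slides): McFadden's pseudo-R² compares the fitted model's log-likelihood to that of a null model that predicts the sample mean ȳ for every observation. The helper names here are hypothetical, and the fitted probabilities are supplied by hand rather than estimated.

```python
import numpy as np

def bernoulli_loglik(y, p):
    # log-likelihood of binary outcomes y under predicted probabilities p
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def mcfadden_r2(y, p_hat):
    # null model: predict the sample mean ybar for every observation
    p_null = np.full(len(y), y.mean())
    return 1 - bernoulli_loglik(y, p_hat) / bernoulli_loglik(y, p_null)

y = np.array([0, 0, 1, 1])
flat = np.repeat(y.mean(), 4)              # no better than the null model
sharp = np.array([0.05, 0.05, 0.95, 0.95]) # nearly perfect predictions
print(mcfadden_r2(y, flat))   # 0.0
print(mcfadden_r2(y, sharp))  # close to 1
```

A model whose predictions match the null exactly gets 0; sharper correct predictions push the measure toward 1.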
How do you know which model is better?

1. Out-of-sample forecasts (or farcasts)

(a) Your job: find the underlying (persistent) structure, not the idiosyncratic features of any one data set.
(b) Set aside some (test) data.
(c) Fit your model to the rest (the training data).
(d) Use the model fit on the training data to make predictions; compare them to the test set.
(e) Compare to the average prediction and to the full distribution.
(f) E.g., if a set of predictions have Pr(y = 1) = 0.2, then 20% of these observations in the test set should be 1s.
(g) The best test sets are really out of sample, not even available yet.
(h) If the world changes, an otherwise good model will fail. But it's still the right test.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 76 / 242
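The calibration check in item (f) can be sketched like this (not from the slides; the data here are simulated so that the "predictions" are well calibrated by construction):

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for model predictions on a held-out test set: probabilities p_hat,
# with outcomes y actually drawn from them, so calibration holds by design
p_hat = rng.uniform(0.05, 0.95, size=5000)
y = rng.binomial(1, p_hat)

# within each predicted-probability bin, the observed share of 1s in the
# test set should roughly match the average predicted probability
edges = np.linspace(0, 1, 6)
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (p_hat >= lo) & (p_hat < hi)
    print(f"[{lo:.1f}, {hi:.1f}): predicted {p_hat[mask].mean():.2f}, "
          f"observed {y[mask].mean():.2f}")
```

With a real model, p_hat would come from the training-set fit applied to the test set; large gaps between the predicted and observed columns signal miscalibration.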
(See Trevor Hastie et al. 2001. The Elements of Statistical Learning, Springer, Chapter 7: Fig 7.1.)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 77 / 242
(i) Binary variable predictions require a normative decision.
  ▶ Let C be the number of times more costly misclassifying a 1 is than a 0.
  ▶ C must be chosen independently of the data.
  ▶ C could come from your philosophical justification, a survey of policy makers, a review of the literature, etc.
  ▶ People often choose C = 1, but without justification.
  ▶ Decision theory: choose Y = 1 when π > 1/(1 + C) and 0 otherwise.
    • If C = 1, predict y = 1 when π > 0.5
    • If C = 2, predict y = 1 when π > 1/3
  ▶ Only with C chosen can we compute (a) the % of 1s correctly predicted, (b) the % of 0s correctly predicted, and (c) patterns in errors in different subsets of the data or forecasts.
(j) If you can't justify a choice for C, use ROC (receiver operating characteristic) curves
  ▶ Compute the %1s and %0s correctly predicted for every possible value of C
  ▶ Plot %1s by %0s
  ▶ Overlay curves for several models on the same graph.
  ▶ If one curve is above another the whole way, then that model dominates the other: it's better no matter your normative decision (about C).
  ▶ Otherwise, one model is better than the other only in given specified ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
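The decision rule and the ROC ingredients above can be sketched as follows (not from the slides; the function names and toy data are hypothetical):

```python
import numpy as np

def classify(p, C):
    # decision-theoretic rule: predict y = 1 when pi > 1 / (1 + C),
    # where C is how many times more costly misclassifying a 1 is than a 0
    return (p > 1.0 / (1.0 + C)).astype(int)

def roc_point(y, p, C):
    # (% of 1s correctly predicted, % of 0s correctly predicted)
    # at the threshold implied by C
    pred = classify(p, C)
    return (pred[y == 1] == 1).mean(), (pred[y == 0] == 0).mean()

y = np.array([0, 0, 0, 1, 1, 1])
p = np.array([0.1, 0.3, 0.45, 0.4, 0.7, 0.9])
print(roc_point(y, p, 1))  # threshold 0.5
print(roc_point(y, p, 2))  # threshold 1/3: more 1s caught, fewer 0s
```

Sweeping C over a grid and plotting the resulting pairs traces out one model's ROC curve; overlaying the curves for several models lets you check the dominance condition on the previous slide.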
![Page 414: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/414.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.
I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.
• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 415: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/415.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.
I C could come from your philosophical justification, survey of policymakers, a review of the literature, etc.
I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.
• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 416: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/416.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.
I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.
• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 417: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/417.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.
I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.
• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 418: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/418.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.
• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 419: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/419.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.• If C = 1, predict y = 1 when π > 0.5
• If C = 2, predict y = 1 when π > 1/3I Only with C chosen can we compute (a) % of 1s correctly predicted
and (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 420: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/420.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 421: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/421.jpg)
(i) Binary variable predictions require a normative decision.I Let C be number of times more costly misclassifying a 1 is than a 0.I C must be chosen independently of the data.I C could come from your philosophical justification, survey of policy
makers, a review of the literature, etc.I People often choose C = 1, but without justification.I Decision theory: choose Y = 1 when π > 1/(1 + C ) and 0 otherwise.• If C = 1, predict y = 1 when π > 0.5• If C = 2, predict y = 1 when π > 1/3
I Only with C chosen can we compute (a) % of 1s correctly predictedand (b) % of 0s correctly predicted, and (c) patterns in errors indifferent subsets of the data or forecasts.
(j) If you can’t justify a choice for C , use ROC (receiver-operatorcharacteristic) curves
I Compute %1s and %0s correctly predicted for every possible value of CI Plot %1s by %0sI Overlay curves for several models on the same graph.I If one curve is above another the whole way, then that model dominates
the other. It’s better no matter your normative decision (about C )I Otherwise, one model is better than the other in only given specificed
ranges of C (i.e., for only some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
![Page 422: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/422.jpg)
(i) Binary variable predictions require a normative decision.

- Let C be the number of times more costly misclassifying a 1 is than misclassifying a 0.
- C must be chosen independently of the data.
- C could come from a philosophical justification, a survey of policy makers, a review of the literature, etc.
- People often choose C = 1, but without justification.
- Decision theory: predict Y = 1 when π > 1/(1 + C) and 0 otherwise.
  - If C = 1, predict y = 1 when π > 0.5
  - If C = 2, predict y = 1 when π > 1/3
- Only with C chosen can we compute (a) the % of 1s correctly predicted, (b) the % of 0s correctly predicted, and (c) patterns in errors in different subsets of the data or forecasts.

(j) If you can't justify a choice for C, use ROC (receiver operating characteristic) curves.

- Compute the %1s and %0s correctly predicted for every possible value of C.
- Plot the %1s by the %0s.
- Overlay curves for several models on the same graph.
- If one curve is above another the whole way, that model dominates the other: it is better no matter your normative decision about C.
- Otherwise, one model is better than the other only in specified ranges of C (i.e., only for some normative perspectives).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 78 / 242
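The decision rule and the ROC construction above can be sketched numerically. This is purely illustrative: the fitted probabilities `pi`, the outcomes `y`, and the cost ratio `C = 2` are made-up inputs, not from any model discussed in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fitted probabilities and observed 0/1 outcomes.
pi = rng.uniform(size=200)              # pi_i = Pr(Y_i = 1 | X_i)
y = rng.binomial(1, pi)                 # outcomes drawn from those probabilities

# Decision rule: predict 1 when pi > 1 / (1 + C).
C = 2                                   # misclassifying a 1 is twice as costly as a 0
yhat = (pi > 1 / (1 + C)).astype(int)   # threshold is 1/3 when C = 2

pct_1s_correct = yhat[y == 1].mean()         # share of observed 1s predicted as 1
pct_0s_correct = (1 - yhat[y == 0]).mean()   # share of observed 0s predicted as 0

# ROC idea: sweep the threshold (equivalently, sweep C) and record both rates.
thresholds = np.linspace(0, 1, 101)
roc = [((pi > t)[y == 1].mean(),             # %1s correct at this threshold
        1 - (pi > t)[y == 0].mean())         # %0s correct at this threshold
       for t in thresholds]
```

Plotting the pairs in `roc` for several models on one graph gives the overlaid curves described above; a model whose curve lies above another's everywhere dominates regardless of C.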
In-sample ROC, on left (from Gary King and Langche Zeng, "Improving Forecasts of State Failure," World Politics, Vol. 53, No. 4 (July 2001): 623-58)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 79 / 242
4. Cross-validation
(a) The idea: set aside k observations as the "test set"; evaluate; then set aside another set of k observations. Repeat multiple times and report performance averaged over the subsets.
(b) Useful for smaller data sets; real test sets are better.

5. Fit, in general: Look for all possible observable implications of a model, and compare them to observations. (Think. Be creative here!)

6. Fit: continuous variables

(a) The usual regression diagnostics
(b) E.g., plots of e = y − ŷ by X, Y, or ŷ
(c) Check more than the means. E.g., plot e by ŷ and draw lines at 0 and at ±1 and ±2 SEs; 66% and 95% of the observations should fall between the lines.
(d) For graphics:
- transform bounded variables
- transform heteroskedastic results
- highlight key results; label everything
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 80 / 242
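The cross-validation recipe in point 4 can be sketched as follows. The data, the linear model, and the five-fold split are all invented for illustration; only the hold-out-fit-average logic mirrors the slide.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a linear relationship plus noise.
n = 100
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)

# K-fold cross-validation: hold out each fold in turn as the "test set",
# fit on the remaining observations, then average performance over folds.
K = 5
folds = np.array_split(rng.permutation(n), K)
mse = []
for test_idx in folds:
    train_idx = np.setdiff1d(np.arange(n), test_idx)
    # Fit y = a + b*x by least squares on the training fold only.
    b, a = np.polyfit(x[train_idx], y[train_idx], 1)
    resid = y[test_idx] - (a + b * x[test_idx])
    mse.append(np.mean(resid ** 2))

cv_mse = np.mean(mse)   # reported performance: average over the K held-out subsets
```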
7. Fit: dichotomous variables

(a) Sort the estimated probabilities into bins of, say, 0.1 width: [0, 0.1), [0.1, 0.2), ..., [0.9, 1].
(b) From the observations in each bin, compute (a) the mean prediction (probably near 0.05, 0.15, etc.) and (b) the fraction of 1s.
(c) Plot (a) by (b) and look for systematic deviation from the 45° line.

In-sample calibration graph on right (from Gary King and Langche Zeng, "Improving Forecasts of State Failure," World Politics, Vol. 53, No. 4 (July 2001): 623-58)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 81 / 242
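The binning procedure in steps (a)-(c) can be sketched directly. The probabilities and outcomes here are simulated so the model is calibrated by construction; with real fitted values, systematic gaps between the two computed quantities would indicate miscalibration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical fitted probabilities, with outcomes drawn from them
# (so the "model" is calibrated by construction).
pi = rng.uniform(size=2000)
y = rng.binomial(1, pi)

# (a) Sort estimated probabilities into bins [0, 0.1), [0.1, 0.2), ..., [0.9, 1].
edges = np.linspace(0, 1, 11)
bins = np.clip(np.digitize(pi, edges) - 1, 0, 9)

# (b) Within each bin: the mean prediction and the fraction of 1s.
mean_pred = np.array([pi[bins == b].mean() for b in range(10)])
frac_ones = np.array([y[bins == b].mean() for b in range(10)])

# (c) Plot mean_pred against frac_ones; points should hug the 45-degree line.
```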
Out-of-sample calibration graph on right.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 82 / 242
Out-of-Sample with Cross-Validation
[Figure: two panels. One plots "Fitted Probabilities vs. Actual Outcomes" — mean fitted Prob(Y = 1) on the horizontal axis against the fraction of Y = 1 on the vertical axis, both from 0.0 to 1.0. The other shows the "ROC Curve for the Fearon and Laitin (2003) Data" — the proportion of 1's correctly classified against the proportion of 0's correctly classified.]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 83 / 242
New Developments: Separation plots
Greenhill, Ward and Sacks (2011)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 84 / 242
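The idea behind a separation plot can be sketched in miniature. A real separation plot (Greenhill, Ward and Sacks 2011) draws a thin colored bar per observation; the text strip below is only a stand-in for that graphic, and the probabilities and outcomes are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical fitted probabilities and outcomes drawn from them,
# so higher-probability cases tend to be actual 1s.
pi = rng.uniform(size=60)
y = rng.binomial(1, pi)

# Separation plot idea: sort cases by fitted probability and mark each
# one's actual outcome ("|" for a 1, "." for a 0). A well-fitting model
# concentrates the 1s toward the high-probability (right) end.
order = np.argsort(pi)
strip = "".join("|" if y[i] else "." for i in order)
print(strip)
```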
New Developments
Hanmer and Kalkan distinguish between:
average case (most common)
observed value (they argue better)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 85 / 242
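The average-case versus observed-value distinction can be made concrete with a logit sketch. The coefficients and covariates below are invented; what matters is that the two approaches differ because the inverse logit is nonlinear, so the prediction at the mean covariate is not the mean of the predictions.

```python
import numpy as np

rng = np.random.default_rng(4)

def inv_logit(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1 / (1 + np.exp(-eta))

# Hypothetical logit coefficients and a single covariate.
beta0, beta1 = -1.0, 2.0
x = rng.normal(size=1000)

# Average-case approach: plug in the mean covariate, then predict.
p_avg_case = inv_logit(beta0 + beta1 * x.mean())

# Observed-value approach (Hanmer and Kalkan's argument): predict for
# every observed case, then average the predictions.
p_obs_value = inv_logit(beta0 + beta1 * x).mean()

# The two quantities generally differ (Jensen's inequality applied to
# the nonlinear link), which is the heart of the distinction.
```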
The Case for Observed Case
Consider the case of voting for Bush in 2004. The average case is:
a white, 48-year-old woman
who identifies as independent and a political moderate
with an associate's degree
who believes economic performance has been constant
who disapproves of the Iraq war, but not strongly
with an income between $45K and $50K
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 86 / 242
The Case for Observed Case
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 87 / 242
The Case for Observed Case
Try to come up with an argument for why the average-case method will tend to produce bigger changes than the observed-case method.
Average cases are likely to be in the "middle" of the data, where the predicted probabilities are changing the fastest. Think about Bill O'Reilly and Rachel Maddow: they are going to show up in the observed-case method but not the average-case method.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 88 / 242
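That intuition can be checked with a small simulation. A minimal sketch in Python with made-up coefficients (the covariate, the "treatment" indicator, and all numbers below are hypothetical, not the slides' Bush-vote model): the average case sits near the steep middle of the logistic curve, so its first difference exceeds the observed-case average, which folds in observations out in the flat tails.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical logit: one spread-out covariate plus a binary "treatment".
x = rng.normal(0.0, 2.0, size=n)
beta0, beta1, beta_t = 0.0, 1.0, 0.5

def prob(treat, xv):
    # Pr(Y = 1) under the logistic inverse link
    eta = beta0 + beta1 * xv + beta_t * treat
    return 1.0 / (1.0 + np.exp(-eta))

# Average-case method: hold x at its mean, then toggle the treatment.
avg_case_fd = prob(1, x.mean()) - prob(0, x.mean())

# Observed-case method: toggle the treatment at every observed x, then average.
obs_case_fd = np.mean(prob(1, x) - prob(0, x))

print(avg_case_fd, obs_case_fd)
```

With these values the average-case first difference comes out larger, because the mean of x lands where the logistic curve is steepest while many observed x's sit in the tails.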
The Case for Observed Case
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 89 / 242
UPM: remainder of Chapter 5.
Optionally: Greenhill et al. 2011; Hanmer and Kalkan 2013
Also helpful: Mood 2010; Berry, DeMeritt and Esarey 2010
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 90 / 242
1 Binary Outcome Models
2 Quantities of Interest
  An Example with Code
  Predicted Values
  First Differences
  General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
  Poisson
  Overdispersion
  Binomial for Known Trials
7 Duration Models
  Exponential Model
  Weibull Model
  Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 91 / 242
Modeling Ordered Outcomes
Suppose that we have an outcome which is one of J choices that are ordered in a substantively meaningful way.
Examples:
"Likert scale" in survey questions ("strongly agree", "agree", etc.)
Party positions (extreme left, center left, center, right, extreme right)
Levels of democracy (autocracy, anocracy, democracy)
Health status (healthy, sick, dying, dead)
Why not use continuous outcome models?
→ Don't want to assume equal distances between levels
Why not use categorical outcome models?
→ Don't want to waste information about ordering
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 92 / 242
Ordered Dependent Variable Models
The model:
Y*_i ∼ STN(y*_i | µ_i)
µ_i = x_i β
Observation mechanism:
y_ij = 1 if τ_{j−1,i} ≤ y*_i ≤ τ_{j,i}, and 0 otherwise
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 93 / 242
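The observation mechanism can be simulated directly. A minimal sketch in Python, with a made-up coefficient and thresholds (beta and tau below are hypothetical, chosen only for illustration): draw the latent Y*_i around µ_i = x_i β, chop it at the thresholds, and check that the implied probabilities match the normal CDF.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta = 0.8                        # hypothetical coefficient
tau = np.array([-1.0, 0.5])       # two thresholds -> three ordered categories

x = rng.normal(size=n)
y_star = x * beta + rng.normal(size=n)   # Y*_i ~ N(mu_i, 1), mu_i = x_i * beta

# Observation mechanism: y_i = j when tau_{j-1} < y*_i <= tau_j
y = np.digitize(y_star, tau)             # observed categories 0, 1, 2

def Phi(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# For observations with x near 0 (so mu_i near 0), Pr(y = 0) should be
# approximately Phi(tau_1 - 0).
near_zero = np.abs(x) < 0.05
empirical = np.mean(y[near_zero] == 0)
print(empirical, Phi(tau[0]))            # both close to 0.159
```

The empirical share of the lowest category among observations with x ≈ 0 matches Φ(τ_1), which is exactly the probit structure the next slide writes down.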
Ordered Logit and Probit Models
Again, the latent variable representation: Y*_i = X_i'β + ε_i
Assume that Y*_i gives rise to Y_i based on the following scheme:
Y_i = 1 if −∞ (= ψ_0) < Y*_i ≤ ψ_1,
      2 if ψ_1 < Y*_i ≤ ψ_2,
      ...
      J if ψ_{J−1} < Y*_i ≤ ∞ (= ψ_J)
where ψ_1, ..., ψ_{J−1} are the threshold parameters to be estimated.
If X_i contains an intercept, one of the ψ's must be fixed for identifiability (typically ψ_1 = 0).
ε_i i.i.d. ∼ logistic ⇒ the ordered logit model:
Pr(Y_i ≤ j | X_i) = exp(ψ_j − X_i'β) / (1 + exp(ψ_j − X_i'β))
ε_i i.i.d. ∼ N(0, 1) ⇒ the ordered probit model:
Pr(Y_i ≤ j | X_i) = Φ(ψ_j − X_i'β)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 94 / 242
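The cumulative formulas translate directly into category probabilities: Pr(Y = j) is the difference of adjacent cumulative probabilities. A minimal sketch in Python (the eta and psi values are made up for illustration; the function name is ours, not from any package):

```python
import numpy as np

def ordered_logit_probs(eta, psi):
    """Category probabilities Pr(Y = j | X), j = 1..J, for one observation.

    eta: linear predictor X'beta
    psi: increasing thresholds psi_1 < ... < psi_{J-1}

    Uses Pr(Y <= j | X) = exp(psi_j - eta) / (1 + exp(psi_j - eta)), then
    recovers category probabilities as differences of the cumulative ones.
    """
    psi = np.asarray(psi, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-(psi - eta)))   # Pr(Y <= j), j = 1..J-1
    cum = np.concatenate(([0.0], cum, [1.0]))  # pad with Pr(Y <= 0), Pr(Y <= J)
    return np.diff(cum)

# Hypothetical values, with psi_1 fixed at 0 for identifiability as on the slide
probs = ordered_logit_probs(eta=0.3, psi=[0.0, 1.0, 2.5])
print(probs, probs.sum())   # four category probabilities summing to 1
```

Swapping the logistic CDF for the standard normal CDF in `cum` gives the ordered probit version; nothing else changes.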
Ordered Logit and Probit Models
[Figure: density f(Y_i*) and CDF F(Y_i*) of the latent variable Y_i* ∼ N(X_i'β, 1). Thresholds ψ_1, ψ_2, ψ_3 cut the latent scale into regions whose areas under the density give Pr(Y_i = 1), Pr(Y_i = 2), Pr(Y_i = 3); the CDF evaluated at each threshold gives the cumulative probabilities Pr(Y_i ≤ j).]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 95 / 242
Ordered Dependent Variable Models: Connections
1. Y*_i ∼ STL(y*_i | µ_i) → ordinal logit; Y*_i ∼ STN(y*_i | µ_i) → ordinal probit
2. Alternate representation: a dichotomous variable Y_ji for each category j, only one of which is 1; the others are 0.
3. If Y*_i is observed, the probit version is a linear-normal regression model.
4. If a dichotomous realization of Y* is observed, it's a logit/probit model.
5. This is the same model, and the same parameters are being estimated; only the observation mechanism differs.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 96 / 242
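The equivalence in points 3–5 can be checked by simulation: the same latent draws yield a linear regression when Y_i^* is observed directly and an ordered outcome when only the interval between cutpoints is observed. A minimal sketch in Python (the coefficient and cutpoints are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
beta = 0.8
x = rng.normal(size=n)

# Latent variable: Y*_i = x_i * beta + e_i, e_i ~ N(0, 1) (probit version)
y_star = x * beta + rng.normal(size=n)

# Observation mechanism 1: Y* observed directly -> linear-normal regression
slope = np.polyfit(x, y_star, 1)[0]   # recovers beta up to sampling error

# Observation mechanism 2: only the interval between cutpoints is observed
tau = np.array([-1.0, 0.5])           # illustrative cutpoints
y_obs = np.digitize(y_star, tau) + 1  # ordered categories 1, 2, 3
```

Fitting an ordered probit to `y_obs` would target the same β as the regression on `y_star`; only the observation mechanism differs.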
Deriving the likelihood function

First the probability of each observation, then the joint probability.

\[
\begin{aligned}
\Pr(Y_{ij} = 1) &= \Pr(\tau_{j-1} \le Y_i^* \le \tau_j) \\
&= \int_{\tau_{j-1}}^{\tau_j} \mathrm{STN}(y_i^* \mid \mu_i)\, dy_i^* \\
&= F_{\mathrm{stn}}(\tau_j \mid \mu_i) - F_{\mathrm{stn}}(\tau_{j-1} \mid \mu_i) \\
&= F_{\mathrm{stn}}(\tau_j \mid x_i\beta) - F_{\mathrm{stn}}(\tau_{j-1} \mid x_i\beta)
\end{aligned}
\]

The joint probability is then:

\[
P(Y) = \prod_{i=1}^{n} \prod_{j=1}^{J} \Pr(Y_{ij} = 1)^{y_{ij}}
\]

The inner product has only one active component for each i, since exactly one y_ij equals 1.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 97 / 242
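The cell probabilities F_stn(τ_j | x_iβ) − F_stn(τ_{j−1} | x_iβ) can be computed directly; with τ_0 = −∞ and τ_J = +∞ they sum to one for each observation. A sketch using scipy (cutpoints and linear predictors are illustrative):

```python
import numpy as np
from scipy.stats import norm

def cell_probs(xb, tau):
    """Pr(Y_ij = 1) = F_stn(tau_j | x_i beta) - F_stn(tau_{j-1} | x_i beta),
    i.e. Phi(tau_j - x_i beta) - Phi(tau_{j-1} - x_i beta)."""
    cuts = np.concatenate(([-np.inf], tau, [np.inf]))  # tau_0 = -inf, tau_J = +inf
    cdf = norm.cdf(cuts[None, :] - np.asarray(xb)[:, None])
    return np.diff(cdf, axis=1)                        # shape (n, J)

tau = np.array([-1.0, 0.5])        # illustrative cutpoints (J = 3)
xb = np.array([-0.5, 0.0, 1.2])    # illustrative linear predictors x_i beta
p = cell_probs(xb, tau)            # row i: (Pr(Y_i = 1), Pr(Y_i = 2), Pr(Y_i = 3))
```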
Deriving the likelihood function

The log-likelihood:

\[
\begin{aligned}
\ln L(\beta, \tau \mid y) &= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln \Pr(Y_{ij} = 1) \\
&= \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln \left[ F_{\mathrm{stn}}(\tau_j \mid x_i\beta) - F_{\mathrm{stn}}(\tau_{j-1} \mid x_i\beta) \right]
\end{aligned}
\]

(Constraints during optimization make this more complicated to do from scratch: τ_{j−1} < τ_j for all j.)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 98 / 242
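One standard way to handle the ordering constraint from scratch is to reparameterize: estimate τ_1 freely plus log-gaps, so τ_j = τ_1 + Σ exp(δ) is increasing by construction. A sketch of the ordered-probit log-likelihood under this reparameterization (function and variable names are mine, not from the slides):

```python
import numpy as np
from scipy.stats import norm

def ordered_probit_loglik(beta, tau1, log_gaps, X, y):
    """ln L = sum_i sum_j y_ij ln[F_stn(tau_j | x_i beta) - F_stn(tau_{j-1} | x_i beta)]."""
    # tau_j = tau_1 + cumulative sum of exp(log_gaps): tau_{j-1} < tau_j holds automatically
    tau = tau1 + np.concatenate(([0.0], np.cumsum(np.exp(log_gaps))))
    cuts = np.concatenate(([-np.inf], tau, [np.inf]))
    xb = X @ beta
    probs = np.diff(norm.cdf(cuts[None, :] - xb[:, None]), axis=1)
    # only the observed category contributes for each i (y_ij picks out one term)
    return np.log(probs[np.arange(len(y)), y - 1]).sum()

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
true_beta = np.array([0.5, -0.3])
y = np.digitize(X @ true_beta + rng.normal(size=200), [-1.0, 0.5]) + 1  # categories 1..3
ll = ordered_probit_loglik(true_beta, -1.0, np.array([np.log(1.5)]), X, y)
```

Maximizing this function over (β, τ_1, δ) with an unconstrained optimizer then respects the cutpoint ordering implicitly.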
Interpretation: Ordinal Probit

1. Coefficients are the linear effect of X on Y^* in standard deviation units.
2. Predictions from the model are J probabilities that sum to 1.
3. One first difference has an effect on all J probabilities.
4. When one probability goes up, at least one of the others must go down.
5. Can use ternary diagrams if J = 3.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 99 / 242
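Points 2–4 can be verified numerically: a first difference in Xβ moves all J predicted probabilities, and the changes sum to zero. A sketch (cutpoints and predictor values are illustrative):

```python
import numpy as np
from scipy.stats import norm

def probs(xb, tau):
    """Ordered-probit category probabilities for a scalar linear predictor."""
    cuts = np.concatenate(([-np.inf], tau, [np.inf]))
    return np.diff(norm.cdf(cuts - xb))

tau = np.array([-1.0, 0.0, 1.0])   # illustrative cutpoints (J = 4)
p_lo = probs(0.0, tau)             # baseline X beta
p_hi = probs(0.5, tau)             # X beta after a first difference
fd = p_hi - p_lo                   # one effect per category; entries sum to zero
```

With a positive shift, the bottom category loses probability and the top gains; the middle categories can move either way.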
Representing 3 variables, with Y_j ∈ [0, 1] and Σ_{j=1}^{3} Y_j = 1

[Figure: ternary diagram with corners labeled C, A, and L and axes V_iC, V_iA, V_iL, each running from 0 to 1; a scatter of points inside the triangle, each representing one triple (V_iC, V_iA, V_iL) on the simplex.]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 100 / 242
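A ternary diagram maps each triple (p_1, p_2, p_3) on the simplex to a point in an equilateral triangle via barycentric coordinates. A sketch of the conversion (the corner placement is one common convention, not necessarily the slides' own):

```python
import numpy as np

# Corners of an equilateral triangle (one common placement convention)
CORNERS = np.array([[0.0, 0.0],              # first category
                    [1.0, 0.0],              # second category
                    [0.5, np.sqrt(3) / 2]])  # third category

def ternary_xy(p):
    """Barycentric -> Cartesian: any triple on the simplex lands inside the triangle."""
    p = np.asarray(p, dtype=float)
    return p @ CORNERS

corner_point = ternary_xy([1.0, 0.0, 0.0])  # a pure-category triple sits at its corner
centroid = ternary_xy([1/3, 1/3, 1/3])      # the uniform triple sits at the centroid
```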
Calculating Quantities of Interest

Predicted probability:

\[
\begin{aligned}
\pi_{ij}(X_i) \equiv \Pr(Y_i = j \mid X_i) &= \Pr(Y_i \le j \mid X_i) - \Pr(Y_i \le j - 1 \mid X_i) \\
&= \begin{cases}
\dfrac{\exp(\psi_j - X_i^\top \beta)}{1 + \exp(\psi_j - X_i^\top \beta)} - \dfrac{\exp(\psi_{j-1} - X_i^\top \beta)}{1 + \exp(\psi_{j-1} - X_i^\top \beta)} & \text{for logit} \\[2ex]
\Phi\!\left(\psi_j - X_i^\top \beta\right) - \Phi\!\left(\psi_{j-1} - X_i^\top \beta\right) & \text{for probit}
\end{cases}
\end{aligned}
\]

ATE (APE): τ_j = E[π_j(T_i = 1, W_i) − π_j(T_i = 0, W_i)]

Estimate β and ψ via MLE, plug the estimates in, replace E with (1/n)Σ, and compute the CI by the delta method, Monte Carlo simulation, or the bootstrap.

Note that X_i^T β appears both before and after the minus sign in π_ij:
→ The direction of the effect of X_i on Y_ij is ambiguous (except for the top and bottom categories).
→ Again, calculate quantities of interest, not just coefficients.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 101 / 242
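The ATE recipe above — plug in T = 1 and T = 0 at the observed W_i, then average — can be sketched for the ordered-logit form of π_ij (the coefficients and covariates below are simulated stand-ins for MLE estimates, not values from the slides):

```python
import numpy as np

def ologit_probs(xb, psi):
    """pi_ij: differences of the logistic CDF evaluated at (psi_j - x_i beta)."""
    cuts = np.concatenate(([-np.inf], psi, [np.inf]))
    cdf = 1.0 / (1.0 + np.exp(-(cuts[None, :] - np.asarray(xb)[:, None])))
    return np.diff(cdf, axis=1)

# Illustrative stand-ins for fitted values
psi = np.array([-1.0, 0.2, 1.3])   # cutpoints, J = 4 categories
beta_t, beta_w = 0.6, 0.3          # treatment and covariate coefficients
rng = np.random.default_rng(2)
W = rng.normal(size=500)           # observed covariate values

# tau_j = (1/n) sum_i [pi_j(T_i = 1, W_i) - pi_j(T_i = 0, W_i)]
ate = (ologit_probs(beta_t * 1.0 + beta_w * W, psi)
       - ologit_probs(beta_t * 0.0 + beta_w * W, psi)).mean(axis=0)
```

Note the ambiguity the slide mentions: the signs of the middle-category effects are not determined by the sign of the treatment coefficient, while the bottom and top categories move in opposite, predictable directions.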
Example: Immigration and Media Priming

Brader, Valentino and Suhay (2008):

Y_i: Ordinal response to a question about increasing immigration
T_1i, T_2i: Media cues (immigrant ethnicity × story tone)
W_i: Respondent age and income

Estimated coefficients:

Coefficients:
           Value  s.e.     t
tone        0.27  0.32  0.85
eth        -0.33  0.32 -1.02
ppage       0.01  0.02  1.40
ppincimp    0.00  0.03  0.06
tone:eth    0.90  0.46  2.16

Intercepts:
     Value  s.e.     t
1|2  -1.93  0.58 -3.32
2|3  -0.12  0.55 -0.21
3|4   1.12  0.56  2.01

ATE: [Figure: average treatment effect (European/Positive vs. Latino/Negative cue) on "Do you think the number of immigrants from foreign countries should be increased or decreased?", with a point estimate and interval for each response category — Increased a Lot, Increased a Little, Decreased a Little, Decreased a Lot — on a scale from −0.4 to 0.4.]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 102 / 242
Example: Peer Bereavement

Andersen, Silver, Koperwas, Stewart and Kirschbaum (2013):
Study of response by college students to a sequence of 14 peer deaths in one academic year.

Y_i: Severity of acute reaction (1-5 scale)
X_i: gender, number of peers known, media exposure

                  Coef    SE  CI            P-value  RR Strong (95% CI)  RR Extreme (95% CI)
Female            0.89  0.23  (0.44, 1.35)  0        2.71 (1.51, 4.82)   13.69 (2.84, 45.55)
Num. Peers Known  0.54  0.16  (0.23, 0.85)  0        1.43 (1.15, 1.82)   3.53 (1.6, 7.25)
Media Exposure    0.25  0.06  (0.13, 0.36)  0        1.17 (1.08, 1.3)    1.73 (1.31, 2.38)

The Risk Ratio measures the relative probability of being in the outcome category based on different values of the independent variable. Thus the RR for the Strong Reaction category for Female can be understood as

\[
\mathrm{RR}_{\mathrm{Strong}} = \frac{\Pr(\text{Strong Reaction} \mid \text{Female})}{\Pr(\text{Strong Reaction} \mid \text{Male})}
\]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 103 / 242
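A risk ratio like this can be computed from the model's predicted probabilities by plugging in the two covariate profiles. A sketch for an ordered logit, using the Female coefficient from the table but illustrative cutpoints (the paper's cutpoints are not shown on the slide):

```python
import numpy as np

def ologit_probs(xb, psi):
    """Category probabilities from cumulative logits Lambda(psi_j - xb)."""
    cuts = np.concatenate(([-np.inf], np.asarray(psi), [np.inf]))
    return np.diff(1.0 / (1.0 + np.exp(-(cuts - xb))))

psi = np.array([-2.0, -0.5, 1.0, 2.5])   # illustrative cutpoints for the 1-5 scale
beta_female = 0.89                       # Female coefficient from the table above

p_female = ologit_probs(beta_female, psi)  # Female = 1, other covariates at 0
p_male = ologit_probs(0.0, psi)            # Female = 0
rr = p_female / p_male                     # one risk ratio per outcome category
```

With a positive coefficient, the risk ratios exceed 1 for the upper categories and fall below 1 for the lower ones, mirroring the pattern in the table.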
![Page 538: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/538.jpg)
Example: Peer Bereavement
Andersen, Silver, Koperwas, Stewart and Kirschbaum (2013):Study of response by college students to a sequence of 14 peer deaths in oneacademic year.
Yi : Severity of acute reaction (1-5 scale)
Xi : gender, number of peers known, media exposure
Coef SE CI P-value RR Strong (95% CI) RR Extreme (95% CI)Female 0.89 0.23 (0.44, 1.35) 0 2.71 (1.51, 4.82) 13.69 (2.84, 45.55)
Num. Peers Known 0.54 0.16 (0.23, 0.85) 0 1.43 (1.15, 1.82) 3.53 (1.6, 7.25)Media Exposure 0.25 0.06 (0.13, 0.36) 0 1.17 (1.08, 1.3) 1.73 (1.31, 2.38)
The Risk Ratio measures the relative probability of being in the outcome categorybased on different values of the independent variable. Thus the RR for the StrongReaction category for Female can be understood as
RRStrong =Pr(Strong Reaction|Female)
Pr(Strong Reaction|Male)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 103 / 242
![Page 542: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/542.jpg)
Example: Peer Bereavement
Figure: This plot shows the expected probabilities of being in each category of reaction given gender (left) and knowing 1 to 4 people (right), with 95% confidence intervals.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 104 / 242
![Page 543: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/543.jpg)
Ordered Categorical Conclusions
Straightforward to derive from the latent variable representation

Ordered probit is often easier to work with, while ordered logit has a nice interpretation as a proportional odds model:

γij / (1 − γij) = λj exp(xiβ)

where γij is the cumulative probability of category j or below and λj is the baseline odds (taking logs gives the linear form log λj + xiβ). Covariates raise or lower the odds of a response in category j or below.

Visualization and appropriate quantities of interest can be tricky. Let the substance guide you.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 105 / 242
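The proportional-odds mapping can be sketched numerically: cumulative odds λj exp(xβ) give cumulative probabilities, whose successive differences are the per-category probabilities. The baseline odds and coefficient value below are hypothetical, chosen only so the cutpoints are properly ordered.

```python
import math

# Proportional-odds sketch: cumulative probability of category j or below is
# gamma_j = odds_j / (1 + odds_j), with odds_j = lambda_j * exp(x*beta).
# Category probabilities are differences of adjacent cumulative probabilities.
def category_probs(lambdas, xb):
    """lambdas: increasing baseline odds per cutpoint; xb: linear predictor."""
    gammas = [(l * math.exp(xb)) / (1 + l * math.exp(xb)) for l in lambdas]
    gammas.append(1.0)  # Pr(Y <= top category) = 1
    return [gammas[0]] + [gammas[j] - gammas[j - 1] for j in range(1, len(gammas))]

# Hypothetical baseline odds for the 4 cutpoints of a 5-category outcome
lambdas = [0.25, 1.0, 3.0, 9.0]
probs = category_probs(lambdas, xb=0.5)
print([round(p, 3) for p in probs])
```

Note the single coefficient on xβ shifts every cumulative odds by the same factor, which is exactly the proportional-odds restriction.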
![Page 549: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/549.jpg)
1 Binary Outcome Models
2 Quantities of Interest (An Example with Code; Predicted Values; First Differences; General Algorithms)

3 Model Diagnostics for Binary Outcome Models

4 Ordered Categorical

5 Unordered Categorical

6 Event Count Models (Poisson; Overdispersion; Binomial for Known Trials)

7 Duration Models (Exponential Model; Weibull Model; Cox Proportional Hazards Model)
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 106 / 242
![Page 551: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/551.jpg)
Multinomial Logit
Sometimes we encounter unordered categories (e.g., choice of a Ph.D. program: Sociology, Politics, Psychology, Statistics).

We can generalize the two-choice logit model to get the multinomial logit model

πij = Pr(Yi = j | Xi) = exp(Xi⊤βj) / Σ_{k=1}^{J} exp(Xi⊤βk),

where Xi are individual-specific characteristics of unit i and βj is a category-specific set of coefficients (one category omitted for identification).

Multinomial logit also has a latent variable interpretation: make the choice with greatest utility Y∗ij. When the stochastic component of the utility is εij i.i.d. from a type I extreme value distribution, multinomial logit is implied.

Coefficients are relative to a baseline category, so again we want to compute quantities of interest for interpretation.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 107 / 242
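The choice probabilities above are a softmax over category-specific linear predictors; a minimal sketch, with hypothetical coefficient values and the first category's β fixed at zero for identification as noted on the slide:

```python
import math

# Multinomial-logit probabilities: pi_j = exp(x'beta_j) / sum_k exp(x'beta_k).
# Coefficient values are hypothetical, for illustration only.
def mnl_probs(x, betas):
    """x: covariate vector; betas: one coefficient vector per category."""
    utilities = [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]
    m = max(utilities)  # subtract the max before exponentiating, for stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

x = [1.0, 2.0]            # intercept plus one covariate
betas = [[0.0, 0.0],      # baseline category: beta fixed at zero
         [0.5, -0.2],
         [-1.0, 0.8]]
probs = mnl_probs(x, betas)
print([round(p, 3) for p in probs])
```

Because only differences of utilities matter, adding a constant vector to every βj leaves the probabilities unchanged, which is why one category's coefficients must be pinned down.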
![Page 558: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/558.jpg)
Independence of Irrelevant Alternatives
Multinomial logit assumes i.i.d. errors in the latent utility model, which implies that unobserved factors affecting Y∗ij are unrelated to those affecting Y∗ik.

MNL makes the Independence of Irrelevant Alternatives (IIA) assumption:

Pr(Choose j | j or k) / Pr(Choose k | j or k) = Pr(Choose j | j or k or l) / Pr(Choose k | j or k or l)  for any l ∈ {1, ..., J}

That is, the multinomial choice reduces to a series of independent pairwise comparisons

A classical example of IIA violation: the red bus-blue bus problem
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 108 / 242
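The IIA property can be checked numerically: under the logit form, adding a third alternative rescales all choice probabilities but leaves the j:k odds untouched. The latent utility values below are hypothetical.

```python
import math

# Under multinomial logit, the odds of j versus k do not depend on which
# other alternatives are in the choice set (IIA). Utilities are hypothetical.
def softmax(utilities):
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

u_j, u_k, u_l = 1.0, 0.3, -0.5   # hypothetical latent utilities

p2 = softmax([u_j, u_k])         # choice set {j, k}
p3 = softmax([u_j, u_k, u_l])    # choice set {j, k, l}: every probability shrinks...

# ...but the j:k odds ratio is identical in both choice sets: exp(u_j - u_k)
print(p2[0] / p2[1], p3[0] / p3[1])
```

The red bus-blue bus problem is precisely a case where this invariance is implausible: the blue bus should draw riders almost entirely from the red bus, not proportionally from all alternatives.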
![Page 564: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/564.jpg)
Relaxing IIA with Multinomial Probit
To relax IIA, we need to allow the stochastic component of the utility εij to be correlated across choices j for each voter.

Multinomial probit model (MNP):

Y∗i = Xi⊤β + εi,  where εi i.i.d.∼ N(0, ΣJ),  Y∗i = [Y∗i1 · · · Y∗iJ]⊤,  Xi = [Xi1 · · · XiJ]⊤

Restrictions on the level and scale of Y∗i are needed for identification

Computation is difficult because the integral is intractable

Moreover, the number of parameters in ΣJ increases as J gets large, but the data contain little information about ΣJ:

| J                          | 3 | 4  | 5  | 6  | 7  |
|----------------------------|---|----|----|----|----|
| # of elements in ΣJ        | 6 | 10 | 15 | 21 | 28 |
| # of parameters identified | 2 | 5  | 9  | 14 | 20 |
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 109 / 242
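The counts in the table follow simple formulas: a symmetric ΣJ has J(J+1)/2 free elements, and after the level and scale normalizations J(J−1)/2 − 1 covariance parameters remain identified. A quick check reproduces the table:

```python
# Parameter counts for the multinomial probit covariance Sigma_J.
def sigma_elements(J):
    """Free elements of a symmetric J x J matrix."""
    return J * (J + 1) // 2

def sigma_identified(J):
    """Identified covariance parameters after level/scale normalization."""
    return J * (J - 1) // 2 - 1

for J in range(3, 8):
    print(J, sigma_elements(J), sigma_identified(J))
# Reproduces the slide's table: (3, 6, 2), (4, 10, 5), ..., (7, 28, 20)
```

The widening gap between the two rows is the slide's point: most of ΣJ is never identified, and what is identified is estimated from thin information.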
![Page 570: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/570.jpg)
1 Binary Outcome Models
2 Quantities of Interest (An Example with Code; Predicted Values; First Differences; General Algorithms)

3 Model Diagnostics for Binary Outcome Models

4 Ordered Categorical

5 Unordered Categorical

6 Event Count Models (Poisson; Overdispersion; Binomial for Known Trials)

7 Duration Models (Exponential Model; Weibull Model; Cox Proportional Hazards Model)
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 110 / 242
![Page 572: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/572.jpg)
The Poisson Distribution
The Poisson is a discrete probability distribution which gives the probability that some number of events will occur in a fixed period of time. Examples:

1. number of terrorist attacks in a given year

2. number of publications by a professor in a career

3. number of days absent from school for high school sophomores
4. logo for the Stata Press:
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 111 / 242
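The Poisson pmf, Pr(Y = y) = exp(−λ) λ^y / y!, is easy to compute directly; a minimal sketch with an illustrative rate of 2 events per period:

```python
import math

# Poisson pmf: Pr(Y = y) = exp(-lam) * lam**y / y!, for counts y = 0, 1, 2, ...
def poisson_pmf(y, lam):
    return math.exp(-lam) * lam ** y / math.factorial(y)

# With a rate of 2 events per period, the chance of exactly 3 events:
p = poisson_pmf(3, 2.0)
print(round(p, 4))  # 0.1804
```

The mean and variance of this distribution are both λ, a restriction the overdispersion material later in this section relaxes.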
![Page 576: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/576.jpg)
The Poisson Distribution
It’s a discrete probability distribution which gives the probability that somenumber of events will occur in a fixed period of time.Examples:
1. number of terrorist attacks in a given year
2. number of publications by a Professor in a career
3. number of days absent from school for High School Sophomores
4. logo for the Stata Press:
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 111 / 242
![Page 577: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/577.jpg)
Poisson distribution’s first principles:

1. Begin with an observation period and a count point: events accumulate over the observation period, and the total count is observed at the end.
2. The assumptions are about events occurring between the start and the count observation; the process of event generation itself is not observed.
3. Zero events have occurred at the start of the period.
4. We observe only the number of events at the end of the period.
5. No two events can occur at the same time.
6. Pr(event at time t | all events up to time t − 1) is constant for all t.
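These first principles can be simulated directly: if events arrive at a constant rate λ, inter-arrival times are Exponential(λ), and the count observed at the end of a unit-length period behaves like a Pois(λ) draw. A minimal Python sketch (illustrative, not from the slides):

```python
import random

random.seed(0)
lam = 3.0  # constant event rate over the observation period [0, 1]

def count_events(rate):
    """Count events in [0, 1] when inter-arrival times are Exponential(rate)."""
    t, count = 0.0, 0
    while True:
        t += random.expovariate(rate)  # wait for the next event
        if t > 1.0:
            return count
        count += 1

counts = [count_events(lam) for _ in range(100_000)]
print(round(sum(counts) / len(counts), 2))  # sample mean ≈ lam = 3
```

The sample mean of the counts recovers the rate λ, consistent with the Poisson mean derived below.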
The Poisson Distribution

Here is the probability mass function (PMF) for a random variable Y that is distributed Pois(λ):

Pr(Y = y) = (λ^y / y!) e^{−λ}.

Suppose Y ∼ Pois(3). What’s Pr(Y = 4)?

Pr(Y = 4) = (3^4 / 4!) e^{−3} ≈ 0.168.

[Figure: bar plot of the Pois(3) PMF, Pr(Y = y) against y for y = 0, …, 10.]
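A quick sanity check of this arithmetic (in Python for illustration; the course code is in R):

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    """Pr(Y = y) for Y ~ Pois(lam): (lam^y / y!) e^(-lam)."""
    return lam ** y / factorial(y) * exp(-lam)

print(round(poisson_pmf(4, 3), 3))  # 0.168
```

The probabilities also sum to one over y = 0, 1, 2, …, as a proper PMF must.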
The Poisson Distribution

Recall the PMF:

Pr(Y = y) = (λ^y / y!) e^{−λ}.

Using a little bit of power-series trickery (the series for e^λ), it isn’t too hard to show that

E[Y] = ∑_{y=0}^∞ y · (λ^y / y!) e^{−λ} = λ.

It also turns out that Var(Y) = λ, a feature of the model we will discuss later on.
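Both identities can be checked numerically by truncating the infinite sums (a Python sketch, not from the slides):

```python
from math import exp, factorial

def poisson_pmf(y, lam):
    return lam ** y / factorial(y) * exp(-lam)

lam = 3.0
ys = range(150)  # far enough into the tail that truncation error is negligible
mean = sum(y * poisson_pmf(y, lam) for y in ys)
var = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in ys)
print(round(mean, 6), round(var, 6))  # both ≈ 3.0 = lam
```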
The Poisson Distribution
Poisson data arise when some discrete event occurs (possibly multiple times) at a constant rate over some fixed time period.

This constant-rate assumption can be restated: the probability of an event occurring at any moment is independent of whether an event has occurred at any other moment.

The derivation of the distribution involves some other technical first principles, but this is the most important one.
Connections to Distributions We Have Seen

Take Binom(n, p) and let n → ∞ and p → 0 while holding np = µ constant: the binomial converges to Pois(µ).

If the number of arrivals in the time interval [0, t] follows a Pois(λt), then the wait times between arrivals are distributed Exponential with mean 1/λ.

If X ∼ Pois(λ) and Y_j | (X = k) ∼ Multinom(k, p_j), then marginally each Y_j ∼ Pois(λ p_j).

If X_i ∼ Pois(λ_i) for i = 1, …, n are independent, then Y = ∑_{i=1}^n X_i ∼ Pois(∑_{i=1}^n λ_i).
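The first connection can be seen numerically: with np = µ held fixed, the binomial PMF approaches the Pois(µ) PMF as n grows (Python sketch for illustration):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    return mu ** k / factorial(k) * exp(-mu)

mu = 2.0
gaps = {}
for n in (10, 100, 10_000):
    p = mu / n  # hold np = mu fixed as n grows
    gaps[n] = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(10))
    print(n, round(gaps[n], 6))  # the gap shrinks toward zero
```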
The Poisson regression model:

Y_i ∼ Poisson(y_i | λ_i)
λ_i = exp(x_i β)

and, as usual, Y_i and Y_j are independent for all i ≠ j, conditional on X.

The probability density of all the data:

P(y | λ) = ∏_{i=1}^n e^{−λ_i} λ_i^{y_i} / y_i!

The log-likelihood:

ln L(β | y) = ∑_{i=1}^n { y_i ln(λ_i) − λ_i − ln(y_i!) }
            = ∑_{i=1}^n { (x_i β) y_i − exp(x_i β) − ln(y_i!) }
            ≐ ∑_{i=1}^n { (x_i β) y_i − exp(x_i β) }
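The last step — dropping ln(y_i!) — is justified because that term does not involve β, so it only shifts the log-likelihood by a constant. A small Python sketch with made-up toy data (X and y here are hypothetical) verifies this:

```python
from math import exp, log, factorial

# hypothetical toy data: rows x_i = (1, u_i) with intercept, and counts y_i
X = [(1.0, 0.1), (1.0, 0.2), (1.0, 0.3), (1.0, 0.4)]
y = [2, 3, 5, 7]

def loglik_full(beta):
    total = 0.0
    for (x0, x1), yi in zip(X, y):
        eta = beta[0] * x0 + beta[1] * x1      # x_i beta, so lambda_i = exp(eta)
        total += yi * eta - exp(eta) - log(factorial(yi))
    return total

def loglik_kernel(beta):
    # drops the ln(y_i!) term, which does not depend on beta
    total = 0.0
    for (x0, x1), yi in zip(X, y):
        eta = beta[0] * x0 + beta[1] * x1
        total += yi * eta - exp(eta)
    return total

# the difference is the same constant at any beta
print(round(loglik_kernel((0, 0)) - loglik_full((0, 0)), 6))
print(round(loglik_kernel((1, 2)) - loglik_full((1, 2)), 6))
```

Both prints show the same number: ∑_i ln(y_i!), so maximizing either version gives the same β̂.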
Comparing with the Linear Model
Example: Civil Conflict in Northern Ireland
Background: a conflict largely along religious lines about the status of Northern Ireland within the United Kingdom, and the division of resources and political power between Northern Ireland’s Protestant (mainly Unionist) and Catholic (mainly Republican) communities.

The data: the number of Republican deaths for every month from 1969, the beginning of sustained violence, to 2001 (at which point most organized violence had subsided). Also, the unemployment rates in the two main religious communities.
Example: Civil Conflict in Northern Ireland
The model: Let Y_i = # of Republican deaths in a month. Our sole predictor for the moment will be U^C = the unemployment rate among Northern Ireland’s Catholics.

Our model is then:

Y_i ∼ Pois(λ_i)

and

λ_i = E[Y_i | U^C_i] = exp(β_0 + β_1 · U^C_i).
Estimate (just as we have all along!)
mod <- glm(repdeaths ~ cathunemp,
           data = troubles, family = poisson(link = "log"))
> summary(mod)$coefficients
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.295875 0.1805327 7.178064 7.070547e-13
cathunemp 1.406498 0.6689819 2.102445 3.551432e-02
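Plugging the reported coefficients back in gives the fitted rate at, say, U^C = 0.2 (a Python check of the arithmetic; the slides themselves use R):

```python
from math import exp

# coefficients from summary(mod) above
b0, b1 = 1.295875, 1.406498

uc = 0.2  # Catholic unemployment rate of 20%
lam = exp(b0 + b1 * uc)
print(round(lam, 2))  # ≈ 4.84 expected Republican deaths per month
```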
Our fitted model
λi = E[Yi | UCi] = exp(1.296 + 1.407 · UCi).
[Figure: scatterplot of repdeaths (0–50) against cathunemp (0.10–0.40) with the fitted Poisson regression curve]
Some fitted and predicted values

Suppose UC is equal to .2.
library(MASS)  # for mvrnorm
mod.coef <- coef(mod); mod.vcov <- vcov(mod)
beta.draws <- mvrnorm(10000, mod.coef, mod.vcov)
lambda.draws <- exp(beta.draws[,1] + .2*beta.draws[,2])
outcome.draws <- rpois(10000, lambda.draws)
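The same two-step simulation — estimation uncertainty via draws of β, then fundamental uncertainty via the Poisson — can be sketched in Python. The coefficient vector and covariance matrix below are illustrative stand-ins for coef(mod) and vcov(mod), not the actual fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for coef(mod) and vcov(mod)
mod_coef = np.array([1.296, 1.407])
mod_vcov = np.array([[0.1805**2, -0.11],
                     [-0.11,     0.6690**2]])

# Estimation uncertainty: draws from the sampling distribution of beta-hat
beta_draws = rng.multivariate_normal(mod_coef, mod_vcov, size=10_000)

# Expected values at UC = .2 (systematic component only)
lambda_draws = np.exp(beta_draws[:, 0] + 0.2 * beta_draws[:, 1])

# Predicted values: add fundamental (Poisson) uncertainty
outcome_draws = rng.poisson(lambda_draws)
```

Histograms of lambda_draws and outcome_draws reproduce the expected-value and predicted-value panels: the predicted values are far more spread out because they carry the Poisson noise on top of the estimation uncertainty.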
[Figure: histogram of 10,000 simulated expected values E[Y|U] (density, roughly 4.4–5.2) and histogram of predicted values Y|U (frequency, 0–15)]
Overdispersion
36% of observations lie outside the 2.5% or 97.5% quantile of the Poisson distribution that we are alleging generated them.
[Figure: the same scatterplot of repdeaths (0–50) against cathunemp (0.10–0.40), highlighting how many observations fall outside the pointwise 95% Poisson interval]
Overdispersion in Poisson Model
The Poisson model assumes E(Yi | Xi) = V(Yi | Xi)

But for many count data, E(Yi | Xi) < V(Yi | Xi)

Potential sources of overdispersion:
1. unobserved heterogeneity
2. clustering
3. contagion or diffusion
4. (classical) measurement error

Underdispersion can also occur, but it is rare.

One solution is to modify the Poisson model by assuming:

E(Yi | Xi) = µi = exp(Xi⊤β) and V(Yi | Xi) = Vi = φµi

This is called the overdispersed Poisson regression model.

When φ > 1, this corresponds to a type of the negative binomial regression model (more on this later).
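A rough diagnostic for overdispersion is the Pearson dispersion statistic, which is about 1 under a correctly specified Poisson model and exceeds 1 when unobserved heterogeneity inflates the variance. A Python sketch with simulated counts (all values illustrative, not the troubles data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate counts with unobserved heterogeneity: a gamma-distributed
# rate multiplier inflates the variance relative to Poisson
mu = 5.0
n = 2_000
rates = mu * rng.gamma(shape=1.0, scale=1.0, size=n)  # mean mu, extra spread
y = rng.poisson(rates)

# Pearson dispersion estimate (here the true mean mu is known):
# phi is about 1 under a correct Poisson model, > 1 under overdispersion
phi_hat = np.mean((y - mu) ** 2 / mu)
print(phi_hat > 1)  # True: clear overdispersion
```

In practice phi is estimated from residuals of the fitted model; here Var(Yi) = µ + µ² = 30 while the Poisson would imply 5, so the statistic lands far above 1.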
Derivation as a Gamma-Poisson Mixture
Here's the new stochastic component:

Yi | ςi ∼ Poisson(ςiλi)
ςi ∼ (1/θ) Gamma(θ)

Note that Gamma(θ) implicitly has scale parameter 1, so its mean is θ. This means that (1/θ) Gamma(θ) has mean 1, and so Poisson(ςiλi) has mean λi.
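This mixture is easy to verify by simulation: draw the multiplicative heterogeneity ςi, then draw Poisson counts around ςiλi, and check the resulting mean and variance. A Python sketch with illustrative values of λ and θ:

```python
import numpy as np

rng = np.random.default_rng(2)

lam, theta = 4.0, 0.85
n = 200_000

# varsigma ~ (1/theta) * Gamma(theta): shape theta, scale 1/theta, so mean 1
varsigma = rng.gamma(shape=theta, scale=1.0 / theta, size=n)
y = rng.poisson(varsigma * lam)

print(y.mean())  # close to lam
print(y.var())   # close to lam + lam**2 / theta (the negative binomial variance)
```

The simulated variance exceeds the mean by λ²/θ, which is exactly the overdispersion the negative binomial builds in; as θ → ∞ the heterogeneity vanishes and the mixture collapses back to Poisson(λ).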
Derivation as a Gamma-Poisson Mixture
Using a similar approach to that described in UPM pgs. 51–52, we can derive the marginal distribution of Y as

Yi ∼ Negbin(λi, θ)

where

fnb(yi | λi, θ) = [Γ(θ + yi) / (yi! Γ(θ))] · [λi^yi θ^θ / (λi + θ)^(θ+yi)]

Notes:

1. E[Yi] = λi and Var(Yi) = λi + λi²/θ. What values of θ would be evidence against overdispersion?

2. We still have the same old systematic component: λi = exp(Xiβ).
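The moments in Note 1 can be checked against scipy's negative binomial. Under the mapping n = θ, p = θ/(θ + λ) — an assumption about parameterizations worth double-checking against the scipy documentation — scipy's nbinom matches Negbin(λ, θ); λ and θ below are illustrative:

```python
from scipy import stats

lam, theta = 4.0, 0.85

# scipy's nbinom(n, p): n = theta, p = theta / (theta + lam)
dist = stats.nbinom(theta, theta / (theta + lam))
mean, var = dist.stats(moments="mv")

print(float(mean))  # lam
print(float(var))   # lam + lam**2 / theta
```

Since Var(Yi) = λi + λi²/θ, a very large estimated θ (the extra term vanishing) would be evidence against overdispersion — the model then reduces to the plain Poisson.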
Estimates
library(Zelig)  # provides zelig()
mod <- zelig(repdeaths ~ cathunemp, data = troubles,
             model = "negbin")
summary(mod)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.2959 0.1805 7.178 7.07e-13 ***
cathunemp 1.4065 0.6690 2.102 0.0355 *
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
...
Theta: 0.8551
Std. Err.: 0.0754
Overdispersion Handled!

5.68% of observations lie at or above the 95% quantile of the Negative Binomial distribution that we are alleging generated them.
[Figure: the same scatterplot of repdeaths (0–50) against cathunemp (0.10–0.40), now with far fewer observations outside the pointwise Negative Binomial interval]
Binomial Regression Model
Sometimes count data have a known upper bound Mi

Examples:

I # of votes for a third party candidate in a precinct with population Mi
I # of children who drop out of high school in a family with Mi children

If the Mi "trials" are all independent, we have the binomial distribution:

p(Yi | Mi, πi) = (Mi choose Yi) πi^Yi (1 − πi)^(Mi−Yi)

An exponential family with E(Yi) = Miπi and V(Yi) = Miπi(1 − πi)

We can thus consider a GLM, the binomial regression model, by setting πi = g⁻¹(Xi⊤β)

Common links: logit (canonical), probit, cloglog

Note that if Mi = 1 for all i, this reduces to a binary outcome model
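A minimal sketch of the binomial GLM's systematic component under the canonical logit link, computing the implied mean and variance (the coefficients, covariate value, and trial count below are all hypothetical):

```python
import math

def inv_logit(eta):
    """Inverse of the canonical logit link: g^{-1}(eta) = 1 / (1 + e^{-eta})."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical linear predictor and number of trials
beta0, beta1, x, M = -2.0, 3.0, 0.5, 100

pi = inv_logit(beta0 + beta1 * x)   # pi_i = g^{-1}(Xi' beta)
mean = M * pi                       # E(Yi) = Mi * pi_i
var = M * pi * (1 - pi)             # V(Yi) = Mi * pi_i * (1 - pi_i)
```

Note the variance is tied to the mean through πi: unlike the Gaussian, there is no free variance parameter, which is why overdispersion arises so often here as well.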
Estimation and Overdispersion
The log-likelihood:

ℓ(β | Xi) = Σᵢ₌₁ᴺ { Yi log(πi / (1 − πi)) + Mi log(1 − πi) + log (Mi choose Yi) }

Use the standard MLE/GLM machinery to estimate β and calculate quantities of interest

Data are often overdispersed due to dependence between trials

Modify the variance function by including a dispersion parameter:

E(Yi | Xi) = Miπi and V(Yi | Xi) = φMiπi(1 − πi)

Estimate β and φ via QMLE

This is a GLM, so we have the same robustness properties as the Poisson case
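The log-likelihood above can be written out directly, term by term. A Python sketch with a tiny made-up dataset (the log binomial coefficient is computed via scipy.special.gammaln):

```python
import numpy as np
from scipy.special import gammaln

def binom_loglik(beta, X, y, M):
    """Binomial log-likelihood with a logit link, matching the slide:
    sum_i { y_i log(pi/(1-pi)) + M_i log(1-pi) + log C(M_i, y_i) }."""
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    log_choose = gammaln(M + 1) - gammaln(y + 1) - gammaln(M - y + 1)
    return np.sum(y * np.log(pi / (1 - pi)) + M * np.log(1 - pi) + log_choose)

# Made-up data: intercept plus one covariate, 10 trials per observation
X = np.array([[1.0, 0.1], [1.0, 0.3], [1.0, 0.5]])
y = np.array([2.0, 5.0, 9.0])
M = np.array([10.0, 10.0, 10.0])

print(binom_loglik(np.array([-1.0, 2.0]), X, y, M))
```

With the logit link, log(πi/(1 − πi)) is just the linear predictor Xi⊤β, which is what makes the logit the canonical link for this family; maximizing this function over β is what glm() does under the hood.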
Example: Butterfly Ballot in 2000 Presidential Election

Wand et al. (2001): Did the butterfly ballot give the election to Bush?
Yi : Number of votes cast for Buchanan in county i
Xi : Past Republican & third-party vote shares, demographic covariates
Wand et al. examine residuals to see how aberrant the vote share was in Palm Beach
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 133 / 242
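The residual check can be mimicked in a minimal way: fit a count model, then flag counties whose dispersion-adjusted Pearson residuals are extreme. A Python sketch with entirely made-up fitted values (this is not the authors' actual specification; mu_hat and phi_hat would come from a fitted overdispersed count model):

```python
import math

# hypothetical observed Buchanan counts and fitted means for five counties
y       = [95, 110, 260, 3407, 150]   # index 3 plays the role of Palm Beach
mu_hat  = [100, 105, 250, 300, 160]   # fitted E(Y | X) from some count model
phi_hat = 4.0                         # estimated dispersion parameter

# Pearson residuals under the overdispersed-Poisson variance phi * mu
resid = [(yi - mi) / math.sqrt(phi_hat * mi) for yi, mi in zip(y, mu_hat)]
outliers = [i for i, r in enumerate(resid) if abs(r) > 3]
print(outliers)  # → [3]
```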
Fitting GLMs in R
| Family | Canonical Link (Default) | Variance | Model |
|---|---|---|---|
| gaussian | identity | φ (= σ²) | normal linear |
| binomial | logit | µ(1−µ) | logit, probit, binomial |
| poisson | log | µ | Poisson |
| quasibinomial | logit | φµ(1−µ) | overdispersed binomial |
| quasipoisson | log | φµ | overdispersed Poisson |
Other choices not covered in this course: Gamma, inverse.gaussian
You can roll your own GLM using the quasi family
The negative binomial regression (NB2) can be fitted via the glm.nb function in MASS
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 134 / 242
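The quasipoisson row replaces the Poisson variance µ with φµ; in practice φ is estimated as the Pearson statistic divided by the residual degrees of freedom. A minimal Python sketch of that estimate on simulated overdispersed counts (the mean, gamma shape, and sample size are arbitrary choices; an intercept-only model is assumed, so the fitted mean is just ȳ):

```python
import math
import random

random.seed(42)

def rpois(lam):
    """Draw a Poisson(lam) variate (Knuth's product method)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

# overdispersed counts: Poisson rates drawn from a Gamma with mean mu,
# so Var(Y) = mu + mu^2/shape > mu
mu, shape, n = 5.0, 2.0, 5000
y = [rpois(random.gammavariate(shape, mu / shape)) for _ in range(n)]

# dispersion estimate: Pearson statistic / residual degrees of freedom
ybar = sum(y) / n
phi_hat = sum((yi - ybar) ** 2 / ybar for yi in y) / (n - 1)
print(round(phi_hat, 2))  # roughly (mu + mu**2/shape) / mu = 3.5 here
```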
Other Models
Note that there are many other count models for different types of situations:
Generalized Event Count (GEC) Model
Zero-Inflated Poisson
Zero-Inflated Negative Binomial
Zero-Truncated Models
Hurdle Models
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 135 / 242
1 Binary Outcome Models
2 Quantities of Interest: An Example with Code; Predicted Values; First Differences; General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models: Poisson; Overdispersion; Binomial for Known Trials
7 Duration Models: Exponential Model; Weibull Model; Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 136 / 242
What are duration models used for?
Survival models = duration models = event history models
Dependent variable Y is the duration of time that observations spend in some state before experiencing an event (aka failure, death)

Used in biostatistics and engineering: e.g. how long until a patient dies

Models the relationship between duration and covariates (how does an increase in X affect the duration Y?)

In social science, used in questions such as how long a coalition government lasts, how long until someone gets a job, how a program extends life expectancy

Observations should be measured in the same (temporal) units, i.e. don't have some units' duration measured in days and others in months
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 137 / 242
Why not just use OLS?
Three reasons:
1. The normal linear model assumes Y is Normal, but duration dependent variables are always positive (number of years, etc.)
2. Duration models can handle censoring
[Figure: "Observation 3 is Censored" — observations' durations plotted against time; observation 3's spell has no observed endpoint]

Observation 3 is censored in that it has not experienced the event at the time we stop collecting data, so we don't know its true duration
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 138 / 242
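How censoring enters the likelihood can be made concrete with the simplest duration model: under an exponential model, an uncensored spell contributes the density f(t) = λe^(−λt), a censored spell contributes the survivor function S(t) = e^(−λt), and the MLE has the closed form λ̂ = (number of events)/(total time at risk). A Python sketch with made-up durations:

```python
# spell lengths and censoring indicators (1 = event observed, 0 = censored)
t = [2.0, 3.5, 4.0, 1.2, 2.7]   # hypothetical durations
d = [1, 1, 0, 1, 1]             # the spell of length 4.0 is censored

# exponential MLE: events / total exposure
# (censored spells add exposure but contribute no event)
lam_hat = sum(d) / sum(t)
print(round(lam_hat, 4))  # → 0.2985
```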
Why not use OLS?
3. Duration models can handle time-varying covariates
If Y is duration of a regime, GDP may change during the duration of the regime

OLS cannot handle multiple values of GDP per observation

You can set up data in a special way with duration models such that you can accommodate time-varying covariates
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 139 / 242
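That "special way" of setting up the data is usually episode splitting: break each spell into intervals on which the covariate is constant, with only the final interval able to carry the event indicator. A minimal sketch of the idea (the function name and record layout here are my own, not a standard API):

```python
def split_episodes(spell_end, event, changes):
    """Split one spell [0, spell_end) at covariate-change times.

    changes: list of (time, covariate_value) pairs; the first time must be 0.
    Returns (start, stop, value, event_flag) rows; only the final row
    can carry the event indicator.
    """
    times = [t for t, _ in changes] + [spell_end]
    rows = []
    for (t0, val), t1 in zip(changes, times[1:]):
        rows.append((t0, t1, val, 0))
    if rows:
        start, stop, val, _ = rows[-1]
        rows[-1] = (start, stop, val, event)
    return rows

# a regime lasting 7 years whose GDP covariate changes at year 3
print(split_episodes(7, 1, [(0, 100), (3, 120)]))
# → [(0, 3, 100, 0), (3, 7, 120, 1)]
```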
Duration/Survival Model Jargon
Let T denote a continuous positive random variable representing the duration/survival times (T = Y)
T has a probability density function f (t)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 140 / 242
Duration/Survival Model Jargon

F(t): the CDF of f(t), ∫_0^t f(u) du = P(T ≤ t), which is the probability of an event occurring before (or at exactly) time t

Survivor function: the probability of surviving (i.e. no event occurring) until at least time t: S(t) = 1 − F(t) = P(T > t)
Eye of the Tiger: 1982 album by the band Survivor, which reached number 2 on the US Billboard 200 chart.
[Figure: CDF F(t) (left) and survivor function S(t) (right) plotted against t]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 141 / 242
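These identities are easy to check numerically for a concrete density; a Python sketch using the exponential distribution (the rate λ = 0.5 is an arbitrary choice):

```python
import math

lam = 0.5  # arbitrary exponential rate

def f(t):  # density
    return lam * math.exp(-lam * t)

def F(t):  # CDF: P(T <= t)
    return 1 - math.exp(-lam * t)

def S(t):  # survivor function: P(T > t) = 1 - F(t)
    return 1 - F(t)

# a numerical integral of f over [0, t] should match F(t)
t, n = 2.0, 100000
integral = sum(f(i * t / n) * (t / n) for i in range(n))
print(round(integral, 4), round(F(t), 4))  # → 0.6321 0.6321
```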
Duration/Survival Model Jargon
Hazard rate (or hazard function): h(t) is roughly the probability of an event at time t given survival up to time t

h(t) = P(t ≤ T < t + τ | T ≥ t)
     = P(event at t | survival up to t)
     = P(survival up to t | event at t) P(event at t) / P(survival up to t)
     = P(event at t) / P(survival up to t)
     = f(t) / S(t)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 142 / 242
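The identity h(t) = f(t)/S(t) can be verified against the conditional-probability definition (dividing by τ to get a rate for small τ); for the exponential distribution the hazard is the constant λ. A Python sketch (λ = 0.5 is arbitrary):

```python
import math

lam = 0.5

def S(t):  # survivor function of the exponential
    return math.exp(-lam * t)

def f(t):  # density of the exponential
    return lam * math.exp(-lam * t)

def hazard(t, tau=1e-6):
    # P(t <= T < t + tau | T >= t) / tau  ~  f(t) / S(t) for small tau
    return (S(t) - S(t + tau)) / (tau * S(t))

for t in (0.5, 1.0, 3.0):
    print(round(hazard(t), 4), round(f(t) / S(t), 4))
# each line prints the constant hazard lam twice
```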
![Page 693: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/693.jpg)
Duration/Survival Model Jargon
Hazard rate (or hazard function): h(t) is roughly the probability of anevent at time t given survival up to time t
h(t) = P(t ≤ T < t + τ |T ≥ t)
= P(event at t|survival up to t)
=P(survival up to t|event at t)P(event at t)
P(survival up to t)
=P(event at t)
P(survival up to t)
=f (t)
S(t)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 142 / 242
Relating the Density, Survival, and Hazard Functions

h(t) = f(t) / S(t)

implies

f(t) = h(t) · S(t)

(density function = hazard function · survival function)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 143 / 242
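As a quick numerical check (a sketch not from the slides, using NumPy; the rate value is illustrative), the identity can be verified for an exponential duration, whose hazard is constant at the rate λ:

```python
import numpy as np

lam = 2.0                      # rate parameter (illustrative value)
t = np.linspace(0.1, 5, 50)    # grid of duration times

f = lam * np.exp(-lam * t)     # density f(t)
S = np.exp(-lam * t)           # survival function S(t) = 1 - F(t)
h = f / S                      # hazard h(t) = f(t) / S(t)

# For the exponential, the hazard is constant at lam...
assert np.allclose(h, lam)
# ...and the identity f(t) = h(t) * S(t) holds on the whole grid.
assert np.allclose(f, h * S)
```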
Modeling with Covariates

We can model the mean of the duration times as a function of covariates via a link function g(·):

g(E[Ti]) = Xiβ

and estimate β via maximum likelihood.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 144 / 242
How to estimate parametric survival models

They might seem fancy and complicated, but we estimate these models the same as every other model!

1. Make an assumption that Ti follows a specific distribution f(t) (i.e. choose the stochastic component).
2. Model the hazard rate with covariates (i.e. specify the systematic component).
3. Estimate via maximum likelihood.
4. Interpret quantities of interest (hazard ratios, expected survival times).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 145 / 242
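A minimal sketch of steps 1–3 for the simplest case (exponential durations, no censoring, no covariates; the simulated data and all values are illustrative, not from the slides). Here the log-likelihood Σ(log λ − λ ti) has the closed-form maximizer λ̂ = n / Σ ti:

```python
import numpy as np

rng = np.random.default_rng(0)
lam_true = 0.5                              # true rate (simulation setting)
t = rng.exponential(1 / lam_true, 5000)     # step 1: T_i ~ Exponential with rate lam

# Step 3: maximize log L(lam) = sum(log lam - lam * t_i); closed form:
lam_hat = len(t) / t.sum()

# Step 4: a quantity of interest, expected survival time E(T) = 1 / lam
print(lam_hat, 1 / lam_hat)                 # should be near 0.5 and 2.0
```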
What’s Special About Survival Models?

Censoring:

... it makes modeling a little tricky. But not too tricky.

[Figure: observations plotted against time, titled "Observation 3 is Censored"]

Observation 3 is censored because it had not experienced the event when we collected the data, so we don’t know its true duration.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 146 / 242
Censoring

Observations that are censored give us information about how long they survive.

For censored observations, we know that they survived at least until some observed time, tc, and that the true duration, t, is greater than or equal to tc.

For each observation, let’s create a censoring indicator variable, ci, such that

ci = 1 if not censored; 0 if censored

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 147 / 242
Censoring

We can incorporate the information from the censored observations into the likelihood function.

L = ∏_{i=1}^n [f(ti)]^{ci} [P(Ti ≥ ti)]^{1−ci}    (uncensored term / censored term)
  = ∏_{i=1}^n [f(ti)]^{ci} [1 − F(ti)]^{1−ci}
  = ∏_{i=1}^n [f(ti)]^{ci} [S(ti)]^{1−ci}

So uncensored observations contribute to the density function and censored observations contribute to the survivor function in the likelihood.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 148 / 242
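A sketch of this censored likelihood in practice, assuming exponential durations and simulated uniform censoring times (all names, settings, and the use of NumPy/SciPy are illustrative, not from the slides). For the exponential, the censored MLE also has a closed form, events / total exposure, which the numerical optimizer should recover:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
lam_true = 1.0
n = 4000
t_event = rng.exponential(1 / lam_true, n)   # latent event times
t_cens = rng.uniform(0, 3, n)                # censoring times (illustrative)
t = np.minimum(t_event, t_cens)              # observed duration
c = (t_event <= t_cens).astype(float)        # c_i = 1 if not censored

def neg_log_lik(lam):
    # log L = sum c_i * log f(t_i) + (1 - c_i) * log S(t_i)
    #       = sum c_i * (log lam - lam * t_i) + (1 - c_i) * (-lam * t_i)
    return -(np.sum(c * np.log(lam)) - lam * t.sum())

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10), method="bounded")
lam_hat = res.x

# Matches the closed form for the exponential: events / total exposure.
assert np.isclose(lam_hat, c.sum() / t.sum(), atol=1e-3)
```

Dropping the censored observations, or treating their censoring times as event times, would bias λ̂; keeping them in through S(ti) uses exactly the information they carry.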
The Poisson Process

- Popular example of a stochastic process
- Principles of a Poisson process:
  - Independent increments: the numbers of events occurring in two disjoint intervals are independent
  - Stationary increments: the probability distribution of the number of occurrences depends only on the length of the interval (because of the common rate)
- Events occur at rate λ (expected occurrences per unit of time)
- Nτ = number of arrivals in a time period of length τ
  - Nτ ∼ Poisson(λτ)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 149 / 242
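A small simulation sketch (illustrative, not from the slides) that checks Nτ ∼ Poisson(λτ) by summing exponential inter-arrival times and counting arrivals in a window of length τ:

```python
import numpy as np

rng = np.random.default_rng(2)
lam, tau, n_sims = 3.0, 2.0, 20000   # rate, window length, replications (illustrative)

# Sum Expo(lam) inter-arrival times to get arrival times, then count how many
# arrivals land in [0, tau]; 40 draws per replication is far more than needed.
arrivals = np.cumsum(rng.exponential(1 / lam, size=(n_sims, 40)), axis=1)
counts = (arrivals <= tau).sum(axis=1)

# If N_tau ~ Poisson(lam * tau), mean and variance should both be near 6.
print(counts.mean(), counts.var())
```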
The Poisson Process

- The exponential distribution measures the times between events in a Poisson process
- T = time to wait until the next event in a Poisson process with rate λ
- T ∼ Expo(λ)
- Memorylessness property: how much you have waited already is irrelevant

P(T > t + k | T > t) = P(T > k)
P(T > 3 + 5 | T > 3) = P(T > 5)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 150 / 242
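Memorylessness is easy to check numerically; a sketch using SciPy (the rate value is illustrative; note that scipy.stats.expon is parameterized by scale = 1/λ):

```python
from scipy.stats import expon

lam = 0.7                      # rate (illustrative)
T = expon(scale=1 / lam)       # scipy uses the scale parameterization

# P(T > 3 + 5 | T > 3) = P(T > 8) / P(T > 3) should equal P(T > 5).
cond = T.sf(8) / T.sf(3)       # sf is the survival function P(T > t)
print(cond, T.sf(5))
```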
Two Possible Parameterizations of the Exponential Model

λi > 0 is the rate parameter:

Ti ∼ Exponential(λi)
f(ti) = λi exp(−λi ti)
E(Ti) = 1/λi

θi > 0 is the scale parameter (θi = 1/λi):

Ti ∼ Exponential(θi)
f(ti) = (1/θi) exp(−ti/θi)
E(Ti) = θi

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 151 / 242
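A sketch confirming that the two parameterizations describe the same distribution (the rate value is illustrative; scipy.stats.expon happens to use the scale form):

```python
import numpy as np
from scipy.stats import expon

lam = 2.0
theta = 1 / lam                # scale = 1 / rate
t = np.linspace(0.01, 4, 100)

pdf_rate = lam * np.exp(-lam * t)             # f(t) = lam * exp(-lam * t)
pdf_scale = (1 / theta) * np.exp(-t / theta)  # f(t) = (1/theta) * exp(-t/theta)

# The two parameterizations give identical densities...
assert np.allclose(pdf_rate, pdf_scale)
assert np.allclose(pdf_rate, expon(scale=theta).pdf(t))
# ...and the mean is theta = 1/lam under either one.
assert np.isclose(expon(scale=theta).mean(), 1 / lam)
```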
![Page 737: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/737.jpg)
Two Possible Parameterizations of the Exponential Model
λi > 0 is the rate parameter
Ti ∼ Exponential(λi )
f (ti ) = λie−λi ti
E (Ti ) =1
λi
θi > 0 is scale parameter (θi = 1λi
)
Ti ∼ Exponential(θi )
f (ti ) =1
θie− tiθi
E (Ti ) = θi
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 151 / 242
![Page 738: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/738.jpg)
Two Possible Parameterizations of the Exponential Model
λi > 0 is the rate parameter
Ti ∼ Exponential(λi )
f (ti ) = λie−λi ti
E (Ti ) =1
λi
θi > 0 is scale parameter (θi = 1λi
)
Ti ∼ Exponential(θi )
f (ti ) =1
θie− tiθi
E (Ti ) = θi
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 151 / 242
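The two parameterizations describe the same distribution. A minimal sketch (not from the slides; the λ value is arbitrary) that checks E(T_i) = 1/λ_i = θ_i by simulation, noting that NumPy's exponential sampler takes the scale, not the rate:

```python
import numpy as np

rng = np.random.default_rng(0)

lam = 0.5          # rate parameter lambda_i (arbitrary choice)
theta = 1 / lam    # equivalent scale parameter theta_i = 1/lambda_i

# NumPy parameterizes the exponential by the SCALE, not the rate.
draws = rng.exponential(scale=theta, size=1_000_000)

# The sample mean should be close to E(T) = theta = 1/lam = 2.
print(draws.mean())
```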
The Exponential Model

[Figure: density of the Exponential(1) distribution over t ∈ [0, 6], with E(T) marked.]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 152 / 242
Link Functions

If you use a rate parameterization with λ_i:

    E(T_i) = 1/λ_i = 1/exp(x_iβ)

    Positive β implies that expected duration time decreases as x increases.

If you use a scale parameterization with θ_i:

    E(T_i) = θ_i = exp(x_iβ)

    Positive β implies that expected duration time increases as x increases.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 153 / 242
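The opposite signs of the two links can be seen directly. A small sketch (the β and x values below are hypothetical):

```python
import numpy as np

beta = 0.8                      # hypothetical positive coefficient
x = np.array([0.0, 1.0, 2.0])   # hypothetical covariate values

rate_link_ET = 1 / np.exp(x * beta)   # E(T) = 1/exp(x*beta): decreasing in x
scale_link_ET = np.exp(x * beta)      # E(T) = exp(x*beta): increasing in x

print(rate_link_ET)
print(scale_link_ET)
```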
Hazard Function for Rate Parametrization

For T_i ∼ Exponential(λ_i):

    f(t) = λ_i e^{−λ_i t}

    S(t) = 1 − F(t)
         = 1 − (1 − e^{−λ_i t})
         = e^{−λ_i t}

    h(t) = f(t)/S(t)
         = λ_i e^{−λ_i t} / e^{−λ_i t}
         = λ_i

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 154 / 242
Hazard Function for Scale Parametrization

For T_i ∼ Exponential(θ_i):

    f(t) = (1/θ_i) exp[−t/θ_i]

    S(t) = 1 − F(t)
         = 1 − (1 − exp[−t/θ_i])
         = exp[−t/θ_i]

    h(t) = f(t)/S(t)
         = (1/θ_i) exp[−t/θ_i] / exp[−t/θ_i]
         = 1/θ_i

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 155 / 242
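The derivation implies h(t) is the same at every t. A quick numerical check (sketch; the θ value and grid of t are arbitrary):

```python
import numpy as np

theta = 2.0                            # arbitrary scale parameter
t = np.linspace(0.1, 10.0, 50)         # grid of durations

f = (1 / theta) * np.exp(-t / theta)   # density f(t)
S = np.exp(-t / theta)                 # survival function S(t)
h = f / S                              # hazard h(t) = f(t)/S(t)

# The hazard is flat: equal to 1/theta at every t.
print(np.allclose(h, 1 / theta))
```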
Let’s work with the scale parametrization

Note that h(t) = 1/θ_i, which does not depend on t!

    ▸ The exponential model thus assumes a flat hazard: every unit/individual has its own hazard rate, but it does not change over time.
    ▸ Connected to the memorylessness property of the exponential distribution.

Modeling h(t) with covariates:

    h(t) = 1/θ_i = exp[−x_iβ]

Positive β implies that the hazard decreases and average survival time increases as x increases.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 156 / 242
Estimation via ML:

With censoring indicator c_i (c_i = 1 if observation i is censored) and θ_i = exp(x_iβ):

    L = ∏_{i=1}^{n} [f(t_i)]^{1−c_i} [1 − F(t_i)]^{c_i}

      = ∏_{i=1}^{n} [(1/θ_i) e^{−t_i/θ_i}]^{1−c_i} [e^{−t_i/θ_i}]^{c_i}

    ℓ = ∑_{i=1}^{n} (1 − c_i)(ln(1/θ_i) − t_i/θ_i) + c_i(−t_i/θ_i)

      = ∑_{i=1}^{n} (1 − c_i)(ln e^{−x_iβ} − e^{−x_iβ} t_i) + c_i(−e^{−x_iβ} t_i)

      = ∑_{i=1}^{n} (1 − c_i)(−x_iβ − e^{−x_iβ} t_i) − c_i(e^{−x_iβ} t_i)

      = ∑_{i=1}^{n} (1 − c_i)(−x_iβ) − e^{−x_iβ} t_i

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 157 / 242
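The final log-likelihood line can be maximized by Newton-Raphson. A sketch on simulated data (the design, censoring time, and true β below are all made up for illustration), using the score ∂ℓ/∂β = ∑ [e^{−x_iβ} t_i − (1 − c_i)] x_i and Hessian −∑ e^{−x_iβ} t_i x_i x_iᵀ:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical design: intercept plus one covariate, with a made-up true beta.
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 1.0])

T = rng.exponential(scale=np.exp(X @ beta_true))  # durations with theta_i = exp(x_i beta)
cens = 4.0                                        # arbitrary fixed censoring time
t = np.minimum(T, cens)                           # observed (possibly censored) durations
c = (T > cens).astype(float)                      # c_i = 1 if observation i is censored

def score(beta):
    # dl/dbeta = sum_i [exp(-x_i beta) t_i - (1 - c_i)] x_i
    w = np.exp(-(X @ beta)) * t
    return X.T @ (w - (1 - c))

def hessian(beta):
    # d^2 l / dbeta^2 = -sum_i exp(-x_i beta) t_i x_i x_i'
    w = np.exp(-(X @ beta)) * t
    return -(X * w[:, None]).T @ X

# Newton-Raphson on the (concave) log-likelihood.
beta = np.zeros(2)
for _ in range(50):
    beta = beta - np.linalg.solve(hessian(beta), score(beta))

print(beta)   # should be close to beta_true
```

The same model could be fit in R with survreg(…, dist = "exponential"); the numpy version is shown only to make the derivation above concrete.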
Quantities of interest

Suppose our outcome variable is how long a parliamentary government lasts, and we’re interested in the effect of majority versus minority status. We could calculate:

    ▸ the hazard ratio of majority to minority governments
    ▸ expected survival times for majority and minority governments
    ▸ predicted survival times for majority and minority governments
    ▸ first differences in expected survival times between majority and minority governments

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 158 / 242
Hazard Ratios

With x_4 the majority-government indicator (x_maj = 1, x_min = 0):

    HR = h(t|x_maj) / h(t|x_min)

       = e^{−x_maj β} / e^{−x_min β}

       = (e^{−β_0} e^{−x_1β_1} e^{−x_2β_2} e^{−x_3β_3} e^{−x_maj β_4} e^{−x_5β_5}) / (e^{−β_0} e^{−x_1β_1} e^{−x_2β_2} e^{−x_3β_3} e^{−x_min β_4} e^{−x_5β_5})

       = e^{−x_maj β_4} / e^{−x_min β_4}

       = e^{−β_4}

A hazard ratio greater than 1 would imply that majority governments fall faster (shorter survival times) than minority governments.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 159 / 242
[Figure: Distribution of Hazard Ratios — density of the simulated hazard ratios (x-axis: hazard ratios, roughly 0.4 to 1.0), with the mean hazard ratio marked.]

Majority governments survive longer than minority governments.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 160 / 242
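The plotted distribution can be produced by simulation: draw the majority-government coefficient from its estimated sampling distribution and transform each draw into a hazard ratio e^{−β4}. A minimal sketch in Python (the slides use R/Zelig); the point estimate and standard error below are hypothetical, not taken from the lecture's fitted model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical point estimate and standard error for beta_4
# (the majority-government coefficient); not the lecture's actual fit.
beta4_hat, beta4_se = 0.45, 0.10

# Simulate estimation uncertainty, then transform each coefficient
# draw into a hazard ratio HR = exp(-beta_4).
beta4_draws = rng.normal(beta4_hat, beta4_se, size=10_000)
hazard_ratios = np.exp(-beta4_draws)

print(hazard_ratios.mean())  # mean hazard ratio, below 1 here
```

A mean hazard ratio below 1 corresponds to the slide's conclusion that majority governments survive longer.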
Expected (average) Survival Time

\[
E(T \mid x_i) = \theta_i = \exp(x_i\beta)
\]

[Figure: Distribution of Expected Duration — densities of expected duration in months (roughly 10 to 40) for majority versus minority governments.]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 161 / 242
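Under the exponential model the expected duration is just exp(x_iβ), so the expected survival time for each profile comes straight from the linear predictor. A sketch with hypothetical coefficients (an intercept and the majority-government dummy only; the real model has more covariates):

```python
import numpy as np

# Hypothetical coefficients: intercept and majority-government dummy.
beta0, beta4 = 2.9, 0.45

# E(T | x) = theta = exp(x beta)
et_minority = np.exp(beta0)          # numst2 = 0
et_majority = np.exp(beta0 + beta4)  # numst2 = 1

print(et_minority, et_majority)  # majority governments last longer when beta4 > 0
```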
Predicted Survival Time

Draw predicted values from the exponential distribution.

[Figure: Distribution of Predicted Duration — densities of predicted duration in months (roughly 0 to 150) for majority versus minority governments.]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 162 / 242
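"Drawing predicted values" adds fundamental (stochastic) uncertainty on top of the expected value: for each θ, sample a duration from Exponential(θ). A sketch, again with hypothetical θ values rather than the lecture's estimates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical expected durations (in months) for the two profiles.
theta_min, theta_maj = 18.0, 28.0

# numpy's exponential is parameterized by its mean (scale), so E(T) = scale.
pred_min = rng.exponential(scale=theta_min, size=10_000)
pred_maj = rng.exponential(scale=theta_maj, size=10_000)

# Predicted-value distributions are much more spread out than the
# expected-value distributions: they include fundamental uncertainty.
print(pred_min.mean(), pred_min.std())
print(pred_maj.mean(), pred_maj.std())
```

This is why the predicted-duration plot spans 0 to 150 months while the expected-duration plot spans only about 10 to 40.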
First Differences

\[
E(T \mid x_{maj}) - E(T \mid x_{min})
\]

[Figure: Distribution of First Differences — density of the first difference in months (roughly 0 to 25), with the mean first difference marked.]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 163 / 242
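Each simulated first difference pairs one coefficient draw with both covariate profiles, so the difference reflects estimation uncertainty in the coefficients. A sketch with hypothetical coefficient draws (means and standard errors are illustrative, not from the lecture's fit):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sampling distributions for (beta_0, beta_4).
beta0_draws = rng.normal(2.9, 0.05, size=10_000)
beta4_draws = rng.normal(0.45, 0.10, size=10_000)

# First difference E(T | x_maj) - E(T | x_min), computed draw by draw
# so each difference uses the same coefficient draw for both profiles.
fd = np.exp(beta0_draws + beta4_draws) - np.exp(beta0_draws)

print(fd.mean())  # average gain in expected months from majority status
```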
Quantities of Interest in Zelig

x.min <- setx(z.out, numst2 = 0)
x.maj <- setx(z.out, numst2 = 1)
s.out <- sim(z.out, x = x.min, x1 = x.maj)
summary(s.out)
plot(s.out)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 164 / 242
[Figure: plot(s.out) output — panels showing Predicted Values Y|X and Y|X1, Expected Values E(Y|X) and E(Y|X1), First Differences E(Y|X1) − E(Y|X), and comparisons of Y|X vs. Y|X1 and of E(Y|X) vs. E(Y|X1).]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 165 / 242
The exponential model is nice and simple, but the assumption of a flat hazard may be too restrictive.

What if we want to loosen that restriction by assuming a monotonic hazard?

We can use the Weibull model.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 166 / 242
The Weibull Model

Similar to how we generalized the Poisson into a Negative Binomial by adding a parameter, we can do the same with the Exponential by turning it into a Weibull:

\[
T_i \sim \text{Weibull}(\theta_i, \alpha), \qquad
E(T_i) = \theta_i\,\Gamma\!\left(1 + \frac{1}{\alpha}\right)
\]

θ_i > 0 is the scale parameter and α > 0 is the shape parameter.

[Figure: Weibull densities — left panel "Changing Shape" (scale λ = 1 with shape k = 0.5, 1, 1.5, 5); right panel "Changing Scale" (shape k = 1 with scale λ = 0.5, 1, 1.5, 5).]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 167 / 242
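The mean formula can be checked by simulation: numpy's `weibull(a)` draws from a standard Weibull with shape a and scale 1, so multiplying by θ gives a Weibull(θ, α) draw with mean θΓ(1 + 1/α). A quick check with arbitrary parameter values:

```python
import math
import numpy as np

rng = np.random.default_rng(3)

theta, alpha = 2.0, 1.5  # arbitrary scale and shape

# Analytic mean: theta * Gamma(1 + 1/alpha).
analytic_mean = theta * math.gamma(1 + 1 / alpha)

# Simulated mean: scale standard Weibull draws by theta.
simulated_mean = (theta * rng.weibull(alpha, size=200_000)).mean()

print(analytic_mean, simulated_mean)  # should agree closely
```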
The Weibull Model

\[
f(t_i) = \left(\frac{\alpha}{\theta_i^\alpha}\right) t_i^{\alpha-1}
         \exp\left[-\left(\frac{t_i}{\theta_i}\right)^\alpha\right]
\]

Model θ_i with covariates in the systematic component:

\[
\theta_i = \exp(x_i\beta)
\]

Positive β implies that expected duration time increases as x increases.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 168 / 242
\[
f(t_i) = \left(\frac{\alpha}{\theta_i^\alpha}\right) t_i^{\alpha-1}
         \exp\left[-\left(\frac{t_i}{\theta_i}\right)^\alpha\right]
\]

\[
\begin{aligned}
S(t_i) &= 1 - F(t_i) \\
       &= 1 - \left(1 - e^{-(t_i/\theta_i)^\alpha}\right) \\
       &= e^{-(t_i/\theta_i)^\alpha}
\end{aligned}
\]

\[
\begin{aligned}
h(t_i) &= \frac{f(t_i)}{S(t_i)}
        = \frac{\left(\frac{\alpha}{\theta_i^\alpha}\right) t_i^{\alpha-1}
                \exp\left[-\left(\frac{t_i}{\theta_i}\right)^\alpha\right]}
               {e^{-(t_i/\theta_i)^\alpha}} \\
       &= \left(\frac{\alpha}{\theta_i}\right)\left(\frac{t_i}{\theta_i}\right)^{\alpha-1}
        = \left(\frac{\alpha}{\theta_i^\alpha}\right) t_i^{\alpha-1}
\end{aligned}
\]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 169 / 242
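The algebra above can be verified numerically: code the density and survivor function directly from their formulas and check that f(t)/S(t) matches the closed form (α/θ^α)t^{α−1}. A sketch with arbitrary parameter values:

```python
import numpy as np

theta, alpha = 2.0, 1.5  # arbitrary Weibull scale and shape

t = np.linspace(0.1, 5.0, 50)

# Density and survivor function of Weibull(theta, alpha).
f = (alpha / theta**alpha) * t**(alpha - 1) * np.exp(-(t / theta)**alpha)
S = np.exp(-(t / theta)**alpha)

hazard = f / S
closed_form = (alpha / theta**alpha) * t**(alpha - 1)

print(np.allclose(hazard, closed_form))  # True
```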
Hazard monotonicity assumption

h(t_i) is modeled with both θ_i and α and is a function of t_i. Thus, the Weibull model assumes a monotonic hazard.

If α = 1, h(t_i) is flat and the model reduces to the exponential model.

If α > 1, h(t_i) is monotonically increasing.

If α < 1, h(t_i) is monotonically decreasing.

[Figure: Weibull hazards — h(t) over t for shape p = 0.5, 1, 2.]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 170 / 242
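The three cases can be confirmed directly from the hazard formula h(t) = (α/θ^α)t^{α−1}: the sign of α − 1 determines whether h is increasing, flat, or decreasing. A sketch:

```python
import numpy as np

def weibull_hazard(t, theta, alpha):
    """Hazard of Weibull(theta, alpha): (alpha / theta^alpha) * t^(alpha - 1)."""
    return (alpha / theta**alpha) * t**(alpha - 1)

t = np.linspace(0.5, 2.0, 100)
theta = 1.0

increasing = weibull_hazard(t, theta, 2.0)  # alpha > 1
flat = weibull_hazard(t, theta, 1.0)        # alpha = 1: the exponential model
decreasing = weibull_hazard(t, theta, 0.5)  # alpha < 1

print(np.all(np.diff(increasing) > 0))  # True
print(np.allclose(flat, flat[0]))       # True
print(np.all(np.diff(decreasing) < 0))  # True
```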
The shape parameter α for the Weibull distribution is the reciprocal of the scale parameter given by survreg().

The scale parameter given by survreg() is NOT the same as the scale parameter in the Weibull distribution, which should be θ_i = e^{x_iβ}.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 171 / 242
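The mapping from survreg() output to the Weibull parameters used in these slides can be written out explicitly. A sketch of the conversion in Python, with hypothetical values standing in for what survreg() would report (not a real fit):

```python
import math

# Hypothetical values as survreg() would report them (not a real fit):
survreg_scale = 0.8     # the "scale" parameter in survreg() output
linear_predictor = 3.1  # x_i beta for some covariate profile

# Weibull shape: the reciprocal of survreg()'s scale parameter.
alpha = 1 / survreg_scale

# Weibull scale for observation i: theta_i = exp(x_i beta),
# NOT survreg()'s scale parameter.
theta_i = math.exp(linear_predictor)

print(alpha, theta_i)
```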
Hazard Ratios

One quantity of interest is the hazard ratio:

\[
HR = \frac{h(t \mid x = 1)}{h(t \mid x = 0)}
\]

With the Weibull model we make a proportional hazards assumption: the hazard ratio does not depend on t.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 172 / 242
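The proportional hazards property follows from the hazard formula: with a common shape α, the t^{α−1} terms cancel, leaving HR = (θ_0/θ_1)^α for every t. A numerical check with arbitrary values:

```python
import numpy as np

alpha = 1.5
theta0, theta1 = 2.0, 3.0  # arbitrary scales for the x = 0 and x = 1 profiles

def hazard(t, theta):
    # Weibull hazard: (alpha / theta^alpha) * t^(alpha - 1)
    return (alpha / theta**alpha) * t**(alpha - 1)

# The ratio is the same at every t: the t terms cancel.
t = np.array([0.5, 1.0, 5.0, 20.0])
hr = hazard(t, theta1) / hazard(t, theta0)

print(np.allclose(hr, (theta0 / theta1)**alpha))  # True
```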
Other Parametric Models

Gompertz model: monotonic hazard

Log-logistic or log-normal model: nonmonotonic hazard

Generalized gamma model: nests the exponential, Weibull, log-normal, and gamma models with an extra parameter (see appendix slides)

But what if we don't want to make an assumption about the shape of the hazard?

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 173 / 242
The Cox Proportional Hazards Model

Often described as a semi-parametric model.

Pros:

Makes no restrictive assumption about the shape of the hazard.

A better choice if you want the effects of the covariates and the nature of the time dependence is unimportant.

Cons:

Only quantities of interest are hazard ratios.

Can be subject to overfitting.

Shape of hazard is unknown (although there are semi-parametric ways to derive the hazard and survivor functions).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 174 / 242
1. Reconceptualize each $t_i$ as a discrete event time rather than a duration or survival time (non-censored observations only).
   ▶ $t_i = 5$: an event occurred at month 5, rather than observation $i$ surviving for 5 months.
2. Assume there are no tied event times in the data.
   ▶ No two events can occur at the same instant; it only seems that way because our unit of measurement is not precise enough.
   ▶ There are ways to adjust the likelihood to take into account observed ties.
3. Assume no events can happen between event times.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 175 / 242
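Assumption 2 is easy to check in your own data before fitting. A minimal Python sketch (the data here are hypothetical; the slides use R, but the logic is language-agnostic) that flags tied, non-censored event times:

```python
from collections import Counter

# Hypothetical (duration, censoring indicator) pairs: c = 1 means the
# event was observed at t; c = 0 means the observation was censored.
data = [(5, 1), (8, 1), (8, 1), (12, 0), (15, 1)]

# Count how often each observed (non-censored) event time occurs.
event_times = Counter(t for t, c in data if c == 1)
ties = {t: n for t, n in event_times.items() if n > 1}

print(ties)  # -> {8: 2}: two events at month 8 violate the no-ties assumption
```

If ties like these appear, the likelihood must be adjusted (e.g., via an approximation for tied event times) rather than used as-is.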
We know that exactly one event occurred at each $t_i$ for all non-censored $i$.

Define a risk set $R_i$ as the set of all observations at risk of an event at time $t_i$.

What observations belong in $R_i$?

All observations (censored and non-censored) $j$ such that $t_j \geq t_i$.

For example, if $t_i = 5$ months, then all observations that have neither experienced the event nor been censored before 5 months are at risk.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 176 / 242
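The risk-set definition translates directly into code. A minimal Python sketch with hypothetical durations (censored observations included, since they remain at risk until their censoring time):

```python
# Hypothetical survival data: durations and censoring indicators
# (c = 1 means the event was observed at t; c = 0 means censored at t).
t = [2.0, 4.0, 5.0, 6.0, 9.0]
c = [1, 0, 1, 1, 0]

def risk_set(i, t):
    """All observations j (censored or not) with t_j >= t_i."""
    return [j for j in range(len(t)) if t[j] >= t[i]]

# The event at t = 5.0 (observation 2): everyone with t_j >= 5.0 is at risk,
# including the censored observation 4.
print(risk_set(2, t))  # -> [2, 3, 4]
```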
We can then create a partial likelihood function:

\begin{align*}
L &= \prod_{i=1}^{n} \big[ P(\text{event occurred in } i \mid \text{event occurred in } R_i) \big]^{c_i} \\
  &= \prod_{i=1}^{n} \left[ \frac{P(\text{event occurred in } i)}{P(\text{event occurred in } R_i)} \right]^{c_i}
   = \prod_{i=1}^{n} \left[ \frac{h(t_i)}{\sum_{j \in R_i} h(t_j)} \right]^{c_i} \\
  &= \prod_{i=1}^{n} \left[ \frac{h_0(t)\, h_i(t_i)}{\sum_{j \in R_i} h_0(t)\, h_j(t_j)} \right]^{c_i}
   = \prod_{i=1}^{n} \left[ \frac{h_i(t_i)}{\sum_{j \in R_i} h_j(t_j)} \right]^{c_i}
\end{align*}

$h_0(t)$ is the baseline hazard, which is the same for all observations, so it cancels out.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 177 / 242
Like in parametric models, $h(t)$ is modeled with covariates:

\[ h_i(t_i) = e^{x_i \beta} \]

Note that a positive $\beta$ now suggests that an increase in $x$ increases the hazard and decreases survival time.

\[ L = \prod_{i=1}^{n} \left[ \frac{e^{x_i \beta}}{\sum_{j \in R_i} e^{x_j \beta}} \right]^{c_i} \]

There is no $\beta_0$ term estimated. This implies that the shape of the baseline hazard is left unmodeled.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 178 / 242
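The partial likelihood above can be evaluated directly for a trial value of β. A minimal Python sketch on hypothetical one-covariate data (this illustrates the formula itself, not how estimation software computes it — software works on the log scale for numerical stability):

```python
import math

# Hypothetical data: durations, censoring indicators (1 = event), covariate.
t = [2.0, 4.0, 5.0, 6.0, 9.0]
c = [1, 0, 1, 1, 0]
x = [0.5, 1.2, -0.3, 0.8, 0.0]

def partial_likelihood(beta):
    L = 1.0
    for i in range(len(t)):
        if c[i] == 0:          # censored observations contribute no factor
            continue
        risk = [j for j in range(len(t)) if t[j] >= t[i]]  # risk set R_i
        L *= math.exp(x[i] * beta) / sum(math.exp(x[j] * beta) for j in risk)
    return L

# At beta = 0 every unit in a risk set is equally likely to fail, so each
# event contributes 1/|R_i|: here 1/5 * 1/3 * 1/2 = 1/30.
print(partial_likelihood(0.0))  # -> 0.0333...
```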
Pros:

Makes no restrictive assumption about the shape of the hazard.

A better choice if you care about the effects of the covariates and the nature of the time dependence is unimportant.

Cons:

The only quantities of interest are hazard ratios.

Can be subject to overfitting.

The shape of the hazard is unknown (although there are semi-parametric ways to derive the hazard and survivor functions).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 179 / 242
How do I run a Cox proportional hazards model in R?

Use the coxph() function in the survival package (also in the Design and Zelig packages).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 180 / 242
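coxph() maximizes the partial likelihood from the previous slides (via a Newton-Raphson-type algorithm). As a rough illustration of what the fit is doing, here is a one-covariate estimate by brute-force grid search over β on hypothetical data — a sketch of the idea only, not coxph's actual algorithm:

```python
import math

# Hypothetical data: durations, censoring indicators (1 = event), covariate.
t = [2.0, 4.0, 5.0, 6.0, 9.0]
c = [1, 0, 1, 1, 0]
x = [0.5, 1.2, -0.3, 0.8, 0.0]

def log_partial_likelihood(beta):
    ll = 0.0
    for i in range(len(t)):
        if c[i] == 1:
            risk = [j for j in range(len(t)) if t[j] >= t[i]]  # risk set R_i
            ll += x[i] * beta - math.log(sum(math.exp(x[j] * beta)
                                             for j in risk))
    return ll

# Crude maximization over a grid of beta values from -5 to 5 in steps of 0.01
# (real software uses Newton-Raphson on this same log partial likelihood).
grid = [b / 100 for b in range(-500, 501)]
beta_hat = max(grid, key=log_partial_likelihood)

# exp(beta_hat) is the hazard ratio for a one-unit increase in x.
print(beta_hat, math.exp(beta_hat))
```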
Alternatives

Survival models are cool . . . but hard.

There are other things you can model:

- Perhaps some observations are more likely to fail than others: frailty models
- Perhaps some observations you don't expect to fail at all: split population models
- Perhaps there can be more than one type of event: competing risks models

If you encounter survival data, think carefully about the process and then choose a corresponding model.
References:

Box-Steffensmeier, Janet M., and Bradford S. Jones. 2004. Event History Modeling. Cambridge University Press.

Therneau, Terry M., and Patricia M. Grambsch. 2013. Modeling Survival Data: Extending the Cox Model. Springer Science & Business Media.

Andersen, Per Kragh, et al. 2012. Statistical Models Based on Counting Processes. Springer Science & Business Media.
1 Binary Outcome Models
2 Quantities of Interest
    An Example with Code
    Predicted Values
    First Differences
    General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
    Poisson
    Overdispersion
    Binomial for Known Trials
7 Duration Models
    Exponential Model
    Weibull Model
    Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
How Do Survival Models Relate to Duration Dependence in a Logit Model?

Based on Beck, Katz, and Tucker (1998)

Suppose we have time-series cross-sectional data with a binary dependent variable.

- For example, if we had data on country dyads over 50 years, with the dependent variable being whether there was a war between the two countries in each year.

Not all observations are independent. We may see some duration dependence.

- Perhaps countries that have been at peace for 100 years are less likely to go to war than countries that have been at peace for only 2 years.

How can we account for this duration dependence in a logit model?
Think of the observations as grouped duration data:

Year   t_k   Dyad      Y_i   T_i
1992    1    US-Iraq    0
1993    2    US-Iraq    0
1994    3    US-Iraq    0
1995    4    US-Iraq    0
1996    5    US-Iraq    0
1997    6    US-Iraq    0
1998    7    US-Iraq    0    12
1999    8    US-Iraq    0
2000    9    US-Iraq    0
2001   10    US-Iraq    0
2002   11    US-Iraq    0
2003   12    US-Iraq    1

(T_i = 12 is the duration of the whole spell, printed once across its rows.)
Then we end up with:

$$P(y_{i,t_k} = 1 \mid x_{i,t_k}) = h(t_k \mid x_{i,t_k}) = 1 - P(\text{surviving beyond } t_k \mid \text{survival up to } t_{k-1})$$

It can be shown in general that

$$S(t) = e^{-\int_0^t h(u)\,du}$$

(this follows from $h(u) = -\frac{d}{du} \log S(u)$, integrating both sides from 0 to $t$). So then we get

$$P(y_{i,t_k} = 1 \mid x_{i,t_k}) = 1 - e^{-\int_{t_{k-1}}^{t_k} h(u)\,du}$$

where we take the integral from $t_{k-1}$ to $t_k$ in order to get the conditional survival.
$$\begin{aligned}
P(y_{i,t_k} = 1 \mid x_{i,t_k}) &= 1 - \exp\left(-\int_{t_{k-1}}^{t_k} h(u)\,du\right) \\
&= 1 - \exp\left(-\int_{t_{k-1}}^{t_k} e^{x_{i,t_k}\beta}\, h_0(u)\,du\right) \\
&= 1 - \exp\left(-e^{x_{i,t_k}\beta} \int_{t_{k-1}}^{t_k} h_0(u)\,du\right) \\
&= 1 - \exp\left(-e^{x_{i,t_k}\beta}\, \alpha_{t_k}\right) \\
&= 1 - \exp\left(-e^{x_{i,t_k}\beta + \kappa_{t_k}}\right)
\end{aligned}$$

where the second line substitutes the proportional hazards form $h(u) = e^{x_{i,t_k}\beta} h_0(u)$, $\alpha_{t_k}$ collects the integrated baseline hazard, and $\kappa_{t_k} = \log \alpha_{t_k}$. This is equivalent to a model with a complementary log-log (cloglog) link and time dummies $\kappa_{t_k}$.
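A minimal sketch of estimating this as a GLM; the data frame `dyads` and its columns (`war`, `trade`, and `tk` for the peace-year counter) are hypothetical names, not from the slides:

```r
# Grouped-duration model via a cloglog GLM with time dummies.
# Each row is a dyad-year; war = 1 in the year the event occurs.
fit <- glm(war ~ trade + factor(tk),
           family = binomial(link = "cloglog"),
           data = dyads)

# The coefficients on the factor(tk) levels are the time dummies kappa_{t_k};
# the remaining coefficients are the betas from the derivation above.
summary(fit)
```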
- BKT suggest using a logit link instead of a cloglog link because logit is more widely used (and nobody knows what a cloglog link is except you guys!).
- As long as the probability of an event does not exceed 50 percent, the logit and cloglog links are very similar.
- The use of time dummies means that we are imposing no structure on the nature of duration dependence (the structure of the hazard).
- If we don't use time dummies, we are assuming no duration dependence (a flat hazard).
- Using a variable such as "number of years at peace" instead of time dummies imposes a monotonic hazard.
- Time dummies may use up a lot of degrees of freedom, so BKT suggest using restricted cubic splines.
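The spline alternative can be sketched with splines::ns(), which builds a natural cubic spline basis (again using the hypothetical `dyads` data frame with `war`, `trade`, and a peace-year counter `tk`):

```r
# Replace one dummy per period with a smooth low-df function of time at risk.
library(splines)

fit_dummies <- glm(war ~ trade + factor(tk), family = binomial, data = dyads)
fit_spline  <- glm(war ~ trade + ns(tk, df = 3), family = binomial, data = dyads)

# The spline spends 3 degrees of freedom on duration dependence instead of
# one per observed period, at the cost of some flexibility in the hazard's shape.
```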
Possible complications:

- Multiple events: assumes that multiple events are independent (the independence of observations assumption in a survival model).
- Left censoring: countries may have been at peace long before we start observing data, and we don't know when that "peace duration" began.
- Variables that do not vary across units: may be collinear with time dummies.
First Half of Course Summary

In the first six weeks we have covered: maximum likelihood estimation, generalized linear models, and a general approach to quantities of interest.

We touched at least briefly on standard models for most types of data you will encounter:

- continuous: normal linear model
- binary: logit, probit
- count: Poisson, negative binomial
- ordered: ordinal probit, ordinal logit
- categorical: multinomial logit, multinomial probit
- duration: exponential, Weibull, Cox proportional hazards

Each model followed a similar pattern:

1. define a model with a stochastic and systematic component
2. derive the log-likelihood
3. estimate via MLE
4. interpret quantities of interest

You can now interpret these models and learn new ones.
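The four-step pattern can be sketched end to end for one of the models above. This is a minimal illustration in Python (not the course's own code), using a logit model on simulated data; the variable names and the use of `scipy.optimize.minimize` are this sketch's choices, not anything from the slides.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# 1. Define the model: stochastic Y_i ~ Bernoulli(pi_i),
#    systematic pi_i = logit^{-1}(beta0 + beta1 * x_i).
n = 500
x = rng.normal(size=n)
true_beta = np.array([0.5, 1.0])
p = 1 / (1 + np.exp(-(true_beta[0] + true_beta[1] * x)))
y = rng.binomial(1, p)

# 2. Write down the log-likelihood (negated, since minimize() minimizes).
def neg_loglik(beta):
    pi = 1 / (1 + np.exp(-(beta[0] + beta[1] * x)))
    pi = np.clip(pi, 1e-12, 1 - 1e-12)  # guard against log(0)
    return -np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi))

# 3. Estimate via MLE.
fit = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
beta_hat = fit.x

# 4. Interpret a quantity of interest: predicted Pr(Y = 1 | x = 1).
qi = 1 / (1 + np.exp(-(beta_hat[0] + beta_hat[1] * 1.0)))
print(beta_hat, qi)
```

The same skeleton carries over to every model in the list: only the stochastic component and the systematic component (and hence the log-likelihood in step 2) change.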
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 190 / 242
Preview

- Weeks 7-8: Missing Data
  - Mixture Models and the Expectation-Maximization algorithm
  - Missing Data and Multiple Imputation
- Weeks 9-10: Causal Inference
  - Model Dependence and Matching
  - Explanation in Causal Inference with Moderation/Mediation
- Weeks 11-12: Hierarchical Models
  - Regularization and Hierarchical Models
  - More Hierarchical Models and Wrap-up

These topics are a long-term bet on things that will be important in your career. They are also a short case study in reading into a new statistical literature.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 191 / 242
1. Binary Outcome Models
2. Quantities of Interest
   - An Example with Code
   - Predicted Values
   - First Differences
   - General Algorithms
3. Model Diagnostics for Binary Outcome Models
4. Ordered Categorical
5. Unordered Categorical
6. Event Count Models
   - Poisson
   - Overdispersion
   - Binomial for Known Trials
7. Duration Models
   - Exponential Model
   - Weibull Model
   - Cox Proportional Hazards Model
8. Duration-Logit Correspondence
9. Appendix: Multinomial Models
10. Appendix: More on Overdispersed Poisson
11. Appendix: More on Binomial Models
12. Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 192 / 242
Identification

Reading: Unifying Political Methodology, Chapter 8

Definitions of "identification":

- Qualitative: we can learn the parameter with infinite draws
- Mathematical: each parameter value produces a unique likelihood value
- Graphical: a likelihood without a plateau at the maximum

Partially identified models: the likelihood is informative, but not about a single point.

Non-identified models include some that make little sense, even if that can be hard to tell.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 193 / 242
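One practical way to spot a failure of the mathematical definition is numerical: a flat direction in the likelihood shows up as a (near-)zero eigenvalue of the negative Hessian. A small sketch, assuming a normal linear model with perfectly collinear covariates (the design and names here are invented for illustration, not from the slides):

```python
import numpy as np

# Perfectly collinear design: the third column is 2 * the second,
# so their coefficients are not separately identified.
rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, 2 * x1])

# For the normal linear model, the negative Hessian of the
# log-likelihood in beta is proportional to X'X. A zero eigenvalue
# of X'X means a flat direction in the likelihood: non-identification.
eigvals = np.linalg.eigvalsh(X.T @ X)  # ascending order
print(eigvals)  # smallest eigenvalue is numerically zero
```

The eigenvector belonging to the zero eigenvalue gives the direction along which the likelihood is flat, i.e., the linear combination of parameters the data cannot distinguish.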
Example 1: Flat Likelihoods

A (dumb) model:

    Yi ∼ fp(yi | λi)
    λi = 1 + 0β

What do we know about β (from the likelihood perspective)?

The Poisson likelihood is

    L(λ|y) = ∏_{i=1}^{n} e^{−λi} λi^{yi} / yi!

and the log-likelihood, with (1 + 0β) substituted for λi and the constant ln yi! dropped, is

    ln L(β|y) = ∑_{i=1}^{n} { −(0β + 1) + yi ln(0β + 1) }
              = ∑_{i=1}^{n} −1
              = −n

The log-likelihood does not depend on β at all: every value of β yields the same value, so β is not identified.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 194 / 242
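The flatness is easy to confirm numerically. A short sketch (simulated data and variable names are this example's own, not from the slides): evaluate the log-likelihood at several wildly different values of β and observe that it never moves from −n.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.poisson(lam=1.0, size=50)  # n = 50 draws from the true model

def loglik(beta, y):
    lam = 1 + 0 * beta  # systematic component: beta has no effect on lambda
    # Poisson log-likelihood, dropping the ln(y!) constant
    return np.sum(-lam + y * np.log(lam))

vals = [loglik(b, y) for b in (-10.0, 0.0, 5.0, 100.0)]
print(vals)  # every entry is -50.0, i.e., -n: the likelihood is flat in beta
```

Any optimizer handed this log-likelihood would report "convergence" at whatever β it started from, which is one reason to check identification before trusting an MLE.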
![Page 954: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/954.jpg)
Example 1: Flat LikelihoodsA (dumb) model:
Yi ∼ fp(yi |λi )
λi = 1 + 0β
What do we know about β? (from the likelihood perspective)
L(λ|y) =n∏
i=1
e−λλyi
yi !
and the log-likelihood, with (1 + 0β) substituted for λi :
ln L(β|y) =n∑
i=1
{−(0β + 1)− yi ln(0β + 1)}
=n∑
i=1
−1
= −n
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 194 / 242
![Page 955: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/955.jpg)
Example 1: Flat LikelihoodsA (dumb) model:
Yi ∼ fp(yi |λi )
λi = 1 + 0β
What do we know about β? (from the likelihood perspective)
L(λ|y) =n∏
i=1
e−λλyi
yi !
and the log-likelihood, with (1 + 0β) substituted for λi :
ln L(β|y) =n∑
i=1
{−(0β + 1)− yi ln(0β + 1)}
=n∑
i=1
−1
= −n
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 194 / 242
Example 1: Flat Likelihoods

1. An identified likelihood has a unique maximum.

2. A likelihood function with a flat region or plateau at the maximum is not identified.

3. A likelihood with a plateau can be informative, but a unique MLE doesn't exist.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 195 / 242
Example 2: Non-unique Reparameterization

A model:

Yi ∼ fN(yi | µi, σ²)
µi = x1i β1 + x2i β2 + x3i β3,  where x2i = x3i
   = x1i β1 + x2i (β2 + β3)

What is the (unique) MLE of β2 and β3? There is none: different parameter values lead to the same values of µ and thus the same likelihood values:

µi = x1i β1 + x2i (5 + 3)
µi = x1i β1 + x2i (3 + 5)
µi = x1i β1 + x2i (7 + 1)

So {β2 = 5, β3 = 3} gives the same likelihood as {β2 = 3, β3 = 5} or {β2 = 7, β3 = 1}.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 196 / 242
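The same point can be checked directly. This sketch (not from the slides; hypothetical data with x2 = x3, Python standard library only) shows that any (β2, β3) with the same sum produces an identical normal log-likelihood:

```python
import math

# Hypothetical data with perfectly collinear regressors: x3 is a copy of x2
x1 = [1.0, 2.0, 3.0]
x2 = [0.5, 1.5, 2.5]
x3 = x2
y  = [2.0, 4.0, 6.0]
sigma2 = 1.0

def normal_loglik(b1, b2, b3):
    """Normal log-likelihood with mu_i = x1i*b1 + x2i*b2 + x3i*b3."""
    ll = 0.0
    for x1i, x2i, x3i, yi in zip(x1, x2, x3, y):
        mu = x1i * b1 + x2i * b2 + x3i * b3
        ll += -0.5 * math.log(2 * math.pi * sigma2) - (yi - mu) ** 2 / (2 * sigma2)
    return ll

# Since x2i = x3i, only b2 + b3 matters; these three parameter pairs all sum to 8
print(normal_loglik(1.0, 5.0, 3.0) ==
      normal_loglik(1.0, 3.0, 5.0) ==
      normal_loglik(1.0, 7.0, 1.0))  # True
```

Any fitting routine faced with this design matrix can only recover β2 + β3, never the two coefficients separately.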
Introduction to Multiple Equation Models

1. Let Yi be an N × 1 vector for each i (i = 1, . . . , n)

2. Elements of Yi are jointly distributed:

   Yi (N×1) ∼ f(θi (N×1), α (N×N))

3. Systematic components:

θ1i = g1(x1i, β1)
θ2i = g2(x2i, β2)
⋮
θNi = gN(xNi, βN)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 197 / 242
When are Multiple Equation Models different from N separate equation-by-equation models?

When the elements of Yi are (conditional on X),

1. Stochastically dependent, or

2. Parametrically dependent (shared parameters)

Example and proof:

Suppose no ancillary parameters, and N = 2. The joint density:

f(y|θ) = ∏_{i=1}^{n} f(y1i, y2i | θ1i, θ2i)

(BTW, you now know how to form the likelihood for multiple equation models!)

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 198 / 242
Assuming stochastic independence lets us factor f:

f(y|θ) = ∏_{i=1}^{n} f(y1i, y2i | θ1i, θ2i)
       = ∏_{i=1}^{n} f(y1i | θ1i) f(y2i | θ2i)

with log-likelihood

ln L(θ1, θ2 | y) = Σ_{i=1}^{n} ln f(y1i | θ1i) + Σ_{i=1}^{n} ln f(y2i | θ2i)

Also assume parametric independence, and you can estimate the equations separately.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 199 / 242
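The factorization above means joint and equation-by-equation maximization agree. A minimal sketch (not from the slides; hypothetical data, two independent Poisson equations, brute-force grid search) makes this concrete:

```python
import math

# Hypothetical data: two outcomes per unit, stochastically and
# parametrically independent given (lambda1, lambda2)
y1 = [1, 2, 0, 3]
y2 = [4, 5, 3, 6]

def pois_ll(lam, ys):
    """Poisson log-likelihood, constant -ln(y!) terms dropped."""
    return sum(-lam + yi * math.log(lam) for yi in ys)

grid = [0.5 + 0.1 * k for k in range(100)]

# Separate maximization, equation by equation
lam1_sep = max(grid, key=lambda l: pois_ll(l, y1))
lam2_sep = max(grid, key=lambda l: pois_ll(l, y2))

# Joint maximization of the factored log-likelihood over all pairs
lam1_joint, lam2_joint = max(
    ((l1, l2) for l1 in grid for l2 in grid),
    key=lambda p: pois_ll(p[0], y1) + pois_ll(p[1], y2),
)

print((lam1_sep, lam2_sep) == (lam1_joint, lam2_joint))  # True
```

Because the log-likelihood is a sum of a term in θ1 only and a term in θ2 only, maximizing the sum jointly picks out exactly the two separate maximizers. Stochastic or parametric dependence breaks this additivity, and then separate estimation is no longer equivalent.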
Example: 1992 U.S. Presidential Election

Alvarez and Nagler (1995):

Yi: Vote choice in the 1992 U.S. presidential election (1 = Clinton, 2 = Bush, 3 = Perot)

[Bar chart: 1992 Presidential Election Vote Choice (ANES, n = 909) — Bush 34%, Clinton 46%, Perot 20%]

Two types of predictors:

- Voter-specific (Vi): age, gender, education, party, opinions, etc.
- Candidate-varying (Xij): ideological distance between voter i and candidate j

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 200 / 242
Multinomial Logit Model

Generalize the logit model to more than two choices.

The multinomial logit model (MNL):

πij = Pr(Yi = j | Vi) = exp(Vi⊤ δj) / Σ_{k=1}^{J} exp(Vi⊤ δk),

where Vi = individual-specific characteristics of unit i (and an intercept)

Note that Σ_{j=1}^{J} πij = 1

Need to set a base category for identifiability: δ1 = 0

δj represents how the characteristics of voter i are associated with the probability of voting for candidate j

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 201 / 242
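The MNL probability formula above is just a softmax over the linear predictors. A minimal sketch (not from the slides; made-up coefficients, Python standard library only) computes πij for one voter with the base-category normalization δ1 = 0:

```python
import math

def mnl_probs(v, deltas):
    """Multinomial logit probabilities for one voter.
    v: individual-specific covariates, with a leading 1.0 for the intercept.
    deltas: one coefficient vector per category; the base category is all zeros.
    """
    utilities = [sum(vk * dk for vk, dk in zip(v, d)) for d in deltas]
    m = max(utilities)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical coefficients for J = 3 candidates; category 1 is the base (delta_1 = 0)
deltas = [[0.0, 0.0], [0.5, -1.0], [-0.2, 0.3]]
v = [1.0, 2.0]  # intercept and one voter-specific covariate

p = mnl_probs(v, deltas)
print(round(sum(p), 10))  # 1.0 -- the J probabilities sum to one by construction
```

Fixing δ1 = 0 matters because adding any constant vector c to every δj leaves all πij unchanged (c cancels in numerator and denominator), so without the restriction the δj would be non-identified, just like the flat-likelihood examples earlier in this lecture.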
Conditional Logit Model
We can also incorporate alternative-varying predictors Xij
The conditional logit (CL) model:
πij = Pr(Yi = j | Xij) = exp(Xij⊤β) / Σ_{k=1}^{J} exp(Xik⊤β)

β represents how characteristics of candidate j for voter i are associated with voting probabilities

Xij does not have to vary across voters (e.g. whether candidate j is an incumbent)
In that case we suppress the subscript to Xj
MNL as a Special Case of CL
Mathematically, MNL can be subsumed under CL using a set of artificial alternative-varying regressors for each Vi:

Xi1 = (Vi, 0, ..., 0)⊤, Xi2 = (0, Vi, ..., 0)⊤, ..., XiJ = (0, 0, ..., Vi)⊤

Set the elements of β corresponding to Xij equal to δj and you get the MNL model

δ1 must be set to zero for identifiability

Thus we can write both models (and their mixture) simply as CL:

πij = Pr(Yi = j | X) = exp(Xij⊤β) / Σ_{k=1}^{J} exp(Xik⊤β)
We use the names CL and MNL interchangeably from here on
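The block construction can be checked numerically: stacking Vi into the artificial regressors Xij and flattening the δj into one β reproduces the MNL probabilities exactly. A sketch with made-up values:

```python
import numpy as np

rng = np.random.default_rng(2)
J, p = 3, 2
v = rng.normal(size=p)                   # individual-specific characteristics V_i
delta = np.vstack([np.zeros(p),          # delta_1 = 0 for identifiability
                   rng.normal(size=(J - 1, p))])

# MNL probabilities computed directly from V_i and the delta_j
u_mnl = delta @ v
pi_mnl = np.exp(u_mnl) / np.exp(u_mnl).sum()

# CL representation: block regressors X_ij = (0, ..., V_i, ..., 0) and one beta
beta = delta.ravel()                     # beta stacks (delta_1, ..., delta_J)
X = np.zeros((J, J * p))
for j in range(J):
    X[j, j * p:(j + 1) * p] = v
u_cl = X @ beta
pi_cl = np.exp(u_cl) / np.exp(u_cl).sum()
```

Since Xij⊤β = Vi⊤δj by construction, the two probability vectors coincide term by term.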
Predictor Types and Data Formats
Discrete choice data usually come in one of two formats:
1 Wide format: N rows, #V + J ·#X predictors
choice women educ idist.Clinton idist.Bush idist.Perot
Bush 1 3 4.0804 0.1024 0.2601
Bush 1 4 4.0804 0.1024 0.2601
Clinton 1 2 1.0404 1.7424 0.2401
Bush 0 6 0.0004 5.3824 2.2201
Clinton 1 3 0.9604 11.0220 6.2001 ...
2 Long format: NJ rows, #V + #X predictors
chid alt choice women educ idist
1 Bush TRUE 1 3 0.1024
1 Clinton FALSE 1 3 4.0804
1 Perot FALSE 1 3 0.2601
2 Bush TRUE 1 4 0.1024
2 Clinton FALSE 1 4 4.0804
2 Perot FALSE 1 4 0.2601 ...
Use reshape to change between wide and long
Some estimation functions (e.g. mlogit) can take both formats
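As one concrete sketch of the wide-to-long conversion (the slide's `reshape` and `mlogit` are R tools; here is a hypothetical analogue using pandas' `wide_to_long`, with toy values loosely based on the table above):

```python
import pandas as pd

# Wide format: one row per voter; one ideological-distance column per candidate.
# 'chid' is a hypothetical voter id added so rows can be indexed.
wide = pd.DataFrame({
    "chid": [1, 2],
    "choice": ["Bush", "Clinton"],
    "women": [1, 1],
    "educ": [3, 2],
    "idist.Clinton": [4.0804, 1.0404],
    "idist.Bush": [0.1024, 1.7424],
    "idist.Perot": [0.2601, 0.2401],
})

# Long format: one row per voter-candidate pair (NJ rows); the 'idist.*'
# columns melt into a single 'idist' column indexed by the alternative.
long_df = (pd.wide_to_long(wide, stubnames="idist", i="chid", j="alt",
                           sep=".", suffix=r"\w+")
             .reset_index())
long_df["chosen"] = long_df["alt"] == long_df["choice"]
```

Voter-specific columns (women, educ) are simply repeated J times in the long format, while the alternative-varying idist collapses from J columns to one.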
Latent Variable Interpretation
Recall the random utility model:

Y*ij = Xij⊤β + εij,

where Y*ij = latent utility from choosing j for i, and εij = stochastic component of the utility

Assume that the voter chooses the most preferred candidate, i.e.,

Yi = j if Y*ij ≥ Y*ij′ for all j′ ∈ {1, ..., J}

Assuming εij ~ i.i.d. type I extreme value distribution, this setup implies MNL (McFadden 1974)

Proof for J = 2:

Pr(Yi = 1 | X) = Pr(Y*i1 ≥ Y*i2 | X)
             = Pr(εi2 − εi1 ≤ (Xi1 − Xi2)⊤β)
             = exp((Xi1 − Xi2)⊤β) / [1 + exp((Xi1 − Xi2)⊤β)]
             = exp(Xi1⊤β) / [exp(Xi1⊤β) + exp(Xi2⊤β)]
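McFadden's result can also be checked by simulation: drawing i.i.d. type I extreme value (Gumbel) shocks and taking the max-utility choice reproduces the softmax probabilities. A sketch with hypothetical systematic utilities:

```python
import numpy as np

rng = np.random.default_rng(0)
J, n = 3, 200_000
# Hypothetical systematic utilities X_ij' beta for one voter and J = 3 candidates
util = np.array([0.5, 0.0, -0.3])

# Draw i.i.d. Gumbel shocks and pick the utility-maximizing alternative
eps = rng.gumbel(size=(n, J))
choices = np.argmax(util + eps, axis=1)
empirical = np.bincount(choices, minlength=J) / n

# The random-utility model implies multinomial-logit (softmax) probabilities
softmax = np.exp(util) / np.exp(util).sum()
```

With 200,000 draws the empirical choice frequencies match the softmax probabilities to within simulation error.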
Estimation and Inference
Estimation via MLE
Likelihood for a random sample of size n:

L(β | Y, X) = Π_{i=1}^{n} Π_{j=1}^{J} πij^{1{Yi = j}}

It can be shown that the log-likelihood is globally concave
⇒ guaranteed convergence to the true (not local) MLE
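Because the log-likelihood is globally concave, a generic optimizer recovers the MLE from any starting point. A minimal sketch that simulates CL data and maximizes the log-likelihood numerically (sample size and coefficients are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, J, p = 500, 3, 2
X = rng.normal(size=(n, J, p))           # alternative-varying predictors X_ij
beta_true = np.array([1.0, -0.5])

# Simulate choices from the CL model
u = X @ beta_true
pi = np.exp(u) / np.exp(u).sum(axis=1, keepdims=True)
y = np.array([rng.choice(J, p=pi_i) for pi_i in pi])

def nll(beta):
    """Negative log-likelihood of the conditional logit model."""
    u = X @ beta
    # log pi_{i, y_i} = u_{i, y_i} - log sum_k exp(u_{ik})
    return -(u[np.arange(n), y] - np.log(np.exp(u).sum(axis=1))).sum()

fit = minimize(nll, np.zeros(p))         # any start works: nll is convex
```

Starting from β = 0 (or any other point), the optimizer lands near beta_true, which is what global concavity guarantees.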
Interpreting MNL/CL Coefficients
In MNL/CL, β itself is not necessarily informative about the effect of X
1 The coefficients are all with respect to the baseline category
→ testing βj = 0 does not generally make sense (unless comparison to the baseline is the goal)

2 Changing Xij has an impact on Pr(Yi = k | X) for k ≠ j:

- For individual-specific characteristics (Vi), even the sign of δj may not agree with the direction of the change in the response probability for j

- For alternative-varying characteristics (Xij), the sign of β does indicate the direction of the effect, but the magnitude is hard to interpret
Compute a quantity that has a clear substantive interpretation!
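The first caveat about Vi can be seen numerically: with hypothetical coefficients δ1 = 0, δ2 = 1, δ3 = 2 on a single predictor v, δ2 is positive yet Pr(Yi = 2) falls as v increases, because category 3's probability grows faster:

```python
import numpy as np

def mnl_probs(v, deltas):
    """MNL choice probabilities for a scalar predictor v (base category delta = 0)."""
    u = np.array(deltas) * v
    return np.exp(u) / np.exp(u).sum()

# Hypothetical coefficients: delta_1 = 0 (base), delta_2 = 1, delta_3 = 2
deltas = [0.0, 1.0, 2.0]
p_lo = mnl_probs(0.0, deltas)            # probabilities at v = 0
p_hi = mnl_probs(2.0, deltas)            # probabilities at v = 2
# delta_2 > 0, yet Pr(Y = 2) falls as v rises: category 3 absorbs the mass
```

This is why computing substantive quantities (next slide) beats reading signs off the coefficient table.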
Calculating Quantities of Interest

1 Choice probability:

πj(x) = Pr(Yi = j | X = x)

e.g. How likely is a female college-educated conservative Republican voter to vote for Perot?

2 Predicted vote share:

pj(x1) ≡ E[1{πj(Xi1 = x1, Xi2) ≥ πk(Xi1 = x1, Xi2) for all k}]

where Xi1 is the predictor(s) of interest and Xi2 is all other predictors

e.g. What would Perot's vote share be if all voters supported abortion?

3 Average partial (treatment) effects:

τjk = E[πj(Tik = 1, Ti∗, Wi) − πj(Tik = 0, Ti∗, Wi)]

where Tik is the treatment on candidate k, Ti∗ is the treatment on the other candidates, and Wi is pre-treatment covariates

- "Direct effect" if j = k; "indirect effect" if j ≠ k

- If T is individual-specific, τj = E[πj(Ti = 1, Wi) − πj(Ti = 0, Wi)]

Estimate by plugging in sample analogues (e.g. πj → π̂j, E → (1/n)Σ)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 208 / 242
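The plug-in step can be sketched in code. Below is a minimal Python illustration (the course materials likely use R; the function names, toy data, and coefficients here are all hypothetical): compute MNL choice probabilities with a softmax, then estimate an average partial effect by switching a binary covariate on and off for everyone and averaging the change in the fitted probability.

```python
import numpy as np

def choice_probs(V, delta):
    """MNL choice probabilities via softmax.

    V: (n, p) individual-specific covariates (first column = intercept).
    delta: (p, J) coefficients; the baseline category's column is all zeros.
    Returns an (n, J) matrix of probabilities pi_ij.
    """
    u = V @ delta                            # (n, J) linear predictors
    u = u - u.max(axis=1, keepdims=True)     # subtract row max for stability
    e = np.exp(u)
    return e / e.sum(axis=1, keepdims=True)

def average_partial_effect(V, delta, col, j):
    """Plug-in estimate of tau_j: average change in pi_j when the binary
    covariate in column `col` is switched from 0 to 1 for everyone."""
    V1, V0 = V.copy(), V.copy()
    V1[:, col], V0[:, col] = 1.0, 0.0
    return (choice_probs(V1, delta)[:, j] - choice_probs(V0, delta)[:, j]).mean()

# toy data: intercept + one binary covariate, three alternatives
rng = np.random.default_rng(0)
V = np.column_stack([np.ones(500), rng.integers(0, 2, 500)])
delta = np.array([[0.0, 0.5, -0.3],     # intercepts (baseline column = 0)
                  [0.0, -0.8, 0.6]])    # covariate effects
pi = choice_probs(V, delta)
tau = average_partial_effect(V, delta, col=1, j=2)
```

Because the covariate's coefficient on alternative 2 is the largest, `tau` comes out positive: switching the covariate on raises that alternative's probability for every individual.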
Example: 1992 U.S. Presidential Election

Model specification (Alvarez and Nagler 1995):

πij = exp(Xij⊤β + Vi⊤δj) / Σ_{k=1}^{J} exp(Xik⊤β + Vi⊤δk)

where

Xij = {ideological distance}
Vi = {1, issue opinions, party, gender, education, age, ...}

Estimated coefficients:

β̂ = −0.11 (0.02)

                      δ̂Bush           δ̂Clinton
(intercept)           0.67 (0.94)     −0.41 (0.45)
(support abortion)    −0.52 (0.11)    −0.02 (0.12)
(female)              0.54 (0.23)     0.30 (0.22)
...                   ...             ...

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 209 / 242
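The mixed CL/MNL probability formula above combines an alternative-varying covariate (shared β) with individual-specific covariates (alternative-specific δj). A sketch in Python of how those probabilities are computed for one hypothetical voter, treating Perot as the baseline and borrowing two of the reported coefficients purely for illustration (the distances, the function name, and the data layout are my own assumptions, not from the course materials):

```python
import numpy as np

def mixed_logit_probs(X, V, beta, delta):
    """pi_ij = exp(X_ij*beta + V_i'delta_j) / sum_k exp(X_ik*beta + V_i'delta_k).

    X: (n, J) alternative-varying covariate (here: ideological distance).
    V: (n, p) individual-specific covariates.
    delta: (p, J) coefficients with an all-zero baseline column.
    """
    u = X * beta + V @ delta                 # (n, J) utilities
    u = u - u.max(axis=1, keepdims=True)     # numerical stability
    e = np.exp(u)
    return e / e.sum(axis=1, keepdims=True)

# one hypothetical voter; columns ordered Bush, Clinton, Perot (baseline)
X = np.array([[2.0, 1.0, 3.0]])            # made-up ideological distances
V = np.array([[1.0, 1.0]])                 # intercept, supports abortion
beta = -0.11                               # distance coefficient from the slide
delta = np.array([[0.67, -0.41, 0.0],      # intercepts
                  [-0.52, -0.02, 0.0]])    # support-abortion coefficients
pi = mixed_logit_probs(X, V, beta, delta)
```

With these made-up distances the three utilities are −0.07 (Bush), −0.54 (Clinton), and −0.33 (Perot), so the fitted probabilities favor Bush; the negative β means greater ideological distance lowers a candidate's probability.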
Example: 1992 U.S. Presidential Election

Estimated choice probabilities for a "typical" voter from the South:

[Figure: estimated choice probabilities (scale 0.0 to 1.0) for Bush, Clinton, and Perot]

Predicted vote shares if everyone opposed abortion:

[Figure: predicted vote shares (scale 0.0 to 1.0) for Bush, Clinton, and Perot]

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 210 / 242
Assumptions in Multinomial Logit Models

Recall that MNL assumes εij is i.i.d.

In particular, εij ⊥⊥ εik for j ≠ k

This implies that unobserved factors affecting Y∗ij are unrelated to those affecting Y∗ik

When is this assumption plausible?

Example: a multiparty election with parties R, L1, and L2.

Do voters' unobserved ideological preferences affect Pr(Yi = L1) independently of their effect on Pr(Yi = L2)? Probably not.

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 211 / 242
Independence of Irrelevant Alternatives

MNL makes the independence of irrelevant alternatives (IIA) assumption:

Pr(Choose j | j or k) / Pr(Choose k | j or k) = Pr(Choose j | j, k, or l) / Pr(Choose k | j, k, or l)   for any l ∈ {1, ..., J}

A classical example of an IIA violation: the red bus-blue bus problem

The relative risk of j over k does not depend on the other alternatives:

Pr(Yi = j | Xi) / Pr(Yi = k | Xi) = exp{(Xij − Xik)⊤β}

That is, the multinomial choice reduces to a series of independent pairwise comparisons

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 212 / 242
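The IIA property is easy to verify numerically: under the softmax, dropping an alternative from the choice set leaves the odds between any remaining pair unchanged. A small check in Python (the utility values are arbitrary illustrations):

```python
import numpy as np

def mnl_probs(utilities):
    """Softmax choice probabilities over a given choice set."""
    e = np.exp(utilities - np.max(utilities))
    return e / e.sum()

# latent utilities for three alternatives j, k, l
u = np.array([1.0, 0.2, -0.5])

p_three = mnl_probs(u)        # choice set {j, k, l}
p_two = mnl_probs(u[:2])      # drop l: choice set {j, k}

# under IIA the odds of j versus k are identical in both choice sets,
# and equal exp(u_j - u_k) regardless of what else is available
ratio_three = p_three[0] / p_three[1]
ratio_two = p_two[0] / p_two[1]
```

Both ratios equal exp(0.8) here, which is exactly why the red bus-blue bus setup is a problem: adding a near-duplicate alternative cannot change any pairwise odds under MNL, even though substantively it should.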
![Page 1055: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1055.jpg)
Independence of Irrelevant Alternatives
MNL makes the Independence of irrelevant alternatives (IIA)assumption:
Pr(Choose j | j or k)
Pr(Choose k | j or k)=
Pr(Choose j | j or k or l)
Pr(Choose k | j or k or l)for any l ∈ {1, ...J}
A classical example of IIA violation: the red bus-blue bus problem
Relative risk of j over k does not depend on other alternatives:
Pr(Yi = j | Xi )
Pr(Yi = k | Xi )= exp{(Xij − Xik)>β}
That is, the multinomial choice reduces to a series of independentpairwise comparisons
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 212 / 242
![Page 1056: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1056.jpg)
Independence of Irrelevant Alternatives
MNL makes the Independence of irrelevant alternatives (IIA)assumption:
Pr(Choose j | j or k)
Pr(Choose k | j or k)=
Pr(Choose j | j or k or l)
Pr(Choose k | j or k or l)for any l ∈ {1, ...J}
A classical example of IIA violation: the red bus-blue bus problem
Relative risk of j over k does not depend on other alternatives:
Pr(Yi = j | Xi )
Pr(Yi = k | Xi )= exp{(Xij − Xik)>β}
That is, the multinomial choice reduces to a series of independentpairwise comparisons
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 212 / 242
![Page 1057: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1057.jpg)
Independence of Irrelevant Alternatives
MNL makes the Independence of irrelevant alternatives (IIA)assumption:
Pr(Choose j | j or k)
Pr(Choose k | j or k)=
Pr(Choose j | j or k or l)
Pr(Choose k | j or k or l)for any l ∈ {1, ...J}
A classical example of IIA violation: the red bus-blue bus problem
Relative risk of j over k does not depend on other alternatives:
Pr(Yi = j | Xi )
Pr(Yi = k | Xi )= exp{(Xij − Xik)>β}
That is, the multinomial choice reduces to a series of independentpairwise comparisons
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 212 / 242
![Page 1058: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1058.jpg)
Multinomial Probit Model

How can we relax the IIA assumption?

Instead of assuming εij to be i.i.d. across alternatives j, we allow εij to be correlated across j within each voter i

Multinomial probit model (MNP):

Y∗i = Xi⊤β + εi, where
  εi ∼ i.i.d. N(0, ΣJ)
  Y∗i = [Y∗i1 · · · Y∗iJ]⊤
  Xi = [Xi1 · · · XiJ]⊤

Restrictions on the model for identifiability:

- The (absolute) level of Y∗i shouldn't matter
  → Subtract the 1st equation from all the other equations and work with a system of J − 1 equations with εi ∼ i.i.d. N(0, ΣJ−1)
- The scale of Y∗i also shouldn't matter
  → Σ(1,1) = 1

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1059: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1059.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1060: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1060.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1061: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1061.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:
I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1062: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1062.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter
−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1063: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1063.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1064: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1064.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1065: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1065.jpg)
Multinomial Probit Model
How can we relax the IIA assumption?
Instead of assuming εij to be i.i.d. across alternatives j , we allow εijto be correlated across j within each voter i
Multinomial probit model (MNP):
Y ∗i = X>i β + εi where
εi
i.i.d.∼ N (0,ΣJ)Y ∗i = [Y ∗i1 · · · Y ∗iJ ]>
Xi = [Xi1 · · · XiJ ]>
Restrictions on the model for identifiability:I The (absolute) level of Y ∗i shouldn’t matter−→ Subtract the 1st equation from all the other equations and work
with a system of J − 1 equations with εii.i.d.∼ N (0, ΣJ−1)
I The scale of Y ∗i also shouldn’t matter
−→ Σ(1,1) = 1
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 213 / 242
![Page 1066: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1066.jpg)
Multinomial Probit Walkthrough

U∗i ∼ N(u∗i | μi, Σ)
μij = xijβj

with observation mechanism:

Yij = 1 if U∗ij > U∗ij′ for all j′ ≠ j, and 0 otherwise

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 214 / 242
![Page 1067: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1067.jpg)
Multinomial Probit Walkthrough
U∗i ∼ N(u∗i |µi ,Σ)
µij = xijβj
with observation mechanism:
Yij =
{1 if U∗ij > U∗ij ′ ∀ j 6= j ′
0 otherwise
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 214 / 242
![Page 1068: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1068.jpg)
Multinomial Probit Walkthrough
U∗i ∼ N(u∗i |µi ,Σ)
µij = xijβj
with observation mechanism:
Yij =
{1 if U∗ij > U∗ij ′ ∀ j 6= j ′
0 otherwise
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 214 / 242
![Page 1069: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1069.jpg)
Multinomial Probit Walkthrough
U∗i ∼ N(u∗i |µi ,Σ)
µij = xijβj
with observation mechanism:
Yij =
{1 if U∗ij > U∗ij ′ ∀ j 6= j ′
0 otherwise
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 214 / 242
![Page 1070: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1070.jpg)
The stochastic component:

Pr(Yij = 1) = πij, s.t. Σ_{j=1}^{J} πij = 1, for i = 1, ..., n

Systematic component. Let Y∗ij = U∗ij − U∗ij′, so the observation mechanism is

Yij = 1 if Y∗ij > 0, and 0 otherwise

πij = Pr(Yij = 1)
    = Pr(Y∗i1 ≤ 0, ..., Y∗ij > 0, ..., Y∗iJ ≤ 0)
    = ∫_{−∞}^{0} · · · ∫_{0}^{∞} · · · ∫_{−∞}^{0} N(y | μi, Σ) dyi1 · · · dyij · · · dyiJ

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
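Since this integral has no closed form, one standard way to approximate a choice probability is by simulation: draw latent utilities from N(μi, Σ) many times and record how often each alternative attains the maximum. A sketch in Python (the function name, means, and covariance values are illustrative, not from the course materials):

```python
import numpy as np

def mnp_choice_probs(mu, Sigma, n_draws=200_000, seed=0):
    """Monte Carlo estimate of MNP choice probabilities: simulate latent
    utilities U* ~ N(mu, Sigma) and count which alternative wins each draw."""
    rng = np.random.default_rng(seed)
    U = rng.multivariate_normal(mu, Sigma, size=n_draws)  # (n_draws, J)
    winners = U.argmax(axis=1)                            # chosen alternative
    return np.bincount(winners, minlength=len(mu)) / n_draws

mu = np.array([0.5, 0.0, -0.5])          # mean utilities for 3 alternatives
Sigma = np.array([[1.0, 0.5, 0.0],       # alternatives 1 and 2 have
                  [0.5, 1.0, 0.0],       # correlated errors
                  [0.0, 0.0, 1.0]])
pi = mnp_choice_probs(mu, Sigma)
```

Changing the off-diagonal entries of Sigma shifts the estimated probabilities, which is exactly the flexibility MNL's i.i.d. errors rule out.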
![Page 1071: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1071.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1072: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1072.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1073: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1073.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1074: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1074.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1075: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1075.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1076: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1076.jpg)
The stochastic component:
Pr(Yij = 1) = πij , s.t.J∑
j=1
πij = 1 for i = 1, . . . , n
Systematic component. Let Y ∗ij = U∗ij − U∗ij ′ , so the observationmechanism is
Yij =
{1 if Y ∗ij > 0
0 otherwise
πij = Pr(yij = 1)
= Pr(Y ∗i1 ≤ 0, . . . ,Y ∗ij > 0, . . . ,Y ∗iJ ≤ 0)
=
∫ 0
−∞· · ·∫ ∞
0· · ·∫ 0
−∞N(y |µi ,Σ)dyi1 · · · dyij · · · dyiJ
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 215 / 242
![Page 1077: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1077.jpg)
Computational and Estimation Issues

No analytical solution is known for the integral

Moreover, the number of parameters in ΣJ increases as J gets large, but the data contain little information about ΣJ:

J                            3    4    5    6    7
# of elements in ΣJ          6   10   15   21   28
# of parameters identified   2    5    9   14   20

Consequently, MNP is only feasible when J is small

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 216 / 242
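The counts in the table follow directly from the identification argument on the previous slide: a symmetric J × J matrix has J(J + 1)/2 distinct elements, while after differencing against a baseline (leaving a (J − 1) × (J − 1) covariance) and fixing one element for scale, J(J − 1)/2 − 1 parameters remain identified. A small Python check (the function name is my own):

```python
def sigma_counts(J):
    """Distinct vs. identified covariance parameters in a J-alternative MNP.

    elements:   J(J+1)/2 distinct entries of the symmetric J x J matrix.
    identified: (J-1)J/2 entries of the differenced covariance, minus one
                fixed for scale normalization.
    """
    elements = J * (J + 1) // 2
    identified = J * (J - 1) // 2 - 1
    return elements, identified

for J in range(3, 8):
    print(J, *sigma_counts(J))   # reproduces the table above
```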
![Page 1078: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1078.jpg)
Computational and Estimation issues
No analytical solution is known to the integral
Moreover, # of parameters in ΣJ increases as J gets large, but datacontain little information about ΣJ :
J 3 4 5 6 7# of elements in ΣJ 6 10 15 21 28# of parameters identified 2 5 9 14 20
Consequently, MNP is only feasible when J is small
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 216 / 242
![Page 1079: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1079.jpg)
Computational and Estimation issues
No analytical solution is known to the integral
Moreover, # of parameters in ΣJ increases as J gets large, but datacontain little information about ΣJ :
J 3 4 5 6 7# of elements in ΣJ 6 10 15 21 28# of parameters identified 2 5 9 14 20
Consequently, MNP is only feasible when J is small
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 216 / 242
![Page 1080: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1080.jpg)
Computational and Estimation issues
No analytical solution is known to the integral
Moreover, # of parameters in ΣJ increases as J gets large, but datacontain little information about ΣJ :
J 3 4 5 6 7# of elements in ΣJ 6 10 15 21 28# of parameters identified 2 5 9 14 20
Consequently, MNP is only feasible when J is small
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 216 / 242
![Page 1081: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1081.jpg)
1 Binary Outcome Models
2 Quantities of Interest
    An Example with Code
    Predicted Values
    First Differences
    General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
    Poisson
    Overdispersion
    Binomial for Known Trials
7 Duration Models
    Exponential Model
    Weibull Model
    Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 217 / 242
Estimating the Dispersion Parameter

In GLMs, φ is typically estimated sequentially after β̂ is obtained by MLE.
Two common methods for the common case of a(φ) = φ/ω_i:

1 Calculate the (unscaled) deviance:

    D(Y; μ̂) ≡ φ D*(Y; μ̂) = 2 ∑_{i=1}^n ω_i { Y_i(θ̃_i − θ̂_i) − (b(θ̃_i) − b(θ̂_i)) }

    which approximately follows φ · χ²_{n−k} (because D*(Y; μ̂) approx.∼ χ²_{n−k})

    Then estimate φ by φ̂_D = D(Y; μ̂)/(n − k) (because E[χ²_{n−k}] = n − k)

2 Calculate the generalized Pearson χ² statistic:

    X² ≡ ∑_{i=1}^n ω_i (Y_i − μ̂_i)² / b″(θ̂_i) = φ ∑_{i=1}^n (Y_i − μ̂_i)² / V̂_i

    which also approximately follows φ · χ²_{n−k}

    Then estimate φ by φ̂_P = X²/(n − k) (by the same logic)

When Y_i ∼ Normal, D = X² ∼ φχ²_{n−k} holds exactly, and φ̂_D and φ̂_P are identical and equal to the MLE
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 218 / 242
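As a concrete illustration of method 2, here is a minimal sketch (not from the slides; the simulated data and parameter values are assumptions) that computes the Pearson estimate φ̂_P for an intercept-only Poisson fit, where the MLE of μ is just the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
mu, gamma = 4.0, 2.0                     # NB draws: V = mu + mu^2/gamma = 12 > mu
y = rng.negative_binomial(gamma, gamma / (gamma + mu), size=n)

# Intercept-only Poisson fit: the MLE of mu is simply the sample mean, so k = 1
mu_hat = y.mean()

# Generalized Pearson X^2 with the Poisson variance function V_i = mu_i
X2 = np.sum((y - mu_hat) ** 2 / mu_hat)
phi_hat = X2 / (n - 1)
print(phi_hat)                           # close to V/mu = 3 for these parameters
```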
Relaxing the Distributional Assumption

Note that in the overdispersed Poisson model, Y_i cannot be Poisson distributed

How can this seemingly arbitrary modification be justified?

In GLMs, we can replace the distributional assumption with the variance function assumption:

1 Systematic component: η_i = X_i⊤β
2 Link function: X_i⊤β = g(μ_i), where μ_i ≡ E(Y_i | X)
3 Variance function: V(Y_i | X) = φ ψ(μ_i)

That is, we specify the mean and variance but remain agnostic about the rest of f(Y) (i.e., the likelihood)

With this reduced set of assumptions, what can we learn?

Bottom line: we lose nothing, thanks to the properties of the exponential family
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 219 / 242
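One practical payoff of the variance-function view: point estimates are unchanged, while model-based standard errors are inflated by √φ̂. A minimal intercept-only sketch on assumed simulated data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
y = rng.negative_binomial(2.0, 2.0 / (2.0 + 4.0), size=n)   # mean 4, variance 12

# Point estimate: identical to the Poisson MLE (here, the sample mean)
mu_hat = y.mean()
se_model = np.sqrt(mu_hat / n)           # model-based SE assumes V(Y) = mu
phi_hat = np.sum((y - mu_hat) ** 2 / mu_hat) / (n - 1)
se_quasi = np.sqrt(phi_hat) * se_model   # quasi-Poisson SE, inflated by sqrt(phi)
print(se_model < se_quasi)               # True whenever phi_hat > 1
```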
Negative Binomial Regression Models

The overdispersed Poisson model is also called the negative binomial 1 (NB1) model

An alternative parameterization for allowing overdispersion:

E(Y_i | X_i) = μ_i and V(Y_i | X_i) = μ_i + μ_i²/γ > E(Y_i | X_i)

This is called the negative binomial 2 (NB2) model

The NB2 model corresponds to the following PMF:

p(Y_i | μ_i, γ) = [Γ(Y_i + γ) / (Y_i! Γ(γ))] (μ_i / (μ_i + γ))^{Y_i} (γ / (μ_i + γ))^γ

where μ_i = exp(X_i⊤β)

(But keep in mind, you don't need to assume exactly this PMF!)

Estimation via (Q)MLE
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 220 / 242
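As a quick sanity check (a sketch, not from the slides; μ = 3 and γ = 2 are arbitrary choices), the PMF above sums to one and reproduces the stated mean μ and variance μ + μ²/γ:

```python
import math
import numpy as np

def nb2_logpmf(y, mu, gamma):
    # Log of the NB2 PMF from the slide; gamma is the shape parameter
    return (math.lgamma(y + gamma) - math.lgamma(y + 1) - math.lgamma(gamma)
            + y * math.log(mu / (mu + gamma))
            + gamma * math.log(gamma / (mu + gamma)))

mu, gamma = 3.0, 2.0
ys = np.arange(400)                  # truncation point; the tail mass is negligible
p = np.array([math.exp(nb2_logpmf(y, mu, gamma)) for y in ys])

print(p.sum())                       # ~1: it is a proper PMF
print((ys * p).sum())                # mean = mu = 3
print(((ys - mu) ** 2 * p).sum())    # variance = mu + mu^2/gamma = 7.5
```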
1 Binary Outcome Models
2 Quantities of Interest
    An Example with Code
    Predicted Values
    First Differences
    General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
    Poisson
    Overdispersion
    Binomial for Known Trials
7 Duration Models
    Exponential Model
    Weibull Model
    Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 221 / 242
Grouped Uncorrelated Binary Variables

Same model as binary logit, but we only observe sums of iid groups of Bernoulli trials. E.g., the number of times you voted out of the last 5 elections.

Y_i ∼ Binomial(y_i | π_i) = (N_i choose y_i) π_i^{y_i} (1 − π_i)^{N_i − y_i}

where

π_i = [1 + e^{−x_iβ}]^{−1}

which implies

E(Y_i) ≡ μ_i = N_i π_i = N_i [1 + e^{−x_iβ}]^{−1}

and a likelihood of

L(π | y) ∝ ∏_{i=1}^n Binomial(y_i | π_i)
         = ∏_{i=1}^n (N_i choose y_i) π_i^{y_i} (1 − π_i)^{N_i − y_i}
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 222 / 242
Grouped Uncorrelated Binary Variables

The log-likelihood is then:

ln L(π | y) = ∑_{i=1}^n { ln (N_i choose y_i) + y_i ln π_i + (N_i − y_i) ln(1 − π_i) }

and after substituting in the systematic component (and dropping the binomial coefficient, which does not involve β):

ln L(β | y) ≐ ∑_{i=1}^n { −y_i ln[1 + e^{−x_iβ}] + (N_i − y_i) ln(1 − [1 + e^{−x_iβ}]^{−1}) }
            = −∑_{i=1}^n { (N_i − y_i) ln(1 + e^{x_iβ}) + y_i ln(1 + e^{−x_iβ}) }
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 223 / 242
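The simplification in the last step can be verified numerically. This sketch (x, N, and y are simulated stand-ins, not from the slides) checks that the kernel in π and the simplified form in β agree exactly, since both drop the same binomial-coefficient constant:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
beta = 0.7
N = rng.integers(1, 6, size=n)            # N_i Bernoulli trials per observation
pi = 1.0 / (1.0 + np.exp(-x * beta))
y = rng.binomial(N, pi)

# Kernel of ln L(pi | y), without the binomial coefficient
ll1 = np.sum(y * np.log(pi) + (N - y) * np.log1p(-pi))

# Simplified form in beta from the slide (note both terms enter negatively)
ll2 = -np.sum((N - y) * np.log1p(np.exp(x * beta))
              + y * np.log1p(np.exp(-x * beta)))

print(np.isclose(ll1, ll2))               # True: the two forms are identical
```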
Grouped Uncorrelated Binary Variables

Notes:

1. Similar log-likelihood to binary logit.
2. All inference is about the same π as in binary logit.
3. How to simulate and compute quantities of interest?
   (a) Run optim, and get β̂ and the variance matrix.
   (b) Draw many values of β̃ from the multivariate normal with mean vector β̂ and the variance matrix that come from optim.
   (c) Set X to your choice of values, X_c.
   (d) Calculate simulations of the probability that any of the component binary variables is a one: π̃_c = [1 + e^{−X_c β̃}]^{−1}.
   (e) If π is of interest, summarize with mean, SD, CIs, or histogram as needed.
   (f) If simulations of y are needed, go one more step and draw ỹ from Binomial(ỹ_i | π̃_i).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 224 / 242
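Steps (b)–(f) above can be sketched in Python. Here β̂ and its variance matrix are hypothetical stand-ins for what optim (or any optimizer) would return in step (a), and X_c and N are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical step-(a) output: point estimate and variance matrix from the optimizer
beta_hat = np.array([-0.5, 1.0])
V_hat = np.array([[0.04, -0.01],
                  [-0.01, 0.09]])

# (b) Draw many betas from the multivariate normal
beta_tilde = rng.multivariate_normal(beta_hat, V_hat, size=10_000)

# (c) Set X to your choice of values, X_c (intercept, covariate = 1)
x_c = np.array([1.0, 1.0])

# (d) Simulations of pi_c = [1 + exp(-X_c beta)]^{-1}
pi_c = 1.0 / (1.0 + np.exp(-(beta_tilde @ x_c)))

# (e) Summarize pi_c: mean, SD, 95% interval
summary = (pi_c.mean(), pi_c.std(), np.percentile(pi_c, [2.5, 97.5]))

# (f) If simulations of y are needed, draw from Binomial(N, pi_c)
N_c = 20
y_tilde = rng.binomial(N_c, pi_c)
```

Each draw of β̃ propagates estimation uncertainty into π̃_c; the binomial draw in (f) adds fundamental uncertainty on top.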
Grouped Correlated Binary Variables

In the binomial-logit model, V(Y_i) = π_i(1 − π_i)/N_i, with no σ²-like parameter to take up slack. The beta-binomial (or extended beta-binomial, EBB) adds this extra parameter. The model:

Y_i ∼ f_ebb(y_i | π_i, γ)

where, recall,

f_{\mathrm{ebb}}(y_i \mid \pi_i, \gamma) = \Pr(Y_i = y_i \mid \pi_i, \gamma, N)
= \frac{N!}{y_i!(N - y_i)!} \prod_{j=0}^{y_i - 1} (\pi_i + \gamma j) \prod_{j=0}^{N - y_i - 1} (1 - \pi_i + \gamma j) \Big/ \prod_{j=0}^{N-1} (1 + \gamma j)

and

\pi_i = \frac{1}{1 + e^{-x_i \beta}}

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 225 / 242
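As a sanity-check sketch (not course code), the EBB pmf can be transcribed directly from the product formula above; at γ = 0 each product of (π + γj) terms collapses to a power of π, recovering the ordinary binomial, and for any γ the probabilities sum to one:

```python
import numpy as np
from math import comb

def febb(y, pi, gamma, N):
    """EBB pmf: C(N,y) * prod_{j<y}(pi + g j) * prod_{j<N-y}(1 - pi + g j) / prod_{j<N}(1 + g j)."""
    num1 = np.prod([pi + gamma * j for j in range(y)])            # empty product = 1
    num2 = np.prod([1.0 - pi + gamma * j for j in range(N - y)])
    den = np.prod([1.0 + gamma * j for j in range(N)])
    return comb(N, y) * num1 * num2 / den

# gamma = 0 collapses to the plain binomial; either way the pmf sums to 1
probs_binom = [febb(y, 0.3, 0.0, 5) for y in range(6)]
probs_ebb = [febb(y, 0.3, 0.1, 5) for y in range(6)]
```

Larger γ spreads probability toward the tails, which is exactly the extra-binomial variation the binomial-logit cannot absorb.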
The probability model of all the data:

\Pr(Y = y \mid \beta, \gamma; N) = \prod_{i=1}^{n} \left( \frac{N!}{y_i!(N - y_i)!} \right)
\times \prod_{j=0}^{y_i - 1} \left\{ [1 + \exp(-x_i\beta)]^{-1} + \gamma j \right\}
\times \prod_{j=0}^{N - y_i - 1} \left\{ [1 + \exp(x_i\beta)]^{-1} + \gamma j \right\} \Big/ \prod_{j=0}^{N-1} (1 + \gamma j)

\ln L(\beta, \gamma \mid y) = \sum_{i=1}^{n} \left\{ \ln\left( \frac{N!}{y_i!(N - y_i)!} \right)
+ \sum_{j=0}^{y_i - 1} \ln\left\{ [1 + \exp(-x_i\beta)]^{-1} + \gamma j \right\}
+ \sum_{j=0}^{N - y_i - 1} \ln\left\{ [1 + \exp(x_i\beta)]^{-1} + \gamma j \right\}
- \sum_{j=0}^{N-1} \ln(1 + \gamma j) \right\}

\doteq \sum_{i=1}^{n} \left\{ \sum_{j=0}^{y_i - 1} \ln\left\{ [1 + \exp(-x_i\beta)]^{-1} + \gamma j \right\}
+ \sum_{j=0}^{N - y_i - 1} \ln\left\{ [1 + \exp(x_i\beta)]^{-1} + \gamma j \right\}
- \sum_{j=0}^{N-1} \ln(1 + \gamma j) \right\}

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 226 / 242
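A direct Python transcription of this log-likelihood (again a sketch; the course workflow uses R's optim) with a quick check that setting γ = 0 recovers the grouped binomial log-likelihood, since every ln(π + γj) term then reduces to ln π. The data values below are arbitrary illustrations:

```python
import numpy as np

def ebb_negloglik(theta, X, y, N):
    """Negative EBB log-likelihood, dropping the ln N!/(y_i!(N-y_i)!) constant."""
    beta, gamma = np.asarray(theta[:-1]), theta[-1]
    pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
    ll = 0.0
    for pi_i, y_i in zip(pi, y):
        ll += sum(np.log(pi_i + gamma * j) for j in range(int(y_i)))
        ll += sum(np.log(1.0 - pi_i + gamma * j) for j in range(int(N - y_i)))
        ll -= sum(np.log(1.0 + gamma * j) for j in range(N))
    return -ll

# Check: at gamma = 0 this reduces to the grouped binomial log-likelihood
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([3, 7])
N = 10
beta = np.array([-0.5, 1.0])
nll_ebb = ebb_negloglik(np.append(beta, 0.0), X, y, N)
pi = 1.0 / (1.0 + np.exp(-(X @ beta)))
nll_binom = -np.sum(y * np.log(pi) + (N - y) * np.log(1 - pi))
```

Passing `ebb_negloglik` to an optimizer over (β, γ) jointly is then the EBB analogue of the binomial-logit fit.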
Notes:

1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification.
4. Assuming binomial when EBB is the right model causes SEs to be wrong.
5. How to simulate to compute quantities of interest?
   (a) Run optim, and get β̂, γ̂ and the variance matrix.
   (b) Draw many values of β̃ and γ̃ from the multivariate normal with mean vector (β̂, γ̂) and the variance matrix that come from optim.
   (c) Set X to your choice of values, X_c.
   (d) Calculate simulations of the probability that any of the component binary variables is a one: π̃_c = [1 + e^{−X_c β̃}]^{−1}.
   (e) If π is of interest, summarize with mean, SD, CIs, or histogram as needed.
   (f) If simulations of y are needed, go one more step and draw ỹ from f_ebb(ỹ_i | π̃_i, γ̃).

Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1140: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1140.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?
(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1141: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1141.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?
(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1142: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1142.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?
(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1143: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1143.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?
(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1144: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1144.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?
(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1145: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1145.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?(a) Run optim, and get β, γ and the variance matrix.
(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1146: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1146.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.
(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1147: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1147.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
![Page 1148: Soc504: Generalized Linear Models - Princeton UniversitySoc504: Generalized Linear Models Brandon Stewart1 Princeton February 22 - March 15, 2017 1These slides are heavily in uenced](https://reader036.fdocuments.in/reader036/viewer/2022071213/60325b969ab99022ad486975/html5/thumbnails/1148.jpg)
Notes:
1. The math looks complicated.
2. The use of this model is simple.
3. γ soaks up binomial misspecification
4. Assuming binomial when EBB is the right model causes se’s to bewrong.
5. How to simulate to compute quantities of interest?(a) Run optim, and get β, γ and the variance matrix.(b) Draw many values of β and γ from the multivariate normal with mean
vector ~(β, γ) and the variance matrix that come from optim.(c) Set X to your choice of values, Xc
(d) Calculate simulations of the probability that any of the component binaryvariables is a one:
πc = [1 + e−xc β]−1
(e) If π is of interest, summarize with mean, SD, CI’s, or histogram asneeded.
(f) If simulations of y are needed, go one more step and draw y fromfebb(yi |πi )
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 227 / 242
1 Binary Outcome Models
2 Quantities of Interest
   An Example with Code
   Predicted Values
   First Differences
   General Algorithms
3 Model Diagnostics for Binary Outcome Models
4 Ordered Categorical
5 Unordered Categorical
6 Event Count Models
   Poisson
   Overdispersion
   Binomial for Known Trials
7 Duration Models
   Exponential Model
   Weibull Model
   Cox Proportional Hazards Model
8 Duration-Logit Correspondence
9 Appendix: Multinomial Models
10 Appendix: More on Overdispersed Poisson
11 Appendix: More on Binomial Models
12 Appendix: Gamma Regression
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 228 / 242
Generalized Gamma Distribution

The Weibull and Exponential distributions are special cases of the Generalized Gamma distribution, Y ∼ GGamma(ν, λ, p):

f_Y(y) = [p λ^(pν) / Γ(ν)] y^(pν−1) exp(−(λy)^p)

When p = 1, Y ∼ Gamma(ν, λ):

f_Y(y) = [λ^ν / Γ(ν)] y^(ν−1) exp(−λy)

When ν = 1, Y ∼ Weibull(1/λ, p):

f_Y(y) = p λ^p y^(p−1) exp(−(λy)^p)

When p = 1 and ν = 1, Y ∼ Expo(λ):

f_Y(y) = λ exp(−λy)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 229 / 242
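A numerical sanity check of these special cases, as a self-contained Python sketch; the parameter values are arbitrary.

```python
import math

# Generalized gamma density as on the slide, and its claimed special cases.
def ggamma_pdf(y, nu, lam, p):
    return (p * lam**(p * nu) / math.gamma(nu)) * y**(p * nu - 1) * math.exp(-(lam * y)**p)

def gamma_pdf(y, nu, lam):
    return lam**nu / math.gamma(nu) * y**(nu - 1) * math.exp(-lam * y)

def weibull_pdf(y, lam, p):
    return p * lam**p * y**(p - 1) * math.exp(-(lam * y)**p)

def expo_pdf(y, lam):
    return lam * math.exp(-lam * y)

y, nu, lam, p = 1.7, 2.5, 0.8, 3.0
assert math.isclose(ggamma_pdf(y, nu, lam, 1.0), gamma_pdf(y, nu, lam))    # p = 1
assert math.isclose(ggamma_pdf(y, 1.0, lam, p), weibull_pdf(y, lam, p))    # nu = 1
assert math.isclose(ggamma_pdf(y, 1.0, lam, 1.0), expo_pdf(y, lam))        # p = nu = 1
```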
When would a Gamma regression be appropriate?

For positive random variables with a skewed distribution, the variance often increases with the mean.

Poisson random variable: Var(Y) = E[Y] = µ.

Another case occurs when the standard deviation increases linearly with the mean:

√Var(Y) ∝ E(Y)

In this case, the coefficient of variation (the ratio of the standard deviation to the expectation) is constant:

c.v. = √Var(Y) / E(Y)

The Gamma distribution has this property.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 230 / 242
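The constant-c.v. property is easy to verify from the Gamma moments: with shape ν and rate λ_i = ν/µ_i, the mean is µ_i and the variance is µ_i²/ν, so sd/mean = 1/√ν regardless of µ_i. A small sketch (parameter values arbitrary):

```python
import math

# For Y ~ Gamma(nu, lambda) with lambda = nu/mu (so E[Y] = mu),
# Var(Y) = nu/lambda^2 = mu^2/nu, and the coefficient of variation
# sd/mean = 1/sqrt(nu) does not depend on mu.
def gamma_cv(mu, nu):
    lam = nu / mu
    mean = nu / lam
    var = nu / lam**2
    return math.sqrt(var) / mean

nu = 4.0
cvs = [gamma_cv(mu, nu) for mu in (0.5, 2.0, 10.0, 100.0)]
print(cvs)  # each value is 1/sqrt(nu) = 0.5, for every mu
```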
When would a Gamma regression be appropriate?

Example: Mean duration of the developmental period in Drosophila melanogaster (McCullagh & Nelder, 1989).

[Figure: mean duration (hrs.) plotted against temperature (Celsius), roughly 15-30 degrees.]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 231 / 242
When would a Gamma regression be appropriate?

Example: Mean duration of the developmental period in Drosophila melanogaster (McCullagh & Nelder, 1989).

[Figure: batch standard deviation plotted against mean duration (hrs.).]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 232 / 242
Gamma shapes

ν is the shape parameter; λ is the rate (inverse-scale) parameter.

f(y) = [λ^ν / Γ(ν)] y^(ν−1) exp(−λy)

[Figure: Gamma densities for ν = 0.5, 1, 2, 5, with λ = 1.]

Special cases:

ν = 1 =⇒ Exponential
ν → ∞ =⇒ Normal
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 233 / 242
Gamma as an EDF

f_Y(y) = [λ^ν / Γ(ν)] y^(ν−1) exp(−λy)
       = exp( [−(λ/ν)y + ln(λ/ν)] / ν^(−1) + ν ln(νy) − ln(y) − ln Γ(ν) )

where

θ = −λ/ν
φ = ν^(−1) = σ²
b(θ) = −ln(λ/ν)
E[Y] = b′(θ) = ν/λ = µ
Var(Y) = φ b″(θ) = (1/ν)(ν²/λ²) = σ²µ²
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 234 / 242
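The EDF rearrangement can be checked numerically; a self-contained Python sketch with arbitrary parameter values:

```python
import math

# Gamma density in its usual form...
def gamma_pdf(y, nu, lam):
    return lam**nu / math.gamma(nu) * y**(nu - 1) * math.exp(-lam * y)

# ...and rebuilt from the EDF components theta, phi, b(theta), c(y, phi).
def gamma_edf(y, nu, lam):
    theta = -lam / nu
    phi = 1.0 / nu                      # = sigma^2
    b = -math.log(lam / nu)             # b(theta) = -ln(-theta)
    c = nu * math.log(nu * y) - math.log(y) - math.log(math.gamma(nu))
    return math.exp((y * theta - b) / phi + c)

y, nu, lam = 1.3, 2.0, 0.7
assert math.isclose(gamma_pdf(y, nu, lam), gamma_edf(y, nu, lam))
# E[Y] = b'(theta) = -1/theta = nu/lam; Var(Y) = phi * b''(theta) = phi/theta^2
```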
Link functions

Canonical link:

η = θ = −1/µ

The reciprocal transformation does not map the range of µ onto the whole real line.

The requirement that µ > 0 places restrictions on the β's.

The canonical link is rarely used.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 235 / 242
Link functions

Inverse polynomial, linear:

η = µ^(−1) = β0 + β1/x

Inverse polynomial, quadratic:

η = µ^(−1) = β0 + β1 x + β2/x

Inverse polynomials have the appealing property that η is everywhere positive and bounded.

Application: sometimes used in plant density experiments, where yield per plant (y_i) varies inversely with plant density (x_i).
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 236 / 242
Link functions

Log link:

η = ln(µ) = β0 + β1 x
η = ln(µ) = β0 + β1 x + β2/x

Application: useful for describing functions that have turning points but are noticeably asymmetric around that point.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 237 / 242
Link functions

Identity link:

η = µ = β0 + β1 x

Application: used for modeling variance components.
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 238 / 242
Maximum Likelihood Estimation

L = ∏_{i=1}^{n} [λ^ν / Γ(ν)] y_i^(ν−1) exp(−λy_i)

ln L = ∑_{i=1}^{n} ln{ [λ^ν / Γ(ν)] y_i^(ν−1) exp(−λy_i) }
     = ∑_{i=1}^{n} [ ν ln λ − ln Γ(ν) + (ν−1) ln y_i − λy_i ]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 239 / 242
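For fixed ν, this log likelihood has a closed-form maximizer in λ: the score is d ln L/dλ = nν/λ − ∑ y_i, so λ̂ = nν/∑ y_i. A self-contained Python sketch (in R one would instead hand ln L to optim; the simulated data and parameter values are made up for illustration):

```python
import math
import random

# Gamma log likelihood as written on the slide.
def loglik(lam, nu, ys):
    return sum(nu * math.log(lam) - math.lgamma(nu) + (nu - 1) * math.log(y) - lam * y
               for y in ys)

# Simulate data with known shape nu and rate lam_true.
random.seed(1)
nu, lam_true = 2.0, 0.5
ys = [random.gammavariate(nu, 1.0 / lam_true) for _ in range(2000)]  # scale = 1/rate

# Closed-form MLE for lambda given nu, and a check that it maximizes lnL locally.
lam_hat = len(ys) * nu / sum(ys)
assert loglik(lam_hat, nu, ys) >= loglik(lam_hat * 1.05, nu, ys)
assert loglik(lam_hat, nu, ys) >= loglik(lam_hat * 0.95, nu, ys)
print(round(lam_hat, 2))  # close to the true rate 0.5
```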
Gamma regression with weights

Suppose your data consist of n observations, each from a separate group i ∈ {1, . . . , n}. Each group has n_i individuals.

Example: Y_i is the total duration of the embryonic period in the i-th of n batches of fruit flies:

Y_i = ∑_{j=1}^{n_i} Y_ij,   Y_ij = duration for the j-th embryo in the i-th batch

Ȳ_i = Y_i / n_i = average duration in the i-th batch

If the Y_ij ∼ Gamma(λ_i, ν) are independent, with λ_i = ν/µ_i:

E[Ȳ_i] = (1/n_i) ∑_{j=1}^{n_i} E[Y_ij] = µ_i

Var(Ȳ_i) = (1/n_i²) Var(Y_i) = σ²µ_i² / n_i   =⇒ weights = n_i
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 240 / 242
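The variance of the batch mean can be confirmed by Monte Carlo; a self-contained Python sketch with made-up values for ν, µ_i, and n_i:

```python
import random
import statistics

# Check Var(Ybar_i) = sigma^2 * mu_i^2 / n_i for means of n_i iid
# Gamma(nu, lambda_i) durations with lambda_i = nu/mu_i (so sigma^2 = 1/nu).
random.seed(2)
nu, mu_i, n_i = 4.0, 10.0, 25
scale = mu_i / nu                        # gammavariate takes shape and scale

means = [statistics.fmean(random.gammavariate(nu, scale) for _ in range(n_i))
         for _ in range(20_000)]
emp_var = statistics.variance(means)
theory = (1.0 / nu) * mu_i**2 / n_i      # sigma^2 mu^2 / n = 1.0 here
print(round(emp_var, 2), theory)
```

This is why the batch means enter the fit with weights n_i: larger batches have proportionally smaller variance.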
Application: Drosophila melanogaster

4 models estimated:

1 log(Duration_i) = β0 + β1 Temp_i
2 log(Duration_i) = β0 + β1 Temp_i + β2/Temp_i
3 log(Duration_i) = β0 + β1 Temp_i + β2/Temp_i (weighted by batch size)
4 log(Duration_i) = β0 + β1 Temp_i + β2/(Temp_i − δ) (weighted by batch size)
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 241 / 242
Application: Drosophila melanogaster

[Figure: four panels of mean duration (hrs.) vs. temperature (Celsius), one per fitted model:
1 log(µ) = β0 + β1 x, min. duration at 35 degrees
2 log(µ) = β0 + β1 x + β2/x, min. duration at 35 degrees
3 log(µ) = β0 + β1 x + β2/x (weighted), min. duration at 35 degrees
4 log(µ) = β0 + β1 x + β2/(x − δ) (weighted), min. duration at 29.95 degrees]
Stewart (Princeton) GLMs Feb 22 - Mar 15, 2017 242 / 242