Module I: Statistical Background on
Multi-level Models
Francesca Dominici
Scott L. Zeger
Michael Griswold
The Johns Hopkins University
Bloomberg School of Public Health
Statistical Background on Multi-level Models
• Multi-level models
- Main ideas
- Conditional
- Marginal
- Contrasting Examples
A Rose is a Rose is a…
• Multi-level model
• Random effects model
• Mixed model
• Random coefficient model
• Hierarchical model
Multi-level Models – Main Idea
• Biological, psychological and social processes that influence health occur at many levels:
– Cell
– Organ
– Person
– Family
– Neighborhood
– City
– Society
• An analysis of risk factors should consider:
– Each of these levels
– Their interactions
[Diagram label: Health Outcome]
Example: Alcohol Abuse
Level:
1. Cell: Neurochemistry
2. Organ: Ability to metabolize ethanol
3. Person: Genetic susceptibility to addiction
4. Family: Alcohol abuse in the home
5. Neighborhood: Availability of bars
6. Society: Regulations; organizations; social norms
Example: Alcohol Abuse; Interactions among Levels
Level:
• 5 Availability of bars and 6 State laws about drunk driving
• 4 Alcohol abuse in the family and 2 Person’s ability to metabolize ethanol
• 3 Genetic predisposition to addiction and 4 Household environment
• 6 State regulations about intoxication and 3 Job requirements
Notation: Population
State: $s = 1, \dots, S$
Neighborhood: $i = 1, \dots, I_s$
Family: $j = 1, \dots, J_{si}$
Person: $k = 1, \dots, K_{sij}$
Outcome: $Y_{sijk}$
Predictors: $X_{sijk}$
Person $sijk$, e.g. $(y_{1223}, x_{1223})$
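One way to picture this indexing is a lookup keyed by (state, neighborhood, family, person). The sketch below is illustrative only: the records, their values, and the helper names `lookup` and `family_members` are made up for the example and do not come from the course data.

```python
# Hypothetical records keyed by (s, i, j, k) = (state, neighborhood, family, person).
# Each value is (y, x): an outcome and a predictor. All numbers are invented.
data = {
    (1, 2, 2, 3): (0.0, 41.5),
    (1, 2, 2, 1): (1.0, 38.0),
    (1, 1, 1, 1): (0.0, 52.3),
}


def lookup(s, i, j, k):
    """Return (y_sijk, x_sijk) for person k in family j, neighborhood i, state s."""
    return data[(s, i, j, k)]


def family_members(s, i, j):
    """Sorted person indices k observed in family j of neighborhood i, state s."""
    return sorted(k for (s2, i2, j2, k) in data if (s2, i2, j2) == (s, i, j))


# The pair written on the slide as (y_1223, x_1223):
y_1223, x_1223 = lookup(1, 2, 2, 3)
```

The tuple key makes the nesting explicit: two people share a cluster at a given level exactly when their keys agree on the leading indices.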
Notation (cont.)
Multi-level Models: Idea

| Level | Index | Predictor variable | Symbol |
|---|---|---|---|
| State | s | State support of the poor | X.s |
| Neighborhood | si | Percent poverty in neighborhood | X.n |
| Family | sij | Family income | X.f |
| Person | sijk | Person’s income | X.p |

Response: Alcohol abuse, $Y_{sijk}$
Digression on Statistical Models
• A statistical model is an approximation to reality
• There is not a “correct” model;
– ( forget the holy grail )
• A model is a tool for asking a scientific question;
– ( screw-driver vs. sledge-hammer )
• A useful model combines the data with prior information to address the question of interest.
• Many models are better than one.
Generalized Linear Models (GLMs)
$g(\mu) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$, where $\mu = E(Y \mid X)$ is the mean

| Model | Response | $g(\mu)$ | Distribution | Coefficient interpretation |
|---|---|---|---|---|
| Linear | Continuous (ounces) | $\mu$ | Gaussian | Change in avg(Y) per unit change in X |
| Logistic | Binary (disease) | $\log\{\mu/(1-\mu)\}$ | Binomial | Log odds ratio |
| Log-linear | Count / times to events | $\log(\mu)$ | Poisson | Log relative risk |
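The three rows of the GLM table differ only in the link function g. As a minimal sketch (the dictionary `LINKS` and the helpers `linear_predictor` and `mean_response` are names invented here, not part of the course material), each link can be paired with its inverse so that a linear predictor is mapped back to a mean:

```python
import math


def logit(mu):
    """Logistic link: log odds of a probability mu in (0, 1)."""
    return math.log(mu / (1.0 - mu))


def inv_logit(eta):
    """Inverse logistic link: map a log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# (link, inverse link) for each model in the table.
LINKS = {
    "linear": (lambda mu: mu, lambda eta: eta),
    "logistic": (logit, inv_logit),
    "log-linear": (math.log, math.exp),
}


def linear_predictor(betas, xs):
    """beta_0 + beta_1 * x_1 + ... + beta_p * x_p."""
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


def mean_response(model, betas, xs):
    """Mean mu = g^{-1}(linear predictor) for the chosen model."""
    _, inverse_link = LINKS[model]
    return inverse_link(linear_predictor(betas, xs))
```

For example, `mean_response("logistic", [0.0, 1.0], [0.0])` evaluates the inverse logit at a linear predictor of zero, which is a probability of one half.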
Generalized Linear Models (GLMs)
$g(\mu) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$
Example: Age & Gender
Gaussian – Linear: $E(y) = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender}$
$\beta_1$ = change in average response per 1 unit increase in Age, comparing people of the SAME GENDER.
WHY?
Since: $E(y \mid \text{Age}+1, \text{Gender}) = \beta_0 + \beta_1(\text{Age}+1) + \beta_2\,\text{Gender}$
And: $E(y \mid \text{Age}, \text{Gender}) = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender}$
$\Delta E(y) = \beta_1$
Generalized Linear Models (GLMs)
$g(\mu) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$
Example: Age & Gender
Binary – Logistic: $\log\{\text{odds}(Y)\} = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender}$
$\beta_1$ = log-OR of “+ Response” for a 1 unit increase in Age, comparing people of the SAME GENDER.
WHY?
Since: $\log\{\text{odds}(y \mid \text{Age}+1, \text{Gender})\} = \beta_0 + \beta_1(\text{Age}+1) + \beta_2\,\text{Gender}$
And: $\log\{\text{odds}(y \mid \text{Age}, \text{Gender})\} = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender}$
$\Delta$ log-Odds $= \beta_1$
log-OR $= \beta_1$
Generalized Linear Models (GLMs)
$g(\mu) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$
Example: Age & Gender
Counts – Log-linear: $\log\{E(Y)\} = \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender}$
$\beta_1$ = log-RR for a 1 unit increase in Age, comparing people of the SAME GENDER.
WHY?
Verify for Yourself Tonight
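For readers who want to check their answer, the log-linear case follows the same two-line argument used for the linear and logistic examples. A worked version, writing $\beta_0, \beta_1, \beta_2$ for the regression coefficients:

```latex
\begin{align*}
\log E(Y \mid \text{Age}+1, \text{Gender}) &= \beta_0 + \beta_1(\text{Age}+1) + \beta_2\,\text{Gender} \\
\log E(Y \mid \text{Age}, \text{Gender})   &= \beta_0 + \beta_1\,\text{Age} + \beta_2\,\text{Gender} \\
\log \frac{E(Y \mid \text{Age}+1, \text{Gender})}{E(Y \mid \text{Age}, \text{Gender})} &= \beta_1
\end{align*}
```

Subtracting the second line from the first leaves only $\beta_1$, so $\exp(\beta_1)$ is the ratio of expected counts per 1 unit increase in Age at fixed Gender, which is the relative risk.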
Most Important Assumptions of Regression Analysis?
A. Data follow normal distribution
B. All the key covariates are included in the model
C. Xs are fixed and known
D. Responses are independent
Most important: B (all the key covariates are included in the model) and D (responses are independent)
Within-Cluster Correlation
• Fact: two responses from the same family tend to be more like one another than two observations from different families
• Fact: two observations from the same neighborhood tend to be more like one another than two observations from different neighborhoods
• Why?
Why? (Family Wealth Example)
[Figure: two family lineages, each labeled Great-Grandparents, Grandparents, Parents, You; the diagram also carries the label "GOD"]
Multi-level Models: Idea

| Level | Index | Predictor variable | Symbol | Unobserved random intercept |
|---|---|---|---|---|
| State | s | State support of the poor | X.s | Drunk driving laws, $a.s_s$ |
| Neighborhood | si | Percent poverty in neighborhood | X.n | Bars, $a.n_{si}$ |
| Family | sij | Family income | X.f | Genes, $a.f_{sij}$ |
| Person | sijk | Person’s income | X.p | |

Response: Alcohol abuse, $Y_{sijk}$
Key Components of Multi-level Model
• Specification of predictor variables from multiple levels
– Variables to include
– Key interactions
• Specification of correlation among responses from same clusters
• Choices must be driven by the scientific question
Multi-level Shmulti-level
• Multi-level analysis of social/behavioral phenomena: an important idea
• Multi-level models involve predictors from multiple levels and their interactions
• They must account for correlation among observations within clusters (levels) to make efficient and valid inferences.
Key Idea for Regression with Correlated Data
Must take account of correlation to:
• Obtain valid inferences
– standard errors
– confidence intervals
– posteriors
• Make efficient inferences
Logistic Regression Example: Cross-over trial
Ordinary logistic regression:
• Response: 1 = normal; 0 = alcohol dependence
• Predictors: period (x1); treatment group (x2)
• Two observations per person
• Parameter of interest: log odds ratio of dependence: treatment vs placebo
Mean Model: $\log\{\text{odds}(\text{AD})\} = \beta_0 + \beta_1\,\text{Period} + \beta_2\,\text{Trt}$
Results: estimate (standard error)

| Variable | Ordinary Logistic Regression | Account for correlation |
|---|---|---|
| Intercept ($\beta_0$) | 0.66 (0.32) | 0.67 (0.29) |
| Period ($\beta_1$) | -0.27 (0.38) | -0.30 (0.23) |
| Treatment ($\beta_2$) | 0.56 (0.38) | 0.57 (0.23) |

Similar estimates, WRONG Standard Errors (& Inferences) for OLR
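The direction of the error (standard errors that are too large when the comparison is made within person) can be seen in a simple Gaussian analogue. The sketch below is not the logistic analysis from the slides: it simulates a continuous outcome with a shared subject effect, and every number in it (sample size, variances, effect size, seed) is an assumption chosen for illustration.

```python
import math
import random
import statistics


def simulate_crossover(n_subjects=200, tau=2.0, sigma=1.0, effect=0.5, seed=1):
    """Two observations per subject; a shared subject effect induces correlation."""
    rng = random.Random(seed)
    placebo, treated = [], []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, tau)  # subject-specific level, common to both periods
        placebo.append(b + rng.gauss(0.0, sigma))
        treated.append(b + effect + rng.gauss(0.0, sigma))
    return placebo, treated


def naive_se(a, b):
    """SE of mean(b) - mean(a) if all observations are treated as independent."""
    return math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))


def paired_se(a, b):
    """SE of the mean within-subject difference, which respects the pairing."""
    diffs = [y - x for x, y in zip(a, b)]
    return statistics.stdev(diffs) / math.sqrt(len(diffs))


placebo, treated = simulate_crossover()
se_naive = naive_se(placebo, treated)
se_paired = paired_se(placebo, treated)
```

Both approaches give the same point estimate of the treatment difference, but the naive standard error ignores that the subject effect cancels in a within-person comparison, so it comes out larger than the paired one.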
Variance of Least Squares and ML Estimators of Slope vs. First Lag Correlation
[Figure: curves for the variance reported by LS, the true variance of LS, and the variance of the MLE]
Source: DHLZ 2002 (pg 19)
Simulated Data: Non-Clustered
[Figure: alcohol consumption (ml/day) by cluster number (neighborhood)]
Simulated Data: Clustered
[Figure: alcohol consumption (ml/day) by cluster number (neighborhood)]
Within-Cluster Correlation
• Correlation of two observations from same cluster = (Total Var – Within Var) / Total Var
• Non-Clustered = (9.8-9.8) / 9.8 = 0
• Clustered = (9.8-3.2) / 9.8 = 0.67
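The arithmetic above can be reproduced directly. This is a small sketch of the variance-ratio formula only; the function name `within_cluster_correlation` is chosen here for illustration, and the inputs are the two variance pairs quoted on the slide.

```python
def within_cluster_correlation(total_var, within_var):
    """(Total Var - Within Var) / Total Var, the share of variance between clusters."""
    if total_var <= 0:
        raise ValueError("total variance must be positive")
    return (total_var - within_var) / total_var


# Variances quoted for the two simulated data sets.
non_clustered = within_cluster_correlation(9.8, 9.8)
clustered = within_cluster_correlation(9.8, 3.2)
```

When the within-cluster variance equals the total variance, nothing is explained by cluster membership and the correlation is zero; the smaller the within-cluster variance, the closer the correlation gets to one.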
Models for Clustered Data
• Models are tools for inference
• Choice of model determined by scientific question
• Scientific Target for inference?
– Marginal mean:
  • Average response across the population
– Conditional mean:
  • Given other responses in the cluster(s)
  • Given unobserved random effects
Marginal Models
• Target – marginal mean or population-average response for different values of predictor variables
• Compare Groups
• Examples:
– Mean alcohol consumption for Males vs Females
– Rates of alcohol abuse for states with active addiction treatment programs vs inactive states
• Public health (a.k.a. population) questions
ex. mean model: $E(\text{AlcDep}) = \beta_0 + \beta_1\,\text{Gender}$
Marginal GLMs for Multi-level Data: Generalized Estimating Equations (GEE)
• Mean Model: (Ordinary GLM - linear, logistic, ...)
– Population-average parameters
– e.g. $\log\{\text{odds}(\text{AlcDep}_{ij})\} = \beta_0 + \beta_1\,\text{Gender}_{ij}$ (subject i in cluster j)
• Association Model: (for observations in clusters)
– e.g. $\log\{\text{Odds Ratio}(Y_{ij}, Y_{kj})\} = \gamma_0$ (two different subjects, i & k, in cluster j)
• Solving GEE (DHLZ, 2002) gives nearly efficient and valid inferences about population-average parameters
OLR vs GEE: Cross-over Example

| Variable | Ordinary Logistic Regression | GEE Logistic Regression |
|---|---|---|
| Intercept | 0.66 (0.32) | 0.67 (0.29) |
| Period | -0.27 (0.38) | -0.30 (0.23) |
| Treatment | 0.56 (0.38) | 0.57 (0.23) |
| log( OR ) (association) | 0.0 | 3.56 (0.81) |
Marginal Model Interpretations
• $\log\{\text{odds}(\text{AlcDep})\} = \beta_0 + \beta_1\,\text{Period} + \beta_2\,\text{trt} = 0.67 + (-0.30)\,\text{Period} + (0.57)\,\text{trt}$
TRT Effect: (placebo vs. trt)
OR = exp( 0.57 ) = 1.77, 95% CI (1.12, 2.80)
Odds of Alcohol Dependence are almost twice as high on placebo, regardless of (adjusting for) time period
WHY?
Since: $\log\{\text{odds}(\text{AlcDep} \mid \text{Period}, \text{pl})\} = \beta_0 + \beta_1\,\text{Period} + \beta_2$
And: $\log\{\text{odds}(\text{AlcDep} \mid \text{Period}, \text{trt})\} = \beta_0 + \beta_1\,\text{Period}$
$\Delta$ log-Odds $= \beta_2$
OR $= \exp(\beta_2)$
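The odds ratio and its interval come from exponentiating the coefficient and its Wald limits. The sketch below assumes a normal-approximation interval with multiplier 1.96 and uses the rounded GEE estimate 0.57 and standard error 0.23 from the results table; the function name `odds_ratio_ci` is invented for the example. Because the inputs are rounded, the endpoints land close to, but not exactly on, the interval printed on the slide.

```python
import math


def odds_ratio_ci(log_or, se, z=1.96):
    """Return (OR, lower, upper) from a log odds ratio and its standard error."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Rounded GEE treatment estimate and standard error from the results table.
or_gee, lower_gee, upper_gee = odds_ratio_ci(0.57, 0.23)
```

The interval is symmetric on the log-odds scale and becomes asymmetric around the odds ratio after exponentiation.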
Conditional Models
• Conditional on other observations in cluster
– Probability a person abuses alcohol as a function of the number of family members that do
– A person’s average alcohol consumption as a function of the average in the neighborhood
• Use other responses from the cluster as predictors in regressions like additional covariates
ex: $E(\text{AlcDep}_{ij}) = \beta_0 + \beta_1\,\text{Gender}_{ij} + \beta_2\,\text{AlcDep}_j$
Conditional on Other Responses: Usually a Bad Idea
• Definition of “other responses in cluster” depends on size/nature of cluster
– e.g. “number of other family members who do”
  • 0 for a single person means something different than 0 in a family with 10 others
• The “risk factors” may affect the entire cluster; conditioning on the responses for the others will dilute the risk factor effect
– Two eyes example
ex: $\log\{\text{odds}(\text{Blind}_{i,\text{Left}})\} = \beta_0 + \beta_1\,\text{Sun} + \beta_2\,\text{Blind}_{i,\text{Right}}$, with $\beta_1$ diluted toward 0
Conditional Models
• Conditional on unobserved latent variables or “random effects”
– Alcohol use within a family is related because family members share an unobserved “family effect”: common genes, diets, family culture and other unmeasured factors
– Repeated observations within a neighborhood are correlated because neighbors share: common traditions, access to services, stress levels,…
Random Effects Models
• Latent (random) effects are unobserved
– inferred from the correlation among residuals
• Random effects models describe the marginal mean and the source of correlation in one equation
• Assumptions about the latent variables determine the nature of the associations
– ex: Random Intercept = Uniform Correlation
ex: $E(\text{AlcDep}_{ij} \mid b_j) = \beta_0 + \beta_1\,\text{Gender}_{ij} + b_j$
where: $b_j \sim N(0, \tau^2)$ is the cluster-specific random effect
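The claim that a random intercept implies uniform correlation can be checked in the Gaussian linear case. The derivation below is a sketch under assumptions not stated on the slide: a residual term $\varepsilon_{ij}$ with variance $\sigma^2$ is added, $\tau^2$ denotes the random-intercept variance, and $b_j$ and the residuals are taken to be mutually independent.

```latex
\begin{align*}
Y_{ij} &= \beta_0 + \beta_1\,\text{Gender}_{ij} + b_j + \varepsilon_{ij},
  \qquad b_j \sim N(0, \tau^2), \quad \varepsilon_{ij} \sim N(0, \sigma^2) \\
\operatorname{Var}(Y_{ij}) &= \tau^2 + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{kj}) &= \operatorname{Var}(b_j) = \tau^2 \qquad (i \neq k) \\
\operatorname{Corr}(Y_{ij}, Y_{kj}) &= \frac{\tau^2}{\tau^2 + \sigma^2}
\end{align*}
```

The correlation does not depend on which two members of cluster $j$ are compared, which is what "uniform" means here. It is also the same ratio of between-cluster variance to total variance used for within-cluster correlation.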
OLR vs R.E.: Cross-over Example

| Variable | Ordinary Logistic Regression | Random Int. Logistic Regression |
|---|---|---|
| Intercept | 0.66 (0.32) | 2.2 (1.0) |
| Period | -0.27 (0.38) | -1.0 (0.84) |
| Treatment | 0.56 (0.38) | 1.8 (0.93) |
| log( ) (association) | 0.0 | 5.0 (2.3) |
Conditional Model Interpretations
• $\log\{\text{odds}(\text{AlcDep}_i \mid b_i)\} = \beta_0 + \beta_1\,\text{Period} + \beta_2\,\text{trt} + b_i = 2.2 + (-1.0)\,\text{Period} + (1.8)\,\text{trt} + b_i$
where: $b_i \sim N(0, 5^2)$ is the ith subject’s latent propensity for Alcohol Dependence
TRT Effect: (placebo vs. trt)
OR = exp( 1.8 ) = 6.05, 95% CI (0.94, 38.9)
A Specific Subject’s odds of Alcohol Dependence are 6 TIMES higher on placebo, regardless of (adjusting for) time period
Conditional Model Interpretations
WHY?
Since: $\log\{\text{odds}(\text{AlcDep}_i \mid \text{Period}, \text{pl}, b_i)\} = \beta_0 + \beta_1\,\text{Period} + \beta_2 + b_i$
And: $\log\{\text{odds}(\text{AlcDep}_i \mid \text{Period}, \text{trt}, b_i)\} = \beta_0 + \beta_1\,\text{Period} + b_i$
$\Delta$ log-Odds $= \beta_2$
OR $= \exp(\beta_2)$
• In order to make comparisons we must keep the subject-specific latent effect (bi) the same.
• In a Cross-Over trial we have outcome data for each subject on both placebo & treatment
• What about in a usual clinical trial / cohort study?
Marginal vs. Random Effects Models
• For linear models, regression coefficients in random effects models and marginal models are identical:
average of linear function = linear function of average
• For non-linear models, (logistic, log-linear,…) coefficients have different meanings/values, and address different questions
- Marginal models -> population-average parameters
- Random effects models -> cluster-specific parameters
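The gap between the two kinds of coefficient in a logistic model can be illustrated numerically. The sketch below takes the rounded random-intercept estimates quoted in the cross-over example (intercept 2.2, placebo effect 1.8, random-intercept standard deviation 5) as given, averages the subject-specific probabilities over the random-intercept distribution with a plain trapezoid rule, and converts the averaged probabilities back to a log odds ratio. The grid size, integration width, and all function names are assumptions made for the illustration; this is not how the slides' GEE estimates were obtained.

```python
import math


def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def marginal_prob(eta, tau, n_grid=4001, width=8.0):
    """Average expit(eta + b) over b ~ N(0, tau^2) using the trapezoid rule."""
    lo, hi = -width * tau, width * tau
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for m in range(n_grid):
        b = lo + m * step
        weight = 0.5 if m in (0, n_grid - 1) else 1.0  # trapezoid end-point weights
        density = math.exp(-0.5 * (b / tau) ** 2) / (tau * math.sqrt(2.0 * math.pi))
        total += weight * expit(eta + b) * density
    return total * step


# Rounded conditional (subject-specific) estimates from the cross-over example.
beta0, beta_placebo, tau = 2.2, 1.8, 5.0

p_treatment = marginal_prob(beta0, tau)
p_placebo = marginal_prob(beta0 + beta_placebo, tau)
marginal_log_or = logit(p_placebo) - logit(p_treatment)
```

Averaging over subjects flattens the logistic curve, so `marginal_log_or` comes out well below the conditional value of 1.8 while keeping the same sign. With an identity link the same averaging would leave the coefficient unchanged, which is the "average of linear function = linear function of average" point.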
Marginal vs. Random Intercept Model
Marginal: $\log\{\text{odds}(Y_i)\} = \beta_0 + \beta_1\,\text{Gender}$
Random intercept: $\log\{\text{odds}(Y_i \mid u_i)\} = \beta_0 + \beta_1\,\text{Gender} + u_i$
[Figure: Female and Male curves contrasting cluster-specific comparisons with population prevalences]
Source: DHLZ 2002 (pg 135)
Marginal vs. Random Intercept Models; Cross-over Example

| Variable | Ordinary Logistic Regression | Marginal (GEE) Logistic Regression | Random-Effect Logistic Regression |
|---|---|---|---|
| Intercept | 0.66 (0.32) | 0.67 (0.29) | 2.2 (1.0) |
| Period | -0.27 (0.38) | -0.30 (0.23) | -1.0 (0.84) |
| Treatment | 0.56 (0.38) | 0.57 (0.23) | 1.8 (0.93) |
| Log OR (assoc.) | 0.0 | 3.56 (0.81) | 5.0 (2.3) |
Comparison of Marginal and Random Effect Logistic Regressions
• Regression coefficients in the random effects model are roughly 3.3 times as large
– Marginal: population odds (prevalence with/prevalence without) of AlcDep are exp(0.57) = 1.8 times greater on placebo than on active drug; population-average parameter
– Random Effects: a person’s odds of AlcDep are exp(1.8) = 6.0 times greater on placebo than on active drug; cluster-specific, here person-specific, parameter
Which model is better? They ask different questions.
Marginalized Multi-level Models (MMM)
• Heagerty (1999, Biometrics); Heagerty and Zeger (2000, Statistical Science)
• Model:
– marginal mean as a function of covariates
– conditional mean given random effects as a function of marginal mean and cluster-specific random effects
• Random Effects allow flexible association models, but public health is usually concerned with population-averaged (marginal) questions.
Schematic of Marginal Random-effects Model
Marginal and Random Intercept Models: Cross-over Example

| Variable | Ordinary Logistic Regression | GEE Logistic Regression | MMM Logistic Regression | Random Int. Logistic Regression |
|---|---|---|---|---|
| Intercept | 0.66 (0.32) | 0.67 (0.29) | 0.65 (0.28) | 2.2 (1.0) |
| Period | -0.27 (0.38) | -0.30 (0.23) | -0.33 (0.22) | -1.0 (0.84) |
| Treatment | 0.56 (0.38) | 0.57 (0.23) | 0.58 (0.23) | 1.8 (0.93) |
| log(OR) (assoc.) | 0.0 | 3.56 (0.81) | 5.44 (3.72) | 5.0 (2.3) |
Refresher: Forests & Trees
Multi-Level Models:
– Explanatory variables from multiple levels
  • Family
  • Neighborhood
  • State
– Interactions
Must take account of correlation among responses from same clusters:
– Marginal: GEE, MMM
– Conditional: RE, GLMM
Illustration of Conditional Models and Marginal Multi-level Models; The British Social Attitudes Survey
• Binary Response: $Y_{ijk}$ = 1 if favor abortion, 0 if not
• Levels (notation)
– Year: k=1,…,4 (1983-1986)
– Subject: j=1,…,264
– District: i=1,…,54
– Overall Sample: N = 1,056
• Levels (conception)
– 1: time within person
– 2: persons within districts
– 3: districts
Covariates at Three Levels
• Level 1: time
– Indicators of time
• Level 2: person
– Class: upper working; lower working
– Gender
– Religion: protestant, catholic, other
• Level 3: district
– Percentage protestant (derived)
Scientific Questions
Conditional model:
• How does a person’s religion influence her probability of favoring abortion?
• How does the predominant religion in a person’s district influence her probability of favoring abortion?
Marginal model:
• How does the rate of favoring abortion differ between protestants and otherwise similar catholics?
• How does the rate of favoring abortion differ between districts that are predominantly protestant versus other religions?
Conditional Multi-level Model
Levels:
1. Time: k
2. Person: j
3. District: i
Person and district random effects
Conditional Multi-level Model Results
[Table: results grouped by level (3, 2, 1)]
Conditional Scientific Answers
• How does a person’s religion influence her probability of favoring abortion?
• How does the predominant religion in a person’s district influence her probability of favoring abortion?
But Wait!…
Conditional Model Interpretations: Model 4
WHY?
$\log\{\text{odds}(\text{Fav} \mid \text{Catholic}, X, b_{2,ij}, b_{3,ij})\} = \beta_0 + X\beta + \beta_8 + b_{2,0} + b_C + b_{3,0}$
$\log\{\text{odds}(\text{Fav} \mid \text{Protestant}, X, b_{2,ij}, b_{3,ij})\} = \beta_0 + X\beta + b_{2,0} + b_{3,0}$
$\Delta$ log-Odds $= \beta_8 + b_C$
OR $= \exp(\beta_8 + b_C)$
OR $\neq \exp(\beta_8)$
Conditional Model Interpretations: Model 4
What happens if you simply report $\exp(\beta_8)$??
$\log\{\text{odds}(\text{Fav} \mid \text{Catholic}, X, b_{2,ij}, b_{3,ij})\} = \beta_0 + X\beta + \beta_8 + b_{2,0} + b_C + b_{3,0}$
$\log\{\text{odds}(\text{Fav} \mid \text{Prot/Cath}, X, b_{2,ij}, b_{3,ij})\} = \beta_0 + X\beta + b_{2,0} + b_C + b_{3,0}$
$\Delta$ log-Odds $= \beta_8$
OR $= \exp(\beta_8)$
But there were NO subjects in the study who were simultaneously BOTH Catholic AND Protestant
( Similar for % protestant! )
Marginal Multi-level Model
Levels:
1. Time: k
2. Person: j
3. District: i
Mean Model:
Association Model: (Separate)
Person and district random effects
Marginal Multi-level Model Results
[Table: results grouped by level (3, 2, 1)]
Marginal Scientific Answers
• How does the rate of favoring abortion differ between protestants and otherwise similar catholics?
• How does the rate of favoring abortion differ between districts that are predominantly protestant versus other religions?
Key Points
• “Multi-level” Models:
– Have covariates from many levels and their interactions
– Acknowledge correlation among observations from within a level (cluster)
• Conditional and Marginal Multi-level models have different targets; ask different questions
• When population-averaged parameters are the focus, use
– GEE
– Marginal Multi-level Models (Heagerty and Zeger, 2000)
Key Points (continued)
• When cluster-specific parameters are the focus, use random effects models that condition on unobserved latent variables that are assumed to be the source of correlation
• Warning: Model Carefully. Cluster-specific targets often involve extrapolations where there are no actual data for support
– e.g. % protestant in neighborhood given a random neighborhood effect