Post on 19-Feb-2017
Introduction | Formula | Examples | Discussion | References
New Epidemiologic Measures in Multilevel Study:
Median Risk Ratio, Median Hazard Ratio and Median Beta
Jinseob Kim1
Graduate School of Public Health, Seoul National University, Korea
Apr 4, 2014
Jinseob Kim1 New Epidemiologic Measures in Multilevel Study:
Contents
1 Introduction: ICC & VPC; Multilevel analysis: binomial case; This study
2 Formula: Brief review of median OR; Median RR, Median HR, Median Beta
3 Examples: Data; Count data; Cox proportional hazard model
4 Discussion
ICC & VPC | Multilevel analysis: binomial case | This study
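Goals
1 Introduce ways to describe the effect of a grouping variable in a multilevel study.
2 Propose new measures that describe the effect of the grouping variable intuitively.
3 Show through examples how the measures are calculated and interpreted.
4 Discuss their significance as new epidemiologic measures.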
Example
Health survey conducted in 2000 in the county of Scania, Sweden [11]
1 10,723 persons aged 18-80 years in 60 areas
2 Individual propensity to consult private physicians vs. area.
3 Y: whether the person consulted a private physician in the past year (binomial)
4 X: individual-level variables, area-level variables, area
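Effect of group variable
1 Repeated measures, random effects, multilevel models, hierarchical GLM, GEE, GLMM, ...
2 Estimating a separate beta for every group is impractical (too many groups: 50, 100, ...).
3 Even if they are estimated, they are hard to interpret (50 groups → 49 betas).
4 Summarize the effect of the grouping variable with a single number: its variance, V_group.
5 How large is the variance? 0: the group does not matter; the larger the variance, the more the group matters.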
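Intraclass correlation coefficient (ICC), variance partition coefficient (VPC)
$$Y_i = X_i\beta + \mathrm{Group}_i + \varepsilon_i \tag{1}$$
$$\mathrm{ICC} = \frac{V_{Group}}{V_Y} = \frac{V_{Group}}{V_{Group} + V_{\varepsilon}} \tag{2}$$
1 A measure of the effect of the grouping variable [1, 6].
2 0: the grouping variable is meaningless; 1: the grouping variable explains all of the variation in Y.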
ICC example
lmer(formula = TG ~ age + sex + BMI + (1 | FID), data = a)
Estimate Std. Error t value
(Intercept) -65.222107 35.8720093 -1.8181894
age 0.109564 0.3318413 0.3301699
sex -41.942137 11.3684264 -3.6893529
BMI 8.648601 1.2917159 6.6954362
Groups Name Std.Dev.
FID (Intercept) 39.356
Residual 72.007
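$$\mathrm{ICC} = \frac{39.356^2}{39.356^2 + 72.007^2} \approx 0.23 \tag{3}$$
Interpretation: even after adjusting for age, sex, and BMI, FID (family) accounts for 23% of the remaining variance in TG.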
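The arithmetic in (3) can be reproduced in a few lines of base R. This is a minimal sketch that copies the two standard deviations from the lmer output by hand; the variable names are illustrative and not taken from the slides.

```r
# Standard deviations copied from the lmer output (random-effects part)
sd_group <- 39.356   # FID (Intercept)
sd_resid <- 72.007   # Residual

# Variances are the squared standard deviations
v_group <- sd_group^2
v_resid <- sd_resid^2

# ICC: share of the variance left after the fixed effects that lies between groups
icc <- v_group / (v_group + v_resid)
print(round(icc, 2))   # 0.23
```

With a fitted lme4 model the same standard deviations could be read from the model object instead of being typed in, but that step is left out here to keep the sketch free of package dependencies.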
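Different scales: how should the ICC be defined?
$$\mathrm{Var}(Y_i) = p_i(1 - p_i) \tag{4}$$
$$\mathrm{logit}(p_i) = X_i\beta + \mathrm{Group}_i \tag{5}$$
The individual-level variance in (4) is on the probability (proportion) scale, whereas the group effect in (5) is on the logit scale, so the two variances cannot simply be added as in the linear case [3].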
Example: binomial case
glmer(formula = hyperTG ~ age + sex + BMI + (1 | FID), data = a,
family = binomial)
Estimate Std. Error z value Pr(>|z|)
(Intercept) -6.65451749 1.48227814 -4.4893852 7.142904e-06
age 0.01052907 0.01206682 0.8725635 3.829010e-01
sex -1.48506920 0.60773433 -2.4436158 1.454090e-02
BMI 0.19131619 0.05022612 3.8090977 1.394749e-04
Groups Name Std.Dev.
FID (Intercept) 1.1163
Solution
1 Linearization : logit → proportion
2 Simulation : proportion → logit
3 Latent variable
Approximation of ICC, calculation issue[3, 15]
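As an illustration of the latent-variable option, the sketch below applies it to the glmer example. It assumes the usual convention for a logit link, in which the individual-level variance is taken to be that of the standard logistic distribution, π²/3; the slides do not spell this out, so treat it as one possible reading rather than the author's calculation.

```r
# Random-intercept standard deviation copied from the glmer output
sd_group <- 1.1163          # FID (Intercept)
v_group  <- sd_group^2

# Assumption: level-1 variance on the latent (logit) scale is pi^2 / 3,
# the variance of the standard logistic distribution
v_latent <- pi^2 / 3

# Latent-variable ICC: both variances are now on the logit scale
icc_latent <- v_group / (v_group + v_latent)
print(round(icc_latent, 2))   # about 0.27
```

The linearization and simulation approaches generally give different values, because they evaluate the ICC on the probability scale and depend on the covariate values chosen.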
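Median Odds Ratio (MOR)
Larsen et al. (2000, 2005): if two groups are chosen at random, what is the median OR comparing the group with the higher odds to the group with the lower odds? [8, 7, 11]
$$\mathrm{MOR} = \exp\left(\sqrt{2V_{Group}} \times \Phi^{-1}(0.75)\right) \approx \exp\left(0.95\sqrt{V_{Group}}\right) \tag{6}$$
1 Range 1 to ∞: 1 means no group effect; larger values mean a larger group effect.
2 Can be calculated from V_Group alone.
3 Interpreted on the OR scale, in the same way as the ORs for age or sex.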
Example: binomial case
glmer(formula = hyperTG ~ age + sex + BMI + (1 | FID), data = a,
family = binomial)
Groups Name Std.Dev.
FID (Intercept) 1.1163
$$\mathrm{MOR} = \exp\left(\sqrt{2 \times 1.1163^2} \times 0.6745\right) \approx 2.90 \tag{7}$$
Interpretation: for two randomly chosen families, the median OR is about 2.90.
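The MOR formula can be evaluated directly with base R, where `qnorm(0.75)` supplies Φ⁻¹(0.75). A minimal sketch, with the standard deviation typed in from the glmer output and variable names chosen for illustration:

```r
# Random-intercept standard deviation copied from the glmer output
sd_group <- 1.1163            # FID (Intercept)
v_group  <- sd_group^2        # variance of the family-level random intercept

# MOR = exp( sqrt(2 * V_group) * qnorm(0.75) ), with qnorm(0.75) ~ 0.6745
mor <- exp(sqrt(2 * v_group) * qnorm(0.75))
print(round(mor, 2))          # 2.9
```

Evaluated this way, the formula gives a median OR of about 2.90 for a random-intercept standard deviation of 1.1163.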
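What about count data? Survival analysis?
Count data (rates, number of children, ...)
1 With a Poisson distribution the ICC can be calculated (similar to the binomial case).
2 Gamma, negative binomial, ...? Interpretation issue: the ICC lives on a 0-1 scale.
Cox proportional hazards model
1 There is no concept of ICC: the outcome is a hazard function.
2 In practice only V_group is reported, which is hard to interpret.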
Goal
New measures in multilevel analysis.
1 Count data (Poisson, gamma, negative binomial, ...) [4, 14, 16]: Median Risk Ratio
2 Survival data (Cox proportional hazards): Median Hazard Ratio
3 Continuous data: Median Beta
These measures are interpreted on the same scale as ordinary covariates, are simple to compute, and their confidence intervals are easy to obtain [5].
Brief review of median OR | Median RR, Median HR, Median Beta
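Multilevel logistic regression [8]
$$\mathrm{logit}[\Pr(Y_{ij} = 1 \mid X_{ij}, G_j)] = \beta_0 + X'_{ij}\beta_1 + G_j \tag{8}$$
($\beta_0$: intercept, $\beta_1$: vector of fixed regression coefficients, $G_j$: random intercept, $G_j \sim N(0, V_g)$)
$$\mathrm{Odds}[\Pr(Y_{ij} = 1 \mid X_{ij}, G_j)] = \exp(\beta_0)\exp(X'_{ij}\beta_1)\exp(G_j) \tag{9}$$
$$\frac{\mathrm{Odds}[\Pr(Y_{ij} = 1 \mid X, G_j)]}{\mathrm{Odds}[\Pr(Y_{ik} = 1 \mid X, G_k)]} = \exp(G_j - G_k) \tag{10}$$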
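Compare the group with the higher odds to the group with the lower odds:
$$\mathrm{OR} = \exp|G_j - G_k| \tag{11}$$
$$(G_j - G_k) \sim N(0, 2V_g) \tag{12}$$
So, when two groups are chosen at random and the group with the higher odds is compared with the group with the lower odds, the median of the resulting OR is
$$\mathrm{MOR} = \exp\left(\sqrt{2V_g} \times \Phi^{-1}(0.75)\right) \approx \exp\left(0.95\sqrt{V_g}\right) \tag{13}$$
($\Phi$: cumulative distribution function (CDF) of the standard normal distribution)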
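The step from the distribution of $G_j - G_k$ to the 0.75 quantile is not written out on the slide. One way to fill it in, with $\Phi$ denoting the standard normal CDF and $D$, $Z$, $m$ introduced here only for the derivation:

```latex
\begin{align*}
D &= G_j - G_k \sim N(0,\, 2V_g), \qquad Z = \frac{D}{\sqrt{2V_g}} \sim N(0, 1) \\
\Pr(|Z| \le m) &= 2\Phi(m) - 1 = \tfrac{1}{2}
  \;\Longrightarrow\; \Phi(m) = 0.75
  \;\Longrightarrow\; m = \Phi^{-1}(0.75) \approx 0.6745 \\
\operatorname{median}\bigl(\exp|D|\bigr) &= \exp\bigl(\sqrt{2V_g}\, m\bigr)
  = \exp\bigl(\sqrt{2} \times 0.6745 \times \sqrt{V_g}\bigr)
  \approx \exp\bigl(0.95\sqrt{V_g}\bigr)
\end{align*}
```

The last line uses the fact that the exponential function is monotone, so the median of $\exp|D|$ is the exponential of the median of $|D|$.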
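Multilevel Poisson regression [9]
$$Y_{ij} \mid \lambda_{ij} \sim \mathrm{Pois}(\lambda_{ij}) \tag{14}$$
$$\ln(\lambda_{ij} \mid X_{ij}, G_j) = \beta_0 + X'_{ij}\beta_1 + G_j \tag{15}$$
$$\frac{\mathrm{Risk}[(\lambda_{ij} \mid X, G_j)]}{\mathrm{Risk}[(\lambda_{ik} \mid X, G_k)]} = \exp(G_j - G_k) \tag{16}$$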
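Compare the group with the higher risk to the group with the lower risk:
$$\mathrm{RR} = \exp|G_j - G_k| \tag{17}$$
$$(G_j - G_k) \sim N(0, 2V_g) \tag{18}$$
Using this, when two groups are chosen at random and the group with the higher risk is compared with the group with the lower risk, the median of the resulting RR is
$$\mathrm{MRR} = \exp\left(\sqrt{2V_g} \times \Phi^{-1}(0.75)\right) \approx \exp\left(0.95\sqrt{V_g}\right) \tag{19}$$
($\Phi$: cumulative distribution function (CDF) of the standard normal distribution)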
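The closed form can be checked by simulation: draw many pairs of group effects, form the ratio of the higher to the lower group, and take the median. The sketch below does this in base R; the variance `v_g = 0.5`, the number of pairs, and the seed are hypothetical values chosen only for illustration.

```r
set.seed(1)                       # for reproducibility of the sketch

v_g     <- 0.5                    # hypothetical group-level variance (not from the slides)
n_pairs <- 100000                 # number of random pairs of groups

# Random intercepts for the two groups in each pair, G ~ N(0, V_g)
g_j <- rnorm(n_pairs, mean = 0, sd = sqrt(v_g))
g_k <- rnorm(n_pairs, mean = 0, sd = sqrt(v_g))

# Ratio comparing the higher-risk group with the lower-risk group
rr <- exp(abs(g_j - g_k))

mrr_sim     <- median(rr)                          # simulated median ratio
mrr_formula <- exp(sqrt(2 * v_g) * qnorm(0.75))    # closed form

print(c(simulated = mrr_sim, formula = mrr_formula))
```

The two numbers should agree up to simulation error; the same check applies unchanged to the MOR and MHR, since all three share the same random-intercept structure.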
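Multilevel Cox proportional hazards analysis [10]
$$\ln\left[\frac{h_{ij}(t)}{h_0(t)} \,\Big|\, X_{ij}, G_j\right] = \beta_0 + X'_{ij}\beta_1 + G_j \tag{20}$$
($h_{ij}(t)$: hazard function of the $i$th individual in the $j$th group, $h_0(t)$: baseline hazard function)
$$\frac{[h_{ij}(t) \mid X, G_j]}{[h_{ik}(t) \mid X, G_k]} = \exp(G_j - G_k) \tag{21}$$
$$\mathrm{MHR} = \exp\left(\sqrt{2V_g} \times \Phi^{-1}(0.75)\right) \approx \exp\left(0.95\sqrt{V_g}\right) \tag{22}$$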
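Gaussian multilevel regression
$$Y_{ij} \sim N(\mu_{ij}, \sigma^2) \tag{23}$$
$$(\mu_{ij} \mid X_{ij}, G_j) = \beta_0 + X'_{ij}\beta_1 + G_j \tag{24}$$
$$(\mu_{ij} \mid X, G_j) - (\mu_{ik} \mid X, G_k) = G_j - G_k \tag{25}$$
$$\text{Median Beta} = \sqrt{2V_g} \times \Phi^{-1}(0.75) \approx 0.95\sqrt{V_g} \tag{26}$$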
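To make the Median Beta concrete, the sketch below applies the formula to the Gaussian lmer example from the introduction (TG with a family-level random intercept, standard deviation 39.356). Applying it to that example is an illustration added here, not a result reported on the slides.

```r
# Random-intercept standard deviation copied from the earlier lmer output
sd_group <- 39.356            # FID (Intercept)
v_g      <- sd_group^2

# Median Beta = sqrt(2 * V_g) * qnorm(0.75): no exponentiation, because the
# Gaussian model is already on the scale of the outcome
median_beta <- sqrt(2 * v_g) * qnorm(0.75)
print(round(median_beta, 2))  # 37.54
```

Read like an ordinary regression coefficient: for two randomly chosen families with the same covariates, the median absolute difference in expected TG is about 37.5, in the units of TG.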
Data | Count data | Cox proportional hazard model
Minnesota Breast Cancer Study: kinship2 package in R [13]
1 3725 obs. of 15 variables (females with non-missing data)
2 education: 1 = high school or less, 2 = some college, 3 = college graduate or higher
3 marstat: 1 = married or common-law, 2 = widowed or divorced, 3 = never married
4 yob (year of birth): 1: -1919, 2: 1920-1939, 3: 1940-1959, 4: 1960-
5 parity: number of children
6 cancer: 1 = breast cancer, 0 = censored
7 endage: age at last follow-up or at cancer diagnosis
8 famid: family ID
See data
famid endage cancer yob education marstat parity
16 4 64 0 2 2 1 2
20 4 69 0 2 2 1 2
22 4 59 0 3 3 2 2
23 4 59 0 3 3 2 2
31 4 62 0 2 2 2 1
35 4 61 0 3 2 1 2
Variables        Model.1             Model.2             Model.3
(Intercept)      2.75 (2.67~2.83)    3.53 (3.35~3.71)    3.61 (3.35~3.9)
Education
  1              .                   1                   1
  2              .                   0.84 (0.8~0.89)     0.89 (0.84~0.94)
  3              .                   0.67 (0.63~0.71)    0.75 (0.7~0.79)
Marriage
  1              .                   1                   1
  2              .                   1.03 (0.98~1.08)    0.95 (0.9~1)
  3              .                   0.07 (0.05~0.11)    0.08 (0.05~0.13)
Year of birth
  ~1919          .                   .                   1
  ~1939          .                   .                   1.13 (1.05~1.21)
  ~1959          .                   .                   0.75 (0.7~0.81)
  1960~          .                   .                   0.52 (0.47~0.59)
V famid          0.03                0.02                0.01
Median RR        1.18                1.14                1.11
Table: Y: parity, Group: family ID, lme4 package in R
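The Median RR row follows from the V famid row through formula (19). A minimal sketch in base R, using the variances as printed in the table (rounded to two decimals), so the last digit can differ slightly from values computed from the unrounded model estimates:

```r
# Family-level variances as printed in the table (rounded to two decimals)
v_famid <- c(Model.1 = 0.03, Model.2 = 0.02, Model.3 = 0.01)

# MRR = exp( sqrt(2 * V_g) * qnorm(0.75) ), vectorised over the three models
mrr <- exp(sqrt(2 * v_famid) * qnorm(0.75))
print(round(mrr, 2))
```

For Model.1 and Model.2 this reproduces the tabulated 1.18 and 1.14. For Model.3 the rounded variance 0.01 gives roughly 1.10 rather than 1.11, which is consistent with the table having been computed from an unrounded variance slightly above 0.01.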
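Interpretation
1 The fixed effects are interpreted as ordinary RRs.
2 The variance attributable to family is 0.03, 0.02, and 0.01 in Models 1 to 3.
3 MRR: for two randomly chosen families, the median RR comparing the higher-rate family with the lower-rate family is 1.18, 1.14, and 1.11, respectively.
4 Does an effect of the family itself remain even after accounting for education, marital status, and period effect?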
Variables            Model.1    Model.2             Model.3             Model.4
I(parity > 0)TRUE    .          0.71 (0.66~0.77)    0.72 (0.65~0.79)    0.74 (0.67~0.81)
Education
  1                  .          .                   1                   1
  2                  .          .                   1.32 (1.25~1.4)     1.24 (1.17~1.31)
  3                  .          .                   1.07 (0.99~1.14)    0.97 (0.9~1.04)
Marriage
  1                  .          .                   1                   1
  2                  .          .                   1.03 (0.99~1.07)    1.15 (1.1~1.2)
  3                  .          .                   1.08 (0.71~1.64)    1.23 (0.81~1.88)
Year of birth
  ~1919              .          .                   .                   1
  ~1939              .          .                   .                   1.41 (1.3~1.52)
  ~1959              .          .                   .                   2.52 (2.21~2.87)
  1960~              .          .                   .                   1.5 (0.17~13.14)
V famid              0.18       0.18                0.18                0.17
Median HR            1.49       1.5                 1.5                 1.49
Table: Y: Breast cancer hazard, Group: family ID, coxme package in R
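Interpretation
1 The fixed effects are interpreted as ordinary hazard ratios.
2 The variance attributable to family is about 0.18.
3 MHR: for two randomly chosen families, the median HR comparing the higher-hazard family with the lower-hazard family is about 1.5.
4 Does the familial effect persist at a constant level regardless of childbirth history, education, marital status, and period effect?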
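Count and survival versions of the median OR.
1 Poisson regression: the ICC can be calculated, but what about negative binomial or gamma models?
2 Cox: the ICC concept is hard to apply; until now the variance of the grouping variable was simply reported and left at that.
3 MRR, MHR: can be interpreted on the same scale as the other estimates [2, 12].
4 Simple to calculate: only the variance of the grouping variable is needed.
5 Confidence intervals are also easier to obtain than for the ICC.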
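Conclusion
1 New measures describing the effect of the grouping variable in multilevel analysis of count and survival data.
2 Count data: an alternative to the ICC, on whichever scale one wants to interpret (proportion vs. RR).
3 Cox: best explanation??
4 Median Beta can also serve as an alternative to the ICC.
5 Because they describe the group-level effect in a multilevel study intuitively, these measures should help decision-making and communication.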
Packages
1 lme4, nlme, coxme.. in R
2 Confidence interval for MHR, MRR: calculation issue..
3 Using Bayesian hierarchical model with OpenBUGS, JAGS,Stan..
4 R2OpenBUGS, BRugs, rjags, R2jags, rstan
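One simple route to an interval for the MHR or MRR, sketched below, is to transform the endpoints of an interval for V_g, which works because the mapping from V_g to the median ratio is monotone increasing. The point estimate 0.18 is the V famid value from the Cox example; the interval limits 0.10 and 0.30 and the helper `to_mhr` are hypothetical, introduced only to show the mechanics, and are not results from the slides.

```r
# Point estimate from the Cox example table; interval limits are HYPOTHETICAL
v_g_est <- 0.18
v_g_ci  <- c(lower = 0.10, upper = 0.30)   # assumed 95% interval for V_g

# Hypothetical helper: map a group-level variance to the median hazard ratio
to_mhr <- function(v) exp(sqrt(2 * v) * qnorm(0.75))

mhr_est <- to_mhr(v_g_est)   # point estimate of the MHR
mhr_ci  <- to_mhr(v_g_ci)    # monotone transform keeps the interval ordering

print(round(c(estimate = mhr_est, mhr_ci), 2))
```

With a Bayesian fit (OpenBUGS, JAGS, Stan), the same function could be applied to each posterior draw of V_g and the interval read off the quantiles of the transformed draws.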
Reference I
[1] Bartko, J. J. (1966). The intraclass correlation coefficient as a measure of reliability. Psychological reports, 19(1):3–11.
[2] Bolker, B. M., Brooks, M. E., Clark, C. J., Geange, S. W., Poulsen, J. R., Stevens, M. H. H., and White, J.-S. S. (2009). Generalized linear mixed models: a practical guide for ecology and evolution. Trends in ecology & evolution, 24(3):127–135.
[3] Browne, W. J., Subramanian, S. V., Jones, K., and Goldstein, H. (2005). Variance partitioning in multilevel logistic models that exhibit overdispersion. Journal of the Royal Statistical Society: Series A (Statistics in Society), 168(3):599–613.
[4] Coxe, S., West, S. G., and Aiken, L. S. (2009). The analysis of count data: A gentle introduction to poisson regression and its alternatives. Journal of personality assessment, 91(2):121–136.
[5] Do Ha, I. and Lee, Y. (2005). Multilevel mixed linear models for survival data. Lifetime data analysis, 11(1):131–142.
[6] Goldstein, H., Browne, W., and Rasbash, J. (2002). Partitioning variation in multilevel models. Understanding Statistics: Statistical Issues in Psychology, Education, and the Social Sciences, 1(4):223–231.
[7] Larsen, K. and Merlo, J. (2005). Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression. American journal of epidemiology, 161(1):81–88.
[8] Larsen, K., Petersen, J. H., Budtz-Jørgensen, E., and Endahl, L. (2000). Interpreting parameters in the logistic regression model with random effects. Biometrics, 56(3):909–914.
[9] Lee, A. H., Wang, K., Scott, J. A., Yau, K. K., and McLachlan, G. J. (2006). Multi-level zero-inflated poisson regression modelling of correlated count data with excess zeros. Statistical Methods in Medical Research, 15(1):47–61.
Reference II
[10] Liu, L. and Huang, X. (2008). The use of gaussian quadrature for estimation in frailty proportional hazards models. Statistics in medicine, 27(14):2665–2683.
[11] Merlo, J., Chaix, B., Ohlsson, H., Beckman, A., Johnell, K., Hjerpe, P., Rastam, L., and Larsen, K. (2006). A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena. Journal of Epidemiology and Community Health, 60(4):290–297.
[12] Therneau, T. (2012). Mixed effects cox models. R-package description. URL: http://cran.r-project.org/web/packages/coxme/vignettes/coxme.pdf.
[13] Therneau, T., Atkinson, E., Sinnwell, J., Schaid, D., and McDonnell, S. (2014). kinship2: Pedigree functions. R package version 1.5.7.
[14] Ver Hoef, J. M. and Boveng, P. L. (2007). Quasi-poisson vs. negative binomial regression: how should we model overdispersed count data? Ecology, 88(11):2766–2772.
[15] Vigre, H., Dohoo, I., Stryhn, H., and Busch, M. (2004). Intra-unit correlations in seroconversion to actinobacillus pleuropneumoniae and mycoplasma hyopneumoniae at different levels in danish multi-site pig production facilities. Preventive veterinary medicine, 63(1-2):9–28.
[16] Winkelmann, R. and Zimmermann, K. F. (1995). Recent developments in count data modelling: theory and application. Journal of economic surveys, 9(1):1–24.
END
Email : secondmath85@gmail.com