Measures of association - Biostatistics

Biostatistics Lecture 10

description

Understanding Measures of association briefly at http://www.helpwithassignment.com/statistics-assignment-help

Transcript of Measures of association - Biostatistics

Page 1: Measures of association - Biostatistics

Biostatistics

Lecture 10

Page 2: Measures of association - Biostatistics

Lecture 9 Review – Measures of association

•  Measures of association

–  Risk difference

–  Risk ratio

–  Odds ratio

•  Calculation & interpretation of confidence interval for each measure of association

Page 3: Measures of association - Biostatistics

2×2 table - Measures of association

Outcome - binary

Measure of effect    Formula
Risk difference      p1 - p0
Risk ratio           p1 / p0
Odds ratio           (d1/h1) / (d0/h0)
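
The three formulas can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `measures_of_association` is an assumption, and the counts are the TBM trial counts from the 2×2 table shown later in this lecture (d = number with the outcome, h = number without, group 1 = exposed, group 0 = unexposed).

```python
def measures_of_association(d1, h1, d0, h0):
    """Return (risk difference, risk ratio, odds ratio) for a 2x2 table.

    d1, h1: outcome present / absent in the exposed group (group 1)
    d0, h0: outcome present / absent in the unexposed group (group 0)
    """
    n1, n0 = d1 + h1, d0 + h0          # group totals
    p1, p0 = d1 / n1, d0 / n0          # risks in each group
    risk_difference = p1 - p0
    risk_ratio = p1 / p0
    odds_ratio = (d1 / h1) / (d0 / h0)
    return risk_difference, risk_ratio, odds_ratio


# TBM trial counts (dexamethasone = group 1, placebo = group 0)
rd, rr, odds_ratio = measures_of_association(d1=87, h1=187, d0=112, h0=159)
print(f"risk difference = {rd:.3f}, risk ratio = {rr:.2f}, odds ratio = {odds_ratio:.2f}")
```

The risk ratio and odds ratio should round to the 0.77 and 0.66 reported for the TBM trial.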

Page 4: Measures of association - Biostatistics

Differences in measures of association

•  When there is no association between exposure and outcome,

–  risk difference = 0

–  risk ratio (RR) = 1

–  odds ratio (OR) = 1

•  Risk difference can be negative or positive

•  RR & OR are always positive

•  For rare outcomes, OR ~ RR

•  OR is always further from 1 than corresponding RR

–  If RR > 1 then OR > RR

–  If RR < 1 then OR < RR
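
A small numerical check of the last two bullets, as a sketch. The risks below are hypothetical values chosen only for illustration, and `rr_and_or` is an assumed helper name.

```python
def rr_and_or(p1, p0):
    """Risk ratio and odds ratio for risks p1 (exposed) and p0 (unexposed)."""
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rr, odds_ratio


# Hypothetical risks: rare outcomes first, then common outcomes
for p1, p0 in [(0.002, 0.001), (0.02, 0.01), (0.4, 0.2), (0.1, 0.2)]:
    rr, odds_ratio = rr_and_or(p1, p0)
    print(f"p1={p1}, p0={p0}: RR={rr:.3f}, OR={odds_ratio:.3f}")
```

With the rare outcomes the two ratios are nearly identical; with the common outcomes the OR moves further from 1 than the RR, in the same direction.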

Page 5: Measures of association - Biostatistics

Interpretation of measures of association

•  RR & OR < 1, associated with a reduced risk / odds (may be protective)

–  RR = 0.8 (reduced risk of 20%)

•  RR & OR > 1, associated with an increased risk / odds

–  RR = 1.2 (increased risk of 20%)

•  RR & OR – the further the ratio is from 1, the stronger the association between exposure and outcome (e.g. RR=2 versus RR=3).

Page 6: Measures of association - Biostatistics

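Comparing the outcome measure of two exposure groups (groups 1 & 0)

Outcome variable – data type: Categorical
Population parameter: Population risk ratio
Estimate of population parameter from sample: p1/p0
Standard error of loge(parameter): $\mathrm{s.e.}(\log_e \mathrm{RR}) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$
95% Confidence interval of loge(population parameter): $\log_e \mathrm{RR} \pm 1.96 \times \mathrm{s.e.}(\log_e \mathrm{RR})$

Outcome variable – data type: Categorical
Population parameter: Population odds ratio
Estimate of population parameter from sample: (d1/h1) / (d0/h0)
Standard error of loge(parameter): $\mathrm{s.e.}(\log_e \mathrm{OR}) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$
95% Confidence interval of loge(population parameter): $\log_e \mathrm{OR} \pm 1.96 \times \mathrm{s.e.}(\log_e \mathrm{OR})$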
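
As a sketch of how these intervals are used in practice: the interval is built on the loge scale and then exponentiated to give limits for the ratio itself. The helper name `ratio_ci` is an assumption, and the counts are the TBM trial counts from the 2×2 table later in this lecture.

```python
import math


def ratio_ci(estimate, se_log, z=1.96):
    """95% CI for a ratio: compute on the log scale, then exponentiate."""
    log_est = math.log(estimate)
    return math.exp(log_est - z * se_log), math.exp(log_est + z * se_log)


# TBM trial counts (group 1 = dexamethasone, group 0 = placebo)
d1, h1, d0, h0 = 87, 187, 112, 159
n1, n0 = d1 + h1, d0 + h0

rr = (d1 / n1) / (d0 / n0)
se_log_rr = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
rr_ci = ratio_ci(rr, se_log_rr)

odds_ratio = (d1 / h1) / (d0 / h0)
se_log_or = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
or_ci = ratio_ci(odds_ratio, se_log_or)

print(f"RR = {rr:.2f}, 95% CI {rr_ci[0]:.2f} to {rr_ci[1]:.2f}")
print(f"OR = {odds_ratio:.2f}, 95% CI {or_ci[0]:.2f} to {or_ci[1]:.2f}")
```

The limits should land close to the intervals reported for the TBM trial; small differences in the last decimal place can arise from rounding or from the exact method used.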

Page 7: Measures of association - Biostatistics

Calculation of p-values for comparing two groups

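Outcome variable – data type: Categorical

Population parameter: π1-π0
Population parameter under null hypothesis: π1-π0=0
Test statistic: $z = \frac{p_1 - p_0}{\mathrm{s.e.}(p_1 - p_0)}$

Population parameter: Population risk ratio
Population parameter under null hypothesis: Population risk ratio=1
Test statistic: $z = \frac{\log_e(\mathrm{RR})}{\mathrm{s.e.}(\log_e(\mathrm{RR}))}$

Population parameter: Population odds ratio
Population parameter under null hypothesis: Population odds ratio=1
Test statistic: $z = \frac{\log_e(\mathrm{OR})}{\mathrm{s.e.}(\log_e(\mathrm{OR}))}$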
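
A minimal sketch of the ratio test statistics and their two-sided p-values, assuming z is referred to the standard normal distribution. The helper names are assumptions; the standard errors are the loge-scale formulas from the previous slide, and the counts are the TBM trial counts used elsewhere in this lecture.

```python
import math


def z_for_ratio(ratio, se_log):
    """Test statistic for H0: ratio = 1, computed on the log scale."""
    return math.log(ratio) / se_log


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# TBM trial counts (group 1 = dexamethasone, group 0 = placebo)
d1, h1, d0, h0 = 87, 187, 112, 159
n1, n0 = d1 + h1, d0 + h0

z_rr = z_for_ratio((d1 / n1) / (d0 / n0),
                   math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0))
z_or = z_for_ratio((d1 / h1) / (d0 / h0),
                   math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0))

print(f"RR: z = {z_rr:.2f}, p = {two_sided_p(z_rr):.3f}")
print(f"OR: z = {z_or:.2f}, p = {two_sided_p(z_or):.3f}")
```

The risk-difference test is left out because its standard error formula is not given on these slides.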

Page 8: Measures of association - Biostatistics

Comparing the outcome measure of two exposure groups (TBM trial: dexamethasone versus placebo)

Outcome variable – data type: Categorical

Population parameter under null hypothesis   Estimate from sample        95% confidence interval   Two-sided p-value
Population risk difference = 0               p1-p0 = -0.095              -0.175, -0.015            0.020
Population risk ratio = 1                    p1/p0 = 0.77                0.62, 0.96                0.016
Population odds ratio = 1                    (d1/h1) / (d0/h0) = 0.66    0.46, 0.93                0.021

Page 9: Measures of association - Biostatistics

2×2 table – TBM trial example

Odds ratio for death = (d1/h1) / (d0/h0) = 0.465 / 0.704 = 0.66

Odds ratio for exposure to dexamethasone = (d1/d0) / (h1/h0) = 0.777 / 1.176 = 0.66

Odds ratio for not dying = (h1/d1) / (h0/d0) = 2.149 / 1.420 = 1.51 = (1/0.66)

Odds ratio for exposure to placebo = (d0/d1) / (h0/h1) = 1.287 / 0.850 = 1.51 = (1/0.66)

Death during 9 months post start of treatment

Treatment group            Yes         No          Total
Dexamethasone (group 1)    87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)          112 (d0)    159 (h0)    271 (n0)
Total                      199         346         545
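
The symmetry of the odds ratio shown on this slide can be checked directly. A short sketch using the counts from the table; the variable names are chosen here for illustration.

```python
# TBM trial counts (group 1 = dexamethasone, group 0 = placebo)
d1, h1, d0, h0 = 87, 187, 112, 159

or_death = (d1 / h1) / (d0 / h0)       # odds ratio for death
or_exposure = (d1 / d0) / (h1 / h0)    # odds ratio for exposure to dexamethasone
or_not_dying = (h1 / d1) / (h0 / d0)   # odds ratio for not dying
cross_product = (d1 * h0) / (d0 * h1)  # same quantity as a cross-product ratio

print(f"{or_death:.2f} {or_exposure:.2f} {cross_product:.2f} {or_not_dying:.2f}")
```

The first three values are the same number, and the last is its reciprocal. This symmetry is why the odds ratio can still be estimated in a case-control study, where sampling is by outcome rather than by exposure.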

Page 10: Measures of association - Biostatistics

Measure of association

Study design                   Risk difference   Risk ratio   Odds ratio
Randomised controlled trial    √                 √            √
Cohort study                   √                 √            √
Case-control study             ×                 ×            √

Page 11: Measures of association - Biostatistics

Lecture 10 – Controlling for confounding: stratification and regression

•  A description of confounding

•  How to control for confounding in statistical analysis by

–  Stratification

–  Regression modelling

•  A brief description of the role of multiple linear or logistic regression in adjusting for confounding

Page 12: Measures of association - Biostatistics

Outcome and exposure variables (RECAP)

•  Outcomes are variables of interest (population health relevance) whose patterns and determinants we wish to learn about from data

•  Exposures are the variables we think might explain observed variation in the outcomes

•  Statistical analysis can be used to quantify the association between outcomes and exposures

Page 13: Measures of association - Biostatistics

What is confounding?

A confounding variable

1) is associated with the outcome variable;

2) is associated with the exposure variable;

3) does not lie on the causal pathway.

[Diagram: the confounding variable is linked to both the exposure variable and the outcome variable]

Failing to control for confounding may result in a biased estimate of the magnitude of the association between exposure and outcome

Page 14: Measures of association - Biostatistics

Example of confounding

Exposure variable: Alcohol intake

Outcome variable: Heart disease

Confounding variable: Cigarette smoking

Page 15: Measures of association - Biostatistics

Control of confounding

Design of Study

•  Randomisation (randomised controlled trial: e.g. TBM trial)

•  Restriction (only include those with one value of confounder)

•  Matching

Page 16: Measures of association - Biostatistics

Control of confounding

Statistical analysis

•  Stratification

•  Regression modelling

Page 17: Measures of association - Biostatistics

Hypothetical example of a case-control study

Association between energy intake and heart disease

Odds of heart disease in high energy intake group = 730/600 = 1.22

Odds of heart disease in low energy intake group = 700/540 = 1.30

Odds ratio = 1.22 / 1.30 = 0.94

95% confidence interval: 0.80 up to 1.10

                  Heart disease
Energy intake     Yes         No          Total
High (group 1)    730 (d1)    600 (h1)    1330 (n1)
Low (group 0)     700 (d0)    540 (h0)    1240 (n0)
Total             1430        1140        2570

Page 18: Measures of association - Biostatistics

Is this association confounded by physical activity?

Exposure variable: Energy intake

Outcome variable: Heart disease

Confounding variable: Physical activity

Page 19: Measures of association - Biostatistics

Stratify by physical activity…

Calculate the stratum-specific odds ratios…

                  High physical activity    Low physical activity
                  Heart disease             Heart disease
Energy intake     Yes       No              Yes       No
High (group 1)    500       510             230       90
Low (group 0)     100       150             600       390

Page 20: Measures of association - Biostatistics

Stratify by physical activity…..

For high physical activity group: OR (95% CI) = 1.47 (1.11, 1.95)

For low physical activity group: OR (95% CI) = 1.66 (1.26, 2.19)

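
The stratum-specific odds ratios, and the crude odds ratio obtained by collapsing the two strata, can be reproduced with a short sketch. The function name `odds_ratio_ci` is an assumption; the confidence interval uses the loge-scale standard error given earlier in the lecture.

```python
import math


def odds_ratio_ci(d1, h1, d0, h0, z=1.96):
    """Odds ratio with a 95% CI computed on the log scale."""
    odds_ratio = (d1 / h1) / (d0 / h0)
    se_log = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# (d1, h1, d0, h0) for high vs low energy intake within each stratum
strata = {
    "high physical activity": (500, 510, 100, 150),
    "low physical activity": (230, 90, 600, 390),
}

# Collapsing over physical activity gives back the crude 2x2 table
crude = tuple(sum(cells) for cells in zip(*strata.values()))

for label, table in [("crude", crude), *strata.items()]:
    odds_ratio, lower, upper = odds_ratio_ci(*table)
    print(f"{label}: OR = {odds_ratio:.2f} ({lower:.2f}, {upper:.2f})")
```

Both stratum-specific odds ratios lie above 1 while the crude odds ratio lies below 1, which is the pattern that motivates checking for confounding.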

Page 21: Measures of association - Biostatistics

Is this association confounded by physical activity?

Exposure variable: Energy intake

Outcome variable: Heart disease

Confounding variable: Physical activity

[Diagram: the links between physical activity and the other two variables are marked "???"]

Page 22: Measures of association - Biostatistics

Confounding – condition 1

Association between physical activity and heart disease

** Look particularly in those who are not exposed to the factor of interest **

For low energy intake group: OR (95% CI) = 0.43 (0.33, 0.58)

For high energy intake group: OR (95% CI) = 0.38 (0.29, 0.50)

                     High energy intake    Low energy intake
                     Heart disease         Heart disease
Physical activity    Yes       No          Yes       No
High (group 1)       500       510         100       150
Low (group 0)        230       90          600       390

Page 23: Measures of association - Biostatistics

Confounding – condition 2

Association between energy intake and physical activity

•  In a case-control study: examine the association in the controls

•  In a cohort study: use the whole cohort

Page 24: Measures of association - Biostatistics

Confounding – condition 2

Association between energy intake and physical activity for those without heart disease (n=1140)

Proportion in high energy intake group who report high physical activity = 510/600 = 0.85 (85%)

Proportion in low energy intake group who report high physical activity = 150/540 = 0.28 (28%)

Odds Ratio = (510/90) / (150/390) = 14.7; 95% CI: 11.0 up to 19.7

                  Physical activity
Energy intake     High      Low       Total
High (group 1)    510       90        600
Low (group 0)     150       390       540

Page 25: Measures of association - Biostatistics

Is this association confounded by physical activity?

Exposure variable: Energy intake

Outcome variable: Heart disease

Confounding variable: Physical activity

–  Associated with heart disease: high energy intake: OR = 0.38 (95% CI: 0.29, 0.50); low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)

–  High energy intake associated with high physical activity

Page 26: Measures of association - Biostatistics

So physical activity is a potential confounder

Control for confounding - Stratified analyses

1) Start with stratum-specific estimates of odds ratios, risk ratios, risk differences, rate ratios

2) Calculate a weighted average of the stratum-specific estimates: the ‘pooled’ estimate

Usual method is Mantel-Haenszel method

–  Weights assigned according to amount of information in each stratum

Page 27: Measures of association - Biostatistics

Calculate a pooled OR

For high physical activity:

OR = 1.47

w = (d0×h1)/n = (100×510)/1260 = 40.5

For low physical activity:

OR = 1.66

w = (d0×h1)/n = (600×90)/1310 = 41.2

                  High physical activity (n=1260)    Low physical activity (n=1310)
                  Heart disease                      Heart disease
Energy intake     Yes         No                     Yes         No
High (group 1)    500 (d1)    510 (h1)               230 (d1)    90 (h1)
Low (group 0)     100 (d0)    150 (h0)               600 (d0)    390 (h0)

Page 28: Measures of association - Biostatistics

Calculate a pooled OR

Mantel-Haenszel estimate of pooled odds ratio:

$\mathrm{OR}_{MH} = \frac{\sum_i (w_i \times \mathrm{OR}_i)}{\sum_i w_i}$, summing over strata i

For high physical activity: OR = 1.47, w = (d0×h1)/n = (100×510)/1260 = 40.5

For low physical activity: OR = 1.66, w = (d0×h1)/n = (600×90)/1310 = 41.2

Page 29: Measures of association - Biostatistics

Calculate a pooled OR

Mantel-Haenszel estimate of pooled odds ratio:

$\mathrm{OR}_{MH} = \frac{(40.5 \times 1.47) + (41.2 \times 1.66)}{40.5 + 41.2} = 1.57$

95% CI: 1.29 up to 1.91

Recall that the crude OR was 0.94 (95% CI 0.80-1.10)

Is there a difference between crude and adjusted measures of effect?
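
The pooled estimate can be reproduced with a short sketch of the weighted average, using the weights w = (d0×h1)/n from the slides. The function name `mantel_haenszel_or` is an assumption, and the confidence interval is not computed here because its formula is not given on the slides.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio.

    strata: iterable of (d1, h1, d0, h0) tables, one per stratum.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for d1, h1, d0, h0 in strata:
        n = d1 + h1 + d0 + h0              # stratum total
        w = (d0 * h1) / n                  # stratum weight
        or_i = (d1 / h1) / (d0 / h0)       # stratum-specific odds ratio
        weighted_sum += w * or_i
        weight_total += w
    return weighted_sum / weight_total


strata = [
    (500, 510, 100, 150),   # high physical activity (n = 1260)
    (230, 90, 600, 390),    # low physical activity (n = 1310)
]
or_mh = mantel_haenszel_or(strata)
print(f"OR_MH = {or_mh:.2f}")
```

Because the odds ratios are not rounded before weighting, the result can differ slightly in later decimals from the hand calculation, but it rounds to the same 1.57.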

Page 30: Measures of association - Biostatistics

Association between energy intake & heart disease adjusting for physical activity

ORMH = 1.57

95% CI: 1.29, 1.91

Exposure variable: Energy intake

Outcome variable: Heart disease

Confounding variable: Physical activity

–  Associated with heart disease: high energy intake: OR = 0.38 (95% CI: 0.29, 0.50); low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)

–  High energy intake associated with high physical activity

Page 31: Measures of association - Biostatistics

Multiple logistic regression

Outcome variable (y-variable) – binary

e.g. dead or alive; treatment failure or success; disease or no disease

Measure of association – Odds ratio

Multiple logistic regression model –

loge(odds of outcome) = β0 + β1X1 + β2X2 + β3X3 + … + βkXk

β1, …, βk – loge(odds ratios)

X1, …, Xk – k different exposure variables (do not need to be binary but can be categorical with more than 2 categories or numerical)

Useful when there are many confounding variables…

Page 32: Measures of association - Biostatistics

Logistic regression

Example – Association between energy intake and heart disease

Outcome variable (y-variable) – heart disease (coded as yes-1 & no-0)

Logistic regression model –

loge(odds of outcome) = β0 + β1X1

β1 – loge(odds ratio)

X1 – energy intake (high versus low)

Exposure                       Odds ratio (exp βi)    95% confidence interval
Energy intake (high vs low)    0.94                   0.80, 1.10
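
With a single binary exposure the fitted model reproduces the 2×2 table exactly, so the coefficients can be written down without iterative fitting: β0 is the loge odds in the low intake group and β1 is the loge of the crude odds ratio. A minimal sketch, with variable names chosen here for illustration:

```python
import math

# Crude case-control table: high (group 1) vs low (group 0) energy intake
d1, h1, d0, h0 = 730, 600, 700, 540

beta0 = math.log(d0 / h0)                       # log odds when X1 = 0
beta1 = math.log((d1 / h1) / (d0 / h0))         # log odds ratio for X1 = 1 vs 0


def fitted_odds(x1):
    """Odds of heart disease predicted by loge(odds) = beta0 + beta1 * x1."""
    return math.exp(beta0 + beta1 * x1)


print(f"exp(beta1) = {math.exp(beta1):.2f}")
print(f"fitted odds: low = {fitted_odds(0):.2f}, high = {fitted_odds(1):.2f}")
```

Exponentiating β1 gives back the crude odds ratio, which is why the table above matches the unadjusted analysis.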

Page 33: Measures of association - Biostatistics

Multiple logistic regression

Example – Association between energy intake and heart disease

Outcome variable (y-variable) – heart disease (coded as yes-1 & no-0)

Multiple logistic regression model –

loge(odds of outcome) = β0 + β1X1 + β2X2

β1, β2 – loge(odds ratios)

X1 – energy intake (high versus low)

X2 – physical activity (high versus low)

Exposure                           Odds ratio (exp βi)    95% confidence interval
Energy intake (high vs low)        1.57                   1.29, 1.91
Physical activity (high vs low)    0.41                   0.33, 0.49
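
To make the adjustment concrete, the two-variable model can be fitted to the stratified counts from this lecture. The sketch below uses a plain Newton-Raphson maximum-likelihood fit written in standard-library Python; this is an assumption for illustration, not the software used for the lecture, and in practice a statistics package would normally be used. The helper names `solve` and `fit_logistic` are hypothetical.

```python
import math

# (energy intake high?, physical activity high?, with heart disease, without)
cells = [(1, 1, 500, 510), (0, 1, 100, 150), (1, 0, 230, 90), (0, 0, 600, 390)]


def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_logistic(cells, iterations=25):
    """Newton-Raphson fit of loge(odds) = b0 + b1*x1 + b2*x2 to grouped data."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0] * 3                      # gradient of the log-likelihood
        info = [[0.0] * 3 for _ in range(3)]   # information matrix
        for x1, x2, cases, controls in cells:
            x = (1.0, x1, x2)
            total = cases + controls
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for j in range(3):
                score[j] += x[j] * (cases - total * p)
                for k in range(3):
                    info[j][k] += x[j] * x[k] * total * p * (1 - p)
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta


beta0, beta1, beta2 = fit_logistic(cells)
print(f"energy intake OR = {math.exp(beta1):.2f}")
print(f"physical activity OR = {math.exp(beta2):.2f}")
```

The exponentiated coefficients should come out close to the adjusted odds ratios in the table above, and the energy intake estimate should be close to the Mantel-Haenszel pooled odds ratio.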

Page 34: Measures of association - Biostatistics

Multiple linear regression

Outcome variable (y-variable) – numerical

e.g. blood pressure, forced expiratory volume in 1 sec (FEV1)

Linear regression model –

y = β0 + β1X1 + β2X2 + β3X3 + … + βkXk

y – numerical outcome variable

β1, …, βk – increase in y for every unit increase in the corresponding X

X1, …, Xk – k different exposure variables (can be numerical or categorical with 2+ categories)

Useful when there are many confounding variables…

Page 35: Measures of association - Biostatistics

Lecture 10 - Objectives

•  Understand confounding

•  Calculate the Mantel-Haenszel estimate of the pooled odds ratio

•  Understand the difference between linear and logistic regression

Page 36: Measures of association - Biostatistics

Thank You

www.HelpWithAssignment.com