The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.


Page 1: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The 2x2 table, RxCxK contingency tables, and pair-matched data

July 27, 2004

Page 2: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Introduction to the 2x2 Table

                   Exposure (E)   No Exposure (~E)   Total
Disease (D)        a              b                  a+b = P(D)
No Disease (~D)    c              d                  c+d = P(~D)
Total              a+c = P(E)     b+d = P(~E)

The row totals give the marginal probability of disease; the column totals give the marginal probability of exposure.

Page 3: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Cohort Studies

[Diagram: from the target population, a disease-free cohort is assembled and split into Exposed and Not Exposed; each group is followed over TIME and ends up either with Disease or Disease-free.]

Page 4: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The Risk Ratio, or Relative Risk (RR)

                   Exposure (E)   No Exposure (~E)
Disease (D)        a              b
No Disease (~D)    c              d
Total              a+c            b+d

$$RR = \frac{P(D \mid E)}{P(D \mid \sim E)} = \frac{a/(a+c)}{b/(b+d)}$$

The numerator is the risk to the exposed; the denominator is the risk to the unexposed.

Page 5: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Hypothetical Data

                           High Systolic BP   Normal BP
Congestive Heart Failure   400                400
No CHF                     1100               2600
Total                      1500               3000

$$RR = \frac{400/1500}{400/3000} = 2.0$$
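A minimal sketch of the risk ratio calculation for this table. The function name `risk_ratio` is illustrative (it does not come from the slides); the arguments follow the a, b, c, d cell layout of the 2x2 table, with exposure in the columns.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio for a 2x2 table with exposure in the columns.

    a = diseased & exposed,     b = diseased & unexposed
    c = not diseased & exposed, d = not diseased & unexposed
    """
    risk_exposed = a / (a + c)      # P(D | E)
    risk_unexposed = b / (b + d)    # P(D | ~E)
    return risk_exposed / risk_unexposed


# Hypothetical CHF data: exposed = high systolic BP, unexposed = normal BP.
rr = risk_ratio(a=400, b=400, c=1100, d=2600)
print(round(rr, 2))
```

The call reproduces the hand calculation (400/1500)/(400/3000).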

Page 6: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Case-Control Studies

Sample on disease status and ask retrospectively about exposures (for rare diseases).

Marginal probabilities of exposure for cases and controls are valid.

• Doesn’t require knowledge of the absolute risks of disease
• For rare diseases, can approximate relative risk

Page 7: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Case-Control Studies

[Diagram: from the target population, sample those with Disease (Cases) and those with No Disease (Controls); each group is then classified retrospectively as Exposed in past or Not exposed.]

Page 8: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The Odds Ratio (OR)

                   Exposure (E)      No Exposure (~E)
Disease (D)        a = P(D & E)      b = P(D & ~E)
No Disease (~D)    c = P(~D & E)     d = P(~D & ~E)

$$OR = \frac{a/c}{b/d} = \frac{ad}{bc}$$

Page 9: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

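The Odds Ratio

Via Bayes’ Rule:

$$OR = \frac{P(E \mid D)/P(\sim E \mid D)}{P(E \mid \sim D)/P(\sim E \mid \sim D)} = \frac{P(D \& E)/P(D \& \sim E)}{P(\sim D \& E)/P(\sim D \& \sim E)} = \frac{P(D \mid E)/P(\sim D \mid E)}{P(D \mid \sim E)/P(\sim D \mid \sim E)}$$

When disease is rare, P(~D | E) ≈ 1 and P(~D | ~E) ≈ 1 (“The Rare Disease Assumption”), so

$$OR \approx \frac{P(D \mid E)}{P(D \mid \sim E)} = RR$$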
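A small numeric sketch of the rare disease assumption. The "common" table is the hypothetical CHF data; the "rare" counts are made up here for illustration (same exposure margins, far fewer diseased subjects), and both function names are illustrative.

```python
def risk_ratio(a, b, c, d):
    """RR with a = D&E, b = D&~E, c = ~D&E, d = ~D&~E."""
    return (a / (a + c)) / (b / (b + d))


def odds_ratio(a, b, c, d):
    """OR = ad / bc for the same cell layout."""
    return (a * d) / (b * c)


# Common outcome (hypothetical CHF table): the OR overstates the RR.
common = dict(a=400, b=400, c=1100, d=2600)
# Rare outcome (made-up counts for illustration): the OR is close to the RR.
rare = dict(a=4, b=4, c=1496, d=2996)

for label, table in (("common", common), ("rare", rare)):
    print(label, round(risk_ratio(**table), 3), round(odds_ratio(**table), 3))
```

In both tables the RR is 2.0; only in the rare-outcome table does the OR land near it.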

Page 10: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Properties of the OR (simulation)

[Figure: histogram of simulated odds ratios. x-axis: Simulated Odds Ratio (0 to 3.5); y-axis: Percent (0 to 6).]

Page 11: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Properties of the lnOR

[Figure: histogram of simulated lnOR values. x-axis: lnOR (-1.05 to 1.95); y-axis: Percent (0 to 10).]

$$\text{Standard deviation} = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
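The slides do not state how the simulation was set up, so the following is only a sketch under assumed settings (50 cases, 50 controls, exposure prevalence 0.5 in both groups, so the true OR is 1). It compares the simulated spread of lnOR with the standard deviation formula; all function names are illustrative.

```python
import math
import random
import statistics


def se_ln_or(a, b, c, d):
    """Large-sample standard deviation of ln(OR) for cell counts a, b, c, d."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def simulate_ln_or(n_cases, n_controls, p_cases, p_controls, reps, seed=0):
    """Draw `reps` case-control samples and return the lnOR of each.

    Samples with an empty cell are skipped because the OR is undefined there.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(reps):
        a = sum(rng.random() < p_cases for _ in range(n_cases))        # exposed cases
        c = sum(rng.random() < p_controls for _ in range(n_controls))  # exposed controls
        b, d = n_cases - a, n_controls - c                             # unexposed
        if min(a, b, c, d) > 0:
            draws.append(math.log((a * d) / (b * c)))
    return draws


# Assumed scenario: true OR = 1, expected cell counts of 25 each.
draws = simulate_ln_or(50, 50, 0.5, 0.5, reps=2000, seed=1)
print(round(statistics.stdev(draws), 2), round(se_ln_or(25, 25, 25, 25), 2))
```

The lnOR draws are roughly symmetric around 0, which is why tests and confidence intervals are built on the log scale rather than on the skewed OR scale.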

Page 12: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

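Hypothetical Data

                  Smoker   Non-smoker   Total
Lung Cancer       20       10           30
No lung cancer    6        24           30

$$OR = \frac{(20)(24)}{(6)(10)} = 8.0$$

$$95\%\ CI = \left(8.0\,e^{-1.96\sqrt{\frac{1}{20}+\frac{1}{6}+\frac{1}{10}+\frac{1}{24}}},\ 8.0\,e^{+1.96\sqrt{\frac{1}{20}+\frac{1}{6}+\frac{1}{10}+\frac{1}{24}}}\right) = (2.47,\ 25.8)$$

Note that the smallest 2x2 cell dominates the variance, because its reciprocal is the largest term under the square root.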
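A minimal sketch of the odds ratio and its 95% confidence interval for the smoking table, computed on the log scale and exponentiated back. The function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower, upper) using the large-sample SE of ln(OR)."""
    or_hat = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return or_hat, or_hat * math.exp(-z * se), or_hat * math.exp(z * se)


# a = lung cancer & smoker, b = lung cancer & non-smoker,
# c = no lung cancer & smoker, d = no lung cancer & non-smoker.
or_hat, lower, upper = odds_ratio_ci(20, 10, 6, 24)
print(round(or_hat, 1), round(lower, 2), round(upper, 1))
```

The interval is symmetric around ln(8.0) but not around 8.0 itself.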

Page 13: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Example: Cell phones and brain tumors (cross-sectional data)

                         Brain tumor   No brain tumor   Total
Own a cell phone         5             347              352
Don’t own a cell phone   3             88               91
Total                    8             435              443

$$\hat{p}_{\text{tumor/cell phone}} = \frac{5}{352} = .014; \quad \hat{p}_{\text{tumor/no phone}} = \frac{3}{91} = .033; \quad \hat{p} = \frac{8}{443} = .018$$

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\frac{\hat{p}(1-\hat{p})}{n_1} + \frac{\hat{p}(1-\hat{p})}{n_2}}} = \frac{.014 - .033}{\sqrt{\frac{(.018)(.982)}{352} + \frac{(.018)(.982)}{91}}} = \frac{-.019}{.0156} = -1.22$$
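A minimal sketch of the pooled two-proportion Z test for the cell phone data. Function names are illustrative. Because the code carries full precision rather than three-decimal proportions, it gives about -1.20 where the hand calculation gives -1.22.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


z = two_proportion_z(5, 352, 3, 91)   # tumors among owners vs. non-owners
print(round(z, 2), round(two_sided_p(z), 2))
```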

Page 14: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

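Same data, but use Chi-square test or Fisher’s exact

             Brain tumor   No brain tumor   Total
Own          5             347              352
Don’t own    3             88               91
Total        8             435              443

$$p_{\text{tumor}} = \frac{8}{443} = .018; \quad p_{\text{cell phone}} = \frac{352}{443} = .795; \quad p_{\text{tumor}} \times p_{\text{cell phone}} = .01435$$

Expected in cell a = .01435 × 443 = 6.36; 345.64 in cell b; 1.64 in cell c; 89.36 in cell d.

$$df = (R-1)(C-1) = 1 \times 1 = 1$$

$$\chi^2_1 = \frac{(5-6.36)^2}{6.36} + \frac{(3-1.64)^2}{1.64} + \frac{(347-345.64)^2}{345.64} + \frac{(88-89.36)^2}{89.36} = 1.44 \quad (NS)$$

Note: this is the square of the two-proportion Z statistic; the hand-rounded value, $(-1.22)^2 \approx 1.49$, differs only through rounding of the proportions.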
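A minimal sketch of the Pearson chi-square statistic for the same table, building the expected counts from the margins. The function name `chi_square` is illustrative; the p-value line uses the fact that a chi-square with 1 df is a squared standard normal.

```python
import math


def chi_square(table):
    """Pearson chi-square for an R x C table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


cell_phone = [[5, 347],   # own a cell phone: tumor, no tumor
              [3, 88]]    # don't own:        tumor, no tumor
stat = chi_square(cell_phone)
p_value = math.erfc(math.sqrt(stat / 2))   # valid for 1 df only
print(round(stat, 2), round(p_value, 2))
```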

Page 15: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

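Same data, but use Odds Ratio

                         Brain tumor   No brain tumor   Total
Own a cell phone         5             347              352
Don’t own a cell phone   3             88               91
Total                    8             435              443

$$OR = \frac{5 \times 88}{3 \times 347} = .423$$

$$Z = \frac{\ln OR - 0}{\sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}}} = \frac{\ln(.423)}{\sqrt{\frac{1}{5}+\frac{1}{347}+\frac{1}{3}+\frac{1}{88}}} = \frac{-.86}{.74} = -1.16; \quad p > .05$$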
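A minimal sketch of the Z test on the log odds ratio for the cell phone table. The function name `ln_or_z` is illustrative.

```python
import math


def ln_or_z(a, b, c, d):
    """Z statistic for H0: ln(OR) = 0, with OR = ad / bc."""
    ln_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ln_or / se


# a = own & tumor, b = own & no tumor, c = don't own & tumor, d = don't own & no tumor
z = ln_or_z(5, 347, 3, 88)
print(round(z, 2))
```

All three approaches (difference in proportions, chi-square, odds ratio) reach the same conclusion on these data: no significant association.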

Page 16: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Adding a Third Dimension to the RxC picture

[Diagram: Exposure → Disease?, with a third variable that may act as a Mediator, a Confounder, or an Effect modifier.]

Page 17: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Confounding

A confounding variable is associated with the exposure and it affects the outcome, but it is not an intermediate link in the chain of causation between exposure and outcome.

Page 18: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Examples of Confounding

[Diagram 1 nodes: Poor nutrition?, Menstrual irregularity, Low weight.]

[Diagram 2 nodes: Menstrual irregularity?, Low bone strength, Late menarche.]

Page 19: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Controlling for confounders in medical studies

1. Confounders can be controlled for in the design phase of a study (randomization or restriction or matching).

2. Confounders can be controlled for in the analysis phase of a study (stratification or multivariate regression).

Page 20: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Analytical identification of confounders through stratification

Page 21: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Mantel-Haenszel Procedure:

Non-regression technique used to identify confounders and to control for confounding in the statistical analysis phase rather than the design phase of a study.

Page 22: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Stratification: “Series of 2x2 tables”

Idea: Take a 2x2 table and break it into a series of smaller 2x2 tables (one table at each of J levels of the confounder yields J tables).

Example: in testing for an association between lung cancer and alcohol drinking (yes/no), separate smokers and non-smokers.

Page 23: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

“It is more informative to estimate the strength of association than simply to test a hypothesis about it.”

“When the association seems stable across partial tables, we can estimate an assumed common value of the k true odds ratios.”

Page 24: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Controlling for confounding by stratification

Example: Gender Bias at Berkeley? (From: Sex Bias in Graduate Admissions: Data from Berkeley, Science 187: 398-403; 1975.)

            Female   Male
Denied      1276     1486
Admitted    559      1195
Total       1835     2681

Crude RR = (1276/1835)/(1486/2681) = 1.25 (1.20 – 1.32)

Page 25: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program A

Stratum 1 = only those who applied to program A

            Female   Male
Denied      19       314
Admitted    89       511
Total       108      825

Stratum-specific RR = .90 (.87-.94)

Page 26: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program B

Stratum 2 = only those who applied to program B

            Female   Male
Denied      8        208
Admitted    17       352
Total       25       560

Stratum-specific RR = .99 (.96-1.03)

Page 27: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program C

Stratum 3 = only those who applied to program C

            Female   Male
Denied      391      205
Admitted    202      120
Total       593      325

Stratum-specific RR = 1.08 (.91-1.30)

Page 28: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program D

Stratum 4 = only those who applied to program D

            Female   Male
Denied      248      265
Admitted    127      142
Total       375      407

Stratum-specific RR = 1.02 (.89-1.18)

Page 29: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program E

Stratum 5 = only those who applied to program E

            Female   Male
Denied      289      147
Admitted    104      44
Total       393      191

Stratum-specific RR = .88 (.67-1.17)

Page 30: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Program F

Stratum 6 = only those who applied to program F

            Female   Male
Denied      321      347
Admitted    20       26
Total       341      373

Stratum-specific RR = 1.09 (.84-1.42)

Page 31: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Summary

Crude RR = 1.25 (1.20 – 1.32)

Stratum-specific RRs:
.90 (.87-.94)
.99 (.96-1.03)
1.08 (.91-1.30)
1.02 (.89-1.18)
.88 (.67-1.17)
1.09 (.84-1.42)

Mantel-Haenszel Summary RR: .97

Cochran-Mantel-Haenszel Test is NS. Gender and denial of admissions are conditionally independent given program. The apparent association (RR=1.25) was due to confounding.

Page 32: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The Mantel-Haenszel Summary Risk Ratio

For each of the k strata:

               Exposure   Not Exposed
Disease        a          c
Not Disease    b          d

$$RR_{MH} = \frac{\sum_{i=1}^{k} a_i(c_i+d_i)/T_i}{\sum_{i=1}^{k} c_i(a_i+b_i)/T_i}$$

where T_i is the total number of subjects in stratum i.

Page 33: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

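E.g., for Berkeley…

$$RR_{MH} = \frac{\frac{19(825)}{933} + \frac{8(560)}{585} + \frac{391(325)}{918} + \frac{248(407)}{782} + \frac{289(191)}{584} + \frac{321(373)}{714}}{\frac{314(108)}{933} + \frac{208(25)}{585} + \frac{205(593)}{918} + \frac{265(375)}{782} + \frac{147(393)}{584} + \frac{347(341)}{714}} = .97$$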
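A minimal sketch of the Mantel-Haenszel summary risk ratio applied to the six Berkeley program tables, alongside the crude RR from the collapsed table. The function name `mh_risk_ratio` and the tuple layout are illustrative choices, not from the slides.

```python
def mh_risk_ratio(strata):
    """Mantel-Haenszel summary RR.

    Each stratum is (a, b, c, d) with
    a = exposed & diseased,   b = exposed & not diseased,
    c = unexposed & diseased, d = unexposed & not diseased.
    """
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        numerator += a * (c + d) / total
        denominator += c * (a + b) / total
    return numerator / denominator


# (female denied, female admitted, male denied, male admitted), programs A-F
berkeley = [
    (19, 89, 314, 511),
    (8, 17, 208, 352),
    (391, 202, 205, 120),
    (248, 127, 265, 142),
    (289, 104, 147, 44),
    (321, 20, 347, 26),
]

# Crude RR ignores program: collapse the strata into a single table.
a, b, c, d = (sum(cells) for cells in zip(*berkeley))
crude_rr = (a / (a + b)) / (c / (c + d))

print(round(crude_rr, 2), round(mh_risk_ratio(berkeley), 2))
```

With a single stratum the summary RR reduces to the ordinary RR, which is a quick sanity check on the formula.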

Page 34: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The Mantel-Haenszel Summary Odds Ratio

For each of the k strata:

               Case   Control
Exposed        a      b
Not Exposed    c      d

$$OR_{MH} = \frac{\sum_{i=1}^{k} a_i d_i / T_i}{\sum_{i=1}^{k} b_i c_i / T_i}$$

Page 35: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

 

 

Example

Country         Exposure                Lung Cancer   Control
Japan           Spouse smokes           73            188        OR = 1.52
                Spouse does not smoke   21            82
Great Britain   Spouse smokes           19            38         OR = 1.6
                Spouse does not smoke   5             16
US              Spouse smokes           137           363        OR = 1.32
                Spouse does not smoke   71            249

Source: Blot and Fraumeni, J. Nat. Cancer Inst., 77: 993-1000 (1986).

Page 36: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

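Summary OR

$$OR_{MH} = \frac{\frac{73 \times 82}{364} + \frac{19 \times 16}{78} + \frac{137 \times 249}{820}}{\frac{188 \times 21}{364} + \frac{38 \times 5}{78} + \frac{363 \times 71}{820}} = 1.38$$

Not Surprising!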
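A minimal sketch of the Mantel-Haenszel summary odds ratio for the three country tables. The function name `mh_odds_ratio` and the dictionary layout are illustrative.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel summary OR.

    Each stratum is (a, b, c, d) with
    a = exposed case,   b = exposed control,
    c = unexposed case, d = unexposed control.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Exposure = spouse smokes; case = lung cancer.
passive_smoking = {
    "Japan": (73, 188, 21, 82),
    "Great Britain": (19, 38, 5, 16),
    "US": (137, 363, 71, 249),
}

for country, (a, b, c, d) in passive_smoking.items():
    print(country, round((a * d) / (b * c), 2))      # stratum-specific ORs
summary_or = mh_odds_ratio(passive_smoking.values())
print("summary", summary_or)
```

The summary value falls between the stratum-specific odds ratios, weighted toward the larger strata.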

Page 37: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

MH assumptions

OR or RR doesn’t vary across strata. (Homogeneity!)

If exposure/disease association does vary for different subgroups, then the summary OR or RR is not appropriate…

Page 38: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Advantages and limitations

Advantages…
• Mantel-Haenszel summary statistic is easy to interpret and calculate
• Gives you a hands-on feel for the data

Disadvantages…
• Requires categorical confounders or continuous confounders that have been divided into intervals
• Cumbersome if more than a single confounder

To control for more than one confounder and/or continuous confounders, a multivariate technique (such as logistic regression) is preferable.

Page 39: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Analysis of matched data

Page 40: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Pair Matching: Why match?

Pairing can control for extraneous sources of variability and increase the power of a statistical test.

Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

Page 41: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Example

Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:

               Tonsillectomy   None
Hodgkin’s      41              44
Sib control    33              52

From John A. Rice, “Mathematical Statistics and Data Analysis.”

OR=1.47; chi-square=1.53 (NS)

Page 42: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Example

But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data like this:

                        Control: None   Control: Tonsillectomy
Case: None              37              7
Case: Tonsillectomy     15              26

From John A. Rice, “Mathematical Statistics and Data Analysis.”

OR=2.14; chi-square=2.91 (p=.09)

Page 43: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Pair Matching

Match each MI case to an MI control based on age and gender.

Ask about history of diabetes to find out if diabetes increases your risk for MI.

Page 44: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Pair Matching

                         MI controls
                         Diabetes   No Diabetes   Total
MI cases   Diabetes      9          37            46
           No diabetes   16         82            98
           Total         25         119           144

Page 45: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Each pair is its own “age-gender” stratum

              Case (MI)   Control
Diabetes      1           1
No diabetes   0           0

Example: Concordant for exposure (cell “a” from before)

Page 46: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

              Case (MI)   Control
Diabetes      1           1
No diabetes   0           0
x 9

              Case (MI)   Control
Diabetes      1           0
No diabetes   0           1
x 37

              Case (MI)   Control
Diabetes      0           1
No diabetes   1           0
x 16

              Case (MI)   Control
Diabetes      0           0
No diabetes   1           1
x 82

Page 47: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Mantel-Haenszel for pair-matched data

We want to know the relationship between diabetes and MI controlling for age and gender.

Mantel-Haenszel methods apply.

Page 48: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

RECALL: The Mantel-Haenszel Summary Odds Ratio

               Case   Control
Exposed        a      b
Not Exposed    c      d

$$OR_{MH} = \frac{\sum_{i=1}^{k} a_i d_i / T_i}{\sum_{i=1}^{k} b_i c_i / T_i}$$

Page 49: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

              Case (MI)   Control
Diabetes      1           1
No diabetes   0           0
ad/T = 0; bc/T = 0

              Case (MI)   Control
Diabetes      1           0
No diabetes   0           1
ad/T = 1/2; bc/T = 0

              Case (MI)   Control
Diabetes      0           1
No diabetes   1           0
ad/T = 0; bc/T = 1/2

              Case (MI)   Control
Diabetes      0           0
No diabetes   1           1
ad/T = 0; bc/T = 0

Page 50: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Mantel-Haenszel Summary OR

$$OR_{MH} = \frac{\sum_{i=1}^{144} a_i d_i / 2}{\sum_{i=1}^{144} b_i c_i / 2} = \frac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \frac{37}{16}$$

Page 51: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

                         MI controls
                         Diabetes   No Diabetes   Total
MI cases   Diabetes      9          37            46
           No diabetes   16         82            98
           Total         25         119           144

OR estimate comes only from discordant pairs!

OR = 37/16 = 2.31

Makes Sense!
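A minimal sketch showing that treating each of the 144 matched pairs as its own stratum and applying the Mantel-Haenszel formula gives exactly the ratio of discordant pairs. The function name and the pair encoding are illustrative.

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel summary OR over strata (a, b, c, d).

    a = exposed case,   b = exposed control,
    c = unexposed case, d = unexposed control.
    """
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# One 2x2 table per pair (T = 2), repeated by how often each pair type occurs.
pair_types = [
    ((1, 1, 0, 0), 9),    # both diabetic
    ((1, 0, 0, 1), 37),   # only the case is diabetic
    ((0, 1, 1, 0), 16),   # only the control is diabetic
    ((0, 0, 1, 1), 82),   # neither is diabetic
]
strata = [table for table, count in pair_types for _ in range(count)]

print(len(strata), mh_odds_ratio(strata))
```

The concordant pairs contribute zero to both sums, so changing the counts 9 and 82 leaves the estimate unchanged.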

Page 52: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

McNemar’s Test

                         MI controls
                         Diabetes   No Diabetes   Total
MI cases   Diabetes      9          37            46
           No diabetes   16         82            98
           Total         25         119           144

OR estimate comes only from discordant pairs!

The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.

Page 53: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

                         MI controls
                         Diabetes   No Diabetes   Total
MI cases   Diabetes      9          37            46
           No diabetes   16         82            98
           Total         25         119           144

$$P(\text{“favors” case} \mid \text{discordant pair}) = \hat{p} = \frac{b}{b+c} = \frac{37}{37+16} = \frac{37}{53}$$

Page 54: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

                         MI controls
                         Diabetes   No Diabetes   Total
MI cases   Diabetes      9          37            46
           No diabetes   16         82            98
           Total         25         119           144

$$\text{odds}(\text{“favors” case} \mid \text{discordant pair}) = OR = \frac{b}{c} = \frac{37}{16}$$

Page 55: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

McNemar’s Test

                         MI controls
                         Diabetes   No Diabetes
MI cases   Diabetes      9          37
           No diabetes   16         82

Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b=cell c)

$$p\text{-value} = \binom{53}{37}(.5)^{37}(.5)^{16} + \binom{53}{38}(.5)^{38}(.5)^{15} + \binom{53}{39}(.5)^{39}(.5)^{14} + \ldots$$

By normal approximation to binomial:

$$Z = \frac{37 - \frac{53}{2}}{\sqrt{53(.5)(.5)}} = \frac{10.5}{3.64} = 2.88; \quad p < .01$$
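A minimal sketch of both versions of McNemar's test for the diabetes/MI pairs: the exact binomial tail over the 53 discordant pairs, and the normal approximation. Function names are illustrative; the exact function returns the upper tail only, matching the sum that starts at 37.

```python
import math


def mcnemar_exact_upper_tail(b, c):
    """P(X >= b) for X ~ Binomial(b + c, 0.5): exact one-sided p-value."""
    n = b + c
    return sum(math.comb(n, k) for k in range(b, n + 1)) / 2 ** n


def mcnemar_z(b, c):
    """Normal approximation: (b - (b + c)/2) / sqrt((b + c)(.5)(.5))."""
    n = b + c
    return (b - n / 2) / math.sqrt(n * 0.5 * 0.5)


b, c = 37, 16   # discordant pairs favoring the case / the control
print(mcnemar_exact_upper_tail(b, c), round(mcnemar_z(b, c), 2))
```

Doubling the upper tail gives a two-sided exact p-value when the null distribution is symmetric, as it is here with p = .5.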

Page 56: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

McNemar’s Test: generally

                    controls
                    exp   No exp
cases   exp         a     b
        No exp      c     d

By normal approximation to binomial:

$$Z = \frac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \frac{\frac{b-c}{2}}{\frac{\sqrt{b+c}}{2}} = \frac{b-c}{\sqrt{b+c}}$$

Equivalently:

$$\chi^2_1 = \frac{(b-c)^2}{b+c}$$

Page 57: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.

Example: Salmonella Outbreak in France, 1996

Page 58: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.
Page 59: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Epidemic Curve

Page 60: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Matched Case Control Study

Case = Salmonella gastroenteritis.

Community controls (1:1) matched for:
• age group (< 1, 1-4, 5-14, 15-34, 35-44, 45-54, 55-64, or >= 65 years)
• gender
• city of residence

Page 61: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Results

Page 62: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

In 2x2 table form: any goat’s cheese

                          Controls
                          Goat’s cheese   None   Total
Cases   Goat’s cheese     23              23     46
        None              6               7      13
        Total             29              30     59

$$OR = \frac{b}{c} = \frac{23}{6} = 3.8$$

Page 63: The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

In 2x2 table form: Brand B Goat’s cheese

                            Controls
                            Goat’s cheese B   None   Total
Cases   Goat’s cheese B     8                 24     32
        None                2                 25     27
        Total               10                49     59

$$OR = \frac{b}{c} = \frac{24}{2} = 12.0$$
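A minimal sketch applying the matched-pair odds ratio and the McNemar chi-square formula to the two goat's cheese tables. The chi-square values are computed here for illustration and are not reported on the slides; the function name is illustrative.

```python
def matched_pair_summary(b, c):
    """Return (OR, McNemar chi-square) from the discordant pair counts.

    b = pairs where only the case was exposed,
    c = pairs where only the control was exposed.
    """
    odds_ratio = b / c
    chi_square = (b - c) ** 2 / (b + c)
    return odds_ratio, chi_square


any_cheese = matched_pair_summary(b=23, c=6)
brand_b = matched_pair_summary(b=24, c=2)
print([round(x, 2) for x in any_cheese], [round(x, 2) for x in brand_b])
```

With only 2 pairs in cell c for Brand B, the normal approximation is rough, so an exact binomial test on the discordant pairs may be the safer choice there.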