The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

Post on 23-Dec-2015

217 views 0 download

Transcript of The 2x2 table, RxCxK contingency tables, and pair-matched data July 27, 2004.

The 2x2 table,The 2x2 table,RxCxK contingency tables,RxCxK contingency tables,

and pair-matched dataand pair-matched data

July 27, 2004July 27, 2004

Introduction to the 2x2 TableIntroduction to the 2x2 Table

  Exposure (E) No Exposure (~E)

 

Disease (D) a b a+b = P(D)

No Disease (~D) c d c+d = P(~D)

  a+c = P(E) b+d = P(~E)

Marginal probability of disease

Marginal probability of exposure

Cohort StudiesCohort Studies

Target population

Exposed

Not Exposed

Disease-free cohort

Disease

Disease-free

Disease

Disease-free

TIME

  Exposure (E) No Exposure (~E)

 

Disease (D) a b

No Disease (~D) c d

  a+c b+d

)/()/(

)~/(

)/(

dbbcaa

EDP

EDPRR

risk to the exposed

risk to the unexposed

The Risk Ratio, or Relative Risk (RR)

400 400

1100 2600

0.23000/4001500/400 RR

Hypothetical DataHypothetical Data

  Normal BP

Congestive Heart Failure

No CHF

1500 3000

High Systolic BP

Case-Control StudiesCase-Control Studies

Sample on disease status and ask retrospectively about exposures (for rare diseases) Marginal probabilities of exposure for cases and

controls are valid.

• Doesn’t require knowledge of the absolute risks of disease

• For rare diseases, can approximate relative risk

Target population

Exposed in past

Not exposed

Exposed

Not Exposed

Case-Control StudiesCase-Control Studies

Disease

(Cases)

No Disease

(Controls)

bc

adOR

db

ca

  Exposure (E) No Exposure (~E)

 

Disease (D) a = P (D& E) b = P(D& ~E)

No Disease (~D) c = P (~D&E) d = P (~D&~E)

 

The Odds Ratio (OR)

The Odds RatioThe Odds Ratio

RR

OR

EDPEDP

EDPEDP

EDPEDP

EDPEDP

EDPEDP

DEPDEP

DEPDEP

)~/()/(

)~/(~)~/(

)/(~)/(

)~&(~)&(~

)~&()&(

)~/(~)~/(

)/(~)/(

When disease is rare: P(~D) 1

“The Rare Disease Assumption”

Via Bayes’ Rule

1

1

0 0.35 0.7 1.05 1.4 1.75 2.1 2.45 2.8 3.15 3.5 0

1

2

3

4

5

6

P e r c e n t

Simulated Odds Ratio

Properties of the OR (simulation)

Properties of the lnOR

-1.05 -0.75 -0.45 -0.15 0.15 0.45 0.75 1.05 1.35 1.65 1.95 0

2

4

6

8

10

P e r c e n t

lnOR

Standard deviation =

dcba

1111

Standard deviation =

Hypothetical DataHypothetical Data

0.8)10)(6(

)24)(20(OR

25.8) - (2.47(8.0)e ,(8.0)e CI %95 24

1

10

1

6

1

20

196.1

24

1

10

1

6

1

20

196.1

  Smoker Non-smoker  

Lung Cancer 20 10

No lung cancer 6 24

 

30

30

Note that the size of the smallest 2x2 cell determines the magnitude of the variance

Example: Cell phones and brain Example: Cell phones and brain tumors (cross-sectional data)tumors (cross-sectional data)

22.10156.

019.

91

)982)(.018(.

352

)982)(.018(.

)033.014(.

018.453

8;

)1)(()1)((

0)ˆˆ(

033.91

3;014.

352

5

21

21

//

Z

p

n

pp

n

pp

ppZ

pp nophonetumorcellphonetumor

  Brain tumor No brain tumor  

Own a cell phone

5 347 352

Don’t own a cell phone

3 88 91

  8 435 453

Same data, but use Chi-square testSame data, but use Chi-square testor Fischer’s exactor Fischer’s exact

48.122.1:note

48.17.345

345.7)-(347

3.89

88)-(89.3

7.1

1.7)-(3

3.6

6.3)-(8

df 11111

d cellin 89.3 b; cellin 345.7

c; cellin 1.7 6.3;453*.014 a cellin Expected

014.777.*018.

777.453

352;018.

453

8

22

2222

12

Z

NS

*))*(C-(R-

xpp

pp

cellphonetumor

cellphonetumor

  Brain tumor No brain tumor  

Own 5 347 352

Don’t own 3 88 91

  8 435 453

Same data, but use Odds RatioSame data, but use Odds Ratio

  Brain tumor No brain tumor  

Own a cell phone

5 347 352

Don’t own a cell phone

3 88 91

  8 435 453

05.;16.174.

86.

88

1

3

1

347

1

5

1

)423ln(.

1111

0-lnORZ

423.347*3

88*5OR

p

dcba

Adding a Third Dimension to Adding a Third Dimension to the the RxCRxC picture picture

Exposure Disease?

Mediator

Confounder

Effect modifier

ConfoundingConfounding

A confounding variable is associated with the exposure and it affects the outcome, but it is not an intermediate link in the chain of causation between exposure and outcome.

Examples of ConfoundingExamples of Confounding

Poor nutrition?

Menstrual irregularity

Low weight

Menstrual irregularity?

Low bone strength

Late menarche

Controlling for confounders in Controlling for confounders in medical studiesmedical studies

1. Confounders can be controlled for in the design phase of a study (randomization or restriction or matching).

2. Confounders can be controlled for in the analysis phase of a study (stratification or multivariate regression).

Analytical identification of Analytical identification of confounders through confounders through

stratification stratification

Mantel-Haenszel Procedure:Mantel-Haenszel Procedure:Non-regression technique used to Non-regression technique used to

identify confounders and to control identify confounders and to control for confounding in the for confounding in the statistical statistical

analysisanalysis phase rather than the phase rather than the designdesign phase of a study.phase of a study.

Stratification: “Series of 2x2 Stratification: “Series of 2x2 tables”tables”

Idea: Take a 2x2 table and break it into a series of smaller 2x2 tables (one table at each of J levels of the confounder yields J tables).

Example: in testing for an association between lung cancer and alcohol drinking (yes/no), separate smokers and non-smokers.

“It is more informative to estimate the strength of association than simply to test a hypothesis about it.”

“When the association seems stable across partial tables, we can estimate an assumed common value of the k true odds ratios.”

Controlling for confounding by Controlling for confounding by stratificationstratification

Example: Gender Bias at Berkeley?(From: Sex Bias in Graduate Admissions: Data from Berkeley, Science 187: 398-403; 1975.)

 

 

Crude RR = (1276/1835)/(1486/2681) =1.25

(1.20 – 1.32)

Denied

Admitted

1835 2681

Female Male

1276 1486

559 1195

Program AProgram A

Stratum 1 = only those who applied to program A

 

 

Stratum-specific RR = .90 (.87-.94)

Denied

Admitted

108 825

Female Male

19 314

89 511

Program BProgram B

Stratum 2 = only those who applied to program B

 

 

Stratum-specific RR = .99 (.96-1.03)

Denied

Admitted

25 560

Female Male

8 208

17 352

Program CProgram C

Stratum 3 = only those who applied to program C

 

 

Stratum-specific RR = 1.08 (.91-1.30)

Denied

Admitted

593 325

Female Male

391 205

202 120

Program DProgram D

Stratum 4 = only those who applied to program D

 

 

Stratum-specific RR = 1.02 (.89-1.18)

Denied

Admitted

375 407

Female Male

248 265

127 142

Program EProgram E

Stratum 5 = only those who applied to program E

 

 

Stratum-specific RR = .88 (.67-1.17)

Denied

Admitted

393 191

Female Male

289 147

104 44

Program FProgram F

Stratum 6 = only those who applied to program F

 

 

Stratum-specific RR = 1.09 (.84-1.42)

Denied

Admitted

341 373

Female Male

321 347

20 26

SummarySummary

 

 

Crude RR = 1.25 (1.20 – 1.32)

Stratum specific RR’s:.90 (.87-.94)

.99 (.96-1.03)1.08 (.91-1.30) 1.02 (.89-1.18).88 (.67-1.17)

1.09 (.84-1.42)

Maentel-Haenszel Summary RR: .97

Cochran-Mantel-Haenszel Test is NS. Gender and denial of admissions are conditionally independent given program.The apparent association (RR=1.25) was due to confounding.

The Mantel-Haenszel The Mantel-Haenszel Summary Risk RatioSummary Risk Ratio

k

i i

iii

k

i i

iii

T

bacT

dca

1

1

)(

)(

Disease

Not Disease

Exposure Not Exposed

a c

b d

k strata

E.g., for Berkeley…E.g., for Berkeley…

97.

714)341(347

584)393(147

782)375(265

918)593(205

585)25(208

933)108(314

714)373(321

584)191(289

782)407(248

918)325(391

585)560(8

933)825(19

The Mantel-Haenszel The Mantel-Haenszel Summary Odds RatioSummary Odds Ratio

Exposed

Not Exposed

Case Control

a b

c d

k

i i

ii

k

i i

ii

T

cbT

da

1

1

 

 

Country

OR = 1.32 Spouse smokes

Spouse does not smoke

137 363

71 249US

Spouse smokes

Spouse does not smoke

19 38

5 16

Great

Britain

OR = 1.6

Spouse smokes

Spouse does not smoke

Lung Cancer Control

73 188

21 82Japan OR = 1.52

Source: Blot and Fraumeni, J. Nat. Cancer Inst., 77: 993-1000 (1986).

Example

Summary ORSummary OR

38.1

820363*71

785*38

36421*188

820137*249

7816*19

36482*73

Not Surprising!

MH assumptionsMH assumptions

OR or RR doesn’t vary across strata. (Homogeneity!)

If exposure/disease association does vary for different subgroups, then the summary OR or RR is not appropriate…

advantages and limitationsadvantages and limitationsadvantages…• Mantel-Haenszel summary statistic is easy to interpret

and calculate• Gives you a hands-on feel for the datadisadvantages…• Requires categorical confounders or continuous

confounders that have been divided into intervals • Cumbersome if more than a single confounder

To control for 1 and/or continuous confounders, a multivariate technique (such as logistic regression) is preferable.

Analysis of matched dataAnalysis of matched data

Pair Matching: Why match?Pair Matching: Why match?Pairing can control for extraneous sources

of variability and increase the power of a statistical test.

Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

ExampleExample Johnson and Johnson (NEJM 287: 1122-1125,

1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s…they presented the data as….

Hodgkin’s

Sib control

Tonsillectomy None

41 44

33 52

From John A. Rice, “Mathematical Statistics and Data Analysis.

OR=1.47; chi-square=1.53 (NS)

ExampleExample But several letters to the editor pointed out that

those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired…better to analyze data like this:

From John A. Rice, “Mathematical Statistics and Data Analysis.

OR=2.14; chi-square=2.91 (p=.09)

Tonsillectomy

None

Tonsillectomy None

37 7

15 26

Case

Control

Pair MatchingPair Matching

Match each MI case to an MI control based on age and gender.

Ask about history of diabetes to find out if diabetes increases your risk for MI.

Pair MatchingPair Matching

Diabetes

No diabetes

25 119

Diabetes No Diabetes

9 37

16 82

46

98

144

MI cases

MI controls

Each pair is it’s own “age-Each pair is it’s own “age-gender” stratumgender” stratum

Diabetes

No diabetes

Case (MI) Control

1 1

0 0

Example: Concordant for

exposure (cell “a” from before)

Diabetes

No diabetes

Case (MI) Control

1 1

0 0

Diabetes

No diabetes

Case (MI) Control

1 0

0 1

x 9

x 37

Diabetes

No diabetes

Case (MI) Control

0 1

1 0

Diabetes

No diabetes

Case (MI) Control

0 0

1 1

x 16

x 82

Mantel-Haenszel for pair-Mantel-Haenszel for pair-matched datamatched data

We want to know the relationship between diabetes and MI controlling for age and gender.

Mantel-Haenszel methods apply.

RECALL: The Mantel-Haenszel RECALL: The Mantel-Haenszel Summary Odds RatioSummary Odds Ratio

Exposed

Not Exposed

Case Control

a b

c d

k

i i

ii

k

i i

ii

T

cbT

da

1

1

Diabetes

No diabetes

Case (MI) Control

1 1

0 0

Diabetes

No diabetes

Case (MI) Control

1 0

0 1

ad/T = 0

bc/T=0

ad/T=1/2

bc/T=0

Diabetes

No diabetes

Case (MI) Control

0 1

1 0

Diabetes

No diabetes

Case (MI) Control

0 0

1 1

ad/T=0

bc/T=1/2

ad/T=0

bc/T=0

16

37

21

*16

21

37

2

2144

1

144

1

x

cb

da

OR

i

ii

i

ii

MH

Mantel-Haenszel Summary ORMantel-Haenszel Summary OR

Diabetes

No diabetes

25 119

Diabetes No Diabetes

9 37

16 82

46

98

144

MI cases

MI controls

OR estimate comes only from discordant pairs!!

OR= 37/16 = 2.31

Makes Sense!

McNemar’s TestMcNemar’s Test

Diabetes

No diabetes

25 119

Diabetes No Diabetes

9 37

16 82

46

98

144

MI cases

MI controls

OR estimate comes only from discordant pairs!

The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control. If more discordant pairs “favor” the case, this indicates OR>1.

Diabetes

No diabetes

25 119

Diabetes No Diabetes

9 37

16 82

46

98

144

MI cases

MI controls

P(“favors” case/discordant pair) =

53

37

1637

37ˆ

cb

bp

Diabetes

No diabetes

25 119

Diabetes No Diabetes

9 37

16 82

46

98

144

MI cases

MI controls

odds(“favors” case/discordant pair) =

16

37

c

bOR

Diabetes

No diabetes

Diabetes No Diabetes

9 37

16 82

MI casesMI controls

McNemar’s TestMcNemar’s Test

...)5(.)5(.39

53)5(.)5(.

38

53)5(.)5(.

37

53 143915381637

valuep

01.;88.264.3

5.10

)5)(.5(.53

)2

53(37

pZ

Null hypothesis: P(“favors” case / discordant pair) = .5(note: equivalent to OR=1.0 or cell b=cell c)

By normal approximation to binomial:

McNemar’s Test: generallyMcNemar’s Test: generally

cb

cb

cb

cb

cb

cbb

Z

4

22)5)(.5)(.(

)2

(

By normal approximation to binomial:

Equivalently:

cb

cb

cb

cb

2

221

)()(

exp

No exp

exp No exp

a b

c d

casescontrols

From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: 91-94; Jan 1996.

Example: Salmonella Example: Salmonella Outbreak in France, 1996Outbreak in France, 1996

Epidemic CurveEpidemic Curve

Matched Case Control StudyMatched Case Control Study

Case = Salmonella gastroenteritis.

Community controls (1:1) matched for: age group (< 1, 1-4, 5-14, 15-34, 35-44,

45-54, 55-64, or >= 65 years) gender city of residence

ResultsResults

In 2x2 table form: any goat’s In 2x2 table form: any goat’s cheesecheese

Goat’s cheese

None

29 30

Goat’ cheese None

23 23

6 7

46

13

59

Cases

Controls

8.36

23

c

bOR

In 2x2 table form: Brand B In 2x2 table form: Brand B Goat’s cheeseGoat’s cheese

Goat’s cheese B

None

10 49

Goat’ cheese B None

8 24

2 25

32

27

59

Cases

Controls

0.122

24

c

bOR