Lecture 2

Lecture 2: The analysis of cross-tabulations


Transcript of Lecture 2

Page 1: Lecture 2

Lecture 2

The analysis of cross-tabulations

Page 2: Lecture 2

Cross-tabulations

• Tables of countable entities or frequencies

• Made to analyze the association, relationship, or connection between two variables

• This association is difficult to describe statistically

• Null hypothesis: “There is no association between the two variables” can be tested

• Analysis of cross-tabulations with large samples

Page 3: Lecture 2

Delivery and housing tenure

Housing tenure Preterm Term Total

Owner-occupier 50 849 899

Council tenant 29 229 258

Private tenant 11 164 175

Lives with parents 6 66 72

Other 3 36 39

Total 99 1344 1443

Page 4: Lecture 2

Delivery and housing tenure

• Expected number without any association between delivery and housing tenure

Housing tenure Preterm Term Total

Owner-occupier 899

Council tenant 258

Private tenant 175

Lives with parents 72

Other 39

Total 99 1344 1443

Page 5: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• 899/1443 = 62.3% are house owners.

• 62.3% of the preterm deliveries should be house owners: 99*899/1443 = 61.7

Housing tenure Preterm Term Total

Owner-occupier 899

Council tenant 258

Private tenant 175

Lives with parents 72

Other 39

Total 99 1344 1443

Page 6: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• 899/1443 = 62.3% are house owners.

• 62.3% of the term deliveries should be house owners: 1344*899/1443 = 837.3

Housing tenure Preterm Term Total

Owner-occupier 61.7 899

Council tenant 258

Private tenant 175

Lives with parents 72

Other 39

Total 99 1344 1443

Page 7: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• 258/1443 = 17.9% are council tenants.

• 17.9% of the preterm deliveries should be council tenants: 99*258/1443 = 17.7

Housing tenure Preterm Term Total

Owner-occupier 61.7 837.3 899

Council tenant 258

Private tenant 175

Lives with parents 72

Other 39

Total 99 1344 1443

Page 8: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• In general:

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

Housing tenure Preterm Term Total

Owner-occupier 61.7 837.3 899

Council tenant 17.7 240.3 258

Private tenant 12.0 163.0 175

Lives with parents 4.9 67.1 72

Other 2.7 36.3 39

Total 99 1344 1443
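The rule "row total × column total / grand total" can be applied to every cell at once. The following is a minimal sketch in plain Python; the names `observed` and `expected_counts` are illustrative and do not come from the lecture.

```python
# Observed counts from the delivery / housing tenure table: (preterm, term).
observed = {
    "Owner-occupier": (50, 849),
    "Council tenant": (29, 229),
    "Private tenant": (11, 164),
    "Lives with parents": (6, 66),
    "Other": (3, 36),
}

def expected_counts(table):
    """Expected counts under no association: row total * column total / grand total."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    return {
        label: tuple(sum(row) * c / grand_total for c in col_totals)
        for label, row in table.items()
    }

expected = expected_counts(observed)
for label, (pre, term) in expected.items():
    print(f"{label:20s} {pre:6.1f} {term:7.1f}")
```

Rounded to one decimal, the printed values agree with the expected counts in the table above.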

Page 9: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• In general:

$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

Housing tenure Preterm Term Total

Owner-occupier 50(61.7) 849(837.3) 899

Council tenant 29(17.7) 229(240.3) 258

Private tenant 11(12.0) 164(163.0) 175

Lives with parents 6(4.9) 66(67.1) 72

Other 3(2.7) 36(36.3) 39

Total 99 1344 1443

Page 10: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

Housing tenure Preterm Term Total

Owner-occupier 50(61.7) 849(837.3) 899

Council tenant 29(17.7) 229(240.3) 258

Private tenant 11(12.0) 164(163.0) 175

Lives with parents 6(4.9) 66(67.1) 72

Other 3(2.7) 36(36.3) 39

Total 99 1344 1443

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.5$$
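The sum over all cells can be checked numerically. This is a small sketch assuming only the observed counts from the table; `chi_squared` is a hypothetical helper name, and the expected counts are recomputed without rounding.

```python
def chi_squared(observed_rows):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / n."""
    rows = [list(r) for r in observed_rows]
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for r_total, row in zip(row_totals, rows):
        for c_total, o in zip(col_totals, row):
            e = r_total * c_total / n
            stat += (o - e) ** 2 / e
    return stat

# (preterm, term) counts for the five housing tenure categories.
housing = [(50, 849), (29, 229), (11, 164), (6, 66), (3, 36)]
chi2_housing = chi_squared(housing)
print(round(chi2_housing, 1))
```

The cells for council tenants contribute most of the total, which is one way to see where observed and expected counts differ.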

Page 11: Lecture 2

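Delivery and housing tenure: test for association

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.5$$

• If the numbers are large, this statistic follows a chi-squared distribution under the null hypothesis.

• The degrees of freedom are (r-1)(c-1) = (5-1)(2-1) = 4

• From Table 13.3, 0.01 < p < 0.05: if delivery and housing tenure were not associated, a value this large would occur in only 1 - 5% of samples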
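Instead of reading the table, the tail probability can be computed directly. The sketch below uses the closed form of the chi-squared upper tail that holds for an even number of degrees of freedom; the function name `chi2_sf_even_df` is an assumption made for this example, not something from the lecture.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable with even df.

    For df = 2m the tail equals exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

n_rows, n_cols = 5, 2                      # housing tenure categories x delivery outcomes
df = (n_rows - 1) * (n_cols - 1)
p_value = chi2_sf_even_df(10.5, df)
print(df, round(p_value, 3))
```

The result falls between 0.01 and 0.05, in line with the range read from the table.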

Page 12: Lecture 2

Chi Squared Table

Page 13: Lecture 2

Delivery and housing tenure: if the null hypothesis is true

• It is difficult to say anything about the nature of the association.

Housing tenure Preterm Term Total

Owner-occupier 50(61.7) 849(837.3) 899

Council tenant 29(17.7) 229(240.3) 258

Private tenant 11(12.0) 164(163.0) 175

Lives with parents 6(4.9) 66(67.1) 72

Other 3(2.7) 36(36.3) 39

Total 99 1344 1443

Page 14: Lecture 2

2 by 2 tables

Bronchitis No bronchitis Total

Cough 26 44 70

No Cough 247 1002 1249

Total 273 1046 1319

Page 15: Lecture 2

2 by 2 tables

Bronchitis No bronchitis Total

Cough 26 (14.49) 44 (55.51) 70

No Cough 247 (258.51) 1002 (990.49) 1249

Total 273 1046 1319

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 12.2$$
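For a 2 by 2 table the same statistic can be written with the well-known shortcut n(ad - bc)² / (r1 r2 c1 c2), and with 1 degree of freedom the p-value follows from the complementary error function. This is a sketch using that shortcut, which is not shown on the slide; the helper names are illustrative.

```python
import math

def chi2_2x2(a, b, c, d):
    """Chi-squared statistic for the table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def chi2_sf_1df(x):
    """Upper-tail probability for a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# a = cough & bronchitis, b = cough & no bronchitis,
# c = no cough & bronchitis, d = no cough & no bronchitis
chi2_cough = chi2_2x2(26, 44, 247, 1002)
p_cough = chi2_sf_1df(chi2_cough)
print(round(chi2_cough, 1), p_cough < 0.001)
```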

Page 16: Lecture 2

Chi Squared Table

Page 17: Lecture 2

Chi-squared test for small samples

• Expected values:

– More than 80% of cells > 5

– All cells > 1

Streptomycin Control Total

Improvement 13 (8.4) 5 (9.6) 18

Deterioration 2 (4.2) 7 (4.8) 9

Death 0 (2.3) 5 (2.7) 5

Total 15 17 32

Page 18: Lecture 2

Chi-squared test for small samples

• Expected values:

– More than 80% of cells > 5

– All cells > 1

Streptomycin Control Total

Improvement 13 (8.4) 5 (9.6) 18

Deterioration and death 2 (6.6) 12 (7.4) 14

Total 15 17 32

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.8$$
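The rule of thumb for expected values can be checked mechanically before and after combining rows. A minimal sketch, with hypothetical helper names and the 80% threshold taken from the slide:

```python
def expected_counts(rows):
    """Expected counts under no association for a table given as a list of rows."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi2_rule_ok(rows):
    """True if more than 80% of expected counts exceed 5 and all exceed 1."""
    cells = [e for row in expected_counts(rows) for e in row]
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return share_above_5 > 0.8 and all(e > 1 for e in cells)

# (streptomycin, control) counts
three_rows = [(13, 5), (2, 7), (0, 5)]   # improvement, deterioration, death
combined = [(13, 5), (2, 12)]            # improvement, deterioration and death

print(chi2_rule_ok(three_rows), chi2_rule_ok(combined))
```

In the three-row table only two of the six expected counts exceed 5, so the rule fails; after combining deterioration and death all four expected counts exceed 5.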

Page 19: Lecture 2

Fisher’s exact test

• An example

S D T

A 3 1 4

B 2 2 4

5 3 8

S D T

A 4 0 4

B 1 3 4

5 3 8

S D T

A 1 3 4

B 4 0 4

5 3 8

S D T

A 2 2 4

B 3 1 4

5 3 8

Page 20: Lecture 2

Fisher’s exact test

• Survivors: a, b, c, d, e

• Deaths: f, g, h

• Table 1 can be made in 5 ways

• Table 2: 30

• Table 3: 30

• Table 4: 5

• 70 ways in total

S D T

A 3 1 4

B 2 2 4

5 3 8

S D T

A 4 0 4

B 1 3 4

5 3 8

S D T

A 1 3 4

B 4 0 4

5 3 8

S D T

A 2 2 4

B 3 1 4

5 3 8
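The counts of 5, 30, 30 and 5 can be reproduced by brute force: list every way of choosing the four subjects in group A from the eight labelled subjects and tally how many survivors each choice contains. This is only an illustrative sketch; the variable names are not from the lecture.

```python
from collections import Counter
from itertools import combinations

survivors = ["a", "b", "c", "d", "e"]
deaths = ["f", "g", "h"]
subjects = survivors + deaths

# Tally the possible group A memberships by number of survivors in A.
ways = Counter()
for group_a in combinations(subjects, 4):
    n_surv_a = sum(s in survivors for s in group_a)
    ways[n_surv_a] += 1

total = sum(ways.values())
for n_surv_a in sorted(ways, reverse=True):
    print(f"A: {n_surv_a} S, {4 - n_surv_a} D -> {ways[n_surv_a]} ways")
print("total:", total)
```

Each table on the slide corresponds to one value of "survivors in A", since the margins fix the other three cells.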

Page 21: Lecture 2

Fisher’s exact test

• Survivors: a, b, c, d, e

• Deaths: f, g, h

• Table 1 can be made in 5 ways

• Table 2: 30

• Table 3: 30

• Table 4: 5

• 70 ways in total

• The probability of finding table 2 or a more extreme table is:

$$\frac{5}{70} + \frac{30}{70} = \frac{1}{2}$$

Page 22: Lecture 2

Fisher’s exact test

S D T

A f11 f12 r1

B f21 f22 r2

c1 c2 n

$$p = \frac{r_1!\, r_2!\, c_1!\, c_2!}{n!\, f_{11}!\, f_{12}!\, f_{21}!\, f_{22}!}$$

S D T

A 3 1 4

B 2 2 4

5 3 8

$$p = \frac{4!\,4!\,5!\,3!}{8!\,3!\,1!\,2!\,2!} = 0.4286$$

S D T

A 4 0 4

B 1 3 4

5 3 8

$$p = \frac{4!\,4!\,5!\,3!}{8!\,4!\,0!\,1!\,3!} = 0.0714$$
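The factorial formula translates directly into code. The sketch below evaluates it for the two tables above; `table_probability` is a hypothetical name chosen for this example.

```python
from math import factorial

def table_probability(f11, f12, f21, f22):
    """Probability of one 2x2 table with fixed margins (Fisher's exact test)."""
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    numerator = factorial(r1) * factorial(r2) * factorial(c1) * factorial(c2)
    denominator = (factorial(n) * factorial(f11) * factorial(f12)
                   * factorial(f21) * factorial(f22))
    return numerator / denominator

p_31 = table_probability(3, 1, 2, 2)   # A: 3 S, 1 D
p_40 = table_probability(4, 0, 1, 3)   # A: 4 S, 0 D
print(round(p_31, 4), round(p_40, 4), round(p_31 + p_40, 4))
```

The two probabilities equal 30/70 and 5/70, so their sum is the 1/2 obtained by counting arrangements.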

Page 23: Lecture 2

Yates’ correction for 2x2

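• Yates' correction:

$$\chi^2_{\text{Yates}} = \sum_{\text{all cells}} \frac{\left(|O - E| - \frac{1}{2}\right)^2}{E}$$

Streptomycin Control Total

Improvement 13 (8.4) 5 (9.6) 18

Deterioration and death 2 (6.6) 12 (7.4) 14

Total 15 17 32

With Yates' correction:

$$\sum_{\text{all cells}} \frac{\left(|O - E| - \frac{1}{2}\right)^2}{E} = 8.6$$

Without correction:

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.8$$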
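Both statistics can be recomputed from the cells of the table. This sketch deliberately uses the expected values as printed on the slide, rounded to one decimal, which is an assumption about how 8.6 and 10.8 were obtained; with unrounded expected values the results are slightly smaller (about 8.4 and 10.6).

```python
observed = [13, 5, 2, 12]
expected_rounded = [8.4, 9.6, 6.6, 7.4]   # expected counts as printed on the slide

def chi2_from_cells(obs, exp, yates=False):
    """Sum of (|O - E| - shift)^2 / E, with shift = 1/2 when yates is True."""
    shift = 0.5 if yates else 0.0
    return sum((abs(o - e) - shift) ** 2 / e for o, e in zip(obs, exp))

chi2_plain = chi2_from_cells(observed, expected_rounded)
chi2_yates = chi2_from_cells(observed, expected_rounded, yates=True)
print(round(chi2_plain, 1), round(chi2_yates, 1))
```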

Page 24: Lecture 2

Chi Squared Table

Page 25: Lecture 2

Yates’ correction for 2x2

• Table 13.7

– Fisher: p = 0.001455384362148

– ‘Two-sided’ p = 0.0029

– χ2: p = 0.001121814118023

– Yates’ p = 0.0037
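These four p-values can be reproduced with the standard library, assuming that Table 13.7 is the 2 by 2 streptomycin table (13, 5 / 2, 12) and that the ‘two-sided’ value is twice the one-sided Fisher probability. Both are assumptions, although the numbers agree with them. The chi-squared statistics here use unrounded expected counts.

```python
from math import comb, erfc, sqrt

a, b, c, d = 13, 5, 2, 12                  # assumed contents of Table 13.7
r1, r2, c1, c2 = a + b, c + d, a + c, b + d
n = r1 + r2

def hypergeom_prob(x):
    """Probability that the top-left cell equals x when all margins are fixed."""
    return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

# One-sided Fisher p: the observed table plus all more extreme ones.
p_fisher = sum(hypergeom_prob(x) for x in range(a, min(r1, c1) + 1))
p_two_sided = 2 * p_fisher

# Chi-squared with and without Yates' correction, 1 degree of freedom.
margins = r1 * r2 * c1 * c2
chi2 = n * (a * d - b * c) ** 2 / margins
chi2_yates = n * (abs(a * d - b * c) - n / 2) ** 2 / margins
p_chi2 = erfc(sqrt(chi2 / 2))
p_yates = erfc(sqrt(chi2_yates / 2))

print(p_fisher, round(p_two_sided, 4), round(p_chi2, 4), round(p_yates, 4))
```

Yates' correction moves the chi-squared p-value closer to the doubled Fisher value.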

Page 26: Lecture 2

Odds and odds ratios

• Odds, where p is the probability of an event:

$$o = \frac{p}{1 - p}$$

• Log odds / logit:

$$\ln(o) = \ln\left(\frac{p}{1 - p}\right)$$
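The two definitions are one line each in code. In this sketch `inverse_logit` is an extra helper that is not on the slide; it is included only to show that the logit can be converted back to a probability.

```python
import math

def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds of an event with probability p (requires 0 < p < 1)."""
    return math.log(odds(p))

def inverse_logit(x):
    """Probability corresponding to log odds x."""
    return 1.0 / (1.0 + math.exp(-x))

print(odds(0.2), logit(0.5), inverse_logit(logit(0.2)))
```

A probability of 0.5 gives odds of 1 and a logit of 0, which is why the log scale is symmetric around "no difference".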

Page 27: Lecture 2

Odds

• The probability of coughs in kids with a history of bronchitis:

– p = 26/273 = 0.095

– o = 26/247 = 0.105

• The probability of coughs in kids without a history of bronchitis:

– p = 44/1046 = 0.042

– o = 44/1002 = 0.044

Bronchitis No bronchitis Total

Cough 26 (a) 44 (b) 70

No Cough 247 (c) 1002 (d) 1249

Total 273 1046 1319

$$o = \frac{p}{1 - p}$$

Page 28: Lecture 2

Odds ratio

• The odds ratio is the ratio of the odds of experiencing coughs in kids with a history of bronchitis to the odds in kids without.

Bronchitis No bronchitis Total

Cough 26; 0.105 (a) 44; 0.0439 (b) 70

No Cough 247; 9.50 (c) 1002; 22.8 (d) 1249

Total 273 1046 1319

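$$or = \frac{a/c}{b/d} = \frac{ad}{bc}$$

$$or = \frac{a/b}{c/d} = \frac{ad}{bc}$$

$$or = \frac{26/247}{44/1002} = \frac{26 \times 1002}{247 \times 44} = 2.40$$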
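As a quick check of the arithmetic, the odds ratio can be computed both as a ratio of the two odds and as the cross-product ad/bc. A minimal sketch with the cell labels a, b, c, d from the table:

```python
# a = cough & bronchitis, b = cough & no bronchitis,
# c = no cough & bronchitis, d = no cough & no bronchitis
a, b, c, d = 26, 44, 247, 1002

odds_bronchitis = a / c          # odds of cough with a history of bronchitis
odds_no_bronchitis = b / d       # odds of cough without a history of bronchitis
odds_ratio = odds_bronchitis / odds_no_bronchitis
cross_product = (a * d) / (b * c)

print(round(odds_bronchitis, 3), round(odds_no_bronchitis, 4), round(odds_ratio, 2))
```

Both routes give the same value, since (a/c)/(b/d) and ad/bc are algebraically identical.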

Page 29: Lecture 2

Is the odds ratio different from 1?

Bronchitis No bronchitis Total

Cough 26 (a) 44 (b) 70

No Cough 247 (c) 1002 (d) 1249

Total 273 1046 1319

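• We could take ln of the odds ratio. Is ln(or) different from zero?

$$\ln(or) = \ln(2.40) = 0.874$$

$$SE(\ln(or)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{\frac{1}{26} + \frac{1}{44} + \frac{1}{247} + \frac{1}{1002}} = 0.257$$

• 95% confidence interval (assuming normality):

$$0.874 - 1.96 \times 0.257 \text{ to } 0.874 + 1.96 \times 0.257 = 0.37 \text{ to } 1.38$$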

Page 30: Lecture 2

Confidence interval of the Odds ratio

• ln(or) ± 1.96*SE(ln(or)) = 0.37 to 1.38

• Returning to the odds ratio itself:

• e^0.370 to e^1.379 = 1.45 to 3.97

• The interval does not contain 1, indicating a statistically significant difference

Bronchitis No bronchitis Total

Cough 26 (a) 44 (b) 70

No Cough 247 (c) 1002 (d) 1249

Total 273 1046 1319
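The whole calculation, from the four cell counts to the confidence interval for the odds ratio, fits in a few lines. This is a sketch of the normal-approximation interval described above, using 1.96 as the 95% multiplier; the variable names are illustrative.

```python
import math

a, b, c, d = 26, 44, 247, 1002

log_or = math.log((a * d) / (b * c))
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

z = 1.96                                   # 95% multiplier under normality
lower_log = log_or - z * se_log_or
upper_log = log_or + z * se_log_or

# Back-transform the limits to the odds ratio scale.
lower_or, upper_or = math.exp(lower_log), math.exp(upper_log)

print(round(log_or, 3), round(se_log_or, 3))
print(round(lower_log, 2), round(upper_log, 2))
print(round(lower_or, 2), round(upper_or, 2))
```

Because the interval is built on the log scale and then exponentiated, it is not symmetric around 2.40 on the odds ratio scale.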

Page 31: Lecture 2

Chi-square for goodness of fit

• df = 4-1-1 = 2