Post on 24-Feb-2016
Lecture 2
The analysis of cross-tabulations
Cross-tabulations
• Tables of countable entities or frequencies
• Made to analyze the association, relationship, or connection between two variables
• This association is difficult to describe statistically
• Null hypothesis: “There is no association between the two variables” can be tested
• Analysis of cross-tabulations with large samples
Delivery and housing tenure
Housing tenure Preterm Term Total
Owner-occupier 50 849 899
Council tenant 29 229 258
Private tenant 11 164 175
Lives with parents 6 66 72
Other 3 36 39
Total 99 1344 1443
Delivery and housing tenure
• Expected number without any association between delivery and housing tenure
Housing tenure Pre Term Total
Owner-occupier 899
Council tenant 258
Private tenant 175
Lives with parents 72
Other 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 899/1443 = 62.3% are house owners.
• 62.3% of the preterm deliveries should be to house owners: 99*899/1443 = 61.7
Housing tenure Pre Term Total
Owner-occupier 899
Council tenant 258
Private tenant 175
Lives with parents 72
Other 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 899/1443 = 62.3% are house owners.
• 62.3% of the term deliveries should be to house owners: 1344*899/1443 = 837.3
Housing tenure Pre Term Total
Owner-occupier 61.7 899
Council tenant 258
Private tenant 175
Lives with parents 72
Other 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 258/1443 = 17.9% are council tenants.
• 17.9% of the preterm deliveries should be to council tenants: 99*258/1443 = 17.7
Housing tenure Pre Term Total
Owner-occupier 61.7 837.3 899
Council tenant 258
Private tenant 175
Lives with parents 72
Other 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• In general
Housing tenure Pre Term Total
Owner-occupier 61.7 837.3 899
Council tenant 17.7 240.3 258
Private tenant 12.0 163.0 175
Lives with parents 4.9 67.1 72
Other 2.7 36.3 39
Total 99 1344 1443
expected number = row total * column total / grand total
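The expected numbers in the table can be reproduced with a few lines of Python. This is a minimal sketch of the row total * column total / grand total rule; the function name `expected_counts` and the dictionary layout are illustrative choices, not part of the lecture.

```python
# Observed counts from the housing tenure table: (preterm, term) per row.
observed = {
    "Owner-occupier": (50, 849),
    "Council tenant": (29, 229),
    "Private tenant": (11, 164),
    "Lives with parents": (6, 66),
    "Other": (3, 36),
}


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    grand_total = sum(col_totals)
    return {
        label: tuple(sum(row) * c / grand_total for c in col_totals)
        for label, row in table.items()
    }


expected = expected_counts(observed)
for label, row in expected.items():
    # Rounded to one decimal, as on the slide.
    print(label, [round(e, 1) for e in row])
```

Each row of expected numbers adds up to the observed row total, which is a quick check that the margins are preserved.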
Delivery and housing tenure: if the null hypothesis is true
• In general
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
expected number = row total * column total / grand total
Delivery and housing tenure: if the null hypothesis is true
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
χ² = Σ (O − E)²/E = 10.5 (sum over all cells)
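As a check on the value 10.5, the statistic can be computed directly from the observed counts. The sketch below assumes the table is stored as a list of rows; the helper name `chi_squared` is illustrative.

```python
# Observed counts, one row per housing tenure: [preterm, term].
observed = [[50, 849], [29, 229], [11, 164], [6, 66], [3, 36]]


def chi_squared(table):
    """Sum of (O - E)^2 / E over all cells of a cross-tabulation."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            total += (o - e) ** 2 / e
    return total


chi2 = chi_squared(observed)
print(round(chi2, 1))
```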
Delivery and housing tenure: test for association
χ² = Σ (O − E)²/E = 10.5 (sum over all cells)
• If the numbers are large, this statistic follows a chi-squared distribution.
• The degrees of freedom are (r-1)(c-1) = 4
• From Table 13.3, the probability of a value this large, if delivery and housing tenure were not associated, is between 1% and 5%
Chi Squared Table
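Instead of reading the probability from a printed table, the upper tail can be computed. The sketch below relies on the closed form for 4 degrees of freedom, P(X ≥ x) = exp(-x/2)(1 + x/2), which holds only for df = 4; the function names are illustrative and the input 10.5 is the rounded statistic from the slide.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """(r - 1)(c - 1) for an r by c cross-tabulation."""
    return (n_rows - 1) * (n_cols - 1)


def chi2_upper_tail_df4(x):
    """P(X >= x) for a chi-squared variable with exactly 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)


df = degrees_of_freedom(5, 2)  # 5 tenure groups, 2 delivery outcomes
p = chi2_upper_tail_df4(10.5)
print(df, round(p, 3))
```

The result lies between 0.01 and 0.05, in line with the range read from the table.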
Delivery and housing tenure: if the null hypothesis is true
• It is difficult to say anything about the nature of the association.
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
2 by 2 tables
Bronchitis No bronchitis Total
Cough 26 44 70
No Cough 247 1002 1249
Total 273 1046 1319
2 by 2 tables
Bronchitis No bronchitis Total
Cough 26 (14.49) 44 (55.51) 70
No Cough 247 (258.51) 1002 (990.49) 1249
Total 273 1046 1319
χ² = Σ (O − E)²/E = 12.2 (sum over all cells)
Chi Squared Table
Chi-squared test for small samples
• Expected values:
– more than 80% should be greater than 5
– all should be greater than 1
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration 2 (4.2) 7 (4.8) 9
Death 0 (2.3) 5 (2.7) 5
Total 15 17 32
Chi-squared test for small samples
• Expected values:
– more than 80% should be greater than 5
– all should be greater than 1
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration and death 2 (6.6) 12 (7.4) 14
Total 15 17 32
χ² = Σ (O − E)²/E = 10.8 (sum over all cells)
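The rule of thumb for expected values can be checked mechanically. The sketch below applies it to the streptomycin table before and after the deterioration and death rows are combined; the helper names are illustrative, and "more than 80%" is implemented as a strict inequality, following the wording of the slide.

```python
def expected_counts(table):
    """Expected counts (row total * column total / grand total) as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def large_sample_ok(table):
    """True if more than 80% of expected counts exceed 5 and all exceed 1."""
    expected = [e for row in expected_counts(table) for e in row]
    share_above_5 = sum(e > 5 for e in expected) / len(expected)
    return share_above_5 > 0.8 and min(expected) > 1


three_rows = [[13, 5], [2, 7], [0, 5]]  # improvement, deterioration, death
two_rows = [[13, 5], [2, 12]]           # deterioration and death combined

print(large_sample_ok(three_rows))  # only 2 of 6 expected counts exceed 5
print(large_sample_ok(two_rows))    # all 4 expected counts exceed 5
```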
Fisher’s exact test
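• An example: the four possible tables with the same row and column totals
Table 1
S D T
A 4 0 4
B 1 3 4
T 5 3 8
Table 2
S D T
A 3 1 4
B 2 2 4
T 5 3 8
Table 3
S D T
A 2 2 4
B 3 1 4
T 5 3 8
Table 4
S D T
A 1 3 4
B 4 0 4
T 5 3 8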
Fisher’s exact test
• Survivors: a, b, c, d, e
• Deaths: f, g, h
• Table 1 (4 survivors in A) can be made in 5 ways
• Table 2 (3 survivors in A): 30
• Table 3 (2 survivors in A): 30
• Table 4 (1 survivor in A): 5
• 70 ways in total
• The probability of finding table 2 or a more extreme one is:
5/70 + 30/70 = 1/2
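The counts of 5, 30, 30 and 5 ways can be verified by brute force: list every way of putting 4 of the 8 patients in group A and count the survivors among them. This is a small enumeration sketch; the variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

survivors = set("abcde")  # patients a to e survived
patients = "abcdefgh"     # patients f, g, h died

# For each choice of 4 patients for group A, count the survivors in A.
ways = Counter(
    len(survivors & set(group_a)) for group_a in combinations(patients, 4)
)
total = sum(ways.values())

# Key = number of survivors in group A, value = number of ways.
print(dict(ways), total)

# Table with 3 survivors in A, or the more extreme table with 4.
p_3_or_more = (ways[3] + ways[4]) / total
print(p_3_or_more)
```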
Fisher’s exact test
S D T
A f11 f12 r1
B f21 f22 r2
T c1 c2 n
p = (r1! r2! c1! c2!) / (n! f11! f12! f21! f22!)
S D T
A 3 1 4
B 2 2 4
T 5 3 8
p = (4! 4! 5! 3!) / (8! 3! 1! 2! 2!) = 0.4286
S D T
A 4 0 4
B 1 3 4
T 5 3 8
p = (r1! r2! c1! c2!) / (n! f11! f12! f21! f22!) = (4! 4! 5! 3!) / (8! 4! 0! 1! 3!) = 0.0714
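The factorial formula translates directly into code. The sketch below evaluates it for the two tables above and should give 30/70 and 5/70; the function name `table_probability` is an illustrative choice.

```python
from math import factorial


def table_probability(f11, f12, f21, f22):
    """Probability of one 2x2 table given its row and column totals."""
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    numerator = factorial(r1) * factorial(r2) * factorial(c1) * factorial(c2)
    denominator = (
        factorial(n)
        * factorial(f11) * factorial(f12) * factorial(f21) * factorial(f22)
    )
    return numerator / denominator


p_3_1 = table_probability(3, 1, 2, 2)
p_4_0 = table_probability(4, 0, 1, 3)
print(round(p_3_1, 4), round(p_4_0, 4), round(p_3_1 + p_4_0, 4))
```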
Yates’ correction for 2x2
• Yates’ correction: χ² = Σ (|O − E| − ½)²/E (sum over all cells)
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration and death 2 (6.6) 12 (7.4) 14
Total 15 17 32
With Yates’ correction: χ² = Σ (|O − E| − ½)²/E = 8.6
Without correction: χ² = Σ (O − E)²/E = 10.8
Chi Squared Table
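The effect of the correction can be seen by computing both statistics for the 2 by 2 streptomycin table. This is a sketch with an illustrative function name. It uses unrounded expected counts, so it gives about 10.6 and 8.4, slightly below the slide values of 10.8 and 8.6, which appear to be based on the expected counts rounded to one decimal.

```python
def chi_squared_2x2(a, b, c, d, yates=False):
    """Chi-squared for a 2x2 table, optionally with Yates' continuity correction."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    correction = 0.5 if yates else 0.0
    total = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n  # unrounded expected count
            total += (abs(observed[i][j] - e) - correction) ** 2 / e
    return total


plain = chi_squared_2x2(13, 5, 2, 12)
corrected = chi_squared_2x2(13, 5, 2, 12, yates=True)
print(round(plain, 2), round(corrected, 2))
```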
Yates’ correction for 2x2
• Table 13.7
– Fisher: p = 0.001455384362148
– ‘Two-sided’ p = 0.0029
– χ²: p = 0.001121814118023
– Yates’ p = 0.0037
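These four p-values can be reproduced with the standard library. The sketch assumes that the Fisher value is the one-sided tail for the 2 by 2 streptomycin table, that the 'two-sided' value is that tail doubled, and that the chi-squared p-values use 1 degree of freedom. The 2 by 2 shortcut n(|ad - bc| - n/2)²/(r1 r2 c1 c2) is a standard identity that does not appear on the slides, and all function names are illustrative.

```python
from math import comb, erfc, sqrt


def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) with all margins fixed (hypergeometric tail)."""
    r1, c1, n = a + b, a + c, a + b + c + d
    tail = sum(
        comb(r1, k) * comb(n - r1, c1 - k) for k in range(a, min(r1, c1) + 1)
    )
    return tail / comb(n, c1)


def chi2_2x2(a, b, c, d, yates=False):
    """Shortcut chi-squared for a 2x2 table, optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c) - (n / 2 if yates else 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_p_1df(x):
    """Upper tail of chi-squared with 1 degree of freedom."""
    return erfc(sqrt(x / 2))


table = (13, 5, 2, 12)
p_fisher = fisher_one_sided(*table)
p_two_sided = 2 * p_fisher
p_chi2 = chi2_p_1df(chi2_2x2(*table))
p_yates = chi2_p_1df(chi2_2x2(*table, yates=True))
print(p_fisher, p_two_sided, p_chi2, p_yates)
```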
Odds and odds ratios
• Odds, where p is the probability of an event: o = p/(1 − p)
• Log odds / logit: ln(o) = ln(p/(1 − p))
Odds
• The probability of coughs in kids with a history of bronchitis:
p = 26/273 = 0.095
o = 26/247 = 0.105
• The probability of coughs in kids without a history of bronchitis:
p = 44/1046 = 0.042
o = 44/1002 = 0.044
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
o = p/(1 − p)
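The link between probability and odds is easy to check numerically. The sketch below converts the two cough probabilities to odds and log odds; the function names `odds` and `logit` are illustrative.

```python
import math


def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)


def logit(p):
    """Log odds of an event with probability p."""
    return math.log(odds(p))


p_bronchitis = 26 / 273      # cough, history of bronchitis
p_no_bronchitis = 44 / 1046  # cough, no history of bronchitis

print(round(odds(p_bronchitis), 3), round(odds(p_no_bronchitis), 3))
print(round(logit(p_bronchitis), 3), round(logit(p_no_bronchitis), 3))
```

Converting p = 26/273 to odds gives 26/247, the same value as dividing the two cell counts directly.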
Odds ratio
• The odds ratio: the odds of coughs in kids with a history of bronchitis divided by the odds in kids without.
Bronchitis No bronchitis Total
Cough 26; 0.105 (a) 44; 0.0439 (b) 70
No Cough 247; 9.50 (c) 1002; 22.8 (d) 1249
Total 273 1046 1319
or = (a/c)/(b/d) = ad/bc
or = (a/b)/(c/d) = ad/bc
or = (26/247)/(44/1002) = (26*1002)/(247*44) = 2.40
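A short sketch confirms that both ways of forming the ratio give ad/bc. The cell labels a, b, c, d follow the table; the function name `odds_ratio` is illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table with cells a, b (top row), c, d (bottom row)."""
    return (a * d) / (b * c)


a, b, c, d = 26, 44, 247, 1002
or_cough = odds_ratio(a, b, c, d)

by_columns = (a / c) / (b / d)  # odds of cough: bronchitis vs no bronchitis
by_rows = (a / b) / (c / d)     # odds of bronchitis: cough vs no cough
print(round(or_cough, 2), round(by_columns, 2), round(by_rows, 2))
```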
Is the odds ratio different from 1?
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
• We could take the natural logarithm of the odds ratio. Is ln(or) different from zero?
ln(or) = ln(2.40) = 0.874
SE(ln(or)) = √(1/a + 1/b + 1/c + 1/d) = √(1/26 + 1/44 + 1/247 + 1/1002) = 0.257
• 95% confidence interval (assuming normality):
0.874 − 1.96*0.257 to 0.874 + 1.96*0.257 = 0.37 to 1.38
Confidence interval of the Odds ratio
• ln(or) ± 1.96*SE(ln(or)) = 0.37 to 1.38
• Returning to the odds ratio itself:
• e^0.370 to e^1.379 = 1.45 to 3.97
• The interval does not contain 1, indicating a statistically significant difference
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
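The interval can be reproduced in a few lines. The sketch assumes the normal approximation on the log scale with multiplier 1.96, as on the slides; the function name `odds_ratio_ci` is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Confidence interval for the odds ratio ad/bc, computed on the log scale."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of ln(or)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)


low, high = odds_ratio_ci(26, 44, 247, 1002)
print(round(low, 2), round(high, 2))

# The difference is significant at the 5% level if 1 lies outside the interval.
print(not (low <= 1 <= high))
```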
Chi-square for goodness of fit
• df = 4-1-1 = 2