Post on 01-Jan-2016
Chi-Square:
Introduction to Nonparametric Stats
Chi-square
Parametric vs. nonparametric tests
Hypotheses about frequencies
Two main uses:
Goodness of fit. 1 IV.
Test of independence. 2 or more IVs.
Goodness-of-fit test
Blind beer tasting study. Judges taste 4 beers and declare their favorite. 100 lucky judges. Results:
            Coors   Corona   Miller   Sam Adams   Total
Frequency   15      45       30       10          100
If no difference in taste (or all the same beer) we expect about 25 people to choose each beer (null hypothesis). There are 100 people and 4 choices (100/4 = 25).
We will test whether frequencies are equal across beers.
Goodness-of-fit (2)
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where O is an observed frequency and E is an expected frequency under the null.
            Coors   Corona   Miller   Sam Adams   Total
Freq        15      45       30       10          100
Expected    25      25       25       25
O-E         -10     20       5        -15
(O-E)^2     100     400      25       225
(O-E)^2/E   4       16       1        9           30 = test value
Goodness-of-fit (3)
Our test statistic was 30. The df for this test are k-1, where k is the number of cells. In our example k=4 and df = 3. Chi-square has a distribution found in tables. For alpha=.05 and 3 df, the critical value is 7.81, which is less than 30. We reject the null hypothesis. People can taste the difference among beers and have favorites.
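
The goodness-of-fit arithmetic above can be sketched in a few lines of plain Python. This is only an illustration: the variable names are made up for the example, and the critical value 7.81 is simply copied from the slide rather than computed.

```python
# Observed favorite-beer counts from the tasting study.
observed = {"Coors": 15, "Corona": 45, "Miller": 30, "Sam Adams": 10}

n = sum(observed.values())   # 100 judges
k = len(observed)            # 4 cells
expected = n / k             # 25 per beer under the null of equal frequencies

# Chi-square: sum of (O - E)^2 / E over all cells.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

CRITICAL_05_DF3 = 7.81       # table value quoted on the slide (alpha = .05, 3 df)
reject_null = chi_square > CRITICAL_05_DF3

print(chi_square, df, reject_null)  # 30.0 3 True
```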
Test of Independence (1)
Exit survey at polls. Voter preferences. Did you vote yes for:
         School tax increase   Ban EEO hiring prefs   Police tax increase   Total
Male     40                    65                     55                    160
Female   70                    50                     60                    180
Total    110                   115                    115                   340
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Use the same formula. But now E is calculated by:
E = (row total * column total) / grand total, or equivalently E = pct_r * pct_c * N, where pct_r and pct_c are the row and column totals expressed as proportions of N.
Test of Independence (2)
Find expected values:
         School tax increase       Ban EEO hiring prefs      Police tax increase       Total
Male     (110*160)/340 = 51.76     (115*160)/340 = 54.12     (115*160)/340 = 54.12     160
Female   (110*180)/340 = 58.24     (115*180)/340 = 60.88     (115*180)/340 = 60.88     180
Total    110                       115                       115                       340
We use row and column totals to figure expected cell frequencies under the null hypothesis that all cell frequencies are proportional to their row and column frequencies in the population.
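
As a rough sketch, the expected cell frequencies can be computed from the row and column totals like this. The list layout and variable names are assumptions made for the example; only the counts come from the slide.

```python
# Observed "yes" votes: rows are Male, Female; columns are
# school tax increase, ban EEO hiring prefs, police tax increase.
observed = [
    [40, 65, 55],   # Male
    [70, 50, 60],   # Female
]

row_totals = [sum(row) for row in observed]          # [160, 180]
col_totals = [sum(col) for col in zip(*observed)]    # [110, 115, 115]
grand_total = sum(row_totals)                        # 340

# E = (row total * column total) / grand total for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

for row in expected:
    print([round(e, 2) for e in row])
# [51.76, 54.12, 54.12]
# [58.24, 60.88, 60.88]
```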
Test of Independence (3)
Find the value of chi-square:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
         School tax increase           Ban EEO hiring prefs   Police tax increase   Total
Male     (40-51.76)^2/51.76 = 2.67     2.19                   .01
Female   2.37                          1.94                   .01
Total                                                                               χ² = 9.19 (test value)
For the chi-square test of independence, the df are (rows-1) times (cols-1) or for this example, (2-1)*(3-1) = 2. From the chi-square table, we find the critical value is 5.99 for an alpha of .05, so we reject the null. Men and women have different voting preferences.
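
The whole test of independence can be sketched as one small function. The function name `chi_square_independence` is hypothetical, and the critical value 5.99 is taken from the slide. Note that carrying the expected values at full precision gives roughly 9.21 instead of the slide's 9.19, which was summed from rounded cell values; the conclusion is the same.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under the null
            chi_square += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(col_totals) - 1)   # (rows-1) * (cols-1)
    return chi_square, df


votes = [
    [40, 65, 55],   # Male
    [70, 50, 60],   # Female
]
chi_square, df = chi_square_independence(votes)

CRITICAL_05_DF2 = 5.99   # table value quoted on the slide (alpha = .05, 2 df)
reject_null = chi_square > CRITICAL_05_DF2

print(round(chi_square, 2), df, reject_null)
```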
Effect Size
Effect size – index of magnitude of relations
Statistical significance – probability of outcome
Significant results when large magnitude or large sample size.
Can have trivial magnitude but still significant results, so you want an effect size.
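
A small illustration of the last point, using made-up counts that are not from the slides: the same 50.5% vs. 49.5% split is nowhere near significant with 200 people but clearly significant with 200,000. The critical value 3.84 (alpha = .05, 1 df) is the standard table value and does not appear in the original deck.

```python
def gof_equal(observed):
    """Goodness-of-fit chi-square against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


CRITICAL_05_DF1 = 3.84        # standard table value, alpha = .05, 1 df

small = [101, 99]             # hypothetical: 50.5% vs 49.5%, N = 200
large = [101000, 99000]       # same proportions, N = 200,000

chi_small = gof_equal(small)
chi_large = gof_equal(large)

# Same magnitude of difference, very different test outcomes.
print(chi_small > CRITICAL_05_DF1)   # False
print(chi_large > CRITICAL_05_DF1)   # True
```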
Effect Sizes for Contingencies - Phi
$$\phi = \sqrt{\frac{\chi^2_{obt}}{N}}$$
                  Type A   Type B
Heart attack      25       10
No heart attack   5        40
$$\chi^2_{obt} = 30.56; \quad N = 80$$
$$\phi = \sqrt{\frac{30.56}{80}} = .62$$
For 2x2 tables only
This is a strong relation. Anything larger than about .5 is unusual in psychology. Average is about .20. Data are hypothetical.
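
The phi calculation can be checked with a short sketch that recomputes chi-square from the 2x2 table and then takes the square root of chi-square over N. The helper name `chi_square_2way` is invented for this example.

```python
import math


def chi_square_2way(table):
    """Chi-square for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (o - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, o in enumerate(row)
    )


# Hypothetical data from the slide; columns are Type A, Type B.
table = [
    [25, 10],   # Heart attack
    [5, 40],    # No heart attack
]
n = sum(sum(row) for row in table)       # 80
chi_square = chi_square_2way(table)      # about 30.56
phi = math.sqrt(chi_square / n)          # about .62

print(round(chi_square, 2), round(phi, 2))
```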
Contingency Coefficient
For 2-way tables other than 2x2, e.g., 3x2 or 4x3
$$C = \sqrt{\frac{\chi^2_{obt}}{N + \chi^2_{obt}}}$$
         School tax increase   Ban EEO hiring prefs   Police tax increase   Total
Male     40                    65                     55                    160
Female   70                    50                     60                    180
Total    110                   115                    115                   340
$$\chi^2_{obt} = 9.19; \quad N = 340$$
$$C = \sqrt{\frac{9.19}{340 + 9.19}} = .16$$
This is a more typical result.
There is a significant association, but the association is not very strong.
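
For completeness, a minimal sketch of the contingency coefficient, plugging in the chi-square and N reported for the voting table. The function name is an assumption for the example.

```python
import math


def contingency_coefficient(chi_square, n):
    """C = sqrt(chi-square / (N + chi-square))."""
    return math.sqrt(chi_square / (n + chi_square))


# Values reported on the slide for the voting table.
c = contingency_coefficient(9.19, 340)
print(round(c, 2))  # 0.16
```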