Chapter 14: Chi-Square Procedures. 14.1 – Test for Goodness of Fit.
Chapter 14: Chi-Square Procedures
14.1 – Test for Goodness of Fit
Chi-square (χ²) test for goodness of fit:
Used to determine whether the observed distribution of a categorical variable matches the distribution you expect
Observed Counts: Counts of the actual results in each category
Expected Counts: Counts you would expect in each category if the hypothesized proportions were true
To find each expected count, multiply the expected proportion by the sample size
Note: Sometimes the expected proportions are all the same, and sometimes they differ from category to category
Ho: All of the proportions are as expected
HA: One or more of the proportions are different from expected
Test statistic:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where O is an observed count and E is the matching expected count, summed over all categories
df = k – 1, where k is the number of categories
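As a rough illustration of the statistic, the Python sketch below turns hypothesized proportions into expected counts and then sums (O – E)²/E. The function name `goodness_of_fit_statistic` is made up for this example, and the counts are the 96 die rolls that appear in Example #2 of this section.

```python
def goodness_of_fit_statistic(observed, proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    n = sum(observed)                           # sample size
    expected = [n * p for p in proportions]     # expected count = proportion x sample size
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                      # k - 1, with k categories
    return expected, chi_sq, df


# Die-roll counts from Example #2: 96 rolls, each face expected 1/6 of the time.
observed = [19, 15, 10, 14, 17, 21]
expected, chi_sq, df = goodness_of_fit_statistic(observed, [1 / 6] * 6)
print([round(e, 3) for e in expected], round(chi_sq, 4), df)
```

Passing unequal proportions (for example `[0.5, 0.3, 0.2]`) covers the case where the categories are not expected to be equally likely.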
Properties of the chi-square distribution:
• Never negative (the statistic is a sum of squared terms)
• Skewed to the right
• Shape changes as the degrees of freedom change
• The P-value is the area shaded to the right of the test statistic
Chi-square test for goodness of fit:
Conditions:
all expected counts are ≥ 5
Calculator Tip! Goodness of Fit
L1: Observed
L2: Expected
L3: (L1 – L2)² / L2
List – Math – Sum – L3
Calculator Tip! P(χ² > #)
2nd dist - χ²cdf(χ², big #, df)
Example #1: A study yields a chi-square statistic value of 20 (χ² = 20). What is the P-value of the test if the study was a goodness-of-fit test with 12 categories?
df = 12 – 1 = 11
By table: 0.025 < P < 0.05
Or by calc:
χ²cdf(20, 10000000, 11) = 0.04534
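For readers without the calculator, the upper-tail area can be sketched in Python using only the standard library. This is a minimal sketch, assuming integer degrees of freedom: it uses the closed-form series for the chi-square tail (a finite sum for even df, plus an `erfc` term for odd df). The name `chi2_upper_tail` is hypothetical; it plays the role of χ²cdf(χ², big #, df).

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= 1 exceeds x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k - 1/2) / Gamma(k + 1/2)
    term = math.sqrt(half) / math.gamma(1.5)    # k = 1 term
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)                # step from term k to term k + 1
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Example #1: chi-square statistic of 20 with 12 - 1 = 11 degrees of freedom.
print(round(chi2_upper_tail(20, 11), 5))
```

The result should agree with the calculator value to the digits shown on the slide; any difference beyond that comes from rounding.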
Example #2: Find the expected counts and calculate the χ² statistic for the 96 rolls of the die. Then find the probability.
Each face is expected 1/6 of the time, so each expected count is 96 / 6 = 16
Face Value   1       2       3     4     5       6
Observed     19      15      10    14    17      21
Expected     16      16      16    16    16      16
(O – E)²/E   0.5625  0.0625  2.25  0.25  0.0625  1.5625
For example: (19 – 16)² / 16 = 0.5625 and (15 – 16)² / 16 = 0.0625
$\chi^2 = \sum \frac{(O - E)^2}{E} = 0.5625 + 0.0625 + 2.25 + 0.25 + 0.0625 + 1.5625 = 4.75$
df = 6 – 1 = 5
By table: P(χ² > 4.75) > 0.25
Or by calc:
χ²cdf(4.75, 10000000, 5) = 0.4471
Example #3: The number of defects from a manufacturing process by day of the week is as follows:
     Monday  Tuesday  Wednesday  Thursday  Friday
#    36      23       26         25        40
The manufacturer is concerned that the number of defects is greater on Monday and Friday. Test, at the 0.05 level of significance, the claim that the proportion of defects is the same each day of the week.
P: The true proportion of defects from a manufacturing process on each day of the week
H:
Ho: The proportion of defects from a manufacturing process is the same Mon-Fri
HA: The proportion of defects from a manufacturing process is not the same Mon-Fri (on one day or more)
A:
150 defects total, so each expected count is 150 / 5 = 30
          Monday  Tuesday  Wednesday  Thursday  Friday
#         36      23       26         25        40
Expected  30      30       30         30        30
(All expected counts are ≥ 5)
N: Chi-Square Goodness of Fit
T:
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(36 - 30)^2}{30} + \frac{(23 - 30)^2}{30} + \dots = 7.533$
O:
df = 5 – 1 = 4
By table: 0.10 < P(χ² > 7.533) < 0.15
Or by calc:
χ²cdf(7.533, 10000000, 4) = 0.1102
M:
P = 0.1102 > α = 0.05
Fail to reject the null (Ho)
S:
There is not enough evidence to claim the proportion of defects from a manufacturing process is not the same Mon-Fri (on one day or more)
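The whole defects example can be checked with a short Python sketch. It assumes equal expected proportions across the five days, as the null hypothesis states, and uses the fact that for df = 4 the upper-tail area has the closed form exp(–x/2)(1 + x/2). The variable names are illustrative only.

```python
import math

observed = {"Monday": 36, "Tuesday": 23, "Wednesday": 26, "Thursday": 25, "Friday": 40}
alpha = 0.05

n = sum(observed.values())              # 150 defects in total
expected = n / len(observed)            # equal proportions: 150 / 5 = 30 per day
chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1                  # 5 - 1 = 4

# Closed form that holds only for df = 4: P(chi-square > x) = exp(-x/2) * (1 + x/2).
p_value = math.exp(-chi_sq / 2) * (1 + chi_sq / 2)

decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(round(chi_sq, 3), df, round(p_value, 4), decision)
```

Because the P-value is larger than 0.05, the sketch reaches the same conclusion as the M and S steps.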
14.2 – Inference for Two-Way Tables
To compare two proportions, we use a 2-Proportion Z Test. If we want to compare three or more proportions, we need a new procedure.
Two-Way Table:
Organizes the data for several proportions
r rows and c columns
Dimensions are r × c
To calculate the expected counts, multiply the row total by the column total, and divide by the table total:
$\text{Expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}$
Degrees of Freedom: (r – 1)(c – 1)
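The expected-count rule and the degrees of freedom can be sketched in Python for any table of observed counts. The helper names `expected_counts` and `degrees_of_freedom` are assumptions made for this illustration; the data are the political-party counts from Example #3 later in this section.

```python
def expected_counts(table):
    """Expected count for each cell: row total x column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def degrees_of_freedom(table):
    """(r - 1)(c - 1) for an r x c table."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Observed counts from Example #3 (rows: Male, Female;
# columns: Democrat, Republican, Independent).
table = [[11, 7, 2],
         [7, 8, 1]]
for row in expected_counts(table):
    print([round(e, 2) for e in row])
print("df =", degrees_of_freedom(table))
```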
Chi-Square test for Homogeneity:
Compare two or more populations on one categorical variable
Ho: The proportions are the same among 2 or more populations
Ha: The proportions are different among 2 or more populations
Conditions:
SRS
Expected Counts are ≥ 5
Chi-Square test for Association/Independence:
Two categorical variables collected from a single population
Ho: There is no association between the two categorical variables
Ha: There is an association between the two categorical variables
Conditions:
SRS
Expected Counts are ≥ 5
Calculator Tip! Test for Homogeneity/Independence
2nd – matrx – edit – [A] – r × c – Table info
Then:
Stat – tests – χ²–test
Observed: [A]
Expected: [B]
Calculate
Note: Expected: [B] is done automatically!
Example #1: The table shows the number of people in each grade of high school who preferred a different color of socks.
(Table not reproduced. Row totals: 20, 18, 18. Column totals: 14, 12, 15, 15. Table total: 56.)
a. What is the expected value for the number of 12th graders who prefer red socks?
$\text{Expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}} = \frac{20 \times 14}{56} = 5$
b. Find the degrees of freedom.
(r – 1)(c – 1) = (3 – 1)(4 – 1) = (2)(3) = 6
Example #2: An SRS of a group of teens enrolled in alternative schooling programs was asked if they smoked or not. The information is classified by gender in the table. Find the expected counts for each cell, and then find the chi-square statistic, degrees of freedom, and its corresponding probability.
Observed Counts (with totals):
         Smoker  Non-Smoker  Total
Male     23      47          70
Female   56      91          147
Total    79      138         217
Expected Counts: (row total × column total) / table total
Male, Smoker:        (70 × 79) / 217 = 25.484
Female, Smoker:      (147 × 79) / 217 = 53.516
Male, Non-Smoker:    (70 × 138) / 217 = 44.516
Female, Non-Smoker:  (147 × 138) / 217 = 93.484
Expected Counts:
         Smoker  Non-Smoker
Male     25.484  44.516
Female   53.516  93.484
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(23 - 25.484)^2}{25.484} + \frac{(47 - 44.516)^2}{44.516} + \frac{(56 - 53.516)^2}{53.516} + \frac{(91 - 93.484)^2}{93.484} = 0.56197$
Degrees of Freedom: (r – 1)(c – 1) = (2 – 1)(2 – 1) = (1)(1) = 1
P(χ² > 0.56197): more than 0.25 by table, or 0.4535 by calc
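The 2 × 2 calculation can be reproduced with a few lines of Python. This is a sketch under two assumptions: the observed counts 23, 47, 56, 91 are placed in cells by matching them to the expected counts in the sum above (the Male/Female row labels are inferred from the slide layout), and the P-value uses the df = 1 identity P(χ² > x) = erfc(√(x/2)).

```python
import math

# Observed counts as used in the calculation above
# (rows: Male, Female; columns: Smoker, Non-Smoker; labels inferred).
observed = [[23, 47],
            [56, 91]]

row_totals = [sum(row) for row in observed]          # 70, 147
col_totals = [sum(col) for col in zip(*observed)]    # 79, 138
total = sum(row_totals)                              # 217

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / total    # expected count for this cell
        chi_sq += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)    # (2 - 1)(2 - 1) = 1

# With df = 1, chi-square is the square of a standard normal variable,
# so the upper-tail area is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))
print(round(chi_sq, 5), df, round(p_value, 4))
```

Swapping the row labels would not change the statistic, since every cell contributes the same (O – E)²/E term either way.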
Example #3: At a school, a random sample of 20 males and 16 females were asked to classify which political party they identified with.
         Democrat  Republican  Independent
Male     11        7           2
Female   7         8           1
Are the proportions of Democrats, Republicans, and Independents the same within both populations? Conduct a test of significance at the α = 0.05 level.
P: The true proportions of Democrats, Republicans, and Independents among males and among females
H:
Ho: The proportions identifying with each political party are the same for males and females
HA: The proportions identifying with each political party are different for males and females
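A: SRS (stated in the problem)
Expected Counts ≥ 5
Observed Counts (with totals):
         Democrat  Republican  Independent  Total
Male     11        7           2            20
Female   7         8           1            16
Total    18        15          3            36
Expected count = (row total × column total) / table total
Male, Democrat:       (20 × 18) / 36 = 10
Female, Democrat:     (16 × 18) / 36 = 8
Male, Republican:     (20 × 15) / 36 = 8.33
Female, Republican:   (16 × 15) / 36 = 6.67
Male, Independent:    (20 × 3) / 36 = 1.67
Female, Independent:  (16 × 3) / 36 = 1.33
Not all expected counts are ≥ 5, proceed with caution!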
N: Chi-Square test for Homogeneity
T:
Expected:
         Democrat  Republican  Independent
Male     10        8.33        1.67
Female   8         6.67        1.33
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(11 - 10)^2}{10} + \frac{(7 - 8)^2}{8} + \frac{(7 - 8.33)^2}{8.33} + \frac{(8 - 6.67)^2}{6.67} + \frac{(2 - 1.67)^2}{1.67} + \frac{(1 - 1.33)^2}{1.33} = 0.855$
O:
Degrees of Freedom: (r – 1)(c – 1) = (2 – 1)(3 – 1) = (1)(2) = 2
P(χ² > 0.855): more than 0.25 by table, or 0.6521 by calc
M:
P = 0.6521 > α = 0.05
Fail to reject the null (Ho)
S:
There is not enough evidence to claim the proportions identifying with each political party are different for males and females
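The T, O, and M steps of this example can be traced in a short Python sketch. It works from unrounded expected counts and relies on the df = 2 special case, where the upper-tail area reduces to exp(–x/2). Variable names are illustrative.

```python
import math

# Observed counts from the table in this example
# (rows: Male, Female; columns: Democrat, Republican, Independent).
observed = [[11, 7, 2],
            [7, 8, 1]]
alpha = 0.05

row_totals = [sum(row) for row in observed]          # 20, 16
col_totals = [sum(col) for col in zip(*observed)]    # 18, 15, 3
total = sum(row_totals)                              # 36

# Each cell contributes (O - E)^2 / E, with E = row total x column total / table total.
contributions = [
    [(o - r * c / total) ** 2 / (r * c / total) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
chi_sq = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (2 - 1)(3 - 1) = 2

# Closed form that holds only for df = 2: P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-chi_sq / 2)
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(round(chi_sq, 3), df, round(p_value, 4), decision)
```

Printing `contributions` shows which cells drive the statistic; here no single cell contributes much, which matches the large P-value.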
Example #4: The following chart represents the score distribution on the AP Exams for different subjects at a certain high school. Is there evidence that the score distribution depends on the subject?
(Chart of observed counts not reproduced.)
P: Determine if AP scores are independent of the subject areas.
H:
Ho: AP score and AP subject are independent.
HA: AP score and AP subject are not independent.
A: SRS (stated in the problem)
Expected Counts ≥ 5
Column totals (three subjects): 52, 68, 30
Row totals (scores 5, 4, 3, 2, 1): 34, 42, 44, 18, 12
Table total: 150
Expected count = (row total × column total) / table total
Score 5: (34 × 52) / 150 = 11.787   (34 × 68) / 150 = 15.413   (34 × 30) / 150 = 6.8
Score 4: (42 × 52) / 150 = 14.56    (42 × 68) / 150 = 19.04    (42 × 30) / 150 = 8.4
Score 3: (44 × 52) / 150 = 15.253   (44 × 68) / 150 = 19.947   (44 × 30) / 150 = 8.8
Score 2: (18 × 52) / 150 = 6.24     (18 × 68) / 150 = 8.16     (18 × 30) / 150 = 3.6
Score 1: (12 × 52) / 150 = 4.16     (12 × 68) / 150 = 5.44     (12 × 30) / 150 = 2.4
Not all expected counts are ≥ 5, proceed with caution!
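Since only the row and column totals are needed for expected counts, the check of the "≥ 5" condition can be sketched in Python directly from the margins. The totals below are the ones used in the calculations for this example; the pairing of row totals with scores 5 through 1 follows the expected-count table, and the variable names are assumptions for illustration.

```python
row_totals = {5: 34, 4: 42, 3: 44, 2: 18, 1: 12}    # total for each AP score
col_totals = [52, 68, 30]                            # totals for the three subjects
table_total = sum(col_totals)                        # 150

# Expected count for a cell = row total x column total / table total.
expected = {
    score: [r * c / table_total for c in col_totals]
    for score, r in row_totals.items()
}

# Cells that break the "all expected counts >= 5" condition.
small_cells = [
    (score, round(e, 2))
    for score, row in expected.items()
    for e in row
    if e < 5
]

for score, row in expected.items():
    print(score, [round(e, 3) for e in row])
print("Expected counts below 5:", small_cells)
```

The flagged cells all sit in the score 2 and score 1 rows, which is why the slide says to proceed with caution.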
N: Chi-Square test for Independence
T:
Expected (rows: score; columns: the three subjects):
5  11.787  15.413  6.8
4  14.56   19.04   8.4
3  15.253  19.947  8.8
2  6.24    8.16    3.6
1  4.16    5.44    2.4
$\chi^2 = \sum \frac{(O - E)^2}{E} = 5.227698$
O:
Degrees of Freedom: (r – 1)(c – 1) = (5 – 1)(3 – 1) = (4)(2) = 8
P(χ² > 5.227698): more than 0.25 by table, or 0.732985 by calc
M:
P = 0.732985 > α = 0.05
Fail to reject the null (Ho)
S:
There is not enough evidence to claim that AP score and AP subject are dependent