1 HYPOTHESIS TESTING: ABOUT MORE THAN TWO (K) INDEPENDENT POPULATIONS.
ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
Analysis of variance is used for two different purposes:
1. To estimate and test hypotheses about population variances
2. To estimate and test hypotheses about population means
We are concerned here with the latter use.
$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: Not all the $\mu_i$ are equal.
Assumptions:
• We have k independent samples, one from each of k populations.
• Each population has a normal distribution with unknown mean $\mu_i$.
• All of the populations have the same (unknown) standard deviation.
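Data layout:

| | Treatment 1 | Treatment 2 | Treatment 3 | ... | Treatment k | Overall |
|---|---|---|---|---|---|---|
| | $x_{11}$ | $x_{12}$ | $x_{13}$ | ... | $x_{1k}$ | |
| | $x_{21}$ | $x_{22}$ | $x_{23}$ | ... | $x_{2k}$ | |
| | $x_{31}$ | $x_{32}$ | $x_{33}$ | ... | $x_{3k}$ | |
| | ... | ... | ... | ... | ... | |
| | $x_{n_1 1}$ | $x_{n_2 2}$ | $x_{n_3 3}$ | ... | $x_{n_k k}$ | |
| Total | $T_{.1}$ | $T_{.2}$ | $T_{.3}$ | ... | $T_{.k}$ | $T_{..}$ |
| Mean | $\bar{x}_{.1}$ | $\bar{x}_{.2}$ | $\bar{x}_{.3}$ | ... | $\bar{x}_{.k}$ | $\bar{x}_{..}$ |

Total of the jth column:
$$T_{.j} = \sum_{i=1}^{n_j} x_{ij}$$
Mean of the jth column:
$$\bar{x}_{.j} = \frac{T_{.j}}{n_j}$$
Total of all the observations:
$$T_{..} = \sum_{j=1}^{k} T_{.j} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}$$
Mean of all the observations:
$$\bar{x}_{..} = \frac{T_{..}}{N}, \qquad N = \sum_{j=1}^{k} n_j$$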
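The Total Sum of Squares
$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_{..})^2 = \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}^2 - \frac{T_{..}^2}{N}$$
The Within Groups Sum of Squares
$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_{.j})^2 = \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}^2 - \sum_{j=1}^{k} \frac{T_{.j}^2}{n_j}$$
The Among Groups Sum of Squares
$$SSA = \sum_{j=1}^{k} n_j (\bar{x}_{.j} - \bar{x}_{..})^2 = \sum_{j=1}^{k} \frac{T_{.j}^2}{n_j} - \frac{T_{..}^2}{N}$$
SST = SSA + SSW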
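The decomposition can be checked numerically. The following is a minimal sketch in plain Python, assuming a small made-up data set (the values in `groups` are illustrative and do not come from the slides). It computes each sum of squares from its definition and again from the totals-based shortcut formulas, so the two forms and the identity SST = SSA + SSW can be compared.

```python
def sums_of_squares(groups):
    """Return (SST, SSW, SSA) from the definitional (deviation) formulas."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    ssa = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return sst, ssw, ssa


def sums_of_squares_shortcut(groups):
    """Return (SST, SSW, SSA) from the column totals T.j and grand total T.."""
    n_total = sum(len(g) for g in groups)
    sum_sq = sum(x * x for g in groups for x in g)            # sum of x_ij^2
    correction = sum(sum(g) for g in groups) ** 2 / n_total   # T..^2 / N
    between = sum(sum(g) ** 2 / len(g) for g in groups)       # sum of T.j^2 / n_j
    return sum_sq - correction, sum_sq - between, between - correction


# Hypothetical data: three groups of unequal size.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 2.0]]
sst, ssw, ssa = sums_of_squares(groups)
print(f"SST={sst:.4f}  SSA+SSW={ssa + ssw:.4f}")
```

Both functions should agree up to floating-point rounding; the shortcut form is the one used in hand calculations because it needs only the totals.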
Among groups mean square: $MSA = SSA/(k-1)$
Within groups mean square: $MSW = SSW/(N-k)$
Variance Ratio (F): $VR = MSA/MSW$
ANOVA TABLE

| Source | SS | df | MS | F (VR) |
|---|---|---|---|---|
| Among samples | SSA | k-1 | MSA | MSA/MSW |
| Within samples | SSW | N-k | MSW | |
| Total | SST | N-1 | | |
Testing for Significant Differences Between Individual Pairs of Means
Whenever the analysis of variance leads to rejection of the null hypothesis of no difference among population means, the question naturally arises as to which pairs of means differ. Over the years, several procedures for making individual comparisons have been suggested:
• LSD (Least Significant Difference)
• Tukey
• Bonferroni
• Sidak
• Dunnett's C
• Dunnett's T3
The oldest procedure, and perhaps the one most widely used in the past, is the Least Significant Difference (LSD) procedure.
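Least Significant Difference (LSD)
Two means are declared significantly different (p<0.05) when the absolute difference between the sample means is at least the LSD. Here t is the tabulated two-sided 5% value of Student's t with N-k degrees of freedom.
When sample sizes are equal ($n_1 = n_2 = n_3 = \dots = n_k = n$):
$$|\bar{x}_{.i} - \bar{x}_{.j}| \ge t \sqrt{\frac{2\,MSW}{n}}$$
When sample sizes are not equal ($n_1 \ne n_2 \ne n_3 \ne \dots \ne n_k$):
$$|\bar{x}_{.i} - \bar{x}_{.j}| \ge t \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$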
Example: In a study of the effect of glucose on insulin release, specimens of pancreatic tissue from experimental animals were randomly assigned to be treated with one of five different stimulants. Later, a determination was made of the amount of insulin released. The experimenters wished to know whether they could conclude that there is a difference among the five treatments with respect to the mean amount of insulin released. The resulting measurements of the amount of insulin released following treatment are displayed in the table.
The five sets of observed data constitute five independent samples from the respective populations.
Each of the populations from which the samples come is normally distributed with mean $\mu_i$ and variance $\sigma_i^2$.
Each population has the same variance.
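| | Stimulant 1 | Stimulant 2 | Stimulant 3 | Stimulant 4 | Stimulant 5 | Overall |
|---|---|---|---|---|---|---|
| | 1.53 | 3.15 | 3.89 | 8.18 | 5.86 | |
| | 1.61 | 3.96 | 3.68 | 5.64 | 5.46 | |
| | 3.75 | 3.59 | 5.70 | 7.36 | 5.69 | |
| | 2.89 | 1.89 | 5.62 | 5.33 | 6.49 | |
| | 3.26 | 1.45 | 5.79 | 8.82 | 7.81 | |
| | | 1.56 | 5.33 | 5.26 | 9.03 | |
| | | | | 7.10 | 7.49 | |
| | | | | | 8.98 | |
| Total | 13.04 | 15.60 | 30.01 | 47.69 | 56.81 | 163.15 |
| Mean | 2.61 | 2.60 | 5.00 | 6.81 | 7.10 | 5.10 |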
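$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
$H_a$: Not all the $\mu_i$ are equal.
$$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}^2 - \frac{T_{..}^2}{N} = (1.53^2 + 1.61^2 + \dots + 8.98^2) - \frac{163.15^2}{32} = 994.3529 - \frac{26617.923}{32} = 162.54282$$
$$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}^2 - \sum_{j=1}^{k} \frac{T_{.j}^2}{n_j} = (1.53^2 + 1.61^2 + \dots + 8.98^2) - \left( \frac{13.04^2}{5} + \frac{15.60^2}{6} + \frac{30.01^2}{6} + \frac{47.69^2}{7} + \frac{56.81^2}{8} \right) = 41.35739$$
SSA = SST - SSW = 162.54282 - 41.35739 = 121.18543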
ANOVA TABLE

| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 121.185 | 4 | 30.296 | 19.779 | 0.000 |
| Within Groups | 41.357 | 27 | 1.532 | | |
| Total | 162.543 | 31 | | | |

MSA = SSA/(5-1) = 121.185/4 = 30.296
MSW = SSW/27 = 41.357/27 = 1.532
F = MSA/MSW = 30.296/1.532 = 19.779
We conclude that not all population means are equal.
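The worked example can be retraced in a few lines. This is a sketch in plain Python that uses the insulin measurements from the table above and the shortcut formulas for SST and SSW; the variable names are illustrative, and no p-value is computed because that would need an F-distribution routine outside the standard library.

```python
# Insulin-release measurements from the example table, one list per stimulant.
data = {
    1: [1.53, 1.61, 3.75, 2.89, 3.26],
    2: [3.15, 3.96, 3.59, 1.89, 1.45, 1.56],
    3: [3.89, 3.68, 5.70, 5.62, 5.79, 5.33],
    4: [8.18, 5.64, 7.36, 5.33, 8.82, 5.26, 7.10],
    5: [5.86, 5.46, 5.69, 6.49, 7.81, 9.03, 7.49, 8.98],
}

k = len(data)                                         # number of groups
N = sum(len(x) for x in data.values())                # total number of observations
sum_sq = sum(v * v for x in data.values() for v in x)
grand_total = sum(sum(x) for x in data.values())      # T..

sst = sum_sq - grand_total ** 2 / N
ssw = sum_sq - sum(sum(x) ** 2 / len(x) for x in data.values())
ssa = sst - ssw

msa = ssa / (k - 1)
msw = ssw / (N - k)
f_ratio = msa / msw

print(f"{'Source':<9}{'SS':>10}{'df':>5}{'MS':>10}{'F':>9}")
print(f"{'Among':<9}{ssa:>10.3f}{k - 1:>5}{msa:>10.3f}{f_ratio:>9.3f}")
print(f"{'Within':<9}{ssw:>10.3f}{N - k:>5}{msw:>10.3f}")
print(f"{'Total':<9}{sst:>10.3f}{N - 1:>5}")
```

The printed rows should agree with the ANOVA table above to three decimals.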
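Since the sample sizes are not all equal, reject $H_0$ if
$$|\bar{x}_{.i} - \bar{x}_{.j}| \ge t \sqrt{MSW \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$

| Hypothesis | Difference | LSD | Statistical Decision |
|---|---|---|---|
| $H_0: \mu_1 = \mu_2$ | $\lvert 2.61 - 2.60 \rvert = 0.01$ | $2.05 \sqrt{1.532 (1/5 + 1/6)} = 1.538$ | 0.01 < 1.538, accept $H_0$. |
| $H_0: \mu_1 = \mu_3$ | $\lvert 2.61 - 5.00 \rvert = 2.39$ | $2.05 \sqrt{1.532 (1/5 + 1/6)} = 1.538$ | 2.39 > 1.538, reject $H_0$. |
| $H_0: \mu_4 = \mu_5$ | $\lvert 6.81 - 7.10 \rvert = 0.29$ | $2.05 \sqrt{1.532 (1/7 + 1/8)} = 1.314$ | 0.29 < 1.314, accept $H_0$. |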
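The three LSD thresholds can be recomputed with a short helper. This is a sketch under one stated assumption: the critical value `T_CRIT = 2.052` is the usual table entry for Student's t with 27 degrees of freedom at the two-sided 5% level, which the slide rounds to 2.05. The means, sample sizes and MSW are the rounded figures from the example, and the function name `lsd` is illustrative.

```python
from math import sqrt


def lsd(t_crit, msw, n_i, n_j):
    """Least significant difference for two groups of sizes n_i and n_j."""
    return t_crit * sqrt(msw * (1 / n_i + 1 / n_j))


T_CRIT = 2.052   # assumed table value of t(0.975, 27 df); shown as 2.05 on the slide
MSW = 1.532      # within-groups mean square from the ANOVA table

means = {1: 2.61, 2: 2.60, 3: 5.00, 4: 6.81, 5: 7.10}
sizes = {1: 5, 2: 6, 3: 6, 4: 7, 5: 8}

for i, j in [(1, 2), (1, 3), (4, 5)]:
    diff = abs(means[i] - means[j])
    threshold = lsd(T_CRIT, MSW, sizes[i], sizes[j])
    decision = "reject H0" if diff >= threshold else "accept H0"
    print(f"mu{i} = mu{j}: |diff| = {diff:.2f}, LSD = {threshold:.3f}, {decision}")
```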
Multiple Comparisons (LSD)

| (I) Stimulant | (J) Stimulant | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| 1 | 2 | 0.0080 | 0.7494 | 0.992 | -1.5297 | 1.5457 |
| 1 | 3 | -2.3937* | 0.7494 | 0.004 | -3.9314 | -0.8560 |
| 1 | 4 | -4.2049* | 0.7247 | 0.000 | -5.6918 | -2.7179 |
| 1 | 5 | -4.4933* | 0.7056 | 0.000 | -5.9409 | -3.0456 |
| 2 | 1 | -0.0080 | 0.7494 | 0.992 | -1.5457 | 1.5297 |
| 2 | 3 | -2.4017* | 0.7146 | 0.002 | -3.8678 | -0.9355 |
| 2 | 4 | -4.2129* | 0.6886 | 0.000 | -5.6257 | -2.8000 |
| 2 | 5 | -4.5013* | 0.6684 | 0.000 | -5.8727 | -3.1298 |
| 3 | 1 | 2.3937* | 0.7494 | 0.004 | 0.8560 | 3.9314 |
| 3 | 2 | 2.4017* | 0.7146 | 0.002 | 0.9355 | 3.8678 |
| 3 | 4 | -1.8112* | 0.6886 | 0.014 | -3.2240 | -0.3984 |
| 3 | 5 | -2.0996* | 0.6684 | 0.004 | -3.4710 | -0.7281 |
| 4 | 1 | 4.2049* | 0.7247 | 0.000 | 2.7179 | 5.6918 |
| 4 | 2 | 4.2129* | 0.6886 | 0.000 | 2.8000 | 5.6257 |
| 4 | 3 | 1.8112* | 0.6886 | 0.014 | 0.3984 | 3.2240 |
| 4 | 5 | -0.2884 | 0.6405 | 0.656 | -1.6027 | 1.0259 |
| 5 | 1 | 4.4933* | 0.7056 | 0.000 | 3.0456 | 5.9409 |
| 5 | 2 | 4.5013* | 0.6684 | 0.000 | 3.1298 | 5.8727 |
| 5 | 3 | 2.0996* | 0.6684 | 0.004 | 0.7281 | 3.4710 |
| 5 | 4 | 0.2884 | 0.6405 | 0.656 | -1.0259 | 1.6027 |

\* The mean difference is significant at the .05 level.
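The mean differences, standard errors and confidence limits in the multiple-comparisons output can be rebuilt from the raw data. The sketch below assumes the standard error of a difference is $\sqrt{MSW(1/n_i + 1/n_j)}$ and that the interval is the difference plus or minus t times that standard error; `T_CRIT = 2.0518` is an assumed table value for 27 degrees of freedom, and the helper names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Insulin-release measurements from the example, one list per stimulant.
data = {
    1: [1.53, 1.61, 3.75, 2.89, 3.26],
    2: [3.15, 3.96, 3.59, 1.89, 1.45, 1.56],
    3: [3.89, 3.68, 5.70, 5.62, 5.79, 5.33],
    4: [8.18, 5.64, 7.36, 5.33, 8.82, 5.26, 7.10],
    5: [5.86, 5.46, 5.69, 6.49, 7.81, 9.03, 7.49, 8.98],
}
T_CRIT = 2.0518  # assumed table value of t(0.975, 27 df)


def mean(values):
    return sum(values) / len(values)


N = sum(len(v) for v in data.values())
k = len(data)
# Within-groups mean square: pooled squared deviations from each group mean.
msw = sum((x - mean(v)) ** 2 for v in data.values() for x in v) / (N - k)


def compare(i, j):
    """Return (mean difference, std. error, lower bound, upper bound) for groups i and j."""
    diff = mean(data[i]) - mean(data[j])
    se = sqrt(msw * (1 / len(data[i]) + 1 / len(data[j])))
    half_width = T_CRIT * se
    return diff, se, diff - half_width, diff + half_width


for i, j in combinations(sorted(data), 2):
    diff, se, low, high = compare(i, j)
    flag = "*" if low > 0 or high < 0 else " "   # interval excludes zero
    print(f"{i}-{j}  {diff:8.4f}{flag} {se:7.4f}  [{low:8.4f}, {high:8.4f}]")
```

Only the (I, J) pairs with I < J are printed; the remaining rows of the output are the same comparisons with the sign reversed.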
KRUSKAL-WALLIS ONE-WAY ANOVA
When the assumptions underlying One-way ANOVA are not met, that is, when the populations from which the samples are drawn are not normally distributed with equal variances, or when the data for analysis consist only of ranks, a nonparametric alternative to the one-way analysis of variance may be used to test the hypothesis of equal location parameters.
The application of the test involves the following steps:
1. The n1, n2, ..., nk observations from the k groups are combined into a single series of size n and arranged in order of magnitude from smallest to largest. The observations are then replaced by ranks.
2. The ranks assigned to observations in each of the k groups are added separately to give k rank sums.
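3. The test statistic
$$KW = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$
is computed, where k is the number of groups, $n_j$ is the number of observations in the jth group, and $R_j$ is the sum of ranks in the jth group.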
4. When there are three groups and five or fewer observations in each group, the significance of the computed KW is determined by using special tables. When there are more than five observations in one or more of the groups, KW is compared with the tabulated values of $\chi^2$ with k-1 df.
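Steps 1 and 2 can be sketched in plain Python as follows. The steps do not say how tied observations are ranked; this sketch assumes the usual convention that tied values share the mean of the ranks they occupy, which is what the worked example later in the slides does (two values of 8 both receive rank 6.5). The function names and the toy data are illustrative.

```python
def midranks(values):
    """Rank values from smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1   # mean of the 1-based ranks pos+1 .. end+1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def rank_sums(groups):
    """Pool the groups, rank the pooled series, then add the ranks up per group."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    sums, start = [], 0
    for g in groups:
        sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)
    return sums


# Hypothetical toy data with one tie: the two 3s share rank 3.5.
print(rank_sums([[3, 1], [2, 3]]))   # [4.5, 5.5]
```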
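Determining which groups are significantly different
Like the one-way ANOVA, the Kruskal-Wallis test is an overall test: a significant result does not indicate where the differences among the groups lie. To determine which groups are significantly different from one another, it is necessary to undertake multiple comparisons. Groups i and j are declared different (p<0.05) when
$$|\bar{R}_i - \bar{R}_j| \ge t \sqrt{\frac{n(n+1)}{12} \cdot \frac{n-1-KW}{n-k} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$$
where $\bar{R}_j = R_j/n_j$ is the mean rank of the jth group and t is the tabulated value of Student's t with n-k degrees of freedom.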
Example: The effects of two drugs on reaction time to a certain stimulus were studied in three groups of experimental animals. Group III served as a control, while the animals in group I were treated with drug A and those in group II were treated with drug B prior to the application of the stimulus. The table shows the reaction times in seconds of the 13 animals. Can we conclude that the three populations represented by the three samples differ with respect to reaction time?
H0: The population distributions are all identical.
Ha: At least one of the populations tends to exhibit larger values than at least one of the other populations.
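| Group I | Rank | Group II | Rank | Group III | Rank |
|---|---|---|---|---|---|
| 17 | 9 | 8 | 6.5 | 2 | 1 |
| 20 | 10 | 7 | 5 | 5 | 4 |
| 40 | 13 | 9 | 8 | 4 | 3 |
| 31 | 11 | 8 | 6.5 | 3 | 2 |
| 35 | 12 | | | | |
| $R_j$ | 55 | | 26 | | 10 |

$$KW = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1) = \frac{12}{13(13+1)} \left( \frac{55^2}{5} + \frac{26^2}{4} + \frac{10^2}{4} \right) - 3(13+1) = 10.68$$
KW(5,4,4; 0.05) = 5.617 < KW(calculated) = 10.68
p<0.05, reject $H_0$.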
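The arithmetic of the example can be checked with a short function. This is a minimal sketch that takes the rank sums and group sizes from the table above; like the slide, it applies no correction for ties, and the function name `kruskal_wallis` is illustrative.

```python
def kruskal_wallis(rank_sums, sizes):
    """KW = 12 / (n (n + 1)) * sum(R_j^2 / n_j) - 3 (n + 1), with no tie correction."""
    n = sum(sizes)
    weighted = sum(r * r / m for r, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


kw = kruskal_wallis([55, 26, 10], [5, 4, 4])
CRITICAL = 5.617   # tabulated KW(5, 4, 4; 0.05) quoted in the example
print(round(kw, 2), "reject H0" if kw > CRITICAL else "accept H0")   # 10.68 reject H0
```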
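Multiple Comparisons Table

| Groups | $\lvert \bar{R}_i - \bar{R}_j \rvert$ | $t \sqrt{\frac{n(n+1)}{12} \cdot \frac{n-1-KW}{n-k} \left( \frac{1}{n_i} + \frac{1}{n_j} \right)}$ | Statistical Decision |
|---|---|---|---|
| 1-2 | 4.5 | 2.115 | p<0.05 |
| 1-3 | 8.5 | 2.115 | p<0.05 |
| 2-3 | 4 | 2.229 | p<0.05 |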
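The entries of the multiple-comparisons table can be retraced as follows. This sketch assumes `T_CRIT = 2.228`, the usual table value of Student's t with n - k = 10 degrees of freedom at the two-sided 5% level; the slides do not state the value, but it reproduces their critical differences. Rank sums, group sizes and KW are taken from the example.

```python
from math import sqrt


def critical_difference(t_crit, n, k, kw, n_i, n_j):
    """Smallest difference in mean ranks that is declared significant."""
    spread = n * (n + 1) / 12 * (n - 1 - kw) / (n - k)
    return t_crit * sqrt(spread * (1 / n_i + 1 / n_j))


T_CRIT = 2.228                     # assumed table value of t(0.975, 10 df)
rank_sums = {1: 55, 2: 26, 3: 10}  # R_j from the example
sizes = {1: 5, 2: 4, 3: 4}         # n_j from the example
n, k, kw = 13, 3, 10.68

for i, j in [(1, 2), (1, 3), (2, 3)]:
    # Compare mean ranks R_j / n_j, not the rank sums themselves.
    diff = abs(rank_sums[i] / sizes[i] - rank_sums[j] / sizes[j])
    limit = critical_difference(T_CRIT, n, k, kw, sizes[i], sizes[j])
    verdict = "p<0.05" if diff >= limit else "p>0.05"
    print(f"{i}-{j}: |diff| = {diff:.1f}, critical = {limit:.3f}, {verdict}")
```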
r x c Chi-Square Test
We can use the chi-square test to compare frequencies or proportions in two or more groups. The classification of a set of entities according to two criteria can be shown in a table in which the r rows represent the levels of one criterion of classification and the c columns represent the levels of the second criterion. Such a table is generally called a contingency table.
We will be interested in testing the null hypothesis that, in the population, the two criteria of classification are independent, against the alternative that they are associated.
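| First Criterion | Second Criterion: 1 | 2 | ... | c | Total |
|---|---|---|---|---|---|
| 1 | $O_{11}$ | $O_{12}$ | ... | $O_{1c}$ | $O_{1.}$ |
| 2 | $O_{21}$ | $O_{22}$ | ... | $O_{2c}$ | $O_{2.}$ |
| ... | ... | ... | ... | ... | ... |
| r | $O_{r1}$ | $O_{r2}$ | ... | $O_{rc}$ | $O_{r.}$ |
| Total | $O_{.1}$ | $O_{.2}$ | ... | $O_{.c}$ | N |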
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad E_{ij} = \frac{O_{i.}\, O_{.j}}{N}$$
No more than 20% of the cells should have expected frequencies of less than 5.
df = (r-1)(c-1)
Example: A research team studying the relationship between blood type and severity of a certain condition in a population collected data on 1500 subjects, as displayed in the contingency table below. The researchers wished to know whether these data were compatible with the hypothesis that severity of condition and blood type are independent.

| Severity of Condition | Blood Type A | B | AB | O | Total |
|---|---|---|---|---|---|
| Absent | 543 | 211 | 90 | 476 | 1320 |
| Mild | 44 | 22 | 8 | 31 | 105 |
| Severe | 28 | 9 | 7 | 31 | 75 |
| Total | 615 | 242 | 105 | 538 | 1500 |
| Severity of condition | | Blood Type A | B | AB | O | Total |
|---|---|---|---|---|---|---|
| Absent | Count | 543 | 211 | 90 | 476 | 1320 |
| | Expected Count | 541.2 | 213.0 | 92.4 | 473.4 | 1320.0 |
| | % within severity | 41.1% | 16.0% | 6.8% | 36.1% | 100.0% |
| Mild | Count | 44 | 22 | 8 | 31 | 105 |
| | Expected Count | 43.1 | 16.9 | 7.4 | 37.7 | 105.0 |
| | % within severity | 41.9% | 21.0% | 7.6% | 29.5% | 100.0% |
| Severe | Count | 28 | 9 | 7 | 31 | 75 |
| | Expected Count | 30.8 | 12.1 | 5.3 | 26.9 | 75.0 |
| | % within severity | 37.3% | 12.0% | 9.3% | 41.3% | 100.0% |
| Total | Count | 615 | 242 | 105 | 538 | 1500 |
| | Expected Count | 615.0 | 242.0 | 105.0 | 538.0 | 1500.0 |
| | % within severity | 41.0% | 16.1% | 7.0% | 35.9% | 100.0% |

0 cells (0.0%) have an expected count less than 5. The minimum expected count is 5.25.
$H_0$: severity of condition and blood type are independent.
$H_a$: severity of condition and blood type are not independent.
$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(543 - 541.2)^2}{541.2} + \frac{(211 - 212.96)^2}{212.96} + \dots + \frac{(31 - 26.9)^2}{26.9} = 5.12$$
$\chi^2_{(6,\,0.05)} = 12.592 > \chi^2$ (calculated), accept $H_0$, p>0.05
We conclude that these data are compatible with the hypothesis that severity of the condition and blood type are independent.
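The chi-square statistic for this example can be recomputed directly from the observed counts. The sketch below is plain Python with an illustrative function name; expected counts come from the row and column totals, and the critical value 12.592 is the one quoted in the example rather than something the code derives.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E_ij = O_i. * O_.j / N
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: Absent, Mild, Severe; columns: blood types A, B, AB, O.
observed = [[543, 211, 90, 476], [44, 22, 8, 31], [28, 9, 7, 31]]
stat, df = chi_square(observed)
CRITICAL = 12.592   # chi-square(6, 0.05) as quoted in the example
print(round(stat, 2), df, "reject H0" if stat > CRITICAL else "accept H0")   # 5.12 6 accept H0
```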
When the sample size is small and the assumption about expected frequencies is not met:

| Severity of condition | | Blood Type A | B | AB | O | Total |
|---|---|---|---|---|---|---|
| Absent | Count | 50 | 20 | 9 | 45 | 124 |
| | Expected Count | 50.0 | 22.5 | 9.4 | 42.1 | 124.0 |
| Mild | Count | 15 | 8 | 3 | 10 | 36 |
| | Expected Count | 14.5 | 6.5 | 2.7 | 12.2 | 36.0 |
| Severe | Count | 4 | 3 | 1 | 3 | 11 |
| | Expected Count | 4.4 | 2.0 | 0.8 | 3.7 | 11.0 |
| Total | Count | 69 | 31 | 13 | 58 | 171 |
| | Expected Count | 69.0 | 31.0 | 13.0 | 58.0 | 171.0 |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 1.998 (a) | 6 | 0.920 |
| N of Valid Cases | 171 | | |

a. 5 cells (41.7%) have an expected count less than 5. The minimum expected count is 0.84.

The assumption is violated, so we decide to merge two conditions.
After combining mild and severe groups in one group, no more than 20% of the cells have expected frequencies less than 5.

| Severity of condition | | Blood Type A | B | AB | O | Total |
|---|---|---|---|---|---|---|
| Absent | Count | 50 | 20 | 9 | 45 | 124 |
| | Expected Count | 50.0 | 22.5 | 9.4 | 42.1 | 124.0 |
| Present | Count | 19 | 11 | 4 | 13 | 47 |
| | Expected Count | 19.0 | 8.5 | 3.6 | 15.9 | 47.0 |
| Total | Count | 69 | 31 | 13 | 58 | 171 |
| | Expected Count | 69.0 | 31.0 | 13.0 | 58.0 | 171.0 |

Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 1.814 (a) | 3 | 0.612 |

a. 1 cell (12.5%) has an expected count less than 5. The minimum expected count is 3.57.
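The effect of merging categories on the expected-frequency rule can be checked with a small script. This is a sketch in plain Python using the counts of the 171-subject example; the helper names are illustrative, and the 20% rule is applied exactly as stated earlier (share of cells with an expected count below 5).

```python
def expected_counts(table):
    """Expected count of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


def small_cell_share(table, limit=5):
    """Fraction of cells whose expected count is below `limit`."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < limit for e in cells) / len(cells)


# Rows: Absent, Mild, Severe; columns: blood types A, B, AB, O (171 subjects).
original = [[50, 20, 9, 45], [15, 8, 3, 10], [4, 3, 1, 3]]
# Pool the Mild and Severe rows into a single "Present" row.
merged = [original[0], [m + s for m, s in zip(original[1], original[2])]]

for name, table in [("original", original), ("merged", merged)]:
    print(name, f"{small_cell_share(table):.1%}", round(chi_square(table), 3))
```

The share of small cells should drop from 5 of 12 cells to 1 of 8, which is below the 20% limit, and the two statistics should match the chi-square values reported in the output tables.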
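If the null hypothesis is rejected, how can we find the group that differs?

| Blood Type | | Thromboembolism + | Thromboembolism - | Total |
|---|---|---|---|---|
| A | Count | 32 | 51 | 83 |
| | Expected Count | 22.8 | 60.2 | 83.0 |
| | % within Thromboembolism | 58.2% | 35.2% | 41.5% |
| AB | Count | 4 | 8 | 12 |
| | Expected Count | 3.3 | 8.7 | 12.0 |
| | % within Thromboembolism | 7.3% | 5.5% | 6.0% |
| B | Count | 8 | 19 | 27 |
| | Expected Count | 7.4 | 19.6 | 27.0 |
| | % within Thromboembolism | 14.5% | 13.1% | 13.5% |
| O | Count | 11 | 67 | 78 |
| | Expected Count | 21.5 | 56.6 | 78.0 |
| | % within Thromboembolism | 20.0% | 46.2% | 39.0% |
| Total | Count | 55 | 145 | 200 |
| | Expected Count | 55.0 | 145.0 | 200.0 |
| | % within Thromboembolism | 100.0% | 100.0% | 100.0% |

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Chi-Square | 12.375 (a) | 3 | 0.006 |
| N of Valid Cases | 200 | | |

a. 1 cell (12.5%) has an expected count less than 5. The minimum expected count is 3.30.

Contribution of each blood type to the chi-square statistic (computed from the rounded expected counts shown in the table, so the four values sum to 12.43 rather than 12.375):
A: $\chi^2$ = 5.118261
AB: $\chi^2$ = 0.204807
B: $\chi^2$ = 0.067016
O: $\chi^2$ = 7.038861

Reject $H_0$. Which blood type(s) differ from the others? Type O makes the largest contribution, so we exclude Type O from the analysis and repeat the test.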
| Blood Type (Count) | Thromboembolism + | Thromboembolism - | Total |
|---|---|---|---|
| A | 32 | 51 | 83 |
| AB | 4 | 8 | 12 |
| B | 8 | 19 | 27 |
| Total | 44 | 78 | 122 |

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 0.747 (a) | 2 | 0.688 |

a. 1 cell (16.7%) has an expected count less than 5. The minimum expected count is 4.33.

p>0.05
Except for blood type O, the distribution of thromboembolism is similar across the blood types.
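The follow-up procedure (find the row that contributes most to chi-square, drop it, and test again) can be sketched in plain Python. The helper names are illustrative. The code uses unrounded expected counts, so its per-type contributions differ slightly from the figures shown on the slide, which appear to have been computed from expected counts rounded to one decimal.

```python
def expected_counts(table):
    """Expected count of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def row_contributions(table):
    """Each row's share of the Pearson chi-square statistic."""
    expected = expected_counts(table)
    return [sum((o - e) ** 2 / e for o, e in zip(obs_row, exp_row))
            for obs_row, exp_row in zip(table, expected)]


blood_types = ["A", "AB", "B", "O"]
counts = [[32, 51], [4, 8], [8, 19], [11, 67]]   # columns: thromboembolism +, -

contrib = dict(zip(blood_types, row_contributions(counts)))
chi2_all = sum(contrib.values())
largest = max(contrib, key=contrib.get)

# Repeat the test without the row that contributes most.
reduced = [row for name, row in zip(blood_types, counts) if name != largest]
chi2_reduced = sum(row_contributions(reduced))

print({name: round(value, 3) for name, value in contrib.items()})
print(f"all types: {chi2_all:.3f}; without {largest}: {chi2_reduced:.3f}")
```

Dropping a row after looking at the data is an exploratory step; the sketch only mirrors what the slides do and does not adjust the significance level for the repeated test.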