Nonparametric: K sample. 張育慈, 2015/03.


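Page 2:

Kruskal-Wallis test

Let $R_{ij}$ denote the rank of observation $X_{ij}$ (the $j$th observation under treatment $i$) in the combined sample. The Kruskal-Wallis test statistic is

$$KW = \frac{12}{N(N+1)} \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2,$$

where $N$ is the total number of observations, and $n_i$ and $\bar{R}_i$ are the sample size and the average rank for treatment $i$. Under the null hypothesis, $KW$ approximately follows a chi-square distribution with $k-1$ degrees of freedom.

Treatment 1      …   Treatment i
Data   Ranks     …   Data   Ranks
219    .         …   223    .
255    .         …   186    2
269    .         …   164    1

In analysis-of-variance notation applied to the ranks:

$$SSTR = \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2$$

$$KW = \frac{SSTR}{S_R^2}, \qquad S_R^2 = \frac{SSTO}{N-1}$$

$$SSTO = \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(R_{ij} - \frac{N+1}{2}\right)^2 = \frac{N(N+1)(2N+1)}{6} - \frac{N(N+1)^2}{4} = \frac{N(N^2-1)}{12}$$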
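For hand calculation the statistic is often rewritten in terms of rank sums. The following is a short derivation sketch; the symbol $R_i = n_i \bar{R}_i$ (rank sum of treatment $i$) is introduced here for illustration, and the step uses $\sum_i R_i = N(N+1)/2$.

```latex
\begin{aligned}
KW &= \frac{12}{N(N+1)} \sum_{i=1}^{k} n_i\left(\bar{R}_i - \frac{N+1}{2}\right)^2 \\
   &= \frac{12}{N(N+1)} \left[\sum_{i=1}^{k} \frac{R_i^2}{n_i}
        - (N+1)\sum_{i=1}^{k} R_i + \frac{N(N+1)^2}{4}\right] \\
   &= \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1).
\end{aligned}
```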

Page 3:

For tied data, the Kruskal-Wallis statistic adjusted for ties is given by

$$KW_{\text{ties}} = \frac{KW}{1 - \sum_{i=1}^{g} (t_i^3 - t_i) / (N^3 - N)}.$$

Assume the tied data are arranged into $g$ groups of like observations, and let $t_i$ denote the number of observations in the $i$th group, $i = 1, 2, \ldots, g$. Tied observations each receive the average of the ranks they occupy.

Treatment 1      …   Treatment i
Data   Ranks     …   Data   Ranks
219    5         …   223    .
253    .         …   164    2
255    .         …   164    2
269    .         …   164    2
250    .         …   204    4
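The average-rank rule and the tie correction can be sketched in a few lines of Python. This is only an illustration: it assumes the ten values shown in the table are the whole sample (the slide elides the intermediate treatments), and the helper names `midranks` and `tie_correction` are made up for this sketch.

```python
from collections import Counter

def midranks(values):
    """Rank the pooled values, giving tied values the average of their ranks."""
    counts = Counter(values)
    rank_of, position = {}, 1
    for value in sorted(counts):
        # A group of t tied values occupies ranks position .. position + t - 1.
        rank_of[value] = position + (counts[value] - 1) / 2
        position += counts[value]
    return [rank_of[v] for v in values]

def tie_correction(values):
    """Denominator of the tie-adjusted statistic: 1 - sum(t^3 - t) / (N^3 - N)."""
    n = len(values)
    tied = sum(t**3 - t for t in Counter(values).values())
    return 1 - tied / (n**3 - n)

treatment_1 = [219, 253, 255, 269, 250]
treatment_i = [223, 164, 164, 164, 204]
pooled = treatment_1 + treatment_i  # assumption: no other treatments

ranks = midranks(pooled)
correction = tie_correction(pooled)
print(ranks)
print(correction)
```

Under that assumption the three values of 164 share rank 2, and 204 and 219 receive ranks 4 and 5, in agreement with the ranks printed in the table.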

Page 4:

• Multiple comparisons use pairwise tests to determine which treatments differ from the others while controlling the experiment-wise error rate. With $c$ independent comparisons, each made at level $\alpha$, the experiment-wise error rate is $1 - (1-\alpha)^c$, which exceeds $\alpha$ whenever $c > 1$.

1. Bonferroni adjustment: test each of the $c = k(k-1)/2$ pairwise comparisons at level $\alpha' = \alpha / c$.

2. Fisher's (protected) least significant difference (LSD)

3. Tukey's HSD procedure

for equal sample sizes

for unequal sample sizes
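As a small illustration of the Bonferroni adjustment, the sketch below multiplies each raw p-value by the number of comparisons and caps the result at 1. The raw p-values are copied from the PROC MULTTEST output on Page 14; the function and variable names are illustrative only.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of comparisons, cap at 1."""
    c = len(p_values)
    return [min(1.0, c * p) for p in p_values]

# Raw p-values for the six pairwise contrasts among four treatments (Page 14).
raw = {
    "1 vs 2": 0.2407,
    "1 vs 3": 0.2407,
    "1 vs 4": 0.0051,
    "2 vs 3": 1.0000,
    "2 vs 4": 0.0673,
    "3 vs 4": 0.0673,
}

adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
for contrast, p in adjusted.items():
    print(contrast, round(p, 4))
```

Because the raw values are already rounded to four decimals, the adjusted values can differ from the SAS column in the last digit (for example about 0.0306 here against 0.0307 in the output).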

Page 5:

Example

An agronomist gave scores from 0 to 5 to denote insect damage to wheat plants that were treated with four insecticides. The data are given in the following table. Use the Kruskal-Wallis test and one-way ANOVA to test whether there is a difference among the treatments.

T1   T2   T3   T4
0    2    1    3
2    0    3    4
1    3    4    2
3    1    2    5
1    3    2    3
1    4    1    4

One-way ANOVA

$H_0$: the mean scores for the four insecticides are equal. $H_1$: the mean scores for the four insecticides are not all equal.

P-value = 0.0384

At $\alpha = 0.05$ we reject $H_0$ and conclude that the mean scores for the four insecticides differ.

Multiple comparisons: the mean score for insecticide 1 differs from that for insecticide 4.

Kruskal-Wallis test

$H_0$: the median scores for the four insecticides are equal. $H_1$: the median scores for the four insecticides are not all equal.

P-value = 0.0458 (exact)

At $\alpha = 0.05$ we reject $H_0$ and conclude that the median scores for the four insecticides differ.

Multiple comparisons: the median score for insecticide 1 differs from that for insecticide 4.
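The Kruskal-Wallis statistic for this example can be reproduced by hand. The sketch below uses only the Python standard library; the scores are taken from the SAS data step on Page 16, the helper names are illustrative, and `chi2_sf_3df` is a closed-form chi-square tail probability that is valid only for 3 degrees of freedom (an assumption that fits $k = 4$ treatments).

```python
import math
from collections import Counter

def midrank_table(values):
    """Map each distinct value to its average rank in the pooled sample."""
    counts = Counter(values)
    rank_of, position = {}, 1
    for value in sorted(counts):
        rank_of[value] = position + (counts[value] - 1) / 2
        position += counts[value]
    return rank_of

def kruskal_wallis(groups):
    """Tie-adjusted Kruskal-Wallis statistic."""
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    rank_of = midrank_table(pooled)
    center = (n_total + 1) / 2
    sstr = sum(
        len(g) * (sum(rank_of[x] for x in g) / len(g) - center) ** 2
        for g in groups
    )
    kw = 12 / (n_total * (n_total + 1)) * sstr
    ties = sum(t**3 - t for t in Counter(pooled).values())
    return kw / (1 - ties / (n_total**3 - n_total))

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

scores = [
    [0, 2, 1, 3, 1, 1],  # insecticide 1
    [2, 0, 3, 1, 3, 4],  # insecticide 2
    [1, 3, 4, 2, 2, 1],  # insecticide 3
    [3, 4, 2, 5, 3, 4],  # insecticide 4
]

kw = kruskal_wallis(scores)
p_value = chi2_sf_3df(kw)
print(round(kw, 4), round(p_value, 4))
```

The statistic and its asymptotic p-value should agree with the SAS output on Page 14 (chi-square 7.6301, asymptotic p-value 0.0543). The exact p-value reported there requires a permutation calculation that this sketch does not attempt.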

Page 6:

Block: a group of homogeneous experimental units.

• Blocking "removes" the effect of nuisance factors.

• Treatment comparisons become more precise.

• The treatments are randomly assigned to experimental units within each block.

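Page 7:

Friedman's test is a nonparametric test based on ranking the observations within blocks. It applies the analysis for a randomized complete block design to the ranks.

$$FM = \frac{12b}{k(k+1)} \sum_{i=1}^{k} \left(\bar{R}_i - \frac{k+1}{2}\right)^2,$$

where $\bar{R}_i$ denotes the average rank for treatment $i$, and $k$ and $b$ are the numbers of treatments and blocks. Under the null hypothesis, $FM$ approximately follows a chi-square distribution with $k-1$ degrees of freedom.

                      Blocks
Treatments      1       2       …   b       Row totals
1               R_11    R_12    …   R_1b    R_1
2               R_21    R_22    …   R_2b    R_2
.               .       .       …   .       .
.               .       .       …   .       .
k               R_k1    R_k2    …   R_kb    R_k
Column totals   …

Here $R_{ij}$ is the rank of treatment $i$ within block $j$.

$$SSTR = b \sum_{i=1}^{k} \left(\bar{R}_i - \frac{k+1}{2}\right)^2, \qquad FM = \frac{SSTR}{k(k+1)/12}$$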
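For hand calculation, Friedman's statistic is commonly rewritten in terms of rank sums. The following is a derivation sketch; the symbol $R_i = b\bar{R}_i$ (sum of the within-block ranks for treatment $i$) is introduced here for illustration, and the last step uses $\sum_i R_i = bk(k+1)/2$.

```latex
\begin{aligned}
FM &= \frac{12b}{k(k+1)} \sum_{i=1}^{k} \left(\bar{R}_i - \frac{k+1}{2}\right)^2 \\
   &= \frac{12}{bk(k+1)} \left[\sum_{i=1}^{k} R_i^2
        - b(k+1)\sum_{i=1}^{k} R_i + \frac{b^2 k (k+1)^2}{4}\right] \\
   &= \frac{12}{bk(k+1)} \sum_{i=1}^{k} R_i^2 - 3b(k+1).
\end{aligned}
```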

Page 8:

For tied data

$$FM_{\text{ties}} = \frac{FM}{1 - \sum_{j=1}^{b} \sum_{i=1}^{g_j} (t_{ij}^3 - t_{ij}) / \left(b(k^3 - k)\right)}$$

Let $t_{ij}$ denote the number of tied observations in the $i$th group within the $j$th block. Let $g_j$ denote the number of groups of tied observations within the $j$th block.

Entries are shown as observed value (rank within block).

Treatments \ Blocks: 1, 2, …, b
1
2    60(2)
.    150(3.5)   80(2)   80(3)
k    90(4)

Page 9:

Example

Different types of farm machinery have different effects on the compaction of soil and thus may affect yields differently. The table shows yield data from a randomized complete block design in which four different types of tractors were used in tilling the soil. Ranks within each location are given in parentheses.

One-way ANOVA RBD

$H_0$: the mean yields for the four tractors are equal. $H_1$: the mean yields for the four tractors are not all equal.

P-value = 0.058

At $\alpha = 0.1$ we reject $H_0$.

Friedman's test

$H_0$: the median yields for the four tractors are equal. $H_1$: the median yields for the four tractors are not all equal.

P-value = 0.46

At $\alpha = 0.1$ we do not reject $H_0$.

Tractor   Location 1   Location 2   Location 3   Location 4   Location 5   Location 6
1         120(1)       208(4)       199(4)       194(4)       177(4)       195(4)
2         207(4)       188(3)       181(3)       164(2)       155(1)       175(2)
3         122(2)       137(2)       177(2)       177(3)       160(3)       138(1)
4         128(3)       128(1)       160(1)       142(1)       157(2)       179(3)
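Friedman's statistic can be checked directly from the within-location ranks printed in the table. The sketch below is a hand calculation under the usual definition with $k = 4$ treatments and $b = 6$ blocks; the function names are illustrative, and `chi2_sf_3df` is a closed form that holds only for 3 degrees of freedom.

```python
import math

# Within-location ranks from the table: one row per tractor, one column per location.
ranks = [
    [1, 4, 4, 4, 4, 4],  # tractor 1
    [4, 3, 3, 2, 1, 2],  # tractor 2
    [2, 2, 2, 3, 3, 1],  # tractor 3
    [3, 1, 1, 1, 2, 3],  # tractor 4
]

def friedman(rank_rows):
    """Friedman statistic from a treatments-by-blocks table of within-block ranks."""
    k = len(rank_rows)
    b = len(rank_rows[0])
    center = (k + 1) / 2
    return 12 * b / (k * (k + 1)) * sum((sum(row) / b - center) ** 2 for row in rank_rows)

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

fm = friedman(ranks)
p_value = chi2_sf_3df(fm)
print(round(fm, 4), round(p_value, 3))
```

With these ranks the rank sums are 21, 15, 13, and 11, and the statistic works out to 5.6 on 3 degrees of freedom, for a p-value of roughly 0.13. That is not the same number as the CMH value on Page 15, which is computed on 5 degrees of freedom and so appears to compare locations rather than tractors; the decision at $\alpha = 0.1$ (do not reject) is the same either way.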

Page 10:

Nonparametric vs parametric

Kruskal-Wallis test vs one-way ANOVA

• $KW$ approximately follows the chi-square distribution with $k-1$ degrees of freedom.

• $F = MSTR / MSE$, with $k-1$ and $N-k$ degrees of freedom.

Friedman's test vs one-way ANOVA RBD

• $FM$ approximately follows the chi-square distribution with $k-1$ degrees of freedom.

• $F = MSTR / MSE$, with $k-1$ and $(k-1)(b-1)$ degrees of freedom.

Points of comparison:

• Distribution

• Central measure

• Outliers

Page 11:

Example

An agronomist gave scores from 0 to 5 to denote insect damage to wheat plants that were treated with four insecticides. The data are given in the following table, in which the largest T4 score has been changed from 5 to 50 to act as an outlier. Use the Kruskal-Wallis test and one-way ANOVA to test whether there is a difference among the treatments.

T1   T2   T3   T4
0    2    1    3
2    0    3    4
1    3    4    2
3    1    2    50
1    3    2    3
1    4    1    4

One-way ANOVA

$H_0$: the mean scores for the four insecticides are equal. $H_1$: the mean scores for the four insecticides are not all equal.

P-value = 0.285

At $\alpha = 0.05$ we do not reject $H_0$. There is not enough evidence to conclude that the mean scores for the four insecticides differ.

Kruskal-Wallis test

$H_0$: the median scores for the four insecticides are equal. $H_1$: the median scores for the four insecticides are not all equal.

P-value = 0.0458 (exact)

At $\alpha = 0.05$ we reject $H_0$ and conclude that the median scores for the four insecticides differ.
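The reason the Kruskal-Wallis result does not move while the ANOVA result does can be seen by comparing ranks and means before and after the change. The sketch below is an illustration using the scores from the SAS data step on Page 16; the helper name `midranks` is made up for this example.

```python
from collections import Counter

def midranks(values):
    """Rank the pooled values, giving tied values the average of their ranks."""
    counts = Counter(values)
    rank_of, position = {}, 1
    for value in sorted(counts):
        rank_of[value] = position + (counts[value] - 1) / 2
        position += counts[value]
    return [rank_of[v] for v in values]

t1 = [0, 2, 1, 3, 1, 1]
t2 = [2, 0, 3, 1, 3, 4]
t3 = [1, 3, 4, 2, 2, 1]
t4 = [3, 4, 2, 5, 3, 4]
t4_outlier = [50 if x == 5 else x for x in t4]  # the single 5 becomes 50

# The largest observation keeps rank 24, so every rank is unchanged.
ranks_unchanged = midranks(t1 + t2 + t3 + t4) == midranks(t1 + t2 + t3 + t4_outlier)

# The treatment mean, which drives the F test, changes a great deal.
mean_before = sum(t4) / len(t4)
mean_after = sum(t4_outlier) / len(t4_outlier)
print(ranks_unchanged, mean_before, mean_after)
```

Since the rank-based statistic depends on the data only through the ranks, it is identical for the two data sets, whereas the outlier inflates both the T4 mean and the error variance used by ANOVA.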

Page 12:

END

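Page 13:

The ranks $1, 2, \ldots, N$ sum to $N(N+1)/2$.

The expected rank for any observation is the average rank $(N+1)/2$.

For the $i$th sample, which contains $n_i$ observations, the expected sum of ranks is $n_i(N+1)/2$.

Let $R_i$ denote the actual sum of ranks assigned to the elements in the $i$th sample.

The sum of squares of these deviations is

$$S = \sum_{i=1}^{k} \left[R_i - \frac{n_i(N+1)}{2}\right]^2 = \sum_{i=1}^{k} n_i^2 \left[\bar{R}_i - \frac{N+1}{2}\right]^2 .$$

Average rank sum for the $i$th column: $\bar{R}_i = R_i / n_i$.

Hence

$$E(\bar{R}_i) = \frac{N+1}{2}, \qquad Var(\bar{R}_i) = \frac{N^2-1}{12\,n_i} \cdot \frac{N-n_i}{N-1} = \frac{(N+1)(N-n_i)}{12\,n_i}.$$

The CLT allows us to approximate the distribution of

$$Z_i = \frac{\bar{R}_i - (N+1)/2}{\sqrt{(N+1)(N-n_i)/(12\,n_i)}}$$

by the standard normal, so $Z_i^2$ is distributed approximately as chi-square with one degree of freedom.

Kruskal (1952) showed that under $H_0$, if no $n_i$ is very small, the random variable

$$H = \sum_{i=1}^{k} \frac{N-n_i}{N} Z_i^2 = \frac{12}{N(N+1)} \sum_{i=1}^{k} n_i \left(\bar{R}_i - \frac{N+1}{2}\right)^2$$

is distributed approximately as chi-square with $k-1$ degrees of freedom.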
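The mean and variance of the average rank can be checked by brute force for a small case. The sketch below assumes the null hypothesis, under which every set of $n_i$ ranks out of $1, \ldots, N$ is equally likely, and enumerates all of them for the illustrative choice $N = 5$, $n_i = 2$; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def mean_rank_moments(n_total, n_i):
    """Exact mean and variance of the average of n_i ranks drawn from 1..n_total."""
    means = [
        Fraction(sum(subset), n_i)
        for subset in combinations(range(1, n_total + 1), n_i)
    ]
    expectation = sum(means) / len(means)
    variance = sum((m - expectation) ** 2 for m in means) / len(means)
    return expectation, variance

n_total, n_i = 5, 2
expectation, variance = mean_rank_moments(n_total, n_i)

# Closed forms from the derivation above.
formula_expectation = Fraction(n_total + 1, 2)
formula_variance = Fraction((n_total + 1) * (n_total - n_i), 12 * n_i)

print(expectation, formula_expectation)
print(variance, formula_variance)
```

Exact fractions are used so that the enumerated moments can be compared with the closed forms without rounding error.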

Page 14:

Kruskal-Wallis Test
Chi-Square                    7.6301
DF                            3
Asymptotic Pr > Chi-Square    0.0543
Exact Pr >= Chi-Square        0.0458

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3   14.45833333      4.81944444    3.38      0.0384
Error             20   28.50000000      1.42500000
Corrected Total   23   42.95833333

p-Values
Variable   Contrast   Raw      Bonferroni   Permutation
scores     1 vs 2     0.2407   1.0000       0.6727
scores     1 vs 3     0.2407   1.0000       0.6727
scores     1 vs 4     0.0051   0.0307       0.0294
scores     2 vs 3     1.0000   1.0000       1.0000
scores     2 vs 4     0.0673   0.4039       0.2750
scores     3 vs 4     0.0673   0.4039       0.2750

Means with the same letter are not significantly different.
t Grouping   Mean     N   treatment
A            3.5000   6   4
B A          2.1667   6   3
B A          2.1667   6   2
B            1.3333   6   1

Page 15:

Source     DF   ANOVA SS      Mean Square   F Value   Pr > F
tractor     3   5408.333333   1802.777778   3.12      0.0575
location    5   2816.833333   563.366667    0.98      0.4640

Cochran-Mantel-Haenszel Statistics (Based on Rank Scores)
Statistic   Alternative Hypothesis    DF   Value    Prob
1           Nonzero Correlation        1   0.2650   0.6067
2           Row Mean Scores Differ     5   4.6377   0.4617

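Page 16:

/* Insect-damage scores: each pair on a data line is (treatment, score). */
data insect;
  input treatment scores @@;
  cards;
1 0 2 2 3 1 4 3
1 2 2 0 3 3 4 4
1 1 2 3 3 4 4 2
1 3 2 1 3 2 4 5
1 1 2 3 3 2 4 3
1 1 2 4 3 1 4 4
;

/* Kruskal-Wallis test with asymptotic and exact p-values. */
proc npar1way data=insect wilcoxon;
  class treatment;
  exact wilcoxon;
  var scores;
run;

/* One-way ANOVA on the raw scores. */
proc glm data=insect;
  class treatment;
  model scores = treatment;
run;

/* Tractor yields: one data line per tractor (1-4), one value per location (1-6). */
data yield;
  do tractor = 1 to 4;
    do location = 1 to 6;
      input y @@;
      output;
    end;
  end;
  cards;
120 208 199 194 177 195
207 188 181 164 155 175
122 137 177 177 160 138
128 128 160 142 157 179
;

/* Randomized complete block ANOVA: tractor is the treatment, location the block. */
proc anova data=yield;
  class tractor location;
  model y = tractor location;
run;

/* Friedman's test via the CMH row mean score statistic on rank scores.
   The first variable in TABLES is the stratum, so the block (location) is listed
   first and the treatment (tractor) is the row variable, giving k-1 = 3 df.
   Listing tractor first would instead compare locations within tractors (5 df). */
proc freq data=yield;
  tables location*tractor*y / cmh2 scores=rank noprint;
run;

/* Normality check of the yields within each tractor. */
proc univariate data=yield normal;
  var y;
  by tractor;
run;

/* ANOVA with Bonferroni, LSD, and Tukey comparisons of the treatment means. */
proc glm data=insect;
  class treatment;
  model scores = treatment;
  means treatment / bon lsd tukey;
run;

/* Pairwise contrasts with raw, Bonferroni, and permutation-adjusted p-values. */
proc multtest data=insect perm bon pvals;
  class treatment;
  contrast '1 vs 2' -1 1 0 0;
  contrast '1 vs 3' -1 0 1 0;
  contrast '1 vs 4' -1 0 0 1;
  contrast '2 vs 3' 0 -1 1 0;
  contrast '2 vs 4' 0 -1 0 1;
  contrast '3 vs 4' 0 0 -1 1;
  test mean(scores);
run;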