Slides by John Loucks St. Edward’s University
description
Transcript of Slides by John Loucks St. Edward’s University
1 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Slides by
JohnLoucks
St. Edward’sUniversity
2 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Chapter 19Nonparametric Methods
Sign Test Wilcoxon Signed Rank Test Mann-Whitney-Wilcoxon Test Kruskal-Wallis Test Rank Correlation
3 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Nonparametric Methods
Most of the statistical methods referred to as parametric require the use of interval- or ratio-scaled data.
Nonparametric methods are often the only way to analyze categorical ( nominal or ordinal) data and draw statistical conclusions.
Nonparametric methods require no assumptions about the population probability distributions. Nonparametric methods are often called distribution-free methods.
4 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Nonparametric Methods
Whenever the data are quantitative, we will transform the data into categorical data in order to conduct the nonparametric test.
5 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sign Test
The sign test is a versatile method for hypothesis testing that uses the binomial distribution with p = .50 as the sampling distribution. We present two applications of the sign test: A hypothesis test about a population
median A matched-sample test about the difference between two populations
6 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median
We can apply the sign test by:• Using a plus sign whenever the data in the
sample are above the hypothesized value of the median• Using a minus sign whenever the data in the sample are below the hypothesized value of the median
• Discarding any data exactly equal to the hypothesized median
7 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median
The assigning of the plus and minus signs makes the situation into a binomial distribution application.• The sample size is the number of trials.• There are two outcomes possible per trial, a
plus sign or a minus sign.
• We let p denote the probability of a plus sign.• If the population median is in fact a particular value, p should equal .5.
• The trials are independent.
8 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Small-Sample Case The small-sample case for this sign test should
be used whenever n < 20. The hypotheses are
The population median is differentthan the value assumed.
The population median equals thevalue assumed.
The number of plus signs is our test statistic. Assuming H0 is true, the sampling distribution
for the test statistic is a binomial distribution with p = .5. H0 is rejected if the p-value < level of significance, a.
0 : .50H p
0 : .50H p
9 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Example: Potato Chip Sales
Hypothesis Test about a Population Median:
Smaller Sample Size Lawler’s Grocery Store made the decision to carry Cape May Potato Chips based on the manufacturer’s estimate that the median sales should be $450 per week on a per-store basis.
Lawler’s has been carrying the potato chips for three months. Data showing one-week sales at 10 randomly selected Lawler’s stores are shown on the next slide.
10 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Smaller Sample Size
StoreNumber
56193612812
WeeklySales$485 562 415 860 426
Sign++-+-
StoreNumber
633984
10244
WeeklySales$474 662 380 515 721
Sign++-++
Example: Potato Chip Sales
11 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Smaller Sample Size Example: Potato Chip Sales Lawler’s management requested the following hypothesis test about the population median weekly sales of Cape May Potato Chips (using a = .10).
H0: Median Sales = $450Ha: Median Sales ≠ $450
H0: p = .50Ha: p ≠ .50
In terms of the binomial probability p:
12 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Smaller Sample Size Example: Potato Chip Sales
Number ofPlus Signs
012345
Probability.0010 .0098 .0439 .1172 .2051 .2461
Number ofPlus Signs
678910
Probability.2051 .1172 .0439 .0098 .0010
Binomial Probabilities with n = 10 and p = .50
13 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Smaller Sample Size Example: Potato Chip Sales Because the observed number of plus signs is 7, we begin by computing the probability of obtaining 7 or more plus signs. The probability of 7, 8, 9, or 10 plus signs is: .1172 + .0439 + .0098 + .0010 = .1719. We are using a two-tailed hypothesis test, so: p-value = 2(.1719) = .3438. With p-value > a, (.3438 > .10), we cannot reject H0.
14 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Smaller Sample Size ConclusionBecause the p-value > a, we cannot reject H0.
There is insufficient evidence in the sample to reject
the assumption that the median weekly sales is $450.
15 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Larger Sample Size With larger sample sizes, we rely on the normal distribution approximation of the binomial distribution to compute the p-value, which makes the computations quicker and easier.
Normal Approximation ofthe Number of Plus Signs when H0: p = .50
Mean: = .50n Standard Deviation: .25n
Distribution Form: Approximately normal for n > 20
16 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Larger Sample Size
H0: Median Age = 34 yearsHa: Median Age ≠ 34 years
Example: Trim Fitness CenterA hypothesis test is being conducted about the
median age of female members of the Trim Fitness
Center.
In a sample of 40 female members, 25 are older
than 34, 14 are younger than 34, and 1 is 34. Is there
sufficient evidence to reject H0? Use a = .05.
17 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Larger Sample Size Example: Trim Fitness Center• Letting x denote the number of plus signs,
we will use the normal distribution to approximate the binomial probability P(x < 25).• Remember that the binomial distribution is discrete and the normal distribution is continuous.• To account for this, the binomial probability of 25 is computed by the normal probability interval 24.5 to 25.5.
18 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
p-Value = 2(1.0000 - .9726) = .0548
= .5(n) = .5(39) = 19.5
Hypothesis Test about a Population Median:
Larger Sample Size
p-Valuez = (x – )/ = (25.5 – 19.5)/3.1225 = 1.92
Test Statistic
Mean and Standard Deviation
.25 .25(39) 3.1225n
19 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test about a Population Median:
Larger Sample Size Rejection Rule
ConclusionDo not reject H0. The p-value for this
two-tail test is .0548. There is insufficient evidence in the sample to conclude that the median age is not 34 for female members of Trim Fitness Center.
Using .05 level of significance: Reject H0 if p-value < .05
20 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test with Matched Samples
A common application of the sign test involves using a sample of n potential customers to identify a preference for one of two brands of a product. The objective is to determine whether there is a difference in preference between the two items being compared.
To record the preference data, we use a plus sign if the individual prefers one brand and a minus sign if the individual prefers the other brand. Because the data are recorded as plus and minus signs, this test is called the sign test.
21 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test with Matched Samples:Small-Sample Case
The small-sample case for the sign test should be used whenever n < 20.
The hypotheses are
a : .50H p
0 : .50H p
A preference for one brandover the other exists.
No preference for one brandover the other exists.
The number of plus signs is our test statistic. Assuming H0 is true, the sampling distribution
for the test statistic is a binomial distribution with p = .5. H0 is rejected if the p-value < level of significance, a.
22 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test with Matched Samples:Small-Sample Case
Maria Gonzales is the supervisor responsible forscheduling telephone operators at a major call center. She is interested in determining whether her operators’ preferences between the day shift (7 a.m.to 3 p.m.) and evening shift (3 p.m. to 11 p.m.) are different.
Example: Major Call Center
Maria randomly selected a sample of 16 operators
who were asked to state a preference for the one of
the two work shifts. The data collected from the
sample are shown on the next slide.
23 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypothesis Test with Matched Samples:Small-Sample Case
Example: Major Call Center
Worker12345678
ShiftPreference
DayEveningEveningEvening
DayEvening
Day(none)
Sign+---+-+
Worker 910111213141516
ShiftPreference
Evening EveningEvening(none)
EveningDay
EveningEvening
Sign---
-+--
24 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
4 plus signs10 negative
signs (n = 14)
Hypothesis Test with Matched Samples:Small-Sample Case
Example: Major Call Center
Can Maria conclude, using a level of significance of a = .10, that operator preferences are different for the two shifts?
a : .50H p
0 : .50H p
A preference for one shiftover the other does not exist.
A preference for one shiftover the other does exist.
25 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Number ofPlus Signs
01234567
Probability
.00006 .00085.00555 .02222 .06110 .12219 .18329.20947
Number ofPlus Signs
8 91011121314
Probability
.18329 .12219 .06110 .02222 .00555.00085.00006
Binomial Probabilities with n = 14 and p = .50
Hypothesis Test with Matched Samples:Small-Sample Case
Example: Major Call Center
26 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Because the observed number of plus signs is 4, we begin by computing the probability of obtaining 4 or less plus signs. The probability of 0, 1, 2, 3, or 4 plus signs is: .00006 + .00085 + .00555 + .02222 + .06110 = .08978. We are using a two-tailed hypothesis test, so: p-value = 2(.08978) = .17956. With p-value > a, (.17956 > .10), we cannot reject H0.
Hypothesis Test with Matched Samples:Small-Sample Case
Example: Major Call Center
27 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
ConclusionBecause the p-value > a, we cannot reject H0. There
is insufficient evidence in the sample to conclude that
a difference in preference exists for the two work shifts.
Hypothesis Test with Matched Samples:Small-Sample Case
28 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Using H0: p = .5 and n > 20, the sampling distribution for the number of plus signs can be approximated by a normal distribution.
When no preference is stated (H0: p = .5), the sampling distribution will have:
The test statistic is:
H0 is rejected if the p-value < level of significance, a.
Mean: = .50nStandard Deviation: .25n
xz -
(x is the number of plus signs)
Hypothesis Test with Matched Samples:Large-Sample Case
29 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Example: Ketchup Taste TestAs part of a market research study, a sample of
80 consumers were asked to taste two brands of
ketchup and indicate a preference. Do the data
shown on the next slide indicate a significantdifference in the consumer preferences for
the twobrands?
Hypothesis Test with Matched Samples:Large-Sample Case
30 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
45 preferred Brand A Ketchup (+ sign recorded)
27 preferred Brand B Ketchup (_ sign recorded)
8 had no preference
Example: Ketchup Taste Test
The analysis will be based on
a sample size of 45 + 27 = 72.
Hypothesis Test with Matched Samples:Large-Sample Case
31 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypotheses
a : .50H p
0 : .50H p
A preference for one brand over the other exists
No preference for one brand over the other exists
Hypothesis Test with Matched Samples:Large-Sample Case
32 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sampling Distribution for Number of Plus Signs
= .5(72) = 36
.25 .25(72) 4.243n
Hypothesis Test with Matched Samples:Large-Sample Case
33 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
p-Value = 2(1.0000 - .9875) = .025
Rejection Rule
p-Value
z = (x – )/ = (45.5 - 36)/4.243 = 2.24 Test Statistic
Using .05 level of significance:Reject H0 if p-value < .05
Hypothesis Test with Matched Samples:Large-Sample Case
34 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
ConclusionBecause the p-value < a, we can reject H0. There
is sufficient evidence in the sample to conclude that
a difference in preference exists for the two brands of
ketchup.
Hypothesis Test with Matched Samples:Large-Sample Case
35 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is a procedure for analyzing data from a matched samples experiment. The test uses quantitative data but does not require the assumption that the differences between the paired observations are normally distributed.
It only requires the assumption that the differences have a symmetric distribution.
This occurs whenever the shapes of the two populations are the same and the focus is on determining if there is a difference between the two populations’ medians.
36 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
Let T - denote the sum of the negative signed ranks.
If the medians of the two populations are equal, we would expect the sum of the negative signed ranks and the sum of the positive signed ranks to be approximately the same.
Let T + denote the sum of the positive signed ranks.
We use T + as the test statistic.
37 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sampling Distribution of T +
for the Wilcoxon Signed-Rank Test
Mean:
( 1)(2 1)Standard Deviation: 24T
n n n
Distribution Form: Approximately normal for n > 10
( 1)4T
n n
Wilcoxon Signed-Rank Test
38 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Example: Express Deliveries
Wilcoxon Signed-Rank Test
A firm has decided to select one of two expressdelivery services to provide next-day deliveries to itsdistrict offices. To test the delivery times of the two services, thefirm sends two reports to a sample of 10 district offices, with one report carried by one service and theother report carried by the second service. Do the dataon the next slide indicate a difference in the twoservices?
39 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
SeattleLos Angeles
BostonClevelandNew YorkHoustonAtlantaSt. LouisMilwaukeeDenver
32 hrs.30191615181410 716
25 hrs.241515131515 8 911
District Office OverNight NiteFlite
40 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
HypothesesH0: The difference in the median delivery times of
the two services equals 0.Ha: The difference in the median delivery times of
the two services does not equal 0.
41 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test Preliminary Steps of the Test
• Compute the differences between the paired observations.
• Discard any differences of zero.• Rank the absolute value of the differences
from lowest to highest. Tied differences are assigned the average ranking of their positions.• Give the ranks the sign of the original difference in the data.
• Sum the signed ranks.. . . next we will determine whether the
sum is significantly different from zero.
42 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
SeattleLos Angeles
BostonClevelandNew YorkHoustonAtlantaSt. LouisMilwaukeeDenver
7 6 4 1 2 3-1 2-2 5
District Office Differ. |Diff.| Rank Sign. Rank1097
1.546
1.5448
+10+9+7
+1.5+4+6-1.5+4-4
+8T+ = 44.0
43 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Wilcoxon Signed-Rank Test
Test Statistic
p-Valuep-Value = 2(1.0000 - .9535) = .093
44 27.5( 44) ( 1.68)9.81
P T P z p z -
( 1)(2 1) 10(11)(21) 9.8124 24T
n n n
( 1) 10(10 1) 27.54 4T
n n
44 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
ConclusionReject H0. The p-value for this two-tail
test is .093. There is insufficient evidence in the sample to conclude that a difference exists in the median delivery times provided by the two services.
Wilcoxon Signed-Rank Test
Rejection RuleUsing .05 level of significance,
Reject H0 if p-value < .05
45 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
This test is another nonparametric method for determining whether there is a difference between two populations.
This test is based on two independent samples. Advantages of this procedure are; It can be used with either ordinal data or
quantitative data. It does not require the assumption that the
populations have a normal distribution.
46 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
Ha: The two populations are not identicalH0: The two populations are identical
Instead of testing for the difference between the medians of two populations, this method tests to determine whether the two populations are identical. The hypotheses are:
47 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
Example: Westin FreezersManufacturer labels indicate the annual energy cost
associated with operating home appliances such as
freezers.The energy costs for a sample of 10 Westin freezers
and a sample of 10 Easton Freezers are shown on the
next slide. Do the data indicate, using a = .05, that a
difference exists in the annual energy costs for the two
brands of freezers?
48 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
$55.10
54.50
53.20
53.00
55.50
54.90
55.80
54.00
54.20
55.20
$56.10
54.70 54.40 55.40 54.10 56.00 55.50 55.00 54.30 57.00
Westin FreezersEaston Freezers
49 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Hypotheses
Mann-Whitney-Wilcoxon Test
Ha: Annual energy costs differ for the two brands of freezers.
H0: Annual energy costs for Westin freezers and Easton freezers are the same.
50 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test:Large-Sample Case
First, rank the combined data from the lowest to the highest values, with tied values being assigned the average of the tied rankings. Then, compute W, the sum of the ranks for the first sample.
Then, compare the observed value of W to the sampling distribution of W for identical populations. The value of the standardized test statistic z will provide the basis for deciding whether to reject H0.
51 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test:Large-Sample Case
1 2 1 21 ( 1)12W n n n n
Approximately normal, providedn1 > 7 and n2 > 7
Sampling Distribution of W with Identical Populations• Mean
• Standard Deviation
• Distribution Form
W = 1/2 n1(n1 + n2 + 1)
52 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
$55.10
54.50
53.20
53.00
55.50
54.90
55.80
54.00
54.20
55.20
$56.10
54.70 54.40 55.40 54.10 56.00 55.50 55.00 54.30 57.00
Westin Freezers Easton Freezers
Sum of Ranks Sum of Ranks
Rank Rank
86.5 123.5
12
128
15.510173513
1997
144
1815.5116
20
53 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sampling Distribution of W with Identical Populations
W = ½(10)(21) = 105
Mann-Whitney-Wilcoxon Test
1 2 1 21 ( 1)12
1 (10)(10)(21)12 13.23
W n n n n
W
54 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rejection RuleUsing .05 level of significance,
Reject H0 if p-value < .05 Test Statistic
p-Valuep-Value = 2(.0808) = .1616
Mann-Whitney-Wilcoxon Test
86.5 105( 44) ( 1.40)13.23
P W P z p z- -
55 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Mann-Whitney-Wilcoxon Test
ConclusionDo not reject H0. The p-value > a. There
is insufficient evidence in the sample data to conclude that there is a difference in the annual energy cost associated with the two brands of freezers.
56 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
The Mann-Whitney-Wilcoxon test has been extended by Kruskal and Wallis for cases of three or more populations.
The Kruskal-Wallis test can be used with ordinal data as well as with interval or ratio data. Also, the Kruskal-Wallis test does not require the assumption of normally distributed populations.
Ha: Not all populations are identicalH0: All populations are identical
57 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Test Statistic
Kruskal-Wallis Test
-
2
1
12 3( 1)( 1)k
iT
iT T i
RH nn n n
where: k = number of populations ni = number of observations in sample I nT = Sni = total number of observations in all samples Ri = sum of the ranks for sample i
58 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
When the populations are identical, the sampling distribution of the test statistic H can be approximated by a chi-square distribution with k – 1 degrees of freedom.
This approximation is acceptable if each of the sample sizes ni is > 5.
The rejection rule is: Reject H0 if p-value < a This test is always expressed as an upper-tailed
test.
59 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
John Norr, Director of Athletics at Lakewood High School, is curious about whether a student’s total number of absences in four years of high school is the same for students participating in no varsity sport, one varsity sport, and two varsity sports.
Example: Lakewood High School
Number of absences data were available for 20 recent graduates and are listed on the next slide. Test whether the three populations are identical in terms of number of absences. Use a = .10.
60 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
Example: Lakewood High SchoolNo Sport 1 Sport 2 Sports
13 18 1216 12 226 19 927 7 1120 15 1514 20 21
17 10
61 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
No Sport Rank 1 Sport Rank 2 Sports Rank
13 8 18 14 12 6.516 12 12 6.5 22 196 1 19 15 9 327 20 7 2 11 520 16.5 15 10.5 15 10.514 9 20 16.5 21 18
17 13 10 4Total 66.5 77.5 66
Example: Lakewood High School
62 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
Using test statistic: Reject H0 if c2 > 4.60517 (2 d.f.)Using p-value: Reject H0 if p-value < .10
Rejection Rule
-
2
1
12 3( 1)( 1)k
iT
iT T i
RH nn n n
-
2 2 2(66.5) (77.5) (66.0)12 3(20 1) .353220(20 1) 6 7 7
k = 3 populations, n1 = 6, n2 = 7, n3 = 7, nT = 20 Kruskal-Wallis Test Statistic
63 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Kruskal-Wallis Test
Do no reject H0. There is insufficient evidence to conclude that the populations are not identical. (H = .3532 < 4.60517)
Conclusion
64 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
The Pearson correlation coefficient, r, is a measure of the linear association between two variables for which interval or ratio data are available. The Spearman rank-correlation coefficient, rs , is a measure of association between two variables when only ordinal data are available.
Values of rs can range from –1.0 to +1.0, where• values near 1.0 indicate a strong positive
association between the rankings, and• values near -1.0 indicate a strong negative
association between the rankings.
65 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
Spearman Rank-Correlation Coefficient, rs
2
261 ( 1)
is
dr
n n -
-
where: n = number of observations being ranked xi = rank of observation i with respect to the first variable yi = rank of observation i with respect to the second variable di = xi - yi
66 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Test for Significant Rank Correlation
0 : 0sH p
a : 0sH p
We may want to use sample results to make an inference about the population rank correlation ps. To do so, we must test the hypotheses:
(No rank correlation exists)(Rank correlation exists)
67 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
0sr
11sr n
-
Approximately normal, provided n > 10
Sampling Distribution of rs when ps = 0• Mean
• Standard Deviation
• Distribution Form
68 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
Example: Crennor InvestorsCrennor Investors provides a portfolio management
service for its clients. Two of Crennor’s analysts ranked
ten investments as shown on the next slide. Use rank
correlation, with a = .10, to comment on the agreement
of the two analysts’ rankings.
69 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
Analyst #2 1 5 6 2 9 7 3 10 4 8Analyst #1 1 4 9 8 6 3 5 7 2 10Investment A B C D E F G H I J
Example: Crennor Investors
0 : 0sH p
a : 0sH p (No rank correlation exists)(Rank correlation exists)
• Analysts’ Rankings
• Hypotheses
70 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Rank Correlation
ABCDEFGHIJ
14986357210
1562973
1048
0-136-3-42-3-22
019369164944
Sum = 92
InvestmentAnalyst #1
RankingAnalyst #2
Ranking Differ. (Differ.)2
71 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Sampling Distribution of rs
Assuming No Rank Correlation
Rank Correlation
1 .33310 1sr -
r = 0rs
72 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Test Statistic2
26 6(92)1 1 0.4424( 1) 10(100 1)
is
dr
n n - -
- -
Rank Correlation
z = (rs - r )/r = (.4424 - 0)/.3333 = 1.33
Rejection RuleWith .10 level of significance:
Reject H0 if p-value < .10
p-Valuep-Value = 2(1.0000 - .9082) = .1836
73 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
Do no reject H0. The p-value > a. There is not a significant rank correlation. The two analysts are not showing agreement in their ranking of the risk associated with the different investments.
Rank Correlation
Conclusion
74 Slide© 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied
or duplicated, or posted to a publicly accessible website, in whole or in part.
End of Chapter 19