3/18/2013
Chapter 24
Single-Factor (One-Way)
Analysis of Variance (ANOVA)
and
Analysis of Means (ANOM)
Introduction
• This chapter describes single-factor analysis of
variance (ANOVA) experiments with 2 or more levels
(or treatments).
• The method is based on a fixed-effects model (as
opposed to a random-effects model, or components-of-
variance model). It tests the 𝐻0 that the different
processes give an equal response.
• With a fixed-effects model, the levels are specifically
chosen. The test hypothesis is about the mean
responses due to the factor levels. Conclusions apply
only to the factor levels considered.
24.1 S4/IEE Application Examples:
ANOVA and ANOM
• Transactional 30,000-foot-level metric: DSO reduction was
chosen as an S4/IEE project. A cause-and-effect matrix
ranked company as an important input that could affect the
DSO response (i.e., the team thought that some
companies were more delinquent in payments than other
companies). From randomly sampled data, a statistical
assessment was conducted to test the hypothesis of
equality of means for the DSOs of these companies.
24.1 S4/IEE Application Examples:
ANOVA and ANOM
• Manufacturing 30,000-foot-level metric (KPOV): An S4/IEE
project was to improve the capability/performance of the
diameter of a manufactured product (i.e., reduce the
number of parts beyond the specification limits). A cause-
and-effect matrix ranked cavity of the four-cavity mold as
an important input that could be yielding different part
diameters. From randomly sampled data, statistical tests
were conducted to test the hypotheses of mean diameter
equality and equality of variances for the cavities.
24.1 S4/IEE Application Examples:
ANOVA and ANOM
• Transactional and manufacturing 30,000-foot-level cycle
time metric (a lean metric): An S4/IEE project was to
improve the time from order entry to fulfillment. The WIP
at each process step was collected at the end of the day
for a random number of days. Statistical tests were
conducted to test the hypotheses that the mean and
variance of WIP at each step were equal.
24.2 Application Steps
1. Describe the problem using a response variable that
corresponds to the KPOV or measured quality characteristic.
2. Describe the analysis (e.g., determine if there is a difference).
3. State the null and alternative hypotheses.
4. Choose a large enough sample and conduct the experiment
randomly.
5. Generate an ANOVA table.
6. Test the data normality and equality of variance assumptions.
7. Make hypothesis decisions about factors from the ANOVA table.
8. Calculate (if desired) epsilon squared (𝜀2).
9. Conduct an analysis of means (ANOM).
10. Translate conclusions into terms relevant to the problem or
process in question.
24.3 Single-factor ANOVA
Hypothesis Test
• ANOVA assesses the differences between samples taken
at different factor levels to determine if these differences
are large enough relative to error to conclude that the
factor level causes a statistically significant difference in
response.
• For a single-factor analysis of variance, a linear statistical
model can describe the 𝑗th observation taken under level 𝑖
(𝑖 = 1, 2, …, 𝑎; 𝑗 = 1, 2, …, 𝑛):
𝑦ᵢⱼ = 𝜇 + 𝜏ᵢ + 𝜀ᵢⱼ
where 𝑦ᵢⱼ is the (𝑖𝑗)th observation, 𝜇 is the overall mean, 𝜏ᵢ is
the 𝑖th level effect, and 𝜀ᵢⱼ is random error.
24.3 Single-factor ANOVA
Hypothesis Test
• In the ANOVA test, model errors are assumed to be normally
and independently distributed random variables with mean
0 and variance 𝜎2. The variance is assumed constant for
all factor levels.
• An expression for the hypothesis test of means is:
𝐻0: 𝜇1 = 𝜇2 = ⋯ = 𝜇𝑎
𝐻𝑎: 𝜇𝑖 ≠ 𝜇𝑗 for at least one pair (𝑖, 𝑗)
• When 𝐻0 is true, all levels have a common mean 𝜇, which
leads to an equivalent expression in terms of 𝜏𝑖.
𝐻0: 𝜏1 = 𝜏2 = ⋯ = 𝜏𝑎 = 0
𝐻𝑎: 𝜏𝑖 ≠ 0 (for at least one 𝑖)
24.4 Single-factor ANOVA Table
Calculations
• The total sum of squares of deviations about the grand
average ȳ (also referred to as the total corrected sum of
squares) represents the overall variability of the data:
𝑆𝑆total = Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳ)²
• A division of 𝑆𝑆total by the number of degrees of freedom
would yield a sample variance of 𝑦’s. For this situation, the
overall number of degrees of freedom is 𝑎𝑛 − 1 = 𝑁 − 1.
24.4 Single-factor ANOVA Table
Calculations
• Total variability in data, as measured by 𝑆𝑆total (the total
corrected sum of squares) can be partitioned into a sum of
two elements. The first element is the sum of squares for
differences between factor level averages and the grand
average. The second element is the sum of squares of the
differences of observations within factor levels from the
factor-level averages. The first element is a measure of the
difference between the means of the levels, whereas the
second one is due to random error.
𝑆𝑆total = 𝑆𝑆factor levels + 𝑆𝑆error
24.4 Single-factor ANOVA Table
Calculations
• 𝑆𝑆factor levels is called the sum of squares due to factor
levels (i.e., between factor levels or treatments)
𝑆𝑆factor levels = 𝑛 Σᵢ₌₁ᵃ (ȳᵢ − ȳ)²
• 𝑆𝑆error is called the sum of squares due to error (i.e., within
factor levels or treatments)
𝑆𝑆error = Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳᵢ)²
24.4 Single-factor ANOVA Table
Calculations
• When divided by the appropriate number of degrees of
freedom, these SSs give good estimates of the total
variability, the variability between factor levels, and the
variability within factor levels (or error).
• Expressions for the mean squares are
𝑀𝑆factor levels = 𝑆𝑆factor levels / (𝑎 − 1)
𝑀𝑆error = 𝑆𝑆error / [𝑎(𝑛 − 1)] = 𝑆𝑆error / (𝑁 − 𝑎)
24.4 Single-factor ANOVA Table
Calculations
• If there is no difference in treatment means, the two
estimates are presumed to be similar. If there is a
difference, we suspect that the observed difference is
caused by differences in the treatment (factor) levels.
• The F-test statistic tests 𝐻0: there is no difference in factor
levels.
𝐹0 = 𝑀𝑆factor levels / 𝑀𝑆error
• 𝐻0 should be rejected if 𝐹0 > 𝐹𝛼,𝑎−1,𝑁−𝑎.
• Alternatively, a 𝑝-value can be calculated for 𝐹0; 𝐻0 should
then be rejected if the 𝑝-value < 𝛼.
24.4 Single-factor ANOVA Table
Calculations

Source of Variation            Sum of Squares     Degrees of Freedom   Mean Square        𝐹0
Between-factor levels          𝑆𝑆factor levels    𝑎 − 1                𝑀𝑆factor levels    𝐹0 = 𝑀𝑆factor levels / 𝑀𝑆error
Error (within-factor levels)   𝑆𝑆error            𝑁 − 𝑎                𝑀𝑆error
Total                          𝑆𝑆total            𝑁 − 1
24.5 Estimation of Model
Parameters
• In addition to factor-level significance, it is useful to
estimate the parameters of the single-factor model and the
CI on the factor-level means.
• For the single-factor model
𝑦ᵢⱼ = 𝜇 + 𝜏ᵢ + 𝜀ᵢⱼ
estimates for the overall mean and the factor-level effects are
𝜇̂ = ȳ
𝜏̂ᵢ = ȳᵢ − ȳ, 𝑖 = 1, 2, …, 𝑎
• A 100(1 − 𝛼)% CI on the 𝑖th factor-level mean is
ȳᵢ ± 𝑡𝛼/2,𝑁−𝑎 √(𝑀𝑆error / 𝑛)
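As a numeric sketch of this interval (the values 𝑀𝑆error = 8.99 with 𝑁 − 𝑎 = 21 df, 𝑛 = 4, and level mean 63.0 are illustrative, taken from the Example 24.1 output later in this chapter):

```python
import math
from scipy import stats

# Illustrative values from the Example 24.1 Minitab output later in this chapter
ms_error, df_error, n, level_mean, alpha = 8.99, 21, 4, 63.0, 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df_error)   # t_{alpha/2, N-a}
half_width = t_crit * math.sqrt(ms_error / n)   # CI half-width
ci = (level_mean - half_width, level_mean + half_width)
```

The resulting interval is roughly 63.0 ± 3.1, consistent with the pooled-standard-deviation CIs Minitab draws in that example.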
24.6 Unbalanced Data
• A design is considered unbalanced when the number of
observations differs between factor levels. For this
situation, the ANOVA equations need slight modification:
𝑆𝑆factor levels = Σᵢ₌₁ᵃ 𝑛ᵢ(ȳᵢ − ȳ)²
• A balanced design is preferable to an unbalanced design.
With a balanced design, the power of the test is maximized,
and the test statistic is robust to small departures from the
assumption of equal variances.
24.7 Model Adequacy
• As in the regression model, valid ANOVA requires that
certain assumptions be satisfied.
• One typical assumption is that errors are normally and
independently distributed with mean 0 and constant but
unknown variance, i.e., 𝑁𝐼𝐷(0, 𝜎²).
• To help meet the independence and normality requirements,
an experimenter needs to select an adequate sample size
and conduct the trials in random order.
• After data are collected, computer programs offer routines
to test the assumptions.
• Generally, in a fixed effects ANOVA, moderate departures
from normality of the residuals are of little concern.
24.7 Model Adequacy
• In addition to an analysis of residuals, there is also a direct
statistical test for equality of variance.
𝐻0: 𝜎1² = 𝜎2² = ⋯ = 𝜎𝑎²
𝐻𝑎: the above is not true for at least one 𝜎𝑖²
• Bartlett’s test is frequently used to test this hypothesis
when the normality assumption is valid. Levene’s test can
be used when the normality assumption is questionable.
24.8 Analysis of Residuals: Fitted Value Plots and Data Transformations
• Residual plots should show no structure relative to any
factor included in the fitted response; however, trends in
the data may occur for various reasons.
• One phenomenon that may occur is inconsistent variance.
Fortunately, a balanced fixed effects model is robust to
variance not being homogeneous.
• A data transformation may then be used to reduce this
phenomenon in the residuals, which would yield a more
precise significance test.
24.8 Analysis of Residuals: Fitted Value Plots and Data Transformations
• Another situation occurs when the output is count data,
where a square root transformation may be appropriate. A
lognormal transformation is often appropriate when the trial
outputs are standard deviation values, and a logit
transformation might be helpful when there are upper and
lower limits (Table 24.2).
• As an alternative to the transformations included in the table,
Box (1988) describes a method for eliminating unnecessary
coupling of dispersion effects and location effects by
determining an approximate transformation using a lambda
plot.
• With transformations, the conclusions of the analysis apply to
the transformed populations.
24.8 Analysis of Residuals: Fitted Value Plots and Data Transformations
Data Characteristics                      Transformation of data (𝑥𝑖 or 𝑝𝑖)
𝜎 ∝ constant                              None
𝜎 ∝ 𝜇²                                    1/𝑥𝑖
𝜎 ∝ 𝜇^(3/2)                               1/√𝑥𝑖
𝜎 ∝ 𝜇                                     log 𝑥𝑖
𝜎 ∝ √𝜇, Poisson (count) data              √𝑥𝑖 or √(𝑥𝑖 + 1)
Binomial proportions                      sin⁻¹(√𝑝𝑖)
Upper- and lower-bounded data             Logit transformation:
(e.g., 0–1 probability of failure)        log[(𝑥𝑖 − lower limit)/(upper limit − 𝑥𝑖)]
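The transformations in Table 24.2 are one-line operations; a sketch with numpy (the sample arrays are illustrative, not from the text):

```python
import numpy as np

counts = np.array([0, 1, 4, 9, 16])          # Poisson-like count data
sqrt_t = np.sqrt(counts + 1)                 # sqrt(x_i + 1) guards against zeros

p = np.array([0.1, 0.5, 0.9])                # binomial proportions
arcsine_t = np.arcsin(np.sqrt(p))            # sin^{-1}(sqrt(p_i))

x = np.array([0.2, 0.5, 0.8])                # bounded on (0, 1)
lower, upper = 0.0, 1.0
logit_t = np.log((x - lower) / (upper - x))  # logit transformation
```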
24.9 Comparing Pairs of
Treatment Means
• The rejection of the null hypothesis in an ANOVA indicates
that there is a difference between the factor levels
(treatments). However, no information is given to determine
which means are different.
• Sometimes it is useful to make further comparisons and
analysis among groups of factor level means. Multiple
comparison methods assess differences between treatment
means in either the factor level totals or the factor level
averages.
• Methods include those of Tukey and Fisher. Montgomery
(1997) describes several methods of making comparisons.
• The analysis of means (ANOM) approach is to compare
individual means to a grand mean.
24.10 Example 24.1:
Single-Factor ANOVA
• The bursting strengths of diaphragms were determined in an
experiment. Use analysis of variance techniques to determine
if there is a statistically significant difference at a level of 0.05.
Type 1 Type 2 Type 3 Type 4 Type 5 Type 6 Type 7
59.0 65.7 65.3 67.9 60.6 73.1 59.4
62.3 62.8 63.7 67.4 65.0 71.9 61.6
65.2 59.1 68.9 62.9 68.2 67.8 56.3
65.5 60.2 70.0 61.7 66.0 67.4 62.7
24.10 Example 24.1:
Single-Factor ANOVA
Minitab: Stat > ANOVA > One-Way > Graphs > Boxplot
24.10 Example 24.1:
Single-Factor ANOVA
Minitab: Stat > ANOVA > One-Way > Graphs > Ind. values
24.10 Example 24.1:
Single-Factor ANOVA
Minitab: Stat > ANOVA > One-Way
One-way ANOVA: Strength versus Type
Source DF SS MS F P
Type 6 265.34 44.22 4.92 0.003
Error 21 188.71 8.99
Total 27 454.05
S = 2.998 R-Sq = 58.44% R-Sq(adj) = 46.57%
Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev ------+---------+---------+---------+---
1 4 63.000 3.032 (-----*-----)
2 4 61.950 2.942 (-----*-----)
3 4 66.975 2.966 (-----*-----)
4 4 64.975 3.134 (-----*-----)
5 4 64.950 3.193 (-----*-----)
6 4 70.050 2.876 (-----*-----)
7 4 60.000 2.823 (-----*-----)
------+---------+---------+---------+---
60.0 65.0 70.0 75.0
Pooled StDev = 2.998
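The Minitab analysis above can be cross-checked with scipy's one-way ANOVA (a sketch, assuming scipy; the data are the Example 24.1 bursting strengths):

```python
from scipy import stats

# Bursting strengths by diaphragm type (Example 24.1)
types = [
    [59.0, 62.3, 65.2, 65.5],  # Type 1
    [65.7, 62.8, 59.1, 60.2],  # Type 2
    [65.3, 63.7, 68.9, 70.0],  # Type 3
    [67.9, 67.4, 62.9, 61.7],  # Type 4
    [60.6, 65.0, 68.2, 66.0],  # Type 5
    [73.1, 71.9, 67.8, 67.4],  # Type 6
    [59.4, 61.6, 56.3, 62.7],  # Type 7
]
f_stat, p_value = stats.f_oneway(*types)
# Matches the Minitab table (F = 4.92, P = 0.003), so H0 is rejected
# at the 0.05 level: the mean bursting strengths are not all equal.
```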
24.10 Example 24.1:
Single-Factor ANOVA
Minitab: ANOVA > Test of equal variances
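The equality-of-variance tests named in Section 24.7 (Bartlett's and Levene's) can be run on the same data (a sketch, assuming scipy):

```python
from scipy import stats

# Bursting strengths by diaphragm type (Example 24.1)
types = [
    [59.0, 62.3, 65.2, 65.5], [65.7, 62.8, 59.1, 60.2],
    [65.3, 63.7, 68.9, 70.0], [67.9, 67.4, 62.9, 61.7],
    [60.6, 65.0, 68.2, 66.0], [73.1, 71.9, 67.8, 67.4],
    [59.4, 61.6, 56.3, 62.7],
]
bart_stat, bart_p = stats.bartlett(*types)  # assumes normal data
lev_stat, lev_p = stats.levene(*types)      # robust when normality is in doubt
# Large p-values here support the equal-variance assumption for these data
```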
24.10 Example 24.1:
Single-Factor ANOVA
Minitab: Stat > ANOVA > One-Way > Graphs > Four in One
24.11 Analysis of Means (ANOM)
• Analysis of means (ANOM) is a statistical test procedure in
a graphical format.
Group 1 2 3 ⋯ 𝒌
𝑥11 𝑥21 𝑥31 ⋯ 𝑥𝑘1
𝑥12 𝑥22 𝑥32 ⋯ 𝑥𝑘2
𝑥13 𝑥23 𝑥33 ⋯ 𝑥𝑘3
⋮ ⋮ ⋮ ⋮ ⋮
𝑥1𝑗 𝑥2𝑗 𝑥3𝑗 ⋯ 𝑥𝑘𝑗
𝑥 1 𝑥 2 𝑥 3 ⋯ 𝑥 𝑘
𝑠1 𝑠2 𝑠3 ⋯ 𝑠𝑘
24.11 Analysis of Means (ANOM)
• The grand mean x̿ is simply the average of the group
means x̄ᵢ:
x̿ = (Σᵢ₌₁ᵏ x̄ᵢ) / 𝑘
• The pooled estimate for the standard deviation is
𝑠 = √(Σᵢ₌₁ᵏ 𝑠ᵢ² / 𝑘)
• The lower and upper decision lines (LDL and UDL) are
𝐿𝐷𝐿 = x̿ − ℎ𝛼 𝑠 √[(𝑘 − 1)/(𝑘𝑛)]
𝑈𝐷𝐿 = x̿ + ℎ𝛼 𝑠 √[(𝑘 − 1)/(𝑘𝑛)]
24.11 Analysis of Means (ANOM)
• ℎ𝛼 is from Table I for risk level 𝛼, number of means 𝑘, and
(𝑛 − 1)𝑘 degrees of freedom.
• The means are then plotted against the decision lines. If
any mean falls outside the decision lines, there is a
statistically significant difference between that mean and the
grand mean.
• If normality can be assumed (𝑛𝑝 > 5 and 𝑛(1 − 𝑝) > 5),
ANOM is also directly applicable to attribute data.
24.12 Example 24.2:
Analysis of Means (ANOM)
• The bursting strengths of diaphragms were determined in an
experiment. Use analysis of means techniques to determine
if there is a statistically significant difference at a level of 0.05.
      Type 1  Type 2  Type 3  Type 4  Type 5  Type 6  Type 7   Sum
      59.0    65.7    65.3    67.9    60.6    73.1    59.4
      62.3    62.8    63.7    67.4    65.0    71.9    61.6
      65.2    59.1    68.9    62.9    68.2    67.8    56.3
      65.5    60.2    70.0    61.7    66.0    67.4    62.7
Mean  63.0    62.0    67.0    65.0    65.0    70.1    60.0     451.9
Var.  9.19    8.66    8.80    9.82    10.20   8.27    7.97     62.9
• The grand mean x̿ is
x̿ = (Σᵢ₌₁ᵏ x̄ᵢ) / 𝑘 = 451.9 / 7 = 64.6
• The pooled standard deviation is
𝑠 = √(Σᵢ₌₁ᵏ 𝑠ᵢ² / 𝑘) = √(62.90 / 7) = 3.00
• The lower and upper decision lines (LDL and UDL) are
𝐿𝐷𝐿 = x̿ − ℎ𝛼 𝑠 √[(𝑘 − 1)/(𝑘𝑛)] = 64.6 − (2.94)(3.0)√[(7 − 1)/(7 · 4)] = 60.49
𝑈𝐷𝐿 = x̿ + ℎ𝛼 𝑠 √[(𝑘 − 1)/(𝑘𝑛)] = 68.65
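The decision-line arithmetic is easy to script (a sketch; ℎ𝛼 = 2.94 is read from the ANOM table as on the slide, and the unrounded group means and variances come from the data above):

```python
import math

# Unrounded group means and variances (7 diaphragm types, n = 4 each)
means = [63.0, 61.95, 66.975, 64.975, 64.95, 70.05, 60.0]
variances = [9.19, 8.66, 8.80, 9.82, 10.20, 8.27, 7.97]
k, n = len(means), 4
h_alpha = 2.94  # from the ANOM table for alpha = 0.05, k = 7 means, (n-1)k = 21 df

grand_mean = sum(means) / k                   # ~64.6
s_pooled = math.sqrt(sum(variances) / k)      # ~3.00
margin = h_alpha * s_pooled * math.sqrt((k - 1) / (k * n))
ldl, udl = grand_mean - margin, grand_mean + margin
# Only means outside (ldl, udl) differ significantly from the grand mean;
# here Type 6 (70.05) and Type 7 (60.0) fall outside the decision lines.
```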
24.12 Example 24.2:
ANOM of Injection-Molding Data
Minitab: Stat > ANOVA > Analysis of Means
24.13 Example 24.3:
Analysis of Means (ANOM)
Example 15.1
Example 22.4
24.14 Six Sigma Considerations*
• A comparison of the proportion of total variability of the
factor levels to the error term could be made in % units
using a controversial epsilon-squared relationship:
𝜀²factor levels = 100 × (𝑆𝑆factor levels / 𝑆𝑆total)
𝜀²error = 100 × (𝑆𝑆error / 𝑆𝑆total)
• Consider the situation in which a process was randomly
sampled using conventional, rational sampling practices.
Consider also that there were between 25 and 100 sets of
samples taken over time.
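With the sums of squares from the Example 24.1 ANOVA table (𝑆𝑆factor = 265.34, 𝑆𝑆error = 188.71, 𝑆𝑆total = 454.05), the epsilon-squared split is a two-line computation (illustrative sketch):

```python
ss_factor, ss_error, ss_total = 265.34, 188.71, 454.05  # from Example 24.1

eps_sq_factor = 100 * ss_factor / ss_total  # % of variability from factor levels
eps_sq_error = 100 * ss_error / ss_total    # % of variability from error
# eps_sq_factor ~ 58.4%, which is the R-Sq value Minitab reports;
# the two percentages sum to 100.
```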
• For this type of data, the sums of squares from an ANOVA
table can be used to break down total variability into 2 parts.
• The division of these sums of squares by the correct
number of degrees of freedom yields estimates for the
different source of variation (total, between subgroups, and
within subgroups).
• The estimator of total variability gives an estimate for LT
capability, while the estimator of within group variability
gives an estimate for ST capability.
• These concepts of variability can be used to represent the
influence of time on a process.
• The ST and LT standard deviation estimates from an
ANOVA table are
𝜎̂𝐿𝑇 = √[Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳ)² / (𝑛𝑎 − 1)]
𝜎̂𝑆𝑇 = √[Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳᵢ)² / (𝑎(𝑛 − 1))]
• These two estimators are useful in calculating the ST and
LT capability / performance of the process.
• The variable used to measure this capability / performance
is 𝑍. Short-term 𝑍 values for the process are
𝑍LSL,ST = (𝐿𝑆𝐿 − 𝑇) / 𝜎̂𝑆𝑇    𝑍USL,ST = (𝑈𝑆𝐿 − 𝑇) / 𝜎̂𝑆𝑇
• The nominal specification value 𝑇 is used because it
represents the potential capability of the process.
• Long-term 𝑍 values for the process are
𝑍LSL,LT = (𝐿𝑆𝐿 − 𝜇) / 𝜎̂𝐿𝑇    𝑍USL,LT = (𝑈𝑆𝐿 − 𝜇) / 𝜎̂𝐿𝑇
• Probability values can then be obtained from the normal
distribution for the different values of 𝑍.
• These probabilities correspond to the frequency of
occurrence beyond specification limits.
• Multiplication of these probabilities by one million gives
DPMO.
24.15 Example 24.4:
Determining Process Capability Using
One-Factor ANOVA
#   (five measurements per subgroup)   x̄     R
1 0.65 0.70 0.65 0.65 0.85 0.70 0.20
2 0.75 0.85 0.75 0.85 0.65 0.77 0.20
3 0.75 0.80 0.80 0.70 0.75 0.76 0.10
4 0.60 0.70 0.70 0.75 0.65 0.68 0.15
5 0.70 0.75 0.65 0.85 0.80 0.75 0.20
6 0.60 0.75 0.75 0.85 0.70 0.73 0.25
7 0.75 0.80 0.65 0.75 0.70 0.73 0.15
8 0.60 0.70 0.80 0.75 0.75 0.72 0.20
9 0.65 0.80 0.85 0.85 0.75 0.78 0.20
10 0.60 0.70 0.60 0.80 0.65 0.67 0.20
11 0.80 0.75 0.90 0.50 0.80 0.75 0.40
12 0.85 0.75 0.85 0.65 0.70 0.76 0.20
13 0.70 0.70 0.75 0.75 0.70 0.72 0.05
14 0.65 0.70 0.85 0.75 0.60 0.71 0.25
15 0.90 0.80 0.80 0.75 0.85 0.82 0.15
16 0.75 0.80 0.75 0.80 0.65 0.75 0.15
• Example 11.2 Process
capability / performance
metrics
• Example 22.3 Variance
components
24.15 Example 24.4:
Determining Process Capability Using
One-Factor ANOVA
One-way ANOVA: Data versus Subgroup
Source DF SS MS F P
Subgroup 15 0.10950 0.00730 1.12 0.360
Error 64 0.41800 0.00653
Total 79 0.52750
𝜎̂𝐿𝑇 = √[Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳ)² / (𝑛𝑎 − 1)] = √[0.52750 / (5(16) − 1)] = 0.081714
𝜎̂𝑆𝑇 = √[Σᵢ₌₁ᵃ Σⱼ₌₁ⁿ (𝑦ᵢⱼ − ȳᵢ)² / (𝑎(𝑛 − 1))] = √[0.41800 / (16(5 − 1))] = 0.080816
𝑍LSL,ST = (𝐿𝑆𝐿 − 𝑇) / 𝜎̂𝑆𝑇 = (0.5 − 0.7) / 0.080816 = −2.4747
𝑍USL,ST = (𝑈𝑆𝐿 − 𝑇) / 𝜎̂𝑆𝑇 = (0.9 − 0.7) / 0.080816 = 2.4747
𝑃(𝑍 > 𝑍USL,ST) = 𝑃(𝑍 > 2.4747) = 0.006667
𝑃(𝑍 < 𝑍LSL,ST) = 𝑃(𝑍 < −2.4747) = 0.006667
𝑝𝑆𝑇 = 0.006667 + 0.006667 = 0.013335

𝑍LSL,LT = (𝐿𝑆𝐿 − 𝜇) / 𝜎̂𝐿𝑇 = (0.5 − 0.7375) / 0.081714 = −2.9065
𝑍USL,LT = (𝑈𝑆𝐿 − 𝜇) / 𝜎̂𝐿𝑇 = (0.9 − 0.7375) / 0.081714 = 1.9886
𝑃(𝑍 > 𝑍USL,LT) = 𝑃(𝑍 > 1.9886) = 0.023373
𝑃(𝑍 < 𝑍LSL,LT) = 𝑃(𝑍 < −2.9065) = 0.001828
𝑝𝐿𝑇 = 0.023373 + 0.001828 = 0.025201
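The tail probabilities above can be reproduced with scipy's normal distribution (a sketch; USL = 0.9 is not stated explicitly on the slide but is implied by the Z values shown):

```python
from scipy.stats import norm

lsl, usl = 0.5, 0.9       # USL = 0.9 inferred from the Z values on the slide
target, mu = 0.7, 0.7375  # nominal T and grand average
sigma_st, sigma_lt = 0.080816, 0.081714  # from the ANOVA-based estimates

# Short-term: distances measured from the nominal target T
p_st = norm.cdf((lsl - target) / sigma_st) + norm.sf((usl - target) / sigma_st)
# Long-term: distances measured from the process mean mu
p_lt = norm.cdf((lsl - mu) / sigma_lt) + norm.sf((usl - mu) / sigma_lt)

dpmo_st, dpmo_lt = p_st * 1e6, p_lt * 1e6  # defects per million opportunities
```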
24.16 Non-Parametric Estimate:
Mann-Whitney Test Procedure
• When conducting the 2-sample t-test to compare the
averages of two groups, the data in both groups must be
sampled from normally distributed populations. If that
assumption does not hold, the nonparametric Mann-Whitney
test is a better safeguard against drawing wrong conclusions.
• The Mann-Whitney test compares the medians of two
populations and works when the Y variable is continuous,
discrete-ordinal, or discrete-count, and the X variable is
discrete with two attributes. Of course, the Mann-Whitney
test can also be used for normally distributed data, but in that
case it is less powerful than the 2-sample t-test.
http://www.isixsigma.com/tools-templates/hypothesis-
testing/making-sense-mann-whitney-test-median-comparison/
Testing Hypothesis on the Difference
of Medians – Mann Whitney Test
• Null hypothesis 𝐻0: 𝜂1 = 𝜂2
• Test statistic: 𝑈 = 𝑆1 − 𝑛1(𝑛1 + 1)/2, where 𝑆1 is the sum of
the ranks of sample 1
• 𝐻𝑎: 𝜂1 < 𝜂2
  • Critical value 𝑈𝑐 (from Table, with 𝑛1, 𝑛2, 𝛼)
  • Reject 𝐻0 if 𝑈 < 𝑈𝑐
• 𝐻𝑎: 𝜂1 > 𝜂2
  • Find 𝑈′ (from Table, with 𝑛1, 𝑛2, 𝛼); 𝑈𝑐 = 𝑛1𝑛2 − 𝑈′
  • Reject 𝐻0 if 𝑈 > 𝑈𝑐
• 𝐻𝑎: 𝜂1 ≠ 𝜂2
  • Find 𝑈lower (from Table, with 𝑛1, 𝑛2, 𝛼/2); 𝑈upper = 𝑛1𝑛2 − 𝑈lower
  • Reject 𝐻0 if 𝑈 < 𝑈lower or 𝑈 > 𝑈upper
Testing Hypothesis on the Difference
of Medians – Mann Whitney Test
http://www.lesn.appstate.edu/olson/stat_directory/Statistical%20procedu
res/Mann_Whitney%20U%20Test/Mann-Whitney%20Table.pdf
Two-tailed Test
Testing Hypothesis on the Difference
of Medians – Mann Whitney Test
http://www.lesn.appstate.edu/olson/stat_directory/Statistical%20procedu
res/Mann_Whitney%20U%20Test/Mann-Whitney%20Table.pdf
One-tailed Test
Testing Hypothesis on the
Difference of Medians – Example
• 𝛼 = .05
• Null hypothesis 𝐻0: 𝜂1 = 𝜂2
• Test statistic:
𝑈 = 𝑆1 − 𝑛1(𝑛1 + 1)/2 = 29.5 − 5(5 + 1)/2 = 14.5
• 𝐻𝑎: 𝜂1 ≠ 𝜂2
• 𝑈lower = 3 (Table VI, with 𝑛1 = 5, 𝑛2 = 6, 𝛼/2 = .025);
𝑈upper = 𝑛1𝑛2 − 𝑈lower = (5)(6) − 3 = 27
• Reject 𝐻0 if 𝑈 < 𝑈lower or 𝑈 > 𝑈upper
• Since 3 ≤ 14.5 ≤ 27, fail to reject 𝐻0
Model A: 9.5  10.8  12.9  16.0  9.7
Model B: 12.9  11.1  13.0  16.4  8.9  4.6

X     Model  Rank
4.6   B      1
8.9   B      2
9.5   A      3
9.7   A      4
10.8  A      5
11.1  B      6
12.9  A      7.5
12.9  B      7.5
13.0  B      9
16.0  A      10
16.4  B      11
Sum of ranks: 𝑆1 (Model A) = 29.5, 𝑆2 (Model B) = 36.5
Testing Hypothesis on the
Difference of Medians – Example
Mann-Whitney Test and CI: Model A, Model B
N Median
Model A 5 10.800
Model B 6 12.000
Point estimate for ETA1-ETA2 is -0.050
96.4 Percent CI for ETA1-ETA2 is (-3.502,6.202)
W = 29.5
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 1.0000
The test is significant at 1.0000 (adjusted for ties)
Minitab: Stat > Nonparametrics > Mann-Whitney
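The hand calculation can be checked against scipy's implementation (a sketch, assuming scipy):

```python
from scipy.stats import mannwhitneyu

model_a = [9.5, 10.8, 12.9, 16.0, 9.7]
model_b = [12.9, 11.1, 13.0, 16.4, 8.9, 4.6]

res = mannwhitneyu(model_a, model_b, alternative="two-sided")
# res.statistic is U = S1 - n1(n1 + 1)/2 = 14.5, matching the hand
# calculation; the p-value near 1 matches Minitab's "significant at 1.0000",
# so H0 is not rejected.
```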
24.16 Non-Parametric Estimate:
Kruskal-Wallis Test
• A Kruskal-Wallis test provides an alternative to a one-way
ANOVA. This test is a generalization of the Mann-Whitney test.
• The null hypothesis is that all medians are equal. The
alternative hypothesis is that the medians are not all equal.
• For this test, it is assumed that the independent random
samples taken from the different populations have continuous
distributions with the same shape.
• For many distributions, the Kruskal-Wallis test is more
powerful than Mood's median test, but it is less robust
against outliers.
24.17 Example 24.5:
Nonparametric Kruskal-Wallis Test
• The yield per acre for 4 methods of growing corn
Method 1 Method 2 Method 3 Method 4
83 91 101 78
91 90 100 82
94 81 91 81
89 83 93 77
89 84 96 79
96 83 95 81
91 88 94 80
92 91 81
90 89
84
24.17 Example 24.5:
Nonparametric Kruskal-Wallis Test
Kruskal-Wallis Test: Yield versus Method
Kruskal-Wallis Test on Yield
Method N Median Ave Rank Z
1 9 91.00 21.8 1.52
2 10 86.00 15.3 -0.83
3 7 95.00 29.6 3.60
4 8 80.50 4.8 -4.12
Overall 34 17.5
H = 25.46 DF = 3 P = 0.000
H = 25.63 DF = 3 P = 0.000 (adjusted for ties)
Minitab: Stat > Nonparametrics > Kruskal-Wallis
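The same test is available in scipy (a sketch; scipy applies the tie correction, so its H corresponds to Minitab's adjusted value):

```python
from scipy.stats import kruskal

# Yield per acre for the 4 corn-growing methods (Example 24.5)
method_1 = [83, 91, 94, 89, 89, 96, 91, 92, 90]
method_2 = [91, 90, 81, 83, 84, 83, 88, 91, 89, 84]
method_3 = [101, 100, 91, 93, 96, 95, 94]
method_4 = [78, 82, 81, 77, 79, 81, 80, 81]

h_stat, p_value = kruskal(method_1, method_2, method_3, method_4)
# h_stat matches Minitab's tie-adjusted H = 25.63 (P = 0.000);
# the tiny p-value says the medians are not all equal.
```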
24.18 Non-Parametric Estimate:
Mood’s Median Test
• Like the Kruskal-Wallis test, a Mood’s median test (also
called a median test or sign scores test) is a nonparametric
alternative to ANOVA.
• In this chi-square test, the null hypothesis is that the
population medians are equal. The alternative hypothesis is
that the medians are not all equal.
• For this test, it is assumed that the independent random
samples taken from the different populations have continuous
distributions with the same shape.
• Mood’s median test is more robust to outliers than the
Kruskal-Wallis test.
24.19 Example 24.6:
Nonparametric Mood’s Median Test
Minitab: Stat > Nonparametrics > Mood’s Median Test
Mood Median Test: Yield versus Method
Mood median test for Yield
Chi-Square = 17.54 DF = 3 P = 0.001
                           Individual 95.0% CIs
Method N<= N> Median Q3-Q1 ---------+---------+---------+-------
1      3   6  91.0   4.0            (--*---)
2      7   3  86.0   7.3      (---*-----)
3      0   7  95.0   7.0               (---*------)
4      8   0  80.5   2.8   (---*)
                           ---------+---------+---------+-------
                               84.0      91.0      98.0
Overall median = 89.0
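scipy also implements Mood's median test (a sketch; `ties="below"` counts observations equal to the grand median as "below", matching Minitab's N<= column):

```python
from scipy.stats import median_test

# Yield per acre for the 4 corn-growing methods (Example 24.5)
method_1 = [83, 91, 94, 89, 89, 96, 91, 92, 90]
method_2 = [91, 90, 81, 83, 84, 83, 88, 91, 89, 84]
method_3 = [101, 100, 91, 93, 96, 95, 94]
method_4 = [78, 82, 81, 77, 79, 81, 80, 81]

# ties="below" groups values equal to the grand median with the "below"
# counts, matching Minitab's N<= / N> split
stat, p_value, grand_median, table = median_test(
    method_1, method_2, method_3, method_4, ties="below")
# stat ~ 17.54 with P ~ 0.001, matching the Minitab output above;
# table[0] holds the per-method counts above the grand median: [6, 3, 7, 0]
```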
24.20 Other Considerations
• Variability in an experiment can be caused by nuisance
factors in which we have no interest.
• These nuisance factors are sometimes unknown and not
controlled. Randomization guards against this type of factor
affecting results.
• In another situation, the nuisance factor is known but not
controlled. If the value the factor takes on can be observed,
it can be compensated for by using analysis-of-covariance
techniques.
• In yet another situation, the nuisance factor is both known
and controllable. We can systematically eliminate its effect
on comparisons among factor levels (i.e., treatments) by
using a randomized block design.
• Experiment results can often be improved dramatically
through the wise management of nuisance factors.
• Statistical software can offer blocking and covariance
analysis options.