Overview of nonparametric methods. Parametric methods. Normality.
barbara-sullivan -
Parametric methods. Problems in measurement (scale of measurement)
Stanley Smith Stevens' typology
1. Nominal
   Logical/math operations allowed: =/≠
   Examples: Dichotomous: Gender (male vs. female). Non-dichotomous: Nationality (American/Chinese/etc.)
   Measure of central tendency: Mode
   Qualitative or quantitative: Qualitative
2. Ordinal
   Logical/math operations allowed: =/≠ ; </>
   Examples: Dichotomous: Health (healthy vs. sick), Truth (true vs. false), Beauty (beautiful vs. ugly). Non-dichotomous: Opinion ('completely agree'/ 'mostly agree'/ 'mostly disagree'/ 'completely disagree')
   Measure of central tendency: Median
   Qualitative or quantitative: Qualitative
3. Interval
   Logical/math operations allowed: =/≠ ; </> ; +/−
   Examples: Date (from 9999 BC to 2013 AD), Latitude (from +90° to −90°)
   Measure of central tendency: Arithmetic mean
   Qualitative or quantitative: Quantitative
4. Ratio
   Logical/math operations allowed: =/≠ ; </> ; +/− ; ×/÷
   Examples: Age (from 0 to 99 years)
   Measure of central tendency: Geometric mean
   Qualitative or quantitative: Quantitative
Nonparametric methods
• Descriptive statistics
• Tests of differences between groups (independent samples)
• Tests of differences between variables (dependent samples)
• Tests of relationships between variables
Nonparametric methods: Descriptive statistics
Geometric mean: G = (x_1 x_2 \cdots x_n)^{1/n}
Equivalently: \log G = \frac{1}{n} \sum_{i=1}^{n} \log x_i
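The log form of the geometric mean is how it is usually computed in practice, since multiplying many values directly can overflow. The following is a minimal sketch of that identity; the function name `geometric_mean` and the sample values are illustrative assumptions, not part of the slides.

```python
import math

def geometric_mean(values):
    """Geometric mean via the identity log(G) = sum(log(x_i)) / n."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical data, chosen so the result is easy to check by hand:
# (1 * 3 * 9) ** (1/3) = 27 ** (1/3) = 3
sample = [1.0, 3.0, 9.0]
g = geometric_mean(sample)
print(round(g, 6))
```

The input check reflects the fact that logarithms, and hence this form of the geometric mean, are only defined for positive values.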
Nonparametric methods: Differences between dependent groups
2 variables
• Sign test
• Wilcoxon's matched pairs test
More than 2 variables
• Cochran's Q test
Nonparametric methods: Sign test
Peanut Butter Taste Test
As part of a market research study, a sample of 36 consumers was asked to taste two brands of peanut butter and indicate a preference. Do the data shown below indicate a significant difference in consumer preferences for the two brands?
• 18 preferred Hoppy Peanut Butter
• 12 preferred Pokey Peanut Butter
• 6 had no preference
The 6 consumers with no preference are dropped, so the analysis is based on a sample size of 18 + 12 = 30.
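Let p = Pr(X > Y), and let W be the number of pairs for which y_i − x_i > 0.
H0: p = 0.50
Under H0, the number of consumers preferring Hoppy has mean 30(0.50) = 15 and standard deviation \sqrt{30(0.50)(0.50)} ≈ 2.74.
Reject H0 if z < -1.96 or z > 1.96 (two-sided test at the 0.05 level).
z = (18 - 15)/2.74 = 3/2.74 = 1.095
Since -1.96 < 1.095 < 1.96, H0 is not rejected: the data do not show a significant difference in preference between the two brands.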
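The sign-test arithmetic for the peanut butter data can be reproduced in a few lines. This is a sketch using the normal approximation to the binomial, the same approximation the z value on the slide relies on; the variable names are illustrative, and the two-sided p-value is an addition that the slides do not report.

```python
import math

prefer_hoppy, prefer_pokey = 18, 12   # counts from the taste test; the 6 "no preference" answers are dropped
n = prefer_hoppy + prefer_pokey       # effective sample size, 30

# Under H0: p = 0.50 the count of "Hoppy" answers is Binomial(n, 0.5)
mean = n * 0.5
sd = math.sqrt(n * 0.5 * 0.5)

z = (prefer_hoppy - mean) / sd
# Two-sided p-value under the standard normal: P(|Z| >= |z|) (not on the slides)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

reject_h0 = abs(z) > 1.96             # decision rule from the slide
print(round(z, 3), reject_h0)
```

With only 30 usable answers, an exact binomial calculation would be a reasonable alternative to the normal approximation.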
Nonparametric methods: Wilcoxon Signed-Rank Test
District Office   Overnight   NiteFlite
Seattle           32 hrs.     25 hrs.
Los Angeles       30          24
Boston            19          15
Cleveland         16          15
New York          15          13
Houston           18          15
Atlanta           14          15
St. Louis         10          8
Milwaukee         7           9
Denver            16          11
District Office   Difference   Rank of |Diff.|   Signed Rank
Seattle           7            10                +10
Los Angeles       6            9                 +9
Boston            4            7                 +7
Cleveland         1            1.5               +1.5
New York          2            4                 +4
Houston           3            6                 +6
Atlanta           -1           1.5               -1.5
St. Louis         2            4                 +4
Milwaukee         -2           4                 -4
Denver            5            8                 +8
Sum of signed ranks                              +44
• Compute the differences between the paired observations.
• Discard any differences of zero.
• Rank the absolute values of the differences from lowest to highest. Tied differences are assigned the average ranking of their positions.
• Give the ranks the sign of the original difference in the data.
• Sum the signed ranks.
. . . next we will determine whether the sum is significantly different from zero.
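The ranking steps above can be sketched as follows, using the district office data from the slides. The helper `average_ranks` is a hypothetical name introduced here. The final z value uses a large-sample normal approximation with standard deviation \sqrt{n(n+1)(2n+1)/6}, which is an assumption on my part: the slides do not say how the sum is tested, and with n = 10 an exact table may be preferred.

```python
import math

overnight = [32, 30, 19, 16, 15, 18, 14, 10, 7, 16]
niteflite = [25, 24, 15, 15, 13, 15, 15, 8, 9, 11]

# Steps 1-2: paired differences, discarding zeros
diffs = [a - b for a, b in zip(overnight, niteflite) if a != b]

def average_ranks(values):
    """Rank from lowest to highest; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Steps 3-5: rank |d|, restore the signs, sum
ranks = average_ranks([abs(d) for d in diffs])
signed_ranks = [math.copysign(r, d) for r, d in zip(ranks, diffs)]
t = sum(signed_ranks)
print(t)  # 44.0, matching the +44 in the table

# Assumed normal approximation for the signed-rank sum under H0 (mean 0)
n = len(diffs)
sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 6)
z = t / sigma
print(round(z, 2))
```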
Nonparametric methods: Differences between independent groups
2 variables
• Mann-Whitney U test
• Wald-Wolfowitz runs test
More than 2 variables
• Kruskal-Wallis analysis of ranks
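Nonparametric methods: Mann-Whitney U test
• First, rank the combined data from the lowest to the highest values, with tied values being assigned the average of the tied rankings.
• Then, compute R1 and R2, the sums of the ranks for each sample, and from them U1 = R1 - n1(n1 + 1)/2 and U2 = R2 - n2(n2 + 1)/2.
• The smaller of U1 and U2 is the value used when consulting significance tables.
• If U ≤ the table value, reject H0 in favor of H1; otherwise retain H0.
U is just the number of wins out of all n1·n2 pairwise contests between an observation from one sample and an observation from the other, with a tie counting as half a win.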
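To illustrate why U can be read as a count of pairwise wins, the sketch below computes U1 two ways: from the rank sum, and by counting contests directly. The two samples are hypothetical, and the function names are introduced here only for illustration.

```python
def average_ranks(values):
    """Rank from lowest to highest; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(sample1, sample2):
    """Return (U1, U2) computed from the rank sum of sample1 in the combined data."""
    n1, n2 = len(sample1), len(sample2)
    ranks = average_ranks(list(sample1) + list(sample2))
    r1 = sum(ranks[:n1])               # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1                  # U1 + U2 always equals n1 * n2
    return u1, u2

def pairwise_wins(sample1, sample2):
    """Count contests won by sample1; a tie counts as half a win."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in sample1 for b in sample2)

# Hypothetical data, not from the slides
group_a = [12, 15, 9, 20, 17]
group_b = [8, 11, 15, 7]

u1, u2 = mann_whitney_u(group_a, group_b)
u = min(u1, u2)                        # value to compare with the table
print(u1, u2, u)
print(pairwise_wins(group_a, group_b) == u1)
```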
Nonparametric methods: Tests of relationships between variables
• Spearman, Kendall tau, Gamma correlation coefficients
• 2x2 Tables
Nonparametric methods: Spearman correlation
u_i = rank of item i with respect to one variable
v_i = rank of item i with respect to a second variable
d_i = u_i - v_i
r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 1 - \frac{(6)(32.5)}{(10)(99)} = 0.803
Car   Age (months) X_i   Minimal stopping dist. at 40 kph (metres) Y_i   Age rank (u_i)   Stopping rank (v_i)   Difference of the ranks (d_i = u_i - v_i)
A     9                  28.4                                            1                1                     0
B     15                 29.3                                            2                2                     0
C     24                 37.6                                            3                7                     -4
D     30                 36.2                                            4                4.5                   -0.5
E     38                 36.5                                            5                6                     -1
F     46                 35.3                                            6                3                     3
G     53                 36.2                                            7                4.5                   2.5
H     60                 44.1                                            8                8                     0
I     64                 44.8                                            9                9                     0
J     76                 47.2                                            10               10                    0
\sum d_i^2 = 32.5
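The ranks, the sum of squared rank differences, and the coefficient for the car data can be recomputed with a short script. This is a sketch that applies the rank-difference formula directly; `average_ranks` is a hypothetical helper introduced here to handle the tie between cars D and G.

```python
def average_ranks(values):
    """Rank from lowest to highest; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Car data from the table (cars A to J)
age_months = [9, 15, 24, 30, 38, 46, 53, 60, 64, 76]
stopping_m = [28.4, 29.3, 37.6, 36.2, 36.5, 35.3, 36.2, 44.1, 44.8, 47.2]

u = average_ranks(age_months)          # age ranks u_i
v = average_ranks(stopping_m)          # stopping-distance ranks v_i
d = [ui - vi for ui, vi in zip(u, v)]  # rank differences d_i

n = len(d)
sum_d2 = sum(di ** 2 for di in d)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2)          # 32.5
print(round(r_s, 3))   # 0.803
```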
Nonparametric methods: Classification and clustering
[Figure: Tree diagram for 22 variables. Single linkage, Euclidean distances. Horizontal axis: (Dlink/Dmax)*100, from 0 to 120. The variables are commodity prices in nominal US dollars: Copper, Tobacco, Aluminum, Platinum, Gold, Wheat (Canada), Rice (Thailand, 5%), Soybeans, Logs (Cameroon), Maize, Silver, Natural gas LNG, Natural gas (Europe), Sugar (world), Coffee (Arabica), Tea (Colombo auctions), Cotton (A Index), Meat (chicken), Meat (beef), Iron ore, Coal (Australia), Crude oil (Brent).]
Nonparametric methods: 2x2 Tables
Suppose that you are considering whether to introduce a new formula for a successful soft drink. Before finally deciding on the new formula, you conduct a survey in which you ask male and female respondents to express their preference for either the old or new soft drink. Assume that out of 50 males, 41 prefer the new formula over the old formula; out of 50 females, only 27 prefer the new formula.
            Prefer new   Prefer old
Males       41           9
Females     27           23
Chi-square (df=1)              9.01     p = .0027
V-square (df=1)                8.92     p = .0028
Yates corrected Chi-square     7.77     p = .0053
Phi-square                     .09007
Fisher exact p, one-tailed              p = .0025
Fisher exact p, two-tailed              p = .0049
McNemar Chi-square (A/D)       4.52     p = .0336
McNemar Chi-square (B/C)       8.03     p = .0046
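Most of the statistics listed for the soft drink table follow from the four cell counts. The sketch below recomputes them with the standard 2x2 formulas. Two points are assumptions rather than something the slides state: V-square is taken to be chi-square scaled by (N - 1)/N, which reproduces the listed 8.92, and the McNemar values use the continuity-corrected form (|x - y| - 1)^2 / (x + y), which reproduces 4.52 and 8.03. The Fisher exact p-values are not recomputed here.

```python
import math

# Cell counts: A, B = males (new, old); C, D = females (new, old)
a, b = 41, 9
c, d = 27, 23
n = a + b + c + d

def chi2_p_value_df1(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

margins = (a + b) * (c + d) * (a + c) * (b + d)   # product of row and column totals

chi2 = n * (a * d - b * c) ** 2 / margins
yates = n * (abs(a * d - b * c) - n / 2) ** 2 / margins
phi2 = chi2 / n
v2 = chi2 * (n - 1) / n                           # assumed definition of V-square

mcnemar_ad = (abs(a - d) - 1) ** 2 / (a + d)      # assumed continuity-corrected form
mcnemar_bc = (abs(b - c) - 1) ** 2 / (b + c)

for name, stat in [("Chi-square", chi2), ("V-square", v2), ("Yates", yates),
                   ("McNemar A/D", mcnemar_ad), ("McNemar B/C", mcnemar_bc)]:
    print(f"{name}: {stat:.2f}  p = {chi2_p_value_df1(stat):.4f}")
print(f"Phi-square: {phi2:.5f}")
```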