Modified by ARQ, from © 2002 Prentice-Hall.Chap 3-1 Numerical Descriptive Measures Chapter 2 ...
-
Upload
gwendoline-lambert -
Category
Documents
-
view
218 -
download
0
Transcript of Modified by ARQ, from © 2002 Prentice-Hall.Chap 3-1 Numerical Descriptive Measures Chapter 2 ...
Modified by ARQ, from © 2002 Prentice-Hall. Chap 3-1
Numerical Descriptive Measures
Chapter 2http://www2.uta.edu/infosys/amer/
courses/3321%20ppts/c3.ppt
Modified: 2002 Prentice-Hall, Inc. Chap 3-2
Chapter Topics Measures of central tendency
Mean, median, mode, midrange
Quartiles Measure of variation
Range, interquartile range, variance and Standard deviation, coefficient of variation
Shape Symmetric, skewed (+/-)
Coefficient of correlation
Modified: 2002 Prentice-Hall, Inc. Chap 3-3
Summary Measures
Central Tendency
Arithmetic Mean
Median Mode
Quartile
Geometric Mean
Summary Measures
Variation
Variance
Standard Deviation
Coefficient of Variation
Range
Modified: 2002 Prentice-Hall, Inc. Chap 3-4
Measures of Central Tendency
Central Tendency
Average (Mean) Median Mode
1
1
n
ii
N
ii
XX
n
X
N
Modified: 2002 Prentice-Hall, Inc. Chap 3-5
Mean (Arithmetic Mean) Mean (arithmetic mean) of data values
Sample mean
Population mean
1 1 2
n
ii n
XX X X
Xn n
1 1 2
N
ii N
XX X X
N N
Sample Size
Population Size
Modified: 2002 Prentice-Hall, Inc. Chap 3-6
Mean (Arithmetic Mean)
The most common measure of central tendency
Affected by extreme values (outliers)
(continued)
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 12 14
Mean = 5 Mean = 6
Modified: 2002 Prentice-Hall, Inc. Chap 3-7
Median Robust measure of central tendency Not affected by extreme values
In an Ordered array, median is the “middle” number If n or N is odd, median is the middle number If n or N is even, median is the average of
the two middle numbers
0 1 2 3 4 5 6 7 8 9 10 0 1 2 3 4 5 6 7 8 9 10 12 14
Median = 5 Median = 5
Modified: 2002 Prentice-Hall, Inc. Chap 3-8
Mode A measure of central tendency Value that occurs most often Not affected by extreme values Used for either numerical or categorical
data There may may be no mode There may be several modes
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Mode = 9
0 1 2 3 4 5 6
No Mode
Modified: 2002 Prentice-Hall, Inc. Chap 3-9
Quartiles Split Ordered Data into 4 Quarters
Position of i-th Quartile
and Are Measures of Noncentral Location = Median, A Measure of Central Tendency
25% 25% 25% 25%
1Q 2Q 3Q
Data in Ordered Array: 11 12 13 16 16 17 18 21 22
1 1
1 9 1 12 13Position of 2.5 12.5
4 2Q Q
1Q 3Q
2Q
1
4i
i nQ
Modified: 2002 Prentice-Hall, Inc. Chap 3-10
Measures of Variation
Variation
Variance Standard Deviation Coefficient of Variation
PopulationVariance (σ2)
Sample
Variance (S2)
Population Standard deviation (σ)
Sample Standard deviation (S)
Range
Inter-quartile Range
Modified: 2002 Prentice-Hall, Inc. Chap 3-11
Range
Measure of variation Difference between the largest and the
smallest observations:
Ignores the way in which data are distributed
Largest SmallestRange X X
7 8 9 10 11 12
Range = 12 - 7 = 5
7 8 9 10 11 12
Range = 12 - 7 = 5
Modified: 2002 Prentice-Hall, Inc. Chap 3-12
Measure of variation Also known as midspread
Spread in the middle 50% Difference between the first and third
quartiles
Not affected by extreme values
3 1Interquartile Range 17.5 12.5 5Q Q
Interquartile Range
Data in Ordered Array: 11 12 13 16 16 17 17 18 21
Modified: 2002 Prentice-Hall, Inc. Chap 3-13
2
2 1
N
ii
X
N
Important measure of variation Shows variation about the mean
Sample variance:
Population variance:
2
2 1
1
n
ii
X XS
n
Variance
Modified: 2002 Prentice-Hall, Inc. Chap 3-14
Standard Deviation Most important measure of variation Shows variation about the mean Has the same units as the original data
Sample standard deviation:
Population standard deviation:
2
1
1
n
ii
X XS
n
2
1
N
ii
X
N
Modified: 2002 Prentice-Hall, Inc. Chap 3-15
Comparing Standard Deviations
Mean = 15.5 S = 3.338 11 12 13 14 15 16 17 18 19 20 21
11 12 13 14 15 16 17 18 19 20 21
Data B
Data A
Mean = 15.5 S = .9258
11 12 13 14 15 16 17 18 19 20 21
Mean = 15.5 S = 4.57
Data C
Modified: 2002 Prentice-Hall, Inc. Chap 3-16
Coefficient of Variation
Measures relative variation Always in percentage (%) Shows variation relative to mean Is used to compare two or more sets of
data measured in different units
100%S
CVX
Modified: 2002 Prentice-Hall, Inc. Chap 3-17
Comparing Coefficient of Variation
Stock A: Average price last year = $50 Standard deviation = $5
Stock B: Average price last year = $100 Standard deviation = $5
Coefficient of variation: Stock A:
Stock B:
$5100% 100% 10%
$50
SCV
X
$5100% 100% 5%
$100
SCV
X
Modified: 2002 Prentice-Hall, Inc. Chap 3-18
Shape of a Distribution
Describes how data is distributed Measures of shape
Symmetric or Skewed
Mean = Median =Mode Mean < Median < Mode Mode < Median < Mean
(+) Right-Skewed(-) Left-Skewed Symmetric
Modified: 2002 Prentice-Hall, Inc. Chap 3-19
Coefficient of Correlation
Measures the strength of the linear relationship between two quantitative variables
1
2 2
1 1
n
i ii
n n
i ii i
X X Y Yr
X X Y Y
Modified: 2002 Prentice-Hall, Inc. Chap 3-20
Features of Correlation Coefficient
Unit free Ranges between –1 and 1 The closer to –1, the stronger the negative
linear relationship The closer to 1, the stronger the positive linear
relationship The closer to 0, the weaker any positive linear
relationship
Modified: 2002 Prentice-Hall, Inc. Chap 3-21
Scatter Plots of Data with Various Correlation
CoefficientsY
X
Y
X
Y
X
Y
X
Y
X
r = -1 r = -.6 r = 0
r = .6 r = 1
Modified: 2002 Prentice-Hall, Inc. Chap 3-22
Chapter Summary Described measures of central tendency
Mean, median, mode, midrange
Discussed quartile Described measure of variation
Range, interquartile range, variance and standard deviation, coefficient of variation
Illustrated shape of distribution Symmetric, Skewed
Discussed correlation coefficient
Modified: 2002 Prentice-Hall, Inc. Chap 3-23
Stem-n-leaf plot A class took a test. The students' got the following scores. 94 85 74 85 77 100 85 95 98 95 First let us draw the stem and leaf plot
This makes it easy to see that the mode, the most common
score is an 85 since there are three of those scores and all of the other scores have frequencies of either one or tow.
It will also make it easier to rank the scores
Modified: 2002 Prentice-Hall, Inc. Chap 3-24
This will make it easier to find the median and the quartiles. There are as many scores above the median as below. Since there are ten scores, and ten is an even number, we can divide the scores into two equal groups with no scores left over. To find the median, we divide the scores up into the upper five and the lower five.