Describing Distributions with Numbers BPS chapter 2 © 2006 W.H. Freeman and Company.
The Practice of Statistics Third Edition Chapter 1: Exploring Data 1.2 Describing Distributions with...
The Practice of StatisticsThird Edition
Chapter 1:Exploring Data
1.2 Describing Distributions with Numbers
Copyright © 2008 by W. H. Freeman & Company
Daniel S. Yates
Objectives for 1.2
• Given a data set, how do you compute the mean, median, quartiles, and the five-number summary?
• How do you construct a box plot using the five-number summary?
• How do you compute the interquartile range?
• How do you identify an outlier using the interquartile range rule?
• How do you compute the standard deviation and variance?
Measures of the Center of a Distribution
The Mean of a Data Set
• So far, we know several measures of central tendency of a set of numbers: mean, median, and mode.
• The mean is the arithmetic average of the data set.
The Mean of a Data Set: “Average Value”
$\bar{x} = \frac{\sum x_i}{n}$
• Σ (sigma) means to add up all the data values to get a total.
• Take the total and divide by the number of data values.
Example - Mean
• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.
• Find the mean.
• Answer: 85.
• Using the calculator: STAT, EDIT, enter the data in L1; then 2nd STAT, MATH, mean(L1), ENTER.
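The same calculation can be sketched off the calculator. This is a minimal Python illustration, not part of the original slides; the variable names are chosen here for illustration only.

```python
# Joey's 14 quiz grades, copied from the example above.
grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

total = sum(grades)   # the "sigma" step: add up all the data values
n = len(grades)       # the number of data values
mean = total / n      # divide the total by the count

print(total, n, mean)  # 1190 14 85.0
```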
The Median of the Data Set
• Median is the center of the data set.
• Half of the data set is above the median and half is below it. The median is the 50th percentile.
• The median may or may not be in the data set.
Calculation for Median: “Middle Value”
Example - Median
• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.
• Find the median.
• Answer: 85.
• Using the calculator: STAT, EDIT, enter the data in L1; then 2nd STAT, MATH, median(L1), ENTER.
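A short Python sketch of the "middle value" calculation, assuming the usual rule: sort the data, take the middle value when the count is odd, and average the two middle values when the count is even. The function name `median` is defined here for illustration.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

# With 14 values, the two middle values are 84 and 86.
print(median(grades))  # 85.0
```

Here the median, 85, is not one of Joey's grades, which shows that the median need not be in the data set.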
Terminology: “A measure is resistant”
• A resistant measure does not respond strongly to the influence of outliers (extreme observations).
• Furthermore, a resistant measure does not respond strongly to changes in a few observations.
Are mean and median resistant?
Mean and Median Applet
Mean vs Median
• Mean is not a resistant measure.
  – It is sensitive to the influence of a few extreme observations (outliers).
  – It is sensitive to skewed distributions. The mean is pulled towards the tail.
• Median is resistant.
  – It is resistant to extreme values and skewed distributions.
• For skewed distributions the median is the better measure for center.
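The difference in resistance can be checked numerically. The sketch below uses Joey's quiz grades and a hypothetical data-entry error (98 recorded as 980); the error is invented here for illustration and is not in the original data.

```python
import statistics

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

# Hypothetical data-entry error: the top score 98 is recorded as 980.
with_outlier = [980 if x == 98 else x for x in grades]

mean_before = statistics.mean(grades)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(grades)
median_after = statistics.median(with_outlier)

# The mean jumps from 85 to 148; the median stays at 85.
print(mean_before, mean_after)
print(median_before, median_after)
```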
Measures of Spread
• Range
• Quartiles
• Five-Number Summary
• The Standard Deviation
Range
• The difference between the largest value and the smallest value.
• Gives the full spread of the data.
• But it is strongly affected by outliers, because it uses only the two most extreme values.
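A minimal Python sketch of the range, using Joey's quiz grades from the earlier examples. The extra score of 20 is a hypothetical value added here only to show how one extreme observation changes the range.

```python
grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

data_range = max(grades) - min(grades)   # 98 - 74
print(data_range)  # 24

# Hypothetical extra low score, to show the effect of one extreme value.
with_low_score = grades + [20]
range_with_low_score = max(with_low_score) - min(with_low_score)   # 98 - 20
print(range_with_low_score)  # 78
```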
Quartiles
• We can describe the spread (variability) of a distribution by giving several percentiles. The pth percentile of a distribution is the value with p percent of the observations at or below it.
• Typically we use the 25th, 50th, and 75th percentiles.
• These are Q1, the median, and Q3.
Example
• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.
• Find Q1, median, and Q3.
• Answer: Q1 = 78, Median = 85, Q3 = 91
• Using the calculator: STAT, CALC, 1-Var Stats L1, ENTER
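The quartiles can also be sketched in Python. This version assumes the convention that Q1 is the median of the observations below the median's position and Q3 is the median of those above it; other software may use a different percentile convention and give slightly different values. The function names are chosen here for illustration.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                    # values below the median position
    upper = ordered[half + 1:] if n % 2 else ordered[half:]   # values above it
    return median(lower), median(ordered), median(upper)

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
q1, med, q3 = quartiles(grades)
print(q1, med, q3)  # 78 85.0 91
```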
Five Number Summary
Using the calculator, we again use 1-Var Stats.
Five Number Summary Computer Software Output
Graphical Display of 5 Number Summary
Example - Boxplot
• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.
• Answer: Min = 74, Q1 = 78, Median = 85, Q3 = 91, Max = 98
• Using the calculator: STAT PLOT, make appropriate selections on the menu, ZOOM, 9:Zoom Stat
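The five numbers that the boxplot displays can be computed with a short Python sketch. It assumes the median-of-halves convention for the quartiles; the helper name `five_number_summary` is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                    # values below the median position
    upper = ordered[half + 1:] if n % 2 else ordered[half:]   # values above it
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
summary = five_number_summary(grades)
print(summary)  # minimum, Q1, median, Q3, maximum
```

In a boxplot, the box runs from Q1 to Q3 with a line at the median.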
Interquartile Range
Identifying Outliers
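A minimal sketch of the interquartile range and the outlier check, assuming the usual 1.5 × IQR rule: an observation is flagged if it falls more than 1.5 × IQR below Q1 or above Q3. The quartiles are taken from the earlier example; the variable names are chosen here for illustration.

```python
grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

q1, q3 = 78, 91                    # quartiles from the earlier example
iqr = q3 - q1                      # interquartile range: 91 - 78

lower_fence = q1 - 1.5 * iqr       # values below this are flagged
upper_fence = q3 + 1.5 * iqr       # values above this are flagged

outliers = [x for x in grades if x < lower_fence or x > upper_fence]
print(iqr, lower_fence, upper_fence, outliers)  # 13 58.5 110.5 []
```

None of Joey's grades falls outside the fences, so this rule flags no outliers.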
Variance and Standard Deviation
Example – Variance and Standard Deviation
Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93
Calculate the variance and standard deviation.
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
$x_i$    $x_i - \bar{x}$      $(x_i - \bar{x})^2$
86       86 - 85 = 1          1
84       84 - 85 = -1         1
91       91 - 85 = 6          36
75       75 - 85 = -10        100
78       78 - 85 = -7         49
80       80 - 85 = -5         25
74       74 - 85 = -11        121
87       87 - 85 = 2          4
76       76 - 85 = -9         81
96       96 - 85 = 11         121
82       82 - 85 = -3         9
90       90 - 85 = 5          25
98       98 - 85 = 13         169
93       93 - 85 = 8          64
Total: 1190                   Total: 806
$\bar{x} = \frac{\sum x_i}{n} = \frac{1190}{14} = 85$
$s^2 = \frac{806}{13} = 62$
Standard deviation: $s = \sqrt{62} \approx 7.874$
Using the calculator: STAT, EDIT, enter the data in L1, QUIT; then STAT, CALC, 1-Var Stats.
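The hand calculation can be checked with a short Python sketch. It follows the same steps (deviations from the mean, squared deviations, divide by n - 1, square root); the variable names are chosen here for illustration.

```python
import math

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

n = len(grades)
mean = sum(grades) / n                                 # 1190 / 14 = 85

squared_devs = [(x - mean) ** 2 for x in grades]       # (x_i - mean)^2 for each grade
sum_sq = sum(squared_devs)                             # 806
variance = sum_sq / (n - 1)                            # 806 / 13 = 62
std_dev = math.sqrt(variance)                          # square root of 62

print(sum_sq, variance, round(std_dev, 3))  # 806.0 62.0 7.874
```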
Standard Deviation
• The standard deviation is zero when there is no spread.
• The standard deviation gets larger as the spread increases.
Impact of adding a constant to all data in the set?
• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.
• Add 32 points to each score, then store in L2.
• Compute 1-Var Stats. What has changed?
• The five-number summary has changed (each value is 32 larger), but the standard deviation has not.
• The measures of spread (range, interquartile range, standard deviation) remain the same.
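A short Python sketch of the same experiment, using the standard library's `statistics` module in place of 1-Var Stats. The list name `shifted` plays the role of L2 and is chosen here for illustration.

```python
import statistics

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
shifted = [x + 32 for x in grades]     # add 32 points to each score

# Measures of center move up by 32.
mean_shift = statistics.mean(shifted) - statistics.mean(grades)
median_shift = statistics.median(shifted) - statistics.median(grades)

# Measures of spread are unchanged.
range_before = max(grades) - min(grades)
range_after = max(shifted) - min(shifted)
sd_before = statistics.stdev(grades)
sd_after = statistics.stdev(shifted)

print(mean_shift, median_shift)        # both are 32
print(range_before, range_after)       # 24 24
print(sd_before, sd_after)             # same value, about 7.874
```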
Impact of multiplying each data value in the set by a constant?
• Using the data set in L1, multiply each value by 2.
• Compute 1-Var Stats.
• What has changed?
• Each value in the five-number summary has doubled, and the standard deviation has doubled.
• The measure of spread has increased.
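The effect of multiplying by 2 can be sketched the same way in Python. The list name `doubled` is chosen here for illustration. The variance is included as an extra check: because it is the square of the standard deviation, it is multiplied by 4 rather than 2.

```python
import statistics

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
doubled = [2 * x for x in grades]      # multiply each score by 2

mean_doubled = statistics.mean(doubled)          # 2 * 85
median_doubled = statistics.median(doubled)      # 2 * 85
range_doubled = max(doubled) - min(doubled)      # 2 * 24

sd_before = statistics.stdev(grades)
sd_doubled = statistics.stdev(doubled)           # 2 * sd_before
var_doubled = statistics.variance(doubled)       # 4 * 62

print(mean_doubled, median_doubled, range_doubled)
print(sd_before, sd_doubled, var_doubled)
```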