The Practice of Statistics Third Edition Chapter 1: Exploring Data 1.2 Describing Distributions with Numbers Copyright © 2008 by W. H. Freeman & Company Daniel S. Yates



The Practice of Statistics Third Edition

Chapter 1: Exploring Data

1.2 Describing Distributions with Numbers

Copyright © 2008 by W. H. Freeman & Company

Daniel S. Yates


Objectives for 1.2

• Given a data set, how do you compute the mean, median, quartiles, and five-number summary?

• How do you construct a box plot using the five-number summary?

• How do you compute the interquartile range?

• How do you identify an outlier using the interquartile range rule?

• How do you compute the standard deviation and variance?


Measures of the Center of a Distribution


The Mean of a Data Set

• So far, we know several measures of central tendency of a set of numbers: mean, median, and mode.

• The mean is the arithmetic average of the data set.


The Mean of a Data Set: “Average Value”

• Σ (sigma) means add up all the data values to get a total.

• Divide the total by the number of data values.
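
The formula from this slide did not survive in the transcript. The two bullets above correspond to the usual definition of the sample mean, written here with n for the number of data values and x_i for the individual values (notation assumed, not taken from the slide):

```latex
\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i
```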


Example - Mean

• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.

• Find the mean.

• Answer: 85.

• Calculator: STAT, EDIT, enter the data in L1; then 2nd STAT (LIST), MATH, mean(L1), ENTER.
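
For readers without the calculator, the same computation can be sketched in a few lines of Python. This is only an illustration of the "add them up, then divide" recipe; the variable names are chosen here and are not part of the slides.

```python
# Joey's 14 quiz grades, copied from the example above.
grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

total = sum(grades)   # the sigma step: add up every data value
n = len(grades)       # the number of data values
mean = total / n      # divide the total by n

print(total, n, mean)  # 1190 14 85.0
```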


The Median of the Data Set

• Median is the center of the data set.

• Half of the data values lie above the median and half lie below it. The median is the 50th percentile.

• The median may or may not be in the data set.


Calculation for Median: “Middle Value”


Example - Median

• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.

• Find the median.

• Answer: 85.

• Calculator: STAT, EDIT, enter the data in L1; then 2nd STAT (LIST), MATH, median(L1), ENTER.
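
The steps behind the calculator's answer can be sketched as follows, assuming the usual rule: sort the data, take the middle value when n is odd, and average the two middle values when n is even. The helper name `median` is chosen for this illustration.

```python
def median(values):
    """Middle value of the sorted data; averages the two middle values when n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: a single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

print(sorted(grades))   # the middle pair is 84 and 86
print(median(grades))   # 85.0
```

With 14 grades the median falls between 84 and 86, so 85 is not itself one of the data values.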


Terminology: “A measure is resistant”

• A resistant measure does not respond strongly to the influence of outliers (extreme observations).

• Furthermore, a measure that is resistant does not respond strongly to changes in a few observations.


Are mean and median resistant?

Mean and Median Applet


Mean vs Median

• Mean is not a resistant measure.

– It is sensitive to the influence of a few extreme observations (outliers).

– It is sensitive to skewed distributions. The mean is pulled towards the tail.

• Median is resistant.

– It is resistant to extreme values and skewed distributions.

• For skewed distributions the median is the better measure for center.
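
A small experiment can make the contrast concrete. The sketch below uses Joey's quiz grades from the earlier examples and replaces the top grade, 98, with a made-up extreme value of 980 (as if a digit had been mistyped). The altered value is purely hypothetical and is not part of the slides.

```python
from statistics import mean, median

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

# Hypothetical data-entry error: 98 recorded as 980.
with_outlier = [980 if g == 98 else g for g in grades]

print(mean(grades), median(grades))              # mean 85, median 85
print(mean(with_outlier), median(with_outlier))  # mean 148, median still 85
```

One extreme observation moves the mean from 85 to 148, while the median does not move at all.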


Measures of Spread

Range

Quartiles

Five Number Summary

The Standard Deviation


Range

• The difference between the largest value and the smallest value.

• Gives the full spread of the data.

• But it depends only on the two most extreme values, so it is strongly affected by outliers.


Quartiles

• We can describe the spread (variability) of a distribution by giving several percentiles. The pth percentile of a distribution is the value with p percent of the observations at or below it.

• Typically we use the 25th, 50th, and 75th percentiles.

• These are Q1, the median, and Q3.


Example

• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.

• Find Q1, median, and Q3.

• Answer: Q1 = 78, Median = 85, Q3 = 91

• Using the calculator: STAT, CALC, 1-Var Stats L1, ENTER
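
The quartiles above can be reproduced with the sketch below. It assumes the convention that Q1 is the median of the observations below the median's position and Q3 is the median of those above it. Other software often interpolates instead and can report slightly different quartiles for the same data. The function name `quartiles` is chosen for this illustration.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-each-half convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # observations below the median's position
    upper = ordered[half + n % 2:]   # observations above it (skips the middle value when n is odd)
    return median(lower), median(ordered), median(upper)

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
q1, med, q3 = quartiles(grades)
print(q1, med, q3)   # 78 85.0 91
```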


Five Number Summary

Using the calculator, we again use 1-Var Stats.
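
The five-number summary consists of the minimum, Q1, the median, Q3, and the maximum. As a rough stand-in for the 1-Var Stats screen, the sketch below computes it for Joey's quiz grades, assuming the same median-of-each-half convention for the quartiles. The function name is illustrative.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])            # median of the lower half
    q3 = median(ordered[half + n % 2:])    # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
print(five_number_summary(grades))   # (74, 78, 85.0, 91, 98)
```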


Five Number Summary Computer Software Output


Five Number Summary Computer Software Output


Graphical Display of 5 Number Summary


Example - Boxplot

• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.

• Answer (five-number summary): 74, 78, 85, 91, 98

• Calculator: STAT PLOT, make the appropriate selections on the menu, then ZOOM, 9:ZoomStat
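
The boxplot image is not part of this transcript. As a rough substitute, the sketch below draws a one-line text boxplot from the five numbers: whiskers run from the minimum to Q1 and from Q3 to the maximum, a box spans Q1 to Q3, and a bar marks the median. The function `text_boxplot` and its character choices are invented for this illustration, and it assumes the maximum is larger than the minimum.

```python
def text_boxplot(minimum, q1, median, q3, maximum, width=49):
    """Return a one-line text boxplot: '-' whiskers, '=' box, '|' at min, median, max."""
    span = maximum - minimum   # assumed to be positive

    def column(value):
        # Map a data value to a character position between 0 and width - 1.
        return round((value - minimum) * (width - 1) / span)

    line = ["-"] * width                          # whiskers run from min to max
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="                             # the box covers Q1 to Q3
    line[column(minimum)] = "|"
    line[column(maximum)] = "|"
    line[column(q1)] = "["
    line[column(q3)] = "]"
    line[column(median)] = "|"
    return "".join(line)

plot = text_boxplot(74, 78, 85, 91, 98)
print(plot)
```

With the default width of 49 characters, each grade point takes two columns, so the box edges land at columns 8 and 34 and the median at column 22.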


Interquartile Range


Identifying Outliers
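
The content of the interquartile range and outlier slides did not survive in this transcript. The sketch below assumes the standard definitions: IQR = Q3 − Q1, and an observation is flagged as a suspected outlier when it falls more than 1.5 × IQR below Q1 or above Q3. It uses Joey's quiz grades and the quartiles found earlier (Q1 = 78, Q3 = 91). The variable names are illustrative.

```python
grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
q1, q3 = 78, 91                 # quartiles from the earlier example

iqr = q3 - q1                   # interquartile range: 13
lower_fence = q1 - 1.5 * iqr    # 58.5
upper_fence = q3 + 1.5 * iqr    # 110.5

# Flag any grade outside the two fences.
outliers = [g for g in grades if g < lower_fence or g > upper_fence]
print(iqr, lower_fence, upper_fence, outliers)   # 13 58.5 110.5 []
```

All 14 grades lie between 58.5 and 110.5, so under this rule the data set has no outliers.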


Variance and Standard Deviation


Example – Variance and Standard Deviation

Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93

Calculate the variance and standard deviation.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Mean:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{1190}{14} = 85$$

x_i      x_i - mean       (x_i - mean)^2
86       86-85=1          1
84       84-85=-1         1
91       91-85=6          36
75       75-85=-10        100
78       78-85=-7         49
80       80-85=-5         25
74       74-85=-11        121
87       87-85=2          4
76       76-85=-9         81
96       96-85=11         121
82       82-85=-3         9
90       90-85=5          25
98       98-85=13         169
93       93-85=8          64
Total    1190             806

Variance:

$$s^2 = \frac{806}{13} = 62$$

Standard deviation:

$$s = \sqrt{62} \approx 7.874$$

Calculator: STAT, EDIT, enter the data in L1, QUIT; then STAT, CALC, 1-Var Stats. The sample standard deviation is reported as Sx.
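
The hand calculation above can be checked with a short Python sketch. It follows the same steps: find the mean, square each deviation from the mean, total the squares, divide by n − 1, and take the square root. The variable names are chosen for this illustration.

```python
import math

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]

n = len(grades)
mean = sum(grades) / n                                  # 1190 / 14 = 85
squared_deviations = [(x - mean) ** 2 for x in grades]  # the last column of the table
variance = sum(squared_deviations) / (n - 1)            # 806 / 13 = 62
std_dev = math.sqrt(variance)

print(sum(squared_deviations), variance, round(std_dev, 3))   # 806.0 62.0 7.874
```

Dividing by n − 1 rather than n gives the sample variance, which is the value the worked example reports.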


Standard Deviation

• The standard deviation is zero only when there is no spread, that is, when all observations have the same value.

• The standard deviation gets larger as the spread increases.


What is the impact of adding a constant to every value in the data set?

• Joey’s first 14 quiz grades in a marking period were 86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93.

• Add 32 points to each score, then store the results in L2.

• Compute 1-Var Stats. What has changed?

• The five-number summary has changed (each value is 32 higher), but the standard deviation has not.

• The measure of spread remains the same.
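
The same experiment can be run off the calculator. The sketch below adds 32 to every grade and compares the summaries. The list name `shifted` stands in for the calculator's L2 and is chosen for this illustration.

```python
import math
from statistics import mean, stdev

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
shifted = [g + 32 for g in grades]   # plays the role of L2

# Measures of position all move up by 32.
print(mean(grades), mean(shifted))                              # 85 117
print(min(grades), max(grades), min(shifted), max(shifted))     # 74 98 106 130

# The standard deviation does not change.
print(round(stdev(grades), 3), round(stdev(shifted), 3))        # 7.874 7.874
```

Adding a constant slides the whole distribution along the axis without stretching it, which is why the spread is unchanged.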


What is the impact of multiplying every value in the data set by a constant?

• Multiply each value of the data set in L1 by 2.

• Compute 1-Var Stats.

• What has changed?

• Each value in the five-number summary has doubled, and the standard deviation has doubled.

• The measure of spread has increased.
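
A quick check of the doubling claim, again in Python rather than on the calculator. The list name `doubled` is chosen for this illustration. The variance line is an extra observation that follows from the formula and is not stated on the slide.

```python
import math
from statistics import mean, stdev, variance

grades = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
doubled = [2 * g for g in grades]

print(mean(grades), mean(doubled))                          # 85 170
print(round(stdev(grades), 3), round(stdev(doubled), 3))    # 7.874 15.748
print(variance(grades), variance(doubled))                  # the variance is multiplied by 2 squared
```

Multiplying by a constant stretches the distribution, so the standard deviation is multiplied by that constant and the variance by its square.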
