Describing Distributions With Numbers Section 1.3 cont. (five number summary, boxplots, variance,...


Page 1: Describing Distributions With Numbers Section 1.3 cont. (five number summary, boxplots, variance, standard deviation) Target Goal: I can calculate a 5.

Describing Distributions With Numbers

Section 1.3 cont. (five number summary, boxplots, variance, standard deviation)

Target Goal: I can calculate a 5 number summary and construct a boxplot.

I can describe spread using the standard deviation of a distribution.

Hw: pg 71: 92, 93, 95, 96, 97, 103, 105, 107 - 110

Page 2:

Five-Number Summary

Data set consisting of the smallest observation, first quartile, median, third quartile, and largest observation, written in order.

Min Q1 M Q3 Max

It gives us a quick summary of both center and spread.
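As a rough illustration, the five-number summary can be computed in a few lines of Python. This sketch assumes the textbook-style quartile rule (Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when n is odd); calculators and software may use slightly different rules. The function name `five_number_summary` is made up for this example, and the data are the Roger Maris home run counts that appear later in these slides.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + (n % 2):]  # values above the median position
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


# Roger Maris home run counts (from the example later in the slides).
maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
print(five_number_summary(maris))
```

Under this quartile rule the Maris data give Min = 8, Q1 = 14, M = 24.5, Q3 = 33, Max = 61.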

Page 3:

Bonds:

Min   Q1   M    Q3   Max
16    25   34   41   73

Page 4:

Box (and whiskers) Plot

A graph of a five-number summary of a distribution;

• best for side-by-side comparisons since they show less detail than histograms or stemplots;

• drawn either horizontally or vertically.

Page 5:

Modified Boxplot

Because the regular boxplot conceals outliers, we will use the modified boxplot.

• Plots outliers as isolated points

• Extends “whiskers” out to the largest and/or smallest data points that are not outliers

• Remember: label axis, title graph, scale axis.
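The pieces of a modified boxplot can be worked out before any drawing is done. The sketch below is one possible way to do that, assuming the 1.5 × IQR rule for outliers and the median-of-halves rule for quartiles; the helper names (`quartiles`, `modified_boxplot_parts`) and the sample data are invented for illustration only.

```python
from statistics import median


def quartiles(xs):
    """Q1 and Q3 as medians of the lower and upper halves of sorted xs."""
    n = len(xs)
    half = n // 2
    return median(xs[:half]), median(xs[half + (n % 2):])


def modified_boxplot_parts(data):
    """Return whisker ends and outliers using the 1.5 x IQR rule."""
    xs = sorted(data)
    q1, q3 = quartiles(xs)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in xs if low_fence <= x <= high_fence]
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "whisker_low": inside[0],    # smallest value that is not an outlier
        "whisker_high": inside[-1],  # largest value that is not an outlier
        "outliers": outliers,        # plotted as isolated points
    }


# Made-up data, chosen only so that each tail contains one outlier.
sample = [1, 10, 11, 12, 13, 14, 15, 30]
parts = modified_boxplot_parts(sample)
print(parts)
```

For this made-up list the whiskers stop at 10 and 15, and the values 1 and 30 are drawn as isolated points.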

Page 6:

[Figure: Regular (a) and modified (b) boxplots comparing Barry Bonds and Hank Aaron home runs. Labels: Min, Q1, M, Q3, Max; Outlier.]

Page 7:

Activity: Acing the First Test

Enter the scores of Mrs. Liao’s students on their first statistics test into L1 from page 71, ex. 92.

Sort Data (ascending): Inspire

• Place cursor on column title

• Select: Menu, 1:Actions, 6:sort, sort by (a)

Inspire: Appendix A6

Page 8:

a. Find the five-number summary and verify your expectation from a.

Calculator activity

• Enter the scores into L1 from page 71.

• Calculator: 1 VAR STAT(L1)

Five-number summary: 43, 82, 87.75, 93, 98

mean = 2544/30 (or x̄) = 84.8

The median is greater than the mean.

Page 9:

b. What is the range of the middle half of the scores of the statistics students?

Between Q1 and Q3:

• Between 82 and 93

Page 10:

Acing the First Test Cont.

c. Construct by hand a modified boxplot of the stats students’ scores.

First find potential outliers.

• IQR = 93 - 82 = 11

• Q1 - 1.5 × IQR = 82 - 16.5 = 65.5

• Q3 + 1.5 × IQR = 93 + 16.5 = 109.5

• Outliers: 43, 45

Graph: Mark a small x for the outlier(s), then mark the next-lowest value as the min, followed by Q1, M, Q3, and max. Draw the box and whisker plot.

Page 11:

Acing the First Test Cont.

d. On your calculator: First define Plot1 to be a modified boxplot using the list. Graph, trace and compare. Is there an outlier? If so, was it the same as in part a?

Outliers: 43, 45

• Based on the boxplot, conjecture the shape of the corresponding histogram.

• Histogram shape: ______________________

Page 12:

Acing the First Test Cont.

• Next, define Plot2 to be a histogram, also using the same list. Trace and compare. Did you guess correctly? Roughly draw the histogram below.

Page 13:

Important Note:

• If a distribution contains outliers, use the median and the IQR to describe the distribution.

Page 14:

The most common numerical description of a distribution is the standard deviation (s):

• measures spread by looking at how far the observations are from their mean

• The standard deviation (s) is the square root of the variance (s²).

Page 15:

Variance (s²) of a set of observations is the average of the squares of the deviations of the observations from their mean.

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Note: Most of the time we will use the calculator (STAT:CALC:1VAR STAT).

Page 16:

Why square the deviations?

• It makes them all nonnegative, so that observations far from the mean in either direction will have large positive squared deviations.

Page 17:

Properties of the Standard Deviation

• The sum of the deviations of the observations from their mean will always be zero.

• Choose s only when the mean is chosen as the measure of center.

• s = 0 only when there is no spread (all observations have the same value).

• s, like the mean, is not resistant. Strong skewness or a few outliers can make s very large.

• Rule of thumb: a value more than 2 standard deviations from the mean is a potential outlier.
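Two of these properties are easy to check numerically. The short sketch below uses Python's standard `statistics` module and the Roger Maris home run counts given later in these slides; the list `[7, 7, 7, 7]` is made up purely to show the no-spread case.

```python
from statistics import mean, stdev

maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
x_bar = mean(maris)

# The deviations from the mean sum to zero
# (up to floating-point rounding).
deviations = [x - x_bar for x in maris]
print(sum(deviations))

# s = 0 when every observation has the same value.
print(stdev([7, 7, 7, 7]))
```

The first printed value may not be exactly 0 because of floating-point rounding, but it should be negligibly small.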

Page 18:

Why divide by (n – 1)?

Degrees of freedom:

Since x̄ is the exact balancing point of the data, the data will almost always be closer to x̄, on average, than they will be to μ. The sum of the squared deviations from x̄ will underestimate the sum of the squared deviations from μ. To correct this we divide by n - 1 instead of n.
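One way to see the underestimate is a standard algebraic identity, added here as a supplement rather than taken from the slides. For any fixed value μ, the sum of squared deviations splits into two parts:

$$\sum_{i=1}^{n} (x_i - \mu)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$$

The cross term vanishes because the deviations from x̄ sum to zero. Since the last term is never negative, the sum of squared deviations from x̄ can never exceed the sum from μ, and it is strictly smaller whenever x̄ differs from μ.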

Page 19:

Example: Roger Maris

• New York Yankee Roger Maris held the single-season home run record from 1961 until 1998. Here are Maris’s home run counts for his 10 years in the American League:

14 28 16 39 61 33 23 26 8 13

Page 20:

14 28 16 39 61 33 23 26 8 13

a. Maris’s mean number of home runs is x̄ = 26.1. Find the standard deviation s from its definition (by hand).

$$\sum (x_i - \bar{x})^2 = (14 - 26.1)^2 + (28 - 26.1)^2 + \cdots$$

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{2192.9}{9} = 243.66, \qquad s = 15.609$$
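The by-hand calculation can be mirrored step by step in Python. This is only a sketch of the definition applied to the ten Maris values; the variable names are chosen for this example.

```python
from math import sqrt

maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
n = len(maris)
x_bar = sum(maris) / n                       # mean, 26.1

# Square each deviation from the mean, then add them up.
squared_devs = [(x - x_bar) ** 2 for x in maris]
sum_sq = sum(squared_devs)                   # about 2192.9

variance = sum_sq / (n - 1)                  # divide by n - 1, about 243.66
s = sqrt(variance)                           # about 15.609

print(round(x_bar, 1), round(sum_sq, 1), round(variance, 2), round(s, 3))
```

Each intermediate value lines up with a step of the hand calculation: the sum of squared deviations, the division by n - 1 = 9, and the square root.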

Page 21:

14 28 16 39 61 33 23 26 8 13

b. Use your calculator to verify your results. (STAT:CALC:1 var stat:L1)

• Then use your calculator to find the mean and s for the 9 observations that remain when you leave out any outlier(s).

Recall the 1.5 × IQR rule. Note: they treat 61 as an outlier even though the upper bound from the rule is 61.5.
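The upper bound quoted in the note can be reproduced with a few lines of Python. This sketch assumes quartiles are taken as the medians of the lower and upper halves of the sorted data, which is what gives 61.5 here; other quartile conventions could give a different bound.

```python
from statistics import median

maris = sorted([14, 28, 16, 39, 61, 33, 23, 26, 8, 13])
half = len(maris) // 2            # n = 10, so each half has 5 values
q1 = median(maris[:half])         # median of the lower half
q3 = median(maris[half:])         # median of the upper half
iqr = q3 - q1

upper_bound = q3 + 1.5 * iqr
lower_bound = q1 - 1.5 * iqr
flagged = [x for x in maris if x < lower_bound or x > upper_bound]

print(q1, q3, iqr, lower_bound, upper_bound, flagged)
```

With these assumptions Q1 = 14, Q3 = 33 and IQR = 19, so 61 sits just below the upper bound of 61.5 and the 1.5 × IQR rule alone does not flag it.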

Page 22:

• Mean = 22.2

• Sx = 10.244

• How does leaving out the “outlier” affect the values of the mean and s?

It caused the values of both measures to decrease.

• Is s a resistant measure of spread?

Clearly, s is not a resistant measure of spread.
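The comparison can also be run with Python's standard `statistics` module instead of a calculator. The median is printed alongside as an extra, not part of the original exercise, to contrast a resistant measure with the mean and s.

```python
from statistics import mean, median, stdev

maris = [14, 28, 16, 39, 61, 33, 23, 26, 8, 13]
without_61 = [x for x in maris if x != 61]   # drop the 61-home-run season

for label, data in [("all 10 seasons", maris), ("without 61", without_61)]:
    print(label, round(mean(data), 1), round(stdev(data), 3), median(data))
```

Dropping the single value 61 moves the mean from 26.1 to 22.2 and s from 15.609 to 10.244, while the median only shifts from 24.5 to 23.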

Page 23:

Key Points of Chapter