
Name Class Date

Resource Locker

© Houghton Mifflin Harcourt Publishing Company. Image credit: ©Blend Images/Alamy

Explore Using Dot Plots to Display Data

A dot plot is a data representation that uses a number line and Xs, dots, or other symbols to show frequency. Dot plots are sometimes called line plots.

Finance Twelve employees at a small company make the following annual salaries (in thousands of dollars): 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, and 60.

Choose the number line with the most appropriate scale for this problem. Explain your reasoning.

Create and label a dot plot of the data. Put an X above the number line for each time that value appears in the data set.

Reflect

1. Discussion Recall that quantitative data can be expressed as a numerical measurement. Categorical, qualitative data is expressed in categories, such as attributes or preferences. Is it appropriate to use a dot plot for displaying quantitative data, qualitative data, or both? Explain.

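[Figure: three candidate number lines. The first is labeled 0, 50, 100; the second is labeled 20, 30, 40, 50, 60, 70; the third is labeled 20, 35, 50, 65, 80, 95.]

[Dot plot: Salary (thousands of dollars), number line from 20 to 70, tick marks labeled by 10s.]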

The second number line has the most appropriate scale. The scale of the first number line includes a larger range of numbers than necessary, so dots will be clustered in the middle. The scale of the third number line does not have convenient tick marks for determining where values between the labels belong.

A dot plot uses a number line, so it is only appropriate for displaying quantitative data.
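The dot plot from the Explore can also be sketched in text form. The following is a minimal illustration, not part of the lesson: the function name `dot_plot` and its parameters are hypothetical, and it assumes every data value falls exactly on a tick of the chosen number line (here 20 to 70 in steps of 5).

```python
from collections import Counter

def dot_plot(values, start, stop, step):
    """Return a text dot plot with one X stacked above a tick per occurrence.

    Assumption: values that do not fall on a tick are silently ignored.
    """
    counts = Counter(values)
    ticks = list(range(start, stop + 1, step))
    width = len(str(stop)) + 1  # column width that fits the largest label
    rows = []
    # Build rows from the tallest stack down to the row just above the axis.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(
            ("X" if counts.get(t, 0) >= level else "").rjust(width) for t in ticks
        )
        rows.append(row.rstrip())
    # The last row holds the tick labels of the number line.
    rows.append("".join(str(t).rjust(width) for t in ticks))
    return "\n".join(rows)

salaries = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60]
plot = dot_plot(salaries, 20, 70, 5)
print(plot)
```

Each X corresponds to one employee, so the number of Xs equals the number of data values.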

9.2 Data Distributions and Outliers

Essential Question: What statistics are most affected by outliers, and what shapes can data distributions have?

Common Core Math Standards

The student is expected to:

S-ID.1 Represent data with plots on the real number line (dot plots, histograms, and box plots). Also S-ID.2, S-ID.3, N-Q.1

Mathematical Practices

MP.2 Reasoning

Language Objective

Explain to a partner what an outlier is.

HARDCOVER PAGES 317–326

Turn to these pages to find this lesson in the hardcover student edition.

Data Distributions and Outliers

ENGAGE

Essential Question: What statistics are most affected by outliers, and what shapes can data distributions have?

Outliers affect the mean more than the median, and they affect the standard deviation more than the IQR. Data distributions can be described generally as symmetric, skewed to the left, or skewed to the right.

PREVIEW: LESSON PERFORMANCE TASK

View the Engage section online. Discuss why, if you owned a business, you might compare a competitor’s sales to your company’s sales, and how your findings might lead you to change the way you run your business. Then preview the Lesson Performance Task.


Explain 1 The Effects of an Outlier in a Data Set

An outlier is a value in a data set that is much greater or much less than most of the other values in the data set. Outliers are determined by using the first or third quartiles and the IQR.

How to Identify an Outlier

A data value x is an outlier if x < Q1 - 1.5(IQR) or if x > Q3 + 1.5(IQR).

Example 1 Create a dot plot for the data set using an appropriate scale for the number line. Determine whether the extreme value is an outlier.

Suppose that the list of salaries from the Explore is expanded to include the owner’s salary of $150,000. Now the list of salaries is 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, and 150.

To choose an appropriate scale, consider the minimum and maximum values, 25 and 150.

A number line from 20 to 160 will contain all the values. A scale of 5 will be convenient for the data. Label tick marks by 20s.

Plot each data value to see the distribution.

Find the quartiles and the IQR to determine whether 150 is an outlier.

Suppose that the salaries from Part A were adjusted so that the owner’s salary is $65,000.

Now the list of salaries is 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, and 65.

To choose an appropriate scale, consider the minimum and maximum data values, ____ and ____.

A number line from ____ to ____ will contain all the data values.

A scale of ____ will be convenient for the data.

Label tick marks by ____.

Plot each data value to see the distribution.

Is 150 > Q3 + 1.5(IQR)?

150 > 47.5 + 1.5(47.5 - 35)

150 > 66.25 True

150 is an outlier.
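The outlier check for the salary list with the owner's salary of 150 can be reproduced in a few lines. This is a sketch under stated assumptions: the helper names (`quartiles`, `outlier_fences`, `is_outlier`) are hypothetical, and Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data, leaving out the middle value when the count is odd. That choice matches the quartiles 35 and 47.5 used in this example; other software may use a different quartile convention.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two values; the middle value is excluded when n is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

def outlier_fences(values):
    """Return (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, values):
    """True if x lies below the lower fence or above the upper fence."""
    low, high = outlier_fences(values)
    return x < low or x > high

salaries_a = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, 150]
salaries_b = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, 65]

q1, q3 = quartiles(salaries_a)
low, high = outlier_fences(salaries_a)
print(q1, q3, high, is_outlier(150, salaries_a))
print(is_outlier(65, salaries_b))
```

Replacing 150 with 65 leaves both quartiles unchanged, so the upper fence stays at 66.25 and 65 falls inside it.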

[Dot plot for the list that includes 150: Salary (thousands of dollars), number line from 20 to 160, tick marks labeled by 20s.]

[Dot plot for the list that includes 65: Salary (thousands of dollars), number line from 20 to 70, tick marks labeled by 10s.]

Answers for the list that includes 65: the minimum and maximum data values are 25 and 65; a number line from 20 to 70 will contain all the data values; a scale of 5 will be convenient; label tick marks by 10s.

Integrate Mathematical Practices

This lesson provides an opportunity to address Mathematical Practice MP.2, which calls for students to “reason abstractly and quantitatively.” Students solve real-world problems by creating dot plots for data sets. They analyze and describe the shapes of the data distributions, recognizing how the shapes affect the measures of center and spread, and they use both dot plots and statistical measures to compare data sets. Thus, they first take a situation from its real-world context to represent it symbolically, then they interpret the results in the real-world context.

EXPLORE Using Dot Plots to Display Data

INTEGRATE TECHNOLOGY

To make it easier to create a dot plot for a large data set, students can enter the data values into one column of a spreadsheet, then use the spreadsheet’s data-sorting function to arrange them in increasing order.

QUESTIONING STRATEGIES

How can you use a dot plot to find the interquartile range of a data set? First, find the median by counting the same number of marks from each end of the dot plot until the middle value is reached. If there are an even number of marks, find the mean of the two middle values. Then use the same process to find the first quartile (Q1, the middle value of the lower half) and the third quartile (Q3, the middle value of the upper half). Finally, subtract Q1 from Q3 to find the interquartile range.

EXPLAIN 1 The Effects of an Outlier in a Data Set

AVOID COMMON ERRORS

Students sometimes forget to take the square root of the mean of the squared deviations when calculating standard deviation. Review the steps for calculating the standard deviation.
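The steps for the standard deviation can be written out one at a time so that the final square root is hard to overlook. This is a minimal sketch, assuming the population form (dividing by the number of values), which is what the σx notation on a graphing calculator reports; the function name `population_std_dev` is hypothetical, and the data are the twelve salaries from the Explore.

```python
from math import sqrt

def population_std_dev(values):
    """Population standard deviation, computed step by step."""
    n = len(values)
    mean = sum(values) / n                                # step 1: the mean
    squared_deviations = [(x - mean) ** 2 for x in values]  # step 2: squared deviations
    mean_squared_deviation = sum(squared_deviations) / n  # step 3: their mean
    return sqrt(mean_squared_deviation)                   # step 4: the square root

salaries = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60]
sigma = population_std_dev(salaries)
print(round(sigma, 2))
```

Stopping after step 3 would give the mean of the squared deviations (about 79.17 here) rather than the standard deviation (about 8.90).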

QUESTIONING STRATEGIES

How does an outlier affect the mean and median of a data set? If a data set includes an outlier, the mean can be increased or decreased significantly. This can make the mean misleading as a measure of center. When there are no outliers, most data values cluster closer to the mean. The median is much less affected by an outlier, because a single outlier shifts the middle of the data set by only a small amount, if at all.
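A short calculation with the salary data illustrates this answer. The sketch below uses Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

without_owner = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60]
with_owner = without_owner + [150]  # add the owner's salary, an outlier

for label, data in (("without 150", without_owner), ("with 150", with_owner)):
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data)}")

# How far each measure of center moves when the outlier is added.
mean_shift = mean(with_owner) - mean(without_owner)
median_shift = median(with_owner) - median(without_owner)
print(round(mean_shift, 2), median_shift)
```

Adding the single value 150 raises the mean by several thousand dollars while the median stays at 40.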

PROFESSIONAL DEVELOPMENT


Find the quartiles and the IQR to determine whether 65 is an outlier.

Reflect

2. Explain why the median was NOT affected by changing the max data value from 150 to 65.

Your Turn

3. Sports Baseball pitchers on a major league team throw at the following speeds (in miles per hour): 72, 84, 89, 81, 93, 100, 90, 88, 80, 84, and 87. Create a dot plot using an appropriate scale for the number line. Determine whether the extreme value is an outlier.

Explain 2 Comparing Data Sets

Numbers that characterize a data set, such as measures of center and spread, are called statistics. They are useful when comparing large sets of data.

Example 2 Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set, and then compare the data.

Sports The tables list the average ages of players on 15 teams randomly selected from the 2010 teams in the National Football League (NFL) and Major League Baseball (MLB). Describe how the average ages of NFL players compare to those of MLB players.

NFL Players’ Average Ages, by Team

25.8, 26.0, 26.3, 25.7, 25.1, 25.2, 26.1, 26.4, 25.9, 26.6, 26.3, 26.2, 26.8, 25.6, 25.7

MLB Players’ Average Ages, by Team

28.5, 29.0, 28.0, 27.8, 29.5, 29.1, 26.9, 28.9, 28.6, 28.7, 26.9, 30.5, 28.7, 28.9, 29.3

Is 65 > Q3 + 1.5(IQR)?

65 > ____ + 1.5(____ - ____)

65 > ____ True / False

Therefore, 65 is / is not an outlier.

Answers: 65 > 47.5 + 1.5(47.5 - 35); 65 > 66.25 is false, so 65 is not an outlier.

The maximum value in the data set changed, but its ordered position did not, so the middle value in the ordered list was not moved or changed.

[Dot plot: Pitching Speeds (mph), number line from 70 to 100, tick marks labeled by 5s.]

Is 72 < Q1 - 1.5(IQR)?

72 < 81 - 1.5(9)

72 < 67.5 False

Therefore, 72 is not an outlier.

COLLABORATIVE LEARNING

Peer-to-Peer Activity

Have students work in pairs. Have each student create a data set with 10 values, using the definition of outlier to verify that none of the values are outliers. Students then find the mean, median, range, and IQR for their data sets. Have students trade data sets with their partners. Ask each student to add an outlier to the partner’s data set, and then calculate the new mean, median, range, and IQR for the set. Students should compare their results and discuss how the outliers affected the statistics.

EXPLAIN 2 Comparing Data Sets

INTEGRATE MATHEMATICAL PRACTICES

Focus on Technology

MP.5 Review the steps for generating statistics using a graphing calculator. Students can create a list by pressing STAT, then selecting 1:Edit. A previously entered list can be cleared by highlighting the name of the list, pressing CLEAR, then pressing the down arrow.

After entering data in a list, students can find the one-variable statistics by pressing STAT, selecting CALC, and then selecting 1:1-Var Stats. For data in lists other than L1, they must enter the list number before pressing ENTER to generate the statistics.

AVOID COMMON ERRORS

Students may expect their graphing calculators to provide the value of the IQR. Remind them that they must calculate the IQR by finding the difference between the first and third quartiles.


On a graphing calculator, enter the two sets of data into L1 and L2.

Use the “1-Var Stats” feature to find statistics for the data in lists L1 and L2. Your calculator may use the following notations: mean x̄, standard deviation σx.

Scroll down to see the median (Med), Q1, and Q3. Complete the table.

Data set | Mean | Median | IQR (Q3 - Q1) | Standard deviation
NFL | 25.98 | 26.00 | 0.60 | 0.46
MLB | 28.62 | 28.70 | 1.10 | 0.91

Compare the corresponding statistics.

The mean age and median age are lower for the NFL than for the MLB, which means that NFL players tend to be younger than MLB players. In addition, the IQR and standard deviation are smaller for the NFL than for the MLB, which means that the ages of NFL players are closer together than those of MLB players.
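The four statistics in the table can also be computed without a graphing calculator. The sketch below is one way to do it, under these assumptions: the function name `summarize` is hypothetical, the quartiles are medians of the lower and upper halves of the ordered data (middle value excluded when the count is odd), and the standard deviation is the population form, which corresponds to the σx notation mentioned above.

```python
from statistics import mean, median, pstdev

def summarize(values):
    """Return mean, median, IQR, and population standard deviation, rounded to 2 places."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])   # median of the lower half
    q3 = median(data[-half:])  # median of the upper half
    return {
        "mean": round(mean(data), 2),
        "median": round(median(data), 2),
        "iqr": round(q3 - q1, 2),
        "std_dev": round(pstdev(data), 2),
    }

nfl = [25.8, 26.0, 26.3, 25.7, 25.1, 25.2, 26.1, 26.4,
       25.9, 26.6, 26.3, 26.2, 26.8, 25.6, 25.7]
mlb = [28.5, 29.0, 28.0, 27.8, 29.5, 29.1, 26.9, 28.9,
       28.6, 28.7, 26.9, 30.5, 28.7, 28.9, 29.3]

nfl_stats = summarize(nfl)
mlb_stats = summarize(mlb)
print(nfl_stats)
print(mlb_stats)
```

Under these assumptions the results agree with the table values for both leagues.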

The tables list the ages of 10 contestants on 2 game shows.

Game Show 1

18, 20, 25, 48, 35, 39, 46, 41, 30, 27

Game Show 2

24, 29, 36, 32, 34, 41, 21, 38, 39, 26

On a graphing calculator, enter the two sets of data into L1 and L2.

Complete the table. Then circle the correct items to compare the statistics.

Data set | Mean | Median | IQR (Q3 - Q1) | Standard deviation
Show 1 | 32.9 | 32.5 | 16 | 10.00
Show 2 | 32 | 33 | 12 | 6.45

The mean is lower for the 1st / 2nd game show, which means that contestants in the 1st / 2nd game show are on average younger than contestants in the 1st / 2nd game show. However, the median is lower for the 1st / 2nd game show, which means that although contestants are on average younger on the 1st / 2nd game show, there are more young contestants on the 1st / 2nd game show. Finally, the IQR and standard deviation are higher for the 1st / 2nd game show, which means that the ages of contestants on the 1st / 2nd game show are further apart than the ages of contestants on the 1st / 2nd game show.

DIFFERENTIATE INSTRUCTION

Multiple Representations

Students may benefit from acting out a real-world example of how adding an outlier to a data set affects measures of center and spread. For example, have five students each begin with 1 to 5 slips of paper (or pennies or markers); each slip represents a dollar. Have the students calculate the mean by equally distributing all the slips of paper among the five students. Then have a sixth student with $25 (25 slips of paper) join the group. Again use the slips of paper to find the mean by distributing them among the six students. Ask whether the new mean is a reasonable measure of center.

QUESTIONING STRATEGIES

What can you conclude about two data sets by comparing each of the following statistics: mean, median, IQR, and standard deviation? By comparing the mean and median values, you can conclude whether the typical value for one data set is higher or lower than the typical value for the other set. By comparing the IQR and standard deviation values, you can determine whether the data values in one set are more or less spread out than the values in the other set.


Your Turn

4. The tables list the age of each member of Congress in two randomly selected states. Complete the table and compare the data.

Illinois

26, 24, 28, 46, 39, 59, 31, 26, 64, 40, 69, 62, 31, 28, 26, 76, 57, 71, 58, 35, 32, 49, 51, 22, 33, 56

Arizona

42, 37, 58, 32, 46, 42, 26, 56, 27

Data set | Mean | Median | IQR (Q3 - Q1) | Standard deviation
Illinois | ____ | ____ | ____ | ____
Arizona | ____ | ____ | ____ | ____

Explain 3 Comparing Data Distributions

A data distribution can be described as symmetric, skewed to the left, or skewed to the right, depending on the general shape of the distribution in a dot plot or other data display.

Example 3 For each data set, make a dot plot and determine the type of distribution. Then explain what the distribution means for each data set.

Sports The data table shows the number of miles run by members of two track teams during one day.

Miles | 3 | 3.5 | 4 | 4.5 | 5 | 5.5 | 6
Members of Team A | 2 | 3 | 4 | 4 | 3 | 2 | 0
Members of Team B | 1 | 2 | 2 | 3 | 3 | 4 | 3

[Figure: three example dot plots labeled Symmetric, Skewed to the Left, and Skewed to the Right.]

Illinois: mean 43.81, median 39.5, IQR 30, standard deviation 16.42

Arizona: mean 40.67, median 42, IQR 21.5, standard deviation 10.84

The mean is lower for Arizona, which means that, on average, members of Congress tend to be younger in Arizona than in Illinois. However, the median is lower in Illinois, which means that there are more young members of Congress in Illinois despite the differences in average age. Finally, the IQR and standard deviation are lower for Arizona, which means that the ages of members of Congress are closer together than they are in Illinois.

LANGUAGE SUPPORT

Connect Vocabulary

English learners who are working on acquiring academic English in algebra may find that some terminology is difficult to pronounce or to differentiate when listening. Words such as effect and affect may be difficult to distinguish, and words such as skew or interquartile may be difficult to pronounce. Be sure to enunciate clearly so that students can understand and learn to pronounce the key words correctly.

EXPLAIN 3 Comparing Data Distributions

AVOID COMMON ERRORS

Students often confuse the terms skewed to the left and skewed to the right. Encourage students to come up with a mnemonic to help them remember how the direction of a skew should be described. For example, students may easily remember how the “tail” of a data distribution looks on a dot plot. Point out that both tail and skew have four letters, and that a data distribution is skewed in the direction of its tail.

QUESTIONING STRATEGIES

Some data distributions are described as uniform. What do you think the general shape of a uniform distribution would be? The general shape of a uniform distribution is fairly even across the plot.

What would be true about the mean and median of a data set with a uniform distribution? The mean and median would be approximately equal.


Team A

The data for team A show a symmetric distribution. This means that the distances run are evenly distributed about the mean.

Team B

The data for team B show a distribution skewed to the left. This means that more than half the team members ran a distance greater than the mean.
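The two descriptions can be checked numerically by expanding the frequency table into raw data and comparing the mean with the median. This is only a rough heuristic, not the lesson's method (the lesson reads the shape from the dot plot): a mean noticeably below the median suggests a tail on the left, a mean noticeably above suggests a tail on the right, and equal values are consistent with symmetry without proving it. The function names and the `tolerance` cutoff are illustrative assumptions.

```python
from statistics import mean, median

def expand(values, frequencies):
    """Turn a frequency table into a flat list of data values."""
    return [v for v, f in zip(values, frequencies) for _ in range(f)]

def describe_shape(data, tolerance=0.05):
    """Heuristic label from mean vs. median; tolerance is an arbitrary cutoff."""
    m, med = mean(data), median(data)
    if abs(m - med) <= tolerance:
        return "roughly symmetric"
    return "skewed to the left" if m < med else "skewed to the right"

miles = [3, 3.5, 4, 4.5, 5, 5.5, 6]
team_a = expand(miles, [2, 3, 4, 4, 3, 2, 0])
team_b = expand(miles, [1, 2, 2, 3, 3, 4, 3])

print(describe_shape(team_a), mean(team_a), median(team_a))
print(describe_shape(team_b), round(mean(team_b), 2), median(team_b))
```

For Team B the mean (about 4.81 miles) is below the median (5 miles), which fits the statement that more than half the team ran farther than the mean.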

B The table shows the number of days, over the course of a month, that specific numbers of apples were sold by competing grocers.

Number of Apples Sold | 0 | 50 | 100 | 150 | 200 | 250 | 300
Grocery Store A | 1 | 4 | 8 | 8 | 4 | 1 | 0
Grocery Store B | 3 | 6 | 8 | 8 | 2 | 2 | 1

[Dot plots: Grocery Store A and Grocery Store B.]

The distribution for grocery store A is: left-skewed / right-skewed / symmetric. This means that the number of apples sold each day is evenly / unevenly distributed about the mean.

The distribution for grocery store B is: left-skewed / right-skewed / symmetric. This means that the number of apples sold each day is evenly / unevenly distributed about the mean.

Reflect

7. Will the mean and median in a symmetric distribution always be approximately equal? Explain.

8. Will the mean and median in a skewed distribution always be approximately equal? Explain.

[Dot plots for Example 3: two plots with a Miles axis from 3 to 6, and two plots with a Number of Apples Sold axis from 0 to 400.]

The mean and median in a symmetric distribution will always be approximately equal because the values are equally distributed on either side of the center.

The mean and median in a skewed distribution will not always be approximately equal because the median will sometimes be closer to where the values cluster than the mean will be.

Your Turn

9. Sports The table shows the number of free throws attempted during a basketball game. Make a dot plot and determine the type of distribution. Then explain what the distribution means for the data set.

Free Throws Shot | 0 | 2 | 4 | 6 | 8
Members of Team A | 2 | 2 | 4 | 2 | 2
Members of Team B | 3 | 4 | 2 | 2 | 1

[Dot plots: Team A and Team B.]

Elaborate

10. If the mean increases after a single data point is added to a set of data, what can you tell about this data point?

11. How can you use a calculation to decide whether a data point is an outlier in a data set?

12. Essential Question Check-In What three shapes can data distributions have?

[Dot plots: Team A and Team B, Number of Free Throws, number line from 0 to 8, tick marks labeled by 2s.]

The data for team A show a symmetric distribution. This means that the number of free throws shot is evenly distributed about the mean.

The data for team B show a distribution skewed to the right. This means that fewer than half of the team members shot a number of free throws that was greater than the mean.

If the mean increases after a single data point is added to a set of data, you can tell that the data point added was larger than the mean of the set.
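The reason can be written out in one line. The symbols here are introduced only for this illustration: suppose the original set has n values with mean x̄, and the added data point is x_new.

```latex
\bar{x}_{\text{new}} = \frac{n\bar{x} + x_{\text{new}}}{n+1},
\qquad
\bar{x}_{\text{new}} - \bar{x}
  = \frac{n\bar{x} + x_{\text{new}} - (n+1)\bar{x}}{n+1}
  = \frac{x_{\text{new}} - \bar{x}}{n+1}.
```

Because n + 1 is positive, the new mean is greater than the old mean exactly when x_new is greater than x̄.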

You can decide whether a data point is an outlier in a data set by finding the 1st and 3rd quartiles and subtracting them to get the interquartile range. If the data point is larger than the 3rd quartile plus 1.5 times the interquartile range, or smaller than the 1st quartile minus 1.5 times the interquartile range, then the data point is an outlier.

Data distributions can be skewed to the left, skewed to the right, or symmetric.

Exercise | Depth of Knowledge (D.O.K.) | Mathematical Practices
1–8 | 1 Recall | MP.4 Modeling
9 | 1 Recall | MP.5 Using Tools
10 | 2 Skills/Concepts | MP.7 Using Structure
11 | 1 Recall | MP.5 Using Tools
12 | 2 Skills/Concepts | MP.7 Using Structure

ELABORATE

QUESTIONING STRATEGIES

Can a data set have more than one outlier? Explain. Yes; more than one value may be less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR).

INTEGRATE MATHEMATICAL PRACTICES

Focus on Critical Thinking

MP.3 Discuss with students whether all the values in a data set could be outliers. Review the definition of outlier. Students should understand that because an outlier must be less than Q1 or greater than Q3, values between Q1 and Q3 will never be outliers for a data set.

SUMMARIZE THE LESSON

How can you determine whether a value in a data set is an outlier? How does the inclusion of an outlier affect the mean, median, range, and IQR? An outlier is a value that is less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR). Outliers significantly affect the mean and range, but affect the median and IQR very little or not at all.


Evaluate: Homework and Practice

Fitness The numbers of members in 8 workout clubs are 100, 95, 90, 85, 85, 95, 100, and 90. Use this information for Exercises 1–2.

1. Create a dot plot for the data set using an appropriate scale for the number line.

2. Suppose that a new workout club opens and immediately has 150 members. Is the number of members at this new club an outlier?

Sports The number of feet to the left outfield wall for 10 randomly chosen baseball stadiums is 315, 325, 335, 330, 330, 330, 320, 310, 325, and 335. Use this information for Exercises 3–4.

3. Create a dot plot for the data set using an appropriate scale for the number line.

4. The longest distance to the left outfield wall in a baseball stadium is 355 feet. Is this stadium an outlier if it is added to the data set?

Education The numbers of students in 10 randomly chosen classes in a high school are 18, 22, 26, 31, 25, 20, 23, 26, 29, and 30. Use this information for Exercises 5–6.

5. Create a dot plot for the data set using an appropriate scale for the number line.

6. Suppose that a new class is opened for enrollment and currently has 7 students. Is this class an outlier if it is added to the data set?

[Dot plot for Exercise 1: Number of Members, number line from 60 to 110, tick marks labeled by 10s. Possible plot shown.]

Exercise 2: 150 > 100 + 1.5(100 - 87.5) = 118.75 True. 150 members is an outlier.

[Dot plot for Exercise 3: Number of Feet, number line from 300 to 350, tick marks labeled by 10s. Possible plot shown.]

Exercise 4: 355 > 335 + 1.5(335 - 320) = 357.5 False. 355 feet is not an outlier.

[Dot plot for Exercise 5: Number of Students, number line from 16 to 36, tick marks labeled by 4s. Possible plot shown.]

Exercise 6: 7 < 20 - 1.5(29 - 20) = 6.5 False. 7 is not an outlier.

Exercise | Depth of Knowledge (D.O.K.) | Mathematical Practices
13–15 | 2 Skills/Concepts | MP.4 Modeling
16 | 1 Recall of Information | MP.5 Using Tools
17–18 | 3 Strategic Thinking | MP.3 Logic
19 | 2 Skills/Concepts | MP.4 Modeling

EVALUATE

ASSIGNMENT GUIDE

Concepts and Skills | Practice
Explore: Using Dot Plots to Display Data | Exercises 1, 3, 5, 7
Example 1: The Effects of an Outlier in a Data Set | Exercises 2, 4, 6, 8, 17–18
Example 2: Comparing Data Sets | Exercises 9–12
Example 3: Comparing Data Distributions | Exercises 13–16, 19

INTEGRATE MATHEMATICAL PRACTICES

Focus on Critical Thinking

MP.3 Understanding how outliers can affect the mean and the median of a data set is an important skill, especially for interpreting data. Discuss how statistics can be misleading when outliers that affect the mean value for a data set are included.


Sports The average bowling scores for a group of bowlers are 200, 210, 230, 220, 230, 225, and 240. Use this information for Exercises 7–8.

7. Create a dot plot for the data set using an appropriate scale for the number line.

8. Suppose that a new bowler joins this group and has an average score of 275. Is this bowler an outlier in the data set?

The tables describe the average ages of employees from two randomly chosen companies. Use this information for Exercises 9–10.

Company A

23, 29, 35, 46, 51, 50, 42, 37, 30

Company B

24, 23, 45, 45, 42, 52, 55, 47, 55

9. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set.

          | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Company A |      |        |               |
Company B |      |        |               |

10. Compare the data sets.

The tables describe the size of microwaves, in cubic feet, chosen randomly from two competing companies. Use this information for Exercises 11–12.

Company A

1.8, 2.1, 3.1, 2.0, 3.3, 2.9, 3.3, 2.1, 3.2

Company B

1.9, 2.6, 1.8, 3.0, 2.5, 2.8, 2.0, 3.6, 3.1

11. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set.

          | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Company A |      |        |               |
Company B |      |        |               |

12. Compare the data sets.

[Dot plot: Bowling Scores, number line from 200 to 250. Possible plot shown.]

275 > 230 + 1.5(230 - 210) = 260 is true, so 275 is an outlier.

          | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Company A | 38.1 | 37 | 18.5 | 9.27
Company B | 43.1 | 45 | 20.5 | 11.33

Employees at company A tend to be younger than employees at company B. The ages of employees at company A are closer together than the ages of employees at company B.
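The statistics for Exercise 9 can be checked with a short Python sketch. It assumes the quartile convention that the answers appear to follow (Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out when the count is odd) and the population standard deviation (divide by the number of values). The helper names `quartiles` and `summarize` are illustrative and not part of the lesson.

```python
from statistics import mean, median, pstdev


def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the middle value
    belongs to neither half.
    """
    s = sorted(data)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])


def summarize(data):
    """Mean, median, IQR, and population standard deviation, rounded for display."""
    q1, q3 = quartiles(data)
    return {
        "mean": round(mean(data), 1),
        "median": median(data),
        "iqr": q3 - q1,
        "sd": round(pstdev(data), 2),
    }


company_a = [23, 29, 35, 46, 51, 50, 42, 37, 30]
company_b = [24, 23, 45, 45, 42, 52, 55, 47, 55]

for name, ages in (("Company A", company_a), ("Company B", company_b)):
    print(name, summarize(ages))
```

Other quartile conventions (for example, those used by spreadsheet software) can give slightly different values for Q1 and Q3, so the IQR may not match the table if a different method is used.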

Microwaves from company B tend to be smaller than microwaves from company A. The sizes of microwaves tend to be closer together at company B than at company A.

          | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Company A | 2.6 | 2.9 | 1.2 | 0.59
Company B | 2.6 | 2.6 | 1.1 | 0.57


MODELING

To help students think about possible causes for outliers in a data set, ask them to consider the distribution of heights of all the people in a kindergarten classroom, in a high school classroom, and on a basketball court. Discuss how many outliers might be expected in each case, and what factors might affect the number of outliers in each situation. Students should recognize that there is often a reason why one value is very different from the others in a data set, such as the fact that a kindergarten teacher may be the only adult in the classroom.

AVOID COMMON ERRORS

Make sure students understand the process for determining the standard deviation for a data set. Encourage them to first create a table to record the deviation and squared deviation for each data value, then add the squared deviations, divide the sum by the number of values, and finally find the square root. Point out that students who do not record their work can easily overlook a step in the process.
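The steps described above can be sketched in Python as follows. This is only an illustration: it borrows the data set from Exercise 16 (8, 15, 12, 10, 5) and follows the divide-by-the-number-of-values (population) form of the standard deviation named in the note. The variable names are illustrative.

```python
from math import sqrt

data = [8, 15, 12, 10, 5]  # data set borrowed from Exercise 16
mean = sum(data) / len(data)

# Step 1: record the deviation and squared deviation for each data value.
table = [(x, x - mean, (x - mean) ** 2) for x in data]
print(f"{'value':>6} {'deviation':>10} {'squared':>10}")
for x, deviation, squared in table:
    print(f"{x:>6} {deviation:>10.1f} {squared:>10.1f}")

# Step 2: add the squared deviations.
total = sum(squared for _, _, squared in table)

# Step 3: divide the sum by the number of values.
variance = total / len(data)

# Step 4: find the square root.
std_dev = sqrt(variance)

print(f"sum of squares = {total}, variance = {variance}, "
      f"standard deviation = {std_dev:.2f}")
```

Keeping each step in its own variable mirrors the written table, so a skipped step shows up as a missing line rather than a wrong final number.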


For each data set, make a dot plot and determine the type of distribution. Then explain what the distribution means for each data set.

13. Sports The data table shows the number of miles run by members of two teams running a marathon.

Miles | 5 | 10 | 15 | 20 | 25
Members of Team A | 3 | 5 | 10 | 5 | 3
Members of Team B | 6 | 10 | 4 | 1 | 5

14. Sales The data table shows the number of days that specific numbers of turkeys were sold. These days were in the two weeks before Thanksgiving.

Number of Turkeys | 10 | 20 | 30 | 40
Grocery Store A | 2 | 5 | 5 | 2
Grocery Store B | 5 | 5 | 1 | 3

[Dot plots for Exercise 13: Team A and Team B, each on a number line labeled Miles from 5 to 30.]

[Dot plots for Exercise 14: Grocery Store A and Grocery Store B, each on a number line labeled Number of Turkeys from 0 to 50.]

The data for team A show a symmetric distribution. The distances run are evenly distributed about the mean.

The data for team B show a right-skewed distribution. This means that fewer than half of the team members ran a distance greater than the mean.

Possible plot shown.

The data for grocery store A show a symmetric distribution. This means that the numbers of turkeys sold per day are evenly distributed about the mean.

The data for grocery store B show a right-skewed distribution. This means that the store sold fewer than the average number of turkeys for more than half of the days.
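The claims about the mean in these answers can be checked numerically. The sketch below expands the frequency table from Exercise 13 into full data lists, then reports the mean, the median, and how many values lie above the mean. Comparing the mean with the median is only a rule of thumb for skew (a mean above the median is consistent with a right-skewed distribution), not a definition. The helper name `expand` is illustrative.

```python
from statistics import mean, median


def expand(values, counts):
    """Turn a frequency table into the full list of data values."""
    return [v for v, c in zip(values, counts) for _ in range(c)]


miles = [5, 10, 15, 20, 25]
team_a = expand(miles, [3, 5, 10, 5, 3])
team_b = expand(miles, [6, 10, 4, 1, 5])

for name, data in (("Team A", team_a), ("Team B", team_b)):
    m = mean(data)
    above = sum(1 for x in data if x > m)
    print(f"{name}: mean = {m:.2f}, median = {median(data)}, "
          f"{above} of {len(data)} values above the mean")
```

For Team A the mean and median coincide, which fits the symmetric description. For Team B the mean is above the median and fewer than half of the runners are above the mean.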


CRITICAL THINKING

Have students analyze and describe the shape of the distribution of a dot plot they created. Ask students how the shape relates to the statistics they would use to characterize the data.


15. State whether each set of data is left-skewed, right-skewed, or symmetrically distributed.

A. 3, 5, 5, 3
B. 1, 1, 3, 1
C. 7, 9, 9, 11
D. 5, 5, 3, 3
E. 19, 21, 21, 19

H.O.T. Focus on Higher Order Thinking

16. What If? Given the data set 8, 15, 12, 10, and 5, what happens to the mean if you add a data value of 40? Is 40 an outlier of the new data set?

17. Critical Thinking Can an outlier be a data value between Q1 and Q3? Justify your answer.

18. Justify Reasoning If the distribution has outliers, why will they always have an effect on the range?

19. Education The data table describes the average testing scores in 20 randomly selected classes in two randomly selected high schools, rounded to the nearest ten. For each data set, make a dot plot, determine the type of distribution, and explain what the distribution means in context.

Average Scores | 0 | 10 | 20 | 30 | 40 | 50 | 60 | 70 | 80 | 90 | 100
School A | 0 | 1 | 2 | 2 | 3 | 4 | 3 | 2 | 2 | 1 | 0
School B | 0 | 1 | 1 | 1 | 2 | 4 | 5 | 4 | 2 | 0 | 0

[Dot plots for Exercise 19: School A and School B, each on a number line labeled Test Scores from 0 to 100.]

The data for school A show a symmetric distribution. This means that the test scores were evenly distributed about the mean test score.

The data for school B show a left-skewed distribution. This means that more than half of the classes received a test score that was above the mean.
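For readers working without the printed figures, a rough text version of the two dot plots can be produced from the frequency table in Exercise 19. The sketch below is one possible way to do it: each score gets a column, and each class at that score adds one x to the column. The function name `text_dot_plot` and the column layout are illustrative choices, not part of the lesson.

```python
def text_dot_plot(values, counts, label):
    """Build a plain-text dot plot: one column per value, one x per occurrence."""
    width = max(len(str(v)) for v in values) + 1
    rows = []
    # Draw from the tallest stack down to the number line.
    for level in range(max(counts), 0, -1):
        cells = ("x" if c >= level else "" for c in counts)
        rows.append("".join(cell.rjust(width) for cell in cells).rstrip())
    rows.append("".join(str(v).rjust(width) for v in values))
    rows.append(label)
    return "\n".join(rows)


scores = list(range(0, 101, 10))
school_a = [0, 1, 2, 2, 3, 4, 3, 2, 2, 1, 0]
school_b = [0, 1, 1, 1, 2, 4, 5, 4, 2, 0, 0]

print(text_dot_plot(scores, school_a, "Average Scores, School A"))
print()
print(text_dot_plot(scores, school_b, "Average Scores, School B"))
```

In the printed result, School A's stacks rise and fall evenly around 50, while School B's stacks peak at 60 with a longer tail toward the low scores.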

A. symmetric
B. right-skewed
C. symmetric
D. symmetric
E. symmetric

The mean increases from 10 to 15. 40 is an outlier of the new data set because 40 > 25.5.

An extreme value such as the max or min value can be an outlier, but by definition, no value between Q1 and Q3 can be an outlier.

When present, outliers always affect the range, because one of the outliers is either the highest or the lowest number in the data set, and the range is the difference between the highest and lowest numbers.
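The answer to Exercise 16 can be reproduced with a short Python sketch. It assumes the same quartile convention as the worked answers (medians of the lower and upper halves of the sorted data) and the usual 1.5 × IQR rule for outliers. The function name `fences` is illustrative.

```python
from statistics import mean, median


def fences(data):
    """Return the lower and upper outlier cutoffs, Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    s = sorted(data)
    half = len(s) // 2
    q1, q3 = median(s[:half]), median(s[-half:])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


original = [8, 15, 12, 10, 5]
extended = original + [40]

low, high = fences(extended)
print("mean before:", mean(original))
print("mean after: ", mean(extended))
print("upper cutoff:", high)
print("40 is an outlier:", 40 > high)
```

The cutoffs are computed from the new data set, which already includes 40, matching the wording of the exercise.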


JOURNAL

Have students create their own graphic organizers to share with classmates, outlining the steps for finding mean, median, Q1, Q3, IQR, and standard deviation from a dot plot.


Lesson Performance Task

The tables list the daily car sales of two competing dealerships.

Dealer A: 14, 13, 15, 12, 15, 16, 15, 17, 17, 12, 16, 14, 15, 16, 14, 16, 13, 14, 18, 15

Dealer B: 16, 17, 15, 20, 18, 19, 18, 17, 19, 10, 19, 18, 15, 17, 20, 19, 18, 18, 16, 17

A. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set. Compare the measures of center for the two dealers.

         | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Dealer A |      |        |               |
Dealer B |      |        |               |

B. Create a dot plot for each data set. Compare the distributions of the data sets.

C. Determine if there are any outliers in the data sets. If there are, remove the outlier and find the statistics for that data set(s). What was affected by the outlier?

[Dot plots: Dealer A and Dealer B, each on a number line from 10 to 20.]

The number of cars sold by Dealer A tends to be lower than the number of cars sold by Dealer B.

The numbers of cars sold by Dealer A are more consistent than the numbers of cars sold by Dealer B.

         | Mean | Median | IQR (Q3 – Q1) | Standard deviation
Dealer A | 14.85 | 15 | 2 | 1.6
Dealer B | 17.3 | 18 | 2.5 | 2.2

The data for Dealer A show a symmetric distribution, so the number of cars sold daily by Dealer A is evenly distributed about the mean.

The data for Dealer B show a distribution skewed to the left, so during more than half of the days, car sales were greater than the mean.

Dealer A:

x < 14 - 1.5(2) or x > 16 + 1.5(2)

x < 11 or x > 19

There are no values in the data set that satisfy these inequalities for x. So, there are no outliers.

Dealer B:

x < 16.5 - 1.5(2.5) or x > 19 + 1.5(2.5)

x < 12.75 or x > 22.75

10 is an outlier in the data set for Dealer B. Removing the outlier increases the mean and decreases the standard deviation. The median is unaffected.
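The effect of removing the outlier can be checked with a short Python sketch. It assumes the Dealer B values as listed in the task table, the quartile convention used in the worked answer (medians of the lower and upper halves), and the population standard deviation. The variable names are illustrative.

```python
from statistics import mean, median, pstdev

dealer_b = [16, 17, 15, 20, 18, 19, 18, 17, 19, 10,
            19, 18, 15, 17, 20, 19, 18, 18, 16, 17]

# Quartiles as medians of the lower and upper halves of the sorted data.
s = sorted(dealer_b)
half = len(s) // 2
q1, q3 = median(s[:half]), median(s[-half:])
iqr = q3 - q1

# Outlier cutoffs from the 1.5 x IQR rule.
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [x for x in dealer_b if x < low or x > high]
trimmed = [x for x in dealer_b if low <= x <= high]

print("cutoffs:", low, high)
print("outliers:", outliers)
for label, data in (("with outlier", dealer_b), ("without outlier", trimmed)):
    print(f"{label}: mean = {mean(data):.2f}, median = {median(data)}, "
          f"standard deviation = {pstdev(data):.2f}")
```

Comparing the two printed lines shows the pattern described in the answer: the mean rises, the standard deviation falls, and the median stays at 18.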


INTEGRATE MATHEMATICAL PRACTICES

Focus on Reasoning

MP.2 Ask students whether the dealer who tended to sell more cars than a competitor would necessarily make the greater profit. Students should recognize that a greater number of car sales leads to a greater profit only when the profit per car is about the same in both cases. If one dealer sold more cars by setting the prices so low that there was a very small profit margin, that dealer could end up with lower profits despite having more sales.

QUESTIONING STRATEGIES

What might be some reasons for an outlier to occur in a set of daily car sale values? Possible answers: There might have been a day with very bad weather, so no one went car shopping, or a day when the best salespeople were out sick, so they didn’t sell any cars.

Scoring Rubric

2 points: Student correctly solves the problem and explains his/her reasoning.
1 point: Student shows good understanding of the problem but does not fully solve or explain his/her reasoning.
0 points: Student does not demonstrate understanding of the problem.

EXTENSION ACTIVITY

Explain to students that a bimodal data distribution has two peaks. Have students create a set of 20 daily car-sale values with a bimodal distribution, then create a dot plot and calculate statistics for the data. Ask what situations might produce this distribution. Students may speculate that a sudden change in sales tactics or prices could lead to several days with much higher or lower sales values than preceding days. Point out that neither the mean nor the median accurately represents a bimodal distribution. Explain that in some cases, such as when the data originate from two different sets of conditions, it is appropriate to split the data into two data sets and evaluate them separately.
