Name Class Date
Resource Locker
© Houghton Mifflin Harcourt Publishing Company. Image credit: ©Blend Images/Alamy
Explore Using Dot Plots to Display Data
A dot plot is a data representation that uses a number line and Xs, dots, or other symbols to show frequency. Dot plots are sometimes called line plots.
Finance Twelve employees at a small company make the following annual salaries (in thousands of dollars): 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, and 60.
Choose the number line with the most appropriate scale for this problem. Explain your reasoning.
Create and label a dot plot of the data. Put an X above the number line for each time that value appears in the data set.
Reflect
1. Discussion Recall that quantitative data can be expressed as a numerical measurement. Categorical, qualitative data is expressed in categories, such as attributes or preferences. Is it appropriate to use a dot plot for displaying quantitative data, qualitative data, or both? Explain.
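[Figure: three candidate number lines: the first labeled 0, 50, 100; the second labeled 20, 30, 40, 50, 60, 70; the third labeled 20, 35, 50, 65, 80, 95]
[Figure: dot plot of the salary data on a number line from 20 to 70; axis label: Salary (thousands of dollars)]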
The second number line has the most appropriate scale. The scale of the first number line
includes a larger range of numbers than necessary, so dots will be clustered in the middle.
The scale of the third number line does not have convenient tick marks for determining
where values between the labels belong.
A dot plot uses a number line, so it is only appropriate for displaying quantitative data.
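The Explore's instruction to put one X above the number line for each occurrence of a value can be sketched in code. The following is only an illustrative sketch, not part of the lesson: the function name `text_dot_plot` and the column width of 4 are assumptions, and it assumes every data value falls exactly on a tick mark.

```python
from collections import Counter

def text_dot_plot(values, start, stop, step):
    """Build a text dot plot; assumes every value falls exactly on a tick."""
    counts = Counter(values)
    ticks = list(range(start, stop + 1, step))
    tallest = max(counts.values())
    rows = []
    # Draw from the top row down so the stacks of x's grow upward.
    for level in range(tallest, 0, -1):
        row = "".join(("x" if counts[t] >= level else " ").center(4) for t in ticks)
        rows.append(row.rstrip())
    # The last row is the labeled number line.
    rows.append("".join(str(t).center(4) for t in ticks).rstrip())
    return "\n".join(rows)

# Salaries from the Explore, in thousands of dollars.
salaries = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60]
plot = text_dot_plot(salaries, 20, 70, 5)
print(plot)
```

The scale from 20 to 70 matches the second number line chosen in the Explore; a tick every 5 is used here so that values such as 25 and 35 have their own column.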
Module 9 389 Lesson 2
9.2 Data Distributions and Outliers
Essential Question: What statistics are most affected by outliers, and what shapes can data distributions have?
Common Core Math Standards
The student is expected to:
S-ID.1
Represent data with plots on the real number line (dot plots, histograms, and box plots). Also S-ID.2, S-ID.3, N-Q.1
Mathematical Practices
MP.2 Reasoning
Language Objective
Explain to a partner what an outlier is.
HARDCOVER PAGES 317–326
Turn to these pages to find this lesson in the hardcover student edition.
Data Distributions and Outliers
ENGAGE
Essential Question: What statistics are most affected by outliers, and what shapes can data distributions have?
Outliers affect the mean more than the median, and they affect the standard deviation more than the IQR. Data distributions can be described generally as symmetric, skewed to the left, or skewed to the right.
PREVIEW: LESSON PERFORMANCE TASK
View the Engage section online. Discuss why, if you owned a business, you might compare a competitor’s sales to your company’s sales, and how your findings might lead you to change the way you run your business. Then preview the Lesson Performance Task.
Explain 1 The Effects of an Outlier in a Data Set
An outlier is a value in a data set that is much greater or much less than most of the other values in the data set. Outliers are determined by using the first or third quartiles and the IQR.
How to Identify an Outlier
A data value x is an outlier if x < Q1 - 1.5(IQR) or if x > Q3 + 1.5(IQR).
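The outlier rule above can be sketched as a short Python check. This is a sketch under stated assumptions: the helper names `quartiles` and `is_outlier` are hypothetical, and Q1 and Q3 are taken as the medians of the lower and upper halves of the ordered data, leaving out the middle value when the count is odd (the convention the lesson's worked answers follow). The data list is assumed to already include the value being tested.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]               # values before the median position
    upper = data[len(data) - half:]   # values after the median position
    return median(lower), median(upper)

def is_outlier(x, values):
    """Apply x < Q1 - 1.5(IQR) or x > Q3 + 1.5(IQR)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr

# Explore salaries (thousands of dollars) plus an owner's salary of 150.
salaries = [25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, 150]
q1, q3 = quartiles(salaries)
print(q1, q3, is_outlier(150, salaries))
```

For this list the upper fence is Q3 + 1.5(IQR) = 47.5 + 1.5(12.5) = 66.25, so 150 is flagged as an outlier.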
Example 1 Create a dot plot for the data set using an appropriate scale for the number line. Determine whether the extreme value is an outlier.
A Suppose that the list of salaries from the Explore is expanded to include the owner’s salary of $150,000. Now the list of salaries is 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, and 150.
To choose an appropriate scale, consider the minimum and maximum values, 25 and 150.
A number line from 20 to 160 will contain all the values. A scale of 5 will be convenient for the data. Label tick marks by 20s.
Plot each data value to see the distribution.
[Figure: dot plot of the salaries on a number line from 20 to 160, labeled by 20s; axis label: Salary (thousands of dollars)]
Find the quartiles and the IQR to determine whether 150 is an outlier.
150 >? Q3 + 1.5(IQR)
150 >? 47.5 + 1.5(47.5 - 35)
150 > 66.25 True
150 is an outlier.
B Suppose that the salaries from Part A were adjusted so that the owner’s salary is $65,000. Now the list of salaries is 25, 30, 35, 35, 35, 40, 40, 40, 45, 45, 50, 60, and 65.
To choose an appropriate scale, consider the minimum and maximum data values, 25 and 65.
A number line from 20 to 70 will contain all the data values.
A scale of 5 will be convenient for the data.
Label tick marks by 10s.
Plot each data value to see the distribution.
[Figure: dot plot of the salaries on a number line from 20 to 70, labeled by 10s; axis label: Salary (thousands of dollars)]
PROFESSIONAL DEVELOPMENT
Integrate Mathematical Practices
This lesson provides an opportunity to address Mathematical Practice MP.2, which calls for students to “reason abstractly and quantitatively.” Students solve real-world problems by creating dot plots for data sets. They analyze and describe the shapes of the data distributions, recognizing how the shapes affect the measures of center and spread, and they use both dot plots and statistical measures to compare data sets. Thus, they first take a situation from its real-world context to represent it symbolically, then they interpret the results in the real-world context.
EXPLORE Using Dot Plots to Display Data
INTEGRATE TECHNOLOGY
To make it easier to create a dot plot for a large data set, students can enter the data values into one column of a spreadsheet, then use the spreadsheet’s data-sorting function to arrange them in increasing order.
QUESTIONING STRATEGIES
How can you use a dot plot to find the interquartile range of a data set? First, find the median by counting the same number of marks from each end of the dot plot until the middle value is reached. If there is an even number of marks, find the mean of the two middle values. Then use the same process to find the first quartile (Q1, the middle value of the lower half) and the third quartile (Q3, the middle value of the upper half). Finally, subtract Q1 from Q3 to find the interquartile range.
EXPLAIN 1 The Effects of an Outlier in a Data Set
AVOID COMMON ERRORS
Students sometimes forget to take the square root of the mean of the squared deviations when calculating standard deviation. Review the steps for calculating the standard deviation.
QUESTIONING STRATEGIES
How does an outlier affect the mean and median of a data set? If a data set includes an outlier, the mean can be increased or decreased significantly. This can make the mean misleading as a measure of center. When there are no outliers, most data values cluster closer to the mean. The median is much less affected by an outlier, because a single outlier shifts the middle of the data set by only a small amount, if at all.
Find the quartiles and the IQR to determine whether 65 is an outlier.
65 >? Q3 + 1.5(IQR)
65 >? 47.5 + 1.5(47.5 - 35)
65 > 66.25 False
Therefore, 65 is not an outlier.
Reflect
2. Explain why the median was NOT affected by changing the maximum data value from 150 to 65.
The maximum value in the data set changed, but its ordered position did not, so the middle value in the ordered list was not moved or changed.
Your Turn
3. Sports Baseball pitchers on a major league team throw at the following speeds (in miles per hour): 72, 84, 89, 81, 93, 100, 90, 88, 80, 84, and 87. Create a dot plot using an appropriate scale for the number line. Determine whether the extreme value is an outlier.
[Figure: dot plot of the pitching speeds on a number line from 70 to 100, labeled by 5s; axis label: Pitching Speeds (mph)]
72 <? Q1 - 1.5(IQR)
72 <? 81 - 1.5(9)
72 < 67.5 False
Therefore, 72 is not an outlier.
Explain 2 Comparing Data Sets
Numbers that characterize a data set, such as measures of center and spread, are called statistics. They are useful when comparing large sets of data.
Example 2 Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set, and then compare the data.
Sports The tables list the average ages of players on 15 teams randomly selected from the 2010 teams in the National Football League (NFL) and Major League Baseball (MLB). Describe how the average ages of NFL players compare to those of MLB players.
NFL Players’ Average Ages, by Team
25.8, 26.0, 26.3, 25.7, 25.1, 25.2, 26.1, 26.4, 25.9, 26.6, 26.3, 26.2, 26.8, 25.6, 25.7
MLB Players’ Average Ages, by Team
28.5, 29.0, 28.0, 27.8, 29.5, 29.1, 26.9, 28.9, 28.6, 28.7, 26.9, 30.5, 28.7, 28.9, 29.3
COLLABORATIVE LEARNING
Peer-to-Peer Activity
Have students work in pairs. Have each student create a data set with 10 values, using the definition of outlier to verify that none of the values are outliers. Students then find the mean, median, range, and IQR for their data sets. Have students trade data sets with their partners. Ask each student to add an outlier to the partner’s data set, and then calculate the new mean, median, range, and IQR for the set. Students should compare their results and discuss how the outliers affected the statistics.
EXPLAIN 2 Comparing Data Sets
INTEGRATE MATHEMATICAL PRACTICES
Focus on Technology
MP.5 Review the steps for generating statistics using a graphing calculator. Students can create a list by pressing STAT, then selecting 1:Edit. A previously entered list can be cleared by highlighting the name of the list, pressing CLEAR, then pressing the down arrow.
After entering data in a list, students can find the one-variable statistics by pressing STAT, selecting CALC, and then selecting 1:1-Var Stats. For data in lists other than L1, they must enter the list number before pressing ENTER to generate the statistics.
AVOID COMMON ERRORS
Students may expect their graphing calculators to provide the value of the IQR. Remind them that they must calculate the IQR by finding the difference between the first and third quartiles.
On a graphing calculator, enter the two sets of data into L1 and L2.
Use the “1-Var Stats” feature to find statistics for the data in lists L1 and L2. Your calculator may use the following notations: mean x̄, standard deviation σx.
Scroll down to see the median (Med), Q1, and Q3. Complete the table.
      Mean    Median   IQR (Q3 - Q1)   Standard deviation
NFL   25.98   26.00    0.60            0.46
MLB   28.62   28.70    1.10            0.91
Compare the corresponding statistics.
The mean age and median age are lower for the NFL than for the MLB, which means that NFL players tend to be younger than MLB players. In addition, the IQR and standard deviation are smaller for the NFL than for the MLB, which means that the ages of NFL players are closer together than those of MLB players.
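For readers without a graphing calculator, the same four statistics can be reproduced with Python's standard `statistics` module. This is a sketch, not the lesson's method: the helper names `iqr` and `summarize` are hypothetical, the quartiles are assumed to be medians of the lower and upper halves of the ordered data, and `pstdev` (the population standard deviation) is used because it corresponds to the calculator's σx.

```python
from statistics import mean, median, pstdev

def iqr(values):
    """Q3 - Q1, with quartiles taken as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[len(data) - half:]) - median(data[:half])

def summarize(values):
    """Return the four statistics from the table, rounded to 2 places."""
    return {
        "mean": round(mean(values), 2),
        "median": round(median(values), 2),
        "iqr": round(iqr(values), 2),
        "std_dev": round(pstdev(values), 2),  # population form, like sigma-x
    }

nfl = [25.8, 26.0, 26.3, 25.7, 25.1, 25.2, 26.1, 26.4,
       25.9, 26.6, 26.3, 26.2, 26.8, 25.6, 25.7]
mlb = [28.5, 29.0, 28.0, 27.8, 29.5, 29.1, 26.9, 28.9,
       28.6, 28.7, 26.9, 30.5, 28.7, 28.9, 29.3]

nfl_stats = summarize(nfl)
mlb_stats = summarize(mlb)
print("NFL", nfl_stats)
print("MLB", mlb_stats)
```

Both the center (mean, median) and the spread (IQR, standard deviation) come out smaller for the NFL list, which is the comparison made in the paragraph above.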
The tables list the ages of 10 contestants on 2 game shows.
Game Show 1
18, 20, 25, 48, 35, 39, 46, 41, 30, 27
Game Show 2
24, 29, 36, 32, 34, 41, 21, 38, 39, 26
On a graphing calculator, enter the two sets of data into L1 and L2.
Complete the table. Then circle the correct items to compare the statistics.
         Mean   Median   IQR (Q3 - Q1)   Standard deviation
Show 1   32.9   32.5     16              10.00
Show 2   32     33       12              6.45
The mean is lower for the 1st / 2nd game show, which means that contestants in the 1st / 2nd game show are on average younger than contestants in the 1st / 2nd game show. However, the median is lower for the 1st / 2nd game show, which means that although contestants are on average younger on the 1st / 2nd game show, there are more young contestants on the 1st / 2nd game show. Finally, the IQR and standard deviation are higher for the 1st / 2nd game show, which means that the ages of contestants on the 1st / 2nd game show are further apart than the age of contestants on the 1st / 2nd game show.
DIFFERENTIATE INSTRUCTION
Multiple Representations
Students may benefit from acting out a real-world example of how adding an outlier to a data set affects measures of center and spread. For example, have five students each begin with 1 to 5 slips of paper (or pennies or markers); each slip represents a dollar. Have the students calculate the mean by equally distributing all the slips of paper among the five students. Then have a sixth student with $25 (25 slips of paper) join the group. Again use the slips of paper to find the mean by distributing them among the six students. Ask whether the new mean is a reasonable measure of center.
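The slips-of-paper activity can also be checked numerically. The sketch below assumes, purely for illustration, that the five students start with 1, 2, 3, 4, and 5 slips; the activity itself does not fix those amounts.

```python
from statistics import mean, median

# Assumption for illustration: the five students hold 1, 2, 3, 4, and 5 slips.
group = [1, 2, 3, 4, 5]
with_newcomer = group + [25]   # a sixth student joins with 25 slips

mean_before, median_before = mean(group), median(group)
mean_after, median_after = mean(with_newcomer), median(with_newcomer)

print("before:", mean_before, median_before)
print("after: ", round(mean_after, 2), median_after)
```

Under this assumption the mean more than doubles (from 3 to about 6.67), while the median moves only from 3 to 3.5, so the new mean is larger than what five of the six students actually hold.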
QUESTIONING STRATEGIES
What can you conclude about two data sets by comparing each of the following statistics: mean, median, IQR, and standard deviation? By comparing the mean and median values, you can conclude whether the typical value for one data set is higher or lower than the typical value for the other set. By comparing the IQR and standard deviation values, you can determine whether the data values in one set are more or less spread out than the values in the other set.
Your Turn
4. The tables list the age of each member of Congress in two randomly selected states. Complete the table and compare the data.
Illinois
26, 24, 28, 46, 39, 59, 31, 26, 64, 40, 69, 62, 31, 28, 26, 76, 57, 71, 58, 35, 32, 49, 51, 22, 33, 56
Arizona
42, 37, 58, 32, 46, 42, 26, 56, 27
           Mean    Median   IQR (Q3 - Q1)   Standard deviation
Illinois   43.81   39.5     30              16.42
Arizona    40.67   42       21.5            10.84
The mean is lower for Arizona, which means that, on average, members of Congress tend to be younger in Arizona than in Illinois. However, the median is lower in Illinois, which means that there are more young members of Congress in Illinois despite the differences in average age. Finally, the IQR and standard deviation are lower for Arizona, which means that the ages of members of Congress are closer together in Arizona than they are in Illinois.
Explain 3 Comparing Data Distributions
A data distribution can be described as symmetric, skewed to the left, or skewed to the right, depending on the general shape of the distribution in a dot plot or other data display.
[Figure: three example dot plots labeled Symmetric, Skewed to the Left, and Skewed to the Right]
Example 3 For each data set, make a dot plot and determine the type of distribution. Then explain what the distribution means for each data set.
Sports The data table shows the number of miles run by members of two track teams during one day.
Miles               3   3.5   4   4.5   5   5.5   6
Members of Team A   2   3     4   4     3   2     0
Members of Team B   1   2     2   3     3   4     3
LANGUAGE SUPPORT
Connect Vocabulary
English learners who are working on acquiring academic English in algebra may find that some terminology is difficult to pronounce or to differentiate when listening. Words such as effect and affect may be difficult to distinguish, and words such as skew or interquartile may be difficult to pronounce. Be sure to enunciate clearly so that students can understand and learn to pronounce the key words correctly.
EXPLAIN 3 Comparing Data Distributions
AVOID COMMON ERRORS
Students often confuse the terms skewed to the left and skewed to the right. Encourage students to come up with a mnemonic to help them remember how the direction of a skew should be described. For example, students may easily remember how the “tail” of a data distribution looks on a dot plot. Point out that both tail and skew have four letters, and that a data distribution is skewed in the direction of its tail.
QUESTIONING STRATEGIES
Some data distributions are described as uniform. What do you think the general shape of a uniform distribution would be? The general shape of a uniform distribution is fairly even across the plot.
What would be true about the mean and median of a data set with a uniform distribution? The mean and median would be approximately equal.
Team A
[Figure: dot plot for Team A on a number line from 3 to 6; axis label: Miles]
The data for team A show a symmetric distribution. This means that the distances run are evenly distributed about the mean.
Team B
[Figure: dot plot for Team B on a number line from 3 to 6; axis label: Miles]
The data for team B show a distribution skewed to the left. This means that more than half the team members ran a distance greater than the mean.
B The table shows the number of days, over the course of a month, that specific numbers of apples were sold by competing grocers.
Number of Apples Sold   0   50   100   150   200   250   300
Grocery Store A         1   4    8     8     4     1     0
Grocery Store B         3   6    8     8     2     2     1
[Figure: dot plots for Grocery Store A and Grocery Store B on number lines from 0 to 400; axis label: Number of Apples Sold]
The distribution for grocery store A is: left-skewed / right-skewed / symmetric. This means that the number of apples sold each day is evenly / unevenly distributed about the mean.
The distribution for grocery store B is: left-skewed / right-skewed / symmetric. This means that the number of apples sold each day is evenly / unevenly distributed about the mean.
Reflect
7. Will the mean and median in a symmetric distribution always be approximately equal? Explain.
The mean and median in a symmetric distribution will always be approximately equal because the values are equally distributed on either side of the center.
8. Will the mean and median in a skewed distribution always be approximately equal? Explain.
The mean and median in a skewed distribution will not always be approximately equal because the median will sometimes be closer to where the values cluster than the mean will be.
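The Reflect answers can be checked against the track-team table from Example 3. The sketch below expands each frequency row into a flat list of distances and compares the mean with the median; the helper name `expand` is a hypothetical choice for this illustration.

```python
from statistics import mean, median

miles = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
team_a_counts = [2, 3, 4, 4, 3, 2, 0]
team_b_counts = [1, 2, 2, 3, 3, 4, 3]

def expand(values, counts):
    """Turn a frequency table into a flat list of data values."""
    return [v for v, c in zip(values, counts) for _ in range(c)]

team_a = expand(miles, team_a_counts)
team_b = expand(miles, team_b_counts)

print("Team A:", mean(team_a), median(team_a))
print("Team B:", round(mean(team_b), 2), median(team_b))
```

For the symmetric Team A data the mean and median coincide at 4.25 miles. For the left-skewed Team B data the mean (about 4.81) falls below the median (5), pulled toward the tail on the left.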
Your Turn
9. Sports The table shows the number of free throws attempted during a basketball game. Make a dot plot and determine the type of distribution. Then explain what the distribution means for the data set.
Free Throws Shot 0 2 4 6 8
Members of Team A 2 2 4 2 2
Members of Team B 3 4 2 2 1
[Figure: dot plots for Team A and Team B on number lines from 0 to 8; axis label: Number of Free Throws]
Team A: The data for team A show a symmetric distribution. This means that the number of free throws shot is evenly distributed about the mean.
Team B: The data for team B show a distribution skewed to the right. This means that fewer than half of the team members shot a number of free throws that was greater than the mean.
Elaborate
10. If the mean increases after a single data point is added to a set of data, what can you tell about this data point?
If the mean increases after a single data point is added to a set of data, you can tell that the data point added was larger than the mean of the set.
11. How can you use a calculation to decide whether a data point is an outlier in a data set?
You can decide whether a data point is an outlier in a data set by finding the 1st and 3rd quartiles and subtracting the 1st from the 3rd to get the interquartile range. If the data point is greater than the 3rd quartile plus 1.5 times the interquartile range, or less than the 1st quartile minus 1.5 times the interquartile range, then the data point is an outlier.
12. Essential Question Check-In What three shapes can data distributions have?
Data distributions can be skewed to the left, skewed to the right, or symmetric.
Exercise   Depth of Knowledge (D.O.K.)   Mathematical Practices
1–8        1 Recall                      MP.4 Modeling
9          1 Recall                      MP.5 Using Tools
10         2 Skills/Concepts             MP.7 Using Structure
11         1 Recall                      MP.5 Using Tools
12         2 Skills/Concepts             MP.7 Using Structure
ELABORATE
QUESTIONING STRATEGIES
Can a data set have more than one outlier? Explain. Yes; more than one value may be less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR).
INTEGRATE MATHEMATICAL PRACTICES
Focus on Critical Thinking
MP.3 Discuss with students whether all the values in a data set could be outliers. Review the definition of outlier. Students should understand that because an outlier must be less than Q1 or greater than Q3, values between Q1 and Q3 will never be outliers for a data set.
SUMMARIZE THE LESSON
How can you determine whether a value in a data set is an outlier? How does the inclusion of an outlier affect the mean, median, range, and IQR? An outlier is a value that is less than Q1 - 1.5(IQR) or greater than Q3 + 1.5(IQR). Outliers significantly affect the mean and range, but affect the median and IQR very little or not at all.
Evaluate: Homework and Practice
Fitness The numbers of members in 8 workout clubs are 100, 95, 90, 85, 85, 95, 100, and 90. Use this information for Exercises 1–2.
1. Create a dot plot for the data set using an appropriate scale for the number line.
[Figure: dot plot on a number line from 60 to 110; axis label: Number of Members] Possible plot shown.
2. Suppose that a new workout club opens and immediately has 150 members. Is the number of members at this new club an outlier?
150 >? 100 + 1.5(100 - 87.5) = 118.75 True
150 members is an outlier.
Sports The number of feet to the left outfield wall for 10 randomly chosen baseball stadiums is 315, 325, 335, 330, 330, 330, 320, 310, 325, and 335. Use this information for Exercises 3–4.
3. Create a dot plot for the data set using an appropriate scale for the number line.
[Figure: dot plot on a number line from 300 to 350; axis label: Number of Feet] Possible plot shown.
4. The longest distance to the left outfield wall in a baseball stadium is 355 feet. Is this stadium an outlier if it is added to the data set?
355 >? 335 + 1.5(335 - 320) = 357.5 False
355 feet is not an outlier.
Education The numbers of students in 10 randomly chosen classes in a high school are 18, 22, 26, 31, 25, 20, 23, 26, 29, and 30. Use this information for Exercises 5–6.
5. Create a dot plot for the data set using an appropriate scale for the number line.
[Figure: dot plot on a number line from 16 to 36; axis label: Number of Students] Possible plot shown.
6. Suppose that a new class is opened for enrollment and currently has 7 students. Is this class an outlier if it is added to the data set?
7 <? 20 - 1.5(29 - 20) = 6.5 False
7 is not an outlier.
Exercise   Depth of Knowledge (D.O.K.)   Mathematical Practices
13–15      2 Skills/Concepts             MP.4 Modeling
16         1 Recall of Information       MP.5 Using Tools
17–18      3 Strategic Thinking          MP.3 Logic
19         2 Skills/Concepts             MP.4 Modeling
EVALUATE
ASSIGNMENT GUIDE
Concepts and Skills                                   Practice
Explore: Using Dot Plots to Display Data              Exercises 1, 3, 5, 7
Example 1: The Effects of an Outlier in a Data Set    Exercises 2, 4, 6, 8, 17–18
Example 2: Comparing Data Sets                        Exercises 9–12
Example 3: Comparing Data Distributions               Exercises 13–16, 19
INTEGRATE MATHEMATICAL PRACTICES
Focus on Critical Thinking
MP.3 Understanding how outliers can affect the mean and the median of a data set is an important skill, especially for interpreting data. Discuss how statistics can be misleading when outliers that affect the mean value for a data set are included.
Sports The average bowling scores for a group of bowlers are 200, 210, 230, 220, 230, 225, and 240. Use this information for Exercises 7–8.
7. Create a dot plot for the data set using an appropriate scale for the number line.
[Figure: dot plot on a number line from 200 to 250; axis label: Bowling Scores] Possible plot shown.
8. Suppose that a new bowler joins this group and has an average score of 275. Is this bowler an outlier in the data set?
275 >? 235 + 1.5(235 - 215) = 265 True
275 is an outlier.
The tables describe the average ages of employees from two randomly chosen companies. Use this information for Exercises 9–10.
Company A
23, 29, 35, 46, 51, 50, 42, 37, 30
Company B
24, 23, 45, 45, 42, 52, 55, 47, 55
9. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set.
            Mean   Median   IQR (Q3 - Q1)   Standard deviation
Company A   38.1   37       18.5            9.27
Company B   43.1   45       20.5            11.33
10. Compare the data sets.
Employees at company A tend to be younger than employees at company B. The ages of employees at company A are closer together than the ages of employees at company B.
The tables describe the size of microwaves, in cubic feet, chosen randomly from two competing companies. Use this information for Exercises 11–12.
Company A
1.8, 2.1, 3.1, 2.0, 3.3, 2.9, 3.3, 2.1, 3.2
Company B
1.9, 2.6, 1.8, 3.0, 2.5, 2.8, 2.0, 3.6, 3.1
11. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set.
            Mean   Median   IQR (Q3 - Q1)   Standard deviation
Company A   2.6    2.9      1.2             0.59
Company B   2.6    2.6      1.1             0.57
12. Compare the data sets.
Microwaves from company B tend to be smaller than microwaves from company A. The sizes of microwaves tend to be closer together at company B than at company A.
MODELING
To help students think about possible causes for outliers in a data set, ask them to consider the distribution of heights of all the people in a kindergarten classroom, in a high school classroom, and on a basketball court. Discuss how many outliers might be expected in each case, and what factors might affect the number of outliers in each situation. Students should recognize that there is often a reason why one value is very different from the others in a data set, such as the fact that a kindergarten teacher may be the only adult in the classroom.
AVOID COMMON ERRORS
Make sure students understand the process for determining the standard deviation for a data set. Encourage them to first create a table to record the deviation and squared deviation for each data value, then add the squared deviations, divide the sum by the number of values, and finally find the square root. Suggest that when they do not record their work, students can easily overlook a step in the process.
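The steps for the standard deviation listed above can be written out one line per step. This is a minimal sketch: the function name `population_std_dev` is hypothetical, and the sample data is the set 8, 15, 12, 10, 5 from Exercise 16, chosen only as an illustration.

```python
from math import sqrt

def population_std_dev(values):
    """Follow the table-based steps: deviations, squares, mean, square root."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]   # deviation of each value
    squared = [d ** 2 for d in deviations]    # squared deviations
    variance = sum(squared) / n               # divide by the number of values
    return sqrt(variance)                     # the step students often skip

data = [8, 15, 12, 10, 5]
sigma = population_std_dev(data)
print(round(sigma, 2))
```

Because the sum of squared deviations is divided by the number of values, this is the population form of the standard deviation, the same quantity a graphing calculator reports as σx.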
For each data set, make a dot plot and determine the type of distribution. Then explain what the distribution means for each data set.
13. Sports The data table shows the number of miles run by members of two teams running a marathon.
Miles 5 10 15 20 25
Members of Team A 3 5 10 5 3
Members of Team B 6 10 4 1 5
[Figure: dot plots for Team A and Team B on number lines from 5 to 30; axis label: Miles] Possible plot shown.
The data for team A show a symmetric distribution. The distances run are evenly distributed about the mean.
The data for team B show a right-skewed distribution. This means that fewer than half of the team members ran a distance greater than the mean.
14. Sales The data table shows the number of days that specific numbers of turkeys were sold. These days were in the two weeks before Thanksgiving.
Number of Turkeys   10   20   30   40
Grocery Store A     2    5    5    2
Grocery Store B     5    5    1    3
[Figure: dot plots for Grocery Store A and Grocery Store B on number lines from 0 to 50; axis label: Number of Turkeys]
The data for grocery store A show a symmetric distribution. This means that the numbers of turkeys sold per day are evenly distributed about the mean.
The data for grocery store B show a right-skewed distribution. This means that the store sold fewer than the average number of turkeys for more than half of the days.
CRITICAL THINKING
Have students analyze and describe the shape of the distribution of a dot plot they created. Ask students how the shape relates to the statistics they would use to characterize the data.
15. State whether each set of data is left-skewed, right-skewed, or symmetrically distributed.
A. 3, 5, 5, 3
B. 1, 1, 3, 1
C. 7, 9, 9, 11
D. 5, 5, 3, 3
E. 19, 21, 21, 19
H.O.T. Focus on Higher Order Thinking
16. What If? Given the data set 8, 15, 12, 10, and 5, what happens to the mean if you add a data value of 40? Is 40 an outlier of the new data set?
17. Critical Thinking Can an outlier be a data value between Q1 and Q3? Justify your answer.
18. Justify Reasoning If the distribution has outliers, why will they always have an effect on the range?
19. Education The data table describes the average testing scores in 20 randomly selected classes in two randomly selected high schools, rounded to the nearest ten. For each data set, make a dot plot, determine the type of distribution, and explain what the distribution means in context.
Average Scores    0   10   20   30   40   50   60   70   80   90   100
School A          0    1    2    2    3    4    3    2    2    1     0
School B          0    1    1    1    2    4    5    4    2    0     0
[Dot plots: School A and School B (axis: Test Scores, 0 to 100).]
The data for school A show a symmetric distribution. This means that the test scores were evenly distributed about the mean test score.
The data for school B show a left-skewed distribution. This means that more than half of the classes received a test score that was above the mean.
A. symmetric
B. right-skewed
C. symmetric
D. symmetric
E. symmetric
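The mean increases from 10 to 15. For the new data set, Q1 = 8 and Q3 = 15, so the IQR is 7 and the upper boundary for outliers is 15 + 1.5(7) = 25.5. 40 is an outlier of the new data set because 40 > 25.5.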
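The arithmetic behind this answer can be reproduced with a short sketch. It assumes the quartile convention that matches the 25.5 boundary above: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the middle value left out when the count is odd. The function name `quartiles` is illustrative.

```python
from statistics import mean, median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data.
    When the count is odd, the middle value belongs to neither half."""
    s = sorted(data)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

original = [8, 15, 12, 10, 5]
extended = original + [40]

q1, q3 = quartiles(extended)        # 8 and 15
iqr = q3 - q1                       # 7
upper_fence = q3 + 1.5 * iqr        # 25.5

print(mean(original), mean(extended))   # 10 and 15
print(40 > upper_fence)                 # True, so 40 is an outlier
```

Other quartile conventions exist (for example, the default in Python's `statistics.quantiles`), and they can give a different boundary for the same data.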
An extreme value such as the max or min value can be an outlier, but by definition, no value between Q1 and Q3 can be an outlier.
When outliers are present, they always affect the range: at least one outlier is the highest or lowest value in the data set, and the range is the difference between the highest and lowest values.
JOURNAL
Have students create their own graphic organizers to share with classmates, outlining the steps for finding mean, median, Q1, Q3, IQR, and standard deviation from a dot plot.
Lesson Performance Task
The tables list the daily car sales of two competing dealerships.
Dealer A           Dealer B
14 13 15 12        16 17 15 20
15 16 15 17        18 19 18 17
17 12 16 14        19 10 19 18
15 16 14 16        15 17 20 19
13 14 18 15        18 18 16 17
A. Calculate the mean, median, interquartile range (IQR), and standard deviation for each data set. Compare the measures of center for the two dealers.
            Mean    Median    IQR (Q3 – Q1)    Standard deviation
Dealer A
Dealer B
B. Create a dot plot for each data set. Compare the distributions of the data sets.
C. Determine if there are any outliers in the data sets. If there are, remove the outlier and find the statistics for that data set(s). What was affected by the outlier?
[Dot plots: Dealer A and Dealer B (number line from 10 to 20).]
The number of cars sold by Dealer A tends to be lower than the number of cars sold by Dealer B.
The numbers of cars sold by Dealer A are more consistent than the numbers of cars sold by Dealer B.
            Mean     Median    IQR    Standard deviation
Dealer A    14.85    15        2      1.6
Dealer B    17.3     18        2.5    2.2
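The statistics for part A can be checked with the sketch below. It makes two assumptions that reproduce the values listed for both dealers: quartiles are taken as the medians of the lower and upper halves of the sorted data, and the standard deviation is the population standard deviation (`statistics.pstdev`). The function name `summarize` is illustrative.

```python
from statistics import mean, median, pstdev

dealer_a = [14, 13, 15, 12, 15, 16, 15, 17, 17, 12,
            16, 14, 15, 16, 14, 16, 13, 14, 18, 15]
dealer_b = [16, 17, 15, 20, 18, 19, 18, 17, 19, 10,
            19, 18, 15, 17, 20, 19, 18, 18, 16, 17]

def summarize(data):
    """Mean, median, IQR, and population standard deviation (rounded to 0.1)."""
    s = sorted(data)
    half = len(s) // 2
    q1 = median(s[:half])     # median of the lower half
    q3 = median(s[-half:])    # median of the upper half
    return {
        "mean": mean(s),
        "median": median(s),
        "iqr": q3 - q1,
        "sd": round(pstdev(s), 1),
    }

print(summarize(dealer_a))   # mean 14.85, median 15, IQR 2, SD about 1.6
print(summarize(dealer_b))   # mean 17.3, median 18, IQR 2.5, SD about 2.2
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly larger value for Dealer B.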
The data for Dealer A show a symmetric distribution, so the number of cars sold daily by Dealer A is evenly distributed about the mean.
The data for Dealer B show a distribution skewed to the left, so during more than half of the days, car sales were greater than the mean.
Dealer A:
x < 14 - 1.5(2) or x > 16 + 1.5(2)
x < 11 or x > 19
There are no values in the data set that satisfy these inequalities for x. So, there are no outliers.
Dealer B:
x < 16.5 - 1.5(2.5) or x > 19 + 1.5(2.5)
x < 12.75 or x > 22.75
10 is an outlier in the data set for Dealer B. Removing the outlier increases the mean and decreases the standard deviation. The median is unaffected.
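The outlier test and the effect of removing the outlier can be sketched as follows. The code applies the same boundaries used above, Q1 - 1.5(IQR) and Q3 + 1.5(IQR), with quartiles taken as medians of the lower and upper halves; the function name `find_outliers` is illustrative, and the population standard deviation is an assumption.

```python
from statistics import mean, median, pstdev

dealer_a = [14, 13, 15, 12, 15, 16, 15, 17, 17, 12,
            16, 14, 15, 16, 14, 16, 13, 14, 18, 15]
dealer_b = [16, 17, 15, 20, 18, 19, 18, 17, 19, 10,
            19, 18, 15, 17, 20, 19, 18, 18, 16, 17]

def find_outliers(data):
    """Return values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    s = sorted(data)
    half = len(s) // 2
    q1, q3 = median(s[:half]), median(s[-half:])
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]

outliers_a = find_outliers(dealer_a)   # no values outside 11..19
outliers_b = find_outliers(dealer_b)   # 10 is below 12.75

# Recompute Dealer B's statistics without the outlier.
trimmed_b = [x for x in dealer_b if x not in outliers_b]

print(outliers_a, outliers_b)
print(mean(dealer_b), mean(trimmed_b))       # mean rises
print(pstdev(dealer_b), pstdev(trimmed_b))   # standard deviation falls
print(median(dealer_b), median(trimmed_b))   # median stays at 18
```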
INTEGRATE MATHEMATICAL PRACTICES
Focus on Reasoning
MP.2 Ask students whether the dealer who tended to sell more cars than a competitor would necessarily make the greater profit. Students should recognize that a greater number of car sales leads to a greater profit only when the profit per car is about the same in both cases. If one dealer sold more cars by setting the prices so low that there was a very small profit margin, that dealer could end up with lower profits despite having more sales.
QUESTIONING STRATEGIES
What might be some reasons for an outlier to occur in a set of daily car sale values? Possible answers: There might have been a day with very bad weather, so no one went car shopping, or a day when the best salespeople were out sick, so they didn’t sell any cars.
Scoring Rubric
2 points: Student correctly solves the problem and explains his/her reasoning.
1 point: Student shows good understanding of the problem but does not fully solve or explain his/her reasoning.
0 points: Student does not demonstrate understanding of the problem.
EXTENSION ACTIVITY
Explain to students that a bimodal data distribution has two peaks. Have students create a set of 20 daily car-sale values with a bimodal distribution, then create a dot plot and calculate statistics for the data. Ask what situations might produce this distribution. Students may speculate that a sudden change in sales tactics or prices could lead to several days with much higher or lower sales values than preceding days. Point out that neither the mean nor the median accurately represents a bimodal distribution. Explain that in some cases, such as when the data originate from two different sets of conditions, it is appropriate to split the data into two data sets and evaluate them separately.