Post on 20-Feb-2018
7/24/2019 NS2 Data Presenting 14
1/37
1
Chapter 2: Presenting Data in
Tables and Charts
David Chow
Sep 2014
A Picture Is Worth a Thousand Words
Categorical Data
Organizing Categorical Data:
Summary Table
A summary table indicates the frequency, amount, or percentage of items in a set of categories.
You can easily see differences between categories.
How do you spend the holidays?   Percent
At home with family              45%
Travel to visit family           38%
Vacation                         5%
Catching up on work              5%
Other                            7%
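As a minimal sketch of how such a summary table could be built from raw answers, the Python below counts categories and converts the counts to percentages. The list of 100 raw responses is hypothetical; it is constructed only so that the shares match the table above, since the slides do not give the sample size.

```python
from collections import Counter

# Hypothetical raw survey responses (100 respondents), constructed only so
# that the shares match the percentages in the summary table.
responses = (
    ["At home with family"] * 45
    + ["Travel to visit family"] * 38
    + ["Vacation"] * 5
    + ["Catching up on work"] * 5
    + ["Other"] * 7
)

def summary_table(items):
    """Return {category: (frequency, percentage)} in first-seen order."""
    counts = Counter(items)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

table = summary_table(responses)
for category, (freq, pct) in table.items():
    print(f"{category:<25}{freq:>5}{pct:>7.0f}%")
```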
Organizing Categorical Data:
Bar Chart & Pie Chart
In a bar chart, a bar shows each category, the length of which represents the amount, frequency, or percentage.
[Figure: bar chart, "How Do You Spend the Holidays?". Bars: At home with family 45%, Travel to visit family 38%, Vacation 5%, Catching up on work 5%, Other 7%; axis from 0% to 50%.]
A pie chart is a circle broken up into slices representing categories. The size of each slice corresponds to the percentage share.
[Figure: pie chart, "How Do You Spend the Holidays?". Slices: At home with family 45%, Travel to visit family 38%, Vacation 5%, Catching up on work 5%, Other 7%.]
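The two definitions can be sketched in a few lines of Python: bar length proportional to the percentage, and pie slice angle equal to the category's share of 360 degrees. The function names, the text rendering, and the 50-character bar width are illustrative assumptions, not part of the slides.

```python
# Percentages from the holiday survey summary table.
shares = {
    "At home with family": 45,
    "Travel to visit family": 38,
    "Vacation": 5,
    "Catching up on work": 5,
    "Other": 7,
}

def text_bar_chart(data, width=50, axis_max=50):
    """Render one text bar per category; bar length is proportional to the value."""
    lines = []
    for label, pct in data.items():
        bar = "#" * round(width * pct / axis_max)
        lines.append(f"{label:<24}|{bar} {pct}%")
    return lines

def pie_angles(data):
    """Central angle (degrees) of each slice: share of the total times 360."""
    total = sum(data.values())
    return {label: 360 * value / total for label, value in data.items()}

print("\n".join(text_bar_chart(shares)))
angles = pie_angles(shares)
for label, angle in angles.items():
    print(f"{label:<24}{angle:6.1f} degrees")
```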
Organizing Categorical Data:
Pareto Diagram
Also used for categorical data.
Essentially, it is a bar chart and a cumulative polygon in the same graph.
Categories are shown in descending order of frequency.
It is easy to see the "vital few" versus the "trivial many".
Organizing Categorical Data:
Pareto Diagram
[Figure: Pareto diagram, "Current Investment Portfolio". Categories: Stocks, Bonds, Savings, CD. Bar graph: % invested in each category (axis 0% to 45%). Line graph: cumulative % invested (axis 0% to 100%).]
A Pareto diagram is a bar chart and a cumulative polygon together:
- in descending order of frequency
- easy to see the vital few
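The slides do not list the portfolio values behind the figure, so the sketch below reuses the holiday survey percentages from the earlier summary table to show the two ingredients of a Pareto diagram: categories sorted in descending order and a running cumulative percentage. The function name is an illustrative assumption.

```python
# Percentages from the holiday survey summary table (the portfolio values
# behind the Pareto figure are not given in the slides).
shares = {
    "At home with family": 45,
    "Travel to visit family": 38,
    "Vacation": 5,
    "Catching up on work": 5,
    "Other": 7,
}

def pareto_table(data):
    """Sort categories by descending share and attach the running cumulative share."""
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for label, pct in ordered:
        running += pct
        rows.append((label, pct, running))
    return rows

rows = pareto_table(shares)
for label, pct, cumulative in rows:
    print(f"{label:<24}{pct:>4}%{cumulative:>6}%")
```

The first two categories already account for 83% of the responses, which is the "vital few" reading the diagram is meant to make obvious.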
Numerical Data: Ordered Array & Stem-and-Leaf
Organizing Numerical Data:
Ordered Array
An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.
Age of Surveyed College Students
Day Students
16 17 17 18 18 18
19 19 20 20 21 22
22 25 27 32 38 42
Night Students
18 18 19 19 20 21
23 28 32 33 41 45
Organizing Numerical Data:
Stem-and-Leaf Display
A stem-and-leaf display organizes data into groups (called stems) so that the values within each group (the leaves) branch out to the right on each row.
Age of College Students (stem: 10s column)
Day Students
Stem  Leaf
1     67788899
2     0012257
3     28
4     2
Night Students
Stem  Leaf
1     8899
2     0138
3     23
4     15
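The displays above can be reproduced with a short Python sketch. It assumes whole-number data with the tens digit as the stem and the ones digit as the leaf; the function name is illustrative.

```python
from collections import defaultdict

# Ages from the ordered-array slide.
day_students = [16, 17, 17, 18, 18, 18, 19, 19, 20, 20, 21, 22,
                22, 25, 27, 32, 38, 42]
night_students = [18, 18, 19, 19, 20, 21, 23, 28, 32, 33, 41, 45]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its leaves (ones digits) in ascending order."""
    rows = defaultdict(str)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        rows[stem] += str(leaf)
    return dict(rows)

day_display = stem_and_leaf(day_students)
night_display = stem_and_leaf(night_students)
for stem, leaves in day_display.items():
    print(f"{stem} | {leaves}")
```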
Stem-and-Leaf Display
Construct a stem-and-leaf display for the following data sets:
1. Midterm scores: 50, 74, 74, 76, 81
2. Average daily expenditure: $36.15, $31.00, $35.05, $40.25, $33.75
Numerical Data: Tables & Charts
Organizing Numerical Data:
Frequency Distribution
The frequency distribution is a summary table in which the data are arranged into numerically ordered class groupings.
You must give attention to selecting the appropriate number of class groupings, determining a suitable width of a class grouping, and establishing the boundaries of each to avoid overlapping.
To determine the width of a class interval, you divide the range (highest value - lowest value) by the number of class groupings desired.
Organizing Numerical Data:
Frequency Distribution Example
Example: A manufacturer of insulation randomly selects 20
winter days and records the daily high temperature (in
Fahrenheit):
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Organizing Numerical Data:
Frequency Distribution Example
Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find range: 58 - 12 = 46
Select number of classes: 5 (usually between 5 and 15)
Compute class interval (width): 10 (46/5 = 9.2, then round up to 10)
Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
Compute class midpoints: 15, 25, 35, 45, 55
Count observations & assign to classes
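The steps above can be sketched in Python as follows. Two choices in the code are assumptions rather than something the slides prescribe: the width is rounded up with `math.ceil`, and the first class starts at the multiple of the width at or below the smallest value (12 becomes 10), which happens to reproduce the boundaries on the slide.

```python
import math

# Daily high temperatures from the example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

def frequency_distribution(data, num_classes):
    """Return (boundaries, midpoints, counts) for equal-width classes."""
    data = sorted(data)
    data_range = data[-1] - data[0]                # 58 - 12 = 46
    width = math.ceil(data_range / num_classes)    # 46 / 5 = 9.2, rounded up to 10
    # Assumption: start at the multiple of the width at or below the minimum.
    start = (data[0] // width) * width             # 12 -> 10
    boundaries = [start + i * width for i in range(num_classes + 1)]
    classes = list(zip(boundaries, boundaries[1:]))
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    # Each class holds values with lo <= x < hi, so classes never overlap.
    counts = [sum(lo <= x < hi for x in data) for lo, hi in classes]
    return boundaries, midpoints, counts

boundaries, midpoints, counts = frequency_distribution(temps, 5)
print(boundaries)
print(midpoints)
print(counts)
```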
Organizing Numerical Data:
Frequency Distribution Example
Class                 Frequency   Relative Frequency   Percentage
10 but less than 20   3           .15                  15
20 but less than 30   6           .30                  30
30 but less than 40   5           .25                  25
40 but less than 50   4           .20                  20
50 but less than 60   2           .10                  10
Total                 20          1.00                 100
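The relative frequency and percentage columns follow directly from the class frequencies: each frequency is divided by the total number of observations, then multiplied by 100. A minimal sketch, with an illustrative function name:

```python
# Class frequencies from the table.
frequencies = {
    "10 but less than 20": 3,
    "20 but less than 30": 6,
    "30 but less than 40": 5,
    "40 but less than 50": 4,
    "50 but less than 60": 2,
}

def relative_frequencies(freqs):
    """Return {class: (frequency, relative frequency, percentage)}."""
    total = sum(freqs.values())
    return {label: (n, n / total, 100 * n / total) for label, n in freqs.items()}

distribution = relative_frequencies(frequencies)
for label, (n, rel, pct) in distribution.items():
    print(f"{label:<22}{n:>3}{rel:>7.2f}{pct:>6.0f}")
```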
Organizing Numerical Data:
The Histogram
The graphical version of a frequency distribution is called a histogram.
The class boundaries (or class midpoints) are shown on the horizontal axis. The vertical axis can be frequency, relative frequency, or percentage.
Bars of the appropriate heights are used to represent the number of observations within each class.
Organizing Numerical Data:
The Histogram
[Figure: histogram, "Histogram: Daily High Temperature". Horizontal axis: class midpoints 5 to 55, plus "More"; vertical axis: Frequency, 0 to 7. The bars plot the class frequencies 3, 6, 5, 4, 2 from the frequency distribution table for the daily high temperatures.]
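For readers without Excel at hand, a rough text rendering of the same histogram can be produced in Python. This is only a sketch: the bars are drawn sideways with one asterisk per observation, and the function name is an illustrative assumption.

```python
# (lower boundary, upper boundary, frequency) for each class in the example.
classes = [(10, 20, 3), (20, 30, 6), (30, 40, 5), (40, 50, 4), (50, 60, 2)]

def text_histogram(class_counts):
    """Return one line per class; bar length equals the class frequency."""
    return [f"{lo} to <{hi} | {'*' * n} ({n})" for lo, hi, n in class_counts]

lines = text_histogram(classes)
print("\n".join(lines))
```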
Histogram in Excel: Step 1
Earlier versions: select Tools > Data Analysis
Excel 2010 version: select Data > Data Analysis*
* You may need to activate Data Analysis yourself. Simply click File > Options > Add-Ins.
Histogram in Excel: Steps 2-4
2. Choose Histogram
3. Input data range and bin range (bin range is a cell range containing the upper class boundaries for each class grouping)
4. Select Chart Output and click OK
Organizing Numerical Data:
The Polygon
A percentage polygon is formed by having the midpoint of each class represent the data in that class and then connecting the sequence of midpoints at their respective class percentages.
The cumulative percentage polygon, or ogive, displays the variable of interest along the X axis, and the cumulative percentages along the Y axis.
Organizing Numerical Data:
The Polygon
[Figure: frequency polygon, "Frequency Polygon: Daily High Temperature". Horizontal axis: class midpoints 5 to 55, plus "More"; vertical axis: Frequency, 0 to 7. The points plot the class frequencies 3, 6, 5, 4, 2 from the frequency distribution table for the daily high temperatures.]
(In a percentage polygon the vertical axis would be defined to show the percentage of observations per class.)
Organizing Numerical Data:
The Cumulative Percentage Polygon
[Figure: ogive, "Ogive: Daily High Temperature". Horizontal axis: class boundaries 10 to 60; vertical axis: Cumulative Percentage, 0 to 100.]
Class Lower Boundary   % Less Than Lower Boundary
10
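The points of an ogive can be derived from the class frequencies of the daily high temperature example: for each class boundary, accumulate the observations that fall below it and express the running count as a percentage of the total. The sketch below computes them; the function name is an illustrative assumption, and the final boundary (60) is the upper boundary of the last class.

```python
from itertools import accumulate

# Class boundaries and frequencies from the daily high temperature example.
boundaries = [10, 20, 30, 40, 50, 60]
counts = [3, 6, 5, 4, 2]

def ogive_points(bounds, freqs):
    """Return (boundary, cumulative % of observations below that boundary)."""
    total = sum(freqs)
    below = [0] + list(accumulate(freqs))   # observations below each boundary
    return [(b, 100 * c / total) for b, c in zip(bounds, below)]

points = ogive_points(boundaries, counts)
for boundary, pct in points:
    print(f"{boundary:>3}{pct:>8.0f}")
```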
Cross Tabulation
Cross Tabulations: The Contingency Table
A cross-classification (or contingency) table presents the results of two categorical variables.
The categories of one variable are located in the rows, the categories of the other are located in the columns.
The joint responses are classified and shown in the cells.
A graphical representation is the side-by-side bar chart.
Cross Tabulations: The Contingency Table
A survey was conducted to study the importance of brand name to consumers as compared to a few years ago.
The results, classified by gender, were as follows:
Importance of Brand Name   Male   Female   Total
More                       450    300      750
Equal or Less              3300   3450     6750
Total                      3750   3750     7500
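A small Python sketch shows how the row and column totals follow from the four cell counts. It also computes the share of each gender answering "More", a common way to read such a table that the slides themselves do not show. The function names are illustrative assumptions.

```python
# Joint responses (cell counts) from the survey table: (row, column) -> count.
cells = {
    ("More", "Male"): 450,
    ("More", "Female"): 300,
    ("Equal or Less", "Male"): 3300,
    ("Equal or Less", "Female"): 3450,
}

def margins(table):
    """Return (row totals, column totals, grand total) of a contingency table."""
    row_totals, col_totals = {}, {}
    for (row, col), n in table.items():
        row_totals[row] = row_totals.get(row, 0) + n
        col_totals[col] = col_totals.get(col, 0) + n
    return row_totals, col_totals, sum(table.values())

def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    _, col_totals, _ = margins(table)
    return {key: 100 * n / col_totals[key[1]] for key, n in table.items()}

row_totals, col_totals, grand_total = margins(cells)
col_pct = column_percentages(cells)
print(row_totals, col_totals, grand_total)
print(col_pct[("More", "Male")], col_pct[("More", "Female")])
```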
Cross Tabulations: Side-By-Side Bar Charts
[Figure: side-by-side bar chart, "Importance of Brand Name". One axis: Response (More, Less or Equal); other axis: Number of Responses, 0 to 4000; separate bars for Female and Male.]
Numerical Data: Scatter Plots & Time Series Plots
To create scatter plots and time-series plots in Excel, use the XY (Scatter) option in the chart wizard.
Scatter Plots
Scatter plots are used for numerical data consisting
of paired observations taken from two numerical
variables.
One variable is measured on the vertical axis and the
other variable is measured on the horizontal axis.
Scatter Plot Example
Volume per day   Cost per day
23               125
26               140
29               146
33               160
38               167
42               170
50               188
55               195
60               200
[Figure: scatter plot, "Cost per Day vs. Production Volume". Horizontal axis: Volume per Day, 20 to 70; vertical axis: Cost per Day, 0 to 250.]
Time Series Plot
A time-series plot is used to study patterns in the values of a numerical variable over time.
Attendance (in millions) at USA amusement/theme parks from 2000-2005
Year   Year Number   Attendance
2000   0             317
2001   1             319
2002   2             324
2003   3             322
2004   4             328
2005   5             335
[Figure: time-series plot, "Attendance (in millions) at US Theme Parks". Horizontal axis: Year (Since 2000), 0 to 6; vertical axis: Attendance, 316 to 336.]
Principles of Excellent Graphs
The graph should not distort the data.
The graph should not contain unnecessary
adornments (chart junk).
The scale on the vertical axis should begin at zero.
The graph should contain a title and properly labeled axes.
Use the simplest possible graph.
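To see why the zero-based vertical axis matters, the sketch below compares how tall two bars would be drawn when the axis starts at zero versus at a higher value. It uses the 2000 and 2005 theme park attendance figures (317 and 335 million) and an axis starting at 316, the lowest tick on that plot. The function name is an illustrative assumption, and the comparison applies most directly to bar-style charts.

```python
def apparent_ratio(a, b, axis_start=0):
    """Ratio of drawn heights for values a and b when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Attendance in 2000 and 2005 (millions).
honest = apparent_ratio(317, 335)             # axis starts at zero
truncated = apparent_ratio(317, 335, 316)     # axis starts at 316

print(f"zero-based axis: 2005 looks {honest:.2f}x as tall as 2000")
print(f"axis from 316:   2005 looks {truncated:.0f}x as tall as 2000")
```

An increase of under 6% is drawn as a 19-fold difference in height once the axis starts at 316.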
Graphical Errors: Chart Junk
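Example 1
[Figure: two presentations titled "Minimum Wage". One is annotated with the values 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80; the other plots $ (0 to 4) against the years 1960, 1970, 1980, 1990.]
Which one is a better presentation?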
Graphical Errors:
No Relative Basis
Example 2
[Figure: two charts, each titled "As received by students." One plots Freq. (0 to 300) for FR, SO, JR, SR; the other plots % (0% to 30%) for FR, SO, JR, SR.]
FR = Freshmen, SO = Sophomore, JR = Junior, SR = Senior
Which one is a better presentation?
Graphical Errors:
Compressing the Vertical Axis
Example 3
[Figure: two charts titled "Quarterly Sales" for Q1 to Q4. One vertical axis ($) runs from 0 to 50; the other runs from 0 to 200.]
Which one is a better presentation?
Graphical Errors: No Zero
Point on the Vertical Axis
Example 4
[Figure: two charts titled "Hang Seng Index", plotting the HSI weekly from 9/1/2008 to 12/29/2008. One vertical axis runs from 0 to 25000; the other runs from 10000 to 24000.]
Impact of the Financial Tsunami on the HSI
Which one is a better presentation?
How to Sell a Lie
Point to pictures or graphs
Pictures (relevant or not) often alter our perceptions of truth
Present numbers or tables
Use words like "because"
Tell a story