Post on 25-May-2020
Chapter 1
Describing Data: Graphical
Statistics for Business and Economics
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 1-1
Dealing with Uncertainty
Everyday decisions are based on incomplete information
Consider:
• Will the job market be strong when I graduate?
• Will the price of Yahoo stock be higher in six months than it is now?
• Will interest rates remain low for the rest of the year if the federal budget deficit is as high as predicted?
1.1
Dealing with Uncertainty
Numbers and data are used to assist decision making
• Statistics is a tool to help process, summarize, analyze, and interpret data
Key Definitions
• A population is the collection of all items of interest or under investigation
  • N represents the population size (often very large, and sometimes treated as effectively infinite)
• A sample is an observed subset of the population
  • n represents the sample size
• A parameter is a specific characteristic of a population
• A statistic is a specific characteristic of a sample
1.2
Population vs. Sample
Population: a b c d e f g h i j k l m n o p q r s t u v w x y z
Sample: b c g i n o r u y
Values calculated using population data are called parameters
Values computed from sample data are called statistics
Examples of Populations
• Names of all registered voters in Canada
• Incomes of all families living in Vancouver
• Annual returns of all stocks traded on the Toronto Stock Exchange
• Grade point averages of all the students at UBC
Random Sampling
Simple random sampling is a procedure in which
• each member of the population is chosen strictly by chance,
• each member of the population is equally likely to be chosen,
• every possible sample of n objects is equally likely to be chosen
The resulting sample is called a random sample
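A minimal sketch of simple random sampling, assuming the population is the 26 items a to z from the Population vs. Sample slide and a sample size of n = 9. The function name `simple_random_sample` and the `seed` argument are illustrative choices, not part of the slides. Python's `random.sample` draws without replacement, so every subset of size n is equally likely to be chosen.

```python
import random
import string

# Population: the 26 items a..z shown on the Population vs. Sample slide.
population = list(string.ascii_lowercase)


def simple_random_sample(items, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # the seed is optional; it only makes the draw repeatable
    return rng.sample(items, n)


# Sample size 9 is an assumption that matches the size of the sample on the slide.
sample = simple_random_sample(population, 9, seed=1)
print(sample)
```

Running it again with a different seed gives a different random sample of the same size.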
Descriptive and Inferential Statistics
Two branches of statistics:
• Descriptive statistics
  • Graphical and numerical procedures to summarize and process data
• Inferential statistics
  • Using data to make predictions, forecasts, and estimates to assist decision making
Descriptive Statistics
• Collect data
  • e.g., Survey
• Present data
  • e.g., Tables and graphs
• Summarize data
  • e.g., Sample mean = $\frac{\sum X_i}{n}$
Inferential Statistics
• Estimation
  • e.g., Estimate the population mean weight using the sample mean weight
• Hypothesis testing
  • e.g., Test the claim that the population mean weight is 140 pounds
Inference is the process of drawing conclusions or making decisions about a population based on sample results
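As a rough illustration of both ideas, the sketch below estimates a population mean weight from a sample and computes a test statistic for the claim that the mean is 140 pounds. The weights are hypothetical values made up for illustration, and the one-sample t statistic is an assumed choice of test statistic; the slides do not specify one.

```python
import math
import statistics

# Hypothetical sample of weights in pounds (illustrative values, not from the slides).
weights = [132, 151, 138, 145, 160, 128, 142, 149, 136, 155]

n = len(weights)
sample_mean = statistics.mean(weights)  # point estimate of the population mean
sample_sd = statistics.stdev(weights)   # sample standard deviation (n - 1 divisor)

# Claim under test, taken from the slide: population mean weight is 140 pounds.
claimed_mean = 140

# One-sample t statistic (an assumed choice): distance between the sample mean
# and the claimed mean, measured in estimated standard errors.
t_stat = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))

print(f"estimated population mean: {sample_mean:.1f}")
print(f"t statistic for the claim: {t_stat:.3f}")
```

A t statistic far from zero would count as evidence against the claim; deciding how far is far enough is the subject of hypothesis testing proper.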
Types of Data
Data
• Categorical (Defined categories or groups)
  Examples: Marital Status; Are you registered to vote?; Eye Color
• Numerical
  • Discrete (Counted items)
    Examples: Number of Children; Defects per hour
  • Continuous (Measured characteristics)
    Examples: Weight; Voltage
Graphical Presentation of Data
• Data in raw form are usually not easy to use for decision making
• Some type of organization is needed
  • Table
  • Graph
• The type of graph to use depends on the variable being summarized
1.3
Graphical Presentation of Data
Techniques reviewed in this chapter:
Categorical Variables
• Frequency distribution
• Bar chart
• Pie chart
Numerical Variables
• Line chart
• Frequency distribution
• Histogram
• Scatter plot
Tables and Graphs for Categorical Variables
Categorical Data
• Tabulating Data
  • Frequency Distribution Table
• Graphing Data
  • Bar Chart
  • Pie Chart
The Frequency Distribution Table
Example: Hospital Patients by Unit
Hospital Unit     Number of Patients
Cardiac Care                   1,052
Emergency                      2,245
Intensive Care                   340
Maternity                        552
Surgery                        4,630
(Variables are categorical)
Summarize data by category
Bar Chart Example
Hospital Patients by Unit
[Bar chart: number of patients per year (vertical axis, 0 to 5000) for each hospital unit: Cardiac Care, Emergency, Intensive Care, Maternity, Surgery]
Hospital Unit     Number of Patients
Cardiac Care                   1,052
Emergency                      2,245
Intensive Care                   340
Maternity                        552
Surgery                        4,630
Pie Chart Example
Hospital Patients by Unit
[Pie chart: Surgery 53%, Emergency 25%, Cardiac Care 12%, Maternity 6%, Intensive Care 4%]
(Percentages are rounded to the nearest percent)
Hospital Unit     Number of Patients   % of Total
Cardiac Care                   1,052        11.93
Emergency                      2,245        25.46
Intensive Care                   340         3.86
Maternity                        552         6.26
Surgery                        4,630        52.50
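The "% of Total" column can be reproduced from the patient counts. The sketch below is one way to do it in Python; the variable names are illustrative.

```python
# Patient counts by hospital unit, taken from the frequency distribution table.
patients = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(patients.values())

# Each category's share of the total, expressed as a percentage.
percent = {unit: 100 * count / total for unit, count in patients.items()}

for unit, count in patients.items():
    print(f"{unit:<15}{count:>6,}{percent[unit]:>8.2f}")
```

Rounding each share to the nearest whole percent gives the labels used on the pie chart.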
Line Chart Example
Magazine Subscriptions by Year
[Line chart: thousands of subscribers (vertical axis, 0 to 350) by year (horizontal axis, 1990 to 2006)]
Frequency Distribution Example
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Frequency Distribution Example
• Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5
• Compute interval width: 10 (46/5 = 9.2, then round up)
• Determine interval boundaries: 10 but less than 20, 20 but less than 30, . . . , 50 but less than 60
• Count observations & assign to classes
Frequency Distribution Example
Interval               Frequency   Relative Frequency   Percentage
10 but less than 20            3                  .15           15
20 but less than 30            6                  .30           30
30 but less than 40            5                  .25           25
40 but less than 50            4                  .20           20
50 but less than 60            2                  .10           10
Total                         20                 1.00          100
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
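The steps of this example can be sketched in Python as follows. Two details are assumptions rather than something the slides state: "round up" is read as rounding the raw width up to the next multiple of 10, and the first class is taken to start at the multiple of the width at or below the smallest observation.

```python
import math

# Daily high temperatures for the 20 sampled winter days.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

ordered = sorted(temps)
data_range = ordered[-1] - ordered[0]  # 58 - 12 = 46
num_classes = 5

# Assumption: round the raw width (46 / 5 = 9.2) up to the next multiple of 10.
width = math.ceil(data_range / num_classes / 10) * 10

# Assumption: start the first class at the multiple of the width at or below the minimum.
start = (ordered[0] // width) * width

# Count observations in each class "lo but less than lo + width".
freq = [0] * num_classes
for t in ordered:
    freq[(t - start) // width] += 1

n = len(ordered)
for k, f in enumerate(freq):
    lo = start + k * width
    print(f"{lo} but less than {lo + width}: {f:>2}  {f / n:.2f}  {100 * f / n:.0f}")
```

The frequency, relative frequency, and percentage columns printed here correspond to the three numeric columns of the table.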
Histogram Example
Histogram: Daily High Temperature
[Histogram: frequency (vertical axis, 0 to 7) by temperature in degrees (horizontal axis, 0 to 70); bar heights 0, 3, 6, 5, 4, 2, 0 for the classes starting at 0, 10, 20, 30, 40, 50, 60]
(No gaps between bars)
Interval               Frequency
10 but less than 20            3
20 but less than 30            6
30 but less than 40            5
40 but less than 50            4
50 but less than 60            2
How Many Class Intervals?
• Many (Narrow class intervals)
  • May yield a very jagged distribution with gaps from empty classes
  • Can give a poor indication of how frequency varies across classes
• Few (Wide class intervals)
  • May compress variation too much and yield a blocky distribution
  • Can obscure important patterns of variation
[Histogram with wide classes: frequency (0 to 12) by temperature, upper class endpoints 0, 30, 60, More]
[Histogram with narrow classes: frequency (0 to 3.5) by temperature, upper class endpoints 4, 8, 12, ..., 60, More]
(X axis labels are upper class endpoints)
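To see the effect of the class width, the sketch below counts the 20 daily high temperatures from the earlier frequency distribution example using a narrow width of 4 and a wide width of 30. The helper `class_counts` is hypothetical, and it assumes classes of the form "lo but less than lo + width" starting at 0, so its counts need not match charts that label upper class endpoints.

```python
# Daily high temperatures from the frequency distribution example.
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]


def class_counts(values, width, start=0):
    """Count values in classes [start + k*width, start + (k+1)*width)."""
    num_classes = (max(values) - start) // width + 1
    counts = [0] * num_classes
    for v in values:
        counts[(v - start) // width] += 1
    return counts


narrow = class_counts(temps, 4)   # many narrow classes
wide = class_counts(temps, 30)    # few wide classes

print("width 4: ", narrow)
print("width 30:", wide)
print("empty classes at width 4:", narrow.count(0))
```

With width 4 several classes are empty and the counts jump around; with width 30 all of the variation collapses into two blocks.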
STATA Example
[Two histograms of Grade (horizontal axis, 0 to 100) against density (vertical axis): "Grade with Low HW (HW<=0.6)" and "Grade with High HW (HW>0.6)"]
Scatter Diagrams
• Scatter diagrams are used for paired observations taken from two numerical variables
• In a scatter diagram, one variable is measured on the vertical axis and the other variable is measured on the horizontal axis
STATA Example
[Scatter plot "HW and Final Grade": Final Grade (vertical axis, 20 to 100) against Homework (horizontal axis, 0 to 10), showing Grade and fitted values; correlation = 0.499]
Cross Tables
• Cross Tables (or contingency tables) list the number of observations for every combination of values for two categorical or ordinal variables
• If there are r categories for the first variable (rows) and c categories for the second variable (columns), the table is called an r x c cross table
Cross Table Example
• 2 x 4 Cross Table of disease group by average daily number of cigarettes smoked
Disease Group    Non-Smokers   1-14 Cigs.   15-24 Cigs.   25+ Cigs.   Total
Lung cancer                7           55           964         331    1357
Other diseases            61          129          1001         166    1357
Total                     68          184          1965         497    2714
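The row and column totals of a cross table follow directly from the cell counts. Here is a short sketch using the counts from this example; the dictionary layout and variable names are illustrative choices.

```python
# Cell counts from the 2 x 4 cross table: rows are disease groups,
# columns are average daily smoking categories.
columns = ["Non-smokers", "1-14 cigs.", "15-24 cigs.", "25+ cigs."]
table = {
    "Lung cancer": [7, 55, 964, 331],
    "Other diseases": [61, 129, 1001, 166],
}

# Row totals: sum across the columns of each row.
row_totals = {group: sum(counts) for group, counts in table.items()}

# Column totals: sum down the rows for each column.
col_totals = [sum(col) for col in zip(*table.values())]

grand_total = sum(row_totals.values())

for name, col_total in zip(columns, col_totals):
    print(f"{name:<12}{col_total:>6}")
print(row_totals, grand_total)
```

The row totals and the column totals each add up to the same grand total, which is a quick consistency check on any cross table.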