ECN 221 – Business Statistics...2019/04/18  · – A statistic is a function of data, so data are...

ECN 221 – Business Statistics


Data

– A statistic is a function of data, so data are key to statistics.
– Data are the facts we observe in a study or experiment.
– A data set contains the data of a particular study.

ECN 221 - Business Statistics, Spring 2016 with Richard Cox 8

Example Data Set

Name                Income (2013)
Taylor Swift        $39,699,575
Kenny Chesney       $32,956,240
Justin Timberlake   $31,463,297
Bon Jovi            $29,436,801
Rolling Stones      $26,225,121
Beyonce             $24,429,176
Maroon 5            $22,284,754
Luke Bryan          $22,142,235

Source: Billboard

– Elements are the entities on which we collect the data.
– Observations are the measurements we get for those elements.
– Variables are the characteristics of the elements.

In the example data set above, each artist is an element, Income (2013) is the variable, and each income figure is an observation.

Measurement Scales

– Nominal Scale: labels or names something (the label can be a number).
– Ordinal Scale: a nominal scale (a label) whose values can be ranked. For example, grades in school: A is better than B.

Notice that these may or may not be numeric.

Measurement Scales

– Interval Scale: numeric measurement with a fixed interval between ranks. The rank order is meaningful.
– Ratio Scale: similar to the interval scale, but the ratio of two values is meaningful. For example, $150 is twice as much as $75.

These are always numeric.

Note: previous CFA exams have asked about measurement scales.

Note the different types of measurement scales.

Posting ID   Class       Major               ACT Score   Credit hours
1234-567     Sophomore   Economics           35          15
8901-234     Junior      Marketing           28          18
1345-678     Junior      Finance             30          9
2334-678     Senior      Finance             29          15
3421-987     Senior      Political Science   26          15

Scale labels on the slide: NOMINAL (Posting ID, Major), ORDINAL (Class), INTERVAL (ACT Score), RATIO (Credit hours).

Types of Data/Variables

• Categorical or Qualitative: identify characteristics of the element
  – Examples: gender, marital status, ever filed for bankruptcy, college graduate
• Quantitative: numeric values showing how much or how many
  – Examples: quantity, price, number of people, income, number of children

[The slide repeats the student table above with its columns labeled: Class and Major are CATEGORICAL; ACT Score and Credit hours are QUANTITATIVE; Posting ID is the KEY.]

Types of Data

• Cross-sectional data: data collected at the same point in time across elements or observations
  – Examples: price of coffee in different cities, amount different people spend on credit cards in December
• Time series data: data collected over multiple time periods.
  – Examples: price of coffee each week in Phoenix, your credit card spending each month of the year

Example

• Monthly unemployment data for the US from 2010-2013 are time series data.
• Data showing the unemployment rate of each state for January 2014 are cross-sectional data.

Note: it is possible to combine cross-sectional and time series data.

Time Series Example

[Figure: US Unemployment Rate (Seasonally Adjusted), 2004 to 2015; vertical axis: unemployment rate, 2.0 to 11.0. Source: BLS]

Cross Section Example

Unemployment rate, seasonally adjusted, June 2015

State        Unemployment rate
Arizona      5.9
California   6.3
New Mexico   6.4
Colorado     4.4
Utah         3.5
Nevada       6.9

Data are available from the BLS, http://www.bls.gov/web/laus/laumstrk.htm.

Population vs. Sample

Population
– represents all possible elements that are of interest in a particular study

Sample
– a subset of the population

Ideally the sample is representative of the population.

Descriptive vs. Inferential

Descriptive statistics
– collecting, summarizing, and displaying data

Inferential statistics
– making claims or conclusions about the population based on a sample

Descriptive vs. Inferential

Quiz (descriptive or inferential)
– The average FICO score is 671.
– A 20% increase in the minimum wage results in at least a .5% increase in the unemployment rate.
– 42% of consumers said they are “pessimistic” about the economy’s direction.
– Consumers are less optimistic this year than last year.

Objectives

• Understand how to appropriately display categorical data
  – Bar charts/pie charts/frequency tables
• Understand how to appropriately display quantitative data.
  – Histograms/stem-and-leaf displays/frequency tables/crosstabulation (aka contingency tables)/scatterplots

Frequency Tables

Used for categorical (qualitative) data.

Major                            Students
Accountancy                      20
Business                         132
Civil Engineering                1
Computer Information Systems     8
Computer Science                 1
Criminal Justice & Criminology   1
Economics                        21
Finance                          22
Italian                          1
Management                       11
Marketing                        18
Other                            13
Supply Chain Management          16

Frequency Tables

Relative and Percent Frequency Tables show related information.

Major                            Frequency   Relative Frequency   Percent Frequency
Accountancy                      20          0.075                7.55
Business                         132         0.498                49.81
Civil Engineering                1           0.004                0.38
Computer Information Systems     8           0.030                3.02
Computer Science                 1           0.004                0.38
Criminal Justice & Criminology   1           0.004                0.38
Economics                        21          0.079                7.92
Finance                          22          0.083                8.30
Italian                          1           0.004                0.38
Management                       11          0.042                4.15
Marketing                        18          0.068                6.79
Other                            13          0.049                4.91
Supply Chain Management          16          0.060                6.04

Relative frequency: divide the frequency by the total, e.g. 20/265 = .075.
Percent frequency: multiply the relative frequency by 100, e.g. .0038 × 100 = .38 (the table shows .0038 rounded to .004).

Frequency Tables

For cumulative frequencies you add the values as you move down the table.

Major                            Cumulative Freq.   Cumulative Relative Freq.   Cumulative Percent Freq.
Accountancy                      20                 0.075                       7.55
Business                         152                0.574                       57.36
Civil Engineering                153                0.577                       57.74
Computer Information Systems     161                0.608                       60.75
Computer Science                 162                0.611                       61.13
Criminal Justice & Criminology   163                0.615                       61.51
Economics                        184                0.694                       69.43
Finance                          206                0.777                       77.74
Italian                          207                0.781                       78.11
Management                       218                0.823                       82.26
Marketing                        236                0.891                       89.06
Other                            249                0.940                       93.96
Supply Chain Management          265                1.000                       100.00
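The frequency, relative, percent and cumulative columns above can be reproduced with a few lines of code. The following is a minimal sketch in Python (the slides themselves do not use Python); the function name `frequency_table` is made up for illustration, and the counts are copied from the Major/Students table.

```python
# Counts copied from the "Major / Students" frequency table.
counts = {
    "Accountancy": 20,
    "Business": 132,
    "Civil Engineering": 1,
    "Computer Information Systems": 8,
    "Computer Science": 1,
    "Criminal Justice & Criminology": 1,
    "Economics": 21,
    "Finance": 22,
    "Italian": 1,
    "Management": 11,
    "Marketing": 18,
    "Other": 13,
    "Supply Chain Management": 16,
}


def frequency_table(counts):
    """Return rows of (category, frequency, relative, percent, cumulative)."""
    total = sum(counts.values())
    rows = []
    running = 0  # cumulative frequency: add the values moving down the table
    for category, freq in counts.items():
        running += freq
        rows.append((category, freq, freq / total, 100 * freq / total, running))
    return rows


table = frequency_table(counts)
for category, freq, rel, pct, cum in table:
    print(f"{category:32s} {freq:4d} {rel:6.3f} {pct:6.2f} {cum:4d}")
```

Dividing the cumulative frequency by the total (265) gives the cumulative relative frequency in the same way.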

Crosstabulation

• Also called contingency tables
• Used to summarize frequencies across two variables
• You can use these to find percentages and empirical probabilities.

            Resident   Non-Resident
Freshman    2          10
Sophomore   51         93
Junior      62         25
Senior      17         5

Example: The percentage of students that are non-resident juniors is (25/265)*100 = 9.4%.
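As a rough illustration of reading percentages off a crosstabulation, here is a small Python sketch. The nested-dictionary layout and the helper names `cell_percent` and `row_percent` are assumptions made for this example, not something from the course.

```python
# Resident / Non-Resident counts by class, copied from the crosstabulation.
crosstab = {
    "Freshman": {"Resident": 2, "Non-Resident": 10},
    "Sophomore": {"Resident": 51, "Non-Resident": 93},
    "Junior": {"Resident": 62, "Non-Resident": 25},
    "Senior": {"Resident": 17, "Non-Resident": 5},
}


def grand_total(table):
    """Total number of students across all cells."""
    return sum(sum(row.values()) for row in table.values())


def cell_percent(table, row, col):
    """Percent of all students that fall in one cell, e.g. non-resident juniors."""
    return 100 * table[row][col] / grand_total(table)


def row_percent(table, row):
    """Percent of all students that fall in one row, e.g. all juniors."""
    return 100 * sum(table[row].values()) / grand_total(table)


print(round(cell_percent(crosstab, "Junior", "Non-Resident"), 1))
```

The same two helpers are enough to answer questions of the form "% of juniors" or "% of seniors who are residents" for any table laid out this way.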

Frequency Tables Quiz

Use these data to find
1. % of sophomores not in WPC.
2. % of seniors in WPC.
3. % of juniors.

            Major in WPC   Major not in WPC
Freshman    6              1
Sophomore   29             8
Junior      55             42
Senior      24             9

Note the numbers are not from the previous slide.

Bar Chart

[Figure: horizontal bar chart titled "Declared Majors", one bar per major from the frequency table; horizontal axis from 0 to 140.]

Pie Chart

[Figure: pie chart titled "Declared Majors", one slice per major from the frequency table.]

Stem and Leaf Display

The stem is on the left and shows the first digit(s) of the variable. The leaf or leaves are on the right.
List the first digit(s) for the observations in the stem column. Then list the “leaf” digits or values in the leaf column.

Selected grades of Previous Students

94% 81% 92% 86% 90% 60% 78%
64% 98% 97% 78% 83% 87% 63%
78% 93% 84% 87% 90% 77% 86%

Stem | Leaf
6    | 0 3 4
7    | 7 8 8 8
8    | 1 3 4 6 6 7 7
9    | 0 0 2 3 4 7 8
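The display can also be built mechanically. This is a small Python sketch, assuming two-digit whole-number grades so that the tens digit is the stem and the ones digit is the leaf; `stem_and_leaf` is a name chosen for this example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (ones digit)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    return dict(stems)


# Selected grades from the slide, in percent.
grades = [94, 81, 92, 86, 90, 60, 78, 64, 98, 97, 78, 83, 87, 63,
          78, 93, 84, 87, 90, 77, 86]

display = stem_and_leaf(grades)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```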

Box (and Whiskers)

A box and whiskers plot shows the
• median: middle line in the box
• first and third quartiles: ends of the box
• outliers: dots, asterisks or other symbols beyond the whiskers.

Histogram

[Figure: histogram titled "Number of Respondents by Age, BRFSS Data, 1984-2011"; horizontal axis: age groups 18-21, 22-25, and so on in four-year bins through 82-85, then 86-87; vertical axis from 0 to 450,000.]

Scatter Diagram

• Also called scatter plot
• Used to summarize the relationship between two variables
• We will elaborate on this when we discuss correlation and regression.

Scatter Diagram

An example using Excel and husband/wife height data:

[Figure: scatter plot titled "Height of Newlyweds (from Hadi)"; horizontal axis: Husband's Height in Centimeters, 120 to 200; vertical axis: Wife's Height in Centimeters, 100 to 190.]

Taller men marry taller women and vice versa.
The data are from Hadi, http://www1.aucegypt.edu/faculty/hadi/RABE5/Data5/P052.txt

Descriptive Statistics

Chapter 3 part a

Objectives

Learn about and how to calculate
• Tendency: mean, median and percentiles.
• Measures of variation: range, interquartile range, variance and coefficient of variation.
• Shape: distribution shape, z-scores, the empirical rule and detecting outliers.
• Relationship: covariance and correlation.

The Mean

The mean is the sum of all the values of a variable divided by the sample size.

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

Calculating the Mean

• Let’s calculate the mean tuition for some nearby schools.

School   Fall 2015 “Tuition” (resident)
ASU      $4,742
NAU      $4,731
UA       $5,195
SDSU     $2,736

\[ \frac{\sum_{i=1}^{n} x_i}{n} = \frac{4742 + 4731 + 5195 + 2736}{4} = \frac{17404}{4} = 4351 \]

So the average “tuition” is $4,351.
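The same calculation as a short Python sketch. The function name `mean` and the dictionary layout are choices made for this illustration.

```python
def mean(values):
    """Sum of the values divided by the sample size n."""
    return sum(values) / len(values)


# Fall 2015 resident "tuition" from the table.
tuition = {"ASU": 4742, "NAU": 4731, "UA": 5195, "SDSU": 2736}

average_tuition = mean(list(tuition.values()))
print(average_tuition)
```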

The Median

The median is another measure of central tendency. It is the middle value in the data. To find the median
1. Order or sort the data from smallest to largest.
2. If n is odd then the median is the middle value.
3. If n is even then the median is the average of the two middle values.

Note that at least half of the observations are at or above the median and at least half are at or below the median.

Find the Median

School   Fall 2015 “Tuition” (resident)
ASU      $4,742
NAU      $4,731
UA       $5,195
SDSU     $2,736

After sorting, the middle values are $4,731 and $4,742.

\[ \frac{\$4742 + \$4731}{2} = \frac{\$9473}{2} = \$4{,}736.5 \]

Find the Median

Suppose we have an odd number of observations.

School   Fall 2015 “Tuition” (resident)
ASU      $4,742
NAU      $4,731
UA       $5,195
SDSU     $2,736
USC      $24,732

What is the median now?

Find the Median

Sort them from low to high.

School   Fall 2015 “Tuition” (resident)
SDSU     $2,736
NAU      $4,731
ASU      $4,742
UA       $5,195
USC      $24,732

ASU is in the middle; median = $4,742.
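The sort-then-pick-the-middle procedure can be written out directly. This is a minimal Python sketch; `median` is a hand-written helper for illustration (Python's standard library also provides `statistics.median`).

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)  # step 1: sort smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:            # step 2: n odd, take the middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # step 3: n even


four_schools = [4742, 4731, 5195, 2736]   # ASU, NAU, UA, SDSU
five_schools = four_schools + [24732]     # add USC

print(median(four_schools))
print(median(five_schools))
```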

Mean vs Median

Notice that adding USC had almost no impact on the median. What did it do to the mean?

School   Fall 2015 “Tuition” (resident)
SDSU     $2,736
NAU      $4,731
ASU      $4,742
UA       $5,195
USC      $24,732

Mean vs Median

The mean is easy to calculate and understand but is heavily influenced by extreme values or outliers.

The median is not influenced by changes in extreme or tail values.
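To see the effect of an extreme value in numbers, the sketch below compares the mean and median of the tuition data with and without USC, using Python's standard `statistics` module. With the slide's figures the mean moves from $4,351 to $8,427.20, while the median moves only from $4,736.50 to $4,742.

```python
import statistics

four_schools = [2736, 4731, 4742, 5195]   # SDSU, NAU, ASU, UA
with_usc = four_schools + [24732]         # USC is the extreme value

mean_before = statistics.mean(four_schools)
mean_after = statistics.mean(with_usc)
median_before = statistics.median(four_schools)
median_after = statistics.median(with_usc)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```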

Mode

The mode is the value which occurs most frequently in the data.

Example: if we have test scores of 95, 90, 87, 87, 83, 75, 72 and 66 the mode is 87 because it occurs more than any other value.
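A quick way to find the mode is to count how often each value occurs. The Python sketch below does this with `collections.Counter`; returning a list of all tied values is a design choice of this example, since the slide does not say how ties are handled.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


scores = [95, 90, 87, 87, 83, 75, 72, 66]
print(modes(scores))
```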

Percentiles

The pth percentile is the value with at least p percent of the data less than or equal to it and at least (100-p) percent of the values equal to or above it.

Finding the pth percentile:
• Sort the data smallest to largest.
• Find i = np/100.
• If i is not an integer, round up (e.g. 23.2 goes to 24); that is the position of the pth percentile.
• If i is an integer, then the pth percentile is the average of the values at positions i and i + 1.

Percentiles

Example: Finding a Percentile
The sorted data are 17, 21, 23, 23, 28, 30, 35, 36, 36, 39, 44, 45, 48, 48, 49, 54. Note that n = 16. Suppose you want the 70th percentile. Find i = np/100 = (16)(70)/100 = 11.2. 11.2 is not an integer so go to position 12. The 12th ordered observation is 45 so the 70th percentile is 45.
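The percentile steps can be sketched in Python as follows. The function `percentile` implements the rule given in these slides (round i up, or average positions i and i + 1); statistical software often uses other interpolation rules, so its answers may differ slightly. Restricting p to 0 < p < 100 is an assumption of this sketch.

```python
import math


def percentile(values, p):
    """pth percentile using the i = n*p/100 rule from the slides (0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    i = n * p / 100
    if not float(i).is_integer():
        position = math.ceil(i)        # round up to get the 1-based position
        return ordered[position - 1]
    i = int(i)
    # i is an integer: average the values at positions i and i + 1 (1-based)
    return (ordered[i - 1] + ordered[i]) / 2


data = [17, 21, 23, 23, 28, 30, 35, 36, 36, 39, 44, 45, 48, 48, 49, 54]
print(percentile(data, 70))
```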

Special Percentiles

• The 25th percentile is called the first quartile, Q1.
• The 50th percentile is called the second quartile or median, Q2.
• The 75th percentile is called the third quartile, Q3.

Two distributions may have the same mean but different measures of spread or variability.

Measures of Variability

• Range = max - min.
• Interquartile Range: IQR = Q3 - Q1.
• Variance: \( s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} \)
• Standard Deviation: the square root of the variance, e.g. if \( s^2 = 16 \) then s = 4.
• Coefficient of Variation: \( (s/\bar{x}) \times 100\% \).

Range and IQR

Find the range and IQR for tuition.

School   Fall 2015 “Tuition” (resident)
SDSU     $2,736
UNLV**   $2,989
NMSU     $3,365
NAU      $4,731
ASU      $4,742
UA       $5,195
UCSD*    $6,728
PLNC     $15,900
USD      $22,000
USC      $24,732

For the ten sorted tuition values above:

Range = max - min = $24,732 - $2,736 = $21,996.

IQR = Q3 - Q1 = $15,900 - $3,365 = $12,535. Here Q1 is the 3rd ordered value (i = (10)(25)/100 = 2.5, rounded up to 3) and Q3 is the 8th ordered value (i = (10)(75)/100 = 7.5, rounded up to 8).
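A short Python sketch of the range and IQR calculation for the ten tuition values. The helper `percentile` follows the i = np/100 rule from the Percentiles slides; other software may define quartiles differently, so this is one convention rather than the only one.

```python
import math


def percentile(values, p):
    """pth percentile using the i = n*p/100 rule from the slides (0 < p < 100)."""
    ordered = sorted(values)
    i = len(ordered) * p / 100
    if not float(i).is_integer():
        return ordered[math.ceil(i) - 1]          # round up to the position
    i = int(i)
    return (ordered[i - 1] + ordered[i]) / 2      # average positions i and i + 1


def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range = Q3 - Q1."""
    return percentile(values, 75) - percentile(values, 25)


tuition = [2736, 2989, 3365, 4731, 4742, 5195, 6728, 15900, 22000, 24732]
print(value_range(tuition), iqr(tuition))
```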

Your Turn

Find the range and IQR for the following values: 328, 472, 131, 295, 253, 238, 275, 142, 213, 163, 258, 292.

Range: 472-131=341.
Interquartile Range: First find Q1 and Q3, (163+213)/2=188 and (292+295)/2=293.5. Then we get 293.5-188=105.5.

Variance and Standard Dev.

In practice you will use a calculator or software. Here is the variance for tuition at Arizona schools:

School          x        x - \bar{x}   (x - \bar{x})^2
ASU             $4,742   -$147         $21,609
NAU             $4,731   -$158         $24,964
UA              $5,195   $306          $93,636
average         $4,889
total                                  $140,209
divide by n-1                          $70,104.5

\[ s = \sqrt{70{,}104.5} = \$264.77 \]

Your Turn

Calculate the variance and standard deviation for private schools.

School   Fall 2015 “Tuition”
PLNC     $15,900
USD      $22,000
USC      $24,732

Answer: s^2 = 20,446,341 and s = 4,521.8.
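Since the slides suggest using software in practice, here is a minimal Python sketch of the sample variance and standard deviation (dividing by n - 1). The function names are chosen for this example. Note that the code uses the unrounded mean, so for the Arizona schools it gives a variance of about 70,104.33 rather than the 70,104.5 obtained above from the rounded mean of $4,889; the standard deviation still rounds to $264.77.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """Standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


arizona = [4742, 4731, 5195]      # ASU, NAU, UA
private = [15900, 22000, 24732]   # PLNC, USD, USC

print(round(sample_std(arizona), 2))
print(round(sample_variance(private)), round(sample_std(private), 1))
```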

Coefficient of Variation

Sometimes we want to compare the variability of two variables that have noticeably different means and standard deviations. In this case we might “standardize” the measure of variability.

\[ CV = \left( \frac{s}{\bar{x}} \times 100 \right)\% \]

Coefficient of Variation

Example: If gasoline has a standard deviation in price of $.30 per gallon while cars have a standard deviation in price of $7,000, which really has greater spread or variability?

Can I compare the two?

Coefficient of Variation

Example Calculation: for gasoline the standard deviation in the price is $.30 (per gallon) and the sample mean is $2.55.

\[ CV = \left( \frac{s}{\bar{x}} \times 100 \right)\% = \frac{.30}{2.55} \times 100\% = 11.76\% \]

Your Turn

Calculate the coefficient of variation for car prices. The standard deviation is s=$7,000 and the average is $33,560.

Does gasoline price or car price exhibit more variation? Car price in this example.

\[ CV = \left( \frac{s}{\bar{x}} \times 100 \right)\% = \frac{7{,}000}{33{,}560} \times 100\% = 20.86\% \]
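The gasoline and car comparison as a short Python sketch; the function name `coefficient_of_variation` is chosen for this example, and the inputs are the standard deviations and means quoted in the slides.

```python
def coefficient_of_variation(std_dev, mean):
    """CV = (s / x-bar) * 100, expressed in percent."""
    return std_dev / mean * 100


gasoline_cv = coefficient_of_variation(0.30, 2.55)
car_cv = coefficient_of_variation(7000, 33560)

print(round(gasoline_cv, 2), round(car_cv, 2))
print("cars vary more" if car_cv > gasoline_cv else "gasoline varies more")
```

Because the CV is unit-free, the two prices can be compared even though their dollar scales are very different.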

Descriptive Statistics
Chapter 3 part b

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Objectives

• Learn how to calculate mean, median, mode, and percentiles.
• Learn how mean, median and mode differ.
• Learn measures of variability: range, interquartile range, variance, coefficient of variation.
• Learn about distribution shape, z-scores, the empirical rule and detecting outliers.
• Learn about measures of relationship between two variables: covariance and correlation.

Distribution Shape

Distributions may be symmetric, left skew or right skew. Left skew implies more of the observations lie away from the left side of the distribution.

[Figure: Left Skew]

Distribution Shape

Right skew implies more of the observations lie away from the right side of the distribution.

[Figure: Right Skew]

z-score

The z-score measures how far away an observation is from the average and does this in terms of the standard deviation. In other words, how many standard deviations away from the average is the observation?

\[ z_i = \frac{x_i - \bar{x}}{s} \]

We will use this concept a lot when doing hypothesis testing.

Why z-scores Are Important

• We can use them to identify outliers or extreme observations. Because outliers can noticeably impact the mean we may want to identify them.
• Besides their impact on the mean, are there other reasons we might want to identify extreme observations?
• We must understand the concept of a z-score to do hypothesis testing later.

Page 65: ECN 221 – Business Statistics...2019/04/18  · – A statistic is a function of data, so data are key to statistics. – Data are the facts we observe in a study or experiment.

7/ 23

z-score

Examples

– Suppose $\bar{x} = 10$ and s = 2. The value 13.4 has a z-score of

$z = \frac{13.4 - 10}{2} = \frac{3.4}{2} = 1.7$.

– Suppose $\bar{x} = 29$ and s = 7.2. The value 21.1 has a z-score of

$z = \frac{21.1 - 29}{7.2} = \frac{-7.9}{7.2} = -1.097$.

– Suppose $\bar{x} = 71.4$ and s = 1.3. The value 68.7 has a z-score of

$z = \frac{68.7 - 71.4}{1.3} = \frac{-2.7}{1.3} = -2.077$.
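
The three examples above can be checked with a few lines of Python. This is a minimal sketch; the function name `z_score` is an illustrative choice and not part of the course materials.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# (observation, sample mean, sample standard deviation) from the slide
examples = [(13.4, 10, 2), (21.1, 29, 7.2), (68.7, 71.4, 1.3)]

# Round to three decimals, as on the slide
z_values = [round(z_score(x, m, s), 3) for x, m, s in examples]
print(z_values)
```

A positive z-score means the observation is above the mean and a negative one means it is below.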


z-score

Your Turn

For a survey of students and hours worked I got back 655 responses (so far). The mean was 9.17 and the standard deviation was s = 12.6. The max was 70 and the next highest value was 50. Find the z-score for the student that worked 50 hours in a week. (Last semester the numbers were n = 555, $\bar{x} = 11.2$, s = 14.45, max = 85.)

z =


z-score

Your Turn

For a survey of students and hours worked I got back 655 responses (so far). The mean was 9.17 and the standard deviation was s = 12.6. The max was 70 and the next highest value was 50. Find the z-score for the student that worked 50 hours in a week.

$z = \frac{50 - 9.17}{12.6} = \frac{40.83}{12.6} = 3.2404762$.


Empirical Rule

Empirical Rule (SEVERAL CFA EXAM QUESTIONS)

About 68% of the data will be within 1 standard deviation of the mean, about 95% will be within 2 standard deviations and almost all the data will be within 3 standard deviations.

[Figure: bell-shaped curve, "Empirical Rule (for bell-shaped distributions)". Horizontal axis in standard deviations from the mean (-4 to 3). Annotations: ~68% of observations will be within one standard deviation of the mean; ~95% within 2 standard deviations; >99% within 3 standard deviations.]


Empirical Rule

Empirical Rule

The rule only applies to bell-shaped distributions. It also implies:

– Approximately 68% of the data will have a z-score between -1 and 1.

– Approximately 95% of the data will have a z-score between -2 and 2.

– Very few observations will have a z-score < −3 or > 3.
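
The percentages above can be compared with an exactly normal distribution, which is one common model of a bell shape. The sketch below assumes that model; real data that are only roughly bell-shaped will match only approximately.

```python
from statistics import NormalDist

def share_within(k):
    """Share of a normal distribution with z-scores between -k and k."""
    std_normal = NormalDist(mu=0, sigma=1)
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Shares within 1, 2 and 3 standard deviations of the mean
shares = {k: round(share_within(k), 4) for k in (1, 2, 3)}
print(shares)
```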


Covariance and Correlation

Bivariate Relationships

The covariance and correlation are measures of association between two variables. They can help us understand how two variables move together. For example,

– How are price and quantity sold related? If we raise the price the quantity sold will decrease, but by how much?

– How are competitors’ prices and the quantity we sell related?

– How are advertising and the quantity we sell related?

– How are FICO scores and the probability of defaulting on a loan related?

– How are spending patterns and the probability of defaulting on credit card debt related?


Covariance and Correlation

Bivariate Relationships

The covariance and correlation will help us begin to answer these questions. We will build on these concepts when we discuss ANOVA and regression later in the course.

Important Note

To fully answer the questions above we will need to understand hypothesis testing and how covariance and correlation lead us to the statistics we will find using ANOVA and regression.


Covariance and Correlation

How to Find the Covariance

The covariance measures how two variables vary together, or how they co-vary, hence the term covariance. The formula for the covariance between x and y is

$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$.

The numerator shows that, for each observation i, we take how far x is from its average and multiply that by how far y is from its average.

If x is “typically” above its average when y is above its average (and below when y is below), then $s_{xy}$ will be positive. If the opposite is true then $s_{xy}$ will be negative.
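
The formula can be sketched in Python as follows. The price and quantity numbers are hypothetical values made up for illustration; they are not the course's data set.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of products of deviations over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: price in dollars and boxes sold
prices = [2.0, 2.5, 3.0, 3.5, 4.0]
boxes = [900, 820, 760, 640, 580]

s_xy = sample_covariance(prices, boxes)
print(s_xy)  # negative: quantity tends to be below average when price is above
```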


Covariance and Correlation

How to Find the Covariance

To calculate the covariance use the formula above. You can practice with an example in your book.

– There is an example on page 143 of ASWCC 7th edition.

– We will not do these calculations in this class.

– Exams in this course will not require you to calculate $s_{xy}$ from raw observations.

– We may do similar calculations from raw data when we cover regression.


Covariance and Correlation

Correlation

The correlation measures how two variables relate to each other. It is a standardized version of the covariance. The formula is:

$r_{xy} = \frac{s_{xy}}{s_x s_y}$.


Covariance and Correlation

Example

For the newlyweds’ heights (from Hadi), $s_{hw} = 68.689$, $s_{husband} = 9.908$ and $s_{wife} = 9.081$, so that:

$r_{hw} = \frac{68.689}{(9.908)(9.081)} = \frac{68.689}{89.97} = .763$.


Covariance and Correlation

Your Turn

For 10 bowl games ASWCC report the Las Vegas spread and the actual margin of victory. The covariance is 22.33, the standard deviation for LV spread is 5.538 and the standard deviation for margin of victory is 7.134. What is the correlation?

$r_{LV,M} =$

Compute the correlation between the returns on the Russell 1000 index and the DJIA from 1988 to 2012. The covariance was 263.61, the Russell standard deviation was 17.89 and the DJIA standard deviation was 15.37.

$r_{r,dj} =$


Covariance and Correlation

Your Turn

For 10 bowl games ASWCC report the Las Vegas spread and the actual margin of victory. The covariance is 22.33, the standard deviation for LV spread is 5.538 and the standard deviation for margin of victory is 7.134. What is the correlation?

$r_{LV,M} = \frac{22.33}{(5.538)(7.134)} = .565$.

Compute the correlation between the returns on the Russell 1000 index and the DJIA from 1988 to 2012. The covariance was 263.61, the Russell standard deviation was 17.89 and the DJIA standard deviation was 15.37.

$r_{r,dj} = \frac{263.61}{(17.89)(15.37)} = .958$.
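
As a quick check of the correlation formula, the sketch below plugs in the covariances and standard deviations quoted on the slides. The function name `correlation` is an illustrative choice. Because the inputs are already rounded, the results can differ from the slide values in the third decimal.

```python
def correlation(cov_xy, sd_x, sd_y):
    """Correlation as the covariance divided by the product of the standard deviations."""
    if sd_x <= 0 or sd_y <= 0:
        raise ValueError("standard deviations must be positive")
    return cov_xy / (sd_x * sd_y)

r_heights = correlation(68.689, 9.908, 9.081)   # newlyweds' heights
r_bowl = correlation(22.33, 5.538, 7.134)       # LV spread vs. margin of victory
r_index = correlation(263.61, 17.89, 15.37)     # Russell 1000 vs. DJIA returns

for name, r in [("heights", r_heights), ("bowl", r_bowl), ("index", r_index)]:
    print(name, round(r, 3))
```

Dividing by the standard deviations removes the units, which is why a correlation always lies between -1 and 1 while a covariance does not.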


Covariance and Correlation

Correlation and Scatter Plots

Positive correlation yields a scatter plot with an upward trend. Here are the Russell-DJIA data with $r_{xy} = .958$.

[Figure: scatter plot, "Annual % Returns 1988-2012". Horizontal axis: DJIA % Return (-40 to 40). Vertical axis: Russell 1000 % Return (-50 to 40).]


Covariance and Correlation

Correlation and Scatter Plots

Negative correlation yields a scatter plot with a downward trend. Here are price-quantity data with $r_{xy} = -.941$.

[Figure: scatter plot, "Price vs. Quantity". Horizontal axis: Price in $ (2 to 4.5). Vertical axis: Quantity Sold (Boxes) (500 to 900).]


Probability
Chapter 4, Introduction to Probabilities

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics


Objectives

– Learn about counting rules and combinations.

– Learn what probability means, the basic rules of probabilities and how to calculate empirical probabilities.

– Learn about conditional probability and Bayes’ Theorem or Bayes’ Rule.


Probability: Definition

Probability

Formal: A probability function is a map from a sample space to the unit interval.

Informal: A probability function takes anything that could possibly happen and turns it into a number between 0 and 1. The bigger the number, the more likely it is that the thing could happen.


Probability: Definition

Probability

– The sample space is the set of all possible outcomes.

– The possible outcomes in the sample space are the sample points.

– An event is made up of one or more sample points.

– The probability tells us how likely it is an event will occur.

– By unit interval we mean the probability is at least 0 and at most 1.


Probability: Counting Rules

Combinations

Sometimes to calculate probabilities we will find it useful to count the number of possible combinations. This is a review from MAT211.

The number of combinations of N objects taken n at a time is:

$C^N_n = \binom{N}{n} = \frac{N!}{n!(N - n)!}$.

Read this as “N choose n.” Also, it is common to see “n choose r” or “n choose k.”


Probability: Counting Rules

Example

Suppose a company has 4 openings for first-level people-leader positions. Suppose there are 11 people in the company who have enough tenure and experience to qualify for these positions and that the company will fill the positions internally. What is the number of employee combinations possible for filling the 4 openings?

$\binom{11}{4} = \frac{11!}{4!\,7!} = \frac{(11)(10)(9)(8)}{(4)(3)(2)} = 330$.
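
The same count can be reproduced in Python. The sketch below writes out the factorial formula and cross-checks it against the standard library's `math.comb`; the helper name `n_choose_k` is an illustrative choice.

```python
from math import comb, factorial

def n_choose_k(N, n):
    """Number of combinations of N objects taken n at a time."""
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    # Integer division is exact here because the result is always a whole number
    return factorial(N) // (factorial(n) * factorial(N - n))

openings = n_choose_k(11, 4)
print(openings)                    # ways to fill the 4 openings from 11 people
print(openings == comb(11, 4))     # agrees with the built-in
```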


Probability: Counting Rules

Your Turn

In January 2016 the Powerball lottery reached over 1 billion dollars. Probabilities of winning in poker, 21, roulette, lotteries, etc. are related to the number of possible combinations or possible outcomes. Suppose a lottery has 40 numbers to pick from and you can pick any five. How many combinations are possible?


Probability: Counting Rules

Your Turn

Suppose a lottery has 40 numbers to pick from and you can pick any five. How many combinations are possible?

$\binom{40}{5} = \frac{40!}{5!\,35!} = \frac{(40)(39)(38)(37)(36)}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1} = 6.58008 \times 10^5$.


Finding Probabilities

Calculating Probabilities

Probabilities can be found in one of three basic ways.

– classical: we calculate classical probabilities when we know the probability distribution or function. For example, finding the probability of drawing a 7 from a deck of playing cards.

– empirical: we calculate empirical probabilities from the relative frequency of the events in available data. Examples are below.

– subjective: your guess is better than mine. Subjective probabilities are based on opinions where there are no data available and the true distribution is unknown.


Finding Probabilities

Empirical Probabilities

Using the relative frequency method we can find empirical probabilities. We construct a relative frequency table and read the relative frequencies as probabilities.

Traffic-related fatalities and alcohol level (US, 1994-2012)

max BAC        fatalities   relative frequency
0                 558,520   .671
> 0, < .08         41,971   .050
≥ .08             232,395   .279

– Data are from FARS, available from NHTSA.
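
The relative frequency column can be rebuilt from the fatality counts in the table. This is a minimal sketch; the dictionary keys are just labels chosen for illustration.

```python
# Fatality counts by maximum blood alcohol concentration, from the table above
fatalities = {
    "BAC = 0": 558_520,
    "0 < BAC < .08": 41_971,
    "BAC >= .08": 232_395,
}

total = sum(fatalities.values())

# Relative frequency = count / total, read as an empirical probability
probabilities = {level: count / total for level, count in fatalities.items()}

for level, p in probabilities.items():
    print(f"{level:15s} {p:.3f}")
```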


Finding Probabilities

Your Turn

Find the probabilities below.

major                      frequency   relative frequency
Accountancy                       39
Economics                         32
Finance                           60
Marketing                         53
Management                        38
Supply Chain Management           32
Other: business                  228
Other: non-business               21


Finding Probabilities

Your Turn

Find the probabilities below.

major                      frequency   relative freq.
Accountancy                       39   0.078
Economics                         32   0.064
Finance                           60   0.119
Marketing                         53   0.105
Management                        38   0.076
Supply Chain Management           32   0.064
Other: business                  228   0.453
Other: non-business               21   0.042

– The probability of drawing a non-business major is .042.


Probability Rules

Basic Rules

– The probability of an event, $P(E_i)$, is between 0 and 1:

$0 \le P(E_i) \le 1$.

– The sum of the probabilities for all possible outcomes (mutually exclusive events) is 1. If there are n possible mutually exclusive events,

$\sum_{i=1}^{n} P(E_i) = 1$.

– The complement of an event $E_i$ is denoted $E_i^c$. Note that

$P(E_i) = 1 - P(E_i^c)$.


Probability Rules

Example

In football the possible events on a team’s first down are {touchdown, first down, second down, turn ball over, end of half or game} = {$E_1, E_2, E_3, E_4, E_5$}.

Suppose the associated probabilities are {$P(E_1) = ?$, .4, .45, .03, .02}. Then we can figure $P(E_1) = 1 - P(E_1^c) = 1 - .9 = .1$.


Probability Rules

Union vs. Intersection

The intersection of two sets, A and B, is the set of elements that are members of both A and B. We write this A ∩ B.

The union of two sets, A and B, is the set of elements that are members of either A or B (or both). We write this A ∪ B.


Probability Rules

Example

A={economics, marketing, finance} and B={economics, statistics, physics}, then A ∩ B={economics}.

Your Turn

A={economics, marketing, finance} and B={economics, statistics, physics}. What is A ∪ B?


Probability Rules

Example

A={economics, marketing, finance} and B={economics, statistics, physics}, then A ∩ B={economics}.

Your Turn

A={economics, marketing, finance} and B={economics, statistics, physics}. What is A ∪ B?

A ∪ B={economics, marketing, finance, statistics, physics}
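
Python's built-in `set` type has operators for both operations, so the example can be sketched directly.

```python
A = {"economics", "marketing", "finance"}
B = {"economics", "statistics", "physics"}

intersection = A & B   # members of both A and B
union = A | B          # members of A or B (or both)

print(sorted(intersection))
print(sorted(union))
```

Note that "economics" appears only once in the union even though it belongs to both sets.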


Probability Rules

Mutually Exclusive

Two events are mutually exclusive if they cannot both happen at the same time or they cannot happen as part of the same outcome. Examples:

– A={illiterate} and B={college graduate}.

– A={daytime} and B={night}.


Probability Rules

Addition Law

To find the probability of the union of two events we have the following rule:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

We will see examples below when we look at joint probability tables.


Conditional Probability

Conditional Probabilities

Information about certain outcomes can impact our beliefs about the probabilities of other outcomes.

– Let’s play Let’s Make a Deal.

– You choose between items a, b, and c. One will have a prize.

– What is the probability that a person guesses the option with the prize?


Conditional Probability

Joint Probability Tables

The table shows relative frequencies for different majors and course loads.

Does our assessment of the probabilities of being an accountancy or economics major change when we know the course load?

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1


Conditional Probability

Joint Probability Tables

Joint probabilities are in the body of the table (in blue on the slide) and marginal probabilities are in the margins, the "total" row and column (in red on the slide).

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1


Conditional Probability

Examples

– Rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

– P(Acc.) = .292.

– P(Acc. ∪ Fin.) = .292 + .462 = .754. (The two majors are mutually exclusive in this table, so P(Acc. ∩ Fin.) = 0.)

– P(Acc. ∪ >15 hours) = .292 + .608 − .169 = .731.

– P(Fin. ∩ 15 hours) = .138.

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1
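
The addition law can be applied to the joint probability table programmatically. The sketch below stores only the joint cells and sums them to get the marginals; the helper names are illustrative. Because the cells are rounded to three decimals, marginals computed this way can differ from the printed totals by .001 (for example, the Economics row sums to .247, not .246).

```python
# Joint probabilities from the body of the table: (major, course load) -> probability
joint = {
    ("Accountancy", "<15"): 0.046, ("Accountancy", "15"): 0.077, ("Accountancy", ">15"): 0.169,
    ("Economics", "<15"): 0.031, ("Economics", "15"): 0.054, ("Economics", ">15"): 0.162,
    ("Finance", "<15"): 0.046, ("Finance", "15"): 0.138, ("Finance", ">15"): 0.277,
}

def p_major(major):
    """Marginal probability of a major: sum across its row."""
    return sum(p for (m, _), p in joint.items() if m == major)

def p_load(load):
    """Marginal probability of a course load: sum down its column."""
    return sum(p for (_, l), p in joint.items() if l == load)

def p_union(major, load):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_major(major) + p_load(load) - joint[(major, load)]

print(round(p_union("Accountancy", ">15"), 3))
```

Subtracting the joint cell avoids counting accountancy majors with more than 15 hours twice.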


Conditional Probability

Conditional Probability

The probability that event A will occur given that B has occurred is

$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.

If events are independent:

$P(A \mid B) = P(A)$.

Also note that

$P(A \cap B) = P(B) P(A \mid B)$,

where A and B can be “flipped.”

We need an example.


Conditional Probability

Conditional Probability

– P(econ. | <15 hours) = P(econ. ∩ <15 hours) / P(<15 hours) = .031 / .123 = .252.

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1
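
The calculation above is a single division, sketched here with the numbers read from the table. The function name `conditional` is an illustrative choice.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# Joint cell and column total taken from the table
p_econ_given_under_15 = conditional(0.031, 0.123)
p_econ = 0.246

print(round(p_econ_given_under_15, 3))
# Under independence the two numbers below would be equal
print(round(p_econ_given_under_15 - p_econ, 3))
```

Here knowing the course load moves the probability of being an economics major only slightly, from .246 to about .252.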


Conditional Probability

Your Turn: Find the Following

– P(Fin.) =

– P(Econ. ∪ 15 hours) =

– P(Fin. ∩ >15 hours) =

– P(>15 hours | econ.) =

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1


Conditional Probability

Your Turn: Find the Following

– P(Fin.) = .462

– P(Econ. ∪ 15 hours) = .246 + .269 − .054 = .461

– P(Fin. ∩ >15 hours) = .277

– P(>15 hours | econ.) = .162 / .246 = .659

              <15 hours   15 hours   >15 hours   total
Accountancy       0.046      0.077       0.169   0.292
Economics         0.031      0.054       0.162   0.246
Finance           0.046      0.138       0.277   0.462
total             0.123      0.269       0.608   1


Conditional Probability

Bayes’ Theorem or Rule

Bayes’ Rule can be applied for conditional probabilities when the events $A_i$ are mutually exclusive and $\sum P(A_i) = 1$. The equation is:

$P(A_i \mid B) = \frac{P(A_i) P(B \mid A_i)}{\sum_j P(A_j) P(B \mid A_j)}$.


Conditional Probability

Bayes’ Theorem or Rule: Example

We want to find the probability of winning a game conditional on leading at half-time. Suppose the overall probability of winning is .8 and the probability of losing is .2. Suppose that the probability of having the lead at the half is .725 when winning, and .3 when losing. Using Bayes’ Rule:

$P(A_i \mid B) = \frac{.8 \times .725}{.8 \times .725 + .2 \times .3} = \frac{0.58}{0.64} = 0.90625$.
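
The half-time example can be sketched as a small function. The names `bayes_posterior`, `priors` and `lead_given` are illustrative choices; the numbers are the ones given in the example.

```python
def bayes_posterior(priors, likelihoods, target):
    """P(target | B) using Bayes' Rule over mutually exclusive, exhaustive events."""
    if abs(sum(priors.values()) - 1) > 1e-9:
        raise ValueError("prior probabilities must sum to 1")
    # Denominator: total probability of B across all events
    evidence = sum(priors[event] * likelihoods[event] for event in priors)
    return priors[target] * likelihoods[target] / evidence

priors = {"win": 0.8, "lose": 0.2}         # P(A_i)
lead_given = {"win": 0.725, "lose": 0.3}   # P(lead at half | A_i)

p_win_given_lead = bayes_posterior(priors, lead_given, "win")
print(round(p_win_given_lead, 5))
```

Leading at the half raises the probability of winning from the prior of .8 to about .906.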


Appendix

Extra Practice for at Home


Probability: Counting Rules

Your Turn

Suppose a food product developer has 19 ingredients with which to work, e.g. honey, cinnamon. How many different combinations of 5 ingredients are possible?


Probability: Counting Rules

Your Turn

Suppose a food product developer has 19 ingredients with which to work, e.g. honey, cinnamon. How many different combinations of 5 ingredients are possible?

$\binom{19}{5} = \frac{19!}{5!\,14!} = \frac{(19)(18)(17)(2)}{1} = 11628$.


Conditional Probability

Bayes’ Theorem or Rule: Example

We want to find the probability of winning a game conditional on trailing at half-time. Suppose the overall probability of winning is .8 and the probability of losing is .2. Suppose that the probability of trailing at the half is .25 when winning, and .6 when losing.


Conditional Probability

Bayes’ Theorem or Rule: Example

We want to find the probability of winning a game conditional on trailing at half-time. Suppose the overall probability of winning is .8 and the probability of losing is .2. Suppose that the probability of trailing at the half is .25 when winning, and .6 when losing. Using Bayes’ Rule:

$P(A_i \mid B) = \frac{.8 \times .25}{.8 \times .25 + .2 \times .6} = \frac{0.2}{0.32} = 0.625$.


Discrete Probability Distributions
Chapter 5

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics


Objectives

– Be able to distinguish between discrete and continuous random variables.

– Understand the concept of a probability distribution and a probability mass function.

– Learn how to compute expected values and variance for a random variable.

– Work with the binomial and Poisson distributions, including examples calculating probabilities of certain outcomes.


Discrete Variables

Random Variable

A random variable is a number used to represent the outcome of an experiment. Random variables can be discrete, taking on a limited number of values or values that are countable, or continuous, taking on any value in a specified range.

– Number of people that reply to an advertisement: discrete.

– Amount someone spends on their credit card: continuous.

– Your weight: continuous.

– Number of tacos sold in one week: discrete.


PMF

PMF Rules

A probability mass function is a probability function, so all the rules for probability functions apply. In particular note that

– probabilities can’t be negative: $f(x) \ge 0$.

– probabilities must add up to 1: $\sum f(x) = 1$.


Expectation and Variance

Expectation and Variance

If we know the PMF or probability function we can find the expected value of a random variable.

$E(x) = \mu = \sum x f(x)$.

We can also find the variance.

$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$.


Expectation and Variance

Simpler Example

We want to find the expected number of children per U.S. household. The Census Bureau reports households of 4 or more lumped together. For simplicity assume 4 is the maximum number of children.

x   f(x)     xf(x)
0   0.6770   0.0000
1   0.1383   0.1383
2   0.1167   0.2333
3   0.0472   0.1417
4   0.0209   0.0835

$\mu = (0)(.677) + (1)(.1383) + (2)(.1167) + (3)(.0472) + (4)(.0209) = 0 + .1383 + .2333 + .1417 + .0835 = .5968$.


Expectation and Variance

Example

Find the expected value and variance for the distribution below.

x       f(x)   xf(x)   (x − µ)²   (x − µ)²f(x)
0       0.1
1       0.3
2       0.2
3       0.4
total   1


Expectation and Variance

Example

Find the expected value and variance for the distribution below.

x       f(x)   xf(x)   (x − µ)²   (x − µ)²f(x)
0       0.1    0.0     3.61       0.361
1       0.3    0.3     0.81       0.243
2       0.2    0.4     0.01       0.002
3       0.4    1.2     1.21       0.484
total   1      1.9                1.09

µ = 1.9. σ² = 1.09.
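
The table calculation can be sketched in Python by storing the PMF as a dictionary from each value x to its probability f(x). The function name `mean_and_variance` is an illustrative choice.

```python
def mean_and_variance(pmf):
    """Expected value and variance of a discrete random variable with PMF {x: f(x)}."""
    if any(p < 0 for p in pmf.values()):
        raise ValueError("probabilities cannot be negative")
    if abs(sum(pmf.values()) - 1) > 1e-9:
        raise ValueError("probabilities must add up to 1")
    mu = sum(x * p for x, p in pmf.items())                # sum of x f(x)
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())   # sum of (x - mu)^2 f(x)
    return mu, var

pmf = {0: 0.1, 1: 0.3, 2: 0.2, 3: 0.4}
mu, var = mean_and_variance(pmf)
print(round(mu, 2), round(var, 2))
```

The two checks at the top of the function are the PMF rules: no negative probabilities, and the probabilities sum to 1.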


Expectation and Variance

Your Turn

Find the expected value and variance.

x       f(x)   xf(x)   (x − µ)²   (x − µ)²f(x)
10      0.4    4       1.96       0.784
11      0.4
15      0.2
total   1


Expectation and Variance

Your Turn

Find the expected value and variance.

x       f(x)   xf(x)   (x − µ)²   (x − µ)²f(x)
10      0.4    4       1.96       0.784
11      0.4    4.4     0.16       0.064
15      0.2    3       12.96      2.592
total   1      11.4               3.44


Binomial Probability Distribution

Binomial PMF

A binomial experiment has

– n identical and independent trials.

– Each trial can end in success or failure.

– The probability of success for any given trial is p.

Success could be: business major, passed the class, is defective, prefers chocolate, defaulted on a loan...


Binomial Probability Distribution

Binomial PMF

The binomial PMF for the probability that X = x is

$P(X = x \mid n, p) = \binom{n}{x} p^x (1 - p)^{n - x}$.

The parts of this PMF are:

1. $\binom{n}{x}$ is the number of ways to get x successes out of a sample of n trials.

2. $p^x$ is the probability of getting x successes.

3. $(1 - p)^{n - x}$ is the probability of getting all the rest, n − x, non-successes.


Binomial Probability Distribution

Example

Plugging in the numbers: Find $P(X = 4 \mid n = 10, p = .7)$.

$P(X = 4 \mid 10, .7) = \binom{10}{4} (.7)^4 (1 - .7)^{10 - 4} = \frac{10!}{4!\,6!} (.24)(.3)^6 = (210)(.24)(.00073) = 0.0368$.

Your Turn: Find the binomial probabilities below:

$P(X = 6 \mid n = 20, p = .2) =$

$P(X = 5 \mid 8, .5) =$


Binomial Probability Distribution

Example

Plugging in the numbers: Find $P(X = 4 \mid n = 10, p = .7)$.

$P(X = 4 \mid 10, .7) = \binom{10}{4} (.7)^4 (1 - .7)^{10 - 4} = \frac{10!}{4!\,6!} (.24)(.3)^6 = (210)(.24)(.00073) = 0.0368$.

Your Turn: Find the binomial probabilities below:

$P(X = 6 \mid n = 20, p = .2) = .109$

$P(X = 5 \mid 8, .5) = .219$
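
The binomial PMF translates almost term by term into Python. This is a minimal sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x | n, p): probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        raise ValueError("need 0 <= x <= n")
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    # ways to place the successes * P(x successes) * P(n - x failures)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

print(round(binomial_pmf(4, 10, 0.7), 4))
print(round(binomial_pmf(6, 20, 0.2), 3))
print(round(binomial_pmf(5, 8, 0.5), 3))
```

Summing `binomial_pmf(x, n, p)` over x = 0, ..., n gives 1, as any PMF must.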


Binomial Probability Distribution

Example

In James Harden’s first season in Houston his field goal percentagewas .438 (43.8%) and he took about 17 shots per game.

Binomial Probability Distribution

Example

What is the probability that Harden would make 10 shots in a game with that field goal % and 17 shots?

\[ P(X = 10 \mid 17, .438) = \binom{17}{10} (.438)^{10} (1 - .438)^{7} = \frac{17!}{10!\,7!} (.00026)(.0177) = (19448)(.0000046) = 0.0895. \]

Binomial Probability Distribution

\(\mu\) and \(\sigma^2\)

Note that the mean and variance for the binomial distribution are:

\[ E(X) = \mu = np \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2 = np(1-p). \]

Example: \(n = 18\), \(p = .6\), then \(\mu = 18 \times .6 = 6 + 4.8 = 10.8\) and \(\sigma^2 = 10.8 \times .4 = 4.32\).

Your turn: Find \(\mu\) and \(\sigma^2\) for how many shots James Harden makes using his field goal rate of .438 and 17 shots per game.

Your turn: Find \(\mu\) and \(\sigma^2\) when \(n = 17\) and \(p = .438\). The solution is to calculate \(\mu = (17)(.438) = 7.446\) and \(\sigma^2 = (7.446)(.562) = 4.185\).
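The two formulas \(\mu = np\) and \(\sigma^2 = np(1-p)\) are short enough to wrap in one helper. This is an illustrative Python sketch; the name `binomial_mean_variance` is an assumption and does not come from the slides.

```python
from typing import Tuple


def binomial_mean_variance(n: int, p: float) -> Tuple[float, float]:
    """Return (mu, sigma squared) for a binomial distribution with n trials."""
    mean = n * p                 # mu = np
    variance = mean * (1 - p)    # sigma^2 = np(1 - p)
    return mean, variance


# The n = 18, p = .6 example and the n = 17, p = .438 exercise.
print(binomial_mean_variance(18, 0.6))
print(binomial_mean_variance(17, 0.438))
```

Computing the variance from the mean mirrors the hand calculation, where \(\sigma^2\) is found by multiplying \(\mu\) by \(1 - p\).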

Continuous Probability Distributions
Chapter 6 part a

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Objectives

– Be able to distinguish between discrete and continuous random variables.

– Understand the concept of a probability density function.

– Learn how to make computations of expected values and probabilities for the uniform, exponential and normal probability distributions.

– Learn about the standard normal distribution and revisit the z-score.

Continuous Random Variables

Continuous Random Variable

A continuous random variable is a measurement taking on any value in a specified interval. Examples are:

– Dollars spent on advertising.

– Production costs.

– Writeoffs: money lost on bad loans.

– Revenue.

Density Functions

PDF

For continuous random variables such as income, we no longer use a probability mass function. We work with distributions that have probability density functions or PDFs, \(f(x)\).

For random variables that follow continuous distributions, \(P(X = x) = 0 \neq f(x)\).

Taking the area under \(f(x)\) from point \(x_0\) to \(x_1\) shows us the probability that \(X\) is between \(x_0\) and \(x_1\), or \(P(x_0 \le X \le x_1)\).
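To make the "area under \(f(x)\)" idea concrete, here is a rough Python sketch that approximates the area with the midpoint rule. The density \(f(x) = 2x\) on \([0, 1]\) is a hypothetical example chosen only for illustration, and both function names are assumptions; none of this appears in the original slides.

```python
def area_under_pdf(pdf, x0, x1, steps=10_000):
    """Approximate P(x0 <= X <= x1) as the area under pdf (midpoint rule)."""
    width = (x1 - x0) / steps
    return sum(pdf(x0 + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(x):
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0


print(area_under_pdf(example_pdf, 0.5, 1.0))  # probability of an interval
print(area_under_pdf(example_pdf, 0.5, 0.5))  # a single point has zero width, so zero area
print(example_pdf(0.5))                       # the density at that point is not zero
```

The last two lines illustrate the statement \(P(X = x) = 0 \neq f(x)\): the density at a point can be positive even though the probability of exactly that point is zero.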

Uniform Distribution

Uniform Distribution

Note: a previous CFA exam had a question about the uniform distribution.

We use the uniform distribution when all values are equally likely. This distribution is primarily useful for

– theoretical questions because it is so simple.

– sampling.

– situations where all outcomes have an equal chance.

Uniform Distribution

Uniform PDF

The uniform density function is

\[ f(x) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases} \]

where \(a\) is the minimum value that the random variable can take and \(b\) is the maximum value.

Uniform Distribution

Example

Suppose that wait times at a restaurant are uniformly distributed between 5 and 35 minutes. That means that you are just as likely to wait 5 to 6 minutes as you are to wait 34 to 35 minutes. Then

\[ f(x) = \begin{cases} \dfrac{1}{30} & \text{if } 5 \le x \le 35 \\ 0 & \text{otherwise} \end{cases} \]

Uniform Distribution

CDF

The cumulative distribution function or CDF is the integral of the PDF. It gives the area under the \(f(x)\) curve, and we use it to find probabilities over certain ranges. For the uniform distribution, with \(a \le x_0 \le x_1 \le b\):

\[ P(x_0 \le X \le x_1) = \frac{x_1 - x_0}{b - a}. \]

If we are asking about \(P(X \le x)\) then we use \(x_0 = a\) and we have the uniform CDF:

\[ P(X \le x) = \frac{x - a}{b - a}. \]
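The two uniform probability formulas can be sketched in Python as follows. The function names are illustrative assumptions. Clamping the CDF to 0 below \(a\) and to 1 above \(b\) is an addition made here so the functions behave sensibly outside \([a, b]\), consistent with the density being zero there.

```python
def uniform_cdf(x, a, b):
    """Return P(X <= x) for X uniform on [a, b]."""
    if x <= a:
        return 0.0          # no probability below the minimum
    if x >= b:
        return 1.0          # all probability lies at or below the maximum
    return (x - a) / (b - a)


def uniform_prob_between(x0, x1, a, b):
    """Return P(x0 <= X <= x1) as a difference of two CDF values."""
    return uniform_cdf(x1, a, b) - uniform_cdf(x0, a, b)


# Wait times uniform between 5 and 35 minutes: two one-minute windows.
print(uniform_prob_between(5, 6, 5, 35))
print(uniform_prob_between(34, 35, 5, 35))
```

Both printed values are about 1/30, which matches the statement that a 5 to 6 minute wait is as likely as a 34 to 35 minute wait.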

Uniform Distribution

E(X) and Var(X)

The expectation and variance are given by:

\[ \mu = \frac{a + b}{2}, \qquad \sigma^2 = \frac{(b - a)^2}{12}. \]

The min and max then fully describe the distribution.
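Since the minimum and maximum determine everything, a small helper can return both moments from \(a\) and \(b\) alone. This is a hedged Python sketch; `uniform_mean_variance` is a made-up name for illustration.

```python
def uniform_mean_variance(a, b):
    """Return (mu, sigma squared) for X uniform on [a, b]."""
    mean = (a + b) / 2            # midpoint of the interval
    variance = (b - a) ** 2 / 12  # depends only on the width b - a
    return mean, variance


# Wait times uniform between 5 and 35 minutes.
print(uniform_mean_variance(5, 35))
```

Note that shifting the interval changes the mean but not the variance, because the variance depends only on the width \(b - a\).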

Uniform Distribution

Example

Suppose the DOT solicits bids for a job to put guard rails on a two-mile stretch of the I-17. The DOT will award a contract to the firm that submits the lowest bid. Suppose that bids are uniformly distributed with a minimum of $40,000 and a maximum of $60,000.

What is \(E(X)\)? What is the probability that any given bidder will submit a bid below $45,000? What is the probability that any given bid is between $50,000 and $56,000?

Uniform Distribution

Example

– Recall that \(\mu = \frac{a+b}{2}\) so we get (40,000 + 60,000)/2 or $50,000.

– For \(P(X < \$45{,}000)\) use the CDF.

\[ P(X < \$45{,}000) = \frac{x_1 - x_0}{b - a} = \frac{45{,}000 - 40{,}000}{60{,}000 - 40{,}000} = 5/20 = 1/4. \]

Uniform Distribution

Example

For the last part:

– \(P(\$50{,}000 < X < \$56{,}000) = \frac{6}{20} = .3\)

Uniform Distribution

Your Turn

Suppose prices are uniformly distributed with a low price of $40 and a high price of $150. What is the expected price and what is the probability that the price is above $100?

Uniform Distribution

Your Turn

For prices uniformly distributed between $40 and $150:

\[ E(X) = \frac{150 + 40}{2} = \frac{190}{2} = 95. \]

\[ P(X \ge 100) = 1 - \frac{100 - 40}{150 - 40} = 1 - .545 = .455. \]

Sampling and Sampling Distributions
Chapter 7

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Objectives

– Learn about sampling and sampling error.

– Learn about sampling distributions.

– Learn about the central limit theorem and its implications.

Sampling

Sampling

We sample because of the cost of collecting data. We can sample using probability sampling or non-probability (convenience) sampling. Types of probability sampling are:

– simple random or random.

– stratified.

– cluster.

– systematic.

Sampling

Sampling

We may use convenience sampling because it is easy, but it does not necessarily give us a sample that represents the population. An example of convenience sampling is me asking the class how many hours they work. If sophomore business majors at ASU are representative of all college students this may be fine. But what if most of the students that work 40 hours a week were “too busy” to take the survey?

Sampling

Sampling

Here are some examples of sampling:

– simple random: draw using a random number generator.

– stratified: draw customers from both the high-FICO and the low-FICO groups.

– cluster: survey a freshman GE class, sample from Denver.

– systematic: recording every fifth phone call or taking attendance on Fridays.
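Three of the schemes listed above can be sketched with Python's standard `random` module. Everything here is illustrative: the function names, the seed, and the data (100 numbered phone calls and 40 customers split into two FICO groups) are hypothetical and not from the slides. Cluster sampling is left out because it depends on how the clusters are defined.

```python
import random


def simple_random_sample(population, k, rng):
    """Simple random: every element has the same chance of being drawn."""
    return rng.sample(population, k)


def systematic_sample(population, step, start=0):
    """Systematic: take every step-th element, beginning at index start."""
    return population[start::step]


def stratified_sample(strata, k_per_stratum, rng):
    """Stratified: draw a simple random sample inside each stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample


rng = random.Random(221)  # fixed seed so the sketch is repeatable

# Hypothetical data for illustration only.
calls = list(range(1, 101))
customers = {
    "high_fico": [f"H{i}" for i in range(20)],
    "low_fico": [f"L{i}" for i in range(20)],
}

random_calls = simple_random_sample(calls, 10, rng)
every_fifth_call = systematic_sample(calls, step=5, start=4)  # calls 5, 10, ..., 100
fico_sample = stratified_sample(customers, 5, rng)
```

The stratified version guarantees that both FICO groups appear in the sample, which a simple random draw of the same size does not.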

Sampling

Potential Problems

Potential problems with different types of samples include:

– stratified: are the strata well defined and correct for the question at hand?

– cluster: is the chosen cluster truly representative of the population?

– systematic: is there a periodicity problem? For example, is attendance always lowest on Fridays?

Estimation

Point Estimate

We collect a sample so that we can make an estimate (or guess) about a population parameter such as \(\mu\) or \(\sigma^2\). However, our estimate will sometimes be wrong. Examples:

– For student hours worked our estimate of \(\mu\) is \(\bar{x} = 9.29\) (F2015=11.61, S2015=10.31).

– This doesn’t mean \(\mu = 9.29\) but it is our best guess.

Sampling error is the difference between the point estimate and the true value, i.e., \(\bar{x} - \mu\).

Sampling Distribution

Distribution of \(\bar{x}\)

A random variable \(X\) has a distribution and so does a statistic such as \(\bar{x}\). But these distributions are different.

If the estimator is unbiased, which is the case with \(\bar{x}\), then \(E(\bar{x}) = \mu\), which is also \(E(X)\), but the standard deviation of \(\bar{x}\) is not \(\sigma\). Rather,

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]

Sampling Distribution

Examples

We will see some numerical examples soon. First let’s see what this means for the hours students work.

Sampling Distribution

[Figure: Histogram of Hours Worked. Horizontal axis: hours (0 to 80); vertical axis: Frequency.]

Sampling Distribution

Examples

So, assuming the class is the population, the mean for hours worked is \(\mu = 9.285822\) and the standard deviation is \(\sigma = 12.6882255\).

Then if we take a sample of 30 the standard error will be \(\sigma_{\bar{x}} = 12.6882255/\sqrt{30} = 2.3165424\). We might also expect that \(\bar{x}\) in our sample will be close to \(\mu\).
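The standard error figure quoted above follows from \(\sigma/\sqrt{n}\). A minimal Python sketch, with `standard_error` as an illustrative name:

```python
from math import sqrt


def standard_error(sigma, n):
    """Return the standard deviation of the sample mean, sigma / sqrt(n)."""
    return sigma / sqrt(n)


# Class treated as the population: sigma = 12.6882255, samples of size 30.
print(standard_error(12.6882255, 30))

# Quadrupling the sample size halves the standard error.
print(standard_error(12.6882255, 120))
```

The second call shows why larger samples give tighter estimates: the standard error shrinks with the square root of \(n\), not with \(n\) itself.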

Sampling Distribution

Central Limit Theorem

The central limit theorem states that as \(n\) increases, the distribution of the sample mean, \(\bar{x}\), approaches the normal distribution.

The mean of the sampling distribution is equal to the population mean:

\[ \mu_{\bar{x}} = \mu \]

The standard deviation of the sampling distribution is called the standard error and it is:

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

Sampling Distribution

Central Limit Theorem

In practice we assume that for \(n \ge 30\) the sampling distribution is approximately normal regardless of the distribution of \(X\). If \(X\) is normally distributed then \(\bar{x}\) is normally distributed even for smaller \(n\).

We can then use the following:

\[ z_{\bar{x}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}. \]
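A small simulation can illustrate the two claims about the sampling distribution: its mean is \(\mu\) and its standard deviation is \(\sigma/\sqrt{n}\). The sketch below is an assumption-laden illustration, not something from the slides: it draws from an exponential population with mean 1 and standard deviation 1 (chosen because it is clearly not normal), and the function names, seed, and number of repetitions are arbitrary.

```python
import random
import statistics


def simulate_sample_means(draw, n, reps, seed=221):
    """Draw `reps` samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    return [statistics.mean([draw(rng) for _ in range(n)]) for _ in range(reps)]


def draw_exponential(rng):
    """One draw from a skewed population with mean 1 and standard deviation 1."""
    return rng.expovariate(1.0)


means = simulate_sample_means(draw_exponential, n=30, reps=2000)

print(statistics.mean(means))   # should land near the population mean, 1
print(statistics.stdev(means))  # should land near sigma / sqrt(n) = 1 / sqrt(30)
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the underlying population is strongly skewed.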

Sampling Distribution

Standard Error

The standard error is the standard deviation of the sampling distribution or point estimator.

\[ \text{standard error} = se = \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}. \]

Sampling Distribution

z values

Now we can answer questions about \(P(\bar{x} > x)\). For example, what is \(P(\bar{x} > 12{,}500)\) when the population mean is 12,000 and the standard error is 430?

\[ z = \frac{12{,}500 - 12{,}000}{430} = 1.16. \]

Now refer to the z table. Find the value for 1.16, which is .877. This means that \(P(\bar{x} < 12{,}500) = .877\) and \(P(\bar{x} > 12{,}500) = 1 - .877 = .123\).
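The z-table lookup can be reproduced in Python using the error function from the standard library. This is a sketch with illustrative function names. Because it does not round \(z\) to two decimals before the lookup, the result can differ from the table value in the third decimal place.

```python
from math import erf, sqrt


def standard_normal_cdf(z):
    """Return P(Z <= z) for a standard normal variable, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def z_score(value, mean, standard_error):
    """Return how many standard errors `value` lies from `mean`."""
    return (value - mean) / standard_error


z = z_score(12_500, 12_000, 430)
lower_tail = standard_normal_cdf(z)   # P(sample mean < 12,500)
upper_tail = 1 - lower_tail           # P(sample mean > 12,500)
print(round(z, 2), round(lower_tail, 3), round(upper_tail, 3))
```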

Sampling Distribution

Example z=1.16.

z values

Example

Suppose that a claim is made that the average age of your customer base is 37 years. Assume the standard deviation is 5 years. For a random sample of 40 customers, what is the probability of seeing a sample mean of 36.1 or lower?

z values

Example

The standard error is

\[ \frac{\sigma}{\sqrt{n}} = \frac{5}{\sqrt{40}} = .7906. \]

z is

\[ z = \frac{36.1 - 37}{.7906} = -1.14. \]

Then \(P(\bar{x} \le 36.1) = P(z \le -1.14) = .1271\).
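The same calculation can be done in one step by describing the sampling distribution of \(\bar{x}\) as a normal distribution with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\). The sketch below uses `statistics.NormalDist` from the Python standard library; the function name is an assumption, and the answer differs slightly from the table-based one because \(z\) is not rounded.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_at_most(xbar, mu, sigma, n):
    """Return P(sample mean <= xbar) under the normal approximation."""
    sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))  # standard error as the spread
    return sampling_dist.cdf(xbar)


# Claimed mean age 37, sigma = 5, sample of 40 customers, observed mean 36.1.
print(round(prob_sample_mean_at_most(36.1, 37, 5, 40), 4))
```

For an "or higher" question, subtract the returned value from 1.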

z values

Your Turn

Suppose that a claim is made that the average age of your customer base is 40 years. Assume the standard deviation is 5.4 years. For a random sample of 38 customers what is the probability of seeing a sample mean of 44.1 or higher?

z values

Your Turn

The standard error is

\[ \frac{\sigma}{\sqrt{n}} = \frac{5.4}{\sqrt{38}} = .876. \]

z is

\[ z = \frac{44.1 - 40}{.876} = 4.68. \]

Then \(P(\bar{x} \ge 44.1) = P(z \ge 4.68) \approx .0000\).

Appendix

Appendix on Proportions

There is a homework question regarding proportions that is a straight application of the formula in the book. There will not be any exam questions concerning proportions.

z values

Proportions

Suppose that we are working with binomial data and proportions. For example, suppose that your marketing and sales teams claim that a certain proportion of households watch a particular program, say Monday Night Football.

If we claim 20% of households watch the program, i.e. \(p = .2\), what sample results do we need to validate or contradict the claim? The answer comes later. But we need to understand something about the sampling distribution of the sample proportion first.

z values

Proportions

The sample proportion is

\[ \bar{p} = \frac{x}{n} \]

where \(x\) is the number of successes and \(n\) is the sample size. Note:

\[ E(\bar{p}) = p \quad \text{and} \quad \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \]

When \(np \ge 5\) and \(n(1-p) \ge 5\), the sampling distribution of \(\bar{p}\) will be approximately normal.

z values

Proportions Example

Suppose in our Monday Night Football example that \(p = .2\). What is the probability of drawing a sample proportion of .3 or greater from a sample of 50 households?

First, find the standard error

\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.2(1 - .2)}{50}} = \sqrt{\frac{.16}{50}} = .0566. \]

Then find the z score,

\[ z = \frac{.3 - .2}{.0566} = \frac{.1}{.0566} = 1.77 \]

From the z table we find \(P(\bar{p} > .3) = 1 - .9616 = .0384\).
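The proportion calculation follows the same pattern as the sample mean, with a different standard error. The following Python sketch is an illustration only; the function names are assumptions, and the rule-of-thumb check mirrors the \(np \ge 5\) and \(n(1-p) \ge 5\) condition stated earlier.

```python
from math import sqrt
from statistics import NormalDist


def proportion_standard_error(p, n):
    """Return the standard error of the sample proportion, sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)


def normal_approximation_ok(p, n):
    """Rule of thumb: both np and n(1 - p) should be at least 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def prob_sample_proportion_above(threshold, p, n):
    """Return P(sample proportion > threshold) under the normal approximation."""
    se = proportion_standard_error(p, n)
    return 1 - NormalDist(mu=p, sigma=se).cdf(threshold)


# Monday Night Football example: p = .2, n = 50 households, threshold .3.
print(normal_approximation_ok(0.2, 50))
print(round(proportion_standard_error(0.2, 50), 4))
print(round(prob_sample_proportion_above(0.3, 0.2, 50), 4))
```

The probability printed here uses an unrounded \(z\), so it may differ from the z-table answer in the fourth decimal place.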

z values

Proportions Your Turn

Now suppose instead that \(p = .3\) and in your sample of 50 you find 11 households that watch MNF. What is the probability of getting a proportion that small or smaller? Draw a graph showing this.

z values

Proportions Your Turn

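First, find the standard error

\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.3(1 - .3)}{50}} = \sqrt{\frac{.21}{50}} = .0648. \]

Then find the z score, using the sample proportion \(\bar{p} = 11/50 = .22\),

\[ z = \frac{.22 - .3}{.0648} = \frac{-.08}{.0648} = -1.234 \]

From the z table we find \(P(\bar{p} < .22) = 1 - .891 = .109\) (approximately).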

z values

Proportions Your Turn

Back to the Schlitz problem. Let \(p = .5\). What is the probability of getting fewer than 47 beer drinkers that prefer Schlitz when there are 100 taste testers?

z values

Proportions Your Turn

First, find the standard error

\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.5(1 - .5)}{100}} = \sqrt{\frac{.25}{100}} = .05. \]

Then find the z score. Fewer than 47 means 46 or fewer, but the distribution is not really continuous, so we use 46.5 (a continuity correction):

\[ z = \frac{.465 - .5}{.05} = \frac{-.035}{.05} = -.7. \]

From the z table we find \(P(\bar{p} < .465) = 1 - .758 = .242\) (approximately).

Interval Estimation
Chapter 8

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Objectives

– Learn about margin of error.

– Create interval estimates.

– Learn about the t distribution.

– Learn how to find the appropriate sample size.

Margin of Error

Intervals

If we know \(\bar{x}\) what can we infer about \(\mu\)? It will be a major miracle if \(\mu = \bar{x}\) and yet \(\bar{x}\) may still be our best guess about the value of \(\mu\).

We will take our point estimate, \(\bar{x}\), and create an interval estimate,

\[ \bar{x} \pm \text{margin of error}. \]

Margin of Error

Intervals

For example, with a confidence interval or interval estimate, instead of saying the average student debt is $25,550 we might say the average debt is between $19,550 and $31,550:

\[ 25{,}550 \pm 6{,}000. \]

Confidence Interval

Vocabulary

– confidence coefficient (c.c.): the probability that the interval contains the true parameter. Caution: the parameter is not a random variable, but the upper and lower limits of the interval are.

– confidence level: the confidence coefficient expressed as a percentage. The percentage of confidence intervals constructed in this fashion that would contain the true parameter.

– significance level: \(\alpha = 1 - \text{confidence coefficient}\). Sometimes we express this as a percentage.

Confidence Interval

Confidence Coefficient

Which confidence interval will have a larger margin of error?

1. confidence coefficient = .9

2. confidence coefficient = .95

The larger the confidence coefficient, the larger the margin of error and the wider the confidence interval. Draw this.

Confidence Interval

Confidence Coefficient

To find the appropriate margin of error we need to know the c.c. We also need to distinguish between whether or not we know \(\sigma\). Also, we need to know the standard error. And we need to know that the random variable, \(\bar{x}\), is (approximately) normally distributed.

Confidence Interval

σ known

If we know \(\sigma\) then construct a confidence interval with these steps.

1. Choose a confidence coefficient and level of significance \(\alpha\). Common c.c.s are .9, .95, and .99.

2. Find the point estimate, \(\bar{x}\).

3. Find the critical z-score, \(z_{\alpha/2}\) (explained below).

4. Multiply the critical z times the known standard error: \(z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\).

5. Subtract the value from the mean to get the lower confidence limit and add it to get the upper confidence limit. These limits define the confidence interval.

Confidence Interval

σ known

The interval estimate of a population mean is:

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}. \]
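The interval formula can be sketched in Python as below. The function name `z_interval` is an assumption, and the critical value comes from `statistics.NormalDist().inv_cdf` rather than from a printed z table, so results may differ from hand calculations in the third or fourth decimal place. The numbers in the usage line are illustrative inputs (a sample mean of 11.03 from 40 observations with known \(\sigma = 14.51\)).

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Return (lower, upper) confidence limits for the mean when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # leaves alpha/2 in the right tail
    margin = z_crit * sigma / sqrt(n)             # critical z times the standard error
    return xbar - margin, xbar + margin


lower, upper = z_interval(11.03, 14.51, 40, 0.95)
print(round(lower, 2), round(upper, 2))
```

Raising `confidence` widens the interval, and raising `n` narrows it, which matches the roles of \(z_{\alpha/2}\) and \(\sqrt{n}\) in the formula.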

Confidence Interval

σ known

Step 1. Select a c.c., let’s say .95.

Step 2. Find \(\bar{x}\). Example: if the observations are 11, 15, 19, 24, 13, 19 then \(\bar{x} = 16.83\).

Step 3. Find the critical z score, \(z_{\alpha/2}\).

Confidence Interval

Finding the critical z-score, \(z_{\alpha/2}\)

This is the z score that encompasses the c.c.; i.e., if c.c. = .9 then \(\alpha = .1\), so from the z table find the value which leaves \(\alpha/2 = .05\) to the right. This is 1.645.

[Figure: curve centered at 0 for c.c. = .9 and α = .1. The critical z values are −1.645 and 1.645; the middle area is c.c. = .90 and each tail has area α/2 = .05.]

Confidence Interval

Find \(Z^*\) (the critical value) where

\[ P(Z \le Z^*) = \text{c.c.} + \alpha/2 = .5 + (.5)(\text{c.c.}). \]

Get this from the Z TABLE.

Confidence Interval

Critical Values

Fill in the table below, work sdrawkcab in the z table.

c.c.    α      critical value
.8      .2     1.281
.85     .15    1.44
.9      .1     1.645
.95     .05    1.96
.99     .01    2.575
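Working backwards in the z table amounts to inverting the normal CDF at \(.5 + (.5)(\text{c.c.})\). The Python sketch below does this with `statistics.NormalDist().inv_cdf`; the function name `critical_z` is an assumption. The computed values agree with the table entries to about two or three decimals, since table lookups involve rounding or interpolation.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Return the z value with P(Z <= z) = .5 + (.5)(confidence)."""
    return NormalDist().inv_cdf(0.5 + 0.5 * confidence)


for cc in (0.8, 0.85, 0.9, 0.95, 0.99):
    alpha = 1 - cc
    print(f"c.c. = {cc:.2f}  alpha = {alpha:.2f}  critical value = {critical_z(cc):.3f}")
```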

Confidence Interval

σ known

Step 4. Multiply the critical value (1.96) by the known standard error. In this example suppose \(\sigma = 5\). (Recall c.c. = .95.)

\[ z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{5}{\sqrt{6}} = 4.00. \]

Step 5. Add and subtract from the point estimate:

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 16.83 \pm 4 = [12.83, 20.83]. \]

Confidence Interval

Example: σ known

Suppose we want to construct a 95% CI for the number of hours students work in a week. Suppose we know (from previous semesters) that \(\sigma = 14.51\). From a sample of 40 students we find an average of \(\bar{x} = 11.03\).

Confidence Interval

Example: σ known

Following our steps:

1. c.c. = .95 and \(\alpha = .05\).

2. \(\bar{x} = 11.03\)

3. In this case \(z_{\alpha/2} = z_{.025} = \pm 1.96\). Check the table of z values to verify that this is correct.

4. margin of error = \((1.96)(14.51)/\sqrt{40} = (1.96)(2.294) = 4.49624\).

5. \(11.03 \pm 4.49624 \implies\) C.I. = [6.53376, 15.52624].

Confidence Interval

Interpretation

This does not mean that there is a .95 probability that \(\mu\) is between 6.53376 and 15.52624. It means that 95% of confidence intervals constructed in this fashion would contain the true mean \(\mu\).

Confidence Interval

Your Turn: σ known

Consider the previous example but construct a 90% confidence interval and assume that a sample of 35 students produces a mean of 12.6. (Again, \(\sigma = 14.51\).)

Confidence Interval

Example: σ known

Following our steps:

1. c.c. = .9 and \(\alpha = .1\).

2. \(\bar{x} = 12.6\)

3. In this case \(z_{\alpha/2} = z_{.05} = \pm 1.645\).

4. \((1.645)(14.51)/\sqrt{35} = (1.645)(2.453) = 4.035185\).

5. \(12.6 \pm 4.035185 \implies\) C.I. = [8.564815, 16.635185].

Confidence Interval

σ unknown

When we do not know \(\sigma\) we must use a slightly different approach. Because we do not know \(\sigma\) we must first estimate it. Then we use the t-distribution instead of the z distribution. Using the t-distribution is appropriate when the underlying variable is normally distributed or we have a sample of more than 30 and the distribution is nearly symmetric.

Confidence Interval

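σ unknown

Steps:

1. Choose a confidence coefficient.

2. Find the point estimate, \(\bar{x}\).

3. Find the critical t-value, \(t_{\alpha/2,\,n-1}\) (see below).

4. Multiply the critical t times the estimated standard error:

\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \quad \text{and} \quad \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} \implies t_{\alpha/2,\,df} \frac{s}{\sqrt{n}} \]

5. Subtract the value from \(\bar{x}\) to get the lower confidence limit and add it to get the upper confidence limit.

\[ (1 - \alpha) \text{ C.I.} = \bar{x} \pm t_{\alpha/2,\,df} \frac{s}{\sqrt{n}}. \]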
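The steps for the \(\sigma\)-unknown case can be sketched in Python as follows. Python's standard library has no t distribution, so this sketch assumes the critical t value is looked up in a t table and passed in as `t_crit`. The function names and the sample numbers in the usage lines (mean 12,709, \(s\) = 4,030, \(n\) = 80, table value 1.99) are illustrative inputs.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(xbar, s, n, t_crit):
    """Return (lower, upper) limits given summary statistics and a table t value."""
    margin = t_crit * s / sqrt(n)   # critical t times the estimated standard error
    return xbar - margin, xbar + margin


def t_interval_from_data(data, t_crit):
    """Same interval, computing the mean and s (n - 1 divisor) from raw data."""
    return t_interval(mean(data), stdev(data), len(data), t_crit)


lower, upper = t_interval(12709, 4030, 80, 1.99)
print(round(lower), round(upper))
```

`statistics.stdev` divides by \(n - 1\), which matches the formula for \(s\) above; `statistics.pstdev` would divide by \(n\) and is not the right choice here.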

Confidence Interval

t values

The critical t is the t value that encompasses the c.c.; i.e., if c.c. = .9 and \(\alpha = .1\) then from the t table find the value which leaves \(\alpha/2 = .05\) to the right. Finding the t value also requires knowing the degrees of freedom, which we typically abbreviate as d.f., \(df = n - 1\). So if \(n = 24\) then \(df = 23\). Then \(t_{.05,23} = 1.714\). Verify this by looking at the t table.

Confidence Interval

Practice finding t values

1. For \(\alpha = .05\) and \(df = 19\), \(t = 2.093\).

2. For \(n = 28\) and c.c. = .99 what is t?

\(df = 27\), \(\alpha = .01\), \(t = 2.771\)

3. For \(n = 22\) and c.c. = .9 what is t?

\(df = 21\), \(\alpha = .1\), \(t = 1.721\)

4. For \(n = 62\) and c.c. = .95 what is t?

\(df = 61\), \(\alpha = .05\), \(t \approx 2\) (not on table so approximate)


Example σ unknown

Example: suppose you are working with a new portfolio of loans. You want to construct a 95% confidence interval for the average loan amount. Because the portfolio is new, you cannot treat σ as known. However, in a sample of 80 loans you find the average loan amount is $12,709 and the standard deviation is $4,030.


1. c.c.=.95 and α = .05.

2. \(\bar{x} = 12709\)

3. df=79. This value for df is not on the table but 1.99 is close.

4. \[ 1.99\left(\frac{4030}{\sqrt{80}}\right) = (1.99)(450.6) = 897. \]

5. 12709 − 897 = 11812 and 12709 + 897 = 13606.

95% C.I. = [11812, 13606].


Your Turn: σ unknown

You have a new credit card product and you observe after six months that the average customer age of a sample of 34 customers is 31.4 years and the variance is 27.2. Construct a 99% confidence interval for the mean customer age. Because the product is new assume that σ is unknown.


1. c.c.=.99 and α = .01.

2. \(\bar{x} = 31.4\)

3. df=33. This value for df is not on the table but 2.73 is close.

4. \[ 2.73\left(\frac{\sqrt{27.2}}{\sqrt{34}}\right) = (2.73)(.89) = 2.44. \]

5. 31.4 ± 2.44.

99% C.I. = [28.96, 33.84].


Your Turn: σ unknown

Now assume that you have 46 customers with an average age of 44.7 years and the variance is 31.6. Construct a 90% confidence interval for the mean customer age. Because the product is new assume that σ is unknown.


1. c.c.=.90 and α = .1.

2. \(\bar{x} = 44.7\)

3. df=45. t=1.68.

4. \[ 1.68\left(\frac{\sqrt{31.6}}{\sqrt{46}}\right) = (1.68)(.829) = 1.39. \]

5. 44.7 ± 1.39.

90% C.I. = [43.3, 46.1].


Selecting Sample Size

Suppose you want to construct a confidence interval and you are planning to collect data. How many observations do you need? To keep costs down you will not want to pay for 100 surveys when 40 is sufficient.

Before determining the sample size you need, you must first determine the c.c. or α and the margin of error. For example, in our previous problems do we want the margin of error to be $1000 or is $2000 an acceptable margin of error? Is a margin of error of 2.5 years acceptable or do we need it to be .5 years?


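Here is how we determine the necessary sample size. Recall

\[ \text{margin of error} = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \]

then divide both sides by the margin of error, multiply both sides by \(\sqrt{n}\), and then square both sides to get

\[ n = z_{\alpha/2}^{2}\left(\frac{\sigma^{2}}{me^{2}}\right) = z_{\alpha/2}^{2}\left(\frac{\sigma^{2}}{E^{2}}\right) \]

where E or me is the margin of error.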
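As a rough sketch of the formula above, the following Python function computes the required sample size and rounds up. The function name `sample_size_for_mean` and the numbers in the usage line are assumptions made for illustration only.

```python
import math

def sample_size_for_mean(z, sigma, margin_of_error):
    """Smallest whole n satisfying z * sigma / sqrt(n) <= margin_of_error."""
    n_exact = (z * sigma / margin_of_error) ** 2   # n = z^2 * sigma^2 / E^2
    return math.ceil(n_exact)                      # always round up, never down

# Made-up numbers: 95% confidence (z = 1.96), sigma = 10, margin of error = 2.
print(sample_size_for_mean(z=1.96, sigma=10.0, margin_of_error=2.0))
```

Rounding up rather than to the nearest integer guarantees that the achieved margin of error is no larger than the one requested.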


Example: Selecting Sample Size

You want a 95% confidence interval and a margin of error of $1250 for average monthly spend on a credit card when σ = 3450. Find n.

\[ n = \left(\frac{(1.96)(3450)}{1250}\right)^{2} = 29.3 \]

and then round up to 30 or else you will not have a 95% confidence interval.


Your Turn

You want a 90% confidence interval and a margin of error of 75 (megabytes) for average monthly data usage on a phone when σ = 400. Find n. (Hint: for c.c.=.9, z = 1.645.)


\[ n = \left(\frac{(1.645)(400)}{75}\right)^{2} = 76.97 \]

and then round up to 77 or else you will not have a 90% confidence interval.

Now you know how to find the sample size necessary to create a confidence interval around the mean for a particular margin of error and α.


Confidence Intervals for Population Proportion

To construct a confidence interval for a population proportion we first need to know that np ≥ 5 and n(1 − p) ≥ 5 for the distribution of the sample proportion to be approximately normal. We follow similar steps as above. However, for proportions the standard deviation and the population proportion are linked together. If we know one then we know the other. We use the sample proportion \(\bar{p}\) to calculate the standard error:

\[ \text{s.e.} = \sigma_{\bar{p}} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} \]

and the confidence interval is

\[ \bar{p} \pm z_{\alpha/2}\sqrt{\frac{\bar{p}(1-\bar{p})}{n}}. \]
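A small Python sketch of the proportion interval follows. The function name `proportion_confidence_interval` is hypothetical, the z value is supplied by hand from a z table, and the "at least 5" check is applied to the sample proportion as a stand-in for the unknown population proportion, which is an assumption of this sketch.

```python
import math

def proportion_confidence_interval(successes, n, z):
    """Return (lower, upper) for p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = successes / n
    # Rule of thumb for the normal approximation, checked with p_bar.
    if n * p_bar < 5 or n * (1 - p_bar) < 5:
        raise ValueError("normal approximation may not be appropriate")
    std_error = math.sqrt(p_bar * (1 - p_bar) / n)
    margin = z * std_error
    return p_bar - margin, p_bar + margin

# Made-up numbers: 40 successes in 100 trials, 95% confidence (z = 1.96).
lower, upper = proportion_confidence_interval(successes=40, n=100, z=1.96)
print(round(lower, 3), round(upper, 3))
```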


Example

A television program wants to estimate the proportion of viewers that are female in order to understand its options in selling targeted advertising spots. In a sample of 147 customers, 98 were female. We want to construct a 99% confidence interval for p.

\[ \bar{p} = \frac{98}{147} = .667. \]

\[ \text{s.e.} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{.222}{147}} = .0389. \]

\[ \text{CI} = .667 \pm (2.576)(.0389) = .667 \pm .1002 = [.567, .767]. \]


Your Turn

Again a television program wants to estimate the proportion of viewers that are aged 21-35 in order to understand its options in selling targeted advertising spots. In a sample of 115 customers, 55 fit this demographic. We want to construct a 95% confidence interval for p. What is \(\bar{p}\)? What is the standard error? What is the confidence interval?


\[ \bar{p} = \frac{55}{115} = .4783. \]

\[ \text{s.e.} = \sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = \sqrt{\frac{.2495}{115}} = .0466. \]

\[ \text{CI} = .4783 \pm (1.96)(.0466) = .4783 \pm .0913 = [.387, .570]. \]


Finding Sample Size for Proportions

The setup is similar to that for sample means, but in this case we would need some existing estimate of the population proportion, which we may not have. We will use a value for p which may come from:

1. a previous sample

2. a pilot study

3. a best guess

4. just pick p = .5


The formula is

\[ n = \frac{z_{\alpha/2}^{2}\,p(1-p)}{me^{2}}. \]

Example with p = .5 and margin of error = .025 and α = .05.

\[ n = \frac{z_{\alpha/2}^{2}\,p(1-p)}{me^{2}} = \frac{(1.96)^{2}(.5)(.5)}{.025^{2}} = 1536.6. \]

Again we round up and get 1537.
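The same calculation can be sketched in Python. The function name `sample_size_for_proportion` and the planning value in the usage line are assumptions for illustration. Because p(1 − p) is largest at p = .5, choosing .5 gives the largest (most cautious) sample size.

```python
import math

def sample_size_for_proportion(z, p_guess, margin_of_error):
    """n = z^2 * p(1 - p) / me^2, rounded up to a whole observation."""
    n_exact = z ** 2 * p_guess * (1 - p_guess) / margin_of_error ** 2
    return math.ceil(n_exact)

# Made-up planning value: a pilot study suggests p is near .3.
print(sample_size_for_proportion(z=1.96, p_guess=0.3, margin_of_error=0.025))
```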

Hypothesis Tests - One Mean
Chapter 9

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Outline: Objectives; Context - Research Question; Test Statistic; Error Types; Rejection Rule; Example/Your Turn; Using Excel; One-Tail Tests; Your Turn: One-Tail Test


Objectives

– Understand the concept of hypothesis testing.

– Be able to calculate a test statistic for a hypothesis concerning one mean.

– Understand the difference between Type I and Type II errors.

– Be able to use the test statistic to conduct a hypothesis test.

– Learn what a p-value is and be able to use it to conduct a hypothesis test.


Research Question

Suppose I want to know the average salary for people with economics degrees. Here are all the economists in the world and their salaries:

Name        Salary     Name            Salary
David A.    311,953    Alan A.         291,782
David C.    336,367    Brad D.         134,967
Aaron E.    279,532    Barry E.        247,003
Haluk E.    384,226    Ben F.          183,550
Joe F.      174,067    Frederico F.    179,330
Yuriy G.    359,883    *Cecile G.       43,750

Source: Berkeley economists 2014; *postdoc


Research Question

The true average salary, µ, is $243,868. But suppose I have to pay each one $20 to get the information. I will settle for asking 7 of them to save myself $100.

If I draw them randomly there are \(\binom{12}{7} = 792\) different possible samples.


Research Question

If my sample is {David A., Alan, Aaron, Haluk, Barry, Ben, Frederico} then the average, \(\bar{x}\), is $268,197.

But if my sample is {Yuriy, Alan, Brad, Barry, Ben, Frederico, Cecile} then the average, \(\bar{x}\), is $205,752.

Problem: I could get 792 different answers for \(\bar{x}\) and none of them will be $243,868.
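To make the sampling problem concrete, here is a short Python sketch that enumerates every possible sample of 7 from the 12 salaries in the table and computes each sample average. The variable names are illustrative choices, not part of the course materials.

```python
from itertools import combinations
from math import comb
from statistics import mean

# Salaries from the table of 12 economists.
salaries = {
    "David A.": 311953, "Alan A.": 291782,
    "David C.": 336367, "Brad D.": 134967,
    "Aaron E.": 279532, "Barry E.": 247003,
    "Haluk E.": 384226, "Ben F.": 183550,
    "Joe F.": 174067, "Frederico F.": 179330,
    "Yuriy G.": 359883, "Cecile G.": 43750,
}

mu = mean(salaries.values())   # population mean

# One sample average for every possible sample of 7 economists.
sample_means = [mean(sample) for sample in combinations(salaries.values(), 7)]

print(comb(12, 7), len(sample_means))            # number of possible samples
print(mu, min(sample_means), max(sample_means))  # spread of possible answers
```

The spread between the smallest and largest sample average shows how much the answer depends on which 7 people happen to be drawn.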


Null Hypothesis or H0

So, if someone asserts that µ = $243,868 or even µ = $143,868 how can we decide if the assertion is true or false?

We can find the probability of getting a sample that gave us \(\bar{x}\), the sample average, assuming the assertion is true.

If that probability is low or small then perhaps the assertion is false and we reject it.


Null Hypothesis or H0

This assertion (or question) is called the null hypothesis and is written \(H_0\).

So if our null hypothesis is that average salary for economists, µ, is $200,000 then we would write:

\[ H_0 : \mu = 200{,}000. \]


Null Hypothesis or H0

If our null hypothesis is:

\[ H_0 : \mu = 200{,}000, \]

then there must be some alternative hypothesis. If we get a sample where \(\bar{x}\) has a low probability of showing up when \(H_0\) is assumed true then perhaps something else (the alternative) is true. We write:

\[ H_a : \mu \neq 200{,}000. \]


Test Statistic

For our "test" we will use the central limit theorem which tells us that (for n large enough) \(\bar{x}\) follows the normal distribution. We already saw these results when we constructed confidence intervals.

\[ \frac{\bar{x} - \mu}{s/\sqrt{n}} = t_{stat} \sim t_{n-1}. \]

Notice, \(t_{stat}\) depends on the sample average, \(\bar{x}\), and how spread out the values are, s.

Technically, we don't know µ so we will use \(\mu_0\), which is the hypothesized value for µ.


Test Statistic: Example

\[ H_0 : \mu = 200{,}000. \]

This means we have

\[ t_{stat} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{\bar{x} - 200{,}000}{s/\sqrt{n}}. \]

If the sample average, \(\bar{x}\), is $268,197 and the standard deviation, s, is 72,542 while there were n = 7 observations

\[ t_{stat} = \frac{268{,}197 - 200{,}000}{72{,}542/\sqrt{7}} = \frac{68{,}197}{72{,}542/2.646} = \frac{68{,}197}{27{,}418} = 2.49. \]
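The test statistic is easy to compute directly. The following minimal Python sketch uses a hypothetical function name, `t_statistic`, and reproduces the salary example from the numbers given above.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """t_stat = (x_bar - mu_0) / (s / sqrt(n))."""
    std_error = s / math.sqrt(n)   # estimated standard error of the mean
    return (x_bar - mu_0) / std_error

# Salary example: x_bar = 268,197, mu_0 = 200,000, s = 72,542, n = 7.
print(round(t_statistic(x_bar=268197, mu_0=200000, s=72542, n=7), 2))  # 2.49
```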


Test Statistic: Example

[Figure (tdist1.pdf): density of the t distribution with 6 degrees of freedom, with t = 2.49 marked. The area to the left is P(t < 2.49) = .97642 and the area to the right is P(t > 2.49) = .02358.]


Test Statistic: Example

The probability of getting a sample with a t value of 2.49 or something more extreme (e.g. 2.7 or −2.5) is \(1 - \Pr(-2.49 < t < 2.49) = 0.0472\).

So if µ is really $200,000 then we would only have gotten a sample like this (or "worse") about 4 or 5 times out of 100.

So, do you think \(H_0 : \mu = 200{,}000\) is false?
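For readers who want to check this probability without a table, here is a sketch that integrates the t density numerically. It swaps in Simpson's rule for the table lookup, which is a choice made for this illustration and not the method used in the slides; the function names `t_pdf` and `two_tail_p_value` are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tail_p_value(t_stat, df, steps=2000):
    """1 - Pr(-|t_stat| < t < |t_stat|) by Simpson's rule (steps must be even)."""
    a, b = -abs(t_stat), abs(t_stat)
    h = (b - a) / steps
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights: 4 on odd, 2 on even nodes
        total += weight * t_pdf(a + i * h, df)
    middle_area = total * h / 3         # Pr(-|t_stat| < t < |t_stat|)
    return 1 - middle_area

# Salary example: t = 2.49 with n = 7 observations, so df = 6.
print(round(two_tail_p_value(2.49, df=6), 4))  # 0.0472
```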


Error Types

If we reject \(H_0\) (we think it is false) then we might be wrong since we don't actually know what µ is. But if we fail to reject it (i.e. accept \(H_0\)) then we might be wrong too.

In this example, if \(H_0\) is true then a sample at least this extreme shows up with probability 0.0472, so rejecting on this kind of evidence would be wrong about 4 or 5 times in 100.

conclusion           H0 is true            H0 is false
fail to reject H0    correct conclusion    Type II error
reject H0            Type I error          correct conclusion

The probability of a Type I error is α. This is super important in this class.

The probability of a Type II error is β (not covered in this class).


Error Types

Typically, researchers set α (read alpha) to .05 but .1 and .01 are also fairly common. In this class if α is not stated explicitly then assume it is .05.

α: the level of significance of a test or the maximum acceptable probability of a Type I error (the probability of rejecting the null hypothesis even though it is true).

Note: sometimes people refer to α as a percentage and say things like "the 5% level of significance" to mean α = .05.


Rejection Rule

So when do we reject H0?

We reject the null hypothesis when the probability of getting a sample at least as extreme as ours, assuming \(H_0\) is true, is less than or equal to α. This probability is called the p-value. So,

if p-value ≤ α reject H0

if p-value > α fail to reject H0


Rejection Rule

For an equivalent rule notice that \(t_{stat} \iff \text{p-value}\)

because,

\[ \text{p-value} = 1 - \Pr(-t_{stat} < t < t_{stat}), \quad \text{when } t_{stat} > 0 \]

and

\[ \text{p-value} = 1 - \Pr(t_{stat} < t < -t_{stat}), \quad \text{when } t_{stat} < 0. \]

Similarly, \(\alpha \iff t_{critical}\),

\[ \alpha = 1 - \Pr(-t_{critical} < t < t_{critical}). \]

Note: these rules are for two-tail tests. See one-tail tests later.


Rejection Rule

This means:

if \(|t_{stat}| \ge |t_{critical}|\) reject \(H_0\)

if \(|t_{stat}| < |t_{critical}|\) fail to reject \(H_0\)

We find the test statistic using the data as seen earlier and we find the critical t value from the t table (or computer software). Graphically we can construct rejection regions.
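The two equivalent forms of the two-tail rejection rule can be written as small Python helpers. The function names are hypothetical, and the test statistic, critical value, and p-value are assumed to have been computed or looked up beforehand.

```python
def two_tail_decision(t_stat, t_crit):
    """Critical-value form: reject H0 when |t_stat| >= |t_crit|."""
    if abs(t_stat) >= abs(t_crit):
        return "reject H0"
    return "fail to reject H0"

def p_value_decision(p_value, alpha=0.05):
    """p-value form: reject H0 when p-value <= alpha (default alpha = .05)."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

# Salary example: p-value of about .0472 at two different significance levels.
print(p_value_decision(0.0472, alpha=0.05))  # reject H0
print(p_value_decision(0.0472, alpha=0.01))  # fail to reject H0
```

The second usage line shows that the same sample can lead to a different conclusion when the level of significance changes.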


Rejection Region: Example

Example with α = .1 and df=3000.

[Figure: t density centered at µ0, horizontal axis from −4 to 4, with rejection regions shaded in both tails beyond the critical values ±1.645.]


Rejection Rule

So we reject if the test statistic falls in the rejection region (the blue area on the previous graph).

We fail to reject if the test statistic does not fall in the rejection region (falls in the white space on the previous graph).


Example

Test the null hypothesis that the average annual amount spent at "bars" is $1,000. You have a sample of 50 and a sample average of $852 with a sample standard deviation of $691. Use α = .05.

1 find the d.f. = n − 1 = 50 − 1 = 49.

2 find \(t_{critical} = 2.01\) by looking at the t table.

3 find the test statistic

\[ t_{stat} = \frac{852 - 1000}{691/\sqrt{50}} = \frac{-148}{691/7.071} = \frac{-148}{97.722} = -1.514 \]

4 −1.514 > −2.01 =⇒ fail to reject (see graph)


Example

Example with α = .05 and df=49. Fail to reject.

[Figure: t density centered at µ0, horizontal axis from −4 to 4, with rejection regions beyond the critical values ±2.01; the test statistic −1.514 falls outside the rejection regions.]


Your Turn

As part of quality control a jelly producer wants to verify that its machines are filling jars to the correct level. The jars are sold as 15 oz. jars of jelly and the team takes a sample of 35 and finds a mean fill level of 15.04 with a standard deviation of .034. Using α = .01 as the level of significance what do they conclude?

Other questions:

1 Why would this question be relevant for a company?

2 Why would the QC personnel only take a sample? (They could weigh a jar without opening it so they could still sell it if it is used for the study.)

3 How would changing the level of significance change the problem?


Your Turn

\[ H_0 : \mu = 15 \]

1 find the d.f. = 35 − 1 = 34.

2 find \(t_{critical} = \pm 2.728\) by looking at the t table.

3 find the test statistic

\[ t_{stat} = \frac{15.04 - 15}{.034/\sqrt{35}} = \frac{0.04}{.034/5.916} = \frac{0.04}{0.0057} = 6.96 \]

4 6.96 > 2.728 =⇒ reject \(H_0\) (see graph).


Your Turn

Example with α = .01 and df=34. Reject.

[Figure: t density, horizontal axis from −7.5 to 7.5, with rejection regions beyond the critical values ±2.728; the test statistic 6.96 falls in the right rejection region.]


Using Excel

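In Excel you can do this by selecting Data > Data Analysis >... well foot! Excel won't do this so it is good to know how to do it "by hand."

Excel will, however, give you the mean, standard deviation, and allow you to look up a critical t value (=T.INV(α/2,df)). Note that T.INV returns the left-tail value, so this result is negative; =T.INV.2T(α,df) returns the positive two-tail critical value.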


One-Tail Test: Concept

Should Buzz Lightyear have to prove he can fly or should Woody have to prove that he can't?


One-Tail Test: Concept

Should the NE Patriots have to prove they aren't cheaters or should the NFL have to prove that they are?


Concept

We sometimes look for evidence that a value is at least a certain amount or at most a certain amount. We are talking now in terms of greater than (>) and less than (<).

Sometimes people go Dr. Seussy on this one and get their right mixed up with their left. So let's just go with a simple example.


Concept

In the NFL the minimum inflation level is 12.5 psi for the game balls. If the NFL is looking for evidence or "proof" that a team under-inflates balls then the null hypothesis is,

\[ H_0 : \mu \ge 12.5. \]

Does that look backwards? To prove the balls are under-inflated the NFL must reject the hypothesis of adequate inflation. That is, with just a sample of balls they need to see that there is a low probability of such under-inflation if the team is actually inflating the balls properly.


Left vs. Right

Null           tail          rejection region    when to use
H0 : µ ≥ µ0    left tail     on the left         research agenda is to show the mean is less than µ0
H0 : µ ≤ µ0    right tail    on the right        research agenda is to show the mean is greater than µ0


Example

Suppose 18 game balls were found to have an average psi of 11.9, i.e. \(\bar{x} = 11.9\), and the sample standard deviation is .6. Use a significance level of .05 (α = .05) to test the following hypothesis:

\[ H_0 : \mu \ge 12.5. \]

(Note: for n under 30 we must assume X is approximately normally distributed.)


Example

1 find the d.f. = 18 − 1 = 17.

2 find \(t_{critical} = -1.74\) by looking at the t table. It is negative because this is a left tail test (see table above). WARNING: the critical values must leave a total probability of α in the rejection region(s) so the values are different for one vs. two tail tests.

3 find the test statistic

\[ t_{stat} = \frac{11.9 - 12.5}{.6/\sqrt{18}} = \frac{-0.6}{.6/4.243} = \frac{-0.6}{0.141} = -4.243 \]

4 −4.243 < −1.74 =⇒ reject \(H_0\) (see graph).
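The one-tail rejection rule can be sketched the same way. The function name `one_tail_decision` and its `tail` argument are hypothetical; the critical value is assumed to come from the t table with all of α in a single tail.

```python
def one_tail_decision(t_stat, t_crit, tail):
    """One-tail rejection rule.

    tail="left"  for H0: mu >= mu_0 (reject when t_stat is far to the left)
    tail="right" for H0: mu <= mu_0 (reject when t_stat is far to the right)
    """
    magnitude = abs(t_crit)   # accept the table value with either sign
    if tail == "left":
        reject = t_stat <= -magnitude
    elif tail == "right":
        reject = t_stat >= magnitude
    else:
        raise ValueError("tail must be 'left' or 'right'")
    return "reject H0" if reject else "fail to reject H0"

# Game ball example: t_stat = -4.243, critical value -1.74, left tail test.
print(one_tail_decision(-4.243, -1.74, tail="left"))  # reject H0
```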


Example

Example with α = .05 and df=17. Left tail test. Reject.

[Figure: t density, horizontal axis from −5 to 5, with a single rejection region to the left of the critical value −1.74; the test statistic −4.24 falls in the rejection region.]


Your Turn

A bank wants to show that it has low average finance charges (this is part of an advertising and public relations campaign). The bank wants to advertise that the average monthly finance charge is less than $50 a month but wants to avoid problems (e.g. with the FTC, CFPB...).

With a sample of 2,000 accounts they find an average of x̄ = $47.24 and a sample standard deviation of s = 38.6.

Your Turn

Use α = .1 and conduct the appropriate hypothesis test; report the test statistic and your conclusion. The null hypothesis is:

H0 : µ ≥ 50.

Other questions:

1. Why would this question be relevant for a company? If x̄ < 50, aren't they safe?

2. Why is this a left tail test?

3. Why would they only take a sample? The bank probably has records on all their customers' accounts and could use the entire population. No?

4. How would changing the level of significance change the problem?

Your Turn

1. Find the d.f. = 2000 − 1 = 1999. (In this case using a z value is acceptable.)

2. Find t_critical = −1.282 by looking at the z or t table.

3. Find the test statistic:

$$t_{stat} = \frac{47.24 - 50}{38.6/\sqrt{2000}} = \frac{-2.76}{38.6/44.721} = \frac{-2.76}{0.863} = -3.198$$

4. −3.198 < −1.282 ⟹ reject H0 (see graph).
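Because the sample is large enough to use z, the critical value does not have to come from a table. A minimal sketch using Python's standard library follows; the variable names are chosen here for illustration, and the p-value is an extra the slide does not ask for.

```python
import math
from statistics import NormalDist

# Bank example: H0: mu >= 50, left tail test at alpha = .10
xbar, mu0, s, n, alpha = 47.24, 50.0, 38.6, 2000, 0.10

se = s / math.sqrt(n)                 # standard error of the mean
z_stat = (xbar - mu0) / se            # test statistic

# Left tail: the critical value leaves probability alpha below it
z_critical = NormalDist().inv_cdf(alpha)

# p-value for a left tail test: probability of a statistic at least this low
p_value = NormalDist().cdf(z_stat)

reject = z_stat < z_critical
print(f"z = {z_stat:.3f}, critical = {z_critical:.3f}, reject H0: {reject}")
```

Rejecting when `z_stat < z_critical` and rejecting when `p_value < alpha` are the same decision rule stated two ways.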

Your Turn

Example with α = .10 and df = 1999 (as good as z). Left tail test. Reject.

[Figure: t distribution (axis from −5 to 5) with the left-tail rejection region below the critical value −1.282; the test statistic, −3.198, falls inside it.]

Right Tail Rejection Region

Example with α = .05 and df = 30. Right tail test. The critical value is 1.697.

[Figure: t distribution (axis from −4 to 4) with the right-tail rejection region above the critical value 1.697.]

Hypothesis Tests for Two Means
Chapter 10 part a

Richard Cox

Department of Economics
Arizona State University

ECN 221, Business Statistics

Objectives

– Be able to test hypotheses concerning the difference in two means.

– Be able to test hypotheses for the difference in two population means for matched or paired samples (see appendix).

– Understand the concept of ANOVA.

– Be able to conduct ANOVA and interpret ANOVA results using Excel.

Hypothesis Test for Two Means

µ1 − µ2

We continue testing hypotheses but now for two populations. Here the claims concern the relationship between two population parameters. For example:

1. The average student debt for students at public universities is the same as that at private universities: H0 : µpublic = µprivate.

2. The average salary for private sector jobs exceeds public sector job salaries: H0 : µprivate ≤ µpublic. This may seem backwards, but we are looking to see if there is sufficient evidence to reject H0.

3. The average salary for private sector jobs exceeds public sector job salaries by more than $5,000: H0 : µpublic ≥ µprivate − $5,000.

Hypothesis Test for Two Means

Example

The process for testing these hypotheses is the same as before. However, now we have a new formula for the standard error. Also, for convenience let D0 denote the hypothesized difference between µ1 and µ2.

The test statistic for known variances is:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

where the subscripts 1 and 2 refer to samples 1 and 2.

Hypothesis Test for Two Means

Excel

You can do this type of hypothesis test in Excel. This will be handy if you have a large number of observations and no summary statistics given. First let's try a couple “by hand.”

Hypothesis Test for Two Means

Example

Example (taken from Donnelly): The average salary for federal employees is $66,700 with a sample of 35 and the average salary for private sector employees in similar-type jobs is $60,400 with a sample of 32. The population standard deviations are $12,000 for federal jobs and $11,000 for private sector jobs. At the 5% level of significance test the hypothesis that the average federal pay is the same as the average private sector pay.

Hypothesis Test for Two Means

Example

The critical value is ±1.96 and the test statistic is:

$$z = \frac{(66{,}700 - 60{,}400) - 0}{\sqrt{\frac{12{,}000^2}{35} + \frac{11{,}000^2}{32}}} = \frac{6300}{2809.9} = 2.24.$$

Because 2.24 > 1.96 we reject the null hypothesis of equal means. There is statistically significant evidence that the federal and private sector salaries are different.
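The same two-tail calculation can be sketched in Python. The helper name `two_sample_z` is an assumption made for this illustration, not something from the course materials; the p-value line is an addition beyond what the slide reports.

```python
import math
from statistics import NormalDist

def two_sample_z(x1, x2, sigma1, sigma2, n1, n2, d0=0.0):
    """Return (standard error, z) for H0: mu1 - mu2 = d0 with known sigmas."""
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return se, ((x1 - x2) - d0) / se

# Federal (sample 1) vs. private sector (sample 2) salaries
se, z = two_sample_z(66_700, 60_400, 12_000, 11_000, 35, 32)

alpha = 0.05
# Two tail test: alpha is split evenly, alpha/2 in each tail
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
# Two-sided p-value: probability of a statistic at least this far from 0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject = abs(z) > z_crit
print(f"se = {se:.1f}, z = {z:.2f}, critical = ±{z_crit:.2f}, reject H0: {reject}")
```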

Hypothesis Test for Two Means

Your Turn

Suppose the average starting salary for business majors is $52,236 and for communications majors it is $47,047 (according to NACE for 2016). Suppose the sample sizes are 619 and 76 and that the known population standard deviations are $6,700 and $8,650. Is there compelling evidence that the mean salary for business majors is higher than the mean salary for communications majors? Construct a hypothesis to test. Report the critical value when α = .05, the standard error, the test statistic, and your conclusion.

Hypothesis Test for Two Means

Your Turn

H0 : µb ≤ µc and Ha : µb > µc.

The critical value is 1.645 because it is a right tail test.

The standard error is

$$se = \sqrt{\frac{6{,}700^2}{619} + \frac{8{,}650^2}{76}} = 1028.1180734.$$

Hypothesis Test for Two Means

Your Turn

The test statistic is:

$$z = \frac{(52236 - 47047) - 0}{1028.1180734} = \frac{5189}{1028.1180734} = 5.0470857.$$

Because 5.05 > 1.645 we reject the null hypothesis and conclude that there is statistically significant evidence that business majors have higher starting salaries than communications majors.
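A short Python check of this right tail test, written as a sketch with variable names chosen here for illustration. Unlike a two tail test, the whole of α sits in the upper tail, which is why the cutoff is 1.645 rather than 1.96.

```python
import math
from statistics import NormalDist

# Business (b) vs. communications (c) majors; H0: mu_b <= mu_c
x_b, x_c = 52_236, 47_047
sigma_b, sigma_c = 6_700, 8_650
n_b, n_c = 619, 76
alpha = 0.05

se = math.sqrt(sigma_b**2 / n_b + sigma_c**2 / n_c)
z = ((x_b - x_c) - 0) / se

# Right tail test: all of alpha is placed in the upper tail
z_crit = NormalDist().inv_cdf(1 - alpha)

reject = z > z_crit
print(f"se = {se:.3f}, z = {z:.3f}, critical = {z_crit:.3f}, reject H0: {reject}")
```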

Hypothesis Test for Two Means

Your Turn

Are there any questions you would want to ask NACE about their survey before trusting these results?

I want to know which schools the students graduated from and where they are now employed.

Hypothesis Test for Two Means

σ unknown

This will be a t test and the standard error is calculated:

$$\text{s.e.} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Because the test statistic follows the t distribution we need to know the degrees of freedom. The formula for calculating the d.f. is in the textbook. For this class the d.f. will always be given in this setting so you will not need to memorize this formula.

Hypothesis Test for Two Means

Example, σ unknown

Suppose x̄1 = 79.4 and x̄2 = 80.6, n1 = 173 and n2 = 342, and s1 = 11.4 and s2 = 12.1. Note: there are 358 degrees of freedom. At the .01 level of significance test the hypothesis below.

H0 : µ1 = µ2

Ha : µ1 ≠ µ2

Hypothesis Test for Two Means

Example, σ unknown

The test is a two tail test with critical values of ±2.589. The test statistic is

$$t = \frac{79.4 - 80.6}{\sqrt{\frac{11.4^2}{173} + \frac{12.1^2}{342}}} = \frac{-1.2}{1.086} = -1.105.$$

|−1.105| < 2.589 so we fail to reject H0.
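A minimal sketch of the same calculation in Python. The degrees of freedom and the critical value are taken as given on the slides rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from the example (sigma unknown, so sample s is used)
x1, x2 = 79.4, 80.6
s1, s2 = 11.4, 12.1
n1, n2 = 173, 342

# Standard error built from the sample variances
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = ((x1 - x2) - 0) / se

# Two-tail critical value for alpha = .01 with the 358 d.f. stated on the slide
t_crit = 2.589

# Two tail test: compare the absolute value of the statistic to the cutoff
reject = abs(t_stat) > t_crit
print(f"se = {se:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```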

Hypothesis Test for Two Means

Your Turn, σ unknown

Israeli girls scored .014 on a national math exam and boys scored −.014 (see Lavy and Sand (2015)). The scores are standardized with standard deviations of 1 each and sample sizes of 4122 (girls) and 4246 (boys), which means that the df are 8358 (t and z are the same in this case). Use α = .01 and test the hypothesis that boys' and girls' average scores on the math exam are the same (H0 : µboys = µgirls). Report the standard error, test statistic, critical value and your conclusion.

Hypothesis Test for Two Means

Your Turn

The standard error is

$$\text{s.e.} = \sqrt{\frac{1^2}{4122} + \frac{1^2}{4246}} = \sqrt{.00047812} = .02186.$$

The test statistic is

$$t = \frac{(.014 - (-.014)) - 0}{.02186} = \frac{.028}{.02186} = 1.2805.$$

The critical value is ±2.576.

Because 1.28 < 2.576 we fail to reject the null hypothesis of equal means. There is not compelling evidence that math scores are different.

Hypothesis Test for Two Means

Your Turn

Suppose though that I am a researcher and I don't like that result. What can I do (perhaps unethically)?

– I can change α to .1, which will make the critical value 1.645.

– I can make it a one tail test and change α to .1, which means I would get a critical value of 1.282.

Do not play these kinds of games but be on the lookout for those that do.
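To see how much the cutoff moves with these choices, the sketch below recomputes the math exam statistic and compares it with the critical value under each setting. It uses z because the slides treat t and z as the same at this sample size; the dictionary labels are just illustrative names.

```python
import math
from statistics import NormalDist

# Math exam example: girls .014, boys -.014, standard deviations of 1
se = math.sqrt(1**2 / 4122 + 1**2 / 4246)
z = (0.014 - (-0.014)) / se

std_normal = NormalDist()
scenarios = {
    "two tail, alpha = .01": std_normal.inv_cdf(1 - 0.01 / 2),
    "two tail, alpha = .10": std_normal.inv_cdf(1 - 0.10 / 2),
    "one tail, alpha = .10": std_normal.inv_cdf(1 - 0.10),
}

for label, crit in scenarios.items():
    print(f"{label}: critical = {crit:.3f}, z = {z:.4f}, reject H0: {z > crit}")
```

On these particular numbers the statistic (about 1.2805) stays just below even the 1.282 cutoff, so the decision would not actually flip here. The cutoff nonetheless falls from 2.576 to 1.282 purely through the researcher's choices, which is what to watch for.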

Hypothesis Test for Two Means

The Freak’s WHIP

Does The Freak really have a different WHIP at home than atother people’s houses?

Let’s test it.

Appendix

APPENDIX

Paired Samples

Objectives

Paired or Matched Samples

– Reduce one source of variation with paired samples.

– Each experimental unit gets both treatments.

Hypothesis Test for Two Means

Paired Sample

Additional notation:

1. d_i is the difference in values for the same experimental unit. For example, sales of Blue Bunny on the top shelf at store Tempe106 = 158 and on the bottom shelf at store Tempe106 = 126. Then d_Tempe106 = 158 − 126 = 32.

2. d̄ is the average difference:

$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}.$$

3. s_d is the standard deviation of the difference:

$$s_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}.$$

4. µd is the mean difference for the population.
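The notation above maps directly onto a few lines of Python. In this sketch only the Tempe106 row comes from the slide; the other stores and their sales figures are hypothetical, invented purely to have more than one pair.

```python
import math
from statistics import mean, stdev

# (top shelf, bottom shelf) sales per store.
# Only Tempe106 is from the slide; the other rows are hypothetical.
sales = {
    "Tempe106": (158, 126),
    "StoreB": (140, 131),   # hypothetical
    "StoreC": (122, 125),   # hypothetical
    "StoreD": (151, 139),   # hypothetical
}

# d_i: difference for each experimental unit (top minus bottom)
d = [top - bottom for top, bottom in sales.values()]

d_bar = mean(d)                 # average difference
s_d = stdev(d)                  # sample standard deviation (divides by n - 1)
se = s_d / math.sqrt(len(d))    # standard error of the mean difference

print(f"d = {d}, d_bar = {d_bar}, s_d = {s_d:.2f}, se = {se:.2f}")
```

`statistics.stdev` uses the n − 1 divisor, which matches the formula for s_d; `statistics.pstdev` would divide by n instead.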

Hypothesis Test for Two Means

Standard Deviation of Difference

In practice you would use a software package to make these computations. For our purposes I will provide s_d when needed.

Hypothesis Test for Two Means

Paired Sample

The standard error is calculated as in previous cases (standard deviation divided by square root of the sample size):

$$\text{s.e.} = \frac{s_d}{\sqrt{n}}$$

The test statistic follows the t distribution with n − 1 degrees of freedom.

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}.$$

Hypothesis Test for Two Means

Example: Paired Sample (from Donnelly)

Rockstar is displayed at two different places in the store: middle aisle and end aisle. Nine different stores are used in the experiment. The mean difference in sales is 11.44 (end minus middle). The standard deviation of the differences is 15.73. Test the null hypothesis that end aisle sales are less than or equal to middle aisle sales. Use α = .05.

Hypothesis Test for Two Means

Example

The standard error is:

$$\text{s.e.} = 15.73/\sqrt{9} = 5.24$$

The test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{11.44 - 0}{5.24} = 2.18.$$

There are 8 d.f. and the critical t value is 1.86. Because 2.18 > 1.86 we reject H0.
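The paired test reduces to a one-sample t test on the differences, which the sketch below makes explicit. The function name `paired_t` is an assumption for this illustration, and the critical value is the table value quoted on the slide.

```python
import math

def paired_t(d_bar, s_d, n, mu_d0=0.0):
    """Return (standard error, t statistic, degrees of freedom) for paired data."""
    se = s_d / math.sqrt(n)
    return se, (d_bar - mu_d0) / se, n - 1

# Rockstar example: end aisle minus middle aisle, 9 stores, H0: mu_d <= 0
se, t_stat, df = paired_t(d_bar=11.44, s_d=15.73, n=9)

# Right-tail critical value for alpha = .05 and df = 8, from the slide
t_crit = 1.86

reject = t_stat > t_crit
print(f"se = {se:.2f}, t = {t_stat:.2f}, df = {df}, reject H0: {reject}")
```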

Hypothesis Test for Two Means

Your Turn: Paired Sample

A credit card issuer wants to evaluate the impact of a line increase on customer spend. Management wants to show that an increase in the credit limit is associated with an increase in monthly credit card spend.

They take a sample of 12 accounts that were given a line increase. The average monthly spend before the increase was $3,604 and the average after the increase was $3,841. The standard deviation of the differences is s_d = 218. They use α = .01. What do they find?

Hypothesis Test for Two Means

Your Turn: Paired Sample

The standard error is:

$$\text{s.e.} = 218/\sqrt{12} = 62.9.$$

The test statistic is:

$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{237}{62.9} = 3.77.$$

There are 11 d.f. and the critical t value is 2.718. Because 3.77 > 2.718 we reject H0 : µd ≤ 0.

They can conclude that higher monthly credit card expenditures are associated with higher credit lines.
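One step worth spelling out is where the 237 comes from: the mean of the paired differences equals the difference of the two means, so d̄ = 3,841 − 3,604. A small sketch with illustrative variable names:

```python
import math

# Credit line example: 12 accounts, H0: mu_d <= 0, alpha = .01
mean_before, mean_after = 3604, 3841
n, s_d = 12, 218

# Mean of the paired differences = difference of the means (after minus before)
d_bar = mean_after - mean_before

se = s_d / math.sqrt(n)
t_stat = (d_bar - 0) / se

# Right-tail critical value for alpha = .01 and df = 11, from the slide
t_crit = 2.718

reject = t_stat > t_crit
print(f"d_bar = {d_bar}, se = {se:.1f}, t = {t_stat:.2f}, reject H0: {reject}")
```

Note that s_d cannot be recovered from the two means alone; it has to be computed from the account-level differences, which is why it is supplied in the problem.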