Review # 1


Page 1: Review # 1

1

Review # 1

Chapter 2

Chapter 3

Chapter 4

Page 2: Review # 1

2

Chapter 2: Descriptive Statistics Numerical Methods

• Problem 1 (Calculations)
The ages of employees of a fast-food outlet are as follows: 19, 19, 65, 20, 21, 18, 20.

a) Compute the mean, the median, and the mode of the ages.
Mean = (19+19+…+20)/7 = 26
Median = the center of the sorted series {18, 19, 19, 20, 20, 21, 65} = 20
Mode = the number with the largest frequency = 19, 20

b) Assume the oldest employee retires.
Mean = 19.5
Median = (19+20)/2 = 19.5
Mode = 19, 20 (no change)

The mean is sensitive to extreme values; the median and the mode are less sensitive.
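The calculations in Problem 1 can be checked with a short sketch. It assumes Python 3.8 or later (for `statistics.multimode`); the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, multimode

ages = [19, 19, 65, 20, 21, 18, 20]

# a) Central location for all seven employees
mean_all = mean(ages)
median_all = median(ages)
modes_all = multimode(ages)      # every value tied for the highest frequency

# b) The oldest employee (65) retires: drop the largest value and recompute
remaining = sorted(ages)[:-1]
mean_rest = mean(remaining)
median_rest = median(remaining)
modes_rest = multimode(remaining)

print("all employees:  ", mean_all, median_all, modes_all)
print("after retirement:", mean_rest, median_rest, modes_rest)
```

Removing the single extreme age moves the mean from 26 to 19.5, while the median and the modes barely change.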

Page 3: Review # 1

3

• Problem 2 (Excel, interpretation)
– The summer incomes of a sample of 125 second-year business students are stored in Prob 2.

• Calculate the mean and the median

• What do the two measures of central location tell you about the income?

• Which measure should be used to summarize the data?

Measures of Central Tendency (Excel, interpretation)

Page 4: Review # 1

4

Measures of Central Tendency

Problem 2 - solution
• The distribution is reasonably symmetrical, but a few low incomes have pulled the mean below the median, resulting in a distribution slightly skewed to the left.
• Either measure could be used, but the median is better because it is not affected by a few low incomes.

Mean                 2992.56
Standard Error       49.815744
Median               3016
Mode                 2817
Standard Deviation   556.95695
Sample Variance      310201.04
Kurtosis             0.271249
Skewness             -0.3106334
Range                2958
Minimum              1333
Maximum              4291
Sum                  374070
Count                125

Page 5: Review # 1

5


[Histogram: frequency (axis 0 to 50) of summer income by bin: 1800, 2300, 2800, 3300, 3800, 4300, More]

Measures of Central Tendency

Page 6: Review # 1

6

• Problem 3 (Excel, interpretation)
The owner of a hardware store that sells electrical wires by the meter is considering selling the wire in pre-cut lengths to save on labor cost. A sample of wire sold over the course of 1 week was recorded (Prob3.xls).

A) Compute the mean, median and mode

B) What is the weakness of each measure in providing useful info?

C) How might the owner decide on the lengths to pre-cut?

Measures of Central Tendency

Page 7: Review # 1

7

A) The mean resides to the right of the median. The distribution of lengths is somewhat asymmetrical, skewed to the right (there must be some long wires sold that affect the mean value).

Lengths
Mean                 6.17
Standard Error       0.3824575
Median               5
Mode                 5
Standard Deviation   3.824575
Sample Variance      14.627374
Kurtosis             8.6410756
Skewness             2.7809472
Range                22
Minimum              3
Maximum              25
Sum                  617
Count                100

[Histogram: sales (axis 0 to 80) by wire length bin: 3, 7, 11, 15, 19, 23, 27, More]

Measures of Central Tendency

Page 8: Review # 1

8

B) The mean is unduly influenced by extreme observations.

The median doesn’t indicate what lengths are most preferred.

The mode doesn’t consider any desired lengths other than the one most frequently purchased.


Measures of Central Tendency

Page 9: Review # 1

9


C) You may draw the cumulative distribution and trim the tails

[Chart: cumulative % of sales by wire length]

Measures of Central Tendency

Page 10: Review # 1

10

Measures of Variability

• Problem 4
Calculate the mean, variance and the standard deviation of the following set of numbers, treating them as
• a sample
• a population
– The set is: 14, 7, 8, 11, 5

The mean (both $\bar{x}$ and $\mu$) = (14+7+…+5)/5 = 9

The sample variance = $s^2 = [(14-9)^2 + (7-9)^2 + \ldots + (5-9)^2]/(5-1) = 12.5$

The population variance = $\sigma^2 = [(14-9)^2 + (7-9)^2 + \ldots + (5-9)^2]/5 = 10$

For the standard deviation: take the square root of the variance.
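The difference between the two divisors in Problem 4 can be seen in a minimal sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, pvariance, variance

data = [14, 7, 8, 11, 5]

x_bar = mean(data)          # the same value for a sample or a population
s2 = variance(data)         # sample variance: sum of squares divided by n - 1
sigma2 = pvariance(data)    # population variance: sum of squares divided by n

s = sqrt(s2)                # sample standard deviation
sigma = sqrt(sigma2)        # population standard deviation

print(x_bar, s2, sigma2)
print(round(s, 3), round(sigma, 3))
```

The sum of squared deviations is 50 in both cases; only the divisor (4 versus 5) differs.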

Page 11: Review # 1

11

The variance

• Problem 5 (The variance, Excel)
– The number of customers entering a bank each hour for the last 100 days was recorded (Problem 5).

– For each hour determine the mean and standard deviation.

– What do these statistics tell you?

Page 12: Review # 1

12

The Variance

Hour          Mean     Standard Deviation   Sample Variance
10:00-11:00   102.22   16.06588707          258.1127273
11:00-12:00   70.26    10.5807334           111.9519192
12:00-1:00    177.93   18.24150833          332.7526263
1:00-2:00     65.87    9.371841556          87.83141414
2:00-3:00     147.92   14.62920836          214.0137374

Problem 5 – solution
– The noon hour (12–1) is the busiest, followed by the 2–3 P.M. and 10–11 A.M. periods.
– The variances during the noon hour and between 10–11 A.M. are the largest, which makes it difficult to predict the number of customers entering the bank in those hours.
– Staff lunch breaks and coffee breaks should be scheduled with this in mind.

Comment: All the samples can be analyzed in a single run by Excel > Descriptive statistics.

Page 13: Review # 1

13

The Variance

• Problem 5 (interpretation, empirical rule, Chebyshev)
– The mean and standard deviation of the grades of 500 students who took an economics exam were 69 and 7, respectively.

• What are the numerical endpoints of the intervals $(\bar{x}-s, \bar{x}+s)$, $(\bar{x}-2s, \bar{x}+2s)$, $(\bar{x}-3s, \bar{x}+3s)$?

• If the grades have a mound-shaped distribution, approximately how many students received a grade in each of the three intervals specified above?

• If the grades do not have a mound-shaped distribution, at least how many students received grades in the interval $(\bar{x}-3s, \bar{x}+3s)$?

Page 14: Review # 1

14

The Variance

Problem 5 - solution

a) $(\bar{x}-s, \bar{x}+s)$ = (69 – 7, 69 + 7) = (62, 76)
$(\bar{x}-2s, \bar{x}+2s)$ = (55, 83)
$(\bar{x}-3s, \bar{x}+3s)$ = (48, 90)

b) For a mound-shaped distribution the Empirical Rule applies. Thus,

Approximately (.68)(500) = 340 grades are in (62, 76)

Approximately (.95)(500) = 475 grades are in (55, 83)

Virtually all of the grades are in (48, 90)

Page 15: Review # 1

15

Chebyshev's Theorem

• If the distribution is not mound shaped we need to use Chebyshev's Theorem: at least $1 - 1/k^2$ of the observations lie within $k$ standard deviations of the mean.

$1 - 1/1^2 = 0$. The theorem guarantees nothing about the interval within s of the mean, (62, 76).

$1 - 1/2^2 = 3/4$. At least 3/4 of the observations can be found within 2s of the mean, in (55, 83).

$1 - 1/3^2 = 8/9$. At least 8/9 of the observations can be found within 3s of the mean, in (48, 90).

c) If the distribution is not mound shaped, at least (8/9)(500) = 444.4, that is at least 445, grades are within (48, 90)
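The interval endpoints, the Empirical Rule counts and the Chebyshev lower bounds for the grades problem can be reproduced with a small sketch. The function names `interval` and `chebyshev_min_count` are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction
from math import ceil

n_students, x_bar, s = 500, 69, 7

def interval(k):
    """Endpoints of the interval k standard deviations around the mean."""
    return (x_bar - k * s, x_bar + k * s)

# Empirical Rule (mound-shaped data only): approximate shares for k = 1 and k = 2
empirical_share = {1: 0.68, 2: 0.95}
approx_counts = {k: round(share * n_students) for k, share in empirical_share.items()}

def chebyshev_min_count(k):
    """Smallest whole number of observations guaranteed within k standard deviations."""
    share = 1 - Fraction(1, k * k)
    return ceil(share * n_students)

for k in (1, 2, 3):
    print(k, interval(k), chebyshev_min_count(k))
print(approx_counts)
```

Using `Fraction` keeps 8/9 exact, so the rounding up to 445 students does not depend on floating-point error.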

Page 16: Review # 1

16

Chapter 3 (Probability)

• Probability is a numerical measure that represents the likelihood of occurrence of a random event: 0 ≤ P(A) ≤ 1

Relationships between events
– Union: Event A or event B has occurred (at least one of them took place).
– Intersection: Event A and event B have occurred (both events took place simultaneously).
– Complement event: If event A did not occur, then the event called “Not A” (Ā) occurred.

Page 17: Review # 1

17

Chapter 3

• Problem 6
– A firm classifies its customers’ accounts in two ways: by balance and whether it is overdue.

Account balance   Overdue   Not Overdue
Under $100        .08       .42
$100 - $500       .08       .22
Over $500         .04       .16

– Define the following events:
• A: An account is under $100
• B: An account is overdue
– An account is selected at random.

Find the following probabilities:

P(under $100 and overdue) = P(A and B) = .08 (joint probability)

P(under $100) = P(A) = .08+.42 = .50 (marginal probability)

Page 18: Review # 1

18

Chapter 3

• Problem 6
– The same account table, with A: An account is under $100 and B: An account is overdue.

Find the following probabilities:

P(under $100 or overdue) = P(A or B) = P(A) + P(B) – P(A and B) = [.08+.42] + [.08+.08+.04] – .08 = .62

Page 19: Review # 1

19

Chapter 3
More events and their probabilities…

• Problem 6.11 (Relationships: and, or, conditional)
– The same account table as in Problem 6, with A: An account is under $100 and B: An account is overdue.

Find the following probabilities:

P(under $100 or overdue) = P(A or B) = [.08+.42] + [.08+.08+.04] – .08 = .62

P(not overdue) = P(not B) = .42+.22+.16 = .80. Or, P(not B) = 1 – P(B) = 1 – (.08+.08+.04) = 1 – .20 = .80

Page 20: Review # 1

20

Chapter 3

• Problem 7 (conditional probability)
– The same account table as in Problem 6, with A: An account is under $100 and B: An account is overdue.

Find the following probabilities:

• If the account selected is overdue, what is the probability that its balance is under $100? That is, P(A|B) = ?

P(A|B) = P(A and B)/P(B) = .08/(.08+.08+.04) = .08/.20 = .40

Note: P(A) = .50 but P(A|B) = .40

Page 21: Review # 1

21

Chapter 3

• Problem 7
– The same account table as in Problem 6, with A: An account is under $100 and B: An account is overdue.

Find the following probabilities:

If the account selected is overdue, what is the probability that its balance is $500 or less?

P(C|D) = P(C and D)/P(D)

P($500 or less | Overdue) = P($500 or less, and Overdue)/P(Overdue) = (.08+.08)/(.08+.08+.04) = .80
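The joint, marginal, union, complement and conditional probabilities for the account table can all be derived from the six cell values. The sketch below stores the table as a dictionary; the key names and the helper `prob` are illustrative choices, not part of the original slides.

```python
from fractions import Fraction as F

# Joint probabilities from the account table: (balance, status) -> probability
joint = {
    ("under_100", "overdue"): F(8, 100), ("under_100", "not_overdue"): F(42, 100),
    ("100_to_500", "overdue"): F(8, 100), ("100_to_500", "not_overdue"): F(22, 100),
    ("over_500", "overdue"): F(4, 100), ("over_500", "not_overdue"): F(16, 100),
}
ALL_BALANCES = {"under_100", "100_to_500", "over_500"}
ALL_STATUSES = {"overdue", "not_overdue"}

def prob(balances, statuses):
    """Add up the cells whose balance and status fall in the given sets."""
    return sum(p for (b, s), p in joint.items() if b in balances and s in statuses)

p_a = prob({"under_100"}, ALL_STATUSES)            # marginal P(A)
p_b = prob(ALL_BALANCES, {"overdue"})              # marginal P(B)
p_a_and_b = prob({"under_100"}, {"overdue"})       # joint P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b                   # addition rule
p_not_b = 1 - p_b                                  # complement rule
p_a_given_b = p_a_and_b / p_b                      # conditional P(A|B)
p_500_or_less_given_b = prob({"under_100", "100_to_500"}, {"overdue"}) / p_b

print(float(p_a), float(p_b), float(p_a_or_b), float(p_not_b))
print(float(p_a_given_b), float(p_500_or_less_given_b))
```

Subtracting the joint cell in the addition rule is what prevents the .08 of accounts that are both under $100 and overdue from being counted twice.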

Page 22: Review # 1

22

• Problem 8 (Multiplication rule)
– A sporting goods store estimates that 20% of the students at a nearby university ski downhill, and 15% ski cross-country. Of those who ski downhill, 40% also ski cross-country.

– What percentage of the students ski both downhill and cross-country?

Define events:
A: a student skis downhill
B: a student skis cross-country

Given probabilities: P(A) = .2; P(B) = .15; P(B|A) = .4

Calculate P(A and B) = P(B|A)P(A) = (.4)(.2) = .08

Chapter 3

Page 23: Review # 1

23

Chapter 3

• Problem 9 (Addition rule)
– A sporting goods store estimates that 20% of the students at a nearby university ski downhill, and 15% ski cross-country. Of those who ski downhill, 40% also ski cross-country.

– What percentage of the students do not ski at all?

Calculate P(not A and not B) = 1 – P(A or B);
P(A or B) = P(A) + P(B) – P(A and B) = (.2) + (.15) – (.08) = .27

Therefore P(not A and not B) = 1 – P(A or B) = 1 – .27 = .73
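The multiplication rule of Problem 8 and the addition rule of Problem 9 chain together, as this short sketch shows. Exact fractions are used so the results match the slide values without rounding; the variable names are illustrative.

```python
from fractions import Fraction as F

p_a = F(20, 100)          # P(A): student skis downhill
p_b = F(15, 100)          # P(B): student skis cross-country
p_b_given_a = F(40, 100)  # P(B|A): cross-country among downhill skiers

p_a_and_b = p_b_given_a * p_a        # multiplication rule
p_a_or_b = p_a + p_b - p_a_and_b     # addition rule
p_neither = 1 - p_a_or_b             # complement: does not ski at all

print(float(p_a_and_b), float(p_a_or_b), float(p_neither))
```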

Page 24: Review # 1

24

Chapter 3

• Problem 10 (Independent events, multiplication rule)
– Approx. 3 out of every 4 Americans received a refund from the IRS in 1995. If 3 individuals are selected at random, find the probabilities of the following events:

• All three received a refund

• None received a refund

• At least one received a refund

• Exactly one received a refund

Page 25: Review # 1

25

Chapter 3

• Problem 10 – solution
Let A be the event: Individual 1 received a refund. Define B and C similarly for individuals 2 and 3. Then P(A) = P(B) = P(C) = 3/4.

– P(All three received a refund) = P(A and B and C) = P(A)P(B)P(C) = $(3/4)^3$

– P(None received a refund) = P(not A and not B and not C) = P(not A)P(not B)P(not C) = $(1/4)^3$

– P(At least one received a refund) = 1 – P(None received a refund) = $1 - (1/4)^3$

– P(Exactly one received a refund) = P(A and not B and not C) + P(not A and B and not C) + P(not A and not B and C) = (3/4)(1/4)(1/4) + (1/4)(3/4)(1/4) + (1/4)(1/4)(3/4) = $3(3/4)(1/4)^2$
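The four refund probabilities can be evaluated exactly, and the "exactly one" case can be cross-checked by listing all eight outcomes. This is a minimal sketch under the independence assumption stated in the problem; the helper `outcome_prob` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

p = Fraction(3, 4)   # probability that one individual received a refund

p_all_three = p ** 3
p_none = (1 - p) ** 3
p_at_least_one = 1 - p_none
p_exactly_one = 3 * p * (1 - p) ** 2

def outcome_prob(outcome):
    """Probability of one specific refund / no-refund pattern for three independent people."""
    result = Fraction(1)
    for got_refund in outcome:
        result *= p if got_refund else 1 - p
    return result

# Cross-check: add up the patterns with exactly one refund among the 2**3 outcomes
enumerated_exactly_one = sum(
    outcome_prob(o) for o in product([True, False], repeat=3) if sum(o) == 1
)

print(p_all_three, p_none, p_at_least_one, p_exactly_one, enumerated_exactly_one)
```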

Page 26: Review # 1

26

Chapter 4 Random Variables

• Problem 11 (discrete random variable, expected value, variance)
– You and a friend have contributed equally to a portfolio of $500. The annual income (X) has the following distribution:

x      500   1,000   2,000
P(x)   .5    .3      .2

– Determine the annual expected value and variance of the income earned on this portfolio.
– Determine the net annual profit and variance to you.
– What is the expected profit and variance to you for the next two years?

Page 27: Review # 1

27

Chapter 4 Random Variables

• Problem 11 – solution
– E(X) = (500)(.5) + (1000)(.3) + (2000)(.2) = $950

V(X) = (500-950)²(.5) + (1000-950)²(.3) + (2000-950)²(.2) = 322,500 (dollars squared)

– E(Ann. profit) = E(X/2 – 250) = E(X/2) – E(250) = (½)E(X) – 250 = (½)950 – 250 = $225

V(Ann. profit) = V(X/2 – 250) = V(X/2) + V(250) = V(X/2) + 0 = (½)²V(X) = (¼)(322,500) = 80,625

Page 28: Review # 1

28

Chapter 4 Random Variables

• Problem 11 – solution continued
– If Xi is the income for year i, the income for the next two years is X1 + X2. Your profit is therefore (½)(X1 + X2) – 250.

– E(2 years Profit) = E(½X1 + ½X2 – 250) = ½E(X1) + ½E(X2) – 250 = [assuming the income distribution does not change between the two years] = (½)950 + (½)950 – 250 = $700.

– V(2 years Profit) = V(½X1 + ½X2 – 250) = [assuming the income distribution does not change between the two years, and the incomes in the two years are independent random variables] = (½)²V(X1) + (½)²V(X2) = (¼)(322,500) + (¼)(322,500) = 161,250
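The expected values and variances in Problem 11 follow from the laws E(aX + b) = aE(X) + b and V(aX + b) = a²V(X). The sketch below applies them with exact fractions; the two-year variance relies on the same independence assumption the solution states, and all names are illustrative.

```python
from fractions import Fraction as F

# Distribution of the annual portfolio income X
dist = {500: F(5, 10), 1000: F(3, 10), 2000: F(2, 10)}

e_x = sum(x * p for x, p in dist.items())
v_x = sum((x - e_x) ** 2 * p for x, p in dist.items())

share = F(1, 2)   # you own half of the portfolio
cost = 250        # your contribution

# One year: profit = X/2 - 250
e_profit = share * e_x - cost
v_profit = share ** 2 * v_x          # the constant 250 adds no variance

# Two years: profit = X1/2 + X2/2 - 250, with X1 and X2 independent and identically distributed
e_profit_2y = share * e_x + share * e_x - cost
v_profit_2y = share ** 2 * v_x + share ** 2 * v_x

print(e_x, v_x, e_profit, v_profit, e_profit_2y, v_profit_2y)
```

If the two yearly incomes were correlated, a covariance term would have to be added to the two-year variance.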

Page 29: Review # 1

29

Chapter 4 The Binomial Distribution

• Example 12

• A survey reported that 20% of elementary school teachers use the Web. Fifteen teachers are selected at random. Answer the following questions.

Page 30: Review # 1

30

Chapter 4 The Binomial Distribution

– Solution: Let us analyze this experiment first.
• There are n=15 independent experiments.

• Each experiment has two possible outcomes.

• The probability of success in each experiment is p=.20, which does not change from experiment to experiment.

• Therefore: this is a binomial experiment.

• Define: X – the number of teachers that use the Web. X is binomial with parameters n=15, and p=.2.

Page 31: Review # 1

31

Chapter 4 The Binomial Distribution

Binomial formula: $P(X = x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$, $x = 0, 1, 2, \ldots, n$

• P(No teacher uses the Web) = $P(X = 0) = \frac{15!}{0!(15-0)!} (.2)^0 (1-.2)^{15-0} = (.8)^{15}$

• P(One teacher uses the Web) = $P(X = 1) = \frac{15!}{1!(15-1)!} (.2)^1 (1-.2)^{15-1} = 15(.2)(.8)^{14}$

• P(# of Web users does not exceed 8) = $P(X \le 8) = P(X = 0) + \ldots + P(X = 8) = .999$ (from the binomial table)

• P(More than 2 Web users) = $P(X \ge 3) = P(X = 3) + P(X = 4) + \ldots + P(X = 15) = 1 - P(X \le 2) = 1 - .398 = .602$ (using the binomial table)

Page 32: Review # 1

32

Chapter 4 The Binomial Distribution

• The expected number of teachers using the internet = E(X) = np = 15(.2) = 3 users.

The variance of the number of Web users = V(X) = np(1-p) = 15(.2)(.8) = 2.4 users². Standard deviation = $V(X)^{1/2}$.

• P(Fewer than 8 are Web users, given that more than 2 are users) = P(X ≤ 7 | X ≥ 3) = P(X ≤ 7 and X ≥ 3)/P(X ≥ 3) = [P(X=3)+…+P(X=7)]/P(X ≥ 3) (using the binomial table)

P(X=3)+…+P(X=7) = P(X ≤ 7) – P(X ≤ 2) = .996 – .398 = .598

P(X ≥ 3) = 1 – P(X ≤ 2) = 1 – .398 = .602

P(X ≤ 7 | X ≥ 3) = .598/.602 = .993

Page 33: Review # 1

33

Chapter 4 The Binomial Distribution

• Solution continued
– Repeat this problem assuming 50 teachers were sampled. Since there are no tables available for this number of repeated trials (n), we’ll use Excel.

• P(X = 0) = $(.8)^{50}$

• P(X = 1) = $50(.2)(.8)^{49}$

• P(X ≤ 4) = 0.018496. In Excel, type: =BINOMDIST(4,50,.2,True)
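The binomial table lookups for n = 15 and the Excel result for n = 50 can both be reproduced directly from the binomial formula. This is a sketch assuming Python 3.8 or later (for `math.comb`); `binom_pmf` and `binom_cdf` are hypothetical helper names, not library functions.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), the cumulative value a binomial table or BINOMDIST(..., True) reports."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

p = 0.2

# n = 15 teachers
p_none = binom_pmf(0, 15, p)
p_at_most_8 = binom_cdf(8, 15, p)
p_more_than_2 = 1 - binom_cdf(2, 15, p)
p_lt8_given_gt2 = (binom_cdf(7, 15, p) - binom_cdf(2, 15, p)) / p_more_than_2

# n = 50 teachers, where no table is available
p_at_most_4_of_50 = binom_cdf(4, 50, p)

print(round(p_at_most_8, 3), round(p_more_than_2, 3), round(p_lt8_given_gt2, 3))
print(round(p_at_most_4_of_50, 6))
```

Because the cumulative function is just a sum of formula terms, the same two helpers cover any n, which is why no table is needed for the 50-teacher case.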

Page 34: Review # 1

34

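Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(Z ≥ 1.7) = ?

[Figure: standard normal curve with the area to the right of Z = 1.7 to be found from the normal table; the area to the right of 0 is 0.5]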

Page 35: Review # 1

35

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(Z ≥ 1.7) = ?

From the normal table, P(0 < Z < 1.7) = .4554, so P(Z ≥ 1.7) = .5 – .4554 = .0446

Page 36: Review # 1

36

Chapter 4 The normal distribution

Problem 1
– Find the following probabilities using the normal table.
P(Z ≥ –.95) = ?

By symmetry, P(Z ≥ –.95) = P(Z < +.95) = .5 + .3289 = .8289, where P(0 < Z < .95) = .3289 comes from the normal table.

Page 37: Review # 1

37

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-1.14 ≤ Z ≤ 1.55) = ?

P(-1.14 < Z < 0) + P(0 < Z < 1.55)

From the normal table, P(0 < Z < 1.55) = .4394

Page 38: Review # 1

38

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-1.14 ≤ Z ≤ 1.55) = ?

P(-1.14 < Z < 0) + P(0 < Z < 1.55)

By symmetry, P(-1.14 < Z < 0) = P(0 < Z < 1.14) = .3729 from the normal table

Page 39: Review # 1

39

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-1.14 ≤ Z ≤ 1.55) = P(-1.14 < Z < 0) + P(0 < Z < 1.55) = .3729 + .4394 = .8123

[Figure: standard normal curve marked at -1.14, 0 and 1.55, with areas labeled .9394 and .8123]

Page 40: Review # 1

40

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-2.97 ≤ Z ≤ -1.38) = ?

By symmetry, P(-2.97 ≤ Z ≤ -1.38) = P(1.38 ≤ Z ≤ 2.97) = P(0 < Z < 2.97) – P(0 < Z < 1.38)

Page 41: Review # 1

41

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-2.97 ≤ Z ≤ -1.38) = ?

P(0 < Z < 2.97) – P(0 < Z < 1.38)

From the normal table, P(0 < Z < 2.97) = .4985

Page 42: Review # 1

42

Chapter 4 The normal distribution

Problem 1 (calculating normal probabilities)
– Find the following probabilities using the normal table.
P(-2.97 ≤ Z ≤ -1.38) = ?

P(0 < Z < 2.97) – P(0 < Z < 1.38) = .4985 – .4162 = .0823, where P(0 < Z < 1.38) = .4162 comes from the normal table.

With Excel type: =normsdist(-1.38)-normsdist(-2.97)
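The four table lookups in Problem 1 can be checked without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); its `cdf` method plays the role of Excel's `normsdist`. Variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

p_ge_1_70 = 1 - Z.cdf(1.7)                 # P(Z >= 1.7)
p_ge_neg_0_95 = 1 - Z.cdf(-0.95)           # P(Z >= -.95)
p_between_a = Z.cdf(1.55) - Z.cdf(-1.14)   # P(-1.14 <= Z <= 1.55)
p_between_b = Z.cdf(-1.38) - Z.cdf(-2.97)  # P(-2.97 <= Z <= -1.38)

for label, value in [
    ("P(Z >= 1.7)", p_ge_1_70),
    ("P(Z >= -.95)", p_ge_neg_0_95),
    ("P(-1.14 <= Z <= 1.55)", p_between_a),
    ("P(-2.97 <= Z <= -1.38)", p_between_b),
]:
    print(f"{label} = {value:.4f}")
```

The cumulative function measures area from the far left tail, whereas the table used on the slides measures area from 0 to z, so a table value t corresponds to `cdf(z) = 0.5 + t` for positive z.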

Page 43: Review # 1

43

Chapter 4 The normal distribution

Problem 2 (application)
– Mensa is an organization whose members possess IQs in the top 2% of the population. IQ is normally distributed with a mean of 100 and standard deviation of 16.

– Questions
• What is the probability that a randomly selected person has an IQ of 140 or more?
• What minimum IQ qualifies a person to be admitted to Mensa?
• What is the probability that a randomly selected person from among Mensa’s members has an IQ of more than 140?

Page 44: Review # 1

44

Chapter 4 The normal distribution

• Question 1: What is the probability that a randomly selected person has an IQ of 140 or more?
– Answer: Define X as the IQ level of a person.

P(X>140) = P(Z>(140 – 100)/16) = P(Z>2.5) = .5 – .4938 = .0062, where P(0 < Z < 2.5) = .4938 comes from the normal table.

With Excel type: =1-normdist(140,100,16,True)

Page 45: Review # 1

45

Chapter 4 The normal distribution

• Question 2: What minimum IQ qualifies a person to be admitted to Mensa?

• Answer: Define X as the IQ level of a person.

For a Mensa member, P(X > X0) = .02, so P(Z > (X0 – 100)/16) = .02.
If we define Z0 = (X0 – 100)/16, then P(Z > Z0) = .02.
Let us first find Z0 and then X0.

From the normal table, P(0 < Z < Z0) = .5 – .02 = .48 gives Z0 = 2.055.

Finally, determine X0 by 2.055 = (X0 – 100)/16, so X0 = 100 + 2.055(16) = 132.88

Page 46: Review # 1

46

Chapter 4 The normal distribution

• Question 3: What is the probability that a randomly selected person from among members of Mensa has an IQ of more than 140?

• Answer: Define X as the IQ level of a person.

• P(X>140 | X>132.88) = P(X>140 and X>132.88)/P(X>132.88) = P(X>140)/P(X>132.88) = P(Z>2.5)/.02 = .0062/.02 = .31

• For comparison we have seen that P(X>140) = P(Z>(140 – 100)/16) = P(Z>2.5) = .5 – .4938 = .0062.

• No surprise. Given that a person belongs to Mensa, the probability that his/her IQ exceeds 140 is much larger than that for a person from the general population.
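The three Mensa answers can be reproduced with `statistics.NormalDist` (Python 3.8 or later). This is a sketch with illustrative variable names; `inv_cdf` replaces the reverse table lookup for Z0, so the cutoff comes out at about 132.86 rather than the table-based 132.88, which rests on Z0 rounded to 2.055.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)

# Question 1: P(X > 140)
p_over_140 = 1 - iq.cdf(140)

# Question 2: the IQ that cuts off the top 2% of the population
mensa_cutoff = iq.inv_cdf(0.98)

# Question 3: P(X > 140 | X > cutoff); the event X > 140 lies inside X > cutoff
p_mensa = 1 - iq.cdf(mensa_cutoff)
p_over_140_given_mensa = p_over_140 / p_mensa

print(f"P(X > 140)         = {p_over_140:.4f}")
print(f"Mensa cutoff       = {mensa_cutoff:.2f}")
print(f"P(X > 140 | Mensa) = {p_over_140_given_mensa:.2f}")
```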