Introduction to Business Statistics QM 220 Chapter 8 ... 8-2.pdfQM-220, M. Zainal 1 DEPARTMENT OF...

QM-220, M. Zainal 1

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS

Introduction to Business Statistics
QM 220

Chapter 8
Estimation of the mean and proportion

Dr. Mohammad Zainal
Spring 2008

Estimation: An introduction

Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.

In inferential statistics, μ is called the true population mean and p is called the true population proportion. There are many other population parameters, such as the median, mode, variance, and standard deviation.

Examples of estimation:

mean fuel consumption for a particular model of a car
average time taken by new employees to learn a job
mean housing expenditure per month incurred by households


If we can conduct a census each time we want to find the value of a population parameter, then the estimation procedures are not needed.

For example, if the Kuwaiti Census Bureau can contact every household in Kuwait to find the mean housing expenditure of households, the result of the survey will actually be a census.

However, conducting a census is:

too expensive,
very time consuming,
virtually impossible, because every member of the population must be contacted.


That is why we usually take a sample from the population and calculate the value of the appropriate sample statistic. Then we assign a value or values to the corresponding population parameter based on the value of the sample statistic.

For example, to estimate the mean housing expenditure per month of all households in Kuwait, the Census Bureau will

take a sample of households
collect the information on the housing expenditure per month
compute the value of the sample mean
assign a value or values to the population mean


The value assigned to a population parameter based on the value of a sample statistic is called an estimate of the population parameter.

The sample statistic used to estimate a population parameter is called an estimator.

The estimation procedure involves the following steps.

1‐ Select a sample.
2‐ Collect the required information from the members of the sample.
3‐ Calculate the value of the sample statistic.
4‐ Assign value(s) to the corresponding population parameter.
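The four steps of the estimation procedure can be sketched in a few lines of Python. This is only an illustration: the population below is simulated with made-up numbers (a mean of 1400 and a standard deviation of 300 are assumptions, not data from the slides), and the sample size of 1,000 is arbitrary.

```python
import random
import statistics

random.seed(220)  # fixed seed so the sketch is reproducible

# Hypothetical population of monthly housing expenditures (made-up numbers).
population = [random.gauss(1400, 300) for _ in range(100_000)]

# Step 1: select a sample.
sample = random.sample(population, 1_000)

# Steps 2-3: collect the information and calculate the sample statistic.
x_bar = statistics.mean(sample)      # the estimator is the sample mean

# Step 4: assign the value to the population parameter.
mu_estimate = x_bar                  # the estimate of mu

print(f"estimate of mu: {mu_estimate:.2f}")
print(f"true mu (normally unknown): {statistics.mean(population):.2f}")
```

In practice the last line is not available, which is exactly why an estimate is needed.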

Point and interval estimates

An estimate may be a point estimate or an interval estimate.

A Point Estimate

The value of a sample statistic that is used to estimate a population parameter is called a point estimate.

If the Census Bureau takes a sample of 10,000 households and determines that the mean housing expenditure per month, x̄, for this sample is $1370, then, using x̄ as a point estimate of μ, the bureau can state that the mean housing expenditure per month, μ, for all households is about $1370.


Usually, whenever we use point estimation, we calculate the margin of error associated with that point estimation.

For the estimation of the population mean, the margin of error is calculated as follows:

\[ \text{Margin of error} = \pm 1.96\,\sigma_{\bar{x}} \quad\text{or}\quad \pm 1.96\,s_{\bar{x}} \]

An Interval Estimate

In interval estimation, instead of assigning a single value to a population parameter, an interval is constructed around the point estimate.


For example, instead of saying that the mean housing expenditure per month for all households is $1370, we may obtain an interval by subtracting a number from $1370 and adding the same number to $1370.

Then we say that this interval contains the population mean, μ.

For purposes of illustration, suppose we subtract $240 from $1370 and add $240 to $1370. Consequently, we obtain the interval ($1370 ‐ $240) to ($1370 + $240), or $1130 to $1610.


Then we state that the interval $1130 to $1610 is likely to contain the population mean, μ, and that the mean housing expenditure per month for all households in Kuwait is between $1130 and $1610.

This procedure is called interval estimation.

The value $1130 is called the lower limit of the interval and $1610 is called the upper limit of the interval.


The question is: what number should we add to and subtract from the point estimate?

The answer to this question depends on two considerations:

The standard deviation of the sample mean
The level of confidence to be attached to the interval

First, the larger the standard deviation, the greater is the number subtracted from and added to the point estimate.

Second, the quantity subtracted and added must be larger if we want to have a higher confidence in our interval.

Confidence Level and Confidence Interval: Each interval is constructed with regard to a given confidence level and is called a confidence interval.


The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter.

The confidence level is denoted by (1 ‐ α)100%, where α is the Greek letter alpha. When expressed as a probability, it is called the confidence coefficient and is denoted by 1 ‐ α.

α is called the significance level.

Any value of the confidence level can be chosen to construct a confidence interval; the more common values are 90%, 95%, and 99%. The corresponding confidence coefficients are .90, .95, and .99.


Interval estimation of a population mean: large samples

If the population standard deviation σ is not known, then we use the sample standard deviation s, in which case

\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \quad\text{is used instead of}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

The (1 ‐ α)100% confidence interval for μ is

\[ \bar{x} \pm z\,\sigma_{\bar{x}} \quad\text{if } \sigma \text{ is known} \]
\[ \bar{x} \pm z\,s_{\bar{x}} \quad\text{if } \sigma \text{ is unknown} \]

The value of z used here is read from the standard normal distribution table for the given confidence level.


The quantity \( z\,\sigma_{\bar{x}} \) (or \( z\,s_{\bar{x}} \) when σ is not known) in the confidence interval formula is called the maximum error of estimate and is denoted by E.

To find z:

1‐ Divide (1 ‐ α) by 2.
2‐ Locate the answer in the body of the standard normal distribution table and record the corresponding value of z.
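The two-step table lookup for z can be mimicked in Python. This sketch assumes the table lists areas between 0 and z, so the looked-up area (1 ‐ α)/2 corresponds to a cumulative probability of 0.5 + (1 ‐ α)/2. The function name `z_for_confidence` is made up for illustration, and `statistics.NormalDist` stands in for the printed table.

```python
from statistics import NormalDist

def z_for_confidence(confidence: float) -> float:
    """Return z such that the area between 0 and z equals confidence / 2."""
    half_area = confidence / 2          # step 1: divide (1 - alpha) by 2
    # Step 2: an area of half_area between 0 and z is the same as a
    # cumulative probability of 0.5 + half_area under the standard normal.
    return NormalDist().inv_cdf(0.5 + half_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_for_confidence(level):.3f}")
# 90%: z = 1.645
# 95%: z = 1.960
# 99%: z = 2.576
```

Printed tables are rounded to two decimals, which is why 1.65 and 2.58 are commonly used for 90% and 99%.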


Example: A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50.

(a) What is the point estimate of the mean price of all such college textbooks? What is the margin of error for this estimate?

(b) Construct a 90% confidence interval for the mean price of all such college textbooks.
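One possible way to work through the textbook example is sketched below. It uses the margin of error with z = 1.96 from the earlier slide and computes z for the 90% level with `statistics.NormalDist` rather than reading it from a table, so the z value (about 1.645) differs slightly from the rounded table value 1.65. The variable names are chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 36, 90.50, 7.50       # values given in the example

sigma_x_bar = sigma / sqrt(n)           # standard deviation of the sample mean

# (a) point estimate and margin of error (z = 1.96)
point_estimate = x_bar
margin_of_error = 1.96 * sigma_x_bar

# (b) 90% confidence interval
z = NormalDist().inv_cdf(0.5 + 0.90 / 2)    # about 1.645; tables often give 1.65
E = z * sigma_x_bar                         # maximum error of estimate
lower, upper = x_bar - E, x_bar + E

print(f"(a) point estimate = ${point_estimate:.2f}, margin of error = ±${margin_of_error:.2f}")
print(f"(b) 90% CI: ${lower:.2f} to ${upper:.2f}")
# (a) point estimate = $90.50, margin of error = ±$2.45
# (b) 90% CI: $88.44 to $92.56
```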


Example: According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004. Assume that this mean was based on a random sample of 900 households and that the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.
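A small helper makes the large-sample interval reusable. This is a sketch rather than a prescribed method: the function name `mean_ci_known_sigma` is hypothetical, and z is computed exactly (about 2.576) instead of using the table value 2.58, so the limits differ by a few cents from a hand calculation with 2.58 (which gives $7689.98 to $8046.02).

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_sigma(x_bar, sigma, n, confidence):
    """Large-sample confidence interval x_bar ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    E = z * sigma / sqrt(n)             # maximum error of estimate
    return x_bar - E, x_bar + E

# Credit card debt example: x_bar = 7868, sigma = 2070, n = 900, 99% level.
lower, upper = mean_ci_known_sigma(x_bar=7868, sigma=2070, n=900, confidence=0.99)
print(f"99% CI: ${lower:.2f} to ${upper:.2f}")
```

Here σ/√n = 2070/30 = 69, so the interval is roughly $7690 to $8046.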


The width of a confidence interval depends on the size of the maximum error, E, which depends on the values of z, σ, and n. Why?

But we have no control over σ. Why?

So, the width depends only on:

The value of z, which depends on the confidence level.
The sample size n.

The value of z increases as the confidence level increases.

For the same value of σ, an increase in n decreases the value of \( \sigma_{\bar{x}} \), which, in turn, decreases the size of E when the confidence level remains unchanged.


If we want to decrease the width of a confidence interval, we have two choices:

1‐ Lower the confidence level.
2‐ Increase the sample size.

Lowering the confidence level is not a good choice because a lower confidence level may give less reliable results.

Increasing the sample size, n, is the best way to decrease the width of a confidence interval.


Confidence level and the width of the confidence interval

Reconsider the last example. Suppose all the information given in that example remains the same. First, let us decrease the confidence level to 95%.

From the normal distribution table, z = 1.96 for a 95% confidence level. Then, using z = 1.96 in the confidence interval, with \( \sigma_{\bar{x}} = 2070/\sqrt{900} = 69 \), we obtain

\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 1.96(69) = 7868 \pm 135.24 = \$7732.76 \text{ to } \$8003.24 \]

The 95% confidence interval is narrower than the 99% interval.


Sample size and the width of the confidence interval

Reconsider the last example. Suppose we change n to 2500 and all other information remains the same.

The width of the confidence interval for n = 2500 is smaller than that for n = 900.
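Both effects, the confidence level and the sample size, can be seen side by side by computing the interval width 2E for the credit card example (σ = 2070). This is an illustrative sketch: the helper name `width` is made up, and z is computed exactly, so the widths differ slightly from those obtained with rounded table values.

```python
from math import sqrt
from statistics import NormalDist

sigma = 2070                       # population standard deviation from the example

def width(n, confidence):
    """Width (upper limit minus lower limit) of the large-sample interval."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / sqrt(n)

for n in (900, 2500):
    for confidence in (0.99, 0.95):
        print(f"n = {n:4d}, {confidence:.0%} level: width = ${width(n, confidence):.2f}")
```

For a fixed n the 95% interval is narrower than the 99% interval, and for a fixed confidence level the interval for n = 2500 is narrower than the one for n = 900.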



Example: The standard deviation for a population is 6.30. A random sample selected from this population gave a mean equal to 81.90.

Make a 99% confidence interval for μ assuming n = 36.

Does the width of the confidence intervals constructed in parts a through c decrease as the sample size increases? Why?

Interval estimation of a population proportion: large samples

Many times we want to estimate the population proportion.

Examples:

The production manager of a company wants to estimate the proportion of defective items produced on a machine.
A bank manager may want to know the percentage of customers who are satisfied with the bank's services.

Recall:

The sampling distribution of the sample proportion p̂ is (approximately) normal.
The mean of the sampling distribution of p̂ is equal to the population proportion p.
The standard deviation of the sampling distribution of the sample proportion is \( \sigma_{\hat{p}} = \sqrt{pq/n} \). Because p and q are not known when we are estimating p, it is estimated by

\[ s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \]



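The margin of error is

\[ z\,s_{\hat{p}}, \quad\text{where } s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \text{ and } \hat{q} = 1 - \hat{p} \]

The (1 ‐ α)100% confidence interval for p is

\[ \hat{p} \pm z\,s_{\hat{p}} \]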


Example: According to a 2002 survey, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes. Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family‐related issues.

(a) What is the point estimate of the population proportion? What is the margin of error for this estimate?

(b) Construct a 99% confidence interval for the proportion of all adult Americans who needed legal advice during the past year.
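The legal advice example can be worked through as follows. This is a sketch under stated assumptions: the helper name `proportion_ci` is hypothetical, the margin of error in part (a) uses z = 1.96 (the value the slides use for the margin of error of the mean), and z for part (b) is computed exactly instead of using the table value 2.58.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence):
    """Large-sample interval p_hat ± z * sqrt(p_hat * q_hat / n)."""
    q_hat = 1 - p_hat
    s_p_hat = sqrt(p_hat * q_hat / n)       # estimated standard deviation of p_hat
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    E = z * s_p_hat
    return p_hat - E, p_hat + E

p_hat, n = 0.20, 1000                       # values given in the example

# (a) point estimate and margin of error (z = 1.96)
margin_of_error = 1.96 * sqrt(p_hat * (1 - p_hat) / n)

# (b) 99% confidence interval
lower, upper = proportion_ci(p_hat, n, 0.99)

print(f"(a) point estimate = {p_hat:.2f}, margin of error = ±{margin_of_error:.3f}")
print(f"(b) 99% CI: {lower:.3f} to {upper:.3f}")
# (a) point estimate = 0.20, margin of error = ±0.025
# (b) 99% CI: 0.167 to 0.233
```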



Example: According to the analysis of a CNN‐USA TODAY‐Gallup poll conducted in October 2002, "Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American." According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.


Example:

a. A sample of 400 observations taken from a population produced a sample proportion of .63. Make a 95% confidence interval for p.
b. Another sample of 400 observations taken from the same population produced a sample proportion of .59. Make a 95% confidence interval for p.
c. Another sample of 400 observations taken from the same population produced a sample proportion of .67. Make a 95% confidence interval for p.
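Since the three parts differ only in the sample proportion, a loop shows how the interval moves from sample to sample. This sketch uses the table value z = 1.96 for the 95% level; the dictionary `intervals` is just a convenient container chosen for illustration.

```python
from math import sqrt

z = 1.96                      # table value for a 95% confidence level
n = 400                       # size of each sample

intervals = {}
for part, p_hat in (("a", 0.63), ("b", 0.59), ("c", 0.67)):
    E = z * sqrt(p_hat * (1 - p_hat) / n)       # maximum error of estimate
    intervals[part] = (p_hat - E, p_hat + E)
    print(f"{part}. p_hat = {p_hat:.2f}: {p_hat - E:.3f} to {p_hat + E:.3f}")
```

The three intervals have nearly the same width but different centers, because each sample produces its own point estimate.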


Determining the sample size for the estimation of mean

The main reason why we usually conduct a sample survey instead of a census is our limited resources.

If a smaller sample can serve our purpose, then there is no need to take a bigger sample.

Suppose we run a test to estimate the mean life of a battery. If 40 batteries can give us the required confidence interval, why should we waste our money by buying more batteries?

The question is: how can we decide the minimum sample size that produces a confidence interval with a given confidence level (1 ‐ α)100% and a given maximum error E?


Recall that E is a function of z, σ, and n. That is,

\[ E = z \cdot \frac{\sigma}{\sqrt{n}} \]

If we fix z, σ, and E, we can solve for n. The sample size can be found using

\[ n = \frac{z^{2}\sigma^{2}}{E^{2}} \]

If we don't know σ, then s can be used instead, obtained by taking a pilot sample of any arbitrary size.



Example: An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?
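The sample size formula n = z²σ²/E² can be applied to the alumni example as sketched below. The helper name `sample_size_for_mean` is hypothetical, and the result is rounded up because a sample size must be a whole number. Two z values are shown: the rounded table value 2.58 and the exact value from `statistics.NormalDist`, to make clear how much the rounding of z matters.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(z, sigma, E):
    """n = z^2 * sigma^2 / E^2, rounded up to the next whole number."""
    return ceil(z**2 * sigma**2 / E**2)

sigma, E = 11_800, 800                           # values given in the example

n_table = sample_size_for_mean(2.58, sigma, E)   # z from a printed table
z_exact = NormalDist().inv_cdf(0.5 + 0.99 / 2)   # about 2.576
n_exact = sample_size_for_mean(z_exact, sigma, E)

print(n_table, n_exact)   # 1449 1444
```

With the table value the required sample size is 1449; the exact z gives a slightly smaller 1444.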

Determining the sample size for the estimation of proportion

Similar to the case of the mean, we can determine the sample size for the estimation of a proportion.

The only difference is the standard deviation:

\[ E = z \cdot \sigma_{\hat{p}} = z\sqrt{\frac{pq}{n}} \]

The sample size can be found using

\[ n = \frac{z^{2}pq}{E^{2}} \]

If p is not known, we choose a conservative sample size n by using p = q = .50. Why?

Alternatively, we take a preliminary sample and use it to estimate p.



Example: Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?
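The sketch below evaluates n = z²pq/E² for several candidate values of p to illustrate why p = q = .50 is the conservative choice: the product pq is largest there, so the resulting n is large enough whatever the true p turns out to be. The helper name `sample_size_for_proportion` is made up, and exact fractions are used (inputs passed as strings) only to avoid floating-point surprises when rounding up.

```python
from fractions import Fraction
from math import ceil

def sample_size_for_proportion(p, z, E):
    """n = z^2 * p * q / E^2, rounded up to the next whole number."""
    p, z, E = Fraction(p), Fraction(z), Fraction(E)
    return ceil(z**2 * p * (1 - p) / E**2)

z, E = "1.96", "0.02"         # 95% confidence level, maximum error of .02

sizes = {p: sample_size_for_proportion(p, z, E)
         for p in ("0.1", "0.3", "0.5", "0.7", "0.9")}
print(sizes)

n_conservative = sizes["0.5"]
print(n_conservative)         # 2401
```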


Example: Consider the previous example again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?
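With the preliminary estimate p̂ = .07 (so q̂ = .93) used in place of p and q, and the table value z = 1.96 for a 95% confidence level, the sample size formula can be worked out as follows. This is a worked sketch; the result is rounded up because n must be a whole number.

```latex
\[
n = \frac{z^{2}\,\hat{p}\,\hat{q}}{E^{2}}
  = \frac{(1.96)^{2}(.07)(.93)}{(.02)^{2}}
  = \frac{0.25008816}{0.0004}
  \approx 625.22
  \;\Rightarrow\; n = 626
\]
```

This is far smaller than the most conservative sample size, because p̂q̂ = .0651 is much smaller than (.50)(.50) = .25.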


Interval estimation of a population mean: small samples

In a previous section, we considered estimating the population mean for large samples (n ≥ 30).

Using the CLT, we assumed that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population and whether or not σ is known.

Unfortunately, many times we are restricted to small samples due to the nature of the experiment.

For instance:

Clinical trials
Space missions


If we are dealing with small sample sizes, we will have two scenarios:

1‐ The original population is normal and σ is known.
2‐ The original population is (approximately) normal and σ is unknown.

In the first scenario, we use the normal distribution to construct the confidence interval of μ.

In the second scenario, we can't use the normal distribution to construct the confidence interval of μ. Instead, we will use another distribution called the t‐distribution.



Conditions under which the t‐distribution is used to make a confidence interval about μ:

1‐ The population from which the sample is drawn is (approximately) normally distributed
2‐ The sample size is small (that is, n < 30)
3‐ The population standard deviation, σ, is not known

The t distribution

The t distribution is a specific type of bell‐shaped distribution with a lower height and a wider spread than the standard normal distribution.

As the sample size becomes larger, the t distribution approaches the standard normal distribution.


The t distribution has only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and its standard deviation is √[df/(df ‐ 2)] (for df > 2).

The units of the t distribution are denoted by t.

For a sample of size n, the number of degrees of freedom is

df = n ‐ 1



Example: Find the value of t for n = 10 and .05 area in the right tail. Also, find its standard deviation.

Solution: df = n ‐ 1 = 9 → standard deviation = √(9/7) = 1.134

The required value of t for 9 df and .05 area in the right tail, read from the t distribution table, is t = 1.833.
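The standard deviation and the table entry for 9 df can be checked numerically. The Python standard library has no t distribution, so this sketch integrates the t density directly with a simple midpoint rule; the function names, the integration cutoff of 200, and the number of steps are arbitrary choices made for illustration. The value 1.833 is the standard t table entry for 9 df and .05 area in the right tail.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def right_tail_area(t0, df, upper=200.0, steps=200_000):
    """Area under the t density to the right of t0 (midpoint rule up to `upper`)."""
    h = (upper - t0) / steps
    return h * sum(t_pdf(t0 + (i + 0.5) * h, df) for i in range(steps))

df = 10 - 1                         # n = 10
std_dev = sqrt(df / (df - 2))       # sqrt(9/7)
t_table = 1.833                     # table entry for 9 df, .05 right-tail area
area = right_tail_area(t_table, df)

print(f"df = {df}, standard deviation = {std_dev:.3f}")
print(f"area to the right of {t_table}: {area:.4f}")
```

The computed area comes out at about .05, which agrees with the table.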


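Example: Find the value of t for n = 10 and .05 area in the left tail. Also, find its standard deviation.

Solution: df = n ‐ 1 = 9 → standard deviation = √(9/7) = 1.134

Because the t distribution is symmetric about 0, the required value of t for 9 df and .05 area in the left tail is t = ‐1.833.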



Confidence interval for μ using the t distribution

If the following three conditions hold true, we use the t distribution to make a confidence interval about μ.

1‐ The population from which the sample is drawn is (approximately) normally distributed
2‐ The sample size is small (that is, n < 30)
3‐ The population standard deviation, σ, is not known

The (1 ‐ α)100% confidence interval for μ for small samples is

\[ \bar{x} \pm t\,s_{\bar{x}}, \quad\text{where } s_{\bar{x}} = \frac{s}{\sqrt{n}} \]

The value of t is obtained from the t distribution table for n ‐ 1 df and the given confidence level.


Example: A doctor wanted to estimate the mean cholesterol level for all adult men living in Dasmah. He took a sample of 25 adult men from Dasmah and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Dasmah are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.
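The cholesterol example can be worked through with the small-sample formula x̄ ± t·s/√n. This sketch takes t = 2.064 from a standard t table (24 df, .025 area in each tail for a 95% confidence level), since the Python standard library does not provide t quantiles; the variable names are for illustration only.

```python
from math import sqrt

n, x_bar, s = 25, 186, 12      # values given in the example
df = n - 1                     # 24 degrees of freedom
t = 2.064                      # t table entry for 24 df and .025 in each tail

s_x_bar = s / sqrt(n)          # estimated standard deviation of the sample mean
E = t * s_x_bar                # maximum error of estimate
lower, upper = x_bar - E, x_bar + E

print(f"95% CI: {lower:.2f} to {upper:.2f}")
# 95% CI: 181.05 to 190.95
```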



Example: Twenty‐five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean μ.
