QM-220, M. Zainal 1
DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS
Introduction to Business Statistics
QM 220
Chapter 8
Estimation of the mean and proportion
Dr. Mohammad Zainal
Spring 2008
Estimation: An introduction
Estimation is a procedure by which a numerical value or values are assigned to a population parameter based on the information collected from a sample.
In inferential statistics, μ is called the true population mean and p is called the true population proportion. There are many other population parameters, such as the median, mode, variance, and standard deviation.
Examples of estimation:
mean fuel consumption for a particular model of a car
average time taken by new employees to learn a job
mean housing expenditure per month incurred by households
If we can conduct a census each time we want to find the value of a population parameter, then the estimation procedures are not needed.
For example, if the Kuwaiti Census Bureau could contact every household in Kuwait to find the mean housing expenditure of households, the result of the survey would actually be a census.
However, conducting a census is often too expensive and very time consuming, and it can be virtually impossible to contact every member of a population.
That is why we usually take a sample from the population and calculate the value of the appropriate sample statistic. Then we assign a value or values to the corresponding population parameter based on the value of the sample statistic.
For example, to estimate the mean housing expenditure per month of all households in Kuwait, the Census Bureau will
take a sample of households,
collect information on the housing expenditure per month of each sampled household,
compute the value of the sample mean, and
assign a value or values to the population mean.
The value assigned to a population parameter based on the value of a sample statistic is called an estimate of the population parameter.
The sample statistic used to estimate a population parameter is called an estimator.
The estimation procedure involves the following steps.
Select a sample.
Collect the required information from the members of the sample.
Calculate the value of the sample statistic.
Assign value(s) to the corresponding population parameter.
Point and interval estimates
An estimate may be a point estimate or an interval estimate.
A Point Estimate
The value of a sample statistic that is used to estimate a population parameter is called a point estimate.
If the Census Bureau takes a sample of 10,000 households and determines that the mean housing expenditure per month, \(\bar{x}\), for this sample is $1370, then, using \(\bar{x}\) as a point estimate of μ, the bureau can state that the mean housing expenditure per month, μ, for all households is about $1370.
Usually, whenever we use point estimation, we calculate the
margin of error associated with that point estimation.
For the estimation of the population mean, the margin of error
is calculated as follows:
\[ \text{Margin of error} = \pm 1.96\,\sigma_{\bar{x}} \quad \text{or} \quad \pm 1.96\,s_{\bar{x}} \]
An Interval Estimate
In interval estimation, instead of assigning a single value to a population parameter, an interval is constructed around the point estimate.
For example, instead of saying that the mean housing expenditure per month for all households is $1370, we may obtain an interval by subtracting a number from $1370 and adding the same number to $1370.
Then we say that this interval contains the population mean, μ.
For purposes of illustration, suppose we subtract $240 from $1370 and add $240 to $1370. Consequently, we obtain the interval ($1370 - $240) to ($1370 + $240), or $1130 to $1610.
Then we state that the interval $1130 to $1610 is likely to contain the population mean, μ, and that the mean housing expenditure per month for all households in the United States is between $1130 and $1610.
This procedure is called interval estimation.
The value $1130 is called the lower limit of the interval and $1610 is called the upper limit of the interval.
The question is, what number should we add to and subtract from the point estimate?
The answer to this question depends on two considerations:
The standard deviation of the mean
The level of confidence to be attached to the interval
First, the larger the standard deviation, the greater the number subtracted from and added to the point estimate.
Second, the quantity subtracted and added must be larger if we want to have higher confidence in our interval.
Confidence Level and Confidence Interval: Each interval is constructed with regard to a given confidence level and is called a confidence interval.
The confidence level associated with a confidence interval states how much confidence we have that this interval contains the true population parameter.
The confidence level is denoted by (1 - α)100%, where α is the Greek letter alpha. When expressed as a probability, it is called the confidence coefficient and is denoted by 1 - α.
α is called the significance level.
Although any value of the confidence level can be chosen to construct a confidence interval, the more common values are 90%, 95%, and 99%. The corresponding confidence coefficients are .90, .95, and .99.
Interval estimation of a population mean: large samples
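If the population standard deviation σ is not known, then we use the sample standard deviation s, in which
\[ s_{\bar{x}} = \frac{s}{\sqrt{n}} \quad \text{is used instead of} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
The (1 - α)100% confidence interval for μ is
\[ \bar{x} \pm z\,\sigma_{\bar{x}} \quad \text{if } \sigma \text{ is known} \]
\[ \bar{x} \pm z\,s_{\bar{x}} \quad \text{if } \sigma \text{ is unknown} \]
The value of z used here is read from the standard normal distribution table for the given confidence level.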
The quantity \(z\,\sigma_{\bar{x}}\) (or \(z\,s_{\bar{x}}\) when σ is not known) in the confidence interval formula is called the maximum error of estimate and is denoted by E.
To find z:
1- Divide (1 - α) by 2.
2- Locate the answer in the body of the standard normal distribution table and record the corresponding value of z.
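The two table-lookup steps above can be sketched in Python. This assumes the table gives the area between 0 and z, so the cumulative area to the left of z is 0.5 + (1 - α)/2; the function name `z_for_confidence` is illustrative and not part of the slides.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Return z such that the area between -z and z equals `level`."""
    half_area = level / 2                  # step 1: (1 - alpha) / 2, area between 0 and z
    cumulative = 0.5 + half_area           # area to the left of z
    return NormalDist().inv_cdf(cumulative)  # step 2: reverse lookup in the normal table

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.3f}")
```

Printed tables usually round these values, for example to 1.65, 1.96, and 2.58, so hand calculations can differ slightly in the last digit.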
Example: A publishing company has just published a new college textbook. Before the company decides the price at which to sell this textbook, it wants to know the average price of all such textbooks in the market. The research department at the company took a sample of 36 comparable textbooks and collected information on their prices. This information produced a mean price of $90.50 for this sample. It is known that the standard deviation of the prices of all such textbooks is $7.50.
(a) What is the point estimate of the mean price of all such college textbooks? What is the margin of error for this estimate?
(b) Construct a 90% confidence interval for the mean price of all such college textbooks.
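One way to work this example is sketched below, assuming the margin of error uses 1.96 as in the earlier formula and that z for 90% comes from Python's `statistics.NormalDist` rather than a printed table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 36, 90.50, 7.50
sigma_x_bar = sigma / sqrt(n)            # standard deviation of the sample mean

# (a) point estimate and margin of error
point_estimate = x_bar
margin_of_error = 1.96 * sigma_x_bar

# (b) 90% confidence interval: x_bar +/- z * sigma_x_bar
z = NormalDist().inv_cdf(0.5 + 0.90 / 2)
E = z * sigma_x_bar
lower, upper = x_bar - E, x_bar + E

print(f"point estimate = ${point_estimate:.2f}, margin of error = +/- ${margin_of_error:.2f}")
print(f"90% CI: ${lower:.2f} to ${upper:.2f}")
```

With a table value of z = 1.65 the interval agrees with this one to the nearest cent.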
Example: According to CardWeb.com, the mean bank credit card debt for households was $7868 in 2004. Assume that this mean was based on a random sample of 900 households and that the standard deviation of such debts for all households in 2004 was $2070. Make a 99% confidence interval for the 2004 mean bank credit card debt for all households.
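A minimal sketch of this calculation, assuming z is taken from `statistics.NormalDist` (about 2.576) instead of the rounded table value 2.58, which shifts the limits by a few tenths of a dollar:

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma = 900, 7868, 2070
sigma_x_bar = sigma / sqrt(n)              # 2070 / 30 = 69

z = NormalDist().inv_cdf(0.5 + 0.99 / 2)   # z for a 99% confidence level
E = z * sigma_x_bar                        # maximum error of estimate
lower, upper = x_bar - E, x_bar + E

print(f"99% CI: ${lower:.2f} to ${upper:.2f}")
```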
The width of a confidence interval depends on the size of the maximum error, E, which depends on the values of z, σ, and n. Why?
But we have no control over σ. Why?
So, the width depends only on:
The value of z, which depends on the confidence level.
The sample size n
The value of z increases as the confidence level increases.
For the same value of σ, an increase in n decreases the value of \(\sigma_{\bar{x}} = \sigma/\sqrt{n}\), which, in turn, decreases the size of E when the confidence level remains unchanged.
If we want to decrease the width of a confidence interval, we have two choices:
Lower the confidence level.
Increase the sample size.
Lowering the confidence level is not a good choice because a lower confidence level may give less reliable results.
Increasing the sample size n is the best way to decrease the width of a confidence interval.
Confidence level and the width of the confidence interval
Reconsider the last example (the 2004 credit card debt data, with n = 900). Suppose all the information given in that example remains the same. First, let us decrease the confidence level to 95%.
From the normal distribution table, z = 1.96 for a 95% confidence level. Then, using z = 1.96 in the confidence interval, we obtain
\[ \bar{x} \pm z\,\sigma_{\bar{x}} = 7868 \pm 1.96\left(\frac{2070}{\sqrt{900}}\right) = 7868 \pm 135.24 = \$7732.76 \text{ to } \$8003.24 \]
The 95% confidence interval is narrower than the 99% interval.
Sample size and the width of the confidence interval
Reconsider the last example. Suppose we change n to 2500 and all other information remains the same.
The width of the confidence interval for n = 2500 is smaller than that for n = 900.
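Both comparisons can be checked numerically. The sketch below reuses the credit card figures (σ = 2070) and a hypothetical helper `ci_width`; z values come from `statistics.NormalDist`, so widths differ slightly from those obtained with rounded table values.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sigma, n, level):
    """Width of the confidence interval for the mean: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * sigma / sqrt(n)

sigma = 2070
w_99_n900 = ci_width(sigma, 900, 0.99)    # original example
w_95_n900 = ci_width(sigma, 900, 0.95)    # lower confidence level
w_99_n2500 = ci_width(sigma, 2500, 0.99)  # larger sample

print(f"99%, n = 900 : width = {w_99_n900:.2f}")
print(f"95%, n = 900 : width = {w_95_n900:.2f}")
print(f"99%, n = 2500: width = {w_99_n2500:.2f}")
```

Lowering the confidence level and raising n both shrink the width, but only the larger sample does so without giving up confidence.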
Example: The standard deviation for a population is 6.30. A random sample selected from this population gave a mean equal to 81.90.
a. Make a 99% confidence interval for μ assuming n = 36.
b. Make a 99% confidence interval for μ assuming n = 81.
c. Make a 99% confidence interval for μ assuming n = 100.
d. Does the width of the confidence intervals constructed in parts a through c decrease as the sample size increases? Why?
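A short sketch for parts a through c, assuming z is computed with `statistics.NormalDist`; the dictionary `intervals` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, level = 81.90, 6.30, 0.99
z = NormalDist().inv_cdf(0.5 + level / 2)

intervals = {}
for n in (36, 81, 100):
    E = z * sigma / sqrt(n)               # maximum error shrinks as n grows
    intervals[n] = (x_bar - E, x_bar + E)
    print(f"n = {n:3d}: {x_bar - E:.2f} to {x_bar + E:.2f} (width {2 * E:.2f})")
```

The widths decrease because the standard deviation of the sample mean, σ/√n, gets smaller as n increases while z stays fixed.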
Interval estimation of a population proportion: large samples
Many times we want to estimate the population proportion.
Examples:
The production manager of a company wants to estimate the proportion of defective items produced on a machine.
A bank manager may want to know the percentage of customers who are satisfied with the bank's services.
Recall:
The sampling distribution of the sample proportion \(\hat{p}\) is (approximately) normal.
The mean of the sampling distribution of \(\hat{p}\) is equal to the population proportion p.
The standard deviation of the sampling distribution of the sample proportion is \(\sigma_{\hat{p}} = \sqrt{pq/n}\), where q = 1 - p. Because p and q are unknown, it is estimated by \(s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}\).
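The margin of error is
\[ z\,s_{\hat{p}} \]
The (1 - α)100% confidence interval for p is
\[ \hat{p} \pm z\,s_{\hat{p}}, \quad \text{where } s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} \text{ and } \hat{q} = 1 - \hat{p} \]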
Example: According to a 2002 survey, 20% of Americans needed legal advice during the past year to resolve such thorny issues as family trusts and landlord disputes. Suppose a recent sample of 1000 adult Americans showed that 20% of them needed legal advice during the past year to resolve such family-related issues.
(a) What is the point estimate of the population proportion? What is the margin of error for this estimate?
(b) Construct a 99% confidence interval for the proportion of all adult Americans who needed legal advice during the past year.
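This example can be worked as follows. The sketch assumes the margin of error in part (a) uses 1.96, by analogy with the margin of error for the mean, and that z for 99% comes from `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat = 1000, 0.20
q_hat = 1 - p_hat
s_p_hat = sqrt(p_hat * q_hat / n)          # estimated standard deviation of p_hat

# (a) point estimate and margin of error
margin_of_error = 1.96 * s_p_hat

# (b) 99% confidence interval: p_hat +/- z * s_p_hat
z = NormalDist().inv_cdf(0.5 + 0.99 / 2)
lower, upper = p_hat - z * s_p_hat, p_hat + z * s_p_hat

print(f"point estimate = {p_hat:.2f}, margin of error = +/- {margin_of_error:.4f}")
print(f"99% CI: {lower:.4f} to {upper:.4f}")
```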
Example: According to the analysis of a CNN-USA TODAY-Gallup poll conducted in October 2002, "Stress has become a common part of everyday life in the United States. The demands of work, family, and home place an increasing burden on the average American." According to this poll, 40% of Americans included in the survey indicated that they had a limited amount of time to relax (Gallup.com, November 8, 2002). The poll was based on a randomly selected national sample of 1502 adults aged 18 and older. Construct a 95% confidence interval for the corresponding population proportion.
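A minimal sketch of the requested interval, with z obtained from `statistics.NormalDist` (approximately 1.96):

```python
from math import sqrt
from statistics import NormalDist

n, p_hat = 1502, 0.40
s_p_hat = sqrt(p_hat * (1 - p_hat) / n)    # estimated standard deviation of p_hat

z = NormalDist().inv_cdf(0.5 + 0.95 / 2)
E = z * s_p_hat
lower, upper = p_hat - E, p_hat + E

print(f"95% CI: {lower:.4f} to {upper:.4f}")
```

In percentage terms, the interval runs from roughly 37.5% to 42.5%.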
Example:
a. A sample of 400 observations taken from a population produced a sample proportion of .63. Make a 95% confidence interval for p.
b. Another sample of 400 observations taken from the same population produced a sample proportion of .59. Make a 95% confidence interval for p.
c. Another sample of 400 observations taken from the same population produced a sample proportion of .67. Make a 95% confidence interval for p.
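The three intervals can be computed with one small helper. The function `proportion_ci` is a hypothetical name introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, level):
    """Large-sample confidence interval for a proportion: p_hat +/- z * s_p_hat."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    s_p_hat = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * s_p_hat, p_hat + z * s_p_hat

results = {p_hat: proportion_ci(p_hat, 400, 0.95) for p_hat in (0.63, 0.59, 0.67)}
for p_hat, (lower, upper) in results.items():
    print(f"p_hat = {p_hat:.2f}: {lower:.3f} to {upper:.3f}")
```

The three intervals differ even though they come from the same population, because each sample produces its own point estimate.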
Determining the sample size for the estimation of mean
The main reason we usually conduct a sample survey instead of a census is our limited resources.
If a smaller sample can serve our purpose, then there is no need to take a bigger sample.
Suppose we run a test to estimate the mean life of a battery. If 40 batteries can give us the required confidence interval, why should we waste our money buying more batteries?
The question is, how can we decide the minimum sample size that produces a confidence interval with a given α?
Recall that E is a function of z, σ, and n. That is,
\[ E = z \cdot \frac{\sigma}{\sqrt{n}} \]
If we fix z, σ, and E, we can solve for n. The sample size can be found using
\[ n = \frac{z^2 \sigma^2}{E^2} \]
If σ is not known, we can take a pilot sample of any convenient size and use its standard deviation s in place of σ.
Example: An alumni association wants to estimate the mean debt of this year's college graduates. It is known that the population standard deviation of the debts of this year's college graduates is $11,800. How large a sample should be selected so that the estimate with a 99% confidence level is within $800 of the population mean?
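A sketch of the sample size calculation, applying n = z²σ²/E² and rounding up to the next whole number (the usual convention, assumed here). The helper `sample_size_for_mean` is an illustrative name. The answer depends on how z is rounded, so both the software value and the common table value 2.58 are shown.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(z, sigma, E):
    """Smallest n with z * sigma / sqrt(n) <= E, from n = z^2 * sigma^2 / E^2."""
    return ceil((z * sigma / E) ** 2)

sigma, E = 11800, 800
z_exact = NormalDist().inv_cdf(0.5 + 0.99 / 2)   # about 2.576
n_exact = sample_size_for_mean(z_exact, sigma, E)
n_table = sample_size_for_mean(2.58, sigma, E)   # rounded table value

print(f"z = {z_exact:.3f}: n = {n_exact}")
print(f"z = 2.58 : n = {n_table}")
```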
Determining the sample size for the estimation of proportion
As with the mean, we can determine the sample size needed to estimate a proportion.
The only difference is the standard deviation.
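The sample size can be found using
\[ E = z \cdot \sigma_{\hat{p}} = z\sqrt{\frac{pq}{n}} \quad \Longrightarrow \quad n = \frac{z^2\,p\,q}{E^2} \]
If p is not known, we can choose a conservative sample size n by using p = q = .50. Why?
Alternatively, we can estimate p using a preliminary sample.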
Example: Lombard Electronics Company has just installed a new machine that makes a part that is used in clocks. The company wants to estimate the proportion of these parts produced by this machine that are defective. The company manager wants this estimate to be within .02 of the population proportion for a 95% confidence level. What is the most conservative estimate of the sample size that will limit the maximum error to within .02 of the population proportion?
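A minimal sketch of the conservative calculation, assuming n = z²pq/E² with p = q = .50 and rounding up. The product pq is largest at p = .50, which is why this choice gives the largest (safest) sample size.

```python
from math import ceil
from statistics import NormalDist

E, level = 0.02, 0.95
p = q = 0.50                                   # most conservative choice: maximizes p * q

z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96
n_conservative = ceil(z ** 2 * p * q / E ** 2)

print(f"most conservative sample size: n = {n_conservative}")
```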
Example: Consider the previous example again. Suppose a preliminary sample of 200 parts produced by this machine showed that 7% of them are defective. How large a sample should the company select so that the 95% confidence interval for p is within .02 of the population proportion?
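The same formula, n = z²pq/E², can be applied with the preliminary estimate in place of p. This is a sketch that assumes rounding up to the next whole part.

```python
from math import ceil
from statistics import NormalDist

E, level = 0.02, 0.95
p_hat = 0.07                                   # from the preliminary sample of 200 parts
q_hat = 1 - p_hat

z = NormalDist().inv_cdf(0.5 + level / 2)      # about 1.96
n_required = ceil(z ** 2 * p_hat * q_hat / E ** 2)

print(f"required sample size: n = {n_required}")
```

The result is far smaller than the conservative size obtained with p = q = .50, because the product of .07 and .93 is much smaller than .25.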
Interval estimation of a population mean: small samples
In a previous section, we considered estimating the population mean for large samples (n ≥ 30).
Using the CLT, we assumed that the sampling distribution of the sample mean is approximately normal regardless of the shape of the population and whether or not σ is known.
Unfortunately, we are often restricted to small samples due to the nature of the experiment.
For instance:
Clinical Trials
Space missions
If we are dealing with small sample sizes, we will have two scenarios:
1- The original population is normal and σ is known.
2- The original population is (approximately) normal and σ is unknown.
In the first scenario, we use the normal distribution to construct the confidence interval of μ.
In the second scenario, we can't use the normal distribution to construct the confidence interval of μ. Instead, we will use another distribution called the t-distribution.
Conditions under which the t-distribution is used to make a confidence interval about μ.
1- The population from which the sample is drawn is (approximately) normally distributed
2- The sample size is small (that is, n < 30)
3- The population standard deviation, σ, is not known
The t distribution
The t distribution is a specific type of bell-shaped distribution with a lower height and a wider spread than the standard normal distribution.
As the sample size becomes larger, the t distribution approaches the standard normal distribution.
The t distribution has only one parameter, called the degrees of freedom (df). The mean of the t distribution is equal to 0 and, for df > 2, its standard deviation is √[df/(df - 2)].
The units of the t distribution are denoted by t.
df = n - 1
Example: Find the value of t for n = 10 and .05 area in the right tail. Also, find its standard deviation.
Solution: df = n - 1 = 9, so the standard deviation is √(9/7) = 1.134.
The required value of t for 9 df and .05 area in the right tail is t = 1.833.
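The table value can be checked without a t table. The sketch below is not part of the slides: it uses the standard density of the t distribution and Simpson's rule to compute the right-tail area beyond t = 1.833 for 9 df, which should come out close to .05. The function names are illustrative.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def right_tail_area(t, df, steps=2000):
    """Area to the right of t >= 0: 0.5 minus Simpson's-rule area on [0, t]."""
    h = t / steps                              # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

df = 10 - 1
std_dev = sqrt(df / (df - 2))                  # standard deviation of the t distribution
print(f"df = {df}, standard deviation = {std_dev:.3f}")
print(f"area to the right of t = 1.833: {right_tail_area(1.833, df):.4f}")
```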
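Example: Find the value of t for n = 10 and .05 area in the left tail. Also, find its standard deviation.
Solution: df = n - 1 = 9, so the standard deviation is again √(9/7) = 1.134. Because the t distribution is symmetric about 0, the required value is t = -1.833.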
Confidence interval for μ using the t distribution
If the following three conditions hold true, we use the t distribution to make a confidence interval about μ.
1‐ The population from which the sample is drawn is (approximately) normally distributed
2‐ The sample size is small (that is, n < 30)
3‐ The population standard deviation, σ , is not known
The (1 - α)100% confidence interval for μ for small samples is
\[ \bar{x} \pm t\,s_{\bar{x}} \quad \text{where} \quad s_{\bar{x}} = \frac{s}{\sqrt{n}} \]
The value of t is obtained from the t distribution table for n - 1 df and a given confidence level.
Example: A doctor wanted to estimate the mean cholesterol level for all adult men living in Dasmah. He took a sample of 25 adult men from Hartford and found that the mean cholesterol level for this sample is 186 with a standard deviation of 12. Assume that the cholesterol levels for all adult men in Hartford are (approximately) normally distributed. Construct a 95% confidence interval for the population mean μ.
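A sketch of this interval. Python's standard library has no t distribution, so the t value is hard-coded from a standard t table (df = 24, .025 area in the right tail, t = 2.064); that value is an input assumed here, not something computed by the code.

```python
from math import sqrt

n, x_bar, s = 25, 186, 12
df = n - 1                       # 24 degrees of freedom
t = 2.064                        # t table value for df = 24 and .025 in each tail (95% level)

s_x_bar = s / sqrt(n)            # estimated standard deviation of the sample mean
E = t * s_x_bar
lower, upper = x_bar - E, x_bar + E

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```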
Example: Twenty-five randomly selected adults who buy books for general reading were asked how much they usually spend on books per year. The sample produced a mean of $1450 and a standard deviation of $300 for such annual expenses. Assume that such expenses for all adults who buy books for general reading have an approximate normal distribution. Determine a 99% confidence interval for the corresponding population mean μ.
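The same steps apply here. As a sketch, the t value is again taken from a standard t table (df = 24, .005 area in the right tail, t = 2.797) and hard-coded as an assumed input.

```python
from math import sqrt

n, x_bar, s = 25, 1450, 300
df = n - 1                       # 24 degrees of freedom
t = 2.797                        # t table value for df = 24 and .005 in each tail (99% level)

s_x_bar = s / sqrt(n)            # 300 / 5 = 60
E = t * s_x_bar
lower, upper = x_bar - E, x_bar + E

print(f"99% CI: ${lower:.2f} to ${upper:.2f}")
```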