
Lecture Notes in Statistics

ESTIMATION OF PARAMETERS

Prepared by: Francis Joseph H. Campeña

Estimation of Parameters

Procedures and formulas for estimating the values of unknown population parameters from the information in sample data are based on the theory of sampling distributions and on the methods used to collect the samples. Sampling distributions allow us to associate a specific level of confidence with each statistical inference, and thus to quantify how much confidence we place in a sample statistic correctly estimating the population parameter.

We first define the terms used in estimation. An estimator is a rule, usually expressed as a formula, that tells us how to calculate an estimate based on information in the sample. Estimators fall into two classes, point estimators and interval estimators:

1. Point estimation -- Based on sample data, a single number is calculated to estimate the population parameter. The rule or formula that describes this calculation is called the point estimator, and the resulting number is called the point estimate.

2. Interval estimation -- Based on sample data, two numbers are calculated to form an interval within which the parameter is expected to lie. The rule or formula that describes this calculation is called the interval estimator, and the resulting pair of numbers is called an interval estimate or confidence interval.

The following are procedures for determining a point estimate and constructing an interval estimate of a population parameter.

Estimating the population mean, $\mu$

Point estimation

The best point estimate for the population mean, $\mu$, is the sample mean, $\bar{x}$.

The point estimator $\bar{x}$ is unbiased with standard error given by $SE = \sigma/\sqrt{n}$.

The margin of error of the point estimate, $\bar{x}$, is given by $1.96\,SE$.

Example

1. A random sample of $n = 50$ observations produced $\bar{x} = 56.4$ and $\sigma^2 = 2.6$. Give the best point estimate for the population mean and calculate the margin of error.

Remark: If $\sigma$ is unknown and $n \ge 30$, the sample standard deviation $s$ can be used to approximate $\sigma$.


Solution

The best point estimate is $\bar{x} = 56.4$, and the margin of error is given by

$$\text{Margin of Error} = 1.96\sqrt{\frac{2.6}{50}} \approx 0.446949.$$

2. An investigator is interested in the possibility of merging the capabilities of television and the internet. A random sample of $n = 50$ Internet users who were polled about the time they spend watching television produced an average of 11.5 hours per week with a standard deviation of 3.5 hours. Use this information to estimate the population mean time Internet users spend watching television, and calculate the margin of error.

Solution

The best point estimate is $\bar{x} = 11.5$, and the margin of error is given by

$$\text{Margin of Error} = 1.96\,\frac{3.5}{\sqrt{50}} \approx 0.970151.$$
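The two margin-of-error calculations above can be checked with a few lines of Python. This is only a minimal sketch: the function name `margin_of_error` is an illustrative choice, not part of the notes, and the multiplier 1.96 is the one used in the text.

```python
import math

def margin_of_error(sd, n, z=1.96):
    """Margin of error z * sd / sqrt(n) for the sample mean (hypothetical helper)."""
    return z * sd / math.sqrt(n)

# Example 1: variance 2.6, so the standard deviation is sqrt(2.6); n = 50
me1 = margin_of_error(math.sqrt(2.6), 50)

# Example 2: standard deviation 3.5, n = 50
me2 = margin_of_error(3.5, 50)

print(round(me1, 4), round(me2, 4))  # 0.4469 0.9702
```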

Interval estimation

To construct an interval estimate for the population mean, we consider two cases. The first case is when the population standard deviation is known, or is unknown but the sample size is large enough, that is, $n \ge 30$. The second case is when the population standard deviation is unknown and the sample size is less than 30.

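CASE 1: If $\sigma$ is known, or $\sigma$ is unknown but $n \ge 30$

A $(1-\alpha)100\%$ confidence interval for a population mean, $\mu$, is given by:

$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

where: $\bar{x}$ = sample mean

$z_{\alpha/2}$ = z-score with an area of $\alpha/2$ to its right

$n$ = sample size

$\sigma$ = population standard deviation

MAXIMUM ERROR: If $\bar{x}$ is used as an estimate of $\mu$, we can then be $(1-\alpha)100\%$ confident that the error will not exceed $z_{\alpha/2}\,\sigma/\sqrt{n}$.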


SAMPLE SIZE DETERMINATION: If $\bar{x}$ is used as an estimate of $\mu$, we can then be $(1-\alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2.$$
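As a rough illustration of the sample size formula, the sketch below rounds the result up to the next whole observation. The function name and the input values ($\sigma = 15$, $e = 5$, 99% confidence) are assumptions chosen for the illustration, and the z-score comes from Python's `statistics.NormalDist` rather than from a printed table.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= e (hypothetical helper)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z-score with area alpha/2 to its right
    return math.ceil((z * sigma / e) ** 2)   # round up: n must be a whole number

# Illustrative inputs only: sigma = 15, maximum error e = 5, 99% confidence
n_needed = sample_size_for_mean(sigma=15, e=5, confidence=0.99)
print(n_needed)  # 60
```

Rounding up rather than to the nearest integer keeps the error within the specified amount $e$.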

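Example

1. Find a $(1-\alpha)100\%$ confidence interval for these values:

a. $\alpha = 0.01$, $n = 36$, $\bar{x} = 34$, $s^2 = 12$.

Solution

$$34 - z_{0.01/2}\frac{\sqrt{12}}{\sqrt{36}} < \mu < 34 + z_{0.01/2}\frac{\sqrt{12}}{\sqrt{36}}$$

$$34 - z_{0.005}\frac{\sqrt{12}}{6} < \mu < 34 + z_{0.005}\frac{\sqrt{12}}{6}$$

$$34 - 2.575\,\frac{\sqrt{12}}{6} < \mu < 34 + 2.575\,\frac{\sqrt{12}}{6}$$

$$32.5133230 < \mu < 35.4866769$$

b. $\alpha = 0.05$, $n = 49$, $\bar{x} = 66.3$, $s = 2.48$.

Solution

$$66.3 - z_{0.05/2}\frac{2.48}{\sqrt{49}} < \mu < 66.3 + z_{0.05/2}\frac{2.48}{\sqrt{49}}$$

$$66.3 - 1.96\,\frac{2.48}{7} < \mu < 66.3 + 1.96\,\frac{2.48}{7}$$

$$65.6056 < \mu < 66.9944$$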

2. A sample of 36 students who bought a soft drink in the canteen gave $\bar{x} = 290$ ml with a standard deviation of 55 ml. Construct a 95% confidence interval for the average amount of soft drink that the dispenser provides.

Remark: If $\sigma$ is unknown and $n \ge 30$, the sample standard deviation $s$ can be used in place of $\sigma$.


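Solution

$$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$

$$290 - 1.96\,\frac{55}{\sqrt{36}} < \mu < 290 + 1.96\,\frac{55}{\sqrt{36}}$$

$$290 - 1.96\,\frac{55}{6} < \mu < 290 + 1.96\,\frac{55}{6}$$

$$272.0333 < \mu < 307.9667$$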
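The Case 1 interval can be sketched in Python as follows. The function name `z_interval` is illustrative. Because the z-score is computed by `statistics.NormalDist` instead of being read from a table (1.96, 2.575), the results agree with the hand computations above only to about two decimal places.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sd, n, confidence=0.95):
    """Large-sample (or known-sigma) confidence interval for a mean (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    half_width = z * sd / math.sqrt(n)                  # maximum error
    return xbar - half_width, xbar + half_width

# Soft drink example: n = 36, sample mean 290 ml, standard deviation 55 ml
low, high = z_interval(290, 55, 36, confidence=0.95)
print(round(low, 2), round(high, 2))
```

The same function covers parts (a) and (b) of Example 1, for instance `z_interval(66.3, 2.48, 49)`.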

[Figure: a computer simulation of 10 samples of size n = 1000. Assuming that the true value of the population mean is 300, the figure shows the resulting confidence intervals against the true mean of 300; one interval is marked as having almost missed the true mean.]
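A simulation of this kind can be reproduced approximately in Python. The sketch below is not the original simulation: the population is assumed normal, the standard deviation of 55 and the random seed are arbitrary choices made here, and only the number of samples (10), the sample size (1000), and the true mean (300) come from the text.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def simulate_intervals(mu=300, sigma=55, n=1000, repeats=10,
                       confidence=0.95, seed=0):
    """Draw `repeats` samples and return (low, high, covers_mu) for each one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    results = []
    for _ in range(repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        half = z * s / math.sqrt(n)  # s stands in for sigma since n >= 30
        results.append((xbar - half, xbar + half,
                        xbar - half <= mu <= xbar + half))
    return results

for low, high, covers in simulate_intervals():
    print(f"{low:8.2f} {high:8.2f} {'covers 300' if covers else 'misses 300'}")
```

Over many repetitions, roughly 95% of such intervals should contain the true mean; with only 10 samples, seeing every interval cover 300, or one miss it, are both unsurprising outcomes.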


CASE 2: If $\sigma$ is unknown and $n < 30$.

A $(1-\alpha)100\%$ confidence interval for the population mean $\mu$ is given by

$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$

where: $\bar{x}$ = sample mean

$t_{\alpha/2}$ = t-value with an area of $\alpha/2$ to its right, with $df = n-1$ degrees of freedom

$n$ = sample size

$s$ = sample standard deviation

The maximum error for Case 2 is analogous to that of Case 1: z-scores are replaced by t-values with $df = n - 1$, and $\sigma$ is replaced by $s$.

Example

1. A random sample of 12 female students in a certain dormitory showed an average weekly expenditure of P400 for snack foods, with a standard deviation of P12.50. Construct a 90% confidence interval for the average amount spent on snack foods by female students living in this dormitory, assuming the expenditures to be normally distributed.

2. The contents of 7 similar containers of sulfuric acid are 9.8, 10.2, 10.4, 9.8, 10.0, 10.2, and 9.6 liters. Find a 95% confidence interval for the mean content of all such containers, assuming an approximate normal distribution for container contents.

3. A scientist interested in monitoring chemical contaminants in food, and thereby the accumulation of contaminants in human diets, selected a random sample of n = 50 male adults. It was found that the average daily intake of dairy products was 756 grams per day with a standard deviation of 35 grams per day. Construct a 95% confidence interval for the mean daily intake of dairy products for men.

4. The mean and standard deviation for the quality grade point averages of a random sample of 36 college seniors are calculated to be 2.6 and 0.3, respectively. Find the 95% and 99% confidence intervals for the mean of the entire senior class.
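As an illustration of the Case 2 formula, the sketch below works through the sulfuric acid data of Example 2. The function name `t_interval` is an assumption of this sketch, and the critical value $t_{0.025} = 2.447$ for $df = 6$ is read from a standard t-table and passed in by hand, since the Python standard library has no t-distribution.

```python
import math
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Small-sample confidence interval for a mean; t_crit comes from a t-table."""
    n = len(data)
    xbar, s = mean(data), stdev(data)  # stdev uses the n - 1 divisor
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

contents = [9.8, 10.2, 10.4, 9.8, 10.0, 10.2, 9.6]  # liters
# 95% confidence, df = 7 - 1 = 6, so t_{0.025} = 2.447 from the table
low, high = t_interval(contents, t_crit=2.447)
print(round(low, 2), round(high, 2))  # 9.74 10.26
```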


Estimating the population proportion, $\pi$

Point estimation

A point estimator for the population proportion $\pi$ in a binomial experiment is given by the statistic $x/n$, where $x$ represents the number of successes in $n$ trials. Thus, the sample proportion $\hat{p} = x/n$ will be our estimator for the parameter $\pi$.

The point estimator $\hat{p}$ has a standard error given by $SE = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.

The maximum error in estimating $\pi$ is $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ if we want to be $(1-\alpha)100\%$ confident.

Example

1. There are 320 students taking up STATSCI, and 125 of them are Mathematics majors. Thus the proportion of Mathematics majors in that class is $\pi = 125/320$.

A random sample of 100 students was selected from this class, and 67 of them turned out to be Mathematics majors. Thus the sample proportion is $\hat{p} = 67/100$.

Interval estimation

If $\hat{p}$ is the proportion of successes in a random sample of size $n$, and $\hat{q} = 1 - \hat{p}$, an approximate $(1-\alpha)100\%$ confidence interval for the parameter $\pi$ is given by

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < \pi < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

Sample size determination: if $\hat{p}$ is used as an estimate for $\pi$, then we can be $(1-\alpha)100\%$ confident that the error will not exceed a specified amount $e$ when the sample size is

$$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}.$$

Example In a random sample of 500 people eating lunch at Ka Teresing’s cafeteria on Fridays, it was found that x = 160 preferred seafood. Find a 95% confidence interval for the actual proportion of people who ate seafood on Fridays at this cafeteria.


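Solution:

Given $x = 160$ and $n = 500$, we have $\hat{p} = 160/500 = 8/25$ and $\hat{q} = 17/25$. We also know that $(1-\alpha)100\%$ is 95%, thus $z_{\alpha/2} = 1.96$. Substituting the given values into the formula, we have the following:

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < \pi < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$

$$\frac{8}{25} - 1.96\sqrt{\frac{\left(\frac{8}{25}\right)\left(\frac{17}{25}\right)}{500}} < \pi < \frac{8}{25} + 1.96\sqrt{\frac{\left(\frac{8}{25}\right)\left(\frac{17}{25}\right)}{500}}$$

$$0.2791 < \pi < 0.3609$$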
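The cafeteria computation, together with the sample size formula for proportions, can be sketched in Python as below. The function names are illustrative, and the maximum error $e = 0.02$ used in the sample size call is a made-up value for demonstration, not one taken from the notes.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Large-sample confidence interval for a proportion (hypothetical helper)."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width

def sample_size_for_proportion(p_hat, e, confidence=0.95):
    """n = z^2 * p_hat * q_hat / e^2, rounded up to a whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / e ** 2)

# Cafeteria example: 160 of 500 preferred seafood
low, high = proportion_interval(160, 500)
print(round(low, 4), round(high, 4))  # 0.2791 0.3609

# Illustrative only: sample size for a maximum error of 0.02 at 95% confidence
n_needed = sample_size_for_proportion(0.32, e=0.02)
```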

Estimating the population variance, $\sigma^2$

Suppose a sample of size $n$ is drawn from a normal population with variance $\sigma^2$. Then the point estimate for the population variance, $\sigma^2$, is given by the sample variance, $s^2$.

Interval estimate

A $(1-\alpha)100\%$ confidence interval for the population variance $\sigma^2$ is

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

where $\chi^2_{\alpha/2}$ = $\chi^2$-value leaving an area of $\alpha/2$ to its right with $df = n - 1$,

$\chi^2_{1-\alpha/2}$ = $\chi^2$-value leaving an area of $1-\alpha/2$ to its right with $df = n - 1$.
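A minimal Python sketch of this interval is given below. The function and parameter names are assumptions of the sketch, and the chi-square critical values are passed in from a table, since the standard library does not provide them. The numbers used are a sample of size 20 with sample variance 16 at 98% confidence, for which a standard table gives $\chi^2_{0.01} = 36.191$ and $\chi^2_{0.99} = 7.633$ at $df = 19$.

```python
def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Confidence interval for a normal population variance.

    chi2_upper is the table value with area alpha/2 to its right,
    chi2_lower the value with area 1 - alpha/2 to its right (df = n - 1).
    """
    numerator = (n - 1) * s2
    return numerator / chi2_upper, numerator / chi2_lower

# n = 20, s^2 = 16, 98% confidence, df = 19 (critical values from a chi-square table)
low, high = variance_interval(s2=16, n=20, chi2_upper=36.191, chi2_lower=7.633)
print(round(low, 2), round(high, 2))  # 8.4 39.83
```

Note that the larger critical value produces the lower limit, because it appears in the denominator.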

Example

1. A random sample of 20 students obtained a mean of $\bar{x} = 72$ and a variance of 16 on a college placement test in mathematics. Assuming the scores to be normally distributed, construct a 98% confidence interval for $\sigma^2$.

2. The following are the volumes, in deciliters, of 10 cans of peaches distributed by a Del Monte Company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, and 46.0. Find a 95% confidence interval for the variance of all such cans of peaches distributed by this company, assuming volume to be a normally distributed variable.

Exercises

1. Calculate the margin of error in estimating a population mean $\mu$ for the following values:

a) $n = 30$, $\sigma^2 = 0.25$        d) $n = 40$, $\sigma = 0.2$

b) $n = 30$, $\sigma^2 = 0.50$        e) $n = 35$, $\sigma = 0.2$


c) $n = 30$, $\sigma^2 = 0.75$        f) $n = 30$, $\sigma = 0.2$

2. The following data were collected from a sample in an experiment: $n = 64$, $\bar{x} = 22.5$, and $s = 3.4$.

a) What is the point estimate of $\mu$?
b) What is the margin of error associated with the point estimate of $\mu$?
c) Construct a 99% confidence interval for $\mu$.
d) What is the maximum error of the estimate for c?

3. Given a standard deviation of 4.56 from a certain population, a sample of 32 observations was taken and produced the following data.

74 85 72 73 86 81 77 80
83 78 79 88 76 73 84 78
81 72 82 81 79 83 88 86
78 83 87 82 80 84 76 74

a) What is the point estimate of $\mu$?
b) What is the margin of error associated with the point estimate of $\mu$?
c) Construct a 95% confidence interval for $\mu$.
d) What is the maximum error of the estimate for c?

4. A telephone answering service completes a report at the end of each call, in which the length of the call is recorded. A random sample of 9 reports yields a mean length of call of 1.2 minutes. Construct a 95% confidence interval for the mean length of call for the whole telephone answering service company if it is known that the population is normally distributed with a standard deviation of 0.6 minutes.

5. The following measurements were recorded for the drying time, in hours, of a certain brand of latex paint:

3.4 2.5 4.8 2.9 3.6 2.8 3.3 5.6 3.7 2.8 4.4 4.0 5.2 3.0 4.8

Assuming that the measurements represent a random sample from a normal population,

a. Construct a 90% confidence interval for the mean drying time of the latex paint.
b. What can we assert with 98% confidence about the possible size of error?

6. A random sample of 10 chocolate bars has an average of 230 calories with a standard deviation of 15 calories. Assume that the distribution of the calories is approximately normal.

a. Construct a 99% confidence interval for the mean calorie content of this chocolate bar.
b. How large a sample is needed if we wish to be 99% confident that our sample mean will be within 5 calories of the true mean?

7. A random sample of 200 voters is selected and 120 are found to support an annexation suit.


a. Find the 96% confidence interval for the proportion of the voting population favoring the suit.
b. What can we assert with 96% confidence about the possible size of error if we estimate the proportion of voters favoring the suit to be 0.6?

8. A random sample of 75 college students is selected and 16 are found to have cars on campus. Use a 95% confidence interval to estimate the fraction of students who have cars on campus.

9. In a random sample of 1000 homes in a certain city, it is found that 628 own an air conditioning unit.

a. Find the 98% confidence interval to estimate the proportion of homes with an air conditioning unit.
b. How large a sample is needed if we wish to be 98% confident that our sample proportion will be within 0.05 of the true proportion of homes in this city that own an air conditioning unit?

10. A new launching system is being considered for deployment of small short range launches. The existing system has p = 0.8 as the probability of a successful launch. A sample of 40 experimental launches is made with the new system and 34 are successful.

a. Construct a 95% confidence interval for p.
b. Would you conclude that the new system is better?

11. Construct a 99% confidence interval for the population variance in Exercise 4.

12. Construct a 90% confidence interval for the population variance in Exercise 6.

13. The following are heights in feet of posts found in the streets:

162, 152, 168, 178, 178, 183, 142, 163, 176, 154, 158, 147, 146, 155

If heights of posts are approximately normally distributed, construct a 98% confidence interval for the variance of the height of all posts in the streets.

14. The following are the volumes, in deciliters, of 10 cans of pineapple slices distributed by Fruits Company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, and 46. Find a 95% confidence interval for the variance of all such cans of pineapple slices distributed by this company, assuming the volume to be a normally distributed random variable.

ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION MEANS, $\mu_1 - \mu_2$

Let $\mu_1$ and $\sigma_1$ be the mean and standard deviation, respectively, of the first population and $\mu_2$ and $\sigma_2$ be the mean and standard deviation, respectively, of the second population. A random sample of size $n_1$ is taken from the first population and a random sample of size $n_2$ is taken from the second population.

Point estimation

The best point estimate for the difference between two population means, $\mu_1 - \mu_2$, is given by the difference between their sample means, $\bar{x}_1 - \bar{x}_2$.


The point estimator, $\bar{x}_1 - \bar{x}_2$, is unbiased with standard error given by

$$SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

The margin of error of the point estimate is given by $1.96\,SE$.

Remark: If $\sigma_1^2$ and $\sigma_2^2$ are unknown but both $n_1$ and $n_2$ are 30 or more, then the sample variances $s_1^2$ and $s_2^2$ can be used.

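Interval Estimation

CASE 1: $\sigma_1^2$ and $\sigma_2^2$ are known, or $\sigma_1^2$ and $\sigma_2^2$ are unknown but $n_1 \ge 30$ and $n_2 \ge 30$.

A $(1-\alpha)100\%$ confidence interval for the difference between the population means, $\mu_1 - \mu_2$, is given by

$$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

MAXIMUM ERROR: If $\bar{x}_1 - \bar{x}_2$ is used as an estimator for $\mu_1 - \mu_2$, then we can be $(1-\alpha)100\%$ confident that our error will not exceed

$$z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}.$$

Remark: If $\sigma_1^2$ and $\sigma_2^2$ are unknown, but both $n_1$ and $n_2$ are greater than or equal to 30, then the sample variances $s_1^2$ and $s_2^2$ can be used.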

EX1. Independent random samples were taken from populations 1 and 2. The sample sizes, means, and variances are as follows:

Population 1: $n_1 = 35$, $\bar{x}_1 = 12.7$, $s_1^2 = 1.38$

Population 2: $n_2 = 49$, $\bar{x}_2 = 7.4$, $s_2^2 = 4.14$

a. Find a point estimate for the difference between the population means, $\mu_1 - \mu_2$, and compute the margin of error.

b. Find a 95% confidence interval for $\mu_1 - \mu_2$.

EX2. The wearing qualities of two types of automobile tires were compared by road-testing samples of $n_1 = n_2 = 100$ tires for each type. Wear-out was defined as a specific amount of tire wear, and the number of miles until wear-out was recorded. The data obtained is given below:

TIRE 1                          TIRE 2
$\bar{x}_1 = 26{,}400$ miles        $\bar{x}_2 = 25{,}100$ miles
$s_1^2 = 1{,}440{,}000$ miles       $s_2^2 = 1{,}960{,}000$ miles

a. Find a point estimate for the difference between the means of all tires of types 1 and 2 and compute the margin of error.

b. Construct a 99% confidence interval for $\mu_1 - \mu_2$.

c. What can we assert with 95% confidence about the possible size of our error?
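The large-sample formulas for two means can be sketched in Python as follows, using the summary statistics of EX1. The function name and return layout are choices made for this sketch, and the z-score is computed with `statistics.NormalDist` rather than taken as 1.96 from a table, so the last decimal places may differ slightly from a hand computation.

```python
import math
from statistics import NormalDist

def two_mean_z_interval(x1, var1, n1, x2, var2, n2, confidence=0.95):
    """Point estimate, maximum error and interval for mu1 - mu2 (large samples)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = x1 - x2                            # point estimate of mu1 - mu2
    se = math.sqrt(var1 / n1 + var2 / n2)     # standard error
    max_error = z * se
    return diff, max_error, (diff - max_error, diff + max_error)

# EX1: n1 = 35, mean 12.7, variance 1.38; n2 = 49, mean 7.4, variance 4.14
diff, max_error, (low, high) = two_mean_z_interval(12.7, 1.38, 35, 7.4, 4.14, 49)
print(round(diff, 2), round(max_error, 2), round(low, 2), round(high, 2))
# 5.3 0.69 4.61 5.99
```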


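CASE 2: $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal, and $n_1 < 30$ and $n_2 < 30$ (small samples).

A $(1-\alpha)100\%$ confidence interval for the difference between the population means, $\mu_1 - \mu_2$, is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and $t_{\alpha/2}$ = t-value with an area of $\alpha/2$ to its right with $df = n_1 + n_2 - 2$.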

EX1. Independent random samples of $n_1 = 16$ and $n_2 = 13$ observations were selected from two normal populations with equal variances:

Population 1: $n_1 = 16$, $\bar{x}_1 = 34.6$, $s_1^2 = 4.8$

Population 2: $n_2 = 13$, $\bar{x}_2 = 32.2$, $s_2^2 = 5.9$

Construct a 95% confidence interval for $\mu_1 - \mu_2$.

EX2. A taxi company is trying to decide whether to purchase brand A or brand B tires for its fleet of taxis. To estimate the difference between the two brands, an experiment is conducted using 12 tires of each brand. The tires are run until they wear out. The results are:

Brand A: $\bar{x}_1 = 36{,}300$ km, $s_1^2 = 5000$ km

Brand B: $\bar{x}_2 = 38{,}100$ km, $s_2^2 = 6100$ km

Compute a 95% confidence interval for $\mu_1 - \mu_2$, assuming the populations to be normally distributed.
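The pooled-variance interval of Case 2 can be sketched in Python with the summary statistics of EX1. The function name is illustrative, and the critical value $t_{0.025} = 2.052$ for $df = 16 + 13 - 2 = 27$ is taken from a standard t-table and supplied by hand.

```python
import math

def pooled_t_interval(x1, var1, n1, x2, var2, n2, t_crit):
    """Interval for mu1 - mu2 assuming equal variances; t_crit has df = n1 + n2 - 2."""
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    half_width = t_crit * math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

# EX1: n1 = 16, mean 34.6, variance 4.8; n2 = 13, mean 32.2, variance 5.9
low, high = pooled_t_interval(34.6, 4.8, 16, 32.2, 5.9, 13, t_crit=2.052)
print(round(low, 2), round(high, 2))  # 0.64 4.16
```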

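CASE 3: $\sigma_1^2$ and $\sigma_2^2$ are unknown and unequal, and $n_1 < 30$ and $n_2 < 30$ (small samples).

A $(1-\alpha)100\%$ confidence interval for the difference between the population means, $\mu_1 - \mu_2$, is given by

$$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where $t_{\alpha/2}$ = t-value leaving an area of $\alpha/2$ to its right with

$$df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}.$$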


EX. The following data represent the running times of films produced by two motion picture companies:

TIME (in minutes)

Company I:   103  94  110  87  98
Company II:  97  82  123  92  175  88  118

Compute a 90% confidence interval for the difference between the average running times of films produced by the two companies. Assume that the running times for each of the companies are approximately normally distributed with unequal variances.
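The Case 3 computation for the film data can be sketched in Python as below. Two assumptions are made loudly here: the first five running times (103, 94, 110, 87, 98) are taken to belong to Company I and the remaining seven to Company II, and the degrees of freedom are rounded down to a whole number before the t-table lookup. The function names are illustrative, and the critical value $t_{0.05} = 1.895$ for $df = 7$ comes from a standard t-table.

```python
import math
from statistics import mean, variance

def welch_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for unequal variances."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def unequal_variance_interval(sample1, sample2, t_crit):
    """Interval for mu1 - mu2 without assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    v1, v2 = variance(sample1), variance(sample2)  # n - 1 divisors
    diff = mean(sample1) - mean(sample2)
    half_width = t_crit * math.sqrt(v1 / n1 + v2 / n2)
    return diff - half_width, diff + half_width

# Assumed split of the flattened table: 5 values for Company I, 7 for Company II
company_1 = [103, 94, 110, 87, 98]
company_2 = [97, 82, 123, 92, 175, 88, 118]

df = welch_df(variance(company_1), len(company_1),
              variance(company_2), len(company_2))
print(math.floor(df))  # 7

# 90% confidence, df = 7, so t_{0.05} = 1.895 from the table
low, high = unequal_variance_interval(company_1, company_2, t_crit=1.895)
print(round(low, 2), round(high, 2))
```

Since the resulting interval contains 0, these data alone would not indicate a difference in average running time under the stated assumptions.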

ESTIMATING THE DIFFERENCE BETWEEN TWO POPULATION PROPORTIONS, $\pi_1 - \pi_2$

Suppose independent random samples of size $n_1$ and $n_2$ are taken from two populations, and let $x_1$ and $x_2$ be the number of successes in the first and second samples, respectively. Then a point estimate for the difference in the population proportions, $\pi_1 - \pi_2$, is given by the difference in the sample proportions,

$$\hat{p}_1 - \hat{p}_2 = \frac{x_1}{n_1} - \frac{x_2}{n_2}.$$

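Interval estimate

A $(1-\alpha)100\%$ confidence interval for the difference between the population proportions, $\pi_1 - \pi_2$, is given by:

$$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < \pi_1 - \pi_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$$

where $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$.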

EX1. A poll is taken among the residents of a city and the surrounding county to determine the feasibility of a proposal to construct a civic center. If 2400 of 5000 city residents favor the proposal and 1200 of 2000 county residents favor it, find a 90% confidence interval for the true difference in the fractions favoring the proposal to construct the civic center.

EX2. A geneticist is interested in the proportion of males and females in the population that have a certain minor blood disorder. In a random sample of 100 males, 24 are found to be afflicted, whereas 13 out of 100 females tested appear to have the disorder. Compute a 99% confidence interval for the difference between the proportions of males and females that have this blood disorder.
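The interval for a difference of proportions can be sketched in Python with the civic center data of EX1. The function name is illustrative, and the z-score is computed with `statistics.NormalDist` instead of being read from a table, so the last digits may differ slightly from a table-based answer.

```python
import math
from statistics import NormalDist

def two_proportion_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample interval for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width

# EX1: 2400 of 5000 city residents and 1200 of 2000 county residents, 90% confidence
low, high = two_proportion_interval(2400, 5000, 1200, 2000, confidence=0.90)
print(round(low, 4), round(high, 4))  # -0.1414 -0.0986
```

Both limits are negative, so under these assumptions the proportion in favor appears lower in the city than in the county.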


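ESTIMATING THE RATIO BETWEEN TWO POPULATION VARIANCES, $\sigma_1^2/\sigma_2^2$

Consider two independent random samples of size $n_1$ and $n_2$ drawn from two normal populations with variances $\sigma_1^2$ and $\sigma_2^2$.

Interval estimate

A $(1-\alpha)100\%$ confidence interval for the ratio of two population variances, $\sigma_1^2/\sigma_2^2$, is given by:

$$\frac{s_1^2}{s_2^2}\cdot\frac{1}{f_{\alpha/2}(df_1, df_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\cdot f_{\alpha/2}(df_2, df_1)$$

where $df_1 = n_1 - 1$ and $df_2 = n_2 - 1$.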

Examples:

1) Independent samples from 2 normal populations produce the following data:

                 Sample size    Sample variance
Population I          16             55.7
Population II         21             31.4

Do the data provide sufficient evidence to indicate that $\sigma_1^2$ differs from $\sigma_2^2$ using $\alpha = 0.05$?

Solution:

$F_{0.025}(15, 20) = 2.57$, $F_{0.025}(20, 15) = 2.76$

$$\frac{55.7}{31.4}\cdot\frac{1}{F_{0.025}(15, 20)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{55.7}{31.4}\cdot F_{0.025}(20, 15)$$

$$\frac{55.7}{31.4}\cdot\frac{1}{2.57} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{55.7}{31.4}\cdot 2.76$$

$$0.690228 < \frac{\sigma_1^2}{\sigma_2^2} < 4.895924$$

Since 1 is included in the interval we constructed, we cannot conclude that there is a difference between the variances of the two populations. The data do not provide sufficient evidence to say that the variances differ.

2) Independent samples from 2 normal populations produce the following data:

                 Sample size    Sample variance
Population I          13             18.3
Population II         13              7.9

Do the data provide sufficient evidence to indicate that $\sigma_1^2 > \sigma_2^2$ using $\alpha = 0.05$?

Solution:

$F_{0.025}(12, 12) = 3.28$

$$\frac{18.3}{7.9}\cdot\frac{1}{F_{0.025}(12, 12)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{18.3}{7.9}\cdot F_{0.025}(12, 12)$$

$$\frac{18.3}{7.9}\cdot\frac{1}{3.28} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{18.3}{7.9}\cdot 3.28$$

$$0.706236 < \frac{\sigma_1^2}{\sigma_2^2} < 7.597975$$

Again, since 1 is included in our interval, we cannot conclude that there is a difference between the two variances. Thus the data do not provide sufficient evidence to say that $\sigma_1^2 > \sigma_2^2$.
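The variance ratio interval for the first example (sample variances 55.7 and 31.4, with $F_{0.025}(15, 20) = 2.57$ and $F_{0.025}(20, 15) = 2.76$) can be reproduced with the short Python sketch below. The function and parameter names are assumptions of this sketch, and the F critical values are passed in from a table because the standard library does not provide them.

```python
def variance_ratio_interval(var1, var2, f_12, f_21):
    """Interval for sigma1^2 / sigma2^2.

    f_12 is f_{alpha/2}(df1, df2) and f_21 is f_{alpha/2}(df2, df1),
    both read from an F-table.
    """
    ratio = var1 / var2
    return ratio / f_12, ratio * f_21

# Sample variances 55.7 (n = 16) and 31.4 (n = 21); table values 2.57 and 2.76
low, high = variance_ratio_interval(55.7, 31.4, f_12=2.57, f_21=2.76)
print(round(low, 4), round(high, 4))  # 0.6902 4.8959

# The variances are judged different only if 1 lies outside the interval
contains_one = low < 1 < high
print(contains_one)  # True
```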