1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling...

52
1 Using Statistics Sample Statistics as Estimators of Population Parameters Sampling Distributions Estimators and Their Properties Degrees of Freedom Summary and Review of Terms Sampling and Sampling Distributions 5

Transcript of 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling...

Page 1: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

1

Using Statistics Sample Statistics as Estimators of

Population Parameters Sampling Distributions Estimators and Their Properties Degrees of Freedom Summary and Review of Terms

Sampling and Sampling Distributions5

Page 2: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

2

Take random samples from populations Distinguish between population parameters and sample statistics Apply the central limit theorem Derive sampling distributions of sample means and proportions Explain why sample statistics are good estimators of population

parameters Judge one estimator as better than another based on desirable

properties of estimators Apply the concept of degrees of freedom Identify special sampling methods Compute sampling distributions and related results using

templates

LEARNING OBJECTIVESLEARNING OBJECTIVES55

After studying this chapter you should be able to:After studying this chapter you should be able to:

Page 3: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

3

• Statistical Inference: Predict and forecast values of

population parameters... Test hypotheses about values of

population parameters... Make decisions...

On basis of sample statistics derived from limited and incomplete sample information

Make generalizations about the characteristics of a population...

Make generalizations about the characteristics of a population...

On the basis of observations of a sample, a part of a population

On the basis of observations of a sample, a part of a population

5-1 Statistics is a Science of Inference(Inference( 推推論論 ))

Page 4: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

4

Democrats Republicans

People who have phones and/or cars and/or are Digest readers.

BiasedSample

Population

Democrats Republicans

Unbiased Sample

Population

Unbiased, representative sample drawn at random from the entire population.

Biased, unrepresentative sample drawn from people who have cars and/or telephones and/or read the Digest.

The Literary Digest Poll (1936)

Page 5: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

5

• An estimator(estimator( 估計式估計式 )) of a population parameter is a sample statistic used to estimate or predict the population parameter.

• An estimate(estimate( 估計估計 )) of a parameter is a particular numerical value of a sample statistic obtained through sampling.

• A point estimatepoint estimate is a single value used as an estimate of a population parameter.

A population parameterpopulation parameter is a numerical measure of a summary characteristic of a population.

5-2 Sample Statistics as Estimators of Population Parameters(p.176)

• A sample statisticsample statistic is a numerical measure of a summary characteristic

of a sample.

Page 6: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

6

• The sample mean, , is the most common estimator of the population mean,

• The sample variance, s2, is the most common estimator of the population variance, 2.

• The sample standard deviation, s, is the most common estimator of the population standard deviation, .

• The sample proportion, , is the most common estimator of the population proportion, p.

• The sample mean, , is the most common estimator of the population mean,

• The sample variance, s2, is the most common estimator of the population variance, 2.

• The sample standard deviation, s, is the most common estimator of the population standard deviation, .

• The sample proportion, , is the most common estimator of the population proportion, p.

Estimators(估計式 ,樣本統計量 )

X

Page 7: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Estimators

7

Page 8: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

8

• The population proportionpopulation proportion is equal to the number of elements in the population belonging to the category of interest, divided by the total number of elements in the population:

• The sample proportionsample proportion is the number of elements in the sample belonging to the category of interest, divided by the sample size:

Population and Sample Proportions

pxn

pXN

Page 9: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

9

XX

XXX

XX

XXX

XXX

XXX

XX

Population mean ()

Sample points

Frequency distribution of the population

Sample mean ( )

A Population Distribution, a Sample from a Population, and the Population and Sample Means

X

Page 10: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

• Stratified samplingsampling( 分層抽樣 ):: in stratified sampling, the population is partitioned into two or more subpopulation called strata( 層 ), and from each stratum a desired sample size is selected at random.( 層層有顯著差異, ex.性別、變異數 )

• Cluster samplingCluster sampling( 分群抽樣 ):: in cluster sampling, a random sample of the strata is selected and then samples from these selected strata are obtained.

• Single-stage Cluster sampling( 一階段分群抽樣 ): in single-stage cluster sampling, a cluster is chosen at random and every item or person in that cluster is sampled.

Other Sampling Methods

10

Page 11: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

• Two-stage Cluster sampling( 二階段分群抽樣 ): in two-stage cluster sampling, a cluster is chosen at random and random samples of the items or persons are taken from that cluster.

• Multistage Cluster samplingMultistage Cluster sampling( 多階段分群抽樣 ):: in multistage cluster sampling, a random sample of the strata is selected, then samples of items or persons from these strata are taken and then samples from these items and persons are selected.

• Systematic samplingSystematic sampling ( 系統抽樣 )

Other Sampling Methods

11

Page 12: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

12

亂數表之使用 (p.178 or p.702)

• Example, 10 samples from 001~600 numbered data

•習題 5-5

Page 13: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

13

• The sampling distributionsampling distribution of a statistic is the probability distribution of all possible values the statistic may assume, when computed from random samples of the same size, drawn from a specified population.

• The sampling distribution of sampling distribution of XX is the probability distribution of all possible values the random variable may assume when a sample of size n is taken from a specified population.

X

5-3 Sampling Distributions (抽樣分配 ) (p.182)

Page 14: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

14

Ex. Uniform population of integers from 1 to 8:

X P(X) XP(X) (X-x) (X-x)2 P(X)(X-x)2

1 0.125 0.125 -3.5 12.25 1.531252 0.125 0.250 -2.5 6.25 0.781253 0.125 0.375 -1.5 2.25 0.281254 0.125 0.500 -0.5 0.25 0.031255 0.125 0.625 0.5 0.25 0.031256 0.125 0.750 1.5 2.25 0.281257 0.125 0.875 2.5 6.25 0.781258 0.125 1.000 3.5 12.25 1.53125

1.000 4.500 5.25000 87654321

0.2

0.1

0.0

X

P(X

)

Uniform Distribution (1,8)

E(X) = E(X) = = 4.5 = 4.5V(X) = V(X) = 22 = 5.25 = 5.25SD(X) = SD(X) = = 2.2913 = 2.2913

Sampling Distributions (Continued)

Page 15: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

15

• There are 8*8 = 64 different but equally-likely samples of size 2 that can be drawn (with replacement) from a uniform population of the integers from 1 to 8:Samples of Size 2 from Uniform (1,8)

1 2 3 4 5 6 7 81 1,1 1,2 1,3 1,4 1,5 1,6 1,7 1,82 2,1 2,2 2,3 2,4 2,5 2,6 2,7 2,83 3,1 3,2 3,3 3,4 3,5 3,6 3,7 3,84 4,1 4,2 4,3 4,4 4,5 4,6 4,7 4,85 5,1 5,2 5,3 5,4 5,5 5,6 5,7 5,86 6,1 6,2 6,3 6,4 6,5 6,6 6,7 6,87 7,1 7,2 7,3 7,4 7,5 7,6 7,7 7,88 8,1 8,2 8,3 8,4 8,5 8,6 8,7 8,8

Each of these samples has a sample mean. For example, the mean of the sample (1,4) is 2.5, and the mean of the sample (8,4) is 6.

Sample Means from Uniform (1,8), n = 21 2 3 4 5 6 7 8

1 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.52 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.03 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.54 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.05 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.56 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.07 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.58 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0

Sampling Distributions (Continued)

Page 16: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

16

Sampling Distribution of the Mean

The probability distribution of the sample mean is called the sampling distribution of the the sample meansampling distribution of the the sample mean.(p.184)

8.07.57.06.56.05.55.04.54.03.53.02.52.01.51.0

0.10

0.05

0.00

X

P(X

)

Sampling Distribution of the Mean

X P(X) XP(X) X-X (X-X)2 P(X)(X-X)2

1.0 0.015625 0.015625 -3.5 12.25 0.1914061.5 0.031250 0.046875 -3.0 9.00 0.2812502.0 0.046875 0.093750 -2.5 6.25 0.2929692.5 0.062500 0.156250 -2.0 4.00 0.2500003.0 0.078125 0.234375 -1.5 2.25 0.1757813.5 0.093750 0.328125 -1.0 1.00 0.0937504.0 0.109375 0.437500 -0.5 0.25 0.0273444.5 0.125000 0.562500 0.0 0.00 0.0000005.0 0.109375 0.546875 0.5 0.25 0.0273445.5 0.093750 0.515625 1.0 1.00 0.0937506.0 0.078125 0.468750 1.5 2.25 0.1757816.5 0.062500 0.406250 2.0 4.00 0.2500007.0 0.046875 0.328125 2.5 6.25 0.2929697.5 0.031250 0.234375 3.0 9.00 0.2812508.0 0.015625 0.125000 3.5 12.25 0.191406

1.000000 4.500000 2.625000

E XV XSD X

X

X

X

( )( )

( ) .

4.52.6251 6202

2

Sampling Distributions (Continued)

Page 17: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

17

• Comparing the population distribution and the sampling distribution of the mean:The sampling distribution is

more bell-shaped and symmetric.

Both have the same center.The sampling distribution of

the mean is more compact, with a smaller variance.(p.184)

87654321

0.2

0.1

0.0

X

P(X

)

Uniform Distribution (1,8)

X8.07.57.06.56.05.55.04.54.03.53.02.52.01.51.0

0.10

0.05

0.00

P( X

)

Sampling Distribution of the Mean

Properties of the Sampling Distribution of the Sample Mean

),(~2

nNX

Page 18: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

18

The expected value of the sample meanexpected value of the sample mean is equal to the population mean:

E XX X

( )

The variance of the sample meanvariance of the sample mean is equal to the population variance divided by the sample size:

V XnX

X( ) 2

2

The standard deviation of the sample mean, known as the standard error of standard deviation of the sample mean, known as the standard error of the meanthe mean, is equal to the population standard deviation divided by the square root of the sample size:

SD XnX

X( )

Relationships between Population Parameters and the Sampling Distribution of the Sample Mean

Page 19: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

19

When sampling from a normal populationnormal population with mean and standard deviation , the sample mean, X, has a normal sampling distributionnormal sampling distribution:

When sampling from a normal populationnormal population with mean and standard deviation , the sample mean, X, has a normal sampling distributionnormal sampling distribution:

X Nn

~ ( , ) 2

This means that, as the sample size increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean

This means that, as the sample size increases, the sampling distribution of the sample mean remains centered on the population mean, but becomes more compactly distributed around that population mean

Normal population

0.4

0.3

0.2

0.1

0.0

f (X)

Sampling Distribution of the Sample Mean

Sampling Distribution: n =2

Sampling Distribution: n =16

Sampling Distribution: n =4

Sampling from a Normal Population

Normal population

Page 20: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

20

When sampling from a population with mean and finite standard deviation , the sampling distribution of the sample mean will tend to a normal distribution with mean and standard deviation as the sample size becomes large(n >30).

For “large enough” n:

When sampling from a population with mean and finite standard deviation , the sampling distribution of the sample mean will tend to a normal distribution with mean and standard deviation as the sample size becomes large(n >30).

For “large enough” n:

n

)/,(~ 2 nNX

P( X)

X

0.25

0.20

0.15

0.10

0.05

0.00

n = 5

P( X)

0.2

0.1

0.0 X

n = 20

f ( X)

X

-

0.4

0.3

0.2

0.1

0.0

Large n

The Central Limit Theorem(p.186)

Page 21: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

21

Normal Uniform Skewed

Population

n = 2

n = 30

XXXX

General

The Central Limit Theorem Applies to Sampling Distributions from AnyAny Population(p.187)

Page 22: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

22

Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in speedboats. The company’s engineers believe the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 HP. A potential buyer intends to sample 100 engines (each engine is to be run a single time). What is the probability that the sample mean will be less than 217HP?

Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in speedboats. The company’s engineers believe the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 HP. A potential buyer intends to sample 100 engines (each engine is to be run a single time). What is the probability that the sample mean will be less than 217HP?

P X PX

n n

P Z P Z

P Z

( )

( ) .

217217

217 22015

100

217 2201510

2 0 0228

The Central Limit Theorem (Example 5-1, p.188)

Page 23: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in speedboats. The company’s engineers believe the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 HP. A potential buyer intends to sample 100 engines (each engine is to be run a single time). What is the probability that the sample mean will be less than 217HP?

Mercury makes a 2.4 liter V-6 engine, the Laser XRi, used in speedboats. The company’s engineers believe the engine delivers an average power of 220 horsepower and that the standard deviation of power delivered is 15 HP. A potential buyer intends to sample 100 engines (each engine is to be run a single time). What is the probability that the sample mean will be less than 217HP?

The Central Limit Theorem (Example 5-1) – Using the Template

23

Page 24: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

24

Example 5-2 (p.189)EPS Mean Distribution

0

5

10

15

20

25

Range

Fre

qu

en

cy

2.00 - 2.49

2.50 - 2.99

3.00 - 3.49

3.50 - 3.99

4.00 - 4.49

4.50 - 4.99

5.00 - 5.49

5.50 - 5.99

6.00 - 6.49

6.50 - 6.99

7.00 - 7.49

7.50 - 7.99

Page 25: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

25

If the population standard deviation, , is unknownunknown, replace with the sample standard deviation, s. If the population is normal, the resulting statistic: has a t distribution t distribution with with (n - 1) degrees of freedom(n - 1) degrees of freedom.

If the population standard deviation, , is unknownunknown, replace with the sample standard deviation, s. If the population is normal, the resulting statistic: has a t distribution t distribution with with (n - 1) degrees of freedom(n - 1) degrees of freedom.• The t is a family of bell-shaped and symmetric

distributions, one for each number of degree of freedom.

• The expected value of t is 0.

• The variance of t is greater than 1, but approaches 1 as the number of degrees of freedom increases. The t is flatter and has fatter tails than does the standard normal.

• The t distribution approaches a standard normal as the number of degrees of freedom increases.

• The t is a family of bell-shaped and symmetric distributions, one for each number of degree of freedom.

• The expected value of t is 0.

• The variance of t is greater than 1, but approaches 1 as the number of degrees of freedom increases. The t is flatter and has fatter tails than does the standard normal.

• The t distribution approaches a standard normal as the number of degrees of freedom increases.

nsXt

/

Standard normal

t, df=20t, df=10

Student’s t Distribution (p.191)

Page 26: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

26

The sample proportionsample proportion is the percentage of successes in n binomial trials. It is the number of successes, X, divided by the number of trials, n.

The sample proportionsample proportion is the percentage of successes in n binomial trials. It is the number of successes, X, divided by the number of trials, n.

pX

n

As the sample size, n, increases, the sampling distribution of approaches a normal normal distributiondistribution with mean p and standard deviation

How to prove it?

As the sample size, n, increases, the sampling distribution of approaches a normal normal distributiondistribution with mean p and standard deviation

How to prove it?

p

p pn

( )1

Sample proportion:

1514131211109876543210

0.2

0.1

0.0

P(X)

n=15, p = 0.3

X

1415

1315

1215

1115

10 15

9 15

8 15

7 15

6 15

5 15

4 15

3 15

2 15

1 15

0 15

1515 ^

p

210

0 .5

0 .4

0 .3

0 .2

0 .1

0 .0

X

P(X)

n=2, p = 0.3

109876543210

0.3

0.2

0.1

0.0

P(X

)

n=10,p=0.3

X

The Sampling Distribution of the Sample Proportion, p

Page 27: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

27

公式證明

n

pp

n

npq

xVarn

xVarnn

xVarpVar

pnpn

xEn

xEnn

xEpE

pqxVarpxE

pBxwheren

x

n

xp

iii

iii

ii

ii

1

11ˆ

1

11ˆ

)(,)(

)(,ˆ

2

22

Page 28: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

28

In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

n

p

np E p

p p

nV p

p p

nSD p

100

0 25

100 0 25 25

1 25 75

1000 001875

10 001875 0 04330127

.

( )( . ) ( )

( ) (. )(. ). ( )

( ). . ( )

P p Pp p

p p

n

p

p p

n

P z P z

P z

( . )

( )

.

( )

. .

(. )(. )

.

.

( . ) .

0 201

20

1

20 25

25 75

100

05

0433

1 15 0 8749

Sample Proportion (Example 5-3)(p.193)

Page 29: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

In recent years, convertible sports coupes have become very popular in Japan. Toyota is currently shipping Celicas to Los Angeles, where a customizer does a roof lift and ships them back to Japan. Suppose that 25% of all Japanese in a given income and lifestyle category are interested in buying Celica convertibles. A random sample of 100 Japanese consumers in the category of interest is to be selected. What is the probability that at least 20% of those in the sample will express an interest in a Celica convertible?

Sample Proportion (Example 5-3) – Template Solution

29

Page 30: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

30

Exercise (p.193, 15min)

• Exercise 5-12 (Ans:1.Binomial, 2.No, 不適用 )

• Exercise 5-13 (Ans: 0.075 )

• Exercise 5-14 (Ans: 0.9923 )

• Exercise 5-15 (Ans: 0.2308 )

• Exercise 5-16 (Ans: 0.0497 )

Page 31: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

31

An estimatorestimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly-used estimator of the:Population Parameter Sample Statistic Mean () is the Mean (X)Variance (2) is the Variance (s2)Standard Deviation () is the Standard Deviation (s)Proportion (p) is the Proportion ( )

An estimatorestimator of a population parameter is a sample statistic used to estimate the parameter. The most commonly-used estimator of the:Population Parameter Sample Statistic Mean () is the Mean (X)Variance (2) is the Variance (s2)Standard Deviation () is the Standard Deviation (s)Proportion (p) is the Proportion ( )p

• Desirable properties of estimators include:Unbiasedness ( 不偏性 )Efficiency ( 有效性 )Consistency ( 一致性 )Sufficiency ( 充分性 )

• Desirable properties of estimators include:Unbiasedness ( 不偏性 )Efficiency ( 有效性 )Consistency ( 一致性 )Sufficiency ( 充分性 )

5-4 Estimators and Their Properties

(p.177)

Page 32: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

32

An estimator is said to be unbiasedunbiased if its expected value is equal to the population parameter it estimates.

For example, E(X)=so the sample mean is an unbiased estimator of the population mean. Unbiasedness is an average or long-run property. The mean of any single sample will probably not equal the population mean, but the average of the means of repeated independent samples from a population will equal the population mean.

Any systematic deviationsystematic deviation of the estimator from the population parameter of interest is called a biasbias.

An estimator is said to be unbiasedunbiased if its expected value is equal to the population parameter it estimates.

For example, E(X)=so the sample mean is an unbiased estimator of the population mean. Unbiasedness is an average or long-run property. The mean of any single sample will probably not equal the population mean, but the average of the means of repeated independent samples from a population will equal the population mean.

Any systematic deviationsystematic deviation of the estimator from the population parameter of interest is called a biasbias.

Unbiasedness (p.194)

Page 33: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

33

An unbiasedunbiased estimator is on target on average.

A biasedbiased estimator is off target on average.

{

Bias

Unbiased and Biased Estimators

Page 34: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

34

An estimator is efficientefficient if it has a relatively small variance (and standard deviation).

An estimator is efficientefficient if it has a relatively small variance (and standard deviation).

An efficientefficient estimator is, on average, closer to the parameter being estimated..

An inefficientinefficient estimator is, on average, farther from the parameter being estimated.

Efficiency (p.195)

Page 35: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

35

An estimator is said to be consistentconsistent if its probability of being close to the parameter it estimates increases as the sample size increases.

An estimator is said to be consistentconsistent if its probability of being close to the parameter it estimates increases as the sample size increases.

An estimator is said to be sufficientsufficient if it contains all the information in the data about the parameter it estimates.

An estimator is said to be sufficientsufficient if it contains all the information in the data about the parameter it estimates.

n = 100n = 10

ConsistencyConsistency

Consistency and Sufficiency(p.196)

Page 36: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

36

For a normal population, both the sample mean and sample median are unbiased estimatorsunbiased estimators of the population mean, but the sample mean is both more efficientefficient (because it has a smaller variance), and sufficientsufficient. Every observation in the sample is used in the calculation of the sample mean, but only the middle value is used to find the sample median.

In general, the sample mean is the bestbest estimator of the population mean. The sample mean is the most efficient unbiased estimator of the population mean. It is also a consistent estimator.

Properties of the Sample Mean(p.196)

Page 37: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

37

The sample variancesample variance (the sum of the squared deviations from the sample mean divided by (n-1) is an unbiased estimatorunbiased estimator of the population variance. In contrast, the average squared deviation from the sample mean is a biased (though consistent) estimator of the population variance.

The sample variancesample variance (the sum of the squared deviations from the sample mean divided by (n-1) is an unbiased estimatorunbiased estimator of the population variance. In contrast, the average squared deviation from the sample mean is a biased (though consistent) estimator of the population variance.

E s Ex x

n

Ex xn

( )( )( )

( )

22

2

22

1

Properties of the Sample Variance(p.197)

Page 38: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

38

公式證明 :

22

222

22

22

22

2222

2

2

2

2

2

xnx

xnxnx

xnnxnxx

xxxx

xxxx

xxxxxxxx

i

i

i

ii

ii

iiii

2

2

2

1

n

xxEsE i

Page 39: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

39

公式證明 (Continued)

22

22

2

22

22

222

22222

11

1

1

1

1

,0~,,~

1

nn

xxEnn

xxEsE

nNx

nNx

nn

nnxEnnE

xnExExnxExxE

ii

iii

Page 40: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

40

Exercise, p.197, 5min

• 5-19 (Ans. 1300)

• 5-21 (Ans. n > 100 )

Page 41: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

41

Consider a sample of size n=4 containing the following data points:

x1=10 x2=12 x3=16 x4=?

and for which the sample mean is:

Given the values of three data points and the sample mean, the value of the fourth data point can be determined:

x =x

n

12 14 164

414

12 14 164

56

x

x

x4 56 12 14 16

x4 56

xx

n

14

5-5 Degrees of Freedom

x4 = 14x4 = 14

Page 42: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

42

x 14If only two data points and the sample mean are known:

x1=10 x2=12 x3=? x4=?

The values of the remaining two data points cannot be uniquely determined:

x =x

n

12 143 4

414

12 143 4 56

x x

x x

Degrees of Freedom (Continued)

Page 43: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

43

The number of degrees of freedom is equal to the total number of measurements (these are not always raw data points), less the total number of restrictions on the measurements. A restriction is a quantity computed from the measurements.

The sample mean is a restriction on the sample measurements, so after calculating the sample mean there are only (n-1) degrees of freedom remaining with which to calculate the sample variance. The sample variance is based on only (n-1) free data points:

The number of degrees of freedom is equal to the total number of measurements (these are not always raw data points), less the total number of restrictions on the measurements. A restriction is a quantity computed from the measurements.

The sample mean is a restriction on the sample measurements, so after calculating the sample mean there are only (n-1) degrees of freedom remaining with which to calculate the sample variance. The sample variance is based on only (n-1) free data points:

sx x

n2

2

1

( )( )

Degrees of Freedom (Continued)

Page 44: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

44

A sample of size 10 is given below. We are to choose three different numbers from which the deviations are to be taken. The first number is to be used for the first five sample points; the second number is to be used for the next three sample points; and the third number is to be used for the last two sample points.

A sample of size 10 is given below. We are to choose three different numbers from which the deviations are to be taken. The first number is to be used for the first five sample points; the second number is to be used for the next three sample points; and the third number is to be used for the last two sample points.

Example 5-4 (p.201)

Sample # 1 2 3 4 5 6 7 8 9 10

Sample Point

93 97 60 72 96 83 59 66 88 53

i. What three numbers should we choose in order to minimize the SSD (sum of squared deviations from the mean).?

• Note:

2

xxSSD

Page 45: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

45

Solution:Solution: Choose the means of the corresponding sample points. These are: 83.6, 69.33, and 70.5.

ii. Calculate the SSD with chosen numbers.

Solution:Solution: SSD = 2030.367. See table on next slide for calculations.

iii. What is the df for the calculated SSD?

Solution:Solution: df = 10 – 3 = 7.

iv. Calculate an unbiased estimate of the population variance.

Solution:Solution: An unbiased estimate of the population variance is SSD/df = 2030.367/7 = 290.05.

Solution:Solution: Choose the means of the corresponding sample points. These are: 83.6, 69.33, and 70.5.

ii. Calculate the SSD with chosen numbers.

Solution:Solution: SSD = 2030.367. See table on next slide for calculations.

iii. What is the df for the calculated SSD?

Solution:Solution: df = 10 – 3 = 7.

iv. Calculate an unbiased estimate of the population variance.

Solution:Solution: An unbiased estimate of the population variance is SSD/df = 2030.367/7 = 290.05.

Example 5-4 (continued)

Page 46: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

46

Example 5-4 (continued)Sample # Sample Point Mean Deviations Deviation

Squared

1 93 83.6 9.4 88.36

2 97 83.6 13.4 179.56

3 60 83.6 -23.6 556.96

4 72 83.6 -11.6 134.56

5 96 83.6 12.4 153.76

6 83 69.33 13.6667 186.7778

7 59 69.33 -10.3333 106.7778

8 66 69.33 -3.3333 11.1111

9 88 70.5 17.5 306.25

10 53 70.5 -17.5 306.25

SSD 2030.367

SSD/df 290.0524

Page 47: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Sampling Distribution of a Sample MeanSampling Distribution of a Sample Mean

5-6 Using the Computer Using the Template

47

Page 48: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Sampling Distribution of a Sample Mean (continued)Sampling Distribution of a Sample Mean (continued)

Using the Template

48

Page 49: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Sampling Distribution of a Sample ProportionSampling Distribution of a Sample Proportion

Using the Template

49

Page 50: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Using the Template

Sampling Distribution of a Sample Proportion (continued)Sampling Distribution of a Sample Proportion (continued)

50

Page 51: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

Constructing a sampling distribution of the mean from a uniform population (n = 10) using EXCEL (use RANDBETWEEN(0, 1) command to generate values to graph):

Histogram of Sample Means

0

50

100

150

200

250

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Sample Means (Class Midpoints)

Fre

quen

cy

CLASS MIDPOINT FREQUENCY0.15 00.2 00.25 30.3 260.35 640.4 1130.45 1830.5 2130.55 1780.6 1280.65 650.7 200.75 30.8 30.85 0

999

Using Excel to Generate Random Data

51

Page 52: 1 l Using Statistics l Sample Statistics as Estimators of Population Parameters l Sampling Distributions l Estimators and Their Properties l Degrees of.

52

Exercise, p.206, 15min• Exercise5-27 (Ans. 1065, 2500 )• Exercise5-28 (Ans. 53, 0.5 )• Exercise5-29 (Ans. 0.2, 0.04216)• Exercise5-30 (Ans. 0.923)• Exercise5-31 (Ans. 0.9544)• Exercise5-34 (Ans. 1.000)• Exercise5-35 (Ans. 0.9503)• Exercise5-36 (Ans. 不是最小的( n = 1 對常態性已經夠了 ))• Exercise5-40 (Ans. P(Z< - 5) = 0.0000003 不太可能 )