Statistics II Chapter 1: Statistical inference in one population


Chapter 1. Statistical inference in one population

Contents

- Statistical inference
- Point estimators
  - The estimation of the population mean and variance
- Estimating the population mean using confidence intervals
  - Confidence intervals for the mean of a normal population with known variance
  - Confidence intervals for the mean in large samples
  - Confidence intervals for the population proportion
  - Confidence intervals for the mean of a normal population with unknown variance
- Estimating the population variance using confidence intervals
  - Confidence intervals for the variance of a normal population


Chapter 1. Statistical inference in one population

Learning goals

At the end of this chapter you should know how to:

- Estimate the unknown population parameters from the sample data
- Construct confidence intervals for the unknown population parameters from the sample data:
  - In the case of a normal distribution: confidence intervals for the population mean and variance
  - In large samples: confidence intervals for the population mean and proportion
- Interpret the confidence interval
- Understand the impact of the sample size, confidence level, etc. on the length of the confidence interval
- Calculate the sample size needed to control a given interval width


Chapter 1. Statistical inference in one population

References

- Newbold, P. "Statistics for Business and Economics"
  - Chapters 7 and 8 (8.1-8.6)
- Ross, S. "Introduction to Statistics"
  - Chapter 8


Statistical inference: key words (i)

- Population: the complete set of numerical information on a particular quantity in which an investigator is interested.
  - We identify the concept of the population with that of the random variable $X$.
  - The law or the distribution of the population is the distribution of $X$, $F_X$.
- Sample: an observed subset (say, of size $n$) of the population values.
  - Represented by a collection of $n$ random variables $X_1, X_2, \ldots, X_n$, typically iid (independent and identically distributed).
- Parameter: a constant characterizing $X$ or $F_X$.


Statistical inference: key words (ii)

- Statistical inference: the process of drawing conclusions about a population on the basis of measurements or observations made on a sample of individuals from the population.
- Statistic: a random variable obtained as a function of a random sample, $X_1, X_2, \ldots, X_n$.
- Estimator of a parameter: a random variable obtained as a function, say $T$, of a random sample, $X_1, X_2, \ldots, X_n$, used to estimate the unknown population parameter.
- Estimate: a specific realization of that random variable, i.e., $T$ evaluated at the observed sample, $x_1, x_2, \ldots, x_n$, that provides an approximation to that unknown parameter.


Statistical inference: example

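| | Population | Sample | Observed sample |
|---|---|---|---|
| What we have | $X \sim F$ | $n$ copies of $X$: $X_1, X_2, \ldots, X_n \sim F$ | $n$ observed values of $X_1, X_2, \ldots, X_n$: $x_1, x_2, \ldots, x_n$ |
| Role | We want to know $\mu_X = E[X]$ | Estimator of $\mu_X$ (r. v.) | Estimate of $\mu_X$ (number) |
| Quantity | $\mu_X = E[X]$, expected value of $X$ | $\bar X$, sample mean | $\bar x$, sample mean |

Data flow: $X \sim F$ ⇒ Sample ⇒ Observed sample.
Inference flow: $\mu_X = E[X]$ ⇐ $\bar X$ ⇐ $\bar x$.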


Point estimators: introduction

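- A point estimator of a population parameter is a function, call it $T$, of the sample information $\mathbf{X}_n = (X_1, \ldots, X_n)$ that yields a single number.
- Examples of population parameters, estimators and estimates:

| Population parameter | Estimator $T(\mathbf{X}_n)$ | Estimator: notation | Estimate: notation |
|---|---|---|---|
| Pop. mean $\mu_X$ | sample mean, $\frac{X_1 + \ldots + X_n}{n}$ | $\bar X = \hat\mu_X$ | $\bar x$ |
| Pop. prop. $p_X$ | sample prop. | $\hat p_X$ | $\hat p_x$ |
| Pop. var. $\sigma_X^2$ | sample var., $\frac{\sum_i X_i^2 - n(\bar X)^2}{n}$ | $\hat\sigma_X^2$ | $\hat\sigma_x^2$ |
| Pop. var. $\sigma_X^2$ | sample quasi-var., $\frac{\sum_i X_i^2 - n(\bar X)^2}{n-1} = \frac{n}{n-1}\hat\sigma_X^2$ | $s_X^2$ | $s_x^2$ |
| ... | ... | ... | ... |
| In general, $\theta_X$ | ... | $\hat\theta_X$ | $\hat\theta_x$ |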


Point estimators: properties (i)

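What are desirable characteristics of the estimators?

- Unbiasedness. This means that the bias of the estimator is zero. What's bias? Bias equals the expected value of the estimator minus the target parameter:

\[ \mathrm{Bias}[\hat\theta_X] = E[\hat\theta_X] - \theta_X \]

| Population parameter | Estimator $T(\mathbf{X}_n)$ | Bias | Unbiased? | Minimum variance unbiased estimator? |
|---|---|---|---|---|
| Pop. mean $\mu_X$ | $\bar X$ | $E[\bar X] - \mu_X = 0$ | Yes | Yes, if $X$ normal |
| Pop. prop. $p_X$ | $\hat p_X$ | $E[\hat p_X] - p_X = 0$ | Yes | Yes |
| Pop. var. $\sigma_X^2$ | $\hat\sigma_X^2$ | $E[\hat\sigma_X^2] - \sigma_X^2 \neq 0$ | No | No |
| Pop. var. $\sigma_X^2$ | $s_X^2$ | $E[s_X^2] - \sigma_X^2 = 0$ | Yes | Yes, if $X$ normal |
| In general, $\theta_X$ | $\hat\theta_X$ | $E[\hat\theta_X] - \theta_X$ | Often | Rarely |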
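The bias entries for the two variance estimators can be checked exactly on a toy case. The sketch below is only an illustration, not part of the original slides: it assumes a hypothetical population in which $X$ is uniform on {1, 2, 3}, enumerates every equally likely sample of size $n = 3$, and averages the sample variance (divisor $n$) and the sample quasi-variance (divisor $n - 1$). The function name `exact_expectations` is made up for this example.

```python
from fractions import Fraction
from itertools import product


def exact_expectations(values, n):
    """Exact E[sigma_hat^2] and E[s^2] for iid samples of size n drawn
    uniformly from `values` (a hypothetical finite population)."""
    samples = list(product(values, repeat=n))  # all equally likely samples
    total_biased = Fraction(0)
    total_quasi = Fraction(0)
    for sample in samples:
        mean = Fraction(sum(sample), n)
        ss = sum((Fraction(x) - mean) ** 2 for x in sample)
        total_biased += ss / n        # sample variance, divides by n
        total_quasi += ss / (n - 1)   # sample quasi-variance, divides by n - 1
    return total_biased / len(samples), total_quasi / len(samples)


values = [1, 2, 3]  # assumed toy population: X uniform on {1, 2, 3}
n = 3
mu = Fraction(sum(values), len(values))
sigma2 = sum((Fraction(v) - mu) ** 2 for v in values) / len(values)

e_biased, e_quasi = exact_expectations(values, n)
print("population variance:", sigma2)
print("E[sigma_hat^2]:", e_biased, " bias:", e_biased - sigma2)
print("E[s^2]:", e_quasi, " bias:", e_quasi - sigma2)
```

In this toy case the quasi-variance has zero bias, while the sample variance underestimates $\sigma_X^2$ by the factor $(n-1)/n$, in line with the table.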


Point estimators: properties (ii)

- Efficiency. Measured by the estimator's variance. Estimators with smaller variance are more efficient.
- Relative efficiency of two unbiased estimators $\hat\theta_{X,1}$ and $\hat\theta_{X,2}$ of a parameter $\theta_X$ is

\[ \text{Relative efficiency}(\hat\theta_{X,1}, \hat\theta_{X,2}) = \frac{\mathrm{Var}[\hat\theta_{X,1}]}{\mathrm{Var}[\hat\theta_{X,2}]} \]

Note:
- sometimes the inverse is used as a definition
- in any case, an estimator with smaller variance is more efficient


Point estimators: properties (iii)

- A more general criterion to select estimators (among unbiased and biased ones) is the mean squared error, defined as

\[ \mathrm{MSE}[\hat\theta_X] = E[(\hat\theta_X - \theta_X)^2] = \mathrm{Var}[\hat\theta_X] + (\mathrm{Bias}[\hat\theta_X])^2 \]

Note:
- the mean squared error of an unbiased estimator equals its variance
- an estimator with smaller MSE is better
- the minimum variance unbiased estimator has the smallest variance/MSE among all unbiased estimators

- How do we come up with the definition of the estimator $T$?
  - In some situations, there exists an optimal estimator called the minimum variance unbiased estimator.
  - If that's not the case, there are various alternative methods that yield reasonable estimators, for example:
    - Maximum likelihood estimation
    - Method of moments


Point estimation: example

Example: 7.1 (Newbold) Price-earnings ratios for a random sample of ten stocks traded on the NY Stock Exchange on a particular day were

10, 16, 5, 10, 12, 8, 4, 6, 5, 4

Use an unbiased estimation procedure to find point estimates of the following population parameters: mean, variance, proportion of values exceeding 8.5.

\[ \bar x = \frac{80}{10} = 8 \]

\[ s_x^2 = \frac{782 - 10(8)^2}{10 - 1} = 15.78 \]

\[ \hat p_x = \frac{1 + 1 + 0 + 1 + 1 + 0 + 0 + 0 + 0 + 0}{10} = 0.4 \]
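As a quick check of the arithmetic in Example 7.1, the three point estimates can be recomputed with Python's standard library. This is a minimal sketch; the variable names are chosen here for illustration. Note that `statistics.variance` divides by $n - 1$, so it returns the quasi-variance $s_x^2$.

```python
import statistics

# Price-earnings ratios from Example 7.1
ratios = [10, 16, 5, 10, 12, 8, 4, 6, 5, 4]

x_bar = statistics.mean(ratios)       # sample mean
s2 = statistics.variance(ratios)      # quasi-variance (divides by n - 1)
p_hat = sum(r > 8.5 for r in ratios) / len(ratios)  # proportion above 8.5

print("mean:", x_bar)
print("quasi-variance:", round(s2, 2))
print("proportion exceeding 8.5:", p_hat)
```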


Point estimation: example

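Example: Let $\hat\mu_X = \frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)$ be an estimator of the population mean based on a SRS $\mathbf{X}_n$. Compare this estimator with the sample mean, $\bar X$.

We know that $\bar X$ is an unbiased estimator of $\mu_X$, whose variance is $\sigma_X^2/n$.

$\hat\mu_X$ is also unbiased:

\[
\begin{aligned}
E[\hat\mu_X] &= E\left[\frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)\right] \\
&= \frac{2}{n(n+1)}\left(E[X_1] + 2E[X_2] + \ldots + nE[X_n]\right) \\
&\overset{\text{id}}{=} \frac{2}{n(n+1)}\left(\mu_X + 2\mu_X + \ldots + n\mu_X\right) \\
&= \frac{2\mu_X}{n(n+1)}\underbrace{(1 + 2 + \ldots + n)}_{n(n+1)/2} = \mu_X
\end{aligned}
\]

\[ \Rightarrow \mathrm{Bias}[\hat\mu_X] = 0 \]

And its variance/MSE is:

\[
\begin{aligned}
V[\hat\mu_X] &= V\left[\frac{2}{n(n+1)}(X_1 + 2X_2 + \ldots + nX_n)\right] \\
&\overset{\text{indep.}}{=} \left(\frac{2}{n(n+1)}\right)^2\left(V[X_1] + 2^2 V[X_2] + \ldots + n^2 V[X_n]\right) \\
&\overset{\text{id}}{=} \frac{4}{n^2(n+1)^2}\,\sigma_X^2\,\underbrace{(1^2 + 2^2 + \ldots + n^2)}_{n(n+1)(2n+1)/6} \\
&= \frac{2(2n+1)}{3n(n+1)}\,\sigma_X^2
\end{aligned}
\]

\[ \mathrm{MSE}[\hat\mu_X] = V[\hat\mu_X] + 0^2 = \frac{2(2n+1)}{3n(n+1)}\,\sigma_X^2 \]

\[ \text{Relative efficiency}(\bar X, \hat\mu_X) = \frac{\sigma_X^2/n}{\frac{2(2n+1)}{3n(n+1)}\sigma_X^2} = \frac{3(n+1)}{2(2n+1)} \]

It's easy to see that for $n \geq 2$ this ratio is smaller than 1, so $\bar X$ is a more efficient estimator for $\mu_X$.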
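The closed-form ratio above can be cross-checked numerically. The following sketch (helper names are illustrative, not from the slides) computes both variances in units of $\sigma_X^2$ using exact fractions, directly from the weights of each estimator, and compares the resulting ratio with $3(n+1)/(2(2n+1))$.

```python
from fractions import Fraction


def variance_factor_weighted(n):
    """Var[mu_hat] / sigma^2 for mu_hat = 2/(n(n+1)) * sum(i * X_i)."""
    c = Fraction(2, n * (n + 1))
    # Independence: the variance of a weighted sum is the sum of squared weights
    return c ** 2 * sum(i ** 2 for i in range(1, n + 1))


def variance_factor_mean(n):
    """Var[X_bar] / sigma^2 = 1/n."""
    return Fraction(1, n)


def relative_efficiency(n):
    """Relative efficiency(X_bar, mu_hat) = Var[X_bar] / Var[mu_hat]."""
    return variance_factor_mean(n) / variance_factor_weighted(n)


for n in (1, 2, 5, 10, 100):
    ratio = relative_efficiency(n)
    print(n, ratio, float(ratio))
```

For $n = 1$ the two estimators coincide and the ratio is 1; for larger $n$ the ratio decreases toward 3/4.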


From point estimation to confidence interval estimation

- So far, we have considered the point estimation of an unknown population parameter which, assuming we had a SRS of $n$ observations from $X$, would produce an educated guess about that unknown parameter.
- Point estimates, however, do not take into account the variability of the estimation procedure due to, among other factors:
  - sample size: larger samples should provide more accurate information about the population parameter
  - variability in the population: samples from populations with smaller variance should give more accurate estimates
  - whether other population parameters are known
  - etc.

These drawbacks can be overcome by considering confidence interval estimation, that is, a method that gives a range of values (an interval) in which the parameter is likely to fall.


Confidence interval estimator and confidence interval

Let $\mathbf{X}_n = (X_1, X_2, \ldots, X_n)$ be a SRS from a population $X$ with a cdf $F_X$ that depends on an unknown parameter $\theta$.

- A confidence interval estimator for $\theta$ at a confidence level $(1-\alpha) = 100(1-\alpha)\%$ is an interval $(T_1(\mathbf{X}_n), T_2(\mathbf{X}_n))$ that satisfies

\[ P\big(\theta \in (T_1(\mathbf{X}_n), T_2(\mathbf{X}_n))\big) = 1 - \alpha \]

  - Interpretation: we have a probability of $(1-\alpha)$ that the unknown population parameter will be in $(T_1(\mathbf{X}_n), T_2(\mathbf{X}_n))$.

- A confidence interval for $\theta$ at a confidence level $1-\alpha$ is the observed value of the confidence interval estimator,

\[ (T_1(\mathbf{x}_n), T_2(\mathbf{x}_n)) \]

  - Interpretation: we can be $(1-\alpha)$ confident that the unknown population parameter will be in $(T_1(\mathbf{x}_n), T_2(\mathbf{x}_n))$.

Typical levels of confidence

| $\alpha$ | 0.01 | 0.05 | 0.10 |
|---|---|---|---|
| $100(1-\alpha)\%$ | 99% | 95% | 90% |


Finding confidence interval estimators: procedure

1. Find a quantity involving the unknown parameter $\theta$ and the sample $\mathbf{X}_n$, $C(\mathbf{X}_n, \theta)$, whose distribution is known and does not depend on the parameter: a so-called pivotal quantity or a pivot for $\theta$.

2. Use the upper $1-\alpha/2$ and $\alpha/2$ quantiles of that distribution, written here as $q_{1-\alpha/2}$ and $q_{\alpha/2}$, and the definition of the confidence interval estimator to set up the equation

\[ P\big(\underbrace{q_{1-\alpha/2} < C(\mathbf{X}_n, \theta) < q_{\alpha/2}}_{\text{double inequality}}\big) = 1 - \alpha \]

3. To find the end points $T_1(\mathbf{X}_n)$ and $T_2(\mathbf{X}_n)$ of the confidence interval estimator, solve the double inequality for the parameter $\theta$.

4. A $100(1-\alpha)\%$ confidence interval for $\theta$ is $(T_1(\mathbf{x}_n), T_2(\mathbf{x}_n))$.


Confidence interval for the population mean, normal population with known variance

1. Let $\mathbf{X}_n$ be a SRS of size $n$ from $X$. Under the assumptions:
   - $X$ follows a normal distribution with parameters $\mu_X$ and $\sigma_X^2$
   - $\sigma_X^2$ is known (rather unrealistic)

2. The pivotal quantity for $\mu_X$ is

\[ \frac{\bar X - \mu_X}{\sigma_X/\sqrt{n}} \sim N(0, 1) \]

   - Note: the standard deviation of $\bar X$, $\sigma_X/\sqrt{n}$ (or of any other statistic), is called the standard error.


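Confidence interval for the population mean, normal population with known variance

3. Hence, if $z_{1-\alpha/2}$ and $z_{\alpha/2}$ are the $(1-\alpha/2)$ and $(\alpha/2)$ upper quantiles of the $N(0, 1)$, we have

\[ P(z_{1-\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha \]

Recall: if $Z \sim N(0, 1)$ then $E[Z] = 0$, $V[Z] = 1$.

[Figure: standard normal density, with area $1-\alpha$ between $z_{1-\alpha/2} = -z_{\alpha/2}$ and $z_{\alpha/2}$, and area $\alpha/2$ in each tail.]

4. Therefore, using $z_{1-\alpha/2} = -z_{\alpha/2}$ and $Z = \frac{\bar X - \mu_X}{\sigma_X/\sqrt{n}}$,

\[ P\left(-z_{\alpha/2} < \frac{\bar X - \mu_X}{\sigma_X/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha \]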


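Confidence interval for the population mean, normal population with known variance

5. Solve the double inequality for $\mu_X$:

\[
\begin{aligned}
-z_{\alpha/2} &< \frac{\bar X - \mu_X}{\sigma_X/\sqrt{n}} < z_{\alpha/2} \\
-z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} &< \bar X - \mu_X < z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \\
-z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} - \bar X &< -\mu_X < -\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \\
z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} + \bar X &> \mu_X > \bar X - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}
\end{aligned}
\]

to obtain the confidence interval estimator

\[ \Big(\underbrace{\bar X - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}}_{T_1(\mathbf{X}_n)},\ \underbrace{\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}}_{T_2(\mathbf{X}_n)}\Big) \]

6. The confidence interval is:

\[ CI_{1-\alpha}(\mu_X) = \left(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) = \left(\bar x \mp z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) \]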


Example: finding a confidence interval for $\mu_X$

Example: 8.2 (Newbold) A process produces bags of refined sugar. The weights of the contents of these bags are normally distributed with standard deviation 1.2 ounces. The contents of a random sample of twenty-five bags had mean weight 19.8 ounces. Find a 95% confidence interval for the true mean weight for all bags of sugar produced by the process.

- Population: $X$ = "weight of a sugar bag (in oz)", $X \sim N(\mu_X, \sigma_X^2 = 1.2^2)$
- SRS: $n = 25$
- Sample: $\bar x = 19.8$

Objective: $CI_{0.95}(\mu_X) = \left(\bar x \mp z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right)$

- $\sigma_X = 1.2$, $n = 25$, $\bar x = 19.8$
- $1 - \alpha = 0.95 \Rightarrow \alpha/2 = 0.025$
- $z_{\alpha/2} = z_{0.025} = 1.96$ (upper-tail area 0.025)

\[ CI_{0.95}(\mu_X) = \left(19.8 \mp 1.96\,\frac{1.2}{\sqrt{25}}\right) = (19.8 \mp 0.47) = (19.33, 20.27) \]

Interpretation: We can be 95% confident that $\mu_X$ is in (19.33, 20.27).
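The z-based interval is easy to compute programmatically. The sketch below is an illustration under the slide's assumptions (normal population, known $\sigma_X$); the function name `z_interval` is made up here, and the quantile comes from the standard library's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, conf_level):
    """CI for the mean of a normal population with known sigma."""
    alpha = 1 - conf_level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma / sqrt(n)         # z times the standard error
    return x_bar - half_width, x_bar + half_width


# Example 8.2: sugar bags
lower, upper = z_interval(x_bar=19.8, sigma=1.2, n=25, conf_level=0.95)
print(round(lower, 2), round(upper, 2))
```

Rounded to two decimals, this reproduces the interval (19.33, 20.27) obtained above.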


Frequency interpretation of the CI, conf. level effect

In this simulated example, 150 samples of the same size $n = 50$ were generated from $X \sim N(\mu_X = -5, \sigma_X^2 = 1^2)$ and 150 $CI_{1-\alpha}(\mu_X)$ were constructed with $\alpha = 0.1$ and $\alpha = 0.01$.

[Figure, panel $(1-\alpha) = 0.9$, $n = 50$: the 150 confidence intervals plotted against the sample index (0 to 150), horizontal axis "Confidence interval" from −6.0 to −4.0. $\mu_X$ is in approximately 150(0.9) = 135 intervals (but not in 150(0.1) = 15).]

[Figure, panel $(1-\alpha) = 0.99$, $n = 50$: same layout. $\mu_X$ is in approximately 150(0.99) = 148.5 intervals (but not in 150(0.01) = 1.5).]

The width of the interval,

\[ w = \bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} - \left(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) = 2 z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}, \]

increases with the increasing confidence level (keeping everything else the same). Why?
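A simulation in the spirit of this slide can be sketched as follows. It is not the code used to produce the original figures: the seed, the function name `coverage_count`, and the use of Python's `random` module are assumptions made for illustration. It draws 150 samples of size 50 from $N(-5, 1^2)$ and counts how many known-variance z intervals contain $\mu_X$.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage_count(mu, sigma, n, conf_level, n_samples, seed):
    """Count how many of n_samples z-based CIs contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(n_samples):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits


for level in (0.90, 0.99):
    hits = coverage_count(mu=-5, sigma=1, n=50, conf_level=level,
                          n_samples=150, seed=1)
    print(f"level {level}: {hits} of 150 intervals contain mu")
```

Because the same seed generates the same samples for both levels, every interval that covers $\mu_X$ at the 90% level also covers it at the 99% level; the counts should be close to 135 and 148.5, though any single run will vary.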


Frequency interpretation of the CI, sample size effect

Here we collect 150 samples of size $n = 50$ and another 150 of size $n = 200$ from $X \sim N(\mu_X = -5, \sigma_X^2 = 1^2)$.

[Figure, panel $(1-\alpha) = 0.9$, $n = 50$: the 150 confidence intervals plotted against the sample index (0 to 150), horizontal axis "Confidence interval" from −6.0 to −4.0. $\mu_X$ is in approximately 150(0.9) = 135 intervals (but not in 150(0.1) = 15).]

[Figure, panel $(1-\alpha) = 0.9$, $n = 200$: same layout, with visibly shorter intervals. $\mu_X$ is in approximately 150(0.9) = 135 intervals (but not in 150(0.1) = 15).]

The width of the interval decreases with the increasing sample size (keeping everything else the same). Why?

Question: What is the effect of $\sigma$ on the width?
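One way to explore these questions is to evaluate the width formula $w = 2 z_{\alpha/2}\sigma_X/\sqrt{n}$ for a few settings. The sketch below is illustrative only; the function name and the particular combinations of confidence level, $n$ and $\sigma$ are choices made here, not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_width(sigma, n, conf_level):
    """Width w = 2 * z_{alpha/2} * sigma / sqrt(n) of the z-based CI."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return 2 * z * sigma / sqrt(n)


# (confidence level, n, sigma): vary one factor at a time
settings = [(0.90, 50, 1), (0.99, 50, 1), (0.90, 200, 1), (0.90, 50, 2)]
for conf_level, n, sigma in settings:
    w = z_interval_width(sigma, n, conf_level)
    print(f"level={conf_level}, n={n}, sigma={sigma}: width={w:.3f}")
```

Raising the confidence level raises $z_{\alpha/2}$ and hence the width; quadrupling $n$ halves the width; doubling $\sigma$ doubles it.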


Example: estimating the sample size

Example: 8.14 (Newbold) The lengths of metal rods produced by an industrial process are normally distributed with standard deviation 1.8mm. Suppose that a production manager requires a 99% confidence interval extending no further than 0.5mm on each side of the sample mean. How large a sample is needed to achieve such an interval?

- Population: $X$ = "length of a metal rod (in mm)", $X \sim N(\mu_X, \sigma_X^2 = 1.8^2)$
- SRS: $n = ?$
- $CI_{0.99}(\mu_X)$: width $= 2 z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \leq 2(0.5) = 1$
- $z_{\alpha/2} = z_{0.005} = 2.575$ (upper-tail area 0.005)

Objective: $n$ such that width $\leq 1$

\[
\begin{aligned}
2 z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} &\leq 1 \\
2 z_{\alpha/2}\sigma_X &\leq \sqrt{n} \\
85.93 = (2(2.575)(1.8))^2 &\leq n
\end{aligned}
\]

To satisfy the manager's requirement, a sample of at least 86 observations is needed.
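The same calculation can be written as a small helper that solves $z_{\alpha/2}\sigma_X/\sqrt{n} \leq h$ for $n$, where $h$ is the required half-width. This is a sketch with a hypothetical function name; it uses the exact normal quantile from `statistics.NormalDist` instead of the tabulated 2.575, which here leads to the same rounded-up answer.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_half_width(sigma, half_width, conf_level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= half_width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil((z * sigma / half_width) ** 2)  # round up to a whole observation


# Example 8.14: metal rods, 99% CI within 0.5 mm of the sample mean
n_required = sample_size_for_half_width(sigma=1.8, half_width=0.5, conf_level=0.99)
print(n_required)
```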


Confidence interval for the population mean in large samples

1. Let $\mathbf{X}_n$ be a SRS of size $n$ from $X$. Under the assumptions:
   - $X$ follows a nonnormal distribution with parameters $\mu_X$ and $\sigma_X^2$
   - the sample size $n$ is large ($n \geq 30$)

2. The pivotal quantity for $\mu_X$ based on the Central Limit Theorem is

\[ \frac{\bar X - \mu_X}{\hat\sigma_X/\sqrt{n}} \sim_{\text{approx.}} N(0, 1) \]


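Confidence interval for the population mean in large samples

3. Hence, if $z_{1-\alpha/2}$ and $z_{\alpha/2}$ are the $(1-\alpha/2)$ and $(\alpha/2)$ upper quantiles of the $N(0, 1)$, we have

\[ P(z_{1-\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha \]

[Figure: standard normal density, with area $1-\alpha$ between $z_{1-\alpha/2} = -z_{\alpha/2}$ and $z_{\alpha/2}$, and area $\alpha/2$ in each tail.]

4. Therefore, using $z_{1-\alpha/2} = -z_{\alpha/2}$ and $Z = \frac{\bar X - \mu_X}{\hat\sigma_X/\sqrt{n}}$,

\[ P\left(-z_{\alpha/2} < \frac{\bar X - \mu_X}{\hat\sigma_X/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha \]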


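Confidence interval for the population mean in large samples

5. Solve the double inequality for $\mu_X$:

\[ -z_{\alpha/2} < \frac{\bar X - \mu_X}{\hat\sigma_X/\sqrt{n}} < z_{\alpha/2} \]

to obtain the confidence interval estimator

\[ \Big(\underbrace{\bar X - z_{\alpha/2}\frac{\hat\sigma_X}{\sqrt{n}}}_{T_1(\mathbf{X}_n)},\ \underbrace{\bar X + z_{\alpha/2}\frac{\hat\sigma_X}{\sqrt{n}}}_{T_2(\mathbf{X}_n)}\Big) \]

6. The confidence interval is:

\[ CI_{1-\alpha}(\mu_X) = \left(\bar x - z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}}\right) \]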


Confidence interval for the population proportion in large samples

Application of CIs for the population mean in large samples

Let $\mathbf{X}_n$, $n \geq 30$, be a SRS from a Bernoulli distribution with parameter $p_X$ ($\mu_X = E[X] = p_X$ and $\sigma_X = \sqrt{p_X(1 - p_X)}$). The sample proportion $\hat p_X$ is a special case of the sample mean of zero-one observations, $\hat p_X = \bar X$.

Thus, from the CLT,

\[ \frac{\hat p_X - p_X}{\underbrace{\sqrt{p_X(1 - p_X)/n}}_{\sigma_X/\sqrt{n}}} \sim_{\text{approx.}} N(0, 1) \]

This result remains true if we use an estimate for the population standard deviation:

\[ \frac{\hat p_X - p_X}{\underbrace{\sqrt{\hat p_X(1 - \hat p_X)/n}}_{\hat\sigma_X/\sqrt{n}}} \sim_{\text{approx.}} N(0, 1) \]

Thus, in large samples, the confidence interval for $p_X$ is:

\[ CI_{1-\alpha}(p_X) = \left(\hat p_x - z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}},\ \hat p_x + z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\right) \]


Example: finding a confidence interval for pX

Example: 8.6 (Newbold) A random sample of 344 industrial buyers were asked: "What is your firm's policy for purchasing personnel to follow on accepting gifts from vendors?". For 83 of these buyers, the policy of the firm was for the buyer to make his/her own decision. Find a 90% confidence interval for the population proportion of all buyers who are allowed to make their own decisions.

- Population: $X = 1$ if a buyer makes their own decision and 0 otherwise, $X \sim \text{Bernoulli}(p_X)$
- SRS: $n = 344$ (large)
- Sample: $\hat p_x = \frac{83}{344} = 0.241$

Objective: $CI_{0.9}(p_X) = \left(\hat p_x \mp z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\right)$

- $\hat p_x = 0.241$, $n = 344$
- $1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
- $z_{\alpha/2} = z_{0.05} = 1.645$ (upper-tail area 0.05)

\[ CI_{0.9}(p_X) = \left(0.241 \mp 1.645\sqrt{\frac{0.241(1 - 0.241)}{344}}\right) = (0.241 \mp 0.038) = (0.203, 0.279) \]

Interpretation: We can be 90% confident that the proportion of buyers who make their own decision, $p_X$, falls in (0.203, 0.279).
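The large-sample interval for a proportion follows the same pattern in code. The sketch below assumes the conditions of the slide (Bernoulli data, large $n$); `proportion_interval` is a name invented for this illustration, and the normal quantile is taken from the standard library.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, conf_level):
    """Large-sample (CLT-based) CI for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)  # z times estimated std. error
    return p_hat - half_width, p_hat + half_width


# Example 8.6: 83 of 344 buyers make their own decision
lower, upper = proportion_interval(successes=83, n=344, conf_level=0.90)
print(round(lower, 3), round(upper, 3))
```

To three decimals this agrees with the interval (0.203, 0.279) computed by hand.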


Confidence interval for the population mean, normal population with unknown variance

1. Let $\mathbf{X}_n$ be a SRS of size $n$ from $X$. Under the assumptions:
   - $X$ follows a normal distribution with parameters $\mu_X$ and $\sigma_X^2$
   - $\sigma_X^2$ is unknown (quite realistic)

2. The pivotal quantity for $\mu_X$ is

\[ \frac{\bar X - \mu_X}{s_X/\sqrt{n}} \sim t_{n-1} \]


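Confidence interval for the population mean, normal population with unknown variance

3. Hence, if $t_{n-1;1-\alpha/2}$ and $t_{n-1;\alpha/2}$ are the $(1-\alpha/2)$ and $(\alpha/2)$ upper quantiles of the t distribution with $n - 1$ degrees of freedom (df), we have, for $T \sim t_{n-1}$,

\[ P(t_{n-1;1-\alpha/2} < T < t_{n-1;\alpha/2}) = 1 - \alpha \]

Recall: if $T \sim t_n$, then $E[T] = 0$, $V[T] = \frac{n}{n-2}$.

[Figure: t (Student) density, with area $1-\alpha$ between $t_{n-1;1-\alpha/2} = -t_{n-1;\alpha/2}$ and $t_{n-1;\alpha/2}$, and area $\alpha/2$ in each tail.]

4. Therefore, using $t_{n-1;1-\alpha/2} = -t_{n-1;\alpha/2}$ and $T = \frac{\bar X - \mu_X}{s_X/\sqrt{n}} \sim t_{n-1}$,

\[ P\left(-t_{n-1;\alpha/2} < \frac{\bar X - \mu_X}{s_X/\sqrt{n}} < t_{n-1;\alpha/2}\right) = 1 - \alpha \]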


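Confidence interval for the population mean, normal population with unknown variance

5. Solve the double inequality for $\mu_X$:

\[ -t_{n-1;\alpha/2} < \frac{\bar X - \mu_X}{s_X/\sqrt{n}} < t_{n-1;\alpha/2} \]

to obtain the confidence interval estimator

\[ \Big(\underbrace{\bar X - t_{n-1;\alpha/2}\frac{s_X}{\sqrt{n}}}_{T_1(\mathbf{X}_n)},\ \underbrace{\bar X + t_{n-1;\alpha/2}\frac{s_X}{\sqrt{n}}}_{T_2(\mathbf{X}_n)}\Big) \]

6. The confidence interval is:

\[ CI_{1-\alpha}(\mu_X) = \left(\bar x - t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}},\ \bar x + t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\right) \]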


Example: finding a confidence interval for $\mu_X$

Example: 8.4 (Newbold) A random sample of six cars from a particular model year had the following fuel consumption figures, in mpg: 18.6, 18.4, 19.2, 20.8, 19.4, 20.5. Find a 90% confidence interval for the population mean fuel consumption, assuming that the population distribution is normal.

- Population: $X$ = "mpg of a car from the model year", $X \sim N(\mu_X, \sigma_X^2)$, $\sigma_X^2$ unknown
- SRS: $n = 6$ (small)
- Sample: $\bar x = \frac{116.9}{6} = 19.4833$, $s_x^2 = \frac{2282.41 - 6(19.4833)^2}{6 - 1} = 0.96$

Objective: $CI_{0.9}(\mu_X) = \left(\bar x \mp t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\right)$

- $s_x = \sqrt{0.96} = 0.98$, $n = 6$, $\bar x = 19.48$
- $1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
- $t_{n-1;\alpha/2} = t_{5;0.05} = 2.015$ (upper-tail area 0.05)

\[ CI_{0.9}(\mu_X) = \left(19.48 \mp 2.015\,\frac{0.98}{\sqrt{6}}\right) = (19.48 \mp 0.81) = (18.67, 20.29) \]

Interpretation: We can be 90% confident that the population mean fuel consumption for these cars, $\mu_X$, is between 18.67 and 20.29.
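The t-based interval can be sketched in the same way. Python's standard library has no Student t quantile function, so in this illustration the critical value $t_{5;0.05} = 2.015$ is passed in by hand, taken from the table value used in the example; the function name `t_interval` is hypothetical.

```python
from math import sqrt
import statistics


def t_interval(data, t_critical):
    """CI for the mean of a normal population with unknown variance.

    t_critical is the upper alpha/2 quantile of t_{n-1}, supplied by the
    caller (for instance from a printed table).
    """
    n = len(data)
    x_bar = statistics.fmean(data)
    s = statistics.stdev(data)          # quasi standard deviation (n - 1)
    half_width = t_critical * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Example 8.4: fuel consumption in mpg, 90% CI, t_{5;0.05} = 2.015
mpg = [18.6, 18.4, 19.2, 20.8, 19.4, 20.5]
lower, upper = t_interval(mpg, t_critical=2.015)
print(lower, upper)
```

Without rounding the intermediate values, the end-points come out at roughly 18.677 and 20.290, so the lower end-point can differ from the hand calculation in the second decimal.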


Example: finding a confidence interval for µX

Example: 8.4 (cont.) in Excel: Go to menu: Data, submenu: Data Analysis, choose function: Descriptive Statistics.

Column A (data), in yellow (sample mean, half-width $t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}$, lower end-point (cell D3-D16), upper end-point (cell D3+D16)).


t and $\chi^2$ distributions

- Recall that $T \sim t_n$ if $T = \frac{Z}{\sqrt{\chi^2_n/n}}$, where $Z \sim N(0, 1)$ and $\chi^2_n$ follows a chi-square distribution with df $= n$, independent of $Z$.
- On the other hand, $\chi^2_n$ is the distribution of the sum of $n$ independent squared $N(0, 1)$ random variables.
- Note that, when $X$ is normal, the rescaled sample quasi-variance follows a chi-square distribution with $n - 1$ degrees of freedom:

\[ \frac{(n-1)s_X^2}{\sigma_X^2} = \frac{\sum_{i=1}^n (X_i - \bar X)^2}{\sigma_X^2} = \sum_{i=1}^n \left(\frac{X_i - \bar X}{\sigma_X}\right)^2 \sim \chi^2_{n-1} \]

Why $n - 1$ and not $n$?

- If we knew $\mu_X$, the number of degrees of freedom would be $n$, because we would have $n$ iid random variables $\frac{X_i - \mu_X}{\sigma_X}$.
- Since we have to estimate $\mu_X$ with $\bar X$, the df are $n - 1$: the $n$ deviations $\frac{X_i - \bar X}{\sigma_X}$ sum to zero, so only $n - 1$ of them are free to vary (once you know $n - 1$ of them, you can figure out the remaining one).

We say that one degree of freedom is used up to estimate $\mu_X$.


t and $\chi^2$ distributions

[Figure: t and N(0, 1) densities on the range −4 to 4, showing N(0,1) and t with df = 10, 5 and 3.]

[Figure: $\chi^2$ densities on the range 0 to 40, for df = 20, 15, 10 and 5.]


Confidence interval for the population variance, normal population

1. Let $\mathbf{X}_n$ be a SRS of size $n$ from $X$. Under the assumptions:
   - $X$ follows a normal distribution with parameter $\sigma_X^2$

2. The pivotal quantity for $\sigma_X^2$ is

\[ \frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1} \]


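Confidence interval for the population variance, normal population

3. Hence, if $\chi^2_{n-1;1-\alpha/2}$ and $\chi^2_{n-1;\alpha/2}$ are the $(1-\alpha/2)$ and $(\alpha/2)$ upper quantiles of the chi-square distribution with $n - 1$ degrees of freedom, we have

\[ P(\chi^2_{n-1;1-\alpha/2} < \chi^2_{n-1} < \chi^2_{n-1;\alpha/2}) = 1 - \alpha \]

Recall: $E[\chi^2_n] = n$, $V[\chi^2_n] = 2n$.

[Figure: chi-square density, with area $1-\alpha$ between $\chi^2_{n-1;1-\alpha/2}$ and $\chi^2_{n-1;\alpha/2}$, and area $\alpha/2$ in each tail.]

4. Therefore, since $\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$,

\[ P\left(\chi^2_{n-1;1-\alpha/2} < \frac{(n-1)s_X^2}{\sigma_X^2} < \chi^2_{n-1;\alpha/2}\right) = 1 - \alpha \]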


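Confidence interval for the population variance, normal population

5. Solve the double inequality for $\sigma_X^2$:

\[
\begin{aligned}
\chi^2_{n-1;1-\alpha/2} &< \frac{(n-1)s_X^2}{\sigma_X^2} < \chi^2_{n-1;\alpha/2} \\
\frac{1}{\chi^2_{n-1;1-\alpha/2}} &> \frac{\sigma_X^2}{(n-1)s_X^2} > \frac{1}{\chi^2_{n-1;\alpha/2}} \\
\frac{(n-1)s_X^2}{\chi^2_{n-1;1-\alpha/2}} &> \sigma_X^2 > \frac{(n-1)s_X^2}{\chi^2_{n-1;\alpha/2}}
\end{aligned}
\]

to obtain the confidence interval estimator

\[ \left(\frac{(n-1)s_X^2}{\chi^2_{n-1;\alpha/2}},\ \frac{(n-1)s_X^2}{\chi^2_{n-1;1-\alpha/2}}\right) \]

6. The confidence interval is:

\[ CI_{1-\alpha}(\sigma_X^2) = \left(\frac{(n-1)s_x^2}{\chi^2_{n-1;\alpha/2}},\ \frac{(n-1)s_x^2}{\chi^2_{n-1;1-\alpha/2}}\right) \]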


Example: finding a confidence interval for $\sigma_X^2$ and $\sigma_X$

Example: 8.8 (Newbold) A random sample of fifteen pills for headache relief showed a quasi standard deviation of 0.8% in the concentration of the active ingredient. Find a 90% confidence interval for the population variance for these pills. How would you obtain a CI for the population standard deviation?

- Population: $X$ = "concentration of an active ingredient in a pill (in %)", $X \sim N(\mu_X, \sigma_X^2)$
- SRS: $n = 15$
- Sample: $s_x = 0.8$

Objective: $CI_{0.9}(\sigma_X^2) = \left(\frac{(n-1)s_x^2}{\chi^2_{n-1;\alpha/2}},\ \frac{(n-1)s_x^2}{\chi^2_{n-1;1-\alpha/2}}\right)$

- $s_x^2 = 0.8^2 = 0.64$, $n = 15$
- $1 - \alpha = 0.9 \Rightarrow \alpha/2 = 0.05$
- $\chi^2_{n-1;1-\alpha/2} = \chi^2_{14;0.95} = 6.57$ (lower-tail area 0.05)
- $\chi^2_{n-1;\alpha/2} = \chi^2_{14;0.05} = 23.68$ (upper-tail area 0.05)

\[ CI_{0.9}(\sigma_X^2) = \left(\frac{14(0.64)}{23.68},\ \frac{14(0.64)}{6.57}\right) = (0.378, 1.364) \]

\[ \Rightarrow CI_{0.9}(\sigma_X) = (\sqrt{0.378}, \sqrt{1.364}) = (0.61, 1.17) \]

To obtain $CI(\sigma_X)$ we apply the square root to the end-points of $CI(\sigma_X^2)$.
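The variance interval can be computed with a short helper. As with the t distribution, the standard library offers no chi-square quantile function, so this sketch takes the two critical values 6.57 and 23.68 from the example as inputs; the function and argument names are invented for illustration.

```python
from math import sqrt


def variance_interval(s2, n, chi2_low, chi2_high):
    """CI for the variance of a normal population.

    chi2_low  = chi^2_{n-1; 1-alpha/2} (the smaller critical value)
    chi2_high = chi^2_{n-1; alpha/2}   (the larger critical value)
    Both are supplied by the caller, e.g. from a printed table.
    """
    return (n - 1) * s2 / chi2_high, (n - 1) * s2 / chi2_low


# Example 8.8: n = 15 pills, quasi standard deviation 0.8
var_low, var_high = variance_interval(s2=0.8 ** 2, n=15,
                                      chi2_low=6.57, chi2_high=23.68)
# CI for the standard deviation: square roots of the variance end-points
sd_low, sd_high = sqrt(var_low), sqrt(var_high)
print(var_low, var_high)
print(sd_low, sd_high)
```

The variance end-points match (0.378, 1.364) to three decimals; the standard deviation end-points agree with the hand calculation up to rounding of intermediate results.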


Confidence intervals formulae

Summary for one population

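- Let $\mathbf{X}_n$ be a simple random sample from a population $X$ with mean $\mu_X$ and variance $\sigma_X^2$

| Parameter | Assumptions | Pivotal quantity | $(1-\alpha)$ Conf. Interval |
|---|---|---|---|
| Mean | Normal data, known variance | $\frac{\bar X - \mu_X}{\sigma_X/\sqrt{n}} \sim N(0, 1)$ | $\mu_X \in \left(\bar x - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right)$ |
| Mean | Nonnormal data, large sample | $\frac{\bar X - \mu_X}{\hat\sigma_X/\sqrt{n}} \sim_{\text{approx.}} N(0, 1)$ | $\mu_X \in \left(\bar x - z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}},\ \bar x + z_{\alpha/2}\frac{\hat\sigma_x}{\sqrt{n}}\right)$ |
| Mean (proportion $p_X$) | Bernoulli data, large sample | $\frac{\hat p_X - p_X}{\sqrt{\hat p_X(1 - \hat p_X)/n}} \sim_{\text{approx.}} N(0, 1)$ | $p_X \in \left(\hat p_x \mp z_{\alpha/2}\sqrt{\frac{\hat p_x(1 - \hat p_x)}{n}}\right)$ |
| Mean | Normal data, unknown variance | $\frac{\bar X - \mu_X}{s_X/\sqrt{n}} \sim t_{n-1}$ | $\mu_X \in \left(\bar x - t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}},\ \bar x + t_{n-1;\alpha/2}\frac{s_x}{\sqrt{n}}\right)$ |
| Variance | Normal data | $\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$ | $\sigma_X^2 \in \left(\frac{(n-1)s_x^2}{\chi^2_{n-1;\alpha/2}},\ \frac{(n-1)s_x^2}{\chi^2_{n-1;1-\alpha/2}}\right)$ |
| Standard dev. | Normal data | $\frac{(n-1)s_X^2}{\sigma_X^2} \sim \chi^2_{n-1}$ | $\sigma_X \in \left(\sqrt{\frac{(n-1)s_x^2}{\chi^2_{n-1;\alpha/2}}},\ \sqrt{\frac{(n-1)s_x^2}{\chi^2_{n-1;1-\alpha/2}}}\right)$ |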


Confidence intervals for the population mean: when to use what?

$X \sim$ distribution with mean $\mu_X$ and standard deviation $\sigma_X$

- $X \sim$ normal
  - $\sigma$ known → z-based (exact)
  - $\sigma$ unknown → t-based (exact)
- $X$ not normal
  - $n$ small → Methods beyond Est II
  - $n$ large → z-based (approx. CLT)
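The decision tree above can also be written as a small function. This is only a sketch of the slide's logic: the function name, the argument names and the default threshold of 30 for "large" (taken from the $n \geq 30$ rule of thumb used earlier in the chapter) are choices made here for illustration.

```python
def choose_mean_ci_method(is_normal, sigma_known, n, large_n=30):
    """Pick the CI method for the population mean, following the tree.

    is_normal:   whether the population is assumed normal
    sigma_known: whether the population standard deviation is known
    n:           sample size
    large_n:     assumed threshold for a "large" sample (rule of thumb)
    """
    if is_normal:
        # For normal data both intervals are exact for any n
        return "z-based (exact)" if sigma_known else "t-based (exact)"
    if n >= large_n:
        return "z-based (approx. CLT)"
    return "methods beyond this course"


print(choose_mean_ci_method(is_normal=True, sigma_known=True, n=25))    # sugar bags
print(choose_mean_ci_method(is_normal=True, sigma_known=False, n=6))    # fuel consumption
print(choose_mean_ci_method(is_normal=False, sigma_known=False, n=344)) # buyers
```

In the non-normal branch the function ignores `sigma_known`, mirroring the tree, which splits only on sample size there.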