MDP 308 Quality Management 2021. 2. 2. · Central limit theorem “ If X 1, X 2…X n is a random...

MDP 308 Quality Management Lecture #3 Statistical Inference and confidence intervals



MDP 308

Quality Management

Lecture #3

Statistical Inference and confidence intervals


Today’s lecture

Statistical inference

Confidence intervals

Selecting probability distribution


Thinking Challenge

Suppose you’re interested in the average amount of time (in minutes) the students at FECU (the population) spend daily watching TV.

How would you find out?


Statistical inference

“The field of statistical inference consists of those methods used to make decisions

or to draw conclusions about a population. These methods utilize the information

contained in a sample from the population in drawing conclusions.”


Statistical inference

A population consists of the totality of the observations

with which we are concerned.

A sample is a subset of observations selected from a

population.

A statistic is any function of the observations in a

random sample.

A random sample is a sample collected by making sure

that each individual in the population has the same

probability of being selected.


Statistical Methods

• Descriptive Statistics: summarize the sample data

• Inferential Statistics: use the data to learn about the population

  • Estimation

  • Hypothesis Testing

Take a random sample, then use statistical methods to treat the collected data


Estimation Methods

Estimation

• Point Estimation

• Interval Estimation


Point estimation of parameters

A point estimate of some population parameter θ is a single numerical value $\hat{\theta}$ of a statistic $\hat{\Theta}$. The statistic $\hat{\Theta}$ is called the point estimator.

Estimation problems occur frequently in engineering. We often need to estimate

The mean μ of a single population

The variance σ² (or standard deviation σ) of a single population

The proportion p of items in a population that belong to a class of interest

The difference in means of two populations, μ1 – μ2

The difference in two population proportions, p1 – p2


Point Estimation

1. Provides a single value

• Based on observations from one sample

2. Gives no information about how close the value is to

the unknown population parameter

3. Example: Sample mean $\bar{x}$ = 3 is the point estimate of the unknown population mean μ


Unbiased estimator

[Figure: two panels labeled "No bias" and "Bias", each marked with the true value]

• An estimator should be “close” in some sense to the

true value of the unknown parameter.

• Formally, we say that $\hat{\Theta}$ is an unbiased estimator of θ if the expected value of $\hat{\Theta}$ is equal to θ, that is, $E(\hat{\Theta}) = \theta$.
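The definition can be checked empirically. The following is a minimal sketch, assuming a hypothetical normal population with μ = 50 and σ = 10 (values chosen only for illustration, as are the function name and its parameters). It averages many sample means to approximate their expected value, and contrasts the sample variance (divisor n − 1) with the divisor-n version, which is biased.

```python
import random
import statistics

def average_estimates(mu=50.0, sigma=10.0, n=5, trials=10000, seed=1):
    """Approximate the expected value of three estimators by simulation.

    mu, sigma, n, trials and seed are illustrative assumptions.
    Returns (average sample mean, average variance with divisor n - 1,
    average variance with divisor n).
    """
    rng = random.Random(seed)
    means, var_unbiased, var_biased = [], [], []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        var_unbiased.append(statistics.variance(sample))   # divides by n - 1
        var_biased.append(statistics.pvariance(sample))    # divides by n
    return (statistics.fmean(means),
            statistics.fmean(var_unbiased),
            statistics.fmean(var_biased))

avg_mean, avg_s2, avg_biased = average_estimates()
print(f"average of sample means: {avg_mean:.2f}  (mu = 50)")
print(f"average of S^2 (n - 1):  {avg_s2:.1f}  (sigma^2 = 100)")
print(f"average with divisor n:  {avg_biased:.1f}  (theory: 80 for n = 5)")
```

With n = 5 the divisor-n estimator has expected value (n − 1)σ²/n = 80, so it systematically underestimates σ² = 100, while the other two averages settle near the true values.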


Interval Estimator

An interval estimator (or confidence interval) is

a formula that tells us how to use the sample data to

calculate an interval that estimates the target parameter.


Interval Estimation

1. Provides a range of values

• Based on observations from one sample

2. Gives information about closeness to unknown

population parameter

• Stated in terms of probability

3. Example: Unknown population mean lies between 50

and 70 with 95% confidence


Estimation Process

Population: mean, μ, is unknown

→ Random sample: mean $\bar{x}$ = 50

→ "I am 95% confident that μ is between 40 & 60."


Key Elements of Interval Estimation

[Figure: a confidence interval drawn around the sample statistic (point estimate), bounded by the lower confidence limit L and the upper confidence limit U]

A confidence interval provides a range of

plausible values for the population parameter.


Central limit theorem

“If X1, X2, …, Xn is a random sample of size n taken from the population (either finite or infinite) with mean μ and variance σ², and if $\bar{X}$ is the sample mean, the limiting form of the distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

as n → ∞, is the standard normal distribution.”

Furthermore, if the variance σ² is unknown and the sample size n is large, the quantity

$$\frac{\bar{X} - \mu}{S / \sqrt{n}}$$

where S² is the sample variance, has an approximate standard normal distribution.
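A small simulation can make the theorem concrete. The sketch below assumes an exponential population with rate 1 (so μ = σ = 1), picked only as an example of a strongly skewed distribution. The function name and parameters are illustrative. It standardizes each sample mean and reports how much probability lands in each 5% tail of the standard normal. Both fractions should drift toward 0.05 as n grows.

```python
import math
import random

def tail_fractions(n, z=1.645, trials=5000, seed=2):
    """Fractions of standardized sample means below -z and above +z.

    The population is exponential with rate 1 (mean 1, standard deviation 1),
    an assumed example of a non-normal distribution.
    """
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0
    below = above = 0
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z_stat = (xbar - mu) / (sigma / math.sqrt(n))
        if z_stat < -z:
            below += 1
        elif z_stat > z:
            above += 1
    return below / trials, above / trials

for n in (2, 30, 100):
    lower_tail, upper_tail = tail_fractions(n)
    print(f"n = {n:3d}: lower tail {lower_tail:.3f}, upper tail {upper_tail:.3f}")
```

For n = 2 the lower tail is empty, because a mean of non-negative values cannot fall that far below μ. The skewness of the population is still visible there, and it fades as n increases.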


Central limit theorem

By taking more than one sample and looking at the distribution of means

calculated for each sample, we can see that this calculated mean

approaches the actual population mean as indicated in the following figure.

Even if a population distribution is strongly non-normal, its sampling

distribution of means will be approximately normal for large sample sizes

(n ≥ 30), and the mean of a sampling distribution of means is an unbiased

estimator of the population mean.


Confidence interval

Confidence interval on the mean of a normal distribution,

variance known.

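Suppose that x1, x2, ..., xn is a random sample from a normal distribution with unknown μ and known σ².

We know that $\bar{x} \sim N\left(\mu, \sigma/\sqrt{n}\right)$, that is, the sample mean is normally distributed with mean μ and standard deviation σ/√n, so

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

has a standard normal distribution.

A confidence interval estimate for μ is an interval of the form

$$L \le \mu \le U$$

where the limits L and U are computed from the sample. The confidence level is the probability of selecting a sample whose interval contains the true value of µ.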


Confidence interval

In order to find lower and upper confidence limits:

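$$P\left\{-z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right\} = 1 - \alpha$$

$$P\left\{\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right\} = 1 - \alpha$$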
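The step from the first probability statement to the second is plain algebra on the inequality inside the braces. A sketch of the intermediate steps, using the same symbols (multiply through by σ/√n, subtract $\bar{x}$, then multiply by −1, which reverses the inequalities):

```latex
\begin{align*}
-z_{\alpha/2} \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}
&\iff -z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{x}-\mu \le z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\
&\iff -\bar{x}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{x}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\
&\iff \bar{x}-z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x}+z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\end{align*}
```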


Confidence interval

Interpreting a CI

We cannot say: "with probability (1 − α) the parameter μ lies in the

confidence interval."

We can say that: if an infinite number of random samples are collected and

a 100(1 − α)% CI for µ is computed from each sample, 100(1 − α)% of these

intervals will contain the true value of µ


If our confidence level is 95%, then in the long run, 95% of

our confidence intervals will contain µ and 5% will not.
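The long-run reading of the confidence level can be illustrated by simulation. This is a minimal sketch, assuming a hypothetical normal population (μ = 50, σ = 10) with known σ and samples of size 25. All of these values, and the function name, are illustrative choices rather than course data.

```python
import math
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=25, confidence=0.95, trials=5000, seed=3):
    """Fraction of known-sigma confidence intervals that contain mu.

    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials

print(coverage())                  # close to 0.95
print(coverage(confidence=0.90))   # close to 0.90
```

Each individual interval either contains μ or it does not. The 95% describes the procedure over many repetitions, which is what the simulated fraction estimates.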

Effect of Confidence Level

For a confidence coefficient of 95%, the area in the two tails is .05. To choose a different confidence coefficient we increase or decrease the area (call it α) assigned to the tails. If we place α/2 in each tail and $z_{\alpha/2}$ is the z-value, the confidence interval with coefficient (1 – α) is

$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$$
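As a quick illustration of how the z-value changes with the confidence coefficient, the sketch below computes $z_{\alpha/2}$ for three common levels using Python's standard library. The helper name `z_critical` is a hypothetical choice for this example.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}, the z-value with area alpha/2 to its right."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```

A higher confidence coefficient leaves less area in the tails, so the z-value and the interval width both increase.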


Conditions Required for a Valid Large-Sample Confidence Interval for µ

1. A random sample is selected from the target population.

2. The sample size n is large (i.e., n ≥ 30). Due to the Central Limit Theorem, this condition guarantees that the sampling distribution of $\bar{x}$ is approximately normal. Also, for large n, s will be a good estimator of σ.


Large-Sample 100(1 – α)% Confidence Interval for µ

$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the z-value with an area α/2 to its right and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. The parameter σ is the standard deviation of the sampled population, and n is the sample size.

Note: When σ is unknown and n is large (n ≥ 30), the confidence interval is approximately equal to

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

where s is the sample standard deviation.


Thinking Challenge

You’re a Q/C inspector for FruitTree. The σ for 0.33-liter cans is .005 liters. A random sample of 100 cans showed $\bar{x}$ = 0.329 liters. What is the 90% confidence interval estimate of the true mean amount in 0.33-liter cans?


Confidence Interval Solution

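$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

$$0.329 - 1.645\,\frac{0.005}{\sqrt{100}} \le \mu \le 0.329 + 1.645\,\frac{0.005}{\sqrt{100}}$$

$$0.32818 \le \mu \le 0.32982$$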
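The arithmetic of this solution can be reproduced in a few lines. This is a sketch only: `z_interval` is a hypothetical helper name, and the inputs are the values stated in the exercise (σ = 0.005, n = 100, sample mean 0.329, 90% confidence).

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known (or n is large)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}, about 1.645 for 90%
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lower, upper = z_interval(xbar=0.329, sigma=0.005, n=100, confidence=0.90)
print(f"{lower:.5f} <= mu <= {upper:.5f}")
# 0.32818 <= mu <= 0.32982
```

The half-width is 1.645 × 0.005 / 10 ≈ 0.00082, so the limits sit symmetrically about the sample mean of 0.329.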


Confidence interval for small sample

Assuming that the measured characteristic of the population is normally distributed, the random variable

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has a t distribution with n − 1 degrees of freedom.

Therefore, the confidence interval is given by:

$$\bar{x} - t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}}$$
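To show the small-sample interval in use, here is a minimal sketch with hypothetical data. The ten measurements are invented for illustration, and the critical value 2.262 for 9 degrees of freedom and a 95% interval is read from a standard t table rather than computed.

```python
import math
import statistics

# Hypothetical measurements (n = 10), invented purely for illustration.
sample = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]

# t_{alpha/2, n-1} for a 95% interval with n - 1 = 9 degrees of freedom,
# taken from a standard t table rather than computed here.
T_025_9 = 2.262

n = len(sample)
xbar = statistics.fmean(sample)
s = statistics.stdev(sample)          # sample standard deviation (divisor n - 1)
half_width = T_025_9 * s / math.sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"x-bar = {xbar:.3f}, s = {s:.3f}")
print(f"95% CI: {lower:.3f} <= mu <= {upper:.3f}")
```

Using the z-value 1.960 instead of 2.262 would give a narrower interval that understates the uncertainty when n is this small.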


Student t Distribution…

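Here the letter t is used to represent the random variable, hence the name. The density function for the Student t distribution is as follows…

$$f(t) = \frac{\Gamma\left[(\nu+1)/2\right]}{\sqrt{\nu\pi}\,\Gamma(\nu/2)} \left[1 + \frac{t^2}{\nu}\right]^{-(\nu+1)/2}$$

ν (nu) is called the degrees of freedom, and Γ (Gamma function) is Γ(k) = (k−1)(k−2)…(2)(1) for a positive integer k.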


Student t Distribution…[1 parameter]

In much the same way that μ and σ define the normal distribution [2 parameters], ν, the degrees of freedom, defines the Student t distribution [1 parameter]:

As the number of degrees of freedom increases, the t distribution approaches the standard normal distribution.
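The convergence toward the standard normal can be seen by evaluating the two densities at a couple of points. This sketch uses the standard form of the Student t density, with `math.lgamma` in place of the Gamma function to avoid overflow for large ν. The function names are illustrative.

```python
import math

def t_pdf(t, nu):
    """Student t density with nu degrees of freedom."""
    # lgamma (log of the Gamma function) avoids overflow for large nu.
    log_coef = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_coef - (nu + 1) / 2 * math.log1p(t * t / nu))

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

for nu in (1, 5, 30, 1000):
    print(f"nu = {nu:4d}: f(0) = {t_pdf(0.0, nu):.4f}, f(2) = {t_pdf(2.0, nu):.4f}")
print(f"normal   : f(0) = {normal_pdf(0.0):.4f}, f(2) = {normal_pdf(2.0):.4f}")
```

For small ν the t density is lower at the center and heavier in the tails. By ν = 1000 it is practically indistinguishable from the normal density.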


Using the t table (Table 4) for values…

For example, if we want the value of t with 10 degrees of freedom such that the area under the Student t curve to its right is .05:

Area under the curve value (tα): COLUMN

Degrees of Freedom: ROW

t.05,10 = 1.812
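The table value can be cross-checked numerically without a statistics package. The sketch below is one possible approach, not the method behind the printed table: it integrates the standard Student t density with Simpson's rule and uses bisection to find the t-value whose right-tail area equals α. All function names are hypothetical.

```python
import math

def t_pdf(t, nu):
    """Student t density with nu degrees of freedom."""
    log_coef = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_coef - (nu + 1) / 2 * math.log1p(t * t / nu))

def upper_tail_area(t, nu, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps is even)."""
    if t == 0.0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3.0      # by symmetry, half the area lies right of 0

def t_critical(alpha, nu):
    """Find t such that the area to its right is alpha (0 < alpha < 0.5)."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, nu) > alpha:   # expand until the root is bracketed
        hi *= 2.0
    for _ in range(50):                      # bisection on the tail area
        mid = (lo + hi) / 2.0
        if upper_tail_area(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(f"{t_critical(0.05, 10):.3f}")   # 1.812
```

The same function gives other table entries by changing the column (α) and row (degrees of freedom) arguments.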


Student t Probabilities and Values

Excel can calculate Student t distribution probabilities and values. Warning: Excel will give you the value for “t” where α is the area in “BOTH” tails, so the one-tail value t.05,10 is obtained by entering a two-tail area of 0.1:

=TINV(0.1,10) gives 1.812


Selecting a probability distribution

In statistical process control, most of the quality characteristics are random variables.

The determination of confidence intervals is based on the assumption that the population distribution is Normal.

If that is not the case, we need to test the hypothesis that a particular distribution we select will be satisfactory.

This is done by first collecting numerical values from the real system under study.

Probability plots can be used as a first guess of the probability distribution function that can suit the collected data.

Then, a goodness-of-fit test can be used for further verification based on a detailed numerical comparison between the collected data and the selected probability distribution.


3.4 Probability Plots

Determining if a sample of data might reasonably be assumed to come from a specific distribution

Probability plots are available for various distributions

Easy to construct with computer software (MINITAB)

Subjective interpretation

Chapter 3, Statistical Quality Control, 7th Edition by Douglas C. Montgomery. Copyright (c) 2013 John Wiley & Sons, Inc.
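The coordinates of a normal probability plot can also be computed by hand. The sketch below assumes the common plotting position (j − 0.5)/n for the j-th smallest observation. The data are hypothetical, and the correlation at the end is only a rough numerical companion to the visual check, not a formal goodness-of-fit test.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (z, x) pairs for a normal probability plot.

    The j-th smallest observation is paired with the standard normal
    quantile of the plotting position (j - 0.5) / n.
    """
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((j - 0.5) / n), x)
            for j, x in enumerate(ordered, start=1)]

def correlation(points):
    """Pearson correlation between the two coordinates of the points."""
    n = len(points)
    mean_z = sum(z for z, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    szx = sum((z - mean_z) * (x - mean_x) for z, x in points)
    szz = sum((z - mean_z) ** 2 for z, _ in points)
    sxx = sum((x - mean_x) ** 2 for _, x in points)
    return szx / (szz * sxx) ** 0.5

# Hypothetical measurements, invented purely for illustration.
data = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]
points = normal_plot_points(data)
for z, x in points:
    print(f"{z:6.3f}  {x:5.1f}")
print(f"straightness (correlation): {correlation(points):.3f}")
```

If the points fall close to a straight line when x is plotted against z, a normal model is a reasonable first guess. Pronounced curvature suggests trying a different distribution.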


Normal Probability Plot




The Normal Probability Plot on Standard

Graph Paper



Other Probability Plots

What is a reasonable choice as a probability model

for these data?



3.5 Some Useful Approximations