Sampling and Sampling Distribution

Post on 19-Jan-2016

“Sampling and Sampling distribution”

Arun Kumar, Ravindra Gokhale, and Nagarajan Krishnamurthy

Quantitative Techniques-I, Term I, 2012, Indian Institute of Management Indore

Acceptance Sampling of Pins

The production manager knows that the lengths of the pins are normally distributed with mean 1.008 inches and standard deviation 0.045 inches.

The customer takes a random sample of 50 pins from the batch and computes the sample mean. If the sample mean is within the interval 1.00 ± 0.01 inch, the customer will buy the whole batch.

Why examine only 50 pins; why not examine the whole batch?

Checking each pin in the whole batch would be time-consuming (and may be impossible).

Checking each pin in the whole batch costs money.

More reasons for sampling

The process of collecting the data may be destructive.

A smaller number of items can be checked in more detail than a larger number of items.

Population

A group all of whose members are of interest to the researcher.

The population is generally very large, e.g. all pins in the batch.

Parameter

A descriptive measure (number) pertaining to the population.

The parameter is the quantity we are trying to learn about, e.g. the mean length of all the pins in the batch.

Sample

A subset of individuals drawn from the population, e.g. the 50 pins drawn from the batch.

Statistic

A descriptive measure (number) of the sample.

It is used to make inferences about the parameter, e.g. the mean length of the 50 sampled pins.

Inferential Statistics

In inferential statistics, we learn how to make inferences about a population parameter using a sample statistic.

How should the pins be chosen?

We want to avoid a sample that contains smaller and larger pins in a different proportion than they occur in the population.

How to choose a sample?

Sample should be representative of the population.

Why random sampling?

Random sampling ensures unbiased information

Random sampling guards against selection bias, so it tends to give a sample that is representative of the population. A representative sample gives unbiased information about the population.

Simple random sample (SRS)

A sample selected in such a way that every possible sample with the same number of individuals is equally likely to be chosen is known as a simple random sample.
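As a minimal sketch of drawing a simple random sample, assume the batch holds 1000 pins identified by the numbers 0 to 999 (the batch size, the IDs, and the seed are illustrative assumptions, not part of the pin problem). Python's `random.sample` draws without replacement so that every 50-pin subset is equally likely.

```python
import random

# Hypothetical batch: pin IDs 0..999 stand in for the population.
population = list(range(1000))

# Fixed seed only so that the sketch is repeatable.
rng = random.Random(2012)

# Draw 50 distinct pins; every 50-element subset is equally likely.
sample = rng.sample(population, 50)

print(len(sample), len(set(sample)))
```

Measuring the lengths of the pins whose IDs appear in `sample` would then give the data for the sample mean.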

Other sampling methods

Random sampling methods

Stratified Sampling

Cluster Sampling

Non-random sampling methods

Voluntary Sampling

Convenience Sampling
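The two random methods listed above can be sketched as follows. Everything concrete here is an assumption made for illustration: a batch of 1000 pins, two machines as strata, and boxes of 25 pins as clusters. Stratified sampling draws randomly within each stratum (here in proportion to its size); cluster sampling picks whole groups at random and keeps every unit in them.

```python
import random

rng = random.Random(0)  # fixed seed only for repeatability
n = 50                  # desired sample size

# Stratified sampling: hypothetical strata are the machines that made the pins.
strata = {"machine_A": list(range(0, 600)), "machine_B": list(range(600, 1000))}
total = sum(len(units) for units in strata.values())
stratified = {}
for name, units in strata.items():
    k = round(n * len(units) / total)        # proportional allocation
    stratified[name] = rng.sample(units, k)  # random draw within the stratum

# Cluster sampling: hypothetical clusters are boxes of 25 consecutive pins.
boxes = [list(range(start, start + 25)) for start in range(0, 1000, 25)]
chosen_boxes = rng.sample(boxes, 2)          # pick whole boxes at random
cluster_sample = [pin for box in chosen_boxes for pin in box]

print({name: len(pins) for name, pins in stratified.items()}, len(cluster_sample))
```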

Important questions to ask

What is the sample size?

What is the response rate?

Why were 50 pins selected; why not 30 or 100?

The sample size is chosen depending on the error in the estimate that the customer is willing to tolerate.

A smaller sample has a larger margin of error, and a larger sample has a smaller margin of error.

More data means a greater investment of time and money. The error threshold should therefore be chosen judiciously, so that both the cost of collecting the data and the margin of error in the estimate are reasonable.
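The trade-off can be made concrete with a small sketch. It assumes the margin of error for a sample mean is z·σ/√n, uses σ = 0.045 inches from the pin example, and picks a 95% confidence level; the confidence level and the function name `margin_of_error` are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

sigma = 0.045                    # population SD from the pin example (inches)
z = NormalDist().inv_cdf(0.975)  # about 1.96; the 95% level is an assumption

def margin_of_error(n):
    """Half-width of the interval around the sample mean for sample size n."""
    return z * sigma / sqrt(n)

for n in (30, 50, 100):
    print(n, round(margin_of_error(n), 4))
```

Because the margin shrinks only with √n, halving it requires four times as many pins, which is why the error threshold and the sampling cost have to be weighed together.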

Question?

What is the probability that a batch will be acceptable to the customer? Is the probability large enough to be an acceptable level of performance?

Why is this question important from a business point of view?

Ans: Rejection of a batch of pins will cost the company money.

What is the probability that the batch will be accepted in the pin problem?

$P(1 - 0.01 \le \bar{X}_{50} \le 1 + 0.01) = P(0.99 \le \bar{X}_{50} \le 1.01)$

Is $\bar{X}_{50}$ a random variable?

Yes, because of sampling variation. Every time you choose a new set of 50 pins, you will get a new sample mean. The distribution of a sample statistic that arises from sampling variation is known as its sampling distribution.
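Sampling variation can be seen in a short simulation. This is a sketch under the stated model (pin lengths normal with mean 1.008 and standard deviation 0.045); the seed and the number of repetitions are arbitrary choices.

```python
import random
from statistics import mean, stdev

rng = random.Random(1)           # fixed seed only for repeatability
mu, sigma, n = 1.008, 0.045, 50  # values from the pin example

def sample_mean():
    """Draw one fresh sample of n pin lengths and return its mean."""
    return mean([rng.gauss(mu, sigma) for _ in range(n)])

# Repeat the sampling many times; each repetition gives a different mean.
means = [sample_mean() for _ in range(2000)]

print([round(m, 4) for m in means[:3]])
print(round(mean(means), 4), round(stdev(means), 4))
```

The spread of `means` is the sampling distribution in empirical form: its centre sits near 1.008 and its standard deviation near 0.045/√50, roughly 0.0064.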

How to find the sampling distribution of $\bar{X}_{50}$

Two different situations

Population is normal

Population is not normal

Sampling distribution of the sample mean when the population is normal

The distribution of $\bar{X}_n$ is exactly normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, where $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size.
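The mean and standard deviation quoted above follow from a short calculation. It assumes the sample consists of independent draws $X_1, \dots, X_n$ from the population (this notation is introduced here for illustration):

```latex
\begin{align*}
E(\bar{X}_n) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\, n\mu = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{1}{n^2}\, n\sigma^2 = \frac{\sigma^2}{n}, \\
\operatorname{SD}(\bar{X}_n) &= \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Exact normality holds because a linear combination of independent normal random variables is itself normal.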

Sampling distribution of the sample mean when the population is not normal

When the sample size is large, the distribution of $\bar{X}_n$ is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, where $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size. This is the Central Limit Theorem.

The distribution is only approximately normal, but that does not change how we do the calculation.
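A quick simulation illustrates the theorem for a clearly non-normal population. The exponential population (rate 1, so mean 1 and standard deviation 1), the seed, and the repetition count are assumptions chosen for illustration. If the sample mean is close to normal, about 68% of the simulated means should fall within one standard error of the population mean and about 95% within two.

```python
import random
from math import sqrt
from statistics import fmean

rng = random.Random(7)  # fixed seed only for repeatability
n, reps = 50, 4000
mu = sigma = 1.0        # exponential population with rate 1: mean 1, SD 1

# Each entry is the mean of one fresh sample of n exponential draws.
means = [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

se = sigma / sqrt(n)    # standard deviation predicted by the theorem
within_one = sum(abs(m - mu) < se for m in means) / reps
within_two = sum(abs(m - mu) < 2 * se for m in means) / reps

print(round(within_one, 3), round(within_two, 3))
```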

Outline of the Proof of the Central Limit Theorem

Moment Generating Function (MGF)

The MGF of a random variable $X$, denoted by $m_X(t)$, is $E(e^{tX})$.

$\left[ \frac{d^k}{dt^k} m_X(t) \right]_{t=0} = E(X^k)$

Proof of the Central Limit Theorem
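One standard way to carry out the MGF argument is sketched below. It assumes independent draws $X_1, \dots, X_n$ whose MGF exists in a neighbourhood of 0; the standardized variables $Z_i$ are notation introduced here, and this is a sketch rather than a full proof.

```latex
\begin{align*}
Z_i &= \frac{X_i - \mu}{\sigma}, \qquad E(Z_i) = 0, \quad E(Z_i^2) = 1, \\
\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} &= \frac{1}{\sqrt{n}} \sum_{i=1}^{n} Z_i, \\
m_{\frac{1}{\sqrt{n}}\sum Z_i}(t) &= \left[ m_Z\!\left( \frac{t}{\sqrt{n}} \right) \right]^n
  = \left[ 1 + \frac{t^2}{2n} + o\!\left( \frac{1}{n} \right) \right]^n
  \longrightarrow e^{t^2/2} \quad (n \to \infty).
\end{align*}
```

The expansion of $m_Z$ uses $m_Z(0) = 1$, $m_Z'(0) = E(Z) = 0$, and $m_Z''(0) = E(Z^2) = 1$. The limit $e^{t^2/2}$ is the MGF of the standard normal distribution, so the standardized sample mean is approximately $N(0, 1)$ for large $n$.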

What is the sampling distribution of $\bar{X}_{50}$ in the pin example?

$\bar{X}_{50} \sim N\left(1.008,\ (0.045/\sqrt{50})^2\right)$ (1)

What is $P(0.99 < \bar{X}_{50} < 1.01)$?
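The probability can be evaluated directly from the sampling distribution in (1). The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.008, 0.045, 50  # values from the pin example

# Sampling distribution of the mean of 50 pins.
xbar = NormalDist(mu, sigma / sqrt(n))

# Probability that the sample mean lands in the acceptance interval.
p_accept = xbar.cdf(1.01) - xbar.cdf(0.99)

print(round(p_accept, 3))
```

The result is about 0.62, so with the current lathe setting roughly 38% of batches would be rejected.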

Conclusion

The lathe should be adjusted so that the acceptance rate increases.

What changes can be made in the process?

We can change the mean.

We can change the standard deviation.

If the lathe could be adjusted to set the mean length at any desired level, what should it be adjusted to?

It should be adjusted to 1.000 inches.

What is $P(0.99 < \bar{X}_{50} < 1.01)$ when $\mu$ is adjusted to 1.00 inches?
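The same calculation with the mean re-centred can be sketched as follows, assuming the standard deviation stays at 0.045 inches.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.00, 0.045, 50  # mean moved to the centre of the interval

# Sampling distribution of the mean of 50 pins after the adjustment.
xbar = NormalDist(mu, sigma / sqrt(n))

p_accept = xbar.cdf(1.01) - xbar.cdf(0.99)

print(round(p_accept, 3))
```

The acceptance probability rises to about 0.88, because the acceptance interval is now symmetric around the mean of the sampling distribution.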

What should the standard deviation be if at least 90% of the batches should be acceptable?

The standard deviation should be about 0.011 inches, if the mean stays at 1.008 inches.
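The figure of 0.011 can be reproduced under one reading of the question, which is an assumption here: the mean stays at 1.008 inches and at least 90% of batches must pass the customer's test on the mean of 50 pins. With a standard deviation this small, the lower limit 0.99 is many standard errors away, so only the upper limit 1.01 matters, and the largest allowable σ solves (1.01 − µ)/(σ/√n) = z, where z is the 90th percentile of the standard normal.

```python
from math import sqrt
from statistics import NormalDist

mu, n, target = 1.008, 50, 0.90  # mean unchanged; 90% acceptance wanted

# 90th percentile of the standard normal distribution.
z = NormalDist().inv_cdf(target)

# Largest population SD for which the upper limit 1.01 sits z standard errors above mu.
sigma_max = (1.01 - mu) * sqrt(n) / z

# Check: acceptance probability with that SD, using both limits.
xbar = NormalDist(mu, sigma_max / sqrt(n))
p_accept = xbar.cdf(1.01) - xbar.cdf(0.99)

print(round(sigma_max, 3), round(p_accept, 3))
```

For comparison, if the mean were also moved to 1.00 inches, the same 90% target would be met with a standard deviation of about 0.043 inches, only slightly below the current 0.045.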

Which one is easier to adjust: the mean or the standard deviation?

????
