From last lecture (Sampling Distribution): –The first important bit we need to know about sampling...

Page 1:

• From last lecture (Sampling Distribution):

– The first important bit we need to know about the sampling distribution is…?

– What is the mean of the sampling distribution of means?

– How do we calculate the standard deviation of the sampling distribution of means, and what is it called?

Page 2:

Hypothesis testing:

Page 3:

The story so far...

(a) The sampling distribution of means is often approximately normal (even if the original population from which the samples came is not normal). As a rule of thumb, this is true for N ≥ 30.

(b) Any given sample mean (X̄) can be expressed in terms of how much it differs from the mean of the sampling distribution (i.e., as a z-score).

Page 4:

Brain size in hares: sample means and population means

population: mean = 500 g

sample A: mean = 650 g

sample B: mean = 450 g

sample C: mean = 500 g

sample D: mean = 600 g

Page 5:

The Central Limit Theorem in action:

[Figure: the means of samples A, B, C, D, E, F, G, H, J, K, etc. plotted by the frequency with which each sample mean occurs. The centre of the distribution marks the population mean, which is also the mean of the sample means.]
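The idea on this slide can be tried out with a small simulation. The sketch below is only an illustration under assumed settings: the skewed (exponential) population, the sample size of 30, and the 2000 repetitions are choices made here, not values from the lecture. Only the population mean of 500 g is borrowed from the hare example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POP_MEAN = 500.0     # population mean in grams (from the hare example)
SAMPLE_SIZE = 30     # N per sample (assumed for illustration)
NUM_SAMPLES = 2000   # number of samples drawn (assumed for illustration)


def draw_sample_mean():
    """Return the mean of one sample from a skewed, non-normal population."""
    # An exponential population with mean 500 also has sd 500 (assumption).
    sample = [random.expovariate(1.0 / POP_MEAN) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(sample)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of the sample means: {mean_of_means:.1f} g")
print(f"sd of the sample means:   {sd_of_means:.1f} g")
```

The mean of the sample means should land close to 500 g, and their spread should be much narrower than the spread of the population itself, even though the population here is far from normal.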

Page 6:

(c) Any given sample mean can be expressed in terms of how much it differs from the population mean.

[Figure: sampling distribution marking the population mean and a particular sample mean]

(d) "Deviation from the mean" goes hand in hand with "probability of occurrence": a sample mean that deviates greatly from the population mean is unlikely to occur.

Page 7:

A word about the use of the term “population”

• The term does not necessarily refer to a set of individuals or items (e.g. cars). Rather, it refers to a state of individuals or items.

• Example: After a major earthquake in a city (in which no one died) the actual set of individuals remains the same. But the anxiety level, for example, may change. The anxiety level of the individuals before and after the quake defines them as two populations.

Page 8:

A concrete example: The American Sludge Oil Company has developed a new petrol additive that they claim will increase the mileage of cars

• It is known that the current average mileage of cars is 18 mi/gallon

• A sample of 100 cars is taken and run on Sludge fuel

• The sample mean mileage is 19 mi/gallon

• Is the Sludge company’s claim valid?

Page 9:

• When the test is fair, ONLY two sources can contribute to the difference between the population mean and the sample mean:

– The effect of the fuel additive

– Random variability

Our job is to decide which of these possibilities it is.

Page 10:

Hypothesis testing

• Two types of hypotheses: H0 and H1

• The NULL hypothesis (H0): The mean of the population from which the sample comes IS 18 mi/gallon.

• The ALTERNATIVE hypothesis (H1): The mean of the population from which the sample comes IS DIFFERENT FROM 18 mi/gallon.

Page 11:

The question we want to ask is: How likely is it that the sample (N = 100) whose mean is 19 mi/gallon comes from a population whose mean and standard deviation are 18 mi/gallon and 4 mi/gallon, respectively?

Page 12:

Assume we know the following:

• The population mean mileage of cars (µ) = 18 mi/gallon

• The population standard deviation (σ) = 4 mi/gallon

• The sample mean (X̄) = 19 mi/gallon

• Number of cars (N) = 100

Page 13:

The sampling distribution for N = 100

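[Figure: normal curve; vertical axis: frequency of sample means; horizontal axis: X̄, marked at 17, µ = 18, and 19]

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} = \frac{4}{\sqrt{100}} = 0.4$$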
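The spread of this sampling distribution (the standard error of the mean) follows directly from σ = 4 and N = 100. A minimal sketch of the arithmetic, with a helper name (`standard_error_of_mean`) that is made up for this illustration:

```python
import math


def standard_error_of_mean(pop_sd, n):
    """Standard deviation of the sampling distribution of means: sigma / sqrt(N)."""
    return pop_sd / math.sqrt(n)


sigma = 4.0  # population standard deviation, mi/gallon
n = 100      # number of cars in the sample

standard_error = standard_error_of_mean(sigma, n)
print(f"standard error = {standard_error:.1f} mi/gallon")  # 4 / 10 = 0.4
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.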

Page 14:

We start by supposing the null hypothesis is true, meaning that the sampling distribution from which our sample mean comes is one for which µ = 18.

Then we ask: if that were so, what is the probability of obtaining a sample mean at least as far from µ as ours? If the probability is low, then we reject the null hypothesis in favour of the alternative hypothesis.

Page 15:

Assume "innocent until proven guilty": we retain the null hypothesis unless there is enough evidence to allow us to reject it in favour of the alternative hypothesis.

Large difference between sample mean and µ: reject H0 in favour of H1.

Small difference between sample mean and µ: no reason to reject H0.

Page 16:

How low does the probability have to be?

• We usually select 0.05 as the cutoff. (Sometimes we also select 0.01). This is called the level of significance.

• We look for values that occur with a probability of 0.05 or less.

[Figure: sampling distribution; vertical axis: frequency of sample means; horizontal axis: X̄, marked at 17, µ = 18, and 19]

Page 19:

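The shaded areas correspond to the most extreme 5% of sample means (X̄) in the sampling distribution: the lowest 2.5% and the highest 2.5%.

[Figure: sampling distribution with both tails shaded, 0.025 in each; horizontal axis: X̄, marked at 17, µ = 18, and 19]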

Page 20:

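Next we find the critical values. These are given as z-scores in the table "Areas Under the Unit Normal Distribution":

• lower critical value: z = -1.96

• upper critical value: z = 1.96

[Figure: sampling distribution with 0.025 shaded in each tail beyond the critical values; horizontal axis marked at 17, µ = 18, and 19]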
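Instead of looking the critical values up in a printed table, they can be computed from the inverse of the normal cumulative distribution. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are chosen here for illustration, and the conversion back to mi/gallon assumes the standard error of 0.4 from the sludge example.

```python
from statistics import NormalDist

alpha = 0.05                # level of significance
unit_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: put alpha / 2 in each tail.
lower_critical = unit_normal.inv_cdf(alpha / 2)
upper_critical = unit_normal.inv_cdf(1 - alpha / 2)
print(f"critical z-values: {lower_critical:.2f} and {upper_critical:.2f}")

# The same cutoffs expressed as sample means (mi/gallon).
mu = 18.0
standard_error = 0.4
lower_mean = mu + lower_critical * standard_error
upper_mean = mu + upper_critical * standard_error
print(f"critical sample means: {lower_mean:.2f} and {upper_mean:.2f}")
```

Any sample mean outside the two printed cutoffs falls in the shaded 5% of the sampling distribution.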

Page 21:

A portion of the table:

z      Below z   Above z   Between mean and z
1.90   0.9713    0.0287    0.4713
1.91   0.9719    0.0281    0.4719
1.92   0.9726    0.0274    0.4726
1.93   0.9732    0.0268    0.4732
1.94   0.9738    0.0262    0.4738
1.95   0.9744    0.0256    0.4744
1.96   0.9750    0.0250    0.4750
1.97   0.9756    0.0244    0.4756
1.98   0.9761    0.0239    0.4761
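The three columns of the table are all derived from the cumulative area under the unit normal curve. The sketch below recomputes them for z = 1.90 to 1.98, rounded to four places; `table_row` is a helper invented for this illustration.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1


def table_row(z):
    """Return (z, area below z, area above z, area between the mean and z)."""
    below = unit_normal.cdf(z)
    return (z, round(below, 4), round(1 - below, 4), round(below - 0.5, 4))


print("z      Below z   Above z   Between mean and z")
for hundredths in range(190, 199):
    z, below, above, between = table_row(hundredths / 100)
    print(f"{z:.2f}   {below:.4f}    {above:.4f}    {between:.4f}")
```

The row for z = 1.96 is the one that matters for the 0.05 level: it leaves 0.025 of the area above z.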

Page 22:

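If z-observed ≥ 1.96 (or ≤ -1.96), then reject H0.

Our z-score for the sludge company is:

$$z_{\text{observed}} = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}} = \frac{19 - 18}{4 / \sqrt{100}} = \frac{1}{0.4} = 2.50$$

Since 2.50 ≥ 1.96, we reject the null hypothesis in favour of the alternative hypothesis.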
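The whole test for the sludge example fits in a few lines. This is a sketch rather than a general-purpose routine: `one_sample_z` is a name chosen here, and the two-tailed p-value is an extra step the slides do not compute explicitly.

```python
import math
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z-score of a sample mean within the sampling distribution of means."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


# Values from the sludge example.
z_observed = one_sample_z(sample_mean=19, pop_mean=18, pop_sd=4, n=100)

# Probability of a z at least this extreme in either tail, if H0 is true.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z_observed)))

CRITICAL_Z = 1.96  # two-tailed cutoff at the 0.05 level
reject_h0 = abs(z_observed) >= CRITICAL_Z

print(f"z-observed = {z_observed:.2f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
print("reject H0" if reject_h0 else "retain H0")
```

Comparing z-observed with the critical value and comparing the p-value with 0.05 are two ways of making the same decision.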

Page 23:

How big must a difference be, before we reject H0?

Two types of error can occur:

Type 1 error: reject H0 when it is true. (Also known as an α, or "alpha", error.) We think our experimental manipulation has had an effect, when in fact it has not.

Type 2 error: retain H0 when it is false. (Also known as a β, or "beta", error.) We think our experimental manipulation has not had an effect, when in fact it has.

Page 24:

Any observed difference between two sample means could in principle be either "real" or due to chance - we can never tell for certain.

But: large differences between samples from the same population are unlikely to arise by chance.

Small differences between samples are likely to have arisen by chance.

Page 25:

Problem: reducing the chances of making a Type 1 error increases the chances of making a Type 2 error, and vice versa.

Psychologists therefore compromise between the chances of making a Type 1 error, and the chances of making a Type 2 error:

We set the probability of making a Type 1 error at 0.05. When we do an experiment, we accept a difference between two samples as "real" if a difference of that size (or larger) would occur by chance no more than 5% of the time, i.e. at most five times in every hundred experiments performed.
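What a Type 1 error rate of 0.05 means can be checked by simulation: run many experiments in which H0 really is true and count how often the test rejects it anyway. The sketch below assumes normally distributed mileages and 2000 simulated experiments; both are choices made for this illustration. The mean, standard deviation, and sample size come from the sludge example.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 18.0, 4.0, 100  # H0 is true: every sample comes from this population
CRITICAL_Z = 1.96              # two-tailed cutoff at the 0.05 level
NUM_EXPERIMENTS = 2000         # assumed number of simulated experiments

standard_error = SIGMA / math.sqrt(N)

false_rejections = 0
for _ in range(NUM_EXPERIMENTS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (statistics.fmean(sample) - MU) / standard_error
    if abs(z) >= CRITICAL_Z:
        false_rejections += 1  # H0 rejected even though it is true

type1_rate = false_rejections / NUM_EXPERIMENTS
print(f"proportion of Type 1 errors: {type1_rate:.3f}")
```

The printed proportion should come out near 0.05. Raising `CRITICAL_Z` lowers it, at the cost of making real effects harder to detect (more Type 2 errors).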

Page 26:

Directional and non-directional hypotheses:

Non-directional (two-tailed) hypothesis: Merely predicts that the sample mean (X̄) will be significantly different from the population mean (µ).

[Figure: sampling distribution of possible differences, with regions X̄ < µ, X̄ = µ, and X̄ > µ]

X̄ > µ: Differences this extreme (or more) occur by chance with p = 0.025.

X̄ < µ: Differences this extreme (or more) occur by chance with p = 0.025.

Page 27:

Directional (one-tailed) hypothesis: More precise. It predicts the direction of the difference (i.e., it predicts either that X̄ is bigger than µ, or that X̄ is smaller than µ).

[Figure: sampling distribution of possible differences, with regions X̄ < µ, X̄ = µ, and X̄ > µ]

X̄ > µ: Differences this extreme (or more) occur by chance with p = 0.05.
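The practical difference between the two kinds of hypothesis is where the 5% is placed: split across both tails, or all in the predicted tail. The sketch below compares the resulting critical values and p-values, reusing z = 2.50 from the sludge example; the variable names are chosen here for illustration.

```python
from statistics import NormalDist

unit_normal = NormalDist()  # mean 0, standard deviation 1
alpha = 0.05
z_observed = 2.50           # z-score from the sludge example

# Non-directional: 0.025 in each tail.
two_tailed_critical = unit_normal.inv_cdf(1 - alpha / 2)
# Directional (predicting sample mean > population mean): 0.05 in the upper tail.
one_tailed_critical = unit_normal.inv_cdf(1 - alpha)

p_one_tailed = 1 - unit_normal.cdf(z_observed)
p_two_tailed = 2 * p_one_tailed

print(f"two-tailed critical z: {two_tailed_critical:.3f}, p = {p_two_tailed:.4f}")
print(f"one-tailed critical z: {one_tailed_critical:.3f}, p = {p_one_tailed:.4f}")
```

The one-tailed cutoff is lower, so a directional prediction is easier to confirm when the difference is in the predicted direction, but it cannot detect a difference in the opposite direction.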