Exam Exam starts two weeks from today. Amusing Statistics Use what you know about normal...

Exam

• Exam starts two weeks from today

Amusing Statistics

• Use what you know about normal distributions to evaluate this finding:

The study, published in Pediatrics, the journal of the American Academy of Pediatrics, found that among the 4,508 students in Grades 5-8 who participated, 36 per cent reported excellent school performance, 38 per cent reported good performance, 20 per cent said they were average performers, and 7 per cent said they performed below average.

Review

• The Z-test is used to compare the mean of a sample to the mean of a population

$$ Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} $$

and

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $$

Review

• The Z-score is normally distributed

• Thus the probability of obtaining any given Z-score by random sampling is given by the Z table

Review

• We can likewise determine critical values for Z such that we would reject the null hypothesis if our computed Z-score exceeds these values

– For alpha = .05:

• Zcrit (one-tailed) = 1.645

• Zcrit (two-tailed) = 1.96
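The critical values above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `z_statistic` and `two_tailed_p` and the example numbers (sample mean 105, population mean 100, σ = 15, n = 36) are hypothetical and are not taken from the lecture.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1)
std_normal = NormalDist(mu=0.0, sigma=1.0)
alpha = 0.05

# Critical values: the Z beyond which only alpha (or alpha / 2) of the area lies
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - population mean) / (sigma / sqrt(n))."""
    standard_error = pop_sd / n ** 0.5
    return (sample_mean - pop_mean) / standard_error


def two_tailed_p(z):
    """Probability of a Z-score at least this extreme in either tail."""
    return 2 * (1 - std_normal.cdf(abs(z)))


# Hypothetical example values, chosen only for illustration
z = z_statistic(sample_mean=105, pop_mean=100, pop_sd=15, n=36)
print(round(z_crit_one_tailed, 3), round(z_crit_two_tailed, 3))
print(round(z, 2), round(two_tailed_p(z), 4))
```

With these made-up numbers Z comes out at 2.0, just past the two-tailed critical value, so the null hypothesis would be rejected at alpha = .05.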

Confidence Intervals

• A related question you might ask:

– Suppose you’ve measured a mean and computed a standard error of that mean

– What is the range of values such that there is a 95% chance of the population mean falling within that range?

[Figure: Gaussian (Normal) Distribution, probability plotted against score from -4 to 4. The 2.5% tail above +1.96 is marked "True mean?"; the remaining 95% lies below it.]

• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors above the observed mean

[Figure: Gaussian (Normal) Distribution, probability plotted against score from -4 to 4. The 2.5% tail below -1.96 is marked "True mean?"; the remaining 95% lies above it.]

• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors below the observed mean

• Thus there is a 95% chance that the true population mean falls within ±1.96 standard errors of a sample mean

• Likewise, there is a 95% chance that the true population mean falls within ±1.96 standard deviations of a single measurement


• This is called the 95% confidence interval…and it is very useful

• It works like significance bounds…if the 95% C.I. doesn’t include the mean of a population you’re comparing your sample to, then your sample is significantly different from that population


• Consider an example:

• You measure the concentration of mercury in your backyard to be .009 mg/kg

• The concentration of mercury in the Earth’s crust is .007 mg/kg. Let’s pretend that, when measured at many sites around the globe, the standard deviation is known to be .002 mg/kg


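• The 95% confidence interval for this mercury measurement is

$$ x_{\text{backyard}} = .009\ \text{mg/kg}, \qquad \sigma = .002\ \text{mg/kg} $$

$$ 95\%\ \text{C.I.} = x \pm Z_{crit}(\text{two-tailed}) \times \sigma = .009 \pm 1.96 \times .002\ \text{mg/kg} = .0051 \rightarrow .0129\ \text{mg/kg} $$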

• This interval includes .007 mg/kg which, it turns out, is the mean concentration found in the earth’s crust in general

• Thus you would conclude that your backyard isn’t artificially contaminated by mercury


$$ .0051 \le .007 \le .0129 $$

• Imagine you take 25 samples from around Alberta and you found:

$$ \bar{X} = .009\ \text{mg/kg}, \qquad \sigma = .002\ \text{mg/kg} $$

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{.002}{\sqrt{25}} = .0004 $$

• The 95% confidence interval for the mean is .009 ± (1.96 × .0004) = .008216 to .009784 mg/kg

• This interval doesn’t include the .007 mg/kg value for the earth’s crust so you would conclude that Alberta has an artificially elevated amount of mercury in the soil
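Both mercury intervals can be reproduced in a few lines. This is a minimal sketch that assumes the population standard deviation of .002 mg/kg is known, as the example pretends; the function name `confidence_interval` is hypothetical.

```python
from math import sqrt

Z_CRIT_TWO_TAILED = 1.96  # two-tailed critical value for alpha = .05
CRUST_MEAN = 0.007        # mg/kg, mean concentration in the Earth's crust
SIGMA = 0.002             # mg/kg, population standard deviation (assumed known)


def confidence_interval(mean, sigma, n=1, z_crit=Z_CRIT_TWO_TAILED):
    """Return (low, high) for mean +/- z_crit * sigma / sqrt(n)."""
    margin = z_crit * sigma / sqrt(n)
    return mean - margin, mean + margin


# n = 1 is the single backyard measurement; n = 25 is the Alberta sample
for label, n in [("backyard (1 measurement)", 1), ("Alberta (25 samples)", 25)]:
    low, high = confidence_interval(0.009, SIGMA, n)
    includes_crust = low <= CRUST_MEAN <= high
    print(f"{label}: {low:.6f} to {high:.6f}, includes crust mean: {includes_crust}")
```

The same observed value of .009 mg/kg leads to opposite conclusions because dividing σ by the square root of n shrinks the interval by a factor of 5 when n = 25.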


Power

• Suppose we perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is unlikely to be due to chance, with p < .05

• We say that we have a significant result…

• But what if p is > .05?

Power

• What are the two reasons why p comes out greater than .05?

– Your experiment lacked Statistical Power and you made a Type II Error

– The null hypothesis really is true

Power

• Two approaches:

– The Hopelessly Jaded Grad Student Solution

– The Wise and Well Adjusted Professor Procedure

Power

1. Hopelessly Jaded Grad Student Solution - conclude that your hypothesis was wrong and go directly to the grad student pub

Power

- This is not the recommended course of action

Power

2. The Wise Professor Procedure - consider the several reasons why you might not have detected a significant effect

Power

- recommended by wise professors the world over

Power

• Why might p be greater than .05 ?

• Recall that:

$$ Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} $$

and

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $$

1. Small effect size: $\bar{X}$ is quite close to the mean of the population

– The effect doesn’t stand out from the variability in the data

– You might be able to increase your effect size (e.g. with a larger dose or treatment)

2. Noisy data: $\sigma$, and therefore $\sigma_{\bar{X}}$, is quite large

– A large denominator will swamp the small effect

– Take greater care to reduce measurement errors

3. Sample size is too small: $\sigma_{\bar{X}}$ is quite large because $n$ is small

– A large denominator will swamp the small effect

– Run more subjects
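The three cases can be seen by changing one input of the Z formula at a time. This is a rough sketch that starts from the Alberta mercury numbers as a baseline; the altered values (.0072 mg/kg, σ = .02, n = 1) are hypothetical and chosen only to make each effect visible.

```python
from math import sqrt


def z_statistic(sample_mean, pop_mean, sigma, n):
    """Z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - pop_mean) / (sigma / sqrt(n))


# Baseline: the Alberta example (mean .009, population mean .007, sigma .002, n 25)
baseline = z_statistic(0.009, 0.007, 0.002, 25)

# 1. Small effect size: the sample mean sits close to the population mean
small_effect = z_statistic(0.0072, 0.007, 0.002, 25)

# 2. Noisy data: sigma is ten times larger
noisy = z_statistic(0.009, 0.007, 0.02, 25)

# 3. Sample size too small: a single measurement
small_n = z_statistic(0.009, 0.007, 0.002, 1)

for name, z in [("baseline", baseline), ("small effect", small_effect),
                ("noisy data", noisy), ("small n", small_n)]:
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"{name}: Z = {z:.2f} ({verdict} at alpha = .05, two-tailed)")
```

Only the baseline clears the two-tailed critical value of 1.96; each single change pulls Z back under it.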

Power

• The solution in each case is more power:

• Power is like sensitivity - the ability to detect small effects in noisy data

• It is the complement of the Type II error rate: power = 1 - β, where β is the probability of a Type II error

• So that you know: there are equations for computing statistical power
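One such equation, for a two-tailed Z-test with known σ, is sketched below. It is not taken from the lecture: it assumes you can state the true difference between the means in advance, and the function name `z_test_power` is hypothetical. Under that assumption the Z statistic is normally distributed around (true difference) / (σ / √n), and power is the area of that shifted distribution lying beyond ±Zcrit.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(true_diff, sigma, n, alpha=0.05):
    """Power of a two-tailed Z-test when the true mean differs from the
    null-hypothesis mean by true_diff (sigma assumed known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the sampling distribution of Z is shifted away from zero
    shift = true_diff / (sigma / sqrt(n))
    # Probability that Z lands beyond either critical value
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Mercury example: a true elevation of .002 mg/kg with sigma = .002 mg/kg
for n in (1, 5, 25):
    print(f"n = {n:2d}: power = {z_test_power(0.002, 0.002, n):.3f}")
```

When the true difference is zero the function returns alpha, which is the Type I error rate; as n grows, power climbs toward 1.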

Power

• An important point about power and the null hypothesis:

– Failing to reject the null hypothesis DOES NOT PROVE it to be true!!!

Power


– How to prove that smoking does not cause cancer:

• enroll 2 people who smoke infrequently and use an antique X-Ray camera to look for cancer

• Compare the mean cancer rate in your group (which will probably be zero) to the cancer rate in the population (which won’t be) with a Z-test

Power


– If p came out greater than .05, you still wouldn’t believe that smoking doesn’t cause cancer

– You will, however, often encounter statements such as “The study failed to find…” misinterpreted as “The study proved no effect of…”

Experimental Design

• We’ve been using examples in which a single sample is compared to a population

• Often we employ more sophisticated designs

• What are some different ways you could run an experiment?

Experimental Design

• Compare one mean to some value

– Often that value is zero

• Compare two means to each other

Experimental Design

• There are two general categories of comparing two (or more) means with each other

Experimental Design

1. Repeated Measures - also called “within-subjects” comparison

• The same subjects are given pre- and post- measurements

• e.g. before and after taking a drug to lower blood pressure

• Powerful because variability between subjects is factored out

• Note that pre- and post- scores are linked - we say that they are dependent

• Note also that you could have multiple tests

Experimental Design

1. Problems with Repeated-Measure design:

• Practice/Temporal effect - subjects get better/worse over time

• The act of measuring might preclude further measurement - e.g. measuring brain size via surgery

• Practice effect - subjects improve with repeated exposure to a procedure

Experimental Design

2. Between-Subjects Design

• Subjects are randomly assigned to treatment groups - e.g. drug and placebo

• Measurements are assumed to be statistically independent

Experimental Design

2. Problems with Between-Subjects design

• Can be less powerful because variability between two groups of different subjects can look like a treatment effect

• Often needs more subjects

Experimental Design

• We’ll need some statistical tests that can compare:

– One sample mean to a fixed value

– Two dependent sample means to each other (within-subject)

– Two independent sample means to each other (between-subject)

Experimental Design

• The t-test can perform each of these functions

• It also gets around a big problem with the z-test…

Problems with Z

and what to do instead

The Z statistic

• The Z statistic (to compare with Zcrit):

$$ Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} $$

where

$$ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} $$

The Z statistic

• What is the problem you will encounter in trying to use this statistic?

• Although you might have a guess about the population mean, you will almost certainly not know the population variance!

The Z statistic

• What to do?

• Could we estimate $\sigma^2$?

• What would we use and what would have to be the case for it to be useful?

• We could use our sample variance, $S^2$, to estimate the population variance $\sigma^2$

Estimating Population Variance

• Just like there are many sample means (the sampling distribution of the mean) there are many $S^2$s

• $\bar{X}$ tends to be near the value of $\mu$, but does $S^2$ tend to be near the value of $\sigma^2$?

• No. It is a biased estimator. It tends to be lower than $\sigma^2$


• Why is $S^2$ biased?

• The sum of the deviation scores in your sample must equal zero regardless of where they came from in the population

• This means that the deviations in your sample are somewhat more constrained than in the population

• $S^2$ has relatively fewer degrees of freedom than the entire population

• Specifically, $S^2$ has n - 1 degrees of freedom

• So if we compute $S^2$ but use n - 1 instead of n in the denominator, we’ll get an unbiased estimator of $\sigma^2$

• Of course, if you’ve already computed $S^2$ using n in the denominator, you can multiply by n to recover the sum of squared deviations and then divide by n - 1
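The bias can be seen in a quick simulation. This is an illustrative sketch, not part of the lecture: it draws many small samples from a normal population with a known variance of 1 and averages the two versions of the estimate. The sample size, number of trials, seed, and function names are all arbitrary assumptions.

```python
import random


def variance_n(sample):
    """Sample variance S^2 with n in the denominator (the biased form)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / n


def average_variance_estimates(n=5, trials=20000, seed=0):
    """Average S^2 over many samples of size n drawn from a population
    with variance 1, using both the n and the n - 1 denominators."""
    rng = random.Random(seed)
    biased_total = 0.0
    corrected_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s2 = variance_n(sample)
        biased_total += s2
        # Multiply by n to recover the sum of squares, then divide by n - 1
        corrected_total += s2 * n / (n - 1)
    return biased_total / trials, corrected_total / trials


biased, corrected = average_variance_estimates()
print(f"average S^2 with n in the denominator:     {biased:.3f}")
print(f"average S^2 with n - 1 in the denominator: {corrected:.3f}")
```

With n = 5 and a true variance of 1, the first average should land near (n - 1)/n = 0.8 and the second near 1.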

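The t Statistic(s)

• Using an estimated $\sigma^2$, which we’ll call $\hat{\sigma}^2$, we can create an estimate of $\sigma_{\bar{X}}$, which we’ll call $\hat{\sigma}_{\bar{X}}$

$$ \hat{\sigma}_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}} $$

where

$$ \hat{\sigma} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{n S^2}{n - 1}} $$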

The t Statistic(s)

• Using $\hat{\sigma}_{\bar{X}}$ instead of $\sigma_{\bar{X}}$, we get a statistic that isn’t from a normal (Z) distribution - it is from a family of distributions called t

$$ t_{n-1} = \frac{\bar{X} - \mu_{\bar{X}}}{\hat{\sigma}_{\bar{X}}} $$
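The t statistic for a single sample can be computed as sketched below. Python's `statistics.stdev` already uses the n - 1 denominator, so it plays the role of the estimated σ. The data values and the function name `t_statistic` are hypothetical; looking up the critical value for the resulting degrees of freedom is left to a t table.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, pop_mean):
    """Return (t, degrees of freedom) for one sample against a fixed mean."""
    n = len(sample)
    sigma_hat = stdev(sample)          # estimated sigma, n - 1 denominator
    se_hat = sigma_hat / sqrt(n)       # estimated standard error of the mean
    t = (mean(sample) - pop_mean) / se_hat
    return t, n - 1


# Hypothetical measurements, for illustration only
sample = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = t_statistic(sample, pop_mean=2.0)
print(f"t = {t:.4f} with {df} degrees of freedom")
```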
