Exam
• Exam starts two weeks from today.
Amusing Statistics
• Use what you know about normal distributions to evaluate this finding:
The study, published in Pediatrics, the journal of the American Academy of Pediatrics, found that among the 4,508 students in Grades 5-8 who participated, 36 per cent reported excellent school performance, 38 per cent reported good performance, 20 per cent said they were average performers, and 7 per cent said they performed below average.
Review
• The Z-test is used to compare the mean of a sample to the mean of a population
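$$Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$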
Review
• The Z-score is normally distributed
• Thus the probability of obtaining any given Z-score by random sampling is given by the Z table
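As a rough sketch of the two review points above, the Z statistic and its Z-table lookup can be computed with Python's standard library. The function names and the numbers (a sample of 36 scores with mean 105, drawn from a population with mean 100 and standard deviation 15) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Z for a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    standard_error = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / standard_error


def two_tailed_p(z):
    """Probability of a Z at least this extreme in either tail.

    NormalDist().cdf plays the role of the printed Z table.
    """
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical numbers, not taken from the slides
z = z_statistic(sample_mean=105, pop_mean=100, pop_sd=15, n=36)
p = two_tailed_p(z)
print(z)            # 2.0
print(round(p, 4))  # 0.0455
```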
Review
• We can likewise determine critical values for Z such that we would reject the null hypothesis if our computed Z-score exceeds these values
– For alpha = .05:
• Zcrit (one-tailed) = 1.645
• Zcrit (two-tailed) = 1.96
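The critical values need not come from a printed table. A minimal sketch, assuming Python's `statistics.NormalDist` as the stand-in for the Z table:

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed: all of alpha sits in a single tail
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)
# Two-tailed: alpha is split evenly between the two tails
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_crit_one_tailed, 3))  # 1.645
print(round(z_crit_two_tailed, 3))  # 1.96
```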
Confidence Intervals
• A related question you might ask:
– Suppose you’ve measured a mean and computed a standard error of that mean
– What is the range of values such that there is a 95% chance of the population mean falling within that range?
Confidence Intervals
[Figure: Gaussian (Normal) Distribution; x-axis: score (-4 to 4); y-axis: probability. Annotated with 95%, a cutoff at 1.96, an upper tail of 2.5%, and the label "True mean?"]
• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors above the observed mean
Confidence Intervals
[Figure: Gaussian (Normal) Distribution; x-axis: score (-4 to 4); y-axis: probability. Annotated with 95%, a cutoff at -1.96, a lower tail of 2.5%, and the label "True mean?"]
• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors below the observed mean
Confidence Intervals
• Thus there is a 95% chance that the true population mean falls within + or - 1.96 standard errors from a sample mean
• Likewise, there is a 95% chance that the true population mean falls within + or - 1.96 standard deviations from a single measurement
Confidence Intervals
• This is called the 95% confidence interval…and it is very useful
• It works like significance bounds…if the 95% C.I. doesn’t include the mean of a population you’re comparing your sample to, then your sample is significantly different from that population
Confidence Intervals
• Consider an example:
• You measure the concentration of mercury in your backyard to be .009 mg/kg
• The concentration of mercury in the Earth’s crust is .007 mg/kg. Let’s pretend that, when measured at many sites around the globe, the standard deviation is known to be .002 mg/kg
Confidence Intervals
• The 95% confidence interval for this mercury measurement is
$$x_{\text{backyard}} = .009\ \text{mg/kg}$$
$$\sigma = .002\ \text{mg/kg}$$
$$95\%\ \text{C.I.} = x \pm Z_{crit}(\text{two-tailed}) \times \sigma$$
$$= .009 \pm 1.96 \times .002\ \text{mg/kg}$$
$$= .0051 \rightarrow .0129$$
• This interval includes .007 mg/kg which, it turns out, is the mean concentration found in the earth’s crust in general
$$.0051 \le .007 \le .0129$$
• Thus you would conclude that your backyard isn’t artificially contaminated by mercury
Confidence Intervals
• Imagine you take 25 samples from around Alberta and you found:
$$\bar{x} = .009\ \text{mg/kg}$$
$$\sigma = .002\ \text{mg/kg}$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{.002}{\sqrt{25}} = .0004$$
• .009 +/- (1.96 x .0004) = .008216 to .009784
• This interval doesn’t include the .007 mg/kg value for the earth’s crust so you would conclude that Alberta has an artificially elevated amount of mercury in the soil
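Both mercury examples follow the same recipe, so they can be sketched with one small function. The function and constant names are made up for this illustration; the numbers are the ones from the backyard and Alberta examples.

```python
from math import sqrt

Z_CRIT_TWO_TAILED = 1.96  # alpha = .05


def confidence_interval_95(mean, sigma, n=1):
    """95% C.I. as mean +/- 1.96 standard errors.

    With n = 1 the standard error is just sigma (a single measurement).
    """
    standard_error = sigma / sqrt(n)
    margin = Z_CRIT_TWO_TAILED * standard_error
    return mean - margin, mean + margin


def contains(interval, value):
    """True if value lies inside the interval (endpoints included)."""
    low, high = interval
    return low <= value <= high


CRUST_MEAN = 0.007  # mg/kg

backyard = confidence_interval_95(0.009, 0.002)       # single measurement
alberta = confidence_interval_95(0.009, 0.002, n=25)  # mean of 25 samples

print(backyard, contains(backyard, CRUST_MEAN))  # interval includes .007
print(alberta, contains(alberta, CRUST_MEAN))    # interval excludes .007
```

The only difference between the two cases is n: averaging 25 samples shrinks the standard error from .002 to .0004, which is what pushes .007 outside the interval.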
Power
• We perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is unlikely to be due to chance (p < .05)
• We say that we have a significant result…
• But what if p > .05?
Power
• What are the two reasons why p comes out greater than .05?
– Your experiment lacked Statistical Power and you made a Type II Error
– The null hypothesis really is true
Power
• Two approaches:
– The Hopelessly Jaded Grad Student Solution
– The Wise and Well Adjusted Professor Procedure
Power
1. Hopelessly Jaded Grad Student Solution - conclude that your hypothesis was wrong and go directly to the grad student pub
Power
2. The Wise Professor Procedure - consider the several reasons why you might not have detected a significant effect
Power
• Why might p be greater than .05?
1. Small effect size: $\bar{X}$ is quite close to the mean of the population
– The effect doesn’t stand out from the variability in the data
– You might be able to increase your effect size (e.g. with a larger dose or treatment)
Power
• Why might p be greater than .05?
2. Noisy Data: $\sigma$, and therefore $\sigma_{\bar{X}}$, is quite large
– A large denominator will swamp the small effect
– Take greater care to reduce measurement errors
Power
• Why might p be greater than .05?
3. Sample Size is Too Small: $\sigma_{\bar{X}}$ is quite large because $n$ is small
– A large denominator will swamp the small effect
– Run more subjects
Power
• The solution in each case is more power:
• Power is like sensitivity - the ability to detect small effects in noisy data
• It is the complement of the Type II Error rate: power = 1 - (Type II Error rate)
• So that you know: there are equations for computing statistical power
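One such equation, for the two-tailed Z-test, can be sketched as follows. This is an illustration rather than anything taken from the slides: the function name is made up, and treating .009 mg/kg as the true mean in the mercury setting is purely an assumption for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(true_mean, null_mean, sigma, n, alpha=0.05):
    """Chance that a two-tailed Z-test rejects the null hypothesis
    when the population mean really is true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true mean sits from the null mean, in standard errors
    shift = (true_mean - null_mean) / (sigma / sqrt(n))
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Assumed scenario: true mean .009, null mean .007, sigma .002
power_one_sample = z_test_power(0.009, 0.007, 0.002, n=1)
power_25_samples = z_test_power(0.009, 0.007, 0.002, n=25)
print(round(power_one_sample, 2))   # about 0.17
print(round(power_25_samples, 3))   # about 0.999
```

All three reasons for a non-significant result show up in `shift`: a small effect shrinks the numerator, while a large sigma or a small n inflates the standard error in the denominator.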
Power
• An important point about power and the null hypothesis:
– Failing to reject the null hypothesis DOES NOT PROVE it to be true!!!
Power
• Consider an example:
– How to prove that smoking does not cause cancer:
• enroll 2 people who smoke infrequently and use an antique X-Ray camera to look for cancer
• Compare the mean cancer rate in your group (which will probably be zero) to the cancer rate in the population (which won’t be) with a Z-test
Power
• Consider an example:
– If p came out greater than .05, you still wouldn’t believe that smoking doesn’t cause cancer
– You will, however, often encounter statements such as “The study failed to find…” misinterpreted as “The study proved no effect of…”
Experimental Design
• We’ve been using examples in which a single sample is compared to a population
• Often we employ more sophisticated designs
• What are some different ways you could run an experiment?
Experimental Design
• Compare one mean to some value
– Often that value is zero
• Compare two means to each other
Experimental Design
• There are two general categories of designs for comparing two (or more) means with each other
Experimental Design
1. Repeated Measures - also called “within-subjects” comparison
• The same subjects are given pre- and post- measurements
• e.g. before and after taking a drug to lower blood pressure
• Powerful because variability between subjects is factored out
• Note that pre- and post- scores are linked - we say that they are dependent
• Note also that you could have multiple tests
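To see why factoring out between-subject variability matters, here is a small sketch with made-up blood-pressure readings for five subjects. The numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
from statistics import mean, stdev

# Hypothetical readings (made up for illustration): same five subjects,
# measured before and after taking the drug
pre = [120, 135, 150, 110, 145]
post = [115, 131, 144, 106, 139]

# Within-subject comparison: each subject serves as their own baseline
differences = [before - after for before, after in zip(pre, post)]

spread_between_subjects = stdev(pre)       # about 16.8
spread_of_differences = stdev(differences)

print(differences)            # [5, 4, 6, 4, 6]
print(mean(differences))      # 5
print(spread_of_differences)  # 1.0
```

Subjects differ from one another by much more than the drug changes any one of them, yet the pre/post differences are tightly clustered, so the drop stands out clearly.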
Experimental Design
1. Problems with Repeated-Measure design:
• Temporal effect - subjects get better/worse over time
• The act of measuring might preclude further measurement - e.g. measuring brain size via surgery
• Practice effect - subjects improve with repeated exposure to a procedure
Experimental Design
2. Between-Subjects Design
• Subjects are randomly assigned to treatment groups - e.g. drug and placebo
• Measurements are assumed to be statistically independent
Experimental Design
2. Problems with Between-Subjects design
• Can be less powerful because variability between two groups of different subjects can look like a treatment effect
• Often needs more subjects
Experimental Design
• We’ll need some statistical tests that can compare:
– One sample mean to a fixed value
– Two dependent sample means to each other (within-subject)
– Two independent sample means to each other (between-subject)
Experimental Design
• The t-test can perform each of these functions
• It also gets around a big problem with the z-test…
The Z statistic
• The Z statistic (to be compared with Zcrit)
$$Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$$
Where
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
The Z statistic
• What is the problem you will encounter in trying to use this statistic?
• Although you might have a guess about the population mean, you will almost certainly not know the population variance!
The Z statistic
• What to do?
• Could we estimate $\sigma^2$?
• What would we use and what would have to be the case for it to be useful?
• We could use our sample variance, $S^2$, to estimate the population variance $\sigma^2$
Estimating Population Variance
• Just like there are many sample means (the sampling distribution of the mean) there are many $S^2$s
• $\bar{X}$ tends to be near the value of $\mu$, but does $S^2$ tend to be near the value of $\sigma^2$?
• No. It is a biased estimator. It tends to be lower than $\sigma^2$
Estimating Population Variance
• Why is $S^2$ biased?
• The sum of the deviation scores in your sample must equal zero regardless of where they came from in the population
• This means that the deviations in your sample are somewhat more constrained than in the population
• $S^2$ has relatively fewer degrees of freedom than the entire population
Estimating Population Variance
• Specifically $S^2$ has $n - 1$ degrees of freedom
• So if we compute $S^2$ but use $n - 1$ instead of $n$ in the denominator we’ll get an unbiased estimator of $\sigma^2$
Estimating Population Variance
• Of course if you’ve already computed $S^2$ using $n$ in the denominator you can multiply by $n$ to recover the sum of squared deviations and then divide by $n - 1$
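The bias can be checked by brute force on a tiny, made-up population: list every possible sample of size 2 (drawn with replacement), compute the variance both ways, and average. This is only a sketch; the population values are hypothetical.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 4, 5]           # hypothetical tiny population
sigma_squared = pvariance(population)  # true population variance

n = 2
# Every possible ordered sample of size n, drawn with replacement
samples = list(product(population, repeat=n))

# Average of S^2 over all samples, with n in the denominator
mean_biased = mean(pvariance(s) for s in samples)
# Average over all samples, with n - 1 in the denominator
mean_unbiased = mean(variance(s) for s in samples)

print(sigma_squared)   # true variance
print(mean_biased)     # half the true variance when n = 2
print(mean_unbiased)   # matches the true variance

# Converting an already-computed S^2: multiply by n, divide by n - 1
one_sample = (1, 5)
converted = pvariance(one_sample) * n / (n - 1)
```

With n = 2 the n-denominator version averages out to (n - 1)/n of the true variance, which is exactly the shortfall the n - 1 denominator repairs.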
The t Statistic(s)
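• Using an estimated $\sigma^2$, which we’ll call $\hat{\sigma}^2$, we can create an estimate of $\sigma_{\bar{X}}$ which we’ll call $\hat{\sigma}_{\bar{X}}$
$$\hat{\sigma}_{\bar{X}} = \frac{\hat{\sigma}}{\sqrt{n}}$$
where
$$\hat{\sigma} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} = \sqrt{\frac{n S^2}{n - 1}}$$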
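A minimal sketch of the estimated standard error, assuming a small list of made-up scores. The function name is hypothetical; the arithmetic follows the definitions above, with $n - 1$ in the denominator of the estimated standard deviation.

```python
from math import sqrt
from statistics import mean, pvariance


def estimated_standard_error(scores):
    """sigma_hat / sqrt(n), where sigma_hat uses n - 1 in the denominator."""
    n = len(scores)
    x_bar = mean(scores)
    sum_of_squares = sum((x - x_bar) ** 2 for x in scores)
    sigma_hat = sqrt(sum_of_squares / (n - 1))
    return sigma_hat / sqrt(n)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
se_hat = estimated_standard_error(scores)

# Same estimate starting from S^2 computed with n in the denominator
n = len(scores)
s_squared = pvariance(scores)
se_hat_from_s2 = sqrt(n * s_squared / (n - 1)) / sqrt(n)

print(round(se_hat, 4))          # 0.7559
print(round(se_hat_from_s2, 4))  # 0.7559
```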