MATH& 146
Lesson 20
Section 2.7
Applying the Normal Model for
Hypothesis Testing
Applying the Normal Model
Earlier in this course, we used simulation techniques
to form conclusions about the population.
An alternate, more common approach is to use the
normal distribution to form conclusions.
When the sample size is sufficiently large, this
approximation generally provides us with the same
conclusions.
Standard Error
Point estimates vary from sample to sample, and
we quantify this variability with what is called the
standard error (SE).
The standard error is equal to the standard
deviation associated with the estimate.
Standard Error
The way we determine the standard error varies
from one situation to the next. However, typically it
is determined using a formula based on the
Central Limit Theorem.
For the time being, the standard error will just be
given. Formulas will be introduced in upcoming
lessons.
Opportunity Cost
In Section 2.2 (Lesson 13) we were introduced to
the opportunity cost study, which found that
students became thriftier when they were
reminded that not spending money now means the
money can be spent on other things in the future.
Let's re-analyze the data in the context of the
normal distribution and compare the results.
Opportunity Cost
The figure below summarizes the null distribution
as determined using the randomization method.
The best fitting normal distribution for the null
distribution has a mean of 0.
Opportunity Cost
The standard error is given as SE = 0.078. Recall
that the point estimate of the difference was 0.20,
as shown in the plot.
Opportunity Cost
Now, we'll use the normal distribution approach to
compute the two-tailed p-value.
Opportunity Cost
It is helpful to draw and shade a picture of the
normal distribution so we know precisely what we
want to calculate. Here we want to find the area of
the two tails representing the p-value.
P-Values Method 1
There are two approaches you can take to find the
p-value. The most direct approach is to use the
normalcdf command in your calculator, along with
the point estimate, standard error, and mean of the
null distribution.
P-Values Method 1
(Pt Est > mean)
For a two-tail test,
p-value = normalcdf(Pt Est, BIG, mean, SE) × 2
Pt Est: The point estimate, or observed value, assuming it is larger than the mean.
Pt Est, BIG: Go "Pt Est, BIG" if the point estimate is larger than the mean.
mean: The mean (null value) of the null distribution.
SE: Standard error of the point estimate.
× 2: Multiply the final result by two to account for the other tail.
P-Values Method 1
(Pt Est < mean)
For a two-tail test,
p-value = normalcdf(-BIG, Pt Est, mean, SE) × 2
Pt Est: The point estimate, or observed value, assuming it is smaller than the mean.
-BIG, Pt Est: Go "-BIG, Pt Est" if the point estimate is smaller than the mean.
mean: The mean (null value) of the null distribution.
SE: Standard error of the point estimate.
× 2: Multiply the final result by two to account for the other tail.
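For readers without a calculator at hand, the two cases above can be sketched in Python. This is only an illustration and not part of the original lesson: the helper names `normalcdf` and `two_tail_p_value` are made up here, and the standard library's `statistics.NormalDist` is assumed as a stand-in for the calculator's normalcdf command. The example values are the ones from the opportunity cost study.

```python
from statistics import NormalDist

BIG = 999  # an arbitrarily big number, standing in for "infinity"


def normalcdf(lower, upper, mean, sd):
    """Area under a normal curve between lower and upper.

    Hypothetical helper that mimics the calculator command.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


def two_tail_p_value(pt_est, mean, se):
    """Two-tail p-value using Method 1 (hypothetical helper)."""
    if pt_est > mean:
        # Point estimate above the mean: go "Pt Est, BIG".
        one_tail = normalcdf(pt_est, BIG, mean, se)
    else:
        # Point estimate below the mean: go "-BIG, Pt Est".
        one_tail = normalcdf(-BIG, pt_est, mean, se)
    # Multiply by two to account for the other tail.
    return 2 * one_tail


# Opportunity cost study: point estimate 0.20, null mean 0, SE 0.078.
p_upper = two_tail_p_value(0.20, 0, 0.078)
# A mirror-image estimate below the mean gives the same two-tail area.
p_lower = two_tail_p_value(-0.20, 0, 0.078)
print(f"{p_upper:.4f} {p_lower:.4f}")
```

Because the normal curve is symmetric, both calls should report the same p-value; the branch only decides which tail is measured before doubling.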
P-Values Method 1
For the opportunity cost study, the point estimate =
0.20, mean = 0, and SE = 0.078. Therefore,
p-value = normalcdf(0.20, 999, 0, 0.078) × 2 ≈ 0.0103
P-Values Method 2
The second approach to calculating p-values also
uses the normalcdf command, but we compute the
Z-score for the point estimate (called a test
statistic) first, then make use of the fact that Z-
scores always have a mean of 0 and standard
deviation of 1.
P-Values Method 2
For a two-tail test,
p-value = normalcdf(|Z|, BIG, 0, 1) × 2
|Z|: The absolute value of the Z score. That is, remove any negative sign.
BIG: Use some arbitrarily big number, such as 999.
0: Z scores always have a mean of 0.
1: Z scores always have a standard deviation of 1.
× 2: Multiply the final result by two to account for the other tail.
P-Values Method 2
For the opportunity cost study, the point estimate =
0.20, mean = 0, and SE = 0.078. Therefore the Z
score is
$$Z = \frac{\text{point estimate} - \text{mean}}{\text{standard error}} = \frac{0.20 - 0}{0.078} \approx 2.56$$
The p-value then is
p-value = normalcdf(2.56, 999, 0, 1) × 2 ≈ 0.0105
Opportunity Cost
Notice that the two methods give nearly the same
p-value (0.0103 for Method 1 vs. 0.0105 for
Method 2).
Technically, these two methods should give
identical p-values, but the extra rounding of the Z
score changed the answer slightly.
Either way, both values are about the same as we
got from the randomization approach (the two-tail
p-value was about 0.012).
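The effect of rounding the Z score can be checked directly. The sketch below is an illustration only, assuming Python's standard-library `statistics.NormalDist` in place of the calculator; the helper name `two_tail_from_z` is hypothetical. It computes the Method 2 p-value once with the unrounded Z score and once with Z rounded to two decimal places.

```python
from statistics import NormalDist

point_estimate = 0.20
null_mean = 0
se = 0.078

std_normal = NormalDist(mu=0, sigma=1)  # Z scores: mean 0, standard deviation 1


def two_tail_from_z(z):
    """Two-tail p-value from a Z score (hypothetical helper)."""
    upper_tail = 1 - std_normal.cdf(abs(z))
    return 2 * upper_tail


z_exact = (point_estimate - null_mean) / se
z_rounded = round(z_exact, 2)

p_exact = two_tail_from_z(z_exact)      # unrounded Z, same idea as Method 1
p_rounded = two_tail_from_z(z_rounded)  # Z rounded to two decimals, as in Method 2

print(f"Z = {z_exact:.4f}  p-value = {p_exact:.4f}")
print(f"Z = {z_rounded:.2f}    p-value = {p_rounded:.4f}")
```

The two printed p-values should come out close to 0.0103 and 0.0105, which suggests the small gap between the methods is due to rounding alone.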
Opportunity Cost
As before, since the p-value is less than 0.05, we
conclude that the treatment did indeed impact
students' spending.
Medical Consultant
In Section 2.4 (Lesson 15) we learned about a medical
consultant who reported that only 3 of her 62 clients
who underwent a liver transplant had complications,
a rate lower than the typical complication rate
of 0.10.
As in the other case studies, we identified a suitable
null distribution using a simulation approach, as shown
below.
Medical Consultant
Here we have added the best-fitting normal curve to
the figure, which has a mean of 0.10. Borrowing a
formula that we'll encounter in Chapter 3 (Lesson 22),
the standard error of the distribution was also
computed: SE = 0.038.
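Since the standard error is the standard deviation associated with the estimate, the value 0.038 can be sanity-checked by simulation. The sketch below is an illustration under stated assumptions, not the lesson's own procedure: it assumes each of the 62 clients independently has a complication with probability 0.10 under the null hypothesis, and the seed and number of simulations are arbitrary choices.

```python
import random
from statistics import pstdev

random.seed(146)  # arbitrary seed so the run is repeatable

n_clients = 62
null_rate = 0.10
num_sims = 10_000  # arbitrary number of simulated samples

simulated_proportions = []
for _ in range(num_sims):
    # Count simulated complications among 62 clients under the null rate.
    complications = sum(random.random() < null_rate for _ in range(n_clients))
    simulated_proportions.append(complications / n_clients)

# The standard error is the standard deviation of the simulated estimates.
se_simulated = pstdev(simulated_proportions)
print(f"Simulated SE: {se_simulated:.3f}")
```

The simulated value should land near the given SE of 0.038, though it will vary slightly with the seed and the number of simulations.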
Medical Consultant
In the previous analysis, we obtained a p-value of
0.2444, and we will try to reproduce that p-value using
the normal distribution approach.
However, before we begin, we want to point out a
simple detail that is easy to overlook: the null
distribution we earlier generated is slightly skewed,
and the distribution isn't that smooth.
Medical Consultant
In fact, the normal distribution only sort-of fits this
model. We'll discuss this discrepancy more in a
moment, but for the time being we will continue
with a normal distribution.
We'll again begin by creating a picture. Here is a
normal distribution centered at 0.10 with a
standard error of 0.038.
P-Values Method 1
Again, for this study, the point estimate = 3/62,
mean = 0.10, and SE = 0.038. Since the point
estimate is less than the mean,
p-value = normalcdf(-999, 3/62, 0.10, 0.038) × 2 ≈ 0.1744
P-Values Method 2
For the medical consultant study, the Z score is
$$Z = \frac{\text{point estimate} - \text{mean}}{\text{standard error}} = \frac{3/62 - 0.10}{0.038} \approx -1.358$$
The p-value then is
p-value = normalcdf(1.358, 999, 0, 1) × 2 ≈ 0.1745
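Both calculations for the medical consultant study can be reproduced in a few lines. This is a sketch only, assuming Python's standard-library `statistics.NormalDist` in place of the calculator's normalcdf command. Because the point estimate is below the mean, Method 1 measures the lower tail, and Method 2 drops the negative sign from the Z score.

```python
from statistics import NormalDist

point_estimate = 3 / 62
null_mean = 0.10
se = 0.038

# Method 1: area below the point estimate under the null distribution,
# doubled to account for the other tail.
null_dist = NormalDist(mu=null_mean, sigma=se)
p_method_1 = 2 * null_dist.cdf(point_estimate)

# Method 2: convert to a Z score (negative here), then use the
# standard normal distribution with the absolute value of Z.
z = (point_estimate - null_mean) / se
p_method_2 = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.3f}")
print(f"Method 1 p-value: {p_method_1:.4f}")
print(f"Method 2 p-value: {p_method_2:.4f}")
```

With the unrounded Z score the two methods should agree to many decimal places, at roughly 0.1744.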
Conditions for Inference
Both methods give a p-value of about 0.1744. This is
the estimated p-value for the hypothesis test.
However, there's a problem: this is very different from
the earlier (simulated) p-value we computed: 0.2444.
The discrepancy is explained by the normal model's poor
representation of the null distribution. As noted earlier,
the null distribution from the simulations is not very
smooth, and the distribution itself is slightly skewed.
That's the bad news.
Conditions for Inference
The good news is that we can foresee these problems
using some simple checks.
Previously, we noted that the two common
requirements to apply the Central Limit Theorem are
(1) independence, and (2) a large enough sample size
(in this case, at least 10 successes and 10 failures).
The guidelines for this particular situation would have
alerted us that the normal model was a poor
approximation.
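The sample size check can be written out as a short calculation. The sketch below is an illustration with one assumption worth flagging: it counts the expected successes and failures under the null value (62 clients at a rate of 0.10). The function name `success_failure_check` is hypothetical, and independence is not something code can verify.

```python
def success_failure_check(n, p, minimum=10):
    """Check the 'at least 10 successes and 10 failures' guideline.

    Hypothetical helper; counts are the expected counts under rate p.
    """
    expected_successes = n * p
    expected_failures = n * (1 - p)
    passes = expected_successes >= minimum and expected_failures >= minimum
    return expected_successes, expected_failures, passes


# Medical consultant study: 62 clients, null complication rate of 0.10.
successes, failures, passes = success_failure_check(62, 0.10)
print(f"Expected complications: {successes:.1f}")
print(f"Expected non-complications: {failures:.1f}")
print("Normal model reasonable?", passes)
```

Only about 6.2 complications are expected, short of the 10 required, so the check fails. The observed count of 3 complications falls short as well.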
Conditions for Inference
The success story in this section was the application of
the normal model in the context of the opportunity cost
data.
However, the biggest lesson comes from our failed
attempt to use the normal approximation in the
medical consultant case study.
Statistical techniques are like a carpenter's tools.
When used responsibly, they can produce amazing
and precise results.
Conditions for Inference
However, if the tools are applied irresponsibly or under
inappropriate conditions, they will produce unreliable
results.
For this reason, with every statistical method that we
introduce in future lessons, we will carefully outline the
conditions under which the method can reasonably be used.
These conditions should be checked in each
application of the technique.