1 Introduction to Hypothesis Testing Chapter 11. 2 Introduction The purpose of hypothesis testing is...
-
date post
15-Jan-2016 -
1
Introduction to Hypothesis Testing
Chapter 11
2
Introduction
• The purpose of hypothesis testing is to determine whether there is enough statistical evidence supporting a certain belief about a parameter.
• Examples
– Is there statistical evidence in a random sample of potential customers that supports the hypothesis that more than p% of all potential customers will purchase a new product?
– Is the hypothesis that a certain drug is effective supported by the level of improvement in patients’ conditions after treatment with the drug, compared with that of another group of patients who were given a placebo?
3
• Two hypotheses are defined.
H0: The null hypothesis. Under this hypothesis we specify our current belief about the parameter we test (μ = 170, p = .4, etc.).
H1: The alternative hypothesis. Under this hypothesis we specify a range of values for the parameter tested (μ > 170; p ≠ .4; etc.) effected by some action taken. This is the hypothesis we try to prove!
11.1 Concepts of Hypothesis Testing
4
• The two hypotheses are stated, and a test is run to determine whether a sample statistic supports the rejection of H0 in favor of H1.
Concepts of Hypothesis Testing
H0: μ = 170
H1: μ > 170
5
The Concept of Hypothesis Testing
Let’s assume H0 is true: μ = 170.
A sample is drawn. Assume the sample mean is x̄ = 180.
If x̄ = 180, we have little incentive to believe μ > 170, because x̄ and μ are relatively close.
[Figure: sampling distribution of x̄ centered at μ = 170, with x̄ = 180 marked.]
6
The Concept of Hypothesis Testing
Let’s assume H0 is true: μ = 170.
A sample is drawn. Now assume the sample mean is x̄ = 250.
If x̄ = 250, we have much more incentive to believe μ > 170, because x̄ falls far above μ.
The question is: how far is far? Is 250 sufficiently larger than 170 for us to believe that μ > 170?
[Figure: sampling distribution of x̄ centered at μ = 170, with x̄ = 250 marked far in the right tail.]
7
The Concept of Hypothesis Testing
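Let’s assume H0 is true: μ = 170.
You may want to think about it as follows:
With μ = 170, the right-tail area is the probability that x̄ ≥ 250 when μ = 170.
If μ were greater than 170, the right-tail area is the probability that x̄ ≥ 250 when μ > 170.
As you can see, the event x̄ ≥ 250 becomes more likely when μ > 170.
[Figure: sampling distribution of x̄ centered at μ = 170 and a second one shifted right (μ > 170), each with the area beyond x̄ = 250 shaded.]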
8
The Concept of Hypothesis Testing
• We’ll look next at the probability that x̄ ≥ 250 as a tool to help decide whether we should reject H0.
• This idea will be further discussed (with a somewhat more computational flavor) as Example 1 is presented next.
• Pay attention!
9
11.2 Testing the Population Mean when the Population Standard Deviation is Known
• Example 1: Department store new billing system
– A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
– A sample of 400 accounts has a mean of $178.
– If the accounts are approximately normally distributed with σ = $65, can we conclude that the new system will be cost-effective? (Can we conclude from the sample result that the population mean of the accounts is greater than 170?)
10
Testing the Population Mean (σ is Known)
• Example 1 - Solution
– The population of interest is the credit accounts at the store.
– We want to show that the mean account for all customers is greater than $170.
H1: μ > 170 (this is what you want to prove)
– The null hypothesis must specify the values of the parameter not included in H1.
H0: μ ≤ 170
11
Testing the Population Mean (σ is Known)
• To better understand the hypothesis testing concept, let us ask the following question:
– If H0 is true (μ = 170), how likely is it that a sample of 400 accounts has a sample mean at least as large as 178?
– Answer: By the central limit theorem,
$$P(\bar{x} \ge 178) = P\left(Z \ge \frac{178 - 170}{65/\sqrt{400}}\right) = 0.0069$$
– To illustrate: by sheer chance, out of 10,000 samples of 400 accounts each, only about 69 samples will have a sample mean of 178 or more, if indeed μ = 170.
– It seems there must be another reason (rather than just “chance”) why the event x̄ ≥ 178 has occurred.
– Most likely μ > 170, which better explains why x̄ ≥ 178. That is, H0 should be rejected in favor of H1.
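The tail probability for Example 1 (sample mean 178, n = 400, σ = 65, hypothesized mean 170) can be checked numerically. The following is a minimal sketch using Python's standard library; the function name `right_tail_probability` is an illustrative choice, not something taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_probability(sample_mean, mu0, sigma, n):
    """P(x-bar >= sample_mean) when the population mean is mu0 and sigma is known."""
    # Standardize the sample mean using the standard error sigma / sqrt(n).
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Area under the standard normal curve to the right of z.
    return 1.0 - NormalDist().cdf(z)

p = right_tail_probability(178, 170, 65, 400)
print(round(p, 4))               # 0.0069
print(round(p * 10_000))         # about 69 samples out of 10,000
```

The second line of output ties back to the "69 out of 10,000 samples" illustration.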
12
Types of Errors
• When testing hypotheses, two types of errors may occur when deciding whether to reject H0 based on the sample result.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
13
Type I and Type II Errors in Example 1
• Example 1 - continued
– Type I error: Believe that μ > 170 when the real value of μ is 170 (reject H0 in favor of H1 when H0 is true).
– Type II error: Believe that μ ≤ 170 when the real value of μ is greater than 170 (do not reject H0 when it is false).
14
Controlling the probability of committing a type I error
• Recall: H0: μ ≤ 170, H1: μ > 170. Since the alternative hypothesis has the form μ > μ0, H0 is rejected if x̄ is sufficiently large!
Our job is to determine a critical value for the sample mean. H0 is rejected if the sample mean exceeds that critical value.
[Figure: sampling distribution of x̄ under H0, centered at μ = 170, with a critical value marked in the right tail.]
15
Controlling the probability of committing a type I error
• Recall: H0: μ ≤ 170, H1: μ > 170.
Note: x̄ may exceed the critical value (leading to the rejection of H0) even though the population mean is still 170. We don’t want the probability of this event to exceed some acceptable value (α).
So how do we determine this critical value? We turn to the type I error and limit the probability that it occurs.
[Figure: sampling distribution of x̄ under H0, centered at μ = 170, with a critical value marked in the right tail.]
16
Approaches to Testing
• There are two approaches to testing whether the sample mean supports the alternative hypothesis (H1):
– The rejection region method, which is mandatory for manual testing (but can also be used when testing is supported by statistical software).
– The p-value method, which is mostly used when statistical software is available.
• Both involve an upper limit we set on the probability of committing a type I error.
17
The null hypothesis is rejected in favor of the alternative hypothesis if a test statistic falls in the rejection region.
The Rejection Region Method
18
Example 1 – solution continued
• Recall: H0: μ ≤ 170, H1: μ > 170.
• Define a critical value x̄_L for x̄ that is just large enough to reject the null hypothesis.
• Reject the null hypothesis if x̄ > x̄_L.
The Rejection Region Method of a Right Hand Tail Test
19
• Let the probability of committing a type I error be α (also called the significance level).
• Find a critical value of the sample mean that is just large enough to guarantee that the actual probability of committing a type I error does not exceed α.
Determining the Critical Value for the Rejection Region of a Right Hand Tail Test
20
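Example 1 – solution continued
Determining the Critical Value for a Right Hand Tail Test
P(commit a type I error) = P(reject H0 when H0 is true) = P(x̄ > x̄_L when μ = 170) = α
From the central limit theorem:
$$P\left(Z > \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}\right) = \alpha$$
[Figure: sampling distribution of x̄ centered at μ = 170, with right-tail area α beyond x̄_L.]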
21
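Example 1 – solution continued
Determining the Critical Value for a Right Hand Tail Test
$$\text{Since } P(Z > z_\alpha) = \alpha \text{ and } P\left(Z > \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}\right) = \alpha, \quad \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}} = z_\alpha$$
[Figure: sampling distribution of x̄ centered at μ = 170, with right-tail area α beyond x̄_L.]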
22
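Determining the Critical Value for a Right Hand Tail Test
Example 1 – solution continued
$$\frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}} = z_\alpha \quad\Rightarrow\quad \bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}} \quad \text{(simple algebra)}$$
$$\bar{x}_L = 170 + z_\alpha \frac{65}{\sqrt{400}}$$
If we select α = 0.05, then z_.05 = 1.645 and
$$\bar{x}_L = 170 + 1.645 \frac{65}{\sqrt{400}} = 175.34$$
[Figure: sampling distribution of x̄ centered at μ = 170, with right-tail area α = .05 beyond x̄_L.]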
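The critical value for a right hand tail test can also be computed directly. This is a minimal sketch with Python's standard library; `right_tail_critical_value` is a hypothetical helper name, and the exact z value from `inv_cdf` differs slightly from the table value 1.645 used on the slide.

```python
from math import sqrt
from statistics import NormalDist

def right_tail_critical_value(mu0, sigma, n, alpha):
    """Critical sample mean x_L = mu0 + z_alpha * sigma / sqrt(n) for a right-tail test."""
    # z_alpha leaves an area of alpha in the right tail of the standard normal.
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    return mu0 + z_alpha * sigma / sqrt(n)

x_L = right_tail_critical_value(170, 65, 400, 0.05)
print(round(x_L, 2))   # 175.35, close to the 175.34 reported on the slide
print(178 > x_L)       # True: the sample mean falls in the rejection region
```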
23
Determining the Critical Value for a Right Hand Tail Test
Reject the null hypothesis if x̄ > 175.34.
Conclusion: Since the sample mean (178) is greater than the critical value of 175.34, there is sufficient evidence to infer that the mean monthly balance is greater than $170 at the 5% significance level.
24
Determining the Critical Value for a Right Hand Tail Test
Interpretation: The null hypothesis is rejected in favor of the alternative hypothesis because the sample mean falls in the rejection region. We may still be in error when rejecting the null hypothesis, since μ could be 170, but the chance that we make such a mistake is no greater than 5% (the significance level).
25
The standardized test statistic
– Instead of using the statistic x̄, we can use the standardized value z:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
– If the hypotheses are H0: μ = μ0 and H1: μ > μ0, then the rejection region is z > z_α.
26
The standardized test statistic
• Example 1 - continued
– We redo this example using the standardized test statistic.
Recall: H0: μ ≤ 170, H1: μ > 170
– Test statistic:
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{178 - 170}{65/\sqrt{400}} = 2.46$$
– Rejection region: z > z_.05 = 1.645.
27
The standardized test statistic
• Example 1 - continued
Reject the null hypothesis if Z > 1.645.
Conclusion: Since Z = 2.46 > 1.645, reject the null hypothesis in favor of the alternative hypothesis.
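The standardized version of the test for Example 1 can be sketched in a few lines. The helper name `z_statistic` is illustrative; the numbers (178, 170, 65, 400, α = 0.05) are those of Example 1.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardized test statistic z = (x-bar - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

alpha = 0.05
z = z_statistic(178, 170, 65, 400)
z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # right-tail critical value, about 1.645

reject_h0 = z > z_alpha
print(round(z, 2), round(z_alpha, 3), reject_h0)   # 2.46 1.645 True
```

Working with z instead of x̄ gives the same decision; only the scale of the rejection region changes.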
28
• Ask the question: How probable is it to obtain a sample mean at least as extreme as 178, if the population mean is 170 (H0 is true)?
The P-value Method
29
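P-value method
The probability of observing a test statistic at least as extreme as 178, given that μ = 170, is
$$P(\bar{x} \ge 178 \text{ when } \mu = 170) = P\left(z \ge \frac{178 - 170}{65/\sqrt{400}}\right) = P(z \ge 2.4615) = .0069$$
This probability is the p-value.
[Figure: sampling distribution centered at μ_x̄ = 170, with the right-tail area beyond x̄ = 178 shaded.]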
30
Interpreting the p-value
Because the probability that the sample mean will assume a value of more than 178 when μ = 170 is so small (.0069), there are reasons to believe that μ > 170.
Note how the event x̄ ≥ 178 is rare under H0, when μ_x̄ = 170, but it becomes more probable under H1, when μ_x̄ > 170.
[Figure: two sampling distributions, H0: μ_x̄ = 170 and H1: μ_x̄ > 170, with x̄ = 178 marked.]
31
We can conclude that the smaller the p-value the more statistical evidence exists to support the alternative hypothesis.
Interpreting the p-value
32
The p-value provides information about the amount of statistical evidence that supports the alternative hypothesis.
The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
P-value – Summary
33
• Describing the p-value
– If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
– If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
– If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
– If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.
Interpreting the p-value
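The verbal scale above can be written as a small lookup. This is only a sketch: the function name `describe_evidence` is hypothetical, and the treatment of p-values that fall exactly on 1%, 5% or 10% is an assumption, since the slide does not say which side the boundaries belong to.

```python
def describe_evidence(p_value):
    """Map a p-value to the verbal scale: overwhelming / strong / weak / none."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Assumption: each boundary value is assigned to the weaker category.
    if p_value < 0.01:
        return "overwhelming"
    if p_value < 0.05:
        return "strong"
    if p_value < 0.10:
        return "weak"
    return "none"

print(describe_evidence(0.0069))   # overwhelming (the p-value found in Example 1)
```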
34
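The p-value and the rejection region methods
– The p-value can be used when making decisions based on rejection region methods as follows:
– Compare the p-value to α. Reject the null hypothesis only if the p-value < α; otherwise, do not reject the null hypothesis.
Note: 0.0069 < 0.05!
[Figure: sampling distribution centered at μ_x̄ = 170; the critical value x̄_L = 175.34 cuts off a right-tail area of α = 0.05, and the sample mean x̄ = 178 cuts off a right-tail area equal to the p-value = 0.0069.]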
35
Left Hand Tail Test
H0: μ ≥ μ0
H1: μ < μ0
[Figure: sampling distribution with a critical value in the left tail. Reject H0 if x̄ falls below the critical value.]
36
An Example for a Left Hand Tail Test
• The SSA envelope plan example.
– The chief financial officer of FedEx believes that including a stamped self-addressed (SSA) envelope in the monthly invoice sent to customers will decrease the amount of time it takes for customers to pay their monthly bills.
– Currently, customers return their payments in 22 days on average, with a standard deviation of 6 days.
37
• The SSA envelope example – continued
– A random sample of 220 customers was selected and SSA envelopes were included with their invoice packs.
– The time it took customers to pay their bill was recorded (see SSA).
– Can the CFO conclude that the plan will be successful at the 10% significance level?
An Example for a Left Hand Tail Test
38
• The SSA envelope example – Solution
– The parameter tested is the population mean of the payment time (μ).
– Since the CFO wants to prove that the plan will be successful, we test whether H1: μ < 22.
– Accordingly, the null hypothesis is H0: μ ≥ 22.
An Example for a Left Hand Tail Test
39
• The SSA envelope example – Solution continued
– The rejection region: It makes sense to believe that μ < 22 if the sample mean is sufficiently smaller than 22.
– Thus, reject the null hypothesis if x̄ < x̄_L.
An Example for a Left Hand Tail Test
[Figure: sampling distribution centered at 22; the rejection region lies to the left of x̄_L.]
40
The Standardized Rejection Region for a Left Hand Tail Test
• Note that α is small (certainly less than 50%), so the critical Z value must be negative.
The standardized rejection region is: z < −z_α
[Figure: standard normal curve centered at 0, with the rejection region to the left of −z_α.]
41
An Example for a Left Hand Tail Test
• The SSA envelope example – Solution continued
• The standardized approach:
From the data we find that the sample mean is x̄ = 21.44.
$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{21.44 - 22}{6/\sqrt{220}} = -1.384$$
z_α = z_.10 = 1.285, so −z_.10 = −1.285.
Conclusion: Since −1.384 < −1.285, reject the null hypothesis.
42
The p-value approach for a Left Hand Tail Test
• The SSA envelope example – Solution continued
The p-value = P(Z < −1.384) = .0831 and α = 0.1. Since .0831 < .1 (p-value < α), reject the null hypothesis.
[Figure: standard normal curve with the p-value area to the left of −1.384 and the critical value −1.285 marked.]
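Both approaches to the SSA envelope test (x̄ = 21.44, μ = 22, σ = 6, n = 220, α = 0.10) can be reproduced with a short sketch. Variable names are illustrative. Note that the exact critical value returned by `inv_cdf` is about −1.28, slightly different from the table value −1.285 used on the slide; the decision is the same.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, mu0, sigma, n, alpha = 21.44, 22, 6, 220, 0.10

# Standardized test statistic for the sample mean.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Left-tail test: the p-value is the area to the LEFT of z.
p_value = NormalDist().cdf(z)

# Left-tail critical value -z_alpha (negative, because alpha < 0.5).
z_crit = NormalDist().inv_cdf(alpha)

print(round(z, 3))                    # -1.384
print(round(p_value, 3))              # 0.083
print(z < z_crit, p_value < alpha)    # True True: both approaches reject H0
```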
43
An Example for a Two Tail Test
H0: μ = μ0
H1: μ ≠ μ0
[Figure: sampling distribution with two critical values. Reject H0 if x̄ falls below the lower critical value or above the upper critical value.]
44
• Example 2
– AT&T has been challenged by competitors whose rates arguably resulted in lower bills.
– A statistician believes the monthly mean and standard deviation of the long-distance bills for all AT&T residential customers are $17.85 and $3.87, respectively.
An Example for a Two Tail Test
45
• Example 2 - continued
– A random sample of 25 customers is selected and the customers’ bills are recalculated using a leading competitor’s rates.
– Assuming the standard deviation is indeed 3.87, can we infer that there is a difference between AT&T’s bills and the competitor’s bills (on average)?
An Example for a Two Tail Test
46
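An Example for a Two Tail Test: The Rejection Region approach
• Solution
– Is the mean different from 17.85? (see ATT)
H0: μ = 17.85
H1: μ ≠ 17.85
– Define a two tail rejection region of the form: reject H0 if x̄ < x̄_L1 or x̄ > x̄_L2.
[Figure: sampling distribution centered at 17.85, with critical values x̄_L1 and x̄_L2 in the two tails.]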
47
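An Example for a Two Tail Test: The Rejection Region approach
Solution - continued
Even under H0 (μ = 17.85), x̄ can fall far above or far below 17.85, in which case we erroneously conclude that μ ≠ 17.85.
We do not want this erroneous rejection of H0 to occur too frequently, say not more than α = 5% of the time.
$$P(\bar{x} < \bar{x}_{L1}) = \alpha/2 = 0.025, \qquad P(\bar{x} > \bar{x}_{L2}) = \alpha/2 = 0.025$$
[Figure: sampling distribution centered at 17.85, with area α/2 = 0.025 in each tail beyond x̄_L1 and x̄_L2.]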
48
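An Example for a Two Tail Test: The Rejection Region approach
Solution - continued
$$\bar{x}_{L1} = \mu_0 - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 17.85 - 1.96\frac{3.87}{\sqrt{25}} = 16.33$$
$$\bar{x}_{L2} = \mu_0 + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 17.85 + 1.96\frac{3.87}{\sqrt{25}} = 19.36$$
From the sample we have: x̄ = 19.13
[Figure: sampling distribution centered at 17.85, with critical values x̄_L1 and x̄_L2 and the sample mean 19.13 marked.]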
49
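An Example for a Two Tail Test: The Rejection Region approach
Solution - continued
From the sample we have: x̄ = 19.13
Since x̄ falls between the two critical values (16.33 and 19.36), do not reject the null hypothesis.
[Figure: sampling distribution centered at 17.85, with critical values 16.33 and 19.36 and the sample mean 19.13 marked between them.]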
50
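An Example for a Two Tail Test: Standardized approach
Solution - continued
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{19.13 - 17.85}{3.87/\sqrt{25}} = 1.65$$
Rejection region: z < −z_{α/2} = −1.96 or z > z_{α/2} = 1.96 (area α/2 = 0.025 in each tail).
Since z = 1.65 lies between −1.96 and 1.96, do not reject the null hypothesis.
[Figure: standard normal curve centered at 0, with rejection regions beyond −1.96 and 1.96.]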
51
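An Example for a Two Tail Test: P-value approach
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{19.13 - 17.85}{3.87/\sqrt{25}} = 1.65$$
The p-value = P(Z < −1.65) + P(Z > 1.65) = 2 P(Z > 1.65) > .05
The two tail areas combined form the p-value.
[Figure: standard normal curve centered at 0, with −z_{α/2} = −1.96 and z_{α/2} = 1.96 marked (area α/2 = 0.025 beyond each) and the tail areas beyond −1.65 and 1.65 shaded.]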
52
Conclusion: There is insufficient evidence to infer that there is a difference between the bills of AT&T and the competitor at the 5% significance level.
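The three approaches to the two tail test in Example 2 (x̄ = 19.13, μ = 17.85, σ = 3.87, n = 25, α = 0.05) can be sketched together. Variable names are illustrative, and the exact z_{α/2} from `inv_cdf` is used in place of the table value 1.96.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, mu0, sigma, n, alpha = 19.13, 17.85, 3.87, 25, 0.05
std_error = sigma / sqrt(n)
z_half = NormalDist().inv_cdf(1.0 - alpha / 2)     # about 1.96

# Rejection region approach: two critical sample means.
lower = mu0 - z_half * std_error
upper = mu0 + z_half * std_error

# Standardized approach.
z = (sample_mean - mu0) / std_error

# p-value approach: the areas in both tails combined.
p_value = 2 * (1.0 - NormalDist().cdf(abs(z)))

reject_h0 = p_value < alpha
print(round(lower, 2), round(upper, 2))   # 16.33 19.37
print(round(z, 2), reject_h0)             # 1.65 False
```

The p-value comes out a little under 0.10, well above α = 0.05, so all three approaches agree that H0 is not rejected.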
53
11.3 Calculating the Probability of a Type II Error
• To properly interpret the results of a hypothesis test, we need to
– specify an appropriate significance level or judge the p-value of a test;
– understand the relationship between Type I and Type II errors.
• How do we compute a type II error probability?
54
Calculating the Probability of a Type II Error
• To calculate the probability of a type II error we need to
– express the rejection region directly, in terms of the parameter hypothesized (not standardized);
– specify the alternative value under H1.
H0: μ = μ0
H1: μ = μ1
55
Calculating the Probability of a Type II Error
• Let us revisit Example 1
– The null hypothesis was H0: μ = 170.
– Specify the alternative value under H1: let the alternative value be μ = 180 (rather than just μ > 170).
H0: μ = 170
H1: μ = 180
[Figure: two sampling distributions, centered at μ = 170 (H0) and μ = 180 (H1).]
56
Calculating the Probability of a Type II Error
• Let us revisit Example 1
– Express the rejection region directly, not in standardized terms: the rejection region was x̄ > 175.34 with α = .05.
[Figure: sampling distributions under H0: μ = 170 and H1: μ = 180; the critical value 175.34 cuts off a right-tail area of α = .05 under H0.]
57
Calculating the Probability of a Type II Error
– A type II error occurs when a false H0 is not rejected.
– H0 is false when μ = 180.
– H0 is not rejected when x̄ ≤ 175.34.
– So, the probability that a type II error occurs is β = P(x̄ ≤ 175.34 when μ = 180).
[Figure: sampling distributions under H0: μ = 170 and H1: μ = 180; β is the area under the H1 curve to the left of 175.34.]
58
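Calculating the Probability of a Type II Error
To summarize:
$$\beta = P(\bar{x} \le 175.34 \text{ when } \mu = 180) = P\left(z \le \frac{175.34 - 180}{65/\sqrt{400}}\right) = .0764$$
[Figure: sampling distribution under H1: μ = 180 (the true mean), with the area to the left of 175.34 equal to β.]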
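The two steps (express the rejection region in terms of x̄, then evaluate it under the alternative mean) can be sketched as follows. The function name `type_ii_error` is hypothetical, and the result differs from .0764 in the third decimal place because the slide rounds the z value before using the normal table.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """beta for a right-tail test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    std_error = sigma / sqrt(n)
    # Step 1: critical sample mean under H0 (unstandardized rejection region).
    x_crit = mu0 + NormalDist().inv_cdf(1.0 - alpha) * std_error
    # Step 2: probability of NOT rejecting H0 when x-bar is centered at mu1.
    return NormalDist(mu1, std_error).cdf(x_crit)

beta = type_ii_error(170, 180, 65, 400, 0.05)
print(round(beta, 3))   # 0.076
```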
59
• A hypothesis test is effectively defined by the significance level α and by the sample size n.
• The probability of a type II error can be controlled by
– changing α, and/or
– changing the sample size.
Judging the Test
60
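Effects on β of changing α
• Increasing the significance level α decreases the value of β, and vice versa.
[Figure: sampling distributions centered at μ = 170 and μ = 180; moving the critical value x̄_L to the left gives α2 > α1 and β2 < β1.]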
61
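Judging the Test
• Increasing the sample size n reduces β.
$$\text{Recall: } z_\alpha = \frac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}, \quad \text{thus} \quad \bar{x}_L = \mu + z_\alpha\frac{\sigma}{\sqrt{n}}$$
So, by increasing the sample size, x̄_L decreases and β grows smaller.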
62
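Judging the Test
Graphical demonstration: note what happens when n increases.
$$\bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}}$$
x̄_L moves to the left, thus β grows smaller.
[Figure: two panels (small n, larger n), each showing the sampling distributions under H0: μ = 170 and H1: μ = 180 with the critical value x̄_L.]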
63
Judging the Test
• Increasing the sample size reduces β.
• In Example 1, suppose n increases from 400 to 1000.
$$\bar{x}_L = \mu + z_\alpha\frac{\sigma}{\sqrt{n}} = 170 + 1.645\frac{65}{\sqrt{1000}} = 173.38$$
$$\beta = P\left(Z < \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z < -3.22) \approx 0$$
• The probability of committing a type I error remains 5%, but the probability of committing a type II error drops dramatically.
64
Judging the Test
• Power of a test
– The power of a test is defined as 1 − β.
– It represents the probability of rejecting the null hypothesis when it is false.
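To see how the sample size affects β and the power in Example 1 (H0: μ = 170, alternative value μ = 180, σ = 65, α = 0.05), one can tabulate both for n = 400 and n = 1000. This is a sketch; `type_ii_error` is a hypothetical helper, and exact z values are used rather than table values.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """beta for a right-tail test of H0: mu = mu0 when the true mean is mu1 > mu0."""
    std_error = sigma / sqrt(n)
    x_crit = mu0 + NormalDist().inv_cdf(1.0 - alpha) * std_error
    return NormalDist(mu1, std_error).cdf(x_crit)

results = {}
for n in (400, 1000):
    beta = type_ii_error(170, 180, 65, n, 0.05)
    results[n] = (beta, 1.0 - beta)          # (beta, power)
    print(n, round(beta, 4), round(1.0 - beta, 4))
```

With α held at 5%, the larger sample pushes β close to zero and the power close to one.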
65
Optional: Determining the Sample Size for a Hypothesis Test about the Population Mean (known σ)
• It has been shown that β and n are inversely related (increasing n decreases β).
• So, for a desired value of β we can determine the required sample size.
• The formula to determine n is:
$$n = \frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_0 - \mu_1)^2}$$
• For a two tailed test, Z_{α/2} replaces Z_α.
66
Optional: Determining the Sample Size for a One Tail Test – Example
• Example 8: Determine the sample size needed to test H0: μ = 100 against H1: μ = 130, if the significance level is 2.5% and the desired probability of a type II error is 8%. The population standard deviation is known to be 30.
• Solution: Z_α = Z_.025 = 1.96; Z_β = Z_.08 = 1.405
$$n = \frac{(Z_\alpha + Z_\beta)^2 \sigma^2}{(\mu_0 - \mu_1)^2} = \frac{(1.96 + 1.405)^2 (30)^2}{(100 - 130)^2} = 11.32$$
The selected sample size is therefore n = 12.
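The sample size formula can be evaluated directly for Example 8. This is a minimal sketch; `required_sample_size` is an illustrative name, and the result is rounded up because a sample size must be a whole number that still meets the desired β.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(mu0, mu1, sigma, alpha, beta):
    """n = (Z_alpha + Z_beta)^2 * sigma^2 / (mu0 - mu1)^2 for a one tail test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    z_beta = NormalDist().inv_cdf(1.0 - beta)
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / (mu0 - mu1) ** 2

n_exact = required_sample_size(100, 130, 30, alpha=0.025, beta=0.08)
n = ceil(n_exact)            # round up to the next whole observation
print(round(n_exact, 2), n)  # 11.32 12
```

For a two tailed test, the same sketch would use `1.0 - alpha / 2` when computing `z_alpha`.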