Copyright © 2006 Pearson Addison-Wesley. All rights reserved.
Lecture 8: Hypothesis Testing
(Chapter 7.1–7.2, 7.4)
Distribution of Estimators (Chapter 5.1–5.2, Chapter 6.4)
Agenda for Today
• Hypothesis Testing (Chapter 7.1)
• Distribution of Estimators (Chapter 5.2)
• Estimating $\sigma^2$ (Chapter 5.1, Chapter 6.4)
• t-tests (Chapter 7.2)
• P-values (Chapter 7.2)
• Power (Chapter 7.2)
What Sorts of Hypotheses to Test?
• To test a hypothesis, we first need to specify our “null hypothesis” precisely, in terms of the parameters of our regression model. We refer to this “null hypothesis” as H0.
• We also need to specify our “alternative hypothesis,” Ha, in terms of our regression parameters.
What Sorts of Hypotheses to Test? (cont.)
• Claim: The marginal propensity to consume is greater than 0.70:
$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i$
• Conduct a one-sided test of the null hypothesis
• H0: $\beta_1 \ge 0.70$ against the alternative,
• Ha: $\beta_1 < 0.70$
What Sorts of Hypotheses to Test? (cont.)
• Claim: The marginal propensity to consume equals the average propensity to consume:
$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i$, with $\beta_0 = 0$
• Conduct a two-sided test of
• H0: $\beta_0 = 0$ against the alternative,
• Ha: $\beta_0 \neq 0$
What Sorts of Hypotheses to Test? (cont.)
• The CAPM model from finance says that
$E(\text{excess return on portfolio } k) = \beta \cdot (\text{excess return on market portfolio})$
• Regress
$E(\text{excess return on portfolio } k) = \beta_0 + \beta_1 (\text{excess return on market portfolio})$
for a particular mutual fund, using data over time. Test H0: $\beta_0 \ge 0$.
• If $\beta_0 > 0$, the fund performs better than expected, said early analysts. If $\beta_0 < 0$, the fund performs less well than expected.
What Sorts of Hypotheses to Test? (cont.)
• H0: $\beta_0 \ge 0$
• Ha: $\beta_0 < 0$
• What if we run our regression and find $\hat\beta_0 = 0.012$?
• Can we reject the null hypothesis? What if $\hat\beta_0 = -0.012$?
Hypothesis Testing: Errors
• In our CAPM example, we are testing
– H0: $\beta_0 \ge 0$, against the alternative
– Ha: $\beta_0 < 0$
• We can make 2 kinds of mistakes.
– Type I Error: We reject the null hypothesis when the null hypothesis is “true.”
– Type II Error: We fail to reject the null hypothesis when the null hypothesis is “false.”
Hypothesis Testing: Errors (cont.)
• Type I Error: Reject the null hypothesis when it is true.
• Type II Error: Fail to reject the null hypothesis when it is false.
• We need a rule for deciding when to reject a null hypothesis. For a given sample, a rule with a lower probability of Type I error must accept a higher probability of Type II error.
Hypothesis Testing: Errors (cont.)
• Type I Error: Reject the null hypothesis when it is true.
• Type II Error: Fail to reject the null hypothesis when it is false.
• In practice, we build rules to have a low probability of a Type I error. Null hypotheses are “innocent until proven guilty beyond a reasonable doubt.”
Hypothesis Testing: Errors (cont.)
• Type I Error: Reject the null hypothesis when it is true.
• Type II Error: Fail to reject the null hypothesis when it is false.
• We do NOT ask whether the null hypothesis is more likely than the alternative hypothesis.
• We DO ask whether we can build a compelling case to reject the null hypothesis.
Hypothesis Testing
• What constitutes a compelling case to reject the null hypothesis?
• If the null hypothesis were true, would we be extremely surprised to see the data that we see?
Hypothesis Testing: Errors
• In our CAPM example
H0: $\beta_0 \ge 0$
Ha: $\beta_0 < 0$
• What if we run our regression and find $\hat\beta_0 = -0.012$?
• Could a reasonable jury reject the null hypothesis if the estimate is “just a little lower” than 0?
Hypothesis Testing: Errors (cont.)
• Type I Error: Reject the null hypothesis when it is true.
• Type II Error: Fail to reject the null hypothesis when it is false.
• In our CAPM example, our null hypothesis is $\beta_0 \ge 0$. Can we use our data to amass overwhelming evidence that this null hypothesis is false?
Hypothesis Testing: Errors (cont.)
• Note: if we “fail to reject” the null, it does NOT mean we can “accept” the null hypothesis.
• “Failing to reject” means only that the evidence against the null does not go beyond “reasonable doubt.”
• The null hypothesis could still be fairly unlikely, just not overwhelmingly unlikely.
Hypothesis Testing: Strategy
• Our Strategy: Look for a Contradiction.
• Assume the null hypothesis is true.
• Calculate the probability that we see the data, assuming the null hypothesis is true.
• Reject the null hypothesis if this probability is just too darn low.
How Should We Proceed?
1. Ask how our estimates of $\beta_0$ and $\beta_1$ are distributed if the null hypothesis is true.
2. Determine a test statistic.
3. Settle upon a critical region to reject the null hypothesis if the probability of seeing our data is too low.
How Should We Proceed? (cont.)
• The key tool we need is the probability of seeing our data if the null hypothesis is true.
• We need to know the distribution of our estimators.
Distribution of a Linear Estimator (from Chapter 5.2)
If the true value of $\beta_0$ is 0, what is the probability that we observe data with estimated intercept $\hat\beta_0$?
Distribution of a Linear Estimator (cont.)
• Perhaps the most common hypothesis test is H0: $\beta = 0$ against Ha: $\beta \neq 0$
• This hypothesis tests whether a variable has any effect on Y
• We will begin by calculating the variance of our estimator for the coefficient on X1
Hypothesis Testing
• Add to the Gauss–Markov Assumptions
• The disturbances are normally distributed
$\varepsilon_i \sim N(0, \sigma^2)$
$Y_i \sim N(\beta X_i, \sigma^2)$
DGP Assumptions
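$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + \varepsilon_i$
$E(\varepsilon_i) = 0$
$Var(\varepsilon_i) = \sigma^2$
$Cov(\varepsilon_i, \varepsilon_j) = 0$, for $i \neq j$
Each explanator is fixed across samples
$\varepsilon_i \sim N(0, \sigma^2)$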
How Should We Proceed?
• Ask how our guesses of $\beta_0$ and $\beta_1$ are distributed.
• Since the $Y_i$ are distributed normally, all linear estimators are, too.
$\hat\beta_0 \sim N(\beta_0, ?)$
$\hat\beta_1 \sim N(\beta_1, ?)$
What are the variances of $\hat\beta_0$ and $\hat\beta_1$?
What is the Variance of $\hat\beta_1$?
$Var(\sum w_i Y_i) = \sum Var(w_i Y_i) + 0 = \sum w_i^2 Var(Y_i) = \sigma^2 \sum w_i^2$
Let $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$
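What is the Variance of $\hat\beta_1$? (cont.)
$Var(\sum w_i Y_i) = \sigma^2 \sum w_i^2$
$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2}$, so $w_i = \frac{x_i}{\sum x_i^2}$
$Var(\hat\beta_1) = \sigma^2 \sum w_i^2 = \sigma^2 \sum \left( \frac{x_i}{\sum x_i^2} \right)^2 = \frac{\sigma^2 \sum x_i^2}{(\sum x_i^2)^2} = \frac{\sigma^2}{\sum x_i^2}$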
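The variance formula for the slope estimator can be checked numerically. The sketch below is only an illustration under assumed values: the explanator values, the true coefficients, and the disturbance standard deviation are all hypothetical, and the function name `simulate_slope_variance` is not from the text. It redraws normal disturbances many times with the explanators held fixed, computes the slope as a weighted sum of the Y values, and compares the simulated variance with the formula.

```python
import random
import statistics

def simulate_slope_variance(xs, beta0, beta1, sigma, n_samples, seed=0):
    """Compare the simulated variance of the OLS slope with sigma^2 / sum(x_i^2)."""
    rng = random.Random(seed)
    x_bar = sum(xs) / len(xs)
    x_dev = [x - x_bar for x in xs]        # x_i = X_i - X-bar
    sxx = sum(d * d for d in x_dev)        # sum of x_i^2
    weights = [d / sxx for d in x_dev]     # w_i = x_i / sum(x_i^2)

    slopes = []
    for _ in range(n_samples):
        # Explanators stay fixed across samples; only the disturbances are redrawn.
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(sum(w * y for w, y in zip(weights, ys)))

    return statistics.pvariance(slopes), sigma ** 2 / sxx

# Hypothetical design: eight fixed X values, true slope 0.7, sigma = 3.
simulated, theoretical = simulate_slope_variance(
    xs=[1, 2, 3, 4, 5, 6, 7, 8], beta0=2.0, beta1=0.7, sigma=3.0, n_samples=20000
)
print(f"simulated Var = {simulated:.4f}, formula = {theoretical:.4f}")
```

With these assumed inputs the formula gives 9/42, and the simulated variance should land close to it, with the gap shrinking as `n_samples` grows.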
What is the Variance of $\hat\beta_1$? (cont.)
$Var(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$
where $x_i = X_i - \bar{X}$
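What is the Variance of $\hat\beta_1$? (cont.)
$Var(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$
Thus...
$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum x_i^2}\right)$
Suppose we have as our null hypothesis
H0: $\beta_1 = \beta_1^*$
Under the null hypothesis
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$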
Distribution of a Linear Estimator
• We have a formula for the distribution of our estimator. However, this formula is not in a very convenient form. We would really like a formula that gives a distribution for which we can look up the probabilities in a common table.
Test Statistics
• A “test statistic” is a statistic:
1. Readily calculated from the data
2. Whose distribution is known (under the null hypothesis)
• Using a test statistic, we can compute the probability of observing the data given the null hypothesis.
Test Statistics (cont.)
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$
$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}} \sim N(0, 1)$
If we subtract the mean and divide by the standard error, we can transform a Normal Distribution into a Standard Normal Distribution.
Test Statistics (cont.)
$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}}$
We can easily look up the probability that we observe a given value of $Z$ on a Standard Normal table.
Test Statistics (cont.)
$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}} \sim N(0, 1)$
Suppose we want to test
H0: $\beta_1 \ge 0.70$, against the alternative
Ha: $\beta_1 < 0.70$
We could replace $\beta_1^*$ with 0.70 and calculate $Z$.
If $Z < -1.64$, then we know there is less than a 5% chance we would observe this data if $\beta_1$ really were greater than 0.70.
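As a rough numerical illustration of the Z test above, the following sketch plugs hypothetical numbers into the formula: an estimated slope of 0.62, a disturbance standard deviation assumed known and equal to 2, and four made-up income deviations. None of these values come from the lecture, and the function name `z_statistic` is introduced only for this example. The standard library's `NormalDist` stands in for the Standard Normal table.

```python
import math
from statistics import NormalDist

def z_statistic(beta_hat, beta_null, sigma, x_devs):
    """Z = (beta_hat - beta_null) / sqrt(sigma^2 / sum(x_i^2))."""
    sxx = sum(d * d for d in x_devs)
    return (beta_hat - beta_null) / math.sqrt(sigma ** 2 / sxx)

# Hypothetical inputs: sigma is treated as known, which is rarely true in practice.
z = z_statistic(beta_hat=0.62, beta_null=0.70, sigma=2.0, x_devs=[-30, -10, 10, 30])
left_tail = NormalDist().cdf(z)  # P(Z <= z) when beta_1 equals 0.70

print(f"Z = {z:.3f}, left-tail probability = {left_tail:.4f}")
print("reject H0 at the 5% level" if z < -1.64 else "fail to reject H0 at the 5% level")
```

Here Z falls below -1.64, so with these assumed numbers the one-sided test would reject the null at the 5% level.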
Estimating $\sigma^2$ (from Chapter 5.1, Chapter 6.4)
$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}}$
One Problem: We cannot compute $Z$ because we cannot observe $\sigma^2$.
Solution: Estimate $\sigma^2$.
Estimating $\sigma^2$ (cont.)
• We need to estimate the variance of the error terms,
$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_{1i} - \dots - \beta_k X_{ki}$
• Problem: we do not observe $\varepsilon_i$ directly.
• Another Problem: we do not know $\beta_0, \dots, \beta_k$, so we cannot calculate $\varepsilon_i$ either.
Estimating $\sigma^2$ (cont.)
• We need to estimate the variance of the error terms,
$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_{1i} - \dots - \beta_k X_{ki}$
• We can proxy for the error terms using the residuals.
Estimating $\sigma^2$ (cont.)
$e_i = Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \dots - \hat\beta_k X_{ki} = Y_i - \hat{Y}_i$
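Estimating $\sigma^2$ (cont.)
$e_i = Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \dots - \hat\beta_k X_{ki} = Y_i - \hat{Y}_i$
$s^2 = \frac{1}{n - k - 1} \sum e_i^2$
• Once we have estimates of the error terms (the residuals), we can calculate an estimate of the variance of the error term. We need to make a “degrees of freedom” correction: divide by n-k-1 rather than n.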
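The residual-based estimate of the error variance can be sketched in a few lines. This is a minimal illustration for the one-explanator case (k = 1), using made-up data; the helper names `ols_simple` and `error_variance_estimate` are hypothetical and not part of the lecture.

```python
def ols_simple(xs, ys):
    """Return (beta0_hat, beta1_hat) for Y = beta0 + beta1 * X + error."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def error_variance_estimate(xs, ys, k=1):
    """s^2 = sum(e_i^2) / (n - k - 1), with residuals as proxies for the errors."""
    b0, b1 = ols_simple(xs, ys)
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    return sum(e * e for e in residuals) / (len(xs) - k - 1)

# Hypothetical data, five observations and one explanator.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
s_squared = error_variance_estimate(xs, ys)
print(f"s^2 = {s_squared:.5f}")
```

Dividing by n-k-1 instead of n reflects that k+1 coefficients were estimated before the residuals could be formed.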
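Estimating $\sigma^2$ (cont.)
Recall
$Var(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$
Thus...
$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum x_i^2}\right)$
Suppose we have as our null hypothesis
H0: $\beta_1 = \beta_1^*$
Under the null hypothesis
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$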
Estimating $\sigma^2$ (cont.)
Suppose we have as our null hypothesis
H0: $\beta_1 = \beta_1^*$
Under the null hypothesis
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$
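Estimating $\sigma^2$ (cont.)
Plug in our estimate for $\sigma^2$:
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{s^2}{\sum x_i^2}\right)$
$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sum e_i^2}{(n - k - 1) \sum x_i^2}\right)$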
Standard Error (from Chapter 5.2)
• Remember, the standard deviation of the distribution of our estimator is called the “standard error.”
• The smaller the standard error, the closer your estimates will tend to fall to the mean of the distribution.
Standard Error (from Chapter 5.2)
• If your estimate is unbiased, a low standard error implies that your estimate is probably “close” to the true parameter value.
t-statistic (from Chapter 7.2)
$\hat{t} = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{s^2 / \sum x_i^2}} \sim t_{n-k-1}$
• Because we need to estimate the standard error, the t-statistic is NOT distributed as a Standard Normal. Instead, it is distributed according to the t-distribution.
• The t-distribution depends on the degrees of freedom, n-k-1. For large n-k-1, the t-distribution closely resembles the Standard Normal.
t-statistic (cont.)
• Under the null hypothesis H0: $\beta_1 = \beta_1^*$,
$\hat{t} = \frac{\hat\beta_1 - \beta_1^*}{s_{\hat\beta_1}} \sim t_{n-k-1}$
• In our earlier example, we could:
• Replace $\beta_1^*$ with 0.70
• Compare $\hat{t}$ to the “critical value” for which the $t_{n-2}$ distribution has .05 of its probability mass lying to the left (with one explanator, k = 1, so $\hat{t} \sim t_{n-2}$),
• There is less than a 5% chance of observing the data under the null if $\hat{t}$ < “critical value.”
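To make the t-statistic concrete, here is a small sketch for the one-explanator case. The income and consumption figures are invented for illustration, and `slope_t_statistic` is a hypothetical helper name. It estimates the slope, forms the estimated standard error from the residuals, and returns the t-statistic together with its n-2 degrees of freedom.

```python
import math

def slope_t_statistic(xs, ys, beta1_null):
    """t-hat = (beta1_hat - beta1_null) / sqrt(s^2 / sum(x_i^2)) with k = 1."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    # Degrees-of-freedom correction: n - k - 1 = n - 2 for one explanator.
    s2 = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    std_err = math.sqrt(s2 / sxx)
    return (b1 - beta1_null) / std_err, n - 2

# Hypothetical consumption and income data (not from the lecture).
income = [10, 20, 30, 40, 50, 60]
consumption = [9.0, 15.5, 21.0, 28.5, 33.0, 40.5]
t_hat, dof = slope_t_statistic(income, consumption, beta1_null=0.70)
print(f"t = {t_hat:.3f} with {dof} degrees of freedom")
```

The resulting value would then be compared with the left-tail critical value from a t table for the reported degrees of freedom (about -2.13 for 4 degrees of freedom at the 5% level).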
Figure 7.1 Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom
Significance Level
• We can now calculate the probability of observing the data IF the null hypothesis is true.
• We choose the maximum chance we are willing to risk that we accidentally commit a Type I Error (reject a null hypothesis when it is true).
• This chance is called the “significance level.”
Significance Level (cont.)
• We choose the probability we are willing to accept of a Type I Error.
• This probability is the “Significance Level.”
• The significance level gives operational meaning to how compelling a case we need to build.
Significance Level (cont.)
• The significance level denotes the chance of committing a Type I Error.
• By historical convention, we usually reject a null hypothesis if we have less than a 5% chance of observing the data under the null hypothesis.
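One way to see what the significance level means is to simulate many samples in which the null hypothesis is true and count how often the test rejects. The sketch below does this for a one-sided Z test of the slope. The design values, the coefficients, and the assumption that the disturbance standard deviation is known are all hypothetical choices made for illustration, and `type_one_error_rate` is not a name from the text.

```python
import math
import random
from statistics import NormalDist

def type_one_error_rate(n_samples=20000, alpha=0.05, seed=1):
    """Share of simulated samples that reject a TRUE null with a left-tail Z test."""
    rng = random.Random(seed)
    xs = [1, 2, 3, 4, 5, 6]                       # fixed explanator values
    x_bar = sum(xs) / len(xs)
    x_dev = [x - x_bar for x in xs]
    sxx = sum(d * d for d in x_dev)
    beta0, beta1_null, sigma = 1.0, 0.70, 2.0     # data are generated under the null
    critical_value = NormalDist().inv_cdf(alpha)  # about -1.64 for alpha = 0.05

    rejections = 0
    for _ in range(n_samples):
        ys = [beta0 + beta1_null * x + rng.gauss(0.0, sigma) for x in xs]
        b1 = sum(d * y for d, y in zip(x_dev, ys)) / sxx
        z = (b1 - beta1_null) / math.sqrt(sigma ** 2 / sxx)
        if z < critical_value:
            rejections += 1
    return rejections / n_samples

rate = type_one_error_rate()
print(f"rejection rate under a true null: {rate:.4f}")
```

The rejection rate should come out near the chosen significance level, since every rejection in this simulation is a Type I Error.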
Significance Level (cont.)
• 5% is the conventional significance level. Analysts also often look at the 1% and 10% levels.
Critical Region
• We know the distribution of our test statistic under the null hypothesis.
• We can calculate the values of the test statistic for which we would reject the null hypothesis (i.e., values that we would have less than a 5% chance of observing under the null hypothesis).
Critical Region (cont.)
• We can calculate the values of the test statistic for which we would reject the null hypothesis.
• These values are called the “critical region.”
Figure 7.1 Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom
Critical Region
• Regression packages routinely report estimated coefficients, their estimated standard errors, and the t-statistics associated with the null hypothesis that an individual coefficient is equal to zero.
• Some programs also report a “p-value” for each estimated coefficient.
• This reported p-value is the smallest significance level for a two-sided test at which one would reject the null that the coefficient is zero.
One-Sided, Two-Sided Tests
• t-tests come in two flavors: 1-sided and 2-sided.
• 2-sided tests are much more common:
– H0: $\beta = \beta^*$
– Ha: $\beta \neq \beta^*$
• 1-sided tests look at only one side:
– H0: $\beta \ge \beta^*$
– Ha: $\beta < \beta^*$
One-Sided, Two-Sided Tests (cont.)
• The procedure for both 1-sided and 2-sided tests is very similar.
• For either test, you construct the same t-statistic:
$\hat{t} = \frac{\hat\beta - \beta^*}{\text{s.e.}(\hat\beta)}$
One-Sided, Two-Sided Tests (cont.)
• Once you have your t-statistic, you need to choose a “critical value.” The critical value is the boundary point for the critical region. You reject the null hypothesis if your t-statistic lies beyond the critical value: greater in magnitude and, for a 1-sided test, in the direction of the alternative.
• The choice of critical value depends on the type of test you are running.
Critical Value for 1-Sided Test
• For a 1-sided test, you need a critical value such that $\alpha$ of the distribution of the test statistic is greater than (or less than) the critical value. $\alpha$ is our significance level (for example, 5%).
Critical Value for 1-Sided Test (cont.)
• In our CAPM example, we want to test:
H0: $\beta_0 \ge 0$
Ha: $\beta_0 < 0$
• We need a critical value t* such that $\alpha$ of the distribution of our test statistic is less than t*
Figure 7.1 Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom
Critical Value of a 1-Sided Test (cont.)
• For a 5% significance level and a large sample size, t* = -1.64
• We reject the null hypothesis if:
$\hat{t} < -1.64$
Critical Value for a 2-Sided test
• For a 2-sided test, we need to spread our critical region over both tails. We need a critical value t* such that
– $\alpha/2$ of the distribution is to the right of t*
– $\alpha/2$ of the distribution is to the left of –t*
• Summing both tails, $\alpha$ of the distribution is beyond either t* or –t*
Figure 7.1 Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom
Critical Value for 2-Sided Test
• For a large sample size, the critical value for a 2-sided test at the 5% level is 1.96
• You reject the null hypothesis if:
$\hat{t} > 1.96$
or
$\hat{t} < -1.96$
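The large-sample critical values quoted for 1-sided and 2-sided tests can be recovered from the Standard Normal distribution. The sketch below uses Python's standard-library `NormalDist` as a stand-in for a table; it applies only to large samples, where the t-distribution is close to the Standard Normal, and the function names are illustrative rather than taken from the lecture.

```python
from statistics import NormalDist

def large_sample_critical_values(alpha=0.05):
    """Return (left-tail 1-sided critical value, 2-sided critical value t*)."""
    std_normal = NormalDist()
    one_sided_left = std_normal.inv_cdf(alpha)     # alpha of the mass lies to the left
    two_sided = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 of the mass in each tail
    return one_sided_left, two_sided

def reject_two_sided(t_hat, alpha=0.05):
    """Reject when t-hat lies beyond t* in either tail."""
    _, t_star = large_sample_critical_values(alpha)
    return t_hat > t_star or t_hat < -t_star

one_sided, two_sided = large_sample_critical_values(0.05)
print(f"1-sided (left tail): {one_sided:.3f}, 2-sided: +/- {two_sided:.3f}")
print(reject_two_sided(2.1), reject_two_sided(-1.7))
```

For small samples the same comparison would use critical values from the t-distribution with n-k-1 degrees of freedom, which are larger in magnitude.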
P-values
• The p-value is the smallest significance level for which you could reject the null hypothesis.
• The smaller the p-value, the stricter the significance level at which you can reject the null hypothesis.
P-values (cont.)
• Many statistics packages automatically report the p-value for a two-sided test of the null hypothesis that a coefficient is 0
• If p < 0.05, then you could reject the null that $\beta = 0$ at a significance level of 0.05
• The coefficient “is significant at the 95% confidence level.”
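A two-sided p-value of the kind statistics packages report can be approximated as follows. This sketch assumes a large sample, so the Standard Normal replaces the t-distribution; real packages generally use the t-distribution with the appropriate degrees of freedom, and `two_sided_p_value` is a hypothetical name used only here.

```python
from statistics import NormalDist

def two_sided_p_value(t_hat):
    """Large-sample p-value for H0: beta = 0 against Ha: beta != 0."""
    # Probability mass beyond |t-hat| in both tails of the Standard Normal.
    return 2 * (1 - NormalDist().cdf(abs(t_hat)))

for t_hat in (0.5, 1.96, 2.5):
    p = two_sided_p_value(t_hat)
    verdict = "reject" if p < 0.05 else "fail to reject"
    print(f"t = {t_hat:.2f}: p = {p:.4f} -> {verdict} at the 5% level")
```

A t-statistic right at the 2-sided critical value of 1.96 gives a p-value of almost exactly 0.05, which ties the p-value back to the critical-region rule.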
Statistical Significance
• A coefficient is “statistically significant at the 95% confidence level” if we could reject the null that $\beta = 0$ at the 5% significance level.
• In economics, the word “significant” means “statistically significant” unless otherwise qualified.
Performing Tests
• How do we compute these test statistics using our software?
Power
• Type I Error: reject a null hypothesis when it is true
• Type II Error: fail to reject a null hypothesis when it is false
• We have devised a procedure based on choosing the probability of a Type I Error.
• What about Type II Errors?
Power (cont.)
• The probability that our hypothesis test rejects a null hypothesis when it is false is called the Power of the test.
• (1 – Power) is the probability of a Type II Error.
Power (cont.)
• If a test has a low probability of rejecting the null hypothesis when that hypothesis is false, we say that the test is “weak” or has “low power.”
• The higher the standard error of our estimator, the weaker the test.
• More efficient estimators allow for more powerful tests.
Power (cont.)
• Power depends on the particular Ha you are considering. The closer Ha is to H0, the harder it is to reject the null hypothesis.
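The dependence of power on the alternative and on the standard error can be sketched numerically. The function below is an illustration under simplifying assumptions: the standard error is treated as known, so the test statistic is Normal with unit variance and a mean shifted by the gap between the true and hypothesized coefficients. The name `power_two_sided` and the example values are hypothetical.

```python
from statistics import NormalDist

def power_two_sided(beta_true, beta_null, std_err, alpha=0.05):
    """P(reject H0: beta = beta_null) when the true coefficient is beta_true."""
    z = NormalDist()
    t_star = z.inv_cdf(1 - alpha / 2)
    shift = (beta_true - beta_null) / std_err  # mean of the test statistic
    # The statistic is N(shift, 1); reject when it lands beyond +t* or -t*.
    return (1 - z.cdf(t_star - shift)) + z.cdf(-t_star - shift)

for beta_true in (0.0, 0.5, 1.0, 2.0):
    print(f"true beta = {beta_true}: power = {power_two_sided(beta_true, 0.0, 1.0):.3f}")
print(f"same alternative, doubled standard error: {power_two_sided(2.0, 0.0, 2.0):.3f}")
```

When the true coefficient equals the null value, the rejection probability is just the significance level; power rises as the alternative moves away from the null and falls as the standard error grows.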
Figure 7.2 Distribution of $\hat\beta_s$ for $\beta_s$ = -2, 0, and a Little Less Than 0
Figure 7.3 Power Curves for Two-Tailed Tests of H0: $\beta_s = 0$
Figure SA.12 The Distribution of the t-Statistic Given the Null Hypothesis is False and = + 5
Figure SA.13 The t-Statistic’s Power When the Sample Size Grows
Review
• To test a null hypothesis, we:
– Assume the null hypothesis is true;
– Calculate a test statistic, assuming the null hypothesis is true;
– Reject the null hypothesis if we would be very unlikely to observe the test statistic under the null hypothesis.
Six Steps to Hypothesis Testing
1. State the null and alternative hypotheses
2. Choose a test statistic (so far, we have learned the t-test)
3. Choose a significance level, the probability of a Type I Error (typically 5%)
Six Steps to Hypothesis Tests (cont.)
4. Find the critical region for the test (for a 2-sided t-test at the 5% level in large samples, the critical value is t* = 1.96)
5. Calculate the test statistic
$\hat{t} = \frac{\hat\beta - \beta^*}{\text{s.e.}(\hat\beta)}$
6. Reject the null hypothesis if the test statistic falls within the critical region
Is $\hat{t} > t^*$ or $\hat{t} < -t^*$?