
Copyright © 2006 Pearson Addison-Wesley. All rights reserved.

Lecture 8: Hypothesis Testing

(Chapter 7.1–7.2, 7.4)

Distribution of Estimators (Chapter 5.1–5.2, Chapter 6.4)


Agenda for Today

• Hypothesis Testing (Chapter 7.1)

• Distribution of Estimators (Chapter 5.2)

• Estimating $\sigma^2$ (Chapter 5.1, Chapter 6.4)

• t-tests (Chapter 7.2)

• P-values (Chapter 7.2)

• Power (Chapter 7.2)


What Sorts of Hypotheses to Test?

• To test a hypothesis, we first need to specify our “null hypothesis” precisely, in terms of the parameters of our regression model. We refer to this “null hypothesis” as $H_0$.

• We also need to specify our “alternative hypothesis,” $H_a$, in terms of our regression parameters.


What Sorts of Hypotheses to Test? (cont.)

• Claim: The marginal propensity to consume is greater than 0.70:

$$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i$$

• Conduct a one-sided test of the null hypothesis

• $H_0: \beta_1 \ge 0.70$ against the alternative,

• $H_a: \beta_1 < 0.70$



• Claim: The marginal propensity to consume equals the average propensity to consume:

$$C_i = \beta_0 + \beta_1 \text{Income}_i + \varepsilon_i, \text{ with } \beta_0 = 0$$

• Conduct a two-sided test of

• $H_0: \beta_0 = 0$ against the alternative,

• $H_a: \beta_0 \ne 0$
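A short derivation of why this claim reduces to a test on the intercept, sketched under the linear consumption function above and ignoring the disturbance term:

```latex
\text{APC} = \frac{C}{\text{Income}} = \frac{\beta_0}{\text{Income}} + \beta_1,
\qquad
\text{MPC} = \frac{dC}{d\,\text{Income}} = \beta_1
```

The two are equal at every income level only if $\beta_0 = 0$, which is why the null hypothesis is stated in terms of the intercept.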



• The CAPM model from finance says that

$$E(\text{excess return on portfolio } k) = \beta \cdot (\text{excess return on market portfolio})$$

• Regress

$$E(\text{excess return on portfolio } k) = \beta_0 + \beta_1 (\text{excess return on market portfolio})$$

for a particular mutual fund, using data over time. Test $H_0: \beta_0 \ge 0$.

• If $\beta_0 > 0$, the fund performs better than expected, said early analysts. If $\beta_0 < 0$, the fund performs less well than expected.



• $H_0: \beta_0 \ge 0$

• $H_a: \beta_0 < 0$

• What if we run our regression and find $\hat\beta_0 = 0.012$?

• Can we reject the null hypothesis? What if $\hat\beta_0 = -0.012$?


Hypothesis Testing: Errors

• In our CAPM example, we are testing

– $H_0: \beta_0 \ge 0$, against the alternative

– $H_a: \beta_0 < 0$

• We can make 2 kinds of mistakes.

– Type I Error: We reject the null hypothesis when the null hypothesis is “true.”

– Type II Error: We fail to reject the null hypothesis when the null hypothesis is “false.”


Hypothesis Testing: Errors (cont.)

• Type I Error: Reject the null hypothesis when it is true.

• Type II Error: Fail to reject the null hypothesis when it is false.

• We need a rule for deciding when to reject a null hypothesis. For a given sample and test, a rule with a lower probability of Type I error must have a higher probability of Type II error.





• In practice, we build rules to have a low probability of a Type I error. Null hypotheses are “innocent until proven guilty beyond a reasonable doubt.”





• We do NOT ask whether the null hypothesis is more likely than the alternative hypothesis.

• We DO ask whether we can build a compelling case to reject the null hypothesis.


Hypothesis Testing

• What constitutes a compelling case to reject the null hypothesis?

• If the null hypothesis were true, would we be extremely surprised to see the data that we see?


Hypothesis Testing: Errors

• In our CAPM example, $H_0: \beta_0 \ge 0$ and $H_a: \beta_0 < 0$.

• What if we run our regression and find $\hat\beta_0 = -0.012$?

• Could a reasonable jury reject the null hypothesis if the estimate is “just a little lower” than 0?





• In our CAPM example, our null hypothesis is $\beta_0 \ge 0$. Can we use our data to amass overwhelming evidence that this null hypothesis is false?



• Note: if we “fail to reject” the null, it does NOT mean we can “accept” the null hypothesis.

• “Failing to reject” means the case against the null leaves “reasonable doubt.”

• The null hypothesis could still be fairly unlikely, just not overwhelmingly unlikely.


Hypothesis Testing: Strategy

• Our Strategy: Look for a Contradiction.

• Assume the null hypothesis is true.

• Calculate the probability that we see the data, assuming the null hypothesis is true.

• Reject the null hypothesis if this probability is just too darn low.


How Should We Proceed?

1. Ask how our estimates of $\beta_0$ and $\beta_1$ are distributed if the null hypothesis is true.

2. Determine a test statistic.

3. Settle upon a critical region to reject the null hypothesis if the probability of seeing our data is too low.


How Should We Proceed? (cont.)

• The key tool we need is the probability of seeing our data if the null hypothesis is true.

• We need to know the distribution of our estimators.


Distribution of a Linear Estimator (from Chapter 5.2)

If the true value of $\beta_0$ is 0, what is the probability that we observe data with estimated intercept $\hat\beta_0$?


Distribution of a Linear Estimator (cont.)

• Perhaps the most common hypothesis test is $H_0: \beta = 0$ against $H_a: \beta \ne 0$

• This hypothesis tests whether a variable has any effect on Y

• We will begin by calculating the variance of our estimator for the coefficient on X1


Hypothesis Testing

• Add to the Gauss–Markov Assumptions

• The disturbances are normally distributed

$$\varepsilon_i \sim N(0, \sigma^2)$$

$$Y_i \sim N(\beta X_i, \sigma^2)$$


DGP Assumptions

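$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$$

$$E(\varepsilon_i) = 0$$

$$\mathrm{Var}(\varepsilon_i) = \sigma^2$$

$$\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0, \text{ for } i \ne j$$

Each explanator is fixed across samples

$$\varepsilon_i \sim N(0, \sigma^2)$$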


How Should We Proceed?

• Ask how our guesses of $\beta_0$ and $\beta_1$ are distributed.

• Since the $Y_i$ are distributed normally, all linear estimators are, too.

$$\hat\beta_0 \sim N(\beta_0, ?)$$

$$\hat\beta_1 \sim N(\beta_1, ?)$$

What are the variances of $\hat\beta_0$ and $\hat\beta_1$?


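What is the Variance of $\hat\beta_1$?

$$\mathrm{Var}\Big(\sum w_i Y_i\Big) = \sum \mathrm{Var}(w_i Y_i) + 0 = \sum w_i^2 \mathrm{Var}(Y_i) = \sigma^2 \sum w_i^2$$

Let $x_i = X_i - \bar X$ and $y_i = Y_i - \bar Y$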


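What is the Variance of $\hat\beta_1$? (cont.)

$$\mathrm{Var}\Big(\sum w_i Y_i\Big) = \sigma^2 \sum w_i^2$$

$$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad w_i = \frac{x_i}{\sum x_i^2}$$

$$\mathrm{Var}(\hat\beta_1) = \sigma^2 \sum w_i^2 = \sigma^2 \sum \left(\frac{x_i}{\sum x_i^2}\right)^2 = \frac{\sigma^2 \sum x_i^2}{\left(\sum x_i^2\right)^2} = \frac{\sigma^2}{\sum x_i^2}$$

$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}, \text{ where } x_i = X_i - \bar X$$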
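The variance formula can be checked by simulation. The following is a minimal sketch, assuming a one-explanator DGP with hypothetical parameter values ($\beta_0 = 1$, $\beta_1 = 0.7$, $\sigma = 2$) and eight fixed X values chosen only for illustration; the function names are made up for this example. It redraws the disturbances many times with X held fixed and compares the variance of the slope estimates with $\sigma^2 / \sum x_i^2$.

```python
import random

def ols_slope(xs, ys):
    """OLS slope: sum(x_i * y_i) / sum(x_i^2), with x_i, y_i as deviations from their means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    return sum_xy / sum_x2

def simulated_slope_variance(xs, beta0, beta1, sigma, n_samples, seed=0):
    """Redraw normal disturbances n_samples times (X fixed); return the sample variance of the slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_samples):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        slopes.append(ols_slope(xs, ys))
    mean_slope = sum(slopes) / n_samples
    return sum((b - mean_slope) ** 2 for b in slopes) / (n_samples - 1)

# Hypothetical design: eight fixed X values and sigma = 2 (illustrative only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
sigma = 2.0
x_bar = sum(xs) / len(xs)
theoretical = sigma ** 2 / sum((x - x_bar) ** 2 for x in xs)  # sigma^2 / sum(x_i^2)
simulated = simulated_slope_variance(xs, beta0=1.0, beta1=0.7, sigma=sigma, n_samples=20000)
print(f"theoretical: {theoretical:.4f}, simulated: {simulated:.4f}")
```

The two numbers should agree closely; the simulated value varies slightly with the seed and the number of samples.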



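$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$$

Thus...

$$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum x_i^2}\right)$$

Suppose we have as our null hypothesis

$$H_0: \beta_1 = \beta_1^*$$

Under the null hypothesis

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$$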

Distribution of a Linear Estimator

• We have a formula for the distribution of our estimator. However, this formula is not in a very convenient form. We would really like a formula that gives a distribution for which we can look up the probabilities in a common table.


Test Statistics

• A “test statistic” is a statistic:

1. Readily calculated from the data

2. Whose distribution is known (under the null hypothesis)

• Using a test statistic, we can compute the probability of observing the data given the null hypothesis.


Test Statistics (cont.)

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$$

$$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}} \sim N(0, 1)$$

If we subtract the mean and divide by the standard error, we can transform a Normal Distribution into a Standard Normal Distribution.


Test Statistics (cont.)

$$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}}$$

We can easily look up the probability that we observe a given value of $Z$ on a Standard Normal table.


Test Statistics (cont.)

$$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}} \sim N(0, 1)$$

Suppose we want to test $H_0: \beta_1 \ge 0.70$, against the alternative $H_a: \beta_1 < 0.70$.

We could replace $\beta_1^*$ with 0.70 and calculate $Z$.

If $Z < -1.64$, then we know there is less than a 5% chance we would observe this data if $\beta_1$ really were 0.70 or greater.
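The one-sided Z test above can be sketched in a few lines. This is an illustration only: it assumes $\sigma^2$ is known, and the estimate, $\sigma^2$, and $\sum x_i^2$ values are hypothetical numbers chosen for the example, not taken from any data set.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(beta_hat, beta_null, sigma2, sum_x2):
    """Z = (beta_hat - beta_null) / sqrt(sigma^2 / sum(x_i^2)), assuming sigma^2 is known."""
    return (beta_hat - beta_null) / sqrt(sigma2 / sum_x2)

# Hypothetical inputs: estimated MPC of 0.62, sigma^2 = 4, sum of squared x deviations = 2500.
z = z_statistic(beta_hat=0.62, beta_null=0.70, sigma2=4.0, sum_x2=2500.0)

# Probability of a Z this low or lower when beta_1 = 0.70 (the boundary of the null).
p_left = NormalDist().cdf(z)

# One-sided rule from the slide: reject H0 if Z < -1.64.
reject = z < -1.64
print(f"Z = {z:.2f}, left-tail probability = {p_left:.4f}, reject H0: {reject}")
```

`NormalDist().cdf` plays the role of the Standard Normal table here.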


Estimating $\sigma^2$ (from Chapter 5.1, Chapter 6.4)

$$Z = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{\sigma^2 / \sum x_i^2}}$$

One Problem: We cannot observe $Z$ because we cannot observe $\sigma^2$.

Solution: Estimate $\sigma^2$.

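Estimating $\sigma^2$ (cont.)

• We need to estimate the variance of the error terms, $\sigma^2$.

• Problem: we do not observe $\varepsilon_i$ directly.

• Another Problem: we do not know $\beta_0 \ldots \beta_k$, so we cannot calculate $\varepsilon_i$ either.

$$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_k X_{ki}$$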



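• We need to estimate the variance of the error terms, $\sigma^2$.

• We can proxy for the error terms using the residuals.

$$\varepsilon_i = Y_i - \beta_0 - \beta_1 X_{1i} - \cdots - \beta_k X_{ki}$$

$$e_i = Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \cdots - \hat\beta_k X_{ki} = Y_i - \hat Y_i$$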


• Once we have an estimate of the error term, we can calculate an estimate of the variance of the error term. We need to make a “degrees of freedom” correction.

$$e_i = Y_i - \hat\beta_0 - \hat\beta_1 X_{1i} - \cdots - \hat\beta_k X_{ki} = Y_i - \hat Y_i$$

$$s^2 = \frac{1}{n - k - 1} \sum e_i^2$$
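The residual-based estimate of $\sigma^2$ can be sketched for the one-explanator case ($k = 1$). The five data points below are hypothetical, and the helper names are made up for this illustration.

```python
def ols_simple(xs, ys):
    """Return (b0, b1), the OLS intercept and slope for a regression with one explanator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum_x2
    b0 = y_bar - b1 * x_bar
    return b0, b1

def error_variance_estimate(xs, ys):
    """s^2 = sum(e_i^2) / (n - k - 1), with k = 1 explanator plus an intercept."""
    k = 1
    b0, b1 = ols_simple(xs, ys)
    residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]  # e_i = Y_i - Y_hat_i
    return sum(e * e for e in residuals) / (len(xs) - k - 1)

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
s2 = error_variance_estimate(xs, ys)
print(f"s^2 = {s2:.4f}")
```

Dividing by $n - k - 1$ rather than $n$ accounts for the $k + 1$ coefficients that were estimated before the residuals could be computed.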


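Recall

$$\mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum x_i^2}$$

Thus...

$$\hat\beta_1 \sim N\left(\beta_1, \frac{\sigma^2}{\sum x_i^2}\right)$$

$$H_0: \beta_1 = \beta_1^*$$

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$$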





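$$H_0: \beta_1 = \beta_1^*$$

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sigma^2}{\sum x_i^2}\right)$$

Plug in our estimate for $\sigma^2$:

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{s^2}{\sum x_i^2}\right)$$

$$\hat\beta_1 \sim N\left(\beta_1^*, \frac{\sum e_i^2}{(n - k - 1) \sum x_i^2}\right)$$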



Standard Error (from Chapter 5.2)

• Remember, the standard deviation of the distribution of our estimator is called the “standard error.”

• The smaller the standard error, the closer your estimates will tend to fall to the mean of the distribution.


Standard Error (from Chapter 5.2)

• If your estimate is unbiased, a low standard error implies that your estimate is probably “close” to the true parameter value.


t-statistic (from Chapter 7.2)

$$\hat t = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{s^2 / \sum x_i^2}} \sim t_{n-k-1}$$

• Because we need to estimate the standard error, the t-statistic is NOT distributed as a Standard Normal. Instead, it is distributed according to the t-distribution.

• The t-distribution depends on n-k-1. For large n-k-1, the t-distribution closely resembles the Standard Normal.


t-statistic (cont.)

• Under the null hypothesis $H_0: \beta_1 = \beta_1^*$,

$$\hat t = \frac{\hat\beta_1 - \beta_1^*}{\sqrt{s_{\hat\beta_1}^2}} \sim t_{n-k-1}$$

• In our earlier example, we could:

• Replace $\beta_1^*$ with 0.70

• Compare $\hat t$ to the “critical value” for which the $t_{n-2}$ distribution has .05 of its probability mass lying to the left.

• There is less than a 5% chance of observing the data under the null if $\hat t$ < “critical value.”

$$\hat t \sim t_{n-2}$$
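The t-statistic for a slope can be computed directly from data. The sketch below assumes a regression with one explanator and an intercept, so the degrees of freedom are $n - 2$; the data and the null value of 2.0 are hypothetical, and the function name is made up for this illustration.

```python
from math import sqrt

def slope_t_statistic(xs, ys, beta_null):
    """Return (t_hat, degrees_of_freedom) for H0: beta_1 = beta_null with one explanator."""
    n = len(xs)
    k = 1  # number of explanators, not counting the intercept
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_x2 = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum_x2
    b0 = y_bar - b1 * x_bar
    sum_e2 = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s2 = sum_e2 / (n - k - 1)          # estimate of sigma^2
    std_error = sqrt(s2 / sum_x2)      # estimated standard error of the slope
    return (b1 - beta_null) / std_error, n - k - 1

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
t_hat, df = slope_t_statistic(xs, ys, beta_null=2.0)
print(f"t = {t_hat:.3f} with {df} degrees of freedom")
```

The resulting value would then be compared with a critical value from the t-distribution with the reported degrees of freedom.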


Figure 7.1 Critical Regions for Two-Tailed and One-Tailed t-Statistics with a 0.05 Significance Level and 10 Degrees of Freedom


Significance Level

• We can now calculate the probability of observing the data IF the null hypothesis is true.

• We choose the maximum chance we are willing to risk that we accidentally commit a Type I Error (reject a null hypothesis when it is true).

• This chance is called the “significance level.”


Significance Level (cont.)

• We choose the probability we are willing to accept of a Type I Error.

• This probability is the “Significance Level.”

• The significance level gives operational meaning to how compelling a case we need to build.



• The significance level denotes the chance of committing a Type I Error.

• By historical convention, we usually reject a null hypothesis if we have less than a 5% chance of observing the data under the null hypothesis.



• 5% is the conventional significance level. Analysts also often look at the 1% and 10% levels.


Critical Region

• We know the distribution of our test statistic under the null hypothesis.

• We can calculate the values of the test statistic for which we would reject the null hypothesis (i.e., values that we would have less than a 5% chance of observing under the null hypothesis).


Critical Region (cont.)

• We can calculate the values of the test statistic for which we would reject the null hypothesis.

• These values are called the “critical region.”


Critical Region

• Regression packages routinely report estimated coefficients, their estimated standard errors, and the t-statistics associated with the null hypothesis that an individual coefficient is equal to zero.

• Some programs also report a “p-value” for each estimated coefficient.

• This reported p-value is the smallest significance level for a two-sided test at which one would reject the null that the coefficient is zero.


One-Sided, Two-Sided Tests

• t-tests come in two flavors: 1-sided and 2-sided.

• 2-sided tests are much more common:

– $H_0: \beta = \beta^*$

– $H_a: \beta \ne \beta^*$

• 1-sided tests look at only one side:

– $H_0: \beta \ge \beta^*$

– $H_a: \beta < \beta^*$


One-Sided, Two-Sided Tests (cont.)

• The procedure for both 1-sided and 2-sided tests is very similar.

• For either test, you construct the same t-statistic:

$$\hat t = \frac{\hat\beta - \beta^*}{\text{s.e.}(\hat\beta)}$$


One-Sided, Two-Sided Tests (cont.)

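• Once you have your t-statistic, you need to choose a “critical value.” The critical value is the boundary point for the critical region. For a 2-sided test, you reject the null hypothesis if your t-statistic is greater in magnitude than the critical value; for a 1-sided test, the t-statistic must also fall in the tail named by the alternative hypothesis.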

• The choice of critical value depends on the type of test you are running.


Critical Value for 1-Sided Test

• For a 1-sided test, you need a critical value such that a fraction $\alpha$ of the distribution of the test statistic is greater than (or less than) the critical value. $\alpha$ is our significance level (for example, 5%).


Critical Value for 1-Sided Test (cont.)

• In our CAPM example, we want to test:

$$H_0: \beta_0 \ge 0$$

$$H_a: \beta_0 < 0$$

• We need a critical value $t^*$ such that a fraction $\alpha$ of the distribution of our test statistic is less than $t^*$


Critical Value of a 1-Sided Test (cont.)

• For a 5% significance level and a large sample size, $t^* = -1.64$

• We reject the null hypothesis if:

$$\hat t < -1.64$$


Critical Value for a 2-Sided test

• For a 2-sided test, we need to spread our critical region over both tails. We need a critical value $t^*$ such that

– $\alpha/2$ of the distribution is to the right of $t^*$

– $\alpha/2$ of the distribution is to the left of $-t^*$

• Summing both tails, $\alpha$ of the distribution is beyond either $t^*$ or $-t^*$


Critical Value for 2-Sided Test

• For a large sample size, the critical value for a 2-sided test at the 5% level is 1.96

• You reject the null hypothesis if:

$$\hat t > 1.96 \quad \text{or} \quad \hat t < -1.96$$
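The large-sample critical values quoted above can be reproduced from the Standard Normal distribution. This is a sketch that uses the normal approximation only; in small samples the critical values come from the t-distribution with $n - k - 1$ degrees of freedom and are larger in magnitude. The function names are made up for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def one_sided_lower_critical_value(alpha):
    """t* with a fraction alpha of the Standard Normal lying to its left."""
    return STANDARD_NORMAL.inv_cdf(alpha)

def two_sided_critical_value(alpha):
    """t* with alpha/2 of the Standard Normal lying to the right of t* (and alpha/2 left of -t*)."""
    return STANDARD_NORMAL.inv_cdf(1 - alpha / 2)

def reject_two_sided(t_hat, alpha=0.05):
    """Large-sample 2-sided rule: reject if t_hat is beyond t* in either tail."""
    return abs(t_hat) > two_sided_critical_value(alpha)

print(f"1-sided, 5%: {one_sided_lower_critical_value(0.05):.2f}")
print(f"2-sided, 5%: {two_sided_critical_value(0.05):.2f}")
```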


P-values

• The p-value is the smallest significance level for which you could reject the null hypothesis.

• The smaller the p-value, the stricter the significance level at which you can reject the null hypothesis.


P-values (cont.)

• Many statistics packages automatically report the p-value for a two-sided test of the null hypothesis that a coefficient is 0

• If p < 0.05, then you could reject the null that $\beta = 0$ at a significance level of 0.05

• The coefficient “is significant at the 95% confidence level.”
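A two-sided p-value can be computed from a t-statistic and its degrees of freedom. The sketch below is one way to do it with only the standard library: it integrates the t-distribution's density numerically with Simpson's rule. Statistics packages use their own routines; the function names and the number of integration intervals here are assumptions made for this illustration.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_hat, df, n_intervals=2000):
    """P(|T| >= |t_hat|) for T ~ t_df, using Simpson's rule on [0, |t_hat|]."""
    upper = abs(t_hat)
    h = upper / n_intervals  # n_intervals must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, n_intervals):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_central_mass = total * h / 3  # P(0 < T < |t_hat|)
    return max(0.0, 1.0 - 2.0 * half_central_mass)

# With 10 degrees of freedom, t = 2.228 sits at the 5% two-sided critical value.
p = two_sided_p_value(2.228, df=10)
print(f"p-value = {p:.4f}")
```

A p-value below 0.05 means the null that the coefficient is zero would be rejected at the 5% significance level.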


Statistical Significance

• A coefficient is “statistically significant at the 95% confidence level” if we could reject the null that $\beta = 0$ at the 5% significance level.

• In economics, the word “significant” means “statistically significant” unless otherwise qualified.


Performing Tests

• How do we compute these test statistics using our software?


Power

• Type I Error: reject a null hypothesis when it is true

• Type II Error: fail to reject a null hypothesis when it is false

• We have devised a procedure based on choosing the probability of a Type I Error.

• What about Type II Errors?


Power (cont.)

• The probability that our hypothesis test rejects a null hypothesis when it is false is called the Power of the test.

• (1 – Power) is the probability of a Type II Error.


Power (cont.)

• If a test has a low probability of rejecting the null hypothesis when that hypothesis is false, we say that the test is “weak” or has “low power.”

• The higher the standard error of our estimator, the weaker the test.

• More efficient estimators allow for more powerful tests.


Power (cont.)

• Power depends on the particular Ha you are considering. The closer Ha is to H0, the harder it is to reject the null hypothesis.
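How power varies with the alternative and with the standard error can be sketched for a large-sample two-sided test. The calculation below assumes the test statistic is approximately normal with a known standard error, which is a simplification of the t-test; the parameter values and the function name are hypothetical.

```python
from statistics import NormalDist

def two_sided_power(beta_true, beta_null, std_error, alpha=0.05):
    """Probability of rejecting H0: beta = beta_null when the true value is beta_true.

    Assumes the test statistic is N(shift, 1), with shift = (beta_true - beta_null) / std_error.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = (beta_true - beta_null) / std_error
    # Reject when the statistic lands below -critical or above +critical.
    return z.cdf(-critical - shift) + (1 - z.cdf(critical - shift))

# Hypothetical alternatives at increasing distance from the null of 0.
for beta_true in (0.0, 0.5, 1.0, 2.0):
    power = two_sided_power(beta_true, beta_null=0.0, std_error=0.5)
    print(f"true beta = {beta_true}: power = {power:.3f}")
```

When the true value equals the null value, the rejection probability is just the significance level. Power rises as the alternative moves away from the null or as the standard error shrinks.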


Figure 7.2 Distribution of $\hat\beta_s$ for $\beta_s$ = -2, 0, and a Little Less Than 0


Figure 7.3 Power Curves for Two-Tailed Tests of $H_0: \beta_s = 0$


Figure SA.12 The Distribution of the t-Statistic Given the Null Hypothesis is False and = + 5


Figure SA.13 The t-Statistic’s Power When the Sample Size Grows


Review

• To test a null hypothesis, we:

– Assume the null hypothesis is true;

– Calculate a test statistic, assuming the null hypothesis is true;

– Reject the null hypothesis if we would be very unlikely to observe the test statistic under the null hypothesis.


Six Steps to Hypothesis Testing

1. State the null and alternative hypotheses

2. Choose a test statistic (so far, we have learned the t-test)

3. Choose a significance level, the probability of a Type I Error (typically 5%)


Six Steps to Hypothesis Tests (cont.)

4. Find the critical region for the test (for a 2-sided t-test at the 5% level in large samples, the critical value is t*=1.96)

5. Calculate the test statistic

6. Reject the null hypothesis if the test statistic falls within the critical region

$$\hat t = \frac{\hat\beta - \beta^*}{\text{s.e.}(\hat\beta)}$$

Is $\hat t > t^*$ or $\hat t < -t^*$?