Module 35M F Test

Post on 03-Jun-2018


Transcript of Module 35M F Test

8/12/2019 Module 35M F Test

http://slidepdf.com/reader/full/module-35m-f-test 1/25

TESTING THE STRENGTH

OF THE MULTIPLE REGRESSION MODEL


Test 1: Are Any of the x’s Useful in Predicting y?

We are asking: Can we conclude that at least one of the β’s (other than β0) ≠ 0?

H0: β1 = β2 = β3 = β4 = 0
HA: At least one of these β’s ≠ 0

α = .05


Idea of the Test

• Measure the overall “average variability” due to changes in the x’s

• Measure the overall “average variability” that is due to randomness (error)

• If the overall “average variability” due to changes in the x’s IS A LOT LARGER than the “average variability” due to error, we conclude at least one β is non-zero, i.e. at least one factor (x) is useful in predicting y


“Total Variability”

• Just like with simple linear regression, we have a total sum of squares due to regression, SSR, and a total sum of squares due to error, SSE, which are printed on the Excel output.

– The formulas are more complicated (they involve matrix operations)


“Average Variability”

• “Average variability” (mean variability) for a group is defined as the total variability divided by the degrees of freedom associated with that group:

• Mean Squares Due to Regression: MSR = SSR/DFR

• Mean Squares Due to Error: MSE = SSE/DFE


Degrees of Freedom

• Total number of degrees of freedom, DF(Total), always = n − 1

• Degrees of freedom for regression (DFR) = the number of factors in the regression (i.e. the number of x’s in the linear regression)

• Degrees of freedom for error (DFE) = difference between the two = DF(Total) − DFR
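
A minimal sketch of how the degrees of freedom turn the sums of squares into mean squares. The function name and the numbers (SSR = 900, SSE = 220, n = 15, three x’s) are hypothetical illustrations, not values from the Excel output in these slides.

```python
def mean_squares(ssr, sse, n, k):
    """Return DFR, DFE, MSR and MSE for a regression with k x's and n observations."""
    df_total = n - 1        # DF(Total) = n - 1
    dfr = k                 # DFR = number of x's in the regression
    dfe = df_total - dfr    # DFE = DF(Total) - DFR
    return {"DFR": dfr, "DFE": dfe, "MSR": ssr / dfr, "MSE": sse / dfe}


# Hypothetical sums of squares, as they might be read off a regression printout
result = mean_squares(ssr=900.0, sse=220.0, n=15, k=3)
print(result)
```

With these assumed inputs, DFE = 14 − 3 = 11, so MSR = 900/3 and MSE = 220/11.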


The F-Statistic

• The F-statistic is defined as the ratio of two measures of variability. Here,

F = MSR/MSE

• Recall we are saying that if MSR is “large” compared to MSE, at least one β ≠ 0.

• Thus if F is “large”, we conclude that HA is true, i.e. at least one β ≠ 0.


The F-test

• “Large” compared to what?

F-tables give critical values for given values of α (with DFR and DFE degrees of freedom)

• TEST: REJECT H0 (Accept HA) if:

F = MSR/MSE > F(α, DFR, DFE)
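
The decision rule can be sketched in a few lines. The mean squares below are hypothetical; the critical value 3.59 is the tabled F(.05, 3, 11) and would have to be looked up again for any other α, DFR or DFE.

```python
def f_test(msr, mse, f_critical):
    """Return the F statistic and whether H0 (all slopes = 0) is rejected."""
    f_stat = msr / mse                  # F = MSR / MSE
    return f_stat, f_stat > f_critical  # reject H0 when F exceeds the table value


# Hypothetical mean squares; 3.59 is the F-table value for alpha = .05, DFR = 3, DFE = 11
f_stat, reject_h0 = f_test(msr=300.0, mse=20.0, f_critical=3.59)
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```

In practice the “Significance F” on the printout gives the same decision without a table: reject H0 when it is below α.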


RESULTS

• If we do not get a large F statistic
 – We cannot conclude that any of the variables in this model are significant in predicting y.

• If we do get a large F statistic
 – We can conclude at least one of the variables is significant for predicting y.
 – NATURAL QUESTION: WHICH ONES?


[Excel regression output, annotated:]
DFR = number of x’s; DFE = Total DF − DFR; Total DF = n − 1
SSR; SSE; Total SS = Σ(yi − ȳ)²


Test 2: Which Variables Are Significant IN THIS MODEL?

• The question we are asking is: taking all the other factors (x’s) into consideration, does a change in a particular x value (x3, say) significantly affect y?

• This is another hypothesis test (a t-test).

• To test if the age of the house is significant:

H0: β3 = 0 (x3 is not significant in this model)
HA: β3 ≠ 0 (x3 is significant in this model)


The t-test for a particular factor IN THIS MODEL

• Reject H0 (Accept HA) if:

t = (β̂3 − 0)/s(β̂3) < −t(.025, DFE) or t > t(.025, DFE)

where s(β̂3) is the standard error of the estimate β̂3
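
A small sketch of the two-sided t-test for one coefficient. The coefficient estimate and standard error are hypothetical; 2.201 is the tabled t(.025, 11) value, which assumes DFE = 11.

```python
def t_test(b_hat, std_err, t_critical):
    """Return the t statistic for H0: beta = 0 and whether H0 is rejected (two-sided)."""
    t_stat = (b_hat - 0.0) / std_err        # estimate minus hypothesized value, over its standard error
    return t_stat, abs(t_stat) > t_critical  # reject if t < -t_crit or t > t_crit


# Hypothetical estimate and standard error; 2.201 is the t-table value for .025 with 11 DFE
t_stat, reject_h0 = t_test(b_hat=-1.5, std_err=0.5, t_critical=2.201)
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```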


t-value for test of β3 = 0

p-value for test of β3 = 0


Reading Printout for the t-test

• Simply look at the p-value
 – p-value for β3 = 0 is .02194 < .05
   • Thus the age of the house is significant in this model

• The other variables
 – p-value for β1 = 0 is .0000839 < .05
   • Thus square feet is significant in this model
 – p-value for β2 = 0 is .15503 > .05
   • Thus the land (acres) is not significant in this model
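
The same reading of the printout, written as a short check of each p-value against α = .05. The p-values are the ones quoted above; the dictionary labels are only illustrative names.

```python
ALPHA = 0.05

# p-values for the individual t-tests, as quoted from the printout
p_values = {
    "square feet (x1)": 0.0000839,
    "land, acres (x2)": 0.15503,
    "age (x3)": 0.02194,
}

# A variable is significant IN THIS MODEL when its p-value is below alpha
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, is_significant in significant.items():
    verdict = "significant" if is_significant else "not significant"
    print(f"{name}: p = {p_values[name]}, {verdict} in this model")
```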


Does A Poor t-value Imply the Variable is not Useful in Predicting y?

• NO

• It says the variable is not significant IN THIS MODEL when we consider all the other factors.

• In this model, land is not significant when included with square footage and age.

But if we had run this model without square footage, we would have gotten the output on the next slide.


p-value for land is .00000717. In this model Land is significant.


Can it even happen that F says at least one variable is significant, but none of the t’s indicate a useful variable?

• YES

EXAMPLES IN WHICH THIS MIGHT HAPPEN:
 – Miles per gallon vs. horsepower and engine size
 – Salary vs. GPA and GPA in major
 – Income vs. age and experience
 – HOUSE PRICE vs. SQUARE FOOTAGE OF HOUSE AND LAND

• There is a relation between the x’s
 – Multicollinearity
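
One quick, informal way to spot a relation between two x’s is their sample correlation. This is only a rough diagnostic sketch, not a method from the slides, and the house data below are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data: bigger houses tend to sit on bigger lots
square_feet = [1500, 1800, 2100, 2400, 3000]
land_acres = [0.5, 0.6, 0.8, 0.9, 1.2]

r = pearson_r(square_feet, land_acres)
print(f"correlation between the two x's: {r:.3f}")
```

A correlation near 1 or −1 between two x’s suggests they carry much the same information, which is the situation in which the F-test can be significant while the individual t-tests are not.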


What is Adjusted R²?

• Adjusted R² adjusts R² to take into account degrees of freedom.

• By assuming a higher order equation for y, we can force the curve to fit this one set of data points in the model, eliminating much of the variability (see next slide).

• But this is not what is going on! R² might be higher, but adjusted R² might be much lower.

• Adjusted R² takes this into account

• Adjusted R² = 1 − MSE/(SST/DF(Total)), where SST = SSR + SSE is the total sum of squares
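
A short sketch comparing R² with adjusted R², using the standard definition: adjusted R² = 1 − (SSE/DFE)/(SST/(n − 1)), with SST = SSR + SSE. The sums of squares, n and the number of x’s are hypothetical.

```python
def r_squared(ssr, sse):
    """R^2 = SSR / SST, the proportion of total variation explained by the x's."""
    return ssr / (ssr + sse)


def adjusted_r_squared(ssr, sse, n, k):
    """Adjusted R^2 = 1 - MSE / (SST / (n - 1)), with k x's and n observations."""
    sst = ssr + sse
    mse = sse / (n - 1 - k)   # SSE divided by DFE
    mst = sst / (n - 1)       # SST divided by DF(Total)
    return 1.0 - mse / mst


# Hypothetical values: SSR = 900, SSE = 220, n = 15 observations, 3 x's
r2 = r_squared(900.0, 220.0)
adj_r2 = adjusted_r_squared(900.0, 220.0, n=15, k=3)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```

Adding an x that explains little lowers DFE faster than it lowers SSE, so MSE rises and adjusted R² falls even though R² itself cannot decrease.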


[Figure: Scatterplot of Sales vs Ad Dollars. x-axis: Ad Dollars ($0 to $1,400); y-axis: Sales ($0 to $140,000). Annotation: “This is not what is really going on”]


Review

• Are any of the x’s useful in predicting y IN THIS MODEL?
 – Look at p-value for F-test (Significance F)
 – F = MSR/MSE would be compared to F(α, DFR, DFE)

• Which variables are significant in this model?
 – Look at p-values for the individual t-tests

• What proportion of the total variance in y can be explained by changes in the x’s?
 – R²
 – Adjusted R² takes into account the reduced degrees of freedom for the error term caused by including more terms in the model


4 Places to Look on Excel Printout

1- Regression equation

2- Significance F: Are any variables useful?

3- p-values for t-tests: Which variables are significant in this model?

4- R²: What proportion of y can be explained by changes in x?