Module 35M F Test
Uploaded by joan-bubest
8/12/2019 Module 35M F Test
http://slidepdf.com/reader/full/module-35m-f-test 1/25
TESTING THE STRENGTH OF THE MULTIPLE REGRESSION MODEL
Test 1: Are Any of the x’s Useful in Predicting y?
We are asking: Can we conclude that at least one of the β’s (other than β0) ≠ 0?
H0: β1 = β2 = β3 = β4 = 0
HA: At least one of these β’s ≠ 0
α = .05
Idea of the Test
• Measure the overall “average variability” due to changes in the x’s
• Measure the overall “average variability” that is due to randomness (error)
• If the overall “average variability” due to changes in the x’s IS A LOT LARGER than the “average variability” due to error, we conclude that at least one β is non-zero, i.e. at least one factor (x) is useful in predicting y
“Total Variability”
• Just as with simple linear regression, we have the total sum of squares due to regression, SSR, and the total sum of squares due to error, SSE, both of which are printed on the Excel output.
– The formulas are more complicated (they involve matrix operations)
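The matrix operations mentioned above can be sketched in plain Python. This is only an illustration under stated assumptions, not the routine Excel uses: the function names `solve_linear_system` and `sums_of_squares` are made up for this sketch, and the fit solves the normal equations (XᵀX)β = Xᵀy directly, which is adequate for small, well-behaved data sets.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix a is square and non-singular (no check is made).
    """
    n = len(b)
    # Augmented matrix [a | b], so each row operation updates both sides.
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def sums_of_squares(xs, y):
    """Fit y on the columns of xs plus an intercept; return (SSR, SSE, SST)."""
    design = [[1.0] + list(row) for row in xs]  # leading 1 is the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / len(y)
    ssr = sum((f - y_bar) ** 2 for f in fitted)           # due to regression
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))  # due to error
    sst = sum((yi - y_bar) ** 2 for yi in y)              # total
    return ssr, sse, sst


# Hypothetical data (two x's, six observations), for illustration only.
xs = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [9.5, 7.5, 19.0, 18.5, 28.0, 29.5]
ssr, sse, sst = sums_of_squares(xs, y)
print(ssr, sse, sst)  # SSR + SSE should equal SST up to rounding
```

With an intercept in the model, SSR + SSE = Total SS, which is a useful sanity check on any printout.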
“Average Variability”
• “Average variability” (mean variability) for a group is defined as the total variability divided by the degrees of freedom associated with that group:
• Mean Squares Due to Regression: MSR = SSR/DFR
• Mean Squares Due to Error: MSE = SSE/DFE
Degrees of Freedom
• Total number of degrees of freedom, DF(Total), always = n − 1
• Degrees of freedom for regression (DFR) = the number of factors in the regression (i.e. the number of x’s in the linear regression)
• Degrees of freedom for error (DFE) = difference between the two = DF(Total) − DFR
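As a small worked example of the degrees of freedom and the mean squares MSR = SSR/DFR and MSE = SSE/DFE, here is a minimal sketch. The function name `mean_squares` and the numbers (SSR = 900, SSE = 100, n = 24, three x's) are hypothetical and chosen only to keep the arithmetic easy.

```python
def mean_squares(ssr, sse, n, k):
    """Return (MSR, MSE) for n observations and k factors (x's)."""
    df_total = n - 1      # DF(Total) = n - 1
    dfr = k               # DFR = number of x's
    dfe = df_total - dfr  # DFE = DF(Total) - DFR
    if dfe <= 0:
        raise ValueError("need at least k + 2 observations")
    return ssr / dfr, sse / dfe


# Hypothetical values: DFR = 3, DFE = 23 - 3 = 20.
msr, mse = mean_squares(ssr=900.0, sse=100.0, n=24, k=3)
print(msr, mse)  # 300.0 5.0
```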
The F-Statistic
• The F-statistic is defined as the ratio of two measures of variability. Here,
F = MSR/MSE
• Recall we are saying that if MSR is “large” compared to MSE, at least one β ≠ 0.
• Thus if F is “large”, we conclude that HA is true, i.e. at least one β ≠ 0.
The F-test
• “Large” compared to what?
• F-tables give critical values for given values of α
• TEST: REJECT H0 (Accept HA) if:
F = MSR/MSE > Fα,DFR,DFE
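The decision rule can be written as a short function. This is a sketch under assumptions: the name `overall_f_test` is made up here, the critical value is passed in by hand from an F-table rather than computed, and the mean squares are hypothetical. The value 3.10 is the tabulated F.05,3,20 (α = .05, DFR = 3, DFE = 20).

```python
def overall_f_test(msr, mse, f_critical):
    """Return (F, reject_h0) for the test H0: all slope β's equal 0.

    f_critical is the table value F(α, DFR, DFE), supplied by the caller.
    """
    f_stat = msr / mse
    return f_stat, f_stat > f_critical


# Hypothetical mean squares; 3.10 is F(.05; 3, 20) read from an F-table.
f_stat, reject_h0 = overall_f_test(msr=300.0, mse=5.0, f_critical=3.10)
print(f_stat, reject_h0)  # 60.0 True
```

In practice the Excel printout reports Significance F (the p-value for this test), so the table lookup can be replaced by comparing that p-value with α.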
RESULTS
• If we do not get a large F statistic
– We cannot conclude that any of the variables in this model are significant in predicting y.
• If we do get a large F statistic
– We can conclude at least one of the variables is significant for predicting y.
– NATURAL QUESTION: WHICH ONES?
[Figure: Excel regression output with the following quantities labeled]
DFR = # of x’s
DFE = Total DF − DFR
Total DF = n − 1
SSR
SSE
Total SS = Σ(yi − ȳ)²
Test 2: Which Variables Are Significant IN THIS MODEL?
• The question we are asking is, “Taking all the other factors (x’s) into consideration, does a change in a particular x value (x3, say) significantly affect y?”
• This is another hypothesis test (a t-test).
• To test if the age of the house is significant:
H0: β3 = 0 (x3 is not significant in this model)
HA: β3 ≠ 0 (x3 is significant in this model)
The t-test for a particular factor IN THIS MODEL
• Reject H0 (Accept HA) if:
t = (β̂3 − 0) / s(β̂3) < −t.025,DFE or t = (β̂3 − 0) / s(β̂3) > t.025,DFE
where s(β̂3) is the standard error of the estimate β̂3
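A minimal sketch of this two-sided rule follows. The function name `coefficient_t_test`, the coefficient (−0.5) and its standard error (0.2) are hypothetical. The critical value 2.086 is the tabulated t.025,20, so the example assumes DFE = 20.

```python
def coefficient_t_test(beta_hat, std_error, t_critical):
    """Return (t, reject_h0) for the two-sided test H0: β = 0.

    t_critical is the table value t(.025, DFE), supplied by the caller.
    """
    t_stat = (beta_hat - 0.0) / std_error
    # Reject when t falls in either tail: t < -t_critical or t > t_critical.
    return t_stat, abs(t_stat) > t_critical


# Hypothetical estimate and standard error; 2.086 is t(.025, 20) from a t-table.
t_stat, reject_h0 = coefficient_t_test(beta_hat=-0.5, std_error=0.2, t_critical=2.086)
print(reject_h0)  # True, since |t| is about 2.5, which exceeds 2.086
```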
t-value for test of β3 = 0
p-value for test of β3 = 0
Reading Printout for the t-test
• Simply look at the p-value
– p-value for β3 = 0 is .02194 < .05
• Thus the age of the house is significant in this model
• The other variables
– p-value for β1 = 0 is .0000839 < .05
• Thus square feet is significant in this model
– p-value for β2 = 0 is .15503 > .05
• Thus the land (acres) is not significant in this model
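The same reading of the printout can be expressed as a comparison of each p-value with α = .05. The p-values below are the ones quoted on this slide; the dictionary labels and the function name `significant_in_this_model` are only illustrative.

```python
ALPHA = 0.05

# p-values quoted on the slide for the three-variable house price model.
p_values = {
    "square feet (β1)": 0.0000839,
    "land in acres (β2)": 0.15503,
    "age of house (β3)": 0.02194,
}


def significant_in_this_model(p_values, alpha=ALPHA):
    """Map each variable to True when its t-test p-value is below alpha."""
    return {name: p < alpha for name, p in p_values.items()}


for name, is_significant in significant_in_this_model(p_values).items():
    verdict = "significant" if is_significant else "not significant"
    print(f"{name}: {verdict} in this model")
```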
Does A Poor t-value Imply the Variable is not Useful in Predicting y?
• NO
• It says the variable is not significant IN THIS MODEL when we consider all the other factors.
• In this model, land is not significant when included with square footage and age.
• But if we had run this model without square footage, we would have gotten the output on the next slide.
p-value for land is .00000717. In this model Land is significant.
Can it even happen that F says at least one variable is significant, but none of the t’s indicate a useful variable?
• YES
EXAMPLES IN WHICH THIS MIGHT HAPPEN:
– Miles per gallon vs. horsepower and engine size
– Salary vs. GPA and GPA in major
– Income vs. age and experience
– HOUSE PRICE vs. SQUARE FOOTAGE OF HOUSE AND LAND
• There is a relation between the x’s
– Multicollinearity
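One quick, informal way to spot a relation between two x's is their sample correlation. The slides do not prescribe this check; it is offered here only as an illustration, and the data (house square footage and lot size in acres) and the function name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(u, v):
    """Sample (Pearson) correlation between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    spread_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    spread_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (spread_u * spread_v)


# Hypothetical houses: bigger houses tend to sit on bigger lots.
square_feet = [1200, 1500, 1800, 2100, 2400]
land_acres = [0.20, 0.26, 0.29, 0.36, 0.40]
print(pearson_r(square_feet, land_acres))  # close to 1: the x's move together
```

When two x's are this strongly related, each one adds little once the other is already in the model, which is how the F-test can be significant while the individual t-tests are not.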
What is Adjusted R²?
• Adjusted R² adjusts R² to take into account degrees of freedom.
• By assuming a higher order equation for y, we can force the curve to fit this one set of data points in the model, eliminating much of the variability (See next slide).
• But this is not what is going on! R² might be higher, but adjusted R² might be much lower
• Adjusted R² takes this into account
• Adjusted R² = 1 − MSE/MST, where MST = SST/(n − 1)
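The effect of the degrees-of-freedom adjustment can be seen in a small sketch. It uses the standard definition, adjusted R² = 1 − MSE/MST with MSE = SSE/(n − k − 1) and MST = SST/(n − 1). The function names and all numbers are hypothetical.

```python
def r_squared(sse, sst):
    """Proportion of total variability explained by the regression."""
    return 1.0 - sse / sst


def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R² = 1 - MSE/MST for n observations and k x's."""
    mse = sse / (n - k - 1)  # SSE divided by DFE
    mst = sst / (n - 1)      # SST divided by DF(Total)
    return 1.0 - mse / mst


# Hypothetical: a 4th term trims SSE only slightly (100 -> 99.9), n = 24.
print(r_squared(100.0, 1000.0), adjusted_r_squared(100.0, 1000.0, n=24, k=3))
print(r_squared(99.9, 1000.0), adjusted_r_squared(99.9, 1000.0, n=24, k=4))
```

In this example R² rises slightly when the extra term is added, while adjusted R² falls, because the small drop in SSE does not make up for the lost error degree of freedom.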
[Figure: scatterplot of Sales (vertical axis, $0 to $140,000) vs. Ad Dollars (horizontal axis, $0 to $1,400)]
This is not what is really going on
Review
• Are any of the x’s useful in predicting y IN THIS MODEL?
– Look at p-value for F-test (Significance F)
– F = MSR/MSE would be compared to Fα,DFR,DFE
• Which variables are significant in this model?
– Look at p-values for the individual t-tests
• What proportion of the total variance in y can be explained by changes in the x’s?
– R²
– Adjusted R² takes into account the reduced degrees of freedom for the error term that results from including more terms in the model
4 Places to Look on Excel Printout
1- Regression equation
2- Significance F: Are any variables useful?
3- p-values for t-tests: Which variables are significant in this model?
4- R²: What proportion of the variation in y can be explained by changes in the x’s?