Linear Models Of Regression: Bias-Variance Decomposition ...
Multiple regression. Problem: to draw a straight line through the points that best explains the...
Date posted: 19-Dec-2015
Category: Documents
![Page 1: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/1.jpg)
Multiple regression
![Page 2: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/2.jpg)
Problem: to draw a straight line through the points that best explains the variance
Regression
[Figure: scatter plot of data points; x-axis 0 to 6, y-axis 0 to 9]
![Page 3: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/3.jpg)
![Page 4: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/4.jpg)
![Page 5: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/5.jpg)
Test with F, just like ANOVA:
$$F = \frac{\text{variance explained by x-variable} / \text{df}}{\text{variance still unexplained} / \text{df}}$$
Regression
[Figure: two scatter-plot panels; x-axis 0 to 6, y-axis 0 to 9. One panel shows the variance explained (change in line lengths, squared); the other shows the variance unexplained (residual line lengths, squared).]
![Page 6: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/6.jpg)
In regression, each x-variable will normally have 1 df
![Page 7: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/7.jpg)
Essentially a cost-benefit analysis: is the benefit in variance explained worth the cost in degrees of freedom used up?
![Page 8: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/8.jpg)
Regression example
Total variance for 32 data points is 300 units.
An x-variable is then regressed against the data, accounting for 150 units of variance.
1. What is the R²?
2. What is the F ratio?
![Page 9: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/9.jpg)
R² = 150/300 = 0.5
$F_{1,30} = \dfrac{150/1}{150/30} = 30$
Why is df error = 30?
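The arithmetic above can be checked with a short sketch. It assumes that the "variance" figures on the slide are sums of squares, and that the error df is the number of data points, minus one for the intercept (the grand mean), minus one for each x-variable, which gives 32 - 1 - 1 = 30. The function names are illustrative and do not come from the slides.

```python
def r_squared(ss_explained, ss_total):
    """Proportion of the total sum of squares explained by the model."""
    return ss_explained / ss_total


def f_ratio(ss_explained, ss_total, n_points, n_predictors=1):
    """F = (explained SS / model df) / (unexplained SS / error df)."""
    df_model = n_predictors                  # 1 df per x-variable
    df_error = n_points - 1 - n_predictors   # 1 df is used by the intercept
    ms_model = ss_explained / df_model
    ms_error = (ss_total - ss_explained) / df_error
    return ms_model / ms_error, df_model, df_error


r2 = r_squared(150, 300)
f, df_model, df_error = f_ratio(150, 300, n_points=32)
print(f"R^2 = {r2}, F({df_model},{df_error}) = {f}")
```

With the slide's numbers this reproduces R² = 0.5 and F = 30 on 1 and 30 df.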
![Page 10: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/10.jpg)
Multiple regression
[Figure: herbivore damage (y-axis) plotted against tree age (x-axis), with higher-nutrient and lower-nutrient trees marked separately]
Damage = m1*age + b
![Page 11: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/11.jpg)
[Figure: two panels. Herbivore damage plotted against tree age; residuals of herbivore damage plotted against tree nutrient concentration.]
![Page 12: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/12.jpg)
Damage = m1*age + m2*nutrient + b
![Page 13: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/13.jpg)
Damage = m1*age + m2*nutrient + m3*age*nutrient + b
[Figure: two panels, each plotting y. One is labelled "No interaction (additive)", the other "Interaction (non-additive)".]
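As a minimal sketch of how the interaction term enters the fit, the model above can be estimated by ordinary least squares with `age*nutrient` added as an extra column. The data are synthetic and the "true" coefficient values are invented for illustration; nothing here comes from the slides except the form of the equation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: values and "true" coefficients are made up for illustration.
n = 200
age = rng.uniform(1, 50, n)
nutrient = rng.uniform(0, 4, n)
true_m1, true_m2, true_m3, true_b = 0.5, 2.0, 0.3, 1.0
damage = (true_m1 * age + true_m2 * nutrient
          + true_m3 * age * nutrient + true_b
          + rng.normal(0, 1.0, n))

# Design matrix: one column per term, plus a column of ones for the intercept b.
X = np.column_stack([age, nutrient, age * nutrient, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, damage, rcond=None)
m1, m2, m3, b = coef
print(f"m1={m1:.2f}  m2={m2:.2f}  m3={m3:.2f}  b={b:.2f}")
```

An estimate of m3 near zero would correspond to the additive case; a clearly non-zero m3 means the slope for age depends on the nutrient level.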
![Page 14: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/14.jpg)
Non-linear regression?
Just a special case of multiple regression!
Y = m1*x + m2*x² + b

| X (= X1) | X² (= X2) | Y |
|---|---|---|
| 1 | 1 | 1.1 |
| 2 | 4 | 2.0 |
| 3 | 9 | 3.6 |
| 4 | 16 | 3.1 |
| 5 | 25 | 5.2 |
| 6 | 36 | 6.7 |
| 7 | 49 | 11.3 |

Y = m1*x1 + m2*x2 + b
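The point of the slide can be sketched in a few lines: treat x and x² as two separate x-variables and fit an ordinary multiple regression. The data are the seven rows from the table; the use of NumPy's least-squares solver is an assumption about tooling, not something the slides specify.

```python
import numpy as np

# Data from the slide's table.
x = np.array([1, 2, 3, 4, 5, 6, 7], dtype=float)
y = np.array([1.1, 2.0, 3.6, 3.1, 5.2, 6.7, 11.3])

# Treat x and x^2 as two separate x-variables, x1 and x2.
x1, x2 = x, x ** 2
X = np.column_stack([x1, x2, np.ones_like(x)])

# Ordinary least squares for Y = m1*x1 + m2*x2 + b.
(m1, m2, b), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"m1={m1:.3f}  m2={m2:.3f}  b={b:.3f}")

# Cross-check: a direct quadratic fit gives the same curve.
# np.polyfit returns coefficients from the highest power down: [m2, m1, b].
assert np.allclose([m2, m1, b], np.polyfit(x, y, 2))
```

The model is non-linear in x but still linear in the parameters m1, m2 and b, which is why the same least-squares machinery applies.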
![Page 15: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/15.jpg)
STEPWISE REGRESSION
![Page 16: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/16.jpg)
[Figure: jump height (how high the ball can be raised off the ground); axis: feet off ground, 8 to 11]
Total SS = 11.11
![Page 17: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/17.jpg)
[Figure: jump (ft), 7 to 11, plotted against height (ft), 4.5 to 8.5]

| X variable | Parameter | SS | $F_{1,13}$ | p |
|---|---|---|---|---|
| Height of player | +0.943 | 9.96 | 112 | <0.0001 |
![Page 18: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/18.jpg)
[Figure: jump (ft), 7 to 11, plotted against weight (lbs), 105 to 205]

| X variable | Parameter | SS | $F_{1,13}$ | p |
|---|---|---|---|---|
| Weight of player | +0.040 | 7.92 | 32 | <0.0001 |
![Page 19: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/19.jpg)
Why do you think weight is positively correlated with jump height?
![Page 20: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/20.jpg)
An idea
Perhaps if we took two people of identical height, the lighter one might actually jump higher? Excess weight may reduce ability to jump high…
![Page 21: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/21.jpg)
How could we test this idea?
![Page 22: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/22.jpg)
[Figure: jump (ft), 7 to 11, plotted against height (ft), 4 to 8, with lighter and heavier players distinguished]

| X variable | Parameter | SS | F | p |
|---|---|---|---|---|
| Height | +2.133 | 9.956 | 803 | <0.0001 |
| Weight | -0.059 | 1.008 | 81 | <0.0001 |
![Page 23: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/23.jpg)
Questions:
• Why did the parameter estimates change?
• Why did the F tests change?
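The second question can be explored numerically from the sums of squares in the tables. The sketch below assumes n = 15 players (inferred from the error df of 13 in the one-variable fits: 15 - 1 - 1 = 13), that the SS column in the two-variable table is sequential (height entered first, then weight), and that each F is tested against the residual mean square of its own model. None of these assumptions is stated on the slides, and the slide values are rounded, so only approximate agreement should be expected.

```python
TOTAL_SS = 11.11   # from the slides
N = 15             # assumption: inferred from F(1,13) in the one-variable fits


def f_for_term(ss_term, ss_model, n_predictors):
    """F for one 1-df term, tested against the model's residual mean square."""
    df_error = N - 1 - n_predictors
    ms_error = (TOTAL_SS - ss_model) / df_error
    return ss_term / ms_error


# One x-variable at a time.
f_height_alone = f_for_term(9.96, 9.96, n_predictors=1)
f_weight_alone = f_for_term(7.92, 7.92, n_predictors=1)

# Both x-variables together: the unexplained SS shrinks to what neither explains.
ss_both = 9.956 + 1.008
f_height_both = f_for_term(9.956, ss_both, n_predictors=2)
f_weight_both = f_for_term(1.008, ss_both, n_predictors=2)

print(f"alone:    height F = {f_height_alone:.0f}, weight F = {f_weight_alone:.0f}")
print(f"together: height F = {f_height_both:.0f}, weight F = {f_weight_both:.0f}")
```

Height's SS is almost the same in both models, yet its F is far larger in the two-variable model, because weight accounts for variation that was previously counted as unexplained and so shrinks the denominator.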
![Page 24: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/24.jpg)
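Heavy people are often tall (and tall people are often heavy).
Tall people can jump higher.
People who are light for their height can jump a bit higher.
[Diagram: Weight and Height are positively associated (+); Height has a positive effect on Jump (+); Weight has a negative effect on Jump (-).]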
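The sign change described above can be reproduced with a small simulation. All numbers are invented for illustration (they are not the players from the slides): weight is generated to increase with height, and jump is generated to increase with height but decrease with weight. The `ols` helper is a hypothetical convenience function for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic players: all numbers below are invented for illustration.
n = 500
height = rng.normal(6.0, 0.5, n)                         # ft
weight = 30.0 * height - 20.0 + rng.normal(0, 5.0, n)    # lbs; taller means heavier
jump = 2.0 * height - 0.02 * weight + rng.normal(0, 0.1, n)


def ols(y, *predictors):
    """Least-squares slopes (an intercept is fitted but not returned)."""
    X = np.column_stack(predictors + (np.ones(len(y)),))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[:-1]


(weight_alone,) = ols(jump, weight)
height_both, weight_both = ols(jump, height, weight)
print(f"weight alone:          {weight_alone:+.3f}")
print(f"weight, given height:  {weight_both:+.3f}")
print(f"height, given weight:  {height_both:+.3f}")
```

On its own, weight acts as a stand-in for height and gets a positive slope. Once height is in the model, the weight slope reflects only the comparison between players of the same height, and it turns negative.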
![Page 25: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/25.jpg)
The problem:
The parameter estimate and significance of an x-variable are affected by the x-variables already in the model!
How do we know which variables are significant, and in what order to enter them into the model?
![Page 26: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/26.jpg)
Solutions
1) Use a logical order. For example, in ANCOVA it makes sense to test the interaction first.
2) Stepwise regression: "tries out" various orders of entering and removing variables.
![Page 27: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/27.jpg)
Stepwise regression
Enters or removes variables in order of significance, and checks after each step whether the significance of the other variables has changed.
Enters variables one by one: forward stepwise
Enters all variables, then removes them one by one: backward stepwise
![Page 28: Multiple regression. Problem: to draw a straight line through the points that best explains the variance Regression.](https://reader035.fdocuments.in/reader035/viewer/2022062714/56649d2a5503460f949ff7af/html5/thumbnails/28.jpg)
Forward stepwise regression
• Enter the variable with the highest correlation with the y-variable first (if p < p-enter).
• Next, enter the variable that explains the most residual variation (if p < p-enter).
• Remove variables that become insignificant (p > p-leave) because other variables were added. And so on…
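The steps above can be sketched as follows. This is a minimal illustration, not the procedure of any particular statistics package. As a stated simplification it uses F-to-enter and F-to-leave thresholds in place of p-enter and p-leave (a larger F corresponds to a smaller p; a threshold near 4 is roughly p = 0.05 for a 1-df term with moderate sample sizes). The function names, thresholds and synthetic data are all assumptions made for the example.

```python
import numpy as np


def residual_ss(y, columns):
    """Residual sum of squares after regressing y on the columns plus an intercept."""
    X = np.column_stack(list(columns) + [np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))


def partial_f(y, data, in_model, name):
    """F (1 df) for `name`, given the other variables in `in_model`."""
    others = [v for v in in_model if v != name]
    full = others + [name]
    rss_without = residual_ss(y, [data[v] for v in others])
    rss_with = residual_ss(y, [data[v] for v in full])
    df_error = len(y) - 1 - len(full)
    return (rss_without - rss_with) / (rss_with / df_error)


def forward_stepwise(y, data, f_enter=4.0, f_leave=3.9):
    """Forward stepwise selection using F-to-enter and F-to-leave thresholds."""
    selected = []
    for _ in range(2 * len(data)):   # hard cap so the sketch always terminates
        candidates = [v for v in data if v not in selected]
        if not candidates:
            break
        # Enter the candidate that explains the most residual variation.
        best = max(candidates, key=lambda v: partial_f(y, data, selected, v))
        if partial_f(y, data, selected, best) < f_enter:
            break
        selected.append(best)
        # Remove earlier variables that are no longer significant.
        for v in selected[:-1]:
            if partial_f(y, data, selected, v) < f_leave:
                selected.remove(v)
    return selected


# Synthetic example: invented numbers, with shoe_size unrelated to jump by construction.
rng = np.random.default_rng(2)
n = 200
data = {
    "height": rng.normal(6.0, 0.5, n),
    "shoe_size": rng.normal(10.0, 1.0, n),
}
data["weight"] = 30.0 * data["height"] - 20.0 + rng.normal(0, 5.0, n)
jump = 2.0 * data["height"] - 0.02 * data["weight"] + rng.normal(0, 0.1, n)

result = forward_stepwise(jump, data)
print(result)
```

In this synthetic setup, height enters first and weight second. An unrelated variable such as `shoe_size` would usually be left out, although any single random sample can admit it by chance, which is one of the known weaknesses of stepwise selection.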