Statistics for Business and Economics
![Page 1: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/1.jpg)
Statistics for Business and Economics
Chapter 11: Multiple Regression and Model Building
![Page 2: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/2.jpg)
Learning Objectives
1. Explain the Linear Multiple Regression Model
2. Describe Inference About Individual Parameters
3. Test Overall Significance
4. Explain Estimation and Prediction
5. Describe Various Types of Models
6. Describe Model Building
7. Explain Residual Analysis
8. Describe Regression Pitfalls
![Page 3: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/3.jpg)
Types of Regression Models
Regression models
• Simple (1 explanatory variable): linear or non-linear
• Multiple (2+ explanatory variables): linear or non-linear
![Page 4: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/4.jpg)
Models With Two or More Quantitative Variables
![Page 6: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/6.jpg)
Multiple Regression Model
• General form: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
• k independent variables
• x1, x2, …, xk may be functions of variables
— e.g. x2 = (x1)²
![Page 7: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/7.jpg)
Regression Modeling Steps
1. Hypothesize deterministic component
2. Estimate unknown model parameters
3. Specify probability distribution of random error term
• Estimate standard deviation of error
4. Evaluate model
5. Use model for prediction and estimation
![Page 8: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/8.jpg)
Probability Distribution of Random Error
![Page 10: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/10.jpg)
Assumptions for Probability Distribution of ε
1. Mean is 0
2. Constant variance, σ²
3. Normally Distributed
4. Errors are independent
![Page 11: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/11.jpg)
Linear Multiple Regression Model
![Page 12: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/12.jpg)
Types of Regression Models
Explanatory variable
• 1 quantitative variable: 1st-order model, 2nd-order model, 3rd-order model
• 2 or more quantitative variables: 1st-order model, interaction model, 2nd-order model
• 1 qualitative variable: dummy-variable model
![Page 14: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/14.jpg)
First–Order Multiple Regression Model
Relationship between 1 dependent and 2 or more independent variables is a linear function
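$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
• y: dependent (response) variable
• x1, …, xk: independent (explanatory) variables
• β0: population y-intercept
• β1, …, βk: population slopes
• ε: random error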
![Page 15: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/15.jpg)
First-Order Model With 2 Independent Variables
• Relationship between 1 dependent and 2 independent variables is a linear function
• Model: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
• Assumes no interaction between x1 and x2
— Effect of x1 on E(y) is the same regardless of x2 values
![Page 16: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/16.jpg)
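Population Multiple Regression Model
Bivariate model: $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
Response plane: $E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}$
[Figure: response plane over the (x1, x2) plane with intercept β0; the observed y at (x1i, x2i) lies εi away from the plane.]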
![Page 17: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/17.jpg)
Sample Multiple Regression Model
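Bivariate model: $y_i = \hat\beta_0 + \hat\beta_1 x_{1i} + \hat\beta_2 x_{2i} + \hat\varepsilon_i$
Fitted response plane: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{1i} + \hat\beta_2 x_{2i}$
[Figure: fitted response plane over the (x1, x2) plane with intercept β̂0; the observed y at (x1i, x2i) lies ε̂i away from the plane.]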
![Page 18: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/18.jpg)
No Interaction
Effect (slope) of x1 on E(y) does not depend on x2 value
E(y) = 1 + 2x1 + 3x2
• x2 = 0: E(y) = 1 + 2x1 + 3(0) = 1 + 2x1
• x2 = 1: E(y) = 1 + 2x1 + 3(1) = 4 + 2x1
• x2 = 2: E(y) = 1 + 2x1 + 3(2) = 7 + 2x1
• x2 = 3: E(y) = 1 + 2x1 + 3(3) = 10 + 2x1
[Figure: E(y) against x1, four parallel lines, one for each x2 value.]
![Page 19: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/19.jpg)
Parameter Estimation
![Page 21: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/21.jpg)
First-Order Model Worksheet
Case, i   yi   x1i   x2i
1         1    1     3
2         4    8     5
3         1    3     2
4         3    5     6
:         :    :     :
Run regression with y, x1, x2
![Page 22: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/22.jpg)
Multiple Linear Regression Equations
Too complicated by hand! Ouch!
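For reference, the equations the software solves can be written compactly in matrix form. This is the standard least-squares result, not something shown on the slide. The notation is introduced here as an assumption: X is the n × (k + 1) matrix whose first column is all 1s and whose other columns hold x1, …, xk, and y is the n × 1 vector of responses. The first form requires that X'X be invertible.

```latex
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
\qquad\text{equivalently}\qquad
(X^{\top} X)\,\hat{\boldsymbol{\beta}} = X^{\top} \mathbf{y}
```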
![Page 23: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/23.jpg)
Interpretation of Estimated Coefficients
1. Slope ($\hat\beta_k$)
• Estimated y changes by $\hat\beta_k$ for each 1-unit increase in xk, holding all other variables constant
– Example: if $\hat\beta_1 = 2$, then sales (y) is expected to increase by 2 for each 1-unit increase in advertising (x1), given the number of sales reps (x2)
2. Y-Intercept ($\hat\beta_0$)
• Average value of y when every xk = 0
![Page 24: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/24.jpg)
1st Order Model Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.) and newspaper circulation (000) on the number of ad responses (00). Estimate the unknown parameters.
You’ve collected the following data:
Resp (y)   Size (x1)   Circ (x2)
1          1           2
4          8           8
1          3           1
3          5           7
2          6           4
4          10          6
![Page 25: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/25.jpg)
Parameter Estimation Computer Output
Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob>|T|
INTERCEP   1    0.0640               0.2599           0.246               0.8214
ADSIZE     1    0.2049               0.0588           3.656               0.0399
CIRC       1    0.2805               0.0686           4.089               0.0264
The Parameter Estimate column gives $\hat\beta_0$ (INTERCEP), $\hat\beta_1$ (ADSIZE), and $\hat\beta_2$ (CIRC):
$\hat{y} = .0640 + .2049x_1 + .2805x_2$
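As a rough check on the computer output, the three estimates can be reproduced from the six data rows of the example. The sketch below is only an illustration: the function name `fit_two_predictors` is made up here, and the closed form it uses (centered sums of squares and cross-products) covers the two-predictor case only.

```python
def fit_two_predictors(y, x1, x2):
    """Least-squares estimates (b0, b1, b2) for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    # Centered sums of squares and cross-products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero only if x1 and x2 are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


resp = [1, 4, 1, 3, 2, 4]    # y: ad responses (00)
size = [1, 8, 3, 5, 6, 10]   # x1: ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # x2: circulation (000)

b0, b1, b2 = fit_two_predictors(resp, size, circ)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

The printed values should agree with the INTERCEP, ADSIZE, and CIRC estimates to rounding.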
![Page 26: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/26.jpg)
Interpretation of Coefficients Solution
1. Slope ($\hat\beta_1$)
• Number of responses to ad is expected to increase by .2049 (20.49) for each 1 sq. in. increase in ad size, holding circulation constant
2. Slope ($\hat\beta_2$)
• Number of responses to ad is expected to increase by .2805 (28.05) for each 1 unit (1,000) increase in circulation, holding ad size constant
![Page 27: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/27.jpg)
Estimation of σ²
![Page 29: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/29.jpg)
Estimation of σ²
For a model with k independent variables:
$s^2 = \dfrac{SSE}{n-(k+1)}$, where $SSE = \sum (y_i - \hat{y}_i)^2$
$s = \sqrt{s^2} = \sqrt{\dfrac{SSE}{n-(k+1)}}$
![Page 30: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/30.jpg)
Calculating s² and s Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find SSE, s², and s.
![Page 31: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/31.jpg)
Analysis of Variance Computer Output
Analysis of Variance
Source           DF   SS         MS         F       P
Regression       2    9.249736   4.624868   55.44   .0043
Residual Error   3    .250264    .083421
Total            5    9.5
SSE is the Residual Error SS (.250264); s² is the Residual Error MS (.083421).
$s^2 = \dfrac{.250264}{6-3} = .083421$
$s = \sqrt{.083421} = .2888$
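The same quantities can be recomputed directly from the residuals. The sketch below assumes the rounded coefficient estimates reported in the parameter-estimates output (.0640, .2049, .2805), so the result matches the ANOVA table only to rounding. Variable names are illustrative.

```python
import math

resp = [1, 4, 1, 3, 2, 4]    # y: ad responses (00)
size = [1, 8, 3, 5, 6, 10]   # x1: ad size (sq. in.)
circ = [2, 8, 1, 7, 4, 6]    # x2: circulation (000)
b0, b1, b2 = 0.0640, 0.2049, 0.2805  # rounded estimates from the computer output

n, k = len(resp), 2
fitted = [b0 + b1 * a + b2 * b for a, b in zip(size, circ)]
sse = sum((y - f) ** 2 for y, f in zip(resp, fitted))  # sum of squared residuals
s2 = sse / (n - (k + 1))                               # estimate of sigma^2
s = math.sqrt(s2)                                      # estimate of sigma
print(round(sse, 4), round(s2, 4), round(s, 4))
```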
![Page 32: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/32.jpg)
Evaluating the Model
![Page 34: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/34.jpg)
Evaluating Multiple Regression Model Steps
1. Examine variation measures
2. Test parameter significance
• Individual coefficients
• Overall model
3. Do residual analysis
![Page 35: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/35.jpg)
Variation Measures
![Page 37: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/37.jpg)
Multiple Coefficient of Determination
• Proportion of variation in y ‘explained’ by all x variables taken together
• $R^2 = \dfrac{\text{Explained variation}}{\text{Total variation}} = 1 - \dfrac{SSE}{SS_{yy}}$
• Never decreases when new x variable is added to model
— Only y values determine SSyy
— Disadvantage when comparing models
![Page 38: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/38.jpg)
Adjusted Multiple Coefficient of Determination
• Takes into account n and number of parameters
• Similar interpretation to R²
• $R_a^2 = 1 - \left[\dfrac{n-1}{n-(k+1)}\right]\left(1 - R^2\right)$
![Page 39: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/39.jpg)
Estimation of R² and Ra² Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find R² and Ra².
![Page 40: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/40.jpg)
Excel Computer Output Solution
[Excel regression output, with R² and Ra² highlighted.]
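R² and Ra² can be worked out from the Analysis of Variance output shown earlier for this example (SSE = .250264, SSyy = 9.5, n = 6, k = 2). A minimal sketch, with variable names chosen here for illustration:

```python
sse, ss_yy = 0.250264, 9.5   # Residual Error SS and Total SS from the ANOVA output
n, k = 6, 2                  # sample size and number of independent variables

r2 = 1 - sse / ss_yy
r2_adj = 1 - (n - 1) / (n - (k + 1)) * (1 - r2)
print(round(r2, 4), round(r2_adj, 4))
```

The adjusted value is smaller than R² because the factor (n − 1)/(n − (k + 1)) exceeds 1 whenever k ≥ 1.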
![Page 41: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/41.jpg)
Testing Parameters
![Page 43: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/43.jpg)
Inference for an Individual β Parameter
• Confidence Interval: $\hat\beta_i \pm t_{\alpha/2}\, s_{\hat\beta_i}$, with df = n – (k + 1)
• Hypothesis Test
H0: βi = 0
Ha: βi ≠ 0 (or < or >)
• Test Statistic: $t = \dfrac{\hat\beta_i}{s_{\hat\beta_i}}$
![Page 44: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/44.jpg)
Confidence Interval Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Find a 95% confidence interval for β1.
![Page 45: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/45.jpg)
Excel Computer Output Solution
[Excel regression output, with $\hat\beta_1$ and $s_{\hat\beta_1}$ highlighted.]
![Page 46: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/46.jpg)
Confidence Interval Solution
$\hat\beta_1 \pm t_{\alpha/2}\, s_{\hat\beta_1}$
$.204921 \pm 3.182(.058822)$
$.0177 \le \beta_1 \le .3921$
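The interval arithmetic is short enough to check directly. In this sketch the critical value 3.182 (t with 3 df, .025 in each tail) is taken from the solution above rather than computed, and the variable names are illustrative.

```python
b1_hat, se_b1 = 0.204921, 0.058822   # estimate and standard error for ad size
t_crit = 3.182                       # t_{.025}, df = 6 - (2 + 1) = 3

margin = t_crit * se_b1
lower, upper = b1_hat - margin, b1_hat + margin
print(round(lower, 4), round(upper, 4))
```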
![Page 47: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/47.jpg)
Hypothesis Test Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Test the hypothesis that the mean ad response increases as circulation increases (ad size constant). Use α = .05.
![Page 49: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/49.jpg)
Excel Computer Output Solution
[Excel regression output, with $\hat\beta_2$ and $s_{\hat\beta_2}$ highlighted.]
![Page 50: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/50.jpg)
Hypothesis Test Solution
• H0: β2 = 0
• Ha: β2 > 0
• α = .05
• df = 6 - 3 = 3
• Critical Value(s): t = 2.353 (reject H0 if t > 2.353; .05 in the upper tail)
Test Statistic: $t = \dfrac{\hat\beta_2}{s_{\hat\beta_2}} = \dfrac{.280492}{.068602} = 4.089$
Decision: Reject at α = .05
Conclusion: There is evidence the mean ad response increases as circulation increases
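The test statistic and decision can be checked in a few lines. The critical value 2.353 is copied from the solution above, not computed, and the names are illustrative.

```python
b2_hat, se_b2 = 0.280492, 0.068602   # estimate and standard error for circulation
t_crit = 2.353                       # upper-tail t_{.05}, df = 3

t_stat = b2_hat / se_b2
reject_h0 = t_stat > t_crit          # one-sided test: Ha is beta2 > 0
print(round(t_stat, 3), reject_h0)
```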
![Page 51: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/51.jpg)
Excel Computer Output Solution
[Excel regression output, with $t = \hat\beta_2 / s_{\hat\beta_2}$ and its P-value highlighted.]
![Page 53: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/53.jpg)
Testing Overall Significance
• Shows if there is a linear relationship between all x variables together and y
• Hypotheses
— H0: β1 = β2 = ... = βk = 0 (no linear relationship)
— Ha: At least one coefficient is not 0 (at least one x variable affects y)
![Page 54: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/54.jpg)
Testing Overall Significance
• Test Statistic: $F = \dfrac{MS(\text{Model})}{MS(\text{Error})}$
• Degrees of Freedom: ν1 = k, ν2 = n – (k + 1)
— k = Number of independent variables
— n = Sample size
![Page 55: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/55.jpg)
Testing Overall Significance Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Conduct the global F–test of model usefulness. Use α = .05.
![Page 57: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/57.jpg)
Testing Overall Significance Computer Output
Analysis of Variance
Source    DF   Sum of Squares   Mean Square   F Value   Prob>F
Model     2    9.2497           4.6249        55.440    0.0043
Error     3    0.2503           0.0834
C Total   5    9.5000
Model DF = k; Error DF = n – (k + 1). The Mean Square column gives MS(Model) and MS(Error).
![Page 58: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/58.jpg)
Testing Overall Significance Solution
• H0: β1 = β2 = 0
• Ha: At least 1 not zero
• α = .05
• ν1 = 2, ν2 = 3
• Critical Value(s): F = 9.55 (reject H0 if F > 9.55)
Test Statistic: $F = \dfrac{MS(\text{Model})}{MS(\text{Error})} = \dfrac{4.6249}{.0834} = 55.44$
Decision: Reject at α = .05
Conclusion: There is evidence at least 1 of the coefficients is not zero
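A quick check of the F arithmetic. It uses the unrounded mean squares from the earlier Analysis of Variance output (4.624868 and .083421), because dividing the four-decimal values gives about 55.45 rather than 55.44. The critical value 9.55 is copied from the solution, and the names are illustrative.

```python
ms_model, ms_error = 4.624868, 0.083421   # unrounded mean squares
f_crit = 9.55                             # F_{.05} with (2, 3) df

f_stat = ms_model / ms_error
reject_h0 = f_stat > f_crit
print(round(f_stat, 2), reject_h0)
```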
![Page 59: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/59.jpg)
Testing Overall Significance Computer Output Solution
Analysis of Variance
Source    DF   Sum of Squares   Mean Square   F Value   Prob>F
Model     2    9.2497           4.6249        55.440    0.0043
Error     3    0.2503           0.0834
C Total   5    9.5000
F Value = MS(Model) / MS(Error); Prob>F is the P-value.
![Page 60: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/60.jpg)
Interaction Models
![Page 62: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/62.jpg)
Interaction Model With 2 Independent Variables
• Hypothesizes interaction between pairs of x variables
— Response to one x variable varies at different levels of another x variable
• Contains two-way cross product terms: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
• Can be combined with other models
— Example: dummy-variable model
![Page 63: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/63.jpg)
Effect of Interaction
Given: $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$
• Without interaction term, effect of x1 on y is measured by β1
• With interaction term, effect of x1 on y is measured by β1 + β3x2
— Effect increases as x2 increases (when β3 > 0)
![Page 64: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/64.jpg)
Interaction Model Relationships
Effect (slope) of x1 on E(y) depends on x2 value
E(y) = 1 + 2x1 + 3x2 + 4x1x2
• x2 = 0: E(y) = 1 + 2x1 + 3(0) + 4x1(0) = 1 + 2x1
• x2 = 1: E(y) = 1 + 2x1 + 3(1) + 4x1(1) = 4 + 6x1
[Figure: E(y) against x1, two non-parallel lines, one for each x2 value.]
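The dependence of the x1 slope on x2 can be seen numerically. This small sketch uses the slide's example equation; the function names are illustrative.

```python
def expected_y(x1, x2):
    """E(y) = 1 + 2*x1 + 3*x2 + 4*x1*x2, the example interaction model."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_in_x1(x2):
    """Change in E(y) per 1-unit increase in x1, with x2 held fixed."""
    return expected_y(1, x2) - expected_y(0, x2)


for x2 in (0, 1):
    print("x2 =", x2, "intercept =", expected_y(0, x2), "slope =", slope_in_x1(x2))
```

Here the slope equals β1 + β3x2 = 2 + 4x2, so each line in the plot has a different slope.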
![Page 65: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/65.jpg)
Interaction Model Worksheet
Case, i   yi   x1i   x2i   x1i x2i
1         1    1     3     3
2         4    8     5     40
3         1    3     2     6
4         3    5     6     30
:         :    :     :     :
Multiply x1 by x2 to get x1x2. Run regression with y, x1, x2, x1x2
![Page 66: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/66.jpg)
Interaction Example
You work in advertising for the New York Times. You want to find the effect of ad size (sq. in.), x1, and newspaper circulation (000), x2, on the number of ad responses (00), y. Conduct a test for interaction. Use α = .05.
![Page 67: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/67.jpg)
Interaction Model Worksheet
yi   x1i   x2i   x1i x2i
1    1     2     2
4    8     8     64
1    3     1     3
3    5     7     35
2    6     4     24
4    10    6     60
Multiply x1 by x2 to get x1x2. Run regression with y, x1, x2, x1x2
![Page 68: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/68.jpg)
Excel Computer Output Solution
Global F–test indicates at least one parameter is not zero
[Excel regression output, with F and its P-value highlighted.]
![Page 70: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/70.jpg)
Excel Computer Output Solution
[Excel regression output, with $t = \hat\beta_3 / s_{\hat\beta_3}$ highlighted.]
![Page 71: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/71.jpg)
Interaction Test Solution
• H0: β3 = 0
• Ha: β3 ≠ 0
• α = .05
• df = n – (k + 1) = 6 - 4 = 2 (the interaction model has k = 3 terms: x1, x2, x1x2)
• Critical Value(s): t = ±4.303 (.025 in each tail)
Test Statistic: t = 1.8528
Decision: Do not reject at α = .05
Conclusion: There is no evidence of interaction
![Page 72: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/72.jpg)
Second–Order Models
![Page 74: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/74.jpg)
Second-Order Model With 1 Independent Variable
• Relationship between 1 dependent and 1 independent variable is a quadratic function
• Useful 1st model if non-linear relationship suspected
• Model: $E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$
— β1x: linear effect
— β2x²: curvilinear effect
![Page 75: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/75.jpg)
Second-Order Model Relationships
[Figure: four curves of y against x1, two with β2 > 0 and two with β2 < 0.]
![Page 76: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/76.jpg)
Second-Order Model Worksheet
Case, i   yi   xi   xi²
1         1    1    1
2         4    8    64
3         1    3    9
4         3    5    25
:         :    :    :
Create x² column. Run regression with y, x, x².
![Page 77: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/77.jpg)
2nd Order Model Example
The data shows the number of weeks employed and the number of errors made per day for a sample of assembly line workers. Find a 2nd order model, conduct the global F–test, and test if β2 ≠ 0. Use α = .05 for all tests.
Errors (y)   Weeks (x)
20           1
18           1
16           2
10           4
8            4
4            5
3            6
1            8
2            10
1            11
0            12
1            12
![Page 78: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/78.jpg)
Second-Order Model Worksheet
yi   xi   xi²
20   1    1
18   1    1
16   2    4
10   4    16
:    :    :
Create x² column. Run regression with y, x, x².
![Page 79: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/79.jpg)
Excel Computer Output Solution
$\hat{y} = 23.728 - 4.784x + .242x^2$
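The fitted quadratic can be reproduced from the twelve (errors, weeks) pairs in the example by solving the normal equations (X'X)b = X'y. The sketch below uses only the standard library. The `solve` helper is a plain Gaussian elimination written for this illustration, not the routine Excel uses.

```python
def solve(a, b):
    """Solve the square system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - tail) / m[r][r]
    return coef


errors = [20, 18, 16, 10, 8, 4, 3, 1, 2, 1, 0, 1]   # y: errors per day
weeks = [1, 1, 2, 4, 4, 5, 6, 8, 10, 11, 12, 12]    # x: weeks employed

rows = [[1.0, x, x ** 2] for x in weeks]             # design matrix columns: 1, x, x^2
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * y for r, y in zip(rows, errors)) for i in range(3)]

b0, b1, b2 = solve(xtx, xty)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

The negative linear coefficient and positive squared coefficient describe errors that fall quickly at first and then level off.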
![Page 80: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/80.jpg)
Overall Model Test Solution
Global F–test indicates at least one parameter is not zero
[Excel regression output, with F and its P-value highlighted.]
![Page 81: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/81.jpg)
β2 Parameter Test Solution
β2 test indicates curvilinear relationship exists
[Excel regression output, with t for β2 and its P-value highlighted.]
![Page 83: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/83.jpg)
Second-Order Model With 2 Independent Variables
• Relationship between 1 dependent and 2 independent variables is a quadratic function
• Useful 1st model if non-linear relationship suspected
• Model: $E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2$
![Page 84: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/84.jpg)
Second-Order Model Relationships
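$E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2$
[Figure: three response surfaces of y over x1 and x2, labeled β4 + β5 > 0, β4 + β5 < 0, and β3² > 4β4β5.]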
![Page 85: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/85.jpg)
Second-Order Model Worksheet
Case, i   yi   x1i   x2i   x1i x2i   x1i²   x2i²
1         1    1     3     3         1      9
2         4    8     5     40        64     25
3         1    3     2     6         9      4
4         3    5     6     30        25     36
:         :    :     :     :         :      :
Multiply x1 by x2 to get x1x2; then create x1², x2². Run regression with y, x1, x2, x1x2, x1², x2².
![Page 86: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/86.jpg)
Models With One Qualitative Independent Variable
![Page 88: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/88.jpg)
Dummy-Variable Model
• Involves categorical x variable with 2 levels
— e.g., male-female; college-no college
• Variable levels coded 0 and 1
• Number of dummy variables is 1 less than number of levels of variable
• May be combined with quantitative variable (1st order or 2nd order model)
![Page 89: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/89.jpg)
Dummy-Variable Model Worksheet
Case, i   yi   x1i   x2i
1         1    1     1
2         4    8     0
3         1    3     1
4         3    5     1
:         :    :     :
x2 levels: 0 = Group 1; 1 = Group 2. Run regression with y, x1, x2
![Page 90: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/90.jpg)
Interpreting Dummy-Variable Model Equation
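Given: $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{1i} + \hat\beta_2 x_{2i}$
y = Starting salary of college graduates
x1 = GPA
x2 = 0 if Male, 1 if Female
Male (x2 = 0): $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{1i} + \hat\beta_2(0) = \hat\beta_0 + \hat\beta_1 x_{1i}$
Female (x2 = 1): $\hat{y}_i = \hat\beta_0 + \hat\beta_1 x_{1i} + \hat\beta_2(1) = (\hat\beta_0 + \hat\beta_2) + \hat\beta_1 x_{1i}$
Same slopes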
![Page 91: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/91.jpg)
Dummy-Variable Model Example
Computer Output: $\hat{y}_i = 3 + 5x_{1i} + 7x_{2i}$
x2 = 0 if Male, 1 if Female
Male (x2 = 0): $\hat{y}_i = 3 + 5x_{1i} + 7(0) = 3 + 5x_{1i}$
Female (x2 = 1): $\hat{y}_i = 3 + 5x_{1i} + 7(1) = (3 + 7) + 5x_{1i}$
Same slopes
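The "same slopes, different intercepts" reading can be checked by evaluating the fitted equation for both groups. A minimal sketch using the example coefficients, with an illustrative function name:

```python
def predicted_salary(gpa, female):
    """y-hat = 3 + 5*x1 + 7*x2, with x2 = 0 for male and 1 for female."""
    return 3 + 5 * gpa + 7 * female


for gpa in (2, 3, 4):
    male, fem = predicted_salary(gpa, 0), predicted_salary(gpa, 1)
    print("GPA", gpa, "male", male, "female", fem, "gap", fem - male)
```

The gap between the two groups is the dummy coefficient (7) at every GPA, and each line rises by 5 per unit of GPA.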
![Page 92: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/92.jpg)
Dummy-Variable Model Relationships
[Figure: ŷ against x1, two parallel lines with the same slope β̂1. The Male line has intercept β̂0 and the Female line has intercept β̂0 + β̂2.]
![Page 93: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/93.jpg)
Nested Models
![Page 94: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/94.jpg)
Comparing Nested Models
• Reduced model contains a subset of terms in the complete (full) model
• Tests the contribution of a set of x variables to the relationship with y
• Null hypothesis H0: βg+1 = ... = βk = 0
— Variables in set do not significantly improve the model when all other variables are included
• Used in selecting x variables or models
• Part of most computer programs
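The slide does not give the test statistic. The usual nested-model F statistic compares the SSE of the reduced model (g terms) with that of the complete model (k terms): F = [(SSE_R − SSE_C)/(k − g)] / [SSE_C/(n − (k + 1))]. The sketch below applies it to the ad-response data, taking "ad size only" as the reduced model and "ad size plus circulation" as the complete model. That pairing is an assumption made for illustration, as are the names.

```python
def centered_sum(u, v):
    """Sum of (u - mean(u)) * (v - mean(v))."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


resp = [1, 4, 1, 3, 2, 4]    # y
size = [1, 8, 3, 5, 6, 10]   # x1
circ = [2, 8, 1, 7, 4, 6]    # x2
n, g, k = len(resp), 1, 2    # reduced model has g = 1 term, complete has k = 2

syy = centered_sum(resp, resp)
s11, s22 = centered_sum(size, size), centered_sum(circ, circ)
s12 = centered_sum(size, circ)
s1y, s2y = centered_sum(size, resp), centered_sum(circ, resp)

# Reduced model: y on x1 only
sse_reduced = syy - s1y ** 2 / s11

# Complete model: y on x1 and x2
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
sse_complete = syy - b1 * s1y - b2 * s2y

f_stat = ((sse_reduced - sse_complete) / (k - g)) / (sse_complete / (n - (k + 1)))
print(round(sse_reduced, 4), round(sse_complete, 4), round(f_stat, 2))
```

With a single added term, this F equals the square of the t statistic for that coefficient (4.089² ≈ 16.72 for circulation).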
![Page 95: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/95.jpg)
Selecting Variables in Model Building
![Page 96: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/96.jpg)
Selecting Variables in Model Building
A butterfly flaps its wings in Japan, which causes it to rain in Nebraska. -- Anonymous
Use Theory Only!
Use Computer Search!
![Page 97: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/97.jpg)
Model Building with Computer Searches
• Rule: Use as few x variables as possible
• Stepwise Regression
— Computer selects x variable most highly correlated with y
— Continues to add or remove variables depending on SSE
• Best subset approach
— Computer examines all possible sets
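The best subset approach can be sketched in a few lines. This is an illustrative sketch only, not the procedure of any particular statistics package: it fits every subset of candidate x variables by least squares (solving the normal equations directly) and reports the SSE of each. The data and the names `x1`, `x2`, `x3` are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def sse(columns, y):
    """Fit y on an intercept plus the given columns; return the SSE."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Hypothetical data: y depends almost entirely on x1.
candidates = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
    "x3": [2.0, 7.0, 1.0, 8.0, 2.0, 8.0],
}
y = [5.1, 7.9, 11.2, 13.8, 17.1, 19.9]

# Examine all possible non-empty sets of x variables.
results = {}
for size in range(1, len(candidates) + 1):
    for names in combinations(candidates, size):
        results[names] = sse([candidates[name] for name in names], y)

for names, value in sorted(results.items(), key=lambda item: (len(item[0]), item[1])):
    print(names, round(value, 4))
```

SSE never increases when a variable is added, so the largest model always has the smallest SSE. The rule of using as few x variables as possible is applied by asking whether the drop in SSE from an extra variable is large enough to justify keeping it.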
![Page 98: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/98.jpg)
Residual Analysis
![Page 99: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/99.jpg)
Evaluating Multiple Regression Model Steps
1. Examine variation measures
2. Test parameter significance
• Individual coefficients
• Overall model
3. Do residual analysis
![Page 100: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/100.jpg)
Residual Analysis
• Graphical analysis of residuals
— Plot estimated errors versus $x_i$ values
  • Estimated errors are called residuals
  • Difference between actual $y_i$ and predicted $\hat{y}_i$
— Plot histogram or stem-&-leaf of residuals
• Purposes
— Examine functional form (linear v. non-linear model)
— Evaluate violations of assumptions
![Page 101: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/101.jpg)
Residual Plot for Functional Form
[Figure: residuals $e$ plotted against $x$ in two panels, "Add $x^2$ Term" and "Correct Specification".]
![Page 102: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/102.jpg)
Residual Plot for Equal Variance
[Figure: residuals $e$ plotted against $x$ in two panels, "Unequal Variance" (fan-shaped) and "Correct Specification".]
Standardized residuals are typically used.
![Page 103: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/103.jpg)
Residual Plot for Independence
[Figure: residuals $e$ plotted against $x$ in two panels, "Not Independent" and "Correct Specification".]
Plots reflect the sequence in which the data were collected.
![Page 104: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/104.jpg)
Residual Analysis Computer Output
Obs   Dep Var SALES   Predict Value   Residual   Student Residual   -2-1-0 1 2
1     1.0000          0.6000           0.4000     1.044             | |** |
2     1.0000          1.3000          -0.3000    -0.592             | *| |
3     2.0000          2.0000           0          0.000             | | |
4     2.0000          2.7000          -0.7000    -1.382             | **| |
5     4.0000          3.4000           0.6000     1.567             | |*** |
Plot of standardized (student) residuals
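The residual and student residual columns can be recomputed by hand. The sketch below makes one assumption that is not in the output: the x values are taken to be 1 through 5, which is consistent with the predicted values shown (they fit $\hat{y} = -0.1 + 0.7x$). Each student residual is the residual divided by its estimated standard error, $s\sqrt{1 - h_i}$, where $h_i$ is the leverage of observation $i$.

```python
import math

# Observed SALES values from the computer output.
y = [1.0, 1.0, 2.0, 2.0, 4.0]
# Assumption: the x values are not shown in the output; 1..5 is used here
# because it reproduces the predicted values in the table.
x = [1.0, 2.0, 3.0, 4.0, 5.0]

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares slope and intercept for the simple linear model.
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Estimated standard deviation of the error term (n - 2 degrees of freedom).
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))

# Leverage of each observation, then the student (standardized) residual.
leverage = [1 / n + (xi - x_bar) ** 2 / ss_xx for xi in x]
student = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for i in range(n):
    print(i + 1, round(predicted[i], 4), round(residuals[i], 4), round(student[i], 3))
```

Under the stated assumption, the printed values should agree with the Predict Value, Residual, and Student Residual columns up to rounding.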
![Page 105: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/105.jpg)
Regression Pitfalls
![Page 106: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/106.jpg)
Regression Pitfalls
• Parameter Estimability
— Number of different x–values must be at least one more than order of model
• Multicollinearity
— Two or more x–variables in the model are correlated
• Extrapolation
— Predicting y–values outside sampled range
• Correlated Errors
![Page 107: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/107.jpg)
Multicollinearity
• High correlation between x variables
• Coefficients measure combined effect
• Leads to unstable coefficients depending on x variables in model
• Always exists – matter of degree
• Example: using both age and height as explanatory variables in same model
![Page 108: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/108.jpg)
Detecting Multicollinearity
• Correlations between pairs of x variables are significant and stronger than their correlations with the y variable
• Non–significant t–tests for most of the individual parameters, but overall model test is significant
• Estimated parameters have wrong sign
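The first check, correlation between pairs of x variables, can be sketched as follows. The age and height figures are hypothetical values invented for illustration (echoing the age and height example), and `pearson_r` is a helper written for this sketch.

```python
import math


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical x variables: age in years and height in cm for six children.
age = [6, 8, 10, 12, 14, 16]
height = [115, 127, 138, 150, 162, 171]

r = pearson_r(age, height)
print(round(r, 4))  # a value close to 1 signals strong multicollinearity
```

When two x variables are this highly correlated, they carry nearly the same information, so their individual coefficients become unstable even if the overall model fits well.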
![Page 109: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/109.jpg)
Solutions to Multicollinearity
• Eliminate one or more of the correlated x variables
• Avoid inference on individual parameters
• Do not extrapolate
![Page 110: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/110.jpg)
Extrapolation
[Figure: $y$ plotted against $x$. Interpolation covers the sampled range of $x$; extrapolation lies on either side of the sampled range.]
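A simple guard against extrapolation is to compare a new x value with the sampled range before predicting. The sketch below is illustrative; the function name `is_extrapolation` and the sample values are hypothetical.

```python
def is_extrapolation(x_new, x_sample):
    """Return True when x_new lies outside the range of the sampled x values."""
    return x_new < min(x_sample) or x_new > max(x_sample)


x_sample = [1, 2, 3, 4, 5]  # hypothetical sampled x values

for x_new in (0, 3, 6):
    label = "extrapolation" if is_extrapolation(x_new, x_sample) else "interpolation"
    print(x_new, label)
```

For a multiple regression model the same check would be applied to each x variable, although staying within every individual range does not guarantee that the combination of values was sampled.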
![Page 111: Statistics for Business and Economics](https://reader035.fdocuments.in/reader035/viewer/2022081723/56814a33550346895db74fe0/html5/thumbnails/111.jpg)
Conclusion
1. Explained the Linear Multiple Regression Model
2. Described Inference About Individual Parameters
3. Tested Overall Significance
4. Explained Estimation and Prediction
5. Described Various Types of Models
6. Described Model Building
7. Explained Residual Analysis
8. Described Regression Pitfalls