Multiple Linear Regression
Sukru Acitas
Anadolu University, Department of Statistics, 26470 Eskisehir, TURKEY, [email protected].
ENM310 Experimental Design & Regression Analysis
Reference textbook
⊲ Montgomery, D. C., Peck, E. A., & Vining, G. G. (2015). Introduction to linear regression analysis. John Wiley & Sons. (Chapter 3)
Multiple Linear Regression Exp.Desg.& Reg.Ana. 2 / 35
Multiple regression model
Definition
A regression model that involves more than one regressor variable is called a multiple regression model.
Multiple regression model
The response y may be related to k regressor or predictor variables.

Statistical model

y = β0 + β1x1 + β2x2 + · · · + βkxk + ε (1)

The parameters βj (j = 0, 1, . . . , k) are called the regression coefficients.

This model describes a hyperplane in the k-dimensional space of the regressor variables xj.

The parameter βj represents the expected change in the response y per unit change in xj when all of the remaining regressor variables xi (i ≠ j) are held constant. For this reason the parameters βj are often called partial regression coefficients.
Multiple regression model
Note
Any regression model that is linear in the parameters (the β's) is a linear regression model, regardless of the shape of the surface that it generates.

Example

y = β0 + β1x1 + ε
y = β0 + β1x1 + β2x2 + ε
y = β0 + β1x1 + β2x1x2 + ε
y = β0 + β1x1³ + β2x2² + β3x3 + ε

All of these are linear regression models because each is linear in the β's. By contrast, models such as

y = β0 + √β1 x1³ + ε
y = 1/(β1x1³) + ε

are not linear regression models, since they are not linear in the parameters.
Estimation of the model parameters
The method of least squares can be used to estimate the regressioncoefficients in model (1).
Suppose that n > k observations are available; let yi denote the i-th observed response and xij denote the i-th observation (level) of regressor xj. The data will appear as in the following table.
Data for multiple linear regression
i    y     x1    x2    · · ·   xk
1    y1    x11   x12   · · ·   x1k
2    y2    x21   x22   · · ·   x2k
...  ...   ...   ...           ...
n    yn    xn1   xn2   · · ·   xnk
Estimation of the model parameters
We may write the sample regression model corresponding to Eq. (1) as follows:
Multiple linear regression model
yi = β0 + β1xi1 + β2xi2 + · · · + βkxik + εi,   i = 1, 2, . . . , n (2)

   = β0 + Σ_{j=1}^{k} βjxij + εi,   i = 1, 2, . . . , n (3)
Estimation of the model parameters
Assumptions:
The error term ε in the model has E(ε) = 0 and Var(ε) = σ², and the errors are uncorrelated.

The regressor variables are fixed (i.e., mathematical or nonrandom) variables, measured without error, and they are uncorrelated.

When testing hypotheses or constructing confidence intervals, we will have to assume that the error term ε has a normal distribution with mean 0 and variance σ².
Estimation of the model parameters
The least-squares function is

S(β0, β1, . . . , βk) = Σ_{i=1}^{n} εi² = Σ_{i=1}^{n} (yi − β0 − Σ_{j=1}^{k} βjxij)² (4)

⊲ The function S must be minimized with respect to β0, β1, β2, . . . , βk.
Estimation of the model parameters
The least-squares estimators of β0, β1, β2, . . . , βk must satisfy

∂S(β0, β1, . . . , βk)/∂β0 |_{β̂0, β̂1, . . . , β̂k} = −2 Σ_{i=1}^{n} (yi − β̂0 − Σ_{j=1}^{k} β̂jxij) = 0 (5)

and

∂S(β0, β1, . . . , βk)/∂βj |_{β̂0, β̂1, . . . , β̂k} = −2 Σ_{i=1}^{n} (yi − β̂0 − Σ_{j=1}^{k} β̂jxij) xij = 0,   j = 1, 2, . . . , k. (6)
Estimation of the model parameters
Simplifying Eqs. (5) and (6), we obtain the least-squares normal equations:
Normal equations
Σ_{i=1}^{n} yi = nβ̂0 + β̂1 Σ_{i=1}^{n} xi1 + · · · + β̂k Σ_{i=1}^{n} xik (7)

Σ_{i=1}^{n} xi1yi = β̂0 Σ_{i=1}^{n} xi1 + β̂1 Σ_{i=1}^{n} xi1² + · · · + β̂k Σ_{i=1}^{n} xi1xik (8)

...

Σ_{i=1}^{n} xikyi = β̂0 Σ_{i=1}^{n} xik + β̂1 Σ_{i=1}^{n} xikxi1 + · · · + β̂k Σ_{i=1}^{n} xik² (9)
Estimation of the model parameters
Note
There are p = k + 1 normal equations, one for each of the unknown regression coefficients. The solution to the normal equations will be the least-squares estimators β̂0, β̂1, β̂2, . . . , β̂k.

It is more convenient to deal with multiple regression models if they are expressed in matrix notation.
This allows a very compact display of the model, data, and results.
Matrix form of multiple linear regression model
In matrix notation, the model given by Eq. (3) is
Matrix notation
y = Xβ + ε (10)
Matrix form of multiple linear regression model
y = [y1, y2, . . . , yn]′,   β = [β0, β1, β2, . . . , βk]′,   ε = [ε1, ε2, . . . , εn]′, and

X =
1    x11   x12   · · ·   x1k
1    x21   x22   · · ·   x2k
...  ...   ...           ...
1    xn1   xn2   · · ·   xnk
Least-squares estimation
We wish to find the vector of least-squares estimators, β̂, that minimizes
Least-squares function: Matrix form
S(β) = Σ_{i=1}^{n} εi² = ε′ε = (y − Xβ)′(y − Xβ) (11)
Least-squares estimation
Note that S(β) may be expressed as
S(β) = y′y − β′X′y − y′Xβ + β′X′Xβ (12)
     = y′y − 2β′X′y + β′X′Xβ (13)

since β′X′y is a scalar and its transpose y′Xβ is the same scalar.
Least-squares estimation
The least-squares estimators must satisfy
∂S(β)/∂β |_{β=β̂} = −2X′y + 2X′Xβ̂ = 0 (14)
which simplifies to
Least - squares normal equations
X′y = X′Xβ̂ (15)
Least-squares estimation
Least-squares estimator
β̂ = (X′X)−1X′y (16)
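As a quick numerical check, Eq. (16) can be evaluated directly with NumPy. This is only a sketch on simulated data (the sample size, coefficients, and noise level below are arbitrary choices, not from the text); note that solving the normal equations (15) with np.linalg.solve is numerically preferable to forming (X′X)⁻¹ explicitly.

```python
import numpy as np

# Simulated data for illustration only: n = 20 observations, k = 2 regressors.
rng = np.random.default_rng(0)
n, k = 20, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, k))])  # column of 1s for beta_0
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Least-squares estimator, Eq. (16): solve the normal equations X'X beta_hat = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With such small noise, beta_hat lands very close to the coefficients used to generate the data.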
Some definitions
Fitted regression model
The fitted regression model is given by
ŷ = Xβ̂ = X(X′X)−1X′y. (17)
Hat matrix
The hat matrix is defined as
H = X(X′X)−1X′ (18)
Some definitions
Note
ŷ = Hy (19)
The hat matrix maps the vector of observed values into the vector of fitted values.

The hat matrix and its properties play a central role in regression analysis.
Some definitions
Residual
The difference between the observed value yi and the corresponding fitted value ŷi is the residual ei = yi − ŷi. The n residuals may be conveniently written in matrix notation as
e = y − ŷ. (20)
Some definitions
Alternative notation for residual
e = y − Xβ̂ (21)
= y −Hy (22)
= (I − H)y (23)
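The identities (19)-(23) are easy to verify numerically. The sketch below (simulated data, illustration only) also checks two standard properties of the hat matrix: H is symmetric and idempotent (HH = H).

```python
import numpy as np

# Simulated data for illustration only.
rng = np.random.default_rng(1)
n = 15
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix, Eq. (18)
y_hat = H @ y                           # fitted values, Eq. (19)
e = (np.eye(n) - H) @ y                 # residuals, Eq. (23)
```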
Estimation of σ2
As in simple linear regression, we may develop an estimator of σ² from the residual sum of squares

SSRes = Σ_{i=1}^{n} (yi − ŷi)² = Σ_{i=1}^{n} ei² = e′e (24)

Substituting e = y − Xβ̂, we have

SSRes = e′e (25)
      = (y − Xβ̂)′(y − Xβ̂) (26)
      = y′y − 2β̂′X′y + β̂′X′Xβ̂ (27)
      = y′y − β̂′X′y, (28)

where the last step uses the normal equations X′Xβ̂ = X′y.
Estimation of σ2
The residual sum of squares has n − p degrees of freedom associated with it, since p parameters are estimated in the regression model.
The residual mean square is
MSRes
MSRes = SSRes / (n − p) (29)
The expected value of MSRes is σ², so an unbiased estimator of σ² is
Estimator of σ2
σ̂² = MSRes. (30)
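A numerical sketch of Eqs. (24)-(30) on simulated data (all values below are illustrative only). The computational form SSRes = y′y − β̂′X′y from Eq. (28) is checked against the direct definition e′e.

```python
import numpy as np

# Simulated data for illustration only: true sigma = 2, so sigma^2 = 4.
rng = np.random.default_rng(2)
n, k = 50, 3
p = k + 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(scale=2.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
ss_res = e @ e               # SS_Res = e'e, Eq. (24)
ms_res = ss_res / (n - p)    # MS_Res, Eq. (29); unbiased estimate of sigma^2
```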
Properties of LS estimators
β̂ is an unbiased estimator of β. That is E (β̂) = β.
The variance property of β̂ is expressed by the variance-covariance matrix:

Variance of β̂

Var(β̂) = E{(β̂ − β)(β̂ − β)′} = σ²(X′X)⁻¹.
Var(β̂) is a p × p symmetric matrix.

The j-th diagonal element of Var(β̂) is the variance of β̂j.

The (i, j)-th off-diagonal element is the covariance between β̂i and β̂j.
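In practice σ² is unknown, so the covariance matrix is estimated by plugging in MSRes. A sketch on simulated data (illustration only):

```python
import numpy as np

# Simulated data for illustration only.
rng = np.random.default_rng(3)
n, k = 40, 2
p = k + 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
e = y - X @ beta_hat
ms_res = e @ e / (n - p)              # plug-in estimate of sigma^2, Eq. (30)

cov_beta = ms_res * XtX_inv           # estimated Var(beta_hat), a p x p matrix
se_beta = np.sqrt(np.diag(cov_beta))  # standard errors se(beta_hat_j)
```

The diagonal of cov_beta gives the coefficient variances; the off-diagonal entries are the pairwise covariances.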
Hypothesis testing in multiple linear regression
Once we have estimated the parameters in the model, we face two immediate questions:

What is the overall adequacy of the model?

Which specific regressors seem to be important?
Test for significance of regression
The test for significance of regression is a test to determine whether there is a linear relationship between the response y and any of the regressor variables x1, x2, . . . , xk. This procedure is often thought of as an overall or global test of model adequacy.
Hypotheses
H0 : β1 = β2 = · · · = βk = 0
H1 : βj ≠ 0 for at least one j
Test for significance of regression
Test Statistic
To test the null hypothesis H0,
F0 = (SSReg/k) / (SSRes/(n − k − 1)) = MSReg/MSRes

is used. It can be shown that, under H0, F0 has an F distribution with degrees of freedom ν1 = k and ν2 = n − k − 1.
Test for significance of regression
Total sum of squares
SST = y′y − (Σ_{i=1}^{n} yi)² / n

Regression sum of squares

SSReg = β̂′X′y − (Σ_{i=1}^{n} yi)² / n

Residual sum of squares

SSRes = y′y − β̂′X′y
Test for significance of regression
Decomposition of total sum of squares
SST = SSReg + SSRes
Test for significance of regression
ANOVA Table
Source       SS       df           MS       F
Regression   SSReg    k            MSReg    F0
Residual     SSRes    n − k − 1    MSRes
Total        SST      n − 1
Reject H0 : β1 = β2 = · · · = βk = 0 if F0 > Fα,ν1,ν2.
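The sums of squares, their decomposition SST = SSReg + SSRes, and the F statistic can be sketched numerically (simulated data; the critical value quoted in the comment is approximate):

```python
import numpy as np

# Simulated data for illustration only; the regressors really do affect y here.
rng = np.random.default_rng(4)
n, k = 30, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 3.0, -2.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
correction = y.sum() ** 2 / n
ss_t = y @ y - correction                    # total sum of squares
ss_reg = beta_hat @ (X.T @ y) - correction   # regression sum of squares
ss_res = ss_t - ss_reg                       # residual sum of squares

f0 = (ss_reg / k) / (ss_res / (n - k - 1))   # F statistic
# Reject H0 if f0 exceeds F_{alpha, k, n-k-1}; e.g. F_{0.05, 2, 27} is about 3.35.
```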
R2 and Adjusted R2
Two other ways to assess the overall adequacy of the model are R² and adjusted R², denoted R²Adj.

In general, R² never decreases when a regressor is added to the model, regardless of the contribution of that variable. Therefore, it is difficult to judge whether an increase in R² is really telling us anything important.

Adjusted R²

R²Adj = 1 − (SSRes/(n − p)) / (SST/(n − 1)) (31)

R²Adj will only increase on adding a variable to the model if the addition of the variable reduces the residual mean square.
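The contrast between R² and R²Adj can be demonstrated by adding a pure-noise regressor to a fitted model. This is a sketch on simulated data; the helper r2_pair is a hypothetical function written for this illustration, not part of any library.

```python
import numpy as np

# Simulated data for illustration only: y depends on x1 but not on x_noise.
rng = np.random.default_rng(5)
n = 25
x1 = rng.normal(size=n)
x_noise = rng.normal(size=n)            # a regressor unrelated to y
y = 2.0 + 1.5 * x1 + rng.normal(size=n)

def r2_pair(X, y):
    """Return (R^2, adjusted R^2) for an OLS fit; X includes the intercept column."""
    n, p = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    ss_res = np.sum((y - X @ beta) ** 2)
    ss_t = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_t, 1 - (ss_res / (n - p)) / (ss_t / (n - 1))

X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([X1, x_noise])
r2_1, adj_1 = r2_pair(X1, y)
r2_2, adj_2 = r2_pair(X2, y)
# r2_2 >= r2_1 always holds for nested models; adj_2 can be smaller than adj_1.
```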
Tests on individual regression coefficients
Adding a variable to a regression model always causes the sum of squares for regression to increase and the residual sum of squares to decrease.

We must decide whether the increase in the regression sum of squares is sufficient to warrant using the additional regressor in the model.

The addition of a regressor also increases the variance of the fitted value ŷ, so we must be careful to include only regressors that are of real value in explaining the response.

Furthermore, adding an unimportant regressor may increase the residual mean square, which may decrease the usefulness of the model.
Tests on individual regression coefficients
Hypotheses

H0 : βj = 0, j = 1, 2, . . . , k
H1 : βj ≠ 0.

Test statistic

t0 = β̂j / se(β̂j),   j = 1, 2, . . . , k

Reject H0 : βj = 0 if |t0| > tα/2,n−k−1.

If H0 : βj = 0 is not rejected, this indicates that the regressor xj can be deleted from the model.

This is really a partial or marginal test because the regression coefficient β̂j depends on all of the other regressor variables xi (i ≠ j) that are in the model. Thus, this is a test of the contribution of xj given the other regressors in the model.
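The individual t statistics combine the estimator (16) with the standard errors from the diagonal of MSRes(X′X)⁻¹. A sketch on simulated data (illustration only; the critical value quoted in the comment is approximate):

```python
import numpy as np

# Simulated data for illustration only: the coefficient of x2 is truly 0.
rng = np.random.default_rng(6)
n, k = 40, 2
p = k + 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
ms_res = np.sum((y - X @ beta_hat) ** 2) / (n - p)
se = np.sqrt(ms_res * np.diag(XtX_inv))  # se(beta_hat_j)

t0 = beta_hat / se                       # one t statistic per coefficient
# Compare |t0[j]| with t_{alpha/2, n-k-1}; e.g. t_{0.025, 37} is about 2.03.
```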
Thank you :)