
Multiple Regression Teaching Materials Agus Tri Basuki, M.Sc.

MULTIPLE REGRESSION AND CLASSICAL ASSUMPTION TESTING

In statistics, linear regression is a linear approach to modeling the relationship between a scalar response and one or more explanatory variables. The case of one explanatory variable is called simple linear regression; with more than one explanatory variable, it is called multiple linear regression.

Multiple linear regression is the most common form of linear regression analysis. As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables. The independent variables can be continuous or categorical (dummy variables given the appropriate coding).

Assumptions:

1. The regression residuals must be normally distributed.
2. A linear relationship is assumed between the dependent variable and the independent variables.
3. The residuals are homoscedastic: plotted against the fitted values, they form a roughly rectangular band of constant spread.
4. The absence of multicollinearity is assumed in the model, which means that the independent variables are not highly correlated with one another.

Benefits of Regression

There are three main uses for multiple linear regression analysis. First, it can be used to identify the strength of the effect that the independent variables have on the dependent variable. Second, it can be used to estimate the effect or impact of a change: multiple linear regression analysis helps us understand how much the dependent variable will change when we change an independent variable. Third, multiple linear regression analysis predicts future trends and values, and can be used to obtain point estimates.

When choosing a model for multiple linear regression analysis, an important consideration is whether the specification is appropriate. Adding independent variables to the model will always increase the amount of variance explained in the dependent variable (usually expressed as R²). Therefore, adding too many independent variables without theoretical justification can produce a model that overfits the data.


Example:

Here is the multiple regression equation:

GDP = β0 + β1 GFCF + β2 TR + β3 TRADE + β4 AK + β5 IVA + ε

or

LOG(GDP) = β0 + β1 LOG(GFCF) + β2 LOG(TR) + β3 LOG(TRADE) + β4 LOG(AK) + β5 LOG(IVA) + ε

where

GDP : Gross Domestic Product
GFCF : Gross Fixed Capital Formation
TR : Tax Receipts
TRADE : Trade Value
AK : Work Force
IVA : Industrial Value Added


[Figure 1. Research Steps with Regression: a flowchart running from Literature Study (previous theories and studies), Identification of Research Variables and Modeling, Making Hypotheses, Data Collection Process, and Data Processing to the Model Specification Test and Classical Assumption Test; if the assumptions are not fulfilled the model is revised, and if they are fulfilled the flow continues to Model Estimation and Hypothesis Testing, then Conclusions and Recommendations.]


ABCDEFG Country Macro Data

Year GDP Kurs (Exchange Rate) GFCF TR Trade IVA

1986 2,047,292,604,338,300 1,283 525,767,839,086,000 14,993,000,000,000 819,472,884,152,016 798,544,781,014,800

1987 2,155,799,176,654,500 1,644 554,680,733,051,900 18,827,000,000,000 998,818,533,420,652 848,962,545,204,100

1988 2,292,814,846,746,000 1,686 618,517,644,265,400 21,435,000,000,000 1,083,459,633,537,290 907,301,991,769,700

1989 2,501,110,525,830,700 1,770 710,782,070,626,400 26,678,000,000,000 1,227,592,114,499,590 1,053,730,190,492,900

1990 2,726,249,821,167,800 1,843 825,058,186,077,300 37,431,600,000,000 1,441,964,271,955,940 1,161,956,026,945,300

1991 2,969,643,847,614,300 1,950 931,494,177,183,200 39,098,000,000,000 1,628,539,758,976,440 1,277,017,158,693,800

1992 3,184,067,029,251,900 2,030 964,890,833,384,600 44,499,900,000,000 1,828,527,952,182,690 1,503,686,756,495,200

1993 3,415,041,652,725,900 2,087 1,028,569,876,323,000 47,344,000,000,000 1,725,393,071,594,540 1,482,119,880,662,300

1994 3,672,538,070,470,900 2,161 1,170,056,947,025,800 60,958,000,000,000 1,905,206,289,918,400 1,647,643,312,108,300

1995 3,980,897,516,197,100 2,249 1,333,804,863,859,200 68,017,000,000,000 2,148,036,149,401,290 1,819,329,402,675,000

1996 4,285,149,005,555,500 2,342 1,527,398,594,230,900 75,810,000,000,000 2,239,622,144,209,320 2,013,806,086,902,600

1997 4,486,545,547,065,900 2,909 1,658,266,039,607,700 100,505,700,000,000 2,512,191,662,450,940 2,117,949,036,253,500

1998 3,897,609,098,897,500 10,014 1,110,903,117,229,100 143,626,700,000,000 3,748,961,858,536,470 1,822,466,026,611,800

1999 3,928,444,247,354,300 7,855 908,769,274,450,800 179,430,400,000,000 2,472,716,527,537,000 1,858,334,024,795,000

2000 4,121,726,241,993,900 8,422 1,060,872,288,362,200 99,643,790,476,190 2,944,432,464,112,750 1,967,791,836,600,200

2001 4,271,899,954,667,000 10,261 1,129,749,087,480,800 190,614,200,000,000 2,981,495,922,803,340 2,021,590,172,211,700

2002 4,464,113,041,849,300 9,311 1,182,784,395,845,000 215,467,500,000,000 2,637,374,038,628,250 2,107,764,749,213,200

2003 4,677,514,123,258,600 8,577 1,189,884,726,120,600 249,404,313,253,852 2,507,919,080,077,860 2,186,913,010,032,200

2004 4,912,833,962,560,100 8,939 1,364,599,072,554,300 283,093,000,000,000 2,935,973,057,506,130 2,273,100,844,234,100

2005 5,192,500,538,917,800 9,705 1,513,164,999,669,000 312,488,056,626,926 3,322,573,914,833,110 2,380,026,639,458,400

2006 5,478,137,490,010,100 9,159 1,552,460,084,009,500 343,625,377,952,311 3,103,755,304,808,870 2,486,855,317,982,500

2007 5,825,726,531,358,900 9,141 1,697,209,598,268,500 374,762,699,277,696 3,194,202,162,925,520 2,604,234,878,138,900

2008 6,176,068,457,506,700 9,699 1,898,942,099,974,500 658,701,000,000,000 3,616,792,130,902,920 2,701,585,275,030,700

2009 6,461,950,710,551,600 10,390 1,961,482,072,067,500 619,922,000,000,000 2,940,970,850,172,160 2,798,525,931,939,900

2010 6,864,133,100,000,000 9,090 2,127,840,682,000,000 723,307,000,000,000 3,205,637,598,235,500 2,936,192,400,000,000

2011 7,287,635,300,000,000 8,770 2,316,359,104,097,800 873,874,000,000,000 3,656,936,354,349,880 3,122,633,400,000,000


2012 7,727,083,400,000,000 9,387 2,527,728,790,484,400 980,518,000,000,000 3,831,311,903,721,210 3,288,298,000,000,000

2013 8,156,497,800,000,000 10,461 2,654,375,036,700,900 1,077,310,000,000,000 3,967,106,232,237,770 3,431,080,900,000,000

2014 8,566,271,200,000,000 11,865 2,775,733,579,064,600 1,145,282,928,434,310 4,116,716,178,297,760 3,577,694,800,000,000

2015 8,976,931,500,000,000 13,389 2,916,601,630,562,700 1,164,554,577,386,610 3,764,719,516,265,140 3,672,595,500,000,000

2016 8,164,934,610,302,790 12,935 2,542,077,573,709,210 935,748,526,661,157 4,201,921,841,339,830 3,547,268,352,813,750

2017 8,380,512,700,553,410 13,343 2,612,020,756,005,150 974,202,791,298,637 4,304,178,262,152,610 3,638,770,115,628,690

2018 8,596,090,790,804,030 13,751 2,681,963,938,301,090 1,012,657,055,936,110 4,406,434,682,965,400 3,730,271,878,443,630

Source: hypothetical data


Type the data in Excel, save the file under the ABCDEFG hypothesis data name, then close the file.

Open EViews.


The workfile creation dialog appears on the screen. Fill in the Date specification with:

Start date : 1986
End date : 2018

Next, move the data created in Excel into EViews. Click File → Import → Import from file…, and the import dialog will appear on the screen.


Locate the file to be imported and click Open.

Click Next

Click Next

Click Finish
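For readers who want to replicate these steps outside EViews, here is a minimal Python sketch that loads the same annual data; the workbook file name and column names are assumptions based on the instructions above, not part of the original material.

# Minimal sketch: load the hypothetical ABCDEFG macro data with pandas
import pandas as pd

df = pd.read_excel("abcdefg_hypothesis_data.xlsx")   # hypothetical file name; adjust as needed
df = df.set_index("Year")                            # assumes the year column is named "Year"
print(df.head())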


The workfile containing the imported series will then appear on the screen.

How to Regress

Click Quick → Estimate Equation…, and the equation specification dialog appears.


Click OK. In the resulting level regression, some of the estimated coefficients are very large and others are very small, so we transform all of the data with logarithms (LOG); the equation becomes

LS LOG(GDP) C LOG(GFCF) LOG(TR) LOG(TRADE) LOG(AK) LOG(IVA)

Click Estimate, then click OK. Because the model is now in logs, each estimated coefficient can be read as an elasticity.
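A minimal statsmodels sketch of the same log-log regression, continuing from the loading sketch above. The AK (work force) series is not shown in the data table, so it is omitted here; add it if it exists in your file.

# Minimal sketch: log-log OLS regression with statsmodels
# df comes from the loading sketch above; AK is omitted because it is not in the table shown
import numpy as np
import statsmodels.api as sm

y = np.log(df["GDP"])
X = sm.add_constant(np.log(df[["GFCF", "TR", "Trade", "IVA"]]))
res = sm.OLS(y, X).fit()
print(res.summary())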


Classical Assumption Tests

The classical assumption tests used for linear regression estimated with the Ordinary Least Squares (OLS) approach include the linearity test, normality test, autocorrelation test, heteroscedasticity test, and multicollinearity test. However, not all of these tests must be performed on every linear regression model estimated by OLS.

1. The linearity test is rarely carried out, because the model is assumed to be linear; when it is performed, it is only to see the extent of the linearity.

2. The normality test is not, strictly speaking, a BLUE (Best Linear Unbiased Estimator) requirement, and some authors do not regard it as a condition that must be fulfilled.

3. Autocorrelation only occurs in time series data. Testing for autocorrelation in data that are not time series (cross section or panel) is pointless.

4. The multicollinearity test is needed only when the regression uses more than one independent variable. With a single independent variable, multicollinearity cannot occur.

5. Heteroscedasticity usually occurs in cross-section data, and panel data are closer in character to cross-section data than to time series.

Normality Test

Click View → Residual Diagnostics → Histogram – Normality Test


The test results are as follows. Because the probability value of the Jarque-Bera statistic is > 0.05, the model's residuals are normally distributed.
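A hedged equivalent outside EViews, continuing from the estimation sketch above (statsmodels reports the same Jarque-Bera statistic that the EViews histogram view shows):

# Jarque-Bera normality test on the OLS residuals
from statsmodels.stats.stattools import jarque_bera

jb_stat, jb_pvalue, skew, kurt = jarque_bera(res.resid)
print(f"Jarque-Bera = {jb_stat:.3f}, p-value = {jb_pvalue:.3f}")
# p-value > 0.05: do not reject normality of the residuals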

Linearity Test

Click View → Stability Diagnostics → Ramsey RESET Test…, and a dialog appears on the screen.


Click OK. Because the probability value of the F-statistic is < 0.05, the regression equation is not linear (the RESET test rejects the linear functional form).
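A minimal sketch of the same idea with statsmodels, continuing from the estimation sketch; it assumes a statsmodels version recent enough to provide linear_reset (0.11 or later).

# Ramsey RESET test for functional form (adds powers of the fitted values)
from statsmodels.stats.diagnostic import linear_reset

reset = linear_reset(res, power=2, use_f=True)
print(reset)  # shows the F statistic and its p-value
# p-value < 0.05: evidence against the linear specification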

Autocorrelation Test

Click View → Residual Diagnostics → Serial Correlation LM Test…


A lag specification dialog will appear; click OK. Because the probability value of Obs*R-squared is < 0.05, the regression equation contains autocorrelation.
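A hedged statsmodels equivalent of the serial correlation LM (Breusch-Godfrey) test, continuing from the estimation sketch; two lags are used here to mirror the usual EViews dialog default.

# Breusch-Godfrey serial correlation LM test on the residuals
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(res, nlags=2)
print(f"Obs*R-squared = {lm_stat:.3f}, p-value = {lm_pvalue:.3f}")
# p-value < 0.05: the residuals are serially correlated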

Heteroscedasticity Test

Click View → Residual Diagnostics → Heteroskedasticity Tests…, then select White.


Because the probability value of Obs*R-squared is > 0.05, the regression equation does not contain heteroscedasticity (the residuals are homoscedastic).
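A hedged equivalent with statsmodels' het_white, continuing from the estimation sketch (this version includes cross terms, so the numbers may differ slightly from an EViews run without them):

# White heteroskedasticity test: squared residuals on regressors, their squares and cross terms
from statsmodels.stats.diagnostic import het_white

lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, res.model.exog)
print(f"Obs*R-squared = {lm_stat:.3f}, p-value = {lm_pvalue:.3f}")
# p-value > 0.05: no evidence of heteroscedasticity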

Multicollinearity Test

The multicollinearity test assesses whether there is correlation (intercorrelation) among the independent variables in the regression model, i.e. whether the classical assumption of no multicollinearity (the absence of a linear relationship among the independent variables) is violated. The presence or absence of multicollinearity is checked by looking at the VIF (Variance Inflation Factor) and Tolerance values.

Hypothesis:

H0: There is a multicollinearity problem
H1: There is no multicollinearity problem

VIF < 10: H0 is rejected, H1 is accepted (no multicollinearity)
VIF > 10: H1 is rejected, H0 is accepted (multicollinearity is present)

The steps to obtain the VIF and TOL are:

Click View → Coefficient Diagnostics → Variance Inflation Factors


The result: the Centered VIF values are above 10, meaning that the model contains multicollinearity.
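A hedged sketch that computes the same centered VIF and Tolerance values with statsmodels, continuing from the estimation sketch:

# Variance Inflation Factor (VIF) and Tolerance for each regressor
from statsmodels.stats.outliers_influence import variance_inflation_factor

exog = res.model.exog                      # design matrix including the constant
for i, name in enumerate(res.model.exog_names):
    if name == "const":
        continue                           # skip the intercept
    vif = variance_inflation_factor(exog, i)
    print(f"{name}: VIF = {vif:.2f}, TOL = {1.0 / vif:.3f}")
# VIF > 10 (TOL < 0.1) indicates a multicollinearity problem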

Another way of testing for multicollinearity is to inspect the correlation matrix of the variables. Block (select) all of the variables, then right-click and choose Copy. Then click Quick → Group Statistics → Correlations and click OK.


Almost all of the variables have pairwise correlation values above 0.85, meaning that the model contains multicollinearity.
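A minimal sketch of the same correlation check in Python, continuing from the loading sketch (again without AK, for the reason noted above):

# Pairwise correlations of the log-transformed independent variables
import numpy as np

regressors = np.log(df[["GFCF", "TR", "Trade", "IVA"]])
print(regressors.corr().round(3))
# pairwise correlations above roughly 0.85 point to multicollinearity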

Or by comparing R² values from auxiliary regressions:

a. LS LOG(GDP) C LOG(GFCF) LOG(TR) LOG(TRADE) LOG(AK) LOG(IVA), compute R²₁
b. LS LOG(GFCF) C LOG(TR) LOG(TRADE) LOG(AK) LOG(IVA), compute R²₂
c. LS LOG(TR) C LOG(GFCF) LOG(TRADE) LOG(AK) LOG(IVA), compute R²₃
d. LS LOG(TRADE) C LOG(GFCF) LOG(TR) LOG(AK) LOG(IVA), compute R²₄
e. LS LOG(AK) C LOG(GFCF) LOG(TR) LOG(TRADE) LOG(IVA), compute R²₅
f. LS LOG(IVA) C LOG(GFCF) LOG(TR) LOG(TRADE) LOG(AK), compute R²₆

The model does not contain multicollinearity if R²₁ > R²₂, R²₃, R²₄, R²₅, R²₆.

The values are:

R²₁ = 0.9987
R²₂ = 0.9691
R²₃ = 0.9781
R²₄ = 0.9457
R²₅ = 0.9870
R²₆ = 0.9928

Because R²₁ > R²₂, R²₃, R²₄, R²₅, and R²₆ (0.9987 > 0.9691, 0.9781, 0.9457, 0.9870, and 0.9928), the model does not contain multicollinearity by this criterion.
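A hedged Python sketch of this auxiliary-regression comparison, continuing from the loading sketch (again without AK, so it will not reproduce the values above exactly):

# Compare the main regression R^2 with the R^2 of each auxiliary regression
import numpy as np
import statsmodels.api as sm

logs = np.log(df[["GDP", "GFCF", "TR", "Trade", "IVA"]])

def r_squared(dep, indep):
    X = sm.add_constant(logs[indep])
    return sm.OLS(logs[dep], X).fit().rsquared

indep_all = ["GFCF", "TR", "Trade", "IVA"]
r2_main = r_squared("GDP", indep_all)
r2_aux = {v: r_squared(v, [c for c in indep_all if c != v]) for v in indep_all}
print("main R2:", round(r2_main, 4))
print("auxiliary R2:", {k: round(v, 4) for k, v in r2_aux.items()})
# no multicollinearity by this criterion if the main R2 exceeds every auxiliary R2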


Results of the regression equation above:

Estimation Command:
=========================
LS LOG(GDP) C LOG(GFCF) LOG(TR) LOG(TRADE) LOG(AK) LOG(IVA)

Estimation Equation:
=========================
LOG(GDP) = C(1) + C(2)*LOG(GFCF) + C(3)*LOG(TR) + C(4)*LOG(TRADE) + C(5)*LOG(AK) + C(6)*LOG(IVA)

Substituted Coefficients:
=========================
LOG(GDP) = 4.6489 + 0.3739 LOG(GFCF) + 0.0498 LOG(TR) - 0.0688 LOG(TRADE) + 0.7019 LOG(AK) + 0.1781 LOG(IVA)

