ARIMA model Building

http://staffweb.hkbu.edu.hk/billhung/econ3600/application/app05/app05.html

Building ARIMA Models

In ARIMA modelling, we do not assume a forecasting model a priori; the model is chosen at the model identification stage. The ARIMA procedure helps us choose the "right model" that best fits the time series. Put in a flow chart:

Demonstration: finding the "right" ARIMA(p, d, q) model for a time series through trial and error:

First, download the Excel file called "exchange_rate" from the "Sample Data" section of the Econ3600 homepage.

Second, open EViews and create a workfile: click "File", "New", "Workfile"; then, under "Workfile Range", choose "Monthly" and type "1990.01" for "Start observation" and "2000.07" for "End observation" in the dialogue box. This creates the workfile. Next, import the data from the Excel file to generate the following result. (Remember to change the upper-left data cell to "B8".)

Double-click the variable "yen" to check that its data are consistent with the Excel file, then choose "View", "Line" to get a general idea of whether the time series is stationary. Also choose "View", "Correlogram" to tentatively identify the patterns and model components (i.e. the orders p, d and q of the ARIMA model). The resulting graphs are:

From the above graphs, you can see that the time series is likely to follow a random walk: it wanders up and down in the line graph. In the correlogram, the ACFs decline slowly (almost linearly) and the PACFs show only one significant spike. The correlogram therefore suggests that ARIMA(1, 0, 0) may be an appropriate model. Next, we take the first difference of "yen" to see whether the time series becomes stationary before looking further for AR(p) and MA(q) terms. (Remember that differencing, the I(d) part, is used to obtain a stationary series if necessary.)
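The correlogram readings described above can be reproduced outside EViews. The following is a minimal sketch in plain Python, assuming a simulated random walk in place of the "yen" data (which is not reproduced here); the function names `acf`, `pacf` and `first_difference` are illustrative and not part of the original tutorial. It computes the sample ACF, the sample PACF via the Durbin-Levinson recursion, and the same quantities for the first difference.

```python
import math
import random

def acf(x, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    phi_prev, out = [], []
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[0]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
        # Update the order-k coefficients from the order-(k-1) ones.
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def first_difference(x):
    """Counterpart of x - x(-1); the first observation is lost."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

if __name__ == "__main__":
    random.seed(0)
    level = [0.0]                      # hypothetical random walk, 127 points
    for _ in range(126):
        level.append(level[-1] + random.gauss(0.0, 1.0))
    change = first_difference(level)
    print("approx. band: +/-", round(2.0 / math.sqrt(len(level)), 3))
    print("ACF  level :", [round(v, 2) for v in acf(level, 5)])
    print("PACF level :", [round(v, 2) for v in pacf(level, 5)])
    print("ACF  change:", [round(v, 2) for v in acf(change, 5)])
```

For a random walk, the level series would typically show a slowly declining ACF and one dominant PACF spike at lag 1, while the differenced series would show autocorrelations mostly inside the approximate band of two standard errors.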

To see whether first differencing yields a level-stationary series, generate the differenced series: choose "GENR" and type "dyen=d(yen)". The item "dyen" then appears in the workfile; use it to draw a line graph and a correlogram. The results are:

Now the first-difference series "DYEN" is stationary, as the line graph shows, and it is white noise, as the correlogram shows no significant pattern. The unit root test also confirms that the first difference is stationary. This is strong evidence that ARIMA(0,1,0) is suitable for the time series. We can then construct the ARIMA model in the following steps:

Step 1. Choose "Quick", "Estimate Equation", then specify the model by typing "yen c ar(1)" and click "OK". The result is:

Step 2. Choose "View", "Residual Tests", "Correlogram-Q-Statistics". The result is:

(Since the ACFs and PACFs show no significant spikes, the residuals of the selected ARIMA model are white noise and no significant pattern is left in the time series. We can stop here and need not consider further AR(p) or MA(q) terms.)
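Steps 1 and 2 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the data are simulated, the AR(1) model is fitted by ordinary least squares on the lagged series (EViews may parameterise the constant differently, so the intercept here need not match its "C"), and the names `fit_ar1`, `autocorr` and `ljung_box_q` are hypothetical. The Q-statistic is the Ljung-Box form, Q = n(n+2) * sum of r_k^2 / (n-k).

```python
import random

def fit_ar1(y):
    """Least-squares fit of y[t] = a + b*y[t-1] + e[t].

    Returns (a, b, residuals).
    """
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_z = sum(current) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(lagged, current))
    b = sxz / sxx
    a = mean_z - b * mean_x
    residuals = [z - a - b * x for x, z in zip(lagged, current)]
    return a, b, residuals

def autocorr(e, k):
    """Sample autocorrelation of e at lag k."""
    n = len(e)
    mean = sum(e) / n
    dev = [v - mean for v in e]
    return sum(dev[t] * dev[t - k] for t in range(k, n)) / sum(d * d for d in dev)

def ljung_box_q(e, max_lag):
    """Ljung-Box Q-statistic of e using lags 1 .. max_lag."""
    n = len(e)
    return n * (n + 2) * sum(autocorr(e, k) ** 2 / (n - k)
                             for k in range(1, max_lag + 1))

if __name__ == "__main__":
    random.seed(1)
    y = [10.0]                         # hypothetical AR(1) data
    for _ in range(126):
        y.append(1.0 + 0.9 * y[-1] + random.gauss(0.0, 1.0))
    a, b, resid = fit_ar1(y)
    print("intercept:", round(a, 3), "AR(1) coefficient:", round(b, 3))
    print("Q(12) of residuals:", round(ljung_box_q(resid, 12), 2))
```

A small Q relative to the chi-square critical value for the chosen number of lags is what "no significant spikes" corresponds to numerically.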

The criteria for judging the best model are as follows:

A relatively small BIC (the Schwarz criterion, measured by n log(SEE) + k log(n))

A relatively small SEE

A relatively high adjusted R2

Q-statistics and a correlogram showing no significant pattern left in the ACFs and PACFs of the residuals, which means that the residuals of the selected model are white noise.
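The BIC expression quoted in the criteria can be written out directly. The sketch below implements it exactly as stated, assuming natural logarithms, n as the number of observations and k as the number of estimated parameters (these readings are assumptions, as the text does not define the symbols). The values that EViews prints as "Schwarz criterion" may be scaled differently, so this function is not expected to reproduce tabulated output; it only shows how the penalty term k log(n) trades off against fit.

```python
import math

def bic_as_stated(see, n, k):
    """n*log(SEE) + k*log(n), the form quoted in the text (natural logs assumed)."""
    return n * math.log(see) + k * math.log(n)

if __name__ == "__main__":
    # Hypothetical comparison: the larger model fits slightly better
    # (smaller SEE) but pays for one extra parameter.
    small = bic_as_stated(see=4.10, n=126, k=2)
    large = bic_as_stated(see=4.09, n=126, k=3)
    print("smaller model:", round(small, 3))
    print("larger model :", round(large, 3))
    print("prefer smaller model:", small < large)
```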

You may try other ARIMA models and compare the statistical results, as in the following table:

ARIMA model   BIC     Adjusted R2   SEE
(1, 0, 0)     5.708   0.93476       4.075
(1, 0, 1)     5.734   0.93503       4.067
(2, 0, 0)     5.725   0.93425       4.047
(0, 0, 1)     7.384   0.65598       9.422
(0, 0, 2)     6.888   0.79754       7.220
(1, 1, 0)     5.724   0.0019        4.108
(0, 1, 0)     5.708   0.9347        4.075

As you can see, ARIMA(1,0,0) is the relatively best model.

Remark: ARIMA(1, 0, 0) with an AR(1) coefficient of (nearly) 1 is practically the same as ARIMA(0, 1, 0), a random walk. The result of ARIMA(0,1,0) is:

After several rounds of trial and error, ARIMA(1,0,0) or ARIMA(0,1,0) is selected as the best model.
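The link between the two selected specifications can be written out in general form. The symbols c, \phi and \varepsilon_t below are notation introduced here for illustration, not estimates from the EViews output:

```latex
\begin{align*}
\text{ARIMA}(1,0,0):\quad & y_t = c + \phi\, y_{t-1} + \varepsilon_t \\
\text{with } \phi = 1,\ c = 0:\quad & y_t - y_{t-1} = \varepsilon_t
  \quad\Longleftrightarrow\quad \Delta y_t = \varepsilon_t
  \qquad \text{(ARIMA}(0,1,0)\text{)}
\end{align*}
```

So the closer the estimated AR(1) coefficient is to 1, the less the two models differ in practice.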

Now, we can express this selected best model as

Students are encouraged to find the best ARIMA model for the "pound" series.

Non-stationary variance time series

First, download the Excel file called "hk_gdp.xls" from the "Sample Data" section of the Econ3600 homepage.

Then follow the previous procedure to plot the line graph and correlogram, shown below, to get a general idea of the time series:

From the time plot, the level of HKGDP is clearly a non-stationary series because it has a deterministic upward trend and a seasonal cycle. Also, the ACFs decline slowly (almost linearly) and the PACFs have two significant spikes, at lags 1 and 5. It is hard to guess an appropriate ARIMA model for this time series at this point. What we need to do first is identify the order of integration of the non-stationary series. We take the first difference of the series to see whether it becomes stationary before searching further for AR(p) and MA(q) terms.

The line graph of the first-difference series is as follows:

As we can see in the time plot, the first-difference series has a non-constant (increasing) variance, which indicates that the first difference is still non-stationary. Therefore, we may need to transform the data into logarithms.

Simply click "GENR" and type "lnhkgdp = log(hkgdp)". Then generate the first difference of the log series, "dlnhkgdp", to remove the non-constant variance and the non-stationarity. The time plots of the log level series "lnhkgdp" and the first-difference series "dlnhkgdp" are as follows:

The first difference of the log series is now free from non-stationarity problems. Therefore, we can go back to the previous steps to construct the ARIMA model.
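Why the log transformation helps can be seen on a small constructed example. The sketch below assumes a hypothetical series growing at a constant 5% per period (not the HKGDP data) and uses illustrative function names. In levels, the size of the period-to-period change grows with the series, which is what shows up as increasing variance; in logs, the change is the same in every period.

```python
import math

def log_series(x):
    """Natural log of every observation (all values must be positive)."""
    return [math.log(v) for v in x]

def first_difference(x):
    """x[t] - x[t-1]; the first observation is lost."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

if __name__ == "__main__":
    # Hypothetical series with constant 5% growth per period.
    level = [100.0 * 1.05 ** t for t in range(40)]
    d_level = first_difference(level)
    d_log = first_difference(log_series(level))
    print("change in level, first vs last:", round(d_level[0], 2), round(d_level[-1], 2))
    print("change in log,   first vs last:", round(d_log[0], 4), round(d_log[-1], 4))
```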

As shown above, the logarithmic transformation solves the problem of non-stationary variance but not that of a non-stationary level. After taking the first difference of lnhkgdp, the series becomes stationary. However, the resulting residuals are not white noise, as the following correlogram shows:

Since the correlogram shows seasonal cycles, we can further try a four-period difference. Click "GENR" and type "d4lnhkgdp = lnhkgdp - lnhkgdp(-4)". To confirm whether four-period differencing is necessary, let us fit a seasonal autoregressive model to the data. This model is ARIMA(1,0,0)4, and the estimated result is as follows:

The estimated coefficient confirms that the level of LNHKGDP is seasonally non-stationary.
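The four-period difference and the seasonal autoregression can be sketched in plain Python. The example assumes a hypothetical quarterly series (a linear trend plus a fixed quarterly pattern) rather than the LNHKGDP data, and it estimates the lag-4 coefficient by simple least squares, which is only a stand-in for the EViews estimate. A coefficient close to 1 is the sign of seasonal non-stationarity referred to above.

```python
def seasonal_difference(x, period=4):
    """x[t] - x[t-period], the counterpart of lnhkgdp - lnhkgdp(-4)."""
    return [x[t] - x[t - period] for t in range(period, len(x))]

def seasonal_ar_coefficient(x, period=4):
    """Least-squares slope b in x[t] = a + b*x[t-period] + e[t]."""
    lagged, current = x[:-period], x[period:]
    n = len(lagged)
    mean_l = sum(lagged) / n
    mean_c = sum(current) / n
    sll = sum((v - mean_l) ** 2 for v in lagged)
    slc = sum((v - mean_l) * (w - mean_c) for v, w in zip(lagged, current))
    return slc / sll

if __name__ == "__main__":
    # Hypothetical quarterly log series: trend plus a repeating pattern.
    pattern = [0.0, 0.3, -0.2, 0.1]
    series = [0.02 * t + pattern[t % 4] for t in range(60)]
    print("seasonal AR coefficient:", round(seasonal_ar_coefficient(series), 4))
    print("first seasonal differences:",
          [round(v, 4) for v in seasonal_difference(series)[:4]])
```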

Now, the time plot and correlogram of the four-period difference series are shown as follows:

As you can see from the above graph, there are two significant PACF spikes and three significant ACF spikes. This gives a quick idea to examine: does the four-period difference of LNHKGDP follow an AR(2) and MA(3) process, i.e. is "ARIMA(2, 1, 3)4, 1,1" the best model?

To judge which ARIMA model fits best among the different trials, we rely on several criteria: the smallest Schwarz criterion (BIC), the smallest standard error of regression (SEE), the highest adjusted R2, the invertibility condition, and the significance of the AR and MA roots. Also, the residuals of the selected best model must be white noise.
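The stationarity and invertibility conditions mentioned among the criteria can be checked without computing the roots themselves. The sketch below is one possible way, assuming the sign conventions y[t] = phi_1*y[t-1] + ... for AR terms and e[t] + theta_1*e[t-1] + ... for MA terms; it uses the step-down (inverse Durbin-Levinson) recursion rather than a root finder, and all function names are hypothetical. The coefficients in the demo are made up, not taken from the EViews output.

```python
def all_roots_outside_unit_circle(phi):
    """True if 1 - phi[0]*z - ... - phi[p-1]*z**p has every root outside
    the unit circle, checked through the implied partial autocorrelations."""
    coeffs = list(phi)
    while coeffs:
        k = coeffs[-1]                 # partial autocorrelation at this order
        if abs(k) >= 1.0:
            return False
        p = len(coeffs)
        # Step down from order p to order p-1.
        coeffs = [(coeffs[j] + k * coeffs[p - 2 - j]) / (1.0 - k * k)
                  for j in range(p - 1)]
    return True

def is_stationary_ar(phi):
    """Stationarity of an AR(p) with coefficients phi."""
    return all_roots_outside_unit_circle(phi)

def is_invertible_ma(theta):
    """Invertibility of an MA(q): polynomial 1 + theta[0]*z + ... ."""
    return all_roots_outside_unit_circle([-t for t in theta])

if __name__ == "__main__":
    # Hypothetical coefficients for illustration only.
    print("AR(2) 0.5, 0.3 stationary:", is_stationary_ar([0.5, 0.3]))
    print("AR(2) 0.8, 0.3 stationary:", is_stationary_ar([0.8, 0.3]))
    print("MA(1) 2.0 invertible     :", is_invertible_ma([2.0]))
```

Software that reports inverted roots states the same condition the other way round: the inverted roots should lie inside the unit circle.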

Example 1. ARIMA(2, 1, 3)4, 1, 1:

Example 2. ARIMA(3, 1, 3)4, 1,1:

After trying several ARIMA models, compare their statistical information as in the following table:

ARIMA model     BIC      Adjusted R2   SEE
(2,1,3)4,1,1    -3.102   0.850         0.0461
(2,1,3)4,0,1    -3.019   0.832         0.0489
(3,1,3)4,1,1    -3.091   0.855         0.0455
(3,1,3)4,0,1    -2.985   0.834         0.0488
(2,1,2)4,1,1    -3.080   0.842         0.0474
(2,1,2)4,1,1    -2.953   0.814         0.0514
(1,1,2)4,1,1    -3.014   0.832         0.0491
(3,1,3),1,1     -2.535   0.547         0.0603
(2,1,3),1,1     -2.220   0.351         0.0718
(2,1,2),1,1     -2.592   0.537         0.0607

Which model is the best-fitting ARIMA model?

After several rounds of trial and error, ARIMA(2,1,3)4,1,1 is selected as the best one.
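The comparison behind this choice can be made explicit with a few lines of Python over the tabulated figures. The rows below are copied from the table above, with labels kept exactly as printed (including the repeated label); the helper `best_by` is a hypothetical name. The sketch ranks the candidates by each criterion separately.

```python
# (label, BIC, adjusted R2, SEE), copied from the comparison table.
ROWS = [
    ("(2,1,3)4,1,1", -3.102, 0.850, 0.0461),
    ("(2,1,3)4,0,1", -3.019, 0.832, 0.0489),
    ("(3,1,3)4,1,1", -3.091, 0.855, 0.0455),
    ("(3,1,3)4,0,1", -2.985, 0.834, 0.0488),
    ("(2,1,2)4,1,1", -3.080, 0.842, 0.0474),
    ("(2,1,2)4,1,1", -2.953, 0.814, 0.0514),
    ("(1,1,2)4,1,1", -3.014, 0.832, 0.0491),
    ("(3,1,3),1,1", -2.535, 0.547, 0.0603),
    ("(2,1,3),1,1", -2.220, 0.351, 0.0718),
    ("(2,1,2),1,1", -2.592, 0.537, 0.0607),
]

def best_by(rows, column, smallest=True):
    """Label of the row with the smallest (or largest) value in a column."""
    pick = min if smallest else max
    return pick(rows, key=lambda row: row[column])[0]

if __name__ == "__main__":
    print("smallest BIC       :", best_by(ROWS, 1))
    print("highest adjusted R2:", best_by(ROWS, 2, smallest=False))
    print("smallest SEE       :", best_by(ROWS, 3))
```

On these figures the BIC favours (2,1,3)4,1,1, while adjusted R2 and SEE marginally favour (3,1,3)4,1,1; the selection stated above is consistent with giving priority to the BIC, which penalises the extra parameter.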

We can express this model as

THE END