Assignment 1

Maastricht University

School of Business and Economics

Times-Series Methods and Dynamic Econometrics

Assignment I

Air Pollution Analysis in Bogota

Ali Nabbi I6061881

Jonas Stopschinski I6053966

November 23, 2015

Contents

1 Introduction

2 Graphical Analysis

3 ARIMA Models

4 Vector Autoregressive Regression (VAR)

5 Forecast

6 Conclusion

Chapter 1

Introduction

This assignment analyzes air pollution in Bogota. The dataset shows the amount of air pollutants such as

• Carbon Monoxide (CO)

• Nitrogen Dioxide (NO2)

• Nitrogen Oxide (NOx)

• Particulate Matter (PM)

• Sulfur Dioxide (SO2)

• Ground Level Ozone (O3)

Additionally, the following meteorological variables have been used.

• Wind Speed (WS)

• Wind Direction (WD)

• Temperature (Temp)

• Rain (Rain)

• Solar Radiation (SR)

The analysis covers the time series behavior of the pollutant and its dynamic relation to other pollutants. The data used are the logarithms of the daily observations. Sulfur Dioxide has been chosen as the main pollutant of interest. It is mainly produced by industrial processes involving coal and petroleum, and it seems to be related to meteorological variables such as rain, wind speed and wind direction.

Chapter 2

Graphical Analysis

As mentioned in the introduction, the main pollutant of interest is Sulfur Dioxide.

Figure 2.1: The level, first and second difference of Sulfur Dioxide

In figure (2.1) the level, the first difference and the second difference are presented. The level plot seems to indicate stationary behavior of the pollutant and that an intercept should be included in the regression. The first and second differences are stationary as well. Both differenced series have mean zero, so they would not need an intercept. According to this analysis, differencing is not necessary: the Sulfur Dioxide process is not integrated, but it should be modeled with a constant. In conclusion, Sulfur Dioxide can be modeled with an ARIMA(p,0,q) process.
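The differencing behind a plot such as figure (2.1) can be sketched in a few lines of Python. This is only an illustration: the values in `log_so2` are made up and are not taken from the Bogota dataset, and the helper name `difference` is an assumption.

```python
from statistics import mean

def difference(series, order=1):
    """Return the order-th difference of a list of numbers."""
    out = list(series)
    for _ in range(order):
        # Each pass replaces the series by x[t] - x[t-1].
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Hypothetical log-SO2 values, for illustration only.
log_so2 = [2.0, 2.3, 2.1, 2.4, 2.2, 2.5, 2.3]

d1 = difference(log_so2, 1)  # first difference
d2 = difference(log_so2, 2)  # second difference

# Each differencing step drops one observation.
print(len(log_so2), len(d1), len(d2))
# A level series with a non-zero mean needs a constant; the differences do not
# if their means are close to zero.
print(round(mean(log_so2), 3), round(mean(d1), 3), round(mean(d2), 3))
```

Comparing the sample means of the three series is the numerical counterpart of the visual argument about intercepts made above.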

In figure (2.2) the autocorrelation (ACF) and partial autocorrelation (PACF) functions of Sulfur Dioxide are presented. Note that the ACF decays gradually and the PACF cuts off at the seventh lag. This supports the choice of an ARIMA(p,0,q) model with non-zero lag orders for both the autoregressive (AR) and the moving average (MA) parts.
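For readers who want to reproduce such a diagnostic, the sample ACF and PACF can be computed as sketched below. The PACF uses the Durbin-Levinson recursion, which is one standard way to obtain it and not necessarily what the software used for figure (2.2) does. The simulated AR(1) series and the function names are assumptions for illustration.

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag."""
    n = len(x)
    m = sum(x) / n
    dev = [v - m for v in x]
    c0 = sum(d * d for d in dev) / n
    acf = [1.0]
    for k in range(1, max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf

def pacf_from_acf(acf):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    pacf = [1.0]
    phi_prev = []  # phi_prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(acf)):
        if k == 1:
            phi_kk = acf[1]
            phi = [phi_kk]
        else:
            num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

# Simulated AR(1) series (hypothetical data, not the SO2 series).
random.seed(0)
x = [0.0]
for _ in range(499):
    x.append(0.7 * x[-1] + random.gauss(0.0, 1.0))

acf = sample_acf(x, 10)
pacf = pacf_from_acf(acf)
for lag in range(1, 4):
    print(lag, round(acf[lag], 3), round(pacf[lag], 3))
```

For a pure AR(p) process the PACF is zero beyond lag p, while the ACF decays; for a process with an MA part both decay, which is the pattern the text reads off the figure.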

Figure 2.2: Autocorrelation and partial autocorrelation functions of SO2 (in levels)

The frequencies of Sulfur Dioxide are plotted in figure (2.3). It can be seen that there are no extreme values to be concerned about.

Figure 2.3: Frequencies of SO2

Chapter 3

ARIMA Models

The previous section concluded that the series for Sulfur Dioxide should be modeled with an ARIMA(p,0,q) process. In this section, the appropriate number of lags of the ARIMA process will be determined.

In figure (2.2) the ACF decays to zero eventually. However, in order to take all nonzero lags into account, many lags would be needed. Instead, the maximum number of lags was restricted to 5 for both the AR and MA processes. With p and q each ranging from 0 to 5, this gives 36 possible models, and the Schwarz information criterion (BIC) is used to choose among them. The best three candidates are summarized in table (3.1). In addition to the BIC, the Ljung-Box Q-statistic is used to investigate whether the group of residual autocorrelations is significantly different from zero.
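The search grid and the Ljung-Box statistic can be sketched as follows. This is a minimal illustration: the residuals are simulated, the function name `ljung_box_q` is an assumption, and fitting each ARMA(p,q) model and computing its BIC would require an estimation library and is not shown.

```python
import random
from itertools import product

def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q(m) = n (n + 2) * sum_{k=1..m} rho_k^2 / (n - k)."""
    n = len(residuals)
    m = sum(residuals) / n
    dev = [r - m for r in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total

# All (p, q) pairs with p and q between 0 and 5: the 36 candidate models.
candidates = list(product(range(6), range(6)))
print(len(candidates))

# Hypothetical residuals standing in for those of a fitted model.
random.seed(0)
resid = [random.gauss(0.0, 1.0) for _ in range(200)]
for m in (10, 15, 20):
    print(m, round(ljung_box_q(resid, m), 3))
```

When the statistic is applied to the residuals of an ARMA(p,q) model, Q(m) is usually compared with a chi-square distribution with m - p - q degrees of freedom, so small values of Q(m) relative to that distribution indicate uncorrelated residuals.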

Model         BIC      Q(10)     Q(15)     Q(20)     Last uncorrelated lag
ARMA(4,5)*    0.6758   2.5343    4.3747    22.086    Q(18) = 12.231
                       (0.111)   (0.626)   (0.024)   (0.201)
ARMA(3,5)     0.6805   8.8377    16.398    29.617
                       (0.012)   (0.022)   (0.003)
ARMA(3,1)     0.6812   8.9189    17.033    36.627    Q(15) = 17.033
                       (0.176)   (0.107)   (0.002)   (0.107)

p-values are shown in parentheses.

Table 3.1: Model candidates, Schwarz Information Criterion (BIC) and Ljung-Box Q-Statistics

Table (3.1) shows that the model with the smallest BIC value is the ARMA(4,5) model. This model also outperforms the other two in terms of the Q-statistic. The estimated model is:

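\[
\begin{aligned}
\mathrm{SO}_{2,t} = {} & 1.76 + 2.571\, \mathrm{SO}_{2,t-1} - 2.995\, \mathrm{SO}_{2,t-2} + 1.753\, \mathrm{SO}_{2,t-3} - 0.344\, \mathrm{SO}_{2,t-4} \\
& + 0.101\, \epsilon_{t} - 1.875\, \epsilon_{t-1} + 1.685\, \epsilon_{t-2} - 0.510\, \epsilon_{t-3} - 0.094\, \epsilon_{t-4} - 0.002\, \epsilon_{t-5}
\end{aligned}
\]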

In figure (A.1) the ACF and PACF of the residuals of the ARMA(4,5) model are plotted. It can be concluded that the residuals are not correlated. However, the White test for heteroskedasticity rejects the null hypothesis of homoskedasticity for all of our candidates, indicating heteroskedasticity. As a consequence, the residuals are not white noise.

In figure (A.2), the histogram of the residuals of the ARMA(4,5) model is presented. For a quick comparison, the density of the Gaussian distribution is plotted as well. Note the fat tails and high kurtosis. The result indicates that the residuals are not Gaussian white noise.

Chapter 4

Vector Autoregressive Regression (VAR)

In this section, the data is analyzed to find the best fitting VAR model. The following two models have been considered.

1. Model 1 includes SO2, all the other pollutants and a constant.

2. Model 2 includes SO2, all the meteorological variables and a constant.

The aim of the VAR analysis is to forecast. That is why an efficient criterion, such as the AIC, is preferable to a consistent one, such as the BIC. The model selection is therefore based on the lowest Akaike Information Criterion (AIC). Moreover, the maximum number of lags is limited to 10.
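For reference, the two criteria can be written in their textbook form. Here \ell is the maximized log-likelihood, k the number of estimated parameters and T the sample size; these symbols are introduced here as assumptions and are not defined elsewhere in this report. Econometric software often reports both criteria divided by T, but the normalization behind the values quoted in this report is not stated.

```latex
\mathrm{AIC} = -2\ell + 2k, \qquad \mathrm{BIC} = -2\ell + k \ln T
```

Since \ln T > 2 once T \geq 8, the BIC penalizes additional lags more heavily than the AIC and tends to select smaller models.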

The VAR analysis leads to a VAR(6) for the first model (including all pollutants) with an AIC value of -2.756. Similarly, the analysis suggests a VAR(7) for the second model (including all meteorological variables) with an AIC value of -1.0265. All inverse roots of the autoregressive characteristic polynomial lie inside the unit circle. As a consequence, both models satisfy the stability condition (figures (A.3) and (A.4)).

In tables (4.1) and (4.2), the results of the Granger-causality tests are summarized for the two models.
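The lag selection and the stability check can be sketched with NumPy as below. This is a simplified illustration under several assumptions: the data are a simulated two-variable VAR(1) rather than the Bogota series, the VAR is estimated by equation-wise OLS, the AIC is computed as the log determinant of the residual covariance plus 2 times the number of coefficients divided by the sample size, and all function names are hypothetical.

```python
import numpy as np

def fit_var(data, p):
    """OLS fit of a VAR(p) with constant; data has shape (T, k)."""
    T, k = data.shape
    Y = data[p:]
    lags = [data[p - lag:T - lag] for lag in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lags)
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)   # B has shape (1 + k*p, k)
    resid = Y - X @ B
    sigma = resid.T @ resid / (T - p)
    return B, sigma

def var_aic(data, p, max_lag):
    """AIC on a common sample so that different lag orders are comparable."""
    sample = data[max_lag - p:]
    B, sigma = fit_var(sample, p)
    t_eff = sample.shape[0] - p
    _, logdet = np.linalg.slogdet(sigma)
    return logdet + 2.0 * B.size / t_eff

def companion_moduli(B, k, p):
    """Moduli of the companion-matrix eigenvalues (the inverse roots)."""
    A = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    C = np.hstack(A)
    if p > 1:
        shift = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
        C = np.vstack([C, shift])
    return np.abs(np.linalg.eigvals(C))

# Simulated stable VAR(1) (hypothetical data).
rng = np.random.default_rng(0)
A_true = np.array([[0.5, 0.1], [0.0, 0.3]])
T = 2000
data = np.zeros((T, 2))
for t in range(1, T):
    data[t] = A_true @ data[t - 1] + rng.standard_normal(2)

max_lag = 10
best_p = min(range(1, max_lag + 1), key=lambda p: var_aic(data, p, max_lag))
B_best, _ = fit_var(data, best_p)
moduli = companion_moduli(B_best, 2, best_p)
print("selected lag order:", best_p)
print("stable:", bool(np.all(moduli < 1.0)))
```

The model is stable when every modulus is strictly below one, which corresponds to all inverse roots lying inside the unit circle.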

xi \ yj   SO2   CO    NO2   NOx   PM    O3
SO2       -     ×     ×     ×     ✓     ✓
CO        ×     -     ×     ×     ×     ✓
NO2       ×     ×     -     ×     ✓     ✓
NOx       ×     ×     ×     -     ✓     ×
PM        ×     ×     ×     ×     -     ✓
O3        ×     ×     ×     ×     ✓     -

✓: xi is Granger-caused by yj; ×: no Granger-causality detected; for all i, j = 1, ..., 6. Based on the 95% confidence level.

Table 4.1: Summary of all pollutants model VAR(6) Granger-Causality

In figures (A.5) and (A.6), the Impulse Response Functions of both models are shown with 95% confidence intervals.

• Model 1: Effects of any fluctuation of the SO2-lags vanish over time and there are relatively small changes in response to changes of the other pollutants. However, PM and O3 Granger-cause SO2, which is confirmed by the impulse response analysis. Note that the impulse responses suggest an effect of CO on SO2, while the Granger-causality test reports non-causality.

• Model 2: Similar to model 1, the effects of the SO2-lags vanish over time. According to table (4.2), none of the meteorological variables Granger-causes SO2. This result is confirmed by the impulse response analysis.

xi \ yj   SO2   WS    WD    Temp  Rain  SR
SO2       -     ×     ×     ×     ×     ×
WS        ×     -     ✓     ×     ×     ×
WD        ×     ×     -     ×     ×     ×
Temp      ×     ×     ×     -     ×     ✓
Rain      ×     ×     ✓     ×     -     ✓
SR        ×     ×     ×     ✓     ×     -

✓: xi is Granger-caused by yj; ×: no Granger-causality detected; for all i, j = 1, ..., 6. Based on the 95% confidence level.

Table 4.2: Summary of all meteorological variables model VAR(7) Granger-Causality
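The idea behind the Granger-causality entries in tables (4.1) and (4.2) can be sketched with a two-variable example. This is a simplification and an assumption on our part: the tests in this report are carried out inside six-variable VARs, whereas the sketch compares a regression of y on its own lags with one that adds lags of x and returns the F statistic. The simulated data and function names are hypothetical, and the p-value (from an F distribution with p and `dof` degrees of freedom) is not computed.

```python
import numpy as np

def lagged(series, p):
    """Columns series[t-1], ..., series[t-p] for t = p, ..., T-1."""
    T = len(series)
    return np.column_stack([series[p - lag:T - lag] for lag in range(1, p + 1)])

def rss(X, y):
    """Residual sum of squares of an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(y, x, p):
    """F statistic for H0: lags 1..p of x do not help to predict y."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[p:]
    const = np.ones((len(target), 1))
    X_restricted = np.hstack([const, lagged(y, p)])
    X_unrestricted = np.hstack([X_restricted, lagged(x, p)])
    rss_r = rss(X_restricted, target)
    rss_u = rss(X_unrestricted, target)
    dof = len(target) - X_unrestricted.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Hypothetical data: x drives y with one lag, y does not drive x.
rng = np.random.default_rng(1)
T = 500
x = rng.standard_normal(T)
e = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + e[t]

f_x_to_y = granger_f(y, x, 2)   # should be large
f_y_to_x = granger_f(x, y, 2)   # should be small
print(round(f_x_to_y, 2), round(f_y_to_x, 2))
```

A large F statistic leads to rejecting non-causality, which corresponds to a ✓ entry in the tables.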

Page 9: Assignment 1

Chapter 5

Forecast

In this chapter, the last 250 observations are forecasted using the models chosen in the previous chapters, namely ARMA(4,5), VAR(6) and VAR(7). This means that a model based on the entire sample was used to forecast the last 250 observations of SO2. To compare the forecasts, the forecast errors are summarized by the Mean Squared Error (MSE) and the Mean Absolute Error (MAE).

In figures (A.7) to (A.9), the forecasts of each model are presented. Moreover, the forecast errors of each model in terms of MSE and MAE are summarized in table (5.1). The VAR(7) has the lowest MSE and MAE among the candidates, although the differences between the models are small.

             MSE      MAE
ARMA(4,5)    0.4030   0.3110
VAR(6)       0.4048   0.3199
VAR(7)       0.3974   0.3099

Table 5.1: Summary of forecast errors
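The two loss functions and the comparison in table (5.1) can be sketched as follows. The functions `mse` and `mae` and the small example vectors are assumptions for illustration; the dictionary `table_5_1` simply repeats the values reported in the table.

```python
def mse(actual, forecast):
    """Mean squared error of a forecast."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def mae(actual, forecast):
    """Mean absolute error of a forecast."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Tiny made-up example to show the calculation.
actual = [1.0, 2.0, 3.0]
forecast = [1.0, 3.0, 5.0]
print(mse(actual, forecast), mae(actual, forecast))

# (MSE, MAE) values as reported in table (5.1).
table_5_1 = {
    "ARMA(4,5)": (0.4030, 0.3110),
    "VAR(6)": (0.4048, 0.3199),
    "VAR(7)": (0.3974, 0.3099),
}
best_by_mse = min(table_5_1, key=lambda name: table_5_1[name][0])
best_by_mae = min(table_5_1, key=lambda name: table_5_1[name][1])
print(best_by_mse, best_by_mae)
```

Both criteria rank the models in the same order here, so the choice of loss function does not affect the conclusion.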

In summary, the pollutant SO2 was found to be stationary. Among the 36 ARMA specifications considered, the ARMA(4,5) was selected, but the best forecasting model for the series is a VAR(7) with a constant term and all meteorological variables.

Chapter 6

Conclusion

For the VAR, two models were taken into consideration. For the first model (a VAR model with the chosen variable SO2, all the other pollutants, their lags and a constant term), a comparison of the AIC leads to a VAR with 6 lags. The second model (a VAR model with the chosen variable SO2, the meteorological variables and a constant term) leads to a VAR with 7 lags. The Granger-causality test, in combination with the Impulse Response Functions, indicates that there is some causality from CO to SO2 in the first model, while in the second model no causality can be detected, which is confirmed by the impulse response analysis.

The previously best fitting models were used to forecast the last 250 observations of the sample. Using the loss functions MSE and MAE, the second VAR model (with 7 lags of the meteorological variables and a constant term) has the best forecast performance for our main variable of interest (SO2).

Appendix

Figure A.1: Autocorrelation of the ARMA(4,5) residuals

Figure A.2: Histogram of the ARMA(4,5) residuals

Figure A.3: Inverse roots of AR characteristic polynomial of VAR(6) All pollutant model

Figure A.4: Inverse roots of AR characteristic polynomial of VAR(7) All meteorological variables model

Figure A.5: The Impulse Response Functions of the Pollutant variables on SO2

Figure A.6: The Impulse Response Functions of the Meteorological variables on SO2

Figure A.7: Forecast of ARMA(4,5)

Figure A.8: Forecast of VAR(6) All pollutant model

Figure A.9: Forecast of VAR(7) All meteorological model