Project 2 forecast
By: Ahmad Nazim bin Aimran, Wan Mohamad Asyraf bin Wan Afthanorhan
Introduction
This project studies a series of the number of international airline passengers from 1949 to 1956. Its purpose is to generate the best SARIMA model and to compare it with the univariate models.
Since the series has been identified as having a seasonal component, we have created a SARIMA model.
The 3 main stages are:
i. model identification
ii. model estimation and validation
iii. model application
Objective
*to determine whether the time series is stationary and whether there is any seasonal component in the series.
*to create the SARIMA model (since a seasonal component exists) based on the ACF and PACF.
*to identify the best SARIMA model, the one with the lowest MSE.
*to compare the Box-Jenkins model with the univariate model output.
Methodology
Construct a time plot of the original data series and look for any unusual observations. If such observations exist, decide whether a transformation is necessary. If it is, transform the series to achieve stationarity in variance.
If the data series appears non-stationary, apply differencing. For a non-seasonal data series the first difference is usually sufficient. If seasonality exists, apply the seasonal difference first and then the non-seasonal difference.
When stationarity has been achieved, examine the ACF and PACF to see whether any discernible pattern exists in the data series.
In the model identification stage, it may be slightly easier to select pure AR or pure MA models. When selecting a mixed ARIMA model, deciding on the values of p and q is much more difficult, and more so for a model with a seasonal component. Hence, it is worth considering several possible models in order to minimize the chance of missing the most appropriate model form. To finally determine the best fitted model, one needs to use several statistical measures such as the MSE, AIC/BIC or the Box-Pierce (Ljung-Box) statistic.
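The differencing steps described in the methodology can be sketched in a few lines of plain Python. This is only an illustration, not the Minitab procedure used in the project: the function name `difference` and the toy series are assumptions made for the example, and the toy data are not the airline passenger series.

```python
def difference(series, lag=1):
    """Return the lag-`lag` difference y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical toy series: a linear trend plus a repeating 12-month pattern.
pattern = [0, 2, 5, 3, 4, 9, 14, 13, 6, 1, -3, 0]
toy = [100 + 2 * t + pattern[t % 12] for t in range(36)]

# Seasonal difference first (lag 12), then the non-seasonal difference (lag 1).
seasonal = difference(toy, lag=12)
stationary = difference(seasonal, lag=1)

print(len(toy), len(seasonal), len(stationary))
print(seasonal[:3])
print(stationary[:3])
```

Each differencing step shortens the series by its lag, so 12 observations are lost to the seasonal difference and one more to the non-seasonal difference.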
Objectives
*To determine whether the time series is stationary and whether it has a seasonal component.
*To examine the ACF and PACF, which measure the degree of interdependence in the series.
*To create SARIMA models based on the ACF and PACF values.
*To find the best SARIMA model, the one with the lowest MSE.
* The graph below shows that the series is not stationary and also contains a seasonal component.
* The series is not stationary because it has an upward trend.
The actual series graph
[Figure: time series plot of the actual series (C1); x-axis: month and year]
The ACF of the actual series shows a wave-like pattern. This pattern indicates that the series is not stationary and has a seasonal component.
[Figure: Autocorrelation Function for C1, lags 1 to 24, with Corr, T and LBQ values]
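To make the reading of an ACF plot more concrete, here is a minimal sketch of how sample autocorrelations can be computed. The function name `sample_acf`, the toy trending series and the rough bound 2/sqrt(n) are assumptions for illustration. The project's own values come from Minitab, which may compute its T statistics differently.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a list of numbers."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


# Hypothetical toy data: 96 points with a pure upward trend.
trend = list(range(1, 97))
acf = sample_acf(trend, 3)
bound = 2 / len(trend) ** 0.5  # rough significance bound, an approximation

for lag, r in enumerate(acf, start=1):
    print(lag, round(r, 3), abs(r) > bound)
```

For a trending series like this one the autocorrelations start close to 1 and die out slowly, which is the usual sign that differencing is needed.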
The PACF also suggests non-stationarity, since it shows a wave-like pattern.
The PACF of the actual series
[Figure: Partial Autocorrelation Function for C1, lags 1 to 24, with PAC and T values]
Since the actual series shows a seasonal pattern, we need to apply seasonal differencing.
The graph below shows the series after seasonal differencing.
Seasonal differencing
[Figure: time series plot of the seasonally differenced series (C2); x-axis: month and year]
The ACF still shows a wave-like pattern.
This pattern indicates that the series is not yet stationary.
The ACF after seasonal differencing
[Figure: Autocorrelation Function for C2, lags 1 to 21, with Corr, T and LBQ values]
The PACF clearly shows that the series is still not stationary even after seasonal differencing.
The PACF after seasonal differencing
[Figure: Partial Autocorrelation Function for C2, lags 1 to 21, with PAC and T values]
After applying non-seasonal differencing, the plot shows that the series is stationary, since no trend remains.
Applying non-seasonal differencing
[Figure: time series plot of the series after seasonal and non-seasonal differencing (C3); x-axis: month and year]
After applying both seasonal and non-seasonal differencing, the ACF shows no pattern. Therefore we can conclude that the series is now stationary.
The ACF
[Figure: Autocorrelation Function for C3, lags 1 to 20, with Corr, T and LBQ values]
The PACF graph also indicates stationarity, since it does not show any pattern.
The PACF
[Figure: Partial Autocorrelation Function for C3, lags 1 to 20, with PAC and T values]
Proceed to creating the SARIMA model!
By referring to the ACF and PACF after seasonal and non-seasonal differencing, we can create the SARIMA model.
The ACF graph is used to determine the MA and SMA orders in the SARIMA model.
The PACF graph is used to determine the AR and SAR orders in the model.
SARIMA (AR, I, MA)(SAR, I, SMA)12
The SARIMA model obtained by referring to the ACF and PACF is
SARIMA (1,1,3)(1,1,1)12
The model obtained earlier is not necessarily the best model.
To obtain the best model, more SARIMA models based on the ACF and PACF after seasonal and non-seasonal differencing need to be produced.
SARIMA (1,1,3)(1,1,1)12
SARIMA (1,1,2)(1,1,1)12
SARIMA (1,1,2)(0,1,1)12
SARIMA (0,1,2)(1,1,1)12
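For reference, the SARIMA (p,d,q)(P,D,Q)12 notation used for these candidates corresponds to the standard multiplicative seasonal ARIMA form. The symbols below are not taken from the slides and are assumptions of this sketch: B is the backshift operator, phi and theta are the non-seasonal AR and MA polynomials, Phi and Theta are their seasonal counterparts, and epsilon is the white-noise error. The signs follow the usual Box-Jenkins convention, which some software reverses. The second equation expands one of the candidates, SARIMA (1,1,2)(0,1,1)12, as an example.

```latex
\[
\Phi_P(B^{12})\,\phi_p(B)\,(1-B)^d\,(1-B^{12})^D\,y_t
  = \theta_q(B)\,\Theta_Q(B^{12})\,\varepsilon_t
\]
% Example: SARIMA (1,1,2)(0,1,1)12
\[
(1-\phi_1 B)\,(1-B)\,(1-B^{12})\,y_t
  = (1-\theta_1 B-\theta_2 B^2)\,(1-\Theta_1 B^{12})\,\varepsilon_t
\]
```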
Creating other SARIMA models
For each model created, the chi-square (Q) statistic and the MSE need to be calculated to find the best model. The best model must have white-noise errors and the smallest MSE.
Using Minitab

Model              Q     Q tabulated   DF (significance 0.1)   Decision    Conclusion                   MSE
(1,1,3)(1,1,1)12   3.9   10.64         6                       Accept H0   The errors are white noise   89.21
(1,1,2)(1,1,1)12   7.6   12.01         7                       Accept H0   The errors are white noise   90.28
(1,1,2)(0,1,1)12   7.3   13.36         8                       Accept H0   The errors are white noise   89.81
(0,1,2)(1,1,1)12   8.9   13.36         8                       Accept H0   The errors are white noise   94.09
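The white-noise check in the table can be sketched as follows. This is a minimal illustration of the Ljung-Box form of the Q statistic, Q = n(n+2) * sum of r_k^2 / (n-k), not a reproduction of Minitab's output. The function names `ljung_box_q` and `is_white_noise` and the toy residuals are assumptions for the example. The Q and tabulated values in `reported` are copied from the table above.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic of a residual series up to max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


def is_white_noise(q, q_tabulated):
    """Accept H0 (errors are white noise) when Q is below the tabulated value."""
    return q < q_tabulated


# Q and tabulated chi-square values as reported in the table above.
reported = {
    "(1,1,3)(1,1,1)12": (3.9, 10.64),
    "(1,1,2)(1,1,1)12": (7.6, 12.01),
    "(1,1,2)(0,1,1)12": (7.3, 13.36),
    "(0,1,2)(1,1,1)12": (8.9, 13.36),
}
decisions = {m: is_white_noise(q, q_tab) for m, (q, q_tab) in reported.items()}
print(decisions)

# Hypothetical toy residuals that alternate in sign, so they are clearly not white noise.
toy_q = ljung_box_q([1, -1] * 20, max_lag=1)
print(round(toy_q, 2))
```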
From the Minitab output above, the errors of all four models are white noise, and the three models with the smallest MSE are carried forward. To obtain the best model, we need to calculate the fitting MSE and the hold-out MSE using Microsoft Excel and Minitab.
Best among the best
SARIMA models from Minitab

Model              MSE fitting   MSE hold-out
(1,1,3)(1,1,1)12   70.85         1646.778
(1,1,2)(1,1,1)12   78.84         1653.141
(1,1,2)(0,1,1)12   80.82         1638.623
Judging by the hold-out values, SARIMA (1,1,2)(0,1,1)12 is the best model since its hold-out MSE is the smallest.
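The selection rule used here can be sketched briefly. The helper `mse` and the small actual/forecast lists are hypothetical and only show how an MSE is computed. The hold-out values in `reported_holdout` are the ones reported in the table above.

```python
def mse(actual, forecast):
    """Mean squared error between two equally long lists."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical toy example of the MSE calculation itself.
example = mse([10, 12, 15], [11, 12, 13])
print(example)

# Hold-out MSE values as reported for the three candidate models.
reported_holdout = {
    "(1,1,3)(1,1,1)12": 1646.778,
    "(1,1,2)(1,1,1)12": 1653.141,
    "(1,1,2)(0,1,1)12": 1638.623,
}
best = min(reported_holdout, key=reported_holdout.get)
print(best, reported_holdout[best])
```

The fitting MSE is computed the same way on the estimation period, so the same helper serves both columns of the table.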
The best model
Univariate model

Model                MSE, fitting part (1949-1956)   MSE, hold-out part (1949-1956)
Naïve with trend     646.4                           1496.1
Single Exponential   10183.3                         25505.5
Double Exponential   291.2                           891.0
Holt’s Method        353.2                           871.9
ARRES Method         789.4                           2583.8
Holt-Winters         77.8                            95.2
The best univariate model is the Holt-Winters model since it has the lowest fitting MSE and the lowest hold-out MSE.
Best univariate model
Model          MSE, fitting part (1949-1956)   MSE, hold-out part (1949-1956)
Holt-Winters   77.8                            95.2
Comparison between the best Box-Jenkins model and the best univariate model.

Model                     MSE, fitting part (1949-1956)   MSE, hold-out part (1949-1956)
SARIMA (1,1,2)(0,1,1)12   80.82                           1638.623
Holt-Winters              77.8                            95.2
From the comparison of the SARIMA model and the Holt-Winters univariate model, we can conclude that the Holt-Winters model is the best overall since it has the smallest hold-out MSE.
Best among the best model…
Thank you!