Project 2 forecast

By: Ahmad Nazim bin Aimran and Wan Mohamad Asyraf bin Wan Afthanorhan


Page 1: Project 2 forecast


Page 2: Project 2 forecast

Introduction

A monthly series of international airline passenger numbers from 1949 to 1956 is studied. The purpose of this project is to find the best SARIMA model and to compare it with univariate models.

Since the series has been identified as having a seasonal component, we fit SARIMA models.

The 3 main stages are:

i. Model identification

ii. Model estimation and validation

iii. Model application

Page 3: Project 2 forecast

Objective

* To determine whether the time series is stationary and whether it has a seasonal component.

* To create a SARIMA model (since a seasonal component exists) based on the ACF and PACF.

* To identify the best SARIMA model, the one with the lowest MSE.

* To compare the Box-Jenkins model with the univariate model output.

Page 4: Project 2 forecast

Methodology

Construct a time plot of the original data series and try to identify any unusual observations. If such observations exist, decide whether a transformation is necessary. If it is, transform the series to achieve stationarity in variance.

If the data series appears non-stationary, perform differencing. For a non-seasonal data series, the first difference is usually sufficient. If seasonality exists, perform the seasonal difference first, then perform the non-seasonal difference.

Once stationarity has been achieved, examine the ACF and PACF to see whether the data series shows any discernible pattern.

In the model identification stage, it may be slightly easier to select pure AR or pure MA models. When selecting a mixed ARIMA model, deciding on the values of p and q is much more difficult, and even more so for a model with a seasonal component. Hence, it is worth considering several possible models in order to reduce the chance of missing the most appropriate model form. To finally determine the best-fitted model, one needs to use several statistical measures such as the MSE, AIC/BIC, or the Box-Pierce (Ljung-Box) statistic.
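The differencing steps described in the methodology can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the series is synthetic (a linear trend plus a repeating 12-month pattern), not the airline data, and the helper name `difference` is hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic monthly series (an assumption, not the airline data):
# a linear trend plus a pattern that repeats every 12 months.
pattern = [0, -2, 3, 1, 2, 8, 14, 13, 6, 0, -5, -1]
series = [100 + 2 * t + pattern[t % 12] for t in range(48)]

# Seasonal difference first (lag 12), then the non-seasonal difference (lag 1).
seasonal_diff = difference(series, lag=12)
stationary = difference(seasonal_diff, lag=1)

print(len(series), len(seasonal_diff), len(stationary))  # 48 36 35
print(set(seasonal_diff), set(stationary))               # {24} {0}
```

In this toy case the seasonal difference removes the repeating pattern and leaves a constant (the trend slope times 12), and the non-seasonal difference then removes that constant. Real data will not difference this cleanly.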

Page 5: Project 2 forecast

OBJECTIVES

* To determine whether the time series data is stationary and whether it has a seasonal component.

* To examine the ACF and PACF, which measure the degree of interdependence in the series.

* To create the best SARIMA model based on the ACF and PACF values.

* To find the best SARIMA model, the one with the lowest MSE.

Page 6: Project 2 forecast

* The graph below shows that the series is not stationary and also contains a seasonal component.

* The series is not stationary because it has an upward trend.

The actual series graph

[Figure: time series plot of the actual series C1; x-axis: month and year, y-axis: C1 (ticks from 100 to 400)]

Page 7: Project 2 forecast

The ACF of the actual series shows a wave-like pattern. This pattern indicates that the series is not stationary and has a seasonal component.

[Figure: Autocorrelation Function for C1; x-axis: lag, y-axis: autocorrelation (-1.0 to 1.0)]

Lag    Corr       T      LBQ
  1    0.93    9.09    85.32
  2    0.84    4.97   155.55
  3    0.76    3.66   213.85
  4    0.69    2.94   262.54
  5    0.64    2.51   304.89
  6    0.60    2.21   342.32
  7    0.56    1.97   375.52
  8    0.54    1.84   407.16
  9    0.56    1.82   440.53
 10    0.58    1.82   476.72
 11    0.61    1.88   518.33
 12    0.63    1.85   562.50
 13    0.56    1.60   598.33
 14    0.48    1.35   625.29
 15    0.42    1.14   645.63
 16    0.36    0.96   660.64
 17    0.31    0.83   672.14
 18    0.27    0.71   680.83
 19    0.24    0.63   687.76
 20    0.23    0.59   694.03
 21    0.24    0.62   701.07
 22    0.25    0.66   709.29
 23    0.28    0.73   719.60
 24    0.29    0.76   730.87
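For readers who want to see how the autocorrelations above are obtained, here is a minimal sketch of the sample autocorrelation function. It assumes the usual estimator (lag-k cross products of deviations from the mean, divided by the total sum of squares) and uses a synthetic upward-trending series, not the airline data. The function name `sample_acf` is hypothetical.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


# Synthetic upward-trending series of 96 points (an assumption for illustration).
trend = [float(t) for t in range(96)]
acf = sample_acf(trend, max_lag=24)

# Rough significance bound often drawn on ACF plots: +/- 2 / sqrt(n).
bound = 2 / math.sqrt(len(trend))
print([round(r, 2) for r in acf[:3]], round(bound, 3))
```

A trending series gives autocorrelations that start near 1 and die out slowly, which is the same qualitative behaviour as the large, slowly decaying values reported for C1.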

Page 8: Project 2 forecast

The PACF also indicates that the series is not stationary, since it shows a wave-like pattern as well.

The PACF of the actual series

[Figure: Partial Autocorrelation Function for C1; x-axis: lag, y-axis: partial autocorrelation (-1.0 to 1.0)]

Lag     PAC       T
  1    0.93    9.09
  2   -0.17   -1.68
  3    0.06    0.55
  4    0.00    0.01
  5    0.09    0.93
  6    0.00    0.01
  7    0.01    0.13
  8    0.14    1.42
  9    0.18    1.77
 10    0.06    0.60
 11    0.21    2.02
 12   -0.12   -1.17
 13   -0.50   -4.93
 14   -0.00   -0.04
 15    0.04    0.41
 16   -0.03   -0.27
 17    0.04    0.42
 18    0.04    0.35
 19    0.10    1.00
 20   -0.08   -0.79
 21    0.03    0.29
 22   -0.02   -0.24
 23   -0.03   -0.27
 24    0.02    0.24

Page 9: Project 2 forecast

Since the actual series shows a seasonal pattern, we need to apply seasonal differencing.

The graph below shows the series after seasonal differencing.

Differencing

[Figure: time series plot of the seasonally differenced series C2; x-axis: month and year, y-axis: C2 (ticks from -10 to 60)]

Page 10: Project 2 forecast

The ACF still shows a wave-like pattern.

This pattern indicates that the series is still not stationary.

The ACF after seasonal differencing

[Figure: Autocorrelation Function for C2; x-axis: lag, y-axis: autocorrelation (-1.0 to 1.0)]

Lag    Corr       T      LBQ
  1    0.77    7.09    52.15
  2    0.66    4.10    90.98
  3    0.51    2.68   114.48
  4    0.45    2.16   132.65
  5    0.39    1.78   146.50
  6    0.32    1.43   156.26
  7    0.24    1.03   161.67
  8    0.20    0.87   165.64
  9    0.16    0.67   168.07
 10    0.07    0.30   168.55
 11    0.02    0.07   168.58
 12   -0.07   -0.28   169.02
 13   -0.03   -0.13   169.11
 14   -0.08   -0.33   169.74
 15   -0.10   -0.43   170.85
 16   -0.18   -0.77   174.48
 17   -0.17   -0.71   177.69
 18   -0.18   -0.76   181.39
 19   -0.18   -0.74   185.02
 20   -0.20   -0.82   189.57
 21   -0.15   -0.59   192.01

Page 11: Project 2 forecast

The PACF also shows that the series is still not stationary, even after seasonal differencing.

The PACF after seasonal differencing

[Figure: Partial Autocorrelation Function for C2; x-axis: lag, y-axis: partial autocorrelation (-1.0 to 1.0)]

Lag     PAC       T
  1    0.77    7.09
  2    0.16    1.48
  3   -0.11   -1.00
  4    0.09    0.87
  5    0.05    0.45
  6   -0.05   -0.49
  7   -0.08   -0.76
  8    0.07    0.62
  9   -0.01   -0.12
 10   -0.18   -1.69
 11    0.00    0.01
 12   -0.08   -0.71
 13    0.16    1.50
 14   -0.13   -1.15
 15   -0.07   -0.64
 16   -0.10   -0.93
 17    0.11    0.97
 18   -0.02   -0.21
 19   -0.07   -0.66
 20    0.00    0.03
 21    0.18    1.63

Page 12: Project 2 forecast

After applying non-seasonal differencing, the graph shows that the series is stationary, since no trend remains.

Applying non-seasonal differencing

[Figure: time series plot of the series C3 after seasonal and non-seasonal differencing; x-axis: month and year, y-axis: C3 (ticks from -20 to 30)]

Page 13: Project 2 forecast

After applying both seasonal and non-seasonal differencing, the ACF shows no pattern. Therefore we can conclude that the series is now stationary.

The ACF

[Figure: Autocorrelation Function for C3; x-axis: lag, y-axis: autocorrelation (-1.0 to 1.0)]

Lag    Corr       T     LBQ
  1   -0.26   -2.39    5.90
  2    0.10    0.85    6.76
  3   -0.21   -1.82   10.83
  4   -0.02   -0.16   10.86
  5    0.04    0.34   11.01
  6    0.07    0.58   11.47
  7   -0.12   -0.96   12.78
  8    0.02    0.19   12.83
  9    0.08    0.61   13.37
 10   -0.10   -0.77   14.27
 11    0.11    0.85   15.39
 12   -0.26   -2.03   22.03
 13    0.18    1.32   25.18
 14   -0.01   -0.09   25.19
 15    0.11    0.79   26.39
 16   -0.17   -1.22   29.36
 17    0.00    0.01   29.36
 18   -0.04   -0.27   29.51
 19    0.05    0.38   29.83
 20   -0.19   -1.34   33.77

Page 14: Project 2 forecast

The PACF also shows no pattern, which is consistent with a stationary series.

The PACF

[Figure: Partial Autocorrelation Function for C3; x-axis: lag, y-axis: partial autocorrelation (-1.0 to 1.0)]

Lag     PAC       T
  1   -0.26   -2.39
  2    0.03    0.30
  3   -0.19   -1.77
  4   -0.14   -1.24
  5    0.02    0.19
  6    0.06    0.53
  7   -0.14   -1.24
  8   -0.04   -0.34
  9    0.13    1.22
 10   -0.11   -0.98
 11    0.02    0.23
 12   -0.19   -1.73
 13    0.07    0.60
 14    0.05    0.47
 15    0.03    0.32
 16   -0.13   -1.16
 17   -0.05   -0.47
 18    0.02    0.21
 19   -0.06   -0.58
 20   -0.28   -2.56

Page 15: Project 2 forecast

PROCEED TO CREATING THE SARIMA MODEL!

Page 16: Project 2 forecast

[Figures: Partial Autocorrelation Function for C3 and Autocorrelation Function for C3, repeated from Pages 13 and 14]

Page 17: Project 2 forecast

By referring to the ACF and PACF after seasonal and non-seasonal differencing, we can create the SARIMA model.

The ACF graph is used to determine the MA and SMA orders in the SARIMA model.

The PACF graph is used to determine the AR and SAR orders in the model.

SARIMA (AR, I, MA)(SAR, I, SMA)12

The SARIMA model obtained by referring to the ACF and PACF is:

SARIMA (1,1,3)(1,1,1)12
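For reference, the notation can be written out in backshift form. The symbols below are not in the original slides. They follow one common Box-Jenkins sign convention, where B is the backshift operator (B y_t = y_{t-1}) and ε_t is the white-noise error. The first line is the general multiplicative model with seasonal period 12, and the second is the SARIMA (1,1,3)(1,1,1)12 case.

```latex
\phi_p(B)\,\Phi_P(B^{12})\,(1-B)^{d}\,(1-B^{12})^{D}\, y_t
  = \theta_q(B)\,\Theta_Q(B^{12})\,\varepsilon_t

(1-\phi_1 B)\,(1-\Phi_1 B^{12})\,(1-B)\,(1-B^{12})\, y_t
  = (1-\theta_1 B-\theta_2 B^{2}-\theta_3 B^{3})\,(1-\Theta_1 B^{12})\,\varepsilon_t
```

The factors (1-B) and (1-B^{12}) are the non-seasonal and seasonal differences applied earlier. The remaining polynomials hold the AR, SAR, MA and SMA coefficients to be estimated.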

Page 18: Project 2 forecast

Proceed to creating other SARIMA models

The model obtained earlier is not necessarily the best model.

To find the best model, more candidate SARIMA models need to be produced from the ACF and PACF after seasonal and non-seasonal differencing.

SARIMA (1,1,3)(1,1,1)12

SARIMA (1,1,2)(1,1,1)12

SARIMA (1,1,2)(0,1,1)12

SARIMA (0,1,2)(1,1,1)12

Page 19: Project 2 forecast

Using Minitab

For each candidate model, the chi-square (Q) statistic and the MSE are calculated in order to find the best model. The best model must have white-noise errors and the smallest MSE.

Model               Q     Q tabulated (significance 0.1)   DF   Decision    Conclusion                   MSE
(1,1,3)(1,1,1)12    3.9   10.64                            6    Accept H0   The errors are white noise   89.21
(1,1,2)(1,1,1)12    7.6   12.01                            7    Accept H0   The errors are white noise   90.28
(1,1,2)(0,1,1)12    7.3   13.36                            8    Accept H0   The errors are white noise   89.81
(0,1,2)(1,1,1)12    8.9   13.36                            8    Accept H0   The errors are white noise   94.09
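The white-noise check in the table above can be sketched as follows. This is an illustration, not the Minitab procedure itself. It assumes the Q statistic is the Ljung-Box form and that 12 residual lags were tested, which is inferred from the reported degrees of freedom rather than stated on the slides. The function names are hypothetical.

```python
def ljung_box_q(residual_acf, n):
    """Ljung-Box Q = n (n + 2) * sum over k of r_k^2 / (n - k), k = 1..m."""
    return n * (n + 2) * sum(
        r * r / (n - k) for k, r in enumerate(residual_acf, start=1)
    )


def degrees_of_freedom(lags_tested, order, seasonal_order):
    """Lags tested minus the number of estimated AR, MA, SAR and SMA terms."""
    p, _, q = order
    P, _, Q = seasonal_order
    return lags_tested - (p + q + P + Q)


# Q and tabulated chi-square values (significance level 0.1) as reported on the slide.
reported = {
    "(1,1,3)(1,1,1)12": (3.9, 10.64),
    "(1,1,2)(1,1,1)12": (7.6, 12.01),
    "(1,1,2)(0,1,1)12": (7.3, 13.36),
    "(0,1,2)(1,1,1)12": (8.9, 13.36),
}
verdicts = {
    model: "white noise" if q_stat < q_tab else "not white noise"
    for model, (q_stat, q_tab) in reported.items()
}
for model, verdict in verdicts.items():
    print(model, verdict)

# Assumption: 12 lags tested, which reproduces the DF row of the table.
print(degrees_of_freedom(12, (1, 1, 3), (1, 1, 1)))  # 6
```

Every reported Q is below its tabulated value, so the null hypothesis of white-noise errors is not rejected for any of the four models.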

Page 20: Project 2 forecast

Best among the best

The Minitab output above shows that the errors of all four models are white noise. The three models with the smallest MSE are carried forward. To choose the best among them, we calculate the fitting MSE and the hold-out MSE using Microsoft Excel and Minitab.

SARIMA model from Minitab   MSE fitting   MSE hold-out
(1,1,3)(1,1,1)12            70.85         1646.778
(1,1,2)(1,1,1)12            78.84         1653.141
(1,1,2)(0,1,1)12            80.82         1638.623

Page 21: Project 2 forecast

The best model

Judging by the hold-out values, SARIMA (1,1,2)(0,1,1)12 is the best model, since its hold-out MSE is the smallest.
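The selection rule can be sketched in Python. The hold-out MSE values are copied from the slides. The `mse` helper and the toy numbers used to check it are assumptions for illustration, since the slides do not include the underlying forecasts.

```python
def mse(actual, forecast):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(forecast):
        raise ValueError("sequences must have the same length")
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


# Hold-out MSE values reported on the slides.
hold_out_mse = {
    "SARIMA (1,1,3)(1,1,1)12": 1646.778,
    "SARIMA (1,1,2)(1,1,1)12": 1653.141,
    "SARIMA (1,1,2)(0,1,1)12": 1638.623,
}
best = min(hold_out_mse, key=hold_out_mse.get)
print(best)  # SARIMA (1,1,2)(0,1,1)12

# Toy check of mse(): the errors are 1, -1 and 2, so the MSE is (1 + 1 + 4) / 3.
print(mse([10, 12, 15], [9, 13, 13]))  # 2.0
```

Ranking by hold-out MSE rather than fitting MSE favours the model that forecasts unseen data best, which is why the model with the smallest fitting MSE is not the one chosen here.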

Page 22: Project 2 forecast

Univariate model

Model                MSE fitting part (1949-1956)   MSE hold-out part (1949-1956)
Naïve with trend     646.4                          1496.1
Single Exponential   10183.3                        25505.5
Double Exponential   291.2                          891.0
Holt's Method        353.2                          871.9
ARRES Method         789.4                          2583.8
Holt-Winters         77.8                           95.2
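As an illustration of the simplest smoothing method in the table, here is a minimal sketch of single exponential smoothing. The smoothing constant `alpha`, the choice to start from the first observation, and the toy data are all assumptions. The slides do not state the settings used in Minitab.

```python
def single_exponential_smoothing(series, alpha):
    """One-step-ahead forecasts: F[t+1] = alpha * y[t] + (1 - alpha) * F[t]."""
    forecasts = [series[0]]  # assumption: the first forecast equals the first observation
    for y in series[:-1]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts


# Toy data, not the airline series.
y = [10.0, 12.0, 11.0, 13.0]
f = single_exponential_smoothing(y, alpha=0.5)
print(f)  # [10.0, 10.0, 11.0, 11.0]

# Fitting MSE of the one-step-ahead forecasts.
fit_mse = sum((a - b) ** 2 for a, b in zip(y, f)) / len(y)
print(fit_mse)  # 2.0
```

Single exponential smoothing has no trend or seasonal term, so it lags behind a trending seasonal series. That is consistent with its large MSE in the table, while Holt-Winters models both components.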

Page 23: Project 2 forecast

Best univariate model

The best univariate model is the Holt-Winters model, since it has the lowest fitting MSE and the lowest hold-out MSE.

MSE                         Holt-Winters
Fitting part (1949-1956)    77.8
Hold-out part (1949-1956)   95.2

Page 24: Project 2 forecast

Comparison between the best Box-Jenkins model and the best univariate model.

MSE                         SARIMA (1,1,2)(0,1,1)12   Holt-Winters model
Fitting part (1949-1956)    80.82                     77.8
Hold-out part (1949-1956)   1638.623                  95.2

Page 25: Project 2 forecast

Best among the best model

From the comparison of the SARIMA model and the Holt-Winters univariate model, we can conclude that the best model overall is the Holt-Winters model, since it has the smallest hold-out MSE.

Page 26: Project 2 forecast

Thank you!