ASHISH RANJAN
ANIRBAN GHOSH
AYAN DAS
MADHUMITA GHOSH
SOMDEEP SEN
Data Description
The dataset contains twelve years of monthly sales for the airline industry, from 1949 to 1960
The monthly breakdown of each year's sales is provided, in thousands, as the variable ‘AIR’
Objective
To predict the sales of the year 1961 through Time Series Analysis using SAS
A time series is data recorded at successive points over a period of time
A time series generally includes four components
Component: Description
Trend:
• Smooth long-term movement over a long period of time
• Data moves steadily in one particular direction with little fluctuation
Seasonality:
• Periodic movements with a cycle period of <= 1 year
Cyclicality:
• Periodic movements with a cycle period of > 1 year
Irregularity:
• Random, erratic movements
Volatility Check
Non Stationarity Check
Check for Seasonality
Creation of Development & Validation Sample
Selection of P & Q
Generating final Forecast
A plot of sales against time, created with PROC GPLOT, gives an indication of volatility
A fan-shaped (Japanese fan) or inverted-fan-shaped plot indicates high volatility
For a fan-shaped plot we apply a log or square-root transformation
For an inverted-fan-shaped plot we apply a square or exponential transformation
During Analysis:
The initial plot was fan-shaped, so the log and square-root transformations were both tried
Of the two, the log transformation gave the better result and was therefore chosen
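The effect of the log transformation on a fan-shaped series can be illustrated with a small Python sketch (the analysis itself was done in SAS). The series below is synthetic, not the airline data, and the helper name yearly_spread is an assumption made for illustration. It compares the within-year spread of the first and last year before and after taking logs.

```python
import math
import statistics

# Synthetic monthly series (NOT the airline data): the yearly level rises
# and the seasonal swing is proportional to the level, giving a fan shape.
sales = [
    (100 + 36 * (t // 12)) * (1 + 0.2 * math.sin(2 * math.pi * t / 12))
    for t in range(144)
]

def yearly_spread(series, period=12):
    """Population standard deviation within each block of `period` points."""
    return [
        statistics.pstdev(series[i:i + period])
        for i in range(0, len(series), period)
    ]

log_sales = [math.log(y) for y in sales]

raw_spread = yearly_spread(sales)
log_spread = yearly_spread(log_sales)

# Ratio of last-year spread to first-year spread.
# A ratio far above 1 means the spread widens over time (fan shape).
raw_ratio = raw_spread[-1] / raw_spread[0]
log_ratio = log_spread[-1] / log_spread[0]

print("spread ratio, raw series:", round(raw_ratio, 2))
print("spread ratio, log series:", round(log_ratio, 2))
```

On this synthetic series the raw spread grows with the level while the log spread stays flat, which is the behaviour the transformation is meant to produce.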
A non-stationary series has statistical properties (such as mean and variance) that change over time, so it shows no fixed pattern
Such data cannot be used directly for forecasting
Non-stationarity is checked using the Augmented Dickey-Fuller (ADF) test
Here the null hypothesis (H0) is that the data is non-stationary
If the P-value<α we reject H0 and conclude that the data is stationary
If the P-value>α we cannot reject H0, so the data is treated as non-stationary
Such data can be made stationary by differencing
During Analysis:
The initial ADF check showed non-stationarity
Therefore differencing was used to convert the data into a stationary series
Note: differencing was applied to the log of the variable
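Differencing the log of a series can be sketched in Python as follows. This is only an illustration: the series is synthetic, the function name difference is an assumption, and the ADF test itself is not implemented here.

```python
import math

def difference(series, lag=1):
    """Return y[t] - y[t-lag] for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic series growing 2% per step (illustrative, not the airline data).
sales = [100 * 1.02 ** t for t in range(24)]

# Take logs first, then difference, as noted in the analysis.
log_sales = [math.log(y) for y in sales]
diff_log = difference(log_sales, lag=1)

# The level of `sales` keeps rising, but every differenced log value
# equals log(1.02), so the differenced series has a constant mean.
print(round(diff_log[0], 6), round(diff_log[-1], 6))
```

Each differenced value is one point shorter than the input per unit of lag, so a first difference of 24 points yields 23 points.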
The autocorrelation function (ACF) gives the correlation between Y(t) & Y(t-s), where s is the lag
If the ACF gives high values at a fixed interval, that interval can be taken as the period of seasonality
Differencing at that same lag de-seasonalizes the data
During Analysis:
It was found that ACF gave high values at fixed intervals of 12 (so, S=12)
Hence differencing was done at an interval of 12
Note: differencing was applied to the log of the variable
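A minimal Python sketch of the seasonality check, assuming a synthetic series with a known 12-step cycle rather than the airline data. The function names acf and difference are illustrative. It computes the sample autocorrelation at lags 1 to 24, picks the lag with the highest value, and differences at that lag.

```python
import math

def acf(series, lag):
    """Sample autocorrelation between y[t] and y[t-lag]."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom

def difference(series, lag=1):
    """Return y[t] - y[t-lag] for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Synthetic series with a pure 12-step cycle (not the airline data).
series = [5 + math.sin(2 * math.pi * t / 12) for t in range(120)]

acf_values = {s: acf(series, s) for s in range(1, 25)}
season = max(acf_values, key=acf_values.get)
print("lag with highest ACF:", season)

# Differencing at the seasonal lag removes the repeating pattern.
deseasonalized = difference(series, lag=season)
print("largest remaining value:", max(abs(d) for d in deseasonalized))
```

On real data the ACF is usually inspected for repeated peaks rather than a single maximum, since short lags can dominate when a trend is present.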
Depending on the number of future time points to be forecast, some of the most recent time points are set aside
These points form the validation sample; the rest form the development sample
The development sample is used to generate forecasts from different models
During Analysis:
The development & validation samples created were named D & S respectively
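The split can be sketched in Python as below. The function name split_series, the placeholder values, and the 12-point horizon (matching a one-year forecast) are assumptions for illustration. The variable names follow the D and S naming used in the analysis.

```python
def split_series(series, horizon):
    """Hold out the last `horizon` points as the validation sample."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[-horizon:]

# Placeholder values standing in for 144 monthly observations.
series = list(range(1, 145))

# D = development sample, S = validation sample.
D, S = split_series(series, horizon=12)
print(len(D), len(S))
```

The split is taken from the end of the series, not at random, because the order of observations matters in a time series.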
•AIC: Akaike information criterion
•BIC: Bayesian information criterion
•SBC: Schwarz criterion
The MINIC option of the IDENTIFY statement in PROC ARIMA gives the minimum-BIC model
All possible combinations of P & Q from 0 to 5 are explored
For each combination, AIC & SBC are computed and their average is calculated
Of these, the 5-6 models with the lowest averages are selected
A separate forecast is generated for each of them
During Analysis:
The (3,0) combination gave the minimum value of BIC
Therefore all 15 possible combinations of P & Q from 0 to 3, excluding (0,0), were considered
Of those, the 6 combinations with the lowest average of AIC & SBC were used for forecasting
Please see the link to view the average of AIC & SBC: http://bit.ly/1oZ1R4F
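The selection step can be sketched in Python using the standard definitions AIC = -2 ln L + 2k and SBC = -2 ln L + k ln(n). Everything numeric here is a placeholder: the log-likelihoods are made up, the sample size n = 120 is arbitrary, and counting k as P + Q + 1 (one constant term) is an assumption. In practice the values would come from the fitted models.

```python
import math
from itertools import product

def aic(log_likelihood, k):
    """Akaike information criterion."""
    return -2 * log_likelihood + 2 * k

def sbc(log_likelihood, k, n):
    """Schwarz criterion."""
    return -2 * log_likelihood + k * math.log(n)

# All (P, Q) pairs with P, Q in 0..3, excluding (0, 0): 15 candidates.
candidates = [
    (p, q) for p, q in product(range(4), repeat=2) if (p, q) != (0, 0)
]

def rank_by_average(fits, n, top=6):
    """Return the `top` (P, Q) pairs with the lowest average of AIC and SBC.

    `fits` maps (P, Q) to the log-likelihood of the fitted model.
    """
    scored = []
    for (p, q), ll in fits.items():
        k = p + q + 1  # assumed parameter count: AR terms + MA terms + constant
        scored.append(((aic(ll, k) + sbc(ll, k, n)) / 2, (p, q)))
    scored.sort()
    return [pq for _, pq in scored[:top]]

# Made-up log-likelihoods, for illustration only.
fake_fits = {(p, q): 100.0 + p + 0.5 * q for p, q in candidates}

selected = rank_by_average(fake_fits, n=120)
print(len(candidates), selected)
```

Averaging the two criteria is the approach described above; the ranking it produces on the placeholder values says nothing about the airline data.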
Forecasts are generated for each selected combination of P & Q
Each forecast is compared with the actual values for the same period using the mean absolute percentage error (MAPE)
The combination with the minimum MAPE is selected
During Analysis:
MAPE was found to be minimum for (0,3)
Final forecast was done for that combination
Note: before making the final forecast, both the actual and predicted values were converted back to their original scale using the exponential function
Please see the link to view the minimum MAPE details: http://bit.ly/1oZ1R4F
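The MAPE comparison, including the conversion back from the log scale, can be sketched in Python as follows. The values and the model labels model_a and model_b are hypothetical and are not results from the analysis.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(errors) / len(errors)

# Hypothetical validation values on the log scale.
log_actual = [math.log(400.0), math.log(500.0)]
log_forecasts = {
    "model_a": [math.log(380.0), math.log(525.0)],
    "model_b": [math.log(440.0), math.log(450.0)],
}

# Convert back to the original scale with exp before computing MAPE.
actual = [math.exp(v) for v in log_actual]
scores = {
    name: mape(actual, [math.exp(v) for v in values])
    for name, values in log_forecasts.items()
}

best = min(scores, key=scores.get)
print(scores, best)
```

Computing MAPE on the original scale keeps the error interpretable as a percentage of actual sales, not of log sales.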
Date Sales (In thousands)
Jan-61 444.82
Feb-61 420.47
Mar-61 453.76
Apr-61 499.38
May-61 511.44
Jun-61 579.87
Jul-61 674.35
Aug-61 657.19
Sep-61 551.07
Oct-61 500.22
Nov-61 423.30
Dec-61 469.02
[Figure: Actual/Forecasted. Line chart of sales (in thousands, axis 0 to 800) against date, Jan-49 to Sep-61, with two series: Actual and Forecasted]