
Case Study: Time Series

Table of Contents

1 Definition
  1.1 Time Series Patterns
  1.2 White Noise
  1.3 Autocorrelation Function (ACF)
  1.4 Ljung-Box Test
2 Performance Evaluation
  2.1 Residuals
  2.2 Forecast Accuracy Measurements
  2.3 Time Series Cross-Validation (tsCV)
3 Basic Algorithms
  3.1 Example: Article 10 Monthly Data
  3.2 Naïve, Mean, Seasonal Naïve
4 Exponential Weighted Forecast
  4.1 Simple Exponential Smoothing (SES)
    4.1.1 Definition
    4.1.2 Parameter Selection
    4.1.3 Example: Article 10 Monthly Data
  4.2 Exponential Smoothing with Trend: Holt’s Linear Trend Model
    4.2.1 Definition
    4.2.2 Parameter Selection
    4.2.3 Example: Article 10 Monthly Data
  4.3 Damped Holt’s Trend Model
    4.3.1 Definition
    4.3.2 Parameter Selection
    4.3.3 Example: Article 10 Monthly Data
  4.4 Exponential Smoothing with Trend and Seasonality: Holt-Winters Model
    4.4.1 Definition
    4.4.2 Example: Article 10 Monthly Data
  4.5 State Space Models for Exponential Smoothing
    4.5.1 Example: Article 10 Monthly Data
  4.6 Summary
5 Stationarity
  5.1 Variance Stabilization: Box-Cox Transformation
    5.1.1 Definition
    5.1.2 Purpose
    5.1.3 Box-Cox Transformation
    5.1.4 Example: Article 6 Daily Data
  5.2 Mean Stabilization
    5.2.1 Non-Seasonal Differencing
    5.2.2 Seasonal Differencing
6 ARIMA
  6.1 Definition
    6.1.1 Non-Seasonal ARIMA Model
    6.1.2 Seasonal ARIMA Model
  6.2 Parameter Selection
  6.3 Example 1: Article 10 Monthly Data, Non-Seasonal ARIMA
    6.3.1 ARIMA Parameters
    6.3.2 Forecast
    6.3.3 Performance
  6.4 Example 2: Article 11 Daily Data, Seasonal ARIMA
    6.4.1 ARIMA Parameters
    6.4.2 Forecast
    6.4.3 Performance
7 Dynamic Regression
  7.1 Definition
  7.2 Implementation
  7.3 Example: Article 10 Monthly Data
    7.3.1 Mathematical Model
    7.3.2 Outlier Filtering
    7.3.3 ARIMA Parameters
    7.3.4 Interpretation
    7.3.5 Forecast
    7.3.6 Performance
8 Dynamic Harmonic Regression
  8.1 Definition
  8.2 Example: Article 10 Monthly Data
    8.2.1 Parameter Selection
    8.2.2 Performance
9 Summary


1 Definition

Time series data is a series of data points observed over time. The following figure demonstrates an example of a time series: the daily sales quantity of article 11 from 2014 Q3 to 2016 Q2.

1.1 Time Series Patterns

There are three main patterns in time series:

• Trend: a long-term increase or decrease in the data
• Seasonal: a periodic pattern due to the calendar (e.g. the quarter, month, or day of the week)
• Cyclic: rises and falls that are not of fixed period (duration usually of at least 2 years)

The cyclic pattern has several characteristics that distinguish it from the seasonal pattern:

• A seasonal pattern has a constant period, while the period of a cyclic pattern can vary
• The average length of a cycle is longer than the length of a seasonal pattern
• The magnitude of cycles is more variable than the magnitude of seasonal patterns

1.2 White Noise

A white noise signal is a time series of independent and identically distributed (iid) data. The figure below simulates a white noise signal of the same length as the article 11 data.

1.3 Autocorrelation Function (ACF)

Autocorrelation is the correlation of a series with a delayed copy of itself, as a function of the delay. It can be interpreted as the similarity between observations as a function of the time lag between them. The ACF is a tool for identifying repeating patterns, or the periodicity, of a time series.


A white noise signal yields ACF values close to 0. The ACF plot of the simulated noise signal is shown below:

The cut-off lines (dashed blue) are determined by the sampling distribution: for white noise, 95% of the ACF values should lie within the blue lines. If not, the series is probably not white noise. In contrast, the ACF plot of the article 11 data has many values outside the cut-off lines, as shown below:

In addition, the periodicity of this time series can be identified from the ACF plot:

• The peak value occurs at a lag of 91 days, which reflects the quarterly seasonality;

• The weekly pattern (7 days) and its harmonics are also clearly shown by the ACF.
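A minimal R sketch of how such ACF plots can be produced with the forecast package (the object name article11 is hypothetical):

  library(forecast)

  set.seed(42)
  wn <- ts(rnorm(730))               # simulated white noise, roughly two years of daily values

  ggAcf(wn)                          # ACF of white noise: values should stay inside the dashed bounds
  # ggAcf(article11, lag.max = 200)  # ACF of the article 11 series: peaks near lags 7 and 91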

1.4 Ljung-Box Test

The Ljung-Box test is a significance test that evaluates whether a time series is likely to be white noise. It considers the first h autocorrelation values together. A small p-value indicates the time series is probably not white noise. The results below compare the Ljung-Box tests of the simulated white noise and the article 11 data.

Ljung-Box test result of white noise:

Ljung-Box test result of article 11 data:
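A minimal sketch of how these tests can be run with the base R Box.test() function (series names as in the sketch above; the lag choice is an assumption, not taken from the original):

  Box.test(wn, lag = 14, type = "Ljung-Box")         # large p-value: consistent with white noise
  Box.test(article11, lag = 14, type = "Ljung-Box")  # small p-value: probably not white noise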


2 Performance Evaluation

2.1 Residuals

The residuals from a valid forecast model are expected to look like white noise. They should have the following properties, of which the last two are useful for estimating prediction intervals:

• Uncorrelated
• Zero mean
• Constant variance
• Normally distributed

2.2 Forecast Accuracy Measurements

There are various measurements with different limitations. This section lists several common ones. Let

y_t = actual value
ŷ_t = forecast value
e_t = forecast error = y_t − ŷ_t

Mean Squared Error (MSE) and Root Mean Squared Error (RMSE)

MSE = mean(e_t²)
RMSE = √MSE

Mean Absolute Percentage Error (MAPE)

MAPE = 100 × mean(|e_t / y_t|)

Coefficient of Variation

coefficient of variation = RMSE / ȳ

Mean Absolute Scaled Error (MASE)

MASE = Σ|e_t| / Σ|y_t − y_{t−1}|

MASE will be used as the primary accuracy measurement in this case study, because MASE has the following properties:

• Scale invariance: independent of the scale of the data
• Predictable behavior as y_t → 0
• Symmetry: penalizes positive and negative forecast errors equally, and penalizes errors on large and small forecasts equally
• Interpretability: MASE = 1 is equivalent to the one-step naïve method; MASE less than 1 indicates better accuracy than the one-step naïve method
• Asymptotic normality: the Diebold-Mariano test statistic of MASE has been empirically shown to approximate a normal distribution

Besides MASE, RMSE will also be used as a reference.
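In the forecast package for R, these measures are reported by accuracy(); a minimal sketch (train10 and test10 are hypothetical names for the training and test series):

  library(forecast)

  fc <- naive(train10, h = 12)   # any forecast object works here
  accuracy(fc, test10)           # reports RMSE, MAPE, MASE, etc. on the training and test sets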

2.3 Time Series Cross-Validation (tsCV)

The formation of validation sets in tsCV differs from non-time-series cross-validation. This is explained through the following example:


The figure above demonstrates the tsCV validation sets (folds) for a 1-step forecast. Each line represents one validation set, where the blue dots are the training data and the red dots are the test data. As shown, the validation sets are formed on a rolling basis. In addition, the length of the test set and the number of folds can be determined based on the purpose of the model.
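A minimal sketch of 1-step tsCV with the forecast package (y is a hypothetical series):

  library(forecast)

  e <- tsCV(y, forecastfunction = naive, h = 1)  # 1-step-ahead cross-validated errors for the naive method
  sqrt(mean(e^2, na.rm = TRUE))                  # cross-validated RMSE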

3 Basic Algorithms

This section introduces several basic time-series forecasting algorithms. These algorithms are usually used as benchmarks for more complicated methods.

3.1 Example: Article 10 Monthly Data

In this section, the monthly data of article 10 from 2008 to 2013 is used as an example. The following figures show the sample data’s time-series plot and ACF plot.

For this sample data, the 2008 – 2012 data will be used as the training set, and the 2013 data as the test set.

3.2 Naïve, Mean, Seasonal Naïve

The Naïve model uses the last value of the training data as the forecast, while the Mean model uses the mean of the training data as the forecast. The Seasonal Naïve model uses the values from the corresponding periods of the most recent season as the forecast. The following figures illustrate the forecasts and residual plots of these models respectively.
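A minimal sketch of these benchmark forecasts with the forecast package (train10 and test10 are hypothetical names for the training and test series):

  library(forecast)

  h <- 12                                  # forecast the 12 months of 2013
  fc_naive  <- naive(train10, h = h)       # last observed value carried forward
  fc_mean   <- meanf(train10, h = h)       # mean of the training data
  fc_snaive <- snaive(train10, h = h)      # value from the same month one year earlier

  accuracy(fc_naive, test10)               # compare each model against the test set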


The table below summarizes the performance of these models:

Model            MASE (test)   RMSE (test)   RMSE (tsCV)
Naïve            1.75          748.61        763.33
Mean             3.56          1407.68       855.81
Seasonal Naïve   2.5           1043.72       813.42

4 Exponential Weighted Forecast

4.1 Simple Exponential Smoothing (SES)

Exponential smoothing is a technique for smoothing time series data using an exponential window function. Whereas the simple moving average weights past observations equally, exponential smoothing assigns exponentially decreasing weights over time.

4.1.1 Definition

Let

ŷ_{t+h|t} = point forecast of y_{t+h} given data y_1, …, y_t

Forecast equation:

ŷ_{t+h|t} = α y_t + α(1 − α) y_{t−1} + α(1 − α)² y_{t−2} + ⋯ ,  where 0 ≤ α ≤ 1

4.1.2 Parameter Selection

Let

ŷ_{t+h|t} = l_t

The forecast equation can be written as:

l_t = α y_t + (1 − α) l_{t−1}

The parameters α and l_0 can be chosen by minimizing the squared error.


4.1.3 Example: Article 10 Monthly Data

Fitting an SES model to the training data yields the following results:
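A minimal sketch of how such a fit can be produced with the forecast package (train10 is a hypothetical name for the 2008 – 2012 training series):

  library(forecast)

  fit_ses <- ses(train10, h = 12)   # simple exponential smoothing, 12-month forecast
  summary(fit_ses)                  # prints the estimated alpha, initial level l0 and sigma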

Note that the estimated parameters are listed in the results, where α = 0.1814 and l_0 = 494.5198. "sigma" is the standard deviation of the residuals. The figure below visualizes the smoothed training data, shown as the red line:

The following figures show the forecast and residual plot of the SES model:

Performance of the SES model:

Model MASE (test) RMSE (test) RMSE (tsCV)

SES 1.89 777.06 704.45


4.2 Exponential Smoothing with Trend: Holt’s Linear Trend Model

4.2.1 Definition

Recall that in SES,

ŷ_{t+h|t} = l_t
l_t = α y_t + (1 − α) l_{t−1}

Holt’s Linear Trend model adds a trend component to the SES model:

ŷ_{t+h|t} = l_t + h b_t
l_t = α y_t + (1 − α)(l_{t−1} + b_{t−1})
b_t = β*(l_t − l_{t−1}) + (1 − β*) b_{t−1},  where 0 ≤ β* ≤ 1

4.2.2 Parameter Selection

The parameters α, β*, l_0 and b_0 can be chosen by minimizing the squared error.

4.2.3 Example: Article 10 Monthly Data

Fitting a Holt’s Linear Trend model to the training data yields the following results:
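A minimal sketch of the corresponding fit with the forecast package (hypothetical train10 series, as above):

  library(forecast)

  fit_holt <- holt(train10, h = 12)   # Holt's linear trend method
  summary(fit_holt)                   # prints alpha, beta, initial level and initial trend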

Note that the estimated parameters are listed in the results, where

α = 10^−4, β = 10^−4, l_0 = 380.3413, b_0 = 25.8833

The following figures show the forecast and residual plot of the Holt’s Linear Trend model:

Performance of Holt’s Linear Trend model:

Model MASE (test) RMSE (test) RMSE (tsCV)

Holt’s Linear Trend 1.35 570.08 1159.16


4.3 Damped Holt’s Trend Model

4.3.1 Definition

Holt’s Trend model is generalized by adding a damping parameter ϕ to the trend component:

ŷ_{t+h|t} = l_t + (ϕ + ϕ² + ⋯ + ϕ^h) b_t,  where 0 ≤ ϕ ≤ 1
l_t = α y_t + (1 − α)(l_{t−1} + ϕ b_{t−1})
b_t = β*(l_t − l_{t−1}) + (1 − β*) ϕ b_{t−1}

If ϕ = 1, the model is identical to Holt’s Linear Trend model; if ϕ = 0, it is identical to the SES model.

4.3.2 Parameter Selection

The parameters α, β*, l_0, b_0 and ϕ can be chosen by minimizing the squared error.

4.3.3 Example: Article 10 Monthly Data

Fitting a damped Holt’s Trend model to the training data yields the following results:
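A minimal sketch of the damped fit with the forecast package (hypothetical train10 series):

  library(forecast)

  fit_damped <- holt(train10, h = 12, damped = TRUE)   # damped trend
  summary(fit_damped)                                  # prints alpha, beta, phi and the initial states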

Note that the estimated parameters are listed in the results, where

α = 0.0611, β = 0.0611, l_0 = 381.2388, b_0 = 31.7118 and ϕ = 0.8266

The following figures show the forecast and residual plot of the damped Holt’s Trend model:

Performance of Holt’s Trend model:

Model MASE (test) RMSE (test) RMSE (tsCV)

Holt’s Trend 1.53 644.87 1127.1


4.4 Exponential Smoothing with Trend and Seasonality: Holt-Winters Model

4.4.1 Definition

Recall the damped Holt’s Trend forecast:

ŷ_{t+h|t} = l_t + (ϕ + ϕ² + ⋯ + ϕ^h) b_t

The Holt-Winters model adds a seasonal component s_{t−m+h_m^+}, where m is the period of seasonality and h_m^+ = ((h − 1) mod m) + 1, so that the seasonal index is always taken from the last observed season. There are two variations:

1. Additive model:

ŷ_{t+h|t} = l_t + (ϕ + ϕ² + ⋯ + ϕ^h) b_t + s_{t−m+h_m^+}
l_t = α(y_t − s_{t−m}) + (1 − α)(l_{t−1} + ϕ b_{t−1})
b_t = β*(l_t − l_{t−1}) + (1 − β*) ϕ b_{t−1}
s_t = γ(y_t − l_{t−1} − b_{t−1}) + (1 − γ) s_{t−m}

2. Multiplicative model:

ŷ_{t+h|t} = [l_t + (ϕ + ϕ² + ⋯ + ϕ^h) b_t] · s_{t−m+h_m^+}
l_t = α y_t / s_{t−m} + (1 − α)(l_{t−1} + ϕ b_{t−1})
b_t = β*(l_t − l_{t−1}) + (1 − β*) ϕ b_{t−1}
s_t = γ y_t / (l_{t−1} + b_{t−1}) + (1 − γ) s_{t−m}

where m = period of seasonality and 0 ≤ γ ≤ 1 − α.

Note that the seasonal component averages approximately zero in the additive model and approximately one in the multiplicative model. In general, the multiplicative method is used when the seasonal variation increases with the level of the series.

4.4.2 Example: Article 10 Monthly Data

Fitting a Holt-Winters model (additive) to the training data yields the following results:

Fitting a Holt-Winters model (multiplicative) to the training data yields the following results:
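A minimal sketch of both fits with the forecast package (hypothetical train10 series):

  library(forecast)

  fit_hw_add  <- hw(train10, h = 12, seasonal = "additive")        # additive seasonality
  fit_hw_mult <- hw(train10, h = 12, seasonal = "multiplicative")  # multiplicative seasonality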


The figure below plots the forecast of the additive model; the forecast of the multiplicative model is shown as the red line.

And residual plots:

Performance of Holt-Winters model:

Model MASE (test) RMSE (test) RMSE (tsCV)

Holt-Winters Additive 2.06 892.74 810.36

Holt-Winters Multiplicative 2.04 920.02 854.68


4.5 State Space Models for Exponential Smoothing

The sections above introduced the types of trend and seasonal components in exponential smoothing models. The combinations of these types form a model space. Moreover, the errors can be further classified as additive or multiplicative. The resulting taxonomy is denoted in (Error, Trend, Seasonal) format, namely "ETS". The common values of the ETS components include:

• Error (E) = {A: additive, M: multiplicative}
• Trend (T) = {N: none, A: additive, Ad: additive damped}
• Seasonal (S) = {N: none, A: additive, M: multiplicative}

The ETS approach chooses the best model from this space by minimizing the corrected Akaike’s Information Criterion (AICc).

4.5.1 Example: Article 10 Monthly Data

Fitting an ETS model to the training data yields the following results:
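A minimal sketch of an automatic ETS fit with the forecast package (hypothetical train10 series):

  library(forecast)

  fit_ets <- ets(train10)               # automatic selection of the (E,T,S) combination by AICc
  summary(fit_ets)                      # prints the selected model, e.g. ETS(M,A,N)
  fc_ets <- forecast(fit_ets, h = 12)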

The selected model is a Holt’s Linear Trend model with multiplicative errors. The following figures show the forecast and residual plot of the ETS model:

Performance of the ETS model:

Model   MASE (test)   RMSE (test)   RMSE (tsCV)
ETS     1.41          590.36        749.96


4.6 Summary

The table below summarizes the performance of the exponential smoothing models and the basic models from the examples in the sections above:

Model                         MASE (test)   RMSE (test)   RMSE (tsCV)
Naïve                         1.75          748.61        763.33
Mean                          3.56          1407.68       855.81
Seasonal Naïve                2.5           1043.72       813.42
SES                           1.89          777.06        704.45
Holt’s Linear Trend           1.35          570.08        1159.16
Holt’s Trend                  1.53          644.87        1127.1
Holt-Winters Additive         2.06          892.74        810.36
Holt-Winters Multiplicative   2.04          920.02        854.68
ETS                           1.41          590.36        749.96

In this example, the Holt’s Linear Trend model yielded the smallest test error, while the SES model yielded the smallest tsCV error.

5 Stationarity

A stationary time series is one whose statistical properties, such as mean, variance and autocorrelation, are constant over time. Stationarity is a useful property, as many statistical forecasting methods are based on the assumption that the time series can be rendered approximately stationary through mathematical transformations.

5.1 Variance Stabilization: Box-Cox Transformation

5.1.1 Definition

If the time series shows increasing variation as the level of the series increases, a transformation can be useful. The figure below demonstrates the daily data of article 6 from 2008 to 2013 as an example. The following table shows that the variation increases as the level of sales_qty grows:

Year   Mean    Mean Increment   Variance   Variance Increment
2008   15.95   1.0              109.54     1.0
2009   18.32   1.1              106.12     1.0
2010   25.42   1.6              198.36     1.8
2011   27.09   1.7              249.38     2.3
2012   32.55   2.0              299.22     2.7
2013   43.47   2.7              331.88     3.0


The table below lists some basic mathematical transformations for stabilizing the variation, sorted by increasing stabilizing strength:

Transformation   Equation
Square Root      w_t = √y_t
Cube Root        w_t = y_t^(1/3)
Logarithm        w_t = log(y_t)
Inverse          w_t = −1/y_t

5.1.2 Purpose

Variance stabilization methods transform the time series so that its variation is not related to its mean. This simplifies the analysis and prepares the data for regression-based methods. Note that it is not common to combine the ETS model with variance stabilization, because the ETS model can handle increasing variance directly through multiplicative components. On the other hand, variance stabilization can be helpful in ARIMA-based models.

5.1.3 Box-Cox Transformation

The Box-Cox transformation is a combination of the above basic transformations:

w_t = log(y_t)          if λ = 0
w_t = (y_t^λ − 1) / λ   if λ ≠ 0

which yields:

• λ = 1 : no substantive transformation
• λ = 1/2 : square root plus linear transformation
• λ = 1/3 : cube root plus linear transformation
• λ = 0 : natural logarithm transformation
• λ = −1 : inverse transformation

The optimal λ value can be selected by minimizing the coefficient of variation of the subseries.
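A minimal sketch with the forecast package, whose BoxCox.lambda() Guerrero method selects λ by minimizing the coefficient of variation of subseries (article6 is a hypothetical name for the daily series):

  library(forecast)

  lambda <- BoxCox.lambda(article6, method = "guerrero")   # selects lambda, e.g. about 0.35
  article6_bc <- BoxCox(article6, lambda)                   # variance-stabilized series
  # InvBoxCox(article6_bc, lambda) reverses the transformation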

5.1.4 Example: Article 6 Daily Data

The figure below shows the Box-Cox transformation of the article 6 data using the optimal λ = 0.35:

Year   Mean   Mean Increment   Variance   Variance Increment
2008   4.30   1.0              3.02       1.0
2009   4.74   1.1              2.58       0.9
2010   5.77   1.3              2.13       0.7
2011   5.86   1.4              3.24       1.1
2012   6.38   1.5              4.72       1.6
2013   7.62   1.8              2.32       0.8


5.2 Mean Stabilization

A time series with a trend component can be transformed for mean stabilization by removing the trend (de-trending). Differencing methods are basic ways to remove the trend component from a time series.

5.2.1 Non-Seasonal Differencing

Non-seasonal differencing is the difference between each value and its previous value:

w_t = y_t − y_{t−1}

The figure below shows the non-seasonal differencing of the article 6 data:

5.2.2 Seasonal Differencing

Seasonal differencing is the difference between each value and the value from the same period of the previous season:

w_t = y_t − y_{t−m}

The figure below shows the seasonal differencing of the article 6 data, with weekly seasonality (m = 7):
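A minimal sketch of both kinds of differencing in base R (article6 is a hypothetical daily ts object):

  d1 <- diff(article6, lag = 1)   # non-seasonal (first) differencing
  d7 <- diff(article6, lag = 7)   # seasonal differencing with weekly period m = 7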

6 ARIMA

6.1 Definition

6.1.1 Non-Seasonal ARIMA Model

Autoregressive (AR) model: multiple regression with lagged observations as predictors

y_t = c + ϕ_1 y_{t−1} + ϕ_2 y_{t−2} + ⋯ + ϕ_p y_{t−p} + e_t

Moving Average (MA) model: multiple regression with lagged errors as predictors

y_t = c + e_t + θ_1 e_{t−1} + θ_2 e_{t−2} + ⋯ + θ_q e_{t−q}

Combining the AR and MA models results in the ARMA model:

y_t = c + ϕ_1 y_{t−1} + ⋯ + ϕ_p y_{t−p} + θ_1 e_{t−1} + ⋯ + θ_q e_{t−q} + e_t

c is also called the drift component. Integrating the ARMA model with differencing results in the ARIMA(p, d, q) model, where d = number of differences applied.


6.1.2 Seasonal ARIMA Model

The seasonal ARIMA model is denoted ARIMA(p, d, q)(P, D, Q)m, where

D = number of seasonal differences
P = number of seasonal AR lags: y_{t−m}, y_{t−2m}, …, y_{t−Pm}
Q = number of seasonal MA lags: e_{t−m}, e_{t−2m}, …, e_{t−Qm}
m = number of observations per seasonal period

6.2 Parameter Selection

The auto.arima() function from the forecast package in R implements the Hyndman-Khandakar algorithm to select the ARIMA parameters for the given data:

1. Select the number of differences d via unit root tests.
2. Select p and q by minimizing the AICc.
3. Estimate the parameters using maximum likelihood estimation.
4. Use a stepwise search to traverse the model space. The stepwise search saves time, but it may not return the global optimum.
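A minimal sketch of automatic parameter selection with the forecast package (hypothetical train10 series):

  library(forecast)

  fit_arima <- auto.arima(train10)        # Hyndman-Khandakar search over (p, d, q)
  summary(fit_arima)                      # e.g. reports a model such as ARIMA(1,1,1)
  fc_arima <- forecast(fit_arima, h = 12)
  # auto.arima(train10, stepwise = FALSE) searches more models at the cost of speed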

6.3 Example 1: Article 10 Monthly Data, Non-Seasonal ARIMA

The same data used in the basic algorithms and exponential weighted forecast sections is used in this section: the monthly data of article 10 from 2008 to 2013.

Recall that the 2008 – 2012 data is used as training set, and the 2013 data as test set.

6.3.1 ARIMA Parameters

Applying auto.arima() to the training data fits the following model:

The selected model is a non-seasonal ARIMA model with p = 1 and q = 1, integrated with one level of differencing, i.e. ARIMA(1,1,1).


6.3.2 Forecast

The following figures present the forecast and residual plots of the model:

6.3.3 Performance

The table below summarizes the performance of the ARIMA model:

Model   MASE (test)   RMSE (test)   RMSE (tsCV)
ARIMA   1.92          811.43        784.32

6.4 Example 2: Article 11 Daily Data, Seasonal ARIMA

The daily data of article 11 from 2014 Q3 to 2016 Q2 is used as the example. Recall its time series plot and ACF plot.

The 2014 Q3 to 2016 Q1 data (7 quarters) will be used as the training set, and the 2016 Q2 data as the test set.


6.4.1 ARIMA Parameters

Applying auto.arima() to the training data fits the following model:

The selected model is a seasonal ARIMA model with p = 1 and Q = 1, with m = 91. Note that the value of m is not selected by the auto.arima() function; it is a user input, observed from the ACF plot.
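A minimal sketch of how the user-supplied m can enter the fit: the seasonal period is set through the frequency of the ts object before calling auto.arima() (article11_sales is a hypothetical vector of daily sales):

  library(forecast)

  train11 <- ts(article11_sales[1:637], frequency = 91)   # 7 quarters of daily data, quarterly period m = 91
  fit_sarima <- auto.arima(train11)                        # may include seasonal (P, D, Q) terms at lag 91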

6.4.2 Forecast

The following figures present the forecast and residual plots of the model:

Note that the residuals have high ACF values, which implies that there are patterns left unexplained by this model.

6.4.3 Performance

The table below summarizes the performance of the ARIMA model, with the Seasonal Naïve model as a benchmark:

Model            MASE (test)   RMSE (test)   RMSE (tsCV)
ARIMA            0.78          6.63          7.83
Seasonal Naïve   0.69          5.72          8.06

7 Dynamic Regression

7.1 Definition

Dynamic Regression is a method for combining external information with an ARIMA model. It is formulated as a regression model with ARIMA errors:

y_t = β_0 + β_1 x_{1,t} + ⋯ + β_r x_{r,t} + e_t

where y_t is modeled as a function of r explanatory variables x_{1,t}, …, x_{r,t}, and the error e_t is modeled by an ARIMA process.

7.2 Implementation

The auto.arima() function from the forecast package in R is a flexible tool for implementing Dynamic Regression. The explanatory variables x_{1,t}, …, x_{r,t} are passed to this function as external regressors.
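A minimal sketch of a regression with ARIMA errors via the xreg argument (the series and regressor matrix names are hypothetical):

  library(forecast)

  # train_xreg / future_xreg are numeric matrices of explanatory variables aligned with the series
  fit_dr <- auto.arima(train_y, xreg = train_xreg)
  fc_dr  <- forecast(fit_dr, xreg = future_xreg)   # future regressor values are required for forecasting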


7.3 Example: Article 10 Monthly Data

In this example, the external regressors are the coverage% c and the discount% d discussed in the stock out and discount case studies. The same transformation is applied to map c and d to the multipliers c_m and d_m respectively. Refer to the stock out and discount case studies for more details of the multiplier transformation. The figure below plots the time series together with the external regressors:

7.3.1 Mathematical Model

y = y_0 · c_m^{b_c} · d_m^{b_d} · e

where,
y_0 = base market demand to be estimated
c_m = coverage multiplier
b_c = coverage coefficient to be estimated (b_c > 0)
d_m = discount multiplier
b_d = discount coefficient to be estimated (b_d > 0)
e = error

Converting to a linear model by logarithm transformation:

Y = b_0 + b_c X_c + b_d X_d + e

where Y = log y, b_0 = log y_0, X_c = log c_m, X_d = log d_m.

7.3.2 Outlier Filtering

Because regression models are sensitive to outliers, outliers should be identified and filtered out before fitting the model. Three types of outliers are defined in this example:

• sales_qty = 0, because log 0 cannot be handled by the model
• coverage = 0, because log 0 cannot be handled by the model
• low discount_pct with high sales_qty, which cannot be explained by the model

Note that the data used in this example happens to have no observations matching the above outlier criteria, but it is still important to examine whether there are outliers in the data.


7.3.3 ARIMA Parameters

Applying auto.arima() to the training data fits the following model:

7.3.4 Interpretation

• The selected model is a non-seasonal ARIMA model with p = 1 and q = 1, integrated with one level of differencing.
• b_c = 1.3958: a coverage coefficient greater than 1 implies that sites with high demand went out of stock first. For example, at 20% stock out, the service level drops to 73%. The following figure illustrates the underlying relationship between stock_out_pct and service level.
• b_d = 0.5107: the figure below shows the underlying relationship between discount_pct and the sales boost (d_m^{b_d}); at a 40% discount, sales are boosted by a factor of 1.75.

7.3.5 Forecast

The following figures present the forecast and residual plots of the model:


A linear regression model is fitted as a benchmark; it regresses on the same external regressors but assumes white noise errors. Below is the forecast of the linear regression model:

7.3.6 Performance

The table below summarizes the performance of the Dynamic Regression model, with the linear regression model as a benchmark:

Model                MASE (test)   RMSE (test)   RMSE (tsCV)
Dynamic Regression   0.83          437.81        523.08
Linear Regression    3.38          1336.75       496.49

The tsCV error of Dynamic Regression is relatively high because one of the folds yielded a very large error. This points to a robustness issue of Dynamic Regression: it works well in most cases, but it may produce abnormal output for certain special input patterns.

8 Dynamic Harmonic Regression

8.1 Definition

Besides seasonal models, Fourier terms are another way to handle periodic seasonality. Let

s_k(t) = sin(2πkt / m)
c_k(t) = cos(2πkt / m)

y_t = β_0 + Σ_{k=1}^{K} [α_k s_k(t) + γ_k c_k(t)] + e_t

where m = seasonal period, α_k and γ_k are regression coefficients, and e_t can be modeled as a non-seasonal ARIMA process.

Note that the Fourier model assumes the seasonal pattern does not change over time, whereas a seasonal ARIMA model allows the pattern to evolve over time.
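A minimal sketch of dynamic harmonic regression with the forecast package, using fourier() to generate the harmonic regressors (train10 and the external regressor matrices are hypothetical):

  library(forecast)

  harmonics <- fourier(train10, K = 1)   # sin/cos terms for the seasonal period
  fit_dhr <- auto.arima(train10, xreg = cbind(harmonics, xreg_train), seasonal = FALSE)
  fc_dhr  <- forecast(fit_dhr, xreg = cbind(fourier(train10, K = 1, h = 12), xreg_test))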

8.2 Example: Article 10 Monthly Data

8.2.1 Parameter Selection

A larger K value results in more periodic patterns being extracted. The following figures compare the forecasts with K = 1 and K = 6 on the article 10 monthly data, with external regressors (c and d):


The optimal K value can be selected by minimizing the AICc. In this example, the optimal K = 1.

8.2.2 Performance

The table below summarizes the performance of the Dynamic Harmonic Regression model:

Model                         MASE (test)   RMSE (test)   RMSE (tsCV)
Dynamic Harmonic Regression   1.52          675.14        444.65

9 Summary

In this case study, various forecasting techniques were demonstrated using the article 10 monthly data. The performance of these models is summarized below, ranked from high to low tsCV error:

Model                         MASE (test)   RMSE (test)   RMSE (tsCV)
Holt’s Linear Trend           1.35          570.08        1159.16
Holt’s Trend                  1.53          644.87        1127.1
Mean                          3.56          1407.68       855.81
Holt-Winters Multiplicative   2.04          920.02        854.68
Seasonal Naïve                2.5           1043.72       813.42
Holt-Winters Additive         2.06          892.74        810.36
ARIMA                         1.92          811.43        784.32
Naïve                         1.75          748.61        763.33
ETS                           1.41          590.36        749.96
SES                           1.89          777.06        704.45
Dynamic Regression            0.83          437.81        523.08
Linear Regression             3.38          1336.75       496.49
Dynamic Harmonic Regression   1.52          675.14        444.65

A seasonal ARIMA model was also fitted using the article 11 daily data, with the Seasonal Naïve model as a benchmark:

Model            MASE (test)   RMSE (test)   RMSE (tsCV)
ARIMA            0.78          6.63          7.83
Seasonal Naïve   0.69          5.72          8.06