Box Jenkins Methodology

1Box-Jenkins Seasonal Modeling

Five-step Iterative Procedures Stationarity Checking and Differencing

Model Identification

Parameter Estimation

Diagnostic Checking

Forecasting

3Step One: Stationarity Checking and Differencing

Seasonality Seasonality is defined as a pattern that repeats

itself over fixed intervals of time. In general, seasonality can be found by

identifying a large autocorrelation coefficient or a large partial autocorrelation coefficient at the seasonal lag.

Often autocorrelations at multiples of the seasonal lag will also be significant.

Example: Employees178 monthly values of the number of people in Wisconsin employed in trade are reported from 1961 to 1975.

Dataset: employees.txt SAS program: ARIMA_employees.sas

16214412610890725436181

400

375

350

325

300

275

250

Time

E

m

p

l

o

y

e

e

s

T ime S er ies P lot of N umber of Employees

Example: Employees

Example: Employees

Back

Continue

Assessing Stationarity Using a time series plot

If there is no evidence of a change in the mean over time, the series is stationary in the mean.

If the plotted series shows no obvious change in the variance over time, it is stationary in the variance.

Example: Stationary Series

Example: Non-stationary Series Non-stationary in the mean

33329625922218514811174371

600

550

500

450

400

350

300

Time

Time Series Plot of the Daily Closing IBM Stock Prices

Example: Non-stationary Series Non-stationary in the variance

130117104917865523926131

3000

2000

1000

0

-1000

-2000

Index

Time Series Plot of a Nonstationary Series

Example: Non-stationary Series Non-stationary in both the mean and the variance

130117104917865523926131

6000

5000

4000

3000

2000

1000

0

Time

Time Series Plot of Shipments of Pollution Equipment

Assessing Stationarity A time series plot is often enough to convince a

forecaster that the data are stationary or non-stationary. The ACF and PACF plots can readily expose non-stationarity in the mean. The autocorrelations of stationary data drop to

zero relatively quickly. For a typical pattern of non-stationary series

1 is very large and positive (ACF plot). ks are relatively large and positive, until k gets

big enough (ACF plot). PACF plot displays a large spike close to 1 at lag 1.

Example: Employees

Remove Nonstationarity The use of Box-Jenkins modeling techniques requires a

stationary process. Pre-differencing transformation is often used to

stabilize the seasonal variation of the time series. A common transformation is of the form:

)(* tt yny

Remove Nonstationarity Regular differencing

Let represent an appropriate predifferencing transformation.

If exhibits the pattern of nonstationarity, first regular differencing can be applied.

* *1t t tz y y

*ty

*ty

Example: Employees

Example: Employees Is pre-differencing transformation necessary?

Take first differences

Example: Employees

Seasonal Differencing With seasonal data which is non-stationary, it may be

appropriate to take seasonal differences. A seasonal difference is the difference between an

observation and the corresponding observation from the previous year.

where s is the number of seasons. For monthly data, L=12, for quarterly data, L=4, and so on.

The differencing can be repeated to obtain second-order seasonal differencing, although this is rarely needed.

* *t t t Lz y y

Example: Employees Non-stationarity in the mean remains in the seasonal

level. Try to remove it with a further first seasonal difference.

* * * * * * * *1 1 1 12 13( ) ( ) ( ) ( )t t t t L t L t t t tz y y y y y y y y

Example: Employees

Example: Employees

Back

Remarks for Removing Nonstationarity When both seasonal and first differences are applied,

it makes no difference which is done firstthe result will be the same.

If differencing is used, the differences should be interpretable. First differences are the change between one observation and next. Seasonal differences are the change from one year to the next.

Example: Employees Do seasonal differencing: 12 ttt yyZ

Example: Employees

Example: Employees Is stationarity achieved after first seasonal

differencing?

Stationarity In general, if ACF does both of the following, the series

should be considered stationary. Cuts off fairly quickly or dies down fairly quickly at

the nonseasonal level Arbitrarily define the nonseasonal level as lags 1 through

L3. For monthly data, lags 1 through 9 For quarterly data, lags 1, 2, and possibly 3

Cuts off fairly quickly or dies down fairly quickly at the seasonal level

Define the seasonal level as lags equal to (or nearly equal to) L, 2L, 3L, and 4L.

Define the lags L, 2L, 3L, and 4L to be exact seasonal lags Define the lags L 2, L 1, L + 1, L + 2, 2L 2, 2L 1, 2L + 1, 2L

+ 2, 3L 2, 3L 1, 3L + 1, 3L + 2, 4L 2, 4L 1, 4L + 1, and 4L + 2 to be near seasonal lags

Example: Employees

Summary of Removing Nonstationarity Pre-differencing transformation is often used to

stabilize the seasonal variation of the time series. A common transformation is of the form:

)(* tt yny

Summary of Removing Nonstationarity First regular difference

First seasonal difference, where L is the number of seasons in a year

First seasonal and first regular difference

One can also obtain second and higher order differences by simply applying the same rule.

For most practical purposes, a maximum of two differences (including both regular and seasonal differences) will transform the data into a stationary series.

* *1t t tz y y

* *t t t Lz y y

* * * *1 1t t t t L t Lz y y y y

Backshift Notation Backward shift operator, B

B, operating on yt, has the effect of shifting the data back one period.

Two applications of B to yt shifts the data back two periods.

m applications of B to yt shifts the data back mperiods.

1t tBy y

22( )t t tB By B y y

mt t mB y y

Backshift Notation The backward shift operator is convenient for

describing the process of differencing.

In general, a dth-order can be written as

An sth difference

The backshift notation is convenient because the terms can be multiplied together to see the combined effect.

1 (1 )t t t t t ty y y y By B y 2 2 2

1 22 (1 2 ) (1 )t t t t t ty y y y B B y B y

(1 )d dt ty B y

11 1(1 )(1 ) (1 )

s s st t t t t s t sB B y B B B y y y y y

(1 )s tB y

33

Step Two: Model Identification

ARIMA models for time series data Three basic Box-Jenkins models for a stationary time

series {yt } Autoregressive models (AR) Moving Average models (MA) Autoregressive Moving Average models (ARMA)

Autoregressive Integrated Moving Average models (ARIMA) ARMA models extended to nonstationary series by

allowing differencing of the data series

Autoregressive Models (AR) Autoregressive model of order p (AR(p))

i.e., yt depends on its p previous values Using backshift notation

where

1 1 2 2t t t p t p ty y y y

1( ) 1 ...p

p pB B B

( )p t tB y

Theoretical ACF and PACF for AR(p)

Characteristics: ACF dies down. PACF cuts off after lag p.

Example for AR(1) Empirical ACF and PACF

13 0.7t t ty y

Moving Average Models (MA) Moving Average model of order q (MA(q))

i.e., yt depends on q previous random error terms. Using backshift notation

where

1 1 2 2t t t t q t qy

21 2( ) 1

qq qB B B B

( )t q ty B

Theoretical ACF and PACF for MA(q)

Characteristics: ACF cuts off after lag q. PACF dies down.

Example for MA(1) Empirical ACF and PACF

110 0.7t t ty

Autoregressive Moving Average Models (ARMA)

Autoregressive-moving average model of order p and q (ARMA(p,q))

i.e., yt depends on its p previous values and q previous random error terms

Using backshift notation

When q = 0, the ARMA(p,0) model reduces to AR(p); when p = 0, the ARMA(0,q) model reduces to MA(q).

1 1 2 2 1 1 2 2t t t p t p t t t q t qy y y y

( ) ( )p t q tB y B

Theoretical ACF and PACF for ARMA(p,q)

Characteristics: Both ACF and PACF die down.

Summary of the Behaviors of ACF and PACF

Behaviors of ACF and PACF for general non-seasonal models

Process ACF PACF

AR(p) Dies down. Cuts off after lag p.MA(q) Cuts off after lag q. Dies down.

ARMA(p,q) Dies down. Dies down.

Box Jenkins Methodology

Documents

Transcript of Box Jenkins Methodology