Box Jenkins Methodology

43
1 Box-Jenkins Seasonal Modeling

description

Discuss the use of Box Jenkins modelling in time series analysis

Transcript of Box Jenkins Methodology

  • 1Box-Jenkins Seasonal Modeling

  • Slide 2

    Five-step Iterative Procedures Stationarity Checking and Differencing

    Model Identification

    Parameter Estimation

    Diagnostic Checking

    Forecasting

  • 3Step One: Stationarity Checking and Differencing

  • Slide 4

    Seasonality Seasonality is defined as a pattern that repeats

    itself over fixed intervals of time. In general, seasonality can be found by

    identifying a large autocorrelation coefficient or a large partial autocorrelation coefficient at the seasonal lag.

    Often autocorrelations at multiples of the seasonal lag will also be significant.

  • Slide 5

    Example: Employees178 monthly values of the number of people in Wisconsin employed in trade are reported from 1961 to 1975.

    Dataset: employees.txt SAS program: ARIMA_employees.sas

    16214412610890725436181

    400

    375

    350

    325

    300

    275

    250

    Time

    E

    m

    p

    l

    o

    y

    e

    e

    s

    T ime S er ies P lot of N umber of Employees

  • Example: Employees

  • Slide 7

    Example: Employees

    Back

    Continue

  • Slide 8

    Assessing Stationarity Using a time series plot

    If there is no evidence of a change in the mean over time, the series is stationary in the mean.

    If the plotted series shows no obvious change in the variance over time, it is stationary in the variance.

  • Slide 9

    Example: Stationary Series

  • Slide 10

    Example: Non-stationary Series Non-stationary in the mean

    33329625922218514811174371

    600

    550

    500

    450

    400

    350

    300

    Time

    Time Series Plot of the Daily Closing IBM Stock Prices

  • Slide 11

    Example: Non-stationary Series Non-stationary in the variance

    130117104917865523926131

    3000

    2000

    1000

    0

    -1000

    -2000

    Index

    Time Series Plot of a Nonstationary Series

  • Slide 12

    Example: Non-stationary Series Non-stationary in both the mean and the variance

    130117104917865523926131

    6000

    5000

    4000

    3000

    2000

    1000

    0

    Time

    Time Series Plot of Shipments of Pollution Equipment

  • Slide 13

    Assessing Stationarity A time series plot is often enough to convince a

    forecaster that the data are stationary or non-stationary. The ACF and PACF plots can readily expose non-stationarity in the mean. The autocorrelations of stationary data drop to

    zero relatively quickly. For a typical pattern of non-stationary series

    1 is very large and positive (ACF plot). ks are relatively large and positive, until k gets

    big enough (ACF plot). PACF plot displays a large spike close to 1 at lag 1.

    Example: Employees

  • Slide 14

    Remove Nonstationarity The use of Box-Jenkins modeling techniques requires a

    stationary process. Pre-differencing transformation is often used to

    stabilize the seasonal variation of the time series. A common transformation is of the form:

    )(* tt yny

  • Slide 15

    Remove Nonstationarity Regular differencing

    Let represent an appropriate predifferencing transformation.

    If exhibits the pattern of nonstationarity, first regular differencing can be applied.

    * *1t t tz y y

    *ty

    *ty

    Example: Employees

  • Slide 16

    Example: Employees Is pre-differencing transformation necessary?

    Take first differences

  • Example: Employees

  • Slide 18

    Example: Employees

  • Slide 19

    Seasonal Differencing With seasonal data which is non-stationary, it may be

    appropriate to take seasonal differences. A seasonal difference is the difference between an

    observation and the corresponding observation from the previous year.

    where s is the number of seasons. For monthly data, L=12, for quarterly data, L=4, and so on.

    The differencing can be repeated to obtain second-order seasonal differencing, although this is rarely needed.

    * *t t t Lz y y

  • Slide 20

    Example: Employees Non-stationarity in the mean remains in the seasonal

    level. Try to remove it with a further first seasonal difference.

    * * * * * * * *1 1 1 12 13( ) ( ) ( ) ( )t t t t L t L t t t tz y y y y y y y y

  • Slide 21

    Example: Employees

  • Slide 22

    Example: Employees

    Back

  • Slide 23

    Remarks for Removing Nonstationarity When both seasonal and first differences are applied,

    it makes no difference which is done firstthe result will be the same.

    If differencing is used, the differences should be interpretable. First differences are the change between one observation and next. Seasonal differences are the change from one year to the next.

  • Slide 24

    Example: Employees Do seasonal differencing: 12 ttt yyZ

  • Slide 25

    Example: Employees

  • Slide 26

    Example: Employees

  • Slide 27

    Example: Employees Is stationarity achieved after first seasonal

    differencing?

  • Slide 28

    Stationarity In general, if ACF does both of the following, the series

    should be considered stationary. Cuts off fairly quickly or dies down fairly quickly at

    the nonseasonal level Arbitrarily define the nonseasonal level as lags 1 through

    L3. For monthly data, lags 1 through 9 For quarterly data, lags 1, 2, and possibly 3

    Cuts off fairly quickly or dies down fairly quickly at the seasonal level

    Define the seasonal level as lags equal to (or nearly equal to) L, 2L, 3L, and 4L.

    Define the lags L, 2L, 3L, and 4L to be exact seasonal lags Define the lags L 2, L 1, L + 1, L + 2, 2L 2, 2L 1, 2L + 1, 2L

    + 2, 3L 2, 3L 1, 3L + 1, 3L + 2, 4L 2, 4L 1, 4L + 1, and 4L + 2 to be near seasonal lags

    Example: Employees

  • Slide 29

    Summary of Removing Nonstationarity Pre-differencing transformation is often used to

    stabilize the seasonal variation of the time series. A common transformation is of the form:

    )(* tt yny

  • Slide 30

    Summary of Removing Nonstationarity First regular difference

    First seasonal difference, where L is the number of seasons in a year

    First seasonal and first regular difference

    One can also obtain second and higher order differences by simply applying the same rule.

    For most practical purposes, a maximum of two differences (including both regular and seasonal differences) will transform the data into a stationary series.

    * *1t t tz y y

    * *t t t Lz y y

    * * * *1 1t t t t L t Lz y y y y

  • Slide 31

    Backshift Notation Backward shift operator, B

    B, operating on yt, has the effect of shifting the data back one period.

    Two applications of B to yt shifts the data back two periods.

    m applications of B to yt shifts the data back mperiods.

    1t tBy y

    22( )t t tB By B y y

    mt t mB y y

  • Slide 32

    Backshift Notation The backward shift operator is convenient for

    describing the process of differencing.

    In general, a dth-order can be written as

    An sth difference

    The backshift notation is convenient because the terms can be multiplied together to see the combined effect.

    1 (1 )t t t t t ty y y y By B y 2 2 2

    1 22 (1 2 ) (1 )t t t t t ty y y y B B y B y

    (1 )d dt ty B y

    11 1(1 )(1 ) (1 )

    s s st t t t t s t sB B y B B B y y y y y

    (1 )s tB y

  • 33

    Step Two: Model Identification

  • Slide 34

    ARIMA models for time series data Three basic Box-Jenkins models for a stationary time

    series {yt } Autoregressive models (AR) Moving Average models (MA) Autoregressive Moving Average models (ARMA)

    Autoregressive Integrated Moving Average models (ARIMA) ARMA models extended to nonstationary series by

    allowing differencing of the data series

  • Slide 35

    Autoregressive Models (AR) Autoregressive model of order p (AR(p))

    i.e., yt depends on its p previous values Using backshift notation

    where

    1 1 2 2t t t p t p ty y y y

    1( ) 1 ...p

    p pB B B

    ( )p t tB y

  • Slide 36

    Theoretical ACF and PACF for AR(p)

    Characteristics: ACF dies down. PACF cuts off after lag p.

  • Slide 37

    Example for AR(1) Empirical ACF and PACF

    13 0.7t t ty y

  • Slide 38

    Moving Average Models (MA) Moving Average model of order q (MA(q))

    i.e., yt depends on q previous random error terms. Using backshift notation

    where

    1 1 2 2t t t t q t qy

    21 2( ) 1

    qq qB B B B

    ( )t q ty B

  • Theoretical ACF and PACF for MA(q)

    Characteristics: ACF cuts off after lag q. PACF dies down.

  • Slide 40

    Example for MA(1) Empirical ACF and PACF

    110 0.7t t ty

  • Slide 41

    Autoregressive Moving Average Models (ARMA)

    Autoregressive-moving average model of order p and q (ARMA(p,q))

    i.e., yt depends on its p previous values and q previous random error terms

    Using backshift notation

    When q = 0, the ARMA(p,0) model reduces to AR(p); when p = 0, the ARMA(0,q) model reduces to MA(q).

    1 1 2 2 1 1 2 2t t t p t p t t t q t qy y y y

    ( ) ( )p t q tB y B

  • Theoretical ACF and PACF for ARMA(p,q)

    Characteristics: Both ACF and PACF die down.

  • Slide 43

    Summary of the Behaviors of ACF and PACF

    Behaviors of ACF and PACF for general non-seasonal models

    Process ACF PACF

    AR(p) Dies down. Cuts off after lag p.MA(q) Cuts off after lag q. Dies down.

    ARMA(p,q) Dies down. Dies down.