Box Jenkins Models


Introduces a new method for forecasting time series.


  • Box-Jenkins Methodology

    BJ models use only current and past values of the time series to produce forecasts (no other independent variables)

    Steps in Box-Jenkins Modeling

    1. Prepare Raw Data

    2. Identify Model

    3. Estimate Parameters

    4. Model Good ? If No : Revise the Model and repeat the cycle. If Yes : go on to Forecast.

    5. Forecast

    Data Preparation

    Data has to be transformed to stationarity before applying the BJ technique. Stationarity consists of three parts.

    Stationary in Mean : The series fluctuates about a fixed level. Detectable by scatter plot and ACF. Usually enforced by differencing a suitable number of times d. Mathematically, $E(Y_t) = \mu$.

    Stationary in Variance : The size of the fluctuation is constant over time. Detectable by scatter plot. Usually enforced by taking $\log_e$ or square root. Mathematically, $\mathrm{Var}(Y_t) = \sigma^2$.

    Covariance Stationary : Not detectable by scatter plot. Mathematically, for any $k > 0$, $\mathrm{Cov}(Y_t, Y_{t-k})$ depends on k only.

  • Variance Stabilization

    $Y_t = t\,\varepsilon_t$ where $\varepsilon_t \sim$ iid $N(5,1)$

    [Figure: "Before Taking Log", Y(t) plotted against t from 1 to 99; vertical axis from 0 to 700]

    [Figure: "After Taking Log", Log(Y(t)) plotted against t from 1 to 99; vertical axis from 0 to 7]
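The effect of the log transform can be checked numerically. The following is a minimal sketch, assuming the series on this slide is Y_t = t·e_t with e_t ~ iid N(5,1); the seed, the sample size of 100 and the 25-point windows are illustrative choices, not taken from the slides. It compares the spread of the deviations early and late in the series, before and after taking logs.

```python
import math
import random
import statistics

random.seed(0)  # arbitrary seed (an assumption), only so the run is repeatable

# Simulate Y_t = t * e_t with e_t ~ iid N(5, 1) for t = 1..100.
ts = range(1, 101)
y = [t * random.gauss(5, 1) for t in ts]

# Deviations from a reference level: E(Y_t) = 5t on the raw scale, and
# log(5t) as an approximate level on the log scale (a constant offset
# does not change the standard deviation).
raw_dev = [y_t - 5 * t for t, y_t in zip(ts, y)]
log_dev = [math.log(y_t) - math.log(5 * t) for t, y_t in zip(ts, y)]

def spread(values):
    """Sample standard deviation of a list of deviations."""
    return statistics.stdev(values)

# Compare the first 25 and the last 25 time points.
raw_early, raw_late = spread(raw_dev[:25]), spread(raw_dev[-25:])
log_early, log_late = spread(log_dev[:25]), spread(log_dev[-25:])

print(f"raw scale: early {raw_early:.2f}, late {raw_late:.2f}")
print(f"log scale: early {log_early:.2f}, late {log_late:.2f}")
```

On the raw scale the late spread should be several times the early spread, while on the log scale the two should be of similar size.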

    Backshift Operator : B

    $B^k y_t = y_{t-k}$

    $(1 - B)^{-1} y_t = (1 + B + B^2 + \dots)\, y_t$

    $(1 - B)\, y_t = y_t - y_{t-1}$

    $(1 - B)^2 y_t = y_t - 2y_{t-1} + y_{t-2}$
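The backshift and differencing operators translate directly into list operations. This is a small sketch with made-up numbers; the helper names `backshift` and `difference` are hypothetical.

```python
def backshift(y, k=1):
    """Apply B^k: entry t of the result is y[t-k]; the first k entries are undefined (None)."""
    if k == 0:
        return list(y)
    return [None] * k + list(y[:-k])

def difference(y, d=1):
    """Apply (1 - B)^d by differencing d times; each pass drops one leading value."""
    out = list(y)
    for _ in range(d):
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out

y = [3, 5, 9, 15, 23]  # made-up values

print(backshift(y, 2))   # [None, None, 3, 5, 9]
print(difference(y, 1))  # [2, 4, 6, 8]
print(difference(y, 2))  # [2, 2, 2]

# (1 - B)^2 y_t written out as y_t - 2*y_{t-1} + y_{t-2}
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
print(direct == difference(y, 2))  # True
```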

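    Autoregressive (AR) Models

    Typical model : $Y_t = 6 + 1.2\,Y_{t-1} - 0.8\,Y_{t-2} + \varepsilon_t$

    General AR(p) model : (stationarity assumed)

    $Y_t = \theta_0 + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t$

    $(1 - \phi_1 B - \dots - \phi_p B^p)\, Y_t = \theta_0 + \varepsilon_t$

    where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.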
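The stationarity condition can be checked for an AR(2) model by looking at the roots of its polynomial in B. This is a minimal sketch, assuming the typical model reads Y_t = 6 + 1.2 Y_{t-1} - 0.8 Y_{t-2} + e_t (the signs are a reading of the slide, not a certainty); the second model is a hypothetical counter-example, and the function names are illustrative.

```python
import cmath

def ar2_roots(phi1, phi2):
    """Roots z of 1 - phi1*z - phi2*z**2 = 0 (requires phi2 != 0)."""
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    return [(-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2)]

def is_stationary(phi1, phi2):
    """An AR(2) model is stationary when every root lies outside the unit circle."""
    return all(abs(z) > 1 for z in ar2_roots(phi1, phi2))

# Assumed reading of the typical model: Y_t = 6 + 1.2*Y_{t-1} - 0.8*Y_{t-2} + e_t
print(is_stationary(1.2, -0.8))   # True
print(6 / (1 - 1.2 + 0.8))        # stationary mean, about 10

# Hypothetical counter-example: 1 - 1.5z + 0.5z^2 = (1 - z)(1 - 0.5z)
print(is_stationary(1.5, -0.5))   # False, the polynomial contains the factor 1 - B
```

A root at z = 1 is exactly the factor 1 - B that the general model excludes; such a series needs differencing first.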

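    Moving Average (MA) Models

    Typical model : [equation illegible in the source]

    General MA(q) model : (stationarity assumed)

    $Y_t = \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$

    $Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$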
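One consequence of the MA(q) form is that observations more than q lags apart share no error terms, so the theoretical ACF is zero beyond lag q. The sketch below assumes the coefficients enter with plus signs, Y_t = e_t + theta_1 e_{t-1} + ... + theta_q e_{t-q}, and uses the MA(2) model Y(t) = -0.9e(t-1) + 0.8e(t-2) + e(t) that appears among the PACF examples in these slides; the function name `ma_acf` is hypothetical.

```python
def ma_acf(thetas, max_lag):
    """Theoretical ACF of Y_t = e_t + thetas[0]*e_{t-1} + ... + thetas[q-1]*e_{t-q}."""
    psi = [1.0] + list(thetas)          # the coefficient of e_t is 1
    q = len(thetas)
    gamma0 = sum(c * c for c in psi)    # variance in units of Var(e_t)
    acf = []
    for k in range(1, max_lag + 1):
        if k > q:
            acf.append(0.0)             # no shared error terms beyond lag q
        else:
            cov = sum(psi[j] * psi[j + k] for j in range(q - k + 1))
            acf.append(cov / gamma0)
    return acf

# MA(2) example: Y_t = e_t - 0.9*e_{t-1} + 0.8*e_{t-2}
print([round(r, 3) for r in ma_acf([-0.9, 0.8], 5)])  # [-0.661, 0.327, 0.0, 0.0, 0.0]
```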

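  • Autoregressive Moving Average (ARMA) Models

    Typical model : [equation illegible in the source]

    General ARMA(p,q) model : (stationarity assumed)

    $Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \theta_0 + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$

    $(1 - \phi_1 B - \dots - \phi_p B^p)\, Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$

    where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.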
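A stationary ARMA model can be rewritten as a weighted sum of current and past errors, Y_t = e_t + psi_1 e_{t-1} + psi_2 e_{t-2} + ..., by inverting the AR polynomial in B. The following sketch computes the first few weights recursively. It assumes no constant term and plus signs on the MA coefficients; the ARMA(1,1) coefficients 0.5 and 0.4 are hypothetical, as is the function name.

```python
def psi_weights(phis, thetas, n):
    """Weights psi_0..psi_n of Y_t = sum_j psi_j * e_{t-j} for the model
    Y_t = phis[0]*Y_{t-1} + ... + e_t + thetas[0]*e_{t-1} + ... (no constant)."""
    psi = [1.0]
    for j in range(1, n + 1):
        # MA contribution at lag j, if the model has one
        value = thetas[j - 1] if j <= len(thetas) else 0.0
        # AR contribution: phi_i times the weight i lags earlier
        for i, phi in enumerate(phis, start=1):
            if j - i >= 0:
                value += phi * psi[j - i]
        psi.append(value)
    return psi

# Hypothetical ARMA(1,1): Y_t = 0.5*Y_{t-1} + e_t + 0.4*e_{t-1}
print(psi_weights([0.5], [0.4], 4))  # [1.0, 0.9, 0.45, 0.225, 0.1125]
```

For a pure AR(1) model the same recursion gives psi_j = phi^j, which is the expansion of (1 - phi B)^{-1} in powers of B.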

    Autoregressive Integrated Moving Average (ARIMA) Models

    These are ARMA models fitted to data that need to be differenced to ensure stationarity in mean.

    General ARIMA(p,d,q) model :

    $(1 - \phi_1 B - \dots - \phi_p B^p)(1 - B)^d\, Y_t = \theta_0 + (1 + \theta_1 B + \dots + \theta_q B^q)\,\varepsilon_t$

    where $1 - \phi_1 B - \dots - \phi_p B^p$ does not contain the factor $1 - B$.

    ARIMA(p,d,q) Models

    ARIMA(2,1,1) = ARMA(2,1) fitted to data differenced once

    ARIMA(0,2,1) = MA(1) fitted to data differenced twice

    ARIMA(1,0,1) = ARMA(1,1)
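The reading "ARIMA(p,1,q) = ARMA(p,q) fitted to data differenced once" can be illustrated by simulation. This is a sketch under stated assumptions: the ARMA(1,1) coefficients 0.5 and 0.4, the seed and the series length are all hypothetical choices.

```python
import random
from itertools import accumulate

random.seed(1)  # arbitrary seed (an assumption)

# Simulate a hypothetical ARMA(1,1): W_t = 0.5*W_{t-1} + e_t + 0.4*e_{t-1}
n = 200
e = [random.gauss(0, 1) for _ in range(n)]
w = [e[0]]
for t in range(1, n):
    w.append(0.5 * w[t - 1] + e[t] + 0.4 * e[t - 1])

# Integrate once: Y_t = Y_{t-1} + W_t, so Y follows an ARIMA(1,1,1) model
y = list(accumulate(w))

# Differencing Y once, (1 - B) Y_t, recovers the ARMA(1,1) series W
dy = [curr - prev for prev, curr in zip(y, y[1:])]
max_gap = max(abs(a - b) for a, b in zip(dy, w[1:]))
print(max_gap < 1e-9)  # True up to floating-point rounding
```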

    Model Identification

    First transform the data to stationarity by differencing a suitable number of times, taking logs, etc.

    Then choose those models (there may be more than one) whose (1) theoretical ACF most closely matches the sample ACF and (2) theoretical PACF most closely matches the sample PACF.

  • What is PACF ?

    For given k, regress $Y_t$ against $Y_{t-1}, \dots, Y_{t-k}$ :

    $Y_t = a_0 + a_1 Y_{t-1} + \dots + a_{k-1} Y_{t-k+1} + b_k Y_{t-k} + \varepsilon_t$

    The lag-k partial autocorrelation coefficient (PAC) is the coefficient $b_k$ of $Y_{t-k}$.

    It measures the strength of correlation between $Y_{t-k}$ and $Y_t$ when the effects of the other time lags 1, 2, ..., (k-1) are removed.

    The collection of $b_k$ ($k \ge 1$) constitutes the PACF.
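The slide defines the PACF through a sequence of regressions. The sketch below takes a different but related route, the Durbin-Levinson recursion, which obtains the same lag-k coefficient b_k from the autocorrelations when they are the theoretical ones. The function name and the two example models (an AR(1) with coefficient 0.7 and an MA(1) with coefficient -0.7) are illustrative assumptions.

```python
def pacf_from_acf(rho):
    """Durbin-Levinson recursion.

    rho[k-1] is the lag-k autocorrelation; returns [b_1, ..., b_K]."""
    pacf = []
    prev = []  # prev[j-1] holds the lag-j coefficient of the order k-1 fit
    for k in range(1, len(rho) + 1):
        num = rho[k - 1] - sum(prev[j - 1] * rho[k - j - 1] for j in range(1, k))
        den = 1.0 - sum(prev[j - 1] * rho[j - 1] for j in range(1, k))
        b_k = num / den
        # Update the lower-lag coefficients for the order k fit
        curr = [prev[j - 1] - b_k * prev[k - j - 1] for j in range(1, k)]
        curr.append(b_k)
        pacf.append(b_k)
        prev = curr
    return pacf

# AR(1) with coefficient 0.7: theoretical ACF is 0.7**k
ar1_pacf = pacf_from_acf([0.7 ** k for k in range(1, 6)])
print([f"{b:.3f}" for b in ar1_pacf])

# MA(1) with coefficient -0.7: ACF is theta/(1 + theta^2) at lag 1, zero after
rho1 = -0.7 / (1 + 0.7 ** 2)
ma1_pacf = pacf_from_acf([rho1, 0.0, 0.0, 0.0, 0.0])
print([f"{b:.3f}" for b in ma1_pacf])
```

For the AR(1) case the output should be close to 0.7 at lag 1 and close to zero afterwards, while the MA(1) values shrink gradually without cutting off.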

    Plots for $Y_t = 0.7Y_{t-1} + \varepsilon_t$ ; $\varepsilon_t \sim$ iid $N(0,1)$

    [Figures: time plot of Y(t) for t from 1 to 49; ACF for lags 1 to 20 with upper and lower limits; PACF for lags 1 to 20 with upper and lower limits]

    Typical PACFs for AR Models

    [Figure: PACF of Y(t) = -0.7Y(t-1) + e(t), lags 1 to 12]

    [Figure: PACF of Y(t) = 0.5*Y(t-1) - 0.4*Y(t-2) + e(t), lags 1 to 20]

    [Figure: PACF of Y(t) = -0.5Y(t-1) + 0.4Y(t-2) + 0.3Y(t-3) + e(t), lags 1 to 12]

    Typical PACFs for MA Models

    [Figure: PACF of Y(t) = -0.7e(t-1) + e(t), lags 1 to 12]

    [Figure: PACF of Y(t) = -0.9e(t-1) + 0.8e(t-2) + e(t), lags 1 to 15]

    [Figure: PACF of Y(t) = -0.4e(t-1) + 0.5e(t-2) + 0.6e(t-3) + e(t), lags 1 to 25]

  • [Slides: AR(1) Model : Examples; AR(2) Model : Examples; MA(1) Model : Examples; MA(2) Model : Examples]

  • [Slides: ARMA(1,1) Model : Examples (two slides)]

    Guidelines for Model Identification

    MODEL       ACF                      PACF
    AR(p)       Decays rapidly           Truncates after lag p
    MA(q)       Truncates after lag q    Decays rapidly
    ARMA(p,q)   Decays rapidly           Decays rapidly

    In most cases, $0 \le p, d, q \le 2$ and $0 \le p + q \le 2$.

    Case : S&P Monthly Closing

  • [Slide: S&P Monthly Closing : Differenced Once]

    S&P Closing : One Step Ahead Forecast

    The ARIMA(0,1,0) model is : $(1 - B)\,Y_t = \varepsilon_t$, that is, $Y_t = Y_{t-1} + \varepsilon_t$

    Forecast for t = 234 : $\hat{Y}_{234} = Y_{233} = 1482.37$

    Case : Transportation Daily Closing Index

    [Slides: Closing : Identify Model; Closing : ACF of Differenced Data; Closing : PACF of Differenced Data; Closing : SPSS; Closing : Choosing (p, d, q); Closing : Error Measures; Closing : Residual ACF; Closing : Saving Residuals; Closing : Residuals Saved; Closing : Error Measures; Closing : Parameter Estimates; Closing : ACF of Residuals; Closing : Normality of Residuals]

  • Closing : One Step Ahead Forecast

    The ARIMA(1,1,0) model is : $(1 - B)(1 - 0.438B)\,Y_t = \varepsilon_t$, that is, $(1 - 1.438B + 0.438B^2)\,Y_t = \varepsilon_t$, or

    $Y_t = 1.438\,Y_{t-1} - 0.438\,Y_{t-2} + \varepsilon_t$

    Forecast for t = 66 : $\hat{Y}_{66} = 1.438(288.57) - 0.438(286.33) = 289.55$
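The forecast arithmetic for this case can be reproduced in a few lines. The sketch assumes that 288.57 and 286.33 in the forecast are the observations at t = 65 and t = 64, and that the future error is forecast as zero; `poly_mul` is a hypothetical helper that multiplies polynomials in B.

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

phi = 0.438                               # AR(1) estimate shown in the model
ar_poly = poly_mul([1, -1], [1, -phi])    # (1 - B)(1 - 0.438B)
print([round(c, 3) for c in ar_poly])     # [1.0, -1.438, 0.438]

# Y_t = -c1*Y_{t-1} - c2*Y_{t-2} + e_t, with the future error forecast as 0
y_65, y_64 = 288.57, 286.33               # assumed to be Y_65 and Y_64
forecast = -ar_poly[1] * y_65 - ar_poly[2] * y_64
print(round(forecast, 2))                 # 289.55
```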

    Case : Paper Towel Weekly Sales

    [Slides: Towel : Identify Model (two slides); Towel : Parameter Estimates (1); Towel : Residual ACF (1); Towel : Residual Saved (1); Towel : Q-Q Plot (1); Towel : Parameter Estimates (2); Towel : Residual ACF (2); Towel : Q-Q Plot (2); Towel : Parameter Estimates (3)]

  • Towel : One-Step Ahead Forecast

    The ARIMA(0,1,1) model is : $(1 - B)\,Y_t = (1 + 0.351B)\,\varepsilon_t$, that is, $Y_t = Y_{t-1} + 0.351\,\varepsilon_{t-1} + \varepsilon_t$

    Forecast for t = 121 : $\hat{Y}_{121} = 15.65 + 0.351(0.69) = 15.89$
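For an ARIMA(0,1,1) model the forecast needs the last residual as well as the last observation. The sketch below assumes the model Y_t = Y_{t-1} + e_t + 0.351 e_{t-1}, that 15.65 is the observation at t = 120 and that 0.69 is the residual at t = 120. The residual recursion starts from an assumed error of zero, and the short series used to exercise it is made up.

```python
def arima011_residuals(y, theta):
    """Residuals of Y_t = Y_{t-1} + e_t + theta*e_{t-1}, starting from e = 0."""
    resid = [0.0]                      # assumed start-up value for the first error
    for t in range(1, len(y)):
        resid.append(y[t] - y[t - 1] - theta * resid[-1])
    return resid

def arima011_forecast(last_y, last_resid, theta):
    """One-step-ahead forecast: the future error is replaced by 0."""
    return last_y + theta * last_resid

theta = 0.351

# Values from the forecast on the slide, read as Y_120 and e_120 (an assumption)
print(round(arima011_forecast(15.65, 0.69, theta), 2))  # 15.89

# Made-up series, to show how residuals feed the next forecast
y = [10.0, 11.0, 10.5]
resid = arima011_residuals(y, theta)
print(round(arima011_forecast(y[-1], resid[-1], theta), 4))
```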

    Steps in Model Building

    Transform data to stationarity

    Based on the ACF & PACF, determine the values of p and q

    From the computer printout, determine whether ALL fitted parameter values are significant; if not, re-fit using other values of p and/or q

    Check whether the residuals appear random

    If there is more than one tentative model, choose the best one by considering their error measures
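The residual check in these steps can be sketched as a sample ACF compared against approximate limits. The limits of plus or minus 1.96 divided by the square root of n are an assumption (statistical packages may compute their limits differently), and the simulated white-noise series is only a stand-in for saved residuals.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the series x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

print(sample_acf([1, 2, 3, 4, 5], 2))  # [0.4, -0.1]

random.seed(2)  # arbitrary seed (an assumption)
resid = [random.gauss(0, 1) for _ in range(120)]   # stand-in for saved residuals
limit = 1.96 / math.sqrt(len(resid))               # approximate 95% limits
acf = sample_acf(resid, 20)
outside = [k for k, r in enumerate(acf, start=1) if abs(r) > limit]
print(f"limits +/-{limit:.3f}; lags outside the limits: {outside}")
```

If the residuals are random, only about one lag in twenty should fall outside the limits by chance.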