Decision 411: Class 9 (rnau/Decision411_2007/411class09...)

Page 1:

Decision 411: Class 9

Presentation/discussion of HW#3

Introduction to ARIMA models

Rules for fitting nonseasonal models

Differencing and stationarity

"Reading the tea leaves": ACF and PACF plots

Unit roots

Estimation issues

Example: the UNITS series

Page 2:

HW#3 issues

Data transformations needed?

How to model seasonality?

Outliers?

Lagged effects?

Comparison of economic impact of promotion variables (sums of coefficients)

Forecast for next period

Page 3:

Transformations?

If the dependent variable has only a slight trend, it may be hard to tell if growth is exponential or if seasonality is multiplicative: a log transformation may be unnecessary

Effects of promotion variables on sales are probably additive if promotional activities are carried out on a per-package basis (rather than mass media)

Variables cannot be logged if they contain zeroes.

Page 4:

Seasonal adjustment?

A priori seasonal adjustment of the dependent variable in a regression model may be problematic due to effects of independent variables (seasonal adjustment may distort the effects of other variables if those effects do not naturally vary with the season)

Seasonal adjustment can be performed inside the regression model by using an externally-supplied seasonal index, and perhaps also the product of the seasonal index and the time index, as a separate regressor (or else by using seasonal dummies)

If the dependent variable is transformed via seasonal adjustment or logging, it will be necessary to "untransform" the final forecasts. Economic interpretations of coefficients are also affected.

Page 5:

Outliers?

In general, outliers in data need to be carefully considered for their effects on model estimation and prediction; they should not be removed high-handedly.

Outliers in "random" variables may or may not be expected to occur again in the future.

Outliers in "decision" variables may be the most informative data points (experiments!)

Do effects remain linear for large values of independent variables? (Residual plots may shed light on this.)

Page 6:

Lagged effects?

When fitting regression models to time series data, cross-correlation plots may help to reveal lags of independent variables that are useful regressors

However, lack of significant cross-correlations does not prove that lagged variables won't be helpful in the context of a multiple regression model, where other variables are also present

If there are a priori reasons for believing that an independent variable may have lagged effects, then those lags probably should be included in a preliminary "all likely suspects" model

Usually only low-order lags (e.g., lag 1 and lag 2, in addition to lag 0) are important

Page 7:

ARIMA models

Auto-Regressive Integrated Moving Average

ARIMA models are an adaptation of discrete-time filtering methods developed in the 1930s and 1940s by electrical engineers (Norbert Wiener et al.)

Statisticians George Box and Gwilym Jenkins developed systematic methods for applying them to business & economic data in the 1970s (hence the name "Box-Jenkins models")

Page 8:

ARIMA models are...

Generalized random walk models, fine-tuned to eliminate all residual autocorrelation

Generalized exponential smoothing models that can incorporate long-term trends and seasonality

Stationarized regression models that use lags of the dependent variable and/or lags of the forecast errors as regressors

The most general class of forecasting models for time series that can be stationarized* by transformations such as differencing, logging, and/or deflating

* A time series is “stationary” if all of its statistical properties—mean, variance, autocorrelations, etc.—are constant in time. Thus, it has no trend, no heteroscedasticity, and a constant degree of “wiggliness.”

Page 9:

Why use ARIMA models?

ARIMA model-fitting tools allow you to systematically explore the autocorrelation pattern in a time series in order to pick the "optimal" statistical model

...assuming there is really a stable pattern in the data!

ARIMA options let you "fine-tune" model types such as RW, SES, and multiple regression (add trend to SES, add autocorrelation correction to RW or multiple regression)

Page 10:

Construction of an ARIMA model

1. Stationarize the series, if necessary, by differencing (& perhaps also logging, deflating, etc.)

2. Regress the stationarized series on lags of itself and/or lags of the forecast errors as needed to remove all traces of autocorrelation from the residuals
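As a rough sketch of these two steps, here is a minimal numpy-only illustration (the series, coefficients, and lag structure are invented for the example): difference a trended series, then fit one AR term by least squares and check that the residual lag-1 autocorrelation is near zero.

```python
import numpy as np

# Illustrative data: a random walk with drift (nonstationary).
rng = np.random.default_rng(0)
Y = np.cumsum(0.4 + rng.normal(size=200))

# Step 1: stationarize by differencing (d = 1).
y = np.diff(Y)

# Step 2: regress the stationarized series on a lag of itself (one AR term),
# i.e., fit y_t = mu + phi * y_{t-1} by ordinary least squares.
X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
(mu, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)

# Check: residuals should show essentially no lag-1 autocorrelation.
resid = y[1:] - (mu + phi * y[:-1])
lag1_corr = np.corrcoef(resid[:-1], resid[1:])[0, 1]
print(round(phi, 3), round(lag1_corr, 3))
```

In practice a statistical package does the fitting; the point of the sketch is only that "stationarize, then regress on lags" is the whole construction.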

Page 11:

What ARIMA stands for

A series which needs to be differenced to be made stationary is an "integrated" (I) series

Lags of the stationarized series are called "auto-regressive" (AR) terms

Lags of the forecast errors are called "moving average" (MA) terms

Page 12:

ARIMA terminology

A non-seasonal ARIMA model can be (almost) completely summarized by three numbers:

p = the number of autoregressive terms
d = the number of nonseasonal differences
q = the number of moving-average terms

This is called an "ARIMA(p,d,q)" model

The model may also include a constant term (or not)

Page 13:

ARIMA models we've already met

ARIMA(0,0,0)+c = mean (constant) model
ARIMA(0,1,0) = RW model
ARIMA(0,1,0)+c = RW with drift model
ARIMA(1,0,0)+c = 1st-order AR model
ARIMA(1,1,0)+c = differenced 1st-order AR model
ARIMA(0,1,1) = SES model
ARIMA(0,1,1)+c = SES + trend = RW + correction for lag-1 autocorrelation
ARIMA(1,1,2) = LES w/ damped trend (leveling off)
ARIMA(0,2,2) = generalized LES (including Holt's)
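The ARIMA(0,1,1) = SES correspondence in the list above can be checked numerically. This is an illustrative sketch on made-up data: given the same starting forecast, the SES recursion with smoothing constant alpha produces exactly the same one-step forecasts as the ARIMA(0,1,1) recursion with theta = 1 - alpha.

```python
import numpy as np

rng = np.random.default_rng(1)
Y = rng.normal(loc=100.0, scale=5.0, size=50)   # arbitrary demo series
alpha, theta = 0.3, 0.7                          # theta = 1 - alpha

# SES recursion: F[t] = F[t-1] + alpha * (Y[t-1] - F[t-1])
F = np.empty(len(Y)); F[0] = Y[0]
for t in range(1, len(Y)):
    F[t] = F[t-1] + alpha * (Y[t-1] - F[t-1])

# ARIMA(0,1,1) recursion: Yhat[t] = Y[t-1] - theta * e[t-1]
G = np.empty(len(Y)); G[0] = Y[0]
for t in range(1, len(Y)):
    e_prev = Y[t-1] - G[t-1]
    G[t] = Y[t-1] - theta * e_prev

print(np.allclose(F, G))  # True: the two recursions are identical
```

Algebraically both reduce to Yhat[t] = alpha*Y[t-1] + (1-alpha)*Yhat[t-1], which is why SES is a special case of ARIMA.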

Page 14:

The ARIMA "filtering box"

[Diagram: a time series passes through a filtering box with three "knobs": p (0, 1, 2, ...), d (0, 1, 2), and q (0, 1, 2, ...). The "signal" (forecasts) comes out one side and the "noise" (residuals) out the other.]

Objective: adjust the knobs until the residuals are "white noise" (uncorrelated)

Page 15:

ARIMA forecasting equation

Let Y denote the original series

Let y denote the differenced (stationarized) series

No difference (d=0):     yt = Yt
First difference (d=1):  yt = Yt − Yt−1
Second difference (d=2): yt = (Yt − Yt−1) − (Yt−1 − Yt−2) = Yt − 2Yt−1 + Yt−2
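These difference formulas can be verified directly; in Python, numpy's diff implements them (the data values here are arbitrary):

```python
import numpy as np

Y = np.array([3.0, 5.0, 4.0, 8.0, 7.0, 11.0])

y1 = np.diff(Y)        # first difference:  y_t = Y_t - Y_{t-1}
y2 = np.diff(Y, n=2)   # second difference, i.e., the difference of y1

# Same result as the expanded formula y_t = Y_t - 2*Y_{t-1} + Y_{t-2}:
manual = Y[2:] - 2*Y[1:-1] + Y[:-2]
print(y1, y2, np.allclose(y2, manual))
```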

Page 16:

Forecasting equation for y

Not as bad as it looks! Usually p + q ≤ 2 and either p = 0 or q = 0 (pure AR or pure MA model)

ŷt = μ + φ1yt−1 + ... + φpyt−p − θ1et−1 − ... − θqet−q

Here μ is the constant; the φ coefficients multiply lagged values of y (the AR terms), and the θ coefficients multiply lagged errors (the MA terms).

By convention, the AR terms are + and the MA terms are −
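One step of this forecasting equation can be written out literally. A minimal sketch; the function name and the coefficient values (mu, phi, theta) are hypothetical:

```python
import numpy as np

# yhat_t = mu + phi_1*y_{t-1} + ... + phi_p*y_{t-p}
#             - theta_1*e_{t-1} - ... - theta_q*e_{t-q}
def forecast_step(mu, phi, theta, y_lags, e_lags):
    """y_lags[0] is y_{t-1}, e_lags[0] is e_{t-1} (hypothetical helper)."""
    return mu + np.dot(phi, y_lags) - np.dot(theta, e_lags)

# Mixed example with p = q = 1:
yhat = forecast_step(mu=0.5, phi=[0.6], theta=[0.4], y_lags=[2.0], e_lags=[-1.0])
print(yhat)  # 0.5 + 0.6*2.0 - 0.4*(-1.0) = 2.1
```

Note how the sign convention enters: the theta terms are subtracted, so a negative past error pushes the forecast up.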

Page 17:

Undifferencing the forecast

The differencing (if any) must be reversed to obtain a forecast for the original series:

If d=0:  Ŷt = ŷt
If d=1:  Ŷt = ŷt + Yt−1
If d=2:  Ŷt = ŷt + 2Yt−1 − Yt−2

Fortunately, your software will do all of this automatically!
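For d = 1, the reversal amounts to adding a running sum of forecasted differences to the last observed level. A small illustrative sketch (the numbers are invented):

```python
import numpy as np

Y_last = 250.0                      # last observed value of the original series
yhat = np.array([0.4, 0.4, 0.4])    # multi-step forecasts of the differenced series

# Undo one order of differencing: Yhat_t = yhat_t + Yhat_{t-1}, starting
# from Y_last, which is just a cumulative sum.
Yhat = Y_last + np.cumsum(yhat)
print(Yhat)
```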

Page 18:

Why do you need both AR and MA terms?

In general, you don't: usually it suffices to use only one type or the other.

Some series are better fitted by AR terms, others are better fitted by MA terms (at a given level of differencing).

Rough rule of thumb: if the stationarized series has positive autocorrelation at lag 1, AR terms often work best. If it has negative autocorrelation at lag 1, MA terms often work best.

Page 19:

Interpretation of AR terms

A series displays autoregressive (AR) behavior if it apparently feels a "restoring force" that tends to pull it back toward its mean.

• In an AR(1) model, the AR(1) coefficient determines how fast the series tends to return to its mean. If the coefficient is near zero, the series returns to its mean quickly; if the coefficient is near 1, the series returns to its mean slowly.

• In a model with 2 or more AR coefficients, the sum of the coefficients determines the speed of mean reversion, and the series may also show an oscillatory pattern.
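The speed of mean reversion can be made concrete: in an AR(1) model with no new shocks, a deviation from the mean shrinks by a factor of the coefficient each period, so its half-life is log(0.5)/log(coefficient). A quick illustration with made-up coefficient values:

```python
import math

# Half-life of a deviation from the mean under AR(1) with coefficient phi:
# after k periods the deviation is phi**k of its original size.
for phi in (0.1, 0.5, 0.9):
    half_life = math.log(0.5) / math.log(phi)
    print(phi, round(half_life, 2))
```

With phi = 0.5 the half-life is exactly one period; with phi = 0.9 it is about 6.6 periods, matching the "near 1 means slow reversion" rule on the slide.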

Page 20:

Interpretation of MA terms

A series displays moving-average (MA) behavior if it apparently undergoes random "shocks" whose effects are felt in two or more consecutive periods.

The MA(1) coefficient is (minus) the fraction of last period's shock that is still felt in the current period.

The MA(2) coefficient, if any, is (minus) the fraction of the shock two periods ago that is still felt in the current period, and so on.

Page 21:

Tools for identifying ARIMA models: ACF and PACF plots

The autocorrelation function (ACF) plot shows the correlation of the series with itself at different lags

The autocorrelation of Y at lag k is the correlation between Y and LAG(Y,k)

The partial autocorrelation function (PACF) plot shows the amount of autocorrelation at lag k that is not explained by lower-order autocorrelations

The partial autocorrelation at lag k is the coefficient of LAG(Y,k) in an AR(k) model, i.e., in a regression of Y on LAG(Y,1), LAG(Y,2), ... up to LAG(Y,k)
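Both definitions can be computed directly from a series. This sketch (simulated AR(1) data with an invented coefficient) evaluates the ACF as corr(Y, LAG(Y,k)) and the PACF at lag k as the lag-k coefficient in a regression of Y on lags 1 through k:

```python
import numpy as np

rng = np.random.default_rng(2)
Y = np.empty(500); Y[0] = 0.0
for t in range(1, len(Y)):            # simulate an AR(1) with coefficient 0.7
    Y[t] = 0.7 * Y[t-1] + rng.normal()

def acf(Y, k):
    """Correlation between Y and LAG(Y, k)."""
    return np.corrcoef(Y[k:], Y[:-k])[0, 1]

def pacf(Y, k):
    """Coefficient of LAG(Y, k) in a regression of Y on lags 1..k."""
    X = np.column_stack([np.ones(len(Y) - k)] +
                        [Y[k - j:len(Y) - j] for j in range(1, k + 1)])
    beta, *_ = np.linalg.lstsq(X, Y[k:], rcond=None)
    return beta[k]

print(round(acf(Y, 1), 2), round(pacf(Y, 2), 2))
```

For an AR(1) series the ACF at lag 1 is close to the AR coefficient while the PACF at lag 2 is near zero, which is exactly the "AR signature" described on the next slide.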

Page 22:

AR and MA "signatures"

ACF that dies out gradually and PACF that cuts off sharply after a few lags ⇒ AR signature

An AR series is usually positively autocorrelated at lag 1 (or even borderline nonstationary)

ACF that cuts off sharply after a few lags and PACF that dies out more gradually ⇒ MA signature

An MA series is usually negatively autocorrelated at lag 1 (or even mildly overdifferenced)

Page 23:

AR signature: mean-reverting behavior, slow decay in ACF (usually positive at lag 1), sharp cutoff after a few lags in PACF. Here the signature is AR(2) because of 2 spikes in the PACF.

Page 24:

MA signature: noisy pattern, sharp cutoff in ACF (usually negative at lag 1), gradual decay in PACF. Here the signature is MA(1) because of 1 spike in the ACF.

Page 25:

AR or MA? It depends!

Whether a series displays AR or MA behavior often depends on the extent to which it has been differenced.

An "underdifferenced" series has an AR signature (positive autocorrelation)

After one or more orders of differencing, the autocorrelation will become more negative and an MA signature will emerge

Page 26:

The autocorrelation spectrum

[Diagram: a spectrum running from positive autocorrelation on the left to negative autocorrelation on the right:

Nonstationary ... Auto-Regressive ... White Noise ... Moving-Average ... Overdifferenced

Adding a difference (DIFF) moves a series to the right; removing a difference moves it to the left. Adding AR terms corrects remaining positive autocorrelation; adding MA terms corrects remaining negative autocorrelation.]

Page 27:

Model-fitting steps

1. Determine the order of differencing

2. Determine the numbers of AR & MA terms

3. Fit the model; check to see if residuals are "white noise," highest-order coefficients are significant (with no "unit roots"), and forecasts look reasonable. If not, return to step 1 or 2.

In other words, move right or left in the "autocorrelation spectrum" by appropriate choices of differencing and AR/MA terms, until you reach the center (white noise)

Page 28:

1. Determine the order of differencing

If the series apparently (or logically) has a stable long-term mean, then perhaps no differencing is needed

If the series has a consistent trend, you MUST use at least one order of differencing [or else include a trended regressor]; otherwise the long-term forecasts will revert to the mean of the historical sample

If two orders of differencing are used, you should suppress the constant (otherwise you will get a quadratic trend in the forecasts)

Page 29:

How much differencing is needed?

Usually the best order of differencing is the least amount needed to stationarize the series

When in doubt, you can also try the next higher order of differencing & compare models

If the lag-1 autocorrelation is already zero or negative, you don't need more differencing

Beware of overdifferencing: if there is more autocorrelation and/or higher variance after differencing, you may have gone too far!
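The overdifferencing symptom is easy to demonstrate: for a simulated random walk (an illustrative example, not the UNITS data), one difference gives roughly uncorrelated residuals, while a second difference inflates the variance and drives the lag-1 autocorrelation toward -0.5:

```python
import numpy as np

rng = np.random.default_rng(3)
Y = np.cumsum(rng.normal(size=1000))   # random walk: needs exactly d = 1

d1 = np.diff(Y)                        # stationary, approximately white noise
d2 = np.diff(Y, n=2)                   # one difference too many

lag1 = lambda x: np.corrcoef(x[:-1], x[1:])[0, 1]
print(round(np.var(d1), 2), round(np.var(d2), 2))   # variance roughly doubles
print(round(lag1(d1), 2), round(lag1(d2), 2))       # near 0 vs. near -0.5
```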

Page 30:

2. Determine # of AR & MA terms

After differencing, inspect the ACF and PACF at low-order lags (1, 2, 3, ...):

Positive decay pattern in ACF at lags 1, 2, 3..., single* positive spike in PACF at lag 1 ⇒ AR=1

Negative spike in ACF at lag 1, negative decay pattern in PACF at lags 1, 2, 3,... ⇒ MA=1

MA terms often work well with differences: ARIMA(0,1,1) = SES, ARIMA(0,2,2) = LES.

*Two positive spikes in PACF ⇒ AR=2, two negative spikes in ACF ⇒ MA=2, etc.

Page 31:

3. Fit the model & check

Residual time series plot looks stationary?

No residual autocorrelation? (Or is there a "signature" of AR or MA terms that still need to be added?)

Highest-order coefficient(s) significant?

Sum of coefficients of same type < 1?
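The "no residual autocorrelation" check is commonly formalized with a portmanteau statistic such as Ljung-Box. A hand-rolled sketch (the function and parameter names are our own invention; statistical packages report an equivalent diagnostic):

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(resid, m=10, n_params=0):
    """Ljung-Box Q over the first m residual autocorrelations.
    A large p-value means no evidence of remaining autocorrelation."""
    n = len(resid)
    r = np.array([np.corrcoef(resid[:-k], resid[k:])[0, 1]
                  for k in range(1, m + 1)])
    Q = n * (n + 2) * np.sum(r**2 / (n - np.arange(1, m + 1)))
    p = chi2.sf(Q, df=m - n_params)
    return Q, p

rng = np.random.default_rng(4)
Q, p = ljung_box(rng.normal(size=300))  # residuals that really are white noise
print(round(Q, 1), round(p, 3))
```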

Page 32:

UNITS series: begin with ARIMA(0,0,0)+c model, i.e., constant model. Residuals = original data with mean subtracted, showing a strong upward trend, lag-1 autocorrelation close to 1.0, slowly-declining pattern in ACF ⇒ differencing is obviously needed

Page 33:

After 1 nonseasonal difference: ARIMA(0,1,0)+c (random walk with drift) model: lag-1 residual autocorrelation still positive, ACF still declines slowly, 2 technically-significant spikes in PACF. Is it stationary (e.g. mean-reverting)? If we think so, try adding AR=2.

Page 34:

After adding AR=2: ARIMA(2,1,0)+c model. No significant spikes in ACF or PACF; residuals appear to be stationary. The mean is the estimated long-term trend (0.416 per period). AR(2) coefficient is significant (t=2.49). AR coefficients sum to less than 1.0 (⇒ no "unit root", i.e., not seriously underdifferenced). White noise std. dev. (RMSE) = 1.35, versus 1.44 for RW.

Page 35:

What's the difference between MEAN and CONSTANT in the ARIMA output?

MEAN = mean of the stationarized series

If no differencing was used, this is just the mean of the original series. If one order of differencing was used, this is the long-term average trend.

CONSTANT = the "intercept" term (μ) in the forecasting equation

Connection (a mathematical identity): CONSTANT = MEAN × (1 − sum of AR coefficients)
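The identity is easy to verify numerically. The 0.416 long-term trend is taken from the UNITS example; the two AR coefficient values below are hypothetical stand-ins, since the slides do not list the fitted values:

```python
# CONSTANT = MEAN * (1 - sum of AR coefficients)
mean = 0.416                 # estimated long-term trend from the UNITS example
phi = [0.25, 0.15]           # hypothetical AR(1), AR(2) coefficients
constant = mean * (1 - sum(phi))
print(round(constant, 4))    # 0.416 * 0.6 = 0.2496
```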

Page 36

“Estimated white noise standard deviation”?

This is just the RMSE of the model in transformed units, adjusted for the number of coefficients estimated. I.e., it is the same statistic that is called “standard error of the estimate” in regression. If a natural log transform was used, it is the RMS error in percentage terms.

Page 37

The plot of data and forecasts shows that the trend in the forecasts equals the historical average trend. A good model?? What if we had gone to 2 orders of differencing instead?

[Time sequence plot for units: ARIMA(2,1,0) with constant — actual, forecast, 95.0% limits]

Page 38

With 2 nonseasonal differences and no constant: ARIMA(0,2,0) model. Residuals are definitely stationary but now strongly negatively autocorrelated at lag 1, with a single negative spike in the ACF. Overdifferenced, or MA=1?? Note that RMSE is larger than it was with one difference (1.69 vs. 1.44), not a good sign.

Page 39

After adding MA=1: ARIMA(0,2,1) model, essentially a linear exponential smoothing model. Residuals look fine, no autocorrelation. Estimated MA(1) coefficient is only 0.75 (not dangerously close to 1.0 ⇒ not seriously overdifferenced). White noise std. dev. = 1.37, about the same as the (2,1,0)+c model.

Page 40

Forecast plot shows that the trend in this model is the local trend estimated at the end of the series, flatter than in the previous model. Confidence limits widen more rapidly than in the previous model because a time-varying trend is assumed, as in LES.

[Time sequence plot for units: ARIMA(0,2,1) — actual, forecast, 95.0% limits]

Page 41

Equation of first model

The ARIMA(2,1,0)+c forecasting equation is

Ŷ(t) − Y(t−1) = μ + φ1·(Y(t−1) − Y(t−2)) + φ2·(Y(t−2) − Y(t−3))

…where μ = 0.228, φ1 = 0.250, and φ2 = 0.201.

Thus, the predicted change is a constant plus multiples of the last two changes (“momentum”).
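As a sanity check, the equation can be evaluated directly. A small Python sketch (the function name and the three-point example series are made up for illustration; the coefficients are the fitted ones from this slide):

```python
def forecast_arima_210(y, mu=0.228, phi1=0.250, phi2=0.201):
    # Predicted change = constant + multiples of the last two changes.
    change = mu + phi1 * (y[-1] - y[-2]) + phi2 * (y[-2] - y[-3])
    return y[-1] + change

# Hypothetical last three observations: changes of +2 then +1.
print(forecast_arima_210([100.0, 102.0, 103.0]))  # 103.88
```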

Page 42

Equation of second model

The ARIMA(0,2,1) forecasting equation is:

Ŷ(t) − Y(t−1) = (Y(t−1) − Y(t−2)) − θ1·e(t−1)

...where the MA(1) coefficient is θ1 = 0.753.

Thus, the predicted change equals the last change minus a multiple of the last forecast error.

An ARIMA(0,2,1) model is almost the same as Brown’s LES model, with the correspondence α ≈ 1 − ½θ1. Hence this model is similar to an LES model with α ≈ 0.625.
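In code, the one-step forecast and the approximate LES correspondence look like this (an illustrative sketch with made-up inputs; e(t−1) would come from the previous period's forecast error):

```python
def forecast_arima_021(y, last_error, theta1=0.753):
    # Predicted change = last change minus theta1 times the last forecast error.
    change = (y[-1] - y[-2]) - theta1 * last_error
    return y[-1] + change

# The roughly equivalent LES smoothing constant: alpha ~ 1 - theta1/2
alpha = 1 - 0.753 / 2  # ~0.62, matching the slide's 0.625
```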

Page 43

ARIMA(0,2,2) vs. LES

General ARIMA(0,2,2) model (w/o constant):

Ŷ(t) − Y(t−1) = (Y(t−1) − Y(t−2)) − θ1·e(t−1) − θ2·e(t−2)

Brown’s LES is exactly equivalent to (0,2,2) with θ1 = 2 − 2α, θ2 = −(1−α)².

Holt’s LES is exactly equivalent to (0,2,2) with θ1 = 2 − α − αβ, θ2 = −(1−α).

The ARIMA model is somewhat more general because it allows α and β outside (0,1).
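The coefficient correspondences are simple enough to express directly (a sketch; the function names are mine):

```python
def brown_to_arima(alpha):
    # Brown's LES == ARIMA(0,2,2) with these MA coefficients.
    return 2 - 2 * alpha, -((1 - alpha) ** 2)

def holt_to_arima(alpha, beta):
    # Holt's LES == ARIMA(0,2,2) with these MA coefficients.
    return 2 - alpha - alpha * beta, -(1 - alpha)
```

Setting β = 1 − α in the Holt mapping does not recover the Brown mapping in general, which is one way to see that the two smoothing models really are distinct special cases of (0,2,2).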

Page 44

Model comparison shows that the two best ARIMA models (B and D) perform about equally well for one-period-ahead forecasts. They differ mainly in their assumptions about trend and the width of confidence limits for longer-term forecasts.

An LES model has been included for comparison, essentially the same as ARIMA(0,2,1). Note that α ≈ 0.61.

Page 45

Estimation of ARIMA models

Like SES and LES models, ARIMA models face a start-up problem at the beginning of the series: prior lagged values are needed.

Naïve approach: AR models can be initialized by assuming that prior values of the differenced series are equal to the mean. MA models can be initialized by assuming that prior errors were zero.
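A sketch of the naive MA initialization in Python (illustrative only; `mean` is the mean of the stationarized series, and the function name is mine):

```python
def ma1_insample_errors(x, mean, theta1):
    # Naive start-up: assume the forecast error *before* the first
    # observation was zero, then recurse forward through the series.
    errors = [0.0]                       # the assumed "prior" error
    for value in x:
        forecast = mean - theta1 * errors[-1]
        errors.append(value - forecast)
    return errors[1:]
```

The cost of the naive assumption is that the first few errors are distorted, which is exactly the problem backforecasting (next slides) is meant to reduce.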

Page 46

Backforecasting

A better approach: use backforecasting to estimate the prior values of the differenced series and the forecast errors.

The key idea: a stationary time series looks the same (statistically speaking) whether it is traversed forward or backward in time. Hence the same model that forecasts the future of a series can also forecast its past.

Page 47

How backforecasting works

On each iteration, the estimation algorithm first makes a “backward pass” through the series and forecasts into the past until the forecasts decay to the mean. Then it turns around and starts forecasting in the forward direction.

Squared error calculations are based on all the errors in the forward pass, even those in the “prior” period. Theoretically, this yields the maximally efficient estimates of coefficients.
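For an AR(1) model the backward pass is easy to sketch: run the same recursion into the past until the backcasts decay to the mean. This is illustrative Python only (not the actual estimation code; names and the decay tolerance are mine):

```python
def backcast_ar1(x, mean, phi, max_back=50, tol=1e-6):
    # Backward pass: because a stationary series looks the same in
    # reverse, back-forecasts decay toward the mean at rate phi.
    prior = []
    level = x[0]
    for _ in range(max_back):
        level = mean + phi * (level - mean)   # one step further into the past
        prior.append(level)
        if abs(level - mean) < tol:
            break
    # The forward pass would then be run over the padded series:
    return list(reversed(prior)) + list(x)
```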

Page 48

Backforecasting caveats

If the model is badly mis-specified (e.g., under- or over-differenced, or over-fitted with too many AR or MA terms), backforecasting may drive the coefficient estimates to perverse values and/or yield too-optimistic error statistics.

If in doubt, try estimating the model without backforecasting until you are fairly sure it is properly identified. Turning on backforecasting should yield slight changes in coefficients and improved error stats, but not a radical change.

Page 49

Backforecasting is an “estimation option” for ARIMA models, as well as SES, LES, Holt, and Winters models.

(The stopping criteria are thresholds for the change in residual sum of squares and parameter estimates below which the estimation will stop.)

Page 50

“Number of iterations”

ARIMA models are estimated by an iterative, nonlinear optimization algorithm (like Solver in Excel). The precision of the estimation is controlled by two “stopping criteria” (normally you don’t need to change these).

A large number of iterations (≥10) indicates that the optimal coefficients were hard to find, possibly a problem with the model such as redundant AR/MA parameters or a unit root.

Page 51

What is a “unit root”?

Suppose that a time series has the true equation of motion:

Y(t) = φ·Y(t−1) + ε(t)

...where ε(t) is a stationary process (mean-reverting, constant variance & autocorrelation, etc.).

If the true value of φ is 1, then Y is said to have an (autoregressive) unit root. Its true equation is then

Y(t) − Y(t−1) = ε(t)

…which is just a random walk (perhaps with drift and correlated steps).
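The practical difference between φ < 1 and φ = 1 shows up in the expected path of Y after a one-time shock (a minimal Python sketch; the function name and shock size are mine):

```python
def expected_path_after_jump(phi, shock=10.0, steps=30):
    # Expected value of Y after a one-time jump, for Y(t) = phi*Y(t-1) + eps
    # with E[eps] = 0 and mean zero.
    path, y = [], shock
    for _ in range(steps):
        path.append(y)
        y = phi * y
    return path

reverting = expected_path_after_jump(0.8)  # decays back toward the mean
unit_root = expected_path_after_jump(1.0)  # the jump is permanent (random walk)
```

With φ = 0.8 the expected effect of the jump dies out geometrically; with φ = 1 it never does, which is exactly the point of the next slide.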

Page 52

Why are unit roots important?

If a series has an AR unit root, then a “jump” (discontinuity) in the series permanently raises its expected future level, as in a random walk. If it doesn’t have an AR unit root, then it eventually reverts to a long-term mean or trend line after a jump.

A series that has a unit root is much less predictable at long forecast horizons than one that does not. It can be hard to tell whether a trended series has a unit root or whether it is reverting to a trend line.

Page 53

Unit root tests

A crude unit root test is to fit an AR model and see whether the AR coefficients sum to almost exactly 1. A more sensitive test is to regress DIFF(Y) on LAG(Y,1), i.e., fit the model

Y(t) − Y(t−1) = α + β·Y(t−1) + ε(t)

…and test whether β is significantly different from zero using a one-sided t-test with 1.75 times the usual critical value (the “Dickey-Fuller test”). If not, then you can’t reject the unit root hypothesis.

Fancier unit root and “cointegration” tests regress DIFF(Y) on LAG(Y,1) plus a bunch of lags of DIFF(Y) (the “augmented DF test”) and/or other variables (e.g. a time trend), with more complicated critical values.
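The Dickey-Fuller regression itself is just OLS of DIFF(Y) on LAG(Y,1). A self-contained Python sketch (for illustration only; the example series is made up, and real software applies the proper DF critical values automatically):

```python
def dickey_fuller_t(y):
    # Fit DIFF(Y)_t = alpha + beta * Y_{t-1} + e_t by hand and return
    # the t-statistic of beta. Per the slide, compare it against 1.75
    # times the usual one-sided critical value.
    x = y[:-1]                                   # LAG(Y,1)
    d = [b - a for a, b in zip(y[:-1], y[1:])]   # DIFF(Y)
    n = len(d)
    mx, md = sum(x) / n, sum(d) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (di - md) for xi, di in zip(x, d)) / sxx
    alpha = md - beta * mx
    resid = [di - alpha - beta * xi for xi, di in zip(x, d)]
    s2 = sum(r * r for r in resid) / (n - 2)     # residual variance
    return beta / (s2 / sxx) ** 0.5

# A strongly mean-reverting toy series gives a large negative t-statistic,
# so the unit root hypothesis would be rejected.
print(dickey_fuller_t([8.0, 4.0, 2.0, 1.0, 0.5, 0.26]))
```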

Page 54

Does UNITS have a unit root?

Degrees of freedom for error | Critical t value (.05, 1-tailed) | Critical Dickey-Fuller value (= t × 1.75)
20       | 1.725 | 3.018
50       | 1.676 | 2.933
100      | 1.660 | 2.905
Infinity | 1.645 | 2.879

By the DF test, we can’t reject the unit root hypothesis for the UNITS series.

Page 55

But we can reject the unit root hypothesis for DIFF(UNITS), suggesting that 1 order of differencing is sufficient.

Page 56

Practical implications (for us)

In regression analysis, if both Y and X have unit roots (i.e., are random walks), then the regression of Y on X may yield a spurious estimate of the relation of Y to X (better to regress DIFF(Y) on DIFF(X) in that case).

In ARIMA analysis, if a unit root exists either on the AR or the MA side of the model, the model can be simplified by using a higher or lower order of differencing.

This brings us to…

Page 57

Over- or under-differencing

What if we had identified a grossly wrong order of differencing? For example, suppose we hadn’t differenced the UNITS series at all, or suppose we had differenced it more than twice?

Page 58

Here is the original UNITS series again. If we ignore the obvious trend in the time series plot, we see that the ACF and PACF display a (suspiciously!) strong AR(1) signature. Suppose we just add an AR(1) term…

Page 59

The estimated AR(1) coefficient is 0.998 (essentially 1.0), which is equivalent to an order of differencing. An AR factor “mimics” an order of differencing when the sum of AR coefficients is 1.0 (a sign of a unit root), so the model is really telling us that a higher order of differencing is needed.
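This is easy to reproduce: regress any strongly trended series on its own lag and the slope comes out at (or very near) 1.0. A minimal Python sketch with a made-up, purely linear series:

```python
def ar1_slope(y):
    # OLS slope of Y(t) on Y(t-1), fitted with an intercept.
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sxx

trend = [float(t) for t in range(100)]  # steadily trending toy series
print(ar1_slope(trend))  # 1.0: the AR term is mimicking a difference
```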

Page 60

Adding an AR(2) term does not help. Now the sum of the AR coefficients is almost exactly equal to 1.0. Again, this means that there is an AR unit root: a higher order of differencing is needed.

Page 61

On the other hand, suppose we grossly overdifferenced the series. Here 3 orders of differencing have been used (not recommended!). The series now displays a suspiciously strong MA(1) or MA(2) signature.

Page 62

If we add an MA(1) term, its estimated coefficient is 0.995 (essentially 1.0), which is exactly canceling one order of differencing. An MA factor cancels an order of differencing if the sum of MA coefficients is 1.0 (indicating an MA unit root, also called a “non-invertible” model).

Page 63

Adding an MA(2) term does not really improve the results, and the sum of the MA coefficients is now almost exactly equal to 1.0. Again, this tells us less differencing was needed.

Page 64

Conclusions

If the series is grossly under- or over-differenced, you will see what appears to be a “suspiciously strong” AR or MA signature. This is not good!! The “correct” order of differencing usually reduces rather than increases the patterns in the ACF and PACF.

When coefficients are estimated, if the coefficients of the same type sum to almost exactly 1.0, this indicates that a higher or lower order of differencing should have been used.
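These checks can be automated once coefficients are estimated. An illustrative sketch (the function name and the 0.05 tolerance are arbitrary choices of mine, not a formal test):

```python
def differencing_advice(ar_coeffs, ma_coeffs, tol=0.05):
    # Coefficient sums near 1.0 flag an AR or MA unit root.
    if ar_coeffs and abs(sum(ar_coeffs) - 1.0) < tol:
        return "AR unit root: add one order of differencing, drop one AR term"
    if ma_coeffs and abs(sum(ma_coeffs) - 1.0) < tol:
        return "MA unit root: remove one order of differencing, drop one MA term"
    return "no unit root indicated by the coefficient sums"
```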

Page 65

The “rules” of the ARIMA game

The preceding guidelines for fitting ARIMA models to data can be summed up in the following rules…

Page 66

ARIMA rules: differencing

Rule 1: If the series has positive autocorrelations out to a high number of lags and/or a non-zero trend, then it probably needs a higher order of differencing.

Rule 2: If the lag-1 autocorrelation is zero or negative, the series does not need a higher order of differencing. If the lag-1 autocorrelation is −0.5 or more negative, the series may already be overdifferenced.

Rule 3: The optimal order of differencing is often (not always, but often) the order of differencing at which the standard deviation is lowest.

Page 67

ARIMA rules: differencing, continued

Rule 4:
(a) A model with no orders of differencing assumes that the original series is stationary (among other things, mean-reverting).
(b) A model with one order of differencing assumes that the original series has a constant average trend (e.g. a random walk or SES-type model, with or without drift).
(c) A model with two orders of total differencing assumes that the original series has a time-varying trend (e.g. an LES-type model).

Page 68

ARIMA rules: constant term or not?

Rule 5:
(a) A model with no orders of differencing normally includes a constant term. The estimated mean is the average level.
(b) In a model with one order of differencing, a constant should be included if there is a long-term trend. The estimated mean is the average trend.
(c) A model with two orders of total differencing normally does not include a constant term.

Page 69

ARIMA rules: AR signature

Rule 6: If the partial autocorrelation function (PACF) of the differenced series displays a sharp cutoff and/or the lag-1 autocorrelation is positive (i.e., if the series appears “slightly underdifferenced”), then consider adding one or more AR terms to the model.

The lag beyond which the PACF cuts off is the indicated number of AR terms.

Page 70

ARIMA rules: MA signature

Rule 7: If the autocorrelation function (ACF) of the differenced series displays a sharp cutoff and/or the lag-1 autocorrelation is negative (i.e., if the series appears “slightly overdifferenced”), then consider adding an MA term to the model.

The lag beyond which the ACF cuts off is the indicated number of MA terms.

Page 71

ARIMA rules: residual signature

Rule 8: If an important AR or MA term has been omitted, its signature will usually show up in the residuals.

For example, if an AR(1) term (only) has been included in the model, and the residual PACF clearly shows a spike at lag 2, this suggests that an AR(2) term may also be needed.

Page 72

ARIMA rules: cancellation

Rule 9: It is possible for an AR term and an MA term to cancel each other's effects, so if a mixed AR-MA model seems to fit the data, also try a model with one fewer AR term and one fewer MA term, particularly if the parameter estimates in the original model need more than 10 iterations to converge.

Page 73

ARIMA rules: unit roots

Rule 10: If there is a unit root in the AR part of the model (i.e., if the sum of the AR coefficients is almost exactly 1), you should try reducing the number of AR terms by one and increasing the order of differencing by one.

Rule 11: If there is a unit root in the MA part of the model (i.e., if the sum of the MA coefficients is almost exactly 1), you should try reducing the number of MA terms by one and reducing the order of differencing by one.

Rule 12: If the long-term forecasts appear erratic or unstable, and/or the model takes 10 or more iterations to estimate, there may be a unit root in the AR or MA coefficients.