Decision 411: Class 2 - Fuqua School of Businessrnau/Decision411... · The random walk model ¾A...

Decision 411: Class 2
Explanation of lags & differences
Random walk model
How to identify a random walk
Examples of random walks
Forecasting from the random walk model
Log transformation & geometric random walk
Linear trend model
Model comparison & validation


Explanation of lags & differences

A lagged series is just the original series shifted down by one or more periods, so that it “lags behind.” LAG(Y,1) in row k is the same as Y in row k-1.

A differenced series is the change in the original series from last period to this period (i.e., “delta-Y”). DIFF(Y) is logically the same as Y – LAG(Y,1). It also has a missing value at the beginning.

You can also lag a series by 2 or more periods. Lagged series can be used as autoregressors and leading indicators in regression models.

Note that a lagged series has missing values at the beginning, but it extends farther into the future than the original series.
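
A minimal Python sketch of the LAG and DIFF operations, assuming the series is held in a plain list and missing values are represented by None. The function names lag and diff are illustrative only; they are not Statgraphics or Excel functions.

```python
from typing import List, Optional

def lag(y: List[float], k: int = 1) -> List[Optional[float]]:
    """Shift y down by k periods: k missing values at the start,
    and the result extends k periods past the end of y."""
    return [None] * k + list(y)

def diff(y: List[float]) -> List[Optional[float]]:
    """First difference y[t] - y[t-1]; the first entry is missing."""
    return [None] + [y[t] - y[t - 1] for t in range(1, len(y))]

y = [10.0, 12.0, 11.0, 15.0]
print(lag(y, 1))   # [None, 10.0, 12.0, 11.0, 15.0]
print(diff(y))     # [None, 2.0, -1.0, 4.0]
```

Row k of lag(y, 1) holds the value from row k-1 of y, and each entry of diff(y) equals y minus lag(y, 1) in the same row.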

The random walk model

A time series is a random walk if its period-to-period changes are statistically independent & identically distributed (“i.i.d.”).

In each period it takes an independent random “step” away from its last position.

If the mean step size is non-zero, it is a random walk “with drift” (i.e., trend).
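
One way to generate such a series is sketched below in Python. The definition only requires i.i.d. steps; the Gaussian step distribution, the function name and its parameters are assumptions made for illustration.

```python
import random
from typing import List

def simulate_random_walk(n: int, drift: float = 0.0, sigma: float = 1.0,
                         start: float = 0.0, seed: int = 0) -> List[float]:
    """Return n+1 values: the start value followed by n cumulative i.i.d. steps."""
    rng = random.Random(seed)              # fixed seed makes the path repeatable
    path = [start]
    for _ in range(n):
        step = rng.gauss(drift, sigma)     # independent step with mean = drift
        path.append(path[-1] + step)
    return path

walk = simulate_random_walk(200)                    # random walk without drift
walk_drift = simulate_random_walk(200, drift=0.5)   # random walk "with drift"
print(len(walk), walk[0])
```

Plotting several paths generated with different seeds is a quick way to see how varied a driftless random walk can look.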

Analogies

Random walk: a drunk person staggering left and right with equal probability while moving forward

Random walk with drift: a drunk person with one shoe

Continuous random walk (“Brownian motion”): a drunk snail

Walking the random walk

A random walk with little or no drift often does not look “random”! It may appear to have trends, cycles, “head and shoulders” patterns and other interesting features by sheer chance.

[Figure: eight time series panels, each with horizontal axis 0 to 200 and vertical axis -25 to 25]

“It Ain't the Things You Don't Know That Hurt You, It's the Things You Know... That Ain't So!”

A related statistical illusion: the “hot hand in basketball”

Many basketball players are perceived as “streaky” shooters (e.g., Allen Iverson), but statistical analysis shows that the chance of making a given field goal or free throw is roughly independent of what happened on the last few shots:

“Chance is a very powerful force in creating streaks”

See the Hot Hand in Sports website at http://thehothand.blogspot.com/

How to identify a random walk

Plot the first difference, i.e., the period-to-period changes, of the original time series (“delta-Y”).

If the first difference has constant variance and no significant autocorrelations*, the original series is a random walk, at least approximately.

If the series is logically bounded above or below or has a stable long-run average, then it is not a “true” random walk, but the random walk model may still be appropriate for short-term forecasting (at least as a benchmark for comparing fancier models).

*to be explained shortly…

Relation to mean model

If a time series is a random walk, then its first difference is described by the mean model.

Thus, you should predict that the next change will equal the average change.

The average change may or may not be zero, depending on whether there is “drift”.

What’s an autocorrelation?

A correlation is a number $r_{xy}$ between -1 and +1 that measures the strength of the linear relationship between two variables X and Y.

The correlation is zero if X and Y are statistically independent.

If you regress Y on X, the percent of variance explained (“R squared”) is just the square of the correlation coefficient ($r_{xy}$ squared).

The autocorrelation of Y at lag k, denoted $r_k$, is the correlation between Y and LAG(Y,k), i.e., the correlation between Y and itself lagged by k periods.

Computing & plotting autocorrelations

Formula:

$$r_k = \frac{\sum_{t=k+1}^{n} (y_t - \bar{Y})(y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (y_t - \bar{Y})^2}$$

An autocorrelation plot is a bar chart of the autocorrelation vs. the lag, i.e., $r_k$ vs. $k$.

[Figure: Estimated autocorrelations for adjusted MortgageRate; horizontal axis: lag (0 to 25), vertical axis: autocorrelations (-1 to 1)]

This plot shows a single significant autocorrelation at lag 1.
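
The lag-k autocorrelation (the sum of cross-products of deviations from the overall mean, k periods apart, divided by the total sum of squared deviations) might be computed as in the following plain-Python sketch. The function name autocorr is illustrative.

```python
from typing import List

def autocorr(y: List[float], k: int) -> float:
    """Lag-k sample autocorrelation using the overall mean and total sum of squares."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    # pairs (y[t], y[t-k]) for t = k+1..n in 1-based terms
    numer = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return numer / denom

y = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocorr(y, 1))  # 0.4
print(autocorr(y, 2))  # -0.1
```

Evaluating autocorr(y, k) for k = 1, 2, …, 25 gives the bar heights of an autocorrelation plot.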

Computing autocorrelations in Excel

Statgraphics automatically calculates and plots autocorrelations in its time series procedures, but you can also do this in Excel using the CORREL function.

For example, suppose you have 100 observations of a time series stored in the range A1:A100 in Excel, then…

$r_1$ = CORREL(A1:A99, A2:A100)

$r_2$ = CORREL(A1:A98, A3:A100)

$r_3$ = CORREL(A1:A97, A4:A100)

Note that the ranges are offset by 1 period, 2 periods, 3 periods, etc.
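
The same offset-range calculation can be sketched in Python. CORREL returns the ordinary (Pearson) correlation of its two ranges, so each offset range gets its own mean and variance. The result can therefore differ somewhat from a formula that uses the overall mean of the whole series, especially in short samples. The helper names pearson and correl_autocorr are illustrative.

```python
from math import sqrt
from typing import List

def pearson(x: List[float], z: List[float]) -> float:
    """Ordinary correlation coefficient of two equal-length lists."""
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    return sxz / sqrt(sxx * szz)

def correl_autocorr(y: List[float], k: int) -> float:
    """Analogue of CORREL over the first n-k and the last n-k observations (k >= 1)."""
    return pearson(y[:-k], y[k:])

y = [2.0, 4.0, 3.0, 5.0, 4.0, 6.0]
r1 = correl_autocorr(y, 1)
r2 = correl_autocorr(y, 2)
print(round(r1, 3), round(r2, 3))
```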

What does autocorrelation mean?

Strong positive [negative] autocorrelation at lag k means that if one observation is above the mean then the k-th following observation is also likely to be above [below] the mean.

Thus, positive [or negative] autocorrelation measures the tendency for successive observations to fall on the same [or opposite] sides of the mean.

Series with consistent trends always have strong positive autocorrelation at small lags (1, 2, …).

Series with seasonality have strong positive autocorrelation at the seasonal period (e.g., lag 12).

How we use autocorrelations

It’s usually “good” to find significant autocorrelations in your original series.

This means the past contains clues to the future.

It’s “bad” to find significant autocorrelations in your residuals (forecast errors).

This means that there is some pattern in the data that your model has not explained.

Rough rule of thumb for “significant” autocorrelation:

95% confidence interval $\approx \pm 2/\sqrt{n}$

Exact: $SE(r_k) = \sqrt{\left(1 + 2\sum_{i=1}^{k-1} r_i^2\right)/n} \approx 1/\sqrt{n}$ for small $k$ and/or low autocorrelation
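
A small sketch of both quantities in Python. It assumes the standard large-sample (Bartlett) form of the standard error, with a factor of 2 on the sum of squared lower-lag autocorrelations; the function names are illustrative.

```python
from math import sqrt
from typing import List

def rule_of_thumb_band(n: int) -> float:
    """Half-width of the rough 95% band for a 'significant' autocorrelation."""
    return 2.0 / sqrt(n)

def se_autocorr(r: List[float], k: int, n: int) -> float:
    """Standard error of r_k, given r = [r_1, r_2, ...] and sample size n.
    Only r_1 .. r_{k-1} enter the sum (assumed Bartlett form)."""
    return sqrt((1.0 + 2.0 * sum(ri ** 2 for ri in r[:k - 1])) / n)

print(rule_of_thumb_band(100))            # 0.2
print(se_autocorr([0.5, 0.1], 1, 100))    # lag 1: no lower lags, so 1/sqrt(n)
print(se_autocorr([0.5, 0.1], 3, 100))    # lag 3: inflated by r_1 and r_2
```

When the lower-lag autocorrelations are small the sum is negligible, which is why the simple band of plus or minus 2 over the square root of n is usually adequate.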

Autocorrelation & the RW model

In a true random walk, there is strong autocorrelation in the original series, but no significant autocorrelation in the first difference of the series at any lags.

This means there is no pattern in the data except “the change next period will equal the average change”.

Hence the random walk model is sometimes called the “naïve” model…

…but it’s not really naïve! It says you can’t do better than this, and it has precise implications for the uncertainty (i.e., widths of confidence intervals) surrounding the forecasts at horizons of more than one period ahead.
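
This signature can be checked on simulated data. The sketch below assumes Gaussian steps, a sample of 500 points and a fixed seed, all chosen for illustration. It compares the lag-1 autocorrelation of the levels with that of the first differences.

```python
import random
from math import sqrt
from typing import List

def autocorr(y: List[float], k: int) -> float:
    """Lag-k sample autocorrelation using the overall mean."""
    n = len(y)
    mean = sum(y) / n
    denom = sum((v - mean) ** 2 for v in y)
    numer = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    return numer / denom

rng = random.Random(42)            # fixed seed so the run is repeatable
n = 500
levels = [0.0]
for _ in range(n - 1):
    levels.append(levels[-1] + rng.gauss(0.0, 1.0))   # driftless i.i.d. steps
diffs = [levels[t] - levels[t - 1] for t in range(1, n)]

band = 2.0 / sqrt(len(diffs))      # rough 95% significance band
r1_levels = autocorr(levels, 1)
r1_diffs = autocorr(diffs, 1)
print(f"lag-1 autocorrelation of levels: {r1_levels:.3f}")
print(f"lag-1 autocorrelation of diffs:  {r1_diffs:.3f} (band: +/-{band:.3f})")
```

The levels should show an autocorrelation near 1, while the differences should usually fall inside the band.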

Example: daily dollar/euro FX rate 1999-present: random walk without drift?

[Figure: Time Series Plot for DollarEuroFXday; horizontal axis 0 to 2400, vertical axis 0.82 to 1.42]

Original series shows erratic behavior, strong positive autocorrelation, peaks and valleys.

[Figure: Estimated Autocorrelations for DollarEuroFXday; horizontal axis: lag (0 to 25), vertical axis: autocorrelations (-1 to 1)]

The heights of the bars are the autocorrelations. Here the autocorrelations are all close to 1 because the series tends to remain on the same side of its sample mean for long periods.

After a first-difference transformation, autocorrelations are all insignificant, the signature of a random walk

Right-mouse button options in “Describe/Time Series/Descriptive Methods” procedure: one order of nonseasonal differencing has been chosen

[Figure: Time Series Plot for adjusted DollarEuroFXday; horizontal axis 0 to 2400, vertical axis -25 to 25 (axis scale factor 0.001)]

[Figure: Estimated Autocorrelations for adjusted DollarEuroFXday; horizontal axis: lag (0 to 25), vertical axis: autocorrelations (-1 to 1)]

The variance of the differences also appears to be roughly constant over time.

Forecasting from the RW model

Random walk without drift:

$$\hat{y}_{n+1} = y_n$$

i.e., the next forecast equals the last observation.

Standard error:

$$SE_{fcst} = \sqrt{\sum_{t=2}^{n} (\Delta y_t)^2 / (n-1)}$$

where $\Delta y_t = y_t - y_{t-1}$ is the change in Y at period t, i.e., “delta-Y” at time t.

Thus, $SE_{fcst}$ is the RMS value of delta-Y.
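
A minimal sketch of the driftless forecast and its standard error in Python: the forecast is the last observation and the standard error is the root-mean-square of the n−1 period-to-period changes. The function name and sample data are made up for illustration.

```python
from math import sqrt
from typing import List, Tuple

def rw_forecast(y: List[float]) -> Tuple[float, float]:
    """One-step-ahead forecast and standard error, random walk without drift."""
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]   # n-1 changes
    se = sqrt(sum(d * d for d in diffs) / len(diffs))     # RMS of delta-Y
    return y[-1], se

y = [100.0, 103.0, 99.0, 102.0, 102.0]
forecast, se = rw_forecast(y)
print(forecast)        # 102.0
print(round(se, 3))    # 2.915
```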

Forecasting from RW-with-drift model

Random walk with drift:

$$\hat{y}_{n+1} = y_n + \overline{\Delta Y}$$

where

$$\overline{\Delta Y} = \sum_{t=2}^{n} \Delta y_t / (n-1) = (y_n - y_1)/(n-1)$$

is the average change between periods.

Thus, the next forecast equals the last observation plus the average change.

Forecast standard error for RWD model

For the RW with drift model, $SE_{fcst}$ is the same as $SE_{fcst}$ for the mean model applied to delta-Y:

$$SE_{fcst} = \sqrt{s_{\Delta Y}^2 + s_{\Delta Y}^2/(n-1)} = s_{\Delta Y}\sqrt{1 + \frac{1}{n-1}}$$

where

$$s_{\Delta Y} = \sqrt{\sum_{t=2}^{n} (\Delta y_t - \overline{\Delta Y})^2 / (n-2)}$$

is the sample standard deviation of delta-Y.

Note that the sample size for delta-Y is only n−1, hence the denominator in the sample standard deviation calculation is n−2 and a t-test for non-zero drift will have n−2 degrees of freedom.
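
The drift version can be sketched the same way: the forecast adds the average change to the last observation, and the standard error is the sample standard deviation of delta-Y (denominator n−2) inflated by the square root of 1 + 1/(n−1). The function name and data are illustrative, and at least three observations are assumed.

```python
from math import sqrt
from typing import List, Tuple

def rw_drift_forecast(y: List[float]) -> Tuple[float, float, float]:
    """Return (one-step forecast, estimated drift, forecast standard error)."""
    n = len(y)
    diffs = [y[t] - y[t - 1] for t in range(1, n)]
    drift = (y[-1] - y[0]) / (n - 1)                        # average change
    s = sqrt(sum((d - drift) ** 2 for d in diffs) / (n - 2))  # std. dev. of delta-Y
    se = s * sqrt(1.0 + 1.0 / (n - 1))                      # mean-model forecast SE
    return y[-1] + drift, drift, se

y = [100.0, 103.0, 99.0, 102.0, 104.0]
forecast, drift, se = rw_drift_forecast(y)
print(forecast, drift)   # 105.0 1.0
print(round(se, 3))      # 3.764
```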

Forecasting more than 1 period ahead

RW without drift (zero trend):

$$\hat{y}_{n+k} = y_n$$

RW with drift (non-zero trend):

$$\hat{y}_{n+k} = y_n + k\,\overline{\Delta Y}$$

In both cases:

$$SE_{fcst}(k) = \sqrt{k}\; SE_{fcst}(1)$$

…i.e., the k-period-ahead forecast standard error is larger than the 1-period-ahead standard error by a factor of $\sqrt{k}$.

In plain English...

The forecasts from the random walk model are extrapolated as a straight line extending from the last observed data point.

In a RW without drift, the line is horizontal.

In a RW with drift, it has a non-zero slope equal to the average trend over the whole sample.

The standard error of the k-step ahead forecast is the sample standard deviation of delta-Y times $\sqrt{k}$.

Hence confidence intervals widen in proportion to the square root of time (“sideways parabola” shape).
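
A sketch of the multi-step forecasts with widening limits. It assumes approximate 95% limits of plus or minus 2 standard errors; the multiplier, the function name and the input values are illustrative assumptions rather than the exact Statgraphics calculation.

```python
from math import sqrt
from typing import List, Tuple

def rw_forecast_path(last: float, se1: float, horizon: int,
                     drift: float = 0.0, z: float = 2.0
                     ) -> List[Tuple[int, float, float, float]]:
    """Return (k, forecast, lower, upper) for k = 1..horizon."""
    rows = []
    for k in range(1, horizon + 1):
        forecast = last + k * drift        # straight line from the last point
        half_width = z * se1 * sqrt(k)     # widens with the square root of k
        rows.append((k, forecast, forecast - half_width, forecast + half_width))
    return rows

rows = rw_forecast_path(last=100.0, se1=1.0, horizon=4, drift=0.5)
for row in rows:
    print(row)
```

Setting drift=0.0 gives the horizontal forecast line of the driftless model with the same limits around it.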

Why the square root of time rule?

In a random walk, the k-step ahead forecast error is the sum of k independent random variables (“steps”).

The variance of a sum of independent random variables is the sum of the variances, hence the variance of the sum of k steps is just k times the variance of one step.

The standard deviation is the square root of variance, hence the standard deviation of the k-step-ahead forecast error goes up in proportion to square root of k.

SG’s “User-specified” forecasting procedure

1. Applies inflation adjustment, math transformation, and seasonal adjustment (if any), in that order.

2. Fits a forecasting model of the specified type to the adjusted data, produces residual plots and computes forecasts in adjusted units.

3. “Untransforms” the forecasts by undoing the adjustment operations (in reverse order), to obtain & plot forecasts in original units.

4. Computes forecast errors and their statistics in original units.

5. Compares up to 5 models side-by-side!

Model specification options

Fitting RW models in Statgraphics

The Forecasting/User-Specified-Model procedure includes a “Random Walk” model type.

If the “Constant” box is checked, you get a random walk with drift.

If the input variable is logged, or if the “Natural log” box is checked, you get a geometric random walk (…more about that later…)

RW forecasts for FX rate

[Figure: Time Sequence Plot for DollarEuroFXday, Random Walk Without Drift; horizontal axis 0 to 2400, vertical axis: DollarEuroFXday (0.82 to 1.42); legend: actual, forecast, 95.0% limits]

Analysis options in Forecasting procedure: RW model with no constant term

Confidence intervals for forecasts have parabolic shape

Updating of RW forecasts

[Figure: sequence of plots titled “Random walk with drift”; horizontal axis 0 to 50, vertical axis: Y (50 to 200); legend: actual, forecast, 95.0% limits]

Forecasts into the future are a trend line re-anchored on the last observed data point. Past forecasts look like a plot of the data shifted to the right and slightly up.

Here’s a RW model without drift for housing starts SAAR. Note that the forecasts extend horizontally, and the confidence intervals widen parabolically. (The longer-horizon CI’s are probably somewhat too wide: the series seems to have a stable long-run average.)

The residual autocorrelations are almost insignificant, indicating that this series is close to a random walk.

Here’s a random walk model for deflated, seasonally adjusted retail sales data. Note that the fitted values “track” the historical data closely (just lagging behind by one month), and the trend line is extrapolated from the last observed data point. This report also shows head-to-head comparisons against 2 other models—we’ll come back to that later…

How to tell if drift (trend) is non-zero?

The drift term often does not matter much to 1-period-ahead forecasts; it shows up only as the trend in longer-horizon forecasts.

Ask whether it makes sense that the series should continue to trend upward or downward indefinitely: if not, then assume no drift.

It may be difficult to test the hypothesis of zero drift by purely statistical methods (e.g., by looking at the t-statistic of the sample mean of delta-Y) unless the sample is large.
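
For reference, the t-statistic of the sample mean of delta-Y can be sketched as below: the average change divided by its standard error, which is the sample standard deviation of delta-Y over the square root of the number of changes (n−1). The function name and data are illustrative.

```python
from math import sqrt
from typing import List

def drift_t_statistic(y: List[float]) -> float:
    """t-statistic for zero drift (n-2 degrees of freedom); needs len(y) >= 3."""
    n = len(y)
    diffs = [y[t] - y[t - 1] for t in range(1, n)]
    mean_change = sum(diffs) / (n - 1)
    s = sqrt(sum((d - mean_change) ** 2 for d in diffs) / (n - 2))
    return mean_change / (s / sqrt(n - 1))   # mean divided by its standard error

y = [100.0, 103.0, 99.0, 102.0, 104.0]
print(round(drift_t_statistic(y), 3))   # 0.594
```

With so few points the statistic is far from significant even though the average change is a full unit per period, which illustrates why a large sample is needed.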

Looking ahead

Some of the more sophisticated forecasting models we will meet later (e.g., simple and linear exponential smoothing) are just fancied-up random walk models.

The forecast line is extrapolated from the average position of the last few points.

Its slope is equal to the average trend of the recent data, not the whole sample.

The logarithm transformation

The natural logarithm function looks like this:

[Figure: plot of Y=LOG(X) and Y=X-1; horizontal axis 0 to 4, vertical axis -2 to 2]

Note that the line y = x−1 is tangent to y = LN(x) at x=1. Hence LN(x) ≈ x−1 for x ≈ 1, and LN(1+r) ≈ r when r is a small percentage (e.g., a return).

Properties of the natural logarithm transformation

The defining property of any logarithm is that

LOG(xy) = LOG(x) + LOG(y)

Because the natural log has a slope of 1 at x = 1 it converts percentage changes into absolute changes:

LN((1+r)x) = LN(x) + LN(1+r) ≈ LN(x) + r

By the same token, it converts an exponential (compound) growth curve into a linear growth curve:

LN((1+r)^k) = k LN(1+r) ≈ k r
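
A quick numerical check of these properties in Python; the growth rates and the base value of 200 are arbitrary illustrative choices.

```python
from math import log

# LN(1+r) is close to r for small r, and drifts away as r grows
for r in (0.01, 0.05, 0.20):
    print(r, round(log(1 + r), 5))
# 0.01 0.00995
# 0.05 0.04879
# 0.2 0.18232

x, r, k = 200.0, 0.05, 10
pct_step = log((1 + r) * x) - log(x)   # a 5% increase adds LN(1.05) to the log
compound = log((1 + r) ** k)           # k periods of 5% growth
print(round(pct_step, 5))              # 0.04879
print(round(compound, 5), round(k * log(1 + r), 5))
```

In logged units each period of 5% growth adds the same amount, so compound growth plots as a straight line.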

Page 36: Decision 411: Class 2 - Fuqua School of Businessrnau/Decision411... · The random walk model ¾A time series is a random walk if its period-to- period changes are statistically independent

Geometric random walk

If the log of a series is a random walk, the original series is a geometric random walk.

The change in the natural log is (approximately) the percentage change between periods:

LN(y_t) − LN(y_{t−1}) = LN(y_t / y_{t−1}) ≈ (y_t / y_{t−1}) − 1 = (y_t − y_{t−1}) / y_{t−1}

Hence, in a geometric random walk, the series takes random steps in (roughly) percentage terms.

Percentage change is a more familiar and easy-to-think-about concept, but change-in-the-natural-log is theoretically the right way to measure relative changes when exponential growth is occurring or when small changes are compounded over many periods.
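
The following Python sketch compares the two ways of measuring relative change on a short made-up price series. The function names and the numbers are assumptions chosen for illustration only.

```python
import math

def pct_changes(y):
    """Period-to-period percentage changes, as fractions: (y_t - y_{t-1}) / y_{t-1}."""
    return [(b - a) / a for a, b in zip(y, y[1:])]

def log_diffs(y):
    """Period-to-period changes in the natural log: LN(y_t) - LN(y_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(y, y[1:])]

prices = [100.0, 102.0, 99.0, 105.0, 150.0]  # hypothetical values

for p, d in zip(pct_changes(prices), log_diffs(prices)):
    print(f"pct change = {p:+.4f}   log change = {d:+.4f}")

# Log changes add up across periods to the log of the total growth factor;
# percentage changes do not add up in this way.
total_log_change = sum(log_diffs(prices))
print(total_log_change, math.log(prices[-1] / prices[0]))
```

The two measures agree closely for the small moves and diverge for the large final jump, which is consistent with the "approximately" in the slide.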

Best example: stock prices

The geometric random walk is the “default” model for stock prices* & many other financial assets for which speculative markets exist

This means it is hard to beat the market by technical analysis (“charting”)...

… or by fitting regression models to monthly, weekly, or even daily data.

*First proposed by Louis Bachelier in 1900; 70 years later it became the basis for the Black-Scholes options pricing model.

Why should stock prices behave like a geometric random walk?

If everyone could predict that the stock market will go up more than average tomorrow, it would have already gone up today, hence future returns are independent of past returns (and other public information)

Investors generally think in terms of percentage changes in stock values when responding to informational events (earnings announcements, interest hikes, etc.), hence volatility is fairly constant in percentage terms.

Three forms of random walk (“efficient markets”) hypotheses

1. Weak form: future returns can’t be profitably predicted from past histories of returns (e.g., by chartists or data-miners… or we in this class)

2. Semi-strong form: future returns can’t be profitably predicted from any public information (e.g., by mutual fund managers)

3. Strong form: future returns can’t be profitably predicted from any available information (even by insiders)

Conventional wisdom: the truth is probably somewhere between semi-strong and strong. Better to buy index funds…or throw darts!

“The market is not a perfect random walk. But any systematic relationships that exist are so small that they are not useful for an investor…

The history of stock price movements contains no useful information that will enable an investor to outperform a buy-and-hold strategy in managing a portfolio.”

(p. 151)

Example: S&P 500 monthly close

Original series shows exponential growth up to 2000-2001 crash

Differenced series has no autocorrelation…but variance is increasing (“heteroscedasticity”)

Logged S&P 500 monthly close

Logged series shows linear growth with “bubble” in late ’90s

Right-mouse button options in “Time Series/Descriptive Methods” procedure: first let’s add a log transformation. Next we will also add a first-difference transformation.

Slope of trend in logged data ≈ average percentage increase

The slope of a trend line fitted to the natural log of the data is (approximately) the average percentage change per unit time. Here the natural log increased by roughly 7.4 – 4.4 = 3.0 over 25 years, so the average annual increase in the S&P500 index was 3/25 ≈ 12%
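
As a rough check on this idea, the sketch below fits a least-squares line to the natural log of a synthetic series that grows by exactly 1% per period, then repeats the slide's back-of-the-envelope arithmetic. The helper ols_slope and the synthetic data are assumptions for illustration; the 7.4, 4.4 and 25-year figures are the ones quoted on the slide.

```python
import math

def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Synthetic series (not real data): exactly 1% compound growth per period.
t = list(range(120))
y = [100.0 * 1.01 ** k for k in t]

slope = ols_slope(t, [math.log(v) for v in y])
print(f"slope of logged series = {slope:.5f}  (LN(1.01) = {math.log(1.01):.5f})")

# Slide arithmetic: the log rose from about 4.4 to about 7.4 over 25 years.
annual_log_slope = (7.4 - 4.4) / 25
print(f"approximate average annual increase = {annual_log_slope:.2%}")
```

The fitted slope recovers LN(1.01), which is within rounding of the 1% growth rate that generated the data.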

Logged and differenced S&P 500 monthly close

The first difference of the logged data is (approximately) the percentage change from period to period. Here we see that the percentage changes have no significant autocorrelations and (almost) constant variance over time
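
A minimal sketch of this check in Python follows. It uses a simulated geometric random walk rather than the actual S&P 500 data, and the drift and volatility values are borrowed from figures quoted elsewhere in these slides purely as plausible inputs. The function names, the seed, and the series length are assumptions. The band of ±2/sqrt(n) is the usual rule of thumb for judging whether an autocorrelation is significantly different from zero.

```python
import math
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n))
    return num / denom

def simulate_grw(n, drift, vol, y0=100.0, seed=0):
    """Simulate n steps of a geometric random walk (a random walk in the log)."""
    rng = random.Random(seed)
    logs = [math.log(y0)]
    for _ in range(n):
        logs.append(logs[-1] + drift + rng.gauss(0.0, vol))
    return [math.exp(v) for v in logs]

y = simulate_grw(2000, drift=0.0091, vol=0.042)          # simulated, not real data
dlog = [math.log(b / a) for a, b in zip(y, y[1:])]       # first difference of the log

band = 2 / math.sqrt(len(dlog))
r1 = autocorr(dlog, 1)
print(f"lag-1 autocorrelation = {r1:+.3f}, rough 95% band = +/-{band:.3f}")
```

On real data the same two steps apply: take the log, difference it, and then compare the autocorrelations at the first several lags with the band.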

S&P 500 daily close since 1954

12,000 data points!

Logged S&P 500 daily close

Not a perfect geometric random walk! The lag-1 autocorrelation is about 0.09. Also, very-long time scale shows periods of greater & lesser volatility.

Logged S&P 500 daily close

Excluding Black Monday (& Tues. & Wed.), an even clearer pattern emerges. (Actually, the lag-1 daily autocorrelation has faded out in the last 20 years.)

Logged DJIA daily close

Same pattern!

Daily 3-mo. T-bill rates

Logged daily 3-mo. T-bill rates

Here we also see autocorrelation at lag 1, as well as 5 and 10: a day-of-week effect? Also, note the pattern of changing volatility.

Another example: weekly gas prices 1999-2007

[Figure: Time Series Plot for GasPrice (horizontal axis 0 to 500, vertical axis GasPrice from 80 to 320)]

Original series shows erratic behavior, strong positive autocorrelation, peaks and valleys

Logged weekly gas prices

Natural log transformation linearizes growth, stabilizes size of fluctuations at different points in time

[Figure: Time Series Plot for adjusted GasPrice (horizontal axis 0 to 500, vertical axis adjusted GasPrice from 4.4 to 5.9)]

Differenced and logged weekly gas prices

[Figure: Time Series Plot for adjusted GasPrice (horizontal axis 0 to 500, vertical axis adjusted GasPrice from −0.07 to 0.17)]

[Figure: Estimated Autocorrelations for adjusted GasPrice (horizontal axis lag 0 to 25, vertical axis autocorrelations from −1 to 1)]

First difference of logged prices (i.e. weekly % change) looks much more like “noise”

However, there is a significant “wave” pattern of autocorrelation in the differences, so this series is not a random walk. (This pattern suggests an exponentially weighted moving average would be a better model, as we will see later.)

Forecasting from the GRW model

First apply the standard RW model to the logged series to obtain forecasts and confidence intervals in logged units…

Then “unlog” them (apply the “EXP” function) to obtain forecasts and CI’s in original units.

Statgraphics does all this automatically when you specify RW with a log transformation.

Direct calculation of GRW forecast

One period ahead: ŷ_{n+1} = (1 + r) y_n

k periods ahead: ŷ_{n+k} = (1 + r)^k y_n (an exponential growth curve)

…where r is the average percentage growth per period

Example: forecast for next month’s S&P500 closing value ≈ 1.0091 x this month’s closing value, based on estimated drift of 0.0091* in logged units

Forecast standard error ≈ 0.042 x this month’s closing value, based on RMSE of 0.042* in logged units

*See output on later slide for RW model fitted to log(SP500)
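
The log-then-unlog recipe for a geometric random walk forecast can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of the Statgraphics procedure: the function name grw_forecast and the closing value of 1000 are hypothetical, the standard error is assumed to grow like sqrt(k) as for a plain random walk, and the uncertainty in the estimated drift is ignored. The drift of 0.0091 and RMSE of 0.042 are the values quoted on the slide.

```python
import math

def grw_forecast(y_last, drift, rmse, k, z=1.96):
    """k-step-ahead GRW forecast with an approximate 95% interval.

    drift and rmse are in logged units. The forecast is made on the log
    scale and then converted back with EXP.
    """
    log_forecast = math.log(y_last) + k * drift
    half_width = z * rmse * math.sqrt(k)   # assumed random-walk error growth
    return (math.exp(log_forecast - half_width),
            math.exp(log_forecast),
            math.exp(log_forecast + half_width))

y_last = 1000.0  # hypothetical closing value
for k in (1, 6, 12):
    lo, fc, hi = grw_forecast(y_last, drift=0.0091, rmse=0.042, k=k)
    print(f"k = {k:2d}   forecast = {fc:8.1f}   interval = ({lo:8.1f}, {hi:8.1f})")
```

Because the interval is symmetric in logged units, it comes out asymmetric in original units, with more room on the upside than on the downside.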

Linear trend model

The linear trend model is a straight line fitted to a plot of data versus time: y_t = a + bt.

Mathematically, it is a simple regression of the data variable Y on the time index T.

It may be visually helpful to slap a trend line on a time series plot (Excel can do this)...

…but this is often a poor way to extrapolate a trend for purposes of forecasting the future!

Here’s the linear trend model for the retail data. Notice that its fit to the historical data is poor (especially during the “bubble” period of the late ‘90’s). Coincidentally, the trend line happens to pass almost through the last observed data point, although this need not be true in general. The confidence intervals for more distant horizons are unrealistically “constant” in width.

Estimated trend = 2.68, compared with 2.45 in RWD model

Linear trend vs. RW with drift

A linear trend model assumes that there is a trend line fixed “somewhere in space” around which the data varies in an i.i.d. manner.

The fitted trend line always passes through the center of the data, i.e., through the point (T-bar, Y-bar), the sample means of T and Y.

A RW-with-drift model also assumes that there is a constant trend, but it continually “re-anchors” the trend line on the last observed data point.

CI’s for the RW model widen “parabolically” as the forecast horizon increases; CI’s for the linear trend model hardly widen at all. Which is more realistic for your data set?
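
The difference in anchoring can be seen in a short Python sketch that fits both models to the same made-up series. The function names and the data are assumptions for illustration. The drift here is estimated as the average period-to-period change, which equals (last value − first value) / (n − 1).

```python
def fit_linear_trend(y):
    """Least-squares fit of y_t = a + b*t for t = 1..n. Returns (a, b)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    b = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
         / sum((ti - t_bar) ** 2 for ti in t))
    a = y_bar - b * t_bar          # forces the line through (t_bar, y_bar)
    return a, b

def fit_rw_drift(y):
    """Drift of a random-walk-with-drift model: the mean first difference."""
    return (y[-1] - y[0]) / (len(y) - 1)

y = [10.0, 12.0, 11.0, 15.0, 18.0, 17.0, 21.0, 26.0]  # hypothetical data
a, b = fit_linear_trend(y)
d = fit_rw_drift(y)

k = 3  # forecast horizon
lt_forecast = a + b * (len(y) + k)   # extrapolates the fixed trend line
rwd_forecast = y[-1] + k * d         # starts from the last observed point
print(f"trend slope = {b:.3f}, drift = {d:.3f}")
print(f"linear trend forecast = {lt_forecast:.2f}, RW-with-drift forecast = {rwd_forecast:.2f}")
```

The two slopes are similar, but the forecasts differ because the linear trend starts from the fitted line while the random walk with drift starts from the last data point.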

Linear trend vs. RWD, continued

In practice a RW-with-drift model works better for “smooth” data; the linear trend may be appropriate for very “noisy” data.

If the growth pattern in the data is irregular or not perfectly linear, the linear trend model may fit badly near the end of the series, which is where the forecasting action occurs!

Because the linear trend model anchors the trend line in the exact center of the data, its goodness of fit near the end of the series is very sensitive to the amount of past history used.

Importance of model assumptions

Every forecasting model is based on particular assumptions about the nature of the true pattern in the data

Example: RW model assumes that period-to-period changes are i.i.d., while LT model assumes that deviations from a fixed trend line are i.i.d.

The validity of the forecasts and confidence intervals depends on whether the assumptions are “correct”

Assumptions should be tested statistically as well as by appealing to intuition or relevant theories.

How to test assumptions

Statistical output not only shows “goodness of fit” but also provides diagnostic tests of assumptions

Violations of model assumptions are indicated by non-random patterns in the errors (residuals):

Autocorrelation

Heteroscedasticity (non-constant variance)

Non-normality

…and/or by poor out-of-sample performance
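
Two of these checks can be sketched with simple summary numbers, as below. This is an informal illustration, not the set of diagnostic tests Statgraphics reports: the function names, the split-in-half comparison for heteroscedasticity, and the toy residuals are all assumptions. Skewness and excess kurtosis are both zero for a normal distribution, and a standard-deviation ratio far from 1 suggests non-constant variance.

```python
import statistics

def split_half_sd_ratio(errors):
    """Std. dev. of the second half of the errors divided by that of the first half."""
    h = len(errors) // 2          # each half needs at least two points
    return statistics.stdev(errors[h:]) / statistics.stdev(errors[:h])

def skewness(errors):
    """Sample skewness (third standardized moment)."""
    n = len(errors)
    m = sum(errors) / n
    m2 = sum((e - m) ** 2 for e in errors) / n
    m3 = sum((e - m) ** 3 for e in errors) / n
    return m3 / m2 ** 1.5

def excess_kurtosis(errors):
    """Sample kurtosis minus 3 (fourth standardized moment minus the normal value)."""
    n = len(errors)
    m = sum(errors) / n
    m2 = sum((e - m) ** 2 for e in errors) / n
    m4 = sum((e - m) ** 4 for e in errors) / n
    return m4 / m2 ** 2 - 3.0

# Toy residuals whose spread doubles halfway through (hypothetical values).
residuals = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
print("sd ratio (2nd half / 1st half):", split_half_sd_ratio(residuals))
print("skewness:", skewness(residuals))
print("excess kurtosis:", excess_kurtosis(residuals))
```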

Out-of-sample validation

It is always good practice to “hold out” some data during model-fitting and then “validate” the model by testing on the hold-out data.

This is also called “out-of-sample” testing or (in the investment business) “backtesting”

To be completely honest, you should hold out data while selecting the model, not just during the final parameter estimation.
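
A hold-out test for a random-walk-with-drift model might look like the Python sketch below. The function names and the data are hypothetical. The drift is estimated from the estimation period only, and each validation forecast is a one-step-ahead forecast from the previous actual value.

```python
def split_holdout(y, n_holdout):
    """Split a series into an estimation part and the last n_holdout points."""
    if not 0 < n_holdout < len(y) - 1:
        raise ValueError("need at least two estimation points and one hold-out point")
    return y[:-n_holdout], y[-n_holdout:]

def rw_drift_validation_errors(y, n_holdout):
    """Fit the drift on the estimation period, then return (drift, one-step
    forecast errors on the hold-out period)."""
    est, val = split_holdout(y, n_holdout)
    drift = (est[-1] - est[0]) / (len(est) - 1)
    previous = est[-1:] + val[:-1]   # the actual value just before each hold-out point
    errors = [actual - (prev + drift) for prev, actual in zip(previous, val)]
    return drift, errors

y = [10.0, 12.0, 14.0, 16.0, 18.0, 21.0, 22.0]  # hypothetical data
drift, val_errors = rw_drift_validation_errors(y, n_holdout=2)
print("drift estimated without the hold-out data:", drift)
print("validation-period errors:", val_errors)
```

The key design point is that nothing from the hold-out period is used to estimate the drift, so the validation errors are an honest sample of how the model performs on data it has not seen.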

Estimation vs. validation vs. forecasting

Model comparisons: which model is better, and why?

Models should be compared on the basis of the size of the errors they are expected to make in the future, as well as on simplicity & intuitive reasonableness.

The key error measures in the estimation & validation period (RMSE, MAE, MAPE) are indicators of the size of the errors the model is likely to make …if they can be trusted!*

*Depends on whether the model passes tests of its assumptions!

Keeping score: error measures

RMSE: root-mean-squared error, i.e., the square root of the average of the squared errors*

MAE: mean absolute error, i.e., the average of the absolute values of the errors**

MAPE: mean absolute percentage error**

*Usually adjusted for # coefficients estimated, and it equals the sample standard deviation of the errors if the mean error is zero (i.e., if forecasts are “unbiased”)

**NOT adjusted for # coefficients estimated
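
The three measures can be written out in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, and the adjustment for the number of estimated coefficients is implemented as dividing by n minus that number, which is a common convention rather than something the slide spells out.

```python
import math

def rmse(errors, n_coef=0):
    """Root-mean-squared error, optionally adjusted for n_coef estimated coefficients."""
    return math.sqrt(sum(e * e for e in errors) / (len(errors) - n_coef))

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent of the actual values."""
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actuals, forecasts)) / len(actuals)

actuals = [100.0, 200.0]      # hypothetical values
forecasts = [97.0, 204.0]
errors = [a - f for a, f in zip(actuals, forecasts)]

print("RMSE:", rmse(errors))
print("RMSE adjusted for 1 coefficient:", rmse(errors, n_coef=1))
print("MAE:", mae(errors))
print("MAPE (%):", mape(actuals, forecasts))
```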

Which measure is “best” to focus on?

RMSE is what your software is trying to optimize, because it always estimates the coefficients of the model by “least squares”, and it heavily penalizes very large errors

MAE is a bit easier to understand and does not give as much relative weight to large errors

MAPE is also easy to understand and is “unit-free” (robust against effects of inflation, compound growth, and multiplicative seasonality)

Here the geometric RW model for SP500 has been implemented as a RW model fitted to log(SP500) so that output statistics and plots are in log units. The last 50 points were held out for validation. (Note: “log” is the natural log function in Statgraphics.)

Estimated drift is 0.0091 = 0.91% per month = 11.5% per year

All three error stats are a bit larger in the validation period than in the estimation period, but in the same ballpark. This is probably because the volatility is a bit higher at the end of the series. RMSE of 0.042 in estimation period means that the standard deviation of monthly changes is roughly 4.2%
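
The conversion from monthly drift to an annual rate can be checked with a couple of lines of Python. The variable names are illustrative. The calculation assumes the monthly log drift simply accumulates over 12 months and is then converted back from log units with EXP.

```python
import math

monthly_drift = 0.0091                                # estimated drift in log units, from the slide

annual_log_growth = 12 * monthly_drift                # log changes add up across months
annual_pct_growth = math.expm1(annual_log_growth)     # EXP(x) - 1 converts back to a percentage

print(f"annual growth in log units = {annual_log_growth:.4f}")
print(f"annual percentage growth   = {annual_pct_growth:.2%}")
```

This lands within rounding of the 11.5% per year quoted above.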

Here is the same geometric RW model, constructed by using SP500 as the input variable and with the log transformation performed inside the procedure as a model option. The plots and statistics are now in “unlogged terms”, so we can’t compare RMSE and MAE between estimation and validation periods, although we can still compare MAPE.

Error stats are now in unlogged terms, so RMSE and MAE are much smaller in estimation period than in validation period…

Same estimated drift

Forecast plot is now unlogged

Residual plot is still logged

…but MAPE is still relatively similar in both periods (essentially the same as the MAE in log units on the previous slide)

Comparisons between models

Ideally, error measures should be compared between models in the same (e.g., original) units and fitted to the same sample of data.

The User-Specified Forecasting procedure in Statgraphics makes this easy: it produces a “Model Comparison” report that gives side-by-side comparisons in original units.

What to look for

Compare key error measures between models in the estimation period.

Also check to see that validation period results are roughly consistent with estimation period results, particularly on the MAPE measure.

RMSE and MAE are not comparable between estimation & validation periods if a log transform has been used as a model option inside SG’s forecasting procedure or if inflation or multiplicative patterns are present; stick with MAPE in those cases.
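
A toy example in Python shows why. The numbers are made up: the forecasts miss by 4% in both periods, but the series is ten times higher in the validation period. The function names are assumptions.

```python
import math

def rmse(actuals, forecasts):
    """Root-mean-squared error in the units of the data (no degrees-of-freedom adjustment)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals))

def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actuals, forecasts)) / len(actuals)

# Hypothetical estimation period: level around 100, forecasts off by 4%.
est_actual, est_forecast = [100.0, 100.0], [96.0, 104.0]
# Hypothetical validation period: level around 1000, forecasts still off by 4%.
val_actual, val_forecast = [1000.0, 1000.0], [960.0, 1040.0]

print("RMSE:", rmse(est_actual, est_forecast), "vs", rmse(val_actual, val_forecast))
print("MAPE:", mape(est_actual, est_forecast), "vs", mape(val_actual, val_forecast))
```

RMSE grows with the level of the series even though the model is doing equally well in relative terms, whereas MAPE stays the same.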

Which model to choose?

If one model…

has clearly smaller errors than its rivals,

passes the residual diagnostic and validation tests so that its assumptions are credible,

yields sensible-looking plots of forecasts & CI’s, and

is supported by theory and intuition,

…then you have good reasons for choosing it.

But don’t pick models based on “hair-splitting” differences in error stats. When in doubt, choose the model that is simpler and more intuitive.

Here, model A (random walk with drift) is best on all measures in the estimation period

MAPE in the validation period is consistent with estimation period results

Residual diagnostics OK

Class 2 Recap

Explanation of lags & differences
Random walk model
  How to identify a random walk
  Examples of random walks
  Forecasting from the random walk model
  Log transformation & geometric random walk
Linear trend model
Model comparison & validation