Forecasting Using Time Series


  • 8/9/2019 Forecasting Using Time Series


    Rajiv Mohapatra

    Trends

    A trend is a gradual upward or downward shift in the level of the series, or the tendency of the series values to increase or decrease over time.

    Figure: Trend

    Trends are either local or global, but a single series can exhibit both types. Historically, series plots of the stock market index show an upward global trend. Local downward trends have appeared in times of recession, and local upward trends have appeared in times of prosperity.

    Trends can also be either linear or nonlinear. Linear trends are positive or negative additive increments to the level of the series, comparable to the effect of simple interest on principal. Nonlinear trends are often multiplicative, with increments that are proportional to the previous series value(s).
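    The contrast between additive and multiplicative increments can be sketched in a few lines of Python. The function names, the starting level of 100, and the 10-unit and 10% increments are illustrative assumptions, not values from the text.

```python
def linear_trend(start, increment, periods):
    """Additive trend: the same absolute increment is added every period."""
    return [start + increment * t for t in range(periods)]


def multiplicative_trend(start, rate, periods):
    """Multiplicative trend: each increment is a fixed share of the previous value."""
    values = [start]
    for _ in range(periods - 1):
        values.append(values[-1] * (1 + rate))
    return values


# Hypothetical example: 10 units per period versus 10% per period.
linear = linear_trend(100.0, 10.0, 5)
compound = multiplicative_trend(100.0, 0.10, 5)

print("linear:  ", linear)
print("compound:", [round(v, 2) for v in compound])
```

    The linear series rises by the same amount each period, while the gaps between consecutive values of the multiplicative series keep widening, much as compound interest outpaces simple interest.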

    Global linear trends are fit and forecast well by both exponential smoothing and ARIMA models. In building ARIMA models, series showing trends are generally differenced to remove the effect of the trend.

    Seasonal Cycles

    A seasonal cycle is a repetitive, predictable pattern in the series values.

    Figure: Seasonal cycle


    Seasonal cycles are tied to the interval of the series. For instance, monthly data typically cycle over quarters and years. Series that show a seasonal cycle are said to exhibit seasonality. Seasonal patterns are useful in obtaining good fits and forecasts, and there are exponential smoothing and ARIMA models that capture seasonality.
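    One simple way to see a seasonal pattern is to average the values that fall at the same position within each cycle. The sketch below assumes a known period and uses made-up quarterly sales figures; the function name `seasonal_profile` is hypothetical, and this is a descriptive summary rather than a full seasonal model.

```python
def seasonal_profile(series, period):
    """Average the values that share the same position within each cycle."""
    if len(series) < period:
        raise ValueError("need at least one full cycle of data")
    totals = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(series):
        totals[i % period] += value
        counts[i % period] += 1
    return [totals[k] / counts[k] for k in range(period)]


# Two years of hypothetical quarterly sales with a peak in the fourth quarter.
quarterly_sales = [20, 22, 21, 35, 24, 26, 25, 39]
profile = seasonal_profile(quarterly_sales, period=4)
print(profile)
```

    A profile in which one position stands well above the others, as the fourth quarter does here, is the kind of repetitive pattern that seasonal models are built to capture.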

    Nonseasonal Cycles

    A nonseasonal cycle is a repetitive, possibly unpredictable, pattern in the series values.

    Figure: Nonseasonal cycle

    Some series, such as the unemployment rate, clearly display cyclical behavior; however, the periodicity of the cycle varies over time, making it difficult to predict when a high or low will occur. Other series may have predictable cycles that do not fit neatly into the Gregorian calendar or that are longer than a year. For example, the tides follow the lunar calendar, international travel and trade related to the Olympics swell every four years, and there are many religious holidays whose Gregorian dates change from year to year.

    Nonseasonal cyclical patterns are difficult to model and generally increase uncertainty in forecasting. The stock market, for example, provides numerous instances of series that have defied the efforts of forecasters. All the same, nonseasonal patterns must be accounted for when they exist. In many cases, we can still identify a model that fits the historical data reasonably well, which gives us the best chance to minimize uncertainty in forecasting.

    Pulses and Steps

    Many series experience abrupt changes in level. They generally come in two types:

    - A sudden, temporary shift, or pulse, in the series level
    - A sudden, permanent shift, or step, in the series level

    When steps or pulses are observed, it is important to find a plausible explanation. Time series models are designed to account for gradual, not sudden, change. As a result, they tend to underestimate pulses and to be thrown off by steps, which leads to poor model fits and uncertain forecasts. (Some instances of seasonality may appear to exhibit sudden changes in level, but the level is constant from one seasonal period to the next.)

    If a disturbance can be explained, it can be modeled using an intervention or event. For example, after the collapse of Lehman Brothers in September 2008, the lack of investor confidence caused a recession in the US and European economies, which then returned to normal levels in 2010. By specifying a point intervention for the month of the Lehman Brothers collapse, we can improve the fit of our model, thus indirectly improving our forecasts. Similarly, a retail store might find that sales were much higher than usual on the day all items were marked 50% off. By specifying the 50%-off promotion as a recurring event, we can improve the fit of our model and estimate the effect of repeating the promotion on future dates.

    Figure: Series with a pulse
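    To illustrate how a recurring event such as the promotion might be quantified, the sketch below flags event days with a 0/1 indicator and compares them with ordinary days. It assumes a series with a constant level and no trend or seasonality, in which case the least-squares coefficient on the indicator reduces to a difference in means. The sales figures and the function name `event_effect` are made up for illustration.

```python
def event_effect(series, event_flags):
    """Return (baseline level, estimated event effect) for a flat series.

    event_flags holds 1 where the event occurred and 0 elsewhere.
    """
    event_values = [y for y, flag in zip(series, event_flags) if flag]
    normal_values = [y for y, flag in zip(series, event_flags) if not flag]
    if not event_values or not normal_values:
        raise ValueError("need both event and non-event observations")
    baseline = sum(normal_values) / len(normal_values)
    effect = sum(event_values) / len(event_values) - baseline
    return baseline, effect


# Hypothetical daily sales; the promotion ran on the 4th and 7th days.
daily_sales = [200, 210, 190, 400, 205, 195, 410, 200]
promotion_days = [0, 0, 0, 1, 0, 0, 1, 0]

baseline, effect = event_effect(daily_sales, promotion_days)
print("baseline level:", baseline)
print("estimated promotion effect:", effect)
```

    Under these assumptions, a forecast for a future promotion day would be the baseline plus the estimated effect. A real intervention model would estimate the effect jointly with trend and seasonal terms.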

    Outliers

    Shifts in the level of a time series that cannot be explained are referred to as outliers. These observations are inconsistent with the remainder of the series and can dramatically influence the analysis and, consequently, affect the forecasting ability of the time series model.

    - Additive Outlier. An additive outlier appears as a surprisingly large or small value occurring for a single observation. Subsequent observations are unaffected by an additive outlier. Consecutive additive outliers are typically referred to as additive outlier patches.

    - Innovational Outlier. An innovational outlier is characterized by an initial impact with effects lingering over subsequent observations. The influence of the outlier may increase as time proceeds.

    - Level Shift Outlier. For a level shift, all observations appearing after the outlier move to a new level. In contrast to additive outliers, a level shift outlier affects many observations and has a permanent effect.

    - Transient Change Outlier. Transient change outliers are similar to level shift outliers, but the effect of the outlier diminishes exponentially over the subsequent observations. Eventually, the series returns to its normal level.

    - Seasonal Additive Outlier. A seasonal additive outlier appears as a surprisingly large or small value occurring repeatedly at regular intervals.

    - Local Trend Outlier. A local trend outlier yields a general drift in the series caused by a pattern in the outliers after the onset of the initial outlier.

    The following figure displays several types of outliers commonly occurring in time series. The blue lines represent a series without outliers. The red lines suggest a pattern that might be present if the series contained outliers. These outliers are all classified as deterministic because they affect only the mean level of the series.


    Figure: Outlier types
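    Since the original figure is not reproduced here, the shapes of three of the outlier types can be sketched numerically as the amount each adds to the mean level over time. The function name `outlier_effect` and the decay factor of 0.5 are illustrative assumptions; the additive, level shift, and transient change patterns follow the descriptions above.

```python
def outlier_effect(kind, length, onset, size, decay=0.5):
    """Return the amount an outlier adds to the series level at each time point.

    kind  : "additive", "level_shift", or "transient"
    onset : index of the first affected observation
    decay : per-step shrink factor for a transient change (assumed value)
    """
    effect = [0.0] * length
    for t in range(onset, length):
        steps_since_onset = t - onset
        if kind == "additive":
            # Only the onset observation is affected.
            effect[t] = size if steps_since_onset == 0 else 0.0
        elif kind == "level_shift":
            # Every later observation moves to the new level.
            effect[t] = size
        elif kind == "transient":
            # The impact shrinks exponentially back toward zero.
            effect[t] = size * decay ** steps_since_onset
        else:
            raise ValueError("unknown outlier kind: " + kind)
    return effect


additive = outlier_effect("additive", 6, onset=2, size=10.0)
level_shift = outlier_effect("level_shift", 6, onset=2, size=10.0)
transient = outlier_effect("transient", 6, onset=2, size=10.0)

for name, values in [("additive", additive),
                     ("level shift", level_shift),
                     ("transient change", transient)]:
    print(name, values)
```

    Adding any of these effect series to an outlier-free series gives a contaminated series of the corresponding type.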

    Autocorrelation and Partial Autocorrelation Functions

    Autocorrelation and partial autocorrelation are measures of association between current and past series values and indicate which past series values are most useful in predicting future values. With this knowledge, we can determine the order of processes in an ARIMA model. More specifically,

    - Autocorrelation function (ACF). At lag k, this is the correlation between series values that are k intervals apart.

    - Partial autocorrelation function (PACF). At lag k, this is the correlation between series values that are k intervals apart, accounting for the values of the intervals between.


    Figure: ACF plot for a series

    The x axis of the ACF plot indicates the lag at which the autocorrelation is computed; the y axis indicates the value of the correlation (between -1 and 1). For example, a spike at lag 1 in an ACF plot indicates a strong correlation between each series value and the preceding value, a spike at lag 2 indicates a strong correlation between each value and the value occurring two points previously, and so on.

    - A positive correlation indicates that large current values correspond with large values at the specified lag; a negative correlation indicates that large current values correspond with small values at the specified lag.

    - The absolute value of a correlation is a measure of the strength of the association, with larger absolute values indicating stronger relationships.
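    As a rough sketch of how these quantities can be computed, the code below estimates the sample ACF directly from its definition and derives the PACF from it with the Durbin-Levinson recursion. This is one standard approach, not necessarily the one used by any particular software package, and the data values are made up.

```python
def acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    deviations = [x - mean for x in series]
    total = sum(d * d for d in deviations)
    if total == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    result = [1.0]
    for k in range(1, max_lag + 1):
        # Products of deviations that are k intervals apart.
        cross = sum(deviations[t] * deviations[t - k] for t in range(k, n))
        result.append(cross / total)
    return result


def pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag (Durbin-Levinson)."""
    r = acf(series, max_lag)
    partial = [1.0]
    prev = []  # coefficients of the order k-1 autoregression
    for k in range(1, max_lag + 1):
        numerator = r[k] - sum(prev[j] * r[k - 1 - j] for j in range(k - 1))
        denominator = 1.0 - sum(prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = numerator / denominator
        current = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        current.append(phi_kk)
        partial.append(phi_kk)
        prev = current
    return partial


# Hypothetical series used only to exercise the functions.
series = [2, 4, 6, 5, 3, 4, 7, 8, 6, 5, 4, 6]
autocorrelations = acf(series, 3)
partials = pacf(series, 3)
print([round(v, 3) for v in autocorrelations])
print([round(v, 3) for v in partials])
```

    At lag 1 the two functions coincide, because there are no intermediate values to account for; they differ from lag 2 onward.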

    Series Transformations

    Transformations are often useful for stabilizing a series before estimating models. This is particularly important for ARIMA models, which require series to be stationary before models are estimated. A series is stationary if the global level (mean) and average deviation from the level (variance) are constant throughout the series.

    While most interesting series are not stationary, ARIMA is effective as long as the series can be made stationary by applying transformations, such as the natural log, differencing, or seasonal differencing.

    Variance stabilizing transformations. Series in which the variance changes over time can often be stabilized using a natural log or square root transformation. These are also called functional transformations.

    - Natural log. The natural logarithm is applied to the series values.
    - Square root. The square root function is applied to the series values.

    Neither transformation can be used for series with negative values, and the natural log is also undefined for zero values.
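    The stabilizing effect of the natural log can be checked on a constructed example. The series below is hypothetical: its level grows by 50% per step and its fluctuations are proportional to the level, so the spread of the raw values widens over time while the spread of the logged values does not.

```python
import math
import statistics

# Hypothetical series: geometric growth with fluctuations of 20% below and
# above the level on alternating steps.
raw = [100 * 1.5 ** t * (0.8 if t % 2 == 0 else 1.2) for t in range(8)]
logged = [math.log(value) for value in raw]  # requires strictly positive values


def spread_by_half(series):
    """Standard deviation of the first and second halves of the series."""
    half = len(series) // 2
    return statistics.pstdev(series[:half]), statistics.pstdev(series[half:])


raw_first, raw_second = spread_by_half(raw)
log_first, log_second = spread_by_half(logged)

print("raw spread, first vs second half:", round(raw_first, 2), round(raw_second, 2))
print("log spread, first vs second half:", round(log_first, 4), round(log_second, 4))
```

    On the raw scale the second half is several times more variable than the first; after the log transformation the two halves have essentially the same spread.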

    Level stabilizing transformations. A slow decline of the values in the ACF indicates that each series value is strongly correlated with the previous value. By analyzing the change in the series values, we obtain a stable level.

    - Simple differencing. The differences between each value and the previous value in the series are computed, except, of course, for the oldest value in the series. This means that the differenced series will have one fewer value than the original series.

    - Seasonal differencing. Identical to simple differencing, except that the differences between each value and the previous seasonal value are computed.


    When either simple or seasonal differencing is used together with the log or square root transformation, the variance stabilizing transformation is always applied first. When simple and seasonal differencing are both in use, the resulting series values are the same whether simple differencing or seasonal differencing is applied first.
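    The two kinds of differencing, and the claim that their order does not matter, can be checked with a short sketch. The helper name `difference`, the sample values, and the seasonal period of 4 are assumptions made for the example.

```python
def difference(series, lag=1):
    """Subtract from each value the value `lag` positions earlier.

    lag=1 gives simple differencing; lag=period gives seasonal differencing.
    The result is `lag` values shorter than the input.
    """
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Hypothetical quarterly series with an upward trend and a seasonal peak.
series = [12, 15, 14, 20, 16, 19, 19, 26, 21, 25, 24, 33]
period = 4

simple = difference(series)            # one fewer value than the original
seasonal = difference(series, period)  # `period` fewer values

simple_then_seasonal = difference(difference(series, 1), period)
seasonal_then_simple = difference(difference(series, period), 1)

print("simple:  ", simple)
print("seasonal:", seasonal)
print("both:    ", simple_then_seasonal)
```

    Both orderings produce the same values because each output is the same combination of four original observations, whichever difference is taken first.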

    Predictor Series

    Predictor series include related data that may help explain the behavior of the series to be forecast. For example, a Web- or catalog-based retailer might forecast sales based on the number of catalogs mailed, the number of phone lines open, or the number of hits to the company Web page.

    While adding predictors may improve a model's ability to fit the historical data, it doesn't necessarily mean that the model does a better job of forecasting, so the added complexity may not be worth the trouble. Ideally, the goal should be to identify the simplest model that does a good job of forecasting.

    As a general rule, the number of predictors should not exceed the sample size divided by 15 (at most one predictor per 15 cases). For example, a series with 150 observations supports at most 10 predictors.

    Time Series Models

    Exponential smoothing is a method of forecasting that uses weighted values of previous series observations to predict future values. As such, exponential smoothing is not based on a theoretical understanding of the data. It forecasts one point at a time, adjusting its forecasts as new data come in. The technique is useful for forecasting series that exhibit trend, seasonality, or both.
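    The one-point-at-a-time updating can be sketched with simple exponential smoothing, the most basic variant. Note the assumptions: this version tracks only the level (trend and seasonal variants such as Holt and Holt-Winters extend it), the smoothing weight `alpha` of 0.5 is arbitrary, and the level is initialized to the first observation.

```python
def simple_exponential_smoothing(series, alpha):
    """Return (one-step-ahead forecasts, forecast for the next period).

    forecasts[t] is the forecast of series[t] made before observing it.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumed initialization: start at the first value
    forecasts = []
    for observed in series:
        forecasts.append(level)
        # Move the level part of the way toward the new observation.
        level = alpha * observed + (1 - alpha) * level
    return forecasts, level


# Hypothetical data.
observations = [10, 12, 11, 13]
forecasts, next_forecast = simple_exponential_smoothing(observations, alpha=0.5)
print("one-step-ahead forecasts:", forecasts)
print("forecast for next period:", next_forecast)
```

    Because each update mixes the newest observation with the previous level, older observations receive exponentially smaller weights, which is where the method gets its name.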

    ARIMA (Autoregressive Integrated Moving Average) models provide more sophisticated methods for modeling trend and seasonal components than do exponential smoothing models and, in particular, allow the added benefit of including independent (predictor) variables in the model. Building an ARIMA model involves explicitly specifying the autoregressive and moving average orders as well as the degree of differencing.
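    To make the three orders concrete, here is a deliberately small sketch of an ARIMA(1,1,0) forecast: one autoregressive term, one round of differencing, and no moving average term. It estimates the autoregressive coefficient by least squares without an intercept, which is a simplification; statistical packages typically use maximum likelihood and support moving average terms and predictors. The function names and data are hypothetical.

```python
def fit_ar1(values):
    """Least-squares estimate of phi in values[t] = phi * values[t-1] + error."""
    numerator = sum(values[t] * values[t - 1] for t in range(1, len(values)))
    denominator = sum(v * v for v in values[:-1])
    if denominator == 0:
        raise ValueError("cannot estimate phi from an all-zero series")
    return numerator / denominator


def forecast_arima_110(series, steps):
    """Forecast `steps` periods ahead with a no-intercept ARIMA(1,1,0) sketch."""
    # d = 1: model the changes rather than the levels.
    diffs = [series[i] - series[i - 1] for i in range(1, len(series))]
    # p = 1: each change depends on the previous change.
    phi = fit_ar1(diffs)
    forecasts = []
    last_value, last_diff = series[-1], diffs[-1]
    for _ in range(steps):
        last_diff = phi * last_diff         # forecast the next change
        last_value = last_value + last_diff  # integrate back to the level
        forecasts.append(last_value)
    return phi, forecasts


# Hypothetical series whose increases halve each period (8, 4, 2, 1).
history = [0, 8, 12, 14, 15]
phi, forecasts = forecast_arima_110(history, steps=2)
print("estimated phi:", phi)
print("forecasts:", forecasts)
```

    Here the differencing step corresponds to the "Integrated" part of the name, the fitted coefficient to the "Autoregressive" part, and the absent error terms to a moving average order of zero.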

    In practical terms, ARIMA models are most useful in explaining the behavior of the series being forecast, by including predictors such as the number of catalogs mailed or the number of hits to a company Web page. Exponential smoothing models describe the behavior of the time series without attempting to understand why it behaves as it does. For example, a series that historically has peaked every 12 months will probably continue to do so even if you don't know why.

    Limitations of Using Time Series Models in Forecasting

    - To forecast with these models, the target series and all predictor series must cover the same time span and use the same interval.

    - The further into the future you try to predict, the less reliable the forecasts become.

    - Time series models must be re-estimated, and the forecasts regenerated, each time new data become available.