INTERVENTION BASED ARIMA TIME-SERIES MODELS
Ramasubramanian V.
Indian Agricultural Statistics Research Institute
Library Avenue, New Delhi – 110012
1. Time series models
Time series (TS) data refer to observations on a variable that occur in a time sequence.
Mostly these observations are collected at equally spaced, discrete time intervals. The TS
movements of such chronological data can be resolved or decomposed into discernible
components such as trend, periodic (say, seasonal), cyclical and irregular variations. One or
two of these components may overshadow the others in some series. A basic assumption
in any TS analysis/modeling is that some aspects of the past pattern will continue to
remain in the future. Here it is tacitly assumed that information about the past is
available in the form of numerical data. Ideally, at least 50 observations are necessary for
performing TS analysis/ modeling, as propounded by Box and Jenkins who were pioneers
in TS modeling.
TS models have advantages over other statistical models in certain situations. They can
be used more easily for forecasting purposes because historical sequences of observations
upon study variables are readily available from published secondary sources. These
successive observations are statistically dependent and TS modeling is concerned with
techniques for the analysis of such dependencies. Thus in TS modeling, the prediction of
values for the future periods is based on the pattern of past values of the variable under
study, but not generally on explanatory variables which may affect the system. There are
two main reasons for resorting to such TS models. First, the system may not be
understood, and even if it is understood it may be extremely difficult to measure the
cause and effect relationship; second, the main concern may be only to predict what will
happen and not to know why it happens. Many a time, collection of information on causal
factors (explanatory variables) affecting the study variable(s) may be cumbersome or
impossible, and hence the availability of long series of data on explanatory variables is a
problem. In such situations, TS models are a boon for forecasters. Hence, if TS
models are put to use, say, for forecasting purposes, then they are especially
applicable only in the 'short term'.
Decomposition models are among the oldest approaches to TS analysis, albeit with a number
of theoretical weaknesses from a statistical point of view. These were followed by the
crudest form of forecasting methods, the moving averages method. As an
improvement over this method, which used equal weights, exponential smoothing methods
came into being, which give more weight to recent data. Exponential smoothing methods
were initially proposed as just recursive methods, without any distributional
assumptions about the error structure in them, and later they were found to be particular
cases of the statistically sound AutoRegressive Integrated Moving Average (ARIMA)
models.
A detailed discussion regarding various TS components has been done by Croxton et al.
(1979). A good account on exponential smoothing methods is given in Makridakis et al.
(1998). A practical treatment on ARIMA modeling along with several case studies can
be found in Pankratz (1983). A reference book on ARIMA and related topics such as
intervention models with a more rigorous theoretical flavour is by Box et al. (1994).
2. Time series components
An important step in analysing TS data is to consider the types of data patterns, so that
the models most appropriate to those patterns can be utilized. Four types of TS
components can be distinguished.
They are
(i) Horizontal, when data values fluctuate around a constant value
(ii) Trend, when there is a long term increase or decrease in the data
(iii) Seasonal, when a series is influenced by seasonal factors and a pattern recurs on a
regular periodic basis
(iv) Cyclical, when the data exhibit rises and falls that are not of a fixed period
Note that many data series include combinations of the preceding patterns. After
separating out the existing patterns in any TS data, the part that remains unidentifiable
forms the 'random' or 'error' component. The time plot (data plotted over time) and the
seasonal plot (data plotted against the individual seasons in which the data were observed)
help in visualizing these patterns while exploring the data.
Trend analysis of TS data is usually done to analyse a variable over time to detect or
investigate long-term changes. Trend is the 'long-term' behaviour of a TS process, usually
in relation to the mean level. The trend of a TS may be studied because the interest lies in
the trend itself, or in order to eliminate the trend statistically so as to have insight into
other components such as periodic variations in the series. A periodic movement is one
which recurs with some degree of regularity, within a definite period. The most
frequently studied periodic movement is that which occurs within a year and which is
known as seasonal variation. Sometimes the TS data are de-seasonalized for the purpose
of making the other movements (particularly trend) more readily discernible. Climatic
conditions directly affect the production system in agriculture and hence, in turn, the
patterns of agricultural prices, and are thus primarily responsible for most of the seasonal
variations exhibited in such series.
Over a period of time, a TS is very likely to show a tendency to increase or to decrease,
otherwise termed as an upward or downward trend respectively. One should not lose
sight of the underlying factors that sometimes may cause such trends, like growth in
population, price changes etc. Technological developments and adoption patterns have
been affecting agriculture so as to increase output enormously. Not always keeping pace
with them, but induced by them, have been changes in the main variable (read here price
of agricultural commodities) under study. Not all historical series show upward trends.
Some, say, plant disease incidence, exhibit a generally downward trend. This particular
declining trend is attributable to better and more widely available advisory and extension
services or due to good government policies. An economic series may have a downward
trend because a better or cheaper substitute may be available.
Many techniques such as time plots, auto-correlation functions, box plots and scatter
plots abound for suggesting relationships with possibly influential factors. For long and
erratic series, time plots may not be helpful. Alternatives could be to go for smoothing or
averaging methods like moving averages, exponential smoothing methods etc. In fact, if
the data contains considerable error, then the first step in the process of trend
identification is smoothing.
3. Moving average methods
Before discussing any decomposition method, the method of moving averages is
discussed as this will be used for the same.
(i) Simple moving averages
A moving average (MA) is simply a numerical average of the last N data points. There
are prior MA, centered MA etc. in the TS literature. In general, the moving average at
time t, taken over N periods, is given by

Mt[1] = (Yt + Yt-1 + ... + Yt-N+1) / N

where Yt is the observed response at time t. Another way of stating the above equation is

Mt[1] = Mt-1[1] + (Yt - Yt-N) / N
At each successive time period the most recent observation is included and the farthest
observation is excluded for computing the average. Hence the name ‗moving‘ averages.
(ii) Double moving averages
The simple moving average is intended for data with a constant level and no trend. If the
data have a linear or quadratic trend, the simple moving average will be misleading.
In order to correct for the bias and develop an improved forecasting equation, the double
moving average can be calculated. To calculate this, simply treat the moving averages
Mt[1] over time as individual data points and obtain a moving average of these averages.
(iii) Centered moving averages
In order to ensure that the average is centered at the middle of the data values being
averaged, the order (or span, or term) k of the moving average (MA) is in principle required
to be an odd number. The issue of not being able to use a general (even) span can
be overcome by taking an additional 2-period moving average of the k-period moving
average, denoted by 2 x k MA. The centered MA can be thought of as a weighted MA,
where the weights of the first and last terms differ from those of the other terms. Thus, a
2 x k MA smoother is equivalent to a weighted MA of order k+1 with weights 1/k for all
observations except the first and last observations in the average, which have weights 1/2k.
For instance, if a 2 x 4 MA (i.e. a centered 4 MA) were used with quarterly data, each
quarter would be given equal weight. The ends of the moving average apply to the same
quarter in consecutive years, so each quarter receives the same weight, with the weight for
the quarter at the ends of the moving average split between the two years. It is this property
that makes the 2 x 4 MA very useful for estimating a trend-cycle in the presence of
quarterly seasonality. A slightly longer or a slightly shorter MA will still retain some
seasonal variation. An alternative to a 2 x 4 MA for quarterly data is a 2 x 8 MA or a
2 x 12 MA, which will also give equal weights to all quarters and produce a smoother fit
than the 2 x 4 MA. Other MAs tend to be contaminated by the seasonal variation, although
many times this does not seem to be particularly serious.
The following is useful in discussing weighted and centered MAs. Let Y1, Y2, Y3, Y4, Y5,
etc., up to YN, be the given TS data for N time periods. To calculate a 4-term MA for these
data, the trend-cycle at time period 3 could be calculated as either
(Y1 + Y2 + Y3 + Y4)/4 = T2.5 (say) or (Y2 + Y3 + Y4 + Y5)/4 = T3.5 (say). That is, for
period 3, whether to include two terms on the left and one on the right, or one on the left
and two on the right, is a problem. The centre of the first MA is at 2.5 (half a period early)
while the centre of the second MA is at 3.5 (half a period late). However, the average of the
two MAs is centered at 3, just where it should be. This can be done by using a centered MA,
i.e. T3 = (T2.5 + T3.5)/2 = (Y1 + 2Y2 + 2Y3 + 2Y4 + Y5)/8. Here note that the first and last
terms have weights of 1/8 and all other terms have weights of double that value, i.e. 1/4.
Therefore, a 2 x 4 MA smoother is equivalent to a weighted MA of order 5.
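As a computational check of the weighting argument above, the following sketch (in Python, assuming the numpy and pandas libraries are available; the short quarterly series is just the opening values of the sales data used in the caveat below) computes the 2 x 4 MA both as two successive rolling means and as a single weighted MA of order 5 with weights 1/8, 1/4, 1/4, 1/4, 1/8, and the two columns agree.

import numpy as np
import pandas as pd

# Quarterly sales values (taken from the caveat dataset below)
y = pd.Series([9, 50, 1, 7, 38, 68, 11, 43, 7, 41, 13, 12], dtype=float)

# 2 x 4 MA: a 4-term moving average followed by a 2-term average of those averages
ma4 = y.rolling(window=4).mean()          # average of Y[t-3], ..., Y[t]
ma2x4 = ma4.rolling(window=2).mean()      # average of two adjacent 4-term MAs

# Equivalent weighted MA of order 5 with end weights 1/(2k) = 1/8 and middle weights 1/k = 1/4
weights = np.array([1, 2, 2, 2, 1]) / 8.0
weighted = y.rolling(window=5).apply(lambda x: np.dot(x, weights))

# Both columns are reported at the trailing position and coincide wherever defined
print(pd.concat({"2x4 MA": ma2x4, "weighted MA(5)": weighted}, axis=1))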
(iv) Caveat
To illustrate a cautionary note, we will make use of a simple dataset meant to represent
sales figures for some product over five years. The data are quarterly, so the series contains
20 data points overall.
Year Quarter Sales (1000's)
1989 Q1 9
1989 Q2 50
1989 Q3 1
1989 Q4 7
1990 Q1 38
1990 Q2 68
1990 Q3 11
1990 Q4 43
1991 Q1 7
1991 Q2 41
1991 Q3 13
1991 Q4 12
1992 Q1 50
1992 Q2 36
1992 Q3 16
1992 Q4 35
1993 Q1 41
1993 Q2 64
1993 Q3 1
1993 Q4 53
Smoothed data takes out random error from so-called "noisy" data, that is, data with lots of
noise. The analogy is the signal to noise ratio in radio broadcasting, for instance. A
noisy channel is one with so much static (the noise) that it is hard to hear the program
(the signal). Likewise, a time series with lots of noise makes it hard to see a pattern,
which might be a trend.
A simple example would be a linear series with little or no noise (Figure 6). You see that
the Moving Averages always lag behind the actual data. So substituting the Moving
Average for Data Point #7 for the Actual Data for #7 would give a number that is too
low. If the trend were decreasing, the Moving Averages would usually be larger than the
actual data. The reason for the difference is that Moving Averages and Exponential
Smoothing only use data from the past, dragging down (or up) the real numbers with past
values which, in the case of a strong trend, are always higher or lower than the actual data
point.
We can solve this problem by "centering" the Moving Average on the data point that we
are substituting for. Rather than averaging the first three values to get the value for Data
Point #3, we use that same number for Data Point #2, the center of the first three
numbers. That basically shifts the Moving Average graph up, right on top of the actual
data (Figure 7). If there were more noise in that data, it would smooth it without
destroying the trend.
Notice, however, that there is a cost. Moving the data up loses the last value of the
Moving Average (#10) since there is no number after it to include in the Average. The
problem would be worse for MA5 and MA7 which would give up the last two or three
values respectively. That could be a problem because some forecasting methods use that
last data point as the jumping off point of the forecast.
The other way of handling the problem is simply not to smooth curves that have little or
no random error in them. And that makes sense. We do not remove seasonality for a
series that has little or none. So, likewise, do not try to remove noise when the trend is
greater than the random error. It just messes up what is left.
4. Classical time series decomposition methods
Decomposition methods are among the oldest approaches to TS analysis. A crude yet
practical way of decomposing the original data (including the cyclical pattern in the trend
itself) is to go for a seasonal decomposition either by assuming an additive or
multiplicative model such as
Yt = Tt + St + Et or Yt = Tt . St . Et
where Yt - Original TS data
Tt - Trend component
St – Seasonal component
Et – Error/ Irregular component
If the seasonal variation of a TS increases as the level of the series increases, then one
has to go for a multiplicative model, else an additive model. Alternatively, an additive
decomposition of the logged values of the TS can also be done. The decomposition
methods may enable one to study the TS components separately or will allow workers to
de-trend or to do seasonal adjustments if needed for further analysis. The steps involved
in above two classical decomposition methods will be discussed in detail and other
methods and developments will be stated in brief subsequently.
Earlier workers observed that when an underlying pattern exists in a data series, that
pattern can be distinguished from randomness by smoothing (averaging) past values.
This concept was exploited in the methods such as moving averages, exponential
smoothing methods to eliminate randomness so the pattern can be projected into the
future and used as forecast models. Decomposition methods usually try to identify two
separate components i.e. trend-cycle and seasonality of the basic underlying pattern that
tend to characterize economic and business series. The seasonal factor relates to periodic
fluctuations of constant length that are caused by such things as weather factors, month of
the year, timing of holidays and corporate policies. Thus the given TS data are divided as
pattern plus error.
The several approaches that exist for decomposing TS data, albeit with a theoretical
weakness from a statistical point of view, are based on one common premise: the
separation is empirical and consists of first removing the trend-cycle, then isolating the
seasonal component. Another key step in all decomposition methods involves smoothing
the original data. Any residual is taken as the 'irregular' or 'remainder' component,
identified as the difference between the combined effect of the two subpatterns of the
series and the actual data.
(i) Additive decomposition
Consider a TS which is additive in nature with seasonal period 12 such that, preserving
the earlier mentioned notations, Yt = Tt + St + Et .
The steps involved are
1. The trend cycle Tt is computed using a centered MA i.e. a 2 x 12 MA.
2. The de-trended series is computed by subtracting the trend-cycle component from the
data, leaving the seasonal and irregular terms. That is, Yt - Tt = St + Et.
3. For calculating the seasonal component, assuming that the same is constant from year
to year, gather all the de-trended values for a given month and take the average. The
seasonal component St is obtained by stringing together this set of 12 values called
seasonal indices (one for each month) repeated the same for each year of data.
4. Finally, the irregular series Et is computed by simply subtracting the estimated
seasonal and trend-cycle components from the original data series.
(ii) Multiplicative decomposition
Consider a TS with periodicity 12 (i.e. monthly data) which is multiplicative in nature in
the sense that the seasonal variation increases with the level of the series i.e. Yt = Tt * St *
Et.
The steps involved are
1. The trend cycle Tt is computed using a centered MA i.e. a 2 x 12 MA. There will be
some values missing both at the beginning and the end because of the averaging
procedure used.
2. The de-trended series is computed as the ratio of actual-to-moving averages. Thus the
resultant values Rt = Yt / Tt = St * Et isolate the remaining two components of the
TS.
3. For calculating the seasonal component, assuming that the same is constant from year
to year, gather all the de-trended values for a given month and take the average. The
seasonal component St is obtained by stringing together this set of 12 values called
seasonal indices (one for each month given in bold values in col. 5) repeated the same
for each year of data.
4. Finally, the irregular series Et is computed simply as the ratio of the original data to
the estimated seasonal and trend-cycle components, i.e. Et = Yt / (St * Tt). Thus, apart
from de-trending, the data have also been de-seasonalised.
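The four steps above can also be carried out with a library routine. The sketch below (Python; availability of pandas and statsmodels is assumed) applies statsmodels' multiplicative seasonal decomposition to the first three years of the airline series used in the illustration that follows; its trend, seasonal and residual outputs play the roles of Tt, St and Et in the steps above, apart from end effects and scaling conventions.

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# First three years of the airline passenger series used in the illustration below
y = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
               115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,
               145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166])

# Multiplicative decomposition Yt = Tt * St * Et with a 2 x 12 MA trend-cycle (period = 12)
result = seasonal_decompose(y, model="multiplicative", period=12)

print(result.trend.head(12))         # Tt: the first six values are missing, as in step 1
print(result.seasonal[:12])          # St: the twelve seasonal indices of step 3
print(result.resid.dropna().head())  # Et: the irregular component of step 4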
Illustration for Multiplicative decomposition using MSExcel
(a) Data set (International Airline passenger totals in thousands; Box et al., 1994)
1 2 3 4 5 6 7 8 9 10 11 12
1949 112 118 132 129 121 135 148 148 136 119 104 118
1950 115 126 141 135 125 149 170 170 158 133 114 140
1951 145 150 178 163 172 178 199 199 184 162 146 166
1952 171 180 193 181 183 218 230 242 209 191 172 194
1953 196 196 236 235 229 243 264 272 237 211 180 201
1954 204 188 235 227 234 264 302 293 259 229 203 229
1955 242 233 267 269 270 315 364 347 312 274 237 278
1956 284 277 317 313 318 374 413 405 355 306 271 306
(b) Graph: time plot of the 96 monthly observations (vertical axis: passenger totals in
thousands, 0 to 450; horizontal axis: observation number, 1 to 96)
(c) Multiplicative decomposition calculations
Year Month | Yt (col.1) | 12 MA (col.2) | Tt = 2x12 MA (col.3) | detrended (col.4) | St (col.5) | Et (col.6) | Tt*St*Et (col.7)
2000 1 112 -
2 118 -
3 132 -
4 129 -
5 121 -
6 135 -
7 148 126.67 126.79 116.73 119.97 0.010 148.00
8 148 126.92 127.25 116.31 118.80 0.010 148.00
9 136 127.58 127.96 106.28 105.64 0.010 136.00
10 119 128.33 128.58 92.55 92.01 0.010 119.00
11 104 128.83 129.00 80.62 79.80 0.010 104.00
12 118 129.17 129.75 90.94 90.53 0.010 118.00
2001 1 115 130.33 131.25 87.62 91.09 0.010 115.00
2 126 132.17 133.08 94.68 90.30 0.010 126.00
3 141 134.00 134.92 104.51 103.22 0.010 141.00
4 135 135.83 136.42 98.96 98.63 0.010 135.00
5 125 137.00 137.42 90.96 97.88 0.009 125.00
6 149 137.83 138.75 107.39 109.89 0.010 149.00
7 170 139.67 140.92 120.64 119.97 0.010 170.00
--- --- --- --- --- --- --- ---
5. Simple Exponential smoothing
Depending upon whether the data are horizontal, have a trend, or also have seasonality, the
following methods are employed for forecasting purposes.
For non-seasonal time series data in which there is no trend, simple exponential
smoothing (SES) method can be employed. Let the observed time series up to time
period t upon a variable Y be denoted by y1, y2, …, yt. Suppose the forecast, say, yt (1), of
the next value yt+1 of the series (which is yet to be observed) is to be found out. Given
data up to time period t-1, the forecast for the next time t is denoted by yt-1 (1). When the
observation yt becomes available, the forecast error is found to be yt − yt-1(1). Thereafter,
the method of simple or single exponential smoothing takes the forecast for the previous
period and adjusts it using the forecast error to find the forecast for the next period, i.e.
Level: yt (1)= yt -1 (1) + α [yt− yt-1 (1)]
Forecast: yt (h) = yt (1); h=2,3,…
where α is the smoothing constant (weight) which lies between 0 and 1. Here yt(h) is the
forecast for h periods ahead, i.e. the forecast of yt+h for some subsequent time period t+h
based on all the data up to time period t. It is emphasized here that this model is a
recursive one and is mathematical rather than statistical in nature. Note that the choice of
α should be based on the criterion that it yields the smallest Mean Squared Error (MSE)
value. Also, to start the iterative method, a value of the forecast for the initial period has
to be supplied by the user. However, as the weight attached to this value is small, its
effect on the ultimate forecast is negligible. Sometimes, though, the choice of α has
considerable impact on the forecast. A large value of α (say 0.9) gives very little
smoothing in the forecast, whereas a small value of α (say 0.1) gives considerable
smoothing. Alternatively, one can choose from a grid of values (say α = 0.1, 0.2, …, 0.9)
and choose the value that yields the smallest MSE value. If the above model is expanded
recursively, then yt(1) will come out to be a function of α, past yt values and the initial
forecast. So, having known values of α and past values of yt, the point of concern relates
to finding the value of this initial forecast. One method of initialization is to use the
first observed value Y1 as the first forecast and then proceed. Another possibility would
be to average the first four or five values in the data set and use this as the initial forecast.
However, because the weight attached to this user-defined one is minimal, its effect on
the ultimate forecast is negligible.
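A minimal computational sketch of this recursion (plain Python, no libraries) is given below; it uses the first observation as the initial forecast and, over a grid of α values, reproduces the one-step forecasts and MSE figures shown in Table 1.

def ses_forecasts(y, alpha):
    """Simple exponential smoothing: f[t] is the one-step-ahead forecast of y[t]."""
    f = [y[0]]                                   # the first forecast is the first observation
    for t in range(1, len(y)):
        f.append(f[-1] + alpha * (y[t - 1] - f[-1]))
    return f

# Sugar price data of Table 1 (t = 1, ..., 11)
prices = [1350, 1350, 1320, 1350, 1300, 1250, 1300, 1400, 1325, 1325, 1360]

for alpha in (0.1, 0.2, 0.5, 0.9):
    f = ses_forecasts(prices, alpha)
    next_forecast = f[-1] + alpha * (prices[-1] - f[-1])      # forecast for t = 12
    mse = sum((obs - fc) ** 2 for obs, fc in zip(prices, f)) / len(prices)
    print(f"alpha = {alpha}: forecast(t=12) = {next_forecast:.2f}, MSE = {mse:.2f}")

For α = 0.1, for instance, this gives a forecast of 1337.24 and an MSE of about 1693, matching the α = 0.1 column of Table 1.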
The illustration for SES is given below using MSExcel:
Table 1: Illustration for Single Exponential Smoothing (SES) model - MSExcel
Monthly wholesale prices (in rupees per quintal) - U.P. - Hapur - Sugar variety C-29 (12-month period Jan 1996 - Dec 1996)
Forecasts with varying values of alpha
T yt 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
1 1350 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00
2 1350 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00
3 1320 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00 1350.00
4 1350 1347.00 1344.00 1341.00 1338.00 1335.00 1332.00 1329.00 1326.00 1323.00
5 1300 1347.30 1345.20 1343.70 1342.80 1342.50 1342.80 1343.70 1345.20 1347.30
6 1250 1342.57 1336.16 1330.59 1325.68 1321.25 1317.12 1313.11 1309.04 1304.73
7 1300 1333.31 1318.93 1306.41 1295.41 1285.63 1276.85 1268.93 1261.81 1255.47
8 1400 1329.98 1315.14 1304.49 1297.24 1292.81 1290.74 1290.68 1292.36 1295.55
9 1325 1336.98 1332.11 1333.14 1338.35 1346.41 1356.30 1367.20 1378.47 1389.55
10 1325 1335.79 1330.69 1330.70 1333.01 1335.70 1337.52 1337.66 1335.69 1331.46
11 1360 1334.71 1329.55 1328.99 1329.80 1330.35 1330.01 1328.80 1327.14 1325.65
12 ? 1337.24 1335.64 1338.29 1341.88 1345.18 1348.00 1350.64 1353.43 1356.56
Y12= 1325
MSE calc.
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
900 900 900 900 900 900 900 900 900
9 36 81 144 225 324 441 576 729
2237.29 2043.04 1909.69 1831.84 1806.25 1831.84 1909.69 2043.04 2237.29
8569.20 7423.55 6494.75 5727.46 5076.56 4505.09 3982.87 3485.72 2995.37
1109.76 358.27 41.13 21.09 206.64 536.02 965.16 1458.63 1982.65
4902.56 7200.81 9122.33 10558.63 11489.16 11937.92 11950.88 11586.03 10910.37
143.60 50.61 66.30 178.14 458.23 979.42 1781.18 2859.29 4167.31
116.32 32.39 32.49 64.13 114.56 156.71 160.31 114.37 41.67
639.75 927.03 961.63 911.75 879.03 899.56 973.54 1079.85 1180.23
MSE 1693 1725 1783 1849 1923 2006 2097 2191 2286
6. Forecasting using simple decomposition methods
In the class, we will discuss how time series decomposition can be used for
forecasting values for future time periods by means of MSExcel, as given below:
Quarter | t | Data | Seasonal average | Deseasonalised data | Smoothed data (0.8) | Trend (Y = 39.3 + 4.6X) | Deviation (Data - Trend)
1989 Q1 0 9 29 31 31 39 -8.6
1989 Q2 1 50 52 96 44 44 -0.1
1989 Q3 2 1 8 10 37 49 -11.6
1989 Q4 3 7 30 22 34 53 -19.1
1990 Q1 4 38 29 131 53 58 -4.4
1990 Q2 5 68 52 132 69 62 6.7
1990 Q3 6 11 8 134 82 67 15.1
1990 Q4 7 43 30 142 94 72 22.5
1991 Q1 8 7 29 26 80 76 4.2
1991 Q2 9 41 52 79 80 81 -0.6
1991 Q3 10 13 8 153 95 85 9.4
1991 Q4 11 12 30 41 84 90 -6.1
1992 Q1 12 50 29 173 102 95 7.0
1992 Q2 13 36 52 70 95 99 -4.0
1992 Q3 14 16 8 188 114 104 9.8
1992 Q4 15 35 30 118 115 109 6.0
1993 Q1 16 41 29 140 120 113 6.5
1993 Q2 17 64 52 123 120 118 2.6
1993 Q3 18 1 8 15 99 122 -23.1
1993 Q4 19 53 30 177 115 127 -12.1
1994 Q1 20 36 29 124 124 132 -8.0
1994 Q2 21 69 52 132 132 136 -4.0
1994 Q3 22 11 8 141 141 141 0.0
1994 Q4 23 44 30 145 145 145 0.0
Notes for the 1994 (forecast) rows: Data = smoothed data x seas avg/100; Deseasonalised data: no change; Smoothed data = sum of trend and deviation; Trend: trend equation extended for the new values; Deviation: average of the last 6 points.
7. Other Exponential smoothing methods
If the data have a trend, or also have seasonality, the following methods are employed for
forecasting purposes.
(i) Double exponential smoothing (Holt) (For illustration see Table 2)
The above SES method was then extended to linear exponential smoothing, which has been
employed to allow for forecasting non-seasonal time series data with trends. The forecast
for Holt's linear exponential smoothing is found by having two equations to deal with,
one for the level and one for the trend. The forecast is found using two smoothing constants,
α and β (with values between 0 and 1), and three equations:
Level: lt = α yt +(1− α)(lt−1 + bt−1),
Trend: bt = β(lt − lt−1)+(1− β)bt−1,
Forecast: yt(h)= lt + bth.
Here lt denotes the level of the series at time t and bt denotes the (additive) trend of the
series at time t. The optimal combination of smoothing parameters α and β should be
chosen by minimizing the MSE over the observations of the model data set.
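A short sketch of these recursions in plain Python follows. Note that Table 2 initializes b0 by fitting yt = a0 + b0 t to the data, whereas, purely for brevity, this sketch starts from the cruder values l1 = y1 and b1 = y2 − y1 (an assumption), so its numbers will differ slightly from the table.

def holt_forecasts(y, alpha, beta, h=6):
    """Holt's linear (double) exponential smoothing; returns forecasts for h periods ahead."""
    level, trend = y[0], y[1] - y[0]          # crude start-up values (see the note above)
    for t in range(1, len(y)):
        prev_level = level
        level = alpha * y[t] + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + trend * k for k in range(1, h + 1)]

# First twelve observations of the sugar price series in Table 2
prices = [1275, 1315, 1325, 1400, 1420, 1400, 1400, 1430, 1450, 1450, 1450, 1420]
print(holt_forecasts(prices, alpha=0.1, beta=0.2, h=6))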
(ii) Triple exponential smoothing (Winters) (For illustration see Tables 3 (a) & (b))
If the data have no trend or seasonal patterns, then SES is appropriate. If the data exhibit
a linear trend, Holt's method is appropriate. But if the data are seasonal, these methods,
on their own, cannot handle the problem well. Holt's method was later extended to
capture seasonality directly. The Winters' method is based on three smoothing
equations, one for the level, one for the trend, and one for seasonality. It is similar to Holt's
method, with one additional equation to deal with seasonality. In fact there are two
different Winters' methods, depending on whether seasonality is modelled in an additive
or multiplicative way.
The basic equations for Winters' multiplicative method are as follows.
Level: lt = α yt / st−m + (1− α)(lt−1 + bt−1)
Trend: bt = β (lt − lt−1) + (1− β) bt−1
Seasonal: st = γ yt / (lt−1 + bt−1) + (1− γ) st−m
Forecast: yt(h) = (lt + bt h) st−m+h
where m is the length of seasonality (e.g., the number of months or 'seasons' in a year),
lt represents the level of the series, bt denotes the trend of the series at time t, st is the
seasonal component, and yt(h) is the forecast for h periods ahead. As with all exponential
smoothing methods, we need initial values of the components and parameter values.
The basic equations for Winters' additive method are as follows.
Level: lt = α (yt − st−m) + (1− α)(lt−1 + bt−1)
Trend: bt = β (lt − lt−1) + (1− β) bt−1
Seasonal: st = γ (yt − lt−1 − bt−1) + (1− γ) st−m
Forecast: yt(h) = (lt + bt h) + st−m+h
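Rather than hand-coding the three recursions, one can use the Holt-Winters implementation in statsmodels (its availability is assumed). The sketch below fits Winters' multiplicative method to the quarterly French exports data of Table 3(a) and forecasts the next four quarters; because the library optimizes α, β and γ rather than fixing them at 0.1, 0.1 and 0.2, its numbers will not coincide exactly with the table.

import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Quarterly exports (thousands of francs) over six years, from Table 3(a)
y = pd.Series([362, 385, 432, 341, 382, 409, 498, 387, 473, 513, 582, 474,
               544, 582, 681, 557, 628, 707, 773, 592, 627, 725, 854, 661])

# Winters' method: additive trend, multiplicative seasonality, season length m = 4
model = ExponentialSmoothing(y, trend="add", seasonal="mul", seasonal_periods=4)
fit = model.fit()                       # alpha, beta, gamma chosen by optimization
print(fit.params)                       # fitted smoothing parameters and initial states
print(fit.forecast(4))                  # forecasts for quarters 25 to 28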
8. ARIMA models
8.1 Stationarity of a TS process
A TS is said to be stationary if its underlying generating process is based on a constant
mean and constant variance with its autocorrelation function (ACF) essentially constant
through time. Thus, if we consider different subsets of a realization (a TS 'sample'), the
different subsets will typically have means, variances and autocorrelation functions that
do not differ significantly.
The most widely used statistical test for stationarity is the Dickey-Fuller test. To carry out
the test, estimate by OLS the regression model

y't = φ yt-1 + b1 y't-1 + … + bp y't-p

where y't denotes the differenced series (yt − yt-1). The number of terms in the regression,
p, is usually set to be about 3. Then if φ is nearly zero the original series yt needs
differencing, and if φ < 0 then yt is already stationary.
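In practice the regression is rarely set up by hand; statsmodels ships an augmented Dickey-Fuller routine, used in the sketch below (library availability assumed; the series passed in is whatever series is under study).

from statsmodels.tsa.stattools import adfuller

def check_stationarity(series, signif=0.05):
    """Augmented Dickey-Fuller test; H0: the series has a unit root (is non-stationary)."""
    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(series, autolag="AIC")
    print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}, lags used = {usedlag}")
    if pvalue < signif:
        print("Reject H0: the series appears stationary.")
    else:
        print("Fail to reject H0: consider differencing the series and testing again.")

# check_stationarity(y)   # y: a list or pandas Series of observations on the study variable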
8.2 Autocorrelation functions
(i) Autocorrelation
Autocorrelation refers to the way the observations in a TS are related to each other and is
measured by the simple correlation between the current observation (Yt) and the observation
from p periods before the current one (Yt−p). That is, for a given series Yt, the
autocorrelation at lag p is the correlation between the pair (Yt, Yt−p) and is given by

rp = Σ(t=1 to n−p) (Yt − Ȳ)(Yt+p − Ȳ) / Σ(t=1 to n) (Yt − Ȳ)2

where Ȳ denotes the mean of the series. It ranges from −1 to +1. Box and Jenkins have
suggested that the maximum number of useful rp is roughly N/4, where N is the number of
periods upon which information on Yt is available.
(ii) Partial autocorrelation
Partial autocorrelations are used to measure the degree of association between y t and y t-p
when the y-effects at other time lags 1,2,3,…,p-1 are removed.
(iii) Autocorrelation function(ACF) and partial autocorrelation function(PACF)
Theoretical ACFs and PACFs (Autocorrelations versus lags) are available for the various
models chosen (say, see Pankratz, 1983) for various values of orders of autoregressive
and moving average components i.e. p and q. Thus compare the correlograms (plot of
sample ACFs versus lags) obtained from the given TS data with these theoretical
ACF/PACFs, to find a reasonably good match and tentatively select one or more ARIMA
models. The general characteristics of theoretical ACFs and PACFs are as follows:-
(here 'spike' represents the line at various lags in the plot, with length equal to the
magnitude of the autocorrelation)
Model ACF PACF
AR Spikes decay towards zero Spikes cutoff to zero
MA Spikes cutoff to zero Spikes decay to zero
ARMA Spikes decay to zero Spikes decay to zero
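Sample correlograms of the kind compared against these theoretical patterns can be produced with statsmodels (assumed available); the sketch below uses roughly N/4 lags, following the Box-Jenkins suggestion quoted earlier.

import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

def correlograms(y):
    """Plot the sample ACF and PACF of a series with about N/4 lags."""
    lags = max(1, len(y) // 4)
    fig, axes = plt.subplots(2, 1, figsize=(8, 6))
    plot_acf(y, lags=lags, ax=axes[0])       # spikes outside the confidence band are significant
    plot_pacf(y, lags=lags, ax=axes[1])
    plt.tight_layout()
    plt.show()

# correlograms(y)   # y: the (suitably differenced) series under study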
8.3 Description of ARIMA representation
(i) ARIMA modeling (For illustration see Table 4)
In general, an ARIMA model is characterized by the notation ARIMA (p,d,q) where p, d
and q denote the orders of auto-regression, integration (differencing) and moving average
respectively. In ARIMA parlance, the TS is a linear function of past actual values and
random shocks. For instance, given a TS process {yt}, a first order auto-regressive
process is denoted by ARIMA (1,0,0) or simply AR(1) and is given by

yt = μ + φ1 yt-1 + εt

and a first order moving average process is denoted by ARIMA (0,0,1) or simply MA(1)
and is given by

yt = μ − θ1 εt-1 + εt

Alternatively, the model ultimately derived may be a mixture of these processes and of
higher orders as well. Thus a stationary ARMA (p, q) process is defined by the equation

yt = φ1 yt-1 + φ2 yt-2 + … + φp yt-p − θ1 εt-1 − θ2 εt-2 − … − θq εt-q + εt

where the εt's are independently and normally distributed with zero mean and constant
variance σ2 for t = 1, 2, ..., n. Note here that the values of p and q in practice lie between
0 and 3. The degree of differencing of the main variable yt will be discussed in section 8.4 (i).
(ii) Seasonal ARIMA modeling
Identification of relevant models and inclusion of suitable seasonal variables are
necessary for seasonal models. The Seasonal ARIMA i.e. ARIMA (p,d,q) (P,D,Q)s
model is defined by
φp(B) ΦP(Bs) (1 − B)d (1 − Bs)D yt = θq(B) ΘQ(Bs) εt
where
φp(B) = 1 − φ1B − … − φpBp,   θq(B) = 1 − θ1B − … − θqBq
ΦP(Bs) = 1 − Φ1Bs − … − ΦPBPs,   ΘQ(Bs) = 1 − Θ1Bs − … − ΘQBQs
B is the backshift operator (i.e. B yt = yt-1, B2 yt = yt-2 and so on), s is the seasonal lag
and εt a sequence of independent normal error variables with mean 0 and variance σ2.
The Φ's and φ's are respectively the seasonal and non-seasonal autoregressive parameters,
and the Θ's and θ's are respectively the seasonal and non-seasonal moving average
parameters. p and q are the orders of the non-seasonal autoregressive and moving average
components respectively, whereas P and Q are those of the seasonal autoregressive and
moving average components respectively. Also, d and D denote the orders of non-seasonal
and seasonal differencing respectively.
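In statsmodels the full ARIMA (p,d,q) (P,D,Q)s specification maps directly onto the SARIMAX class (assumed available). The orders used below, (0,1,1)(0,1,1)12, are illustrative placeholders only; in practice they come out of the identification stage described next.

import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# First three years of the airline series from the illustration in section 4
y = pd.Series([112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
               115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140,
               145, 150, 178, 163, 172, 178, 199, 199, 184, 162, 146, 166])

# Seasonal ARIMA (0,1,1)(0,1,1)12: one non-seasonal and one seasonal MA term, d = D = 1, s = 12
model = SARIMAX(y, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12))
results = model.fit(disp=False)
print(results.summary())     # theta and Theta estimates together with AIC and BIC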
8.4 Model building
(i) Identification
The foremost step in the process of modeling is to check for the stationarity of the
series, as the estimation procedures are available only for stationary series. There are two
kinds of stationarity, viz., stationarity in 'mean' and stationarity in 'variance'. A cursory
look at the graph of the data and the structure of the autocorrelation and partial
autocorrelation coefficients may provide clues about the presence of stationarity. Another
way of checking for stationarity is to fit a first order autoregressive model to the raw data
and test whether the coefficient φ1 is less than one. If the model is found to be
non-stationary, stationarity can mostly be achieved by differencing the series. Alternatively,
a Dickey-Fuller test can be used (see section 8.1). This is applicable for both seasonal and
non-seasonal stationarity.
Thus, if Xt denotes the original series, the non-seasonal difference of first order is
Yt = Xt – Xt-1
followed by the seasonal differencing (if needed)
Zt = Yt – Yt-s = (Xt – Xt-1) – (Xt-s – Xt-s-1)
The next step in the identification process is to find the initial values for the orders of the
seasonal and non-seasonal parameters, p, q, and P, Q. They could be obtained by looking
for significant autocorrelation and partial autocorrelation coefficients (see section 8.2 (iii)).
Say, if the second order autocorrelation coefficient is significant, then an AR(2), or MA(2)
or ARMA(2) model could be tried to start with. This is not a hard and fast rule, as
sample autocorrelation coefficients are poor estimates of population autocorrelation
coefficients. Still, they can be used as initial values, while the final models are achieved
after going through the stages repeatedly. Note that usually orders up to 2 for p, d, or q are
sufficient for developing a good model in practice.
(ii) Estimation
At the identification stage one or more models are tentatively chosen that seem to provide statistically adequate representations of the available data. Then we attempt to obtain precise estimates of the parameters of the model by least squares, as advocated by Box and Jenkins. Standard computer packages like SAS, SPSS etc. are available for finding the estimates of the relevant parameters using iterative procedures. The methods of estimation are not discussed here for brevity.
(iii) Diagnostics
Different models can be obtained for various combinations of AR and MA individually
and collectively. The best model is obtained with following diagnostics.
(a) Low Akaike Information Criteria (AIC)/ Bayesian Information Criteria (BIC)/
Schwarz-Bayesian Information Criteria (SBC)
AIC is given by
AIC = −2 log L + 2m
where m = p + q + P + Q and L is the likelihood function. Since −2 log L is approximately
equal to {n(1 + log 2π) + n log σ2}, where σ2 is the model MSE, the AIC can be written as
AIC = n(1 + log 2π) + n log σ2 + 2m
and because the first term in this equation is a constant, it is usually omitted while
comparing between models. As an alternative to AIC, sometimes SBC is also used, which
is given by
SBC = log σ2 + (m log n)/n.
(b) Plot of residual ACF
Once the appropriate ARIMA model has been fitted, one can examine the goodness of fit
by means of plotting the ACF of residuals of the fitted model. If most of the sample
autocorrelation coefficients of the residuals are within the limits ±1.96/√N, where N is
the number of observations upon which the model is based, then the residuals are white
noise, indicating that the model is a good fit.
(c) Non-significance of autocorrelations of residuals via portmanteau tests (Q-tests
based on Chi-square statistics) - Box-Pierce or Ljung-Box tests
After a tentative model has been fitted to the data, it is important to perform diagnostic checks to test the adequacy of the model and, if need be, to suggest potential improvements. One way to accomplish this is through the analysis of residuals. It has been found effective to measure the overall adequacy of the chosen model by examining a quantity Q, known as the Box-Pierce statistic (a function of the autocorrelations of the residuals), whose approximate distribution is chi-square and which is computed as follows:
Q = n Σ r2(j)
where the summation extends from 1 to k, with k the maximum lag considered, n the
number of observations in the series and r(j) the estimated autocorrelation at lag j; k can
be any positive integer and is usually around 20. Q follows a chi-square distribution with
(k − m1) degrees of freedom, where m1 is the number of parameters estimated in the
model. A modified Q statistic is the Ljung-Box statistic, which is given by
Q = n(n+2) Σ r2(j) / (n − j)
The Q statistic is compared to critical values from the chi-square distribution. If the model
is correctly specified, the residuals should be uncorrelated and Q should be small (the
probability value should be large). A significant value indicates that the chosen model
does not fit well.
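Both the residual-ACF band and the Ljung-Box Q test are readily computed with numpy and statsmodels (assumed available; `results` below stands for any fitted ARIMA/SARIMAX results object, such as the one in the sketch of section 8.3).

import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

def residual_diagnostics(results, k=20, n_params=0):
    """White-noise checks on the residuals of a fitted ARIMA-type model."""
    resid = results.resid
    band = 1.96 / np.sqrt(len(resid))       # +/- 1.96/sqrt(N) limits for the residual ACF
    print(f"Residual ACF significance band: +/- {band:.3f}")
    # model_df reduces the degrees of freedom by the number of estimated parameters (k - m1)
    lb = acorr_ljungbox(resid, lags=[k], model_df=n_params)
    print(lb)                                # Ljung-Box Q and its p-value at lag k
    # A large p-value (uncorrelated residuals) supports the adequacy of the chosen model.

# residual_diagnostics(results, k=20, n_params=2)   # e.g. after fitting an ARMA(1,1) model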
All these stages require considerable care and work and they themselves are not
exhaustive.
Table 2: Illustration for Double Exponential Smoothing model (Holt) using
MSExcel calculations
Monthly wholesale prices (in rupees per quintal) -U.P.-Hapur-Sugar variety C-29 (24
months period 1997Apr - 1999Mar)
Forecasts with values of alpha & beta as 0.1 & 0.2 respectively
Year Month t yt h Ft bt
1997 4 1 1275 1275 5.094*
* for t=0, bt is obtained by fitting yt = a0 + b0 t
5 2 1315 1283.58 5.79
6 3 1325 1292.94 6.50
7 4 1400 1309.50 8.52
8 5 1420 1328.21 10.56
9 6 1400 1344.89 11.78
10 7 1400 1361.00 12.65
11 8 1430 1379.29 13.77
12 9 1450 1398.75 14.91
1998 1 10 1450 1417.30 15.64
2 11 1450 1434.64 15.98
3 12 1420 1447.56 15.37
4 13 1450 1461.64 15.11
5 14 1450 1474.07 14.57
6 15 1460 1485.78 14.00
7 16 1400 1489.80 12.01
8 17 1400 1491.63 9.97
9 18 1465 1497.94 9.24
10 19 1475 1503.96 8.59
11 20 1500 1511.30 8.34
12 21 1455 1513.18 7.05
1999 1 22 1515 1519.70 6.95
2 23 1465 1520.48 5.71
3 24 1400 1513.58 3.19
4 25 1460 1 1516.77
5 26 1440 2 1519.95
6 27 1515 3 1523.14
7 28 1525 4 1526.33
8 29 1530 5 1529.52
9 30 1525 6 1532.71
Table 3 (a): Illustration for Triple Exponential Smoothing model (Winters) using
MSExcel calculations
Quarterly exports (in thousands of francs) of a French company over a six year
period (Makridakis et al. 1998; page 162)
alpha beta gamma
0.1 0.1 0.2
year quarter t yt lead lt bt st
1 1 1 362 362.00 17.83** 0.96*
** the initial bt value is obtained as b0 by fitting a simple linear regression of yt
on t i.e. yt = a0 + b0 t
2 2 385 385.00 18.35 1.02*
3 3 432 432.00 21.21 1.14*
4 4 341 341.00 9.99 0.87*
*Refer Table 3 (b)
2 1 5 382 355.77 10.47 0.98
2 6 409 369.74 10.82 1.04
3 7 498 386.30 11.39 1.17
4 8 387 402.16 11.84 0.89
3 1 9 473 420.67 12.51 1.02
2 10 513 439.24 13.11 1.07
3 11 582 456.80 13.56 1.19
4 12 474 476.31 14.15 0.92
4 1 13 544 494.98 14.60 1.03
2 14 582 513.12 14.96 1.08
3 15 681 532.29 15.38 1.21
4 16 557 553.63 15.98 0.94
5 1 17 628 573.36 16.35 1.05
2 18 707 596.03 16.98 1.11
3 19 773 615.42 17.22 1.22
4 20 592 632.55 17.21 0.94
6 1 21 627 644.61 16.70 1.03
2 22 725 660.73 16.64 1.10
3 23 854 679.47 16.85 1.23
4 24 661 697.24 16.94 0.94
25 ? 1 736.62
26 ? 2 813.27
27 ? 3 1000.71
28 ? 4 940.03
Table 3 (b): Illustration for ‘seasonals’ used in Triple Exponential Smoothing model
(Winters) using MSExcel calculations
year quarter t yt 4MA 2x4MA yt/(2x4MA) Sea.index
1 1 1 362 0.96
2 2 385 1.02
3 3 432 382.50 1.13 1.14
4 4 341 380.00 388.00 0.88 0.87
2 1 5 382 385.00 399.25 0.96
2 6 409 391.00 413.25 0.99
3 7 498 407.50 430.38 1.16
4 8 387 419.00 454.75 0.85
3 1 9 473 441.75 478.25 0.99
2 10 513 467.75 499.63 1.03
3 11 582 488.75 519.38 1.12
4 12 474 510.50 536.88 0.88
4 1 13 544 528.25 557.88 0.98
2 14 582 545.50 580.63 1.00
3 15 681 570.25 601.50 1.13
4 16 557 591.00 627.63 0.89
5 1 17 628 612.00 654.75 0.96
2 18 707 643.25 670.63 1.05
3 19 773 666.25 674.88 1.15
4 20 592 675.00 677.00 0.87
6 1 21 627 674.75 689.38 0.91
2 22 725 679.25 708.13 1.02
3 23 854 699.50
4 24 661 716.75
Table 4: ARIMA model upon cotton crop yield (in kg/ha) for Maharashtra state
using SPSS package
1970-71 31 1987-88 100
1971-72 69 1988-89 89
1972-73 75 1989-90 143
1973-74 77 1990-91 117
1974-75 112 1991-92 72
1975-76 57 1992-93 124
1976-77 67 1993-94 180
1977-78 93 1994-95 154
1978-79 89 1995-96 155
1979-80 111 1996-97 173
1980-81 81 1997-98 95
1981-82 92 1998-99 139
1982-83 103 1999-00 162
1983-84 52 2000-01 100
1984-85 93 2001-02 147
1985-86 123 2002-03 158
1986-87 56 2003-04 189
SPSS Output
[ACF and PACF correlograms of the series for lags 1 to 8, showing the sample coefficients with upper and lower confidence limits, are not reproduced here.]
Parameter Estimates
                          Estimates   Std Error      T      Approx Sig
Non-Seasonal Lags  AR1      -.613       .172       -3.558     .001
                   AR2      -.499       .173       -2.880     .008
Constant                    3.169      2.819        1.124     .271
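For comparison, the sketch below refits an AR(2)-type model to the same cotton yield series using statsmodels (assumed available). The exact SPSS settings are not reported above; the negative AR coefficients and the small constant suggest the model may have been fitted to the first-differenced yields, so the sketch adopts that reading as an assumption, and its estimates need not match the SPSS table exactly.

import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Cotton yield (kg/ha) for Maharashtra, 1970-71 to 2003-04 (Table 4)
yield_kg = pd.Series([31, 69, 75, 77, 112, 57, 67, 93, 89, 111, 81, 92, 103, 52,
                      93, 123, 56, 100, 89, 143, 117, 72, 124, 180, 154, 155, 173,
                      95, 139, 162, 100, 147, 158, 189])

# Assumption: an AR(2) model with constant on the first-differenced series, i.e. ARIMA(2,1,0)
diff = yield_kg.diff().dropna()
results = ARIMA(diff, order=(2, 0, 0), trend="c").fit()
print(results.summary())     # compare the ar.L1, ar.L2 and const rows with the SPSS estimates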
9. Intervention based ARIMA models
Every time series modelling approach has its own advantages and limitations. Time
series intervention models have advantages in certain situations. Sometimes, it may be
known that certain exceptional external events called 'interventions' could affect the time
series under study. Under such circumstances, 'transfer function' models may be used to
account for the effects of the intervention event on the series, but wherein the input series
(apart from the main variable) will be in the form of a simple indicator variable indicating
the presence or absence of the event. Here transfer function modeling refers to
accounting for the dynamic relationship between two time series Yt and Xt (the latter is
the input series) wherein past values of both series may be used in forecasting Yt, leading
to a considerable reduction in the errors of the forecast.
Intervention analysis or event study is used to assess the impact of a special event on the
time series of interest. Alternatively, intervention analysis may be undertaken to adjust
for any unusual values in the series Yt that might have resulted as a consequence of the
intervention event. This will ensure that the results of the time series analysis of the
series, such as the structure of the fitted model, estimates of model parameters, and
forecasts of future values, are not seriously distorted by the influence of these unusual
values. In agriculture, intervention may occur due to introduction of new variety, new
environmental regulations, economic policy changes, strikes, special promotion
campaigns, natural disaster etc. There are broadly three kinds of intervention viz., step,
pulse and ramp.
In its simplest form, intervention analysis itself may be regarded as a generalization of
the two-sample problem (corresponding to pre and post intervention periods) to the case
where the error or noise term is autocorrelated rather than independent. It is well known
that the usual two-sample procedures are not robust against alternatives involving
autocorrelation. Moreover, in many intervention analysis applications, time series data
may be expensive or otherwise difficult to collect. In such cases, 'power functions' are
helpful, because they can be used to determine the probability that a proposed
intervention analysis application will detect a meaningful change. Power is the statistical
term used for the probability that a test will reject the null hypothesis of no change at a
given level of significance for a prescribed change. McLeod and Vingilis (2005) have
suggested power computation methods for use with time series analyses for certain
cases of intervention analysis. They have also shown that the power function helps to
compute the sample size required for intervention analysis.
9.1 Time Series Intervention Model:
An intervention model is a time series model which is used to explore the impact of
external factors on the series. Suppose that Yt is a time series; then the ARIMA (p, d, q)
model is written as:
(1 − B)d Yt = φ1 (1 − B)d Yt-1 + … + φp (1 − B)d Yt-p + εt − θ1 εt-1 − … − θq εt-q

where
φ = autoregressive parameter
θ = moving average parameter
d = degree of differencing
B = backshift operator
εt = white noise
Let φ(B) = (1 − φ1B − φ2B2 − … − φpBp) and θ(B) = (1 − θ1B − θ2B2 − … − θqBq).
Now the ARIMA model can be written as

(1 − B)d Yt = [θ(B) / φ(B)] εt
Sometimes the time series depends on the season, i.e. Yt depends on Yt-s, Yt-2s, etc., where
s is the length of the periodicity. Linearly this relation may be represented as

(1 − Bs)D Yt = Φ1 (1 − Bs)D Yt-s + … + ΦP (1 − Bs)D Yt-Ps + εt − Θ1 εt-s − … − ΘQ εt-Qs

where
Φ = seasonal autoregressive parameter
Θ = seasonal moving average parameter
D = degree of seasonal differencing
Let Φ(Bs) = (1 − Φ1Bs − Φ2B2s − … − ΦPBPs) and Θ(Bs) = (1 − Θ1Bs − Θ2B2s − … − ΘQBQs).
Now the model can be written as

(1 − Bs)D Yt = [Θ(Bs) / Φ(Bs)] εt

The 'seasonal non-seasonal multiplicative' model with the above notation can be written as

(1 − B)d (1 − Bs)D Yt = [θ(B) Θ(Bs) / (φ(B) Φ(Bs))] εt
The time series input-output model is of the form

Yt = δ1 Yt-1 + … + δr Yt-r + ω0 Xt-b + ω1 Xt-b-1 + … + ωs Xt-b-s + Nt

where
Xt = exogenous (input) variable
b = delay parameter
Nt = error term
Let δ(B) = (1 − δ1B − … − δrBr) and ω(B) = (ω0 + ω1B + … + ωsBs).
Then the input-output model can be written as

Yt = [ω(B) Bb / δ(B)] Xt + Nt
When Nt is generated as an ARIMA process then the input-output model is known as
transfer function model. The ARIMA model can be extended to Seasonal ARIMA model
in a straightforward manner as shown above.
Intervention models are special cases of transfer function modeling in which the
exogenous variable is a deterministic categorical variable. Accordingly, an intervention
model with Seasonal ARIMA process can be written as
Yt = [ω(B) Bb / δ(B)] It + [θ(B) Θ(Bs) / (φ(B) Φ(Bs))] εt

where
It = dummy (intervention) variable
The term ω(B) Bb / δ(B) is called the intervention component, and the model may be
extended to include several intervention components and thereby account for several types
of interventions that influence the process.
9.2. Input variables:
In general, there are three types of functions for the intervention variable. The intervention
type of step function starts from a given time and lasts till the last time period.
Mathematically, the intervention type of step function is written as:
It = 0, if t < T
   = 1, if t ≥ T
where T is the time at which the intervention first occurred.
The pulse function is the intervention type that occurs only at a particular period of time.
Mathematically, the intervention type of pulse function is usually written as:
It = 0, if t ≠ T
   = 1, if t = T
Apart from pulse and step functions, there is another kind of function known as the ramp
function. Mathematically, the intervention type of ramp function is usually written as:
It = 0, if t < T
   = t − T + 1, if t ≥ T
To fix ideas, the illustration is given with the help of the pulse function.
Indicator coding for the pulse, step and ramp functions is given in Table 3.1.
Table 3.1: Values of the intervention variable under different functions (when the
intervention occurred at the 6th time point)
Time t Pulse function It Step function It Ramp function It
1 0 0 0
2 0 0 0
3 0 0 0
4 0 0 0
5 0 0 0
6 1 1 1
7 0 1 2
8 0 1 3
9 0 1 4
10 0 1 5
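Constructing these indicator variables is straightforward; the short Python sketch below reproduces the three columns of Table 3.1 for an intervention occurring at the 6th time point.

def pulse(n, T):
    """Pulse indicator: 1 only at the intervention time T (1-based), 0 elsewhere."""
    return [1 if t == T else 0 for t in range(1, n + 1)]

def step(n, T):
    """Step indicator: 0 before T, 1 from T onwards."""
    return [1 if t >= T else 0 for t in range(1, n + 1)]

def ramp(n, T):
    """Ramp indicator: 0 before T, then 1, 2, 3, ... from T onwards."""
    return [t - T + 1 if t >= T else 0 for t in range(1, n + 1)]

# Reproduce Table 3.1: ten time points, intervention at the 6th
for t, p, s, r in zip(range(1, 11), pulse(10, 6), step(10, 6), ramp(10, 6)):
    print(t, p, s, r)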
9.3 Graphical representation of input and various types of outputs
When the input variable takes the form of step and pulse functions, their forms can be
graphically represented as shown below in Fig-9.1 and Fig-9.2 respectively.
Fig-9.1 (step input St(T))                    Fig-9.2 (pulse input Pt(T))
Output
ω B St(T)                                     [ω1 B / (1 − δB)] Pt(T)
Fig-9.3                                       Fig-9.4
In Fig-9.3, when an input step intervention has occurred, the output pattern can be
described by a single parameter ω and the response is abrupt-permanent; in Fig-9.4,
when a pulse intervention has occurred, the output pattern can be described by two
parameters ω1 and δ and the response due to the intervention is abrupt-temporary.
[ω B / (1 − δB)] St(T)                        [ω1 B / (1 − δB) + ω2 B / (1 − B)] Pt(T)
Fig-9.5                                       Fig-9.6
In Fig-9.5 a step intervention has occurred and the response type is gradual-permanent;
in Fig-9.6 a pulse intervention has occurred and the response to the input can be described
by two intervention components, one component explaining the abrupt temporary effect
and the other explaining the permanent effect.
[ω B / (1 − B)] St(T)                         [ω0 + ω1 B / (1 − δB) + ω2 B / (1 − B)] Pt(T)
Fig-9.7                                       Fig-9.8
In Fig-9.7 step intervention has occurred and the response type is gradual increasing and
in Fig-9.8 pulse intervention has occurred and the response type can be explained by
three different intervention components.
9.4 Intervention model of null order pulse function:
The Intervention Type Null Order Pulse Function can be written as:
Yt = ω It + Nt
with
Yt = Response variable at t
ω = Intervention effect on Yt
It = Intervention variable
Nt = ARIMA in pre- intervention data period
In the above equation, the effect of I on Y is assumed to act through a single intervention
element. The estimated value of ω can be used to estimate the difference between the two
periods before and after the occurrence of the intervention. In general, the effects of I on Y
are categorized as temporary, gradual, permanent, or occurring after a delay of a certain time.
9.5 Intervention model of first order pulse function:
Null order pulse function occurs when the δ parameter is zero. An alternative approach,
which accommodates a gradual sort of effect, is denoted as the first order pulse function.
If the intervention component is denoted as

Yt* = Yt − Nt

then an additional parameter δ is needed to define f(It) as follows:

Yt* = f(It) = [ω B / (1 − δB)] It

such that |δ| < 1. We have

Yt* = δ Yt-1* + ω It-1

and since Yt-1* = δ Yt-2* + ω It-2, and so on recursively, the equation becomes:

Yt* = ω Σ (j = 0 to ∞) δj It-1-j

If this equation is applied for K (K = 0, 1, 2, …) periods after the intervention, the equation
below can be obtained:

Y*T+K = ω (IT+K-1 + δ IT+K-2 + δ2 IT+K-3 + … + δK-1 IT + …) = ω δK-1

since the pulse variable takes the values (0, 0, 0, …, 0, 1, 0, 0, …), i.e. IT = 1 and all the
other It are 0.
Fig-9.9 (a) and (b): Intervention response with a single pulse occurring at t = T
(panel (a): ω = 1; panel (b): ω = −1)
The above equation means that the effect from the pulse will vanish gradually according to
a geometric sequence determined by the δ value. Fig. 9.9(a) shows the Yt* values for the
model with ω = 1, and Fig. 9.9(b) those for ω = −1, with a single pulse occurring at t = T,
for several δ (delta) values with 0 < δ < 1. From Fig. 9.9, it can also be seen that the Yt*
value approaches zero asymptotically. In this case, as δ approaches 0, the effect of I on Y
lasts for a minimum period. On the other hand, as δ approaches 1, the effect of I lasts
longer. For the extreme case of δ = 1, we get a permanent effect of I on Y, as shown in
Fig. 9.9.
9.6 Procedure for intervention model building
The intervention response, written as Yt*, is in essence the set of differences (errors)
between the original data after the intervention period and the corresponding forecasts
from the ARIMA model fitted on the pre-intervention data period.
The time series data Yt are divided into Data I, the time series for the period before the
intervention occurred (Y1t), and Data II, the time series for the period after the intervention
occurred (Y2t):
Step I: Form the ARIMA model for the time series of the period before the intervention,
Y1t. Model building, with respect to identification of the autoregressive and moving
average parameters, is done in the usual way based on the autocorrelation function (ACF)
and the partial autocorrelation function (PACF).
Step II: In order to identify the orders b, r and s, the cross-correlation function is used in
the case of a transfer function model. In the case of an intervention model, however, the
cross-correlation function cannot be used: because the intervention variable is not
continuous, cross-correlations between the intervention variable and the dependent
variable are meaningless. The only way to identify the intervention model is the impulse
response function, which is given below.
ω(B) / δ(B): typical candidate forms and their impulse responses are
ω0
ω0 + ω1B
ω0 / (1 − δB)
(ω0 + ω1B) / (1 − δB)
[The corresponding impulse response plots are not reproduced here.]
Step III: The parameter estimates of the intervention model are tested by using statistical
tests (such as t or z-tests; the latter will be elaborated in a subsequent section). Model
suitability is diagnosed through residual assumption tests for adherence to white noise.
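A hedged end-to-end sketch of Steps I-III is given below, using statsmodels' SARIMAX (assumed available) with the pulse indicator passed as an exogenous regressor. This estimates only a null-order (single ω) intervention effect; a first-order component involving δ would require a transfer-function routine or custom code. The synthetic series, ARIMA order and intervention time are placeholders, not data from this module.

import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.stats.diagnostic import acorr_ljungbox

def fit_pulse_intervention(y, T, order=(1, 0, 0), seasonal_order=(0, 0, 0, 0)):
    """Null-order pulse intervention: Yt = omega * It + Nt, with Nt an ARIMA process."""
    y = pd.Series(y)
    pulse = (np.arange(1, len(y) + 1) == T).astype(float)   # It = 1 at t = T, 0 otherwise
    # Step I (informally): the ARIMA orders should be identified beforehand from the
    # ACF/PACF of the pre-intervention data y[:T-1].
    model = SARIMAX(y, exog=pulse, order=order, seasonal_order=seasonal_order, trend="c")
    results = model.fit(disp=False)
    # Step III: z-test on omega (the exog coefficient) and a white-noise check on the residuals
    print(results.summary())
    print(acorr_ljungbox(results.resid, lags=[10]))
    return results

# Synthetic demonstration: AR(1) noise around level 100 plus a pulse of size 25 at t = 30
rng = np.random.default_rng(0)
n, T = 60, 30
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.6 * noise[t - 1] + rng.normal(scale=3.0)
y = 100 + noise + 25 * (np.arange(1, n + 1) == T)
fit_pulse_intervention(y, T=T, order=(1, 0, 0))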
Bibliography
Box, G.E.P., Jenkins, G.M. and Reinsel, G.C. (1994). Time Series Analysis: Forecasting
and Control, Pearson Education, Delhi.
Croxton, F.E., Cowden, D.J. and Klein, S. (1979). Applied General Statistics, Prentice
Hall of India Pvt. Ltd., New Delhi.
Makridakis, S., Wheelwright, S.C. and Hyndman, R.J. (1998). Forecasting Methods and
Applications, 3rd Edition, John Wiley, New York.
Pankratz, A. (1983). Forecasting with Univariate Box-Jenkins Models: Concepts and
Cases, John Wiley, New York.