ARMA unplugged (Excel)
ARMA Unplugged ‐1‐ © Spider Financial Corp, 2012
ARMA Unplugged

This is the first entry in our series of “Unplugged” tutorials, in which we delve into the details of each of the time series models with which you are already familiar, highlighting the underlying assumptions and driving home the intuitions behind them.
In this issue, we tackle the ARMA model – a cornerstone in time series modeling. Unlike earlier issues, we will start here with the ARMA process definition, state its inputs, outputs, parameters, stability constraints, and assumptions, and finally draw a few guidelines for the modeling process.
Background

By definition, the auto-regressive moving average (ARMA) is a stationary stochastic process made up of sums of auto-regressive and moving average components.

Alternatively, in a simple formulation:

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + a_t + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q}$$

OR

$$(y_t - \phi_1 y_{t-1} - \phi_2 y_{t-2} - \cdots - \phi_p y_{t-p}) = (a_t + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q})$$
Where

1. $y_t$ is the observed output at time t.
2. $a_t$ is the innovation, shock or error term at time t.
3. The time series observations $\{a_t\}$:
   a. Are independent and identically distributed ($a_t \sim \mathrm{i.i.d.}$)
   b. Follow a Gaussian distribution (i.e. $N(0, \sigma^2)$).

Note: The variance of the shocks distribution (i.e. $\sigma^2$) is time invariant.
Using back-shift notation (i.e. $L$), we can express the ARMA process as follows:

$$(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)\, y_t = (1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q)\, a_t$$
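Although this tutorial works in Excel, the recursion above is easy to sketch in a few lines of Python. The coefficients below are hypothetical values chosen only for illustration (and chosen to be stationary), not values taken from the text:

```python
import random

# Hypothetical ARMA(2,1) coefficients, for illustration only.
phi = [0.5, -0.3]    # AR coefficients phi_1, phi_2
theta = [0.4]        # MA coefficient theta_1

def simulate_arma(phi, theta, n, sigma=1.0, seed=42):
    """Generate n observations from the ARMA recursion
    y_t = sum(phi_i * y_{t-i}) + a_t + sum(theta_j * a_{t-j})."""
    rng = random.Random(seed)
    p, q = len(phi), len(theta)
    y, a = [], []
    for t in range(n):
        shock = rng.gauss(0.0, sigma)  # a_t ~ N(0, sigma^2), time-invariant variance
        ar = sum(phi[i] * y[t - 1 - i] for i in range(p) if t - 1 - i >= 0)
        ma = sum(theta[j] * a[t - 1 - j] for j in range(q) if t - 1 - j >= 0)
        a.append(shock)
        y.append(ar + shock + ma)
    return y

series = simulate_arma(phi, theta, 5000)
print(len(series))
```

With stationary coefficients and no constant term, the sample mean of a long simulated path should hover near zero.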
Assumptions

Let’s look closer at the formulation. The ARMA process is simply a weighted sum of the past output observations and shocks, with a few key assumptions:
1. The ARMA process generates a stationary time series ($\{y_t\}$).
2. The residuals $\{a_t\}$ follow a stable Gaussian distribution.
3. The component parameter values $\{\phi_1, \phi_2, \ldots, \phi_p, \theta_1, \theta_2, \ldots, \theta_q\}$ are constants.
4. The parameter values $\{\phi_1, \phi_2, \ldots, \phi_p, \theta_1, \theta_2, \ldots, \theta_q\}$ yield a stationary ARMA process.
What do these assumptions mean?

A stochastic process is a counterpart of a deterministic process; it describes the evolution of a random variable over time. In our case, the random variable is $y_t$.

Next, are the $\{y_t\}$ values independent? Are they identically distributed? If so, $y_t$ should not be described by a stochastic process, but instead by a probabilistic distribution model.

For cases where the $\{y_t\}$ values are not independent (e.g. the $y_t$ value is path-dependent), a stochastic model similar to ARMA is in order to capture the evolution of $y_t$.
The ARMA process only captures the serial correlation (i.e. auto‐correlation) between the observations.
In plain words, the ARMA process sums up the values of past observations, not their squared values or
their logarithms, etc. Higher order dependency mandates a different process (e.g. ARCH/GARCH, non‐
linear models, etc.).
There are numerous examples of a stochastic process where past values affect current ones. For instance, in a sales office that receives RFQs on an ongoing basis, some are realized as sales-won, some as sales-lost, and a few spill over into the next month. As a result, in any given month, some of the sales-won cases originate as RFQs or repeat sales from the prior months.
What are the shocks, innovations or error terms?

This is a difficult question, and the answer is no less confusing. Still, let’s give it a try: in simple words, the error term in a given model is a catch-all bucket for all the variations that the model does not explain.

Confused? Let’s put it a different way. In any given system, there are possibly tens of variables that affect the evolution of $y_t$, but the model captures only a few of them and bundles the rest into an error term in its formula (i.e. $a_t$).
Still lost? Let’s use an example. For a stock price process, there are possibly hundreds of factors that
drive the price level up/down, including:
‐ Dividends and Split announcements
‐ Quarterly earnings reports
‐ Merger and acquisition (M&A) activities
‐ Legal events, e.g. the threat of class action lawsuits.
‐ Others
A model, by design, is a simplification of a complex reality, so whatever we leave outside the model is
automatically bundled in the error term.
The ARMA process assumes that the collective effect of all those factors acts more or less like Gaussian
noise.
Why do we care about past shocks?

Unlike in a regression model, the occurrence of a stimulus (e.g. a shock) may have an effect not only on the current level, but possibly on future levels as well. For instance, a corporate event (e.g. M&A activity) affects the underlying company’s stock price, but the change may take some time to have its full impact, as market participants absorb/analyze the available information and react accordingly.
This begs the question: don’t the past values of the output already carry the shocks’ past information? YES, the shock history is already accounted for in the past output levels. An ARMA model can be represented solely as a pure auto-regressive (AR) model, but the storage requirement of such a system is infinite. This is the sole reason to include the MA component: to save on storage and simplify the formulation.
$$(1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)\, y_t = (1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q)\, a_t$$

$$\frac{1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p}{1 + \theta_1 L + \theta_2 L^2 + \cdots + \theta_q L^q}\, y_t = a_t$$

$$(1 - \psi_1 L - \psi_2 L^2 - \psi_3 L^3 - \cdots - \psi_N L^N - \cdots)\, y_t = a_t$$
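The weights of this pure-AR (infinite AR) representation can be computed by power-series (long) division of the AR polynomial by the MA polynomial. A small sketch, carrying the division only to a finite number of terms, with illustrative coefficients:

```python
def ar_inf_weights(phi, theta, n_terms):
    """Coefficients psi_k of the pure-AR form (1 - psi_1 L - psi_2 L^2 - ...) y_t = a_t,
    obtained by dividing (1 - phi_1 L - ...) by (1 + theta_1 L + ...) as power series."""
    num = [1.0] + [-c for c in phi]    # AR polynomial coefficients in L
    den = [1.0] + list(theta)          # MA polynomial coefficients in L
    q = []
    for k in range(n_terms + 1):
        nk = num[k] if k < len(num) else 0.0
        # subtract contributions of already-computed quotient terms
        acc = sum(q[j] * den[k - j] for j in range(max(0, k - len(den) + 1), k))
        q.append(nk - acc)
    # q = [1, -psi_1, -psi_2, ...]; flip signs to report psi_k directly
    return [-c for c in q[1:]]

# For an ARMA(1,1) with illustrative phi=0.5, theta=0.4, the weights decay
# geometrically, alternating in sign: (phi+theta), -theta(phi+theta), ...
print(ar_inf_weights([0.5], [0.4], 5))
```

The geometric decay of these weights is exactly why truncating the pure-AR form loses information, while the ARMA(P,Q) form stores the same memory in finitely many terms.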
ARMA Machine

The ARMA process is a simple machine that retains limited information about its past outputs and the shocks it has experienced. In a more systematic view, the ARMA process or machine can be viewed as below:

[Figure: schematic view of the ARMA(P,Q) machine and its internal storages]
In essence, the physical implementation of an ARMA(P,Q) process requires P+Q storages, or the memory requirements of an ARMA(P,Q) are finite (P+Q).

At time zero (0), the ARMA machine is reset and all its storages (i.e. $\{S^y_1, S^y_2, \ldots, S^y_p, S^a_1, S^a_2, \ldots, S^a_q\}$) are set to zero. As new shocks (i.e. $a_t$) begin to arrive in our system, the internal storages get updated with the newly observed output ($y_t$) and realized shocks.
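A minimal sketch of this machine in Python, assuming hypothetical ARMA(2,1) coefficients: it keeps exactly P+Q storage cells, resets them to zero, and updates them as each new shock arrives:

```python
from collections import deque
import random

class ARMAMachine:
    """Sketch of the ARMA(P,Q) 'machine': only the last P outputs and the
    last Q shocks are retained (P+Q storage cells), all initialized to zero."""
    def __init__(self, phi, theta):
        self.phi, self.theta = phi, theta
        self.y_store = deque([0.0] * len(phi), maxlen=len(phi))      # S^y_1..S^y_P
        self.a_store = deque([0.0] * len(theta), maxlen=len(theta))  # S^a_1..S^a_Q

    def step(self, shock):
        """Receive a new shock a_t and emit the next output y_t."""
        y = (sum(p * s for p, s in zip(self.phi, self.y_store))
             + shock
             + sum(t * s for t, s in zip(self.theta, self.a_store)))
        self.y_store.appendleft(y)      # most recent output first; oldest drops off
        self.a_store.appendleft(shock)  # most recent shock first; oldest drops off
        return y

rng = random.Random(0)
machine = ARMAMachine(phi=[0.5, -0.3], theta=[0.4])   # illustrative values
outputs = [machine.step(rng.gauss(0, 1)) for _ in range(1000)]
```

The bounded `deque` makes the finite-memory point concrete: no matter how long the machine runs, it never stores more than P+Q numbers.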
Furthermore, the AR component in the ARMA represents a positive feedback (adding the weighted sum
of the past output), and this may cause the output to be non‐stationary. The feedback effect for the
shocks is less of a concern, as the shocks have zero mean and are independent.
For a stable (i.e. stationary) ARMA process, the roots of the characteristic AR equation must lie within the unit circle:

$$1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p = (1 - \lambda_1 L)(1 - \lambda_2 L) \cdots (1 - \lambda_p L)$$

$$|\lambda_k| < 1$$

Where

1. $\lambda_k$ is the k-th root
2. $1 \leq k \leq p$
Note: in the event that the MA and AR components have any common root, the ARMA orders (i.e. P and
Q) should be reduced.
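The root condition is easy to check numerically. The sketch below (assuming numpy is available) exploits the fact that the $\lambda_k$ in the factorization $(1-\lambda_1 L)\cdots(1-\lambda_p L)$ are exactly the roots of $z^p - \phi_1 z^{p-1} - \cdots - \phi_p$; the coefficients passed in are illustrative:

```python
import numpy as np

def is_stationary(phi):
    """True if all lambda_k satisfy |lambda_k| < 1 for the AR polynomial
    1 - phi_1 L - ... - phi_p L^p."""
    # The lambda_k are the roots of z^p - phi_1 z^{p-1} - ... - phi_p
    # (substitute L = 1/z in the factorization and clear denominators).
    coeffs = [1.0] + [-c for c in phi]   # numpy.roots: highest degree first
    lambdas = np.roots(coeffs)
    return bool(np.all(np.abs(lambdas) < 1.0))

print(is_stationary([0.5, -0.3]))   # complex roots with modulus sqrt(0.3) < 1
print(is_stationary([1.2]))         # AR(1) with phi > 1 is explosive
```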
Stationary ARMA

Now that we have a stationary (stable) ARMA process, let’s shift gears and examine the output series characteristics.

$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + a_t + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q}$$
Marginal mean (long-run mean)

$$E[y_t] = E[\phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + a_t + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q}]$$

$$E[y_t] = \phi_1 E[y_{t-1}] + \phi_2 E[y_{t-2}] + \cdots + \phi_p E[y_{t-p}]$$

$$E[y_t]\,(1 - \phi_1 - \phi_2 - \cdots - \phi_p) = 0$$

For a stationary ARMA process, $\sum_{k=1}^{p} \phi_k \neq 1$, so the marginal mean is $E[y_t] = 0$.
Marginal variance (long-run variance)

The formula for the long-term variance of an ARMA model is more involved: to solve for it, I derive the MA representation of the ARMA process, after which I take the variance, since all terms are independent. For illustration, let’s consider the ARMA(1,1) process:

$$(1 - \phi L)\, y_t = (1 + \theta L)\, a_t$$

$$y_t = \frac{1 + \theta L}{1 - \phi L}\, a_t = (1 + \theta L)(1 + \phi L + \phi^2 L^2 + \cdots + \phi^N L^N + \cdots)\, a_t$$

$$y_t = a_t + (\phi + \theta)\, a_{t-1} + \phi(\phi + \theta)\, a_{t-2} + \cdots + \phi^{N-1}(\phi + \theta)\, a_{t-N} + \cdots$$

$$\mathrm{Var}[y_t] = \sigma_a^2\left(1 + (\phi + \theta)^2(1 + \phi^2 + \phi^4 + \cdots)\right) = \sigma_a^2\left(1 + \frac{(\phi + \theta)^2}{1 - \phi^2}\right) = \sigma_a^2\,\frac{1 + 2\phi\theta + \theta^2}{1 - \phi^2}$$
Again, the ARMA process must be stationary for the marginal (unconditional) variance to exist.
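The closed-form ARMA(1,1) variance can be sanity-checked against a long simulation. The parameter values are illustrative, and the simulation applies the defining recursion directly:

```python
import random

phi, theta, sigma = 0.5, 0.4, 1.0   # illustrative ARMA(1,1) parameters

# Closed-form long-run variance: sigma^2 (1 + 2*phi*theta + theta^2) / (1 - phi^2)
var_theory = sigma**2 * (1 + 2*phi*theta + theta**2) / (1 - phi**2)

rng = random.Random(7)
y_prev, a_prev = 0.0, 0.0
samples = []
for _ in range(200_000):
    a = rng.gauss(0.0, sigma)
    y = phi * y_prev + a + theta * a_prev   # y_t = phi*y_{t-1} + a_t + theta*a_{t-1}
    samples.append(y)
    y_prev, a_prev = y, a

mean = sum(samples) / len(samples)
var_sample = sum((x - mean)**2 for x in samples) / len(samples)
print(var_theory, var_sample)
```

With these parameters the formula gives 1.56/0.75 = 2.08, and the sample variance of a long path should land close to it.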
Note: In my discussion above, I am not making a distinction between the mere absence of a unit root in the characteristic equation and the stationarity of the process. They are related, but the absence of a unit root alone does not guarantee stationarity; for the discussion above to be accurate, the roots must lie strictly inside the unit circle.
Conclusion

Let’s recap what we have done so far. First, we examined a stationary ARMA process, along with its formulation, inputs, assumptions, and storage requirements. Next, we showed that an ARMA process incorporates its past output values (auto-correlation) and the shocks it experienced earlier into the current output. Finally, we showed that a stationary ARMA process produces a time series with a stable long-run mean and variance.
In our data analysis, before we propose an ARMA model, we ought to verify the stationarity assumption
and the finite memory requirements.
1. In the event the data series exhibits a deterministic trend, we need to remove (de‐trend) it first,
and then use the residuals for ARMA.
2. In the event the data set exhibits a stochastic trend (e.g. random walk) or seasonality, we need
to entertain ARIMA/SARIMA.
3. Finally, the correlogram (i.e. ACF/PACF) can be used to gauge the memory requirement of the
model; we should expect either ACF or PACF to decay quickly after a few lags. If not, this can be
a sign of non‐stationarity or a long‐term pattern (e.g. ARFIMA).
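As a rough illustration of guideline 3, a sample ACF can be computed directly from its definition; the white-noise series below is a hypothetical stand-in for well-behaved model residuals, whose ACF should be near zero at every lag:

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelation function of series x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n   # lag-0 autocovariance
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf

# White noise has no serial correlation, so every lag should be ~0.
rng = random.Random(1)
noise = [rng.gauss(0, 1) for _ in range(5000)]
print([round(r, 2) for r in sample_acf(noise, 3)])
```

A slowly decaying sample ACF on real data, by contrast, is the warning sign of non-stationarity or long memory mentioned above.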