Time Series Analysis: Method and Substance Introductory Workshop on Time Series Analysis
Time Series Analysis
-
Upload
alysia-prince -
Category
Documents
-
view
44 -
download
9
Transcript of Time Series Analysis
-
Time Series Analysis
Session 0: Course outline
Carlos scar Snchez Sorzano, Ph.D.Madrid
-
2Motivation for this course
-
3Course outline
-
4Course outline: Session 1
Kinds of time series Continuous time series sampling Data models Descriptive analysis: time plots and data preprocessing Distributional properties: statisitcal distribution, stationarity and autocorrelation Outlier detection and rejection
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4Sampling
t
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4Regular sampling
tIrregular sampling
][][][][ nrandomnperiodicntrendnx
-
5Course outline: Session 2
Trend analysis: Linear and non-linear regression Polynomial fitting Cubic spline fittingSeasonal component analysis: Spectral representation of stationary processesSpectral signal processing: Detrending and filtering Non-stationary signal processing
][][][][ nrandomnperiodicntrendnx
-
6Course outline: Session 3
Model definition: Moving Average processes (MA) Autoregressive processes (AR) Autoregressive, Moving Average (ARMA) Autoregressive, Integrated, Moving Average (ARIMA, FARIMA) Seasonal, Autoregressive, Integrated, Moving Average (SARIMA) Known external inputs: System identification A family of models Nonlinear modelsParameter estimationOrder selectionModel checkingSelf-similarity, fractal dimension and chaos theory
][][][][ nrandomnperiodicntrendnx
-
7Course outline: Session 4
Forecasting Univariate forecasting
Intervention modelling State-space modelling Time series data mining
Time series representation Distance measure Anomaly/Novelty detection Classification/Clustering Indexing Motif discovery Rule extraction Segmentation Summarization
W inding Dataset (T he angular speed of reel 2)
0 500 1000 1500 2000 2500A B C
-
8Course Outline: Session 5
Su nombre aqu
Bring your own data if possible!
-
9Suggested readings
It is suggested to read (before coming): Geo4990: Time series analysis Adler1998: Analysing stable time series Leonard: Mining Transactional and Time Series Data Chattarjee2006: Simple Linear Regression
-
10
Resources
Data setshttp://www.york.ac.uk/depts/maths/data/tshttp://www-personal.buseco.monash.edu.au/~hyndman/TSDL
Competitionhttp://www.neural-forecasting-competition.com
Links to organizations, events, software, datasets http://www.buseco.monash.edu.au/units/forecasting/links.phphttp://www.secondmoment.org/time_series.php
Lecture noteshttp://www.econphd.net/notes.htm#Econometrics
-
11
Bibliography
D. Pea, G. C. Tiao, R. S. Tsay. A course in time series analysis. John Wiley and Sons, Inc. 2001
C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.
C. Chatfield. Time-series forecasting. Chapman & Hall, CRC, 2000. D.S.G. Pollock. A handbook of time-series analysis, signal processing
and dynamics. Academics Press, 1999. J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994. C. Prez. Econometra de las series temporales. Prentice Hall (2006) A. V. Oppenheim, R. W. Schafer, J. R. Buck. Discrete-time signal
processing, 2nd edition. Prentice Hall, 1999. A. Papoulis, S. U. Pillai. Probability, random variables and stochastic
processes, 4th edition. McGraw Hill, 2002.
-
1Time Series Analysis
Session I: Introduction
Carlos scar Snchez Sorzano, Ph.D.Madrid
-
22
Session outline
1. Features and objectives of the time series2. Sampling3. Components of time series: data models4. Descriptive analysis5. Distributional properties6. Detection and removal of outliers7. Time series methods
-
33
1. Features and objectives of the time series
Goal: Explain historyFeatures:1. Samples (discrete) 2. Bounded 3. Finite support
nx
nnsnslnx 1 Fnnnnnx ,...,1,0 00
TrendPeriodic componentPeriod=N7 years
][][ nxNnx
-
44
1. Features and objectives of the time series
Goal: Forecast demandFeatures:1. Seasonal2. Non stationary3. Non independent samples
Mean
StandardVariance
-
55
1. Features and objectives of the time seriesPendulum angle with different controllers
Goal: Control=Forecast(+correct)
Features:1. Continuous signal
Regular sampling)(][ snTxnx
sT =Sampling period
-
66
1. Features and objectives of the time seriesOther kind of times series not covered in this course
-
77
1. Features and objectives of the time seriesOther kind of times series not covered in this course
-
88
2. Sampling
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4Sampling
t
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4Regular sampling
t
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4Irregular sampling
t
)(][ snTxnx
Also called non-uniform sampling
-
99
2. Sampling
0 10 20 30 40 50 60 70 80 90 100-5
0
5Continuous signal
t
0 10 20 30 40 50 60 70 80 90 100-5
0
5Oversampling
t
0 10 20 30 40 50 60 70 80 90 100-5
0
5Critical sampling
t
0 10 20 30 40 50 60 70 80 90 100-5
0
5Undersampling
t
10min T
5minTTs
2minTTs
minTTs
Nyquist/Shannon criterion
81001041003211001 2sin2.02sin2sin2)( ttttx
-
10
10
2. Sampling
-
11
11
2. Sampling: Signal reconstruction
0 5 10 15 20
-0.2
0
0.2
0.4
0.6
0.8
1
Continuous signal
t0 5 10 15 20
-0.2
0
0.2
0.4
0.6
0.8
1
Sampled signal
t-5 0 5
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1sinc(t)
t
0 5 10 15 20
-0.2
0
0.2
0.4
0.6
0.8
1
Sinc superposition
t9 9.5 10 10.5 11
-0.2
0
0.2
0.4
0.6
0.8
1
Sinc superposition
t0 5 10 15 20
-0.2
0
0.2
0.4
0.6
0.8
1
Reconstructed signal
t
n
srr nTthnxtx )()(
sr T
tcth sin)(
Reconstruction formula
Reconstruction kernel for bandlimited signals
-
12
12
2. Sampling: Aliasing
0 20 40 60 80 100 120 140 160 180 200-1
-0.5
0
0.5
1Obvious aliasing effect
t
0 50 100 150 200 250 300 350 400-3
-2
-1
0
1
2
3Not so clear aliasing effect
t
21.1minmin TTTs
21.1minmin TTTs
81001041003211001 2sin2.02sin2sin2)( ttttx
10100( ) sin 2x t t
-
13
13
2. Aliasing: Conclusions
From the disgression above, two are the main consequences that must be kept in mind:1. Any continuous time series can be safely treated as a discrete time
series as long as the Nyquist criterion is satisfied.2. Once our discrete analysis is finished, we can always return to the
continuous world by reconstructing our output sequence.
Although not discussed, once the continuous signal is discretized, one can arbitrarily change the sampling period without having to go back to the continuous signal with two operations called upsampling (going for a finer discretization) and downsampling (coarser discretization).
-
14
14
3. Components of a time series: data models
][][][][ nrandomnperiodicntrendnx SeasonalEx: Unemployment is low in summerEx: Temperature is high in the middle of the day
Long-term change. What is long-term?
0 50 100 150 200 250 300 350 400-3
-2
-1
0
1
2
3
t
Explained by statistical models (AR, MA, )
-
15
15
0 100 200 300 400-10
-5
0
5
10
t
Trend
0 100 200 300 400-10
-5
0
5
10
t
Seasonal
0 100 200 300 400-10
-5
0
5
10
t
Noise
0 100 200 300 400-10
-5
0
5
10
t
Additive model
0 100 200 300 400-10
-5
0
5
10
t
Seasonal multiplicative model
0 100 200 300 400-10
-5
0
5
10
t
Fully multiplicative model
3. Components of a time series: data models
][][][][ nnSnmnx ][][][][ nnSnmnx ][][][][ nnSnmnx
][log][log][log][log nnSnmnx log
-
16
16
3. Components of a time series: data model
0 10 20 30 40 50 60 70 80 90 100-1
-0.5
0
0.5
1
0 10 20 30 40 50 60 70 80 90 100-1
-0.5
0
0.5
1
Noise is dependent of signal?
0 10 20 30 40 50 60 70 80 90 100-1
0
1Signal x1
0 10 20 30 40 50 60 70 80 90 100-1
0
1Signal x2
0 10 20 30 40 50 60 70 80 90 100-1
0
1Interferent signal
0 10 20 30 40 50 60 70 80 90 100-2
0
2Observed signal
Noise is dependent of noise
[ ] [ ] [ ]x n x n n
-
17
17
4. Descriptive analysis
Time plot Take care of appearance (scale, point shape, line shape, etc.) Take care of labelling axes specifying units (be careful with the
scientific notation, e.g. from 0.5e+03 to 0.1e+04) Data Preprocessing
Consider transforming the data to enhance a certain feature, although this is a controversial topic:
Logarithm: Stabilizes variance in the fully multiplicative model
Box-Cox: the transformed data tends to be normally distributed])[log(][ nxny
0])[log(0][
1][
nxny
nx
-
18
18
4. Descriptive analysis
Data preprocessing: (Removal of trend) Detrending: if the trend clearly follows a known
curve (line, polynomial, logistic, Gaussian, Gompertz, etc.), fit a model to the time series and detrend (either by substracting or dividing).
(Removal of trend) Differencing
2
2
]2[]1[2][][][
]1[][2]1[][][
][]1[][][
]1[][][][
sllll
slrlr
srr
sll
Tnxnxnxnyny
Tnxnxnxnyny
Tnxnxnxny
Tnxnxnxny
-6 -4 -2 0 2 4 6-10
-5
0
5
10
15
20
25
30
x[n]yl[n]
ylr[n]
1.5
)(][)2sin(5.110)(
snTxnxtttx
-
19
19
4. Descriptive analysis
Data preprocessing: (Removal of season) Deseasoning:average over the seasonal period
to remove its effect. (Estimate of season) Estimating seasoning: substract the
deseasoned time series from a local estimate of the current sample.
][...]4[]5[]6[121][ 21 nxnxnxnxnmest
]6[]5[...]1[ 21 nxnxnx][][
21][
2
21 nmknxnS est
kkest
][][][][ nenSnmnx
0 10 20 30 40 50 60 70 80 90 100-20
0
20
40
60
80
100
x[n]m[n]mest[n]
Sest[n]
][]2[]1[][]1[]2[][ 8141214181 nmnxnxnxnxnxnS estest
-
20
20
4. Descriptive analysis
Data preprocessing: (Removal or isolation of trend, seasonal or noise) Filtering: filtering
aims at the removal of any of the components, for example, the moving average is a common filtering operation in stock analysis to remove noise.
20
1][
201][
kknxny
10
10][
211][
kknxny
40 60 80 100 120 140 160 180
7
8
9
10
OriginalCausalCausal+Anticausal
-
21
21
5. Distributional properties
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4
tIrregular sampling
][nx31]31[ Xx
All variables are identically distributed StationarityA variable can be normally distributed or Poisson or etc.It is important to characterize their distribution
All variables are independent White processIndependency is measured through the autocorrelation function
0 10 20 30 40 50 60 70 80 90 100-3
-2
-1
0
1
2
3
-
22
22
5. Distributional properties
-5 0 5
0
0.5
1
Normal distribution function
X-5 0 5
0
0.1
0.2
0.3
0.4Normal probability density function
X
-5 0 5 10 15 20
0
0.5
1
Poisson distribution function
X-5 0 5 10
0
0.1
0.2
0.3
0.4Poisson probability density function
X
xXxFX Pr)( x dttf )(
xx
ii
xXPr
)(dxFxXE Xrr
x r dttfx )(
xx
ir
i
xXx Pr
1 !
1)(k
kkk
itXX tk
XEieEt
dttiteeaFbF X
itbita
XX )(lim21)()(
yYxXyxF YX ,Pr),(,
Characteristic function
-
23
23
5. Distributional properties
0 20 40 60 80 100-20
0
20
40
60
80
100
0 20 40 60 80 100-1
-0.5
0
0.5
1
0 50 100 150 200-0.2
0
0.2
0.4
0.6
0.8
1
0 1 2 3 4
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
Examples of nonstationary variables
-
24
24
5. Distributional propertiesStrictly stationary ),...,,(),...,,(
21212121 ,...,,,...,, NnNnNnXXXnnnXXX kNknNnNnkknnnxxxFxxxF
Nnnnk k ,,...,,, 21Consequences: nXE n
20*0 ][))((),(][ 00 nXXEXXCovnC nnnnnn
0,nnAutocorrelation function (ACF)
Wide sense stationary nXE n][),( 00 nXXCorr nnn 0,nn
][][ 0*
0 nn ][]0[ 0n
lag
][),( 0* 00 nXXEXXCorr nnnnnn
Autocovariance function
Correlation coefficient ]0[
][][ 00 CnCnr
0n0n
Example: white process
0001
][][0
00
20 n
nnn X
-
25
25
5. Distributional properties
]0[][][ 00 C
nCnr Correlation coefficients (Zero-order correlations)
Under the null hypothesis it is distributed as Students t with N-2 degrees of freedom.
Number of samples upon which is estimated][ 0nr
The probability of observing if the null hypothesis is true is given by
0][ 00 nrH
][ 0nr
][12][Pr
020 nr
Nnrt
-
26
26
5. Distributional proepertiesErgodicity
0 5 10 15 20 25-1
0
1
0 5 10 15 20 25-1
0
1
0 5 10 15 20 25-1
0
1
0 5 10 15 20 25-1
0
1
Realization 1
Realization 2
Realization 3
Realization 4
4
1,204
120
iix
25
1251 ][
nnx
Strong law of large numbers
N
iiNN
x1
,201
20 lim
Equal if stationary and ergodic for the mean
Ensemble average
Time average
-
27
27
5. Distributional properties
0 10 20 30 40 50 60 70 80 90 100-25
-20
-15
-10
-5
0
5
10
15
20
][][][ nenmnx
),0( 2N ),0( 2N 0nXE
22),(0 nnn XXCov
The time average still converges to the ensemble average
Stationarity ErgodicityExample:
-
28
28
5. Distributional properties
How to detect non-stationarity in the mean?There is a variation in the local mean
Solutions:
Polynomial trends: By differentiating p times
(2) (1) (1)[ ] [ ] [ 1]x n x n x n
( ) ( 1) ( 1)[ ] [ ] [ 1]p p px n x n x n
Removal of a linear trend
Removal of a quadratic trend
Removal of a p-th order polynomial trend
(1)[ ] [ ] [ 1]x n x n x n Removal of a constant (or piece-wise constant)
(3) (2) (2)[ ] [ ] [ 1]x n x n x n
-
29
29
5. Distributional properties
How to detect non-stationarity in the mean?Exponential trends: Taking logs
[ ][ ] log[ 1]x ny n
x n Financial time series
-
30
30
5. Distributional properties
How to detect non-stationarity due to seasonality?There is a periodic variation in the local mean
Solutions:
Polynomial trends: By differentiating p times with that seasonality(1)[ ] [ ] [ 12]x n x n x n Monthly data: removal of a yearly seasonality(1)[ ] [ ] [ 7]x n x n x n Daily data: removal of a weekly seasonality
How to detect non-stationarity in the variance?There is a variation in the local variance
Solutions: Box-Cox transformation tends to stabilize variance
How to detect non-stationarity?Solutions: Unit root tests
-
31
31
6. Outlier detection and rejection
Common procedures:1. Visual inspection2. Mean k Standard Deviation3. Median k Median Abs. Deviation4. Robust detection5. Robust model fitting
Common actions:1. Remove observation2. Substitute by an estimate
Do Estimate mean and Std. Dev. Remove samples outside a given interval
Until (No sample is removed)
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
Sensor blackout
-
32
32
7. Time series methods
Time-domain methods: Based on classical theory of correlation (autocovariance) Include parametric methods (AR, MA, ARMA, ARIMA, etc.) Include regression (linear, nonlinear)
Frequency-domain (spectral) methods: Based on Fourier analysis Include harmonic methods
Neural networks and Fuzzy neural networks Other fancy approaches: fractal models, wavelet models,
bayesian networks, etc.
-
33
33
Bibliography
C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.
D.S.G. Pollock. A handbook of time-series analysis, signal processing and dynamics. Academics Press, 1999.
J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994. A. V. Oppenheim, R. W. Schafer, J. R. Buck. Discrete-time signal
processing, 2nd edition. Prentice Hall, 1999. A. Papoulis, S. U. Pillai. Probability, random variables and stochastic
processes, 4th edition. McGraw Hill, 2002.
-
Time Series Analysis
Session II: Regression and Harmonic Analysis
Carlos scar Snchez Sorzano, Ph.D.Madrid
-
2Session outline
1. Goal2. Linear and non-linear regression3. Polynomial fitting4. Cubic spline fitting5. A short introduction to system analysis6. Spectral representation of stationary processes7. Detrending and filtering8. Non-stationary processes
-
31. Goal
][][][][ nrandomnperiodicntrendnx
Session 3Session 2
-
42. Linear and non linear regression
Year (n)
Price (x[n])
1500 17
1501 19
1502 20
1503 15
1504 13
1505 14
1506 14
1507 14
Linear regression
][][][ nrandomntrendnx nntrend 10][
...150315150220150119150017
10
10
10
10
...15201917
......15031150211501115001
1
0
xX
-
52. Linear and non linear regression
Linear regressionxX
][][][ nrandomntrendnx xXxXxXXx t2
][][0][
02
0 nnnE
Lets assume
xXXXxX
tt 12minarg Least Squares Estimate
EProperties: 12 XX tCov xXxX tkN
1 2
Homocedasticity
2
22 1
1xxXx
R
11)1(1 22
kN
NRRadjusted
Degree of fit
Linear regression with constraints
2 1 1arg min ( ). .
t t t tR
s t
X X R R X X R r RR r 2 1 1 2
2 2 0 Example:
-
62. Linear and non linear regression
][][][ nrandomntrendnx 2
210][ nnntrend ...
1503150315
1502150220
1501150119
1500150017
22
10
22
10
22
10
22
10
...15201917
.........150315031150215021150115011150015001
2
1
0
2
2
2
2
xX
0
0
1
0
HHFirst test: 0tt0 XX 12kF H0 is true ),( kNkF
If F>FP, reject H0
201
200
2
2
HHSecond test: 20221t2t202 )XP(IX 1 22kF H0 is true ),( 2 kNkF
][][
),0(][
02
0
2
nnNn
Lets assume
x
XX
2
121 t111t111 XXXXP
01
00
ii
ii
HH
ii
iit
0 iiii 1 XXt H0 is true )( kNt
-
72. Linear and non linear regression
[40, 45] [0.05,0.45]Y X
We got a certain regression line but the true regression line lies within this region with a 95% confidence.
12 2 tj jj X X2
21 , 1
j j jN kt Unbiased variance of the j-th regression coefficient
Confidence interval for the j-th regression coefficient
Confidence intervals for the coefficients
-
82. Linear and non linear regression
Durbin-Watson test:
0][0][
01
00
nHnH
N
n
N
n
n
nn
d1
2
2
2
][
]1[][
Reject H0Ldd
Do not reject H0Udd
00 n
Test that the residuals are certainly uncorrelated
Other tests: Durbins h, Wallis D4, Von Neumanns ratio, Breusch-Godfrey
-
92. Linear and non linear regression
(0) (0) [ ] [ ]y n n xCochrane-Orcutt method of regression with correlated residues
1. Estimate a first model2. Estimate residuals (correlated)3. Estimate the correlation of the residues4. i=15. Estimate uncorrelated residuals6. Estimate uncorrelated output7. Estimate uncorrelated input8. Reestimate model9. Estimate residuals10. i=i+1 until convergence in
(0) (0)[ ] [ ] [ ]n y n y n
( ) ( ) ( )[ ] [ ] [ 1] [ 1]i i ia n n i n ( ) ( )[ ] [ ] [ 1] [ 1]i iy n y n i y n ( ) ( )[ ] [ ] [ 1] [ 1]i in n i n x x x( ) ( ) ( ) ( ) [ ] [ ] [ ]i i i iy n a n n x
( ) ( ) ( )[ ] [ ] [ ]i i in y n y n ( )i
-
10
2. Linear and non linear regression
Assumptions of regression
The sample is representative of your population The dependent variable is noisy, but the predictors are not!!. Solution: Total Least
Squares Predictors are linearly independent (i.e., no predictor can be expressed as a linear
combination of the rest), although they can be correlated. If it happens, this is called multicollinearity. Solution: add more samples, remove dependent variable, PCA
The errors are homoscedastic. Solution: Weighted Least Squares The errors are uncorrelated to the predictors and to itself. Solution: Generalized
Least Squares The errors follow a normal distribution. Solution: Generalized Linear Models
-
11
2. Linear and non linear regression
More linear regression
][][ 10 nwnnx ][]1[][ 210 nwnxnnx
][])2[]1[(][ 210 nwnxnxnnx ][]2[]1[][ 3210 nwnxnxnnx
][][...][][][ 66110010 nwnMnMnMnnx
otherwise007n1
][0
nM
otherwise067n1
][6
nM
7 years
Non linear regression][)sin(][ 10210 nwnnnx
][][ 10 nwnnx ][logloglog][log 10 nwnnx ]['log][' '1'0 nwnnx
-
12
3. Polynomial fitting
Polynomial trends
][][][ nrandomntrendnx 2
210][ nnntrend ...
1503150315
1502150220
1501150119
1500150017
22
10
22
10
22
10
22
10
...15201917
.........150315031150215021150115011150015001
2
1
0
2
2
2
2
xX Orthogonal polynomials
...)1503()1503()1503(15)1502()1502()1502(20
)1501()1501()1501(19)1500()1500()1500(17
221100
221100
221100
221100
qqnnnntrend ...][ 2210
)(...)()()(][ 221100 nnnnntrend qq xX x
1 XRR 1
0)()( dtttji ji Such that
-
13
Polynomial trendsOrthogonal polynomials
3. Polynomial fitting
11 0)()( dtttji ji 1)(0 ttt )(1
)(1
)(112)( 21 ti
ittiit iii
212
23
2 )( ttttt 23
325
3 )(
Grafted polynomials
)(...)()()(][ 221100 nnnnntrend qq
S(t) is a cubic spline in some interval if this interval can be decomposed in a set of subintervals such that S(t) is a polynomial of degree at most 3 on each of these subintervals and the first and second derivatives of S(t) are continuous.
1t 2t 3t
1
3
3)()(k
jjj ttdtS
-
14
4. Cubic spline fitting
Cubic B-splines
otherwise0
21
1
)( 62
22
32
32
3
3
t
tt
tB t
t
1
3
3 )()(k
jjj tBdtS
-3 -2 -1 0 1 2 30
0.5
1
t
0
3 3 3 2 2 3( ) ( ) ( 3 3 )q q
p j j j j j jj p j p
B t d t t d t t t tt t
0)( tBtt pp
3,2,1,000)(
ktdtBttq
pj
kjjpq
000011111
4
3
2
1
34
33
32
31
3
24
23
22
21
24321
p
p
p
p
p
ppppp
ppppp
ppppp
ddddd
ttttttttttttttt
1)(3 dttBp
210124321 ppppp ttttt
By definition
-
15
4. Cubic spline fitting
1
3
3 )()(k
jjj tBdtS
1
1
3
0
)(Fj
jjTt
j jBdtSTtj
Ttj FF ,00
BdS
-1.5 -1 -0.5 0 0.5 1 1.5 2 2.5-10
-8
-6
-4
-2
0
2
4
6
8
10
Cubic B-splines
T
-
16
4. Cubic spline fitting
Overfitting and forecasting
-0.2 0 0.2 0.4 0.6 0.8 1 1.2-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
-1 -0.5 0 0.5 1 1.5 2-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
-
4. Polynomial fitting with discontinuities
17
-
18
5. A short introduction to system analysis
T][nx )(][ xTny
Example: ][3][ nxny Amplifier]3[][ nxny Delay
][][ 2 nxny Instant power
3]1[][]1[][ nxnxnxny Smoother
Box-Cox transformation: family of systems
...]4[]5[]6[121][ 21 nxnxnxnmest
][][2
1][2
21 nmknxnS est
kkest
Season estimation
0])[log(0)(][
1][
nxxTny
nx
-5 0 5 10 150
0.5
1
n
x[n]
-5 0 5 10 150
0.5
1
n
x[n-3]
-
19
Systems with memory ,...)2,1,( nxnxnxTnyMemoryless systems )( nxTny Invertible systems ))(()(:)()( 111 nxTTnyTnxyTnxTny Causal systems ,...)2,1,( nxnxnxTnyAnticausal sytems ,...)2,1,( nxnxnxTnyStable systems nBnxTBnBnx TTx ,)(:,Time invariant systems )()( 00 nnxTnnynxTny Linear systems nxbTnxaTnbxnaxT 2121
LTI
Basic system properties
5. A short introduction to system analysisPresent Past
Future
-
20
5. Basic introduction to system analysis
LTI systems
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
x 1 [ n ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
y 1 [ n ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
x 2 [ n ] = x 1 [ n -3 ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
y 2 [ n ] = y 1 [ n -3 ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
x 3 [ n ] = 2 x 1 [ n ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
y 3 [ n ] = 2 y 3 [ n ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
x 4 [ n ] = x 1 [ n ] + x 2 [ n ]
-5 0 5 1 0 1 50
0 . 5
1
1 . 5
2
n
y 4 [ n ] = y 1 [ n ] + y 2 [ n ]
-
21
5. A short introduction to system analysis
T][nx )(][ xTny
Example: ][3][ nxny LTI, memoryless, invertible, causal, stable ]10[][ nxny
][][ 2 nxny Non linear, TI, memoryless, non invertible, causal,stable
3]1[][]1[][ nxnxnxny
0])[log(0)(][
1][
nxxTny
nxNon linear, memoryless, invertible, causal, unstable
...]4[]5[]6[121][ 21 nxnxnxnmest
][][2
1][2
21 nmknxnS est
kkest
LTI, with memory, invertible, causal, stable
LTI, with memory, invertible, non causal, stable
LTI, with memory, invertible, non causal, stable
-
22
5. A short introduction to system analysis
Impulse response of a LTI system
T][n ][nh
-5 0 50
0.5
1
n
[n]
FIR: Finite impulse responseIIR: Infinite impulse response
T][nx
kknhkxnhnxny ][][][*][][
3]1[][]1[][ nxnxnxny
...]4[]5[]6[121][ 21 nxnxnxnmest
][][2
1][2
21 nmknxnS est
kkest
Example:
3]1[][]1[][ nnnnh
-5 0 50
0.1
0.2
0.3
0.4
n
h[n]
-10 -5 0 5 10-0.2
0
0.2
0.4
n
h[n]
-
23
5. A short introduction to system analysis
M
kk
N
kk knxbknya
00
Difference equation
Example: ]1[5.0][]1[][ nxnxnyny
Nk
kk
M
k
kk
za
zb
zXzYzH
0
0
)()()(
M
k
kk
N
k
kk zzXbzzYa
00
)()(
11 )(5.0)()()( zzXzXzzYzY
1
1
15.01
)()()(
zz
zXzYzH
Transfer function
Cz
Impulse response of a LTI system )()()( zXzHzY
][ 0nnx ZT )(0 zXzn
-
24
6. Spectral representation of stationary processes
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870
200
300
Year
P
r
i
c
e
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870
200
300
YearP
r
i
c
e
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-100
0
100
Year
P
r
i
c
e
-
t
r
e
n
d
][][][][ nrandomnseasonalntrendnx
nntrend 6.12777][
][][][ ntrendnxny ][][ nrandomnseasonal
][cos5.22cos5.26 1223282 nrandomnn Period=8 years Period=12 years
-
25
6. Spectral representation of stationary processes
Harmonic components
][cos][ nrandomnAnx Amplitude
Frequency (rad/sample)
Phase (rad)
0 10 20 30 40 50 60 70 80 90 100
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
A
][2cos][ nrandomfnTAnx s Frequency (Hertz)
kNNnseasonalnseasonal ff
fTkk ss 2][][ ,...3,2,1k
nnnseasonal 1223282 cos5.22cos5.26][Period=8 years Period=12 years
Example:
Period=lcm(N1,N2)=24 years
-
26
6. Spectral representation of stationary processes
Harmonic components
nAnx cos][
0 5 10 15 20 25 30
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
0 5 10 15 20 25 30
-1
-0.5
0
0.5
1
0 5 10 15 20 25 30
-1
-0.5
0
0.5
1
nnx 152cos][ nnx 1528cos][ n 2cos 1528 n152cos )cos( 152 n
-
27
6. Spectral representation of stationary processes
Harmonic representation nnnx 1223282 cos5.22cos5.26][
K
kkkk nAnx
1cos][ 0 ))(cos()(][ dnAnx
deXnx nj)(21 njn enxX
)(
Fourier transform pair
)(1 XFTnx ][)( nxFTX nnFTY 1223282 cos5.22cos5.26)(Example:
)(5.26)(5.26 8282 32
32 jj ee
)(5.22)(5.22 122122 jj ee
njjnjj eeAeeAnA )()(21 )()()(cos)()()()( jeAX
)()( * XX
-3 -2 -1 0 1 2 30
20
40
60
80
|
Y
(
)
|
Low speed variation
Medium speed
Highspeed
Average
-
28
6. Spectral representation of stationary processes
Power spectral density (PSD)
2)()( XSX ][ 0nX FT )(XS Deterministic
][][ 02
0 nn WW FT 2)( WWS Example:Example:
)()5.26()()5.26()( 822
822 YS
)()5.22()()5.22( 1222
1222
][cos5.22cos5.26][ 1223282 nwnnny )(YS
1820 1830 1840 1850 1860 1870-100
-50
0
50
100
Year
P
r
i
c
e
-
t
r
e
n
d
-3 -2 -1 0 1 2 30
2000
4000
6000
8000
SY()
Periodogram
-
29
7. Detrending and filtering
)(zH][nw ][nx)()()( 2 WX SHS )(WS
-3 -2 -1 0 1 2 30
20
40
60
80
|
Y
(
)
|
Low speed variation
Medium speed
Highspeed
Highpass filteringDetrending
Bandpass filteringDetrending+Denoising
Lowpass filteringDenoising
Trend estimation
Seasonal estimation
Identity
1820 1830 1840 1850 1860 1870
150
200
250
300
Year
x[n]
-2 0 20
1
2
3x 10
4
SX()
1820 1830 1840 1850 1860 1870-100
0
100
Year
y[n]=x[n]-trend[n]
-2 0 20
1
2
3x 10
4
SY()
-
30
7. Detrending and filtering
)(zH][nw ][nx)()()( 2 WX SHS )(WS
zzzH 1)( 131Example: Smoother
)()()( jez eHzHH j
jj eeH 1)( 313
]1[][]1[][ nxnxnxny
-3 -2 -1 0 1 2 30
0.5
1
|H()|2
-3 -2 -1 0 1 2 3
-202
angle(H())
-3 -2 -1 0 1 2 30
1
2
|1-H()|2
-3 -2 -1 0 1 2 3
-202
angle(1-H())
zzzH 11)( 131 jj eeH 11)( 313
]1[][]1[][][ nxnxnxnxny
-
31
7. Detrending and filtering
zzzH 1)( 1311
Example: Smoothers
jj eeH 1)( 3113
]1[][]1[][ nxnxnxny
21312 1)( zzzH 2312 1)( jj eeH
3]2[]1[][][ nxnxnxny
-2 0 20
0.5
1
|H1()|2
-2 0 2
-20
2
angle(H1())
-2 0 20
0.5
1
|H2()|2
-2 0 2
-20
2
angle(H2())
-2 0 20
0.5
1
|H3()|2
-2 0 2
-20
2
angle(H3())
2411
41
21
3 )( zzzH
24141213 )( jj eeH
]2[]1[][][ 414121 nxnxnxny
-
32
7. Detrending and filtering
...]4[]5[]6[121][ 21 nxnxnxnmest
][][2
1][2
21 nmknxnS est
kkest
Example: Yearly season estimation
6215432123456211 1121)( )()( zzzzzzzzzzzzzX zMzH est 2
81
41
211
412
81
2 )()(')( zzzzzXzYzH
)()()( 12 zHzHzH
-2 0 20
0.5
1
1.5
2
|H1()|2
-2 0 20
0.5
1
1.5
2
|H2()|2
-2 0 20
0.5
1
1.5
2
|H3()|2
-
33
7. Detrending and filtering
-3 -2 -1 0 1 2 30
0.5
1
|H1()|2
-3 -2 -1 0 1 2 30
0.5
1
|H2()|2
6])x[n6]-0.0611(x[n-7])x[n7]-0.0313(x[n-8])x[n8]-n-0.0101(x[][1 ny2])x[n2]-0.0934(x[n4])x[n4]-0.0634(x[n-5])x[n5]-0.0802(x[n-
0.2108x[n]1])x[n1]-0.1772(x[n
4]-0.00056x[n6])-x[n1]-n0.00037(x[-8])-x[n93(x[n]0000.0][2 ny8]-58y[n.07]-4.3y[n-6]-14.6y[n5]-29.7y[n-4]-39y[n3]-34y[n-2]-19.3y[n1]-6.5y[n-
fc=2*pi/12;ef=2*pi/60;b1=fir1(18,[fc-ef fc+ef]/pi);
fc=2*pi/12;ef=2*pi/60;[b1,a1]=butter(4,[fc-ef fc+ef]/pi);
-
34
7. Detrending and filtering
[ ] [ ] (1 )( [ 1] [ 1])S n ax n a x n y n [ ] ( [ ] [ 1]) (1 ) [ 1]y n b S n S n b y n
Example: Holts linear exponential smoothing
[ ] [ 1] [ 1]NS n S n y n N
N
1 1( ) ( ( ) ( ) ) (1 ) ( ) ( )Y z b S z S z z b Y z z f S
1 1( ) ( ) (1 )( ( ) ( ) )S z aX z a X z z Y z z ( , ) ( )f X Y f X
1 1 ( ) ( ) ( ) ( )NS z S z z NY z z f X
( )f X
12
3
-
8. Non-stationary processes
35
[ , ) [ ] [ ] j nn
X m x n m w n e
Short-time Fourier Transform
-
8. Non-stationary processes
36
Wavelet transform
-
8. Non-stationary processes
37
Wavelet transform
-
8. Non-stationary processes
38
Wavelet transform
-
8. Non-stationary processes
39
Empirical Mode Decomposition
-
40
Session outline
1. Goal2. Linear and non-linear regression3. Polynomial fitting4. Cubic spline fitting5. A short introduction to system analysis6. Spectral representation of stationary processes7. Detrending and filtering8. Non-stationary processes
-
41
Bibliography
C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.
D.S.G. Pollock. A handbook of time-series analysis, signal processing and dynamics. Academics Press, 1999.
J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994.
-
1Time Series Analysis
Session III: Probability models for time series
Carlos scar Snchez Sorzano, Ph.D.Madrid,
-
22
Session outline
1. Goal2. A short introduction to system analysis3. Moving Average processes (MA)4. Autoregressive processes (AR)5. Autoregressive, Moving Average (ARMA)6. Autoregressive, Integrated, Moving Average (ARIMA, FARIMA)7. Seasonal, Autoregressive, Integrated, Moving Average (SARIMA)8. Known external inputs: System identification9. A family of models10. Nonlinear models11. Parameter estimation12. Order selection13. Model checking14. Self-similarity, Fractal dimension, and Chaos theory
-
33
1. Goal
][][][][ nrandomnperiodicntrendnx
Explained by statistical models (AR, MA, )
0 10 20 30 40 50 60 70 80 90 100-4
-2
0
2
4
0 10 20 30 40 50 60 70 80 90 100-1
-0.5
0
0.5
1
1.5
0 10 20 30 40 50 60 70 80 90 100-0.5
0
0.5
1
[ ] 0.9 [ ] [ ]random n random n completelyRandom n [ ] [ ] 0.9 [ 1]random n completelyRandom n completelyRandom n
-
44
1. Goal
-
55
2. A short introduction to system analysis
M
kk
N
kk knxbknya
00
Difference equation
Example: ]1[5.0][]1[][ nxnxnyny
Nk
kk
M
k
kk
za
zb
zXzYzH
0
0
)()()(
M
k
kk
N
k
kk zzXbzzYa
00)()(
11 )(5.0)()()( zzXzXzzYzY
1
1
15.01
)()()(
zz
zXzYzH
Transfer function
Cz
T][nx )(][ xTny
][ 0nnx ZT )(0 zXzn
-
66
2. A short introduction to system analysis
Poles/Zeros0z is a pole of iff)(zH )( 0zH0z is a zero of iff)(zH 0)( 0 zH
Stability of LTI systemsA causal system is stable iff all its poles are inside the unit circle
Example:3
]1[][]1[][ nxnxnxnyzzzH 3131
131)(
Poles: ,0z Zeros: 2321 jz }Re{z
}Im{ z
1z
Invertibility of LTI systemsThe transfer function of the inverse system of a LTI system whose transfer function is is . Therefore, the zeros of one system are the poles of its inverse, and viceversa.
)(zH)(
1zH
1z
-
77
2. A short introduction to system analysis
Downsampling nx
M ][nMxnxd
nxL
Upsampling
k
e kLnkxrestoLLnLnx
nx ][][0
,...2,,0]/[
-
88
3. Moving average processes: MA(q)
][...]1[][][ 10 qnwbnwbnwbnx q
)(...)( 221
10 zBzbzbzbbzHq
q
)(qMA][nw
LTI, with memory, invertible, causal, stable
0 2 0 4 0 6 0 8 0 1 0 0-4
-3
-2
-1
0
1
2
3w [ n ]
t im e0 2 0 4 0 6 0 8 0 1 0 0
-5
0
5x 1 [ n ]
t im e0 2 0 4 0 6 0 8 0 1 0 0
-1
-0 . 5
0
0 . 5
1x 2 0 [ n ]
t im e
-1 0 -5 0 5 1 0
0
0 . 2
0 . 4
0 . 6
0 . 8
1
A C F w [ n 0 ]
la g-1 0 -5 0 5 1 0
0
0 . 2
0 . 4
0 . 6
0 . 8
1
A C F x1[ n 0 ]
la g-3 0 -2 0 -1 0 0 1 0 2 0 3 0
0
0 . 2
0 . 4
0 . 6
0 . 8
1
A C F x20
[ n 0 ]
la g
Definition
210 ( ) ( )Wn TF H S
-
99
3. Moving average processes: MA(q)
),0( 2WN
][][ 02
0 nn WW
),0(0
22
q
kkW bN
0)(
0
0
][
00
00
2
0
0
0
0
nn
qnbb
nq
n
X
nq
knkkWX
It has limited support!!
q
kk knwbnx
0][][)(qMA][nw
q
k
q
kkk
q
kk
q
kkX knnwknwEbbknnwbknwbEnnxnxEn
0 0'0'
0'0'
000 ]'[][]'[][][][][
2' 0 ' 00 ' 0 0 ' 0
[ '] [ ' ' ] [ ( ' )]q q q q
k k k k Wk k k k
b b E w n w n n k k b b n k k
Statistical properties
Proof
-
10
10
3. Moving average processes: MA(q)
0)(
0
0
)]'([...][
00
00
2
0
0 0'0
2'0
0
0
nn
qnbb
nq
kknbbn
X
nq
knkkW
q
k
q
kWkkX
0)'(00)'(1
)]'([0
00 kkn
kknkkn 0' nkk
k
'k
q
q
0n
00 0 0 0 0 0
00 0 0 0 0 0
00 0 0 0 0 0
01 0 0 0 0 0
10 0 0 0 0 0
00 1 0 0 0 0
00 0 1 0 0 0
Statistical propertiesProof (contd.)
-
11
3. Moving average processes: MA(q)
11
(Brute force) determination of the MA parameters
0
0
0
20 0
0
0 0
0
[ ] 0
( ) 0
q n
X W k k nk
X
q n
n b b n q
n n
2 2 2 20 1 2[0]X Wr b b b 2 0 1 1 2[1]X Wr b b b b 2 0 2[2]X Wr b b
-
12
12
3. Moving average processes: MA(q)
Invertibility
)(qMA][nw
q
kk knwbnx
0][][
)(1 qMA
q
k
kk zbzH
0)(
q
k
kk
inv
zbzHzH
0
1)(
1)(
q
kk knwbnxb
nw10
][][1][
][nw
][][0
nxknwbq
kk
Example:3
]2[]1[][][ nxnxnxny does not have a stable, causal inverse]1[9.0][][ nxnxny has a stable, causal inverse
}Re{z
}Im{ z
1z
Whitening filter
Colouring filter
Channel equalizationChannel estimationCompressionForecasting
-
13
13
3. Moving average processes: generalizations
][]1[...]1[][][ 01010 00 nwbnwbqnwbqnwbnx qq Model not restricted to be causal
Model not restricted to be linear
q
k
q
kkk
q
kk knwknwbknwbnx
0
'
0'',
0
]'[][][][
][...]1[1 Fq qnwbnwb F Causal component
Anticausal component
Quadratic component
q
k
q
k
q
kkkk
q
k
q
kkk
q
kk knwknwknwbknwknwbknwbnx
0
'
0'
''
0'''',',
0
'
0'',
0
]''[]'[][]'[][][][
Volterra Kernels
1)
2)
q
kk knwfbnx
0
])[(][
-
14
14
4. Autoregressive processes: AR(p)
0 2 0 4 0 6 0 8 0 1 0 0-4
-3
-2
-1
0
1
2
3w [ n ]
t im e0 2 0 4 0 6 0 8 0 1 0 0
-5
0
5x 1 [ n ]
t im e0 2 0 4 0 6 0 8 0 1 0 0
-1
-0 . 5
0
0 . 5
1x 2 0 [ n ]
t im e
][...]1[][][ 1 pnxanxanwnx p
)(1
...11)( 2
21
1 zAzazazazH p
p
)( pAR][nw
LTI, with memory, invertible, causal, stable Definition
-30 -20 -10 0 10 20 300
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
n0
ACF(n0)
-
15
15
4. Autoregressive processes: AR(p)
][...]1[][][ 1 pnxanxanwnx p
)(1
...11)( 2
21
1 zAzazazazH p
p
)( pAR][nw
LTI, with memory, invertible, causal, stable Definition
Relationship to MA processes
12211
1...1
1)(k
kkp
p
zbzazaza
zH
Laurent series
p
kXkWX knann
100
20 ][][][
Statistical properties
),0( 2WN ])0[,0( XN Yule-Walker equations
p
k
nkkX zAn
10
0][Whose solution is kz Poles of )(zH
-
16
16
4. Autoregressive processes: AR(p)
Determination of the constants
p
k
nkkX zAn
10
0][
Example:
]2[]1[][][ 21 nxanxanwnx 22
111
1)( zazazH
Poles:
kA
p
k
nkkX zAnr
1
'0
0][
20 0 0
1[ ] [ ] [ ]
p
X W k Xk
n n a n k
p
kXkX knranr
100 ][][ 00 n
24
, 2211
21
aaazz
1,1,11 21212 aaaaazi
002
'21
'10 ][
nnX zAzAnr
04 221 aaRzi
1]0[ '2'1 AArX
]1[]0[]1[ 212'11
'1 XXX rarazAzAr
-
17
17
5. Autoregressive, Moving average: ARMA(p,q)
p
kk
q
kk knxaknwbnx
10][][][
1 20 1 2
( ) ( ) 1 21 2
... ( )( ) ( ) ( )1 ... ( )
qq
MA q AR p pp
b b z b z b z B zH z H z H za z a z a z A z
)( pAR][nw
Definition
Statistical properties
),0( 2WN ])0[,0( XN
)(qMA
p
kXk
q
kkWX knankhbn
10
00
20 ][][][
-
18
18
6. Autoregressive, Integrated, Moving Average: ARIMA(p,d,q)
][nxd
)()()()( ),( zHzWzXzH qpARMAd
),( qpARMA][nw
Definition
][*])1[][(*...*])1[][(][][ nxnnnnnxnx dld d
dd
Intd
d zzXzXzHzXzzX
)1(1
)()()()()1()( 1
1
)(dInt ][nx
Poles: 1z (Multiplicity=d)Unit root
)()()()(
)()(
)()()( )(),(),,( zHzHzX
zXzWzX
zWzXzH dIntqpARMA
d
dqdpARIMA
p
kk
q
kk knxknxaknwbnxnx
10]1[][][]1[][Example for d=1:
Qd FARIMA or ARFIMA
-
19
19
7. Seasonal ARIMA: SARIMA(p,d,q)x(P,D,Q)s(Box-Jenkins model)
),,( qdpARIMA][nw
Definition
( , , )( ) ( )( ) ARIMA p d qs
X z H zX z
),,( QDPARIMAs s][nxs ][nx
)()()(
),,(s
QDPARIMAs zHzWzX
)()()()()( ),,(),,(),,(),,( zHzHzWzXzH qdpARIMA
sQDPARIMAQDPqdpSARIMA s
P
kssk
Q
kkss sknxksnxAksnwBsnxnx
10])1([][][][][
p
kk
q
ksk knxknxaknxbnxnx
10]1[][][]1[][
1D
1d
-
20
20
7. Seasonal ARIMA: SARIMA(p,d,q)x(P,D,Q)sExample: SARIMA(1,0,0)x(0,1,1)12
),,( qdpARIMA][nw ),,( QDPARIMAs s][nxs ][nx
]12[][]12[][ 10 nwBnwBnxnx ss
]1[][][ 1 nxanxnx s)0,0,1(),,( qdp
)1,1,0(),,( QDP
]1[][][ 1 nxanxnxs
]12[][]13[]12[]1[][ 1011 nwBnwBnxanxnxanx
]13[]1[]12[][]12[][ 110 nxnxanwBnwBnxnx
-
21
21
8. Known external inputs: System identification
)(1zA
][nw +
)()(zAzB
][nu
][][][][01
nwknubknxanxq
kk
p
kk
)()(
1)()()()( zW
zAzU
zAzBzX
ARX
)()(zAzC][nw +
)()(zAzB
][nu
'
001][][][][
q
kk
q
kk
p
kk knwcknubknxanx
)()()()(
)()()( zW
zAzCzU
zAzBzX
ARMAX
-
22
22
9. A family of models
)()(zDzC][nw +
)()(zFzB
][nu
)()()(
)()()()(
)()( zWzAzD
zCzUzAzF
zBzX General model
)(1zA
][nx
Polynomials used Name of the model
A AR
C MA
AC ARMA
ACD ARIMA
AB ARX
ABC ARMAX
ABD ARARX
ABCD ARARMAX
BFCD Box-Jenkins
-
23
23
10. Nonlinear models
][][],...,2[],1[][ nwpnxnxnxfnx Nonlinear AR:][][][][
1nwknxnanx
p
kk
Time-varying AR:
Random coeff. AR: ][][])[(][1
nwknxnanxp
kk
Bilinear models ][][][][][11
nwMknwknxbknxanxq
kk
p
kk
'1 1
[ ] [ ] [ ] (1 ) [ ]p p
k kk k
x n a x n k p n a x n k p n w n
Smooth transition:
-
24
24
10. Nonlinear models
Smooth TAR (STAR):
Threshold AR (TAR):
tdnxnwknxa
tdnxnwknxanx p
kk
p
kk
][][][
][][][][
1
)2(
1
)1(
][])[(][][][1
)2(
1
)1( nwdnxSknxaknxanxp
kk
p
kk
Heterocedastic model: ][][][ nwnnx Random walk)1,0(N
p
kk knxan
1
220
2 ][][
q
kk
p
kk knbknxan
1
2
1
220
2 ][][][
ARCH
GARCH
(Neural networks)(Chaos)
-
25
25
10. Nonlinear models
[ ]x n[ ] 0[ ]x n n 2 0[ ][ ]x n n
q
kk
p
kk knbknxan
1
2
1
220
2 ][][][ GARCH
1 11
p q
k kk k
a b
The model is unique and stationary if Properties
[ ] 0E x n Zero meanLack of correlation
20
[ ] 0 0max ,
1
[ ] [ ]1
x n p q
k kk
n na b
-
26
26
10. Nonlinear models
q
kk
p
kk knbknxan
1
2
1
220
2 ][][][ GARCHEstimation through Maximum LikelihoodForecasting
min ,2 2 2
01 1
[ ] ( ) [ ] [ ]p q q
k k kk k
x n h a b x n h k b z n h k
2 2
0 [ 1] ...x n h 2 2
0 [ 2] ...x n h ...2[ ]x n2[ 1]x n Observed
[ 1] 0z n h [ 2] 0z n h
...2 2[ ] [ ] [ ]z n x n n
2 2[ 1] [ 1] [ 1]z n x n n GARCH(1,1)
2 2 2 20 1 1 [ 1] [ ] [ ]x n a x n b n
12 2 2 1 2
0 1 1 1 1 1 1 1 10
[ ] ( ) ( ) [ ] ( ) [ ]h
k h
kx n h a b a a b x n b a b n
-
27
27
10. Nonlinear models
Extensions of GARCH
Exponential GARCH (EGARCH)
2 2 2 20
1 1log [ ] log log [ ] log [ ]
p q
k kk k
n a x n k b n k
Integrated GARCH (IGARCH)
1 1
1p q
k kk k
a b
1 1
1p q
k kk k
a b
GARCH IGARCH
-
28
28
11. Parameter estimation
Maximum Likelihood Estimates (MLE)
][]1[][ 1 nwnxanx AR(1)Assume that we observe ][],...,2[],1[ Nxxx
][]1[][ 1 nwnxanx
212
1 1,0|
aNX W
][][ 02
0 nn WW
...)( 121111 nnnnnn WWXaaWXaX...3
312
2111 nnnn WaWaWaW
0nXE 2331221112 ...nnnnn WaWaWaWEXE 2
1
261
41
21
2
1...1
aaaa WW
2112 WXaX 1122 XaXW 2
2 1 1 1 21
| , ,1
WX X N a xa
2,0 WN
21, Wa
-
29
29
11. Parameter estimation
Maximum Likelihood Estimates (MLE)
212
1 1,0|
aNX W
2
2 1 1 1 21
| , ,1
WX X N a xa
2
3 2 1 1 2 21
| , , ,1
WX X X N a xa
),...,,( 21|...21 NXXX xxxf N )()...()()( ,|3,|2,|1| 123121 NXXXXXXX xfxfxfxf NN
N
n W
nnW
N
a
WNXXX
xaxxa
xxxfLWN 2
2
211
212
21
1
21
21
2
21
21
21|... )2log(21
log)2log(),...,,(log)(21
221
21
21
)(0)()(maxarg,1 WaW
LaLLa
Numerical, iterative solution
Confidence intervals
-
30
30
11. Parameter estimation
][][][ nwnxnx
0][ nwE ]0[]1[2]0[][ 21122 XXXW aanwE
][][][ nxnxnw Least Squares Estimates (LSE)
][]1[][ 1 nwnxanx ]1[][ 1 nxanx
1
]1[]0[]0[
]0[]1[]0[2]1[20 2
22
111
2
X
XXW
X
XXX
W aaa
-
31
31
12. Order selection
If I have to fit a model ARMA(p,q), what are the p and q values I have to supply?
ACF/PACF analysis Akaike Information Criterion
Bayesian Information Criterion
Final Prediction Error
NqpqpAIC W
2)(log),( 2
NNqpqpBIC W
log)(log),( 2
2)( WpNpNpFPE
-
32
32
12. Order selection
Partial correlation coefficients (PACF, First-order correlations)
][][...]2[]1[][ 0,2,1, 0000 nwnnxnxnxnx nnnn
00
10,0 ][][
n
nnn nnrnr
][
]1[...
]3[]2[]1[
...
]0[]1[...]2[]1[]1[]1[]0[...]4[]3[]2[
..................]3[]4[...]0[]1[]2[]2[]3[...]1[]0[]1[]1[]2[...]2[]1[]0[
0
0
,
1,
3,
2,
1,
000
000
00
00
00
00
00
0
0
0
nrnr
rrr
rnrnrnrrrnrnrnr
nrnrrrrnrnrrrrnrnrrrr
nn
nn
n
n
n
Yule-Walker equations
-
33
33
12. Order selection
Thumb ruleARMA(1,0): ACF: exponential decrease; PACF: one peakARMA(2,0): ACF: exponential decrease or waves; PACF: two peaksARMA(p,0): ACF: unlimited, decaying; PACF: limitedARMA(0,1): ACF: one peak; PACF: exponential decrease ARMA(0,2): ACF: two peaks; PACF: exponential decrease or wavesARMA(0,q): ACF: limited; PACF: unlimited decayingARMA(1,1): ACF&PACF: exponential decrease ARMA(p,q): ACF: unlimited; PACF: unlimited
-
34
34
13. Model checking
Residual AnalysisExample: ARMA (1,1)
]1[]1[][][]1[][]1[][ 1110 0 nwbnaxnxnwnwbnwbnaxnx b
Assumptions
1. Gaussianity:1. The input random signal w[n] is univariate normal with zero mean2. The output signal, x[n] (the time series being studied), is multivariate normal and its
covariance structure is fully determined by the model structure and parameters2. Stationarity: x[n] is stationary once that the necessary operations to produce a stationary
signal have been carried out.3. Residual independency: the input random signal w[n] is independent of all previous
samples.
0]0[ w
-
35
35
13. Model checking
[ ]x n
n
Structural changes
1 1[ ] ( , )y n f x y 2 2[ ] ( , )y n f x y
[ ] ( , )y n f x y
2
1
( [ ] [ ])N
i in
S y n y n
Sum of squares of the residuals with model i
Chow test:
0 1 2
1 1 2
::
H f fH f f
1 2
11 2
1 211 22
( ( )) ( , 2 )( )
k
N N k
S S SF F k N N kS S
Number of samples in each period
Number of parameters in
the model
Assumption: variance is the same in both regionsSolution: Robust standard errors
-
36
36
13. Model checking
Diagnostic checking
1. Compute and plot the residual error2. Check that its mean is approximately zero3. Check for the randomness of the residual, i.e., there are no time intervals where the
mean is significantly different from zero (intervals where the residual is systematically positive or negative).
4. Check that the residual autocorrelation is not significantly different from zero for all lags
5. Check that the residual is normally distributed.6. Check if there are residual outliers.7. Check the ability of the model to predict future samples
-
37
14. Self-similarity, Fractal dimension, Chaos theory
37
Intuitively a fractal is a curve that is self-similar at all scales
Koch curve
k=1
k=0
k=2
k=3
k=4
k=5
( )XS
-
38
14. Self-similarity, Fractal dimension, Chaos theory
38
L==1m
=0.5m=0.5m 12N =0.25m =0.25m =0.25m =0.25m 14N
1 1N D
11N
S=1m2
L==1m
2
11N
=0.5m
2
14N 2=1m2 2=0.25m2
2
116N 2=0.0625m2
=0.25m
2 2N D
-
39
14. Self-similarity, Fractal dimension, Chaos theory
39
1 1D DN 13m
1m
2
13
m
3
13m
143
DDN
22
143
DDN
33
143
DDN
k=1
k=2
k=3
1 log 44 1.263 log3
DD k
kN D
-
40
14. Self-similarity, Fractal dimension, Chaos theory
40
( )XS 5 2D
2H D Hurst exponent
0.2
0,1 0.50.8
H
Uncorrelated time series
Long-range dependent alternating signs
Long-range dependent same signs
maxmax max
max
1 [ ] [ ]N
HMA MAN n
n nx n x n n
-
41
14. Self-similarity, Fractal dimension, Chaos theory
41
-
42
14. Self-similarity, Fractal dimension, Chaos theory
42
0 10 20 30 40 50 60 70 80 90 1000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
n
x0.1[n]
[ ] 4 [ 1](1 [ 1])x n x n x n
Chaotic systems
Logistic equation
[0] 0.1x
0 10 20 30 40 50 60 70 80 90 100-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
n
x0.1[n]-x0.100001[n]
-
43
14. Self-similarity, Fractal dimension, Chaos theory
43
Phase space
Attractor (Fixed point)
2-history 22[ ] [ ], [ 1]n x n x n x 3-history 33[ ] [ ], [ 1], [ 2]n x n x n x n x h-history [ ] [ ], [ 1],..., [ 1] hh n x n x n x n h x
2[2] [2], [1]x xx 2[3] [3], [2]x xx 2[1] [1], [0]x xx
2-Phase space
-
44
14. Self-similarity, Fractal dimension, Chaos theory
44
Recurrence plots
,
1 [ ] [ ']( , ')
0 [ ] [ ']h h
hh h
n nR n n
n n
x xx x
White noise
2, ( , ')R n n
Sinusoidal Chaotic system with trend
AR model
-
45
14. Self-similarity, Fractal dimension, Chaos theory
45
Recurrence plots
-
46
14. Self-similarity, Fractal dimension, Chaos theory
46
Correlation dimension (Grassberger-Procaccia plots)Correlation integral
( 1), ' 12
'
1( ) Pr [ ] [ '] lim [ ] [ ']N
h h h h hN NN n nn n
C n n H n n x x x x
0 0( )
1 0x
H xx
Heaviside functionCorrelation dimension
0
log( ( ))limlog( )
hh
CD
hD
hD h (random)
(maybe chaotic)
h
-
47
14. Self-similarity, Fractal dimension, Chaos theory
47
Brock-Dechert-Scheinkman (BDS) Test
1,
( ) ( )(0,1)
hh
h
C CV N
N
Lyapunov exponentConsider two time points such that
0[ ] [ '] 1h hn n x x Consider the distance m samples later [ ] [ ' ]m h hn m n m x x
0m
m e The Lyapunov exponent relates these two distances
0 Histories diverge: chaos, cannot be predicted0 Histories converge: can be predicted
Maximal Lyapunov exponent
0, 0
1lim log mm m
H0: samples are iid
-
48
48
Bibliography
C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.
D.S.G. Pollock. A handbook of time-series analysis, signal processing and dynamics. Academics Press, 1999.
J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994.
-
Time Series Analysis
Session IV: Forecasting and Data Mining
Carlos scar Snchez Sorzano, Ph.D.Madrid
-
2Session outline
1. Forecasting2. Univariate forecasting3. Intervention modelling4. State-space modelling5. Time series data mining
1. Time series representation2. Distance measure3. Anomaly/Novelty detection4. Classification/Clustering5. Indexing6. Motif discovery7. Rule extraction8. Segmentation9. Summarization
-
31. Forecasting
?][ hnx Goals: Symmetric loss functions:
Quadratic loss function: Other loss functions:
Assymmetric loss functions:
])[],[(minarg][][ hghnxLEhghnx ngn
2][][])[],[( hnxhnxhnxhnxL ][][])[],[( hnxhnxhnxhnxL
][][][][][][][][])[],[( 2
2
hnxhnxhnxhnxbhnxhnxhnxhnxahnxhnxL
Solution: )][],...,2[],1[][][][ nxxxhnxEhghnx n Univariate Forecasting: use only samples of the time series to be predicted
Multivariate Forecasting: use samples of the time series to be predicted and other companion time series.
-
42. Univariate Forecasting
Trend and seasonal component extrapolation
][][][][ nrandomnseasonalntrendnx nntrend 6.12777][ nnnseasonal 1223282 cos5.22cos5.26][
-0.2 0 0.2 0.4 0.6 0.8 1 1.2-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
-1 -0.5 0 0.5 1 1.5 2-2
-1.5
-1
-0.5
0
0.5
1
1.5
2
-
52. Univariate Forecasting
Exponential smoothing
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870
200
300
Year
P
r
i
c
e
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870
200
300
Year
P
r
i
c
e
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-100
0
100
Year
P
r
i
c
e
-
t
r
e
n
d
1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-50
050
Year
R
e
s
i
d
u
a
l
][)1(][]1[ nxnxnx
1)()()(
zzX
zXzH
-2 0 20
0.5
1
|
H
(
)
|
2
-2 0 2
-2
0
2
a
n
g
l
e
(
H
(
)
)
][])[][( nxnxnx ][][ nxne
]1[]1[ xx
-
62. Univariate forecasting
Model based forecasting (Box-Jenkins procedure)
1. Model identification: examine the data to select an apropriate model structure (AR, ARMA, ARIMA, SARIMA, etc.)
2. Model estimation: estimate the model parameters3. Diagnostic checking: examine the residuals of the model to check if it is valid4. Consider alternative models if necessary: if the residual analysis reveals that the
selected model is not apropriate5. Use the model difference equation to predict: as shown in the next two slides.
-
72. Univariate forecasting
Model based forecasting
]1[][]1[][ nbwnwnaxnx
][]1[][]1[ nbwnwnaxnx
Example: ARMA(1,1)
1
1
11
)()()(
azbz
zWzXzHW
)()()()()()(zH
zXbzaXzbWzaXzXzW
bzba
bzba
zzHba
zzXzXzH
WX
11
1)(
1)()()(
][)(][]1[ nxbanxbnx ][][)(]1[ nxbnxbanx
...]3[]2[]1[][]1[][ 32 hnxahnxahnbwhnwhnxahnx
][][]1[ 11 nxbnxbaanxa hh
1h
-
82. Univariate forecasting
]12[][])13[]1[(]12[][ nwnwnxnxnxnx
]11[]1[])12[][(]11[]1[ nwnwnxnxnxnx
Example: SARIMA(1,0,0)x(0,1,1)12
Model based forecasting
13121
12
11
)()()(
zzzz
zWzXzHW
]12[]13[]12[]1[][][ nwnxnxnxnxnw Eliminating w[n]
]13[]12[)(]11[]1[][)(]1[ nxnxnxnxnxnx ]12[]24[]23[ nxnxnx
-
92. Univariate Forecasting
-
10
2. Univariate forecasting
[ ]x n
n
1 1[ ] ( , )y n f x y
[ ] ( , )y n f x y
Chow test:
0 1
1 1
::
H f fH f f
2
1
11
2 111
( )( , )N
N k
S SF F N N k
S
Assumption: variance is the same in both regionsSolution: Robust standard errors
Predictive power test
-
2. Univariate forecasting
11
Artificial Neural Network (ANN)
1 2 ( , ,..., )t t t t p t t ty f y y y y w
-
2. Univariate forecasting
12
Fuzzy Artificial Neural Network (ANN)
( ) : [0,1]Ma
x
-
2. Univariate forecasting
13
Fuzzy Artificial Neural Network (ANN)
-
2. Univariate forecasting
14
Fuzzy Artificial Neural Network (ANN)
-
2. Univariate forecasting
15
Fuzzy Artificial Neural Network (ANN)
-
16
3. Intervention modellingWhat happens in the time series of a price if there is tax raise of 3% in 1990?
[ ] [ 1] [ ] 0.03 [ 1990]price n price n w n u n
1 0[ ]
0 0n
u nn
1982 1984 1986 1988 1990 1992 1994
0
0.02
0.04
[ ] [ 1] [ ] 0.03( [ 1990] [ 1993])price n price n w n u n u n
1982 1984 1986 1988 1990 1992 1994
0
0.02
0.04
What happens in the time series of a price if there is tax raise of 3% between 1990 and 1992?
1 19901
1( ) ( ) ( ) 0.031
price z price z z w z zz
1( ) ( ) ( )price z price z z w z 1990 1993
1
10.03 ( )1
z zz
-
17
3. Intervention modellingWhat happens in the time series of a price if in 1990 there was an earthquake?
[ ] [ 1] [ ] [ 1990]price n price n w n n 1 0
[ ]0
nn
otherwise
1982 1984 1986 1988 1990 1992 1994
0
0.02
0.04
What happens in the time series of a price if there is a steady tax raise of 3% since 1990?[ ] [ 1] [ ] 0.03( 1989) [ 1990]price n price n w n n u n
1982 1984 1986 1988 1990 1992 19940
0.05
0.1
0.15
0.2
1 1990( ) ( ) ( ) 1price z price z z w z z
1 19901 1
1 1( ) ( ) ( ) 0.031 1
price z price z z w z zz z
-
18
3. Intervention modelling
In general
[ ] ( [ 1],..., [ ], [ ], [ 1],..., [ ]) ( [ ])x n f x n x n q w n w n w n p f i n ( ) ( )( ) ( ) ( )( ) ( )
B z C zX z W z I zA z D z
We are back to the system identification problem with external inputs
-
19
3. Intervention modelling: outliers revisited
1982 1984 1986 1988 1990 1992 1994
0
0.02
0.04
Additive outliers: ( )( ) ( ) ( )( )
B zX z W z I zA z
Innovational outliers:( ) ( )( ) ( ) ( )( ) ( )
B z C zX z W z I zA z D z
Level outliers:
1982 1984 1986 1988 1990 1992 1994
0
0.02
0.04
( )( ) ( ) ( )( )
B zX z W z I zA z
Time change outliers: ( ) 1( ) ( ) ( )( ) ( )
B zX z W z I zA z D z
-
20
4. State-space modelling
F1z
[ ] kn x H
1[ ]n2[ ]n
[ ] ln y StateNoise Noise
Observed time-seriesObservation
systemState-transition
system
Linear, Gaussian system: 1[ ] [ 1] [ ]n F n Q n x x State-transition model2[ ] [ ] [ ]n H n n y x Observation model
0 0[0] ( , )N x 1 1[ ] ( , )n N 02 2[ ] ( , )n N 0
Kalman filter: Given estimateTime series modelling (System Identification):
0 0 1 2[ ], , , , , ,x n H F
1 2[ ], ,x n 0 0[ ], , , ,y n H F
Given y[n] estimate
-
21
4. State-space modelling
-
22
4. State-space modelling
Linear, Gaussian system: 1[ ] [ 1] [ ]n F n Q n x x State-transition model2[ ] [ ] [ ]n H n n y x Observation model
Extended/Unscented Kalman filter: Given estimate 1 2[ ], ,x n 0 0[ ], , , ,y n h f
General system: 1[ ] ( [ 1], [ ])n f n n x x 2[ ] ( [ ], [ ])n h n ny x
,f h Differentiable
Particle filter:
Given and the probability density functions of 0 0[ ], , , ,y n h f 1 2, estimate x[n] and the free parameters of the noise signals.
-
23
5. Time series data mining
Source: http://www.kdnuggets.com/polls/2004/time_series_data_mining.htm
-
24
5. Time series data miningTime Series
Representations
Data Adaptive Non Data Adaptive
SpectralWavelets PiecewiseAggregateApproximation
Piecewise Polynomial
SymbolicSingularValueDecomposition
RandomMappings
PiecewiseLinear
ApproximationAdaptivePiecewiseConstant
Approximation
DiscreteFourier
Transform
DiscreteCosine
TransformHaar Daubechies
dbn n > 1Coiflets Symlets
Sorted Coefficients
Orthonormal Bi-Orthonormal
Interpolation Regression
Trees
Natural Language
Strings
0 20 40 60 80 100120 0 20 40 60 80 100120 0 20 40 60 80 100120 0 20 40 60 80 1001200 20 40 60 80 100120 0 20 40 60 80 100120
DFT DWT SVD APCA PAA PLA
0 20 40 60 80 100120
SYM
UUCUCUCD
UU
CU
CU
DD
Time series representation
-
25
5. Time series data mining
0 20 40 60 80 100 120
C
C
0
--
0 20 40 60 80 100 120
bbb
a
cc
c
a
baabccbc
Time series representation
-
26
5. Time series data miningTime series representation
Source: http://www.cs.cmu.edu/~bobski/pubs/tr01108-onesided.pdf
-
27
5. Time series data miningTime series distance measure
-
28
5. Time series data mining
Source: http://www.cs.ucr.edu/~wli/SSDBM05/
Anomaly/Novelty detection
A very complex and noisy ECG, but according to a cardiologist there is only one abnormal heartbeat.The algorithm easily finds it.
-
29
5. Time series data miningClassification/Clustering
-
30
5. Time series data miningIndexing
-
31
5. Time series data mining
W inding Dataset (T he angular speed of reel 2)
0 500 1000 1500 2000 2500
0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140
A B C
A B C
Motif discovery
-
5. Time series data mining
32
Motif discovery
-
33
5. Time series data miningRule extraction
-
5. Time series data mining
34
Rule extraction
1x n x n
y nx n
Step 1: Convert time series into a symbol sequence
-
5. Time series data mining
35
Rule extractionStep 2: Identify frequent itemsets
x[n]=abbcaabbcabababcaabcabbcabcbcabbaabcbcabbcbc
2-item setaa 3ab 11ac 1ba 2bb 5bc 11ca 7cb 3cc 0
Min.Support=5
3-item setaba 2abb 5abc 4bba 1bbb 0bbc 4bca 7bcb 3bcc 0caa 2cab 5cac 0
4-item setabba 1abbb 0abbc 4bcaa 2bcab 5bcac 0caba 1cabb 3cabc 1
5-item setbcaba 1bcabb 3bcabc 1
-
36
5. Time series data miningSegmentation (Change Point Detection)
-
37
5. Time series data miningSummarization
-
38
Session outline
1. Forecasting2. Univariate forecasting3. Intervention modelling4. State-space modelling5. Time series data mining
1. Time series representation2. Distance measure3. Anomaly/Novelty detection4. Classification/Clustering5. Indexing6. Motif discovery7. Rule extraction8. Segmentation9. Summarization
-
39
Conclusions
][][][][ nrandomnperiodicntrendnx
Regression, Curve fitting
Explained by statistical models (AR, MA, ARMA)
Preprocessing (heteroskedasticity, gaussianity, outliers, )?
Harmonic analysis,Filtering
(F)ARIMA SARIMA
-
40
Bibliography
C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.
C. Chatfield. Time-series forecasting. Chapman & Hall, CRC, 2000. D.S.G. Pollock. A handbook of time-series analysis, signal processing
and dynamics. Academics Press, 1999. D. Pea, G. C. Tiao, R. S. Tsay. A course in time series analysis. John
Wiley and Sons, Inc. 2001
Session 0_x1Session I_x1Session II_x1Session III_x1Session IV_x1