Time Series Analysis

Time Series Analysis

Session 0: Course outline

Carlos scar Snchez Sorzano, Ph.D.Madrid

2Motivation for this course

3Course outline

4Course outline: Session 1

Kinds of time series Continuous time series sampling Data models Descriptive analysis: time plots and data preprocessing Distributional properties: statisitcal distribution, stationarity and autocorrelation Outlier detection and rejection

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4Sampling

t

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4Regular sampling

tIrregular sampling

][][][][ nrandomnperiodicntrendnx


Trend analysis: Linear and non-linear regression Polynomial fitting Cubic spline fittingSeasonal component analysis: Spectral representation of stationary processesSpectral signal processing: Detrending and filtering Non-stationary signal processing



Model definition: Moving Average processes (MA) Autoregressive processes (AR) Autoregressive, Moving Average (ARMA) Autoregressive, Integrated, Moving Average (ARIMA, FARIMA) Seasonal, Autoregressive, Integrated, Moving Average (SARIMA) Known external inputs: System identification A family of models Nonlinear modelsParameter estimationOrder selectionModel checkingSelf-similarity, fractal dimension and chaos theory



Forecasting Univariate forecasting

Intervention modelling State-space modelling Time series data mining

Time series representation Distance measure Anomaly/Novelty detection Classification/Clustering Indexing Motif discovery Rule extraction Segmentation Summarization

W inding Dataset (T he angular speed of reel 2)

0 500 1000 1500 2000 2500A B C

8Course Outline: Session 5

Su nombre aqu

Bring your own data if possible!

9Suggested readings

It is suggested to read (before coming): Geo4990: Time series analysis Adler1998: Analysing stable time series Leonard: Mining Transactional and Time Series Data Chattarjee2006: Simple Linear Regression

10

Resources

Data setshttp://www.york.ac.uk/depts/maths/data/tshttp://www-personal.buseco.monash.edu.au/~hyndman/TSDL

Competitionhttp://www.neural-forecasting-competition.com

Links to organizations, events, software, datasets http://www.buseco.monash.edu.au/units/forecasting/links.phphttp://www.secondmoment.org/time_series.php

Lecture noteshttp://www.econphd.net/notes.htm#Econometrics

11

Bibliography

D. Pea, G. C. Tiao, R. S. Tsay. A course in time series analysis. John Wiley and Sons, Inc. 2001

C. Chatfield. The analysis of time series: an introduction. Chapman & Hall, CRC, 1996.

C. Chatfield. Time-series forecasting. Chapman & Hall, CRC, 2000. D.S.G. Pollock. A handbook of time-series analysis, signal processing

and dynamics. Academics Press, 1999. J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994. C. Prez. Econometra de las series temporales. Prentice Hall (2006) A. V. Oppenheim, R. W. Schafer, J. R. Buck. Discrete-time signal

processing, 2nd edition. Prentice Hall, 1999. A. Papoulis, S. U. Pillai. Probability, random variables and stochastic

processes, 4th edition. McGraw Hill, 2002.

1Time Series Analysis

Session I: Introduction


22

Session outline

1. Features and objectives of the time series2. Sampling3. Components of time series: data models4. Descriptive analysis5. Distributional properties6. Detection and removal of outliers7. Time series methods

33

1. Features and objectives of the time series

Goal: Explain historyFeatures:1. Samples (discrete) 2. Bounded 3. Finite support

nx

nnsnslnx 1 Fnnnnnx ,...,1,0 00

TrendPeriodic componentPeriod=N7 years

][][ nxNnx

44

1. Features and objectives of the time series

Goal: Forecast demandFeatures:1. Seasonal2. Non stationary3. Non independent samples

Mean

StandardVariance

55

1. Features and objectives of the time seriesPendulum angle with different controllers

Goal: Control=Forecast(+correct)

Features:1. Continuous signal

Regular sampling)(][ snTxnx

sT =Sampling period

66

1. Features and objectives of the time seriesOther kind of times series not covered in this course

77

1. Features and objectives of the time seriesOther kind of times series not covered in this course

88

2. Sampling

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4Sampling

t

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4Regular sampling

t

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4Irregular sampling

t

)(][ snTxnx

Also called non-uniform sampling

99

2. Sampling

0 10 20 30 40 50 60 70 80 90 100-5

0

5Continuous signal

t

0 10 20 30 40 50 60 70 80 90 100-5

0

5Oversampling

t

0 10 20 30 40 50 60 70 80 90 100-5

0

5Critical sampling

t

0 10 20 30 40 50 60 70 80 90 100-5

0

5Undersampling

t

10min T

5minTTs

2minTTs

minTTs

Nyquist/Shannon criterion

81001041003211001 2sin2.02sin2sin2)( ttttx

10

10

2. Sampling

11

11

2. Sampling: Signal reconstruction

0 5 10 15 20

-0.2

0

0.2

0.4

0.6

0.8

1

Continuous signal

t0 5 10 15 20

-0.2

0

0.2

0.4

0.6

0.8

1

Sampled signal

t-5 0 5

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1sinc(t)

t

0 5 10 15 20

-0.2

0

0.2

0.4

0.6

0.8

1

Sinc superposition

t9 9.5 10 10.5 11

-0.2

0

0.2

0.4

0.6

0.8

1

Sinc superposition

t0 5 10 15 20

-0.2

0

0.2

0.4

0.6

0.8

1

Reconstructed signal

t

n

srr nTthnxtx )()(

sr T

tcth sin)(

Reconstruction formula

Reconstruction kernel for bandlimited signals

12

12

2. Sampling: Aliasing

0 20 40 60 80 100 120 140 160 180 200-1

-0.5

0

0.5

1Obvious aliasing effect

t

0 50 100 150 200 250 300 350 400-3

-2

-1

0

1

2

3Not so clear aliasing effect

t

21.1minmin TTTs

21.1minmin TTTs

81001041003211001 2sin2.02sin2sin2)( ttttx

10100( ) sin 2x t t

13

13

2. Aliasing: Conclusions

From the disgression above, two are the main consequences that must be kept in mind:1. Any continuous time series can be safely treated as a discrete time

series as long as the Nyquist criterion is satisfied.2. Once our discrete analysis is finished, we can always return to the

continuous world by reconstructing our output sequence.

Although not discussed, once the continuous signal is discretized, one can arbitrarily change the sampling period without having to go back to the continuous signal with two operations called upsampling (going for a finer discretization) and downsampling (coarser discretization).

14

14

3. Components of a time series: data models

][][][][ nrandomnperiodicntrendnx SeasonalEx: Unemployment is low in summerEx: Temperature is high in the middle of the day

Long-term change. What is long-term?

0 50 100 150 200 250 300 350 400-3

-2

-1

0

1

2

3

t

Explained by statistical models (AR, MA, )

15

15

0 100 200 300 400-10

-5

0

5

10

t

Trend

0 100 200 300 400-10

-5

0

5

10

t

Seasonal

0 100 200 300 400-10

-5

0

5

10

t

Noise

0 100 200 300 400-10

-5

0

5

10

t

Additive model

0 100 200 300 400-10

-5

0

5

10

t

Seasonal multiplicative model

0 100 200 300 400-10

-5

0

5

10

t

Fully multiplicative model

3. Components of a time series: data models

][][][][ nnSnmnx ][][][][ nnSnmnx ][][][][ nnSnmnx

][log][log][log][log nnSnmnx log

16

16

3. Components of a time series: data model

0 10 20 30 40 50 60 70 80 90 100-1

-0.5

0

0.5

1

0 10 20 30 40 50 60 70 80 90 100-1

-0.5

0

0.5

1

Noise is dependent of signal?

0 10 20 30 40 50 60 70 80 90 100-1

0

1Signal x1

0 10 20 30 40 50 60 70 80 90 100-1

0

1Signal x2

0 10 20 30 40 50 60 70 80 90 100-1

0

1Interferent signal

0 10 20 30 40 50 60 70 80 90 100-2

0

2Observed signal

Noise is dependent of noise

[ ] [ ] [ ]x n x n n

17

17

4. Descriptive analysis

Time plot Take care of appearance (scale, point shape, line shape, etc.) Take care of labelling axes specifying units (be careful with the

scientific notation, e.g. from 0.5e+03 to 0.1e+04) Data Preprocessing

Consider transforming the data to enhance a certain feature, although this is a controversial topic:

Logarithm: Stabilizes variance in the fully multiplicative model

Box-Cox: the transformed data tends to be normally distributed])[log(][ nxny

0])[log(0][

1][

nxny

nx

18

18


Data preprocessing: (Removal of trend) Detrending: if the trend clearly follows a known

curve (line, polynomial, logistic, Gaussian, Gompertz, etc.), fit a model to the time series and detrend (either by substracting or dividing).

(Removal of trend) Differencing

2

2

]2[]1[2][][][

]1[][2]1[][][

][]1[][][

]1[][][][

sllll

slrlr

srr

sll

Tnxnxnxnyny

Tnxnxnxnyny

Tnxnxnxny

Tnxnxnxny

-6 -4 -2 0 2 4 6-10

-5

0

5

10

15

20

25

30

x[n]yl[n]

ylr[n]

1.5

)(][)2sin(5.110)(

snTxnxtttx

19

19


Data preprocessing: (Removal of season) Deseasoning:average over the seasonal period

to remove its effect. (Estimate of season) Estimating seasoning: substract the

deseasoned time series from a local estimate of the current sample.

][...]4[]5[]6[121][ 21 nxnxnxnxnmest

]6[]5[...]1[ 21 nxnxnx][][

21][

2

21 nmknxnS est

kkest

][][][][ nenSnmnx

0 10 20 30 40 50 60 70 80 90 100-20

0

20

40

60

80

100

x[n]m[n]mest[n]

Sest[n]

][]2[]1[][]1[]2[][ 8141214181 nmnxnxnxnxnxnS estest

20

20


Data preprocessing: (Removal or isolation of trend, seasonal or noise) Filtering: filtering

aims at the removal of any of the components, for example, the moving average is a common filtering operation in stock analysis to remove noise.

20

1][

201][

kknxny

10

10][

211][

kknxny

40 60 80 100 120 140 160 180

7

8

9

10

OriginalCausalCausal+Anticausal

21

21

5. Distributional properties

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4

tIrregular sampling

][nx31]31[ Xx

All variables are identically distributed StationarityA variable can be normally distributed or Poisson or etc.It is important to characterize their distribution

All variables are independent White processIndependency is measured through the autocorrelation function

0 10 20 30 40 50 60 70 80 90 100-3

-2

-1

0

1

2

3

22

22


-5 0 5

0

0.5

1

Normal distribution function

X-5 0 5

0

0.1

0.2

0.3

0.4Normal probability density function

X

-5 0 5 10 15 20

0

0.5

1

Poisson distribution function

X-5 0 5 10

0

0.1

0.2

0.3

0.4Poisson probability density function

X

xXxFX Pr)( x dttf )(

xx

ii

xXPr

)(dxFxXE Xrr

x r dttfx )(

xx

ir

i

xXx Pr

1 !

1)(k

kkk

itXX tk

XEieEt

dttiteeaFbF X

itbita

XX )(lim21)()(

yYxXyxF YX ,Pr),(,

Characteristic function

23

23


0 20 40 60 80 100-20

0

20

40

60

80

100

0 20 40 60 80 100-1

-0.5

0

0.5

1

0 50 100 150 200-0.2

0

0.2

0.4

0.6

0.8

1

0 1 2 3 4

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Examples of nonstationary variables

24

24

5. Distributional propertiesStrictly stationary ),...,,(),...,,(

21212121 ,...,,,...,, NnNnNnXXXnnnXXX kNknNnNnkknnnxxxFxxxF

Nnnnk k ,,...,,, 21Consequences: nXE n

20*0 ][))((),(][ 00 nXXEXXCovnC nnnnnn

0,nnAutocorrelation function (ACF)

Wide sense stationary nXE n][),( 00 nXXCorr nnn 0,nn

][][ 0*

0 nn ][]0[ 0n

lag

][),( 0* 00 nXXEXXCorr nnnnnn

Autocovariance function

Correlation coefficient ]0[

][][ 00 CnCnr

0n0n

Example: white process

0001

][][0

00

20 n

nnn X

25

25


]0[][][ 00 C

nCnr Correlation coefficients (Zero-order correlations)

Under the null hypothesis it is distributed as Students t with N-2 degrees of freedom.

Number of samples upon which is estimated][ 0nr

The probability of observing if the null hypothesis is true is given by

0][ 00 nrH

][ 0nr

][12][Pr

020 nr

Nnrt

26

26

5. Distributional proepertiesErgodicity

0 5 10 15 20 25-1

0

1

0 5 10 15 20 25-1

0

1

0 5 10 15 20 25-1

0

1

0 5 10 15 20 25-1

0

1

Realization 1

Realization 2

Realization 3

Realization 4

4

1,204

120

iix

25

1251 ][

nnx

Strong law of large numbers

N

iiNN

x1

,201

20 lim

Equal if stationary and ergodic for the mean

Ensemble average

Time average

27

27


0 10 20 30 40 50 60 70 80 90 100-25

-20

-15

-10

-5

0

5

10

15

20

][][][ nenmnx

),0( 2N ),0( 2N 0nXE

22),(0 nnn XXCov

The time average still converges to the ensemble average

Stationarity ErgodicityExample:

28

28


How to detect non-stationarity in the mean?There is a variation in the local mean

Solutions:

Polynomial trends: By differentiating p times

(2) (1) (1)[ ] [ ] [ 1]x n x n x n

( ) ( 1) ( 1)[ ] [ ] [ 1]p p px n x n x n

Removal of a linear trend

Removal of a quadratic trend

Removal of a p-th order polynomial trend

(1)[ ] [ ] [ 1]x n x n x n Removal of a constant (or piece-wise constant)

(3) (2) (2)[ ] [ ] [ 1]x n x n x n

29

29


How to detect non-stationarity in the mean?Exponential trends: Taking logs

[ ][ ] log[ 1]x ny n

x n Financial time series

30

30


How to detect non-stationarity due to seasonality?There is a periodic variation in the local mean

Solutions:

Polynomial trends: By differentiating p times with that seasonality(1)[ ] [ ] [ 12]x n x n x n Monthly data: removal of a yearly seasonality(1)[ ] [ ] [ 7]x n x n x n Daily data: removal of a weekly seasonality

How to detect non-stationarity in the variance?There is a variation in the local variance

Solutions: Box-Cox transformation tends to stabilize variance

How to detect non-stationarity?Solutions: Unit root tests

31

31

6. Outlier detection and rejection

Common procedures:1. Visual inspection2. Mean k Standard Deviation3. Median k Median Abs. Deviation4. Robust detection5. Robust model fitting

Common actions:1. Remove observation2. Substitute by an estimate

Do Estimate mean and Std. Dev. Remove samples outside a given interval

Until (No sample is removed)

0 10 20 30 40 50 60 70 80 90 100-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

Sensor blackout

32

32

7. Time series methods

Time-domain methods: Based on classical theory of correlation (autocovariance) Include parametric methods (AR, MA, ARMA, ARIMA, etc.) Include regression (linear, nonlinear)

Frequency-domain (spectral) methods: Based on Fourier analysis Include harmonic methods

Neural networks and Fuzzy neural networks Other fancy approaches: fractal models, wavelet models,

bayesian networks, etc.

33

33

Bibliography


D.S.G. Pollock. A handbook of time-series analysis, signal processing and dynamics. Academics Press, 1999.

J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994. A. V. Oppenheim, R. W. Schafer, J. R. Buck. Discrete-time signal

processing, 2nd edition. Prentice Hall, 1999. A. Papoulis, S. U. Pillai. Probability, random variables and stochastic

processes, 4th edition. McGraw Hill, 2002.


Session II: Regression and Harmonic Analysis


2Session outline

1. Goal2. Linear and non-linear regression3. Polynomial fitting4. Cubic spline fitting5. A short introduction to system analysis6. Spectral representation of stationary processes7. Detrending and filtering8. Non-stationary processes

31. Goal


Session 3Session 2

42. Linear and non linear regression

Year (n)

Price (x[n])

1500 17

1501 19

1502 20

1503 15

1504 13

1505 14

1506 14

1507 14

Linear regression

][][][ nrandomntrendnx nntrend 10][

...150315150220150119150017

10

10

10

10

...15201917

......15031150211501115001

1

0

xX


Linear regressionxX

][][][ nrandomntrendnx xXxXxXXx t2

][][0][

02

0 nnnE

Lets assume

xXXXxX

tt 12minarg Least Squares Estimate

EProperties: 12 XX tCov xXxX tkN

1 2

Homocedasticity

2

22 1

1xxXx

R

11)1(1 22

kN

NRRadjusted

Degree of fit

Linear regression with constraints

2 1 1arg min ( ). .

t t t tR

s t

X X R R X X R r RR r 2 1 1 2

2 2 0 Example:


][][][ nrandomntrendnx 2

210][ nnntrend ...

1503150315

1502150220

1501150119

1500150017

22

10

22

10

22

10

22

10

...15201917

.........150315031150215021150115011150015001

2

1

0

2

2

2

2

xX

0

0

1

0

HHFirst test: 0tt0 XX 12kF H0 is true ),( kNkF

If F>FP, reject H0

201

200

2

2

HHSecond test: 20221t2t202 )XP(IX 1 22kF H0 is true ),( 2 kNkF

][][

),0(][

02

0

2

nnNn

Lets assume

x

XX

2

121 t111t111 XXXXP

01

00

ii

ii

HH

ii

iit

0 iiii 1 XXt H0 is true )( kNt


[40, 45] [0.05,0.45]Y X

We got a certain regression line but the true regression line lies within this region with a 95% confidence.

12 2 tj jj X X2

21 , 1

j j jN kt Unbiased variance of the j-th regression coefficient

Confidence interval for the j-th regression coefficient

Confidence intervals for the coefficients


Durbin-Watson test:

0][0][

01

00

nHnH

N

n

N

n

n

nn

d1

2

2

2

][

]1[][

Reject H0Ldd

Do not reject H0Udd

00 n

Test that the residuals are certainly uncorrelated

Other tests: Durbins h, Wallis D4, Von Neumanns ratio, Breusch-Godfrey


(0) (0) [ ] [ ]y n n xCochrane-Orcutt method of regression with correlated residues

1. Estimate a first model2. Estimate residuals (correlated)3. Estimate the correlation of the residues4. i=15. Estimate uncorrelated residuals6. Estimate uncorrelated output7. Estimate uncorrelated input8. Reestimate model9. Estimate residuals10. i=i+1 until convergence in

(0) (0)[ ] [ ] [ ]n y n y n

( ) ( ) ( )[ ] [ ] [ 1] [ 1]i i ia n n i n ( ) ( )[ ] [ ] [ 1] [ 1]i iy n y n i y n ( ) ( )[ ] [ ] [ 1] [ 1]i in n i n x x x( ) ( ) ( ) ( ) [ ] [ ] [ ]i i i iy n a n n x

( ) ( ) ( )[ ] [ ] [ ]i i in y n y n ( )i

10


Assumptions of regression

The sample is representative of your population The dependent variable is noisy, but the predictors are not!!. Solution: Total Least

Squares Predictors are linearly independent (i.e., no predictor can be expressed as a linear

combination of the rest), although they can be correlated. If it happens, this is called multicollinearity. Solution: add more samples, remove dependent variable, PCA

The errors are homoscedastic. Solution: Weighted Least Squares The errors are uncorrelated to the predictors and to itself. Solution: Generalized

Least Squares The errors follow a normal distribution. Solution: Generalized Linear Models

11


More linear regression

][][ 10 nwnnx ][]1[][ 210 nwnxnnx

][])2[]1[(][ 210 nwnxnxnnx ][]2[]1[][ 3210 nwnxnxnnx

][][...][][][ 66110010 nwnMnMnMnnx

otherwise007n1

][0

nM

otherwise067n1

][6

nM

7 years

Non linear regression][)sin(][ 10210 nwnnnx

][][ 10 nwnnx ][logloglog][log 10 nwnnx ]['log][' '1'0 nwnnx

12

3. Polynomial fitting

Polynomial trends

][][][ nrandomntrendnx 2

210][ nnntrend ...

1503150315

1502150220

1501150119

1500150017

22

10

22

10

22

10

22

10

...15201917

.........150315031150215021150115011150015001

2

1

0

2

2

2

2

xX Orthogonal polynomials

...)1503()1503()1503(15)1502()1502()1502(20

)1501()1501()1501(19)1500()1500()1500(17

221100

221100

221100

221100

qqnnnntrend ...][ 2210

)(...)()()(][ 221100 nnnnntrend qq xX x

1 XRR 1

0)()( dtttji ji Such that

13

Polynomial trendsOrthogonal polynomials

3. Polynomial fitting

11 0)()( dtttji ji 1)(0 ttt )(1

)(1

)(112)( 21 ti

ittiit iii

212

23

2 )( ttttt 23

325

3 )(

Grafted polynomials

)(...)()()(][ 221100 nnnnntrend qq

S(t) is a cubic spline in some interval if this interval can be decomposed in a set of subintervals such that S(t) is a polynomial of degree at most 3 on each of these subintervals and the first and second derivatives of S(t) are continuous.

1t 2t 3t

1

3

3)()(k

jjj ttdtS

14

4. Cubic spline fitting

Cubic B-splines

otherwise0

21

1

)( 62

22

32

32

3

3

t

tt

tB t

t

1

3

3 )()(k

jjj tBdtS

-3 -2 -1 0 1 2 30

0.5

1

t

0

3 3 3 2 2 3( ) ( ) ( 3 3 )q q

p j j j j j jj p j p

B t d t t d t t t tt t

0)( tBtt pp

3,2,1,000)(

ktdtBttq

pj

kjjpq

000011111

4

3

2

1

34

33

32

31

3

24

23

22

21

24321

p

p

p

p

p

ppppp

ppppp

ppppp

ddddd

ttttttttttttttt

1)(3 dttBp

210124321 ppppp ttttt

By definition

15


1

3

3 )()(k

jjj tBdtS

1

1

3

0

)(Fj

jjTt

j jBdtSTtj

Ttj FF ,00

BdS

-1.5 -1 -0.5 0 0.5 1 1.5 2 2.5-10

-8

-6

-4

-2

0

2

4

6

8

10

Cubic B-splines

T

16


Overfitting and forecasting

-0.2 0 0.2 0.4 0.6 0.8 1 1.2-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-1 -0.5 0 0.5 1 1.5 2-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

4. Polynomial fitting with discontinuities

17

18

5. A short introduction to system analysis

T][nx )(][ xTny

Example: ][3][ nxny Amplifier]3[][ nxny Delay

][][ 2 nxny Instant power

3]1[][]1[][ nxnxnxny Smoother

Box-Cox transformation: family of systems

...]4[]5[]6[121][ 21 nxnxnxnmest

][][2

1][2

21 nmknxnS est

kkest

Season estimation

0])[log(0)(][

1][

nxxTny

nx

-5 0 5 10 150

0.5

1

n

x[n]

-5 0 5 10 150

0.5

1

n

x[n-3]

19

Systems with memory ,...)2,1,( nxnxnxTnyMemoryless systems )( nxTny Invertible systems ))(()(:)()( 111 nxTTnyTnxyTnxTny Causal systems ,...)2,1,( nxnxnxTnyAnticausal sytems ,...)2,1,( nxnxnxTnyStable systems nBnxTBnBnx TTx ,)(:,Time invariant systems )()( 00 nnxTnnynxTny Linear systems nxbTnxaTnbxnaxT 2121

LTI

Basic system properties

5. A short introduction to system analysisPresent Past

Future

20

5. Basic introduction to system analysis

LTI systems

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

x 1 [ n ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

y 1 [ n ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

x 2 [ n ] = x 1 [ n -3 ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

y 2 [ n ] = y 1 [ n -3 ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

x 3 [ n ] = 2 x 1 [ n ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

y 3 [ n ] = 2 y 3 [ n ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

x 4 [ n ] = x 1 [ n ] + x 2 [ n ]

-5 0 5 1 0 1 50

0 . 5

1

1 . 5

2

n

y 4 [ n ] = y 1 [ n ] + y 2 [ n ]

21


T][nx )(][ xTny

Example: ][3][ nxny LTI, memoryless, invertible, causal, stable ]10[][ nxny

][][ 2 nxny Non linear, TI, memoryless, non invertible, causal,stable

3]1[][]1[][ nxnxnxny

0])[log(0)(][

1][

nxxTny

nxNon linear, memoryless, invertible, causal, unstable

...]4[]5[]6[121][ 21 nxnxnxnmest

][][2

1][2

21 nmknxnS est

kkest

LTI, with memory, invertible, causal, stable

LTI, with memory, invertible, non causal, stable

LTI, with memory, invertible, non causal, stable

22


Impulse response of a LTI system

T][n ][nh

-5 0 50

0.5

1

n

[n]

FIR: Finite impulse responseIIR: Infinite impulse response

T][nx

kknhkxnhnxny ][][][*][][

3]1[][]1[][ nxnxnxny

...]4[]5[]6[121][ 21 nxnxnxnmest

][][2

1][2

21 nmknxnS est

kkest

Example:

3]1[][]1[][ nnnnh

-5 0 50

0.1

0.2

0.3

0.4

n

h[n]

-10 -5 0 5 10-0.2

0

0.2

0.4

n

h[n]

23


M

kk

N

kk knxbknya

00

Difference equation

Example: ]1[5.0][]1[][ nxnxnyny

Nk

kk

M

k

kk

za

zb

zXzYzH

0

0

)()()(

M

k

kk

N

k

kk zzXbzzYa

00

)()(

11 )(5.0)()()( zzXzXzzYzY

1

1

15.01

)()()(

zz

zXzYzH

Transfer function

Cz

Impulse response of a LTI system )()()( zXzHzY

][ 0nnx ZT )(0 zXzn

24

6. Spectral representation of stationary processes

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870

200

300

Year

P

r

i

c

e

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870

200

300

YearP

r

i

c

e

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-100

0

100

Year

P

r

i

c

e

-

t

r

e

n

d

][][][][ nrandomnseasonalntrendnx

nntrend 6.12777][

][][][ ntrendnxny ][][ nrandomnseasonal

][cos5.22cos5.26 1223282 nrandomnn Period=8 years Period=12 years

25


Harmonic components

][cos][ nrandomnAnx Amplitude

Frequency (rad/sample)

Phase (rad)

0 10 20 30 40 50 60 70 80 90 100

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

A

][2cos][ nrandomfnTAnx s Frequency (Hertz)

kNNnseasonalnseasonal ff

fTkk ss 2][][ ,...3,2,1k

nnnseasonal 1223282 cos5.22cos5.26][Period=8 years Period=12 years

Example:

Period=lcm(N1,N2)=24 years

26


Harmonic components

nAnx cos][

0 5 10 15 20 25 30

-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

0 5 10 15 20 25 30

-1

-0.5

0

0.5

1

0 5 10 15 20 25 30

-1

-0.5

0

0.5

1

nnx 152cos][ nnx 1528cos][ n 2cos 1528 n152cos )cos( 152 n

27


Harmonic representation nnnx 1223282 cos5.22cos5.26][

K

kkkk nAnx

1cos][ 0 ))(cos()(][ dnAnx

deXnx nj)(21 njn enxX

)(

Fourier transform pair

)(1 XFTnx ][)( nxFTX nnFTY 1223282 cos5.22cos5.26)(Example:

)(5.26)(5.26 8282 32

32 jj ee

)(5.22)(5.22 122122 jj ee

njjnjj eeAeeAnA )()(21 )()()(cos)()()()( jeAX

)()( * XX

-3 -2 -1 0 1 2 30

20

40

60

80

|

Y

(

)

|

Low speed variation

Medium speed

Highspeed

Average

28


Power spectral density (PSD)

2)()( XSX ][ 0nX FT )(XS Deterministic

][][ 02

0 nn WW FT 2)( WWS Example:Example:

)()5.26()()5.26()( 822

822 YS

)()5.22()()5.22( 1222

1222

][cos5.22cos5.26][ 1223282 nwnnny )(YS

1820 1830 1840 1850 1860 1870-100

-50

0

50

100

Year

P

r

i

c

e

-

t

r

e

n

d

-3 -2 -1 0 1 2 30

2000

4000

6000

8000

SY()

Periodogram

29

7. Detrending and filtering

)(zH][nw ][nx)()()( 2 WX SHS )(WS

-3 -2 -1 0 1 2 30

20

40

60

80

|

Y

(

)

|

Low speed variation

Medium speed

Highspeed

Highpass filteringDetrending

Bandpass filteringDetrending+Denoising

Lowpass filteringDenoising

Trend estimation

Seasonal estimation

Identity

1820 1830 1840 1850 1860 1870

150

200

250

300

Year

x[n]

-2 0 20

1

2

3x 10

4

SX()

1820 1830 1840 1850 1860 1870-100

0

100

Year

y[n]=x[n]-trend[n]

-2 0 20

1

2

3x 10

4

SY()

30


)(zH][nw ][nx)()()( 2 WX SHS )(WS

zzzH 1)( 131Example: Smoother

)()()( jez eHzHH j

jj eeH 1)( 313

]1[][]1[][ nxnxnxny

-3 -2 -1 0 1 2 30

0.5

1

|H()|2

-3 -2 -1 0 1 2 3

-202

angle(H())

-3 -2 -1 0 1 2 30

1

2

|1-H()|2

-3 -2 -1 0 1 2 3

-202

angle(1-H())

zzzH 11)( 131 jj eeH 11)( 313

]1[][]1[][][ nxnxnxnxny

31


zzzH 1)( 1311

Example: Smoothers

jj eeH 1)( 3113

]1[][]1[][ nxnxnxny

21312 1)( zzzH 2312 1)( jj eeH

3]2[]1[][][ nxnxnxny

-2 0 20

0.5

1

|H1()|2

-2 0 2

-20

2

angle(H1())

-2 0 20

0.5

1

|H2()|2

-2 0 2

-20

2

angle(H2())

-2 0 20

0.5

1

|H3()|2

-2 0 2

-20

2

angle(H3())

2411

41

21

3 )( zzzH

24141213 )( jj eeH

]2[]1[][][ 414121 nxnxnxny

32


...]4[]5[]6[121][ 21 nxnxnxnmest

][][2

1][2

21 nmknxnS est

kkest

Example: Yearly season estimation

6215432123456211 1121)( )()( zzzzzzzzzzzzzX zMzH est 2

81

41

211

412

81

2 )()(')( zzzzzXzYzH

)()()( 12 zHzHzH

-2 0 20

0.5

1

1.5

2

|H1()|2

-2 0 20

0.5

1

1.5

2

|H2()|2

-2 0 20

0.5

1

1.5

2

|H3()|2

33


-3 -2 -1 0 1 2 30

0.5

1

|H1()|2

-3 -2 -1 0 1 2 30

0.5

1

|H2()|2

6])x[n6]-0.0611(x[n-7])x[n7]-0.0313(x[n-8])x[n8]-n-0.0101(x[][1 ny2])x[n2]-0.0934(x[n4])x[n4]-0.0634(x[n-5])x[n5]-0.0802(x[n-

0.2108x[n]1])x[n1]-0.1772(x[n

4]-0.00056x[n6])-x[n1]-n0.00037(x[-8])-x[n93(x[n]0000.0][2 ny8]-58y[n.07]-4.3y[n-6]-14.6y[n5]-29.7y[n-4]-39y[n3]-34y[n-2]-19.3y[n1]-6.5y[n-

fc=2*pi/12;ef=2*pi/60;b1=fir1(18,[fc-ef fc+ef]/pi);

fc=2*pi/12;ef=2*pi/60;[b1,a1]=butter(4,[fc-ef fc+ef]/pi);

34


[ ] [ ] (1 )( [ 1] [ 1])S n ax n a x n y n [ ] ( [ ] [ 1]) (1 ) [ 1]y n b S n S n b y n

Example: Holts linear exponential smoothing

[ ] [ 1] [ 1]NS n S n y n N

N

1 1( ) ( ( ) ( ) ) (1 ) ( ) ( )Y z b S z S z z b Y z z f S

1 1( ) ( ) (1 )( ( ) ( ) )S z aX z a X z z Y z z ( , ) ( )f X Y f X

1 1 ( ) ( ) ( ) ( )NS z S z z NY z z f X

( )f X

12

3

8. Non-stationary processes

35

[ , ) [ ] [ ] j nn

X m x n m w n e

Short-time Fourier Transform


36

Wavelet transform


37

Wavelet transform


38

Wavelet transform


39

Empirical Mode Decomposition

40

Session outline

1. Goal2. Linear and non-linear regression3. Polynomial fitting4. Cubic spline fitting5. A short introduction to system analysis6. Spectral representation of stationary processes7. Detrending and filtering8. Non-stationary processes

41

Bibliography



J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994.

1Time Series Analysis

Session III: Probability models for time series

Carlos scar Snchez Sorzano, Ph.D.Madrid,

22

Session outline

1. Goal2. A short introduction to system analysis3. Moving Average processes (MA)4. Autoregressive processes (AR)5. Autoregressive, Moving Average (ARMA)6. Autoregressive, Integrated, Moving Average (ARIMA, FARIMA)7. Seasonal, Autoregressive, Integrated, Moving Average (SARIMA)8. Known external inputs: System identification9. A family of models10. Nonlinear models11. Parameter estimation12. Order selection13. Model checking14. Self-similarity, Fractal dimension, and Chaos theory

33

1. Goal


Explained by statistical models (AR, MA, )

0 10 20 30 40 50 60 70 80 90 100-4

-2

0

2

4

0 10 20 30 40 50 60 70 80 90 100-1

-0.5

0

0.5

1

1.5

0 10 20 30 40 50 60 70 80 90 100-0.5

0

0.5

1

[ ] 0.9 [ ] [ ]random n random n completelyRandom n [ ] [ ] 0.9 [ 1]random n completelyRandom n completelyRandom n

44

1. Goal

55


M

kk

N

kk knxbknya

00

Difference equation

Example: ]1[5.0][]1[][ nxnxnyny

Nk

kk

M

k

kk

za

zb

zXzYzH

0

0

)()()(

M

k

kk

N

k

kk zzXbzzYa

00)()(

11 )(5.0)()()( zzXzXzzYzY

1

1

15.01

)()()(

zz

zXzYzH

Transfer function

Cz

T][nx )(][ xTny

][ 0nnx ZT )(0 zXzn

66


Poles/Zeros0z is a pole of iff)(zH )( 0zH0z is a zero of iff)(zH 0)( 0 zH

Stability of LTI systemsA causal system is stable iff all its poles are inside the unit circle

Example:3

]1[][]1[][ nxnxnxnyzzzH 3131

131)(

Poles: ,0z Zeros: 2321 jz }Re{z

}Im{ z

1z

Invertibility of LTI systemsThe transfer function of the inverse system of a LTI system whose transfer function is is . Therefore, the zeros of one system are the poles of its inverse, and viceversa.

)(zH)(

1zH

1z

77


Downsampling nx

M ][nMxnxd

nxL

Upsampling

k

e kLnkxrestoLLnLnx

nx ][][0

,...2,,0]/[

88

3. Moving average processes: MA(q)

][...]1[][][ 10 qnwbnwbnwbnx q

)(...)( 221

10 zBzbzbzbbzHq

q

)(qMA][nw

LTI, with memory, invertible, causal, stable

0 2 0 4 0 6 0 8 0 1 0 0-4

-3

-2

-1

0

1

2

3w [ n ]

t im e0 2 0 4 0 6 0 8 0 1 0 0

-5

0

5x 1 [ n ]

t im e0 2 0 4 0 6 0 8 0 1 0 0

-1

-0 . 5

0

0 . 5

1x 2 0 [ n ]

t im e

-1 0 -5 0 5 1 0

0

0 . 2

0 . 4

0 . 6

0 . 8

1

A C F w [ n 0 ]

la g-1 0 -5 0 5 1 0

0

0 . 2

0 . 4

0 . 6

0 . 8

1

A C F x1[ n 0 ]

la g-3 0 -2 0 -1 0 0 1 0 2 0 3 0

0

0 . 2

0 . 4

0 . 6

0 . 8

1

A C F x20

[ n 0 ]

la g

Definition

210 ( ) ( )Wn TF H S

99


),0( 2WN

][][ 02

0 nn WW

),0(0

22

q

kkW bN

0)(

0

0

][

00

00

2

0

0

0

0

nn

qnbb

nq

n

X

nq

knkkWX

It has limited support!!

q

kk knwbnx

0][][)(qMA][nw

q

k

q

kkk

q

kk

q

kkX knnwknwEbbknnwbknwbEnnxnxEn

0 0'0'

0'0'

000 ]'[][]'[][][][][

2' 0 ' 00 ' 0 0 ' 0

[ '] [ ' ' ] [ ( ' )]q q q q

k k k k Wk k k k

b b E w n w n n k k b b n k k

Statistical properties

Proof

10

10


0)(

0

0

)]'([...][

00

00

2

0

0 0'0

2'0

0

0

nn

qnbb

nq

kknbbn

X

nq

knkkW

q

k

q

kWkkX

0)'(00)'(1

)]'([0

00 kkn

kknkkn 0' nkk

k

'k

q

q

0n

00 0 0 0 0 0

00 0 0 0 0 0

00 0 0 0 0 0

01 0 0 0 0 0

10 0 0 0 0 0

00 1 0 0 0 0

00 0 1 0 0 0

Statistical propertiesProof (contd.)

11


11

(Brute force) determination of the MA parameters

0

0

0

20 0

0

0 0

0

[ ] 0

( ) 0

q n

X W k k nk

X

q n

n b b n q

n n

2 2 2 20 1 2[0]X Wr b b b 2 0 1 1 2[1]X Wr b b b b 2 0 2[2]X Wr b b

12

12


Invertibility

)(qMA][nw

q

kk knwbnx

0][][

)(1 qMA

q

k

kk zbzH

0)(

q

k

kk

inv

zbzHzH

0

1)(

1)(

q

kk knwbnxb

nw10

][][1][

][nw

][][0

nxknwbq

kk

Example:3

]2[]1[][][ nxnxnxny does not have a stable, causal inverse]1[9.0][][ nxnxny has a stable, causal inverse

}Re{z

}Im{ z

1z

Whitening filter

Colouring filter

Channel equalizationChannel estimationCompressionForecasting

13

13

3. Moving average processes: generalizations

][]1[...]1[][][ 01010 00 nwbnwbqnwbqnwbnx qq Model not restricted to be causal

Model not restricted to be linear

q

k

q

kkk

q

kk knwknwbknwbnx

0

'

0'',

0

]'[][][][

][...]1[1 Fq qnwbnwb F Causal component

Anticausal component

Quadratic component

q

k

q

k

q

kkkk

q

k

q

kkk

q

kk knwknwknwbknwknwbknwbnx

0

'

0'

''

0'''',',

0

'

0'',

0

]''[]'[][]'[][][][

Volterra Kernels

1)

2)

q

kk knwfbnx

0

])[(][

14

14

4. Autoregressive processes: AR(p)

0 2 0 4 0 6 0 8 0 1 0 0-4

-3

-2

-1

0

1

2

3w [ n ]

t im e0 2 0 4 0 6 0 8 0 1 0 0

-5

0

5x 1 [ n ]

t im e0 2 0 4 0 6 0 8 0 1 0 0

-1

-0 . 5

0

0 . 5

1x 2 0 [ n ]

t im e

][...]1[][][ 1 pnxanxanwnx p

)(1

...11)( 2

21

1 zAzazazazH p

p

)( pAR][nw

LTI, with memory, invertible, causal, stable Definition

-30 -20 -10 0 10 20 300

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

n0

ACF(n0)

15

15


][...]1[][][ 1 pnxanxanwnx p

)(1

...11)( 2

21

1 zAzazazazH p

p

)( pAR][nw

LTI, with memory, invertible, causal, stable Definition

Relationship to MA processes

12211

1...1

1)(k

kkp

p

zbzazaza

zH

Laurent series

p

kXkWX knann

100

20 ][][][


),0( 2WN ])0[,0( XN Yule-Walker equations

p

k

nkkX zAn

10

0][Whose solution is kz Poles of )(zH

16

16


Determination of the constants

p

k

nkkX zAn

10

0][

Example:

]2[]1[][][ 21 nxanxanwnx 22

111

1)( zazazH

Poles:

kA

p

k

nkkX zAnr

1

'0

0][

20 0 0

1[ ] [ ] [ ]

p

X W k Xk

n n a n k

p

kXkX knranr

100 ][][ 00 n

24

, 2211

21

aaazz

1,1,11 21212 aaaaazi

002

'21

'10 ][

nnX zAzAnr

04 221 aaRzi

1]0[ '2'1 AArX

]1[]0[]1[ 212'11

'1 XXX rarazAzAr

17

17

5. Autoregressive, Moving average: ARMA(p,q)

p

kk

q

kk knxaknwbnx

10][][][

1 20 1 2

( ) ( ) 1 21 2

... ( )( ) ( ) ( )1 ... ( )

qq

MA q AR p pp

b b z b z b z B zH z H z H za z a z a z A z

)( pAR][nw

Definition


),0( 2WN ])0[,0( XN

)(qMA

p

kXk

q

kkWX knankhbn

10

00

20 ][][][

18

18

6. Autoregressive, Integrated, Moving Average: ARIMA(p,d,q)

][nxd

)()()()( ),( zHzWzXzH qpARMAd

),( qpARMA][nw

Definition

][*])1[][(*...*])1[][(][][ nxnnnnnxnx dld d

dd

Intd

d zzXzXzHzXzzX

)1(1

)()()()()1()( 1

1

)(dInt ][nx

Poles: 1z (Multiplicity=d)Unit root

)()()()(

)()(

)()()( )(),(),,( zHzHzX

zXzWzX

zWzXzH dIntqpARMA

d

dqdpARIMA

p

kk

q

kk knxknxaknwbnxnx

10]1[][][]1[][Example for d=1:

Qd FARIMA or ARFIMA

19

19

7. Seasonal ARIMA: SARIMA(p,d,q)x(P,D,Q)s(Box-Jenkins model)

),,( qdpARIMA][nw

Definition

( , , )( ) ( )( ) ARIMA p d qs

X z H zX z

),,( QDPARIMAs s][nxs ][nx

)()()(

),,(s

QDPARIMAs zHzWzX

)()()()()( ),,(),,(),,(),,( zHzHzWzXzH qdpARIMA

sQDPARIMAQDPqdpSARIMA s

P

kssk

Q

kkss sknxksnxAksnwBsnxnx

10])1([][][][][

p

kk

q

ksk knxknxaknxbnxnx

10]1[][][]1[][

1D

1d

20

20

7. Seasonal ARIMA: SARIMA(p,d,q)x(P,D,Q)sExample: SARIMA(1,0,0)x(0,1,1)12

),,( qdpARIMA][nw ),,( QDPARIMAs s][nxs ][nx

]12[][]12[][ 10 nwBnwBnxnx ss

]1[][][ 1 nxanxnx s)0,0,1(),,( qdp

)1,1,0(),,( QDP

]1[][][ 1 nxanxnxs

]12[][]13[]12[]1[][ 1011 nwBnwBnxanxnxanx

]13[]1[]12[][]12[][ 110 nxnxanwBnwBnxnx

21

21

8. Known external inputs: System identification

)(1zA

][nw +

)()(zAzB

][nu

][][][][01

nwknubknxanxq

kk

p

kk

)()(

1)()()()( zW

zAzU

zAzBzX

ARX

)()(zAzC][nw +

)()(zAzB

][nu

'

001][][][][

q

kk

q

kk

p

kk knwcknubknxanx

)()()()(

)()()( zW

zAzCzU

zAzBzX

ARMAX

22

22

9. A family of models

)()(zDzC][nw +

)()(zFzB

][nu

)()()(

)()()()(

)()( zWzAzD

zCzUzAzF

zBzX General model

)(1zA

][nx

Polynomials used Name of the model

A AR

C MA

AC ARMA

ACD ARIMA

AB ARX

ABC ARMAX

ABD ARARX

ABCD ARARMAX

BFCD Box-Jenkins

23

23

10. Nonlinear models

][][],...,2[],1[][ nwpnxnxnxfnx Nonlinear AR:][][][][

1nwknxnanx

p

kk

Time-varying AR:

Random coeff. AR: ][][])[(][1

nwknxnanxp

kk

Bilinear models ][][][][][11

nwMknwknxbknxanxq

kk

p

kk

'1 1

[ ] [ ] [ ] (1 ) [ ]p p

k kk k

x n a x n k p n a x n k p n w n

Smooth transition:

24

24


Smooth TAR (STAR):

Threshold AR (TAR):

tdnxnwknxa

tdnxnwknxanx p

kk

p

kk

][][][

][][][][

1

)2(

1

)1(

][])[(][][][1

)2(

1

)1( nwdnxSknxaknxanxp

kk

p

kk

Heterocedastic model: ][][][ nwnnx Random walk)1,0(N

p

kk knxan

1

220

2 ][][

q

kk

p

kk knbknxan

1

2

1

220

2 ][][][

ARCH

GARCH

(Neural networks)(Chaos)

25

25


[ ]x n[ ] 0[ ]x n n 2 0[ ][ ]x n n

q

kk

p

kk knbknxan

1

2

1

220

2 ][][][ GARCH

1 11

p q

k kk k

a b

The model is unique and stationary if Properties

[ ] 0E x n Zero meanLack of correlation

20

[ ] 0 0max ,

1

[ ] [ ]1

x n p q

k kk

n na b

26

26


q

kk

p

kk knbknxan

1

2

1

220

2 ][][][ GARCHEstimation through Maximum LikelihoodForecasting

min ,2 2 2

01 1

[ ] ( ) [ ] [ ]p q q

k k kk k

x n h a b x n h k b z n h k

2 2

0 [ 1] ...x n h 2 2

0 [ 2] ...x n h ...2[ ]x n2[ 1]x n Observed

[ 1] 0z n h [ 2] 0z n h

...2 2[ ] [ ] [ ]z n x n n

2 2[ 1] [ 1] [ 1]z n x n n GARCH(1,1)

2 2 2 20 1 1 [ 1] [ ] [ ]x n a x n b n

12 2 2 1 2

0 1 1 1 1 1 1 1 10

[ ] ( ) ( ) [ ] ( ) [ ]h

k h

kx n h a b a a b x n b a b n

27

27


Extensions of GARCH

Exponential GARCH (EGARCH)

2 2 2 20

1 1log [ ] log log [ ] log [ ]

p q

k kk k

n a x n k b n k

Integrated GARCH (IGARCH)

1 1

1p q

k kk k

a b

1 1

1p q

k kk k

a b

GARCH IGARCH

28

28

11. Parameter estimation

Maximum Likelihood Estimates (MLE)

][]1[][ 1 nwnxanx AR(1)Assume that we observe ][],...,2[],1[ Nxxx

][]1[][ 1 nwnxanx

212

1 1,0|

aNX W

][][ 02

0 nn WW

...)( 121111 nnnnnn WWXaaWXaX...3

312

2111 nnnn WaWaWaW

0nXE 2331221112 ...nnnnn WaWaWaWEXE 2

1

261

41

21

2

1...1

aaaa WW

2112 WXaX 1122 XaXW 2

2 1 1 1 21

| , ,1

WX X N a xa

2,0 WN

21, Wa

29

29


Maximum Likelihood Estimates (MLE)

212

1 1,0|

aNX W

2

2 1 1 1 21

| , ,1

WX X N a xa

2

3 2 1 1 2 21

| , , ,1

WX X X N a xa

),...,,( 21|...21 NXXX xxxf N )()...()()( ,|3,|2,|1| 123121 NXXXXXXX xfxfxfxf NN

N

n W

nnW

N

a

WNXXX

xaxxa

xxxfLWN 2

2

211

212

21

1

21

21

2

21

21

21|... )2log(21

log)2log(),...,,(log)(21

221

21

21

)(0)()(maxarg,1 WaW

LaLLa

Numerical, iterative solution

Confidence intervals

30

30


][][][ nwnxnx

0][ nwE ]0[]1[2]0[][ 21122 XXXW aanwE

][][][ nxnxnw Least Squares Estimates (LSE)

][]1[][ 1 nwnxanx ]1[][ 1 nxanx

1

]1[]0[]0[

]0[]1[]0[2]1[20 2

22

111

2

X

XXW

X

XXX

W aaa

31

31

12. Order selection

If I have to fit a model ARMA(p,q), what are the p and q values I have to supply?

ACF/PACF analysis Akaike Information Criterion

Bayesian Information Criterion

Final Prediction Error

NqpqpAIC W

2)(log),( 2

NNqpqpBIC W

log)(log),( 2

2)( WpNpNpFPE

32

32

12. Order selection

Partial correlation coefficients (PACF, First-order correlations)

][][...]2[]1[][ 0,2,1, 0000 nwnnxnxnxnx nnnn

00

10,0 ][][

n

nnn nnrnr

][

]1[...

]3[]2[]1[

...

]0[]1[...]2[]1[]1[]1[]0[...]4[]3[]2[

..................]3[]4[...]0[]1[]2[]2[]3[...]1[]0[]1[]1[]2[...]2[]1[]0[

0

0

,

1,

3,

2,

1,

000

000

00

00

00

00

00

0

0

0

nrnr

rrr

rnrnrnrrrnrnrnr

nrnrrrrnrnrrrrnrnrrrr

nn

nn

n

n

n

Yule-Walker equations

33

33

12. Order selection

Thumb ruleARMA(1,0): ACF: exponential decrease; PACF: one peakARMA(2,0): ACF: exponential decrease or waves; PACF: two peaksARMA(p,0): ACF: unlimited, decaying; PACF: limitedARMA(0,1): ACF: one peak; PACF: exponential decrease ARMA(0,2): ACF: two peaks; PACF: exponential decrease or wavesARMA(0,q): ACF: limited; PACF: unlimited decayingARMA(1,1): ACF&PACF: exponential decrease ARMA(p,q): ACF: unlimited; PACF: unlimited

34

34

13. Model checking

Residual AnalysisExample: ARMA (1,1)

]1[]1[][][]1[][]1[][ 1110 0 nwbnaxnxnwnwbnwbnaxnx b

Assumptions

1. Gaussianity:1. The input random signal w[n] is univariate normal with zero mean2. The output signal, x[n] (the time series being studied), is multivariate normal and its

covariance structure is fully determined by the model structure and parameters2. Stationarity: x[n] is stationary once that the necessary operations to produce a stationary

signal have been carried out.3. Residual independency: the input random signal w[n] is independent of all previous

samples.

0]0[ w

35

35

13. Model checking

[ ]x n

n

Structural changes

1 1[ ] ( , )y n f x y 2 2[ ] ( , )y n f x y

[ ] ( , )y n f x y

2

1

( [ ] [ ])N

i in

S y n y n

Sum of squares of the residuals with model i

Chow test:

0 1 2

1 1 2

::

H f fH f f

1 2

11 2

1 211 22

( ( )) ( , 2 )( )

k

N N k

S S SF F k N N kS S

Number of samples in each period

Number of parameters in

the model

Assumption: variance is the same in both regionsSolution: Robust standard errors

36

36

13. Model checking

Diagnostic checking

1. Compute and plot the residual error2. Check that its mean is approximately zero3. Check for the randomness of the residual, i.e., there are no time intervals where the

mean is significantly different from zero (intervals where the residual is systematically positive or negative).

4. Check that the residual autocorrelation is not significantly different from zero for all lags

5. Check that the residual is normally distributed.6. Check if there are residual outliers.7. Check the ability of the model to predict future samples

37

14. Self-similarity, Fractal dimension, Chaos theory

37

Intuitively a fractal is a curve that is self-similar at all scales

Koch curve

k=1

k=0

k=2

k=3

k=4

k=5

( )XS

38


38

L==1m

=0.5m=0.5m 12N =0.25m =0.25m =0.25m =0.25m 14N

1 1N D

11N

S=1m2

L==1m

2

11N

=0.5m

2

14N 2=1m2 2=0.25m2

2

116N 2=0.0625m2

=0.25m

2 2N D

39


39

1 1D DN 13m

1m

2

13

m

3

13m

143

DDN

22

143

DDN

33

143

DDN

k=1

k=2

k=3

1 log 44 1.263 log3

DD k

kN D

40


40

( )XS 5 2D

2H D Hurst exponent

0.2

0,1 0.50.8

H

Uncorrelated time series

Long-range dependent alternating signs

Long-range dependent same signs

maxmax max

max

1 [ ] [ ]N

HMA MAN n

n nx n x n n

41


41

42


42

0 10 20 30 40 50 60 70 80 90 1000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

n

x0.1[n]

[ ] 4 [ 1](1 [ 1])x n x n x n

Chaotic systems

Logistic equation

[0] 0.1x

0 10 20 30 40 50 60 70 80 90 100-1

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

n

x0.1[n]-x0.100001[n]

43


43

Phase space

Attractor (Fixed point)

2-history 22[ ] [ ], [ 1]n x n x n x 3-history 33[ ] [ ], [ 1], [ 2]n x n x n x n x h-history [ ] [ ], [ 1],..., [ 1] hh n x n x n x n h x

2[2] [2], [1]x xx 2[3] [3], [2]x xx 2[1] [1], [0]x xx

2-Phase space

44


44

Recurrence plots

,

1 [ ] [ ']( , ')

0 [ ] [ ']h h

hh h

n nR n n

n n

x xx x

White noise

2, ( , ')R n n

Sinusoidal Chaotic system with trend

AR model

45


45

Recurrence plots

46


46

Correlation dimension (Grassberger-Procaccia plots)Correlation integral

( 1), ' 12

'

1( ) Pr [ ] [ '] lim [ ] [ ']N

h h h h hN NN n nn n

C n n H n n x x x x

0 0( )

1 0x

H xx

Heaviside functionCorrelation dimension

0

log( ( ))limlog( )

hh

CD

hD

hD h (random)

(maybe chaotic)

h

47


47

Brock-Dechert-Scheinkman (BDS) Test

1,

( ) ( )(0,1)

hh

h

C CV N

N

Lyapunov exponentConsider two time points such that

0[ ] [ '] 1h hn n x x Consider the distance m samples later [ ] [ ' ]m h hn m n m x x

0m

m e The Lyapunov exponent relates these two distances

0 Histories diverge: chaos, cannot be predicted0 Histories converge: can be predicted

Maximal Lyapunov exponent

0, 0

1lim log mm m

H0: samples are iid

48

48

Bibliography



J. D. Hamilton. Time series analysis. Princeton Univ. Press, 1994.


Session IV: Forecasting and Data Mining


2Session outline

1. Forecasting2. Univariate forecasting3. Intervention modelling4. State-space modelling5. Time series data mining

1. Time series representation2. Distance measure3. Anomaly/Novelty detection4. Classification/Clustering5. Indexing6. Motif discovery7. Rule extraction8. Segmentation9. Summarization

31. Forecasting

?][ hnx Goals: Symmetric loss functions:

Quadratic loss function: Other loss functions:

Assymmetric loss functions:

])[],[(minarg][][ hghnxLEhghnx ngn

2][][])[],[( hnxhnxhnxhnxL ][][])[],[( hnxhnxhnxhnxL

][][][][][][][][])[],[( 2

2

hnxhnxhnxhnxbhnxhnxhnxhnxahnxhnxL

Solution: )][],...,2[],1[][][][ nxxxhnxEhghnx n Univariate Forecasting: use only samples of the time series to be predicted

Multivariate Forecasting: use samples of the time series to be predicted and other companion time series.

42. Univariate Forecasting

Trend and seasonal component extrapolation

][][][][ nrandomnseasonalntrendnx nntrend 6.12777][ nnnseasonal 1223282 cos5.22cos5.26][

-0.2 0 0.2 0.4 0.6 0.8 1 1.2-2

-1.5

-1

-0.5

0

0.5

1

1.5

2

-1 -0.5 0 0.5 1 1.5 2-2

-1.5

-1

-0.5

0

0.5

1

1.5

2


Exponential smoothing

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870

200

300

Year

P

r

i

c

e

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870

200

300

Year

P

r

i

c

e

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-100

0

100

Year

P

r

i

c

e

-

t

r

e

n

d

1820 1825 1830 1835 1840 1845 1850 1855 1860 1865 1870-50

050

Year

R

e

s

i

d

u

a

l

][)1(][]1[ nxnxnx

1)()()(

zzX

zXzH

-2 0 20

0.5

1

|

H

(

)

|

2

-2 0 2

-2

0

2

a

n

g

l

e

(

H

(

)

)

][])[][( nxnxnx ][][ nxne

]1[]1[ xx

62. Univariate forecasting

Model based forecasting (Box-Jenkins procedure)

1. Model identification: examine the data to select an apropriate model structure (AR, ARMA, ARIMA, SARIMA, etc.)

2. Model estimation: estimate the model parameters3. Diagnostic checking: examine the residuals of the model to check if it is valid4. Consider alternative models if necessary: if the residual analysis reveals that the

selected model is not apropriate5. Use the model difference equation to predict: as shown in the next two slides.


Model based forecasting

]1[][]1[][ nbwnwnaxnx

][]1[][]1[ nbwnwnaxnx

Example: ARMA(1,1)

1

1

11

)()()(

azbz

zWzXzHW

)()()()()()(zH

zXbzaXzbWzaXzXzW

bzba

bzba

zzHba

zzXzXzH

WX

11

1)(

1)()()(

][)(][]1[ nxbanxbnx ][][)(]1[ nxbnxbanx

...]3[]2[]1[][]1[][ 32 hnxahnxahnbwhnwhnxahnx

][][]1[ 11 nxbnxbaanxa hh

1h


]12[][])13[]1[(]12[][ nwnwnxnxnxnx

]11[]1[])12[][(]11[]1[ nwnwnxnxnxnx

Example: SARIMA(1,0,0)x(0,1,1)12

Model based forecasting

13121

12

11

)()()(

zzzz

zWzXzHW

]12[]13[]12[]1[][][ nwnxnxnxnxnw Eliminating w[n]

]13[]12[)(]11[]1[][)(]1[ nxnxnxnxnxnx ]12[]24[]23[ nxnxnx

10


[ ]x n

n

1 1[ ] ( , )y n f x y

[ ] ( , )y n f x y

Chow test:

0 1

1 1

::

H f fH f f

2

1

11

2 111

( )( , )N

N k

S SF F N N k

S

Assumption: variance is the same in both regionsSolution: Robust standard errors

Predictive power test


11

Artificial Neural Network (ANN)

1 2 ( , ,..., )t t t t p t t ty f y y y y w


12

Fuzzy Artificial Neural Network (ANN)

( ) : [0,1]Ma

x


13



14



15


16

3. Intervention modellingWhat happens in the time series of a price if there is tax raise of 3% in 1990?

[ ] [ 1] [ ] 0.03 [ 1990]price n price n w n u n

1 0[ ]

0 0n

u nn

1982 1984 1986 1988 1990 1992 1994

0

0.02

0.04

[ ] [ 1] [ ] 0.03( [ 1990] [ 1993])price n price n w n u n u n

1982 1984 1986 1988 1990 1992 1994

0

0.02

0.04

What happens in the time series of a price if there is tax raise of 3% between 1990 and 1992?

1 19901

1( ) ( ) ( ) 0.031

price z price z z w z zz

1( ) ( ) ( )price z price z z w z 1990 1993

1

10.03 ( )1

z zz

17

3. Intervention modellingWhat happens in the time series of a price if in 1990 there was an earthquake?

[ ] [ 1] [ ] [ 1990]price n price n w n n 1 0

[ ]0

nn

otherwise

1982 1984 1986 1988 1990 1992 1994

0

0.02

0.04

What happens in the time series of a price if there is a steady tax raise of 3% since 1990?[ ] [ 1] [ ] 0.03( 1989) [ 1990]price n price n w n n u n

1982 1984 1986 1988 1990 1992 19940

0.05

0.1

0.15

0.2

1 1990( ) ( ) ( ) 1price z price z z w z z

1 19901 1

1 1( ) ( ) ( ) 0.031 1

price z price z z w z zz z

18

3. Intervention modelling

In general

[ ] ( [ 1],..., [ ], [ ], [ 1],..., [ ]) ( [ ])x n f x n x n q w n w n w n p f i n ( ) ( )( ) ( ) ( )( ) ( )

B z C zX z W z I zA z D z

We are back to the system identification problem with external inputs

19

3. Intervention modelling: outliers revisited

1982 1984 1986 1988 1990 1992 1994

0

0.02

0.04

Additive outliers: ( )( ) ( ) ( )( )

B zX z W z I zA z

Innovational outliers:( ) ( )( ) ( ) ( )( ) ( )

B z C zX z W z I zA z D z

Level outliers:

1982 1984 1986 1988 1990 1992 1994

0

0.02

0.04

( )( ) ( ) ( )( )

B zX z W z I zA z

Time change outliers: ( ) 1( ) ( ) ( )( ) ( )

B zX z W z I zA z D z

20

4. State-space modelling

F1z

[ ] kn x H

1[ ]n2[ ]n

[ ] ln y StateNoise Noise

Observed time-seriesObservation

systemState-transition

system

Linear, Gaussian system: 1[ ] [ 1] [ ]n F n Q n x x State-transition model2[ ] [ ] [ ]n H n n y x Observation model

0 0[0] ( , )N x 1 1[ ] ( , )n N 02 2[ ] ( , )n N 0

Kalman filter: Given estimateTime series modelling (System Identification):

0 0 1 2[ ], , , , , ,x n H F

1 2[ ], ,x n 0 0[ ], , , ,y n H F

Given y[n] estimate

21


22


Linear, Gaussian system: 1[ ] [ 1] [ ]n F n Q n x x State-transition model2[ ] [ ] [ ]n H n n y x Observation model

Extended/Unscented Kalman filter: Given estimate 1 2[ ], ,x n 0 0[ ], , , ,y n h f

General system: 1[ ] ( [ 1], [ ])n f n n x x 2[ ] ( [ ], [ ])n h n ny x

,f h Differentiable

Particle filter:

Given and the probability density functions of 0 0[ ], , , ,y n h f 1 2, estimate x[n] and the free parameters of the noise signals.

23

5. Time series data mining

Source: http://www.kdnuggets.com/polls/2004/time_series_data_mining.htm

24

5. Time series data miningTime Series

Representations

Data Adaptive Non Data Adaptive

SpectralWavelets PiecewiseAggregateApproximation

Piecewise Polynomial

SymbolicSingularValueDecomposition

RandomMappings

PiecewiseLinear

ApproximationAdaptivePiecewiseConstant

Approximation

DiscreteFourier

Transform

DiscreteCosine

TransformHaar Daubechies

dbn n > 1Coiflets Symlets

Sorted Coefficients

Orthonormal Bi-Orthonormal

Interpolation Regression

Trees

Natural Language

Strings

0 20 40 60 80 100120 0 20 40 60 80 100120 0 20 40 60 80 100120 0 20 40 60 80 1001200 20 40 60 80 100120 0 20 40 60 80 100120

DFT DWT SVD APCA PAA PLA

0 20 40 60 80 100120

SYM

UUCUCUCD

UU

CU

CU

DD

Time series representation

25


0 20 40 60 80 100 120

C

C

0

--

0 20 40 60 80 100 120

bbb

a

cc

c

a

baabccbc

Time series representation

26

5. Time series data miningTime series representation

Source: http://www.cs.cmu.edu/~bobski/pubs/tr01108-onesided.pdf

27

5. Time series data miningTime series distance measure

28


Source: http://www.cs.ucr.edu/~wli/SSDBM05/

Anomaly/Novelty detection

A very complex and noisy ECG, but according to a cardiologist there is only one abnormal heartbeat.The algorithm easily finds it.

29

5. Time series data miningClassification/Clustering

30

5. Time series data miningIndexing

31


W inding Dataset (T he angular speed of reel 2)

0 500 1000 1500 2000 2500

0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140 0 20 40 60 80 100 120 140

A B C

A B C

Motif discovery


32

Motif discovery

33

5. Time series data miningRule extraction


34

Rule extraction

1x n x n

y nx n

Step 1: Convert time series into a symbol sequence


35

Rule extractionStep 2: Identify frequent itemsets

x[n]=abbcaabbcabababcaabcabbcabcbcabbaabcbcabbcbc

2-item setaa 3ab 11ac 1ba 2bb 5bc 11ca 7cb 3cc 0

Min.Support=5

3-item setaba 2abb 5abc 4bba 1bbb 0bbc 4bca 7bcb 3bcc 0caa 2cab 5cac 0

4-item setabba 1abbb 0abbc 4bcaa 2bcab 5bcac 0caba 1cabb 3cabc 1

5-item setbcaba 1bcabb 3bcabc 1

36

5. Time series data miningSegmentation (Change Point Detection)

37

5. Time series data miningSummarization

38

Session outline

1. Forecasting2. Univariate forecasting3. Intervention modelling4. State-space modelling5. Time series data mining

1. Time series representation2. Distance measure3. Anomaly/Novelty detection4. Classification/Clustering5. Indexing6. Motif discovery7. Rule extraction8. Segmentation9. Summarization

39

Conclusions


Regression, Curve fitting

Explained by statistical models (AR, MA, ARMA)

Preprocessing (heteroskedasticity, gaussianity, outliers, )?

Harmonic analysis,Filtering

(F)ARIMA SARIMA

40

Bibliography


C. Chatfield. Time-series forecasting. Chapman & Hall, CRC, 2000. D.S.G. Pollock. A handbook of time-series analysis, signal processing

and dynamics. Academics Press, 1999. D. Pea, G. C. Tiao, R. S. Tsay. A course in time series analysis. John

Wiley and Sons, Inc. 2001

Session 0_x1Session I_x1Session II_x1Session III_x1Session IV_x1

Time Series Analysis

Documents

Transcript of Time Series Analysis