2013.06.17 Time Series Analysis Workshop ..Applications in Physiology, Climate Change and Finance,...

Time Series, Part 1

Content

- Stationarity, autocorrelation, partial autocorrelation, removal of non-stationary components, independence test for time series

- Linear Stochastic Processes: autoregressive (AR), moving average (MA), autoregressive moving average (ARMA)

- Fit of models AR, MA and ARMA to stationary time series

- Linear models for non-stationary time series

- Prediction of time series

- Nonlinear analysis of time series with stochastic models

- Nonlinear analysis of time series and dynamical systems

Literature

- “The Analysis of Time Series, An Introduction”, Chatfield C., Sixth edition, Chapman & Hall, 2004

- “Introduction to time series and forecasting”, Brockwell P.J. and Davis R.A., Second edition, Springer, 2002

- “Non-Linear Time Series, A Dynamical System Approach”, Tong H., Oxford University Press, 1993

- “Nonlinear Time Series Analysis”, Kantz H. and Schreiber T., Cambridge University Press, 2004

Real world time series

Fields: mechanics, physiology, geophysics, economy, electronics

Typical features: univariate time series (only one time series), limited length, non-stationarity, noise

Definitions / notations

Observed quantity → variable $X$

The observations most often take place at fixed time steps → sampling time

The values of the observed quantity change with some degree of randomness (stochasticity), small or large → random variable (r.v.) $X$

For each time point $t$ we consider the value $x_t$ of the r.v. $X$

The set of the values of $x_t$ over a time period $n$ (given in units of the sampling time) → (univariate) time series

$\{x_1, x_2, \ldots, x_n\} = \{x_t\}_{t=1}^{n}$

If there are simultaneous observations of more than one variable → multivariate time series

We apply methods and techniques to the given univariate or multivariate time series in order to gain insight into the system that generates it → time series analysis

The time series can be considered as a realization of a stochastic or deterministic process (dynamical system) $\{X_t\}$

Exchange index and volume of the Athens Stock Exchange (ASE)

[Figure: ASE index, period 1985 - 2011; x-axis: years, y-axis: close index]

[Figure: ASE index, years 07 to 12; x-axis: years, y-axis: close index]

[Figure: ASE index, period 2011; x-axis: months, y-axis: close index]

[Figure: ASE volume, period 1998 - 2008; x-axis: years, y-axis: volume]

Prediction?

Dynamical system ?

stochastic process ?

What is the index value tomorrow? The day after?

What is the mechanism of the Greek stock market?

General Index of Consumer Prices (GICP)

[Figure: General Index of Consumer Prices, period Jan 2001 - Aug 2005; x-axis: years, y-axis: General Index of Consumer Prices]

Trend ?

Seasonality / periodicity ?

Autocorrelation ?

Autoregression ?

Prediction ?

Annual sunspot numbers

[Figure: Annual sunspots, period 1700 - 2010; x-axis: years, y-axis: number of sunspots]

[Figure: Annual sunspots, zoom on 1900 to 2000; x-axis: years, y-axis: number of sunspots]


What will be the sunspot number in 2013, 2014 … ?

What is the mechanism / system / process that generates sunspots?

Is it a periodic system + noise ?

Is it a stochastic system?

Is it a chaotic system?

[Figure: Annual sunspots, 1960 to 2010; x-axis: years, y-axis: number of sunspots]


Given the sunspot numbers up to 1995, what is the sunspot number in 1996? And in the years after?

[Figure: Genuine predictions of sunspot data; x-axis: year, y-axis: sunspot number; marked ranges: model comparison, genuine prediction]

What is the generating system of a real time series?

Candidate deterministic models

[Figure panels: periodic + noise; low dimensional chaos; high dimensional chaos; x-axis: time in seconds]

Candidate stochastic models

[Figure panel: stochastic; x-axis: time index i, y-axis: x(i)]

Real time series

[Figure panels: preictal EEG; ictal EEG; x-axis: time in seconds]

Dripping water faucet (original experiment at UC Santa Cruz).

The observation of the dripping faucet shows that for some flow velocities the drops do not fall at constant time intervals.

Crutchfield et al, Scientific American, 1986

Successive observations $x_1, x_2, x_3, \ldots$ are plotted as the points $(x_1, x_2)$, $(x_2, x_3)$, ...

The scatter diagram of the data showed that the drop flow is not random.

Scatter diagram in two dimensions, $(x_i, x_{i+1})$, and in three dimensions, $(x_i, x_{i+1}, x_{i+2})$

Hénon map: $s_i = 1 - 1.4\, s_{i-1}^2 + 0.3\, s_{i-2}$

Observed variable: $x_i = s_i + w_i$, with $w_i$ the noise

chaos
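The noisy Hénon observations and their scatter diagram can be sketched as follows. This is a minimal illustration, not the code behind the slides: the initial values, the burn-in length, the Gaussian form of the noise $w_i$ and the function names are all assumptions made here.

```python
import random

def henon_series(n, noise_sd=0.0, burn_in=100, seed=0):
    """Iterate s_i = 1 - 1.4*s_{i-1}^2 + 0.3*s_{i-2} and return x_i = s_i + w_i."""
    rng = random.Random(seed)
    prev2, prev1 = 0.0, 0.0          # assumed initial values s_{-1}, s_0
    observed = []
    for i in range(n + burn_in):
        s = 1.0 - 1.4 * prev1 ** 2 + 0.3 * prev2
        prev2, prev1 = prev1, s
        if i >= burn_in:             # drop the transient before recording
            observed.append(s + rng.gauss(0.0, noise_sd))
    return observed

def delay_pairs(x):
    """Points (x_i, x_{i+1}) of the two-dimensional scatter diagram."""
    return list(zip(x[:-1], x[1:]))

x = henon_series(500)                # noise-free observations
pairs = delay_pairs(x)
```

Plotting `pairs` for increasing `noise_sd` shows how the deterministic structure in the scatter diagram is gradually blurred by observational noise.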

[Figure: General Index of Consumer Prices, period Jan 2001 - Aug 2005; x-axis: years, y-axis: General Index of Consumer Prices]

Trend?

Seasonality / periodicity?

Autocorrelation ?

[Figure: Auroral Electrojet Index; x-axis: time [10 min], y-axis: AE index]

Volatility ?

Non-stationarity

Variance stabilizing transformation

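Simple solution: $X_t = \log(Y_t)$ ?

Power transform (Box-Cox): $X_t = \dfrac{Y_t^{\lambda} - 1}{\lambda}$ ?

Assumption: $\mathrm{Var}[Y_t]$ changes as a function of the trend $\mu_t$

Which transform $X_t = T(Y_t)$ stabilizes the variance of $Y_t$, so that $\mathrm{Var}[X_t] \approx \text{const}$ ?

Transform $X_t$ for each $\lambda$, together with the form of $\mathrm{Var}[Y_t]$ that it stabilizes:

$\lambda = -1$: $X_t = 1/Y_t$, $\mathrm{Var}[Y_t] = c\,\mu_t^4$

$\lambda = -0.5$: $X_t = 1/\sqrt{Y_t}$, $\mathrm{Var}[Y_t] = c\,\mu_t^3$

$\lambda = 0$: $X_t = \log(Y_t)$, $\mathrm{Var}[Y_t] = c\,\mu_t^2$

$\lambda = 0.5$: $X_t = \sqrt{Y_t}$, $\mathrm{Var}[Y_t] = c\,\mu_t$

Other transforms ?

[Figure: Auroral Electrojet Index, time series $y_1, y_2, \ldots, y_n$; x-axis: time [10 min], y-axis: AE index]

[Figure: Logarithm transform of Auroral Electrojet Index, time series $x_1, x_2, \ldots, x_n$; x-axis: time [10 min], y-axis: AE index]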
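As a rough illustration of the power transform, the sketch below applies the Box-Cox formula with $\lambda = 0$ taken as the logarithm. The function name and the short example series are hypothetical; the series is only chosen so that its fluctuations grow with its level.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transform; lam = 0 is taken as the logarithm (limit lam -> 0)."""
    if any(v <= 0 for v in y):
        raise ValueError("the transform assumes strictly positive values")
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

# Hypothetical positive series: three levels, same relative fluctuations.
y = [10.0, 12.0, 9.0, 100.0, 120.0, 90.0, 1000.0, 1200.0, 900.0]
x_log = box_cox(y, 0)      # logarithm
x_sqrt = box_cox(y, 0.5)   # linear rescaling of sqrt(y)
x_inv = box_cox(y, -1)     # linear rescaling of 1/y
```

For $\lambda \neq 0$ the output differs from $1/Y_t$, $1/\sqrt{Y_t}$ or $\sqrt{Y_t}$ only by a shift and a scale factor, so the variance-stabilizing effect is the same. In this toy series the spread within each level is equal after the logarithm, which is the case $\mathrm{Var}[Y_t] = c\,\mu_t^2$.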

[Figure: Auroral Electrojet Index; x-axis: time [10 min], y-axis: AE index]

[Figure: Logarithm transform of Auroral Electrojet Index; x-axis: time [10 min], y-axis: AE index]

[Figure: density $f_X(x)$; legend: y, normal]

[Figure: density $f_Y(y)$; legend: x=log(y), normal]

Time series $y_1, y_2, \ldots, y_n$, logarithm transform: $X_t = \log(Y_t)$

[Figure: Gaussian transform of Auroral Electrojet Index; x-axis: time [10 min], y-axis: AE index]

[Figure: density $f_Z(z)$; legend: $x = \Phi^{-1}(F_Y(y))$, normal]

Gaussian transform: $X_t = \Phi^{-1}(F_Y(Y_t))$ ?
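One way to carry out a Gaussian transform on data is to replace the unknown $F_Y$ by the empirical distribution obtained from ranks. The sketch below is an assumption about how this could be done, not the procedure used for the figure; the rank-to-probability rule `rank / (n + 1)` and the function name are choices made here.

```python
from statistics import NormalDist

def gaussian_transform(y):
    """Map each value to Phi^{-1}(rank / (n + 1)), keeping the time order."""
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])   # indices from smallest to largest value
    std_normal = NormalDist()                      # standard normal, mean 0 and sd 1
    x = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) stays strictly inside (0, 1), so inv_cdf is defined
        x[i] = std_normal.inv_cdf(rank / (n + 1))
    return x

x = gaussian_transform([5.0, 1.0, 100.0])
```

The transformed series has a normal marginal distribution by construction and keeps the ordering of the original values; tied values receive arbitrary neighbouring ranks in this sketch.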

Stationarity - trend

Trend: slow change of the successive values of $y_t$

Deterministic trend: a function of time, $\mu_t = f(t)$ (example: plastic deformation)

Stochastic trend: random slow change of $\mu_t$

[Figure: S&P500, time series $y_1, y_2, \ldots, y_n$; x-axis: years, y-axis: index]

Removal of trend: $Y_t = \mu_t + X_t$

$\mu_t$: mean value as a function of $t$ (slowly varying mean level)

1. Deterministic trend: known or estimated function of time, $\mu_t = f(t)$

$X_t = Y_t - \mu_t$, $\{X_t\}$ stationary

Example: polynomial of degree $p$, $f(t) = a_0 + a_1 t + \cdots + a_p t^p$

[Figure: Plastic deformation; panels: fit with first degree polynomial, fit with fifth degree polynomial]
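For the simplest case $p = 1$, the trend coefficients have a closed-form least-squares solution. The following sketch assumes time points $t = 1, \ldots, n$ and uses hypothetical function names; higher degrees would need a general polynomial fit, which is not shown.

```python
def fit_linear_trend(y):
    """Least-squares estimates (a0, a1) of the trend a0 + a1*t for t = 1..n."""
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    a1 = sxy / sxx
    a0 = y_mean - a1 * t_mean
    return a0, a1

def detrend_linear(y):
    """Residuals x_t = y_t - (a0 + a1*t) after removing the fitted linear trend."""
    a0, a1 = fit_linear_trend(y)
    return [yi - (a0 + a1 * ti) for ti, yi in enumerate(y, start=1)]

# Hypothetical series that is exactly linear: the residuals should vanish.
y = [2.0 + 3.0 * t for t in range(1, 11)]
a0, a1 = fit_linear_trend(y)
x = detrend_linear(y)
```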

Index of the Athens Stock Exchange (ASE)

[Figure: ASE index, 1986 to 2012; x-axis: years, y-axis: close index; legend: orig]

[Figure: ASE index, years 07 to 12; x-axis: years, y-axis: close index]

2. Stochastic trend

2a. Smoothing with moving average filter

[Figure: ASE index with fitted trends; x-axis: years, y-axis: close index; legend: orig, local linear (10 breakpoints), polynomial (p=20)]

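Simple filter: moving average

$\hat{\mu}_t = \dfrac{1}{2q+1} \sum_{j=-q}^{q} y_{t+j}$

For $2q+1 = 3$: $\hat{\mu}_t = \frac{1}{3} y_{t-1} + \frac{1}{3} y_t + \frac{1}{3} y_{t+1}$

What if the window length "$2q+1$" is even, e.g. $4$ ?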

[Figure: ASE index with moving average trends; x-axis: years, y-axis: close index; legend: orig, MA(31), MA(151)]

More general filter: moving weighted average

$\hat{\mu}_t = \sum_{j=-q}^{q} a_j y_{t+j}$, with $\sum_{j=-q}^{q} a_j = 1$

Simple moving average: $a_j = \dfrac{1}{2q+1}$, $j = -q, \ldots, q$
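The simple moving average filter can be sketched in a few lines. The handling of the first and last $q$ points, where the window does not fit, is an assumption made here: they are left undefined (`None`).

```python
def moving_average(y, q):
    """Centered moving average of window 2q+1; edges where the window does not fit are None."""
    n = len(y)
    window = 2 * q + 1
    trend = [None] * n
    for t in range(q, n - q):
        trend[t] = sum(y[t - q:t + q + 1]) / window
    return trend

def detrend(y, q):
    """Residuals y_t minus the moving average trend, only where the trend is defined."""
    trend = moving_average(y, q)
    return [yi - mi for yi, mi in zip(y, trend) if mi is not None]

trend = moving_average([1, 2, 3, 4, 5], q=1)
residual = detrend([1, 2, 3, 4, 5], q=1)
```

A larger $q$ gives a smoother trend estimate, as in the MA(31) and MA(151) curves, at the cost of losing more points at both ends of the series.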

2b. Trend removal with differencing

If the trend is locally linear, it is removed by first differences:

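$\mu_t = a_0 + a_1 t$, $\quad \nabla Y_t = Y_t - Y_{t-1} = \mu_t - \mu_{t-1} + X_t - X_{t-1}$

$\mu_t - \mu_{t-1} = a_0 + a_1 t - a_0 - a_1 (t-1) = a_1$, a constant!

If the trend is locally polynomial of degree $p$, is it removed by differencing $p$ times, $\nabla^p Y_t$ ?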

[Figure: ASE index: first differences, period 2007 - 2011; x-axis: years, y-axis: close index]

[Figure: second panel on the same vertical scale, years 86 to 12; x-axis: years, y-axis: close index]


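One lag difference or first difference: $\nabla Y_t = Y_t - Y_{t-1} = (1 - B) Y_t$, where $B$ is the lag operator, $B Y_t = Y_{t-1}$

Second order lag difference: $\nabla^2 Y_t = \nabla(\nabla Y_t) = (1-B)(1-B) Y_t = (1 - 2B + B^2) Y_t = Y_t - 2Y_{t-1} + Y_{t-2}$

[show first: $\nabla^p Y_t = p!\, c_p + \nabla^p X_t$, where $c_p$ is the coefficient of $t^p$ in the polynomial trend]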
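The effect of repeated differencing on a polynomial trend can be checked numerically. In this sketch the function name and the quadratic example are hypothetical; with a pure trend and no stochastic part, the second difference should equal $2!\,c_2$.

```python
def difference(y, order=1):
    """Apply the first-difference operator (1 - B) `order` times."""
    for _ in range(order):
        y = [b - a for a, b in zip(y[:-1], y[1:])]
    return y

# Hypothetical quadratic trend 1 + 2t + 3t^2 for t = 0..9.
trend = [1 + 2 * t + 3 * t * t for t in range(10)]
first = difference(trend, 1)    # still increasing: trend of degree 1 remains
second = difference(trend, 2)   # constant: 2! * 3
```

Each application shortens the series by one value, so a series of length $n$ differenced $p$ times has $n - p$ values left.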

Which method for trend removal is best ?

[Figure: ASE index with moving average trends, years 08 to 12; x-axis: years, y-axis: close index; legend: orig, MA(31), MA(151)]

[Figure: ASE index detrended, period 2007 - 2011; x-axis: years, y-axis: close index; legend: MA(31), MA(151)]

[Figure: ASE index with polynomial trend, years 08 to 12; x-axis: years, y-axis: close index; legend: orig, polynomial (p=20)]

[Figure: ASE index detrended, period 2007 - 2011; x-axis: years, y-axis: close index; legend: polynomial (p=20)]

[Figure: untitled panel; x-axis: years, y-axis: close index]

Estimation of trend

[Figure: S&P500, time series $y_1, y_2, \ldots, y_n$; x-axis: years, y-axis: index]

[Figure: S&P500, first differences; x-axis: years, y-axis: first difference]

Change of the value: $x_t = y_t - y_{t-1}$

[Figure: S&P500, relative changes; x-axis: years, y-axis: relative change]

Relative change of the value: $x_t = \dfrac{y_t - y_{t-1}}{y_{t-1}}$

[Figure: S&P500, difference of logs; x-axis: years, y-axis: difference of logs]

Change of the logarithm of the value: $x_t = \ln y_t - \ln y_{t-1}$

… more on differencing transform
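The three differencing transforms can be compared on a small example. This is a sketch with hypothetical values and function names; the relative change is taken here with the previous value $y_{t-1}$ in the denominator, which is the usual convention and an assumption in this sketch.

```python
import math

def first_differences(y):
    """x_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(y[:-1], y[1:])]

def relative_changes(y):
    """x_t = (y_t - y_{t-1}) / y_{t-1} (previous value in the denominator)."""
    return [(b - a) / a for a, b in zip(y[:-1], y[1:])]

def log_differences(y):
    """x_t = ln(y_t) - ln(y_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(y[:-1], y[1:])]

y = [100.0, 110.0, 99.0]        # hypothetical index values
d = first_differences(y)
rc = relative_changes(y)
ld = log_differences(y)
```

With this convention the log difference equals $\ln(1 + \text{relative change})$, so the two are close whenever the changes are small, which is why the two S&P500 plots look almost identical.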

Removal of seasonality or periodicity: $Y_t = s_t + X_t$

1. Known or estimated periodic function $s_t = f(t)$

$s_t$: periodic function of $t$ with period $d$

Annual sunspots

[Figure: Annual sunspots, 1700 to 2000; x-axis: years, y-axis: number of sunspots]

[Figure: Annual sunspots, zoom on 1900 to 2000; x-axis: years, y-axis: number of sunspots]


$X_t = Y_t - s_t$, $\{X_t\}$ stationary

Period $d$ and appropriate function $s_t$ ?

2a. Estimation of $s_i$, $i = 1, \ldots, d$, from the averages for each component (period $d$ is known)

$\hat{s}_i = \dfrac{1}{k} \sum_{j=0}^{k-1} y_{i+jd}$, $\quad k = n/d$

2b. Removal of periodicity using lag differences of order $d$ ($d$-differencing)

$\nabla_d Y_t = Y_t - Y_{t-d} = (1 - B^d) Y_t$
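Both approaches to removing a periodic component of known period $d$ can be sketched as below. The function names are hypothetical, indices are zero-based in the code, and only complete periods are used for the averages, which is a choice made here.

```python
def seasonal_component(y, d):
    """Average of the values at each position i = 0..d-1 within the period."""
    k = len(y) // d                       # number of complete periods
    return [sum(y[i + j * d] for j in range(k)) / k for i in range(d)]

def remove_seasonality(y, d):
    """Subtract the estimated periodic component from every observation."""
    s = seasonal_component(y, d)
    return [v - s[t % d] for t, v in enumerate(y)]

def seasonal_difference(y, d):
    """d-differencing: y_t - y_{t-d}."""
    return [y[t] - y[t - d] for t in range(d, len(y))]

# Hypothetical purely periodic series with period 3.
y = [1, 2, 3] * 4
s_hat = seasonal_component(y, 3)
x = remove_seasonality(y, 3)
x_diff = seasonal_difference(y, 3)
```

The averaging approach keeps all $n$ values but needs an estimate of the component, while $d$-differencing needs no estimate and loses the first $d$ values.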

Removal of trend and periodicity: $Y_t = \mu_t + s_t + X_t$

1. Removal of trend: $\tilde{Y}_t = Y_t - \mu_t = s_t + X_t$

2. Removal of periodicity: $X_t = \tilde{Y}_t - s_t = Y_t - \mu_t - s_t$

First remove trend and then periodicity, or vice versa ?

[Figure: General Index of Consumer Prices, period 1/2001-8/2005; x-axis: year, y-axis: GICP]

[Figure: GICP: Linear fit; x-axis: year, y-axis: GICP]

[Figure: GICP: Residual of linear fit; x-axis: years, y-axis: detrended GICP]

[Figure: GICP: Year cycle estimate; x-axis: year, y-axis: year cycle of GICP]

[Figure: GICP: detrended and deseasoned; x-axis: year, y-axis: residual GICP]

Non-stationary $y_1, y_2, \ldots, y_n$ → stationary $x_1, x_2, \ldots, x_n$

Is there information in the residuals?

$\{X_t\}$: time series of residuals

Time correlation

$Y_t$: the value of the quantity

[Figure: S&P500, time series $y_1, y_2, \ldots, y_n$; x-axis: years, y-axis: index]

[Figure: Gaussian pdf superimposed on S&P500; marginal density $f_{Y_t}(y)$]

[Figure: Gaussian pdf superimposed on S&P500 returns; marginal density $f_{X_t}(x)$]

Static description … marginal distribution

Dynamic description? Time correlation

Stochastic process: $\{Y_t\}$, $\{X_t\}$

[Figure: S&P500, first differences; x-axis: years, y-axis: first difference]

Change of the value: $x_t = y_t - y_{t-1}$

Distribution and moments of a stochastic process

A stochastic process can be fully described in terms of the marginal and joint probability distributions

Marginal distribution: $f_{Y_t}(y) = f_Y(y, t)$, $t \in \mathbb{Z}$

Joint distribution of 2 r.v.: $f_{Y_{t_1},Y_{t_2}}(y_1, y_2) = f_Y(y_1, y_2, t_1, t_2)$, $t_1, t_2 \in \mathbb{Z}$

Joint distribution of 3 r.v.: $f_{Y_{t_1},Y_{t_2},Y_{t_3}}(y_1, y_2, y_3) = f_Y(y_1, y_2, y_3, t_1, t_2, t_3)$, $t_1, t_2, t_3 \in \mathbb{Z}$

…

The probability distribution and moments may change in time

First order moment (mean): $\mu_t = \mathrm{E}[Y_t] = \int y\, f_Y(y, t)\, \mathrm{d}y$

Second order moment: $\mathrm{E}[Y_{t_1} Y_{t_2}] = \iint y_1 y_2\, f_Y(y_1, y_2, t_1, t_2)\, \mathrm{d}y_1 \mathrm{d}y_2$

Higher order moments …

Central second order moment (autocovariance): $\gamma(t_1, t_2) = \mathrm{E}[(Y_{t_1} - \mu_{t_1})(Y_{t_2} - \mu_{t_2})] = \mathrm{E}[Y_{t_1} Y_{t_2}] - \mu_{t_1}\mu_{t_2}$

Stationarity

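Strict-sense stationarity

The distributions do not change with time (so all moments that exist are also constant in time)

$f_{Y_t}(y) = f_Y(y, t) = f_Y(y)$, $t \in \mathbb{Z}$

$f_{Y_{t_1},Y_{t_2}}(y_1, y_2) = f_{Y_{t_1+\tau},Y_{t_2+\tau}}(y_1, y_2)$, $t_1, t_2 \in \mathbb{Z}$

$f_{Y_{t_1},Y_{t_2},Y_{t_3}}(y_1, y_2, y_3) = f_{Y_{t_1+\tau},Y_{t_2+\tau},Y_{t_3+\tau}}(y_1, y_2, y_3)$, $t_1, t_2, t_3 \in \mathbb{Z}$

Wide-sense stationarity

The first two moments are constant in time

- mean: $\mathrm{E}[Y_t] = \mu$, constant for all $t \in \mathbb{Z}$

- variance: for $\tau = 0$, $\mathrm{E}[Y_t^2]$ is constant, so $\sigma_Y^2 = \mathrm{E}[Y_t^2] - \mu^2 = \gamma(0)$ is constant

- autocovariance: $\mathrm{E}[Y_t Y_{t+\tau}]$ does not depend on $t$ for any $t \in \mathbb{Z}$, so $\gamma(t_1, t_2) = \gamma(t, t+\tau) = \gamma(\tau)$ depends only on the lag $\tau$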

Autocorrelation

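Stationary time series $\{X_t\}$

Autocovariance: $\gamma(\tau) = \mathrm{E}[(X_t - \mu)(X_{t+\tau} - \mu)] = \mathrm{E}[X_t X_{t+\tau}] - \mu^2$

Variance: $\sigma_X^2 = \mathrm{E}[(X_t - \mu)^2] = \mathrm{E}[X_t^2] - \mu^2 = \gamma(0)$

Autocorrelation: $\rho(\tau) = \dfrac{\gamma(\tau)}{\gamma(0)}$, $\rho(0) = 1$

Time correlation of the variables of $\{X_t\}$ at a lag $\tau$. Measures the “memory” of $\{X_t\}$.

Notation: $\gamma_\tau \equiv \gamma(\tau)$, $\rho_\tau \equiv \rho(\tau)$

Comments:

$|\gamma_k| \le \gamma_0$ and $|\rho_k| \le 1$

$\gamma_{-k} = \gamma_k$ and $\rho_{-k} = \rho_k$

Autocovariance matrix

$\Gamma_n = \begin{pmatrix} \gamma_0 & \gamma_1 & \cdots & \gamma_{n-1} \\ \gamma_1 & \gamma_0 & \cdots & \gamma_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \gamma_{n-1} & \gamma_{n-2} & \cdots & \gamma_0 \end{pmatrix}$

Autocorrelation matrix

$\mathrm{P}_n = \begin{pmatrix} 1 & \rho_1 & \cdots & \rho_{n-1} \\ \rho_1 & 1 & \cdots & \rho_{n-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n-1} & \rho_{n-2} & \cdots & 1 \end{pmatrix}$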

Basic stochastic processes

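1. Independent and identically distributed r.v. (iid) $\{X_t\}$

$P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n) = P(X_1 \le x_1)\, P(X_2 \le x_2) \cdots P(X_n \le x_n)$, $\quad \mathrm{E}[X_t] = 0$

2. White noise (WN), non-correlated r.v. $\{X_t\}$

$\mathrm{E}[X_t] = 0$, $\mathrm{E}[X_t^2] = \sigma^2$, $\mathrm{E}[X_i X_j] = \sigma^2 \delta_{ij}$

3. Random walk (RW) $\{Y_t\}$

$Y_t = Y_{t-1} + X_t = X_1 + X_2 + \cdots + X_t$, $\{X_t\}$ iid

$\mathrm{E}[Y_t \mid Y_{t-1}, \ldots, Y_1, Y_0] = Y_{t-1}$

$\mathrm{E}[Y_t] = 0$, $\mathrm{E}[Y_t^2] = t\sigma^2$

Variance increases linearly with time!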
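The linear growth of the random walk variance can be checked by simulation. The sketch below assumes Gaussian iid steps with $\sigma = 1$ and estimates $\mathrm{E}[Y_t^2]$ by averaging over many independent realizations; the number of walks, the seed and the function names are arbitrary choices.

```python
import random

def random_walk(n_steps, rng):
    """One realization Y_t = X_1 + ... + X_t with iid standard normal steps."""
    y, path = 0.0, []
    for _ in range(n_steps):
        y += rng.gauss(0.0, 1.0)
        path.append(y)
    return path

def variance_across_walks(n_steps, n_walks=2000, seed=1):
    """Estimate E[Y_t^2] for t = 1..n_steps by averaging over realizations."""
    rng = random.Random(seed)
    walks = [random_walk(n_steps, rng) for _ in range(n_walks)]
    # E[Y_t] = 0, so the mean square across realizations estimates Var[Y_t]
    return [sum(w[t] ** 2 for w in walks) / n_walks for t in range(n_steps)]

var_t = variance_across_walks(50)   # var_t[t-1] should be close to t
```

Because the variance depends on $t$, the random walk is not stationary even though its increments are.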

Uncorrelated (white noise) and independent (iid) observations $\{X_t\}$

Chatfield C., “The Analysis of Time Series, An Introduction”, 6th edition, p. 38 (Chapter 3):

“Some authors prefer to make the weaker assumption that the zt’s are mutually uncorrelated, rather than independent. This is adequate for linear, normal processes, but the stronger independence assumption is needed when considering non-linear models (Chapter 11). Note that a purely random process is sometimes called white noise, particularly by engineers.”

p. 221 (Chapter 11):

“When examining the properties of non-linear models, it can be very important to distinguish between independent and uncorrelated random variables. In Section 3.4.1, white noise (or a purely random process) was defined to be a sequence of independent and identically distributed (i.i.d.) random variables. This is sometimes called strict white noise (SWN), and the phrase uncorrelated white noise (UWN) is used when successive values are merely uncorrelated, rather than independent. Of course if successive values follow a normal (Gaussian) distribution, then zero correlation implies independence so that Gaussian UWN is SWN. However, with non-linear models, distributions are generally non-normal and zero correlation need not imply independence.”

Wei W.W.S., “Time Series Analysis, Univariate and Multivariate Methods”, p. 15:

2.4 White Noise Processes

A process {at} is called a white noise process if it is a sequence of uncorrelated random variables from a fixed distribution with constant mean (usually assumed 0), constant variance and zero autocovariance for lags different from 0.

4. Gaussian (normal) stochastic process

For each order $p$, $f_{X_{t_1}, X_{t_2}, \ldots, X_{t_p}}(x_1, x_2, \ldots, x_p)$ is a $p$-dimensional Gaussian distribution

A Gaussian distribution is completely defined by its first two moments → strict stationarity ≡ weak stationarity

Example

Stochastic process: $X_t = A \sin(\omega t + \theta)$

$A$ r.v. with $\mathrm{E}[A] = 0$, $\mathrm{Var}[A] = 1$

$\theta \sim U[-\pi, \pi]$, $\theta$ and $A$ independent

Is the process weakly stationary?

$\mathrm{E}[X_t] = \mathrm{E}[A]\,\mathrm{E}[\sin(\omega t + \theta)] = 0$

$\mathrm{E}[X_t X_{t+\tau}] = \mathrm{E}[A^2 \sin(\omega t + \theta)\sin(\omega(t+\tau) + \theta)] = \ldots = \frac{1}{2}\cos(\omega\tau)$

The first and second order moments do not depend on time $t$.
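The elided step in the second moment can be filled in with the product-to-sum identity $\sin a \sin b = \frac{1}{2}[\cos(a-b) - \cos(a+b)]$. The following is a sketch of that derivation, assuming $\theta$ is uniform over a full period of length $2\pi$ (the interval bounds are an assumption here), using the independence of $A$ and $\theta$ and $\mathrm{E}[A^2] = \mathrm{Var}[A] + \mathrm{E}[A]^2 = 1$.

```latex
\begin{align*}
\mathrm{E}[X_t X_{t+\tau}]
  &= \mathrm{E}[A^2]\,\mathrm{E}\big[\sin(\omega t+\theta)\,\sin(\omega(t+\tau)+\theta)\big] \\
  &= \tfrac{1}{2}\,\mathrm{E}\big[\cos(\omega\tau) - \cos(\omega(2t+\tau)+2\theta)\big] \\
  &= \tfrac{1}{2}\cos(\omega\tau)
     - \tfrac{1}{2}\cdot\frac{1}{2\pi}\int_{-\pi}^{\pi}\cos(\omega(2t+\tau)+2\theta)\,\mathrm{d}\theta \\
  &= \tfrac{1}{2}\cos(\omega\tau)
\end{align*}
```

The integral vanishes because the cosine is integrated over two full periods in $\theta$, which is what removes the dependence on $t$.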

Sample autocovariance / autocorrelation

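Time series $x_1, x_2, \ldots, x_n$

Sample mean: $\bar{x} = \dfrac{1}{n}\sum_{t=1}^{n} x_t$, unbiased estimate of the mean $\mu$ of the time series ?

Sample variance: $c(0) = \dfrac{1}{n}\sum_{t=1}^{n} (x_t - \bar{x})^2$

Sample autocovariance: $c(\tau) = \dfrac{1}{n}\sum_{t=1}^{n-\tau} (x_t - \bar{x})(x_{t+\tau} - \bar{x})$, $\quad \tau = 0, 1, \ldots, n-1$

Another estimate of autocovariance: $c'(\tau) = \dfrac{1}{n-\tau}\sum_{t=1}^{n-\tau} (x_t - \bar{x})(x_{t+\tau} - \bar{x})$

Biased estimates:

$\mathrm{E}[c(\tau)] \approx \gamma(\tau) - \dfrac{\tau}{n}\gamma(\tau) - \dfrac{n-\tau}{n}\mathrm{Var}[\bar{x}]$ (bias increases with the lag $\tau$)

$\mathrm{E}[c'(\tau)] \approx \gamma(\tau) - \mathrm{Var}[\bar{x}]$

Notation: $c_\tau \equiv c(\tau)$

Sample autocorrelation: $r(\tau) = \dfrac{c(\tau)}{c(0)}$, $r(0) = 1$

Notation: $r_\tau \equiv r(\tau)$

For large $n$: $r_\tau \sim \mathrm{N}(\rho_\tau, \mathrm{Var}[r_\tau])$

Bartlett formula: $\mathrm{Var}[r_\tau] \approx \dfrac{1}{n}\sum_{m=-\infty}^{\infty} \left(\rho_m^2 + \rho_{m+\tau}\rho_{m-\tau} + 2\rho_\tau^2\rho_m^2 - 4\rho_\tau\rho_m\rho_{m-\tau}\right)$

For very large $n$: $\mathrm{Var}[r_\tau] \approx \dfrac{1}{n}\sum_{m=-\infty}^{\infty} \rho_m^2$

Autocorrelation for white noise

White noise time series $x_1, x_2, \ldots, x_n$: $\rho_\tau = 0$ for $\tau \neq 0$, so $r_\tau \sim \mathrm{N}(0, \frac{1}{n})$ ?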
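The sample autocovariance with divisor $n$ and the resulting sample autocorrelation can be sketched as follows. The function names and the simulated Gaussian white noise are illustrative assumptions.

```python
import random

def sample_autocovariance(x, tau):
    """c(tau) with divisor n: sum over t of (x_t - mean)(x_{t+tau} - mean), divided by n."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + tau] - mean) for t in range(n - tau)) / n

def sample_autocorrelation(x, max_lag):
    """r(tau) = c(tau) / c(0) for tau = 0..max_lag."""
    c0 = sample_autocovariance(x, 0)
    return [sample_autocovariance(x, tau) / c0 for tau in range(max_lag + 1)]

rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]
r = sample_autocorrelation(white_noise, 10)
band = 2.0 / len(white_noise) ** 0.5    # approximate 95% band for white noise
```

For white noise most of the values `r[1:]` are expected to fall inside `±band`, in line with $r_\tau \sim \mathrm{N}(0, 1/n)$.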

Test for independence

Observed stationary time series $x_1, x_2, \ldots, x_n$, or residual time series after trend or periodicity removal

Are there correlations? Is it iid?

Hypotheses

$H_0$: $x_1, x_2, \ldots, x_n$ is iid

$H_0$: $x_1, x_2, \ldots, x_n$ is white noise

Statistical significance test for autocorrelation

$H_0: \rho_\tau = 0$, $\quad H_1: \rho_\tau \neq 0$

White noise $x_1, x_2, \ldots, x_n$: $r_\tau \sim \mathrm{N}(0, \frac{1}{n})$

Rejection region: $R = \left\{ r_\tau : \left| \dfrac{r_\tau}{1/\sqrt{n}} \right| > z_{1-\alpha/2} \right\}$ for significance level $\alpha$

Band of insignificant autocorrelation: $\pm z_{1-\alpha/2} \dfrac{1}{\sqrt{n}}$; for $\alpha = 0.05$: $\pm \dfrac{2}{\sqrt{n}}$

[Figure: GICP residual: autocorrelation; y-axis: $r(\tau)$]

Is there any correlation in the GICP time series?

Significance test $H_0: \rho_\tau = 0$ for each $\tau$ independently

At significance level $\alpha = 0.05$, $H_0$ is rejected for $\tau = 10$

Numerical Example

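For a time series of 200 observations, the autocorrelations for $\tau = 1, \ldots, 10$ are:

$\tau$: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

$r_\tau$: -0.38, -0.28, 0.11, -0.08, 0.02, 0.00, 0.01, 0.07, -0.08, 0.05

Assume that the time series is purely random ($H_0: \rho_\tau = 0$): $\mathrm{Var}[r_\tau] \approx \dfrac{1}{200} = 0.005$

For $\alpha = 0.05$, we expect 95% of the autocorrelations to be in the interval $\pm 1.96 \sqrt{\dfrac{1}{200}} = \pm 1.96 \cdot 0.0707 \approx \pm 0.139$

Only the autocorrelations at lags 1 and 2 fall outside this interval, so $\rho_1 \neq 0$, $\rho_2 \neq 0$ and $\rho_\tau = 0$ for $\tau = 3, 4, \ldots$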
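The arithmetic of this example can be reproduced directly. The sketch below uses the ten autocorrelation values and $n = 200$ from the example; the variable names are arbitrary.

```python
import math

n = 200
r = [-0.38, -0.28, 0.11, -0.08, 0.02, 0.00, 0.01, 0.07, -0.08, 0.05]

z = 1.96                                  # standard normal quantile for alpha = 0.05
band = z * math.sqrt(1.0 / n)             # half-width of the insignificance band

# Lags (counted from 1) whose sample autocorrelation lies outside the band.
significant_lags = [tau for tau, value in enumerate(r, start=1) if abs(value) > band]
```

Each lag is tested on its own here, so with many lags a few rejections can occur by chance even for a purely random series.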

Example of GICP

The Portmanteau significance test

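A joint test for the lags $\tau = 1, \ldots, k$

$H_0: \rho_\tau = 0$, $\quad \tau = 1, \ldots, k$

Test statistic Q:

Box-Pierce: $Q = n \sum_{\tau=1}^{k} r_\tau^2$

Ljung-Box: $Q = n(n+2) \sum_{\tau=1}^{k} \dfrac{r_\tau^2}{n - \tau}$

Under $H_0$, $Q \sim \chi^2_k$; rejection region $R = \{ Q : Q > \chi^2_{k;1-\alpha} \}$

[Figure: GICP residual: autocorrelation; y-axis: $r(\tau)$]

For $k = 10$: $Q = 24.06 > \chi^2_{k;1-\alpha} = 18.30$, so $H_0$ for the lags up to $\tau = 10$ is rejected

[Figure: GICP residual: Portmanteau (Ljung-Box); x-axis: k, y-axis: Q(k)]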
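Both Portmanteau statistics are simple sums over the sample autocorrelations. As an illustration only, the sketch below applies them to the ten autocorrelations of the numerical example with $n = 200$ (not to the GICP residuals), and uses the critical value 18.30 quoted above for $k = 10$, $\alpha = 0.05$; the function names are hypothetical.

```python
def box_pierce(r, n):
    """Q = n * sum of r_tau^2 over tau = 1..k, where r = [r_1, ..., r_k]."""
    return n * sum(v ** 2 for v in r)

def ljung_box(r, n):
    """Q = n (n + 2) * sum of r_tau^2 / (n - tau) over tau = 1..k."""
    return n * (n + 2) * sum(v ** 2 / (n - tau) for tau, v in enumerate(r, start=1))

n = 200
r = [-0.38, -0.28, 0.11, -0.08, 0.02, 0.00, 0.01, 0.07, -0.08, 0.05]

q_bp = box_pierce(r, n)
q_lb = ljung_box(r, n)
critical_value = 18.30                    # chi-square quantile for k = 10, alpha = 0.05
reject_h0 = q_lb > critical_value
```

The Ljung-Box statistic is always slightly larger than the Box-Pierce one, since each term is weighted by $(n+2)/(n-\tau) > 1$; the difference matters mostly for short series.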

One test for all lags together ?

[Figure: random time series; x-axis: t, y-axis: x(t)]

[Figure: random time series: autocorrelation; y-axis: $r(\tau)$]

[Figure: logistic time series; x-axis: t, y-axis: x(t)]

[Figure: logistic time series: autocorrelation; y-axis: $r(\tau)$]

An appropriate significance test ?
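The contrast between the two series can be reproduced with a short sketch. It assumes the fully chaotic logistic map $x_{t+1} = 4 x_t (1 - x_t)$; the parameter value 4, the initial value and the function names are assumptions, since the slides do not state how the logistic series was generated.

```python
def logistic_series(n, x0=0.3, a=4.0):
    """Iterate x_{t+1} = a * x_t * (1 - x_t), a purely deterministic rule."""
    x = [x0]
    for _ in range(n - 1):
        x.append(a * x[-1] * (1.0 - x[-1]))
    return x

def lag_autocorrelation(x, tau):
    """Sample autocorrelation r(tau) = c(tau) / c(0) with divisor n."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    c_tau = sum((x[t] - mean) * (x[t + tau] - mean) for t in range(n - tau)) / n
    return c_tau / c0

x = logistic_series(1000)
r1 = lag_autocorrelation(x, 1)
```

Although each value is an exact function of the previous one, `r1` is expected to be close to zero, so a test based on linear autocorrelation alone cannot tell this series from a random one.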

Is there correlation in the returns time series of the

ASE index (time period 2007-2011)?

[Figure: ASE index, years 07 to 12; x-axis: years, y-axis: close index]

[Figure: ASE first differences: autocorrelation; y-axis: $r(\tau)$]

[Figure: ASE first differences: Portmanteau (Ljung-Box); x-axis: k, y-axis: Q(k); legend: sample Q, $\chi^2(k, 1-\alpha)$]

[Figure: ASE returns: autocorrelation; y-axis: $r(\tau)$]

[Figure: ASE returns: Portmanteau (Ljung-Box); x-axis: k, y-axis: Q(k); legend: sample Q, $\chi^2(k, 1-\alpha)$]

Is there correlation?

[Figure: ASE square returns: autocorrelation; y-axis: $r(\tau)$]

[Figure: ASE square returns: Portmanteau (Ljung-Box); x-axis: k, y-axis: Q(k); legend: sample Q, $\chi^2(k, 1-\alpha)$]

[Figure: untitled panel; x-axis: years, y-axis: close index]

[Figure: ASE index: returns, period 2007 - 2011; x-axis: years, y-axis: close index]

What is the appropriate stationary time series: first differences or returns ?

First differences: $x_t = y_t - y_{t-1}$

Returns: $x_t = \ln y_t - \ln y_{t-1}$

[Figure: ASE index: square returns, period 2007 - 2011; x-axis: years, y-axis: close index]

Square of returns: $x_t^2$, with $x_t = \ln y_t - \ln y_{t-1}$

… nonlinear ? $\mathrm{E}[X_t^2 X_{t+\tau}^2]$
