Time series forecasting: Obtaining long term trends with self-organizing maps


Page 1: Time series forecasting:Obtaining long term trends with self-organizing maps

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

Time series forecasting: Obtaining long term trends with self-organizing maps

Advisor: Dr. Hsu

Presenter: Yu-San Hsieh

Authors: G. Simon, A. Lendasse, M. Cottrell, J.-C. Fort, M. Verleysen

Page 2: Time series forecasting:Obtaining long term trends with self-organizing maps

N.Y.U.S.T.

I. M.

Outline

─ Motivation
─ Objective
─ Kohonen self-organizing maps
─ The double quantization method
─ Experimental results
─ Conclusions

Page 3: Time series forecasting:Obtaining long term trends with self-organizing maps

Motivation

Kohonen self-organizing maps are a well-known classification tool, commonly used in a wide variety of problems, but with limited applications in the time series forecasting context.

Many methods designed for time series forecasting perform well on a rather short-term horizon but are rather poor on a longer-term one.

Page 4: Time series forecasting:Obtaining long term trends with self-organizing maps

Objective

We propose a forecasting method specifically designed for multi-dimensional long-term trend prediction, with a double application of the Kohonen algorithm.

The proposed method is not designed to obtain accurate forecasts of the next values of a series, but rather aims to determine long-term trends.

Page 5: Time series forecasting:Obtaining long term trends with self-organizing maps

Kohonen self-organizing maps

SOMs have been commonly used since their first description in a wide variety of problems:
─ classification
─ feature extraction
─ pattern recognition
─ other related applications

The SOM is an unsupervised classification algorithm from the artificial neural network paradigm.

Page 6: Time series forecasting:Obtaining long term trends with self-organizing maps

Kohonen self-organizing maps (cont.)

After the learning stage, each prototype represents a subset of the initial input set in which the inputs share some similar features.

Properties
─ The SOM performs a vector quantization of the input space that respects the original distribution of the inputs.
─ The prototypes are ordered according to their location in the input space.
─ This ordering property does not hold for other standard vector quantization methods such as competitive learning.
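The quantization step described above can be illustrated with a minimal one-dimensional SOM (a "Kohonen string") in NumPy. This is only a sketch under stated assumptions: the linear decay of the learning rate and neighbourhood radius, the Gaussian neighbourhood, and the names `train_som_1d` and `assign_clusters` are illustrative choices, not taken from the slides.

```python
import numpy as np

def train_som_1d(data, n_prototypes, n_epochs=20, seed=0):
    """Train a 1-D Kohonen map on the rows of `data` (shape: samples x dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Start from randomly drawn inputs (an assumed initialisation).
    prototypes = data[rng.choice(len(data), n_prototypes, replace=True)].copy()
    grid = np.arange(n_prototypes)          # positions on the string
    n_steps = n_epochs * len(data)
    step = 0
    for _ in range(n_epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = 0.5 * (1.0 - frac) + 0.01                    # decaying learning rate
            radius = max(n_prototypes / 2 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            # Neighbours of the winner on the string are pulled towards x too;
            # this is what produces ordered prototypes.
            h = np.exp(-((grid - winner) ** 2) / (2 * radius ** 2))
            prototypes += lr * h[:, None] * (x - prototypes)
            step += 1
    return prototypes

def assign_clusters(data, prototypes):
    """Return, for each row of `data`, the index of its nearest prototype."""
    data = np.asarray(data, dtype=float)
    dist = np.linalg.norm(data[:, None, :] - prototypes[None, :, :], axis=2)
    return np.argmin(dist, axis=1)
```

Each prototype then represents the subset of inputs assigned to it by `assign_clusters`.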

Page 7: Time series forecasting:Obtaining long term trends with self-organizing maps

The double quantization method

The method described here aims to forecast long-term trends for a time series evolution.

It is based on the SOM algorithm and can be divided into two stages:
─ the characterization, which can be viewed as the learning
─ the forecasting, which can be viewed as the use of a model in a generalization procedure

Page 8: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: characterization

Define the prediction of a vector $x_{t+1}^{t+d}$:

$x_{t+1}^{t+d} = f(x_{t-p+1}^{t}) + \varepsilon_t$

─ d is the size of the vector to be predicted
─ f is the data generating process
─ p is the number of past values that influence the future values
─ $\varepsilon_t$ is a centred noise vector

The past values are gathered in a p-dimensional vector called the regressor.

Page 9: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: characterization (cont.)

Converting the inputs into regressors leads to n-p+1 vectors in a p-dimensional space. The resulting regressors are denoted:

$x_{t-p+1}^{t} = \{x(t), x(t-1), \ldots, x(t-p+1)\}$

─ p ≤ t ≤ n, where p is the regressor size and n is the number of values at our disposal in the time series.
─ x(t) is the original time series at our disposal, with 1 ≤ t ≤ n.
─ The subscript index denotes the first temporal value of the vector.
─ The superscript index denotes its last temporal value.
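As a small illustration of the regressor construction, the sketch below turns a series of n values into n-p+1 vectors of dimension p. The function name `build_regressors` and the choice to store the most recent value first in each row are assumptions made for this example.

```python
import numpy as np

def build_regressors(series, p):
    """Return the n-p+1 regressors of a scalar series of length n.

    Row for time t (1-based, p <= t <= n) holds x(t), x(t-1), ..., x(t-p+1).
    """
    series = np.asarray(series, dtype=float)
    n = len(series)
    # series[t - p:t] holds x(t-p+1)..x(t) in 0-based storage; reverse it so
    # that the most recent value comes first.
    return np.stack([series[t - p:t][::-1] for t in range(p, n + 1)])

regressors = build_regressors([1, 2, 3, 4, 5, 6], p=3)
print(regressors.shape)  # (4, 3), i.e. n - p + 1 = 4 regressors
```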

Page 10: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: characterization (cont.)

Deformations are created according to:

$y_{t-p+1}^{t} = x_{t-p+2}^{t+1} - x_{t-p+1}^{t}$   (3)

─ Each $y_{t-p+1}^{t}$ is associated to one of the $x_{t-p+1}^{t}$.
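A deformation is the difference between a regressor and the regressor one time step later, so the last regressor has no deformation. A minimal sketch, assuming the regressors are stored one per row in chronological order (the name `build_deformations` is illustrative):

```python
import numpy as np

def build_deformations(regressors):
    """Difference between each regressor and its successor in time."""
    regressors = np.asarray(regressors, dtype=float)
    # Row i of the result is the deformation associated with regressor i.
    return regressors[1:] - regressors[:-1]

# Regressors of the series 1, 2, 4, 7 with p = 2 (most recent value first).
regressors = np.array([[2.0, 1.0], [4.0, 2.0], [7.0, 4.0]])
print(build_deformations(regressors))  # two deformations: [2, 1] and [3, 2]
```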

Page 11: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: characterization (cont.)

The SOM algorithm can then be applied to each one of these two spaces, quantizing the original regressors and the deformations respectively.

Quantizing the original space yields n1 p-dimensional prototypes $\bar{x}_i$ (1 ≤ i ≤ n1); the cluster associated with $\bar{x}_i$ is denoted $c_i$.

Quantizing all deformations in the deformation space yields n2 p-dimensional prototypes $\bar{y}_j$ (1 ≤ j ≤ n2); similarly, the associated clusters are denoted $c'_j$.

Page 12: Time series forecasting:Obtaining long term trends with self-organizing maps

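Method description: characterization (cont.)

Define the transition matrix $f_{ij}$:

─ The row $f_{ij}$, for a fixed i and 1 ≤ j ≤ n2, is the conditional probability that $y_{t-p+1}^{t}$ belongs to $c'_j$, given that $x_{t-p+1}^{t}$ belongs to $c_i$.
─ In practice this probability can be estimated by the empirical frequency:

$f_{ij} = \dfrac{\#\{x_{t-p+1}^{t} \in c_i \text{ and } y_{t-p+1}^{t} \in c'_j\}}{\#\{x_{t-p+1}^{t} \in c_i\}}$   (4)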
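The conditional probabilities in the transition matrix can be estimated by counting, as in the sketch below. It assumes the cluster index of each regressor and of its associated deformation are already available as two aligned integer arrays (only regressors that have a deformation are included); the function name and the choice to leave rows of empty clusters at zero are assumptions of this example.

```python
import numpy as np

def transition_matrix(reg_clusters, def_clusters, n1, n2):
    """Empirical frequency of deformation cluster j given regressor cluster i."""
    counts = np.zeros((n1, n2))
    # Count how often regressor cluster i co-occurs with deformation cluster j.
    np.add.at(counts, (np.asarray(reg_clusters), np.asarray(def_clusters)), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Normalise each row; rows of empty regressor clusters stay at zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)

f = transition_matrix([0, 0, 1, 1, 1], [0, 1, 1, 1, 0], n1=3, n2=2)
print(f)
```

Every non-empty row sums to 1, so it can be used directly as a discrete distribution over the deformation prototypes.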

Page 13: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: forecasting

The prediction for time t+1 is obtained according to relation (3):

$\hat{x}_{t-p+2}^{t+1} = x_{t-p+1}^{t} + \bar{y}_j$   (5)

─ $\hat{x}_{t-p+2}^{t+1}$ is the estimate of the true $x_{t-p+2}^{t+1}$ given by our time series prediction model.
─ To select $\bar{y}_j$, find the prototype $\bar{x}_k$ associated with the regressor $x_{t-p+1}^{t}$, look at row k in the transition matrix, and randomly choose a deformation prototype $\bar{y}_j$ according to the conditional probability distribution defined by $f_{kj}$, 1 ≤ j ≤ n2.

Page 14: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: forecasting (cont.)

We extract the scalar prediction $\hat{x}(t+1)$ from the p-dimensional vector $\hat{x}_{t-p+2}^{t+1}$. The procedure is then iterated, using $\hat{x}(t+1)$ in place of x(t+1) to compute $\hat{x}_{t-p+3}^{t+2}$ by (5) and extracting $\hat{x}(t+2)$. We then do the same for $\hat{x}(t+3)$, ..., $\hat{x}(t+h)$.

This ends the run of the algorithm to obtain a single simulation of the series at horizon h.

A Monte-Carlo procedure is used to repeat the whole long-term simulation procedure at horizon h many times, as detailed above.
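The simulation loop and its Monte-Carlo repetition can be sketched as follows. This is an interpretation of the procedure, not the authors' code: it assumes regressors are stored most recent value first, that the associated prototype is the nearest one in Euclidean distance, that every row of the transition matrix `f` that can be reached sums to 1, and that the runs are summarised by their mean and a 95% band. All function and parameter names are hypothetical.

```python
import numpy as np

def simulate(last_regressor, x_protos, y_protos, f, horizon, rng):
    """One simulated trajectory of length `horizon`."""
    reg = np.array(last_regressor, dtype=float)
    path = np.empty(horizon)
    for step in range(horizon):
        # Index k of the regressor prototype nearest to the current regressor.
        k = np.argmin(np.linalg.norm(x_protos - reg, axis=1))
        # Draw a deformation prototype according to row k of the matrix.
        j = rng.choice(len(y_protos), p=f[k])
        predicted = reg + y_protos[j]
        path[step] = predicted[0]  # scalar prediction for the next time step
        # Next regressor: the new prediction followed by the previous values.
        reg = np.concatenate(([predicted[0]], reg[:-1]))
    return path

def monte_carlo(last_regressor, x_protos, y_protos, f, horizon,
                n_runs=1000, seed=0):
    """Repeat the simulation and summarise the runs (mean and 95% band)."""
    rng = np.random.default_rng(seed)
    runs = np.stack([simulate(last_regressor, x_protos, y_protos, f,
                              horizon, rng) for _ in range(n_runs)])
    return runs.mean(axis=0), np.percentile(runs, [2.5, 97.5], axis=0)
```

The spread of the simulated trajectories gives an indication of the uncertainty around the long-term trend, which a single run cannot provide.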

Page 15: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting

$x_{t+1}^{t+d}$ is a vector defined as:

$x_{t+1}^{t+d} = \{x(t+d), \ldots, x(t+2), x(t+1)\}$

─ d is determined according to a priori knowledge about the series.
─ Example: when forecasting an electrical consumption, it could be advantageous to predict all hourly values for one day in a single step instead of predicting iteratively each value separately.

Page 16: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

Regressors of this kind of time series can be constructed as:

$x_{t-p+1}^{t} = \{x_{t-d+1}^{t}, x_{t-2d+1}^{t-d}, \ldots, x_{t-p+1}^{t-p+d}\}$

─ The regressor is thus constructed as the concatenation of d-dimensional vectors from the past of the time series.
─ For the sake of simplicity, p is supposed to be a multiple of d, though this is not compulsory.
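The concatenation of d-dimensional vectors can be sketched as below. The example assumes the series is already stored as one d-dimensional block per row (for instance one row per day), and that each regressor concatenates the m = p/d most recent blocks, most recent first; the name `build_vector_regressors` and the parameter `m` are hypothetical.

```python
import numpy as np

def build_vector_regressors(blocks, m):
    """Concatenate the m most recent d-dimensional blocks into one regressor."""
    blocks = np.asarray(blocks, dtype=float)
    return np.stack([
        # Blocks b, b-1, ..., b-m+1 flattened into one vector of size m * d.
        blocks[b - m + 1:b + 1][::-1].reshape(-1)
        for b in range(m - 1, len(blocks))
    ])

blocks = [[1, 2], [3, 4], [5, 6]]          # three blocks of dimension d = 2
print(build_vector_regressors(blocks, m=2))  # regressors of size p = 4
```

Deformations are then differences between consecutive rows, exactly as in the scalar case but with a step of d values instead of one.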

Page 17: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

Deformations can be formed here according to:

$y_{t-p+1}^{t} = x_{t-p+d+1}^{t+d} - x_{t-p+1}^{t}$

Page 18: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

Here again, the SOM algorithm can then be applied to each one of these two spaces, quantizing the original regressors and the deformations respectively.

Quantizing the original space yields n1 prototypes $\bar{x}_i$ (1 ≤ i ≤ n1); the cluster associated with $\bar{x}_i$ is denoted $c_i$.

Quantizing all deformations in the deformation space yields n2 prototypes $\bar{y}_j$ (1 ≤ j ≤ n2); similarly, the associated clusters are denoted $c'_j$.

Page 19: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

Define the transition matrix as a vector generalisation of relation (4):

─ The row $f_{ij}$, for a fixed i and 1 ≤ j ≤ n2, is the conditional probability that $y_{t-p+1}^{t}$ belongs to $c'_j$, given that $x_{t-p+1}^{t}$ belongs to $c_i$.

Page 20: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

The simulation forecasting procedure can also be generalised:
─ Consider the vector input $x_{t-d+1}^{t}$ for time t. The corresponding regressor is $x_{t-p+1}^{t}$.
─ Find the corresponding prototype $\bar{x}_k$.
─ Choose a deformation prototype $\bar{y}_j$ among the $\bar{y}_j$ (1 ≤ j ≤ n2) according to the conditional distribution given by the elements $f_{kj}$ of row k.

Page 21: Time series forecasting:Obtaining long term trends with self-organizing maps

Method description: vector forecasting (cont.)

The simulation forecasting procedure can also be generalised (continued):
─ Forecast $\hat{x}_{t-p+d+1}^{t+d}$ as $\hat{x}_{t-p+d+1}^{t+d} = x_{t-p+1}^{t} + \bar{y}_j$.
─ Extract the vector $\hat{x}_{t+1}^{t+d}$ from the d first columns of $\hat{x}_{t-p+d+1}^{t+d}$.
─ Repeat until horizon h.

For this vector case, a Monte-Carlo procedure is used as well.

Page 22: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results

This section is devoted to the application of the method to two time series:
─ Santa Fe A: a scalar time series.
─ Polish electrical consumption from 1989 to 1996: the prediction of a vector of 24 hourly values.

Page 23: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A
─ The complete data set contains 10,000 data. This set has been divided here as follows:
  ─ the learning set contains 6000 data
  ─ the validation set contains 2000 data
  ─ the test set contains 100 data
─ The regressors have been constructed with d = 1, p = 6 (the value x(t-4) is omitted), and h = 100.

Page 24: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A
─ Kohonen strings of 1 up to 200 prototypes in each space have been used.
─ All the 40,000 possible models (200 × 200 combinations) have been tested on the validation set.
─ The best model among them has 179 prototypes in the regressor space and 161 prototypes in the deformation space.
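The exhaustive search over map sizes can be sketched as a simple grid search. The callable `validation_error` is a hypothetical stand-in for whatever criterion is computed on the validation set; the slides do not specify it, so this only illustrates how the 40,000 candidate pairs arise.

```python
from itertools import product

def select_best_model(validation_error, max_size=200):
    """Try every (n1, n2) pair of map sizes and keep the one with lowest error.

    `validation_error(n1, n2)` is assumed to train both maps with the given
    numbers of prototypes and return an error measured on the validation set.
    """
    candidates = list(product(range(1, max_size + 1), repeat=2))
    best = min(candidates, key=lambda pair: validation_error(*pair))
    return best, len(candidates)
```

With `max_size=200` the search covers 200 × 200 = 40,000 candidate models, matching the count given above.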

Page 25: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 26: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 27: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 28: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 29: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 30: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Scalar forecasting: Santa Fe A

Page 31: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption
─ The whole dataset contains about 72,000 hourly data.

Page 32: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption
─ We have at our disposal 3000 data of dimension 24.
─ 2000 of them are used for the learning.
─ 800 are used for a simple validation.
─ 200 are used for the test.
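The arithmetic behind this split (about 72,000 hourly values grouped into 3000 vectors of 24 values, then divided 2000/800/200) can be sketched as follows. The function name `split_daily` is hypothetical, and the chronological order of the three sets is an assumption; the real data set is not reproduced here.

```python
import numpy as np

def split_daily(hourly, d=24, n_learn=2000, n_valid=800, n_test=200):
    """Group hourly values into d-dimensional vectors and split them in order."""
    hourly = np.asarray(hourly, dtype=float)
    n_days = len(hourly) // d
    days = hourly[:n_days * d].reshape(n_days, d)   # one row per day
    if n_learn + n_valid + n_test > n_days:
        raise ValueError("not enough data for the requested split")
    learn = days[:n_learn]
    valid = days[n_learn:n_learn + n_valid]
    test = days[n_learn + n_valid:n_learn + n_valid + n_test]
    return learn, valid, test

# Synthetic stand-in for the hourly series (72,000 values).
learn, valid, test = split_daily(np.arange(72_000))
print(learn.shape, valid.shape, test.shape)  # (2000, 24) (800, 24) (200, 24)
```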

Page 33: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption
─ The optimal regressor is unknown; many different regressors were tried, using intuitive understanding of the process.
─ The final regressor contains p = 5 data of dimension d = 24, and the learning set then contains 2000 + 800 data.
─ The forecasting obtained from this model is repeated 1000 times.

Page 34: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption

Page 35: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption

Page 36: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption

Page 37: Time series forecasting:Obtaining long term trends with self-organizing maps

Experimental results (cont.)

Vector forecasting: the Polish electrical consumption

Page 38: Time series forecasting:Obtaining long term trends with self-organizing maps

Conclusions

The use of SOMs makes it possible to apply the method to both scalar and vector time series and to determine long-term trends.

Page 39: Time series forecasting:Obtaining long term trends with self-organizing maps

My opinion

Advantage:
─ Obtains long-term trends.

Application:
─ in the financial context
─ for the estimation of volatilities