Indexing of Time Series by Major Minima and Maxima

Post on 08-Jan-2016

Indexing of Time Series by Major Minima and Maxima

Eugene Fink

Kevin B. Pratt

Harith S. Gandhi

Time series

A time series is a sequence of real values measured at equal intervals.

Example: 0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0

[Figure: plot of the example series, with values ranging from 0 to 4.]

Results

• Compression of a time series by extracting its major minima and maxima

• Indexing of compressed time series

• Retrieval of series similar to a given pattern

• Experiments with stock and weather series

Outline

• Compression

• Indexing

• Retrieval

• Experiments

Compression

We select major minima and maxima, along with the start point and end point, and discard the other points.

We use a positive parameter R to control the compression rate.

Major minima

A point a[m] in a[1..n] is a major minimum if there are i and j, where i < m < j, such that:

• a[m] is a minimum among a[i..j], and

• a[i] – a[m] ≥ R and a[j] – a[m] ≥ R.

[Figure: a major minimum a[m], with a[i] and a[j] each at least R above it.]

Major maxima

A point a[m] in a[1..n] is a major maximum if there are i and j, where i < m < j, such that:

• a[m] is a maximum among a[i..j], and

• a[m] – a[i] ≥ R and a[m] – a[j] ≥ R.

[Figure: a major maximum a[m], with a[i] and a[j] each at least R below it.]
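The two definitions can be checked directly, point by point. The Python sketch below is only an illustration of the definitions, assuming 0-based indices and the "at least R" reading of the conditions; the function names are chosen here and do not come from the slides.

```python
def is_major_minimum(a, m, R):
    """Test the definition of a major minimum for a[m] (0-based index)."""

    def rises_by_R(indices):
        # Walk away from m. Fail as soon as a value drops below a[m],
        # because a[m] would then no longer be a minimum of a[i..j].
        for k in indices:
            if a[k] < a[m]:
                return False
            if a[k] - a[m] >= R:
                return True
        return False

    left = rises_by_R(range(m - 1, -1, -1))
    right = rises_by_R(range(m + 1, len(a)))
    return left and right


def is_major_maximum(a, m, R):
    # Symmetric case: a maximum of a is a minimum of the negated series.
    return is_major_minimum([-x for x in a], m, R)


series = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
minima = [m for m in range(len(series)) if is_major_minimum(series, m, 3)]
maxima = [m for m in range(len(series)) if is_major_maximum(series, m, 3)]
```

This brute-force check rescans the series for every point, so it is useful for testing rather than for compressing long series.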

Compression procedure

The procedure performs one pass through a given series.

It can compress a live series without storing it in memory.

It takes linear time and constant memory.
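A one-pass procedure of this kind could look roughly like the Python sketch below. It is an illustration under assumptions, not the authors' implementation: the name compress, the 0-based positions, and the handling of ties (the first of several equal values is kept) are choices made here.

```python
def compress(series, R):
    """Yield (position, value) for the start point, the major minima
    and maxima for a positive threshold R, and the end point."""
    it = enumerate(series)
    try:
        first = next(it)
    except StopIteration:
        return  # empty series: nothing to keep
    yield first
    lo = hi = first[1]   # running minimum and maximum before the first swing
    direction = 0        # 0: undecided, +1: seeking a maximum, -1: a minimum
    cand = first         # current candidate extremum
    last = first
    for k, x in it:
        last = (k, x)
        if direction == 0:
            lo, hi = min(lo, x), max(hi, x)
            if x - lo >= R:
                direction, cand = +1, (k, x)
            elif hi - x >= R:
                direction, cand = -1, (k, x)
        elif direction > 0:
            if x > cand[1]:
                cand = (k, x)          # higher candidate maximum
            elif cand[1] - x >= R:
                yield cand             # confirmed major maximum
                direction, cand = -1, (k, x)
        else:
            if x < cand[1]:
                cand = (k, x)          # lower candidate minimum
            elif x - cand[1] >= R:
                yield cand             # confirmed major minimum
                direction, cand = +1, (k, x)
    if last[0] != first[0]:
        yield last                     # the end point is always kept


example = [0, 3, 1, 2, 0, 1, 1, 3, 0, 2, 1, 4, 0, 1, 0]
kept = list(compress(iter(example), 3))
```

The sketch holds only a few variables at a time and reads each value once, so it accepts any iterator, including a live feed. For the example series and R = 3, it keeps the points at 0-based positions 0, 1, 4, 7, 8, 11, and 14.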

Indexing of series

We index series in a database by their major inclines, which are upward and downward segments of the series.

Major inclines

A segment a[i..j] is a major upward incline if

• a[i] is a major minimum;

• a[j] is a major maximum;

• for every m with i < m < j, a[i] < a[m] < a[j].

[Figure: a major upward incline from a[i] to a[j].]

The definition of a major downward incline is symmetric.

Identification of inclines

The procedure performs two passes through a list of major minima and maxima.

Its time is linear in the number of inclines.
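To make the notion of an incline concrete, the Python sketch below lists the major upward inclines in a list of major extrema. It is not the two-pass procedure mentioned above: it uses a simple nested scan that can take quadratic time. The input format, a hypothetical alternating list of (position, value, kind) tuples, is an assumption, and the check looks only at the major extrema, assuming that the discarded points between two consecutive extrema stay strictly between their values.

```python
def upward_inclines(extrema):
    """Return (i, j) position pairs of major upward inclines.

    extrema: alternating list of (position, value, kind) tuples,
    where kind is "min" or "max" (a hypothetical input format).
    """
    inclines = []
    for s, (i, low, kind) in enumerate(extrema):
        if kind != "min":
            continue
        top = low  # highest maximum seen so far to the right of i
        for j, value, kind_j in extrema[s + 1:]:
            if value <= low:
                break  # the series returns to the starting level
            if kind_j == "max" and value > top:
                inclines.append((i, j))  # everything between is below value
                top = value
    return inclines


extrema = [(2, 0, "min"), (5, 3, "max"), (7, 1, "min"),
           (9, 5, "max"), (12, 0, "min")]
found = upward_inclines(extrema)
```

In this example the incline from position 2 to position 9 spans a smaller incline from 7 to 9, which shows that inclines are not limited to adjacent minimum and maximum pairs. Downward inclines can be found the same way on the negated values.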

Indexing of inclines

We index major inclines of series in a database by their lengths and heights.

We use a range tree, which supports indexing of points by two coordinates.

[Figure: each incline is represented as a point with two coordinates, length and height.]
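The Python sketch below shows the kind of two-coordinate query that such an index answers. It is deliberately not a range tree: as a simpler stand-in, it keeps the inclines sorted by length, narrows the length range by binary search, and then filters by height. The class name InclineIndex and the (length, height, reference) tuples are hypothetical.

```python
from bisect import bisect_left, bisect_right


class InclineIndex:
    """Stand-in for a two-dimensional index over (length, height) points."""

    def __init__(self, inclines):
        # inclines: iterable of (length, height, reference) tuples
        self._items = sorted(inclines, key=lambda t: t[0])
        self._lengths = [t[0] for t in self._items]

    def query(self, min_len, max_len, min_height, max_height):
        # Binary search bounds the length range; heights are filtered linearly.
        lo = bisect_left(self._lengths, min_len)
        hi = bisect_right(self._lengths, max_len)
        return [t for t in self._items[lo:hi]
                if min_height <= t[1] <= max_height]


index = InclineIndex([(3, 3.0, "A"), (1, 4.0, "B"),
                      (6, 2.5, "C"), (3, 0.5, "D")])
hits = index.query(2, 6, 2, 5)
```

A range tree answers the same rectangle query without the linear height filter, which matters when many inclines share a length range but few fall in the height range.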

Retrieval

The procedure inputs a pattern series and searches for similar segments in a database.

[Figure: an example pattern and a database series with segments labeled 1, 2, and 3.]

Main steps:

• Find the pattern’s inclines with the greatest height

• Retrieve all segments that have similar inclines

• Compare each of these segments with the pattern

Highest inclines

First, the retrieval procedure identifies the important inclines in the pattern and selects the highest of them.

[Figure: a pattern with its two highest inclines, labeled 1 and 2, and their lengths and heights.]
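Selecting the highest inclines is a small top-k step, sketched below in Python. The (length, height) tuples and the parameter k are assumptions for illustration; the slides do not say how many inclines are selected.

```python
import heapq


def highest_inclines(inclines, k=2):
    """Return the k inclines with the greatest height.

    inclines: iterable of (length, height) tuples; k is a
    hypothetical parameter, not a value taken from the slides.
    """
    return heapq.nlargest(k, inclines, key=lambda inc: inc[1])


pattern_inclines = [(4, 3.0), (2, 1.0), (5, 4.0), (3, 2.0)]
top = highest_inclines(pattern_inclines)
```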

Candidate segments

Second, the procedure retrieves segments with similar inclines from the database.

An incline is considered similar if

• its height is between height / C and height · C;

• its length is between length / D and length · D.

We use the range tree to retrieve similar inclines.

[Figure: the query rectangle around an incline, bounded by length / D, length · D, height / C, and height · C.]
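The similarity condition amounts to a rectangle query around the pattern's incline. The Python sketch below computes the rectangle and filters a list of database inclines with a plain linear scan in place of the range tree; the function names and the (length, height, reference) tuples are hypothetical, and heights are assumed to be positive.

```python
def similarity_box(length, height, C, D):
    """Bounds (min_len, max_len, min_height, max_height) for C, D >= 1."""
    return (length / D, length * D, height / C, height * C)


def similar_inclines(pattern_incline, database_inclines, C, D):
    """Return database inclines inside the pattern incline's box.

    pattern_incline: (length, height)
    database_inclines: iterable of (length, height, reference)
    """
    length, height = pattern_incline
    min_len, max_len, min_h, max_h = similarity_box(length, height, C, D)
    return [(l, h, ref) for (l, h, ref) in database_inclines
            if min_len <= l <= max_len and min_h <= h <= max_h]


database = [(6, 3.0, "s1"), (25, 4.0, "s2"), (10, 9.0, "s3"), (20, 8.0, "s4")]
candidates = similar_inclines((10, 4.0), database, C=2, D=2)
```

Larger values of C and D widen the rectangle, so more candidates are retrieved and more time is spent on the later comparison step.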

Similarity test

Third, the procedure compares the retrieved segments with the pattern, using a given similarity test.

Experiments

We have tested a Visual Basic implementation on a 2.4 GHz Pentium computer.

Data sets:

• Stock prices: 98 series, 60,000 points

• Air and sea temperatures: 136 series, 450,000 points

Stock prices (60,000 points), search for 100-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking.]

• C = D = 5: time 0.05 sec (axis tick labels: 0, 210, 200)

• C = D = 2: time 0.02 sec (axis tick labels: 0, 331, 200)

• C = D = 1.5: time 0.01 sec (axis tick labels: 0, 400, 151)

Stock prices (60,000 points), search for 500-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking.]

• C = D = 5: time 0.31 sec (axis tick labels: 0, 202, 200)

• C = D = 2: time 0.12 sec (axis tick labels: 0, 328, 200)

• C = D = 1.5: time 0.09 sec (axis tick labels: 0, 400, 167)

Temperatures (450,000 points), search for 200-point patterns

The x-axes show the ranks of matches retrieved by the developed procedure (fast ranking), and the y-axes are the ranks assigned by a slow exhaustive search (perfect ranking).

[Figure: three plots of perfect ranking against fast ranking.]

• C = D = 5: time 1.18 sec (axis tick labels: 0, 202, 200)

• C = D = 2: time 0.27 sec (axis tick labels: 0, 400, 151)

• C = D = 1.5: time 0.14 sec (axis tick labels: 0, 400, 82)

Conclusions

Main results: Compression and indexing of time series by major minima and maxima.

Current work: Hierarchical indexing by importance levels of minima and maxima.

[Figure: a series whose minima and maxima are labeled with importance levels (1, 3, and 4).]