• BICRITERIA OPTIMIZATION OF ENERGY-EFFICIENT PLACEMENT AND ROUTING IN HETEROGENEOUS WIRELESS SENSOR NETWORKS
• TIME SERIES DATA MINING
Mustafa Gokce Baydogan
Singapore Management University, 5/10/2012
Mustafa Gokce Baydogan
PhD Candidate
School of Computing, Informatics and Decision Systems Engineering
Arizona State University (ASU) Tempe, AZ, USA
BICRITERIA OPTIMIZATION OF ENERGY-EFFICIENT PLACEMENT AND ROUTING IN HETEROGENEOUS WIRELESS SENSOR NETWORKS
Mustafa Gökçe Baydoğan
School of Computing, Informatics and Decision Systems Engineering
Arizona State University (ASU) Tempe, AZ, USA
Nur Evin Özdemirel, PhD
Department of Industrial Engineering
Middle East Technical University (METU) Ankara, Turkey
MOTIVATION
SOCIOECONOMIC
– Environmental monitoring
– Air, soil or water monitoring
– Habitat monitoring
– Seismic detection
– Military surveillance
– Battlefield monitoring
– Sniper localization
– Nuclear, biological or chemical attack detection
– Disaster area monitoring
RESEARCH
DESIGN ISSUES IN WSNs
Deployment
random vs deterministic; one-time vs iterative
Mobility
mobile vs immobile
Heterogeneity
homogeneous vs heterogeneous
Communication modality
radio vs light vs sound
Infrastructure
infrastructure vs ad hoc
Network Topology
single-hop vs star vs tree vs mesh
Römer and Mattern, 2004, The Design Space of Wireless Sensor Networks, IEEE Wireless Communications, 11:6, 54-61
PROBLEM CHARACTERISTICS
There are some events (targets) to be sensed in the monitoring area
Locate sensors (battery powered) at possible locations so that events are sensed (detected) with a given probability
Determine the rate of data flow between the sensors and the sink node (base station)
[Figure: monitoring area with sensors and sink; the two competing objectives are TOTAL SENSOR COST and NETWORK LIFETIME]
PROBLEM DEFINITION
OBJECTIVES
– Minimize total cost of sensors deployed
– Maximize lifetime of the network
DECISIONS
– Location of heterogeneous sensors
– Data routing
CONSTRAINTS
– Connectivity
– Node (sensor) and channel (link) capacity
– Coverage
– Battery power
LITERATURE
PROBLEM DEFINITION
CONNECTIVITY
A sensor of type k located at location i can communicate with a sensor of type k' located at location j if

dist_ij ≤ min(cr_k, cr_k')

where cr_k is the communication range of a type-k sensor and dist_ij is the distance between locations i and j.
[Figure: (a) two sensors within each other's communication range; (b) two sensors that cannot communicate]
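As a concrete illustration, the connectivity condition can be checked directly; the coordinates and ranges below are hypothetical values, not ones from the test problems:

```python
import math

def can_communicate(loc_i, loc_j, cr_i, cr_j):
    """Two sensors can communicate if the Euclidean distance between
    their locations is at most the smaller of the two communication ranges."""
    dist_ij = math.dist(loc_i, loc_j)
    return dist_ij <= min(cr_i, cr_j)

# Hypothetical example: two sensors with 3 m and 5 m communication ranges
print(can_communicate((0, 0), (0, 2.5), 3.0, 3.0))  # 2.5 m <= 3 m -> True
print(can_communicate((0, 0), (4, 0), 3.0, 5.0))    # 4 m > min(3, 5) -> False
```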
PROBLEM DEFINITION
COVERAGE
Coverage is expressed through the detection probability of a target at point p by a sensor of type k located at location i. The strength of the sensor signal decreases as distance increases†:

pr_ipk = e^(−β_k · dist_ip)   if dist_ip ≤ sr_k
pr_ipk = 0                    otherwise

Given the location decisions x_ik, the detection probability of a target at point p is

pr_p = 1 − ∏_{(i,k) ∈ B_p} (1 − pr_ipk)^(x_ik)

† Zou and Chakrabarty, 2005, A Distributed Coverage- and Connectivity-Centric Technique for Selecting Active Nodes in Wireless Sensor Networks, IEEE Transactions on Computers
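A minimal sketch of these two formulas; the β and sr values echo the type-1 sensor parameters from the test problems, but the deployment itself is made up:

```python
import math

def detection_prob(dist_ip, beta_k, sr_k):
    """pr_ipk: detection probability of a target at distance dist_ip by a
    type-k sensor; decays exponentially with distance and is zero beyond
    the sensing range sr_k."""
    return math.exp(-beta_k * dist_ip) if dist_ip <= sr_k else 0.0

def point_detection_prob(sensors, point):
    """pr_p = 1 - prod(1 - pr_ipk) over the located sensors that can
    sense point p (i.e., x_ik = 1 for every sensor in the list)."""
    miss = 1.0
    for loc, beta_k, sr_k in sensors:
        miss *= 1.0 - detection_prob(math.dist(loc, point), beta_k, sr_k)
    return 1.0 - miss

# Hypothetical deployment: two type-1 sensors (beta = 0.15, sr = 2 m)
sensors = [((0.0, 0.0), 0.15, 2.0), ((1.0, 0.0), 0.15, 2.0)]
print(point_detection_prob(sensors, (0.5, 0.0)))
```

Two nearby sensors jointly give a higher detection probability than either one alone, which is exactly what the product form captures.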
PROBLEM DEFINITION
ENERGY CONSUMPTION MODEL †
Sources of energy consumption in a sensor:
– Generating data: eg_k = γ
– Receiving data: er_ij = β
– Transmitting data: et_ij = δ + λ · dist_ij^m
where δ is a distance-independent constant term, λ is the coefficient of the distance-dependent term, dist_ij is the distance between the two locations, and m is the path loss index.
† J. Tang, B. Hao, and A. Sen, 2006, Relay node placement in large scale wireless sensor networks, Computer Communications, 29:4, 490-501
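The transmission term can be evaluated directly; δ, λ and m below are placeholder values, since the slides only give the per-type energy constants:

```python
def transmit_energy(dist_ij, delta, lam, m):
    """Energy per unit of data transmitted from i to j:
    et_ij = delta + lam * dist_ij ** m, where m is the path loss index."""
    return delta + lam * dist_ij ** m

# Hypothetical coefficients: delta = 1e-5, lam = 1e-6, path loss index m = 2
print(transmit_energy(3.0, 1e-5, 1e-6, 2))  # 1e-5 + 1e-6 * 9 = 1.9e-05
```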
PROBLEM FORMULATION
[The mathematical model was shown as images; its components are annotated below]
– Objective 1: total cost of sensors located
– Objective 2: lifetime of the network
– One sensor can be located at each location
– Data flow balance at a sensor
– All data is routed to the sink node
– Sensor capacity
– Channel (link) capacity
– Coverage
– Battery power
– Decision variables: sensor location and data flow
THE BICRITERIA PROBLEM
DOMINATION
With sensor cost z1 to be minimized and network lifetime z2 to be maximized, (z1', z2') dominates (z1'', z2'') if

z1' ≤ z1'' and z2' > z2'',  or  z1' < z1'' and z2' ≥ z2''
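In code, the domination test reads (cost z1 minimized, lifetime z2 maximized):

```python
def dominates(a, b):
    """(z1', z2') dominates (z1'', z2'') for min-cost/max-lifetime:
    no worse in both objectives and strictly better in at least one."""
    z1a, z2a = a
    z1b, z2b = b
    return (z1a <= z1b and z2a > z2b) or (z1a < z1b and z2a >= z2b)

print(dominates((10, 18), (12, 16)))  # cheaper and longer-lived -> True
print(dominates((10, 14), (12, 16)))  # cheaper but shorter-lived -> False
```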
[Figure: candidate solutions plotted as Network Lifetime vs. Sensor Cost]
FINDING PARETO OPTIMAL SOLUTIONS
A BICRITERIA PROBLEM
– Solve for z1 to find a lower bound on cost
– Solve for z2 s.t. cost ≤ z1 to find a lower bound on lifetime
– Solve for z2 to find an upper bound on lifetime
– Solve for z1 s.t. lifetime ≥ z2 to find an upper bound on cost
– For all integral cost values z1 between the bounds, solve for z2
[Figure: the bounds and the enumerated solutions on the Network Lifetime vs. Sensor Cost plane]
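The enumeration step can be sketched as follows; the single-objective solver is abstracted away as a callback (assumed to wrap the exact model), and the toy lookup table stands in for it here:

```python
def epsilon_constraint(solve_max_lifetime, cost_lb, cost_ub):
    """For every integral cost budget z1 in [cost_lb, cost_ub], solve
    'maximize lifetime s.t. total sensor cost <= z1', then filter the
    resulting (cost, lifetime) points down to the nondominated set."""
    points = []
    for z1 in range(cost_lb, cost_ub + 1):
        z2 = solve_max_lifetime(z1)
        if z2 is not None:
            points.append((z1, z2))
    # Keep only Pareto-optimal points (cost minimized, lifetime maximized)
    return [p for p in points
            if not any(q[0] <= p[0] and q[1] > p[1] or
                       q[0] < p[0] and q[1] >= p[1] for q in points)]

# Toy stand-in solver: lifetime grows with the cost budget, with plateaus
toy = {10: 8, 11: 8, 12: 10, 13: 10, 14: 14}
pareto = epsilon_constraint(lambda z1: toy.get(z1), 10, 14)
print(pareto)  # [(10, 8), (12, 10), (14, 14)]
```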
GENETIC ALGORITHM
Why evolutionary algorithms?
• Classical search and optimization methods
– find a single solution in every iteration
– need repetitive use of a single-objective optimization method
– rely on assumptions like linearity and continuity
• Evolutionary Algorithms
– use a population of solutions in every generation
– make no such assumptions
– find and maintain multiple good solutions
• Emphasize all nondominated solutions in a population equally
• Preserve a diverse set of multiple nondominated solutions
⇒ Near-optimal, uniformly distributed, well-extended set of solutions for MO problems
Nondominated sorting approach (Goldberg, 1989)
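A minimal sketch of nondominated sorting (two objectives, both minimized for simplicity; the GA itself ranks on cost, lifetime and overall constraint violation):

```python
def nondominated_sort(points):
    """Goldberg-style nondominated sorting for two minimized objectives:
    repeatedly peel off the current nondominated front and assign
    increasing ranks (front 0 is the best)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = {i for i in remaining
                 if not any(points[j][0] <= points[i][0] and
                            points[j][1] <= points[i][1] and
                            points[j] != points[i]
                            for j in remaining)}
        fronts.append(sorted(front))
        remaining -= front
    return fronts

# (1, 5) and (3, 2) are mutually nondominated, so they share front 0
print(nondominated_sort([(1, 5), (3, 2), (4, 4), (2, 6)]))  # [[0, 1], [2, 3]]
```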
GENETIC ALGORITHM
– Convergence to the Pareto optimal front
– A diverse set of solutions along the Pareto optimal front
GENETIC ALGORITHM
REPRESENTATION
A chromosome stores, for each possible location, the type of the sensor located there, e.g. [0 1 2 … 1 3 … 0 3 0] over locations 1, …, n (0 means no sensor).
Disadvantages
– Flow allocation is not stored
– Lifetime cannot be determined directly
– Finding feasible solutions after the mutation and crossover operators is very hard
Advantages
– The problem reduces to an LP with given sensor locations
– By solving the LP, the maximum lifetime and constraint violations can be determined
FITNESS
Based on nondominated sorting idea
considering three objectives
– Total sensor cost
– Network lifetime
– Overall constraint violation
• Connectivity
• Coverage
• Capacity violations (channel and sensor)
GENETIC ALGORITHM
INITIAL POPULATION GENERATION
– Two-phase approach
– Sensor location
– Location according to target coordinates
– Relay location
– Location according to sensor coordinates
MUTATION
– Repair and improve
– Repair coverage constraints
– Improve cost and lifetime objectives
– Repair connectivity constraints
– Improve cost objective
TEST PROBLEMS
Small problems
– Problems with 24 possible locations
– Problems with 40 possible locations
[Figure: PS24 (25 possible grid locations) and PS40 (41 possible grid locations) on the 10×10 monitoring area; BS marks the base station]
50 targets are dispersed across the monitoring area
Each target has a random coverage threshold uniformly distributed between 0.7 and 1
The rate of data generated for each target is a random integer between 1 Kbps and 3 Kbps
COMPUTATIONAL RESULTS
PERFORMANCE MEASURES
Proximity Indicator (PI): for each solution found, find the Pareto optimal solution with the closest normalized Tchebychev distance.
Reverse Proximity Indicator (RPI): for each Pareto optimal solution, find the solution with the closest normalized Tchebychev distance.
Hypervolume Indicator (HI): find the ratio of the area bounded by the nadir point that cannot be covered.
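A sketch of PI and RPI under one plausible reading (averaging the closest-point distances); the objective ranges used for normalization and the two point sets are hypothetical:

```python
def tchebychev(a, b, ranges):
    """Normalized Tchebychev distance: the largest coordinate-wise
    difference after scaling each objective by its range."""
    return max(abs(x - y) / r for x, y, r in zip(a, b, ranges))

def proximity(found, pareto, ranges):
    """PI: average, over the solutions found, of the distance to the
    closest Pareto-optimal solution (RPI swaps the two sets)."""
    return sum(min(tchebychev(f, p, ranges) for p in pareto)
               for f in found) / len(found)

pareto = [(10.0, 18.0), (12.0, 20.0)]   # hypothetical Pareto front
found = [(10.0, 17.0), (13.0, 20.0)]    # hypothetical GA solutions
ranges = (10.0, 10.0)                   # hypothetical objective ranges
pi = proximity(found, pareto, ranges)   # GA solutions -> Pareto front
rpi = proximity(pareto, found, ranges)  # Pareto front -> GA solutions
print(pi, rpi)
```

Small PI means every solution found is near some Pareto-optimal point; small RPI means every Pareto-optimal point is near some solution found, i.e. the front is well covered.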
COMPUTATIONAL RESULTS
Smaller Problems

Problem  Constraint  # of feasible  GA performance measures    Number of solutions                CPU time (s)
size     tightness   problems       RPI     PI      HI         ε-constraint  GA     GA=Exact    ε-constraint  GA
PS24     LC          30/30          0.0317  0.0220  0.0558     10.20         9.27   3.70        1118          110
PS24     TC          29/30          0.0761  0.0574  0.1734     7.47          6.57   1.93        70            110
PS40     LC(1)       10/10          0.0464  0.0489  0.1164     13.60         12.80  1.30        100088        798
PS40     LC(2)       20/20          -       -       -          13.00         14.20  -           85983         821
PS40     TC          30/30          0.0744  0.0780  0.1957     11.60         11.10  1.13        12865         797

(1) Results across 10 instances that are solved exactly by the ε-constraint approach.
(2) Results across 20 instances that are solved approximately by the ε-constraint approach.
COMPUTATIONAL RESULTS
[Figures: computational results shown graphically]
TEST PROBLEMS
We also introduce larger test problems
[Figure: larger test problems with 99 possible locations and with 111 possible locations on a 20×15 monitoring area]
COMPUTATIONAL RESULTS
Larger Problems
CONCLUSION, COMMENTS AND FURTHER RESEARCH
– The GA provides reasonable solution quality with better solution times
⇒ By representing the area with more grid points (a finer approximation of the continuous space), we can obtain better solutions than the exact approach, even with better solution times
Future research
– Modification of the ε-constraint approach
– Use of sensitivity analysis results (promising)
– Incorporating the decision maker's preferences
– Different objectives such as minimization of total delay, total hop count or average path length
– Special network requirements such as K-coverage or K-connectivity
THANK YOU...
QUESTIONS AND COMMENTS?
Supplemental material
OUTLINE
– MOTIVATION
– PROBLEM DEFINITION
– GENETIC ALGORITHM
– COMPUTATIONAL RESULTS
– CONCLUSION
TEST PROBLEMS
Small problems
– Problems with 24 possible locations
– Problems with 40 possible locations
EXACT SOLUTION
[Figure: PS24 (25 possible grid locations) and PS40 (41 possible grid locations); BS marks the base station]
50 targets are dispersed across the monitoring area
Each target has a random coverage threshold uniformly distributed between 0.7 and 1
The rate of data generated for each target is a random integer between 1 Kbps and 3 Kbps
Parameter  Sensor type (k=1)   Sensor type (k=2)      Relay (k=3)
scap_k     100 Kbps            200 Kbps               150 Kbps
sr_k       2 m                 3 m                    0 m
cr_k       3 m                 5 m                    3 m
c_k        1                   2                      1
β_k        0.15                0.1                    -
e_k        10^-5 EnergyUnits   1.5×10^-5 EnergyUnits  10^-5 EnergyUnits

Parameter  Sensor type (k=1)   Sensor type (k=2)      Relay (k=3)
scap_k     40 Kbps             80 Kbps                40 Kbps
sr_k       2 m                 3 m                    0 m
cr_k       3 m                 5 m                    3 m
c_k        1                   2                      1
β_k        0.15                0.1                    -
e_k        10^-5 EnergyUnits   1.5×10^-5 EnergyUnits  10^-5 EnergyUnits
GENETIC ALGORITHM
Why evolutionary algorithms?
• Classical search and optimization methods
– find a single solution in every iteration
– need repetitive use of a single-objective optimization method
– rely on assumptions like linearity and continuity
• Evolutionary Algorithms
– use a population of solutions in every generation
– make no such assumptions
– eliminate the need for parameters (like weight, ε or target vectors)
– find and maintain multiple good solutions
• Emphasize all nondominated solutions in a population equally
• Preserve a diverse set of multiple nondominated solutions
⇒ Near-optimal, uniformly distributed, well-extended set of solutions for MO problems
COMPUTATIONAL RESULTS
[Figures: computational results shown graphically]
CONCLUSION
– RPI and HI worsen as the problem size increases from 24 to 40 when capacity
constraints are loose
– TC instances are harder to solve for the GA compared to LC instances, whereas
they are easier for the ε-constraint approach.
– Performance measures for tight capacity are about twice as large as those for
loose capacity.
– The problem size has less effect on the performance measures when the capacity
constraints are tight.
– When the capacity constraints are loose, the GA solves problems of size 24 in one
tenth of the ε-constraint CPU times. For problems of size 40, GA CPU time is about
100 times shorter than ε-constraint time.
– For the tight capacity case, GA CPU times are slightly longer than ε-constraint
times with 24 possible locations, but they are 15 times shorter with 40 possible
locations.
– For problems with 99 and 111 possible locations, the GA converges to a solution in
about 160 minutes.
Time series data mining
Mustafa Gokce Baydogan, George Runger
5/10/2012
Data mining and Operations Research
What is Data Mining?
– Extracting meaningful, previously unknown patterns or knowledge from large databases
– The knowledge discovery process:
Define Objective (business/scientific objective, data mining objective)
→ Prepare Data (data cleaning, data selection, attribute selection)
→ Mine Knowledge (visualization, classification, association rule discovery, clustering)
→ Interpret Results (predictive models, structural insights)
Interdisciplinary Field
Data mining lies at the intersection of Statistics, Databases, Optimization and Machine Learning.
Time series data mining
What is a time series?
– A (numeric) time series is a sequence of observations of a numeric property over time, e.g. -1.25, -1.00, 0.01, 0.01, 0.05, …, 5.45, 0.00, …
Time series data mining
Motivations
– Time series are everywhere (ECG, heartbeat, stock prices)
– Much of the information (data) produced in a variety of areas is time series
– About 50% of all newspaper graphics are time series
Images from E. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.
Time series data mining
Motivations
– Other types of data can be converted to time series.
– Everything is about the representation.
– Example: recognizing words. A word can be represented by two time series created by moving over and under the word; an example is the word "Alexandria" from the dataset of word profiles for George Washington's manuscripts.
Images from E. Keogh. A quick tour of the datasets for VLDB 2008. In VLDB, 2008.
Time series data mining
Examples
– Recognizing trees from leaf images
– Understanding what is related to the difficulty of a certain task (pupil dilation, EEG, emotions)
Images from E. Keogh. A decade of progress in indexing and mining large time series databases. In VLDB, page 1268, 2006.
Time Series Data Mining Tasks
– Clustering
– Classification
– Query by Content
– Rule Discovery (e.g., a rule with support s = 0.5 and confidence c = 0.3)
– Motif Discovery
– Anomaly Detection
– Visualization
Time series classification
– A supervised learning problem aimed at labeling temporally structured univariate (or multivariate) sequences of fixed (or variable) length.
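As a point of reference, a common baseline for this problem is one-nearest-neighbor classification over whole series with Euclidean distance; this is a generic baseline, not the method proposed in this talk, and the tiny labeled set below is made up:

```python
def one_nn_label(query, train):
    """1-nearest-neighbor time series classification with Euclidean
    distance: return the label of the closest training series."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(train, key=lambda item: dist(query, item[0]))[1]

# Hypothetical training set of length-4 series with shape labels
train = [([0.0, 1.0, 2.0, 1.0], "rise-fall"),
         ([2.0, 1.0, 0.0, 1.0], "fall-rise")]
print(one_nn_label([0.1, 0.9, 2.1, 1.2], train))  # rise-fall
```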
Datasets
– Datasets are from different domains: word recognition, medicine, energy, biology, face recognition, image and video classification, robotics, gesture recognition, astronomy
A Bag-of-Features Framework to classify time series (TSBF)
– Bag of features: a common method used for image classification
– also referred to as bag of words in document analysis and bag of frames in audio and speech recognition
– Accurate even with simple shape-based features
– TSBF provides a framework for time series classification; alternative algorithms for the following tasks may provide better solutions:
– Local feature extraction
– Codebook generation
– Classification
The details and the code of TSBF and the datasets are provided in http://www.mustafabaydogan.com/research/time-series-classification.html
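A toy sketch of the bag-of-features pipeline using simple shape features (window mean and start-to-end slope); the two-word codebook and the series are made up, and the actual TSBF algorithm differs in its feature set, codebook generation and classifier (see the link above):

```python
def subsequence_features(series, win):
    """Extract simple shape features (mean, slope) from sliding windows,
    a minimal stand-in for richer local feature extraction."""
    feats = []
    for s in range(len(series) - win + 1):
        seg = series[s:s + win]
        feats.append((sum(seg) / win, (seg[-1] - seg[0]) / (win - 1)))
    return feats

def histogram(feats, centers):
    """Assign each local feature to its nearest codeword and count the
    assignments; the histogram is the series-level representation that a
    standard classifier would then consume."""
    counts = [0] * len(centers)
    for f in feats:
        nearest = min(range(len(centers)),
                      key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(f, centers[c])))
        counts[nearest] += 1
    return counts

# Hypothetical 2-word codebook: 'flat' vs 'rising' local behavior
centers = [(0.0, 0.0), (1.0, 1.0)]
series = [0.0, 0.0, 0.1, 1.0, 2.0, 3.0]
print(histogram(subsequence_features(series, 3), centers))  # [2, 2]
```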
Supervised Time Series Pattern Discovery through Local Importance (TS-PD)
– TS-PD aims at finding patterns for interpretability
– TS-PD identifies regions of interest
– Provides a visualization tool for understanding underlying relations
– A fast approach to detect the local information related to the classification
The details and the code of TS-PD and the datasets will be provided soon in http://www.mustafabaydogan.com/research/time-series-pattern-discovery.html
TS-PD Example
– Extending TS-PD to multivariate time series classification
– Gesture recognition task [12]
– Acceleration of the hand on the x, y and z axes
– Classify gestures (8 different types of gestures)
Using DM as a tool
– Decision makers are interested in knowledge that permits them to do their jobs better by taking specific actions in response to the newly discovered knowledge.
– Usually a data mining algorithm is executed first, and then profitable actions are determined based on the results from the data mining.
– Example: market basket analysis; association rule mining to decide the location of items in the supermarket.
Using DM as a tool
Market-Basket transactions

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of Association Rules
{Diaper} → {Beer}
{Milk, Bread} → {Eggs, Coke}
{Beer, Bread} → {Milk}

Put diapers and beer on the same shelf???
Implication means co-occurrence, not causality!
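Support and confidence for these five transactions can be computed directly, using the usual definitions and the {Diaper} → {Beer} rule as the example:

```python
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    """Co-occurrence strength of the rule lhs -> rhs (not causality)."""
    return support(lhs | rhs) / support(lhs)

print(support({"Diaper", "Beer"}))       # 3/5 = 0.6
print(confidence({"Diaper"}, {"Beer"}))  # 3/4 = 0.75
```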
Using DM as a tool
– Root cause analysis in networks
– Supply chain networks
– Identify corrupt nodes and their relations
– Why are my deliveries late?
Using DM as a tool
– Transaction data, with several factors affecting the network:

id  Stage 1  Stage 2  Stage 3  Stage 4  Weather status at stage 1  …  Road status between stages 1 and 2  …  Transportation vehicle between stages 1 and 2  Weight  Delayed?
1   S2       P1       D2       C2       Sunny                      …  good                                …  Plane                                          30 lbs  Yes
2   S5       P3       D4       C1       Rainy                      …  bad                                 …  Truck                                          40 lbs  No
3   .        .        .        .        .                          .  .                                   .  .                                              .       .
…   …        …        …        …        …                          …  …                                   …  …                                              …       …
N   .        .        .        .        .                          .  .                                   .  .                                              .       .
Using DM as a tool
– DM is required since:
– The data is high dimensional
– There may be missing values in the data
– Not all indicators are numerical
– Identify interactions between the network nodes to find out the causes of delay: what decisions are causing the delay?
– Take actions:
– Modification of the optimization algorithm
– Introduce constraints based on the learning (data mining result)
– Simulation to generate more data, with further analysis of the simulated data
Future directions
– Reinforcement learning
– The decision-maker recognizes her state within the environment and reacts by initiating an action.
– Consequently she obtains a reward signal and enters another state.
– Example application: dynamic pricing
Future directions
– The mechanism that generates reward signals and introduces new states is referred to as the dynamics of the environment.
– The agent is unfamiliar with the dynamics of the environment and therefore initially cannot correctly predict the outcome of its actions.
– As the agent interacts with the environment and observes the actual consequences of its decisions, it can gradually adapt its behavior accordingly.
Future directions
– Dynamic programming is widely used to solve this problem; however, the environment can be highly unpredictable.
– Modeling the environment efficiently is important† because it provides knowledge about the domain that produced the data.
– Revisiting the dynamic pricing problem: in a game-theoretic setting all players are assumed to be rational, but is that true? Predicting an opponent's proposed price in advance reduces uncertainty in the environment.
– Another example: if we know that a certain pattern observed in the stock price led to high profit under certain conditions in the past, this may be important in taking actions.
† L. Busoniu, R. Babuska, and B. De Schutter, "A comprehensive survey of multi-agent reinforcement learning," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 38, no. 2, pp. 156-172, Mar. 2008.
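The agent-environment loop described above can be sketched as tabular Q-learning; the two-state dynamic-pricing environment below is purely illustrative, not a model from the talk:

```python
import random

def q_learning(env_step, states, actions, episodes=500,
               alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning: act (epsilon-greedy), observe reward and next
    state from the unknown environment dynamics, and update the estimate
    toward reward + gamma * max_b Q(next_state, b)."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions}
    for _ in range(episodes):
        s = rng.choice(states)
        for _ in range(20):  # bounded episode length
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda b: Q[(s, b)])
            r, s2 = env_step(s, a, rng)
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in actions)
                                  - Q[(s, a)])
            s = s2
    return Q

# Toy dynamic-pricing stand-in: demand is higher in the "low" state and a
# lower price sells more often (purely illustrative dynamics)
def env_step(state, price, rng):
    p_sale = {"low": 0.9, "high": 0.5}[state] - 0.05 * price
    sold = rng.random() < p_sale
    reward = price if sold else 0.0
    return reward, ("high" if sold else "low")

Q = q_learning(env_step, ["low", "high"], [2, 5, 8])
best = {s: max([2, 5, 8], key=lambda a: Q[(s, a)]) for s in ["low", "high"]}
print(best)
```

After enough interaction the learned Q-values encode a pricing policy per state, without the agent ever being given the sale probabilities.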
Thanks for your patience!
Questions and comments?
Supplemental material
DM and OR
– Using OR for DM: optimization algorithms used for DM
– Data visualization
– Attribute selection
– Classification
– Unsupervised learning
– Using DM as a tool for decision making: data mining can complement traditional OR methods in many areas
– Discover patterns for actionable knowledge
– Example applications: supply chain management (e.g., finding corrupt nodes)
Using OR for DM
Data Visualization
– Visualizing the data is important in any data mining project
– Generally difficult because the data is high-dimensional, i.e., hundreds or thousands of attributes (variables)
– How can we best visualize such data in 2 or 3 dimensions?
– Traditional techniques include multidimensional scaling, which uses nonlinear optimization
Optimization Formulation
– Combinatorial optimization formulation by Abbiw-Jackson, Golden, Raghavan, and Wasil (2004)
– Map a set M of m points from R^r to R^q, q = 2, 3
– Approximate the q-dimensional space by a lattice N

min Σ_{i∈M} Σ_{j∈M} Σ_{k∈N} Σ_{l∈N} F[d_original(i,j), d_new(k,l)] · x_ik · x_jl
s.t. Σ_{k∈N} x_ik = 1, ∀i ∈ M
x_ik ∈ {0,1}

where d_original(i,j) is a distance measure in R^r, d_new(k,l) is a distance measure in R^q, and F is a function such as least squares or the Sammon map.
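For a fixed assignment x, the objective is straightforward to evaluate; the sketch below uses least squares for F, with hypothetical points and lattice:

```python
import math

def mds_objective(points, lattice, assign):
    """Least-squares instance of the combinatorial MDS objective:
    sum over point pairs (i, j) of (d_original(i,j) - d_new(k,l))^2,
    where k = assign[i] and l = assign[j] are the lattice cells with
    x_ik = x_jl = 1."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(n):
            d_orig = math.dist(points[i], points[j])
            d_new = math.dist(lattice[assign[i]], lattice[assign[j]])
            total += (d_orig - d_new) ** 2
    return total

# Hypothetical example: three 4-d points mapped onto a 2-d unit lattice
points = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 3, 4, 0)]
lattice = [(0, 0), (1, 0), (0, 1), (1, 1)]
good = mds_objective(points, lattice, [0, 1, 2])
bad = mds_objective(points, lattice, [0, 0, 0])  # all points on one cell
print(good < bad)  # the spread-out assignment fits the distances better
```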
Optimization Formulation
– This is a Quadratic Assignment Problem (QAP)
– Not possible to solve exactly for large-scale problems
– A local search procedure was proposed
Using OR for DM
Data Clustering
– Identify natural clusters or groupings of data instances
– There are many possible sets of clusters
– What makes a set of clusters good?
– Minimize the distance within clusters
– Maximize the distance between clusters
Optimization Formulation
k-medoid clustering
– Select k points to be the cluster centers
– Assign the other points to the clusters so that the within-cluster distance (sum of interpoint distances) is minimized
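A greedy PAM-style sketch of this idea: swap in any non-medoid point that lowers the total within-cluster distance, and repeat until no swap helps (a local search sketch, not an exact method):

```python
def kmedoid_cost(dist, medoids, points):
    """Total within-cluster distance when each point is assigned to its
    nearest medoid -- the quantity k-medoid clustering minimizes."""
    return sum(min(dist(p, m) for m in medoids) for p in points)

def kmedoids(points, k, dist):
    """Greedy k-medoid selection: start from the first k points and swap
    in any non-medoid that lowers the total cost, until convergence."""
    medoids = list(points[:k])
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for p in points:
                if p in medoids:
                    continue
                trial = medoids[:i] + [p] + medoids[i + 1:]
                if kmedoid_cost(dist, trial, points) < kmedoid_cost(dist, medoids, points):
                    medoids, improved = trial, True
    return medoids

# 1-d toy data with two obvious clusters; each cluster's middle point wins
d1 = lambda a, b: abs(a - b)
print(sorted(kmedoids([1, 2, 3, 10, 11, 12], 2, d1)))  # [2, 11]
```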