BIRTE-13-Kawashima

A Multiple Query Optimization Scheme for Change Point Detection [POSITION PAPER] Masahiro Oke, Hideyuki Kawashima University of Tsukuba, Japan

Transcript of BIRTE-13-Kawashima

Page 1: BIRTE-13-Kawashima

A Multiple Query Optimization Scheme for Change Point Detection

[POSITION PAPER]

Masahiro Oke, Hideyuki Kawashima
University of Tsukuba, Japan

Page 2: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics (philosophy & system)
• CPD (Change Point Detection)
• Proposal: MQO for multiple CPDs
• Experiment
• Summary

Page 3: BIRTE-13-Kawashima

SELECT COUNT(*)
FROM eth0 [TIME 1 MIN]
WHERE port = 80

How many packets arrive on port 80 in one minute?

SPS

Relation eth0

・ Destination IP
・ Source IP
・ Destination Port
・ Source Port
・ Interface (e.g. eth0)
・ Length
・ Version (e.g. IPv4)
・ Payload

Relational schema


Quick Review: Data Stream Management System (DSMS)

Q1

Page 4: BIRTE-13-Kawashima

• SQL is translated into an operator tree.
• On arrival of data, the tree is evaluated.
• Operators are based on relational databases
– w (Window): cuts relations off from a stream
– σ (Selection): filter
– α (Aggregation): such as AVG, MIN, MAX

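[Figure: SPS architecture. Users/apps submit a query and receive the result. Data enters the SPS through the input adapter, is evaluated by the operator tree w → σ → α, and leaves through the output adapter.]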

SELECT COUNT(*)
FROM eth0 [TIME 1 MIN]
WHERE port = 80
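The operator tree for this query can be sketched in a few lines of Python. This is only an illustration of the w → σ → α idea, not the SPS implementation. The `Packet` class, the `CountPortInWindow` class, and the use of a time-based sliding window are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: float  # arrival time in seconds (hypothetical field)
    port: int         # destination port (hypothetical field)


class CountPortInWindow:
    """Evaluates w -> sigma -> alpha each time a packet arrives."""

    def __init__(self, window_sec=60.0, port=80):
        self.window_sec = window_sec
        self.port = port
        self.window = deque()

    def on_arrival(self, pkt):
        # w (Window): keep only packets from the last window_sec seconds.
        self.window.append(pkt)
        while self.window[0].timestamp <= pkt.timestamp - self.window_sec:
            self.window.popleft()
        # sigma (Selection): keep packets whose port matches.
        selected = [p for p in self.window if p.port == self.port]
        # alpha (Aggregation): COUNT(*).
        return len(selected)


query = CountPortInWindow(window_sec=60.0, port=80)
arrivals = [Packet(0, 80), Packet(10, 22), Packet(30, 80), Packet(70, 80)]
counts = [query.on_arrival(p) for p in arrivals]
```

Each arrival triggers one evaluation of the tree, which mirrors the "on arrival of data, the tree is evaluated" behaviour described on the slide.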

Page 5: BIRTE-13-Kawashima

Our Target Application: Malware Detection

• Real datasets
– Real trace logs of malware activities
• NICTER
– Keeps about 160,000 unused IP addresses (darknet)
• Packets to the darknet are considered attacks.
– Uses CPD (Change Point Detection) [1] to detect attacks such as DoS (denial of service).

[1] Daisuke Inoue, K. Yoshioka, M. Eto, Masaya Yamagata, Eisuke Nishino, Jun-ichi Takeuchi, Kazuya Ohkouchi, Koji Nakao: An Incident Analysis System NICTER and Its Analysis Engines Based on Data Mining Techniques. ICONIP (1) 2008: 579-586
[2] J. Takeuchi and K. Yamanishi, “A Unifying Framework for Detecting Outliers and Change Points from Time Series,” IEEE TKDE, pp.482-492, 2006.

Page 6: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics
• CPD (Change Point Detection)
• Proposal: MQO for multiple CPDs
• Experiment
• Summary

Page 7: BIRTE-13-Kawashima

Discussion

[Figure: Relational data processing → ? → Attack Detection]

• Aggregates are good
• Outlier detection?
• Classification?
• Clustering?
• What are key techniques?
• Everything is impossible

CPD (AR) / LOF / LDA / FIM
Yet Another DSMS: Falcon

Page 8: BIRTE-13-Kawashima

Example Query on Falcon (1/2)

• #Access for each port? [1]
• Group by aggregates

SELECT dst_port, COUNT(dst_port)
FROM pkt [1 sec]
GROUP BY dst_port

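Schema of pkt: src_ip, dst_ip, src_port, dst_port, seq_no, packet_size, timestamp, protocol, ack, fin, syn, urg, push, reset, content

[Figure: packets with dst_port 22, 80, 15, 80, 22 arrive from the NIC within 1 second. The operator g-pkt outputs 22: 2, 80: 2, 15: 1.]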

[1] “Enabling Real Time Data Analysis”, Divesh Srivastava (AT&T Labs), et al. Keynote talk, VLDB 2010. (A similar query is found on p. 15 of the talk slides.)
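As a rough illustration of the group-by aggregate on this slide, the sketch below counts packets per dst_port in 1-second windows. The function name `count_per_port`, the timestamps, and the choice of non-overlapping (tumbling) windows are assumptions for the example. The port sequence is the one shown in the figure.

```python
from collections import Counter, defaultdict


def count_per_port(packets, window_sec=1.0):
    """packets: iterable of (timestamp_sec, dst_port) pairs.

    Returns {window_index: Counter({dst_port: count})}, i.e. the result of
    SELECT dst_port, COUNT(dst_port) ... GROUP BY dst_port for each window.
    """
    windows = defaultdict(Counter)
    for timestamp, dst_port in packets:
        windows[int(timestamp // window_sec)][dst_port] += 1
    return dict(windows)


# Five packets inside the first second; timestamps are made up.
packets = [(0.1, 22), (0.3, 80), (0.5, 15), (0.7, 80), (0.9, 22)]
result = count_per_port(packets)
```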

Page 9: BIRTE-13-Kawashima

Example Query on Falcon (2/2)

• Access on each port? [2]
• Outlier score for each port/sec

SELECT dst_port, cpd(dst_port)
FROM pkt [1 sec]
GROUP BY dst_port

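Schema of pkt: src_ip, dst_ip, src_port, dst_port, seq_no, packet_size, timestamp, protocol, ack, fin, syn, urg, push, reset, content

[Figure: packets with dst_port 22, 80, 15, 80, 22 arrive from the NIC within 1 second. The operator g-cpd-pkt outputs the scores 22: 1.33, 80: 2.44, 15: 1.22.]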

[2] “An Incident Analysis System NICTER and Its Analysis Engines Based on Data Mining Techniques”, Daisuke Inoue (NICT), et al. ICONIP (1) 2008: 579-586

Page 10: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics
• CPD (Change Point Detection)
• Proposal: MQO for multiple CPDs
• Experiment
• Summary

Page 11: BIRTE-13-Kawashima

Change Point Detection (CPD)

• Outlier detection technique over time series data
– Two-stage learning based on an autoregressive (AR) model
– Apps: traffic analysis, stock price analysis

[Figure: time series plotted over Time, annotated "Apply CPD!"]

[1] Jun-ichi Takeuchi and Kenji Yamanishi, “A Unifying Framework for Detecting Outliers and Change Points from Time Series,” IEEE Transactions on Knowledge and Data Engineering, pp.482-492, 2006.


Page 12: BIRTE-13-Kawashima

Dividing CPD into 4 operators

[Figure: CPD as a pipeline of 4 operators (display of the outlier score is omitted)]

Input time series data x_t
→ 1st stage learning (provides a probability)
→ Compute outlier score and moving average score
→ 2nd stage learning (provides a probability)
→ Compute outlier score and moving average score
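The four-operator decomposition can be sketched as a chain of small stateful objects. The code below is a simplified illustration, not the authors' C++ implementation. The learner is a discounting AR model in the spirit of the cited two-stage method, with the prediction and the score both computed from the model state before the update. For brevity the score computation is folded into the learner, so the chain is learner, moving average, learner, moving average. All class and function names, the Levinson-Durbin solver, and the numeric guards are assumptions made for this example.

```python
import math
from collections import deque


def levinson(c, k):
    """Solve the Yule-Walker equations for AR(k) from autocovariances c[0..k]."""
    a, err = [0.0] * k, c[0]
    for m in range(k):
        if err <= 1e-12:
            break
        ref = (c[m + 1] - sum(a[i] * c[m - i] for i in range(m))) / err
        if abs(ref) >= 1.0:  # estimate not positive definite: keep lower order
            break
        a = [a[i] - ref * a[m - 1 - i] for i in range(m)] + [ref] + [0.0] * (k - m - 1)
        err *= 1.0 - ref * ref
    return a


class DiscountingAR:
    """Learning operator with discounting parameter r and AR degree k."""

    def __init__(self, r, k):
        self.r, self.k = r, k
        self.mu, self.sigma2 = 0.0, 1.0
        self.c = [0.0] * (k + 1)
        self.coef = [0.0] * k
        self.hist = deque(maxlen=k)  # most recent value first

    def update(self, x):
        # Logarithmic loss of x under the model learned so far.
        pred = self.mu + sum(a * (h - self.mu) for a, h in zip(self.coef, self.hist))
        score = 0.5 * math.log(2 * math.pi * self.sigma2) + (x - pred) ** 2 / (2 * self.sigma2)
        # Discounted update of mean, autocovariances, coefficients, variance.
        r = self.r
        self.mu = (1 - r) * self.mu + r * x
        for j, past in enumerate([x] + list(self.hist)):
            self.c[j] = (1 - r) * self.c[j] + r * (x - self.mu) * (past - self.mu)
        self.coef = levinson(self.c, self.k)
        self.sigma2 = max((1 - r) * self.sigma2 + r * (x - pred) ** 2, 1e-12)
        self.hist.appendleft(x)
        return score


class MovingAverage:
    """Smoothing operator with window length T."""

    def __init__(self, T):
        self.buf = deque(maxlen=T)

    def update(self, s):
        self.buf.append(s)
        return sum(self.buf) / len(self.buf)


def change_scores(xs, r1, k1, T1, r2, k2, T2):
    """Chain the operators; each one consumes the previous operator's output."""
    ops = [DiscountingAR(r1, k1), MovingAverage(T1), DiscountingAR(r2, k2), MovingAverage(T2)]
    out = []
    for x in xs:
        for op in ops:
            x = op.update(x)
        out.append(x)
    return out
```

Because each operator only sees the output of its predecessor, two queries whose leading operators have identical parameters and identical input produce identical intermediate values, which is what makes operator sharing possible.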

Page 13: BIRTE-13-Kawashima

Problem of CPD: Parameter setting

• CPD requires 6 parameters (α_R, α_K, α_T, β_R, β_K, β_T)
• Appropriate parameter setting is necessary … but it is difficult

[Figure: two plots, one using an appropriate parameter set and one using an inappropriate parameter set. Blue: # accesses, Red: CPD score]

Page 14: BIRTE-13-Kawashima

A simple way for parameter tuning:
--- Multiple CPDs with different parameter sets ---

[Figure: the input packet stream feeds several CPD pipelines in parallel, one per parameter set (labels: Parameter set 0, 2, 3, 4 and k). Each pipeline is 1st stage learning → Compute outlier score → 2nd stage learning → Compute outlier score. The outputs go to result aggregation (e.g. majority voting).]
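The result-aggregation step can be illustrated with a small majority-voting sketch. The slide names majority voting only as an example. The threshold on the CPD score and the function names below are assumptions made for illustration.

```python
def majority_vote(scores, threshold):
    """scores: one CPD score per parameter set for the same time step.

    Returns True when more than half of the parameter sets exceed the
    (hypothetical) threshold, i.e. a majority reports a change point.
    """
    votes = sum(1 for s in scores if s > threshold)
    return 2 * votes > len(scores)


def aggregate(score_series, threshold):
    """score_series[i][t] is the score of parameter set i at time step t."""
    return [majority_vote(column, threshold) for column in zip(*score_series)]


# Three parameter sets, two time steps; the scores are made up.
alarms = aggregate([[0.1, 3.0], [0.2, 2.5], [4.0, 0.3]], threshold=1.0)
```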

Page 15: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics
• CPD (Change Point Detection)
• Proposal: MQO for multiple CPDs
• Experiment
• Summary

Page 16: BIRTE-13-Kawashima

Q: When can we share operators? (1/2)
Preparation: 6 (= 3 + 3) parameters

[Figure: Input packet → 1st stage learning → Compute outlier score → 2nd stage learning → Compute outlier score]

1st stage:
α_R: Discounting parameter
α_K: Degree of autoregressive model
α_T: Time for moving average

2nd stage:
β_R: Discounting parameter
β_K: Degree of autoregressive model
β_T: Time for moving average

Page 17: BIRTE-13-Kawashima

Q: When can we share operators? (2/2)
-- Branch or merge --

[Figure: three operator graphs built from "1st stage learning", "Compute outlier score" and "2nd stage learning" operators, labeled Branch only, Branch & merge, and Merge only]

Both the parameters (α..) and the input values (arc) must be the same. Merging is NOT allowed in this scheme, since different parents may produce different output values.

Page 18: BIRTE-13-Kawashima

The 4 sharing patterns
-- Only branch cases, not merge --

[Figure: four operator trees, labeled Pattern 1 to Pattern 4, in which queries branch after a shared prefix of operators]

NOTE: “1st stage learning” and “2nd stage learning” can be divided into sub-operators, and some of the sub-operators can also be shared. The sharing patterns are described in the paper.

Pattern 1: Sharing CPD-1 if α_R and α_K are the same.
Pattern 2: Sharing CPD-1, 2 if α_R, α_K and α_T are the same.
Pattern 3: Sharing CPD-1, 2, 3 if α_R, α_K, α_T, β_R and β_K are the same.
Pattern 4: Sharing CPD-1, 2, 3, 4 if α_R, α_K, α_T, β_R, β_K and β_T are the same.
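The four patterns amount to a prefix comparison over the six parameters. The sketch below returns how many leading CPD operators two queries on the same input can share. The tuple layout (α_R, α_K, α_T, β_R, β_K, β_T) and the function name are assumptions made for illustration.

```python
# Number of leading parameters that determine CPD-1, CPD-2, CPD-3 and CPD-4,
# following Patterns 1 to 4 on this slide.
PREFIX_LEN = (2, 3, 5, 6)


def shared_operators(p, q):
    """p, q: tuples (alpha_r, alpha_k, alpha_t, beta_r, beta_k, beta_t).

    Returns the number of leading CPD operators (0..4) that can be shared.
    Once one operator differs, nothing downstream is shared, because
    merging is not allowed.
    """
    shared = 0
    for length in PREFIX_LEN:
        if p[:length] != q[:length]:
            break
        shared += 1
    return shared


base = (0.02, 2, 5, 0.02, 3, 5)
pattern = shared_operators(base, (0.02, 2, 5, 0.02, 4, 5))  # differs only in beta_k
```

A return value of n corresponds to Pattern n, and 0 means the two queries share nothing.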

Page 19: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics
• CPD (Change Point Detection)
• MQO for multiple CPDs
• Experiment
• Summary

Page 20: BIRTE-13-Kawashima

Experiment

1. Measuring the reduction ratio provided by MQO
– 3 kinds of parameter sets
• Grid style
• Random
• Uniform (just to see the ideal case)
– Implemented a system to measure the reduction ratio.
– Measured the reduction ratio.

2. Measuring execution time when sharing 1st stage learning
– Implemented CPD in C++ with the Eigen library (for matrix manipulation).
– Measured execution time using the CPD.

[Figure: grid style]

Page 21: BIRTE-13-Kawashima

Reduction ratio provided by sharing
Result

Parameter Pattern      # Queries   Naïve (# Operators)   Sharing (# Operators)   Performance Gain (Reduction Ratio)
Uniform                       64                   384                       6   98.4 %
Random (2 values)             64                   384                     101   73.7 %
Random (10 values)            64                   384                     315   18.0 %
Random (100 values)           64                   384                     366    4.7 %
Grid Style (N = 2)            64                   384                     126   67.2 %
Grid Style (N = 4)          4096                 24576                    5460   77.7 %
Grid Style (N = 8)        262144               1572864                  299592   80.1 %

Sharing is effective for grid style, while ineffective for random style.

Using grid-style parameter sets for CPD and randomly sampling the results for aggregation is effective.
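The operator counts for the uniform and grid-style rows can be reproduced with a short prefix-counting sketch. It assumes that every query is a chain of six operators and that the i-th operator is determined by the first i parameters. That assumption is inferred from 384 = 6 × 64 in the table and is not stated on the slide. The integer parameter values and function names are placeholders.

```python
from itertools import product


def count_operators(param_sets):
    """Return (naive, shared) operator counts for a list of parameter tuples.

    Naive: every query runs its own chain of operators.
    Shared: operators with an identical parameter prefix are run once.
    """
    naive = sum(len(p) for p in param_sets)
    prefixes = {p[:i] for p in param_sets for i in range(1, len(p) + 1)}
    return naive, len(prefixes)


def grid(n, dims=6):
    """Grid style: every combination of n candidate values per parameter."""
    return list(product(range(n), repeat=dims))


def reduction_ratio(naive, shared):
    return 1.0 - shared / naive


naive2, shared2 = count_operators(grid(2))         # 64 queries
naive4, shared4 = count_operators(grid(4))         # 4096 queries
uniform = count_operators([(0, 0, 0, 0, 0, 0)] * 64)
```

Under this assumption the shared count for a grid with N values per parameter is N + N^2 + ... + N^6, which explains why grid-style parameter sets benefit from sharing while random ones rarely produce common prefixes.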

Page 22: BIRTE-13-Kawashima

Experiment

1. Measuring the reduction ratio provided by MQO
– Change parameter sets
• Grid style
• Random
• Uniform (just to see the ideal case)
– Implemented a system to measure the reduction ratio.
– Measured the reduction ratio.

2. Measuring execution time when sharing ONLY 1st stage learning
– Implemented CPD in C++ with the Eigen library (for matrix manipulation).
– Measured execution time using the CPD.

[Figure: grid style]

Page 23: BIRTE-13-Kawashima

Execution time when sharing 1st stage learning
Result

      Parameters                          Execution Time (second)   Performance Gain (times)
ID    α_R    α_K   α_T   β_R    β_K   β_T   Naive    Shared CPD-1     Shared CPD-1
 1    .02      2     5   .02      3     5    2.92            1.77             1.65
 2    .02      4     5   .02      3     5    3.65            1.77             2.06
 3    .02      2     5   .02      4     5    3.29            2.17             1.52
 4    .005     2     5   .02      4     5    2.91            1.76             1.65
 5    .02      2     5   .005     3     5    2.89            1.77             1.64
 6    .02      2     7   .02      7     5    3.00            1.87             1.60
 7    .02      1     5   .02      1     5    1.96            1.11             1.78
 8    .02     10    10   .02     10    10    11.2            5.84             1.92
 9    .02      1    10   .02     10    10    6.66            5.80             1.15
10    .02     10    10   .02      1    10    6.68            1.34             5.00

Observation: α_K (degree of the autoregressive model) is effective for performance improvement.

Page 24: BIRTE-13-Kawashima

Outline

• Background
• In DSMS analytics
• CPD (Change Point Detection)
• MQO for multiple CPDs
• Experiment
• Summary

Page 25: BIRTE-13-Kawashima

Related Work

• Related work
– Philosophy
• In-DB analytics (MADlib, Bismarck, Oracle R Enterprise)
– Acceleration issues
• Advanced hardware (GPGPU, FPGA, Xeon Phi)
• Combination is a promising way

Page 26: BIRTE-13-Kawashima

Conclusions

• Multiple query optimization for CPD
– 4 sharing patterns
• Experimental result
– Up to 5 times faster than the naïve approach
• Future work
– Integrating MQO and accelerators