1

Study group 2012.04.09 Junction

SHERLOCK IS AROUND: DETECTING NETWORK FAILURES WITH LOCAL EVIDENCE FUSION
Qiang Ma 1, Kebin Liu 2, Xin Miao 1, Yunhao Liu 1,2

1 Department of Computer Science and Engineering, Hong Kong University of Science and Technology

2 MOE Key Lab for Information System Security, School of Software,

Tsinghua National Lab for Information Science and Technology, Tsinghua University

2012/04/09

2

Motivations: WSNs are widely deployed for numerous applications
They need to sustain for years and operate reliably
They are error-prone and subject to component faults and performance degradations

It is more challenging to explore the root causes of failures in WSNs
Ad-hoc nature of WSNs: large scale, dynamic changes of topology
Limited resources of sensor nodes: power, computation capability
The existence of a large variety of WSN-specific protocols

INTRODUCTION

2012/04/09

3

Traditional/popular diagnosis approach: sink-based
Actively collects global evidence from the sensor nodes at the sink: remaining energy, MAC-layer backoff, neighbor table, routing table …
Conducts centralized analysis at the powerful back-end
Disadvantage: communication overhead

To avoid the large overhead of the evidence collection process: self-diagnosis
Injects a fault inference model into the sensor nodes so that they make local decisions
Disadvantages: results from single nodes are inaccurate due to their narrow scope; inconsistent results from different inference processes

2012/04/09

RELATED WORKS

4

Main design goals
Diagnosis efficiency: run the diagnosis process locally instead of at the back-end, reducing the communication overhead
Diagnosis accuracy: take the judgments from all nodes within the local area into consideration

2012/04/09

LOCAL DIAGNOSIS (LD2)

5

How it works
Nodes run an NBC; state attributes serve as the evidences
Posterior probability distribution: P(root causes | evidences)
Once a node detects an anomaly, it constructs a fusion tree and performs evidence fusion

Advantages
Balances the workload
Ensures a local consensus on the final diagnosis result

2012/04/09

SYSTEM ARCHITECTURE

A Naïve Bayesian Classifier encodes the probabilistic correlation between a set of state attributes and the root causes
Example trigger: if a neighbor node has been removed from the neighbor list, the diagnosis process is triggered
Dempster-Shafer Theory (the theory of evidence, DST)

6

Parameters learned from historical data
R: the root cause; Fi (i = 1, …, n): the evidences, each taking s discrete values
Calculate the posterior probability of the different root causes:
P(R | F1, …, Fn) = P(R) · ∏i P(Fi | R) / P(F1, …, Fn)
Each node, based on the Fi it observes, calculates this posterior; with a certain mapping (normalization) it is used later as the basic probability assignment in DST

2012/04/09

NAÏVE BAYESIAN CLASSIFIER (NBC)

Pre-learned: P(R) and P(Fi | R)

Scale factor (the denominator P(F1, …, Fn)): constant for different R
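
To make the per-node step concrete, here is a minimal Python sketch of the NBC posterior computation as described above; the root-cause names, table layout, and function name are illustrative assumptions, not the paper's implementation.

# Minimal sketch of the per-node NBC step (illustrative names, not the paper's code).
# priors[r] = P(R = r); cond[r][i][v] = P(F_i = v | R = r); both learned offline.
def nbc_posterior(priors, cond, observed):
    """Return the normalized posterior P(R | F_1..F_n) for the observed evidence values."""
    scores = {}
    for r, p_r in priors.items():
        score = p_r
        for i, v in enumerate(observed):
            score *= cond[r][i][v]
        scores[r] = score
    total = sum(scores.values())              # the scale factor, constant over R
    return {r: s / total for r, s in scores.items()}

# Example with two root causes and two binary evidences:
priors = {"node_crash": 0.3, "traffic_contention": 0.7}
cond = {
    "node_crash":         [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}],
    "traffic_contention": [{0: 0.4, 1: 0.6}, {0: 0.7, 1: 0.3}],
}
print(nbc_posterior(priors, cond, observed=[1, 0]))   # later mapped to a BPA in DST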

7

Fundamentals
Allows us to combine evidence from different sources and arrive at a degree of belief in all possible states/hypotheses (R, root causes) that takes into account all the available evidences (F, metrics)

Terms
Hypotheses and the frame of discernment (the set of all hypotheses considered)
Basic probability/belief assignment m (subjective or objective); any subset A with m(A) > 0 is a focal element
Constraints: m(∅) = 0 and the masses over all subsets sum to 1
*Here the posterior probability from the NBC serves as an objective assignment

2012/04/09

DEMPSTER-SHAFER THEORY (DST)

8

Different from the concept of probability
Belief Bel(s) and plausibility Pl(s) = 1 - Bel(~s); belief <= plausibility

In this study
The frame of discernment is built from the root causes Ri, with R0 meaning no problem
Each node only generates masses on the individual root causes (singletons)

2012/04/09

DEMPSTER-SHAFER THEORY (DST)
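
A minimal sketch of how a basic probability assignment over the root-cause frame could be represented, with belief and plausibility computed as defined above; the frame and the mass values are illustrative, not from the paper.

# Illustrative representation of a BPA over the root-cause frame (values made up).
OMEGA = frozenset({"R0", "node_crash", "traffic_contention", "route_loop"})

m = {
    frozenset({"node_crash"}): 0.6,
    frozenset({"traffic_contention", "route_loop"}): 0.3,
    OMEGA: 0.1,                               # mass left on the whole frame = ignorance
}

def bel(m, hypothesis):
    """Belief: total mass of focal elements fully contained in the hypothesis."""
    return sum(v for a, v in m.items() if a <= hypothesis)

def pl(m, hypothesis):
    """Plausibility: Pl(s) = 1 - Bel(~s), the mass not contradicting the hypothesis."""
    return 1.0 - bel(m, OMEGA - hypothesis)

h = frozenset({"node_crash"})
print(bel(m, h), pl(m, h))                    # 0.6 <= 0.7: belief <= plausibility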

9

Combine the beliefs from different observers (sensor nodes) to do evidence fusion
Joint mass: m12(A) = (1 / (1 - K)) · Σ over B∩C=A of m1(B)·m2(C)
Conflict factor: K = Σ over B∩C=∅ of m1(B)·m2(C)
(see the code sketch below)

Problem: the combination result can go against practical sense, whether the conflict factor is low or high

2012/04/09

DEMPSTER’S RULE OF COMBINATION
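
A generic implementation of Dempster's rule for two mass functions, with the conflict factor K made explicit; it reuses the frozenset representation from the earlier sketch and is not the paper's code.

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule; return (joint mass, conflict K)."""
    joint = {}
    conflict = 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            inter = a & b
            if inter:
                joint[inter] = joint.get(inter, 0.0) + va * vb
            else:
                conflict += va * vb           # product mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: combination undefined")
    # Normalizing by 1 - K redistributes the conflicting mass over the focal elements.
    return {a: v / (1.0 - conflict) for a, v in joint.items()}, conflict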

10

Example: hypotheses Ω = {T, M, C}
T: brain tumor; M: meningitis; C: concussion
The frame of discernment: 2^Ω

2012/04/09

LOW/HIGH CONFLICT FACTOR

Case 1: low conflict (the two doctors largely agree)

Doctor A          2^Ω    Doctor B
m(A1) = 0.99      T      m(B1) = 0.99
m(A2) = 0.01      M      m(B2) = 0
m(A3) = 0         C      m(B3) = 0.01

Combined result: m(T) = 1 !! (fully certain although neither doctor was)

Case 2: high conflict (the two doctors strongly disagree)

Doctor A          2^Ω    Doctor B
m(A1) = 0.99      T      m(B1) = 0
m(A2) = 0.01      M      m(B2) = 0.01
m(A3) = 0         C      m(B3) = 0.99

∩     A1    A2    A3
B1    Ø     Ø     Ø
B2    Ø     M     Ø
B3    Ø     Ø     Ø
(Ø marks cells contributing no joint mass: empty intersection or zero product)

Only A2 ∩ B2 = {M} carries non-zero joint mass, so the combined result is m(M) = 1 !!, even though both doctors considered meningitis nearly impossible
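
Running the dempster_combine sketch above on the two cases reproduces the counterintuitive behaviour numerically; this is the classic Zadeh-style illustration, not data from the paper.

T, M, C = frozenset({"T"}), frozenset({"M"}), frozenset({"C"})

# Case 1: the doctors largely agree (low conflict), yet the result is fully certain.
print(dempster_combine({T: 0.99, M: 0.01, C: 0.0},
                       {T: 0.99, M: 0.0,  C: 0.01}))   # m(T) -> 1.0, K ~ 0.02

# Case 2: the doctors strongly disagree (high conflict); all mass ends up on M,
# the diagnosis both of them considered nearly impossible.
print(dempster_combine({T: 0.99, M: 0.01, C: 0.0},
                       {T: 0.0,  M: 0.01, C: 0.99}))   # m(M) -> 1.0, K ~ 0.9999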

11

Believe those results with a high consensus among nodes
Definition 1: the distance between m1 and m2, a measure of how far apart the two basic probability assignments are

2012/04/09

MODIFIED COMBINATION RULE

12

Definition 2: the similarity degree of m1 and m2 (derived from the distance above)
If one node i's mi is similar to all the others, then we believe that node i's mi is important

Definition 3: the basic confidence of evidence i (i = 1, 2, …, N), followed by normalization
Modified BPA = basic probability assignment x basic confidence (see the sketch below)
This reduces the impact of the less important evidences

2012/04/09

MODIFIED COMBINATION RULE
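
A sketch of the weighting step described on this slide. The paper's exact distance formula is not reproduced here; as a stand-in, the sketch uses a total-variation distance between the BPA vectors, turns it into a similarity, derives a basic confidence per node, discounts each BPA accordingly (moving the removed mass to the whole frame), and then fuses with the dempster_combine sketch above. Treat the normalization details as assumptions.

def bpa_distance(m1, m2, subsets):
    # Stand-in for Definition 1: total-variation distance between BPA vectors, in [0, 1].
    return 0.5 * sum(abs(m1.get(a, 0.0) - m2.get(a, 0.0)) for a in subsets)

def modified_combine(masses, frame):
    """Weight each node's BPA by its basic confidence, then fuse with Dempster's rule."""
    subsets = set()
    for m in masses:
        subsets |= set(m)
    n = len(masses)

    # Definition 2: similarity of two BPAs; Definition 3: the basic confidence of
    # evidence i, built from its similarity to all the other evidences (then scaled).
    sim = [[1.0 - bpa_distance(masses[i], masses[j], subsets) for j in range(n)]
           for i in range(n)]
    raw = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    conf = [r / max(raw) if max(raw) > 0 else 1.0 for r in raw]

    # "Modified BPA = BPA x basic confidence", realized here as discounting: the mass
    # taken away from a low-confidence node is moved to the whole frame (ignorance).
    fused = None
    for m, c in zip(masses, conf):
        d = {a: v * c for a, v in m.items() if a != frame}
        d[frame] = 1.0 - sum(d.values())
        fused = d if fused is None else dempster_combine(fused, d)[0]
    return fused

frame = frozenset({"R0", "node_crash", "traffic_contention", "route_loop"})
print(modified_combine(
    [{frozenset({"node_crash"}): 0.8, frame: 0.2},
     {frozenset({"node_crash"}): 0.7, frame: 0.3},
     {frozenset({"route_loop"}): 0.9, frame: 0.1}],   # one dissenting node
    frame))
# The dissenting node gets a low basic confidence, so its impact on the result is reduced.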

13

Criterion: the fusion result stays the same even if we change the fusion order
Theorem 1: the modified combination rule satisfies this criterion (the fused result is independent of the fusion order)

2012/04/09

EVIDENCE FUSION
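
Theorem 1 in the paper addresses this criterion for the modified rule; for plain Dempster combination the same property can be checked numerically with the earlier dempster_combine sketch (values are illustrative).

frame = frozenset({"R0", "node_crash", "traffic_contention", "route_loop"})
m1 = {frozenset({"node_crash"}): 0.6, frame: 0.4}
m2 = {frozenset({"node_crash"}): 0.5, frozenset({"traffic_contention"}): 0.3, frame: 0.2}
m3 = {frozenset({"traffic_contention"}): 0.7, frame: 0.3}

order_a = dempster_combine(dempster_combine(m1, m2)[0], m3)[0]
order_b = dempster_combine(dempster_combine(m3, m1)[0], m2)[0]
for a in order_a:
    assert abs(order_a[a] - order_b.get(a, 0.0)) < 1e-9   # same result in any order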

14

Trigger node: detects abnormal symptoms
Node crash, traffic contention, route loop

Determine the diagnosis area ???
Standard set: the root node and its one-hop neighbors, which reduces the computation overhead

DREQ contains the details of the diagnosis task and the standard set (used for the basic confidence), and establishes the fusion tree (see the sketch below)

2012/04/09

FUSION ALGORITHM
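
A rough sketch of how the trigger and fusion steps on this slide could fit together. The packet fields, the node methods (broadcast, send_to_parent), and the attribute names are hypothetical, introduced only to make the flow concrete; they are not the paper's API.

from dataclasses import dataclass

@dataclass
class DREQ:                      # diagnosis request sent within the local area
    trigger_id: int
    symptom: str                 # e.g. "node_crash", "traffic_contention", "route_loop"
    standard_set: list           # nodes whose evidences define the basic confidence
    hops: int = 1                # diagnosis area: the root node and its one-hop neighbors

@dataclass
class DEVI:                      # diagnosis evidence passed up the fusion tree
    sender_id: int
    bpa: dict                    # basic probability assignment over the root causes

def on_anomaly(node, symptom):
    """Trigger node: start a local diagnosis instead of reporting to the sink."""
    node.broadcast(DREQ(trigger_id=node.id,
                        symptom=symptom,
                        standard_set=[node.id] + list(node.one_hop_neighbors)))

def on_all_children_reported(node, devis):
    """Fusion-tree node: fuse the children's evidence, then forward the result upward."""
    fused = modified_combine([d.bpa for d in devis] + [node.local_bpa], node.frame)
    node.send_to_parent(DEVI(sender_id=node.id, bpa=fused))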

15

2012/04/09

EVIDENCE FUSION ALGORITHM

In case the DREQ is lost

16

CitySee project: urban carbon dioxide sensing, 494 sensor nodes
Testbed using the CTP protocol, 50 TelosB motes

Comparison: LD2 vs. TinyD2
Manually injected faults: node crash, traffic contention, route loop
Metrics: false negative rate vs. false positive rate

2012/04/09

EVALUATION

Fault detector (self-diagnosis): a finite state machine (FSM) model
Fault detector M = (E, S, S0, f, F)
E: the set of input evidences
S: the set of states; S0: the start state
f: the state transition function; F: all accept states

E.g., a high retransmission rate between A and B (A -> B)
A finds the rate increasing and broadcasts its current state together with the fault detector
If B receives it, it checks ACK or DATA; B -> S2 and broadcasts -> Ci (NUM: threshold); Bc: severe contention at B

2012/04/09 17

TINYD2 [1]

[1] Kebin Liu; Qiang Ma; Xibin Zhao; Yunhao Liu;"Self-diagnosis for large scale wireless sensor networks," INFOCOM, 2011

Accept states: final diagnosis decision
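
For comparison, a minimal sketch of an FSM-style fault detector in the TinyD2 spirit, M = (E, S, S0, f, F); the concrete states, evidence names, and threshold handling below are illustrative, not the exact detector from [1].

class FaultDetector:
    """M = (E, S, S0, f, F): evidences drive transitions until an accept state is hit."""
    def __init__(self, transitions, start, accept):
        self.f = transitions          # f: (state, evidence) -> next state
        self.state = start            # S0
        self.accept = accept          # F: accept states mapped to diagnosis decisions

    def feed(self, evidence):
        self.state = self.f.get((self.state, evidence), self.state)
        return self.accept.get(self.state)    # a decision once an accept state is reached

# Illustrative detector for "high retransmission rate on link A -> B".
transitions = {
    ("S0", "retx_rate_high"):          "S1",   # A observes the rate increasing
    ("S1", "ack_loss_confirmed_at_B"): "S2",   # B checks whether ACKs or DATA are lost
    ("S2", "counter_exceeds_NUM"):     "Bc",   # threshold NUM reached
}
detector = FaultDetector(transitions, start="S0",
                         accept={"Bc": "severe contention at B"})
for ev in ["retx_rate_high", "ack_loss_confirmed_at_B", "counter_exceeds_NUM"]:
    decision = detector.feed(ev)
print(decision)                                # -> "severe contention at B"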

18

Problem node: node 25, which has 16 neighbors
Root node of the fusion tree: node 13

Time cost breakdown:
Sampling evidences; assigning the local basic confidence
Establishing the fusion tree; receiving & broadcasting beacons

2012/04/09

TIME COST

Time cost is stable for all the tree structures
Traffic contention has a longer time cost: its DEVI packet contains 3 possible root causes (1. ingress overflow, 2. egress overflow, 3. bad link), so more combination work is needed

19

2012/04/09

DIAGNOSIS ACCURACY

The false rates decrease as the number of neighbors increases: more determinate diagnosis

TinyD2 performs unstably and gets worse as the number of neighbors increases => it fails to achieve a consensus

Several root causes make it difficult for TinyD2's FSM to reach an accept state

20

2012/04/09

COUPLING EFFECT WITH APPLICATION

Application packet loss

21

Conducts diagnosis in a local area, reducing the communication overhead
Distributes the diagnosis workload to the sensor nodes within the diagnosis area
Uses a fusion tree to do evidence fusion; a local consensus on the final diagnosis report is achieved

Limitation: the failures need to be predefined!!

2012/04/09

CONCLUSION