Efficient Dynamic Aggregation

Yitzhak Birk, Idit Keidar, Liran Liss, Assaf Schuster (Technion)


Page 1: Efficient Dynamic Aggregation

Efficient Dynamic Aggregation

Yitzhak Birk, Idit Keidar, Liran Liss, Assaf Schuster

Technion

Page 2: Efficient Dynamic Aggregation

Dynamic Aggregation

Continuous monitoring of an aggregate value over changing inputs

Examples:
– More than 10% of sensors report seismic activity
– Maximum temperature in a data center
– Average load in a computation grid

Page 3: Efficient Dynamic Aggregation

The Setting

Large graph (e.g., sensor network)
– Direct communication only between neighbors

Each node has a changing input

Inputs change more frequently than topology
– Consider topology as static

Aggregate function f over the multiset of inputs
– Oblivious to locations

Aggregate result computed at all nodes

Page 4: Efficient Dynamic Aggregation

Goals for Dynamic Aggregation

Fast convergence
– If from some time t onward inputs do not change …
• Output stabilization time from t
• Quiescence time from t
• Note: nodes do not know when stabilization and quiescence are achieved
– If after stabilization input changes abruptly…

Efficient communication
– Zero communication when there are zero changes
– Small changes require little communication

Page 5: Efficient Dynamic Aggregation

Standard Aggregation Solution: Spanning Tree

[Figure: spanning-tree aggregation of black/white inputs. Labels: “7 black, 1 white”, “2 black”, “1 black”, “20 black, 12 white”, “black!”. Caption: Global communication!]
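The spanning-tree approach on this slide can be sketched as a convergecast followed by a broadcast. This is a minimal illustration, not the authors' code: the tree, the node names, and the helper functions `convergecast` and `majority` are assumptions made for the example.

```python
from collections import Counter

def convergecast(tree, inputs, root):
    """Sum vote counts up a rooted tree; return (total counts, messages sent)."""
    messages = 0

    def visit(node):
        nonlocal messages
        counts = Counter([inputs[node]])
        for child in tree.get(node, []):
            counts += visit(child)
            messages += 1  # one child-to-parent message carrying a partial count
        return counts

    return visit(root), messages

def majority(counts):
    """Color with the most votes (ties are not handled in this sketch)."""
    return max(counts, key=counts.get)

# Hypothetical 6-node tree, given as parent -> children.
tree = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
inputs = {"r": "black", "a": "black", "b": "white",
          "c": "black", "d": "white", "e": "black"}

total, up_messages = convergecast(tree, inputs, "r")
result = majority(total)
down_messages = len(inputs) - 1  # the root's result travels down every tree edge
```

Every tree edge carries a message in each direction, so even this tiny instance involves the whole graph, which is the point of the "Global communication!" caption.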

Page 6: Efficient Dynamic Aggregation

Spanning Tree: Value Change

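[Figure: the same spanning tree after one input changes from black to white. Labels: “6 black, 2 white”, “19 black, 13 white”. Caption: Global communication!]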

Page 7: Efficient Dynamic Aggregation

The Bad News

Virtually every aggregation function has instances that cannot be computed without communicating with the whole graph
– E.g., majority voting when close to the threshold: “every vote counts”

Worst-case analysis: convergence and quiescence times are Ω(diameter)

Page 8: Efficient Dynamic Aggregation

Instance-Locality to the Rescue

Although some instances require global computation, most can stabilize (and become quiescent) locally
– In a small neighborhood, independent of graph size
– Shown empirically [Wolff,Schuster03, Liss,Birk,Wolff,Schuster04]

Formal instance-based locality in other contexts
– Local fault mending [Kutten,Peleg95, Kutten,Patt-Shamir97]
– Growth-restricted graphs [Kuhn,Moscibroda,Wattenhofer05]
– MST [Elkin04]

Page 9: Efficient Dynamic Aggregation

“Per-Instance” Optimality Too Strong

Instance: assignment of inputs to nodes

For a given instance I, algorithm A_I does:
– If (my input is as in I) output f(I), else send a message with my input to a neighbor
– Upon receiving a message, flood it
– Upon collecting info from the whole graph, output f(I)

Convergence and output stabilization in zero time on I

Can you beat that?

Need to measure optimality per-class, not per-instance

Challenge: capture attainable locality
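To see why per-instance optimality is too strong a yardstick, the "cheating" algorithm A_I can be simulated in a few lines. This is a hedged sketch: `run_a_i` and `majority` are hypothetical names, flooding is collapsed into a single step, and the fallback branch is assumed to compute f over the inputs actually collected.

```python
def run_a_i(hardwired, actual, f):
    """Simulate A_I, which has the instance `hardwired` built in.

    Returns (output per node, whether any message was sent).
    """
    # In the real algorithm each node checks only its own input; the
    # simulation checks all of them at once to decide which branch runs.
    if all(actual[v] == hardwired[v] for v in hardwired):
        answer = f(hardwired.values())  # output f(I) at once, send nothing
        return {v: answer for v in hardwired}, False
    # Some node saw an unexpected input: inputs are flooded and every
    # node computes f over everything it collected.
    answer = f(actual.values())
    return {v: answer for v in actual}, True

def majority(values):
    values = list(values)
    return "black" if 2 * values.count("black") > len(values) else "white"

I = {"a": "black", "b": "black", "c": "white"}
same, sent_same = run_a_i(I, dict(I), majority)
other, sent_other = run_a_i(I, {"a": "white", "b": "black", "c": "white"}, majority)
```

On its own instance I the algorithm is unbeatable (no messages, zero time), yet it is useless everywhere else, which motivates measuring optimality per class of instances.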

Page 10: Efficient Dynamic Aggregation

Veracity Radius (VR) for One-Shot Aggregation [BKLSW,PODC’06]

Roughly speaking: the minimum radius r0 such that for every r > r0, all r-neighborhoods have the same result

Example: majority
– Radius 1: wrong result
– Radius 2: correct result
– VR = 2
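The rough definition above can be checked by brute force on a small graph. The sketch below is an illustration under stated assumptions, not the formal definition from [BKLSW,PODC’06]: an r-neighborhood is taken to be the set of nodes within r hops of a node, the threshold is treated inclusively (the smallest r0 from which every radius agrees with the global result) so that the outcome matches the slide's example, and majority ties resolve to 0. The names `ball`, `majority`, and `veracity_radius` are hypothetical.

```python
from collections import deque

def ball(adj, center, radius):
    """Nodes within `radius` hops of `center` (breadth-first search)."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # do not expand beyond the requested radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def majority(values):
    """Majority of 0/1 votes; ties resolve to 0 (arbitrary choice for this sketch)."""
    values = list(values)
    return 1 if 2 * sum(values) > len(values) else 0

def veracity_radius(adj, inputs):
    """Smallest r0 such that every ball of radius >= r0 yields the global result."""
    target = majority(inputs.values())
    vr = 0
    for r in range(len(adj)):  # radius n-1 covers any connected graph
        if any(majority(inputs[v] for v in ball(adj, u, r)) != target for u in adj):
            vr = r + 1  # radius r is still misleading somewhere
    return vr

# Ring of 8 nodes with two adjacent dissenting votes.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
votes = {i: 0 if i in (0, 1) else 1 for i in range(8)}
vr_ring = veracity_radius(ring, votes)
vr_unanimous = veracity_radius(ring, {i: 1 for i in range(8)})
```

In the ring example, the radius-1 neighborhood of node 0 sees two 0 votes out of three and gets the wrong answer, while every radius-2 neighborhood already sees a majority of 1 votes, mirroring the slide's "Radius 1: wrong, Radius 2: correct" picture.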

Page 11: Efficient Dynamic Aggregation

Veracity Radius Captures the Locality of One-Shot Aggregation

[BKLSW,PODC’06]

Class-based lower bound
– Both output stabilization and quiescence
– For every r, for every algorithm A, there is an instance I with VR(I) ≤ r on which A takes Ω(r) time

I-LEAG (Instance-Local Efficient Aggregation on Graphs)
– Quiescence and output stabilization proportional to VR
– Per-class within a factor of optimal
– Local: depends on VR, not graph size!

Note: nodes do not know VR or when stabilization and quiescence are achieved
– Can’t expect to know you’re “done” in dynamic aggregation…

Page 12: Efficient Dynamic Aggregation

Naïve Dynamic Aggregation

Periodically,
– Each node samples input, initiates I-LEAG
– Each instance I of I-LEAG takes O(VR(I)) time, but sends on the order of |V| messages

Sends messages even when no input changes
– Costly in sensor networks

To save messages, must compromise freshness of result

Page 13: Efficient Dynamic Aggregation

Contributions

New lower bound
– For algorithms that send zero messages when there are zero changes

Efficient multi-shot aggregation algorithm (MultI-LEAG)
– Converges to correct result before sampling the inputs again
– Sampling time may be proportional to graph size

Efficient dynamic aggregation algorithm (DynI-LEAG)
– Sampling time is independent of graph size
– Algorithm tracks global result as closely as possible

Page 14: Efficient Dynamic Aggregation

Dynamic Lower Bound

Previous sample (instance) also plays a role
– Example (majority voting): [Figure: three consecutive instances labeled I1 (VR2), I2 (0 changes), I3 (VR=0), marked “?!”]

Multi-shot lower bound: max{VRprev,VR}
– On quiescence and output stabilization
– Assumes sending zero messages when there are zero changes

Page 15: Efficient Dynamic Aggregation

Dynamic Aggregation: Take II

Initially, run local one-shot algorithm A
– Store the distance that information travels in this instance, dist

Let D = A’s worst-case convergence time

Every D time, run a new iteration (MULTI-A)
– If input did not change, do nothing
– If input changed, run full information protocol up to dist
– If new instance’s VR isn’t reached, invoke A anew
– Update dist

[Slide annotations: ~(VR); (~VRprev); ~(VR)]

Matches max{VRprev,VR} lower bound within same factor as A

Page 16: Efficient Dynamic Aggregation

A is for I-LEAG

I-LEAG uses a pre-computed partition hierarchy
– LPH: Local Partition Hierarchy; cluster sizes bounded both from above and from below (doubling sizes)
– Spanning tree in each cluster, rooted at pivot
– Computed once per topology

I-LEAG phases correspond to LPH levels
– Active phase: full-information from cluster pivot
– Phase result communicated to cluster and its neighbors
– Phase active only if there is a conflict in the previous level
– Conflicts detected without new communication

Page 17: Efficient Dynamic Aggregation

MultI-LEAG

The Veracity Level (VL) of node v is the highest LPH level in which v’s cluster has a conflict (VL < log VR + 1)

A MultI-LEAG iteration’s phases correspond to LPH levels:
– Phase level < VL: propagate changes (if any) to pivot
• active only if there are changes
– Phase level ≥ VL: fall back to I-LEAG
• active only if new VR is larger than previous
– Cache partial aggregate results in pivot nodes
• allows conflict detection between active and passive clusters
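As a quick arithmetic check of the bound VL < log VR + 1, the sketch below lists the largest veracity level the bound permits for a few radii. It assumes the logarithm is base 2 (suggested by the doubling cluster sizes, but not stated on the slide) and an integer VR ≥ 1; `max_veracity_level` is a hypothetical helper name.

```python
def max_veracity_level(vr):
    """Largest integer VL with VL < log2(vr) + 1, for an integer vr >= 1."""
    if vr < 1:
        raise ValueError("this sketch assumes vr >= 1")
    return (vr - 1).bit_length()  # equals ceil(log2(vr)) for positive integers

levels = {vr: max_veracity_level(vr) for vr in (1, 2, 3, 4, 8, 9)}
```

Under these assumptions the level grows only logarithmically with the radius, so a small VR confines activity to the lowest few LPH levels.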

Page 18: Efficient Dynamic Aggregation

MultI-LEAG Operation

[Figure labels: Physical nodes; Pivot nodes; Veracity Level]

Page 19: Efficient Dynamic Aggregation

MultI-LEAG Operation

Case I: No changes

… no changes to report

… no conflicts

… no conflicts

All is quiet…

Page 20: Efficient Dynamic Aggregation

Input Change

!

New veracity level

no conflicts, no communication

Page 21: Efficient Dynamic Aggregation

Abrupt Change Flips Outcome

Page 22: Efficient Dynamic Aggregation

Abrupt Change Flips Outcome

Clusters at VL recalculate, others forward up

Page 23: Efficient Dynamic Aggregation

Abrupt Change Flips Outcome

New Veracity level

no conflicts, no communication

Page 24: Efficient Dynamic Aggregation

MultI-LEAG Observations

O(max{VRprev,VR}) output stabilization and quiescence

Message efficient:
– Communication only in clusters with changes, only when radius < max{VRprev,VR}

Sampling time is O(Diameter)
– Good for cheap periodic aggregation
– Can we do closer monitoring?

Page 25: Efficient Dynamic Aggregation

Dynamic Aggregation Take III: DynI-LEAG

Sample inputs every O(1) link delays
– Close monitoring, rapidly converges to correct result

Run multiple MultI-LEAG iterations concurrently

Challenges:
– Pipelining phases with different (doubling) durations
– Intricate interaction among concurrent instances
• E.g., which phase 4 updates are used in a given phase 5 …
– Avoiding state explosion for multiple concurrent instances

Page 26: Efficient Dynamic Aggregation

Ruler Pipelining

Partial iterations, fewer in every level

Changes only communicated once

[Figure: timeline (axis t) of phases 0, 1, and 2 across sampling intervals, distinguishing full iterations from partial iterations]

Memory usage: O(log(Diameter))
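One way to picture "fewer iterations in every level" is the classic ruler sequence, in which the iteration sampled at tick t runs up to a phase given by the number of times 2 divides t. The slide does not spell out the exact schedule, so the rule below is an assumption chosen to match the doubling phase durations; `ruler_level` and `max_level` are hypothetical names.

```python
def ruler_level(t, max_level):
    """Highest phase reached by the iteration sampled at tick t >= 1 (assumed rule)."""
    level = 0
    while level < max_level and t % (2 ** (level + 1)) == 0:
        level += 1
    return level

# Eight sampling ticks with phases 0..2, as in the slide's figure.
schedule = [ruler_level(t, 2) for t in range(1, 9)]

# How many of the eight iterations execute each phase.
runs_per_phase = [sum(1 for top in schedule if top >= k) for k in range(3)]
```

Under this rule only every fourth iteration is a full one, and each higher phase runs half as often as the phase below it, so the number of concurrently tracked iterations stays small.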

Page 27: Efficient Dynamic Aggregation

VL and Output Estimation

Problem: correct output and VL of an iteration are guaranteed only after O(Diameter) time
– Cannot wait that long…

Solution: choose the iteration with the highest VL according to the most recent information
– Use this VL for new iterations and its output as MultI-LEAG’s current output estimate

Eventual convergence and correctness guaranteed
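The selection rule above amounts to a maximum over the iterations currently in flight. The sketch below is illustrative only: the `IterationView` record, its fields, and the tie-break in favor of the most recent sample are assumptions not taken from the slides.

```python
from dataclasses import dataclass

@dataclass
class IterationView:
    sample_time: int     # when this iteration sampled the inputs
    veracity_level: int  # VL according to the most recent information
    output: str          # this iteration's current output

def estimate(iterations):
    """Return (VL, output) of the iteration with the highest known VL.

    Ties are broken in favor of the most recent sample (an assumption).
    """
    best = max(iterations, key=lambda it: (it.veracity_level, it.sample_time))
    return best.veracity_level, best.output

in_flight = [
    IterationView(sample_time=10, veracity_level=1, output="white"),
    IterationView(sample_time=11, veracity_level=2, output="black"),
    IterationView(sample_time=12, veracity_level=0, output="white"),
]
vl, output = estimate(in_flight)
```

The chosen VL seeds new iterations and the chosen output serves as the running estimate until later information replaces it.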

Page 28: Efficient Dynamic Aggregation

DynI-LEAG Operation

[Figure: timeline (axis t) of phases 0, 1, and 2, marking phases below VL and phases above VL; “Previous VL” = 2]

The influence of a conflict is proportional to its level

Page 29: Efficient Dynamic Aggregation

Conclusions

Local operation is possible in dynamic systems that solve inherently global problems

MultI-LEAG delivers periodic correct snapshots at minimal cost

DynI-LEAG responds immediately to input changes with a slightly higher message rate