Distributed Algorithms Epidemic Dissemination
description
Transcript of Distributed Algorithms Epidemic Dissemination
Distributed AlgorithmsEpidemic Dissemination
Alberto Montresor
University of Trento, Italy
2014/10/07
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 1 / 45
Contents1 Introduction
MotivationSystem modelSpecificationsModels of epidemics
2 AlgorithmsBest-effortAnti-entropyRumor-mongeringDeletion and death certificatesProbabilistic broadcast
3 Bibliography4 What’s next
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 2 / 45
Introduction Motivation
Data dissemination
Problem
Application-level broadcast/multicast is an important buildingblock to create modern distributed applications
I Video streamingI RSS feeds
Want efficiency, robustness, speed when scalingI Flooding (reliable broadcast) is robust, but inefficient (O(n2))I Tree distribution is efficient (O(n)), but fragileI Gossip is both efficient (O(n log n)) and robust, but has relative
high latency
Scalability:I Total number of messages sent by all nodesI Number of messages sent by each of the nodes
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 3 / 45
Introduction Motivation
Trade-offs
Trade-Offs
Tree Flood
Gossip
efficient
fast
robust
Figure: Courtesy: Robbert van Renesse
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 4 / 45
Introduction Motivation
Introduction
Solution
Reverting to a probabilistic approach based on epidemics / gossip
Nodes infect each other trough messages
Total number of messages is less than O(n2)
No node is overloaded
But:
No deterministic guarantee on reliability
Only probabilistic ones
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 5 / 45
Introduction Motivation
History of the Epidemic/Gossip Paradigm
First defined by Alan Demers et al. (1987)
Protocols based on epidemics predate Demers’ paper (e.g. NNTP)
’90s: gossip applied to the information dissemination problem
’00s: gossip beyond dissemination
2006: Workshop on the future of gossip (Leiden, the Netherlands)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 6 / 45
Introduction Motivation
Reality check (1987)
XEROX Clearinghouse Servers
Database replicated at thousands of nodes
Heterogeneous, unreliable network
Independent updates to single elements of the DB are injected atmultiple nodes
Updates must propagate to all nodes or be supplanted by laterupdates of the same element
Replicas become consistent after no more new updates
Assuming a reasonable update rate, most information at any givenreplica is “current”
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 7 / 45
Introduction Motivation
Reality check (today)
Amazon uses a gossip protocol to quickly spread informationthroughout the S3 system
Amazon’s Dynamo uses a gossip-based failure detection service
The basic information exchange in BitTorrent is based on gossip
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 8 / 45
Introduction System model
Basic assumptions
System is asynchronousI No bounds on messages and process execution delays
Processes fail by crashingI stop executing actions after the crashI We do not consider Byzantine failures
Communication is subject to benign failuresI Message omissionI No message corruption, creation, duplication
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 9 / 45
Introduction System model
Data model
We consider a database that is replicated at a set of n nodesΠ = {p1, . . . , pn}
The copy of the database at node pi can be represented by atime-varying partial function:
valuei : K → V · T
where:I K is the set of keysI V is the set of valuesI T is the set of timestamps
The update operation is formalized as: valuei(k)← (v, now())where now() returns a globally unique timestamp
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 10 / 45
Introduction Specifications
Database Specification
The goal of the update distribution process is to drive the systemtowards consistency:
Definition (Eventual consistency)
If no new updates are injected after some time t, eventually all correctnodes will obtain the same copy of the database:
∀pi, pj ∈ Π,∀k ∈ K : valuei(k) = valuej(k)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 11 / 45
Introduction Specifications
Probabilistic Broadcast Specification
An alternative view
PB1 – Probabilistic validity
There is a given probability such that for any two correct nodes p andq, every message PB-broadcast by p is eventually PB-delivered by q
PB2 - Integrity
Every message is PB-delivered by a node at most once, and only if itwas previously PB-broadcast
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 12 / 45
Introduction Specifications
Best-Effort Broadcast vs Probabilistic Broadcast
Similarities:
No agreement property
Differences:
Probability in BEB is “hidden” in process life-cycle
Probability in PB is explicit
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 13 / 45
Introduction Specifications
Some simplifying assumptions
Every node knows Π (the communication network is a full graph)
Communication costs between nodes are homogeneous
We assume there is a single entry in the databaseI We simplify notation (valuei(k)→ valuei)I Managing multiple keys is more complex than adding
foreach k ∈ K
To be relaxed later...
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 14 / 45
Introduction Models of epidemics
Models of epidemics
Definition (Epidemiology)
Epidemiology studies the spread of a disease or infection in terms ofpopulations of infected/uninfected individuals and their rates of change
Why?
To understand if an epidemic will turn into a pandemic
To adopt countermeasures to reduce the rate of infectionI InoculationI Isolation
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 15 / 45
Introduction Models of epidemics
SIR Model
Definition (SIR Model – Kermack and McKendrick, 1927)
An individual p can be:
Susceptible: if p is not yet infected by the disease
Infective: if p is infected and capable to spread the disease
Removed: if p has been infected and has recovered from the disease
Susceptible Infective Removed
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 16 / 45
Introduction Models of epidemics
SIR Model
How does it work?
Initially, a single individual is infective
Individuals get in touch with each other, spreading the disease
Susceptible individuals are turned into infective ones
Eventually, infective individuals will become removed
http://en.wikipedia.org/wiki/File:Sirsys-p9.png
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 17 / 45
Introduction Models of epidemics
SIR is not the only model...
Definition (SIRS Model)
The SIR model plus temporary immunity, so recovered nodes maybecome susceptible again.
Susceptible Infective Removed
Definition (SEIRS Model)
The SEIRS model takes into consideration the exposed or latent periodof the disease
Susceptible Infective RemovedExposed
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 18 / 45
Introduction Models of epidemics
From Epidemiology to Distributed Systems
The idea
Disease spread quickly and robustly
Our goal is to spread an update as fast and as reliable as possible
Can we apply these ideas to distributed systems?
Definition (SIR Model for Database replication)
Susceptible: if p has not yet received an update
Infective: if p holds an update and is willing to share it
Removed: if p has the update but is no longer willing to share it
Note: Rumor spreading, or gossiping, is based on the same principles
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 19 / 45
Algorithms
Algorithm – Summary
Best effort
Anti-entropy (simple epidemics)I PushI PullI Push-pull
Rumor mongering (complex epidemics)I PushI PullI Push-pull
Probabilistic broadcast
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 20 / 45
Algorithms Best-effort
Best-effort
Best-effort (direct mail) algorithm
Notify all other nodes of an update as soon as it occurs.
When receiving an update, check if it is “news”
Direct mail protocol executed by process pi:
upon valuei ← (v, now()) doforeach pj ∈ Π do
send 〈update, valuei〉 to pj
upon receive 〈update, (v, t)〉 doif valuei.time < t then
valuei ← (v, t)
Not randomized nor epidemic algorithm: just the simplestAlberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 21 / 45
Algorithms Anti-entropy
Anti-entropy
Anti-entropy: Algorithm
Every node regularly chooses another node at random andexchanges database contents, resolving differences.
Nodes are eitherI susceptible – they know the updateI infective – they know the update
Anti-entropy protocol executed by process pi:
repeat every ∆ time unitsProcess pj ← random(Π) % Select a random neighbor{ exchange messages to resolve differences }
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 22 / 45
Algorithms Anti-entropy
Anti-entropy: graphical representation
Rounds
During a round of length ∆
every node has the possibility of contacting one random node
can be contacted by several nodes
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 23 / 45
Algorithms Anti-entropy
Resolving differences – Summary
pi
PUSH, v, tp
j
pi
pj
pi
pj
PULL, pi
REPLY, v, t
PUSHPULL, pi, v, t
REPLY, v, t
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 24 / 45
Algorithms Anti-entropy
Resolving differences – Push
Anti-entropy, Push protocol executed by process pi:
repeat every ∆ time unitsProcess pj ← random(Π) % Select a random neighborsend 〈push, valuei〉 to pj
upon receive 〈push, (v, t)〉 doif valuei.time < t then
valuei ← (v, t)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 25 / 45
Algorithms Anti-entropy
Resolving differences – Pull
Anti-entropy, Pull protocol executed by process pi:
repeat every ∆ time unitsProcess pj ← random(Π) % Select a random neighborsend 〈pull, pi, valuei.time〉 to pj
upon receive 〈pull, pj , t〉 doif valuei.time > t then
send 〈reply, valuei〉 to pj
upon receive 〈reply, (v, t)〉 doif valuei.time < t then
valuei ← (v, t)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 26 / 45
Algorithms Anti-entropy
Resolving differences – Push-Pull
Anti-entropy, Push-Pull protocol executed by process pi:
repeat every ∆ time unitsProcess pj ← random(Π) % Select a random neighborsend 〈pushpull, pi, valuei〉 to pj
upon receive 〈pushpull, pj , (v, t)〉 doif valuei.time < t then
valuei ← (v, t)else if valuei.time > t then
send 〈reply, valuei〉 to pj
upon receive 〈reply, (v, t)〉 doif valuei.time < t then
valuei ← (v, t)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 27 / 45
Algorithms Anti-entropy
Analytical results
Definition (Compartmental model analysis)
We want to evaluate convergence of the protocol based on the size ofthe populations of susceptible and infected nodes (compartments)
st: the probability of a node being susceptible after t anti-entropyrounds
it = 1− st: the probability of a node being infective after tanti-entropy rounds
Probability at round k + 1
Initial condition: s0 = n−1n
Pull: E[st+1] = s2t
Push: E[st+1] = st(1− 1n)(1−st)n ≈ ste
−(1−st)
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 28 / 45
Algorithms Anti-entropy
Analytical results
Definition (Termination time)
Let Sn be the first cycle where st = 0 in a population of n individuals
Push: Pittel (1987) shows that in probability,Sn = log n + lnn + O(1)
Push-pull: Karp et al. (2000) shows that in probability,Sn = log log n
Relevant bibliography:
B. Pittel. On spreading a rumor. SIAM Journal of Applied Mathematics,47(1):213–223, 1987.
R. Karp, C. Schindelhauer, S. Shenker, B. Vocking. Randomized rumorspreading. In Proc. the 41st Symp. on Foundations of Computer Science, 2000.
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 29 / 45
Algorithms Anti-entropy
Analytical results
In summary:All methods converge to 0, but pull and push-pull are much more rapid
1e-50
1e-40
1e-30
1e-20
1e-10
1
0 5 10 15 20
st
Rounds
Pull
Push
Figure: n = 10 000
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 30 / 45
Algorithms Anti-entropy
Comments
Benefits
Simple epidemic eventually “infects” all the population
It is extremely robust
Drawbacks
Propagates updates much slower than direct mail (best effort)
Requires examining contents of database even when most dataagrees, so it cannot practically be used too often
Normally used as support to best effort/rumor mongering, i.e. leftrunning in the background
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 31 / 45
Algorithms Anti-entropy
Working with multiple values
Examples of techniques to compare databases:
Maintain checksum, compare databases if checksums unequal
Maintain recent update lists for time T , exchange lists first
Maintain inverted index of database by timestamp; exchangeinformation in reverse timestamp order, incrementally re-computechecksums
Notes:
Those ideas apply to databases
Strongly application-dependent
We will see how anti-entropy may be used beyond informationdissemination
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 32 / 45
Algorithms Rumor-mongering
Rumor mongering in brief
Nodes initially “susceptive”
When a node receives a new update it becomes a “hot rumor” andthe node “infective”
A node that has a rumor periodically chooses randomly anothernode to spread the rumor
Eventually, a node will “lose interest” in spreading the rumor andbecomes “removed”
I Spread too many timesI Everybody knows the rumor
A sender can hold (and transmit) a list of infective updates ratherthan just one
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 33 / 45
Algorithms Rumor-mongering
Rumor mongering: loss of interest
When: Counter vs coin (random)I Coin (random): lose interest with probability 1/kI Counter: lose interest after k contacts
Why: Feedback vs blindI Feedback: lose interest only if the recipient knows the rumor.I Blind: lose interest regardless of the recipient.
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 34 / 45
Algorithms Rumor-mongering
Analytical results
Question
How fast does the system converge to a state where all nodes are notinfective? (inactive state)
Compartmental analysis again – Feedback, coinLet s, i and r denote the fraction of susceptible, infective, and removednodes, respectively. Then:
s + i + r = 1
ds/dt = –si
di/dt = +si–1
k(1–s)i
Solving the differential equations:
s = e−(k+1)(1−s)
Increasing k increases the probability that the nodes get the rumor
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 35 / 45
Algorithms Rumor-mongering
Quality measures
Definition (Residue)
The nodes that remain susceptible when the epidemic ends: valueof s when i = 0.
Residue must be as small as possible
Definition (Traffic)
The average number of database updates sent between nodes
m = total update traffic / # of nodes
Definition (Convergence)
tavg : average time it takes for all nodes to get an update
tlast : time it takes for the last node to get the update
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 36 / 45
Algorithms Rumor-mongering
Simulation results
© Alberto Montresor 25
Simulation results
17.412.54.530.0113
17.512.75.640.00364
17.712.86.680.00125
16.912.13.300.0372
16.811.01.740.1761
tlasttavg
ConvergenceTrafficm
Residues
Counterk
32152.820.0603
3214.13.910.0214
3213.84.950.0085
33171.590.2052
38190.040.9601
tlasttavg
ConvergenceTrafficm
Residues
Counterk
Using feedback and counter
Using blind and random
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 37 / 45
Algorithms Rumor-mongering
Push vs pull
Push (what we have assumed so far)I If the database becomes quiescent, push stops trying to introduce
updates
PullI If many independent updates, pull is more likely to find a source
with a non-empty rumor listI But if the database is quiescent, several update requests are wasted
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 38 / 45
Algorithms Rumor-mongering
Rumor Mongering + Anti-Entropy
Quick & Dirty Push Rumor MongeringI spreads updates fast with low traffic.I however, there is still a nonzero probability of nodes remaining
susceptible after the epidemic
Not-so-fast Push-Pull Anti-EntropyI can be run (infrequently) in the background to ensure all nodes
eventually get the update with probability 1
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 39 / 45
Algorithms Deletion and death certificates
Dealing with deletions
Deletion
We cannot delete an entry just by removing it from a node - theabsence of the entry is not propagated
If the entry has been updated recently, there may still be anupdate traversing the network!
Definition (Death certificate)
Replace the deleted item with a death certificate that has a timestampand spreads like an ordinary update.
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 40 / 45
Algorithms Deletion and death certificates
Dealing with deletions
Problem
We must, at some point, delete DCs or they may consume significantspace
Strategy 1: retain each DC until all nodes have received itI requires a protocol to determine which nodes have it and to handle
node failures
Strategy 2: hold DCs for some time (e.g. 30d) and discard themI pragmatic approachI we still have the “resurrection” problemI increasing the time requires more space
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 41 / 45
Algorithms Deletion and death certificates
Dormant certificates
Observation:
we can delete very old DCs but retain only a few “dormant” copiesin some nodes
if an obsolete update reaches a dormant DC, it is “awakened” andre-propagated
Analogy with epidemiology:
the awakened DC is like an antibody triggered by an immunereaction
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 42 / 45
Algorithms Probabilistic broadcast
Epidemics vs probabilistic broadcast
Anti-entropy:DB is a collection of message received
Rumor mongeringUpdates ≡ Message broadcast
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 43 / 45
Bibliography
Reading material
A. J. Demers, D. H. Greene, C. Hauser, W. Irish, J. Larson, S. Shenker, H. E.Sturgis, D. C. Swinehart, and D. B. Terry.
Epidemic algorithms for replicated database maintenance.
In Proc. of the 6th annual ACM Symposium on Principles of DistributedComputing Systems (PODC’87), pages 1–12, 1987.
http://www.disi.unitn.it/~montreso/ds/papers/demers87.pdf.
M. Jelasity.
Gossip.
In Self-organising Software, pages 139–162. Springer Berlin Heidelberg, 2011.
http://www.disi.unitn.it/~montreso/ds/papers/gossip11.pdf.
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 44 / 45
What’s next
What’s next?
Problems
Membership:How do processes get to know each other, and how many do theyneed to know?
Network awareness:How to make the connections between processes reflect the actualnetwork topology such that the performance is acceptable?
Buffer management:Which updates to drop when the storage buffer of a process is full?
Does not end here...
Epidemic/gossip approach has been successfully applied to otherproblems. We will provide a brief overview of them.
Alberto Montresor (UniTN) DS - Epidemic Dissemination 2014/10/07 45 / 45