Distributed Algorithms Time, clocks and the ordering of...

116
Distributed Algorithms Time, clocks and the ordering of events Alberto Montresor University of Trento, Italy 2017/05/19 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Transcript of Distributed Algorithms Time, clocks and the ordering of...

Page 1: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Distributed AlgorithmsTime, clocks and the ordering of events

Alberto Montresor

University of Trento, Italy

2017/05/19

This work is licensed under a Creative CommonsAttribution-ShareAlike 4.0 International License.

references

O. Babaoglu and K. Marzullo.Consistent global states of distributed systems: Fundamentalconcepts and mechanisms.In S. Mullender, editor, Distributed Systems (2nd ed.).Addison-Wesley, 1993.http://www.disi.unitn.it/~montreso/ds/papers/

ConsistentGlobalStates.pdf.

Page 2: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 3: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 4: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions

Distributed Execution

Definition (Distributed algorithm)

A distributed algorithm is a collection of distributed automata, one perprocess

Definition (Distributed execution)

The execution of a distributed algorithm is a sequence of events executedby the processes

Partial execution: a finite sequence of eventsInfinite execution: a infinite sequence of events

Possible events

send(m, p): sends a message m to process preceive(m): receives a message mlocal events that change the local state

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 1 / 96

Page 5: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Histories

Histories

Definition (Local history)

The local history of process pi is a (possibly infinite) sequence of eventshi = e0i e

1i e

2i . . . e

mii (canonical enumeration)

Definition (Partial history)

The partial history up to event eki is denoted hki and is given by theprefix of the first k events of hi

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 2 / 96

Page 6: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Histories

Histories

Local histories do not specify any relative timing between eventsbelonging to different processes.

We need a notion of ordering between events, that could help us indeciding whether:

I one event occurs before anotherI they are actually concurrent

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 3 / 96

Page 7: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Happen-before

Happen-Before

Definition (Happen-before)

We say that an event e happens-before an event e′, and write e→ e′, ifone of the following three cases is true:

1 ∃pi ∈ Π : e = eri , e′ = esi , r < s(e and e′ are executed by the same process, e before e′)

2 e = send(m, ∗) ∧ e′ = receive(m)(e is the send event of a message m and e′ is the correspondingreceive event)

3 ∃e′′ : e→ e′′ → e′)(in other words, → is transitive)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 4 / 96

Page 8: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Happen-before

Space-Time Diagram of a Distributed Computation

e1 1 e2

1 e3 1 e4

1 e5 1 e6

1

e1 2 e2

2 e3 2

e1 3 e2

3 e3 3 e4

3 e5 3 e6

3

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 5 / 96

Page 9: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Happen-before

Meaning of Happen-Before

If e→ e′, this means that we can find a series of events e1e2e3 . . . en,where e1 = e and en = e′, such that for each pair of consecutive eventsei and ei+1:

1 ei and ei+1 are executed on the same process, in this order

2 ei = send(m, ∗) and ei+1 = receive(m)

Notes:

happen-before captures the concept of potential causal ordering

happen-before captures a flow of data between two events.

Two events e, e′ that are not related by the happen-before relation(e 6→ e′ ∧ e′ 6→ e) are concurrent, and we write e||e′.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 6 / 96

Page 10: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Happen-before

Homework

Prove or disprove that the || relation is transitive.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 7 / 96

Page 11: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Happen-before

Reality check

Forse non tutti sanno che...

The multi-threaded memory model of Java is based on thehappen-before relation, where communication between threads is basedon the acquisition and release of locks.

http://download.oracle.com/javase/6/docs/api/java/util/

concurrent/package-summary.html

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 8 / 96

Page 12: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Global States

Definition (Local state)

The local state of process pi after the execution of event eki isdenoted σkiThe local state contains all data items accessible by that process

Local state is completely private to the process

σ0i is the initial state of process pi

Definition (Global state)

The global state of a distributed computation is an n-tuple of localstates Σ = (σ1, . . . , σn), one for each process.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 9 / 96

Page 13: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Cut

Definition (Cut)

A cut of a distributed computation is the union of n partial histories,one for each process:

C = hc11 ∪ h

c22 ∪ . . . ∪ h

cnn

A cut may be described by a tuple (c1, c2, . . . , cn), identifying thefrontier of the cut, i.e. the set of last events, one per process.

Each cut (c1, . . . , cn) has a corresponding global state(σc1

1 , σc22 , . . . , σ

cnn ).

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 10 / 96

Page 14: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Cuts

e1 1 e2

1 e3 1 e4

1 e5 1 e6

1

e1 2 e2

2 e3 2

e1 3 e2

3 e3 3 e4

3 e5 3 e6

3C C’

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 11 / 96

Page 15: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Consistent cut

Consider cuts C ′ and C in the previous figure.

Is it possible that cut C correspond to a “real” state in theexecution of a distributed algorithm?

Is it possible that cut C ′ correspond to a “real” state in theexecution of a distributed algorithm?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 12 / 96

Page 16: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Consistent cut

Definition (Consistent cut)

A cut C is consistent, if for all events e and e′,

(e ∈ C) ∧ (e′ → e)⇒ e′ ∈ C

Definition (Consistent global state)

A global state is consistent if the corresponding cut is consistent.

In other words:

A consistent cut is left-closed w.r.t. the happen-before relation

All messages that have been received must have been sent before

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 13 / 96

Page 17: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Modeling Distributed Executions Global states and cuts

Consistent cut

In the previous figures, C is consistent and C ′ is not.

In the space-time diagram, a cut C is consistent if all the arrowsstart on the left of the cut and finish on the right of the cut.

Consistent cuts represent the concept of scalar time in distributedcomputation: it is possible to distinguish between a “before” andan “after”.

Predicates can be evaluated in consistent cuts, because theycorrespond to potential global states that could have taken placeduring an execution.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 14 / 96

Page 18: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 19: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Problem definition

Introduction

Definition (Global Predicate Evaluation)

The problem of detecting whether the global state of a distributedsystem satisfies some predicate Φ.

Motivation

Many problems in distributed systems require a reaction when theglobal state of the system satisfies some condition.

I Monitoring: Notify an administrator in case of failuresI Debugging: Verify whether an invariant is respected or notI Deadlock detection: can the computation continue?I Garbage collection: like Java, but distributed

Thus, the ability to construct a global state and evaluate apredicate over it is a core problem in distributed computing.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 15 / 96

Page 20: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Problem definition

Examples

Garbagecollection

Deadlock

p1

p1

p2

wait-for

wait-for

p1

Message

objectreference

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 16 / 96

Page 21: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Problem definition

Why GPE is difficult

A global state obtained through remote observations could be

obsolete: represent an old state of the system.Solution: build the global state more frequently

inconsistent: capture a global state that could never have occurredin realitySolution: build only consistent global states

incomplete: not “capture” every moment of the systemSolution: build all possible consistent global states

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 17 / 96

Page 22: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Space-Time Diagram of a Distributed Computation

e1 1 e2

1 e3 1 e4

1 e5 1 e6

1

e1 2 e2

2 e3 2

e1 3 e2

3 e3 3 e4

3 e5 3 e6

3

reqreq

respreq

resp

req

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 18 / 96

Page 23: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Example – Deadlock detection on a multi-tier system

Processes in the previous figures use RPCs:

Client sends a request for method execution; blocks.

Server receives request.

Server executes method; may invoke methods on other servers,acting as a client.

Server sends reply to client

Clients receives reply; unblocks.

Such a system can deadlock, as RPCs are blocking. It is important tobe able to detect when a deadlock occurs.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 19 / 96

Page 24: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Runs and consistent runs

Definition (Run)

A run of global computation is a total ordering R that includes all theevents in the local histories and that is consistent with each of them.

In other words, the events of pi appear in R in the same order inwhich they appear in hi.

A run corresponds to the notion that events in a distributedcomputation actually occur in a total order

A distributed computation may correspond to many runs

Definition (Consistent run)

A run R is said to be consistent if for all events e and e′, e→ e′ impliesthat e appears before e′ in R.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 20 / 96

Page 25: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Runs and consistent runs

e11 e21 e

31 e

41 e

51 e

61 e

12 e

22 e

32 e

13 e

23 e

33 e

43 e

53 e

63 ?

e11 e12 e

13 e

21 e

23 e

33 e

31 e

41 e

43 e

22 e

32 e

51 e

53 e

61 e

63 ?

e1 1 e2

1 e3 1 e4

1 e5 1 e6

1

e1 2 e2

2 e3 2

e1 3 e2

3 e3 3 e4

3 e5 3 e6

3

reqreq

respreq

resp

req

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 21 / 96

Page 26: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Monitoring Distributed Computations

Assumptions:I There is a single process p0 called monitor which is responsible for

evaluating ΦI We assume that the monitor p0 is distinct from the observed

processes p1 . . . pnI Events executed on behalf of monitoring do not alter canonical

enumeration of “real” events

In general, observed processes send notifications about local eventsto the monitor, which builds an observation.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 22 / 96

Page 27: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Global predicate evaluation Example: deadlock detection

Observations

Definition (Observation)

The sequence of events corresponding to the order in which notificationmessages arrive at the monitor is called an observation.

Given the asynchronous nature of our distributed system, anypermutation of a run R is a possible observation of it.

Definition (Consistent observation)

An observation is consistent if it corresponds to a consistent run.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 23 / 96

Page 28: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 29: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks

How to obtain consistent observations

The happen-before relation captures the concept of potentialcausality

In the “day-to-day” life, causality/concurrency are tracked usingphysical time

I We use loosely synchronized watches;I Example: I have withdrawn money from an ATM in Trento at 13.00

on 17th May 2006, so I can prove that I’ve not withdrawn money onthe same day at 13.20 in Paris

In distributed computing systems:I the rate of occurrence of events is several magnitudes higherI event execution time is several magnitudes smaller

If physical clocks are not precisely synchronized, thecausality/concurrency relations between events may not beaccurately captured

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 24 / 96

Page 30: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Logical clocks

Logical clocks

Instead of using physical clocks, which are impossible to synchronize,we use logical clocks.

Every process has a logical clock that is advanced using a set ofrules

Its value is not required to have any particular relationship to anyphysical clock.

Every event is assigned a timestamp, taken from the logical clock

The causality relation between events can be generally inferredfrom their timestamps

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 25 / 96

Page 31: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Logical clocks

Logical clocks

Definition (Logical clock)

A logical clock LC is a function that maps an event e from the historyH of a distributed system execution to an element of a time domain T :

LC : H → T

Definition (Clock Condition)

e→ e′ ⇒ LC (e) < LC (e′)

Definition (Strong Clock Condition)

e→ e′ ⇔ LC (e) < LC (e′)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 26 / 96

Page 32: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Scalar logical clocks

Scalar logical clocks

Definition (Scalar logical clocks)

Lamport’s scalar logical clock is a monotonically increasingsoftware counter

Each process pi keeps its own logical clock LC i

The timestamp of event e executed by process pi is denoted LCi(e)

Messages carry the timestamp of their send event

Logical clocks are initialized to 0

Update rule

Whenever an event e is executed by process pi, its local logical clock isupdated as follows:

LC i =

{LCi + 1 If ei is an internal or send eventmax{LCi,TS (m)}+ 1 If ei = receive(m)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 27 / 96

Page 33: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Scalar logical clocks

Scalar logical clocks

1 2 4 5 6 7

1 5 6

1 2 43 5 7

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 28 / 96

Page 34: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Scalar logical clocks

Properties

Theorem

Scalar logical clocks satisfy Clock condition, i.e.

e→ e′ ⇒ LC(e) < LC(e′)

Proof.

This immediately follows from the update rules of the clock.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 29 / 96

Page 35: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Properties

Theorem

Scalar logical clocks satisfy Clock condition, i.e.

e→ e′ ⇒ LC(e) < LC(e′)

Proof.

This immediately follows from the update rules of the clock.

2018-1

2-1

6

DS - Time,clocks,events

Real clocks vs logical clocks

Scalar logical clocks

Properties

Total ordering: It is also possible to provide a total order for events, bysorting based on the process identifier for events with the same timestamp.

Page 36: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Scalar logical clocks

Scalar logical clocks

Theorem

Scalar logical clocks do not satisfy Strong clock condition, i.e.

LC(e) < LC(e′) 6⇒ e→ e′

1 2 4 5 6 7

1 5 6

1 2 43 5 7

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 30 / 96

Page 37: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Scalar logical clocks

Theorem

Scalar logical clocks do not satisfy Strong clock condition, i.e.

LC(e) < LC(e′) 6⇒ e→ e′

1 2 4 5 6 7

1 5 6

1 2 43 5 7

p1

p2

p3

2018-1

2-1

6

DS - Time,clocks,events

Real clocks vs logical clocks

Scalar logical clocks

Scalar logical clocks

LC(p12) < LC(p2

3), yet p12 6→ p2

3

Page 38: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Causal histories clocks

Definition (Causal History)

The causal history of an event e is the set of events that happen-beforee, plus e itself.

θ(e) = {e′ ∈ H | e′ → e} ∪ {e}

Theorem

Causal histories satisfy Strong clock condition

Proof.

∀e 6= e′ : LC(e) < LC(e′)def⇔ θ(e) ⊂ θ(e′)⇔ e ∈ θ(e′)⇔ e→ e′

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 31 / 96

Page 39: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Example

Problem:Causal histories tend to grow too much; they cannot be used as“timestamps” for messages.

e1 1 e2

1 e3 1 e4

1 e5 1 e6

1

e1 2 e2

2 e3 2

e1 3 e2

3 e3 3 e4

3 e5 3 e6

3

reqreq

respreq

resp

req

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 32 / 96

Page 40: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Vector clocks

Causal history projection: θi(e) = θ(e) ∩ hi = hciiθ(e) = θ1(e) ∪ θ2(e) ∪ . . . ∪ θn(e) = hc1

1 ∪ hc22 ∪ . . . ∪ hcnn

In other words, θ(e) is a cut, which happens to be consistent.

Cuts can be represented by their frontiers: θ(e) = (c1, c2, . . . , cn)

Definition

The vector clock associated to event e is a n-dimensional vector VC (e)such that

VC (e)[i] = ci where θi(e) = hcii

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 33 / 96

Page 41: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Vector clocks: implementation

Each process pi maintains a vector clock VC i, initially all zeroes;

When event ei is executed, its vector clock is updated and assumesthe value of VC (ei);

If ei = send(m, ∗), the timestamp of m is TS(m) = VC (ei);

Update rule

When event ei is executed by process pi, VC i is updated as follows:

If ei is an internal or send event:

VC i[i] = V Ci[i] + 1

If ei = receive(m):

VC i[j] = max{V Ci[j], TS(m)[j]} ∀j 6= iVC i[i] = V Ci[i] + 1

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 34 / 96

Page 42: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Example

(1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3)

(0,1,0) (1,2,4)

(4,3,4)

(0,0,1) (1,0,2) (1,0,3)(1,0,4) (1,0,5) (5,1,6)

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 35 / 96

Page 43: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Properties of Vector clocks

“Less than” relation for Vector clocks

V < V ′ ⇔ (V 6= V ′) ∧ (∀k : 1 ≤ k ≤ n : V [k] ≤ V ′[k])

Strong Clock Condition

e→ e′ ⇔ VC (e) < V C(e′)⇔ θ(e) ⊂ θ(e′)

Simple Strong Clock Condition

ei → ej ⇔ VC (ei)[i] ≤ V C(ej)[i]

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 36 / 96

Page 44: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Properties of Vector clocks

Definition (Concurrent events)

Events ei and ej are concurrent (i.e. ei||ej) if and only if:

(V C(ei)[i] > V C(ej)[i]) ∧ (V C(ej)[j] > V C(ei)[j])

In other words, event ei does not happen-before ej , and ej does nothappen before ei.

Example: ?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 37 / 96

Page 45: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Concurrent events

(V C(ei)[i] > V C(ej)[i]) ∧ (V C(ej)[j] > V C(ei)[j])

(1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3)

(0,1,0) (1,2,4)

(4,3,4)

(0,0,1) (1,0,2) (1,0,3)(1,0,4) (1,0,5) (5,1,6)

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 38 / 96

Page 46: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Concurrent events

(V C(ei)[i] > V C(ej)[i]) ∧ (V C(ej)[j] > V C(ei)[j])

(1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3)

(0,1,0) (1,2,4)

(4,3,4)

(0,0,1) (1,0,2) (1,0,3)(1,0,4) (1,0,5) (5,1,6)

p1

p2

p32018-1

2-1

6

DS - Time,clocks,events

Real clocks vs logical clocks

Vector logical clocks

Concurrent events

Example of Concurrent Events: Example: (3,1,3) and (1,2,4)

Page 47: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Properties of vector clocks

Definition (Pairwise Inconsistent)

Events ei and ej with i 6= j are pairwise inconsistent if and only if

(V C(ei)[i] < V C(ej)[i]) ∨ (V C(ej)[j] < V C(ei)[j])

In other words, two events are pairwise inconsistent if they cannotbelong to the frontier of the same consistent cut. The formulacharacterize the fact that the cut include a receive event withoutincluding a send event.

Example: ?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 39 / 96

Page 48: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Pairwise inconsistent

(V C(ei)[i] < V C(ej)[i]) ∨ (V C(ej)[j] < V C(ei)[j])

(1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3)

(0,1,0) (1,2,4)

(4,3,4)

(0,0,1) (1,0,2) (1,0,3)(1,0,4) (1,0,5) (5,1,6)

p1

p2

p3

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 40 / 96

Page 49: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Pairwise inconsistent

(V C(ei)[i] < V C(ej)[i]) ∨ (V C(ej)[j] < V C(ei)[j])

(1,0,0) (2,1,0) (3,1,3) (4,1,3) (5,1,3) (6,1,3)

(0,1,0) (1,2,4)

(4,3,4)

(0,0,1) (1,0,2) (1,0,3)(1,0,4) (1,0,5) (5,1,6)

p1

p2

p32018-1

2-1

6

DS - Time,clocks,events

Real clocks vs logical clocks

Vector logical clocks

Pairwise inconsistent

Example of Pairwise Inconsistent Events: Example: (3,1,3) and (1,0,2)

Page 50: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Real clocks vs logical clocks Vector logical clocks

Properties of Vector Clocks

Definition (Consistent Cut)

A cut defined by (c1, . . . , cn) is consistent if and only if:

∀i, j ∈ [1 . . . n] : VC (ecii )[i] ≥ VC (ecjj )[i])

In other words, a cut is consistent if its frontier does not contain anypairwise inconsistent pair of events.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 41 / 96

Page 51: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 52: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 53: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

A Passive Approach to GPE

How it works

At each (relevant) event, each process sends a notification to themonitor describing it local state

The monitor collects the sequence of notifications, i.e. anobservation of the distributed run.

An observation taken in this way can correspond to:

A consistent run

A run which is not consistent

No run at all

Can you find example of the three cases?Can you explain why this happen?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 42 / 96

Page 54: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Observations which are not runs

Problem

Observations may not correspond to a run because the notificationssent by a single process to the monitor may be delayed arbitrarily andthus arrive in any possible order

Solution

To adopt communication channels between the processes and themonitor that guarantee that notifications are never re-ordered

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 43 / 96

Page 55: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Notification ordering

Definition (FIFO Order)

Two notifications sent by pi to p0 must be delivered in the same orderin which they were sent:

∀m,m′ : send i(m, p0)→ send i(m′, p0)⇒ deliver0(m)→ deliver0(m

′)

Uh? What is “deliver”?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 44 / 96

Page 56: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Delivery Rules

How to order notifications?

To be ordered, each notification m carries a timestamp TS (m)containing “ordering” information

The act of providing the monitor with a notification in the desiredorder is called delivery; the event deliver(m) is thus distinct fromreceive(m).

The rule describing which notifications can be delivered amongthose received is called delivery rule

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 45 / 96

Page 57: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

FIFO Order - Implementation

Each process maintains a local sequence number incremented ateach notification sent

The timestamp of a notification corresponds to the local sequencenumber of the sender at the time of sending

Definition (FIFO Delivery Rule)

If the last notification delivered by p0 from pj has timestamp s, p0 maydeliver “any” notification m received from pj with TS (m) = s+ 1.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 46 / 96

Page 58: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Observations which are not consistent runs

Problem

If we use FIFO order between all processes and p0, all the observationstaken by p0 will be runs; but there is no guarantee that they areconsistent runs.

Solution

To adopt communication channels that guarantee that notifications aredelivered in an order that respects the happen-before relation.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 47 / 96

Page 59: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Notification ordering

Definition (Causal Order)

Two notifications sent by pi and pj to p0 must be delivered followingthe happen-before relation:

∀m,m′ : send i(m, p0)→ send j(m′, p0)⇒ deliver0(m)→ deliver0(m

′)

Question

FIFO order among all channels...Is it sufficient to obtain Causal delivery?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 48 / 96

Page 60: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Example

p1

p2

p3

m

m’

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 49 / 96

Page 61: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Example

p1

p2

p3

m

m’

2018-1

2-1

6

DS - Time,clocks,events

Passive monitoring

Example

• Given that each communication channels is associated with just onemessage, this distributed system satisfies FIFO order.

• However, it does not satisfy Causal order: the deliver of message m′

should be delivered after message m, because the send event of m ispotentially caused by m′.

• Silly example: p1 sends an invitation for a party to p2 and p3 by SMS;p2 sends a message to p3 saying “Are you going to participate to theparty of p1?” p3 gets offended because it has not been invited.

Page 62: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring

Causal delivery and consistent observations

Theorem

If p0 uses a delivery rule satisfying Causal Order, then all of itsobservations will be consistent.

Proof.

Definition of Causal Order ≡ definition of a consistent observation

Three implementations of the causal delivery rule:

Real-time clocks

Logical (scalar) clocks

Vector clocks

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 50 / 96

Page 63: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Causal delivery and consistent observations

Theorem

If p0 uses a delivery rule satisfying Causal Order, then all of itsobservations will be consistent.

Proof.

Definition of Causal Order ≡ definition of a consistent observation

Three implementations of the causal delivery rule:

Real-time clocks

Logical (scalar) clocks

Vector clocks

2018-1

2-1

6

DS - Time,clocks,events

Passive monitoring

Causal delivery and consistent observations

• Suppose, by contradiction, that the monitor obtains an observationwhich is not consistent.

• This means that an event e′ appears before e in the observation, bute→ e′.

• This means that the notification of e′ has been delivered before thenotification of e

• But this is impossible, because send(me, p0)→ send(me′ , p0) and bythe Casual Delivery rule deliver(me)→ deliver(me′), a contradiction.

Page 64: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.1

Passive monitoring with real-time

Initial assumptions

All processes have access to a real-time clock RC

Let RC (e) be the real-time at which e is executed

All notifications are delivered within a time δ

The timestamp of notification m corresponding to an event e isTS (m) = RC (e).

Definition (DR1: Real-time delivery rule)

At any time, deliver all received notifications in increasing timestamporder.

Theorem

Observation O constructed by p0 using DR1 is guaranteed to beconsistent (?)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 51 / 96

Page 65: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.1

Passive monitoring with real-time

Initial assumptions

All processes have access to a real-time clock RC

Let RC (e) be the real-time at which e is executed

All notifications are delivered within a time δ

The timestamp of notification m corresponding to an event e isTS (m) = RC (e).

Definition (DR1: Real-time delivery rule)

At any time, deliver all received notifications in increasing timestamporder.

Theorem

Observation O constructed by p0 using DR1 is guaranteed to beconsistent

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 51 / 96

Page 66: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.1

Stability of messages

Definition (Stability)

A notification m received by p0 is stable at p0 if no notification m′ withTS (m′) < TS (m) can be received in the future by p0

Definition (DR1: Delivery rule for RC )

Deliver all received notifications that are stable at p0, in increasingtimestamp order

Theorem

Observation O constructed by p0 using DR1 is guaranteed to beconsistent

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 52 / 96

Page 67: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.1

Proof

Safety: Clock Condition for RC

e→ e′ ⇒ RC (e) < RC (e′)

Note that RC (e) < RC (e′) 6⇒ e→ e′, but this rule is sufficient toobtain consistent observations, as two notifications e→ e′ are neverdelivered in the incorrect order.

Liveness: Stability

At time t, any message sent by time t− δ is stable.

Note that real-time clocks do not support stability; it is the maximumdelay of messages that enables it.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 53 / 96

Page 68: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.2

Passive monitoring with logical clocks

Initial assumptions

All processes have access to a logical clock LC ;let LC (e) be the logical clock at which e is executed

The timestamp of notification m corresponding to an event e isTS (m) = LC (e)

Definition (DR2: Deliver Rule for LC )

Deliver all received notifications that are stable at p0 in increasingtimestamp order.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 54 / 96

Page 69: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.2

Passive monitoring with logical clocks

Safety: Clock Condition for LC

e→ e′ ⇒ LC (e) < LC (e′)

Note that LC (e) < LC (e′) 6⇒ e→ e′, but this rule is sufficient toobtain consistent observations, as two events e→ e′ are never orderedincorrectly

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 55 / 96

Page 70: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.2

Passive monitoring with logical clocks

Liveness: Stability

We need a way to reproduce the concept of δ in an asynchronoussystem, otherwise no notification will be ever delivered.

Solution

Each process communicates with p0 using FIFO delivery

When p0 receives a notification from pi describing an event e withtimestamp TS (e), it is sure that it will never receive a messagefrom pi describing an event e′ with TS(e′) ≤ TS(e)

Stability of notification m at p0 can be guaranteed when p0 hasreceived at least one notification from all other processes with atimestamp greater or equal than TS (m)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 56 / 96

Page 71: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.2

Problems of Logical Clocks

They add unnecessary delays to observations

They require a constant flux of notifications from all processes

1 2 3 6

2 3 4p1

p2

p01 2 3 5

5

Stable messages

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 57 / 96

Page 72: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.3

Passive Monitoring with Vector Clocks

Variables maintained at p0

M : the set of notifications received but not yet delivered by p0

D: an array, initialized to 0’s, such that D[k] contains TS(m)[k]where m is the last notification delivered by p0 from process pk.

When a notification is deliverable by p0?A notification m ∈M from process pj is deliverable as soon as p0 canverify that there is no other notification m′ (neither in M nor in thechannels) such that send(m′, p0)→ send(m, p0).

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 58 / 96

Page 73: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.3

Implementing Causal Delivery with Vector Clocks

m ∈M : a notification sent by pj to p0

m′: the last notification delivered from process pk, k 6= j

Definition (Weak Gap Detection)

If TS (m′)[k] < TS (m)[k] for some k 6= j, then there exists eventsendk(m′′) such that

sendk(m′, p0)→ sendk(m′′, p0)→ send j(m, p0)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 59 / 96

Page 74: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.3

Implementing Causal Delivery with Vector Clocks

Two conditions to be verified to check if m can be delivered:

There is no earlier message from pj that has not been delivered yet.

Causal Delivery Rule, Part 1 D[j] == TS (m)[j]− 1

∀k 6= j, let m′ be the last message from pk delivered by p0(D[k] = TS (m′)[k]); we must be sure that no message m′′ from pkexists such that: sendk(m′, p0)→ sendk(m′′, p0)→ send j(m, p0)

Causal Delivery Rule, Part 2: ∀k 6= j : D[k] ≥ TS (m)[k]It follows from Weak Gap Detection

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 60 / 96

Page 75: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.3

Example

(D[j] == TS (m)[j]− 1

)∧(∀k 6= j : D[k] ≥ TS (m)[k]

)

p0

p1

p2

VC1 [0,0] [1,0]

VC2 [0,0] [1,1]

D [0,0] [1,0] [1,1]

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 61 / 96

Page 76: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Passive monitoring Passive monitoring, v.3

Final Comments

We have seen how to implement Causal Delivery “many-to-one”

The same rules apply if we implement a mechanism forimplementing “one-to-many” (reliable broadcast)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 62 / 96

Page 77: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 78: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring

Snapshot Protocol

Problem

Goal: To build the global state “on demand” of the monitor.

How: By taking “pictures” (snapshots) of the local state wheninstructed

Challenge: To build a consistent global state.

Applications

Failure recovery: a global state (checkpoint) is periodically saved;to recover from a failure, the system is restored to the last savedcheckpoint.

Distributed garbage collection

Monitoring distributed events (e.g., industrial process control)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 63 / 96

Page 79: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring

Chandy-Lamport Snapshot Algorithm

This particular protocol enables to reason about “channel states”

GPE can be delayed with respect to passive monitoring

Assumption: processes communicate through FIFO channels

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 64 / 96

Page 80: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring

Snapshot Protocol

Channel State

For each channel between pi and pjI xi,j contains the messages sent by pi but not received yet by pjI xj,i contains the messages sent by pj but not received yet by pi

Note: channel state can be obtained by storing appropriateinformation in the local state, but it is complicated.

Recorded information

Each process pi will record its local state σi and the content of itsincoming channels xj,i.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 65 / 96

Page 81: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring

Snapshot Protocol

Assumptions:

Communication channels satisfy FIFO order

Assumptions to be relaxed later:

Access to a real time clock RC ;

Message delays are bounded by some known value δ;

Relative process speeds are bounded;

Message m is tagged with a timestamp TS (m) = RC (e), wheree = send(m).

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 66 / 96

Page 82: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.1

Snapshot Protocol, v.1

1 p0 chooses a time ts far enough in the future that a messagecontaining ts sent by p0 will be received by all processes by ts:

ts = now() + δ

2 p0 sends a message “take a snapshot at ts” to all processes in Π

3 When RC i = ts do:1 pi records its local state2 pi sends an empty message to all processes in Π3 pi starts to record all messages received over each incoming channel

4 When pi receives a message m from j such that TS (m) ≥ ts, itstops recording messages for incoming channel j

5 Each pi sends its recorded local state and channel states to p0

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 67 / 96

Page 83: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.1

Snapshot Protocol, v.1

Liveness The empty messages at 3.2 guarantee liveness.Safety This protocol constructs a consistent global state; actually, thisglobal state did in fact occur. Formally:Let Cs be the cut associated to the constructed global state;

(e ∈ Cs) ∧ (e′ → e) ⇒(e ∈ Cs) ∧ (RC (e′) < RC (e)) ⇒ C.C. : e′ → e⇒ RC (e′) < RC (e)

(e′ ∈ Cs) Cs Def. : e ∈ Cs ⇔ RC (e) < ts

Can we use logical clocks instead of a real time clock?

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 68 / 96

Page 84: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.2

Snapshot Protocol, v.2

From real time clocks to logical clocks:

The construct “ when LC = ts do statement” makes no senseI LC s are not continuous (e.g., they can jump from ts − 1 to ts + 1)I When LC = ts, the event that caused the clock update is already

done.

Solution: before pi executes an event e:I If e is an internal or send event, and LC = ts − 2:

F pi executes eF pi executes statement

I If e = receive(m) ∧ TS (m) ≥ ts and LC < ts − 1:F pi puts LC = ts − 1F pi executes statementF pi executes e

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 69 / 96

Page 85: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.2

Snapshot Protocol, v.2

From real time clocks to logical clocks:

We assume that we can found a logical time ts large enough that amessage containing it sent by p0 will be received by all otherprocesses before ts

Impossible in an asynchronous distributed systems

We will relax this assumption later

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 70 / 96

Page 86: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.2

Snapshot Protocol, v.2

1 p0 chooses a logical time ts large enough

2 p0 sends a message “take a snapshot at ts” to all processes in Π

3 p0 sets its logical clock to ts;

4 When LC i = ts do:1 pi records its local state;2 pi sends an empty message to all processes in Π;3 pi starts to record all messages received over each incoming channel.

5 When pi receives a message m from j such that TS (m) ≥ ts, itstops recording messages for incoming channel j

6 Each pi sends its recorded local state and channel states to p0.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 71 / 96

Page 87: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.3

Snapshot Protocol, v.3

We now remove the need for ts:

A process may receive an ”empty message” from a node before the“take snapshot at ts” is actually received

In other words, it may be already aware of the snapshot protocol

We remove the “at ts” and we use snapshot messages instead ofempty ones

We can now remove logical clocks completely, as messages are nottimestamped any more

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 72 / 96

Page 88: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.3

Snapshot Protocol, v.3

1 p0 sends a message snapshot to itself ;

2 when pi receives snapshot for the first time:I let pj be the sender of this messageI pi records its local state σi;I sends snapshot to all processes in Π;I xk,i ← ∅ ∀k 6= i;

3 when pi receives message m 6= snapshot from pk, k 6= jI xk,i ← xk,i ∪ {m}

4 when pi receives a snapshot from pk, k 6= j beyond the first time:I pi stops recording messages in xk,i;

5 when pi has received a snapshot from all processesI pi then sends its recorded local state and the channel states to p0.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 73 / 96

Page 89: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.3

Example

e1 1 e2

1 e3 1 e4

1 e5 1

e1 2 e2

2 e3 2 e4

2 e5 2

p0

p1

p2

SnapshotsRecorded states

Normal messages State p1

State p2

Incoming channel p1

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 74 / 96

Page 90: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proactive monitoring Snapshot protocol, v.3

Proof of correctness

Theorem

A global state built using the Chandy-Lamport snapshot algorithm isconsistent.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 75 / 96

Page 91: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Proof of correctness

Theorem

A global state built using the Chandy-Lamport snapshot algorithm isconsistent.

2018-1

2-1

6

DS - Time,clocks,events

Proactive monitoring

Snapshot protocol, v.3

Proof of correctness

In order for the global state to be inconsistent, there should be a receiveevent included in the global state for which the corresponding send event isnot included. Let ei = receivei(m) and let ej = send j(m).In order for ej to not belong to the global state, process pj must have re-ceived the first snapshot message before ej . Thus, it has sent a messageto pi containing a snapshot message to pi before sending m. For the FIFOproperty of the channel, pi must have received the snapshot message beforeei. So, it must have stored the local state before receiving m, a contradiction.

Page 92: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 93: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates

Stable Predicates

Problem

Let Σ be a global state built by one of the presented methods

It represents a state of the past, potentially with no bearing to thepresent

Does it make sense to evaluate predicate Φ on it?

A special case: stable predicates

Many systems properties have the characteristic that once they becometrue, they remain true.

Deadlock

Garbage collection

Termination

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 76 / 96

Page 94: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates

A distributed computation may have many runs

Definition (Leads-to relation)

A consistent run R = e1e2 . . . results in a sequence of consistentglobal states Σ0Σ1Σ2 . . ., where Σ0 denotes the initial global state.

We say that a global state Σ leads to to a global state Σ′, denotedΣ R Σ′ in a consistent run R if:

I R results in a sequence of global states Σ0Σ1Σ2 . . .;I Σ = Σi,Σ′ = Σj , i < j.

We write Σ Σ′ if there is a run R such that Σ R Σ′.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 77 / 96

Page 95: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates

A distributed computation may have many runs

Definition (Lattice)

The set of all consistent global states of a computation along withthe leads-to relation defines a lattice;

n orthogonal axis, one per process;

Σk1...kn shorthand for the global state (σk11 , . . . , σ

knn );

The level of Σk1...kn is equal to k1 + · · ·+ kn.

A path in the lattice is a sequence of global states of increasinglevels that corresponds to a consistent run.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 78 / 96

Page 96: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates

6 Consistency

52

12

22

42

32

61

2

11

41

1

51

31

21

11

21

132231

3241

3342

43

4453

455463

5564

65

10

00

12

23

01

02

14

24

34

35

03

04

Figure 3. A Distributed Computation and the Lattice of its Global States

UBLCS-93-1 9

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 79 / 96

Page 97: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates

Stable Predicates

Consider a global state construction protocol:

Let Σa be the global state in which the protocol is initiated;

Let Σf be the global state in which the protocol terminates;

Let Σs be the global state constructed by the protocol

Since Σa Σs Σf , if Φ is stable, then:

Φ(Σs) = true ⇒ Φ(Σf ) = true

Φ(Σs) = false ⇒ Φ(Σa) = false

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 80 / 96

Page 98: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection

Deadlock Detection

Code

Server code

Server code, modified for snapshot protocol

Monitor code for snapshot protocol

Server code, modified for passive protocol

Monitor code for passive protocol

Notes

No need to store channel state in this case

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 81 / 96

Page 99: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection

Server code

Process pi

Queue pending← new Queue()boolean working← falsewhile true do

while working or pending.size() = 0 doreceive 〈m, pj〉if m.type = request then

pending.enqueue(〈m, pj〉)else if m.type = response then〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pk

while not working and pending.size() > 0 do〈m, pj〉 ← pending.dequeue()〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pk

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 82 / 96

Page 100: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through active monitoring

Deadlock Detection through Snapshot

Approach

All channels are based on FIFO delivery

Add code to deal with Snapshot messages

Pros and Cons

Generates overhead only when deadlock is suspected

Introduces a delay between deadlock and detection

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 83 / 96

Page 101: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through active monitoring

Server code, modified for active monitoring (1)

Process pi

Queue pending← new Queue()boolean working← falseboolean[] blocking← {false, ..., false}while true do

while working or pending.size() = 0 doreceive 〈m, pj〉if m.type = request then

blocking[j]← truepending.enqueue(〈m, pj〉)

else if m.type = response then〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pkif m′.type = response then

blocking[k]← false

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 84 / 96

Page 102: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through active monitoring

Server code, modified for active monitoring (2)

Process pi

else if m.type = snapshot thenif s = 0 then

send 〈snapshot, blocking〉 to p0send 〈snapshot〉 to Π− {pi}

s← (s+ 1) mod n

while not working and pending.size() > 0 do〈m, pj〉 ← pending.dequeue()〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pkif m′.type = response then

blocking[k]← false

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 85 / 96

Page 103: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through active monitoring

Monitor code, for active monitoring

Process p0

boolean[ ][ ] wfg← new boolean[1 . . . n][1 . . . n]while true do{ Wait until deadlock is suspected }send 〈snapshot〉 to Πfor k ← 1 to n do

receive 〈m, j〉wfg[j]← m.data

if there is a cycle in wfg then{ the system is deadlocked }

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 86 / 96

Page 104: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through passive monitoring

Deadlock detection through passive monitoring

Approach

Sends a message to p0 for each relevant event

Communication with p0 based on causal delivery

Pros and Cons

Simpler approach, but complexity is hidden by the causal deliverymechanism

Latency limited to message delays

Higher overhead

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 87 / 96

Page 105: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through passive monitoring

Server code, modified for passive monitoring (1)

Process pi

Queue pending← new Queue()boolean working← falsewhile true do

while working or pending.size() = 0 doreceive 〈m, pj〉if m.type = request then

send 〈requested, j, i〉 to p0pending.enqueue(〈m, pj〉)

else if m.type = response then〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pkif m′.type = response then

send 〈responded, i, k〉 to p0

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 88 / 96

Page 106: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through passive monitoring

Server code, modified for passive monitoring (2)

Process pi

while not working and pending.size() > 0 do〈m, pj〉 ← pending.dequeue()〈m′, pk〉 ← nextState(m, pj)working← (m′.type = request)send m′ to pkif m′.type = response then

send 〈responded, i, k〉 to p0

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 89 / 96

Page 107: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Stable predicates Deadlock detection through passive monitoring

Monitor code, for passive monitoring

Process p0

boolean[ ][ ] wfg← new boolean[1 . . . n][1 . . . n]while true do

receive 〈m, pj〉if m.type = responded then

wfg[m.from,m.to]← falseelse

wfg[m.from,m.to]← true

if there is a cycle in wfg then{ the system is deadlocked }

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 90 / 96

Page 108: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Contents

1 Modeling DistributedExecutions

HistoriesHappen-beforeGlobal states and cuts

2 Global predicate evaluationProblem definitionExample: deadlock detection

3 Real clocks vs logical clocksLogical clocksScalar logical clocksVector logical clocks

4 Passive monitoringPassive monitoring, v.1Passive monitoring, v.2Passive monitoring, v.3

5 Proactive monitoringSnapshot protocol, v.1Snapshot protocol, v.2Snapshot protocol, v.3

6 Stable predicatesDeadlock detectionDeadlock detection throughactive monitoringDeadlock detection throughpassive monitoring

7 Non-stable predicates

Page 109: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Non-stable predicates

Problems of non-stable predicates

The condition encoded by the predicate may not persist longenough for it to be true when the predicate is evaluated

If a predicate Φ is found to be true by the monitor, we do notknow whether Φ ever held during the actual run.

Conclusions

Evaluating a non-stable predicate over a single computation makesno sense

The evaluation must be extended to the entire lattice of thecomputation

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 91 / 96

Page 110: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Predicates over entire computations

It is possible to evaluate a predicate over an entire computation usingan observation obtained by a passive monitor.

Possibly(Φ): There exists a consistent observation O of thecomputation such that Φ holds in a global state of O.

Definitely(Φ): For every consistent observation O of thecomputation, there exists a global state of O in which Φ holds.

Example (Debugging)

If Possibly(Φ) is true, and Φ identifies some erroneous state of thecomputation, than there is a bug, even if it is not observed during anactual run.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 92 / 96

Page 111: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Example: Possibly((y − x) = 2), Definitely(x = y)

14 Properties of Global Predicates

12

22

42

32

: 6 : 4

52

: 2

2

1

: 3 : 4 : 5

0 10

11

31

21

41

51

61

0211

0321

132231

3241

3342

43

4453

4554

5564

65

10

00

01

12

23

63

2

Figure 16. Global States Satisfying Predicates and 2

UBLCS-93-1 33

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 93 / 96

Page 112: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Predicates over entire computations

Theorem

Possibly and Definitely are not duals:

¬Possibly(Φ) 6⇔ Definitely(¬Φ)

¬Definitely(Φ) 6⇔ Possibly(¬Φ)

Example

Possibly(x 6= y), Definitely(x = y)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 94 / 96

Page 113: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Algorithms for detecting Possibly and Definitely

We use the passive approach in which processes send notificationsof events relevant to Φ to the monitor p0;

Events are tagged with vector clocks;

The monitor collects all the events and builds the lattice of globalstates.How?

To detect Possibly(Φ): if there exists one global state in which Φis true, then return true, otherwise false.

To detect Definitely(Φ): mark nodes where Φ is true with a value1, the other nodes with value 0. If the cost of the shortest pathbetween the initial state and the final state is larger than 0, returntrue, otherwise false.

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 95 / 96

Page 114: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Non-stable predicates

Algorithms for detecting Possibly and Definitely

Problems

The number of states grows exponentially with the number oftotal events.

Techniques can be used to reduce the number of eventsI Only those relevant to ΦI Forcing periodic synchronizationI Reducing the complexity of predicates (conjunction of local

predicates)

Alberto Montresor (UniTN) DS - Time,clocks,events 2017/05/19 96 / 96

Page 115: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Reading Material

O. Babaoglu and K. Marzullo. Consistent global states of distributed systems:Fundamental concepts and mechanisms.

In S. Mullender, editor, Distributed Systems (2nd ed.). Addison-Wesley, 1993.

http:

//www.disi.unitn.it/~montreso/ds/papers/ConsistentGlobalStates.pdf

Page 116: Distributed Algorithms Time, clocks and the ordering of eventsdisi.unitn.it/~montreso/ds/handouts/03-gpe.pdf · Distributed Algorithms Time, clocks and the ordering of events Alberto

Reality Check: Interesting links

Clocks are bad, or welcome to distributed systemsWhy vector clocks are easyWhy vector clocks are hardWhy Cassandra doesn’t need vector clocks