On the minimal information to encode timestamps in distributed computations


Information Processing Letters 84 (2002) 159–166

www.elsevier.com/locate/ipl


Roberto Baldoni a,∗, Giovanna Melideo b

a Dipartimento di Informatica e Sistemistica, University of Rome “La Sapienza”, Via Salaria 113, 00198 Roma, Italy

b Dipartimento di Matematica Pura ed Applicata, University of L’Aquila, Via Vetoio – Loc. Coppito, 67100 L’Aquila, Italy

Received 22 November 2000; received in revised form 21 January 2002

Communicated by M. Yamashita

Abstract

Timestamping protocols are used to capture the causal order or the concurrency of events in asynchronous distributed computations. In this paper we answer the open problem posed by Schwarz and Mattern [Distrib. Comput. 7 (3) (1994) 149–174] about the minimum amount of information managed by protocols which represent causality in an isomorphic way. We point out that to encode each timestamp an amount of non-structured information (i.e., a number of bits) of

$$\left\lceil \log_2\left((m+1)^n - \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}\right)\right\rceil$$

bits is necessary.

© 2002 Elsevier Science B.V. All rights reserved.

Keywords: Timestamps; Causal pasts; Causality relation; Vector clocks; Distributed computing

1. Introduction

Since Lamport’s seminal paper [3], which formalized the notion of causal dependency between events of an asynchronous distributed computation, a lot of work has been carried out to design distributed protocols that capture causal dependencies (or the concurrency) between events of a computation [2,4–7].

All these protocols associate timestamps with events and piggyback control information on outgoing messages, which is used to update timestamps. If the system of timestamps is an isomorphic embedding of the partial order of the computation, the potential causal precedence or the concurrency between events can be correctly detected just by comparing their timestamps. In such a case, we say that the protocol characterizes causality.

* Corresponding author. E-mail addresses: [email protected] (R. Baldoni), [email protected] (G. Melideo).

Fidge and Mattern simultaneously and independently introduced in [2,4] the vector clocks protocol, which associates each event of the computation with an $n$-dimensional vector of integers (i.e., a vector clock) as timestamp, where $n$ is the number of processes. A few years later, Charron-Bost established in [1] the optimality of vector clocks by proving that if events are timestamped by vectors of integers partially ordered by the canonical vector order $<$ (i.e., $\forall u, v \in \mathbb{N}^l$, $u < v$ iff $u[i] \le v[i]$ for $i = 1, \dots, l$ and $u \ne v$), then causality characterization needs vectors of size at least $n$.
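The canonical vector order recalled above can be illustrated with a minimal Python sketch. The function names `vec_less` and `vec_concurrent` are illustrative choices, not taken from the cited papers, and vectors are assumed to be plain integer sequences of equal length.

```python
from typing import Sequence


def vec_less(u: Sequence[int], v: Sequence[int]) -> bool:
    """Canonical vector order: u < v iff u[i] <= v[i] for every i and u != v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return all(a <= b for a, b in zip(u, v)) and tuple(u) != tuple(v)


def vec_concurrent(u: Sequence[int], v: Sequence[int]) -> bool:
    """Two distinct vectors are incomparable when neither is below the other."""
    return tuple(u) != tuple(v) and not vec_less(u, v) and not vec_less(v, u)
```

For example, `vec_less((1, 0), (1, 1))` holds, while `(2, 0)` and `(1, 1)` are incomparable under this order.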

0020-0190/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0020-0190(02)00241-7


However, one question is still open: are vector clocks an optimal system (in terms of size of timestamps) with respect to any partial order isomorphic to the one of a distributed computation? The answer to this question, as remarked by Schwarz and Mattern in [6], “... requires some statement about the minimum amount of information that has to be contained in timestamps ...” in order to characterize causality.

This paper provides such a statement by showing the minimum number of bits (i.e., non-structured information) managed by protocols which characterize causality. To this end, we first prove a bijection between the set $\downarrow[\mathcal{E}]$ of causal pasts of events produced by distinct computations and the set $\phi[\mathcal{E}]$ of timestamps which can be associated with events in such computations. Secondly, we present a characterization of the set $\downarrow[\mathcal{E}]$ which allows us to calculate the cardinality of this set. This count and the bijection allow us to formulate a statement on the amount of information which has to be contained in timestamps in order to characterize causality. More precisely, we point out that, if $n$ processes generate $m$ events each (with $m \ge n - 2$), then the encoding of each timestamp requires an amount of non-structured information of at least

$$\left\lceil \log_2\left((m+1)^n - \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}\right)\right\rceil$$

bits for any timestamping protocol which characterizes causality.¹

The rest of this paper is structured as follows. Section 2 introduces the computation model. Section 3 presents the proof of the bijection between $\downarrow[\mathcal{E}]$ and $\phi[\mathcal{E}]$, the count of $|\downarrow[\mathcal{E}]|$ and the proof of the statement. Section 4 concludes the paper, relating our result to the number of bits used by the vector clocks protocol.

2. The model

2.1. Distributed computations

A distributed program is made up of $n$ sequential local programs which communicate and synchronize only by exchanging messages. Message transfer delays are unpredictable but finite. A distributed computation describes the execution of a distributed program. The execution of each local program gives rise to a sequential process $P_i$. Let $\{P_1, P_2, \dots, P_n\}$ be the finite set of sequential processes of the distributed computation. Each $P_i$ is modeled by a totally ordered set of events $E_i$ which correspond to the execution of statements of the local program. Each event can be internal or it can involve the sending or the receipt of a message $m$ (denoted by $\mathit{send}(m)$ and $\mathit{receive}(m)$, respectively).

¹ This value differs by a constant from the number of bits used to encode vector clocks.

Let $E = \bigcup_{i=1}^{n} E_i$ be the set of events produced by a distributed computation. This set is structured as a partial order by Lamport’s causality relation [3], denoted $\prec$ and defined as follows:

Definition 1. The causality relation $\prec \subseteq E \times E$ is the smallest relation satisfying the following conditions: $e \prec e'$ if one of these conditions holds:

• $e$ and $e'$ are events in the same process and $e$ precedes $e'$ (local precedence, denoted by $e \prec_l e'$);

• there exists a message $m$ s.t. $e = \mathit{send}(m)$, $e' = \mathit{receive}(m)$, and $e$ and $e'$ are produced by distinct processes (message precedence, denoted by $e \prec_m e'$);

• $\exists e''$: $e \prec e'' \wedge e'' \prec e'$ (transitive closure).

A distributed computation corresponds to an execution of a distributed program and it is modeled as an irreflexive partial order $(E, \prec)$. Moreover, two events $e$ and $e'$ are concurrent, denoted $e \parallel e'$, if $\neg(e \prec e')$ and $\neg(e' \prec e)$.
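As an illustration of Definition 1, the following Python sketch builds the causality relation as the transitive closure of local and message precedence. The representation is an assumption made for the example only: an event is a pair `(i, j)` meaning the $j$-th event of process $P_i$, every process has `m` events, and `messages` is a set of `(send, receive)` pairs. The sketch does not check that the input is acyclic.

```python
def causality_relation(n, m, messages):
    """Return the set of pairs (e, e2) such that e precedes e2 causally."""
    events = [(i, j) for i in range(1, n + 1) for j in range(1, m + 1)]
    # Direct edges: immediate local successor plus message edges.
    succ = {e: set() for e in events}
    for (i, j) in events:
        if j < m:
            succ[(i, j)].add((i, j + 1))
    for send, receive in messages:
        succ[send].add(receive)
    # Transitive closure: collect everything reachable from each event.
    relation = set()
    for e in events:
        seen, stack = set(), list(succ[e])
        while stack:
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(succ[x])
        relation |= {(e, x) for x in seen}
    return relation


def are_concurrent(e, e2, relation):
    """Distinct events are concurrent when neither precedes the other."""
    return e != e2 and (e, e2) not in relation and (e2, e) not in relation
```

With two processes of two events each and a single message from `(1, 1)` to `(2, 2)`, the relation contains `((1, 1), (2, 2))`, while `(1, 2)` and `(2, 1)` are concurrent.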

2.2. The set of computations

We are interested in formalizing a model able to describe all possible computations that can be generated by the execution of any distributed program. Before introducing this model, it is useful to define the notion of causality relations set as follows:

Definition 2. The set of all causality relations, denoted by $H[E] \subseteq 2^{E \times E}$, is the set of all relations $\prec$ which are causality relations of $E$, i.e., $\prec$ satisfies the following conditions:

• $\prec_l \subseteq \prec$, denoting that $(E, \prec)$ is an $n$-decomposed partial order;

• $\prec$ is irreflexive;

• $\forall e \in E$, $|\{(e, e') \in \prec_m \mid e' \in E\} \cup \{(e', e) \in \prec_m \mid e' \in E\}| \le 1$; this restriction imposes that for every receipt of a message $m$, there is a single sending of $m$.
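The three conditions of Definition 2 can be checked mechanically. The sketch below is one possible reading, under assumptions that are not part of the paper: events are pairs `(i, j)` (the $j$-th event of $P_i$), each process has `m` events, and the candidate message relation is a set of `(send, receive)` pairs. Local precedence is included by construction, and irreflexivity of the closure is tested as acyclicity with a topological sort.

```python
def is_valid_message_relation(n, m, messages):
    """Check that local order plus `messages` yields a causality relation."""
    events = [(i, j) for i in range(1, n + 1) for j in range(1, m + 1)]
    # Each event takes part in at most one message pair.
    endpoints = [e for pair in messages for e in pair]
    if len(endpoints) != len(set(endpoints)):
        return False
    # Messages connect events of distinct processes.
    if any(send[0] == receive[0] for send, receive in messages):
        return False
    # Irreflexivity of the transitive closure is equivalent to acyclicity.
    succ = {e: [] for e in events}
    indegree = {e: 0 for e in events}
    for (i, j) in events:
        if j < m:
            succ[(i, j)].append((i, j + 1))
            indegree[(i, j + 1)] += 1
    for send, receive in messages:
        succ[send].append(receive)
        indegree[receive] += 1
    ready = [e for e in events if indegree[e] == 0]
    removed = 0
    while ready:
        e = ready.pop()
        removed += 1
        for x in succ[e]:
            indegree[x] -= 1
            if indegree[x] == 0:
                ready.append(x)
    return removed == len(events)
```

A relation such as `{((1, 2), (2, 1)), ((2, 2), (1, 1))}` is rejected because it closes a cycle through the local order.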

Without loss of generality, in the rest of the paper let us consider $|E_i|$ equal to $m \ge 0$.² In such a case we can define the set of all possible computations having the same set $E$ as follows:

Definition 3. The set $\mathcal{E} = \{\widehat{E} = (E, \prec) \mid \prec \in H[E]\}$ represents all possible computations having distinct order extensions of a basic $n$-decomposed partial order $(E, \prec_l)$.

Put another way, the set of computations can be seen as the set of all graphs that can be derived from a basic $n$-decomposed graph $(E, \prec_l)$ by adding edges between vertices in different processes without forming cycles and in such a way that pairs of connected vertices are in one-to-one correspondence (i.e., by defining any appropriate relation $\prec_m$). Therefore an event $e \in E$ can play distinct roles (internal/send/receive) depending on $\prec_m$.

2.3. Cuts and causal pasts

Definition 4. A cut $C$ of a set of events is a prefix-closed subset of $(E, \prec_l)$, i.e., a finite subset $C \subseteq E$ such that $e \in C \wedge e' \prec_l e \Rightarrow e' \in C$.

Let $C[E]$ be the set of cuts $C$ of $E$. More specifically, $C[E]$ is the set of prefix-closed subsets of $E$ under the relation $\prec_l$.
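Since a cut is prefix-closed on every chain $E_i$, it is fully described by how many events it takes from each process. The following Python sketch enumerates $C[E]$ in that way; the size-vector representation and the names `all_cuts` and `cut_events` are assumptions of the example, with each process having `m` events.

```python
from itertools import product


def all_cuts(n, m):
    """Every cut as a size vector (h_1, ..., h_n), with 0 <= h_i <= m."""
    return list(product(range(m + 1), repeat=n))


def cut_events(sizes):
    """Expand a size vector into the set of events (i, j) it contains."""
    return {(i, j) for i, h in enumerate(sizes, start=1) for j in range(1, h + 1)}
```

With `n = 3` and `m = 2` this yields 27 cuts, the empty cut `(0, 0, 0)` included.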

Definition 5. The causal past of an event $e$ in a computation $\widehat{E} = (E, \prec) \in \mathcal{E}$ is the cut of events which might have affected $e$. Namely, $\downarrow : E \times H[E] \to C[E]$, where

$$\downarrow(e, \prec) = \{e' \in E \mid e' \prec e\} \cup \{e\}.$$
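Definition 5 can be read operationally as a backward traversal from $e$. The sketch below assumes, for illustration only, that events are pairs `(i, j)` (the $j$-th event of $P_i$, with $j \ge 1$) and that `messages` is a set of `(send, receive)` pairs; the name `causal_past` is hypothetical.

```python
def causal_past(e, messages):
    """Return the causal past of e: e itself plus every event preceding it."""
    sender_of = {receive: send for send, receive in messages}
    past, stack = set(), [e]
    while stack:
        i, j = stack.pop()
        if (i, j) in past:
            continue
        past.add((i, j))
        if j > 1:
            stack.append((i, j - 1))          # local predecessor
        if (i, j) in sender_of:
            stack.append(sender_of[(i, j)])   # matching send event
    return past
```

With a single message from `(1, 1)` to `(2, 2)`, the causal past of `(2, 2)` is `{(2, 2), (2, 1), (1, 1)}`, which is prefix-closed on each process and hence a cut.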

² If an execution of the local program of $P_i$ produces fewer than $m$ events, we assume that the chain is completed with “dummy” internal events.

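Notations. Let $\widehat{E} = (E, \prec)$ be a computation in $\mathcal{E}$. We will denote by:

• $\downarrow[\widehat{E}] = \{\downarrow(e, \prec) \mid e \in E\}$ the set of causal pasts of events in the computation $\widehat{E}$;

• $\downarrow[\mathcal{E}] = \bigcup_{\widehat{E} \in \mathcal{E}} \downarrow[\widehat{E}]$ the set of all causal pasts which can be observed in all possible computations in $\mathcal{E}$.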

As stated by Schwarz and Mattern in [6], Definition 5 of causal pasts implies that:

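Property 1. Causality and causal past notions are related as follows: $\forall \prec \in H[E]$ and $\forall e, e' \in E$ ($e \ne e'$), $e \prec e' \Leftrightarrow e \in \downarrow(e', \prec)$ and $e \parallel e' \Leftrightarrow e \notin \downarrow(e', \prec) \wedge e' \notin \downarrow(e, \prec)$.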

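Property 2. The partial order $(\downarrow[\widehat{E}], \subset)$ of causal pasts of events in a given computation $\widehat{E} \in \mathcal{E}$ is an isomorphic embedding of $\widehat{E}$, i.e.,

(i) the function $\downarrow_\prec$ (that is, the restriction of $\downarrow$ to the given causality relation $\prec$) is injective, and

(ii) $\forall e, e' \in E$ ($e \ne e'$), $\forall \prec \in H[E]$ the following property is satisfied: $e \prec e' \Leftrightarrow \downarrow_\prec(e) \subset \downarrow_\prec(e')$.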

2.4. Timestamping protocols

All methods used to track causality between events are based on timestamps associated with events and on the piggybacking of information upon outgoing messages used to update timestamps (also called message timestamps). A timestamping protocol assigns on-the-fly (that is, during the evolution of a computation) to each event $e \in E$ a value of a suitable partially ordered set $(T, <)$.

A timestamping protocol is usually characterized by (i) a partially ordered set $\mathcal{T} = (T, <)$ called domain of timestamps and (ii) a timestamping function $\phi : E \times H[E] \to T$ which establishes a correspondence between events in a computation and values in $T$ called timestamps.

Notations. Let $\widehat{E} = (E, \prec)$ be a computation in $\mathcal{E}$. We will denote by:

• $\phi[\widehat{E}] = \{\phi(e, \prec) \mid e \in E\}$ the set of timestamps associated with events during the evolution of the computation $\widehat{E}$. The pair $(\phi[\widehat{E}], <)$ denotes the order restriction of $(T, <)$ on $\phi[\widehat{E}]$ and is called system of timestamps;

• $\phi[\mathcal{E}] = \bigcup_{\widehat{E} \in \mathcal{E}} \phi[\widehat{E}]$ the set of all the timestamps which can be associated with events in all possible computations in $\mathcal{E}$.

The aim is to assign values in $T$ to events so that for each computation $\widehat{E} \in \mathcal{E}$ the system of timestamps $(\phi[\widehat{E}], <)$ is an isomorphic embedding of the computation $\widehat{E}$. In this case the causal dependency between two events can be detected only by comparing their timestamps (on-the-fly detection).

More formally, we say the system of timestamps and the protocol characterize causality if the following property is satisfied [1,2,6]:

Property 3. $\forall e, e' \in E$, $\forall \prec \in H[E]$, $e \prec e' \Leftrightarrow \phi(e, \prec) < \phi(e', \prec)$.
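Vector clocks [2,4] are one timestamping function $\phi$ that satisfies Property 3. The sketch below is a simplified offline rendering rather than the on-the-fly protocol: it assumes the whole (acyclic) computation is known, with events as pairs `(i, j)`, `m` events per process, and `messages` a set of `(send, receive)` pairs. Each event sets its own component to its local index and merges the clock of its local predecessor and, for a receive, of the matching send.

```python
def vector_clocks(n, m, messages):
    """Assign to each event (i, j) an n-dimensional vector clock (as a tuple)."""
    sender_of = {receive: send for send, receive in messages}
    clock = {}

    def stamp(e):
        if e in clock:
            return clock[e]
        i, j = e
        v = [0] * n
        if j > 1:
            v = list(stamp((i, j - 1)))                  # inherit local history
        if e in sender_of:
            piggybacked = stamp(sender_of[e])            # timestamp carried by the message
            v = [max(a, b) for a, b in zip(v, piggybacked)]
        v[i - 1] = j                                     # own component counts local events
        clock[e] = tuple(v)
        return clock[e]

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            stamp((i, j))
    return clock
```

Under this convention the $k$-th component of an event's clock counts the events of $P_k$ in its causal past, so the vector is the size vector of the cut $\downarrow(e, \prec)$. With one message from `(1, 1)` to `(2, 2)`, event `(2, 2)` receives the clock `(1, 2)`.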

3. Necessary condition to characterize causality

In this section we first prove a bijection between $\downarrow[\mathcal{E}]$ and $\phi[\mathcal{E}]$ (Corollary 1). Secondly, we characterize properties of causal pasts with respect to the set of cuts $C[E]$ (Theorem 1), by proving that $\downarrow[\mathcal{E}] \subset C[E]$. Third, we count the number of causal pasts contained in $\downarrow[\mathcal{E}]$ (Corollary 3). Finally, as a direct consequence of that count, we provide a statement concerning the number of bits necessary to encode timestamps associated with events.

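Proposition 1. Let $\widehat{E} \in \mathcal{E}$. If a timestamping protocol characterizes causality, the corresponding system of timestamps $(\phi[\widehat{E}], <)$ is an isomorphic embedding of $(\downarrow[\widehat{E}], \subset)$, i.e.,

$$\forall e, e' \in E\ (e \ne e'),\ \forall \prec \in H[E],\quad \downarrow(e, \prec) \subseteq \downarrow(e', \prec) \Leftrightarrow \phi(e, \prec) < \phi(e', \prec).$$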

Proof. The thesis is directly implied by Properties 2 and 3. □

Namely, when we concentrate on a given computation $\widehat{E} \in \mathcal{E}$, there is a bijection between causal pasts in $\downarrow[\widehat{E}]$ and timestamps in $\phi[\widehat{E}]$, i.e., each timestamp assigned to an event during the evolution of $\widehat{E}$ is actually an encoding of its causal past. As a consequence, $\forall \widehat{E} = (E, \prec) \in \mathcal{E}$, there must exist an injective encoding function $\chi_\prec : \downarrow[\widehat{E}] \to T$, depending on $\widehat{E}$, such that the timestamping protocol associates a value with each event in this way: $\forall \widehat{E} = (E, \prec) \in \mathcal{E}$, $\forall e \in E$:

$$\phi(e, \prec) = \chi_\prec(\downarrow(e, \prec)). \tag{1}$$

3.1. An observation

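Property 3 does not state that if a timestamping protocol characterizes causality, then $(\phi[\mathcal{E}], <)$ must be an isomorphic embedding of $(\downarrow[\mathcal{E}], \subset)$. In fact, when considering distinct computations in $\mathcal{E}$, Property 3 is not sufficient to ensure that different timestamps, even in different computations, correspond to different causal pasts and vice versa, i.e., $\forall \widehat{E} = (E, \prec)$, $\forall \widehat{E}' = (E, \prec') \in \mathcal{E}$, $\forall S \in \downarrow[\widehat{E}]$, $\forall S' \in \downarrow[\widehat{E}']$ we have:

$$S \ne S' \Leftrightarrow \chi_\prec(S) \ne \chi_{\prec'}(S'). \tag{2}$$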

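Therefore, Property 3 cannot guarantee that $\forall \widehat{E} = (E, \prec)$, $\widehat{E}' = (E, \prec') \in \mathcal{E}$, $\forall e, e' \in E$:

• different timestamps are always assigned to events with different causal pasts, even in different computations, i.e., $\downarrow(e, \prec) \ne \downarrow(e', \prec') \Rightarrow \phi(e, \prec) \ne \phi(e', \prec')$;

• the same timestamp always corresponds to events with the same causal past:

$$\downarrow_\prec(e) = \downarrow_{\prec'}(e') \Rightarrow \phi_\prec(e) = \phi_{\prec'}(e'). \tag{3}$$

Equivalently, $S \in \downarrow[\widehat{E}] \cap \downarrow[\widehat{E}'] \Leftrightarrow \chi_\prec(S) = \chi_{\prec'}(S)$.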

3.2. An operational analysis

Even though from a theoretical point of view the timestamping protocol can proceed by assigning different timestamps to events with the same causal past or the same timestamp to events with different causal pasts, the following proposition shows that this is not acceptable from an operational point of view.

Proposition 2. If a timestamping protocol characterizes causality, then

$$\forall S, S' \in \downarrow[\mathcal{E}],\quad S \ne S' \Leftrightarrow \chi_\prec(S) \ne \chi_{\prec'}(S').$$


Proof. ($\Leftarrow$) Under the hypothesis that the protocol stores the minimum information to encode causal pasts, Eq. (3) must hold.

($\Rightarrow$) Let us consider two events $e, e'$ in different computations with different causal pasts and the same timestamp. As stated by Property 2, on-the-fly detection needs perfect knowledge of all past events at any time. From an operational point of view, this means that any external observer, whose role is to detect dependencies between events, has to execute an algorithm characterized by a decoding function $\varphi$ which correctly deduces causal pasts from timestamps. As the computation cannot be known a priori, $\varphi$ cannot depend on any computation, so it must be $\varphi : \phi[\mathcal{E}] \to \downarrow[\mathcal{E}]$. Therefore, if the same timestamp were associated with two different causal pasts, no decoding function $\varphi$ would exist. As a consequence, the observer would not be able to find the correct causal past of an event without knowing the computation. □

Corollary 1. If a timestamping protocol characterizes causality, there exists a bijective encoding function $\chi : \downarrow[\mathcal{E}] \to \phi[\mathcal{E}]$ which represents an encoding scheme for the set of all the causal pasts $\downarrow[\mathcal{E}]$.

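Proof. If a protocol characterizes causality then

$$\forall \widehat{E},\ \exists \chi_\prec : \downarrow[\widehat{E}] \to \phi[\widehat{E}]$$

(see Eq. (1)). By Proposition 2, for different computations in $\mathcal{E}$, different $\chi_\prec$ are related as stated by Eq. (2). So, there must exist a well-defined bijective function $\chi : \downarrow[\mathcal{E}] \to \phi[\mathcal{E}]$, such that $\chi = \bigcup_{\widehat{E} \in \mathcal{E}} \chi_\prec$, which is the extension of all the $\chi_\prec$ to $\downarrow[\mathcal{E}]$. □

Proposition 3. If a timestamping protocol characterizes causality, $(\phi[\mathcal{E}], <)$ is actually an isomorphic embedding of $(\downarrow[\mathcal{E}], \subset)$.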

Proof. Corollary 1 establishes the existence of a bijective function $\chi : \downarrow[\mathcal{E}] \to \phi[\mathcal{E}]$. The fact that causality has to be detected on-the-fly, i.e., only by comparing timestamps, and that the computation cannot be known a priori implies that the bijective decoding function $\varphi = \chi^{-1} : \phi[\mathcal{E}] \to \downarrow[\mathcal{E}]$ executed by any external observer must satisfy the following condition: $\forall d, d' \in \phi[\mathcal{E}]$, $d < d' \Leftrightarrow \varphi(d) \subset \varphi(d')$. □

3.3. Causal pasts characterization

In this section we provide a few properties which characterize the set of causal pasts $\downarrow[\mathcal{E}]$ with respect to the set $C[E]$ of all the cuts $C$.

Let us recall that a cut $C$ can be decomposed into $n$ subsets $C_1, \dots, C_n$, where $C_i = C \cap E_i = \{e_{i,1}, e_{i,2}, \dots, e_{i,h_i}\}$ (where $e_{i,j}$ represents the $j$-th event of process $P_i$), for $i = 1, \dots, n$.

Theorem 1. A cut $C$ is a causal past (i.e., $C \in \downarrow[\mathcal{E}] \cap C[E]$) if and only if $C \ne \emptyset$ and, when the number $k$ of nonempty subsets in its decomposition is at least 3, $|C| \ge 2(k - 1)$.³
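The condition of Theorem 1 depends only on the sizes $|C_i|$ of the decomposition, so it can be written as a small predicate. In the Python sketch below a cut is assumed to be given as its size vector `(h_1, ..., h_n)`; the name `is_causal_past` is illustrative.

```python
def is_causal_past(sizes):
    """Theorem 1 as a predicate on the size vector (|C_1|, ..., |C_n|) of a cut."""
    k = sum(1 for h in sizes if h > 0)   # number of nonempty subsets
    if k == 0:
        return False                      # the empty cut is never a causal past
    if k >= 3:
        return sum(sizes) >= 2 * (k - 1)
    return True                           # one or two processes: always realizable
```

For instance, `(1, 1, 1)` is rejected (three processes but only three events), whereas `(1, 1, 2)` is accepted.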

Proof. The proof is made of two parts: necessity and sufficiency.

Necessity. By the definition of causal past, at least $e$ belongs to $\downarrow(e, \prec)$, so $C \ne \emptyset$. Moreover, it is trivial to find some computation in which one can observe causal pasts involving events on at most two processes. If $k \ge 3$ processes have events in the causal past $C$, then at least $k - 1$ processes have to send messages in order to establish a dependency. Each of the $k - 1$ messages contributes two events to $C$. Hence, $2(k - 1)$ is the minimum number of events in $C$ when $k \ge 3$.

Sufficiency. Let $C$ be a cut and $k$ (with $k \ge 1$) be the number of nonempty subsets of the decomposition of $C$. We can distinguish the following cases:

• If $k = 1$, then $C = C_i = \{e_{i,1}, \dots, e_{i,h_i}\}$, for some $i \in \{1, \dots, n\}$. It is easy to see that $C = \downarrow(e_{i,h_i}, \prec)$ in a distributed computation $(E, \prec)$ in which events in $C_i$ are internal or send events.

• If $k = 2$, then there exist distinct indices $i, j$ such that $C = C_i \cup C_j$. In this case $C$ can be the causal past of event $e_{j,h_j}$ in a computation in which $e_{i,h_i}$ and $e_{j,h_j}$ are communication events such that $e_{i,h_i} \prec_m e_{j,h_j}$, and other events are internal events.

• If $k \ge 3$, let $l \le k$ be the number of subsets with only one event. Without loss of generality, we suppose $C_1, \dots, C_l$ are the subsets of the decomposition with only one event and $C_{l+1}, \dots, C_k$ are the subsets with at least two events each.

In Fig. 1 a computation $\widehat{E} = (E, \prec)$ such that $C = \downarrow(e_{k,h_k}, \prec)$ is depicted. $\widehat{E}$ is defined as follows:

(1) $e_{j,h_j} \prec_m e_{j+1,h_{j+1}-1}$, $\forall j = l+1, \dots, k-2$, i.e., $\bigcup_{j=l+1}^{k-2} \{(e_{j,h_j}, e_{j+1,h_{j+1}-1})\} \subseteq \prec$.

(2) $e_{k-1,h_{k-1}} \prec_m e_{k,h_k}$, i.e., $(e_{k-1,h_{k-1}}, e_{k,h_k}) \in \prec$.

(3) Let the set $C \setminus \left(\bigcup_{j=1}^{l} C_j\right) \setminus \left(\bigcup_{j=l+2}^{k-1} \{e_{j,h_j-1}, e_{j,h_j}\}\right) \setminus \{e_{l+1,h_{l+1}}, e_{k,h_k}\}$ be denoted as $C'$. There is a message sent from $e_{j,1}$ to some event $e^{(j)} \in C'$, i.e., $(e_{j,1}, e^{(j)}) \in \prec$, $\forall j = 1, \dots, l$. We note that by hypothesis $|C'| = |C| - l - 2(k - l) + 2 \ge (2k - 2) - 2k + l + 2 = l$, so there are at least $l$ possible receive events for $l$ sendings of messages. The remaining events are internal events.

It follows that $C = \downarrow(e_{k,h_k}, \prec)$ and, therefore, $C$ is a causal past. □

³ Let us consider the notion of consistent cut defined by Mattern in [4] with respect to a given computation $(E, \prec)$ as a prefix-closed set of events under the causality relation $\prec$. By definition, a causal past is a consistent cut, but the converse is not necessarily true. In fact, a cut $C = \{e_{i,1}, e_{j,1}, e_{l,1}\}$ with $i$, $j$, $l$ pairwise distinct is a consistent cut in any computation in which $e_{i,1}, e_{j,1}, e_{l,1}$ are internal or send events. On the contrary, Theorem 1 states that it cannot be a causal past, i.e., there exists no computation which generates an event with causal past $C$. So, denoting by $C[\mathcal{E}]$ the set of all possible consistent cuts which can be observed in different computations belonging to $\mathcal{E}$, we have $\downarrow[\mathcal{E}] \subset C[\mathcal{E}] \subset C[E]$.
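For very small systems, the characterization of Theorem 1 can be cross-checked by exhaustive search. The sketch below is an illustrative experiment, not part of the proof. It enumerates every message relation allowed by Definition 2 for `n` chains of `m` events, with events as pairs `(i, j)` and each event in at most one message, and collects the size vector of every causal past it observes. The function name and representation are assumptions of the example, and the search is only practical for a handful of events.

```python
def causal_past_sizes(n, m):
    """Size vectors of all causal pasts over all computations (brute force)."""
    events = [(i, j) for i in range(1, n + 1) for j in range(1, m + 1)]
    found = set()

    def past(e, sender_of):
        # Backward reachability through local predecessors and matching sends.
        seen, stack = set(), [e]
        while stack:
            x = stack.pop()
            if x in seen:
                continue
            seen.add(x)
            if x[1] > 1:
                stack.append((x[0], x[1] - 1))
            if x in sender_of:
                stack.append(sender_of[x])
        return seen

    def record(sender_of):
        # A message (s, r) closes a cycle exactly when r already reaches s.
        if any(r in past(s, sender_of) for r, s in sender_of.items()):
            return
        for e in events:
            p = past(e, sender_of)
            found.add(tuple(sum(1 for x in p if x[0] == i) for i in range(1, n + 1)))

    def extend(idx, used, sender_of):
        if idx == len(events):
            record(sender_of)
            return
        e = events[idx]
        extend(idx + 1, used, sender_of)           # e is not paired with a later event
        if e in used:
            return
        for f in events[idx + 1:]:
            if f in used or f[0] == e[0]:
                continue
            for s, r in ((e, f), (f, e)):           # try both message directions
                sender_of[r] = s
                extend(idx + 1, used | {e, f}, sender_of)
                del sender_of[r]

    extend(0, frozenset(), {})
    return found
```

For `n = 3` and `m = 2` the search finds every nonempty cut except `(1, 1, 1)`, in agreement with the theorem.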

Theorem 1 allows us to compute the number of cuts which are causal pasts, that is $|\downarrow[\mathcal{E}]|$. This cardinality is fundamental to obtain both the minimum cardinality of $T$ and, consequently, the statement on the amount of information necessary to encode timestamps when the timestamping protocol characterizes causality on-the-fly.

Corollary 2. If $C \in C[E] \setminus \downarrow[\mathcal{E}]$ then, for each nonempty subset $C_i$ of its decomposition, $1 \le |C_i| \le n - 2$.

Proof. If $C$ is a cut which is not a causal past (i.e., $C \in C[E] \setminus \downarrow[\mathcal{E}]$), Theorem 1 implies that $|C| \le 2k - 3$ if its decomposition has $k \ge 3$ nonempty subsets. So, the maximum cardinality of a cut $C \in C[E] \setminus \downarrow[\mathcal{E}]$ is $2n - 3$ when $k = n$ processes have events in $C$. In any cut $C$ with $2n - 3$ events, the cardinality $|C_i|$ of every subset of the decomposition can range from one ($C_i \ne \emptyset$) to $n - 2$, the latter occurring when the cut $C$ has only one event in $n - 1$ processes. □

In the rest of this paper, for simplicity’s sake, we will consider that $E_i$ is a chain of size $m \ge n - 2$. In such a case, Corollary 2 allows us to consider all the possible cuts which cannot be causal pasts (i.e., cuts in $C[E] \setminus \downarrow[\mathcal{E}]$).

Fig. 1. Proof of Theorem 1.

Proposition 4. If $n$ processes generate $m \ge n - 2$ events each, the number of cuts which are not causal pasts is:

$$\left|C[E] \setminus \downarrow[\mathcal{E}]\right| = 1 + \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}. \tag{4}$$

Proof. Any cut $C \in C[E] \setminus \downarrow[\mathcal{E}]$ is either the empty set or has size at most $2k - 3$ events when its decomposition has $k \ge 3$ nonempty subsets.

By applying basic mathematical enumeration, since (i) the $k$ nonempty subsets can be on any of the $n$ processes, and (ii) the number of cuts $C$ of size $h$ which can be decomposed into $k$ nonempty subsets is $\binom{h-1}{h-k}$, it results that the number required is:

$$\sum_{k=3}^{n}\binom{n}{k}\sum_{h=k}^{2k-3}\binom{h-1}{h-k} = \sum_{k=3}^{n}\binom{n}{k}\sum_{h=0}^{k-3}\binom{h+k-1}{h}.$$

Let us denote by $N(h, k)$ the value $\binom{h+k-1}{h}$. $N(h, k)$ represents the number of cuts of size $h$ whose decomposition has at most $k$ nonempty subsets. It can be easily proved that

$$N(h, k) = \sum_{i=0}^{h} N(i, k - 1),$$

so the thesis (4) follows by considering that

$$\sum_{h=0}^{k-3}\binom{h+k-1}{h} = \sum_{h=0}^{k-3} N(h, k) = N(k - 3, k + 1) = \binom{2k-3}{k}$$

and that the empty set is not a causal past. □
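The count in Eq. (4) and the identity on $N(h, k)$ used in the proof can be checked numerically for small parameters. The following Python sketch compares the closed form with a direct enumeration of cuts as size vectors, using the condition of Theorem 1 to decide which cuts are not causal pasts. The function names are illustrative, and the check is an experiment for small $n$ rather than a proof.

```python
from itertools import product
from math import comb


def non_causal_cuts_formula(n):
    """Right-hand side of Eq. (4)."""
    return 1 + sum(comb(n, k) * comb(2 * k - 3, k) for k in range(3, n + 1))


def non_causal_cuts_bruteforce(n, m):
    """Count size vectors in {0..m}^n that violate the condition of Theorem 1."""
    count = 0
    for sizes in product(range(m + 1), repeat=n):
        k = sum(1 for h in sizes if h > 0)
        if k == 0 or (k >= 3 and sum(sizes) <= 2 * k - 3):
            count += 1
    return count


def N(h, k):
    """N(h, k) = C(h + k - 1, h), as defined in the proof."""
    return comb(h + k - 1, h)
```

For $n = 4$ both functions give 10 whenever $m \ge n - 2$, and the value does not change as $m$ grows, which is consistent with the count being independent of $m$.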

Since, if each $E_i$ is a chain of size $m$, $|C[E]| = (m + 1)^n$, one has:

Corollary 3. If $n$ processes generate $m \ge n - 2$ events each, then

$$\left|\downarrow[\mathcal{E}]\right| = (m + 1)^n - 1 - \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}.$$
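To make the role of this cardinality concrete, the sketch below builds one possible non-structured encoding: it lists every cut satisfying the condition of Theorem 1 and numbers them consecutively, reserving `0` for an initial timestamp. This dense numbering is a hypothetical illustration of an encoding function in the spirit of $\chi$, not the scheme of any actual protocol. Comparing two codes requires decoding them first, which is what "non-structured" is taken to mean here.

```python
from itertools import product


def causal_past_vectors(n, m):
    """All size vectors in {0..m}^n that satisfy the condition of Theorem 1."""
    result = []
    for sizes in product(range(m + 1), repeat=n):
        k = sum(1 for h in sizes if h > 0)
        if k >= 1 and (k < 3 or sum(sizes) >= 2 * (k - 1)):
            result.append(sizes)
    return result


def build_codec(n, m):
    """Return (encode, decode) dictionaries; code 0 is kept for the initial timestamp."""
    table = causal_past_vectors(n, m)
    encode = {sizes: index + 1 for index, sizes in enumerate(table)}
    decode = {index: sizes for sizes, index in encode.items()}
    return encode, decode
```

For `n = 3` and `m = 2` the table has $3^3 - 1 - 1 = 25$ entries, matching Corollary 3.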

3.4. A statement on the minimal amount of information to encode timestamps

Corollary 3 and Proposition 3 allow us to state the following:

Statement. If $n$ processes generate $m \ge n - 2$ events each, for any timestamping protocol which characterizes causality the encoding of each element in $T$ requires at least

$$\left\lceil \log_2\left((m+1)^n - \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}\right)\right\rceil$$

bits.

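Proof. Since $|T| \ge |\downarrow[\mathcal{E}]|$, to encode such a number of elements we need at least

$$\left\lceil \log_2 \left|\downarrow[\mathcal{E}]\right| \right\rceil = \left\lceil \log_2\left((m+1)^n - 1 - \sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}\right)\right\rceil$$

bits. The thesis follows by considering that processes use an initial timestamp which does not belong to the set $\phi[\mathcal{E}]$, so we point out that $|T| \ge |\phi[\mathcal{E}]| + 1$. □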
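The bound of the Statement is easy to evaluate for concrete parameters. The Python sketch below computes $|\downarrow[\mathcal{E}]|$ from Corollary 3 and the resulting number of bits. The function names are illustrative, and exact integer arithmetic is used for the ceiling of the logarithm to avoid floating-point rounding.

```python
from math import comb


def causal_past_count(n, m):
    """Number of causal pasts from Corollary 3 (requires m >= n - 2)."""
    if m < n - 2:
        raise ValueError("the count assumes m >= n - 2")
    return (m + 1) ** n - 1 - sum(comb(n, k) * comb(2 * k - 3, k) for k in range(3, n + 1))


def min_timestamp_bits(n, m):
    """Lower bound of the Statement: ceil(log2(|causal pasts| + 1))."""
    values = causal_past_count(n, m) + 1      # one extra value for the initial timestamp
    return (values - 1).bit_length()          # equals ceil(log2(values)) for values >= 1
```

For example, with $n = 3$ and $m = 2$ there are 25 causal pasts, so 26 values must be distinguished and at least 5 bits are required.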

4. Conclusion and open problem

The main result of this paper is a statement on the minimum amount of non-structured information (i.e., number of bits) managed by any timestamping protocol which characterizes causality. This has been done by proving (i) a bijection between $\downarrow[\mathcal{E}]$ and $\phi[\mathcal{E}]$, and (ii) a count of $|\downarrow[\mathcal{E}]|$.

The number of bits claimed by our statement differs from the number of bits used by vector clocks by a constant independent of the computation. This constant captures the difference between the number of cuts and the number of causal pasts of a computation (see Proposition 4).

Thanks to our statement, the open question cited in the introduction (i.e., whether vector clocks are an optimal system, in terms of size of timestamps, with respect to any partial order isomorphic to the one of a distributed computation⁴) can be reworded as follows: how many bits is it necessary to add to the amount provided by our statement to obtain the smallest structured information that represents the partial order of the computation in an isomorphic way? Our conjecture is that to structure timestamps it is necessary to encode also cuts which are not causal pasts (as done by vector clocks). In other words, it is necessary to encode $\sum_{k=3}^{n}\binom{n}{k}\binom{2k-3}{k}$ elements.

Acknowledgements

The authors would like to thank Alberto Marchetti-Spaccamela and Marco Mechelli for several discussions on the initial part of this work. They also thank the anonymous referee for useful comments.

⁴ Let us remark that Charron-Bost provided an answer to this question when considering vectors of integers as timestamps partially ordered by the canonical vector order $<$.

References

[1] B. Charron-Bost, Concerning the size of logical clocks in distributed systems, Inform. Process. Lett. 39 (1991) 11–16.

[2] C.J. Fidge, Timestamps in message passing systems that preserve the partial ordering, in: Proc. of 11th Australian Computer Science Conf., 1988, pp. 55–66.

[3] L. Lamport, Time, clocks and the ordering of events in a distributed system, Comm. ACM 21 (7) (1978) 558–565.

[4] F. Mattern, Virtual time and global states of distributed systems, Parallel Distrib. Algorithms (1988) 215–226.

[5] M. Raynal, M. Singhal, Logical time: Capturing causality in distributed systems, IEEE Comput. 29 (2) (1996) 49–57.

[6] R. Schwarz, F. Mattern, Detecting causal relationships in distributed computations: In search of the holy grail, Distrib. Comput. 7 (3) (1994) 149–174.

[7] M. Singhal, A. Kshemkalyani, An efficient implementation of vector clocks, Inform. Process. Lett. 43 (1992) 47–52.