


Indulgent Algorithms (Preliminary Version)

Rachid Guerraoui Communication Systems Department Swiss Federal Institute of Technology

CH 1015 Lausanne

Abstract

Informally, an indulgent algorithm is a distributed algorithm that tolerates unreliable failure detection: the algorithm is indulgent towards its failure detector. This paper formally characterises such algorithms and states some of their interesting features. We show that indulgent algorithms are inherently safe and uniform. We also state impossibility results for indulgent solutions to divergent problems like consensus, and failure-sensitive problems like non-blocking atomic commit and terminating reliable broadcast.

1 Introduction

Indulgent algorithms. The notion of partial failures is a fundamental characteristic of a distributed system: some of the processes might fail whereas others might keep executing their algorithm. A usual metric to evaluate the reliability of a system is its ability to mask partial failures. Reliable distributed systems are designed to provide continuous service despite the failures of some of their processes. This ability typically relies on some failure detection mechanism that provides hints about which processes are correct and which are not. Distributed algorithms usually differ in the assumptions they make about the reliability of that mechanism. Some algorithms assume failure detectors that accurately detect crashes. For example, the state machine replication algorithm of [16], the election algorithm of [13], and the non-blocking atomic commit algorithm of [17] all assume that any correct process pi accurately detects when any other process pj has failed. Other algorithms make weaker assumptions about failure detectors. For instance, none of the consensus algorithms of [2, 6, 10, 14], or the replication algorithms of [11, 12, 5], excludes the possibility of false failure detections. In a sense, those algorithms are indulgent (towards their failure detector): they can cope with false failure detections.

Safety and uniformity. A problem has an indulgent solution when there exists an indulgent algorithm that solves the problem: that is, the problem can be solved with an algorithm that copes with false failure detections.


As we will show in the paper, some problems do not have indulgent solutions. These include problems like non-blocking atomic commit [17] and terminating reliable broadcast [9]. Even when indulgent solutions exist, they are often complicated. The difficulty of devising an indulgent algorithm intuitively stems from the fact that processes should take some non-trivial "precautions" to cope with unreliable failure detection. Interestingly, and precisely because of those "precautions", indulgent algorithms turn out to have some "good" inherent characteristics: they are safe and uniform. The notions of safety and uniformity will be made precise later in this paper, but to get an intuitive idea of their meaning, consider for instance the rotating-coordinator-based consensus¹ algorithm of [2], which we denote here by ◇S-cons. The algorithm assumes a ◇S failure detector, which ensures strong completeness (eventually every process that crashes is permanently suspected by every correct process) and eventual weak accuracy (eventually some correct process is not suspected by any correct process). Those properties do not prevent a failure detector from making an infinite number of false failure detections. Because the ◇S-cons algorithm needs to cope with those mistakes, it has the following interesting features:

• The ◇S-cons algorithm preserves safety even if neither strong completeness nor eventual weak accuracy is satisfied. In particular, even if crashed processes are never suspected and correct processes are permanently suspected by all, (a) no two processes ever disagree on a decision and (b) no process ever decides on a value that was not proposed by some process. We say that the algorithm is safe.

• Although initially designed to solve consensus, ◇S-cons turns out to solve uniform consensus: a stronger variant of consensus where safety is preserved among all processes, whether they are correct or not.²

¹In consensus [2], every process proposes an initial value 0 or 1, and must decide on a final value such that the three following conditions are satisfied: agreement, i.e., no two correct processes decide differently; validity, i.e., any value decided by a correct process is proposed by some process; and termination, i.e., every correct process eventually decides.

²In uniform consensus [8], besides the termination and validity conditions of consensus, the following condition needs to be satisfied: uniform agreement, i.e., no two processes (correct or not) decide differently. We will show in the paper that consensus and uniform consensus are similar with respect to indulgent algorithms. We have however given in [8] an example of a non-indulgent algorithm that solves consensus but not uniform consensus. The algorithm is non-indulgent because it assumes a perfect failure detector.


We say that the algorithm is uniform.

This paper shows that safety and uniformity are inherent features of indulgent algorithms, i.e., any algorithm that copes with unreliable failure detection is inherently safe and uniform.

On the semantics of unreliability. Characterising indulgent algorithms goes through addressing a technical difficulty: defining what it means for a failure detector to be unreliable. One might be tempted, for instance, to consider as unreliable any failure detector that makes mistakes, e.g., any failure detector that suspects a process to have crashed even if that process is correct. An unreliability degree could then measure the number of mistakes that a failure detector makes [2]. This definition would however be counter-intuitive: a failure detector that never suspects any process (even a faulty one) would be reliable. Furthermore, the definition would only apply to failure detectors that output lists of suspects. The definition given in this paper applies more generally to any failure detector that outputs values that encode information about failures. Furthermore, our definition conveys the intuition that an unreliable failure detector is one that does not provide any process with the ability to distinguish whether another process has crashed or is simply slow. We actually capture three variants of this intuition through three classes of failure detectors: the class of weakly unreliable failure detectors, denoted by ΔU, the class of strongly unreliable failure detectors, denoted by ∇U, and the class of completely unreliable failure detectors, denoted by □U. We consequently define three corresponding classes of indulgent algorithms: weakly indulgent algorithms, which assume a failure detector of class ΔU, strongly indulgent algorithms, which assume a failure detector of class ∇U, and completely indulgent algorithms, which assume a failure detector of class □U.

Contributions. This paper defines the notions of unreliable failure detectors and indulgent algorithms, points out examples of indulgent solutions to well-known agreement problems, then states and proves the following results:

• Safety. Every strongly indulgent algorithm A is safe: informally, if A satisfies a safety property P with a failure detector D, then A satisfies P even if D turns out to be completely unreliable. This property captures some observations made about various (indulgent) algorithms that were given in the literature [2, 4, 6, 11, 12, 5].

• Uniformity. Every weakly indulgent algorithm A is uniform: informally, A cannot violate a property P without violating the locally-failure-insensitive restriction (or correct-restricted [1] version) of P. This result generalises the result of [8] (any algorithm that solves consensus with ◇S also solves uniform consensus).

• Impossibility 1. No weakly indulgent algorithm can satisfy any globally-failure-sensitive property (e.g., non-blocking atomic commit and terminating reliable broadcast in any environment where one process can crash).

• Impossibility 2. No strongly indulgent algorithm can satisfy any divergent property (e.g., consensus if half of the processes can crash). This result generalises the lower bound of [2]: no algorithm can solve consensus using an eventually perfect failure detector in any environment where half of the processes can crash.

Road-map. Section 2 defines our system model. Section 3 defines the notions of unreliable failure detectors and indulgent algorithms. Section 4 discusses the safety of indulgent algorithms. Section 5 discusses the uniformity of indulgent algorithms and gives an impossibility result for globally-failure-sensitive properties (Impossibility 1). Section 6 gives a lower-bound fault-tolerance result for divergent properties (Impossibility 2). Section 7 points out some practical considerations and Section 8 concludes the paper.

2 System Model

We consider an asynchronous computation model augmented with the failure detector abstraction [2, 3]. The model is patterned after the FLP model [7]. Basically, we assume a distributed system composed of a finite set of n processes Ω = {p1, p2, ..., pn} (|Ω| = n > 1). There is no bound on communication delays or process relative speeds. A discrete global clock is assumed, and Φ, the range of the clock's ticks, is the set of natural numbers. The global clock is used for presentation simplicity and is not accessible to the processes.

In the following, we recall some important aspects of the model and we introduce some definitions that are needed to state and prove our results. More details on this model can be found in [3].

2.1 Failure patterns

We assume that processes can only fail by crashing, i.e., they cannot behave maliciously. A process pi is said to crash at time t if pi does not perform any action after time t (the notion of action is recalled below). A correct process is a process that does not crash. Otherwise the process is said to be faulty. A failure pattern is a function F from Φ to 2^Ω, where F(t) denotes the set of processes that have crashed through time t. Failures are permanent, i.e., no process recovers after a crash. In other words, ∀t ≤ t', F(t) ⊆ F(t'). The set of correct processes in a failure pattern F is noted correct(F) and the set of faulty processes in F is noted faulty(F).

An environment E is a set of failure patterns. Environments describe the crashes that can occur in a system. We consider in this paper environments that contain the failure-free pattern F0 and at least one failure pattern where some process crashes.

Failure pattern coverage. Let F1 and F2 be any two failure patterns. We say that F2 covers F1 if ∀t ∈ Φ, F2(t) ⊆ F1(t). For instance, the failure-free pattern F0 (where no process crashes) covers all failure patterns.
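To make the preceding definitions concrete, the following minimal sketch (in Python, with hypothetical names; the paper itself works over the infinite time range Φ) models failure patterns over a finite horizon, together with correct(F), faulty(F) and the covers relation.

# Minimal finite-horizon sketch of failure patterns (hypothetical helper names).
# A failure pattern F maps each time t in 0..HORIZON to the set of processes
# crashed through time t; crashes are permanent, so F(t) grows monotonically.

HORIZON = 10
PROCESSES = {"p1", "p2", "p3"}

def make_pattern(crash_times):
    """crash_times: dict process -> crash time (omit a process if it is correct)."""
    return {t: {p for p, ct in crash_times.items() if ct <= t}
            for t in range(HORIZON + 1)}

def faulty(F):
    return F[HORIZON]                     # processes that eventually crash

def correct(F):
    return PROCESSES - faulty(F)          # processes that never crash

def covers(F2, F1):
    """F2 covers F1 iff, at every time, F2 has crashed no more processes than F1."""
    return all(F2[t] <= F1[t] for t in range(HORIZON + 1))

F0 = make_pattern({})                      # failure-free pattern
F1 = make_pattern({"p1": 3})               # p1 crashes at time 3
assert covers(F0, F1) and not covers(F1, F0)
assert correct(F1) == {"p2", "p3"} and faulty(F1) == {"p1"}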

2.2 Failure detectors

Roughly speaking, a failure detector D is a distributed oracle which provides processes with hints about failure patterns. Each process pi has a local failure detector module of D, denoted by Di. Associated with each failure detector D is a range G_D of values output by the failure detector.³ A failure detector history H with range G is a function H from Ω × Φ to G. For every process pi ∈ Ω, for every time t ∈ Φ, H(pi, t) denotes the value of the failure detector module of process pi at time t, i.e., H(pi, t) denotes the value output by Di at time t.

³Unlike in [3], we denote a range by G instead of R, in order to avoid confusion between runs and failure detector ranges.


A failure detector D is a function that maps each failure pattern F to a set of failure detector histories with range G_D. D(F) denotes the set of possible failure detector histories permitted for the failure pattern F, i.e., each history represents a possible behaviour of D for the failure pattern F.

2.3 Algorithms

An algorithm is a collection A of n deterministic automata Ai (one per process pi). Computation proceeds in steps of the algorithm. In each step of an algorithm A, a process pi atomically performs the following three actions: (1) pi receives a message from some process pj, or the "null" message λ; (2) pi queries and receives a value d from its failure detector module (d is said to be seen by pi); (3) pi changes its state and sends a message (possibly null) to some process. This third action is performed according to (a) the automaton Ai, (b) the state of pi at the beginning of the step, (c) the message received in action 1, and (d) the value d seen by pi in action 2. The message received by a process is chosen non-deterministically among the messages in the message buffer destined to pi, and the null message λ. A configuration is a pair (I, M) where I is a function mapping each process pi to its local state, and M is a set of messages currently in the message buffer. A configuration (I, M) is an initial configuration if M = ∅ (no message is initially in the buffer): in this case, the states to which I maps the processes are called initial states. A step of an algorithm A is a tuple e = (pi, m, d, A), uniquely defined by the algorithm A, the identity of the process pi that takes the step, the message m received by pi, and the failure detector value d seen by pi during the step. A step e = (pi, m, d, A) is applicable to a configuration (I, M) if and only if m ∈ M ∪ {λ}. The unique configuration that results from applying e to configuration C = (I, M) is noted e(C).
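As a rough illustration of these definitions (a sketch with hypothetical Python names, not the paper's formal model), the following code models configurations and the application of a step: a step is applicable when its message is in the buffer or is the null message, and applying it removes the received message, updates the local state through the process automaton, and adds any message the automaton sends.

# Sketch of configurations and step application (hypothetical names).
from dataclasses import dataclass
from typing import Callable, Optional

NULL = None  # stands for the null message (lambda in the paper)

@dataclass(frozen=True)
class Message:
    sender: str
    dest: str
    payload: object

@dataclass
class Configuration:
    states: dict            # process id -> local state
    buffer: set             # messages currently in the message buffer

@dataclass(frozen=True)
class Step:
    process: str                 # pi, the process taking the step
    message: Optional[Message]   # message received (or NULL)
    fd_value: object             # value d seen from the failure detector module

# An automaton takes (state, received message, fd value) and returns
# (new state, message to send or NULL).
Automaton = Callable[[object, Optional[Message], object], tuple]

def applicable(step: Step, config: Configuration) -> bool:
    return step.message is NULL or step.message in config.buffer

def apply_step(step: Step, config: Configuration, automata: dict) -> Configuration:
    """Return e(C): the unique configuration resulting from applying step e to C."""
    assert applicable(step, config)
    new_state, sent = automata[step.process](
        config.states[step.process], step.message, step.fd_value)
    new_buffer = set(config.buffer)
    if step.message is not NULL:
        new_buffer.discard(step.message)
    if sent is not NULL:
        new_buffer.add(sent)
    new_states = dict(config.states)
    new_states[step.process] = new_state
    return Configuration(new_states, new_buffer)

# Tiny usage: an automaton that just records the last failure detector value seen.
def recorder(state, msg, fd_value):
    return fd_value, NULL

C0 = Configuration(states={"p1": None}, buffer=set())
e = Step(process="p1", message=NULL, fd_value={"p2"})
C1 = apply_step(e, C0, {"p1": recorder})
assert C1.states["p1"] == {"p2"}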

2.4 Schedules and runs

A schedule of an algorithm A is a (possibly infinite) sequence S = S[1]; S[2]; ... S[k]; ... of steps of A. A schedule S is applicable to a configuration C if (1) S is the empty schedule, or (2) S[1] is applicable to C, S[2] is applicable to S[1](C) (the configuration obtained from applying S[1] to C), etc.

Let A be any algorithm and D any failure detector. A partial run of A using D is a tuple R = < F, H, C, S, T > where H is a failure detector history such that H ∈ D(F), C is an initial configuration of A, T is a finite sequence of increasing time values, and S is a finite schedule of A such that (1) |S| = |T|, (2) S is applicable to C, and (3) for all k ≤ |S| where S[k] = (pi, m, d, A), we have pi ∉ F(T[k]) and d = H(pi, T[k]).

A run of A using D is a tuple R = < F, H, C, S, T > where H is a failure detector history and H ∈ D(F), C is an initial configuration of A, S is an infinite schedule of A, T is an infinite sequence of increasing time values, and, in addition to the conditions (1), (2) and (3) of a partial run, the two following conditions are satisfied: (4) every correct process takes an infinite number of steps, and (5) every message sent to a correct process pj is eventually received by pj.

Run extension. Let R1 = < F1, H1, C1, S1, T1 > be any partial run of some algorithm A, and R2 = < F2, H2, C2, S2, T2 > any run of A. We say that R2 is an extension of R1 if C1 = C2; ∀t ≤ T1[|T1|], ∀pi ∈ Ω: F2(t) ⊆ F1(t) and H1(pi, t) = H2(pi, t); and ∀i, 1 ≤ i ≤ |T1|: S1[i] = S2[i] and T1[i] = T2[i]. We also say that R1 is a partial run of R2.

2.5 Properties

A property (or a problem) is a set of runs. For instance, consensus is the set of runs for which agreement, termination and validity are satisfied [2]. Each of those sub-properties itself defines a set of runs. We say that a property P holds in a run R if R is in P. We say that P holds in a partial run R if P holds in any extension of R. An algorithm A satisfies a property P if P holds in every run of A.

Failure-detector-insensitivity. We say that a property P is failure-detector-insensitive if whenever P holds for a run R = < F, H, C, S, T >, P holds for any run of the form R' = < F, H', C, S, T >. Informally, P does not depend on the values output by the failure detectors: if two runs R and R' differ only in their failure detector history, then P cannot hold in R and not in R'. In the paper, we consider only such properties. We hence exclude properties of the form "the failure detector is reliable". Such properties would pose a circularity problem, since we use characteristics of failure detectors to derive results about algorithms that satisfy certain properties.

3 Unreliability and Indulgence

Intuitively, a failure detector is unreliable if it does not enable any process to distinguish whether any other process has crashed or not, and an indulgent algorithm is one that copes with unreliable failure detection.

This section captures these intuitions more formally by introducing three classes (sets) of unreliable failure detectors and three corresponding categories of indulgent algorithms. We state some relations between the failure detector classes we introduce here and the classes introduced by Chandra and Toueg [2]. These relations enable us to characterise the indulgence of many algorithms that have been given in the literature.

3.1 Unreliable failure detector classes

Definition (complete unreliability). A failure detector D is completely unreliable if, for every pair of failure patterns F and F', D(F) = D(F').

We denote by □U the class of completely unreliable failure detectors. Consider for example a failure detector D which always suspects all processes (every process suspects every process), in any failure pattern. Obviously, D is of class □U. Similarly, a failure detector D' which always outputs the empty set, at any time, at any process, and in any failure pattern, is also of class □U (D' never suspects any process).
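As a toy illustration (hypothetical Python names), the two completely unreliable detectors just mentioned can be written as local modules whose output never depends on the failure pattern, which is exactly why D(F) is the same set of histories for every F.

# Toy local modules of completely unreliable failure detectors (hypothetical names).
PROCESSES = {"p1", "p2", "p3"}

def suspect_all(pi, t):
    """Local module that suspects every process, regardless of who has crashed."""
    return set(PROCESSES)

def suspect_none(pi, t):
    """Local module that never suspects anyone, regardless of who has crashed."""
    return set()

# Neither output depends on the failure pattern, so both detectors admit exactly
# the same histories for every failure pattern F: they are of class □U.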

Definition (strong unreliability). A failure detector D is strongly unreliable if, for every pair of failure patterns F and F', for every history H ∈ D(F), for every time tk ∈ Φ, there is a failure detector history H' ∈ D(F') such that [∀t ≤ tk, ∀pi ∈ Ω, H'(pi, t) = H(pi, t)].


Informally, a failure detector is strongly unreliable if, for arbitrarily long periods of time, it does not distinguish a crashed process from one that is correct, and vice versa. In other words, if a strongly unreliable failure detector D provides some information, at a time t and a process pi, in a failure pattern F, then D could have given the same information, at t and pi, in any other failure pattern F'. We denote by ∇U the class of strongly unreliable failure detectors. Consider Ω = {p1, p2, p3} and let D be any failure detector of class ∇U. Let F1 be any failure pattern where p1 crashes while p2 and p3 are correct. At any time and any process, the information given by D in F1 could have been given by D in any failure pattern F2 where only p2 crashes (or any failure pattern F3 where both p1 and p3 crash, etc.).

Definition (weak unreliability). A failure detector D is weakly unreliable if, for every failure pattern F, for every history H ∈ D(F), for every failure pattern F' that covers F, for every time tk ∈ Φ, there is a failure detector history H' ∈ D(F') such that [∀t ≤ tk, ∀pi ∈ Ω, H'(pi, t) = H(pi, t)].

Informally, a failure detector is weakly unreliable (we simply say unreliable) if, for arbitrarily long periods of time, it does not distinguish a crashed process from one that is correct. In other words, if a weakly unreliable failure detector D provides some information at a time t and a process pi, in a failure pattern F where a process pj is faulty, then D could have given the same information, at t and pi, in a failure pattern F' similar to F, except that pj is correct in F'. Any suspicion by D of process pj may actually turn out to be false, i.e., pj might be correct. We denote by ΔU the class (the set) of weakly unreliable failure detectors. We call elements of ΔU simply unreliable failure detectors. Consider for instance the set of processes Ω = {p1, p2, p3} and let D be any failure detector of class ΔU. Let F1 be any failure pattern where both p1 and p2 crash whereas p3 is correct. At any time and any process, the information output by D in F1 could have been given by D in some failure pattern F2 where only p1 crashes (or some failure pattern F3 where all processes are correct).

Definition (indulgent algorithms). Let A be any algorithm using a failure detector D. We say that A is completely indulgent if D ∈ □U, strongly indulgent if D ∈ ∇U, and weakly indulgent if D ∈ ΔU.

There are obvious examples of weakly (resp. strongly) unreliable failure detectors that are not strongly (resp. completely) unreliable. The proofs below contain examples of such failure detectors. However, any completely (resp. strongly) unreliable failure detector is strongly (resp. weakly) unreliable, i.e., □U ⊂ ∇U ⊂ ΔU (see Figure 1).

Consequently, every completely (resp. strongly) indulgent algorithm is strongly (resp. weakly) indulgent. When there is no need to distinguish between them, we call such algorithms simply indulgent algorithms.

3.2 Failure detector relations

The aim of this section is to point out examples of indulgent algorithms. Instead however of explicitly exhibiting such algorithms, we state some intersection relations between our failure detector classes and the classes defined by Chandra and Toueg in [2]. These relations are depicted in Figure 1. By doing so, we indirectly show that some algorithms that have been described in the literature are indulgent.

Four main failure detector classes were defined in [2], each characterised by a completeness property and an accuracy property: (1) the class P (perfect), characterised by strong completeness (eventually every process that crashes is permanently suspected by every correct process) and strong accuracy (no process is suspected before it crashes); (2) ◇P (eventually perfect), characterised by strong completeness and eventual strong accuracy (eventually no correct process is ever suspected); (3) S (strong), characterised by strong completeness and weak accuracy (some correct process is never suspected); and (4) ◇S (eventually strong), characterised by strong completeness and eventual weak accuracy (eventually some correct process is never suspected).

[Figure 1: Intersection relations between failure detector classes.]

Proposition 3.1 (S ∩ ΔU ≠ ∅) A failure detector can be unreliable and strong.

PROOF: To show this result, we exhibit a "typical" unreliable failure detector Dw that satisfies the strong completeness and weak accuracy properties of S. Failure detector Dw has range 2^Ω and is defined as follows:

• For every failure pattern F, Dw(F) = {H | ∃pk ∈ correct(F), ∀pi ∈ Ω, ∀t ∈ Φ, pk ∉ H(pi, t) and ∃t0 ∈ Φ, ∀pi ∈ Ω, ∀t > t0, H(pi, t) = F(t)}.

Roughly speaking, in every failure pattern F, Dw might suspect all but some correct process pk until some time t0, and after time t0, Dw suspects exactly the crashed processes, i.e., after time t0, Dw behaves like a perfect failure detector. It is easy to see that Dw is a strong failure detector: it satisfies both strong completeness and weak accuracy.
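A finite-horizon sketch of a history of a detector shaped like Dw may help (hypothetical Python names): before a switch-over time t0 it suspects everyone except one designated correct process, and from t0 on it reports exactly the crashed processes.

# Finite-horizon sketch of one history of a Dw-like detector (hypothetical names).
PROCESSES = {"p1", "p2", "p3"}

def dw_history(F, pk, t0, horizon):
    """Build one history H with H[(pi, t)] = set suspected at process pi and time t.

    F:  failure pattern, F[t] = set of processes crashed through time t
    pk: a correct process that is never suspected
    t0: time after which the detector behaves like a perfect one
    """
    H = {}
    for t in range(horizon + 1):
        for pi in PROCESSES:
            if t <= t0:
                H[(pi, t)] = PROCESSES - {pk}   # suspect all but pk
            else:
                H[(pi, t)] = set(F[t])          # suspect exactly the crashed processes
    return H

# Example: p1 crashes at time 3, the detector stabilises at t0 = 5.
F = {t: ({"p1"} if t >= 3 else set()) for t in range(11)}
H = dw_history(F, pk="p3", t0=5, horizon=10)
assert all("p3" not in H[(pi, t)] for pi in PROCESSES for t in range(11))  # weak accuracy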

We show below that Dw is indeed unreliable. Consider any time t0 ∈ Φ, any failure pattern F, and any history H ∈ Dw(F). By the definition of Dw, there is a process pk ∈ correct(F) that is never suspected in H, and a time after which (1) all processes in faulty(F) are permanently suspected by all processes and (2) no correct process is ever suspected. Let F' be any failure pattern that covers F. Consider the history H' such that [∀t ≤ t0, ∀pi ∈ Ω, H'(pi, t) = H(pi, t) and ∀t > t0, H'(pi, t) = F'(t)]. As process pk ∈ correct(F), and correct(F) ⊆ correct(F'), then pk ∈ correct(F'). Process pk is thus never suspected in H and never suspected in H'.


Furthermore, there is a time after which, in H', all processes in faulty(F') are permanently suspected by every correct process, and no correct process is ever suspected by any process. Consequently, H' ∈ Dw(F'), which means that Dw is unreliable. □

Proposition 3.2 (◇P ∩ ∇U ≠ ∅) A failure detector can be strongly unreliable and eventually perfect.

PROOF: To show this result, we exhibit a "typical" strongly unreliable failure detector De which satisfies the strong completeness and eventual strong accuracy properties of ◇P. Failure detector De has range 2^Ω and is defined as follows:

• For every failure pattern F, De(F) = {H | ∃t0 ∈ Φ, ∀pi ∈ Ω, ∀t > t0, H(pi, t) = F(t)}.

Roughly speaking, in every failure pattern F, De might suspect any subset of the processes until some time t0, and after time t0, De suspects exactly the crashed processes (after time t0, De behaves like a perfect failure detector). It is easy to see that De satisfies strong completeness and eventual strong accuracy. We show below that De is indeed strongly unreliable. Consider any time t0 ∈ Φ, any failure pattern F, and any history H ∈ De(F). By the definition of De, there is a time after which all processes in faulty(F) are permanently suspected by every correct process, and no correct process is ever suspected by any process. Consider any failure pattern F'. Consider the history H' such that [∀t ≤ t0, ∀pi ∈ Ω, H'(pi, t) = H(pi, t) and ∀t > t0, H'(pi, t) = F'(t)]. After time t0, all processes in faulty(F') are permanently suspected and no correct process is ever suspected. Consequently, H' ∈ De(F'), which means that De is strongly unreliable. □

Proposition 3.3 (S ∩ ∇U = ∅) No failure detector can be strong and strongly unreliable.

PROOF: (By contradiction) Let D be any failure detector that is both strong and strongly unreliable.

Consider the failure pattern Fj where pj initially crashes and all other processes are correct. Since D is strong, then for every Hj in D(Fj), there is a time t0 after which some correct process pk permanently suspects pj in Hj. Consider time t0 and the failure pattern Fk where pk initially crashes and all other processes are correct. Since D is strongly unreliable, then there is a history Hk in D(Fk) such that ∀t ≤ t0, Hk(pk, t) = Hj(pk, t) and Hk(pj, t) = Hj(pj, t). Since D is strong, then there is a time t1 after which pj permanently suspects pk in Hk. Consider time t2 = sup(t0, t1) and Fj,k the failure pattern where pj and pk are correct and all other processes initially crash. (Remember that we assume a system with at least two processes.) Since D is strongly unreliable, then there is a history Hj,k in D(Fj,k) such that ∀t ≤ t2, Hj,k(pk, t) = Hk(pk, t) and Hj,k(pj, t) = Hk(pj, t). In Hj,k, pk suspects pj and pj suspects pk. Since pj and pk are the only two correct processes in Fj,k, then D does not satisfy weak accuracy, i.e., cannot be strong: a contradiction. □

Proposition 3.4 (P ∩ ΔU = ∅) No failure detector can be perfect and unreliable.

PROOF: (By contradiction) Let D be a failure detector that is both perfect and unreliable. Let F be any failure pattern where some process pj initially crashes and all other processes are correct. Let H be any history in D(F). By the strong completeness property of a perfect failure detector, there is a time t0 after which all correct processes permanently suspect pj. Let pi be any of those processes and let t1 > t0. Consider the failure-free pattern F0. Obviously, F0 covers F (i.e., ∀t ∈ Φ, F0(t) = ∅ ⊆ F(t)). By the definition of an unreliable failure detector, there is a history H0 ∈ D(F0) such that [∀t ≤ t1, ∀p ∈ Ω, H0(p, t) = H(p, t)]. Hence, in history H0 and at time t1, pi suspects pj in F0, which is a failure-free pattern: in contradiction with the strong accuracy property of a perfect failure detector. □

Proposition 3.5 (◇S ∩ □U = ∅) No failure detector can be eventually strong and completely unreliable.

PROOF (SKETCH): An asynchronous system model augmented with a completely unreliable failure detector is equivalent to a pure asynchronous system model. Assume that some completely unreliable failure detector is eventually strong. Such a failure detector would solve consensus [2] and hence would contradict the FLP [7] impossibility result about solving consensus in an asynchronous system if one process can crash.

A simple corollary of Proposition 3.1 (resp. Proposition 3.2) is that any algorithm that uses a strong (resp. eventually perfect) failure detector is indulgent (resp. strongly indulgent). For example, Chandra and Toueg described in [2] an algorithm S-cons that solves consensus using any strong failure detector, and an algorithm ◇S-cons that solves consensus using any eventually strong failure detector in any environment with a majority of correct processes: ◇S-cons is strongly indulgent and S-cons is weakly indulgent. It was also shown in [2] that atomic broadcast and consensus are equivalent. This also means that atomic broadcast has a weakly indulgent solution in any environment, and a strongly indulgent solution in any environment with a majority of correct processes.

4 Safety and Indulgence

We say that a property P is a safety property if any run R where P does not hold has a partial run where P does not hold [15]. We state and prove below that every strongly indulgent algorithm A is safe (Proposition 4.2): informally, if A satisfies a safety property P with a failure detector D, then A satisfies P even if D turns out to be completely unreliable. To state our proposition more formally, we first introduce the notion of failure detector extension.

Failure detector extension. We informally say that a failure detector D' extends a failure detector D if, at any time t and in any failure pattern F, the output given by D' could have been given by D. More precisely, let D1 and D2 be any two failure detectors; we say that D2 extends D1 if for every failure pattern F, for every history H2 ∈ D2(F), for every time tk ∈ Φ, there is a failure detector history H1 ∈ D1(F) such that [∀t ≤ tk, ∀pi ∈ Ω, H2(pi, t) = H1(pi, t)].

Proposition 4.1 Every strongly unreliable failure detector D has a completely unreliable extension D'.


PROOF: Let D be any strongly unreliable failure detector. We construct a failure detector D' that (1) extends D and (2) is completely unreliable. Failure detector D' has the same range as D, i.e., G_D' = G_D, and for every failure pattern F, D'(F) = {H | ∃F', H ∈ D(F')}.

We first show that D' is completely unreliable. Let F1 and F2 be any two failure patterns. Let H1 be any history in D'(F1). By the definition of D', there is a failure pattern F such that H1 is in D(F); hence H1 is also in D'(F2). Thus D'(F1) ⊆ D'(F2) and, by symmetry, D'(F1) = D'(F2).

We now show that D' is an extension of D. Let F' be any failure pattern and H' any history in D'(F'). Consider any time tk ∈ Φ. By the definition of D', there is a failure pattern F such that H' is in D(F). Since D is strongly unreliable, then there is a history H in D(F') such that [∀t ≤ tk, ∀pi ∈ Ω, H'(pi, t) = H(pi, t)]; which means that D' extends D. □
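The construction in the proof of Proposition 4.1 can be read as: D'(F) is the union, over all failure patterns F', of D(F'). A toy finite sketch (hypothetical Python names, with histories represented as opaque hashable values):

# Toy sketch of the extension construction of Proposition 4.1 (hypothetical names).
# D is represented as a mapping: failure pattern id -> set of histories.

def completely_unreliable_extension(D):
    """Return D' with D'(F) = union of D(F') over all failure patterns F'."""
    all_histories = frozenset().union(*D.values())
    return {F: all_histories for F in D}

# Whatever D allows in some pattern, D' allows in every pattern,
# so D'(F1) == D'(F2) for all F1, F2 (complete unreliability).
D = {"F0": frozenset({"h0"}), "F1": frozenset({"h1", "h2"})}
Dprime = completely_unreliable_extension(D)
assert Dprime["F0"] == Dprime["F1"] == frozenset({"h0", "h1", "h2"})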

Proposition 4.2 (safety) Let A be any algorithm and P any safety property. If A satisfies P using a failure detector D, then A satisfies P using any extension of D.

PROOF: (By contradiction) Let A be any algorithm using a failure detector D. Let D' be any extension of D with which A does not satisfy P. Since P is a safety property, then there is a partial run R' = < F, H', C, S, T > of A using D' such that P does not hold in R'. Since D' extends D, then there is a failure detector history H ∈ D(F) such that [∀t ≤ T[|T|], ∀pi ∈ Ω, H'(pi, t) = H(pi, t)]. Partial run R = < F, H, C, S, T > is also a partial run of A because (1) |S| = |T|, (2) S is applicable to C, and (3) for all k ≤ |S| where S[k] = (pi, m, d, A), we have pi ∉ F(T[k]) and d = H(pi, T[k]). Let R'' be any run of A that extends R. Since R'' is an extension of R, then R'' is also an extension of R'. R'' is a run of A and P does not hold in R'': a contradiction. □

A simple corollary of Propositions 4.2 and 4.1 is that if an algorithm A solves a problem P with a strongly unreliable failure detector D, then A always preserves the safety aspects of P, even if A is actually used with an extension of D that is completely unreliable. To illustrate this, consider a failure detector D that may (falsely) suspect any subset of processes until some arbitrary time t, and behaves perfectly after t: D is both strongly unreliable and eventually perfect. An algorithm A that uses D to ensure a given property P (say consensus) will never violate the safety aspects of P (agreement and validity), even if, instead of D, A is actually used with a failure detector D' that keeps behaving in an unreliable manner indefinitely (D' is a completely unreliable extension of D).

Proposition 4.2 actually captures some observations made about various indulgent algorithms [2, 4, 6, 11, 12, 5], which never violate safety, no matter how the system (the failure detector) behaves.

5 Failure Sensitivity and Indulgence

Some of the properties that have been defined in the literature are failure-insensitive: roughly speaking, the behaviour of crashed processes does not impact the validity of these properties. Consensus is a typical example of a failure-insensitive property.

Many properties are however failure-sensitive, and their sensitivity to failures may come in different flavours. Some properties are failure-sensitive in the sense that they also restrict the behaviour of faulty processes. This notion corresponds to the notion of failure-sensitivity in [1]. In uniform consensus for example, every faulty process should also respect agreement and validity. We say that such properties are locally-failure-sensitive. Other properties are failure-sensitive in a different sense. Consider a property, which we call here atomic consensus, defined with the termination and agreement conditions of consensus, plus the following validity condition: 0 can only be decided if some process crashes. Atomic consensus does not restrict the behaviour of crashed processes, but the very fact that some process pi has crashed might globally impact the behaviour of correct processes, even if pi has initially crashed without executing any step. We say that atomic consensus is globally-failure-sensitive. Atomic commitment [17] is locally-failure-sensitive and globally-failure-sensitive; terminating reliable broadcast [9] is globally-failure-sensitive but not locally-failure-sensitive; and uniform consensus is locally-failure-sensitive but not globally-failure-sensitive.

This section establishes two results.

1. Informally, we show that every indulgent algorithm A is uniform (Proposition 5.1): A cannot violate a property P without violating the locally-failure-insensitive restriction of P. This result generalises the result of [8]: any algorithm that solves consensus with ◇S also solves uniform consensus.

2. We also show that globally-failure-sensitive properties do not have indulgent solutions (Proposition 5.5). This result generalises the result of [8]: no algorithm can solve non-blocking atomic commit with ◇P or S if one process can crash.

5.1 Local-failure-sensitivity

Before stating and proving our first result (Proposition 5.1), we first define the notions of local-failure-insensitivity and local-failure-insensitive restriction. These notions are themselves based on the notion of correct-equivalence between runs.

Correct-equivalence. Consider any failure pattern F and any infinite sequence of steps S. We denote by correct(F, S) the restriction of S to the correct processes in F. Let R1 = < F1, H1, C1, S1, T1 > and R2 = < F2, H2, C2, S2, T2 > be any two runs of the same algorithm A. We say that R1 and R2 are correct-equivalent if correct(F1, S1) = correct(F2, S2).
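A small sketch of correct(F, S) (hypothetical Python names, reusing the step records of the sketch in Section 2.3): filter the step sequence, keeping only the steps taken by processes that are correct in F.

# Sketch of the restriction of a schedule to correct processes (hypothetical names).
def correct_restriction(correct_set, schedule):
    """correct(F, S): the subsequence of steps of S taken by processes correct in F."""
    return [step for step in schedule if step.process in correct_set]

# Two runs are correct-equivalent when these restrictions coincide
# (same correct processes, taking the same steps in the same order).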

Local-failure-insensitivity. We say that a property P is locally-failure-insensitive if for any two runs R1 and R2 that are correct-equivalent, P holds in R1 iff P holds in R2.

Consensus is an example of a locally-failure-insensitive property: it does not restrict the behaviour of faulty processes. Consider two runs R and R' of any consensus algorithm, and assume that the set of correct processes is the same in both runs. If consensus holds in run R, and the correct processes behave similarly in R and R', then no matter how the faulty processes behave in R', consensus will indeed hold in R'. We say that a property is locally-failure-sensitive if it is not locally-failure-insensitive. Uniform consensus is an example of a locally-failure-sensitive property.


Locally-failure-insensitive restriction. Every property P has a locally-failure-insensitive part, which we call the locally-failure-insensitive restriction of P. We define the locally-failure-insensitive restriction of a property P as the property, denoted by C(P), such that C(P) does not hold in some run R iff P does not hold in every run that is correct-equivalent to R.

Note that if a property P is locally-failure-insensitive, then C(P) = P. Consensus is for example the locally-failure-insensitive restriction of uniform consensus.

Proposition 5.1 (uniformity) Let A be any indulgent algorithm and P any safety property. If A satisfies C(P) then A satisfies P.

We first introduce three lemmata that are needed to prove the proposition.

Lemma 5.2 Consider any property P and any run R with the failure-free pattern. If P holds in R then P holds in every run that is correct-equivalent to R.

PROOF: Let R1 = < F0, H1, C1, S1, T1 > and R2 = < F0, H2, C2, S2, T2 > be any two runs with the failure-free pattern F0. If R1 and R2 are correct-equivalent, then we have S1 = S2. Since we assume failure-detector-insensitive properties, then for any property P, P holds in R1 iff P holds in R2. □

Lemma 5.3 Consider any property P and any run R with the failure-free pattern. C(P) does not hold in R iff P does not hold in R.

PROOF: If C(P) does not hold in some run R then obviously P does not hold in R. Conversely, consider a run R with the failure-free pattern and assume that P does not hold in R. By Lemma 5.2 above, P does not hold in any run R' that is correct-equivalent to R. Hence, by the definition of the locally-failure-insensitive restriction, C(P) does not hold in R. □

Lemma 5.4 Let R = < F, H, C, S, T > be any partial run of an algorithm A using an unreliable failure detector D. For every failure pattern F' that covers F, there is a failure detector history H' ∈ D(F') such that R' = < F', H', C, S, T > is also a partial run of A.

PROOF: Let R = < F, H, C, S, T > be any partial run of A. Consider the time T[|T|]. By the definition of unreliable failure detectors, for every failure pattern F' that covers F (i.e., such that ∀t ∈ Φ, F'(t) ⊆ F(t)), there is a failure detector history H' ∈ D(F') such that [∀t ≤ T[|T|], ∀pi ∈ Ω, H'(pi, t) = H(pi, t)]. We have |S| = |T|, S is applicable to C, and for all k ≤ |S| where S[k] = (pi, m, d, A), pi ∉ F'(T[k]) and d = H'(pi, T[k]) (as ∀t ≤ T[|T|], ∀pi ∈ Ω, H'(pi, t) = H(pi, t)). Hence R' = < F', H', C, S, T > is also a partial run of A. □

PROOF OF PROPOSITION 5.1: (By contradiction) Let A be any algorithm using an unreliable failure detector D. Assume by contradiction that A satisfies C(P) but does not satisfy P. Hence, there is a partial run of A using D, R = < F, H, C, S, T >, such that P does not hold in R. Let F0 be the failure-free pattern. By Lemma 5.4 above, there is a history H0 ∈ D(F0) such that R0 = < F0, H0, C, S, T > is also a partial run of A. Let R0' = < F0, H0', C', S', T' > be any run of A that extends R0. R0' is also an extension of R, which means that P does not hold in R0'. Since F0 is the failure-free pattern and P does not hold in R0', then by Lemma 5.3 above, C(P) does not hold in R0': a contradiction. □

5.2 Global-failure-sensitivity

Global-failure-sensitivity. We say that a property P is globally-failure-sensitive if there is a configuration C, a failure pattern F, and a failure pattern F' that covers F, such that any run R = < F, H, C, S, T > where P holds has a partial run R' = < F', H', C, S, T > where P does not hold.

Informally, in a globally-failure-sensitive property, the very fact that some process has crashed might globally restrict the behaviour of correct processes. Consider any environment E that contains the failure-free pattern F0 and at least one failure pattern F1 where all processes are correct except a process p1 that initially crashes. Non-blocking atomic commit [17] is for instance globally-failure-sensitive in E. Consider the configuration C where all processes vote yes. Any run R, with configuration C and failure pattern F1, where all correct processes decide abort, satisfies the atomic commit conditions. However, consider any run R', with C and F0, where correct processes decide abort: the atomic commit conditions do not hold in R', since the non-triviality condition of atomic commit is violated [8].

Proposition 5.5 (impossibility 1) No indulgent algorithm can satisfy any globally-failure-sensitive property.

PROOF: (By contradiction) Assume by contradiction that there is an algorithm A using an unreliable failure detector D that satisfies a globally-failure-sensitive property P. Since P is globally-failure-sensitive, there is a configuration C, a failure pattern F, and a failure pattern F' that covers F, such that any run R with F and C where P holds has a partial run R' = < F', H, C, S, T > where P does not hold. Consider time T[|T|]. Since D is unreliable and F' covers F, then there is a failure detector history H' ∈ D(F') such that [∀t ≤ T[|T|], ∀pi ∈ Ω, H(pi, t) = H'(pi, t)]. Partial run R'' = < F', H', C, S, T > is also a partial run of A and any extension of R'' is an extension of R'. Since P does not hold in R', then P does not hold in R'': a contradiction. □

A simple corollary of this result is that, unlike consensus and atomic broadcast, problems like non-blocking atomic commit and terminating reliable broadcast do not have (even weakly) indulgent solutions in environments where one process may crash.

6 Divergence and Indulgence

Many properties in distributed systems involve agreement among a set of processes. Some of these properties have a divergent flavour in the sense that the processes could potentially reach different (and contradictory) values, but in each run, these processes should agree on the same decision.


Consensus is a typical example of a divergent property in environments where half of the processes can crash. In contrast, reliable broadcast is not [9].

We state and show through Proposition 6.1 that divergent properties do not have strongly indulgent solutions. The proposition generalises the lower bound fault-tolerance result of [2], which states that no algorithm can solve consensus using an eventually perfect failure detector if half of the processes can crash (consensus is divergent in any environment where half of the processes can crash).

Before stating and proving our proposition, we first define the notion of divergence, itself based on the notion of run composition.

Run composition. Let F1 and F2 be any two failure patterns. We say that F1 and F2 are disjoint if correct(F1) ∩ correct(F2) = ∅. Let A be any algorithm. Let R1 = < F1, H1, C1, S1, T1 > be any partial run of A and R2 = < F2, H2, C2, S2, T2 > be any run or partial run of A. We say that R2 follows R1 if T2[1] > T1[|T1|].

Now, let F1 and F2 be any two disjoint failure patterns, and let A be any algorithm. Consider any two partial runs of A: R1 = < F1, H1, C1, S1, T1 > and R2 = < F2, H2, C2, S2, T2 > such that R2 follows R1. Consider the failure-free pattern F0. We define the composition of R1 and R2 as the partial run R1.R2 = < F0, H, C, S, T > such that: C = C1; ∀pi ∈ Ω, ∀t ≤ T1[|T1|], H(pi, t) = H1(pi, t), and ∀t, T1[|T1|] < t ≤ T2[|T2|], H(pi, t) = H2(pi, t); ∀k, 1 ≤ k ≤ |T1|: S[k] = S1[k] and T[k] = T1[k]; and ∀k, |T1| < k ≤ |T1| + |T2|: S[k] = S2[k − |T1|] and T[k] = T2[k − |T1|]. It is easy to see that R1.R2 is also a partial run of A.

Divergence. We say that a property P is divergent if there are two disjoint failure patterns F1 and F2 and a configuration C, such that for any algorithm A, the following condition is satisfied. Every run R1 = < F1, H1, C, S1, T1 > of A where P holds has a partial run R1' = < F1, H1', C, S1', T1' > such that any run R2 = < F2, H2, C, S2, T2 > of A that follows R1' where P holds has a partial run R2' such that P does not hold in R1'.R2'. We say that C is a divergent configuration of P for F1 and F2.

To get an intuitive idea of the notion of divergence, consider the set Ω = {p1, p2, p3} and the two following failure patterns: F1, where p1 and p2 initially crash whereas p3 is correct, and F2, where p3 initially crashes whereas p1 and p2 are correct. Consensus is typically divergent in any environment that contains F1 and F2. In fact, consensus has two divergent configurations for F1 and F2: C1, where p1 and p2 initially propose 0 and p3 proposes 1; and C2, where p1 and p2 initially propose 1 and p3 proposes 0. Starting from each of those configurations, and given any consensus algorithm, one could exhibit two partial runs of the algorithm (one where processes decide 1 and one where correct processes decide 0) such that consensus is violated in the composition of those runs.

Proposition 6.1 (impossibility 2) No strongly indulgent algorithm can satisfy any divergent property.

PROOF: (By contradiction) Assume that there is an algorithm A using a strongly unreliable failure detector D that satisfies a divergent property P. Since P is a divergent property, then P has a divergent initial configuration C for two disjoint failure patterns F1 and F2.

Consider failure pattern F1, any failure detector history H1 ∈ D(F1), and the initial configuration C. Let R1 be any run of A of the form R1 = < F1, H1, C, S1, T1 >. Since A satisfies P, then P holds in R1. Since P is divergent, then there is a partial run R1' = < F1, H1', C, S1', T1' > of R1, such that for any run R2 = < F2, H2, C, S2, T2 > where P holds and T2[1] > T1'[|T1'|], there is a partial run R2' of R2 such that P does not hold in R1'.R2'.

Consider time T1'[|T1'|] and failure pattern F2. By the definition of a strongly unreliable failure detector, there is a failure detector history H2 ∈ D(F2) such that [∀t ≤ T1'[|T1'|], ∀pi ∈ Ω, H2(pi, t) = H1(pi, t)]. Consider failure pattern F2, failure detector history H2, and configuration C. Let R2 = < F2, H2, C, S2, T2 > be any run of A starting at time T1'[|T1'|] + 1. Since A satisfies P, then P holds in R2. Since P is divergent, then there is a partial run R2' = < F2, H2', C, S2', T2' > of R2 such that P does not hold in R1'.R2'.

Consider time T2'[|T2'|] and the failure-free pattern F0. By the definition of a strongly unreliable failure detector, there is a failure detector history H0 ∈ D(F0) such that [∀t ≤ T2'[|T2'|], ∀pi ∈ Ω, H0(pi, t) = H2'(pi, t)].

Consider now the following partial run: R0 = < F0, H0, C, S0 = S1'.S2', T0 = T1'.T2' >. We have: |S0| = |T0|, S0 is applicable to C, and for all k ≤ |S0| where S0[k] = (pi, m, d, A), pi ∉ F0(T0[k]) (F0 is the failure-free pattern), and d = H0(pi, T0[k]) because, ∀pi ∈ Ω, (1) ∀t ∈ Φ such that t ≤ T1'[|T1'|], H0(pi, t) = H1'(pi, t), and (2) ∀t ∈ Φ such that T1'[|T1'|] < t ≤ T2'[|T2'|], H0(pi, t) = H2'(pi, t). R0 is thus a partial run of A and any extension R of R0 is also an extension of R1'.R2'. Let R be any extension of R0. Since P does not hold in R1'.R2', P does not hold in R: a contradiction. □

7 Practical Considerations

Distributed systems are rarely perfectly synchronous or completely asynchronous. Process relative speeds and communication delays usually have upper timing bounds that can be determined, but there are sometimes instability periods where those bounds are overrun (Figure 2).

Failure detectors are typically implemented using time-outs, and an application developer is left with a crucial dilemma: either set the time-outs to short values that ensure fast reaction to failures but increase the probability of false suspicions during instability periods, or choose large values that reduce the probability of false suspicions but imply a slow fail-over.

• With a strongly indulgent algorithm, one can safely choose the first option or even consider dynamic time-outs. Despite false suspicions (during the instability periods of the system), safety is uniformly guaranteed among all processes. Liveness is eventually ensured when the system becomes stable again (Figure 2).

• A non-indulgent algorithm might violate safety at the slightest false suspicion, and hence lose any chance to solve the problem, even if the system becomes stable immediately after that suspicion (Figure 2). The failure detector developer must choose here a large time-out value to diminish the risk of false suspicions, but this implies a slow fail-over.
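For concreteness, here is a minimal sketch (hypothetical Python names) of the heartbeat-and-timeout implementation behind this dilemma: a short timeout reacts quickly to crashes but suspects slow processes during instability periods, while a large timeout avoids false suspicions at the price of a slow fail-over.

# Minimal heartbeat/timeout failure detector module sketch (hypothetical names).
import time

class TimeoutFailureDetector:
    """Local module: suspects a process if no heartbeat arrived within timeout seconds."""

    def __init__(self, processes, timeout):
        self.timeout = timeout                       # the crucial tuning knob
        self.last_heartbeat = {p: time.monotonic() for p in processes}

    def on_heartbeat(self, p):
        """Called whenever a heartbeat message from process p is received."""
        self.last_heartbeat[p] = time.monotonic()

    def suspected(self):
        """Output of the module: the set of currently suspected processes."""
        now = time.monotonic()
        return {p for p, t in self.last_heartbeat.items()
                if now - t > self.timeout}

# Short timeout: fast fail-over, but slow (yet correct) processes get falsely
# suspected during instability periods; an indulgent algorithm stays safe anyway.
fast_fd = TimeoutFailureDetector({"p1", "p2", "p3"}, timeout=0.1)
# Large timeout: few false suspicions, but a real crash is detected slowly.
slow_fd = TimeoutFailureDetector({"p1", "p2", "p3"}, timeout=10.0)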

Many practical distributed problems are globally-failure-sensitive and, as we stated in Proposition 5.5, such problems do not have indulgent solutions if one process may crash.


[Figure 2: Stability and instability periods. With a non-indulgent algorithm, safety is violated during the instability period; with an indulgent algorithm, the algorithm is blocked during the instability period and terminates once the stability period begins.]

Fortunately, most of those problems often have a significant globally-failure-insensitive part. To illustrate this, consider non-blocking atomic commit [17]. The problem is defined with four properties: agreement, validity, termination, and non-triviality [8]. As we pointed out, the problem is globally-failure-sensitive. However, the agreement, validity and termination conditions of the problem define a sub-problem that is globally-failure-insensitive. The problematic condition is non-triviality (i.e., abort cannot be decided if all processes vote yes and they are all correct) [8]. This indeed is the property that makes the problem globally-failure-sensitive.

One could hence (1) devise a strongly indulgent algorithm A1 that solves the globally-failure-insensitive part of the problem, and (2) use this algorithm as a sub-algorithm within the global algorithm A that solves the full problem (A would not itself be indulgent). If the failure detector makes mistakes (i.e., during an instability period of the system), A might violate non-triviality and abort transactions that should otherwise have been committed. The developer could choose a large time-out value for the failure detector used by A (to reduce the probability of falsely aborting transactions) and a short time-out for the failure detector used by A1. No matter what happens to the system (i.e., during instability periods), A would never violate agreement or validity. For example, the non-blocking atomic commit algorithm of [8] uses a strongly indulgent sub-algorithm to solve the agreement part of the problem. In contrast, the algorithm of [17] does not rely on any indulgent sub-algorithm and might violate agreement if the failure detector makes mistakes.
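The layering just described can be sketched structurally as follows (hypothetical Python names; this is not the algorithm of [8], only an illustration of the decomposition): the commit wrapper consults an aggressively tuned failure detector only for the non-triviality decision, and delegates the agreement part to an indulgent consensus sub-algorithm driven by its own, more conservative detector. An instance of the TimeoutFailureDetector sketch above could play the role of fast_fd.

# Structural sketch of the layering described above (hypothetical names).
# indulgent_consensus stands for any strongly indulgent consensus algorithm
# (e.g., a rotating-coordinator algorithm in the style of [2]); it is treated
# here as a black box.

def atomic_commit(my_vote, votes_from_others, fast_fd, indulgent_consensus):
    """Decide 'commit' or 'abort' for one transaction at one participant.

    fast_fd:             failure detector with a short timeout, used only to
                         decide whether to propose abort (non-triviality part)
    indulgent_consensus: consensus sub-algorithm using its own, conservatively
                         tuned failure detector (agreement/validity part)
    """
    # Propose 'abort' if any participant voted no, or if the aggressive
    # detector suspects some participant; a false suspicion here can only
    # cause an unnecessary abort, never disagreement.
    all_yes = my_vote == "yes" and all(v == "yes" for v in votes_from_others)
    proposal = "commit" if all_yes and not fast_fd.suspected() else "abort"

    # Agreement and validity are delegated to the indulgent sub-algorithm:
    # whatever the detectors do, all participants decide the same outcome.
    return indulgent_consensus(proposal)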

8 Concluding Remarks

As we pointed out in the introduction, characterising indulgent algorithms goes through defining what it means for a failure detector to be unreliable. We defined three classes of unreliable failure detectors, and our definitions differ significantly from the original failure detector class definitions introduced by Chandra and Toueg in [2].

In [2], the failure detector classes were defined according to desirable completeness and accuracy properties: completeness measures the extent to which a failure detector suspects the crash of faulty processes, while accuracy restricts the mistakes a failure detector can make, i.e., the false failure detections it can report. In contrast, our unreliable failure detector classes were defined according to undesirable properties that capture the intuition of unreliable failure detection. We do not claim to have exhaustively captured the notion of unreliable failure detection through our three failure detector classes. In fact, exploring the space of meaningful definitions for the notion of unreliable failure detector is, we believe, an interesting open issue.

References

[1] R. Bazzi and G. Neiger. Simulating crash failures with many faulty processors. Distributed Algorithms (WDAG'92), Springer Verlag, LNCS 647, 1992.

[2] T. Chandra and S. Toueg. Unreliable Failure Detectors for Reliable Distributed Systems. Journal of the ACM, 43(2), March 1996.

[3] T. Chandra, V. Hadzilacos and S. Toueg. The Weakest Failure Detector for Solving Consensus. Journal of the ACM, 43(4), July 1996.

[4] F. Cristian and C. Fetzer. The Timed Asynchronous System Model. Technical Report CSE 97-519, University of California at San Diego, 1997.

[5] R. De Prisco, B. Lampson, and N. Lynch. Revisiting the Paxos Algorithm. Distributed Algorithms (WDAG'97), Springer Verlag, LNCS 1320, September 1997.

[6] C. Dwork, N. Lynch, and L. Stockmeyer. Consensus in the presence of partial synchrony. Journal of the ACM, 35(2), April 1988.

[7] M. Fischer, N. Lynch, and M. Paterson. Impossibility of Distributed Consensus with One Faulty Process. Journal of the ACM, 32(2), April 1985.

[8] R. Guerraoui. Revisiting the relationship between non-blocking atomic commitment and consensus. Distributed Algorithms (WDAG'95), Springer Verlag, LNCS 972, September 1995.

[9] V. Hadzilacos and S. Toueg. Fault-Tolerant Broadcasts and Related Problems. Technical Report TR 94-1425, Cornell University, 1994.

[10] M. Hurfin and M. Raynal. A Simple and Fast Consensus Algorithm in an Asynchronous System with a Weak Failure Detector. Distributed Computing, 12 (4), 1999.

[11] L. Lamport. The Part-Time Parliament. ACM Transactions on Computer Systems, 16 (2), pages 133-169, May 1998.

[12] B. Liskov and B. Oki. Viewstamped replication: A new primary copy method to support highly-available distributed systems. ACM Symposium on Principles of Distributed Computing (PODC'88), August 1988.

[13] L. Sabel and K. Marzullo. Election Vs. Consensus in Asynchronous Systems. Technical Report TR95-1488, Cornell University, 1995. Also, Technical Report CS95-411, University of California at Santa Barbara, 1995.

[14] A. Schiper. Early Consensus in an Asynchronous System with a Weak Failure Detector. Distributed Computing, (10), 1997.

[15] F.B. Schneider. Decomposing Properties into Safety and Liveness. Technical Report TR87-874, Cornell University, 1987.

[16] F.B. Schneider. Replication Management using the State-Machine Approach. In Distributed Systems, Sape Mullender (ed), Addison-Wesley, 1993.

[17] D. Skeen. NonBlocking Commit Protocols. ACM SIGMOD International Conference on Management of Data, 1981.
