Synthesis of Fault-Tolerant Distributed Programs
Ali Ebnenasir
Department of Computer Science and Engineering, Michigan State University
East Lansing MI 48824 [email protected]
Advisor: Dr. Sandeep S. Kulkarni
2
Motivation
Programs are subject to unanticipated faults
  As new classes of faults are identified, the corresponding fault-tolerance must be added
How to add fault-tolerance?
  Design a fault-tolerant program from scratch
  Incremental addition of fault-tolerance
How to ensure correctness?
  Verification after the fact
  Automatic synthesis of fault-tolerant programs (correct by construction)
3
Motivation (Continued)
Synthesis of fault-tolerant programs
  Start from a (temporal logic) specification
  Start from the fault-intolerant program
Synthesis of fault-tolerant programs from their fault-intolerant versions has the potential to
  Reuse the behaviors of the fault-intolerant program
  Preserve behaviors that are hard to specify (e.g., efficiency)
Problem: Complexity of synthesis
  A polynomial-time non-deterministic algorithm for the synthesis of fault-tolerant distributed programs [FTRTFT00]
4
Outline
Program and Fault Model
Distribution Model
Problem Statement
Strategy
Current Results
Future Plan
5
Program and Fault Model
Program is identified by its state space and set of transitions
  Finite state space Sp
  Invariant S, fault-span T ⊆ Sp
  Program p, Fault f, Safety ⊆ { (s0, s1) | (s0, s1) ∈ Sp × Sp }
Fault-tolerance
  Satisfy a particular fault-tolerance specification in the presence of faults
  Failsafe, Nonmasking, Masking
[Figure: invariant S and fault-span T inside the state space Sp, with transitions labeled p, f, and p/f]
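The model above can be illustrated with a minimal Python sketch. It assumes one common reading of the fault-span: the set of states reachable from the invariant when both program and fault transitions may execute. The function name `reachable` and the toy states are hypothetical and only serve the illustration.

```python
from collections import deque

def reachable(start, transitions):
    """Return every state reachable from the states in `start`
    by following `transitions`, a set of (s0, s1) pairs."""
    successors = {}
    for s0, s1 in transitions:
        successors.setdefault(s0, set()).add(s1)
    seen = set(start)
    queue = deque(start)
    while queue:
        s = queue.popleft()
        for t in successors.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

# Toy instance (hypothetical): states are integers 0..3.
program = {(0, 1), (1, 0), (2, 0)}   # transitions of p
faults = {(1, 2), (3, 3)}            # transitions of f
invariant = {0, 1}                   # S

# Assumed reading: T is what p and f together can reach from S.
fault_span = reachable(invariant, program | faults)
```

In this toy instance state 3 stays outside the computed fault-span, because no program or fault transition leads to it from the invariant.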
6
Distribution Model
Read/Write restrictions
Example
  A program p with two processes j and k
  Two Boolean variables a and b
  Process j cannot read b
  Can we include the following transition?
    (a=0, b=0) → (a=1, b=0)
  Only if we include the transition
    (a=0, b=1) → (a=1, b=1)
Groups of transitions (instead of individual transitions) must be chosen
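The grouping in this example can be sketched in Python. This is a minimal illustration, assuming states are pairs `(a, b)` and that process j leaves b unchanged because it cannot read it (an assumption; the slide only states the read restriction). The function name `group_for_j` is hypothetical.

```python
def group_for_j(transition):
    """Return the group of transitions that process j must include together.

    Process j cannot read b, so it cannot tell apart two states that
    differ only in b: choosing one transition forces the choice of the
    same change to a for every value of b.
    """
    (a0, b0), (a1, b1) = transition
    if b0 != b1:
        # Assumption: j does not modify a variable it cannot read.
        raise ValueError("process j is assumed not to change b")
    return {((a0, b), (a1, b)) for b in (0, 1)}

# The transition asked about on the slide: (a=0, b=0) -> (a=1, b=0)
group = group_for_j(((0, 0), (1, 0)))
```

For the transition on the slide, the group contains exactly the two transitions shown, so including one without the other is not possible for process j.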
7
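Problem Statement
Input to the synthesis algorithm
  Fault-intolerant program p
  Specification Spec
  Invariant S
  Faults f
Output of the synthesis algorithm
  Fault-tolerant program p'
  Invariant S'
Constraints
  Finite state space
  Distribution restrictions
[Figure: invariants S and S' inside the state space Sp, with transitions of p and f; one region is annotated "No new transition here" and another "New transitions added here"]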
8
Strategy
Theoretical issues
  Develop heuristics
  Explore polynomial-time boundaries
  Analyze fault-intolerant programs
Develop a synthesis framework for
  Developers of fault-tolerance
  Developers of heuristics
9
Theoretical Issues - Heuristics
Apply heuristics to reduce the exponential complexity [SRDS01]
Assign weights to transitions and states based on their usefulness
Different approaches for resolving deadlocks and livelocks
Identify the applicability of heuristics to the problem at hand
  Choose different subsets of heuristics
  Apply in different order
10
Theoretical Issues – Polynomial-Time Boundary
Find properties of programs/specifications where polynomial-time synthesis is possible
Example:
  Algorithmic synthesis of failsafe fault-tolerant programs is NP-hard [ICDCS02]
  Polynomial-time synthesis of failsafe fault-tolerance for monotonic programs and specifications
11
Example for Polynomial-Time Boundary: Monotonicity of Specifications
Definition: A specification spec is positive monotonic with respect to variable x iff, for every s0, s1, s'0, s'1 such that
  x = false in s0 and s1, and x = true in s'0 and s'1,
  the values of all other variables in s0 and s'0 are the same, and
  the values of all other variables in s1 and s'1 are the same:
If the transition (s0, s1) does not violate safety
Then the transition (s'0, s'1) does not violate safety
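For small Boolean state spaces, the definition above can be checked by brute force. The following Python sketch is only an illustration: it assumes all variables are Boolean, represents a state as a dict from variable name to value, and takes the safety specification as a hypothetical predicate `violates(s0, s1)` that returns True when a transition violates safety.

```python
from itertools import product

def is_positive_monotonic(variables, x, violates):
    """Check positive monotonicity of a safety specification w.r.t. x.

    For every pair of transitions that agree on all variables except x,
    if the one with x = False is safe, the one with x = True must be
    safe as well.
    """
    others = [v for v in variables if v != x]
    valuations = list(product((False, True), repeat=len(others)))
    for vals0, vals1 in product(valuations, repeat=2):
        rest0 = dict(zip(others, vals0))
        rest1 = dict(zip(others, vals1))
        s0, s1 = {**rest0, x: False}, {**rest1, x: False}
        t0, t1 = {**rest0, x: True}, {**rest1, x: True}
        if not violates(s0, s1) and violates(t0, t1):
            return False
    return True

# Hypothetical specifications over two Boolean variables x and y.
bad_when_x_false = lambda s0, s1: (not s1["x"]) and s1["y"]
bad_when_x_true = lambda s0, s1: s1["x"] and s1["y"]

spec1_ok = is_positive_monotonic(("x", "y"), "x", bad_when_x_false)
spec2_ok = is_positive_monotonic(("x", "y"), "x", bad_when_x_true)
```

The first specification only forbids transitions whose target has x = False, so raising x to True never introduces a violation; the second forbids targets with x = True, so it fails the check.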
12
Example for Polynomial-Time Boundary: Monotonicity of Programs
Definition: Program p with invariant S is negative monotonic with respect to variable x iff, for every s0, s1, s'0, s'1 such that
  x = true in s0 and s1, and x = false in s'0 and s'1,
  the values of all other variables in s0 and s'0 are the same, and
  the values of all other variables in s1 and s'1 are the same:
If (s0, s1) is a transition of p inside the invariant S
Then (s'0, s'1) is also a transition of p
13
Example for Polynomial-Time Boundary: Theorem
Synthesis of failsafe fault-tolerance can be done in polynomial time if either:
Program is negative monotonic, and Spec is positive monotonic;
Or Program is positive monotonic, and Spec is negative monotonic.
If only one of the two monotonicity requirements holds (only the program is monotonic, or only the specification is), synthesizing failsafe fault-tolerance is still NP-hard.
For many problems, these requirements are easily met. E.g., Agreement, Consensus, and Commit.
14
Example for Polynomial-Time Boundary: Byzantine Agreement
Processes: General g and three non-generals j, k, and l
Variables
  d.g : {0, 1}
  d.j, d.k, d.l : {0, 1, ┴}
  b.g, b.j, b.k, b.l : {true, false}
  f.j, f.k, f.l : {0, 1}
Fault-intolerant program transitions
  d.j = ┴ /\ f.j = 0 → d.j := d.g
  d.j ≠ ┴ /\ f.j = 0 → f.j := 1
Fault transitions
  ¬b.g /\ ¬b.j /\ ¬b.k /\ ¬b.l → b.j := true
  b.j → d.j := 0|1
[Figure: general g and the non-generals j, k, and l]
15
Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
Safety Specification
  Agreement: No two non-Byzantine non-generals can finalize with different decisions
  Validity: If g is not Byzantine, each non-Byzantine non-general process should finalize with the same decision as g
Read/Write restrictions
  Readable variables for process j: b.j, d.j, f.j, d.g, d.k, d.l
  Process j can write d.j, f.j
16
Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
Observation 1: Positive monotonicity of specification with respect to b.j
Observation 2: Negative monotonicity of program, consisting of the transitions of j, with respect to b.k
Observation 3: Negative monotonicity of specification with respect to f.j
Observation 4: Positive monotonicity of program, consisting of the transitions of j, with respect to f.k
17
Example for Polynomial-Time Boundary: Byzantine Agreement (Continued)
Failsafe fault-tolerant program
  d.j = ┴ /\ f.j = 0 → d.j := d.g
  d.j ≠ ┴ /\ ((d.j = d.k) \/ (d.j = d.l)) /\ f.j = 0 → f.j := 1
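The effect of the strengthened guard can be seen in a small Python sketch of the two actions of non-general j. This is an illustration only, not the synthesized program itself: the state layout, the `failsafe` flag, and the names `step_j` and `make_state` are hypothetical, and `None` stands for the undecided value ┴.

```python
BOTTOM = None  # stands for the undecided value ┴

def step_j(state, failsafe):
    """Execute the first enabled action of non-general j.

    Returns the name of the action taken, or None if no action is enabled.
    With failsafe=True the finalize action carries the extra guard
    (d.j = d.k) \\/ (d.j = d.l); with failsafe=False it does not.
    """
    d, f = state["d"], state["f"]
    if d["j"] is BOTTOM and f["j"] == 0:
        d["j"] = d["g"]              # copy the general's decision
        return "copy"
    agrees = d["j"] == d["k"] or d["j"] == d["l"]
    if d["j"] is not BOTTOM and f["j"] == 0 and (agrees or not failsafe):
        f["j"] = 1                   # finalize the decision
        return "finalize"
    return None

def make_state(dj, dk, dl, dg=0):
    """Build a toy state in which j has not finalized yet."""
    return {"d": {"g": dg, "j": dj, "k": dk, "l": dl}, "f": {"j": 0}}

# j disagrees with both k and l: the failsafe guard blocks finalization.
blocked = step_j(make_state(dj=0, dk=1, dl=1), failsafe=True)
# j agrees with k: finalization is allowed.
allowed = step_j(make_state(dj=0, dk=0, dl=1), failsafe=True)
# The fault-intolerant program finalizes regardless.
unguarded = step_j(make_state(dj=0, dk=1, dl=1), failsafe=False)
```

The only difference between the two versions is the added conjunct in the finalize guard, which removes transitions rather than adding new ones.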
18
Theoretical Issues – Analysis of Fault-Intolerant Programs
Analyze the behavior and the structure of the fault-intolerant program.
Example:
  Reasoning about the program in high atomicity; i.e., no distribution restrictions.
  Enhancement of fault-tolerance [ICDCS03].
Take advantage of model checkers.
19
Theoretical Issues – Analysis of Fault-Intolerant Programs
[Figure: the Synthesis Framework and the SPIN Model Checker; labels: Fault-intolerant program, Intermediate program in Promela, Counterexample, Fault-tolerant program]
20
Theoretical Issues: Current Results
Intolerant Program
Masking fault-tolerant
[FTRTFT00]
Failsafe fault-tolerant
[ICDCS02]
Nonmasking fault-tolerant
[ICDCS03]
21
Synthesis Framework
Goals:
  Algorithmic synthesis of fault-tolerant programs from their fault-intolerant versions.
  Easy to integrate new heuristics.
  Easy to change its implementation.
Users:
  Developers of fault-tolerance.
  Developers of heuristics.
Examples:
  A canonical version of Byzantine agreement.
  An agreement program that is subject to Byzantine and failstop faults (1.3 million states).
  A token ring program perturbed by state-corruption faults.
22
Related Work
E.A. Emerson and E.M. Clarke, Using branching time temporal logic to synthesize synchronization skeletons, 1982.
Z. Manna and P. Wolper, Synthesis of communicating processes from temporal logic specifications, 1984.
A. Arora, P.C. Attie, and E.A. Emerson, Synthesis of fault-tolerant concurrent programs, 1998.
P.C. Attie and E.A. Emerson, Synthesis of concurrent programs for an atomic read/write model of computation, 1996.
O. Kupferman and M. Vardi, Synthesis with incomplete information, 1997.
23
Future Plan
Theoretical issues
  Develop more intelligent heuristics to reduce the chance of failure in the synthesis
  Find polynomial-time boundary for other levels of fault-tolerance
Synthesis framework issues
  Scalability of the synthesis framework for larger programs
  Implement the synthesis algorithm on a distributed platform
24
Future Plan - Continued
Synthesis framework issues
  Use model checkers for behavioral analysis
  Query
    Intermediate program
    Reachability analysis from a given state
  Result set
    Deadlock states
    Non-progress cycles
    Finite sequence of states
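The kind of query described above, reachability analysis from a given state with deadlock states in the result set, can be sketched in a few lines of Python. This is a stand-alone illustration on an explicit transition set; it is not the interface of the synthesis framework or of a model checker, and the function name and toy states are hypothetical.

```python
def deadlocks_reachable_from(start, transitions):
    """Return the deadlock states reachable from `start`.

    `transitions` is a set of (s0, s1) pairs; a deadlock state is a
    reachable state with no outgoing transition.
    """
    successors = {}
    for s0, s1 in transitions:
        successors.setdefault(s0, []).append(s1)
    seen, stack, deadlocks = {start}, [start], set()
    while stack:
        s = stack.pop()
        nxt = successors.get(s, [])
        if not nxt:
            deadlocks.add(s)
        for t in nxt:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return deadlocks

# Toy intermediate program (hypothetical): s2 is a reachable deadlock,
# s5 is a deadlock that cannot be reached from s0.
toy = {("s0", "s1"), ("s1", "s2"), ("s1", "s3"), ("s3", "s1"), ("s4", "s5")}
result = deadlocks_reachable_from("s0", toy)
```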
25
Publications
[ICDCS02] Sandeep S. Kulkarni and Ali Ebnenasir. The Complexity of Adding Failsafe Fault-Tolerance. The 22nd International Conference on Distributed Computing Systems, July 2-5, 2002, Vienna, Austria.
[ICDCS03] Sandeep S. Kulkarni and Ali Ebnenasir. Enhancing The Fault-Tolerance of Nonmasking Programs. Accepted in the 23rd International Conference on Distributed Computing Systems, May 19-22, 2003, Providence, Rhode Island, USA.
[SRDS03] Sandeep S. Kulkarni and Ali Ebnenasir. A Framework for Automatic Synthesis of Fault-Tolerance. Submitted to The 22nd Symposium on Reliable Distributed Systems, October 6-8, 2003, Florence, Italy.
The implementation of the synthesis framework: http://www.cse.msu.edu/~sandeep/software/Code/synthesis-framework/
26
Thank You!
Questions and Comments?
27
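Reduction from 3-SAT
[Figure: reduction gadget over the states a0, ..., an (with an = a0), x0, x1, ..., xn, and x'0, x'1, ..., x'n. For variable x0, one transition is annotated "Included iff x0 is false" and another "Included iff x0 is true". For a clause cj = xj \/ ¬xk \/ xl, transitions are annotated "Included iff xj is false", "Included iff xk is true", and "Included iff xl is false".]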