Tolerating Faults in Distributed Systems
Vijay K. Garg
Electrical and Computer Engineering
The University of Texas at Austin
Email: [email protected]
(joint work with Bharath Balasubramanian and John Bridgman)
Fault Tolerance: Replication
2
[Figure: Servers 1-3, each backed up by full replicas ("1 Fault Tolerance", "2 Fault Tolerance")]
Fault Tolerance: Fusion
3
[Figure: Servers 1-3 with one fused backup, tolerating one fault]
Fault Tolerance: Fusion
4
'Fused' servers: fewer backups than replication
[Figure: Servers 1-3 with two fused backups, tolerating two faults]
Motivation
5
           Coding     Replication  Fusion
Space      Efficient  Wasteful     Efficient
Recovery   Expensive  Efficient    Expensive
Updates    Expensive  Efficient    Efficient
Since the probability of failure is low, expensive recovery is acceptable
Outline
Crash Faults
- Space savings
- Message savings
- Complex Data Structures
Byzantine Faults
- Single Fault (f=1), O(1) data
- Single Fault, O(m) data
- Multiple Faults (f>1), O(m) data
Conclusions & Future Work
6
Example 1: Event Counter
7
n different counters counting n different items
count_i = entry(i) - exit(i)
What if one of the processes may crash?
Event Counter: Single Fault
8
fCount1 keeps the sum of all counts
Any crashed count can be recovered using the remaining counts
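The single-fault counter scheme above can be sketched directly; the function names below are mine, not the paper's:

```python
# Sketch: n primary counters plus one fused backup (fCount1) that stores
# their sum. Any single crashed count is the fused sum minus the
# surviving counts.

def fuse(counts):
    """fCount1: the sum of all primary counts."""
    return sum(counts)

def recover(fused, surviving):
    """Recover the one crashed count from the fused sum."""
    return fused - sum(surviving)

counts = [5, 3, 7, 2]          # count_i = entry(i) - exit(i) per process
f1 = fuse(counts)              # fCount1 = 17
# process holding count 7 crashes; its value is recoverable:
assert recover(f1, [5, 3, 2]) == 7
```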
Event Counter: Multiple Faults
9
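The transcript does not preserve this slide's details, but a plausible sketch, assuming (as in the Reed-Solomon coding mentioned on slide 30) that each of f fused backups stores a linear combination with linearly independent coefficients, is:

```python
# Sketch (assumed scheme): fused backup j stores sum_i (i^(j-1) * count_i).
# Any f crashed counts can then be recovered by solving an f x f linear
# system from the surviving counts and the fused values.

counts = [5, 3, 7, 2]                             # count_1..count_4
f1 = sum(counts)                                  # weights 1,1,1,1 -> 17
f2 = sum(i * c for i, c in enumerate(counts, 1))  # weights 1,2,3,4 -> 40

# Suppose count_2 and count_4 crash. Subtract the surviving terms:
r1 = f1 - counts[0] - counts[2]        # = count_2 + count_4     = 5
r2 = f2 - 1*counts[0] - 3*counts[2]    # = 2*count_2 + 4*count_4 = 14
# Solve the resulting 2x2 system:
count_4 = (r2 - 2 * r1) // 2
count_2 = r1 - count_4
assert (count_2, count_4) == (3, 2)
```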
Event Counter: Theorem
10
Shared Events: Aggregation
11
Suppose all processes act on entry(0) and exit(0)
Aggregation of Events
12
Some Applications of Fusion
Causal Ordering of Messages for n Processes
- O(n^2) matrix at each process
- Replication to tolerate one fault: O(n^3) storage
- Fusion to tolerate one fault: O(n^2) storage
Ricart and Agrawala's Algorithm
- O(n) storage per process, 2(n-1) messages/mutex
- Replication: n backup processes, each with O(n) storage, 2(n-1) additional messages
- Fusion: 1 fused process with O(n) storage, only n additional messages
13
Outline
Crash Faults
- Space savings
- Message savings
- Complex Data Structures
Byzantine Faults
- Single Fault (f=1), O(1) data
- Single Fault, O(m) data
- Multiple Faults (f>1), O(m) data
Conclusions & Future Work
14
Example: Resource Allocation, P(i)
15
user: int initially 0;                    // resource idle
waiting: queue of int initially null;

On receiving acquire from client pid:
    if (user == 0) {
        send(OK) to client pid;
        user = pid;
    } else
        waiting.append(pid);

On receiving release:
    if (waiting.isEmpty())
        user = 0;
    else {
        user = waiting.head();
        send(OK) to user;
        waiting.removeHead();
    }
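The pseudocode above translates into a runnable sketch; modeling message sends as a returned list of (recipient, "OK") grants is my simplification:

```python
# A runnable sketch of the resource-allocation server P(i).
from collections import deque

class ResourceServer:
    def __init__(self):
        self.user = 0                  # 0 means the resource is idle
        self.waiting = deque()         # queue of waiting process ids

    def acquire(self, pid):
        if self.user == 0:
            self.user = pid
            return [(pid, "OK")]       # grant immediately
        self.waiting.append(pid)       # otherwise queue the requester
        return []

    def release(self):
        if not self.waiting:
            self.user = 0              # resource goes idle
            return []
        self.user = self.waiting.popleft()
        return [(self.user, "OK")]     # grant to the head of the queue

s = ResourceServer()
assert s.acquire(1) == [(1, "OK")]
assert s.acquire(2) == []              # 2 queued behind 1
assert s.release() == [(2, "OK")]      # head of waiting gets the lock
```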
Complex Data Structures: Fused Queue
16
[Figure: (i) primary queue A holds a1..a8 (head at a1, tail at a8); (ii) primary queue B holds b1..b5; (iii) fused queue F holds a1, a2, a3+b1, a4+b2, a5+b3, a6+b4, a7+b5, a8+b6, with separate pointers HeadA, HeadB and tailA, tailB]
Fused Queue that can tolerate one crash fault
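A much-simplified sketch of the idea (the actual construction uses circular arrays with per-primary head/tail bookkeeping, slide 17): the backup stores element-wise sums, and a crashed queue is the fused values minus the survivor's. Entries are assumed to be nonzero ids so padding can be stripped.

```python
# Simplified fused queue for two primaries tolerating one crash fault:
# each fused slot holds the sum of the corresponding primary elements
# (shorter queue padded with zeros).
from itertools import zip_longest

def fuse(qa, qb):
    return [x + y for x, y in zip_longest(qa, qb, fillvalue=0)]

def recover(fused, survivor):
    """Rebuild the crashed queue from the fused queue and the survivor."""
    vals = [f - s for f, s in zip_longest(fused, survivor, fillvalue=0)]
    while vals and vals[-1] == 0:      # strip padding past the lost tail
        vals.pop()                     # (assumes nonzero queue entries)
    return vals

A = [1, 2, 3, 4, 5]
B = [10, 20, 30]
F = fuse(A, B)                          # [11, 22, 33, 4, 5]
assert recover(F, B) == A               # rebuild A after its crash
assert recover(F, A) == B               # or rebuild B
```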
Fused Queues: Circular Arrays
17
Resource Allocation: Fused Processes
18
Outline
Crash Faults
- Space savings
- Message savings
- Complex Data Structures
Byzantine Faults
- Single Fault (f=1), O(1) data
- Single Fault, O(m) data
- Multiple Faults (f>1), O(m) data
Conclusions & Future Work
19
Byzantine Fault Tolerance: Replication
20
[Figure: primaries with state (13, 8, 45), each replicated 2f+1 times: (2f+1)*n processes in total]
Goals for Byzantine Fault Tolerance
- Efficient during error-free operation
- Efficient detection of faults (no need to decode for fault detection)
- Efficient in space requirements
21
Byzantine Fault Tolerance: Fusion
22
[Figure: primaries P(i) with state (13, 8, 45), one unfused copy Q(i) of each, and a single fused backup F(1) holding the sum 66 = 13 + 8 + 45]
Byzantine Faults (f=1)
Assume n primary state machines P(1)..P(n), each with an O(1) data structure.
Theorem 2: There exists an algorithm with n+1 additional backup machines that has the same overhead as replication during normal operation, and an additional O(n) overhead during recovery.
23
Byzantine FT: O(m) data
24
[Figure: primary P(i) and its unfused copy Q(i) each hold queues a1..a8 and b1..b5; the fused queue F(1) holds a1, a2, a3+b1, ..., a8+b6; a mismatch between P(i) and Q(i) pinpoints the crucial location x, which is checked against F(1) without decoding it]
Byzantine Faults (f=1), O(m)
Theorem 3: There exists an algorithm with n+1 additional backup machines such that normal operation has the same overhead as replication, with an additional O(m+n) overhead during recovery.
No need to decode F(1)
25
Byzantine Fault Tolerance: Fusion
26
[Figure: primaries with state (3, 1, 4); one unfused copy reports (3, 8, 4), a single mismatched primary. The fused machines hold F(1) = 1*3 + 1*1 + 1*4 = 8, F(2) = 1*3 + 2*1 + 3*4 = 17, F(3) = 1*3 + 4*1 + 9*4 = 43]
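Slide 26's numbers check out: with primary states (3, 1, 4), the fused sums are 8, 17, and 43. A minimal sketch (function names are mine) of using F(1) alone to settle a single mismatch between two copies:

```python
# Two unfused copies disagree in exactly one position; the fused sum F1
# reveals which copy is right without decoding the other fused machines.

def detect(copy_p, copy_q, f1):
    """Return the corrected state when the copies differ in one slot."""
    diffs = [i for i, (p, q) in enumerate(zip(copy_p, copy_q)) if p != q]
    assert len(diffs) == 1              # single mismatched primary
    w = diffs[0]
    rest = sum(v for i, v in enumerate(copy_p) if i != w)  # agreed values
    correct = f1 - rest                 # F1 = sum of all primary states
    return [correct if i == w else v for i, v in enumerate(copy_p)]

P = [3, 1, 4]        # honest copy
Q = [3, 8, 4]        # Byzantine copy lies about position 1
F1 = 8               # fused sum maintained by F(1)
assert detect(P, Q, F1) == [3, 1, 4]
```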
Byzantine Fault Tolerance: Fusion
27
[Figure: multiple mismatched primaries: copies report (3, 7, 4) and (3, 8, 4) against the honest (3, 1, 4); the fused values 8, 17, 43 again determine the correct states]
Byzantine Faults (f>1), O(1) data
Theorem 4: There exists an algorithm with fn+f additional state machines that tolerates f Byzantine faults with the same overhead as replication during normal operation.
28
Liar Detection (f > 1), O(m) data
Z := set of all f+1 unfused copies
While (not all copies in Z identical) do:
    w := first location where copies differ
    Use fused copies to find v, the correct value of state[w]
    Delete unfused copies with state[w] != v
Invariant: Z contains a correct machine.
No need to decode the entire fused state machine!
29
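The loop above can be sketched directly; here `fused_value(w)` stands in for the fused copies' recovery of the correct value at location w (the O(m)-avoiding step the slide highlights):

```python
# Liar detection over f+1 unfused copies: repeatedly find the first
# disputed location, ask the fused copies for its correct value, and
# drop every copy that lied there. A correct copy always survives.

def liar_detection(copies, fused_value):
    Z = list(copies)                       # f+1 unfused copies
    while any(c != Z[0] for c in Z):
        w = next(i for i in range(len(Z[0]))
                 if any(c[i] != Z[0][i] for c in Z))
        v = fused_value(w)                 # correct value of state[w]
        Z = [c for c in Z if c[w] == v]    # delete the liars at w
    return Z[0]                            # invariant: a correct machine

truth = [3, 1, 4]
copies = [[3, 1, 4], [3, 8, 4], [9, 1, 4]]   # f = 2 Byzantine copies
assert liar_detection(copies, lambda w: truth[w]) == truth
```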
Fusible Structures
Fusible Data Structures [Garg and Ogale, ICDCS 2007; Balasubramanian and Garg, ICDCS 2011]
- Linked lists, stacks, queues, hash tables
- Data-structure-specific algorithms
- Partial replication for efficient updates
- Multiple faults tolerated using Reed-Solomon coding
Fusible Finite State Machines [Ogale, Balasubramanian, Garg, IPDPS 2009]
- Automatic generation of minimal fused state machines
30
Conclusions
31
                  Replication  Fusion
Crash Faults      n + nf       n + f
Byzantine Faults  n + 2nf      n + nf + f

Replication: recovery and updates are simple; tolerates f faults for each primary
Fusion: space efficient
The two can be combined for tradeoffs
n: the number of different servers
Future Work
Optimal Algorithms for Complex Data StructuresDifferent Fusion OperatorsConcurrent Updates on Backup Structures
32
Thank You!
33
Event Counter: Proof Sketch
34
Model
The servers (primary and backups) execute independently (in parallel)
- Primaries and backups do not operate in lock-step
- Events/updates are applied on all the servers
- All backups act on the same sequence of events
35
Model contd.
Faults:
- Fail-stop (crash): loss of current state
- Byzantine: servers can 'lie' about their current state
For crash faults, we assume the presence of a failure detector
For Byzantine faults, we provide detection algorithms
Faults are infrequent
36
Byzantine Faults (f=1), O(m)
Theorem 3: There exists an algorithm with n+1 additional backup machines such that normal operation has the same overhead as replication, with an additional O(m+n) overhead during recovery.
Proof Sketch:
- Normal operation: responses by P(i) and Q(i) are identical
- Detection: P(i) and Q(i) differ on some response
- Correction: use liar detection; O(m) time to determine the crucial location; use F(1) to determine who is correct; no need to decode F(1)
37
Byzantine Faults (f>1)
Proof Sketch:
- f copies of each primary state machine and f overall fused machines
- Normal operation: all f+1 unfused copies produce the same output
- Case 1: single mismatched primary state machine: use liar detection
- Case 2: multiple mismatched primary state machines: the unfused copy with the largest tally is correct
38
Resource Allocation Machine
39
[Figure: Lock Servers 1-3 with Request Queues 1-3, holding requests (R1, R2, R3), (R1, R2), and (R1, R2, R4, R3) respectively]
Byzantine Fault Tolerance: Fusion
40
[Figure: primaries P(i) with state (13, 8, 45), unfused copies Q(i), and fused backup F(1) holding the sum 66: (f+1)*n + f processes in total]