Coterie availability in sites
Flavio Junqueira and Keith Marzullo
University of California, San Diego
DISC, Krakow, Poland, September 2005
Multi-site systems
Emerging class of distributed systems: a collection of sites across a WAN; multiple nodes in each site; sites share resources (data sets, computational power).
E.g. BIRN, Geon, TeraGrid, PlanetLab
Site failure: all the nodes in a site are simultaneously unavailable.
Site availability: BIRN
10 sites experience at least one outage
One site under 97%
Improving availability
Better availability through replication
Coteries: a set system of processes, i.e. a set of subsets of processes; each subset is called a quorum; quorums are minimal sets that pairwise intersect.
Coteries are useful for: distributed mutual exclusion; distributed registers; consensus through Paxos.
Coterie availability in multi-site systems
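The two coterie properties named on this slide (quorums pairwise intersect, and no quorum contains another) can be checked mechanically. The following is a minimal sketch, not taken from the talk; the function name `is_coterie` and the process names are illustrative assumptions.

```python
from itertools import combinations


def is_coterie(quorums):
    """Return True if `quorums` is a coterie: non-empty sets that pairwise
    intersect and none of which contains another (minimality)."""
    qs = [frozenset(q) for q in quorums]
    if not qs or any(not q for q in qs):
        return False
    for a, b in combinations(qs, 2):
        if not (a & b):        # intersection property violated
            return False
        if a <= b or b <= a:   # minimality violated (also catches duplicates)
            return False
    return True


# Hypothetical example: majority quorums over three processes a, b, c.
majority_of_three = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
print(is_coterie(majority_of_three))     # True
print(is_coterie([{"a"}, {"b", "c"}]))   # False: the two sets are disjoint
```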
Roadmap
System model
Availability metrics: previous deterministic metrics are not necessarily good; a new metric
Failure model: characterize failures using survivor sets; survivor sets are more expressive
Quorum construction: multi-site hierarchical construction
Practical issues: failure model in practice; PlanetLab experiment
Conclusions
System model
Set P of processes: pairwise connected by quasi-reliable asynchronous channels; process failure: crash; processes can recover.
Set B of sites: a partition of the set of processes; site failure: simultaneous failure of all the processes in the site; process failures are not independent.
Execution: a sequence of steps of processes; $\mathcal{E}$: set of all executions.
In a step s:
$F(s) = \{p : (p \in P) \land (p \text{ is faulty in } s)\}$
$NF(s) = P \setminus F(s)$
Available process in s: $p \in P$ is available if $p \notin F(s)$.
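Survivor sets
A set $S \subseteq P$ is a survivor set iff:
$\exists E \in \mathcal{E} : \exists s \in E : S = NF(s)$
$\forall p \in S : \forall E \in \mathcal{E} : \forall s \in E : S \setminus \{p\} \neq NF(s)$
Example (figure): processes grouped into sites; $\mathcal{E} = \{E_1, E_2, E_3, E_4\}$, with steps $s_1, s_2$ in $E_1$ and $E_2$ and step $s_1$ in $E_3$ and in $E_4$; the figure shows $NF(s_i)$ for each step and the resulting survivor sets.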
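The survivor-set definition on this slide can be applied directly to a finite set of executions. The sketch below assumes each execution is given as a list of its NF(s) sets, one per step; the data is hypothetical and is not the example from the slide's figure.

```python
def survivor_sets(executions):
    """executions: iterable of executions, each a sequence of NF(s) sets
    (the non-faulty processes in each step s)."""
    # Every set of non-faulty processes observed in some step of some execution.
    observed = {frozenset(nf) for execution in executions for nf in execution}
    # Keep S only if removing any single process p never yields an observed NF(s).
    return {
        s for s in observed
        if all(s - {p} not in observed for p in s)
    }


# Hypothetical executions over processes p1, p2, p3.
executions = [
    [{"p1", "p2", "p3"}, {"p1", "p2"}],  # two steps
    [{"p2", "p3"}],                      # one step
]
result = survivor_sets(executions)
print(sorted(sorted(s) for s in result))  # [['p1', 'p2'], ['p2', 'p3']]
```

Here {p1, p2, p3} is excluded because removing p3 gives {p1, p2}, which is itself the set of non-faulty processes in some step.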
Availability metrics
Traditional deterministic metrics: undirected graph with nodes = processes and edges = communication links; node vulnerability: minimal number of nodes whose failure leaves no quorum available; edge vulnerability: minimal number of edges whose failure leaves no quorum available.
Majority is optimal [Barbara and Garcia-Molina'86]: complete graphs
A counterexample
(Figure: processes grouped into sites, with the survivor sets marked.)
Majority: a quorum has 5 processes; in some step, no quorum can be formed.
Using $S_P$ as quorums: in every step, at least one quorum can be formed.
Majority is not optimal
Availability metrics
A new metric $A(\mathcal{Q})$, where $\mathcal{Q}$ is a coterie: the number of survivor sets covered in $\mathcal{Q}$. A survivor set S is covered in $\mathcal{Q}$ if:
$\exists Q \in \mathcal{Q} : Q \subseteq S$
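As an illustration of the metric, the sketch below counts the survivor sets that contain at least one quorum. The function names and the example sets are assumptions made for this sketch, not data from the talk.

```python
def covered(survivor_set, coterie):
    """A survivor set S is covered if some quorum Q satisfies Q <= S."""
    s = frozenset(survivor_set)
    return any(frozenset(q) <= s for q in coterie)


def availability(coterie, survivor_sets):
    """A(Q): the number of survivor sets covered by the coterie."""
    return sum(1 for s in survivor_sets if covered(s, coterie))


# Hypothetical survivor sets over processes a, b, c, d.
example_survivor_sets = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]
pairs = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
single = [{"a", "b", "c"}]
print(availability(pairs, example_survivor_sets))   # 3
print(availability(single, example_survivor_sets))  # 0
```

A coterie with a higher count can form a quorum in more of the failure scenarios that the survivor sets describe.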
Failure model
Multi-site hierarchical model
A set $F_s$ of subsets of B: subsets of simultaneously faulty sites.
An array $F_p$: one entry per site; each entry is a set of subsets of processes in the site (subsets of simultaneously faulty processes at a site).
A survivor set S:
$\exists F_S \in F_s :$
$\forall B_i \notin F_S : \exists F_P \in F_p[i] : P \setminus F_P \supseteq S$
$\forall B_i \in F_S : B_i \cap S = \emptyset$
Example (figure): processes P in three sites $B_1, B_2, B_3$, each with processes 1, 2, 3. Writing $p^m_i$ for process i of site $B_m$:
$F_s = \{\{B_1\}, \{B_2\}, \{B_3\}\}$
$F_p[m] = \{\{p^m_i\} : i \in \{1,2,3\}\}$, for m = 1, 2, 3
$S_P = \{\{p^a_i, p^a_j, p^b_k, p^b_l\} : i,j,k,l \in \{1,2,3\} \land i \neq j \land k \neq l\}$, taken over each pair of sites $B_a, B_b$
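The example on this slide can be reproduced by enumeration. The sketch below assumes that a survivor set consists of every process in the non-faulty sites except the chosen faulty subset at each of those sites, which matches the four-process sets in the example. The function name and the process names such as `p11` are hypothetical.

```python
from itertools import product


def msh_survivor_sets(sites, Fs, Fp):
    """sites: dict mapping site name -> set of processes.
    Fs: iterable of sets of site names that can be faulty together.
    Fp: dict mapping site name -> list of sets of processes that can be
        faulty together at that site."""
    result = set()
    for faulty_sites in Fs:
        alive = [name for name in sites if name not in faulty_sites]
        # Pick one faulty subset of processes for every non-faulty site.
        for choice in product(*(Fp[name] for name in alive)):
            survivors = set()
            for name, faulty_procs in zip(alive, choice):
                survivors |= sites[name] - set(faulty_procs)
            result.add(frozenset(survivors))
    return result


# Three sites with three processes each, as in the slide's example.
sites = {f"B{m}": {f"p{m}{i}" for i in (1, 2, 3)} for m in (1, 2, 3)}
Fs = [{name} for name in sites]                            # one faulty site
Fp = {name: [{p} for p in procs] for name, procs in sites.items()}
Sp = msh_survivor_sets(sites, Fs, Fp)
print(len(Sp))               # 27
print({len(s) for s in Sp})  # {4}
```

Every survivor set here has four processes, so a majority quorum of five out of nine processes fits inside none of them. This is consistent with the earlier counterexample slide, where majority quorums cannot always be formed.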
Quorum construction
Optimal availability with respect to A: a coterie $\mathcal{Q}$ such that $S_P = \mathcal{Q}$ or $\mathcal{Q}$ dominates $S_P$.
Survivor sets in $S_P$ pairwise intersect. If not, then optimally discarding survivor sets is NP-complete.
A special case, Qsite: all subsets of B of size fs are in $F_s$; all subsets of size t of $B_i$ are in $F_p[i]$, for every i.
E.g.: fs = 1, t = 1 (figure: Site 1, Site 2 and Site 3, with the resulting quorums).
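For the Qsite special case, the quorums can be enumerated from the two thresholds alone. This sketch assumes the quorums are the survivor sets themselves (the $S_P = \mathcal{Q}$ case), so each quorum takes all but fs sites and all but t processes from each chosen site. The function name and process names are hypothetical.

```python
from itertools import combinations, product


def qsite_quorums(sites, fs, t):
    """sites: dict mapping site name -> set of processes.
    fs: number of sites that may fail; t: processes that may fail per site."""
    quorums = set()
    names = sorted(sites)
    for alive in combinations(names, len(names) - fs):
        # For each surviving site, every way of keeping all but t processes.
        per_site = [
            combinations(sorted(sites[name]), len(sites[name]) - t)
            for name in alive
        ]
        for parts in product(*per_site):
            quorums.add(frozenset(p for part in parts for p in part))
    return quorums


# The slide's example: three sites, fs = 1, t = 1 (three processes per site assumed).
example_sites = {f"B{m}": {f"p{m}{i}" for i in (1, 2, 3)} for m in (1, 2, 3)}
quorums = qsite_quorums(example_sites, fs=1, t=1)
print(len(quorums))                                      # 27
print(all(a & b for a, b in combinations(quorums, 2)))   # True
```

With these parameters any two quorums share a site, and inside that site two of three processes from each quorum must overlap, so the quorums pairwise intersect.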
Model in practice
Qsite
fs: threshold on site failures (from data on site availability)
t: threshold on process failures (from Markov chains, one Markov chain for each site)
Transitions: failure transitions have the same probability (homogeneous processes); repair transitions have variable probability (amount of resources used).
(Figure: Markov chain for a site, showing failure transitions and repair transitions.)
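The slide does not give the chain itself, so the following is only a sketch under stated assumptions: a discrete-time birth-death chain whose state is the number of failed processes at a site, a single failure probability shared by all states, and a repair probability that varies by state. Picking t as the smallest threshold that meets a target probability is also an assumption of this sketch, not a method stated in the talk.

```python
def stationary_distribution(n, fail_prob, repair_probs):
    """Birth-death chain on k = 0..n failed processes.
    fail_prob: probability of moving k -> k+1 (same for every k < n).
    repair_probs[k]: probability of moving k -> k-1, for k = 1..n
    (index 0 is unused)."""
    # Detailed balance: pi[k] * fail_prob == pi[k+1] * repair_probs[k+1].
    weights = [1.0]
    for k in range(n):
        weights.append(weights[-1] * fail_prob / repair_probs[k + 1])
    total = sum(weights)
    return [w / total for w in weights]


def smallest_threshold(pi, target):
    """Smallest t such that P(at most t failed processes) >= target."""
    cumulative = 0.0
    for t, p in enumerate(pi):
        cumulative += p
        if cumulative >= target:
            return t
    return len(pi) - 1


# Hypothetical numbers for a site with three processes.
pi = stationary_distribution(3, fail_prob=0.01, repair_probs=[None, 0.5, 0.4, 0.3])
print(smallest_threshold(pi, target=0.999))  # 1
```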
PlanetLab experiment
Toy application: Paxos with quorums of acceptors; a client accessing quorums.
Hosts used: three sites, three hosts from each site; one UCSD host acting as proposer and learner.
Three settings:
3Sites: one acceptor per site; quorum: two hosts
3SitesMaj: all hosts; quorum: four hosts, a majority from each of two sites
SimpleMaj: all hosts; quorum: any five processes
(Figure: sites UC Davis, UT Austin, Duke, UC San Diego.)
SimpleMaj has worse availability
3SitesMaj has better availability
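The three quorum rules can be written as predicates over the set of reachable acceptors. This sketch assumes the acceptors run at UC Davis, UT Austin and Duke, with UC San Diego hosting only the proposer and learner; the host names are hypothetical, since the slide gives only site names.

```python
# Hypothetical host names; three acceptor hosts per site.
SITES = {
    "UC Davis": {"davis-1", "davis-2", "davis-3"},
    "UT Austin": {"austin-1", "austin-2", "austin-3"},
    "Duke": {"duke-1", "duke-2", "duke-3"},
}
ALL_HOSTS = set().union(*SITES.values())
ONE_PER_SITE = {"davis-1", "austin-1", "duke-1"}  # acceptors used by 3Sites


def three_sites(up):
    """3Sites: one acceptor per site; a quorum is any two of them."""
    return len(up & ONE_PER_SITE) >= 2


def three_sites_maj(up):
    """3SitesMaj: a majority of hosts from each of two sites (four hosts)."""
    sites_with_majority = sum(
        1 for hosts in SITES.values() if len(up & hosts) > len(hosts) // 2
    )
    return sites_with_majority >= 2


def simple_maj(up):
    """SimpleMaj: any five of the nine hosts."""
    return len(up & ALL_HOSTS) >= 5


# Duke is down entirely, and one host is down at each of the other two sites.
up = {"davis-1", "davis-2", "austin-1", "austin-3"}
print(three_sites(up), three_sites_maj(up), simple_maj(up))  # True True False
```

In this scenario only four hosts are reachable, so SimpleMaj cannot form a quorum while both site-aware settings can.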
The Bimodal model
Sites are survivor sets: $S_P$ is not a coterie
"Throw out" survivor sets: in general, the optimal solution is NP-complete; there is a simple solution for this model
Practical issues: practical for two sites; more than two sites is an open problem
(Figure: states labeled 00 through nn, with thresholds t marked.)
Conclusions
Coteries for multi-site systems: with site failures, process failures are not independent
A new metric: counts covered survivor sets
Multi-site hierarchical construction: practical; illustrated with a Markov model; experiment shows better availability
Using majority quorums is not a good idea: not optimal; poor performance
Future work: more experiments, more constructions, real deployment
END
Backup Slides
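Failure models
The multi-site hierarchical model
A set $F_s$ of subsets of B
An array $F_p$: one entry per site; each entry: subsets of processes in the site
A survivor set S:
$\exists F_S \in F_s :$
$\forall B_i \notin F_S : \exists F_P \in F_p[i] : P \setminus F_P \supseteq S$
$\forall B_i \in F_S : B_i \cap S = \emptyset$
The bimodal model
A set $F_s$ of subsets of B: there is one site that is in no element of $F_s$
An array $F_p$
A survivor set S: as in the previous model, OR $\exists B_i \in B : S = B_i$
Example (figure): processes in two sites $B_1, B_2$, each with processes 1, 2, 3. Writing $p^m_i$ for process i of site $B_m$:
Fs =
$F_p[1] = \{\{p^1_i\} : i \in \{1,2,3\}\}$
$F_p[2] = \{\{p^2_i\} : i \in \{1,2,3\}\}$
MSH: $S_P = \{\{p^1_i, p^1_j, p^2_k, p^2_l\} : i,j,k,l \in \{1,2,3\} \land i \neq j \land k \neq l\}$
B: $S_P = \{\{p^1_i, p^1_j, p^2_k, p^2_l\} : i,j,k,l \in \{1,2,3\} \land i \neq j \land k \neq l\} \cup B$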
Bimodal construction
Bimodal model: by construction, not all pairs of survivor sets intersect
Discard survivor sets until the remaining ones pairwise intersect: selecting optimally is NP-complete
Solution: remove |B|-1 survivor sets; survivor sets containing processes from multiple sites pairwise intersect; the construction is also optimal with respect to metric A
A special case, Bsite: all elements of $F_s$ have size fs; all elements of $F_p[i]$ have the same size t, for every i
E.g.: fs = 1, t = 1 (figure: sites B1 and B2, with the resulting quorums).
Site availability
Goals: show that sites are unavailable frequently enough
BIRN (Biomedical Informatics Research Network): test bed projects centered around brain imaging; currently 19 universities, 26 research groups
Availability: measured on a monthly basis; pings (BIRN-CC); storage broker logs
Site availability, Jan/04-Aug/04: availability under 100% on average in 5 out of the 8 months
$\text{Availability} = \dfrac{\text{Total hours} - \text{Unplanned outages}}{\text{Total hours}} \times 100$
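As a quick check of the availability formula, the sketch below computes the monthly percentage for a made-up month. The figures are hypothetical and are not BIRN data.

```python
def availability_percent(total_hours, unplanned_outage_hours):
    """Availability = (total hours - unplanned outages) / total hours * 100."""
    return (total_hours - unplanned_outage_hours) / total_hours * 100


# Hypothetical month: 30 days, with 24 hours of unplanned outages.
month = availability_percent(total_hours=30 * 24, unplanned_outage_hours=24)
print(round(month, 2))  # 96.67
```

A single day of unplanned outage in a 30-day month is enough to bring a site below 97%.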
Causes of site failures
Misconfigured software
Shared resources:
1. Storage
2. Power circuits
3. Cooling pipes
4. Air conditioning
5. Network