
Paxos and Zookeeper

Roy Campbell


Motivation

• Centralized service: a coordination kernel
• Maintains
  – configuration information,
  – naming,
  – distributed synchronization,
  – group services.
• Avoids synchronization and races
• File-system-based API
  – Manipulates small data nodes: znodes
  – State is a hierarchy of znodes


Visualizing Paxos

• The proposer requests that the Paxos system accept some command. Paxos is like a “postal system”

• It thinks about the letter for a while (replicating the data and picking a delivery order)

• Once these are “decided” the learners can execute the command

[Figure: Paxos roles: a proposer, a coordinator, three Acceptors, and learners (R1, R2, R3)]


Overview of roles of processes

• The Client issues a request to the distributed system, and waits for a response. For instance, a write request on a file in a distributed file server.

• The Acceptors act as the fault-tolerant "memory" of the protocol. Acceptors are collected into groups called Quorums. Any message sent to an Acceptor must be sent to a Quorum of Acceptors. Any message received from an Acceptor is ignored unless a copy is received from each Acceptor in a Quorum.


Paxos Assumptions

• Processors operate at arbitrary speed.

• Processors may experience failures.

• Processors with stable storage may re-join the protocol after failures.
  – Using crash-recovery fault tolerance

• Processors do not collude, lie, or otherwise attempt to subvert the protocol.
  – i.e. Byzantine failures don't occur. See Byzantine Paxos for a solution that tolerates failures from arbitrary/malicious behavior of the processes.

• In general, a consensus algorithm can make progress using 2F+1 processors despite the simultaneous failure of any F processors.


Paxos Network

• Processors can send messages to any other processor.

• Messages are sent asynchronously and may take arbitrarily long to deliver.

• Messages may be lost, reordered, or duplicated.

• Messages are delivered without corruption.
  – i.e. Byzantine network failures don't occur. See Byzantine Paxos for a solution.


Number of Processors

• In general, a consensus algorithm can make progress using 2F+1 processors despite the simultaneous failure of any F processors.

• However, using reconfiguration, a protocol may be employed which survives any number of total failures as long as no more than F fail simultaneously.
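
The 2F+1 relationship can be checked with a little arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the majority quorum size is an assumption (the slides do not say how Quorums are sized, although a simple majority is a common choice).

```python
def processors_needed(f: int) -> int:
    """Processors required to keep making progress with f simultaneous failures."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 2 * f + 1


def max_tolerated_failures(n: int) -> int:
    """Largest F such that n >= 2F + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 2


def majority_quorum(n: int) -> int:
    """Assumed quorum rule: any strict majority of the n processors."""
    return n // 2 + 1


if __name__ == "__main__":
    for n in (3, 4, 5, 7):
        print(n, max_tolerated_failures(n), majority_quorum(n))
```

Note that going from 3 to 4 processors does not raise the number of tolerated failures, which is one reason odd cluster sizes are usually chosen.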


Overview of roles of processes

• A Proposer advocates a client request, attempting to convince the Acceptors to agree on it. Learners act as the replication factor for the protocol. Once a Client request has been agreed on by the Acceptors, the Learner may take action (i.e., execute the request and send a response to the client). To improve availability of processing, additional Learners can be added.

• Paxos requires a distinguished Proposer (called the leader) to make progress. Many processes may believe they are leaders, but the protocol only guarantees progress if one of them is eventually chosen. If two processes believe they are leaders, they may stall the protocol by continuously proposing conflicting updates. However, the safety properties are still preserved in that case.


Proposal Number & Agreed Value

• Each attempt to define an agreed value v is performed with proposals which may or may not be accepted by Acceptors.

• Each proposal is uniquely numbered for a given Proposer.
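
One common way to keep proposal numbers distinct across Proposers as well is to pair a local counter with a proposer id. The slides do not prescribe this scheme; the sketch below is an assumption, and `ProposalNumber` and `ProposalCounter` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, order=True)
class ProposalNumber:
    # Compared field by field: counter first, proposer_id breaks ties.
    counter: int
    proposer_id: int


class ProposalCounter:
    """Hands out strictly increasing proposal numbers for one Proposer."""

    def __init__(self, proposer_id: int) -> None:
        self.proposer_id = proposer_id
        self.counter = 0

    def next(self, highest_seen: Optional[ProposalNumber] = None) -> ProposalNumber:
        # Jump past any number reported by others so the new one is higher.
        if highest_seen is not None:
            self.counter = max(self.counter, highest_seen.counter)
        self.counter += 1
        return ProposalNumber(self.counter, self.proposer_id)
```

Because the proposer id is part of the number, two Proposers can never produce the same number even if their counters coincide.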


Basic Paxos

• Each instance of the Basic Paxos protocol decides on a single output value.

• The protocol proceeds over several rounds.

• A successful round has two phases:
  1. Prepare - Promise
  2. Accept Request - Accepted


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)


Prepare-Promise

• Prepare
1. A Proposer (the leader) creates a proposal identified with a number N.
2. This number must be greater than any previous proposal number used by this Proposer.
3. Then, it sends a Prepare message containing this proposal to a Quorum of Acceptors.


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)
2. Prepare (N, Vx)


Prepare-Promise

• Promise
1. If the proposal's number N is higher than any previous proposal number received from any Proposer by the Acceptor, then the Acceptor must return a promise to ignore all future proposals having a number less than N. If the Acceptor accepted a proposal at some point in the past, it must include the previous proposal number and previous value in its response to the Proposer.
2. Otherwise, the Acceptor can ignore the received proposal. It does not have to answer in this case for Paxos to work. However, for the sake of optimization, sending a denial (Nack) response would tell the Proposer that it can stop its attempt to create consensus with proposal N.
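
The Promise rule can be sketched as a small handler on the Acceptor. This is a minimal illustration, not a full implementation: the class name, field names, and tuple-shaped replies are assumptions made for the example, proposal numbers are plain integers, and stable storage and networking are left out.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PromisingAcceptor:
    promised_n: int = -1               # highest proposal number promised so far
    accepted_n: Optional[int] = None   # number of the last accepted proposal, if any
    accepted_value: Any = None         # value of the last accepted proposal, if any

    def on_prepare(self, n: int) -> tuple:
        if n > self.promised_n:
            # Promise to ignore proposals numbered below n, and report
            # any previously accepted proposal back to the Proposer.
            self.promised_n = n
            return ("promise", n, self.accepted_n, self.accepted_value)
        # Optional optimization: tell the Proposer its number is too low.
        return ("nack", n, self.promised_n)
```

Returning the Nack is optional for correctness; dropping the message silently would also be valid.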


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)
2. Prepare (N, Vx)
3. Promise(N, <Na,Va>,<Nb,Vb>,<Nc,Vc>)


Accept Request

1. If a Proposer receives enough promises from a Quorum of Acceptors, it needs to set a value to its proposal.

2. If any Acceptors had previously accepted any proposal, then they'll have sent their values to the Proposer, who now must set the value of its proposal to the value associated with the highest proposal number reported by the Acceptors.

3. If none of the Acceptors had accepted a proposal up to this point, then the Proposer may choose any value for its proposal.

4. The Proposer sends an Accept Request message to a Quorum of Acceptors with the chosen value for its proposal.
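
Steps 2 and 3 amount to a simple selection rule over the Promise replies. The sketch below assumes each reply is reduced to a pair `(accepted_n, accepted_value)`, with `accepted_n` set to `None` when that Acceptor has accepted nothing; the function name and this representation are illustrative choices, not part of the protocol text.

```python
from typing import Any, List, Optional, Tuple


def choose_value(promises: List[Tuple[Optional[int], Any]], own_value: Any) -> Any:
    """Pick the value a Proposer must send in its Accept Request."""
    # Keep only replies that report a previously accepted proposal.
    accepted = [(n, v) for n, v in promises if n is not None]
    if not accepted:
        # No Acceptor in the Quorum has accepted anything: free choice.
        return own_value
    # Otherwise adopt the value attached to the highest proposal number.
    _, value = max(accepted, key=lambda pair: pair[0])
    return value


if __name__ == "__main__":
    print(choose_value([(None, None), (None, None)], "X"))
    print(choose_value([(2, "A"), (None, None), (4, "B")], "X"))
```

Adopting the highest-numbered accepted value is what prevents a later round from overriding a value that may already have been chosen.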


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)
2. Prepare (N, Vx)
3. Promise(N, <Na,Va>,<Nb,Vb>,<Nc,Vc>)
4. Accept!(Max(N,Na,Nb,Nc), Vm)


Accepted

• If an Acceptor receives an Accept Request message for a proposal N, it must accept it if and only if it has not already promised to only consider proposals having an identifier greater than N.

• In this case, it should register the corresponding value v and send an Accepted message to the Proposer and every Learner. Else, it can ignore the Accept Request.

• Rounds fail when multiple Proposers send conflicting Prepare messages, or when the Proposer does not receive a Quorum of responses (Promise or Accepted). In these cases, another round must be started with a higher proposal number.

• Notice that when Acceptors accept a request, they also acknowledge the leadership of the Proposer. Hence, Paxos can be used to select a leader in a cluster of nodes.
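
The acceptance rule can be sketched in the same style. As before, this is only an illustration under stated assumptions: `AcceptingAcceptor` and its fields are hypothetical names, proposal numbers are plain integers, and a real Acceptor would write its state to stable storage before replying.

```python
from typing import Any, Optional


class AcceptingAcceptor:
    def __init__(self) -> None:
        self.promised_n: int = -1              # highest number promised so far
        self.accepted_n: Optional[int] = None  # number of the accepted proposal
        self.accepted_value: Any = None        # value of the accepted proposal

    def on_accept(self, n: int, value: Any) -> Optional[tuple]:
        # Accept unless a promise was already made for a number greater than n.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            # This reply would go to the Proposer and to every Learner.
            return ("accepted", n, value)
        # Otherwise the Accept Request is simply ignored.
        return None
```

A Proposer that gets no reply from a Quorum treats the round as failed and retries with a higher proposal number.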


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)
2. Prepare (N, Vx)
3. Promise(N, <Na,Va>,<Nb,Vb>,<Nc,Vc>)
4. Accept!(Max(N,Na,Nb,Nc), Vm)
5. Accepted(Max(N,Na,Nb,Nc), Vm)


Message flow (participants: Client, Proposer, Acceptor, Learner):
1. Request: Do(X)
2. Prepare (N, Vx)
3. Promise(N, <Na,Va>,<Nb,Vb>,<Nc,Vc>)
4. Accept!(Max(N,Na,Nb,Nc), Vm)
5. Accepted(Max(N,Na,Nb,Nc), Vm)
6. Response


A Paxos for every occasion

• Multi Paxos – avoid Prepare and Promise
• Cheap Paxos – tolerate F failures with F+1 processors and F auxiliary processors
• Fast Paxos – reduces end-to-end messages
• Generalized Paxos – exploits commutativity
• Byzantine Paxos


What is ZooKeeper?

A highly available, scalable, distributed, configuration, consensus, group membership, leader election, naming, and coordination service

• Difficult to implement these kinds of services reliably
  – brittle in the presence of change
  – difficult to manage
  – different implementations lead to management complexity when the applications are deployed


Zookeeper Properties

• File API without partial reads/writes
  – Simple wait-free data objects organized hierarchically as in file systems.

• Per-client guarantee of FIFO execution of requests

• Linearizability for all requests that change the Zookeeper state

• Built using ZAB, a totally ordered broadcast protocol (based on Paxos)

• 2F+1 servers can tolerate F crash failures
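
To make the "file API without partial reads/writes" idea concrete, here is a toy in-memory model of a znode hierarchy. It is not the ZooKeeper client API: `ZNodeTree` and its methods are hypothetical names, and replication, watches, versions, and sessions are all omitted. The point it shows is that each node's data is read and replaced as a whole.

```python
from typing import Dict, List


class ZNodeTree:
    """Toy model: a hierarchy of small data nodes addressed by path."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {"/": b""}

    def create(self, path: str, data: bytes = b"") -> None:
        parent = path.rsplit("/", 1)[0] or "/"
        if path in self._data:
            raise KeyError(f"node exists: {path}")
        if parent not in self._data:
            raise KeyError(f"no parent: {parent}")
        self._data[path] = data

    def get(self, path: str) -> bytes:
        # Always returns the full contents of the node.
        return self._data[path]

    def set(self, path: str, data: bytes) -> None:
        # Always replaces the full contents of the node.
        if path not in self._data:
            raise KeyError(f"no node: {path}")
        self._data[path] = data

    def children(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        names = []
        for p in self._data:
            name = p[len(prefix):] if p.startswith(prefix) else ""
            if name and "/" not in name:
                names.append(name)
        return sorted(names)
```

For example, after `create("/app", b"cfg")` and `create("/app/lock")`, `children("/app")` lists `lock`, and `set("/app", b"new")` swaps the whole value in one step.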


Any Guarantees?

1. Clients will never detect old data.
2. Clients will get notified of a change to data they are watching within a bounded period of time.
3. All requests from a client will be processed in order.
4. All results received by a client will be consistent with results received by all other clients.


ZooKeeper Servers

1) All servers store a copy of the data on disk
2) A leader is elected at startup
3) Followers service clients, all updates go through leader
4) Update responses are sent when a majority of servers have persisted the change


ZooKeeper Service

• All servers store a copy of the data, logs, snapshots on disk and use an in memory database
• A leader is elected at startup
• Followers service clients, all updates go through leader
• Update responses are sent when a majority of servers have persisted the change

[Figure: ZooKeeper Service: a set of Servers, one of them the Leader, with Clients connected to the Servers]


Protocol Guarantees

1) Sequential Consistency - Updates from a client will be applied in the order that they were sent.
2) Atomicity - Updates either succeed or fail. No partial results.
3) Single System Image - A client will see the same view of the service regardless of the server that it connects to.
4) Reliability - Once an update has been applied, it will persist from that time forward until a client overwrites the update.
5) Timeliness - The client's view of the system is guaranteed to be up-to-date within a certain bound. Either system changes will be seen by a client within this bound, or the client will detect a service outage.


ZAB algorithm
http://research.yahoo.com/files/ladis08.pdf

• Zookeeper is based on the ZAB algorithm
  – ZAB: Zookeeper Atomic Broadcast
• Consists of two modes
  – Recovery
    • When the service starts or after a leader failure. Ends when a leader emerges and a quorum of servers have synchronized their state with the leader
  – Broadcast
    • The leader is the server that executes a broadcast by initiating the broadcast protocol
    • Once a leader has synchronized with a quorum of followers, it begins to broadcast messages.


ZAB broadcast

– The leader broadcasts a proposal for a message to be delivered.
– Before proposing a message the leader assigns a monotonically increasing unique id, called the zxid.
  • Because Zab preserves causal ordering, the delivered messages will also be ordered by their zxids.
– Broadcasting consists of putting the proposal with the message attached into the outgoing queue for each follower.
– When a follower receives a proposal, it writes it to disk, and sends an acknowledgement to the leader as soon as the proposal is on the disk media.
– When a leader receives ACKs from a quorum, the leader will broadcast a COMMIT and deliver the message locally. Followers deliver the message when they receive the COMMIT from the leader.
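
The leader's side of the broadcast steps can be sketched as follows. This is a simplified illustration, not ZooKeeper's implementation: `BroadcastLeader` and its methods are hypothetical names, the quorum is assumed to be a majority, the leader is assumed to count its own logged copy as one acknowledgement, and followers are assumed to acknowledge over FIFO channels so that quorums form in zxid order.

```python
from typing import Any, Dict, List, Set, Tuple


class BroadcastLeader:
    LEADER_ID = 0  # assumed id for the leader itself

    def __init__(self, n_servers: int) -> None:
        self.n_servers = n_servers
        self.next_zxid = 1
        self.pending: Dict[int, Any] = {}       # zxid -> proposed message
        self.acks: Dict[int, Set[int]] = {}     # zxid -> servers that acked
        self.committed: List[Tuple[int, Any]] = []

    def propose(self, message: Any) -> int:
        # Assign a monotonically increasing zxid before proposing.
        zxid = self.next_zxid
        self.next_zxid += 1
        self.pending[zxid] = message
        self.acks[zxid] = {self.LEADER_ID}
        return zxid  # the proposal would now be queued for each follower

    def on_ack(self, zxid: int, follower_id: int) -> bool:
        """Record an ACK; return True when this ACK triggers a COMMIT."""
        if zxid not in self.acks:
            return False  # unknown or already committed
        self.acks[zxid].add(follower_id)
        if len(self.acks[zxid]) >= self.n_servers // 2 + 1:
            # Quorum reached: commit and deliver the message locally.
            self.committed.append((zxid, self.pending.pop(zxid)))
            del self.acks[zxid]
            return True
        return False
```

With five servers, the leader's own copy plus two follower ACKs is enough to commit; later ACKs for the same zxid are ignored.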


ZAB Leader Election

1) UDP based
2) Server with the highest logged transaction gets nominated
3) Election doesn't have to be absolutely correct, just very likely correct


ZAB Leader Election

1) Each server initially nominates itself
2) Servers poll each other to get their votes

lastZxid: 22, vote: 1, voteZxid: 22
lastZxid: 22, vote: 2, voteZxid: 22
lastZxid: 23, vote: 3, voteZxid: 23
lastZxid: 21, vote: 4, voteZxid: 21
lastZxid: 21, vote: 5, voteZxid: 21


ZAB Leader Election

1) Each server initially nominates itself
2) Servers poll each other to get their votes
3) and vote for the one with the highest zxid if there isn't a winner

lastZxid: 22, vote: 3, voteZxid: 22
lastZxid: 22, vote: 3, voteZxid: 22
lastZxid: 23, vote: 3, voteZxid: 23
lastZxid: 21, vote: 3, voteZxid: 21
lastZxid: 21, vote: 3, voteZxid: 21
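
The example above can be reproduced with a short sketch. It models a single polling round only and is not the actual election protocol: `run_election` is a hypothetical name, the servers are assumed to be numbered 1 to 5 in the order shown, and ties on zxid are broken by the higher server id, which is an assumption made for this illustration.

```python
from typing import Dict


def run_election(last_zxids: Dict[int, int]) -> Dict[int, int]:
    """Map each server id to the server it votes for after one polling round."""
    # Step 1: each server initially nominates itself.
    initial_votes = {sid: sid for sid in last_zxids}
    # Step 2: polling reveals every nominated candidate to every server.
    candidates = set(initial_votes.values())
    # Step 3: everyone votes for the candidate with the highest zxid
    # (ties broken by server id, an assumption of this sketch).
    winner = max(candidates, key=lambda c: (last_zxids[c], c))
    return {sid: winner for sid in last_zxids}


if __name__ == "__main__":
    zxids = {1: 22, 2: 22, 3: 23, 4: 21, 5: 21}
    print(run_election(zxids))
```

With the lastZxid values from the slide, every server ends up voting for server 3, which holds the highest logged transaction.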


Difference

Paxos
• Tolerates message losses and reordering
• Quorums
• If a proposer believes it is a leader, it uses a higher number to take over leadership from another leader

ZAB
• Uses TCP
• No Quorums needed
• New leader cannot take over leadership until all of the followers agree on the leader


Paxos references

• Schneider, Fred (1990). "Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial". ACM Computing Surveys 22: 299.

• The Part-Time Parliament, Leslie Lamport, http://research.microsoft.com/en-us/um/people/lamport/pubs/lamport-paxos.pdf

• Paxos Made Simple, Leslie Lamport, 2001. http://research.microsoft.com/en-us/um/people/lamport/pubs/paxos-simple.pdf