1 Principles of Reliable Distributed Systems Lectures 11: Authenticated Byzantine Consensus Spring...

38
1 Principles of Reliable Distributed Systems Lectures 11: Authenticated Byzantine Consensus Spring 2005 Dr. Idit Keidar
  • date post

    18-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    1

Transcript of 1 Principles of Reliable Distributed Systems Lectures 11: Authenticated Byzantine Consensus Spring...

1

Principles of Reliable Distributed Systems

Lectures 11: Authenticated

Byzantine Consensus

Spring 2005

Dr. Idit Keidar

2

Material

• Synchronous model: – Nancy Lynch, Distributed Algorithms, Ch. 6

• Asynchronous model:– Practical Byzantine Fault-Tolerance

Castro and Liskov, OSDI 1999.– The ABCDs of Paxos

Lampson 2001.

3

Byzantine Faults

• Faulty process can behave arbitrarily, i.e., they don’t have to follow the protocol. E.g.,– can suffer benign failures – crash, timing;– can send bogus values in messages;– can send messages at the wrong time; – can send different messages to different

processes; etc.

• Captures software bugs, hacker intrusions.

4

Authenticated (Byzantine) Model

• Authentication: The receiver of a message can ascertain its origin;– an intruder cannot masquerade as someone else.

• Integrity: The receiver of a message can verify that it has not been modified in transit;– an intruder cannot substitute a false message for a

legitimate one.

• Nonrepudiation: A sender cannot falsely deny later that he sent a message.

5

Implementing Authentication

• Uses a Cryptographic Public Key Infrastructure (PKI).

• Each process has a well-know public key and a matching private key.– <M>p is message M signed by p’s private key.

– Only p can generate <M>p .

– Every process can verify p’s signature on <M>p using p’s public key.

6

Exploiting Authentication

• All messages are signed by their source.• Every receiver can verify that the message

was indeed sent by the source as is.• Signed messages can be forwarded as proof.

“I can prove that Idit said that I don’t have to submit this homework assignment” – <Yossy does not have to submit homework assignment 2>Idit

7

Consensus with Byzantine Failures

• Recall, we defined consensus as follows:– Agreement: correct processes’ decisions are

the same– Termination: eventually all correct processes

decide – Validity: decision is input of one process

• Problem?

8

Validity: Take II

• Strong unanimity: If the input of all the processes is v then no correct process decides a value other than v– Homework question: when is this equivalent to

the previous definition?

• How resilient can an algorithm satisfying this property be?

9

Validity: Take III

• Weak unanimity: If the input of all the processes is v and no process fails then no process decides a value other than v

10

Synchronous Authenticated Byzantine-Tolerant Consensus

11

Exponential Information Gathering Algorithm

• Send <vi > pi to all

• In every round 2 ≤ k ≤ t+1:– For every incoming message m, if m has k-1 different

valid signature, send <m> pi to all the processes that did not sign it

• Let Valid = {<vj > pj | all messages with t+1 valid signatures beginning with pj’s have same initial value vj }

• Decide on most common value in Valid

12

Byzantine Fault-Tolerant Consensus: Overview of Results

• Synchronous t-resilient algorithms – – Exist iff t < n with authentication and weak unanimity.– Exist iff t < n/2 with authentication and strong

unanimity.– Exist iff t < n/3 without authentication.

• Indulgent t-resilient algorithms –– Exist iff t < n/3 with or without authentication.

(Homework problem: show the lower bound).– Recall: an indulgent algorithm tolerates arbitrary

periods of asynchrony.

13

Eventually-Synchronous Byzantine Model

Byzantine Paxos

14

Overcoming Byzantine Failures With 3t+1 Processes

• In indulgent algorithms for crash failures –– We gather “votes” from a majority in every ballot. – Since every two majorities intersect, for every two

ballots, at least one process votes in both.

• But now, a faulty process can lie about what it did in the other ballot.– We want a correct process in the intersection. – Since n-t ≥ 2t+1, two sets of size n-t intersect by at least

one correct process.– Gather n-t votes in a ballot, to ensure that for every two

ballots, at least one correct process votes in both.

15

Reminder: The Paxos Atomic Broadcast Algorithm

• Leader based: each process has an estimate of who is the current leader

• To order an operation, a process sends it to its current leader

• The leader sequences the operation and launches a Consensus algorithm (Synod) to fix the agreement

16

Byzantine Paxos Setting

• State machine replication.• Structured like Paxos:

– Updates are sent to the current leader.

– Leader uses a consensus algorithm to have all replicas agree on the order of updates.

– Our focus today is the Consensus algorithm.

• Used to implement BFS – Byzantine Fault Tolerant NFS.– Only 3% slower than un-replicated NFS.

17

Model

• Universe: n processes: {0,…n-1}.

• Up to t Byzantine failures, t < n/3.– Assume n = 3t+1.

• Authentication (PKI).

• Reliable links, no recovery (for now).

18

Reminder: “Classic” Paxos• Phase 1: prepare

– Leader chooses largest unique ballot number, sends “prepare” to all.

– All respond with “ack” messages with values accepted in smaller ballots.

• Phase 2: accept– When leader gets votes (“acks”) from n-t, it proposes a

value matching the accepted value in the highest ballot.– Leader gets majority to accept his proposal.– A value accepted by a majority can be decided.

19

Reminder: Classic Paxos Phase I

• Periodically, until decision is reached do:

if leader (by ) thenBallotNum BallotNum.num+1, my proc id send (“prepare”, BallotNum) to all

• Upon receive (“prepare”, rank) from iif rank > BallotNum then

BallotNum rank

send (“ack”, rank, AcceptNum, AcceptVal) to i

20

Reminder: Classic Paxos Phase II

Upon receive (“ack”, BallotNum, b, val) from n-tif all vals = then myVal = initial valueelse myVal = received val with highest b send (“accept”, BallotNum, myVal) to all /* proposal */

Upon receive (“accept”, b, v) with b BallotNum

AcceptNum b; AcceptVal v /* accept proposal */send (“accept”, b, v) to all (first time only)

21

How Can Byzantine Failures Cause Problems?

22

Safety Problems: Leader Can Lie

• Leader can choose a value different than the highest accepted by n-t processes.

– Solution: Can “prove” he’s not lying by sending the signed “ack” (phase 1) messages to all processes.

• If no previous ballot was accepted, leader can send different new values to different processes.

23

Solution to the 2nd Problem

• Before accepting a value proposed by the leader, verify that the value was proposed to “enough” processes.

• Byzantine Paxos Phases:– Phase 1: Prepare. – Phase 2: Propose – echo leader’s proposal.– Phase 3: Accept – now only if n-t proposed.

• Add new variable: PropNum, initially 0.

24

Safety Problems: Others Can Lie

• Faulty users can send invalid “accept” messages.– Solution: Wait for n-t=2t+1 “accept” messages.

• Faulty users can send invalid values with higher AcceptNums in “ack” messages. – Solution: Can “prove” value is valid by

forwarding signed “propose” messages.– Add new variable: Proof, initially empty set.

25

Liveness Problems

• Faulty leader can deadlock algorithm.– Propose a new leader when the current does

not deliver.– Use rotating coordinator until one is correct.

Leader will be BallotNum mod n.

• Faulty processes may keep selecting new leaders all the time (livelock).

– Accept a new ballot only if a t+1 of the processes propose a new leader.

26

And Now For Our Feature Presentation

The Byzantine Paxos

Consensus Algorithm

27

Byzantine Paxos Variables

Int BallotNum, initially 0

Int PropNum, initially 0

Int AcceptNum, initially 0

Value {} AcceptVal, initially Message Set Proof, initially empty

Derived variable: Leader = BallotNum mod n

28

Byzantine Paxos Phase I: Prepare

Upon timeout on LeaderBallotNum BallotNum +1send (“prepare”, BallotNum) to all

Upon receive (“prepare”, b) from t+1 if (b < BallotNum) then returnif (b > BallotNum) then

BallotNum b send (“prepare”, BallotNum) to all send (“ack”, b, AcceptNum, AcceptVal, Proof) to Leader

29

Byzantine Paxos Phase II: Propose

Upon receive (“ack”, BallotNum, b, val, proof) from n-tS = {received (signed) “ack” messages}if (all vals that have valid proofs in S are then myVal init valueelse myVal val that has valid proof with highest b in S send (“propose”, BallotNum, myVal, S) to all

Upon receive (“propose”, BallotNum, v, S)if (BallotNum PropNum) then returnif (v is not a valid choice given S) then returnPropNum BallotNum send (“propose”, BallotNum, v, S) to all

30

Byzantine Paxos Phase III: Accept

Upon receive (“propose”, b, v, S) from n-t if (b < BallotNum) then returnAcceptNum b; AcceptVal v Proof set of n-t signed “propose” messages send (“accept”, b, v) to all

Upon receive (“accept”, b, v) from n-tdecide v

31

In Failure-Free Runs

accept

1

prepare ack propose

2

n

.

.

.

1

2

n

.

.

.

1 1

2

n

.

.

.

1

2

n

.

.

.

propose

1

2

n

.

.

.

All send prepare All echo

propose

32

Saving Communication

• Prepare and its “ack” can be merged into one message round.

• Proofs don’t have to be sent with messages: processes can have the information to check the proofs locally because the original messages are multicast.

33

Invariant

• If proposals (b,v) and (b, v’) are accepted by correct processes i and j, (possibly i = j ) then v’=v.

• Proof: – An accepted proposal is proposed by n-t processes.

– Two sets of n-t = 2t+1 processes have at least one correct process in common.

– A correct process sends no more than one propose message with the same b.

34

Lemma 1

• If a proposal (b,v) is accepted by t+1 correct processes, then for every proposal (b’, v’) that is proposed by a correct process with b’>b, v’=v.

• Again, follows from Lemma 2…– Since two sets of t+1 correct processes have at

least one correct process in common.

35

Lemma 2

• If a proposal (b,v) is proposed by a correct process, then there is a set S including at least t+1 correct processes such that either – no correct p in S accepts a proposal ranked less

than b; or– v is the value of the highest-ranked proposal

among proposals ranked less than b accepted by correct processes in S.

36

Liveness

• Is the current leader making progress?– If yes, some correct process decides. This process can

periodically forward the “proof” for its decision to others so they will decide too.

– If not, all timeout on the leader and start a new ballot.

• Once there is a correct leader.– The n-t correct processes will send all the needed

messages.– The t faulty processes will not be able to force a new

ballot.

37

Atomic Broadcast: Issues

• Leader can propose invalid client requests.

• Leader can refrain from proposing client requests.

• Leader can lie to client about response.

• Leader can refrain from sending client responses.

38

Byzantine Message Flow

accept

S1

prepare ack propose

S2

Sn

.

.

.

S1

S2

Sn

.

.

.

S1 S1

S2

Sn

.

.

.

S1

S2

Sn

.

.

.

propose

S1

S2

Sn

.

.

.

S1

request

S2

Sn

.

.

.

C C

response