Practical Byzantine Fault Tolerance - School of Computingstutsman/cs6963/public/pbft.pdf · What is...

Practical Byzantine Fault Tolerance Castro and Liskov SOSP 99



Practical Byzantine Fault Tolerance

Castro and Liskov, SOSP 99


Why this paper?

• Kind of incredible that it’s even possible
  • Let alone a practical NFS implementation with it
• So far we’ve only considered fail-stop model
• Quite a bit of research in this area
  • Much less real-world deployment
  • Most systems being built today don’t span trust domains
  • Hard to reason about benefits on compromise


What is Byzantine Behavior?

• Anything that doesn't follow our protocol.
• Malicious code/nodes.
• Buggy code.
• Faulty networks that deliver corrupted packets.
• Disks that corrupt, duplicate, lose, or fabricate data.
• Nodes impersonating others.
• Joining cluster without permission.
• Operating when they shouldn't (e.g. unexpected clock drift).
  • Service ops on a partition after partition was given to another
• Really wicked bad stuff: any arbitrary behavior.
• Subject to restriction: independence; will come back to this.


Review: Primary/Backup

• Want linearizable semantics
• f+1 replicas to tolerate f failures
• Runs into problems when “view changes” are needed (Lab 2).

[Diagram: a Primary and two Backups; message labels put(X,1), put(X,1), put(X,1).]


Review: Consensus

• Replicated log => replicated state machine
  • All execute same commands in same order
• Consensus module ensures proper log replication
  • Makes progress if any majority of servers are up
  • 2f+1 servers to remain available with up to f failures
• Failure model: fail-stop (not Byzantine), delayed/lost messages

[Diagram: three Servers, each with a Consensus Module, a Log (add, jmp, mov, shl), and a State Machine; Clients submit a command (shl).]


3f+1?

• At f+1 we can tolerate f failures and hold on to data.
• At 2f+1 we can tolerate f failures and remain available.
• What do we get for 3f+1?
  • SMR that can tolerate f malicious or arbitrarily nasty failures
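The three replica counts on this slide can be tabulated with a small Python sketch. The function name and dictionary keys are illustrative labels, not terms from the paper.

```python
def replicas_needed(f):
    """Replica counts from the slide for tolerating f faulty nodes."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return {
        "durable": f + 1,        # hold on to data despite f fail-stop failures
        "available": 2 * f + 1,  # remain available despite f fail-stop failures
        "byzantine": 3 * f + 1,  # SMR despite f arbitrary (Byzantine) failures
    }


for f in (1, 2):
    print(f, replicas_needed(f))
```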


First, a Few Issues

1. Caveat: Independence
2. Spoofing/Authentication


The Caveat: Independence

• Assumes independent node failures for BFT!
• Is this a big assumption?
  • We actually had this assumption with consensus
  • If nodes fail in a correlated way it amplifies the loss of a single node
  • If factor is > f then system still wedges.
• Put another way: for Paxos to remain available when software bugs can produce temporally related crashes, what do we need?
  • 2f+1 independent implementations…


The Struggle for Independence

• Same here: for true independence we’ll need 3f+1 implementations
• But it is more important here
1. Nodes may be actively malicious and that should be ok.
  • But they are looking for our weak spot and will exploit to amplify their effect.
2. If > f failures here, anything can happen to the data.
  • Attacker might change it, delete it, etc… We’ll never know.
• Requires different implementations, operating systems, root passwords, administrators. Ugh!


Spoofing/Authentication

[Diagram: a client request get(X).]


Malicious Primary?

• Might lie!

[Diagram: get(X); values shown: X=10, X=10, X=-1.]


Malicious Primary?

• Might lie!
• Solution: direct response from participants

[Diagram: get(X); values shown: X=10, X=10, X=-1, X=10, X=10.]


Malicious Primary?

• Might lie!
• Solution: direct response from participants
• Problem again: primary just lies more

[Diagram: get(X); values shown: X=10, X=10, X=-1, X=10, X=10, X=-1, X=-1.]


The Need for Crypto

• Need to be able to authenticate messages
• Public-key crypto for signatures
  • Each client and server has a private and public key
  • All hosts know all public keys
  • Signed messages are signed with private key
  • Public key can verify that message came from host with the private key
• While we’re on it: we’ll need hashes/digests also


Authenticated Messages

• Client rejects duplicates or unknown signatures

[Diagram: servers S1, S2, S3; get(X); values X=10, X=10; replies "X=-1, signed S1", "X=10, signed S3", "X=10, signed S2", plus a duplicate "X=-1, signed S1" and "X=-1, signed S??".]


How is this possible? Why 3f+1?

• First, remember the rules
• Must be able to make progress with n minus f responses
  • n = 3f+1
  • Progress with 3f+1 - f = 2f+1
  • Often 4 total, progress with 3
• Why? In case those f will never respond


Try 2f+1, f=1

• Goal: make (safe) progress with only 2 of 3 responses.

[Diagram: servers S1, S2, S3 and client C1 issuing get(X); values shown: X=10, X=10, X=??, X=10, X=10.]


Try 2f+1, f=1

• Problem: what if S3 wasn’t down, but slow
• Instead the failure is a compromised S2
• Client can wait for f+1 matching responses

[Diagram: servers S1, S2, S3 and client C1 issuing get(X); values shown: X=10, X=-1, X=10, X=10, X=-1.]


Try 2f+1, f=1

• Problem: what if S3 is behind, doesn’t know value of X yet?
• Can’t distinguish truth without f+1 known good values
• Fix: replicate to at least 2f+1, tolerate f slow/down => 3f+1
• 2f+1 - f = f+1, enough to determine truth in face of f lies

[Diagram: servers S1, S2, S3 and client C1 issuing get(X); values shown: X=10, X=-1, X=??, X=10, X=-1.]


3f+1

• Progress with only 2f+1 responses and safe
• Among 2f+1 only f can be bogus. f+1 > f.

[Diagram: servers S1, S2, S3, S4 and client C1 issuing get(X); values shown: X=10, X=10, X=10, X=??, X=10, X=10, X=-1.]
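The counting argument from the last few slides can be checked with a short Python sketch. It assumes the worst case described above: the client can wait for only n - f replies, and all f faulty replicas are among the responders. The function name is illustrative.

```python
def honest_outnumber_bogus(n, f):
    """True if, after waiting for only n - f replies, honest replies
    are guaranteed to outnumber bogus ones."""
    replies = n - f           # cannot wait for the f replicas that may never respond
    bogus = f                 # worst case: every faulty replica is among the responders
    honest = replies - bogus  # the rest of the replies come from honest replicas
    return honest > bogus


for n in (3, 4):
    print(n, honest_outnumber_bogus(n, f=1))
```

With f=1 the check fails for n=3 (one honest reply against one bogus reply) and passes for n=4, matching the slides: the condition n - 2f > f holds exactly when n >= 3f+1.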


10,000 ft View

1. Client sends request to primary.
2. Primary sends request to all backups.
3. Replicas execute the request and send the reply to the client.
4. Client waits for f+1 responses with the same result.
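Step 4 can be sketched in Python as follows. This is a minimal illustration that assumes replies have already passed signature verification and arrive as (replica_id, result) pairs; accept_result and known_replicas are hypothetical names.

```python
from collections import defaultdict


def accept_result(replies, f, known_replicas):
    """Return the first result reported by f+1 distinct known replicas, else None."""
    voters = defaultdict(set)  # result -> ids of replicas that reported it
    for replica_id, result in replies:
        if replica_id not in known_replicas:
            continue  # unknown sender: ignore the reply
        voters[result].add(replica_id)  # a set drops duplicate replies from one replica
        if len(voters[result]) >= f + 1:
            return result
    return None


replies = [("S1", -1), ("S1", -1), ("S?", -1), ("S2", 10), ("S3", 10)]
print(accept_result(replies, f=1, known_replicas={"S1", "S2", "S3", "S4"}))
```

Because at most f replicas are faulty, f+1 matching replies from distinct replicas include at least one from an honest replica.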


Protocol Pieces

• Deal with failure of primaries
  • View changes (Lab 2/4 style)
  • Similar to Raft, VR
• Must order operations within a view
• Must ensure operations execute within their view


Views

• System goes through a series of views
  • In view v, replica (v mod (3f+1)) is designated primary
  • Responsible for selecting the order of operations
  • Assigns an increasing sequence number to each operation
• Tentative order subject to replicas accepting
  • May get rejected if a new view is established
  • Or if order is inconsistent with prior operations
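The primary-selection rule is a single modulo operation. A Python sketch, assuming replicas are numbered 0 through 3f (the function name is illustrative):

```python
def primary_for_view(view, f):
    """Replica id designated primary in the given view, with n = 3f+1 replicas."""
    n = 3 * f + 1
    return view % n


# With f=1 the primary rotates through replicas 0..3 as views advance.
print([primary_for_view(v, f=1) for v in range(6)])
```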


Request Handling Phases

• In normal-case operation, use two-phase protocol for request r:
• Phase 1 (pre-prepare, prepare) goal:
  • Ensure at least f+1 honest replicas agree that: if request r executes in view v, it will execute with seq n
• Phase 2 (prepare, commit) goal:
  • Ensure at least f+1 honest replicas agree that: request r has executed in view v with seq n
• 2PC-like:
  • Phase 1 quibble about order, Phase 2 atomicity


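Phase 1

• Client to Primary: {REQUEST, op, timestamp, clientId}_sc
• Primary to Replicas: {PRE-PREPARE, view, seqn, h(req)}_sp, req
• Replicas to Replicas: {PREPARE, view, seqn, h(req), replicaId}_sri

(The subscripts sc, sp, and sri denote the signature of the sending client, primary, or replica i.)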
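The three Phase 1 messages can be modeled as plain records. The Python sketch below omits signatures and uses SHA-256 for h(req); both are assumptions made for illustration, and the class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    op: str
    timestamp: int
    client_id: str

    def digest(self):
        # h(req): SHA-256 is a stand-in hash chosen for this sketch
        payload = f"{self.op}|{self.timestamp}|{self.client_id}".encode()
        return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class PrePrepare:
    view: int
    seqn: int
    req_digest: str


@dataclass(frozen=True)
class Prepare:
    view: int
    seqn: int
    req_digest: str
    replica_id: int


def make_pre_prepare(view, seqn, req):
    """Primary assigns a sequence number and binds it to the request digest."""
    return PrePrepare(view, seqn, req.digest())


def make_prepare(pre_prepare, replica_id):
    """A replica echoes the (view, seqn, digest) it accepted."""
    return Prepare(pre_prepare.view, pre_prepare.seqn, pre_prepare.req_digest, replica_id)


req = Request("put(X,1)", timestamp=1, client_id="c1")
print(make_prepare(make_pre_prepare(0, 7, req), replica_id=2))
```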

We define the committed and committed-local predicates as follows: committed(m, v, n) is true if and only if prepared(m, v, n, i) is true for all i in some set of f+1 non-faulty replicas; and committed-local(m, v, n, i) is true if and only if prepared(m, v, n, i) is true and i has accepted 2f+1 commits (possibly including its own) from different replicas that match the pre-prepare for m; a commit matches a pre-prepare if they have the same view, sequence number, and digest.

The commit phase ensures the following invariant: if committed-local(m, v, n, i) is true for some non-faulty i then committed(m, v, n) is true. This invariant and the view-change protocol described in Section 4.4 ensure that non-faulty replicas agree on the sequence numbers of requests that commit locally even if they commit in different views at each replica. Furthermore, it ensures that any request that commits locally at a non-faulty replica will commit at f+1 or more non-faulty replicas eventually.

Each replica i executes the operation requested by m after committed-local(m, v, n, i) is true and i’s state reflects the sequential execution of all requests with lower sequence numbers. This ensures that all non-faulty replicas execute requests in the same order as required to provide the safety property. After executing the requested operation, replicas send a reply to the client. Replicas discard requests whose timestamp is lower than the timestamp in the last reply they sent to the client to guarantee exactly-once semantics.

We do not rely on ordered message delivery, and therefore it is possible for a replica to commit requests out of order. This does not matter since it keeps the pre-prepare, prepare, and commit messages logged until the corresponding request can be executed.

Figure 1 shows the operation of the algorithm in the normal case of no primary faults. Replica 0 is the primary, replica 3 is faulty, and C is the client.

[Figure 1: Normal Case Operation. Message flow among client C and replicas 0, 1, 2, 3 through the request, pre-prepare, prepare, commit, and reply phases; replica 3 is marked faulty (X).]

4.3 Garbage Collection

This section discusses the mechanism used to discard messages from the log. For the safety condition to hold, messages must be kept in a replica’s log until it knows that the requests they concern have been executed by at least f+1 non-faulty replicas and it can prove this to others in view changes. In addition, if some replica misses messages that were discarded by all non-faulty replicas, it will need to be brought up to date by transferring all or a portion of the service state. Therefore, replicas also need some proof that the state is correct.

Generating these proofs after executing every operation would be expensive. Instead, they are generated periodically, when a request with a sequence number divisible by some constant (e.g., 100) is executed. We will refer to the states produced by the execution of these requests as checkpoints and we will say that a checkpoint with a proof is a stable checkpoint.

A replica maintains several logical copies of the service state: the last stable checkpoint, zero or more checkpoints that are not stable, and a current state. Copy-on-write techniques can be used to reduce the space overhead to store the extra copies of the state, as discussed in Section 6.3.

The proof of correctness for a checkpoint is generated as follows. When a replica i produces a checkpoint, it multicasts a message <CHECKPOINT, n, d, i> (signed by i) to the other replicas, where n is the sequence number of the last request whose execution is reflected in the state and d is the digest of the state. Each replica collects checkpoint messages in its log until it has 2f+1 of them for sequence number n with the same digest d signed by different replicas (including possibly its own such message). These 2f+1 messages are the proof of correctness for the checkpoint.

A checkpoint with a proof becomes stable and the replica discards all pre-prepare, prepare, and commit messages with sequence number less than or equal to n from its log; it also discards all earlier checkpoints and checkpoint messages.

Computing the proofs is efficient because the digest can be computed using incremental cryptography [1] as discussed in Section 6.3, and proofs are generated rarely.

The checkpoint protocol is used to advance the low and high water marks (which limit what messages will be accepted). The low-water mark h is equal to the sequence number of the last stable checkpoint. The high water mark H = h + k, where k is big enough so that replicas do not stall waiting for a checkpoint to become stable. For example, if checkpoints are taken every 100 requests, k might be 200.

4.4 View Changes

The view-change protocol provides liveness by allowing the system to make progress when the primary fails. View changes are triggered by timeouts that prevent backups from waiting indefinitely for requests to execute. A backup is waiting for a request if it received a valid request


Phase 1

• Each replica waits for PRE-PREPARE + 2f matching PREPARE messages
  • Puts these messages in its log
  • Then we say prepared(req,v,n,i) is TRUE
• If prepared(req,v,n,i) is TRUE for honest replica ri, then prepared(req',v,n,j) where req' != req is FALSE for any honest rj
  • So no other operation can execute with view v sequence number n
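The prepared predicate can be sketched in Python as a check over one replica's log. Messages are modeled as tuples and assumed to be already authenticated; the representation is an assumption for illustration.

```python
def prepared(pre_prepare, prepares, f):
    """pre_prepare: (view, seqn, digest) logged by this replica, or None.
    prepares: iterable of (view, seqn, digest, replica_id) logged by this replica."""
    if pre_prepare is None:
        return False
    # Count distinct senders whose PREPARE matches the logged PRE-PREPARE.
    senders = {rid for (view, seqn, digest, rid) in prepares
               if (view, seqn, digest) == pre_prepare}
    return len(senders) >= 2 * f


log = [(0, 7, "d", 1), (0, 7, "d", 2)]
print(prepared((0, 7, "d"), log, f=1))
```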



Why No Double Prepares?

prepared(req,v,n,i) → not prepared(req’,v,n,j) for honest ri and rj
Honest intersection of maximally disjoint 2f+1 sets is non-empty

[Diagram: two sets of replicas, each labeled 2f+1.]
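The intersection claim can be checked by brute force for small f. The Python sketch below enumerates every pair of 2f+1-sized sets drawn from 3f+1 replicas and reports the smallest overlap; the function name is illustrative.

```python
from itertools import combinations


def min_quorum_overlap(f):
    """Smallest intersection between any two sets of 2f+1 replicas out of 3f+1."""
    n, q = 3 * f + 1, 2 * f + 1
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)


for f in (1, 2):
    print(f, min_quorum_overlap(f))
```

The minimum overlap is 2(2f+1) - (3f+1) = f+1 replicas. Since at most f of them are faulty, at least one honest replica sits in both sets, and an honest replica does not send PREPAREs for two different requests at the same (v, n).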


Phase 2

• Problem: Just because some other req' won't execute at (v,n) doesn't mean req will


Problem: Prepared != Committed

• S3 prepared, but couldn’t get PREPARE out
• S2 becomes primary in new view
• Can’t find PRE-PREPARE + 2f PREPAREs in any log
  • S1: {S1,S2}, S2: {S1,S2}, S4: {}
• New primary must fill ‘hole’ so log can move forward

[Diagram: timeline for client C and servers S1, S2, S3, S4 through Pre-prepare, Prepares, View Change, and New View.]


Phase 2

• Make sure op doesn't execute until prepared(req,v,n,i) is TRUE for f+1 non-faulty replicas
• We say committed(req,v,n) is TRUE when this property holds
• How does replica know committed(req,v,n) holds?
• Add one more message: ri -> R {COMMIT, view, seqno, h(req), replicaId}
• Once 2f+1 COMMITs at a node, then apply op and respond to client
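A Python sketch of the last bullet, combined with the in-order execution rule from the paper excerpt. It assumes every COMMIT passed in already matches the replica's logged pre-prepare (same view and digest) and that the request is prepared locally; CommitTracker is a hypothetical name.

```python
class CommitTracker:
    """Executes sequence numbers in order once each has 2f+1 COMMITs."""

    def __init__(self, f):
        self.f = f
        self.commits = {}   # seqn -> set of replica ids whose COMMIT was seen
        self.next_seqn = 1  # next sequence number eligible to execute
        self.executed = []  # sequence numbers applied so far, in order

    def on_commit(self, seqn, replica_id):
        self.commits.setdefault(seqn, set()).add(replica_id)
        # Apply every consecutive seqn that has reached 2f+1 distinct COMMITs.
        while len(self.commits.get(self.next_seqn, ())) >= 2 * self.f + 1:
            self.executed.append(self.next_seqn)
            self.next_seqn += 1


tracker = CommitTracker(f=1)
for rid in (0, 1, 2):
    tracker.on_commit(2, rid)  # seqn 2 commits first but must wait for seqn 1
for rid in (0, 1, 3):
    tracker.on_commit(1, rid)
print(tracker.executed)
```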


View Changes

• Allows progress if primary fails (or is slow)
• If operation on backup pending for long time, it sends {VIEW-CHANGE, view+1, seqn, ChkPointMsgs, P, i}_si
• New primary issues NEW-VIEW once it has 2f VIEW-CHANGE msgs
  • Includes signed VIEW-CHANGEs as proof it can change view
  • Q: What goes wrong without this?
• Then, for each seqno since lowest stable checkpoint
  • Use P from above: set of sets of PRE-PREPARE + 2f PREPAREs
  • For seqno with valid PRE-PREPARE + 2f PREPAREs, reissue PRE-PREPARE in v+1
  • For seqno not in P, {PRE-PREPARE, v+1, seqno, null}
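The last three bullets can be sketched in Python. The sketch assumes the P sets from the VIEW-CHANGE messages have already been validated and flattened into a dictionary from seqno to request digest, and that the caller supplies the range of sequence numbers to cover; all names are illustrative.

```python
def new_view_pre_prepares(new_view, low, high, prepared_digests):
    """Build PRE-PREPAREs for every seqno in (low, high] for the new view.

    prepared_digests: dict seqno -> digest for requests that some VIEW-CHANGE
    message proved prepared (PRE-PREPARE + 2f PREPAREs).
    """
    messages = []
    for seqno in range(low + 1, high + 1):
        # None stands for the null request that fills a hole in the log.
        digest = prepared_digests.get(seqno)
        messages.append(("PRE-PREPARE", new_view, seqno, digest))
    return messages


for msg in new_view_pre_prepares(1, low=100, high=103,
                                 prepared_digests={101: "d1", 103: "d3"}):
    print(msg)
```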


• Once committed, at least f+1 non-faulty replicas have agreed on the operation and its placement in the total order of operations
  • Even across view changes


Checkpoints/GC

• Need to occasionally snapshot SM and truncate log
• Problem: how can one replica trust the checkpoint of another?
• Idea: when (seqn mod 100) = 0, broadcast {CHECKPOINT, seqn, h(state), i}_si
• Once 2f+1 CHECKPOINTs have been collected, can trust CHECKPOINT at seqn with correct digest (at least f+1 non-faulty servers have a correct checkpoint at seqn)
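A Python sketch of the stability test for a checkpoint. It assumes CHECKPOINT messages have already been signature-checked and are given as (seqn, state_digest, replica_id) tuples; the function name is illustrative.

```python
from collections import defaultdict


def stable_checkpoint_digest(messages, seqn, f):
    """Return the state digest backed by 2f+1 distinct replicas at seqn, else None."""
    signers = defaultdict(set)  # digest -> replica ids that vouched for it
    for msg_seqn, digest, replica_id in messages:
        if msg_seqn == seqn:
            signers[digest].add(replica_id)
    for digest, ids in signers.items():
        if len(ids) >= 2 * f + 1:
            return digest
    return None


msgs = [(100, "abc", 0), (100, "abc", 1), (100, "bad", 3), (100, "abc", 2)]
print(stable_checkpoint_digest(msgs, seqn=100, f=1))
```

Two different digests cannot both reach 2f+1 signers at the same seqn unless some replica signs both, since any two such sets of signers overlap in at least f+1 replicas.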


Liveness – View Changes

• Interesting issue: can’t let a single node start a view change!
  • Why? Could livelock the system by spamming view changes.
• Resolution: wait for f+1 servers to timeout and independently send VIEW-CHANGE requests.
• Interacts with an optimization: to try to ensure that view changes succeed, any node that gets more than f+1 VIEW-CHANGE requests issues one as well.
  • This prevents cases where they timeout slowly and then the oldest VIEW-CHANGE issuer rolls over to VIEW-CHANGE v+2.
• Have to be careful still: need to wait on this optimization until f+1 VIEW-CHANGEs away from v.
  • Why? Otherwise might be doing the bidding of a malicious node.


Discussion

• What problem does this solve?
• Would your boss be ok with 4 designs/implementations?
• How can system tolerate more than f (non-simultaneous) failures over its lifetime?
  • Periodically recover each server? Could help some…
  • What if private key compromised?
• Important point: it is possible to operate in the face of Byzantine faults
  • Maybe even efficiently


Performance Tricks

• Don’t have replicas respond with operation results, just digests
  • Only primary has to give result
• Delays: client to primary, pre-prepare, prepare, commit, reply
  • Idea: commit prepared operations tentatively.
  • If wrong, rollback.
  • Operations unlikely to fail to commit if they prepare successfully.
• Tentatively execute reads against tentative operations, but withhold reply until all operations read from have committed.


Crypto

• Can’t afford digital signatures on all messages to authenticate
• Instead all pairs of hosts share a secret key
• Send MAC of each message (h(m + secret key)) to verify integrity, authenticity.
• Problem: what about messages with multiple recipients?
  • e.g. client operation request message?
  • Can’t let faulty nodes spoof operations.
  • Put a vector of MACs in for the message, one for every node in the system.
  • Probably 4 or 7 hosts. Constant time to verify, linear to generate.
  • 37 replicas, MAC vectors still 100x faster to generate than 1024 bit RSA sig.
  • Output is also smaller than a 1024 bit sig.
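The MAC-vector idea can be sketched with Python's standard hmac module. Note that this swaps in HMAC-SHA256 for the h(m + secret key) construction on the slide; the key values and function names are made up for illustration.

```python
import hashlib
import hmac


def mac(key, message):
    """MAC of message under a pairwise secret key (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_authenticator(pairwise_keys, message):
    """One MAC per receiver: linear in the number of nodes to generate."""
    return {rid: mac(key, message) for rid, key in pairwise_keys.items()}


def verify(receiver_id, key, message, authenticator):
    """Each receiver checks only its own entry: constant time to verify."""
    tag = authenticator.get(receiver_id)
    return tag is not None and hmac.compare_digest(tag, mac(key, message))


keys = {1: b"key-with-replica-1", 2: b"key-with-replica-2"}
auth = make_authenticator(keys, b"put(X,1)")
print(verify(1, keys[1], b"put(X,1)", auth))
```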


Why Pre-prepare, Prepare, Commit?

• Pre-prepare
  • Broadcast viewno, seqn, and message digest.
  • Backup accepts
    • If digest is ok for the message
    • Backup is in same view
    • Hasn’t accepted a pre-prepare for seqno in viewno with a different digest.
  • If it accepts it broadcasts prepare
• Prepare
• Commit
  • Similar to our decided; informs everyone of the chosen value
  • Difference: can’t take sender’s word for it, need proof that the cluster agrees.
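The three acceptance conditions for a pre-prepare can be sketched in Python. SHA-256 is used as the digest function for illustration, and the Backup class and its fields are hypothetical.

```python
import hashlib


class Backup:
    def __init__(self, view):
        self.view = view
        self.accepted = {}  # (view, seqn) -> digest of the accepted pre-prepare

    def accept_pre_prepare(self, view, seqn, digest, request):
        # 1. The digest must match the request it claims to cover.
        if hashlib.sha256(request).hexdigest() != digest:
            return False
        # 2. The backup must be in the same view as the pre-prepare.
        if view != self.view:
            return False
        # 3. No pre-prepare with a different digest for the same (view, seqn).
        prior = self.accepted.get((view, seqn))
        if prior is not None and prior != digest:
            return False
        self.accepted[(view, seqn)] = digest
        return True  # the backup would now broadcast PREPARE


backup = Backup(view=0)
req = b"put(X,1)"
print(backup.accept_pre_prepare(0, 7, hashlib.sha256(req).hexdigest(), req))
```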


Phase 2

• Just because some other req' won't execute at (v,n) doesn't mean req will
• Suppose ri is compromised right after prepared(req,v,n,i)
• Suppose no other replica received ri's PREPARE
• Suppose f replicas are slow and never even received the PRE-PREPARE
• No other honest replica will know the request prepared!
• Particularly if p fails, request might not get executed!