Protocol Verification with Merci


Page 1: Protocol Verification with Merci

Protocol Verification with Merci

Mark R. Tuttle and Amit Goel, DTS SCL

Page 2: Protocol Verification with Merci

Introduction

• I love proof
  – Proof is the path to understanding why things work
  – But theorem provers are too hard for the masses (even me)

• I advocate model checking at Intel
  – It is the path to automated formal verification for the masses
  – But model checkers verify without explaining, and don’t scale

• But the world has changed
  – Decision procedures and SMT now automate some forms of proof
  – Is theorem proving now viable for nonspecialists in product groups?

Slide 2

Page 3: Protocol Verification with Merci

Our result

• Amit wrote Merci: SMT-based proof checker from SCL
  – Systems modeled with guarded commands (like Murphi, TLA+)
  – Clean mapping to decision procedures of an SMT solver

• Mark validated a classical distributed algorithm
  – A novice: no prior exposure to Merci, little exposure to SMT
  – Model done in 3 days, proof done in 3 days, just 9 pages long
  – Model looks like ordinary code, invariants explain the algorithm

• Found little need to coach the prover about “obvious” things

Slide 3

Page 4: Protocol Verification with Merci

Consensus [Pease, Shostak, Lamport]

• Validity:
  – Each output was an input

• Agreement:
  – All outputs are equal

• Termination:
  – All nodes choose an output

Figure: nodes n1, n2, n3 with inputs 0, 1, 0 exchange messages (message passing) and produce outputs 1, 1, 1.
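The three consensus properties can be phrased as executable checks. The following is a minimal Python sketch, not part of the original slides; the function name `check_consensus` and the dictionary representation of inputs and outputs are assumptions made for illustration. It uses the inputs (0, 1, 0) and outputs (1, 1, 1) from the slide's figure.

```python
from typing import Dict, Optional


def check_consensus(inputs: Dict[str, int],
                    outputs: Dict[str, Optional[int]]) -> Dict[str, bool]:
    """Evaluate validity, agreement, and termination for one finished run.

    An output of None means the node never chose a value (hypothetical encoding).
    """
    decided = [v for v in outputs.values() if v is not None]
    return {
        # Validity: each output was an input of some node.
        "validity": all(v in inputs.values() for v in decided),
        # Agreement: all chosen outputs are equal.
        "agreement": len(set(decided)) <= 1,
        # Termination: every node chose an output.
        "termination": all(outputs.get(n) is not None for n in inputs),
    }


inputs = {"n1": 0, "n2": 1, "n3": 0}
outputs = {"n1": 1, "n2": 1, "n3": 1}
result = check_consensus(inputs, outputs)
print(result)
```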

Slide 4

Page 5: Protocol Verification with Merci

A shocking result!

• Consensus is impossible in an asynchronous system if even one node can fail. [Fischer, Lynch, Paterson]
  – Asynchronous: no bound on node step time or message delivery time
  – Failure: node just stops (crashes)

• A decade of papers
  – Different system models, different failure models
  – How fast? How few messages? How many failures?

• Consensus is the “hardest problem” in concurrency! [Herlihy]
  – but sometimes it can be solved…

Slide 5

Page 6: Protocol Verification with Merci

Synchronous model

Computation is a sequence of rounds of message passing.

Figure: timeline for a node. In round r, nodes send messages, nodes receive messages, and nodes change state; round r+1 follows.

Slide 6

Page 7: Protocol Verification with Merci

Crash failures

At most t nodes can fail.

Figure: three behaviors of a node n in a round:
– n is correct: sends all messages
– n is silent: sends no messages
– n crashes: sends some messages

Slide 7

Page 8: Protocol Verification with Merci

Algorithm

procedure consensus (node n)
  state ← { input }
  for each round r = 1, 2, …, t+1 do
    broadcast state to all nodes
    receive state1, state2, …, statek from other nodes
    state ← state ∪ state1 ∪ state2 ∪ … ∪ statek
  output ← min(state)

Validity: each output was an input
Termination: all nodes choose an output at end of round t+1
Agreement: ???

[Dolev, Strong]
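To see the algorithm run under crash failures, here is a small Python simulation. It is a sketch under stated assumptions, not the authors' Merci model: the function `run_consensus`, the `crash_round` map (faulty node to the round in which it crashes), and the coin flip that decides which messages a crashing node drops are all hypothetical. Each node keeps its own state when it merges received states.

```python
import random
from typing import Dict, List, Set


def run_consensus(inputs: List[int], t: int, crash_round: Dict[int, int],
                  rng: random.Random) -> Dict[int, int]:
    """Simulate t+1 synchronous rounds and return outputs of the correct nodes.

    crash_round[p] = r means node p crashes during round r: it sends to an
    arbitrary subset of nodes in round r and is silent in every later round.
    """
    n = len(inputs)
    state: List[Set[int]] = [{v} for v in inputs]
    for r in range(1, t + 2):
        inbox: List[Set[int]] = [set() for _ in range(n)]
        for p in range(n):
            cr = crash_round.get(p)
            if cr is not None and cr < r:
                continue  # p crashed in an earlier round: sends nothing
            for q in range(n):
                if cr == r and rng.random() < 0.5:
                    continue  # p is crashing now: this message is lost
                inbox[q] |= state[p]
        for q in range(n):
            state[q] = state[q] | inbox[q]  # merge received states
    # Only nodes that never crash are required to produce an output.
    return {p: min(state[p]) for p in range(n) if p not in crash_round}


rng = random.Random(1)
outputs = run_consensus([0, 1, 0, 1], t=1, crash_round={0: 1}, rng=rng)
print(outputs)
```

With at most t entries in `crash_round`, the outputs of the surviving nodes should coincide in every run; the test harness below checks this on random crash schedules.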

Slide 8

Page 9: Protocol Verification with Merci

Clean round: no nodes fail [Dwork, Moses]

• There is a clean round in t+1 rounds (at most t failures).
• Nodes have same state after a clean round.
• Nodes choose same output value min(state). Agreement!

Figure label: Clean round!
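The existence of a clean round is a pigeonhole argument: at most t crashes cannot cover t+1 rounds. A short Python sketch of that argument follows; the function name `first_clean_round` and the list-of-crash-rounds encoding are assumptions for illustration.

```python
from typing import List, Optional


def first_clean_round(crash_rounds: List[int], t: int) -> Optional[int]:
    """Return the first round in 1..t+1 in which no node crashes.

    crash_rounds holds one entry per faulty node: the round in which it crashes.
    """
    if len(crash_rounds) > t:
        raise ValueError("at most t nodes can fail")
    dirty = set(crash_rounds)
    for r in range(1, t + 2):
        if r not in dirty:
            return r
    return None  # unreachable when len(crash_rounds) <= t (pigeonhole)


# Two crashes (rounds 1 and 2) with t = 2: round 3 is the first clean round.
clean = first_clean_round([1, 2], t=2)
print(clean)
```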

Slide 9

Page 10: Protocol Verification with Merci

Merci

• A typed procedural language

• Guarded commands used to describe systems

type node
var array(node, bool) y = mk_array[node](false)
var array(node, bool) critical = mk_array[node](false)
var node turn

transition unit req_critical (node n)
  require (!y[n])
  { y[n] := true; }

transition unit enter_critical (node n)
  require (y[n] && !critical[n] && turn=n)
  { critical[n] := true; }

transition unit exit_critical (node n)
  require (critical[n])
  { critical[n] := false; y[n] := false; nondet turn; }

[Amit Goel]
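To make the guarded-command reading concrete, the mutual exclusion example can be explored by brute force. The Python sketch below is an illustration, not Merci itself: it fixes three nodes, treats the uninitialized `turn` as an arbitrary initial value (an assumption), and enumerates every reachable state by firing each transition whose `require` guard holds.

```python
from typing import Iterator, Set, Tuple

N = 3  # number of nodes, chosen for illustration
State = Tuple[Tuple[bool, ...], Tuple[bool, ...], int]  # (y, critical, turn)


def set_at(xs: Tuple[bool, ...], i: int, v: bool) -> Tuple[bool, ...]:
    """Return a copy of xs with position i set to v."""
    return xs[:i] + (v,) + xs[i + 1:]


def successors(s: State) -> Iterator[State]:
    y, critical, turn = s
    for n in range(N):
        if not y[n]:  # req_critical(n)
            yield (set_at(y, n, True), critical, turn)
        if y[n] and not critical[n] and turn == n:  # enter_critical(n)
            yield (y, set_at(critical, n, True), turn)
        if critical[n]:  # exit_critical(n); "nondet turn" tries every value
            for new_turn in range(N):
                yield (set_at(y, n, False), set_at(critical, n, False), new_turn)


def reachable() -> Set[State]:
    falses = (False,) * N
    frontier = [(falses, falses, turn) for turn in range(N)]
    seen = set(frontier)
    while frontier:
        s = frontier.pop()
        for nxt in successors(s):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def mutex(s: State) -> bool:
    return sum(s[1]) <= 1  # at most one node is in its critical section


states = reachable()
all_mutex = all(mutex(s) for s in states)
print(len(states), all_mutex)
```

This is the model-checking view of the example: it confirms the property for one fixed N but does not explain why it holds.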

Page 11: Protocol Verification with Merci

Merci

• A typed procedural language

• Guarded commands used to describe systems

• A goal description language for compositional reasoning

def bool mutex = (node n1, node n2) (critical[n1] && critical[n2] => n1=n2)

def bool aux = (node n) (critical[n] => turn=n)

goal g0 = invariant mutex assuming aux
goal g1 = invariant aux

[Amit Goel]
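The role of the auxiliary invariant can be illustrated by checking inductiveness over all states rather than only reachable ones. The Python sketch below is one plausible reading of `invariant ... assuming ...` (the goal is preserved by every transition from states that satisfy both the goal and the assumption); Merci's exact semantics may differ, and the helper `preserved` and the three-node bound are assumptions.

```python
from itertools import product

N = 3
BOOLS = (False, True)


def set_at(xs, i, v):
    return xs[:i] + (v,) + xs[i + 1:]


def successors(s):
    """Fire every enabled transition of the mutual exclusion example."""
    y, critical, turn = s
    for n in range(N):
        if not y[n]:
            yield (set_at(y, n, True), critical, turn)
        if y[n] and not critical[n] and turn == n:
            yield (y, set_at(critical, n, True), turn)
        if critical[n]:
            for new_turn in range(N):
                yield (set_at(y, n, False), set_at(critical, n, False), new_turn)


def all_states():
    for y in product(BOOLS, repeat=N):
        for critical in product(BOOLS, repeat=N):
            for turn in range(N):
                yield (y, critical, turn)


def mutex(s):
    _, critical, _ = s
    return all(n1 == n2 for n1 in range(N) for n2 in range(N)
               if critical[n1] and critical[n2])


def aux(s):
    _, critical, turn = s
    return all(turn == n for n in range(N) if critical[n])


def preserved(goal, assumption=lambda s: True):
    """Inductive step: goal survives one transition from any goal+assumption state."""
    return all(goal(nxt)
               for s in all_states() if goal(s) and assumption(s)
               for nxt in successors(s))


initial = [((False,) * N, (False,) * N, turn) for turn in range(N)]
g1_holds = all(aux(s) for s in initial) and preserved(aux)
g0_holds = all(mutex(s) for s in initial) and preserved(mutex, assumption=aux)
mutex_alone_inductive = preserved(mutex)
print(g1_holds, g0_holds, mutex_alone_inductive)
```

In this sketch `mutex` is not inductive by itself (an unreachable state with one node critical and `turn` pointing at another node breaks it), which is why the proof is split into two goals.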

Page 12: Protocol Verification with Merci

Merci

• A typed procedural language

• Guarded commands used to describe systems

• A goal description language for compositional reasoning

• A template system for extending the language

template <type elem> Set {
  type t  // set type
  const bool mem (elem x, t s)
  const t add (elem x, t s)
  const t remove (elem x, t s)

  axiom mem_add = (elem x, elem y, t s)
    (mem (x, add (y, s)) = (x = y || mem (x, s)))

  axiom mem_remove = (elem x, elem y, t s)
    (mem (x, remove (y, s)) = (x != y && mem (x, s)))
}

type node
module Node = Set<type node>

[Amit Goel]
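The two set axioms can be sanity-checked against a concrete model. The Python sketch below interprets `mem`, `add`, and `remove` with `frozenset` over a three-element universe (both choices are assumptions for illustration) and tests each axiom exhaustively.

```python
from itertools import combinations, product

UNIVERSE = (0, 1, 2)  # hypothetical element type with three values


def mem(x, s):
    return x in s


def add(x, s):
    return s | {x}


def remove(x, s):
    return s - {x}


def all_sets():
    """Every subset of UNIVERSE as a frozenset."""
    for k in range(len(UNIVERSE) + 1):
        for combo in combinations(UNIVERSE, k):
            yield frozenset(combo)


def mem_add_holds() -> bool:
    # axiom mem_add: mem(x, add(y, s)) = (x = y || mem(x, s))
    return all(mem(x, add(y, s)) == (x == y or mem(x, s))
               for x, y, s in product(UNIVERSE, UNIVERSE, list(all_sets())))


def mem_remove_holds() -> bool:
    # axiom mem_remove: mem(x, remove(y, s)) = (x != y && mem(x, s))
    return all(mem(x, remove(y, s)) == (x != y and mem(x, s))
               for x, y, s in product(UNIVERSE, UNIVERSE, list(all_sets())))


print(mem_add_holds(), mem_remove_holds())
```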

Page 13: Protocol Verification with Merci

Crash failure model

def bool is_crash_behavior (Nodes crashed, Nodes crashing, message_pattern deliver) =
  (node p) (p ∈ crashed => is_silent(p,deliver)) &&
  (node p) (is_faulty(p,deliver) => p ∈ crashed || p ∈ crashing) &&
  Nodes.disjoint(crashed,crashing) &&
  Nodes.cardinality(crashed) + Nodes.cardinality(crashing) ≤ t

Figure labels: faulty, silent
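The crash-behavior predicate translates almost directly into executable form. The slides do not define `is_silent` or `is_faulty`, so the Python sketch below assumes that a silent node delivers no messages and a faulty node fails to deliver at least one; the boolean-matrix encoding of `deliver` and the explicit `t` parameter are also assumptions.

```python
from typing import List, Set

Deliver = List[List[bool]]  # deliver[p][q]: does p's message reach q this round?


def is_silent(p: int, deliver: Deliver) -> bool:
    return not any(deliver[p])  # assumed meaning: p delivers nothing


def is_faulty(p: int, deliver: Deliver) -> bool:
    return not all(deliver[p])  # assumed meaning: p misses at least one delivery


def is_crash_behavior(crashed: Set[int], crashing: Set[int],
                      deliver: Deliver, t: int) -> bool:
    nodes = range(len(deliver))
    return (all(is_silent(p, deliver) for p in crashed)
            and all(p in crashed or p in crashing
                    for p in nodes if is_faulty(p, deliver))
            and crashed.isdisjoint(crashing)
            and len(crashed) + len(crashing) <= t)


# Node 2 crashes mid-round: its message reaches node 0 only.
deliver = [[True, True, True],
           [True, True, True],
           [True, False, False]]
ok = is_crash_behavior(crashed=set(), crashing={2}, deliver=deliver, t=1)
print(ok)
```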

Slide 13

Page 14: Protocol Verification with Merci

Synchronous model

for each node p
  initialize state of p

for each round r
  for each p and q
    send msg from p to q
  for each p and q
    receive msg from p to q
  for each p
    update state of p

step         phase   program counter   algorithm
initialize   init    init[p]           how?
send         send    send[p][q]        what?
receive      recv    recv[p][q]        how?
update       comp    comp[p]           how? decide? decide!

Slide 14

Page 15: Protocol Verification with Merci

Synchronous model

• Transitions
  – initialize(p): init[p] ← true
  – start_send: phase ← send; increment round; send[p][q] ← false; recv[p][q] ← false; comp[p] ← false
  – send(p,q): send[p][q] ← true
  – start_recv: phase ← recv
  – recv(p,q): recv[p][q] ← true
  – start_comp: phase ← comp
  – comp(p): comp[p] ← true

is_init_phase = (phase = init)

init_phase_done = forall (node p) (init[p])
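The phase and program-counter bookkeeping can be sketched as a small guarded state machine. The Python class below is an illustration under assumptions: the guard for `start_send` follows the slides, while the guards for `start_recv` and `start_comp` (previous phase finished) and the helper `require` are guesses at the analogous conditions. It tracks only the bookkeeping flags, not messages or node state.

```python
def require(cond: bool) -> None:
    """Model a guard: refuse to fire a transition that is not enabled."""
    if not cond:
        raise RuntimeError("transition not enabled")


class SyncModel:
    def __init__(self, n: int) -> None:
        self.n = n
        self.phase = "init"
        self.round = 0
        self.init = [False] * n
        self.comp = [False] * n
        self.send = [[False] * n for _ in range(n)]
        self.recv = [[False] * n for _ in range(n)]

    def initialize(self, p: int) -> None:
        require(self.phase == "init" and not self.init[p])
        self.init[p] = True

    def start_send(self) -> None:
        init_done = self.phase == "init" and all(self.init)
        comp_done = self.phase == "comp" and all(self.comp)
        require(init_done or comp_done)
        n = self.n
        self.send = [[False] * n for _ in range(n)]
        self.recv = [[False] * n for _ in range(n)]
        self.comp = [False] * n
        self.round += 1
        self.phase = "send"

    def do_send(self, p: int, q: int) -> None:
        require(self.phase == "send" and not self.send[p][q])
        self.send[p][q] = True

    def start_recv(self) -> None:
        require(self.phase == "send" and all(all(row) for row in self.send))
        self.phase = "recv"

    def do_recv(self, p: int, q: int) -> None:
        require(self.phase == "recv" and not self.recv[p][q])
        self.recv[p][q] = True

    def start_comp(self) -> None:
        require(self.phase == "recv" and all(all(row) for row in self.recv))
        self.phase = "comp"

    def do_comp(self, p: int) -> None:
        require(self.phase == "comp" and not self.comp[p])
        self.comp[p] = True


def run_round(m: SyncModel) -> None:
    """Fire the transitions of one full round in a fixed order."""
    m.start_send()
    for p in range(m.n):
        for q in range(m.n):
            m.do_send(p, q)
    m.start_recv()
    for p in range(m.n):
        for q in range(m.n):
            m.do_recv(p, q)
    m.start_comp()
    for p in range(m.n):
        m.do_comp(p)


model = SyncModel(3)
for p in range(3):
    model.initialize(p)
run_round(model)
run_round(model)
print(model.round, model.phase)
```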

Slide 15

Page 16: Protocol Verification with Merci

transition start_sending ()
  require ( is_init_phase && init_phase_done ||
            is_comp_phase && comp_phase_done )
{
  "send[p][q], recv[p][q], comp[p] <= false"
  "message[p][q] <= null_message"

  round := round + 1;
  phase := send;

  crashed := Nodes.union(crashed,crashing);
  nondet crashing;
  nondet deliver;
  assume is_crash_behavior(crashed,crashing,deliver);
}

Slide 16

Page 17: Protocol Verification with Merci

transition send (node n, node m)
  require (is_send_phase)
  require (!send[n][m])
{
  messages[n][m] := (deliver[n][m] ? global_state[n] : null_message);
  send[n][m] := true;
}

Transition size

  initialize(p)    8 lines
  start_send()    16 lines    send(p,q)    9 lines
  start_recv()     5 lines    recv(p,q)    7 lines
  start_comp()     5 lines    comp(p)     13 lines

Slide 17

Page 18: Protocol Verification with Merci

Agreement proof

• Recall the agreement proof
  – A1: There is a clean round
  – A2: All states are equal at the end of a clean round
  – A3: All states remain equal after a clean round
  – A4: All nodes choose from their states the same output value

• Merci proof is short
  – A1: 7 lines
  – A2: 127 lines
  – A3: 12 lines
  – A4: 25 lines

• Merci proof is almost entirely at the algorithmic level

Slide 18

Page 19: Protocol Verification with Merci

A1: There is a clean round

def bool clean_round_by_round_t_plus_1 =
  round >= t+1 => !before_clean

def bool faulty_grows_until_clean_round =
  before_clean => Nodes.cardinality(faulty) >= round

goal clean1 = invariant faulty_grows_until_clean_round
goal clean2 = invariant clean_round_by_round_t_plus_1 assuming faulty_grows_until_clean_round
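The step from the first goal to the second is a short counting argument. The derivation below is a sketch that assumes the bound |faulty| ≤ t is available from the crash failure model (the slides state that at most t nodes can fail; how Merci discharges this fact is not shown).

```latex
\begin{align*}
&\text{Suppose } \mathit{round} \ge t+1 \text{ and } \mathit{before\_clean}. \\
&|\mathit{faulty}| \ge \mathit{round} \ge t+1
  && \text{(by faulty\_grows\_until\_clean\_round)} \\
&|\mathit{faulty}| \le t
  && \text{(at most $t$ failures)} \\
&\text{Contradiction, hence } \mathit{round} \ge t+1 \implies \lnot\,\mathit{before\_clean}.
\end{align*}
```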

Slide 19

Page 20: Protocol Verification with Merci

A2: All states equal …

def bool state_equality =
  (node n, node m) (noncrashed(n) && noncrashed(m) => state[n] = state[m])

def bool state_equality_in_clean =
  in_clean && send_phase_done && recv_phase_done => state_equality

• Proof
  – A2.1: If nonfaulty n has v, then n received v in a message
  – A2.2: That message was sent to everyone since round is clean
  – A2.3: If m received v in a message, then m has v
  – A2.4: So nonfaulty n and m have the same values

• Proof is algorithmic and short: steps A2.1 to A2.4 are 48, 34, 15, and 30 lines long

Slide 20

Page 21: Protocol Verification with Merci

Conclusion

• Classical fault-tolerant distributed algorithm proved with Merci
  – Model looks like ordinary code, invariants explain the algorithm
  – Merci proof is 170 lines, classical proof is 1+ page
  – Model and proof done in 6 days with no prior experience

• Yices made quantification hard
  – exists: usually have to produce the example by hand
  – forall: template instantiation wouldn’t find the right instantiation

• Yices counterexamples mostly useless
  – Get a context from first few lines, ignore the rest
  – “Is property false or is Yices failing to instantiate a forall template?”
  – BKM (best known method): Think about the algorithm itself, and ignore Yices output

Slide 21