©2016 OpenNFP 1

Open-NFP Summer Webinar Series: CAANS: Consensus As A Network Service

Huynh Tu Dang, Daniele Sciascia, Pietro Bressana, Han Wang, Ki Suh Lee, Hakim Weatherspoon, Marco Canini, Fernando Pedone, and Robert Soulé

Università della Svizzera italiana (USI), Cornell University, and Université catholique de Louvain

Outline

• Introduction • CAANS • Demo • Results • Conclusion

Introduction

Many distributed problems can be reduced to consensus

▪ E.g., Atomic commit, leader election, atomic broadcast

Consensus protocols are the foundation for fault-tolerant systems

▪ E.g., OpenReplica, Ceph, Chubby

Paxos: one of the most widely used consensus protocols

▪ Paxos is slow
▪ Optimizations: Generalized Paxos, Fast Paxos

Motivations

Network devices are becoming: ▪ More powerful: NFP-6xxxx, PISA chips, FlexPipe ▪ More programmable: custom pipeline processing

High level languages: ▪ OpenFlow, PX, P4

Co-design networks and consensus protocols

Background: Consensus Problem

Background on Paxos Consensus

Figure: the Paxos roles (Client, Proposer, Acceptor, Learner), labelled: submit requests, coordinate requests, learn the outcome.

Background on Paxos Consensus

Figure: the Coordinator (distinguished proposer) sends prepare(2) to the three Acceptors.

Case 1: acceptor has NOT accepted any message previously

Figure: an Acceptor answers the Coordinator with promise(2, ø, ø). Coordinator state: proposal: 2, value: ø.

Case 2: acceptor has ACCEPTED proposal(1,x)

Figure: an Acceptor answers the Coordinator with promise(2, 1, x). Coordinator state: proposal: 2, value: x.
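The two promise cases above can be sketched in a few lines of Python. This is a minimal illustration, not the CAANS or libpaxos code: the names `Acceptor`, `on_prepare`, and `choose_value` are hypothetical, and ø is modelled as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# (n, accepted proposal number, accepted value); hypothetical message shape.
Promise = Tuple[int, Optional[int], Optional[str]]


@dataclass
class Acceptor:
    rnd: int = 0                  # highest proposal number seen so far
    vrnd: Optional[int] = None    # proposal number of the accepted value, if any
    value: Optional[str] = None   # accepted value, if any

    def on_prepare(self, n: int) -> Optional[Promise]:
        """Answer prepare(n) with promise(n, vrnd, value); ignore stale rounds."""
        if n < self.rnd:
            return None
        self.rnd = n
        return (n, self.vrnd, self.value)


def choose_value(promises: List[Promise], client_value: str) -> str:
    """Coordinator rule: adopt the value accepted in the highest round, if any."""
    accepted = [(vrnd, v) for (_, vrnd, v) in promises if vrnd is not None]
    if accepted:
        return max(accepted, key=lambda p: p[0])[1]
    return client_value


# Case 1: nothing accepted yet.
print(Acceptor().on_prepare(2))                          # (2, None, None)
# Case 2: proposal (1, x) was accepted earlier.
print(Acceptor(rnd=1, vrnd=1, value="x").on_prepare(2))  # (2, 1, 'x')
```

In Case 2 the coordinator must carry x forward instead of a fresh client value, which is what `choose_value` does when at least one promise reports an accepted proposal.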

Figure: the Acceptors answered promise(2, ø, ø), so the Coordinator takes the Client's request(v). Coordinator state: proposal: 2, value: v.

Figure: the Coordinator (proposal: 2, value: v) sends accept(2, v) to the Acceptors.

Figure: after accept(2, v), the Acceptors send accepted(2, v) to the Learners.
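The accept/accepted exchange can be sketched as follows. This is a hedged illustration only: the class and method names are hypothetical, and the majority rule used by the learner is the standard Paxos quorum, which the slides do not spell out at this point.

```python
from typing import Dict, Optional, Set, Tuple


class Acceptor:
    def __init__(self) -> None:
        self.rnd = 0        # highest proposal number seen
        self.vrnd = None    # proposal number of the accepted value
        self.value = None   # accepted value

    def on_accept(self, n: int, v: str) -> Optional[Tuple[int, str]]:
        """Handle accept(n, v); return accepted(n, v), or None if n is stale."""
        if n < self.rnd:
            return None
        self.rnd = self.vrnd = n
        self.value = v
        return (n, v)


class Learner:
    def __init__(self, num_acceptors: int) -> None:
        self.quorum = num_acceptors // 2 + 1          # majority of acceptors
        self.votes: Dict[Tuple[int, str], Set[int]] = {}
        self.decided: Optional[str] = None

    def on_accepted(self, acceptor_id: int, n: int, v: str) -> Optional[str]:
        """Record one accepted(n, v); decide once a majority reports the same pair."""
        voters = self.votes.setdefault((n, v), set())
        voters.add(acceptor_id)
        if self.decided is None and len(voters) >= self.quorum:
            self.decided = v
        return self.decided


acceptors = [Acceptor() for _ in range(3)]
learner = Learner(num_acceptors=3)
for i, acc in enumerate(acceptors):
    msg = acc.on_accept(2, "v")
    if msg is not None:
        learner.on_accepted(i, *msg)
print(learner.decided)  # v
```

Votes are keyed by acceptor id so that a retransmitted accepted message does not count twice toward the quorum.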

Bottleneck Experiment Setup

Client+Acceptor+Learner

Coordinator+Acceptor+Learner

Acceptor+Learner

Coordinator Bottleneck

Figure: CPU usage (%) of the Client, Coordinator, Acceptor, and Learner.

▪ Client offered load at maximum rate
▪ Messages are 102 bytes
▪ Minimum latency is 96 µs

Acceptor Bottleneck

Figure: CPU usage (%) of the Acceptor, Client, Coordinator, and Learner versus the number of learners (4, 8, 12, 16, 20).

To increase the replication factor, we launch multiple learner processes on each server

CAANS: Consensus As A Network Service

Consensus / Network Design Space

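Figure: design space with the labels best effort, FIFO delivery, no lost messages, stateless processing, stateful processing, and persistent storage. Traditional Paxos is one point in the space; the two directions shown are "exploit better service guarantees" and "implement Paxos in the network".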

Design Goals

A network consensus library: ▪ Compatible with the software library ▪ Independent targets ▪ High consensus throughput ▪ Low end-to-end latency

CAANS

Consensus as a network service

Figure labels: submit, deliver, coordinate requests.

P4Paxos Data Flow

▪ Parser: Eth, IPv4, UDP, Paxos
▪ Control (Ingress): round_tbl, paxos_tbl, forward_tbl
▪ Actions: read_round (round_tbl); handle_phase1a, handle_phase2a, drop (paxos_tbl); forward (forward_tbl)

P4Paxos Data Flow

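Figure: control flow from ingress to egress.

▪ isPaxos? If yes: round_tbl (read_round)
▪ Packet's N >= Acceptor's N? (N: proposal number) If yes: paxos_tbl selects handle_phase1a, handle_phase2a, or drop by message type
▪ forward_tbl (forward), then Egress

https://github.com/open-nfpsw/NetPaxos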
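The control flow on this slide can be mimicked in plain Python to make the order of table lookups explicit. This is a sketch under stated assumptions, not the P4 program from the NetPaxos repository: packets are modelled as dicts, the message-type codes and field names (`inst`, `rnd`, `vrnd`, `value`) are hypothetical, and dropping a packet whose N is below the acceptor's N is an assumption, since the slide does not show that branch.

```python
from typing import Optional

# Hypothetical message-type codes; the real header encoding may differ.
PHASE_1A, PHASE_1B, PHASE_2A, PHASE_2B = 0, 1, 2, 3


class AcceptorPipeline:
    def __init__(self, instances: int = 16) -> None:
        # Per-instance register arrays (stand-ins for device registers).
        self.rnd = [0] * instances
        self.vrnd = [0] * instances
        self.value = [None] * instances

    def ingress(self, pkt: dict) -> Optional[dict]:
        if not pkt.get("is_paxos", False):
            return self.forward(pkt)            # non-Paxos traffic: forward_tbl only
        inst = pkt["inst"]
        acceptor_n = self.rnd[inst]             # round_tbl -> read_round
        if pkt["rnd"] < acceptor_n:             # Packet's N >= Acceptor's N ?
            return None                         # assumption: stale packets are dropped
        if pkt["type"] == PHASE_1A:             # paxos_tbl dispatches on type
            return self.forward(self.handle_phase1a(pkt))
        if pkt["type"] == PHASE_2A:
            return self.forward(self.handle_phase2a(pkt))
        return None                             # drop

    def handle_phase1a(self, pkt: dict) -> dict:
        inst = pkt["inst"]
        self.rnd[inst] = pkt["rnd"]
        return {**pkt, "type": PHASE_1B,
                "vrnd": self.vrnd[inst], "value": self.value[inst]}

    def handle_phase2a(self, pkt: dict) -> dict:
        inst = pkt["inst"]
        self.rnd[inst] = self.vrnd[inst] = pkt["rnd"]
        self.value[inst] = pkt["value"]
        return {**pkt, "type": PHASE_2B, "vrnd": pkt["rnd"]}

    def forward(self, pkt: dict) -> dict:
        return pkt                              # forward_tbl -> forward -> egress


pipe = AcceptorPipeline()
reply = pipe.ingress({"is_paxos": True, "inst": 0, "type": PHASE_2A,
                      "rnd": 2, "value": "v"})
print(reply["type"] == PHASE_2B)  # True
```

Returning `None` stands in for the drop action; a real pipeline would mark the packet to be discarded rather than return a value.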

DEMO: Debugging Paxos Acceptor

DEMO

Paxos Acceptor ▪ Store Accept Message in Registers ▪ Modify Paxos header ▪ Read accepted proposal from Registers
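To make the "modify Paxos header" step concrete, the sketch below rewrites an accept message into an accepted message at the byte level. The header layout (`PAXOS_FMT`), the field order, and the type codes are assumptions made for illustration; the slides do not give the actual wire format.

```python
import struct

# Hypothetical layout: type (2 B), instance (4 B), round (2 B),
# accepted round (2 B), value (8 B), all in network byte order.
PAXOS_FMT = "!HIHH8s"
PHASE_2A, PHASE_2B = 2, 3  # hypothetical type codes


def to_phase2b(raw: bytes) -> bytes:
    """Rewrite a Phase 2a header into the matching Phase 2b header."""
    msgtype, inst, rnd, _vrnd, value = struct.unpack(PAXOS_FMT, raw)
    if msgtype != PHASE_2A:
        raise ValueError("not a Phase 2a message")
    # The accepted round becomes the round carried by the packet.
    return struct.pack(PAXOS_FMT, PHASE_2B, inst, rnd, rnd, value)


request = struct.pack(PAXOS_FMT, PHASE_2A, 7, 2, 0, b"v")
reply = to_phase2b(request)
print(struct.unpack(PAXOS_FMT, reply))
```

Only the type and accepted-round fields change; the instance and value pass through untouched, which is why the rewrite can be done in place on the packet.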

Figure: a packet generator node (Intel XL710) connected over a 40GbE link to the acceptor node (Agilio-CX), which forwards to “vf0”.

Results: Absolute Performance

Result: Forwarding Latency

Figure: latency (ns, axis 0 to 900) for Forwarding*, Coordinator, and Acceptor on Agilio-CX and NetFPGA SUME, measured between a packet generator and the DUT. Bar labels as extracted: 790, 720, 40, 810, 340, 40.

*Numbers are provided by other sources

Software 96 µs vs. hardware 340 ns
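As a quick sanity check on the software versus hardware comparison, the two figures can be put in the same unit and divided. The variable names are illustrative, and the ratio is only indicative, assuming the two numbers are comparable measurements.

```python
software_latency_ns = 96_000   # 96 µs, software minimum latency from the slides
hardware_latency_ns = 340      # 340 ns, hardware figure from the slides

ratio = software_latency_ns / hardware_latency_ns
print(f"hardware is roughly {ratio:.0f}x faster")  # roughly 282x
```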

Experiment Setup

Figure: Clients, one Coordinator, three Acceptors, and two Learners.

Coordinator & Acceptors:
▪ Libpaxos: software processes
▪ CAANS*: hardware

*https://github.com/usi-systems/p4paxos

Clients & Learners: software processes

End-to-End Latency

Figure: latency (µs, axis 0 to 400) versus throughput (messages/second, ticks from 10000 to 120000) for Libpaxos and CAANS.

CAANS: Fault Tolerance

Figure: throughput (msg/s, axis 0 to 180000) over ticks 1 to 7 for two scenarios, coordinator failure and acceptor failure.

▪ Links go down at the 3rd second
▪ Coordinator failure: requests are switched to a software coordinator
▪ Acceptor failure: Paxos tolerates failure of a minority of nodes
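The "minority of nodes" bound follows from majority quorums and can be written down directly. The function names below are hypothetical; the arithmetic is the standard Paxos majority rule.

```python
def quorum_size(n: int) -> int:
    """Smallest majority among n acceptors."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Largest number of acceptors that can fail while a majority survives."""
    return (n - 1) // 2


for n in (3, 5, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

With three acceptors, the quorum is two, so one acceptor can fail without blocking progress.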

Summary

Consensus is important in distributed systems ▪ Optimizations exploit network assumptions

SDN allows network programmability ▪ OpenFlow, P4

CAANS: Co-design Network and Consensus ▪ Implement Paxos in network devices ▪ Achieve high performance

Future Work

• Checkpoint Acceptor’s log • Fast fail-over to software/hardware coordinator • Use kernel-bypass API to increase learner’s performance • Full-fledged deployment in traditional networks

Acknowledgements

• Thanks to Bapi Vinnakota, Mary Pham, Jici Gao, and Netronome for providing us with a hardware testbed and support in using their toolchain

• Thanks to Gordon Brebner and Xilinx for donating two SUME boards, providing access to SDNet, and performing measurements

• Thanks to Google: a Faculty award helped support this research

The End

Huynh Tu Dang, Università della Svizzera italiana (USI)
tudang.github.io
NetPaxos: http://www.inf.usi.ch/faculty/soule/netpaxos.html