Jeff Wolski - Explorations in Cooperative, Distributed Systems with Uber's Ringpop


WHAT IS RINGPOP? HIGH-LEVEL FACTS

It is a library written in Node.js and Go.

It is...

“...scalable, fault-tolerant application layer sharding.”

“...a library that brings cooperation and coordination to distributed applications.”

“...a hash ring.”

It is open source.

BUT WHAT IS UBER? REALTIME DISPATCH ENGINEERING

Mobile API, match-maker, trip orchestrator

Platform for marketplaces

Highly interactive, real-time

Data locality

Long-running transactions

ORIGINAL DISPATCH ARCHITECTURE - UBER REAL-TIME DISPATCHING

[Diagram: ROUTING LAYER and DISPATCH LAYER built from NGINX, HAPROXY, NODE.JS, TWEMPROXY and REDIS; labels LONDON, NEW YORK and 0 1 2 3]


EVOLUTION OF THINKING AT UBER - UBER REAL-TIME DISPATCHING

Scaling our organization, product and systems

100x scale

Availability over consistency

Self-healing

No “master”

No monolith

Flexibility and safety

ENTER RINGPOP: MEMBERSHIP PROTOCOL, CONSISTENT HASHING, REQUEST ROUTING

DISCRETE APPLICATION INSTANCES - ENTER RINGPOP: MEMBERSHIP PROTOCOL

[Diagram: FRONT-END (FE1 FE2 FE3 FE4 FE5), APPLICATION (INST. A, INST. B, INST. C), STORAGE (DATABASE, DATABASE)]

COOPERATIVE APPLICATION INSTANCES

[Diagram: FRONT-END (FE1 FE2 FE3 FE4 FE5), APPLICATION (INST. A, INST. B, INST. C), STORAGE (DATABASE, DATABASE)]

MEMBERSHIP PROTOCOL - RINGPOP

SWIM GOSSIP PROTOCOL

[Diagram: INST. A, INST. B and INST. C each keep a membership list (INST. A, INST. B, INST. C) and exchange PING messages]

FAILURE DETECTION

[Diagram: PING between INST. A and INST. B]

AN INDIRECT PING

[Diagram: PING-REQ and PING among INST. A, INST. B and INST. C]

DECLARE INST. B SUSPECT

[Diagram: PING-REQ among INST. A, INST. B and INST. C]
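
The failure-detection slides above (PING, then PING-REQ, then declaring a member suspect) can be sketched as one probe round. This is a minimal illustration of the SWIM idea, not Ringpop's code: `make_network`, `probe` and the parameter `k` are hypothetical names, and the helpers are taken in list order rather than at random so the result is deterministic.

```python
def make_network(down=(), broken_links=()):
    """Build a reachability check for a toy network (hypothetical helper)."""
    down = set(down)
    broken = {frozenset(link) for link in broken_links}

    def can_reach(src, dst):
        link_ok = frozenset((src, dst)) not in broken
        return src not in down and dst not in down and link_ok

    return can_reach


def probe(prober, target, members, can_reach, k=3):
    """Run one probe of `target`; return "alive" or "suspect"."""
    # Step 1: direct PING.
    if can_reach(prober, target):
        return "alive"
    # Step 2: PING-REQ. Ask up to k other members to PING the target for us.
    # SWIM picks these helpers at random; the first k keep this sketch deterministic.
    helpers = [m for m in members if m not in (prober, target)][:k]
    for helper in helpers:
        if can_reach(prober, helper) and can_reach(helper, target):
            return "alive"
    # Step 3: nobody reached the target, so it becomes suspect (not yet faulty).
    return "suspect"


members = ["INST. A", "INST. B", "INST. C"]

# Only the A-B link is broken: the indirect ping through C still succeeds.
flaky_link = make_network(broken_links=[("INST. A", "INST. B")])
result_flaky = probe("INST. A", "INST. B", members, flaky_link)

# B is really down: both the direct and the indirect ping fail.
b_down = make_network(down=["INST. B"])
result_down = probe("INST. A", "INST. B", members, b_down)
```

The indirect ping is what keeps a single bad link between two instances from being mistaken for a dead instance.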

PIGGYBACK MEMBERSHIP UPDATES

[Diagram: PING carrying an update about INST. B]

INFECTION-STYLE DISSEMINATION

[Diagram: PING carrying an update about INST. B]
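
Piggybacking and infection-style dissemination, as named on the two slides above, might look like the sketch below: a change rides on ordinary PING traffic, and every instance that learns it starts spreading it too. The `Member` class, the `max_piggyback` limit and the use of incarnation numbers to order changes are assumptions borrowed from the general SWIM design, not details shown in the talk.

```python
PRECEDENCE = {"alive": 0, "suspect": 1, "faulty": 2}


class Member:
    def __init__(self, name, peers, max_piggyback=3):
        self.name = name
        # member name -> (status, incarnation number)
        self.view = {peer: ("alive", 0) for peer in peers}
        # change -> how many more PINGs it should ride on
        self.pending = {}
        self.max_piggyback = max_piggyback

    def apply(self, change):
        """Apply one (member, status, incarnation) change if it is news to us."""
        member, status, incarnation = change
        cur_status, cur_inc = self.view.get(member, ("alive", -1))
        is_news = (incarnation, PRECEDENCE[status]) > (cur_inc, PRECEDENCE[cur_status])
        if is_news:
            self.view[member] = (status, incarnation)
            self.pending[change] = self.max_piggyback  # now we spread it too
        return is_news

    def ping(self, other):
        """PING `other`, piggybacking every pending change on the message."""
        for change in list(self.pending):
            other.apply(change)
            self.pending[change] -= 1
            if self.pending[change] == 0:
                del self.pending[change]


names = ["INST. A", "INST. B", "INST. C", "INST. D"]
a, c, d = (Member(n, names) for n in ["INST. A", "INST. C", "INST. D"])

a.apply(("INST. B", "suspect", 0))  # A's probe of B failed
a.ping(c)                           # C learns about it from A's PING
c.ping(d)                           # D learns about it from C, never talking to A
```

No extra broadcast message is needed: the update reaches INST. D purely as a passenger on pings that would have been sent anyway.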

DETECTING A FAILURE

[Diagram: PING among INST. A, INST. B and INST. C]

CREATING A CLUSTER

[Diagram: INST. A, INST. B, INST. C; SENDS JOIN, SENDS JOIN]

INST. B STARTS UP

[Diagram: INST. A, INST. B, INST. C; SENDS JOIN, SENDS JOIN]

PIGGYBACK MEMBERSHIP ON JOIN

[Diagram: SENDS JOIN, RESPONDS TO JOIN; update about INST. B]

“A” APPLIES UPDATE FROM “B”

[Diagram: SENDS JOIN, RESPONDS TO JOIN; update about INST. B]

“A” PIGGYBACKS UPDATE

[Diagram: SENDS JOIN, SENDS PING; update about INST. A]

“B” APPLIES “A” UPDATE

[Diagram: SENDS JOIN, SENDS PING; update about INST. A]
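
The join sequence above can be approximated with a very small model in which the join response carries the responder's membership list. This is only a sketch under that assumption; `Node`, `join` and `handle_join` are hypothetical names, and real joins involve timeouts, retries and multiple seed hosts that are left out here.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.members = {name}  # every node starts out knowing only itself

    def handle_join(self, joiner_name):
        """Receive a JOIN: record the joiner, answer with our membership list."""
        self.members.add(joiner_name)
        return set(self.members)

    def join(self, seed):
        """Send a JOIN to `seed` and merge the membership it responds with."""
        self.members |= seed.handle_join(self.name)


a, b, c = Node("INST. A"), Node("INST. B"), Node("INST. C")

c.join(a)  # C joins through A
b.join(a)  # B starts up later and also joins through A
```

After these two joins INST. A and INST. B know all three members, while INST. C has not yet heard of INST. B; that remaining gap is what the piggybacked PING updates close.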

CONSISTENT HASHING - RINGPOP

ENTER RINGPOP: CONSISTENT HASHING

START WITH A KEYSPACE AND A RING

[Diagram: ring labeled 2^32-1]

HASH APPLICATION INSTANCES

hash(“INST. A”);

hash(“INST. B”);

hash(“INST. C”);

DIVIDE UP THE KEYSPACE

[Diagram: ring divided among INST. A, INST. B, INST. C]

ASSIGN OWNERSHIP “USER1”

hash(“USER1”);

ASSIGN OWNERSHIP “USER5”

hash(“USER5”);

ASSIGN OWNERSHIP “USER8”, “USER4”

hash(“USER4”);

hash(“USER8”);
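
The steps above (hash the instances onto a ring, then hash each key and find its owner) can be sketched as follows. This is a minimal illustration: CRC32 is only a stand-in for whatever hash function the library uses, each instance gets a single point on the ring, and `ring_hash` and `HashRing` are hypothetical names.

```python
import bisect
import zlib


def ring_hash(value):
    """Map a string onto the keyspace 0 .. 2**32 - 1 (CRC32 as a stand-in hash)."""
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


class HashRing:
    def __init__(self, instances):
        # One point per instance, sorted by position on the ring.
        self._points = sorted((ring_hash(name), name) for name in instances)
        self._positions = [position for position, _ in self._points]

    def lookup(self, key):
        """Return the instance at the first point at or after hash(key)."""
        i = bisect.bisect_left(self._positions, ring_hash(key))
        # Past the last point, wrap around to the first one: that is the "ring".
        return self._points[i % len(self._points)][1]


ring = HashRing(["INST. A", "INST. B", "INST. C"])
owners = {user: ring.lookup(user) for user in ["USER1", "USER4", "USER5", "USER8"]}
```

Every instance that builds the ring from the same membership list computes the same owner for a key, so no central lookup table is needed.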

LOSING CAPACITY

hash(“USER1”);

hash(“USER5”);

hash(“USER4”);

hash(“USER8”);

[Diagram: ring with INST. A, INST. B, INST. C]

ADDING CAPACITY

[Diagram: ring with INST. A, INST. B, INST. C, INST. D]

hash(“USER1”);

hash(“USER5”);

hash(“USER4”);

hash(“USER8”);
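
What losing and adding capacity means for key ownership can be checked with a small experiment. The sketch below assumes one ring point per instance and CRC32 as a stand-in hash; `owner` is a hypothetical helper and the choice of INST. B as the lost instance is arbitrary.

```python
import bisect
import zlib


def ring_hash(value):
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


def owner(instances, key):
    """Owner of `key` on a ring built from `instances` (one point each)."""
    points = sorted((ring_hash(name), name) for name in instances)
    # ("", ...) sorts before any name, so this finds the first point >= hash(key).
    i = bisect.bisect_left(points, (ring_hash(key), ""))
    return points[i % len(points)][1]


keys = [f"USER{n}" for n in range(1, 1001)]
before = {k: owner(["INST. A", "INST. B", "INST. C"], k) for k in keys}

# Adding capacity: INST. D joins the ring.
after_add = {k: owner(["INST. A", "INST. B", "INST. C", "INST. D"], k) for k in keys}
moved_on_add = [k for k in keys if before[k] != after_add[k]]

# Losing capacity: INST. B leaves the ring.
after_loss = {k: owner(["INST. A", "INST. C"], k) for k in keys}
moved_on_loss = [k for k in keys if before[k] != after_loss[k]]
```

Only keys that land on INST. D move when it joins, and only keys that INST. B owned move when it leaves; every other key keeps its owner, which is the point of using a ring instead of `hash(key) % n`.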

APPLICATION LAYER MIDDLEWARE - ENTER RINGPOP: REQUEST ROUTING

[Diagram: the INST. A PROCESS in the APPLICATION tier, between FRONT-END and STORAGE, contains HTTP / THRIFT / ETC, ROUTING, HASH RING, MEMBERSHIP and BUSINESS LOGIC; a brace marks the RINGPOP part]
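
The routing layer in the diagram above sits between the transport and the business logic: it looks up the key's owner and either handles the request locally or forwards it. A minimal sketch of that handle-or-forward decision, assuming an in-memory `peers` dictionary in place of a real network transport and hypothetical names throughout (`Instance`, `handle`, `lookup`):

```python
import bisect
import zlib


def ring_hash(value):
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


class Instance:
    def __init__(self, name, member_names):
        self.name = name
        # Hash ring built from the membership list: (position, instance name).
        self.ring = sorted((ring_hash(m), m) for m in member_names)
        self.peers = {}  # name -> Instance; stands in for the network transport

    def lookup(self, key):
        i = bisect.bisect_left(self.ring, (ring_hash(key), ""))
        return self.ring[i % len(self.ring)][1]

    def handle(self, key, request):
        owner = self.lookup(key)
        if owner == self.name:
            # We own the key: run the business logic locally.
            return f"{self.name} handled {request} for {key}"
        # Another instance owns it: forward the request there.
        return self.peers[owner].handle(key, request)


names = ["INST. A", "INST. B", "INST. C"]
cluster = {n: Instance(n, names) for n in names}
for instance in cluster.values():
    instance.peers = cluster

# The same request may arrive at any instance; it always ends up at the owner.
replies = {n: cluster[n].handle("USER1", "GET /trips") for n in names}
```

Because every instance agrees on the ring, a request needs at most one forward hop in this sketch, and callers do not need to know which instance owns a key.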

PROGRAMMING RINGPOP: INSTANTIATE, BOOTSTRAP, LOOKUP

A TYPICAL WEB APP

INSTANTIATING RINGPOP

BOOTSTRAPPING RINGPOP

RING LOOKUPS

APPLICATIONS - RINGPOP

GEOSPATIAL INDEX - APPLICATIONS OF RINGPOP

[Diagram: DISPATCH, UPDATE DRIVER LOCATION; OWNER (INST. D), REPLICA 1, REPLICA 2; UPDATE, UPDATE]
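
One way to pick an owner and two replicas for a location update, as the diagram suggests, is to walk the ring clockwise from the key's position and take the first three instances. The sketch below is an assumption about how that could work, not a description of Uber's index: the cell key `"cell:example-1234"`, the driver name and the helpers `preference_list` and `update_driver_location` are all made up for illustration.

```python
import bisect
import zlib


def ring_hash(value):
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


def preference_list(instances, key, n=3):
    """The owner of `key` followed by the next n - 1 instances clockwise."""
    points = sorted((ring_hash(name), name) for name in instances)
    start = bisect.bisect_left(points, (ring_hash(key), ""))
    count = min(n, len(points))
    return [points[(start + i) % len(points)][1] for i in range(count)]


instances = ["INST. A", "INST. B", "INST. C", "INST. D"]
stores = {name: {} for name in instances}  # per-instance in-memory state


def update_driver_location(driver, cell, location):
    """Write the location to the cell's owner and to its two replicas."""
    targets = preference_list(instances, cell)
    for name in targets:
        stores[name][driver] = location
    return targets


owner, replica_1, replica_2 = update_driver_location(
    "driver-42", "cell:example-1234", (52.37, 4.89)
)
```

If the owner fails, the first replica is the next instance clockwise, so it already holds the data for the keys it inherits.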

WORK DELEGATION

[Diagram: LEADER, BACKUP 1, BACKUP 2, BACKUP 3; POLL DB FOR WORK]
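
A leader with ordered backups can be derived from the same ring: whichever instance owns a well-known key acts as leader and polls for work, and the instances that follow it clockwise are the backups. This is a sketch of that idea under stated assumptions; the key `"work-poller"` and the helper `roles` are hypothetical, and the talk does not say how the leader is actually chosen.

```python
import bisect
import zlib


def ring_hash(value):
    return zlib.crc32(value.encode("utf-8")) & 0xFFFFFFFF


def roles(instances, key="work-poller"):
    """Return (leader, backups): the key's owner, then the rest in ring order."""
    points = sorted((ring_hash(name), name) for name in instances)
    start = bisect.bisect_left(points, (ring_hash(key), ""))
    ordered = [points[(start + i) % len(points)][1] for i in range(len(points))]
    return ordered[0], ordered[1:]


instances = ["INST. A", "INST. B", "INST. C", "INST. D"]
leader, backups = roles(instances)

# The leader fails and is removed from the membership list.
survivors = [name for name in instances if name != leader]
new_leader, new_backups = roles(survivors)
```

Each instance can run `roles` on its own membership list and poll the database only if it is the leader; when the leader drops out, the first backup takes over without any election message.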

APPLICATIONS OF RINGPOP - RINGPOP IN PRODUCTION

Geospatial sharding

Work delegation

Server-side push / long-polling

Caching

Aggregation

Mailboxes

Database

Service Discovery and Routing

LESSONS LEARNED - RINGPOP

Verifying correctness.

Scaling and Failing.

DEVELOPMENT AND STAGING - LESSONS LEARNED

Convergence

Cross-pollination

Flappy nodes

Hard to forget

Slow start times

Anti-entropy

Tooling

Backwards compatibility

PRODUCTION - LESSONS LEARNED

CONCLUSION - RINGPOP

DynamoDB by Amazon

Riak by Basho

Serf by Hashicorp

Cassandra by Apache

Orleans by Microsoft

Akka by Typesafe

GRAZIE!

Presented by Jeff Wolski

Uber is hiring. Come work with me in our Amsterdam office!