PBS VLDB 2012 - slides at http://shivaram.org/talks/pbs-vldb12-talk.pdf


Peter Bailis, Shivaram Venkataraman, Mike Franklin, Joe Hellerstein, Ion Stoica

VLDB 2012

UC Berkeley

Probabilistically Bounded Staleness for Practical Partial Quorums

PBS: NOTE TO READER

watch a recording (of an earlier talk) at: http://vimeo.com/37758648

play with a live demo at: http://pbs.cs.berkeley.edu/#demo

R+W

strong consistency, higher latency

eventual consistency, lower latency

consistency is a continuum, not a binary choice: strong ↔ eventual

our focus: latency vs. consistency, informed by practice

not in this talk: availability, partitions, failures

quantify eventual consistency: wall-clock time (“how eventual?”), versions (“how consistent?”)

analyze real-world systems: EC is often strongly consistent; describe when and why

our contributions

intro / system model

practice / metrics / insights

integration

Apache, DataStax

Project Voldemort

Dynamo: Amazon’s Highly Available Key-value Store

SOSP 2007

Adobe

Cisco

Digg

Gowalla

IBM

Morningstar

Netflix, Palantir

Rackspace

Reddit

Rhapsody

Shazam

Spotify

Soundcloud

Twitter

Mozilla

Ask.com, Yammer, Aol

GitHub, Joyent Cloud

Best Buy

LinkedIn, Boeing

Comcast

Cassandra

Riak, Voldemort, Gilt Groupe

N replicas per key; read: wait for R replies; write: wait for W acks

N=3, R=2

if R+W > N, then: “strong” consistency
else: eventual consistency

reads return the last acknowledged write or an in-flight write (per-key)

“strong” consistency: regular register semantics (R+W > N)
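A minimal sketch of this rule (the function and the example configurations below are mine, not from the talk):

```python
def quorum_consistency(n: int, r: int, w: int) -> str:
    """Classify a partial-quorum configuration per the slide's rule:
    overlapping read/write quorums (R + W > N) give "strong" (regular
    register) semantics; otherwise reads may be eventually consistent."""
    if r + w > n:
        return "strong (read and write quorums must overlap)"
    return "eventual (a read quorum can miss the latest write)"

print(quorum_consistency(3, 2, 2))  # N=3, R=W=2 -> strong
print(quorum_consistency(3, 1, 1))  # Cassandra default N=3, R=W=1 -> eventual
```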

Latency relative to R=1 / W=1 (LinkedIn disk-based model, N=3):

R    99th     99.9th
1    1x       1x
2    1.59x    2.35x
3    4.8x     6.13x

W    99th     99.9th
1    1x       1x
2    2.01x    1.9x
3    4.96x    14.96x

⇧ consistency, ⇧ latency: wait for more replicas, read more recent data

⇩ consistency, ⇩ latency: wait for fewer replicas, read less recent data

eventual consistency: “if no new updates are made to the object, eventually all accesses will return the last updated value”

W. Vogels, CACM 2008

R+W ≤ N

How eventual? How long do I have to wait?

How consistent? What happens if I don’t wait?

R+W

strong consistency, higher latency

eventual consistency, lower latency

intro / system model

practice / metrics / insights

integration

Cassandra: R=W=1, N=3

by default (1+1 ≯ 3)

eventual consistency

“maximum performance”

“very low latency”

okay for “most data”

“general case”

in the wild

anecdotally, EC “good enough” for many kinds of data

How eventual? How consistent?

“eventual and consistent enough”

Probabilistically Bounded Staleness

can’t make promises, can give expectations

Can we do better?

intro / system model

practice / metrics / insights

integration

How eventual? How long do I have to wait?

t-visibility: probability p of consistent reads after t seconds

(e.g., 10ms after write, 99.9% of reads consistent)

How eventual?

t-visibility depends on

messaging and processing delays

[Timeline: Coordinator ↔ Replica. The coordinator sends the write and waits for W responses (acks); t seconds elapse; it then sends the read and waits for R responses. A replica's response is stale if the read arrives at the replica before the write does. Once per replica.]

[Example: N=2, W=1, R=1. Alice's write is acknowledged as soon as R1 acks; Bob's read reaches R2 before the write does, so Bob's response is inconsistent (stale).]

[WARS timeline: Coordinator ↔ Replica. The write message takes time (W) and its ack takes (A); once the coordinator has W acks, t seconds elapse; the read message takes (R) and its response takes (S). A replica's response is stale if the read arrives before the write. Once per replica.]
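Stated compactly (notation mine, read off the timeline above; the write is issued at time 0 and one sample of each of W, A, R, S is drawn per replica):

```latex
% W_i, A_i, R_i, S_i: one sampled delay of each type for replica i;
% t_ack: time at which the coordinator has received W acks.
\[
  \text{replica } i \text{ returns stale data} \iff t_{\mathrm{ack}} + t + R_i < W_i
\]
\[
  \text{the read is inconsistent} \iff \text{none of the first } R \text{ responses is fresh}
\]
```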

solving WARS analytically: order statistics over dependent variables

instead: Monte Carlo methods

to use WARS:

1. gather latency data (ms), e.g.:
   W: 53.2, 44.5, 101.1, ...
   A: 10.3, 8.2, 11.3, ...
   R: 15.3, 22.4, 19.8, ...
   S: 9.6, 14.2, 6.7, ...

2. run simulation: Monte Carlo, sampling (e.g., one trial might draw W=44.5, A=11.3, R=15.3, S=14.2)
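A minimal sketch of such a Monte Carlo in Python (the function, its parameter names, and the exponential example latencies are my illustrative assumptions, not the authors' implementation):

```python
import random

def simulate_staleness(w_lat, a_lat, r_lat, s_lat, N, R, W, t, trials=100_000):
    """Estimate t-visibility: the probability that a read issued t ms after a
    write is acknowledged returns the written value.

    w_lat, a_lat, r_lat, s_lat: zero-argument functions returning one latency
    sample (ms) for the write (W), ack (A), read (R), and response (S) delays."""
    consistent = 0
    for _ in range(trials):
        # Per-replica write arrival times (write issued at time 0).
        w = [w_lat() for _ in range(N)]
        acks = sorted(w[i] + a_lat() for i in range(N))
        commit = acks[W - 1]          # write returns after W acks
        read_start = commit + t       # client reads t ms later

        # Per-replica read arrival and response-return times.
        responses = []
        for i in range(N):
            arrive = read_start + r_lat()
            fresh = arrive >= w[i]    # replica has the write iff it arrived first
            responses.append((arrive + s_lat(), fresh))

        # The client uses the first R responses; the read is consistent if any
        # of them comes from a replica that already holds the write.
        responses.sort()
        if any(fresh for _, fresh in responses[:R]):
            consistent += 1
    return consistent / trials

# Illustrative usage with exponential latencies (means in ms, t in ms).
exp = lambda mean_ms: (lambda: random.expovariate(1.0 / mean_ms))
p = simulate_staleness(exp(5.0), exp(1.0), exp(1.0), exp(1.0), N=3, R=1, W=1, t=10.0)
print(f"P(consistent read 10 ms after write) ~= {p:.4f}")
```

Sampling one W, A, R, and S value per replica per trial mirrors the "once per replica" note in the timeline above.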

real Cassandra cluster, varying latencies:

t-visibility RMSE: 0.28%; latency N-RMSE: 0.48%

WARS accuracy

How eventual?

key: WARS model; need: latencies

t-visibility: consistent reads with probability p after t seconds

intro / system model

practice / metrics / insights

integration

Yammer: 100K+ companies

uses Riak

LinkedIn 175M+ users

built and uses Voldemort

production latencies fit Gaussian mixtures

10 ms

N=3
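A hedged sketch of what "fit Gaussian mixtures" could look like in practice, assuming scikit-learn (the data, two-component choice, and names are illustrative, not the authors' pipeline); the resulting sampler plugs into the simulate_staleness sketch above:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical measured write-message latencies (ms).
observed_w = np.array([53.2, 44.5, 101.1, 48.0, 62.3, 55.1])

# Fit a 2-component Gaussian mixture to the observed latencies.
gm = GaussianMixture(n_components=2).fit(observed_w.reshape(-1, 1))

def w_lat() -> float:
    """Draw one write-latency sample from the fitted mixture (clipped at 0)."""
    sample, _ = gm.sample(1)
    return max(float(sample[0, 0]), 0.0)
```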

Latency is combined read and write latency at 99.9th percentile

R=3, W=1: 100% consistent, latency: 15.01 ms

LNKD-DISK, N=3

16.5% faster

R=2, W=1, t=13.6 ms: 99.9% consistent, latency: 12.53 ms

worthwhile?

N=3

Latency is combined read and write latency at 99.9th percentile

R=3, W=1: 100% consistent, latency: 4.20 ms

LNKD-SSD, N=3

59.5% faster

R=1, W=1, t=1.85 ms: 99.9% consistent, latency: 1.32 ms

[Figure: CDFs of write latency (ms) for W=1, W=2, W=3 under the LNKD-SSD, LNKD-DISK, YMMR, and WAN latency models; N=3]

[WARS timeline, repeated: write (W), ack (A), read (R), response (S); wait for W acks, t seconds elapse, wait for R responses; a response is stale if the read arrives before the write, once per replica.]

SSDs reduce variance compared to disks!

Yammer: 81.1% (187 ms) latency reduction; 202 ms t-visibility at the 99.9th percentile (N=3)

in the paper:

How consistent? k-staleness (versions), monotonic reads

quorum load

⟨k,t⟩-staleness: versions and time

latency distributions, WAN model, varying quorum sizes, staleness detection

intro / system model

practice / metrics / insights

integration

1. Tracing  2. Simulation  3. Tune N, R, W

Integration

Project Voldemort

http://pbs.cs.berkeley.edu/#demo
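A hedged sketch of step 3 ("Tune N, R, W"), reusing the simulate_staleness sketch from earlier; the 99.9% target, t = 10 ms, and exponential latencies are illustrative assumptions, not the Voldemort integration itself:

```python
import random

# Illustrative latency samplers (ms); in practice these would come from tracing.
exp = lambda mean_ms: (lambda: random.expovariate(1.0 / mean_ms))
w_lat, a_lat, r_lat, s_lat = exp(5.0), exp(1.0), exp(1.0), exp(1.0)

N, target, t_ms = 3, 0.999, 10.0
for R in range(1, N + 1):
    for W in range(1, N + 1):
        # simulate_staleness is the Monte Carlo sketch defined earlier.
        p = simulate_staleness(w_lat, a_lat, r_lat, s_lat, N, R, W, t_ms)
        status = "meets" if p >= target else "misses"
        print(f"R={R} W={W}: P(consistent)={p:.4f} ({status} {target:.1%} target)")
```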

Related Work

Quorum Systems
• probabilistic quorums [PODC ’97]

• deterministic k-quorums [DISC ’05, ’06]

Consistency Verification
• Golab et al. [PODC ’11]

• Bermbach and Tai [M4WSOC ’11]

• Wada et al. [CIDR ’11]

• Anderson et al. [HotDep ’10]

• Transactional consistency: Zellag and Kemme [ICDE ’11], Fekete et al. [VLDB ’09]

Latency-Consistency
• Daniel Abadi [Computer ’12]

• Kraska et al. [VLDB ’09]

Bounded Staleness Guarantees

• TACT [OSDI ’00]

• FRACS [ICDCS ’03]

• AQuA [IEEE TPDS ’03]

R+W

strong consistency, higher latency

eventual consistency, lower latency

consistency is a continuum: strong ↔ eventual

quantify eventual consistency

model staleness in time, versions

latency-consistency trade-offs

analyze real systems and hardware

PBS (pbs.cs.berkeley.edu): quantify which choice is best and explain why EC is often strongly consistent

Extra Slides

PBS and apps

staleness requires either:

staleness-tolerant data structures: timelines, logs

cf. commutative data structures, logical monotonicity

asynchronous compensation code: detect violations after data is returned; see paper

cf. “Building on Quicksand”: memories, guesses, apologies

write code to fix any errors

minimize: (compensation cost) × (# of expected anomalies)

asynchronous compensation

Read only newer data? (monotonic reads session guarantee)

# versions of tolerable staleness = global write rate / client’s read rate (for a given key)
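With illustrative numbers (mine, not from the talk):

```latex
\[
  \text{tolerable staleness (versions)} = \frac{\text{global write rate}}{\text{client read rate}},
  \qquad\text{e.g.}\quad \frac{100\ \text{writes/s}}{10\ \text{reads/s}} = 10\ \text{versions}.
\]
```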

Failure?

Treat failures as latency spikes

How long do partitions last?

what time interval? 99.9% uptime/yr ⇒ 8.76 hours downtime/yr (0.1% of 8,760 hours/yr)

8.76 consecutive hours down ⇒ bad 8-hour rolling average

hide in tail of distribution OR continuously evaluate SLA, adjust

[Figure (repeated): CDFs of write latency (ms) for W=1, W=2, W=3 under LNKD-SSD, LNKD-DISK, YMMR, and WAN; N=3]

[Figure: CDFs of read latency (ms) for varying R under LNKD-SSD, LNKD-DISK, YMMR, and WAN; N=3. (LNKD-SSD and LNKD-DISK identical for reads)]

⟨k,t⟩-staleness: versions and time

approximation: exponentiate t-staleness by k
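One plausible reading of "exponentiate t-staleness by k" (my notation; it assumes misses of the last k writes are independent, which the synthetic experiments below probe):

```latex
% p_t = t-visibility (probability a read is consistent t seconds after a write).
% Assumption (mine): each of the last k writes is missed independently.
\[
  P\!\left(\langle k,t\rangle\text{-inconsistent}\right) \approx (1 - p_t)^{k},
  \qquad
  P\!\left(\langle k,t\rangle\text{-consistent}\right) \approx 1 - (1 - p_t)^{k}.
\]
```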

[Figure: synthetic exponential distributions; W mean at 1/4x and 10x the A/R/S means; N=3, W=1, R=1]

concurrent writes: deterministically choose

Coordinator R=2

(“key”, 1) (“key”, 2)
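A minimal sketch of one such deterministic rule (my illustration; Dynamo-style systems often use timestamps or vector clocks instead):

```python
from typing import Tuple

Versioned = Tuple[str, int]  # (key, version)

def choose(a: Versioned, b: Versioned) -> Versioned:
    """Deterministically prefer the higher version when a read quorum
    returns concurrent versions of the same key (last-writer-wins style)."""
    assert a[0] == b[0], "conflict resolution is per-key"
    return a if a[1] >= b[1] else b

print(choose(("key", 1), ("key", 2)))  # -> ('key', 2)
```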

[Diagram: N = 3 replicas, R=3. A client sends read(“key”) to the coordinator; the coordinator reads from R1, R2, R3 (each holding (“key”, 1)), waits for all three responses, and returns (“key”, 1) to the client.]

[Diagram: N = 3 replicas, R=1. The coordinator sends the read to all replicas but waits for only the first response before returning (“key”, 1) to the client.]

[Diagram: N=3, W=1, R=1. One coordinator writes (“key”, 2) and returns after a single ack (from R1) while the write is still in flight to R2 and R3. A concurrent read with R=1 reaches a replica that still holds (“key”, 1) and returns the stale value; the remaining acks for (“key”, 2) arrive afterwards.]