Bolt-On Global Consistency for the Cloud
Zhe Wu, Edward Wijaya, Muhammed Uluyol,
Harsha V. Madhyastha
University of Michigan
Geo-distribution for Low Latency
Geo-distribution Requires Data Replication
Cloud Simplifies App Deployment
Application Needs to Manage Replication
Isolated storage services
No replication across cloud providers
Challenges for Data Replication in Cloud
Conflict?
Megastore (CIDR’11), Spanner (OSDI’12), MDCC (EuroSys’13), TAPIR (SOSP’15), ...
Paxos
Paxos logic runs on VMs managed by the application
Apps propose writes to these VMs, which issue PUT/GET requests to cloud storage
Problems with Paxos on application-managed VMs:
1. High cost
2. Bottleneck
Paxos with limited interface
Disk Paxos (Distributed Computing’03), pPaxos (ATC’15)
1. Conflict-free write of Paxos ops
2. Read from the log to replay Paxos logic
Problems with Disk Paxos and pPaxos:
1. High latency
2. High cost
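The idea of reading a log of Paxos operations and replaying the Paxos logic can be illustrated with a minimal sketch. This is not the Disk Paxos or pPaxos implementation: the record layout `LogOp` and the function `replay` are hypothetical names, and the log is modelled as a plain Python list rather than cloud storage.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LogOp:
    """One Paxos operation appended to a log (hypothetical layout)."""
    kind: str                     # "prepare" or "accept"
    ballot: int
    value: Optional[str] = None


def replay(log: List[LogOp]) -> Tuple[int, Optional[Tuple[int, Optional[str]]]]:
    """Fold the log into acceptor state: (promised ballot, accepted (ballot, value))."""
    promised = -1
    accepted = None
    for op in log:
        if op.kind == "prepare" and op.ballot > promised:
            promised = op.ballot
        elif op.kind == "accept" and op.ballot >= promised:
            promised = op.ballot
            accepted = (op.ballot, op.value)
        # Anything else is stale and ignored, as an active acceptor would reject it.
    return promised, accepted


log = [
    LogOp("prepare", 1),
    LogOp("accept", 1, "a"),
    LogOp("prepare", 3),
    LogOp("accept", 2, "b"),      # stale: ballot 2 is below the promised ballot 3
]
state = replay(log)
```

In this sketch every reader has to fetch and replay the whole log before it knows the current state, which is consistent with the latency and cost drawbacks listed above.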
Problems with Existing Solutions
                     Low latency   Compatible with limited interface   Low cost
Traditional Paxos    yes           no                                  no
Disk Paxos, pPaxos   no            yes                                 no
Our Solution: Consistent Replication In the Cloud
                     Low latency   Compatible with limited interface   Low cost
CRIC                 yes           yes                                 yes
CRIC Overview
[Diagram: App VMs, each embedding the CRIC Library, access Cloud Storage services directly; the CRIC Library runs CPaxos against Cloud Storage]
Key-value store (reads/writes)
✓ Apps directly read/write data from/to cloud storage
✓ Low latency (1 RTT)
CPaxos In Action
Executing a write in traditional Paxos (2 RTTs)
Proposer (App) -> Acceptor -> Storage
1. Prepare
2. Accept
Executing a write in CPaxos (3 RTTs)
Proposer (App) -> Storage (passive acceptor)
1. Read Paxos state
2. Proposer runs Paxos prepare logic, then updates Paxos state
3. Proposer runs Paxos accept logic, then updates Paxos state and data
Leverage cloud-supported conditional-PUT (available in all cloud storage services)
Can be omitted when:
1. Write follows a read
2. Object creation
Leverage Fast Paxos to execute reads and writes in one round
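The CPaxos write steps above can be sketched against a single passive replica. This is an illustration under stated assumptions, not the CRIC code: `Storage`, `PaxosState`, `prepare`, and `accept` are hypothetical names, the store is in-memory, and conditional-PUT is modelled as a version check (real services expose it through preconditions such as an ETag match). Quorums across replicas and the Fast Paxos shortcut are left out.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PaxosState:
    promised: int = -1            # highest ballot promised so far
    accepted_ballot: int = -1     # ballot of the accepted value, if any
    value: Optional[str] = None


class Storage:
    """In-memory stand-in for one cloud storage replica (a passive acceptor)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Tuple[int, PaxosState]] = {}

    def get(self, key: str) -> Tuple[int, PaxosState]:
        # Version 0 means the object does not exist yet.
        return self._objects.get(key, (0, PaxosState()))

    def conditional_put(self, key: str, expected_version: int, state: PaxosState) -> bool:
        # Succeeds only if nobody changed the object since it was read.
        version, _ = self.get(key)
        if version != expected_version:
            return False
        self._objects[key] = (version + 1, state)
        return True


def prepare(store: Storage, key: str, ballot: int) -> Optional[int]:
    """Return the object version after recording the promise, or None if rejected."""
    version, state = store.get(key)                 # round trip: read Paxos state
    if ballot <= state.promised:                    # prepare logic, run by the proposer
        return None
    promised = replace(state, promised=ballot)
    if not store.conditional_put(key, version, promised):   # round trip: update Paxos state
        return None
    return version + 1


def accept(store: Storage, key: str, version: int, ballot: int, value: str) -> bool:
    # Accept logic, run by the proposer: the write is valid only if the state is
    # unchanged since this proposer's promise, which the conditional PUT enforces.
    accepted = PaxosState(promised=ballot, accepted_ballot=ballot, value=value)
    return store.conditional_put(key, version, accepted)    # round trip: update state and data


store = Storage()
v1 = prepare(store, "k", ballot=1)
v2 = prepare(store, "k", ballot=2)      # a competing proposer with a higher ballot
stale = accept(store, "k", v1, 1, "a")  # rejected: the object changed after v1
fresh = accept(store, "k", v2, 2, "b")  # accepted
```

The storage service never runs any Paxos code here. It only compares versions, which is what lets it act as a passive acceptor.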
Tradeoff: High Latency under Conflict
[Timeline: Propose 0 and Propose 1 are issued concurrently and conflict; both proposers retry]
Higher proposal will succeed in traditional Paxos
Reason for conflict: variance in latency to different data centers
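A small worked example may help show how latency differences alone can produce a conflict. The sketch assumes each replica sides with whichever proposal reaches it first (as in a fast round) and that a proposal needs 4 of 5 replicas. The data center names, latencies, and quorum size are all illustrative values, not measurements from the talk.

```python
from collections import Counter
from typing import Dict, Optional

Latency = Dict[str, Dict[str, float]]   # proposer -> data center -> one-way latency (ms)


def first_arrivals(latency_ms: Latency, send_ms: Dict[str, float]) -> Dict[str, str]:
    """For each data center, return the proposer whose request arrives first."""
    dcs = next(iter(latency_ms.values()))
    return {
        dc: min(latency_ms, key=lambda p: send_ms[p] + latency_ms[p][dc])
        for dc in dcs
    }


def quorum_winner(winners: Dict[str, str], quorum: int) -> Optional[str]:
    """Return the proposer that reached `quorum` replicas first, if any."""
    proposer, votes = Counter(winners.values()).most_common(1)[0]
    return proposer if votes >= quorum else None


# Hypothetical latencies: P0 is close to A and B, P1 is close to D and E.
latency = {
    "P0": {"A": 10.0, "B": 40.0, "C": 80.0, "D": 120.0, "E": 150.0},
    "P1": {"A": 150.0, "B": 120.0, "C": 90.0, "D": 40.0, "E": 10.0},
}
winners = first_arrivals(latency, {"P0": 0.0, "P1": 0.0})
votes = Counter(winners.values())
outcome = quorum_winner(winners, quorum=4)   # quorum of 4 out of 5 is an assumption
```

With these numbers P0 arrives first at three replicas and P1 at two, so neither reaches the assumed quorum and both have to retry.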
Optimization: Staggered Requests
[Timeline: Propose 0 and Propose 1 with requests to different data centers staggered in time]
Detect conflict faster
Observation: low network latency variance between cloud DCs
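The slides do not spell out the staggering rule, so the following is only one plausible reading: a proposer delays its request to each nearer data center so that all of its requests arrive at about the same time. That is feasible only if latencies between cloud data centers are predictable, which is the observation above. The function names and latency values are hypothetical.

```python
from typing import Dict


def stagger_delays(latency_ms: Dict[str, float]) -> Dict[str, float]:
    """Delay each request so it arrives when the request to the slowest DC does."""
    slowest = max(latency_ms.values())
    return {dc: slowest - lat for dc, lat in latency_ms.items()}


def arrival_times(start_ms: float, latency_ms: Dict[str, float]) -> Dict[str, float]:
    """Arrival time at each DC for a proposer that starts at start_ms and staggers."""
    delays = stagger_delays(latency_ms)
    return {dc: start_ms + delays[dc] + latency_ms[dc] for dc in latency_ms}


# Hypothetical one-way latencies (ms) from two proposers to five data centers.
p0 = arrival_times(0.0, {"A": 10.0, "B": 40.0, "C": 80.0, "D": 120.0, "E": 150.0})
p1 = arrival_times(5.0, {"A": 150.0, "B": 120.0, "C": 90.0, "D": 40.0, "E": 10.0})
p0_first_everywhere = all(p0[dc] < p1[dc] for dc in p0)
```

Under this assumption the proposer that started earlier reaches every replica first, so the replicas agree on one winner instead of splitting between the two proposals.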
CRIC Optimizations
- Reduce latency under conflict
  - Staggered requests
- Reduce reader write-back
  - Asynchronous commit notification
- Reduce storage and data transfer cost
  - Separate data and Paxos log
  - Aggressive garbage collection in Accept phase
  - Store data digest in Paxos log
Cost-effective: only one version of the data is stored in each replica data center
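Keeping the data separate from the Paxos log and storing only a digest in the log can be sketched as follows. The layout is an assumption made for illustration: `data_store`, `paxos_log`, `put`, and `get_verified` are hypothetical names, both stores are in-memory dictionaries, and SHA-256 is simply one reasonable digest choice.

```python
import hashlib
from typing import Dict, Optional, Tuple

data_store: Dict[str, bytes] = {}             # one data object per key
paxos_log: Dict[str, Tuple[int, str]] = {}    # key -> (ballot, hex digest of the data)


def put(key: str, ballot: int, data: bytes) -> None:
    """Store the data once and record only its digest in the Paxos log."""
    data_store[key] = data
    paxos_log[key] = (ballot, hashlib.sha256(data).hexdigest())


def get_verified(key: str) -> Optional[bytes]:
    """Return the data only if it matches the digest recorded in the log."""
    data = data_store.get(key)
    entry = paxos_log.get(key)
    if data is None or entry is None:
        return None
    _, digest = entry
    return data if hashlib.sha256(data).hexdigest() == digest else None


put("user:1", ballot=7, data=b"alice")
ok = get_verified("user:1")
data_store["user:1"] = b"mallory"             # data no longer matches the logged version
mismatch = get_verified("user:1")
```

The log entry stays the same small size however large the value is, and a reader can tell whether the data object it fetched is the version the log refers to.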
Evaluation
- Deploy CRIC in 5 Azure data centers and run YCSB workload
- Comparison systems:
  - Active acceptor: Fast Paxos
  - Passive acceptor: pPaxos
- How does CRIC compare with respect to cost and performance?
- How effective are staggered requests?
CRIC Enables Low Cost
[Chart: cost comparison of CRIC, Fast Paxos, and pPaxos]
Eliminate need for relay VMs
Reduce I/O and data transfers
CRIC can reduce cost by 20% to 50%
… without Sacrificing Performance
[Chart: median latency (ms, axis from 0 to 400) for reads and writes under low conflict; series: FastPaxos, pPaxos, CRIC]
Same performance as FastPaxos
Better write performance than pPaxos
Staggered Requests Lower Latency Under Conflict
[Chart: median latency for successful writes (ms, axis from 100 to 1000) vs. # of client servers per DC (0 to 8); series: without staggered, with staggered; conflict rate increases with the number of client servers]
Lower latency for same conflict rate
Conclusions
- Consistent Replication In the Cloud
  - Compatible with cloud storage interface
  - One round read/write in common case
  - Low cost
Thank you! [email protected]