23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU...

86
Intro to Database Systems 15-445/15-645 Fall 2019 Andy Pavlo Computer Science Carnegie Mellon University AP 23 Distributed OLTP Databases

Transcript of 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU...

Page 1: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

Intro to Database Systems

15-445/15-645

Fall 2019

Andy PavloComputer Science Carnegie Mellon UniversityAP

23 Distributed OLTP Databases

Page 2: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

ADMINISTRIV IA

Homework #5: Monday Dec 3rd @ 11:59pm

Project #4: Monday Dec 10th @ 11:59pm

Extra Credit: Wednesday Dec 10th @ 11:59pm

Final Exam: Monday Dec 9th @ 5:30pm

2

Page 3: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

L AST CL ASS

System Architectures→ Shared-Memory, Shared-Disk, Shared-Nothing

Partitioning/Sharding→ Hash, Range, Round Robin

Transaction Coordination→ Centralized vs. Decentralized

3

Page 4: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

OLTP VS. OL AP

On-line Transaction Processing (OLTP):→ Short-lived read/write txns.→ Small footprint.→ Repetitive operations.

On-line Analytical Processing (OLAP):→ Long-running, read-only queries.→ Complex joins.→ Exploratory queries.

4

Page 5: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

P3 P4

P1 P2

DECENTRALIZED COORDINATOR

5

ApplicationServer

Begin Request

Partitions

Page 6: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

P3 P4

P1 P2

DECENTRALIZED COORDINATOR

5

ApplicationServer

Query

Partitions

Query

Query

Page 7: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

P3 P4

P1 P2

DECENTRALIZED COORDINATOR

5

ApplicationServer

Safe to commit?

Commit Request

Partitions

Page 8: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

OBSERVATION

We have not discussed how to ensure that all nodes agree to commit a txn and then to make sure it does commit if we decide that it should.→ What happens if a node fails?→ What happens if our messages show up late?→ What happens if we don't wait for every node to agree?

6

Page 9: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

IMPORTANT ASSUMPTION

We can assume that all nodes in a distributed DBMS are well-behaved and under the same administrative domain.→ If we tell a node to commit a txn, then it will commit the

txn (if there is not a failure).

If you do not trust the other nodes in a distributed DBMS, then you need to use a Byzantine Fault Tolerant protocol for txns (blockchain).

7

Page 10: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

TODAY'S AGENDA

Atomic Commit Protocols

Replication

Consistency Issues (CAP)

Federated Databases

8

Page 11: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

ATOMIC COMMIT PROTOCOL

When a multi-node txn finishes, the DBMS needs to ask all the nodes involved whether it is safe to commit.

Examples:→ Two-Phase Commit→ Three-Phase Commit (not used)→ Paxos→ Raft→ ZAB (Apache Zookeeper)→ Viewstamped Replication

9

Page 12: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Commit Request

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 13: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Commit Request

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 14: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Commit Request

OK

OK

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 15: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Commit Request

OK

OK

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Phase2: Commit

Page 16: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Commit Request

OK

OK

OK

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Phase2: Commit

OK

Page 17: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (SUCCESS)

10

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Success!

Page 18: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

Commit Request

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 19: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

Commit Request

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 20: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

Commit Request

ABORT!

Phase1: Prepare

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 21: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

ABORT!

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Aborted

Page 22: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

ABORT!

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Phase2: Abort

Aborted

Page 23: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

TWO-PHASE COMMIT (ABORT )

11

ABORT!

OK

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Phase2: Abort

OK

Aborted

Page 24: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

2PC OPTIMIZATIONS

Early Prepare Voting→ If you send a query to a remote node that you know will

be the last one you execute there, then that node will also return their vote for the prepare phase with the query result.

Early Acknowledgement After Prepare→ If all nodes vote to commit a txn, the coordinator can

send the client an acknowledgement that their txn was successful before the commit phase finishes.

12

Page 25: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

Commit Request

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Page 26: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

Commit Request

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2Phase1: Prepare

Page 27: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

Commit Request

OK

OK Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2Phase1: Prepare

Page 28: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

OK

OK Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Success!

Phase1: Prepare

Page 29: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

OK

OK Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

Success!

Phase1: Prepare

Phase2: Commit

Page 30: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

EARLY ACKNOWLEDGEMENT

13

OK

OK

OK

Particip

an

tP

articip

an

t

Co

ord

inato

r

ApplicationServer

Node 3

Node 2

OK

Success!

Phase1: Prepare

Phase2: Commit

Page 31: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

TWO-PHASE COMMIT

Each node records the outcome of each phase in a non-volatile storage log.

What happens if coordinator crashes?→ Participants must decide what to do.

What happens if participant crashes?→ Coordinator assumes that it responded with an abort if it

hasn't sent an acknowledgement yet.

14

Page 32: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

Consensus protocol where a coordinator proposes an outcome (e.g., commit or abort) and then the participants vote on whether that outcome should succeed.

Does not block if a majority ofparticipants are available and has provably minimal message delays in the best case.

15

Page 33: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit Request

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Acce

pto

r

Node 3

Page 34: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit Request

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Acce

pto

r

Node 3

Propose

Page 35: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit Request

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Acce

pto

r

Node 3XPropose

Page 36: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit RequestAgree

Agree

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Acce

pto

r

Node 3XPropose

Page 37: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit RequestAgree

Agree

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Acce

pto

r

Node 3XPropose

Commit

Page 38: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Commit RequestAgree

Agree

Accept

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Accept

Acce

pto

r

Node 3XPropose

Commit

Page 39: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

Node 1

PAXOS

16

Acce

pto

rA

ccep

tor

Pro

po

ser

ApplicationServer

Node 4

Node 2

Success!

Acce

pto

r

Node 3X

Page 40: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsT

IME

Page 41: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

TIM

E

Page 42: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

TIM

E

Page 43: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

TIM

E

Page 44: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

Commit(n)

TIM

E

Page 45: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

Commit(n)

Reject(n,n+1)

TIM

E

Page 46: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

Commit(n)

Reject(n,n+1)

Agree(n+1)

TIM

E

Page 47: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

Commit(n)

Reject(n,n+1)

Commit(n+1)

Agree(n+1)

TIM

E

Page 48: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PAXOS

17

Proposer ProposerAcceptorsPropose(n)

Agree(n)

Propose(n+1)

Commit(n)

Reject(n,n+1)

Commit(n+1)

Agree(n+1)

Accept(n+1)

TIM

E

Page 49: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

MULTI-PAXOS

If the system elects a single leader that is in charge of proposing changes for some period of time, then it can skip the Propose phase.→ Fall back to full Paxos whenever there is a failure.

The system periodically renews who the leader isusing another Paxos round.

18

Page 50: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

2PC VS. PAXOS

Two-Phase Commit→ Blocks if coordinator fails after the prepare message is

sent, until coordinator recovers.

Paxos→ Non-blocking if a majority participants are alive,

provided there is a sufficiently long period without further failures.

19

Page 51: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

REPLICATION

The DBMS can replicate data across redundant nodes to increase availability.

Design Decisions:→ Replica Configuration→ Propagation Scheme→ Propagation Timing→ Update Method

20

Page 52: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

REPLICA CONFIGURATIONS

Approach #1: Master-Replica→ All updates go to a designated master for each object.→ The master propagates updates to its replicas without an

atomic commit protocol.→ Read-only txns may be allowed to access replicas.→ If the master goes down, then hold an election to select a

new master.

Approach #2: Multi-Master→ Txns can update data objects at any replica.→ Replicas must synchronize with each other using an

atomic commit protocol.

21

Page 53: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

REPLICA CONFIGURATIONS

22

Master-Replica

Master

P1

P1

P1

Replicas

Multi-Master

Node 1

P1

Node 2

P1

WritesReads Writes

Reads

WritesReads

Reads

Page 54: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

K-SAFET Y

K-safety is a threshold for determining the fault tolerance of the replicated database.

The value K represents the number of replicas per data object that must always be available.

If the number of replicas goes below this threshold, then the DBMS halts execution and takes itself offline.

23

Page 55: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PROPAGATION SCHEME

When a txn commits on a replicated database, the DBMS decides whether it must wait for that txn'schanges to propagate to other nodes before it can send the acknowledgement to application.

Propagation levels:→ Synchronous (Strong Consistency)→ Asynchronous (Eventual Consistency)

24

Page 56: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PROPAGATION SCHEME

Approach #1: Synchronous→ The master sends updates to replicas and

then waits for them to acknowledge that they fully applied (i.e., logged) the changes.

25

Commit? Flush? Flush!

Page 57: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PROPAGATION SCHEME

Approach #1: Synchronous→ The master sends updates to replicas and

then waits for them to acknowledge that they fully applied (i.e., logged) the changes.

25

Commit? Flush?

AckAck

Flush!

Page 58: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PROPAGATION SCHEME

Approach #1: Synchronous→ The master sends updates to replicas and

then waits for them to acknowledge that they fully applied (i.e., logged) the changes.

Approach #2: Asynchronous→ The master immediately returns the

acknowledgement to the client without waiting for replicas to apply the changes.

25

Commit? Flush?

AckAck

Flush!

Commit? Flush?

Ack

Page 59: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

PROPAGATION TIMING

Approach #1: Continuous→ The DBMS sends log messages immediately as it

generates them.→ Also need to send a commit/abort message.

Approach #2: On Commit→ The DBMS only sends the log messages for a txn to the

replicas once the txn is commits.→ Do not waste time sending log records for aborted txns.→ Assumes that a txn's log records fits entirely in memory.

27

Page 60: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

ACTIVE VS. PASSIVE

Approach #1: Active-Active→ A txn executes at each replica independently.→ Need to check at the end whether the txn ends up with

the same result at each replica.

Approach #2: Active-Passive→ Each txn executes at a single location and propagates the

changes to the replica.→ Can either do physical or logical replication.→ Not the same as master-replica vs. multi-master

28

Page 61: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP THEOREM

Proposed by Eric Brewer that it is impossible for a distributed system to always be:→ Consistent→ Always Available→ Network Partition Tolerant

Proved in 2002.

29

Brewer

Pick Two!Sort of…

Page 62: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP THEOREM

30

AC

P

ConsistencyAvailabilityPartition Tolerant

LinearizabilityAll up nodes can satisfy

all requests.

Still operate correctly despite message loss.

Impossible

Page 63: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

Page 64: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=2 A=1

B=8

ApplicationServer

ApplicationServer

Page 65: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=2 A=1

B=8

A=2

ApplicationServer

ApplicationServer

Page 66: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=2 A=1

B=8

A=2

ApplicationServer

ApplicationServerACK

Page 67: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=2

Read A

A=1

B=8

A=2

ApplicationServer

ApplicationServerACK

Page 68: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP CONSISTENCY

31

Master Replica

NETWORK

Set A=2

A=1

B=8

A=2

Read A

A=2

A=1

B=8

A=2

If master says the txn committed, then it should be immediately

visible on replicas.

ApplicationServer

ApplicationServerACK

Page 69: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP AVAIL ABILIT Y

32

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

X

Page 70: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP AVAIL ABILIT Y

32

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

X

Read B

Page 71: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP AVAIL ABILIT Y

32

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

X

Read B

B=8

Page 72: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP AVAIL ABILIT Y

32

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

X

Read A

Page 73: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP AVAIL ABILIT Y

32

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

X

Read A

A=1

Page 74: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master Replica

NETWORK

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

Page 75: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master

A=1

B=8

A=1

B=8

ApplicationServer

ApplicationServer

Master

Page 76: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master

Set A=2

A=1

B=8

Set A=3

A=1

B=8

ApplicationServer

ApplicationServer

Master

Page 77: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master

Set A=2

A=1

B=8

A=2

Set A=3

A=1

B=8

A=3

ApplicationServer

ApplicationServer

Master

Page 78: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master

Set A=2

A=1

B=8

A=2

Set A=3

ACK

A=1

B=8

A=3

ApplicationServer

ApplicationServerACK

Master

Page 79: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP PARTITION TOLERANCE

33

Master

NETWORK

Set A=2

A=1

B=8

A=2

Set A=3

ACK

A=1

B=8

A=3

ApplicationServer

ApplicationServerACK

Master

Page 80: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CAP FOR OLTP DBMSs

How a DBMS handles failures determines which elements of the CAP theorem they support.

Traditional/NewSQL DBMSs→ Stop allowing updates until a majority of nodes are

reconnected.

NoSQL DBMSs→ Provide mechanisms to resolve conflicts after nodes are

reconnected.

34

Page 81: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

OBSERVATION

We have assumed that the nodes in our distributed systems are running the same DBMS software.

But organizations often run many different DBMSs in their applications.

It would be nice if we could have a single interface for all our data.

35

Page 82: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

FEDERATED DATABASES

Distributed architecture that connects togethermultiple DBMSs into a single logical system.

A query can access data at any location.

This is hard and nobody does it well→ Different data models, query languages, limitations.→ No easy way to optimize queries→ Lots of data copying (bad).

36

Page 83: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

FEDERATED DATABASE EXAMPLE

37

Mid

dle

ware

Query Requests

ApplicationServer

Back-end DBMSs

Connectors

Page 84: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

FEDERATED DATABASE EXAMPLE

37

Query Requests

ApplicationServer

Back-end DBMSs

Foreign Data

Wrappers

Connectors

Page 85: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

CONCLUSION

We assumed that the nodes in our distributed DBMS are friendly.

Blockchain databases assume that the nodes are adversarial. This means you must use different protocols to commit transactions.

38

Page 86: 23 Distributed OLTP Databases - CMU 15-445/645 · 2020-02-11 · Federated Databases 8. CMU 15-445/645 (Fall 2019) ATOMIC COMMIT PROTOCOL When a multi-node txn finishes, the DBMS

CMU 15-445/645 (Fall 2019)

NEXT CL ASS

Distributed OLAP Systems

39