Distributed Transactions 7. Transaction Management for distributed databases 28 Distributed Transactions Thomas Leich, Gunter Saake Transaction Management Last updated: 14.12.2018 7–1

Distributed Transactions

7. Transaction Management for distributed databases

28 Distributed Transactions

29 Distributed Commit

30 Distributed Synchronization

31 Distributed Deadlocks

32 Transaction Monitors

Thomas Leich, Gunter Saake Transaction Management Last updated: 14.12.2018 7–1

Distributed Transactions

- In distributed DBS, transactions across several nodes
- Commit as an atomic event → simultaneous in distributed nodes
- Distributed synchronization in order to guarantee consistency in interleaved executions
- Deadlock detection

Distributed Transactions Distributed Commit

Requirements for distributed commit

Commit protocol: Guarantee for atomicity and durability
Requirements for distributed cases:
- All nodes make a decision (Commit, Abort); globally, all nodes make the same decision
- Commit only if all nodes vote "yes"
- If no failure occurs and all nodes vote "yes", the global decision is commit
- All processes terminate

Two-phase commit protocol

Roles: 1 coordinator, several participants
Execution:
1. Voting phase
   1. Coordinator asks participants whether Commit can be performed
   2. Participants signal their decision to coordinator
2. Decision phase
   1. Coordinator makes a decision based on participants' signals (all Vote-Commit → Global-Commit; one Vote-Abort → Global-Abort)
   2. Participants that voted "yes" wait for decision
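
The two phases above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a fault-tolerant implementation: the class `Participant`, its flag `can_commit`, and the function `two_phase_commit` are hypothetical names, and logging, timeouts, and ACK handling are left out.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "Vote-Commit"
    ABORT = "Vote-Abort"


class Decision(Enum):
    COMMIT = "Global-Commit"
    ABORT = "Global-Abort"


class Participant:
    """Hypothetical participant; `can_commit` stands in for its local checks."""

    def __init__(self, name, can_commit):
        self.name = name
        self.can_commit = can_commit
        self.state = "INITIAL"

    def prepare(self):
        # Voting phase: a "yes" vote moves the participant to READY,
        # a "no" vote lets it abort unilaterally.
        if self.can_commit:
            self.state = "READY"
            return Vote.COMMIT
        self.state = "ABORT"
        return Vote.ABORT

    def decide(self, decision):
        # Decision phase: apply the coordinator's global decision.
        self.state = "COMMIT" if decision is Decision.COMMIT else "ABORT"


def two_phase_commit(participants):
    # Phase 1: the coordinator collects one vote per participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: Global-Commit only if every vote is Vote-Commit.
    if all(v is Vote.COMMIT for v in votes):
        decision = Decision.COMMIT
    else:
        decision = Decision.ABORT
    for p in participants:
        p.decide(decision)
    return decision
```

A single `Participant("b", False)` in the list is enough to turn the global decision into `Decision.ABORT` for everyone.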

2PC: Execution scheme

[Figure: flowchart of the 2PC message exchange between coordinator and participants. Coordinator states: INITIAL, WAIT, COMMIT, ABORT. Participant states: INITIAL, READY, COMMIT, ABORT. Messages: Prepare-To-Commit, Vote-Commit, Vote-Abort (unilateral abort), Global-Commit, Global-Abort, ACK. Log records: begin_commit, ready, commit, abort, EOT. Decision points: "Ready to commit?", "All committed?", "Global decision?".]

2PC: State transition

[Figure: state transition diagrams for (a) the coordinator with states INITIAL, WAIT, COMMIT, ABORT and (b) the participants with states INITIAL, READY, COMMIT, ABORT. Transition labels: commit command, Prepare, Vote-Commit, Vote-Abort, Global-Commit, Global-Abort, ACK.]

2PC: Problems I

[Figure: message sequence between the coordinator and participants 1, 2 and 3 across the 1st and 2nd phase, with two marked failure points (1) and (2). Legend: p = Prepare-To-Commit, v-c = Vote-Commit, g-c = Global-Commit.]

2PC: Problems II

Participants signaled Vote-Commit but coordinator fails (1)
- Abort of participants after timeout
- But: Undo of a made decision!

After Global-Commit has been sent (to an unknown number of participants), coordinator and participant 1 fail (2)
- Who sends Global-Commit? Or Abort?

Variants of 2PC I

Linear 2PC: Coordinator as initiator
- Coordinator sends Prepare-To-Commit to participant 1
- Participant 1 makes a decision and sends it to the next participant
- Vote-Abort signal also to predecessor
- Last participant receives Vote-Commit and votes "yes" → Global-Commit to predecessor

Disadvantage: Slow because of sequential processing
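
The sequential nature of linear 2PC becomes visible when the messages are written out one after the other. The sketch below is an illustration under assumptions: `linear_2pc` is a hypothetical function, node 0 stands for the coordinator, at least one participant is assumed, and in the abort case only the first Vote-Abort hops are recorded (further propagation is omitted).

```python
def linear_2pc(votes):
    """Simulate linear 2PC for participants 1..n (n >= 1).

    votes[i] is True when participant i+1 votes "yes".
    Returns the global decision and the ordered list of messages as
    (sender, receiver, message) triples; node 0 is the coordinator.
    """
    n = len(votes)
    trace = [(0, 1, "Prepare-To-Commit")]
    for i in range(1, n + 1):
        if not votes[i - 1]:
            # A "no" vote is passed on and is also signalled to the predecessor.
            if i < n:
                trace.append((i, i + 1, "Vote-Abort"))
            trace.append((i, i - 1, "Vote-Abort"))
            return "Global-Abort", trace
        if i < n:
            trace.append((i, i + 1, "Vote-Commit"))
    # The last participant voted "yes": Global-Commit travels back along the chain.
    for i in range(n, 0, -1):
        trace.append((i, i - 1, "Global-Commit"))
    return "Global-Commit", trace
```

With n participants that all vote "yes", the trace holds 2n messages, none of which can be sent in parallel.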

Linear 2PC: Execution schema

[Figure: linear message chain across coordinator, participant 1, participant 2, participant 3 and participant 4, with one edge labeled "abort".]

Variants of 2PC II

Distributed 2PC: Local voting process
- Coordinator sends Prepare-To-Commit to all participants
- Write decision into Log and forward to all participants
- Every participant receives all results and makes a local decision
- Disadvantage: A lot of communication is required
- Advantage: Quick answers because of missing phase 2

Hierarchical 2PC: Coordinators and sub-coordinators

Distributed 2PC: Execution schema

[Figure: coordinator and two columns of participants. Message labels: Prepare, Vote-Commit, Vote-Abort, Global-Commit, Global-Abort.]

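Hierarchical 2PC: Execution schema

[Figure: hierarchy of nodes with A on the first level, B and C on the second level, and D and E on the third level, exchanging P, R, C and "ok" messages. Legend: P = Prepare, R = Ready, C = Commit.]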

3-Phase Commit Protocol

Problems of 2PC: Failure of coordinators before participants receive Global-Commit / Global-Abort
Solution: 3PC with additional PRE-COMMIT phase
- Participants that receive Prepare-To-Commit know that Commit will arrive only if coordinator does not fail
- Coordinator sends Commit only after k participants confirm the Prepare-To-Commit with a Ready-To-Commit

3PC: Phases I

1. Voting phase
   1. Coordinator sends Prepare signal
   2. Every participant answers and signals its decision (Vote-Commit or Vote-Abort)
   3. In case of Vote-Abort, directly into state ABORT
2. Decision preparation phase
   1. Coordinator collects decisions; in case of Vote-Commit, a Prepare-To-Commit is sent to all; otherwise Global-Abort
   2. Every participant with Vote-Commit waits for Prepare-To-Commit and confirms with Ready-To-Commit; otherwise Global-Abort

3PC: Phases II

3. Decision phase
   1. Coordinator collects all confirmations and makes a decision
   2. Participants wait for decision
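
The coordinator side of the three phases can be sketched as follows. The function `three_phase_commit` and its parameters are hypothetical; in particular, treating "fewer than k confirmations" as an abort and using "at least k" as the threshold are assumptions of this sketch, since the slides only state that Commit is sent after k participants have confirmed.

```python
def three_phase_commit(votes, confirmations, k):
    """Return (decision, coordinator_states).

    votes:         dict participant -> True (Vote-Commit) or False (Vote-Abort)
    confirmations: set of participants that answer Ready-To-Commit
    k:             number of confirmations required before Commit is sent
    """
    states = ["INITIAL", "WAIT"]  # Prepare has been sent to all participants
    # Phase 1 (voting): a single Vote-Abort aborts the transaction.
    if not all(votes.values()):
        states.append("ABORT")
        return "Global-Abort", states
    # Phase 2 (decision preparation): Prepare-To-Commit goes to all participants.
    states.append("PRE-COMMIT")
    confirmed = sum(1 for p in votes if p in confirmations)
    # Phase 3 (decision): commit only with enough Ready-To-Commit answers.
    if confirmed >= k:
        states.append("COMMIT")
        return "Global-Commit", states
    # Assumption of this sketch: too few confirmations lead to an abort.
    states.append("ABORT")
    return "Global-Abort", states
```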

3PC: Execution schema

[Figure: flowchart of the 3PC message exchange between coordinator and participants. Coordinator states: INITIAL, WAIT, PRE-COMMIT, COMMIT, ABORT. Participant states: INITIAL, READY, PRE-COMMIT, COMMIT, ABORT. Messages: Prepare, Vote-Commit, Vote-Abort (unilateral abort), Prepare-To-Commit, Ready-To-Commit, Global-Commit, Global-Abort, ACK. Log records: begin_commit, ready, prepare-to-commit, precommit, commit, abort, EOT. Decision points: "Ready to commit?", "All committed?", "Global decision?"; one transition carries the guard "[> k]".]

3PC: State Transition

[Figure: state transition diagrams for (a) the coordinator with states INITIAL, WAIT, PRE-COMMIT, COMMIT, ABORT and (b) the participants with states INITIAL, READY, PRE-COMMIT, COMMIT, ABORT. Transition labels: commit command, Prepare, Vote-Commit, Vote-Abort, Prepare-To-Commit, Ready-To-Commit, Global-Commit, Global-Abort, ACK.]

3PC: Errors

Failure of coordinator and up to k − 1 further participants
1. All further participants in state READY
   - Failed participants can only be in states READY, ABORT or PRE-COMMIT → Abort of transaction
2. One participant in state PRE-COMMIT or COMMIT
   - Becomes new coordinator and continues protocol
   - Decision was already made for commit
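
The two cases can be written as a small decision function over the states of the surviving participants. This is a sketch only: `termination_decision` is a hypothetical name, and state combinations that the two cases above do not cover are rejected rather than guessed.

```python
def termination_decision(surviving_states):
    """Decide the outcome after the coordinator has failed.

    surviving_states: iterable with the state of each reachable participant.
    """
    states = list(surviving_states)
    if not states:
        raise ValueError("no surviving participant")
    # Case 2: someone already reached PRE-COMMIT or COMMIT, so the
    # decision for commit was made; that participant takes over.
    if any(s in ("PRE-COMMIT", "COMMIT") for s in states):
        return "Global-Commit"
    # Case 1: every surviving participant is still in READY.
    if all(s == "READY" for s in states):
        return "Global-Abort"
    raise ValueError("state combination not covered by the two cases")
```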

Distributed Transactions Distributed Synchronization

Distributed Synchronization

Local synchronization not sufficient
Example: T1, T2

Node 1                          Node 2
T1             T2               T1             T2
r1(x) w1(x)
                                               r2(y) w2(y)
               r2(x) w2(x)
                                r1(y) w1(y)
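
The example can be checked mechanically by building the conflict graph of each node and of their union. The sketch below assumes that x is stored on node 1 and y on node 2; `conflict_edges` and `has_cycle` are hypothetical helper names.

```python
def conflict_edges(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (op_a, ta, obj_a) in enumerate(schedule):
        for op_b, tb, obj_b in schedule[i + 1:]:
            # Two operations conflict if they belong to different transactions,
            # touch the same object, and at least one of them is a write.
            if ta != tb and obj_a == obj_b and "w" in (op_a, op_b):
                edges.add((ta, tb))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed conflict graph."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# Local schedules of the example as (operation, transaction, object) triples.
node1 = [("r", 1, "x"), ("w", 1, "x"), ("r", 2, "x"), ("w", 2, "x")]
node2 = [("r", 2, "y"), ("w", 2, "y"), ("r", 1, "y"), ("w", 1, "y")]

local_ok = not has_cycle(conflict_edges(node1)) and not has_cycle(conflict_edges(node2))
global_ok = not has_cycle(conflict_edges(node1) | conflict_edges(node2))
```

Each node on its own sees a serial order (T1 before T2 on node 1, T2 before T1 on node 2), so `local_ok` is true, while the union of both graphs contains the cycle T1 → T2 → T1 and `global_ok` is false.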

Distributed Synchronization: II

Resulting schedule not conflict serializable; however, no local synchronization conflict
Solution:
- Distributed timestamp-ordering method
  - Total order on timestamps (Time, Node-ID)
  - Requires global clock and distributed clock synchronization
- Central or distributed locking methods
- ...

Distributed timestamp-ordering method

Global timestamp ts as two-tuple (ts_l is the local timestamp, hid is the Host-ID):

ts = (ts_l, hid)

Order determination:
- Order according to ts_l value
- In case of same ts_l value, decision according to hid
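
The order determination is a lexicographic comparison of the two components, which Python tuples provide directly. A minimal sketch; the class name `GlobalTimestamp` and the sample values are made up for illustration.

```python
from typing import NamedTuple


class GlobalTimestamp(NamedTuple):
    ts_l: int  # local timestamp
    hid: int   # Host-ID, used only to break ties


a = GlobalTimestamp(ts_l=5, hid=2)
b = GlobalTimestamp(ts_l=5, hid=1)
c = GlobalTimestamp(ts_l=4, hid=3)

# Tuples compare component by component: first ts_l, then hid.
ordered = sorted([a, b, c])
# ordered == [c, b, a], i.e. (4, 3) < (5, 1) < (5, 2)
```

Because no two hosts share a Host-ID, two different transactions never receive equal global timestamps, so the order is total.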

Synchronization of local clocks

Usage of global clock:
- Requires regular synchronization of local clocks → often not acceptable
- Usage of radio-controlled clock

Distributed clock synchronization:
- Synchronization during communication, later time is adopted → (unique time)

Distributed timestamp allocation (via counter)

Columns: point in time; local TA and timestamp (per row, either at Node 1 or at Node 2)

Point in time   Local TA   Timestamp
 1              T1         1
 2              T1         1
 3              T2         2
 4              T3         3
 5              T2         2
 6              T4         4
 7              T5         5
 8              T3         5
 9              T6         6
10              T4         6
11              T7         7
12              T5         7
13              T8         8
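
Counter-based allocation with adoption of the later time can be sketched as below. The class `Node`, its methods, and the single message in the usage are assumptions for illustration; the sketch does not claim to reproduce the message exchange behind the table.

```python
class Node:
    """Hypothetical node that hands out timestamps from a local counter."""

    def __init__(self, hid):
        self.hid = hid
        self.counter = 0

    def new_timestamp(self):
        # Each new local transaction gets the next counter value.
        self.counter += 1
        return (self.counter, self.hid)

    def receive(self, sender_counter):
        # Synchronization during communication: the later time is adopted.
        self.counter = max(self.counter, sender_counter)


n1, n2 = Node(hid=1), Node(hid=2)
first_n1 = [n1.new_timestamp() for _ in range(4)]  # counters 1..4 on node 1
first_n2 = [n2.new_timestamp() for _ in range(2)]  # counters 1..2 on node 2
n2.receive(n1.counter)                             # node 2 learns about time 4
next_n2 = n2.new_timestamp()                       # jumps ahead to counter 5
```

A node that lags behind skips counter values after it has heard from a faster node, which keeps timestamps of later transactions from falling below timestamps it has already seen.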

Transactions on replicas I

A schedule s on a replicated database is 1-copy serializable if there is a serial schedule on a non-replicated database that has the same effect as s has on the replicated database.

Transactions on replicas II

Replication protocol
- ROWA method (Read One, Write All): local read and synchronized updates of all replicas → extremely high complexity; some computer nodes might be unavailable
- ROWAA method (Read One, Write All Available)
- Voting procedure (quorum procedure)
  - Static number of "eligible voters"
  - Dynamic number of "eligible voters", depending on environmental influences such as lost connections and access behavior
  - (Weighting of votes is possible)
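
A weighted quorum check can be sketched as follows. The slides do not state the quorum conditions; the two inequalities used here (read quorum plus write quorum exceeds the total weight, and twice the write quorum exceeds the total weight) are the usual conditions for weighted voting and are an assumption of this sketch, as are the function names and the sample weights.

```python
def valid_quorums(weights, read_quorum, write_quorum):
    """Check the usual overlap conditions for weighted voting."""
    total = sum(weights.values())
    # Every read quorum must overlap every write quorum,
    # and any two write quorums must overlap each other.
    return read_quorum + write_quorum > total and 2 * write_quorum > total


def has_quorum(weights, reachable, quorum):
    """True if the reachable replicas together hold enough votes."""
    return sum(weights[node] for node in reachable) >= quorum


weights = {"A": 2, "B": 1, "C": 1}  # hypothetical replicas and vote weights
```

With these weights, a write quorum of 3 and a read quorum of 2 satisfy both conditions; replicas A and B together can write, while B and C together cannot.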

Transactions on replicas III

Replication protocol
- Absolutistic approaches, e.g. primary copy method: A certain node updates a replica in any case. Choice is static or the node has a token

Distributed locking methods

- Centralized 2PL (C2PL): Central management of locks on a node → requires a lot of communication, heavy load for central lock manager (replication protocol needs to be observed)
- Primary copy 2PL (PC2PL): Several lock managers on different nodes; each DB-object has exactly one lock manager → Distribution of lock managing load
- Distributed 2PL (D2PL): Lock manager on every DBMS; lock manager is responsible for its own DB-objects (no replication → PC2PL, otherwise ROWA)

Distributed Transactions Distributed Deadlocks

Distributed Deadlocks

Classes of deadlock handling:
- Deadlock-free: Preclaiming (C2PL); atomic requirement can hardly be fulfilled in distributed cases
- Deadlock prevention: Total order of objects and their occupancy
- Deadlock detection: Detection of distributed deadlocks is problematic

Deadlock detection

- Time-out mechanism
- Global deadlock graph
- Central coordinator manages conflict graph (coordinator could fail!)

[Figure: node 1 and node 2 with objects A and B and transactions T1 and T2; the local deadlock graphs (local DG 1, local DG 2) are combined into the global DG.]
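
A central coordinator can merge the local graphs and search the result for a cycle. The sketch below uses hypothetical helper names and assumed edge directions (the figure's exact edges are not recoverable); graphs map a waiting transaction to the set of transactions it waits for. The search is a plain depth-first walk, fine for small graphs but not tuned for large ones.

```python
def merge_wait_for_graphs(*local_graphs):
    """Union of local graphs; each maps a waiter to the set it waits for."""
    merged = {}
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged.setdefault(waiter, set()).update(holders)
    return merged


def find_deadlock(graph):
    """Return one cycle as a list of transactions, or None if there is none."""

    def dfs(node, path):
        if node in path:
            # The walk came back to a node on the current path: a cycle.
            return path[path.index(node):]
        for nxt in sorted(graph.get(node, ())):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in sorted(graph):
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None


local_dg_1 = {"T1": {"T2"}}  # assumed: on node 1, T1 waits for T2
local_dg_2 = {"T2": {"T1"}}  # assumed: on node 2, T2 waits for T1
global_dg = merge_wait_for_graphs(local_dg_1, local_dg_2)
```

Neither local graph contains a cycle, but `find_deadlock(global_dg)` returns one that involves both T1 and T2.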

Deadlock handling

Practical methods
- Conservative locking: C2PL-method; problems with atomicity
- Timestamps as requirement order: Timestamps for handling lock conflicts
- Deadlock detection
  - Centralized: Central vertex manages complete wait graph
  - Hierarchical deadlock detection: Many deadlocks can be identified locally; difficult implementation though
  - Distributed deadlock detection

Distributed detection of global deadlocks I

Deadlocks do not exist locally in any computer node. Computers send each other messages of the form [m, n, k].

1. m ≅ number of the blocked process
2. n ≅ number of the transaction that sent the message (sender)
3. k ≅ number of the transaction to which the message is directed (receiver)

Distributed identification of global deadlocks: II

- Message transmission starts with [0,0,1]
- Message [0,2,3] means that the blocked transaction is transaction 0, sender is transaction 2 and receiver is transaction 3
- A deadlock occurs if and only if the message arrives at the blocked process
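
The message scheme can be simulated on a single machine to see how a message finds its way back to the blocked transaction. This is a simplified sketch: `detect_deadlock` and the `waits_for` mapping are hypothetical, messages are assumed to travel from a waiting transaction to the transactions it waits for, and every hop is recorded as a message even though a real system would only need messages between nodes.

```python
from collections import deque


def detect_deadlock(blocked, waits_for):
    """Forward [m, n, k] messages along waits-for edges.

    blocked:   number of the blocked transaction (m)
    waits_for: dict transaction -> set of transactions it waits for
    Returns (deadlock_found, list_of_sent_messages).
    """
    queue = deque((blocked, blocked, k) for k in sorted(waits_for.get(blocked, ())))
    forwarded = set()
    sent = []
    while queue:
        m, n, k = queue.popleft()
        sent.append([m, n, k])
        if k == m:
            # The message arrived at the blocked transaction: deadlock.
            return True, sent
        if k in forwarded:
            continue  # this receiver has already passed the message on
        forwarded.add(k)
        for nxt in sorted(waits_for.get(k, ())):
            queue.append((m, k, nxt))
    return False, sent


# Hypothetical waits-for relation: 0 -> 1 -> 2 -> 3 -> 0
found, messages = detect_deadlock(0, {0: {1}, 1: {2}, 2: {3}, 3: {0}})
```

For this cycle the messages are [0, 0, 1], [0, 1, 2], [0, 2, 3] and [0, 3, 0]; the last one reaches transaction 0 and signals the deadlock.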

Distributed identification of global deadlocks: Example

[Figure: transactions 0 to 8 distributed over Node 1, Node 2 and Node 3, connected by edges n → m. Messages shown: [0, 2, 3], [0, 4, 6], [0, 5, 7], [0, 8, 0]. Legend: "The transaction m waits for transaction n, or the transaction m is blocked by transaction n."]

Distributed Transactions Transaction Monitors

Transaction monitors

[Figure: layered diagram with two Presentation Servers at the top, the Workflow Controller in the middle, and two Transaction Servers at the bottom.]

Transaction monitors: Architecture

- Presentation server: acts as client and handles communication with the user (command language or menu-driven interfaces for sending transactions etc.)
- Workflow controller: enforces routing of transaction requests to the different DBMSs and implements, for instance, the two-phase commit protocol
- Transaction server: connects the local DBMS with the transaction monitor

Advantages of a transaction monitor

- Offers one standardized interface for programming transactions on different DBMSs
- In distributed processing, it manages routing of transactions and enforces commit protocols
- Offers system functions such as load balancing, error control, and system configuration
- Is able to fulfill functions such as writing log files or monitoring communication
- Transaction server of a TP-monitor can also encapsulate data that is not managed by a DBMS with full transaction functionality
