Distributed Data Management - KIT · 2008. 11. 24. · Distributed Data Management Chapter 4:...
IPD, Forschungsbereich Systeme der Informationsverwaltung
Lecture
Distributed Data Management
Chapter 4: Distributed Transactions (Second Part)
Erik Buchmann
2PC Variants
Erik Buchmann IWM: Einleitung – 3
Linear Two-Phase Commit (1)
● Commit processing sequentially via the TMs of the n nodes participating in the global TA.
● Phase 1: communication in the forward direction, from the coordinator (TM1) to the last agent (TMn); Phase 2: the reverse direction.
[Figure: chain TM1, TM2, TM3, ..., TMn. READY/FAILED is passed from each node to the next, COMMIT/ABORT is passed back in the reverse direction, followed by a single ACK.]
Linear Two-Phase Commit (2)
● Coordinator enters PREPARED state and passes its local commit decision (READY) to TM2.
● Agent enters PREPARED state; after having received READY, it sends READY to the next agent.
Linear Two-Phase Commit (3)
● Successful termination of the transaction is decided once the last agent TMn has received READY and has written the commit log entry (of the entire transaction).
– The commit decision goes to the agents in reverse order; each agent logs it and releases its locks.
– TM1 then sends ACK to TMn; TMn writes the end log entry.
Linear Two-Phase Commit (4)
● Abort of transaction:
– If one of the nodes decides abort, a FAILED message is passed on.
– Last agent (TMn) becomes coordinator: it logs the global commit result and passes it on.
Linear Two-Phase Commit (5)
● Advantages:
– reduced communication overhead: PREPARE and READY messages are combined, and there is only one ACK
● Disadvantages:
– serial processing, very slow if many nodes are involved
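The linear scheme described on the preceding slides can be sketched as a small simulation. This is a minimal illustration, not the lecture's own code: the function name `linear_2pc` and the representation of votes as booleans are assumptions, and the sketch assumes the same message pattern (forward pass, backward pass, one ACK) for the commit and the abort case.

```python
def linear_2pc(votes):
    """Simulate linear 2PC along the chain TM1 .. TMn (hypothetical helper).

    votes[i] is True if node TM(i+1) is locally ready to commit.
    Returns (decision, number_of_messages).
    """
    n = len(votes)
    if n < 2:
        raise ValueError("a distributed transaction needs at least two nodes")

    messages = 0
    all_ready = votes[0]
    # Phase 1: READY/FAILED travels forward from TM1 to TMn, one hop at a time.
    for i in range(1, n):
        messages += 1
        all_ready = all_ready and votes[i]

    # The last agent TMn takes over the coordinator role and decides.
    decision = "COMMIT" if all_ready else "ABORT"

    # Phase 2: the decision travels back from TMn to TM1.
    messages += n - 1
    # A single ACK from TM1 to TMn lets TMn write its end log entry.
    messages += 1
    return decision, messages
```

For n nodes the sketch counts (n-1) + (n-1) + 1 = 2n-1 messages, and each hop has to wait for the previous one, which is where the serial-processing disadvantage comes from.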
Hierarchical Two-Phase Commit (1)
● Generalization of basic scheme for hierarchical invocation structures (transaction tree).
● Each agent communicates only with direct ancestor and direct successors.
[Figure: three invocation structures side by side: standard, linear, hierarchical.]
Hierarchical Two-Phase Commit (2)
[Figure: hierarchical 2PC message flow between TM1 (coordinator), TM2 (agent) and TM3 (agent). Messages shown: PREPARE, READY / FAILED, COMMIT / ABORT, ACK. Annotations: logging, release of locks, end.]
Hierarchical Two-Phase Commit (3)
● No changes for root and leaf nodes.
● Intermediate nodes: coordinator for the successors in the tree, agent from the ancestor's perspective.
● PREPARE messages go to all successors; the node waits for their commit votes. It then takes the commit decision for the entire subtree, logs it, and sends it to its ancestor.
Hierarchical Two-Phase Commit (4)
● Abort – immediately inform all successors having voted commit.
● Phase 2: receive commit result from ancestor, log it, pass it on to successors, and confirm it immediately.
● After having received all ACKs from its successors, the node writes the end log entry.
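The role of an intermediate node (coordinator towards its successors, agent towards its ancestor) can be illustrated with a recursive sketch over a transaction tree. All names here (`Node`, `subtree_vote`, `distribute`, `hierarchical_2pc`) are hypothetical. As a simplification, every node receives the final result, and a function return stands in for the vote or ACK message.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    local_vote: bool = True                      # True = READY, False = FAILED
    children: list = field(default_factory=list)
    log: list = field(default_factory=list)


def subtree_vote(node):
    """Phase 1 at an agent: ask all successors, then vote for the whole subtree."""
    child_votes = [subtree_vote(child) for child in node.children]
    ready = node.local_vote and all(child_votes)
    node.log.append("prepared" if ready else "failed")
    return ready


def distribute(node, decision):
    """Phase 2: log the result, pass it on, write 'end' after all ACKs."""
    node.log.append(decision)
    for child in node.children:
        distribute(child, decision)              # returning models the child's ACK
    if node.children:
        node.log.append("end")                   # only nodes with successors wait for ACKs


def hierarchical_2pc(root):
    votes = [subtree_vote(child) for child in root.children]
    decision = "commit" if root.local_vote and all(votes) else "abort"
    distribute(root, decision)
    return decision
```

The recursion depth equals the height of the tree, which mirrors the observation that the duration of the protocol grows with the height of the tree.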
Hierarchical Two-Phase Commit (5)
● High generality and flexibility:
– fewer messages compared to the basic scheme
– might speed up processing if groups of nodes are located in different subnets with slow interconnections
● In general, reduced performance:
– less parallelism and longer duration (proportional to the height of the tree)
– one additional (asynchronous) log write in each intermediate node (end log).
2PC Optimizations
Basic Protocol
● Flow of messages: the messages shown occur for each agent.
● Recap: What happens when an agent recovers during the uncertainty period?
[Figure: basic 2PC message flow between TM1 (coordinator) and TM2 (agent). Phase 1: PREPARE; the agent determines its local commit result, logs it, and returns it (READY / FAILED); the coordinator determines the global commit result and logs it. Phase 2: global commit result (COMMIT / ABORT); the agent logs the global commit result and releases its locks; confirmation (ACK); end.]
Recap: Why necessary?
Presumed Abort (1)
● Objective: fewer messages and fewer log entries.
● If, after a failure, the coordinator log file does not contain a commit log entry: decide abort.
● Presumed abort is incorporated in several products and in standards (ISO/OSI TP, X/Open DTP).
Presumed Abort (2)
● Advantages:
– Coordinator does not need to write the abort log entry synchronously.
– ACK messages for failed transactions are superfluous,
– as are end log entries at the coordinator and at intermediate nodes.
● However, no savings with successful global transactions.
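The presumed-abort rule boils down to a single lookup after a failure. The following sketch assumes a hypothetical log format (a list of `(transaction_id, entry_kind)` pairs) and a hypothetical helper name `resolve_after_failure`; real systems store considerably more per entry.

```python
def resolve_after_failure(coordinator_log, txn_id):
    """Answer an inquiry about txn_id under presumed abort (illustrative only).

    coordinator_log: iterable of (transaction_id, entry_kind) pairs.
    """
    for entry_txn, entry_kind in coordinator_log:
        if entry_txn == txn_id and entry_kind == "commit":
            return "COMMIT"
    # No commit entry found: presume abort. This is why the abort entry
    # never has to be forced to the log before the coordinator continues.
    return "ABORT"
```

A transaction that the log has never heard of and a transaction that was aborted are treated alike, so no ACKs or end entries are needed to safely forget an aborted transaction.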
Read-Only Subtransactions (1)
● Objective: save messages and/or (synchronous) log entries.
● Example: transaction with three subtransactions:
1. STA1: read balance of bank account x.
2. STA2: if x > 1000, withdraw 500 from account y.
3. STA3: otherwise withdraw 100 from account z.
● What happens after STA1 has voted commit
– if the transaction is successful,
– if the transaction fails?
Note that STA1 is read-only.
Read-Only Subtransactions (2)
● Read-only subtransactions need neither recovery nor logging, only the release of locks.
● The release may already happen in Phase 1 of the 2PC protocol, irrespective of the success of the global transaction; this saves the entire second commit phase.
● If m of the n-1 subtransactions are read-only, the number of messages is reduced by 2m to 4·(n-1)-2m, and the number of log writes is reduced to 2n-m.
● If the global transaction is read-only (m = n), only 2·(n-1) messages and no log writes are needed.
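The message formula can be checked by counting per agent. The sketch below assumes a centralized 2PC with one coordinator and n-1 agents, of which the first m are read-only; the function name `two_pc_messages` is hypothetical.

```python
def two_pc_messages(n, m):
    """Count 2PC messages for n nodes with m read-only agents (sketch)."""
    if n < 1 or not 0 <= m <= n - 1:
        raise ValueError("need n >= 1 and 0 <= m <= n - 1")
    messages = 0
    for agent in range(n - 1):
        read_only = agent < m
        messages += 2              # PREPARE and the agent's vote
        if not read_only:
            messages += 2          # COMMIT/ABORT and ACK (second phase)
    return messages
```

Every read-only agent drops out after its vote, so the count agrees with 4·(n-1)-2m; with all n-1 agents read-only, only the 2·(n-1) messages of the first phase remain.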
One-Phase Commit (1)
● So far: the work in subtransactions is separated from the commit protocol.
● Short distributed transactions with only one external database operation (e.g., money transfer to bank): commit processing is more expensive than the transaction itself.
[Figure: message flow between primary transaction and subtransaction: WORK and DONE (once for each operation), then PREPARE, READY, COMMIT, ACK.]
One-Phase Commit (2)
● Thus: combine the PREPARE message with the WORK message.
● The subtransaction enters the PREPARED state immediately after having executed the operation and before replying to the primary transaction.
[Figure: message flow between primary transaction and subtransaction: WORK & PREPARE, DONE & READY, COMMIT, ACK.]
One-Phase Commit (3)
● We can save the first commit phase: commit processing consists of only one phase to communicate the global result (hence the name).
● Two messages fewer per agent.
● Why does it only work with short transactions?
3PC
3PC – Introduction
● Weakness of all 2PC protocols:
– dependency on the coordinator
– a failure while agents are in the READY state may result in long blocking.
● Thus, some 'solutions' in practice refrain from transactional guarantees.
● Alternative: "non-blocking" commit protocols alleviate the situation but require more effort, e.g., three-phase commit (3PC).
● For practical purposes, 2PC generally is sufficient.
3PC – Variants
● Two variants of the 3PC protocol:
– Variant 1: tolerates site failures. Non-blocking, except for total failures (total failure: all nodes are down). Communication failures may result in inconsistencies.
– Variant 2: tolerates both communication and site failures, but is blocking.
● We deal with Variant 1 first, then with Variant 2.
Non-Blocking Characteristic (1)
● (Recap:) A process is blocked if an error/failure must be fixed before it can proceed.
● Non-blocking characteristic: if an operational process is uncertain, no other process (operational or failed) has decided commit.
[Figure, two panels. READY messages: non-blocking characteristic is not violated. COMMIT messages: non-blocking characteristic is typically violated.]
Non-Blocking Characteristic (3)
● 2PC does not have the non-blocking characteristic, since COMMIT messages do not arrive at the same time.
● Note:
– The non-blocking characteristic leaves aside processes that have just recovered and are finding out the state of the transaction.
– The non-blocking characteristic is not violated as long as all nodes are uncertain.
Non-Blocking Characteristic (4)
● Desired characteristic:
– uncertain processes may abort
– processes that have failed have not decided commit either.
→ No blocking.
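The non-blocking characteristic ("if an operational process is uncertain, no other process, operational or failed, has decided commit") can be written as a predicate over a global snapshot. The snapshot format, a list of `(operational, state)` pairs, and the function name are assumptions made for this sketch.

```python
def has_nonblocking_characteristic(processes):
    """Check the non-blocking characteristic on a global snapshot (sketch).

    processes: list of (operational: bool, state: str) pairs, with state in
    {"uncertain", "committable", "committed", "aborted"}.
    """
    operational_uncertain = any(
        operational and state == "uncertain" for operational, state in processes
    )
    someone_committed = any(state == "committed" for _, state in processes)
    # Violated exactly when both situations coexist.
    return not (operational_uncertain and someone_committed)
```

In 2PC the first agent to receive COMMIT may commit while another agent is still uncertain, so such a snapshot can occur; 3PC inserts the committable state to rule it out.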
3PC – Overview
● PRE-COMMIT messages – end uncertainty, but not yet commit.
● PRE-COMMIT tells node that it will eventually receive COMMIT message if coordinator does not fail.
● Coordinator fails → nodes decide without the coordinator and are not blocked.
3PC (1)
● First phase and abort case (not depicted here) as with 2PC.
● Additional phase only if all agents vote READY.
[Figure: 3PC message flow between TM1 (coordinator) and TM2 (agent): PREPARE, READY, PRECOMMIT, PC-ACK, COMMIT, ACK. Log entries shown: prepared, precommit, commit (with release of locks), end. 'Precommit' = 'Committable'.]
Note: precommit means that the agent has the intention to commit; this is different from commit!
3PC (2)
● The coordinator enters the intermediate state PRECOMMIT and writes the respective log entry.
● It informs all agents; they write a log entry as well and confirm (PC-ACK).
● If k of n-1 PC-ACK messages have arrived, coordinator decides commit and writes respective log entry.
● Last phase – same as with 2PC.
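From the agent's point of view, the steps above amount to a small state machine. The sketch below is an illustration only: the transition table `AGENT_TRANSITIONS` and the function `run_agent` are hypothetical names, and only an agent that votes READY is modelled.

```python
# (current state, incoming message) -> (next state, reply sent by the agent)
AGENT_TRANSITIONS = {
    ("initial", "PREPARE"): ("prepared", "READY"),
    ("prepared", "PRECOMMIT"): ("precommit", "PC-ACK"),
    ("prepared", "ABORT"): ("aborted", "ACK"),
    ("precommit", "COMMIT"): ("committed", "ACK"),
}


def run_agent(incoming):
    """Feed a sequence of coordinator messages to one agent (sketch)."""
    state, replies = "initial", []
    for message in incoming:
        key = (state, message)
        if key not in AGENT_TRANSITIONS:
            raise ValueError(f"unexpected {message} in state {state}")
        state, reply = AGENT_TRANSITIONS[key]
        replies.append(reply)
    return state, replies
```

The table has no transition from `prepared` directly to `committed`: an agent always passes through the precommit (committable) state first, which is the difference from 2PC.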
3PC (3)
● Precommit of coordinator:
– assertion that it will not abort the transaction in the future
– the TA may still abort if this coordinator fails.
● Failure of the coordinator node is still possible:
– timeout to recognize this,
– election of a new coordinator.
Why COMMIT and ACK Still Needed?
● Node knows that it will eventually receive COMMIT message if coordinator does not fail.
● COMMIT also guarantees that transaction was completed successfully
● ACK proves that agent is in a consistent state
Timeouts in 3PC
[Figure: 3PC message flow between TM1 (coordinator) and TM2 (agent), annotated with five numbered waiting points (1 to 5).]
Dealing with Timeouts
● When do we wait?
1. Participants wait for PREPARE.
2. Coordinator waits for the votes.
3. Participants wait for PRE-COMMIT/ABORT.
4. Coordinator waits for PC-ACKs.
5. Participants wait for COMMIT.
● Cases 1 and 2 are unproblematic: abort is always possible.
Coordinator Failures (1)
● Case 4 ("Coordinator waits for PC-ACKs").
● Ignore missing PC-ACK messages; go on after the timeout period.
● The agent must find out the state of the protocol after recovery. (Will be dealt with right away.)
● The non-blocking characteristic is not violated. (Recap, non-blocking characteristic: if an operational process is uncertain, no other process has decided commit.)
Coordinator Failures (2)
[Figure: 3PC message flow between TM1 (coordinator) and TM2 (agent), with waiting points 3 and 5 marked and the annotation 'uncertainty ends'.]
Coordinator Failures (3)
● Cases 3 and 5 ("Participants wait ...").
● Available nodes elect a new coordinator → election protocol.
● The new coordinator requests the states of all available nodes (message STATE-REQ) → termination rules specify how the protocol continues.
● Why can a participant not simply commit in Case 5 ("participants wait for COMMIT")? The non-blocking property would typically be violated.
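The five waiting points and the reactions discussed so far can be collected in one lookup table. This is merely a summary of the slides in code form; the names `TIMEOUT_ACTIONS` and `on_timeout` and the action strings are chosen for illustration.

```python
# case number -> (who waits, what is awaited, reaction on timeout)
TIMEOUT_ACTIONS = {
    1: ("participant", "PREPARE", "abort"),
    2: ("coordinator", "votes", "abort"),
    3: ("participant", "PRE-COMMIT/ABORT", "elect new coordinator"),
    4: ("coordinator", "PC-ACKs", "ignore missing PC-ACKs and go on"),
    5: ("participant", "COMMIT", "elect new coordinator"),
}


def on_timeout(case):
    """Return the reaction for waiting point 1..5 (sketch)."""
    _waiter, _awaited, action = TIMEOUT_ACTIONS[case]
    return action
```

Only cases 3 and 5 need the election and termination machinery; in the other cases the waiting process can react on its own.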
Coordinator Failures (4)
● Second node cannot commit – non-blocking property would be violated.
● Coordinator has failed – not guaranteed that protocol will terminate successfully.
[Figure: READY messages, PRECOMMIT messages, PC-ACK.]
Election Protocol (1)
● Select a new coordinator if the old one has failed.
● Prerequisite: linear ordering of processes ('<').
● UP_p: set of processes that p believes to be operational.
● New coordinator: first node according to that ordering.
● Message UR-ELECTED.
● Then messages STATE-REQ.
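A process's part in the election can be sketched in a few lines. The sketch assumes that process IDs are integers ordered by '<', that UP_p contains p itself, and that the function name `election_messages` is hypothetical.

```python
def election_messages(p, up_p):
    """Messages process p sends after detecting the coordinator failure (sketch).

    up_p: set of process IDs that p believes to be operational (includes p).
    Returns a list of (message_type, receiver) pairs.
    """
    if p not in up_p:
        raise ValueError("UP_p must contain p itself")
    new_coordinator = min(up_p)        # first node according to the ordering
    if p == new_coordinator:
        # p is the new coordinator: ask every other available node for its state.
        return [("STATE-REQ", q) for q in sorted(up_p) if q != p]
    # Otherwise p tells the smallest process that it has been elected.
    return [("UR-ELECTED", new_coordinator)]
```

With processes 1 to 5 and a failed coordinator 1, every process with UP_p = {2, 3, 4, 5} points at process 2, and process 2 in turn sends STATE-REQ to 3, 4 and 5.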
Election Protocol (2)
● Illustration:
[Figure: five nodes (1 to 5) with UR-ELECTED and STATE-REQ messages.]
Termination Rules (1)
● After the new coordinator has collected the states of the available nodes:
– TR1: a process has aborted → coordinator decides abort and sends out ABORT messages.
– TR2: a process has committed → coordinator decides commit and sends out COMMIT messages.
– TR3: all processes that have reported their state are uncertain (PREPARED state) → coordinator decides abort and sends out ABORT messages.
Termination Rules (2)
– TR4: a process is committable, none is committed → coordinator sends PRE-COMMIT messages to the processes in uncertain state and waits for the acks. Then it takes the commit decision and sends out COMMIT messages.
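Termination rules TR1 to TR4 can be sketched as one decision function of the new coordinator. The state names and the function `terminate` are assumptions for this illustration; the second return value lists the positions of the processes that still need a PRE-COMMIT before the commit decision (TR4).

```python
def terminate(states):
    """Apply TR1..TR4 to the states reported via STATE-REQ (sketch).

    states: non-empty list with entries from
    {"aborted", "committed", "uncertain", "committable"}.
    Returns (decision, indices_that_need_PRE_COMMIT_first).
    """
    if not states:
        raise ValueError("at least the new coordinator's own state is required")
    if "aborted" in states:                      # TR1
        return "ABORT", []
    if "committed" in states:                    # TR2
        return "COMMIT", []
    if all(s == "uncertain" for s in states):    # TR3
        return "ABORT", []
    # TR4: some process is committable, none is committed.
    uncertain = [i for i, s in enumerate(states) if s == "uncertain"]
    return "COMMIT", uncertain
```

TR3 is safe because of the non-blocking characteristic: if all reachable processes are uncertain, no failed process can have committed.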
Failures during Termination
● Point that is still open: how to deal with failures during termination.
– Ignore failures of agents.
– Failure of the coordinator: the algorithm is repeated, but with fewer nodes → termination. I.e., nodes do not need to remain available.
Summary of 3PC Features
● Conclusion:
– no blocking any more (site failures only).
● Why?
● Which situation would lead to blocking with 2PC, but not with 3PC?
2PC vs. 3PC
● 3PC does not have to wait, but elects new coordinator instead.
● Intermediate state with 3PC → convenient situation in which we know the votes of all other nodes without a decision having been taken.
[Figure: 2PC (READY messages, COMMIT decision) next to 3PC (READY messages, PRECOMMIT).]
Communication Failures
● What happens in case of communication failures?
● Partitions may arrive at different outcomes of the protocol!
[Figure: READY messages, PRECOMMIT.]
Election and Communication Failures (1)
● Illustration:
[Figure: five nodes (1 to 5) with UR-ELECTED and STATE-REQ messages.]
● What may happen if there are communication failures (disconnected partitions)?
Election and Communication Failures (2)
● Effects that may occur, e.g., because of message delays:
– A new coordinator q has been elected, but p' does not yet know that the old coordinator c is no longer available. p' receives a STATE-REQ message and concludes that c is no longer available.
– p' receives a STATE-REQ message from q, then one from q', with q' > q. What does it mean?
Recovery after Total Failures
● The protocol typically blocks in case of total failures (but this is a pathological case).
● Process p that has just recovered: in general, an autonomous decision is not feasible.
● Namely, the decision for commit or abort could have been taken after the failure of p.
● Only the process that has failed last can decide autonomously.
● The only possible approach: wait for the recovery of this process.
3PC + Communication Failures
● In the following: second variant of the 3PC protocol.
● Variant 1 assumes that there are no communication failures.
● Illustration: components A and B, separated from each other. All processes in A: uncertain; all processes in B: committable.
● Characteristics of Variant 2: tolerates communication and site failures, but is blocking.
3PC – Variant 2
● Variant 2: tolerates communication and site failures, but is blocking.
● A coordinator that decides must be able to communicate with a majority of processes.
– Idea: rule out wrong decisions of secondary coordinators that were elected by mistake due to network failures.
Differences to Variant 1
● First phase and abort case (not depicted here) as with 2PC.
● Additional phase only if all agents vote READY.
[Figure: the 3PC message flow of Variant 1 between TM1 (coordinator) and TM2 (agent): PREPARE, READY, PRECOMMIT, PC-ACK, COMMIT, ACK. Log entries shown: prepared, precommit, commit (with release of locks), end. 'Precommit' = 'Committable'.]
Abort in 3PC
● Second round of messages in ABORT case as well.
[Figure: abort case between TM1 (coordinator) and TM2 (agent): PREPARE, FAILED, PREABORT, PREABORT-ACK, ABORT, ACK. Log entries shown: failed, preabort, abort (with release of locks), end. 'Preabort' = 'Abortable'. Annotation: Intention.]
Intention in 3PC – Variant 2
● Intention:
– The coordinator forms its intention as soon as it knows the states of a majority of processes.
– Intention = decision that the coordinator will take as soon as a majority of processes has been informed (unless a process is already in the aborted or committed state).
– Intention commit: at least one process is committable. Intention abort: all processes are uncertain.
Intention (cont.)
● Intention must have been communicated to majority of processes before coordinator takes decision.
● Intention is communicated to processes (PRE-COMMIT/PRE-ABORT).
● Coordinator waits for PRE-COMMIT-ACK or PRE-ABORT-ACK of absolute majority.
● Coordinator sends COMMIT or ABORT to component.
Abortable and Committable (1)
[Figure: intention 'Commit'; two nodes are committable, three nodes are prepared.]
Abortable and Committable (2)
● After timeout, green component elects new coordinator.
● New coordinator collects states from nodes in component.
● New coordinator comes up with intention ‚Abort‘.
[Figure: two committable nodes; the new coordinator C sends PREABORT messages.]
Abortable and Committable (3)
● What happens when communication failure goes away?
● Participants in Abortable state will not react to PRE-COMMIT messages; the coordinator will not be able to form a majority.
[Figure: two committable nodes, three abortable nodes.]
What If There Were No PRE-ABORT Messages?
[Figure: two committable nodes and two PREPARED nodes; new coordinator C with intention 'Abort'; PRECOMMIT messages.]
● If the new coordinator fails after deciding abort, and if communication is restored afterwards, the old coordinator would enforce commit.
Majority Termination Rules (1)
● Majority Termination Rules specify how the coordinator decides, depending on the states received.
– Coordinator receives a committed state → commit decision, COMMIT messages.
– Aborted: analogous (abort decision, ABORT messages).
In what follows, no more committed or aborted states.
Majority Termination Rules (2)
1. Sequence:
a) Coordinator: one committable and a majority of non-abortable states (prepared or committable) ⇒ PRE-COMMIT messages to all sites that have not sent committable.
b) Site: PRE-COMMIT → new state committable → PRE-COMMIT-ACK
c) Coordinator: if sites in state committable form a majority: commit; otherwise blocking.
Majority Termination Rules (3)
2. Sequence:
a) Coordinator: majority of non-committable states (prepared or abortable) ⇒ PRE-ABORT messages to all sites that have not replied abortable.
b) Site: PRE-ABORT → new state abortable → PRE-ABORT-ACK
c) Coordinator: if sites in state abortable form a majority: abort; otherwise blocking.
Blocking: e.g., insufficient number of responses from other nodes.
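The majority termination rules can be condensed into a decision function. This sketch rests on several assumptions that are not in the slides: the name `majority_terminate`, a majority computed over the total number of participating processes, the commit sequence being checked before the abort sequence, and every reachable prepared site acknowledging the PRE-COMMIT or PRE-ABORT it receives.

```python
def majority_terminate(states, n_total):
    """Decision of a coordinator in 3PC Variant 2 (sketch).

    states: states reported by the sites the coordinator can reach, from
            {"committed", "aborted", "prepared", "committable", "abortable"}.
    n_total: number of all processes participating in the transaction.
    Returns "COMMIT", "ABORT" or "BLOCK".
    """
    majority = n_total // 2 + 1
    if "committed" in states:
        return "COMMIT"
    if "aborted" in states:
        return "ABORT"
    committable = states.count("committable")
    abortable = states.count("abortable")
    prepared = states.count("prepared")
    # Sequence 1: prepared sites become committable after PRE-COMMIT.
    if committable >= 1 and committable + prepared >= majority:
        return "COMMIT"
    # Sequence 2: prepared sites become abortable after PRE-ABORT.
    if abortable + prepared >= majority:
        return "ABORT"
    return "BLOCK"
```

For five processes split into a component with three prepared sites and one with two committable sites, the first component can abort while the second blocks; once the partition heals, the three abortable sites still form the majority, so the outcome stays abort.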
Election Protocol
● Problem: due to communication failures, multiple coordinators could be elected
● Each process p maintains a set UP_p of processes that it believes it can currently communicate with.
● Message UR-ELECTED is sent to the process in UP_p with the smallest ID.
● q ignores UR-ELECTED if it can communicate with processes with smaller ID.
● Process ignores messages STATE-REQ, PRE-COMMIT etc. from other coordinator.
Comparison
● Number of messages (including optimized treatment of read-only subtransactions)
● Transaction has executed at n nodes (n > 1), m nodes with read-only subtransactions (m < n)

Protocol                       General      Example 1 (n=2, m=0)   Example 2 (n=10, m=5)
1PC                            2(n-1)       2                      18
Linear 2PC                     2n-1         3                      19
Centralized/hierarchical 2PC   4(n-1)-2m    4                      26
3PC                            6(n-1)-4m    6                      34
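The general formulas of the comparison can be evaluated directly to reproduce both example columns. The function name `message_counts` is an assumption; the formulas are those of the table.

```python
def message_counts(n, m):
    """Number of messages per protocol for n nodes, m of them read-only."""
    if not (n > 1 and 0 <= m < n):
        raise ValueError("need n > 1 and 0 <= m < n")
    return {
        "1PC": 2 * (n - 1),
        "Linear 2PC": 2 * n - 1,
        "Centralized/hierarchical 2PC": 4 * (n - 1) - 2 * m,
        "3PC": 6 * (n - 1) - 4 * m,
    }


for n, m in [(2, 0), (10, 5)]:
    print(n, m, message_counts(n, m))
```

Only the centralized/hierarchical 2PC and 3PC rows depend on m; the 1PC and linear 2PC counts are unaffected by read-only subtransactions.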
Comparison (2)
● For 1PC and linear 2PC there are no savings for read-only subtransactions: with long read locks, these protocols also need two messages per agent to release the locks.
● 3PC: significantly more messages (6·(n-1)) and log writes (3n).