Concurrency Control in Distributed Database Systems

Concurrency Control in Distributed Database Systems. Intelligent Information Systems Seminar, 2nd Sep 2015. Based on: Philip A. Bernstein and Nathan Goodman. "Concurrency Control in Distributed Database Systems." ACM Computing Surveys (CSUR) 13.2 (1981): 185-221. Mahdi Jaberzadeh Ansari

Transcript of Concurrency Control in Distributed Database Systems

Page 1: Concurrency Control in Distributed Database Systems

Concurrency Control in Distributed Database Systems

Intelligent Information Systems Seminar

2nd Sep 2015

Based on: Philip A. Bernstein and Nathan Goodman. "Concurrency Control in Distributed Database Systems." ACM Computing Surveys (CSUR) 13.2 (1981): 185-221.

Mahdi Jaberzadeh Ansari

Page 2: Concurrency Control in Distributed Database Systems

Outline

1. Distributed Database Systems
   o Centralized DBS vs. Distributed DBS
   o DDBMS

2. Transaction-Processing Models
   o Transaction Managers and Data Managers in DDBMS
   o Two-Phase Commit Algorithm

3. Concurrency Control in DDBMS
   o Concurrency Control Anomalies
   o Concurrency Control in DDBMS vs. Mutual Exclusion in OSs

4. Distributed Synchronization Techniques
   o Two-Phase Locking Algorithm
   o Wound-Wait Locking Algorithm

IIS Seminar: Concurrency Control In Distributed DBS Mahdi Jaberzadeh Ansari University of Bonn

Page 3: Concurrency Control in Distributed Database Systems

Centralized Multi-User DBS

Distributed Database Systems

[Figure: a single DBMS managing a single DB.]

Page 4: Concurrency Control in Distributed Database Systems

Distributed Multi-User DBS

[Figure: three DBMS/DB pairs that together form the DDBS.]

A distributed database system (DDBS) is a collection of multiple, logically interrelated databases distributed over a network.

Page 5: Concurrency Control in Distributed Database Systems

Distributed Multi-User DBMS

[Figure: three DBMS/DB pairs managed together by the DDBMS.]

A DDBMS is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users.

Page 6: Concurrency Control in Distributed Database Systems

A Simple Architecture of a DDBMS

Transaction Managers (TMs) supervise interactions between users and the DDBMS, while Data Managers (DMs) manage the actual database.

[Figure: Distributed Transaction-Processing Model. The DDBMS spans three sites, DBMS 1 to DBMS 3. At each site, transactions T1 ... Tn talk to a TM (TM1 to TM3), and a DM (DM1 to DM3) manages the local database (DB 1 to DB 3).]

Four operations are available at each TM interface:

o READ(X)
o WRITE(X, Val)
o BEGIN
o END

Page 7: Concurrency Control in Distributed Database Systems

Centralized Transaction-Processing Model

A centralized DBMS consists of one TM and one DM executing at one site.

[Figure: transaction T issues operations to the TM, which keeps T's private workspace and exchanges dm-read(x) and dm-write(x) with the DM in front of the DB.]

o BEGIN: The TM initializes a private workspace for T.
o READ(X): If X is not yet in the private workspace, the TM sends dm-read(x) to the DM and stores the result there. In either case it returns the value of X to T.
o WRITE(X, new-value): If X exists in the private workspace, its value is overwritten; otherwise X is created there with the new value.
o END: The TM issues dm-write(x) for each updated item.
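The four TM operations of the centralized model can be sketched in a few lines. This is only a minimal illustration, not the author's implementation: the class names `DataManager` and `TransactionManager`, and the use of a Python dict as the database, are assumptions made for the example.

```python
class DataManager:
    """Hypothetical DM: a dict stands in for the stored database."""

    def __init__(self, initial):
        self.db = dict(initial)

    def dm_read(self, x):
        return self.db[x]

    def dm_write(self, x, value):
        self.db[x] = value


class TransactionManager:
    """Hypothetical TM that gives one transaction a private workspace."""

    def __init__(self, dm):
        self.dm = dm
        self.workspace = None
        self.updated = set()

    def begin(self):
        # BEGIN: initialize the private workspace.
        self.workspace = {}
        self.updated = set()

    def read(self, x):
        # READ(X): fetch from the DM only if X is not in the workspace yet.
        if x not in self.workspace:
            self.workspace[x] = self.dm.dm_read(x)
        return self.workspace[x]

    def write(self, x, value):
        # WRITE(X, new-value): update or create X in the workspace only.
        self.workspace[x] = value
        self.updated.add(x)

    def end(self):
        # END: issue dm-write(x) for every updated item.
        for x in sorted(self.updated):
            self.dm.dm_write(x, self.workspace[x])
        self.workspace = None


dm = DataManager({"X": 10})
tm = TransactionManager(dm)
tm.begin()
tm.write("X", tm.read("X") + 5)
value_before_end = dm.db["X"]   # the DB is untouched until END
tm.end()
value_after_end = dm.db["X"]
```

The point of the sketch is that WRITE never touches the database; only END does, which is what makes the commit step the natural place for atomicity.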

Transaction-Processing Models

Page 8: Concurrency Control in Distributed Database Systems

Two-Phase Commit

For transactions composed of several writes, a DBMS can avoid partial results by providing atomic commitment. Atomic commitment means that either all of a transaction's dm-writes are processed or none are.

[Figure: T has written X' and Y' into its private workspace. First phase: on END, the TM issues prewrite(X, Y), which copies the new values to secure storage. Second phase: the TM issues dm-write(X) and dm-write(Y). If the system fails during the second phase, after dm-write(X) but before dm-write(Y), the DB holds incorrect data (X' together with the old Y), and recovery completes dm-write(Y) from secure storage.]

Secure storage is a part of permanent memory that is used to temporarily save partial data or some rows of the DB.
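The two phases and the recovery step can be sketched as follows. This is a simplified illustration under stated assumptions: the class `CrashingDM` and its `recover` method are hypothetical names, dicts stand in for the DB and for secure storage, and the "failure" is simulated simply by not issuing the second dm-write.

```python
class CrashingDM:
    """Hypothetical DM with a database and secure storage (both dicts)."""

    def __init__(self, initial):
        self.db = dict(initial)
        self.secure = {}

    def prewrite(self, values):
        # First phase: copy every new value to secure storage.
        self.secure = dict(values)

    def dm_write(self, x):
        # Second phase: install one value from secure storage in the DB.
        self.db[x] = self.secure[x]

    def recover(self):
        # After a failure, redo all dm-writes recorded in secure storage.
        for x in self.secure:
            self.dm_write(x)
        self.secure = {}


dm = CrashingDM({"X": 1, "Y": 2})
dm.prewrite({"X": 100, "Y": 200})   # first phase completes
dm.dm_write("X")                    # second phase starts...
state_after_crash = dict(dm.db)     # ...and we pretend the system fails here
dm.recover()
state_after_recovery = dict(dm.db)
```

Redoing dm-write(X) during recovery is harmless because installing the same value twice leaves the DB unchanged.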

Page 9: Concurrency Control in Distributed Database Systems

Distributed Transaction-Processing Model (1)

Two-Phase Commit

[Figure: T runs BEGIN, READ(X), WRITE(X), END against its TM, which keeps T's private workspace. The TM issues dm-read(X), then prewrite(X) and dm-write(X), to two DMs, each with its own DB and secure storage. A second TM serves another transaction T'.]

The prewrite and dm-write steps have to be repeated for each copy of X at every site.

Now assume the commit fails at the second DM.

Page 10: Concurrency Control in Distributed Database Systems

Distributed Transaction-Processing Model (2)

Two-Phase Commit

Slightly modify the prewrites to include all involved DMs. If the TM fails before issuing every dm-write, a DM that did not receive its dm-write can then consult the other DMs involved and, if any of them received a dm-write, act as if it had received one too.

[Figure: on END, TM1 issues prewrite(X, DM1), and the secure storage at DM2 records the entry (X, DM1) next to the new value. DM1 and DM2 then each process dm-write(X).]
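A minimal sketch of the modified prewrite follows. It assumes that a DM whose dm-write never arrived resolves the situation by asking the other DMs named in its prewrite; the class `SiteDM`, the `resolve` method, and the flag `got_dm_write` are hypothetical names introduced only for this illustration.

```python
class SiteDM:
    """Hypothetical DM holding one copy of X plus secure storage."""

    def __init__(self, name, value):
        self.name = name
        self.db = {"X": value}
        self.secure = None        # (new value, names of the other DMs involved)
        self.got_dm_write = False

    def prewrite(self, value, other_dms):
        # Modified prewrite: remember the value and the other DMs involved.
        self.secure = (value, list(other_dms))

    def dm_write(self):
        self.db["X"] = self.secure[0]
        self.got_dm_write = True

    def resolve(self, sites):
        # Called when the dm-write never arrived: consult the other DMs and
        # act as if the dm-write was received if any of them received one.
        if self.got_dm_write or self.secure is None:
            return
        if any(sites[name].got_dm_write for name in self.secure[1]):
            self.dm_write()


sites = {"DM1": SiteDM("DM1", 1), "DM2": SiteDM("DM2", 1)}
sites["DM1"].prewrite(9, ["DM2"])
sites["DM2"].prewrite(9, ["DM1"])
sites["DM1"].dm_write()            # the TM fails before reaching DM2
sites["DM2"].resolve(sites)
copies = {name: dm.db["X"] for name, dm in sites.items()}
```

Because every prewrite names the other participants, no copy of X is left behind with the old value once at least one dm-write has been processed.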

Page 11: Concurrency Control in Distributed Database Systems

Concurrency Control Definition

• Concurrency control is the activity of coordinating concurrent accesses for updates to a database in a multi-user DBMS or DDBMS.

• The goal is to prevent database updates performed by one user from interfering with database retrievals and updates performed by another one.

• An algorithm to perform such control in a multi-user DDBMS is called a synchronization technique.

[Image source: http://www.cliparthut.com/clip-arts/127/policeman-clip-art-127884.gif]

Concurrency Control in DDBMS

Page 12: Concurrency Control in Distributed Database Systems

Anomalies (1): Lost Update Anomaly

In this example, two users try to update the same row of one table simultaneously.

Tom has an account with two ATM cards. Tom, at ATM1, wants to withdraw $1,000 in Germany to buy a suit. At the very same time, Tom's wife, who has the second ATM card, wants to withdraw $2,000 in Australia to buy a pair of high heels.

Time | Execution of T1            | Execution of T2            | Balance in DB
1    | Read balance ($5,000).     | Read balance ($5,000).     | $5,000
2    | Subtract $1,000 ($4,000).  | Subtract $2,000 ($3,000).  | $5,000
3    | Write result back to DB.   |                            | $4,000
4    |                            | Write result back to DB.   | $3,000

The final balance is $3,000, but it must be $2,000. The bank lost $1,000.
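The lost update can be replayed step by step. The sketch below is only an illustration: the function names are hypothetical, and a dict stands in for the account row. It runs the interleaved schedule from the example and compares it with a serial execution.

```python
def run_lost_update(balance):
    db = {"balance": balance}
    t1_local = db["balance"]      # T1 reads the balance
    t2_local = db["balance"]      # T2 reads the same balance
    t1_local -= 1000              # T1 subtracts $1,000
    t2_local -= 2000              # T2 subtracts $2,000
    db["balance"] = t1_local      # T1 writes its result back
    db["balance"] = t2_local      # T2 overwrites T1's update
    return db["balance"]


def run_serially(balance):
    db = {"balance": balance}
    db["balance"] -= 1000         # T1 runs alone
    db["balance"] -= 2000         # then T2 runs alone
    return db["balance"]


interleaved = run_lost_update(5000)
serial = run_serially(5000)
print(interleaved, serial)        # 3000 2000
```

The $1,000 difference between the two results is exactly T1's withdrawal, which T2's write erased.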

Page 13: Concurrency Control in Distributed Database Systems

Anomalies (2): Inconsistent Retrieval Anomaly

Tom, at ATM1, wants to transfer $1,000 from his Saving account to his Checking account. A little later, Tom's wife, who has the second ATM card, wants to print the total amount of money that they have.

Time | Execution of T1                 | Execution of T2                     | DB state
1    | Read Saving balance ($2,000).   |                                     | S $2,000, C $500
2    | Subtract $1,000; write result.  |                                     | S $1,000, C $500 (temporary result)
3    |                                 | Read Saving and Checking balances.  | S $1,000, C $500
4    |                                 | Print sum: $1,500.                  | S $1,000, C $500
5    | Read Checking balance ($500).   |                                     | S $1,000, C $500
6    | Add $1,000; write result.       |                                     | S $1,000, C $1,500

T2 prints $1,500, but it must be $2,500.
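The same schedule can be reproduced in code. In this sketch, T1 is written as a generator so that T2 can run between T1's two writes; the function names and the dict-based DB are assumptions made for the example.

```python
def transfer_steps(db, amount):
    """T1 as a generator: each yield is a point where T2 may run."""
    saving = db["S"]
    db["S"] = saving - amount     # temporary result is visible in the DB
    yield
    checking = db["C"]
    db["C"] = checking + amount


def print_total(db):
    """T2: read both balances and return their sum."""
    return db["S"] + db["C"]


db = {"S": 2000, "C": 500}
t1 = transfer_steps(db, 1000)
next(t1)                          # T1 debits Saving
seen_by_t2 = print_total(db)      # T2 runs between T1's two writes
next(t1, None)                    # T1 credits Checking
final_total = print_total(db)
```

No update is lost here: the final DB state is correct, and only the value retrieved by T2 is wrong.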

Page 14: Concurrency Control in Distributed Database Systems

Comparison with Mutual Exclusion in OSs (1)

• Concurrency control in DDBS and mutual exclusion in operating systems are similar in that both are concerned with controlling concurrent access to shared resources.

• However, control schemes that work for one do not necessarily work for the other.

[Image source: http://csunplugged.org/routing-and-deadlock/]

Page 15: Concurrency Control in Distributed Database Systems

Comparison with Mutual Exclusion in OSs (2)

Suppose processes P1 and P2 require access to resources R1 and R2 at different points in their execution.

[Figure: timeline of resource use. P1 uses R1, then P2 uses R1, then P2 uses R2, then P1 uses R2.]

This execution is acceptable in an operating system. In a database, however, it is not always acceptable.

Page 16: Concurrency Control in Distributed Database Systems

Comparison with Mutual Exclusion in OSs (3)

Assume P1 wants to transfer $1,000 from R1 to R2, while P2 wants to print the balances, and the sum of the balances must be correct.

Time | Execution of P1                     | Execution of P2    | Resources
1    | Read R1 ($2,000); subtract $1,000.  |                    | R1 $1,000, R2 $500
2    |                                     | Read R1 ($1,000).  | R1 $1,000, R2 $500
3    |                                     | Read R2 ($500).    | R1 $1,000, R2 $500
4    | Read R2 ($500); add $1,000.         |                    | R1 $1,000, R2 $1,500

P2 reads the wrong balance for R2.

Page 17: Concurrency Control in Distributed Database Systems

Principal Issues in Concurrency Control

There are two correctness criteria for each concurrency control algorithm:

1) Each transaction submitted to the system is expected to be executed eventually.

2) The computation performed by each transaction is expected to be the same whether it executes alone in a dedicated system or in parallel with other transactions in a DDBS.

[Figure: transactions T1, T2, and T3 queued at DM1 and at DM2, each DM in front of its own DB.]

Page 18: Concurrency Control in Distributed Database Systems

Serializability

Distributed Synchronization Techniques

[Figure: each panel starts from $5000. Panels labeled "Concurrency" interleave the partial transactions T11, T12, T21, T22, T31, T32 and end with $2000 and $3000. Panels labeled "Serialized orders" run T1, T2, and T3 one after another in different orders and end with $2500, $2000, $1500, $6700, and $4500.]

• An ordering of concurrent transactions is serializable if its result coincides with the result of one of the possible serialized orders.

• The art of finding such an order among the partial transactions is called serializability.
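The definition above suggests a brute-force check: compute the result of every serial order and see whether the concurrent result is among them. The sketch below does this for one starting value only, which is a simplification, and the three transactions `t1`, `t2`, `t3` are hypothetical; they are not the transactions from the figure.

```python
from itertools import permutations


def serial_results(transactions, initial):
    """Return the set of final values produced by every serial order."""
    results = set()
    for order in permutations(transactions):
        value = initial
        for txn in order:
            value = txn(value)
        results.add(value)
    return results


def is_serializable(concurrent_result, transactions, initial):
    """True if the concurrent result matches some serial order's result."""
    return concurrent_result in serial_results(transactions, initial)


# Hypothetical transactions on a single balance.
t1 = lambda b: b - 1000
t2 = lambda b: b - 2000
t3 = lambda b: b * 2

outcomes = serial_results([t1, t2, t3], 5000)
ok = is_serializable(6000, [t1, t2, t3], 5000)
bad = is_serializable(3000, [t1, t2, t3], 5000)
```

With n transactions there are n! serial orders, so practical synchronization techniques do not enumerate them; they restrict the allowed interleavings instead, as the locking protocols that follow do.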

Page 19: Concurrency Control in Distributed Database Systems

Two-Phase Locking (1)

The Two-Phase Locking (2PL) protocol forces each transaction to make its lock and unlock requests in two phases:

o Growing Phase: A transaction may obtain locks but may not release any lock.

o Shrinking Phase: A transaction may release locks but may not obtain any new lock.

The 2PL protocol guarantees serializability, but it does not prevent deadlocks. In this algorithm, local and global deadlock detectors look for deadlocks periodically and resolve them by restarting transactions from their initial states.
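The two-phase rule itself is easy to enforce per transaction. The sketch below shows only that rule: the class names are hypothetical, and lock conflicts between transactions and deadlock detection are deliberately left out.

```python
class TwoPhaseViolation(Exception):
    """Raised when a lock is requested after the first unlock."""


class TwoPhaseTransaction:
    """Hypothetical transaction wrapper that enforces the two phases."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False

    def lock(self, item):
        # Growing phase only: no new lock once any lock has been released.
        if self.shrinking:
            raise TwoPhaseViolation(f"{self.name} cannot lock {item}")
        self.held.add(item)

    def unlock(self, item):
        # The first unlock switches the transaction to the shrinking phase.
        self.shrinking = True
        self.held.discard(item)


t = TwoPhaseTransaction("T1")
t.lock("X")
t.lock("Y")
t.unlock("X")
try:
    t.lock("Z")
    violated = False
except TwoPhaseViolation:
    violated = True
```

Here the request for Z is rejected because T1 already released X and therefore entered its shrinking phase.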

Page 20: Concurrency Control in Distributed Database Systems

Two-Phase Locking (2)

The main approach in 2PL is "Read Any, Write All".

[Figure: T1 and T2 with their TMs and private workspaces, and DM1 and DM2, each holding a copy of X in its DB plus secure storage. The animation steps are listed below.]

o BEGIN: Same as before, just form a private workspace for the transaction.
o READ(X): If the local copy of X is free, put a read lock on it. Keep the lock until END, or convert it to a write lock (WL).
o WRITE(X): Just store the new value of X in the workspace.
o END: To update an item, write locks are required on all copies. Once all locks are held, start to update all copies with prewrite(X, DM1) and dm-write(X).
o In the middle of the update, another transaction, T2, arrives and issues BEGIN and READ(X). It must wait, because the local copy of X is locked.
o T1 continues updating X on DM2 without any knowledge of T2.
o Finally, T1 releases all its locks and wakes the next queued transaction.
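"Read Any, Write All" can be illustrated with one lock record per stored copy of X. This is a sketch under assumptions: the class `CopyLock` is hypothetical, a refused request returns False instead of queueing the transaction, and the two dict entries stand in for the copies at DM1 and DM2.

```python
class CopyLock:
    """Hypothetical lock state for one stored copy of X at one DM."""

    def __init__(self):
        self.readers = set()
        self.writer = None

    def read_lock(self, txn):
        # A read lock is refused while another transaction holds the write lock.
        if self.writer not in (None, txn):
            return False
        self.readers.add(txn)
        return True

    def write_lock(self, txn):
        # A write lock is refused while anyone else reads or writes this copy.
        if self.writer not in (None, txn) or self.readers - {txn}:
            return False
        self.writer = txn
        return True

    def release(self, txn):
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None


copies = {"DM1": CopyLock(), "DM2": CopyLock()}

# Read any: T1 read-locks only its local copy.
t1_read = copies["DM1"].read_lock("T1")
# Write all: T1 needs a write lock on every copy before updating.
t1_write = all(lock.write_lock("T1") for lock in copies.values())
# T2 tries to read its local copy while T1 is updating, so it must wait.
t2_read_during = copies["DM2"].read_lock("T2")
for lock in copies.values():
    lock.release("T1")
t2_read_after = copies["DM2"].read_lock("T2")
```

Reads stay cheap because they touch a single copy, while a writer pays the cost of locking every copy.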

Page 21: Concurrency Control in Distributed Database Systems

Wound-Wait Locking

The Wound-Wait locking algorithm follows the same approach as the 2PL protocol, except that it has no deadlock detector and uses timestamps to prevent deadlock.

When T1 requests a lock that T2 holds:

Timestamps     | T1 is    | T1 is allowed to
t(T1) > t(T2)  | Younger  | Wait for T2 (the older one) until it finishes.
t(T1) < t(T2)  | Older    | Wound T2 (the younger one): T2 is aborted and rolled back so that T1 can proceed.

[Figure: timeline of T1 (via TM1) and T2 (via TM2) accessing X in the DB at DM1.]
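The decision rule fits in one small function. The sketch below assumes unique timestamps where a smaller value means an older transaction; the function name `wound_wait` and the returned strings are hypothetical.

```python
def wound_wait(requester_ts, holder_ts):
    """Decide what happens when a requester asks for a lock the holder owns.

    Smaller timestamp means older transaction.
    """
    if requester_ts < holder_ts:
        # Older requester wounds the younger holder, which is rolled back.
        return "wound holder"
    # Younger requester waits for the older holder to finish.
    return "wait"


older_requests = wound_wait(requester_ts=1, holder_ts=2)
younger_requests = wound_wait(requester_ts=2, holder_ts=1)
```

A transaction only ever waits for an older one, so a cycle of waiting transactions cannot form, which is why no deadlock detector is needed.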

Page 22: Concurrency Control in Distributed Database Systems

Conclusion

1. Distributed Database Systems
   o Centralized and Distributed Multi-User DBS.
   o DDB, DDBS, DDBMS.

2. Transaction-Processing Models
   o Centralized and Distributed Transaction-Processing Models
   o Transaction Managers (TM) and Data Managers (DM) in DDBMS
   o 4 Main Operations in a TM Transaction in a DBMS and in a DDBMS: BEGIN, READ, WRITE, END
   o Modified Two-Phase Commit Algorithm in DDBMS

3. Concurrency Control in DDBMS
   o Concurrency Control Anomalies: Lost Update Anomaly, Inconsistent Retrieval Anomaly
   o Concurrency Control in DDBMS versus Mutual Exclusion in OSs

4. Distributed Synchronization Techniques
   o Two-Phase Locking Algorithm
   o Wound-Wait Locking Algorithm

Page 23: Concurrency Control in Distributed Database Systems

The End.
