CSL 771: Database Implementation Transaction Processing

CSL 771: Database Implementation Transaction Processing. Maya Ramanath. All material (including figures) from: Concurrency Control and Recovery in Database Systems, Phil Bernstein, Vassos Hadzilacos and Nathan Goodman (http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx)


Page 1: CSL  771: Database Implementation Transaction Processing

CSL 771: Database Implementation
Transaction Processing

Maya Ramanath

All material (including figures) from:
Concurrency Control and Recovery in Database Systems
Phil Bernstein, Vassos Hadzilacos and Nathan Goodman
(http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx)

Page 2: CSL  771: Database Implementation Transaction Processing

Transactions
• Interaction with the DBMS through SQL

update Airlines set price = price - price*0.1, status = 'cheap' where price < 5000

A transaction is a unit of interaction

Page 3: CSL  771: Database Implementation Transaction Processing

ACID Properties
• Atomicity
• Consistency
• Isolation
• Durability

Database system must ensure ACID properties

Page 4: CSL  771: Database Implementation Transaction Processing

Atomicity and Consistency
• Single transaction
– Execution of a transaction: “all-or-nothing”
   Either a transaction completes in its entirety
   Or it “does not even start”
– As if the transaction never existed
– No partial effect must be visible

2 outcomes: A transaction COMMITs or ABORTs

Page 5: CSL  771: Database Implementation Transaction Processing

Consistency and Isolation
• Multiple transactions
– Concurrent execution can cause an inconsistent database state
– Each transaction is executed as if isolated from the others

Page 6: CSL  771: Database Implementation Transaction Processing

Durability
• If a transaction commits, its effects are permanent
• But durability has a bigger scope
– Catastrophic failures (floods, fires, earthquakes)

Page 7: CSL  771: Database Implementation Transaction Processing

What we will study…
• Concurrency Control
– Ensuring atomicity, consistency and isolation when multiple transactions are executed concurrently
• Recovery
– Ensuring durability and consistency in case of software/hardware failures

Page 8: CSL  771: Database Implementation Transaction Processing

Terminology
• Data item
– A tuple, table, block
• Read (x)
• Write (x, 5)
• Start (T)
• Commit (T)
• Abort (T)
• Active Transaction
– A transaction which has neither committed nor aborted

Page 9: CSL  771: Database Implementation Transaction Processing

High level model

[Figure: Transaction 1, Transaction 2, …, Transaction n → Transaction Manager → Scheduler → Recovery Manager → Cache Manager → Disk]

Page 10: CSL  771: Database Implementation Transaction Processing

Recoverability (1/2)
• Transaction T aborts
– T wrote some data items
– T’ read items that T wrote
• DBMS has to…
– Undo the effects of T
– Undo the effects of T’
– But T’ has already committed

T                 T’
Read (x)
Write (x, k)
Read (y)
                  Read (x)
                  Write (y, k’)
                  Commit
Abort

Page 11: CSL  771: Database Implementation Transaction Processing

Recoverability (2/2)
• Let T1,…,Tn be a set of transactions
• Ti reads a value written by Tk, k < i
• An execution of transactions is recoverable if Ti commits after all Tk commit

Not recoverable (T2 commits before T1):

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
                  Commit

Recoverable (T2 commits after T1):

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit
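The recoverability condition can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes a history is a list of hypothetical `(op, txn, item)` tuples with `op` in `r`, `w`, `c`, `a`, and it simplifies "reads from" to "the most recent writer of the item", ignoring the effect of aborts on which value is visible.

```python
def is_recoverable(history):
    """Return True if every transaction commits only after all the
    transactions it read from have committed.

    history: list of (op, txn, item); item is None for commit/abort.
    Simplification (assumption): the value read is always the one
    written by the most recent writer, even if that writer aborted.
    """
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # txn -> set of transactions it read from
    committed = set()
    for op, txn, item in history:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # Committing before a transaction we read from is the violation.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


# The two schedules from the slide.
not_recoverable = [("w", 1, "x"), ("r", 2, "x"), ("w", 2, "y"), ("c", 2, None)]
recoverable = [("w", 1, "x"), ("r", 2, "x"), ("w", 2, "y"),
               ("c", 1, None), ("c", 2, None)]
print(is_recoverable(not_recoverable), is_recoverable(recoverable))
```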

Page 12: CSL  771: Database Implementation Transaction Processing

Cascading Aborts (1/2)
• Because T was aborted, T1,…, Tk also have to be aborted

T: Read (x), Write (x, k), Read (y), Abort
T’: Read (x), Write (y, k’)
T’’: Read (y)

Page 13: CSL  771: Database Implementation Transaction Processing

Cascading Aborts (2/2)
• Recoverable executions do not prevent cascading aborts
• How can we prevent them then?

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit

T1                T2
Write (x,2)
Commit
                  Read (x)
                  Write (y,2)
                  Commit
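The second schedule avoids cascading aborts because T2 reads x only after T1 has committed. A minimal sketch of that check, under the same assumptions as the slide notation (the `(op, txn, item)` tuple encoding is hypothetical, and "reads from" is simplified to the most recent writer):

```python
def avoids_cascading_aborts(history):
    """Return True if no transaction reads a value whose writer has not
    yet committed at the time of the read.

    history: list of (op, txn, item); item is None for commit/abort.
    """
    last_writer = {}   # item -> most recent writer
    committed = set()
    for op, txn, item in history:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False   # read of uncommitted data
        elif op == "c":
            committed.add(txn)
    return True


with_cascade = [("w", 1, "x"), ("r", 2, "x"), ("w", 2, "y"),
                ("c", 1, None), ("c", 2, None)]
without_cascade = [("w", 1, "x"), ("c", 1, None), ("r", 2, "x"),
                   ("w", 2, "y"), ("c", 2, None)]
print(avoids_cascading_aborts(with_cascade),
      avoids_cascading_aborts(without_cascade))
```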

Page 14: CSL  771: Database Implementation Transaction Processing

What we learnt so far…

Not recoverable:

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
                  Commit

Recoverable with cascading aborts:

T1                T2
Write (x,2)
                  Read (x)
                  Write (y,2)
Commit
                  Commit

Recoverable without cascading aborts:

T1                T2
Write (x,2)
Commit
                  Read (x)
                  Write (y,2)
                  Commit

Reading a value, committing a transaction

Page 15: CSL  771: Database Implementation Transaction Processing

Strict Schedule (1/2)
• “Undo”-ing the effects of a transaction
– Restore the before image of the data item

T1                T2
Write (x,1)
Write (y,3)
                  Write (y,1)
Commit
                  Read (x)
                  Abort

Equivalent to:

T1                T2
Write (x,1)
Write (y,3)
Commit

Final value of y: 3

Page 16: CSL  771: Database Implementation Transaction Processing

Strict Schedule (2/2)

Initial value of x: 1

T1                T2
Write (x,2)
                  Write (x,3)
Abort

Should x be restored to 1 or 3?

T1                T2
Write (x,2)
                  Write (x,3)
Abort
                  Abort

T1 restores x to 3?
T2 restores x to 2?

Do not read or write a value which has been written by an active transaction until that transaction has committed or aborted

T1                T2
Write (x,2)
Abort
                  Write (x,3)
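The strictness rule stated above can be sketched as a check over a history. This is an illustrative sketch only; the `(op, txn, item)` encoding is an assumption, not notation from the slides.

```python
def is_strict(history):
    """Return True if no transaction reads or writes an item that was
    written by another transaction that is still active.

    history: list of (op, txn, item); item is None for commit/abort.
    """
    dirty = {}   # item -> active transaction that last wrote it
    for op, txn, item in history:
        if op in ("r", "w"):
            holder = dirty.get(item)
            if holder is not None and holder != txn:
                return False
            if op == "w":
                dirty[item] = txn
        else:
            # Commit or abort: the transaction's writes are no longer "active".
            dirty = {x: t for x, t in dirty.items() if t != txn}
    return True


overwrite_active = [("w", 1, "x"), ("w", 2, "x"), ("a", 1, None)]
wait_for_abort = [("w", 1, "x"), ("a", 1, None), ("w", 2, "x")]
print(is_strict(overwrite_active), is_strict(wait_for_abort))
```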

Page 17: CSL  771: Database Implementation Transaction Processing

The Lost Update Problem

T1                T2
Read (x)
                  Read (x)
                  Write (x, 200,000)
                  Commit
Write (x, 200)
Commit

Assume x is your account balance
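To see why T2's update is lost, the interleaving can be replayed on a plain variable. The slide does not give the starting balance or the deposit amounts; the sketch below assumes an initial balance of 100, with T1 depositing 100 and T2 depositing 199,900, which reproduces the written values 200 and 200,000.

```python
def interleaved(initial=100):
    """Replay the slide's interleaving; amounts are assumed, see lead-in."""
    x = initial
    t1_local = x               # T1: Read(x)
    t2_local = x               # T2: Read(x)
    x = t2_local + 199_900     # T2: Write(x, 200,000); Commit
    x = t1_local + 100         # T1: Write(x, 200); Commit (overwrites T2)
    return x


def serial(initial=100):
    """Run T1 to completion, then T2."""
    x = initial
    x = x + 100                # T1
    x = x + 199_900            # T2
    return x


print(interleaved(), serial())
```

Under these assumptions the interleaved run ends with a balance of 200, while either serial order ends with 200,100.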

Page 18: CSL  771: Database Implementation Transaction Processing

Serializable Schedules
• Serial schedule
– Simply execute transactions one after the other
• A serializable schedule is one which is equivalent to some serial schedule

Page 19: CSL  771: Database Implementation Transaction Processing

SERIALIZABILITY THEORY

Page 20: CSL  771: Database Implementation Transaction Processing

Serializable Schedules

T1: op11, op12, op13
T2: op21, op22, op23, op24

• Serial schedule
– Simply execute transactions one after the other
   op11, op12, op13, op21, op22, op23, op24
   op21, op22, op23, op24, op11, op12, op13
• Serializable schedule
– Interleave operations
– Ensure end result is equivalent to some serial schedule

Page 21: CSL  771: Database Implementation Transaction Processing

Notation
r1[x] = Transaction 1, Read (x)
w1[x] = Transaction 1, Write (x)
c1 = Transaction 1, Commit
a1 = Transaction 1, Abort

Example: r1[x], r1[y], w2[x], r2[y], c1, c2

Page 22: CSL  771: Database Implementation Transaction Processing

Histories (1/3)
• Operations of transaction T can be represented by a partial order.

[Figure: partial order over r1[x], r1[y], w1[z], c1]

Page 23: CSL  771: Database Implementation Transaction Processing

Histories (2/3)
• Conflicting operations
– Two operations on the same data item conflict if at least one of them is a write
– An order has to be specified for conflicting operations

Page 24: CSL  771: Database Implementation Transaction Processing

Histories (3/3)
• Complete History

Page 25: CSL  771: Database Implementation Transaction Processing

Serializable Histories
• The goal: Ensure that the interleaving of operations guarantees a serializable history.
• The method
– When are two histories equivalent?
– When is a history serial?

Page 26: CSL  771: Database Implementation Transaction Processing

Equivalence of Histories (1/2)

H ≅ H’ if
1. they are defined over the same set of transactions and they have the same operations
2. they order conflicting operations the same way
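The two conditions translate directly into a check. The sketch below is illustrative: it assumes totally ordered histories encoded as lists of hypothetical `(op, txn, item)` tuples, and it assumes each transaction performs a given operation on a given item at most once.

```python
from collections import Counter


def conflicts(a, b):
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    (op1, t1, x1), (op2, t2, x2) = a, b
    return t1 != t2 and x1 == x2 and "w" in (op1, op2)


def conflict_order(history):
    """Set of ordered pairs (a, b) of conflicting operations, a before b."""
    ops = [o for o in history if o[0] in ("r", "w")]
    return {(a, b) for i, a in enumerate(ops) for b in ops[i + 1:]
            if conflicts(a, b)}


def equivalent(h1, h2):
    """Condition 1: same operations. Condition 2: same conflict order."""
    return Counter(h1) == Counter(h2) and conflict_order(h1) == conflict_order(h2)


h1 = [("r", 1, "x"), ("w", 2, "x"), ("w", 1, "y")]
h2 = [("r", 1, "x"), ("w", 1, "y"), ("w", 2, "x")]   # swaps non-conflicting ops
h3 = [("w", 2, "x"), ("r", 1, "x"), ("w", 1, "y")]   # reverses a conflict
print(equivalent(h1, h2), equivalent(h1, h3))
```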

Page 27: CSL  771: Database Implementation Transaction Processing

Equivalence of Histories (2/2)

Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman


Page 28: CSL  771: Database Implementation Transaction Processing

Serial History
• A complete history is serial if for every pair of transactions Ti and Tk,
– all operations of Ti occur before all operations of Tk OR
– all operations of Tk occur before all operations of Ti
• A history is serializable if its committed projection is equivalent to a serial history.

Page 29: CSL  771: Database Implementation Transaction Processing

Serialization Graph

[Figure: serialization graph over T1, T3 and T2]

Page 30: CSL  771: Database Implementation Transaction Processing

Serializability Theorem
A history H is serializable if and only if its serialization graph SG(H) is acyclic

On your own
How do recoverability, strict schedules, cascading aborts fit into the big picture?
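The theorem suggests a simple test: build SG(H) over the committed transactions and look for a cycle. The following sketch assumes a totally ordered history given as hypothetical `(op, txn, item)` tuples; it is one possible way to implement the test, not code from the course.

```python
from collections import defaultdict


def serialization_graph(history):
    """Nodes are committed transactions; there is an edge Ti -> Tk when an
    operation of Ti precedes and conflicts with an operation of Tk."""
    committed = {t for op, t, _ in history if op == "c"}
    ops = [o for o in history if o[0] in ("r", "w") and o[1] in committed]
    edges = defaultdict(set)
    for i, (op1, t1, x1) in enumerate(ops):
        for op2, t2, x2 in ops[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges[t1].add(t2)
    return committed, edges


def is_serializable(history):
    """Depth-first search for a cycle in SG(H)."""
    nodes, edges = serialization_graph(history)
    state = {}   # node -> 1 while on the DFS stack, 2 when finished

    def has_cycle(n):
        state[n] = 1
        for m in edges[n]:
            if state.get(m) == 1 or (m not in state and has_cycle(m)):
                return True
        state[n] = 2
        return False

    return not any(n not in state and has_cycle(n) for n in nodes)


# r1[x] r1[y] w2[x] r2[y] c1 c2 (the example from the Notation slide)
ok = [("r", 1, "x"), ("r", 1, "y"), ("w", 2, "x"), ("r", 2, "y"),
      ("c", 1, None), ("c", 2, None)]
# r1[x] w2[x] w2[y] c2 w1[y] c1 (edges T1 -> T2 and T2 -> T1)
bad = [("r", 1, "x"), ("w", 2, "x"), ("w", 2, "y"), ("c", 2, None),
       ("w", 1, "y"), ("c", 1, None)]
print(is_serializable(ok), is_serializable(bad))
```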

Page 31: CSL  771: Database Implementation Transaction Processing

LOCKING

Page 32: CSL  771: Database Implementation Transaction Processing

High level model

[Figure: Transaction 1, Transaction 2, …, Transaction n → Transaction Manager → Scheduler → Recovery Manager → Cache Manager → Disk]

Page 33: CSL  771: Database Implementation Transaction Processing

Transaction Management

Transaction Manager
• Receives transactions
• Sends operations to scheduler

Scheduler
• Execute op
• Reject op
• Delay op

Example operation stream: Read1(x), Write2(y,k), Read2(x), Commit1

[Figure: Transaction 1, Transaction 2, Transaction 3, …, Transaction n → Transaction Manager → Scheduler → Disk]

Page 34: CSL  771: Database Implementation Transaction Processing

Locking
• Each data item x has a lock associated with it
• If T wants to access x
– Scheduler first acquires a lock on x
– Only one transaction can hold a lock on x
• T releases the lock after processing

Locking is used by the scheduler to ensure serializability

Page 35: CSL  771: Database Implementation Transaction Processing

Notation
• Read lock and write lock
   rl[x], wl[x]
• Obtaining read and write locks
   rli[x], wli[x]
• Lock table
– Entries of the form [x, r, Ti]
• Conflicting locks
– pli[x], qlk[y], x = y and p, q conflict
• Unlock
   rui[x], wui[x]

Page 36: CSL  771: Database Implementation Transaction Processing

Basic 2-Phase Locking (2PL)

RULE 1
Receive pi[x]. Is qlk[x] set such that p and q conflict?
– YES: pi[x] delayed
– NO: Acquire pli[x], pi[x] scheduled

RULE 2
pli[x] cannot be released until pi[x] is completed

RULE 3 (2 Phase Rule)
Once a lock is released no other locks may be obtained.

Page 37: CSL  771: Database Implementation Transaction Processing

The 2-phase rule
Once a lock is released no other locks may be obtained.

T1: r1[x] w1[y] c1
T2: w2[x] w2[y] c2

H = rl1[x] r1[x] ru1[x] wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2 wl1[y] w1[y] wu1[y] c1

[Figure: serialization graph of H, with a cycle between T1 and T2]
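In H, T1 releases its read lock on x and later obtains a write lock on y, which breaks the 2-phase rule. A minimal sketch of a checker follows; it assumes the history is given as a whitespace-separated string in the slide's notation (`rl1[x]`, `wu2[y]`, and so on), and the function name is hypothetical.

```python
import re

# Matches lock and unlock operations such as rl1[x] or wu2[y];
# data operations (r1[x]) and commits (c1) do not match.
LOCK_OP = re.compile(r"([rw])([lu])(\d+)\[(\w+)\]")


def two_phase_violators(history):
    """Return the set of transaction ids that obtain a lock after
    having released one."""
    released, violators = set(), set()
    for token in history.split():
        m = LOCK_OP.fullmatch(token)
        if not m:
            continue
        _mode, kind, txn, _item = m.groups()
        if kind == "u":
            released.add(txn)
        elif txn in released:
            violators.add(txn)
    return violators


H = ("rl1[x] r1[x] ru1[x] wl2[x] w2[x] wl2[y] w2[y] wu2[x] wu2[y] c2 "
     "wl1[y] w1[y] wu1[y] c1")
print(two_phase_violators(H))
```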

Page 38: CSL  771: Database Implementation Transaction Processing

Correctness of 2PL
2PL always produces serializable histories

Proof outline
STEP 1: Characterize properties of the scheduler
STEP 2: Prove that any history with these properties is serializable (that is, SG(H) is acyclic)

Page 39: CSL  771: Database Implementation Transaction Processing

Deadlocks (1/2)

T1: r1[x] w1[y] c1
T2: w2[y] w2[x] c2

Scheduler: rl1[x] wl2[y] r1[x] w2[y] <cannot proceed>

Page 40: CSL  771: Database Implementation Transaction Processing

Deadlocks (2/2)
Strategies to deal with deadlocks
• Timeouts
– Leads to inefficiency
• Detecting deadlocks
– Maintain a wait-for graph; a cycle indicates deadlock
– Once a deadlock is detected, break the cycle by aborting a transaction
• New problem: Starvation
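Deadlock detection on a wait-for graph amounts to cycle detection. A minimal sketch, assuming the graph is a hypothetical dictionary mapping each transaction to the set of transactions it is waiting for:

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle in the wait-for
    graph, or None if there is no cycle."""
    on_path, done = [], set()

    def dfs(t):
        if t in on_path:
            return on_path[on_path.index(t):]   # the cycle found
        if t in done:
            return None
        on_path.append(t)
        for u in waits_for.get(t, ()):
            cycle = dfs(u)
            if cycle:
                return cycle
        on_path.pop()
        done.add(t)
        return None

    for t in waits_for:
        cycle = dfs(t)
        if cycle:
            return cycle
    return None


# Deadlocks (1/2): T1 waits for T2's lock on y, T2 waits for T1's lock on x.
graph = {"T1": {"T2"}, "T2": {"T1"}}
print(find_deadlock(graph))
```

Any transaction in the returned cycle is a candidate victim; always picking the same one is what can lead to the starvation problem noted above.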

Page 41: CSL  771: Database Implementation Transaction Processing

Conservative 2PL
• Avoids deadlocks altogether
– T declares its readset and writeset
– Scheduler tries to acquire all required locks
– If not all locks can be acquired, T waits in a queue
• T never “starts” until all locks are acquired
– Therefore, it can never be involved in a deadlock

On your own
Strict 2PL (2PL which ensures only strict schedules)
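The all-or-nothing lock acquisition of Conservative 2PL can be sketched as follows. The class and method names are hypothetical, and the sketch leaves out how queued transactions are retried.

```python
class ConservativeLocks:
    """Grants a transaction either all of its declared locks or none."""

    def __init__(self):
        self.readers = {}   # item -> set of transactions holding a read lock
        self.writer = {}    # item -> transaction holding the write lock
        self.queue = []     # transactions waiting for their full lock set

    def try_start(self, txn, readset, writeset):
        # Any item we touch must not be write-locked by someone else.
        for x in readset | writeset:
            if self.writer.get(x, txn) != txn:
                return self._wait(txn)
        # Any item we write must not be read-locked by someone else.
        for x in writeset:
            if self.readers.get(x, set()) - {txn}:
                return self._wait(txn)
        for x in writeset:
            self.writer[x] = txn
        for x in readset - writeset:
            self.readers.setdefault(x, set()).add(txn)
        if txn in self.queue:
            self.queue.remove(txn)
        return True

    def _wait(self, txn):
        if txn not in self.queue:
            self.queue.append(txn)
        return False

    def finish(self, txn):
        """Release every lock held by txn."""
        for holders in self.readers.values():
            holders.discard(txn)
        self.writer = {x: t for x, t in self.writer.items() if t != txn}


locks = ConservativeLocks()
print(locks.try_start("T1", {"x"}, {"y"}))       # T1 gets rl[x] and wl[y]
print(locks.try_start("T2", set(), {"x", "y"}))  # T2 must wait
locks.finish("T1")
print(locks.try_start("T2", set(), {"x", "y"}))  # now T2 can start
```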

Page 42: CSL  771: Database Implementation Transaction Processing

Extra Information
• Assumption: Data items are organized in a tree

Can we come up with a better (more efficient) protocol?

Page 43: CSL  771: Database Implementation Transaction Processing

Tree Locking Protocol (1/3)

RULE 1
Receive ai[x]. Is alk[x] set?
– YES: ai[x] delayed
– NO: ai[x] scheduled (subject to RULE 2)

RULE 2
If x is an intermediate node and y is the parent of x, then ali[x] can be set only if Ti holds ali[y]

RULE 3
ali[x] cannot be released until ai[x] is completed

RULE 4
Once a lock is released, the same lock may not be re-obtained.

Page 44: CSL  771: Database Implementation Transaction Processing

Tree Locking Protocol (2/3)
• Proposition: If Ti locks x before Tk, then for every v which is a descendant of x, if both Ti and Tk lock v, then Ti locks v before Tk.
• Theorem: Tree Locking Protocol always produces serializable schedules

Page 45: CSL  771: Database Implementation Transaction Processing

Tree Locking Protocol (3/3)
• Tree Locking Protocol avoids deadlock
• Releases locks earlier than 2PL

BUT
• Needs to know the access pattern to be effective
• Transactions should access nodes from root to leaf

Page 46: CSL  771: Database Implementation Transaction Processing

Multi-granularity Locking (1/3)

• Granularity
– Refers to the relative size of the data item
– Attribute, tuple, table, page, file, etc.

• Efficiency depends on granularity of locking

• Allow transactions to lock at different granularities

Page 47: CSL  771: Database Implementation Transaction Processing

Multi-granularity Locking (2/3)

• Lock Instance Graph

Source: Concurrency Control and Recovery in Database Systems: Bernstein, Hadzilacos and Goodman

• Explicit and Implicit Locks

• Intention read and intention write locks

• An intention read lock conflicts with explicit write locks; an intention write lock conflicts with explicit read and write locks. Intention locks do not conflict with other intention locks

Page 48: CSL  771: Database Implementation Transaction Processing

Multi-granularity Locking (3/3)

• To set rli[x] or irli[x], first hold irli[y] or iwli[y], such that y is the parent of x.

• To set wli[x] or iwli[x], first hold iwli[y], such that y is the parent of x.

• To schedule ri[x] (or wi[x]), Ti must hold rli[y] (or wli[y]) where y = x, or y is an ancestor of x.

• To release irli[x] (or iwli[x]) no child of x can be locked by Ti
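The rules above can be sketched as a compatibility table plus a helper that lists the locks to set from the root down. The table follows the standard compatibility matrix for read, write and intention locks, in which an intention read lock is compatible with a read lock. The names `COMPATIBLE`, `compatible` and `locks_needed`, and the example path, are hypothetical.

```python
# Pairs (held, requested) of lock types that may coexist on one item
# when held by different transactions. Anything not listed conflicts.
COMPATIBLE = {
    ("ir", "ir"), ("ir", "iw"), ("ir", "r"),
    ("iw", "ir"), ("iw", "iw"),
    ("r", "ir"), ("r", "r"),
}


def compatible(held, requested):
    return (held, requested) in COMPATIBLE


def locks_needed(path, mode):
    """path: items from the root of the lock instance graph down to the
    target; mode: 'r' or 'w'. Intention locks of the matching kind are
    set on every ancestor, root first, then the explicit lock."""
    intention = "i" + mode
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


print(locks_needed(["database", "Airlines", "tuple7"], "w"))
print(compatible("ir", "r"), compatible("iw", "r"))
```

Locks are set root-to-leaf and, by the last rule above, released leaf-to-root.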

Page 49: CSL  771: Database Implementation Transaction Processing

The Phantom Problem
• How to lock a tuple which (currently) does not exist?

T1: r1[x1], r1[x2], r1[X], c1
T2: w2[x3], w2[X], c2

rl1[x1], r1[x1], rl1[x2], r1[x2], wl2[x3], wl2[X], w2[x3], wu2[x3,X], c2, rl1[X], ru1[x1,x2,X], c1

Page 50: CSL  771: Database Implementation Transaction Processing

NON-LOCK-BASED SCHEDULERS

Page 51: CSL  771: Database Implementation Transaction Processing

Timestamp Ordering (1/3)
• Each transaction is associated with a timestamp
– Ti indicates Transaction T with timestamp i.
• Each operation in the transaction has the same timestamp

Page 52: CSL  771: Database Implementation Transaction Processing

Timestamp Ordering (2/3)

TO Rule
If pi[x] and qk[x] are conflicting operations, then pi[x] is processed before qk[x] iff i < k

Theorem: If H is a history representing an execution produced by a TO scheduler, then H is serializable.

Page 53: CSL  771: Database Implementation Transaction Processing

Timestamp Ordering (3/3)
• For each data item x, maintain: max-rt(x), max-wt(x), c(x)
• Request ri[x]
– Grant request if TS(i) >= max-wt(x) and c(x), update max-rt(x)
– Delay if TS(i) > max-wt(x) and !c(x)
– Else abort and restart Ti
• Request wi[x]
– Grant request if TS(i) >= max-wt(x) and TS(i) >= max-rt(x), update max-wt(x), set c(x) = false
– Else abort and restart Ti

ON YOUR OWN: Thomas write rule, actions taken when a transaction has to commit or abort
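The read and write rules can be sketched as two functions over per-item state. This is an illustrative sketch that follows the rules exactly as listed on the slide; the class and function names are hypothetical, and commit/abort handling (left as an exercise above) is reduced to setting the commit flag by hand.

```python
class Item:
    """Per-item bookkeeping: max-rt(x), max-wt(x) and c(x)."""

    def __init__(self):
        self.max_rt = 0
        self.max_wt = 0
        self.committed = True


def request_read(item, ts):
    if ts >= item.max_wt and item.committed:
        item.max_rt = max(item.max_rt, ts)
        return "grant"
    if ts > item.max_wt and not item.committed:
        return "delay"      # wait for the writer to commit or abort
    return "abort"          # the read arrived too late


def request_write(item, ts):
    if ts >= item.max_wt and ts >= item.max_rt:
        item.max_wt = ts
        item.committed = False
        return "grant"
    return "abort"          # a younger transaction already read or wrote x


x = Item()
print(request_write(x, 5))   # grant
print(request_read(x, 7))    # delay: writer 5 has not committed
print(request_read(x, 3))    # abort: 3 < max-wt(x)
x.committed = True           # assume the transaction with timestamp 5 commits
print(request_read(x, 7))    # grant
print(request_write(x, 6))   # abort: 6 < max-rt(x) = 7
```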

Page 54: CSL  771: Database Implementation Transaction Processing

Validation
• Aggressively schedule all operations
• Do not commit until the transaction is “validated”

ON YOUR OWN

Page 55: CSL  771: Database Implementation Transaction Processing

Summary
• Lock-based Schedulers
– 2-Phase Locking
– Tree Locking Protocol
– Multi-granularity Locking
– Locking in the presence of updates
• Non-lock-based Schedulers
– Timestamp Ordering
– Validation-based Concurrency Control (on your own)

Page 56: CSL  771: Database Implementation Transaction Processing

RECOVERY
SOURCE: Database Systems: The Complete Book. Garcia-Molina, Ullman and Widom

Page 57: CSL  771: Database Implementation Transaction Processing

Logging
• Log the operations in the transaction(s)
• Believe the log
– Does the log say transaction T has committed?
– Or does it say aborted?
– Or has only a partial trace (implicit abort)?
• In case of failures, reconstruct the DB from its log

Page 58: CSL  771: Database Implementation Transaction Processing

The basic setup

[Figure: Transactions T1, T2, T3, …, Tk; buffer space for each transaction; buffer space for data and log; the disk, holding the data and the LOG]

Page 59: CSL  771: Database Implementation Transaction Processing

Terminology
• Data item: an element which can be read or written
– tuple, relation, B+-tree index, etc.

Input x: fetch x from the disk to the buffer
Read x,t: read x into local variable t
Write x,t: write the value of t into x
Output x: write x to disk

Page 60: CSL  771: Database Implementation Transaction Processing

Example

update Airlines set price = price - price*0.1, status = 'cheap' where price < 5000

Read P, x
x -= x * 0.1
Write P, x
Read S, y
y = "CHEAP"
Write S, y
Output P
Output S

[Figure marks three points in this sequence with “System fails here”]

Page 61: CSL  771: Database Implementation Transaction Processing

Logs
• Sequence of log records
• Need to keep track of
– Start of transaction
– Update operations (Write operations)
– End of transaction (COMMIT or ABORT)
• “Believe” the log, use the log to reconstruct a consistent DB state

Page 62: CSL  771: Database Implementation Transaction Processing

Types of logs
• Undo logs
– Ensure that uncommitted transactions are rolled back (or undone)
• Redo logs
– Ensure that committed transactions are redone
• Undo/Redo logs
– Both of the above

All 3 logging styles ensure atomicity and durability

Page 63: CSL  771: Database Implementation Transaction Processing

Undo Logging (1/3)
• <START T>: Start of transaction T
• <COMMIT T>
• <ABORT T>
• <T, A, x>: Transaction T modified A whose before-image is x.

Page 64: CSL  771: Database Implementation Transaction Processing

Undo Logging (2/3)

Actions:
Read P, x
x -= x * 0.1
Write P, x
Read S, y
y = "CHEAP"
Write S, y
FLUSH LOG
Output P
Output S
FLUSH LOG

Log records, in order: <START T>, <T, P, x>, <T, S, y>, <COMMIT T>

U1: <T, X, v> should be flushed before Output X

U2: <COMMIT T> should be flushed after all OUTPUTs

Page 65: CSL  771: Database Implementation Transaction Processing

Undo Logging (3/3)
• Recovery with Undo log
1. If T has a <COMMIT T> entry, do nothing
2. If T has a <START T> entry, but no <COMMIT T>
   • T is incomplete and needs to be undone
   • Restore old values from <T,X,v> records
• There may be multiple transactions
– Start scanning from the end of the log
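Recovery with an undo log can be sketched as a single backward scan. The record encoding below (tuples tagged `START`, `UPDATE`, `COMMIT`) and the in-memory `db` dictionary are assumptions made for illustration; the sample values are invented.

```python
def undo_recover(log, db):
    """Scan the log from the end; restore the before-image of every
    update made by a transaction that has no COMMIT record."""
    committed = set()
    for record in reversed(log):
        kind, txn = record[0], record[1]
        if kind == "COMMIT":
            committed.add(txn)
        elif kind == "UPDATE" and txn not in committed:
            _, _, item, before_image = record
            db[item] = before_image
    return db


# Crash after Output P but before <COMMIT T> reached the log.
log = [("START", "T"), ("UPDATE", "T", "P", 4000), ("UPDATE", "T", "S", "normal")]
db = {"P": 3600, "S": "normal"}
print(undo_recover(log, db))
```

Scanning backwards means that if an item was updated several times by incomplete transactions, the earliest before-image is the one that ends up in the database.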

Page 66: CSL  771: Database Implementation Transaction Processing

Redo Logging (1/3)
• All incomplete transactions can be ignored
• Redo all completed transactions
• <T, A, x>: Transaction T modified A whose after-image is x.

Page 67: CSL  771: Database Implementation Transaction Processing

Redo Logging (2/3)

Actions:
Read P, x
x -= x * 0.1
Write P, x
Read S, y
y = "CHEAP"
Write S, y
FLUSH LOG
Output P
Output S

Log records, in order: <START T>, <T, P, x>, <T, S, y>, <COMMIT T>

Write-ahead Logging
R1: <T, X, v> and <COMMIT T> should be flushed before Output X

Page 68: CSL  771: Database Implementation Transaction Processing

Redo Logging (3/3)
• Recovery with Redo Logging
– If T has a <COMMIT T> entry, redo T
– If T is incomplete, do nothing (add <ABORT T>)
• For multiple transactions
– Scan from the beginning of the log
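Redo recovery mirrors undo recovery: find the committed transactions, then replay their after-images in log order. The record encoding and sample values below are assumptions made for illustration, and writing the <ABORT T> records for incomplete transactions is left out.

```python
def redo_recover(log, db):
    """Scan the log from the beginning; reapply the after-image of every
    update made by a committed transaction."""
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    for record in log:
        if record[0] == "UPDATE" and record[1] in committed:
            _, _, item, after_image = record
            db[item] = after_image
    return db


# Crash after <COMMIT T> was flushed but before Output P and Output S.
log = [("START", "T"), ("UPDATE", "T", "P", 3600),
       ("UPDATE", "T", "S", "CHEAP"), ("COMMIT", "T")]
db = {"P": 4000, "S": "normal"}
print(redo_recover(log, db))
```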

Page 69: CSL  771: Database Implementation Transaction Processing

Undo/Redo Logging (1/3)
• Undo logging: Cannot COMMIT T unless all updates are written to disk
• Redo logging: Cannot release memory unless transaction commits
• Undo/Redo logs attempt to strike a balance

Page 70: CSL  771: Database Implementation Transaction Processing

Undo/Redo Logging (2/3)

Actions:
Read P, x
x -= x * 0.1
Write P, x
Read S, y
y = "CHEAP"
Write S, y
FLUSH LOG
Output P
Output S

Log records, in order: <START T>, <T, P, x, a>, <T, S, y, b>, <COMMIT T>

UR1: <T, X, a, b> should be flushed before Output X

For comparison:
U1: <T, X, v> should be flushed before Output X
U2: <COMMIT T> should be flushed after all OUTPUTs
R1: <T, X, v> and <COMMIT T> should be flushed before Output X

Page 71: CSL  771: Database Implementation Transaction Processing

Undo/Redo Logging (3/3)
• Recovery with Undo/Redo Logging
– Redo all committed transactions (earliest-first)
– Undo all uncommitted transactions (latest-first)

What happens if there is a crash when you are writing a log?
What happens if there is a crash during recovery?
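The two recovery passes can be sketched together. As before, the record encoding (an `UPDATE` tuple carrying both the before-image and the after-image) and the sample values are assumptions made for illustration.

```python
def undo_redo_recover(log, db):
    """Redo committed transactions earliest-first, then undo
    uncommitted transactions latest-first."""
    committed = {record[1] for record in log if record[0] == "COMMIT"}
    for record in log:                       # forward pass: redo
        if record[0] == "UPDATE" and record[1] in committed:
            _, _, item, _before, after = record
            db[item] = after
    for record in reversed(log):             # backward pass: undo
        if record[0] == "UPDATE" and record[1] not in committed:
            _, _, item, before, _after = record
            db[item] = before
    return db


log = [("START", "T1"), ("UPDATE", "T1", "P", 4000, 3600), ("COMMIT", "T1"),
       ("START", "T2"), ("UPDATE", "T2", "S", "normal", "CHEAP")]
db = {"P": 4000, "S": "CHEAP"}   # T1's write not yet on disk, T2's already is
print(undo_redo_recover(log, db))
```

Both passes only assign logged images, so running recovery again after a crash during recovery produces the same database state.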

Page 72: CSL  771: Database Implementation Transaction Processing

Checkpointing
• Logs can be huge… can we throw away portions of them?
• Can we avoid processing all of the log when there is a crash?

ON YOUR OWN