Transactions
CPSC 356 Database
Ellen Walker
Hiram College
(Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
Transaction
• A logical unit of work for a database
– Example: “move $5 from savings to checking”
– { withdraw 5 from savings; add 5 to checking }
• Transactions must be prevented from interfering with each other
• Transactions must be “all or nothing”
– If the transaction doesn’t complete, no changes should be made at all!
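The “all or nothing” rule can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `account` table, the `transfer` helper, and the no-overdraft rule that triggers the failure are assumptions made for the example, not part of the course material.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit."""
    try:
        cur = conn.cursor()
        cur.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                    (amount, src))
        # Hypothetical rule: no overdrafts. Breaking it makes the transaction fail.
        (bal,) = cur.execute("SELECT balance FROM account WHERE name = ?",
                             (src,)).fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")
        cur.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                    (amount, dst))
        conn.commit()        # both changes become permanent together
        return True
    except ValueError:
        conn.rollback()      # the withdrawal already made is undone
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("savings", 20), ("checking", 0)])
conn.commit()
```

Calling `transfer(conn, "savings", "checking", 5)` commits both updates; asking for more than the savings balance rolls back the withdrawal, so neither account changes.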
States of a Transaction
• Successful transactions are committed
• Failed partial transactions are rolled back
• Committed transactions cannot be undone. We must create a new (compensating) transaction to fix the database.
ACID Properties of a Transaction (Härder and Reuter, 1983)
• Atomicity — a transaction is either performed in its entirety or not at all
• Consistency — a transaction must take the database from one consistent state to another
• Isolation (Serializable) — if two transactions run at the same time, the result must look as if they ran sequentially in some arbitrary order; a transaction’s updates must not be visible to other transactions until it commits
• Durability — once a transaction commits, its result is permanent (must never be lost)
Concurrency Control
• Two or more transactions proceed concurrently, while preserving serializability (isolation)
• Transactions cannot interfere with each other
– Lost update problem
– Dirty read problem
– Inconsistent analysis problem
Lost Update Problem
• Account A = $100, B = $200, C = $300
– Transaction T transfers $4 from A to B
– Transaction U transfers $3 from C to B
– Should end A = $96, B = $207, C = $297
• U’s update of B is lost:

Transaction T               Transaction U
bal=read(A)        $100
write(A,bal–4)     $96
                            bal=read(C)        $300
                            write(C,bal–3)     $297
bal=read(B)        $200
                            bal=read(B)        $200
                            write(B,bal+3)     $203
write(B,bal+4)     $204
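The interleaving above can be replayed in a few lines of Python. This is a minimal sketch: the tuple encoding of steps and the `run` helper are invented for the illustration, and each transaction keeps a single private variable `bal`, as on the slide.

```python
def run(schedule, db):
    """Execute (transaction, operation, item, delta) steps in the given order."""
    bal = {}                          # private `bal` variable of each transaction
    for txn, op, item, delta in schedule:
        if op == "read":
            bal[txn] = db[item]
        else:                         # "write": store the private value plus delta
            db[item] = bal[txn] + delta
    return db

interleaved = [
    ("T", "read", "A", 0), ("T", "write", "A", -4),
    ("U", "read", "C", 0), ("U", "write", "C", -3),
    ("T", "read", "B", 0),
    ("U", "read", "B", 0), ("U", "write", "B", +3),
    ("T", "write", "B", +4),          # overwrites U's $203 using T's stale $200
]
# Stable sort by transaction name: all of T's steps, then all of U's.
serial = sorted(interleaved, key=lambda step: step[0])

start = {"A": 100, "B": 200, "C": 300}
print(run(interleaved, dict(start)))  # B ends at 204
print(run(serial, dict(start)))       # B ends at 207
```

Running the same steps serially gives B = $207, while the interleaved order leaves B = $204 because T writes back a value computed from a read that U's write has since made stale.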
Dirty Read Problem
• Account A = $200, B = $200
– Transaction T transfers $100 from A to B but fails!
– Transaction U deposits $25 to A
– Should end A = $225, B = $200
• Problem:
– Transaction U read “dirty value” of A after $100 was taken…

Transaction T               Transaction U
bal=read(A)        $200
write(A,bal–100)   $100
                            bal=read(A)        $100
                            write(A,bal+25)    $125
bal=read(B)        $200
…ROLLBACK!
Nonrepeatable Read Problem
• Similar to dirty read, but here one transaction reads the same item twice and gets two different values, because another transaction updated it in between

Transaction T               Transaction U
read(A)            $1000
                            sal=read(A)        $1000
(unrelated actions)
                            write(A,sal*1.1)   $1100
sal=read(A)        $1100
Inconsistent Analysis Problem
• Situation:
– Transaction T gives everyone a 10% raise
– Transaction U computes the average salary
• Problem:
– Some salaries have been raised, some not when average is computed (avg should be 1500 or 1650)

Transaction T               Transaction U
sal=read(A)        $1000
write(A,sal*1.1)   $1100
                            bal=read(A)        $1100
                            bal+=read(B)       $3100
sal=read(B)        $2000
write(B,sal*1.1)   $2200
                            avg = bal/2        $1550
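The three possible averages can be checked with a short script. It is a sketch only: the step strings ("raise X", "sum X") and the `average_after` helper are made up for the illustration, and the 10% raise uses integer arithmetic to avoid floating-point noise.

```python
SALARIES = {"A": 1000, "B": 2000}

def average_after(schedule):
    """Replay 'raise X' steps (transaction T) and 'sum X' steps (transaction U)."""
    db = dict(SALARIES)
    total = 0
    for step in schedule:
        action, item = step.split()
        if action == "raise":
            db[item] = db[item] * 11 // 10    # T: 10% raise
        else:
            total += db[item]                 # U: running total for the average
    return total / len(db)

u_then_t = ["sum A", "sum B", "raise A", "raise B"]      # serial: U before T
t_then_u = ["raise A", "raise B", "sum A", "sum B"]      # serial: T before U
interleaved = ["raise A", "sum A", "sum B", "raise B"]   # the schedule above

print(average_after(u_then_t), average_after(t_then_u), average_after(interleaved))
```

Both serial orders give one of the two acceptable answers (1500 or 1650); the interleaved order gives 1550, which matches no serial execution.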
Interleaving Causes Problems
• We need a concurrency control mechanism
– Allow as much concurrency among transactions as possible (throughput)
– Prevent other transactions from viewing intermediate values (not yet committed)
Definitions for Scheduling
• Schedule
– A sequence of operations by a set of concurrent transactions that preserves order of operations within each transaction
• Serial Schedule
– A schedule without any interleaving
• Nonserial Schedule
– A schedule where operations from different transactions are interleaved
Conflict Serializability
• A serializable schedule has the same result as some serial schedule
• Recognize conflicts between transactions
– Both transactions access the same variable
– At least one of those accesses is a write
• When all conflicts happen in the same order (T before U or U before T), then the schedule is serializable; otherwise not.
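The conflict rule can be written down directly. The sketch below assumes a schedule encoded as a list of `(transaction, op, item)` tuples with `op` being `"r"` or `"w"`; that encoding and the `conflicts` helper are inventions for this example.

```python
from itertools import combinations

def conflicts(schedule):
    """Return (earlier_txn, later_txn, item) for every conflicting pair of operations."""
    found = []
    # combinations() yields pairs in schedule order, so t1's operation comes first.
    for (t1, op1, x1), (t2, op2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (op1, op2):
            found.append((t1, t2, x1))
    return found

schedule = [
    ("T", "r", "A"), ("T", "w", "A"),
    ("U", "r", "C"), ("U", "w", "C"),
    ("T", "r", "B"), ("T", "w", "B"),
    ("U", "r", "B"), ("U", "w", "B"),
]
directions = {(a, b) for a, b, _item in conflicts(schedule)}
print(directions)   # only T-before-U pairs appear
```

Every conflict in this schedule is on B and runs from T to U, so all conflicts happen in the same order and the schedule is serializable.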
Serializability Testing
• Draw a downward (forward in time) arrow for each conflict (when one transaction is writing)
• If all arrows point the same way, then the schedule is serializable

Transaction T               Transaction U
bal=read(A)
write(A,bal–4)
                            bal=read(C)
                            write(C,bal–3)
bal=read(B)
write(B,bal+4)
                            bal=read(B)
                            write(B,bal+3)
Serializability Testing (cont.)
• If at least one arrow is pointing leftward and another arrow is pointing rightward, the schedule is not serializable

Transaction T               Transaction U
bal=read(A)
write(A,bal–4)
                            bal=read(C)
                            write(C,bal–3)
                            bal=read(B)
bal=read(B)
write(B,bal+4)
                            write(B,bal+3)
Generalizing Serializability
• With more than two transactions, build a conflict serializable graph
– Each transaction is a node of the graph
– For each conflict, draw an arc from the earlier transaction to the later transaction.
• If this graph has a cycle, then the schedule is not serializable
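The graph construction and the cycle test can be sketched as follows. The schedule encoding (`(transaction, op, item)` tuples), both function names, and the three-transaction example are assumptions made for illustration.

```python
def precedence_graph(schedule):
    """Arc (earlier, later) for each pair of conflicting operations."""
    edges = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search; reaching a node that is still being visited means a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    state = {}                                   # node -> "visiting" or "done"

    def visit(node):
        state[node] = "visiting"
        for nxt in graph[node]:
            if state.get(nxt) == "visiting":
                return True
            if nxt not in state and visit(nxt):
                return True
        state[node] = "done"
        return False

    return any(node not in state and visit(node) for node in graph)

# Hypothetical schedule: T1 before T2 (on X), T2 before T3 (on Y), T3 before T1 (on Z).
cyclic = [("T1", "r", "X"), ("T2", "w", "X"),
          ("T2", "r", "Y"), ("T3", "w", "Y"),
          ("T3", "r", "Z"), ("T1", "w", "Z")]
print(has_cycle(precedence_graph(cyclic)))       # True: not serializable
```

No pair of transactions in this example conflicts in both directions, yet the three arcs together form a cycle, which is why the two-transaction arrow test has to be generalized to a graph.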
Serializability Testing vs. Enforcement
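• To test serializability, you have to create the graph and check for cycles
– The graph can only be built once the schedule is known, so the test comes too late to guide scheduling; testing the more general view serializability is NP-complete (result from study of algorithms)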
• Instead, let’s create extra constraints (locking) to enforce serializability
Locking Algorithms
• Locking is a method of controlling concurrency using a lock (variable) to deny transactions access to certain objects
• Types of locking
– Static locking
– 2 Phase Locking
• Other algorithms (we won’t cover)
– Optimistic concurrency control
– Timestamp ordering
Using Locks
• Transaction must lock the data object before accessing it
• Transaction should unlock the data object when done
• If an item is locked, the transaction must wait until it is unlocked
• Example transaction:
– Lock B; read B; … write B; unlock B; commit.
Types of Locks
• Shared lock
– Transaction can read item only (read lock)
• Exclusive lock
– Transaction can read and update item (write lock)
• Shared lock can be upgraded to exclusive lock.
• Exclusive lock can be downgraded to shared lock.
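The compatibility rules for shared and exclusive locks can be sketched as a toy lock table. The `LockTable` class and its method names are hypothetical; a real lock manager would also queue waiting transactions and be thread-safe, which this sketch is not.

```python
class LockTable:
    """Toy lock table: item -> {transaction: "S" or "X"}."""

    def __init__(self):
        self.held = {}

    def request(self, txn, item, mode):
        """Grant the lock and return True, or return False if txn would have to wait."""
        owners = self.held.setdefault(item, {})
        others = [m for t, m in owners.items() if t != txn]
        if mode == "S":
            ok = "X" not in others        # shared is compatible only with shared
        else:
            ok = not others               # exclusive needs the item to itself
        if ok and owners.get(txn) != "X": # never silently weaken an existing X lock
            owners[txn] = mode            # also covers the S -> X upgrade
        return ok

    def downgrade(self, txn, item):
        self.held[item][txn] = "S"

    def release(self, txn, item):
        self.held[item].pop(txn, None)

table = LockTable()
print(table.request("T", "B", "S"))   # True: first reader
print(table.request("U", "B", "S"))   # True: readers share
print(table.request("T", "B", "X"))   # False: upgrade blocked while U reads
```

Once U releases its shared lock, the same upgrade request from T succeeds, and a later downgrade lets U read again.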
Locking Protocols
• Locking alone doesn’t guarantee serializability
– If an object is unlocked and locked again within a transaction, another transaction can “jump in”
• Locking protocols prevent this
– Static locking
– 2 Phase locking
Static Locking
• Transaction locks all the data items before using any of them.
– Usually the first operation in the transaction
• Transaction releases all locks at once when it’s done with the data
– Usually at the end of the transaction
• This method limits concurrency but guarantees serializability
• Transaction must know in advance which objects it will use
2 Phase Locking
• Constraint: A transaction cannot request a lock on one data item after it has unlocked any data items.
• To maintain the constraint, use 2 phases:
– Growing phase: transaction requests locks, but doesn’t release any locks (upgrades allowed)
– The stage of a transaction when it holds locks on all the needed data objects is called the lock point
– Shrinking phase: transaction releases locks, but doesn’t request any more locks (downgrades allowed)
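The two-phase constraint is easy to check mechanically. In this sketch a transaction is a list of `(action, item)` pairs; the encoding and the helper names `is_two_phase` and `lock_point` are assumptions for the example.

```python
GROWING = {"lock", "upgrade"}
SHRINKING = {"unlock", "downgrade"}

def is_two_phase(actions):
    """True if no growing action follows a shrinking action."""
    shrinking = False
    for action, _item in actions:
        if action in SHRINKING:
            shrinking = True
        elif action in GROWING and shrinking:
            return False
    return True

def lock_point(actions):
    """Index of the last growing action: from here on, all needed locks are held."""
    return max(i for i, (action, _item) in enumerate(actions) if action in GROWING)

ok = [("lock", "A"), ("read", "A"), ("lock", "B"),
      ("unlock", "A"), ("write", "B"), ("unlock", "B")]
bad = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]

print(is_two_phase(ok), lock_point(ok))   # True 2
print(is_two_phase(bad))                  # False: locks B after unlocking A
```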
2-Phase Locking can cause Cascading Rollback
• With 2PL, once a transaction has released some of its locks but has not yet committed, its intermediate results become visible
• When a transaction is rolled back, all data objects it modified are restored
• What if another transaction reads those intermediate results, and the first transaction later aborts?
– All transactions that have read these data objects must also be rolled back (even if they’ve already completed!); this is called cascading rollback
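The set of transactions caught in a cascade can be computed by following "read uncommitted data from" links until nothing new is added. The `read_from` mapping and the `cascade` helper below are hypothetical names for this sketch.

```python
def cascade(aborted, read_from):
    """Transactions that must roll back when `aborted` does.

    `read_from` maps each transaction to the set of transactions whose
    uncommitted writes it has read.
    """
    doomed = {aborted}
    changed = True
    while changed:                    # repeat until no new transaction is added
        changed = False
        for reader, writers in read_from.items():
            if reader not in doomed and writers & doomed:
                doomed.add(reader)
                changed = True
    return doomed

# U read T's intermediate result, V read U's, W read nothing uncommitted.
read_from = {"U": {"T"}, "V": {"U"}, "W": set()}
print(sorted(cascade("T", read_from)))   # ['T', 'U', 'V']
```

V never touched T's data directly, but it is still rolled back because it depends on U, which depends on T.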
Rigorous & Strict 2 Phase Locking
• Rigorous 2PL
– A transaction holds all its locks until it completes, when it commits (or aborts) and releases all of its locks in a single atomic action
• Strict 2PL
– A transaction holds all its exclusive locks until it completes, when it commits (or aborts) and releases all of its locks in a single atomic action
Deadlock
• Deadlock occurs when 2 or more transactions are each waiting for locks on items held by other waiting transactions (circular wait)
• Example: Dining Philosophers
– 5 philosophers, 5 forks
– To eat, you need both left and right forks
– If each philosopher picks up a left fork and waits for a right fork to become available, deadlock!
2 Phase Locking can lead to Deadlock
• A transaction can request a lock on a data object while holding locks on other data objects, so a circular wait can result
• Resolved (after detecting deadlock) by:
– Abort a deadlocked transaction, restore all modified data objects, release all its locks, and withdraw all pending lock requests
Deadlock Detection
• Deadlock detection
– Wait-for Graph (WFG)
– If transaction T is waiting for a lock that transaction U holds, there is an arrow from T to U in the WFG
• Lock manager is responsible for detection
– It looks for cycles in its Wait For Graph
– If it finds a cycle, it must select and abort a transaction (the deadlock victim)
– Choose victim based on age, number of changes already made, number of changes still to be made
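A lock manager's detection step might look like the sketch below. The dictionary encoding of the wait-for graph, the function names, and the victim policy (abort the youngest transaction in the cycle) are assumptions; the slide lists several possible criteria and does not prescribe one.

```python
def find_cycle(waits_for):
    """Return one cycle as a list of transactions, or None if there is no deadlock.

    `waits_for` maps a transaction to the list of transactions it is waiting on.
    """
    def dfs(node, path):
        if node in path:                       # came back to a node on the current path
            return path[path.index(node):]
        for nxt in waits_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

def choose_victim(cycle, start_time):
    """Assumed policy: abort the youngest transaction (largest start timestamp)."""
    return max(cycle, key=lambda txn: start_time[txn])

wfg = {"T": ["U"], "U": ["V"], "V": ["T"], "W": ["T"]}
start_time = {"T": 1, "U": 2, "V": 3, "W": 4}
cycle = find_cycle(wfg)
print(cycle, choose_victim(cycle, start_time))
```

W is waiting on a deadlocked transaction but is not itself part of the cycle, so it is not a candidate victim; aborting V breaks the cycle and eventually unblocks W as well.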
Deadlock Prevention (Lock methods)
• Lock all items when transaction starts (static locking)
• Request locks in predefined order
– May cause premature locking, which reduces concurrency
• Lock timeouts (enables preemption)
– Each lock is invulnerable for a limited period, and vulnerable afterwards
– If a transaction wants to access a data object protected by a vulnerable lock, the lock is broken and the transaction holding it is aborted
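Requesting locks in a predefined order can be illustrated with ordinary Python threads standing in for transactions. This is only an analogy: the account names, the alphabetical ordering, and the `transfer` and `run_concurrently` helpers are assumptions for the example.

```python
import threading

locks = {name: threading.Lock() for name in ("A", "B")}
balance = {"A": 100, "B": 200}

def transfer(src, dst, amount):
    # Always acquire in one global order (alphabetical here), whatever the
    # direction of the transfer, so no circular wait can form.
    first, second = sorted((src, dst))
    with locks[first], locks[second]:
        balance[src] -= amount
        balance[dst] += amount

def run_concurrently():
    threads = [threading.Thread(target=transfer, args=("A", "B", 1)) for _ in range(50)]
    threads += [threading.Thread(target=transfer, args=("B", "A", 1)) for _ in range(50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

run_concurrently()
print(balance)   # the opposite transfers cancel out
```

If each thread instead locked its source account first, an A-to-B transfer and a B-to-A transfer could each hold one lock and wait forever for the other.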
Deadlock Prevention (Timestamp)
• Transaction timestamps
– Each transaction is assigned a unique timestamp when it starts
– If a transaction needs to access a data object that is locked by another transaction, the timestamps of the two transactions are compared
– The older transaction (smaller timestamp) generally has priority
– Wait-for edges are only allowed in one direction by age (older to younger in wait-die, younger to older in wound-wait), which prevents cycles
Eliminating Deadlock with Timestamps
• Wait-die: (aborts one)
– If an older transaction wants something held by a younger transaction, it waits
– If a younger transaction wants something held by an older transaction, it must die
• Wound-wait: (preempts resource)
– If an older transaction wants something held by a younger transaction, it preempts it
– If a younger transaction wants something held by an older transaction, it waits
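Both schemes reduce to one timestamp comparison between the requesting transaction and the lock holder. The function names and the returned strings below are invented for this sketch; a smaller timestamp means an older transaction.

```python
def wait_die(requester_ts, holder_ts):
    """Older requester waits; younger requester dies (is aborted)."""
    return "wait" if requester_ts < holder_ts else "die"

def wound_wait(requester_ts, holder_ts):
    """Older requester wounds (preempts) the holder; younger requester waits."""
    return "wound holder" if requester_ts < holder_ts else "wait"

# Transaction with timestamp 1 is older than the one with timestamp 2.
print(wait_die(1, 2), wait_die(2, 1))       # wait die
print(wound_wait(1, 2), wound_wait(2, 1))   # wound holder wait
```

In each scheme, waiting happens in only one direction by age (older waits for younger in wait-die, younger waits for older in wound-wait), so a cycle of waiting transactions cannot form.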