Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of...
-
Upload
jahiem-catherine -
Category
Documents
-
view
230 -
download
0
Transcript of Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of...
Outline
• Transaction concepts & protocols
• Performance impact of concurrency control
• Performance tuning
ApplicationProgrammer
(e.g., business analyst,Data architect)
SophisticatedApplicationProgrammer
(e.g., SAP admin)
DBA,Tuner
Hardware[Processor(s), Disk(s), Memory]
Operating System
Concurrency Control Recovery
Storage SubsystemIndexes
Query Processor
Application
Transaction Concepts & Protocols
• Transaction– A logical unit of database processing– A sequence of begin, reads/writes, end– Unit of recovery, consistency, concurrency
• Transaction Processing Systems– Large databases with multiple users executing
database transactions– Examples
• Banking systems, airline reservations, supermarket checkouts, ...
Transaction States
Active
Failed
Committed
Terminated
Partially Committed
begin-transaction commit
read-item, write-item
abortabort
end-transaction
STATE
Transition
Transaction “Correctness”
• ACID properties– Atomicity– Consistency– Isolation– Durability
• Enforced by concurrency control and recovery methods of the DBMS
Serial Schedule
• Schedule– A sequence of read & write operations from various transactions– R1[X] W3[Y] R2[X] W2[Y] W1[X] W2[X]
• Serial schedule– No interleaved operations from the participating transactions– W3[Z] R3[Y] R1[X] W1[Y] R2[Y] W2[Z] W2[X]– Always correct, but … so slow!
• A schedule that is equivalent to some serial schedule is correct too
Equivalent Schedules
• 2 schedules are equivalent if the transactions– Read the same values– Produce the same output– Have the same effect on the database
• Examples1. R1[X] W2[X] R3[Y] W1[Y] R2[Y] W3[Z] W2[Z]
2. W3[Z] R3[Y] R1[X] W1[Y] R2[Y] W2[Z] W2[X]
3. W2[X] R1[X] W1[Y] R2[Y] W3[Z] W2[Z] R3[Y]• 1 and 2 are equivalent; not 3
Serializable Schedule
Theorem• A schedule is serializable if there is a serial
schedule such that for every conflicting pair of operations, the two operations appear in the same order in both schedules.
• 2 operations conflict if they are on the same object and one is a write
• Example 1 is serializable
WR Conflicts
T1 T2
R(A) ($200)
W(A) ($100)
R(A) (100)
W(A) (106)
R(B) (200)
W(B) (212)
R(B) (212)
W(B) (312)
Commit
Commit
T1 transfer $100 from A to B, and T2 increments both and B by 6%
(A and B have $200 initially)
Dirty read
R(A)
Unrepeatable Read (UR)
WW Conflicts
T1 T2
R(A)
W(A) ($1000)
R(A)
W(A) ($2000)
R(B)
W(B) ($2000)
R(B)
W(B) ($1000)
Commit
Commit
T1 to set both A and B to $1000, T2 to set both A and B to $2000
Lost Update!
Concurrency Control Enforces Serializability
• Most commercial DBMS use protocols (a set of rules) which when enforced by DBMS ensure the serializability of all schedules in which transactions participate.– Serializability testing after execution is meaningless;
how to rectify?– This done by Concurrency Control
Concurrency Control Protocols
• Commercially accepted mechanisms– Locking– Timestamps
• Others mechanisms– Multi-version and optimistic protocols
• Granularity issues
Locking
• Locking is used to synchronize accesses by concurrent transactions on data items– A concept also found in operating systems and
concurrent programming
• A lock is a variable for a data item, that describes the status of the item with respect to allowable operations
Types of Locks
• Binary locks– Locked, or Unlocked
• Check before enter; wait when locked; lock after enter; unlock after use (and wakeup one waiting transaction).
– Simple but too restrictive
• Read/Write locks in commercial DBMS– read-locked– write-locked– Unlocked
R-lock W-lock
R-lock
W-lock
Y
N
N
N
Read/Write Locking Scheme
• A transaction T must issue read-lock (X) or write-lock before any read-item (X)
• T must issue write-lock (X) before any write-item (X)
• T must issue unlock-item (X) after completing all read-item (X) and write-item (X)
• T will not issue a read-lock (X) if T already holds a read/write lock on X
• T will not issue write-lock (X) if T already holds a write lock on X
Does Locking Ensure Serializability?
X unlocked too early
Y unlocked too early
T1
read-lock (Y);
read-item (Y);
unlock (Y);
write-lock (X);
read-item (X);
X:=X+Y;
write-item (X);
unlock (X);
T2
read-lock (X);
read-item (X);
unlock (X);
write-lock (Y);
read-item (Y);
Y:=X+Y;
write-item (Y);
unlock (Y);
Cannot serialize T1 and T2
X == Y (orignal X + originalY)
For serializable T1T2,
X == X + Y
Y == 2Y + originalX?
Need for Locking Protocol
• Locking alone does not ensure serializability!• We need a locking protocol• A set of rules that dictate the positioning of
locking and unlocking operations, thus guaranteeing serializability
Two-Phase Locking (2PL)
• A transaction follows the two-phase protocol if all locking operations precede the first unlocking operation
Phase 2: Shrinking
read-lock (Y)unlock (X)unlock (Y)
Phase 1: Growing
read-lock (X)write-lock (X)write-lock (Y)
2PL Variants
• Basic 2PL
• Conservative 2PL– Locking operations precede transaction execution– Make sure can acquire necessary locks
• Strict 2PL– Unlocking of write-locks after commit (or abort)– Avoid cascading abort
• Rigorous 2PL– Unlocking of all locks after commit (or abort)
Limitations of 2PL
• Some serializable schedules may not be permitted– Performance not optimal
• 2PL (and locking in general) may cause deadlocks and starvation – Deadlock: no transactions can proceed– Starvation: some transaction wait forever
Lock Granularity
• Larger size - lower concurrency• Smaller size - higher overhead
What is the best item size?
Processing a mix of transactions?
Depends on the type of transactions
Multiple granularity locking scheme, changing the size of the data item dynamically
Performance of Locking
Throughput
# of Active Transactions
Thrashing
• Overhead: blocking
• Increasing the throughput:
•Locking smaller size objects
•Reducing locking time
•Reducing hot spots
Other CC Protocols
• Timestamp based• Multi-version based• Optimistic concurrency control
– No checking is done before or during transaction execution
– The transaction is validated at the end of execution, by checking if serializability has been violated
Summary of Transaction Concepts
ACID
Baseline:Serial Schedule
Strict 2PL
2PL
Ideal:Serializable Schedule
TransactionCorrectness
Other CC Protocols• Timestamp• Multi-version• Optimistic
Summary
To improve performance
Interleave transactions
Correctness: ACID
Serial schedule is correct
Serializable schedule is equivalent to some serial schedule
Concurrency control enforces serializability
2PL- Deadlock- Starvation- Granularity
Optimistic Timestamping Multi-version