Tx Processing
-
Upload
roheetbhatnagar -
Category
Documents
-
view
1.735 -
download
3
Transcript of Tx Processing
Transaction ProcessingTransaction Processing
15.2
General OverviewGeneral Overview
Where we ‘ve been...
DBA skills for relational DB’s:
Logical Schema Design
E/R diagrams
Decomposition and Normalization
Query languages
RA, RC, SQL
Integrity Constraints
Transactions
Where we are going
Database Implementation Issues
15.3
What Does a DBMS Manage?What Does a DBMS Manage?1. Data organization
E/R Model
Relational Model
2. Data Retrieval Relational Algebra
Relational Calculus
SQL
3. Data Integrity Integrity Constraints
Transactions
15.4
What is a transactionWhat is a transaction
A transaction is the basic or atomic and logical unit of execution or DB processing in an information system which accesses & possibly updates various data items in the DB.
A transaction is a sequence of operations that must be executed as a whole, taking a consistent (& correct) database state into another consistent (& correct) database state;
A collection of actions that make consistent transformations of system states while preserving system consistency
An indivisible unit of processingdatabase in a consistent state
database in a consistent state
database may be temporarily in an inconsistent state during execution
begin Transaction end Transactionexecution of Transaction
Account A Fred Bloggs £1000
Account B Sue Smith £0 Account B Sue Smith £500
Account A Fred Bloggs £500Transfer £500
15.5
Updates in SQLUpdates in SQLAn example:
UPDATE accountSET balance = balance -50WHERE acct_no = A102
account
Dntn: A102: 300
Dntn: A15: 500
Mian: A142: 300
…
…
(1) Read
(2) update
(3) writeTransaction:
1. Read(A)2. A <- A -503. Write(A)
What takes place:
memory
Disk
15.6
The Threat to Data IntegrityThe Threat to Data IntegrityConsistent DB
Name Acct bal-------- ------ ------Joe A-33 300Joe A-509 100
Joe’s total: 400
Consistent DB
Name Acct bal-------- ------ ------Joe A-33 250Joe A-509 150
Joe’s total: 400
transaction
Inconsistent DB
Name Acct bal-------- ------ ------Joe A-33 250Joe A-509 100
Joe’s total: 350
What a Transaction shouldlook like to Joe
What actually happensduring execution
15.7
TransactionsTransactions
What?: Updates to db that can be executed concurrently
Why?:
(1) Updates can require multiple reads, writes on a db
e.g., transfer $50 from A-33 to A509 = read(A) A A -50 write(A) read(B) BB+50 write(B)
(2) For performance reasons, db’s permit updates to be executed concurrently
Concern: concurrent access/updates of data can compromise data integrity
15.8
ACID PropertiesACID Properties
Atomicity: either all operations in a Transaction take effect, or none
i.e. a transaction is an atomic unit of processing and it is either performed entirely or not at all
Consistency: operations, taken together preserve db consistency
i.e. a transaction's correct execution must take the database from one correct state to another
Properties that a Transaction needs to have:
15.9
ACID Properties contd…ACID Properties contd…
Isolation / Independence: intermediate, inconsistent states must be concealed from other Transactions
i.e. the updates of a transaction must not be made visible to other transactions until it is committed (solves the temporary update problem)
Durability / Permanency. If a Transaction successfully completes (“commits”), changes made to db must persist, even if system crashes
i.e. if a transaction changes the database and is committed, the changes must never be lost because of subsequent failure
Serialisability: transactions are considered serialisable if the effect of running them in an interleaved fashion is equivalent to running them serially in some order
15.10
Demonstrating ACIDDemonstrating ACIDTransaction to transfer $50 from account A to account B:
1. read(A)2. A := A – 503. write(A)4. read(B)5. B := B + 506. write(B)
Consistency: total value A+B, unchanged by Xaction
Atomicity: if Transaction fails after 3 and before 6, 3 should not affect db
Durability: once user notified of Transaction commit, updates to A,B shouldnot be undone by system failure
Isolation: other Xactions should not be able to see A, B between steps 3-6
15.11
Threats to ACIDThreats to ACID1. Programmer Errore.g.: $50 substracted from A, $30 added to B threatens consistency
2. System Failurese.g.: crash after write(A) and before write(B) threatens atomicitye.g.: crash after write(B) threatens durability
3. ConcurrencyE.g.: concurrent Transaction reads A, B between steps 3-6
threatens isolation
15.12
IsolationIsolationSimplest way to guarantee: forbid concurrent Xactions!
But, concurrency is desirable:
(1) Achieves better throughput (TPS: transactions per second)
one Transaction can use CPU while another is waiting for disk to service request
(2) Achieves better average response time
short Xactions don’t need to get stuck behind long ones
Prohibiting concurrency is not an option
15.13
IsolationIsolation Approach to ensuring Isolation:
Distinguish between “good” and “bad” concurrency
Prevent all “bad” (and sometime some “good”) concurrency from happening OR
Recognize “bad” concurrency when it happens and undo its effects (abort some transactions)
Pessimistic vs Optimistic CC
Both pessimistic and optimistic approaches require distinguishing between good and bad concurrency
How: concurrency characterized in terms of possible Transaction“schedules”
15.14
Requirements for Database ConsistencyRequirements for Database Consistency
Concurrency Control Most DBMS are multi-user systems. The concurrent execution of many different transactions submitted by
various users must be organised such that each transaction does not interfere with another transaction with one another in a way that produces incorrect results.
The concurrent execution of transactions must be such that each transaction appears to execute in isolation.
Recovery System failures, either hardware or software, must not result in an
inconsistent database
15.15
Transaction as a Recovery UnitTransaction as a Recovery Unit
If an error or hardware/software crash occurs between the begin and end, the database will be inconsistent Computer Failure (system crash)
A transaction or system error
Local errors or exception conditions detected by the transaction
Concurrency control enforcement
Disk failure
Physical problems and catastrophes
The database is restored to some state from the past so that a correct state—close to the time of failure—can be reconstructed from the past state.
A DBMS ensures that if a transaction executes some updates and then a failure occurs before the transaction reaches normal termination, then those updates are undone.
The statements COMMIT and ROLLBACK (or their equivalent) ensure Transaction Atomicity
15.16
RecoveryRecovery
Mirroring keep two copies of the database and maintain them simultaneously
Backup periodically dump the complete state of the database to some form of tertiary
storage
System Logging the log keeps track of all transaction operations affecting the values of
database items. The log is kept on disk so that it is not affected by failures except for disk and catastrophic failures.
15.17
Recovery from Transaction FailuresRecovery from Transaction Failures
Catastrophic failure
Restore a previous copy of the database from archival backup
Apply transaction log to copy to reconstruct more current state by redoing committed transaction operations up to failure point
Incremental dump + log each transaction
Non-catastrophic failure
Reverse the changes that caused the inconsistency by undoing the operations and possibly redoing legitimate changes which were lost
The entries kept in the system log are consulted during recovery.
No need to use the complete archival copy of the database.
15.18
Transaction States Transaction States
For recovery purposes the system needs to keep track of when a transaction starts, terminates and commits.
Begin_Transaction: marks the beginning of a transaction execution;
End_Transaction: specifies that the read and write operations have ended and marks the end limit of transaction execution (but may be aborted because of concurrency control);
Commit_Transaction: signals a successful end of the transaction. Any updates executed by the transaction can be safely committed to the database and will not be undone;
Rollback (or Abort): signals that the transaction has ended unsuccessfully. Any changes that the transaction may have applied to the database must be undone;
Undo: similar to ROLLBACK but it applies to a single operation rather than to a whole transaction;
Redo: specifies that certain transaction operations must be redone to ensure that all the operations of a committed transaction have been applied successfully to the database;
15.19
Entries in the System LogEntries in the System Log
For every transaction a unique transaction-id is generated by the system.
[start_transaction, transaction-id]: the start of execution of the transaction identified by transaction-id
[read_item, transaction-id, X]: the transaction identified by transaction-id reads the value of database item X. Optional in some protocols.
[write_item, transaction-id, X, old_value, new_value]: the transaction identified by transaction-id changes the value of database item X from old_value to new_value
[commit, transaction-id]: the transaction identified by transaction-id has completed all accesses to the database successfully and its effect can be recorded permanently (committed)
[abort, transaction-id]: the transaction identified by transaction-id has been aborted
Credit_labmark (sno NUMBER, cno CHAR, credit NUMBER)old_mark NUMBER; new_mark NUMBER;
SELECT labmark INTO old_mark FROM enrol WHERE studno = sno and courseno = cno FOR UPDATE OF labmark;
new_ mark := old_ mark + credit; UPDATE enrol SET labmark = new_mark WHERE studno = sno and courseno = cno ;
COMMIT;
EXCEPTION WHEN OTHERS THEN ROLLBACK; END credit_labmark;
15.20
active partially committed
committed
failed terminated
BEGIN
TRANSACTION
READ, WRITE
END
TRANSACTION
ROLLBACK ROLLBACK
COMMIT
Transaction executionTransaction execution
A transaction reaches its commit point when all operations accessing the database are completed and the result has been recorded in the log. It then writes a [commit, transaction-id].
If a system failure occurs, searching the log and rollback the transactions that have written into the log a
[start_transaction, transaction-id][write_item, transaction-id, X, old_value, new_value]
but have not recorded into the log a [commit, transaction-id]
15.21
Read and Write Operations of a Read and Write Operations of a TransactionTransaction
Specify read or write operations on the database items that are executed as part of a transaction
read_item(X): reads a database item named X into a program variable also named X.
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the buffer to the program variable named
write_item(X): writes the value of program variable X into the database item named X.
1. find the address of the disk block that contains item X
2. copy that disk block into a buffer in the main memory
3. copy item X from the program variable named X into its current location in the buffer store the updated block in the buffer back to disk (this step updates the database on disk)
XX:=
15.22
Checkpoints in the System LogCheckpoints in the System Log
A [checkpoint] record is written periodically into the log when the system writes out to the database on disk the effect of all WRITE operations of committed transactions.
All transactions whose [commit, transaction-id] entries can be found in the system log will not require their WRITE operations to be redone in the case of a system crash.
Before a transaction reaches commit point, force-write or flush the log file to disk before commit transaction.
Actions Constituting a Checkpoint temporary suspension of transaction execution forced writing of all updated database blocks in main
memory buffers to disk writing a [checkpoint] record to the log and force writing
the log to disk resuming of transaction execution
data
log
15.23
Transaction as a Concurrency UnitTransaction as a Concurrency Unit
Transactions must be synchronised correctly to guarantee database consistency
Account A Fred Bloggs £1000
Account B Sue Smith £0
Account B Sue Smith £500
Account A Fred Bloggs £500
Transfer £500 from A to B
Account C Jill Jones £700Account C Jill Jones £400
Account A Fred Bloggs £800
Transfer £300from C to A
Net result Account A 800Account B 500Account C 400
T1
T2
Sim
ultaneous Execution
15.24
Transaction scheduling algorithmsTransaction scheduling algorithms
Transaction Serialisability The effect on a database of any number of transactions executing
in parallel must be the same as if they were executed one after another
Problems due to the Concurrent Execution of Transactions The Lost Update Problem
The Incorrect Summary or Unrepeatable Read Problem
The Temporary Update (Dirty Read) Problem
15.25
The Lost Update ProblemThe Lost Update Problem Two transactions accessing the same database item have their operations
interleaved in a way that makes the database item incorrect
item X has incorrect value because its update from T1 is “lost” (overwritten)
T2 reads the value of X before T1 changes it in the database and hence the updated database value resulting from T1 is lost
T1: (joe) T2: (fred) X Y
read_item(X); 4X:= X - N; 2
read_item(X); 4X:= X + M; 7
write_item(X); 2read_item(Y); 8
write_item(X); 7Y:= Y + N; 10write_item(Y); 10
X=4Y=8N=2M=3
15.26
The Incorrect Summary or Unrepeatable Read or The Incorrect Summary or Unrepeatable Read or Inconsistent Retrieval ProblemInconsistent Retrieval Problem
One transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records.
The aggregate function may calculate some values before they are updated and others after.
T1: T2: T1 T2 Sumsum:= 0; 0read_item(A); 4sum:= sum + A; 4
read_item(X);. . 4X:= X - N; . 2write_item(X); 2
read_item(X); 2sum:= sum + X; 6read_item(Y); 8sum:= sum + Y; 14
read_item(Y); 8Y:= Y + N; 10write_item(Y); 10
T2 reads X after N is subtracted and reads Y before N is added, so a wrong summary is the result
15.27
Dirty Read or The Temporary Update ProblemDirty Read or The Temporary Update Problem One transaction updates a database item and then the transaction fails. The
updated item is accessed by another transaction before it is changed back to its original value
transaction T1 fails and must change the value of X back to its old value
meanwhile T2 has read the “temporary” incorrect value of X
T1: (joe) T2: (fred) Database Logold
Lognew
read_item(X); 4X:= X - N; 2write_item(X); 2 4 2
read_item(X); 2X:= X- N; -1write_item(X); -1 2 -1
failed write (X) 4 rollback T1log
Joe books seat on flight X
Fred books seat on flight X because Joe was on Flight X
Joe cancels
15.28
Next class we will discuss about
SCHEDULES
Serialisable and non-serialisable schedules
Deadlock & Locking protocols
Deadlock prevention
Recovery Techniques
15.29
SchedulesSchedules
Schedules – sequences that indicate the chronological order in which instructions of concurrent transactions are executed a schedule for a set of transactions must consist of all instructions of
those transactions
must preserve the order in which the instructions appear in each individual transaction
T1123
T2ABCD
T11
23
T2
AB
CD
one possible schedule:
15.30
Example SchedulesExample Schedules
Constraint: The sum of A+B must be the same
Before: 100+50
After: 45+105
T1read(A)A <- A -50write(A)read(B)B<-B+50write(B)
T2
read(A)tmp <- A*0.1A <- A – tmpwrite(A)read(B)B <- B+ tmpwrite(B)
Transactions: T1: transfers $50 from A to B T2: transfers 10% of A to B
=150, consistent
Example 1: a “serial” schedule
15.31
Example ScheduleExample Schedule
Another “serial” schedule:
T1
read(A)A <- A -50write(A)read(B)B<-B+50write(B)
T2read(A)tmp <- A*0.1A <- A – tmpwrite(A)read(B)B <- B+ tmpwrite(B)
Before: 100+50
After: 40+110
Consistent but not the same as previous schedule..
Either is OK!
=150, consistent
15.32
Example Schedule (Cont.)Example Schedule (Cont.)Another “good” schedule:
T1read(A)A <- A -50write(A)
read(B)B<-B+50write(B)
T2
read(A)tmp <- A*0.1A <- A – tmpwrite(A)
read(B)B <- B+ tmpwrite(B)
Effect: Before After A 100 45 B 50 105
Same as one of the serial schedulesSerializable
15.33
Example Schedules (Cont.)Example Schedules (Cont.) A “bad” schedule
Before: 100+50 = 150
After: 50+60 = 110 !!
Not consistent
T1read(A)A <- A -50
write(A)read(B)B<-B+50write(B)
T2
read(A)tmp <- A*0.1A <- A – tmpwrite(A)read(B)
B <- B+ tmpwrite(B)
15.34
SerializabilitySerializabilityHow to distinguish good and bad schedules?
for previous example, any schedule leaving A+B = 150 is good
Q: could we express good schedules in terms of integrity constraints?
Ans: No. In general, won’t know A+B, can’t check value of A+B at given time
for consistency
Alternative: Serializability
15.35
SerializabilitySerializability
All schedules
Serializable schedules
“conflict serializable” schedules
“view serializable” schedules
SQL serializable
Serializable: A schedule is serializable if its effects on the db are the equivalent to some serial schedule.
Hard to ensure; more conservative approaches are used in practice
15.36
Conflict SerializabilityConflict SerializabilityConservative approximation of serializability
(conflict serializable => serializable but <= doesn’t hold)
Idea: can we swap the execution order of consecutive operation
wo/ affecting state of db, Xactions so as to leave a serial schedule?
15.37
Conflict Serializability (Cont.)Conflict Serializability (Cont.) If a schedule S can be transformed into a schedule S´ by a
series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.
We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule
Ex:
T1 ….read(A)
T2 ….
read(A) . . .
T1 ….
read(A)
T2 ….
read(A)
. . .
can be rewrittento equivalentschedule
15.38
Conflict Serializability (Cont.)Conflict Serializability (Cont.)
T11. Read(A)2. A A -503. Write(A)
4.Read(B)5. B B + 506. Write(B)
T2
a. Read(A)b. tmp A * 0.1c. A A - tmp d.Write(A)
e. Read(B)f. B B + tmpg. Write(B)
Swaps:
4 <->d4<->c4<->b4<->a
T1, T2
Example:
5<->d5<->c5<->b5<->a
6<->d6<->c6<->b6<->a
Conflict serializble
15.39
Conflict Serializability (Cont.)Conflict Serializability (Cont.)
T1read(A)A <- A -50write(A)read(B)B<-B+50write(B)
T2
read(A)tmp <- A*0.1A <- A – tmpwrite(A)read(B)B <- B+ tmpwrite(B)
The effects of swaps
Because example schedule couldbe swapped to this schedule (<T1, T2>)
example schedule is conflict serializable
15.40
The swaps we madeThe swaps we madeA. Reads and writes of different data elements
e.g.: T1 T2 T1 T2 write(A) read(B) = read(B) write(A)
OK because: value of B unaffected by write of A ( read(B) has same effect ) write of A is not undone by read of B ( write(A) has same effect)
Note : T1 T2 T1 T2 write(A) read(A) = read(A) write(A)
Why? In the first, T1 reads value of A written by T2. May be different value than previous value of A
15.41
SwapsSwaps T1 T2 T1 T2 write(A) read(A) = read(A) write(A)
What affect on state of db could above swap make?
Suppose what follows read(A) in T1 is: read(C) C C+A write(C)
Unless T2 writes the same value to A, the first schedule will leave a different value for C than the second
15.42
The swaps We MadeThe swaps We MadeA. Reads and writes of different data elements 4 <-> d 6 <-> a
B. Reads of different data elements: 4 <-> a
C. Writes of different data elements: 6 <-> d
D. Any operation with a local operation
OK because local operations don’t go to disk. Therefore, unaffected by other operations: 4 <-> b 5 <-> a .... 4 <-> c
To simplify, local operations are ommited from schedules
15.43
Conflict Serializability (Cont.)Conflict Serializability (Cont.)
T11. Read(A)2. Write(A)
3. Read(B)4. Write(B)
T2
a. Read(A)b. Write(A)
c. Read(B)d. Write(B)
Swaps:
3 <->b3<->a4<->b4<->a
T1, T2
Previous example wo/ local operations:
15.44
Swappable OperationsSwappable Operations
Swappable operations:
1. Any operation on different data element
2. Reads of the same data (Read(A))
(regardless of order of reads, the same value for A is read)
Conflicts:
T1: Read (A)
T1: Write (A)
T2: Read(A) T2: Write(A)
OK R/W Conflict
W/R Conflict W/W Conflict
15.45
ConflictsConflicts
(1) READ/WRITE conflicts:
conflict because value read depends on whether write has occurred
(2) WRITE/WRITE conflicts:
conflict because value left in db depends on which write occurred last
(3) READ/READ : no conflict
15.46
Conflict SerializabilityConflict Serializability
T1 T2
(1) read(Q)write(Q) (a)
(2) write(Q)
Q: Is the following shcedule conflict serializable? If so, what’s its equivalent serial schedule? If not, why?
Ans: No. Swapping (a) with (1) is a R/W conflict, and swapping (a) with (2)is a W/W conflict. Not equivalent to <T1, T2> or <T2, T1>
15.47
Conflict SerializabilityConflict SerializabilityQ: Is the following schedule conflict serializable? If so, what’s its equivalent serial schedule? If not, why?
T1
(1) Read(A)
(2) Write(S)
T2
(a) Write(A)(b) Read(B)
T3
(x) Write(B)(y) Read(S)
Ans.: NO.All possible serial schedules arenot conflict equivalent.
<T1, T2, T3><T1, T3, T2><T2, T1, T3> . . . . . .
15.48
Conflict SerializabilityConflict Serializability
Testing: too expensive to test a schedule by swapping operations
(usually schedules are big!)
Alternative: “Precedence Graphs”
* vertices = Xactions
* edges = conflicts between Xactions
E.g.: Ti Tj if: (1) Ti, Tj have a conflicting operation, and
(2) Ti executed its operation in conflict first
15.49
Precedence GraphPrecedence Graph
An example of a “Precedence Graph”:
T1
Read(A)
T2
Write(A)Read(B)
T3
Write(B)Read(S)
T1
T2
T3
R/W(A)
R/W(B
)
Q: When is a schedule notconflict serializable?
15.50
Precedence GraphPrecedence Graph
Another example:
T1
Read(A)
Write(S)
T2
Write(A)Read(B)
T3
Write(B)Read(S)
T1
T2
T3
R/W(A)
R/W(B
)R/W
(S)
Not conflict serializable!!Because there is a cycle in the PG,the cycle creates contradiction
15.51
Example Schedule (Schedule B)Example Schedule (Schedule B)T1 T2 T3 T4 T5
read(X)read(Y)read(Z)
read(V)read(W)
read(Y)write(Y)
write(Z)read(U)
read(Y)write(Y)read(Z)write(Z)
read(U)write(U)
15.52
Precedence Graph for Schedule APrecedence Graph for Schedule A
T3T4
T1 T2
R/W (Y)
R/W(Y)R/W(Z)
R/W(Z) , W/W(Z)
R/W(y), R/W
(Z)
15.53
Test for Conflict SerializabilityTest for Conflict Serializability
A schedule is conflict serializable if and only if its precedence graph is acyclic.
Cycle-detection algorithms exist which take order n2 time, where n is the number of vertices in the graph. (Better algorithms take order n + e where e is the number of edges.)
If precedence graph is acyclic, the serializability order can be obtained by a topological sorting of the graph.
For example, a serializability order for Schedule A would beT5 T1 T3 T2 T4 .
15.54
View SerializabilityView Serializability “View Equivalence”:
S and S´ are view equivalent if the following three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S´, also read the initial value of Q.
2. For each data item Q, if transaction Ti reads the value of Q written by Tj in S, it also does in S’
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S´.
As can be seen, view equivalence is also based purely on reads
and writes alone.
15.55
View Serializability (Cont.)View Serializability (Cont.)
A schedule S is view serializable it is view equivalent to a serial schedule. Example:
T1
Read(A)
Write(A)
T2
Write(A)
T3
Write(A)
Every view serializable schedule that is not conflict serializable has blind writes.
Is this scheduleview serializable?conflict serializable?
VS: Yes. Equivalent to <T1, T2, T3>
CS: No. PG has a cycle.
15.56
View SerializabilityView Serializability(1) We just showed: conflict serializable
view serializable
(2) We can also show:view serializable
serializable
15.57
Other Notions of SerializabilityOther Notions of SerializabilityEquivalent to the serial schedule < T1, T2 >, yet is not conflict equivalent or view equivalent to it.
T1
Read(A) A A -50 Write(A)
Read(B)B B + 50Write(B)
T2
Read(B)B B - 10Write(B)
Read(A)A A + 10Write(A)
Determining such equivalence
requires analysis of operations
other than read and write.
15.58
Transaction Definition in SQLTransaction Definition in SQL Data manipulation language must include a construct for
specifying the set of actions that comprise a transaction. In SQL, a transaction begins implicitly. A transaction in SQL ends by:
Commit work commits current transaction and begins a new one.
Rollback work causes current transaction to abort.
Levels of consistency specified by SQL-92: Serializable — default (more conservative than conflict
serializable) Repeatable read Read committed Read uncommitted
15.59
Transactions in SQLTransactions in SQL
Serializable — default
- can read only committed records
- if T is reading or writing X, no other Transaction can change X until T commits
- if T is updating a set of records (identified by WHERE clause), no other Transaction can change this set until T commits
Weaker versions (non-serializable) levels can also be declared
Idea: tradeoff: More concurrency => more overhead to ensure
valid schedule.
Lower degrees of consistency useful for gathering approximateinformation about the database, e.g., statistics for query optimizer.