TRANSACTION PROCESSING CONCEPTS - …pclsoft.weebly.com/uploads/2/9/8/3/298350/unit-iv-dbms.pdf ·...
UNIT-IV
TRANSACTION PROCESSING CONCEPTS
Transaction
A Transaction refers to a logical unit of work in DBMS, which comprises a set of DML
statements that are to be executed atomically (indivisibly).
Commit of a Transaction
COMMIT of a Transaction refers to a state when the transaction completes successfully
and all its updates have been made safe.
Abort of a Transaction
Abort of a transaction refers to a state when the transaction has been rolled-back to its
initial state, subsequent to a failure. Effect of an aborted transaction is as though it had
never commenced execution.
ACID Properties of a Transaction
The execution of a transaction Ti must satisfy some properties that are referred to as
'ACID'. The acronym 'ACID' stands for 'Atomicity', 'Consistency', 'Isolation' and
'Durability', which are explained below:-
1. Atomicity The property of Atomicity refers to the system requirement that a
Transaction must be executed atomically (indivisibly). This means that once a
transaction has commenced execution, it should either be executed fully or not at
all. Suppose a transaction fails during its execution (either due to its own internal
error or due to some system failure), then to meet the requirement of atomicity,
the transaction must be rolled back to its initial state; i.e. the effect of its partial
execution must be undone. The effect of rollback would be as though the
transaction had never commenced execution. The task of rolling back the failed
transactions is performed by DBMS, with the help of entries in the system log-
file. If the transaction had failed due to some internal logical error, then the
system will generate an error message and dump the transaction. However, if the
transaction had failed on account of some system failure (hardware/software
failure), then the transaction is automatically restarted after the rollback.
2. Consistency A Transaction T, while executing in isolation (i.e. with no other
transactions interfering with its execution), must preserve the consistency of the
database on which the transaction is operating i.e. T must transform the database
from one consistent state to another consistent state.
3. Isolation When a number of Transactions are executing Concurrently, then
the system must ensure virtual isolation amongst the transactions. This implies
that for a pair of concurrent transactions {Ti and Tj}, it must appear to Ti as if Ti
had started its execution only after Tj had already finished OR Ti had already
finished before Tj started execution. Ensuring this property is known as
concurrency control, which is one of the major functions of a DBMS.
4. Durability When a Transaction Ti commits successfully, all its updates must
be made durable i.e. all the updates of Ti must persist, even if the system fails
immediately after the committing of Ti. To ensure this property, all updates of a
committed transaction are made safe by the DBMS; either by applying the
updates physically to the database or by making suitable entries in the system log
file, on a stable (non-volatile) storage media.
Database Access Operations
Database access operations mainly comprise the following:-
Read (X) It transfers data item X from the database to a local buffer of the executing
Transaction. Read (X) can be performed by more than one transaction
concurrently.
Write (X) It transfers a data item X from the local buffer of the executing transaction
T to the database. Write (X) cannot be performed by more than one
transaction concurrently.
Example of a Transaction
Suppose Ti is a transaction that transfers Rs. 50/= from account A to account B.
Ti: read (A);
A:= A-50;
write(A);
read(B);
B:= B +50;
Write(B);
Let us consider the ACID properties wrt Ti.
Consistency: The consistency requirement, in the case of Ti, implies that sum of A and
B must remain unchanged by its execution, since the amount is being only transferred
internally from A to B.
Atomicity: Whenever Ti is executed, either it should be executed fully or not at all.
Suppose the system fails at a point when write (A) has been performed, but write (B) is
yet to be performed. The value of A+B, as reflected in the database, would then be short by
Rs 50, thus violating the consistency requirement. The solution is to roll
back the transaction, thus reverting to the old consistent state that existed prior
to the commencement of Ti. Reverting to the old state is achieved by restoring A to
its old value. The old value of A is retrieved from the Log-File (since when A was
updated, its old value must have been preserved in the Log-File).
Durability: Once Ti commits successfully, the updated values of A & B must persist;
even though the system may fail immediately thereafter.
Isolation: During the execution of Ti, after Write(A) is executed and before Write (B) is
executed, the database would be momentarily in an inconsistent state. During this
period, if another concurrent Transaction Tj reads the values of A and B and performs
some of its own update operations, then the database would be left in an inconsistent
state. To obviate such an eventuality, the system must ensure that the data items,
modified by a transaction Ti, are not permitted to be accessed by another Concurrent
Transaction Tj, till Ti COMMITS. The isolation of Concurrent Transactions is ensured by
a system component, called Concurrency Control Component.
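The atomicity argument above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account balances, the in-memory dictionary standing in for the database, the list standing in for the Log-File, and the names transfer and SimulatedCrash are all hypothetical and not part of the notes.

```python
class SimulatedCrash(Exception):
    """Stands in for a system failure in the middle of a transaction."""


def transfer(db, log, src, dst, amount, crash_after_first_write=False):
    """Transfer `amount` from src to dst; undo partial work on failure."""
    try:
        a = db[src]                  # read(A) into a local buffer
        a -= amount
        log.append((src, db[src]))   # preserve the old value before write(A)
        db[src] = a                  # write(A)
        if crash_after_first_write:
            raise SimulatedCrash("failure between write(A) and write(B)")
        b = db[dst]                  # read(B)
        b += amount
        log.append((dst, db[dst]))   # preserve the old value before write(B)
        db[dst] = b                  # write(B)
        log.clear()                  # commit: undo entries are no longer needed
        return True
    except SimulatedCrash:
        # Roll back: restore old values in reverse order of the writes.
        while log:
            item, old_value = log.pop()
            db[item] = old_value
        return False


accounts = {"A": 5000, "B": 3000}    # assumed opening balances
undo_log = []
ok = transfer(accounts, undo_log, "A", "B", 50, crash_after_first_write=True)
print(ok, accounts)
```

After the simulated failure the balances are back at their old values, so A+B is unchanged, which is the effect the rollback is meant to achieve.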
States of a Transaction
State diagram of Transaction Processing
Active When a transaction is fired (starts execution), it enters Active State. The
transaction remains in this state during its entire execution.
Partially Committed A transaction enters this state when its last statement has been
executed, but its updates are not yet made safe.
Committed A Partially-Committed transaction enters COMMITTED state, when all
its updates have been made safe by the DBMS; either in the database itself or in a
Log File (both on a non-volatile media)
[Figure: state diagram of transaction processing, with the states Active, Partially Committed, Committed, Failed, Aborted and Terminated]
Failed A transaction enters this state when it is not able to proceed in a normal manner,
either due to its own internal error or due to some system failure (hardware failure or
software failure).
Aborted A Transaction T enters Aborted State, when it has been rolled-back after a
failure. This is the state when the failed Transaction T has reverted back to the initial
state, which existed prior to its commencement. The end result is as though T had never
commenced execution.
Terminated A Transaction is said to be terminated, when it has either committed
successfully or has aborted after a failure.
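The states described above can be sketched as a small state machine. The transition table below is an assumption read off the state descriptions (in particular, the move from Partially Committed to Failed is the usual textbook convention and is not stated explicitly in these notes); the class and method names are illustrative.

```python
# Allowed transitions between transaction states (assumed from the text above).
ALLOWED = {
    "Active": {"Partially Committed", "Failed"},
    "Partially Committed": {"Committed", "Failed"},
    "Committed": {"Terminated"},
    "Failed": {"Aborted"},
    "Aborted": {"Terminated"},
    "Terminated": set(),
}


class Transaction:
    def __init__(self, name):
        self.name = name
        self.state = "Active"    # a transaction enters Active when it starts

    def move_to(self, new_state):
        """Change state, rejecting any transition not listed in ALLOWED."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.name}: {self.state} -> {new_state} is not allowed"
            )
        self.state = new_state


t = Transaction("T1")
for s in ("Partially Committed", "Committed", "Terminated"):
    t.move_to(s)
print(t.state)
```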
Restarting of a Transaction after a Failure When a Transaction has been
aborted after a failure, the system would have two options:-
(a) It may restart the transaction, but only if the transaction was aborted not due
to its own internal error, but due to some system failure (hardware or software
failure). A restarted transaction would be treated as a new transaction.
(b) It may dump the transaction, if the failure was due to its internal error, since
the transaction in its present state is not fit to be executed. The user of the
failed transaction is intimated about the cause of failure, through a suitable
message. The transaction would need to be debugged before its resubmission
for execution.
Potential of Concurrency amongst Transactions
A transaction involves multiple steps; some of the steps will involve an I/O activity and
others will involve CPU activity. Whenever an executing transaction leaves the CPU and
goes for an I/O, another ready transaction can be taken up for execution by the CPU.
Thus, there is a potential of parallelism (concurrency) amongst the transactions.
Advantages offered by the Concurrent execution of transactions
(a) Improved system throughput and Resource Utilization When an
executing Transaction Ti requests I/O, it goes to 'wait' state till its I/O is
completed. During this period, the CPU is free and another ready transaction is
assigned to the CPU. Thus, as long as there is a transaction available to engage
the CPU, the CPU is not left idling. Thus, CPU activity continues concurrently
with the I/O activities on various devices. This will enhance the effective
utilization of the resources, thus improving the system throughput.
(b) Reduced average waiting time of transactions At any moment, there
will be a mix of transactions of varying lengths. In concurrent processing, the
relatively shorter transactions would get completed during I/O bursts of longer
transactions, thus reducing the average waiting time of the transactions
considerably.
Concurrency Control The exploitation of concurrency amongst transactions
throws up the issue of concurrency control. This control has to be exercised by the
DBMS to ensure that the execution of the concurrent transactions is dovetailed in such a
manner that the concurrent execution is virtually equivalent to serial execution. So, the
system is able to draw benefits of concurrency while preserving database consistency at
the same time.
Execution Schedule of Transactions
A Schedule refers to the chronological order, in which instructions of concurrent
transactions are executed by the system.
Serial Schedule
A serial schedule is the one, in which transactions are executed, strictly one after another.
The execution of next transaction is taken up, only after the successful completion of the
previous transaction.
Examples Suppose the initial balance of accounts is A= 5000 and B =3000
T1: Transfer Rs 1000 from Account A to B
T2: Credit Rs 500 to Account A
Schedule S1
A Serial Schedule: T1 , T2 (i.e. T1 followed by T2)
T1                  T2                  Database State
read(A);                                A = 5000
A := A - 1000;
write(A);                               A = 4000
read(B);                                B = 3000
B := B + 1000;
write(B);                               B = 4000
                    read(A);            A = 4000
                    A := A + 500;
                    write(A);           A = 4500
S1 : T1 , T2
The balances at the end of S1 will be: A = 4500 and B = 4000 (As expected)
Schedule S2
A Serial Schedule: T2 , T1 (i.e. T2 followed by T1)
T1                  T2                  Database State
                    read(A);            A = 5000
                    A := A + 500;
                    write(A);           A = 5500
read(A);                                A = 5500
A := A - 1000;
write(A);                               A = 4500
read(B);                                B = 3000
B := B + 1000;
write(B);                               B = 4000
S2 : T2 , T1
The balances at the end of S2 will be: A = 4500 and B = 4000 (As expected)
The Schedules S1 & S2 execute the transactions T1 and T2 serially i.e. T1 followed by T2
in Schedule S1 and T2 followed by T1 in Schedule S2. Such Schedules are called serial
schedules. Execution of Serial Schedules would preserve consistency of the database;
without requiring any additional mechanism. But, such schedules do not exploit the
potential parallelism amongst the transactions. Thus, the utilization of resources will
remain poorer and the average waiting time of transactions will remain higher than
with concurrent schedules, resulting in reduced system throughput.
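The two serial schedules S1 and S2 can be replayed with a short Python sketch. The dictionary standing in for the database and the function names t1, t2 and run_serial are illustrative assumptions; the balances and amounts are the ones used in the examples above.

```python
def t1(db):
    """T1: transfer Rs 1000 from account A to account B."""
    a = db["A"]        # read(A)
    a -= 1000
    db["A"] = a        # write(A)
    b = db["B"]        # read(B)
    b += 1000
    db["B"] = b        # write(B)


def t2(db):
    """T2: credit Rs 500 to account A."""
    a = db["A"]        # read(A)
    a += 500
    db["A"] = a        # write(A)


def run_serial(order):
    """Run the given transactions strictly one after another."""
    db = {"A": 5000, "B": 3000}
    for txn in order:
        txn(db)
    return db


s1 = run_serial([t1, t2])   # Schedule S1: T1 followed by T2
s2 = run_serial([t2, t1])   # Schedule S2: T2 followed by T1
print(s1, s2)
```

Both orders end with A = 4500 and B = 4000, matching the tables for S1 and S2.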
Concurrent Schedules
When more than one Transaction is taken up for execution at the same time
(concurrently), the system would schedule one of the concurrent transactions (say Ti) to
take control of the CPU. During the execution, when Ti requests I/O, Ti would go to a
'wait' state, waiting for completion of its I/O. Since the CPU would now be free, the
system would schedule another concurrent transaction (say Tj) that may be ready to take
control of the CPU. When the I/O requested by Ti is completed, it would come out of the 'wait'
state and indicate its readiness to take control of the CPU again, as per its turn (which
will be determined by the system). Thus, the control of the CPU is multiplexed amongst a
number of concurrent transactions. As long as there are some transactions ready to take
control of the CPU, the CPU will not be left idling. This exploitation of concurrency amongst
the transactions would reduce their average waiting time and increase the System
Throughput.
Examples.
Schedule S3
A Concurrent Schedule: T1, T2, T1
T1                  T2                  Database State
read(A);                                A = 5000
A := A - 1000;
write(A);                               A = 4000
                    read(A);            A = 4000
                    A := A + 500;
                    write(A);           A = 4500
read(B);                                B = 3000
B := B + 1000;
write(B);                               B = 4000
S3 : T1 , T2, T1
The balances at the end of S3 will be: A = 4500 and B = 4000 (As expected)
Schedule S4
Another Concurrent Schedule: T1, T2, T1
T1                  T2                  Database State
read(A);                                A = 5000
A := A - 1000;
                    read(A);            A = 5000
                    A := A + 500;
                    write(A);           A = 5500
write(A);                               A = 4000
read(B);                                B = 3000
B := B + 1000;
write(B);                               B = 4000
S4 : T1 , T2, T1
The balances at the end of S4 will be: A = 4000 and B = 4000 (Not as expected)
The end result is not as expected, since the update on 'A' performed by T2 has been lost
in the concurrency.
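The difference between S3 and S4 can be reproduced by replaying the interleaved steps, giving each transaction its own local buffer. This is a minimal sketch: the step encoding (transaction, action, item, amount) and the function name run_schedule are assumptions made for illustration.

```python
def run_schedule(steps, db):
    """Execute (txn, action, item, amount) steps in the given order.

    Each transaction works on its own local buffer; only "write"
    copies the buffered value back to the shared database.
    """
    db = dict(db)
    buffers = {}
    for txn, action, item, amount in steps:
        local = buffers.setdefault(txn, {})
        if action == "read":
            local[item] = db[item]
        elif action == "add":
            local[item] += amount
        elif action == "write":
            db[item] = local[item]
        else:
            raise ValueError(f"unknown action: {action}")
    return db


initial = {"A": 5000, "B": 3000}

S3 = [
    ("T1", "read", "A", None), ("T1", "add", "A", -1000), ("T1", "write", "A", None),
    ("T2", "read", "A", None), ("T2", "add", "A", 500), ("T2", "write", "A", None),
    ("T1", "read", "B", None), ("T1", "add", "B", 1000), ("T1", "write", "B", None),
]

S4 = [
    ("T1", "read", "A", None), ("T1", "add", "A", -1000),
    ("T2", "read", "A", None), ("T2", "add", "A", 500), ("T2", "write", "A", None),
    ("T1", "write", "A", None),   # overwrites T2's update: the lost update
    ("T1", "read", "B", None), ("T1", "add", "B", 1000), ("T1", "write", "B", None),
]

result_s3 = run_schedule(S3, initial)
result_s4 = run_schedule(S4, initial)
print(result_s3, result_s4)
```

In S4, T1 writes back a value computed from the balance it read before T2's credit, so the Rs 500 credited by T2 disappears.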
Serial Schedules Vs Concurrent Schedules
From the viewpoint of database consistency, serial schedules are always safe, but they
fail to exploit any parallelism amongst the transactions. On the other hand, a concurrent
schedule, if left to itself, may fail to preserve database consistency as demonstrated above
in schedule S4. So, there is a requirement of concurrency control to be exercised by the
DBMS.
Equivalent Schedules
Two Schedules S and S' are said to be Equivalent if, when executed independently on the
same consistent initial state of the database, each of them transforms the database to
the same consistent final state.
Serializable Schedules
Suppose S is a Concurrent Schedule and there exists a Schedule S', which is serial and
logically equivalent to S, then S is said to be a Serializable Schedule. This refers to a
situation, wherein the concurrent execution of two transactions (say Ti and Tj) is
logically equivalent to their serial execution i.e.:-
- Ti followed by Tj or
- Tj followed by Ti
For example, the Schedule S3 is logically equivalent to Schedule S1 (a serial schedule).
So, the Schedule S3 is called a Serializable Schedule. Schedule S4, on the other hand, does not
have any equivalent serial schedule; thus S4 is not a serializable schedule. It may be
noted that in Schedule S4, the transaction T2 reads the old value of data item A, after T1
has read it but before T1 has written its update. As a result, the write (A) of T2 is
lost, since the update gets overwritten by T1.
Serialization If left entirely to the Operating System, to decide on the interleaving of the
concurrent transactions, it would not be possible to predict the end results and the
database may be left in an inconsistent state. So, it is to be ensured by the DBMS that any
schedule that executes, must leave the database in a consistent state. The database
component that ensures this aspect is called concurrency-control-component. The
schedule should in some sense be equivalent to a serial schedule. This process is
called serialization.
The only significant operations of a transaction, from a scheduling point of view, are read
and write instructions. So, we will care about only these instructions in the schedules and
ignore the others.
Types of Serializability The serializability of schedules is of two types:-
(a) Conflict Serializability
(b) View Serializability
Conflict Serializability
This is based on the concept of logical swapping of non-conflicting instructions in a
schedule.
Non-Conflicting Instructions
Let a Schedule S have two consecutive instructions Ii and Ij belonging to two
concurrent Transactions Ti and Tj respectively, such that Ii ∈ Ti and Ij ∈ Tj.
The two consecutive instructions would be called non-conflicting, if they satisfy any of
the following conditions:-
(a) The instructions are referring to the access of different data items. For
example Ii may be referring to the access of data item P and Ij may be
referring to access of another data item Q.
(b) Both are referring to the access of the same data item (say Q) and both are
only Read instructions.
If the non-conflicting instructions are swapped on the time-scale, the end state of
the database remains unchanged. So, logical swapping of two non-conflicting instructions
makes no difference as far as database consistency is concerned.
Conflicting Instructions
On the other hand, two consecutive instructions Ii and Ij, belonging to two concurrent
Transactions Ti and Tj respectively such that Ii ∈ Ti and Ij ∈ Tj, are called conflicting
instructions if both are referring to the access of the same data item (say Q) and at least one
of the two instructions is a Write (Q) instruction. In this case, changing the order of the
two instructions would affect the end-results of database updates.
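The two definitions above reduce to a single test, sketched below in Python. The Op record and the function name conflicts are illustrative assumptions; the rule itself is the one stated in the text (different transactions, same data item, at least one Write).

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str      # transaction name, e.g. "T1"
    action: str   # "read" or "write"
    item: str     # data item, e.g. "Q"


def conflicts(p: Op, q: Op) -> bool:
    """True if p and q belong to different transactions, access the
    same data item, and at least one of them is a write."""
    return (
        p.txn != q.txn
        and p.item == q.item
        and "write" in (p.action, q.action)
    )


print(conflicts(Op("T1", "read", "Q"), Op("T2", "write", "Q")))
```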
Conflict-Equivalence of Schedules
Two schedules S and S' are said to be Conflict-Equivalent, if by a series of SWAPS of
non-conflicting instructions of S, the schedule S gets transformed to Schedule S'. For
example Schedule S5 and Schedule S6 are Conflict-Equivalent Schedules.
Conflict Serializable Schedule
A Concurrent (Non-Serial) Schedule S is said to be Conflict-Serializable, if there
exists a serial schedule S' that is Conflict-Equivalent to S.
Example of a Conflict-Serializable Schedule
Schedule S5
Schedule S5 is same as Schedule S3 (omitting the instructions other than read and write).
T1                  T2
read(A);
write(A);
                    read(A);
                    write(A);
read(B);
write(B);
The read(B) and write(B) instructions of T1 can be swapped with read(A) and write(A)
instructions of T2, resulting in an equivalent Schedule S6 (shown below), which is serial.
Schedule S6
T1                  T2
read(A);
write(A);
read(B);
write(B);
                    read(A);
                    write(A);
Thus, the Non-Serial Schedule S3 (which is same as Schedule S5) is a Conflict-
Serializable Schedule, since it is conflict equivalent to a Serial Schedule S6.
Example of a Non-Serializable Schedule
Schedule S7
T1                  T2
read(A);
                    read(A);
                    write(A);
write(A);
read(B);
write(B);
Schedule S7 is the same as Schedule S4 (omitting the instructions other than read and write).
No swapping of non-conflicting instructions will result in a Serial Schedule. Thus
Schedule S4 is not Conflict-Serializable.
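Conflict-equivalence can also be checked without performing the swaps one by one. Swapping adjacent non-conflicting instructions never changes the relative order of any conflicting pair, so two schedules over the same instructions are conflict-equivalent exactly when every conflicting pair appears in the same order in both. The sketch below uses that observation; the tuple encoding and function names are assumptions, and it assumes each (transaction, action, item) tuple occurs at most once per schedule.

```python
def conflicts(p, q):
    """p and q are (txn, action, item) tuples."""
    return p[0] != q[0] and p[2] == q[2] and "write" in (p[1], q[1])


def conflict_pairs(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    pairs = set()
    for i, p in enumerate(schedule):
        for q in schedule[i + 1:]:
            if conflicts(p, q):
                pairs.add((p, q))
    return pairs


def conflict_equivalent(s, t):
    """Same operations, and every conflicting pair in the same order."""
    return sorted(s) == sorted(t) and conflict_pairs(s) == conflict_pairs(t)


S5 = [("T1", "read", "A"), ("T1", "write", "A"),
      ("T2", "read", "A"), ("T2", "write", "A"),
      ("T1", "read", "B"), ("T1", "write", "B")]
S6 = [("T1", "read", "A"), ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "write", "B"),
      ("T2", "read", "A"), ("T2", "write", "A")]       # serial: T1, T2
S7 = [("T1", "read", "A"),
      ("T2", "read", "A"), ("T2", "write", "A"),
      ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "write", "B")]
T2_THEN_T1 = [("T2", "read", "A"), ("T2", "write", "A"),
              ("T1", "read", "A"), ("T1", "write", "A"),
              ("T1", "read", "B"), ("T1", "write", "B")]  # serial: T2, T1

print(conflict_equivalent(S5, S6))
print(conflict_equivalent(S7, S6), conflict_equivalent(S7, T2_THEN_T1))
```

S5 matches the serial schedule S6, while S7 matches neither of the two possible serial orders.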
Testing of Serializability of a Schedule
We can test the conflict-serializability of a concurrent schedule by using Precedence
Graph Method, explained below:-
Precedence Graph Method
1. For the given Schedule, draw a Precedence Graph as follows:-
(a) Each Transaction Ti participating in the Schedule S will be represented
by a Vertex
(b) For each data item Q accessed in the Schedule S, there will be an edge
from Ti to Tj
(Indicating that Ti precedes Tj), provided any of the following three
conditions holds:-
(i) Ti executes Write (Q) before Tj executes Read (Q)
(ii) Ti executes Read (Q) before Tj executes Write (Q)
(iii) Ti executes Write (Q) before Tj executes Write (Q)
2. Test of Serializability
If (no cycle is detected in the Precedence Graph)
then the Schedule S is Conflict–Serializable
else it is NOT Conflict-Serializable.
Examples:
Precedence Graph for Schedule S5 (same as Schedule S3)
- T1 Reads (A) before T2 Writes (A); so draw an edge from T1 to T2.
- T1 Writes (A) before T2 Reads (A); so draw an edge from T1 to T2 (already drawn).
- T1 Writes (A) before T2 Writes (A); so draw an edge from T1 to T2 (already drawn).
In all the above cases, there is an edge from T1 to T2, as shown below:-
[Figure: precedence graph with a single edge T1 → T2]
The Precedence Graph has no cycle; thus Schedule S3 is Conflict-Serializable.
Precedence Graph for Schedule S7 (same as Schedule S4)
- T1 Reads (A) before T2 Writes (A); so draw an edge from T1 to T2.
- T2 Reads (A) before T1 Writes (A); so draw an edge from T2 to T1.
(We can stop at this point itself, since a cycle T1 → T2 → T1 has been detected; so the
schedule is not Conflict-Serializable.)
However, we can check other conflicting situations also:-
- T2 Writes (A) before T1 Writes (A); so draw an edge from T2 to T1 (already drawn).
The Precedence Graph has a cycle; thus Schedule S4 is NOT Conflict-Serializable.
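The Precedence Graph Method can be sketched as two small functions: one that collects the edges and one that looks for a cycle with a depth-first search. The schedule encoding as (transaction, action, item) tuples and the function names are assumptions made for this illustration.

```python
def precedence_graph(schedule):
    """Edges (Ti, Tj): an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (ti, ai, qi) in enumerate(schedule):
        for tj, aj, qj in schedule[i + 1:]:
            if ti != tj and qi == qj and "write" in (ai, aj):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph given as edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph[u]:
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return any(dfs(u) for u in graph if u not in done)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


S5 = [("T1", "read", "A"), ("T1", "write", "A"),
      ("T2", "read", "A"), ("T2", "write", "A"),
      ("T1", "read", "B"), ("T1", "write", "B")]
S7 = [("T1", "read", "A"),
      ("T2", "read", "A"), ("T2", "write", "A"),
      ("T1", "write", "A"),
      ("T1", "read", "B"), ("T1", "write", "B")]

print(precedence_graph(S5), is_conflict_serializable(S5))
print(is_conflict_serializable(S7))
```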
View Serializability
In some situations, a schedule S may not be conflict-serializable; but it may be equivalent
to a serial schedule S', which, starting from a given initial state of a database, produces the
same end-results in the database as produced by S starting from the same initial state. For
example consider Schedule S8:-
Schedule S8
T1                  T2                  T3
Read (A);
                    Write (A);
Write (A);
                                        Write (A);
The Precedence Graph of Schedule S8:-
- T1 Reads (A) before T2 Writes (A); so draw an edge from T1 to T2.
- T1 Reads (A) before T3 Writes (A); so draw an edge from T1 to T3.
- T2 Writes (A) before T1 Writes (A); so draw an edge from T2 to T1.
- T2 Writes (A) before T3 Writes (A); so draw an edge from T2 to T3.
[Figure: precedence graph of Schedule S8 with edges T1 → T2, T2 → T1, T1 → T3 and T2 → T3]
A Cycle T1 → T2 → T1 is detected; therefore S8 is not Conflict-Serializable.
But the results produced by the following serial schedule (Schedule S9) will be the same as
those produced by the above Schedule S8.
Schedule S9
T1                  T2                  T3
Read (A);
Write (A);
                    Write (A);
                                        Write (A);
So the criterion of Conflict-Serializability is found to be unnecessarily stringent in the case
of Schedule S8. Thus, we define another serializability criterion called View Serializability,
which is less stringent as compared to Conflict Serializability. There may be situations
wherein a schedule does not satisfy the criterion of conflict serializability, but does not
pose any danger to database consistency, as in the case of Schedule S8. In such cases,
the schedule may be View Equivalent to a Serial Schedule.
View Equivalence A Schedule S is said to be View Equivalent to another Schedule
S', if the following conditions are met:-
(a) For each data item Q, if transaction Ti reads the initial value of Q in S, then
Ti must read the initial value of Q in S' also.
(b) For each data item Q, if transaction Tj performs the final write of Q in S, then
Tj must perform the final write of Q in S' also.
(c) For each data item Q, if transaction Tj reads the value of Q produced by
another transaction Ti in S, then Tj must read the value of Q produced by Ti in S' also.
Going by the above criteria, Schedule S8 and Schedule S9 are View Equivalent Schedules,
since in both T1 reads the initial value of A, T3 writes the final value of A, and no
transaction reads a value written by another transaction.
View-Serializable Schedule
A non-serial schedule S is said to be View-Serializable if it is View-Equivalent to a serial
schedule.
Thus, Schedule S8 is a View-Serializable Schedule, since it is View-Equivalent to a serial
Schedule S9.
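The three view-equivalence conditions can be checked mechanically by recording, for each schedule, who reads from whom and who writes last. The sketch below is a simplification under stated assumptions: schedules are lists of (transaction, action, item) tuples, each transaction reads and writes a given item at most once, and the marker "<initial>" (a made-up label) stands for the initial value of an item.

```python
def view_profile(schedule):
    """Return (reads_from, final_writer) for a schedule.

    reads_from: sorted list of (reader, item, writer), where writer is
                "<initial>" if the reader sees the initial value.
    final_writer: dict mapping each item to the transaction writing it last.
    """
    last_writer = {}
    reads_from = []
    for txn, action, item in schedule:
        if action == "read":
            reads_from.append((txn, item, last_writer.get(item, "<initial>")))
        elif action == "write":
            last_writer[item] = txn
    return sorted(reads_from), last_writer


def view_equivalent(s, t):
    """Same operations, same reads-from relation, same final writers."""
    return sorted(s) == sorted(t) and view_profile(s) == view_profile(t)


S8 = [("T1", "read", "A"), ("T2", "write", "A"),
      ("T1", "write", "A"), ("T3", "write", "A")]
S9 = [("T1", "read", "A"), ("T1", "write", "A"),
      ("T2", "write", "A"), ("T3", "write", "A")]   # serial: T1, T2, T3

print(view_profile(S8))
print(view_equivalent(S8, S9))
```

For S8 and S9 the profiles coincide: T1 reads the initial value of A and T3 performs the final write in both.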
A conflict-serializable schedule will also be view-serializable, but reverse may not be
true.
Serializable Schedule A schedule is said to be Serializable, if it is:-
Conflict- Serializable
OR View-Serializable.
Cascading Rollbacks
Suppose a Transaction Ti modifies a data item Q and the modified value of Q is
read by another Transaction Tj before transaction Ti COMMITS. Now, suppose Ti fails
during its execution and it has to be rolled-back. So, the value of data item Q, which Tj
has read is undone by the rolling-back of Ti. Thus, any computation performed by Tj ,
based on this value of Q, would cause database inconsistencies. Thus, when transaction
Ti fails during its execution, and it has to be rolled back, we also have to rollback those
transactions, which might have read the data items modified by Transaction Ti. Such a
Rollback is called a Cascading Rollback.
Example:-
Schedule S10
T1 T2 T3
Read (A);
Write (A);
Read (A);
Write(A);
Read (A);
Write (A);
Read (B);
Read (C);
In the above Schedule S10, T2 reads the value of data item A after it has
been modified by T1 but before T1 commits, and T3 reads the value of data
item A as modified further by T2, before either T1 or T2 commits. In this case, if
T1 fails during its subsequent execution before it Commits, then T2 and T3 also must
rollback along with T1. This kind of rollback is called Cascading Rollback.
The management of Cascading Rollbacks increases the system complexity, since
the system has to keep a track of all those transactions, which have read data items
modified by uncommitted transactions, till those transactions Commit successfully. Also,
in the case of Cascading Rollbacks, a significant amount of work gets undone, thus
affecting the System Throughput adversely.
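The bookkeeping described above amounts to following "reads-from" links outward from the failed transaction. The sketch below assumes that none of the transactions has committed yet, encodes a schedule as (transaction, action, item) tuples, and uses only the reads and writes on data item A from Schedule S10; the function names are illustrative.

```python
def reads_from_edges(schedule):
    """Pairs (reader, writer): reader read an item last written by writer."""
    last_writer = {}
    edges = set()
    for txn, action, item in schedule:
        if action == "write":
            last_writer[item] = txn
        elif action == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                edges.add((txn, writer))
    return edges


def cascade_rollback(schedule, failed):
    """Transactions that must roll back if `failed` aborts.

    Assumes no transaction in the schedule has committed yet.
    """
    edges = reads_from_edges(schedule)
    doomed = {failed}
    changed = True
    while changed:                       # repeat until no new victims appear
        changed = False
        for reader, writer in edges:
            if writer in doomed and reader not in doomed:
                doomed.add(reader)
                changed = True
    return doomed


# Reads and writes on A in Schedule S10 (T1, then T2, then T3).
S10_ON_A = [("T1", "read", "A"), ("T1", "write", "A"),
            ("T2", "read", "A"), ("T2", "write", "A"),
            ("T3", "read", "A"), ("T3", "write", "A")]

print(sorted(cascade_rollback(S10_ON_A, "T1")))
```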
Cascade-less Schedules
The need of Cascading Rollbacks can be effectively obviated by imposing a restriction on
the schedules that the data items, modified by a Transaction Ti, must not be permitted to
be read by other concurrent transactions, till Ti Commits. Such Schedules are called
Cascade-less Schedules. The Cascade-less Schedules obviate the need of Cascading
Rollbacks.
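The restriction above can be tested on a given schedule by tracking which data items currently hold an uncommitted value. This is a sketch under assumptions: schedules are lists of (transaction, action, item) tuples, a commit is written as (transaction, "commit", None), and aborts are not modelled.

```python
def is_cascadeless(schedule):
    """True if no transaction reads an item whose latest value was
    written by another transaction that has not yet committed."""
    uncommitted_writer = {}   # item -> transaction with an uncommitted write
    for txn, action, item in schedule:
        if action == "write":
            uncommitted_writer[item] = txn
        elif action == "commit":
            # The committing transaction's writes are now safe to read.
            for q in [q for q, w in uncommitted_writer.items() if w == txn]:
                del uncommitted_writer[q]
        elif action == "read":
            writer = uncommitted_writer.get(item)
            if writer is not None and writer != txn:
                return False
    return True


dirty_read = [("T1", "write", "A"), ("T2", "read", "A"),
              ("T1", "commit", None), ("T2", "commit", None)]
clean_read = [("T1", "write", "A"), ("T1", "commit", None),
              ("T2", "read", "A"), ("T2", "commit", None)]

print(is_cascadeless(dirty_read), is_cascadeless(clean_read))
```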
Recoverable Schedules
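A concurrent Schedule is said to be Recoverable if, for every pair of transactions Ti and
Tj such that Tj reads a data item previously written by Ti, the transaction Ti commits
before Tj commits. Every Cascade-less Schedule is therefore also Recoverable. In practice,
the DBMS aims at schedules that are:-
(a) Serializable (Conflict-Serializable or View-Serializable)
AND
(b) Recoverable, and preferably Cascade-less.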
Deadlock Handling in Transaction Processing
Deadlock in Transaction Processing
It refers to a situation wherein Transactions Wait forever for accessing the Data Items
locked by each other.
Necessary Conditions for Deadlock to Occur
There are four conditions, which must exist simultaneously, for a Deadlock to Occur:-
(1) Mutual Exclusion Some Data Items must be locked by some transactions in
Exclusive Mode.
(2) Hold & Wait Some Transactions must be holding Exclusive Locks on some Data
Items and at the same time must be requesting Exclusive Lock on some other data items
currently locked by other transactions.
(3) No Pre-emption The data items locked exclusively by a Transaction cannot
be forcibly pre-empted. The Transaction will release the locks of its own accord.
(4) Cyclic Wait There must exist a situation, wherein a set of n transactions say (T0,
T1, T2, ..., Tn-1) are waiting in a cyclic manner for the data items locked by each other
i.e.
T0 is waiting for some data item currently locked by T1
T1 is waiting for some data item currently locked by T2
T2 is waiting for some data item currently locked by T3
...
Tn-2 is waiting for some data item currently locked by Tn-1
Tn-1 is waiting for some data item currently locked by T0
Deadlock Prevention
The Deadlocks can be prevented by imposing some restrictions on the sequence, in which
a given set of data items, can be accessed by a Transaction. One such algorithm is
explained below, which is graph based.
Graph Based Algorithm to Lock Resources
It works as follows:-
- A graph is drawn in which each node represents a Data Item
- A node will have only one parent.
- A transaction can Lock Data Items as follows:-
- First Lock can be obtained on any Node
- Any Subsequent lock can be obtained only on a Node whose parent is
currently locked by the Transaction.
- Lock on Root Node can be obtained only as a first lock; it cannot be
locked subsequently.
- If a Transaction violates the above protocol, it is forced to Roll-back.
Suppose the Resource Graph for some environment is as follows:-
          A
        /   \
       B     C
      / \   / \
     D   E F   G
If a Transaction Ti needs to access data items F, E, B in that order, it has to
obtain locks in the following sequence:-
Lock-X (A);
Lock-X (C);
Lock-X (F);
Access (F);
Unlock (F);
Unlock (C);
Lock-X (B);
Lock-X (E);
Access (E);
Unlock (E);
Access (B);
Unlock (B);
Unlock (A);
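The locking rules above can be checked for a given request sequence with a short validator. The sketch assumes the tree implied by the example (A is the root, B and C are its children, D and E are under B, F and G are under C); the PARENT table and the function name follows_tree_protocol are illustrative, and the Access steps are omitted because they do not affect the check.

```python
# Parent of each data item in the assumed Resource Graph (root has None).
PARENT = {"A": None, "B": "A", "C": "A",
          "D": "B", "E": "B", "F": "C", "G": "C"}


def follows_tree_protocol(requests, parent=PARENT):
    """Check ("lock" | "unlock", item) requests issued by one transaction.

    The first lock may be on any node; every later lock needs the node's
    parent to be currently locked, so the root can only be locked first.
    """
    held = set()
    locked_before = False
    for action, item in requests:
        if action == "lock":
            if locked_before and parent[item] not in held:
                return False
            held.add(item)
            locked_before = True
        elif action == "unlock":
            if item not in held:
                return False
            held.discard(item)
        else:
            raise ValueError(f"unknown request: {action}")
    return True


# Lock sequence from the example (access F, E, B in that order).
example = [("lock", "A"), ("lock", "C"), ("lock", "F"),
           ("unlock", "F"), ("unlock", "C"),
           ("lock", "B"), ("lock", "E"),
           ("unlock", "E"), ("unlock", "B"), ("unlock", "A")]

print(follows_tree_protocol(example))
print(follows_tree_protocol([("lock", "B"), ("lock", "A")]))
```

A transaction that locks B first and then asks for A breaks the rule for the root node and would be rolled back.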
How the above Algorithm helps to prevent Deadlocks is illustrated in the following
Example:-
Let the protocol be that if both data items A and B are to be locked by a Transaction Ti,
then it must first lock data item A and then B, not the other way. If a transaction violates
this protocol, then it must be rolled back.
[Figure: lock-order graph in which A is the parent of B]
Now, using the above Graph, a Schedule that would otherwise end in a Deadlock gets modified as
follows:-
T1               T2               Lock Manager
Lock-X (A);                       Grant-X (A, T1)
                 Lock-X (B);      Grant-X (B, T2)
Read (A);
Write (A);
Lock-X (B);                       T1 has to wait, since B is currently
                                  locked by T2.
                 Read (B);
                 Write (B);
                 Lock-X (A);      At this point, it is a Deadlock; but T2 is
                                  violating the algorithm, since it is
                                  attempting to lock A after locking B.
                                  So, T2 is rolled back and is forced to
                                  release its lock on data item B. So, T1
                                  will proceed and the Deadlock is prevented.
Deadlock Detection & Recovery
Deadlock Recovery will involve Roll-back of some of the transactions involved in the
Deadlock; but before that a deadlock needs to be detected.
Deadlock Detection
Transaction Wait For Graph Method
- A directed graph is drawn, wherein each node represents a Transaction.
- A directed edge from Ti to Tj indicates that Ti is waiting for a data item currently
locked by Tj.
- If there exists a cycle in the Graph, it indicates existence of a Deadlock; else
there is no Deadlock.
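Example 1
[Figure: wait-for graph over the transactions T1, T2, T3 and T4]
The graph has no cycle; thus there is no Deadlock.
Example 2
[Figure: wait-for graph over the transactions T1, T2, T3 and T4, containing the cycle T2 → T4 → T3 → T2]
The graph has a cycle T2 → T4 → T3 → T2; thus there exists a Deadlock.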
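The Wait-For Graph test is a cycle search, sketched below. The graph is given as a dictionary from each transaction to the transactions it is waiting for. The cyclic graph uses the cycle named in Example 2; the acyclic graph is a made-up illustration, since the edges of Example 1 are not recoverable from the text.

```python
def find_deadlock(wait_for):
    """Return a list of transactions forming a cycle, or None if there is none."""
    visiting = []     # current depth-first path
    done = set()      # nodes already fully explored

    def dfs(u):
        if u in visiting:
            return visiting[visiting.index(u):] + [u]
        if u in done:
            return None
        visiting.append(u)
        for v in wait_for.get(u, ()):
            cycle = dfs(v)
            if cycle:
                return cycle
        visiting.pop()
        done.add(u)
        return None

    for t in wait_for:
        cycle = dfs(t)
        if cycle:
            return cycle
    return None


# Hypothetical acyclic graph: nobody waits in a circle.
no_deadlock = {"T1": ["T2"], "T2": ["T3"], "T3": [], "T4": ["T3"]}
# Cycle from Example 2: T2 waits for T4, T4 for T3, T3 for T2.
deadlock = {"T1": [], "T2": ["T4"], "T4": ["T3"], "T3": ["T2"]}

print(find_deadlock(no_deadlock))
print(find_deadlock(deadlock))
```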
Deadlock Recovery
When a deadlock is detected, the system initiates recovery action, which involves roll-
back of some of the transactions involved in the Deadlock. The rolled-back transactions
would release the exclusive locks currently held by these transactions. So, the other
transactions, which may be waiting for such locks, would get the awaited resources and
would proceed; thus breaking the Deadlock.
The issues involved are:-
1. Selection of Victims for Roll-Back: The Criteria for selection of victim would
take into consideration:-
- The amount of work already completed by the transaction. Since this
work will be undone during the roll-back, a transaction which has
completed less work should be preferred as a victim.
- The amount of work yet to be completed. This information is difficult
to get. However, if this information is available, then the transaction
which is still farther from the end should be preferred as a victim. A
Transaction which is near its end should not be rolled-back.
- The number of resources locked by the Transaction. If a transaction
with a large number of resources locked by it is rolled back, then a large
number of resources will become free, which may meet the needs of a large
number of waiting transactions.
2. Roll Back the selected Transaction. With the help of log file entries, the
data items modified by this transaction will be reverted back to their old values. Also, the
data items, currently locked by this transaction, will be unlocked. So, the other
transactions, awaiting lock on such data items, will be able to proceed and the deadlock
will be broken.
3. Restart the Rolled-Back Transaction. Once the deadlock is broken, the
rolled-back transaction is restarted.
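One possible way to combine the victim-selection criteria is a simple priority ordering, sketched below. The ordering (least work done first, then most work remaining, then most locks held) is an assumption chosen for illustration and is not prescribed by the notes; the TxnInfo fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TxnInfo:
    name: str
    work_done: int        # e.g. number of operations already executed
    work_remaining: int   # an estimate; often unavailable in practice
    locks_held: int       # number of data items currently locked


def choose_victim(candidates):
    """Pick a victim: least work done, then most work remaining,
    then most locks held (an assumed tie-breaking order)."""
    return min(
        candidates,
        key=lambda t: (t.work_done, -t.work_remaining, -t.locks_held),
    )


in_deadlock = [
    TxnInfo("T2", work_done=10, work_remaining=2, locks_held=1),
    TxnInfo("T3", work_done=3, work_remaining=8, locks_held=2),
    TxnInfo("T4", work_done=3, work_remaining=8, locks_held=5),
]

print(choose_victim(in_deadlock).name)
```

A real system would also need to guard against choosing the same transaction repeatedly, for example by counting earlier roll-backs in the key.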
Implications of Deadlock Recovery
1. The work already performed by the rolled-back transaction is undone. In fact,
undoing it also needs more work to be performed. So, system throughput gets affected
adversely.
2. There is always a possibility that a Transaction may face roll-back again and
again and may never get completed; thus facing starvation.
In fact the Deadlock Detection also needs a lot of book-keeping and processing. This can
be avoided by the following approach:-
Time-Out based approach for Deadlock Detection & Recovery
When a Transaction Ti requests a lock on a data item Q, which is currently locked by
another transaction Tj, then Ti is put in the Wait State. Based upon the history of transaction
processing on a system, in a given environment, it is usually possible to estimate a time period t
within which the data item Q is likely to become free. This time period t is explicitly
specified in the system. If data item Q does not become available within time t (Time-Out
Condition), then the system presumes the existence of a Deadlock and Transaction Ti is rolled back
automatically.
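The time-out approach can be sketched with Python's standard threading.Lock, whose acquire method accepts a timeout. The function name lock_with_timeout, the 0.05 second time-out and the roll-back callback are assumptions made for illustration; a real DBMS lock manager is considerably more involved.

```python
import threading


def lock_with_timeout(lock, timeout_s, on_timeout):
    """Try to obtain `lock`; if it is not free within timeout_s seconds,
    presume a deadlock and invoke the roll-back action."""
    if lock.acquire(timeout=timeout_s):
        return True
    on_timeout()          # e.g. roll back the waiting transaction Ti
    return False


q_lock = threading.Lock()   # stands in for the lock on data item Q
events = []

q_lock.acquire()            # Tj currently holds the lock on Q
got_it = lock_with_timeout(q_lock, 0.05,
                           lambda: events.append("Ti rolled back"))
q_lock.release()            # Tj finishes and releases Q

print(got_it, events)
```

Note that a time-out only presumes a deadlock: a transaction that is merely slow to release Q triggers the same roll-back, so the choice of t trades needless roll-backs against long waits.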