Concurrency Control - Praveen Kumar
praveencs.weebly.com/uploads/1/0/4/4/10440152/unit-v-dbms.pdf
Concurrency Control
One of the important functions of DBMS is Concurrency Control, which is performed by
one of its major components called Concurrency Control Manager (CCM). Concurrency
Control implies controlling the execution of concurrent transactions in a schedule in such
a way that the resulting schedules are Serializable and Cascade-less.
Concurrency Control Protocols
The Concurrency Control Protocols can be divided into the following sub-categories:-
(a) Lock based Protocols
(b) Time-Stamp based Protocols
(c) Hybrid Time-Stamp and Lock based Protocols
(d) Multi-Version based Protocols
(e) Validation based Protocols
Lock Based Protocols
In the Lock-Based Protocols, when a Concurrent transaction Ti needs to access a shared
data item Q, it will first lock the data item Q in an appropriate Mode and then access it.
The locking and unlocking of data items is centrally controlled by Concurrency Control
Manager. For accessing data item Q, Transaction Ti will proceed as follows:-
(i) Ti requests Concurrency Control Manager to grant it lock on data item Q in
Mode M. Locks on all data items are controlled by Concurrency Control
Manager.
(ii) The Concurrency Control Manager examines whether the lock on Q can be
granted to Ti immediately in the requested mode or not.
(iii) If YES then lock on data item Q in mode M is granted to Ti and Ti proceeds
with its execution Else Ti is made to wait till the requested lock on Q is
available. So, Ti will execute its next instruction only after the requested lock
is granted.
(iv) After Ti finishes with accessing of Q, it releases the lock on Q, which can now
be assigned to another waiting transaction.
(v) If Ti fails during its execution before releasing a lock, the lock is
automatically released when Ti is rolled-back during recovery.
Different Locking Modes
Locks will be granted in two different modes:-
(a) Shared-Mode A Shared Mode lock on a data item Q permits the owner
transaction to perform only Read(Q) Operation; not Write(Q) operation. Since, the owner
transaction of a Shared-Mode Lock on a data item Q is not permitted to modify the data
value, a shared-mode lock can be granted on Q to more than one transaction concurrently.
(b) Exclusive-Mode If a transaction Ti needs to perform Read/Write or Write
operations on a data item Q, then it would request Exclusive Lock (X-Lock) on the data
item Q. Once Exclusive Lock is granted to Ti on Q, it provides exclusive access rights on
Q. No other transaction can have Shared or Exclusive lock on Q till Ti releases the X-
Lock. So, Exclusive Lock on a data item Q can be granted to a waiting transaction, only
when no other transaction is holding any lock (Shared or Exclusive) on Q.
Compatibility of Locks
                                  Existing Lock on Data Item Q
Requested Lock on Data Item Q       Shared        Exclusive
Shared                              Yes           No
Exclusive                           No            No
It implies that if a data item Q is currently held by a Transaction Ti in Shared Mode then
other transactions can also be granted Shared Lock (but NOT Exclusive Lock) on Q. But
if a data item Q is currently held by a Transaction Ti in Exclusive Mode then no other
transaction can be granted Shared or Exclusive Lock on Q.
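The compatibility rules above can be sketched as a tiny Shared/Exclusive lock table. This is an illustrative sketch, not the API of any real DBMS; the names (LockTable, request, release) are assumptions, and a real Concurrency Control Manager would queue a denied request rather than simply return False.

```python
COMPATIBLE = {          # (existing mode, requested mode) -> grantable?
    ("S", "S"): True, ("S", "X"): False,
    ("X", "S"): False, ("X", "X"): False,
}

class LockTable:
    def __init__(self):
        self.locks = {}                 # data item -> list of (txn, mode)

    def request(self, txn, item, mode):
        """Grant the lock if compatible with every current holder,
        else deny (the caller would then wait)."""
        holders = self.locks.setdefault(item, [])
        if all(t == txn or COMPATIBLE[(m, mode)] for t, m in holders):
            holders.append((txn, mode))
            return True                 # granted
        return False                    # incompatible: must wait

    def release(self, txn, item):
        self.locks[item] = [(t, m) for t, m in self.locks.get(item, [])
                            if t != txn]

lt = LockTable()
print(lt.request("T1", "Q", "S"))   # True:  Q is not locked
print(lt.request("T2", "Q", "S"))   # True:  S is compatible with S
print(lt.request("T3", "Q", "X"))   # False: X conflicts with the held S locks
```

Note how the matrix entry (S, S) = Yes is the only combination that lets two transactions hold Q at the same time.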
Use of Locks to achieve serializability.
Schedule L1
T1                      T2
Read (A);
                        Read (A);
Write (A);
                        Write (A);
The above Schedule L1 is not Conflict-Serializable; but it can be serialized by using locks
as below in Schedule L2 :-
Schedule L2
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
/*Assuming A is currently not locked;
the requested X-Lock will be granted to T1 */
Read (A); /* T1 Reads A */
Lock-X (A); /* Since A is currently locked in X-Mode
by Transaction T1, so T2 is made to wait
till A is unlocked by T1 */
Write (A);
Unlock (A); De-assign (A, T1)
/* The X-Lock on A granted to T1 is de-assigned */
Grant-X (A, T2)
/* X-Lock on A is assigned to waiting transaction T2 */
Read (A); /* Now T2 proceeds to execute Read (A) */
Write (A);
Unlock (A);
So, the non-serial Schedule L1 is now converted to a Conflict-Serializable Schedule L2. It
is in fact a serial schedule. So, in the bargain, Concurrency was sacrificed.
The use of locks may not always result in a serializable schedule, as indicated below for
schedule L3
Schedule L3
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
Read (A);
Lock-X(A); /* Since ‘A’ is currently locked in X-Mode
by Transaction T1, so T2 is made to wait
till ‘A’ is unlocked by T1 */
Write (A);
Unlock (A); /* At this point, T1 unlocks ‘A’ */
Grant-X (A, T2)
Read (A);
Write (A);
Unlock (A);
Lock-X (B); Grant-X (B, T2)
Read (B);
Write (B);
Unlock (B);
Lock-X (B); Grant-X (B, T1)
Read (B);
Write (B);
Unlock (B);
The Precedence Graph of Schedule L3 has the edges T1 → T2 (conflicts on ‘A’) and T2 → T1 (conflicts on ‘B’), i.e. it contains a cycle.
As shown above, the Schedule L3, despite use of Locks, is NOT a Conflict-Serializable
Schedule. This has happened, because T1 unlocked Resource ‘A’ prematurely. The
locking needs some additional protocols to ensure Serializability, as discussed below.
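The serializability test used above can be made concrete: build the precedence graph from the conflicting operations of a schedule and check it for a cycle. This is a minimal sketch with assumed names (precedence_graph, has_cycle) and an assumed encoding of a schedule as (transaction, operation, item) triples.

```python
def precedence_graph(schedule):
    """schedule: list of (txn, op, item) in execution order; op is 'R' or 'W'.
    An edge Ti -> Tj exists when an operation of Ti conflicts with a
    later operation of Tj (same item, at least one of them a Write)."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    def reach(start, node, seen):
        return any(n == start or (n not in seen and reach(start, n, seen | {n}))
                   for n in graph.get(node, ()))
    return any(reach(t, t, set()) for t in graph)

# Schedule L3: both transactions update 'A', then both update 'B',
# but in the opposite order.
L3 = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
      ("T2", "R", "B"), ("T2", "W", "B"), ("T1", "R", "B"), ("T1", "W", "B")]
print(has_cycle(precedence_graph(L3)))   # True -> NOT conflict-serializable
```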
Two Phase Locking Protocol
As per this protocol, each transaction, during its execution, will obtain and release locks
in two distinct phases:-
Phase I This phase begins as soon as a transaction becomes active. During this
phase, a transaction will only obtain locks and will not release any locks. This phase ends
as soon as a lock, held by the transaction, is released.
Phase II This phase begins when first lock is released by a transaction. During this
phase, no new locks will be obtained by the transaction; only the locks held by it will be
released.
Let us modify the locking of Schedule L3 to Two-Phase-Locking, as shown below in
the Schedule L4.
Schedule L4 (using Two-Phase Locking Protocol)
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
Read (A);
Lock-X(A); /* Since ‘A’ is currently locked in X-Mode
by Transaction T1, so T2 is made to wait
till ‘A’ is unlocked by T1 */
Write (A);
Lock-X (B); Grant-X (B, T1)
Unlock (A); /* At this point, T1 unlocks ‘A’ */
Grant-X (A, T2)
Read (A);
Write (A);
Lock-X (B); /*Since X-Lock on ‘B’ is currently held by T1
T2 has to wait */
Read (B);
Write (B);
Unlock (B); Grant-X (B, T2)
Unlock (A);
Read (B);
Write (B);
Unlock (B);
The Precedence Graph of Schedule L4 has only the single edge T1 → T2; it is acyclic.
As indicated above, the Schedule L4, which is using Two-Phase Locking Protocol, is a
Conflict-Serializable Schedule.
So, Two-Phase-Locking Protocol ensures Conflict-Serializability.
The Two-Phase Locking results in Conflict-Serializable Schedules. The point, where
last lock is obtained by a transaction, is called Lock-Point. The transactions, participating
in a schedule, will be serialized in the same order as the order of their lock-points.
The Two-Phase Locking ensures a Conflict-Serializable Schedule but not a Cascade-
less Schedule, as shown below in Schedule L5
Schedule L5
Suppose Transaction T1 needs to access both data items ‘A’ and ‘B’ but Transaction T2
needs to access only one data item ‘A’.
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
Read (A);
Lock-X (A); Since ‘A’ is currently locked by T1, T2 has to wait.
Write (A);
Lock-X (B); Grant-X (B, T1)
Unlock (A); Since T1 has unlocked ‘A’, T2 can now be granted
lock on ‘A’
Grant-X (A, T2)
Read (A);
Write (A);
Unlock (A);
Read (B);
Write (B);
Unlock (B);
Commit;
Commit;
The Schedule L5 is following Two-Phase-Locking Protocol. It is Conflict-Serializable but
not Cascade-less, since T2 is able to read the value of data item ‘A’ after ‘A’ has been
modified by T1 but before T1 Commits. This can be obviated by using Strict-Two-Phase-
Locking Protocol as explained below:-
Strict Two Phase Locking Protocol This protocol is a variant of two-phase locking
protocol. In addition to the two-phase rules, it ensures that all exclusive locks held by a
transaction are released only after the transaction has committed successfully. This
ensures that the data items updated by a transaction cannot be read by other
transactions until it commits. Thus, this protocol will ensure Cascade-Less Schedules. Let
us modify the Schedule L5 (Shown above) to follow Strict-Two-Phase-Locking, as
indicated below:-
Schedule L6
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
Read (A);
Lock-X (A); since ‘A’ is currently locked by T1, T2 has to wait.
Write (A);
Lock-X (B); Grant-X (B, T1)
Read (B);
Write (B);
Unlock (A);
Unlock (B);
Commit; Since T1 has unlocked ‘A’, T2 can now be granted
lock on ‘A’
Grant-X (A, T2)
Read (A);
Write (A);
Unlock (A);
Commit;
The Schedule L6 , as indicated above, is Conflict-Serializable as well as Cascade-less. T2
is permitted to read the value of ‘A’ (modified by T1) only after T1 Commits.
Rigorous Two Phase Locking Protocol This protocol is also a variant of two-phase
locking protocol. It follows two-phase-locking protocol and also ensures that all locks
(Exclusive as well as Shared Locks) held by a transaction will be released only when a
transaction Commits successfully. This ensures that transactions will be serialized strictly
in the same order as the order of their Commit.
Limitation of Lock-based Algorithms
The Lock-based algorithms suffer from the problem of Deadlocks. For example, suppose
Transactions T1 and T2 both need to access data items ‘A’ and ‘B’; T1 needs to access
‘A’ first and then ‘B’, whereas T2 needs to access ‘B’ first and then ‘A’. If the two
transactions follow the Strict or Rigorous Two-Phase Locking Protocol, they can enter
into a Deadlock, as indicated below:-
Schedule L7
T1 T2 Concurrency Control Manager
Lock-X (A); Grant-X (A, T1)
Read (A);
Lock-X(B); Grant-X (B, T2)
Read (B);
Write (A);
Lock-X (B); /* Since X-Lock on ‘B’ is currently held by T2
so T1 has to wait */
Write (B);
Lock-X (A); /* Since X-Lock on ‘A’ is currently held by T1
so T2 has to wait */
The above Schedule L7 follows Strict-Two-Phase Protocol. It has entered a condition,
wherein T1 and T2 will wait forever for the Resources locked by each other; thus causing
a Deadlock Condition.
The Solution for Deadlocks can be found in Time-Stamp based algorithms, as explained
subsequently.
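The deadlock in Schedule L7 can be detected mechanically: the Concurrency Control Manager records a wait-for graph, where an edge Ti → Tj means Ti is waiting for a lock held by Tj, and a cycle in that graph means deadlock. The sketch below is illustrative; the structure of wait_for and the function name are assumptions.

```python
def deadlocked(wait_for):
    """wait_for: dict txn -> set of txns it is waiting on.
    Returns the transactions that lie on a wait-for cycle."""
    def reaches(start, node, seen):
        for n in wait_for.get(node, ()):
            if n == start or (n not in seen and reaches(start, n, seen | {n})):
                return True
        return False
    return [t for t in wait_for if reaches(t, t, set())]

# Schedule L7: T1 waits for 'B' (held by T2), T2 waits for 'A' (held by T1).
print(deadlocked({"T1": {"T2"}, "T2": {"T1"}}))   # ['T1', 'T2']
```

On detecting such a cycle, a real system would pick one victim transaction and roll it back to break the cycle.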
Multiple Granularity Locking
In the concurrency control schemes discussed above, data items are locked individually.
However, the locking overheads can be much reduced, if data-items are grouped into
synchronization units. A hierarchical relationship can be created between the data items
in the form of a tree as indicated below:-

                           DB
          __________________|__________________
         |                  |                  |
       File A             File B             File C
       /    \             /    \
     Ra1 ... Ran        Rb1 ... Rbm
The tree indicated above has three levels of nodes- the root represents the entire database
(DB), the middle level indicates Files (representing Tables or Relations) and the lowest
level indicates Records (Tuples of the Relations). It uses two-phase locking and makes
use of the following types of locks:-
(a) Exclusive (X) Lock When a Transaction is granted an X lock on a node “K”, it
is implicitly granted X lock on all the descendants of node “K”.
(b) Shared (S) Lock When a Transaction is granted an S lock on a node “K”, it
is implicitly granted S lock on all the descendants of node “K”.
(c) Intentional X (IX) Lock When a Transaction needs X lock on a node “K”,
the transaction would need to apply IX lock on all the ancestor nodes of “K” starting
from the root node. So, when a node is found locked in IX mode, it indicates that
some of its descendant nodes must be locked in X mode.
(d) Intentional S (IS) Lock When a Transaction needs S lock on a node “K”,
the transaction would need to apply IS lock on all the ancestor nodes of “K” starting
from the root node. So, when a node is found locked in IS mode, it indicates that
some of its descendant nodes must be locked in S mode.
(e) SIX Lock When a node is locked in SIX mode, it indicates that the node is
explicitly locked in S Mode and IX Mode together. So, the entire sub-tree rooted at that
node is locked in S mode and some nodes in it may additionally be locked in X mode.
This mode is compatible only with the IS mode.
The Compatibility Matrix is as indicated below:-
IS IX SIX S X
IS True True True True False
IX True True False False False
SIX True False False False False
S True False False True False
X False False False False False
The Scheme operates as follows:-
(a) A Transaction must first lock the Root Node and it can be locked in any mode.
(b) Locks are granted as per the Compatibility Matrix indicated above.
(c) A Transaction can lock a node in S or IS mode if it has already locked all the
predecessor nodes in IS or IX mode.
(d) A Transaction can lock a node in X or IX or SIX mode if it has already locked
all the predecessor nodes in SIX or IX mode.
(e) A transaction must follow two-phase locking. It can lock a node only if it has
not previously unlocked any node.
(f) Before it unlocks a node, a Transaction has to first unlock all the children
nodes of that node.
So, locking proceeds top-down and unlocking proceeds bottom-up.
Take the following example: a granularity tree with root DB; Files Fa, Fb and Fc; and
Records Ra1, Ra2, Ra3 under Fa, Rb1, Rb2 under Fb and Rc1, Rc2, Rc3 under Fc.
Suppose a transaction needs to lock Records Ra2 and Rc1 in X Mode; it will lock as
follows:-
DB, Fa, Fc ----- in IX mode.
Ra2, Rc1------ in X mode.
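The top-down rule in the example can be sketched as a small helper that, given a leaf and a requested mode, lists the intention locks needed on the ancestors first. The parent map encodes the example tree above; the function name locks_needed is an illustrative assumption.

```python
# Parent pointers for the example tree (only the nodes used here).
PARENT = {"Fa": "DB", "Fb": "DB", "Fc": "DB",
          "Ra2": "Fa", "Rc1": "Fc"}

def locks_needed(node, mode):
    """Return (node, mode) pairs to acquire, root first:
    intention locks on every ancestor, then the real lock on the node."""
    intention = {"X": "IX", "S": "IS"}[mode]
    path = []
    while node in PARENT:           # climb to the root
        path.append(node)
        node = PARENT[node]
    path.append(node)               # the root (DB)
    ancestors = list(reversed(path))[:-1]     # DB ... parent of the leaf
    return [(a, intention) for a in ancestors] + [(path[0], mode)]

print(locks_needed("Ra2", "X"))  # [('DB', 'IX'), ('Fa', 'IX'), ('Ra2', 'X')]
print(locks_needed("Rc1", "X"))  # [('DB', 'IX'), ('Fc', 'IX'), ('Rc1', 'X')]
```

Unlocking would proceed in the reverse (bottom-up) order, as rule (f) requires.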
Time-Stamp Based Algorithms
The main features of Time-Stamp based algorithm are as follows:-
- When a Transaction Ti is initiated, it is assigned a Time Stamp TS (Ti).
- The Time Stamp will be a number; it could be real time (indicated by the
system clock) at the time of initiation of Ti or a counter value, updated by
the system in an ascending order; initial value could be zero and it could
be incremented by one whenever Time Stamp is assigned to a new
Transaction. So, no two transactions will have same value of Time Stamp.
If a Transaction Tj is initiated later than Ti then TS (Tj) > TS (Ti).
- Each Data Item Q, being accessed by the Transactions will have two
Time Stamps- Read Time Stamp R-TS(Q) and Write Time Stamp W-
TS(Q). The Initial Value of these Time Stamps will be zero but later the
values are updated as explained below:-
The value of R-TS(Q) equals the largest Time-Stamp amongst all the
transactions that have successfully read the data item Q. So, whenever a
Transaction Ti reads data item Q successfully, R-TS(Q) is updated as
follows:-
R-TS (Q) = Max ( R-TS(Q), TS(Ti) )
The value of W-TS(Q) equals the largest Time-Stamp amongst all the
transactions that have successfully updated (written) the data item Q. So,
whenever a Transaction Ti updates data item Q successfully, W-TS(Q) is
updated as follows:-
W-TS (Q) = Max ( W-TS(Q), TS(Ti) )
Whenever a Transaction Ti needs to access a data item ‘Q’, it proceeds as follows:-
Read (Q)
- The Transaction Ti requests Read (Q)
- The System will process the request as follows:-
If ( W-TS(Q) > TS (Ti) )
it indicates that a transaction that was initiated later than Ti has
already updated Q. So, the value of Q, which Ti intends to Read,
has already been overwritten.
In this case, the Read(Q) is rejected and Transaction Ti is rolled
back. Then, Ti will be re-initiated with a new Time-Stamp
assigned to it.
Else Read(Q) is executed and R-TS(Q) is updated as follows:-
R-TS(Q) = Max ( R-TS(Q), TS(Ti) )
Write (Q)
- The Transaction Ti requests Write (Q)
- The System will process the request as follows:-
If ( R-TS(Q) > TS (Ti) )
it indicates that a transaction (Tj) that was initiated later than Ti
has already Read the old value Q. So, Ti should have performed
this Write operation earlier than Tj’s Read operation, so that Tj
could have Read the updated value of Q.
In this case, the Write (Q) is rejected and Transaction Ti is rolled
back. Then, Ti will be re-initiated with a new Time-Stamp
assigned to it.
Else If ( W-TS (Q) > TS (Ti ) )
It indicates that a Transaction (say Tj) initiated later than Ti has
already updated Q. It means the value of Q which Ti now intends
to write is already obsolete.
In this case also, the Write (Q) is rejected and Transaction Ti is
rolled back (though rolling-back the transaction in this case is
not really necessary, which will be discussed later). Then, Ti will
be re-initiated with a new Time-Stamp assigned to it.
Else Write (Q) is executed and W-TS(Q) is updated as follows:-
W-TS(Q) = TS(Ti)
So, in the Time-Stamp based algorithm, whenever a Transaction requests Read/Write of a
data item Q, either the Read/ Write is successfully performed or the requesting
Transaction is Rolled-back. If Rolled-back, the transaction will be re-initiated with a new
Time-Stamp assigned to it.
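The Read and Write rules above fit in a few lines of code. This is a compact sketch under assumed names (Item, Rollback, read, write): each data item carries its two time stamps, and a rejected operation raises an exception to signal that Ti must be rolled back and restarted with a new time stamp.

```python
class Rollback(Exception):
    """Raised when the requesting transaction must be rolled back."""

class Item:
    def __init__(self):
        self.value, self.r_ts, self.w_ts = None, 0, 0

def read(q, ts):
    if q.w_ts > ts:                   # a younger txn already overwrote Q
        raise Rollback
    q.r_ts = max(q.r_ts, ts)          # R-TS(Q) = Max(R-TS(Q), TS(Ti))
    return q.value

def write(q, ts, value):
    if q.r_ts > ts or q.w_ts > ts:    # Q was read or written by a younger txn
        raise Rollback
    q.value, q.w_ts = value, ts       # W-TS(Q) = TS(Ti)

q = Item()
write(q, ts=5, value=10)              # T5 writes Q
read(q, ts=7)                         # T7 reads Q: allowed, R-TS(Q) becomes 7
try:
    write(q, ts=6, value=20)          # T6 writes after T7's read: rejected
except Rollback:
    print("T6 rolled back")
```

Note that no operation ever waits: it either succeeds at once or rolls the transaction back, which is exactly why deadlock cannot occur.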
Advantages of Time-Stamp based Algorithm vis-à-vis Lock-Based Algorithm
Since a Transaction never waits for accessing of a data item, there is no possibility of
Deadlock.
Disadvantages of Time-Stamp based Algorithm vis-à-vis Lock-Based Algorithm
1. There is a distinct possibility that a Transaction may be rolled back again and
again and may face starvation.
2. Whenever a transaction is rolled back, the work already performed by that
transaction is undone, and the undoing itself requires additional work. This
affects the system throughput adversely.
Thomas Write Rule
Whenever a Transaction Ti requests Write (Q)
If ( W-TS (Q) > TS (Ti ) ) It indicates that a Transaction (say Tj) initiated later than Ti
has already updated Q. It means the value of Q which Ti now intends to Write is already
obsolete. Under this condition, the basic Time-Stamp based algorithm will reject the
Write (Q) and also roll back the Transaction Ti. However, rolling back the transaction is
not really necessary in this case, as Thomas observed:-
As per Thomas’ Write Rule, under the above condition, the Write (Q) operation is simply
ignored (since the value which Ti intends to Write is obsolete) and Ti is not Rolled-back.
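Thomas' Write Rule changes only one branch of the write check: the obsolete write is silently skipped instead of rolling Ti back. A self-contained sketch, with the item represented as a plain dict (an illustrative choice, not from the source):

```python
class Rollback(Exception):
    pass

def thomas_write(q, ts, value):
    """q: dict with 'value', 'r_ts', 'w_ts' (illustrative layout)."""
    if q["r_ts"] > ts:        # a younger txn read the old value: roll back
        raise Rollback
    if q["w_ts"] > ts:        # a younger txn already wrote: IGNORE, no rollback
        return
    q["value"], q["w_ts"] = value, ts

q = {"value": 10, "r_ts": 0, "w_ts": 8}
thomas_write(q, ts=6, value=99)   # obsolete write by T6: silently ignored
print(q["value"], q["w_ts"])      # 10 8
```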
Some Algorithms, which avoid Starvation
The following algorithms, which make use of both Locks and Time Stamps, obviate the
possibility of starvation:-
1. Wait-Die Algorithm
2. Wound-Wait Algorithm
1. Wait-Die Algorithm
It operates as follows:-
- When a Transaction is initiated, it is assigned a Time Stamp just like in
the case of Time-Stamp based algorithm.
- When a Transaction Ti requests accessing of a data item Q, which is
currently locked by another Transaction Tj, the request is processed as
follows:-
If ( TS(Ti ) < TS (Tj) )
Then Ti waits for Tj to finish and release lock on Q
Else Ti Rolls-back and restarts with old Time-Stamp retained by it.
It is possible that a transaction may be rolled back several times while attempting to
access a data item Q. But since a rolled-back transaction is re-initiated with its original
time-stamp retained, it will eventually become the oldest transaction and be eligible to
access the data item Q. So, there is no possibility of starvation.
2. Wound-Wait Algorithm
It operates as follows:-
- When a Transaction is initiated, it is assigned a Time Stamp just like in
the case of Time-Stamp based algorithm.
- When a Transaction Ti requests accessing of a data item Q, which is
currently locked by another Transaction Tj, the request is processed as
follows:-
If ( TS(Ti ) < TS (Tj) )
Then Ti forces Tj to Roll-back. When Tj Rolls-back, it will release
lock on Q, which will then be available for access to Ti
Else Ti waits for Tj to finish and release lock on Q, which will then
become available to Ti
In this case also, there is no possibility of starvation.
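The two decisions can be put side by side. In both sketches a lower time stamp means an older transaction, and Ti is requesting an item currently locked by Tj; the function names are illustrative.

```python
def wait_die(ts_i, ts_j):
    # Older requester waits; younger requester dies
    # (it restarts with its original time stamp).
    return "wait" if ts_i < ts_j else "die"

def wound_wait(ts_i, ts_j):
    # Older requester wounds (rolls back) the holder;
    # younger requester waits.
    return "wound Tj" if ts_i < ts_j else "wait"

print(wait_die(1, 2), wait_die(2, 1))      # wait die
print(wound_wait(1, 2), wound_wait(2, 1))  # wound Tj wait
```

In both schemes the older transaction always makes progress, which is what rules out starvation.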
Wound-Wait Vs Wait-Die Algorithm
The Wound-Wait Algorithm is better than the Wait-Die Algorithm, as the average number
of roll-backs will be lower. If a Transaction Tj has been forced to roll back by another
Transaction Ti, then Tj will be re-initiated with its original Time-Stamp retained. Now,
when it again requests access to Q while Ti is still holding a lock on Q, Tj (being the
younger transaction) will simply wait for Ti to finish and then access Q. So, there would
be only one roll-back, not the repeated roll-backs which are possible in the Wait-Die
Algorithm. Since the average number of roll-backs is lower, the Wound-Wait Algorithm
will produce higher throughput.
Validation Based Protocols
The protocol operates as follows:-
A transaction is executed in the following three phases:-
1. Read Phase During this phase, a Transaction Ti reads-in all the
data-items needed by the transaction into its local variables. It performs its
processing and stores the results into its local variables, without affecting the
database.
2. Validation Phase During this phase, it determines whether it is valid
to copy the updated local variables into the Database, without causing a
serializability conflict. The Validation Test is subsequently explained.
3. Write Phase If Transaction Ti passes the Validation Test, then the
system transfers its updates from the local variables to the Database; else Ti is
rolled back to its initial state and re-started. As obvious, there won’t be any roll-
back of ‘Read Only’ Transactions and such transactions will always go-through
without any waiting.
Time Stamps associated with a Transaction
To enable performing of the Validation Test, the system associates the following
Time-Stamps with each Transaction Ti :-
1. Start (Ti) The time when Ti started its execution i.e. the start of its Read
Phase.
2. Validation (Ti) The time when Ti finished its Read Phase and started its
Validation Phase.
3. Finish (Ti) The time when Ti finished its Write Phase.
The Time Stamp associated with Ti for serialization TS(Ti) = Validation(Ti).
The Serializability requirement of a Transaction Pair (Ti , Tj) is that if TS(Ti) <
TS(Tj) then in the valid schedule, Ti must appear before Tj.
Validation Test
A Transaction Ti is said to pass the Validation Test if one of the following two
conditions holds against each concurrent transaction Tj that satisfies TS(Tj) < TS
(Ti) i.e. Tj commenced Validation Test earlier than Ti :-
(a) Finish (Tj) < Start (Ti), since Tj finished before Ti commenced; so Tj does
not conflict with Ti.
(b) Start (Ti) < Finish (Tj) < Validation (Ti) AND the set of data items written
by Tj does not overlap with the set of data items read by Ti. This implies that
Tj finished after Ti started but before Ti started its Validation Phase; since
Tj has already finished, it cannot modify any more data items, and none of
the items it wrote were read by Ti. So, Tj does not conflict with Ti.
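Conditions (a) and (b) translate directly into a validation check. A sketch under assumed names: each transaction is a dict recording its start/validation/finish times and its read and write sets, and `others` holds the concurrent transactions with smaller TS.

```python
def validates(ti, others):
    """ti, others: dicts with 'start', 'validation', 'finish',
    'read_set', 'write_set'. others = concurrent txns Tj with
    TS(Tj) < TS(ti); returns True if ti passes the Validation Test."""
    for tj in others:
        if tj["finish"] is not None and tj["finish"] < ti["start"]:
            continue                         # condition (a): Tj finished first
        if (tj["finish"] is not None and tj["finish"] < ti["validation"]
                and not (tj["write_set"] & ti["read_set"])):
            continue                         # condition (b): no read/write overlap
        return False                         # conflict: ti must be rolled back
    return True

t1 = {"start": 0, "validation": 5, "finish": 7,
      "read_set": {"A"}, "write_set": {"A"}}
t2 = {"start": 3, "validation": 8, "finish": None,
      "read_set": {"B"}, "write_set": {"B"}}
print(validates(t2, [t1]))   # True: T1's writes {'A'} miss T2's reads {'B'}
```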
In this protocol, the concurrency checks are not made for individual Read/Write
operations of a transaction; rather it makes a single check after all Read/Write
operations have been performed in the local variables; and if the validation check
goes through successfully, all updates of the transaction are applied to the
database. So, concurrency control overheads will be very low. This protocol is
ideal for an environment wherein the majority of the transactions are ‘Read-Only’
and contention amongst ‘Write’ transactions is extremely rare. In case of such
contention, a transaction would need to be rolled back.
Multi-Version based Scheme
The Multi-Version Scheme operates as follows:-
- When a Transaction Ti requests a Write (Q) operation, the system creates
a new version of data item Q.
- When a Transaction Ti requests a Read (Q) operation, the system
determines which version of Q should be read and returns the appropriate
value of Q to Ti.
- The Versions of data-item Q, which are no more required, are deleted by
the system.
Each version Qk of data-item Q will have three fields:-
Content (Qk) the value of data-item Q associated with version Qk. It is equal to
the value of Q written by the transaction that created the version.
W-TS(Qk) The time Stamp of the Transaction that created Version Qk by a
Write(Q) operation.
R-TS(Qk) The largest time Stamp of any Transaction that has successfully read
the Version Qk by a Read (Q) operation.
When a Transaction Ti requests a Read (Q) or Write (Q) operation then the system will
determine the version Qk of Q, whose W-TS(Qk) is largest amongst all the versions of Q
and less than or equal to TS (Ti), then the system proceeds as follows:-
Read (Q) The value of Content (Qk) is returned to Ti and R-TS(Qk) is updated to
Max ( R-TS(Qk), TS(Ti) ).
Write (Q)
If TS(Ti) < R-TS(Qk) (i.e. a transaction which was initiated later than Ti has already
read the most current version of Q) the Transaction Ti is rolled-back and restarted
with a new time-stamp;
Else If TS(Ti) = W-TS(Qk) it implies that version Qk has been created by Ti itself;
in this case Ti overwrites the contents of Qk;
Else Ti creates a new version Qi of Q. The Content (Qi) will be equal to the value of
Q written by Ti. R-TS(Qi) and W-TS(Qi) will be set equal to TS (Ti).
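The version-selection and update rules above can be sketched as follows. This is illustrative: each version is a dict with the three fields named in the text, and the function names (pick, mv_read, mv_write) are assumptions.

```python
class Rollback(Exception):
    pass

def pick(versions, ts):
    """The version Qk whose W-TS is largest amongst those <= TS(Ti)."""
    return max((v for v in versions if v["w_ts"] <= ts),
               key=lambda v: v["w_ts"])

def mv_read(versions, ts):
    qk = pick(versions, ts)
    qk["r_ts"] = max(qk["r_ts"], ts)     # reads always succeed
    return qk["value"]

def mv_write(versions, ts, value):
    qk = pick(versions, ts)
    if qk["r_ts"] > ts:                  # a younger txn read this version
        raise Rollback
    if qk["w_ts"] == ts:                 # Ti overwrites its own version
        qk["value"] = value
    else:                                # otherwise create a new version
        versions.append({"value": value, "w_ts": ts, "r_ts": ts})

versions = [{"value": 0, "w_ts": 0, "r_ts": 0}]
mv_write(versions, ts=5, value=10)       # T5 creates a new version
print(mv_read(versions, ts=3))           # 0  (T3 still sees the old version)
print(mv_read(versions, ts=9))           # 10 (T9 sees T5's version)
```

The two reads illustrate the key property of the scheme: an old reader and a new reader each get the version consistent with their own time stamp, so reads never wait and never fail.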
Deletion of Defunct Versions
A crucial issue in this scheme is to determine the versions of Q that are no longer
required and therefore can be deleted. Suppose there are two versions Qi and Qj of a data
item Q such that both have Write Time-Stamps less than the Time-Stamp of the oldest
transaction in the system at that moment; then the older of the two versions is no longer
needed and can be deleted.
In Multi-Version based protocol, all Read operations will always go through
successfully, but a Read operation will have higher overheads as compared to other
protocols since the system has to determine the version of Q to be read. As regards Write
operation, it may go through or it may have to be rolled back. In case, a transaction is
rolled back, it is re-started with a new Time-Stamp. Next time, hopefully it will go
through.
CHAPTER 12
Recovery
A failure would leave the Database System in a Suspect State. Recovery implies restoring
the Database (after a failure) to a state that is assumed to be Consistent. The Recovery
is made possible by the log of updates maintained in a system Log-File on a stable (non-
volatile) storage.
Let us consider a Transaction to transfer an amount of Rs. 1000/= from
ACCOUNT# 100 to ACCOUNT# 200.
Let the relation Account be on Schema ACCOUNT (Account-No, Branch-Name, Balance).
BEGIN TRANSACTION;
UPDATE Account
SET Balance = Balance - 1000
WHERE Account-No = ‘100’;
IF any error THEN GO TO UNDO;
UPDATE Account
SET Balance = Balance + 1000
WHERE Account-No = ‘200’;
IF any error THEN GO TO UNDO;
COMMIT TRANSACTION;
GO TO FINISH;
UNDO: ROLLBACK TRANSACTION;
FINISH: RETURN;
The above transaction involves two updates to the relation Account. Temporarily,
the database would be in an inconsistent state, when one update has been performed and
the other one is still to be performed. During this period, the total Balance at the
BRANCH would show a deficit of 1000, but at the end of the Transaction, the total would
tally. So, we can state that a Transaction transforms the database from one consistent
state to other consistent state, without necessarily preserving consistency at all
intermediate steps.
Suppose the system fails between the two updates i.e. the first update is executed,
but not the second. Then, the database would be left in an inconsistent state. To obviate
this situation, the Transaction should be either executed in its entirety or not at all.
So, if a transaction executes some of its updates and then a failure occurs before the
transaction reaches its planned termination, then the completed updates must be undone
(called ROLLBACK). The system component that provides this atomicity to the
transaction processing is called Transaction Manager.
COMMIT TRANSACTION This operation signals a successful end of the
Transaction. It confirms to the Transaction Manager that a transaction has been
successfully completed and the database is in a consistent state. All the updates made by
the transaction have been made safe in a non-volatile Log File.
ROLLBACK TRANSACTION This operation signals an unsuccessful end of the
transaction. It indicates to the transaction manager that a transaction has failed to
complete due to some failure and the database may be in an inconsistent state, till all the
updates made by the transaction are “rolled back” i.e. undone.
The log is maintained in two portions:-
An active or online log, which is maintained on the online disk. This log is used
for minor recoveries, during normal operations.
An archive or offline log, which is maintained on a tape. The offline log
maintains the record of updates since the last backup, which are used to restore
the system, in case of major failures. The database is first installed from the last
backup and then updates since the last backup are applied, to bring the database as
close as possible to the state that existed at the time of failure. Under such
situations, 100% recovery is not practically feasible.
Transaction Recovery A transaction begins with the successful execution of a
BEGIN TRANSACTION statement and ends with the successful execution of a
COMMIT or ROLLBACK statement. Thus, a COMMIT establishes a COMMIT
POINT (also called synch-point) at which database is in a state of consistency. A
ROLLBACK rolls back the database to the previous COMMIT POINT, at which again
the database was in a state of consistency.
When a COMMIT POINT is established:-
1. All the updates, made by the program since the previous COMMIT POINT, are
committed i.e. are made PERMANENT.
2. All database pointers (i.e. addressability to certain tuples) and all tuple-locks will
be released. Some systems provide an option to retain addressability to tuples (and also
retain tuple-locks) from one commit point to the next.
A single program execution may comprise a sequence of transactions. A
COMMIT or ROLLBACK will terminate a transaction, but will not terminate the entire
program.
The System does not assume that an application program can include explicit
checks for all possible error conditions. The System will issue implicit ROLLBACK for
any program that fails due to any reason, even if it has not been specified explicitly in the
program. So, a transaction is also used as a unit of recovery.
If a transaction SUCCESSFULLY COMMITS, the system will guarantee that all
its updates are permanently reflected in the database, even if the system crashes before
the updates are physically written into the database. With the help of log, the system
would write such updates physically into the database during RESTART after the failure.
The RESTART procedure will recover any transactions that completed successfully, but
did not manage to get their updates physically written into the database (on stable
storage) prior to the crash.
System Recovery
There are two types of System Failures:-
(a) Local Failure
(b) Global Failure
A local failure affects only the transaction, during the execution of which the failure has
occurred. Recovery from such failures has been covered above.
A global failure affects all the transactions that may be in progress at the time of failure.
Such failures fall into two broad categories:-
1. System Failure (e.g. Power Failure) This affects all transactions currently
in progress but does not physically damage the database. This failure is called soft crash.
2. Media Failure (e.g. Disk Head Crash) It causes damage to the
database, or to some portion of it, and affects those transactions currently using that
portion of the database. A media failure is called hard crash.
Modes of Database Updates
1. Immediate Update
When an active Transaction Ti updates a data item Q, the update is immediately reflected
in the database (on stable storage), even before Ti Commits. In this case, the updates of
those transactions that fail before their Commit Point is reached will have to be UNDONE.
Another limitation of this mode is that every update requires a Disk-Write, often to
sectors that are widely separated, which involves significant overheads of Disk Seek Time.
How to UNDO the updates in case of ROLLBACK?
The System maintains a log (or journal) of all operations on the disk, which
contains details of all updates. The pre-update and post-update values of the
updated objects are recorded in the log, as follows:-
<Ti , Q, Old-Value, New-Value>
This implies that Transaction Ti has updated data item Q from Old-Value to New-
Value.
In case of a rollback, the system uses the log to restore the values to their pre-
update state, i.e. on roll-back of Transaction Ti, the value of data item Q
will be reverted from New-Value back to Old-Value.
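The undo step can be sketched in Python. This is a minimal illustration with an in-memory stand-in for the database and log; all names (`log`, `database`, `undo`) are illustrative, not a real DBMS API:

```python
# Minimal sketch of UNDO from a log (illustrative, not a real DBMS).
# Each log record is <Ti, Q, Old-Value, New-Value>, here a 4-tuple.
log = [
    ("T1", "Q", 100, 150),   # T1 updated Q from 100 to 150
    ("T1", "R", 20, 25),     # T1 updated R from 20 to 25
]
database = {"Q": 150, "R": 25}   # updates already reflected on stable storage

def undo(txn_id, log, database):
    """Roll back txn_id by scanning the log backward and restoring Old-Value."""
    for t, item, old_value, _new_value in reversed(log):
        if t == txn_id:
            database[item] = old_value

undo("T1", log, database)
print(database)   # {'Q': 100, 'R': 20}
```

The backward scan matters: if a transaction updated the same item twice, the earliest Old-Value must be the one that survives.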
2. Deferred Update: The updates performed by an active Transaction Ti are not
immediately reflected on stable storage; they are kept in RAM in the DBMS
Buffers. The updates of Transaction Ti are transferred from the DBMS Buffers to Stable
Storage only after Ti Commits. The updates are also recorded in the Log-File on stable
storage, which enables recovery in case of failures. Since log-file writes go to nearby
sectors, each update does not incur the large seek time of a scattered disk write, so this
mode produces higher throughput than the Immediate Update mode.
But one limitation of this mode is the following: suppose a system failure occurs after a
Transaction Ti has committed, but before its updates have been transferred from the
DBMS Buffers to stable storage. In this case, the updates of Ti have to be Redone
during the Recovery Procedure. If <Ti, Q, Old-Value, New-Value> is an entry in the
log-file pertaining to a Transaction Ti that is to be Redone, then during Recovery, New-
Value will be rewritten into data item Q.
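The redo step mirrors the undo step, but scans forward and writes New-Value. Again a minimal sketch with illustrative names:

```python
# Minimal sketch of REDO for a committed transaction under Deferred Update
# (illustrative, not a real DBMS). Records are <Ti, Q, Old-Value, New-Value>.
log = [
    ("T2", "Q", 100, 180),
    ("T2", "S", 5, 7),
]
database = {"Q": 100, "S": 5}   # buffers were lost; stable storage is stale

def redo(txn_id, log, database):
    """Reapply txn_id's updates by scanning the log forward, writing New-Value."""
    for t, item, _old_value, new_value in log:
        if t == txn_id:
            database[item] = new_value

redo("T2", log, database)
print(database)   # {'Q': 180, 'S': 7}
```

The forward scan is the mirror image of undo's backward scan: for repeated updates to one item, the latest New-Value must win.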
Recovery from System Failures
During a system failure, the contents of main memory, i.e. the database buffers, are lost. The
precise state of the transactions that were in progress at the time of failure will no
longer be known. Such transactions would need to be rolled back when the system is
restarted after the failure.
There may be some transactions, which had already committed prior to the system
failure, but did not manage to get their updates transferred from the database buffers to the
physical database. Such transactions will need to be redone.
Which failed transactions to be UNDONE and which transactions to be REDONE?
During normal operations, the system keeps TAKING CHECKPOINTS at pre-
specified regular intervals, or when a prescribed number of entries have been made in the
log. Taking a CHECKPOINT means:-
1. Physically writing (force-writing) the contents of the database buffers into the
physical database.
2. Physically writing a special CHECKPOINT RECORD into the physical non-
volatile log. This CHECKPOINT RECORD gives a list of the transactions that
were in progress at the time of taking the CHECKPOINT.
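The two checkpoint steps can be sketched as follows, with in-memory stand-ins for the DBMS buffers, the stable database and the log; all names are illustrative:

```python
# Sketch of taking a CHECKPOINT (illustrative stand-ins, not a real DBMS).
def take_checkpoint(buffers, stable_db, log, active_txns):
    # Step 1: force-write the buffered updates into the physical database.
    stable_db.update(buffers)
    buffers.clear()
    # Step 2: append a CHECKPOINT RECORD listing the in-progress transactions.
    log.append(("CHECKPOINT", sorted(active_txns)))

log = []
buffers = {"Q": 42}          # a dirty, buffered update
stable_db = {"Q": 0}
take_checkpoint(buffers, stable_db, log, active_txns={"T3", "T4"})
print(stable_db, log)   # {'Q': 42} [('CHECKPOINT', ['T3', 'T4'])]
```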
Criteria for UNDO and REDO
Recovery Process is initiated when the system is restarted after a failure. From the view-point
of recovery, there will be three types of transactions:-
1. Transactions, which began and were committed before the last CHECKPOINT.
These transactions need no action during Recovery.
2. Transactions, which began either before or after the last CHECKPOINT, but were
COMMITTED after the checkpoint, prior to failure. These transactions need a
REDO operation during Recovery.
3. Transactions, which began before or after the last CHECKPOINT, but were still
NOT COMMITTED at the time of failure. These need an UNDO operation during
Recovery.
Recovery Procedure
At restart time, the system goes through the following procedure:-
1. It will make use of two lists: the UNDO list and the REDO list. Initialize the UNDO
list to the list of transactions recorded in the most recent CHECKPOINT
RECORD, and initialize the REDO list to empty.
2. Starting from the most recent CHECKPOINT RECORD, search the log file in the
forward direction.
3. If a “BEGIN TRANSACTION” log entry is found for transaction Ti, then add Ti
to the UNDO list.
4. If a “COMMIT” log entry is found for transaction Ti, move Ti from UNDO list to
the REDO list.
5. When the end of log file is reached, the UNDO and REDO lists are final.
6. The system now works backward through the log file, undoing the transactions in
the UNDO list. This is called BACKWARD RECOVERY.
7. Then, the system works forward redoing the transactions in the REDO list. This is
called FORWARD RECOVERY.
Log Based Recovery Algorithm
Undo-List := ∅;
Redo-List := ∅;
Pass I
/* This Pass is made scanning the Log-File, starting from the Failure Point, traversing it
in the Backward Direction, till the Last Check-Point Record is encountered.
Then, update the Undo-List as follows:- */
For each Transaction Entry Ti ∈ Log-File Do
Undo-List := Undo-List ∪ {Ti};
Pass II
/* This Pass is made scanning the Log-File, starting from the Last Check-Point Record,
traversing it in the Forward Direction, till the Failure Point Record is encountered. */
While Traversing the Log-File Do
Begin
For Each Record <Begin-Transaction Ti> ∈ Log-File Do
Undo-List := Undo-List ∪ {Ti};
For Each Record <Commit Ti> ∈ Log-File Do
Begin
Undo-List := Undo-List - {Ti}; /* Transfer Ti from Undo-List to Redo-List */
Redo-List := Redo-List ∪ {Ti};
End;
End;
/*At the End of this Pass, the Undo-List and Redo-List will be ready */
Pass III
/* This Pass is made scanning the Log-File, starting from the Failure Point, traversing the
Log-File in the Backward Direction, till the Undo-List gets empty. */
While Traversing the Log-File Do
Begin
For Each Data-Update Record <Ti, Q, Old-Val, New-Val> ∈ Log-File Do
If Ti ∈ Undo-List
Then Q := Old-Val; /* Undo the Update */
For Each Record <Begin-Transaction Ti> ∈ Log-File Do
Undo-List := Undo-List - {Ti}; /* Remove Ti from the Undo-List */
End;
/*At the End of this Pass, the Undo of all Transactions in the Undo-List would be
complete */
Pass IV
/* This Pass is made scanning the Log-File, starting from the Last Check-Point,
traversing the Log-File in the Forward Direction, till the Redo-List gets empty. */
While Traversing the Log-File Do
Begin
For Each Data-Update Record <Ti, Q, Old-Val, New-Val> ∈ Log-File Do
If Ti ∈ Redo-List
Then Q := New-Val; /* Redo the Update */
For Each Record <Commit Ti> ∈ Log-File Do
Redo-List := Redo-List - {Ti}; /* Remove Ti from the Redo-List */
End;
/* At the End of this Pass, the Redo of all Transactions in the Redo-List would be complete */
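For concreteness, the four passes above can be condensed into a runnable Python sketch. The log-record formats and all names are illustrative (a real DBMS log is more elaborate), and Passes III and IV are simplified to scan the relevant portion of the log in full rather than stopping as soon as the lists become empty:

```python
# Records: ("BEGIN", Ti), ("UPDATE", Ti, Q, old, new), ("COMMIT", Ti),
#          ("CHECKPOINT", [transactions active at the checkpoint]).
def recover(log, database):
    # Locate the last checkpoint; records after it form the recovery window.
    cp = max(i for i, r in enumerate(log) if r[0] == "CHECKPOINT")

    # Pass I: backward from the failure point to the checkpoint, collecting
    # every transaction with a log entry in the window (plus the checkpoint list).
    undo_list = set()
    for rec in reversed(log[cp:]):
        if rec[0] == "CHECKPOINT":
            undo_list.update(rec[1])
        else:
            undo_list.add(rec[1])

    # Pass II: forward from the checkpoint; a COMMIT moves Ti to the redo list.
    redo_list = set()
    for rec in log[cp:]:
        if rec[0] == "BEGIN":
            undo_list.add(rec[1])
        elif rec[0] == "COMMIT":
            undo_list.discard(rec[1])
            redo_list.add(rec[1])

    # Pass III: backward through the log, undoing updates of transactions
    # still on the undo list (including their pre-checkpoint updates).
    for rec in reversed(log):
        if rec[0] == "UPDATE" and rec[1] in undo_list:
            _, _ti, item, old, _new = rec
            database[item] = old

    # Pass IV: forward from the checkpoint, redoing committed transactions.
    for rec in log[cp:]:
        if rec[0] == "UPDATE" and rec[1] in redo_list:
            _, _ti, item, _old, new = rec
            database[item] = new
    return undo_list, redo_list

log = [
    ("BEGIN", "T1"),
    ("UPDATE", "T1", "A", 0, 1),
    ("CHECKPOINT", ["T1"]),
    ("BEGIN", "T2"),
    ("UPDATE", "T2", "B", 10, 20),
    ("COMMIT", "T2"),
    ("UPDATE", "T1", "A", 1, 2),
    # failure here: T1 never committed
]
db = {"A": 2, "B": 10}   # A was flushed; B's update was lost with the buffers
undo, redo = recover(log, db)
print(undo, redo, db)    # {'T1'} {'T2'} {'A': 0, 'B': 20}
```

Note how T1's pre-checkpoint update of A must also be undone, since the checkpoint force-wrote it to stable storage; this is why Pass III continues backward past the checkpoint.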
Media Recovery
Media failures imply failures like disk-head crash or disk-controller failure, in which case
some portion of the database is physically destroyed. Recovery from such a failure
involves a reloading of the database from a backup copy (dump) and then using the log
files (both active log file and archived log files) to REDO all transactions that completed
since the backup copy was taken. There is no need to UNDO those transactions, which
were in progress at the time of failure, since those would have been lost from the
database buffers anyway.
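The media-recovery procedure can be sketched in the same style. The record format and names are illustrative, and the set of committed transactions is assumed to be derivable from the log's COMMIT records:

```python
# Sketch of media recovery: reload the backup copy (dump), then REDO every
# transaction that committed since the dump was taken (illustrative names).
def media_recover(dump, log, committed):
    database = dict(dump)          # reload the database from the backup copy
    for rec in log:                # forward through archived + active log files
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _ti, item, _old, new = rec
            database[item] = new   # redo the committed update
    return database

dump = {"A": 1, "B": 2}
log = [("UPDATE", "T5", "A", 1, 9), ("UPDATE", "T6", "B", 2, 3)]
db = media_recover(dump, log, committed={"T5"})   # T6 was still in progress
print(db)   # {'A': 9, 'B': 2}
```

As the text notes, no UNDO pass is needed: T6's in-progress update never reaches the reloaded database, because only committed transactions are replayed.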
Exercise
Q.1. (a) Discuss the salient features of Immediate Database Modification and Deferred Database Modification. Explain the role of the log-file.
(b) Explain how the Log-Based Recovery Algorithm can achieve Database Recovery. This algorithm should cater both for Immediate Update and Deferred Update.
Exercises
Ex.10.1 (a) Explain the Two-Phase Locking Protocol for Concurrency Control in Transaction Processing. Show how this protocol ensures conflict-serializable schedules.
(b) What are the additional stipulations of the Strict Two-Phase Locking Protocol? Explain how it ensures conflict-serializable and cascade-less schedules.
Ex.10.2 Draw a Precedence Graph to determine whether the following schedule is conflict-serializable. If not, show how:-
(i) Two-Phase Locking can be used to achieve conflict-serializability.
(ii) Time-Stamp Ordering can be used to achieve conflict-serializability.
T1              T2
Read (A)
                Write (A);
Read (A)
                Write (A);
Ex.10.3 Explain the Time-Stamp Ordering technique for concurrency control. What are its strengths and limitations as compared to lock-based protocols?
Ex.10.4 Consider the following transactions:-
T1 : Read (A); Read (B); If A = 0 then B := B + 1; Write (B);
T2 : Read (B); Read (A); If B = 0 then A := A + 1; Write (A);
Add Lock and Unlock instructions to Transactions T1 and T2, so that they observe the Two-Phase Locking Protocol.