CS 377 Database Systems - mathcs.emory.edulxiong/cs377_f11/share/... · redo certain transaction...

CS 377 Database Systems Transaction Processing and Recovery 1 Transaction Processing and Recovery Li Xiong Department of Mathematics and Computer Science Emory University


CS 377 Database Systems

Transaction Processing and Recovery

Li Xiong

Department of Mathematics and Computer Science

Emory University

Transaction Processing

• Basic DB Functionalities
  • Data Storage
  • Query Processing
• Transaction Processing Systems
  • Systems with large databases and concurrent users executing database transactions
  • A transaction is a logical unit of DB processing that includes one or more database access operations
  • Issues
    • Transactions may fail – how to recover?
    • Multiple transactions run at the same time – how to ensure consistency?

Why is Recovery Needed?

Pooh was sitting in his house one day, counting his pots of honey, when there came a knock on the door.

“Fourteen,” said Pooh. “Come in. Fourteen. Or was it fifteen? Bother. That’s muddled me.”

“Hallo, Pooh,” said Rabbit.

“Hallo, Rabbit. Fourteen, wasn’t it?”

“What was?”

“My pots of honey what I was counting.”

“Fourteen, that’s right.”

“Are you sure?”

“No,” said Rabbit. “Does it matter?”

Why is Recovery Needed? Really?

• Computer failure (system crash)
  • Main memory failure
• Transaction error
  • Integer overflow; division by zero; erroneous parameter values; logical programming error
• Disk failure
  • Read/write malfunction or head crash
• Catastrophes
  • Power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator

Basic Model and Notations

• A database: a collection of named data items
• Granularity of data: a field, a record, or a whole disk block (concepts are independent of granularity)
• Basic query operations are read and write
  • read_item(x):
    • block containing x → memory
    • program variable x ← value of x in block
  • write_item(x):
    • value of x in block ← program variable x
    • block containing x → disk
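
The two operations can be sketched in Python as follows. This is a toy model, not how any particular DBMS does it: the `Database` class, the `block_of` lookup table, and the dict-based "disk" are all assumptions made for illustration.

```python
import copy


class Database:
    """Toy model: the disk is a dict of blocks; each block is a dict of items."""

    def __init__(self, disk_blocks):
        self.disk = disk_blocks      # block id -> {item name: value}
        self.buffer = {}             # in-memory copies of blocks
        # Assumed lookup table telling which block holds each item.
        self.block_of = {item: bid
                         for bid, block in disk_blocks.items()
                         for item in block}

    def _load(self, bid):
        # block containing x -> memory (only if not already buffered)
        if bid not in self.buffer:
            self.buffer[bid] = copy.deepcopy(self.disk[bid])

    def read_item(self, x):
        bid = self.block_of[x]
        self._load(bid)
        return self.buffer[bid][x]   # program variable <- value of x in block

    def write_item(self, x, value):
        bid = self.block_of[x]
        self._load(bid)
        self.buffer[bid][x] = value                       # value in block <- program variable
        self.disk[bid] = copy.deepcopy(self.buffer[bid])  # block containing x -> disk


db = Database({0: {"X": 100, "Y": 50}})
x = db.read_item("X")
db.write_item("X", x - 10)
print(db.disk)   # {0: {'X': 90, 'Y': 50}}
```

Here `write_item` pushes the block to disk immediately, as on the slide; a real buffer manager may delay that step.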

Sample Transaction with Failure

• E.g. X = 100, Y = 50, N = 10

[Figure: sample transaction, marked with “failure!”]

Multiple Transactions

• Serial schedules: transactions are executed in isolation and consecutively
• E.g. X = 100, Y = 50, N = 10, M = 20
• Drawback?

Concurrency

• Concurrency
  • Interleaved processing: concurrent execution of processes is interleaved on a single CPU
  • Parallel processing: processes are executed concurrently on multiple CPUs

Transaction Execution with Concurrency

• E.g. X = 100, Y = 50, N = 10, M = 20

Outline

• Motivation
• Transaction Concepts
• Recovery
• Concurrency Control (next lecture)

Transaction Concepts

• Transaction: logical unit of data processing that includes one or more basic access operations (read – retrieval; write – insert or update, delete)
• ACID Properties of Transactions
  • Atomicity: an atomic unit; either performed in its entirety or not performed at all
  • Consistency: preserves consistency; takes the database from one consistent state to another
  • Isolation: appears to be executed in isolation from other transactions
  • Durability: changes by a committed transaction must persist
• Transaction Management
  • Recovery – atomicity, durability
  • Concurrency control – isolation
  • Application programs – consistency

Writing to Disk

• In-place updating
  • Write the buffer back to its original disk location, overwriting the old value
  • Before image and after image
• Shadowing
  • Write the updated buffer to a different disk location, so multiple versions of data items can be maintained

Unfinished Transaction Example

Constraint: A = B

T1: A ← A × 2
    B ← B × 2

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);

Initial state (figure columns: memory | disk): A: 8, B: 8

Unfinished Transaction Example

Constraint: A = B

T1: A ← A × 2
    B ← B × 2

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);

failure!

        memory     disk
A:      8 → 16     8 → 16
B:      8 → 16     8

• Violates atomicity
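
A small Python simulation of this scenario, as a sketch only: the `run_t1` function, the `crash_after_output_a` flag, and the dict-based `memory` and `disk` are hypothetical names, and the crash point (after A reaches disk, before B does) is assumed from the figure.

```python
disk = {"A": 8, "B": 8}   # persistent state; constraint: A == B
memory = {}               # volatile buffer, lost on a crash


def run_t1(crash_after_output_a):
    """Run T1 (A <- A*2; B <- B*2); optionally crash between the two disk writes."""
    for item in ("A", "B"):
        memory[item] = disk[item]   # Read(item, t)
        memory[item] *= 2           # t <- t*2; Write(item, t)
    disk["A"] = memory["A"]         # A reaches disk
    if crash_after_output_a:
        memory.clear()              # failure: buffer contents are gone
        return
    disk["B"] = memory["B"]         # B reaches disk


run_t1(crash_after_output_a=True)
print(disk)                         # {'A': 16, 'B': 8}
print(disk["A"] == disk["B"])       # False
```

After the crash only half of T1 is on disk, so the constraint A = B no longer holds.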

Recovery

Credits: Hansel and Gretel, 782 AD

• Keep a system log and perform recovery when necessary
• System log
  • Separate, non-volatile
  • Periodically backed up to archival storage (tape)
  • Append-only file consisting of entries called log records
  • Records the operations that each transaction has performed on the data

Recovery

• Log records
  • start: beginning of transaction execution
  • read or write: read or write operations on database items
  • commit: successful end of the transaction – any updates should be permanently applied to the DB (appear on disk)
  • rollback or abort: unsuccessful end – any changes should not be applied to the DB, or should be undone if already applied
• Write-ahead logging (WAL): all modifications are written to a log before they are applied to the database
• Logging
  • Undo – immediate update
  • Redo – deferred update
• Recovery
  • Undo certain transaction operations to ensure that no operation of an uncommitted transaction remains applied
  • Redo certain transaction operations to ensure that all operations of a committed transaction are applied successfully

Undo Logging

• Idea: undo operations for uncommitted transactions to go back to the original state of the DB
• In order to undo the updates made by a transaction, we save the original (old) value of every updated data item
• An UNDO log:
  • [start, TID]: indicates that transaction TID has started
  • [write, TID, X, old_value]: indicates that transaction TID has overwritten data item X, whose value was old_value
  • [commit, TID]: indicates that transaction TID has completed successfully
  • [abort, TID]: indicates that transaction TID has been aborted

Undo Logging

• When a new transaction T begins:
  • Append [start, T] to the UNDO log
• When transaction T reads a data item X:
  • Nothing needs to be logged
• When transaction T writes a data item X:
  • Append [write, T, X, old_value] to the UNDO log
  • AFTER the log record has been written successfully, update X (with the new value)
• When transaction T completes successfully:
  • Append [commit, T] to the UNDO log
• When transaction T is aborted:
  • Append [abort, T] to the UNDO log
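
The protocol above can be sketched in Python. This is a minimal illustration under stated assumptions: the `UndoLoggedDB` class, the tuple layout of the log records, and the in-memory dict standing in for the disk are all hypothetical, and `abort` only appends the log record, as listed on the slide.

```python
class UndoLoggedDB:
    """Toy immediate-update store that keeps an UNDO log (a list of tuples)."""

    def __init__(self, data):
        self.data = dict(data)   # stands in for the database on disk
        self.log = []            # append-only UNDO log

    def start(self, tid):
        self.log.append(("start", tid))

    def read(self, tid, x):
        return self.data[x]      # reads are not logged

    def write(self, tid, x, new_value):
        # Log the OLD value first, then apply the update (write-ahead rule).
        self.log.append(("write", tid, x, self.data[x]))
        self.data[x] = new_value

    def commit(self, tid):
        self.log.append(("commit", tid))

    def abort(self, tid):
        self.log.append(("abort", tid))


db = UndoLoggedDB({"A": 8, "B": 8})
db.start("T1")
db.write("T1", "A", db.read("T1", "A") * 2)
db.write("T1", "B", db.read("T1", "B") * 2)
db.commit("T1")
for record in db.log:
    print(record)
```

Running T1 this way leaves both items at 16 and a log holding the old value 8 for each of them.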

Undo Logging: Disk Writing Order

a) Log records of changed data items
b) Changed data items (immediate modifications)
c) Commit log record

Undo logging

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);

Initial state (figure columns: memory | disk | log): A: 8, B: 8

Undo logging

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);

        memory     disk
A:      8 → 16     8 → 16
B:      8 → 16     8 → 16

log:
<T1, start>
<T1, A, 8>
<T1, B, 8>
<T1, commit>

Undo logging: Possible Recovery Rules

• For every Ti with <Ti, start> in log:
    If <Ti, commit> or <Ti, abort> in log, do nothing
    Else, in forward order:
        For all <Ti, X, v> in log:
            write (X, v); output (X)
        Write <Ti, abort> to log

Undo logging: Recovery Rules

Scan the log in reverse order (latest → earliest):

(1) Remember transactions with a <Ti, commit> (or <Ti, abort>) record
(2) For each <Ti, X, v>:
      if Ti does not have <Ti, commit> (or <Ti, abort>)
      then write (X, v); output (X)
(3) For each Ti that does not have <Ti, commit> (or <Ti, abort>):
      write <Ti, abort> to log
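
A minimal Python sketch of these three steps. It assumes log records are tuples such as `("write", tid, item, old_value)` and that a dict stands in for the disk; `undo_recover` is a hypothetical name, not part of any real system.

```python
def undo_recover(log, disk):
    """Undo every update of transactions that have no commit/abort record."""
    finished = set()     # (1) transactions with a commit or abort record
    incomplete = []      # transactions whose updates get rolled back
    for record in reversed(log):              # latest -> earliest
        kind, tid = record[0], record[1]
        if kind in ("commit", "abort"):
            finished.add(tid)
        elif kind == "write" and tid not in finished:
            _, _, item, old_value = record
            disk[item] = old_value            # (2) write(X, v); output(X)
        elif kind == "start" and tid not in finished:
            incomplete.append(tid)
    for tid in incomplete:                    # (3) record the rollback
        log.append(("abort", tid))
    return disk


# T1 wrote A twice and never finished; T2 committed.
log = [("start", "T1"), ("write", "T1", "A", 8), ("write", "T1", "A", 16),
       ("start", "T2"), ("write", "T2", "B", 8), ("commit", "T2")]
disk = {"A": 32, "B": 16}
undo_recover(log, disk)
print(disk)   # {'A': 8, 'B': 16}
```

The example also shows why the scan runs backwards: when a transaction has written the same item more than once, the earliest old value must be restored last.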

Checkpointing

Periodically:

(1) Do not accept new transactions
(2) Wait until all active transactions finish
(3) Flush all log records to disk (log)
(4) Write a “checkpoint” record on disk (log)
(5) Resume transaction processing

Nonquiescent Checkpointing

(1) Write a log record <Start CKPT (T1, …, Tk)> listing all active transactions
(2) Wait until all active transactions commit or abort; do not prohibit other transactions from starting
(3) Flush all log records to disk (log)
(4) Write a log record <End CKPT>

Exercise: Undo Logging

• An undo logging database starts a nonquiescent checkpoint after line 5. The initial value of A in the database (on disk) is 2.
• Show the log file entries that would be generated by this execution.
• If the system crashes, what is the value of A in the database? What recovery would have to be done?
  • immediately after line 12
  • immediately after line 11

T1 T2 T3

--------------------------------------------------------------------

0 start

1 READ A

2 A := A + 1

3 start

4 WRITE A

5 commit

6 start

7 READ A

8 A := A + 1

9 READ A

10 commit

11 WRITE A

12 commit

Outline

• Transaction Basics
• Recovery
  • Undo Logging
  • Redo Logging
  • Undo/Redo Logging
• Concurrency Control

REDO Logging

• Idea: save disk I/Os by deferring data changes – (re)do the changes for committed transactions
• In order to redo the updates made by the transaction, we save the NEW value of every updated data item
• A REDO log:
  • [start, TID]: indicates that transaction TID has started
  • [write, TID, X, new_value]: indicates that transaction TID has overwritten data item X with new_value
  • [commit, TID]: indicates that transaction TID has completed successfully
  • [abort, TID]: indicates that transaction TID has been aborted

REDO Logging

• When a new transaction T begins:
  • Append [start, T] to the REDO log
• When transaction T reads a data item X:
  • Nothing needs to be logged
• When transaction T writes a data item X:
  • Append [write, T, X, new_value] to the REDO log
• When transaction T completes successfully:
  • Append [commit, T]
  • Update the database
  • Append [end, T]
• When transaction T is aborted:
  • Append [abort, T]
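
A minimal Python sketch of this deferred-update protocol. The `RedoLoggedDB` class, the `pending` table that holds uncommitted writes, and the tuple layout of the log records are assumptions made for illustration.

```python
class RedoLoggedDB:
    """Toy deferred-update store that keeps a REDO log (a list of tuples)."""

    def __init__(self, data):
        self.disk = dict(data)   # stands in for the database on disk
        self.log = []            # append-only REDO log
        self.pending = {}        # tid -> {item: new_value}, not yet on disk

    def start(self, tid):
        self.log.append(("start", tid))
        self.pending[tid] = {}

    def read(self, tid, x):
        # A transaction sees its own deferred writes first; reads are not logged.
        return self.pending[tid].get(x, self.disk[x])

    def write(self, tid, x, new_value):
        self.log.append(("write", tid, x, new_value))   # log the NEW value
        self.pending[tid][x] = new_value                # defer the disk update

    def commit(self, tid):
        self.log.append(("commit", tid))                # commit record first
        self.disk.update(self.pending.pop(tid))         # then update the database
        self.log.append(("end", tid))

    def abort(self, tid):
        self.log.append(("abort", tid))
        self.pending.pop(tid)                           # discard deferred writes


db = RedoLoggedDB({"A": 8, "B": 8})
db.start("T1")
db.write("T1", "A", db.read("T1", "A") * 2)
db.write("T1", "B", db.read("T1", "B") * 2)
before_commit = dict(db.disk)    # still {'A': 8, 'B': 8}
db.commit("T1")
print(before_commit, db.disk)
```

Because nothing touches the disk before the commit record is logged, an aborted transaction needs no undo work in this sketch.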

Redo Logging: Disk Writing Order

a) Log records of changed data items
b) Commit log record
c) Changed data items (deferred modification)

REDO logging

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);

Initial state (figure columns: memory | disk | log): A: 8, B: 8

Redo logging (deferred modification)

T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);
    Output (A); Output (B)

        memory     DB
A:      8 → 16     8 → 16 (output)
B:      8 → 16     8 → 16 (output)

LOG:
<T1, start>
<T1, A, 16>
<T1, B, 16>
<T1, commit>
<T1, end>

Recovery rules: Redo logging

(1) Let S = set of transactions with <Ti, commit> (and no <Ti, end>) in log
(2) For each <Ti, X, v> in log, in forward order (earliest → latest) do:
      if Ti ∈ S then Write(X, v); Output(X)
(3) For each Ti ∈ S, write <Ti, end>
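
These rules can be sketched in Python, assuming log records are tuples such as `("write", tid, item, new_value)` and a dict stands in for the disk; `redo_recover` is a hypothetical name.

```python
def redo_recover(log, disk):
    """Reapply the writes of transactions that committed but did not end."""
    committed = {r[1] for r in log if r[0] == "commit"}
    ended = {r[1] for r in log if r[0] == "end"}
    to_redo = committed - ended                  # (1) the set S
    for record in list(log):                     # (2) earliest -> latest
        if record[0] == "write" and record[1] in to_redo:
            _, _, item, new_value = record
            disk[item] = new_value               # Write(X, v); Output(X)
    for tid in sorted(to_redo):                  # (3) mark redone transactions
        log.append(("end", tid))
    return disk


# T1 committed but crashed before its updates reached disk; T2 never committed.
log = [("start", "T1"), ("write", "T1", "A", 16), ("write", "T1", "B", 16),
       ("commit", "T1"), ("start", "T2"), ("write", "T2", "A", 99)]
disk = {"A": 8, "B": 8}
redo_recover(log, disk)
print(disk)   # {'A': 16, 'B': 16}
```

T2's write is ignored because, under deferred update, an uncommitted transaction never changed the disk in the first place.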