Recovery System

Database Management Systems II, Huiping Cao


Review: The ACID properties

• Atomicity: All actions in the transaction happen, or none happen.
• Consistency: If each transaction is consistent, and the DB starts consistent, it ends up consistent.
• Isolation: Execution of one transaction is isolated from that of other transactions.
• Durability: If a transaction commits, its effects persist.
• The Recovery Manager guarantees Atomicity & Durability.

Motivation

• Atomicity:
  • Transactions may abort (“Rollback”).
• Durability:
  • What if the DBMS stops running?
  • Causes: system crash, media failures.
• Desired behavior after the system restarts:
  • T1, T2 & T3 should be durable.
  • T4 & T5 should be aborted (effects not seen).

[Figure: timeline of transactions T1 to T5, ending at the crash point.]

ARIES recovery algorithm

• Analysis: Identify dirty pages in the buffer pool and active transactions at the time of the crash.
• Redo: Repeat all actions, starting from an appropriate point in the log, and restore the database state to what it was at the time of the crash.
• Undo: Undo the actions of transactions that did not commit, so that the database reflects only the actions of committed transactions.

Three main principles

• Write-Ahead Logging
• Repeating History During Redo
• Logging Changes During Undo

Basic Idea: Logging

• Record REDO and UNDO information, for every update, in a log.
  • Sequential writes to log (put it on a separate disk).
  • Minimal info (diff) written to log, so multiple updates fit in a single log page.
• Log: An ordered list of REDO/UNDO actions.
  • Log record contains: <XID, pageID, offset, length, old data, new data>
  • and additional control info (which we will see soon).

Write-Ahead Logging (WAL)

• The Write-Ahead Logging Protocol:
  • Must force the log record for an update before the corresponding data page gets to disk.
  • Must write all log records for a transaction before commit.

WAL & the Log

• Each log record has a unique Log Sequence Number (LSN).
  • LSNs always increase.
• Each data page contains a pageLSN.
  • The LSN of the most recent log record for an update to that page.
• System keeps track of flushedLSN.
  • The max LSN flushed so far.
• WAL: Before a page is written,
  • pageLSN ≤ flushedLSN

[Figure: the log, split into log records flushed to disk (up to flushedLSN) and the “log tail” in RAM; data pages in the DB and in RAM each carry a pageLSN.]
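
The rule pageLSN ≤ flushedLSN can be sketched as a check made just before a data page is written out. This is only an illustration under simplifying assumptions: the names LogManager, write_page and WalViolation are hypothetical, LSNs are simply positions in an in-memory list, and the actual disk I/O is left out.

```python
class WalViolation(Exception):
    """Raised when a page would reach disk before its log records."""


class LogManager:
    def __init__(self):
        self.records = []      # in-memory log; position + 1 serves as the LSN
        self.flushed_lsn = 0   # max LSN already on stable storage

    def append(self, payload):
        self.records.append(payload)
        return len(self.records)   # LSNs always increase

    def flush(self, up_to_lsn):
        # Log flushes are sequential, so flushedLSN only moves forward.
        target = min(up_to_lsn, len(self.records))
        self.flushed_lsn = max(self.flushed_lsn, target)


def write_page(page_lsn, log, force_log=True):
    """Write a data page to disk only if pageLSN <= flushedLSN."""
    if page_lsn > log.flushed_lsn:
        if not force_log:
            raise WalViolation(
                f"pageLSN {page_lsn} > flushedLSN {log.flushed_lsn}")
        log.flush(page_lsn)    # force the log tail first
    return True                # the page write itself is elided here
```

With force_log=True the sketch forces the log up to pageLSN before the page goes out; with force_log=False it simply refuses the write, which makes the invariant visible.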

Log Records

• Log record fields:
  • Fields common to all log records: prevLSN, transID, type
  • Additional fields for update log records: pageID, length, offset, before-image, after-image
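
One possible in-memory layout for these fields is sketched below. The class name LogRecord, the helper functions, and storing the LSN inside the record are assumptions made for illustration; the slides only list the field names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    # Fields common to all log records (lsn kept here for convenience)
    lsn: int
    prev_lsn: Optional[int]
    trans_id: str
    type: str                          # "update", "commit", "abort", "end", "CLR"
    # Additional fields for update log records
    page_id: Optional[str] = None
    length: Optional[int] = None
    offset: Optional[int] = None
    before_image: Optional[bytes] = None
    after_image: Optional[bytes] = None


def apply_redo(page: bytearray, rec: LogRecord) -> None:
    """Reapply an update: copy the after-image into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.after_image


def apply_undo(page: bytearray, rec: LogRecord) -> None:
    """Reverse an update: copy the before-image back into the page."""
    page[rec.offset:rec.offset + rec.length] = rec.before_image
```

Because only the changed bytes (offset, length, before- and after-image) are logged, a record stays small, which is what lets many updates fit in one log page.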

Log Record Types

• Update
• Commit
  • The commit log record is appended to the log.
  • The log tail is written to stable storage (up to and including the commit record).
  • Other steps, such as removing the transaction’s entry from the transaction table, follow the writing of the commit log record.
• Abort
  • Initiate Undo.
• End (signifies end of commit or abort)
  • After commit/abort, additional actions must be taken.
  • After all these additional steps are completed, write “End”.
• Compensation Log Records (CLRs)
  • Written when a transaction is aborted or during recovery from a crash (UNDO actions).
  • Extra field: undoNextLSN

Transaction Commit

• Write commit record to log.
• All log records up to the transaction’s lastLSN are flushed.
  • Guarantees that flushedLSN ≥ lastLSN.
  • Note that log flushes are sequential.
  • Many log records per log page.
• Commit() returns.
• Write end record to log.
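
The commit sequence above can be sketched as follows. The Log class, the commit function and the LSN spacing of 10 are hypothetical, and "flushing" here only advances a counter rather than doing real I/O.

```python
class Log:
    def __init__(self):
        self.records = []      # (lsn, trans_id, type) tuples
        self.flushed_lsn = 0
        self.next_lsn = 10     # illustrative LSN spacing

    def append(self, trans_id, rec_type):
        lsn = self.next_lsn
        self.next_lsn += 10
        self.records.append((lsn, trans_id, rec_type))
        return lsn

    def flush(self, up_to_lsn):
        self.flushed_lsn = max(self.flushed_lsn, up_to_lsn)


def commit(log, xact_table, trans_id):
    """xact_table maps trans_id -> lastLSN."""
    last_lsn = log.append(trans_id, "commit")   # 1. write commit record
    xact_table[trans_id] = last_lsn
    log.flush(last_lsn)                          # 2. flush up to lastLSN
    assert log.flushed_lsn >= last_lsn           #    flushedLSN >= lastLSN
    # 3. Commit() could return to the caller at this point.
    del xact_table[trans_id]                     # additional bookkeeping
    log.append(trans_id, "end")                  # 4. write end record
    return last_lsn
```

In this sketch the end record is appended but not flushed; only the commit record has to be on stable storage before the caller is told the transaction committed.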

Transaction Abort

• For now, consider an explicit abort of a transaction.
  • No crash involved.
• We want to “play back” the log in reverse order, undoing updates.
  • Get lastLSN of transaction from transaction table.
  • Can follow chain of log records backward via the prevLSN field.
  • Before starting UNDO, write an Abort log record.
    • For recovering from crash during UNDO!
• To perform UNDO, must have a lock on data!

Abort, cont.

• Before restoring old value of a page, write a CLR:
  • You continue logging while you UNDO!
  • CLR has one extra field: undoNextLSN
    • Points to the next LSN to undo (i.e., the prevLSN of the record we are currently undoing).
    • Example:
        10  T1 update P2
        20  T1 update P1
        40  CLR: Undo T1 LSN 20   // undoNextLSN = prevLSN(LSN 20) = 10
• CLRs are never undone (but they might be redone when repeating history: guarantees Atomicity!).
• At end of UNDO, write an “end” log record.
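
A minimal sketch of an explicit abort, assuming the log is a dict from LSN to a record dict and that the transaction's chain contains only update records (no crash involved). The function name, the dict keys and the LSN spacing of 10 are all hypothetical.

```python
def abort(log, pages, trans_id, last_lsn, next_lsn):
    """Roll back one transaction, writing a CLR for every undone update.

    log:   LSN -> record dict with keys type, trans, prev, page, before, after
    pages: page_id -> current value (a whole-page stand-in for byte ranges)
    """
    # Write the abort record before any undo work starts.
    log[next_lsn] = {"type": "abort", "trans": trans_id, "prev": last_lsn}
    tail, next_lsn = next_lsn, next_lsn + 10

    lsn = last_lsn
    while lsn is not None:                  # follow the prevLSN chain backward
        rec = log[lsn]
        # Write the CLR first, then restore the old value.
        log[next_lsn] = {
            "type": "CLR", "trans": trans_id, "prev": tail,
            "page": rec["page"], "after": rec["before"],
            "undo_next": rec["prev"],       # undoNextLSN = prevLSN of undone record
        }
        pages[rec["page"]] = rec["before"]
        tail, next_lsn = next_lsn, next_lsn + 10
        lsn = rec["prev"]

    log[next_lsn] = {"type": "end", "trans": trans_id, "prev": tail}
    return next_lsn                         # LSN of the end record
```

Run on the two-update example from the slide with the abort record at LSN 30, the CLR for LSN 20 lands at LSN 40 with undo_next equal to 10.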

CLR

• Compensation Log Records (CLRs)
  • Written when a transaction is aborted or during recovery from a crash (UNDO actions).
  • Extra field: undoNextLSN

LSN  prevLSN  transID  type           pageID  len  offset  before-image  after-image  undoNextLSN
10   null     T1000    update         P500    3    21      ABC           DEF
20   null     T2000    update         P600    3    41      HIJ           KLM
30   20       T2000    update         P500    3    20      GDE           QRS
40   10       T1000    update         P505    3    21      TUV           WXY
45   40       T1000    abort
50   45       T1000    CLR (UNDO 40)  P505    3    21      WXY           TUV          10

Other Log-Related State

• Transaction Table:
  • One entry per active transaction.
  • Contains XID: transaction id
  • Status (running/committed/aborted)
  • lastLSN: the LSN of the most recent log record for this transaction
• Dirty Page Table:
  • One entry per dirty page in buffer pool.
  • Contains recLSN: the LSN of the log record which first caused the page to be dirty.

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY

Normal Execution of a transaction

• Series of reads & writes, followed by commit or abort.
  • We will assume that write is atomic on disk.
    • In practice, additional details to deal with non-atomic writes.
• Strict 2PL

Checkpointing

• Periodically, the DBMS creates a checkpoint, in order to minimize the time taken to recover in the event of a system crash. Write to log:
  • Step 1: begin_checkpoint record: Indicates when checkpoint began.
  • Step 2: end_checkpoint record: Contains current transaction table and dirty page table. This is a “fuzzy checkpoint”:
    • Other transactions continue to run; so these tables are accurate only as of the time of the begin_checkpoint record.
    • No attempt to force dirty pages to disk; effectiveness of checkpoint limited by oldest unwritten change to a dirty page. (So it is a good idea to periodically flush dirty pages to disk!)
  • Step 3: Store LSN of checkpoint record in a safe place (master record).
    • This will be done after the “end_checkpoint” record is written to stable storage.
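
The three checkpoint steps might look like this in outline. The function take_checkpoint, the list-based log (where a record's index stands in for its LSN) and the master dict are hypothetical, and the sketch assumes the master record stores the LSN of the begin_checkpoint record.

```python
def take_checkpoint(log, xact_table, dirty_page_table, master):
    """Append a fuzzy checkpoint to the log and update the master record."""
    # Step 1: begin_checkpoint marks when the checkpoint began.
    begin_lsn = len(log)
    log.append({"type": "begin_checkpoint"})

    # Step 2: end_checkpoint carries copies of both tables.
    # No dirty pages are forced to disk (fuzzy checkpoint).
    log.append({
        "type": "end_checkpoint",
        "xact_table": dict(xact_table),
        "dirty_page_table": dict(dirty_page_table),
    })

    # Step 3: only after end_checkpoint is on stable storage,
    # record where the checkpoint starts in the master record.
    master["checkpoint_lsn"] = begin_lsn
    return begin_lsn
```

Copying the tables matters: other transactions keep running, so the live tables keep changing after the snapshot is taken.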

The Big Picture: What is Stored Where

• LOG: log records (prevLSN, XID, type, pageID, length, offset, before-image, after-image)
• DB: data pages, each with a pageLSN; master record
• RAM: Transaction Table (lastLSN, status); Dirty Page Table (recLSN); flushedLSN

Crash Recovery: Big Picture

• Start from a checkpoint (found via master record).
• Three phases. Need to:
  • Analysis: Figure out which transactions committed since checkpoint, which failed.
  • REDO all actions (repeat history).
  • UNDO effects of failed transactions.

[Figure: log timeline ending at CRASH, with three marked points: the oldest log record of a transaction active at the crash, the smallest recLSN in the dirty page table after Analysis, and the last checkpoint. Arrows A, R and U show the extent of the Analysis, Redo and Undo passes.]

Recovery: The Analysis Phase

• Finish three tasks:
  • Determine REDO point
  • Determine dirty pages
  • Determine active transactions
• Reconstruct state at checkpoint.
  • Start from the most recent begin_checkpoint
  • via end_checkpoint record
• Scan log forward from checkpoint.
  • End record: Remove transaction from transaction table.
  • Other records: Add transaction to transaction table, set lastLSN=LSN, change transaction status on commit.
  • Update record: If P is not in the dirty page table,
    • Add P to the dirty page table, set its recLSN=LSN.
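
The forward scan can be sketched as below. The function name analysis, the tuple layout of the log and the status codes "U" (not yet committed) and "C" (committed) are assumptions made for the example; the checkpoint tables default to empty.

```python
def analysis(log, checkpoint_tt=None, checkpoint_dpt=None):
    """Rebuild the transaction table and dirty page table after a crash.

    log: list of (lsn, trans_id, type, page_id) records following the checkpoint
    Returns (transaction_table, dirty_page_table, redo_point).
    """
    # Reconstruct state at checkpoint (copied so the inputs stay untouched).
    tt = {k: dict(v) for k, v in (checkpoint_tt or {}).items()}
    dpt = dict(checkpoint_dpt or {})

    for lsn, trans_id, rec_type, page_id in log:
        if rec_type == "end":
            tt.pop(trans_id, None)          # transaction is finished
            continue
        entry = tt.setdefault(trans_id, {"lastLSN": lsn, "status": "U"})
        entry["lastLSN"] = lsn
        if rec_type == "commit":
            entry["status"] = "C"
        if rec_type in ("update", "CLR") and page_id not in dpt:
            dpt[page_id] = lsn              # first record that dirtied the page

    # REDO starts at the smallest recLSN in the dirty page table.
    redo_point = min(dpt.values()) if dpt else None
    return tt, dpt, redo_point
```

Whatever remains in the transaction table with status "U" at the end of the scan was active at the crash and has to be undone.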

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end
         T1000    update  P700    …    …       …

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Last log record not written to disk.

Dirty Page Table
pageID  recLSN
(empty)

Transaction Table
transID  lastLSN
(empty)

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Restore the dirty page table and transaction table using the checkpoint.
• Read log.

Dirty Page Table
pageID  recLSN
P500

Transaction Table
transID  lastLSN
T1000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.

Dirty Page Table
pageID  recLSN
P500
P600

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.
• Scan the second log record.

Dirty Page Table
pageID  recLSN
P500
P600

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.
• Scan the second log record.
• Scan the third log record (no change).

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.
• Scan the second log record.
• Scan the third log record (no change).
• Scan the fourth log record.

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.
• Scan the second log record.
• Scan the third log record (no change).
• Scan the fourth log record.
• Scan the fifth log record: set the status of Transaction T2000 from “U” to “C”.

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000
T2000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

• Checkpoint: beginning of the execution, with empty dirty page table and transaction table.
• Scan the first log record.
• Scan the second log record.
• Scan the third log record (no change).
• Scan the fourth log record.
• Scan the fifth log record: set the status of Transaction T2000 from “U” to “C”.
• Scan the sixth log record: remove T2000 from the transaction table.
• What if P600 was already written to the disk?

Recovery: The REDO Phase

• We repeat history to reconstruct state at crash:
  • Reapply all updates (even of aborted transactions!), redo CLRs.
• Scan forward from the log record containing the smallest recLSN in the Dirty Page Table.
• For each redoable log record (CLR or update), REDO the action unless:
  • Affected page is not in the Dirty Page Table, or
  • Affected page is in the Dirty Page Table, but the recLSN for the entry > LSN of the log record being checked, or
  • pageLSN of this page (as stored on disk) >= LSN of the log record being checked.
• To REDO an action:
  • Reapply logged action.
  • Set pageLSN to LSN.
  • No additional logging!
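
The three skip conditions translate almost directly into code. In this sketch the function name redo, the tuple layout of the log and the whole-page "data" value (standing in for byte-range after-images) are illustrative assumptions.

```python
def redo(log, dpt, pages):
    """Repeat history from the smallest recLSN in the dirty page table.

    log:   list of (lsn, page_id, after_image); page_id is None for
           records that are not redoable (commit, abort, end)
    dpt:   page_id -> recLSN, as produced by Analysis
    pages: page_id -> {"pageLSN": int, "data": value} as found on disk
    Returns the LSNs that were actually reapplied.
    """
    if not dpt:
        return []
    start = min(dpt.values())
    applied = []
    for lsn, page_id, after_image in log:
        if lsn < start or page_id is None:
            continue
        if page_id not in dpt:            # page is not dirty
            continue
        if dpt[page_id] > lsn:            # page was dirtied after this record
            continue
        page = pages[page_id]             # fetch the page only when needed
        if page["pageLSN"] >= lsn:        # change already reached disk
            continue
        page["data"] = after_image        # reapply logged action
        page["pageLSN"] = lsn             # set pageLSN to LSN
        applied.append(lsn)               # no additional logging
    return applied
```

The first two tests use only the dirty page table, so a page is read from disk only when the third test is actually needed.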

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

Smallest recLSN = LSN of the 1st log record
REDO:
• Fetch P500, compare pageLSN and this LSN, apply updates, pageLSN updated.

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end

Smallest recLSN = LSN of the 1st log record
REDO:
• Fetch P500, compare pageLSN and this LSN, apply updates, pageLSN updated.
• Fetch P600: assume P600 was written to disk and that item was not in the DPT; NO UPDATE.

Recovery: The UNDO Phase

ToUndo = {x | x is a lastLSN of a “loser” transaction}
Repeat:
  • Choose (and remove) the largest LSN among ToUndo.
  • If this LSN is a CLR (compensation log record) and undoNextLSN == NULL:
    • Write an End record for this transaction.
  • If this LSN is a CLR and undoNextLSN != NULL:
    • Add undoNextLSN to ToUndo.
  • Else this LSN is an update:
    • Undo the update,
    • Write a CLR,
    • Add prevLSN to ToUndo.
Until ToUndo is empty.
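
The loop above can be sketched as follows. The dict-based log, the function name undo and the LSN spacing of 10 are hypothetical; restoring before-images on the pages is omitted, and the sketch also writes an End record when an undone update has no prevLSN, since nothing is left to undo for that transaction.

```python
def undo(log, losers, next_lsn):
    """Undo all loser transactions, appending CLR and end records to log.

    log:    LSN -> record dict with keys type, trans, prev
            (CLRs also carry undo_next)
    losers: trans_id -> lastLSN for transactions active at the crash
    """
    to_undo = set(losers.values())
    last = dict(losers)                    # current lastLSN per transaction
    while to_undo:
        lsn = max(to_undo)                 # choose the largest LSN
        to_undo.remove(lsn)
        rec = log[lsn]
        trans = rec["trans"]
        if rec["type"] == "CLR":
            nxt = rec["undo_next"]         # CLRs are never undone
        elif rec["type"] == "update":
            log[next_lsn] = {"type": "CLR", "trans": trans,
                             "prev": last[trans], "undoes": lsn,
                             "undo_next": rec["prev"]}
            last[trans] = next_lsn
            next_lsn += 10
            nxt = rec["prev"]
        else:
            nxt = rec["prev"]              # e.g. an abort record: keep walking
        if nxt is None:                    # nothing left for this transaction
            log[next_lsn] = {"type": "end", "trans": trans,
                             "prev": last[trans]}
            next_lsn += 10
        else:
            to_undo.add(nxt)
    return log
```

Always taking the largest LSN means the log is walked backward exactly once, even when several loser transactions are interleaved.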

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end
         T1000    CLR-4

ToUndo = {LSN for 4th log record}
Active transactions: T1000
Most recent LSN: the 4th log record LSN
• Undo the update.
• Write CLR.
• undoNextLSN = prevLSN of the 4th log record = LSN of 1st log record

Dirty Page Table
pageID  recLSN
P500
P600
P505

Transaction Table
transID  lastLSN
T1000

Log
prevLSN  transID  type    pageID  len  offset  before-image  after-image
         T1000    update  P500    3    21      ABC           DEF
         T2000    update  P600    3    41      HIJ           KLM
         T2000    update  P500    3    20      GDE           QRS
         T1000    update  P505    3    21      TUV           WXY
         T2000    commit
         T2000    end
         T1000    CLR-4
         T1000    CLR-1
         T1000    End

Active transactions: T1000
Most recent LSN: the 4th log record LSN
• Undo the update.
• Write CLR-4.
• undoNextLSN = LSN of 1st log record
• Undo the action in the 1st log record; write CLR-1; undoNextLSN = null.
• T1000 End

Example of Recovery

LSN  LOG
00   begin_checkpoint
05   end_checkpoint
10   update: T1 writes P5
20   update: T2 writes P3
30   T1 abort
40   CLR: Undo T1 LSN 10
45   T1 End
50   update: T3 writes P1
60   update: T2 writes P5
     CRASH, RESTART

[Figure: state kept in RAM: Xact Table (lastLSN, status), Dirty Page Table (recLSN), flushedLSN, ToUndo. The log records are annotated with their prevLSNs.]

Example of Recovery

LSN  LOG
00   begin_checkpoint
05   end_checkpoint
10   update: T1 writes P5
20   update: T2 writes P3
30   T1 abort
40   CLR: Undo T1 LSN 10
45   T1 End
50   update: T3 writes P1
60   update: T2 writes P5
     CRASH, RESTART

Dirty Page Table
pageID  recLSN
P1      50
P3      20
P5      10

Transaction Table
transID  lastLSN
T2       60
T3       50

Redo: 10-60
ToUndo = {60, 50}

Example: Crash During Restart!

LSN    LOG
00,05  begin_checkpoint, end_checkpoint
10     update: T1 writes P5
20     update: T2 writes P3
30     T1 abort
40,45  CLR: Undo T1 LSN 10, T1 End
50     update: T3 writes P1
60     update: T2 writes P5
       CRASH, RESTART
70     CLR: Undo T2 LSN 60
80,85  CLR: Undo T3 LSN 50, T3 end
       CRASH, RESTART

Redo: 10-60
ToUndo = {60, 50}
• Process 60, CLR, undoNextLSN=20
• Process 50, CLR, undoNextLSN=null, T3 end
• Log 70-85 written to disk

Dirty Page Table
pageID  recLSN
P1      50
P3      20
P5      10

Transaction Table
transID  lastLSN
T2       60
T3       50

Example: Crash During Restart!

LSN    LOG
00,05  begin_checkpoint, end_checkpoint
10     update: T1 writes P5
20     update: T2 writes P3
30     T1 abort
40,45  CLR: Undo T1 LSN 10, T1 End
50     update: T3 writes P1
60     update: T2 writes P5
       CRASH, RESTART
70     CLR: Undo T2 LSN 60
80,85  CLR: Undo T3 LSN 50, T3 end
       CRASH, RESTART
90     CLR: Undo T2 LSN 20, T2 end

Redo: 10-85
ToUndo = {70}
• Process 70: it is a CLR, so no new CLR is written; undoNextLSN=20
• Process 20: write CLR (90), undoNextLSN=null, T2 end

Dirty Page Table
pageID  recLSN
P1      50
P3      20
P5      10

Transaction Table
transID  lastLSN
T2       70

Additional Crash Issues

• What happens if system crashes during Analysis? During REDO?
• How do you limit the amount of work in REDO?
  • Flush asynchronously in the background.
  • Watch “hot spots”!
• How do you limit the amount of work in UNDO?
  • Avoid long-running transactions.

Summary of Logging/Recovery

• Recovery Manager guarantees Atomicity & Durability.
• LSNs identify log records; linked into backwards chains per transaction (via prevLSN).
• pageLSN allows comparison of data page and log records.

Summary, Cont.

• Checkpointing: A quick way to limit the amount of log to scan on recovery.
• Recovery works in 3 phases:
  • Analysis: Forward from checkpoint.
  • Redo: Forward from oldest recLSN.
  • Undo: Backward from end to first LSN of oldest Xact alive at crash.
• Upon Undo, write CLRs.
• Redo “repeats history”: Simplifies the logic!