CSE 444: Database Internals - courses.cs.washington.edu · 2019-05-19 · Tk, or transactions that...
Transcript of CSE 444: Database Internals - courses.cs.washington.edu · 2019-05-19 · Tk, or transactions that...
CSE444:DatabaseInternals
Section6:Transactions- Recovery
Reviewinthissection
1. UNDOlogging
2. REDOlogging
3. UpdatingARIESDataStructures
UndoLogging
• TwoRules:– 1.IfatransactionwriteselementX,thenthelogrecordofthisupdate<T,X,v>mustbewrittentodiskbeforethenewvalueofX iswrittentodisk.
– 2.Ifatransactioncommits,thentheCOMMITmustbewrittentodiskonlyafterallelementschangedbythetransactionhavebeenwrittentodisk.
Action T Mem A Mem B Disk A Disk B Log
<START T>
INPUT(A) 8 8 8
READ(A,t) 8 8 8 8
t:=t*2 16 8 8 8
WRITE(A,t) 16 16 8 8 <T,A,8>
INPUT(B) 16 16 8 8 8
READ(B,t) 8 16 8 8 8
t:=t*2 16 16 8 8 8
WRITE(B,t) 16 16 16 8 8 <T,B,8>
OUTPUT(A) 16 16 16 16 8
OUTPUT(B) 16 16 16 16 16
COMMIT <COMMIT T>
4
UNDOLOGRULES
1. <T,X,v>beforeOUTPUT(X)2. OUTPUT(X)before<COMMIT>
Whenrecovering(withUNDOlogging)…
• Wecannotsimplyignorethelogbeforearecentcommito Manytransactionsinterleaveatonce.Ifwetruncate
beforeacommitforatransaction,anyinformationaboutthoseunfinishedtransactionswouldbelost.
• Instead,wecanusecheckpointthelogperiodically…
Review:Checkpointing
• Checkpointing (naïve)– Writea<STARTCKPT(T1,…,Tk)>.Flushlogtodisk– Stopacceptingnewtransactions– Waituntilallactivetransactionsabort/commit– Write<CKPT>.Flushlogtodisk.– Resume acceptingtransactions
• Nonquiescent Checkpointing– Writea<STARTCKPT(T1,…,Tk)>.Flushlogtodisk– Continuenormaloperation– WhenallofT1,…,Tk havecompleted,write<ENDCKPT>.Flushlogtodisk– Moreefficient,systemdoesnotseemtobestalled
Problem1.UNDOLogging
LSN1 <START T1>LSN2 <T1 X 5>LSN3 <START T2>LSN4 <T1 Y 7>LSN5 <T2 X 9>LSN6 <START T3>LSN7 <T3 Z 11>LSN8 <COMMIT T1>LSN9 <START CKPT(T2,T3)>LSN10 <T2 X 13>LSN11 <T3 Y 15>
*CRASH*
1.Showhowfarbackintherecoverymanagerneedstoreadthelog
(whichLSNdoweneedtoreadupto?)
UNDO:Howfartoscanlogfromtheend?
• Case1:See<ENDCKPT>first– Allincompletetransactions
beganafter<STARTCKPT…>
• Case2:See<STARTCKPT(T1..TK)>first– Incompletetransactions
beganafter<STARTCKPT…>orincompleteonesamongT1..TK
– Findtheearliest<STARTTi>amongthem
– Atmostwehavetogountiltheprevious<STARTCKPT>…<ENDCKPT>
LSN1 <START T1>LSN2 <T1 X 5>LSN3 <START T2>LSN4 <T1 Y 7>LSN5 <T2 X 9>LSN6 <START T3>LSN7 <T3 Z 11>LSN8 <COMMIT T1>LSN9 <START CKPT(T2,T3)>LSN10 <T2 X 13>LSN11 <T3 Y 15>
*CRASH*
Problem1.UNDOLogging
LSN1 <START T1>LSN2 <T1 X 5>LSN3 <START T2>LSN4 <T1 Y 7>LSN5 <T2 X 9>LSN6 <START T3>LSN7 <T3 Z 11>LSN8 <COMMIT T1>LSN9 <START CKPT(T2,T3)>LSN10 <T2 X 13>LSN11 <T3 Y 15>
*CRASH*
1.Showhowfarbackintherecoverymanagerneedstoreadthelog
(writetheearliestLSN)
LSN3(startoftheearliesttransactionamongincompletetransactions)
Problem1.UNDOLogging
LSN1 <START T1>LSN2 <T1 X 5>LSN3 <START T2>LSN4 <T1 Y 7>LSN5 <T2 X 9>LSN6 <START T3>LSN7 <T3 Z 11>LSN8 <COMMIT T1>LSN9 <START CKPT(T2,T3)>LSN10 <T2 X 13>LSN11 <T3 Y 15>
*CRASH*
2.Showtheactionsoftherecoverymanagerduringrecovery.
Problem1.UNDOLogging
LSN1 <START T1>LSN2 <T1 X 5>LSN3 <START T2>LSN4 <T1 Y 7>LSN5 <T2 X 9>LSN6 <START T3>LSN7 <T3 Z 11>LSN8 <COMMIT T1>LSN9 <START CKPT(T2,T3)>LSN10 <T2 X 13>LSN11 <T3 Y 15>
*CRASH*
2.Showtheactionsoftherecoverymanagerduringrecovery.
Y=15X=13Z=11X=9
RedoLogging
• OneRule:– 1.BeforemodifyinganyelementXondisk,alllogrecordspertainingtothismodification(<T,X,v>andthe<COMMITT>),mustappearondisk.
Action T Mem A Mem B Disk A Disk B Log
<START T>
READ(A,t) 8 8 8 8
t:=t*2 16 8 8 8
WRITE(A,t) 16 16 8 8 <T,A,16>
READ(B,t) 8 16 8 8 8
t:=t*2 16 16 8 8 8
WRITE(B,t) 16 16 16 8 8 <T,B,16>
<COMMIT T>
OUTPUT(A) 16 16 16 16 8
OUTPUT(B) 16 16 16 16 16
13
REDOLOGRULE
Both<T,X,v>and<COMMIT>beforeOUTPUT(X)v=newvalue
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPT????>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPT????>20.<COMMITT5>*CRASH*
1.Whatarethecorrectvaluesofthetwo<STARTCKPT????>records?
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPT????>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPT????>20.<COMMITT5>
1.Whatarethecorrectvaluesofthetwo<STARTCKPT????>records?
FirstSTARTCKPT:<STARTCKPT(T2,T3)>
SecondSTARTCKPT:<STARTCKPT(T4,T5)>
Problem2:REDOLogging(Checkpoint)
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPTT2,T3>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPTT4,T5>20.<COMMITT5>
NOTE:
<CommitT3>after<ENDCKPT>
WhatareweCKPTing?
Thetransactionsthatcommittedbefore<STARTCKPT>
REDO:Howfartoscanlogfromthestart?
• Identifycommittedtransactions
• Case1:See<ENDCKPT>first– Allcommittedtransactionsbefore<STARTCKPT(T1..TK)>arewritten– ConsiderT1..Tk,ortransactionsthatstartedafter<STARTCKPT…>,
tracebackuntilearliest<STARTTi>
• Case2:See<STARTCKPT(T1..TK)>first– CommittedtransactionsbeforeSTARTCKPTmightnothavebeen
written– Findprevious<ENDCKPT>,itsmatching<STARTCKPT(S1,…Sm)>– Redocommittedtransactionsthatstartedafter<STARTCKPTT1..Tk>
orS1..Sm
REDO:Howfartoscanlogfromthestart?
• Identifycommittedtransactions
• Case1:See<ENDCKPT>first– Allcommittedtransactionsbefore<STARTCKPT(T1..TK)>arewritten– ConsiderT1..Tk,ortransactionsthatstartedafter<STARTCKPT…>,
tracebackuntilearliest<STARTTi>
• Case2:See<STARTCKPT(T1..TK)>first– CommittedtransactionsbeforeSTARTCKPTmightnothavebeen
written– Findprevious<ENDCKPT>,itsmatching<STARTCKPT(S1,…Sm)>– Redocommittedtransactionsthatstartedafter<STARTCKPTT1..Tk>
orS1..Sm
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPTT2,T3>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPTT4,T5 >20.<COMMITT5>
2.Whatfragmentofthelogdoestherecoverymanagerneedtoread?
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPTT2,T3>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPTT4,T5>20.<COMMITT5>
2.Whatfragmentofthelogdoestherecoverymanagerneedtoread?•WeknowtherewasacommitforT5.
•InthepreviousSTARTCKPT,T2andT3werethetwoactivetransactions.Bothtransactionscommittedandmustthusberedone.
•T2wastheearliestone
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPTT2,T3>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPTT4,T5 >20.<COMMITT5>
3.Whichelementsarerecoveredbytheredorecoverymanager?computetheirvaluesafterrecovery.
Problem2:REDOLogging
1. <STARTT1>2. <T1,A,10>3. <STARTT2>4. <T2,B,5>5. <T1,C,7>6. <STARTT3>7. <T3,D,12>8. <COMMITT1>9. <STARTCKPTT2,T3>10.<STARTT4>
11.<T2,E,5>12.<COMMITT2>13.<T3,F,1>14.<T4,G,15>15.<ENDCKPT>16.<COMMITT3>17.<STARTT5>18.<T5,H,3>19.<STARTCKPTT4,T5 >20.<COMMITT5>
3.Whichelementsarerecoveredbytheredorecoverymanager?computetheirvaluesafterrecovery.
AllchangesbyT2,T3,T5(committed)B=5D=12E=5F=1H=3
Review:ARIESDataStructures(UNDO/REDOLogging)
Example.1. T1000 changesthevalueofAfrom“abc”to“def” onpageP5002. T2000 changesthevalueofBfrom“hij”to“klm” onpageP6003. T2000 changesthevalueofDfrom“mnp”to“qrs”onpageP5004. T1000 changesthevalueofCfrom“tuv”to“wxy”onpageP5055. T2000 commitsandtheendlogrecordiswritten6. T1000 changesthevalueofEfrom“pq”to“rs” onpageP7007. P600 isflushedtodisk8. Crash!!
SeeSection7
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
ARIES Data StructurespageID recLSN LSN prevLSN tID pID Log entry Type undoNextLSN
101
Dirty page table Log
transID lastLSN statusTransaction table
Buffer Pool
24
A=abc D=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
pageID recLSN LSN prevLSN tID pID Log entry Type undoNextLSN
101
Dirty page table Log
transID lastLSN statusTransaction table
Buffer Pool
25
A=abc D=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Firstoperation:1.T1000 changesthevalueofAfrom“abc”to“def” onpageP500?
P500PageLSN= 101
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
pageID recLSNP500 101
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -
Dirty page table Log
transID lastLSN status
T1000 101 Running
Transaction table
Buffer Pool
26
A=defD=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Changes1.T1000 changesthevalueofAfrom“abc”to“def” onpageP500
P500PageLSN= 101
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
pageID recLSNP500 101
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -
Dirty page table Log
transID lastLSN status
T1000 101 Running
Transaction table
Buffer Pool
27
A=defD=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Next:2.T2000 changesthevalueofBfrom“hij”to“klm” onpageP600?
P500PageLSN= 101
P600PageLSN= 102
P505PageLSN= -
P700PageLSN= -
B=klm
pageID recLSNP500 101
P600 102
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -102 - T2000 P600 Write B
“hij” -> “klm”Update
Dirty page table Log
transID lastLSN status
T1000 101 Running
T2000 102 Running
Transaction table
Buffer Pool
28
A=def D=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Changes:2.T2000 changesthevalueofBfrom“hij”to“klm” onpageP600?
P500PageLSN= 101
P600PageLSN= 102
P505PageLSN= -
P700PageLSN= -
B=klm
pageID recLSNP500 101
P600 102
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -102 - T2000 P600 Write B
“hij” -> “klm”Update
Dirty page table Log
transID lastLSN status
T1000 101 Running
T2000 102 Running
Transaction table
Buffer Pool
29
A=def D=mnp
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Next:3.T2000 changesthevalueofDfrom“mnp”to“qrs”onpageP500?
P500PageLSN= 103
P600PageLSN= 102
P505PageLSN= -
P700PageLSN= -
B=klm
pageID recLSNP500 101
P600 102
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -
102 - T2000 P600 Write B“hij” -> “klm”
Update -
103 102 T2000 P500 Write D“mnp” -> “qrs”
Update -
Dirty page table Log
transID lastLSN status
T1000 101 Running
T2000 103 Running
Transaction table
Buffer Pool
30
A=def D=qrs
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Changes:3.T2000 changesthevalueofDfrom“mnp”to“qrs”onpageP500
P500PageLSN= 103
P600PageLSN= 102
P505PageLSN= -
P700PageLSN= -
B=klm
pageID recLSNP500 101
P600 102
LSN prevLSN tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -
102 - T2000 P600 Write B“hij” -> “klm”
Update -
103 102 T2000 P500 Write D“mnp” -> “qrs”
Update -
Dirty page table Log
transID lastLSN status
T1000 101 Running
T2000 103 Running
Transaction table
Buffer Pool
31
A=def D=qrs
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Next:4.T1000 changesthevalueofCfrom“tuv”to“wxy”onpageP505?
P500PageLSN= 103
P600PageLSN= 102
P505PageLSN= 104
P700PageLSN= -
B=klm
pageID recLSNP500 101
P600 102
P505 104
LSN prevLSN
tID pID Log entry Type undoNextLSN
101 - T1000 P500 Write A“abc” -> “def”
Update -
102 - T2000 P600 Write B“hij” -> “klm”
Update -
103 102 T2000 P500 Write D“mnp” -> “qrs”
Update -
104 101 T1000 P505 Write C“tuv” -> “wxy”
Update -
Dirty page table Log
transID lastLSN status
T1000 104 Running
T2000 103 Running
Transaction table
Buffer Pool
32
A=defD=qrs
C=tuv E=pq
P500PageLSN= -
P600PageLSN= -
P505PageLSN= -
P700PageLSN= -
B=hij
Disk
A=abc D=mnp
C=tuv E=pq
Changes:4.T1000 changesthevalueofCfrom“tuv”to“wxy”onpageP505?