Final Review. SQL Will be in the exam. So, refresh midterm review. Constraints as well.

Date posted: 21-Dec-2015


Final Review


SQL

• Will be on the exam.

• So, refresh midterm review.

• Constraints as well.


Security and Authorization


GRANT Command

GRANT privileges ON object TO users [WITH GRANT OPTION]

• The following privileges can be specified:

– SELECT: Can read all columns, including those added later via the ALTER TABLE command.

– INSERT(column-name): Can insert tuples with non-null or non-default values in this column.

– INSERT: The same right with respect to all columns.

– DELETE: Can delete tuples.

– REFERENCES(column-name): Can define foreign keys (in other tables) that refer to this column.


Grant Examples I

• Joe has created the tables:

– Sailors(sid, sname, rating, age)

– Boats(bid, bname, color)

– Reserves(sid, bid, day)

• Joe now executes the following:

GRANT INSERT, DELETE ON Reserves TO Yuppy WITH GRANT OPTION;

• Yuppy can now insert or delete Reserves rows and authorize someone else to do the same.


Grant Examples II

• Joe further executes:

GRANT SELECT ON Reserves TO Michael;

GRANT SELECT ON Sailors TO Michael WITH GRANT OPTION;

• Michael can now execute SELECT queries on Sailors and Reserves, and he can pass this privilege to others for Sailors but not for Reserves.

• With the SELECT privilege, Michael can create a view that accesses the Sailors and Reserves tables, for example, the ActiveSailors view:

CREATE VIEW ActiveSailors (name, age, day) AS

SELECT S.sname, S.age, R.day

FROM Sailors S, Reserves R

WHERE S.sid = R.sid AND S.rating > 6;

• However, Michael cannot grant SELECT on ActiveSailors to others. Why?


Grant Examples III

• On the other hand, suppose that Michael creates the following view:

CREATE VIEW YoungSailors (sid, age, rating) AS

SELECT S.sid, S.age, S.rating

FROM Sailors S

WHERE S.age < 18;

• The only underlying table is Sailors, for which Michael has SELECT with grant option. Therefore he can pass this on to Eric and Guppy:

GRANT SELECT ON YoungSailors TO Eric, Guppy;

• Eric and Guppy can now execute SELECT queries on the view YoungSailors.

• Note, however, that Eric and Guppy don’t have the right to execute SELECT queries directly on the underlying Sailors table.
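The rule behind these view examples can be sketched in a few lines of Python. This is only an illustrative model, not how any real DBMS stores privileges: the `privileges` dictionary and the function name `can_grant_view_select` are hypothetical. It assumes a view creator may pass on SELECT for a view only when holding SELECT with the grant option on every underlying table.

```python
# Hypothetical model: privileges[(user, table)] = (has_select, has_grant_option)
privileges = {
    ("Michael", "Sailors"): (True, True),    # SELECT WITH GRANT OPTION
    ("Michael", "Reserves"): (True, False),  # SELECT only
}

def can_grant_view_select(user, underlying_tables, privileges):
    """True if user holds SELECT with grant option on every underlying table."""
    return all(
        privileges.get((user, table), (False, False)) == (True, True)
        for table in underlying_tables
    )

# YoungSailors is defined over Sailors only.
young_ok = can_grant_view_select("Michael", ["Sailors"], privileges)
# ActiveSailors is defined over Sailors and Reserves.
active_ok = can_grant_view_select("Michael", ["Sailors", "Reserves"], privileges)
print(young_ok, active_ok)
```

Under this model Michael can grant SELECT on YoungSailors but not on ActiveSailors, because he lacks the grant option on Reserves.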


Grant Examples IV

• Suppose now Joe executes:

GRANT UPDATE (rating) ON Sailors TO Leah;

• Leah can update only the rating column of Sailors. E.g.

UPDATE Sailors S

SET S.rating = 8;

• However, she cannot execute:

UPDATE Sailors S

SET S.age = 25;

• She cannot execute either:

UPDATE Sailors S

SET S.rating = S.rating - 1;

• Why?


Grant Examples V

• Suppose now Joe executes:

GRANT SELECT, REFERENCES(bid) ON Boats TO Bill;

• Bill can refer to the bid column of Boats as a foreign key in another table. E.g.

CREATE TABLE Reserves ( sid INTEGER, bid INTEGER, day DATE, PRIMARY KEY (bid, day), FOREIGN KEY (sid) REFERENCES Sailors, FOREIGN KEY (bid) REFERENCES Boats);


Revoke Examples I

REVOKE [GRANT OPTION FOR] privileges ON object FROM users {RESTRICT | CASCADE}

• Suppose Joe is the creator of Sailors.

GRANT SELECT ON Sailors TO Art WITH GRANT OPTION (executed by Joe)

GRANT SELECT ON Sailors TO Bob WITH GRANT OPTION (executed by Art)

REVOKE SELECT ON Sailors FROM Art CASCADE (executed by Joe)


Revoke Examples II

• Art loses the SELECT privilege on Sailors.

• Then Bob, who received this privilege from Art, and only Art, also loses this privilege.

– Bob’s privilege is said to be abandoned.

• When CASCADE is specified, all abandoned privileges are also revoked,

– possibly causing privileges held by other users to become abandoned and thereby revoked recursively.

• If the RESTRICT keyword is specified, the command is rejected if revoking the privileges would cause other privileges to become abandoned.
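The CASCADE behavior can be pictured as reachability in a graph of grants. The sketch below is a simplified model under stated assumptions: one privilege on one table, every grant carrying the grant option, and a hypothetical function name `revoke_cascade`. A grant survives only if its grantor can still trace the privilege back to the owner.

```python
def revoke_cascade(grants, owner, grantor, grantee):
    """grants: set of (grantor, grantee) pairs for a single privilege.
    Remove one grant, then drop every grant that has become abandoned."""
    remaining = set(grants) - {(grantor, grantee)}
    # Users who still hold the privilege through a chain starting at the owner.
    holders = {owner}
    changed = True
    while changed:
        changed = False
        for g, e in remaining:
            if g in holders and e not in holders:
                holders.add(e)
                changed = True
    # Grants issued by users who no longer hold the privilege are abandoned.
    return {(g, e) for g, e in remaining if g in holders}

# The example from the slides: Joe -> Art -> Bob, then Joe revokes from Art.
after = revoke_cascade({("Joe", "Art"), ("Art", "Bob")}, "Joe", "Joe", "Art")
print(after)

# Variant: Bob also received the privilege directly from Joe, so he keeps it.
after2 = revoke_cascade(
    {("Joe", "Art"), ("Art", "Bob"), ("Joe", "Bob")}, "Joe", "Joe", "Art"
)
print(after2)
```

In the variant, only the grant from Art to Bob is dropped; Bob's privilege is not abandoned because it is still supported by Joe's direct grant.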


Grant and Revoke on Views

• Suppose that Joe created Sailors and gave Michael the SELECT privilege on it with the grant option.

• Michael then created the view YoungSailors and gave Eric the SELECT privilege on YoungSailors.

• Eric now defines a view called FineYoungSailors:

CREATE VIEW FineYoungSailors (name, age, rating) AS

SELECT S.sname, S.age, S.rating

FROM YoungSailors S

WHERE S.rating > 6;

• What happens if Joe revokes the SELECT privilege on Sailors from Michael?

• Michael no longer has the authority to execute the query used to define YoungSailors because the definition refers to Sailors.

– Therefore, the view YoungSailors is dropped (i.e., destroyed).

– In turn, FineYoungSailors is dropped as well.


Revoking REFERENCES privilege

• Suppose Joe had executed:

GRANT REFERENCES(bid) ON Boats TO Bill;

• Bill can refer to the bid column of Boats as a foreign key in another table. E.g.

CREATE TABLE Reserves ( sid INTEGER, bid INTEGER, day DATE, PRIMARY KEY (bid, day), FOREIGN KEY (sid) REFERENCES Sailors, FOREIGN KEY (bid) REFERENCES Boats);

• If Joe revokes the REFERENCES privilege from Bill, then the foreign key constraint referencing the Boats table will be dropped from Bill’s Reserves table.


Storage


The Memory Hierarchy

• Cache: fastest, perhaps 1 MB.

• Main memory: access times under a microsecond, random access, perhaps 512 MB.

• Secondary storage: typically magnetic disks, magneto-optical (erasable) disks, CD-ROM.

– Access times in milliseconds, with great variability.

– Unit of read/write = block or page, typically 16 KB.

– Capacities in gigabytes.

• Tertiary storage: desired data is carried to a read/write port; access times in seconds.

– Most common: racks of tapes; newer devices: CD-ROM “jukeboxes,” tape “silos.”

– Capacities in terabytes.


Disks

• Platters with top and bottom surfaces rotate around a spindle.

• Diameters 1 inch to 4 feet.

• 2–30 surfaces.

• Rotation speed: 3600–7200 rpm.

• One head per surface.

• All heads move in and out in unison.

To motivate many of the ideas used in DBMSs, we must examine the operation of disks in detail.


Tracks and sectors

• Surfaces are covered with concentric tracks.

– Tracks at a common radius = cylinder.

– Important because all data of a cylinder can be read quickly, without moving the heads.

• Typical magnetic disk: 16,000 cylinders.

• Tracks are divided into sectors by unmagnetized gaps (which take up about 10% of the track).

– Typical track: 512 sectors.

– Typical sector: 4096 bytes.

• Sectors are grouped into blocks.

– Typical: one 16 KB block = four 4096-byte sectors.


I/O model of computation

• Disk I/O = read or write of a block. It is very expensive compared with what is likely to be done with the block once it arrives in main memory.

– Perhaps 1,000,000 machine instructions in the time to do one random disk I/O.

• Random block access is the norm if there are several processes accessing disks and the disk controller does not schedule accesses carefully.

• Reasonable model of computation that requires secondary storage: count only the disk I/Os.


Two Phase, Multiway Merge Sort

Merge sort is still not very good in the disk I/O model.

• $\log_2 n$ passes, so each record is read from and written to disk $\log_2 n$ times.

• The secondary memory algorithms operate in a small number of passes;

– in one pass every record is read into main memory once and written out to disk once.

• 2PMMS: 2 reads + 2 writes per block.

• Phase 1

1. Fill main memory with records.

2. Sort using favorite main memory sort.

3. Write sorted sublist to disk.

4. Repeat until all records have been put into one of the sorted lists.


Phase 2

• Use one buffer for each of the sorted sublists and one buffer for an output block.

• Initially load input buffers with the first blocks of their respective sorted lists.

• Repeatedly run a competition among the first unchosen records of each of the buffered blocks.

• Move the record with the least key to the output block; it is now “chosen.”

• Manage the buffers as needed:

• If an input block is exhausted, get the next block from the same file.

• If the output block is full, write it to disk.
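The two phases can be sketched in Python as follows. This is a minimal in-memory illustration, not a disk-based implementation: plain lists stand in for the sorted sublists on disk, `memory_capacity` (a hypothetical parameter) is the number of records that fit in main memory, and the standard library's `heapq.merge` plays the role of the competition among the first unchosen records.

```python
import heapq

def two_phase_multiway_merge_sort(records, memory_capacity):
    # Phase 1: fill "memory", sort, and write out one sorted sublist per fill.
    sublists = [
        sorted(records[i:i + memory_capacity])
        for i in range(0, len(records), memory_capacity)
    ]
    # Phase 2: repeatedly move the least first-unchosen record to the output.
    return list(heapq.merge(*sublists))

data = [31, 5, 47, 2, 19, 43, 7, 23, 11, 3]
result = two_phase_multiway_merge_sort(data, memory_capacity=4)
print(result)
```

A real implementation would read and write whole blocks and keep only one buffer per sublist in memory; the sketch keeps the control flow but ignores the buffer management.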


Reasons to limit the block size

• First, we cannot use blocks that cover several tracks effectively.

• Second, small relations would occupy only a fraction of a block, so large blocks would waste space on the disk.

• The larger the blocks are, the fewer records we can sort by 2PMMS (see next slide).

• Nevertheless, as machines get faster and disks more capacious, there is a tendency for block sizes to grow.


How many records can we sort?

1. The block size is B bytes.

2. The main memory available for buffering blocks is M bytes.

3. Records take R bytes.

• Number of main memory buffers = M/B blocks.

• We need one output buffer, so we can actually use (M/B)-1 input buffers.

• How many sorted sublists does it make sense to produce? (M/B)-1.

• What’s the total number of records we can sort?

• Each time we fill the memory we sort M/R records.

• Hence, we are able to sort $(M/R) \cdot [(M/B)-1]$, or approximately $M^2/(RB)$, records.

If we use the parameters in the example about TPMMS, we have: M = 100 MB = $100 \times 2^{20}$ bytes $\approx 10^8$ bytes, B = 16,384 bytes, R = 160 bytes. So $M^2/(RB) = (100 \times 2^{20})^2 / (160 \times 16{,}384) \approx 4.2$ billion records, or about 2/3 of a terabyte. (With M rounded down to exactly $10^8$ bytes, the result is about 3.8 billion records.)
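The arithmetic can be checked with a short calculation. The sketch below assumes 100 MB means 100 × 2^20 bytes, which is the reading that reproduces the figure of roughly 4.2 billion records; the variable names are only illustrative.

```python
M = 100 * 2**20   # main memory in bytes (assumes 1 MB = 2**20 bytes)
B = 16_384        # block size in bytes
R = 160           # record size in bytes

input_buffers = M // B - 1       # one buffer is reserved for output
records_per_fill = M // R        # records sorted per memory fill in Phase 1
exact = records_per_fill * input_buffers
approx = M**2 // (R * B)         # the M^2/(RB) approximation

print(input_buffers, records_per_fill)
print(exact, approx)
print(approx * R / 1e12, "terabytes (approx.)")
```

The exact count and the approximation differ by only one memory fill's worth of records, which is why the slides use the simpler formula.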


Sorting larger relations

• If our relation is bigger, we can use 2PMMS to create sorted sublists of $M^2/(RB)$ records each.

• Then, in a third pass, we can merge (M/B)-1 of these sorted sublists.

• The third phase lets us sort $[(M/B)-1] \cdot M^2/(RB) \approx M^3/(RB^2)$ records.

• For our example (M = 100 MB, B = 16,384 bytes, R = 160 bytes), the third phase lets us sort about 27 trillion records, occupying about 4.3 petabytes!


Primary Indexes

Dense Indexes

Pointer to every record of a sequential file (ordered by search key).

• Can make sense because records may be much bigger than key-pointer pairs.

– Fit index in memory, even if data file does not?

– Faster search through index than data file?

– Test existence of record without going to data file.

Sparse Indexes

Key-pointer pairs for only a subset of records, typically the first in each block.

• Saves index space.
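A sparse index lookup can be sketched as a binary search over the first key of each block, followed by a scan of that one block. The data below is made up for illustration, and `blocks`, `sparse_index`, and `lookup` are hypothetical names.

```python
from bisect import bisect_right

# Hypothetical sequential file: each block is a sorted list of search keys.
blocks = [[10, 20], [30, 40], [50, 60], [70, 80]]
# Sparse index: the first key of each block; entry i points to block i.
sparse_index = [block[0] for block in blocks]

def lookup(key):
    """Return the number of the block holding key, or None if it is absent."""
    i = bisect_right(sparse_index, key) - 1   # last index entry <= key
    if i < 0:
        return None                           # key is below every indexed key
    return i if key in blocks[i] else None    # must read the block to be sure

print(lookup(40), lookup(45), lookup(5))
```

Unlike a dense index, the sparse index cannot confirm that a key exists without reading the data block, which is the trade-off for the saved index space.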


Dense Index


Secondary Indexes

• A primary index is an index on a sorted file.

• Such an index “controls” the placement of records, which is why it is called “primary.”

• Secondary index = index that does not control placement, surely not on a file sorted by its search key.

– Sparse, secondary index makes no sense.

– Usually, search key is not a “key.”


Indirect Buckets

• To avoid repeating keys in index, use a level of indirection, called buckets.

• Additional advantage: allows intersection of sets of records without looking at records themselves.

• Example: Movies(title, year, length, studioName); secondary indexes on studioName and year.

SELECT title

FROM Movies

WHERE studioName = 'Disney' AND

year = 1995;
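The intersection idea can be sketched with Python sets standing in for buckets of record pointers. The pointer values and bucket contents below are invented for illustration; only the technique (intersect the two buckets, then fetch just the surviving records) comes from the slide.

```python
# Hypothetical buckets: each maps an index key to a set of record pointers.
studio_buckets = {"Disney": {101, 102, 105, 109}, "Fox": {103, 104}}
year_buckets = {1995: {102, 103, 109}, 1996: {101, 104, 105}}

def matching_pointers(studio, year):
    """Intersect the two buckets without touching the Movies records."""
    return studio_buckets.get(studio, set()) & year_buckets.get(year, set())

pointers = matching_pointers("Disney", 1995)
print(sorted(pointers))
```

Only the records behind the pointers in the intersection need to be read from the data file to retrieve their titles.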


B Trees

Generalizes multilevel index.

• Number of levels varies with size of data file, but is often 3.

• B+ tree = form we'll discuss.

– All nodes have same format: n keys, n + 1 pointers.

• Useful for primary, secondary indexes, primary keys, nonkeys.

• Leaf has at least $\lfloor (n+1)/2 \rfloor$ key-pointer pairs.

• Interior nodes use at least $\lceil (n+1)/2 \rceil$ pointers.


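A typical leaf and interior node (unclustered index)

Leaf: keys 57, 81, 95

– First pointer: to record with key 57

– Second pointer: to record with key 81

– Third pointer: to record with key 95

– Last pointer: to next leaf in sequence

Interior node: keys 57, 81, 95

– First pointer: to keys K < 57

– Second pointer: to keys 57 ≤ K < 81

– Third pointer: to keys 81 ≤ K < 95

– Fourth pointer: to keys K ≥ 95

57, 81, and 95 are the least keys we can reach via the corresponding pointers.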


Lookup

Root: [13]

Interior nodes: [7] [23 31 43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37 41] [43 47]

Recursive procedure:

• If we are at a leaf, look among the keys there. If the i-th key is K, then the i-th pointer will take us to the desired record.

• If we are at an internal node with keys K1, K2, …, Kn, then if K < K1 we follow the first pointer, if K1 ≤ K < K2 we follow the second pointer, and so on.

Try to find a record with search key 40.
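The recursive lookup can be sketched in Python on the example tree. The `Node` class and the `lookup` function are hypothetical names, leaves hold only keys (record pointers are omitted), and the grouping of keys into leaves is inferred from the interior-node keys.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children   # None marks a leaf

def lookup(node, key):
    """Return True if key is stored in a leaf reachable from node."""
    if node.children is None:
        return key in node.keys
    # The number of keys <= key selects the child:
    # K < K1 -> first pointer, K1 <= K < K2 -> second pointer, and so on.
    return lookup(node.children[bisect_right(node.keys, key)], key)

leaves = [Node(k) for k in ([2, 3, 5], [7, 11], [13, 17, 19],
                            [23, 29], [31, 37, 41], [43, 47])]
root = Node([13], [Node([7], leaves[:2]), Node([23, 31, 43], leaves[2:])])

print(lookup(root, 40), lookup(root, 41))
```

The search for 40 goes right at the root (40 ≥ 13), follows the third pointer of the node [23 31 43] (31 ≤ 40 < 43), and ends in the leaf [31 37 41], where 40 is not found.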


Insertion

Root: [13]

Interior nodes: [7] [23 31 43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37 41] [43 47]

Try to insert search key 40. First, look it up in order to find where to insert it.

It has to go in the leaf [31 37 41], but that node is full!


Beginning of the insertion of key 40

Root: [13]

Interior nodes: [7] [23 31 43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37] [40 41] [43 47]

Observe the new node [40 41] and the redistribution of keys and pointers.

What’s the problem? No parent yet for the new node!


Continuing the insertion of key 40

Root: [13]

Interior nodes: [7] [23 31 43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37] [40 41] [43 47]

We must now insert a pointer to the new leaf into the interior node [23 31 43]. We must also associate with this pointer the key 40, which is the least key reachable through the new leaf. But the node is full. Thus it too must split!


Completing the insertion of key 40

Root: [13]

Interior nodes: [7] [23 31] [43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37] [40 41] [43 47]

The interior node [43] is a new node.

• We have to redistribute 4 keys (23, 31, 40, 43) and 5 pointers.

• We leave three pointers in the existing node and give two pointers to the new node. 43 goes in the new node.

• But where does the key 40 go?

• 40 is the least key reachable via the new node.


Completing the insertion of key 40

Root: [13 40]

Interior nodes: [7] [23 31] [43]

Leaves: [2 3 5] [7 11] [13 17 19] [23 29] [31 37] [40 41] [43 47]

Key 40 goes in the root! 40 is the least key reachable via the new node.


Structure of B-trees

• Degree n means that all nodes have space for n search keys and n+1 pointers.

• Node = block

• Let

– block size be 4096 Bytes,

– key 4 Bytes,

– pointer 8 Bytes.

• Let’s solve for n:

$4n + 8(n+1) \le 4096$

$n \le 340$

n = degree = order = fanout


Example

• n = 340; however, a typical node has 255 keys.

• At level 3 we have $255^2$ nodes, which means $255^3 \approx 16 \times 2^{20}$ records can be indexed.

• Suppose record = 1024 Bytes; then we can index a file of size $16 \times 2^{20} \times 2^{10}$ bytes $\approx$ 16 GB.

• If the root is kept in main memory, accessing a record requires 3 disk I/Os.
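The numbers in this example can be reproduced with a few lines of Python. The variable names are illustrative; the block, key, pointer, and record sizes are the ones assumed in the slides.

```python
BLOCK, KEY, POINTER = 4096, 4, 8

# Largest n with KEY*n + POINTER*(n+1) <= BLOCK.
n = (BLOCK - POINTER) // (KEY + POINTER)

typical_keys = 255
leaf_nodes = typical_keys ** 2    # nodes at level 3
records = typical_keys ** 3       # records that can be indexed
file_bytes = records * 1024       # with 1024-byte records

print(n, leaf_nodes, records)
print(file_bytes / 2**30, "GiB (approx.)")
```

The computed record count, 16,581,375, is slightly below 16 × 2^20 = 16,777,216, so "16 GB" is a rounded figure.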


Transactions


Transactions Correctness Principle

• A transaction is atomic -- all or none property. If it executes partly, an invalid state is likely to result.

• A transaction may change the DB from one consistent state to another consistent state. Otherwise it is rejected (aborted).

• Concurrent execution of transactions may lead to inconsistency – each transaction must appear to be executed in isolation

• The effect of a committed transaction is durable i.e. the effect on DB of a transaction must never be lost, once the transaction has completed.

• ACID: Properties of a transaction:

Atomicity, Consistency, Isolation, and Durability


Concurrent Transactions

• Even when there is no “failure,” several transactions can interact to turn a consistent state into an inconsistent state.


Transactions and Schedules

• A transaction (model) is a sequence of r and w actions on database elements.

• A schedule is a sequence of read/write actions performed by a collection of transactions.

• Serial Schedule = All actions for each transaction are consecutive.

r1(A); w1(A); r1(B); w1(B); r2(A); w2(A); r2(B); w2(B); …

• Serializable Schedule: A schedule whose “effect” is equivalent to that of some serial schedule.

• We will introduce a sufficient condition for serializability.


Conflicts

• Suppose for DB elements X and Y, ri(X); rj(Y) is part of a schedule, and we flip the order of these operations.

– ri(X); rj(Y) ≡ rj(Y); ri(X)

– This always holds (even when X=Y).

• We can flip ri(X); wj(Y), as long as X≠Y.

• However, ri(X); wj(X) ≢ wj(X); ri(X).

– In the RHS, Ti reads the value of X written by Tj, whereas it is not so in the LHS.


Conflicts (Cont’d)

• We can flip wi(X); wj(Y), provided X≠Y.

• However, wi(X); wj(X) ≢ wj(X); wi(X).

– The final value of X may be different depending on which write occurs last.

• Two actions of different transactions conflict if either of 2 conditions holds:

– a read and a write of the same X, or

– two writes of the same X.

• Conflicting actions may not, in general, be swapped in order.

All other events (reads/writes) may be swapped without changing the effect of the schedule (on the DB).
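The conflict rules lead to a standard test for conflict-serializability: build a precedence graph with an edge Ti → Tj whenever an action of Ti conflicts with a later action of Tj, and check that the graph has no cycle. The precedence-graph test is not spelled out in these slides, so the sketch below is an added illustration; the tuple encoding of actions and the function names are hypothetical.

```python
def conflicts(a, b):
    """a, b are (transaction, op, element) with op in {'r', 'w'}."""
    return a[0] != b[0] and a[2] == b[2] and 'w' in (a[1], b[1])

def is_conflict_serializable(schedule):
    # Edge Ti -> Tj if an action of Ti conflicts with a later action of Tj.
    edges = {
        (a[0], b[0])
        for i, a in enumerate(schedule)
        for b in schedule[i + 1:]
        if conflicts(a, b)
    }
    nodes = {t for t, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge; a cycle blocks this.
    while nodes:
        free = {n for n in nodes if not any(v == n for _, v in edges)}
        if not free:
            return False
        nodes -= free
        edges = {(u, v) for u, v in edges if u not in free}
    return True

# r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)
s1 = [(1, 'r', 'A'), (1, 'w', 'A'), (2, 'r', 'A'), (2, 'w', 'A'),
      (1, 'r', 'B'), (1, 'w', 'B'), (2, 'r', 'B'), (2, 'w', 'B')]
# r1(A); w2(A); w1(A)
s2 = [(1, 'r', 'A'), (2, 'w', 'A'), (1, 'w', 'A')]

print(is_conflict_serializable(s1), is_conflict_serializable(s2))
```

The first schedule is not serial but is equivalent to running T1 and then T2; the second has edges in both directions between T1 and T2, so no serial order matches it.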


Schedulers

• A scheduler takes requests from transactions for reads and writes, and decides if it is “OK” to allow them to operate on the DB or defer them until it is safe to do so.

• Ideal: a scheduler forwards a request iff it cannot lead to inconsistency of DB

– Too hard to decide this in real time.

• Real: it forwards a request if it cannot result in a violation of conflict serializability.

• We thus need to develop schedulers which ensure conflict-serializability.


Lock Actions

• Before reading or writing an element X, a transaction Ti requests a lock on X from the scheduler.

• The scheduler can either grant the lock to Ti or make Ti wait for the lock.

• If granted, Ti should eventually unlock (release) the lock on X.

• Shorthands:

– li(X) = “transaction Ti requests a lock on X”

– ui(X) = “Ti unlocks/releases the lock on X”


Two Phase Locking

There is a simple condition that guarantees conflict-serializability: in every transaction, all lock requests (phase 1) precede all unlock requests (phase 2).
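The two-phase condition is easy to check for a single transaction. The sketch below assumes a transaction's lock actions are given in order as pairs `('l', X)` and `('u', X)`, mirroring the shorthands li(X) and ui(X); the function name `is_two_phase` is hypothetical.

```python
def is_two_phase(actions):
    """actions: ordered list of ('l' or 'u', element) for one transaction."""
    unlocked = False
    for kind, _ in actions:
        if kind == 'u':
            unlocked = True          # phase 2 has begun
        elif kind == 'l' and unlocked:
            return False             # a lock request after an unlock
    return True

ok = is_two_phase([('l', 'A'), ('l', 'B'), ('u', 'A'), ('u', 'B')])
bad = is_two_phase([('l', 'A'), ('u', 'A'), ('l', 'B'), ('u', 'B')])
print(ok, bad)
```

The second transaction releases A before locking B, so another transaction could slip in between and see A changed but B unchanged.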


Undo Logging (Cont’d)

Two rules of Undo Logging:

• U1: Log records for a DB element X must be on disk before any database modification to X appears on disk.

• U2: If a transaction T commits, then the log record <COMMIT T> must be written to disk only after all database elements changed by T are written to disk.

• In order to force log records to disk, the log manager needs a FLUSH LOG command that tells the buffer manager to copy to disk any log blocks that haven’t previously been copied to disk or that have been changed since they were last copied.


Example:

Action       t    Buff A   Buff B   A in HD   B in HD   Log
Read(A,t)    8    8                 8         8         <START T>
t:=t*2       16   8                 8         8
Write(A,t)   16   16                8         8         <T,A,8>
Read(B,t)    8    16       8        8         8
t:=t*2       16   16       8        8         8
Write(B,t)   16   16       16       8         8         <T,B,8>
Flush Log
Output(A)    16   16       16       16        8
Output(B)    16   16       16       16        16        <COMMIT T>
Flush Log


Recovery With Undo Logging

1. Examine the log to identify all transactions T such that <START T> appears in the log, but neither <COMMIT T> nor <ABORT T> does.

– Call such transactions incomplete.

2. Examine each log entry <T, X, v>

a) If T isn’t an incomplete transaction, do nothing.

b) If T is incomplete, restore the old value of X

In what order?

From most recent to earliest.

3. For each incomplete transaction T add <ABORT T> to the log, and flush the log.

• What about the transactions that already had <ABORT T> in the log?

• We do nothing about them. If T aborted, then the effect on the DB should have been restored anyway.
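The recovery steps can be sketched in Python. The log encoding is an assumption made for illustration: records are tuples `('START', T)`, `('COMMIT', T)`, `('ABORT', T)`, or `('UPDATE', T, X, old_value)`, and a dictionary stands in for the database on disk.

```python
def undo_recover(log, db):
    """Undo incomplete transactions; return the log with <ABORT T> records added."""
    started = {rec[1] for rec in log if rec[0] == 'START'}
    finished = {rec[1] for rec in log if rec[0] in ('COMMIT', 'ABORT')}
    incomplete = started - finished
    for rec in reversed(log):                  # most recent to earliest
        if rec[0] == 'UPDATE' and rec[1] in incomplete:
            db[rec[2]] = rec[3]                # restore the old value of X
    return log + [('ABORT', t) for t in sorted(incomplete)]

# Crash after Output(A) but before <COMMIT T> reached the log.
log = [('START', 'T'), ('UPDATE', 'T', 'A', 8), ('UPDATE', 'T', 'B', 8)]
db = {'A': 16, 'B': 8}
new_log = undo_recover(log, db)
print(db, new_log[-1])
```

Scanning backwards matters when an element was changed more than once by incomplete transactions: the earliest old value is restored last and therefore wins.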


Checkpointing

• Problem: in principle, recovery requires looking at the entire log.

• Simple solution: occasional checkpoint operation during which we:

1. Stop accepting new transactions.

2. Wait until all current transactions commit or abort and have written a Commit or Abort log record

3. Flush the log to disk

4. Enter a <CKPT> record in the log and flush the log again

5. Resume accepting transactions

• If recovery is necessary, we know that all transactions prior to a <CKPT> record have committed or aborted and need not be undone


Example of an Undo log

<START T1>

<T1,A,5>

<START T2>

<T2,B,10> (decide to do a checkpoint)

<T2,C,15>

<T1,D,20>

<COMMIT T1>

<COMMIT T2>

<CKPT> (we may now write the CKPT record)

<START T3>

<T3,E,25>

<T3,F,30> (What if a crash occurs at this point?)


Nonquiescent Checkpoint (NQ CKPT)

• Problem: we may not want to stop transactions from entering the system.

• Solution:

1. Write a record <START CKPT(T1,...,Tk)> to the log and flush to disk, where the Ti’s are all current “active” transactions.

2. Wait until all Ti’s commit or abort, but do not prohibit new transactions.

3. When all T1…Tk are “done”, write the record <END CKPT> to the log and flush.


Recovery with NQ CKPT

First case:

• If the crash follows <END CKPT>, then we can restrict recovery to transactions that started after the <START CKPT>.

Second case:

• If the crash occurs between <START CKPT> and <END CKPT>, we need to undo:

1. All transactions T on the list associated with <START CKPT> with no <COMMIT T>.

2. All transactions T with <START T> after the <START CKPT> but with no <COMMIT T>.

i.e. 1+2 undo any incomplete transaction that is on the CKPT list or started after <START CKPT>.


Example of NQ Undo Log

<START T1> <T1,A,5> <START T2> <T2,B,10> <START CKPT (T1,T2)>

<T2,C,15> <START T3> <T1,D,20> <COMMIT T1> <T3,E,25> <COMMIT T2> <END CKPT>

<T3,F,30> (a crash occurs at this point)

What if we have a crash right after <T3,E,25>?


Redo Logging

• Idea: Commit (log record appears on disk) before writing data to disk.

• Redo log entries contain the new values:

– <T,X,NewX> = “transaction T modified X and the new value is NewX”

• Redo logging rule:

– R1. Before modifying DB element X on disk, all log entries (including <COMMIT T>) must be written to the log (on disk).


Example:

Action       t    Buff A   Buff B   A in HD   B in HD   Log
Read(A,t)    8    8                 8         8         <START T>
t:=t*2       16   8                 8         8
Write(A,t)   16   16                8         8         <T,A,16>
Read(B,t)    8    16       8        8         8
t:=t*2       16   16       8        8         8
Write(B,t)   16   16       16       8         8         <T,B,16>
                                                        <COMMIT T>
Flush Log
Output(A)    16   16       16       16        8
Output(B)    16   16       16       16        16


Recovery for Redo Logging

1. Identify committed transactions.

2. Examine the log forward, from earliest to latest.

– Consider only the committed transactions, T.

– For each <T, X, v> in the log do:

WRITE(X,v); OUTPUT(X);

Note 1: Uncommitted transactions will have no effect on the DB (unlike in undo logging).

This is because none of the changes of an uncommitted T have reached the disk.

Note 2: “Redoing” starts from the head of the log.

In effect, each data item X will have the value written by the last committed transaction in the log that changed X.
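Redo recovery can be sketched in the same style. As before, the tuple encoding of log records and the dictionary standing in for the disk are assumptions for illustration; `('UPDATE', T, X, new_value)` corresponds to the redo record <T,X,v>.

```python
def redo_recover(log, db):
    """Reapply the writes of committed transactions, earliest to latest."""
    committed = {rec[1] for rec in log if rec[0] == 'COMMIT'}
    for rec in log:                            # forward scan from the head of the log
        if rec[0] == 'UPDATE' and rec[1] in committed:
            db[rec[2]] = rec[3]                # WRITE(X, v); OUTPUT(X)
    return db

# T committed but its outputs may not have reached disk; U never committed.
log = [('START', 'T'), ('UPDATE', 'T', 'A', 16), ('UPDATE', 'T', 'B', 16),
       ('COMMIT', 'T'), ('START', 'U'), ('UPDATE', 'U', 'A', 99)]
db = redo_recover(log, {'A': 8, 'B': 8})
print(db)
```

The update by U is ignored: under the redo rule none of its changes can have reached disk, so there is nothing to undo and nothing to redo.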


Checkpointing for Redo Logging

• The key action that we must take between the start and end of checkpoint is to write to disk all the dirty buffers.

• Dirty buffers are those that have been changed by committed transactions but not written to disk.

• Unlike in the undo case, we don’t need to wait for active transactions to finish (in order to write <END CKPT>).

• However, we wait for copying dirty buffers of the committed transactions.


Checkpointing for Redo (Cont’d)

1. Write a <START CKPT(T1,...,Tk )> record to the log, where Ti’s are all active transactions.

2. Write to disk all the dirty buffers of transactions that had already committed when the START CKPT was written to log.

3. Write an <END CKPT> record to log.


Checkpointing for Redo (Cont’d)

<START T1>

<T1,A,5>

<START T2>

<COMMIT T1>

<T2,B,10>

<START CKPT(T2)>

<T2,C,15>

<START T3>

<T3,D,20>

<END CKPT>

<COMMIT T2>

<COMMIT T3>

The buffer containing value A might be dirty. If so, copy it to disk. Then write <END CKPT>.

During this period three other actions took place.


Recovery with Ckpt. Redo

Two cases:

1. If the crash follows <END CKPT>, we can restrict ourselves to transactions that began after the <START CKPT> and those in the START list.

• This is because we know that, in this case, every value written by transactions that committed before <START CKPT(…)> is now on disk.

2. If the crash occurs between <START CKPT> and <END CKPT>, then go and find the previous <END CKPT> and do the same as in the first case.

• This is because we are not sure that transactions that committed before <START CKPT(…)> have their changes on disk.