C20.0046: Database Management Systems Lecture #26
description
Transcript of C20.0046: Database Management Systems Lecture #26
M.P. Johnson, DBMS, Stern/NYU, Sp2004
1
C20.0046: Database Management SystemsLecture #26
Matthew P. Johnson
Stern School of Business, NYU
Spring, 2004
M.P. Johnson, DBMS, Stern/NYU, Sp2004
2
Agenda Previously: Indices Next:
Finish Indices, advanced indices Failure/recovery Data warehousing & mining
Websearch
Hw3 due today no extensions!
1-minute responses
Review: clustered, dense, primary, #/tbl, syntax
M.P. Johnson, DBMS, Stern/NYU, Sp2004
3
Query compiler/optimizer
Execution engine
Index/record mgr.
Buffer manager
Storage manager
storage
User/Application
Queryupdate
Query executionplanRecord,
indexrequests
Page commands
Read/writepages
Transaction manager:•Concurrency control•Logging/recovery
Transactioncommands
Let’s get physical
M.P. Johnson, DBMS, Stern/NYU, Sp2004
4
BSTs Very simple data structure in CS: BSTs
Binary Search Trees Keep balanced Each node ~ one item
Each node has two children: Left subtree: < Right subtree: >=
Can search, insert, delete in log time log2(1MB = 220) = 20
M.P. Johnson, DBMS, Stern/NYU, Sp2004
5
Search for DBMS Big improvement: log2(1MB) = 20
Each op divides remaining range in half! But recall: all that matters is #disk accesses
20 is better than 220 but:
Can we do better?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
6
BSTs B-trees Like BSTs except each node ~ one block Branching factor is >> 2
Each access divides remaining range by, say, 300 B-trees = BSTs + blocks B+ trees are a variant of B-trees
Data stored only in leaves Leaves form a (sorted) linked list Better supports range queries
Consequences: Much shorter depth Many fewer disk reads Must find element within node Trades CPU/RAM time for disk time
M.P. Johnson, DBMS, Stern/NYU, Sp2004
7
B+ Trees Parameter n branching factor is n+1
Largest number s.t. one block can contain n search-key values and n+1 pointers
Each node (except root) has at least n/2 keys
30 120 240
Keys k < 30Keys 30<=k<120 Keys 120<=k<240 Keys 240<=k
40 50 60
40 50 60
Next leaf
M.P. Johnson, DBMS, Stern/NYU, Sp2004
8
Searching a B+ Tree Exact key values:
Start at the root If we’re in leaf, walk through its key values; If not, look at keys K1..Kn
If Ki <= K <= Ki+1, look in child i
Range queries: As above Then walk left until test fails
Select nameFrom peopleWhere age = 25
Select nameFrom peopleWhere age = 25
Select nameFrom peopleWhere 20 <= age and age <= 30
Select nameFrom peopleWhere 20 <= age and age <= 30
M.P. Johnson, DBMS, Stern/NYU, Sp2004
9
B+ Tree Example80
20 60 100 120
140
10 15 18 20 30 40 50 60 65 80 85 90
10 15 18 20 30 40 50 60 65 80 85 90
n = 4Find the key 40
40 80
20 < 40 60
30 < 40 40
NB: Leaf keys are sorted; data pointed to is only if clustered
M.P. Johnson, DBMS, Stern/NYU, Sp2004
Clustered & unclustered B-trees
Data entries(Index File)
(Data file)
Data Records
Data entries
Data Records
CLUSTERED UNCLUSTERED
M.P. Johnson, DBMS, Stern/NYU, Sp2004
11
B+ trees, and, or Assume index on a,b,c
Intuition: phone book
WHERE a = ‘x’ and b = ‘y’
WHERE b = ‘y’ and c = ‘z’
WHERE a = ‘a’ and c = ‘z’
WHERE a = ‘x’ or b = ‘y’ or c = ‘z’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
12
B+ trees and LIKE Supports only hard-coded prefix LIKE checks
Intuition: phone book
Select * from T where a like ‘xyz%’
Select * from T where a like ‘%xyz’
Select * from T where a like ‘xyz%zyx%’
M.P. Johnson, DBMS, Stern/NYU, Sp2004
13
B-tree search efficiency With params:
block=4k integer = 4b, pointer = 8b
the largest n satisfying 4n+8(n+1) <= 4096 is n=340 Each node has 170..340 keys assume on avg has (170+340)/2=255
Then: 255 rows depth = 1 2552 = 64k rows depth = 2 2553 = 16M rows depth = 3 2554 = 4G rows depth = 4
M.P. Johnson, DBMS, Stern/NYU, Sp2004
14
B-trees in practice Most DBMSs use B-trees for most indices
Default in MySQL Default in Oracle
Speeds up where clauses Some like checks Min or max functions joins
Limitation: fields used must Be a prefix of indexed fields Be ANDed together
M.P. Johnson, DBMS, Stern/NYU, Sp2004
15
Next topic: Advanced types of indices Spatial indices based on R-trees (R = region)
Support multi-dimensional searches on “geometry” fields
2-d not 1-d ranges
Oracle:
MySQL:
CREATE INDEX geology_rtree_idx ON geology_tab(geometry) INDEXTYPE IS MDSYS.SPATIAL_INDEX;
CREATE INDEX geology_rtree_idx ON geology_tab(geometry) INDEXTYPE IS MDSYS.SPATIAL_INDEX;
CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g));CREATE TABLE geom (g GEOMETRY NOT NULL, SPATIAL INDEX(g));
M.P. Johnson, DBMS, Stern/NYU, Sp2004
16
Advanced types of indices Inverted indices for web doc search
First, think of each webpage as a tuple One column for every possible word True means the word appears on the page Index on all columns
Now can search: you’re fired select * from T where youre=T and fired=T
M.P. Johnson, DBMS, Stern/NYU, Sp2004
17
Advanced types of indices Can simplify somewhat:
1. For each field index, delete False entries
2. True entries for each index become a bucket
Create “inverted index”: One entry for each search word Search word entry points to corresponding
bucket Bucket points to pages with its word Amazon
M.P. Johnson, DBMS, Stern/NYU, Sp2004
18
Advanced types of indices Function-based indices
Speeds up WHERE upper(name)=‘BUSH’, etc.
Now supported in Oracle 8, not MySQL
Bitmap indices Speeds up arbitrary combination of reqs
Not limited to prefixes or conjunctions Now supported in Oracle 9, not MySQL
create index on T(my_soundex(name));create index on T(substr(DOB),4,5));
create index on T(my_soundex(name));create index on T(substr(DOB),4,5));
M.P. Johnson, DBMS, Stern/NYU, Sp2004
19
Bitmap indices Assume table has n records Assume F is a field with m different values Bitmap index on F: m length-n bitstrings
One bitstring for each value of F Each one says which rows have that value for F
Example: n = , mF = , mG =
Q: find rows whereF=50 or (F=30 and G=‘Baz’)
F G
1 30 Foo
2 30 Bar
3 40 Baz
4 50 Foo
5 40 Bar
6 30 Baz
M.P. Johnson, DBMS, Stern/NYU, Sp2004
20
Bitmap index search Larger example: (age,salary) of jewelry buyers:
Bitmaps for age: 25:100000001000, 30:000000010000, 45:01000000100,
50:001110000010, 60:000000000001, 70:000001000000, 85:000000100000
Bitmaps for salary: 60:110000000000, 75:001000000000, 100:000100000000,
110:000001000000, 120:000010000000, 140:000000100000, 260:000000010001, 275:000000000010, 350:000000000100, 400:000000001000
Age Sal.
5 50 120
6 70 110
7 85 140
8 30 260
Age Sal.
9 25 400
10 45 350
11 50 275
12 50 260
Age Sal.
1 25 60
2 45 60
3 50 75
4 50 100
M.P. Johnson, DBMS, Stern/NYU, Sp2004
21
Bitmap index search Query: find buyers of age 45-55 with salary
100-200 Age range: 010000000100 (45) |
001110000010 (50) = 011110000110 Bitwise or of Salary range: 000111100000 AND together: 011110000110 &
000111100000 = 000110000000
What does this mean?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
22
Bitmap index search Once we have row numbers, then what?
Get rows with those numbers (How?)
Bitmap indices in Oracle:
Best for low-cardinality fields Boolean, enum, gender
lots of 0s in our bitmaps Compress: 000000100001 6141
“run-length encoding”
CREATE BITMAP INDEX ON T(F,G);CREATE BITMAP INDEX ON T(F,G);
M.P. Johnson, DBMS, Stern/NYU, Sp2004
23
New topic: Recovery
Type of Crash Prevention
Wrong data entryConstraints andData cleaning
Disk crashesRedundancy:
e.g. RAID, archive
Fire, theft, bankruptcy…
Buy insurance, Change jobs…
System failures:e.g. blackout
DATABASERECOVERY
M.P. Johnson, DBMS, Stern/NYU, Sp2004
24
System Failures Each transaction has internal state When system crashes, internal state is lost
Don’t know which parts executed and which didn’t Remedy: use a log
A file that records each action of each xact Trail of breadcrumbs
M.P. Johnson, DBMS, Stern/NYU, Sp2004
25
Media Failures Rule of thumb: Pr(hard drive has head crash
within 10 years) = 50% Simpler rule of thumb: Pr(hard drive has head
crash within 1 years) = 10% Serious problem
Soln: different RAID strategies RAID: Redundant Arrays of Independent
Disks
M.P. Johnson, DBMS, Stern/NYU, Sp2004
26
RAID levels RAID level 1: each disk gets a mirror RAID level 4: one disk is xor of all others
Each bit is sum mod 2 of corresponding bits E.g.:
Disk 1: 11110000 Disk 2: 10101010 Disk 3: 00111000 Disk 4:
How to recover?
M.P. Johnson, DBMS, Stern/NYU, Sp2004
27
Transactions Transaction: unit of code to be executed
atomically In ad-hoc SQL
one command = one transaction In embedded SQL
Transaction starts = first SQL command issued Transaction ends =
COMMIT ROLLBACK (=abort)
Can turn off/on autocommit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
28
Primitive operations of transactions Each xact reads/writes rows or blocks: elms
INPUT(X) read element X to memory buffer
READ(X,t) copy element X to transaction local variable t
WRITE(X,t) copy transaction local variable t to element X
OUTPUT(X) write element X to disk
LOG RECORD
M.P. Johnson, DBMS, Stern/NYU, Sp2004
29
Transaction example Xact: Transfer $100 from savings to checking
A = A+100; B = B-100;
READ(A,t);
t := t+100;
WRITE(A,t);
READ(B,t);
t := t-100;
WRITE(B,t)
M.P. Johnson, DBMS, Stern/NYU, Sp2004
30
Transaction example READ(A,t); t := t+100;WRITE(A,t); READ(B,t); t := t-100;WRITE(B,t)
Action t Mem A Mem B Disk A Disk B
INPUT(A) 1000 1000 1000
READ(A,t) 1000 1000 1000 1000
t:=t+100 1100 1000 1000 1000
WRITE(A,t) 1100 1100 1000 1000
INPUT(B) 1100 1100 1000 1000 1000
READ(B,t) 1000 1100 1000 1000 1000
t:=t-100 900 1100 1000 1000 1000
WRITE(B,t) 900 1100 900 1000 1000
OUTPUT(A) 900 1100 900 1100 1000
OUTPUT(B) 900 1100 900 1100 900
M.P. Johnson, DBMS, Stern/NYU, Sp2004
31
The log An append-only file containing log records Note: multiple transactions run concurrently,
log records are interleaved After a system crash, use log to:
Redo some transaction that didn’t commit Undo other transactions that didn’t commit
Three kinds of logs: undo, redo, undo/redo We’ll discuss only Undo
M.P. Johnson, DBMS, Stern/NYU, Sp2004
32
Undo Logging Log records <START T>
transaction T has begun <COMMIT T>
T has committed <ABORT T>
T has aborted <T,X,v>
T has updated element X, and its old value was v
M.P. Johnson, DBMS, Stern/NYU, Sp2004
33
Undo-Logging Rules U1: Changes logged (<T,X,v>) before being
written to disk U2: Commits logged (<COMMIT T>) after being
written to disk
Results: May forget we did whole xact (and so wrongly undo) Will never forget did partial xact (and so leave)
Log-change, change, log-change, change, Commit, log-commit
M.P. Johnson, DBMS, Stern/NYU, Sp2004
34
Action T Mem A Mem B Disk A Disk B Log
<START T>
READ(A,t) 1000 1000 1000 1000
t:=t+100 1100 1000 1000 1000
WRITE(A,t) 1100 1100 1000 1000 <T,A,8>
READ(B,t) 1000 1100 1000 1000 1000
t:=t-100 900 1100 1000 1000 1000
WRITE(B,t) 900 1100 900 1000 1000 <T,B,8>
OUTPUT(A) 900 1100 900 1100 900
OUTPUT(B) 900 1100 900 1100 900
COMMIT
<COMMIT T>
Undo-Logging e.g. (inputs omitted)
M.P. Johnson, DBMS, Stern/NYU, Sp2004
35
Recovery with Undo Log After system’s crash, run recovery manager
1. Decide for each xact T whether it was completed
2. Undo all modifications from incomplete xacts, in reverse order (why?) and abort each
<START T>….<COMMIT T> yes<START T>….<ABORT T> yes<START T>…………………… no
<START T>….<COMMIT T> yes<START T>….<ABORT T> yes<START T>…………………… no
M.P. Johnson, DBMS, Stern/NYU, Sp2004
36
Recovery with Undo Log Read log from the end; cases:
<COMMIT T>: mark T as completed <ABORT T>: mark T as completed <T,X,v>:
<START T>: ignore
if T is not completed thenwrite X=v to disk
elseignore
if T is not completed thenwrite X=v to disk
elseignore
M.P. Johnson, DBMS, Stern/NYU, Sp2004
37
Recovery with Undo Log……<T2,X2,v2>……<START T5><START T4><T1,X1,v1><T5,X5,v5><T4,X4,v4><COMMIT T5><T3,X3,v3><T2,X2,v2>
Q: Which updates areundone?
Crash!
Start:
M.P. Johnson, DBMS, Stern/NYU, Sp2004
38
Recovery with Undo Log Note: undo commands are idempotent
No harm done if we repeat them Q: What if system crashes during recovery?
How far back in the log do we go? Don’t go all the way back to the start May be very large
Better idea: use checkpointing
M.P. Johnson, DBMS, Stern/NYU, Sp2004
39
Checkpointing Checkpoint the database periodically
Stop accepting new transactions Wait until all current xacts complete Flush log to disk Write a <CKPT> log record, flush log Resume accepting new xacts
M.P. Johnson, DBMS, Stern/NYU, Sp2004
40
Undo Recovery with Checkpointing
……<T1,X1,v1>……(all completed)<CKPT><START T2><START T3<START T5><START T4><T4,X4,v4><T5,X5,v5><T4,X4,v4><COMMIT T5><T3,X3,v3><T2,X2,v2>
During recovery,can stop at first<CKPT>
xacts T2,T3,T4,T5
other xacts
M.P. Johnson, DBMS, Stern/NYU, Sp2004
41
Non-quiescent Checkpointing Problem: database must freeze during
checkpoint Would like to checkpoint while database is
operational Idea: non-quiescent checkpointing
Quiescent: quiet, still, at rest; inactive
M.P. Johnson, DBMS, Stern/NYU, Sp2004
42
Next time Next: Data warehousing mining! For next time: reading online
Proj5 due next Thursday no extensions!
Now: one-minute responses Relative weight: warehousing, mining, websearch Data mining techniques
NNs GAs kNN Decision Trees