CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization...

82
CS 277 – Spring 2002 Notes 3 1 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller
  • date post

    20-Dec-2015
  • Category

    Documents

  • view

    217
  • download

    3

Transcript of CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization...

Page 1: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 1

CS 277: Database System Implementation

Notes 03: Disk Organization

Arthur Keller

Page 2: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 2

• How to lay out data on disk• How to move it to memory

Topics for today

Page 3: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 3

What are the data items we want to store?

• a salary• a name• a date• a pictureWhat we have available: Bytes

8bits

Page 4: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 4

To represent:

• Integer (short): 2 bytese.g., 35 is

00000000 00100011

• Real, floating pointn bits for mantissa, m for exponent….

Page 5: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 5

• Characters various coding schemes suggested,

most popular is ASCII

To represent:

Example:A: 1000001a: 11000015: 0110101LF: 0001010 (line feed)

Page 6: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 6

• Booleane.g., TRUE

FALSE

1111 1111

0000 0000

To represent:

• Application specifice.g., RED 1 GREEN 3

BLUE 2 YELLOW 4 …Can we use less than 1

byte/code?Yes, but only if desperate...

Page 7: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 7

• Datese.g.: - Integer, # days since Jan 1, 1900

- 8 characters, YYYYMMDD - 7 characters, YYYYDDD

(not YYMMDD! Why?)• Time

e.g. - Integer, seconds since midnight - characters, HHMMSSFF

To represent:

Page 8: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 8

• String of characters– Null terminated

e.g.,

– Length givene.g.,

- Fixed length

c ta

c ta3

To represent:

Page 9: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 9

• Bag of bits

Length Bits

To represent:

Page 10: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 10

Key Point

• Fixed length items

• Variable length items- usually length given at beginning

Page 11: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 11

• Type of an item: Tells us how to interpret(plus size if

fixed)

Also

Page 12: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 12

Data Items

Records

Blocks

Files

Memory

Overview

Page 13: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 13

Record - Collection of related data

items (called FIELDS)E.g.: Employee record:

name field,salary field,date-of-hire field, ...

Page 14: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 14

Types of records:

• Main choices:– FIXED vs. VARIABLE FORMAT– FIXED vs. VARIABLE LENGTH

Page 15: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 15

A SCHEMA (not record) containsfollowing information

- # fields- type of each field- order in record- meaning of each field

Fixed format

Page 16: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 16

Example: fixed format and length

Employee record(1) E#, 2 byte integer(2) E.name, 10 char. Schema(3) Dept, 2 byte code

55 s m i t h 02

83 j o n e s 01

Records

Page 17: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 17

• Record itself contains format“Self Describing”

Variable format

Page 18: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 18

Example: variable format and length

4I52 4S DROF46

Field name codes could also be strings, i.e. TAGS

# F

ield

s

Cod

e id

enti

fyin

g

field

as

E#

Inte

ger

typ

e

Cod

e f

or

En

am

eS

trin

g t

yp

eLe

ng

th o

f st

r.

Page 19: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 19

Variable format useful for:

• “sparse” records• repeating fields• evolving formats

But may waste space...

Page 20: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 20

• EXAMPLE: variable format record with

repeating fieldsEmployee one or more children3 E_name: Fred Child: SallyChild: Tom

Page 21: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 21

Note: Repeating fields does not imply- variable format, nor- variable size

John Sailing Chess --

• Key is to allocate maximum number ofrepeating fields (if not used null)

Page 22: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 22

Many variants betweenfixed - variable format:

Ex. #1: Include record type in record

record type record lengthtells me whatto expect(i.e. points to schema)

5 27 . . . .

Page 23: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 23

Record header - data at beginning

that describes recordMay contain:

- record type- record length- time stamp- other stuff ...

Page 24: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 24

Ex #2 of variant between FIXED/VAR format

• Hybrid format– one part is fixed, other variable

E.g.: All employees have E#, name, deptother fields vary.

25 Smith Toy 2 retiredHobby:chess

# of varfields

Page 25: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 25

Also, many variations in internal

organization of record

Just to show one: length of field

3 F310 F1 5 F2 12 * * *

3 32 5 15 20 F1 F2 F3

total size

offsets

0 1 2 3 4 5 15 20

Page 26: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 26

Question:We have seen examples for

* Fixed format and length records* Variable format and length records

(a) Does fixed format and variable length

make sense?(b) Does variable format and fixed length

make sense?

Page 27: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 27

Other interesting issues:

• Compression– within record - e.g. code selection– collection of records - e.g. find

common patterns

• Encryption

Page 28: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 28

Next: placing records into blocks

blocks ...

a file

assume fixedlength blocks

assume a single file (for now)

Page 29: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 29

(1) separating records(2) spanned vs. unspanned(3) mixed record types – clustering(4) split records(5) sequencing(6) indirection

Options for storing records in blocks:

Page 30: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 30

Block

(a) no need to separate - fixed size recs.(b) special marker(c) give record lengths (or offsets)

- within each record- in block header

(1) Separating records

R2R1 R3

Page 31: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 31

• Unspanned: records must be within one block

block 1 block 2

...

• Spannedblock 1 block 2

...

(2) Spanned vs. Unspanned

R1 R2

R1

R3 R4 R5

R2 R3(a)

R3(b) R6R5R4 R7

(a)

Page 32: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 32

need indication need indication

of partial record of continuation

“pointer” to rest (+ from where?)

R1 R2 R3(a)

R3(b) R6R5R4 R7

(a)

With spanned records:

Page 33: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 33

• Unspanned is much simpler, but may waste space…

• Spanned essential if record size > block size

Spanned vs. unspanned:

Page 34: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 34

Example

106 recordseach of size 2,050 bytes (fixed)block size = 4096 bytes

block 1 block 2

2050 bytes wasted 2046 2050 bytes wasted 2046

R1 R2

• Total wasted = 2 x 109 Utiliz. = 50%• Total space = 4 x 109

Page 35: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 35

• Mixed - records of different types(e.g. EMPLOYEE, DEPT)allowed in same block

e.g., a block

(3) Mixed record types

EMP e1 DEPT d1 DEPT d2

Page 36: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 36

Why do we want to mix?Answer: CLUSTERING

Records that are frequently accessed together should bein the same block

Page 37: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 37

Compromise:

No mixing, but keep relatedrecords in same cylinder ...

Page 38: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 38

Example

Q1: select A#, C_NAME, C_CITY, …from DEPOSIT, CUSTOMERwhere DEPOSIT.C_NAME =

CUSTOMER.C.NAME

a blockCUSTOMER,NAME=SMITH

DEPOSIT,NAME=SMITH

DEPOSIT,NAME=SMITH

Page 39: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 39

• If Q1 frequent, clustering good• But if Q2 frequent

Q2: SELECT * FROM CUSTOMER

CLUSTERING IS COUNTER PRODUCTIVE

Page 40: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 40

Fixed part in one block

Typically forhybrid format

Variable part in another block

(4) Split records

Page 41: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 41

Block with fixed recs.

R1 (a)R1 (b)

Block with variable recs.

R2 (a)

R2 (b)

R2 (c)

This block alsohas fixed recs.

Page 42: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 42

Question

What is difference between- Split records- Simply using two different

record types?

Page 43: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 43

• Ordering records in file (and block) by some key value

Sequential file ( sequenced)

(5) Sequencing

Page 44: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 44

Why sequencing?

Typically to make it possible to efficiently read records in order(e.g., to do a merge-join — discussed later)

Page 45: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 45

Sequencing Options

(a) Next record physically contiguous

...

(b) Linked

Next (R1)R1

R1 Next (R1)

Page 46: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 46

(c) Overflow area

Recordsin sequence

R1

R2

R3

R4

R5

Sequencing Options

header

R2.1

R1.3

R4.7

Page 47: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 47

• How does one refer to records?

(6) Indirection

Rx

Many options: Physical Indirect

Page 48: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 48

Purely Physical

Device IDE.g., Record Cylinder #

Address = Track #or ID Block #

Offset in block

Block ID

Page 49: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 49

Fully Indirect

E.g., Record ID is arbitrary bit string

maprec ID r address

a

Physicaladdr.Rec ID

Page 50: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 50

Tradeoff

Flexibility Costto move records of indirection(for deletions, insertions)

Page 51: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 51

Physical Indirect

Many optionsin between …

Page 52: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 52

Ex #1 Indirection in block

Header

A block: Free

space

R3

R4

R1 R2

Page 53: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 53

Block header - data at beginning that

describes blockMay contain:

- File ID (or RELATION or DB ID) - This block ID - Record directory

- Pointer to free space- Type of block (e.g. contains recs type 4;

is overflow, …)- Pointer to other blocks “like it”- Timestamp ...

Page 54: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 54

Ex. #2 Use logical block #’s understood by file system

REC ID File ID Block # Record # or Offset

File ID, PhysicalBlock # Block ID

File System Map

Page 55: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 55

File system map may be “Semi-physical”…

File F1: physical address of block 1table of bad blocks:

B57 XXXB107 YYY

Rest can be computed via formula...

Page 56: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 56

Num. Blocks: 20Start Block: 1000Block Size: 100Bad Blocks:

3 20,0007 15,000

File DEFINITION

Where is Block # 2?

Where is Block # 3?

Page 57: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 57

(1) Separating records(2) Spanned vs. Unspanned(3) Mixed record types - Clustering(4) Split records(5) Sequencing(6) Indirection

Options for storing records in blocks

Page 58: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 58

(1) Insertion/Deletion(2) Buffer Management(3) Comparison of Schemes

Other Topics

Page 59: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 59

Block

Deletion

Rx

Page 60: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 60

Options:

(a) Immediately reclaim space(b) Mark deleted

– May need chain of deleted records

(for re-use)– Need a way to mark:

• special characters• delete field• in map

Page 61: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 61

As usual, many tradeoffs...

• How expensive is to move valid record to free space for immediate reclaim?

• How much space is wasted?– e.g., deleted records, delete fields,

free space chains,...

Page 62: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 62

Dangling pointers

Concern with deletions

R1 ?

Page 63: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 63

Solution #1: Do not worry

Page 64: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 64

E.g., Leave “MARK” in map or old location

Solution #2: Tombstones

• Physical IDs

A block

This space This space cannever re-used be re-used

Page 65: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 65

• Logical IDs

ID LOC

7788

map

Never reuseID 7788 nor

space in map...

E.g., Leave “MARK” in map or old location

Solution #2: Tombstones

Page 66: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 66

• Place record ID within every record• When you follow a pointer, check if

it leads to correct record

Solution #3 (?):

Does this work??? If space reused, won’t new record have same ID?

to3-77

rec-id:3-77

Page 67: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 67

• To point, use (pointer + hash)or (pointer + key)?

Solution #4 (?):

• What if record modified???

ptr+hash

key

Page 68: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 68

Easy case: records not in sequence Insert new record at end of

file or in deleted slot If records are variable size,

not as easy...

Insert

Page 69: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 69

Hard case: records in sequence If free space “close by”, not too bad... Or use overflow idea...

Insert

Page 70: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 70

Interesting problems:

• How much free space to leave in each block, track, cylinder?

• How often do I reorganize file + overflow?

Page 71: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 71

Freespace

Page 72: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 72

• DB features needed• Why LRU may be bad Read• Pinned blocks Textbook!• Forced output• Double buffering• Swizzling

Buffer Management

in Notes02

Page 73: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 73

Swizzling

Memory Disk

Rec A

block 1

Rec Ablock 2 block 2

block 1

Page 74: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 74

Translation DB Addr Mem Addr Table Rec-A Rec-A-inMem

One Option:

Page 75: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 75

In memory pointers - need “type” bit

to disk

to memoryM

Another Option:

Page 76: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 76

Swizzling

• Automatic• On-demand• No swizzling / program control

Page 77: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 77

• There are 10,000,000 ways to organize my data on disk…

Which is right for me?

Comparison

Page 78: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 78

Issues:

Flexibility Space Utilization

Complexity Performance

Page 79: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 79

To evaluate a given strategy, compute following parameters:-> space used for expected data-> expected time to- fetch record given key- fetch record with next key- insert record- append record- delete record- update record- read all file- reorganize file

Page 80: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 80

Example

How would you design Megatron 3000 storage system? (for a relational DB, low end)– Variable length records?– Spanned?– What data types?– Fixed format?– Record IDs ?– Sequencing?– How to handle deletions?

Page 81: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 81

• How to lay out data on disk

Data Items

Records

Blocks

Files

Memory

DBMS

Summary

Page 82: CS 277 – Spring 2002Notes 31 CS 277: Database System Implementation Notes 03: Disk Organization Arthur Keller.

CS 277 – Spring 2002

Notes 3 82

How to find a record quickly,given a key

Next