Storage and File Organization
Date posted: 21-Dec-2015
Database System Concepts

General Overview - relational model
Relational model - SQL
Formal & commercial query languages
Functional Dependencies
Normalization
Physical Design
Indexing

Review
DBMSs store data on disk
Disk characteristics: at least 2 orders of magnitude slower than main memory (MM)
Unit of read/write operations: a block/page (a multiple of sectors)
Access time = seek time + rotation time + transfer time
Sequential I/O much faster than random I/O
Database Systems try to minimize the overhead of moving data from and to disk
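To make the access-time formula concrete, here is a small worked sketch comparing random and sequential reads. All disk parameters (9 ms seek, 7200 RPM, 100 MB/s, 4 KB blocks) are illustrative assumptions, not values from the slides.

```python
# Illustrative disk parameters (assumptions, not taken from the slides).
SEEK_MS = 9.0               # average seek time
RPM = 7200                  # rotational speed
TRANSFER_MB_PER_S = 100.0   # sustained transfer rate
BLOCK_KB = 4                # block (page) size

def rotation_ms(rpm):
    """Average rotational delay: half a revolution, in milliseconds."""
    return 0.5 * 60_000 / rpm

def transfer_ms(block_kb, mb_per_s):
    """Time to transfer one block, in milliseconds."""
    return block_kb / 1024 / mb_per_s * 1000

def random_read_ms(n_blocks):
    """Random I/O: every block pays seek + rotation + transfer."""
    per_block = SEEK_MS + rotation_ms(RPM) + transfer_ms(BLOCK_KB, TRANSFER_MB_PER_S)
    return n_blocks * per_block

def sequential_read_ms(n_blocks):
    """Sequential I/O: one seek and one rotational delay, then blocks stream."""
    return SEEK_MS + rotation_ms(RPM) + n_blocks * transfer_ms(BLOCK_KB, TRANSFER_MB_PER_S)
```

Under these assumed numbers, the per-block cost of a random read is dominated by seek and rotation, which is why sequential I/O comes out so much cheaper.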

Review (Cont.)
Methods to improve transfers between MM and disk
1. Improve the disk technology
   a. Use faster disks (higher RPM)
   b. Parallelization (RAID) + some redundancy
2. Avoid unnecessary reads from disk
   a. Buffer management: go to the buffer (MM) instead of the disk
   b. Good file organization
3. Other (OS-based improvements): disk scheduling (elevator algorithm, batched writes, etc.)

Buffer Management
Keep pages in a part of memory (the buffer) and read directly from there
What happens if you need to bring a new page into the buffer and the buffer is full? You have to evict one page
Replacement policies:
   LRU: Least Recently Used (CLOCK is a common approximation)
   MRU: Most Recently Used
   Toss-immediate: remove a page as soon as you know that you will not need it again
Pinning: a pinned page may not be evicted (needed in recovery, index-based processing, etc.)
Other DB-specific replacement policies: DBMIN, LRU-k, 2Q
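As a minimal sketch of LRU replacement with pinning, the following class keeps pages in an ordered dictionary. The class name `BufferPool` and the `read_block` callback are hypothetical, introduced only for illustration; a real DBMS buffer manager also tracks dirty pages and pin counts.

```python
from collections import OrderedDict

class BufferPool:
    """Tiny LRU buffer pool; `read_block` is a hypothetical disk-read callback."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block
        self.frames = OrderedDict()   # page_id -> data, least recently used first
        self.pinned = set()           # pages that must not be evicted

    def get(self, page_id):
        if page_id in self.frames:             # hit: no disk access
            self.frames.move_to_end(page_id)   # mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:  # miss on a full buffer
            self._evict()
        data = self.read_block(page_id)
        self.frames[page_id] = data
        return data

    def pin(self, page_id):
        self.pinned.add(page_id)

    def unpin(self, page_id):
        self.pinned.discard(page_id)

    def _evict(self):
        # Walk from least to most recently used, skipping pinned pages.
        for victim in self.frames:
            if victim not in self.pinned:
                del self.frames[victim]
                return
        raise RuntimeError("all pages are pinned")
```

Swapping the eviction order (most recently used first) would turn the same structure into MRU.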

File Organization
Basics: a database is a collection of files, a file is a collection of records, and a record (tuple) is a collection of fields (attributes)
Files are stored on disks (which use blocks to read and write)
Two important issues:
1. Representation of each record
2. Grouping/Ordering of records and storage in blocks

File Organization (Cont.)
Goal and considerations:
Compactness
Overhead of insertion/deletion
Retrieval speed: sometimes we prefer to bring more tuples than necessary into MM and use the CPU to filter out the unnecessary ones!

Record Representation
Fixed-Length Records
Example: Account(acc-number char(10), branch-name char(20), balance real)
Each record is 38 bytes.
Store them sequentially, one after the other
Record 0 at position 0, record 1 at position 38, record 2 at position 76, etc.
Compactness (350 bytes)
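The offset arithmetic for the Account example can be sketched with Python's `struct` module. The assumption here is that `real` occupies 8 bytes, which is what makes the record 38 bytes; the helper names are hypothetical.

```python
import struct

# Account(acc-number char(10), branch-name char(20), balance real).
# Assumption: `real` is an 8-byte float, which gives the slide's 38 bytes.
RECORD = struct.Struct("<10s20sd")   # "<" disables alignment padding

def write_record(buf, i, acc_number, branch_name, balance):
    """Store record i (0-based) at byte offset i * record size."""
    RECORD.pack_into(buf, i * RECORD.size,
                     acc_number.encode(), branch_name.encode(), balance)

def read_record(buf, i):
    """Read record i back; its position is computed, not searched for."""
    acc, branch, balance = RECORD.unpack_from(buf, i * RECORD.size)
    # struct pads short char(n) values with NUL bytes; strip them.
    return acc.rstrip(b"\0").decode(), branch.rstrip(b"\0").decode(), balance

buf = bytearray(3 * RECORD.size)
write_record(buf, 0, "A-101", "Downtown", 500.0)
write_record(buf, 2, "A-215", "Mianus", 700.0)
```

Here `read_record(buf, 2)` reads from byte 76, matching the positions listed above.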

Fixed-Length Records
Simple approach: store record i starting from byte n × (i – 1), where n is the size of each record.
Record access is simple, but records may cross blocks
   Modification: do not allow records to cross block boundaries
Insertion of record i: add at the end
Deletion of record i: two alternatives (below, n denotes the last record in the file)
   move records, either
      1. records i + 1, ..., n to i, ..., n – 1, or
      2. record n to i
   do not move records, but link all free records on a free list

Free Lists
2nd approach: fixed-length records (FLR) with free lists
Store the address of the first deleted record in the file header.
Use this first record to store the address of the second deleted record, and so on
Can think of these stored addresses as pointers since they "point" to the location of a record.
More space-efficient representation: reuse the space for normal attributes of free records to store pointers. (No pointers are stored in in-use records.)
Better handling of insertions/deletions
Less compact: 420 bytes
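A minimal in-memory sketch of the free-list idea follows. The class `FixedLengthFile` and its fields are hypothetical; the `in_use` list exists only so the sketch can tell records from pointers, whereas a real file would use a flag or the header chain itself.

```python
class FixedLengthFile:
    """Fixed-length record slots with a free list threaded through deleted slots."""

    END = -1   # marks the end of the free list

    def __init__(self):
        self.slots = []            # in-use slot: record; free slot: next free index
        self.in_use = []           # bookkeeping for this sketch only
        self.free_head = self.END  # "file header": address of first deleted record

    def insert(self, record):
        if self.free_head == self.END:      # no hole: append at the end
            self.slots.append(record)
            self.in_use.append(True)
            return len(self.slots) - 1
        i = self.free_head                  # reuse the first hole
        self.free_head = self.slots[i]      # header now points to the next hole
        self.slots[i] = record
        self.in_use[i] = True
        return i

    def delete(self, i):
        if not self.in_use[i]:
            raise KeyError(i)
        self.slots[i] = self.free_head      # reuse the record's space as a pointer
        self.in_use[i] = False
        self.free_head = i
```

Neither operation moves any other record, which is the point of the approach.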

Variable-Length Records
3rd approach: variable-length records (byte string representation)
Variable-length records arise in database systems in several ways:
   Storage of multiple record types in a file.
   Record types that allow variable lengths for one or more fields.
   Record types that allow repeating fields or multivalued attributes.
Byte string representation
   Attach an end-of-record (⊥) control character to the end of each record
   Difficulty with deletion (leaves holes)
   Difficulty with growth
[Figure: record layout with a field count (4) followed by fields R1, R2, R3]

Variable-Length Records: Slotted Page Structure
4th approach: VLR-SP
Slotted page header contains:
   number of record entries
   end of free space in the block
   location and size of each record
Records stored at the bottom of the page
External tuple pointers point to record ptrs: rec-id = <page-id, slot#>
[Figure: page i with records Rid = (i,1), Rid = (i,2), ..., Rid = (i,N); at the end of the page, a slot directory with N entries, the number of slots, and a pointer to the start of free space]
Insertion:
   1) Use the free-space pointer (FP) to find space and insert
   2) Find an available pointer in the directory (or create a new one)
   3) Adjust FP and the number of records
Deletion?
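The three insertion steps can be sketched as follows. This is a simplified model, not a real page layout: the slot directory is a Python list rather than bytes inside the page, and the `delete` method shows just one possible answer to the deletion question (empty the slot, compact later). The class name `SlottedPage` is hypothetical.

```python
class SlottedPage:
    """Sketch of a slotted page; records grow from the end toward the front."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_end = size   # FP: records occupy data[free_end:]
        self.slots = []        # slot directory: (offset, length), or None if empty

    def insert(self, record: bytes) -> int:
        # A real page would also charge the directory entry against free space.
        if len(record) > self.free_end:
            raise ValueError("page full")
        # 1) use FP to find space and copy the record in
        start = self.free_end - len(record)
        self.data[start:self.free_end] = record
        # 2) reuse an empty directory entry, or create a new one
        try:
            slot = self.slots.index(None)
            self.slots[slot] = (start, len(record))
        except ValueError:
            self.slots.append((start, len(record)))
            slot = len(self.slots) - 1
        # 3) adjust FP (the number of slots is len(self.slots))
        self.free_end = start
        return slot

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        # Empty the slot so that the rids of all other records stay valid.
        # The freed bytes are only reclaimed by a later compaction (not shown).
        self.slots[slot] = None
```

Because external pointers hold <page-id, slot#> rather than byte offsets, records could be moved within the page during compaction without invalidating any rid.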

Variable-Length Records (Cont.)
Fixed-length representation of variable-length records:
   reserved space
   pointers
5th approach: Fixed Limit Records (for VLR)
Reserved space: can use fixed-length records of a known maximum length; unused space in shorter records is filled with a null or end-of-record symbol.

Pointer Method
6th approach: pointer method
A variable-length record is represented by a list of fixed-length records, chained together via pointers.
Can be used even if the maximum record length is not known

Pointer Method (Cont.)
Disadvantage of the pointer structure: space is wasted in all records except the first in a chain.
Solution is to allow two kinds of block in a file:
   Anchor block: contains the first records of chains
   Overflow block: contains records other than those that are the first records of chains.

Ordering and Grouping records
Issue #1: In what order do we place records in a block?
1. Heap technique: assign anywhere there is space
2. Ordered technique: maintain an order on some attribute, so we can use binary search when the selection is on this attribute.

Sequential File Organization
Suitable for applications that require sequential processing of the entire file
The records in the file are ordered by a search-key

Sequential File Organization (Cont.)
Deletion: use pointer chains
Insertion: locate the position where the record is to be inserted
   if there is free space, insert there
   if no free space, insert the record in an overflow block
   in either case, the pointer chain must be updated
Need to reorganize the file from time to time to restore sequential order

Clustering File Organization
Simple file structure stores each relation in a separate file
Can instead store several relations in one file using a clustering file organization
E.g., clustering organization of customer and depositor:
   good for queries involving depositor ⋈ customer, and for queries involving one single customer and his accounts
   bad for queries involving only customer
   results in variable-size records
SELECT account-number, customer-name
FROM depositor d, account a
WHERE d.customer-name = a.customer-name

File organization
Issue #2: In which blocks should records be placed?
Many alternatives exist, each ideal for some situations and not so good in others:
   Heap files: add at the end of the file. Suitable when typical access is a file scan retrieving all records.
   Sorted files: keep the pages ordered. Best if records must be retrieved in some order, or only a "range" of records is needed.
   Hashed files: good for equality selections. Assign records to blocks according to their value for some attribute
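As a rough illustration of the hashed-file idea, the sketch below assigns records to blocks by hashing one attribute. The number of blocks and the use of CRC32 as the hash function are assumptions made for the example; overflow handling is omitted.

```python
import zlib

N_BLOCKS = 4   # illustrative number of blocks (buckets)

def block_for(key: str) -> int:
    """Map a search-key value to a block number with a stable hash."""
    return zlib.crc32(key.encode()) % N_BLOCKS

blocks = [[] for _ in range(N_BLOCKS)]
for record in [("Downtown", "A-101"), ("Mianus", "A-215"), ("Downtown", "A-110")]:
    blocks[block_for(record[0])].append(record)

def lookup(bname: str):
    """Equality selection: read only the one block the key hashes to."""
    return [r for r in blocks[block_for(bname)] if r[0] == bname]
```

A range query gets no help from this layout, since neighbouring key values hash to unrelated blocks.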

Data Dictionary Storage
Data dictionary (also called system catalog) stores metadata: that is, data about data, such as
Information about relations
   names of relations
   names and types of attributes of each relation
   names and definitions of views
   integrity constraints
User and accounting information, including passwords
Statistical and descriptive data
   number of tuples in each relation
Physical file organization information
   How relation is stored (sequential/hash/…)
   Physical location of relation
      operating system file name or disk addresses of blocks containing records of the relation
Information about indices (Chapter 12)

Data dictionary storage (Cont.)
Stored as tables!!
E-R diagram? Relations, attributes, domains
Each relation has a name and some attributes
Each attribute has a name, length and domain
Also: views, integrity constraints, indices
User info (authorizations, etc.)
Statistics
[Figure: E-R diagram. Entity "relation" (attribute: name) is connected by a 1:N relationship "has" to entity "attribute" (attributes: A-name, domain, position)]

Data Dictionary Storage (Cont.)
A possible catalog representation:
Relation-metadata = (relation-name, number-of-attributes, storage-organization, location)
Attribute-metadata = (attribute-name, relation-name, domain-type, position, length)
User-metadata = (user-name, encrypted-password, group)
Index-metadata = (index-name, relation-name, index-type, index-attributes)
View-metadata = (view-name, definition)
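Since the catalog is itself stored as tables, two of the relations above can be sketched with Python's built-in `sqlite3` module. Column names are adapted to underscores, the column types are assumptions, and the file name `account.dat` is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relation_metadata (
    relation_name TEXT PRIMARY KEY,
    number_of_attributes INTEGER,
    storage_organization TEXT,
    location TEXT
);
CREATE TABLE attribute_metadata (
    attribute_name TEXT,
    relation_name TEXT REFERENCES relation_metadata(relation_name),
    domain_type TEXT,
    position INTEGER,
    length INTEGER,
    PRIMARY KEY (attribute_name, relation_name)
);
""")

# Describe the Account relation from the fixed-length record example.
conn.execute("INSERT INTO relation_metadata VALUES (?, ?, ?, ?)",
             ("account", 3, "sequential", "account.dat"))  # made-up file name
conn.executemany("INSERT INTO attribute_metadata VALUES (?, ?, ?, ?, ?)", [
    ("acc_number", "account", "char", 1, 10),
    ("branch_name", "account", "char", 2, 20),
    ("balance", "account", "real", 3, 8),   # assumes an 8-byte real
])

def record_length(relation):
    """Derive a relation's record size from the catalog alone."""
    row = conn.execute(
        "SELECT SUM(length) FROM attribute_metadata WHERE relation_name = ?",
        (relation,)).fetchone()
    return row[0]
```

With these rows, the record size of `account` can be recovered by querying the catalog rather than inspecting the data file.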

Large Objects
Large objects: binary large objects (blobs) and character large objects (clobs)
Examples include:
   text documents
   graphical data such as images and computer-aided designs
   audio and video data
Large objects may need to be stored in a contiguous sequence of bytes when brought into memory.
   If an object is bigger than a page, contiguous pages of the buffer pool must be allocated to store it.
   May be preferable to disallow direct access to data, and only allow access through a file-system-like API, to remove the need for contiguous storage.

Modifying Large Objects
If the application requires insert/delete of bytes from specified regions of an object:
   B+-tree file organization (described later in Chapter 12) can be modified to represent large objects
   Each leaf page of the tree stores between half a page and 1 page worth of data from the object
Special-purpose application programs outside the database are used to manipulate large objects:
   Text data is treated as a byte string manipulated by editors and formatters.
   Graphical data and audio/video data are typically created and displayed by separate applications
   Checkout/checkin method for concurrency control and creation of versions

Data organization and retrieval
File organization can improve data retrieval time
Select *
From depositors
Where bname = "Downtown"
File: 100 blocks, 200 recs/block; answer: 150 recs
Heap file:
   Mianus    A-215
   Perry     A-218
   Downtown  A-101
   ...
Ordered file:
   Brighton  A-217
   Downtown  A-101
   Downtown  A-110
   ...
Searching a heap: must search all blocks (100 blocks)
OR
Searching an ordered file:
   1. Binary search for the 1st tuple in the answer: ⌈log2 100⌉ = 7 block accesses
   2. Scan the blocks with the answer: no more than 2
   Total <= 9 block accesses
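The block-access counts above can be reproduced with a short calculation. The function names are hypothetical, and the ordered-file figure is an upper bound that assumes the answer records are stored contiguously.

```python
import math

def heap_cost(n_blocks):
    """A heap file must be scanned in full."""
    return n_blocks

def ordered_cost(n_blocks, recs_per_block, answer_recs):
    """Binary search for the first match, then scan the blocks holding the answer."""
    search = math.ceil(math.log2(n_blocks))
    # k contiguous records can straddle at most ceil((k - 1) / B) + 1 blocks.
    scan = math.ceil((answer_recs - 1) / recs_per_block) + 1
    return search + scan
```

For the slide's numbers, `ordered_cost(100, 200, 150)` gives 7 + 2 = 9 block accesses against 100 for the heap.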

Data organization and retrieval (Cont.)
But... a file can only be ordered on one search key:
Ordered file (bname):
   Brighton  A-217
   Downtown  A-101
   Downtown  A-110
   ...
Ex. Select * From depositors Where acct_no = "A-110"
Requires a linear scan (100 block accesses)
Solution: Indexes! Auxiliary data structures over relations that can improve the search time

A simple index
Data file (depositors):
   Brighton  A-217  700
   Downtown  A-101  500
   Downtown  A-110  600
   Mianus    A-215  700
   Perry     A-102  400
   ...
Index file (index of depositors on acct_no):
   A-101
   A-102
   A-110
   A-215
   A-217
   ...
Index records: <search key value, pointer (block, offset or slot#)>
To answer a query for "acct_no = A-110" we:
   1. Do a binary search on the index file, searching for A-110
   2. "Chase" the pointer of the index record
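The two lookup steps can be sketched with the `bisect` module, using the rows shown on this slide. As a simplifying assumption, a row's position in the list stands in for the (block, offset) pointer.

```python
from bisect import bisect_left

# Data file ordered on bname; each row's position stands in for (block, offset).
data_file = [
    ("Brighton", "A-217", 700),
    ("Downtown", "A-101", 500),
    ("Downtown", "A-110", 600),
    ("Mianus",   "A-215", 700),
    ("Perry",    "A-102", 400),
]

# Index file: <search key value, pointer>, sorted on acct_no.
index = sorted((acct_no, pos) for pos, (_, acct_no, _) in enumerate(data_file))
keys = [k for k, _ in index]

def lookup(acct_no):
    i = bisect_left(keys, acct_no)            # 1. binary search on the index
    if i == len(keys) or keys[i] != acct_no:
        return None                           # no such search key value
    return data_file[index[i][1]]             # 2. "chase" the pointer
```

The data file stays ordered on bname; only the much smaller index is ordered on acct_no.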

Index Choices
1. Primary: index search key = physical order search key
vs Secondary: all other indexes
Q: how many primary indices per relation?
2. Dense: index entry for every search key value
vs Sparse: some search key values not in the index
3. Single level vs Multilevel (index on the indices)

Measuring 'goodness'
On what basis do we compare different indices?
1. Access type: what type of queries can be answered: selection queries (ssn = 123)? range queries (100 <= ssn <= 200)?
2. Access time: what is the cost of evaluating queries, measured in # of block accesses (BAs)
3. Maintenance overhead: cost of insertion/deletion? (also in BAs)
4. Space overhead: in # of blocks needed to store the index

Indexing
Primary (or clustering) index on SSN
STUDENT
   Ssn  Name   Address
   123  smith  main str
   234  jones  forbes ave
   345  smith  forbes ave
   …    …      …
Index entries: 123, 234, 345, 456, 567

Indexing
Secondary (or non-clustering) index: duplicates may exist
STUDENT
   Ssn  Name     Address
   123  smith    main str
   234  jones    forbes ave
   345  tomson   main str
   456  stevens  forbes ave
   567  smith    forbes ave
Address-index entries: forbes ave, forbes ave, forbes ave, main str, main str
• Can have many secondary indices
• but only one primary index

Indexing
Secondary index: typically, with 'postings lists'
STUDENT
   Ssn  Name     Address
   123  smith    main str
   234  jones    forbes ave
   345  tomson   main str
   456  stevens  forbes ave
   567  smith    forbes ave
Address-index entries: forbes ave, main str (each entry points to a postings list)
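A secondary index with postings lists can be sketched as a mapping from each distinct address to a list of record pointers. List positions stand in for record pointers here, which is an assumption of the sketch.

```python
from collections import defaultdict

student = [  # (ssn, name, address); list position stands in for a record pointer
    (123, "smith", "main str"),
    (234, "jones", "forbes ave"),
    (345, "tomson", "main str"),
    (456, "stevens", "forbes ave"),
    (567, "smith", "forbes ave"),
]

# One index entry per distinct address, pointing to a postings list of pointers.
address_index = defaultdict(list)
for pos, (_, _, address) in enumerate(student):
    address_index[address].append(pos)

def find_by_address(address):
    """Follow every pointer in the postings list for this address."""
    return [student[pos] for pos in address_index.get(address, [])]
```

The index holds two entries instead of five, at the price of one extra hop through the postings list on every lookup.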

Indexing
Primary/sparse index on ssn (primary key)
STUDENT
   Ssn  Name     Address
   123  smith    main str
   234  jones    forbes ave
   345  tomson   main str
   456  stevens  forbes ave
   567  smith    forbes ave
Index entries: 123 (covers ssn >= 123), 456 (covers ssn >= 456), …

Indexing
Secondary / dense index
   Ssn  Name     Address
   345  tomson   main str
   234  jones    forbes ave
   123  smith    main str
   567  smith    forbes ave
   456  stevens  forbes ave
Index entries: 123, 234, 345, 456, 567
Secondary on a candidate key: no duplicates, no need for posting lists

Primary vs Secondary
1. Access type:
   Primary: SELECTION, RANGE
   Secondary: SELECTION, RANGE, but index must point to posting lists
2. Access time: primary faster than secondary for range (no list access, all results clustered together)
3. Maintenance overhead: primary has greater overhead (must alter index + file)
4. Space overhead: secondary has more (posting lists)

Dense vs Sparse
1. Access type: both: Selection, range (if primary)
2. Access time: Dense: requires lookup for 1st result
Sparse: requires lookup + scan for first result
3. Maintenance Overhead: Dense: Must change index entries
Sparse: may not have to change index entries
4. Space Overhead: Dense: 1 entry per search key value
Sparse: < 1 entry per search key value
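The difference in access path can be sketched as follows. The block size of two records is an illustrative assumption, and a Python dict stands in for the dense index.

```python
from bisect import bisect_right

# File sorted on ssn, two records per block (block size is illustrative).
blocks = [
    [(123, "smith"), (234, "jones")],
    [(345, "tomson"), (456, "stevens")],
    [(567, "smith")],
]

# Dense: one entry per search key value. Sparse: first key of each block only.
dense = {ssn: (b, i) for b, blk in enumerate(blocks) for i, (ssn, _) in enumerate(blk)}
sparse_keys = [blk[0][0] for blk in blocks]

def dense_lookup(ssn):
    """Dense: the index entry points straight at the record."""
    if ssn not in dense:
        return None
    b, i = dense[ssn]
    return blocks[b][i]

def sparse_lookup(ssn):
    """Sparse: find the last entry <= ssn, then scan that block."""
    b = bisect_right(sparse_keys, ssn) - 1
    if b < 0:
        return None
    return next((rec for rec in blocks[b] if rec[0] == ssn), None)
```

The sparse index has 3 entries against 5 for the dense one, and it only works because the file is sorted on ssn.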

Summary
             Dense   Sparse
Primary      rare    usual
Secondary    usual
• All combinations are possible
• at most one sparse/clustering index
• as many dense indices as desired
• usually: one primary index (probably sparse) and a few secondary indices (non-clustering)