Appendix C File Organization & Storage Structure.
Uploaded by benedict-franklin
Agenda
• Definition
• Types of File Organization
Definition
• Logical record & physical record
• File organization
• Access method
Types of File Organization
• Heap (unordered)
• Sequential (ordered or sorted)
• Hash (direct or random)
• Index
Heap
• Unordered structure
• Pros
– Simple
– No overhead
• Cons
– Slow
– Waste space (deletion)
• For
– Bulk-loaded
– Short file
– Retrieving 80% of the file
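The heap trade-offs above can be illustrated with a small Python sketch. The class `HeapFile` and its method names are hypothetical, chosen only for this example, and the file is modeled as an in-memory list of slots rather than disk pages.

```python
class HeapFile:
    """Toy heap (unordered) file: records are kept in arrival order."""

    def __init__(self):
        # Each slot holds a (key, data) record, or None once deleted.
        self.slots = []

    def insert(self, key, data):
        # Simple, no overhead: append at the end, no ordering to maintain.
        self.slots.append((key, data))

    def find(self, key):
        # Slow: with no ordering, every lookup is a linear scan.
        for position, record in enumerate(self.slots):
            if record is not None and record[0] == key:
                return position, record[1]
        return None

    def delete(self, key):
        # Deletion only marks the slot as free; the space stays wasted
        # until the file is reorganized.
        found = self.find(key)
        if found is None:
            return False
        self.slots[found[0]] = None
        return True

    def wasted_slots(self):
        return sum(1 for record in self.slots if record is None)


heap = HeapFile()
for key in (42, 7, 19):
    heap.insert(key, f"record-{key}")
heap.delete(7)
print(heap.find(19))        # (2, 'record-19')
print(heap.wasted_slots())  # 1
```

The deleted slot is never reused in this sketch, which is one simple way to picture the "waste space" con.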
Ordered
• Sorted according to a field value or primary key field
• Pros
– Binary search
– Sequential processing
• Con
– Slow for retrieving information needed by management
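A minimal Python sketch of the ordered-file trade-off follows. The sample records and the function names `find_by_key` and `find_by_name` are made up for illustration. The second function reflects one reading of the con above, namely that queries on a field other than the ordering field cannot use the sort order; that reading is an assumption.

```python
from bisect import bisect_left

# Hypothetical ordered file: records sorted by the ordering field (an id).
records = [(3, "Ann"), (8, "Bob"), (15, "Cid"), (23, "Dee"), (42, "Eve")]
keys = [key for key, _ in records]


def find_by_key(key):
    """Binary search on the ordering field: about log2(N) probes."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None


def find_by_name(name):
    """Search on a non-ordering field: the sort order does not help,
    so the whole file is scanned."""
    for record in records:
        if record[1] == name:
            return record
    return None


print(find_by_key(15))      # (15, 'Cid')
print(find_by_name("Dee"))  # (23, 'Dee')
```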
Hash
• Terminology
– Hash field, hash key
– Collision, synonyms
– Bucket, slots
• Types
– Folding
– Division-remainder
• Collision handling
– Open addressing or unchained overflow
– Chained overflow
– Multiple hashing
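The two hash function types and chained overflow can be sketched in Python as below. This is only one possible reading: the two-digit chunking used for folding, the bucket count of 7, the two slots per bucket, and the class name `ChainedHashFile` are all assumptions made for the example.

```python
def division_remainder(key, num_buckets):
    """Division-remainder: bucket address = key mod number of buckets."""
    return key % num_buckets


def folding(key, num_buckets, chunk_digits=2):
    """Folding: split the key's digits into chunks, add the chunks,
    then reduce the sum to a bucket address."""
    digits = str(key)
    total = sum(int(digits[i:i + chunk_digits])
                for i in range(0, len(digits), chunk_digits))
    return total % num_buckets


class ChainedHashFile:
    """Buckets with a fixed number of slots; when a bucket is full,
    further synonyms go to that bucket's overflow chain."""

    def __init__(self, num_buckets=7, slots_per_bucket=2):
        self.num_buckets = num_buckets
        self.slots_per_bucket = slots_per_bucket
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]

    def insert(self, key, data):
        address = division_remainder(key, self.num_buckets)
        if len(self.buckets[address]) < self.slots_per_bucket:
            self.buckets[address].append((key, data))
        else:
            self.overflow[address].append((key, data))
        return address

    def find(self, key):
        address = division_remainder(key, self.num_buckets)
        # Check the bucket's slots first, then its overflow chain.
        for k, data in self.buckets[address] + self.overflow[address]:
            if k == key:
                return data
        return None


print(division_remainder(1234, 7))  # 2
print(folding(123456, 7))           # 12 + 34 + 56 = 102, and 102 % 7 = 4

hash_file = ChainedHashFile()
for key in (10, 17, 24):            # synonyms: all three hash to bucket 3
    hash_file.insert(key, f"record-{key}")
print(hash_file.overflow[3])        # [(24, 'record-24')]
```

Open addressing would instead place the third synonym in the next free slot of another bucket; it is not shown here.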
Direct (Random or Hash)
• Pro
– Random processing
• Cons
– Sequential processing
– Updating (reorganization)
Indexes
• Terminology
– Primary index (one for each file)
– Secondary index for unique field or non-unique field (several for each file)
– Clustering index for clustering attribute (non-key field or non-unique field)
– Sparse index for some of the search key values
– Dense index for every search key value
• Types
– Linked list
– Inverted file
– Indexed sequential
– B+-tree
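The difference between a dense and a sparse index can be sketched in Python as follows. The three-block data file and the names `lookup_dense` and `lookup_sparse` are hypothetical; the sparse index here is assumed to hold the first key of each block, which is one common arrangement rather than the only one.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of records, sorted by search key.
blocks = [
    [(5, "a"), (9, "b"), (12, "c")],
    [(17, "d"), (21, "e"), (26, "f")],
    [(30, "g"), (34, "h"), (41, "i")],
]

# Dense index: one entry for every search key value -> (block, offset).
dense = {key: (b, i)
         for b, block in enumerate(blocks)
         for i, (key, _) in enumerate(block)}

# Sparse index: one entry per block, holding the first key in that block.
sparse_keys = [block[0][0] for block in blocks]


def lookup_dense(key):
    location = dense.get(key)
    if location is None:
        return None
    b, i = location
    return blocks[b][i][1]


def lookup_sparse(key):
    # Find the last index entry whose key is <= the search key,
    # then scan inside that block.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for k, data in blocks[b]:
        if k == key:
            return data
    return None


print(len(dense), len(sparse_keys))  # 9 3
print(lookup_dense(21))              # e
print(lookup_sparse(21))             # e
```

The sparse index is smaller but only works because the file is sorted on the search key; the dense index has no such requirement.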
Indexed Sequential
• Structure
– Prime area
– Index area: track no, highest key on the track, highest key in the overflow, address of first overflow record
– Overflow area: address, record, pointer
• Types
– Indexed Sequential Access Method (ISAM)
– Virtual Storage Access Method (VSAM)
• Pro
– Sequential & random processing
• Cons
– Wasted space (deletion)
– Inefficient due to overflow
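The three areas listed above can be sketched in Python as below. All data values and the function name `find` are invented for illustration. The sketch assumes that overflow records for a track have keys above the highest key on that track, and that a track without overflow repeats its highest track key in the overflow column; both are assumptions about the layout, not details taken from the slides.

```python
# Prime area: tracks of records sorted by key (hypothetical data).
prime = {
    0: [(10, "a"), (20, "b"), (30, "c")],
    1: [(40, "d"), (50, "e"), (60, "f")],
}

# Overflow area: address -> (record, pointer to next overflow record or None).
overflow = {
    100: ((33, "x"), 101),
    101: ((36, "y"), None),
}

# Index area: (track no, highest key on the track,
#              highest key in the overflow, address of first overflow record).
index = [
    (0, 30, 36, 100),
    (1, 60, 60, None),
]


def find(key):
    for track, high_track, high_overflow, first_overflow in index:
        if key <= high_track:
            # The key belongs on the prime track itself.
            for k, data in prime[track]:
                if k == key:
                    return data
            return None
        if key <= high_overflow:
            # The key falls in this track's overflow chain:
            # follow the pointers one record at a time.
            address = first_overflow
            while address is not None:
                (k, data), address = overflow[address]
                if k == key:
                    return data
            return None
    return None


print(find(50))  # e  (found on prime track 1)
print(find(36))  # y  (found after walking two overflow records)
```

Each extra pointer followed in the overflow chain is an extra access, which is one way to picture the "inefficient due to overflow" con.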
B+-Tree
• Terminology
– Node
– Root
– Parent
– Child
– Leaf
– Depth: the maximum number of levels
– Balanced tree
– Degree or order (n): the maximum number of children
• Rules
– Root has at least two children (unless it is a leaf)
– Each non-leaf node other than the root has between n/2 (rounded up) and n pointers (children)
– Number of key values in a leaf is between (n-1)/2 (rounded up) and n-1
– Number of key values in a non-leaf node is 1 less than its number of pointers
– Balanced tree: all leaves are at the same level
– Ordered values in leaf
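The occupancy rules above can be checked with a short Python sketch. The function names `bplus_bounds`, `internal_node_ok`, and `leaf_ok` are hypothetical, and rounding n/2 and (n-1)/2 up is an assumption that follows the usual textbook statement of these rules. The special case of a root that is itself a leaf is not handled.

```python
from math import ceil


def bplus_bounds(n):
    """Occupancy limits for a B+-tree of order n (at most n pointers per node)."""
    return {
        "internal_pointers": (ceil(n / 2), n),
        "leaf_keys": (ceil((n - 1) / 2), n - 1),
        "root_min_children": 2,
    }


def internal_node_ok(keys, num_children, n, is_root=False):
    """Check a non-leaf node: pointer count in range, keys sorted,
    and one key fewer than pointers."""
    bounds = bplus_bounds(n)
    low, high = bounds["internal_pointers"]
    if is_root:
        low = bounds["root_min_children"]
    return (low <= num_children <= high
            and len(keys) == num_children - 1
            and keys == sorted(keys))


def leaf_ok(keys, n):
    """Check a leaf: key count in range and keys in order."""
    low, high = bplus_bounds(n)["leaf_keys"]
    return low <= len(keys) <= high and keys == sorted(keys)


print(bplus_bounds(4)["internal_pointers"])  # (2, 4)
print(bplus_bounds(4)["leaf_keys"])          # (2, 3)
print(internal_node_ok([10, 20], 3, 4))      # True
print(leaf_ok([5], 4))                       # False
```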
Points to Remember
• Definition
• Types of File Organization
Assignment
• Review chapter 1 & appendix C
• Read chapter 2
• Group list due date: 9/18/07
• Homework due date: