File System Implementation CSCI 444/544 Operating Systems Fall 2008.


File System Implementation

CSCI 444/544 Operating Systems

Fall 2008


Agenda

• File system layout

• Free-space management

• Directory implementation

• File caching


File System Layout

Overall question: how to organize files on disk?

• This is really just a data structure issue
  – What data structure is the right one to use to store a file on disk?
  – Usage patterns matter

• Many issues in OS boil down to data structures and algorithms
  – Disk scheduling is similar to the traveling salesman problem


File System Usage Patterns

• 80% of file accesses are reads

• Most programs that use a file sequentially access the whole file
  – Spatial locality
  – Pre-fetching

• Most files are small, although most bytes are taken up by large files


File Allocation

How do we lay out the blocks of a file on disk?

Many different approaches
• Contiguous
• Linked list
• Indexed

Implications
• Large files should be allocated sequentially
• Files in same directory should be allocated near each other
• Data should be allocated near its meta-data


Design Metrics

• Amount of fragmentation (internal and external)?
• Ability to grow file over time?
• Seek cost for sequential accesses?
• Speed to find data blocks for random accesses?
• Wasted space for pointers to data blocks?


Contiguous Allocation

Allocate each file to contiguous blocks on disk
• Meta-data: Starting block and size of file
• OS allocates by finding sufficient free space
• Example: IBM OS/360

Advantages
• Little overhead for meta-data
• Excellent performance for sequential accesses
• Simple to calculate random addresses

Drawbacks
• Horrible external fragmentation (requires periodic compaction)
• May not be able to grow file without moving it

A A A B B B B C C C
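The "simple to calculate random addresses" point can be sketched in a few lines. This is only an illustration: the 1 KB block size and the function name `contiguous_locate` are assumptions, not part of the slides.

```python
BLOCK_SIZE = 1024  # bytes per block (assumed for illustration)

def contiguous_locate(start_block, size_bytes, offset):
    """Map a byte offset in a contiguously allocated file to
    (disk block number, offset within that block)."""
    if not 0 <= offset < size_bytes:
        raise ValueError("offset outside file")
    # One division finds the block; no pointers need to be followed.
    return start_block + offset // BLOCK_SIZE, offset % BLOCK_SIZE
```

The meta-data needed is exactly what the slide lists: the starting block and the file size.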


Contiguous Allocation of Disk Space


Extent-Based Allocation

Allocate multiple contiguous regions (extents) per file
• Meta-data: Small array (2-6) designating each extent
  – Each entry: starting block and size

Improves contiguous allocation
• File can grow over time (until run out of extents)
• Helps with external fragmentation

Advantages
• Limited overhead for meta-data
• Very good performance for sequential accesses
• Simple to calculate random addresses

Disadvantages (Small number of extents):
• External fragmentation can still be a problem
• Not able to grow file when run out of extents

D A A A B B B B C C C B B D D
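A minimal sketch of the address calculation with extents, assuming each extent is stored as a (starting block, length in blocks) pair in file order. The function name `extent_locate` and the example extents are hypothetical.

```python
def extent_locate(extents, logical_block):
    """Return the disk block for a logical block number.

    extents: list of (start_block, length_in_blocks), in file order.
    """
    if logical_block < 0:
        raise IndexError("negative logical block")
    remaining = logical_block
    for start, length in extents:
        if remaining < length:
            return start + remaining
        remaining -= length  # skip past this extent
    raise IndexError("logical block beyond end of file")

# Hypothetical file with two extents: blocks 4-7 and blocks 11-12.
file_b = [(4, 4), (11, 2)]
```

Because the array holds only a handful of entries, the scan is effectively constant time.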


Linked Allocation

Allocate linked-list of fixed-sized blocks
• Meta-data: Location of first block of file
  – Each block also contains pointer (first word) to next block

Advantages
• No external fragmentation
• Files can be easily grown, with no limit

Disadvantages
• Random access is extremely slow
• Unreliable: what if you lose one block in chain?

D A A A B B B B C C C B B D D D D B


Linked Allocation of Disk Space


File-Allocation Table (FAT)

Variation of linked allocation
• Keep linked-list information for all files in an on-disk file-allocation table (FAT)
• Meta-data: Location of first block of file
  – And the FAT itself

Comparison to Linked Allocation
• Same basic advantages and disadvantages
• Full block size available for data (next-block pointers live in the FAT, not in the data blocks)
• Optimization: FAT must be in main memory
  – Greatly improves random accesses

D A A A B B B B C C C B B D D D D B
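A sketch of how a file's blocks might be found through an in-memory FAT. The table is modeled as a Python list indexed by block number; the `EOF` and `FREE` marker values and the function names are assumptions for illustration, not the on-disk format of any real FAT file system.

```python
EOF = -1    # end-of-chain marker (assumed value)
FREE = -2   # unused-entry marker (assumed value)

def fat_chain(fat, first_block):
    """Yield a file's disk blocks in order by following FAT entries."""
    block = first_block
    visited = set()
    while block != EOF:
        if block in visited or fat[block] == FREE:
            raise ValueError("corrupt FAT chain")
        visited.add(block)
        yield block
        block = fat[block]

def fat_locate(fat, first_block, logical_block):
    """Return the disk block holding the given logical block of the file."""
    for index, block in enumerate(fat_chain(fat, first_block)):
        if index == logical_block:
            return block
    raise IndexError("logical block beyond end of file")

# Hypothetical 16-block disk: one file occupies blocks 4-7, then 11-12.
fat = [FREE] * 16
for current, following in [(4, 5), (5, 6), (6, 7), (7, 11), (11, 12), (12, EOF)]:
    fat[current] = following
```

Random access still walks the chain, but every step is a memory lookup rather than a disk read, which is where the improvement comes from.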


File-Allocation Table


Indexed Allocation

Brings all pointers together into the index block.

Logical view: index table

• each file has an index table (i-node in UNIX)
  – a collection of pointers to file’s blocks
• only need to load index tables (i-nodes) into memory when you open files


Example of Indexed Allocation


Indexed Allocation

Advantages
• No external fragmentation
• Files can be easily grown, with no limit
• Supports random access

Disadvantages
• Large overhead for meta-data
  – Wastes space for unneeded pointers (most files are small!)


Multi-Level Indexed Files

Variation of Indexed Allocation
• Dynamically allocate hierarchy of pointers to blocks as needed
• Meta-data: Small number of pointers allocated statically
  – Additional pointers to blocks of pointers
• Examples: UNIX file systems

[Figure: i-node with single indirect, double indirect, and triple indirect pointers]


Comparison to Indexed Allocation

• Advantage: does not waste space for unneeded pointers
  – Still fast access for small files
  – Can also grow to a very large size

• Disadvantage: need to read indirect blocks of pointers to calculate addresses (extra disk read)
  – Keep indirect blocks cached in main memory
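The extra-read cost can be made concrete with a small sketch that reports how many indirect blocks must be read to reach a given logical block. The defaults (10 direct pointers, 256 pointers per indirect block) and the name `index_level` are assumptions chosen for illustration.

```python
def index_level(logical_block, n_direct=10, ptrs_per_block=256):
    """Return the number of indirect blocks read to find this block:
    0 = direct, 1 = single, 2 = double, 3 = triple indirect."""
    if logical_block < 0:
        raise ValueError("negative logical block")
    limit = n_direct   # first logical block NOT covered so far
    span = 1           # blocks reachable through one pointer at this level
    for level in range(4):
        if logical_block < limit:
            return level
        span *= ptrs_per_block
        limit += span
    raise ValueError("beyond triple indirect range")
```

Small files never leave level 0, which is why access to them stays fast; only large files pay for the extra reads, and caching the indirect blocks hides most of that cost.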


The UNIX I-node


i-nodes

Attributes:
• File type, size
• Owner, group, permissions (r/w/x)
• Times: creation, last access, last modified
• Reference count

Block Addresses
• Direct
• Indirect

File Attributes

Address of block 0

Address of block 1

Address of block N

Single Indirect

Double Indirect

Triple Indirect


i-nodes

Assume: N=10, 1 KB blocks, 4-byte entries (so 256 pointers per block)
• Direct: 10 KB
• Single indirect: 256 KB
• Double indirect: 64 MB
• Triple indirect: 16 GB!
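The figures above follow from simple arithmetic: one indirect block holds block size / entry size pointers, and each extra level multiplies the reach by that count. A short sketch (the function name `inode_capacity` is hypothetical) reproduces them:

```python
def inode_capacity(n_direct=10, block_size=1024, entry_size=4):
    """Bytes addressable through each kind of i-node pointer."""
    ptrs = block_size // entry_size  # pointers per indirect block (256 here)
    return {
        "direct": n_direct * block_size,
        "single indirect": ptrs * block_size,
        "double indirect": ptrs ** 2 * block_size,
        "triple indirect": ptrs ** 3 * block_size,
    }

for name, size in inode_capacity().items():
    print(f"{name}: {size // 1024} KB")
```

With the slide's assumptions this gives 10 KB, 256 KB, 65,536 KB (64 MB), and 16,777,216 KB (16 GB).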



File System Layout

A possible file system layout


File System Layout

Partitions: independent file systems

MBR (Master Boot Record): run when the computer boots; locates the active partition and loads its boot block

Boot block: first block of the partition; its code is executed to load the operating system

Superblock: Info about the file system
• Contains all the key parameters about the file system: # of files, # of blocks, # of free blocks, etc.


Free-space Management

Must keep track of blocks that are free
• Bitmap (bit vector)
• Linked list
• Grouping
  – The first free block stores the addresses of n free blocks
  – The first n-1 of these blocks are actually free
  – The last block contains the addresses of another n free blocks, and so on


Bitmap

Bit vector (n blocks), one bit per block: 0 1 2 ... n-1

bit[i] = 0: block[i] free
bit[i] = 1: block[i] occupied

Block number calculation (first free block)

(number of bits per word) * (number of all-1 words skipped) + offset of first 0 bit
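A sketch of the first-free-block search, using the slide's convention that a 0 bit means free and a 1 bit means occupied, so fully occupied (all-1) words are skipped and the first 0 bit is taken. The 32-bit word size, counting bit offsets from the least significant bit, and the name `first_free_block` are assumptions.

```python
BITS_PER_WORD = 32
FULL = (1 << BITS_PER_WORD) - 1  # a word whose blocks are all occupied

def first_free_block(words):
    """Return the number of the first free block, or None if the disk is full.

    words: list of ints, each holding BITS_PER_WORD bits of the bitmap.
    """
    for index, word in enumerate(words):
        if word != FULL:
            free_bits = ~word & FULL            # free blocks become 1 bits
            lowest = free_bits & -free_bits     # isolate the lowest set bit
            offset = lowest.bit_length() - 1
            return index * BITS_PER_WORD + offset
    return None
```

Checking a whole word at a time is what makes the bitmap search cheap: most of the work is a single comparison per word.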


Linked Free Space List on Disk


Implementing Directories (1)

(a) A simple directory
• fixed size entries
• disk addresses and attributes in directory entry

(b) Directory in which each entry just refers to an i-node
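The difference between the two designs can be sketched as follows. In design (b) the directory only maps a name to an i-node number, and the attributes and block addresses are found in the i-node. The `Inode` fields and the `lookup` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    size: int                                    # file size in bytes
    blocks: list = field(default_factory=list)   # disk block addresses

def lookup(directory, inode_table, name):
    """Design (b): directory entry gives an i-node number; the i-node
    holds the attributes and block addresses."""
    inode_number = directory[name]   # raises KeyError if the name is absent
    return inode_table[inode_number]

# Hypothetical contents: two names referring to the same i-node.
inode_table = {7: Inode(size=2048, blocks=[40, 41])}
directory = {"notes.txt": 7, "notes-link.txt": 7}
```

Keeping attributes out of the directory entry is what lets several names refer to one file, which is the purpose of the reference count stored in the i-node.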


Implementing Directories (2)

Two ways of handling long file names in directory
• (a) In-line
• (b) In a heap


File Caching

File system has lots of data structures on disk
• Meta-data: bitmap of free blocks, directories, i-nodes, indirect blocks
• Data blocks

File caches speed access to all these types of data
• Changing disk I/O to memory access
• Response time can improve by 1,000,000x
• Write-through or write-back?
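The write-through versus write-back question can be illustrated with a small block cache. This is a sketch under stated assumptions: the disk is modeled as a Python dict, eviction is least-recently-used, the capacity is at least 1, and the class name `BlockCache` and its methods are hypothetical.

```python
from collections import OrderedDict

class BlockCache:
    def __init__(self, disk, capacity, write_back=False):
        self.disk = disk              # dict: block number -> data (stands in for the disk)
        self.capacity = capacity      # maximum cached blocks (assumed >= 1)
        self.write_back = write_back
        self.blocks = OrderedDict()   # cached blocks, least recently used first
        self.dirty = set()            # cached blocks newer than the disk copy

    def read(self, number):
        if number in self.blocks:
            self.blocks.move_to_end(number)         # cache hit: no disk I/O
        else:
            self._insert(number, self.disk[number]) # cache miss: read from disk
        return self.blocks[number]

    def write(self, number, data):
        self._insert(number, data)
        if self.write_back:
            self.dirty.add(number)    # defer the disk write
        else:
            self.disk[number] = data  # write-through: update disk immediately

    def _insert(self, number, data):
        self.blocks[number] = data
        self.blocks.move_to_end(number)
        while len(self.blocks) > self.capacity:
            victim, old = self.blocks.popitem(last=False)
            if victim in self.dirty:  # write-back: flush on eviction
                self.disk[victim] = old
                self.dirty.discard(victim)

    def flush(self):
        """Write all dirty blocks to disk (e.g. on a periodic sync)."""
        for number in list(self.dirty):
            self.disk[number] = self.blocks[number]
        self.dirty.clear()
```

Write-through keeps the disk consistent at the cost of one disk write per update; write-back absorbs repeated writes in memory but can lose dirty blocks if the system crashes before they are flushed.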