File System Implementations
CS-502 Fall 2007 1
File System Implementations
CS-502, Operating SystemsFall 2007
(Slides include materials from Operating System Concepts, 7th ed., by Silbershatz, Galvin, & Gagne and from Modern Operating Systems, 2nd ed., by Tanenbaum)
File System Implementations
CS-502 Fall 2007 2
Implementation of Files
• Create file abstraction using physical disk devices and disk blocks– Efficient in time, space, use of disk resources
– Fast enough for application requirements
• Must be scalable to a wide variety of file sizes– Many small files (< 1 page)
– Huge files (100’s of gigabytes, terabytes, spanning disks)
– Everything in between
File System Implementations
CS-502 Fall 2007 3
File Allocation Schemes
• Contiguous– Blocks of file stored in consecutive disk sectors– Directory points to first entry
• Linked– Blocks of file scattered across disk, as linked list– Directory points to first entry
• Indexed– Separate index block contains pointers to file blocks– Directory points to index block
File System Implementations
CS-502 Fall 2007 4
File Allocation Schemes (continued)
• The allocation scheme is an attribute of a file system, not of individual files within a system.
• All files within a file system follow same allocation model
File System Implementations
CS-502 Fall 2007 5
Volume
• The fundamental unit of a file system
• Physical volume may be a• physical disk storage device
• physical partition of a single disk (aka minidisk)
• Logical Volume• A physical volume
• A combination of other volumes– Usually similar in size and characteristics
File System Implementations
CS-502 Fall 2007 6
File Allocation Schemes
• Contiguous– Blocks of file stored in consecutive disk sectors– Directory points to first entry
• Linked– Blocks of file scattered across disk, as linked list– Directory points to first entry
• Indexed– Separate index block contains pointers to file blocks– Directory points to index block
File System Implementations
CS-502 Fall 2007 7
Contiguous Allocation
• Ideal for large, static files– Databases, fixed system structures, OS code– Multi-media video and audio– CD-ROM, DVD
• Simple address calculation– Directory entry points to first sector– File block i disk sector address
• Fast multi-block reads and writes– Minimize seeks between blocks
File System Implementations
CS-502 Fall 2007 8
Contiguously Allocated Files
File System Implementations
CS-502 Fall 2007 9
Block-to-sector Calculation
• To find disk sector containing block i of file f– Starting_block(f) + i
• Starting block of each file is named in– Directory, or– File metadata
File System Implementations
CS-502 Fall 2007 10
File Creation(Contiguous File System)
• Search for an empty sequence of blocks– First-fit– Best-fit
• Prone to fragmentation when …• Files come and go
• Files change size
• Similar to physical memory allocation in base-limit type of virtual memory
File System Implementations
CS-502 Fall 2007 11
Digression: Bad Block Management
• Bad blocks on disks are inevitable• Part of manufacturing process (less than 1%)
• Most are detected during formatting
• Occasionally, blocks become bad during operation
• Manufacturers typically add extra tracks to disks• Physical capacity = (1 + x) * rated_capacity
• Who handles bad blocks?• Disk controller: Bad block list maintained internally
– Automatically substitutes good blocks
• Formatter: Re-organize track to avoid bad blocks
• OS: Bad block list maintained by OS, bad blocks never used
File System Implementations
CS-502 Fall 2007 12
Bad Block Management inContiguous Allocation File Systems
• Bad blocks must be concealed• Foul up the block-to-sector calculation
• Methods• Look-aside list of bad sectors
– Check each sector request against hash table
– If present, substitute a replacement sector behind the scenes
• Spare sectors in each track, remapped by formatting
• Handling• Disk controller, invisible to OS
• Lower levels of OS; concealed from higher layers of file system and from application
File System Implementations
CS-502 Fall 2007 13
Contiguous Allocation – Extents
• Extent: a contiguously allocated subset of a file• Directory entry points to
– (For file with one extent) the extent itself
– (For file with multiple extents) pointer to an extent block describing multiple extents
• Advantages– Speed, ease of address calculation of contiguous file
– Avoids (some of) the fragmentation issues
– Can be adapted to support files across multiple disks
• …
File System Implementations
CS-502 Fall 2007 14
Contiguous Allocation – Extents
• …• Disadvantages
– Too many extents degenerates to indexed allocation• As in Unix-like systems, but not so well
• Popular in 1960s & 70s– OS/360, other systems for commercial data processing
• Currently used for large files in NTFS• Rarely mentioned in textbooks
• Silbershatz, §11.4.1 & 22.5.1
File System Implementations
CS-502 Fall 2007 15
Questions?
File System Implementations
CS-502 Fall 2007 16
File Allocation Schemes
• Contiguous– Blocks of file stored in consecutive disk sectors– Directory points to first entry
• Linked– Blocks of file scattered across disk, as linked list– Directory points to first entry
• Indexed– Separate index block contains pointers to file blocks– Directory points to index block
File System Implementations
CS-502 Fall 2007 17
Linked Allocation
• Blocks scattered across disk
• Each block contains pointer to next block
• Directory points to first and last blocks
• Sector header:– Pointer to next block
– ID and block number of file
10
16
01
25
File System Implementations
CS-502 Fall 2007 18
Linked Allocation (Note)
• This is Silbershatz figure 11.5
• Links in the book are incorrect
10
16
01
25
File System Implementations
CS-502 Fall 2007 19
Linked Allocation
• Advantages– No space fragmentation!– Easy to create, extend files– Ideal for lots of small files
• Disadvantages– Lots of disk arm movement– Space taken up by links– Sequential access only!
• Random access simulated by caching links
• Used in Xerox Alto file system
File System Implementations
CS-502 Fall 2007 20
Bad Block Management –Linked File Systems
• In OS:– format all sectors of disk• Don’t reserve any spare sectors
• Allocate bad blocks to a hidden file for the purpose
• If a block becomes bad, append to the hidden file
• Advantages• Very simple
• No look-aside or sector remapping needed
• Totally transparent without any hidden mechanism
File System Implementations
CS-502 Fall 2007 21
Variation on Linked Allocation – File Allocation Table (FAT)
• Instead of link on each block, put all links in one table– the File Allocation Table
— i.e., FAT
• Each entry corresponds to physical block in disk– Directory points to first &
last blocks of file– Each block points to next
block (or EOF)
File System Implementations
CS-502 Fall 2007 22
FAT File Systems
• Advantages– Advantages of Linked File System– FAT can be cached in memory– Searchable at CPU speeds, pseudo-random access
• Disadvantages– Limited size, not suitable for very large disks– FAT cache describes entire disk, not just open files!– Not fast enough for large databases
• Used in MS-DOS, early Windows systems– Also USB Flash drives, floppy disks, etc.
File System Implementations
CS-502 Fall 2007 23
Bad Block Management –FAT File Systems
• Same as Linked File Systems• I.e., format all sectors of disk
• Don’t reserve any spare sectors
• Allocate bad blocks to a hidden file for the purpose
• If a block becomes bad, append to the hidden file
• Same advantages and disadvantages
File System Implementations
CS-502 Fall 2007 24
Disk Defragmentation
• Re-organize blocks in disk so that file is (mostly) contiguous
• Link or FAT organization preserved
• Purpose:– To reduce disk arm movement during
sequential accesses– Does not change the linked structure of the file
system!
File System Implementations
CS-502 Fall 2007 25
Exam Question (last spring)
• You have a humongous database stored in a file on a 4 GB flash drive with a FAT file system. What must the file system do to locate block n of the database?
• Assume that database has not been defragmented, so that its blocks are likely to be scattered randomly across the flash drive.
• Given that the file system has found the location of block n, what must it do to find the location of block n+1? block n-1?
File System Implementations
CS-502 Fall 2007 26
Questions?
Linked and FAT File Systems
File System Implementations
CS-502 Fall 2007 27
File Allocation Schemes
• Contiguous– Blocks of file stored in consecutive disk sectors– Directory points to first entry
• Linked– Blocks of file scattered across disk, as linked list– Directory points to first entry
• Indexed– Separate index block contains pointers to file blocks– Directory points to index block
File System Implementations
CS-502 Fall 2007 28
Indexed Allocation
• i-node:– Part of file metadata– Data structure lists the
sector address of each block of file
• Advantages– True random access– Only i-nodes of open
files need to be cached– Supports small and
large files
File System Implementations
CS-502 Fall 2007 29
Unix/Linux i-nodes
• Direct blocks:– Pointers to first n
sectors
• Single indirect table:– Extra block containing
pointers to blocks n+1 .. n+m
• Double indirect table:– Extra block containing
single indirect blocks
• …
File System Implementations
CS-502 Fall 2007 30
Indexed Allocation
• Access to every block of file is via i-node
• Bad block management– Similar to Linked/FAT systems
• Disadvantage– Not as fast as contiguous allocation for large
databases• Requires reference to i-node for every access
vs.
• Simple calculation of block to sector address
File System Implementations
CS-502 Fall 2007 31
Indexed Allocation (continued)
• Widely used in Unix, Linux, Windows NTFS
• Robust• Has withstood the test of time
• Many variations
File System Implementations
CS-502 Fall 2007 32
Questions?
File System Implementations
CS-502 Fall 2007 33
Free Block Management in File Systems
• Bitmap– Very compact on disk
– Expensive to search
– Supports contiguous allocation
• Free list– Linked list of free blocks
• Each block contains pointer to next free block
– Only head of list needs to be cached in memory
– Very fast to search and allocate
– Contiguous allocation very difficult
File System Implementations
CS-502 Fall 2007 34
Free Block ManagementBit Vector
…
0 1 2 n-1
bit[i] = 0 block[i] free
1 block[i] occupied
Free block number calculation
(number of bits per word) *(number of 0-value words) +offset of first 1 bit
File System Implementations
CS-502 Fall 2007 35
Free Block ManagementBit Vector (continued)
• Bit map– Must be kept both in memory and on disk– Copy in memory and disk may differ– Cannot allow for block[i] to have a situation
where bit[i] = 1 in memory and bit[i] = 0 on disk
File System Implementations
CS-502 Fall 2007 36
Free Block ManagementBit Vector (continued)
• Solution:– Set bit[i] = 1 in disk– Allocate block[i]– Set bit[i] = 1 in memory– Similarly for set of contiguous blocks
• Potential for lost blocks in event of crash!– Discussion:– How do we solve this problem?
File System Implementations
CS-502 Fall 2007 37
Free Block ManagementLinked List
• Linked list of free blocks– Not necessarily in order!
• Cache first few free blocks in memory
• Head of list must be stored both– On disk– In memory
• Each block must be written to disk when freed
• Potential for losing blocks?
File System Implementations
CS-502 Fall 2007 38
Reading Assignment
• Silbershatz, Chapter 11• Ignore §11.8 – 11.10
• Tanenbaum (Modern Operating Systems), Chapter 6
File System Implementations
CS-502 Fall 2007 39
Scalability of File Systems
• Question: How large can a file be?• Answer: limited by
– Number of bits in length field in metadata
– Size & number of block entries in FAT or i-node
• Question: How large can file system be?• Answer: limited by
– Size & number of block entries in FAT or i-node
File System Implementations
CS-502 Fall 2007 40
MS-DOS & Windows
• FAT-12 (primarily on floppy disks):
• 4096 512-byte blocks
• Only 4086 blocks usable!
• FAT-16 (early hard drives):
• 64 K blocks; block sizes up to 32 K bytes
• 2 GBytes max per partition, 4 partitions per disk
• FAT-32 (Windows 95)
• 228 blocks; up to 2 TBytes per disk
• Max size FAT requires 232 bytes in RAM!
File System Implementations
CS-502 Fall 2007 41
MS-DOS File System (continued)
• Maximum partition for different block sizes• The empty boxes represent forbidden combinations
File System Implementations
CS-502 Fall 2007 42
Classical Unix
• Maximum number of i-nodes = 64K!• How many files in a modern PC?
• I-node structure allows very large files, but …
• Limited by size of internal fields
File System Implementations
CS-502 Fall 2007 43
Modern Operating Systems
• Need much larger, more flexible file systems
• Many terabytes per system
• Multi-terabyte files
• Suitable for both large and small
• Cache only open files in RAM
File System Implementations
CS-502 Fall 2007 44
Examples of Modern File Systems
• Windows NTFS• Silbershatz §22.5
• Tanenbaum §11.7
• Linux ext2fs• Silbershatz §21.7.2
• Other file systems …• Consult your favorite Linux system documentation
File System Implementations
CS-502 Fall 2007 45
New Topic
File System Implementations
CS-502 Fall 2007 46
Mounting
mount –t type device pathname
• Attach device (which contains a file system of type type) to the directory at pathname
• File system implementation for type gets loaded and connected to the device
• Anything previously below pathname becomes hidden until the device is un-mounted again
• The root of the file system on device is now accessed as pathname
• E.g.,mount –t iso9660 /dev/cdrom /myCD
File System Implementations
CS-502 Fall 2007 47
Mounting (continued)
• OS automatically mounts devices in mount table at initialization time
• /etc/fstab in Linux
• Users or applications may mount devices at run time, explicitly or implicitly — e.g.,
• Insert a floppy disk
• Plug in a USB flash drive
• Type may be implicit in device• Windows equivalent
• Map drive
File System Implementations
CS-502 Fall 2007 48
Virtual File Systems
• Virtual File Systems (VFS) provide object-oriented way of implementing file systems.
• VFS allows same system call interface to be used for different types of file systems.
• The API is to the VFS interface, rather than any specific type of file system.
File System Implementations
CS-502 Fall 2007 49
Schematic View of Virtual File System
File System Implementations
CS-502 Fall 2007 50
Virtual File System (continued)
• Mounting: formal mechanism for attaching a file system to the Virtual File interface
File System Implementations
CS-502 Fall 2007 51
Linux Virtual File System (VFS)
• A generic file system interface provided by the kernel
• Common object framework– superblock: a specific, mounted file system– i-node object: a specific file in storage– d-entry object: a directory entry– file object: an open file associated with a
process
File System Implementations
CS-502 Fall 2007 52
Linux Virtual File System (continued)
• VFS operations– super_operations:
• read_inode, sync_fs, etc.
– inode_operations:• create, link, etc.
– d_entry_operations:• d_compare, d_delete, etc.
– file_operations:• read, write, seek, etc.
File System Implementations
CS-502 Fall 2007 53
Linux Virtual File System (continued)
• Individual file system implementations conform to this architecture.
• May be linked to kernel or loaded as modules
• Linux kernel 2.6 supports over 50 file systems in official version
• E.g., minix, ext, ext2, ext3, iso9660, msdos, nfs, smb, …
File System Implementations
CS-502 Fall 2007 54
Reading references
• Silbershatz, §11.2.3
• Robert Love, Chapter 12
File System Implementations
CS-502 Fall 2007 55
Questions?
File System Implementations
CS-502 Fall 2007 56
Implementation of Directories
• A list of [name, information] pairs• Must be scalable from very few entries to very many
• Name:• User-friendly, variable length• Any language• Fast access by name
• Information:• File metadata (itself)• Pointer to file metadata block (or i-node) on disk• Pointer to first & last blocks of file• Pointer to extent block(s)• …
File System Implementations
CS-502 Fall 2007 57
Very Simple Directory
• Short, fixed length names• Attribute & disk addresses contained in directory• MS-DOS, etc.
name1 attributes
name2 attributes
name3 attributes
name4 attributes
… …
File System Implementations
CS-502 Fall 2007 58
Simple Directory
• Short, fixed length names• Attributes in separate blocks (e.g., i-nodes)
• Attribute pointers are disk addresses (or i-node numbers)
• Older Unix versions, MS-DOS, etc.
name1
name2
name3
name4
…
i-node
i-node
i-node
i-node
Data structurescontaining attributes
File System Implementations
CS-502 Fall 2007 59
More Interesting Directory
• Variable length file names– Stored in heap at end
• Modern Unix, Windows
• Linear or logarithmic search for name
• Compaction needed after– Deletion, Rename
attributes
attributes
attributes
attributes
… …
name1 longer_name3 very_long_name4 name2 …
File System Implementations
CS-502 Fall 2007 60
Very Large Directories
• Hash-table implementation
• Each hash chain like a small directory with variable-length names
• Must be sorted for listing
File System Implementations
CS-502 Fall 2007 61
Questions?
Top Related