CS503: Operating Systems, Spring 2014. General File Systems.
Long Term Storage
• Must store large amounts of information
• Must survive termination of process using the information
• Must allow other processes to access the information
• Store information on disks in units called files
04/18/23
Why Files?
• Physical reality
– Block oriented
– Physical sector #s
– No protection among users of the system
– Data might be corrupted if machine crashes
• File system model
– Byte oriented
– Named files
– Users protected from each other
– Robust to machine failures
File System Requirements
• Users must be able to:
– Create, modify, and delete files at will.
– Read, write, and modify file contents with a minimum of fuss about blocking, buffering, etc.
– Share each other's files with proper authorization.
– Transfer information between files.
– Refer to files by symbolic names.
– Retrieve backup copies of files lost through accident or malicious destruction.
– See a logical view of their files without concern for how they are stored.
File Structure
• Byte sequence
– Read or write a number of bytes
– Unstructured
– User program decides meaning
• Tree
– Records with keys
– Read, insert, delete a specific record
– Still used on mainframe computers in some commercial data processing
• Record sequence
– Fixed length
– Read or write one record at a time
– Punch cards, 80 char records
File Access Patterns
• Sequential access
– Read all bytes in order
– Tapes, continuous media files (mp3, swf…)
• Random access
– Read bytes/records out of order
– Essential for some applications: databases
– seek
– Disks
File System Implementation
• Four key aspects:
– Layout
– Allocation
– Management of free blocks
– Directory management
File System Layout
• Partitions: independent file systems
• MBR (Master Boot Record): boots computer, then active partition
• Boot block: first block executed
• Superblock: info about the file system
– # of files, # of blocks, free blocks
[Figure: disk layout. The MBR and partition table come first, followed by the partitions; each partition holds boot block, super block, free space mgmt, file mgmt, root dir, files and directories.]
Contiguous Allocation
• Simple: store 2 #’s
• Efficient: 1 seek
• Random access
• Must know file size
• Fragmentation
Linked-List Allocation
• Index with address of first block
• First word of each block points to next block
• Can grow files dynamically
• Random access is slow
• Internal fragmentation
• Incomplete block sizes (the pointer takes up part of each block)
• Unreliable: what if you lose one block in the chain?
Linked-List + Table
• FAT in memory
– pointers stored in table (block, next)
– next = -1 indicates eof
• Random access w/o disk reference
• FAT must be stored in memory!
– 20 GB disk, 1 KB block, 1 word per entry => 80 MB!

Example file: 4, 7, 2, 10, 12

Current  Next
0
1
2        10
3        11
4        7
5
6        3
7        2
8
9
10       12
11
12       -1
13
14
15
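
The table lookup above can be sketched in C. This is a minimal illustration, not a real FAT driver: the names fat_chain, fat_nth_block, fat_table_bytes and the FAT_UNUSED marker are made up for this example, and entries that are blank on the slide are assumed to be unused. The size calculation assumes a 4-byte word, which is what makes the slide's 80 MB figure work out.

```c
#include <stddef.h>

#define FAT_EOF    (-1)  /* from the slide: next = -1 indicates eof */
#define FAT_UNUSED (-2)  /* assumption: stands in for blank table entries */

/* The 16-entry table from the slide: index = current block, value = next. */
static const int fat[16] = {
    FAT_UNUSED, FAT_UNUSED, 10, 11, 7, FAT_UNUSED, 3, 2,
    FAT_UNUSED, FAT_UNUSED, 12, FAT_UNUSED, FAT_EOF,
    FAT_UNUSED, FAT_UNUSED, FAT_UNUSED
};

/* Follow the chain from `first`, storing block numbers in `out`.
   Returns the chain length, or -1 if the chain is broken or longer than cap. */
int fat_chain(const int *table, int nblocks, int first, int *out, int cap)
{
    int n = 0;
    int b = first;
    while (b != FAT_EOF) {
        if (b < 0 || b >= nblocks || n >= cap)
            return -1;
        out[n++] = b;
        b = table[b];
    }
    return n;
}

/* Random access: physical block holding logical block k, or -1 past eof.
   Only the in-memory table is touched, never the disk. */
int fat_nth_block(const int *table, int nblocks, int first, int k)
{
    int b = first;
    while (k-- > 0 && b >= 0 && b < nblocks)
        b = table[b];
    return (b >= 0 && b < nblocks) ? b : -1;
}

/* Memory needed for the whole table: one entry per disk block. */
unsigned long long fat_table_bytes(unsigned long long disk_bytes,
                                   unsigned long long block_bytes,
                                   unsigned long long entry_bytes)
{
    return disk_bytes / block_bytes * entry_bytes;
}
```

Starting at block 4, fat_chain visits 4, 7, 2, 10, 12, matching the example file. With a 20 GB disk, 1 KB blocks and 4-byte entries, fat_table_bytes gives 80 MB.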
Indexed Allocation
• Associate each file with data structure: i-node
– table of file’s blocks
• Only need i-nodes in memory for open files
– set max # of open files
i-nodes
• Attributes:
– File type, size
– Owner, group, permissions (r/w/x)
– Times: creation, last access, last modified
– Reference count
• Block addresses
– Direct
– Indirect
File Attributes
Address of block 0
Address of block 1
…
Address of block N
Single Indirect
Double Indirect
Triple Indirect
i-nodes
• Assume: N=10, 1KB blocks, 4 byte entries
– Direct: 10 KB
– Single indirect: 256 KB
– Double indirect: 64 MB
– Triple indirect: A lot!
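
The figures above follow from one rule: a 1 KB block holds 1024 / 4 = 256 addresses, and each level of indirection multiplies the reachable blocks by 256. A small sketch of that arithmetic, with hypothetical function names reach_bytes and max_file_bytes:

```c
#include <stdint.h>

/* Bytes reachable through one pointer with `levels` levels of indirection
   (0 = direct pointer, 1 = single indirect, and so on). */
uint64_t reach_bytes(uint64_t block_size, uint64_t entry_size, int levels)
{
    uint64_t entries = block_size / entry_size;  /* addresses per index block */
    uint64_t blocks = 1;
    for (int i = 0; i < levels; i++)
        blocks *= entries;
    return blocks * block_size;
}

/* Largest file an i-node can describe with n_direct direct pointers plus
   one single, one double, and one triple indirect pointer. */
uint64_t max_file_bytes(uint64_t n_direct, uint64_t block_size,
                        uint64_t entry_size)
{
    return n_direct * reach_bytes(block_size, entry_size, 0)
         + reach_bytes(block_size, entry_size, 1)
         + reach_bytes(block_size, entry_size, 2)
         + reach_bytes(block_size, entry_size, 3);
}
```

Under the slide's assumptions, the triple indirect pointer alone reaches 256 × 256 × 256 blocks of 1 KB, which is 16 GB.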
Free Block List
• Block contents:
– entries of other block addresses that are free
– last entry points to another block
– 16 GB free, 1 KB blocks, 4 bytes per entry => 65,794 blocks
• Pros/Cons:
– Can be large
– Can be stored on free blocks
– Can load one block into memory at a time
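
The 65,794 figure can be reproduced with a short calculation. The sketch below assumes, as the slide states, that the last entry of each list block is a link to the next list block, so a 1 KB block holds 255 free-block addresses. The function name free_list_blocks is made up for this example.

```c
#include <stdint.h>

/* Number of disk blocks needed to store a linked free list.
   Each list block holds (block_size / entry_size - 1) free-block addresses;
   the remaining entry links to the next list block. */
uint64_t free_list_blocks(uint64_t free_blocks, uint64_t block_size,
                          uint64_t entry_size)
{
    uint64_t per_block = block_size / entry_size - 1;
    return (free_blocks + per_block - 1) / per_block;  /* round up */
}
```

16 GB of free space in 1 KB blocks is 16,777,216 free blocks; dividing by 255 and rounding up gives 65,794.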
Free Bitmap
• 1 bit per block (1 = free, 0 = not free)
– 16 GB free, 1 KB blocks, 1 bit per entry => 2,048 blocks
• Allocated blocks are closer together
• Fixed in size
– Free list can be smaller if few free pages
– Can have 1 block in memory at a time
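
A free bitmap is simple enough to sketch in full. The code below uses the slide's convention (1 = free, 0 = not free). All function names are hypothetical, and the lowest-numbered-first allocation policy is an assumption chosen for simplicity.

```c
#include <stdint.h>
#include <stddef.h>

/* Disk blocks needed to hold a bitmap with one bit per block. */
uint64_t bitmap_blocks(uint64_t total_blocks, uint64_t block_size)
{
    uint64_t bits_per_block = block_size * 8;
    return (total_blocks + bits_per_block - 1) / bits_per_block;
}

/* Mark block b as free (1) or in use (0). */
void bitmap_set_free(uint8_t *map, size_t b, int is_free)
{
    if (is_free)
        map[b / 8] |= (uint8_t)(1u << (b % 8));
    else
        map[b / 8] &= (uint8_t)~(1u << (b % 8));
}

/* Returns 1 if block b is free, 0 otherwise. */
int bitmap_is_free(const uint8_t *map, size_t b)
{
    return (map[b / 8] >> (b % 8)) & 1;
}

/* Allocate the lowest-numbered free block; returns -1 if none is free. */
long bitmap_alloc(uint8_t *map, size_t nblocks)
{
    for (size_t b = 0; b < nblocks; b++) {
        if (bitmap_is_free(map, b)) {
            bitmap_set_free(map, b, 0);
            return (long)b;
        }
    }
    return -1;
}
```

Scanning from the lowest block number is one reason allocated blocks tend to end up close together. For 16,777,216 blocks of 1 KB, bitmap_blocks returns 2,048.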
Directory System
• Map ASCII name of file to information needed to locate data (i-node)
– Can also store attributes about file
• UNIX
– Stored like a regular file
– Table of names and i-nodes
Opening a File: /usr/bob/file
• Fetch root dir /
• Look up “usr”
• Get i-node for directory (as a file) usr
• Use i-node to retrieve blocks for directory usr
• Look up “bob”
• Get i-node for directory bob
• Use i-node to retrieve blocks for /usr/bob
• Look up “file”
• Get i-node for file
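
The lookup steps above amount to a loop over path components. The sketch below models directories as small in-memory tables instead of disk blocks. The struct names, the i-node numbers, and the example_fs tree are all invented for illustration.

```c
#include <string.h>

#define MAX_ENTRIES 4
#define ROOT_INO 0

struct toy_dirent { const char *name; int ino; };

struct toy_dir_inode {
    int is_dir;
    int nentries;                          /* used only for directories */
    struct toy_dirent entries[MAX_ENTRIES];
};

/* Example tree: / -> usr -> bob -> file (i-node numbers are made up). */
static const struct toy_dir_inode example_fs[4] = {
    [0] = { 1, 1, { { "usr", 1 } } },
    [1] = { 1, 1, { { "bob", 2 } } },
    [2] = { 1, 1, { { "file", 3 } } },
    [3] = { 0 },
};

/* Look up one name in a directory; returns its i-node number or -1. */
static int dir_lookup(const struct toy_dir_inode *dir,
                      const char *name, size_t len)
{
    if (!dir->is_dir)
        return -1;
    for (int i = 0; i < dir->nentries; i++)
        if (strlen(dir->entries[i].name) == len &&
            strncmp(dir->entries[i].name, name, len) == 0)
            return dir->entries[i].ino;
    return -1;
}

/* Resolve an absolute path one component at a time, starting at the root. */
int path_lookup(const struct toy_dir_inode *itable, const char *path)
{
    int ino = ROOT_INO;
    if (*path != '/')
        return -1;
    while (*path) {
        while (*path == '/')
            path++;                        /* skip separators */
        if (!*path)
            break;
        size_t len = strcspn(path, "/");   /* length of next component */
        ino = dir_lookup(&itable[ino], path, len);
        if (ino < 0)
            return -1;
        path += len;
    }
    return ino;
}
```

In a real file system each dir_lookup call costs at least one i-node read and one directory block read, which is why deep paths are slower to open.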
Directory Representation and Hard Links
• A directory is a file that contains a list of pairs (file name, I-node number)
• Each pair is also called a hard link
• An I-node may appear in multiple directories.
• A reference count in the I-node keeps track of the number of directories where the I-node appears.
• When the reference count reaches 0, the file is removed.
Hard Links
• Hard Links cannot cross partitions, that is, a directory cannot list an I-node of a different partition.
• Example: creating a hard link to a target-file in the current directory
ln target-file name-link
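
The reference-count rule and the partition restriction can be modeled in a few lines. This is a toy model, not how any particular kernel implements link and unlink; the struct and function names are hypothetical.

```c
struct link_inode {
    int dev;        /* partition the i-node lives on */
    int refcount;   /* number of directory entries naming this i-node */
    int allocated;  /* 1 while the file exists */
};

/* Add a directory entry for `ino` in a directory on partition dir_dev.
   Returns 0 on success, -1 if the link would cross partitions. */
int toy_link(struct link_inode *ino, int dir_dev)
{
    if (ino->dev != dir_dev)
        return -1;              /* hard links cannot cross partitions */
    ino->refcount++;
    return 0;
}

/* Remove one directory entry; the file goes away with its last name. */
void toy_unlink(struct link_inode *ino)
{
    if (ino->refcount > 0 && --ino->refcount == 0)
        ino->allocated = 0;
}
```

With two names for the same file, the first toy_unlink only drops the count to 1; the data is released on the second.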
Soft-Links
• Directories may also contain soft-links.
• A soft-link is a pair of the form (file name, number of an i-node whose file stores a path), where path may be an absolute or relative path in this or another partition.
• Soft-links can point to files in different partitions.
• A soft-link does not keep track of the target file.
• If the target file is removed, the symbolic link becomes invalid (dangling symbolic link).
• Example:
ln -s target-file name-link
Opening a File
• Once the file i-node is retrieved
– keep track of opened file i-nodes
– store current read/write location within file
int fseek(FILE *stream, long offset, int whence);
// whence can be SEEK_SET, SEEK_CUR, or SEEK_END
off_t lseek(int fd, off_t offset, int whence);
// whence can be SEEK_SET, SEEK_CUR, or SEEK_END
– store r/w bits
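
The effect of whence on the stored read/write location can be modeled with a pure function. seek_target is a hypothetical helper that mirrors the documented behavior of fseek and lseek. It is not their implementation and it does not touch any file.

```c
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Compute the new file position for a seek request.
   cur  = current position, size = current file size.
   Returns the new position, or -1 for a bad whence or a negative result. */
long seek_target(long cur, long size, long offset, int whence)
{
    long base;
    switch (whence) {
    case SEEK_SET: base = 0;    break;  /* relative to start of file */
    case SEEK_CUR: base = cur;  break;  /* relative to current position */
    case SEEK_END: base = size; break;  /* relative to end of file */
    default:       return -1;
    }
    long pos = base + offset;
    return pos < 0 ? -1 : pos;
}
```

For a 100-byte file with the position at 10, an offset of -1 from SEEK_END yields 99, the last byte of the file.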
Data Structures for a Typical File System
[Figure: process control block (open file pointer array) -> open file table (system-wide, why?) -> i-node table -> file i-node]
Per-process Table Information
• All files that the process has open
• Information regarding the use of the file by the process
• The following item may be stored on a per-file, per-process basis
– Current file pointer indicating the location in the file (current read/write position)
• Each entry in the per-process table points to an entry in the open-file table
Open-file Table Information
• File Open Count
– counter which tracks the number of opens and closes.
• Pointer to I-node
– pointing to the I-node of the file. The I-node has been read from disk to kernel memory upon file open
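
The two tables described above can be sketched as C structs. The layout follows these slides: the file position lives in the per-process entry, and the open count and i-node pointer live in the system-wide entry. All names are hypothetical, and real kernels differ in where they keep each field.

```c
#include <stddef.h>

#define MAX_FD 8

struct mem_inode { int ino; };              /* i-node copy in kernel memory */

struct open_file {                          /* system-wide open-file entry */
    int open_count;                         /* opens minus closes */
    struct mem_inode *inode;
};

struct fd_entry {                           /* per-process table entry */
    struct open_file *file;                 /* NULL when the slot is unused */
    long pos;                               /* current read/write position */
};

struct pcb { struct fd_entry fds[MAX_FD]; };

/* Attach an open-file entry to the first unused slot; returns the fd or -1. */
int toy_open(struct pcb *p, struct open_file *of)
{
    for (int fd = 0; fd < MAX_FD; fd++) {
        if (p->fds[fd].file == NULL) {
            p->fds[fd].file = of;
            p->fds[fd].pos = 0;
            of->open_count++;
            return fd;
        }
    }
    return -1;
}

/* Release a slot and drop the open count; returns 0, or -1 for a bad fd. */
int toy_close(struct pcb *p, int fd)
{
    if (fd < 0 || fd >= MAX_FD || p->fds[fd].file == NULL)
        return -1;
    p->fds[fd].file->open_count--;
    p->fds[fd].file = NULL;
    return 0;
}
```

Two processes that open the same file share one open_file entry and one in-memory i-node, yet each keeps its own position. That sharing is one answer to the "system-wide, why?" question in the figure.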
Opening a File (continued)
fd = open( FileName, access )
[Figure: PCB -> open file table -> directory + I-node -> file system on disk. Opening a file performs file name lookup & authentication, then allocates & links up data structures.]
• Definitions:
– File descriptor (fd): an integer used to represent a file, easier than using names
– Directory: locating I-node
– I-node: disk location of data, used to access the “real” data
Reading A Block
read( fd, userBuf, size )
[Figure: PCB -> open file table -> I-node (logical -> physical) -> read( device, phyBlock, size ) -> buffer cache (why?) -> disk device driver. Get physical block to sysBuf, copy to userBuf.]
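
The logical-to-physical step in the figure can be sketched as a lookup in the i-node's block addresses. The example assumes the earlier slide's parameters (10 direct addresses, 1 KB blocks, 4-byte entries) and handles only direct and single indirect blocks. The struct, the function name, and the use of an in-memory array in place of an indirect disk block are all assumptions of this sketch.

```c
#include <stdint.h>
#include <stddef.h>

#define BLOCK_SIZE     1024u
#define N_DIRECT       10u
#define PTRS_PER_BLOCK (BLOCK_SIZE / 4u)    /* 256 addresses per block */

struct map_inode {
    uint32_t direct[N_DIRECT];
    const uint32_t *single_indirect;  /* stands in for a disk block of
                                         addresses; NULL if not allocated */
};

/* Physical block number holding byte `offset` of the file, or -1 if the
   offset is beyond what this sketch maps (double/triple indirect omitted). */
long logical_to_physical(const struct map_inode *ino, uint64_t offset)
{
    uint64_t lblock = offset / BLOCK_SIZE;   /* logical block index */
    if (lblock < N_DIRECT)
        return (long)ino->direct[lblock];
    lblock -= N_DIRECT;
    if (lblock < PTRS_PER_BLOCK && ino->single_indirect != NULL)
        return (long)ino->single_indirect[lblock];
    return -1;
}
```

The byte's position inside the physical block is offset % BLOCK_SIZE. Once the block sits in a system buffer, the read copies from there into the user buffer.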
File System Consistency
• During a crash the file system can be damaged
• File system has inherent redundancy
– Reconstruct (fsck, scandisk)
File System Consistency
• Iterate through blocks of files
• Iterate through blocks on free list
• If the file system is consistent, then each block is in exactly one of the two
Block #:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Used:     1  1  0  1  0  1  1  1  1  0  0  1  1  1  0  0
Free:     0  0  1  0  1  0  0  0  0  1  1  0  0  0  1  1
File System Consistency
• Missing block: 0 in both rows
– Wastes disk capacity
– Add to free list
• Free list has block twice (what about bitmap?)
– Rebuild free list (complement of used blocks)
• Block present in multiple files
– Allocate a free block and duplicate
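
The block check can be expressed as two counters per block: how many times it appears in files and how many times it appears on the free list. The sketch below classifies each block from those counts. The enum and function names are invented, and the used-and-free case is included as an assumption about how a checker might report it.

```c
enum block_state {
    BLOCK_OK,            /* exactly one of used/free is 1 */
    BLOCK_MISSING,       /* 0 in both rows: add to free list */
    BLOCK_DUP_FREE,      /* on the free list more than once: rebuild list */
    BLOCK_DUP_USED,      /* in more than one file: copy to a free block */
    BLOCK_USED_AND_FREE  /* in a file and also marked free */
};

/* Classify one block from its two counters. */
enum block_state classify_block(int used_count, int free_count)
{
    if (used_count > 1)
        return BLOCK_DUP_USED;
    if (used_count == 1 && free_count >= 1)
        return BLOCK_USED_AND_FREE;
    if (free_count > 1)
        return BLOCK_DUP_FREE;
    if (used_count == 0 && free_count == 0)
        return BLOCK_MISSING;
    return BLOCK_OK;
}

/* Count the blocks that fail the check. */
int count_inconsistent(const int *used, const int *free_cnt, int n)
{
    int bad = 0;
    for (int b = 0; b < n; b++)
        if (classify_block(used[b], free_cnt[b]) != BLOCK_OK)
            bad++;
    return bad;
}
```

For the 16-block table shown earlier, the two rows are exact complements, so count_inconsistent reports 0. A bitmap cannot list a block twice, so the duplicate-free case only arises with a free list.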
More Cases of Inconsistency
• Blocks allocated to multiple files.
• i-nodes containing block numbers that overlap.
• i-nodes containing block numbers out of range.
• Discrepancies between the number of directory references to a file and the link count of the file.
• i-nodes containing block numbers that are marked free in the disk map.
• i-nodes containing corrupt block numbers.
• Size checks:
– Incorrect number of blocks.
– Directory size not a multiple of 512 bytes.
• Directory checks:
– i-node number out of range.
– Files that are not referenced or directories that are not reachable.