CS503: Operating Systems, Spring 2014

General File Systems

Long Term Storage

• Must store large amounts of information

• Must survive termination of process using the information

• Must allow other processes to access the information

• Store information on disks in units called files

04/18/23


Why Files?

• Physical reality
– Block oriented
– Physical sector #s
– No protection among users of the system
– Data might be corrupted if machine crashes

• File system model
– Byte oriented
– Named files
– Users protected from each other
– Robust to machine failures

File System Requirements

• Users must be able to:
– Create, modify, and delete files at will.
– Read, write, and modify file contents with a minimum of fuss about blocking, buffering, etc.
– Share each other's files with proper authorization.
– Transfer information between files.
– Refer to files by symbolic names.
– Retrieve backup copies of files lost through accident or malicious destruction.
– See a logical view of their files without concern for how they are stored.

File Structure

• Byte sequence
– Read or write a number of bytes
– Unstructured
– User program decides meaning

• Tree
– Records with keys
– Read, insert, delete a specific record
– Still used on mainframe computers in some commercial data processing

• Record sequence
– Fixed length
– Read or write one record at a time
– Punch cards, 80 char records

File Access Patterns

• Sequential access
– Read all bytes in order
– Tapes, continuous media files (mp3, swf…)

• Random access
– Read bytes/records out of order
– Essential for some applications: databases
– seek
– Disks

File System Implementation

• Four key aspects:
– Layout
– Allocation
– Management of free blocks
– Directory management

File System Layout

• Partitions: independent file systems
• MBR (Master Boot Record): executed when the computer boots, then hands control to the active partition
• Boot block: first block executed
• Superblock: info about the file system
– # of files, # of blocks, free blocks

[Figure: disk layout. Entire disk: MBR | Partition table | partitions. One partition: Boot block | Super block | Free space mgmt | File mgmt | Root dir | Files and directories]

File Allocation

• Contiguous

• Linked-List

• Linked-List + Table

• I-nodes

Contiguous Allocation

• Simple: store 2 numbers (first block and length)
• Efficient: 1 seek
• Random access
• Must know file size
• Fragmentation

Linked-List Allocation

• Index with address to first block

• Random access is slow

• Internal fragmentation

• First word points to next block

• Unreliable: what if you lose one block in chain?

• Can grow files dynamically

• Incomplete block sizes: the pointer takes up part of each block, so the data per block is no longer a power of two

Linked-List + Table

• Random access w/o disk reference

• FAT must be stored in memory!

• FAT in memory
– pointers stored in table (block, next)
– next = -1 indicates eof

20 GB disk, 1 KB blocks, 1 word per entry => 80 MB!

Current  Next
   0
   1
   2      10
   3      11
   4       7
   5
   6       3
   7       2
   8
   9
  10      12
  11
  12      -1
  13
  14
  15

Example file: 4, 7, 2, 10, 12
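The table lookup above can be sketched in C. This is only an illustration: the names `slide_fat`, `FAT_UNUSED`, `fat_nth_block`, and `fat_table_bytes` are hypothetical, blank table entries are marked with an assumed sentinel, and "1 word" is assumed to be 4 bytes (which is what makes 20 GB / 1 KB come out to 80 MB).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAT_EOF    (-1)  /* "next = -1 indicates eof" */
#define FAT_UNUSED (-2)  /* assumed marker for entries the slide leaves blank */

/* The 16-entry table from the slide, indexed by current block. */
const int slide_fat[16] = {
    FAT_UNUSED, FAT_UNUSED, 10, 11, 7, FAT_UNUSED, 3, 2,
    FAT_UNUSED, FAT_UNUSED, 12, FAT_UNUSED, FAT_EOF, FAT_UNUSED,
    FAT_UNUSED, FAT_UNUSED
};

/* Return the disk block that holds logical block n of the file starting at
 * `first`, or -1 if the chain ends (or reaches an unused entry) before that.
 * Only the in-memory table is touched, no disk reference is needed. */
int fat_nth_block(const int *table, size_t table_len, int first, int n)
{
    assert(table != NULL && n >= 0);
    int block = first;
    while (n-- > 0) {
        if (block < 0 || (size_t)block >= table_len)
            return -1;
        block = table[block];
    }
    return (block >= 0 && (size_t)block < table_len) ? block : -1;
}

/* Memory needed for the whole table: one entry per disk block. */
uint64_t fat_table_bytes(uint64_t disk_bytes, uint64_t block_bytes,
                         uint64_t entry_bytes)
{
    assert(block_bytes > 0);
    return (disk_bytes / block_bytes) * entry_bytes;
}
```

For the example file, `fat_nth_block(slide_fat, 16, 4, 3)` walks 4, 7, 2 and lands on block 10. Random access still costs one table step per logical block, but every step is a memory access.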

Indexed Allocation

• Associate each file with data structure: i-node
– table of file’s blocks

• Only need i-nodes in memory for open files
– set max # of open files

i-nodes

• Attributes:
– File type, size
– Owner, group, permissions (r/w/x)
– Times: creation, last access, last modified
– Reference count

• Block addresses
– Direct
– Indirect

[Figure: i-node layout: File Attributes | Address of block 0 | Address of block 1 | ... | Address of block N | Single Indirect | Double Indirect | Triple Indirect]

i-nodes

• Assume: N=10, 1 KB blocks, 4-byte entries
– Direct: 10 KB
– Single indirect: 256 KB
– Double indirect: 64 MB
– Triple indirect: 16 GB (a lot!)
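The sizes listed above follow from the number of addresses that fit in one block (1 KB / 4 bytes = 256). A minimal sketch of the arithmetic, with hypothetical function names and assuming exactly one single, one double, and one triple indirect pointer per i-node:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes reachable through one indirect pointer.
 * level 1 = single indirect, 2 = double, 3 = triple. */
uint64_t indirect_capacity(uint64_t block_size, uint64_t entry_size, int level)
{
    assert(level >= 1 && entry_size > 0);
    uint64_t entries = block_size / entry_size;  /* addresses per block */
    uint64_t data_blocks = 1;
    for (int i = 0; i < level; i++)
        data_blocks *= entries;                  /* 256, 256^2, 256^3 */
    return data_blocks * block_size;
}

/* Largest file describable by n_direct direct pointers plus one single,
 * one double, and one triple indirect pointer. */
uint64_t max_file_size(uint64_t block_size, uint64_t entry_size,
                       unsigned n_direct)
{
    uint64_t total = (uint64_t)n_direct * block_size;
    for (int level = 1; level <= 3; level++)
        total += indirect_capacity(block_size, entry_size, level);
    return total;
}
```

With the slide's parameters, `indirect_capacity(1024, 4, 3)` evaluates to 256 × 256 × 256 × 1 KB, which is 16 GB.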


Free Blocks

• Must keep track of blocks that are free
– List
– Bitmap

Free Block List

• Block contents:
– entries of other block addresses that are free
– last entry points to another block

16 GB free, 1 KB blocks, 4 bytes per entry => 65,794 blocks (each list block holds 255 free-block addresses plus the link)

• Pros/Cons:
– Can be large
– Can be stored on free blocks
– Can load one block into memory at a time

Free Bitmap

• 1 bit per block (1 = free, 0 = not free)

16 GB free, 1 KB blocks, 1 bit per entry => 2,048 blocks

• Allocated blocks are closer together

• Fixed in size
– Free list can be smaller if few free blocks
– Can have 1 block in memory at a time
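The two block counts quoted for the free list (65,794) and the bitmap (2,048) can be reproduced with a short calculation. The sketch below uses hypothetical function names and assumes, as the free-list slide states, that the last entry of every list block is the link to the next list block.

```c
#include <assert.h>
#include <stdint.h>

/* Round-up integer division. */
static uint64_t ceil_div(uint64_t a, uint64_t b)
{
    return (a + b - 1) / b;
}

/* Blocks needed to hold a free list that names every free block. */
uint64_t free_list_blocks(uint64_t free_bytes, uint64_t block_size,
                          uint64_t entry_size)
{
    assert(entry_size > 0 && block_size >= 2 * entry_size);
    uint64_t free_blocks = free_bytes / block_size;
    /* one entry per list block is reserved for the link */
    uint64_t addrs_per_block = block_size / entry_size - 1;
    return ceil_div(free_blocks, addrs_per_block);
}

/* Blocks needed for a bitmap with one bit per disk block. */
uint64_t bitmap_blocks(uint64_t disk_bytes, uint64_t block_size)
{
    assert(block_size > 0);
    uint64_t blocks = disk_bytes / block_size;
    return ceil_div(blocks, block_size * 8);  /* 8 bits per byte */
}
```

For 16 GB and 1 KB blocks there are 2^24 blocks: the list needs 2^24 / 255 rounded up, the bitmap 2^24 / 8192. The list shrinks as the disk fills, while the bitmap size stays fixed.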


Directory System

• Map ASCII name of file to information needed to locate data (i-node)
– Can also store attributes about file

• UNIX
– Stored like a regular file
– Table of names and i-nodes

Opening a File: /usr/bob/file

• Fetch root dir /
• Look up “usr”
• Get i-node for directory (as a file) usr
• Use i-node to retrieve blocks for directory usr
• Look up “bob”
• Get i-node for directory bob
• Use i-node to retrieve blocks for /usr/bob
• Look up “file”
• Get i-node for file
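The lookup steps for /usr/bob/file can be sketched as a loop over path components. This is an in-memory toy, not a real file system: the i-node numbers, the tables, and the names `sim_dirent`, `sim_inode`, and `path_lookup` are all made up for illustration, and each directory is simply an array of (name, i-node number) pairs.

```c
#include <assert.h>
#include <string.h>

struct sim_dirent { const char *name; int ino; };  /* (file name, i-node #) */

struct sim_inode {
    int is_dir;
    const struct sim_dirent *entries;  /* directory contents when is_dir */
    int n_entries;
};

enum { ROOT_INO = 0 };

static const struct sim_dirent root_dir[] = { { "usr", 1 } };
static const struct sim_dirent usr_dir[]  = { { "bob", 2 } };
static const struct sim_dirent bob_dir[]  = { { "file", 3 } };

/* Hypothetical i-node table: 0 = /, 1 = /usr, 2 = /usr/bob, 3 = the file. */
static const struct sim_inode inode_table[] = {
    { 1, root_dir, 1 },
    { 1, usr_dir,  1 },
    { 1, bob_dir,  1 },
    { 0, NULL,     0 },
};

/* Search one directory for a name; return its i-node number or -1. */
static int dir_lookup(const struct sim_inode *dir, const char *name, size_t len)
{
    if (!dir->is_dir)
        return -1;
    for (int i = 0; i < dir->n_entries; i++) {
        const char *n = dir->entries[i].name;
        if (strlen(n) == len && memcmp(n, name, len) == 0)
            return dir->entries[i].ino;
    }
    return -1;
}

/* Resolve an absolute path one component at a time, starting at the root. */
int path_lookup(const char *path)
{
    assert(path != NULL);
    if (path[0] != '/')
        return -1;
    int ino = ROOT_INO;
    while (*path != '\0') {
        while (*path == '/')
            path++;                          /* skip separators */
        size_t len = strcspn(path, "/");     /* length of next component */
        if (len == 0)
            break;
        ino = dir_lookup(&inode_table[ino], path, len);
        if (ino < 0)
            return -1;
        path += len;
    }
    return ino;
}
```

Each loop iteration corresponds to one "look up name, get i-node" pair in the list; in a real file system every iteration may also cost disk reads for the directory's blocks.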

Directory Representation and Hard Links

• A directory is a file that contains a list of pairs (file name, I-node number)
• Each pair is also called a hard-link
• An I-node may appear in multiple directories.
• A reference count in the I-node keeps track of the number of directories where the I-node appears.
• When the reference-count reaches 0, the file is removed.

Hard Links

• Hard Links cannot cross partitions, that is, a directory cannot list an I-node of a different partition.

• Example: creating a hard link to a target-file in the current directory

    ln target-file name-link

Soft-Links

• Directories may also contain Soft-Links.
• A soft-link is a pair of the form (file name, i-node number of a file that stores a path), where path may be an absolute or relative path in this or another partition.
• Soft-links can point to files in different partitions.
• A soft-link does not keep track of the target file.
• If the target file is removed, the symbolic link becomes invalid (dangling symbolic link).
• Example:

    ln -s target-file name-link

Opening a File

• Once the file i-node is retrieved
– keep track of opened file i-nodes
– store current read/write location within file

    int fseek(FILE *stream, long offset, int whence);
    // whence can be SEEK_SET, SEEK_CUR, or SEEK_END

    off_t lseek(int fd, off_t offset, int whence);
    // whence can be SEEK_SET, SEEK_CUR, or SEEK_END

– store r/w bits
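As a small illustration of the stored read/write location, the sketch below uses the standard C calls `fseek` and `ftell` on a stream. The helper names `stream_size` and `byte_at` are hypothetical; the same idea applies to `lseek` on a file descriptor.

```c
#include <assert.h>
#include <stdio.h>

/* Size of an open stream in bytes, leaving the current position unchanged.
 * Returns -1 on error. */
long stream_size(FILE *stream)
{
    assert(stream != NULL);
    long saved = ftell(stream);              /* remember current location */
    if (saved < 0)
        return -1;
    if (fseek(stream, 0L, SEEK_END) != 0)    /* jump to end of file */
        return -1;
    long size = ftell(stream);
    if (fseek(stream, saved, SEEK_SET) != 0) /* restore location */
        return -1;
    return size;
}

/* Random access: read the byte at an absolute offset, or EOF on failure. */
int byte_at(FILE *stream, long offset)
{
    assert(stream != NULL);
    if (fseek(stream, offset, SEEK_SET) != 0)
        return EOF;
    return fgetc(stream);
}
```

Both helpers only move the per-open position; no data is read until `fgetc` is called.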


Data Structures for a Typical File System

[Figure: Process control block (open file pointer array) → Open file table (system-wide, why?) → i-node table (file i-node)]

Per-process Table Information

• All files that the process has open
• Information regarding the use of the file by the process
• The following item may be stored on a per-file, per-process basis
– Current file pointer indicating the location in the file (current read, write position)
• Each entry in the per-process table points to an entry in the open-file table

Open-file Table Information

• File Open Count
– counter which tracks the number of opens and closes.

• Pointer to I-node
– pointing to the I-node of the file. The I-node has been read from disk to kernel memory upon file open
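The relationship between the per-process table and the open-file table can be sketched with a few C structs. This follows the simplified model on these slides (file position per process, open count and i-node pointer system-wide); all names and table sizes are hypothetical, and real kernels organize these tables differently in detail.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS        8
#define MAX_OPEN_FILES 8

struct mem_inode { int ino; long size; };   /* in-memory copy of an i-node */

struct open_file {                          /* system-wide open-file table entry */
    int open_count;                         /* opens minus closes */
    struct mem_inode *inode;
};

struct fd_entry {                           /* per-process table entry */
    struct open_file *file;                 /* NULL when the slot is unused */
    long offset;                            /* current read/write position */
};

struct process { struct fd_entry fds[MAX_FDS]; };

static struct open_file open_file_table[MAX_OPEN_FILES];

/* Open a file whose i-node is already in memory; returns an fd or -1. */
int sim_open(struct process *p, struct mem_inode *inode)
{
    assert(p != NULL && inode != NULL);
    struct open_file *of = NULL;
    for (int i = 0; i < MAX_OPEN_FILES && of == NULL; i++)  /* already open? */
        if (open_file_table[i].open_count > 0 &&
            open_file_table[i].inode == inode)
            of = &open_file_table[i];
    for (int i = 0; i < MAX_OPEN_FILES && of == NULL; i++)  /* else free slot */
        if (open_file_table[i].open_count == 0) {
            of = &open_file_table[i];
            of->inode = inode;
        }
    if (of == NULL)
        return -1;
    for (int fd = 0; fd < MAX_FDS; fd++)
        if (p->fds[fd].file == NULL) {
            p->fds[fd].file = of;
            p->fds[fd].offset = 0;
            of->open_count++;
            return fd;           /* the fd is just an index into the table */
        }
    return -1;
}

/* Close an fd; when the count reaches 0 the table slot becomes reusable. */
int sim_close(struct process *p, int fd)
{
    assert(p != NULL);
    if (fd < 0 || fd >= MAX_FDS || p->fds[fd].file == NULL)
        return -1;
    p->fds[fd].file->open_count--;
    p->fds[fd].file = NULL;
    return 0;
}
```

Two processes that open the same file end up sharing one open-file entry (count 2) while keeping independent offsets.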


Opening a File (continued)

fd = open(FileName, access)

[Figure: open() flow: PCB → open file table → directory + I-node → file system on disk. Steps: "File name lookup & authenticate", then "Allocate & link up data structures"]

• Definitions:
– File descriptor (fd): an integer used to represent a file, easier than using names
– Directory: locating I-node
– I-node: disk location of data, used to access the “real” data

Reading A Block

[Figure: read path: read(fd, userBuf, size) → PCB → open file table → I-node → logical-to-physical mapping → read(device, phyBlock, size) → disk device driver. The physical block is fetched into sysBuf in the buffer cache (why?) and copied to userBuf]
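The logical-to-physical step starts by deciding which part of the i-node covers a given byte offset. A minimal sketch, assuming the parameters used earlier in these slides (10 direct addresses, 1 KB blocks, 4-byte entries, so 256 addresses per indirect block); the names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE   1024u  /* 1 KB blocks */
#define N_DIRECT     10u    /* direct addresses in the i-node */
#define PTRS_PER_BLK 256u   /* 1024 / 4 addresses per indirect block */

enum ptr_level {
    PTR_DIRECT, PTR_SINGLE, PTR_DOUBLE, PTR_TRIPLE, PTR_OUT_OF_RANGE
};

/* Find which pointer class covers a byte offset. *index_in_level receives
 * the block's position within that class (0 = first block it covers). */
enum ptr_level level_for_offset(uint64_t offset, uint64_t *index_in_level)
{
    assert(index_in_level != NULL);
    uint64_t blk = offset / BLOCK_SIZE;  /* logical block number */
    uint64_t span = N_DIRECT;            /* blocks covered by current class */
    enum ptr_level level = PTR_DIRECT;

    while (level != PTR_OUT_OF_RANGE) {
        if (blk < span) {
            *index_in_level = blk;
            return level;
        }
        blk -= span;                     /* skip the blocks of this class */
        span = (level == PTR_DIRECT) ? PTRS_PER_BLK : span * PTRS_PER_BLK;
        level = (enum ptr_level)(level + 1);
    }
    *index_in_level = 0;
    return PTR_OUT_OF_RANGE;
}
```

A direct hit needs no extra disk reads to find the physical block, while single, double, and triple indirect hits need one, two, and three reads of pointer blocks, which is one reason a buffer cache pays off.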

File System Consistency

• During a crash file system can be damaged

• File system has inherent redundancy
– Reconstruct (fsck, scandisk)

File System Consistency

• Iterate through blocks of files

• Iterate through blocks on free list

• If file system is consistent then each block is in exactly one of the two

Block #:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
Used:     1  1  0  1  0  1  1  1  1  0  0  1  1  1  0  0
Free:     0  0  1  0  1  0  0  0  0  1  1  0  0  0  1  1

File System Consistency

• Missing block: 0 in both rows
– Wastes disk capacity
– Add to free list

• Free list has block twice (what about bitmap?)
– Rebuild free list (complement of used blocks)

• Block present in multiple files
– Allocate a free block and duplicate
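The block check can be sketched as a comparison of two counters per block: how many times the block appears in files and how many times it appears on the free list. The names below are hypothetical, the two arrays are the 16-block example from the earlier slide, and the "used and free" case is included as an additional inconsistency beyond the three listed here.

```c
#include <assert.h>
#include <stddef.h>

enum blk_status {
    BLK_CONSISTENT,        /* in exactly one of the two tables */
    BLK_MISSING,           /* 0 in both rows: add to free list */
    BLK_DUP_IN_FREE_LIST,  /* on free list twice: rebuild free list */
    BLK_DUP_IN_FILES,      /* in several files: copy to a free block */
    BLK_USED_AND_FREE      /* in a file and also marked free */
};

/* Classify one block from its two counters. */
enum blk_status check_block(int used, int free_cnt)
{
    assert(used >= 0 && free_cnt >= 0);
    if (used > 1)
        return BLK_DUP_IN_FILES;
    if (free_cnt > 1)
        return BLK_DUP_IN_FREE_LIST;
    if (used == 0 && free_cnt == 0)
        return BLK_MISSING;
    if (used == 1 && free_cnt == 1)
        return BLK_USED_AND_FREE;
    return BLK_CONSISTENT;
}

/* Number of blocks that fail the check. */
size_t count_inconsistent(const int *used, const int *free_cnt, size_t n)
{
    assert(used != NULL && free_cnt != NULL);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (check_block(used[i], free_cnt[i]) != BLK_CONSISTENT)
            bad++;
    return bad;
}

/* Counters from the 16-block example table. */
const int slide_used[16] = {1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0};
const int slide_free[16] = {0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1};
```

With a bitmap instead of a free list, each block has a single bit, so the "on free list twice" case cannot arise.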

More Cases of Inconsistency

• Blocks allocated to multiple files.

• i-nodes containing block numbers that overlap.

• i-nodes containing block numbers out of range.

• Discrepancies between the number of directory references to a file and the link count of the file.

• i-nodes containing block numbers that are marked free in the disk map.

• i-nodes containing corrupt block numbers.

• Size checks:

– Incorrect number of blocks.

– Directory size not a multiple of 512 bytes.

• Directory checks:

– i-node number out of range.

– Files that are not referenced or directories that are not reachable.


Summary

• File systems need to satisfy three essential requirements
– Store a very large amount of information
– Survive termination of the process using it
– Multiple processes must be able to access the information concurrently