CE01000-3 Operating Systems

CE01000-3 Operating Systems Lecture 19 Linux/Unix – File System


Page 1: CE01000-3 Operating Systems

CE01000-3 Operating Systems

Lecture 19

Linux/Unix – File System

Page 2: CE01000-3 Operating Systems

Overview of lecture

In this lecture we will look at:

Files in Linux/Unix - ordinary files, directory files, special files

Virtual File System

System V - free data blocks, inode free lists, inode disk location mapping

Linux Second Extended File System (Ext2FS) - disk block groups

Page 3: CE01000-3 Operating Systems

Overview of Unix/Linux approach

Philosophy - attempt to treat all resources to which you can output and from which you can input like ordinary data files

Unix/Linux has 3 types of files:

ordinary files - for ordinary user data/programs, etc.

directory files - to structure the file system

special files - for access to I/O and system devices

these are all arranged in a single tree structured hierarchy

Page 4: CE01000-3 Operating Systems

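Tree structured hierarchy (figure)

/ (root) contains the directories bin, dev and home

bin contains cat (ordinary file)

dev contains fd0 (special file)

home contains the directories user1 and user2

myData and myProgram are ordinary files held in a user directory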

Page 5: CE01000-3 Operating Systems

Ordinary files

Ordinary files contain a linear array of bytes - no structure is imposed on bytes by Unix/Linux

reads and writes start at a file pointer

file pointer can be moved anywhere

multiple programs can read/write concurrently to one file (but order of access unpredictable)
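The file pointer behaviour described above can be illustrated with a minimal Python sketch. It uses the standard `os` wrappers around the `write`, `lseek` and `read` system calls; the scratch file and its contents are purely illustrative.

```python
import os
import tempfile

# Create a scratch file; the name is arbitrary and only used for this demo.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"0123456789")      # writing advances the file pointer to offset 10
    os.lseek(fd, 3, os.SEEK_SET)     # move the pointer to byte 3
    middle = os.read(fd, 4)          # read 4 bytes starting at the pointer
    os.lseek(fd, 0, os.SEEK_SET)     # move the pointer back to the start
    os.write(fd, b"AB")              # overwrite the first two bytes in place
    os.lseek(fd, 0, os.SEEK_SET)
    whole = os.read(fd, 100)         # read the whole (unstructured) byte array
finally:
    os.close(fd)
    os.unlink(path)

print(middle, whole)
```

The file is treated as a plain array of bytes throughout: the system imposes no record structure, and every read or write simply starts wherever the pointer currently is.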

Page 6: CE01000-3 Operating Systems

Ordinary files (Cont.)

although user gives names to ordinary files they are identified within system by inode numbers which index an array of data structures called inodes held on the disk

inodes contain administrative information about the file and, where appropriate, location information for the file's data. Specifically, each inode contains:

Page 7: CE01000-3 Operating Systems

Ordinary files (Cont.)

file's device and inode numbers

file type (ordinary, directory, special, etc.)

link count

file owner's user and group ids

file access permissions

major and minor device numbers (for special files)

time of last access/modification/status change

pointers to disk blocks of file contents if data file
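Most of the inode fields listed above can be inspected from user space. The following is a small sketch using Python's `os.stat`, which wraps the `stat` system call; the temporary file is only there to have something to examine, and the dictionary keys are illustrative names, not kernel field names.

```python
import os
import stat
import tempfile

# Make a small scratch file to inspect.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)

info = os.stat(path)
inode_fields = {
    "device": info.st_dev,                        # device holding the file
    "inode": info.st_ino,                         # inode number on that device
    "type_is_regular": stat.S_ISREG(info.st_mode),
    "link_count": info.st_nlink,
    "owner_uid": info.st_uid,
    "owner_gid": info.st_gid,
    "permissions": stat.filemode(info.st_mode),   # e.g. "-rw-------"
    "size_bytes": info.st_size,
    "atime/mtime/ctime": (info.st_atime, info.st_mtime, info.st_ctime),
}
for name, value in inode_fields.items():
    print(f"{name:18} {value}")

os.unlink(path)
```

Note that the file name does not appear anywhere in this information: names live in directories, not in the inode.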

Page 8: CE01000-3 Operating Systems

Directory files

allow file access by name from directory table

table provides logical grouping together of files as defined by user

table provides translation between name and inode number

inode number and file name constitute a link

Page 9: CE01000-3 Operating Systems

Directory to inode mapping (figure)

inode array: entries numbered 338 to 345

MyDir (file name, inode number): File1 340, File2 338, SomeOtherDir 345

SomeOtherDir (file name, inode number): FileA 341, FileB 338

File2 in MyDir and FileB in SomeOtherDir both refer to inode 338

Page 10: CE01000-3 Operating Systems

Directory files (Cont.)

a directory is stored like an ordinary file but has the directory type in its inode. This allows a link in one directory to refer to another directory, allowing tree structures to be built

inode numbers only unique to a given partition of a disk - each partition has its own inode array

symbolic links permit references to files that cross partition boundaries - a symbolic link is a special file type - it consists of a small text file that holds the pathname (absolute or relative) of the linked file - access to the file is indirect (via the pathname in the link) rather than direct to the inode
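A short Python sketch of the indirection a symbolic link introduces. It uses `os.symlink`, `os.readlink` and `os.lstat` (wrappers for the corresponding system calls); the file names are made up for the example.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "target.txt")   # hypothetical file names
link = os.path.join(workdir, "alias")

with open(target, "w") as f:
    f.write("data")

os.symlink(target, link)                       # the link stores a pathname, not an inode number

stored_path = os.readlink(link)                # the pathname held inside the link
# stat() follows the link to the target; lstat() examines the link itself.
same_inode = os.stat(link).st_ino == os.stat(target).st_ino
own_inode = os.lstat(link).st_ino != os.stat(target).st_ino

# Removing the target leaves the link behind, now pointing at nothing.
os.unlink(target)
dangling = os.path.islink(link) and not os.path.exists(link)

os.unlink(link)
os.rmdir(workdir)
print(stored_path, same_inode, own_inode, dangling)
```

Because the link only holds a pathname, nothing stops it from naming a file on another partition, and nothing updates it when the target disappears.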

Page 11: CE01000-3 Operating Systems

Directory files (Cont.)

each file has 2 types of path name - absolute (from root) and relative (from current directory)

directory files can only be written to with special system calls

multiple links to a file are allowed - number of links to file held in link count in inode

when removing a link from a directory other links may still exist, so the link count is just decremented. Only when the link count falls to 0 are the inode entry and file data blocks released back to the system for reuse.
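The link count behaviour can be observed directly. This is a minimal sketch using `os.link` and `os.unlink`; the names File2 and FileB are borrowed from the earlier directory figure purely as illustrative names.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
first = os.path.join(workdir, "File2")
second = os.path.join(workdir, "FileB")

with open(first, "w") as f:
    f.write("shared contents")

count_one = os.stat(first).st_nlink        # one directory entry refers to the inode

os.link(first, second)                     # add a second name for the same inode
count_two = os.stat(first).st_nlink
same_inode = os.stat(first).st_ino == os.stat(second).st_ino

os.unlink(first)                           # remove one link; the inode survives
count_after = os.stat(second).st_nlink
with open(second) as f:
    survived = f.read()

os.unlink(second)                          # last link gone: inode and blocks can be reused
os.rmdir(workdir)
print(count_one, count_two, count_after, survived)
```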

Page 12: CE01000-3 Operating Systems

special files

there are 3 main types:

character special files

block special files

FIFO special files (these are named pipes)

the first 2 are primarily used to access input/output devices (disks, terminals, printers, etc.)

some devices (disks for example) have both character and block special files

Page 13: CE01000-3 Operating Systems

special files (Cont.)

device files have inodes but these contain no reference to data blocks

instead of data blocks major and minor device numbers are stored to access a device driver

adding new device drivers is relatively easy

it is a popular way of adding new facilities into the Unix/Linux kernel

modern systems also support symbolic links to special files
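As a rough illustration, the file type and the device numbers stored in an inode can be read back with Python's `os.lstat` and the `stat` module. The helper name `describe` is hypothetical, and the check on `/dev/null` assumes a Unix-like system where that device file exists.

```python
import os
import stat
import tempfile

def describe(path):
    """Return a short label for the file type recorded in the inode."""
    info = os.lstat(path)
    mode = info.st_mode
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        kind = "character special" if stat.S_ISCHR(mode) else "block special"
        # For device files the inode carries device numbers instead of data blocks.
        return f"{kind} (major {os.major(info.st_rdev)}, minor {os.minor(info.st_rdev)})"
    if stat.S_ISFIFO(mode):
        return "FIFO special"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symbolic link"
    if stat.S_ISREG(mode):
        return "ordinary"
    return "other"

workdir = tempfile.mkdtemp()
fifo = os.path.join(workdir, "pipe")
os.mkfifo(fifo)                              # create a named pipe

labels = {"dir": describe(workdir), "fifo": describe(fifo)}
if os.path.exists("/dev/null"):              # assumption: present on most Unix-like systems
    labels["null"] = describe("/dev/null")

os.unlink(fifo)
os.rmdir(workdir)
print(labels)
```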

Page 14: CE01000-3 Operating Systems

File systems

disks - in general are split into convenient sized partitions - each physical disk having one or more partitions

each partition will have a Unix file system imprinted on it with a partition root directory and its own directory tree underneath

Unix/Linux makes these individual file systems appear to be one directory hierarchy by mounting the root of one file system over a leaf directory in another and making the join appear seamless to user

Page 15: CE01000-3 Operating Systems

Joined file systems (figure: directory tree rooted at /)

Page 16: CE01000-3 Operating Systems

File System Internals

All the disk I/O we have looked at so far has been done via the kernel’s file system using abstractions like file, inode and directory

When a disk file system is mounted into the directory structure the disk has this abstract model built into it

It is also possible to access the physical disks directly via a device special file in the /dev directory

Page 17: CE01000-3 Operating Systems

File system internals (Cont.)

This bypasses all the file abstractions and allows access to the underlying device itself.

In practice most users are denied this low level access by the correct setting of the device special file permission bits.

This is necessary to prevent unauthorised access to programs and data on the disk partitions.

Page 18: CE01000-3 Operating Systems

Virtual File System (figure: layers from top to bottom)

User process

KERNEL: System Call Interface

KERNEL: Virtual File System

KERNEL: System V, MS-DOS, Ext2fs, NFS

KERNEL: Buffer cache

KERNEL: Device drivers

HARDWARE: Disk Controller

Page 19: CE01000-3 Operating Systems

Virtual File System (Cont.)

Linux can support a number of different file systems: Second/Third Extended Filesystem (Linux default filesystems), plus e.g. Network File System (NFS), MS-DOS, Unix System V, HPFS (OS/2, NT), Minix (original Linux filesystem), etc.

it does this by using a virtual file system (VFS) layer

file system calls are passed on to the VFS, which translates the call into the call appropriate to the underlying file system

the VFS specifies a set of functions that every file system it supports has to implement

Page 20: CE01000-3 Operating Systems

Virtual File System (Cont.)

VFS uses a table defined during kernel configuration. Each table entry defines a file system type - file system type name and pointer to function called to mount that type of file system

When file system is mounted, table is looked up and appropriate function called to mount the file system

this function returns a mounted file system descriptor to VFS

Page 21: CE01000-3 Operating Systems

Virtual File System (Cont.)

a mounted file system descriptor includes pointers to functions provided by physical file system kernel code

the VFS then uses descriptor to access the file system internal routines

The VFS also maintains 2 other types of descriptors - inode descriptors and open file descriptors

Page 22: CE01000-3 Operating Systems

Virtual File System (Cont.)

each descriptor contains info about files in use and the set of operations provided by physical file system code

inode descriptor contains pointers to functions that act on any file (e.g. creat(), unlink())

the file descriptor contains pointers to functions that can only act on an open file (e.g. read(), write())
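The dispatch idea can be sketched in a few lines of Python. This is a toy model, not kernel code: `MemFS`, `mount_memfs`, `FILE_SYSTEM_TYPES`, `vfs_mount` and `vfs_call` are all hypothetical names, and the "descriptor" is just a dictionary of function pointers split into inode operations and open-file operations.

```python
class MemFS:
    """A toy 'physical' file system that keeps file contents in a dict."""

    def __init__(self):
        self.files = {}

    # Inode operations: may act on any file, open or not.
    def creat(self, name):
        self.files[name] = b""

    def unlink(self, name):
        del self.files[name]

    # Open-file operations.
    def read(self, name, offset, count):
        return self.files[name][offset:offset + count]

    def write(self, name, offset, data):
        old = self.files[name]
        self.files[name] = old[:offset] + data + old[offset + len(data):]
        return len(data)


def mount_memfs():
    """Mount function: returns a mounted file system descriptor."""
    fs = MemFS()
    return {
        "inode_ops": {"creat": fs.creat, "unlink": fs.unlink},
        "file_ops": {"read": fs.read, "write": fs.write},
    }


# Table of known file system types: type name -> mount function.
FILE_SYSTEM_TYPES = {"memfs": mount_memfs}
MOUNTED = {}   # mount point -> descriptor returned by the mount function


def vfs_mount(fs_type, mount_point):
    MOUNTED[mount_point] = FILE_SYSTEM_TYPES[fs_type]()


def vfs_call(mount_point, table, op, *args):
    # The VFS never looks inside the file system; it only follows the pointers.
    return MOUNTED[mount_point][table][op](*args)


vfs_mount("memfs", "/mnt")
vfs_call("/mnt", "inode_ops", "creat", "notes")
written = vfs_call("/mnt", "file_ops", "write", "notes", 0, b"hello")
data = vfs_call("/mnt", "file_ops", "read", "notes", 0, 5)
print(written, data)
```

Supporting another file system type in this model means adding one entry to the table and supplying the same set of functions, which mirrors the requirement that every supported file system implements the functions the VFS specifies.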

Page 23: CE01000-3 Operating Systems

Unix System V

System V filesystem is the classic early Unix filesystem

on a System V filesystem disk partition, the disk blocks would have the following layout:

Page 24: CE01000-3 Operating Systems

Unix System V (Cont.)

boot block (block 0) - holds boot up code if bootable partition, if not then left unused

super block (block 1) - contains information about the file system as a whole, especially:

the free block list (a list of all the data blocks that are free for use)

a free inode list - for rapid allocation of inodes

Page 25: CE01000-3 Operating Systems

Unix System V (Cont.)

inodes (block 2 to n) - disk blocks that hold inodes for filesystem - number of disk blocks allocated to inodes is fixed when filesystem created (hence max number of inodes fixed for given partition)

data blocks (blocks n+1 to end) - disk blocks for actual file/directory data
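Because the inode blocks start at a fixed place and every inode has the same size, an inode number can be turned into a disk location by simple arithmetic. The sketch below assumes 1024-byte blocks, 64-byte inodes and inode numbering that starts at 1; these values and the function name `inode_location` are assumptions for illustration, not figures taken from the lecture.

```python
BLOCK_SIZE = 1024          # assumed block size in bytes
INODE_SIZE = 64            # assumed size of one on-disk inode
FIRST_INODE_BLOCK = 2      # blocks 0 and 1 hold the boot block and super block
INODES_PER_BLOCK = BLOCK_SIZE // INODE_SIZE

def inode_location(inode_number):
    """Map an inode number (starting at 1) to (disk block, byte offset in block)."""
    if inode_number < 1:
        raise ValueError("inode numbers start at 1 in this sketch")
    index = inode_number - 1
    block = FIRST_INODE_BLOCK + index // INODES_PER_BLOCK
    offset = (index % INODES_PER_BLOCK) * INODE_SIZE
    return block, offset

print(inode_location(1))     # first inode sits at the start of block 2
print(inode_location(340))
```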

Page 26: CE01000-3 Operating Systems

Free data block list

initially the list of free data blocks fills the space in the super block; the overflow is then held in free data blocks, with the last block number in the super block (or in each subsequent data block) pointing to the block that holds the next part of the list, giving a linked list

Figure: chained free block lists (x marks an entry already allocated)

Super block list: 200 199 .... .... .... 162 160 x x

Data block 200: 286 284 .... .... .... 207 204 202 201

Data block 286: 357 355 .... .... .... 297 293 292 290 (first entry leads to data block 357)

Page 27: CE01000-3 Operating Systems

Free data block list (Cont.)

allocation of free blocks proceeds from the first free block entry - this is repeated until only one free block entry is left in the super block

prior to allocation of that data block its contents are copied into the super block

when data blocks become free they are added to the list in the super block, but if the super block list is full then the super block entries are copied to the newly freed data block and replaced by a single entry in the super block pointing to that data block
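The allocation and freeing rules above can be simulated in a few lines. This is a simplified in-memory sketch: `FreeBlockList` and `CAPACITY` are hypothetical names, the capacity of 4 is far smaller than a real super block list, and the contents of free data blocks are modelled with a dictionary rather than real disk blocks.

```python
CAPACITY = 4   # hypothetical; a real super block holds many more entries

class FreeBlockList:
    def __init__(self):
        self.super_list = []   # free block numbers held in the super block
        self.stored = {}       # block number -> list saved inside that free block

    def free(self, block):
        if len(self.super_list) == CAPACITY:
            # Super block list is full: copy it into the newly freed block
            # and leave a single entry pointing at that block.
            self.stored[block] = self.super_list
            self.super_list = [block]
        else:
            self.super_list.append(block)

    def allocate(self):
        if not self.super_list:
            raise RuntimeError("no free blocks")
        block = self.super_list.pop()
        if not self.super_list and block in self.stored:
            # That was the last entry: copy the list it holds into the
            # super block before handing the block out.
            self.super_list = self.stored.pop(block)
        return block

blocks = FreeBlockList()
for number in range(1, 7):
    blocks.free(number)          # freeing block 5 overflows the super block list

order = [blocks.allocate() for _ in range(6)]
print(order)
```

Only the super block list is consulted on most calls; a block holding saved entries is read only when the list in the super block runs down to its last entry.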

Page 28: CE01000-3 Operating Systems

Free inode list

Each inode has a flag within its struct to indicate whether or not it is free

so could conduct a simple linear search through inodes to find free inodes - but inefficient

super block contains a list of free inodes - these are allocated from first to last until list is empty

the inode array is then searched, starting from the number of the last free inode in the list, to refill the inode list in the super block

when an inode is freed that has a lower number than those in the free list, the last inode number is replaced with this new number so that the next search through the inodes will start from the new location

Page 29: CE01000-3 Operating Systems

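Free inode list (Cont.)

Figure: successive states of the free inode list in the super block. Each row gives the index followed by the list entries; x marks an entry that has been allocated.

Initial free list: index 100, free inodes 43 42 41 40

Allocate inodes: index 100, free inodes 43 42 x x

Until last is taken: index 100, empty list x x x x

Refill free list: index 182, free inodes 104 103 102 101

I-72 becomes free: index 72, free inodes 104 103 102 101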
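A simplified simulation of the free inode list is sketched below. It is not the exact System V algorithm: `InodeAllocator`, `LIST_SIZE` and `search_from` are hypothetical names, the list is tiny, and the "remembered" inode number is modelled as the position where the next refill scan starts.

```python
LIST_SIZE = 4   # hypothetical; the real super block list is larger

class InodeAllocator:
    def __init__(self, total_inodes):
        # is_free[n] plays the role of the free flag inside inode n (entry 0 unused).
        self.is_free = [True] * (total_inodes + 1)
        self.is_free[0] = False
        self.free_list = []      # free inode numbers cached in the super block
        self.search_from = 1     # where the next linear search will start

    def _refill(self):
        n = self.search_from
        while n < len(self.is_free) and len(self.free_list) < LIST_SIZE:
            if self.is_free[n]:
                self.free_list.append(n)
            n += 1
        self.search_from = n

    def allocate(self):
        if not self.free_list:
            self._refill()       # only search the inode array when the list is empty
        if not self.free_list:
            raise RuntimeError("no free inodes")
        n = self.free_list.pop(0)
        self.is_free[n] = False
        return n

    def release(self, n):
        self.is_free[n] = True
        if len(self.free_list) < LIST_SIZE:
            self.free_list.append(n)
        elif n < self.search_from:
            # List is full: just remember the lower number so the next
            # search starts early enough to find this inode.
            self.search_from = n

inodes = InodeAllocator(10)
first_batch = [inodes.allocate() for _ in range(5)]
inodes.release(2)
inodes.release(1)
restart = inodes.search_from
second_batch = [inodes.allocate() for _ in range(5)]
print(first_batch, restart, second_batch)
```

The point of the cached list is that most allocations cost no search at all; the linear scan over the inode array happens only on a refill.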

Page 30: CE01000-3 Operating Systems

inode data block pointers

Page 31: CE01000-3 Operating Systems

inode data block pointers (Cont.)

Mechanism for locating file data blocks from inode for file.

System V inode contains 13 pointers:

the first 10 point directly at file data blocks

the 11th pointer (single indirect) points at a data block that contains pointers to file data blocks (typically 256 pointers)

the 12th pointer (double indirect) points at a data block that contains pointers to data blocks that contain pointers to file data blocks

Page 32: CE01000-3 Operating Systems

inode data block pointers (Cont.)

the 13th pointer (triple indirect) points at a data block that contains pointers to data blocks that contain pointers to data blocks that contain pointers to file data blocks

this arrangement means that small files (the majority of files) can be accessed very rapidly, whereas larger files may take a larger number of disk accesses to locate their blocks.
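The cost of reaching a given block can be worked out from the pointer layout. The sketch below assumes 10 direct pointers and 256 pointers per indirect block, as quoted above; the function name `pointer_level` is hypothetical, and the count of extra reads ignores any caching.

```python
DIRECT = 10                 # direct pointers in the inode
POINTERS_PER_BLOCK = 256    # pointers held by one indirect block

def pointer_level(logical_block):
    """Return (level, extra disk reads) needed to find a file's n-th block (0-based)."""
    n = logical_block
    if n < DIRECT:
        return ("direct", 0)
    n -= DIRECT
    if n < POINTERS_PER_BLOCK:
        return ("single indirect", 1)
    n -= POINTERS_PER_BLOCK
    if n < POINTERS_PER_BLOCK ** 2:
        return ("double indirect", 2)
    n -= POINTERS_PER_BLOCK ** 2
    if n < POINTERS_PER_BLOCK ** 3:
        return ("triple indirect", 3)
    raise ValueError("beyond the maximum file size")

# Largest number of data blocks one inode can address with this layout.
MAX_BLOCKS = (DIRECT + POINTERS_PER_BLOCK
              + POINTERS_PER_BLOCK ** 2 + POINTERS_PER_BLOCK ** 3)

print(pointer_level(3), pointer_level(200), MAX_BLOCKS)
```

With 1024-byte blocks (an assumption consistent with 256 four-byte pointers per block), `MAX_BLOCKS` corresponds to a maximum file size of roughly 16 GiB, while any file of up to 10 blocks needs no indirect reads at all.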

Page 33: CE01000-3 Operating Systems

Second extended filesystem

Ext2fs has 15 pointers in the inode - 12 direct, 1 indirect, 1 double indirect and 1 triple indirect

Ext2fs does not have fixed size records for file names, but variable length records permitting long file names (max 255 chars) without wasting space
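Variable length directory records can be sketched with Python's `struct` module. The layout used here (4-byte inode number, 2-byte record length, 1-byte name length, 1-byte file type, then the name padded to a multiple of 4 bytes) is modelled on the ext2 on-disk directory entry but is an assumption of this sketch, as are the helper names; details such as the final record filling the rest of the block are omitted.

```python
import struct

# Fixed header: inode number, record length, name length, file type (little endian).
HEADER = struct.Struct("<IHBB")

def pack_entry(inode, name, file_type=1):
    raw = name.encode("utf-8")
    if not 1 <= len(raw) <= 255:
        raise ValueError("name must be 1 to 255 bytes long")
    rec_len = (HEADER.size + len(raw) + 3) & ~3      # round up to a multiple of 4
    body = HEADER.pack(inode, rec_len, len(raw), file_type) + raw
    return body.ljust(rec_len, b"\0")                # pad with zero bytes

def unpack_entries(block):
    entries, pos = [], 0
    while pos < len(block):
        inode, rec_len, name_len, _ = HEADER.unpack_from(block, pos)
        start = pos + HEADER.size
        entries.append((block[start:start + name_len].decode("utf-8"), inode))
        pos += rec_len                               # record length gives the next entry
    return entries

block = (pack_entry(340, "File1") + pack_entry(338, "File2")
         + pack_entry(345, "SomeOtherDir", file_type=2))
print(len(block), unpack_entries(block))
```

A short name costs only a few bytes beyond the header, while a long name simply produces a longer record, so no space is reserved for the 255-character maximum.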

Page 34: CE01000-3 Operating Systems

Second extended filesystem (Cont.)

disk partition structure changed - it now consists of a boot block followed by several block groups

Partition layout (figure): boot block, block group 1, block group 2, ......., block group n

Page 35: CE01000-3 Operating Systems

Disk block groups

each block group contains duplicates of critical info especially super block and the file system descriptors - improves reliability

Block group layout (figure): super block, FS descriptors, block bitmap, inode bitmap, inodes, file data blocks

Page 36: CE01000-3 Operating Systems

Disk block groups (Cont.)

each block group holds some of the inodes and file data blocks (attempts to keep inodes and their data blocks in close proximity)

Also the free inode list and free data block list are replaced by an inode bitmap and a block bitmap

Page 37: CE01000-3 Operating Systems

Disk block groups (Cont.)

bitmap has a single bit for each inode or block in the block group

if a bit is set to 1 then the inode or block is in use; if set to 0 then it is free

allocation/deallocation consists of searching through the bitmap and setting bits appropriately

bitmaps are small and can thus be held in memory while the filesystem is mounted - very fast to search
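Bitmap allocation can be sketched as follows. The class name `Bitmap` is hypothetical, and the sketch assumes the convention used by the ext2 on-disk format, in which a set bit (1) marks an entry as in use and a clear bit (0) marks it as free.

```python
class Bitmap:
    """One bit per inode or block in a block group (1 = in use, 0 = free)."""

    def __init__(self, count):
        self.count = count
        self.bits = bytearray((count + 7) // 8)   # all zero: everything starts free

    def in_use(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def allocate(self):
        # Search for the first clear bit and set it.
        for n in range(self.count):
            if not self.in_use(n):
                self.bits[n // 8] |= 1 << (n % 8)
                return n
        raise RuntimeError("block group is full")

    def release(self, n):
        # Clear the bit; the mask keeps the value within one byte.
        self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF

bitmap = Bitmap(10)
taken = [bitmap.allocate() for _ in range(3)]
bitmap.release(1)
reused = bitmap.allocate()
print(taken, reused, len(bitmap.bits))
```

Ten entries fit in two bytes here, which illustrates why the bitmaps for a whole block group are small enough to keep in memory.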

Page 38: CE01000-3 Operating Systems

Third Extended File System

Ext3fs replaces Ext2fs – major improvements over Ext2fs

Provides ability to log intended changes to filesystem – makes recovery from crashes/failures easier

Allows Htree indexing of large directories

Htree is a form of B-tree keyed on a hash of the file name, which allows very quick searching

Page 39: CE01000-3 Operating Systems

References

Operating System Concepts. Chapter 22 & Appendix C.