File Systems cs550 Operating Systems David Monismith.

Files

File - sequence of bytes

File attributes

• Size

• Location

• Type

• Permissions

• Time stamps

Folders

• Path name specifies the sequence of folders users must traverse to travel down the tree to the file

E.g. the absolute path for B2 is /abc/def/B2

        abc
       /   \
     def    A1
    /   \
   B1    B2

• Files also have relative paths

– Sequence of folders users must traverse starting at an intermediate place

• Files often have symbolic links or shortcuts, too

• These take you directly to another folder where the file resides

– Fast way to get to the file you want

File Access Methods

1. Sequential

2. Random

What types of storage devices allow for these types of access?

• Operations including open, close, read, and write allow interaction with files

• Open

File Access - Open

• Needed to read or write

• OS needs to know where file is located

• Requires directory traversal

• Don't want to do this with every read and write, because it requires disk accesses

Access Methods (Contd.)

• Require open call to avoid overhead

• Do directory traversal once

• Cache directory entry in main memory

– Get fast access to file

– Call to open returns a "file handle" identifying the cached directory entry

• Typically must identify whether you are reading or writing and whether sequential or random access will be used

• Writing can start at the end of the file (append) or the beginning
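As a rough illustration of the open call described above, the sketch below uses Python's os module, where the "file handle" is a small integer file descriptor. The scratch file name demo.txt is made up for the example, and the flags shown are just one way to request write, append, or read access.

```python
import os
import tempfile

# Hypothetical scratch file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Open for writing, creating the file if needed. The OS resolves the
# path once and returns an integer handle (a file descriptor).
handle = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
os.write(handle, b"first line\n")
os.close(handle)

# Reopen in append mode: writes start at the end of the file.
handle = os.open(path, os.O_WRONLY | os.O_APPEND)
os.write(handle, b"second line\n")
os.close(handle)

# Reopen read-only and read the contents back through the new handle.
handle = os.open(path, os.O_RDONLY)
contents = os.read(handle, 100)
os.close(handle)

print(contents)
```

Every read and write after the open call names only the handle, never the path, so the directory traversal is paid for once.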

File Access Methods - Close

• Several functions exist to perform a close operation

• The operation destroys the cached directory entry

• Ensures all data has been written to the file (it may be buffered)

• If the file is on removable media, makes sure the media is ready to be removed

File Access - Reading

• Read data from file into memory space of process requesting data

• File Mgmt System needs several pieces of information

– Which file? Use handle from open call

– Where is the data in the file? Typically don't read a whole file at once; instead, read data blocks and indicate what portion of the file to read

– Where will data be stored? Typically in a buffer in memory

– How much data to read

Typically a system level read call is as follows:

read(handle, file position, buffer, length)
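A minimal sketch of the read call above, assuming a POSIX-style interface as exposed by Python's os module. The helper name read_at and the file data.bin are hypothetical; real system calls usually split the positioning (seek) and the transfer (read) into two steps, as done here.

```python
import os
import tempfile

def read_at(handle, position, buffer, length):
    """Read up to `length` bytes starting at `position` into `buffer`.

    Returns the number of bytes actually read.
    """
    os.lseek(handle, position, os.SEEK_SET)   # where is the data in the file?
    data = os.read(handle, length)            # how much data to read
    buffer[:len(data)] = data                 # where the data will be stored
    return len(data)

# Hypothetical 16-byte file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"0123456789abcdef")

handle = os.open(path, os.O_RDONLY)           # which file? the handle from open
buffer = bytearray(4)
count = read_at(handle, 10, buffer, 4)
os.close(handle)

print(count, bytes(buffer))
```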

File Access - Write

• Similar to read (but opposite)

• Write data from process memory space to file

• Again, File Mgmt. System needs several pieces of information

– Which file? Need file handle

– Where to store data? Need to write portions of the file (blocks)

– Where is the buffer that stores the data in memory?

– How much data?

Typically done with the following pseudocode:

write(handle, file position, buffer, length)
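The write pseudocode can be sketched the same way. The helper write_at and the file name out.txt are assumptions for illustration only; the example overwrites five bytes in the middle of an existing file.

```python
import os
import tempfile

def write_at(handle, position, buffer, length):
    """Write the first `length` bytes of `buffer` at `position`.

    Returns the number of bytes written.
    """
    os.lseek(handle, position, os.SEEK_SET)          # where to store the data
    return os.write(handle, bytes(buffer[:length]))  # how much data

# Hypothetical file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "out.txt")
with open(path, "wb") as f:
    f.write(b"hello world")

handle = os.open(path, os.O_RDWR)
written = write_at(handle, 6, bytearray(b"WORLD!!"), 5)
os.close(handle)

with open(path, "rb") as f:
    result = f.read()

print(written, result)
```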

Sequential Access

• If file is open for sequential access, file mgmt subsystem keeps track of current file position for reading and writing

• System maintains a file position pointer for position of next read or write

• Often set as follows:

– Set to zero to start reading or writing at the beginning of a file

– Set to the end of the file to append data

• Pointer is incremented by the amount of data read or written after each read or write
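The file position pointer can be observed directly. In this sketch (Python's os module, hypothetical file seq.txt), os.lseek with an offset of zero relative to the current position simply reports where the pointer is.

```python
import os
import tempfile

# Hypothetical 10-byte file used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "seq.txt")
with open(path, "wb") as f:
    f.write(b"abcdefghij")

# Sequential reads: the pointer starts at zero and advances by the
# number of bytes read each time.
handle = os.open(path, os.O_RDONLY)
positions = [os.lseek(handle, 0, os.SEEK_CUR)]
os.read(handle, 4)
positions.append(os.lseek(handle, 0, os.SEEK_CUR))
os.read(handle, 3)
positions.append(os.lseek(handle, 0, os.SEEK_CUR))
os.close(handle)

# Appending: move the pointer to the end of the file, then write.
handle = os.open(path, os.O_WRONLY)
os.lseek(handle, 0, os.SEEK_END)
os.write(handle, b"XYZ")
end_position = os.lseek(handle, 0, os.SEEK_CUR)
os.close(handle)

print(positions, end_position)
```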

Streams, Pipes, and Redirection

• We looked at (used) many of these but we haven't analyzed them

• Stream - flow of data bytes (sequentially) into the process (read) and out of the process (write).

• Used by Java, C++, and Windows .NET

Pipes

A pipe is a dynamic connection established between two processes

• A pipe has a process at one end that writes and a process at the other end that reads

• Allows for interprocess communication

• Data flows one way

• Pipes can improve system performance by not needing to use a file

• And by allowing for buffered data to be sent from one process to another as the initial process is still reading it

• Pipes often use a buffer in main memory as a temporary holding place for data (actually an implementation of the producer-consumer problem)
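One way to see a pipe connecting two processes is Python's subprocess module. In this sketch the parent process writes text into a pipe attached to the child's stdin, and a second pipe carries the child's stdout back; each pipe carries data in one direction only. The child program (an upper-casing one-liner) is made up for the example.

```python
import subprocess
import sys

# Hypothetical child program: read everything from stdin, upper-case it,
# and write it to stdout.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

# subprocess.run creates the pipes, writes `input` into the child's
# stdin, and collects whatever the child writes to its stdout.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="data sent through a pipe",
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout

print(received)
```

No temporary file is involved: the data passes through in-memory buffers managed by the operating system.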

Standard I/O Files

When a process executes, the file management system creates three standard I/O files

• stdin - takes input from the user keyboard (typically)

• stdout - typically writes to the screen

• stderr - used for process error messages

• Each file is a pipe that is connected to a system module that communicates with the appropriate device

• The command line interface supports I/O redirection

• Using redirection, the user can request that the CLI

– Take data from a file or another process instead of stdin

– Send data to another file or process instead of stdout or stderr

Redirection

File redirection means taking a program's input from a file, or sending its output to a file, instead of using the keyboard or screen, via the command line. A pipe (|) sends the output of one program directly to the input of another.

An example of this follows below

dir > tmp.txt

more < tmp.txt

dir | more
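What the command line interface does for the three commands above can be approximated from a program. This sketch uses Python's subprocess module; because dir and more are not available on every system, two small stand-in programs (lister and pager, both hypothetical) play their roles.

```python
import os
import subprocess
import sys
import tempfile

# Stand-ins so the sketch runs anywhere: `lister` plays the role of dir,
# `pager` plays the role of more (it just copies stdin to stdout).
lister = [sys.executable, "-c", "print('a.txt'); print('b.txt')"]
pager = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read())"]

tmp_path = os.path.join(tempfile.mkdtemp(), "tmp.txt")

# dir > tmp.txt : stdout of the process is connected to a file.
with open(tmp_path, "w") as out:
    subprocess.run(lister, stdout=out, check=True)

# more < tmp.txt : stdin of the process is connected to a file.
with open(tmp_path) as src:
    from_file = subprocess.run(pager, stdin=src, capture_output=True,
                               text=True, check=True).stdout

# dir | more : stdout of one process feeds stdin of another via a pipe.
producer = subprocess.Popen(lister, stdout=subprocess.PIPE)
piped = subprocess.run(pager, stdin=producer.stdout, capture_output=True,
                       text=True, check=True).stdout
producer.stdout.close()
producer.wait()

print(from_file == piped)
```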

Other I/O Calls

Many other system I/O calls exist, including

• flush

• directory functions

• create file

• delete file

• rename file

• file exists

• directory list

• get and set attributes
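Most of the calls listed above have direct counterparts in Python's os module, which is used here only as one convenient example of such an interface. The file names notes.txt and renamed.txt are made up.

```python
import os
import tempfile

folder = tempfile.mkdtemp()                 # directory function: create
old_path = os.path.join(folder, "notes.txt")
new_path = os.path.join(folder, "renamed.txt")

with open(old_path, "w") as f:              # create file
    f.write("hello")
    f.flush()                               # flush buffered data to the OS

os.rename(old_path, new_path)               # rename file
exists_after_rename = os.path.exists(new_path)   # file exists
listing = os.listdir(folder)                # directory list

size = os.stat(new_path).st_size            # get attributes
os.utime(new_path, (0, 0))                  # set attributes (time stamps)
modified = os.stat(new_path).st_mtime

os.remove(new_path)                         # delete file
exists_after_delete = os.path.exists(new_path)

print(listing, size, modified, exists_after_rename, exists_after_delete)
```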

File space allocation

Allocate space for files

• Simple to use contiguous allocation

• Optimal performance for application reading/writing the file

Problems

• Need to know how much space the file will take up

• After a long run time, the file system becomes cluttered

• Need to move files around

Cluster allocation

• Used to solve these problems

• Allocate chunks of data and link them together

• Can also have indexed clusters
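A toy sketch of linked cluster allocation, not tied to any real file system. The table next_cluster, the END marker, and both helper functions are hypothetical; the idea is only that each cluster records which cluster comes next, so a file can be built from whatever free clusters exist.

```python
END = -1  # marks the last cluster of a file

def allocate_linked(free_clusters, next_cluster, count):
    """Take `count` clusters from the free list and link them into a chain.

    Returns the number of the first cluster in the chain.
    """
    if count < 1 or count > len(free_clusters):
        raise ValueError("invalid cluster count")
    chain = [free_clusters.pop(0) for _ in range(count)]
    for current, following in zip(chain, chain[1:]):
        next_cluster[current] = following
    next_cluster[chain[-1]] = END
    return chain[0]

def follow_chain(next_cluster, first):
    """List the clusters of a file by following the links from `first`."""
    clusters = []
    current = first
    while current != END:
        clusters.append(current)
        current = next_cluster[current]
    return clusters

# The free clusters are scattered; the file does not need them contiguous.
free_clusters = [2, 5, 6, 9, 12]
next_cluster = {}
first = allocate_linked(free_clusters, next_cluster, 3)
file_clusters = follow_chain(next_cluster, first)

print(file_clusters, free_clusters)
```

An indexed scheme would instead store the whole list of cluster numbers in one place, so that any cluster can be found without walking the chain.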

Calculating read and write addresses

• Typically use file pointer

• Use offset to get to address

• More complicated when using clusters

• Data requests may cross cluster boundaries
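The address calculation can be sketched as integer division and remainder on the file pointer. The function split_request and the 4096-byte cluster size are assumptions for illustration; the function breaks one request into per-cluster pieces when it crosses a cluster boundary.

```python
def split_request(position, length, cluster_size):
    """Split a request into (cluster index, offset in cluster, byte count).

    The cluster index counts clusters within the file (0 = first cluster).
    """
    pieces = []
    remaining = length
    while remaining > 0:
        index = position // cluster_size      # which cluster of the file
        offset = position % cluster_size      # where inside that cluster
        chunk = min(remaining, cluster_size - offset)
        pieces.append((index, offset, chunk))
        position += chunk
        remaining -= chunk
    return pieces

# A 2000-byte read starting at byte 3000 crosses from cluster 0 into
# cluster 1 when clusters are 4096 bytes.
pieces = split_request(position=3000, length=2000, cluster_size=4096)
print(pieces)
```

With linked or indexed clusters, each cluster index would then be translated to a physical cluster number before the disk is accessed.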

Free space management

• Keep track of clusters allocated to each file

• Use linked list

• Or use bit mapping

• Each bit in the bitmap represents a cluster

On – used

Off - free
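A minimal sketch of bit mapping for free space, with one bit per cluster packed eight to a byte. The class name ClusterBitmap and its methods are hypothetical.

```python
class ClusterBitmap:
    """One bit per cluster: 1 (on) means used, 0 (off) means free."""

    def __init__(self, cluster_count):
        self.cluster_count = cluster_count
        self.bits = bytearray((cluster_count + 7) // 8)

    def is_used(self, cluster):
        return bool(self.bits[cluster // 8] & (1 << (cluster % 8)))

    def mark_used(self, cluster):
        self.bits[cluster // 8] |= 1 << (cluster % 8)

    def mark_free(self, cluster):
        self.bits[cluster // 8] &= ~(1 << (cluster % 8)) & 0xFF

    def find_free(self):
        """Return the lowest-numbered free cluster, or None if full."""
        for cluster in range(self.cluster_count):
            if not self.is_used(cluster):
                return cluster
        return None

bitmap = ClusterBitmap(16)
for cluster in (0, 1, 2):
    bitmap.mark_used(cluster)
first_free = bitmap.find_free()

bitmap.mark_free(1)          # e.g. a file occupying cluster 1 was deleted
reused = bitmap.find_free()

print(first_free, reused)
```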

Disk fragmentation

External

• File broken into many clusters scattered across disk

• More clusters = poor performance

• Can allocate larger clusters, but need more defragmentation, i.e. cluster rearrangement

Internal

• If using large clusters, small files use lots of disk space

• Waste may be a concern
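The waste from internal fragmentation is simple arithmetic: a file occupies a whole number of clusters, so the unused tail of its last cluster is lost. The function name and the sizes below are chosen only for illustration.

```python
def wasted_bytes(file_size, cluster_size):
    """Bytes allocated to a file but not used by its data."""
    clusters = -(-file_size // cluster_size)   # ceiling division
    return clusters * cluster_size - file_size

# The same 100-byte file under a small and a large cluster size.
small_cluster_waste = wasted_bytes(100, 512)
large_cluster_waste = wasted_bytes(100, 65536)

print(small_cluster_waste, large_cluster_waste)
```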

Next time: Real world file systems