CS 162 Section Lecture 8. What happens when you issue a read() or write() request?


What happens when you issue a read() or write() request?


Life Cycle of An I/O Request

[Figure: layers an I/O request passes through, from top to bottom: User Program, Kernel I/O Subsystem, Device Driver Top Half, Device Driver Bottom Half, Device Hardware]


When should you return from the read()/write() call?


Interface Timing

• Blocking Interface: “Wait”
  – When requesting data (e.g., read() system call), put process to sleep until data is ready
  – When writing data (e.g., write() system call), put process to sleep until device is ready for data

• Non-blocking Interface: “Don’t Wait”
  – Returns quickly from read or write request with count of bytes successfully transferred to kernel
  – Read may return nothing, write may write nothing

• Asynchronous Interface: “Tell Me Later”
  – When requesting data, take pointer to user’s buffer, return immediately; later kernel fills buffer and notifies user
  – When sending data, take pointer to user’s buffer, return immediately; later kernel takes data and notifies user
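
The difference between "wait" and "don't wait" can be seen with a pipe standing in for a device. This is a minimal sketch, assuming a POSIX system where Python's `os.set_blocking` is available; the function name `nonblocking_read_demo` is made up for illustration.

```python
import os


def nonblocking_read_demo():
    """Read from an empty pipe in non-blocking mode, then again after a write."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # "Don't Wait": read() returns immediately

    try:
        os.read(read_fd, 16)
        first = "data"
    except BlockingIOError:
        # No data yet. A blocking descriptor would have put the process to sleep here.
        first = "would block"

    os.write(write_fd, b"hello")
    second = os.read(read_fd, 16)  # data is ready now, so the read succeeds

    os.close(read_fd)
    os.close(write_fd)
    return first, second
```

With the default (blocking) mode, the first `os.read` would not return until some process wrote to the pipe.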


Magnetic Disk Characteristics

• Cylinder: all the tracks under the head at a given point on all surfaces
• Reading/writing data is a three-stage process:
  – Seek time: position the head/arm over the proper track (into proper cylinder)
  – Rotational latency: wait for the desired sector to rotate under the read/write head
  – Transfer time: transfer a block of bits (sector) under the read/write head
• Disk Latency = Queuing Time + Controller Time + Seek Time + Rotation Time + Xfer Time
• Highest Bandwidth:
  – Transfer large group of blocks sequentially from one track

[Figure: disk geometry with labels Sector, Track, Cylinder, Head, Platter]

[Figure: request path: Request → Software Queue (Device Driver) → Hardware Controller → Media Time (Seek+Rot+Xfer) → Result]


We have a disk with the following parameters:

• 1TB in size
• 7200 RPM, data transfer rate of 40 Mbytes/s (40 × 10^6 bytes/sec)
• Average seek time of 6ms
• ATA controller with 2ms controller initiation time
• A block size of 4Kbytes (4096 bytes)

What is the average time to read a random block from the disk?
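
One way to work the question is to plug the parameters into the disk latency formula. The sketch below assumes no queuing delay and an average rotational latency of half a rotation; the function and parameter names are chosen here for illustration.

```python
def average_read_time_ms(rpm, seek_ms, controller_ms, block_bytes,
                         bytes_per_sec, queue_ms=0.0):
    """Disk latency = queuing + controller + seek + rotation + transfer (all in ms)."""
    full_rotation_ms = 60_000 / rpm              # 7200 RPM -> about 8.33 ms per rotation
    rotational_latency_ms = full_rotation_ms / 2  # on average, wait half a rotation
    transfer_ms = block_bytes / bytes_per_sec * 1000
    return queue_ms + controller_ms + seek_ms + rotational_latency_ms + transfer_ms


t = average_read_time_ms(rpm=7200, seek_ms=6, controller_ms=2,
                         block_bytes=4096, bytes_per_sec=40e6)
print(f"{t:.2f} ms")  # 2 + 6 + 4.17 + 0.10, roughly 12.27 ms
```

Seek and rotation dominate; the 4 KB transfer itself contributes only about 0.1 ms, which is why sequential access gives the highest bandwidth.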


SSD

– No penalty for random access
– Rule of thumb: writes 10x more expensive than reads, and erases 10x more expensive than writes (read 25μs)
– Limited drive lifespan
– Controller maintains pool of empty pages by coalescing used sectors (read, erase, write), also reserves some % of capacity
– Controller uses ECC, performs wear leveling
– OS may provide TRIM information about “deleted” sectors (normally only file system knows about unallocated blocks, not the disk drive)
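
Applying the 10x rule of thumb to the 25μs read figure gives rough write and erase costs. This is only the arithmetic implied by the slide, not measured numbers for any particular drive; the constant names are illustrative.

```python
READ_US = 25               # read latency from the slide, in microseconds
WRITE_US = 10 * READ_US    # writes roughly 10x more expensive than reads
ERASE_US = 10 * WRITE_US   # erases roughly 10x more expensive than writes

print(READ_US, WRITE_US, ERASE_US)  # 25 250 2500
```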


How will you allocate space on disk?


What is the purpose of a File System?


File System

• Transforms blocks into Files and Directories

• Optimize for access and usage patterns

• Maximize sequential access, allow efficient random access


Linked Allocation: File-Allocation Table (FAT)


If entry size is 16 bits, what is the max size of the FAT?


Given a 512-byte block, what is the max size of the FS?


What is the space overhead of FAT?
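
The three FAT questions can be worked through with a few lines of arithmetic. The sketch assumes one FAT entry per disk block and that every 16-bit value is usable as a block number (real FAT16 reserves a few values); the variable names are illustrative.

```python
ENTRY_BITS = 16
BLOCK_BYTES = 512

entries = 2 ** ENTRY_BITS            # a 16-bit entry can name 65,536 blocks
entry_bytes = ENTRY_BITS // 8        # 2 bytes per entry

fat_bytes = entries * entry_bytes    # max size of the FAT itself
fs_bytes = entries * BLOCK_BYTES     # max size of the file system
overhead = entry_bytes / BLOCK_BYTES # FAT bytes spent per byte of data

print(fat_bytes // 1024, "KiB FAT")      # 128 KiB
print(fs_bytes // 2 ** 20, "MiB FS")     # 32 MiB
print(f"{overhead:.2%} space overhead")  # 0.39%
```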


Multilevel Indexed Files (UNIX 4.1)
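
To make the multilevel index concrete, the sketch below computes how many levels of indirection are needed to reach a given file block. The layout is an assumption, not taken from the slide: 10 direct pointers plus one single, one double, and one triple indirect pointer, with 1 KB blocks holding 256 pointers each.

```python
DIRECT = 10            # assumed number of direct pointers in the inode
PTRS_PER_BLOCK = 256   # assumed: 1 KB block / 4-byte pointer
BLOCK_SIZE = 1024      # assumed block size in bytes

MAX_FILE_BLOCKS = DIRECT + sum(PTRS_PER_BLOCK ** k for k in (1, 2, 3))


def indirection_level(block_index):
    """Return 0 for a direct block, 1/2/3 for single/double/triple indirect."""
    if block_index < 0:
        raise ValueError("block index must be non-negative")
    first = 0           # first block index covered by the current level
    capacity = DIRECT   # number of blocks covered by the current level
    for level in range(4):
        if block_index < first + capacity:
            return level
        first += capacity
        capacity = PTRS_PER_BLOCK ** (level + 1)
    raise ValueError("block index beyond maximum file size")


print(MAX_FILE_BLOCKS * BLOCK_SIZE)  # 17247250432 bytes, about 16 GiB
```

Each level of indirection costs one extra index-block read before the data block itself, so small files are cheap to reach while blocks deep in a large file are not.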


Where are the i-nodes stored?


What are problems with multi-level indexed files?


Directory Structure


What can the FS do to improve performance?


Bitmap of free blocks
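
A free-block bitmap keeps one bit per disk block. The class below is a hypothetical in-memory sketch (the name `FreeBlockBitmap` and the first-fit policy are choices made here, not something the slide specifies).

```python
class FreeBlockBitmap:
    """One bit per block: 0 = free, 1 = in use."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)

    def is_free(self, block):
        return ((self.bits[block // 8] >> (block % 8)) & 1) == 0

    def allocate(self):
        """Mark the lowest-numbered free block as used and return it."""
        for block in range(self.num_blocks):
            if self.is_free(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise RuntimeError("no free blocks")

    def free(self, block):
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

Because neighbouring bits describe neighbouring blocks, a scan of the bitmap can also look for runs of free blocks to keep a file contiguous.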


Variable sized splits


Cylinder Groups


File System Caching

• Optimizations for sequential access:
  – Try to store consecutive blocks of a file near each other
  – Store inode near data blocks
  – Try to locate directory near the inodes it points to

• Buffer cache used to increase file system performance
  – Read-ahead prefetching and delayed writes

• Key Idea: Exploit locality by caching data in memory
  – Name translations: mapping from paths → inodes
  – Disk blocks: mapping from block address → disk content

• Buffer Cache: memory used to cache kernel resources, including disk blocks and name translations
  – Can contain “dirty” blocks (blocks not yet on disk)
  – Size: adjust boundary dynamically so that the disk access rates for paging and file access are balanced


File System Caching (cont’d)

• Delayed Writes: writes to files not immediately sent out to disk
  – Instead, write() copies data from user space buffer to kernel buffer (in cache)
    » Enabled by presence of buffer cache: can leave written file blocks in cache for a while
    » If some other application tries to read data before it is written to disk, file system will read from cache
  – Flushed to disk periodically (e.g., in UNIX, every 30 sec)
  – Advantages:
    » Disk scheduler can efficiently order lots of requests
    » Disk allocation algorithm can be run with correct size value for a file
    » Some files need never get written to disk! (e.g., temporary scratch files written to /tmp often don’t exist for 30 sec)
  – Disadvantages:
    » What if system crashes before file has been written out?
    » Worse yet, what if system crashes before a directory file has been written out? (lose pointer to inode!)
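
The delayed-write behaviour can be sketched with a toy write-back cache. Everything here is illustrative: a Python dict stands in for the disk, and the class name `WriteBackCache` is invented for the example.

```python
class WriteBackCache:
    """Toy buffer cache: writes stay in memory until flush() is called."""

    def __init__(self, disk):
        self.disk = disk     # dict of block number -> bytes, standing in for the disk
        self.cache = {}      # cached block contents
        self.dirty = set()   # blocks modified in cache but not yet on disk

    def write(self, block, data):
        self.cache[block] = data   # copy into the kernel buffer only
        self.dirty.add(block)

    def read(self, block):
        if block not in self.cache:          # miss: fetch from disk
            self.cache[block] = self.disk[block]
        return self.cache[block]             # hit: may return not-yet-flushed data

    def flush(self):
        """Write dirty blocks in block order (a stand-in for disk scheduling)."""
        for block in sorted(self.dirty):
            self.disk[block] = self.cache[block]
        written = len(self.dirty)
        self.dirty.clear()
        return written
```

Anything still in `dirty` when the machine crashes is lost, which is exactly the disadvantage listed above.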


Log Structured and Journaled File Systems

• Better reliability through use of log
  – All changes are treated as transactions
  – A transaction is committed once it is written to the log
    » Data forced to disk for reliability
    » Process can be accelerated with NVRAM
  – Although file system may not be updated immediately, data preserved in the log

• Difference between “Log Structured” and “Journaled”
  – In a Log Structured file system, data stays in log form
  – In a Journaled file system, log used for recovery

• For Journaled system:
  – Log used to asynchronously update filesystem
    » Log entries removed after use
  – After crash:
    » Remaining transactions in the log performed (“Redo”)
    » Modifications done in a way that can survive crashes

• Examples of Journaled File Systems:
  – Ext3 (Linux), XFS (Unix), HFS+ (Mac), NTFS (Windows), etc.
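
The "Redo" step after a crash can be sketched as replaying committed log entries. The log format below (a list of dicts with `committed` and `writes` fields) is a made-up illustration, not the on-disk layout of any real journaled file system.

```python
def recover(disk, log):
    """Redo every committed transaction left in the log, then empty the log."""
    redone = 0
    for txn in log:
        if not txn["committed"]:
            continue  # never committed: its writes are discarded
        for block, data in txn["writes"]:
            disk[block] = data  # idempotent: safe to repeat if we crash again
        redone += 1
    log.clear()  # log entries are removed after use
    return redone
```

Writing whole block contents rather than deltas is what makes the replay safe to run more than once.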


Key Value Store

• Very large scale storage systems
• Two operations
  – put(key, value)
  – value = get(key)
• Challenges
  – Fault tolerance → replication
  – Scalability → serve get()’s in parallel; replicate/cache hot tuples
  – Consistency → quorum consensus to improve put() performance


Key Value Store

• Also called a Distributed Hash Table (DHT)
• Main idea: partition set of key-values across many machines

[Figure: (key, value) pairs partitioned across machines]
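
The partitioning idea can be sketched with a hash of the key choosing the machine. This is a simplification: in-process dicts stand in for machines, the class name `PartitionedStore` is invented here, and plain hash-mod-N is used instead of the consistent hashing that Chord relies on.

```python
import hashlib


class PartitionedStore:
    """Toy key-value store that spreads keys over several 'machines'."""

    def __init__(self, num_machines):
        self.machines = [{} for _ in range(num_machines)]

    def _owner(self, key):
        # Stable hash of the key, reduced to a machine index.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.machines)

    def put(self, key, value):
        self.machines[self._owner(key)][key] = value

    def get(self, key):
        return self.machines[self._owner(key)].get(key)
```

Every client computes the same owner for a key, so `get()` requests for different keys can be served by different machines in parallel.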


Chord Lookup

• Each node maintains pointer to its successor

• Route packet (Key, Value) to the node responsible for ID using successor pointers

• E.g., node=4 looks up the node responsible for Key=37

[Figure: Chord ring with nodes 4, 8, 15, 20, 32, 35, 44, 58; lookup(37) starts at node 4 and follows successor pointers; node=44 is responsible for Key=37]
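
The successor-pointer lookup in the example can be traced in a few lines. This sketch holds the whole ring in one sorted list, which a real node would not have; the function names are illustrative.

```python
def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], allowing wrap-around."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(nodes, start, key):
    """Follow successor pointers from `start`; return (responsible node, hops)."""
    ring = sorted(nodes)
    i = ring.index(start)
    hops = 0
    while True:
        node = ring[i]
        successor = ring[(i + 1) % len(ring)]
        if in_half_open(key, node, successor):
            return successor, hops   # the successor is responsible for the key
        i = (i + 1) % len(ring)      # forward the request to the successor
        hops += 1


print(lookup([4, 8, 15, 20, 32, 35, 44, 58], start=4, key=37))  # (44, 5)
```

With only successor pointers the request visits nodes 8, 15, 20, 32 and 35 before node 44 is identified, so the cost grows linearly with the number of nodes.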


Chord

• Highly scalable distributed lookup protocol
• Each node needs to know about O(log(M)) other nodes, where M is the total number of nodes
• Guarantees that a tuple is found in O(log(M)) steps
• Highly resilient: works with high probability even if half of nodes fail
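
The logarithmic bound comes from Chord's finger tables, where entry i of node n points to the first node at or after n + 2^i on the ring. The sketch below assumes a 6-bit identifier space and reuses the node IDs 4, 8, 15, 20, 32, 35, 44, 58 from the lookup example; it builds the tables from a global node list, which is a shortcut a real deployment would not have.

```python
ID_BITS = 6                      # assumed identifier space: 0..63
RING_SIZE = 2 ** ID_BITS
NODES = sorted([4, 8, 15, 20, 32, 35, 44, 58])


def successor(ident):
    """First node whose ID is >= ident, wrapping around the ring."""
    ident %= RING_SIZE
    for node in NODES:
        if node >= ident:
            return node
    return NODES[0]


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def fingers(node):
    """Finger i points at successor(node + 2**i)."""
    return [successor(node + 2 ** i) for i in range(ID_BITS)]


def find_successor(node, key):
    """Return (node responsible for key, number of hops taken)."""
    hops = 0
    while True:
        succ = successor(node + 1)
        if in_half_open(key, node, succ):
            return succ, hops
        # Jump to the farthest finger that still precedes the key.
        nxt = succ
        for finger in reversed(fingers(node)):
            if in_open(finger, node, key):
                nxt = finger
                break
        node = nxt
        hops += 1


print(find_successor(4, 37))  # (44, 3): hops via 20, 32, 35
```

On this ring the same lookup takes 3 hops instead of the 5 needed when following successor pointers alone.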