Chapter 6 Distributed File Systems. Topics Review of UNIX Sun NFS VFS architecture caching.


Topics
• Review of UNIX
• Sun NFS
• VFS architecture
• Caching


Layered Structure
• Directory service
    • Mapping: file name to unique file ID
    • Access control
• File service
    • Mapping: file ID to inode
    • File access
• Block service
    • Block management
    • Device access

[Figure: three stacked layers: directory service, file service, block service.]


Hierarchical Directory Systems
• A general hierarchy: a tree of directories

[Figure: a directory tree with the root directory at the top, nested directories (including a user directory) below it, and files at the leaves.]


File System Layout
• The disk is divided into several partitions; each partition holds one file system
• MBR (master boot record): boots the computer and contains the partition table
• Partition table
    • Starting and ending addresses of each partition
    • One partition is marked as active
• Within each partition
    • Boot block: the first block; holds a program that loads the OS
    • Superblock: key parameters of the file system

[Figure: disk layout. The MBR is followed by partitions 1 to 4; each partition contains a boot block, superblock, free space management, i-nodes, root directory, and files and directories.]


Implementing Files
• Key issue: how to keep track of which disk sectors go with which file?
• E.g., block size = 512 B, file size = 2014 B: where on disk are these ceil(2014 / 512) = 4 blocks?
• Many methods, each with its own pros and cons
    • Contiguous allocation
    • Linked list allocation
    • I-nodes
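The block arithmetic in the example can be checked with a short sketch. The function names below are illustrative only and do not come from the slides.

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks needed to hold file_size bytes (rounds up)."""
    return (file_size + block_size - 1) // block_size


def block_index(offset: int, block_size: int) -> int:
    """Logical block number that contains the byte at the given file offset."""
    return offset // block_size


print(blocks_needed(2014, 512))  # 4, matching the slide's example
print(block_index(1500, 512))    # 2, since bytes 1024..1535 live in block 2
```

Each allocation method listed above is a different answer to the follow-up question: given a logical block number, where is that block on disk?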


Index Nodes (i-nodes)
• An i-node lists the attributes and the disk addresses of the file’s blocks
• An i-node needs to be in memory only while its file is open
    • The in-memory i-node table is much smaller than a FAT
    • Its size does not depend on the size of the disk

[Figure: an i-node holding the file attributes, the addresses of disk blocks 0 to 3, and the address of a block of pointers, which is a disk block containing additional disk addresses.]


i-node and 3-level index

[Figure: an i-node with entries numbered 1 to 15. Entries 13, 14, and 15 lead to index blocks of 1K pointers each, up to three levels deep; data blocks are 4 KB.]
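A rough sketch of what a 3-level index implies for file size follows. It assumes, based on the figure labels only, that entries 1 to 12 are direct pointers, entries 13 to 15 are single, double, and triple indirect pointers, each index block holds 1K (1024) pointers, and blocks are 4 KB. These constants are assumptions, not values stated in the text.

```python
BLOCK_SIZE = 4 * 1024      # assumed from the "4 KB" label in the figure
PTRS_PER_BLOCK = 1024      # assumed from the "1K pointers" label
DIRECT = 12                # assumed: entries 1..12 point straight at data blocks
LEVELS = 3                 # single, double, and triple indirect


def max_file_blocks() -> int:
    """Largest number of data blocks one i-node can address."""
    return DIRECT + sum(PTRS_PER_BLOCK ** k for k in range(1, LEVELS + 1))


def index_level(block_no: int) -> int:
    """How many index blocks must be traversed to reach a logical block."""
    if block_no < DIRECT:
        return 0
    remaining = block_no - DIRECT
    for level in range(1, LEVELS + 1):
        span = PTRS_PER_BLOCK ** level
        if remaining < span:
            return level
        remaining -= span
    raise ValueError("block number beyond the reach of the i-node")


print(max_file_blocks() * BLOCK_SIZE / 2**40)  # maximum file size in TiB
print(index_level(5), index_level(12), index_level(2000))
```

Under these assumptions small files are reached with no extra disk reads, while only very large files pay for two or three levels of indirection.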


Managing open files in File Service layer
• OFT: Open File Table (one entry per open)

[Figure: the parent’s OFT and the child’s OFT (per process, swappable) point into the system OFT (kernel-resident, stores position pointers), which points into the in-core inode table; the inodes and data themselves are disk-resident.]


Implementing Directories
• Directory system: maps the ASCII file name onto the information needed to locate the data
• Directory entry: where are the attributes stored?
    • In the directory entry (MS-DOS/Windows)
    • In the i-nodes (UNIX)

[Figure: two directories containing Games, Mail, News, and Work. In the DOS/Windows version each entry holds the attributes itself; in the UNIX version each entry refers to an i-node holding the file attributes and the addresses of disk blocks 0, 1, and so on.]


Implementing Directories: Example

[Figure: example directory entries (foo, bin, usr, vmunix, local) mapped to i-node numbers, with i-node link counts (Lnk_cnt=2, Lnk_cnt=1) and data blocks containing "Hello world!" and "/usr/bin".]


Locate A File: /usr/ast/mbox

Root directory   I-node 6      Block 132          I-node 26        Block 406
                 (for /usr)    (/usr directory)   (for /usr/ast)   (/usr/ast directory)
 1  .            Attr.          6  .              Attr.            26  .
 1  ..           132            1  ..             406               6  ..
 4  bin          …..           19  dick           …..              64  grants
 7  dev                        30  erik                            92  books
14  lib                        51  jim                             60  mbox
 9  etc                        26  ast                             81  minix
 6  usr                        45  bal                             17  src
 8  tmp

1. Looking up usr in the root directory yields i-node 6
2. I-node 6 says that /usr is in block 132
3. In block 132, /usr/ast is i-node 26
4. I-node 26 says that /usr/ast is in block 406
5. In block 406, /usr/ast/mbox is i-node 60
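The lookup steps can be simulated in a few lines. This is a simplified sketch: it keys each directory by its i-node number and skips the i-node-to-block indirection (i-node 6 to block 132, i-node 26 to block 406), and the `DIRECTORIES` table and `lookup` function are illustrative names, not part of any real kernel.

```python
ROOT_INODE = 1

# Directory contents keyed by the directory's own i-node number.
# Values taken from the example tables for /, /usr, and /usr/ast.
DIRECTORIES = {
    1: {"bin": 4, "dev": 7, "lib": 14, "etc": 9, "usr": 6, "tmp": 8},
    6: {"dick": 19, "erik": 30, "jim": 51, "ast": 26, "bal": 45},
    26: {"grants": 64, "books": 92, "mbox": 60, "src": 17},
}


def lookup(path: str) -> int:
    """Resolve an absolute path to an i-node number, one component at a time."""
    inode = ROOT_INODE
    for name in path.split("/"):
        if not name:          # skip the empty strings produced by leading "/"
            continue
        directory = DIRECTORIES.get(inode)
        if directory is None or name not in directory:
            raise FileNotFoundError(path)
        inode = directory[name]
    return inode


print(lookup("/usr/ast/mbox"))  # 60
```

Each loop iteration corresponds to one directory read, which is why deep path names cost more disk accesses than shallow ones.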


How to Share A File?
• If the directory entry holds the addresses of the disk blocks
    • What about newly appended blocks?
• Store the addresses of disk blocks separately
    • The UNIX i-node approach
• Symbolic linking: create a link file containing the path name

[Figure: three versions of Dir A, Dir B, and Dir C sharing File 1. (1) The directory entry contains the disk address. (2) Both directory entries point to the same i-node. (3) Symbolic linking: a link file in one directory contains the path ../Dir C/File1.]


Caching
• Reserve a set of blocks in main memory as a cache of disk sectors
• How does the cache work?
• Maintenance of the cache
    • Like page replacement: FIFO, LRU, etc.

[Figure: a hash table indexes the cached blocks, which are chained in a list from front (LRU) to rear (MRU).]
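A minimal sketch of an LRU block cache follows. It uses Python's `OrderedDict` to play both roles from the figure: the hash table for lookup and the front-to-rear ordering for replacement. The `BlockCache` class and the `read_block` callback are hypothetical names chosen for illustration.

```python
from collections import OrderedDict


class BlockCache:
    """LRU cache of disk blocks: front of the dict is LRU, rear is MRU."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block      # called on a miss to fetch from disk
        self.blocks = OrderedDict()       # block number -> block data
        self.hits = 0
        self.misses = 0

    def get(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)   # now the most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_block(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used
        return data


cache = BlockCache(2, lambda n: f"block-{n}".encode())
for n in (1, 2, 1, 3, 2):
    cache.get(n)
print(cache.hits, cache.misses, list(cache.blocks))  # 1 4 [3, 2]
```

Pure LRU is only one policy; the next slide explains why a file system cache may want to write some blocks back sooner than LRU order would suggest.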


Write Important Blocks Back First
• Write critical blocks back to disk immediately after they are updated (write-through)
    • Greatly reduces the probability of inconsistency
• Write-through cache: modified blocks are written back immediately
    • Compare with delayed write
• Don’t keep data blocks in memory for too long
    • Force synchronization periodically (every 30 sec)


Block Read Ahead
• If a file is read sequentially, read block (k+1) while block k is in use by a process
• If a file is accessed randomly, read ahead wastes bandwidth
• Detect the access pattern of each open file
    • Switch read ahead on or off according to the current pattern
• Q: how can this be used on stateless or stateful servers?
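One simple way to detect the access pattern is to remember the last block read and prefetch only when the next request is for the following block. The sketch below assumes that rule; the `ReadAheadFile` class and its `fetch_block` callback are hypothetical names.

```python
class ReadAheadFile:
    """Per-open-file state that switches read ahead on for sequential access."""

    def __init__(self, fetch_block):
        self.fetch_block = fetch_block   # reads one block from disk or server
        self.last_block = None           # block number of the previous read
        self.prefetched = {}             # at most one block read ahead
        self.fetches = []                # log of block numbers actually fetched

    def _fetch(self, k):
        self.fetches.append(k)
        return self.fetch_block(k)

    def read(self, k):
        sequential = self.last_block is not None and k == self.last_block + 1
        self.last_block = k
        if k in self.prefetched:
            data = self.prefetched.pop(k)    # read ahead paid off
        else:
            data = self._fetch(k)
        self.prefetched.clear()              # drop any useless prefetch
        if sequential:
            self.prefetched[k + 1] = self._fetch(k + 1)
        return data


f = ReadAheadFile(lambda k: f"data-{k}")
for k in (0, 1, 2, 7):
    f.read(k)
print(f.fetches)  # [0, 1, 2, 3, 7]: block 3 was prefetched, then wasted
```

The state kept here (`last_block`) lives with the open file, which is why the question of stateless versus stateful servers matters: a stateless server has no open-file record in which to keep it, so the detection would have to happen on the client.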


Mapping file systems to physical devices


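Mounting

[Figure: a root file system (/ with bin, etc, usr; files cc, date, sh, passwd, getty) and a second file system on /dev/sd0g (/ with bin, src, include; files yacc, ban, awk, uts, stdio.h) attached to the root file system at a mount point.]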


man mount
• mount attaches a file system to the file system hierarchy at the mount_point, which is the pathname of a directory. If mount_point has any contents prior to the mount operation, these are hidden until the file system is unmounted.
• The table of currently mounted file systems can be found by examining the mounted file system information file. This is provided by a file system that is usually mounted on /etc/mnttab.


NFS Architecture


Stateless File Server
• Robust in the face of failures, but
    • Not all operations are idempotent, e.g., the lock operation
    • Longer messages
    • Longer processing time
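The trade-off can be illustrated with a toy read operation. In the sketch below, which uses made-up names (`FILES`, `stateless_read`, `StatefulServer`) and an in-memory dictionary in place of a real server, the stateless request carries the handle, offset, and count every time, so a retransmitted request returns the same bytes. The stateful version keeps the position on the server, so a duplicate request silently advances it.

```python
FILES = {"fh-1": b"hello, distributed world"}   # hypothetical handle -> contents


def stateless_read(handle, offset, count):
    """Every request is self-contained, so repeating it is harmless."""
    return FILES[handle][offset:offset + count]


class StatefulServer:
    """Keeps a per-handle file position, like a local open file table."""

    def __init__(self):
        self.positions = {}

    def open(self, handle):
        self.positions[handle] = 0

    def read(self, handle, count):
        pos = self.positions[handle]
        data = FILES[handle][pos:pos + count]
        self.positions[handle] = pos + len(data)
        return data


# A client that times out and retries sends the same request twice.
print(stateless_read("fh-1", 0, 5), stateless_read("fh-1", 0, 5))

server = StatefulServer()
server.open("fh-1")
print(server.read("fh-1", 5), server.read("fh-1", 5))  # the retry gets new data
```

The extra arguments in `stateless_read` are the "longer messages" from the list above, and a lock request has no equivalent trick because granting a lock necessarily changes server state.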


Transparency
• Location transparency
    • The path name (i.e., the full name of the file) does not say where the file is located.
• Location independence
    • The path name is independent of the server, so a file can be moved from server to server without changing its name.
    • Have a namespace of files and assign parts of it (dynamically) to certain servers. This namespace would be the same on all machines in the system.
• Root transparency (a made-up name)
    • / is the same on all systems
    • This would ruin some conventions like /tmp


NFS Protocols
• Mounting
    • Analyze the pathname
    • Request and store the file handle
    • Static and automatic mounting
• Directory and file access
    • Supports most UNIX calls
    • No support for open() and close()


VFS/v-node Architecture
• Motivation: let an arbitrary collection of clients and servers share a common file system
    • Requires a file-system-independent framework for file access
• v-node (virtual i-node): one for every open file in the VFS layer
    • Tells whether a directory or file is local
    • For a remote file, contains a pointer to an r-node (remote i-node) in the NFS client
• VFS: represents any file system
    • Well-defined interface
    • One for each file system


Virtual File System


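v-node
• Data fields (struct vnode): v_flag, v_count, v_type, v_vfsmountedhere, …, v_data, v_op
• Methods (struct vnodeops): vop_open, vop_lookup, vop_read, vop_mkdir, vop_getaddr, …

[Figure: the data fields form the FS-independent part of the v-node. v_data points to the FS-dependent data (an r-node); v_op points to the interface definition, whose FS-dependent implementation of vnodeops is shared among UNIX vnodes.]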
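The split between the FS-independent v-node and the FS-dependent operations can be sketched as a small dispatch table. This is an illustrative Python model, not kernel code: the classes `LocalOps` and `NfsOps`, the `vfs_read` function, the `fake_rpc_read` stub, and the handle "fh-7" are all hypothetical, and only the field names v_type, v_count, v_op, v_data, and the method name vop_read come from the slide.

```python
class VnodeOps:
    """Interface definition: each file system supplies its own implementation."""

    def vop_read(self, vnode, offset, count):
        raise NotImplementedError


class LocalOps(VnodeOps):
    def vop_read(self, vnode, offset, count):
        # v_data stands in for the local file's contents here.
        return vnode.v_data[offset:offset + count]


class NfsOps(VnodeOps):
    def vop_read(self, vnode, offset, count):
        # v_data is an r-node: it holds the server's file handle.
        rnode = vnode.v_data
        return rnode["rpc_read"](rnode["handle"], offset, count)


class Vnode:
    """FS-independent part: generic fields plus pointers to the FS-dependent part."""

    def __init__(self, v_type, v_op, v_data):
        self.v_type = v_type
        self.v_count = 1
        self.v_op = v_op
        self.v_data = v_data


def vfs_read(vnode, offset, count):
    """The VFS layer never checks the file system type; it just follows v_op."""
    return vnode.v_op.vop_read(vnode, offset, count)


def fake_rpc_read(handle, offset, count):
    """Stand-in for a remote procedure call to an NFS server."""
    remote_files = {"fh-7": b"remote data"}
    return remote_files[handle][offset:offset + count]


local = Vnode("VREG", LocalOps(), b"local data")
remote = Vnode("VREG", NfsOps(), {"handle": "fh-7", "rpc_read": fake_rpc_read})
print(vfs_read(local, 0, 5), vfs_read(remote, 0, 6))
```

Because one `LocalOps` or `NfsOps` object can serve every v-node of its file system, the operations table is shared rather than copied per file.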


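VFS implementation
• Data fields (struct vfs): vfs_next, vfs_fstype, vfs_vnodecovered, …, vfs_data, vfs_op
• Methods (struct vfsops): vfs_mount, vfs_root, vfs_unmount, vfs_sync, vfs_statvfs, …

[Figure: the data fields form the FS-independent part of the vfs structure. vfs_data points to the FS-dependent data; vfs_op points to the interface definition and its FS-dependent implementation of vfsops.]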


Struct vfs instance
• vfs_data
• vfs_ops
• vfs_next: pointer to the next mounted FS
• vfs_fstype: ufs, nfs, ext2fs, etc.


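Mounting

[Figure: rootvfs heads a list of vfs structures, one for the root file system and one for the mounted file system. Each vfs has a ROOT vnode (/ for the root file system, / for the mounted one). The vnode of the mount directory /usr belongs to the root vfs and records which vfs is mounted here; the mounted vfs records the vnode it covers.]

• v-nodes for mounted-on directories are kept in main memory.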


Implementation
• Server: exports one or more of its directories for access by remote clients
    • Listed in the /etc/exports file, e.g.,
        /usr/local -access=hostA:hostB
        /usr/bin -ro
• Client: mounts the exported directories
    • They become part of its directory hierarchy
    • No difference between a local file and a remote file
• Two clients can communicate by sharing files in their common directories.


Mount A Remote File System
• Call the mount program, specifying the remote directory and the local mount point.
    • E.g., mount -t msdos /dev/ad0s1 /mnt/windows
    • E.g., mount indus:/usr/src /usr/src
• Parse the name and find the server
• Contact the server
• Receive the file handle
• Create a v-node for the remote directory in the VFS layer
• Create an r-node in the NFS client, pointed to by the v-node


Mount (1)

Mounting (part of) a remote file system in NFS.


Mount (2)

Mounting nested directories from multiple servers in NFS.


Automounting (1)

ps -fe | grep automount


Automounting (2)

Using symbolic links with automounting.

• Can also be used with file replication.


Open A Remote File
• Parse the file name
• Get the v-node and r-node of the mounted file system
• Ask the NFS client to open the file
• Contact the server and get the file handle for the opened file
• The NFS client creates an r-node for the file
• The VFS layer creates a v-node for the file


File Attributes (1)

Attribute   Description
TYPE        The type of the file (regular, directory, symbolic link)
SIZE        The length of the file in bytes
CHANGE      Indicator for a client to see if and/or when the file has changed
FSID        Server-unique identifier of the file's file system

Some general mandatory file attributes in NFS.


File Attributes (2)

Attribute      Description
ACL            An access control list associated with the file
FILEHANDLE     The server-provided file handle of this file
FILEID         A file-system unique identifier for this file
FS_LOCATIONS   Locations in the network where this file system may be found
OWNER          The character-string name of the file's owner
TIME_ACCESS    Time when the file data were last accessed
TIME_MODIFY    Time when the file data were last modified
TIME_CREATE    Time when the file was created

Some general recommended file attributes.


Semantics of File Sharing (1)

a) On a single processor, when a read follows a write, the value returned by the read is the value just written.

b) In a distributed system with caching, obsolete values may be returned.


Semantics of File Sharing (2)

Method              Comment
UNIX semantics      Every operation on a file is instantly visible to all processes
Session semantics   No changes are visible to other processes until the file is closed
Immutable files     No updates are possible; simplifies sharing and replication
Transaction         All changes occur atomically

Modified session semantics: changes to an open file are initially visible only to the processes on the same machine. Once the file is closed, the changes become visible to other machines.


UNIX Semantics
• UNIX itself probably doesn't quite provide this.
    • If a write is large (several blocks), the kernel does a seek for each block
    • During a seek, the process sleeps (in the kernel)
    • Another process can be writing a range of blocks that intersects the blocks of the first write.
    • Depending on disk scheduling, the final contents may not correspond to any single last write.
• Perhaps UNIX semantics means: a read returns the value stored by the last write, provided one exists.


File Locking in NFS

Operation   Description
Lock        Creates a lock for a range of bytes
Lockt       Tests whether a conflicting lock has been granted
Locku       Removes a lock from a range of bytes
Renew       Renews the lease on a specified lock

More complicated with file replication.
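The first three operations can be modeled with a small in-memory lock table. This is a sketch of byte-range conflict detection only, under the assumption that two locks conflict when their ranges overlap, their owners differ, and at least one is exclusive. It does not model the NFS wire protocol or leases, and the `LockTable` class is a hypothetical name.

```python
class LockTable:
    """Toy byte-range lock manager for a single file."""

    def __init__(self):
        self.locks = []   # entries are (owner, start, end, exclusive)

    def lockt(self, owner, start, length, exclusive):
        """Return a conflicting lock held by another owner, or None."""
        end = start + length
        for held in self.locks:
            h_owner, h_start, h_end, h_exclusive = held
            overlaps = start < h_end and h_start < end
            if overlaps and h_owner != owner and (exclusive or h_exclusive):
                return held
        return None

    def lock(self, owner, start, length, exclusive):
        """Grant the lock if nothing conflicts; report success."""
        if self.lockt(owner, start, length, exclusive) is not None:
            return False
        self.locks.append((owner, start, start + length, exclusive))
        return True

    def locku(self, owner, start, length):
        """Remove the owner's lock on exactly this range; report success."""
        before = len(self.locks)
        key = (owner, start, start + length)
        self.locks = [held for held in self.locks if held[:3] != key]
        return len(self.locks) < before


table = LockTable()
print(table.lock("A", 0, 100, exclusive=True))    # True: nothing held yet
print(table.lock("B", 50, 10, exclusive=False))   # False: overlaps A's lock
print(table.locku("A", 0, 100))                   # True: A releases
print(table.lock("B", 50, 10, exclusive=False))   # True: now granted
```

With replicated files, every replica would need a consistent view of this table, which is one reason locking becomes more complicated there.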


Client Caching (1)

Q: where to put the cache? a) user space b) kernel space


Client Caching (2)

Using the NFS version 4 callback mechanism to recall file delegation.


Lease
• When a client wants a file, the server gives it a lease that specifies how long the copy is valid
• The client renews the lease before it expires
• No message is sent when a lease expires
• What about a client crash? What about a server crash?
    • Lease time and reboot time
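A lease can be sketched as an expiry time recorded per client and file. The code below is a minimal model with an injectable clock so that it can be exercised without waiting; the `LeaseServer` class, its method names, and the 30-second duration are illustrative assumptions.

```python
class LeaseServer:
    """Grants time-limited leases; expiry needs no message, only a clock."""

    def __init__(self, duration, clock):
        self.duration = duration
        self.clock = clock      # function returning the current time in seconds
        self.expiry = {}        # (client, file) -> time at which the lease ends

    def grant(self, client, file):
        self.expiry[(client, file)] = self.clock() + self.duration
        return self.expiry[(client, file)]

    def is_valid(self, client, file):
        return self.clock() < self.expiry.get((client, file), float("-inf"))

    def renew(self, client, file):
        """Extend a lease that is still valid; an expired one must be re-requested."""
        if not self.is_valid(client, file):
            return False
        self.grant(client, file)
        return True


now = [0.0]                                  # fake clock for the demonstration
server = LeaseServer(duration=30, clock=lambda: now[0])
server.grant("client-1", "/usr/src/a.c")
now[0] = 20
print(server.renew("client-1", "/usr/src/a.c"))     # True: renewed until t=50
now[0] = 60
print(server.is_valid("client-1", "/usr/src/a.c"))  # False: lapsed silently
```

Because validity depends only on the clock, a crashed client's lease simply runs out, which is the property the crash questions above are probing.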


Cache Management Algorithms

Method                Comment
Write through         Works, but heavy network traffic
Delayed write         Better performance but possibly ambiguous semantics
Write on close        Matches session semantics
Centralized control   UNIX semantics, but not robust and scales poorly


General Principles for DS
Proposed by Satyanarayanan
• Clients have cycles to burn
• Cache whenever possible
• Exploit the usage properties
• Minimize system-wide knowledge and change
• Trust the fewest possible entities
• Batch work where possible


Possible Trends
• Main memory file system
• Fiber optic network
    • Effects on cache
• Mobile users
    • Disconnection
    • Geographic location
• Multimedia application
    • VOD