OSes: 15. Distributed File Systems


Operating Systems

Objectives
– introduce issues such as naming, stateful and stateless service, and replication

Certificate Program in Software Development
CSE-TC and CSIM, AIT
September -- November, 2003

15. Distributed File Systems
(S&G 6th ed., Ch. 16)

Overview

1. Background
2. Naming and Transparency
3. Remote File Access
4. Stateful versus Stateless Service
5. File Replication
6. Example System: Andrew

1. Background

Distributed File System (DFS)
– a distributed implementation of the classical time-sharing model of a file system, where multiple users share files and storage resources

A DFS manages a set of dispersed storage devices.

The overall storage space managed by a DFS is composed of different, remotely located, smaller storage spaces.

There is usually a correspondence between constituent storage spaces and sets of files.

1.1. DFS Structure

Service
– software entity running on one or more machines and providing a particular type of function to a priori unknown clients

Server
– service software running on a single machine

Client
– process that can invoke a service using a set of operations that forms its client interface

A client interface for a file service is formed by a set of primitive file operations (create, delete, read, write).

The client interface of a DFS should be transparent, i.e., it should not distinguish between local and remote files.

2. Naming and Transparency

Naming
– mapping between logical and physical objects

Multilevel mapping
– abstraction of a file that hides the details of how and where on the disk the file is actually stored

A transparent DFS also hides where in the network the file is stored.

For a file replicated at several sites, the mapping returns a set of the locations of this file's replicas; both the existence of multiple copies and their locations are hidden.

2.1. Naming Structures

Location transparency – the file name does not reveal the file's physical storage location.
– file name still denotes a specific, although hidden, set of physical disk blocks
– convenient way to share data
– can expose correspondence between component units and machines

Location independence – the file name does not need to be changed when the file's physical storage location changes.
– better file abstraction
– promotes sharing the storage space itself
– separates the naming hierarchy from the storage-devices hierarchy
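One way to picture location independence is a two-level mapping: the user-visible name maps to an internal file identifier, and the identifier maps to the current location(s). The sketch below is only an illustration; the table names, the path, and the server names are all hypothetical.

```python
# Hypothetical two-level name mapping: path -> file id -> server locations.
name_to_id = {"/shared/report.txt": 17}
id_to_locations = {17: ["serverA"]}

def resolve(path):
    """Return the list of servers currently holding the file named by path."""
    return id_to_locations[name_to_id[path]]

def migrate(path, new_server):
    """Move a file to another server; the user-visible name is untouched."""
    id_to_locations[name_to_id[path]] = [new_server]

before = resolve("/shared/report.txt")
migrate("/shared/report.txt", "serverB")
after = resolve("/shared/report.txt")
```

Only the second table changes when the file moves, so clients keep using the same name.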

2.2. Three Naming Scheme Approaches

1. Files are named by a combination of their host name and local name (e.g. a URL).
– guarantees a unique system-wide name

2. Attach remote directories to local directories, giving the appearance of a coherent directory tree.
– only previously mounted remote directories can be accessed transparently

3. Total integration of the component file systems.
– a single global name structure spans all the files in the system
– if a server is unavailable, some arbitrary set of directories on different machines also becomes unavailable
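The second approach (attaching remote directories) can be sketched as a mount table consulted during path lookup. This is a minimal illustration, not how any particular system implements it; the mount points, server names, and the `resolve` helper are assumptions made up for the example.

```python
# Hypothetical mount table: local mount point -> (server, exported directory).
mount_table = {
    "/usr/shared": ("server1", "/export/shared"),
    "/home": ("server2", "/export/home"),
}

def resolve(path):
    """Map a local path to (server, remote path) via the longest matching mount point."""
    best = None
    for mount_point in mount_table:
        if path == mount_point or path.startswith(mount_point + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return ("local", path)  # not under any mount: an ordinary local file
    server, remote_root = mount_table[best]
    return (server, remote_root + path[len(best):])
```

A path that is not under a previously mounted directory resolves locally, which mirrors the point that only mounted remote directories are reachable transparently.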

3. Remote File Access

Reduce network traffic by retaining recently accessed disk blocks in a cache, so that repeated accesses to the same information can be handled locally.
– if the needed data are not already cached, a copy is brought from the server to the user
– bigger caches are better (64K)
– accesses are performed on the cached copy
– files are identified with one master copy residing at the server machine, but copies of (parts of) the file are scattered in different caches
– cache-consistency problem: keeping the cached copies consistent with the master file
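A small sketch of the caching idea: a client-side block cache that goes to the server only on a miss. The `BlockCache` class, its `fetch_from_server` callable, and the LRU eviction policy are assumptions for illustration; the slides do not prescribe a replacement policy here.

```python
from collections import OrderedDict

class BlockCache:
    """Client-side cache of (file, block number) -> data with LRU eviction."""

    def __init__(self, fetch_from_server, capacity):
        self.fetch = fetch_from_server   # hypothetical callable standing in for the network
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.server_requests = 0

    def read(self, file_id, block_no):
        key = (file_id, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)      # hit: mark as most recently used
            return self.blocks[key]
        self.server_requests += 1             # miss: one trip to the server
        data = self.fetch(file_id, block_no)
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used block
        return data

master = {("f", 0): b"hello", ("f", 1): b"world"}
cache = BlockCache(lambda f, b: master[(f, b)], capacity=2)
cache.read("f", 0)
cache.read("f", 0)   # served locally
cache.read("f", 1)
```

Three reads cost two server requests here; the repeated read of block 0 never leaves the client.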

3.1. Cache Location – Disk vs. Main Memory

Advantages of disk caches:
– more reliable
– cached data kept on disk are still there during recovery and don't need to be fetched again

Advantages of main-memory caches:
– permit workstations to be diskless
– data can be accessed more quickly
– performance improves as main memories get bigger
– server caches (used to speed up disk I/O) are in main memory regardless of where user caches are located; using main-memory caches on the user machine permits a single caching mechanism for servers and users

3.2. Cache Update Policy

Write-through – write data through to disk as soon as they are placed in any cache
– reliable, but poor performance

Delayed-write – modifications are written to the cache and then written through to the server later
– write accesses complete quickly; some data may be overwritten before they are written back, and so need never be written at all
– poor reliability; unwritten data will be lost whenever a user machine crashes
– variation: scan the cache at regular intervals and flush blocks that have been modified since the last scan
– variation: write-on-close, which writes data back to the server when the file is closed. Best for files that are open for long periods and frequently modified.
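The trade-off between the two policies can be seen in a toy model where the server is just a dictionary. The class names and the `flush` method are hypothetical; `flush` stands in for either the periodic scan or write-on-close.

```python
class WriteThroughCache:
    def __init__(self, server):
        self.server, self.cache, self.server_writes = server, {}, 0

    def write(self, block, data):
        self.cache[block] = data
        self.server[block] = data      # sent to the server immediately
        self.server_writes += 1

class DelayedWriteCache:
    def __init__(self, server):
        self.server, self.cache, self.dirty, self.server_writes = server, {}, set(), 0

    def write(self, block, data):
        self.cache[block] = data       # completes locally
        self.dirty.add(block)

    def flush(self):
        """Write back modified blocks, e.g. on a periodic scan or on close."""
        for block in sorted(self.dirty):
            self.server[block] = self.cache[block]
            self.server_writes += 1
        self.dirty.clear()

wt_server, dw_server = {}, {}
wt, dw = WriteThroughCache(wt_server), DelayedWriteCache(dw_server)
for version in (b"v1", b"v2", b"v3"):
    wt.write(0, version)
    dw.write(0, version)
unflushed = dict(dw_server)   # what the server would hold if the client crashed now
dw.flush()
```

Write-through sends all three versions; delayed-write sends only the last one, but until `flush` runs the server holds nothing, which is the reliability cost.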

3.3. Consistency

Is the locally cached copy of the data consistent with the master copy?

Client-initiated approach
– client initiates a validity check
– server checks whether the local data are consistent with the master copy

Server-initiated approach
– server records, for each client, the (parts of) files it caches
– when the server detects a potential inconsistency, it must react
– session semantics? How are the cache and original re-combined?
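A minimal sketch of the client-initiated approach, assuming the server tags each file with a version number (a timestamp would serve equally well; the slides do not say which). The `Server` and `Client` classes and the check-on-every-read frequency are illustrative choices, not part of the original material.

```python
class Server:
    def __init__(self):
        self.files = {}            # name -> (version, data)

    def write(self, name, data):
        version = self.files.get(name, (0, b""))[0] + 1
        self.files[name] = (version, data)

    def fetch(self, name):
        return self.files[name]

    def is_current(self, name, version):
        """Validity check: does the client's version match the master copy?"""
        return self.files[name][0] == version

class Client:
    def __init__(self, server):
        self.server, self.cache = server, {}

    def read(self, name):
        cached = self.cache.get(name)
        if cached is None or not self.server.is_current(name, cached[0]):
            self.cache[name] = self.server.fetch(name)   # stale or missing
        return self.cache[name][1]
```

Checking on every read keeps the cache fresh but costs a message per access; checking less often (for example only on open) trades freshness for traffic.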

3.4. Comparing Caching & Remote Services

If many remote accesses are handled by the local cache then overall efficiency will improve.

Servers are contacted only occasionally in caching (rather than for each access).
– reduces server load and network traffic
– enhances potential for scalability

If the remote-service method handles every remote access across the network, then there is a penalty in network traffic, server load, and performance.

Total network overhead in transmitting big chunks of data (caching) is lower than that of a series of individual network requests.

Caching is superior when there are few writes. With frequent writes, there is a substantial overhead to overcome the cache-consistency problem.

Caching is best when carried out on machines with local disks or large main memories.

Remote access should be used on diskless, small-memory machines.

In caching, the lower inter-machine interface is different from the upper user interface.

In a remote service, the inter-machine interface mirrors the local user file system interface.

4. Stateful File Service

Mechanism:
– client opens a file
– server fetches information about the file from its disk, stores it in its memory, and gives the client a connection identifier unique to the client and the open file
– identifier is used for subsequent accesses until the session ends
– server must reclaim the main-memory space used by clients who are no longer active

Increased performance:
– fewer disk accesses
– stateful server knows if a file was opened for sequential access and can thus read ahead the next blocks
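The connection-identifier mechanism might look roughly like this. The `StatefulServer` class and its in-memory `open_table` are hypothetical stand-ins for the per-session state the server keeps in main memory.

```python
import itertools

class StatefulServer:
    def __init__(self, files):
        self.files = files                 # name -> bytes
        self.open_table = {}               # connection id -> [name, offset]
        self._ids = itertools.count(1)

    def open(self, name):
        conn = next(self._ids)
        self.open_table[conn] = [name, 0]
        return conn

    def read(self, conn, nbytes):
        name, offset = self.open_table[conn]
        data = self.files[name][offset:offset + nbytes]
        self.open_table[conn][1] = offset + len(data)   # server remembers position
        return data

    def close(self, conn):
        del self.open_table[conn]          # reclaim the per-session state
```

Requests stay short (just the identifier and a byte count), but everything in `open_table` is lost if the server crashes.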

4.1. Stateless File Server

Avoids state information by making each request self-contained.

Each request identifies the file and the position in the file.

No need to establish and terminate a connection by open and close operations.
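A self-contained request can be sketched as a function whose arguments carry everything the server needs. The function name and the dictionary standing in for the server's storage are assumptions for illustration.

```python
def stateless_read(files, name, offset, nbytes):
    """Serve a read using only the information carried in the request itself."""
    return files[name][offset:offset + nbytes]

files = {"f": b"abcdef"}
first = stateless_read(files, "f", 0, 3)
retry = stateless_read(files, "f", 0, 3)   # resending after a timeout is harmless
second = stateless_read(files, "f", 3, 3)
```

Because the server keeps no table between calls, a restarted server can answer the retry exactly as the original would have.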

4.2. Distinctions Between Stateful & Stateless Service

Failure recovery:
– A stateful server loses all its volatile state in a crash.
    – Restore state by a recovery protocol based on a dialog with clients, or abort operations that were underway when the crash occurred.
    – Server needs to be aware of client failures in order to reclaim space allocated to record the state of crashed client processes (orphan detection and elimination).
– With a stateless server, the effects of server failure and recovery are almost unnoticeable.
    – no state to restore
    – client just keeps resending the request
– A newly reincarnated server can respond to a self-contained request without any difficulty.

Penalties for using the stateless service:
– longer request messages to hold the state
– slower request processing to process those messages
– additional constraints imposed on DFS design, such as idempotency
    – one client operation must have the same meaning as many copies of that operation

Some environments require stateful service.
– A server employing server-initiated cache validation cannot provide stateless service, since it maintains a record of which files are cached by which clients.
– UNIX use of file descriptors and implicit offsets is inherently stateful; servers must maintain tables to map the file descriptors to inodes, and store the current offset within a file.
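The idempotency constraint can be illustrated by comparing a write at an absolute offset with an append. Both helper functions are hypothetical and operate on plain byte strings rather than real files.

```python
def write_at(data, offset, payload):
    """Idempotent: names an absolute position, so repeats give the same result."""
    buf = bytearray(data)
    if len(buf) < offset:
        buf.extend(b"\x00" * (offset - len(buf)))   # pad a gap with zero bytes
    buf[offset:offset + len(payload)] = payload
    return bytes(buf)

def append(data, payload):
    """Not idempotent: each repeat grows the file again."""
    return data + payload

once = write_at(b"hello", 5, b"!")
twice = write_at(once, 5, b"!")
appended_once = append(b"hello", b"!")
appended_twice = append(appended_once, b"!")
```

A client that resends `write_at` after a lost reply leaves the file unchanged; resending `append` would duplicate data, which is why a stateless design favors the first form.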

5. File Replication

Replicas of the same file reside on failure-independent machines.

Improves availability and can shorten service time.

Naming scheme maps a replicated file name to a particular replica.
– Existence of replicas should be invisible to higher levels.
– Replicas must be distinguished from one another by different lower-level names.

Updates
– replicas of a file denote the same logical entity, and thus an update to any replica must be reflected on all other replicas

Demand replication
– reading a nonlocal replica causes it to be cached locally, thereby generating a new nonprimary replica
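A toy model of the naming and update rules: one high-level name, several lower-level replica names, and an update applied to all of them. The `ReplicatedFile` class and the `name@server` convention for lower-level names are invented for this sketch.

```python
class ReplicatedFile:
    def __init__(self, name, servers):
        self.name = name
        # lower-level names distinguish the replicas, e.g. "report@A"
        self.replicas = {f"{name}@{s}": b"" for s in servers}

    def write(self, data):
        for low_level_name in self.replicas:      # reflect the update everywhere
            self.replicas[low_level_name] = data

    def read(self, available):
        """Read from the first replica whose server is in the available set."""
        for low_level_name, data in self.replicas.items():
            if low_level_name.rsplit("@", 1)[1] in available:
                return data
        raise IOError("no replica reachable")
```

Callers use only the high-level name; which replica answers depends on which servers are up, which is where the availability gain comes from.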

6. Example System: Andrew

Andrew is a distributed computing environment under development since 1983 at Carnegie Mellon University.
– also known as the AFS (Andrew File System)

Andrew is highly scalable; the system is targeted to span over 5000 workstations.

Andrew distinguishes between client machines (workstations) and dedicated server machines. Servers and clients run the 4.2BSD UNIX OS and are interconnected by an internet of LANs.

Clients are presented with a partitioned space of file names: a local name space and a shared name space.

Dedicated servers, called Vice, present the shared name space to the clients as a homogeneous, identical, and location-transparent file hierarchy.

The local name space is the root file system of a workstation, from which the shared name space descends.

Workstations run the Virtue protocol to communicate with Vice, and are required to have local disks where they store their local name space.

Servers collectively are responsible for the storage and management of the shared name space.

Clients and servers are structured in clusters interconnected by a backbone LAN.

A cluster consists of a collection of workstations and a cluster server and is connected to the backbone by a router.

A key mechanism selected for remote file operations is whole-file caching. Opening a file causes it to be cached, in its entirety, on the local disk.

6.1. Andrew Shared Name Space

Andrew's volumes are small component units associated with the files of a single client.

A fid identifies a Vice file or directory. A fid is 96 bits long and has three equal-length components:
– volume number, vnode number, unique id

Fids are location transparent; therefore, file movements from server to server do not invalidate cached directory contents.

Location information is kept on a volume basis, and the information is replicated on each server.
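Since a fid is 96 bits split into three equal parts, each component is 32 bits. The sketch below packs and unpacks such a value; the field order follows the list above, while the big-endian byte order and the function names are assumptions, not something the slides specify.

```python
import struct

def pack_fid(volume, vnode, unique):
    """Pack three 32-bit unsigned fields into a 12-byte (96-bit) fid."""
    return struct.pack(">III", volume, vnode, unique)

def unpack_fid(fid):
    """Return (volume number, vnode number, unique id)."""
    return struct.unpack(">III", fid)

fid = pack_fid(7, 1024, 3)
```

Nothing in the fid names a server, which is why moving a volume does not invalidate fids that clients have cached.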

6.2. Andrew's File Operations

Andrew caches entire files from servers on a client.

A client workstation interacts with Vice servers only during opening and closing of files.
– good for performance
– aids cache consistency

Venus – caches files from Vice when they are opened, and stores modified copies of files back when they are closed.

Reading and writing bytes of a file are done by the kernel on the cached copy, without Venus intervention.

Venus caches contents of directories and symbolic links, for path-name translation.
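The open/close pattern can be sketched as follows. This is a simplified model, not Andrew's actual code: the `Vice` and `Venus` classes are stand-ins, and for brevity the local `read` and `write` live on the `Venus` object even though in Andrew the kernel performs them directly on the cached copy.

```python
class Vice:
    """Stand-in for the server side: holds the master copy of each file."""
    def __init__(self):
        self.files, self.requests = {}, 0

    def fetch(self, name):
        self.requests += 1
        return self.files.get(name, b"")

    def store(self, name, data):
        self.requests += 1
        self.files[name] = data

class Venus:
    """Stand-in for the client cache manager."""
    def __init__(self, vice):
        self.vice, self.cache, self.dirty = vice, {}, set()

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = bytearray(self.vice.fetch(name))   # whole file

    def write(self, name, offset, payload):
        self.cache[name][offset:offset + len(payload)] = payload  # local only
        self.dirty.add(name)

    def read(self, name):
        return bytes(self.cache[name])

    def close(self, name):
        if name in self.dirty:
            self.vice.store(name, bytes(self.cache[name]))        # write back on close
            self.dirty.discard(name)
```

However many reads and writes happen between `open` and `close`, the server sees at most two requests per session for a modified file.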

6.3. Andrew Implementation

Client processes are interfaced to a UNIX kernel with the usual set of system calls.

Venus carries out path-name translation component by component.

The UNIX file system is used as a low-level storage system for both servers and clients. The client cache is a local directory on the workstation's disk.

Both Venus and server processes access UNIX files directly by their inodes to avoid the expensive path-name-to-inode translation routine.

Venus manages two separate caches:
– one for status
– one for data

An LRU algorithm is used to keep each of them bounded in size.

The status cache is kept in virtual memory to allow rapid servicing of stat (file-status-returning) system calls.

The data cache is resident on the local disk, but the UNIX I/O buffering mechanism does some caching of the disk blocks in memory that is transparent to Venus.