Distributed File Systems Andy Wang Operating Systems COP 4610 / CGS 5765.

Distributed File Systems

Andy Wang

Operating Systems

COP 4610 / CGS 5765

Distributed File System

Provides transparent access to files stored on a remote disk

Recurrent themes of design issues:
• Failure handling
• Performance optimizations
• Cache consistency

No Client Caching

Use RPC to forward every file system request to the remote server
• open, seek, read, write

[Diagram: the server caches X; the caches of clients A and B are empty, and the clients send their reads and writes to the server]
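The forwarding scheme can be sketched in a few lines. This is a minimal illustration, not an actual RPC system: `FileServer`, `RemoteFileClient`, and `rpc_count` are hypothetical names, and a direct method call stands in for the network round trip.

```python
class FileServer:
    """Hypothetical in-memory server holding file contents and open-file positions."""

    def __init__(self):
        self.files = {}      # path -> bytearray with the file contents
        self.handles = {}    # handle -> [path, current position]
        self.next_handle = 0

    def open(self, path):
        self.files.setdefault(path, bytearray())
        handle = self.next_handle
        self.next_handle += 1
        self.handles[handle] = [path, 0]
        return handle

    def seek(self, handle, offset):
        self.handles[handle][1] = offset

    def read(self, handle, count):
        path, pos = self.handles[handle]
        data = bytes(self.files[path][pos:pos + count])
        self.handles[handle][1] = pos + len(data)
        return data

    def write(self, handle, data):
        path, pos = self.handles[handle]
        buf = self.files[path]
        if len(buf) < pos:
            buf.extend(b"\0" * (pos - len(buf)))  # pad a gap left by seek
        buf[pos:pos + len(data)] = data
        self.handles[handle][1] = pos + len(data)
        return len(data)


class RemoteFileClient:
    """Client stub without a cache: every call becomes one request to the server."""

    def __init__(self, server):
        self.server = server
        self.rpc_count = 0   # number of simulated round trips

    def _rpc(self, name, *args):
        # A real system would marshal the call and send it over the network.
        self.rpc_count += 1
        return getattr(self.server, name)(*args)

    def open(self, path):
        return self._rpc("open", path)

    def seek(self, handle, offset):
        return self._rpc("seek", handle, offset)

    def read(self, handle, count):
        return self._rpc("read", handle, count)

    def write(self, handle, data):
        return self._rpc("write", handle, data)
```

Because no client keeps a copy, a write by one client is visible to the next read by any other client, at the cost of one round trip per operation.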

No Client Caching

+ Server always has a consistent view of the file system

- Poor performance

- Server is a single point of failure

Network File System (NFS)

Uses client caching to reduce network load

Built on top of RPC

[Diagram: the server, client A, and client B each hold a cached copy of X]

Network File System (NFS)

+ Performance better than no caching

- Has to handle failures

- Has to handle consistency

Failure Modes

If the server crashes:
• Uncommitted data in memory are lost
• Current file positions may be lost
• The client may ask the server to perform unacknowledged operations again

If a client crashes:
• Modified data in the client cache may be lost

NFS Failure Handling

1. Write-through caching

2. Stateless protocol: the server keeps no state about the client
• A read is self-contained: it is equivalent to open, seek, read, close
• No server recovery is needed after a failure

3. Idempotent operations: repeated operations get the same result
• No static variables
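The stateless and idempotent properties can be illustrated with a small sketch. It is not the NFS wire protocol: `StatelessServer` and `append_not_idempotent` are hypothetical names, and file handles are plain strings. Each request carries the file handle and the offset, so the server needs no per-client file position, and repeating a request leaves the same result.

```python
class StatelessServer:
    """Hypothetical server: stores file data but no per-client state."""

    def __init__(self):
        self.files = {}  # file handle -> bytearray (file data, not client state)

    def read(self, fh, offset, count):
        # The request names the file and the offset, so nothing must be remembered.
        return bytes(self.files.get(fh, b"")[offset:offset + count])

    def write(self, fh, offset, data):
        # Writing the same bytes at the same offset twice gives the same file.
        buf = self.files.setdefault(fh, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)


def append_not_idempotent(server, fh, data):
    """Counter-example: the result depends on the current length, so a retry duplicates data."""
    server.files.setdefault(fh, bytearray()).extend(data)
```

A client that never received an acknowledgement can resend `write(fh, offset, data)` safely, whereas resending the append would corrupt the file.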

NFS Failure Handling

4. Transparent failures to clients. Two options:
• The client waits until the server comes back
• The client can return an error to the user application
  • Do you check the return value of close?
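As a sketch of what checking close looks like in practice: on a network file system, a write error may only be reported when the file is closed, so an application that ignores the result of close can lose data silently. In Python a failed close raises `OSError` rather than returning a value. The helper name `save` is hypothetical.

```python
import os


def save(path, data):
    """Write data to path; return True only if open, write, and close all succeed."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError:
        return False
    try:
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than requested
            view = view[written:]
    except OSError:
        try:
            os.close(fd)
        except OSError:
            pass
        return False
    try:
        os.close(fd)  # a deferred write error can surface here
    except OSError:
        return False
    return True
```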

NFS Weak Consistency Protocol

A write updates the server immediately

Other clients poll the server periodically for changes

No guarantees for multiple writers
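The polling behavior can be sketched as follows. This is a simplified model under stated assumptions: `VersionedServer`, `PollingClient`, and `poll_interval` are hypothetical names, a version counter stands in for the file's modification time, and time is passed in explicitly as `now` so the behavior is deterministic.

```python
class VersionedServer:
    """Hypothetical server that bumps a version number on every write."""

    def __init__(self):
        self.data = {}  # name -> (version, contents)

    def write(self, name, contents):
        version = self.data.get(name, (0, b""))[0] + 1
        self.data[name] = (version, contents)
        return version

    def get_version(self, name):
        return self.data[name][0]

    def read(self, name):
        return self.data[name]


class PollingClient:
    """Client that trusts its cached copy until poll_interval has elapsed."""

    def __init__(self, server, poll_interval):
        self.server = server
        self.poll_interval = poll_interval
        self.cache = {}  # name -> (version, contents, time of last check)

    def write(self, name, contents, now):
        version = self.server.write(name, contents)  # write-through
        self.cache[name] = (version, contents, now)

    def read(self, name, now):
        entry = self.cache.get(name)
        if entry is not None:
            version, contents, checked = entry
            if now - checked < self.poll_interval:
                return contents  # not yet time to poll; may be stale
            if self.server.get_version(name) == version:
                self.cache[name] = (version, contents, now)
                return contents
        version, contents = self.server.read(name)
        self.cache[name] = (version, contents, now)
        return contents
```

Between polls, a reader can return data that another client has already overwritten, which is the weak consistency described above.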

NFS Summary

+ Simple and highly portable

- May become inconsistent sometimes
  • Does not happen very often

Andrew File System (AFS)

Developed at CMU

Design principles:
• Files are cached on each client’s disks
  • NFS caches only in clients’ memory
• Callbacks: The server records who has the copy of a file
• Write-back cache on file close. The server then tells all clients that own an old copy.
• Session semantics: Updates are only visible on close
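A minimal sketch of callbacks and session semantics, assuming whole-file caching and an in-process server. `AFSServer`, `AFSClient`, `fetch`, `store`, and `break_callback` are hypothetical names chosen for illustration, and a dictionary stands in for each client's disk cache.

```python
class AFSServer:
    """Hypothetical server that records which clients hold a copy of each file."""

    def __init__(self):
        self.files = {}      # name -> contents
        self.callbacks = {}  # name -> set of clients holding a copy

    def fetch(self, client, name):
        self.callbacks.setdefault(name, set()).add(client)
        return self.files.get(name, b"")

    def store(self, writer, name, contents):
        self.files[name] = contents
        # Tell every other client on the callback list that its copy is old.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.break_callback(name)
        self.callbacks[name] = {writer}


class AFSClient:
    def __init__(self, server):
        self.server = server
        self.disk_cache = {}  # name -> contents; stands in for the local disk
        self.dirty = set()    # names modified since the last close

    def open(self, name):
        if name not in self.disk_cache:
            self.disk_cache[name] = self.server.fetch(self, name)
        return self.disk_cache[name]

    def write(self, name, contents):
        self.disk_cache[name] = contents  # local only until close
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:
            self.server.store(self, name, self.disk_cache[name])  # write-back on close
            self.dirty.discard(name)

    def break_callback(self, name):
        self.disk_cache.pop(name, None)  # the next open fetches the new copy
```

A second client keeps seeing the old contents while the writer still has the file open, and sees the update on its next open after the writer closes.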

AFS Illustrated

[Diagram sequence: a server caching file X and two clients, A and B. The frames show, in order:]

1. read X: client A reads X; the server adds client A to the callback list of X, and client A caches X
2. read X: client B reads X; the server adds client B to the callback list of X, and client B caches X
3. write X: a client modifies its cached copy of X
4. close X: the modified X is written back to the server
5. open X: a client opens X

AFS Failure Handling

If the server crashes, it asks all clients to reconstruct the callback states
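One way to picture the reconstruction, as a sketch only: after restarting, the server collects from each client the list of files it still caches and rebuilds the callback lists from those reports. The function name `rebuild_callbacks` and the shape of the reports are assumptions made for illustration.

```python
def rebuild_callbacks(reports):
    """Rebuild callback lists from client reports.

    reports maps a client id to the names of the files that client caches.
    Returns a mapping from file name to the set of client ids holding a copy.
    """
    callbacks = {}
    for client_id, cached_names in reports.items():
        for name in cached_names:
            callbacks.setdefault(name, set()).add(client_id)
    return callbacks
```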

AFS vs. NFS

AFS
• Less server load due to clients’ disk caches
• The server is not involved for read-only files

Both AFS and NFS
• Server is a performance bottleneck
• Single point of failure

Serverless Network File Service (xFS)

Idea: construct a file system as a parallel program and exploit the high-speed LAN

Four major pieces:
• Cooperative caching
• Write-ownership cache coherence
• Software RAID
• Distributed control

Cooperative Caching

Uses remote memory to avoid going to disk

On a cache miss, check the local memory and remote memory, before checking the disk

Before discarding the last cached memory copy, send the content to remote memory if possible
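The lookup order and the eviction rule can be sketched as below. This is an illustrative model, not xFS code: `Node`, `read_block`, `insert`, and `evict` are hypothetical names, each node sees its peers' memory directly, and the oldest inserted block is evicted first as a stand-in for a real replacement policy.

```python
class Node:
    """Hypothetical client machine with a small in-memory block cache."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.memory = {}  # block id -> data; insertion order approximates age
        self.peers = []   # other nodes whose memory can be consulted


def evict(node):
    victim_id = next(iter(node.memory))  # oldest inserted block
    data = node.memory.pop(victim_id)
    # If this was the last cached copy, try to forward it to a peer with free space.
    if not any(victim_id in peer.memory for peer in node.peers):
        for peer in node.peers:
            if len(peer.memory) < peer.capacity:
                peer.memory[victim_id] = data
                break


def insert(node, block_id, data):
    if block_id not in node.memory and len(node.memory) >= node.capacity:
        evict(node)
    node.memory[block_id] = data


def read_block(node, block_id, disk):
    """Return (data, source), checking local memory, then remote memory, then disk."""
    if block_id in node.memory:
        return node.memory[block_id], "local memory"
    for peer in node.peers:
        if block_id in peer.memory:
            data = peer.memory[block_id]
            insert(node, block_id, data)
            return data, "remote memory"
    data = disk[block_id]
    insert(node, block_id, data)
    return data, "disk"
```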

Cooperative Caching

[Diagram sequence: clients A, B, C, and D; at first only client A caches X. On a "read X", X is supplied from client A's cache, after which client C also caches X.]

Page 32: Distributed File Systems Andy Wang Operating Systems COP 4610 / CGS 5765.

Write-Ownership Cache Coherence

Declares a client to be the owner of a file when it writes
• No one else can have a copy
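A minimal sketch of the ownership rule, assuming a single manager that tracks whole files. `CoherenceManager`, `acquire_read`, `acquire_write`, and `mode` are hypothetical names, and clients are identified by plain strings.

```python
class CoherenceManager:
    """Hypothetical manager tracking who may read or write each file."""

    def __init__(self):
        self.owner = {}    # file name -> client allowed to write, or None
        self.holders = {}  # file name -> set of clients with a cached copy

    def acquire_read(self, client, name):
        # A reader other than the owner downgrades the owner to read-only,
        # so that several clients can share copies.
        if self.owner.get(name) != client:
            self.owner[name] = None
        self.holders.setdefault(name, set()).add(client)

    def acquire_write(self, client, name):
        # All other copies are invalidated; the writer becomes the sole owner.
        self.holders[name] = {client}
        self.owner[name] = client

    def mode(self, client, name):
        if client not in self.holders.get(name, set()):
            return "no copy"
        if self.owner.get(name) == client:
            return "owner, read-write"
        return "read-only"
```

For example, after client A writes X it is the owner; a read by client C leaves both with read-only copies; a later write by client C removes A's copy and makes C the owner.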

Write-Ownership Cache Coherence

[Diagram sequence: clients A, B, C, and D. The frames show, in order:]

1. Client A caches X as its owner, with read-write access
2. read X: client A's copy becomes read-only, and client C receives a read-only copy of X
3. write X: client A no longer holds a copy, and client C holds X as owner, with read-write access

Other components

Software RAID
• Stripe data redundantly over multiple disks

Distributed control
• File system managers are spread across all machines
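Redundant striping can be illustrated with XOR parity, which is one common way to add redundancy; the slides do not say which scheme xFS uses, so treat the choice as an assumption. The helper names `xor_blocks`, `write_stripe`, and `recover_block` are hypothetical, and lists of byte strings stand in for disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def write_stripe(data, num_data_disks):
    """Split data into one block per data disk and append one parity block."""
    block_size = -(-len(data) // num_data_disks)  # ceiling division
    padded = data.ljust(block_size * num_data_disks, b"\0")
    blocks = [padded[i * block_size:(i + 1) * block_size]
              for i in range(num_data_disks)]
    return blocks + [xor_blocks(blocks)]


def recover_block(stripe, lost_index):
    """Rebuild the block at lost_index from all the other blocks in the stripe."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

With three data disks and one parity disk, the contents of any single failed disk can be recomputed from the remaining three.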

xFS Summary

Built on small, unreliable components

Data, metadata, and control can live on any machine

If one machine goes down, everything else continues to work

When machines are added, xFS starts to use their resources

xFS Summary

- Complexity and associated performance degradation

- Hard to upgrade software while keeping everything running