Dr Markus Hagenbuchner [email protected] CSCI319markus/SIM/CSCI319/11_FileSystems.pdf · NFS...
CSCI319 Chapter 11 Page: 1
Dr Markus Hagenbuchner
CSCI319
Distributed Systems
DISTRIBUTED FILE SYSTEMS
Lecture notes based on the textbook by Tanenbaum
Study objectives:
1. Understand the role of distributed file systems.
2. Understand the requirements and concepts of distributed file
system design.
3. Understand how the eight design principles are applied in
the realization of distributed file systems.
4. Obtain a better understanding of the workings of NFS.
Content
• File system models
• Typical client-server architectures
• Communication
• Naming and Mounting
• Synchronization
• File sharing
• File locking
• Caching
• Fault tolerance
• Security
We are therefore looking ahead at material that is covered later in this subject. The aim is to obtain an overview of how the various design principles are applied.
Distributed File Systems
• Sharing data is fundamental to distributed systems.
• Distributed file systems form the basis for many distributed applications.
• Distributed file systems allow multiple processes to share data over long periods of time (in a secure and reliable way)
• Examples: NFS, Coda, Plan 9, etc.
Client-Server Architectures (1)
The two most common models in DFS are the remote access model and the upload/download model.
[Figure: the remote access model and the upload/download model.]
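The difference between the two models can be sketched with a toy in-memory server. This is only an illustration under simplifying assumptions: the class names `FileServer`, `RemoteAccessClient` and `UploadDownloadClient` are hypothetical, and the "network" is reduced to a request counter.

```python
class FileServer:
    """Toy in-memory file server that counts the requests it receives."""

    def __init__(self):
        self.files = {}
        self.requests = 0

    def read(self, name, offset, size):
        self.requests += 1
        return self.files[name][offset:offset + size]

    def write(self, name, offset, data):
        self.requests += 1
        old = self.files.get(name, b"").ljust(offset, b"\0")  # pad if writing past the end
        self.files[name] = old[:offset] + data + old[offset + len(data):]

    def download(self, name):
        self.requests += 1
        return self.files[name]

    def upload(self, name, content):
        self.requests += 1
        self.files[name] = content


class RemoteAccessClient:
    """Remote access model: the file stays on the server, every operation is a request."""

    def __init__(self, server):
        self.server = server

    def read(self, name, offset, size):
        return self.server.read(name, offset, size)

    def write(self, name, offset, data):
        self.server.write(name, offset, data)


class UploadDownloadClient:
    """Upload/download model: fetch the whole file, work locally, send it back."""

    def __init__(self, server):
        self.server = server
        self.local = {}

    def open(self, name):
        self.local[name] = bytearray(self.server.download(name))

    def read(self, name, offset, size):
        return bytes(self.local[name][offset:offset + size])

    def write(self, name, offset, data):
        self.local[name][offset:offset + len(data)] = data  # local copy only

    def close(self, name):
        self.server.upload(name, bytes(self.local.pop(name)))
```

In the upload/download client, the server sees exactly two requests per session (one download, one upload), no matter how many local reads and writes happen in between.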
Interactive slide
We have already spoken briefly about centralized and decentralized architectures. Which one of the two terms {centralized, decentralized} is correct in the following two sentences?
In DFS,
• the remote access model as depicted on the previous slide engages the decentralized architecture.
• the upload/download model uses a centralized architecture.
Client-Server Architectures (2)
NFS realizes the remote access model using a layered architecture.
The basic NFS architecture for UNIX systems can be illustrated as
follows:
Interactive Slide
What is the purpose of the VFS layer?
* To achieve transparency. The VFS clients are unaware of the
physical attributes (e.g. physical location, media, etc.) of a file.
Which elements can call the VFS layer (in NFS systems)?
1. The system call layer (part of the OS)
2. The NFS client
What is a “stub” in the context of RPC?
* A communication “end-point”
File System Model (1)
• NFS is a protocol, based on RPC, for the
realization of distributed file systems.
• NFS is actively being maintained and developed.
• There exist several versions of NFS. Mainstream
versions are NFSv3 and NFSv4.
• NFSv4 is designed to improve performance, and
breaks with traditional views on what constitutes a
file.
• Let's compare the NFSv3 and NFSv4 protocols on a subset of
supported primitives:
File System Model (1)
An incomplete list of file system operations supported by NFS.
Operation   NFSv3   NFSv4   Description
Create      Yes     No      Create a regular file
Create      No      Yes     Create a non-regular file
Link        Yes     Yes     Create a hard link to a file
Symlink     Yes     No      Create a symbolic link to a file
Mkdir       Yes     No      Create a subdirectory
Mknod       Yes     No      Create a special file
Rename      Yes     Yes     Change the name of a file
Remove      Yes     Yes     Remove a file from the file system
Rmdir       Yes     No      Remove an empty subdirectory
Open        No      Yes     Open a file
Close       No      Yes     Close a file
Lookup      Yes     Yes     Look up a file by means of a file name
Interactive slide
In general, what does NFS provide?
NFS provides high level primitives which allow the creation, modification, and removal of possibly remote files or directories.
NFSv4 takes the concept further by generalizing the concept of “file”, and simplifying the handling of remote files.
What are non-regular files in NFSv4?
Symbolic links, directories, and special files such as a mount point.
How does a “lookup” differ between NFSv3 and NFSv4?
NFSv4 can resolve beyond a mount point. NFSv3 can not.
More on files in Distributed File Systems (1)
Data is generally stored in blocks. This allows us to think about whether to distribute the files or the data blocks.
The difference between (a) distributing whole files across several servers and (b) striping files for parallel access.
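The two placement strategies can be sketched as follows. This is a minimal illustration, assuming servers are numbered from 0 and placement is simple round-robin; the function names are hypothetical and no real file system is claimed to place data exactly this way.

```python
def place_whole_files(file_blocks, num_servers):
    """(a) Whole-file distribution: all blocks of a file go to one server."""
    placement = {}
    for i, (name, blocks) in enumerate(sorted(file_blocks.items())):
        server = i % num_servers
        placement[name] = [server] * blocks
    return placement


def place_striped(file_blocks, num_servers):
    """(b) Striping: block j of every file goes to server j mod num_servers."""
    return {
        name: [j % num_servers for j in range(blocks)]
        for name, blocks in file_blocks.items()
    }
```

With striping, the blocks of one file sit on different servers, so a client can fetch them in parallel; with whole-file distribution, a single large file is served by one machine only.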
Interactive slide
Name advantages of file striping:
• Reduces risk of catastrophic data loss.
• Allows parallelization of data access (scalable)
Name disadvantages of file striping:
• Increases risk of data loss.
• More difficult to manage (e.g. the number of blocks may not be
a multiple of the number of disks).
In practice, these disadvantages can be addressed through the
introduction of redundancy (e.g. parity), and through
byte-level striping. This will be a topic in one of the
laboratory classes.
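How parity counters the increased risk of data loss can be shown with a small sketch. It assumes XOR parity over equally sized blocks, as used in RAID-style schemes; the function names are hypothetical and the slides do not prescribe a particular parity scheme.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """The parity block is the XOR of all data blocks in a stripe."""
    return xor_blocks(data_blocks)


def recover(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity block."""
    return xor_blocks(surviving_blocks + [parity])
```

Because XOR is its own inverse, any one lost block of a stripe can be rebuilt from the remaining blocks plus the parity block; losing two blocks of the same stripe is not recoverable with a single parity block.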
Files in Distributed File Systems (2)
The file striping technique scales well. But for extremely large
file systems a different approach is needed.
Example: Web Search Engines. All general purpose search
engines for the Web require a local copy of Web content.
As of 2009, it is estimated that the WWW consists of over
60 billion Web pages of a combined size of 1.12
petabytes. The Web continues to grow at an exponential
rate. Therefore, a search engine which covers a sizeable
portion of the WWW requires a file system that scales
extremely well.
Example: Google's File System. Google introduced a cluster
based distributed file system to achieve scalability.
Cluster-Based Distributed File Systems (1)
Google's File System (GFS) stores data in 64 MB segments
distributed across a number of chunk servers. The
organization of a Google cluster of servers is as follows.
Interactive slide
Explain why GFS scales.
• Master mostly passive
• Data distributed and balanced over chunk servers
• Master uses a hash table (the chunk table)
• The chunk table fits into the main memory.
What is the potential bottleneck in GFS, and how can this
be addressed?
• Network: Introduce dedicated lines between chunk
server and GFS client.
• Disk speed on Chunk Server: Use Striping.
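The points above can be made concrete with a toy read path. This is a sketch under assumptions: the class names and the layout of the chunk table (file name and chunk index mapped to a chunk handle and one chunk server) are hypothetical simplifications, and replication is ignored.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as stated above


class Master:
    """Toy master: an in-memory chunk table, no file data."""

    def __init__(self):
        # (file name, chunk index) -> (chunk handle, chunk server id)
        self.chunk_table = {}

    def lookup(self, file_name, chunk_index):
        return self.chunk_table[(file_name, chunk_index)]


class ChunkServer:
    """Toy chunk server: stores the actual bytes, keyed by chunk handle."""

    def __init__(self):
        self.chunks = {}

    def read(self, handle, offset, size):
        return self.chunks[handle][offset:offset + size]


def client_read(master, chunk_servers, file_name, offset, size):
    """Read within one chunk: a small master request, then data from a chunk server."""
    chunk_index, chunk_offset = divmod(offset, CHUNK_SIZE)
    handle, server_id = master.lookup(file_name, chunk_index)
    return chunk_servers[server_id].read(handle, chunk_offset, size)
```

The master answers only a small table lookup per chunk; the bulk data flows directly between the client and the chunk server, which is why the master stays mostly passive.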
Communication in DFS
• Communication in DFS is mostly based
on RPC
– RPC makes DFS independent of the OS, network,
transport protocols, etc.
• Using RPC in NFS as an example:
Remote Procedure Calls in NFS
Example: NFSv4 supports compound RPCs. Compare reading a file in NFS
version 3 (a) with using a compound procedure in version 4 (b).
A compound RPC is faster because it needs fewer network round trips, and the network is often slower than disk access.
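The saving can be illustrated by counting round trips. The following is a toy model, not the NFS wire protocol: `ToyServer`, its two operations, and the use of the file name as a file handle are assumptions made for the example.

```python
class ToyServer:
    def __init__(self, files):
        self.files = files          # name -> bytes
        self.round_trips = 0

    def _execute(self, op, arg, current):
        # 'current' plays the role of the current file handle in a compound call.
        if op == "lookup":
            if arg not in self.files:
                raise FileNotFoundError(arg)
            return arg              # the name itself serves as a toy file handle
        if op == "read":
            handle = arg if arg is not None else current
            return self.files[handle]
        raise ValueError(op)

    def call(self, op, arg):
        """One operation per request/reply pair."""
        self.round_trips += 1
        return self._execute(op, arg, None)

    def compound(self, ops):
        """Several operations in one request; an error aborts the remaining ones."""
        self.round_trips += 1
        current = None
        results = []
        for op, arg in ops:
            current = self._execute(op, arg, current)
            results.append(current)
        return results
```

Calling `lookup` and `read` separately costs two round trips, while `compound([("lookup", name), ("read", None)])` returns the same data in one.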
The RPC2 Subsystem (2)
RPC2 aims at offering more flexible and reliable RPC:
1. Server can send back message to client to let
client know that it is still working on a request
(avoid timeouts).
2. Allows the embedding (injection) of application
side protocols in RPC. This is called “side
effects”.
3. Allows parallel RPC calls.
This will be covered in more detail during the
laboratory classes.
The RPC2 Subsystem (1)
Support for “side effects” in Coda’s RPC2 system.
The RPC2 Subsystem (3)
Efficiency of RPCs can be enhanced by allowing mutually
independent tasks to occur in parallel. Example: sending of
invalidation messages in RPCv1 (a) versus RPCv2 (b).
Note that RPC2 calls are still blocking calls.
Naming in DFS
• Names are (almost) always organized as
hierarchical (structured) name spaces in
DFS (see textbook, chapter 5).
• NFS is defined for structured name
spaces.
• We will now look at how NFS handles
naming.
Naming in NFS (1)
Example: Mounting (part of) a remote file system in NFS.
Only sub-trees explicitly “exported” by the server can be
mounted by a client.
Naming in NFS (2)
Example 2: Mounting nested
directories from multiple servers in
NFS.
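How a client maps a local path onto mounted remote directories can be sketched with a longest-prefix match over a mount table. This is an illustration only: the function `resolve`, the server names and the paths are hypothetical, and real NFS clients resolve names component by component inside the kernel.

```python
def resolve(mount_table, path):
    """Map a client path to (server, remote path) using the longest matching mount point.

    mount_table: {local mount point: (server name, exported directory)}
    Returns (None, path) when the path is purely local.
    """
    best = None
    for mount_point in mount_table:
        if path == mount_point or path.startswith(mount_point.rstrip("/") + "/"):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None, path
    server, exported = mount_table[best]
    rest = path[len(best):].lstrip("/")
    remote = exported.rstrip("/") + ("/" + rest if rest else "")
    return server, remote
```

With a table such as `{"/remote": ("serverA", "/export/users"), "/remote/vu": ("serverB", "/home/vu")}`, a path below `/remote/vu` is served by the second server even though it lies inside a directory mounted from the first.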
Automounting (1)
Static mounting can be troublesome with large directory structures (e.g.
home directories). This is countered in NFS through an automounter.
Automounting (2)
To bypass the automounter whenever a mount point (here alice) is accessed, we can use symbolic links with automounting.
Constructing a Global Name Space
A distributed file server may have to deal with several different name spaces. This has been addressed through the introduction of GNS which introduces the notion of Junctions. In GNS, clients maintain a virtual tree in which nodes are either a directory or a junction. There are 5 types of junctions:
Interactive slide
What is the role of the 5 junctions in GNS?
• GNS junction: Refers to another GNS which may
be hosted on another system or by another process.
• Logical file-system name and physical file name:
Required to contact a location service (which
provides a handle or address of a file)
• Physical file-system name and logical file name:
Refer to a file system on another server (the contact
address). Example:
http://www.uow.edu.au/index.html is a physical file
name example.
Interactive slide
What is the role of the 5 junctions in GNS?
• GNS junction: Refers to another GNS which may
be hosted on another system or by another process.
• Logical file-system name and logical file name:
Required to contact a location service (which
provides a handle or address of a file). Example:
http://www.uow.edu.au/research/2012/index.html
• Physical file-system name and physical file
name: Refer to a file system on another server.
Example: C:\data\pub\index.html may be the name
of a physically existing file.
Synchronization in DFS
Issues that require attention:
• File sharing: A file may be accessed by
multiple clients simultaneously.
• File locking: Deny concurrent accesses.
• Caching: Replication of files to where the
processes are located.
File Sharing Semantics (1)
Example: Read-follows-write semantics. On a single processor, when a read follows a write, the value returned by the read is the value just written (UNIX semantics).
NFS uses UNIX semantics; thus, fast successive writes followed by a read maintain the correct order.
File Sharing Semantics (2)
But UNIX semantics only work on systems where there is:
• Only one file server
• No client-side caching
Example 2: Session semantics allow client-side caching. In a distributed system with caching, obsolete values may be returned. This can result in inconsistencies.
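The stale-read problem under session semantics can be reproduced in a few lines. This is a toy model, assuming a whole-file copy is taken at open and written back at close; the class names `SessionServer` and `Session` are hypothetical.

```python
class SessionServer:
    """Toy file server holding the authoritative copies."""

    def __init__(self):
        self.files = {}


class Session:
    """Session semantics: changes become visible to others only after close()."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.cache = server.files.get(name, "")  # private copy taken at open

    def read(self):
        return self.cache

    def write(self, content):
        self.cache = content  # only the local copy changes

    def close(self):
        self.server.files[self.name] = self.cache  # write back the whole file
```

If two sessions are open on the same file, a write in one is invisible to the other, and whichever session closes last overwrites the result of the first.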
File Sharing Semantics (3)
In fact, there are four common ways of dealing with
shared files in a distributed system.
Immutable files cannot change their content; a file can only be
atomically replaced by a new one.
With transactions, changes between begin_transaction
and end_transaction are atomic.
Interactive slide
What happens with using immutable files when one file is
replaced while another process is reading it?
1. Maintain a copy of the old file until all reads are
complete, or
2. Refuse subsequent reads from old file.
File Locking (1)
Transaction semantics differ from file locking. File locking is supported
with NFSv4. Operations in NFSv4 related to file locking are:
With locking, there are two cases to consider:
1. Accessing a resource which may or may not already be locked.
2. Requesting a lock on a resource which may already be accessed by
another process.
File Locking (2)
Case 1: A client requests shared access given the current
denial state. The result of an open operation with share
reservations in NFSv4 is as follows:
File Locking (3)
Case 2: A client requests a denial state given the current
file access state. The result of an open operation with share
reservations in NFSv4 is as follows:
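Both cases reduce to two bitmask checks, which can be sketched as follows. The sketch assumes the usual encoding of access and deny modes as bit flags (read = 1, write = 2, both = 3, deny none = 0); the class `SharedFile` is hypothetical and keeps only the list of currently held opens.

```python
READ, WRITE, BOTH = 1, 2, 3   # access (and deny) bits
NONE = 0                      # deny nothing


class SharedFile:
    def __init__(self):
        self.opens = []  # (access, deny) pairs currently held by clients

    def open(self, access, deny):
        """Return True and record the open if it conflicts with no existing open."""
        for held_access, held_deny in self.opens:
            if access & held_deny:      # case 1: we request what someone has denied
                return False
            if deny & held_access:      # case 2: we deny what someone already uses
                return False
        self.opens.append((access, deny))
        return True
```

For example, after one client opens the file for reading while denying writes, a second read-only open succeeds, whereas an open for writing fails (case 1) and an open that denies reading fails (case 2).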
Client-Side Caching (1)
A more detailed look at the effects of client-side caching in NFS.
Problem: The local cache may be inconsistent with the associated file on the FS.
→ NFSv4 aims at reducing such inconsistencies when caching data.
Client-Side Caching (2)
NFSv4 addresses this problem by using file delegation, and a
callback mechanism to recall file delegation.
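The interplay of delegation and recall can be sketched with a toy server and client. This is a strongly simplified model under assumptions: the names `DelegatingServer` and `CachingClient` are hypothetical, only whole files are cached, and after a recall the server hands out no new delegation.

```python
class CachingClient:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # file name -> locally cached content

    def recall(self, server, file_name):
        """Callback from the server: flush local changes and give up the delegation."""
        server.files[file_name] = self.cache.pop(file_name)


class DelegatingServer:
    def __init__(self):
        self.files = {}
        self.delegations = {}  # file name -> client holding the delegation

    def open(self, client, file_name):
        holder = self.delegations.get(file_name)
        if holder is client:
            return client.cache[file_name]       # served locally, no server work
        if holder is not None:
            holder.recall(self, file_name)       # callback before serving someone else
            del self.delegations[file_name]
            return self.files[file_name]
        self.delegations[file_name] = client     # delegate the file to this client
        client.cache[file_name] = self.files[file_name]
        return client.cache[file_name]
```

While a client holds the delegation it can work on its cached copy without contacting the server; the recall ensures that a second client sees the flushed changes rather than a stale server copy.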
Sharing Files in Coda (with respect to caching)
Example: The transactional behavior in sharing files in Coda. Note that
transaction semantics utilizes the upload/download model.
From a transactional point of view there is no problem since SA precedes SB
Client-Side Caching in Coda
One way to overcome problems with caching is to use a “callback promise”, as in Coda. For example, the use of local copies when opening a session in Coda.
Fault tolerance
RAID is used to achieve fault tolerance on centralized FS. RAID is
not suitable for DFS. In DFS, fault tolerance can be achieved e.g.
with the Byzantine method. For example: The different phases in
Byzantine fault tolerance:
Interactive slide
Give one solution to how to achieve transparency in
the Byzantine method.
For example:
• use a trusted coordinator (or master) process.
• implement in the client side middleware layer.
• implement within the application layer (not
recommended).
Security in NFS
NFSv3 has very limited support for security. The NFSv3 security architecture is founded on SSL which serves as a secured tunnel for NFS data. For example:
Secure RPCs
This is improved significantly in NFSv4 where a security layer has become part of the FS. This is illustrated in the following:
Access Control
This allows various kinds of users and processes to be distinguished by NFSv4 with respect to access control. For example, a selection of valid users in NFSv4 is as follows:
The first three are also known in NFSv3 but not consistently implemented.