Network File System Phil Segel Muhammad Kamran Arain Ala F. Alnawaiseh Group 3.

Post on 26-Dec-2015


Network File System

Phil Segel, Muhammad Kamran Arain, Ala F. Alnawaiseh

Group 3

Outline

– Introduction
– Network File System (NFS)
– Windows Distributed File System (DFS)

Introduction

A Distributed File System (DFS) is a file system that supports sharing of files and resources, in the form of persistent storage, over a network.

The first file servers were developed in the 1970s.

Sun’s Network File System (NFS) became the first widely used distributed file system after its introduction in 1985.

Clients and servers

A file server provides file services to clients.

A client interface for a file service is formed by a set of primitive file operations:
– Creating a file
– Deleting a file
– Reading from a file
– Writing to a file
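
The four primitive operations above can be sketched as a small interface. This is only an illustration: the class names, the offset/length parameters, and the in-memory dictionary standing in for server storage are assumptions, not part of any real file service.

```python
from abc import ABC, abstractmethod


class FileService(ABC):
    """Primitive operations a file server exposes to its clients."""

    @abstractmethod
    def create(self, name): ...

    @abstractmethod
    def delete(self, name): ...

    @abstractmethod
    def read(self, name, offset, length): ...

    @abstractmethod
    def write(self, name, offset, data): ...


class InMemoryFileService(FileService):
    """Toy stand-in for a server: files live in a dict, not on disk."""

    def __init__(self):
        self._files = {}               # file name -> bytearray with the contents

    def create(self, name):
        if name in self._files:
            raise FileExistsError(name)
        self._files[name] = bytearray()

    def delete(self, name):
        del self._files[name]          # raises KeyError if the file is missing

    def read(self, name, offset, length):
        return bytes(self._files[name][offset:offset + length])

    def write(self, name, offset, data):
        buf = self._files[name]
        if offset > len(buf):
            buf.extend(b"\x00" * (offset - len(buf)))   # zero-fill any gap
        buf[offset:offset + len(data)] = data
```

A client would call `create`, then `write` and `read` with explicit offsets, and finally `delete`; a distributed implementation would forward each call over the network instead of touching a local dict.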

Distribution

A DFS is a file system whose clients, servers, and storage devices are dispersed among the machines of a Distributed System or intranet.

Accordingly, service activity has to be carried out across the network, and instead of a single centralized data repository, the system has multiple and independent storage devices.

The distinctive features of a DFS are the multiplicity and autonomy of clients and servers in the system.

Transparency

Ideally, a DFS should appear to its clients to be a conventional, centralized file system.

The multiplicity and dispersion of its servers and storage devices should be made invisible.

Performance

The most important performance measurement of a DFS is the amount of time needed to satisfy service requests.
– In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time.
– In a DFS, however, a remote access has the additional overhead attributed to the distributed structure.

Concurrent File Updates

A DFS should provide for multiple client processes on multiple machines not just accessing but also updating the same files.

Concurrency control or locking may either be built into the file system or be provided by an add-on protocol.

Distributed Data Store

A Distributed Data Store is a network in which a user stores his or her information on a number of peer network nodes.

Most peer-to-peer networks are not distributed data stores, because a user's data is available only while that user's node is on the network.

Distributed Data-Store Networks

– FreeNet
– MNet
– Andrew File System (AFS)
– NNTP
– BitTorrent
– The Mnesia Database
– GNUnet
– Secure File System (SFS)
– Global File System (GFS)
– The Chord Project
– SVK: distributed version control
– Groove shared workspace, used for DoHyki

Windows Distributed File Systems

What is the purpose of Windows DFS?

To unite files on different computers into a single namespace

To make it easy to build a single, hierarchical view of multiple file servers and file server shares on the network

To display files in a single directory structure regardless of what server the files are on

Comparison

Windows Distributed File Systems do for servers what a file system does for a hard disk.

DFS File Protocols

Not limited to a single protocol

Regardless of the client used, can support mapping of:
– Servers
– Shares
– Files

Supports these provided that the client supports the native server and share

History

The UNC (Universal Naming Convention) was required to specify the physical server and share to access file information
– e.g. \\Server\share\path\filename

Could be used directly by drive mapping
– e.g. X:\path\filename

As the network continues to grow, mapping shares individually scales poorly
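
To make the UNC layout concrete, here is a minimal sketch that splits a UNC name into its server, share, and remaining path using Python's standard `pathlib.PureWindowsPath`. The helper name `split_unc` is made up for this illustration.

```python
from pathlib import PureWindowsPath


def split_unc(path):
    """Split a UNC name into (server, share, rest of the path)."""
    p = PureWindowsPath(path)
    # For a UNC name, .drive holds the "\\server\share" prefix.
    if not p.drive.startswith("\\\\"):
        raise ValueError(f"not a UNC path: {path}")
    server, share = p.drive.lstrip("\\").split("\\", 1)
    rest = "\\".join(p.parts[1:])      # parts[0] is the drive plus root
    return server, share, rest
```

A drive-mapped name such as X:\path\filename carries no server or share at all, which is why every client has to keep its own mapping table and why that approach scales poorly.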

Solution to historical problems

Windows DFS solves these problems by linking physical storage into logical representation.

Permits shares to be hierarchically connected to other Windows shares

Makes the physical location of data transparent to users and applications

DFS Features and Benefits

Feature: Custom hierarchical view of shared network resources

– Description: By linking shares together, administrators can create a single hierarchical volume that behaves as though it were one giant hard drive. Individual users can create their own Dfs volumes, which in turn can be incorporated by other Dfs volumes. These are called inter-Dfs links.

– Benefit: Provides a simplified view of network shares that can be customized by the administrator.

DFS Features and Benefits (Continued)

Feature: Flexible volume administration

– Description: Individual shares participating in the Dfs volume can be taken offline without affecting the remaining portion of the volume name space.

– Benefit: Allows administrators to manage physical network shares, independent of their logical representation to users.

DFS Features and Benefits (Continued)

Feature: Graphical administration tool

– Description: Each Dfs root is administered with an easy-to-use graphical administration tool that permits browsing, configuration of volumes, alternates, and inter-Dfs links, as well as administration of remote Dfs roots.

– Benefit: Requires little training, reducing the need for trained, full-time server administrators.

DFS Features and Benefits (Continued)

Feature: Higher data availability

– Description: Multiple copies of read-only shares can be mounted under the same logical Dfs name to provide alternate locations for accessing data. If one of the copies becomes unavailable, an alternate is automatically selected.

– Benefit: Important business data is always available, even if a server, disk drive, or file occasionally fails.

DFS Features and Benefits (Continued)

Feature: Load balancing

– Description: Multiple copies of read-only shares on separate disk drives or servers can be mounted under the same logical Dfs name, thus permitting limited load balancing between drives or servers. As users request files from the Dfs volume, they are transparently referred to one of the network shares comprising the Dfs volume.

– Benefit: Automatically distributes file access across multiple disk drives or servers to balance loads and improve response time during peak usage periods.

DFS Features and Benefits (Continued)

Feature: Name transparency

– Description: End users navigate the logical name space without consideration to the physical locations of their data. Physical data can be relocated to any server and the logical Dfs name space can be reconfigured so that the end user’s perspective of the Dfs name space is unaffected (that is, it is transparent to users that their data has changed location).

– Benefit: Increased administrative flexibility. Administrators can move network shares between servers or disk drives without affecting users’ ability to access the data.

DFS Features and Benefits (Continued)

Feature: Integration with Windows NT security model

– Description: No additional administrative or security issues. Any user who connects to a Dfs volume is only permitted to access files for which he or she has appropriate rights on that share.

– Benefit: Uses the existing Windows NT security model for easy administration and secure access.

DFS Features and Benefits (Continued)

Feature: Dfs client integrated into Windows NT Workstation 4.0, available for Windows 95 and Windows 98

– Description: The Dfs Windows NT Workstation client has been incorporated into Windows NT Workstation 4.0. This integration with the SMB redirector allows the extra Dfs features to be fully pageable and does not affect memory needs or standard client access performance.

– Benefit: Dfs functionality requires no additional resources on client systems.

DFS Features and Benefits (Continued)

Feature: Intelligent client caching

– Description: A Dfs volume can potentially connect hundreds or thousands of published shares. The client software makes no assumptions about what portion of Dfs published information a user might access. As a result, the first access of a published directory caches certain information locally. The next time a client accesses that portion of the Dfs name space, the cached referral is accessed, rather than obtaining a new referral.

– Benefit: Allows high-performance access to complex hierarchies of network volumes.
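
The caching behaviour described above can be sketched as a small client-side referral cache. This is an illustration under assumptions, not the Dfs client's actual code: `fetch_referral` is a hypothetical callable that asks the Dfs root for a referral and returns the targets plus a time to live in seconds.

```python
import time


class ReferralCache:
    """Client-side cache of Dfs referrals, each with a time to live."""

    def __init__(self, fetch_referral, clock=time.monotonic):
        self._fetch = fetch_referral   # hypothetical: path -> (targets, ttl_seconds)
        self._clock = clock
        self._entries = {}             # path -> (targets, expiry time)

    def lookup(self, dfs_path):
        entry = self._entries.get(dfs_path)
        if entry is not None and entry[1] > self._clock():
            return entry[0]            # cache hit: no new referral is requested
        targets, ttl = self._fetch(dfs_path)
        self._entries[dfs_path] = (targets, self._clock() + ttl)
        return targets
```

Only the first access to a portion of the name space pays for a referral; later accesses reuse the cached entry until it expires.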

DFS Features and Benefits (Continued)

Feature: Windows 95 and Windows 98 Client

– Description: Dfs includes a service to permit Windows 95 and Windows 98 users to navigate the Dfs name space. With the current release of Dfs, Windows clients can only access non-SMB volumes through a server-based gateway (for example, Microsoft Gateway Services for NetWare, which is included with Windows NT Server).

– Benefit: Extends Dfs benefits to Windows 95 and Windows 98 users.

DFS Features and Benefits (Continued)

Feature: Interoperates with other network file systems

– Description: Any volume that is accessible through a redirector on Windows NT Workstation can participate in the Dfs name space. This can be through either client redirectors or server-based gateway technology.

– Benefit: Administrators can create a single hierarchy incorporating heterogeneous network file systems.

Administration

Dfs provides tools to add and remove shares as necessary

Administration (Continued)

Easy to replace servers since each node in the Dfs is assigned a logical name that points to a file share

Can point a particular share to a new node while the current node is being replaced

User View

Maps just like a regular Windows drive

Load Balancing

If volumes are unavailable, Dfs will hand off request to an alternate volume if available

Example:
– If 300 users require access to one volume, Dfs can split the users among copies on two or more servers to balance the load

Name Transparency

Eliminates the need for end users to know where the information is physically stored

Eases updating to accommodate additional storage

Example:
– Users do not need to know the location of physical storage, so it can be swapped out behind the scenes to accommodate additional storage

Technical Overview of DFS

DFS Root
– Serves as a starting point and host to other shares

Post-Junction Junctions

This is a junction that has child junctions

Inter-Dfs Links
– Can join separate Dfs volumes together
– Example: Organizations having their own Dfs, and then one large Dfs to encompass the smaller Dfs’s

Post-Junction Junctions (Continued)

Midlevel Junctions
– Planned for future versions of Dfs
– Unlimited hierarchical junctioning
– Reduces points of failure
– Does not require inter-Dfs links
– Minimizes the number of referrals to deeply nested paths
– Maintained by the Root

Example

UNC Name | Maps to | Description
\\Server\Public | \\Server\Public | Root of the organization’s Dfs
\\Server\Public\Intranet | \\IIS\Root | Junction to the intranet launching point
\\Server\Public\Intranet\CorpInfo | \\Marketing\Info\Corporate_HTML | Junction to departmental intranet content
\\Server\Public\Users | \\Server\Public\Users | Collection of home directories
\\Server\Public\Users\Bob | \\Server\Public\Users | Junction from Users to Bob’s directory on the corporate development server
\\Server\Public\Users\Bob\Java_Apps | \\Bob1\Data\Java_Apps | Junction point from Bob’s development directory to one of Bob’s personal workstations
\\Server\Public\Users\Bob\Java_Apps | \\Bob2\Backups\Java_Apps | ALTERNATE Volume: Manually maintained backup of Bob’s work
\\Server\Public\Users\Ray | \\Server\Public\Users | Down-level Volume: junction to a non-SMB volume (such as NetWare or NFS)

Example (Continued)

Alternate Volumes

Keeping exact replicas of the same volume for redundancy

Can be mounted to the same point

Limit of 32 alternates for any given junction point

Down-Level Volumes

Legacy support for all older Windows operating systems

Can participate in Dfs but cannot host the Dfs tree

Partition Knowledge Table (PKT)

Maintains knowledge of all of the junction points

Approximately 300 bytes per entry containing:
– Dfs Path
– [Server + Share] (a list)
– Time to Live

Illustration

Resolving Junctions

Resolving logical names into physical names is done by searching the PKT.

Maintained in a tree

Top-down search
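
A minimal sketch of junction resolution, assuming the PKT can be modelled as a flat dictionary keyed by Dfs path (the slides say it is kept as a tree). The entries are taken from the earlier example table; the function name `resolve` and the dictionary layout are assumptions made for illustration, and the time-to-live field is left out.

```python
# Dfs path -> list of [server + share] targets, from the example table.
PKT = {
    r"\\Server\Public": [r"\\Server\Public"],
    r"\\Server\Public\Intranet": [r"\\IIS\Root"],
    r"\\Server\Public\Intranet\CorpInfo": [r"\\Marketing\Info\Corporate_HTML"],
    r"\\Server\Public\Users\Bob\Java_Apps": [r"\\Bob1\Data\Java_Apps",
                                             r"\\Bob2\Backups\Java_Apps"],
}


def resolve(dfs_path, pkt=PKT):
    """Map a logical Dfs path to its physical target paths."""
    parts = dfs_path.strip("\\").split("\\")
    best = None
    # Walk from the root downward, remembering the deepest junction seen.
    for end in range(2, len(parts) + 1):
        prefix = "\\\\" + "\\".join(parts[:end])
        if prefix in pkt:
            best = (prefix, end)
    if best is None:
        raise LookupError(f"no junction covers {dfs_path}")
    prefix, end = best
    rest = parts[end:]                 # components below the junction
    return ["\\".join([target, *rest]) for target in pkt[prefix]]
```

A path under a junction with alternates, such as Java_Apps, resolves to one physical name per alternate, and the client then picks among them.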

Example

Fail-over between volumes

When alternates are available they are provided to the client during name resolution

The choice among alternates is arbitrary and is made by the client

Fail-over Scenario 1

A client is browsing an alternate volume. The computer hosting the alternate loses power or drops completely from the network for any reason. To fail-over, the client must first detect that the hosting computer is no longer present. How long this takes depends on which protocol the client is using. Many protocols account for slow and loosely connected WAN links, and therefore may have retry counts of up to two minutes before the protocol itself times out. Once that occurs, Dfs immediately selects a new alternate. If none are available from the local cache, the Dfs client consults with the Dfs root to see if the administrator has modified any PKT entries. If no alternates are available at the root, a failure occurs; otherwise, Dfs initiates a fresh alternate selection and session setup.
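
The sequence in this scenario (try the locally cached alternates, then ask the root, then fail) can be sketched as follows. All three callables are hypothetical stand-ins: `connect` represents protocol-level session setup and is assumed to raise `OSError` once the protocol has timed out, and `refresh_from_root` represents re-reading the PKT entry from the Dfs root.

```python
import random


def open_with_failover(cached, connect, refresh_from_root, rng=random):
    """Return (target, session) for the first alternate that accepts a session.

    cached:            alternates already known to the client
    connect:           hypothetical session setup; raises OSError on failure
    refresh_from_root: hypothetical call returning the root's current alternates
    """
    for source in (lambda: cached, refresh_from_root):
        candidates = list(source())
        rng.shuffle(candidates)        # the choice among alternates is arbitrary
        for target in candidates:
            try:
                return target, connect(target)
            except OSError:            # includes TimeoutError: host gone or share offline
                continue
    raise ConnectionError("no alternate volume is available")
```

The sketch does not model the protocol timeout itself, which the scenario notes can dominate the fail-over time.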

Fail-over Scenario 2

A client is browsing an alternate volume. The computer hosting the alternate loses a hard disk hosting the alternate, or the share is deactivated. In this situation, the server hosting the alternate is still responding to the client request; the fail-over to a fresh alternate is nearly instantaneous.

Fail-over Scenario 3

A client has open files. The computer hosting the alternate loses power or drops completely from the network for any reason. In this scenario, the same protocol fail-over process described in Scenario #1 occurs, but the application that previously had file locks from the previous alternate must detect the change and establish new locks.

New attempts to open files trigger the same fail-over process described in Scenario #1. Operations on already open files fail with appropriate errors.

Fail-over Scenario 4

A client has open files. The computer hosting the alternate loses a hard disk hosting the alternate, or the share is deactivated. In this scenario, the same very quick fail-over process described in Scenario #2 occurs, but the application that previously had file handles from the previous alternate must detect the change and establish new handles.

Security

Allows for special handling of security issues at session startup using ACLs.

The ACLs are not consistent system-wide

ACLs are maintained on each server share

Administered at each physical share

There is no mechanism to administer system-wide security from the Dfs root

There is no attempt made to keep the ACLs consistent between alternate volumes

Network File System

NFS Architecture (1)

a) The remote access model.
b) The upload/download model.

NFS Architecture (2)

The basic NFS architecture for UNIX systems.

Important Advantage Of NFS

Largely independent of local file systems

In principle, it does not matter which OS the client or server uses (Unix or Windows)

The only important issue is that the file systems must be compliant with the file system model offered by NFS

Example: MS-DOS, with its short file names, cannot be used to implement an NFS server in a fully transparent way

File System Model

An incomplete list of file system operations supported by NFS.

Operation | v3 | v4 | Description
Create | Yes | No | Create a regular file (v3 only)
Create | No | Yes | Create a nonregular file (v4: symbolic links, directories and special files)
Link | Yes | Yes | Create a hard link to a file
Symlink | Yes | No | Create a symbolic link to a file
Mkdir | Yes | No | Create a subdirectory in a given directory
Mknod | Yes | No | Create a special file
Rename | Yes | Yes | Change the name of a file
Rmdir | Yes | No | Remove an empty subdirectory from a directory
Open | No | Yes | Open a file (v4: creates the file if it does not exist)
Close | No | Yes | Close a file
Lookup | Yes | Yes | Look up a file by means of a file name
Readdir | Yes | Yes | Read the entries in a directory
Readlink | Yes | Yes | Read the path name stored in a symbolic link
Getattr | Yes | Yes | Read the attribute values for a file
Setattr | Yes | Yes | Set one or more attribute values for a file
Read | Yes | Yes | Read the data contained in a file
Write | Yes | Yes | Write data to a file

File Handles

– A reference to a file within a file system
– It is independent of the name of the file it refers to
– Created by the server that is hosting the file system
– Unique with respect to all file systems exported by the server
– Created when the file is created
– Client is kept ignorant of the actual content of the file handle: it is completely opaque
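
These properties can be illustrated with a toy server-side handle table. This is a sketch, not the NFS handle format: the 16 random bytes, the class name, and the integer file ids are all assumptions chosen for the example.

```python
import os


class HandleTable:
    """Server-side table mapping opaque handles to files."""

    def __init__(self):
        self._by_handle = {}           # handle bytes -> internal file id
        self._names = {}               # file name -> internal file id
        self._next_id = 0

    def create(self, name):
        """Create a file and return its handle; the handle is made at creation time."""
        file_id = self._next_id
        self._next_id += 1
        handle = os.urandom(16)        # opaque to the client; the length is arbitrary here
        self._by_handle[handle] = file_id
        self._names[name] = file_id
        return handle

    def rename(self, old, new):
        self._names[new] = self._names.pop(old)   # the handle is untouched

    def file_id(self, handle):
        return self._by_handle[handle]
```

Because the handle never encodes the name, a client holding a handle keeps working after the file is renamed.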

Processes

NFS is a traditional client/server system

Version 2 and Version 3
– Server is stateless
– Stateless model not always fully implemented
– Very little client info held

Version 4
– Stateless model abandoned

Stateful Approach

Besides file locking and authentication, there is another reason for making the server stateful:
– NFS 4 is expected to work over WANs
– This requires that clients can make efficient use of caches
– This, in turn, requires an efficient cache consistency protocol
– The server needs to maintain information on files used by clients
– For example, the server may associate a lease with each client, promising to give the client exclusive read/write access
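
The lease idea can be sketched as a table of expiring grants kept by the server. This is a simplified illustration, not the NFS version 4 state model: the 30-second duration, the per-file-handle granularity, and the method name are assumptions.

```python
import time


class LeaseTable:
    """Server state: which client holds an exclusive lease on which file."""

    def __init__(self, duration=30.0, clock=time.monotonic):
        self._duration = duration      # lease length in seconds (assumed value)
        self._clock = clock
        self._leases = {}              # file handle -> (client id, expiry time)

    def acquire(self, handle, client):
        """Grant or renew a lease; return False if another client holds it."""
        holder = self._leases.get(handle)
        now = self._clock()
        if holder and holder[0] != client and holder[1] > now:
            return False               # another client holds an unexpired lease
        self._leases[handle] = (client, now + self._duration)
        return True
```

Expiry is what keeps the scheme safe when a client crashes or is cut off: the server only has to wait out the lease before granting it to someone else.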

Communication

a) Reading data from a file in NFS version 3 - Iterative

b) Reading data using a compound procedure in version 4 - Recursive
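
The difference between the two styles is mainly the number of request messages. The toy server below only counts messages; it is not the NFS wire protocol, and the tuple-based operation encoding is invented for this sketch.

```python
class CountingServer:
    """Toy server that counts request messages."""

    def __init__(self, files):
        self._files = files            # file name -> contents
        self.round_trips = 0

    def call(self, *ops):
        """Execute one request containing one or more (operation, argument) pairs."""
        self.round_trips += 1
        result = None
        for op, arg in ops:
            if op == "lookup":
                result = arg           # the name doubles as the handle in this toy
            elif op == "read":
                # Inside a compound, None means "use the previous result".
                result = self._files[arg if arg is not None else result]
        return result


def read_v3_style(server, name):
    handle = server.call(("lookup", name))       # first round trip
    return server.call(("read", handle))         # second round trip


def read_v4_style(server, name):
    return server.call(("lookup", name), ("read", None))   # one compound request
```

Over a WAN, where each round trip is expensive, folding several operations into one request is where the saving comes from.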

Naming (1)

Mounting (part of) a remote file system in NFS.

Naming (2)

Mounting nested directories from multiple servers in NFS.

Automounting (1)

A simple automounter for NFS.

Automounting (2)

Using symbolic links with automounting.

File Attributes

Version 3 used a fixed set of attributes

Fully implementing version 3 was difficult on some platforms

Version 4 split attributes into 3 sets:
– Mandatory attributes
– Recommended attributes
– Named attributes (not actually part of the NFS protocol)

File Attributes

Attribute | Description
ACL | An access control list associated with the file
FILEHANDLE | The server-provided file handle of this file
FILEID | A file-system unique identifier for this file
FS_LOCATIONS | Locations in the network where this file system may be found
OWNER | The character-string name of the file's owner
TIME_ACCESS | Time when the file data were last accessed
TIME_MODIFY | Time when the file data were last modified
TIME_CREATE | Time when the file was created

Attribute | Description
TYPE | The type of the file (regular, directory, symbolic link)
SIZE | The length of the file in bytes
CHANGE | Indicator to see if and/or when the file has changed
FSID | Server-unique identifier of the file's file system

Semantics of File Sharing (1)

a) On a single processor, when a read follows a write, the value returned by the read is the value just written.

b) In a distributed system with caching, obsolete values may be returned.
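
Case (b) can be reproduced with two clients that each cache what they read. This is a deliberately naive sketch (no invalidation, no leases); the class name and the plain dict standing in for the server are assumptions.

```python
class CachingClient:
    """Client that caches file contents and never revalidates them."""

    def __init__(self, server):
        self._server = server          # shared dict standing in for the file server
        self._cache = {}

    def read(self, name):
        if name not in self._cache:
            self._cache[name] = self._server[name]
        return self._cache[name]

    def write(self, name, data):
        self._cache[name] = data
        self._server[name] = data      # write-through to the server


server = {"f": "old"}
first, second = CachingClient(server), CachingClient(server)
first.read("f")                        # first client now caches "old"
second.write("f", "new")               # server holds "new"
stale = first.read("f")                # still served from the first client's cache
```

After the write, the server holds the new value while the first client keeps returning the obsolete one, which is the behaviour a cache consistency protocol has to prevent.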

Semantics of File Sharing

• Immutable Files
– No updates are possible
– Simplifies sharing and replication
– Only operations are create and read

• Transaction
– All changes occur atomically

File Locking in NFS (1)

NFS version 4 operations related to file locking.

Operation | Description
Lock | Creates a lock for a range of bytes
Lockt | Test whether a conflicting lock has been granted
Locku | Remove a lock from a range of bytes
Renew | Renew the lease on a specified lock
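
The first three operations amount to bookkeeping over byte ranges. The sketch below treats every lock as exclusive and ignores leases, so it is a simplification rather than the NFS version 4 lock semantics; the class and method names are made up to mirror Lock, Lockt and Locku.

```python
class ByteRangeLocks:
    """Toy lock table for one file; ranges are half-open [offset, offset + length)."""

    def __init__(self):
        self._locks = []               # list of (owner, start, end)

    def test(self, owner, offset, length):
        """Like Lockt: return a conflicting lock held by another owner, or None."""
        start, end = offset, offset + length
        for held in self._locks:
            other, s, e = held
            if other != owner and s < end and start < e:   # the ranges overlap
                return held
        return None

    def lock(self, owner, offset, length):
        """Like Lock: grant the range unless another owner holds part of it."""
        if self.test(owner, offset, length) is not None:
            return False
        self._locks.append((owner, offset, offset + length))
        return True

    def unlock(self, owner, offset, length):
        """Like Locku: release exactly the range that was locked."""
        self._locks.remove((owner, offset, offset + length))
```

Two ranges conflict only if they overlap, so different clients can hold locks on disjoint parts of the same file at the same time.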

Client Caching (1)

Client-side caching in NFS.

Client Caching (2)

Using the NFS version 4 callback mechanism to recall file delegation.

RPC Failures

Three situations for handling retransmissions.
a) The request is still in progress
b) The reply has just been returned
c) The reply was sent some time ago, but was lost.
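
One common way to handle these three cases is a duplicate-request cache keyed by the RPC transaction identifier. The sketch below is an assumption-laden illustration: the one-second threshold for "just returned", the class name, and the method names are not taken from the slides.

```python
import time

IN_PROGRESS = object()                 # marker for a request still being served


class DuplicateRequestCache:
    def __init__(self, recent=1.0, clock=time.monotonic):
        self._recent = recent          # seconds; assumed threshold for "just returned"
        self._clock = clock
        self._seen = {}                # transaction id -> IN_PROGRESS or (reply, time sent)

    def start(self, xid):
        self._seen[xid] = IN_PROGRESS

    def finish(self, xid, reply):
        self._seen[xid] = (reply, self._clock())

    def on_retransmission(self, xid):
        """Return the cached reply to resend, or None to ignore the duplicate."""
        entry = self._seen.get(xid)
        if entry is None:
            raise KeyError(xid)        # not a duplicate: handle it as a new request
        if entry is IN_PROGRESS:
            return None                # (a) still working on the original request
        reply, sent_at = entry
        if self._clock() - sent_at < self._recent:
            return None                # (b) reply just went out; it probably crossed
        return reply                   # (c) reply was lost some time ago: resend it
```

The point of the cache is that a retransmitted request is never executed a second time, which matters for operations that are not idempotent.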

Security

The NFS security architecture.

Secure RPCs

Secure RPC in NFS version 4.

Access Control

The classification of operations recognized by NFS with respect to access control.

Operation | Description
Read_data | Permission to read the data contained in a file
Write_data | Permission to modify a file's data
Append_data | Permission to append data to a file
Execute | Permission to execute a file
List_directory | Permission to list the contents of a directory
Add_file | Permission to add a new file to a directory
Add_subdirectory | Permission to create a subdirectory to a directory
Delete | Permission to delete a file
Delete_child | Permission to delete a file or directory within a directory
Read_acl | Permission to read the ACL
Write_acl | Permission to write the ACL
Read_attributes | The ability to read the other basic attributes of a file
Write_attributes | Permission to change the other basic attributes of a file
Read_named_attrs | Permission to read the named attributes of a file
Write_named_attrs | Permission to write the named attributes of a file
Write_owner | Permission to change the owner
Synchronize | Permission to access a file locally at the server with synchronous reads and writes
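
A permission check against such a classification can be sketched with bit flags. Only a few of the operations are included, and the per-user dictionary is a simplification: real NFS version 4 ACLs are ordered lists of allow and deny entries, which this sketch does not model. The names `Access` and `is_allowed` are made up for the example.

```python
from enum import Flag, auto


class Access(Flag):
    """A subset of the operations in the table, one bit each."""
    READ_DATA = auto()
    WRITE_DATA = auto()
    APPEND_DATA = auto()
    EXECUTE = auto()
    DELETE = auto()
    READ_ACL = auto()
    WRITE_ACL = auto()


def is_allowed(acl, user, requested):
    """acl maps a user name to the Access bits granted to that user."""
    granted = acl.get(user, Access(0))
    return (granted & requested) == requested    # every requested bit must be granted


acl = {"bob": Access.READ_DATA | Access.EXECUTE}
```

With this ACL, bob may read or execute the file, a combined read-and-write request is refused, and a user with no entry is refused everything.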

Benchmarking study

– Network File System (NFS)

Outline

– File system architectures
– Performance study design
– Experimental results

NFS Architecture

– Client/server system
– Single server for files

Performance Study Design

Experimental cluster

– Seven dual-processor Pentium III 1GHz, 1GB memory computers

– Dual EIDE disk RAID 0 subsystem in all nodes, measured throughput about 50MBps

– Myrinet switches, 250MBps theoretical bandwidth

NFS Parameters

Mount on Node 0 is a local mount
– Optimization for NFS

NFS server can participate or not as a client in the workload

System Software

– Red Hat Linux version 7.1
– Linux kernel version 2.4.17-rc2
– NFS protocol version 3
– PVFS version 1.5.3
– PVFS kernel version 1.5.3
– Myrinet network drivers gm-1.5-pre3b
– MPICH version 1.2.1

Clear cache

Clear NFS client and server-side caches

– Unmount NFS directory, shut down NFS

– Restart NFS, remount NFS directories

Experimental Parameters

I/O servers: the NFS server may or may not also participate as a client

Experimental Results

– NFS, LWF and GWF with and without server reading
– PVFS UNIX/POSIX API compared to NFS
– PVFS and NFS, GWF, 1 and 2 clients with/without server participating

NFS, LWF and GWF with and without server reading

PVFS UNIX/POSIX API compared to NFS

PVFS and NFS, GWF, 1 and 2 clients with/without server participating

Conclusions

NFS can take advantage of a local mount

NFS performance is limited by contention at the single server
– Limited to the disk throughput or the network throughput from the server, whichever has the most contention
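
The single-server limit can be put as a one-line bound. This is a back-of-the-envelope sketch that assumes clients share the server evenly and ignores caching and the local-mount optimization; the function name is made up, and the 50 MBps and 250 MBps figures are the measured disk throughput and theoretical Myrinet bandwidth from the study design.

```python
def per_client_ceiling(disk_mbps, network_mbps, clients):
    """Upper bound on each client's share of a single server, ignoring caching."""
    return min(disk_mbps, network_mbps) / clients


# With the cluster described earlier, the disk is the tighter limit.
two_clients = per_client_ceiling(disk_mbps=50, network_mbps=250, clients=2)
```

Under these assumptions two clients reading through one server can each expect at most 25 MBps, however fast the network is.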